首页 | 本学科首页   官方微博 | 高级检索  
     检索      


Discriminating membrane proteins using the joint distribution of length sums of success and failure runs
Authors:Sotirios Bersimis  Athanasios Sachlas  Pantelis G Bagos
Institution:1.Department of Statistics and Insurance Science,University of Piraeus,Piraeus,Greece;2.Department of Computer Science and Biomedical Informatics,University of Thessaly,Lamia,Greece
Abstract:Discriminating integral membrane proteins from water-soluble ones, has been over the past decades an important goal for computational molecular biology. A major drawback of methods appeared in the literature, is that most of the authors tried to solve the problem using machine learning techniques. Specifically, most of the proposed methods require an appropriate dataset for training, and consequently the results depend heavily on the suitability of the dataset, itself. Motivated by these facts, in this paper we develop a formal discrimination procedure that is based on appropriate theoretical observations on the sequence of hydrophobic and polar residues along the protein sequence and on the exact distribution of a two dimensional runs-related statistic defined on the same sequence. Specifically, for setting up our discrimination procedure, we study thoroughly the exact distribution of a bivariate random variable, which accumulates the exact lengths of both success and failure runs of at least a specific length in a sequence of Bernoulli trials. To investigate the properties of this bivariate random variable, we use the Markov chain embedding technique. Finally, we apply the new procedure to a well-defined dataset of proteins.
Keywords:
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号