Distribution of the length of the longest common subsequence of two multi-state biological sequences
Authors: James C. Fu; W.Y. Wendy Lou
Affiliation: 1. Department of Statistics, University of Manitoba, Winnipeg, MB, Canada R3T 2N2; 2. Department of Public Health Sciences, University of Toronto, Toronto, ON, Canada M5T 3M7
Abstract: The length of the longest common subsequence (LCS) between two biological sequences has been used as a measure of similarity, and the application of this statistic is of importance in genomic studies. Even for the simple case of two sequences of equal length and composed of binary elements with equal state probabilities, the exact distribution of the length of the LCS remains an open question. This problem is also known as an NP-hard problem in computer science. Instead of combinatorial analysis, we use the finite Markov chain imbedding technique to derive the exact distribution of the length of the LCS between two multi-state sequences of different lengths. Numerical results are provided to illustrate the theoretical results.
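To make the statistic concrete, the following is a minimal sketch that computes the LCS length with the classic dynamic-programming recurrence and tabulates its exact distribution by brute-force enumeration. It is not the finite Markov chain imbedding method of the paper. The function names `lcs_length` and `lcs_length_distribution` are hypothetical, and the sketch assumes independent positions with equally likely states, so it is only practical for very short sequences.

```python
from collections import Counter
from itertools import product


def lcs_length(x, y):
    """Length of the longest common subsequence of x and y (standard DP)."""
    prev = [0] * (len(y) + 1)  # DP row for the previous element of x
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_length_distribution(m, n, states=2):
    """Exact distribution of the LCS length for sequences of lengths m and n.

    Assumption (illustrative only): every position is independent and each
    of the `states` symbols is equally likely, so all sequence pairs have
    the same probability. Cost grows as states**(m + n).
    """
    counts = Counter()
    alphabet = range(states)
    for x in product(alphabet, repeat=m):
        for y in product(alphabet, repeat=n):
            counts[lcs_length(x, y)] += 1
    total = states ** (m + n)
    return {k: counts[k] / total for k in sorted(counts)}


if __name__ == "__main__":
    # Two binary sequences of length 2: 16 equally likely pairs.
    print(lcs_length_distribution(2, 2))  # {0: 0.125, 1: 0.625, 2: 0.25}
```

Enumeration becomes infeasible quickly as the lengths grow, which is why an approach that avoids listing every pair of sequences is needed to obtain exact distributions for longer sequences.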
Keywords: primary 60E05; secondary 60J10
This article is indexed in ScienceDirect and other databases.