

Transcription factor-binding site identification and gene classification via fusion of the supervised-weighted discrete kernel clustering and support vector machine
Authors:Insuk Sohn  Jooyong Shim  Changha Hwang  Sujong Kim
Affiliation:1. Samsung Cancer Research Institute, Samsung Medical Center, Seoul 137-710, Korea;2. Department of Data Science, Institute of Statistical Information, Inje University, Kyungnam 621-749, Korea;3. Department of Applied Statistics, Dankook University, Gyeonggido 448-160, Korea;4. R&D Center, Komipharm International Co., LTD, Kyounggi-do 429-450, Korea
Abstract:The genetic regulatory mechanism heavily influences a substantial portion of the biological functions and processes needed to sustain life. For a comprehensive mechanistic understanding of biological processes, it is important to identify the common transcription factor (TF) binding sites (TFBSs) from a set of promoter sequences of co-regulated genes and to classify genes that are co-regulated by certain TFs, thereby providing insight into the mechanism that underlies the interaction among the co-regulated genes and into complicated genetic regulation. We propose a new supervised-weighted discrete kernel clustering (SWDKC) classification method for the identification of TFBSs and the classification of genes. Our SWDKC method gave a smaller misclassification error rate than the other methods on both the simulated data and the real NF-κB data. We verify that the selected over-represented TFBSs serve as informative TFBSs from a biological point of view.
Keywords:supervised clustering  transcription factor-binding site  transcription factor  gene classification  support vector machines