
A Study on Identification of Commonality and Difference among High-dimensional Data Based on a Varying-coefficient Model
Original title: 基于变系数模型的高维数据异同性识别方法研究
Cite this article: Sun Yifan et al. A Study on Identification of Commonality and Difference among High-dimensional Data Based on a Varying-coefficient Model[J]. Statistical Research (统计研究), 2021, 38(5): 136-146. DOI: 10.19343/j.cnki.11-1302/c.2021.05.011
Authors: Sun Yifan (孙怡帆) et al.
Abstract: With the development of information technology, high-dimensional data have become increasingly abundant. In practice, many high-dimensional datasets are formed by merging multiple datasets drawn from heterogeneous subjects. Accurately identifying the commonality and difference among such datasets has become one of the goals of big data analysis. This paper proposes an integrative analysis method for high-dimensional data under a varying-coefficient model. The method performs variable selection and coefficient estimation on multiple datasets simultaneously, and it automatically identifies which variable coefficients are shared across datasets and which differ. Simulation results indicate that the proposed method clearly outperforms the comparison methods in identification of commonality and difference, variable selection, coefficient estimation, and prediction. In an application to the identification of pathogenic genes for lung cancer, the method identifies pathogenic genes that have a biological interpretation and reveals the commonality and difference between two subtypes.
Keywords: High-dimensional Data   Commonality and Difference   Varying-coefficient Model   Integrative Analysis