Robust PCA for high-dimensional data based on characteristic transformation
Authors: Lingyu He, Yanrong Yang, Bo Zhang
Institution: 1. Hunan University; 2. The Australian National University; 3. Department of Statistics & Finance, International Institute of Finance, School of Management, University of Science and Technology of China, Hefei 230026, China
Abstract: In this paper, we propose a novel robust principal component analysis (PCA) for high-dimensional data in the presence of various heterogeneities, in particular heavy tails and outliers. A transformation motivated by the characteristic function is constructed to improve the robustness of the classical PCA. The suggested method has the distinct advantage of handling heavy-tailed data, whose covariances may not exist (they may be infinite, for instance), in addition to the usual outliers. The proposed approach is also a special case of kernel principal component analysis (KPCA), and it gains its robustness and non-linearity from a bounded, non-linear kernel function. The merits of the new method are illustrated by some statistical properties, including an upper bound on the excess error and the behaviour of the large eigenvalues under a spiked covariance model. Additionally, using a variety of simulations, we demonstrate the benefits of our approach over the classical PCA. Finally, using data on protein expression in mice of various genotypes from a biological study, we apply the novel robust PCA to categorise the mice and find that our approach is more effective at identifying abnormal mice than the classical PCA.
Keywords: characteristic function; heavy-tailed data; high-dimensional data; kernel PCA; robust PCA; spiked covariance model
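
The abstract does not state the exact transformation, so the following is only a minimal sketch of the general idea: kernel PCA with a bounded kernel that happens to be a characteristic function. It assumes the Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 h^2)), which equals the characteristic function of a N(0, I / h^2) vector evaluated at x - y. The kernel choice, the median bandwidth rule, and the names `gaussian_kernel_matrix` and `bounded_kernel_pca` are illustrative assumptions and not the authors' construction. Because every kernel entry lies in [0, 1], the eigen-decomposition stays well defined even when the data have no finite covariance.

```python
import numpy as np


def gaussian_kernel_matrix(X, bandwidth=None):
    """Bounded kernel k(x, y) = exp(-||x - y||^2 / (2 h^2)).

    Assumption: this kernel stands in for the paper's transformation,
    which the abstract does not specify.
    """
    diff = X[:, None, :] - X[None, :, :]
    sq_dists = np.sum(diff ** 2, axis=-1)
    if bandwidth is None:
        # Median heuristic (an assumption): h^2 = median squared distance
        # over distinct pairs.
        off_diag = sq_dists[~np.eye(len(X), dtype=bool)]
        bandwidth = np.sqrt(np.median(off_diag))
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def bounded_kernel_pca(X, n_components, bandwidth=None):
    """Kernel PCA on the bounded kernel matrix.

    Returns (scores, eigenvalues, K), where scores has shape
    (n_samples, n_components) and eigenvalues are sorted in descending order.
    """
    n = X.shape[0]
    K = gaussian_kernel_matrix(X, bandwidth)

    # Double-centre the kernel matrix: K_c = H K H with H = I - 11'/n.
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    K_centered = H @ K @ H

    # eigh returns ascending eigenvalues for symmetric matrices; reverse them.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Component scores of the training points: sqrt(lambda_j) * v_j.
    scores = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return scores, eigvals, K


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Cauchy data: heavy tails, no finite mean or covariance.
    X = rng.standard_cauchy(size=(60, 20))
    scores, eigvals, K = bounded_kernel_pca(X, n_components=3)
    print(scores.shape, K.min() >= 0.0, K.max() <= 1.0)
```

The sample covariance of Cauchy data is dominated by a few extreme observations, whereas here an extreme point can contribute at most 1 to any kernel entry, which is the kind of robustness the abstract attributes to a bounded kernel.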