
1.

Measures of fit in principal component and canonical variate analyses

Sugnet Gardner-Lubbe John C. Gowers 《Journal of applied statistics》2008,35(9):947-965

Treating principal component analysis (PCA) and canonical variate analysis (CVA) as methods for approximating tables, we develop measures, collectively termed predictivity, that assess the quality of fit independently for each variable and for all dimensionalities. We illustrate their use with data from aircraft development, the African timber industry and copper froth measurements from the mining industry. Similar measures are described for assessing the predictivity associated with the individual samples (in the case of PCA and CVA) or group means (in the case of CVA). For these measures to be meaningful, certain essential orthogonality conditions must hold that are shown to be satisfied by predictivity.
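As a rough illustration of the per-variable fit idea in this abstract, the sketch below computes, for each variable, the share of its centred sum of squares that a rank-r PCA approximation reproduces. The function name `pca_axis_predictivity` and the simulated data are assumptions made for this example, and the paper's exact definitions may differ. The ratio is well behaved because the fitted part and the residual part of the approximation are orthogonal, so it lies between 0 and 1 and cannot decrease as r grows.

```python
import numpy as np

def pca_axis_predictivity(X, r):
    """Per-variable share of the centred sum of squares reproduced by a
    rank-r PCA approximation (hypothetical helper, for illustration)."""
    Xc = X - X.mean(axis=0)                      # centre each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    X_hat = (U[:, :r] * s[:r]) @ Vt[:r]          # best rank-r approximation
    fitted = np.sum(X_hat ** 2, axis=0)          # diag(X_hat' X_hat)
    total = np.sum(Xc ** 2, axis=0)              # diag(Xc' Xc)
    return fitted / total

# Simulated data with correlated columns (assumption, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))

for r in range(1, 5):
    print(r, np.round(pca_axis_predictivity(X, r), 3))
```

Reading the printed rows from top to bottom shows how many dimensions each variable needs before it is well represented; at full rank every entry equals 1.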

2.

Functional principal component analyses of biomedical images as outcome measures

Emma O'Connor Nick Fieller Andrew Holmes John C. Waterton Edward Ainscow 《Journal of the Royal Statistical Society. Series C, Applied statistics》2010,59(1):57-76

3.

Treatments of non-metric variables in partial least squares and principal component analysis

Jisu Yoon Tatyana Krivobokova 《Journal of applied statistics》2018,45(6):971-987

This paper reviews various treatments of non-metric variables in partial least squares (PLS) and principal component analysis (PCA) algorithms. The performance of different treatments is compared in an extensive simulation study under several typical data generating processes, and associated recommendations are made. Moreover, we find that PLS-based methods are preferable in practice since, regardless of the data generating process, PLS performs either as well as PCA or significantly better. As an application of PLS and PCA algorithms with non-metric variables, we consider the construction of a wealth index to predict household expenditures. Consistent with our simulation study, we find that a PLS-based wealth index with dummy coding outperforms PCA-based ones.

4.

The eigenstructure of block-structured correlation matrices and its implications for principal component analysis

Jorge Cadima Francisco Lage Calheiros Isabel P. Preto 《Journal of applied statistics》2010,37(4):577-589

Block-structured correlation matrices are correlation matrices in which the p variables are subdivided into homogeneous groups, with equal correlations for variables within each group, and equal correlations between any given pair of variables from different groups. Block-structured correlation matrices arise as approximations for certain data sets’ true correlation matrices. A block structure in a correlation matrix entails a certain number of properties regarding its eigendecomposition and, therefore, a principal component analysis of the underlying data. This paper explores these properties, both from an algebraic and a geometric perspective, and discusses their robustness. Suggestions are also made regarding the choice of variables to be subjected to a principal component analysis, when in the presence of (approximately) block-structured variables.
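A small numerical sketch can make the block structure concrete. It relies on a standard algebraic fact rather than on the paper's own derivation: a group of size p_k with common within-group correlation rho_k contributes p_k - 1 eigenvalues equal to 1 - rho_k, and the remaining eigenvalues (one per group) come from a small matrix with one row per group. The helper `block_correlation`, the group sizes and the correlation values are all assumptions chosen for illustration.

```python
import numpy as np

def block_correlation(sizes, within, between):
    """Build a block-structured correlation matrix (hypothetical helper).
    sizes[k]      : number of variables in group k
    within[k]     : common correlation inside group k
    between[k][l] : common correlation between groups k and l"""
    B = np.array(between, dtype=float)
    np.fill_diagonal(B, within)                  # group-level correlations
    labels = np.repeat(np.arange(len(sizes)), sizes)
    R = B[np.ix_(labels, labels)]                # expand to variable level
    np.fill_diagonal(R, 1.0)                     # unit diagonal
    return R

sizes = [3, 2]                                   # illustrative values
within = [0.6, 0.4]
between = [[0.0, 0.2], [0.2, 0.0]]
R = block_correlation(sizes, within, between)

# Eigenvalues of the full matrix.
eig_full = np.sort(np.linalg.eigvalsh(R))

# Eigenvalues predicted from the structure: within-group contrasts give
# 1 - rho_k (p_k - 1 times); group-constant vectors give the eigenvalues
# of a small K x K matrix M.
K = len(sizes)
M = np.array([[(1 + (sizes[k] - 1) * within[k]) if k == l
               else sizes[l] * between[k][l]
               for l in range(K)] for k in range(K)])
contrast = [1 - within[k] for k in range(K) for _ in range(sizes[k] - 1)]
eig_pred = np.sort(np.concatenate([contrast, np.linalg.eigvals(M).real]))

print(np.round(eig_full, 4))
print(np.round(eig_pred, 4))
```

The two printed vectors should agree, which shows that a principal component analysis of such a matrix splits into within-group contrast components and a few components that are constant within each group.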

5.

Missing data in principal component analysis of questionnaire data: a comparison of methods

《Journal of Statistical Computation and Simulation》2012,82(11):2298-2315

Principal component analysis (PCA) is a widely used statistical technique for determining subscales in questionnaire data. As with any other statistical technique, missing data may complicate both its execution and its interpretation. In this study, six methods for dealing with missing data in the context of PCA are reviewed and compared: listwise deletion (LD), pairwise deletion, the missing data passive approach, regularized PCA, the expectation-maximization algorithm, and multiple imputation. Simulations show that, except for LD, all methods give about equally good results for realistic percentages of missing data. Therefore, the choice of a procedure can be based on ease of application or simply on the availability of a technique.

6.

Identification of genomic markers correlated with sensitivity in solid tumors to Dasatinib using sparse principal components

Ahmed Hossain Hafiz T.A. Khan 《Journal of applied statistics》2016,43(14):2538-2549

Differential analysis techniques are commonly used to offer scientists a dimension reduction procedure and an interpretable gateway to variable selection, especially when confronting high-dimensional genomic data. Huang et al. used a gene expression profile of breast cancer cell lines to identify genomic markers that are highly correlated with in vitro sensitivity to the drug Dasatinib. They considered three statistical methods to identify differentially expressed genes and finally used the results from the intersection. However, the statistical methods used in that paper are not sufficient to select the genomic markers. In this paper we use three alternative statistical methods to select a combined list of genomic markers and compare them with the genes proposed by Huang et al. We then propose using sparse principal component analysis (Sparse PCA) to identify a final list of genomic markers. Sparse PCA takes the correlation among the genes into account and supports successful discovery of genomic markers. We present a new, small set of genomic markers that effectively separates out the groups of patients who are sensitive to the drug Dasatinib. The analysis procedure will also help scientists identify genomic markers that can separate two groups.

7.

A comparison of different procedures for principal component analysis in the presence of outliers

B. Bariş Alkan Cemal Atakan Nesrin Alkan 《Journal of applied statistics》2015,42(8):1716-1722

Principal component analysis (PCA) is a popular technique for dimensionality reduction, but it is affected by the presence of outliers. The outlier sensitivity of classical PCA (CPCA) has motivated the development of new approaches. The effects of replacing outliers with estimates obtained by expectation–maximization (EM) and multiple imputation (MI) were examined on an artificial and a real data set. Furthermore, robust PCA based on the minimum covariance determinant (MCD), PCA based on EM estimates in place of outliers, and PCA based on MI estimates in place of outliers were compared with the results of CPCA. In this study, we tried to show how the effects of using MI and EM estimates in place of outliers depend on the proportion of outliers in the data set. Finally, when the proportion of outliers exceeds 20%, we suggest using estimates obtained by MI and EM in place of outliers as an alternative approach.

8.

A simulation comparison of imputation methods for quantitative data in the presence of multiple data patterns

《Journal of Statistical Computation and Simulation》2012,82(18):3588-3619
