Similar literature
 20 similar documents found; search took 15 ms
1.
Canonical variate analysis can be viewed as a two-stage principal component analysis. Explicit consideration of the principal components from the first stage, formalized in the context of shrunken estimators, leads to a number of practical advantages. In morphometric studies, the first eigenvector is often a size vector, with the remaining vectors contrast or shape-type vectors, so that a decomposition of the canonical variates into size and shape components can be achieved. In applied studies, a small number of the principal components often effect most of the separation between groups; plots of group means and associated concentration ellipses (ideally these should be circular) for important principal components facilitate graphical inspection. Of considerable practical importance is the potential for improved stability of the estimated canonical vectors. When the between-groups sum of squares for a particular principal component is small, and the corresponding eigenvalue of the within-groups correlation matrix is also small, marked instability of the canonical vectors can be expected. The introduction of shrunken estimators, by adding shrinkage constants to the eigenvalues, leads to more stable coefficients.

2.
马景义 《统计教育》2010,(5):54-56,43
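This paper revisits principal component analysis and factor analysis by introducing concepts such as the optimal approximation of the data matrix under the Frobenius norm. I find that the principal components in principal component analysis, and the estimated factor scores in factor analysis (with factor loadings obtained by the principal component method and factor scores then obtained by least squares), are the coordinates of the optimal approximation of the data matrix (under the Frobenius norm) with respect to different orthogonal coordinate-direction matrices. Because the decomposition of this optimal approximation is not unique, the two methods decompose it under different constraints.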
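The Frobenius-norm view in the abstract above can be illustrated with a minimal NumPy sketch. It is not taken from the paper: the toy data, the rank `k = 2`, and the rotation angle are assumptions chosen for illustration. It relies on the Eckart-Young result that the truncated SVD of the centred data matrix is its best rank-k approximation, and shows that rotating the orthonormal basis changes the coordinates (scores) but not the approximation itself.

```python
import numpy as np

def rank_k_pca(X, k):
    """Centre X and return the leading k PCA directions and scores via the SVD."""
    Xc = X - X.mean(axis=0)                      # column-centred data matrix
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                               # p x k orthonormal directions
    scores = Xc @ V_k                            # n x k coordinates (PC scores)
    return Xc, V_k, scores, s

# Toy data (an assumption for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 5))

k = 2
Xc, V_k, scores, s = rank_k_pca(X, k)
X_k = scores @ V_k.T                             # best rank-k approximation of Xc

# Squared Frobenius error equals the sum of the discarded squared singular values
err = np.linalg.norm(Xc - X_k, "fro") ** 2
tail = np.sum(s[k:] ** 2)

# A rotated orthonormal basis gives different coordinates, same approximation
theta = 0.7                                      # arbitrary rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
V_rot = V_k @ R
scores_rot = Xc @ V_rot
X_k_rot = scores_rot @ V_rot.T

print(np.isclose(err, tail), np.allclose(X_k, X_k_rot))
```

The non-uniqueness is visible in the last lines: `scores` and `scores_rot` differ, yet both reproduce the same rank-k matrix, so any further constraint (for example uncorrelated scores with decreasing variance) selects one particular set of coordinates.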

3.
The effect of nonstationarity in the time series columns of the input data on principal components analysis is examined. Nonstationary series are very common among economic indicators collected over time, which are subsequently summarized into fewer indices for monitoring purposes. Because nonstationary time series drift simultaneously, usually as a result of the trend, the first component averages all the variables without necessarily reducing dimensionality. Sparse principal components analysis can be used, but the attainment of sparsity among the loadings (and hence of dimension reduction) is influenced by the choice of the parameter(s) λ_{1,j}. Simulated data with more variables than observations and with different patterns of cross-correlations and autocorrelations were used to illustrate the advantages of sparse principal components analysis over ordinary principal components analysis. Sparse component loadings for nonstationary time series data can be achieved provided that appropriate values of λ_{1,j} are used. We provide the range of values of λ_{1,j} that will ensure convergence of the sparse principal components algorithm and consequently achieve sparsity of component loadings.

4.
Dynamic principal component analysis (DPCA), also known as frequency domain principal component analysis, was developed by Brillinger [Time Series: Data Analysis and Theory, Vol. 36, SIAM, 1981] to decompose multivariate time-series data into a few principal component series. A primary advantage of DPCA is its capability of extracting essential components from the data while accounting for their serial dependence. It is also used to estimate the common component in a dynamic factor model, which is frequently used in econometrics. However, this beneficial property cannot be exploited when missing values are present, and such values should not simply be ignored when estimating the spectral density matrix in the DPCA procedure. Based on a novel combination of conventional DPCA and the self-consistency concept, we propose a DPCA method for data with missing values. We demonstrate the advantage of the proposed method over some existing imputation methods through Monte Carlo experiments and real data analysis.

5.
Canonical correlations are maximized correlation coefficients indicating the relationships between pairs of canonical variates that are linear combinations of the two sets of original variables. The number of non-zero canonical correlations in a population is called its dimensionality. Parallel analysis (PA) is an empirical method for determining the number of principal components or factors that should be retained in factor analysis. An example is given to illustrate the adaptation of the proposed procedures, based on PA and bootstrap-modified PA, to the context of canonical correlation analysis (CCA). The performance of the proposed procedures is evaluated in a simulation study by comparing them with traditional sequential test procedures with respect to the under-, correct- and over-determination of dimensionality in CCA.

6.
In practice, when a principal component analysis is applied to a large number of variables the resultant principal components may not be easy to interpret, as each principal component is a linear combination of all the original variables. Selection of a subset of variables that contains, in some sense, as much information as possible and enhances the interpretations of the first few covariance principal components is one possible approach to tackle this problem. This paper describes several variable selection criteria and investigates which criteria are best for this purpose. Although some criteria are shown to be better than others, the main message of this study is that it is unwise to rely on only one or two criteria. It is also clear that the interdependence between variables and the choice of how to measure closeness between the original components and those using subsets of variables are both important in determining the best criteria to use.

7.
In this study, classical and robust principal component analyses are used to evaluate the socioeconomic development of the regions served by development agencies, which work to reduce development disparities among regions in Turkey. Because development levels differ greatly between regions, an outlier problem arises; hence robust statistical methods are used. Classical and robust statistical methods are also used to investigate whether the data set contains outliers. In classical principal component analysis, the number of observations must be larger than the number of variables; otherwise the determinant of the covariance matrix is zero. With ROBPCA (Robust method for Principal Component Analysis), a robust approach to principal component analysis for high-dimensional data, principal components are obtained even if the number of variables is larger than the number of observations. In this paper, 26 development agencies are first evaluated with 19 variables by principal component analysis based on classical and robust scatter matrices, and these 26 development agencies are then evaluated with 46 variables by the ROBPCA method.

8.
ADE-4: a multivariate analysis and graphical display software   Total citations: 59, self-citations: 0, citations by others: 59
We present ADE-4, a multivariate analysis and graphical display software package. Multivariate analysis methods available in ADE-4 include usual one-table methods like principal component analysis and correspondence analysis, spatial data analysis methods (using a total variance decomposition into local and global components, analogous to Moran and Geary indices), discriminant analysis and within/between groups analyses, many linear regression methods including lowess and polynomial regression, multiple and PLS (partial least squares) regression and orthogonal regression (principal component regression), projection methods like principal component analysis on instrumental variables, canonical correspondence analysis and many other variants, coinertia analysis and the RLQ method, and several three-way table (k-table) analysis methods. Graphical display techniques include an automatic collection of elementary graphics corresponding to groups of rows or to columns in the data table, thus providing a very efficient way for automatic k-table graphics and geographical mapping options. A dynamic graphic module allows interactive operations like searching, zooming, selection of points, and display of data values on factor maps. The user interface is simple and homogeneous among all the programs; this contributes to making the use of ADE-4 very easy for non-specialists in statistics, data analysis or computer science.

9.
A number of results have been derived recently concerning the influence of individual observations in a principal component analysis. Some of these results, particularly those based on the correlation matrix, are applied to data consisting of seven anatomical measurements on students. The data have a correlation structure which is fairly typical of many found in allometry. This case study shows that theoretical influence functions often provide good estimates of the actual changes observed when individual observations are deleted from a principal component analysis. Different observations may be influential for different aspects of the principal component analysis (coefficients, variances and scores of principal components); these differences, and the distinction between outlying and influential observations are discussed in the context of the case study. A number of other complications, such as switching and rotation of principal components when an observation is deleted, are also illustrated.

10.
Parallel analysis (PA; Horn 1965) and the minimum average partial correlation (MAP; Velicer 1976) have been widely promoted as optimal solutions for identifying the correct number of axes in principal component analysis. Previous results showed, however, that they become inefficient when variables belonging to different components correlate strongly. Simulations are used to assess their power to detect the dimensionality of data sets with oblique structures. Overall, MAP performed best, as it was more powerful and accurate than PA when the component structure was modestly oblique. However, both stopping rules performed poorly in the presence of highly oblique factors.
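As a rough illustration of the parallel analysis stopping rule mentioned above, the following sketch compares the eigenvalues of the observed correlation matrix with those obtained from uncorrelated normal data of the same shape. It is a generic version, not the procedure evaluated in the study: the function name `parallel_analysis`, the number of simulations, and the use of a 95th-percentile threshold (a common variant of Horn's mean-based rule) are all assumptions.

```python
import numpy as np

def parallel_analysis(X, n_sims=200, quantile=0.95, seed=0):
    """Count leading components whose correlation-matrix eigenvalues exceed
    the given quantile of eigenvalues from uncorrelated normal data."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Observed eigenvalues, sorted from largest to smallest
    observed = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
    simulated = np.empty((n_sims, p))
    for b in range(n_sims):
        Z = rng.normal(size=(n, p))              # reference data with no structure
        simulated[b] = np.linalg.eigvalsh(np.corrcoef(Z, rowvar=False))[::-1]
    threshold = np.quantile(simulated, quantile, axis=0)
    exceeds = observed > threshold
    # Retain components up to (not including) the first one that fails
    return int(p if exceeds.all() else np.argmin(exceeds))

# Hypothetical example: eight variables driven by two orthogonal factors
rng = np.random.default_rng(1)
factors = rng.normal(size=(500, 2))
loadings = np.zeros((8, 2))
loadings[:4, 0] = 0.8
loadings[4:, 1] = 0.8
X = factors @ loadings.T + 0.6 * rng.normal(size=(500, 8))
n_retained = parallel_analysis(X, n_sims=50)
print(n_retained)
```

The example uses an orthogonal structure, the easy case; the abstract's point is that rules of this kind degrade when the components are strongly correlated.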

11.
The case in which the factor model does not account for all the covariances of the observed variables is considered. It is shown that principal components representing covariances not accounted for by the factor model can have a nonzero correlation with the common factors of the factor model. The substantial correlations of components representing variance not accounted for by the factor model with common factors are demonstrated in a simulation study comprising model error. Based on these results, a new version of Harman's factor score predictor minimizing the correlation with residual components is proposed.

12.
We develop functional data analysis techniques using the differential geometry of a manifold of smooth elastic functions on an interval in which the functions are represented by a log-speed function and an angle function. The manifold's geometry provides a method for computing a sample mean function and principal components on tangent spaces. Using tangent principal component analysis, we estimate probability models for functional data and apply them to functional analysis of variance, discriminant analysis, and clustering. We demonstrate these tasks using a collection of growth curves of children aged 1–18.

13.
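A discussion of methods for aligning indicator directions (same-trend transformation) in principal component and factor analysis   Total citations: 9, self-citations: 0, citations by others: 9
Sample principal component analysis and sample factor analysis have become among the most important methods of comprehensive evaluation, and converting the indicator variables to a common direction (same-trend transformation) is an important step in applying them. This article summarizes the specific methods for same-trend transformation of indicators in principal component and factor analysis, discusses the effects of these methods on comprehensive evaluation, and identifies the conditions under which each applies.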

14.
Common factor analysis (CFA) and principal component analysis (PCA) are widely used multivariate techniques. Using simulations, we compared CFA with PCA loadings for distortions of a perfect cluster configuration. Results showed that nonzero PCA loadings were higher and more stable than nonzero CFA loadings. Compared to CFA loadings, PCA loadings correlated weakly with the true factor loadings for underextraction, overextraction, and heterogeneous loadings within factors. The pattern of differences between CFA and PCA was consistent across sample sizes, levels of loadings, principal axis factoring versus maximum likelihood factor analysis, and blind versus target rotation.

15.
Summary.  The problem of component choice in regression-based prediction has a long history. The main cases where important choices must be made are functional data analysis, and problems in which the explanatory variables are relatively high dimensional vectors. Indeed, principal component analysis has become the basis for methods for functional linear regression. In this context the number of components can also be interpreted as a smoothing parameter, and so the viewpoint is a little different from that for standard linear regression. However, arguments for and against conventional component choice methods are relevant to both settings and have received significant recent attention. We give a theoretical argument, which is applicable in a wide variety of settings, justifying the conventional approach. Although our result is of minimax type, it is not asymptotic in nature; it holds for each sample size. Motivated by the insight that is gained from this analysis, we give theoretical and numerical justification for cross-validation choice of the number of components that is used for prediction. In particular we show that cross-validation leads to asymptotic minimization of mean summed squared error, in settings which include functional data analysis.

16.
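韩猛 et al. 《统计研究》2018, 35(6): 97-108
To identify endogenously the structural breaks in the factor loading matrix of a dynamic factor model (including changes in the number of factors), this paper uses the pseudo-factor series obtained by principal component estimation to construct a cumulative sum of squares statistic that tests for structural breaks in the factor loading matrix, and then uses an iterated cumulative sums of squares algorithm to locate multiple break points. The proposed test statistic is found to be robust to misspecification of the number of factors, and the test has good finite-sample and asymptotic properties. In addition, an empirical analysis finds that the log-return series of listed manufacturing companies in China's Shanghai A-share market share common factors with structural breaks.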

17.
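This paper compares differences in insurance density among China's eastern, central, and western regions from 2000 to 2006, applies principal component analysis to the factors influencing insurance density, and uses a panel data model to run separate regressions for the eastern, central, and western regions. The results show that the main factors behind regional differences in insurance density are regional per capita GDP, per capita consumption, educational level, urbanization, industrial structure, social welfare expenditure, sex ratio, and age structure, and that the influencing factors and their degree of influence differ across regions. To narrow regional differences in insurance density, policy measures should be tailored to each region.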

18.
19.
A central issue in principal component analysis (PCA) is that of choosing the appropriate number of principal components to be retained. Bishop (1999a) suggested a Bayesian approach to PCA that determines the effective dimensionality automatically on the basis of the probabilistic latent variable model. This paper extends this approach by using mixture priors, so that the choice of dimensionality and the estimation of principal components are done simultaneously via an MCMC algorithm. The proposed method also provides a probabilistic measure of uncertainty for PCA, yielding posterior probabilities of all possible cases of principal components.

20.
A regression approach to principal component analysis is presented in this note. We provide an alternative interpretation of principal components that illustrates the relation between the extra sum of squares in regression analysis and the eigenvalues associated with the principal components.
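One relation of this kind can be checked numerically; the sketch below is an illustration under stated assumptions and is not necessarily the construction used in the note. It regresses the centred variables on the first k principal component scores and records how much the regression sum of squares grows when the k-th score is added. Because the scores are mutually orthogonal, that extra sum of squares equals (n − 1) times the k-th eigenvalue of the sample covariance matrix. The function name `extra_sums_of_squares` and the toy data are hypothetical.

```python
import numpy as np

def extra_sums_of_squares(X):
    """Return the extra regression sum of squares gained by adding each
    principal component score in turn, and (n - 1) times the eigenvalues."""
    Xc = X - X.mean(axis=0)                       # centred data, so no intercept is needed
    n = Xc.shape[0]
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]             # sort eigenpairs, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                         # principal component scores
    extra, prev_ssr = [], 0.0
    for k in range(1, scores.shape[1] + 1):
        Z = scores[:, :k]
        coef, *_ = np.linalg.lstsq(Z, Xc, rcond=None)   # multivariate least squares
        ssr = np.sum((Z @ coef) ** 2)             # regression SS summed over all variables
        extra.append(ssr - prev_ssr)
        prev_ssr = ssr
    return np.array(extra), (n - 1) * eigvals

# Toy data (an assumption for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 4)) @ rng.normal(size=(4, 4))
extra, scaled_eigvals = extra_sums_of_squares(X)
print(np.allclose(extra, scaled_eigvals))
```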
