Canonical correlation has been little used and little understood, even by otherwise sophisticated analysts. An alternative approach to canonical correlation, based on a general linear multivariate model, is presented. Properties of principal component analysis are used to help explain the method. Standard computational methods for full rank canonical correlation, techniques for canonical correlation on component scores, and canonical correlation with less than full rank are discussed. They are seen to be essentially equivalent when the model equation for canonical correlation on component scores is presented. The two approaches to less than full rank situations are equivalent in some senses, but quite different in usefulness, depending on the application. An example dataset is analyzed in detail to help demonstrate the conclusions.

Many empirical time series such as asset returns and traffic data exhibit the characteristic of time-varying conditional covariances, known as volatility or conditional heteroscedasticity. Modeling multivariate volatility, however, encounters several difficulties, including the curse of dimensionality. Dimension reduction can be useful and is often necessary. The goal of this article is to extend the idea of principal component analysis to principal volatility component (PVC) analysis. We define a cumulative generalized kurtosis matrix to summarize the volatility dependence of multivariate time series. Spectral analysis of this generalized kurtosis matrix is used to define PVCs. We consider a sample estimate of the generalized kurtosis matrix and propose test statistics for detecting linear combinations that do not have conditional heteroscedasticity. As an application, we apply the proposed analysis to weekly log returns of seven exchange rates against the U.S. dollar from 2000 to 2011 and find a linear combination of the exchange rates that has no conditional heteroscedasticity.

Probabilistic Principal Component Analysis   Cited by: 2 (self-citations: 0; citations by others: 2)
Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based on a probability model. We demonstrate how the principal axes of a set of observed data vectors may be determined through maximum likelihood estimation of parameters in a latent variable model that is closely related to factor analysis. We consider the properties of the associated likelihood function, giving an EM algorithm for estimating the principal subspace iteratively, and discuss, with illustrative examples, the advantages conveyed by this probabilistic approach to PCA.
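As a rough illustration of the maximum likelihood idea in this abstract, the sketch below computes the well-known closed-form maximum likelihood estimates for the probabilistic PCA latent variable model rather than running the EM iteration the abstract mentions. The function name `ppca_ml`, the argument names, and the choice of the identity rotation are assumptions made for this example; the abstract gives no implementation details.

```python
import numpy as np

def ppca_ml(T, q):
    """Closed-form ML fit of probabilistic PCA with q latent dimensions.

    T is an (N, d) data matrix with q < d. Returns the mean, a (d, q)
    loading matrix W (identity rotation assumed) and the noise variance.
    """
    mu = T.mean(axis=0)
    # ML uses the 1/N-normalised sample covariance.
    S = np.cov(T, rowvar=False, bias=True)
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]          # sort eigenpairs, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Noise variance: average of the discarded eigenvalues.
    sigma2 = eigvals[q:].mean()
    # Loadings: leading eigenvectors scaled by sqrt(eigenvalue - sigma2).
    W = eigvecs[:, :q] * np.sqrt(eigvals[:q] - sigma2)
    return mu, W, sigma2
```

The implied model covariance `W @ W.T + sigma2 * np.eye(d)` reproduces the leading `q` eigenvalues of the sample covariance and replaces the remaining ones with their average.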

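In canonical correlation analysis, obtaining the expressions of the canonical variates does not complete the task; one still needs, for example, to determine the number of canonical variates and to select variables. For the number of canonical variates, shortcomings of the commonly used chi-square test and redundancy analysis are identified, and a new algorithm is proposed. For the selection of the original variables, three possible approaches are proposed. Finally, the conclusions are verified using human body measurement data.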

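Application of Principal Component Analysis to the Evaluation of Regional Economic Gradients in China   Cited by: 3 (self-citations: 0; citations by others: 3)
This paper uses statistical data and the SPSS software to carry out a principal component analysis that scores, classifies, and ranks the economic gradient of 30 provinces and regions in China. It gives a relatively objective account of the general state and future prospects of regional economic development in China and offers policy recommendations, with the aim of providing better ideas for the healthy economic development of each region.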

Canonical correlation assesses the relationship between two groups of variables. Although it has been a useful tool in a wide variety of research areas, it is not well known that weaker canonical correlations require larger sample sizes to be correctly inferred. In this article, we investigate small sample bias in canonical correlation analysis and apply the jackknife bias correction to the estimation of canonical correlations. We use bootstrap samples to obtain a better confidence interval for the jackknife canonical correlation estimator.
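The jackknife bias correction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the canonical correlations are computed by the standard QR-plus-SVD route (assuming both data matrices have full column rank), and only the first canonical correlation is corrected.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Sample canonical correlations between the columns of X and Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the two centred column spaces.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

def jackknife_first_cc(X, Y):
    """Jackknife bias-corrected estimate of the first canonical correlation."""
    n = X.shape[0]
    full = canonical_correlations(X, Y)[0]
    # Leave-one-out estimates, deleting the same observation from X and Y.
    loo = np.array([
        canonical_correlations(np.delete(X, i, axis=0),
                               np.delete(Y, i, axis=0))[0]
        for i in range(n)
    ])
    # Standard jackknife correction: n * theta_hat - (n - 1) * mean(theta_(-i)).
    return n * full - (n - 1) * loo.mean()
```

Because the correction is a linear extrapolation, the returned value is not guaranteed to lie in [0, 1]. A bootstrap confidence interval, as the abstract mentions, would wrap `jackknife_first_cc` in a resampling loop over rows.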

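A Discussion of Methods for Making Indicators Directionally Consistent in Principal Component and Factor Analysis   Cited by: 14 (self-citations: 0; citations by others: 14)
Sample principal component analysis and sample factor analysis have become some of the most important methods of comprehensive evaluation, and making the indicator variables directionally consistent is an important step in applying them. This article summarizes the specific methods for making indicators directionally consistent in principal component and factor analysis, discusses the effect of these methods on comprehensive evaluation, and identifies the conditions under which they apply.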

In classical principal component analysis (PCA), the empirical influence function for the sensitivity coefficient ρ is used to detect observations that are influential on the subspace spanned by the dominant principal components. In this article, we derive the influence function of ρ in the case where the reweighted minimum covariance determinant (MCD1) is used as the estimator of multivariate location and scatter. Our aim is to confirm the robustness of the MCD1 via the influence function of the sensitivity coefficient.

The Bayesian information criterion (BIC) is widely used for variable selection. We focus on the regression setting for which variations of the BIC have been proposed. A version that includes the Fisher Information matrix of the predictor variables performed best in one published study. In this article, we extend the evaluation, introduce a performance measure involving how closely posterior probabilities are approximated, and conclude that the version that includes the Fisher Information often favors regression models having more predictors, depending on the scale and correlation structure of the predictor matrix. In the image analysis application that we describe, we therefore prefer the standard BIC approximation because of its relative simplicity and competitive performance at approximating the true posterior probabilities.


In influence analysis several problems arise in the field of Principal Components when applying different sample versions. Among these are the difficulty of determining a certain correspondence between the eigenvalues before and after the deletion of observations, the choice of the sign of the eigenvectors and the computational problem derived from the resolution of a great number of eigenproblems. In this article, such problems are discussed from the joint influence point of view and a solution is proposed by using approximations. Furthermore, the influence on a new parameter of interest is introduced: the proportion of variance explained by a set of principal components.

Abstract. Lasso and other regularization procedures are attractive methods for variable selection, subject to a proper choice of shrinkage parameter. Given a set of potential subsets produced by a regularization algorithm, a consistent model selection criterion is proposed to select the best one among this preselected set. The approach leads to a fast and efficient procedure for variable selection, especially in high‐dimensional settings. Model selection consistency of the suggested criterion is proven when the number of covariates d is fixed. Simulation studies suggest that the criterion still enjoys model selection consistency when d is much larger than the sample size. The simulations also show that our approach for variable selection works surprisingly well in comparison with existing competitors. The method is also applied to a real data set.

Supersaturated designs are a large class of factorial designs which can be used for screening out the important factors from a large set of potentially active variables. The huge advantage of these designs is that they reduce the experimental cost drastically, but their critical disadvantage is the confounding involved in the statistical analysis. In this article, we propose a method for analyzing data using several types of supersaturated designs. Modifications of widely used information criteria are given and applied to the variable selection procedure for the identification of the active factors. The effectiveness of the proposed method is depicted via simulated experiments and comparisons.

This work is devoted to robust principal component analysis (PCA). We compare several multivariate estimators of location and scatter by computing the influence functions of the sensitivity coefficient ρ corresponding to these estimators, and the mean squared error (MSE) of estimators of ρ. The coefficient ρ measures the closeness between the subspace spanned by the initial eigenvectors and the subspace spanned by their corresponding versions derived from an infinitesimal perturbation of the data distribution.

In this paper, we introduce linear modeling of canonical correlation analysis, which estimates canonical direction matrices by minimising a quadratic objective function. The linear modeling results in a class of estimators of canonical direction matrices, and an optimal class is derived in the sense described herein. The optimal class offers several desirable advantages: first, its estimates of canonical direction matrices are asymptotically efficient; second, its test statistic for determining the number of canonical covariates always has a chi‐squared distribution asymptotically; third, it is straightforward to construct tests for variable selection. The standard canonical correlation analysis and other existing methods turn out to be suboptimal members of the class. Finally, we study the role of canonical variates as a means of dimension reduction for predictors and responses in multivariate regression. Numerical studies and data analysis are presented.

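Principal Component Analysis for the Comprehensive Evaluation of Science and Technology Journal Quality and Its Improvement   Cited by: 1 (self-citations: 0; citations by others: 1)
Principal component analysis is applied to the comprehensive quality evaluation of science and technology journals in the polytechnic-university and general-industry categories. The effective dimension and the weights of the principal components are determined from their cumulative contribution, which removes the bias caused by correlation among indicators and the defects of subjectively assigned indicator weights, making the evaluation more objective, fair, and accurate. The effects of the number of evaluation indicators and the number of journals on the results are studied, so that reasonable indicators are determined and reliable, valid results are obtained. Of 18 indicators, 14 effective indicators are finally selected on the principle of retaining variables that play an important role, and all of them have a positive effect on journal quality. Five of them, including the number of citing journals and the discipline diffusion index, are the most important, while the impact factor is the least important.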
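One common way to determine the effective dimension and component weights from cumulative contribution is sketched below. It is an assumption-laden illustration, not the paper's procedure: the function name `effective_components`, the 85% threshold, and the use of normalised variance contributions as weights are all choices made for this example.

```python
import numpy as np

def effective_components(eigenvalues, threshold=0.85):
    """Pick the number of principal components and their weights.

    eigenvalues: eigenvalues of the correlation (or covariance) matrix.
    threshold:   assumed cumulative-contribution cutoff (0.85 here).
    """
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    ratio = lam / lam.sum()                 # contribution of each component
    cumulative = np.cumsum(ratio)           # cumulative contribution
    # Smallest k whose cumulative contribution reaches the threshold.
    k = min(int(np.searchsorted(cumulative, threshold)) + 1, lam.size)
    # Weights of the retained components, renormalised to sum to one.
    weights = ratio[:k] / ratio[:k].sum()
    return k, weights
```

For eigenvalues `[5, 3, 1, 1]` the cumulative contributions are 0.5, 0.8, 0.9, and 1.0, so three components are retained at the assumed 85% cutoff. A composite score would then be the weighted sum of the retained component scores.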

A Cluster Analysis Method Based on Weighted Principal Component Distance   Cited by: 1 (self-citations: 0; citations by others: 1)
吕岩威  李平 《统计研究》2016,33(11):102-108
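High correlation among indicators and differences in their importance often prevent traditional cluster analysis methods from producing good classifications. Building on a discussion of the limitations of traditional cluster analysis methods and their various improvements, this paper mathematically reconstructs the concept of distance used in defining classifications. By defining an adaptively weighted principal component distance as the classification statistic, it proposes a new, improved principal component clustering method: weighted principal component distance cluster analysis. Theoretical analysis shows that the method integrates the strengths of existing cluster analysis methods and has a sound theoretical basis. Simulation results show that it effectively resolves the distortion that existing methods exhibit in particular situations and yields better classifications.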

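A Comparison of Principal Component Analysis and Factor Analysis: Similarities, Differences, and Applications   Cited by: 51 (self-citations: 0; citations by others: 51)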
王芳 《统计教育》2003,(5):14-17
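Principal component analysis and factor analysis are both multivariate statistical methods that start from the variance-covariance structure of the variables and use a small number of new variables to explain the original ones while retaining as much of the original information as possible. In teaching practice, students were found to lack a clear understanding of how to apply the two methods to dimension reduction problems. This paper therefore compares principal component analysis and factor analysis from several angles, including their basic ideas, how they are used, and the analysis of their statistics, and supplements the comparison with a worked example.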

In this article, we develop a robust variable selection procedure jointly for fixed and random effects in linear mixed models for longitudinal data. We propose a penalized robust estimator for both the regression coefficients and the variance of random effects based on a re-parametrization of the linear mixed models. Under some regularity conditions, we show the oracle properties of the proposed robust variable selection method. A simulation study shows the robustness of the proposed method against outliers. Finally, the proposed method is illustrated in the analysis of a real data set.

We propose a penalized quantile regression for the partially linear varying coefficient (VC) model with longitudinal data to select relevant nonparametric and parametric components simultaneously. Selection consistency and the oracle property are established. Furthermore, if the linear part and the VC part are unknown, we propose a new unified method that can do three types of selections, including separation of varying and constant effects and selection of relevant variables, and that can be carried out conveniently in one step. Consistency in the three types of selections and the oracle property in estimation are established as well. Simulation studies and real data analysis also confirm our method.

Principal components are often used for reducing dimensions in multivariate data, but they frequently fail to provide useful results and their interpretation is rather difficult. In this article, the use of entropy optimization principles for dimensional reduction in multivariate data is proposed. Under the assumptions of multivariate normality, a four-step procedure is developed for selecting principal variables and hence discarding redundant variables. For comparative performance of the information theoretic procedure, we use simulated data with known dimensionality. Principal variables of cluster bean (Guar) are identified by applying this procedure to a real data set generated in a plant breeding experiment.
