Similar Documents
20 similar documents found
1.
The autoregressive (AR) time-series model is easily affected by outliers during model fitting, so the computed results can deviate from reality. To address this, the FQn statistic is used to improve the traditional autocorrelation function, yielding a robust estimation algorithm for the AR model that resists the influence of outliers; the method is examined by both simulation and empirical analysis. Both show that when the time-series data contain no outliers, the traditional and robust estimation methods give essentially the same results; when outliers are present, the results of the traditional method change substantially while those of the robust method remain essentially unchanged. This indicates that, compared with the traditional method, the robust estimation method effectively resists the influence of outliers and exhibits good resistance to interference and a high breakdown point.
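
As an illustration of the general idea (a hedged sketch, not the FQn-based algorithm of the paper itself), the Python snippet below builds a robust autocorrelation from a Gnanadesikan–Kettenring-type construction, using the MAD as a stand-in robust scale, and plugs it into the Yule-Walker equations of an AR model; the function names and the choice of scale estimator are assumptions of this sketch.

    # Sketch: robust lag-k autocorrelation via a robust scale estimator (MAD here,
    # as a stand-in for the FQn statistic of the paper), plugged into the
    # Yule-Walker equations of an AR(p) model.
    import numpy as np
    from scipy.stats import median_abs_deviation

    def robust_autocorr(x, k):
        """Gnanadesikan-Kettenring-type robust correlation between x_t and x_{t+k}."""
        u, v = x[:-k], x[k:]
        u = (u - np.median(u)) / median_abs_deviation(u, scale="normal")
        v = (v - np.median(v)) / median_abs_deviation(v, scale="normal")
        s_plus = median_abs_deviation(u + v, scale="normal")
        s_minus = median_abs_deviation(u - v, scale="normal")
        return (s_plus**2 - s_minus**2) / (s_plus**2 + s_minus**2)

    def robust_yule_walker(x, p):
        """Solve the Yule-Walker system with robust autocorrelations."""
        rho = np.array([robust_autocorr(x, k) for k in range(1, p + 1)])
        R = np.eye(p)
        for i in range(p):
            for j in range(p):
                if i != j:
                    R[i, j] = rho[abs(i - j) - 1]
        return np.linalg.solve(R, rho)   # AR coefficients phi_1, ..., phi_p

    # Example: AR(1) series with a few planted outliers
    rng = np.random.default_rng(0)
    x = np.zeros(500)
    for t in range(1, 500):
        x[t] = 0.6 * x[t - 1] + rng.normal()
    x[[50, 200, 350]] += 15               # additive outliers
    print(robust_yule_walker(x, 1))       # stays close to 0.6 despite the outliers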

2.
Given that regional technological-innovation data involve many indicators and frequently contain outliers, a robust principal component method is used to analyse regional innovation performance data. The method discards the weakness of the traditional approach, in which the mean and variance are easily distorted by outliers, so that the composite evaluation of the many regional indicators is more accurate and better reflects reality.

3.
The Yule-Walker estimator of the autoregressive (AR) time-series model is easily affected by outliers during model fitting, so the computed results can deviate from reality. To address this, a robust autocorrelation function is constructed from robust combined estimators of the mean and variance, yielding a robust Yule-Walker estimation algorithm for the AR model that resists the influence of outliers. The method is examined by simulation and by an empirical test on financial data. Both show that when the time-series data contain no outliers, the traditional and robust estimation methods give essentially the same results; when outliers are present, the results of the traditional method change substantially while those of the robust method remain essentially unchanged. This indicates that, compared with the traditional method, the robust estimation method effectively resists the influence of outliers and exhibits good resistance to interference and a high breakdown point.

4.
Because traditional factor analysis is sensitive to outliers, its results can deviate from reality. To address this, the FAST-MCD method is used to improve traditional factor analysis, yielding a robust factor analysis algorithm that resists the influence of outliers; the method is examined by simulation and empirical analysis. Both show that, before and after factor rotation, when the data contain no outliers traditional and robust factor analysis give essentially the same results; when outliers are present, the traditional results change substantially while the robust results remain essentially unchanged. Compared with traditional factor analysis, the robust method thus effectively resists outliers and shows good resistance to interference and a high breakdown point.
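
A minimal sketch of the same idea, assuming scikit-learn's MinCovDet (a FAST-MCD implementation) is available: the classical correlation matrix is replaced by its MCD counterpart before loadings are extracted. The principal-component extraction used below is for illustration only and is not necessarily the extraction method of the paper.

    # Sketch: factor loadings extracted from a robust (FAST-MCD) correlation matrix
    # instead of the classical one.
    import numpy as np
    from sklearn.covariance import MinCovDet

    def robust_factor_loadings(X, n_factors):
        mcd = MinCovDet(random_state=0).fit(X)        # robust covariance via FAST-MCD
        d = np.sqrt(np.diag(mcd.covariance_))
        R = mcd.covariance_ / np.outer(d, d)          # robust correlation matrix
        eigval, eigvec = np.linalg.eigh(R)
        order = np.argsort(eigval)[::-1][:n_factors]
        # loading_j = sqrt(lambda_j) * eigenvector_j
        return eigvec[:, order] * np.sqrt(eigval[order])

    rng = np.random.default_rng(1)
    F = rng.normal(size=(300, 2))                     # two latent factors
    X = F @ rng.normal(size=(2, 6)) + 0.3 * rng.normal(size=(300, 6))
    X[:10] += 20                                      # contaminate 10 rows
    print(robust_factor_loadings(X, n_factors=2).round(2))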

5.
To overcome the shortcomings of traditional high-dimensional covariance matrix estimators, this paper combines principal components with thresholding and proposes the thresholded principal orthogonal complement (TPO) estimator. The estimator captures the information in the high-dimensional covariance matrix through the first K principal components and applies a suitable threshold function to obtain a sparse estimate of the orthogonal complement, thereby effectively reducing the dimension of the data and removing the influence of noise. Simulation and empirical studies show that, compared with the strict factor model (SFM), the TPO model clearly improves the efficiency of covariance matrix estimation, and when applied to portfolio selection it yields higher returns and greater economic welfare for investors.
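
The following is a hedged sketch of the thresholding idea described above (in the spirit of POET-type estimators): keep the low-rank part spanned by the first K principal components of the sample covariance and soft-threshold the off-diagonal entries of the remaining orthogonal complement. The specific threshold rule and all names are assumptions made for this illustration, not the paper's exact estimator.

    # Sketch of a POET/TPO-style covariance estimator: low-rank part from the first
    # K principal components plus a soft-thresholded residual ("orthogonal complement").
    import numpy as np

    def tpo_covariance(X, K, tau):
        S = np.cov(X, rowvar=False)
        eigval, eigvec = np.linalg.eigh(S)
        idx = np.argsort(eigval)[::-1][:K]
        low_rank = eigvec[:, idx] @ np.diag(eigval[idx]) @ eigvec[:, idx].T
        resid = S - low_rank                          # orthogonal complement
        off = np.sign(resid) * np.maximum(np.abs(resid) - tau, 0.0)
        np.fill_diagonal(off, np.diag(resid))         # keep the diagonal untouched
        return low_rank + off

    rng = np.random.default_rng(2)
    B = rng.normal(size=(50, 3))                      # 3 common factors, p = 50
    X = rng.normal(size=(200, 3)) @ B.T + rng.normal(size=(200, 50))
    Sigma_hat = tpo_covariance(X, K=3, tau=0.1)
    print(np.linalg.cond(Sigma_hat))                  # condition number of the regularised estimate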

6.
A Comparison of the Similarities and Differences between Principal Component Analysis and Factor Analysis, and Their Applications   (cited 51 times: 0 self-citations, 51 by others)
王芳 《统计教育》2003,(5):14-17
Both principal component analysis and factor analysis start from the variance-covariance structure of the variables and are multivariate statistical methods that explain the original variables with a small number of new variables while retaining as much of the original information as possible. Teaching practice shows that students' understanding of how to use the two methods for dimension reduction is often unclear, so this paper compares them from several angles, including their basic ideas, how they are used, and the analysis of the relevant statistics, and illustrates the comparison with an example.

7.
Issues to Note in Applying Principal Component Analysis   (cited 1 time: 0 self-citations, 1 by others)
Principal component analysis is widely used, but some users are unclear about when multivariate data can be reduced in dimension, how the number of principal components is best determined, and when principal component analysis alone is preferable; they also frequently confuse principal component coefficients with initial factor loadings, and therefore fail to reach sound conclusions. Similar problems appear, for example, in principal component analyses based on the covariance matrix in some multivariate statistics textbooks. This paper resolves these issues, both theoretically (from the standpoint of correlation) and empirically, and offers sound conclusions and corresponding recommendations.
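
A small sketch of the distinction the abstract says is often confused: principal component coefficients are eigenvectors of the correlation matrix, while initial factor loadings are those eigenvectors scaled by the square roots of the corresponding eigenvalues.

    # Sketch: principal component coefficients (eigenvectors of the correlation
    # matrix) versus initial factor loadings (eigenvectors scaled by sqrt(eigenvalue)).
    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.multivariate_normal([0, 0, 0], [[1, .8, .3], [.8, 1, .3], [.3, .3, 1]], size=500)
    R = np.corrcoef(X, rowvar=False)
    eigval, eigvec = np.linalg.eigh(R)
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]     # descending order

    pc_coefficients = eigvec                           # used to compute PC scores
    initial_loadings = eigvec * np.sqrt(eigval)        # correlations of variables with PCs

    print(pc_coefficients[:, 0].round(3))
    print(initial_loadings[:, 0].round(3))             # same direction, different scale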

8.
刘丽萍 et al. 《统计研究》2015,32(6):105-112
Large-dimensional data pose a great challenge to traditional covariance matrix estimation; the effects of dimensionality and noise cannot be ignored. This paper combines principal components with thresholding in the estimation of the DCC model and proposes a DCC model based on the principal orthogonal complement thresholding method (poetDCC). The poetDCC model captures the information in the high-dimensional dynamic conditional covariance matrix through the first K principal components and applies a threshold function to the orthogonal complement of the matrix, which effectively reduces the dimension and removes the influence of noise. Simulation and empirical studies show that, compared with the DCC model, the poetDCC model clearly improves the estimation and forecasting efficiency of the high-dimensional covariance matrix, and when applied to portfolio selection it yields higher returns and greater economic welfare for investors.

9.
High-dimensional data pose a great challenge to traditional covariance matrix estimation; dimensionality and noise make the traditional CCC-GARCH model difficult to estimate. By combining principal components with thresholding in the estimation of the CCC-GARCH model, a CCC-GARCH model based on the principal orthogonal complement thresholding method (PTCCC-GARCH) is proposed. The PTCCC model captures the information in the large covariance matrix through the first K optimal principal components and uses a threshold function to remove the influence of noise. Simulation and empirical studies show that, compared with the CCC-GARCH model, the PTCCC-GARCH model clearly improves the estimation and forecasting efficiency of the high-dimensional covariance matrix, and when applied to portfolio selection it yields higher returns and greater economic welfare for investors.

10.
Statistical Testing Issues in Principal Component Analysis   (cited 8 times: 0 self-citations, 8 by others)
Principal component analysis has become an increasingly widely used multivariate statistical method. In applications, however, it is often applied blindly: too little attention is paid to its suitability and to whether the number of retained components is reasonable, let alone to statistical testing of the results. To apply principal component analysis properly, its results should be tested statistically and a system of statistical tests should be established. Such a system mainly includes a test of the suitability of principal component analysis, a test of equal correlation, hypothesis tests on the principal component variances, and a test for choosing the number of principal components.

11.
In this paper, we propose a novel robust principal component analysis (PCA) for high-dimensional data in the presence of various heterogeneities, in particular strong tailing and outliers. A transformation motivated by the characteristic function is constructed to improve the robustness of the classical PCA. The suggested method has the distinct advantage of dealing with heavy-tail-distributed data, whose covariances may be non-existent (positively infinite, for instance), in addition to the usual outliers. The proposed approach is also a case of kernel principal component analysis (KPCA) and employs the robust and non-linear properties via a bounded and non-linear kernel function. The merits of the new method are illustrated by some statistical properties, including the upper bound of the excess error and the behaviour of the large eigenvalues under a spiked covariance model. Additionally, using a variety of simulations, we demonstrate the benefits of our approach over the classical PCA. Finally, using data on protein expression in mice of various genotypes in a biological study, we apply the novel robust PCA to categorise the mice and find that our approach is more effective at identifying abnormal mice than the classical PCA.

12.
In this paper, a new method for robust principal component analysis (PCA) is proposed. PCA is a widely used tool for dimension reduction without substantial loss of information, but classical PCA is vulnerable to outliers because it depends on the empirical covariance matrix. To avoid this weakness, several alternative approaches based on robust scatter matrices have been suggested; a popular choice is ROBPCA, which combines projection pursuit ideas with robust covariance estimation via a variance maximization criterion. Our approach is instead based on the fact that PCA can be formulated as a regression-type optimization problem, which is the main difference from the previous approaches. The proposed robust PCA is derived by replacing the squared loss with a robust loss, the Huber loss function. A practical algorithm is proposed to carry out the optimization, and the convergence properties of the algorithm are investigated. Results from a simulation study and a real data example demonstrate the promising empirical properties of the proposed method.
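
A hedged sketch of the regression-type formulation: the snippet below approximates Huber-loss PCA by iterative reweighting, down-weighting observations whose reconstruction error is large. It is an illustrative stand-in for, not a reproduction of, the algorithm proposed in the paper; the tuning constant c = 1.345 and the MAD-based scale are assumptions.

    # Sketch: an iteratively reweighted approximation to Huber-loss PCA. Points with
    # large reconstruction error get Huber weights < 1, so outliers pull less on the
    # fitted subspace.
    import numpy as np

    def huber_weights(r, c):
        return np.minimum(1.0, c / np.maximum(r, 1e-12))   # psi(r)/r for the Huber loss

    def robust_pca_huber(X, k, c=1.345, n_iter=50):
        mu = np.median(X, axis=0)
        w = np.ones(len(X))
        for _ in range(n_iter):
            Xc = X - mu
            C = (Xc * w[:, None]).T @ Xc / w.sum()          # weighted covariance
            eigval, eigvec = np.linalg.eigh(C)
            V = eigvec[:, np.argsort(eigval)[::-1][:k]]     # top-k subspace
            resid = np.linalg.norm(Xc - Xc @ V @ V.T, axis=1)
            scale = np.median(resid) / 0.6745 + 1e-12       # rough robust scale
            w = huber_weights(resid / scale, c)
            mu = (w[:, None] * X).sum(axis=0) / w.sum()     # reweighted centre
        return mu, V

    rng = np.random.default_rng(4)
    X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
    X[:15] += 25                                            # 15 outlying rows
    mu, V = robust_pca_huber(X, k=2)
    print(V.round(2))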

13.
The estimation of the covariance matrix is important in the analysis of bivariate longitudinal data. A good estimator of the covariance matrix can improve the efficiency of the estimators of the mean regression coefficients, and the covariance estimation is also of interest in its own right; however, modelling the covariance matrix of bivariate longitudinal data is challenging because of its complex structure and the positive definite constraint. In addition, most existing approaches are based on maximum likelihood, which is very sensitive to outliers and heavy-tailed error distributions. In this article, an adaptive robust estimation method is proposed for bivariate longitudinal data. Unlike the existing likelihood-based methods, the proposed method can adapt to different error distributions. Specifically, we first use the modified Cholesky block decomposition to parameterize the covariance matrices; we then apply the bounded Huber score function to develop a set of robust generalized estimating equations that estimate the parameters in the mean and covariance models simultaneously. A data-driven approach is presented for selecting the parameter c in Huber's score function, which ensures that the proposed method is both robust and efficient. A simulation study and a real data analysis illustrate the robustness and efficiency of the proposed approach.
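
As a hedged illustration of the parameterization step only (the scalar, not the block, version), the snippet below computes the modified Cholesky decomposition T Sigma T' = D, whose unit lower-triangular factor T carries autoregressive coefficients and whose diagonal matrix D carries innovation variances; reversing the map gives a positive definite covariance for any parameter values.

    # Sketch: the (scalar) modified Cholesky decomposition T Sigma T' = D that the
    # abstract's block version generalises. T is unit lower triangular and holds
    # negated autoregressive coefficients; D holds innovation variances.
    import numpy as np

    def modified_cholesky(Sigma):
        m = Sigma.shape[0]
        T, D = np.eye(m), np.zeros(m)
        D[0] = Sigma[0, 0]
        for j in range(1, m):
            phi = np.linalg.solve(Sigma[:j, :j], Sigma[:j, j])   # regress y_j on y_1..y_{j-1}
            T[j, :j] = -phi
            D[j] = Sigma[j, j] - Sigma[:j, j] @ phi              # innovation variance
        return T, np.diag(D)

    Sigma = np.array([[2.0, 0.8, 0.3],
                      [0.8, 1.5, 0.6],
                      [0.3, 0.6, 1.2]])
    T, D = modified_cholesky(Sigma)
    print(np.allclose(T @ Sigma @ T.T, D))                       # True
    print(np.allclose(np.linalg.inv(T) @ D @ np.linalg.inv(T).T, Sigma))  # True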

14.
In this paper, we study the estimation of linear models in the framework of longitudinal data with dropouts. Under the assumptions that the random errors follow an elliptical distribution and that all subjects share the same within-subject covariance matrix, which does not depend on covariates, we develop a robust method for simultaneous estimation of the mean and covariance. The proposed method is robust against outliers and does not require modelling the covariance or the missing-data process. Theoretical properties of the proposed estimator are established, and simulation studies show its good performance. Finally, the proposed method is applied to a real data analysis for illustration.

15.
Many methods have been developed for detecting multiple outliers in a single multivariate sample, but very few for the case where there may be groups in the data. We propose a method of simultaneously determining groups (as in cluster analysis) and detecting outliers, which are points that are distant from every group. Our method is an adaptation of the BACON algorithm proposed by Billor, Hadi and Velleman for the robust detection of multiple outliers in a single group of multivariate data. There are two versions of our method, depending on whether or not the groups can be assumed to have equal covariance matrices. The effectiveness of the method is illustrated by its application to two real data sets and further shown by a simulation study for different sample sizes and dimensions for 2 and 3 groups, with and without planted outliers in the data. When the number of groups is not known in advance, the algorithm could be used as a robust method of cluster analysis, by running it for various numbers of groups and choosing the best solution.
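
For context, here is a hedged sketch of the single-group BACON algorithm that the paper adapts: grow a basic subset from the points closest to a robust centre and flag, at convergence, everything beyond a chi-squared cutoff. The correction factor of the original algorithm is omitted for brevity, so the cutoff below is only illustrative.

    # Sketch of single-group BACON (Billor, Hadi and Velleman): grow a "clean" basic
    # subset and flag points outside a chi-squared cutoff at convergence.
    import numpy as np
    from scipy.stats import chi2

    def bacon_outliers(X, alpha=0.05):
        n, p = X.shape
        m = min(4 * p, n // 2)
        d0 = np.linalg.norm(X - np.median(X, axis=0), axis=1)
        subset = np.argsort(d0)[:m]                    # initial basic subset (coordinate-wise median start)
        cutoff = np.sqrt(chi2.ppf(1 - alpha / n, p))
        while True:
            mu = X[subset].mean(axis=0)
            S = np.cov(X[subset], rowvar=False)
            d = np.sqrt(np.einsum("ij,jk,ik->i", X - mu, np.linalg.inv(S), X - mu))
            new_subset = np.where(d < cutoff)[0]
            if set(new_subset) == set(subset):
                return np.setdiff1d(np.arange(n), new_subset)   # indices flagged as outliers
            subset = new_subset

    rng = np.random.default_rng(5)
    X = rng.normal(size=(200, 3))
    X[:5] += 8                                         # plant 5 outliers
    print(bacon_outliers(X))                           # should include 0..4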

16.
宋鹏 et al. 《统计研究》2020,37(7):116-128
Estimation of high-dimensional covariance matrices has become a basic problem in big-data statistical analysis. Traditional methods require a normality assumption and ignore the influence of outliers, so they no longer meet practical needs, and more robust estimation methods are urgently required. For high-dimensional covariance matrices, a robust and easily implemented mean-median estimator based on sub-sample grouping has been proposed, but the estimated matrix is neither positive definite nor sparse. To remedy this defect, this paper introduces a centred regularization algorithm: an L1 penalty is imposed on the off-diagonal elements of the estimated matrix during the optimization, so that the resulting estimate is positive definite and sparse, which greatly increases its practical value. In simulations, the proposed centred-regularized robust estimator achieves higher estimation accuracy and more closely matches the sparsity structure of the true matrix. In a subsequent portfolio application, the minimum-variance portfolio constructed from the centred-regularized robust estimator shows lower return volatility than portfolios based on the traditional sample covariance matrix, the mean-median estimator, and the RA-LASSO method.
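
A hedged sketch of the two ingredients combined above, under the assumption that the mean-median step can be approximated by an element-wise median of sub-sample covariance matrices: a soft threshold is applied to the off-diagonal entries and eigenvalues are clipped so that the final estimate is sparse and positive definite. This is a simplified illustration, not the paper's centred regularization algorithm.

    # Sketch: (i) element-wise median of covariance estimates over random sub-sample
    # groups, (ii) soft-thresholding of off-diagonal entries plus eigenvalue clipping
    # to obtain a sparse, positive definite estimate.
    import numpy as np

    def mean_median_cov(X, n_groups=10, seed=0):
        rng = np.random.default_rng(seed)
        idx = rng.permutation(len(X))
        covs = [np.cov(X[g], rowvar=False) for g in np.array_split(idx, n_groups)]
        return np.median(np.stack(covs), axis=0)       # element-wise median over groups

    def sparsify_pd(S, tau, eps=1e-4):
        off = np.sign(S) * np.maximum(np.abs(S) - tau, 0.0)   # L1-style soft threshold
        np.fill_diagonal(off, np.diag(S))
        eigval, eigvec = np.linalg.eigh(off)
        eigval = np.maximum(eigval, eps)               # enforce positive definiteness
        return eigvec @ np.diag(eigval) @ eigvec.T

    rng = np.random.default_rng(6)
    X = rng.standard_t(df=3, size=(400, 30))           # heavy-tailed data
    X[:8] *= 12                                        # contaminated rows
    Sigma_hat = sparsify_pd(mean_median_cov(X), tau=0.15)
    print(np.all(np.linalg.eigvalsh(Sigma_hat) > 0))   # True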

17.
Recently, several new robust multivariate estimators of location and scatter have been proposed that provide new and improved methods for detecting multivariate outliers. But for small sample sizes, there are no results on how these new multivariate outlier detection techniques compare in terms of p_n, their outside rate per observation (the expected proportion of points declared outliers) under normality. And there are no results comparing their ability to detect truly unusual points based on the model that generated the data. Moreover, there are no results comparing these methods to two fairly new techniques that do not rely on some robust covariance matrix. It is found that for an approach based on the orthogonal Gnanadesikan–Kettenring estimator, p_n can be very unsatisfactory with small sample sizes, but a simple modification gives much more satisfactory results. Similar problems were found when using the median ball algorithm, but a modification proved to be unsatisfactory. The translated-biweight (TBS) estimator generally performs well with a sample size of n ≥ 20 when dealing with p-variate data where p ≤ 5, but with p = 8 it can be unsatisfactory, even with n = 200. A projection method as well as the minimum generalized variance method generally perform best, but with p ≤ 5, conditions under which the TBS method is preferable are described. In terms of detecting truly unusual points, the methods can differ substantially depending on where the outliers happen to be, the number of outliers present, and the correlations among the variables.

18.
The first step in statistical analysis is parameter estimation. In multivariate analysis, one of the parameters of interest is the mean vector, and it is usually assumed that the data come from a multivariate normal distribution, in which case the maximum likelihood estimator (MLE), the sample mean vector, is the best estimator. However, when outliers exist in the data, using the sample mean vector results in poor estimation, so estimators that are robust to outliers should be used instead. The most popular robust multivariate estimator of the mean vector is the S-estimator, which has desirable properties. Computing this estimator, however, requires a robust estimate of the mean vector as a starting point, and the minimum volume ellipsoid (MVE) is usually used for this purpose. For high-dimensional data, computing the MVE takes too much time; in some cases the time is so large that existing computers cannot perform the computation. Besides the computation time, the MVE is also imprecise for high-dimensional data sets. In this paper, a robust starting point for the S-estimator based on robust clustering is proposed, which can be used to estimate the mean vector of high-dimensional data. The performance of the proposed estimator in the presence of outliers is studied, and the results indicate that it performs precisely and much better than some existing robust estimators for high-dimensional data.

19.
This work suggests modifications to the construction of the GFI index using robust methods for estimating the unrestricted sample covariance matrix, leading to new indices called GFI(MCD) and GFI(MVE). The proposal was validated using Monte Carlo simulation, considering differences between the unrestricted sample covariance matrix and the one imposed by the structural model, and different numbers of outliers generated from distributions with deviations from symmetry and excess kurtosis. It is concluded that for larger sample sizes (n ≥ 100), provided the outliers come from symmetric distributions, GFI(MCD) and GFI(MVE) give similar results, including for samples with high percentages of outliers.

20.
In this study, classical and robust principal component analyses are used to evaluate the socioeconomic development of the regions served by development agencies, which were established to reduce development disparities among regions in Turkey. Because the development levels of the regions differ greatly, outliers occur, and robust statistical methods are therefore used; classical and robust methods are also used to investigate whether there are outliers in the data set. In classical principal component analysis, the number of observations must be larger than the number of variables; otherwise the determinant of the covariance matrix is zero. With ROBPCA, a robust approach to principal component analysis for high-dimensional data, principal components can be obtained even when the number of variables is larger than the number of observations. In this paper, the 26 development agencies are first evaluated on 19 variables using principal component analysis based on classical and robust scatter matrices, and then on 46 variables using the ROBPCA method.
