Similar literature
 Found 18 similar documents; search time 140 ms
1.
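A Statistical Data Quality Assessment Method Based on Robust MM-Estimation   Total citations: 2 (self-citations: 1, citations by others: 1)
卢二坡  黄炳艺 《统计研究》2010,27(12):16-22
The quality of government statistical data is currently a topic of wide concern, and assessing China's statistical data scientifically with rigorous diagnostic methods is of considerable practical importance. Robust regression yields regression estimates that are not strongly affected by outliers and identifies outlying points more reliably. This paper is the first to apply an outlier diagnostic based on robust MM-estimation: within a production-function framework, and using two different sets of labor-input data, it assesses the quality of China's GDP data since the reform. The results show that the MM-based diagnostic effectively resolves the masking of multiple outliers to which traditional methods are prone, and that China's GDP data since the reform are relatively reliable.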
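
The idea of robust-regression outlier diagnosis can be illustrated with a small sketch. It is not the authors' procedure: a full MM-estimator starts from an S-estimate, whereas this simplified version assumes a Theil-Sen starting fit and a fixed MAD scale, followed by a Tukey bisquare M-step on a single regressor. The function names, the tuning constant 4.685, the 2.5 cutoff, and the toy data are all illustrative assumptions.

```python
from statistics import median


def theil_sen(x, y):
    """High-breakdown starting fit: median of all pairwise slopes."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(len(x)) for j in range(i + 1, len(x))
              if x[j] != x[i]]
    b = median(slopes)
    a = median([yi - b * xi for xi, yi in zip(x, y)])
    return a, b


def robust_line(x, y, c=4.685, n_iter=50, cutoff=2.5):
    """Bisquare M-step started from a robust fit, with the scale held fixed."""
    a, b = theil_sen(x, y)
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    med = median(resid)
    # MAD scale of the starting residuals (assumed nonzero for this sketch)
    scale = 1.4826 * median([abs(r - med) for r in resid])
    for _ in range(n_iter):
        # Tukey bisquare weights: exactly zero once |r| >= c * scale
        w = [(1 - (r / (c * scale)) ** 2) ** 2 if abs(r) < c * scale else 0.0
             for r in resid]
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - xbar) * (yi - ybar)
                  for wi, xi, yi in zip(w, x, y))
        b = sxy / sxx
        a = ybar - b * xbar
        resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Flag points whose standardized robust residual exceeds the cutoff
    flagged = [i for i, r in enumerate(resid) if abs(r) > cutoff * scale]
    return a, b, flagged


# Toy data (hypothetical): y = 2 + 0.5 x plus small periodic noise,
# with three observations shifted upward by 10.
x = [float(i) for i in range(20)]
y = [2 + 0.5 * i + 0.1 * ((2 * i) % 5 - 2) for i in range(20)]
for i in (5, 12, 17):
    y[i] += 10.0

a, b, flagged = robust_line(x, y)
print(a, b, flagged)
```

Because the shifted points receive zero weight, they cannot pull the fitted line toward themselves, which is what lets several outliers be flagged together instead of masking one another.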

2.
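李向杰 et al. 《统计研究》2018,35(7):115-124
Classical sufficient dimension reduction methods perform poorly when the explanatory variables contain outliers or follow heavy-tailed distributions. Building on an analysis of weighting and cumulative slicing in sufficient dimension reduction theory, this paper proposes a robust dimension reduction method that combines the two: cumulative weighted sliced inverse regression (CWSIR). The method is robust to outliers in the predictors and in small samples, and it avoids having to choose the number of slices. Numerical simulations show that CWSIR outperforms traditional sliced inverse regression (SIR), cumulative slicing estimation (CUME), contour-based sliced inverse regression (CPSIR), weighted canonical correlation estimation (WCANCOR), sliced inverse median estimation (SIME), and weighted inverse regression estimation (WIRE). Finally, an analysis of real data from a video website confirms the effectiveness of CWSIR.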

3.
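Making appropriate use of existing research, this article starts from the representative industries and economic sectors that reflect GDP growth, selects a fairly comprehensive set of statistical indicators, and combines principal component regression with the Chow test from econometrics to analyze the reliability of China's GDP data from a relatively new angle.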

4.
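Using economic data for Sichuan Province over roughly the past 30 years, this article assesses the reliability of the province's regional economic growth statistics by means of relationship analysis and multi-factor model analysis. The assessment shows that Sichuan's regional economic growth statistics since the reform and opening-up are relatively reliable, with the growth rate trending upward amid fluctuations; the growth data after 1997, however, contain a degree of upward overstatement.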

5.
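A Statistical Data Quality Assessment Method Based on Classical Econometric Models   Total citations: 6 (self-citations: 2, citations by others: 4)
刘洪  黄燕 《统计研究》2009,26(3):91-96
Grounded in economic theory and taking the economic system as a whole, this paper builds an econometric model from the factors that influence the variable under study and, under the given model, applies outlier tests and statistical diagnostics to assess data quality quantitatively. A suitable model is chosen to simulate how the variable behaves; anomalous data points (outliers) are identified and tested for significance; and the flagged points are then cross-checked against multiple sources and their causes analyzed in order to judge data quality further. The paper also reports an empirical analysis of the quality of China's statistical data.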

6.
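When a time-series autoregressive (AR) model is fitted, the estimates are easily distorted by outliers, so that the computed results do not match reality. To address this, the paper uses the FQn statistic to modify the traditional autocorrelation function and constructs a robust estimation algorithm for the AR model that overcomes the influence of outliers; the method is evaluated by simulation and by an empirical analysis. Both show that when the series contains no outliers, traditional and robust estimation give essentially the same results; when outliers are present, the traditional estimates change substantially while the robust estimates remain essentially unchanged. Compared with traditional estimation, robust estimation therefore resists the influence of outliers effectively and shows good resistance to interference and high robustness.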
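
A rough sketch of what a robust autocorrelation function can look like follows. The listing does not define the FQn statistic, so this example assumes a stand-in: the Rousseeuw-Croux Qn scale plugged into the scale-based autocorrelation form (Q²(u) − Q²(v)) / (Q²(u) + Q²(v)), with u and v the lagged sums and differences. Function names and the toy series are hypothetical. For an AR(1) model the Yule-Walker coefficient estimate is simply the lag-1 autocorrelation.

```python
import math


def qn_scale(x):
    """Qn scale: a low order statistic of all pairwise absolute differences."""
    n = len(x)
    h = n // 2 + 1
    k = h * (h - 1) // 2
    dists = sorted(abs(x[i] - x[j])
                   for i in range(n) for j in range(i + 1, n))
    return 2.2219 * dists[k - 1]


def robust_acf(x, lag=1):
    """Scale-based autocorrelation built from sums and differences at a lag."""
    u = [x[t] + x[t + lag] for t in range(len(x) - lag)]
    v = [x[t] - x[t + lag] for t in range(len(x) - lag)]
    qu, qv = qn_scale(u) ** 2, qn_scale(v) ** 2
    if qu + qv == 0:
        raise ValueError("series is degenerate at this lag")
    return (qu - qv) / (qu + qv)


def classical_acf(x, lag=1):
    """Ordinary sample autocorrelation, for comparison."""
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t + lag] - m) for t in range(len(x) - lag))
    den = sum((xi - m) ** 2 for xi in x)
    return num / den


# Deterministic toy series (hypothetical), then a copy with five spikes.
clean = [math.sin(0.5 * t) for t in range(200)]
spiked = list(clean)
for t in (20, 60, 100, 140, 180):
    spiked[t] += 50.0

print(classical_acf(clean), robust_acf(clean))
print(classical_acf(spiked), robust_acf(spiked))
```

The pairwise-distance loop is quadratic in the series length, which is acceptable for a sketch but would need a faster Qn algorithm for long series.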

7.
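Using spatial econometric methods and 2008 macroeconomic data for the 31 provinces of mainland China, this paper fits a geographically weighted regression (GWR) model within an endogenous growth framework to examine empirically the relationship between export trade and economic growth. The results show that economic growth exhibits significant spatial correlation; that in mainland China the contribution of factor inputs, capital input in particular, to economic growth is on the whole greater than that of export trade, with significant differences across provinces; and that export trade affects economic growth mainly through the trade-weight and spillover channels, with the trade weight on the whole negatively correlated with growth and the spillover effect positively correlated. Policy recommendations are offered on the basis of these findings.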

8.
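熊巍 et al. 《统计研究》2020,37(5):104-116
With the rapid development of computer technology, high-dimensional compositional data are emerging in ever greater volume, accompanied by large numbers of near-zero and missing values. High dimensionality poses a major challenge to traditional statistical methods, and the heavy tails and complex covariance structure of such data make theoretical analysis harder still. How to impute near-zero values in high-dimensional compositional data robustly, and how to uncover the latent intrinsic structure, have therefore become a focus of current research. Combining a modified EM algorithm, this paper proposes a Lasso-quantile regression imputation method based on R-type clustering (SubLQR) to handle near-zero values in high-dimensional compositional data. Compared with existing imputation methods for high-dimensional near-zero values, SubLQR has the following advantages. ① Robustness and comprehensiveness: Lasso-quantile regression can effectively explore the entire conditional distribution of the response variable and also yields a more realistic high-dimensional sparse pattern. ② Efficiency and accuracy: imputation based on R-type clustering reduces computational complexity and greatly improves imputation accuracy. Simulation studies confirm that SubLQR is efficient, flexible, and accurate, with a particular advantage when zeros and outliers are numerous. Finally, SubLQR is applied to a metabolomics study of a rare disease, which further indicates that the method is widely applicable.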

9.
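Data envelopment analysis (DEA) is used to estimate the relative efficiency of infrastructure investment in China's 31 provinces from 1999 to 2006; this efficiency is taken to represent the effectiveness of each region's infrastructure investment, and the paper then examines whether its contribution to economic growth exhibits spatial spillover and threshold effects. Empirical analysis with three panel-data regression models (ordinary panel regression, spatial panel regression, and threshold panel regression) finds that infrastructure investment efficiency significantly promotes economic growth both in the region itself and in neighboring regions, and that this effect strengthens as the size of the economy grows, confirming the existence of spatial spillover and threshold effects.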

10.
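When a time-series autoregressive (AR) model is fitted by the Yule-Walker method, the estimates are easily distorted by outliers, so that the computed results do not match reality. To address this, a robust autocorrelation function is built from a robust combined estimator of the mean and variance, giving a robust Yule-Walker estimation algorithm for the AR model that overcomes the influence of outliers. The method is tested by simulation and on financial data. Both tests show that when the series contains no outliers, traditional and robust estimation give essentially the same results; when outliers are present, the traditional estimates change substantially while the robust estimates remain essentially unchanged. Compared with traditional estimation, robust estimation therefore resists the influence of outliers effectively and shows good resistance to interference and high robustness.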

11.
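王斌会 《统计研究》2007,24(8):72-76
Traditional multivariate methods such as principal component analysis and factor analysis share a common first step: they compute the sample mean vector and covariance matrix and derive all other statistics from these two. When the sample contains no outliers, these methods give excellent results. When it does, the results are easily distorted, because the traditional mean vector and covariance matrix are not robust statistics. This paper studies the algorithm of the currently popular FAST-MCD method, constructs a robust mean vector and a robust covariance matrix, applies them to principal component analysis, and proposes improvements to address the method's shortcomings. Simulation and empirical results show that the improved method and the new robust estimators do resist outliers well and greatly reduce their influence on the results.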
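
The core of an MCD-type estimator is the concentration step (C-step): fit a mean and covariance to a subset, keep the h points closest in Mahalanobis distance, and repeat until the covariance determinant stops shrinking. The sketch below illustrates only that idea, for two-dimensional data so that the matrix inverse can be written by hand. It is neither the FAST-MCD algorithm in full (no nested subsampling, no reweighting step) nor the paper's improved version; function names, the number of random starts, and the toy data are assumptions.

```python
import math
import random


def mean_cov(pts):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in pts) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def det2(cov):
    sxx, sxy, syy = cov
    return sxx * syy - sxy * sxy


def c_step(pts, mean, cov, h):
    """Keep the h points with the smallest squared Mahalanobis distance."""
    sxx, sxy, syy = cov
    d = det2(cov)

    def dist2(p):
        dx, dy = p[0] - mean[0], p[1] - mean[1]
        return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / d

    return sorted(pts, key=dist2)[:h]


def mcd_2d(pts, h, n_starts=50, seed=0, tol=1e-12):
    """Best (determinant, mean, cov) over random starts, or None if all fail."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        mean, cov = mean_cov(rng.sample(pts, 3))
        if det2(cov) <= tol:
            continue  # degenerate (collinear) starting triple
        best_here = None
        while True:
            subset = c_step(pts, mean, cov, h)
            mean, cov = mean_cov(subset)
            d = det2(cov)
            # Stop once the determinant no longer decreases
            if d <= tol or (best_here is not None and d >= best_here[0]):
                break
            best_here = (d, mean, cov)
        if best_here is not None and (best is None or best_here[0] < best[0]):
            best = best_here
    return best


# Toy data (hypothetical): 40 points around the origin plus 8 far outliers.
clean = [((1 + i % 3) * math.cos(i), (1 + i % 3) * math.sin(i))
         for i in range(40)]
outliers = [(20 + 0.1 * j, 20 - 0.1 * j) for j in range(8)]
pts = clean + outliers

classical_mean, classical_cov = mean_cov(pts)
mcd_det, robust_mean, robust_cov = mcd_2d(pts, h=36)
print(classical_mean, robust_mean)
```

A robust principal component analysis would then take the eigenvectors of the robust covariance matrix in place of those of the classical one.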

12.
Univariate time series often take the form of a collection of curves observed sequentially over time. Examples of these include hourly ground-level ozone concentration curves. These curves can be viewed as a time series of functions observed at equally spaced intervals over a dense grid. Since functional time series may contain various types of outliers, we introduce a robust functional time series forecasting method to down-weight the influence of outliers in forecasting. Through a robust principal component analysis based on projection pursuit, a time series of functions can be decomposed into a set of robust dynamic functional principal components and their associated scores. Conditioning on the estimated functional principal components, the crux of the curve-forecasting problem lies in modelling and forecasting principal component scores, through a robust vector autoregressive forecasting method. Via a simulation study and an empirical study on forecasting ground-level ozone concentration, the robust method demonstrates the superior forecast accuracy that dynamic functional principal component regression entails. The robust method also shows the superior estimation accuracy of the parameters in the vector autoregressive models for modelling and forecasting principal component scores, and thus improves curve forecast accuracy.

13.
In this article, we introduce the restricted principal components regression (RPCR) estimator by combining the approaches followed in obtaining the restricted least squares estimator and the principal components regression estimator. The performance of the RPCR estimator with respect to the matrix and the generalized mean square error is examined. We also suggest a testing procedure for linear restrictions in principal components regression by using singly and doubly non-central F distributions.

14.
One advantage of quantile regression, relative to ordinary least-squares (OLS) regression, is that the quantile regression estimates are more robust against outliers and non-normal errors in the response measurements. However, the relative efficiency of the quantile regression estimator with respect to the OLS estimator can be arbitrarily small. To overcome this problem, composite quantile regression methods have been proposed in the literature which are resistant to heavy-tailed errors or outliers in the response and at the same time are more efficient than the traditional single quantile-based quantile regression method. This paper studies composite quantile regression from a Bayesian perspective. The advantage of the Bayesian hierarchical framework is that the weight of each component in the composite model can be treated as an open parameter and automatically estimated through a Markov chain Monte Carlo sampling procedure. Moreover, the lasso regularization can be naturally incorporated into the model to perform variable selection. The performance of the proposed method over the single quantile-based method was demonstrated via extensive simulations and real data analysis.
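
To make the objects in this abstract concrete, here is a minimal sketch of the quantile check loss and of the usual unweighted composite quantile regression objective, in which several quantile levels share one slope but each has its own intercept. It shows only the loss functions; the Bayesian hierarchy, the component weights, and the MCMC sampler described in the abstract are not implemented, and the function names and toy data are hypothetical.

```python
def check_loss(u, tau):
    """Quantile check (pinball) loss: u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quantile_objective(q, data, tau):
    """Total check loss of a candidate quantile q for the sample."""
    return sum(check_loss(y - q, tau) for y in data)


def composite_objective(intercepts, slope, x, y, taus):
    """Composite objective: one intercept per quantile level, one shared slope."""
    return sum(check_loss(yi - b - slope * xi, tau)
               for b, tau in zip(intercepts, taus)
               for xi, yi in zip(x, y))


# With tau = 0.5 the check loss is half the absolute error, so the
# minimizer over the sample values is the median, not the mean.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
best = min(data, key=lambda q: quantile_objective(q, data, 0.5))
print(best, sum(data) / len(data))
```

The single extreme value 100 moves the mean to 22 but leaves the check-loss minimizer at the sample median, which is the robustness property the abstract refers to.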

15.
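Traditional factor analysis is sensitive to outliers, so its results can fail to match reality. To address this, this paper uses the FAST-MCD method to modify traditional factor analysis and constructs a robust factor analysis algorithm that overcomes the influence of outliers; the method is evaluated by simulation and by an empirical analysis. Both show that, before and after factor rotation, traditional and robust factor analysis give essentially the same results when the data contain no outliers; when outliers are present, the results of traditional factor analysis change substantially while those of robust factor analysis remain essentially unchanged. Compared with traditional factor analysis, robust factor analysis therefore resists the influence of outliers effectively and shows good resistance to interference and high robustness.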

16.
In this study, classical and robust principal component analyses are used to evaluate the socioeconomic development of the regions served by development agencies, which aim to reduce development disparities among regions in Turkey. Because development levels differ greatly across regions, outliers arise, and robust statistical methods are therefore used. Classical and robust statistical methods are also used to investigate whether the data set contains any outliers. In classical principal component analysis, the number of observations must be larger than the number of variables; otherwise the determinant of the covariance matrix is zero. The Robust method for Principal Component Analysis (ROBPCA), a robust approach to principal component analysis for high-dimensional data, yields principal components even when the number of variables is larger than the number of observations. In this paper, the 26 development agencies are first evaluated on 19 variables using principal component analysis based on classical and robust scatter matrices, and then on 46 variables using the ROBPCA method.

17.
Principal component analysis (PCA) is widely used to analyze high-dimensional data, but it is very sensitive to outliers. Robust PCA methods seek fits that are unaffected by the outliers and can therefore be trusted to reveal them. FastHCS (high-dimensional congruent subsets) is a robust PCA algorithm suitable for high-dimensional applications, including cases where the number of variables exceeds the number of observations. After detailing the FastHCS algorithm, we carry out an extensive simulation study and three real data applications, the results of which show that FastHCS is systematically more robust to outliers than state-of-the-art methods.

18.
Longitudinal data are commonly modeled with normal mixed-effects models. Most modeling methods are based on traditional mean regression, which results in non-robust estimation in the presence of extreme values or outliers. Median regression is also not the best choice for estimation, especially for non-normal errors. Compared to conventional modeling methods, composite quantile regression can provide robust estimation results even for non-normal errors. In this paper, based on a so-called pseudo composite asymmetric Laplace distribution (PCALD), we develop a Bayesian treatment of composite quantile regression for mixed-effects models. Furthermore, with the location-scale mixture representation of the PCALD, we establish a Bayesian hierarchical model and achieve posterior inference for all unknown parameters and latent variables using the Markov chain Monte Carlo (MCMC) method. Finally, the newly developed procedure is illustrated by Monte Carlo simulations and a case analysis of an HIV/AIDS clinical data set.
