Similar literature
20 similar documents found (search time: 33 ms)
1.
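Cook's distance is commonly used to diagnose outliers in regression models, but because the sample variance in the formula is sensitive to outliers, the formula lacks robustness and its diagnostic performance is unsatisfactory. To address this, the article takes the median absolute deviation as a robust estimator of the sample standard deviation, derives from it a robust estimator of the sample variance, and uses that to construct a robust Cook's distance. Drawing on the theory of outlier diagnosis in regression models with the traditional Cook's distance, it applies the robust Cook's distance to outlier diagnosis in time series, which extends the range of application of the traditional formula. Worked examples use simulated ARMA(1,1) series with sample sizes of 50, 100, and 200 and contamination rates of 0, 1%, 5%, and 10%, together with a financial time series. The results show that: (1) with no contamination, the robust and conventional Cook's distance methods both achieve 100% diagnostic accuracy, and neither produces a "misdiagnosis"; (2) as the sample size and contamination rate increase together, the diagnostic accuracy of the conventional Cook's distance drops sharply, and once the contamination rate reaches 5% or more it has essentially no diagnostic power, whereas the robust Cook's distance still retains relatively high diagnostic power. The robust Cook's distance method can be applied to outlier diagnosis in regression analysis as well as in time series.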
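As an illustration of the idea in this abstract, the sketch below computes Cook's distance for an ordinary least-squares fit and, optionally, swaps the residual variance for a MAD-based estimate. It is a minimal sketch under stated assumptions, not the article's formula: the consistency constant 1.4826, the function name `cook_distances`, and the toy data are all assumptions introduced here, and the example uses a plain regression rather than an ARMA(1,1) series.

```python
import numpy as np

# Assumed consistency constant: scales the MAD so that it estimates
# the standard deviation when the residuals are normal.
MAD_TO_SIGMA = 1.4826

def cook_distances(X, y, robust=False):
    """Cook's distance for an OLS fit.

    With robust=True the residual variance s^2 is replaced by a
    MAD-based estimate (an assumed form of the 'robust' variant).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Leverages: diagonal of the hat matrix, computed from a thin QR of X.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)
    if robust:
        mad = np.median(np.abs(resid - np.median(resid)))
        s2 = (MAD_TO_SIGMA * mad) ** 2
    else:
        s2 = resid @ resid / (n - p)
    return resid**2 / (p * s2) * h / (1.0 - h) ** 2

# Toy data (hypothetical): a straight line with small deterministic noise
# and one gross outlier at index 10.
x = np.arange(20, dtype=float)
y = 2.0 + 0.5 * x + 0.1 * np.sin(x)
y[10] += 10.0
X = np.column_stack([np.ones_like(x), x])

d_classic = cook_distances(X, y)
d_robust = cook_distances(X, y, robust=True)
print(int(np.argmax(d_classic)), int(np.argmax(d_robust)))
```

In this sketch both variants share the same OLS residuals and leverages, so the swap rescales every distance by a common factor; the difference shows up when the distances are compared against a fixed cutoff, because the outlier no longer inflates the variance in the denominator.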

2.
In this paper, we extend the censored linear regression model with normal errors to Student-t errors. A simple EM-type algorithm for iteratively computing maximum-likelihood estimates of the parameters is presented. To examine the performance of the proposed model, case-deletion and local influence techniques are developed to show its robust aspect against outlying and influential observations. This is done by the analysis of the sensitivity of the EM estimates under some usual perturbation schemes in the model or data and by inspecting some proposed diagnostic graphics. The efficacy of the method is verified through the analysis of simulated data sets and modelling a real data set first analysed under normal errors. The proposed algorithm and methods are implemented in the R package CensRegMod.

3.
The composite quantile regression (CQR) has been developed for the robust and efficient estimation of regression coefficients in a linear regression model. By employing the idea of the CQR, we propose a new regression method, called composite kernel quantile regression (CKQR), which uses the sum of multiple check functions as a loss in reproducing kernel Hilbert spaces for the robust estimation of a nonlinear regression function. The numerical results demonstrate the usefulness of the proposed CKQR in estimating both conditional nonlinear mean and quantile functions.
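To make "the sum of multiple check functions" concrete, the sketch below evaluates the composite check loss for a linear model with one intercept per quantile level and a shared slope vector. It only illustrates the loss; it does not fit anything and does not implement the kernel (RKHS) version proposed in the abstract. The function names and the toy numbers are assumptions.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check function: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def composite_check_loss(y, X, beta, intercepts, taus):
    """Sum of check losses over several quantile levels.

    Each level tau_k has its own intercept b_k; the slope vector
    beta is shared across levels.
    """
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    total = 0.0
    for b, tau in zip(intercepts, taus):
        total += np.sum(check_loss(y - fitted - b, tau))
    return total

# Toy example (hypothetical numbers): a single positive residual of size 1.
X_toy = np.array([[0.0], [1.0], [2.0]])
y_toy = np.array([1.0, 1.0, 2.0])
taus = [0.25, 0.5, 0.75]
loss = composite_check_loss(y_toy, X_toy, beta=[1.0],
                            intercepts=[0.0, 0.0, 0.0], taus=taus)
print(loss)
```

A positive residual is charged `tau` per unit and a negative one `1 - tau` per unit, which is what lets each term target a different conditional quantile.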

4.
A new technique is devised to mitigate the errors-in-variables bias in linear regression. The procedure mimics a 2-stage least squares procedure where an auxiliary regression which generates a better behaved predictor variable is derived. The generated variable is then used as a substitute for the error-prone variable in the first-stage model. The performance of the algorithm is tested by simulation and regression analyses. Simulations suggest the algorithm efficiently captures the additive error term used to contaminate the artificial variables. Regressions provide further credit to the simulations as they clearly show that the compact genetic algorithm-based estimate of the true but unobserved regressor yields considerably better results. These conclusions are robust across different sample sizes and different variance structures imposed on both the measurement error and regression disturbances.

5.
Tsou (2003a) proposed a parametric procedure for making robust inference for mean regression parameters in the context of generalized linear models. This robust procedure is extended to model variance heterogeneity. The normal working model is adjusted to become asymptotically robust for inference about regression parameters of the variance function for practically all continuous response variables. The connection between the novel robust variance regression model and the estimating equations approach is also provided.

6.
Two diagnostic plots for selecting explanatory variables are introduced to assess the accuracy of a generalized beta-linear model. The added variable plot is developed to examine the need for adding a new explanatory variable to the model. The constructed variable plot is developed to identify the nonlinearity of the explanatory variable in the model. The two diagnostic procedures are also useful for detecting unusual observations that may strongly affect the regression. Simulation studies and analysis of two practical examples are conducted to illustrate the performance of the proposed plots.

7.
Various diagnostic statistics have been proposed to help identify cases that markedly affect, or influence, the features of a fitted linear regression model. Once influential cases are found, decisions can be made regarding their worth in the model building process. Since a subject data set may contain both singly influential cases and influential multiple case subsets, the capability to assess the joint influence of cases is needed for a complete analysis. The aim of this work is to briefly review Cook’s distance measure for multiple cases, an effective diagnostic for this purpose, and present a method using it to search for influential multiple case subsets. The method is applied in two example analyses by way of a MINITAB Statistical Software macro.

8.
A robust Bayesian analysis in a conjugate normal framework for the simple ANOVA model is suggested. By fixing the prior mean and varying the prior covariance matrix over a restricted class, we obtain the so-called HiFi and core region, a union and intersection of HPD regions. Based on these robust HPD regions we develop the concept of a ‘robust Bayesian judgement’ procedure. We apply this approach to the simple analysis of variance model with orthogonal designs. The example analyses the costs of an asthma medication obtained by a two-way cross-over study.

9.
A fast routine for converting regression algorithms into corresponding orthogonal regression (OR) algorithms was introduced in Ammann and Van Ness (1988). The present paper discusses the properties of various ordinary and robust OR procedures created using this routine. OR minimizes the sum of the orthogonal distances from the regression plane to the data points. OR has three types of applications. First, L2 OR is the maximum likelihood solution of the Gaussian errors-in-variables (EV) regression problem. This L2 solution is unstable, thus the robust OR algorithms created from robust regression algorithms should prove very useful. Secondly, OR is intimately related to principal components analysis. Therefore, the routine can also be used to create L1, robust, etc. principal components algorithms. Thirdly, OR treats the x and y variables symmetrically which is important in many modeling problems. Using Monte Carlo studies this paper compares the performance of standard regression, robust regression, OR, and robust OR on Gaussian EV data, contaminated Gaussian EV data, heavy-tailed EV data, and contaminated heavy-tailed EV data.
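The link between L2 orthogonal regression and principal components mentioned in this abstract can be sketched in a few lines: the plane that minimizes the sum of squared orthogonal distances passes through the centroid of the data, and its normal is the direction of least variance. The sketch below uses a singular value decomposition for this. It is the standard non-robust L2 solution, not the Ammann and Van Ness conversion routine; the function names and the toy points are assumptions.

```python
import numpy as np

def orthogonal_regression(points):
    """Fit a hyperplane by minimizing squared orthogonal distances.

    Returns the centroid and a unit normal vector. The normal is the
    right singular vector with the smallest singular value, i.e. the
    last principal direction of the centered data.
    """
    P = np.asarray(points, dtype=float)
    center = P.mean(axis=0)
    _, _, vt = np.linalg.svd(P - center, full_matrices=False)
    normal = vt[-1]
    return center, normal

def orthogonal_distances(points, center, normal):
    """Unsigned orthogonal distance of each point to the fitted plane."""
    P = np.asarray(points, dtype=float)
    return np.abs((P - center) @ normal)

# Toy data (hypothetical): four points lying exactly on y = 2x + 1.
pts = np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 5.0], [3.0, 7.0]])
center, normal = orthogonal_regression(pts)
slope = -normal[0] / normal[1]
dist = orthogonal_distances(pts, center, normal)
print(slope, dist.max())
```

Because the fit depends only on the centered point cloud, swapping the x and y columns gives the same line, which is the symmetry property the abstract refers to.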

10.
Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on the multivariate t distributions can be considered as a robust alternative approach. Missing data are inevitable in many situations and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in regression function in the presence of outliers and missing values. Along with the robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even under the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.

11.
Modeling diagnostics assess models by means of a variety of criteria. Each criterion typically performs its evaluation upon a specific inferential objective. For instance, the well-known DFBETAS in linear regression models are a modeling diagnostic which is applied to discover the influential cases in fitting a model. To facilitate the evaluation of generalized linear mixed models (GLMM), we develop a diagnostic, based on the information complexity (ICOMP) criterion, for detecting influential cases that substantially affect the model selection criterion ICOMP. In a given model, the diagnostic compares the ICOMP criterion between the full data set and a case-deleted data set. The computational formula of the ICOMP criterion is evaluated using the Fisher information matrix. A simulation study is conducted and a real data set of cancer cells is analyzed using the logistic linear mixed model to illustrate the effectiveness of the proposed diagnostic in detecting the influential cases.

12.
This paper studies robust estimation of multivariate regression model using kernel weighted local linear regression. A robust estimation procedure is proposed for estimating the regression function and its partial derivatives. The proposed estimators are jointly asymptotically normal and attain nonparametric optimal convergence rate. One-step approximations to the robust estimators are introduced to reduce computational burden. The one-step local M-estimators are shown to achieve the same efficiency as the fully iterative local M-estimators as long as the initial estimators are good enough. The proposed estimators inherit the excellent edge-effect behavior of the local polynomial methods in the univariate case and at the same time overcome the disadvantages of the local least-squares based smoothers. Simulations are conducted to demonstrate the performance of the proposed estimators. Real data sets are analyzed to illustrate the practical utility of the proposed methodology. This work was supported by the National Natural Science Foundation of China (Grant No. 10471006).

13.
The authors propose a robust bounded‐influence estimator for binary regression with continuous outcomes, an alternative to logistic regression when the investigator's interest focuses on the proportion of subjects who fall below or above a cut‐off value. The authors show both theoretically and empirically that in this context, the maximum likelihood estimator is sensitive to model misspecifications. They show that their robust estimator is more stable and nearly as efficient as maximum likelihood when the hypotheses are satisfied. Moreover, it leads to safer inference. The authors compare the different estimators in a simulation study and present an analysis of hypertension on Harlem survey data.

14.
Fitting multiplicative models by robust alternating regressions   (total citations: 1; self-citations: 0; citations by others: 1)
In this paper a robust approach for fitting multiplicative models is presented. Focus is on the factor analysis model, where we will estimate factor loadings and scores by a robust alternating regression algorithm. The approach is highly robust, and also works well when there are more variables than observations. The technique yields a robust biplot, depicting the interaction structure between individuals and variables. This biplot is not predetermined by outliers, which can be retrieved from the residual plot. Also provided is an accompanying robust R²-plot to determine the appropriate number of factors. The approach is illustrated by real and artificial examples and compared with factor analysis based on robust covariance matrix estimators. The same estimation technique can fit models with both additive and multiplicative effects (FANOVA models) to two-way tables, thereby extending the median polish technique.

15.
Fuzzy least-squares regression can be very sensitive to unusual data (e.g., outliers). In this article, we describe how to fit an alternative robust-regression estimator in a fuzzy environment, which attempts to identify and ignore unusual data. The proposed approach concerns classical robust regression and estimation methods that are insensitive to outliers. In this regard, based on the least trimmed squares estimation method, an estimation procedure is proposed for determining the coefficients of the fuzzy regression model for crisp input-fuzzy output data. The investigated fuzzy regression model is applied to bedload transport data, forecasting suspended load by discharge based on real-world data. The accuracy of the proposed method is compared with the well-known fuzzy least-squares regression model. The comparison results reveal that the fuzzy robust regression model performs better than the other models in suspended load estimation for the particular dataset. This comparison is done based on a similarity measure between fuzzy sets. The proposed model is general and can be used for modeling natural phenomena whose available observations are reported as imprecise rather than crisp.

16.
A class of trimmed linear conditional estimators based on regression quantiles for the linear regression model is introduced. This class serves as a robust analogue of non-robust linear unbiased estimators. Asymptotic analysis then shows that the trimmed least squares estimator based on regression quantiles (Koenker and Bassett (1978)) is the best in this estimator class in terms of asymptotic covariance matrices. The class of trimmed linear conditional estimators contains the Mallows-type bounded influence trimmed means (see De Jongh et al. (1988)) and trimmed instrumental variables estimators. A large sample methodology based on trimmed instrumental variables estimator for confidence ellipsoids and hypothesis testing is also provided.

17.
Users of statistical packages need to be aware of the influence that outlying data points can have on their statistical analyses. Robust procedures provide formal methods to spot these outliers and reduce their influence. Although a few robust procedures are mentioned in this article, one is emphasized; it is motivated by maximum likelihood estimation to make it seem more natural. Use of this procedure in regression problems is considered in some detail, and an approximate error structure is stated for the robust estimates of the regression coefficients. A few examples are given. A suggestion of how these techniques should be implemented in practice is included.

18.
In this article, parametric robust regression approaches are proposed for making inferences about regression parameters in the setting of generalized linear models (GLMs). The proposed methods are able to test hypotheses on the regression coefficients in misspecified GLMs. More specifically, it is demonstrated that with large samples, the normal and gamma regression models can be properly adjusted to become asymptotically valid for inferences about regression parameters under model misspecification. These adjusted regression models can provide the correct type I and II error probabilities and the correct coverage probability for continuous data, as long as the true underlying distributions have finite second moments.

19.
In regression analysis, RESET has widely been regarded as an effective diagnostic test, especially for omitted variables. This paper investigates the limitations of the existing RESET tests in detecting omitted variables. We analyze the sources from which RESET draws its power and point out the circumstances under which RESET will likely be ineffective. We offer some Monte Carlo evidence as well as an empirical application to illustrate the weaknesses of the RESET tests. A more robust RESET-type test is proposed.

20.
For the functional errors-in-variables regression model, we define a class of robust regression estimators and study their properties.

