Similar articles
 20 similar articles found
1.
Leverage values are used in regression diagnostics as measures of influential observations in the $X$-space. Detecting high leverage values is crucial because they can lead to misleading conclusions about the fit of a regression model, cause multicollinearity problems, and mask and/or swamp outliers. Much work has been done on the identification of single high leverage points, and it is generally believed that the problem of detecting a single high leverage point has been largely resolved. There is, however, no general agreement among statisticians about the detection of multiple high leverage points. When a group of high leverage points is present in a data set, the commonly used diagnostic methods fail to identify them correctly, mainly because of masking and/or swamping effects. Robust alternative methods, on the other hand, can identify the high leverage points correctly, but they tend to flag too many low leverage points as high leverage points, which is also undesirable. An attempt has been made to strike a compromise between these two approaches. We propose an adaptive method in which the suspected high leverage points are identified by robust methods and the low leverage points (if any) are then put back into the estimation data set after diagnostic checking. The usefulness of the newly proposed method for detecting multiple high leverage points is studied using some well-known data sets and Monte Carlo simulations.

2.
High leverage points can induce or disrupt multicollinearity patterns in data. Observations responsible for this problem are generally known as collinearity-influential observations. A significant amount of published work on the identification of collinearity-influential observations exists; however, we show in this article that all commonly used detection techniques display greatly reduced sensitivity in the presence of multiple high leverage collinearity-influential observations. We propose a new measure based on a diagnostic robust group deletion approach. Some practical cutoff points for the existing and newly developed diagnostic measures are also introduced. Numerical examples and simulation results show that the proposed measure provides significant improvement over the existing measures.

3.
Regression analysis aims to estimate the approximate relationship between the response variable and the explanatory variables. This can be done using classical methods such as ordinary least squares. Unfortunately, these methods are very sensitive to anomalous points, often called outliers, in the data set. The main contribution of this article is to propose a new version of the Generalized M-estimator that provides good resistance against vertical outliers and bad leverage points. The advantage of this method over the existing methods is that it does not minimize the weight of the good leverage points, and this increases the efficiency of this estimator. To achieve this goal, the fixed parameters support vector regression technique is used to identify and minimize the weight of outliers and bad leverage points. The effectiveness of the proposed estimator is investigated using real and simulated data sets.

4.
Leverage values are used in regression diagnostics as measures of unusual observations in the X-space. Detecting high leverage observations or points is crucial because they can mask outliers. In linear regression, high leverage points (HLP) are those that stand far apart from the center (mean) of the data, and hence the most extreme points in the covariate space get the highest leverage. But Hosmer and Lemeshow [Applied logistic regression, Wiley, New York, 1980] pointed out that in logistic regression the leverage measure contains a component which can make the leverage values of genuine HLP misleadingly small, and that creates problems in the correct identification of the cases. Attempts have been made to identify the HLP based on the median distances from the mean, but since they are designed for the identification of a single high leverage point, they may not be very effective in the presence of multiple HLP because of masking (false-negative) and swamping (false-positive) effects. In this paper we propose a new method for the identification of multiple HLP in logistic regression, in which the suspect cases are identified by a robust group deletion technique and then confirmed using diagnostic techniques. The usefulness of the proposed method is then investigated through several well-known examples and a Monte Carlo simulation.

5.
Application of quantile regression models with measurement errors in predictors is becoming increasingly popular. High leverage points in predictors can have substantial impacts on these models. Here, we propose a predictive leverage statistic for these models, assuming that the measurement errors follow a multivariate normal distribution, and derive its exact distribution. We compare its performance versus known predictive leverage statistics using simulation and a real dataset. The proposed statistic is shown to have desirable features. It is also the first predictive leverage statistic having its distribution derived in a closed form.

6.
Although quantile regression estimators are robust against low leverage observations with atypically large responses (Koenker & Bassett 1978), they can be seriously affected by a few points that deviate from the majority of the sample covariates. This problem can be alleviated by downweighting observations with high leverage. Unfortunately, when the covariates are not elliptically distributed, Mahalanobis distances may not be able to correctly identify atypical points. In this paper the authors discuss the use of weights based on a new leverage measure constructed using Rosenblatt's multivariate transformation which is able to reflect nonelliptical structures in the covariate space. The resulting weighted estimators are consistent, asymptotically normal, and have a bounded influence function. In addition, the authors also discuss a selection criterion for choosing the downweighting scheme. They illustrate their approach with child growth data from Finland. Finally, their simulation studies suggest that this methodology has good finite-sample properties.

7.
Detection of multiple unusual observations such as outliers, high leverage points and influential observations (IOs) in regression is still a challenging task for statisticians because of the well-known masking and swamping effects. In this paper we introduce a robust influence distance that can identify multiple IOs, and propose a sixfold plotting technique based on the well-known group deletion approach to classify regular observations, outliers, high leverage points and IOs simultaneously in linear regression. Experiments on several widely used data sets and simulation studies demonstrate that the proposed algorithm performs successfully in the presence of multiple unusual observations and can avoid masking and/or swamping effects.

8.
The hat matrix is widely used as a diagnostic tool in linear regression because it contains the leverages which the independent variables exert on the fitted values. In some experiments, cases with high leverage may be avoided by judicious choice of design for the independent variables. A variety of methods for constructing equileverage designs for linear regression are discussed. Such designs remove one of the factors, namely large leverage points, which can lead to nonrobust estimators and tests. In addition, a method is given for combining equileverage designs to test for lack of fit of the linear model.
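As a minimal illustration of the leverages mentioned in this abstract, the sketch below computes the diagonal of the hat matrix H = X (X'X)^{-1} X' with NumPy. It is not taken from the article: the function name `leverages`, the toy designs, and the 2p/n cutoff (a common rule of thumb) are assumptions made for illustration, and the design matrix is assumed to have full column rank.

```python
import numpy as np

def leverages(X, add_intercept=True):
    """Return the diagonal of the hat matrix H = X (X'X)^{-1} X'."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    if add_intercept:
        X = np.column_stack([np.ones(X.shape[0]), X])
    # With a reduced QR factorisation X = QR, H = Q Q', so h_ii is the
    # squared norm of the i-th row of Q (assumes full column rank).
    Q, _ = np.linalg.qr(X)
    return np.sum(Q ** 2, axis=1)

# Hypothetical design with one point far from the others in X-space.
x_skewed = np.array([1.0, 2.0, 3.0, 4.0, 20.0])
h_skewed = leverages(x_skewed)
p = 2  # number of fitted coefficients (intercept + slope)
# Rule-of-thumb cutoff 2p/n; an assumption, not the article's criterion.
flagged = np.where(h_skewed > 2 * p / len(x_skewed))[0]

# A balanced two-level design gives every case the same leverage.
x_balanced = np.array([-1.0, -1.0, 1.0, 1.0])
h_balanced = leverages(x_balanced)
```

In the skewed design only the last observation exceeds the cutoff, while the balanced design spreads leverage equally across cases, which is the property an equileverage design aims for.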

9.
This work introduces specific tools based on phi-divergences to select and check generalized linear models with binary data. A backward selection criterion that helps to reduce the number of explanatory variables is considered. Diagnostic methods based on divergence measures such as a new measure to detect leverage points and two indicators to detect influential points are introduced. As an illustration, the diagnostics are applied to human psychology data.

10.
Both the least squares estimator and M-estimators of regression coefficients are susceptible to distortion when high leverage points occur among the predictor variables in a multiple linear regression model. In this article a weighting scheme which enables one to bound the leverage values of a weighted matrix of predictor variables is proposed. Bounded-leverage weighting of the predictor variables followed by M-estimation of the regression coefficients is shown to be effective in protecting against distortion due to extreme predictor-variable values, extreme response values, or outlier-induced multicollinearities. Bounded-leverage estimators can also protect against distortion by small groups of high leverage points.

11.
Robust regression has not had a great impact on statistical practice, although all statisticians are convinced of its importance. The procedures for robust regression currently available are complex and computer-intensive. With a modification of the Gaussian paradigm, taking into consideration outliers and leverage points, we propose an iteratively weighted least squares method which gives robust fits. The procedure is illustrated by applying it to data sets which have previously been used to illustrate robust regression methods. It is hoped that this simple, effective and accessible method will find its use in statistical practice.
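The abstract does not specify its weighting scheme, so the following is only a generic sketch of iteratively reweighted least squares, using Huber weights and a MAD scale estimate as stand-in assumptions. The function name `irls_huber`, the tuning constant 1.345, and the toy data are illustrative choices and are not the authors' method. Note also that Huber weights act on residuals only, so this sketch guards against vertical outliers and not against leverage points.

```python
import numpy as np

def irls_huber(x, y, c=1.345, n_iter=50, tol=1e-8):
    """Generic IRLS with Huber weights (illustrative, not the article's scheme)."""
    X = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    y = np.asarray(y, dtype=float)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from the OLS fit
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust scale: normalised median absolute deviation of the residuals.
        scale = np.median(np.abs(r - np.median(r))) / 0.6745
        if scale == 0:
            break
        u = np.abs(r) / scale
        # Huber weights: 1 for small residuals, c/u for large ones.
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Hypothetical data: an exact line y = 2 + 3x with one vertical outlier.
x = np.arange(10.0)
y = 2.0 + 3.0 * x
y[9] += 50.0
ols = np.linalg.lstsq(np.column_stack([np.ones(10), x]), y, rcond=None)[0]
beta = irls_huber(x, y)
```

On this toy data the reweighted slope `beta[1]` ends up much closer to the true value 3 than the ordinary least squares slope `ols[1]`, because the outlying response is progressively downweighted.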

12.
A test for lack of fit in regression is presented. Unlike other methods, this one doesn't require replicates or a prior estimate of variance. It can be used for linear or multiple regression, and would be easy to add to existing computer packages. It is based on comparing a fit over low leverage points with a fit over the entire set of data. Distribution theory results are presented, with examples of power. A discussion of its use for detecting violations of other regression assumptions is also given.

13.
This paper conducts a simulation-based comparison of several stochastic volatility models with leverage effects. Two new variants of asymmetric stochastic volatility models, which are subject to a logarithmic transformation on the squared asset returns, are proposed. The leverage effect is introduced into the model through correlation either between the innovations of the observation equation and the latent process, or between the logarithm of squared asset returns and the latent process. Suitable Markov Chain Monte Carlo algorithms are developed for parameter estimation and model comparison. Simulation results show that our proposed formulation of the leverage effect and the accompanying inference methods give rise to reasonable parameter estimates. Applications to two data sets uncover a negative correlation (which can be interpreted as a leverage effect) between the observed returns and volatilities, and a negative correlation between the logarithm of squared returns and volatilities.

14.
This paper studies the outlier detection and robust variable selection problem in the linear regression model. The penalized weighted least absolute deviation (PWLAD) regression estimation method and the adaptive least absolute shrinkage and selection operator (LASSO) are combined to achieve outlier detection and robust variable selection simultaneously. An iterative algorithm is proposed to solve the resulting optimization problem. Monte Carlo studies evaluate the finite-sample performance of the proposed methods. The results indicate that the finite-sample performance of the proposed methods is better than that of the existing methods when there are leverage points or outliers in the response variable or explanatory variables. Finally, we apply the proposed methodology to analyze two real datasets.

15.
It is common in linear regression models for the error variances to differ across observations and for some high leverage data points to be present. In such situations, the available literature advocates the use of heteroscedasticity consistent covariance matrix estimators (HCCME) for testing regression coefficients. Such estimators are primarily based on the residuals derived from the ordinary least squares (OLS) estimator, which itself can be seriously inefficient in the presence of heteroscedasticity. Many efficient estimators, namely the adaptive estimators, are available, but their performance has not yet been evaluated when heteroscedasticity is accompanied by high leverage data. In this article, the presence of high leverage data is taken into account to evaluate the performance of the adaptive estimator in terms of efficiency. Furthermore, our numerical work also evaluates the performance of the robust standard errors based on this efficient estimator in terms of interval estimation and null rejection rate (NRR).

16.
In this paper the most commonly used diagnostic criteria for the identification of outliers or leverage points in the ordinary regression model are reviewed. Their use in the context of the errors-in-variables (e.v.) linear model is discussed and evidence is given that under the e.v. model assumptions the distinction between outliers and leverage points no longer exists.

17.
Calculations of local influence curvatures and leverage have been well developed when the parameters are unrestricted. In this article, we discuss the assessment of local influence and leverage under linear equality parameter constraints with extensions to inequality constraints. Using a penalized quadratic function we express the normal curvature of local influence for arbitrary perturbation schemes and the generalized leverage matrix in interpretable forms, which depend on restricted and unrestricted components. The results are quite general and can be applied in various statistical models. In particular, we derive the normal curvature under three useful perturbation schemes for generalized linear models. Four illustrative examples are analyzed by the methodology developed in the article.

18.
This paper presents a one-step robust generalised M-estimation procedure for orthogonal regression. The GM-estimator uses Schweppe weights, which are based on high breakdown initial and scale estimates, to downweight outliers and high leverage points. A one-step iteratively reweighted least squares procedure is used to compute the GM estimates. The robustness of the GM-estimator is illustrated with results on concrete compressive strength measurements.

19.

Structural change in any time series is practically unavoidable, and thus correctly detecting breakpoints plays a pivotal role in statistical modelling. This research considers segmented autoregressive models with exogenous variables and asymmetric GARCH errors, GJR-GARCH and exponential-GARCH specifications, which utilize the leverage phenomenon to demonstrate asymmetry in response to positive and negative shocks. The proposed models incorporate skew Student-t distribution and prove the advantages of the fat-tailed skew Student-t distribution versus other distributions when structural changes appear in financial time series. We employ Bayesian Markov Chain Monte Carlo methods in order to make inferences about the locations of structural change points and model parameters and utilize deviance information criterion to determine the optimal number of breakpoints via a sequential approach. Our models can accurately detect the number and locations of structural change points in simulation studies. For real data analysis, we examine the impacts of daily gold returns and VIX on S&P 500 returns during 2007–2019. The proposed methods are able to integrate structural changes through the model parameters and to capture the variability of a financial market more efficiently.
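To make the asymmetry described in this abstract concrete, the sketch below implements the standard GJR-GARCH(1,1) conditional variance recursion, sigma2[t] = omega + (alpha + gamma * 1[eps[t-1] < 0]) * eps[t-1]^2 + beta * sigma2[t-1]. It is a bare recursion under stated assumptions and not the segmented Bayesian model of the article: the function name, the parameter values, and the initialisation at the sample variance are all illustrative.

```python
import numpy as np

def gjr_garch_variance(eps, omega, alpha, gamma, beta):
    """Conditional variances of a GJR-GARCH(1,1) process for shocks eps."""
    eps = np.asarray(eps, dtype=float)
    sigma2 = np.empty(len(eps))
    sigma2[0] = np.var(eps)  # initialisation is an assumption of this sketch
    for t in range(1, len(eps)):
        # The indicator adds gamma only after a negative shock (leverage effect).
        negative = 1.0 if eps[t - 1] < 0 else 0.0
        sigma2[t] = (omega
                     + (alpha + gamma * negative) * eps[t - 1] ** 2
                     + beta * sigma2[t - 1])
    return sigma2

# Hypothetical parameters: shocks of equal size but opposite sign.
params = dict(omega=0.1, alpha=0.05, gamma=0.1, beta=0.8)
after_positive = gjr_garch_variance([1.0, 0.0], **params)[1]
after_negative = gjr_garch_variance([-1.0, 0.0], **params)[1]
```

With a positive `gamma`, the variance following the negative shock exceeds the variance following the positive shock by exactly `gamma` times the squared shock, which is the asymmetric response the abstract refers to.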


20.
In linear regression, outliers and leverage points often have a large influence on the model selection process. Such cases are downweighted here with Mallows-type weights during estimation of submodel parameters by generalised M-estimation. A robust version of Mallows's Cp (Ronchetti & Staudte, 1994) is then used to select a variety of submodels which are as informative as the full model. The methodology is illustrated on a new dataset concerning the agglomeration of alumina in Bayer precipitation.
