Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
In this study, we develop the adjusted deviance residuals for the gamma regression model (GRM) by following Cordeiro's (2004) method. These adjusted deviance residuals under the GRM are used for influence diagnostics. A comparative analysis is carried out between our proposed method based on the adjusted deviance residuals and an existing method for influence diagnostics. The results are illustrated by a simulation study and a real data set. They are presented for different values of dispersion and sample sizes and indicate the significant role of the GRM inferences.

2.
It sometimes occurs that one or more components of the data exert a disproportionate influence on the model estimation. We need a reliable tool for identifying such troublesome cases in order to decide either to eliminate them from the sample, when the data collection was badly carried out, or otherwise to take care in using the model because the results could be affected by such components. Since Cook [Detection of influential observations in linear regression, Technometrics 19 (1977), pp. 15–18.] proposed a measure for detecting influential cases in the linear regression setting, several new measures, apart from the same measure adapted to other models, have been suggested as single-case diagnostics. For most of them some cutoff values have been recommended (see, for instance, [D.A. Belsley, E. Kuh, and R.E. Welsch, Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, 2nd ed., John Wiley & Sons, New York, Chichester, Brisbane, (2004).]); however, the lack of a quantile-type cutoff for Cook's statistic has led analysts to rely only on index plots as diagnostic tools. Focusing on logistic regression, the aim of this paper is to provide the asymptotic distribution of Cook's distance in order to find a meaningful cutoff point for detecting influential and leverage observations.
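For readers unfamiliar with the measure this abstract builds on, the following is a minimal sketch of the classical Cook's distance for an ordinary least-squares straight-line fit. It is not the logistic-regression version or the asymptotic cutoff studied in the paper. The function names and the toy data are assumptions made for illustration. The closed form uses residuals and leverages; the second function refits the line without each case as a check.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    return a, b


def cooks_distance(x, y):
    """Cook's distance per case for a straight-line fit (p = 2 parameters)."""
    n, p = len(x), 2
    a, b = fit_line(x, y)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)              # residual variance
    lev = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]  # leverages h_ii
    return [e * e * h / (p * s2 * (1.0 - h) ** 2) for e, h in zip(resid, lev)]


def cooks_distance_by_deletion(x, y):
    """Same quantity the slow way: refit without case i, compare fitted values."""
    n, p = len(x), 2
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    s2 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - p)
    out = []
    for i in range(n):
        ai, bi = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        shift = sum((fi - (ai + bi * xi)) ** 2 for fi, xi in zip(fitted, x))
        out.append(shift / (p * s2))
    return out


# Hypothetical toy data: the last case has high leverage and lies off the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 12.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
d = cooks_distance(x, y)
print(max(range(len(d)), key=d.__getitem__))  # index of the most influential case
```

The closed form avoids refitting the model n times, which is why index plots of this statistic are cheap to produce; what it does not supply is a distribution-based cutoff.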

3.
In this paper, we use a likelihood approach and the local influence method introduced by Cook [Assessment of local influence (with discussion). J Roy Statist Soc Ser B. 1986;48:133–149] to study a vector autoregressive (VAR) model. We present the maximum likelihood estimators and the information matrix. We establish the normal curvature and slope diagnostics for the VAR model under several perturbation schemes and use the Monte Carlo method to obtain benchmark values for determining the influence of directional diagnostics and possible influential observations. An empirical study using the VAR model to fit real data on monthly returns of IBM and the S&P500 index illustrates the effectiveness of our proposed diagnostics.

4.
We propose some statistical tools for diagnosing the class of generalized Weibull linear regression models [A.A. Prudente and G.M. Cordeiro, Generalized Weibull linear models, Comm. Statist. Theory Methods 39 (2010), pp. 3739–3755]. This class of models is an alternative means of analysing positive, continuous and skewed data and, due to its statistical properties, is very competitive with gamma regression models. First, we show that the Weibull model induces maximum likelihood estimators asymptotically more efficient than the gamma model. Standardized residuals are defined, and their statistical properties are examined empirically. Some measures are derived based on the case-deletion model, including the generalized Cook's distance and measures for identifying influential observations on partial F-tests. The results of a simulation study conducted to assess the behaviour of the global influence approach are also presented. Further, we perform a local influence analysis under the case-weights, response and explanatory variables perturbation schemes. The Weibull, gamma and other Weibull-type regression models are fitted to three data sets to illustrate the proposed diagnostic tools. Statistical analyses indicate that the Weibull model yields better fits to these data than other common alternative models.

5.
The detection of outliers and influential observations has received a great deal of attention in the statistical literature in the context of least-squares (LS) regression. However, the explanatory variables can be correlated with each other, and alternatives to LS have emerged to address outliers/influential observations and multicollinearity simultaneously. This paper proposes new influence measures based on the affine combination type regression for the detection of influential observations in the linear regression model when multicollinearity exists. Approximate influence measures are also proposed for the affine combination type regression. Since the affine combination type regression includes the ridge, the Liu and the shrunken regressions as special cases, influence measures under the ridge, the Liu and the shrunken regressions are also examined to see the possible effect that multicollinearity can have on the influence of an observation. The Longley data set is used to illustrate the influence measures in affine combination type regression and also in ridge, Liu and shrunken regressions, so that the performance of different biased regressions in detecting and assessing influential observations can be examined.

6.
The aim of this paper is to define and develop diagnostic measures with respect to kernel ridge regression in a reproducing kernel Hilbert space (RKHS). To identify influential observations, we define a particular version of Cook's distance for the kernel ridge regression model in RKHS, which is conceptually consistent with Cook's distance in a classical regression model. Then, by using the perturbation formula for the regularized conditional expectation of the outcome in RKHS, we develop an approximate version of Cook's distance in RKHS because the original definition requires intensive computations. Such an approximated Cook's distance is represented in terms of basic building blocks such as residuals and leverages of the kernel ridge regression. The results of the simulation and real application demonstrate that our diagnostic measure successfully detects potentially influential observations on estimators in kernel ridge regression.

7.
The added variable plot is useful for examining the effect of a covariate in regression models. The plot provides information regarding the inclusion of a covariate, and is useful in identifying influential observations on the parameter estimates. Hall et al. (1996) proposed a plot for Cox's proportional hazards model derived by regarding the Cox model as a generalized linear model. This paper proves and discusses properties of this plot. These properties make the plot a valuable tool in model evaluation. Quantities considered include parameter estimates, residuals, leverage, case influence measures and correspondence to previously proposed residuals and diagnostics.

8.
In this paper, we investigate the Andrews–Pregibon (AP), COVRATIO and Cook–Weisberg (CW) statistics to determine the influential observations on the confidence ellipsoids in the linear regression model with correlated errors and correlated regressors. A real example and a Monte Carlo simulation study are given to examine the effects of the autocorrelation coefficient and ridge parameter on the AP, COVRATIO and CW statistics.
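As background for one of the statistics named above, here is a minimal sketch of COVRATIO in its textbook form for ordinary least squares with independent errors. It does not cover the correlated-error, ridge-adjusted setting that the paper studies. The sketch assumes a straight-line model (p = 2), and all function names and data are hypothetical. COVRATIO compares the determinant of the estimated coefficient covariance matrix with and without case i, which is why it relates to the volume of the confidence ellipsoid.

```python
def line_fit(x, y):
    """Least-squares line: return residuals, leverages, s^2 and Sxx."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    lev = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]
    s2 = sum(e * e for e in resid) / (n - 2)
    return resid, lev, s2, sxx


def covratio(x, y):
    """Closed form: (s_(i)^2 / s^2)^p / (1 - h_ii), with p = 2."""
    n, p = len(x), 2
    resid, lev, s2, _ = line_fit(x, y)
    out = []
    for e, h in zip(resid, lev):
        # Residual variance with case i deleted, without refitting.
        s2_del = ((n - p) * s2 - e * e / (1.0 - h)) / (n - p - 1)
        out.append((s2_del / s2) ** p / (1.0 - h))
    return out


def covratio_by_deletion(x, y):
    """Determinant ratio by refitting; for a line, det(X'X) = n * Sxx."""
    n = len(x)
    _, _, s2, sxx = line_fit(x, y)
    det_full = s2 ** 2 / (n * sxx)          # det of s^2 (X'X)^{-1}
    out = []
    for i in range(n):
        _, _, s2_i, sxx_i = line_fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        out.append((s2_i ** 2 / ((n - 1) * sxx_i)) / det_full)
    return out


# Hypothetical toy data, chosen only to exercise the two functions.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 12.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
for fast, slow in zip(covratio(x, y), covratio_by_deletion(x, y)):
    print(round(fast, 6), round(slow, 6))
```

Values far from 1 indicate that a case changes the precision of the coefficient estimates noticeably, in either direction.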

9.
Birnbaum-Saunders (BS) models have largely been applied in material fatigue studies and reliability analyses to relate the total time until failure with some type of cumulative damage. In many problems related to the medical field, such as chronic cardiac diseases and different types of cancer, cumulative damage caused by several risk factors might cause some degradation that leads to a fatigue process. In these cases, BS models can be suitable for describing the propagation lifetime. However, since the cumulative damage is assumed to be normally distributed in the BS distribution, the parameter estimates from this model can be sensitive to outlying observations. In order to attenuate this influence, we present in this paper BS models in which a Student-t distribution is assumed to explain the cumulative damage. In particular, we show that the maximum likelihood estimates of the Student-t log-BS models attribute smaller weights to outlying observations, which produces robust parameter estimates. Also, some inferential results are presented. In addition, based on local influence and deviance component and martingale-type residuals, a diagnostic analysis is derived. Finally, a motivating example from the medical field is analyzed using log-BS regression models. Since the parameter estimates appear to be very sensitive to outlying and influential observations, the Student-t log-BS regression model should attenuate such influences. The model checking methodologies developed in this paper are used to compare the fitted models.

10.
Statistical methods are effectively used in the evaluation of pharmaceutical formulations instead of laborious liquid chromatography. However, signal overlapping, nonlinearity, multicollinearity and the presence of outliers deteriorate the performance of statistical methods. Partial Least Squares Regression (PLSR) is a very popular method for the quantification of high-dimensional, spectrally overlapped drug formulations. SIMPLS is the most widely used PLSR algorithm, but it is highly sensitive to outliers, which also affect the diagnostics. In this paper, we propose new robust multivariate diagnostics to identify outliers, influential observations and points causing non-normality for a PLSR model. We study the performance of the proposed diagnostics on two highly overlapping drug systems in everyday use: Paracetamol–Caffeine and Doxylamine Succinate–Pyridoxine Hydrochloride.

11.
The grouped relative risk model (GRRM) is a popular semi-parametric model for analyzing discrete survival time data. The maximum likelihood estimators (MLEs) of the regression coefficients in this model are often asymptotically efficient relative to those based on a more restrictive, parametric model. However, in settings with a small number of sampling units, the usual properties of the MLEs are not assured. In this paper, we discuss computational issues that can arise when fitting a GRRM to small samples, and describe conditions under which the MLEs can be ill-behaved. We find that, overall, estimators based on a penalized score function behave substantially better than the MLEs in this setting and, in particular, can be far more efficient. We also provide methods of assessing the fit of a GRRM to small samples.

12.
Ridge penalized least-squares estimators have been suggested as an alternative to the minimum penalized sum of squares estimates in the presence of collinearity among the explanatory variables in semiparametric regression models (SPRMs). This paper studies the local influence of minor perturbations on the ridge estimates in the SPRM. The diagnostics under the perturbation of ridge penalized sum of squares, response variable, explanatory variables and ridge parameter are considered. Some local influence diagnostics are given. A Monte Carlo simulation study and a real example are used to illustrate the proposed perturbations.

13.
We occasionally find that a small subset of the data exerts a disproportionate influence on the fitted regression model. We would like to locate these influential points and assess their impact on the model. However, the existence of influential data is complicated by the presence of collinearity (see, e.g. [E. Walker and J. Birch, Influence measures in ridge regression, Technometrics 30 (1988), pp. 221–227. doi: 10.1080/00401706.1988.10488370]). In this article we develop a new influence statistic for one or a set of observations in linear regression dealing with collinearity. We show that this statistic has an asymptotically normal distribution and is able to detect a subset of high ridge leverage outliers. Using this influence statistic we also show that when ridge regression is used to mitigate the effects of collinearity, the influence of some observations can be drastically modified. Simulation studies and a real data set are analysed as illustration.
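The term "ridge leverage" in this abstract can be illustrated with a small sketch. It shows the diagonal of the ridge hat matrix H_k = X (X'X + kI)^{-1} X' for two centred, nearly collinear predictors and no intercept. This is the standard ridge hat matrix and not the influence statistic proposed in the paper. The function name, the data and the value k = 1 are assumptions chosen for illustration.

```python
def ridge_leverages(x1, x2, k):
    """Diagonal of H_k = X (X'X + k I)^{-1} X' for two centred predictors."""
    a = sum(u * u for u in x1) + k          # (1,1) entry of X'X + kI
    d = sum(v * v for v in x2) + k          # (2,2) entry
    c = sum(u * v for u, v in zip(x1, x2))  # off-diagonal entry
    det = a * d - c * c
    # x_i' M^{-1} x_i, using the 2x2 inverse M^{-1} = [[d, -c], [-c, a]] / det
    return [(d * u * u - 2.0 * c * u * v + a * v * v) / det
            for u, v in zip(x1, x2)]


# Hypothetical, nearly collinear predictors (both sum to zero).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.8, 2.1]

ols_lev = ridge_leverages(x1, x2, 0.0)    # k = 0 gives the least-squares hat matrix
ridge_lev = ridge_leverages(x1, x2, 1.0)  # k = 1 is an arbitrary ridge parameter
for h0, hk in zip(ols_lev, ridge_lev):
    print(round(h0, 4), round(hk, 4))
```

With k = 0 the leverages sum to the number of predictors; any k > 0 shrinks every leverage, and the shrinkage is uneven across cases. That unevenness is one way the ranking of influential observations can change once ridge regression is used.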

14.
Constrained general linear models (CGLMs) have wide applications in practice. As in other data analyses, the identification of influential observations that may be potential outliers is an important step in the analysis of CGLMs. We develop multiple case-deletion diagnostics for detecting influential observations in the CGLMs. The diagnostics are functions of basic building blocks: studentized residuals, error contrast matrix, and the inverse of the response variable covariance matrix. The basic building blocks are computed only once from the complete data analysis and provide information on the influence of the data on different aspects of the model fit. Computational formulas are given which make the procedures feasible. An illustrative example with a real data set is also reported.

15.
For the first time, we introduce a generalized form of the exponentiated generalized gamma distribution [Cordeiro et al. The exponentiated generalized gamma distribution with application to lifetime data, J. Statist. Comput. Simul. 81 (2011), pp. 827–842.] that is the baseline for the log-exponentiated generalized gamma regression model. The new distribution can accommodate increasing, decreasing, bathtub- and unimodal-shaped hazard functions. A second advantage is that it includes classical distributions reported in the lifetime literature as special cases. We obtain explicit expressions for the moments of the baseline distribution of the new regression model. The proposed model can be applied to censored data since it includes as sub-models several widely known regression models. It therefore can be used more effectively in the analysis of survival data. We obtain maximum likelihood estimates for the model parameters by considering censored data. We show that our extended regression model is very useful by means of two applications to real data.

16.
The identification of influential observations has drawn a great deal of attention in regression diagnostics. Most of these identification techniques are based on single-case deletion, and among them DFFITS has become very popular with statisticians. However, this technique, along with all other single-case diagnostics, may be ineffective in the presence of multiple influential observations. In this paper we develop a generalized version of DFFITS based on group deletion and then propose a new technique that uses it to identify multiple influential observations. The advantage of using the proposed method in the identification of multiple influential cases is then investigated through several well-known data sets.
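The single-case DFFITS that this abstract takes as its starting point can be sketched as follows. This is the classical single-deletion statistic for a least-squares straight line, not the group-deletion generalization the paper develops. Function names and data are hypothetical, and the cutoff 2*sqrt(p/n) is the commonly quoted rule of thumb rather than anything taken from the abstract.

```python
import math


def line_fit(x, y):
    """Least-squares line: return intercept, slope, residuals, leverages, s^2."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    lev = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]
    s2 = sum(e * e for e in resid) / (n - 2)
    return a, b, resid, lev, s2


def dffits(x, y):
    """DFFITS_i = t_i * sqrt(h_ii / (1 - h_ii)), t_i externally studentized."""
    n, p = len(x), 2
    _, _, resid, lev, s2 = line_fit(x, y)
    out = []
    for e, h in zip(resid, lev):
        s2_del = ((n - p) * s2 - e * e / (1.0 - h)) / (n - p - 1)
        t = e / math.sqrt(s2_del * (1.0 - h))
        out.append(t * math.sqrt(h / (1.0 - h)))
    return out


def dffits_by_deletion(x, y):
    """Scaled change in the fitted value at x_i when case i is left out."""
    n = len(x)
    a, b, _, lev, _ = line_fit(x, y)
    out = []
    for i in range(n):
        ai, bi, _, _, s2_i = line_fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        change = (a + b * x[i]) - (ai + bi * x[i])
        out.append(change / math.sqrt(s2_i * lev[i]))
    return out


# Hypothetical toy data with one high-leverage case at the end.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 12.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
cutoff = 2.0 * math.sqrt(2 / len(x))   # rule-of-thumb threshold, p = 2
flagged = [i for i, v in enumerate(dffits(x, y)) if abs(v) > cutoff]
print(flagged)
```

Because each value is computed with only one case removed, two or more influential cases that sit close together can hide one another, which is the limitation that group deletion is meant to address.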

17.
The analysis of residuals may reveal various functional forms suitable for the regression model. In this paper, we investigate some selection criteria for selecting important regression variables. In doing so, we use statistical selection and ranking procedures. Thus, we derive an appropriate criterion to measure the influence and bias for the reduced models. We show that the reduced models are based on some noncentrality parameters which provide a measure of goodness of fit for the fitted models. In this paper, we also discuss the relationships of influence diagnostics and the statistic proposed earlier by Gupta and Huang (J. Statist. Plann. Inference 20 (1988) 155–167). We introduce a new measure for detecting influential data as an alternative to Cook's measure.

18.
The purpose of this paper is to develop diagnostic analysis for nonlinear regression models (NLMs) under scale mixtures of skew-normal (SMSN) distributions introduced by Garay et al. [Nonlinear regression models based on SMSN distributions. J. Korean Statist. Soc. 2011;40:115–124]. This novel class of models provides a useful generalization of the symmetrical NLM [Vanegas LH, Cysneiros FJA. Assessment of diagnostic procedures in symmetrical nonlinear regression models. Comput. Statist. Data Anal. 2010;54:1002–1016] since the distributions of the random terms cover symmetric as well as asymmetric and heavy-tailed distributions such as the skew-t, skew-slash and skew-contaminated normal distributions, among others. Motivated by the results given in Garay et al. [Nonlinear regression models based on SMSN distributions. J. Korean Statist. Soc. 2011;40:115–124], we present a score test for the homogeneity of the scale parameter and investigate its properties through Monte Carlo simulation studies. Furthermore, local influence measures and the one-step approximations of the estimates in the case-deletion model are obtained. The newly developed procedures are illustrated using a real data set.

19.
This paper examines local influence assessment in generalized autoregressive conditional heteroscedasticity models with Gaussian and Student-t errors, where influence is examined via the likelihood displacement. The analysis of local influence is discussed under three perturbation schemes: data perturbation, innovative model perturbation and additive model perturbation. For each case, expressions for slope and curvature diagnostics are derived. Monte Carlo experiments are presented to determine the threshold values for locating influential observations. The empirical study of daily returns of the New York Stock Exchange composite index shows that local influence analysis is a useful technique for detecting influential observations; most of the observations detected as influential are associated with historical shocks in the market. Finally, based on this empirical study and the analysis of simulated data, some advice is given on how to use the discussed methodology.

20.
In high-dimensional regression, the presence of influential observations may lead to inaccurate analysis results, so detecting these unusual points before statistical regression analysis is an issue of prime importance. Most of the traditional approaches are, however, based on single-case diagnostics, and they may fail due to the presence of multiple influential observations that suffer from masking effects. In this paper, an adaptive multiple-case deletion approach is proposed for detecting multiple influential observations in the presence of masking effects in high-dimensional regression. The procedure contains two stages. Firstly, we propose a multiple-case deletion technique, and obtain an approximate clean subset of the data that is presumably free of influential observations. To enhance efficiency, in the second stage, we refine the detection rule. Monte Carlo simulation studies and a real-life data analysis investigate the effective performance of the proposed procedure.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号