
1.

A robust diagnostic plot for explanatory variables under model mis-specification

Li-Chu Chien 《Journal of applied statistics》2011,38(1):113-126

A typical added variable plot is a commonly used plot in assessing the accuracy of a normal linear model. This plot is often used to evaluate the effect of adding an explanatory variable into the model and to detect possibly high leverage points or influential observations on the added variable. However, this type of plot is generally in doubt, once the normal distributional assumptions are violated. In this article, we extend the robust likelihood technique introduced by Royall and Tsou [11] to propose a robust added variable plot. The validity of this diagnostic plot requires no knowledge of the true underlying distributions so long as their second moments exist. The usefulness of the robust graphical approach is demonstrated through a few illustrations and simulations.

2.

Added variable plots for linear regression with censored data

Peter J. Smith Lalith W. Peiris 《Communications in Statistics - Theory and Methods》2013,42(8):1987-2000

In linear regression the structure of the hat matrix plays an important part in regression diagnostics. In this note we investigate the properties of the hat matrix for regression with censored responses in the presence of one or more explanatory variables observed without censoring. The censored points in the scatterplot are renovated to positions had they been observed without censoring in a renovation process based on Buckley-James censored regression estimators. This allows natural links to be established with the structure of ordinary least squares estimators. In particular, we show that the renovated hat matrix may be partitioned in a manner which assists in deciding whether further explanatory variables should be added to the linear model. The added variable plot for regression with censored data is developed as a diagnostic tool for this decision process.

3.

Residuals from deletion in added variable plots

A. H. M. Rahmatullah Imon 《Journal of applied statistics》2003,30(7):827-841

An added variable plot is a commonly used plot in regression diagnostics. The rationale for this plot is to provide information about the addition of a further explanatory variable to the model. In addition, an added variable plot is most often used for detecting high leverage points and influential data. So far as we know, this type of plot involves the least squares residuals which, we suspect, could produce a confusing picture when a group of unusual cases are present in the data. In this situation, added variable plots may not only fail to detect the unusual cases but also may fail to focus on the need for adding a further regressor to the model. We suggest that residuals from deletion should be more convincing and reliable in this type of plot. The usefulness of an added variable plot based on residuals from deletion is investigated through a few examples and a Monte Carlo simulation experiment in a variety of situations.
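
As a rough illustration of the classical (least squares) added variable plot that this abstract starts from, the sketch below computes the two sets of residuals that form the plot axes and checks that the slope through the origin matches the coefficient of the added variable in the full model. The simulated data, the variable names, and the `residuals` helper are assumptions made for illustration; this is not the deletion-residual version proposed in the article.

```python
import numpy as np

# Simulated data (hypothetical): x1 is already in the model, x2 is the candidate.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1])  # current model: intercept + x1


def residuals(A, b):
    """Least squares residuals of b regressed on the columns of A."""
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return b - A @ coef


e_y = residuals(X, y)    # vertical axis of the added variable plot
e_x2 = residuals(X, x2)  # horizontal axis of the added variable plot

# Slope of the regression through the origin of e_y on e_x2.
slope = (e_x2 @ e_y) / (e_x2 @ e_x2)

# Coefficient of x2 when it is added to the full least squares model.
full_coef, *_ = np.linalg.lstsq(np.column_stack([X, x2]), y, rcond=None)
```

Plotting `e_y` against `e_x2` gives the plot itself; points far out along the horizontal axis are the ones with high leverage on the added variable.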

4.

The SSR Plot: A Graphical Representation for Regression

Chong Sun Hong 《Communications in Statistics - Simulation and Computation》2013,42(4):726-735

An alternative graphical method, called the SSR plot, is proposed for use with a multiple regression model. The new method uses the fact that the sum of squares for regression (SSR) of two explanatory variables can be partitioned into the SSR of one variable and the increment in SSR due to the addition of the second variable. The SSR plot represents each explanatory variable as a vector in a half circle. Our proposed SSR plot explains that the explanatory variables corresponding to the vectors located closer to the horizontal axis have stronger effects on the response variable. Furthermore, for a regression model with two explanatory variables, the magnitude of the angle between two vectors can be used to identify suppression.
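
The SSR partition mentioned in this abstract can be checked numerically. The sketch below is only an illustration on simulated data: the helper names `fit`, `ssr`, and `resid` are hypothetical, and the increment is computed independently from residuals rather than by subtraction, so the final comparison is not circular. It does not reproduce the half-circle plot itself.

```python
import numpy as np

# Simulated data (hypothetical) with two correlated explanatory variables.
rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 3.0 + 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)


def fit(y, *xs):
    """Fitted values from least squares of y on an intercept plus xs."""
    A = np.column_stack([np.ones(len(y)), *xs])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef


def ssr(y, *xs):
    """Sum of squares for regression: squared deviations of fits from the mean."""
    return float(np.sum((fit(y, *xs) - y.mean()) ** 2))


def resid(y, *xs):
    """Least squares residuals of y on an intercept plus xs."""
    return y - fit(y, *xs)


ssr_both = ssr(y, x1, x2)  # SSR(x1, x2)
ssr_first = ssr(y, x1)     # SSR(x1)

# Increment due to adding x2, computed from residuals adjusted for x1.
e_y, e_x2 = resid(y, x1), resid(x2, x1)
increment = float((e_x2 @ e_y) ** 2 / (e_x2 @ e_x2))
```

Here `ssr_both` agrees with `ssr_first + increment` up to floating-point error, which is the partition the SSR plot builds on.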

5.

Robust estimation for ordinal regression

C. Croux G. Haesbroeck C. Ruwet 《Journal of statistical planning and inference》2013

Ordinal regression is used for modelling an ordinal response variable as a function of some explanatory variables. The classical technique for estimating the unknown parameters of this model is Maximum Likelihood (ML). The lack of robustness of this estimator is formally shown by deriving its breakdown point and its influence function. To robustify the procedure, a weighting step is added to the Maximum Likelihood estimator, yielding an estimator with bounded influence function. We also show that the loss in efficiency due to the weighting step remains limited. A diagnostic plot based on the Weighted Maximum Likelihood estimator allows outliers of different types to be detected in a single plot.

6.

Adjusted variable plots for Cox's proportional hazards regression model

Charles B. Hall Scott L. Zeger Karen J. Bandeen-Roche 《Lifetime data analysis》1996,2(1):73-90

Adjusted variable plots are useful in linear regression for outlier detection and for qualitative evaluation of the fit of a model. In this paper, we extend adjusted variable plots to Cox's proportional hazards model for possibly censored survival data. We propose three different plots: a risk level adjusted variable (RLAV) plot in which each observation in each risk set appears, a subject level adjusted variable (SLAV) plot in which each subject is represented by one point, and an event level adjusted variable (ELAV) plot in which the entire risk set at each failure event is represented by a single point. The latter two plots are derived from the RLAV by combining multiple points. In each plot, the regression coefficient and standard error from a Cox proportional hazards regression are obtained by a simple linear regression through the origin fit to the coordinates of the pictured points. The plots are illustrated with a reanalysis of a dataset of 65 patients with multiple myeloma.

7.

On variable selection in generalized linear and related regression models

Lennart Nordberg 《Communications in Statistics - Theory and Methods》2013,42(21):2427-2449

This paper is concerned with selection of explanatory variables in generalized linear models (GLM). The class of GLM's is quite large and contains e.g. the ordinary linear regression, the binary logistic regression, the probit model and Poisson regression with linear or log-linear parameter structure. We show that, through an approximation of the log likelihood and a certain data transformation, the variable selection problem in a GLM can be converted into variable selection in an ordinary (unweighted) linear regression model. As a consequence no specific computer software for variable selection in GLM's is needed. Instead, some suitable variable selection program for linear regression can be used. We also present a simulation study which shows that the log likelihood approximation is very good in many practical situations. Finally, we mention briefly possible extensions to regression models outside the class of GLM's.

8.

Properties of Added Variable Plots in Cox's Regression Model

Lindkvist M 《Lifetime data analysis》2000,6(1):23-38

The added variable plot is useful for examining the effect of a covariate in regression models. The plot provides information regarding the inclusion of a covariate, and is useful in identifying influential observations on the parameter estimates. Hall et al. (1996) proposed a plot for Cox's proportional hazards model derived by regarding the Cox model as a generalized linear model. This paper proves and discusses properties of this plot. These properties make the plot a valuable tool in model evaluation. Quantities considered include parameter estimates, residuals, leverage, case influence measures and correspondence to previously proposed residuals and diagnostics.

9.

Multivariate-multiple circular regression

Sungsu Kim Ashis SenGupta 《Journal of Statistical Computation and Simulation》2017,87(7):1277-1291

We introduce a fully model-based approach of studying functional relationships between a multivariate circular-dependent variable and several circular covariates, enabling inference regarding all model parameters and related prediction. Two multiple circular regression models are presented for this approach. First, for a univariate circular-dependent variable, we propose the least circular mean-square error (LCMSE) estimation method, and asymptotic properties of the LCMSE estimators and inferential methods are developed and illustrated. Second, using a simulation study, we provide some practical suggestions for model selection between the two models. An illustrative example is given using a real data set from a protein structure prediction problem. Finally, a straightforward extension to the case with a multivariate-dependent circular variable is provided.

10.

Direction dependence in a regression line

Yadolah Dodge Valentin Rousson 《Communications in Statistics - Theory and Methods》2013,42(9-10):1957-1972

In this paper, we derive some simple formulae to express the association between two random variables in the case of a linear relationship. One of these representations, the cube of the correlation coefficient, is given as the ratio of the skewness of the response variable to that of the explanatory variable. This result, along with other expressions of the correlation coefficient presented in this paper, has implications for choosing the response variable in linear regression modelling.
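
A small simulation can make the skewness identity concrete. The sketch below assumes a linear model with a skewed explanatory variable and a symmetric error that is independent of it; these conditions are not stated in the abstract and are assumptions here, as are the chosen distributions and the `skewness` helper. Under them, the cube of the sample correlation should be close to the ratio of sample skewnesses.

```python
import numpy as np

# Assumed setup: skewed x, linear relationship, symmetric independent error.
rng = np.random.default_rng(2)
n = 200_000
x = rng.exponential(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)


def skewness(v):
    """Sample skewness: third central moment over the cubed standard deviation."""
    d = v - v.mean()
    return float(np.mean(d ** 3) / np.mean(d ** 2) ** 1.5)


rho = float(np.corrcoef(x, y)[0, 1])
ratio = skewness(y) / skewness(x)  # skewness of response over that of x

print(f"rho^3 = {rho ** 3:.4f}, skewness ratio = {ratio:.4f}")
```

Swapping the roles of `x` and `y` gives a ratio that no longer matches `rho ** 3`, which is the sense in which the identity bears on the choice of response variable.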

11.

Outlier detection and robust variable selection via the penalized weighted LAD-LASSO method

Yunlu Jiang Yan Wang Jiantao Zhang Baojian Xie Jibiao Liao Wenhui Liao 《Journal of applied statistics》2021,48(2):234

This paper studies the outlier detection and robust variable selection problem in the linear regression model. The penalized weighted least absolute deviation (PWLAD) regression estimation method and the adaptive least absolute shrinkage and selection operator (LASSO) are combined to simultaneously achieve outlier detection and robust variable selection. An iterative algorithm is proposed to solve the proposed optimization problem. Monte Carlo studies are conducted to evaluate the finite-sample performance of the proposed methods. The results indicate that the proposed methods perform better in finite samples than the existing methods when there are leverage points or outliers in the response variable or explanatory variables. Finally, we apply the proposed methodology to analyze two real datasets.

12.

Gaussian Markov random field spatial models in GAMLSS

Fernanda De Bastiani Robert A. Rigby Audrey H.M.A. Cysneiros Miguel A. Uribe-Opazo 《Journal of applied statistics》2018,45(1):168-186

This paper describes the modelling and fitting of Gaussian Markov random field spatial components within a Generalized Additive Model for Location, Scale and Shape (GAMLSS) model. This allows modelling of any or all the parameters of the distribution for the response variable using explanatory variables and spatial effects. The response variable distribution is allowed to be a non-exponential family distribution. A new package developed in R to achieve this is presented. We use Gaussian Markov random fields to model the spatial effect in Munich rent data and explore some features and characteristics of the data. The potential of using spatial analysis within GAMLSS is discussed. We argue that the flexibility of parametric distributions, ability to model all the parameters of the distribution and diagnostic tools of GAMLSS provide an ideal environment for modelling spatial features of data.

13.

Order-restricted Dose-related Trend Phi-divergence Tests for Generalized Linear Models

A. Felipe M. L. Menéndez L. Pardo 《Journal of applied statistics》2007,34(5):611-623

In this paper a new family of test statistics is presented for testing the independence between the binary response Y and an ordered categorical explanatory variable X (doses) against the alternative hypothesis of an increasing dose-response relationship between a response variable Y and X (doses). The properties of these test statistics are studied. This new family of test statistics is based on the family of φ-divergence measures and contains as a particular case the likelihood ratio test. We pay special attention to the family of test statistics associated with the power divergence family. A simulation study is included in order to analyze the behavior of the power divergence family of test statistics.

14.

Residual analysis for spatial point processes (with discussion)

A. Baddeley R. Turner J. Møller M. Hazelton 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2005,67(5):617-666

Summary. We define residuals for point process models fitted to spatial point pattern data, and we propose diagnostic plots based on them. The residuals apply to any point process model that has a conditional intensity; the model may exhibit spatial heterogeneity, interpoint interaction and dependence on spatial covariates. Some existing ad hoc methods for model checking (quadrat counts, scan statistic, kernel smoothed intensity and Berman's diagnostic) are recovered as special cases. Diagnostic tools are developed systematically, by using an analogy between our spatial residuals and the usual residuals for (non-spatial) generalized linear models. The conditional intensity λ plays the role of the mean response. This makes it possible to adapt existing knowledge about model validation for generalized linear models to the spatial point process context, giving recommendations for diagnostic plots. A plot of smoothed residuals against spatial location, or against a spatial covariate, is effective in diagnosing spatial trend or covariate effects. Q–Q plots of the residuals are effective in diagnosing interpoint interaction.

15.

A Dirichlet random coefficient regression model for quality indicators

《Journal of statistical planning and inference》2006,136(3):942-961

We present a random coefficient regression model in which a response is linearly related to some explanatory variables with random coefficients following a Dirichlet distribution. These coefficients can be interpreted as weights because they are nonnegative and add up to one. The proposed estimation procedure combines iteratively reweighted least squares and the maximization of an approximated likelihood function. We also present a diagnostic tool based on a residual Q–Q plot and two procedures for estimating individual weights. The model is used to construct an index for measuring the quality of the railroad system in Spain.

16.

Graphical Techniques for Selecting Explanatory Variables for Time Series Data

J. M. Marriott & A. N. Pettitt 《Journal of the Royal Statistical Society. Series C, Applied statistics》1997,46(2):253-264

Bayesian model building techniques are developed for data with a strong time series structure and possibly exogenous explanatory variables that have strong explanatory and predictive power. The emphasis is on finding whether there are any explanatory variables that might be used for modelling if the data have a strong time series structure that should also be included. We use a time series model that is linear in past observations and that can capture both stochastic and deterministic trend, seasonality and serial correlation. We propose the plotting of absolute predictive error against predictive standard deviation. A series of such plots is utilized to determine which of several nested and non-nested models is optimal in terms of minimizing the dispersion of the predictive distribution and restricting predictive outliers. We apply the techniques to modelling monthly counts of fatal road crashes in Australia where economic, consumption and weather variables are available and we find that three such variables should be included in addition to the time series filter. The approach leads to graphical techniques to determine strengths of relationships between the dependent variable and covariates and to detect model inadequacy as well as determining useful numerical summaries.

17.

Bayesian least squares estimates of univariate regression functions

H. D. Brunk 《Communications in Statistics - Theory and Methods》2013,42(11):1101-1136

The regression function R(·) to be estimated is assumed to have an expansion in terms of specified functions, orthogonalized with respect to values of the explanatory variable. Relative precisions of observations are assumed known. The estimate is the posterior linear mean of R(·) given the data. The investigator plots graphs of appropriate functions as an aid in eliciting his prior means and precisions for the coefficients in the expansion. The method is illustrated by an example using simulated data, an example in which effects of various dosages of Vitamin D are estimated, and an example in which a utility function is estimated.

18.

Approximate regression models and splines

R.L. Eubank 《Communications in Statistics - Theory and Methods》2013,42(4):433-484

The literature pertaining to splines in regression analysis is reviewed. Spline regression is motivated as a simple extension of the basic polynomial regression model. Using this framework, the concepts of fixed and variable knot spline regression are developed and corresponding inferential procedures are considered. Smoothing splines are also seen to be an extension of polynomial regression and various optimality properties, as well as inferential and diagnostic methods, for these types of splines are discussed.
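
One common way to see fixed knot spline regression as an extension of polynomial regression is the truncated power basis: the usual polynomial columns plus one hinge column per knot. The sketch below is an illustration under that assumption, not necessarily the formulation used in the review; the function name `truncated_power_basis`, the knot locations, and the noise-free test data are all hypothetical.

```python
import numpy as np


def truncated_power_basis(x, knots, degree=1):
    """Columns 1, x, ..., x^degree, then (x - k)_+^degree for each fixed knot k."""
    poly = [x ** p for p in range(degree + 1)]
    hinge = [np.maximum(x - k, 0.0) ** degree for k in knots]
    return np.column_stack(poly + hinge)


# Noise-free piecewise linear data with a single change of slope at x = 4.
x = np.linspace(0.0, 10.0, 101)
y = 1.0 + 0.5 * x + 2.0 * np.maximum(x - 4.0, 0.0)

# Fit a linear spline with fixed knots at 4 and 7 by ordinary least squares.
B = truncated_power_basis(x, knots=[4.0, 7.0])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
```

With no knots the basis reduces to plain polynomial regression. In this example the coefficient on the hinge at 7 is estimated as essentially zero, since the data have no change of slope there.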

19.

Robust regression: an inferential method for determining which independent variables are most important

Rand R. Wilcox 《Journal of applied statistics》2018,45(1):100-111

Consider the usual linear regression model consisting of two or more explanatory variables. There are many methods aimed at indicating the relative importance of the explanatory variables. But in general these methods do not address a fundamental issue: when all of the explanatory variables are included in the model, how strong is the empirical evidence that the first explanatory variable is more or less important than the second explanatory variable? How strong is the empirical evidence that the first two explanatory variables are more important than the third explanatory variable? The paper suggests a robust method for dealing with these issues. The proposed technique is based on a particular version of explanatory power used in conjunction with a modification of the basic percentile method.

20.

Robust and diagnostic regression analyses

Anthony C Atkinson 《Communications in Statistics - Theory and Methods》2013,42(22):2559-2571

Graphical methods of diagnostic regression analysis are applied to three examples in which least squares and robust regression analyses give substantially different results. The diagnostic tools lead to the identification of data deficiencies and model inadequacies. The analyses serve as a reminder that robust regressions depend upon the linear model and upon the scale in which the response is analysed. The robust analysis may also be sensitive to gross errors in one or more explanatory variables.