Similar literature
1.
The heterogeneity of error variance often causes a huge interpretive problem in linear regression analysis. Before taking any remedial measures we first need to detect this problem. A large number of diagnostic plots are now available in the literature for detecting heteroscedasticity of error variances. Among them the ‘residuals’ and ‘fits’ (R–F) plot is very popular and commonly used. In the R–F plot residuals are plotted against the fitted responses, where both these components are obtained using the ordinary least squares (OLS) method. It is now evident that the OLS fits and residuals suffer a huge setback in the presence of unusual observations and hence the R–F plot may not exhibit the real scenario. The deletion residuals based on a data set free from all unusual cases should estimate the true errors in a better way than the OLS residuals. In this paper we propose ‘deletion residuals’ and the ‘deletion fits’ (DR–DF) plot for the detection of the heterogeneity of error variances in a linear regression model to get a more convincing and reliable graphical display. Examples show that this plot locates unusual observations more clearly than the R–F plot. The advantage of using deletion residuals in the detection of heteroscedasticity of error variance is investigated through Monte Carlo simulations under a variety of situations.

2.
This paper examines the problem of heteroscedasticity as it relates to estimating park use, although the results can also be applied to a wide variety of flow problems involving traffic, people or commodities. The major issue is that estimates of flows obtained using ordinary least squares (OLS) often yield statistically significant results while still giving rise to large differences between observed and predicted flows (residuals). The paper presents results which show that, for the flow estimation problem of concern, more accurate use estimates may be obtained by using generalized least squares (GLS) rather than OLS. Weights to use in GLS regression are developed taking into account the variance to be expected in origin-destination flows. It is shown that deriving the correct weights (estimates of variances) to use in a regression analysis results in an ‘absolute’ test for the structural appropriateness of the regression model. Tests related to the ‘absolute’ adequacy test are introduced and their use to identify specific structural problems with a model is illustrated.

3.
A robust rank-based estimator for variable selection in linear models, with grouped predictors, is studied. The proposed estimation procedure extends the existing rank-based variable selection [Johnson, B.A., and Peng, L. (2008), ‘Rank-based Variable Selection’, Journal of Nonparametric Statistics, 20(3):241–252] and the ww-scad [Wang, L., and Li, R. (2009), ‘Weighted Wilcoxon-type Smoothly Clipped Absolute Deviation Method’, Biometrics, 65(2):564–571] to linear regression models with grouped variables. The resulting estimator is robust to contamination or deviations in both the response and the design space. The Oracle property and asymptotic normality of the estimator are established under some regularity conditions. Simulation studies reveal that the proposed method performs better than the existing rank-based methods [Johnson, B.A., and Peng, L. (2008), ‘Rank-based Variable Selection’, Journal of Nonparametric Statistics, 20(3):241–252; Wang, L., and Li, R. (2009), ‘Weighted Wilcoxon-type Smoothly Clipped Absolute Deviation Method’, Biometrics, 65(2):564–571] for grouped variables models. This estimation procedure also outperforms the adaptive hlasso [Zhou, N., and Zhu, J. (2010), ‘Group Variable Selection Via a Hierarchical Lasso and its Oracle Property’, Interface, 3(4):557–574] in the presence of local contamination in the design space or for heavy-tailed error distribution.

4.
5.
In regression experiments, to learn about the strength of the relationship between a covariate vector and a dependent variable, we propose a ‘coefficient of determination’ based on the quantiles. Such a coefficient is a ‘local’ measure in the sense that the strength is measured at a prespecified quantile level. Once estimated, it can be used, for example, to measure the relative importance of a subset of covariates in the quantile regression context. Related to this coefficient, we also propose a new ‘local’ lack-of-fit measure of a given parametric model. We provide some asymptotic results of the proposed measures and carry out a Monte Carlo simulation study to illustrate their use and performance in practice.

6.
This paper presents a simple computational procedure for generating ‘matching’ or ‘cloning’ datasets so that they have exactly the same fitted multiple linear regression equation. The method is simple to implement and provides an alternative to generating datasets under an assumed model. The advantage is that, unlike the case for the straight model-based alternative, parameter estimates from the original data and the generated data do not include any model error. This distinction suggests that ‘same fit’ procedures may provide a general and useful alternative to model-based procedures, and have a wide range of applications. For example, as well as being useful for teaching, cloned datasets can provide a model-free way of confidentializing data.
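One way to see why a ‘same fit’ dataset can exist is that any response of the form fitted values plus residuals orthogonal to the regressors reproduces the original least squares coefficients. The sketch below illustrates that fact only; it is not the procedure from the paper, it uses a single predictor for brevity, and the names `ols_line` and `clone_dataset` are hypothetical.

```python
import random


def ols_line(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def clone_dataset(x, y, seed=0):
    """Build a new response with the same fitted line as (x, y).

    Assumption: fresh Gaussian noise is regressed on x and only its
    residuals are kept, so the new residuals are orthogonal to both the
    constant and x. Only the coefficients are matched, not the residual
    variance.
    """
    rng = random.Random(seed)
    intercept, slope = ols_line(x, y)
    noise = [rng.gauss(0.0, 1.0) for _ in x]
    n_int, n_slope = ols_line(x, noise)
    resid = [e - (n_int + n_slope * xi) for xi, e in zip(x, noise)]
    return [intercept + slope * xi + r for xi, r in zip(x, resid)]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_clone = clone_dataset(x, y, seed=1)
```

Refitting `ols_line(x, y_clone)` should return the same intercept and slope as `ols_line(x, y)` up to floating-point rounding, even though the individual responses differ.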

7.
Simultaneous confidence bands have been shown in the statistical literature as powerful inferential tools in univariate linear regression. While the methodology of simultaneous confidence bands for univariate linear regression has been extensively researched and well developed, no published work seems available for multivariate linear regression. This paper fills this gap by studying one particular simultaneous confidence band for multivariate linear regression. Because of the shape of the band, the word ‘tube’ is more pertinent and so will be used to replace the word ‘band’. It is shown that the construction of the tube is related to the distribution of the largest eigenvalue. A simulation-based method is proposed to compute the 1 − α quantile of this eigenvalue. With the computation power of modern computers, the simultaneous confidence tube can be computed fast and accurately. A real-data example is used to illustrate the method, and many potential research problems have been pointed out.

8.
The problem of comparing mean responses for several treatments applied to a common population is considered. The analysis of covariance (ANCOVA) is frequently used to take advantage of covariate information in this setting, but in many cases ANCOVA's assumption of parallel regression functions precludes its use. In this paper, an alternative method is developed which does not make this assumption.

9.
We propose a segmented discrete-time model for the analysis of event history data in demographic research. Through a unified regression framework, the model provides estimates of the effects of explanatory variables and flexibly accommodates non-proportional differences via segmented relationships. Its main appeal lies in the ready availability of parameters, changepoints, and slopes, which may provide meaningful and intuitive information on the topic. Furthermore, specific linear constraints on the slopes may also be set to investigate particular patterns. We investigate the intervals between cohabitation and first childbirth and from first to second childbirth using individual data for Italian women from the Second National Survey on Fertility. The model provides insights into the dramatic decrease of fertility experienced in Italy, in that it detects a ‘common’ tendency in delaying the onset of childbearing for the more recent cohorts and a ‘specific’ postponement strictly depending on the educational level and age at cohabitation.

10.
Characterising the correspondence between an ordinal measurement and a continuous measurement is often of interest in mental health studies. To this end Peng et al. [(2011), ‘A Framework for Assessing Broad Sense Agreement Between Ordinal and Continuous Measurements’, Journal of the American Statistical Association, 106, 1592–1601] introduced the concept of broad sense agreement (BSA) and developed nonparametric estimation and inference for a BSA measure. In this work, we propose a nonparametric regression framework for BSA, which provides a robust tool to further investigate population heterogeneity in BSA. We develop inferential procedures including regression function estimation and hypothesis testing. Extensive simulation studies demonstrate satisfactory performance of the proposed method. We also apply the new method to a recent Grady Trauma Study and reveal an interesting impact of depression severity on the alignment between a self-reported symptom instrument and clinician diagnosis in posttraumatic stress disorder patients.

11.
This article provides alternative circular smoothing methods in nonparametric estimation of periodic functions. By treating the data as ‘circular’, we solve the ‘boundary issue’ that arises in nonparametric estimation when the data are treated as ‘linear’. By redefining the distance metric and signed distance, we modify many estimators used in situations involving periodic patterns. From the perspective of ‘nonparametric estimation of periodic functions’, we present examples in nonparametric estimation of (1) a periodic function, (2) multiple periodic functions, (3) an evolving function, (4) a periodically varying-coefficient model and (5) a generalized linear model with periodically varying coefficient. From the perspective of ‘circular statistics’, we provide alternative approaches to calculate the weighted average and evaluate the ‘linear/circular–linear/circular’ association and regression. Simulation studies and an empirical study of an electricity price index have been conducted to illustrate our methods and compare them with other methods in the literature.

12.
An approach to teaching linear regression with unbalanced data is outlined that emphasizes its role as a method of adjustment for associated regressors. The method is introduced via direct standardization, a simple form of regression for categorical regressors. Properties of regression in the presence of association and interaction are emphasized. Least squares is introduced as a more efficient way of calculating adjusted effects for which exact decompositions of the variance are possible. Interval-scaled regressors are initially grouped and treated as categorical; polynomial regression and analysis of covariance can be introduced later as alternative methods.

13.
The geometric characterization of linear regression in terms of the ‘concentration ellipse’ by Galton [Galton, F., 1886, Family likeness in stature (with Appendix by Dickson, J.D.H.). Proceedings of the Royal Society of London, 40, 42–73.] and Pearson [Pearson, K., 1901, On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2, 559–572.] was extended to the case of unequal variances of the presumably uncorrelated errors in the experimental data [McCartin, B.J., 2003, A geometric characterization of linear regression. Statistics, 37(2), 101–117.]. In this paper, this geometric characterization is further extended to planar (and also linear) regression in three dimensions where a beautiful interpretation in terms of the concentration ellipsoid is developed.

14.
Generalized estimating equations (GEE) have become a popular method for marginal regression modelling of data that occur in clusters. Features of the GEE methodology are the use of a ‘working covariance’, an approximation to the underlying covariance, which is used to improve the efficiency in estimating the regression coefficients, and the ‘sandwich’ estimate of variance, which provides a way of consistently estimating their standard errors. These techniques have been extended to include estimating equations for the underlying correlation structure, both to improve the efficiency of the regression coefficient estimates and to provide estimates of correlations between units in a cluster, when these are of interest. If the mean structure is of primary interest, then a simpler set of equations (GEE1) can be used, whereas if the underlying covariance structure is of interest in its own right, the use of the more complex GEE2 estimating equations is often recommended. In this paper, we compare the effect of increasing the complexity of the ‘working covariances’ on the variance of the parameter estimates, as well as the mean-squared error of the ‘sandwich’ estimate of variance. We give asymptotic expressions for these variances and mean-squared error terms. We use these to study the behaviour of different variants of GEE1 and GEE2 when we change the number of clusters, the cluster size, and the within-cluster correlation. We conclude that the extra complexity of the full GEE2 approach is not usually justified if the mean structure is of primary interest.

15.
A class of trimmed linear conditional estimators based on regression quantiles for the linear regression model is introduced. This class serves as a robust analogue of non-robust linear unbiased estimators. Asymptotic analysis then shows that the trimmed least squares estimator based on regression quantiles (Koenker and Bassett (1978)) is the best in this estimator class in terms of asymptotic covariance matrices. The class of trimmed linear conditional estimators contains the Mallows-type bounded influence trimmed means (see De Jongh et al. (1988)) and trimmed instrumental variables estimators. A large sample methodology based on the trimmed instrumental variables estimator for confidence ellipsoids and hypothesis testing is also provided.

16.
The variance inflation factor (VIF) is used to detect the presence of linear relationships between two or more independent variables (i.e. collinearity) in the multiple linear regression model. However, the traditionally used VIF definitions encounter some problems when extended to the case of the ridge estimation (RE). This paper presents an extension of the VIF in RE by providing two alternative VIF expressions that overcome these problems in the general case. Some characteristics of these expressions are also presented and compared with the traditional expression. The results are illustrated with an economic example in the case of three independent variables and with a Monte Carlo simulation for the general case.
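For orientation, the traditional least squares VIF of regressor j is 1 / (1 − R_j²), where R_j² comes from regressing that regressor on the others; equivalently, it is the j-th diagonal entry of the inverse of the regressors' correlation matrix. The sketch below computes only this traditional quantity, not the ridge extensions proposed in the paper, and the function names and the small example data are hypothetical.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix of equal-length regressor columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    k = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j]))
            / (norms[i] * norms[j])
            for j in range(k)
        ]
        for i in range(k)
    ]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(k)]
        for i, row in enumerate(matrix)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regressors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """Traditional VIFs: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical data: two regressors with squared correlation 0.6.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.0, 5.0, 4.0, 5.0]
factors = vif([x1, x2])
```

With two regressors both VIFs equal 1 / (1 − r²), so the hypothetical data above give 2.5 for each; uncorrelated regressors give 1.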

17.
18.
19.
Modifications to the usual least squares normal equations have been proposed by Buckley and James (1979) when using distribution-free linear regression modelling under fixed right-censorship of the response. In this paper we consider large-sample distributional properties of the modified normal equation for the slope parameter and of estimators which ‘satisfy’ this equation.

20.
We present a Bayesian semiparametric approach to exponential family regression that extends the class of generalized linear regression models. Further, flexibility in the process of modelling is achieved by explicitly accounting for the discrepancy between the ‘true’ response-covariate regression surface and an assumed parametric functional relationship. An approximate full Bayesian analysis is provided, based upon the Gibbs sampling algorithm.
