Similar articles
Found 20 similar articles (search time: 15 ms)
1.
To bootstrap a regression problem, pairs of response and explanatory variables or residuals can be resampled, according to whether we believe that the explanatory variables are random or fixed. In the latter case, different residuals have been proposed in the literature, including the ordinary residuals (Efron 1979), standardized residuals (Bickel & Freedman 1983) and Studentized residuals (Weber 1984). Freedman (1981) has shown that the bootstrap from ordinary residuals is asymptotically valid when the number of cases increases and the number of variables is fixed. Bickel & Freedman (1983) have shown the asymptotic validity for ordinary residuals when the number of variables and the number of cases both increase, provided that the ratio of the two converges to zero at an appropriate rate. In this paper, the authors introduce the use of BLUS (Best Linear Unbiased with Scalar covariance matrix) residuals in bootstrapping regression models. The main advantage of the BLUS residuals, introduced in Theil (1965), is that they are uncorrelated. The main disadvantage is that only n − p residuals can be computed for a regression problem with n cases and p variables. The asymptotic results of Freedman (1981) and Bickel & Freedman (1983) for the ordinary (and standardized) residuals are generalized to the BLUS residuals. A small simulation study shows that even though only n − p residuals are available, in small samples bootstrapping BLUS residuals can be as good as, and sometimes better than, bootstrapping from standardized or Studentized residuals.
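
Illustrative sketch (not taken from the paper): the fixed-design case described above can be pictured as resampling ordinary residuals from a least-squares fit while the explanatory variable is held fixed. The single-predictor model, the function names `fit_line` and `residual_bootstrap_slopes`, and the simulated data are all assumptions made for this example; BLUS, standardized and Studentized residuals are not implemented here.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def residual_bootstrap_slopes(x, y, n_boot=500, seed=0):
    """Resample ordinary residuals with x held fixed; return bootstrap slopes."""
    rng = random.Random(seed)
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    # With an intercept the residuals already average to zero up to rounding;
    # the explicit centring is kept only as a safeguard.
    mean_r = statistics.fmean(resid)
    resid = [r - mean_r for r in resid]
    slopes = []
    for _ in range(n_boot):
        # New responses: fitted values plus residuals drawn with replacement.
        y_star = [fi + rng.choice(resid) for fi in fitted]
        slopes.append(fit_line(x, y_star)[1])
    return slopes


# Hypothetical data: true intercept 1, true slope 2, unit-variance noise.
x = [float(i) for i in range(20)]
noise_rng = random.Random(1)
y = [1.0 + 2.0 * xi + noise_rng.gauss(0.0, 1.0) for xi in x]

slopes = residual_bootstrap_slopes(x, y)
boot_se = statistics.stdev(slopes)  # bootstrap standard error of the slope
```

The spread of `slopes` gives a bootstrap estimate of the slope's sampling variability. Resampling (x, y) pairs instead would correspond to treating the explanatory variable as random.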

2.
3.
A fast routine for converting regression algorithms into corresponding orthogonal regression (OR) algorithms was introduced in Ammann and Van Ness (1988). The present paper discusses the properties of various ordinary and robust OR procedures created using this routine. OR minimizes the sum of the orthogonal distances from the regression plane to the data points. OR has three types of applications. First, L2 OR is the maximum likelihood solution of the Gaussian errors-in-variables (EV) regression problem. This L2 solution is unstable; thus the robust OR algorithms created from robust regression algorithms should prove very useful. Secondly, OR is intimately related to principal components analysis. Therefore, the routine can also be used to create L1, robust, etc. principal components algorithms. Thirdly, OR treats the x and y variables symmetrically, which is important in many modeling problems. Using Monte Carlo studies, this paper compares the performance of standard regression, robust regression, OR, and robust OR on Gaussian EV data, contaminated Gaussian EV data, heavy-tailed EV data, and contaminated heavy-tailed EV data.
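
Illustrative sketch (not the routine of Ammann and Van Ness): for a single x and a single y, the L2 orthogonal regression line has a closed form, namely the first principal axis of the centred data. The function name `orthogonal_line` and the demo data are assumptions for this example, and the formula assumes equal error variance in both variables.

```python
import math
import statistics


def orthogonal_line(x, y):
    """Fit y = a + b*x by minimising squared perpendicular distances.

    Returns (a, b). Assumes equal error variance in x and y.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("zero sample covariance: the fitted line is horizontal, vertical or undefined")
    # Slope of the first principal axis of the centred data.
    b = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return my - b * mx, b


# Hypothetical data lying exactly on y = 1 + 3x.
x_demo = [0.0, 1.0, 2.0, 3.0]
y_demo = [1.0, 4.0, 7.0, 10.0]

a_xy, b_xy = orthogonal_line(x_demo, y_demo)  # regress y on x
a_yx, b_yx = orthogonal_line(y_demo, x_demo)  # swap the roles of x and y
```

Swapping the two variables yields the reciprocal slope, so both calls describe the same line. Ordinary least squares does not have this symmetry in general.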

4.
In this article, parametric robust regression approaches are proposed for making inferences about regression parameters in the setting of generalized linear models (GLMs). The proposed methods are able to test hypotheses on the regression coefficients in misspecified GLMs. More specifically, it is demonstrated that with large samples, the normal and gamma regression models can be properly adjusted to become asymptotically valid for inferences about regression parameters under model misspecification. These adjusted regression models can provide the correct type I and II error probabilities and the correct coverage probability for continuous data, as long as the true underlying distributions have finite second moments.

5.
This paper presents results of a Monte Carlo simulation of eight families of robust regression estimators in various situations. The effects studied include long-tailed error terms, measurement error in the independent variables, various spacings of the independent variables, different sample sizes and correlation between the independent variables. An estimator that combines the best features of several of the estimators is recommended for further study.

6.
We review a method of examining the uniqueness of estimates and show that it is flawed: it neglects a continuity problem that can arise when the scale and regression parameters are estimated simultaneously.

7.
A well-known problem in multiple regression is that it is possible to reject the hypothesis that all slope parameters are equal to zero, yet when applying the usual Student's T-test to the individual parameters, no significant differences are found. An alternative strategy is to estimate prediction error via the 0.632 bootstrap method for all models of interest and declare the parameters associated with the model that yields the smallest prediction error to differ from zero. The main results in this paper are that this latter strategy can have practical value versus Student's T; that replacing squared error with absolute error can be beneficial in some situations; and that replacing least squares with an extension of the Theil-Sen estimator can substantially increase the probability of identifying the correct model under circumstances that are described.

8.
The different parts (variables) of a compositional data set cannot be considered independent from each other, since only the ratios between the parts constitute the relevant information to be analysed. Practically, this information can be included in a system of orthonormal coordinates. For the task of regression of one part on other parts, a specific choice of orthonormal coordinates is proposed which allows for an interpretation of the regression parameters in terms of the original parts. In this context, orthogonal regression is appropriate since all compositional parts, including the explanatory variables, are measured with errors. Besides classical (least-squares based) parameter estimation, robust estimation based on robust principal component analysis is also employed. Statistical inference for the regression parameters is obtained by bootstrap; in the robust version the fast and robust bootstrap procedure is used. The methodology is illustrated with a data set from macroeconomics.

9.
Wild Bootstrapping in Finite Populations with Auxiliary Information (total citations: 1; self-citations: 0; citations by others: 1)
Consider a finite population u, which can be viewed as a realization of a super-population model. A simple ratio model (linear regression, without intercept) with heteroscedastic errors is supposed to have generated u. A random sample is drawn without replacement from u. In this set-up a two-stage wild bootstrap resampling scheme as well as several other useful forms of bootstrapping in finite populations will be considered. Some asymptotic results for various bootstrap approximations for normalized and Studentized versions of the well-known ratio and regression estimator are given. Bootstrap-based confidence intervals for the population total and for the regression parameter of the underlying ratio model are also discussed.

10.
This article proposes new regression-type estimators by considering Tukey-M, Hampel M, Huber MM, LTS, LMS and LAD robust methods and MCD and MVE robust covariance matrices in stratified sampling. Theoretically, we obtain the mean square error (MSE) for these estimators. We compare the efficiencies, based on the MSE equations, of the proposed estimators and the traditional combined and separate regression estimators. These comparisons show that the proposed estimators give more efficient results than the traditional approaches. The theoretical results are supported by numerical examples and simulations based on data sets that include outliers.

11.
In this article, a robust variable selection procedure based on the weighted composite quantile regression (WCQR) is proposed. Compared with the composite quantile regression (CQR), WCQR is robust to heavy-tailed errors and outliers in the explanatory variables. For the choice of the weights in the WCQR, we employ a weighting scheme based on the principal component method. To select variables with grouping effect, we consider WCQR with SCAD-L2 penalization. Furthermore, under some suitable assumptions, the theoretical properties, including the consistency and oracle property of the estimator, are established with a diverging number of parameters. In addition, we study the numerical performance of the proposed method in the case of ultrahigh-dimensional data. Simulation studies and real examples are provided to demonstrate the superiority of our method over the CQR method when there are outliers in the explanatory variables and/or the random error is from a heavy-tailed distribution.

12.
A Monte Carlo simulation is used to study the performance of hypothesis tests for regression coefficients when least absolute value regression methods are used. In small samples, the results of the simulation suggest that using the bootstrap method to compute standard errors will provide improved test performance.

13.
In this paper, we focus on resampling non-stationary weakly dependent point processes in two dimensions to make inference on the inhomogeneous K function (Baddeley et al., 2000). We provide theoretical results that establish the consistency of the bootstrap estimates of the variance as the observation region and resampling blocks increase in size. We present results of a simulation study that examines the performance of nominal 95% confidence intervals for the inhomogeneous K function obtained via our bootstrap procedure. The procedure is also applied to a rainforest dataset.

14.
Fuzzy least-square regression can be very sensitive to unusual data (e.g., outliers). In this article, we describe how to fit an alternative robust-regression estimator in a fuzzy environment, which attempts to identify and ignore unusual data. The proposed approach concerns classical robust regression and estimation methods that are insensitive to outliers. In this regard, based on the least trimmed square estimation method, an estimation procedure is proposed for determining the coefficients of the fuzzy regression model for crisp input-fuzzy output data. The investigated fuzzy regression model is applied to bedload transport data, forecasting suspended load by discharge based on real-world data. The accuracy of the proposed method is compared with the well-known fuzzy least-square regression model. The comparison results reveal that the fuzzy robust regression model performs better than the other models in suspended load estimation for the particular dataset. This comparison is based on a similarity measure between fuzzy sets. The proposed model is general and can be used for modeling natural phenomena whose available observations are reported as imprecise rather than crisp.

15.
We investigate by simulation how the wild bootstrap and pairs bootstrap perform in t and F tests of regression parameters in the stochastic regression model, where explanatory variables are stochastic and not given and there is no heteroskedasticity. The wild bootstrap procedure due to Davidson and Flachaire [The wild bootstrap, tamed at last, Working paper, IER#1000, Queen's University, 2001] with restricted residuals works best, but its dominance is not strong compared to the result of Flachaire [Bootstrapping heteroskedastic regression models: wild bootstrap vs. pairs bootstrap, Comput. Statist. Data Anal. 49 (2005), pp. 361–376] in the fixed regression model, where explanatory variables are fixed and there is heteroskedasticity.
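
Illustrative sketch (not the procedure evaluated in the paper): a wild bootstrap keeps every residual attached to its own observation and only multiplies it by a random weight, which preserves any observation-specific error variance. The code below assumes a single predictor, unrestricted least-squares residuals and random ±1 (Rademacher) weights, which are one common choice; the names `ols` and `wild_bootstrap_slopes` and the simulated data are hypothetical.

```python
import random
import statistics


def ols(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def wild_bootstrap_slopes(x, y, n_boot=500, seed=0):
    """Wild bootstrap with +/-1 weights; returns the bootstrap slopes."""
    rng = random.Random(seed)
    a, b = ols(x, y)
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    slopes = []
    for _ in range(n_boot):
        # Each residual stays at its own x; only its sign is randomised.
        y_star = [f + r * rng.choice((-1.0, 1.0)) for f, r in zip(fitted, resid)]
        slopes.append(ols(x, y_star)[1])
    return slopes


# Hypothetical heteroskedastic data: noise standard deviation grows with x.
x = [float(i) for i in range(1, 31)]
noise_rng = random.Random(2)
y = [0.5 + 1.5 * xi + noise_rng.gauss(0.0, 0.1 * xi) for xi in x]

wild_slopes = wild_bootstrap_slopes(x, y)
wild_se = statistics.stdev(wild_slopes)  # bootstrap standard error of the slope
```

A pairs bootstrap would instead draw whole (x, y) observations with replacement. Using restricted residuals, as in the abstract, would mean taking them from a fit that imposes the null hypothesis being tested.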

16.
Let M be a parametric model for an unknown regression function m. In order to check the validity of M, i.e., to test for m ∈ M, it is known that optimal tests should be based on the empirical process of the regressors marked by the residuals. In this paper we extend the methodology to censored regression. The asymptotic distribution of the underlying marked empirical process is provided. The Wild Bootstrap, appropriately modified to account for censorship, provides distributional approximations. The method is applied to simulated data sets as well as to the Stanford Heart Transplant Data.

17.
18.
In the nonparametric setting, the standard bootstrap method is based on the empirical distribution function of a random sample. The author proposes, by means of the empirical likelihood technique, an alternative bootstrap procedure under a nonparametric model in which one has some auxiliary information about the population distribution. By proving the almost sure weak convergence of the modified bootstrapped empirical process, the validity of the proposed bootstrap procedure is established. This new result is used to obtain bootstrap confidence bands for the population distribution function and to perform the bootstrap Kolmogorov test in the presence of auxiliary information. Other applications include bootstrapping means and variances with auxiliary information. Three simulation studies are presented to demonstrate the performance of the proposed bootstrap procedure for small samples.

19.
A robust regression methodology is proposed via M-estimation. The approach adapts to the tail behavior and skewness of the distribution of the random error terms, providing for a reliable analysis under a broad class of distributions. This is accomplished by allowing the objective function, used to determine the regression parameter estimates, to be selected in a data-driven manner. The asymptotic properties of the proposed estimator are established and a numerical algorithm is provided to implement the methodology. The finite sample performance of the proposed approach is exhibited through simulation, and the approach is used to analyze two motivating datasets.

20.
Huber (1964) found the minimax-variance M-estimate of location under the assumption that the scale parameter is known; Li and Zamar (1991) extended this result to the case when the scale is unknown. We consider the robust estimation of the regression coefficients (β1,…,βp) when the scale and the intercept parameters are unknown. The minimax-variance estimates of (β1,…,βp) with respect to the trace of their asymptotic covariance matrix are derived. The maximum is taken over ε-contamination neighbourhoods of a central regression model with Gaussian errors (asymmetric contamination is allowed), and the minimum is taken over a large class of generalized M-estimates of regression of the Mallows type. The optimal choice of estimates for the nuisance parameters (scale and intercept) is also considered.
