Summary. This paper investigates the effects of ordinal regressors in linear regression models and in limited dependent variable models. Each ordered categorical variable is interpreted as a rough measurement of an underlying continuous variable, as is often done in microeconometrics for the dependent variable. It is shown that using ordinal indicators leads to correct answers only in a few special cases. In most situations, the usual estimators are biased. To estimate the parameters of the model consistently, the indirect estimation procedure suggested by Gourieroux et al. (1993) is applied. To demonstrate this method, a simulation study is performed first, and then two real data sets are used. In the latter case, continuous regressors are transformed into categorical variables to study the behavior of the estimation procedure. The method is extended to the case of limited dependent variable models. In general, the indirect estimators lead to adequate results. Received: March 27, 2000; revised version: March 6, 2001

Heteroscedasticity-consistent covariance matrix (HCCM) estimators are very commonly used in practice to draw correct inference about the coefficients of a linear regression model with heteroscedastic errors. However, in addition to heteroscedasticity, linear regression models with two or more regressors may also be plagued by a considerable degree of collinearity among the regressors. This situation has many adverse effects on least squares measures, and ordinary ridge regression is commonly used as an alternative. The available literature, however, has not discussed multicollinearity and heteroscedasticity as a combined issue, especially for inference about the regression coefficients. The present article addresses inference about the regression coefficients taking both multicollinearity and heteroscedasticity into account and suggests the use of HCCM estimators for ridge regression. This article proposes t- and F-tests, based on these HCCM estimators, that perform adequately in Monte Carlo simulations.
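As a rough illustration of what an HCCM for ridge regression can look like, the following is a sketch under stated assumptions and not necessarily the estimator proposed in the article. The ridge parameter $k$, the error covariance $\Omega$, and the choice of an HC0-type plug-in built from the ridge residuals $\hat e_i$ are all assumptions introduced here.

```latex
% Ridge estimator with ridge parameter k (assumed notation)
\hat\beta_k = (X'X + kI)^{-1} X'y

% Conditional covariance when Var(u | X) = \Omega = diag(\sigma_1^2, ..., \sigma_n^2)
\operatorname{Cov}(\hat\beta_k \mid X)
  = (X'X + kI)^{-1} \, X'\Omega X \, (X'X + kI)^{-1}

% HC0-type plug-in (assumption): replace \sigma_i^2 by squared residuals
\widehat{\operatorname{Cov}}(\hat\beta_k)
  = (X'X + kI)^{-1} \, X' \operatorname{diag}(\hat e_1^2, \dots, \hat e_n^2) \, X \, (X'X + kI)^{-1},
\qquad \hat e_i = y_i - x_i'\hat\beta_k
```

The sandwich form follows directly from $\hat\beta_k$ being linear in $y$; other HC variants would rescale the squared residuals.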

This work studies outlier detection and robust estimation with data that are naturally distributed into groups and which follow approximately a linear regression model with fixed group effects. For this, several methods are considered. First, the robust fitting method of Peña and Yohai [A fast procedure for outlier diagnostics in large regression problems. J Am Stat Assoc. 1999;94:434–445], called the principal sensitivity components (PSC) method, is adapted to the grouped data structure and the mentioned model. The robust methods RDL1 of Hubert and Rousseeuw [Robust regression with both continuous and binary regressors. J Stat Plan Inference. 1997;57:153–163] and M-S of Maronna and Yohai [Robust regression with both continuous and categorical predictors. J Stat Plan Inference. 2000;89:197–214] are also considered. These three methods are compared in terms of their effectiveness in outlier detection and their robustness through simulations, considering several contamination scenarios and growing contamination levels. Results indicate that the adapted PSC procedure is able to detect a high percentage of true outliers and a small number of false outliers. It is appropriate when the contamination is in the error term or in the covariates, and it also detects possibly masked high leverage points. Moreover, in simulations the final robust regression estimator preserved good efficiency under normality while keeping good robustness properties.

Summary. We model daily catches of fishing boats in the Grand Bank fishing grounds. We use data on catches per species for a number of vessels collected by the European Union in the context of the Northwest Atlantic Fisheries Organization. Many variables can be thought to influence the amount caught: a number of ship characteristics (such as the size of the ship, the fishing technique used and the mesh size of the nets) are obvious candidates, but one can also consider the season or the actual location of the catch. Our database leads to 28 possible regressors (arising from six continuous variables and four categorical variables, whose 22 levels are treated separately), resulting in a set of 177 million possible linear regression models for the log-catch. Zero observations are modelled separately through a probit model. Inference is based on Bayesian model averaging, using a Markov chain Monte Carlo approach. Particular attention is paid to the prediction of catches for single and aggregated ships.

We study the bias that arises from using censored regressors in estimation of linear models. We present results on bias in ordinary least squares (OLS) regression estimators with exogenous censoring and in instrumental variable (IV) estimators when the censored regressor is endogenous. Bound censoring such as top-coding results in expansion bias, or effects that are too large. Independent censoring results in bias that varies with the estimation method: attenuation bias in OLS estimators and expansion bias in IV estimators. Severe biases can result when there are several regressors and when a 0–1 variable is used in place of a continuous regressor.
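The expansion bias from top-coding can be seen in a small simulation. This is a minimal sketch and not the authors' setup: the standard normal regressor, the true slope of 1, the cap at 0.5, and the function names `ols_slope` and `simulate_top_coding` are all assumptions chosen for illustration.

```python
import random


def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_top_coding(n=20000, beta=1.0, cap=0.5, seed=0):
    """Return (slope using the true regressor, slope using the top-coded one)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]
    # Top-coding: values above the cap are recorded as the cap itself.
    x_top = [min(xi, cap) for xi in x]
    return ols_slope(x, y), ols_slope(x_top, y)


slope_full, slope_censored = simulate_top_coding()
print(f"true regressor: {slope_full:.3f}, top-coded regressor: {slope_censored:.3f}")
```

Under these assumptions the slope on the top-coded regressor comes out above the true value of 1, while the slope on the uncensored regressor stays close to 1.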

In this article, we extend the functional-coefficient cointegration model (FCCM) to the cases in which nonstationary regressors contain both stochastic and deterministic trends. A nondegenerate distributional theory on the local linear (LL) regression smoother of the FCCM is explored. It is demonstrated that even when integrated regressors are endogenous, the limiting distribution is the same as if they were exogenous. Finite-sample performance of the LL estimator is investigated via Monte Carlo simulations in comparison with an alternative estimation method. As an application of the FCCM, electricity demand analysis in Illinois is considered.

The paper considers tests against autocorrelation among the disturbances in linear regression models that can be expressed as ratios of quadratic forms. It shows that such tests are, in general, not unbiased and that power can even drop to zero for certain regressors and spatial weight matrices. Whether or not this can happen is, however, easily diagnosed for given regressors and for given spatial weights.
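A familiar instance of a test statistic that is a ratio of quadratic forms in the residuals is the Durbin-Watson statistic, e'Ae / e'e. The sketch below uses it purely as an illustration; the abstract does not say which statistics the paper studies, and the function names are hypothetical.

```python
def quadratic_form_ratio(e, a):
    """Compute e'Ae / e'e for a residual list e and a square matrix a."""
    n = len(e)
    numerator = sum(e[i] * a[i][j] * e[j] for i in range(n) for j in range(n))
    denominator = sum(v * v for v in e)
    return numerator / denominator


def durbin_watson_matrix(n):
    """Matrix A such that e'Ae equals the sum of squared first differences (n >= 2)."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # End points appear in one difference, interior points in two.
        a[i][i] = 1.0 if i in (0, n - 1) else 2.0
        if i + 1 < n:
            a[i][i + 1] = -1.0
            a[i + 1][i] = -1.0
    return a


residuals = [1.0, -1.0, 1.0, -1.0]  # strongly negatively autocorrelated
dw = quadratic_form_ratio(residuals, durbin_watson_matrix(len(residuals)))
print(dw)  # 3.0
```

Replacing the matrix with a spatial weight matrix gives a statistic of the same ratio form for spatial autocorrelation.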

In this paper, a new robust estimator, the modified median estimator, is introduced and studied for the logistic regression model. This estimator is based on the median estimator considered in Hobza et al. [Robust median estimator in logistic regression. J Stat Plan Inference. 2008;138:3822–3840]. Its asymptotic distribution is obtained. Using the modified median estimator, we also consider a Wald-type test statistic for testing linear hypotheses in the logistic regression model and we obtain its asymptotic distribution under the assumption of random regressors. An extensive simulation study is presented in order to analyse the efficiency as well as the robustness of the modified median estimator and the Wald-type test based on it.

Consider a linear regression model in which some relevant regressors are unobservable. In such a situation, we estimate the model by using proxy variables as regressors or by simply omitting the relevant regressors. In this paper, we derive the explicit formula for the predictive mean squared error (PMSE) of a general family of shrinkage estimators of regression coefficients. It is shown analytically that the positive-part shrinkage estimator dominates the ordinary shrinkage estimator even when proxy variables are used in place of the unobserved variables. Also, as an example, our result is applied to the double k-class estimator proposed by Ullah and Ullah (Double k-class estimators of coefficients in linear regression. Econometrica. 1978;46:705–722). Our numerical results show that the positive-part double k-class estimator with proxy variables has preferable PMSE performance.

This article analyzes the effects of multicollinearity on the maximum likelihood (ML) estimator for the Tobit regression model. Furthermore, a ridge regression (RR) estimator is proposed, since the mean squared error (MSE) of ML becomes inflated when the regressors are collinear. To investigate the performance of the traditional ML and the RR approaches, we use Monte Carlo simulations in which the MSE is used as the performance criterion. The simulated results indicate that the RR approach should always be preferred to the ML estimation method.

A new estimation method for the dimension of a regression at the outset of an analysis is proposed. A linear subspace spanned by projections of the regressor vector X, which contains part or all of the modelling information for the regression of a vector Y on X, and its dimension are estimated by means of parametric inverse regression. Smooth parametric curves are fitted to the p inverse regressions via a multivariate linear model. No restrictions are placed on the distribution of the regressors. The estimate of the dimension of the regression is based on optimal estimation procedures. A simulation study shows the method to be more powerful than sliced inverse regression in some situations.

The multivariate regression model is considered with p regressors. A latent vector with p binary entries serves to identify one of two types of regression coefficients: those close to 0 and those not. Specializing our general distributional setting to the linear model with Gaussian errors and using natural conjugate prior distributions, we derive the marginal posterior distribution of the binary latent vector. Fast algorithms aid its direct computation, and in high dimensions these are supplemented by a Markov chain Monte Carlo approach to sampling from the known posterior distribution. Problems with hundreds of regressor variables become quite feasible. We give a simple method of assigning the hyperparameters of the prior distribution. The posterior predictive distribution is derived and the approach illustrated on compositional analysis of data involving three sugars with 160 near infrared absorbances as regressors.

This article considers the problem of statistical inference in linear regression models with dependent errors. A sieve-type generalized least squares (GLS) procedure is proposed based on an autoregressive approximation to the generating mechanism of the errors. The asymptotic properties of the sieve-type GLS estimator are established under general conditions, including mixingale-type conditions as well as conditions which allow for long-range dependence in the stochastic regressors and/or the errors. A Monte Carlo study examines the finite-sample properties of the method for testing regression hypotheses.

In this paper we introduce an interesting feature of the generalized least absolute deviations method for seemingly unrelated regression equations (SURE) models. Generalized least-squares estimates of SURE models collapse to the ordinary least-squares estimates of the individual equations when all equations share the same regressors. In contrast, the estimates from the proposed methodology are not identical to the least absolute deviations estimates of the individual equations. This is important because, unlike with least-squares methods, one can take advantage of the efficiency gain due to cross-equation correlations even if the system includes the same regressors in each equation.
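The least-squares collapse mentioned in the abstract is a standard result and can be verified in a few lines. The notation below ($m$ equations, $n$ observations, common regressor matrix $X$, cross-equation error covariance $\Sigma$) is assumed here for illustration and is not taken from the paper.

```latex
% Stacked system with identical regressors in every equation
y = (I_m \otimes X)\,\beta + u, \qquad \operatorname{Var}(u) = \Sigma \otimes I_n

% GLS estimator
\hat\beta_{GLS}
  = \bigl[(I_m \otimes X)'(\Sigma^{-1} \otimes I_n)(I_m \otimes X)\bigr]^{-1}
    (I_m \otimes X)'(\Sigma^{-1} \otimes I_n)\, y

% Kronecker product rules: (A \otimes B)(C \otimes D) = AC \otimes BD
  = (\Sigma^{-1} \otimes X'X)^{-1} (\Sigma^{-1} \otimes X')\, y
  = \bigl(\Sigma \otimes (X'X)^{-1}\bigr)(\Sigma^{-1} \otimes X')\, y
  = \bigl(I_m \otimes (X'X)^{-1}X'\bigr)\, y
```

The last expression is equation-by-equation OLS, so $\Sigma$ drops out entirely. No analogous cancellation is available for an absolute-deviations criterion, which is consistent with the feature the abstract describes.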

In the presence of multicollinearity, the literature points to principal component regression (PCR) as an estimation method for the regression coefficients of a multiple regression model. Owing to ambiguities in interpretation caused by the orthogonal transformation of the set of explanatory variables, the method has not yet gained wide acceptance. Factor analysis regression (FAR) provides a model-based estimation method which is particularly tailored to overcome multicollinearity in an errors-in-variables setting. In this paper, two feasible versions of a FAR estimator are compared with the OLS estimator and the PCR estimator by means of Monte Carlo simulation. While the PCR estimator performs best in cases of strong and high multicollinearity, the Thomson-based FAR estimator proves to be superior when the regressors are moderately correlated.

From the prediction viewpoint, mode regression is more attractive since it pays attention to the most probable value of the response variable given the regressors. On the other hand, high-dimensional data have become very prevalent with advances in the technology for collecting and storing data. Variable selection is an important strategy for dealing with high-dimensional regression problems. This paper proposes a variable selection procedure for high-dimensional mode regression by combining a nonparametric kernel estimation method with sparsity penalties. We also establish the asymptotic properties under certain technical conditions. The effectiveness and flexibility of the proposed methods are further illustrated by numerical studies and a real data application.

In many practical situations, regression analysis with stochastic regressors is used. The estimates from this model are often influenced by a high degree of multicollinearity. To avoid this, a criterion and a procedure for selecting an optimal subset for regression are derived on the basis of the partition of the moments of the conditional normal distribution of the regressand given the regressors. Furthermore, two-stage procedures that improve the result of the subset regression, also based on the partition of the conditional moments, are given.

This note discusses a problem that might occur when forward stepwise regression is used for variable selection and among the candidate variables is a categorical variable with more than two categories. Most software packages (such as SAS, SPSSx, BMDP) include special programs for performing stepwise regression. The user of these programs has to code categorical variables with dummy variables. In this case the forward selection might wrongly indicate that a categorical variable with more than two categories is nonsignificant. This is a disadvantage of the forward selection compared with the backward elimination method. A way to avoid the problem would be to test in a single step all dummy variables corresponding to the same categorical variable rather than one dummy variable at a time, such as in the analysis of covariance. This option, however, is not available in forward stepwise procedures, except for stepwise logistic regression in BMDP. A practical possibility is to repeat the forward stepwise regression and change the reference categories each time.

In contrast to the common belief that the logit model has no analytical representation, it is possible to find such a solution in the case of categorical predictors. This paper shows that a binary logistic regression on categorical explanatory variables can be constructed in a closed-form solution. No special software and no iterative procedures of nonlinear estimation are needed to obtain a model with all its parameters and characteristics, including coefficients of regression, their standard errors and t-statistics, as well as the residual and null deviances. The derivation is performed for logistic models with one binary or categorical predictor, and several binary or categorical predictors. The analytical formulae can be used for arithmetical calculation of all the parameters of the logit regression. The explicit expressions for the characteristics of logit regression are convenient for the analysis and interpretation of the results of logistic modeling.
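For the simplest case, one binary predictor, the closed form is the textbook result for a 2x2 table: the intercept is the log odds in the reference group and the slope is the log odds ratio. The sketch below shows that standard result only; it is not a reproduction of the paper's formulae, and the function name and the example counts are made up for illustration.

```python
import math


def logit_from_2x2(n11, n10, n01, n00):
    """Closed-form logit fit of y on one binary predictor x.

    n11: count with x=1, y=1    n10: count with x=1, y=0
    n01: count with x=0, y=1    n00: count with x=0, y=0
    All four counts must be positive.
    """
    intercept = math.log(n01 / n00)              # log odds when x = 0
    slope = math.log(n11 / n10) - intercept      # log odds ratio
    se_intercept = math.sqrt(1 / n01 + 1 / n00)
    se_slope = math.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "se_slope": se_slope,
        "z_slope": slope / se_slope,
    }


fit = logit_from_2x2(n11=30, n10=10, n01=20, n00=40)
print(fit)
```

With these hypothetical counts the odds are 3 in the x=1 group and 0.5 in the x=0 group, so the slope equals log 6.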

This paper considers two-phase random design linear regression models. Errors and regressors are stationary long-range-dependent Gaussian processes. The regression parameters, the scale parameter and the change-point are estimated using a method introduced by Rousseeuw and Yohai [Robust regression by means of S-estimators, in Robust and Nonlinear Time Series Analysis, J. Franke, W. Härdle, and R.D. Martin, eds., Lecture Notes in Statistics, Vol. 26, Springer, New York, 1984, pp. 256–272], which is called the S-estimator and has the property of being more robust than the classical estimators in the sense that outliers do not bias the estimation results. Some asymptotic results, including the strong consistency and the convergence rate of the S-estimator, are proved. Simulations and an application to the Nile River data are also presented. It is shown via Monte Carlo simulations that the S-estimator is better than two other estimators that are proposed in the literature.
