Similar literature
 Found 20 similar documents; search took 140 ms
1.
It is well known that under certain conditions linear regression becomes a powerful statistical tool. In practice, however, some of these conditions are usually not satisfied and regression models become ill-posed, implying that the application of traditional estimation methods may lead to non-unique or highly unstable solutions. To address this issue, this paper introduces a new class of maximum entropy estimators suitable for ill-posed models, namely regression models with small sample sizes affected by collinearity and outliers. The performance of the new estimators is illustrated through several simulation studies.

2.
It is often thought that regression data should be mean-centered before being diagnosed for collinearity (ill conditioning). This view is shown not generally to be correct. Such centering can mask elements of ill conditioning and produce meaningless and misleading collinearity diagnostics. In order to assess conditioning meaningfully, the data must be in a form that possesses structural interpretability.
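
The masking effect described in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `scaled_condition_number` and the toy data are assumptions made for illustration. For a two-column design matrix whose columns are scaled to unit length, the Gram matrix is [[1, c], [c, 1]], so the condition number is sqrt((1 + |c|) / (1 - |c|)), where c is the cosine between the columns.

```python
import math

def scaled_condition_number(col_a, col_b):
    """Condition number of a two-column matrix after scaling each column
    to unit length. Hypothetical helper, for illustration only."""
    dot = sum(a * b for a, b in zip(col_a, col_b))
    norm_a = math.sqrt(sum(a * a for a in col_a))
    norm_b = math.sqrt(sum(b * b for b in col_b))
    c = abs(dot / (norm_a * norm_b))  # |cosine| between the two columns
    # Eigenvalues of [[1, c], [c, 1]] are 1 + c and 1 - c.
    # Exactly collinear columns (c == 1) would divide by zero here.
    return math.sqrt((1.0 + c) / (1.0 - c))

# Toy data (made up): a predictor that varies little relative to its level,
# so it is nearly collinear with the intercept column.
intercept = [1.0] * 5
x = [100.0, 101.0, 102.0, 103.0, 104.0]

mean_x = sum(x) / len(x)
x_centered = [v - mean_x for v in x]

cond_raw = scaled_condition_number(intercept, x)
cond_centered = scaled_condition_number(intercept, x_centered)

print("condition number, raw data:     ", cond_raw)
print("condition number, centered data:", cond_centered)
```

In this toy case the raw data show a large condition number, while the centered data look perfectly conditioned, even though the near dependency between the predictor and the intercept is still present in the original problem.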

3.
4.
This paper defines collinearity for generalized linear models (GLMs), investigates its consequences and proposes diagnostic criteria. The relationship between collinearity in GLMs and standard linear models (SLMs) is explored and bounds which relate the degree of collinearity in these two models are given. Estimation based on ridge methods is discussed.

5.
We propose a new collinearity diagnostic tool for generalized linear models. The new tool is termed the weighted variance inflation factor (WVIF); it behaves exactly like the traditional variance inflation factor in regression diagnostics when the data matrix is normalized. Compared with the condition number (CN), the WVIF gives more reliable information on how severe the collinearity is when it exists. An alternative estimator, a by-product of the new diagnostic, outperforms the ridge estimator in the presence of collinearity in terms of both WVIF and CN. Evidence is given through the analysis of various real-world numerical examples.

6.
In this study, we investigate linear regression with both heteroskedasticity and collinearity problems. We discuss the properties related to the perturbation method. Important observations are summarized as theorems. We then prove the main result, which states that the heteroskedasticity-robust variances can be improved and that the resulting bias is minimized by using the matrix perturbation method. We analyze a practical example to validate the method.

7.
Although the collinearity issue has been studied in previous simulation studies with a simultaneous system of equations, alternative estimators to circumvent this problem have received little attention. Monte Carlo techniques are used to examine the performance of several estimators under a squared error loss criterion. In particular, this study considers the Vinod–Ullah ridge-type estimators at the first and/or second stage of 2SLS. Ridge regression applied only in the second stage of 2SLS, but not only in the first stage, seems to be a practical alternative to 2SLS, especially in situations of strong collinearity. The OLS estimator and the ordinary ridge regression estimator also yield favorable results in situations of moderate to strong collinearity.

8.
Logistic regression using conditional maximum likelihood estimation has recently gained widespread use. Many of the applications of logistic regression have been in situations in which the independent variables are collinear. It is shown that collinearity among the independent variables seriously affects the conditional maximum likelihood estimator in that the variance of this estimator is inflated in much the same way that collinearity inflates the variance of the least squares estimator in multiple regression. Drawing on the similarities between multiple and logistic regression, several alternative estimators, which reduce the effect of the collinearity and are easy to obtain in practice, are suggested and compared in a simulation study.

9.
Ridge regression has been widely applied for estimation under collinearity by defining a class of estimators that depend on the parameter k. The variance inflation factor (VIF) is applied to detect the presence of collinearity and also as an objective method to obtain the value of k in ridge regression. Contrary to the definition of the VIF, the expressions traditionally applied in ridge regression do not necessarily lead to VIF values equal to or greater than 1. This work presents an alternative expression to calculate the VIF in ridge regression that satisfies the aforementioned condition and also presents other interesting properties.
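
As a minimal sketch of the class of estimators indexed by k that this abstract refers to, the standard ridge estimator (X'X + kI)^(-1) X'y can be written out for two predictors without an intercept. The function name `ridge_two_predictors` and the toy data are assumptions made for illustration; the alternative VIF expression proposed in the paper is not reproduced here.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Ridge estimate (X'X + k I)^(-1) X'y for two predictors, no intercept.
    Hypothetical helper; k = 0 gives ordinary least squares."""
    a = sum(u * u for u in x1) + k           # (X'X + kI)[0][0]
    b = sum(u * v for u, v in zip(x1, x2))   # off-diagonal element
    d = sum(v * v for v in x2) + k           # (X'X + kI)[1][1]
    g1 = sum(u * t for u, t in zip(x1, y))   # (X'y)[0]
    g2 = sum(v * t for v, t in zip(x2, y))   # (X'y)[1]
    det = a * d - b * b
    # Closed-form inverse of a symmetric 2x2 matrix applied to X'y.
    return ((d * g1 - b * g2) / det, (a * g2 - b * g1) / det)

# Toy, already centered data (made up): x2 is nearly a copy of x1.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-1.94, -1.14, 0.16, 0.86, 2.06]
y = [-4.1, -1.8, 0.3, 2.1, 3.9]

beta_ols = ridge_two_predictors(x1, x2, y, k=0.0)
beta_ridge = ridge_two_predictors(x1, x2, y, k=1.0)

def squared_norm(beta):
    return sum(b * b for b in beta)

print("OLS   (k = 0):", beta_ols)
print("ridge (k = 1):", beta_ridge)
```

Increasing k shrinks the coefficient vector toward zero, trading some bias for reduced variance when the predictors are nearly collinear.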

10.
In at least one important application of stochastic linear programming (Lavaca-Tres Palacios Estuary: A Study of the Influence of Freshwater Inflows, 1980), constraint parameters are simultaneously estimated using multiple regression with historic data for the values of the decision variables and the right hand side of the constraint function. In this circumstance, the question immediately arises "How stable is the linear programming (LP) solution with regard to regression issues such as sample size, magnitude of the error variance, centroids of the decision variables, and collinearity?" This paper reports a simulation designed to assess the stability of the LP solution and to compare the effectiveness of ridge as an alternative to ordinary least squares (OLS) regression. For the given scenario, the LP solution is consistently "biased." The amount of bias is exacerbated by small samples, large error variances, and collinearity among observations of the decision variables. The best regression criterion is a function not only of collinearity, but also of the magnitude of the error variance and the sum of the means of the decision variables relative to the right hand side of the stochastic constraint.

In the application that motivated this research, the LP solutions were recommended fresh water inflows from Lake Texana into the estuaries of the Gulf of Mexico. The stochastic constraint estimates commercial fish harvest as a function of seasonal fresh water inflow. The historic data set used to estimate parameters of the constraint comprised rainfall data and fish harvest data prior to the construction of the Lake Texana dam, of necessity a small sample with collinear seasonal rainfall. It is not the authors' intent to solve this application, but rather to investigate through a simpler simulated system whether or not regression estimates in similar circumstances might introduce a systematic and predictable bias. The answer to this latter question is a qualified Yes!

11.
In this article, we highlight some interesting facts about Bayesian variable selection methods for linear regression models in settings where the design matrix exhibits strong collinearity. We first demonstrate via real data analysis and simulation studies that summaries of the posterior distribution based on marginal and joint distributions may give conflicting results for assessing the importance of strongly correlated covariates. The natural question is which one should be used in practice. The simulation studies suggest that posterior inclusion probabilities and Bayes factors that evaluate the importance of correlated covariates jointly are more appropriate, and some priors may be more adversely affected in such a setting. To obtain a better understanding of the phenomenon, we study some toy examples with Zellner’s g-prior. The results show that strong collinearity may lead to a multimodal posterior distribution over models, in which joint summaries are more appropriate than marginal summaries. Thus, we recommend a routine examination of the correlation matrix and calculation of the joint inclusion probabilities for correlated covariates, in addition to marginal inclusion probabilities, for assessing the importance of covariates in Bayesian variable selection.

12.
High-dimensional data arise frequently in modern applications such as biology, chemometrics, economics, neuroscience and other scientific fields. The common features of high-dimensional data are that many of the predictors may not be significant and that the predictors are highly correlated. Generalized linear models, as the generalization of linear models, also suffer from the collinearity problem. In this paper, combining the nonconvex penalty and ridge regression, we propose the weighted elastic-net to deal with variable selection in high-dimensional generalized linear models and give the theoretical properties of the proposed method with a diverging number of parameters. The finite sample behavior of the proposed method is illustrated with simulation studies and a real data example.

13.
It is known that collinearity among the explanatory variables in generalized linear models (GLMs) inflates the variance of maximum likelihood estimators. To overcome multicollinearity in GLMs, the ordinary ridge estimator and the restricted estimator have been proposed. In this study, a restricted ridge estimator is introduced by unifying the ordinary ridge estimator and the restricted estimator in GLMs, and its mean squared error (MSE) properties are discussed. The MSE comparisons are done in the context of first-order approximated estimators. The results are illustrated by a numerical example, and two simulation studies are conducted with Poisson and binomial responses.

14.
15.
Recent developments in multivariate smoothing methods provide a rich collection of feasible models for nonparametric multivariate data analysis. Among the most interpretable are those with smoothed additive terms. The construction of various methods and algorithms for computing the models has been the main concern in the literature in this area. Fewer results, by contrast, are available on the validation of the computed fit, and many applications of nonparametric methods end up computing and comparing the generalized validation error or related indexes. This article reviews the behaviour of some of the best known multivariate nonparametric methods, based on subset selection and on projection, when exact collinearity or multicollinearity (near collinearity) is present in the input matrix. It shows the possible aliasing effects in computed fits of some selection methods and explores the properties of the projection spaces reached by projection methods in order to help data analysts select the best model in the case of ill-conditioned input matrices. Two simulation studies and a real data set application are presented to further illustrate the effects of collinearity or multicollinearity on the fit.

16.
The variance inflation factor (VIF) is used to detect the presence of linear relationships between two or more independent variables (i.e. collinearity) in the multiple linear regression model. However, the traditionally used VIF definitions encounter some problems when extended to the case of ridge estimation (RE). This paper presents an extension of the VIF in RE by providing two alternative VIF expressions that overcome these problems in the general case. Some characteristics of these expressions are also presented and compared with the traditional expression. The results are illustrated with an economic example in the case of three independent variables and with a Monte Carlo simulation for the general case.
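
For orientation, the traditional (non-ridge) VIF mentioned in this abstract can be sketched as follows. The VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors; with only two predictors, R_j^2 is the squared Pearson correlation between them. The function names and toy data are assumptions made for illustration, and the ridge extensions proposed in the paper are not shown.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """Traditional VIF when the model has exactly two predictors:
    VIF = 1 / (1 - r^2), identical for both predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Toy data (made up for illustration).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_near_copy = [1.1, 1.9, 3.2, 3.9, 5.1]     # almost a linear function of x1
x2_unrelated = [1.0, -2.0, 0.0, 2.0, -1.0]   # uncorrelated with x1

vif_collinear = vif_two_predictors(x1, x2_near_copy)
vif_orthogonal = vif_two_predictors(x1, x2_unrelated)

print("VIF with a nearly collinear predictor:", vif_collinear)
print("VIF with an uncorrelated predictor:   ", vif_orthogonal)
```

A VIF of 1 is the minimum and indicates no linear relationship with the other predictor; values above roughly 10 are a commonly quoted rule-of-thumb signal of troublesome collinearity.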

17.
In longitudinal studies, missing responses and mismeasured covariates are commonly seen due to the data collection process. Without caution in the data analysis, inferences from standard statistical approaches may lead to wrong conclusions. In order to improve the estimation for longitudinal data analysis, a doubly robust estimation method for partially linear models, which can simultaneously account for the missing responses and mismeasured covariates, is proposed. Imprecision in the covariates is corrected by taking advantage of the independence between replicate measurement errors, and missing responses are handled by the doubly robust estimation under the mechanism of missing at random. The asymptotic properties of the proposed estimators are established under regularity conditions, and simulation studies demonstrate desired properties. Finally, the proposed method is applied to data from the Lifestyle Education for Activity and Nutrition study.

18.
Bivariate responses of repeated measures data are usually analysed in the literature as two separate responses. The two responses usually tend to be related in some way, and analysing these data jointly presents an opportunity to account for the joint movement, which may affect the conclusions reached compared with analysing the responses separately. In this paper, a bivariate regression model with random effects (linear mixed model) is used to detect a change, if any, in prescribing habits in the UK at the general practice (family medicine) level due to an educational intervention, given repeated measures data before and after the intervention and a control group. The message was to increase the prescribing of one drug while simultaneously decreasing the prescribing of another. The effects of modelling a bivariate auto-regressive process are evaluated.

19.
In this work, a simulation study is conducted to evaluate the performance of Bayesian estimators for the log–linear exponential regression model under different levels of censoring and degrees of collinearity for two covariates. The diffuse normal, independent Student-t and multivariate Student-t distributions are considered as prior distributions, and the Metropolis algorithm is implemented to draw from the posterior distributions. The results are also compared with the maximum likelihood estimators in terms of the mean squared error and the coverage and length of the credible and confidence intervals.

20.
Apparently contradictory results between direct and reverse regression in employment-discrimination data analysis are a manifestation of collinearity in the data. An easily implemented guideline that alerts the analyst to the presence of contaminating collinearity is illustrated with employment data from Title VII litigation.
