Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
In modeling count data with multivariate predictors, we often encounter clustering of observations and interdependency among predictors. We propose using principal components of the predictors to mitigate the multicollinearity problem; to abate the information loss due to dimension reduction, a semiparametric link between the count dependent variable and the principal components is postulated. Clustering of observations is incorporated into the model as a random component, and the model is estimated via the backfitting algorithm. A simulation study illustrates the advantages of the proposed model over standard Poisson regression in a wide range of scenarios.

2.
Abstract.  We consider non-parametric additive quantile regression estimation by kernel-weighted local linear fitting. The estimator is based on localizing the characterization of quantile regression as the minimizer of the appropriate 'check function'. A backfitting algorithm and a heuristic rule for selecting the smoothing parameter are explored. We also study the estimation of average-derivative quantile regression under the additive model. The techniques are illustrated by a simulated example and a real data set.
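The 'check function' mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard definition rho_tau(u) = u * (tau - 1{u < 0}) and fits only a constant, without the kernel weights or local linear terms the abstract describes; the function names and data are hypothetical.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def constant_quantile_fit(ys, tau):
    """Return the observation that minimizes the summed check loss.

    A minimizer over constants can always be found among the
    observations, so a search over the sample values is enough.
    """
    return min(ys, key=lambda q: sum(check_loss(y - q, tau) for y in ys))


ys = [2.0, 7.0, 1.0, 9.0, 4.0]  # hypothetical sample
median_fit = constant_quantile_fit(ys, 0.5)
upper_fit = constant_quantile_fit(ys, 0.9)
print(median_fit, upper_fit)
```

With tau = 0.5 the loss is proportional to the absolute error, so the fit is the sample median; larger tau penalizes underestimation more heavily and pushes the fit toward the upper tail.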

3.
Suppose the observations (t_i, y_i), i = 1, …, n, follow an additive model in which the g_j are unknown functions. The additive components can be estimated by approximating each g_j with the sum of a linear fit and a truncated Fourier series of cosines, and minimizing a penalized least-squares loss function over the coefficients. When fitting an additive model with r predictors, this finite-dimensional basis approximation reduces the computations drastically, since it does not require the backfitting algorithm. The cross-validation (CV) [or generalized cross-validation (GCV)] score for the additive fit is calculated in a further O(n) operations. A search path in the r-dimensional space of degrees of freedom is proposed along which the CV (GCV) continuously decreases. The path ends when an increase in the degrees of freedom of any of the predictors yields an increase in CV (GCV). This procedure is illustrated on a meteorological data set.

4.
Given a multiple time series sharing common autoregressive patterns, we estimate an additive model. The autoregressive component and the individual random effects are estimated by integrating maximum likelihood estimation and best linear unbiased prediction in a backfitting algorithm. The simulation study illustrates that the estimation procedure provides an alternative to the Arellano–Bond generalized method of moments (GMM) estimator of the panel model when T > N, where the Arellano–Bond estimator generally diverges. The estimator has high predictive ability. In cases where T ≤ N, the backfitting estimator is at least comparable to the Arellano–Bond estimator.

5.
We consider the problem of using shrinkage estimators that shrink towards subspaces in linear regression, in particular subspaces spanned by principal components. This is especially important when multicollinearity is present and the number of predictors is not small compared to the sample size. New theoretical results about Stein estimation are used to get estimators with lower theoretical risk than standard Stein estimators used by Oman (1991). Application of the techniques to real data is largely successful.

6.
In this work, we propose a new model, the generalized symmetrical partial linear model, based on the theory of generalized linear models and symmetrical distributions. In our model the response variable follows a symmetrical distribution such as the normal, Student-t or power exponential, among others. Following the context of generalized linear models, we replace the traditional linear predictor with a more general predictor in which one covariate is related to the response variable in a non-parametric fashion, that is, without specifying a parametric function. As an example, we could imagine a regression model in which the intercept term is believed to vary in time or geographical location. The backfitting algorithm is used to estimate the parameters of the proposed model. We perform a simulation study to assess the behavior of the penalized maximum likelihood estimators. We use quantile residuals to check the model assumptions. Finally, we analyze a real data set on the pH of rivers in Ireland.

7.
This paper focuses on efficient estimation, optimal rates of convergence and effective algorithms in the partly linear additive hazards regression model with current status data. We use polynomial splines to estimate both the cumulative baseline hazard function, under a monotonicity constraint, and the nonparametric regression functions, with no such constraint. We propose a simultaneous sieve maximum likelihood estimation for regression parameters and nuisance parameters and show that the resultant estimator of the regression parameter vector is asymptotically normal and achieves the semiparametric information bound. In addition, we show that rates of convergence for the estimators of nonparametric functions are optimal. We implement the proposed estimation through a backfitting algorithm on generalized linear models. We conduct simulation studies to examine the finite‐sample performance of the proposed estimation method and present an analysis of renal function recovery data for illustration.

8.
In this paper we considered a generalized additive model with second-order interaction terms. A local scoring algorithm (with backfitting) based on local linear kernel smoothers was used to estimate the model. Our main aim was to obtain procedures for testing second-order interaction terms. Backfitting theory is difficult in this context, and a bootstrap procedure is therefore provided for estimating the distribution of the test statistics. Given the high computational cost involved, binning techniques were used to speed up the computation in the estimation and testing process. A simulation study was carried out in order to assess the validity of the bootstrap-based tests. Lastly, our method was applied to real data drawn from an SO2 binary time series.

9.
The partially linear additive model is useful in statistical modelling as a multivariate nonparametric fitting technique. This paper considers statistical inference for this semiparametric model in the presence of multicollinearity. Based on the profile least-squares (PL) approach and the Liu estimation method, we propose a PL Liu estimator for the parametric component. When additional linear restrictions on the parametric component are available, the corresponding restricted Liu estimator for the parametric component is constructed. The properties of the proposed estimators are derived. Simulations are conducted to assess the performance of the proposed procedures, and the results are satisfactory. Finally, a real data example is analysed.

10.
Additive models, which allow linear and nonlinear predictors to coexist, are often applied in statistical learning. In this article we adapt existing boosting methods for both mean regression and quantile regression in additive models so that they can simultaneously identify nonlinear, linear and zero predictors. We use gradient boosting in which simple linear regression and univariate penalized splines are used as base learners. Twin boosting is applied to achieve better variable selection accuracy. Simulation studies as well as real data applications illustrate the strength of our proposed methods.

11.
In linear regression models, predictors based on least squares or generalized least squares estimators are usually applied; these, however, fail in the case of multicollinearity. As an alternative, biased estimators such as ridge estimators, Kuks-Olman estimators, Bayes estimators or minimax estimators are sometimes suggested. In our analysis the relative squared error, instead of the generally used absolute squared error, enters the objective function. An explicit minimax solution is derived which, in an important special case, can be viewed as a predictor based on a Kuks-Olman estimator.

12.
In situations where the predictors are correlated with the error term, we propose a bridge estimator in the two-stage least squares estimation. We apply this estimator to overcome the multicollinearity and sparsity of the explanatory variables when endogeneity is present. The proposed estimator is also used to modify the Durbin-Wu-Hausman (DWH) test of endogeneity in the presence of multicollinearity. To compare our modified test with the existing DWH test for detecting endogeneity in multicollinear data, some numerical assessments are carried out. The numerical results show that the proposed estimators and the suggested test perform better for multicollinear data. Finally, a genetic data set is used to illustrate our results by estimating the coefficient parameters in the presence of endogeneity and multicollinearity.

13.
The presence of outliers in a data set affects the structure of multicollinearity, which arises from a high degree of correlation between explanatory variables in a linear regression analysis. This effect may appear as an increase or a decrease in the diagnostics used to detect multicollinearity. Thus, outliers reduce the reliability of diagnostics such as variance inflation factors, condition numbers and variance decomposition proportions. In this study, we propose using a robust estimate of the correlation matrix, obtained by the minimum covariance determinant method, to compute the multicollinearity diagnostics in the presence of outliers. The paper demonstrates that multicollinearity diagnostics obtained from the robust estimate of the correlation matrix are more reliable in the presence of outliers.

14.
In multiple linear regression analysis, the ridge regression estimator and the Liu estimator are often used to address multicollinearity. Besides multicollinearity, outliers are also a problem in multiple linear regression analysis. We propose new biased estimators based on the least trimmed squares (LTS) ridge estimator and the LTS Liu estimator for the case in which both outliers and multicollinearity are present. A simulation study is conducted to compare the robust ridge estimator and the robust Liu estimator in terms of their effectiveness, measured by the mean square error. In our simulations, the behavior of the new biased estimators is examined for three types of outliers: X-space outliers, Y-space outliers, and X- and Y-space outliers. The results for a number of different illustrative cases are presented. This paper also provides the results for the robust ridge regression and robust Liu estimators based on a real-life data set combining the problems of multicollinearity and outliers.

15.
Generalized additive models for location, scale and shape   Total citations: 10 (self-citations: 0; citations by others: 10)
Summary.  A general class of statistical models for a univariate response variable is presented which we call the generalized additive model for location, scale and shape (GAMLSS). The model assumes independent observations of the response variable y given the parameters, the explanatory variables and the values of the random effects. The distribution for the response variable in the GAMLSS can be selected from a very general family of distributions including highly skew or kurtotic continuous and discrete distributions. The systematic part of the model is expanded to allow modelling not only of the mean (or location) but also of the other parameters of the distribution of y, as parametric and/or additive nonparametric (smooth) functions of explanatory variables and/or random-effects terms. Maximum (penalized) likelihood estimation is used to fit the (non)parametric models. A Newton–Raphson or Fisher scoring algorithm is used to maximize the (penalized) likelihood. The additive terms in the model are fitted by using a backfitting algorithm. Censored data are easily incorporated into the framework. Five data sets from different fields of application are analysed to emphasize the generality of the GAMLSS class of models.

16.
Summary.  Compared with the classical backfitting of Buja, Hastie and Tibshirani, the smooth backfitting estimator (SBE) of Mammen, Linton and Nielsen not only provides complete asymptotic theory under weaker conditions but is also more efficient, robust and easier to calculate. However, the original paper describing the SBE method is complex and the practical as well as the theoretical advantages of the method have still neither been recognized nor accepted by the statistical community. We focus on a clear presentation of the idea, the main theoretical results and practical aspects like implementation and simplification of the algorithm. We introduce a feasible cross-validation procedure and apply it to the problem of data-driven bandwidth choice for the SBE. By simulations it is shown that the SBE and our cross-validation work very well indeed. In particular, the SBE is less affected by sparseness of data in high dimensional regression problems or strongly correlated designs. The SBE has reasonable performance even in 100-dimensional additive regression problems.

17.
In this article, we consider the problem of variable selection in linear regression when multicollinearity is present in the data. It is well known that, in the presence of multicollinearity, the performance of the least squares (LS) estimator of the regression parameters is not satisfactory. Consequently, subset selection methods based on LS estimates, such as Mallows' Cp, lead to the selection of inadequate subsets. To overcome the problem of multicollinearity in subset selection, a new subset selection algorithm based on the ridge estimator is proposed. It is shown that the new algorithm is a better alternative to Mallows' Cp when the data exhibit multicollinearity.
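To make the contrast between the LS and ridge estimators concrete, here is a minimal sketch for two centered predictors. It assumes the usual ridge form b(k) = (X'X + kI)^(-1) X'y and does not implement the subset selection algorithm of this abstract; the data, the function name and the value k = 1 are hypothetical.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Solve (X'X + k I) b = X'y for two centered predictors (no intercept)."""
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # close to zero when k = 0 and x1, x2 are collinear
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Hypothetical data: x2 is almost a copy of x1, and y is roughly x1 + x2.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.0, 1.1, 1.9]
y = [-3.8, -2.1, 0.1, 1.8, 4.0]

ls_fit = ridge_two_predictors(x1, x2, y, 0.0)     # k = 0 gives least squares
ridge_fit = ridge_two_predictors(x1, x2, y, 1.0)  # k = 1 is an arbitrary choice
print("LS:", ls_fit)
print("ridge:", ridge_fit)
```

On these data the LS coefficients are far apart and one has the wrong sign, while the ridge coefficients are both positive and of similar size. This instability is what can make LS-based criteria pick inadequate subsets.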

18.
The present Monte Carlo simulation study adds to the literature by analyzing parameter bias, rates of Type I and Type II error, and variance inflation factor (VIF) values produced under various multicollinearity conditions by multiple regressions with two, four, and six predictors. Findings indicate multicollinearity is unrelated to Type I error, but increases Type II error. Investigation of bias suggests that multicollinearity increases the variability in parameter bias, while leading to overall underestimation of parameters. Collinearity also increases VIF. In the case of all diagnostics, however, increasing the number of predictors interacts with multicollinearity to compound observed problems.
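The link between collinearity and VIF can be sketched for the simplest case in this study, two predictors. There the R-squared from regressing one predictor on the other equals the squared Pearson correlation r, so both predictors share VIF = 1 / (1 - r^2). The data and function names below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor regression."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]  # correlation with x1 is 0.8
vif = vif_two_predictors(x1, x2)
print(round(vif, 3))
```

As r approaches 1 the denominator approaches zero and the VIF grows without bound. With more than two predictors each VIF requires a full auxiliary regression instead of a single correlation.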

19.
In multiple linear regression, multicollinearity and outliers are commonly occurring problems. They produce undesirable effects on the ordinary least squares estimator. Many alternative parameter estimation methods are available in the literature which deal with these problems independently. In practice, multicollinearity and outliers may occur simultaneously. In this article, we present a new estimator, called the Linearized Ridge M-estimator, which combats the simultaneous occurrence of multicollinearity and outliers. A real data example and a simulation study are presented to illustrate the performance of the proposed estimator.

20.
We consider estimation for the homoscedastic additive model for multiple regression. A recursion is proposed in Opsomer (1999), and independently by the authors, for obtaining the estimators that solve the normal equations given by Hastie and Tibshirani (1990). The recursion can be exploited to obtain the asymptotic bias and variance expressions of the estimators for any p > 2 (Opsomer 1999) using repeated application of Opsomer and Ruppert (1997). Opsomer and Ruppert (1997) provide asymptotic bias and variance for the estimators when p = 2. Opsomer (1999) also uses the recursion to provide sufficient conditions for convergence of the backfitting algorithm to a unique solution of the normal equations. However, since explicit expressions for the solution to the normal equations are not given, he states, “The lemma does not provide a practical way of evaluating the existence and uniqueness of the backfitting estimators … ”. In this paper, explicit expressions for the estimators are derived. The explicit solution requires inverses of n × n matrices to solve the np × np system of normal equations. These matrix inverses are feasible to implement for moderate sample sizes and can be used in place of the backfitting algorithm.
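For readers unfamiliar with the backfitting algorithm that recurs throughout these abstracts, the following is a minimal sketch for p = 2. It is not the explicit solution derived in this paper. The Gaussian-kernel (Nadaraya-Watson) smoother, the bandwidth h = 0.2, the fixed iteration count and the simulated data are all illustrative assumptions.

```python
import math
import random


def kernel_smooth(x, r, h):
    """Nadaraya-Watson smoother: Gaussian-kernel weighted mean of r at each x[i]."""
    fitted = []
    for xi in x:
        w = [math.exp(-0.5 * ((xi - xj) / h) ** 2) for xj in x]
        fitted.append(sum(wj * rj for wj, rj in zip(w, r)) / sum(w))
    return fitted


def center(g):
    """Subtract the mean so each additive component sums to zero."""
    m = sum(g) / len(g)
    return [v - m for v in g]


def backfit(x1, x2, y, h=0.2, n_iter=10):
    """Backfitting for y = alpha + g1(x1) + g2(x2) + error."""
    n = len(y)
    alpha = sum(y) / n
    g1 = [0.0] * n
    g2 = [0.0] * n
    for _ in range(n_iter):
        # Smooth each component's partial residuals against its own predictor.
        g1 = center(kernel_smooth(x1, [yi - alpha - b for yi, b in zip(y, g2)], h))
        g2 = center(kernel_smooth(x2, [yi - alpha - a for yi, a in zip(y, g1)], h))
    return alpha, g1, g2


# Simulated data (hypothetical): two independent uniform predictors.
random.seed(0)
n = 100
x1 = [random.uniform(-1, 1) for _ in range(n)]
x2 = [random.uniform(-1, 1) for _ in range(n)]
y = [a * a + math.sin(2 * b) + random.gauss(0, 0.1) for a, b in zip(x1, x2)]

alpha, g1, g2 = backfit(x1, x2, y)
fitted = [alpha + a + b for a, b in zip(g1, g2)]
mse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / n
var_y = sum((yi - alpha) ** 2 for yi in y) / n
print(f"residual MSE = {mse:.4f}, variance of y = {var_y:.4f}")
```

Each pass updates one component while holding the other fixed, so the loop is an iterative solver for the normal equations. The explicit solution in the abstract replaces this loop with n × n matrix inverses.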
