首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Ridge estimator of a singular design is considered for linear and gener¬alized linear models. Ridge penalty helps determine a unique estimator in singmar uesign. me tuning parameter o± tue penalty is seiecteu via gener¬alized cross-validation (GCV) method. It is proven that the ridge estimator lies in a special sub-parameter space and converges to the intrinsic estimator, an estimable function in singular design, as the shrinkage penalty diminishes. The expansion of the ridge estimator and its variance are also obtained. Thismethod is demonstrated through an application to age-period-cohort (APC) analysis of the incidence rates of cervical cancer in Ontario women 1980-1994  相似文献   

2.
In the paper we consider minimisation of U-statistics with the weighted Lasso penalty and investigate their asymptotic properties in model selection and estimation. We prove that the use of appropriate weights in the penalty leads to the procedure that behaves like the oracle that knows the true model in advance, i.e. it is model selection consistent and estimates nonzero parameters with the standard rate. For the unweighted Lasso penalty, we obtain sufficient and necessary conditions for model selection consistency of estimators. The obtained results strongly based on the convexity of the loss function that is the main assumption of the paper. Our theorems can be applied to the ranking problem as well as generalised regression models. Thus, using U-statistics we can study more complex models (better describing real problems) than usually investigated linear or generalised linear models.  相似文献   

3.
In this article, the partially linear covariate-adjusted regression models are considered, and the penalized least-squares procedure is proposed to simultaneously select variables and estimate the parametric components. The rate of convergence and the asymptotic normality of the resulting estimators are established under some regularization conditions. With the proper choices of the penalty functions and tuning parameters, it is shown that the proposed procedure can be as efficient as the oracle estimators. Some Monte Carlo simulation studies and a real data application are carried out to assess the finite sample performances for the proposed method.  相似文献   

4.
We consider the problem of constructing nonlinear regression models with Gaussian basis functions, using lasso regularization. Regularization with a lasso penalty is an advantageous in that it estimates some coefficients in linear regression models to be exactly zero. We propose imposing a weighted lasso penalty on a nonlinear regression model and thereby selecting the number of basis functions effectively. In order to select tuning parameters in the regularization method, we use a deviance information criterion proposed by Spiegelhalter et al. (2002), calculating the effective number of parameters by Gibbs sampling. Simulation results demonstrate that our methodology performs well in various situations.  相似文献   

5.
Liu X  Wang L  Liang H 《Statistica Sinica》2011,21(3):1225-1248
Semiparametric additive partial linear models, containing both linear and nonlinear additive components, are more flexible compared to linear models, and they are more efficient compared to general nonparametric regression models because they reduce the problem known as "curse of dimensionality". In this paper, we propose a new estimation approach for these models, in which we use polynomial splines to approximate the additive nonparametric components and we derive the asymptotic normality for the resulting estimators of the parameters. We also develop a variable selection procedure to identify significant linear components using the smoothly clipped absolute deviation penalty (SCAD), and we show that the SCAD-based estimators of non-zero linear components have an oracle property. Simulations are performed to examine the performance of our approach as compared to several other variable selection methods such as the Bayesian Information Criterion and Least Absolute Shrinkage and Selection Operator (LASSO). The proposed approach is also applied to real data from a nutritional epidemiology study, in which we explore the relationship between plasma beta-carotene levels and personal characteristics (e.g., age, gender, body mass index (BMI), etc.) as well as dietary factors (e.g., alcohol consumption, smoking status, intake of cholesterol, etc.).  相似文献   

6.
Many different methods have been proposed to construct nonparametric estimates of a smooth regression function, including local polynomial, (convolution) kernel and smoothing spline estimators. Each of these estimators uses a smoothing parameter to control the amount of smoothing performed on a given data set. In this paper an improved version of a criterion based on the Akaike information criterion (AIC), termed AICC, is derived and examined as a way to choose the smoothing parameter. Unlike plug-in methods, AICC can be used to choose smoothing parameters for any linear smoother, including local quadratic and smoothing spline estimators. The use of AICC avoids the large variability and tendency to undersmooth (compared with the actual minimizer of average squared error) seen when other 'classical' approaches (such as generalized cross-validation (GCV) or the AIC) are used to choose the smoothing parameter. Monte Carlo simulations demonstrate that the AICC-based smoothing parameter is competitive with a plug-in method (assuming that one exists) when the plug-in method works well but also performs well when the plug-in approach fails or is unavailable.  相似文献   

7.
8.
In this paper we are concerned with the problems of variable selection and estimation in double generalized linear models in which both the mean and the dispersion are allowed to depend on explanatory variables. We propose a maximum penalized pseudo-likelihood method when the number of parameters diverges with the sample size. With appropriate selection of the tuning parameters, the consistency of the variable selection procedure and asymptotic properties of the resulting estimators are established. We also carry out simulation studies and a real data analysis to assess the finite sample performance of the proposed variable selection procedure, showing that the proposed variable selection method works satisfactorily.  相似文献   

9.
In this paper, we consider the shrinkage and penalty estimation procedures in the linear regression model with autoregressive errors of order p when it is conjectured that some of the regression parameters are inactive. We develop the statistical properties of the shrinkage estimation method including asymptotic distributional biases and risks. We show that the shrinkage estimators have a significantly higher relative efficiency than the classical estimator. Furthermore, we consider the two penalty estimators: least absolute shrinkage and selection operator (LASSO) and adaptive LASSO estimators, and numerically compare their relative performance with that of the shrinkage estimators. A Monte Carlo simulation experiment is conducted for different combinations of inactive predictors and the performance of each estimator is evaluated in terms of the simulated mean-squared error. This study shows that the shrinkage estimators are comparable to the penalty estimators when the number of inactive predictors in the model is relatively large. The shrinkage and penalty methods are applied to a real data set to illustrate the usefulness of the procedures in practice.  相似文献   

10.
In this paper, we consider the problem of variable selection for partially varying coefficient single-index model, and present a regularized variable selection procedure by combining basis function approximations with smoothly clipped absolute deviation penalty. The proposed procedure simultaneously selects significant variables in the single-index parametric components and the nonparametric coefficient function components. With appropriate selection of the tuning parameters, the consistency of the variable selection procedure and the oracle property of the estimators are established. Finite sample performance of the proposed method is illustrated by a simulation study and real data analysis.  相似文献   

11.
Spatially-adaptive Penalties for Spline Fitting   总被引:2,自引:0,他引:2  
The paper studies spline fitting with a roughness penalty that adapts to spatial heterogeneity in the regression function. The estimates are p th degree piecewise polynomials with p − 1 continuous derivatives. A large and fixed number of knots is used and smoothing is achieved by putting a quadratic penalty on the jumps of the p th derivative at the knots. To be spatially adaptive, the logarithm of the penalty is itself a linear spline but with relatively few knots and with values at the knots chosen to minimize the generalized cross validation (GCV) criterion. This locally-adaptive spline estimator is compared with other spline estimators in the literature such as cubic smoothing splines and knot-selection techniques for least squares regression. Our estimator can be interpreted as an empirical Bayes estimate for a prior allowing spatial heterogeneity. In cases of spatially heterogeneous regression functions, empirical Bayes confidence intervals using this prior achieve better pointwise coverage probabilities than confidence intervals based on a global-penalty parameter. The method is developed first for univariate models and then extended to additive models.  相似文献   

12.
Risk estimation is an important statistical question for the purposes of selecting a good estimator (i.e., model selection) and assessing its performance (i.e., estimating generalization error). This article introduces a general framework for cross-validation and derives distributional properties of cross-validated risk estimators in the context of estimator selection and performance assessment. Arbitrary classes of estimators are considered, including density estimators and predictors for both continuous and polychotomous outcomes. Results are provided for general full data loss functions (e.g., absolute and squared error, indicator, negative log density). A broad definition of cross-validation is used in order to cover leave-one-out cross-validation, V-fold cross-validation, Monte Carlo cross-validation, and bootstrap procedures. For estimator selection, finite sample risk bounds are derived and applied to establish the asymptotic optimality of cross-validation, in the sense that a selector based on a cross-validated risk estimator performs asymptotically as well as an optimal oracle selector based on the risk under the true, unknown data generating distribution. The asymptotic results are derived under the assumption that the size of the validation sets converges to infinity and hence do not cover leave-one-out cross-validation. For performance assessment, cross-validated risk estimators are shown to be consistent and asymptotically linear for the risk under the true data generating distribution and confidence intervals are derived for this unknown risk. Unlike previously published results, the theorems derived in this and our related articles apply to general data generating distributions, loss functions (i.e., parameters), estimators, and cross-validation procedures.  相似文献   

13.
In this article, we consider the variable selection and estimation for high-dimensional generalized linear models when the number of parameters diverges with the sample size. We propose a penalized quasi-likelihood function with the bridge penalty. The consistency and the Oracle property of the quasi-likelihood bridge estimators are obtained. Some simulations and a real data analysis are given to illustrate the performance of the proposed method.  相似文献   

14.
We propose a robust rank-based estimation and variable selection in double generalized linear models when the number of parameters diverges with the sample size. The consistency of the variable selection procedure and asymptotic properties of the resulting estimators are established under appropriate selection of tuning parameters. Simulations are performed to assess the finite sample performance of the proposed estimation and variable selection procedure. In the presence of gross outliers, the proposed method is showing that the variable selection method works better. For practical application, a real data application is provided using nutritional epidemiology data, in which we explore the relationship between plasma beta-carotene levels and personal characteristics (e.g. age, gender, fat, etc.) as well as dietary factors (e.g. smoking status, intake of cholesterol, etc.).  相似文献   

15.
Recent literature provides many computational and modeling approaches for covariance matrices estimation in a penalized Gaussian graphical models but relatively little study has been carried out on the choice of the tuning parameter. This paper tries to fill this gap by focusing on the problem of shrinkage parameter selection when estimating sparse precision matrices using the penalized likelihood approach. Previous approaches typically used K-fold cross-validation in this regard. In this paper, we first derived the generalized approximate cross-validation for tuning parameter selection which is not only a more computationally efficient alternative, but also achieves smaller error rate for model fitting compared to leave-one-out cross-validation. For consistency in the selection of nonzero entries in the precision matrix, we employ a Bayesian information criterion which provably can identify the nonzero conditional correlations in the Gaussian model. Our simulations demonstrate the general superiority of the two proposed selectors in comparison with leave-one-out cross-validation, 10-fold cross-validation and Akaike information criterion.  相似文献   

16.
Jing Yang  Fang Lu  Hu Yang 《Statistics》2017,51(6):1179-1199
In this paper, we develop a new estimation procedure based on quantile regression for semiparametric partially linear varying-coefficient models. The proposed estimation approach is empirically shown to be much more efficient than the popular least squares estimation method for non-normal error distributions, and almost not lose any efficiency for normal errors. Asymptotic normalities of the proposed estimators for both the parametric and nonparametric parts are established. To achieve sparsity when there exist irrelevant variables in the model, two variable selection procedures based on adaptive penalty are developed to select important parametric covariates as well as significant nonparametric functions. Moreover, both these two variable selection procedures are demonstrated to enjoy the oracle property under some regularity conditions. Some Monte Carlo simulations are conducted to assess the finite sample performance of the proposed estimators, and a real-data example is used to illustrate the application of the proposed methods.  相似文献   

17.
Although the t-type estimator is a kind of M-estimator with scale optimization, it has some advantages over the M-estimator. In this article, we first propose a t-type joint generalized linear model as a robust extension to the classical joint generalized linear models for modeling data containing extreme or outlying observations. Next, we develop a t-type pseudo-likelihood (TPL) approach, which can be viewed as a robust version to the existing pseudo-likelihood (PL) approach. To determine which variables significantly affect the variance of the response variable, we then propose a unified penalized maximum TPL method to simultaneously select significant variables for the mean and dispersion models in t-type joint generalized linear models. Thus, the proposed variable selection method can simultaneously perform parameter estimation and variable selection in the mean and dispersion models. With appropriate selection of the tuning parameters, we establish the consistency and the oracle property of the regularized estimators. Simulation studies are conducted to illustrate the proposed methods.  相似文献   

18.
This paper considers variable and factor selection in factor analysis. We treat the factor loadings for each observable variable as a group, and introduce a weighted sparse group lasso penalty to the complete log-likelihood. The proposal simultaneously selects observable variables and latent factors of a factor analysis model in a data-driven fashion; it produces a more flexible and sparse factor loading structure than existing methods. For parameter estimation, we derive an expectation-maximization algorithm that optimizes the penalized log-likelihood. The tuning parameters of the procedure are selected by a likelihood cross-validation criterion that yields satisfactory results in various simulation settings. Simulation results reveal that the proposed method can better identify the possibly sparse structure of the true factor loading matrix with higher estimation accuracy than existing methods. A real data example is also presented to demonstrate its performance in practice.  相似文献   

19.
This article proposes a variable selection procedure for partially linear models with right-censored data via penalized least squares. We apply the SCAD penalty to select significant variables and estimate unknown parameters simultaneously. The sampling properties for the proposed procedure are investigated. The rate of convergence and the asymptotic normality of the proposed estimators are established. Furthermore, the SCAD-penalized estimators of the nonzero coefficients are shown to have the asymptotic oracle property. In addition, an iterative algorithm is proposed to find the solution of the penalized least squares. Simulation studies are conducted to examine the finite sample performance of the proposed method.  相似文献   

20.
Abstract

Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models use a normal distribution for regression error. Such assumptions are unsuitable for a set of data containing a group or groups of observations with heavy tails and outliers. In this paper, we introduce a robust variable selection procedure for FMR models using the t distribution. With appropriate selection of the tuning parameters, the consistency and the oracle property of the regularized estimators are established. To estimate the parameters of the model, we develop an EM algorithm for numerical computations and a method for selecting tuning parameters adaptively. The parameter estimation performance of the proposed model is evaluated through simulation studies. The application of the proposed model is illustrated by analyzing a real data set.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号