1.

Ridge estimator in singular design with application to age-period-cohort analysis of disease rates

Wenjiang J. Fu PhD 《Communications in Statistics - Theory and Methods》2013,42(2):263-278

Ridge estimator of a singular design is considered for linear and generalized linear models. Ridge penalty helps determine a unique estimator in singular design. The tuning parameter of the penalty is selected via the generalized cross-validation (GCV) method. It is proven that the ridge estimator lies in a special sub-parameter space and converges to the intrinsic estimator, an estimable function in singular design, as the shrinkage penalty diminishes. The expansion of the ridge estimator and its variance are also obtained. This method is demonstrated through an application to age-period-cohort (APC) analysis of the incidence rates of cervical cancer in Ontario women 1980-1994.
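
The convergence claim can be illustrated on a toy linear model. The sketch below is not the paper's APC analysis: it assumes a hypothetical two-column design whose columns are identical (hence singular), solves the ridge normal equations directly, and shows the estimate approaching the minimum-norm least-squares solution as the penalty shrinks. Treating that minimum-norm solution as the analogue of the intrinsic estimator is an assumption made for illustration, and GCV tuning is omitted.

```python
def ridge_two_columns(X, y, lam):
    """Solve (X'X + lam*I) b = X'y for a two-column design (Cramer's rule)."""
    a11 = sum(r[0] * r[0] for r in X) + lam
    a12 = sum(r[0] * r[1] for r in X)
    a22 = sum(r[1] * r[1] for r in X) + lam
    c1 = sum(r[0] * yi for r, yi in zip(X, y))
    c2 = sum(r[1] * yi for r, yi in zip(X, y))
    det = a11 * a22 - a12 * a12  # strictly positive whenever lam > 0
    return (c1 * a22 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det


# Hypothetical singular design: both columns carry the same covariate.
X = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
y = [2.0, 4.0, 6.0]

for lam in (1.0, 0.1, 0.001):
    b1, b2 = ridge_two_columns(X, y, lam)
    print(f"lambda={lam:<6} b1={b1:.5f} b2={b2:.5f} b1+b2={b1 + b2:.5f}")
```

Any pair (b1, b2) with b1 + b2 = 2 fits these data exactly; among them (1, 1) has the smallest norm, and the ridge estimates move toward it as lambda decreases.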

2.

Model selection consistency of U-statistics with convex loss and weighted lasso penalty

W. Rejchel 《Journal of nonparametric statistics》2017,29(4):768-791

In the paper we consider minimisation of U-statistics with the weighted Lasso penalty and investigate their asymptotic properties in model selection and estimation. We prove that the use of appropriate weights in the penalty leads to a procedure that behaves like the oracle that knows the true model in advance, i.e. it is model selection consistent and estimates nonzero parameters with the standard rate. For the unweighted Lasso penalty, we obtain sufficient and necessary conditions for model selection consistency of estimators. The obtained results are strongly based on the convexity of the loss function, which is the main assumption of the paper. Our theorems can be applied to the ranking problem as well as generalised regression models. Thus, using U-statistics we can study more complex models (better describing real problems) than the usually investigated linear or generalised linear models.

3.

Variable Selection for Semiparametric Partially Linear Covariate-Adjusted Regression Models

Jiang Du Gaorong Li 《Communications in Statistics - Theory and Methods》2013,42(13):2809-2826

In this article, the partially linear covariate-adjusted regression models are considered, and the penalized least-squares procedure is proposed to simultaneously select variables and estimate the parametric components. The rate of convergence and the asymptotic normality of the resulting estimators are established under some regularization conditions. With the proper choices of the penalty functions and tuning parameters, it is shown that the proposed procedure can be as efficient as the oracle estimators. Some Monte Carlo simulation studies and a real data application are carried out to assess the finite-sample performance of the proposed method.

4.

Nonlinear regression modeling via the lasso-type regularization

Shohei Tateishi Hidetoshi Matsui Sadanori Konishi 《Journal of statistical planning and inference》2010

We consider the problem of constructing nonlinear regression models with Gaussian basis functions, using lasso regularization. Regularization with a lasso penalty is advantageous in that it estimates some coefficients in linear regression models to be exactly zero. We propose imposing a weighted lasso penalty on a nonlinear regression model and thereby selecting the number of basis functions effectively. In order to select tuning parameters in the regularization method, we use a deviance information criterion proposed by Spiegelhalter et al. (2002), calculating the effective number of parameters by Gibbs sampling. Simulation results demonstrate that our methodology performs well in various situations.
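
How a lasso penalty produces exact zeros can be seen in the simplest case. The sketch below assumes an orthonormal design, where the lasso solution reduces to soft-thresholding each least-squares coefficient; it does not implement the paper's weighted penalty on Gaussian basis functions or its Gibbs-sampling criterion. The coefficient values and the threshold are hypothetical.

```python
def soft_threshold(z, lam):
    """Minimiser over b of 0.5 * (z - b)**2 + lam * abs(b)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # coefficients within [-lam, lam] are set exactly to zero


# Hypothetical least-squares coefficients under an orthonormal design.
ols = [2.5, -0.3, 0.8, -1.7, 0.1]
lam = 0.5

lasso = [soft_threshold(z, lam) for z in ols]
for z, b in zip(ols, lasso):
    print(f"ols={z:+.2f} -> lasso={b:+.2f}")
```

The two coefficients whose magnitude is below the threshold become exactly zero, while the others are shrunk toward zero by the threshold amount.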

5.

Estimation and Variable Selection for Semiparametric Additive Partial Linear Models (SS-09-140)

Liu X Wang L Liang H 《Statistica Sinica》2011,21(3):1225-1248

Semiparametric additive partial linear models, containing both linear and nonlinear additive components, are more flexible compared to linear models, and they are more efficient compared to general nonparametric regression models because they reduce the problem known as the "curse of dimensionality". In this paper, we propose a new estimation approach for these models, in which we use polynomial splines to approximate the additive nonparametric components and we derive the asymptotic normality for the resulting estimators of the parameters. We also develop a variable selection procedure to identify significant linear components using the smoothly clipped absolute deviation penalty (SCAD), and we show that the SCAD-based estimators of non-zero linear components have an oracle property. Simulations are performed to examine the performance of our approach as compared to several other variable selection methods such as the Bayesian Information Criterion and Least Absolute Shrinkage and Selection Operator (LASSO). The proposed approach is also applied to real data from a nutritional epidemiology study, in which we explore the relationship between plasma beta-carotene levels and personal characteristics (e.g., age, gender, body mass index (BMI), etc.) as well as dietary factors (e.g., alcohol consumption, smoking status, intake of cholesterol, etc.).
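
For readers unfamiliar with the SCAD penalty named in this abstract, a minimal sketch of the penalty function follows. It uses the standard piecewise definition due to Fan and Li (2001) with the conventional shape parameter a = 3.7; neither the formula nor that default is stated in the abstract, and the spline-based estimation itself is not shown.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty value for one coefficient (assumes lam > 0 and a > 2)."""
    t = abs(theta)
    if t <= lam:
        return lam * t  # behaves like the lasso near zero
    if t <= a * lam:
        # quadratic transition between the linear and constant regions
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    return (a + 1.0) * lam * lam / 2.0  # flat: large coefficients are not shrunk further


for theta in (0.5, 1.0, 2.0, 3.7, 10.0):
    print(f"theta={theta:5.1f}  scad={scad_penalty(theta, lam=1.0):.4f}  lasso={abs(theta):.4f}")
```

Unlike the lasso penalty, which keeps growing with the coefficient, SCAD levels off once the coefficient exceeds a times lambda, which is the feature usually credited for the oracle property.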

6.

Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion (cited by: 1; self-citations: 0; cited by others: 1)

Clifford M. Hurvich Jeffrey S. Simonoff & Chih-Ling Tsai 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1998,60(2):271-293

Many different methods have been proposed to construct nonparametric estimates of a smooth regression function, including local polynomial, (convolution) kernel and smoothing spline estimators. Each of these estimators uses a smoothing parameter to control the amount of smoothing performed on a given data set. In this paper an improved version of a criterion based on the Akaike information criterion (AIC), termed AIC_C, is derived and examined as a way to choose the smoothing parameter. Unlike plug-in methods, AIC_C can be used to choose smoothing parameters for any linear smoother, including local quadratic and smoothing spline estimators. The use of AIC_C avoids the large variability and tendency to undersmooth (compared with the actual minimizer of average squared error) seen when other 'classical' approaches (such as generalized cross-validation (GCV) or the AIC) are used to choose the smoothing parameter. Monte Carlo simulations demonstrate that the AIC_C-based smoothing parameter is competitive with a plug-in method (assuming that one exists) when the plug-in method works well but also performs well when the plug-in approach fails or is unavailable.
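
A minimal sketch of bandwidth selection with AIC_C is given here, using the commonly cited form AIC_C = log(RSS/n) + 1 + 2(tr(H) + 1)/(n - tr(H) - 2), where H is the smoother matrix. The formula is not quoted in the abstract, and the Gaussian-kernel Nadaraya-Watson smoother, the data, and the bandwidth grid are all assumptions chosen for illustration.

```python
import math


def aicc(rss, n, trace_h):
    """AIC_C = log(RSS/n) + 1 + 2*(tr(H) + 1) / (n - tr(H) - 2)."""
    return math.log(rss / n) + 1.0 + 2.0 * (trace_h + 1.0) / (n - trace_h - 2.0)


def nw_rss_and_trace(x, y, h):
    """Gaussian-kernel Nadaraya-Watson smoother: return (RSS, tr(H))."""
    rss = trace_h = 0.0
    for i, xi in enumerate(x):
        w = [math.exp(-0.5 * ((xi - xj) / h) ** 2) for xj in x]
        total = sum(w)
        fitted = sum(wj * yj for wj, yj in zip(w, y)) / total
        rss += (y[i] - fitted) ** 2
        trace_h += w[i] / total  # i-th diagonal entry of the smoother matrix
    return rss, trace_h


# Hypothetical data: a sine curve plus a small alternating disturbance.
n = 20
x = [i / (n - 1) for i in range(n)]
y = [math.sin(2 * math.pi * xi) + 0.1 * (-1) ** i for i, xi in enumerate(x)]

grid = [0.05, 0.1, 0.2, 0.4]  # candidate bandwidths (assumed)
scores = {}
for h in grid:
    rss, trace_h = nw_rss_and_trace(x, y, h)
    scores[h] = aicc(rss, n, trace_h)

best_h = min(scores, key=scores.get)
print(scores)
print("selected bandwidth:", best_h)
```

The criterion is only defined when n - tr(H) - 2 is positive, so very small bandwidths, for which tr(H) approaches n, should be excluded from the grid.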

7.

Robust estimation and variable selection in heteroscedastic linear regression

I. Gijbels I. Vrinssen 《Statistics》2019,53(3):489-532

8.

Variable selection in high-dimensional double generalized linear models

Dengke Xu Zhongzhan Zhang Liucang Wu 《Statistical Papers》2014,55(2):327-347

In this paper we are concerned with the problems of variable selection and estimation in double generalized linear models in which both the mean and the dispersion are allowed to depend on explanatory variables. We propose a maximum penalized pseudo-likelihood method when the number of parameters diverges with the sample size. With appropriate selection of the tuning parameters, the consistency of the variable selection procedure and asymptotic properties of the resulting estimators are established. We also carry out simulation studies and a real data analysis to assess the finite sample performance of the proposed variable selection procedure, showing that the proposed variable selection method works satisfactorily.

9.

Application of shrinkage estimation in linear regression models with autoregressive errors

《Journal of Statistical Computation and Simulation》2012,82(16):3335-3351

In this paper, we consider the shrinkage and penalty estimation procedures in the linear regression model with autoregressive errors of order p when it is conjectured that some of the regression parameters are inactive. We develop the statistical properties of the shrinkage estimation method including asymptotic distributional biases and risks. We show that the shrinkage estimators have a significantly higher relative efficiency than the classical estimator. Furthermore, we consider two penalty estimators, the least absolute shrinkage and selection operator (LASSO) and the adaptive LASSO, and numerically compare their relative performance with that of the shrinkage estimators. A Monte Carlo simulation experiment is conducted for different combinations of inactive predictors and the performance of each estimator is evaluated in terms of the simulated mean-squared error. This study shows that the shrinkage estimators are comparable to the penalty estimators when the number of inactive predictors in the model is relatively large. The shrinkage and penalty methods are applied to a real data set to illustrate the usefulness of the procedures in practice.

10.

Variable selection for partially varying coefficient single-index model

Sanying Feng Liugen Xue 《Journal of applied statistics》2013,40(12):2637-2652

In this paper, we consider the problem of variable selection for the partially varying coefficient single-index model, and present a regularized variable selection procedure by combining basis function approximations with the smoothly clipped absolute deviation penalty. The proposed procedure simultaneously selects significant variables in the single-index parametric components and the nonparametric coefficient function components. With appropriate selection of the tuning parameters, the consistency of the variable selection procedure and the oracle property of the estimators are established. Finite sample performance of the proposed method is illustrated by a simulation study and real data analysis.

11.

Spatially-adaptive Penalties for Spline Fitting (cited by: 2; self-citations: 0; cited by others: 2)

David Ruppert & Raymond J. Carroll 《Australian & New Zealand Journal of Statistics》2000,42(2):205-223

The paper studies spline fitting with a roughness penalty that adapts to spatial heterogeneity in the regression function. The estimates are pth-degree piecewise polynomials with p − 1 continuous derivatives. A large and fixed number of knots is used and smoothing is achieved by putting a quadratic penalty on the jumps of the pth derivative at the knots. To be spatially adaptive, the logarithm of the penalty is itself a linear spline but with relatively few knots and with values at the knots chosen to minimize the generalized cross validation (GCV) criterion. This locally-adaptive spline estimator is compared with other spline estimators in the literature such as cubic smoothing splines and knot-selection techniques for least squares regression. Our estimator can be interpreted as an empirical Bayes estimate for a prior allowing spatial heterogeneity. In cases of spatially heterogeneous regression functions, empirical Bayes confidence intervals using this prior achieve better pointwise coverage probabilities than confidence intervals based on a global-penalty parameter. The method is developed first for univariate models and then extended to additive models.

12.

Asymptotics of cross-validated risk estimation in estimator selection and performance assessment (cited by: 1; self-citations: 0; cited by others: 1)

Sandrine Dudoit Mark J. van der Laan 《Statistical Methodology》2005,2(2):131-154

Risk estimation is an important statistical question for the purposes of selecting a good estimator (i.e., model selection) and assessing its performance (i.e., estimating generalization error). This article introduces a general framework for cross-validation and derives distributional properties of cross-validated risk estimators in the context of estimator selection and performance assessment. Arbitrary classes of estimators are considered, including density estimators and predictors for both continuous and polychotomous outcomes. Results are provided for general full data loss functions (e.g., absolute and squared error, indicator, negative log density). A broad definition of cross-validation is used in order to cover leave-one-out cross-validation, V-fold cross-validation, Monte Carlo cross-validation, and bootstrap procedures. For estimator selection, finite sample risk bounds are derived and applied to establish the asymptotic optimality of cross-validation, in the sense that a selector based on a cross-validated risk estimator performs asymptotically as well as an optimal oracle selector based on the risk under the true, unknown data generating distribution. The asymptotic results are derived under the assumption that the size of the validation sets converges to infinity and hence do not cover leave-one-out cross-validation. For performance assessment, cross-validated risk estimators are shown to be consistent and asymptotically linear for the risk under the true data generating distribution and confidence intervals are derived for this unknown risk. Unlike previously published results, the theorems derived in this and our related articles apply to general data generating distributions, loss functions (i.e., parameters), estimators, and cross-validation procedures.
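
As a concrete instance of one scheme covered by this framework, the sketch below computes a V-fold cross-validated risk under squared-error loss. The estimator (the training-sample mean), the data, and the helper names are hypothetical; the article's general theory is not reproduced.

```python
import random


def v_fold_splits(n, v, seed=0):
    """Yield (training, validation) index lists for V-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::v] for k in range(v)]  # v disjoint folds covering all indices
    for k in range(v):
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield training, folds[k]


def cv_risk_of_mean(y, v):
    """Cross-validated squared-error risk of the training-sample mean."""
    total = 0.0
    for training, validation in v_fold_splits(len(y), v):
        mean = sum(y[i] for i in training) / len(training)  # fit on training part
        total += sum((y[i] - mean) ** 2 for i in validation)  # loss on held-out part
    return total / len(y)


# Hypothetical outcomes.
y = [1.0, 2.0, 4.0, 3.0, 5.0, 7.0, 6.0, 8.0, 9.0, 10.0]
print("5-fold CV risk:", cv_risk_of_mean(y, v=5))
```

Each observation is used for validation exactly once, so the total held-out loss divided by n is the average validation risk across folds.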

13.

Robust rank-based variable selection in double generalized linear models with diverging number of parameters under adaptive Lasso

Brice M. Nguelifack Isabelle Kemajou-Brown 《Journal of Statistical Computation and Simulation》2019,89(11):2051-2072

We propose a robust rank-based estimation and variable selection procedure for double generalized linear models when the number of parameters diverges with the sample size. The consistency of the variable selection procedure and asymptotic properties of the resulting estimators are established under appropriate selection of tuning parameters. Simulations are performed to assess the finite sample performance of the proposed estimation and variable selection procedure. In the presence of gross outliers, the proposed variable selection method is shown to work better. For practical application, a real data application is provided using nutritional epidemiology data, in which we explore the relationship between plasma beta-carotene levels and personal characteristics (e.g. age, gender, fat, etc.) as well as dietary factors (e.g. smoking status, intake of cholesterol, etc.).

14.

Quasi-likelihood Bridge estimators for high-dimensional generalized linear models

Xiaohua Cui Li Yan 《Communications in Statistics - Simulation and Computation》2017,46(10):8190-8204

In this article, we consider variable selection and estimation for high-dimensional generalized linear models when the number of parameters diverges with the sample size. We propose a penalized quasi-likelihood function with the bridge penalty. The consistency and the oracle property of the quasi-likelihood bridge estimators are obtained. Some simulations and a real data analysis are given to illustrate the performance of the proposed method.

15.

Shrinkage tuning parameter selection in precision matrices estimation

Heng Lian 《Journal of statistical planning and inference》2011,141(8):2839-2848

Recent literature provides many computational and modeling approaches for covariance matrix estimation in penalized Gaussian graphical models, but relatively little study has been carried out on the choice of the tuning parameter. This paper tries to fill this gap by focusing on the problem of shrinkage parameter selection when estimating sparse precision matrices using the penalized likelihood approach. Previous approaches typically used K-fold cross-validation in this regard. In this paper, we first derive the generalized approximate cross-validation for tuning parameter selection, which is not only a more computationally efficient alternative but also achieves a smaller error rate for model fitting compared to leave-one-out cross-validation. For consistency in the selection of nonzero entries in the precision matrix, we employ a Bayesian information criterion which provably can identify the nonzero conditional correlations in the Gaussian model. Our simulations demonstrate the general superiority of the two proposed selectors in comparison with leave-one-out cross-validation, 10-fold cross-validation and Akaike information criterion.

16.

Quantile regression for robust estimation and variable selection in partially linear varying-coefficient models

Jing Yang Fang Lu Hu Yang 《Statistics》2017,51(6):1179-1199

In this paper, we develop a new estimation procedure based on quantile regression for semiparametric partially linear varying-coefficient models. The proposed estimation approach is empirically shown to be much more efficient than the popular least squares estimation method for non-normal error distributions, and to lose almost no efficiency for normal errors. Asymptotic normality of the proposed estimators for both the parametric and nonparametric parts is established. To achieve sparsity when there exist irrelevant variables in the model, two variable selection procedures based on adaptive penalty are developed to select important parametric covariates as well as significant nonparametric functions. Moreover, both variable selection procedures are demonstrated to enjoy the oracle property under some regularity conditions. Some Monte Carlo simulations are conducted to assess the finite sample performance of the proposed estimators, and a real-data example is used to illustrate the application of the proposed methods.

17.

A Robust Variable Selection to t-type Joint Generalized Linear Models via Penalized t-type Pseudo-likelihood

Liu-Cang Wu Zhong-Zhan Zhang Guo-Liang Tian Deng-Ke Xu 《Communications in Statistics - Simulation and Computation》2016,45(7):2320-2337

Although the t-type estimator is a kind of M-estimator with scale optimization, it has some advantages over the M-estimator. In this article, we first propose a t-type joint generalized linear model as a robust extension of the classical joint generalized linear models for modeling data containing extreme or outlying observations. Next, we develop a t-type pseudo-likelihood (TPL) approach, which can be viewed as a robust version of the existing pseudo-likelihood (PL) approach. To determine which variables significantly affect the variance of the response variable, we then propose a unified penalized maximum TPL method to simultaneously select significant variables for the mean and dispersion models in t-type joint generalized linear models. Thus, the proposed variable selection method can simultaneously perform parameter estimation and variable selection in the mean and dispersion models. With appropriate selection of the tuning parameters, we establish the consistency and the oracle property of the regularized estimators. Simulation studies are conducted to illustrate the proposed methods.

18.

Variable Selection for Partially Linear Models with Randomly Censored Data

Yiping Yang Liugen Xue Weihu Cheng 《Communications in Statistics - Simulation and Computation》2013,42(8):1577-1589

This article proposes a variable selection procedure for partially linear models with right-censored data via penalized least squares. We apply the SCAD penalty to select significant variables and estimate unknown parameters simultaneously. The sampling properties for the proposed procedure are investigated. The rate of convergence and the asymptotic normality of the proposed estimators are established. Furthermore, the SCAD-penalized estimators of the nonzero coefficients are shown to have the asymptotic oracle property. In addition, an iterative algorithm is proposed to find the solution of the penalized least squares. Simulation studies are conducted to examine the finite sample performance of the proposed method.

19.

Simultaneous variable and factor selection via sparse group lasso in factor analysis

Yuanchu Dang 《Journal of Statistical Computation and Simulation》2019,89(14):2744-2764

This paper considers variable and factor selection in factor analysis. We treat the factor loadings for each observable variable as a group, and introduce a weighted sparse group lasso penalty to the complete log-likelihood. The proposal simultaneously selects observable variables and latent factors of a factor analysis model in a data-driven fashion; it produces a more flexible and sparse factor loading structure than existing methods. For parameter estimation, we derive an expectation-maximization algorithm that optimizes the penalized log-likelihood. The tuning parameters of the procedure are selected by a likelihood cross-validation criterion that yields satisfactory results in various simulation settings. Simulation results reveal that the proposed method can better identify the possibly sparse structure of the true factor loading matrix with higher estimation accuracy than existing methods. A real data example is also presented to demonstrate its performance in practice.

20.

Robust variable selection in finite mixture of regression models using the t distribution

Lin Dai Junhui Yin Zhengfen Xie 《Communications in Statistics - Theory and Methods》2013,42(21):5370-5386

Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models use a normal distribution for regression error. Such assumptions are unsuitable for a set of data containing a group or groups of observations with heavy tails and outliers. In this paper, we introduce a robust variable selection procedure for FMR models using the t distribution. With appropriate selection of the tuning parameters, the consistency and the oracle property of the regularized estimators are established. To estimate the parameters of the model, we develop an EM algorithm for numerical computations and a method for selecting tuning parameters adaptively. The parameter estimation performance of the proposed model is evaluated through simulation studies. The application of the proposed model is illustrated by analyzing a real data set.