Selecting the important variables is one of the central model selection problems in statistical applications. In this article, we address variable selection in finite mixtures of generalized semiparametric models. To reduce the computational burden, we introduce a class of penalized variable selection procedures for these models. The nonparametric component is estimated via multivariate kernel regression. We show that the new method is consistent for variable selection and assess its performance via simulation.

In this article, we develop a generalized penalized linear unbiased selection (GPLUS) algorithm. The GPLUS is designed to compute the paths of penalized logistic regression based on the smoothly clipped absolute deviation (SCAD) penalty and the minimax concave penalty (MCP). The main idea of the GPLUS is to compute possibly multiple local minimizers at individual penalty levels by continuously tracing the minimizers at different penalty levels. We demonstrate the feasibility of the proposed algorithm in logistic and linear regression. The simulation results favor the selection accuracy of the SCAD and MCP over a suitable range of penalty levels.
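
As a rough illustration of the two penalties named in this abstract, the sketch below evaluates the SCAD and MCP penalty functions in their commonly used closed forms. It is not the GPLUS path algorithm itself. The function names are hypothetical, `a=3.7` is the conventional SCAD default, and `gamma=3.0` is an arbitrary choice made here for illustration.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty at coefficient theta, penalty level lam > 0, shape a > 2."""
    t = abs(theta)
    if t <= lam:
        # Near zero the penalty is linear, like the LASSO.
        return lam * t
    if t <= a * lam:
        # Quadratic transition between the linear and constant regions.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients receive a constant penalty.
    return lam * lam * (a + 1) / 2


def mcp_penalty(theta, lam, gamma=3.0):
    """MCP penalty at coefficient theta, penalty level lam > 0, shape gamma > 1."""
    t = abs(theta)
    if t <= gamma * lam:
        # The penalization rate tapers off linearly as |theta| grows.
        return lam * t - t * t / (2 * gamma)
    # Beyond gamma * lam the penalty is flat.
    return gamma * lam * lam / 2
```

Both penalties level off for large coefficients, unlike the LASSO penalty, which keeps growing linearly. That flatness makes the objective nonconvex, which is why multiple local minimizers can exist at a given penalty level.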

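This paper discusses penalized-spline iterative estimation for generalized partially linear single-index models (GPLSIM) in the exponential family, and proposes an iterative estimation algorithm based on the penalized likelihood and a set of prespecified initial estimates of the single-index parameter vector. The proposed iterative algorithm is also verified through the analysis of a simulated data set.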

This article considers the shrinkage estimation procedure in Cox's proportional hazards regression model when it is suspected that some of the parameters may be restricted to a subspace. We develop the statistical properties of the shrinkage estimators, including their asymptotic distributional biases and risks. The shrinkage estimators have much higher relative efficiency than the classical estimator. Furthermore, we consider two penalty estimators, the LASSO and the adaptive LASSO, and compare their relative performance with that of the shrinkage estimators numerically. A Monte Carlo simulation experiment is conducted for different combinations of irrelevant predictors, and the performance of each estimator is evaluated in terms of simulated mean squared error. The simulation study shows that the shrinkage estimators are comparable to the penalty estimators when the number of irrelevant predictors in the model is relatively large. The shrinkage and penalty methods are applied to two real data sets to illustrate the usefulness of the procedures in practice.

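Under the generalized linear model assumption, and adopting Lin's medical cost model, we use the LASSO and SCAD methods to select the factors that affect medical costs and compare the effectiveness of the two methods, thereby identifying the important factors that affect medical insurance payouts and addressing the problems caused by high-dimensional variables. In the empirical analysis, because the two methods emphasize different statistical properties, the selected explanatory variables differ slightly; the analysis nevertheless shows that both results are well interpretable and reflect important information about the determinants of medical insurance payouts.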

Model selection and estimation are crucial parts of econometrics. This article introduces a new technique that can simultaneously estimate and select the model in the generalized method of moments (GMM) context. The GMM is particularly powerful for analyzing complex datasets such as longitudinal and panel data, and it has wide applications in econometrics. This article extends the least squares based adaptive elastic net estimator of Zou and Zhang to nonlinear equation systems with endogenous variables. The extension is not trivial and involves a new proof technique because the estimators lack closed-form solutions. Compared to the Bridge-GMM of Caner, we allow the number of parameters to diverge to infinity and allow collinearity among a large number of variables; also, the redundant parameters are set to zero via a data-dependent technique. This method has the oracle property, meaning that we can estimate nonzero parameters with their standard limit while the redundant parameters are dropped from the equations simultaneously. Numerical examples are used to illustrate the performance of the new method.

When teaching regression classes, real-life examples help emphasize the importance of understanding theoretical concepts related to methodologies. This can be appreciated after a little reflection on the difficulty of constructing novel questions in regression that test concepts rather than mere calculations. Interdisciplinary collaborations can be fertile contexts for questions of this type. In this article, we offer a case study that students will find: (1) practical with respect to the question being addressed, (2) compelling in the way it shows how a solid understanding of theory helps answer the question, and (3) enlightening in the way it shows how statisticians contribute to problem solving in interdisciplinary environments. Supplementary materials for this article are available online.

孙燕 《统计研究》2013,30(4):92-98
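In the much-debated study of the relationship between income inequality and health, this paper proposes a random-effects semiparametric logit model to reduce possible model misspecification and omitted-variable bias; the nonparametric specification can also be used for exploratory data analysis. The paper then proposes estimation methods for the nonparametric and parametric parts of the model. The difficulty is that the random effects leave the integral in the likelihood function without a closed form, and the nonparametric component makes estimation harder still. Based on penalized-spline nonparametric estimation and a fourth-order Laplace approximation, the paper constructs a penalized log-likelihood, which is maximized by a Newton-Raphson approximation method. The paper also establishes a criterion for selecting the important smoothing parameter in the penalized spline. Estimation results for the income-inequality-and-health example show that the data support the weak income inequality hypothesis, and the nonparametric estimates show a U-shaped form; a comparison with the estimation results from the example indicates that the proposed estimation method is fairly accurate.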

Varying-coefficient models are useful extensions of classical linear models. They arise in multivariate nonparametric regression, nonlinear time series modeling and forecasting, longitudinal data analysis, and other settings. This article proposes penalized spline estimation for varying-coefficient models. Assuming a fixed but potentially large number of knots, the penalized spline estimators are shown to be strongly consistent and asymptotically normal. A systematic optimization algorithm for the selection of multiple smoothing parameters is developed. One advantage of penalized spline estimation is that, because multiple smoothing parameters are used, it can accommodate varying degrees of smoothness among the coefficient functions. Some simulation studies are presented to illustrate the proposed methods.

This paper studies penalized quantile regression for dynamic panel data with fixed effects, where the penalty involves ℓ1 shrinkage of the fixed effects. Using extensive Monte Carlo simulations, we present evidence that the penalty term reduces the dynamic panel bias and increases the efficiency of the estimators. The underlying intuition is that there is no need to use instrumental variables for the lagged dependent variable in the dynamic panel data model without fixed effects. This provides an additional use for shrinkage models, other than model selection and efficiency gains. We propose a Bayesian information criterion based estimator for the parameter that controls the degree of shrinkage. We illustrate the usefulness of the novel econometric technique by estimating a “target leverage” model that includes a speed of capital structure adjustment. Using the proposed penalized quantile regression model, the estimates of the adjustment speeds lie between 3% and 44% across the quantiles, showing strong evidence of substantial heterogeneity in the speed of adjustment among firms.
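
To make the penalized objective described in this abstract concrete, here is a minimal sketch of a single-quantile version: the quantile check loss summed over units and periods, plus an ℓ1 penalty on the fixed effects. The function names, the data layout (`y[i][t]`), and the omission of further covariates are assumptions made for illustration; the paper's actual estimator may differ, for example by pooling several quantiles.

```python
def check_loss(u, tau):
    """Quantile check loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def penalized_objective(y, y_lag, alpha, effects, tau, lam):
    """Check-loss fit plus an l1 penalty on the fixed effects.

    y[i][t] and y_lag[i][t] hold the outcome and its lag for unit i,
    alpha is the coefficient on the lagged outcome, effects[i] is the
    fixed effect of unit i, and lam >= 0 controls the shrinkage.
    """
    fit = 0.0
    for i, eta in enumerate(effects):
        for y_it, y_prev in zip(y[i], y_lag[i]):
            fit += check_loss(y_it - eta - alpha * y_prev, tau)
    # The penalty shrinks the individual effects toward zero.
    return fit + lam * sum(abs(eta) for eta in effects)
```

As `lam` grows, the minimizer pushes the fixed effects toward zero, so the fit approaches that of a model without fixed effects, which is the case the abstract's intuition refers to.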


We consider multiple linear regression models under nonnormality. We derive modified maximum likelihood estimators (MMLEs) of the parameters and show that they are efficient and robust. We show that the least squares estimators are considerably less efficient. We compare the efficiencies of the MMLEs and the M estimators for symmetric distributions and show that, for plausible alternatives to an assumed distribution, the former are more efficient. We provide real-life examples.

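Missing data in the response variable arise widely in socioeconomic research. For regression models with missing responses, this paper proposes a single semiparametric estimator within the moment-estimation framework. The estimator retains the properties of both the parametric regression estimator and the nonparametric matching estimator, so that it fits well in the subsample where the response is observed while reducing the estimation error in the subsample where the response is not observed. The estimator is proved to be consistent and asymptotically normal.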

This paper extends the adaptive LASSO (ALASSO) for simultaneous parameter estimation and variable selection to a varying-coefficient partially linear model where some of the covariates are subject to measurement errors of an additive form. We draw comparisons with the SCAD, and prove that both the ALASSO and the SCAD attain the oracle property under this setup. We further develop an algorithm in the spirit of LARS for finding the solution path of the ALASSO in practical applications. Finite sample properties of the proposed methods are examined in a simulation study, and a real data example based on the U.S. Department of Agriculture's Continuing Survey of Food Intakes by Individuals (CSFII) is considered.
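
The reweighting idea behind the adaptive LASSO can be sketched in the simplest setting, a linear model with an orthonormal design. There the minimizer of (1/2)‖y − Xβ‖² + λ Σ w_j |β_j| with weights w_j = 1/|b_j|^γ is a coordinate-wise soft threshold of the initial least squares estimate b. This is a deliberate simplification and not the measurement-error or LARS-type procedure of the paper; the function names and the default `gamma=1.0` are assumptions for illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def adaptive_lasso_orthonormal(b_init, lam, gamma=1.0):
    """Adaptive LASSO solution under an orthonormal design (assumed).

    b_init holds the initial (e.g. least squares) estimates; each
    coefficient gets its own threshold lam / |b_j| ** gamma.
    """
    estimates = []
    for b in b_init:
        if b == 0.0:
            # A zero initial estimate means an infinite weight: stay at zero.
            estimates.append(0.0)
        else:
            weight = 1.0 / abs(b) ** gamma
            estimates.append(soft_threshold(b, lam * weight))
    return estimates
```

For example, `adaptive_lasso_orthonormal([2.0, 0.5, -0.1], lam=0.5)` keeps the large first coefficient with only mild shrinkage and sets the two small ones to zero, because small initial estimates receive large thresholds.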

In this paper, two non‐parametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a more viable alternative to existing kernel‐based approaches. The second estimator involves sequential fitting by univariate local polynomial quantile regressions for each additive component with the other additive components replaced by the corresponding estimates from the first estimator. The purpose of the extra local averaging is to reduce the variance of the first estimator. We show that the second estimator achieves oracle efficiency in the sense that each estimated additive component has the same variance as in the case when all other additive components were known. Asymptotic properties are derived for both estimators under dependent processes that are strictly stationary and absolutely regular. We also provide a demonstrative empirical application of additive quantile models to ambulance travel times.

秦磊 et al. 《统计研究》2018,35(6):109-116
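For heterogeneous data from multiple sources, the literature usually proposes fairly complex models to describe the characteristics of each data subpopulation, whereas this paper focuses on capturing what the subpopulations have in common and thereby builds a simple model. For parameter estimation, the paper draws on the Maximin estimation idea for ordinary linear models and proposes a Maximin likelihood ratio estimation method for generalized linear models, together with a penalized estimator under sparse structure. By maximizing the minimum of the likelihood ratio statistics over all subpopulations, the method yields a simple and conservative model that reduces the complexity arising from many data sources. The proposed method applies when the response follows an exponential-family distribution such as the normal, Bernoulli, or Poisson distribution, which extends earlier results and is of greater practical value. Simulations show that, compared with classical estimation methods, the Maximin likelihood ratio estimator not only uncovers the common features of the subpopulations effectively but also achieves high out-of-sample prediction accuracy. The proposed method is also applicable to large heterogeneous data sets in government and economic statistics.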

This article proposes a variable selection procedure for partially linear models with right-censored data via penalized least squares. We apply the SCAD penalty to select significant variables and estimate unknown parameters simultaneously. The sampling properties for the proposed procedure are investigated. The rate of convergence and the asymptotic normality of the proposed estimators are established. Furthermore, the SCAD-penalized estimators of the nonzero coefficients are shown to have the asymptotic oracle property. In addition, an iterative algorithm is proposed to find the solution of the penalized least squares. Simulation studies are conducted to examine the finite sample performance of the proposed method.

李小胜  王申令 《统计研究》2016,33(11):85-92
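This paper first constructs the sample likelihood function of the multiple linear regression model under linear constraints and uses the Lagrange method to justify it. Second, it discusses, from the viewpoint of the likelihood function, how the linear constraints affect the model parameters, and makes Bayesian and empirical Bayesian improvements to the parameter estimates obtained from traditional theory. For the Bayesian improvement, the matrix normal-Wishart distribution is taken as the joint conjugate prior of the model parameters and the precision matrix; combined with the constructed likelihood function, this gives the posterior distribution of the parameters, from which the Bayesian estimates are computed. For the empirical Bayesian improvement, the sample is split into groups, the influence of the subsample-based parameter estimates on the full-sample estimates is discussed in terms of variance, and the empirical Bayesian estimates are computed. Finally, simulations are run on random matrices generated with Matlab. The results show that both improved estimators are more accurate than the estimates from traditional theory, with smaller fitting error ratios and higher credibility, and that the computation is faster for large data sets.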

This paper generalizes the weight-fused elastic net (Fu and Xu, 2012), which performs group variable selection by combining the weight-fused LASSO (wfLasso) and elastic net (Zou and Hastie, 2005) penalties. In this study, the elastic net penalty is replaced by the adaptive elastic net penalty (AdaEnet) (Zou and Zhang, 2009), and a new group variable selection algorithm with the oracle property (Fan and Li, 2001; Zou, 2006) is obtained.

In this article, we propose three M-estimators for the multiple regression model when the response variable is subject to twice censoring. The consistency of the proposed M-estimators is established. A simulation study is conducted to investigate the performance of the proposed estimators. Furthermore, simple bootstrap methods are used to construct interval estimators.

