1.

Variable selection in finite mixture of semi-parametric regression models

Ehsan Ormoz Farzad Eskandari 《Communications in Statistics - Theory and Methods》2013,42(3):695-711

Abstract

In this paper we are concerned with variable selection in finite mixtures of semiparametric regression models. This task consists of model selection for the nonparametric component and variable selection for the parametric part. Thus, we encounter separate model selections for every nonparametric component of each submodel. To overcome this computational burden, we introduce a class of variable selection procedures for finite mixtures of semiparametric regression models using a penalized approach. It is shown that the new method is consistent for variable selection. Simulations show that the proposed method performs well, improves on previous work in this area, and requires much less computing power than existing methods.

2.

Bayesian spectral analysis models for quantile regression with Dirichlet process mixtures

Seongil Jo Taeyoung Roh 《Journal of nonparametric statistics》2016,28(1):177-206

This paper presents a Bayesian analysis of partially linear additive models for quantile regression. We develop a semiparametric Bayesian approach to quantile regression models using a spectral representation of the nonparametric regression functions and the Dirichlet process (DP) mixture for error distribution. We also consider Bayesian variable selection procedures for both parametric and nonparametric components in a partially linear additive model structure based on the Bayesian shrinkage priors via a stochastic search algorithm. Based on the proposed Bayesian semiparametric additive quantile regression model referred to as BSAQ, the Bayesian inference is considered for estimation and model selection. For the posterior computation, we design a simple and efficient Gibbs sampler based on a location-scale mixture of exponential and normal distributions for an asymmetric Laplace distribution, which facilitates the commonly used collapsed Gibbs sampling algorithms for the DP mixture models. Additionally, we discuss the asymptotic property of the semiparametric quantile regression model in terms of consistency of posterior distribution. Simulation studies and real data application examples illustrate the proposed method and compare it with Bayesian quantile regression methods in the literature.
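The location-scale mixture mentioned in this abstract can be illustrated with a small simulation. The sketch below assumes the commonly used representation of an asymmetric Laplace error at quantile level p, namely eps = theta*z + tau*sqrt(z)*u with z ~ Exp(1) and u ~ N(0, 1); the function name, seed, and parameter values are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def sample_asymmetric_laplace(p, mu=0.0, sigma=1.0, n=100_000, seed=0):
    """Draw n values from an asymmetric Laplace distribution via the
    exponential-normal mixture y = mu + sigma * (theta*z + tau*sqrt(z)*u)."""
    rng = random.Random(seed)
    theta = (1.0 - 2.0 * p) / (p * (1.0 - p))
    tau = math.sqrt(2.0 / (p * (1.0 - p)))
    draws = []
    for _ in range(n):
        z = rng.expovariate(1.0)   # latent exponential mixing variable
        u = rng.gauss(0.0, 1.0)    # standard normal noise
        draws.append(mu + sigma * (theta * z + tau * math.sqrt(z) * u))
    return draws


p = 0.25
draws = sample_asymmetric_laplace(p)
# The location mu is the p-th quantile, so this share should be close to p.
share_below = sum(y <= 0.0 for y in draws) / len(draws)
```

Conditional on z the draw is normal, which is what makes Gibbs updates convenient in samplers of this kind.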

3.

A new model selection procedure for finite mixture regression models

Conglian Yu 《Communications in Statistics - Theory and Methods》2020,49(18):4347-4366

Abstract

In this article, we propose a new penalized-likelihood method to conduct model selection for finite mixture of regression models. The penalties are imposed on mixing proportions and regression coefficients, and hence order selection of the mixture and the variable selection in each component can be simultaneously conducted. The consistency of order selection and the consistency of variable selection are investigated. A modified EM algorithm is proposed to maximize the penalized log-likelihood function. Numerical simulations are conducted to demonstrate the finite sample performance of the estimation procedure. The proposed methodology is further illustrated via real data analysis.

4.

Semiparametric Mixtures of Generalized Exponential Families

RICHARD CHARNIGO RAMANI S. PILLA 《Scandinavian Journal of Statistics》2007,34(3):535-551

Abstract. A semiparametric mixture model is characterized by a non-parametric mixing distribution Q (with respect to a parameter θ) and a structural parameter β common to all components. Much of the literature on mixture models has focused on fixing β and estimating Q. However, this can lead to inconsistent estimation of both Q and the order of the model m. Creating a framework for consistent estimation remains an open problem and is the focus of this article. We formulate a class of generalized exponential family (GEF) models and establish sufficient conditions for the identifiability of finite mixtures formed from a GEF along with sufficient conditions for a nesting structure. Finite identifiability and nesting structure lead to the central result that semiparametric maximum likelihood estimation of Q and β fails. However, consistent estimation is possible if we restrict the class of mixing distributions and employ an information-theoretic approach. This article provides a foundation for inference in semiparametric mixture models, in which GEFs and their structural properties play an instrumental role.

5.

Quantile regression for robust estimation and variable selection in partially linear varying-coefficient models

Jing Yang Fang Lu Hu Yang 《Statistics》2017,51(6):1179-1199

In this paper, we develop a new estimation procedure based on quantile regression for semiparametric partially linear varying-coefficient models. The proposed estimation approach is empirically shown to be much more efficient than the popular least squares estimation method for non-normal error distributions, and to lose almost no efficiency for normal errors. Asymptotic normality of the proposed estimators for both the parametric and nonparametric parts is established. To achieve sparsity when there exist irrelevant variables in the model, two variable selection procedures based on adaptive penalty are developed to select important parametric covariates as well as significant nonparametric functions. Moreover, both variable selection procedures are demonstrated to enjoy the oracle property under some regularity conditions. Some Monte Carlo simulations are conducted to assess the finite sample performance of the proposed estimators, and a real-data example is used to illustrate the application of the proposed methods.

6.

Variable selection for semiparametric regression models with iterated penalization

Dai Y Ma S 《Journal of nonparametric statistics》2012,24(2):283-298

Semiparametric regression models with multiple covariates are commonly encountered. When there are covariates not associated with the response variable, variable selection may lead to sparser models, more lucid interpretations and more accurate estimation. In this study, we adopt a sieve approach for the estimation of nonparametric covariate effects in semiparametric regression models. We adopt a two-step iterated penalization approach for variable selection. In the first step, a mixture of the Lasso and group Lasso penalties is employed to conduct the first-round variable selection and obtain the initial estimate. In the second step, a mixture of the weighted Lasso and weighted group Lasso penalties, with weights constructed using the initial estimate, is employed for variable selection. We show that the proposed iterated approach has the variable selection consistency property, even when the number of unknown parameters diverges with sample size. Numerical studies, including simulation and analysis of a diabetes dataset, show satisfactory performance of the proposed approach.
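The two-step idea in this abstract (an initial penalized fit, then a second fit with weights built from the initial estimate) can be sketched in a deliberately simplified setting. The example below assumes an orthonormal design, where each Lasso coefficient is a soft-thresholded least-squares coefficient, uses weights 1/|initial estimate|, and omits the group Lasso part entirely; the function names, weights, and numbers are illustrative assumptions, not the authors' procedure.

```python
def soft_threshold(z, lam):
    """Lasso solution for one coefficient under an orthonormal design."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def two_step_lasso(z, lam1, lam2):
    """Step 1: plain Lasso. Step 2: Lasso reweighted by 1/|initial estimate|."""
    initial = [soft_threshold(zj, lam1) for zj in z]
    final = []
    for zj, bj in zip(z, initial):
        if bj == 0.0:
            final.append(0.0)  # infinite weight: the variable stays excluded
        else:
            final.append(soft_threshold(zj, lam2 / abs(bj)))
    return initial, final


z = [3.0, -2.0, 0.4, 0.1]  # hypothetical least-squares coefficients
initial, final = two_step_lasso(z, lam1=0.5, lam2=0.5)
```

In this toy case the second step shrinks the large coefficients less than the first step did, while coefficients removed in the first step remain at zero.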

7.

Orthogonal weighted empirical likelihood-based variable selection for semiparametric instrumental variable models

Jiting Huang 《Communications in Statistics - Theory and Methods》2018,47(18):4375-4388

In this article, we consider the variable selection for a class of semiparametric instrumental variable models. By combining orthogonal weighting technology and empirical likelihood method, we propose an orthogonal weighted empirical likelihood-based variable selection procedure. Under some mild conditions, the consistency and sparsity of the variable selection procedure are studied. Furthermore, some simulation studies and a real data analysis are carried out to examine the finite-sample performance of the proposed method.

8.

Bayesian variable selection in a class of mixture models for ordinal data: a comparative study

《Journal of Statistical Computation and Simulation》2012,82(10):1926-1944

In this paper, we consider a special finite mixture model named Combination of Uniform and shifted Binomial (CUB), recently introduced in the statistical literature to analyse ordinal data expressing the preferences of raters with regards to items or services. Our aim is to develop a variable selection procedure for this model using a Bayesian approach. Bayesian methods for variable selection and model choice have become increasingly popular in recent years, due to advances in Markov chain Monte Carlo computational algorithms. Several methods have been proposed in the case of linear and generalized linear models (GLM). In this paper, we adapt to the CUB model some of these algorithms: the Kuo–Mallick method together with its ‘metropolized’ version and the Stochastic Search Variable Selection method. Several simulated examples are used to illustrate the algorithms and to compare their performance. Finally, an application to real data is introduced.

9.

Variable selection for semiparametric proportional hazards model under progressive Type-II censoring

Xuejing Zhao Jinxia Su 《Communications in Statistics - Simulation and Computation》2017,46(6):4367-4376

Variable selection is an effective methodology for dealing with models with numerous covariates. We consider the methods of variable selection for semiparametric Cox proportional hazards model under the progressive Type-II censoring scheme. The Cox proportional hazards model is used to model the influence coefficients of the environmental covariates. By applying Breslow’s “least information” idea, we obtain a profile likelihood function to estimate the coefficients. Lasso-type penalized profile likelihood estimation as well as stepwise variable selection method are explored as means to find the important covariates. Numerical simulations are conducted and Veteran’s Administration Lung Cancer data are exploited to evaluate the performance of the proposed method.

10.

A simple root selection method for univariate finite normal mixture models

Supawadee Wichitchan Weixin Yao 《Communications in Statistics - Theory and Methods》2019,48(15):3778-3794

It is well known that there exist multiple roots of the likelihood equations for finite normal mixture models. Selecting a consistent root for finite normal mixture models has long been a challenging problem. Simply using the root with the largest likelihood will not work because of the spurious roots. In addition, the likelihood of normal mixture models with unequal variance is unbounded and thus its maximum likelihood estimate (MLE) is not well defined. In this paper, we propose a simple root selection method for univariate normal mixture models by incorporating the idea of goodness of fit test. Our new method inherits both the consistency properties of distance estimators and the efficiency of the MLE. The new method is simple to use and its computation can be easily done using existing R packages for mixture models. In addition, the proposed root selection method is very general and can be also applied to other univariate mixture models. We demonstrate the effectiveness of the proposed method and compare it with some other existing methods through simulation studies and a real data application.
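The unboundedness noted in this abstract can be seen numerically. The sketch below is an illustration under assumed values, not the paper's method: it centres one component of a two-component normal mixture on a single observation and lets that component's standard deviation shrink, so the log-likelihood keeps growing. The sample and all parameter values are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, weight, mean1, sd1, mean2, sd2):
    """Log-likelihood of a two-component normal mixture."""
    return sum(
        math.log(weight * normal_pdf(x, mean1, sd1)
                 + (1.0 - weight) * normal_pdf(x, mean2, sd2))
        for x in data
    )


data = [-1.2, -0.4, 0.3, 0.9, 2.1]  # hypothetical sample
# Component 1 sits exactly on the first observation; its sd shrinks toward 0.
logliks = [
    mixture_loglik(data, 0.5, data[0], sd1, 0.0, 1.0)
    for sd1 in (1.0, 1e-2, 1e-4, 1e-8)
]
```

Each smaller standard deviation yields a larger log-likelihood, which is why the root with the largest likelihood cannot simply be taken as the estimate.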

11.

NEW EFFICIENT ESTIMATION AND VARIABLE SELECTION METHODS FOR SEMIPARAMETRIC VARYING-COEFFICIENT PARTIALLY LINEAR MODELS

Kai B Li R Zou H 《Annals of statistics》2011,39(1):305-332

The complexity of semiparametric models poses new challenges to statistical inference and model selection that frequently arise from real applications. In this work, we propose new estimation and variable selection procedures for the semiparametric varying-coefficient partially linear model. We first study quantile regression estimates for the nonparametric varying-coefficient functions and the parametric regression coefficients. To achieve nice efficiency properties, we further develop a semiparametric composite quantile regression procedure. We establish the asymptotic normality of proposed estimators for both the parametric and nonparametric parts and show that the estimators achieve the best convergence rate. Moreover, we show that the proposed method is much more efficient than the least-squares-based method for many non-normal errors and that it only loses a small amount of efficiency for normal errors. In addition, it is shown that the loss in efficiency is at most 11.1% for estimating varying coefficient functions and is no greater than 13.6% for estimating parametric components. To achieve sparsity with high-dimensional covariates, we propose adaptive penalization methods for variable selection in the semiparametric varying-coefficient partially linear model and prove that the methods possess the oracle property. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures. Finally, we apply the new methods to analyze the plasma beta-carotene level data.

12.

Variable selection for semiparametric varying coefficient partially linear model based on modal regression with missing data

Yafeng Xia Yarong Qu Nailing Sun 《Communications in Statistics - Theory and Methods》2013,42(20):5121-5137

Abstract

In this article, we focus on variable selection for the semiparametric varying coefficient partially linear model with response missing at random. Variable selection is proposed based on modal regression, where the nonparametric functions are approximated by a B-spline basis. The proposed procedure uses the SCAD penalty to realize variable selection of parametric and nonparametric components simultaneously. Furthermore, we establish the consistency, the sparsity property and asymptotic normality of the resulting estimators. The penalized parameter estimates of the proposed method are computed by an EM algorithm. Simulation studies are carried out to assess the finite sample performance of the proposed variable selection procedure.
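For readers unfamiliar with the SCAD penalty named in this abstract, a minimal sketch of its standard piecewise form follows. The function name is an assumption, and the default a = 3.7 is the value commonly suggested in the literature; the abstract does not say which value the authors use.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty value for a single coefficient (requires a > 2, lam > 0)."""
    t = abs(theta)
    if t <= lam:
        # Small coefficients: same linear penalty as the Lasso.
        return lam * t
    if t <= a * lam:
        # Intermediate range: quadratic piece that tapers the penalty rate.
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    # Large coefficients: constant penalty, so no further shrinkage.
    return lam * lam * (a + 1.0) / 2.0


small = scad_penalty(0.5, lam=1.0)    # Lasso-like region
large = scad_penalty(10.0, lam=1.0)   # flat region
```

The flat region for large coefficients is what distinguishes SCAD from the Lasso, whose penalty keeps growing with the coefficient size.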

13.

Variable selection for semiparametric errors-in-variables regression model with longitudinal data

《Journal of Statistical Computation and Simulation》2012,82(8):1654-1669

In this paper, we focus on the variable selection for the semiparametric regression model with longitudinal data when some covariates are measured with errors. A new bias-corrected variable selection procedure is proposed based on the combination of the quadratic inference functions and shrinkage estimations. With appropriate selection of the tuning parameters, we establish the consistency and asymptotic normality of the resulting estimators. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed variable selection procedure. We further illustrate the proposed procedure with an application.

14.

Variable selection in high-dimensional double generalized linear models

Dengke Xu Zhongzhan Zhang Liucang Wu 《Statistical Papers》2014,55(2):327-347

In this paper we are concerned with the problems of variable selection and estimation in double generalized linear models in which both the mean and the dispersion are allowed to depend on explanatory variables. We propose a maximum penalized pseudo-likelihood method when the number of parameters diverges with the sample size. With appropriate selection of the tuning parameters, the consistency of the variable selection procedure and asymptotic properties of the resulting estimators are established. We also carry out simulation studies and a real data analysis to assess the finite sample performance of the proposed variable selection procedure, showing that the proposed variable selection method works satisfactorily.

15.

Semiparametric Time Series Models with Log‐concave Innovations: Maximum Likelihood Estimation and its Consistency

Yining Chen 《Scandinavian Journal of Statistics》2015,42(1):1-31

We study semiparametric time series models with innovations following a log‐concave distribution. We propose a general maximum likelihood framework that allows us to estimate simultaneously the parameters of the model and the density of the innovations. This framework can be easily adapted to many well‐known models, including autoregressive moving average (ARMA), generalized autoregressive conditionally heteroscedastic (GARCH), and ARMA‐GARCH models. Furthermore, we show that the estimator under our new framework is consistent in both ARMA and ARMA‐GARCH settings. We demonstrate its finite sample performance via a thorough simulation study and apply it to model the daily log‐return of the FTSE 100 index.

16.

Robust variable selection in finite mixture of regression models using the t distribution

Lin Dai Junhui Yin Zhengfen Xie 《Communications in Statistics - Theory and Methods》2013,42(21):5370-5386

Abstract

Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models use a normal distribution for regression error. Such assumptions are unsuitable for a set of data containing a group or groups of observations with heavy tails and outliers. In this paper, we introduce a robust variable selection procedure for FMR models using the t distribution. With appropriate selection of the tuning parameters, the consistency and the oracle property of the regularized estimators are established. To estimate the parameters of the model, we develop an EM algorithm for numerical computations and a method for selecting tuning parameters adaptively. The parameter estimation performance of the proposed model is evaluated through simulation studies. The application of the proposed model is illustrated by analyzing a real data set.

17.

Profile likelihood approaches for semiparametric copula and frailty models for clustered survival data

Il Do Ha Jong-Min Kim Takeshi Emura 《Journal of applied statistics》2019,46(14):2553-2571

ABSTRACT

In clustered survival data, the dependence among individual survival times within a cluster has usually been described using copula models and frailty models. In this paper we propose a profile likelihood approach for semiparametric copula models with different cluster sizes. We also propose a likelihood ratio method based on profile likelihood for testing the absence of an association parameter (i.e. a test of independence) under the copula models, which leads to a boundary problem in the parameter space. For this purpose, we show via a simulation study that the proposed likelihood ratio method using an asymptotic chi-square mixture distribution performs well as the sample size increases. We compare the behaviors of the two models using the profile likelihood approach under a semiparametric setting. The proposed method is demonstrated using two well-known data sets.

18.

Variable selection in finite mixture of regression models using the skew-normal distribution

Junhui Yin Liucang Wu Lin Dai 《Journal of applied statistics》2020,47(16):2941

Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models use a normal distribution for regression error. Such assumptions are unsuitable for a set of data containing a group or groups of observations with asymmetric behavior. In this paper, we introduce a variable selection procedure for FMR models using the skew-normal distribution. With appropriate choice of the tuning parameters, we establish the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. To estimate the parameters of the model, a modified EM algorithm for numerical computations is developed. The methodology is illustrated through numerical experiments and a real data example.

19.

Estimation of finite mixtures with symmetric components

Chew-Seng Chee Yong Wang 《Statistics and Computing》2013,23(2):233-249

It may sometimes be clear from background knowledge that a population under investigation proportionally consists of a known number of subpopulations, whose distributions belong to the same, yet unknown, family. While a parametric family is commonly used in practice, one can also consider some nonparametric families to avoid distributional misspecification. In this article, we propose a solution using a mixture-based nonparametric family for the component distribution in a finite mixture model as opposed to some recent research that utilizes a kernel-based approach. In particular, we present a semiparametric maximum likelihood estimation procedure for the model parameters and tackle the bandwidth parameter selection problem via some popular means for model selection. Empirical comparisons through simulation studies and three real data sets suggest that estimators based on our mixture-based approach are more efficient than those based on the kernel-based approach, in terms of both parameter estimation and overall density estimation.

20.

Sensitivity analysis of partially linear models with response missing at random

Ai-Xia Fan 《Communications in Statistics - Simulation and Computation》2017,46(7):5323-5339

This article investigates case-deletion influence analysis via Cook’s distance and local influence analysis via conformal normal curvature for partially linear models with response missing at random. Local influence approach is developed to assess the sensitivity of parameter and nonparametric estimators to various perturbations such as case-weight, response variable, explanatory variable, and parameter perturbations on the basis of semiparametric estimating equations, which are constructed using the inverse probability weighted approach, rather than likelihood function. Residual and generalized leverage are also defined. Simulation studies and a dataset taken from the AIDS Clinical Trials are used to illustrate the proposed methods.