Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
The purpose of this paper is threefold. First, we obtain the asymptotic properties of the modified model selection criteria proposed by Hurvich et al. (1990. Improved estimators of Kullback-Leibler information for autoregressive model selection in small samples. Biometrika 77, 709–719) for autoregressive models. Second, we provide some highlights on the better performance of these modified criteria. Third, we extend the modification introduced by these authors to model selection criteria commonly used in the class of self-exciting threshold autoregressive (SETAR) time series models. We show the improvements of the modified criteria in their finite sample performance. In particular, for small and medium sample sizes the frequency of selecting the true model improves for the consistent criteria and the root mean square error (RMSE) of prediction improves for the efficient criteria. These results are illustrated via simulation with SETAR models in which we assume that the threshold and the parameters are unknown.

2.
This paper is concerned with the selection of explanatory variables in generalized linear models (GLMs). The class of GLMs is quite large and contains, e.g., ordinary linear regression, binary logistic regression, the probit model and Poisson regression with linear or log-linear parameter structure. We show that, through an approximation of the log likelihood and a certain data transformation, the variable selection problem in a GLM can be converted into variable selection in an ordinary (unweighted) linear regression model. As a consequence, no specific computer software for variable selection in GLMs is needed. Instead, some suitable variable selection program for linear regression can be used. We also present a simulation study which shows that the log likelihood approximation is very good in many practical situations. Finally, we briefly mention possible extensions to regression models outside the class of GLMs.

3.
4.
This paper addresses the problem of comparing the fit of latent class and latent trait models when the indicators are binary and the contingency table is sparse. This problem is common in the analysis of data from large surveys, where many items are associated with an unobservable variable. A study of human resource data illustrates: (1) how the usual goodness-of-fit tests, model selection and cross-validation criteria can be inconclusive; (2) how model selection and evaluation procedures from time series and economic forecasting can be applied to extend residual analysis in this context.

5.
Selection of the important variables is one of the most important model selection problems in statistical applications. In this article, we address variable selection in finite mixtures of generalized semiparametric models. To overcome the computational burden, we introduce a class of variable selection procedures for finite mixtures of generalized semiparametric models using a penalized approach. The nonparametric component is estimated via multivariate kernel regression. It is shown that the new method is consistent for variable selection, and the performance of the proposed method is assessed via simulation.

6.
The Dantzig selector (DS) is a recent approach to estimation in high-dimensional linear regression models with a large number of explanatory variables and a relatively small number of observations. As in the least absolute shrinkage and selection operator (LASSO), this approach sets certain regression coefficients exactly to zero, thus performing variable selection. However, such a framework, contrary to the LASSO, has never been used in regression models for survival data with censoring. A key motivation of this article is to study the estimation problem for Cox's proportional hazards (PH) function regression models using a framework that extends the theory, the computational advantages and the optimal asymptotic rate properties of the DS to the class of Cox's PH under appropriate sparsity scenarios. We perform a detailed simulation study to compare our approach with other methods and illustrate it on a well-known microarray gene expression data set for predicting survival from gene expressions.

7.
A regression model with skew-normal errors provides a useful extension of ordinary normal regression models when the data set under consideration involves asymmetric outcomes. Variable selection is an important issue in all regression analyses, and in this paper, we investigate simultaneous variable selection in joint location and scale models of the skew-normal distribution. We propose a unified penalized likelihood method which can simultaneously select significant variables in the location and scale models. Furthermore, the proposed variable selection method can simultaneously perform parameter estimation and variable selection in the location and scale models. With appropriate selection of the tuning parameters, we establish the consistency and the oracle property of the regularized estimators. Simulation studies and a real example are used to illustrate the proposed methodologies.

8.
In statistical analysis, one of the most important subjects is to select relevant explanatory variables that perfectly explain the dependent variable. Variable selection methods are usually performed within regression analysis. Variable selection is implemented so as to minimize the information criteria (IC) in regression models. Information criteria directly affect the power of prediction and the estimation of selected models. There are numerous information criteria in the literature, such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). These criteria are modified to improve the performance of the selected models. BIC is extended with alternative modifications towards the usage of the prior and the information matrix. Information matrix-based BIC (IBIC) and scaled unit information prior BIC (SPBIC) are efficient criteria for this modification. In this article, we proposed a combination to perform variable selection via a differential evolution (DE) algorithm for minimizing IBIC and SPBIC in linear regression analysis. We concluded that these alternative criteria are very useful for variable selection. We also illustrated the efficiency of this combination with various simulation and application studies.

9.
Variable selection is an important issue in all regression analyses, and in this article, we investigate simultaneous variable selection in joint location and scale models of the skew-t-normal distribution when the dataset under consideration involves heavy-tailed and asymmetric outcomes. We propose a unified penalized likelihood method which can simultaneously select significant variables in the location and scale models. Furthermore, the proposed variable selection method can simultaneously perform parameter estimation and variable selection in the location and scale models. With appropriate selection of the tuning parameters, we establish the consistency and the oracle property of the regularized estimators. These estimators are compared by simulation studies.

10.
Empirical likelihood based variable selection   Total citations: 1, self-citations: 0, citations by others: 1
Information criteria form an important class of model/variable selection methods in statistical analysis. Parametric likelihood is a crucial part of these methods. In some applications such as generalized linear models, the models are only specified by a set of estimating functions. To overcome the non-availability of a well-defined likelihood function, information criteria under empirical likelihood are introduced. Under this setup, we successfully solve the existence problem of the profile empirical likelihood due to the over-constraint in variable selection problems. The asymptotic properties of the new method are investigated. The new method is shown to be consistent at selecting the variables under mild conditions. Simulation studies find that the proposed method has comparable performance to the parametric information criteria when a suitable parametric model is available, and is superior when the parametric model assumption is violated. A real data set is also used to illustrate the usefulness of the new method.

11.
Stepwise variable selection procedures are computationally inexpensive methods for constructing useful regression models for a single dependent variable. At each step a variable is entered into or deleted from the current model, based on the criterion of minimizing the error sum of squares (SSE). When there is more than one dependent variable, the situation is more complex. In this article we propose variable selection criteria for multivariate regression which generalize the univariate SSE criterion. Specifically, we suggest minimizing some function of the estimated error covariance matrix: the trace, the determinant, or the largest eigenvalue. The computations associated with these criteria may be burdensome. We develop a computational framework based on the use of the SWEEP operator which greatly reduces these calculations for stepwise variable selection in multivariate regression.
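
The three candidate criteria named in this abstract (trace, determinant, largest eigenvalue of the estimated error covariance matrix) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the matrix is assumed to be symmetric positive semi-definite and given as a list of lists, and the plain Gaussian elimination and power iteration used here stand in for, and are not, the SWEEP-based framework the article develops.

```python
import math

def trace(S):
    """Sum of the diagonal entries of a square matrix."""
    return sum(S[i][i] for i in range(len(S)))

def determinant(S):
    """Determinant via Gaussian elimination with partial pivoting."""
    A = [row[:] for row in S]  # work on a copy
    n = len(A)
    det = 1.0
    for c in range(n):
        # Pick the row with the largest pivot candidate in column c.
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if A[p][c] == 0.0:
            return 0.0
        if p != c:
            A[c], A[p] = A[p], A[c]
            det = -det  # a row swap flips the sign
        det *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for j in range(c, n):
                A[r][j] -= f * A[c][j]
    return det

def largest_eigenvalue(S, iters=500):
    """Power iteration; assumes S is symmetric positive semi-definite and
    that the all-ones start vector is not orthogonal to the top eigenvector."""
    n = len(S)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    # Rayleigh quotient of the (unit-length) converged vector.
    Sv = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * Sv[i] for i in range(n))

# Hypothetical 2x2 error covariance estimate for two dependent variables.
S_hat = [[4.0, 1.0], [1.0, 3.0]]
criteria = {
    "trace": trace(S_hat),
    "determinant": determinant(S_hat),
    "largest_eigenvalue": largest_eigenvalue(S_hat),
}
```

In a stepwise search, one of these values would be recomputed for each candidate model and the variable giving the smallest value would be entered or removed; which of the three to use is a modelling choice the abstract leaves open.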

12.
In some fields, we are forced to work with missing data in multivariate time series. Unfortunately, the data analysis in this context cannot be carried out in the same way as in the case of complete data. To deal with this problem, a Bayesian analysis of multivariate threshold autoregressive models with exogenous inputs and missing data is carried out. In this paper, Markov chain Monte Carlo methods are used to obtain samples from the involved posterior distributions, including threshold values and missing data. In order to identify autoregressive orders, we adapt the Bayesian variable selection method to this class of multivariate processes. The number of regimes is estimated using marginal likelihood or product parameter-space strategies.

13.
It is quite common in epidemiology that we wish to assess the quality of estimators on a particular set of information, whereas the estimators may use a larger set of information. Two examples are studied: the first occurs when we construct a model for an event which happens if a continuous variable is above a certain threshold. We can compare estimators based on the observation of only the event or on the whole continuous variable. The other example is that of predicting the survival based only on survival information or using in addition information on a disease. We develop modified Akaike information criterion (AIC) and likelihood cross-validation (LCV) criteria to compare estimators in this non-standard situation. We show that a normalized difference of AIC has a bias equal to $o(n^{-1})$ if the estimators are based on well-specified models; a normalized difference of LCV always has a bias equal to $o(n^{-1})$. A simulation study shows that both criteria work well, although the normalized difference of LCV tends to be better and is more robust. Moreover, in the case of well-specified models the difference of risks boils down to the difference of statistical risks, which can be rather precisely estimated. For ‘compatible’ models the difference of risks is often the main term, but there can also be a difference of mis-specification risks.

14.
In this article we present a robust and efficient variable selection procedure by using modal regression for varying-coefficient models with longitudinal data. The new method is proposed based on basis function approximations and a group version of the adaptive LASSO penalty, which can select significant variables and estimate the non-zero smooth coefficient functions simultaneously. Under suitable conditions, we establish the consistency in variable selection and the oracle property in estimation. A simulation study and two real data examples are undertaken to assess the finite sample performance of the proposed variable selection procedure.

15.
Linear regression models with autoregressive moving average (ARMA) errors (REGARMA models) are often considered in order to reflect serial correlation among observations. In this article, we focus on an adaptive least absolute shrinkage and selection operator (LASSO) (ALASSO) method for variable selection in REGARMA models and extend it to linear regression models with ARMA-generalized autoregressive conditional heteroskedasticity (ARMA-GARCH) errors (REGARMA-GARCH models). This attempt is an extension of the existing ALASSO method for linear regression models with AR errors (REGAR models) proposed by Wang et al. (2007. Regression coefficient and autoregressive order shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B 69, 63–78). New ALASSO algorithms are proposed to determine important predictors for the REGARMA and REGARMA-GARCH models. Finally, we provide simulation results and a real data analysis to illustrate our findings.

16.
Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models use a normal distribution for regression error. Such assumptions are unsuitable for a set of data containing a group or groups of observations with asymmetric behavior. In this paper, we introduce a variable selection procedure for FMR models using the skew-normal distribution. With appropriate choice of the tuning parameters, we establish the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. To estimate the parameters of the model, a modified EM algorithm for numerical computations is developed. The methodology is illustrated through numerical experiments and a real data example.

17.
Variable selection in elliptical linear mixed models (LMMs) with a shrinkage penalty function (SPF) is the main scope of this study. SPFs are applied for parameter estimation and variable selection simultaneously. The smoothly clipped absolute deviation (SCAD) penalty is one of the SPFs, and it is adapted to the elliptical LMM in this study. The proposed idea is highly applicable to a variety of models which are set up with different distributions such as the normal, Student-t, Pearson VII, power exponential and so on. Simulation studies and a real data example with one of the elliptical distributions show that, if variable selection is also a concern, it is worthwhile to carry out the variable selection and the parameter estimation simultaneously in the elliptical LMM.
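
For readers unfamiliar with the SCAD penalty mentioned in this abstract, a minimal sketch of its standard piecewise form is given here. The function name `scad_penalty` is hypothetical, and the default `a = 3.7` is the conventional choice in the SCAD literature rather than a value taken from this study; how the penalty is embedded in the elliptical LMM likelihood is not shown.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty for a single coefficient.

    theta : coefficient value
    lam   : tuning parameter (lam > 0)
    a     : shape parameter (a > 2); 3.7 is the conventional default,
            assumed here for illustration
    """
    t = abs(theta)
    if t <= lam:
        # Near zero the penalty is linear, like the LASSO.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that gradually relaxes the penalisation.
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    # Constant for large coefficients, so they are not shrunk further.
    return lam * lam * (a + 1.0) / 2.0

# Penalty values for a small, a moderate and a large coefficient.
values = [scad_penalty(x, lam=1.0) for x in (0.5, 2.0, 10.0)]
```

The flat tail is what distinguishes SCAD from the LASSO: large coefficients incur a fixed penalty, which is the usual motivation for the oracle-type properties claimed for such penalties.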

18.
Multilevel Mixed Linear Models for Survival Data   Total citations: 2, self-citations: 0, citations by others: 2
For the analysis of correlated survival data, mixed linear models are useful alternatives to frailty models. By their use the survival times can be directly modelled, so that the interpretation of the fixed and random effects is straightforward. However, because of the intractable integration involved in the use of marginal likelihood, the class of models in use has been severely restricted. Such a difficulty can be avoided by using hierarchical likelihood, which provides a statistically efficient and fast fitting algorithm for multilevel models. The proposed method is illustrated using the chronic granulomatous disease data. A simulation study is carried out to evaluate the performance.

19.
Varying-coefficient models have been widely used to investigate the possible time-dependent effects of covariates when the response variable comes from a normal distribution. Much progress has been made on inference and variable selection in the framework of such models. However, the identification of model structure, that is, how to identify which covariates have time-varying effects and which have fixed effects, remains a challenging and unsolved problem, especially when the dimension of the covariates is much larger than the sample size. In this article, we consider the structural identification and variable selection problems in varying-coefficient models for high-dimensional data. Using a modified basis expansion approach and group variable selection methods, we propose a unified procedure to simultaneously identify the model structure, select important variables and estimate the coefficient curves. The unique feature of the proposed approach is that we do not have to specify the model structure in advance; it is therefore more realistic and appropriate for real data analysis. Asymptotic properties of the proposed estimators have been derived under regular conditions. Furthermore, we evaluate the finite sample performance of the proposed methods with Monte Carlo simulation studies and a real data analysis.

20.
Model choice is one of the most crucial aspects in any statistical data analysis. It is well known that most models are just an approximation to the true data-generating process, but among such model approximations, it is our goal to select the ‘best’ one. Researchers typically consider a finite number of plausible models in statistical applications, and the related statistical inference depends on the chosen model. Hence, model comparison is required to identify the ‘best’ model among several such candidate models. This article considers the problem of model selection for spatial data. The issue of model selection for spatial models has been addressed in the literature by the use of traditional information criteria-based methods, even though such criteria have been developed based on the assumption of independent observations. We evaluate the performance of some of the popular model selection criteria via Monte Carlo simulation experiments using small to moderate samples. In particular, we compare the performance of some of the most popular information criteria such as the Akaike information criterion (AIC), the Bayesian information criterion, and corrected AIC in selecting the true model. The ability of these criteria to select the correct model is evaluated under several scenarios. This comparison is made using various spatial covariance models ranging from stationary isotropic to nonstationary models.
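
The three criteria compared in this abstract can be written down compactly in their textbook forms. The sketch below assumes a Gaussian model summarised by its residual sum of squares, with additive constants dropped, which is the independent-observations setting the abstract cautions about and not a spatial likelihood. The function names and the candidate values are hypothetical.

```python
import math

def aic(rss, n, k):
    """AIC for a Gaussian model: n*log(rss/n) + 2k (constants dropped)."""
    return n * math.log(rss / n) + 2 * k

def aicc(rss, n, k):
    """Small-sample corrected AIC; requires n > k + 1."""
    return aic(rss, n, k) + 2.0 * k * (k + 1) / (n - k - 1)

def bic(rss, n, k):
    """BIC for a Gaussian model: n*log(rss/n) + k*log(n)."""
    return n * math.log(rss / n) + k * math.log(n)

# Hypothetical candidates: (number of parameters, residual sum of squares),
# all fitted to the same n = 20 observations.
n = 20
candidates = {"small": (2, 14.0), "medium": (4, 10.5), "large": (8, 9.8)}

def best_model(criterion):
    """Name of the candidate that minimises the given criterion."""
    return min(candidates,
               key=lambda m: criterion(candidates[m][1], n, candidates[m][0]))

chosen = {c.__name__: best_model(c) for c in (aic, aicc, bic)}
```

The correction term in `aicc` grows quickly as the number of parameters approaches the sample size, which is why the criteria can disagree for small to moderate samples.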
