Similar literature
20 similar articles found (search time: 0 ms)
1.
As researchers increasingly rely on linear mixed models to characterize longitudinal data, there is a need for improved techniques for selecting among this class of models, which requires specification of both fixed and random effects via a mean model and a variance-covariance structure. The process is further complicated when fixed and/or random effects are non-nested between models. This paper explores the development of a hypothesis test to compare non-nested linear mixed models based on extensions of the work begun by Sir David Cox. We assess the robustness of this approach for comparing models containing correlated measures of body fat for predicting longitudinal cardiometabolic risk.

2.
The purpose of this paper is threefold. First, we obtain the asymptotic properties of the modified model selection criteria proposed by Hurvich et al. (1990. Improved estimators of Kullback-Leibler information for autoregressive model selection in small samples. Biometrika 77, 709–719) for autoregressive models. Second, we highlight the better performance of these modified criteria. Third, we extend the modification introduced by these authors to model selection criteria commonly used in the class of self-exciting threshold autoregressive (SETAR) time series models. We show that the modified criteria improve finite-sample performance. In particular, for small and medium sample sizes, the frequency of selecting the true model improves for the consistent criteria and the root mean square error (RMSE) of prediction improves for the efficient criteria. These results are illustrated via simulation with SETAR models in which we assume that the threshold and the parameters are unknown.

3.
We study model selection and model averaging in semiparametric partially linear models with missing responses. An imputation method is used to estimate the linear regression coefficients and the nonparametric function. We show that the corresponding estimators of the linear regression coefficients are asymptotically normal. Then a focused information criterion and frequentist model average estimators are proposed and their theoretical properties are established. Simulation studies are performed to demonstrate the superiority of the proposed methods over the existing strategies in terms of mean squared error and coverage probability. Finally, the approach is applied to a real data case.

4.
In this paper, a generalized partially linear model (GPLM) with missing covariates is studied, and a Monte Carlo EM (MCEM) algorithm with a penalized-spline (P-spline) technique is developed to estimate the regression coefficients and the nonparametric function, respectively. Because classical model selection procedures such as Akaike's information criterion (AIC) become invalid for the considered models with incomplete data, new model selection criteria for GPLMs with missing covariates are proposed under two missingness mechanisms, namely missing at random (MAR) and missing not at random (MNAR). The most attractive feature of the method is that it is rather general and can be extended to various missing-data settings based on the EM algorithm; in particular, when no data are missing, the new model selection criteria reduce to the classical AIC. Therefore, we can not only compare models with missing observations under MAR/MNAR settings, but also compare missing-data models with complete-data models simultaneously. Theoretical properties of the proposed estimator, including consistency of the model selection criteria, are investigated. A simulation study and a real example are used to illustrate the proposed methodology.

5.
Model choice is one of the most crucial aspects of any statistical data analysis. It is well known that most models are just an approximation to the true data-generating process, but among such approximations our goal is to select the ‘best’ one. Researchers typically consider a finite number of plausible models in statistical applications, and the related statistical inference depends on the chosen model. Hence, model comparison is required to identify the ‘best’ model among several candidate models. This article considers the problem of model selection for spatial data. The issue of model selection for spatial models has been addressed in the literature by the use of traditional information criteria-based methods, even though such criteria were developed under the assumption of independent observations. We evaluate the performance of some of the popular model selection criteria via Monte Carlo simulation experiments using small to moderate samples. In particular, we compare the performance of some of the most popular information criteria, such as the Akaike information criterion (AIC), the Bayesian information criterion, and the corrected AIC, in selecting the true model. The ability of these criteria to select the correct model is evaluated under several scenarios. This comparison is made using various spatial covariance models ranging from stationary isotropic to nonstationary models.
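
As a rough illustration of the three criteria named in the abstract above, the sketch below computes AIC, BIC and corrected AIC from a maximized log-likelihood using their textbook definitions. The function names, the candidate labels and the numbers are hypothetical and are not taken from the article; the formulas are the standard ones derived for independent observations, which is precisely the assumption the abstract questions for spatial data.

```python
import math


def information_criteria(log_lik: float, k: int, n: int) -> dict:
    """Textbook AIC, BIC and corrected AIC (AICc).

    log_lik: maximized log-likelihood of the fitted model
    k:       number of estimated parameters
    n:       number of observations
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is only defined for n > k + 1")
    aic = 2 * k - 2 * log_lik
    bic = k * math.log(n) - 2 * log_lik
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    return {"AIC": aic, "BIC": bic, "AICc": aicc}


def select_best(candidates: dict, n: int, criterion: str = "AIC") -> str:
    """Return the name of the candidate with the smallest criterion value.

    candidates maps a model name to (log_lik, k).
    """
    scores = {
        name: information_criteria(log_lik, k, n)[criterion]
        for name, (log_lik, k) in candidates.items()
    }
    return min(scores, key=scores.get)


# Hypothetical fits of two covariance models to the same n = 50 observations.
example_candidates = {
    "exponential": (-120.0, 3),
    "matern": (-118.5, 4),
}
```

With these made-up values, AIC and AICc prefer the larger model while BIC prefers the smaller one, which is the kind of disagreement a simulation study of selection frequencies would tabulate.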

6.
This paper analyzes the impact of some kinds of contaminant on model selection in graphical Gaussian models. We investigate four different kinds of contaminants, in order to consider the effect of gross errors, model deviations, and model misspecification. The aim of the work is to assess against which kinds of contaminant a model selection procedure for graphical Gaussian models has a more robust behavior. The analysis is based on simulated data. The simulation study shows that relatively few contaminated observations in even just one of the variables can have a significant impact on correct model selection, especially when the contaminated variable is a node in a separating set of the graph.

7.
8.
Selecting an appropriate structure for a linear mixed model is an appealing problem in a number of applications, such as the modelling of longitudinal or clustered data. In this paper, we propose a variable selection procedure for simultaneously selecting and estimating the fixed and random effects. More specifically, a profile log-likelihood function, along with an adaptive penalty, is used for sparse selection. The Newton-Raphson optimization algorithm is used to complete the parameter estimation. By jointly selecting the fixed and random effects, the proposed approach increases selection accuracy compared with two-stage procedures, and the use of the profile log-likelihood can improve computational efficiency in one-stage procedures. We prove that the proposed procedure enjoys model selection consistency. A simulation study and a real data application are conducted to demonstrate the effectiveness of the proposed method.

9.
This paper considers model averaging for the ordered probit and nested logit models, which are widely used in empirical research. Within the frameworks of these models, we examine a range of model averaging methods, including the jackknife method, which is proved in this paper to have an optimal asymptotic property. We conduct a large-scale simulation study to examine the behaviour of these model averaging estimators in finite samples, and draw comparisons with model selection estimators. Our results show that while neither averaging nor selection is a consistently better strategy, model selection results in the poorest estimates far more frequently than averaging, and more often than not, averaging yields superior estimates. Among the averaging methods considered, the one based on a smoothed version of the Bayesian information criterion frequently produces the most accurate estimates. In three real data applications, we demonstrate the usefulness of model averaging in mitigating problems associated with the ‘replication crisis’ that commonly arises with model selection.
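
The abstract above does not spell out how the smoothed BIC weights are formed. A minimal sketch, assuming the commonly used definition in which each model's weight is proportional to exp(-BIC/2), is given below; the function names are hypothetical and the code is not taken from the paper.

```python
import math


def smoothed_bic_weights(bic_values):
    """Model weights proportional to exp(-BIC_m / 2) (assumed definition).

    Subtracting the smallest BIC first leaves the normalized weights
    unchanged and avoids underflow when BIC values are large.
    """
    best = min(bic_values)
    raw = [math.exp(-0.5 * (b - best)) for b in bic_values]
    total = sum(raw)
    return [r / total for r in raw]


def average_estimates(estimates, weights):
    """Weighted average of scalar estimates from the candidate models."""
    if len(estimates) != len(weights):
        raise ValueError("need one weight per estimate")
    return sum(w * e for w, e in zip(weights, estimates))
```

Model selection corresponds to putting all weight on the model with the smallest BIC; the smoothed weights instead spread mass across models whose BIC values are close, which is one way to read the abstract's finding that averaging less often produces the poorest estimate.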

10.
Spatial regression models are important tools for many scientific disciplines including economics, business, and social science. In this article, we investigate postmodel selection estimators that apply least squares estimation to the model selected by penalized estimation in high-dimensional regression models with spatial autoregressive errors. We show that by separating the model selection and estimation process, the postmodel selection estimator performs at least as well as the simultaneous variable selection and estimation method in terms of the rate of convergence. Moreover, under perfect model selection, the $\ell_2$ rate of convergence is the oracle rate of $s/n$, compared with the convergence rate of $\sqrt{s \log p / n}$ in the general case. Here, $n$ is the sample size and $p$, $s$ are the model dimension and number of significant covariates, respectively. We further provide the convergence rate of the estimation error in the form of sup norm, and ideally the rate can reach as fast as $\sqrt{\log s / n}$.

11.
In this article, we study strong laws of large numbers for countable nonhomogeneous hidden Markov models. First, we introduce the notion of countable nonhomogeneous hidden Markov models. Then, we obtain some properties of these models. Finally, we establish two strong laws of large numbers for countable nonhomogeneous hidden Markov models. As corollaries, we obtain some known strong laws of large numbers for finite nonhomogeneous Markov chains.

12.
Model selection aims to find the best model. Most of the usual criteria are based on goodness of fit and parsimony and aim to maximize a transformed version of the likelihood. The situation is less clear when two models are equivalent: are they close to the unknown true model or are they far from it? Based on simulations, we study the results of Vuong's test, Cox's test, AIC and BIC and the ability of these four procedures to discriminate between models.
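
For orientation, the sketch below computes the basic (unadjusted) form of Vuong's statistic for two strictly non-nested models from their per-observation log-likelihoods. It is an illustration under stated assumptions, not the simulation code of the article: the function name is hypothetical, no correction for differing numbers of parameters is applied, and the standard normal reference distribution is only appropriate when the models are strictly non-nested.

```python
import math


def vuong_statistic(loglik_a, loglik_b):
    """Unadjusted Vuong z-statistic and two-sided normal p-value.

    loglik_a, loglik_b: per-observation log-likelihoods of models A and B,
    evaluated at their own maximum likelihood estimates on the same data.
    A large positive z favours A; a large negative z favours B.
    """
    if len(loglik_a) != len(loglik_b):
        raise ValueError("both models must be evaluated on the same observations")
    n = len(loglik_a)
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / n
    if variance == 0:
        raise ValueError("pointwise log-likelihood differences have zero variance")
    z = sum(diffs) / (math.sqrt(n) * math.sqrt(variance))
    # Two-sided p-value under N(0, 1): 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A z-value near zero means the data cannot discriminate between the two models, which is the "equivalent models" situation the abstract asks about; the statistic alone does not say whether both are close to or far from the true model.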

13.
Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models use a normal distribution for the regression error. Such an assumption is unsuitable for data containing a group or groups of observations with asymmetric behavior. In this paper, we introduce a variable selection procedure for FMR models using the skew-normal distribution. With an appropriate choice of the tuning parameters, we establish the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. To estimate the parameters of the model, a modified EM algorithm for numerical computations is developed. The methodology is illustrated through numerical experiments and a real data example.

14.
Mixtures of linear mixed-effects models have received considerable attention in longitudinal studies, including medical research, social science and economics. The inferential question of interest is often the identification of critical factors that affect the responses. We consider a Bayesian approach to select the important fixed and random effects in the finite mixture of linear mixed-effects models. To accomplish our goal, latent variables are introduced to facilitate the identification of influential fixed and random components and to classify the membership of observations in the longitudinal data. A spike-and-slab prior for the regression coefficients is adopted to sidestep the potential complications of highly collinear covariates and to handle large-p, small-n issues in variable selection problems. We employ Markov chain Monte Carlo (MCMC) sampling techniques for posterior inference and explore the performance of the proposed method in simulation studies, followed by an analysis of actual psychiatric data concerning depressive disorder.

15.
In this paper, we propose a penalized likelihood method to simultaneously select covariates and mixing components and to estimate parameters in localized mixture of experts models. We develop an expectation maximization algorithm to solve the proposed penalized likelihood procedure, and introduce a data-driven procedure to select the tuning parameters. Extensive numerical studies are carried out to compare the finite sample performance of our proposed method with that of other existing methods. Finally, we apply the proposed methodology to analyze the Boston housing price data set and the baseball salaries data set.

16.
We present a methodology for Bayesian model choice and averaging in Gaussian directed acyclic graphs (DAGs). The dimension-changing move involves adding or dropping a (directed) edge from the graph. The methodology employs the results in Geiger and Heckerman and searches directly in the space of all DAGs. Model determination is carried out by implementing a reversible jump Markov chain Monte Carlo sampler. To achieve this aim we rely on the concept of adjacency matrices, which provide a relatively inexpensive check for acyclicity. The performance of our procedure is illustrated by means of two simulated datasets, as well as one real dataset.
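
The abstract above does not say which acyclicity check the authors use. One simple possibility, sketched here as an assumption, is to run Kahn's topological-sort algorithm directly on the adjacency matrix; the function names are hypothetical, and `can_add_edge` only mimics the "add a directed edge" move in the sense of rejecting proposals that would create a cycle.

```python
def is_acyclic(adj):
    """Return True if the directed graph given by adjacency matrix adj has no cycle.

    adj[i][j] is truthy when there is an edge i -> j. Uses Kahn's algorithm:
    repeatedly remove nodes with no incoming edges; the graph is acyclic
    exactly when every node can be removed this way.
    """
    n = len(adj)
    indegree = [sum(1 for i in range(n) if adj[i][j]) for j in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in range(n):
            if adj[u][v]:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    return removed == n


def can_add_edge(adj, i, j):
    """Check whether adding edge i -> j keeps the graph acyclic.

    The matrix is modified temporarily and restored before returning.
    """
    if i == j or adj[i][j]:
        return False
    adj[i][j] = 1
    try:
        return is_acyclic(adj)
    finally:
        adj[i][j] = 0
```

Dropping an edge never creates a cycle, so in a sampler of this kind only the add move would need such a check; this version costs O(n²) per proposal for an n-node graph.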

17.
Summary. Structured additive regression models are perhaps the most commonly used class of models in statistical applications. They include, among others, (generalized) linear models, (generalized) additive models, smoothing spline models, state space models, semiparametric regression, spatial and spatiotemporal models, log-Gaussian Cox processes and geostatistical and geoadditive models. We consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with non-Gaussian response variables. The posterior marginals are not available in closed form owing to the non-Gaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, in terms of both convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo sampling is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations is computational: where Markov chain Monte Carlo algorithms need hours or days to run, our approximations provide more precise estimates in seconds or minutes. Another advantage of our approach is its generality, which makes it possible to perform Bayesian analysis in an automatic, streamlined way, and to compute model comparison criteria and various predictive measures so that models can be compared and the model under study can be challenged.

18.
19.
In a randomized trial designed to study the effect of a treatment of interest on the evolution of the mean of a time-dependent outcome variable, subjects are assigned to a treatment regime, or, equivalently, a treatment protocol. Unfortunately, subjects often fail to comply with their assigned regime. From a public health point of view, the causal parameter of interest will often be a function of the treatment differences that would have been observed had, contrary to fact, all subjects remained on protocol. This paper considers the identification and estimation of these treatment differences based on a new class of structural models, the multivariate structural nested mean models, when reliable estimates of each subject's actual treatment are available. Estimates of “actual treatment” might, for example, be obtained by measuring the amount of “active drug” in the subject's blood or urine at each follow-up visit or by pill counting techniques. In addition, we discuss a natural extension of our methods to observational studies.

20.
Varying-coefficient models have been widely used to investigate the possible time-dependent effects of covariates when the response variable comes from a normal distribution. Much progress has been made on inference and variable selection in the framework of such models. However, the identification of model structure, that is, how to identify which covariates have time-varying effects and which have fixed effects, remains a challenging and unsolved problem, especially when the dimension of the covariates is much larger than the sample size. In this article, we consider the structural identification and variable selection problems in varying-coefficient models for high-dimensional data. Using a modified basis expansion approach and group variable selection methods, we propose a unified procedure to simultaneously identify the model structure, select important variables and estimate the coefficient curves. The unique feature of the proposed approach is that we do not have to specify the model structure in advance; it is therefore more realistic and appropriate for real data analysis. Asymptotic properties of the proposed estimators are derived under regularity conditions. Furthermore, we evaluate the finite sample performance of the proposed methods with Monte Carlo simulation studies and a real data analysis.
