Monte Carlo experiments are conducted to compare Bayesian and sample theory model selection criteria in choosing between the univariate probit and logit models. We use five criteria: the deviance information criterion (DIC), the predictive deviance information criterion (PDIC), the Akaike information criterion (AIC), and the weighted and unweighted sums of squared errors. The first two criteria are Bayesian, while the others are sample theory criteria. The results show that if the data are balanced, none of the model selection criteria considered in this article can distinguish the probit and logit models. If the data are unbalanced and the sample size is large, the DIC and AIC identify the correct model better than the other criteria. We show that if unbalanced binary data are generated by a leptokurtic distribution, the logit model is preferred over the probit model. The probit model is preferred if unbalanced data are generated by a platykurtic distribution. We apply the model selection criteria to the probit and logit models that link the ups and downs of the returns on the S&P 500 to the crude oil price.
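
The DIC named in this abstract can be sketched in a few lines, assuming MCMC output is already available. The function name `dic`, its arguments, and the deviance values below are hypothetical illustrations and are not taken from the article. The sketch uses the usual definition: the effective number of parameters is the posterior mean deviance minus the deviance at the posterior mean, and the DIC is the posterior mean deviance plus that penalty.

```python
from statistics import mean


def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, p_D) from MCMC output.

    deviance_draws: D(theta_s) = -2 * log p(y | theta_s), one value per posterior draw.
    deviance_at_posterior_mean: D evaluated at the posterior mean of theta.
    """
    d_bar = mean(deviance_draws)                # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean    # effective number of parameters
    return d_bar + p_d, p_d


# Hypothetical deviance draws for two candidate link functions (illustrative numbers only).
dic_logit, pd_logit = dic([210.4, 212.1, 211.3, 213.0], 209.2)
dic_probit, pd_probit = dic([214.9, 216.2, 215.5, 217.0], 213.4)

# The model with the smaller DIC is preferred.
preferred = "logit" if dic_logit < dic_probit else "probit"
```

In practice the deviance draws would come from the sampler used to fit each model, and far more than four draws would be needed for a stable estimate.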

Bayesian methods are now routinely used for estimating the parameters of item response theory (IRT) models. However, marginal likelihoods are still rarely used for comparing IRT models because of their complexity and the relatively high dimension of the model parameters. In this paper, we review Monte Carlo (MC) methods developed in the literature in recent years and provide a detailed development of how these methods are applied to IRT models. In particular, we focus on the “best possible” implementation of these MC methods for IRT models. These MC methods are used to compute the marginal likelihoods under the one-parameter IRT model with the logistic link (1PL model) and the two-parameter logistic IRT model (2PL model) for a real English Examination dataset. We further use the widely applicable information criterion (WAIC) and the deviance information criterion (DIC) to compare the 1PL model and the 2PL model. The 2PL model is favored by all three of these Bayesian model comparison criteria for the English Examination data.
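
As a rough illustration of the WAIC mentioned above, the sketch below assumes a matrix of pointwise log-likelihood values is available from MCMC, with one row per posterior draw and one column per observation. The names `waic`, `log_mean_exp`, and `log_lik` are hypothetical, and this is the standard variance-based form of the penalty, not necessarily the implementation used in the paper.

```python
import math
from statistics import variance


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def waic(log_lik):
    """Return (WAIC, p_waic) on the deviance scale.

    log_lik[s][i] = log p(y_i | theta_s) for posterior draw s and observation i.
    At least two draws are required for the variance term.
    """
    n_obs = len(log_lik[0])
    lppd = 0.0      # log pointwise predictive density
    p_waic = 0.0    # penalty: sum of per-observation variances across draws
    for i in range(n_obs):
        column = [row[i] for row in log_lik]
        lppd += log_mean_exp(column)
        p_waic += variance(column)
    return -2.0 * (lppd - p_waic), p_waic
```

A smaller WAIC indicates better expected out-of-sample fit, so two models such as the 1PL and 2PL would each be passed through `waic` and the values compared.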

Motivated by a colorectal cancer study, we propose a class of frailty semi-competing risks survival models to account for the dependence between disease progression time, survival time, and treatment switching. Properties of the proposed models are examined and an efficient Gibbs sampling algorithm using the collapsed Gibbs technique is developed. A Bayesian procedure for assessing the treatment effect is also proposed. The deviance information criterion (DIC) with an appropriate deviance function and the logarithm of the pseudomarginal likelihood (LPML) are constructed for model comparison. A simulation study is conducted to examine the empirical performance of DIC and LPML as well as the posterior estimates. The proposed method is further applied to analyze data from a colorectal cancer study.
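
The LPML is commonly computed from conditional predictive ordinates (CPO), where each CPO is estimated as the harmonic mean of the pointwise likelihood over posterior draws. The sketch below follows that standard estimator; the function name `lpml` and the input layout are assumptions for illustration, and the article's own deviance function for semi-competing risks data is not reproduced here.

```python
import math


def lpml(log_lik):
    """Return the log pseudomarginal likelihood.

    log_lik[s][i] = log p(y_i | theta_s) for posterior draw s and observation i.
    CPO_i is estimated by the harmonic mean of p(y_i | theta_s) over draws.
    """
    total = 0.0
    for i in range(len(log_lik[0])):
        neg = [-row[i] for row in log_lik]
        # Stable log of the mean of 1 / p(y_i | theta_s).
        m = max(neg)
        log_mean_inverse = m + math.log(sum(math.exp(v - m) for v in neg) / len(neg))
        total += -log_mean_inverse   # log CPO_i
    return total
```

A larger LPML indicates a better-fitting model, the opposite direction from DIC.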

We compare Bayesian and sample theory model specification criteria. For the Bayesian criteria we use the deviance information criterion (DIC) and the cumulative density of the mean squared errors of forecast. For the sample theory criterion we use the conditional Kolmogorov test (CKT). We use Markov chain Monte Carlo methods to obtain the Bayesian criteria and bootstrap sampling to obtain the CKT. The two non-nested models we consider are the CIR and Vasicek models for spot asset prices. Monte Carlo experiments show that the DIC performs better than the cumulative density of the mean squared errors of forecast and the CKT. According to the DIC and the mean squared errors of forecast, the CIR model explains the daily data on the uncollateralized Japanese call rate from January 1, 1990 to April 18, 1996; according to the CKT, however, neither the CIR nor the Vasicek model explains the daily data.

Model choice is one of the most crucial aspects of any statistical data analysis. It is well known that most models are only an approximation to the true data-generating process, but among such approximations our goal is to select the ‘best’ one. Researchers typically consider a finite number of plausible models in statistical applications, and the related statistical inference depends on the chosen model. Hence, model comparison is required to identify the ‘best’ model among several candidate models. This article considers the problem of model selection for spatial data. The issue of model selection for spatial models has been addressed in the literature by the use of traditional information criteria-based methods, even though such criteria were developed under the assumption of independent observations. We evaluate the performance of some popular model selection criteria via Monte Carlo simulation experiments using small to moderate samples. In particular, we compare the performance of the Akaike information criterion (AIC), the Bayesian information criterion, and the corrected AIC in selecting the true model. The ability of these criteria to select the correct model is evaluated under several scenarios. This comparison is made using various spatial covariance models ranging from stationary isotropic to nonstationary models.
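
For reference, the three criteria compared in this abstract have simple closed forms once the maximized log-likelihood is known. The sketch below uses the textbook definitions; the function name `information_criteria` and its arguments are hypothetical, and, as the abstract notes, these formulas were derived assuming independent observations.

```python
import math


def information_criteria(log_lik, k, n):
    """Return AIC, BIC, and corrected AIC (AICc).

    log_lik: maximized log-likelihood of the fitted model.
    k: number of estimated parameters.
    n: sample size (AICc requires n > k + 1).
    """
    aic = -2.0 * log_lik + 2.0 * k
    bic = -2.0 * log_lik + k * math.log(n)
    aicc = aic + (2.0 * k * (k + 1)) / (n - k - 1)   # small-sample correction
    return {"AIC": aic, "BIC": bic, "AICc": aicc}
```

For each criterion, the candidate model with the smallest value is selected. The AICc penalty exceeds the AIC penalty most noticeably when n is small relative to k.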

A virologic marker, the number of HIV RNA copies or viral load, is currently used to evaluate antiretroviral (ARV) therapies in AIDS clinical trials. This marker can be used to assess the antiviral potency of therapies, but may be easily affected by clinical factors such as drug exposures and drug resistance as well as baseline characteristics during the long-term treatment evaluation process. HIV dynamic studies have significantly contributed to the understanding of HIV pathogenesis and ARV treatment strategies. Viral dynamic models can be formulated through differential equations, but there has been only limited development of statistical methodologies for estimating such models or assessing their agreement with observed data. This paper develops mechanism-based nonlinear differential equation models for characterizing long-term viral dynamics with ARV therapy. In this model we incorporate not only clinical factors (drug exposures and susceptibility) but also baseline covariates (baseline viral load, CD4 count, weight, or age) into a function of treatment efficacy. A Bayesian nonlinear mixed-effects modeling approach is investigated with application to an AIDS clinical trial study. The effects of confounding interaction of clinical factors with covariate-based models are compared using the deviance information criterion (DIC), a Bayesian version of the classical deviance for model assessment, designed for complex hierarchical model settings. Relationships between baseline covariates combined with confounding clinical factors and drug efficacy are explored. In addition, we compare models incorporating each of four baseline covariates through DIC, and some interesting findings are presented. Our results suggest that modeling HIV dynamics and virologic responses with consideration of time-varying clinical factors as well as baseline characteristics may play an important role in understanding HIV pathogenesis and in designing new treatment strategies for long-term care of AIDS patients.

In this article, we develop a Bayesian approach for the estimation of two cure correlated frailty models that extend the cure frailty models introduced by Yin [34]. We use two different types of frailty with a bivariate log-normal distribution instead of a gamma distribution. A likelihood function is constructed based on a piecewise exponential distribution function. The model parameters are estimated by the Markov chain Monte Carlo method. The models are compared against the Cox correlated frailty model with log-normal distribution. A real data set on bilateral corneal graft rejection is used to compare these models. The results for these data, based on the deviance information criterion, show the advantage of the proposed models.


Model selection can be defined as the task of estimating the performance of different models in order to choose the most parsimonious one among a potentially very large set of candidate statistical models. We propose a graphical representation that extends to the class of mixed models the deviance plot proposed in the literature for classical and generalized linear models. Once a reduced number of models has been selected, this graphical representation makes it possible to identify important covariates by focusing only on the fixed effects component, assuming the random part is properly specified. We also suggest a standalone figure representing the residual random variance ratio: cross-evaluating the two graphical representations allows some conclusions to be drawn about the specification of the random part of the model and supports a more accurate selection of the final model.

We introduce a multivariate heteroscedastic measurement error model for replications under scale mixtures of normal distributions. The model can provide a robust analysis and can be viewed as a generalization of multiple linear regression in both model structure and distributional assumption. An efficient method based on Markov chain Monte Carlo is developed for parameter estimation. The deviance information criterion and the conditional predictive ordinates are used as model selection criteria. Simulation studies show that inference under the model is robust against both misspecification of distributions and outliers. We work out an illustrative example with a real data set on measurements of plant root decomposition.

In this paper, a generalized partially linear model (GPLM) with missing covariates is studied, and a Monte Carlo EM (MCEM) algorithm with the penalized-spline (P-spline) technique is developed to estimate the regression coefficients and the nonparametric function, respectively. Because classical model selection procedures such as Akaike's information criterion become invalid for the considered models with incomplete data, some new model selection criteria for GPLMs with missing covariates are proposed under two different missingness mechanisms, namely missing at random (MAR) and missing not at random (MNAR). The most attractive feature of our method is that it is rather general and can be extended to various situations with missing observations based on the EM algorithm; in particular, when no missing data are involved, our new model selection criteria reduce to the classical AIC. Therefore, we can not only compare models with missing observations under MAR/MNAR settings, but also compare missing data models with complete-data models simultaneously. Theoretical properties of the proposed estimator, including consistency of the model selection criteria, are investigated. A simulation study and a real example are used to illustrate the proposed methodology.

Copula, marginal distributions and model selection: a Bayesian note
Copula functions and marginal distributions are combined to produce multivariate distributions. We show the advantages of estimating all parameters of these models using the Bayesian approach, which can be done with standard Markov chain Monte Carlo algorithms. Deviance-based model selection criteria are also discussed when applied to copula models, since they are invariant under monotone increasing transformations of the marginals. We focus on the deviance information criterion. The joint estimation takes into account the full dependence structure of the parameters’ posterior distributions in our chosen model selection criteria. Two Monte Carlo studies are conducted to show that model identification improves when the model parameters are jointly estimated. We study the Bayesian estimation of all unknown quantities at once, considering bivariate copula functions and three known marginal distributions.

Applying nonparametric variable selection criteria in nonlinear regression models generally requires a substantial computational effort if the data set is large. In this paper we present a selection technique that is computationally much less demanding and performs well in comparison with methods currently available. It is based on a polynomial approximation of the nonlinear model. Performing the selection only requires repeated least squares estimation of models that are linear in parameters. The main limitation of the method is that the number of variables among which to select cannot be very large if the sample is small and the order of an adequate polynomial at the same time is high. Large samples can be handled without problems.

Bayesian propensity score regression analysis with misclassified binary responses is proposed to analyse clustered observational data. This approach utilizes multilevel models and corrects for misclassification in the responses. Using the deviance information criterion (DIC), the performance of the approach is compared with approaches that omit the correction for misclassification, the multilevel structure specification, or both, in a study of the impact of female employment on the likelihood of physical violence. The smallest DIC indicates that our proposed model best fits the data. We conclude that female employment has an insignificant impact on the likelihood of physical spousal violence towards women. In addition, a simulation study confirms that the proposed approach performs best in terms of bias and coverage rate. Ignoring misclassification in the response or the multilevel structure of the data would yield biased estimation of the exposure effect.

This paper investigates, by means of Monte Carlo simulation, the effects of different choices of order for autoregressive approximation on the fully efficient parameter estimates for autoregressive moving average models. Four order selection criteria, AIC, BIC, HQ and PKK, are compared, and different model structures with varying sample sizes are used to contrast the performance of the criteria. Some asymptotic results which provide a useful guide for assessing the performance of these criteria are presented. The results of this comparison show that there are marked differences in the accuracy implied by these alternative criteria in small sample situations and that it is preferable to apply the BIC, which leads to greater precision of Gaussian likelihood estimates, in such cases. Implications of the findings of this study for the estimation of time series models are highlighted.

Saddlepoint conditions on a predictor are introduced and developed to reconfirm the need for the assumption of a prior distribution in constructing a useful inferential procedure. One such condition implies that the predictor induced from the maximum likelihood estimator is the worst under a loss, while the predictor induced from a suitable posterior mean is the best. This result indicates the promising role of Bayesian criteria, such as the deviance information criterion (DIC). As an implication, we critique the conventional empirical Bayes method because of its partial assumption of a prior distribution.

Modelling count data with overdispersion and spatial effects
In this paper we consider regression models for count data allowing for overdispersion in a Bayesian framework. We account for unobserved heterogeneity in the data in two ways. On the one hand, we consider models more flexible than a common Poisson model, allowing for overdispersion in different ways. In particular, the negative binomial and the generalized Poisson (GP) distributions are addressed, where overdispersion is modelled by an additional model parameter. Further, zero-inflated models, in which overdispersion is assumed to be caused by an excessive number of zeros, are discussed. On the other hand, extra spatial variability in the data is taken into account by adding correlated spatial random effects to the models. This approach allows for an underlying spatial dependency structure, which is modelled using a conditional autoregressive prior based on Pettitt et al. (Stat Comput 12(4):353–367, 2002). In an application, the presented models are used to analyse the number of invasive meningococcal disease cases in Germany in the year 2004. Models are compared according to the deviance information criterion (DIC) suggested by Spiegelhalter et al. (J R Stat Soc B 64(4):583–640, 2002) and using proper scoring rules; see, for example, Gneiting and Raftery (Technical Report no. 463, University of Washington, 2004). We observe a rather high degree of overdispersion in the data, which is captured best by the GP model when spatial effects are neglected. While adding spatial effects to the models allowing for overdispersion gives little or no improvement, spatial Poisson models with spatially correlated or uncorrelated random effects are to be preferred over all other models according to the considered criteria.

Bayesian estimation via MCMC methods opens up new possibilities in estimating complex models. However, there is still considerable debate about how selection among a set of candidate models, or averaging over closely competing models, might be undertaken. This article considers simple approaches for model averaging and choice using predictive and likelihood criteria and associated model weights on the basis of output for models that run in parallel. The operation of such procedures is illustrated with real data sets and a linear regression with simulated data where the true model is known.

Mixture experiments are commonly encountered in many fields, including the chemical, pharmaceutical and consumer product industries. Because of their wide applications, mixture experiments, a special area of response surface methodology, have been given greater attention in both model building and determination of designs than other experimental studies. In this paper, some new approaches are suggested for model building and selection in the analysis of data from mixture experiments using a special generalized linear model, the logistic regression model, proposed by Chen et al. [7]. Generally, the special mixture models, which do not have a constant term, are highly affected by collinearity in modeling mixture experiments. For this reason, in order to alleviate the undesired effects of collinearity in the analysis of mixture experiments with logistic regression, a new mixture model is defined with an alternative ratio variable. The deviance analysis table is given for standard mixture polynomial models defined by transformations and for special mixture models used as linear predictors. The effects of components on the response in the restricted experimental region are given by using an alternative representation of Cox's direction approach. In addition, the odds ratio and the confidence intervals of the odds ratio are identified according to the chosen reference and control groups. To compare the suggested models, some model selection criteria, the graphical odds ratio and the confidence intervals of the odds ratio are used. The advantage of the suggested approaches is illustrated on a tumor incidence data set.

The variational approach to Bayesian inference enables simultaneous estimation of model parameters and model complexity. An interesting feature of this approach is that it also leads to an automatic choice of model complexity. Empirical results from the analysis of hidden Markov models with Gaussian observation densities illustrate this. If the variational algorithm is initialized with a large number of hidden states, redundant states are eliminated as the method converges to a solution, thereby leading to a selection of the number of hidden states. In addition, through the use of a variational approximation, the deviance information criterion for Bayesian model selection can be extended to the hidden Markov model framework. Calculation of the deviance information criterion provides a further tool for model selection, which can be used in conjunction with the variational approach.

The purpose of this paper is threefold. First, we obtain the asymptotic properties of the modified model selection criteria proposed by Hurvich et al. (1990. Improved estimators of Kullback-Leibler information for autoregressive model selection in small samples. Biometrika 77, 709–719) for autoregressive models. Second, we highlight the better performance of these modified criteria. Third, we extend the modification introduced by these authors to model selection criteria commonly used in the class of self-exciting threshold autoregressive (SETAR) time series models. We show the improvements of the modified criteria in their finite sample performance. In particular, for small and medium sample sizes the frequency of selecting the true model improves for the consistent criteria, and the root mean square error (RMSE) of prediction improves for the efficient criteria. These results are illustrated via simulation with SETAR models in which we assume that the threshold and the parameters are unknown.
