Similar documents
20 similar documents found (search time: 15 ms)
1.
We provide general conditions that ensure valid Laplace approximations to marginal likelihoods under model misspecification, and derive Bayesian information criteria that include all terms of order Op(1). Under the conditions of Theorem 1 of Lv and Liu [J. R. Statist. Soc. B, 76 (2014), 141–167] and a continuity condition on the prior densities, asymptotic expansions with error terms of order op(1) are derived for the log-marginal likelihoods of possibly misspecified generalized linear models. We present some numerical examples to illustrate the finite-sample performance of the proposed information criteria in misspecified models.
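The criteria above refine the familiar Laplace-approximation route from marginal likelihood to BIC. As a hedged sketch of that baseline only (standard BIC without the paper's Op(1) correction terms; the Gaussian models and simulated data are invented for illustration):

```python
import math
import random

def gaussian_loglik(data, mu, sigma2):
    # Log-likelihood of an i.i.d. N(mu, sigma2) sample.
    n = len(data)
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in data) / (2 * sigma2))

def bic(loglik, n_params, n_obs):
    # Standard BIC, i.e. the leading terms of the Laplace approximation
    # to -2 * log marginal likelihood; O_p(1) refinements are omitted.
    return -2 * loglik + n_params * math.log(n_obs)

random.seed(0)
data = [random.gauss(1.0, 1.0) for _ in range(200)]
n = len(data)

# Model A: mean fixed at 0, variance estimated by ML.
s2_a = sum(x * x for x in data) / n
bic_a = bic(gaussian_loglik(data, 0.0, s2_a), 1, n)

# Model B: mean and variance both estimated by ML.
mu_b = sum(data) / n
s2_b = sum((x - mu_b) ** 2 for x in data) / n
bic_b = bic(gaussian_loglik(data, mu_b, s2_b), 2, n)

print(bic_b < bic_a)  # the data have mean 1, so model B should win
```

Since the data are generated with mean 1, the richer model B attains a much higher likelihood and is preferred despite its extra parameter.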

2.
Model choice is one of the most crucial aspects of any statistical data analysis. It is well known that most models are only approximations to the true data-generating process, but among such approximations our goal is to select the 'best' one. Researchers typically consider a finite number of plausible models in statistical applications, and the related statistical inference depends on the chosen model. Hence, model comparison is required to identify the 'best' model among several candidate models. This article considers the problem of model selection for spatial data. Model selection for spatial models has been addressed in the literature using traditional information-criterion-based methods, even though such criteria were developed under the assumption of independent observations. We evaluate the performance of some popular model selection criteria via Monte Carlo simulation experiments using small to moderate samples. In particular, we compare the ability of some of the most popular information criteria, such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the corrected AIC, to select the true model. The ability of these criteria to select the correct model is evaluated under several scenarios. This comparison is made using various spatial covariance models ranging from stationary isotropic to nonstationary models.
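The three criteria compared in this study differ only in their penalty terms. A minimal sketch of the textbook formulas (the toy numbers below are invented, not from the paper's simulations):

```python
import math

def aic(loglik, k):
    # Akaike information criterion: -2 log L + 2k
    return -2 * loglik + 2 * k

def bic(loglik, k, n):
    # Bayesian information criterion: -2 log L + k log n
    return -2 * loglik + k * math.log(n)

def aicc(loglik, k, n):
    # Corrected AIC; the small-sample correction requires n > k + 1.
    return aic(loglik, k) + 2 * k * (k + 1) / (n - k - 1)

# With few observations, the AICc penalty noticeably exceeds the AIC penalty,
# which is exactly the small-sample regime the simulations above examine.
ll, k, n = -100.0, 5, 20
print(aic(ll, k), bic(ll, k, n), aicc(ll, k, n))
```

Note that all three assume an independent-observations likelihood, which is the mismatch the article probes when the data are spatially correlated.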

3.
As a natural successor of the information criteria AIC and ABIC, information criteria for Bayes models were developed by evaluating the bias of the log-likelihood of the predictive distribution as an estimate of its expected log-likelihood. Considering two specific situations for the true distribution, two information criteria, PIC1 and PIC2, are derived. Linear Gaussian cases are considered in detail, and evaluation at the maximum a posteriori estimator is also considered. A simple example of estimating the signal-to-noise ratio shows that PIC2 is a good approximation to the expected log-likelihood over the entire range of the signal-to-noise ratio, whereas PIC1 performs well only for smaller values of the variance ratio. For illustration, the problems of trend estimation and seasonal adjustment are considered. Examples show that the hyper-parameters estimated by the new criteria are usually closer to the best ones than those estimated by the ABIC.

4.
The K-prime and K-square distributions, which arise in the Bayesian predictive distributions of standard t and F tests, are investigated. They generalize the classical noncentral t and noncentral F distributions and admit several characterizations. Their moments and their probability density and distribution functions are made explicit.

5.
Construction methods for prior densities are investigated from a predictive viewpoint. Predictive densities for future observables are constructed using observed data. The joint distribution of future observables and observed data is assumed to belong to a parametric submodel of a multinomial model; future observables and data may be dependent. The discrepancy of a predictive density from the true conditional density of future observables given the observed data is evaluated by the Kullback–Leibler divergence. It is proved that limits of Bayesian predictive densities form an essentially complete class. Latent information priors are defined as priors maximizing the conditional mutual information between the parameter and the future observables given the observed data. Minimax predictive densities are constructed as limits of Bayesian predictive densities based on prior sequences converging to the latent information priors.

6.
The change-point problem for normal regression models is considered here as the problem of choosing either the hypothesis H0 of no change or one of the hypotheses Hi that one or more parameters change after the ith observation. The observations are often associated with a known increasing sequence τi (for example, τi is the date of the ith observation). It then seems natural to introduce a quadratic loss function involving (τi − τj)² for selecting Hi instead of the true hypothesis Hj. A Bayes optimal invariant procedure is derived within this framework and compared with previous proposals. When H0 is rejected, large errors may arise in the estimation of the change point. To get around this difficulty, another procedure is introduced whose main feature is to select one of the Hi's, when H0 is rejected, only if there is sufficient evidence in favour of this choice.

7.
We consider exact and approximate Bayesian computation in the presence of latent variables or missing data. Specifically, we explore the application of a posterior predictive distribution formula derived in Sweeting and Kharroubi (2003), which is a particular form of Laplace approximation, both as an importance function and as a proposal distribution. We show that this formula provides a stable importance function for use within poor man's data augmentation schemes, and that it can also be used as a proposal distribution within a Metropolis–Hastings algorithm for models that are not analytically tractable. We illustrate both uses for a censored regression model and a normal hierarchical model, with both normal and Student-t distributed random effects. Although the predictive distribution formula is motivated by regular asymptotic theory, the likelihood need not have a closed form or possess a local maximum.
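The second use above, an approximate density serving as an independence proposal in Metropolis–Hastings, can be sketched generically. The target and proposal below are hypothetical stand-ins (a standard normal target and a wider normal "approximation"), not the paper's predictive formula:

```python
import math
import random

def log_target(x):
    # Hypothetical intractable posterior: standard normal, up to a constant.
    return -0.5 * x * x

SCALE = 1.5  # proposal deliberately wider than the target, for stability

def log_proposal(x):
    # Approximate density (e.g. a Laplace-type approximation) used as an
    # independence proposal, up to a constant.
    return -0.5 * (x / SCALE) ** 2 - math.log(SCALE)

def independence_mh(n_iter, seed=1):
    random.seed(seed)
    x, samples = 0.0, []
    for _ in range(n_iter):
        y = random.gauss(0.0, SCALE)  # draw from the proposal
        log_alpha = (log_target(y) - log_target(x)
                     + log_proposal(x) - log_proposal(y))
        if random.random() < math.exp(min(0.0, log_alpha)):  # accept/reject
            x = y
        samples.append(x)
    return samples

s = independence_mh(20000)
mean = sum(s) / len(s)
print(mean)  # should be close to the target mean of 0
```

The acceptance ratio corrects for the mismatch between proposal and target, so the chain is exact even though the proposal is only an approximation; a heavier-tailed proposal than the target is the usual safeguard.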

8.
A supra-Bayesian (SB) wants to combine the information from a group of k experts to produce her distribution of a probability θ. Each expert reports his counts of what he believes are the numbers of successes and failures in a sequence of independent trials, each with success probability θ. These counts, used as a surrogate for each expert's own probability assessment (together with his associated level of confidence in his estimate), allow the SB to build various plausible conjugate models. Such models reflect her beliefs about the reliability of different experts and account for different possible patterns of overlap of information between them. The corresponding combination rules are then obtained, compared with other more established rules, and their properties examined.
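The simplest conjugate model of this kind treats the experts' counts as independent pseudo-data. A minimal sketch under that assumption (the counts below are invented; the paper's richer models allow overlap and differing reliability):

```python
def pooled_beta(experts, a0=1.0, b0=1.0):
    # Each expert i reports (s_i, f_i) successes/failures. Treating their
    # information as independent, a Beta(a0, b0) prior for theta updates
    # conjugately to Beta(a0 + sum s_i, b0 + sum f_i).
    a = a0 + sum(s for s, f in experts)
    b = b0 + sum(f for s, f in experts)
    return a, b

experts = [(8, 2), (7, 3), (9, 1)]  # hypothetical counts from 3 experts
a, b = pooled_beta(experts)
print(a, b, a / (a + b))  # posterior parameters and posterior mean of theta
```

An expert's confidence enters through the size of his counts: (80, 20) pulls the posterior far harder than (8, 2) even though both imply the same point estimate.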

9.
Bayesian predictive density functions, which are needed to obtain bounds for predictive intervals of future order statistics, are obtained when the population density is a finite mixture of general components. Such components include, among others, the Weibull (with the exponential and Rayleigh as special cases), compound Weibull (three-parameter Burr type XII), Pareto, beta, Gompertz and compound Gompertz distributions. The prior belief of the experimenter is measured by a general distribution suggested by AL-Hussaini (J. Statist. Plann. Inf. 79 (1999b) 79). Applications to finite mixtures of Weibull and Burr type XII components are illustrated, and comparisons are made, in the special cases of exponential and Pareto type II components, with previous results.

10.
This article deals with model comparison as an essential part of generalized linear modelling in the presence of covariates missing not at random (MNAR). We evaluate the performance of some popular model selection criteria, particularly the deviance information criterion (DIC) and the weighted L (WL) measure, for comparison among a set of candidate MNAR models. In addition, we provide deviance- and quadratic-loss-based model selection criteria with alternative penalty terms targeting the MNAR models directly. This work is motivated by the need in the literature to understand the performance of these important model selection criteria for comparing MNAR models. A Monte Carlo simulation experiment is designed to assess their finite-sample performance under different scenarios for the amount of missingness. Some naturally derived DIC and WL extensions are also discussed and evaluated.

11.
The World Health Organization (WHO) diagnostic criteria for diabetes mellitus were determined in part by evidence that in some populations the plasma glucose level 2 h after an oral glucose load is a mixture of two distinct distributions. We present a finite mixture model that allows the two component densities to be generalized linear models and the mixture probability to be a logistic regression model. The model allows us to estimate the prevalence of diabetes, and the sensitivity and specificity of the diagnostic criteria, as functions of covariates, and to estimate them in the absence of an external standard. Sensitivity is the probability that a test indicates disease given that disease is present; specificity is the probability that a test indicates no disease given that no disease is present. We obtained maximum likelihood estimates via the EM algorithm and derived the standard errors from the information matrix and by the bootstrap. In the application to data from the diabetes in Egypt project, a two-component mixture model fits well, and the two components are interpreted as normal and diabetic. The means and variances are similar to results found in other populations. The minimum misclassification cutpoints decrease with age, are lower in urban areas and are higher in rural areas than the 200 mg dl−1 cutpoint recommended by the WHO. These differences are modest and our results generally support the WHO criterion. Our methods allow the direct inclusion of concomitant data, whereas past analyses were based on partitioning the data.
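The EM fit at the heart of this analysis can be sketched for the plainest special case, a two-component univariate Gaussian mixture with no covariates (the glucose-like numbers below are invented; the paper's model puts generalized linear models inside each component):

```python
import math
import random

def em_two_normals(data, n_iter=200):
    # EM for a two-component univariate Gaussian mixture (a generic sketch).
    mu1, mu2 = min(data), max(data)
    s1 = s2 = (max(data) - min(data)) / 4
    pi = 0.5
    for _ in range(n_iter):
        # E-step: posterior responsibility of component 1 for each point.
        r = []
        for x in data:
            d1 = pi * math.exp(-0.5 * ((x - mu1) / s1) ** 2) / s1
            d2 = (1 - pi) * math.exp(-0.5 * ((x - mu2) / s2) ** 2) / s2
            r.append(d1 / (d1 + d2))
        # M-step: responsibility-weighted updates of all parameters.
        n1 = sum(r)
        n2 = len(data) - n1
        mu1 = sum(ri * x for ri, x in zip(r, data)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / n2
        s1 = math.sqrt(sum(ri * (x - mu1) ** 2 for ri, x in zip(r, data)) / n1)
        s2 = math.sqrt(sum((1 - ri) * (x - mu2) ** 2 for ri, x in zip(r, data)) / n2)
        pi = n1 / len(data)
    return mu1, mu2, pi

random.seed(3)
data = ([random.gauss(100, 10) for _ in range(300)]    # "normal" component
        + [random.gauss(250, 30) for _ in range(100)])  # "diabetic" component
mu1, mu2, pi = em_two_normals(data)
print(mu1, mu2, pi)
```

With component means recovered, a misclassification cutpoint is simply the glucose value at which the two weighted component densities cross, mirroring the cutpoint analysis described above.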

12.
In this paper, we extend the focused information criterion (FIC) to copula models. Copulas are often used in applications where the joint tail behavior of the variables is of particular interest, and selecting a copula that captures this well is then essential. Traditional model selection methods such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) aim at finding the overall best-fitting model, which is not necessarily the one best suited for the application at hand. The FIC, on the other hand, evaluates and ranks candidate models based on the precision of their point estimates of a context-given focus parameter. This could be any quantity of particular interest, for example the mean, a correlation, conditional probabilities, or measures of tail dependence. We derive FIC formulae for the maximum likelihood estimator, the two-stage maximum likelihood estimator, and the so-called pseudo-maximum-likelihood (PML) estimator combined with parametric margins, and we confirm the validity of the AIC formula for the PML estimator combined with parametric margins. To study the numerical behavior of the FIC, we carried out a simulation study, and we also analyzed a multivariate data set on abalones. The results show that the FIC successfully ranks candidate models by how well they estimate the focus parameter. In terms of estimation precision, the FIC clearly outperforms the AIC, especially when the focus parameter relates to only a specific part of the model, such as the conditional upper-tail probability.

13.
The Heston-STAR model is a new class of stochastic volatility models, defined by generalizing the Heston model to allow the volatility of the volatility process, as well as the correlation between asset log-returns and variance shocks, to change across regimes via smooth transition autoregressive (STAR) functions. The form of the STAR functions is very flexible, much more so than the functions introduced in Jones (J Econom 116:181–224, 2003), and provides the framework for a wide range of stochastic volatility models. A Bayesian inference approach using data augmentation techniques is used to estimate the parameters of our model, and we also assess its goodness of fit. Our analysis of the S&P 500 and the VIX index demonstrates that the Heston-STAR model copes better with large market fluctuations (such as those in 2008) than the standard Heston model.

14.
The goal of this paper is to compare several widely used Bayesian model selection methods on practical model selection problems, highlight their differences, and give recommendations about the preferred approaches. We focus on variable subset selection for regression and classification and perform several numerical experiments using both simulated and real-world data. The results show that optimizing a utility estimate such as the cross-validation (CV) score is prone to finding overfitted models, owing to the relatively high variance of the utility estimates when data are scarce. This can also lead to substantial selection-induced bias and optimism in the performance evaluation of the selected model. From a predictive viewpoint, the best results are obtained by accounting for model uncertainty by forming the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information in the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on the CV score. Overall, the projection method also appears to outperform the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that model selection can greatly benefit from using cross-validation outside the search process, both for guiding the choice of model size and for assessing the predictive performance of the finally selected model.
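The selection-induced optimism described above is easy to demonstrate in miniature. In this invented toy (not one of the paper's experiments), every candidate "model" truly has 50% accuracy on pure-noise labels, yet the best estimated accuracy among many candidates looks far better than 50%:

```python
import random

random.seed(7)
n_cases, n_models = 30, 100

# Pure-noise binary labels: no model can truly beat 50% accuracy.
labels = [random.randint(0, 1) for _ in range(n_cases)]

best_est = 0.0
for _ in range(n_models):
    # Each candidate "model" just guesses at random.
    preds = [random.randint(0, 1) for _ in range(n_cases)]
    acc = sum(p == y for p, y in zip(preds, labels)) / n_cases
    best_est = max(best_est, acc)

print(best_est)  # well above the true accuracy of 0.5
```

The maximized score is biased upward purely because it was selected over; this is why the study recommends keeping a cross-validation layer outside the search when reporting the performance of the chosen model.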

15.
The inverse gamma-Pareto composite distribution is considered as a model for heavy-tailed data. The maximum likelihood (ML), smoothed empirical percentile (SM), and Bayes estimators (informative and non-informative) of the parameter θ, the boundary point between the supports of the two component distributions, are derived. A Bayesian predictive density is derived via a gamma prior for θ and used to estimate risk measures. The accuracy of the estimators of θ and of the risk measures is assessed via simulation studies. It is shown that the informative Bayes estimator is consistently more accurate than the ML, smoothed, and non-informative Bayes estimators.

16.
Predictive accuracy is a traditional basis for comparing models. A natural way to approximate out-of-sample predictive accuracy is leave-one-out cross-validation (LOOCV): we hold out each case in turn from the full dataset, train a Bayesian model by Markov chain Monte Carlo without the held-out case, and finally evaluate the posterior predictive distribution of each held-out case at its actual observation. However, actual LOOCV is time-consuming. This paper introduces two methods, iIS and iWAIC, for approximating LOOCV using only Markov chain samples simulated from the posterior based on the full dataset. iIS and iWAIC aim at improving the approximations given by importance sampling (IS) and WAIC in Bayesian models with possibly correlated latent variables. In iIS and iWAIC, we first integrate the predictive density over the distribution of the latent variables associated with the held-out case, without reference to its observation, and then apply the IS and WAIC approximations to the integrated predictive density. We compare iIS and iWAIC with other approximation methods in three kinds of models: finite mixture models, models with correlated spatial effects, and a random-effect logistic regression model. Our empirical results show that iIS and iWAIC give substantially better approximations than non-integrated IS and WAIC and other methods.
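The non-integrated IS baseline that iIS improves on can be sketched directly: the LOO predictive density for case i is estimated as the harmonic mean of p(y_i | θ_s) over full-data posterior draws. The conjugate normal-mean setup below is an invented illustration with no latent variables:

```python
import math
import random

def is_loo_log_score(y, theta_draws, loglik):
    # Plain importance-sampling LOO: for each case i,
    #   p(y_i | y_-i) ≈ 1 / mean_s[ 1 / p(y_i | theta_s) ],
    # with theta_s drawn from the full-data posterior.
    total = 0.0
    for yi in y:
        inv_dens = [math.exp(-loglik(yi, t)) for t in theta_draws]
        total += -math.log(sum(inv_dens) / len(inv_dens))
    return total  # sum of estimated log LOO predictive densities

def normal_loglik(yi, mu):
    # log N(yi | mu, 1)
    return -0.5 * (yi - mu) ** 2 - 0.5 * math.log(2 * math.pi)

random.seed(11)
y = [random.gauss(0.0, 1.0) for _ in range(20)]
ybar = sum(y) / len(y)
# Posterior draws for the mean under a flat prior: N(ybar, 1/n).
draws = [random.gauss(ybar, 1 / math.sqrt(len(y))) for _ in range(4000)]
score = is_loo_log_score(y, draws, normal_loglik)
print(score)
```

With correlated latent variables, these importance weights become unstable; integrating the latents out of the predictive density first, as iIS does, is what restores a well-behaved estimator.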

17.
Subset selection is an extensively studied problem in statistical learning, and it is particularly popular in regression analysis. The problem has received considerable attention for generalized linear models as well as for other types of regression, and quantile regression is one of the most widely used of these. In this article, we consider the subset selection problem for quantile regression, adopting some recent Bayesian information criteria, and we use heuristic optimization during the selection process. Simulation and real-data application results demonstrate the capability of these information criteria: they can effectively identify the true models in quantile regression.

18.
Polytomous item response theory (IRT) models are used by specialists to score assessments and questionnaires whose items have multiple response categories. In this article, we study the performance of five model comparison criteria for comparing the fit of the graded response and generalized partial credit models on the same dataset when the choice between the two is unclear. A simulation study is conducted to analyze the sensitivity to priors and to compare the performance of the criteria using the No-U-Turn Sampler algorithm under a Bayesian approach. The results are used to select a model for an application to mental health data.

19.
Both knowledge-based systems and statistical models are typically concerned with making predictions about future observables. Here we focus on the assessment of predictive performance and provide two techniques for improving the predictive performance of Bayesian graphical models. First, we present Bayesian model averaging, a technique for accounting for model uncertainty. Second, we describe a technique for eliciting a prior distribution over competing models from domain experts. We explore the predictive performance of both techniques in the context of a urological diagnostic problem.

20.
It is common practice to compare the fit of non-nested models using the Akaike (AIC) or Bayesian (BIC) information criteria. The basis of these criteria is the log-likelihood evaluated at the maximum likelihood estimates of the unknown parameters. For the general linear model (and the linear mixed model, which is a special case), estimation is usually carried out using residual or restricted maximum likelihood (REML). However, for models with different fixed effects, the residual likelihoods are not comparable, and hence information criteria based on the residual likelihood cannot be used. For model selection, it is often suggested that the models be refitted using maximum likelihood to enable the criteria to be used. The first aim of this paper is to highlight that both the AIC and the BIC can be used for the general linear model by using the full log-likelihood evaluated at the REML estimates. The second aim is to provide a derivation of the criteria under REML estimation. This is achieved by noting that the full likelihood can be decomposed into a marginal (residual) and a conditional likelihood, a decomposition that incorporates aspects of both the fixed effects and the variance parameters. Using this decomposition, the appropriate information criteria for selecting among models that differ in their fixed-effects specification can be derived. An example is presented to illustrate the results, and code is available for analyses using the ASReml-R package.
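The ML/REML distinction above already shows up in the simplest linear model y_i = μ + e_i, where ML divides the residual sum of squares by n and REML by n − p. A hedged toy sketch (invented data; not the paper's mixed-model derivation) of evaluating the full log-likelihood at each set of estimates and forming AIC:

```python
import math
import random

random.seed(5)
y = [random.gauss(10.0, 2.0) for _ in range(50)]
n, p = len(y), 1  # one fixed effect: the intercept mu

mu = sum(y) / n
ss = sum((v - mu) ** 2 for v in y)
s2_ml = ss / n          # ML variance estimate
s2_reml = ss / (n - p)  # REML variance estimate

def full_loglik(s2):
    # Full Gaussian log-likelihood at (mu, s2), the quantity the paper
    # evaluates at the REML estimates to build comparable criteria.
    return -0.5 * n * math.log(2 * math.pi * s2) - ss / (2 * s2)

k = 2  # parameters counted: mu and the variance
aic_ml = -2 * full_loglik(s2_ml) + 2 * k
aic_reml = -2 * full_loglik(s2_reml) + 2 * k
print(aic_ml, aic_reml)  # ML maximizes the full likelihood, so aic_ml < aic_reml
```

The gap between the two AIC values is small here and shrinks with n; the paper's point is that the REML-based version remains valid for comparing fixed-effects specifications once the full (not residual) log-likelihood is used.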


Copyright © Beijing Qinyun Technology Development Co., Ltd. (京ICP备09084417号)