Similar Documents
20 similar documents found.
1.
We analyze the computational efficiency of approximate Bayesian computation (ABC), which approximates a likelihood function by drawing pseudo-samples from the associated model. For the rejection sampling version of ABC, it is known that multiple pseudo-samples cannot substantially increase (and can substantially decrease) the efficiency of the algorithm as compared to employing a high-variance estimate based on a single pseudo-sample. We show that this conclusion also holds for a Markov chain Monte Carlo version of ABC, implying that it is unnecessary to tune the number of pseudo-samples used in ABC-MCMC. This conclusion is in contrast to particle MCMC methods, for which increasing the number of particles can provide large gains in computational efficiency.
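A minimal sketch of the ABC-MCMC setting discussed above, assuming a toy model (data ~ N(theta, 1), sample mean as summary) and a flat prior; all function names and tuning values are illustrative only. The number of pseudo-samples m enters through a simple hit-rate likelihood estimate, so the single-pseudo-sample case is m = 1:

```python
# ABC-MCMC sketch: the acceptance ratio uses the fraction of m
# pseudo-samples whose summary lands within eps of the observed one.
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, n=50):
    return rng.normal(theta, 1.0, size=n)

def summary(y):
    return y.mean()

def abc_mcmc(y_obs, n_iter=5000, eps=0.1, m=1, prop_sd=0.5):
    s_obs = summary(y_obs)
    theta = 0.0
    # Unnormalised ABC likelihood estimate based on m pseudo-samples.
    def like_hat(th):
        return np.mean([abs(summary(simulate(th)) - s_obs) < eps
                        for _ in range(m)])
    l_cur = like_hat(theta)
    draws = []
    for _ in range(n_iter):
        th_new = theta + prop_sd * rng.normal()
        l_new = like_hat(th_new)
        # Flat prior, symmetric proposal: accept with the ratio of
        # likelihood estimates; the current estimate is reused, not
        # recomputed, which is what makes the chain pseudo-marginal.
        if l_new > 0 and (l_cur == 0 or rng.uniform() < l_new / l_cur):
            theta, l_cur = th_new, l_new
        draws.append(theta)
    return np.array(draws)

draws = abc_mcmc(simulate(1.5), m=1)
```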

2.

Approximate Bayesian computation (ABC) has become one of the major tools of likelihood-free statistical inference in complex mathematical models. Simultaneously, stochastic differential equations (SDEs) have developed into an established tool for modelling time-dependent, real-world phenomena with underlying random effects. When applying ABC to stochastic models, two major difficulties arise: First, the derivation of effective summary statistics and proper distances is particularly challenging, since simulations from the stochastic process under the same parameter configuration result in different trajectories. Second, exact simulation schemes to generate trajectories from the stochastic model are rarely available, requiring the derivation of suitable numerical methods for the synthetic data generation. To obtain summaries that are less sensitive to the intrinsic stochasticity of the model, we propose to build up the statistical method (e.g. the choice of the summary statistics) on the underlying structural properties of the model. Here, we focus on the existence of an invariant measure and we map the data to their estimated invariant density and invariant spectral density. Then, to ensure that these model properties are kept in the synthetic data generation, we adopt measure-preserving numerical splitting schemes. The derived property-based and measure-preserving ABC method is illustrated on the broad class of partially observed Hamiltonian-type SDEs, both with simulated data and with real electroencephalography data. The derived summaries are particularly robust to the model simulation, and this fact, combined with the proposed reliable numerical scheme, yields accurate ABC inference. In contrast, the inference returned using standard numerical methods (Euler–Maruyama discretisation) fails. The proposed ingredients can be incorporated into any type of ABC algorithm and directly applied to all SDEs that are characterised by an invariant distribution and for which a measure-preserving numerical method can be derived.
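The following sketch illustrates the property-based summaries described above, assuming a one-dimensional trajectory: each path is mapped to a kernel estimate of its invariant density on a fixed grid and to a normalised periodogram as a crude invariant spectral density estimate. Grid limits, smoothing and the combined L1/L2 distance are illustrative choices, not the authors' exact pipeline:

```python
# Structure-based ABC summaries: estimated invariant density (KDE on a
# fixed grid) plus estimated spectral density (normalised periodogram);
# simulated and observed trajectories are compared through distances
# between these estimates. Assumes equal-length 1-D trajectories.
import numpy as np
from scipy.stats import gaussian_kde

GRID = np.linspace(-5, 5, 200)

def invariant_density_summary(x):
    return gaussian_kde(x)(GRID)          # estimated invariant density

def spectral_summary(x):
    p = np.abs(np.fft.rfft(x - x.mean())) ** 2 / len(x)
    return p / p.sum()                    # normalised periodogram

def distance(x_sim, x_obs):
    d1 = np.trapz(np.abs(invariant_density_summary(x_sim)
                         - invariant_density_summary(x_obs)), GRID)
    d2 = np.linalg.norm(spectral_summary(x_sim) - spectral_summary(x_obs))
    return d1 + d2
```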


3.
Approximate Bayesian computation (ABC) methods permit approximate inference for intractable likelihoods when it is possible to simulate from the model. However, they perform poorly for high-dimensional data and in practice must usually be used in conjunction with dimension reduction methods, resulting in a loss of accuracy which is hard to quantify or control. We propose a new ABC method for high-dimensional data based on rare event methods which we refer to as RE-ABC. This uses a latent variable representation of the model. For a given parameter value, we estimate the probability of the rare event that the latent variables correspond to data roughly consistent with the observations. This is performed using sequential Monte Carlo and slice sampling to systematically search the space of latent variables. In contrast, standard ABC can be viewed as using a more naive Monte Carlo estimate. We use our rare event probability estimator as a likelihood estimate within the pseudo-marginal Metropolis–Hastings algorithm for parameter inference. We provide asymptotics showing that RE-ABC has a lower computational cost for high-dimensional data than standard ABC methods. We also illustrate our approach empirically, on a Gaussian distribution and an application in infectious disease modelling.
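RE-ABC plugs its rare-event probability estimate into pseudo-marginal Metropolis–Hastings. The sketch below shows that outer algorithm in generic form, with `like_hat` standing in for any non-negative unbiased likelihood estimator (in RE-ABC it would be the SMC/slice-sampling rare-event estimator); all names are placeholders:

```python
# Generic pseudo-marginal Metropolis-Hastings sketch.
import numpy as np

rng = np.random.default_rng(1)

def pseudo_marginal_mh(like_hat, log_prior, theta0, n_iter=2000, prop_sd=0.3):
    theta, l_cur = theta0, like_hat(theta0)
    chain = []
    for _ in range(n_iter):
        prop = theta + prop_sd * rng.normal(size=np.shape(theta))
        l_new = like_hat(prop)
        log_ratio = (np.log(l_new + 1e-300) - np.log(l_cur + 1e-300)
                     + log_prior(prop) - log_prior(theta))
        if np.log(rng.uniform()) < log_ratio:
            # The estimate is stored and reused, never recomputed at
            # the current point: this keeps the chain exact for the
            # target implied by the unbiased likelihood estimator.
            theta, l_cur = prop, l_new
        chain.append(theta)
    return np.array(chain)
```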

4.
We investigate simulation methodology for Bayesian inference in Lévy-driven stochastic volatility (SV) models. Typically, Bayesian inference from such models is performed using Markov chain Monte Carlo (MCMC); this is often a challenging task. Sequential Monte Carlo (SMC) samplers are methods that can improve over MCMC; however, there are many user-set parameters to specify. We develop a fully automated SMC algorithm, which substantially improves over the standard MCMC methods in the literature. To illustrate our methodology, we look at a model comprising a Heston model with an independent, additive, variance gamma process in the returns equation. The driving gamma process can capture the stylized behaviour of many financial time series, and a discretized version, fit in a Bayesian manner, has been found to be very useful for modelling equity data. We demonstrate that it is possible to draw exact inference, in the sense of no time-discretization error, from the Bayesian SV model.
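For orientation, here is a generic likelihood-tempered SMC sampler skeleton of the kind such algorithms build on, not the paper's fully automated method: `loglike`, `log_prior` and `sample_prior` are hypothetical placeholders for the SV model components, assumed vectorised over a one-dimensional parameter:

```python
# Likelihood-tempered SMC sampler sketch: reweight along a tempering
# schedule, resample, then apply one random-walk MH move per particle
# targeting prior x likelihood^t.
import numpy as np

rng = np.random.default_rng(2)

def smc_sampler(loglike, log_prior, sample_prior, n_part=500,
                temps=np.linspace(0, 1, 21), prop_sd=0.2):
    theta = sample_prior(n_part)                 # particles from the prior
    logw = np.zeros(n_part)
    for t0, t1 in zip(temps[:-1], temps[1:]):
        logw += (t1 - t0) * loglike(theta)       # incremental reweighting
        w = np.exp(logw - logw.max()); w /= w.sum()
        idx = rng.choice(n_part, n_part, p=w)    # multinomial resampling
        theta, logw = theta[idx], np.zeros(n_part)
        prop = theta + prop_sd * rng.normal(size=theta.shape)
        log_acc = (t1 * (loglike(prop) - loglike(theta))
                   + log_prior(prop) - log_prior(theta))
        accept = np.log(rng.uniform(size=n_part)) < log_acc
        theta[accept] = prop[accept]
    return theta
```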

5.
In segmentation problems, inference on change-point position and model selection are two difficult issues due to the discrete nature of change-points. In a Bayesian context, we derive exact, explicit and tractable formulae for the posterior distribution of variables such as the number of change-points or their positions. We also demonstrate that several classical Bayesian model selection criteria can be computed exactly. All these results are based on an efficient strategy to explore the whole segmentation space, which is very large. We illustrate our methodology on both simulated data and a comparative genomic hybridization profile.
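A minimal sketch of the dynamic-programming idea behind exact exploration of the segmentation space, assuming independent segments with a tractable per-segment marginal likelihood (a conjugate Gaussian example is used for illustration). Probabilities are accumulated on the natural scale, so log-domain arithmetic and precomputed cumulative sums would be needed for realistic series lengths:

```python
# Exact posterior over the number of segments: A[k][j] sums the
# likelihood of all ways of splitting y[:j] into k segments, in
# O(K n^2) recursion steps.
import numpy as np

def seg_ml(y, i, j, tau2=10.0, sigma2=1.0):
    # Marginal likelihood of segment y[i:j] with mean ~ N(0, tau2)
    # and noise variance sigma2 (conjugate Gaussian, illustrative).
    s = y[i:j]; n = len(s)
    v = 1.0 / (n / sigma2 + 1.0 / tau2)
    m = v * s.sum() / sigma2
    return ((2 * np.pi * sigma2) ** (-n / 2) * np.sqrt(v / tau2)
            * np.exp(-0.5 * (s @ s) / sigma2 + 0.5 * m ** 2 / v))

def n_segments_posterior(y, k_max=5):
    n = len(y)
    A = np.zeros((k_max + 1, n + 1)); A[0, 0] = 1.0
    for k in range(1, k_max + 1):
        for j in range(k, n + 1):
            A[k, j] = sum(A[k - 1, i] * seg_ml(y, i, j)
                          for i in range(k - 1, j))
    evid = A[1:, n]                  # summed likelihood for k = 1..k_max
    return evid / evid.sum()         # posterior under a uniform prior on k
```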

6.
Bayesian statistical inference relies on the posterior distribution. Depending on the model, the posterior can be more or less difficult to derive. In recent years, there has been a lot of interest in complex settings where the likelihood is analytically intractable. In such situations, approximate Bayesian computation (ABC) provides an attractive way of carrying out Bayesian inference. For obtaining reliable posterior estimates, however, it is important to keep the approximation errors in ABC small. The choice of an appropriate set of summary statistics plays a crucial role in this effort. Here, we report the development of a new algorithm that is based on least angle regression for choosing summary statistics. In two population genetic examples, the performance of the new algorithm is better than that of a previously proposed approach that uses partial least squares.
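A hedged sketch of the least-angle-regression idea: from pilot simulations, regress the parameter on the candidate summaries and keep statistics in their order of entry into the LARS path. The standardisation and cutoff are illustrative, not the authors' exact procedure:

```python
# LARS-based summary selection sketch using scikit-learn's lars_path.
import numpy as np
from sklearn.linear_model import lars_path

def select_summaries(S_pilot, theta_pilot, n_keep=5):
    # S_pilot: (n_sims, n_candidate_stats) summaries of pilot simulations
    # theta_pilot: (n_sims,) parameter values that generated them
    S = (S_pilot - S_pilot.mean(0)) / S_pilot.std(0)
    _, active, _ = lars_path(S, theta_pilot, method="lar")
    return active[:n_keep]   # indices of the first n_keep stats to enter
```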

7.
Both approximate Bayesian computation (ABC) and composite likelihood methods are useful for Bayesian and frequentist inference, respectively, when the likelihood function is intractable. We propose to use composite likelihood score functions as summary statistics in ABC in order to obtain accurate approximations to the posterior distribution. This is motivated by the use of the score function of the full likelihood, and extended to general unbiased estimating functions in complex models. Moreover, we show that if the composite score is suitably standardised, the resulting ABC procedure is invariant to reparameterisations and automatically adjusts the curvature of the composite likelihood, and of the corresponding posterior distribution. The method is illustrated through examples with simulated data, and an application to modelling of spatial extreme rainfall data is discussed.
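To make the idea concrete, the sketch below computes a composite (pairwise) likelihood score for an AR(1)-style model, where consecutive observations form tractable bivariate normal pairs; the resulting scalar score, evaluated at a pilot value, is what would be compared in the ABC distance. The model choice and the finite-difference derivative are illustrative assumptions:

```python
# Pairwise composite likelihood and its score for consecutive pairs of
# a stationary series with unit-lag correlation rho (|rho| < 1 assumed).
import numpy as np

def pair_loglik(y, rho, sigma2=1.0):
    # Sum of bivariate normal log-densities over consecutive pairs.
    a, b = y[:-1], y[1:]
    q = (a**2 - 2 * rho * a * b + b**2) / (1 - rho**2)
    return np.sum(-np.log(2 * np.pi * sigma2) - 0.5 * np.log(1 - rho**2)
                  - 0.5 * q / sigma2)

def composite_score(y, rho0, h=1e-5):
    # Numerical derivative of the pairwise log-likelihood at rho0: this
    # scalar is the summary statistic used in the ABC comparison.
    return (pair_loglik(y, rho0 + h) - pair_loglik(y, rho0 - h)) / (2 * h)
```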

8.
Bayesian Semiparametric Regression for Median Residual Life
With survival data there is often interest not only in the survival time distribution but also in the residual survival time distribution. In fact, regression models to explain residual survival time might be desired. Building upon recent work of Kottas & Gelfand [J. Amer. Statist. Assoc. 96 (2001) 1458], we formulate a semiparametric median residual life regression model induced by a semiparametric accelerated failure time regression model. We utilize a Bayesian approach which allows full and exact inference. Classical work essentially ignores covariates and is either based upon parametric assumptions or is limited to asymptotic inference in non-parametric settings. No regression modelling of median residual life appears to exist. The Bayesian modelling is developed through Dirichlet process mixing. The models are fitted using Gibbs sampling. Residual life inference is implemented extending the approach of Gelfand & Kottas [J. Comput. Graph. Statist. 11 (2002) 289]. Finally, we present a fairly detailed analysis of a set of survival times with moderate censoring for patients with small cell lung cancer.

9.
Despite their popularity and importance, there is limited work on using finite mixture models for data that come from a complex survey design. In this work, we explored the use of finite mixture regression models when the samples were drawn using a complex survey design. In particular, we considered modelling data collected under a stratified sampling design. We developed a new design-based inference in which we integrated sampling weights into the complete-data log-likelihood function. The expectation–maximisation algorithm was developed accordingly. A simulation study was conducted to compare the new methodology with the usual finite mixture of a regression model, using the bias-variance components of mean square error. Additionally, a simulation study was conducted to assess the ability of the Bayesian information criterion to select the optimal number of components under the proposed modelling approach. The methodology was implemented on real data with good results.
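A compact sketch of the weighted-EM idea for a two-component mixture of linear regressions: the survey weights w_i multiply the complete-data log-likelihood, so they appear in every E- and M-step quantity. The component count and homoscedastic normal errors are illustrative simplifications, not the paper's full specification:

```python
# Design-based EM for a mixture of linear regressions with survey weights.
import numpy as np

def weighted_em_mixreg(X, y, w, n_comp=2, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = rng.normal(size=(n_comp, p)); sig2 = np.ones(n_comp)
    pi = np.full(n_comp, 1.0 / n_comp)
    for _ in range(n_iter):
        # E-step: posterior component responsibilities.
        dens = np.stack([np.exp(-0.5 * (y - X @ beta[k])**2 / sig2[k])
                         / np.sqrt(2 * np.pi * sig2[k])
                         for k in range(n_comp)])
        r = pi[:, None] * dens; r /= r.sum(0)
        # M-step: every sum is weighted by responsibilities AND weights.
        for k in range(n_comp):
            wk = w * r[k]
            beta[k] = np.linalg.solve(X.T @ (wk[:, None] * X), X.T @ (wk * y))
            res = y - X @ beta[k]
            sig2[k] = (wk * res**2).sum() / wk.sum()
        pi = (w * r).sum(1) / w.sum()
    return beta, sig2, pi
```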

10.
For nearly any challenging scientific problem, evaluation of the likelihood is problematic, if not impossible. Approximate Bayesian computation (ABC) allows us to employ the whole Bayesian formalism for problems where we can simulate from a model but cannot evaluate the likelihood directly. When summary statistics of real and simulated data are compared, rather than the data directly, information is lost unless the summary statistics are sufficient. Sufficient statistics are, however, rarely available, and without them ABC inferences must be treated with caution. Other authors have previously attempted to combine different statistics in order to construct (approximately) sufficient statistics using search and information heuristics. Here we employ an information-theoretical framework that can be used to construct appropriate (approximately sufficient) statistics by combining different statistics until the loss of information is minimized. We start from a potentially large number of different statistics and choose the smallest set that captures (nearly) the same information as the complete set. We then demonstrate that such sets of statistics can be constructed for both parameter estimation and model selection problems, and we apply our approach to a range of illustrative and real-world model selection problems.
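The simplified sketch below ranks candidate statistics by their estimated mutual information with the parameter using a k-NN estimator; unlike the joint, redundancy-aware criterion described above, it scores statistics marginally, which is a deliberate simplification:

```python
# Marginal mutual-information ranking of candidate ABC summaries.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def rank_summaries(S, theta, min_mi=0.05):
    # S: (n_sims, n_stats) summaries of pilot simulations;
    # theta: (n_sims,) parameter values that produced them.
    mi = mutual_info_regression(S, theta)
    order = np.argsort(mi)[::-1]
    return [k for k in order if mi[k] > min_mi]  # informative stats only
```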

11.
Methodology for Bayesian inference is considered for a stochastic epidemic model which permits mixing on both local and global scales. Interest focuses on estimation of the within- and between-group transmission rates given data on the final outcome. The model is sufficiently complex that the likelihood of the data is numerically intractable. To overcome this difficulty, an appropriate latent variable is introduced, about which asymptotic information is known as the population size tends to infinity. This yields a method for approximate inference for the true model. The methods are applied to real data, tested with simulated data, and also applied to a simple epidemic model for which exact results are available for comparison.

12.
Bayesian networks are not well formulated for continuous variables, and the majority of recent works on Bayesian inference are restricted to special types of continuous variables, such as the conditional linear Gaussian model for Gaussian variables. In this context, an exact Bayesian inference algorithm for clusters of continuous variables that may be approximated by independent component analysis models is proposed. The complexity in memory space is linear and the overfitting problem is attenuated, while the inference time is still exponential. Experiments on multibiometric score fusion with quality estimates are conducted, and the performance is satisfactory compared to some known fusion techniques.

13.
We present a Bayesian semiparametric approach to exponential family regression that extends the class of generalized linear regression models. Further, flexibility in the process of modelling is achieved by explicitly accounting for the discrepancy between the ‘true’ response-covariate regression surface and an assumed parametric functional relationship. An approximate full Bayesian analysis is provided, based upon the Gibbs sampling algorithm.

14.
As is the case in many studies, the data collected are limited, and an exact value is recorded only if it falls within an interval range; hence, the responses can be left, interval or right censored. Linear (and nonlinear) regression models are routinely used to analyze these types of data and are based on normality assumptions for the error terms. However, such analyses might not provide robust inference when the normality assumptions are questionable. In this article, we develop a Bayesian framework for censored linear regression models by replacing the Gaussian assumptions for the random errors with scale mixtures of normal (SMN) distributions. The SMN is an attractive class of symmetric heavy-tailed densities that includes the normal, Student-t, Pearson type VII, slash and contaminated normal distributions as special cases. Using a Bayesian paradigm, an efficient Markov chain Monte Carlo algorithm is introduced to carry out posterior inference. A new hierarchical prior distribution is suggested for the degrees-of-freedom parameter in the Student-t distribution. The likelihood function is used not only to compute some Bayesian model selection measures but also to develop Bayesian case-deletion influence diagnostics based on the q-divergence measure. The proposed Bayesian methods are implemented in the R package BayesCR. The newly developed procedures are illustrated with applications using real and simulated data.

15.
A general framework is presented for Bayesian inference of multivariate time series exhibiting long-range dependence. The series are modelled using a vector autoregressive fractionally integrated moving-average (VARFIMA) process, which can capture both short-term correlation structure and long-range dependence characteristics of the individual series, as well as interdependence and feedback relationships between the series. To facilitate a sampling-based Bayesian approach, the exact joint posterior density is derived for the parameters, in a form that is computationally simpler than direct evaluation of the likelihood, and a modified Gibbs sampling algorithm is used to generate samples from the complete conditional distribution associated with each parameter. The paper also shows how an approximate form of the joint posterior density may be used for long time series. The procedure is illustrated using sea surface temperatures measured at three locations along the central California coast. These series are believed to be interdependent due to similarities in local atmospheric conditions at the different locations, and previous studies have found that they exhibit ‘long memory’ when studied individually. The approach adopted here permits investigation of the effects on model estimation of the interdependence and feedback relationships between the series.

16.
It is suggested that inference under the proportional hazards model can be carried out by programs for exact inference under the logistic regression model. Advantages of such inference are that software is available and that multivariate models can be addressed. The method has been evaluated by means of coverage and power calculations in certain situations. In all situations coverage was above the nominal level, though rather conservative. A different type of exact inference is developed under Type II censoring. Inference was then less conservative; however, there are limitations with respect to the censoring mechanism and multivariate generalizations, and software is not available. This method also requires extensive computational power. Performance of large-sample Wald, score and likelihood inference was also considered. Large-sample methods work remarkably well with small data sets, but inference by score statistics seems to be the best choice. There seem to be some problems with likelihood ratio inference that may originate from how this method behaves with infinite estimates of the regression parameter. Inference by Wald statistics can be quite conservative with very small data sets.

17.
In this article, Bayesian inference for the half-normal and half-t distributions using uninformative priors is considered. It is shown that exact Bayesian inference can be undertaken for the half-normal distribution without the need for Gibbs sampling. Simulation is then used to compare the sampling properties of Bayesian point and interval estimators with those of their maximum likelihood based counterparts. Inference for the half-t distribution based on the use of Gibbs sampling is outlined, and an approach to model comparison based on the use of Bayes factors is discussed. The fitting of the half-normal and half-t models is illustrated using real data on the body fat measurements of elite athletes.
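A sketch of the exact half-normal computation, assuming the uninformative prior p(sigma^2) proportional to 1/sigma^2 (one common choice; the article's prior may differ): the posterior of sigma^2 is then inverse-gamma with shape n/2 and scale sum(y_i^2)/2, so posterior draws require no Gibbs sampling:

```python
# Exact posterior sampling for the half-normal scale parameter.
import numpy as np
from scipy.stats import invgamma

def half_normal_posterior(y, n_draws=10000, seed=0):
    n, S = len(y), np.sum(np.square(y))
    post = invgamma(a=n / 2, scale=S / 2)        # sigma^2 | y
    draws = post.rvs(n_draws, random_state=seed)
    return np.sqrt(draws)                         # posterior draws of sigma
```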

18.
Categorical data frequently arise in applications in the Social Sciences. In such applications, the class of log-linear models, based on either a Poisson or (product) multinomial response distribution, is a flexible model class for inference and prediction. In this paper we consider the Bayesian analysis of both Poisson and multinomial log-linear models. It is often convenient to model multinomial or product multinomial data as observations of independent Poisson variables. For multinomial data, Lindley (1964) [20] showed that this approach leads to valid Bayesian posterior inferences when the prior density for the Poisson cell means factorises in a particular way. We develop this result to provide a general framework for the analysis of multinomial or product multinomial data using a Poisson log-linear model. Valid finite population inferences are also available, which can be particularly important in modelling social data. We then focus particular attention on multivariate normal prior distributions for the log-linear model parameters. Here, an improper prior distribution for certain Poisson model parameters is required for valid multinomial analysis, and we derive conditions under which the resulting posterior distribution is proper. We also consider the construction of prior distributions across models, and for model parameters, when uncertainty exists about the appropriate form of the model. We present classes of Poisson and multinomial models, invariant under certain natural groups of permutations of the cells. We demonstrate that, if prior belief concerning the model parameters is also invariant, as is the case in a ‘reference’ analysis, then the choice of prior distribution is considerably restricted. The analysis of multivariate categorical data in the form of a contingency table is considered in detail. We illustrate the methods with two examples.
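A small numerical illustration of the Poisson device: fitting a Poisson log-linear model with a free intercept to the flattened cell counts reproduces, after normalising by the fitted total, the cell probabilities of the corresponding multinomial analysis. A frequentist GLM fit is used here purely to show the equivalence (the paper's treatment is Bayesian), and the 2x3 table is made up:

```python
# Poisson trick for a 2x3 contingency table under the independence model.
import numpy as np
import statsmodels.api as sm

counts = np.array([18, 7, 13, 25, 19, 16])        # illustrative cell counts
row = np.repeat([0, 1], 3); col = np.tile([0, 1, 2], 2)
X = sm.add_constant(np.column_stack([
    row, (col == 1).astype(float), (col == 2).astype(float)]))
fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
# Normalising fitted Poisson means by their total gives the cell
# probabilities of the multinomial independence model.
probs = fit.fittedvalues / fit.fittedvalues.sum()
```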

19.
Approximate Bayesian computation (ABC) has become a popular technique to facilitate Bayesian inference from complex models. In this article we present an ABC approximation designed to perform biased filtering for a hidden Markov model when the likelihood function is intractable. We use a sequential Monte Carlo (SMC) algorithm to both fit and sample from our ABC approximation of the target probability density. This approach is shown, empirically, to be more accurate with respect to the original filter than competing methods. The theoretical bias of our method is investigated; it is shown that the bias goes to zero at the expense of increased computational effort. Our approach is illustrated on a constrained sequential lasso for portfolio allocation to 15 constituents of the FTSE 100 share index.
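A sketch of the ABC filtering idea for a hidden Markov model, assuming scalar states and a Gaussian ABC kernel: particles are propagated through a transition kernel and weighted by how close their pseudo-observations fall to the data, with the tolerance eps controlling the bias discussed above. `transition` and `simulate_obs` are hypothetical, vectorised placeholders:

```python
# ABC particle filter sketch for an HMM with an intractable
# observation density.
import numpy as np

rng = np.random.default_rng(3)

def abc_particle_filter(y_obs, transition, simulate_obs,
                        n_part=1000, eps=0.2):
    x = rng.normal(size=n_part)                   # initial particles
    filt_means = []
    for y_t in y_obs:
        x = transition(x)                          # propagate particles
        y_sim = simulate_obs(x)                    # pseudo-observations
        logw = -0.5 * ((y_sim - y_t) / eps) ** 2   # Gaussian ABC kernel
        w = np.exp(logw - logw.max()); w /= w.sum()
        filt_means.append(np.sum(w * x))           # filtered mean estimate
        x = x[rng.choice(n_part, n_part, p=w)]     # resample
    return np.array(filt_means)
```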

20.
Approximate Bayesian computation (ABC) is a popular approach to address inference problems where the likelihood function is intractable, or expensive to calculate. To improve over Markov chain Monte Carlo (MCMC) implementations of ABC, the use of sequential Monte Carlo (SMC) methods has recently been suggested. Most effective SMC algorithms that are currently available for ABC have a computational complexity that is quadratic in the number of Monte Carlo samples (Beaumont et al., Biometrika 86:983–990, 2009; Peters et al., Technical report, 2008; Toni et al., J. Roy. Soc. Interface 6:187–202, 2009) and require the careful choice of simulation parameters. In this article an adaptive SMC algorithm is proposed which admits a computational complexity that is linear in the number of samples and adaptively determines the simulation parameters. We demonstrate our algorithm on a toy example and on a birth-death-mutation model arising in epidemiology.
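A sketch in the adaptive spirit of the algorithm described above, not its exact weighting scheme: tolerances are set to a quantile of the current particles' distances and the proposal scale is tuned from the particle cloud, keeping the cost linear in the number of particles. A scalar parameter, flat prior and symmetric proposal are assumed; `dist(theta)` is a placeholder that simulates data under theta and returns its distance to the observations:

```python
# Adaptive SMC-ABC sketch with quantile-based tolerance schedule.
import numpy as np

rng = np.random.default_rng(4)

def adaptive_smc_abc(dist, sample_prior, n_part=1000, n_rounds=8, q=0.5):
    theta = sample_prior(n_part)
    d = np.array([dist(t) for t in theta])
    for _ in range(n_rounds):
        eps = np.quantile(d, q)                   # adaptive tolerance
        keep = d <= eps
        theta, d = theta[keep], d[keep]
        idx = rng.choice(len(theta), n_part)      # resample survivors
        theta, d = theta[idx], d[idx]
        sd = 2.0 * theta.std()                    # adaptive proposal scale
        for i in range(n_part):                   # one ABC-MH move each:
            prop = theta[i] + sd * rng.normal()   # flat prior + symmetric
            d_new = dist(prop)                    # proposal => accept iff
            if d_new <= eps:                      # the distance passes eps
                theta[i], d[i] = prop, d_new
    return theta
```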

