Similar Literature
20 similar documents found (search time: 31 ms)
1.
The computational demand required to perform inference using Markov chain Monte Carlo methods often obstructs a Bayesian analysis. This may be a result of large datasets, complex dependence structures, or expensive computer models. In these instances, the posterior distribution is replaced by a computationally tractable approximation, and inference is based on this working model. However, the error that is introduced by this practice is not well studied. In this paper, we propose a methodology that allows one to examine the impact on statistical inference by quantifying the discrepancy between the intractable and working posterior distributions. This work provides a structure to analyse model approximations with regard to the reliability of inference and computational efficiency. We illustrate our approach through a spatial analysis of yearly total precipitation anomalies where covariance tapering approximations are used to alleviate the computational demand associated with inverting a large, dense covariance matrix.
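The covariance tapering idea referred to in this abstract can be illustrated with a short sketch: a dense covariance matrix is multiplied elementwise by a compactly supported taper so that it becomes sparse, and linear solves with it become cheap. The exponential kernel, Wendland-1 taper, range parameters and problem size below are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import splu
from scipy.spatial.distance import cdist

def exponential_cov(D, sigma2=1.0, rho=0.3):
    """Exponential covariance evaluated on a distance matrix D."""
    return sigma2 * np.exp(-D / rho)

def wendland1_taper(D, gamma=0.2):
    """Wendland-1 taper: compactly supported, exactly zero beyond range gamma."""
    x = np.clip(D / gamma, 0.0, 1.0)
    return (1.0 - x) ** 4 * (4.0 * x + 1.0)

rng = np.random.default_rng(0)
coords = rng.uniform(size=(500, 2))                      # 500 spatial locations
D = cdist(coords, coords)

C_full = exponential_cov(D)                              # dense covariance
C_tap = sparse.csc_matrix(C_full * wendland1_taper(D))   # tapered, sparse approximation

y = rng.standard_normal(500)
lu = splu(C_tap)                                         # sparse factorisation
alpha = lu.solve(y)                                      # approximate C^{-1} y at low cost

print("non-zero fraction of tapered covariance:", C_tap.nnz / 500**2)
```

By the Schur product theorem the tapered matrix remains positive definite, which is what makes this a valid working covariance rather than an ad hoc sparsification.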

2.
While most regression models focus on explaining distributional aspects of one single response variable alone, interest in modern statistical applications has recently shifted towards simultaneously studying multiple response variables as well as their dependence structure. Particularly useful tools for pursuing such an analysis are copula-based regression models, since they enable the separation of the marginal response distributions and the dependence structure summarised in a specific copula model. However, so far copula-based regression models have mostly relied on two-step approaches where the marginal distributions are determined first whereas the copula structure is studied in a second step after plugging in the estimated marginal distributions. Moreover, the parameters of the copula are mostly treated as constants unrelated to covariates, and most regression specifications for the marginals are restricted to purely linear predictors. We therefore propose simultaneous Bayesian inference for both the marginal distributions and the copula using computationally efficient Markov chain Monte Carlo simulation techniques. In addition, we replace the commonly used linear predictor by a generic structured additive predictor comprising, for example, nonlinear effects of continuous covariates, spatial effects or random effects, and furthermore allow the copula parameters to be covariate-dependent. To facilitate Bayesian inference, we construct proposal densities for a Metropolis–Hastings algorithm relying on quadratic approximations to the full conditionals of the regression coefficients, avoiding manual tuning. The performance of the resulting Bayesian estimates is evaluated in simulations comparing our approach with penalised likelihood inference, studying the choice of a specific copula model based on the deviance information criterion, and comparing the simultaneous approach with a two-step procedure. Furthermore, the flexibility of Bayesian conditional copula regression models is illustrated in two applications on childhood undernutrition and macroecology.
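As a purely illustrative sketch of one ingredient of this abstract, letting the copula parameter depend on covariates, the snippet below evaluates the log-likelihood of a bivariate Gaussian copula whose correlation is linked to a covariate through a tanh (Fisher-z) link. The Gaussian copula, the link, the marginal treatment and all parameter values are assumptions chosen for brevity; the paper uses structured additive predictors and MCMC rather than this plug-in likelihood.

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_logpdf(u, v, rho):
    """Log-density of the bivariate Gaussian copula at pseudo-observations (u, v)."""
    x, y = norm.ppf(u), norm.ppf(v)
    return (-0.5 * np.log(1 - rho**2)
            - (rho**2 * (x**2 + y**2) - 2 * rho * x * y) / (2 * (1 - rho**2)))

def loglik(beta, z, u, v):
    """Copula log-likelihood with correlation linked to covariate z via tanh."""
    rho = np.tanh(beta[0] + beta[1] * z)          # covariate-dependent copula parameter
    return np.sum(gaussian_copula_logpdf(u, v, rho))

# toy data: pseudo-observations u, v (e.g. fitted marginal CDFs) and a covariate z
rng = np.random.default_rng(1)
z = rng.uniform(-1, 1, 300)
rho_true = np.tanh(0.2 + 0.8 * z)
x = rng.standard_normal(300)
y = rho_true * x + np.sqrt(1 - rho_true**2) * rng.standard_normal(300)
u, v = norm.cdf(x), norm.cdf(y)

print(loglik(np.array([0.2, 0.8]), z, u, v))      # at the generating values
print(loglik(np.array([0.0, 0.0]), z, u, v))      # independence copula for comparison
```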

3.
Probabilistic graphical models offer a powerful framework to account for the dependence structure between variables, which is represented as a graph. However, the dependence between variables may render inference tasks intractable. In this paper, we review techniques exploiting the graph structure for exact inference, borrowed from optimisation and computer science. They are built on the principle of variable elimination, whose complexity is dictated in an intricate way by the order in which variables are eliminated. The so-called treewidth of the graph characterises this algorithmic complexity: low-treewidth graphs can be processed efficiently. The first point that we illustrate is therefore the idea that for inference in graphical models, the number of variables is not the limiting factor, and it is worth checking the width of several tree decompositions of the graph before resorting to approximate methods. We show how algorithms providing an upper bound of the treewidth can be exploited to derive a 'good' elimination order that enables exact inference. The second point is that when the treewidth is too large, algorithms for approximate inference linked to the principle of variable elimination, such as loopy belief propagation and variational approaches, can lead to accurate results while being much less time-consuming than Monte Carlo approaches. We illustrate the techniques reviewed in this article on benchmarks of inference problems in genetic linkage analysis and computer vision, as well as on the restoration of hidden variables in coupled hidden Markov models.
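The notion of a "good" elimination order and its induced width can be made concrete in a few lines. The minimum-degree heuristic below repeatedly eliminates the lowest-degree vertex, connects its neighbours, and records the largest neighbourhood created; the resulting induced width is an upper bound on the treewidth. The toy graph and the choice of heuristic are illustrative assumptions; the paper discusses more sophisticated bounds and decompositions.

```python
from copy import deepcopy

def min_degree_order(adj):
    """Greedy minimum-degree elimination order and its induced width.

    adj: dict mapping each vertex to the set of its neighbours.
    Returns (order, width); width upper-bounds the treewidth of the graph.
    """
    adj = deepcopy(adj)
    order, width = [], 0
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))   # lowest-degree vertex
        nbrs = adj[v]
        width = max(width, len(nbrs))             # clique created has len(nbrs)+1 vertices
        for a in nbrs:                            # add fill-in edges among neighbours
            adj[a] |= nbrs - {a}
            adj[a].discard(v)
        del adj[v]
        order.append(v)
    return order, width

# a 3x3 grid graph (treewidth 3), built with 4-neighbour adjacency
grid = {(i, j): set() for i in range(3) for j in range(3)}
for i in range(3):
    for j in range(3):
        for di, dj in [(1, 0), (0, 1)]:
            if (i + di, j + dj) in grid:
                grid[(i, j)].add((i + di, j + dj))
                grid[(i + di, j + dj)].add((i, j))

order, width = min_degree_order(grid)
print("elimination order:", order)
print("induced width (treewidth upper bound):", width)
```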

4.
A fundamental issue in applied multivariate extreme value analysis is modelling dependence within joint tail regions. The primary focus of this work is to extend the classical pseudopolar treatment of multivariate extremes to develop an asymptotically motivated representation of extremal dependence that also encompasses asymptotic independence. Starting with the usual mild bivariate regular variation assumptions that underpin the coefficient of tail dependence as a measure of extremal dependence, our main result is a characterization of the limiting structure of the joint survivor function in terms of an essentially arbitrary non-negative measure that must satisfy some mild constraints. We then construct parametric models from this new class and study in detail one example that accommodates asymptotic dependence, asymptotic independence and asymmetry within a straightforward parsimonious parameterization. We provide a fast simulation algorithm for this example and detail likelihood-based inference including tests for asymptotic dependence and symmetry which are useful for submodel selection. We illustrate this model by application to both simulated and real data. In contrast with the classical multivariate extreme value approach, which concentrates on the limiting distribution of normalized componentwise maxima, our framework focuses directly on the structure of the limiting joint survivor function and provides significant extensions of both the theoretical and the practical tools that are available for joint tail modelling.
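A simple empirical diagnostic related to the extremal-dependence ideas above is the conditional tail probability chi(u) = P(V > u | U > u) on uniform margins, which tends to zero as u approaches 1 under asymptotic independence and to a positive limit under asymptotic dependence. The sketch below is a generic rank-based estimate, assuming two toy data-generating mechanisms; it is not the likelihood-based inference or the parametric family developed in the paper.

```python
import numpy as np

def chi_u(x, y, u):
    """Empirical estimate of P(V > u | U > u) on rank-transformed (uniform) margins."""
    n = len(x)
    ux = np.argsort(np.argsort(x)) / (n + 1.0)   # pseudo-uniform margins via ranks
    uy = np.argsort(np.argsort(y)) / (n + 1.0)
    return np.mean((ux > u) & (uy > u)) / np.mean(ux > u)

rng = np.random.default_rng(2)
n = 20000
# asymptotically dependent pair: a shared heavy-tailed factor dominates both coordinates
z = rng.pareto(2.0, n)
dep = (z + 0.1 * rng.pareto(2.0, n), z + 0.1 * rng.pareto(2.0, n))
# asymptotically independent pair: bivariate normal with correlation 0.7
g = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], n)
indep = (g[:, 0], g[:, 1])

for u in (0.95, 0.99):
    print(f"u={u}: chi(dependent)={chi_u(*dep, u):.2f}  chi(Gaussian)={chi_u(*indep, u):.2f}")
```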

5.
In functional magnetic resonance imaging, spatial activation patterns are commonly estimated using a non-parametric smoothing approach. Significant peaks or clusters in the smoothed image are subsequently identified by testing the null hypothesis of lack of activation in every volume element of the scans. A weakness of this approach is the lack of a model for the activation pattern; this makes it difficult to determine the variance of estimates, to test specific neuroscientific hypotheses or to incorporate prior information about the brain area under study in the analysis. These issues may be addressed by formulating explicit spatial models for the activation and using simulation methods for inference. We present one such approach, based on a marked point process prior. Informally, one may think of the points as centres of activation, and the marks as parameters describing the shape and area of the surrounding cluster. We present an MCMC algorithm for making inference in the model and compare the approach with a traditional non-parametric method, using both simulated and visual stimulation data. Finally, we discuss extensions of the model and the inferential framework to account for non-stationary responses and spatio-temporal correlation.

6.
Latent variable models have been widely used for modelling the dependence structure of multiple outcomes data. However, the formulation of a latent variable model is often unknown a priori, and misspecification will distort the dependence structure and lead to unreliable model inference. Moreover, multiple outcomes with varying types present enormous analytical challenges. In this paper, we present a class of general latent variable models that can accommodate mixed types of outcomes. We propose a novel selection approach that simultaneously selects latent variables and estimates parameters. We show that the proposed estimator is consistent, asymptotically normal and has the oracle property. The practical utility of the methods is confirmed via simulations as well as an application to the analysis of the World Values Survey, a global research project that explores people's values and beliefs and the social and personal characteristics that might influence them.

7.
In seasonal influenza epidemics, pathogens such as respiratory syncytial virus (RSV) often co-circulate with influenza and cause influenza-like illness (ILI) in human hosts. However, it is often impractical to test for each potential pathogen or to collect specimens for each observed ILI episode, making inference about influenza transmission difficult. In the setting of infectious diseases, missing outcomes impose a particular challenge because of the dependence among individuals. We propose a Bayesian competing-risk model for multiple co-circulating pathogens for inference on transmissibility and intervention efficacies under the assumption that missingness in the biological confirmation of the pathogen is ignorable. Simulation studies indicate a reasonable performance of the proposed model even if the number of potential pathogens is misspecified. They also show that a moderate amount of missing laboratory test results has only a small impact on inference about key parameters in the setting of close contact groups. Using the proposed model, we found that a non-pharmaceutical intervention is marginally protective against transmission of influenza A in a study conducted in elementary schools.

8.
Mengya Liu & Qi Li, Statistics, 2019, 53(1): 1-25
This article studies an observation-driven model for time series of counts, which allows for overdispersion and negative serial dependence in the observations. The observations are assumed to follow a negative binomial distribution conditioned on past information, in the form of threshold models, which generates a two-regime structure on the basis of the magnitude of the lagged observations. We use the weak dependence approach to establish stationarity and ergodicity, and inference for the regression parameters is obtained via quasi-likelihood. Moreover, asymptotic properties of both the quasi-maximum likelihood estimators and the threshold estimator are established. Simulation studies are presented, along with two applications: the trading volume of a stock and the number of major earthquakes.
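To make the two-regime structure concrete, the snippet below simulates an observation-driven negative binomial count series whose conditional mean follows one linear recursion when the lagged count is at or below a threshold and another when it exceeds it. The threshold, dispersion and regime coefficients are illustrative assumptions, not estimates from the paper.

```python
import numpy as np

def simulate_nb_threshold(T, r=5.0, thresh=8,
                          pars_low=(1.0, 0.4, 0.3),    # (intercept, coef on lambda, coef on y)
                          pars_high=(0.5, 0.2, 0.1),
                          seed=0):
    """Two-regime observation-driven negative binomial count series.

    Conditional mean: lambda_t = d + a * lambda_{t-1} + b * y_{t-1},
    with (d, a, b) chosen by whether y_{t-1} exceeds `thresh`.
    y_t | past ~ NegBin with mean lambda_t and dispersion r.
    """
    rng = np.random.default_rng(seed)
    y = np.zeros(T, dtype=int)
    lam = np.zeros(T)
    lam[0] = 1.0
    y[0] = rng.negative_binomial(r, r / (r + lam[0]))
    for t in range(1, T):
        d, a, b = pars_low if y[t - 1] <= thresh else pars_high
        lam[t] = d + a * lam[t - 1] + b * y[t - 1]
        p = r / (r + lam[t])                       # numpy parameterisation: mean = r(1-p)/p
        y[t] = rng.negative_binomial(r, p)
    return y, lam

y, lam = simulate_nb_threshold(500)
print("sample mean:", y.mean(), " sample variance:", y.var())   # variance > mean: overdispersion
```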

9.
We propose a new class of time-dependent random probability measures and show how this can be used for Bayesian nonparametric inference in continuous time. By means of a nonparametric hierarchical model we define a random process with geometric stick-breaking representation and dependence structure induced via a one-dimensional diffusion process of Wright-Fisher type. The sequence is shown to be a strongly stationary measure-valued process with continuous sample paths which, despite the simplicity of the weights structure, can be used for inferential purposes on the trajectory of a discretely observed continuous-time phenomenon. A simple estimation procedure is presented and illustrated with simulated and real financial data.
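The geometric stick-breaking construction mentioned above is easy to sketch: a single probability p determines all weights, w_k = p(1 - p)^(k-1), which are attached to i.i.d. atoms from a base measure. In the paper p evolves over time through a Wright-Fisher diffusion; the snippet below only draws one static measure and samples from it, with the base measure, p and truncation level chosen for illustration.

```python
import numpy as np

def geometric_stick_breaking_measure(p, K=200, base_sampler=None, rng=None):
    """Draw a truncated random probability measure with geometric stick-breaking weights.

    Weights: w_k = p * (1 - p)**(k - 1), k = 1..K, with the remaining mass
    absorbed into the last weight. Atoms: i.i.d. draws from the base measure
    (standard normal by default).
    """
    rng = rng or np.random.default_rng()
    base_sampler = base_sampler or (lambda size: rng.standard_normal(size))
    k = np.arange(1, K + 1)
    w = p * (1 - p) ** (k - 1)
    w[-1] += 1.0 - w.sum()                     # absorb truncation error
    atoms = base_sampler(K)
    return atoms, w

rng = np.random.default_rng(3)
atoms, w = geometric_stick_breaking_measure(p=0.2, rng=rng)
draws = rng.choice(atoms, size=1000, p=w)      # sample observations from the random measure
print("distinct atoms appearing in 1000 draws:", len(np.unique(draws)))
```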

10.
We consider the detection of changes in the mean of a set of time series. The breakpoints are allowed to be series-specific, and the series are assumed to be correlated. The correlation between the series is supposed to be constant along time but is allowed to take an arbitrary form. We show that such a dependence structure can be encoded in a factor model. Thanks to this representation, the inference of the breakpoints can be achieved via dynamic programming, which remains one of the most efficient algorithms. We propose a model selection procedure to determine both the number of breakpoints and the number of factors. The proposed method is implemented in the FASeg R package, which is available on CRAN. We demonstrate the performance of our procedure through simulation experiments and present an application to geodesic data.
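A minimal version of the dynamic-programming segmentation referred to above, for a single series and a fixed number of segments, is sketched below; it minimises the within-segment sum of squares in O(K n^2) time using cumulative sums. The multivariate, factor-model and model-selection aspects of the paper are not reproduced, and the simulated series is a toy example.

```python
import numpy as np

def segment_mean_dp(y, K):
    """Optimal segmentation of y into K segments minimising within-segment SSE.

    Returns (breakpoints, total_cost); breakpoints are the start indices of
    segments 2..K. Dynamic programming over segment endpoints, O(K * n^2).
    """
    n = len(y)
    s1 = np.concatenate([[0.0], np.cumsum(y)])
    s2 = np.concatenate([[0.0], np.cumsum(y ** 2)])

    def sse(i, j):                                  # cost of one segment covering y[i:j]
        m = j - i
        return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / m

    cost = np.full((K + 1, n + 1), np.inf)
    arg = np.zeros((K + 1, n + 1), dtype=int)
    cost[0, 0] = 0.0
    for k in range(1, K + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1, i] + sse(i, j)
                if c < cost[k, j]:
                    cost[k, j], arg[k, j] = c, i
    bps, j = [], n                                   # backtrack the breakpoints
    for k in range(K, 1, -1):
        j = arg[k, j]
        bps.append(j)
    return sorted(bps), cost[K, n]

rng = np.random.default_rng(4)
y = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 80), rng.normal(-1, 1, 120)])
print(segment_mean_dp(y, K=3))                       # expected breakpoints near 100 and 180
```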

11.
Rubbery Polya Tree
Polya trees (PT) are random probability measures which can assign probability 1 to the set of continuous distributions for certain specifications of the hyperparameters. This feature distinguishes the PT from the popular Dirichlet process (DP) model which assigns probability 1 to the set of discrete distributions. However, the PT is not nearly as widely used as the DP prior. Probably the main reason is an awkward dependence of posterior inference on the choice of the partitioning subsets in the definition of the PT. We propose a generalization of the PT prior that mitigates this undesirable dependence on the partition structure, by allowing the branching probabilities to be dependent within the same level. The proposed new process is not a PT anymore. However, it is still a tail-free process and many of the prior properties remain the same as those for the PT.
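For readers unfamiliar with the construction being generalised, the snippet below draws one random probability measure from a standard Polya tree on dyadic subintervals of [0, 1], with independent Beta(c m^2, c m^2) branch probabilities at level m; these standard hyperparameter choices, the depth and the unit-interval partition are assumptions for illustration. The "rubbery" modification of the paper, which makes branch probabilities dependent within a level, is not implemented here.

```python
import numpy as np

def sample_polya_tree(depth=8, c=1.0, rng=None):
    """Draw one random measure from a Polya tree prior on dyadic bins of [0, 1].

    At level m each interval splits in two, sending a Beta(c*m^2, c*m^2) share of
    its mass to the left child. Returns the probabilities of the 2**depth leaves.
    """
    rng = rng or np.random.default_rng()
    probs = np.array([1.0])
    for m in range(1, depth + 1):
        y = rng.beta(c * m**2, c * m**2, size=len(probs))     # left-branch probabilities
        probs = np.column_stack([probs * y, probs * (1 - y)]).ravel()
    return probs

rng = np.random.default_rng(5)
leaf_probs = sample_polya_tree(depth=8, rng=rng)
edges = np.linspace(0, 1, len(leaf_probs) + 1)
print("number of leaves:", len(leaf_probs), " total mass:", leaf_probs.sum())
# leaf_probs / np.diff(edges) is a piecewise-constant random density on [0, 1]
```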

12.
We consider the problem of deriving Bayesian inference procedures via the concept of relative surprise. The mathematical concept of surprise was developed by I.J. Good in a long sequence of papers. We modify this development to avoid a serious defect, namely the change-of-variable problem. We apply relative surprise to the development of estimation, hypothesis testing and model checking procedures. Important advantages of the relative surprise approach to inference include the lack of dependence on a particular loss function and complete freedom for the statistician in the choice of prior for hypothesis testing problems. Links are established with common Bayesian inference procedures such as highest posterior density regions, modal estimates and Bayes factors. From a practical perspective, new inference procedures arise that possess good properties.

13.
We introduce a multivariate heteroscedastic measurement error model for replications under scale mixtures of normal distributions. The model can provide a robust analysis and can be viewed as a generalization of multiple linear regression in terms of both model structure and distributional assumptions. An efficient method based on Markov chain Monte Carlo is developed for parameter estimation. The deviance information criterion and the conditional predictive ordinates are used as model selection criteria. Simulation studies show that inference under the model is robust to both distributional misspecification and outliers. We work out an illustrative example with a real data set on measurements of plant root decomposition.

14.
Longitudinal clinical trials with long follow-up periods almost invariably suffer from loss to follow-up and non-compliance with the assigned therapy. An example is protocol 128 of the AIDS Clinical Trials Group, a 5-year equivalency trial comparing reduced-dose zidovudine with the standard dose for treatment of paediatric acquired immune deficiency syndrome patients. This study compared responses to treatment by using both clinical and cognitive outcomes. The cognitive outcomes are of particular interest because the effects of human immunodeficiency virus infection of the central nervous system can be more acute in children than in adults. We formulate and apply a Bayesian hierarchical model to estimate both the intent-to-treat effect and the average causal effect of reducing the prescribed dose of zidovudine by 50%. The intent-to-treat effect quantifies the causal effect of assigning the lower dose, whereas the average causal effect represents the causal effect of actually taking the lower dose. We adopt a potential outcomes framework where, for each individual, we assume the existence of a different potential outcomes process at each level of time spent on treatment. The joint distribution of the potential outcomes and the time spent on assigned treatment is formulated using a hierarchical model: the potential outcomes distribution is given at the first level, and dependence between the outcomes and time on treatment is specified at the second level by linking the time on treatment to subject-specific effects that characterize the potential outcomes processes. Several distributional and structural assumptions are used to identify the model from observed data, and these are described in detail. A detailed analysis of AIDS Clinical Trials Group protocol 128 is given; inference about both the intent-to-treat effect and the average causal effect indicates a high probability of dose equivalence with respect to cognitive functioning.

15.
In this paper, we discuss the bivariate Birnbaum-Saunders accelerated lifetime model, in which the dependence structure of bivariate survival data is modelled through frailty. Specifically, we propose the bivariate Birnbaum-Saunders model with the following frailty distributions: gamma, positive stable and logarithmic series. We present inference and diagnostic analysis for the proposed model; more precisely, we propose diagnostics based on local influence and residual analysis to assess model fit and to detect influential observations. In this regard, we derive the normal curvatures of local influence under different perturbation schemes and perform simulation studies assessing the potential of the residuals to detect misspecification in the systematic component and in the stochastic component of the model, and to detect outliers. Finally, we apply the methodology to a real data set on recurrence times of infections for 38 kidney patients using a portable dialysis machine; we analyse these data both assuming independence within pairs and using the bivariate Birnbaum-Saunders accelerated lifetime model, so as to compare the two and verify the importance of modelling dependence between the infection times of the same patient.

16.
Hidden Markov random field models provide an appealing representation of images and other spatial problems. The drawback is that inference is not straightforward for these models as the normalisation constant for the likelihood is generally intractable except for very small observation sets. Variational methods are an emerging tool for Bayesian inference and they have already been successfully applied in other contexts. Focusing on the particular case of a hidden Potts model with Gaussian noise, we show how variational Bayesian methods can be applied to hidden Markov random field inference. To tackle the obstacle of the intractable normalising constant for the likelihood, we explore alternative estimation approaches for incorporation into the variational Bayes algorithm. We consider a pseudo-likelihood approach as well as the more recent reduced dependence approximation of the normalisation constant. To illustrate the effectiveness of these approaches we present empirical results from the analysis of simulated datasets. We also analyse a real dataset and compare results with those of previous analyses as well as those obtained from the recently developed auxiliary variable MCMC method and the recursive MCMC method. Our results show that the variational Bayesian analyses can be carried out much faster than the MCMC analyses and produce good estimates of model parameters. We also found that the reduced dependence approximation of the normalisation constant outperformed the pseudo-likelihood approximation in our analysis of real and synthetic datasets.
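The pseudo-likelihood approximation mentioned above replaces the intractable Potts normalising constant with a product of per-pixel full conditionals. The sketch below evaluates the log pseudo-likelihood of a label image on a 4-neighbour lattice as a function of the inverse temperature beta; the lattice size, number of labels and beta values are toy assumptions, and the variational Bayes machinery of the paper is not shown.

```python
import numpy as np

def potts_log_pseudolikelihood(z, beta, q):
    """Log pseudo-likelihood of a Potts labelling z (2-D int array with values 0..q-1).

    For each pixel, the full conditional of its label k is proportional to
    exp(beta * #{4-neighbours sharing label k}).
    """
    H, W = z.shape
    counts = np.zeros((q, H, W))
    for k in range(q):                             # agreeing 4-neighbours per candidate label
        same = (z == k).astype(float)
        nb = np.zeros_like(same)
        nb[1:, :] += same[:-1, :]
        nb[:-1, :] += same[1:, :]
        nb[:, 1:] += same[:, :-1]
        nb[:, :-1] += same[:, 1:]
        counts[k] = nb
    energy = beta * counts
    log_norm = np.log(np.exp(energy).sum(axis=0))                    # per-pixel normalisation
    observed = np.take_along_axis(energy, z[None, :, :], axis=0)[0]  # energy of observed label
    return float((observed - log_norm).sum())

rng = np.random.default_rng(6)
z = rng.integers(0, 3, size=(32, 32))              # toy label image with q = 3
for beta in (0.0, 0.5, 1.0):
    print(beta, potts_log_pseudolikelihood(z, beta, q=3))
```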

17.
18.
A Bayesian mixture model for differential gene expression
We propose model-based inference for differential gene expression, using a nonparametric Bayesian probability model for the distribution of gene intensities under various conditions. The probability model is a mixture of normal distributions. The resulting inference is similar to a popular empirical Bayes approach that is used for the same inference problem. The use of fully model-based inference mitigates some of the necessary limitations of the empirical Bayes method. We argue that inference is no more difficult than posterior simulation in traditional nonparametric mixture-of-normal models. The approach proposed is motivated by a microarray experiment that was carried out to identify genes that are differentially expressed between normal tissue and colon cancer tissue samples. Additionally, we carried out a small simulation study to verify the proposed methods. In the motivating case studies we show how the nonparametric Bayes approach facilitates the evaluation of posterior expected false discovery rates. We also show how inference can proceed even in the absence of a null sample of known non-differentially expressed scores. This highlights the difference from alternative empirical Bayes approaches that are based on plug-in estimates.
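A much-simplified version of the idea — mixture-based posterior probabilities of differential expression and the posterior expected false discovery rate they induce — is sketched below using a two-component normal mixture with fixed, assumed parameters. The paper fits a nonparametric Bayesian mixture by posterior simulation rather than using these plug-in values, so treat this only as an illustration of the FDR bookkeeping.

```python
import numpy as np
from scipy.stats import norm

def posterior_prob_de(z, pi1=0.1, mu1=3.0, sd0=1.0, sd1=1.5):
    """P(differentially expressed | score z) under a two-component normal mixture.

    Null component: N(0, sd0^2) with weight 1 - pi1; alternative: N(mu1, sd1^2)
    with weight pi1. All mixture parameters are fixed illustrative values.
    """
    f0 = (1 - pi1) * norm.pdf(z, 0.0, sd0)
    f1 = pi1 * norm.pdf(z, mu1, sd1)
    return f1 / (f0 + f1)

def posterior_expected_fdr(prob_de, threshold):
    """Posterior expected FDR of the rule 'declare DE when prob_de > threshold'."""
    flagged = prob_de > threshold
    if not flagged.any():
        return 0.0
    return float(np.mean(1.0 - prob_de[flagged]))

rng = np.random.default_rng(7)
z = np.concatenate([rng.normal(0, 1, 900), rng.normal(3, 1.5, 100)])   # toy gene scores
p = posterior_prob_de(z)
for t in (0.5, 0.8, 0.9):
    print(f"threshold {t}: flagged {np.sum(p > t)}, "
          f"posterior expected FDR {posterior_expected_fdr(p, t):.3f}")
```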

19.
The use of surrogate variables has been proposed as a means to capture, for a given observed set of data, sources driving the dependency structure among high-dimensional sets of features and to remove the effects of those sources and their potential negative impact on simultaneous inference. In this article, we illustrate the potential effects of latent variables on tests of dependence and the resulting impact on multiple inference, briefly review the method of surrogate variable analysis proposed by Leek and Storey (PNAS 2008; 105:18718-18723), and assess that method via simulations intended to mimic the complexity of feature dependence observed in real-world microarray data. The method is also assessed via application to a recent Merck microarray data set. Both simulation and case study results indicate that surrogate variable analysis can offer a viable strategy for tackling the multiple testing dependence problem when the features follow a potentially complex correlation structure, yielding improvements in the variability of false positive rates and increases in power.
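The core idea — estimate directions of unwanted variation from the residuals after removing the primary signal, then carry them as covariates into downstream tests — can be sketched in a few lines. The snippet below is a crude residual-SVD version, not the full iterative Leek and Storey algorithm, and the data dimensions, effect sizes and number of surrogates are assumptions made for the toy example.

```python
import numpy as np

def surrogate_variables(Y, X, n_sv=2):
    """Crude surrogate variable estimate via the top singular vectors of the residuals.

    Y: (features x samples) expression matrix; X: (samples x p) primary design matrix.
    Returns a (samples x n_sv) matrix of candidate surrogate variables.
    """
    beta, *_ = np.linalg.lstsq(X, Y.T, rcond=None)   # feature-wise OLS on the primary design
    R = Y.T - X @ beta                               # residuals (samples x features)
    _, _, vt = np.linalg.svd(R.T, full_matrices=False)
    return vt[:n_sv].T                               # sample-space directions of residual variation

# toy data: a hidden batch variable drives correlation across many features
rng = np.random.default_rng(8)
n, m = 40, 500                                        # samples, features
group = np.repeat([0.0, 1.0], n // 2)                 # primary variable of interest
batch = rng.standard_normal(n)                        # unobserved source of dependence
Y = (np.outer(rng.standard_normal(m), batch)          # batch effect on every feature
     + 0.5 * np.outer(rng.binomial(1, 0.1, m), group) # a few truly associated features
     + rng.standard_normal((m, n)))
X = np.column_stack([np.ones(n), group])

sv = surrogate_variables(Y, X, n_sv=1)
print("|correlation| of surrogate with hidden batch:",
      abs(np.corrcoef(sv[:, 0], batch)[0, 1]))
```

Including `sv` alongside `X` in the per-feature regressions would then absorb the batch-driven dependence before any multiple testing is carried out.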

20.
Many disease processes are characterized by two or more successive health states, and it is often of interest and importance to assess state-specific covariate effects. However, with incomplete follow-up data, such inference has not been satisfactorily addressed in the literature. We model the logarithm-transformed sojourn time in each state as linearly related to the covariates; however, neither the distributional form of the error term nor the dependence structure of the states needs to be specified. We propose a regression procedure to accommodate incomplete follow-up data. Asymptotic theory is presented, along with some tools for goodness-of-fit diagnostics. Simulation studies show that the proposed method is reliable for practical use. We illustrate it by application to a cancer clinical trial.

