Similar Literature
20 similar articles found (search time: 964 ms).
1.
The lasso is a popular technique for simultaneous estimation and variable selection in many research areas. When the regression coefficients have independent Laplace priors, the marginal posterior mode of the regression coefficients is equivalent to the estimates given by the non-Bayesian lasso. Because of its flexibility in statistical inference, the Bayesian approach has attracted a growing body of research in recent years. Current approaches primarily either perform a fully Bayesian analysis using a Markov chain Monte Carlo (MCMC) algorithm or use Monte Carlo expectation maximization (MCEM) methods with an MCMC algorithm in each E-step. However, MCMC-based Bayesian methods carry a heavy computational burden and converge slowly. Tan et al. [An efficient MCEM algorithm for fitting generalized linear mixed models for correlated binary data. J Stat Comput Simul. 2007;77:929–943] proposed a non-iterative sampling approach, the inverse Bayes formula (IBF) sampler, for computing posteriors of a hierarchical model within the MCEM structure. Motivated by their paper, we develop an IBF sampler within the MCEM structure to obtain the marginal posterior mode of the regression coefficients for the Bayesian lasso, adjusting the importance sampling weights when the full conditional distribution is not explicit. Simulation experiments show that our EM-based method greatly reduces computational time and that it behaves comparably to other Bayesian lasso methods in both prediction accuracy and variable selection accuracy, and even better when the sample size is relatively large.
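For concreteness, the sketch below shows a standard Bayesian-lasso Gibbs sampler built on the scale-mixture-of-normals representation of the Laplace prior (in the spirit of Park and Casella). It is not the IBF-within-MCEM algorithm proposed in the abstract; the fixed penalty `lam`, the improper prior on the error variance, and the iteration count are illustrative assumptions.

```python
# Minimal Bayesian-lasso Gibbs sampler sketch: beta_j | tau_j^2 ~ N(0, sigma2 * tau_j^2),
# tau_j^2 ~ Exp(lam^2 / 2), which marginally gives independent Laplace priors on beta.
import numpy as np

def bayesian_lasso_gibbs(X, y, lam=1.0, n_iter=2000, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta, sigma2, tau2 = np.zeros(p), 1.0, np.ones(p)   # tau2: latent local scales
    draws = np.empty((n_iter, p))
    XtX, Xty = X.T @ X, X.T @ y
    for it in range(n_iter):
        # beta | rest ~ N(A^{-1} X'y, sigma2 * A^{-1}) with A = X'X + diag(1/tau2)
        A_inv = np.linalg.inv(XtX + np.diag(1.0 / tau2))
        beta = rng.multivariate_normal(A_inv @ Xty, sigma2 * A_inv)
        # 1/tau_j^2 | rest ~ Inverse-Gaussian(sqrt(lam^2 sigma2 / beta_j^2), lam^2)
        mu = np.sqrt(lam**2 * sigma2 / np.maximum(beta**2, 1e-12))
        tau2 = 1.0 / rng.wald(mu, lam**2)
        # sigma2 | rest ~ Inverse-Gamma under an improper 1/sigma2 prior
        resid = y - X @ beta
        shape = (n - 1 + p) / 2.0
        scale = 0.5 * (resid @ resid + np.sum(beta**2 / tau2))
        sigma2 = scale / rng.gamma(shape, 1.0)
        draws[it] = beta
    return draws
```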

2.
Effective component relabeling in Bayesian analyses of mixture models is critical to the routine use of mixtures in classification when the analysis is based on Markov chain Monte Carlo methods. The classification-based relabeling approach here is computationally attractive and statistically effective, and it scales well with sample size and number of mixture components, enabling routine analyses of increasingly large data sets. Building on the best of existing methods, practical relabeling aims to match the data-to-component classification indicators in each MCMC iterate with those of a defined reference mixture distribution. The method performs as well as or better than existing methods in low-dimensional problems, and it is practically superior in problems with larger data sets because the approach is scalable. We describe examples and computational benchmarks, and we provide supporting code with an efficient computational implementation of the algorithm that will be of use in practical applications of mixture models.
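A hedged sketch of the matching step such a relabeling scheme needs: permute the component labels of one MCMC iterate so that its classification indicators agree as closely as possible with a fixed reference classification. The disagreement-count cost and the Hungarian-algorithm matching below are illustrative choices, not necessarily those used in the paper.

```python
# Relabel one iterate's component indicators to best match a reference classification.
import numpy as np
from scipy.optimize import linear_sum_assignment

def relabel_to_reference(z_iter, z_ref, k):
    """z_iter, z_ref: integer arrays of length n with labels in {0, ..., k-1}."""
    # cost[a, b] = observations assigned to component a in this iterate but not to b
    # in the reference, so the optimal assignment minimises total disagreements
    cost = np.zeros((k, k), dtype=int)
    for a in range(k):
        for b in range(k):
            cost[a, b] = np.sum((z_iter == a) & (z_ref != b))
    rows, cols = linear_sum_assignment(cost)
    perm = np.empty(k, dtype=int)
    perm[rows] = cols                       # iterate label a is mapped to cols[a]
    return perm[z_iter], perm               # relabelled indicators and the permutation
```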

3.
In this paper, we discuss fully Bayesian quantile inference using a Markov chain Monte Carlo (MCMC) method for longitudinal data models with random effects. Under the assumption that the error term follows an asymmetric Laplace distribution, we establish a hierarchical Bayesian model and obtain the posterior distribution of the unknown parameters at the τ-th quantile level. We overcome the current computational limitations using two approaches: a general MCMC technique with the Metropolis–Hastings algorithm, and Gibbs sampling from the full conditional distributions. These two methods outperform traditional frequentist methods under a wide array of simulated data models and are flexible enough to easily accommodate changes in the number of random effects and in their assumed distribution. We apply the Gibbs sampling method to analyse a mouse growth data set and obtain some conclusions that differ from those in the literature.
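The key device here is the connection between the τ-th quantile check loss and the asymmetric Laplace likelihood; the small, generic sketch below makes that link explicit. It is illustration only, not the paper's hierarchical longitudinal model.

```python
# The asymmetric Laplace log-likelihood has the tau-th check loss as its kernel,
# so maximising it (or placing it in a Bayesian model) targets the tau-th quantile.
import numpy as np

def check_loss(u, tau):
    # rho_tau(u) = u * (tau - 1{u < 0})
    return u * (tau - (u < 0))

def ald_loglik(y, mu, tau, sigma=1.0):
    # log density of the asymmetric Laplace distribution with location mu,
    # scale sigma and skewness tau: log[tau(1 - tau)/sigma] - rho_tau((y - mu)/sigma)
    u = (y - mu) / sigma
    return np.sum(np.log(tau * (1 - tau) / sigma) - check_loss(u, tau))
```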

4.
In this article, to reduce the computational load of Bayesian variable selection, we use a variant of reversible jump Markov chain Monte Carlo methods together with the Holmes and Held (HH) algorithm to sample model index variables in logistic mixed models involving a large number of explanatory variables. Furthermore, we propose a simple proposal distribution for the model index variables, and we use a simulation study and a real example to compare the performance of the HH algorithm under our proposed and existing proposal distributions. The results show that the HH algorithm with our proposed proposal distribution is a computationally efficient and reliable selection method.

5.
Two new implementations of the EM algorithm are proposed for maximum likelihood fitting of generalized linear mixed models. Both methods use random (independent and identically distributed) sampling to construct Monte Carlo approximations at the E-step. One approach involves generating random samples from the exact conditional distribution of the random effects (given the data) by rejection sampling, using the marginal distribution as a candidate. The second method uses a multivariate t importance sampling approximation. In many applications the two methods are complementary. Rejection sampling is more efficient when sample sizes are small, whereas importance sampling is better with larger sample sizes. Monte Carlo approximation using random samples allows the Monte Carlo error at each iteration to be assessed by using standard central limit theory combined with Taylor series methods. Specifically, we construct a sandwich variance estimate for the maximizer at each approximate E-step. This suggests a rule for automatically increasing the Monte Carlo sample size after iterations in which the true EM step is swamped by Monte Carlo error. In contrast, techniques for assessing Monte Carlo error have not been developed for use with alternative implementations of Monte Carlo EM algorithms utilizing Markov chain Monte Carlo E-step approximations. Three different data sets, including the infamous salamander data of McCullagh and Nelder, are used to illustrate the techniques and to compare them with the alternatives. The results show that the methods proposed can be considerably more efficient than those based on Markov chain Monte Carlo algorithms. However, the methods proposed may break down when the intractable integrals in the likelihood function are of high dimension.
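As an illustration of the second approach, here is a minimal sketch of a Monte Carlo E-step that estimates E[h(b) | y] for the random effects b by self-normalised importance sampling with a multivariate t proposal. The `log_joint`, `h`, `mode` and `scale` arguments are placeholders for model-specific quantities (for example the conditional mode and curvature), so this is a generic sketch rather than the authors' implementation.

```python
# Monte Carlo E-step via multivariate t importance sampling (self-normalised weights).
import numpy as np
from scipy import stats

def mc_e_step_t_importance(log_joint, h, mode, scale, df=4, m=5000, seed=0):
    """Importance-sampling estimate of E[h(b) | y]; log_joint(b) = log f(y | b) + log f(b)."""
    prop = stats.multivariate_t(loc=mode, shape=scale, df=df, seed=seed)
    b = prop.rvs(size=m)                            # draws of the random effects
    log_q = prop.logpdf(b)                          # proposal log-density of each draw
    b = np.asarray(b).reshape(m, -1)                # m x d, also covers d = 1
    log_w = np.array([log_joint(bi) for bi in b]) - log_q
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                    # self-normalised weights
    return float(np.sum(w * np.array([h(bi) for bi in b])))
```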

6.
Lin, Tsung I., Lee, Jack C., Ni, Huey F. Statistics and Computing, 2004, 14(2): 119–130
A finite mixture model using the multivariate t distribution has been shown to be a robust extension of normal mixtures. In this paper, we present a Bayesian approach to inference about the parameters of t-mixture models. The prior distributions are specified to be weakly informative in order to avoid nonintegrable posterior distributions. We present two efficient EM-type algorithms for computing the joint posterior mode with the observed data and an incomplete future vector as the sample. Markov chain Monte Carlo sampling schemes are also developed to obtain the target posterior distribution of the parameters. The advantages of the Bayesian approach over the maximum likelihood method are demonstrated via a set of real data.

7.
Summary.  Structured additive regression models are perhaps the most commonly used class of models in statistical applications. The class includes, among others, (generalized) linear models, (generalized) additive models, smoothing spline models, state space models, semiparametric regression, spatial and spatiotemporal models, log-Gaussian Cox processes, and geostatistical and geoadditive models. We consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with non-Gaussian response variables. The posterior marginals are not available in closed form owing to the non-Gaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, in terms of both convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo sampling is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations is computational: where Markov chain Monte Carlo algorithms need hours or days to run, our approximations provide more precise estimates in seconds or minutes. Another advantage of our approach is its generality, which makes it possible to perform Bayesian analysis in an automatic, streamlined way, and to compute model comparison criteria and various predictive measures so that models can be compared and the model under study can be challenged.
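The core building block is the Laplace approximation: replace an intractable posterior by a Gaussian centred at its mode, with variance taken from the curvature of the log-posterior at the mode. The toy univariate sketch below shows just that ingredient; it is not the integrated nested scheme itself (the R-INLA software implements the full method), and the search bounds are illustrative.

```python
# Gaussian (Laplace) approximation to a univariate density proportional to exp(log_post).
import numpy as np
from scipy.optimize import minimize_scalar

def laplace_approx(log_post, bounds=(-10.0, 10.0), eps=1e-5):
    """Return (mode, sd) of the Gaussian Laplace approximation."""
    res = minimize_scalar(lambda x: -log_post(x), bounds=bounds, method="bounded")
    mode = res.x
    # numerical second derivative of the log-posterior at the mode (should be negative)
    d2 = (log_post(mode + eps) - 2.0 * log_post(mode) + log_post(mode - eps)) / eps**2
    return mode, np.sqrt(-1.0 / d2)
```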

8.
Summary.  The retrieval of wind vectors from satellite scatterometer observations is a non-linear inverse problem. A common approach to solving inverse problems is to adopt a Bayesian framework and to infer the posterior distribution of the parameters of interest given the observations by using a likelihood model relating the observations to the parameters, and a prior distribution over the parameters. We show how Gaussian process priors can be used efficiently with a variety of likelihood models, using local forward (observation) models and direct inverse models for the scatterometer. We present an enhanced Markov chain Monte Carlo method to sample from the resulting multimodal posterior distribution. We go on to show how the computational complexity of the inference can be controlled by using a sparse, sequential Bayes algorithm for estimation with Gaussian processes. This helps to overcome the most serious barrier to the use of probabilistic, Gaussian process methods in remote sensing inverse problems, which is the prohibitively large size of the data sets. We contrast the sampling results with the approximations that are found by using the sparse, sequential Bayes algorithm.
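For readers unfamiliar with Gaussian process priors, the generic sketch below draws sample functions from a GP prior with a squared-exponential covariance. The kernel and its hyperparameters are illustrative choices only and are unrelated to the scatterometer models of the paper.

```python
# Draw sample paths from a zero-mean Gaussian process prior on a 1-D input grid.
import numpy as np

def gp_prior_draws(x, n_draws=3, amp=1.0, length=0.2, jitter=1e-8, seed=0):
    rng = np.random.default_rng(seed)
    d2 = (x[:, None] - x[None, :]) ** 2              # pairwise squared distances
    K = amp**2 * np.exp(-0.5 * d2 / length**2)       # squared-exponential covariance
    L = np.linalg.cholesky(K + jitter * np.eye(len(x)))
    return L @ rng.standard_normal((len(x), n_draws))   # each column is one prior draw
```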

9.
We consider a general class of prior distributions for nonparametric Bayesian estimation which uses finite random series with a random number of terms. A prior is constructed through distributions on the number of basis functions and the associated coefficients. We derive a general result on adaptive posterior contraction rates for all smoothness levels of the target function in the true model by constructing an appropriate ‘sieve’ and applying the general theory of posterior contraction rates. We apply this general result to several statistical problems such as density estimation, various nonparametric regressions, classification, spectral density estimation and functional regression. The prior can be viewed as an alternative to the commonly used Gaussian process prior, but properties of the posterior distribution can be analysed by relatively simpler techniques. An interesting approximation property of B-spline basis expansion established in this paper allows a canonical choice of prior on coefficients in a random series and allows a simple computational approach without using Markov chain Monte Carlo methods. A simulation study is conducted to show that the accuracies of the Bayesian estimators based on the random series prior and the Gaussian process prior are comparable. We apply the method to the Tecator data using functional regression models.

10.
Bayesian hierarchical modeling with Gaussian process random effects provides a popular approach for analyzing point-referenced spatial data. For large spatial data sets, however, generic posterior sampling is infeasible due to the extremely high computational burden of decomposing the spatial correlation matrix. In this paper, we propose an efficient algorithm, the adaptive griddy Gibbs (AGG) algorithm, to address the computational issues with large spatial data sets. The proposed algorithm dramatically reduces the computational complexity. We show theoretically that the proposed method can approximate the true posterior distribution accurately, and we derive the number of grid points sufficient for a required accuracy. We compare the performance of AGG with that of state-of-the-art methods in simulation studies. Finally, we apply AGG to spatially indexed data concerning building energy consumption.
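A minimal sketch of a single griddy Gibbs update, the building block that AGG adapts: evaluate an unnormalised full conditional on a grid, sample a grid cell from the discretised distribution, and jitter within the cell. The adaptive grid construction of the paper is not reproduced; the grid range and size are illustrative assumptions.

```python
# One griddy Gibbs update for a scalar parameter with unnormalised log full conditional log_cond.
import numpy as np

def griddy_gibbs_step(log_cond, lo, hi, n_grid=200, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    grid = np.linspace(lo, hi, n_grid)
    logp = np.array([log_cond(g) for g in grid])
    p = np.exp(logp - logp.max())
    p /= p.sum()                                    # discretised conditional probabilities
    idx = rng.choice(n_grid, p=p)                   # pick a grid cell
    half = (grid[1] - grid[0]) / 2.0
    return grid[idx] + rng.uniform(-half, half)     # jitter uniformly within the cell
```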

11.
Mixture models are flexible tools in density estimation and classification problems. Bayesian estimation of such models typically relies on sampling from the posterior distribution using Markov chain Monte Carlo. Label switching arises because the posterior is invariant to permutations of the component parameters. Methods for dealing with label switching have been studied fairly extensively in the literature, with the most popular approaches being those based on loss functions. However, many of these algorithms turn out to be too slow in practice, and they can become infeasible as the size and/or dimension of the data grow. We propose a new, computationally efficient algorithm based on a loss function interpretation, and we show that it can scale up well in large data set scenarios. We then review earlier solutions that scale up well for large data sets, and we compare their performance on simulated and real data sets. We conclude with discussion of, and recommendations regarding, all the methods studied.

12.
Abstract. We propose a Bayesian semiparametric methodology for quantile regression modelling. In particular, working with parametric quantile regression functions, we develop Dirichlet process mixture models for the error distribution in an additive quantile regression formulation. The proposed non‐parametric prior probability models allow the shape of the error density to adapt to the data and thus provide more reliable predictive inference than models based on parametric error distributions. We consider extensions to quantile regression for data sets that include censored observations. Moreover, we employ dependent Dirichlet processes to develop quantile regression models that allow the error distribution to change non‐parametrically with the covariates. Posterior inference is implemented using Markov chain Monte Carlo methods. We assess and compare the performance of our models using both simulated and real data sets.

13.
Approximate Bayesian computation (ABC) methods permit approximate inference for intractable likelihoods when it is possible to simulate from the model. However, they perform poorly for high-dimensional data and in practice must usually be used in conjunction with dimension reduction methods, resulting in a loss of accuracy which is hard to quantify or control. We propose a new ABC method for high-dimensional data based on rare event methods which we refer to as RE-ABC. This uses a latent variable representation of the model. For a given parameter value, we estimate the probability of the rare event that the latent variables correspond to data roughly consistent with the observations. This is performed using sequential Monte Carlo and slice sampling to systematically search the space of latent variables. In contrast, standard ABC can be viewed as using a more naive Monte Carlo estimate. We use our rare event probability estimator as a likelihood estimate within the pseudo-marginal Metropolis–Hastings algorithm for parameter inference. We provide asymptotics showing that RE-ABC has a lower computational cost for high-dimensional data than standard ABC methods. We also illustrate our approach empirically, on a Gaussian distribution and an application in infectious disease modelling.
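The abstract positions RE-ABC against standard ABC, which amounts to a naive accept/reject Monte Carlo step. Below is a minimal sketch of that baseline (rejection ABC with a summary statistic and tolerance); the rare-event, sequential Monte Carlo and slice-sampling machinery of RE-ABC itself is not shown, and `prior_sample`, `simulate`, `summary` and `eps` are problem-specific placeholders.

```python
# Baseline rejection-ABC sketch: accept a prior draw if its simulated summary is close to the data's.
import numpy as np

def abc_rejection(y_obs, prior_sample, simulate, summary, eps, n_accept=1000, seed=0):
    rng = np.random.default_rng(seed)
    s_obs = summary(y_obs)
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_sample(rng)                    # draw parameter from the prior
        y_sim = simulate(theta, rng)                 # simulate data from the model
        if np.linalg.norm(summary(y_sim) - s_obs) <= eps:
            accepted.append(theta)                   # keep if summaries are close enough
    return np.array(accepted)
```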

14.
Estimating parameters in a stochastic volatility (SV) model is a challenging task. Among other estimation methods and approaches, efficient simulation methods based on importance sampling have been developed for the Monte Carlo maximum likelihood estimation of univariate SV models. This paper shows that importance sampling methods can be used in a general multivariate SV setting. The sampling methods are computationally efficient. To illustrate the versatility of this approach, three different multivariate stochastic volatility models are estimated for a standard data set. The empirical results are compared to those from earlier studies in the literature. Monte Carlo simulation experiments, based on parameter estimates from the standard data set, are used to show the effectiveness of the importance sampling methods.
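To make the Monte Carlo maximum likelihood idea concrete, the sketch below estimates the log-likelihood of a basic univariate SV model by integrating out the latent log-volatilities with importance sampling, using the deliberately naive state-equation prior as the proposal. The paper's methods use far more efficient, model-tailored proposals and a multivariate setting; every setting here is illustrative.

```python
# Importance-sampling log-likelihood estimate for a basic SV model:
# y_t ~ N(0, exp(h_t)),  h_t = mu + phi (h_{t-1} - mu) + N(0, sigma_eta^2).
import numpy as np

def sv_loglik_is(y, mu, phi, sigma_eta, m=200, seed=0):
    rng = np.random.default_rng(seed)
    T = len(y)
    # draw m latent log-volatility paths from the AR(1) state equation (proposal = prior)
    h = np.empty((m, T))
    h[:, 0] = rng.normal(mu, sigma_eta / np.sqrt(1 - phi**2), size=m)
    for t in range(1, T):
        h[:, t] = mu + phi * (h[:, t - 1] - mu) + rng.normal(0.0, sigma_eta, size=m)
    # with proposal = prior, the importance weights reduce to the measurement density
    log_w = np.sum(-0.5 * (np.log(2 * np.pi) + h + y**2 * np.exp(-h)), axis=1)
    return np.log(np.mean(np.exp(log_w - log_w.max()))) + log_w.max()
```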

15.
Estimating parameters in a stochastic volatility (SV) model is a challenging task. Among other estimation methods and approaches, efficient simulation methods based on importance sampling have been developed for the Monte Carlo maximum likelihood estimation of univariate SV models. This paper shows that importance sampling methods can be used in a general multivariate SV setting. The sampling methods are computationally efficient. To illustrate the versatility of this approach, three different multivariate stochastic volatility models are estimated for a standard data set. The empirical results are compared to those from earlier studies in the literature. Monte Carlo simulation experiments, based on parameter estimates from the standard data set, are used to show the effectiveness of the importance sampling methods.

16.
The analysis of non-Gaussian time series by using state space models is considered from both classical and Bayesian perspectives. The treatment in both cases is based on simulation using importance sampling and antithetic variables; Markov chain Monte Carlo methods are not employed. Non-Gaussian disturbances for the state equation as well as for the observation equation are considered. Methods for estimating conditional and posterior means of functions of the state vector given the observations, and the mean-square errors of their estimates, are developed. These methods are extended to cover the estimation of conditional and posterior densities and distribution functions. The choice of importance sampling densities and antithetic variables is discussed. The techniques work well in practice and are computationally efficient. Their use is illustrated by applying them to a univariate discrete time series, a series with outliers and a volatility series.
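As a toy illustration of the antithetic-variable device mentioned above: pair each standard normal draw z with its mirror image -z, so that the two resulting estimates are negatively correlated for monotone functions and their average has reduced variance. This is a generic sketch of the principle, not the simulation-smoother construction used in the paper.

```python
# Antithetic-variates estimate of E[f(Z)] for Z ~ N(0, 1).
import numpy as np

def antithetic_mc(f, m=10000, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(m)
    return 0.5 * (np.mean(f(z)) + np.mean(f(-z)))   # average the two paired estimates

# e.g. antithetic_mc(np.exp) estimates E[exp(Z)] = exp(0.5); for monotone f the paired
# estimates are negatively correlated, typically beating 2m independent draws.
```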

17.
18.
Due to the escalating growth of big data sets in recent years, new Bayesian Markov chain Monte Carlo (MCMC) parallel computing methods have been developed. These methods partition large data sets by observations into subsets. However, for Bayesian nested hierarchical models, typically only a few parameters are common for the full data set, with most parameters being group specific. Thus, parallel Bayesian MCMC methods that take into account the structure of the model and split the full data set by groups rather than by observations are a more natural approach for analysis. Here, we adapt and extend a recently introduced two-stage Bayesian hierarchical modeling approach, and we partition complete data sets by groups. In stage 1, the group-specific parameters are estimated independently in parallel. The stage 1 posteriors are used as proposal distributions in stage 2, where the target distribution is the full model. Using three-level and four-level models, we show in both simulation and real data studies that results of our method agree closely with the full data analysis, with greatly increased MCMC efficiency and greatly reduced computation times. The advantages of our method versus existing parallel MCMC computing methods are also described.
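A hedged sketch of stage 1 of the two-stage idea: split the data by groups and fit each group's parameters independently, in parallel. The group-level sampler `fit_group` is a hypothetical placeholder for any MCMC routine returning posterior draws, and stage 2 (reusing the stage-1 posteriors as proposal distributions for the full model) is not shown.

```python
# Stage-1 sketch: run a group-specific sampler on each group's data in parallel.
from concurrent.futures import ProcessPoolExecutor

def stage1_parallel(grouped_data, fit_group, max_workers=4):
    # grouped_data: dict mapping group id -> that group's data
    # fit_group: picklable function returning posterior draws for one group
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        futures = {g: pool.submit(fit_group, d) for g, d in grouped_data.items()}
        return {g: f.result() for g, f in futures.items()}   # group id -> stage-1 draws
```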

19.
We demonstrate the use of our R package, gammSlice, for Bayesian fitting and inference in generalised additive mixed model analysis. This class of models includes generalised linear mixed models and generalised additive models as special cases. Accurate Bayesian inference is achievable via sufficiently large Markov chain Monte Carlo (MCMC) samples. Slice sampling is a key component of the MCMC scheme. Comparisons with existing generalised additive mixed model software show that gammSlice offers improved inferential accuracy, albeit at the cost of longer computational time.
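gammSlice itself is an R package; as a language-neutral illustration of its key component, here is a minimal univariate slice-sampling update (stepping-out and shrinkage, in the spirit of Neal, 2003). This generic sketch is not the gammSlice implementation, and the step width `w` is an assumed tuning constant.

```python
# One slice-sampling update for a scalar parameter with log-density log_f (up to a constant).
import numpy as np

def slice_sample_step(log_f, x0, w=1.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    log_y = log_f(x0) + np.log(rng.uniform())       # auxiliary "height" under the density
    # step out to bracket the slice {x : log_f(x) > log_y}
    left = x0 - w * rng.uniform()
    right = left + w
    while log_f(left) > log_y:
        left -= w
    while log_f(right) > log_y:
        right += w
    # shrink the bracket until a point inside the slice is drawn
    while True:
        x1 = rng.uniform(left, right)
        if log_f(x1) > log_y:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1
```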

20.
We develop a novel computational methodology for Bayesian optimal sequential design for nonparametric regression. This computational methodology, which we call inhomogeneous evolutionary Markov chain Monte Carlo, combines ideas from simulated annealing, genetic or evolutionary algorithms, and Markov chain Monte Carlo. Our framework allows optimality criteria with general utility functions and general classes of priors for the underlying regression function. We illustrate the usefulness of our methodology with applications to experimental design for nonparametric function estimation using Gaussian process priors and free-knot cubic spline priors.
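As a much simpler stand-in for the combined scheme, the sketch below performs a plain simulated-annealing search for a design that maximises an expected-utility function `utility(design)`. The evolutionary (population-based) moves and the MCMC components of the paper's algorithm are not reproduced, and `utility` and `propose` are hypothetical placeholders.

```python
# Simulated-annealing search over candidate designs (generic utility maximisation).
import numpy as np

def anneal_design(utility, propose, d0, n_iter=1000, t0=1.0, seed=0):
    rng = np.random.default_rng(seed)
    d, u = d0, utility(d0)
    best_d, best_u = d, u
    for i in range(n_iter):
        temp = t0 * (1.0 - i / n_iter) + 1e-6        # linear cooling schedule
        d_new = propose(d, rng)                      # perturb the current design
        u_new = utility(d_new)
        # always accept improvements; accept worse designs with annealed probability
        if u_new > u or rng.uniform() < np.exp((u_new - u) / temp):
            d, u = d_new, u_new
        if u > best_u:
            best_d, best_u = d, u
    return best_d, best_u
```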
