Similar Documents

20 similar documents found.
1.
Mixture distributions have recently become increasingly popular in many scientific fields. Statistical computation and analysis of mixture models, however, are extremely complex due to the large number of parameters involved. Both EM algorithms for likelihood inference and MCMC procedures for Bayesian analysis have various difficulties in dealing with mixtures with an unknown number of components. In this paper, we propose a direct sampling approach to the computation of Bayesian finite mixture models with a varying number of components. This approach requires only knowledge of the density function up to a multiplicative constant. It is easy to implement, numerically efficient, and very practical in real applications. A simulation study shows that it performs quite satisfactorily on relatively high dimensional distributions. A well-known genetic data set is used to demonstrate the simplicity of this method and its power for the computation of high dimensional Bayesian mixture models.
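For a point of comparison, the following is a minimal rejection-sampling sketch that works under the same informational requirement the abstract names, a density known only up to a multiplicative constant. The two-component mixture target, the Gaussian proposal, and the envelope constant are illustrative assumptions; this is not the paper's direct sampling algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def unnormalized_target(x):
    # Two-component Gaussian mixture, deliberately left unnormalized.
    return 0.3 * np.exp(-0.5 * (x + 2.0) ** 2) + 0.7 * np.exp(-0.5 * (x - 1.5) ** 2)

def proposal_logpdf(x, scale=4.0):
    # Broad Gaussian proposal density, N(0, scale^2).
    return -0.5 * (x / scale) ** 2 - np.log(scale * np.sqrt(2.0 * np.pi))

def rejection_sample(n, scale=4.0, M=12.0):
    # Requires M * proposal_pdf(x) >= unnormalized_target(x) for all x;
    # M = 12 is enough for this target (the ratio peaks near x = 1.5).
    draws = []
    while len(draws) < n:
        x = rng.normal(0.0, scale)
        if rng.uniform() < unnormalized_target(x) / (M * np.exp(proposal_logpdf(x, scale))):
            draws.append(x)
    return np.array(draws)

samples = rejection_sample(5000)
print(samples.mean(), samples.std())
```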

2.
ABSTRACT

We propose a multiple imputation method based on principal component analysis (PCA) to deal with incomplete continuous data. To reflect the uncertainty of the parameters from one imputation to the next, we use a Bayesian treatment of the PCA model. Using a simulation study and real data sets, the method is compared to two classical approaches: multiple imputation based on joint modelling and on fully conditional modelling. Unlike these approaches, the proposed method can easily be used on data sets where the number of individuals is smaller than the number of variables and where the variables are highly correlated. In addition, it provides unbiased point estimates of quantities of interest, such as an expectation, a regression coefficient or a correlation coefficient, with a smaller mean squared error. Furthermore, the confidence intervals built for the quantities of interest are often narrower whilst ensuring valid coverage.
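A minimal sketch of the iterative-PCA core that such imputation methods build on: missing cells are repeatedly refreshed from a low-rank SVD reconstruction until convergence. This is the single-imputation step only; the paper's contribution is the Bayesian treatment of the PCA model that turns this into proper multiple imputation. The rank, iteration count, and toy data are assumptions.

```python
import numpy as np

def pca_impute(X, rank=1, n_iter=100):
    X = np.asarray(X, dtype=float)
    mask = np.isnan(X)
    filled = np.where(mask, np.nanmean(X, axis=0), X)   # start from column means
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu
        filled[mask] = low_rank[mask]                   # refresh missing cells only
    return filled

X = np.array([[1.0, 2.0, np.nan],
              [2.0, np.nan, 6.1],
              [3.0, 6.1, 9.0],
              [4.0, 7.9, np.nan]])
print(pca_impute(X, rank=1))
```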

3.
Multivariate model validation is a complex decision-making problem involving comparison of multiple correlated quantities, based upon the available information and prior knowledge. This paper presents a Bayesian risk-based decision method for validation assessment of multivariate predictive models under uncertainty. A generalized likelihood ratio is derived as a quantitative validation metric based on Bayes' theorem and a Gaussian assumption for the errors between validation data and model predictions. The multivariate model is then assessed by comparing the likelihood ratio with a Bayesian decision threshold, a function of the decision costs and the prior of each hypothesis. The probability density function of the likelihood ratio is constructed using the statistics of multiple response quantities and Monte Carlo simulation. The proposed methodology is implemented in the validation of a transient heat conduction model, using a multivariate data set from experiments. The Bayesian methodology provides a quantitative approach that facilitates rational decisions in multivariate model assessment under uncertainty.
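A hedged sketch of the decision rule just described: a Gaussian likelihood ratio comparing "model valid" (zero-mean prediction error) against a mean-shifted alternative, thresholded by cost-weighted hypothesis priors. The error covariance, the shift under H1, and the costs and priors are all illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Differences between validation data and model predictions (illustrative).
errors = np.array([[0.10, -0.20], [0.05, 0.10], [-0.15, 0.20]])
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])   # assumed error covariance

loglik_h0 = multivariate_normal(mean=[0.0, 0.0], cov=Sigma).logpdf(errors).sum()
loglik_h1 = multivariate_normal(mean=[0.3, -0.3], cov=Sigma).logpdf(errors).sum()
log_ratio = loglik_h0 - loglik_h1                # generalized log likelihood ratio

# Bayes decision threshold from decision costs and hypothesis priors.
prior_h0, prior_h1 = 0.5, 0.5
cost_false_accept, cost_false_reject = 1.0, 1.0
log_threshold = np.log((cost_false_accept * prior_h1) / (cost_false_reject * prior_h0))

print("accept model" if log_ratio > log_threshold else "reject model")
```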

4.
Generalized linear mixed models (GLMMs) are commonly used to model treatment effects over time while controlling for important clinical covariates. Standard software procedures often provide estimates of the outcome based on the mean of the covariates; however, these estimates are biased for the true group means in the GLMM. Implementing GLMMs in the frequentist framework can also lead to convergence problems. We present a simulation study demonstrating that fully Bayesian GLMMs provide unbiased estimates of group means. These models are straightforward to implement and can be used for a broad variety of outcomes (e.g., binary, categorical, and count data) that arise in clinical trials. We demonstrate the proposed method on a data set from a clinical trial in diabetes.
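A small Monte Carlo illustration of the bias the abstract refers to: in a logistic mixed model, evaluating the inverse link at the mean linear predictor is not the population-averaged group mean. The coefficient and random-effect scale are illustrative assumptions; a fully Bayesian GLMM would instead average the inverse link over posterior and random-effect draws.

```python
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(1)
beta0 = 0.5      # fixed effect (log-odds) for the group, assumed
sigma_b = 1.5    # random-intercept standard deviation, assumed

b = rng.normal(0.0, sigma_b, size=1_000_000)
true_group_mean = expit(beta0 + b).mean()   # E[g^{-1}(beta0 + b)], the target
plug_in_estimate = expit(beta0)             # g^{-1}(beta0), the biased plug-in value

print(f"population-averaged group mean: {true_group_mean:.3f}")
print(f"plug-in (biased) estimate:      {plug_in_estimate:.3f}")
```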

5.
Spatial modeling is widely used in environmental sciences, biology, and epidemiology. Generalized linear mixed models employed to account for spatial variation in point-referenced data are called spatial generalized linear mixed models (SGLMMs). Frequentist analysis of this type of data is computationally difficult. On the other hand, the advent of Markov chain Monte Carlo algorithms has made Bayesian analysis of SGLMMs computationally convenient. The recent introduction of data cloning, which leads to maximum likelihood estimates, has made frequentist analysis of mixed models equally convenient. Data cloning has been employed to estimate model parameters in SGLMMs; however, the prediction of spatial random effects and kriging are also very important. In this article, we propose a frequentist approach based on data cloning to predict spatial random effects and perform kriging, together with prediction intervals. We illustrate this approach using a real dataset and a simulation study.
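A minimal sketch of the data-cloning idea in the simplest possible conjugate setting (a normal mean with known variance), so the "MCMC" is available in closed form: cloning the data K times makes the posterior collapse onto the maximum likelihood estimate, and K times the posterior variance recovers the asymptotic variance of the MLE. Real SGLMM applications need an actual sampler; the model here is only an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(3.0, 1.0, size=25)            # data, unit observation variance
prior_mean, prior_var = 0.0, 10.0            # assumed prior

for K in (1, 10, 100):
    y_cloned = np.tile(y, K)                 # K copies of the data set
    n = y_cloned.size
    post_var = 1.0 / (1.0 / prior_var + n)   # conjugate normal posterior
    post_mean = post_var * (prior_mean / prior_var + y_cloned.sum())
    print(f"K={K:4d}  posterior mean={post_mean:.4f}  K*posterior var={K * post_var:.4f}")

print(f"MLE (sample mean)={y.mean():.4f}  asymptotic var of MLE={1.0 / y.size:.4f}")
```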

6.
This paper presents a method for Bayesian inference on the regression parameters in a linear model with independent and identically distributed errors that does not require the specification of a parametric family of densities for the error distribution. The method first selects a nonparametric kernel density estimate of the error distribution that is unimodal and based on the least-squares residuals. Once the error distribution is selected, the Metropolis algorithm is used to obtain the marginal posterior distribution of the regression parameters. The methodology is illustrated with data sets, and its performance relative to standard Bayesian techniques is evaluated using simulation results.
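A compact sketch of the two-stage approach as described: estimate the error density from least-squares residuals with a kernel density estimate, then run a random-walk Metropolis sampler for the coefficients using that fixed density as the likelihood. The unimodality restriction on the selected density is skipped, and the simulated data, flat prior, and proposal scale are assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)
n = 200
x = rng.uniform(-2, 2, n)
y = 1.0 + 2.0 * x + rng.standard_t(df=4, size=n)   # heavy-tailed errors
X = np.column_stack([np.ones(n), x])

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
kde = gaussian_kde(y - X @ beta_ols)               # nonparametric error density

def log_post(beta):                                # flat prior on beta
    return np.log(kde(y - X @ beta)).sum()

beta, current = beta_ols.copy(), log_post(beta_ols)
draws = []
for _ in range(2000):
    prop = beta + rng.normal(0, 0.05, size=2)      # random-walk proposal
    cand = log_post(prop)
    if np.log(rng.uniform()) < cand - current:
        beta, current = prop, cand
    draws.append(beta)
print(np.mean(draws, axis=0))
```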

7.
This paper considers the problem of analysis of covariance (ANCOVA) from the Bayesian point of view under the assumption of an inverse Gaussian distribution for the response variable. We develop a fully Bayesian ANCOVA model based on conjugate prior distributions for the parameters in the model. Bayes estimators of the parameters, of the ANCOVA model, and of the adjusted effects for both treatments and covariates are developed, along with the predictive distribution of future observations. We also provide the essentials for comparing adjusted treatment effects and adjusted factor effects. A simulation study and a real-world application illustrate and evaluate the proposed Bayesian model.

8.
Efficient estimation of the regression coefficients in longitudinal data analysis requires a correct specification of the covariance structure; misspecification may lead to inefficient or biased estimators of the mean parameters. One of the most commonly used methods for handling the covariance matrix is simultaneous modelling based on the Cholesky decomposition. In this paper, we therefore reparameterize covariance structures in longitudinal data analysis through a modified Cholesky decomposition: the within-subject covariance matrix is decomposed into a unit lower triangular matrix containing moving average coefficients and a diagonal matrix containing innovation variances, both of which are modeled as linear functions of covariates. We then propose fully Bayesian inference for joint mean and covariance models based on this decomposition. A computationally efficient Markov chain Monte Carlo method combining the Gibbs sampler and the Metropolis–Hastings algorithm is implemented to simultaneously obtain Bayesian estimates of the unknown parameters and their standard deviation estimates. Finally, several simulation studies and a real example illustrate the proposed methodology.
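A small numerical sketch of the modified Cholesky decomposition used here: the covariance matrix is factored as Sigma = L D L', with L unit lower triangular (its sub-diagonal entries playing the role of the moving-average coefficients) and D diagonal (the innovation variances). The AR(1)-type covariance matrix is an illustrative assumption.

```python
import numpy as np

rho = 0.6
Sigma = rho ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))  # AR(1) covariance

C = np.linalg.cholesky(Sigma)   # ordinary lower Cholesky factor
d = np.diag(C)
L = C / d                       # unit lower triangular factor (columns rescaled)
D = np.diag(d ** 2)             # innovation variances

assert np.allclose(L @ D @ L.T, Sigma)
print("moving-average coefficients (sub-diagonal of L):\n", L)
print("innovation variances:", np.diag(D))
```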

9.
Multiple imputation is a common approach for dealing with missing values in statistical databases. The imputer fills in missing values with draws from predictive models estimated from the observed data, resulting in multiple, completed versions of the database. Researchers have developed a variety of default routines to implement multiple imputation; however, there has been limited research comparing the performance of these methods, particularly for categorical data. We use simulation studies to compare repeated sampling properties of three default multiple imputation methods for categorical data, including chained equations using generalized linear models, chained equations using classification and regression trees, and a fully Bayesian joint distribution based on Dirichlet process mixture models. We base the simulations on categorical data from the American Community Survey. In the circumstances of this study, the results suggest that default chained equations approaches based on generalized linear models are dominated by the default regression tree and Bayesian mixture model approaches. They also suggest competing advantages for the regression tree and Bayesian mixture model approaches, making both reasonable default engines for multiple imputation of categorical data. Supplementary material for this article is available online.
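A hedged sketch of one step of chained equations with a classification tree: a tree is fit for the incomplete variable given the others, and missing values are drawn from the tree's estimated category probabilities. Default engines iterate such steps over all incomplete variables; the simulated data and tree settings here are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
n = 500
x1 = rng.integers(0, 3, n)                 # fully observed categorical predictor
x2 = (x1 + rng.integers(0, 2, n)) % 3      # categorical target, related to x1
miss = rng.uniform(size=n) < 0.2           # 20% missing completely at random

tree = DecisionTreeClassifier(min_samples_leaf=20, random_state=0)
tree.fit(x1[~miss].reshape(-1, 1), x2[~miss])

# Draw imputations from the estimated conditional category probabilities.
proba = tree.predict_proba(x1[miss].reshape(-1, 1))
x2_imp = x2.copy()
x2_imp[miss] = [rng.choice(tree.classes_, p=p) for p in proba]
print("imputed", miss.sum(), "values")
```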

10.
Abstract. We investigate simulation methodology for Bayesian inference in Lévy-driven stochastic volatility (SV) models. Typically, Bayesian inference for such models is performed using Markov chain Monte Carlo (MCMC); this is often a challenging task. Sequential Monte Carlo (SMC) samplers can improve over MCMC, but they involve many user-set parameters. We develop a fully automated SMC algorithm, which substantially improves over the standard MCMC methods in the literature. To illustrate our methodology, we consider a model comprising a Heston model with an independent, additive variance gamma process in the returns equation. The driving gamma process can capture the stylized behaviour of many financial time series, and a discretized version, fitted in a Bayesian manner, has been found very useful for modelling equity data. We demonstrate that it is possible to draw exact inference, in the sense of no time-discretization error, from the Bayesian SV model.
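To make the SMC sampler idea concrete, here is a bare-bones sampler with a hand-fixed tempering schedule and random-walk Metropolis moves on a toy bimodal target. The paper's contribution is precisely the automation of the tuning choices (schedule, move kernels, resampling) that are fixed by hand below, and its actual target is a Lévy-driven SV posterior, not this toy.

```python
import numpy as np

rng = np.random.default_rng(5)

def log_target(x):                        # toy bimodal "posterior"
    return np.logaddexp(-0.5 * (x - 3) ** 2, -0.5 * (x + 3) ** 2)

def log_prior(x):                         # N(0, 5^2) starting distribution
    return -0.5 * (x / 5.0) ** 2

N = 2000
x = rng.normal(0, 5, N)                   # initial particles from the prior
temps = np.linspace(0, 1, 21)             # hand-set geometric tempering ladder
for t0, t1 in zip(temps[:-1], temps[1:]):
    logw = (t1 - t0) * (log_target(x) - log_prior(x))   # incremental weights
    w = np.exp(logw - logw.max()); w /= w.sum()
    x = x[rng.choice(N, size=N, p=w)]     # multinomial resampling
    for _ in range(5):                    # MH moves targeting the tempered density
        logp = lambda z: t1 * log_target(z) + (1 - t1) * log_prior(z)
        prop = x + rng.normal(0, 1.0, N)
        accept = np.log(rng.uniform(size=N)) < logp(prop) - logp(x)
        x = np.where(accept, prop, x)

print("mean ~", x.mean(), "(should be near 0 for the symmetric bimodal target)")
```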

11.
In recent years, zero-inflated count data models, such as zero-inflated Poisson (ZIP) models, have been widely used, since count data with extra zeros are very common in many practical problems. To model correlated count data that are either clustered or repeated, and to assess the effects of continuous covariates or of time scales in a flexible way, a class of semiparametric mixed-effects models for zero-inflated count data is considered. In this article, we propose fully Bayesian inference for such models based on a data augmentation scheme that reflects both the random effects of covariates and the mixture structure of the zero-inflated distribution. A computationally efficient MCMC method combining the Gibbs sampler and the Metropolis–Hastings algorithm is implemented to estimate the model parameters. Finally, a simulation study and a real example illustrate the proposed methodologies.
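A minimal Gibbs sampler for a plain ZIP model via the kind of data augmentation described above: a latent indicator marks each zero as structural or Poisson, making every full conditional conjugate. The covariates and random effects that the paper's semiparametric models add are omitted, and the priors are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
n, true_pi, true_lam = 500, 0.3, 2.5
y = np.where(rng.uniform(size=n) < true_pi, 0, rng.poisson(true_lam, n))

pi, lam = 0.5, 1.0
draws = []
for it in range(3000):
    # z_i = 1 means "structural zero"; only observed zeros are ambiguous.
    p_struct = pi / (pi + (1 - pi) * np.exp(-lam))
    z = (y == 0) & (rng.uniform(size=n) < p_struct)
    # Conjugate updates: Beta(1, 1) prior on pi, Gamma(0.1, 0.1) prior on lambda.
    pi = rng.beta(1 + z.sum(), 1 + n - z.sum())
    lam = rng.gamma(0.1 + y[~z].sum(), 1.0 / (0.1 + (~z).sum()))
    draws.append((pi, lam))

print(np.mean(draws[1000:], axis=0))   # posterior means after burn-in
```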

12.
This paper presents a Bayesian method for the analysis of toxicological multivariate mortality data when the discrete mortality rate for each family of subjects at a given time depends on familial random effects and the toxicity level experienced by the family. Our aim is to model and analyse one set of such multivariate mortality data with large family sizes: the potassium thiocyanate (KSCN) tainted fish tank data of O'Hara Hines. The model used is based on a discretized hazard with additional time-varying familial random effects. A similar previous study (using sodium thiocyanate (NaSCN)) is used to construct a prior for the parameters in the current study. A simulation-based approach is used to compute posterior estimates of the model parameters and mortality rates and several other quantities of interest. Recent tools in Bayesian model diagnostics and variable subset selection have been incorporated to verify important modelling assumptions regarding the effects of time and heterogeneity among the families on the mortality rate. Further, Bayesian methods using predictive distributions are used for comparing several plausible models.

13.
Time-varying parameter models with stochastic volatility are widely used to study macroeconomic and financial data. These models are almost exclusively estimated using Bayesian methods. A common practice is to focus on prior distributions that themselves depend on relatively few hyperparameters, such as the scaling factor for the prior covariance matrix of the residuals governing time variation in the parameters. The choice of these hyperparameters is crucial because their influence is sizeable for standard sample sizes. In this article, we treat the hyperparameters as part of a hierarchical model and propose a fast, tractable, easy-to-implement, and fully Bayesian approach to estimate those hyperparameters jointly with all other parameters in the model. We show via Monte Carlo simulations that, in this class of models, our approach can drastically improve on using fixed hyperparameters previously proposed in the literature. Supplementary materials for this article are available online.

14.
We propose a general latent variable model for multivariate ordinal categorical variables, in which both the responses and the covariates are ordinal, to assess the effect of the covariates on the responses and to model the covariance structure of the response variables. A fully Bayesian approach is employed to analyze the model. The Gibbs sampler is used to simulate the joint posterior distribution of the latent variables and the parameters, and parameter expansion and reparameterization techniques are used to speed up convergence. The proposed model and method are demonstrated by simulation studies and a real data example.
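A hedged sketch of the latent-variable augmentation behind such models: for each ordinal response, a Gaussian latent score is drawn from the region between its category thresholds, after which the model is linear in the latents. The thresholds, linear predictor, and responses below are illustrative assumptions; the full Gibbs sampler also updates coefficients, thresholds, and the latent covariates.

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(7)
cut = np.array([-np.inf, -1.0, 0.5, np.inf])   # thresholds for 3 categories
eta = np.array([0.2, -0.4, 1.1, 0.0])          # linear predictor per observation
y = np.array([2, 1, 3, 2])                     # observed ordinal categories (1..3)

# truncnorm takes bounds standardized by loc/scale.
lo = cut[y - 1] - eta
hi = cut[y] - eta
latent = truncnorm.rvs(lo, hi, loc=eta, scale=1.0, random_state=rng)
print(latent)                                  # one Gibbs draw of the latent scores
```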

15.
Abstract

Handling data with a nonignorable missingness mechanism is still a challenging problem in statistics. In this paper, we develop a fully Bayesian adaptive Lasso approach for quantile regression models with nonignorably missing response data, where the missingness mechanism is specified by a logistic regression model. The proposed method extends the Bayesian Lasso by allowing different penalization parameters for different regression coefficients. Furthermore, a hybrid algorithm combining the Gibbs sampler and the Metropolis-Hastings algorithm is implemented to simulate the parameters from their posterior distributions, including the regression coefficients, the shrinkage coefficients, and the parameters of the nonignorable missingness model. Finally, simulation studies and a real example illustrate the proposed methodology.
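A compact sketch of the Bayesian quantile regression core that such approaches build on: the working likelihood is the asymmetric Laplace density, whose log is the negative check loss. Only a random-walk Metropolis step for the coefficients under a flat prior is shown; the adaptive Lasso shrinkage priors and the logistic missingness model are omitted, and all settings are assumptions.

```python
import numpy as np

rng = np.random.default_rng(8)
n, tau = 300, 0.5                                # tau: quantile of interest
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

def log_lik(beta):
    u = y - X @ beta
    return -np.sum(u * (tau - (u < 0)))          # negative check loss (ALD, sigma = 1)

beta, cur = np.zeros(2), log_lik(np.zeros(2))
draws = []
for _ in range(10000):
    prop = beta + rng.normal(0, 0.05, 2)         # random-walk proposal
    cand = log_lik(prop)
    if np.log(rng.uniform()) < cand - cur:
        beta, cur = prop, cand
    draws.append(beta)
print(np.mean(draws[2000:], axis=0))             # posterior means after burn-in
```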

16.
Considerable progress has been made in applying Markov chain Monte Carlo (MCMC) methods to the analysis of epidemic data. However, this likelihood-based method can be inefficient due to the limited data available concerning an epidemic outbreak. This paper considers an alternative approach to studying epidemic data using Approximate Bayesian Computation (ABC) methodology. ABC is a simulation-based technique for obtaining an approximate sample from the posterior distribution of the model parameters and, in an epidemic context, is very easy to implement. A new approach to ABC is introduced which generates a set of values from the (approximate) posterior distribution of the parameters during each simulation, rather than a single value. This is based upon coupling simulations with different sets of parameters, and we call the resulting algorithm coupled ABC. The new methodology is used to analyse final size data for epidemics amongst communities partitioned into households. For the epidemic data sets considered, coupled ABC is shown to be more efficient than standard ABC and MCMC-ABC.
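A minimal plain-ABC rejection sketch in the spirit of the epidemic application: final sizes are simulated from a crude chain-binomial (Reed-Frost-type) epidemic, and parameter values whose simulated final size matches the observed one are kept. This is standard ABC, not the coupled variant, which reuses each simulation to yield a set of accepted values; the population size, observed final size, prior range, and tolerance are assumptions.

```python
import numpy as np

rng = np.random.default_rng(9)
N_POP, observed_final_size = 50, 30

def simulate_final_size(beta):
    # Discrete-time chain-binomial epidemic in a closed population.
    s, i = N_POP - 1, 1
    while i > 0:
        new_inf = rng.binomial(s, 1 - np.exp(-beta * i / N_POP))
        s, i = s - new_inf, new_inf      # each infective is removed after one step
    return N_POP - s

accepted = [b for b in rng.uniform(0.5, 5.0, 20000)
            if abs(simulate_final_size(b) - observed_final_size) <= 2]
print(len(accepted), "accepted; posterior mean ~", np.mean(accepted))
```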

17.
In this paper, the author presents an efficient method for analyzing an interest-rate model using a new approach called 'data augmentation Bayesian forecasting.' First, a dynamic linear model estimate is constructed with a hierarchically incorporated model. Next, an observational replication is generated from the one-step forecast distribution derived from the model. Treating this replication as a new observation, a Markov chain Monte Carlo sampling method is run and the unknown parameters are estimated. The EM algorithm is applied to set initial values of the unknown parameters, while a 'quasi Bayes factor' is used to evaluate candidate parameters. 'Data augmentation Bayesian forecasting' evaluates the transition and history of the 'future,' 'present' and 'past' of an arbitrary stochastic process, based on a probability measure that is sequentially modified as additional information arrives. Future prediction results can thus be used to modify the model in order to grasp the present state or re-evaluate the past state, and modification of the present and the past can in turn raise the precision of future predictions. 'Data augmentation Bayesian forecasting' is therefore applicable not only to financial data analysis but also to forecasting and controlling stochastic processes.

18.
The use of relevance vector machines to flexibly model hazard rate functions is explored. The technique is adapted to survival analysis problems through the partial logistic approach. The method exploits the Bayesian automatic relevance determination procedure to obtain sparse solutions, and it incorporates the flexibility of kernel-based models. Example results are presented on literature data from a head-and-neck cancer survival study using Gaussian and spline kernels. Sensitivity analysis is conducted to assess the influence of the hyperprior distribution parameters. The proposed method is then contrasted, via a simulation study, with other flexible hazard regression methods, in particular the HARE model proposed by Kooperberg et al. [16]. The model developed in this paper exhibits good performance in predicting the hazard rate. Application of this sparse Bayesian technique to a real cancer data set demonstrates that the proposed method can reveal characteristics of the hazards, associated with the dynamics of the studied diseases, that may be missed by existing modeling approaches based on different perspectives on the bias-variance balance.

19.
Bivariate exponential models have often been used for the analysis of competing risks data involving two correlated risk components. Competing risks data consist only of the time to failure and the cause of failure. In situations where there is a positive probability of simultaneous failure, possibly the most widely used model is the Marshall–Olkin (J. Amer. Statist. Assoc. 62 (1967) 30) bivariate lifetime model. This distribution is not absolutely continuous, as it involves a singularity component. However, the likelihood function based on the competing risks data is then identifiable, and any inference, Bayesian or frequentist, can be carried out in a straightforward manner. For the analysis of absolutely continuous bivariate exponential models, standard approaches often run into difficulty due to the lack of a fully identifiable likelihood (Basu and Ghosh; Commun. Statist. Theory Methods 9 (1980) 1515). To overcome the nonidentifiability, the usual frequentist approach is based on an integrated likelihood. Such an approach is implicit in Wada et al. (Calcutta Statist. Assoc. Bull. 46 (1996) 197), who proved some related asymptotic results. We offer in this paper an alternative Bayesian approach. Since systematic prior elicitation is often difficult, the present study focuses on Bayesian analysis with noninformative priors. It turns out that, with an appropriate reparameterization, standard noninformative priors such as Jeffreys' prior and its variants can be applied directly even though the likelihood is not fully identifiable. Two noninformative priors are developed, consisting of Laplace's prior for the nonidentifiable parameters combined with Laplace's and Jeffreys' priors, respectively, for the identifiable parameters. The resulting Bayesian procedures possess some frequentist optimality properties as well. Finally, these Bayesian methods are illustrated with analyses of a data set originating from a lung cancer clinical trial conducted by the Eastern Cooperative Oncology Group.
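A short simulation sketch of the Marshall–Olkin model referenced above: each component lifetime is the minimum of an individual exponential and a common shock, which produces the positive probability of simultaneous failure that makes the competing risks likelihood identifiable. The rates are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(10)
lam1, lam2, lam12 = 1.0, 1.5, 0.5        # individual and common-shock rates
n = 100_000

e1 = rng.exponential(1 / lam1, n)
e2 = rng.exponential(1 / lam2, n)
e12 = rng.exponential(1 / lam12, n)      # common shock hitting both components
x, y = np.minimum(e1, e12), np.minimum(e2, e12)

t = np.minimum(x, y)                     # observed failure time
cause = np.where(x < y, 1, np.where(y < x, 2, 3))   # 3 = simultaneous failure
print("P(simultaneous failure) ~", (cause == 3).mean(),
      "theory:", lam12 / (lam1 + lam2 + lam12))
```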

20.
This paper introduces a new bivariate exponential distribution, called the Bivariate Affine-Linear Exponential distribution, to model moderately negatively dependent data. The construction and characteristics of the proposed bivariate distribution are presented, along with estimation procedures for the model parameters based on maximum likelihood and objective Bayesian analysis. We derive the Jeffreys prior and discuss its frequentist properties based on a simulation study and MCMC sampling techniques. A real data set of mercury concentrations in largemouth bass from Florida lakes is used to illustrate the methodology.
