Similar literature
Found 20 similar documents (search time: 31 ms)
1.
Missing data, a common but challenging issue in most studies, may lead to biased and inefficient inferences if handled inappropriately. As a natural and powerful way of dealing with missing data, the Bayesian approach has received much attention in the literature. This paper reviews recent developments and applications of Bayesian methods for dealing with ignorable and non-ignorable missing data. We first introduce missing data mechanisms and the Bayesian framework for dealing with missing data, and then introduce missing data models under ignorable and non-ignorable missing data circumstances based on the literature. After that, important issues of Bayesian inference, including prior construction, posterior computation, model comparison and sensitivity analysis, are discussed. Finally, several issues that deserve further research are summarized.

2.
This article proposes a Bayesian approach, which can simultaneously obtain the Bayesian estimates of unknown parameters and random effects, to analyze nonlinear reproductive dispersion mixed models (NRDMMs) for longitudinal data with nonignorable missing covariates and responses. The logistic regression model is employed to model the missing data mechanisms for missing covariates and responses. A hybrid sampling procedure combining the Gibbs sampler and the Metropolis-Hastings algorithm is presented to draw observations from the conditional distributions. Because the missing data mechanism is not testable, we develop the logarithm of the pseudo-marginal likelihood, the deviance information criterion, the Bayes factor, and the pseudo-Bayes factor to compare several competing missing data mechanism models in the NRDMMs considered here with nonignorable missing covariates and responses. Three simulation studies and a real example taken from the paediatric AIDS clinical trial group ACTG are used to illustrate the proposed methodologies. Empirical results show that our proposed methods are effective in selecting missing data mechanism models.

3.
Dealing with incomplete data is a pervasive problem in statistical surveys. Bayesian networks have recently been used in missing data imputation. In this research, we propose a new methodology for the multivariate imputation of missing data using discrete Bayesian networks and conditional Gaussian Bayesian networks. Results from imputing missing values in a coronary artery disease data set and a milk composition data set, as well as a simulation study from the cancer-neapolitan network, are presented to demonstrate and compare the performance of three Bayesian network-based imputation methods with those of multivariate imputation by chained equations (MICE) and the classical hot-deck imputation method. To assess the effect of the structure learning algorithm on the performance of the Bayesian network-based methods, two methods, the Peter-Clark algorithm and greedy search-and-score, have been applied. The Bayesian network-based methods are: first, the method introduced by Di Zio et al. [Bayesian networks for imputation, J. R. Stat. Soc. Ser. A 167 (2004), 309–322], in which each missing item of a variable is imputed using the information given in the parents of that variable; second, the method of Di Zio et al. [Multivariate techniques for imputation based on Bayesian networks, Neural Netw. World 15 (2005), 303–310], which uses the information in the Markov blanket set of the variable to be imputed; and finally, our new proposed method, which applies the whole available knowledge of all variables of interest, consisting of the Markov blanket and hence the parent set, to impute a missing item. Results indicate the high quality of our new proposed method, especially in the presence of high missingness percentages and more connected networks. The new method has also been shown to be more efficient than the MICE method for small sample sizes with high missing rates.

4.
In this paper, we develop Bayesian methodology and computational algorithms for variable subset selection in Cox proportional hazards models with missing covariate data. A new joint semi-conjugate prior for the piecewise exponential model is proposed in the presence of missing covariates and its properties are examined. The covariates are assumed to be missing at random (MAR). Under this new prior, a version of the Deviance Information Criterion (DIC) is proposed for Bayesian variable subset selection in the presence of missing covariates. Monte Carlo methods are developed for computing the DICs for all possible subset models in the model space. A Bone Marrow Transplant (BMT) dataset is used to illustrate the proposed methodology.

5.
The aim of this study is to determine the effect of informative priors for variables with missing values and to compare Bayesian Cox regression with Cox regression analysis. For this purpose, simulated data sets with different sample sizes and different missing rates were first generated, and each data set was analysed by Cox regression and by Bayesian Cox regression with an informative prior. Second, a lung cancer data set was analysed as a real-data example. Using informative priors for variables with missing values solved the missing data problem.

6.
In some fields, we are forced to work with missing data in multivariate time series. Unfortunately, the data analysis in this context cannot be carried out in the same way as in the case of complete data. To deal with this problem, a Bayesian analysis of multivariate threshold autoregressive models with exogenous inputs and missing data is carried out. In this paper, Markov chain Monte Carlo methods are used to obtain samples from the posterior distributions involved, including those of the threshold values and the missing data. To identify the autoregressive orders, we adapt the Bayesian variable selection method to this class of multivariate processes. The number of regimes is estimated using marginal likelihood or product parameter-space strategies.

7.
A general Bayesian random effects model for analyzing longitudinal mixed correlated continuous and negative binomial responses with and without missing data is presented. This Bayesian model, given some random effects, uses a normal distribution for the continuous response and a negative binomial distribution for the count response. A Markov chain Monte Carlo sampling algorithm is described for estimating the posterior distribution of the parameters. This Bayesian model is illustrated by a simulation study. For sensitivity analysis, the use of posterior curvature is proposed to investigate how the parameter estimates change under perturbation from the missing at random assumption to the not missing at random assumption. The model is applied to a medical data set, obtained from an observational study on women, where the correlated responses are the negative binomial response of joint damage and the continuous response of body mass index. The simultaneous effects of some covariates on both responses are also investigated.

8.
Missing data are often problematic in social network analysis since what is missing may potentially alter the conclusions about what we have observed as tie-variables need to be interpreted in relation to their local neighbourhood and the global structure. Some ad hoc methods for dealing with missing data in social networks have been proposed but here we consider a model-based approach. We discuss various aspects of fitting exponential family random graph (or p-star) models (ERGMs) to networks with missing data and present a Bayesian data augmentation algorithm for the purpose of estimation. This involves drawing from the full conditional posterior distribution of the parameters, something which is made possible by recently developed algorithms. With ERGMs already having complicated interdependencies, it is particularly important to provide inference that adequately describes the uncertainty, something that the Bayesian approach provides. To the extent that we wish to explore the missing parts of the network, the posterior predictive distributions, immediately available at the termination of the algorithm, are at our disposal, which allows us to explore the distribution of what is missing unconditionally on any particular parameter values. Some important features of treating missing data and of the implementation of the algorithm are illustrated using a well-known collaboration network and a variety of missing data scenarios.

9.
Handling data with a nonignorable missingness mechanism is still a challenging problem in statistics. In this paper, we develop a fully Bayesian adaptive Lasso approach for quantile regression models with nonignorably missing response data, where the nonignorable missingness mechanism is specified by a logistic regression model. The proposed method extends the Bayesian Lasso by allowing different penalization parameters for different regression coefficients. Furthermore, a hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is implemented to simulate the parameters from their posterior distributions, mainly the regression coefficients, the shrinkage coefficients, and the parameters of the nonignorable missingness model. Finally, some simulation studies and a real example are used to illustrate the proposed methodology.

10.
Family studies are often conducted to examine the existence of familial aggregation. In particular, twin studies can model the genetic and environmental contributions separately. Here we estimate the heritability of quantitative traits via variance components of random effects in linear mixed models (LMMs). The motivating example was a myopia twin study containing complex nested data structures: twins and siblings in the same family and observations on both eyes for each individual. Three models are considered for this nesting structure. Our proposal takes into account the model uncertainty in both covariates and model structures via an extended Bayesian model averaging (EBMA) procedure. We estimate the heritability using EBMA under three suggested model structures. When compared with the results under the model with the highest posterior model probability, the EBMA estimate has smaller variation and is slightly conservative. Simulation studies are conducted to evaluate the performance of the variance-component estimates, as well as the selection of risk factors, under the correct or incorrect structure. The results indicate that EBMA, which considers uncertainty in both covariates and model structures, is more robust to model misspecification than the usual Bayesian model averaging (BMA), which considers only uncertainty in covariate selection.

11.
The use of parametric linear mixed models and generalized linear mixed models to analyze longitudinal data collected during randomized controlled trials (RCTs) is conventional. The application of these methods, however, is restricted by the various assumptions these models require. When the number of observations per subject is sufficiently large and individual trajectories are noisy, functional data analysis (FDA) methods serve as an alternative to parametric longitudinal data analysis techniques. However, the use of FDA in RCTs is rare. In this paper, the effectiveness of FDA and linear mixed models (LMMs) was compared by analyzing data from rural persons living with HIV and comorbid depression enrolled in a depression treatment randomized clinical trial. Interactive voice response systems were used for weekly administrations of the 10-item Self-Administered Depression Scale (SADS) over 41 weeks. Functional principal component analysis and functional regression analysis methods detected a statistically significant difference in SADS between telephone-administered interpersonal psychotherapy (tele-IPT) and controls, but linear mixed effects model results did not. Additional simulation studies were conducted to compare FDA and LMMs under a different nonlinear trajectory assumption. In this clinical trial, with sufficient measured outcomes per subject and individual trajectories that are noisy and nonlinear, we found FDA methods to be a better alternative to LMMs.

12.
We propose methods for Bayesian inference for missing covariate data with a novel class of semi-parametric survival models with a cure fraction. We allow the missing covariates to be either categorical or continuous and specify a parametric distribution for the covariates that is written as a sequence of one-dimensional conditional distributions. We assume that the missing covariates are missing at random (MAR) throughout. We propose an informative class of joint prior distributions for the regression coefficients and the parameters arising from the covariate distributions. The proposed class of priors is shown to be useful in recovering information on the missing covariates, especially in situations where the missing data fraction is large. Properties of the proposed prior and resulting posterior distributions are examined. Also, model checking techniques are proposed for sensitivity analyses and for checking the goodness of fit of a particular model. Specifically, we extend the Conditional Predictive Ordinate (CPO) statistic to assess goodness of fit in the presence of missing covariate data. Computational techniques using the Gibbs sampler are implemented. A real data set involving a melanoma cancer clinical trial is examined to demonstrate the methodology.

13.
Multiple imputation is a common approach for dealing with missing values in statistical databases. The imputer fills in missing values with draws from predictive models estimated from the observed data, resulting in multiple, completed versions of the database. Researchers have developed a variety of default routines to implement multiple imputation; however, there has been limited research comparing the performance of these methods, particularly for categorical data. We use simulation studies to compare repeated sampling properties of three default multiple imputation methods for categorical data, including chained equations using generalized linear models, chained equations using classification and regression trees, and a fully Bayesian joint distribution based on Dirichlet process mixture models. We base the simulations on categorical data from the American Community Survey. In the circumstances of this study, the results suggest that default chained equations approaches based on generalized linear models are dominated by the default regression tree and Bayesian mixture model approaches. They also suggest competing advantages for the regression tree and Bayesian mixture model approaches, making both reasonable default engines for multiple imputation of categorical data. Supplementary material for this article is available online.
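
The abstract above describes analysing multiple completed versions of a database. A minimal sketch of how the per-imputation results are commonly pooled afterwards, using Rubin's combining rules, might look as follows. The function name `pool_rubin` and its arguments are hypothetical and are not taken from the article, which does not state its pooling code.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data results with Rubin's combining rules.

    estimates: point estimates, one per imputed data set (hypothetical input)
    variances: their estimated sampling variances, in the same order
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total
```

The factor `1 + 1 / m` inflates the between-imputation variance to account for using a finite number of imputations.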

14.
In Rubin (1976) the missing at random (MAR) and missing completely at random (MCAR) conditions are discussed. It is concluded that the MAR condition allows one to ignore the missing data mechanism when doing likelihood or Bayesian inference but also that the stronger MCAR condition is in some sense the weakest generally sufficient condition allowing (conditional) frequentist inference while ignoring the missing data mechanism. In this paper it is shown that (a slightly strengthened version of) the MAR condition is sufficient to yield ordinary large sample results for estimators and test statistics and thus may be used for (asymptotic) frequentist inference.
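
For reference, the two conditions named in the abstract above are usually written as follows. The notation is an assumption made here for illustration (complete data Y split into observed and missing parts, missingness indicators R, data-model parameters θ, mechanism parameters φ) and is not taken from the paper.

```latex
\begin{align*}
\text{MCAR:}\quad & f(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi) = f(R \mid \phi), \\
\text{MAR:}\quad  & f(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi) = f(R \mid Y_{\mathrm{obs}}, \phi), \\
\text{so under MAR:}\quad & f(Y_{\mathrm{obs}}, R \mid \theta, \phi)
  = f(R \mid Y_{\mathrm{obs}}, \phi) \int f(Y_{\mathrm{obs}}, Y_{\mathrm{mis}} \mid \theta)\, dY_{\mathrm{mis}}
  = f(R \mid Y_{\mathrm{obs}}, \phi)\, f(Y_{\mathrm{obs}} \mid \theta).
\end{align*}
```

When θ and φ are also distinct, the first factor does not involve θ, which is why likelihood and Bayesian inference about θ can ignore the mechanism under MAR.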

15.
Asthma is an important chronic disease of childhood. An intervention programme for managing asthma was designed on principles of self-regulation and was evaluated by a randomized longitudinal study. The study focused on several outcomes, and, typically, missing data remained a pervasive problem. We develop a pattern-mixture model to evaluate the outcome of intervention on the number of hospitalizations with non-ignorable dropouts. Pattern-mixture models are not generally identifiable as no data may be available to estimate a number of model parameters. Sensitivity analyses are performed by imposing structures on the unidentified parameters. We propose a parameterization which permits sensitivity analyses on clustered longitudinal count data that have missing values due to non-ignorable missing data mechanisms. This parameterization is expressed as ratios between event rates across missing data patterns and the observed data pattern and thus measures departures from an ignorable missing data mechanism. Sensitivity analyses are performed within a Bayesian framework by averaging over different prior distributions on the event ratios. This model has the advantage of providing an intuitive and flexible framework for incorporating the uncertainty of the missing data mechanism in the final analysis.
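
As a rough illustration of the idea in the abstract above, a pattern-mixture model factorizes the joint distribution of the outcomes y and the missingness pattern r by pattern, and a rate-ratio sensitivity parameter can then be attached to each pattern. The symbols below (δ_k, λ_k, λ_obs, θ, ψ) are hypothetical and chosen here for illustration; the article's exact parameterization is not given in the abstract.

```latex
\begin{align*}
\text{pattern-mixture factorization:}\quad
  & f(y, r \mid \theta, \psi) = f(y \mid r, \theta)\, f(r \mid \psi), \\
\text{rate ratio for missing-data pattern } k:\quad
  & \delta_k = \frac{\lambda_k}{\lambda_{\mathrm{obs}}},
\end{align*}
```

In this sketch λ_k is the event rate in pattern k and λ_obs is the event rate in the observed data pattern. Setting every δ_k to 1 would correspond to no departure from the observed-pattern rate, and priors on δ_k away from 1 would express increasingly non-ignorable dropout.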

16.
This paper considers multiple change-point estimation for the exponential distribution with truncated and censored data by Gibbs sampling. After all the missing data of interest are filled in by a sampling method such as rejection sampling, the complete-data likelihood function is obtained. The full conditional distributions of all parameters are discussed. The means of the Gibbs samples are taken as Bayesian estimates of the parameters. The implementation steps of Gibbs sampling are introduced in detail. Finally, a random simulation test is carried out, and the results show that the Bayesian estimates are fairly accurate.
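
The fill-in-then-sample cycle described above can be sketched on a much simpler problem than the paper's. The sketch below makes several assumptions that are not in the abstract: a single exponential rate with no change-points, right censoring only, a Gamma(shape, rate) prior, and imputation via the memoryless property of the exponential distribution rather than rejection sampling. The function name and all parameters are hypothetical.

```python
import random


def gibbs_censored_exponential(times, censored, n_iter=2000, burn_in=500,
                               prior_shape=1.0, prior_rate=1.0, seed=0):
    """Posterior-mean estimate of an exponential rate from right-censored data.

    times:    observed values (event time, or censoring time if censored)
    censored: booleans, True where the value is a censoring time
    """
    rng = random.Random(seed)
    n = len(times)
    lam = 1.0  # arbitrary starting value for the rate
    kept = []
    for it in range(n_iter):
        # Imputation step: a censored lifetime exceeds its censoring time c,
        # and by memorylessness the excess is again Exponential(lam).
        completed = [t + rng.expovariate(lam) if c else t
                     for t, c in zip(times, censored)]
        # Posterior step: with complete data the Gamma prior is conjugate, so
        # lam | data ~ Gamma(shape + n, rate + sum of completed times).
        # random.gammavariate takes a scale parameter, hence the reciprocal.
        lam = rng.gammavariate(prior_shape + n,
                               1.0 / (prior_rate + sum(completed)))
        if it >= burn_in:
            kept.append(lam)
    # The mean of the retained Gibbs draws serves as the Bayesian estimate.
    return sum(kept) / len(kept)
```

A call such as `gibbs_censored_exponential([0.3, 1.0, 0.7], [False, True, False])` treats the second value as censored at 1.0. Extending the sketch to change-points would require additional full conditionals for the change-point locations, which are not attempted here.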

17.
This paper is concerned with Bayesian estimation of a spatial regression model with skew non-Gaussian errors. The regression parameters are estimated by using a closed skew normal (CSN) distribution, which is closed under conditioning and linear combination. The proposed model captures skewness in the response variable. Because missing observations may occur in the response variable, we model and predict them by a Bayesian approach using Gibbs sampling methods. Next, a simulation study is performed to assess the validity of our model. The proposed model is also applied to CO data from Tehran, the capital city of Iran. Then, the accuracy of the CSN and Gaussian models is compared by a cross-validation criterion.

18.
If unit‐level data are available, small area estimation (SAE) is usually based on models formulated at the unit level, but they are ultimately used to produce estimates at the area level and thus involve area‐level inferences. This paper investigates the circumstances under which using an area‐level model may be more effective. Linear mixed models (LMMs) fitted using different levels of data are applied in SAE to calculate synthetic estimators and empirical best linear unbiased predictors (EBLUPs). The performance of area‐level models is compared with unit‐level models when both individual and aggregate data are available. A key factor is whether there are substantial contextual effects. Ignoring these effects in unit‐level working models can cause biased estimates of regression parameters. The contextual effects can be automatically accounted for in the area‐level models. Using synthetic and EBLUP techniques, small area estimates based on different levels of LMMs are investigated in this paper by means of a simulation study.

19.
This paper provides a practical simulation-based Bayesian analysis of parameter-driven models for time series Poisson data with the AR(1) latent process. The posterior distribution is simulated by a Gibbs sampling algorithm. Full conditional posterior distributions of unknown variables in the model are given in convenient forms for the Gibbs sampling algorithm. The case with missing observations is also discussed. The methods are applied to real polio data from 1970 to 1983.

20.
Missing data pose a serious challenge to the integrity of randomized clinical trials, especially of treatments for prolonged illnesses such as schizophrenia, in which long‐term impact assessment is of great importance, but the follow‐up rates are often no more than 50%. Sensitivity analysis using Bayesian modeling for missing data offers a systematic approach to assessing the sensitivity of the inferences made on the basis of observed data. This paper uses data from an 18‐month study of veterans with schizophrenia to demonstrate this approach. Data were obtained from a randomized clinical trial involving 369 patients diagnosed with schizophrenia that compared long‐acting injectable risperidone with a psychiatrist's choice of oral treatment. Bayesian analysis utilizing a pattern‐mixture modeling approach was used to validate the reported results by detecting bias due to non‐random patterns of missing data. The analysis was applied to several outcomes including standard measures of schizophrenia symptoms, quality of life, alcohol use, and global mental status. The original study results for several measures were confirmed against a wide range of patterns of non‐random missingness. Robustness of the conclusions was assessed using sensitivity parameters. The missing data in the trial did not likely threaten the validity of previously reported results. Copyright © 2014 John Wiley & Sons, Ltd.
