Similar Documents (20 results)
1.
Dependent data arise in many studies. Frequently adopted sampling designs, such as cluster, multilevel, spatial, and repeated measures, may induce this dependence, which the analysis of the data needs to take into due account. In a previous publication (Geraci and Bottai in Biostatistics 8:140–154, 2007), we proposed a conditional quantile regression model for continuous responses where subject-specific random intercepts were included to account for within-subject dependence in the context of longitudinal data analysis. The approach hinged upon the link existing between the minimization of weighted absolute deviations, typically used in quantile regression, and the maximization of a Laplace likelihood. Here, we consider an extension of those models to more complex dependence structures in the data, which are modeled by including multiple random effects in the linear conditional quantile functions. We also discuss estimation strategies to reduce the computational burden and inefficiency associated with the Monte Carlo EM algorithm we have proposed previously. In particular, the estimation of the fixed regression coefficients and of the random effects’ covariance matrix is based on a combination of Gaussian quadrature approximations and non-smooth optimization algorithms. Finally, a simulation study and a number of applications of our models are presented.
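The link between minimizing weighted absolute deviations and maximizing a Laplace likelihood, which the approach above hinges on, can be checked numerically. A minimal Python sketch (the data and all names are illustrative, not from the paper):

```python
import numpy as np

def check_loss(u, tau):
    """Quantile regression check (pinball) loss rho_tau(u)."""
    return np.where(u >= 0, tau * u, (tau - 1) * u)

def ald_loglik(u, tau, sigma=1.0):
    """Log-density of an asymmetric Laplace error at residual u.
    This is an affine function of the check loss, so maximizing it in
    the location parameter minimizes the summed check loss."""
    return np.log(tau * (1 - tau) / sigma) - check_loss(u, tau) / sigma

rng = np.random.default_rng(0)
y = rng.normal(size=200)
tau = 0.75

# Profile both criteria over candidate locations b.
grid = np.linspace(-2.0, 2.0, 2001)
loss = np.array([check_loss(y - b, tau).sum() for b in grid])
llik = np.array([ald_loglik(y - b, tau).sum() for b in grid])

# The check-loss minimizer and the ALD-likelihood maximizer coincide
# (up to grid resolution and ties on the flat minimizing interval):
# both estimate the 0.75 sample quantile of y.
assert abs(grid[loss.argmin()] - grid[llik.argmax()]) < 0.05
```

Replacing the fixed location b with a linear predictor plus random intercepts gives the kind of working likelihood the paper maximizes.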

2.
Sampling from the posterior distribution in generalized linear mixed models
Generalized linear mixed models provide a unified framework for treatment of exponential family regression models, overdispersed data and longitudinal studies. These problems typically involve the presence of random effects and this paper presents a new methodology for making Bayesian inference about them. The approach is simulation-based and involves the use of Markov chain Monte Carlo techniques. The usual iterative weighted least squares algorithm is extended to include a sampling step based on the Metropolis–Hastings algorithm thus providing a unified iterative scheme. Non-normal prior distributions for the regression coefficients and for the random effects distribution are considered. Random effect structures with nesting required by longitudinal studies are also considered. Particular interests concern the significance of regression coefficients and assessment of the form of the random effects. Extensions to unknown scale parameters, unknown link functions, survival and frailty models are outlined.

3.
Two new implementations of the EM algorithm are proposed for maximum likelihood fitting of generalized linear mixed models. Both methods use random (independent and identically distributed) sampling to construct Monte Carlo approximations at the E-step. One approach involves generating random samples from the exact conditional distribution of the random effects (given the data) by rejection sampling, using the marginal distribution as a candidate. The second method uses a multivariate t importance sampling approximation. In many applications the two methods are complementary. Rejection sampling is more efficient when sample sizes are small, whereas importance sampling is better with larger sample sizes. Monte Carlo approximation using random samples allows the Monte Carlo error at each iteration to be assessed by using standard central limit theory combined with Taylor series methods. Specifically, we construct a sandwich variance estimate for the maximizer at each approximate E-step. This suggests a rule for automatically increasing the Monte Carlo sample size after iterations in which the true EM step is swamped by Monte Carlo error. In contrast, techniques for assessing Monte Carlo error have not been developed for use with alternative implementations of Monte Carlo EM algorithms utilizing Markov chain Monte Carlo E-step approximations. Three different data sets, including the infamous salamander data of McCullagh and Nelder, are used to illustrate the techniques and to compare them with the alternatives. The results show that the methods proposed can be considerably more efficient than those based on Markov chain Monte Carlo algorithms. However, the methods proposed may break down when the intractable integrals in the likelihood function are of high dimension.
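The first E-step strategy, rejection sampling from the exact conditional distribution of the random effects with the marginal as candidate, can be sketched as follows (a toy random-intercept logistic cluster of my own; none of the specifics come from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy cluster: y_j ~ Bernoulli(logistic(u)) given a random intercept
# u ~ N(0, 1).  Target: f(u | y) ∝ f(y | u) f(u).  Proposing from the
# marginal f(u) and accepting with probability f(y | u) / M, where
# M = sup_u f(y | u), yields exact draws from the conditional.
y = np.array([1, 1, 0, 1, 1])

def cond_lik(u):
    p = 1.0 / (1.0 + np.exp(-u))
    return np.prod(np.where(y == 1, p, 1.0 - p))

# Bound the conditional likelihood by its supremum over u (grid search).
M = max(cond_lik(u) for u in np.linspace(-10.0, 10.0, 4001))

draws = []
while len(draws) < 5000:
    u = rng.normal()                      # candidate from the marginal
    if rng.uniform() < cond_lik(u) / M:
        draws.append(u)                   # accepted: exact draw from f(u | y)

# With 4 successes in 5 trials, the conditional mean of u is pulled
# above its prior mean of zero.
posterior_mean = float(np.mean(draws))
```

In the Monte Carlo E-step these exact i.i.d. draws are what makes standard central limit theory applicable for assessing the Monte Carlo error.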

4.
The Integrated Nested Laplace Approximation (INLA) has established itself as a widely used method for approximate inference on Bayesian hierarchical models which can be represented as a latent Gaussian model (LGM). INLA is based on producing an accurate approximation to the posterior marginal distributions of the parameters in the model and some other quantities of interest by using repeated approximations to intermediate distributions and integrals that appear in the computation of the posterior marginals. INLA focuses on models whose latent effects are a Gaussian Markov random field. For this reason, we have explored alternative ways of expanding the number of possible models that can be fitted using the INLA methodology. In this paper, we present a novel approach that combines INLA and Markov chain Monte Carlo (MCMC). The aim is to consider a wider range of models that can be fitted with INLA only when some of the parameters of the model have been fixed. We show how new values of these parameters can be drawn from their posterior by using conditional models fitted with INLA and standard MCMC algorithms, such as Metropolis–Hastings. Hence, this will extend the use of INLA to fit models that can be expressed as a conditional LGM. Also, this new approach can be used to build simpler MCMC samplers for complex models as it allows sampling only on a limited number of parameters in the model. We will demonstrate how our approach can extend the class of models that could benefit from INLA, and how the R-INLA package will ease its implementation. We will go through simple examples of this new approach before we discuss more advanced applications with datasets taken from the relevant literature. 
In particular, INLA within MCMC will be used to fit Bayesian Lasso models with Laplace priors, to impute missing covariates in linear models, to fit spatial econometrics models with complex nonlinear terms in the linear predictor, and to classify data with mixture models. Furthermore, in some of the examples we exploit INLA within MCMC to make joint inference on an ensemble of model parameters.

5.
In this paper, we discuss fully Bayesian quantile inference using the Markov chain Monte Carlo (MCMC) method for longitudinal data models with random effects. Under the assumption that the error term follows an asymmetric Laplace distribution, we establish a hierarchical Bayesian model and obtain the posterior distribution of the unknown parameters at the τ-th quantile level. We overcome the current computational limitations using two approaches: one is the general MCMC technique with the Metropolis–Hastings algorithm, and the other is Gibbs sampling from the full conditional distributions. These two methods outperform the traditional frequentist methods under a wide array of simulated data models and are flexible enough to easily accommodate changes in the number of random effects and in their assumed distribution. We apply the Gibbs sampling method to analyse mouse growth data, and some conclusions that differ from those in the literature are obtained.
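Gibbs samplers of this kind typically rest on the normal–exponential mixture representation of the asymmetric Laplace distribution, which makes every full conditional standard. A sketch of the representation itself, with parameter names of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(2)
tau, sigma, n = 0.25, 1.0, 200_000

# Asymmetric Laplace as a normal-exponential mixture:
#   y = theta * z + psi * sqrt(z) * eps,  z ~ Exp(mean sigma), eps ~ N(0, 1)
# with theta and psi chosen so that y ~ ALD(0, sigma, tau).  Conditional
# on z, y is Gaussian, which is what makes the Gibbs updates conjugate.
theta = (1.0 - 2.0 * tau) / (tau * (1.0 - tau))
psi = np.sqrt(2.0 / (tau * (1.0 - tau)))
z = rng.exponential(scale=sigma, size=n)
y = theta * z + psi * np.sqrt(z) * rng.normal(size=n)

# For ALD(0, sigma, tau) the tau-th quantile equals the location, here 0.
q = float(np.quantile(y, tau))
```

Conditioning on the latent exponential scales z turns the quantile model into a heteroscedastic Gaussian model, from which the usual normal/inverse-gamma full conditionals follow.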

6.
During recent years, analysts have been relying on approximate methods of inference to estimate multilevel models for binary or count data. In an earlier study of random-intercept models for binary outcomes we used simulated data to demonstrate that one such approximation, known as marginal quasi-likelihood, leads to a substantial attenuation bias in the estimates of both fixed and random effects whenever the random effects are non-trivial. In this paper, we fit three-level random-intercept models to actual data for two binary outcomes, to assess whether refined approximation procedures, namely penalized quasi-likelihood and second-order improvements to marginal and penalized quasi-likelihood, also underestimate the underlying parameters. The extent of the bias is assessed by two standards of comparison: exact maximum likelihood estimates, based on a Gauss–Hermite numerical quadrature procedure, and a set of Bayesian estimates, obtained from Gibbs sampling with diffuse priors. We also examine the effectiveness of a parametric bootstrap procedure for reducing the bias. The results indicate that second-order penalized quasi-likelihood estimates provide a considerable improvement over the other approximations, but all the methods of approximate inference result in a substantial underestimation of the fixed and random effects when the random effects are sizable. We also find that the parametric bootstrap method can eliminate the bias but is computationally very intensive.
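The parametric bootstrap bias correction examined above works by refitting the estimator on data simulated from the fitted model and subtracting the average shift. A minimal sketch on a toy estimator with a known downward bias (the setup is mine, not the paper's three-level model):

```python
import numpy as np

rng = np.random.default_rng(5)

def var_mle(x):
    """MLE of a normal variance: divides by n, hence biased downward."""
    return float(np.mean((x - x.mean()) ** 2))

x = rng.normal(loc=0.0, scale=2.0, size=10)      # true variance = 4
theta_hat = var_mle(x)

# Simulate B data sets from the *fitted* model, refit on each, and use
# the mean shift of the refits as a bias estimate.
B = 2000
boot = np.array([var_mle(rng.normal(0.0, np.sqrt(theta_hat), size=x.size))
                 for _ in range(B)])
bias_hat = boot.mean() - theta_hat               # ≈ -theta_hat / n here
theta_bc = theta_hat - bias_hat                  # bias-corrected estimate
```

For the multilevel models in the paper, "refit" means a full approximate-likelihood fit on each simulated data set, which is what makes the procedure computationally intensive.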

7.
ABSTRACT

Clustered observations such as longitudinal data are often analysed with generalized linear mixed models (GLMM). Approximate Bayesian inference for GLMMs with normally distributed random effects can be done using integrated nested Laplace approximations (INLA), which is in general known to yield accurate results. However, INLA is known to be less accurate for GLMMs with binary response. For longitudinal binary response data it is common that patients do not change their health state during the study period. In this case the grouping covariate perfectly predicts a subset of the response, which implies a monotone likelihood with diverging maximum likelihood (ML) estimates for cluster-specific parameters. This is known as quasi-complete separation. In this paper we demonstrate, based on longitudinal data from a randomized clinical trial and two simulations, that the accuracy of INLA decreases with increasing degree of cluster-specific quasi-complete separation. Comparing parameter estimates by INLA, Markov chain Monte Carlo sampling and ML shows that INLA increasingly deviates from the other methods in such a scenario.

8.
On Block Updating in Markov Random Field Models for Disease Mapping
Gaussian Markov random field (GMRF) models are commonly used to model spatial correlation in disease mapping applications. For Bayesian inference by MCMC, so far mainly single-site updating algorithms have been considered. However, convergence and mixing properties of such algorithms can be extremely poor due to strong dependencies of parameters in the posterior distribution. In this paper, we propose various block sampling algorithms in order to improve the MCMC performance. The methodology is rather general, allows for non-standard full conditionals, and can be applied in a modular fashion in a large number of different scenarios. For illustration we consider three different applications: two formulations for spatial modelling of a single disease (with and without additional unstructured parameters respectively), and one formulation for the joint analysis of two diseases. The results indicate that the largest benefits are obtained if parameters and the corresponding hyperparameter are updated jointly in one large block. Implementation of such block algorithms is relatively easy using methods for fast sampling of Gaussian Markov random fields (Rue, 2001). By comparison, Monte Carlo estimates based on single-site updating can be rather misleading, even for very long runs. Our results may have wider relevance for efficient MCMC simulation in hierarchical models with Markov random field components.
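The fast GMRF sampling that makes such block updates cheap (Rue, 2001) amounts to a Cholesky factorization of the precision matrix followed by a triangular solve. A dense-matrix sketch (the toy precision matrix is my own choice; real implementations exploit sparsity):

```python
import numpy as np

# Sampling a zero-mean GMRF x ~ N(0, Q^{-1}) directly from its precision
# matrix Q: factor Q = L L^T, then solve L^T x = z with z ~ N(0, I),
# so that Cov(x) = (L L^T)^{-1} = Q^{-1}.
# Toy Q (my own choice): a first-order random-walk precision on 50 sites
# plus a small diagonal term to make it proper.
n = 50
Q = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1) + 0.1 * np.eye(n)

L = np.linalg.cholesky(Q)            # Q = L @ L.T
rng = np.random.default_rng(3)
z = rng.normal(size=(n, 20_000))
x = np.linalg.solve(L.T, z)          # each column is one exact GMRF draw

# The empirical covariance of the draws recovers Q^{-1}.
emp = np.cov(x)
err = float(np.abs(emp - np.linalg.inv(Q)).max())
```

With a sparse banded Q the Cholesky factor is also banded, so a draw costs far less than the dense O(n^3) shown here; that is the efficiency the block-updating schemes rely on.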

9.
Jointly modeling longitudinal and survival data has been an active research area. Most research focuses on improving estimation efficiency but ignores many data features frequently encountered in practice. In the current study, we develop joint models that concurrently account for longitudinal and survival data with multiple features. Specifically, the proposed model handles skewness, missingness, and measurement errors in covariates, which are typically observed in the collection of longitudinal survival data in many studies. We employ a Bayesian method to make inference on the proposed model. We apply the proposed model to a real data study, and a few alternative models are compared under different conditions. We also conduct extensive simulations to evaluate how the method performs.

10.
The article details a sampling scheme that can lead to a reduction in sample size and cost in clinical and epidemiological studies of association between a count outcome and risk factor. We show that inference in two common generalized linear models for count data, Poisson and negative binomial regression, is improved by using a ranked auxiliary covariate, which guides the sampling procedure. This type of sampling has typically been used to improve inference on a population mean. The novelty of the current work is its extension to log-linear models and derivations showing that the sampling technique results in an increase in information as compared to simple random sampling. Specifically, we show that under the proposed sampling strategy the maximum likelihood estimate of the risk factor’s coefficient is improved through an increase in the Fisher information. A simulation study is performed to compare the mean squared error, bias, variance, and power of the sampling routine with simple random sampling under various data-generating scenarios. We also illustrate the merits of the sampling scheme on a real data set from a clinical setting of males with chronic obstructive pulmonary disease. Empirical results from the simulation study and data analysis coincide with the theoretical derivations, suggesting that a significant reduction in sample size, and hence study cost, can be realized while achieving the same precision as a simple random sample.
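Ranked-set sampling, which the paper extends to log-linear models, is easiest to see in the mean-estimation setting it was originally designed for: rank each freshly drawn set by a cheap auxiliary covariate and keep one designated order statistic per set. A sketch with a toy population model of my own:

```python
import numpy as np

rng = np.random.default_rng(4)

def pop_draw(m, rng):
    """Draw m units: an auxiliary ranking covariate and the outcome."""
    aux = rng.normal(size=m)
    y = aux + 0.5 * rng.normal(size=m)   # outcome correlated with aux
    return aux, y

def ranked_set_sample(k, cycles, rng):
    """For each cycle and each rank r, draw a fresh set of k units,
    rank it by the auxiliary covariate, and keep the r-th ranked unit."""
    out = []
    for _ in range(cycles):
        for r in range(k):
            aux, y = pop_draw(k, rng)
            out.append(y[np.argsort(aux)[r]])
    return np.array(out)

k, cycles, reps = 3, 10, 2000
n = k * cycles                           # matched sample sizes
rss_means = np.array([ranked_set_sample(k, cycles, rng).mean()
                      for _ in range(reps)])
srs_means = np.array([pop_draw(n, rng)[1].mean() for _ in range(reps)])

# RSS stays unbiased and, with an informative ranking covariate, has
# smaller variance than simple random sampling of the same size.
assert np.var(rss_means) < np.var(srs_means)
```

The paper's contribution is showing that the same information gain carries over to the Fisher information of regression coefficients in Poisson and negative binomial models.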

11.
Abstract. This paper reviews some of the key statistical ideas that are encountered when trying to find empirical support for causal interpretations and conclusions by applying statistical methods to experimental or observational longitudinal data. In such data, a collection of individuals is typically followed over time; each individual has a registered sequence of covariate measurements along with values of control variables that, in the analysis, are to be interpreted as causes; and finally the individual outcomes or responses are reported. Particular attention is given to the potentially important problem of confounding. We provide conditions under which, at least in principle, unconfounded estimation of the causal effects can be accomplished. Our approach for dealing with causal problems is entirely probabilistic, and we apply Bayesian ideas and techniques to deal with the corresponding statistical inference. In particular, we use the general framework of marked point processes for setting up the probability models, and consider posterior predictive distributions as providing the natural summary measures for assessing the causal effects. We also draw connections to relevant recent work in this area, notably to Judea Pearl's formulations based on graphical models and his calculus of so-called do-probabilities. Two examples illustrating different aspects of causal reasoning are discussed in detail.

12.
This paper uses random scales, similar to the random effects used in generalized linear mixed models, to describe “inter-location” population variation in variance components for modeling complicated data obtained from applications such as antenna manufacturing. Our distribution studies lead to a complicated integrated extended quasi-likelihood (IEQL) for parameter estimation and large-sample inference. Laplace's expansion and several approximation methods are employed to simplify the IEQL estimation procedures. Asymptotic properties of the approximate IEQL estimates are derived for general structures of the covariance matrix of the random scales. Focusing on a few special covariance structures in simpler forms, we further simplify the IEQL estimates so that commonly used software tools, such as weighted regression, can compute them easily. Moreover, these special cases allow us to derive interesting asymptotic results in much more compact expressions. Finally, numerical simulation results show that the IEQL estimates perform very well in several special cases studied.

13.
We consider a Bayesian approach to the study of independence in a two-way contingency table which has been obtained from a two-stage cluster sampling design. If a procedure based on single-stage simple random sampling (rather than the appropriate cluster sampling) is used to test for independence, the p-value may be too small, resulting in a conclusion that the null hypothesis is false when it is, in fact, true. For many large complex surveys the Rao–Scott corrections to the standard chi-squared (or likelihood ratio) statistic provide appropriate inference. For smaller surveys, though, the Rao–Scott corrections may not be accurate, partly because the chi-squared test is inaccurate. In this paper, we use a hierarchical Bayesian model to convert the observed cluster samples to simple random samples. This provides surrogate samples which can be used to derive the distribution of the Bayes factor. We demonstrate the utility of our procedure using an example and also provide a simulation study which establishes our methodology as a viable alternative to the Rao–Scott approximations for relatively small two-stage cluster samples. We also show the additional insight gained by displaying the distribution of the Bayes factor rather than simply relying on a summary of the distribution.

14.
This paper proposes a hierarchical probabilistic model for ordinal matrix factorization. Unlike previous approaches, we model the ordinal nature of the data and take a principled approach to incorporating priors for the hidden variables. Two algorithms are presented for inference, one based on Gibbs sampling and one based on variational Bayes. Importantly, these algorithms may be implemented in the factorization of very large matrices with missing entries.

15.
Summary.  A common objective in longitudinal studies is the joint modelling of a longitudinal response with a time-to-event outcome. Random effects are typically used in the joint modelling framework to explain the interrelationships between these two processes. However, estimation in the presence of random effects involves intractable integrals requiring numerical integration. We propose a new computational approach for fitting such models that is based on the Laplace method for integrals that makes the consideration of high dimensional random-effects structures feasible. Contrary to the standard Laplace approximation, our method requires much fewer repeated measurements per individual to produce reliable results.

16.
Abstract.  In this paper we propose fast approximate methods for computing posterior marginals in spatial generalized linear mixed models. We consider the common geostatistical case with a high dimensional latent spatial variable and observations at known registration sites. The methods of inference are deterministic, using no simulation-based inference. The first proposed approximation is fast to compute and is 'practically sufficient', meaning that results do not show any bias or dispersion effects that might affect decision making. Our second approximation, an improvement of the first version, is 'practically exact', meaning that one would have to run MCMC simulations for very much longer than is typically done to detect any indication of error in the approximate results. For small-count data the approximations are slightly worse, but still very accurate. Our methods are limited to likelihood functions that give unimodal full conditionals for the latent variable. The methods help to expand the future scope of non-Gaussian geostatistical models as illustrated by applications of model choice, outlier detection and sampling design. The approximations take seconds or minutes of CPU time, in sharp contrast to overnight MCMC runs for solving such problems.
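The deterministic machinery above rests on the Laplace approximation of an intractable integral: expand the log-integrand to second order around its mode. A one-dimensional toy sketch (a Poisson count with a normal prior on the log-rate; the example is mine, not from the paper):

```python
import numpy as np

y = 4  # observed count

def h(x):
    # log of Poisson(y | rate=exp(x)) * N(x | 0, 1), dropping terms free of x
    return y * x - np.exp(x) - 0.5 * x**2

# Near-exact value of the integral by brute-force quadrature.
grid = np.linspace(-10.0, 10.0, 200_001)
dx = grid[1] - grid[0]
exact = float(np.exp(h(grid)).sum() * dx)

# Laplace approximation:
#   ∫ exp(h(x)) dx ≈ exp(h(x*)) * sqrt(2*pi / (-h''(x*))),  x* = argmax h
x_star = grid[np.argmax(h(grid))]
curvature = np.exp(x_star) + 1.0          # -h''(x*) = exp(x*) + 1
laplace = float(np.exp(h(x_star)) * np.sqrt(2.0 * np.pi / curvature))

rel_err = abs(laplace / exact - 1.0)      # small for this smooth integrand
```

The paper's 'practically exact' variant refines this basic second-order scheme; the sketch shows only the core expansion that makes the methods deterministic and fast.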

17.
Abstract

Linear mixed effects models have been popular in small area estimation problems for modeling survey data when the sample size in one or more areas is too small for reliable inference. However, when the data are restricted to a bounded interval, the linear model may be inappropriate, particularly if the data are near the boundary. Nonlinear sampling models are becoming increasingly popular for small area estimation problems when the normal model is inadequate. This paper studies the use of a beta distribution as an alternative to the normal distribution as a sampling model for survey estimates of proportions which take values in (0, 1). Inference for small area proportions based on the posterior distribution of a beta regression model ensures that point estimates and credible intervals take values in (0, 1). Properties of a hierarchical Bayesian small area model with a beta sampling distribution and logistic link function are presented and compared to those of the linear mixed effects model. Propriety of the posterior distribution using certain noninformative priors is shown, and behavior of the posterior mean as a function of the sampling variance and the model variance is described. An example using 2010 Small Area Income and Poverty Estimates (SAIPE) data is given, and a numerical example studying small sample properties of the model is presented.

18.
This study takes up inference in linear models with generalized error and generalized t distributions. For the generalized error distribution, two computational algorithms are proposed. The first is based on indirect Bayesian inference using an approximating finite scale mixture of normal distributions. The second is based on Gibbs sampling. The Gibbs sampler involves only drawing random numbers from standard distributions. This is important because previously the impression has been that an exact analysis of the generalized error regression model using Gibbs sampling is not possible. Next, we describe computational Bayesian inference for linear models with generalized t disturbances based on Gibbs sampling, and exploiting the fact that the model is a mixture of generalized error distributions with inverse generalized gamma distributions for the scale parameter. The linear model with this specification has also been thought not to be amenable to exact Bayesian analysis. All computational methods are applied to actual data involving the exchange rates of the British pound, the French franc, and the German mark relative to the U.S. dollar.

19.
In this paper, we extend the joint model of a longitudinal biomarker and a recurrent event via a copula function to account for the dependence between the two processes. The general idea is to join the separate processes by allowing the model-specific random effects to come from different distribution families. A main advantage of the proposed method is that the copula construction does not constrain the choice of marginal distributions for the random effects. Maximum likelihood estimation with an importance sampling technique, a simple and easily understood method, is employed for model inference. To validate the proposed joint model, a model-based bootstrap resampling method is developed. The proposed joint model is also applied to pemphigus disease data to assess the effect of the biomarker trajectory on the risk of recurrence.

20.
Longitudinal imaging studies have moved to the forefront of medical research due to their ability to characterize spatio-temporal features of biological structures across the lifespan. Valid inference in longitudinal imaging requires enough flexibility of the covariance model to allow reasonable fidelity to the true pattern. On the other hand, the existence of computable estimates demands a parsimonious parameterization of the covariance structure. Separable (Kronecker product) covariance models provide one such parameterization in which the spatial and temporal covariances are modeled separately. However, evaluating the validity of this parameterization in high dimensions remains a challenge. Here we provide a scientifically informed approach to assessing the adequacy of separable (Kronecker product) covariance models when the number of observations is large relative to the number of independent sampling units (sample size). We address both the general case, in which unstructured matrices are considered for each covariance model, and the structured case, which assumes a particular structure for each model. For the structured case, we focus on the situation where the within-subject correlation is believed to decrease exponentially in time and space, as is common in longitudinal imaging studies. However, the provided framework applies equally to all covariance patterns used within the more general multivariate repeated measures context. Our approach provides useful guidance for high-dimension, low-sample-size data that preclude using standard likelihood-based tests. Longitudinal medical imaging data of caudate morphology in schizophrenia illustrate the approach's appeal.


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号