1.

Bayesian analysis of multivariate t linear mixed models with missing responses at random

《Journal of Statistical Computation and Simulation》2012,82(17):3594-3612

The multivariate t linear mixed model (MtLMM) has been recently proposed as a robust tool for analysing multivariate longitudinal data with atypical observations. Missing outcomes frequently occur in longitudinal research even in well controlled situations. As a powerful alternative to the traditional expectation maximization based algorithm employing single imputation, we consider a Bayesian analysis of the MtLMM to account for the uncertainties of model parameters and missing outcomes through multiple imputation. An inverse Bayes formulas sampler coupled with a Metropolis-within-Gibbs scheme is used to effectively draw from the posterior distributions of latent data and model parameters. The techniques for multiple imputation of missing values, estimation of random effects, prediction of future responses, and diagnostics of potential outliers are investigated as well. The proposed methodology is illustrated through a simulation study and an application to AIDS/HIV data.

2.

Influence of human immunodeficiency virus infection on neurological impairment: an analysis of longitudinal binary data with informative drop-out

X. Liu, C. Waternaux & E. Petkova 《Journal of the Royal Statistical Society. Series C, Applied statistics》1999,48(1):103-115

A study to investigate the effect of human immunodeficiency virus (HIV) status on the course of neurological impairment, conducted by the HIV Center at Columbia University, followed a cohort of HIV positive and negative gay men for 5 years and assessed the presence or absence of neurological impairment every 6 months. Almost half of the subjects dropped out before the end of the study for reasons that might have been related to the missing neurological data. We propose likelihood-based methods for analysing such binary longitudinal data under informative and non-informative drop-out. A transition model is assumed for the binary response, and several models for the drop-out processes are considered which are functions of the response variable (neurological impairment). The likelihood ratio test is used to compare models with informative and non-informative drop-out mechanisms. Using simulations, we investigate the percentage bias and mean-squared error (MSE) of the parameter estimates in the transition model under various assumptions for the drop-out. We find evidence for informative drop-out in the study, and we illustrate that the bias and MSE for the parameters of the transition model are not directly related to the observed drop-out or missing data rates. The effect of HIV status on the neurological impairment is found to be statistically significant under each of the models considered for the drop-out, although the regression coefficient may be biased in certain cases. The presence and relative magnitude of the bias depend on factors such as the probability of drop-out conditional on the presence of neurological impairment and the prevalence of neurological impairment in the population under study.

3.

Using auxiliary data for parameter estimation with non-ignorably missing outcomes

Joseph G. Ibrahim, Stuart R. Lipsitz & Nick Horton 《Journal of the Royal Statistical Society. Series C, Applied statistics》2001,50(3):361-373

We propose a method for estimating parameters in generalized linear models when the outcome variable is missing for some subjects and the missing data mechanism is non-ignorable. We assume throughout that the covariates are fully observed. One possible method for estimating the parameters is maximum likelihood with a non-ignorable missing data model. However, caution must be used when fitting non-ignorable missing data models because certain parameters may be inestimable for some models. Instead of fitting a non-ignorable model, we propose the use of auxiliary information in a likelihood approach to reduce the bias, without having to specify a non-ignorable model. The method is applied to a mental health study.

4.

Regression modelling of weighted κ by using generalized estimating equations

R. Gonin, S. R. Lipsitz, G. M. Fitzmaurice & G. Molenberghs 《Journal of the Royal Statistical Society. Series C, Applied statistics》2000,49(1):1-18

In many clinical studies more than one observer may be rating a characteristic measured on an ordinal scale. For example, a study may involve a group of physicians rating a feature seen on a pathology specimen or a computed tomography scan. In clinical studies of this kind, the weighted κ coefficient is a popular measure of agreement for ordinally scaled ratings. Our research stems from a study in which the severity of inflammatory skin disease was rated. The investigators wished to determine and evaluate the strength of agreement between a variable number of observers taking into account patient-specific (age and gender) as well as rater-specific (whether board certified in dermatology) characteristics. This suggested modelling κ as a function of these covariates. We propose the use of generalized estimating equations to estimate the weighted κ coefficient. This approach also accommodates unbalanced data which arise when some subjects are not judged by the same set of observers. Currently an estimate of overall κ for a simple unbalanced data set without covariates involving more than two observers is unavailable. In the inflammatory skin disease study none of the covariates were significantly associated with κ, thus enabling the calculation of an overall weighted κ for this unbalanced data set. In the second motivating example (multiple sclerosis), geographic location was significantly associated with κ. In addition we also compared the results of our method with current methods of testing for heterogeneity of weighted κ coefficients across strata (geographic location) that are available for balanced data sets.
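For orientation, the weighted κ that the abstract above takes as its target can be computed descriptively for the simplest case of two raters and balanced data. The sketch below is only that textbook two-rater statistic, not the generalized estimating equations approach the authors propose; the function name, the integer coding of categories as 0..k-1, and the choice between linear and quadratic disagreement weights are assumptions made for illustration.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories, weights="linear"):
    """Two-rater weighted kappa for ordinal categories coded 0..n_categories-1.

    Illustrative sketch only: covers neither covariates nor unbalanced designs.
    """
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("both raters must rate the same non-empty set of subjects")
    if n_categories < 2:
        raise ValueError("at least two categories are required")
    k = n_categories

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Disagreement weight: 0 on the diagonal, 1 for the most distant pair.
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    observed_dis = sum(disagreement(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    expected_dis = sum(disagreement(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))
    if expected_dis == 0.0:
        raise ValueError("kappa is undefined when chance disagreement is zero")
    return 1.0 - observed_dis / expected_dis
```

Identical ratings give a value of 1, and ratings that agree no more often than the marginals predict give a value near 0.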

5.

A finite mixture approach to joint clustering of individuals and multivariate discrete outcomes

Francesca Martella, Marco Alfò 《Journal of Statistical Computation and Simulation》2017,87(11):2186-2206

In this work, we modify finite mixtures of factor analysers to provide a method for simultaneous clustering of subjects and multivariate discrete outcomes. The joint clustering is performed through a suitable reparameterization of the outcome (column)-specific parameters. We develop an expectation–maximization-type algorithm for maximum likelihood parameter estimation where the maximization step is divided into orthogonal sub-blocks that refer to row and column-specific parameters, respectively. Model performance is evaluated via a simulation study with varying sample size, number of outcomes and row/column-specific clustering (partitions). We compare the performance of our model with the performance of standard model-based biclustering approaches. The proposed method is also demonstrated on a benchmark data set where a multivariate binary response is considered.

6.

Analysis of longitudinal data unbalanced over time

Wenzheng Huang, Garrett M. Fitzmaurice 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2005,67(1):135-155

Summary. The paper considers modelling, estimating and diagnostically verifying the response process generating longitudinal data, with emphasis on association between repeated measures from unbalanced longitudinal designs. Our model is based on separate specifications of the moments for the mean, standard deviation and correlation, with different components possibly sharing common parameters. We propose a general class of correlation structures that comprise random effects, measurement errors and a serially correlated process. These three elements are combined via flexible time-varying weights, whereas the serial correlation can depend flexibly on the mean time and lag. When the measurement schedule is independent of the response process, our estimation procedure yields consistent and asymptotically normal estimates for the mean parameters even when the standard deviation and correlation are misspecified, and for the standard deviation parameters even when the correlation is misspecified. A generic diagnostic method is developed for verifying the models for the mean, standard deviation and, in particular, the correlation, which is applicable even when the data are severely unbalanced. The methodology is illustrated by an analysis of data from a longitudinal study that was designed to characterize pulmonary growth in girls.

7.

Selection of Latent Variables for Multiple Mixed‐outcome Models

Ling Zhou, Huazhen Lin, Xinyuan Song, Yi Li 《Scandinavian Journal of Statistics》2014,41(4):1064-1082

Latent variable models have been widely used for modelling the dependence structure of multiple outcomes data. However, the formulation of a latent variable model is often unknown a priori, and misspecification will distort the dependence structure and lead to unreliable model inference. Moreover, multiple outcomes with varying types present enormous analytical challenges. In this paper, we present a class of general latent variable models that can accommodate mixed types of outcomes. We propose a novel selection approach that simultaneously selects latent variables and estimates parameters. We show that the proposed estimator is consistent, asymptotically normal and has the oracle property. The practical utility of the methods is confirmed via simulations as well as an application to the analysis of the World Values Survey, a global research project that explores people’s values and beliefs and the social and personal characteristics that might influence them.

8.

Data-driven desirability function to measure patients’ disease progression in a longitudinal study

Hsiu-Wen Chen, Weng Kee Wong, Hongquan Xu 《Journal of applied statistics》2016,43(5):783-795

Multiple outcomes are increasingly used to assess chronic disease progression. We discuss and show how desirability functions can be used to assess a patient’s overall response to a treatment using multiple outcome measures, each of which may contribute unequally to the final assessment. Because judgments on disease progression and the relative contribution of each outcome can be subjective, we propose a data-driven approach to minimize the biases by using desirability functions with estimated shapes and weights based on a given gold standard. Our method provides each patient with a meaningful overall progression score that facilitates comparison and clinical interpretation. We also extend the methodology in a novel way to monitor patients’ disease progression when there are multiple time points and illustrate our method using a longitudinal data set from a randomized two-arm clinical trial for scleroderma patients.
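As a rough illustration of how a desirability score combines several outcomes, the sketch below uses the common larger-is-better desirability form and a weighted geometric mean. The abstract above estimates shapes and weights from a gold standard; here they are simply passed in as fixed, hypothetical values, and the function names are assumptions made for this example.

```python
import math


def desirability(y, low, high, shape=1.0):
    """Larger-is-better desirability: 0 at or below `low`, 1 at or above `high`."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return ((y - low) / (high - low)) ** shape


def overall_desirability(scores, weights):
    """Weighted geometric mean of per-outcome desirabilities (weights assumed positive)."""
    if len(scores) != len(weights) or not scores:
        raise ValueError("scores and weights must be non-empty and of equal length")
    if any(s == 0.0 for s in scores):
        # A completely undesirable outcome drives the overall score to zero.
        return 0.0
    total = sum(weights)
    return math.exp(sum(w * math.log(s) for s, w in zip(scores, weights)) / total)
```

With equal weights, per-outcome scores of 0.25 and 1.0 combine to 0.5; increasing the weight on one outcome pulls the overall score toward that outcome's desirability.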

9.

Parametric Estimator of Linear Model with Interval-Censored Data

Wenli Deng, Yong Tian, Qiuping Lv 《Communications in Statistics - Simulation and Computation》2013,42(10):1794-1804

In this article, we consider the estimation of regression parameters in a linear model in the presence of interval-censored data. When the response variable is interval-censored, the traditional methods cannot be used to estimate the parameters directly. In this article, an unbiased transformation is carried out and a new random variable which has the same expectation as the function of the response variable is established. By applying regression analysis to the constructed statistic, we obtain the estimator by the least squares method.

10.

Truncated location-scale non linear regression models

Carolina Costa Mota Paraíba, Carlos Alberto Ribeiro Diniz, Aline de Holanda Nunes Maia, Lineu Neiva Rodrigues 《Communications in Statistics - Theory and Methods》2017,46(15):7355-7374

We present a class of truncated non linear regression models for location and scale where the truncated nature of the data is incorporated into the statistical model by assuming that the response variable follows a truncated distribution. The location parameter of the response variable is assumed to be modeled by a continuous non linear function of covariates and unknown parameters. In addition, the proposed model also allows for the scale parameter of the responses to be characterized by a continuous function of the covariates and unknown parameters. Three particular cases of the proposed models are presented by considering the response variable to follow a truncated normal, truncated skew normal, and truncated beta distribution. These truncated non linear regression models are constructed assuming fixed known truncation limits, and model parameters are estimated by direct maximization of the log-likelihood using a non linear optimization algorithm. Standardized residuals and diagnostic metrics based on case deletion are considered to verify the adequacy of the model and to detect outliers and influential observations. Results based on simulated data are presented to assess the frequentist properties of estimates, and a real data set on soil-water retention from the Buriti Vermelho River Basin database is analyzed using the proposed methodology.
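To make the truncated normal case above concrete, the following sketch evaluates the log-density of a single response under fixed, known truncation limits. In a regression setting the location and scale would be functions of covariates and parameters; here they are plain numbers, and all names are hypothetical. Summing this quantity over observations gives the log-likelihood that a numerical optimizer would maximize.

```python
import math


def _std_normal_cdf(z):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_logpdf(y, loc, scale, lower, upper):
    """Log-density of a normal(loc, scale) variable truncated to [lower, upper]."""
    if not lower <= y <= upper:
        return -math.inf
    z = (y - loc) / scale
    # Untruncated normal log-density.
    log_density = -0.5 * z * z - math.log(scale * math.sqrt(2.0 * math.pi))
    # Probability mass that the untruncated normal places on [lower, upper].
    mass = _std_normal_cdf((upper - loc) / scale) - _std_normal_cdf((lower - loc) / scale)
    return log_density - math.log(mass)
```

Truncating a standard normal to the non-negative half-line, for example, doubles the density at every admissible point, so the log-density rises by log 2.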

11.

Segmental modeling of changing immunologic response for CD4 data with skewness, missingness and dropout

Yangxin Huang, Getachew A. Dagne, Jeong-Gun Park 《Journal of applied statistics》2013,40(10):2244-2258

In clinical practice, the profile of each subject's CD4 response from a longitudinal study may follow a ‘broken stick’ like trajectory, indicating multiple phases of increase and/or decline in response. Such multiple phases (changepoints) may be important indicators to help quantify treatment effect and improve management of patient care. Although it is common practice in the literature to analyze complex AIDS longitudinal data using nonlinear mixed-effects (NLME) or nonparametric mixed-effects (NPME) models, estimating changepoints with NLME or NPME models is challenging because of the complicated structure of the model formulations. In this paper, we propose a changepoint mixed-effects model with random subject-specific parameters, including the changepoint, for the analysis of longitudinal CD4 cell counts for HIV infected subjects following highly active antiretroviral treatment. The longitudinal CD4 data in this study may exhibit departures from symmetry, may encounter missing observations due to various reasons, which are likely to be non-ignorable in the sense that missingness may be related to the missing values, and may be censored at the time of the subject going off study-treatment, which is a potentially informative dropout mechanism. Inferential procedures can become dramatically more complicated when longitudinal CD4 data with asymmetry (skewness), incompleteness and informative dropout are observed in conjunction with an unknown changepoint. Our objective is to address the simultaneous impact of skewness, missingness and informative censoring by jointly modeling the CD4 response and dropout time processes under a Bayesian framework. The method is illustrated using a real AIDS data set to compare potential models with various scenarios, and some interesting results are presented.

12.

A comparison of non-homogeneous Markov regression models with application to Alzheimer's disease progression

Hubbard RA, Zhou XH 《Journal of applied statistics》2011,38(10):2313-2326

Markov regression models are useful tools for estimating the impact of risk factors on rates of transition between multiple disease states. Alzheimer's disease (AD) is an example of a multi-state disease process in which great interest lies in identifying risk factors for transition. In this context, non-homogeneous models are required because transition rates change as subjects age. In this report we propose a non-homogeneous Markov regression model that allows for reversible and recurrent disease states, transitions among multiple states between observations, and unequally spaced observation times. We conducted simulation studies to demonstrate performance of estimators for covariate effects from this model and compare performance with alternative models when the underlying non-homogeneous process was correctly specified and under model misspecification. In simulation studies, we found that covariate effects were biased if non-homogeneity of the disease process was not accounted for. However, estimates from non-homogeneous models were robust to misspecification of the form of the non-homogeneity. We used our model to estimate risk factors for transition to mild cognitive impairment (MCI) and AD in a longitudinal study of subjects included in the National Alzheimer's Coordinating Center's Uniform Data Set. Using our model, we found that subjects with MCI affecting multiple cognitive domains were significantly less likely to revert to normal cognition.

13.

Modeling Longitudinal Obesity Data with Intermittent Missingness Using a New Latent Variable Model

Li Qin, Lisa Weissfeld, Marsha D. Marcus, Michele D. Levine, Feng Dai 《Communications in Statistics - Simulation and Computation》2016,45(6):2018-2031

We propose a latent variable model for informative missingness in longitudinal studies which is an extension of the latent dropout class model. In our model, the value of the latent variable is affected by the missingness pattern and it is also used as a covariate in modeling the longitudinal response. The latent variable thus links the longitudinal response and the missingness process. In our model, the latent variable is continuous instead of categorical and we assume that it follows a normal distribution. The EM algorithm is used to obtain estimates of the parameters of interest, and Gauss–Hermite quadrature is used to approximate the integration over the latent variable. The standard errors of the parameter estimates can be obtained from the bootstrap method or from the inverse of the Fisher information matrix of the final marginal likelihood. Comparisons are made to the mixed model and complete-case analysis using a clinical trial dataset from the Weight Gain Prevention among Women (WGPW) study. We use the generalized Pearson residuals to assess the fit of the proposed latent variable model.
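The Gauss–Hermite step mentioned above can be sketched in isolation: an expectation over a normally distributed latent variable is replaced by a weighted sum over a few nodes. The example below hard-codes the three-point rule to stay dependency-free; a real analysis would typically use more nodes, and the function name and arguments are assumptions made for illustration rather than the authors' implementation.

```python
import math

# Three-point Gauss-Hermite rule for the weight function exp(-x^2).
_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
_WEIGHTS = (math.sqrt(math.pi) / 6.0,
            2.0 * math.sqrt(math.pi) / 3.0,
            math.sqrt(math.pi) / 6.0)


def normal_expectation(f, mean=0.0, sd=1.0):
    """Approximate E[f(b)] for b ~ Normal(mean, sd^2) by Gauss-Hermite quadrature.

    The change of variable b = mean + sqrt(2) * sd * x maps the normal density
    onto the exp(-x^2) weight, leaving a factor of 1 / sqrt(pi).
    """
    total = sum(w * f(mean + math.sqrt(2.0) * sd * x)
                for x, w in zip(_NODES, _WEIGHTS))
    return total / math.sqrt(math.pi)
```

To approximate a marginal likelihood contribution, `f` would be the likelihood of one subject's data conditional on the latent value; the three-point rule is exact when `f` is a polynomial of degree five or lower.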

14.

Computation aspects of the parameter estimates of linear mixed effects model in multivariate repeated measures set-up

Anuradha Roy 《Journal of applied statistics》2008,35(3):307-320

The number of parameters mushrooms in a linear mixed effects (LME) model in the case of multivariate repeated measures data. Computation of these parameters is a real problem with the increase in the number of response variables or with the increase in the number of time points. The problem becomes more intricate and involved with the inclusion of additional random effects. A multivariate analysis is not possible in a small sample setting. We propose a method to estimate these many parameters in bits and pieces from baby models, by taking a subset of response variables at a time, and finally using these bits and pieces at the end to get the parameter estimates for the mother model, with all variables taken together. Applying this method one can calculate the fixed effects, the best linear unbiased predictions (BLUPs) for the random effects in the model, and also the BLUPs at each time of observation for each response variable, to monitor the effectiveness of the treatment for each subject. The proposed method is illustrated with an example of multiple response variables measured over multiple time points arising from a clinical trial in osteoporosis.

15.

Modeling the Effects of Genetic Factors on Late-Onset Diseases in Cohort Studies

Glickman ME, Gagnon DR 《Lifetime data analysis》2002,8(3):211-228

Many late-onset diseases are caused by what appears to be a combination of a genetic predisposition to disease and environmental factors. The use of existing cohort studies provides an opportunity to infer genetic predisposition to disease on a representative sample of a study population, now that many such studies are gathering genetic information on the participants. One feature of using existing cohorts is that subjects may be censored due to death prior to genetic sampling, thereby adding a layer of complexity to the analysis. We develop a statistical framework to infer parameters of a latent variables model for disease onset. The latent variables model describes the role of genetic and modifiable risk factors on the onset ages of multiple diseases, and accounts for right-censoring of disease onset ages. The framework also allows for missing genetic information by inferring a subject's unknown genotype through appropriately incorporated covariate information. The model is applied to data gathered in the Framingham Heart Study for measuring the effect of different Apo-E genotypes on the occurrence of various cardiovascular disease events.

16.

Zero-spiked regression models generated by gamma random variables with application in the resin oil production

Elizabeth M. Hashimoto, Gauss M. Cordeiro, Vicente G. Cancho, Carine Klauberg 《Journal of Statistical Computation and Simulation》2019,89(1):52-70

Zero-inflated data are more frequent when the data represent counts. However, there are practical situations in which continuous data contain an excess of zeros. In these cases, the zero-inflated Poisson, binomial or negative binomial models are not suitable. In order to reduce this gap, we propose the zero-spiked gamma-Weibull (ZSGW) model by mixing a distribution which is degenerate at zero with the gamma-Weibull distribution, which has positive support. The model attempts to estimate simultaneously the effects of explanatory variables on the response variable and the zero-spiked. We consider a frequentist analysis and a non-parametric bootstrap for estimating the parameters of the ZSGW regression model. We derive the appropriate matrices for assessing local influence on the model parameters. We illustrate the performance of the proposed regression model by means of a real data set (copaiba oil resin production) from a study carried out at the Department of Forest Science of the Luiz de Queiroz School of Agriculture, University of São Paulo. Based on the ZSGW regression model, we determine the explanatory variables that can influence the excess of zeros of the resin oil production and identify influential observations. We also prove empirically that the proposed regression model can be superior to the zero-adjusted inverse Gaussian regression model to fit zero-inflated positive continuous data.
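The mixing construction described above, a point mass at zero combined with a distribution on the positive half-line, leads to a log-likelihood with two kinds of terms. The sketch below illustrates that structure only: it substitutes an ordinary two-parameter Weibull for the gamma-Weibull distribution, omits covariates entirely, and uses hypothetical function and parameter names.

```python
import math


def weibull_logpdf(y, shape, scale):
    """Log-density of a two-parameter Weibull distribution for y > 0."""
    ratio = y / scale
    return math.log(shape / scale) + (shape - 1.0) * math.log(ratio) - ratio ** shape


def zero_spiked_loglik(data, p_zero, shape, scale):
    """Log-likelihood of a zero-spiked model: mass `p_zero` at 0, Weibull elsewhere.

    Assumes 0 < p_zero < 1 and non-negative observations.
    """
    loglik = 0.0
    for y in data:
        if y == 0:
            # Contribution of an exact zero: the spike probability.
            loglik += math.log(p_zero)
        else:
            # Contribution of a positive value: not a zero, then the positive density.
            loglik += math.log(1.0 - p_zero) + weibull_logpdf(y, shape, scale)
    return loglik
```

In a regression version, `p_zero` and the parameters of the positive part would each be linked to explanatory variables, which is how effects on the excess of zeros can be separated from effects on the positive responses.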

17.

Identification of Multivariate Responders/Non-Responders Using Bayesian Growth Curve Latent Class Models

Leiby BE, Sammel MD, Ten Have TR, Lynch KG 《Journal of the Royal Statistical Society. Series C, Applied statistics》2009,58(4):505-524

In this paper, we propose a multivariate growth curve mixture model that groups subjects based on multiple symptoms measured repeatedly over time. Our model synthesizes features of two models. First, we follow Roy and Lin (2000) in relating the multiple symptoms at each time point to a single latent variable. Second, we use the growth mixture model of Muthén and Shedden (1999) to group subjects based on distinctive longitudinal profiles of this latent variable. The mean growth curve for the latent variable in each class defines that class's features. For example, a class of "responders" would have a decline in the latent symptom summary variable over time. A Bayesian approach to estimation is employed where the methods of Elliott et al. (2005) are extended to simultaneously estimate the posterior distributions of the parameters from the latent variable and growth curve mixture portions of the model. We apply our model to data from a randomized clinical trial evaluating the efficacy of Bacillus Calmette-Guerin (BCG) in treating symptoms of Interstitial Cystitis. In contrast to conventional approaches using a single subjective Global Response Assessment, we use the multivariate symptom data to identify a class of subjects where treatment demonstrates effectiveness. Simulations are used to confirm identifiability results and evaluate the performance of our algorithm. The definitive version of this paper is available at onlinelibrary.wiley.com.

18.

Latent variable model for mixed correlated power series and ordinal longitudinal responses with non ignorable missing values

F. Razie, M. Ganjali 《Communications in Statistics - Theory and Methods》2017,46(12):5738-5753

We propose a joint model based on a latent variable for analyzing mixed power series and ordinal longitudinal data with and without missing values. A bivariate probit regression model is used for the missing mechanisms. Random effects are used to take into account the correlation between longitudinal responses. A full likelihood-based approach is used to yield maximum-likelihood estimates of the model parameters. Our model is applied to a medical data set, obtained from an observational study on women where the correlated responses are the ordinal response of osteoporosis of the spine and the power series response of the number of joint damages. Sensitivity analysis is also performed to study the influence of small perturbations of the parameters of the missing mechanisms and overdispersion of the model on likelihood displacement.

19.

Analysis of Panel Count Data with Dependent Observation Times

Yang-Jin Kim 《Communications in Statistics - Simulation and Computation》2013,42(4):983-990

In this article, a semiparametric approach is proposed for the regression analysis of panel count data. Panel count data commonly arise in clinical trials and demographical studies where the response variable is the number of recurrences of the event of interest and observation times are not fixed, varying from subject to subject. It is assumed that two processes exist in these data: the first is for the recurrent event and the second is for the observation times. Many studies have estimated the mean function and regression parameters under the assumption of independence between the recurrent event process and the observation time process. In this article, the same statistical inference is studied, but the situation where these two processes may be related is also considered. A mixed Poisson process is used for the recurrent event process, and a frailty intensity function is used for the observation times. Simulation studies are conducted to study the performance of the suggested methods. The methods are applied to bladder tumor data, and the results are compared with those of previous studies.

20.

Analysis of Failure Time Data with Mixed-Effects Accelerated Failure Time Model

Man Jin 《Communications in Statistics - Simulation and Computation》2013,42(4):614-619

In randomized clinical trials or observational studies, subjects are recruited at multiple treating sites. Factors that vary across sites may have some influence on outcomes; therefore, they need to be taken into account to get better results. We apply the accelerated failure time (AFT) model with linear mixed effects to analyze failure time data, accounting for correlations between outcomes. Specifically, we use a Bayesian approach to fit the data, computing the regression parameters by a Gibbs sampler combined with the Buckley-James method. This approach is compared with the marginal independence approach and other methods through simulations and an application to a real example.