Similar articles
 20 similar articles found
1.
Nonignorable missing data are a common problem in longitudinal studies. Latent class models are attractive for simplifying the modeling of missing data when the data are subject to either a monotone or an intermittent missing data pattern. In our study, we propose a new two-latent-class model for categorical data with informative dropouts, dividing the observed data into two latent classes: one class in which the outcomes are deterministic and a second one in which the outcomes can be modeled using logistic regression. In the model, the latent classes connect the longitudinal responses and the missingness process under the assumption of conditional independence. Parameters are estimated by maximum likelihood based on the above assumptions and the tetrachoric correlation between responses within the same subject. We compare the proposed method with the shared parameter model and the weighted GEE model using the areas under the ROC curves in the simulations and in an application to the smoking cessation data set. The simulation results indicate that the proposed two-latent-class model performs well under different missing-data mechanisms. The application results show that our proposed method performs better than the shared parameter model and the weighted GEE model.

2.
Some studies generate data that can be grouped into clusters in more than one way. Consider for instance a smoking prevention study in which responses on smoking status are collected over several years in a cohort of students from a number of different schools. This yields longitudinal data that are also cross‐sectionally clustered in schools. The authors present a model for analyzing binary data of this type, combining generalized estimating equations and estimation of random effects to address the longitudinal and cross‐sectional dependence, respectively. The estimation procedure for this model is discussed, as are the results of a simulation study used to investigate the properties of its estimates. An illustration using data from a smoking prevention trial is given.

3.
Misclassifications in binary responses have long been a common problem in medical and health surveys. One way to handle misclassifications in clustered or longitudinal data is to incorporate the misclassification model through the generalized estimating equation (GEE) approach. However, existing methods are developed under a non-survey setting and cannot be used directly for complex survey data. We propose a pseudo-GEE method for the analysis of binary survey responses with misclassifications. We focus on cluster sampling and develop analysis strategies for analyzing binary survey responses with different forms of additional information for the misclassification process. The proposed methodology has several attractive features, including simultaneous inferences for both the response model and the association parameters. Finite sample performance of the proposed estimators is evaluated through simulation studies and an application using a real dataset from the Canadian Longitudinal Study on Aging.

4.
Although Fan showed that the mixed-effects model for repeated measures (MMRM) is appropriate to analyze complete longitudinal binary data in terms of the rate difference, they focused on using the generalized estimating equations (GEE) to make statistical inference. The current article emphasizes the validity of the MMRM when the normal-distribution-based pseudo likelihood approach is used to make inference for complete longitudinal binary data. For incomplete longitudinal binary data under a missing-at-random mechanism, however, the MMRM, using either the GEE or the normal-distribution-based pseudo likelihood inferential procedure, gives biased results in general and should not be used for analysis.

5.
In this paper, a simulation study is conducted to systematically investigate the impact of dichotomizing longitudinal continuous outcome variables under various types of missing data mechanisms. Generalized linear models (GLM) with standard generalized estimating equations (GEE) are widely used for longitudinal outcome analysis, but these semi‐parametric approaches are only valid when data are missing completely at random (MCAR). Alternatively, weighted GEE (WGEE) and multiple imputation GEE (MI‐GEE) were developed to ensure validity under missing at random (MAR). Using a simulation study, the performance of standard GEE, WGEE and MI‐GEE on incomplete longitudinal dichotomized outcome analysis is evaluated. For comparison, likelihood‐based linear mixed effects models (LMM) are used to analyze the incomplete longitudinal outcomes on their original continuous scale. Focusing on dichotomized outcome analysis, MI‐GEE with imputation of the missing data on the original continuous scale provides well-controlled test sizes and more stable power estimates than the other GEE‐based approaches. It is also shown that dichotomizing longitudinal continuous outcomes will result in a substantial loss of power compared with LMM. Copyright © 2009 John Wiley & Sons, Ltd.

6.
Patient dropout is a common problem in studies that collect repeated binary measurements. Generalized estimating equations (GEE) are often used to analyze such data. The dropout mechanism may be plausibly missing at random (MAR), i.e. unrelated to future measurements given covariates and past measurements. In this case, various authors have recommended weighted GEE with weights based on an assumed dropout model, or an imputation approach, or a doubly robust approach based on weighting and imputation. These approaches provide asymptotically unbiased inference, provided the dropout or imputation model (as appropriate) is correctly specified. Other authors have suggested that, provided the working correlation structure is correctly specified, GEE using an improved estimator of the correlation parameters (‘modified GEE’) show minimal bias. These modified GEE have not been thoroughly examined. In this paper, we study the asymptotic bias under MAR dropout of these modified GEE, the standard GEE, and also GEE using the true correlation. We demonstrate that all three methods are biased in general. The modified GEE may be preferred to the standard GEE and are subject to only minimal bias in many MAR scenarios but in others are substantially biased. Hence, we recommend the modified GEE be used with caution.
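The weighting idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes monotone dropout and the common inverse-probability construction in which a visit's weight is the reciprocal of the cumulative probability of still being observed. The function name `dropout_ipw_weights` and the input probabilities are hypothetical; the abstract does not state the exact weights used in the paper, and in practice the probabilities would come from a fitted dropout model.

```python
from itertools import accumulate
from operator import mul


def dropout_ipw_weights(stay_probs):
    """Inverse-probability weights for one subject under monotone dropout.

    stay_probs[t] is the assumed (model-based) probability that the subject
    is still observed at visit t, given that they were observed at visit t-1.
    These values are illustrative inputs, not estimates from any real data.
    """
    if any(not 0.0 < p <= 1.0 for p in stay_probs):
        raise ValueError("each probability must lie in (0, 1]")
    # Cumulative probability of still being in the study at each visit.
    cumulative = accumulate(stay_probs, mul)
    # The weight is the reciprocal of that cumulative probability.
    return [1.0 / c for c in cumulative]


# Hypothetical subject: always seen at baseline, then 80% and 50% retention.
weights = dropout_ipw_weights([1.0, 0.8, 0.5])
print(weights)
```

Observations that were less likely to be observed receive larger weights, which is what lets the weighted estimating equations stand in for the subjects who dropped out, provided the dropout model is correctly specified.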

7.
Longitudinal categorical data commonly arise in a variety of fields and are frequently analyzed by the generalized estimating equation (GEE) method. Prior to making further inference based on the GEE model, the assessment of model fit is crucial. Graphical techniques have long been in widespread use for assessing model adequacy. We develop alternative graphical approaches utilizing plots of the marginal model-checking condition and local mean deviance to assess the GEE model with logit link for longitudinal binary responses. The applications of the proposed procedures are illustrated through two longitudinal binary datasets.

8.
Clustered binary responses are often found in ecological studies. Data analysis may include modeling the marginal response probability. However, when the association is the main scientific focus, modeling the correlation structure between pairs of responses is the key part of the analysis. Second-order generalized estimating equations (GEE) are established in the literature. Some of them are more computationally efficient, especially with large clusters. Alternating logistic regression (ALR) and orthogonalized residual (ORTH) GEE methods are presented and compared in this paper. Simulation results show a slight superiority of ALR over ORTH. Marginal probabilities and odds ratios are also estimated and compared in a real ecological study involving a three-level hierarchical clustering. ALR and ORTH models are useful for modeling complex association structures with large cluster sizes.

9.
The aim of this study was to investigate the Type I error rate of hypothesis testing based on generalized estimating equations (GEE) for data characteristic of periodontal clinical trials. The data in these studies consist of a large number of binary responses from each subject and a small number of subjects (Haffajee et al. (1983), Goodson (1986), Jenkins et al. (1988)). Computer simulations were employed to investigate GEE based both on an empirical estimate of the variance-covariance matrix and on a model-based estimate. Results from this investigation indicate that hypothesis testing based on GEE resulted in inappropriate Type I error rates when small samples were employed. Only an increase in the number of subjects to the point where it matched the number of observations per subject resulted in appropriate Type I error rates.

10.
Longitudinal surveys have emerged in recent years as an important data collection tool for population studies where the primary interest is to examine population changes over time at the individual level. Longitudinal data are often analyzed through the generalized estimating equations (GEE) approach. The vast majority of the existing literature on the GEE method, however, is developed under non‐survey settings and is inappropriate for data collected through complex sampling designs. In this paper the authors develop a pseudo‐GEE approach for the analysis of survey data. They show that survey weights must and can be appropriately accounted for in the GEE method under a joint randomization framework. The consistency of the resulting pseudo‐GEE estimators is established under the proposed framework. Linearization variance estimators are developed for the pseudo‐GEE estimators when the finite population sampling fractions are small or negligible, a scenario that often holds for large‐scale surveys. Finite sample performances of the proposed estimators are investigated through an extensive simulation study using data from the National Longitudinal Survey of Children and Youth. The results show that the pseudo‐GEE estimators and the linearization variance estimators perform well under several sampling designs and for both continuous and binary responses. The Canadian Journal of Statistics 38: 540–554; 2010 © 2010 Statistical Society of Canada

11.
Summary.  In a large, prospective longitudinal study designed to monitor cardiac abnormalities in children born to women who are infected with the human immunodeficiency virus, instead of a single outcome variable, there are multiple binary outcomes (e.g. abnormal heart rate, abnormal blood pressure and abnormal heart wall thickness) considered as joint measures of heart function over time. In the presence of missing responses at some time points, longitudinal marginal models for these multiple outcomes can be estimated by using generalized estimating equations (GEEs), and consistent estimates can be obtained under the assumption of a missingness completely at random mechanism. When the missing data mechanism is missingness at random, i.e. the probability of missing a particular outcome at a time point depends on observed values of that outcome and the remaining outcomes at other time points, we propose joint estimation of the marginal models by using a single modified GEE based on an EM-type algorithm. The method proposed is motivated by the longitudinal study of cardiac abnormalities in children who were born to women infected with the human immunodeficiency virus, and analyses of these data are presented to illustrate the application of the method. Further, in an asymptotic study of bias, we show that, under a missingness at random mechanism in which missingness depends on all observed outcome variables, our joint estimation via the modified GEE produces almost unbiased estimates, provided that the correlation model has been correctly specified, whereas estimates from standard GEEs can lead to substantial bias.

12.
Longitudinal or clustered response data arise in many applications such as biostatistics, epidemiology and environmental studies. The repeated responses cannot in general be assumed to be independent. One method of analysing such data is by using the generalized estimating equations (GEE) approach. The current GEE method for estimating regression effects in longitudinal data focuses on the modelling of the working correlation matrix assuming a known variance function. However, correct choice of the correlation structure may not necessarily improve estimation efficiency for the regression parameters if the variance function is misspecified [Wang YG, Lin X. Effects of variance-function misspecification in analysis of longitudinal data. Biometrics. 2005;61:413–421]. In this connection two problems arise: finding a correct variance function and estimating the parameters of the chosen variance function. In this paper, we study the problem of estimating the parameters of the variance function assuming that the form of the variance function is known and then the effect of a misspecified variance function on the estimates of the regression parameters. We propose a GEE approach to estimate the parameters of the variance function. This estimation approach borrows the idea of Davidian and Carroll [Variance function estimation. J Amer Statist Assoc. 1987;82:1079–1091] by solving a nonlinear regression problem where residuals are regarded as the responses and the variance function is regarded as the regression function. A limited simulation study shows that the proposed method performs at least as well as the modified pseudo-likelihood approach developed by Wang and Zhao [A modified pseudolikelihood approach for analysis of longitudinal data. Biometrics. 2007;63:681–689]. Both these methods perform better than the GEE approach.

13.
Generalised estimating equations (GEE) for regression problems with vector‐valued responses are examined. When the response vectors are of mixed type (e.g. continuous–binary response pairs), the GEE approach is a semiparametric alternative to full‐likelihood copula methods, and is closely related to Prentice & Zhao's mean‐covariance estimation equations approach. When the response vectors are of the same type (e.g. measurements on left and right eyes), the GEE approach can be viewed as a ‘plug‐in’ to existing methods, such as the vglm function from the state‐of‐the‐art VGAM package in R. In either scenario, the GEE approach offers asymptotically correct inferences on model parameters regardless of whether the working variance–covariance model is correctly or incorrectly specified. The finite‐sample performance of the method is assessed using simulation studies based on a burn injury dataset and a sorbinil eye trial dataset. The method is applied to data analysis examples using the same two datasets, as well as to a trivariate binary dataset on three plant species in the Hunua ranges of Auckland.

14.
In a longitudinal set-up, to examine the effects of certain fixed covariates on the repeated binary responses, there exists an approach to model the binary probabilities through a dynamic logistic relationship. In some practical situations such as in longitudinal clinical studies, it may happen that some of the covariates such as treatments are selected randomly following an adaptive design, whereas the rest of the covariates may be fixed by nature. The purpose of this study is to examine the effects of the design weights selection on the parameter estimation including the treatment effects, after taking the longitudinal correlations of the repeated binary responses into account.

15.
This paper presents the results of a small sample simulation study designed to evaluate the performance of a recently proposed test statistic for the analysis of correlated binary data. The new statistic is an adjusted Mantel-Haenszel test, which may be used in testing for association between a binary exposure and a binary outcome of interest across several fourfold tables when the data have been collected under a cluster sampling design. Although originally developed for the analysis of periodontal data, the proposed method may be applied to clustered binary data arising in a variety of settings, including longitudinal studies, family studies, and school-based research. The features of the simulation are intended to mimic those of a research study of periodontal health, in which a large number of observations is made on each of a relatively small number of patients. The simulation reveals that the adjusted test statistic performs well in finite samples, having empirical type I error rates close to nominal and empirical power similar to that of more complicated marginal regression methods. Software for computing the adjusted statistic is also provided.

16.
We propose a mixture model for data with an ordinal outcome and a longitudinal covariate that is subject to missingness. Data from a tailored, telephone-delivered smoking cessation intervention for construction laborers are used to illustrate the method, which considers as an outcome a categorical measure of smoking cessation, and evaluates the effectiveness of the motivational telephone interviews on this outcome. We propose two model structures for the longitudinal covariate: one for the case when the missing data are missing at random, and one for when the missing data mechanism is non-ignorable. A generalized EM algorithm is used to obtain maximum likelihood estimates.

17.
The problem of interpreting lung-function measurements in industrial workers is examined. The data under discussion pertain to FEV1 and FVC measurements in smoking and in nonsmoking groups of grain-elevator workers in British Columbia and of workers in Vancouver City Hall. Initial observations have now been enriched by longitudinal follow-up data on the same groups after three and after six years. It is shown that interesting selection phenomena, favouring “fit” individuals, take place over time, with regard both to lung symptoms and lung functions. Thus cross-sectional and longitudinal studies refer to somewhat different populations. It also appears that longitudinal studies are considerably more sensitive in identifying cumulative lung damage than are corresponding cross-sectional studies. The nonlinearity of the effect of age on lung functions is noted in the longitudinal data in a number of cases, lending support to the hypothesis of association between quadratic age effect and cumulative exposure to lung insults.

18.
Myers & Broyles (2000a, 2000b) illustrate that regression coefficient analysis (RCA) is a viable alternative to a generalized estimating equation (GEE) in the analysis of correlated binomial data. Since the regression coefficients (b_i's) may have different precisions, we modify RCA by weighting the b_i's by the inverses of their variances for statistical optimality. We perform a simulation study to evaluate the performance of RCA, modified RCA and GEE in terms of empirical type I errors and empirical powers of the regression coefficients in repeated binary measurement designs with and without dropouts. Two thousand data sets are generated using autoregressive (AR(1)) and compound symmetry (CS) correlation structures. We compare the type I errors and powers of RCA, modified RCA and GEE for the analysis of repeated binary measurement data as affected by different dropout mechanisms such as random dropouts and treatment-dependent dropouts.
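The inverse-variance weighting described in the abstract above can be sketched as follows. This is a generic inverse-variance weighted mean, assuming the per-subject coefficients and their variances are already available; the function name `inverse_variance_pool` and the input values are hypothetical, and the abstract does not confirm that the paper's modified RCA uses exactly this estimator.

```python
def inverse_variance_pool(coefs, variances):
    """Pool per-subject coefficients b_i using weights 1 / var(b_i).

    Returns the weighted mean and its variance, 1 / sum(1 / var(b_i)),
    which holds when the b_i are independent with known variances.
    """
    if not coefs or len(coefs) != len(variances):
        raise ValueError("coefs and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, coefs)) / total
    return pooled, 1.0 / total


# Hypothetical slopes from three subjects; the noisiest one counts least.
pooled, pooled_var = inverse_variance_pool([0.4, 0.6, 1.5], [0.05, 0.05, 0.50])
print(pooled, pooled_var)
```

With equal variances the result reduces to the ordinary unweighted mean of the coefficients, so any difference from an unweighted analysis comes only from unequal precision across subjects.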

19.
Clustered or correlated samples of categorical response data arise frequently in many fields of application. The method of generalized estimating equations (GEEs) introduced in Liang and Zeger [Longitudinal data analysis using generalized linear models, Biometrika 73 (1986), pp. 13–22] is often used to analyse this type of data. GEEs give consistent estimates of the regression parameters and their variance based upon the Pearson residuals. Park et al. [Alternative GEE estimation procedures for discrete longitudinal data, Comput. Stat. Data Anal. 28 (1998), pp. 243–256] considered a modification of the GEE approach using the Anscombe residual and the deviance residual. In this work, we propose to extend this idea to a family of generalized residuals. An extensive simulation study is conducted for binary and Poisson correlated outcomes, and two numerical illustrations are also presented.

20.
In longitudinal studies, as repeated observations are made on the same individual, the response variables will usually be correlated. In analyzing such data, this dependence must be taken into account to avoid misleading inferences. The focus of this paper is to apply a logistic marginal model with Markovian dependence proposed by Azzalini [A. Azzalini, Logistic regression for autocorrelated data with application to repeated measures, Biometrika 81 (1994) 767–775] to the study of the influence of time-dependent covariates on the marginal distribution of the binary response in serially correlated binary data. We have shown how to construct the model so that the covariates relate only to the mean value of the process, independent of the association parameters. After formulating the proposed model for repeated measures data, the same approach is applied to missing data. An application is provided to the diabetes mellitus data of registered patients at the Bangladesh Institute of Research and Rehabilitation in Diabetes, Endocrine and Metabolic Disorders (BIRDEM) in 1984, using both time-stationary and time-varying covariates.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号