
1.

A pseudo likelihood approach to analyze rate difference of binary response data in longitudinal factorial studies

Chunpeng Fan 《Communications in Statistics - Simulation and Computation》2018,47(7):2169-2183

Although Fan showed that the mixed-effects model for repeated measures (MMRM) is appropriate to analyze complete longitudinal binary data in terms of the rate difference, that work focused on using the generalized estimating equations (GEE) to make statistical inference. The current article emphasizes the validity of the MMRM when the normal-distribution-based pseudo likelihood approach is used to make inference for complete longitudinal binary data. For incomplete longitudinal binary data under a missing at random (MAR) mechanism, however, the MMRM, using either the GEE or the normal-distribution-based pseudo likelihood inferential procedure, gives biased results in general and should not be used for analysis.

2.

Variance function in regression analysis of longitudinal data using the generalized estimating equation approach

《Journal of Statistical Computation and Simulation》2012,82(12):2700-2709

Longitudinal or clustered response data arise in many applications such as biostatistics, epidemiology and environmental studies. The repeated responses cannot in general be assumed to be independent. One method of analysing such data is by using the generalized estimating equations (GEE) approach. The current GEE method for estimating regression effects in longitudinal data focuses on the modelling of the working correlation matrix assuming a known variance function. However, correct choice of the correlation structure may not necessarily improve estimation efficiency for the regression parameters if the variance function is misspecified [Wang YG, Lin X. Effects of variance-function misspecification in analysis of longitudinal data. Biometrics. 2005;61:413–421]. In this connection two problems arise: finding a correct variance function and estimating the parameters of the chosen variance function. In this paper, we study the problem of estimating the parameters of the variance function assuming that the form of the variance function is known and then the effect of a misspecified variance function on the estimates of the regression parameters. We propose a GEE approach to estimate the parameters of the variance function. This estimation approach borrows the idea of Davidian and Carroll [Variance function estimation. J Amer Statist Assoc. 1987;82:1079–1091] by solving a nonlinear regression problem where residuals are regarded as the responses and the variance function is regarded as the regression function. A limited simulation study shows that the proposed method performs at least as well as the modified pseudo-likelihood approach developed by Wang and Zhao [A modified pseudolikelihood approach for analysis of longitudinal data. Biometrics. 2007;63:681–689]. Both these methods perform better than the GEE approach.

3.

On generalised estimating equations for vector regression

A. Huang 《Australian & New Zealand Journal of Statistics》2017,59(2):195-213

Generalised estimating equations (GEE) for regression problems with vector‐valued responses are examined. When the response vectors are of mixed type (e.g. continuous–binary response pairs), the GEE approach is a semiparametric alternative to full‐likelihood copula methods, and is closely related to Prentice & Zhao's mean‐covariance estimation equations approach. When the response vectors are of the same type (e.g. measurements on left and right eyes), the GEE approach can be viewed as a ‘plug‐in’ to existing methods, such as the vglm function from the state‐of‐the‐art VGAM package in R. In either scenario, the GEE approach offers asymptotically correct inferences on model parameters regardless of whether the working variance–covariance model is correctly or incorrectly specified. The finite‐sample performance of the method is assessed using simulation studies based on a burn injury dataset and a sorbinil eye trial dataset. The method is applied to data analysis examples using the same two datasets, as well as to a trivariate binary dataset on three plant species in the Hunua ranges of Auckland.

4.

Power and sample size for GEE analysis of incomplete paired outcomes in 2 × 2 crossover trials

Yongqiang Tang 《Pharmaceutical statistics》2021,20(4):820-839

The 2 × 2 crossover trial uses subjects as their own control to reduce the intersubject variability in the treatment comparison, and typically requires fewer subjects than a parallel design. The generalized estimating equations (GEE) methodology has been commonly used to analyze incomplete discrete outcomes from crossover trials. We propose a unified approach to the power and sample size determination for the Wald Z-test and t-test from GEE analysis of paired binary, ordinal and count outcomes in crossover trials. The proposed method allows misspecification of the variance and correlation of the outcomes, missing outcomes, and adjustment for the period effect. We demonstrate that misspecification of the working variance and correlation functions leads to no or minimal efficiency loss in GEE analysis of paired outcomes. In general, GEE requires the assumption of missing completely at random. For bivariate binary outcomes, we show by simulation that the GEE estimate is asymptotically unbiased or only minimally biased, and the proposed sample size method is suitable under missing at random (MAR) if the working correlation is correctly specified. The performance of the proposed method is illustrated with several numerical examples. Adaption of the method to other paired outcomes is discussed.

5.

Small sample characteristics of generalized estimating equations

J. C. Gunsolley C. Getchell V. M. Chinchilli 《Communications in Statistics - Simulation and Computation》2013,42(4):869-878

The aim of this study was to investigate the Type I error rate of hypothesis testing based on generalized estimating equations (GEE) for data characteristic of periodontal clinical trials. The data in these studies consist of a large number of binary responses from each subject and a small number of subjects (Haffajee et al. (1983), Goodson (1986), Jenkins et al. (1988)). Computer simulations were employed to investigate GEE based both on an empirical estimate of the variance-covariance matrix and a model-based estimate. Results from this investigation indicate that hypothesis testing based on GEE resulted in inappropriate Type I error rates when small samples were employed. Only an increase in the number of subjects to the point where it matched the number of observations per subject resulted in appropriate Type I error rates.

6.

A PRESS statistic for working correlation structure selection in generalized estimating equations

Gul Inan Mahbub A. H. M. Latif John Preisser 《Journal of applied statistics》2019,46(4):621-637

Generalized estimating equations (GEE) is one of the most commonly used methods for regression analysis of longitudinal data, especially with discrete outcomes. The GEE method accounts for the association among the responses of a subject through a working correlation matrix and its correct specification ensures efficient estimation of the regression parameters in the marginal mean regression model. This study proposes a predicted residual sum of squares (PRESS) statistic as a working correlation selection criterion in GEE. A simulation study is designed to assess the performance of the proposed GEE PRESS criterion and to compare its performance with its counterpart criteria in the literature. The results show that the GEE PRESS criterion has better performance than the weighted error sum of squares SC criterion in all cases but is surpassed in performance by the Gaussian pseudo-likelihood criterion. Lastly, the working correlation selection criteria are illustrated with data from the Coronary Artery Risk Development in Young Adults study.

7.

Bias from the use of generalized estimating equations to analyze incomplete longitudinal binary data

Andrew J. Copas Shaun R. Seaman 《Journal of applied statistics》2010,37(6):911-922

Patient dropout is a common problem in studies that collect repeated binary measurements. Generalized estimating equations (GEE) are often used to analyze such data. The dropout mechanism may be plausibly missing at random (MAR), i.e. unrelated to future measurements given covariates and past measurements. In this case, various authors have recommended weighted GEE with weights based on an assumed dropout model, or an imputation approach, or a doubly robust approach based on weighting and imputation. These approaches provide asymptotically unbiased inference, provided the dropout or imputation model (as appropriate) is correctly specified. Other authors have suggested that, provided the working correlation structure is correctly specified, GEE using an improved estimator of the correlation parameters (‘modified GEE’) show minimal bias. These modified GEE have not been thoroughly examined. In this paper, we study the asymptotic bias under MAR dropout of these modified GEE, the standard GEE, and also GEE using the true correlation. We demonstrate that all three methods are biased in general. The modified GEE may be preferred to the standard GEE and are subject to only minimal bias in many MAR scenarios but in others are substantially biased. Hence, we recommend the modified GEE be used with caution.

8.

The pseudo‐GEE approach to the analysis of longitudinal surveys

Iván A. Carrillo Jiahua Chen Changbao Wu 《Revue canadienne de statistique》2010,38(4):540-554

Longitudinal surveys have emerged in recent years as an important data collection tool for population studies where the primary interest is to examine population changes over time at the individual level. Longitudinal data are often analyzed through the generalized estimating equations (GEE) approach. The vast majority of the existing literature on the GEE method, however, is developed under non‐survey settings and is inappropriate for data collected through complex sampling designs. In this paper the authors develop a pseudo‐GEE approach for the analysis of survey data. They show that survey weights must and can be appropriately accounted for in the GEE method under a joint randomization framework. The consistency of the resulting pseudo‐GEE estimators is established under the proposed framework. Linearization variance estimators are developed for the pseudo‐GEE estimators when the finite population sampling fractions are small or negligible, a scenario that often holds for large‐scale surveys. Finite sample performances of the proposed estimators are investigated through an extensive simulation study using data from the National Longitudinal Survey of Children and Youth. The results show that the pseudo‐GEE estimators and the linearization variance estimators perform well under several sampling designs and for both continuous and binary responses. The Canadian Journal of Statistics 38: 540–554; 2010 © 2010 Statistical Society of Canada

9.

Model Selection Criterion Based on the Multivariate Quasi‐Likelihood for Generalized Estimating Equations

Shinpei Imori 《Scandinavian Journal of Statistics》2015,42(4):1214-1224

The generalized estimating equations (GEE) approach has attracted considerable interest for the analysis of correlated response data. This paper considers the model selection criterion based on the multivariate quasi‐likelihood (MQL) in the GEE framework. The GEE approach is closely related to the MQL. We derive a necessary and sufficient condition for the uniqueness of the risk function based on the MQL by using properties of differential geometry. Furthermore, we establish a formal derivation of model selection criterion as an asymptotically unbiased estimator of the prediction risk under this condition, and we explicitly take into account the effect of estimating the correlation matrix used in the GEE procedure.

10.

Marginal models for the association structure of hierarchical binary responses

André G. F. C. Costa Aline B. M. Vaz José Luiz P. Silva Leila D. Amorim 《Journal of applied statistics》2017,44(10):1827-1838

Clustered binary responses are often found in ecological studies. Data analysis may include modeling the marginal probability response. However, when the association is the main scientific focus, modeling the correlation structure between pairs of responses is the key part of the analysis. Second-order generalized estimating equations (GEE) are established in the literature. Some of them are more efficient in computational terms, especially with large clusters. Alternating logistic regression (ALR) and orthogonalized residual (ORTH) GEE methods are presented and compared in this paper. Simulation results show a slight superiority of ALR over ORTH. Marginal probabilities and odds ratios are also estimated and compared in a real ecological study involving a three-level hierarchical clustering. ALR and ORTH models are useful for modeling complex association structure with large cluster sizes.

11.

Modeling the correlation structure of data that have multiple levels of association

Justine Shults 《Communications in Statistics - Theory and Methods》2013,42(5-6):1005-1015

Some modern approaches for the analysis of non-normally distributed and correlated data, including Liang and Zeger's (1986) method of generalized estimating equations (GEE), model the pattern of association among outcomes by assuming a structure for their correlation matrix. A number of relatively simple patterned correlation matrices are available for measurements with one level of correlation. However, modeling the correlation structure of data with multiple levels, or causes, of association is not as straightforward; this note discusses some of the difficulties and describes a simple class of correlation models that may prove useful in this endeavor.
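
As a rough illustration of the "patterned correlation matrices" mentioned in this abstract, the sketch below builds two standard one-level working correlation structures (exchangeable and AR(1)). The third function, `nested_exchangeable`, is a hypothetical two-level example added here for illustration only; it is not claimed to be the class of models proposed in the note. All function names are assumptions.

```python
def exchangeable(n, rho):
    """n x n matrix with 1 on the diagonal and rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """n x n first-order autoregressive matrix: entry (i, j) is rho**|i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def nested_exchangeable(groups, rho_within, rho_between):
    """Hypothetical two-level structure (illustration only).

    groups[i] is the subcluster label of observation i inside one cluster.
    Pairs in the same subcluster get rho_within, other pairs rho_between.
    """
    n = len(groups)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                row.append(1.0)
            elif groups[i] == groups[j]:
                row.append(rho_within)
            else:
                row.append(rho_between)
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    for row in nested_exchangeable(["a", "a", "b", "b"], 0.6, 0.2):
        print(row)
```

The one-level structures need a single parameter, while even this simple two-level variant needs two parameters that must be chosen so that the matrix stays positive definite, which hints at why the multi-level case is less straightforward.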

12.

To adjust or not to adjust for baseline when analyzing repeated binary responses? The case of complete data when treatment comparison at study end is of interest

Honghua Jiang Pandurang M. Kulkarni Craig H. Mallinckrodt Linda Shurzinske Geert Molenberghs Ilya Lipkovich 《Pharmaceutical statistics》2015,14(3):262-271

The benefits of adjusting for baseline covariates are not as straightforward with repeated binary responses as with continuous response variables. Therefore, in this study, we compared different methods for analyzing repeated binary data through simulations when the outcome at the study endpoint is of interest. Methods compared included chi‐square, Fisher's exact test, covariate adjusted/unadjusted logistic regression (Adj.logit/Unadj.logit), covariate adjusted/unadjusted generalized estimating equations (Adj.GEE/Unadj.GEE), covariate adjusted/unadjusted generalized linear mixed model (Adj.GLMM/Unadj.GLMM). All these methods preserved the type I error close to the nominal level. Covariate adjusted methods improved power compared with the unadjusted methods because of the increased treatment effect estimates, especially when the correlation between the baseline and outcome was strong, even though there was an apparent increase in standard errors. Results of the chi‐square test were identical to those for the unadjusted logistic regression. Fisher's exact test was the most conservative test regarding the type I error rate and also had the lowest power. Without missing data, there was no gain in using a repeated measures approach over a simple logistic regression at the final time point. Analysis of results from five phase III diabetes trials of the same compound was consistent with the simulation findings. Therefore, covariate adjusted analysis is recommended for repeated binary data when the study endpoint is of interest. Copyright © 2015 John Wiley & Sons, Ltd.

13.

Assessment of modeling longitudinal binary data based on graphical methods

Kuo-Chin Lin Yi-Ju Chen 《Communications in Statistics - Theory and Methods》2017,46(7):3426-3437

Longitudinal categorical data are common in a variety of fields and are frequently analyzed by the generalized estimating equation (GEE) method. Prior to making further inference based on the GEE model, the assessment of model fit is crucial. Graphical techniques have long been in widespread use for assessing the model adequacy. We develop alternative graphical approaches utilizing plots of marginal model-checking condition and local mean deviance to assess the GEE model with logit link for longitudinal binary responses. The applications of the proposed procedures are illustrated through two longitudinal binary datasets.

14.

A Pairwise Likelihood Procedure for Analyzing Exchangeable Binary Data with Random Cluster Sizes

Huixiu Zhao 《Communications in Statistics - Theory and Methods》2013,42(5):594-606

For the exchangeable binary data with random cluster sizes, we use a pairwise likelihood procedure to give a set of approximately optimal unbiased estimating equations for estimating the mean and variance parameters. Theoretical results are obtained establishing the large sample properties of the solutions to the estimating equations. An application to a developmental toxicity study is given. Simulation results show that the pairwise likelihood procedure is valid and performs better than the GEE procedure for the exchangeable binary data.

15.

Comparison of GEE1 and GEE2 estimation applied to clustered logistic regression

《Journal of Statistical Computation and Simulation》2012,82(4):361-378

Generalized estimating equations (GEE) have become a popular method for marginal regression modelling of data that occur in clusters. Features of the GEE methodology are the use of a ‘working covariance’, an approximation to the underlying covariance, which is used to improve the efficiency in estimating the regression coefficients, and the ‘sandwich’ estimate of variance, which provides a way of consistently estimating their standard errors. These techniques have been extended to include estimating equations for the underlying correlation structure, both to improve the efficiency of the regression coefficient estimates and to provide estimates of correlations between units in a cluster, when these are of interest. If the mean structure is of primary interest, then a simpler set of equations (GEE1) can be used, whereas if the underlying covariance structure is of interest in its own right, the use of the more complex GEE2 estimating equations is often recommended. In this paper, we compare the effect of increasing the complexity of the ‘working covariances’ on the variance of the parameter estimates, as well as the mean-squared error of the ‘sandwich’ estimate of variance. We give asymptotic expressions for these variances and mean-squared error terms. We use these to study the behaviour of different variants of GEE1 and GEE2 when we change the number of clusters, the cluster size, and the within-cluster correlation. We conclude that the extra complexity of the full GEE2 approach is not usually justified if the mean structure is of primary interest.
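
For orientation, the first-order estimating equation (GEE1) and the ‘sandwich’ variance referred to in this abstract are usually written as below. The notation is the common Liang and Zeger convention and is an assumption here, not taken from the paper: $K$ clusters, response vector $y_i$ with mean $\mu_i(\beta)$, $A_i$ the diagonal matrix of variance-function values, $R_i(\alpha)$ the working correlation, and $\phi$ a scale parameter.

```latex
\[
U(\beta) = \sum_{i=1}^{K} D_i^{\top} V_i^{-1} \{ y_i - \mu_i(\beta) \} = 0,
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta^{\top}},
\qquad
V_i = \phi\, A_i^{1/2} R_i(\alpha) A_i^{1/2},
\]
\[
\widehat{\operatorname{Var}}(\hat\beta) = B^{-1} M B^{-1},
\quad
B = \sum_{i=1}^{K} D_i^{\top} V_i^{-1} D_i,
\quad
M = \sum_{i=1}^{K} D_i^{\top} V_i^{-1} (y_i - \hat\mu_i)(y_i - \hat\mu_i)^{\top} V_i^{-1} D_i .
\]
```

Here $V_i$ plays the role of the ‘working covariance’: if it equals the true covariance, $M$ and $B$ coincide in expectation and the sandwich reduces to the model-based variance $B^{-1}$.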

16.

Marginal semiparametric multivariate accelerated failure time model with generalized estimating equations

Sy Han Chiou Sangwook Kang Junghi Kim Jun Yan 《Lifetime data analysis》2014,20(4):599-618

The semiparametric accelerated failure time (AFT) model is not as widely used as the Cox relative risk model due to computational difficulties. Recent developments in least squares estimation and induced smoothing estimating equations for censored data provide promising tools to make the AFT models more attractive in practice. For multivariate AFT models, we propose a generalized estimating equations (GEE) approach, extending the GEE to censored data. The consistency of the regression coefficient estimator is robust to misspecification of working covariance, and the efficiency is higher when the working covariance structure is closer to the truth. The marginal error distributions and regression coefficients are allowed to be unique for each margin or partially shared across margins as needed. The initial estimator is a rank-based estimator with Gehan’s weight, but obtained from an induced smoothing approach with computational ease. The resulting estimator is consistent and asymptotically normal, with variance estimated through a multiplier resampling method. In a large scale simulation study, our estimator was up to three times as efficient as the estimator that ignores the within-cluster dependence, especially when the within-cluster dependence was strong. The methods were applied to the bivariate failure times data from a diabetic retinopathy study.

17.

Some Properties of the Liang-Zeger Method Applied to Clustered Binary Regression

Andrew Balemi & Alan Lee 《Australian & New Zealand Journal of Statistics》1999,41(1):43-58

The Generalized Estimating Equation (GEE) method popularized by Liang and Zeger provides a very general method for fitting regression models to observations that occur in clusters. Features of the method are the specification of a 'working correlation' (a guess at the true correlation structure of the data) which is used to improve efficiency in estimating the regression coefficients, and the 'information sandwich' which provides a way of consistently estimating the standard errors of the estimated regression coefficients even if (as we might expect) the working correlation is wrong. This paper develops asymptotic expressions for the bias and efficiency both of the regression coefficient estimates and of the sandwich estimate, and uses them to study the behaviour of the estimates. It looks at the effect of the choice of the working correlation on the estimate and also examines the effect of different cluster sizes and different degrees of correlation between the covariates. The performance of these methods is found to be excellent, particularly when the degree of correlation in the responses and covariates is small to moderate.
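
To make the 'information sandwich' concrete, here is a minimal sketch for the simplest possible case: an intercept-only model for a clustered binary response, with identity link and an independence working correlation. In that special case the GEE estimate is the pooled proportion, the model-based variance is p(1-p)/N, and the sandwich variance is the sum of squared cluster residual totals divided by N squared. The function name and the toy data are assumptions made for illustration; this is not the paper's derivation.

```python
from math import sqrt


def gee_proportion(clusters):
    """Intercept-only GEE (identity link, independence working correlation).

    clusters: list of lists of 0/1 responses, one inner list per cluster.
    Returns (estimate, model-based SE, sandwich SE).
    """
    n_obs = sum(len(c) for c in clusters)
    p_hat = sum(sum(c) for c in clusters) / n_obs

    # Model-based ("naive") variance: treats every observation as independent.
    naive_var = p_hat * (1 - p_hat) / n_obs

    # Sandwich variance: the p(1-p) factors in bread and meat cancel, leaving
    # the squared within-cluster residual totals divided by N^2.
    meat = sum(sum(y - p_hat for y in c) ** 2 for c in clusters)
    robust_var = meat / n_obs ** 2

    return p_hat, sqrt(naive_var), sqrt(robust_var)


if __name__ == "__main__":
    # Toy data with strong within-cluster agreement (hypothetical values).
    toy = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 0], [0, 0, 0, 1]]
    estimate, naive_se, robust_se = gee_proportion(toy)
    print(estimate, naive_se, robust_se)
```

With positively correlated responses inside clusters, as in the toy data, the sandwich standard error comes out larger than the model-based one, which is the situation where ignoring the clustering would overstate precision.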

18.

P-value adjustment for multiple binary endpoints

James J. Chen 《Communications in Statistics - Theory and Methods》2013,42(11):2791-2806

The p-value-based adjustment of individual endpoints and the global test for an overall inference are the two general approaches for the analysis of multiple endpoints. Statistical procedures developed for testing multivariate outcomes often assume that the multivariate endpoints are either independent or normally distributed. This paper presents a general approach for the analysis of multivariate binary data under the framework of generalized linear models. The generalized estimating equations (GEE) approach is applied to estimate the correlation matrix of the test statistics using the identity and exchangeable working correlation matrices with the model-based as well as robust estimators. The objectives of the approaches are the adjustment of p-values of individual endpoints to identify the affected endpoints as well as the global test of an overall effect. A Monte Carlo simulation was conducted to evaluate the overall family wise error (FWE) rates of the single-step down p-value adjustment approach from two adjustment methods to three global test statistics. The p-value adjustment approach seems to control the FWE better than the global approach. Applications of the proposed methods are illustrated by analyzing a carcinogenicity experiment designed to study the dose response trend for 10 tumor sites, and a developmental toxicity experiment with three malformation types: external, visceral, and skeletal.

19.

Comparing alternating logistic regressions to other approaches to modelling correlated binary data

《Journal of Statistical Computation and Simulation》2012,82(10):2059-2071

Alternating logistic regressions (ALRs) seem to offer some of the advantages of marginal models estimated via generalized estimating equations (GEE) and generalized linear mixed models (GLMMs). Via simulation study we compared ALRs to marginal models estimated via GEE and subject-specific models estimated via GLMMs, with a focus on estimation of the correlation structure in three-level data sets (e.g. students in classes in schools). Data set size and structure, and amount of correlation in the data sets were varied. For simple correlation structures, ALRs performed well. For three-level correlation structures, all approaches, but especially ALRs, had difficulty assigning the correlation to the correct level, though sample sizes used were small. In addition, ALRs and GEEs had trouble attaching correct inference to the mean effects, though this improved as overall sample size improved. ALRs are a valuable addition to the data analyst's toolkit, though care should be taken when modelling data with three-level structures.

20.

Resistant fits for regression with correlated outcomes: an estimating equations approach

《Journal of statistical planning and inference》1999,75(2):415-431

The generalized estimating equations procedure of Liang and Zeger (1986) can be highly influenced by the presence of unusual data points. A generalization is introduced which yields parameter estimates and fitted values resistant to influential data. A diagonal weight matrix for each cluster is incorporated into the estimating equations which downweights the multivariate response vector element-wise. Efficiency of the procedure is investigated, including the case of correlated binary outcomes.