1.

Modified regression coefficient analysis for repeated binary measurements

Chul Ahn Sin-Ho Jung Seung-Ho Kang 《Journal of applied statistics》2002,29(5):703-710

Myers & Broyles (2000a, 2000b) illustrate that regression coefficient analysis (RCA) is a viable alternative to a generalized estimating equation (GEE) in the analysis of correlated binomial data. Since the regression coefficients (b_i's) may have different precisions, we modify RCA by weighting the b_i's by the inverses of their variances for statistical optimality. We perform a simulation study to evaluate the performance of RCA, modified RCA and GEE in terms of empirical type I errors and empirical powers of the regression coefficients in repeated binary measurement designs with and without dropouts. Two thousand data sets are generated using autoregressive (AR(1)) and compound symmetry (CS) correlation structures. We compare the type I errors and powers of RCA, modified RCA and GEE for the analysis of repeated binary measurement data as affected by different dropout mechanisms such as random dropouts and treatment dependent dropouts.
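
The inverse-variance weighting described in this abstract can be illustrated with a minimal sketch. It assumes the per-subject slopes b_i are independent and their variances are treated as known; the function name and the numbers are hypothetical, and the article's exact estimator may differ in detail.

```python
import math


def inverse_variance_mean(estimates, variances):
    """Pool per-subject slopes b_i with weights w_i = 1 / var(b_i)."""
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need equal-length, non-empty inputs")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    # Variance of the weighted mean, assuming independent b_i
    # and variances treated as known.
    pooled_var = 1.0 / total
    return pooled, pooled_var


# Hypothetical slopes and variances for three subjects.
b = [0.2, 0.5, 0.8]
v = [0.04, 0.01, 0.04]
pooled, pooled_var = inverse_variance_mean(b, v)
z = pooled / math.sqrt(pooled_var)  # Wald-type statistic for H0: slope = 0
```

The most precisely estimated slope (variance 0.01) dominates the pooled value, which is the point of weighting by precision rather than averaging the b_i equally.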

2.

Comparison of methods for analyzing binary repeated measures data: A simulation-based study (comparison of methods for binary repeated measures)

M. B. M. B. K. Gawarammana 《Communications in Statistics - Simulation and Computation》2017,46(3):2103-2120

In this study, some methods suggested for binary repeated measures, namely, Weighted Least Squares (WLS), Generalized Estimating Equations (GEE), and Generalized Linear Mixed Models (GLMM) are compared with respect to power, type I error, and properties of estimates. The results indicate that with adequate sample size, no missing data, the only covariate being time effect, and a relatively limited number of time points, the WLS method performs well. The GEE approach performs well only for large sample sizes. The GLMM method is satisfactory with respect to type I error, but its estimates have poorer properties than the other methods.

3.

Optimal model averaging estimation for correlation structure in generalized estimating equations

Fang Fang Jingli Wang 《Communications in Statistics - Simulation and Computation》2019,48(5):1574-1593

Longitudinal data analysis requires a proper estimation of the within-cluster correlation structure in order to achieve efficient estimates of the regression parameters. When applying likelihood-based methods one may select an optimal correlation structure by the AIC or BIC. However, such information criteria are not applicable for estimating equation based approaches. In this paper we develop a model averaging approach to estimate the correlation matrix by a weighted sum of a group of patterned correlation matrices under the GEE framework. The optimal weight is determined by minimizing the difference between the weighted sum and a consistent yet inefficient estimator of the correlation structure. The computation of our proposed approach only involves a standard quadratic programming on top of the standard GEE procedure and can be easily implemented in practice. We provide theoretical justifications and extensive numerical simulations to support the application of the proposed estimator. A couple of well-known longitudinal data sets are revisited where we implement and illustrate our methodology.
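
The "weighted sum of patterned correlation matrices" mentioned in this abstract can be sketched as follows. The sketch only builds two common patterned matrices and combines them with fixed, hypothetical weights; it does not perform the quadratic-programming step that the article uses to choose the weights, and all function names are illustrative.

```python
def exchangeable(m, rho):
    """Compound-symmetry pattern: every off-diagonal entry equals rho."""
    return [[1.0 if i == j else rho for j in range(m)] for i in range(m)]


def ar1(m, rho):
    """AR(1) pattern: correlation decays as rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(m)] for i in range(m)]


def weighted_sum(matrices, weights):
    """Convex combination of equally sized correlation matrices."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be non-negative and sum to 1")
    m = len(matrices[0])
    return [
        [sum(w * mat[i][j] for w, mat in zip(weights, matrices)) for j in range(m)]
        for i in range(m)
    ]


# Hypothetical weights; the article chooses them by quadratic programming.
R = weighted_sum([exchangeable(3, 0.5), ar1(3, 0.5)], [0.4, 0.6])
```

Because the weights are non-negative and sum to one, the result keeps a unit diagonal and stays positive semi-definite whenever the candidate matrices are.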

4.

GEE-based zero-inflated generalized Poisson model for clustered over or under-dispersed count data

Fatemeh Sarvi Hossein Mahjub 《Journal of Statistical Computation and Simulation》2019,89(14):2711-2732

Zero-inflated regression models such as the zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB) or zero-inflated generalized Poisson (ZIGP) regression models can model count data with excess zeros. The ZINB model can handle over-dispersed count data, and the ZIGP model can handle over- or under-dispersed count data, with excess zeros as well. Moreover, the count data may be correlated because of the data collection procedure or a special study design. The clustered sampling approach is one example in which the correlation among subjects could be defined. In such situations, a marginal model using the generalized estimating equation (GEE) approach can incorporate these correlations and lead to relationships at the population level. In this study, the GEE-based zero-inflated generalized Poisson regression model was proposed to fit over- and under-dispersed clustered count data with excess zeros.

5.

On generalised estimating equations for vector regression

A. Huang 《Australian & New Zealand Journal of Statistics》2017,59(2):195-213

Generalised estimating equations (GEE) for regression problems with vector‐valued responses are examined. When the response vectors are of mixed type (e.g. continuous–binary response pairs), the GEE approach is a semiparametric alternative to full‐likelihood copula methods, and is closely related to Prentice & Zhao's mean‐covariance estimation equations approach. When the response vectors are of the same type (e.g. measurements on left and right eyes), the GEE approach can be viewed as a ‘plug‐in’ to existing methods, such as the vglm function from the state‐of‐the‐art VGAM package in R. In either scenario, the GEE approach offers asymptotically correct inferences on model parameters regardless of whether the working variance–covariance model is correctly or incorrectly specified. The finite‐sample performance of the method is assessed using simulation studies based on a burn injury dataset and a sorbinil eye trial dataset. The method is applied to data analysis examples using the same two datasets, as well as to a trivariate binary dataset on three plant species in the Hunua ranges of Auckland.

6.

Assessment of modeling longitudinal binary data based on graphical methods

Kuo-Chin Lin Yi-Ju Chen 《Communications in Statistics - Theory and Methods》2017,46(7):3426-3437

Longitudinal categorical data are commonly applied in a variety of fields and are frequently analyzed by the generalized estimating equation (GEE) method. Prior to making further inference based on the GEE model, the assessment of model fit is crucial. Graphical techniques have long been in widespread use for assessing model adequacy. We develop alternative graphical approaches utilizing plots of marginal model-checking condition and local mean deviance to assess the GEE model with logit link for longitudinal binary responses. The applications of the proposed procedures are illustrated through two longitudinal binary datasets.

7.

Pseudo-Likelihood Methodology for Hierarchical Count Data

George Kalema 《Communications in Statistics - Theory and Methods》2014,43(22):4790-4805

Generalized Estimating Equations (GEE) are a widespread tool for modeling correlated data, based on properly formulating a marginal regression function, combined with working assumptions about the correlation function. Should interest be placed in addition on the correlation function, then, apart from second-order GEE, pseudo-likelihood (PL) also provides an attractive alternative, especially in its pairwise form, where the covariance between each pair of the response vector is modeled as well. An elegant PL approach is formulated in this paper, based on a flexible bivariate Poisson model. The performance of the PL-method is studied, relative to GEE, using simulations. Data on repeated counts of epileptic seizures in a two-arm clinical trial are analyzed. A macro has been developed by the authors and made available on their web pages.

8.

A simple approach for generating correlated binary variates ∗

《Journal of Statistical Computation and Simulation》2012,82(3):231-255

Correlated binary data arise frequently in medical as well as other scientific disciplines; and statistical methods, such as generalized estimating equation (GEE), have been widely used for their analysis. The need for simulating correlated binary variates arises for evaluating small sample properties of the GEE estimators when modeling such data. Also, one might generate such data to simulate and study biological phenomena such as tooth decay or periodontal disease. This article introduces a simple method for generating pairs of correlated binary data. A simple algorithm is also provided for generating an arbitrary dimensional random vector of non-negatively correlated binary variates. The method relies on the idea that correlations among the random variables arise as a result of their sharing some common components that induce such correlations. It then uses some properties of the binary variates to represent each variate in terms of these common components in addition to its own elements. Unlike most previous approaches that require solving nonlinear equations or use some distributional properties of other random variables, this method uses only some properties of the binary variate. As no intermediate random variables are required for generating the binary variates, the proposed method is shown to be faster than the other methods. To verify this claim, we compare the computational efficiency of the proposed method with those of other procedures.
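
The shared-component idea can be illustrated with a well-known mixture construction. This is not the article's algorithm (the abstract does not give it): each variate either copies one shared Bernoulli(p) draw, with probability sqrt(rho), or takes its own independent Bernoulli(p) draw. That yields equal marginals p and a common non-negative pairwise correlation rho. The function name and parameter values are hypothetical.

```python
import random


def correlated_binary(n, p, rho, rng):
    """Return n exchangeable Bernoulli(p) draws with pairwise correlation rho."""
    if not (0.0 <= p <= 1.0 and 0.0 <= rho <= 1.0):
        raise ValueError("p and rho must lie in [0, 1]")
    gamma = rho ** 0.5  # probability of copying the shared component
    shared = 1 if rng.random() < p else 0
    draws = []
    for _ in range(n):
        if rng.random() < gamma:
            draws.append(shared)  # common component induces correlation
        else:
            draws.append(1 if rng.random() < p else 0)  # own component
    return draws


rng = random.Random(2024)
sample = correlated_binary(4, 0.3, 0.4, rng)
```

Two variates agree through the shared draw only when both copy it, which happens with probability gamma squared, so the pairwise correlation equals rho. Negative correlations and unequal marginals are outside what this simple construction can produce.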

9.

ESTIMATING A RATIO OF MEANS FROM BIVARIATE COUNTS WITH APPLICATIONS IN STEREOLOGY

Adrian Baddeley 《Australian & New Zealand Journal of Statistics》2011,53(3):365-387

In survey sampling and in stereology, it is often desirable to estimate the ratio of means θ = E(Y)/E(X) from bivariate count data (X, Y) with unknown joint distribution. We review methods that are available for this problem, with particular reference to stereological applications. We also develop new methods based on explicit statistical models for the data, and associated model diagnostics. The methods are tested on a stereological dataset. For point‐count data, binomial regression and bivariate binomial models are generally adequate. Intercept‐count data are often overdispersed relative to Poisson regression models, but adequately fitted by negative binomial regression.
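
For orientation, the classical ratio-of-means estimator with a delta-method variance can be sketched as below. This is the textbook survey-sampling estimator, not one of the model-based methods the article develops, and the function name and data are hypothetical.

```python
def ratio_of_means(x, y):
    """Estimate theta = E(Y)/E(X) by sum(y)/sum(x), with a delta-method variance."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need paired samples of size at least 2")
    sum_x, sum_y = sum(x), sum(y)
    if sum_x == 0:
        raise ValueError("sum of x must be non-zero")
    theta = sum_y / sum_x
    # Linearisation residuals d_i = y_i - theta * x_i.
    resid = [yi - theta * xi for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 1)
    x_bar = sum_x / n
    variance = s2 / (n * x_bar ** 2)
    return theta, variance


# Hypothetical paired counts.
theta, variance = ratio_of_means([1, 2, 3], [2, 3, 7])
```

The variance formula treats the pairs as independent and identically distributed; overdispersed or clustered counts, as discussed in the abstract, would need a model-based alternative.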

10.

Direct Modelling of Regression Effects for Transition Probabilities in Multistate Models (total citations: 4, self-citations: 0, citations by others: 4)

THOMAS H. SCHEIKE MEI-JIE ZHANG 《Scandinavian Journal of Statistics》2007,34(1):17-32

Abstract. A simple and standard approach for analysing multistate model data is to model all transition intensities and then compute a summary measure such as the transition probabilities based on this. This approach is relatively simple to implement but it is difficult to see what the covariate effects are on the scale of interest. In this paper, we consider an alternative approach that directly models the covariate effects on transition probabilities in multistate models. Our new approach is based on binomial modelling and inverse probability of censoring weighting techniques and is very simple to implement by standard software. We show how to do flexible regression models with possibly time-varying covariate effects.

11.

A PRESS statistic for working correlation structure selection in generalized estimating equations

Gul Inan Mahbub A. H. M. Latif John Preisser 《Journal of applied statistics》2019,46(4):621-637

Generalized estimating equations (GEE) is one of the most commonly used methods for regression analysis of longitudinal data, especially with discrete outcomes. The GEE method accounts for the association among the responses of a subject through a working correlation matrix and its correct specification ensures efficient estimation of the regression parameters in the marginal mean regression model. This study proposes a predicted residual sum of squares (PRESS) statistic as a working correlation selection criterion in GEE. A simulation study is designed to assess the performance of the proposed GEE PRESS criterion and to compare its performance with its counterpart criteria in the literature. The results show that the GEE PRESS criterion has better performance than the weighted error sum of squares SC criterion in all cases but is surpassed in performance by the Gaussian pseudo-likelihood criterion. Lastly, the working correlation selection criteria are illustrated with data from the Coronary Artery Risk Development in Young Adults study.

12.

Relative Risk Regression for Binary Outcomes: Methods and Recommendations

Ian C. Marschner 《Australian & New Zealand Journal of Statistics》2015,57(4):437-462

Relative risks are often considered preferable to odds ratios for quantifying the association between a predictor and a binary outcome. Relative risk regression is an alternative to logistic regression where the parameters are relative risks rather than odds ratios. It uses a log link binomial generalised linear model, or log‐binomial model, which requires parameter constraints to prevent probabilities from exceeding 1. This leads to numerical problems with standard approaches for finding the maximum likelihood estimate (MLE), such as Fisher scoring, and has motivated various non‐MLE approaches. In this paper we discuss the roles of the MLE and its main competitors for relative risk regression. It is argued that reliable alternatives to Fisher scoring mean that numerical issues are no longer a motivation for non‐MLE methods. Nonetheless, non‐MLE methods may be worthwhile for other reasons and we evaluate this possibility for alternatives within a class of quasi‐likelihood methods. The MLE obtained using a reliable computational method is recommended, but this approach requires bootstrapping when estimates are on the parameter space boundary. If convenience is paramount, then quasi‐likelihood estimation can be a good alternative, although parameter constraints may be violated. Sensitivity to model misspecification and outliers is also discussed along with recommendations and priorities for future research.
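
A small numerical sketch may help separate the two effect measures and show why the log link needs parameter constraints. The 2x2 counts and the coefficients below are hypothetical, and the function names are illustrative only.

```python
import math


def risk_measures(a, b, c, d):
    """Relative risk and odds ratio from a 2x2 table.

    Exposed group: a events, b non-events.
    Unexposed group: c events, d non-events.
    """
    p1 = a / (a + b)
    p0 = c / (c + d)
    return p1 / p0, (a * d) / (b * c)


def log_binomial_probability(intercept, slope, x):
    """Fitted probability under a log link: exp(intercept + slope * x)."""
    return math.exp(intercept + slope * x)


# Risks of 0.4 versus 0.2 give RR = 2 but OR = 8/3.
rr, odds_ratio = risk_measures(40, 60, 20, 80)

# Baseline risk 0.4 and RR 2 per unit of x: the linear predictor is
# unconstrained, so at x = 2 the "probability" is 1.6.
p_at_2 = log_binomial_probability(math.log(0.4), math.log(2.0), 2)
```

The last value exceeds 1, which is the kind of boundary violation that causes the numerical problems with Fisher scoring mentioned in the abstract.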

13.

Comparison of GEE1 and GEE2 estimation applied to clustered logistic regression

《Journal of Statistical Computation and Simulation》2012,82(4):361-378

Generalized estimating equations (GEE) have become a popular method for marginal regression modelling of data that occur in clusters. Features of the GEE methodology are the use of a ‘working covariance’, an approximation to the underlying covariance, which is used to improve the efficiency in estimating the regression coefficients, and the ‘sandwich’ estimate of variance, which provides a way of consistently estimating their standard errors. These techniques have been extended to include estimating equations for the underlying correlation structure, both to improve the efficiency of the regression coefficient estimates and to provide estimates of correlations between units in a cluster, when these are of interest. If the mean structure is of primary interest, then a simpler set of equations (GEE1) can be used, whereas if the underlying covariance structure is of interest in its own right, the use of the more complex GEE2 estimating equations is often recommended. In this paper, we compare the effect of increasing the complexity of the ‘working covariances’ on the variance of the parameter estimates, as well as the mean-squared error of the ‘sandwich’ estimate of variance. We give asymptotic expressions for these variances and mean-squared error terms. We use these to study the behaviour of different variants of GEE1 and GEE2 when we change the number of clusters, the cluster size, and the within-cluster correlation. We conclude that the extra complexity of the full GEE2 approach is not usually justified if the mean structure is of primary interest.

14.

Some Properties of the Liang-Zeger Method Applied to Clustered Binary Regression

Andrew Balemi & Alan Lee 《Australian & New Zealand Journal of Statistics》1999,41(1):43-58

The Generalized Estimating Equation (GEE) method popularized by Liang and Zeger provides a very general method for fitting regression models to observations that occur in clusters. Features of the method are the specification of a 'working correlation' (a guess at the true correlation structure of the data) which is used to improve efficiency in estimating the regression coefficients, and the 'information sandwich' which provides a way of consistently estimating the standard errors of the estimated regression coefficients even if (as we might expect) the working correlation is wrong. This paper develops asymptotic expressions for the bias and efficiency both of the regression coefficient estimates and of the sandwich estimate, and uses them to study the behaviour of the estimates. It looks at the effect of the choice of the working correlation on the estimate and also examines the effect of different cluster sizes and different degrees of correlation between the covariates. The performance of these methods is found to be excellent, particularly when the degree of correlation in the responses and covariates is small to moderate.
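
The 'information sandwich' can be shown in its simplest form: an intercept-only marginal mean for clustered binary data under an independence working correlation. This is a deliberately reduced sketch with hypothetical data and function name, not the regression setting studied in the article. The estimating equation is the sum over all observations of (y_ij - mu) = 0, so the "bread" is the number of observations and the "meat" is the sum of squared cluster-level residual totals.

```python
def clustered_proportion(clusters):
    """Marginal proportion with naive and sandwich variance estimates."""
    n_obs = sum(len(c) for c in clusters)
    p_hat = sum(sum(c) for c in clusters) / n_obs
    # Naive variance: treats all observations as independent.
    naive_var = p_hat * (1.0 - p_hat) / n_obs
    # Sandwich variance: residuals are summed within each cluster first,
    # so within-cluster correlation is reflected in the "meat".
    meat = sum(sum(y - p_hat for y in c) ** 2 for c in clusters)
    robust_var = meat / n_obs ** 2
    return p_hat, naive_var, robust_var


# Hypothetical, perfectly correlated clusters of size 3.
data = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
p_hat, naive_var, robust_var = clustered_proportion(data)
```

With perfectly correlated clusters of size 3 the sandwich variance is three times the naive one, matching the usual design effect 1 + (m - 1) * rho for cluster size m = 3 and rho = 1.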

15.

Modification of GEE1 and linear mixed-effects models for heteroscedastic longitudinal Gaussian data

Eiji Nakashima 《Communications in Statistics - Theory and Methods》2017,46(22):11110-11122

A characterization of GLMs is given. A modification of the Gaussian GEE1, the modified GEE1, was applied to heteroscedastic longitudinal data, to which linear mixed-effects models are usually applied. The modified GEE1 models scale multivariate data to homoscedastic data while maintaining the correlation structure and apply the usual GEE1 to the homoscedastic data, which needs no diagnostics for diagonal variances. Relationships among multivariate linear regression methods, ordinary/generalized LS, naïve/modified GEE1, and linear mixed-effects models were discussed. An application showed that modified GEE1 gave the most efficient parameter estimation. Correct specification of the main diagonals of heteroscedastic data variance appears to be more important for efficient mean parameter estimation.

16.

Efficient Computation of Reduced Regression Models

Stuart R. Lipsitz Garrett M. Fitzmaurice Debajyoti Sinha Nathanael Hevelone Edward Giovannucci Quoc-Dien Trinh 《The American statistician》2017,71(2):171-176

We consider settings where it is of interest to fit and assess regression submodels that arise as various explanatory variables are excluded from a larger regression model. The larger model is referred to as the full model; the submodels are the reduced models. We show that a computationally efficient approximation to the regression estimates under any reduced model can be obtained from a simple weighted least squares (WLS) approach based on the estimated regression parameters and covariance matrix from the full model. This WLS approach can be considered an extension to unbiased estimating equations of a first-order Taylor series approach proposed by Lawless and Singhal. Using data from the 2010 Nationwide Inpatient Sample (NIS), a 20% weighted, stratified, cluster sample of approximately 8 million hospital stays from approximately 1000 hospitals, we illustrate the WLS approach when fitting interval censored regression models to estimate the effect of type of surgery (robotic versus nonrobotic surgery) on hospital length-of-stay while adjusting for three sets of covariates: patient-level characteristics, hospital characteristics, and zip-code level characteristics. Ordinarily, standard fitting of the reduced models to the NIS data takes approximately 10 hours; using the proposed WLS approach, the reduced models take seconds to fit.
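
The idea of recovering a reduced-model fit from full-model output can be sketched with the classical first-order formula associated with Lawless and Singhal. When one coefficient is dropped, each retained estimate is adjusted by its covariance with the dropped one: beta_i - C[i][k] / C[k][k] * beta_k. The sketch handles dropping a single coefficient only, the function name and numbers are hypothetical, and the article's WLS formulation may differ in detail.

```python
def reduced_estimate(beta, cov, drop):
    """Approximate the fit with coefficient `drop` fixed at zero,
    using only the full-model estimates and their covariance matrix."""
    c_kk = cov[drop][drop]
    if c_kk <= 0:
        raise ValueError("dropped coefficient needs positive variance")
    keep = [i for i in range(len(beta)) if i != drop]
    return [beta[i] - cov[i][drop] / c_kk * beta[drop] for i in keep]


# Hypothetical no-intercept linear model with x1 = [1, 2, 3],
# x2 = [1, 0, 1], y = [2, 3, 5]. Full OLS fit: beta = (1.5, 0.5),
# with covariance proportional to the inverse of X'X.
beta_full = [1.5, 0.5]
cov_full = [[2 / 12, -4 / 12], [-4 / 12, 14 / 12]]
beta_reduced = reduced_estimate(beta_full, cov_full, drop=1)
```

For a linear model the formula is exact: refitting y on x1 alone gives 23/14, the same value returned here. For nonlinear models it is only a first-order approximation, which is where the speed-versus-accuracy trade-off described in the abstract arises.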

17.

The analysis of incontinence episodes and other count data in patients with overactive bladder by Poisson and negative binomial regression

R. Martina R. Kay R. van Maanen A. Ridder 《Pharmaceutical statistics》2015,14(2):151-160

Clinical studies in overactive bladder have traditionally used analysis of covariance or nonparametric methods to analyse the number of incontinence episodes and other count data. It is known that if the underlying distributional assumptions of a particular parametric method do not hold, an alternative parametric method may be more efficient than a nonparametric one, which makes no assumptions regarding the underlying distribution of the data. Therefore, there are advantages in using methods based on the Poisson distribution or extensions of that method, which incorporate specific features that provide a modelling framework for count data. One challenge with count data is overdispersion, but methods are available that can account for this through the introduction of random effect terms in the modelling, and it is this modelling framework that leads to the negative binomial distribution. These models can also provide clinicians with a clearer and more appropriate interpretation of treatment effects in terms of rate ratios. In this paper, the previously used parametric and non‐parametric approaches are contrasted with those based on Poisson regression and various extensions in trials evaluating solifenacin and mirabegron in patients with overactive bladder. In these applications, negative binomial models are seen to fit the data well. Copyright © 2014 John Wiley & Sons, Ltd.

18.

Variance function in regression analysis of longitudinal data using the generalized estimating equation approach

《Journal of Statistical Computation and Simulation》2012,82(12):2700-2709

Longitudinal or clustered response data arise in many applications such as biostatistics, epidemiology and environmental studies. The repeated responses cannot in general be assumed to be independent. One method of analysing such data is by using the generalized estimating equations (GEE) approach. The current GEE method for estimating regression effects in longitudinal data focuses on the modelling of the working correlation matrix assuming a known variance function. However, correct choice of the correlation structure may not necessarily improve estimation efficiency for the regression parameters if the variance function is misspecified [Wang YG, Lin X. Effects of variance-function misspecification in analysis of longitudinal data. Biometrics. 2005;61:413–421]. In this connection two problems arise: finding a correct variance function and estimating the parameters of the chosen variance function. In this paper, we study the problem of estimating the parameters of the variance function assuming that the form of the variance function is known and then the effect of a misspecified variance function on the estimates of the regression parameters. We propose a GEE approach to estimate the parameters of the variance function. This estimation approach borrows the idea of Davidian and Carroll [Variance function estimation. J Amer Statist Assoc. 1987;82:1079–1091] by solving a nonlinear regression problem where residuals are regarded as the responses and the variance function is regarded as the regression function. A limited simulation study shows that the proposed method performs at least as well as the modified pseudo-likelihood approach developed by Wang and Zhao [A modified pseudolikelihood approach for analysis of longitudinal data. Biometrics. 2007;63:681–689]. Both these methods perform better than the GEE approach.

19.

Laplace based approximate posterior inference for differential equation models

Sarat C. Dass Jaeyong Lee Kyoungjae Lee Jonghun Park 《Statistics and Computing》2017,27(3):679-698

Ordinary differential equations are arguably the most popular and useful mathematical tool for describing physical and biological processes in the real world. Often, these physical and biological processes are observed with errors, in which case the most natural way to model such data is via regression where the mean function is defined by an ordinary differential equation believed to provide an understanding of the underlying process. These regression based dynamical models are called differential equation models. Parameter inference from differential equation models poses computational challenges mainly due to the fact that analytic solutions to most differential equations are not available. In this paper, we propose an approximation method for obtaining the posterior distribution of parameters in differential equation models. The approximation is done in two steps. In the first step, the solution of a differential equation is approximated by the general one-step method, which is a class of numerical methods for ordinary differential equations including the Euler and the Runge-Kutta procedures; in the second step, nuisance parameters are marginalized using Laplace approximation. The proposed Laplace approximated posterior gives a computationally fast alternative to the full Bayesian computational scheme (such as Markov Chain Monte Carlo) and produces more accurate and stable estimators than the popular smoothing methods (called collocation methods) based on frequentist procedures. For a theoretical support of the proposed method, we prove that the Laplace approximated posterior converges to the actual posterior under certain conditions and analyze the relation between the order of numerical error and its Laplace approximation. The proposed method is tested on simulated data sets and compared with the other existing methods.
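
The "general one-step method" in the first approximation step covers schemes in which the next state depends only on the current one. A minimal sketch of two such schemes, Euler and classical fourth-order Runge-Kutta, is given below for a scalar equation dx/dt = f(t, x). The test equation and step count are illustrative assumptions, and the Laplace-approximation step of the article is not shown.

```python
import math


def euler_step(f, t, x, h):
    """One Euler step: first-order accurate."""
    return x + h * f(t, x)


def rk4_step(f, t, x, h):
    """One classical Runge-Kutta step: fourth-order accurate."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def solve(step, f, x0, t0, t1, n):
    """Apply a one-step method n times on [t0, t1]; return x(t1)."""
    h = (t1 - t0) / n
    t, x = t0, x0
    for _ in range(n):
        x = step(f, t, x, h)
        t += h
    return x


def decay(t, x):
    """Illustrative test equation dx/dt = -x, with solution exp(-t)."""
    return -x


exact = math.exp(-1.0)
euler_error = abs(solve(euler_step, decay, 1.0, 0.0, 1.0, 100) - exact)
rk4_error = abs(solve(rk4_step, decay, 1.0, 0.0, 1.0, 100) - exact)
```

With the same 100 steps the Runge-Kutta error is several orders of magnitude smaller than the Euler error, which illustrates why the order of the numerical scheme matters for the accuracy of any posterior built on the approximate solution.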

20.

To adjust or not to adjust for baseline when analyzing repeated binary responses? The case of complete data when treatment comparison at study end is of interest

Honghua Jiang Pandurang M. Kulkarni Craig H. Mallinckrodt Linda Shurzinske Geert Molenberghs Ilya Lipkovich 《Pharmaceutical statistics》2015,14(3):262-271

The benefits of adjusting for baseline covariates are not as straightforward with repeated binary responses as with continuous response variables. Therefore, in this study, we compared different methods for analyzing repeated binary data through simulations when the outcome at the study endpoint is of interest. Methods compared included chi‐square, Fisher's exact test, covariate adjusted/unadjusted logistic regression (Adj.logit/Unadj.logit), covariate adjusted/unadjusted generalized estimating equations (Adj.GEE/Unadj.GEE), covariate adjusted/unadjusted generalized linear mixed model (Adj.GLMM/Unadj.GLMM). All these methods preserved the type I error close to the nominal level. Covariate adjusted methods improved power compared with the unadjusted methods because of the increased treatment effect estimates, especially when the correlation between the baseline and outcome was strong, even though there was an apparent increase in standard errors. Results of the chi‐square test were identical to those for the unadjusted logistic regression. Fisher's exact test was the most conservative test regarding the type I error rate and also had the lowest power. Without missing data, there was no gain in using a repeated measures approach over a simple logistic regression at the final time point. Analysis of results from five phase III diabetes trials of the same compound was consistent with the simulation findings. Therefore, covariate adjusted analysis is recommended for repeated binary data when the study endpoint is of interest. Copyright © 2015 John Wiley & Sons, Ltd.