1.

To adjust or not to adjust for baseline when analyzing repeated binary responses? The case of complete data when treatment comparison at study end is of interest

Honghua Jiang Pandurang M. Kulkarni Craig H. Mallinckrodt Linda Shurzinske Geert Molenberghs Ilya Lipkovich 《Pharmaceutical statistics》2015,14(3):262-271

The benefits of adjusting for baseline covariates are not as straightforward with repeated binary responses as with continuous response variables. Therefore, in this study, we compared different methods for analyzing repeated binary data through simulations when the outcome at the study endpoint is of interest. Methods compared included chi‐square, Fisher's exact test, covariate adjusted/unadjusted logistic regression (Adj.logit/Unadj.logit), covariate adjusted/unadjusted generalized estimating equations (Adj.GEE/Unadj.GEE), and covariate adjusted/unadjusted generalized linear mixed model (Adj.GLMM/Unadj.GLMM). All these methods preserved the type I error close to the nominal level. Covariate adjusted methods improved power compared with the unadjusted methods because of the increased treatment effect estimates, especially when the correlation between the baseline and outcome was strong, even though there was an apparent increase in standard errors. Results of the chi‐square test were identical to those for the unadjusted logistic regression. Fisher's exact test was the most conservative test regarding the type I error rate and also had the lowest power. Without missing data, there was no gain in using a repeated measures approach over a simple logistic regression at the final time point. Analysis of results from five phase III diabetes trials of the same compound was consistent with the simulation findings. Therefore, covariate adjusted analysis is recommended for repeated binary data when the study endpoint is of interest. Copyright © 2015 John Wiley & Sons, Ltd.
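One way to see why a chi-square test and an unadjusted logistic regression can agree is to work through a single 2x2 table. The sketch below is illustrative only: the counts are hypothetical, the function names are invented here, and the abstract does not say whether the logistic regression used a score, Wald, or likelihood-ratio test. For a single binary treatment indicator, the score test from the logistic model coincides algebraically with the Pearson chi-square (without continuity correction), while the Wald test is usually close but not equal.

```python
import math

def pearson_chi_square(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]]: rows = treatment/control, columns = responder/non-responder."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def logit_score_test(a, b, c, d):
    """Score test of b1 = 0 in logit P(response) = b0 + b1 * treatment."""
    n1, n0 = a + b, c + d            # arm sizes
    n = n1 + n0
    p_null = (a + c) / n             # pooled response rate under the null
    score = a - n1 * p_null          # observed minus expected responders on treatment
    variance = p_null * (1 - p_null) * n1 * n0 / n
    return score ** 2 / variance

def logit_wald_test(a, b, c, d):
    """Wald test using the closed-form MLE (log odds ratio) and its standard error.
    Assumes all four cells are non-zero."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (log_or / se) ** 2

# Hypothetical counts, for illustration only
a, b, c, d = 60, 40, 45, 55
chi2 = pearson_chi_square(a, b, c, d)
score = logit_score_test(a, b, c, d)
wald = logit_wald_test(a, b, c, d)
print(f"Pearson: {chi2:.4f}  score: {score:.4f}  Wald: {wald:.4f}")
```

The covariate-adjusted analyses discussed in the abstract have no such closed form and would need an iterative fit.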

2.

Marginalized transition random effect models for multivariate longitudinal binary data

Ozlem Ilk Michael J. Daniels 《Revue canadienne de statistique》2007,35(1):105-123

Generalized linear models with random effects and/or serial dependence are commonly used to analyze longitudinal data. However, the computation and interpretation of marginal covariate effects can be difficult. This led Heagerty (1999, 2002) to propose models for longitudinal binary data in which a logistic regression is first used to explain the average marginal response. The model is then completed by introducing a conditional regression that allows for the longitudinal, within‐subject, dependence, either via random effects or by regressing on previous responses. In this paper, the authors extend the work of Heagerty to handle multivariate longitudinal binary response data using a triple of regression models that directly model the marginal mean response while taking into account dependence across time and across responses. Markov chain Monte Carlo methods are used for inference. Data from the Iowa Youth and Families Project are used to illustrate the methods.

3.

Conditions for consistent estimation in mixed-effects models for binary matched-pairs data

J.M. Neuhaus J.D. Kalbfleisch W.W. Hauck 《Revue canadienne de statistique》1994,22(1):139-148

Parametric mixed-effects logistic models can provide effective analysis of binary matched-pairs data. Responses are assumed to follow a logistic model within pairs, with an intercept which varies across pairs according to a specified family of probability distributions G. In this paper we give necessary and sufficient conditions for consistent covariate effect estimation and present a geometric view of estimation which shows that when the assumed family of mixture distributions is rich enough, estimates of the effect of the binary covariate are typically consistent. The geometric view also shows that under the conditions for consistent estimation, the mixed-model estimator is identical to the familiar conditional-likelihood estimator for matched pairs. We illustrate the findings with some examples.

4.

Joint analysis of nonlinear heterogeneous longitudinal data and binary outcome: an application to AIDS clinical studies

Xiaosun Lu Rong Zhou 《Journal of applied statistics》2016,43(15):2713-2728

Finite mixture models are currently used to analyze heterogeneous longitudinal data. By relaxing the homogeneity restriction of nonlinear mixed-effects (NLME) models, finite mixture models not only can estimate model parameters but also cluster individuals into one of the pre-specified classes with class membership probabilities. This clustering may have clinical significance, which might be associated with a clinically important binary outcome. This article develops a joint model of a finite mixture of NLME models for longitudinal data in the presence of covariate measurement errors and a logistic regression for a binary outcome, linked by individual latent class indicators, under a Bayesian framework. Simulation studies are conducted to assess the performance of the proposed joint model and a naive two-step model, in which the finite mixture model and the logistic regression are fitted separately, followed by an application to a real data set from an AIDS clinical trial, in which the viral dynamics and dichotomized time to the first decline of CD4/CD8 ratio are analyzed jointly.

5.

Goodness-of-fit statistics for log-link regression models

《Journal of Statistical Computation and Simulation》2012,82(12):2533-2545

The use of log binomial regression, regression on binary outcomes using a log link, is becoming increasingly popular because it provides estimates of relative risk. However, little work has been done on model evaluation. We used simulations to compare the performance of five goodness-of-fit statistics applied to different models in a log binomial setting, namely the Hosmer–Lemeshow, the normalized Pearson chi-square, the normalized unweighted sum of squares, Le Cessie and van Houwelingen's statistic based on smoothed residuals and the Hjort–Hosmer test. The normalized Pearson chi-square was unsuitable as the rejection rate depended also on the range of predicted probabilities. Le Cessie and van Houwelingen's test statistic had poor sampling properties when evaluating a correct model and was also considered to be unsuitable in this context. The performance of the remaining three statistics was comparable in most simulations. However, with real data the Hjort–Hosmer test outperformed the other two statistics.

6.

Optimal Designs for Binary Logistic Regression with a Qualitative Classifier with Independent Levels

Karabi Nandy Sami Helle Antti Liski Erkki Liski 《Communications in Statistics - Simulation and Computation》2013,42(10):1962-1977

Dose response studies arise in many medical applications. Often, such studies are considered within the framework of binary-response experiments such as success-failure. In such cases, popular choices for modeling the probability of response are logistic or probit models. Design optimality has been well studied for the logistic model with a continuous covariate. A natural extension of the logistic model is to consider the presence of a qualitative classifier. In this work, we explore D-, A-, and E-optimal designs in a two-parameter, binary logistic regression model after introducing a binary, qualitative classifier with independent levels.

7.

Information attainable in some randomly incomplete data models

《Journal of statistical planning and inference》2006,136(7):2309-2326

The Fisher information is intricately linked to the asymptotic (first-order) optimality of maximum likelihood estimators for parametric complete-data models. When data are missing completely at random in a multivariate setup, it is shown that information in a single observation is well-defined and it plays the same role as in the complete-data model in characterizing the first-order asymptotic optimality properties of associated maximum likelihood estimators; computational aspects are also thoroughly appraised. As an illustration, the logistic regression model with incomplete binary responses and an incomplete categorical covariate is worked out.

8.

A comparison of two approaches for power and sample size calculations in logistic regression models

Gwowen Shieh 《Communications in Statistics - Simulation and Computation》2013,42(3):763-791

Whittemore (1981) proposed an approach for calculating the sample size needed to test hypotheses with specified significance and power against a given alternative for logistic regression with small response probability. Based on the distribution of the covariate, which could be either discrete or continuous, this approach first provides a simple closed-form approximation to the asymptotic covariance matrix of the maximum likelihood estimates, and then uses it to calculate the sample size needed to test a hypothesis about the parameter. Self et al. (1992) described a general approach for power and sample size calculations within the framework of generalized linear models, which include logistic regression as a special case. Their approach is based on an approximation to the distribution of the likelihood ratio statistic. Unlike the Whittemore approach, their approach is not limited to situations of small response probability. However, it is restricted to models with a finite number of covariate configurations. This study compares these two approaches to see how accurate they would be for the calculations of power and sample size in logistic regression models with various response probabilities and covariate distributions. The results indicate that the Whittemore approach has a slight advantage in achieving the nominal power only for one case with small response probability. It is outperformed for all other cases with larger response probabilities. In general, the approach proposed in Self et al. (1992) is recommended for all values of the response probability. However, its extension for logistic regression models with an infinite number of covariate configurations involves an arbitrary decision for categorization and leads to a discrete approximation. As shown in this paper, the examined discrete approximations appear to be sufficiently accurate for practical purposes.

9.

Goodness-of-fit tests for additive mean residual life model under right censoring

Zhigang Zhang Xingqiu Zhao Liuquan Sun 《Lifetime data analysis》2010,16(3):385-408

The mean residual life (MRL) measures the remaining life expectancy and is useful in actuarial studies, biological experiments and clinical trials. To assess the covariate effect, an additive MRL regression model has been proposed in the literature. In this paper, we focus on the topic of model checking. Specifically, we develop two goodness-of-fit tests to test the additive MRL model assumption. We explore the large sample properties of the test statistics and show that both of them are based on asymptotic Gaussian processes so that resampling approaches can be applied to find the rejection regions. Simulation studies indicate that our methods work reasonably well for sample sizes ranging from 50 to 200. Two empirical data sets are analyzed to illustrate the approaches.

10.

Goodness‐of‐fit methods for matched case‐control studies

Patrick G. Arbogast Danyu Y. Lin 《Revue canadienne de statistique》2004,32(4):373-386

The authors propose graphical and numerical methods for checking the adequacy of the logistic regression model for matched case‐control data. Their approach is based on the cumulative sum of residuals over the covariate or linear predictor. Under the assumed model, the cumulative residual process converges weakly to a centered Gaussian limit whose distribution can be approximated via computer simulation. The observed cumulative residual pattern can then be compared both visually and analytically to a certain number of simulated realizations of the approximate limiting process under the null hypothesis. The proposed techniques allow one to check the functional form of each covariate, the logistic link function as well as the overall model adequacy. The authors assess the performance of the proposed methods through simulation studies and illustrate them using data from a cardiovascular study.

11.

Bayesian estimation of logistic regression with misclassified covariates and response

Brandi N. Falley James D. Stamey A. Alexander Beaujean 《Journal of applied statistics》2018,45(10):1756-1769

Measurement error is a commonly addressed problem in psychometrics and the behavioral sciences, particularly where gold standard data either do not exist or are too expensive. The Bayesian approach can be utilized to adjust for the bias that results from measurement error in tests. Bayesian methods offer other practical advantages for the analysis of epidemiological data including the possibility of incorporating relevant prior scientific information and the ability to make inferences that do not rely on large sample assumptions. In this paper we consider a logistic regression model where both the response and a binary covariate are subject to misclassification. We assume both a continuous measure and a binary diagnostic test are available for the response variable but no gold standard test is assumed available. We consider a fully Bayesian analysis that affords such adjustments, accounting for the sources of error and correcting estimates of the regression parameters. Based on the results from our example and simulations, the models that account for misclassification produce more statistically significant results than the models that ignore misclassification. A real data example on math disorders is considered.

12.

A simple test procedure in standardizing the power of Hosmer–Lemeshow test in large data sets

Xin Lai 《Journal of Statistical Computation and Simulation》2018,88(13):2463-2472

The Hosmer–Lemeshow (H–L) test is a widely used method when assessing the goodness-of-fit of a logistic regression model. However, the H–L test is sensitive to the sample size and to the number of groups used in the test. Caution is needed when interpreting an H–L test with a large sample size. In this paper, we propose a simple test procedure to evaluate the model fit of a logistic regression model with a large sample size, in which a bootstrap method is used and the test result is determined by the power of the H–L test at the target sample size. Simulation studies show that the proposed method can effectively standardize the power of the H–L test under the pre-specified level of type I error. Application to two datasets illustrates the usefulness of the proposed method.
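For readers unfamiliar with the statistic being standardized, the following is a minimal sketch of the usual H–L computation: observations are sorted by fitted probability, split into groups, and observed and expected event counts are compared within each group. The function name and the equal-size grouping rule are assumptions made here for illustration; the bootstrap procedure proposed in the article is not reproduced.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow statistic for binary outcomes y (0/1) and fitted
    probabilities p, using `groups` roughly equal-sized risk groups."""
    # Sort by fitted probability only (stable sort keeps tied rows in input order)
    pairs = sorted(zip(p, y), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # more groups than observations
        n_g = len(chunk)
        observed = sum(y_i for _, y_i in chunk)   # observed events in the group
        expected = sum(p_i for p_i, _ in chunk)   # expected events in the group
        mean_p = expected / n_g
        denom = n_g * mean_p * (1 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat

# Toy data, for illustration only
well_calibrated = hosmer_lemeshow([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5], groups=2)
poorly_calibrated = hosmer_lemeshow([1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5], groups=2)
print(well_calibrated, poorly_calibrated)
```

The statistic is conventionally referred to a chi-square distribution with `groups - 2` degrees of freedom. Because each group's denominator grows with the group size, small departures from perfect calibration can be flagged as significant in very large samples, which is the sensitivity the abstract describes.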

13.

An omnibus lack of fit test in logistic regression with sparse data

Ying Liu Paul I. Nelson Shie-Shien Yang 《Statistical Methods and Applications》2012,21(4):437-452

The usefulness of logistic regression depends to a great extent on the correct specification of the relation between a binary response and characteristics of the unit on which the response is recorded. Currently used methods for testing for misspecification (lack of fit) of a proposed logistic regression model do not perform well when a data set contains almost as many distinct covariate vectors as experimental units, a condition referred to as sparsity. A new algorithm for grouping sparse data to create pseudo replicates and using them to test for lack of fit is developed. A simulation study illustrates settings in which the new test is superior to existing ones. Analysis of a dataset consisting of the ages of menarche of Warsaw girls is also used to compare the new and existing lack of fit tests.

14.

Comparison of Goodness-of-Fit Measures in Probit Regression Model

Berna Yazici Özlem Alpu Yaning Yang 《Communications in Statistics - Simulation and Computation》2013,42(5):1061-1073

This article examines several goodness-of-fit measures in the binary probit regression model. Existing pseudo-R² measures are reviewed, and two modified and one new pseudo-R² measure are proposed. For the probit regression model, empirical comparisons are made for different goodness-of-fit measures with the squared sample correlation coefficient of the observed response and the predicted probabilities. As an illustration, the goodness-of-fit measures are applied to a “paid labor force” data set.
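To make the comparison concrete, the sketch below computes one widely used pseudo-R² (McFadden's, chosen here as an assumption since the abstract does not list the measures reviewed) alongside the squared sample correlation between the observed response and the predicted probabilities. The fitted probabilities are taken as given, so the code applies to a probit or any other binary-response model; the data are made up for illustration.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood; assumes every p is strictly between 0 and 1."""
    return sum(math.log(p_i) if y_i == 1 else math.log(1 - p_i)
               for y_i, p_i in zip(y, p))

def mcfadden_r2(y, p):
    """McFadden's pseudo-R^2: 1 - lnL(model) / lnL(intercept-only model)."""
    y_bar = sum(y) / len(y)
    ll_null = log_likelihood(y, [y_bar] * len(y))
    return 1 - log_likelihood(y, p) / ll_null

def squared_correlation(y, p):
    """Squared sample correlation of observed responses and predicted probabilities.
    Assumes neither y nor p is constant."""
    n = len(y)
    mean_y, mean_p = sum(y) / n, sum(p) / n
    cov = sum((y_i - mean_y) * (p_i - mean_p) for y_i, p_i in zip(y, p))
    var_y = sum((y_i - mean_y) ** 2 for y_i in y)
    var_p = sum((p_i - mean_p) ** 2 for p_i in p)
    return cov ** 2 / (var_y * var_p)

# Made-up outcomes and fitted probabilities, for illustration only
y = [0, 0, 1, 1]
p = [0.2, 0.2, 0.8, 0.8]
r2_mcfadden = mcfadden_r2(y, p)
r2_corr = squared_correlation(y, p)
print(f"McFadden: {r2_mcfadden:.4f}  squared correlation: {r2_corr:.4f}")
```

The two measures can differ substantially on the same fit, which is one reason empirical comparisons of this kind are informative.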

15.

Analyzing Binary Outcome Data with Small Clusters: A Simulation Study

Ying Xu Chun Fan Lee Yin Bun Cheung 《Communications in Statistics - Simulation and Computation》2013,42(7):1771-1782

Binary outcome data with small clusters often arise in medical studies and the size of clusters might be informative of the outcome. The authors conducted a simulation study to examine the performance of a range of statistical methods. The simulation results showed that all methods performed comparably, for the most part, in the estimation of covariate effects. However, the standard logistic regression approach that ignores the clustering encountered an undercoverage problem when the degree of clustering was nontrivial. The performance of the random-effects logistic regression approach tended to be affected by low disease prevalence, relatively small cluster size, or informative cluster size.

16.

Comparing Fits of Latent Trait and Latent Class Models Applied to Sparse Binary Data: An Illustration with Human Resource Management Data

Lilian M. De Menezes Ana Lasaosa 《Journal of applied statistics》2007,34(3):303-319

This paper addresses the problem of comparing the fit of latent class and latent trait models when the indicators are binary and the contingency table is sparse. This problem is common in the analysis of data from large surveys, where many items are associated with an unobservable variable. A study of human resource data illustrates: (1) how the usual goodness-of-fit tests, model selection and cross-validation criteria can be inconclusive; (2) how model selection and evaluation procedures from time series and economic forecasting can be applied to extend residual analysis in this context.

17.

Likelihood methods for missing covariate data in highly stratified studies

Paul J. Rathouz 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2003,65(3):711-723

Summary. The paper considers canonical link generalized linear models with stratum-specific nuisance intercepts and missing covariate data. This family includes the conditional logistic regression model. Existing methods for this problem, each of which uses a conditioning argument to eliminate the nuisance intercept, model either the missing covariate data or the missingness process. The paper compares these methods under a common likelihood framework. The semiparametric efficient estimator is identified, and a new estimator, which reduces dependence on the model for the missing covariate, is proposed. A simulation study compares the methods with respect to efficiency and robustness to model misspecification.

18.

On power and sample size calculations for Wald tests in generalized linear models

《Journal of statistical planning and inference》2005,128(1):43-59

A Wald test-based approach for power and sample size calculations has been presented recently for logistic and Poisson regression models using the asymptotic normal distribution of the maximum likelihood estimator, which is applicable to tests of a single parameter. Unlike the previous procedures involving the use of score and likelihood ratio statistics, there is no simple and direct extension of this approach for tests of more than a single parameter. In this article, we present a method for computing sample size and statistical power employing the discrepancy between the noncentral and central chi-square approximations to the distribution of the Wald statistic with unrestricted and restricted parameter estimates, respectively. The distinguishing features of the proposed approach are the accommodation of tests about multiple parameters, the flexibility of covariate configurations and the generality of overall response levels within the framework of generalized linear models. The general procedure is illustrated with some special situations that have motivated this research. Monte Carlo simulation studies are conducted to assess and compare its accuracy with existing approaches under several model specifications and covariate distributions.
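As a rough textbook-style illustration of the noncentral chi-square idea (not the specific approximation developed in the article), consider a Wald test of $k$ parameters that rejects when the statistic exceeds the central chi-square critical value. The symbols $\theta_0$, $\theta_1$, $V_n$ and $\lambda_n$ are notation assumed here: the null value, the alternative value, the large-sample covariance of the estimator at sample size $n$, and the noncentrality parameter.

```latex
W_n = (\hat\theta - \theta_0)^\top \hat V_n^{-1} (\hat\theta - \theta_0),
\qquad
\text{power} \approx \Pr\{\chi^2_k(\lambda_n) > \chi^2_{k,\,1-\alpha}\},
\qquad
\lambda_n = (\theta_1 - \theta_0)^\top V_n^{-1} (\theta_1 - \theta_0).
```

Under this sketch, the required sample size is the smallest $n$ for which the approximate power reaches the target value.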

19.

A note on goodness-of-fit test of continuation ratio logistic regression models under case–control data

Cheng Peng Biao Zhang 《Journal of statistical planning and inference》2008,138(8):2355-2365

We extend the discussion of Qin and Zhang's [1997. A goodness of fit test for logistic regression models based on case–control data. Biometrika 84, 609–618] goodness-of-fit test of logistic regression under case–control data to continuation ratio logistic regression (CRLR) models. We first show that the retrospective CRLR model, which is valid for case–control data (the null hypothesis $H_0$), is equivalent to an $I$-sample semiparametric model. Then under $H_0$, we find the semiparametric profile empirical likelihood estimators of distributions of the covariate conditional on each response category and use them to define a Kolmogorov–Smirnov type test for assessing the global fit of CRLR models under case–control data. Unlike prospective CRLR models, retrospective CRLR models cannot be partitioned into a series of retrospective binary logistic regression models studied by Qin and Zhang [1997. A goodness of fit test for logistic regression models based on case–control data. Biometrika 84, 609–618].

20.

Hierarchical logistic regression models for imputation of unresolved enumeration status in undercount estimation

Belin TR Diffendal GJ Mack S Rubin DB Schafer JL Zaslavsky AM 《Journal of the American Statistical Association》1993,88(423):1,149-1,166

"In this article we describe a logistic regression modeling approach for nonresponse in the [U.S.] Post-Enumeration Survey (PES) that has desirable theoretical properties and that has performed well in practice.... In the 1990 PES, interviews were not obtained from approximately 1.2% of households in the sample, and approximately 2.1% of the individuals in interviewed households were considered unresolved after follow-up....The missing binary enumeration statuses for these unresolved cases were replaced with probabilities estimated under a statistical model that incorporated covariate information observed for these cases. This article describes an approach to modeling missing binary outcomes when there are a large number of covariates."