1.

Different methods for handling incomplete longitudinal binary outcome due to missing at random dropout

《Statistical Methodology》2015

This paper compares the performance of weighted generalized estimating equations (WGEEs), multiple imputation based on generalized estimating equations (MI-GEEs) and generalized linear mixed models (GLMMs) for analyzing incomplete longitudinal binary data when the underlying study is subject to dropout. The paper aims to explore the performance of the above methods in terms of handling dropouts that are missing at random (MAR). The methods are compared on simulated data. The longitudinal binary data are generated from a logistic regression model, under different sample sizes. The incomplete data are created for three different dropout rates. The methods are evaluated in terms of bias, precision and mean square error in the case where the data are subject to MAR dropout. In conclusion, across the simulations performed, the MI-GEE method performed better in both small and large sample sizes. Evidently, this should not be seen as formal and definitive proof, but adds to the body of knowledge about the methods’ relative performance. In addition, the methods are compared using data from a randomized clinical trial.
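The simulation design described above (binary outcomes from a logistic model, then MAR dropout) can be sketched as follows. This is a minimal illustration, not the paper's actual design: the random-intercept structure, the number of visits and every parameter value (`beta0`, `beta_time`, `sigma_b`, `gamma0`, `gamma_prev`) are assumptions chosen for the example. Dropout is MAR here because the dropout probability depends only on the outcome at the previous visit, which is always observed at the time the dropout decision is made.

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_subject(rng, n_visits=4, beta0=-0.5, beta_time=0.3,
                     sigma_b=1.0, gamma0=-2.0, gamma_prev=1.5):
    """Return (full, observed) outcome lists for one subject.

    All parameter values are hypothetical, chosen only for illustration.
    """
    # A subject-level random intercept induces within-subject correlation.
    b = rng.gauss(0.0, sigma_b)
    full = [1 if rng.random() < expit(beta0 + beta_time * t + b) else 0
            for t in range(n_visits)]

    observed = [full[0]]  # the first visit is always observed
    for t in range(1, n_visits):
        # Dropout depends only on the previous, already observed outcome.
        p_drop = expit(gamma0 + gamma_prev * full[t - 1])
        if rng.random() < p_drop:
            # Monotone dropout: this visit and all later ones are missing.
            observed.extend([None] * (n_visits - t))
            break
        observed.append(full[t])
    return full, observed


def simulate_trial(n_subjects, seed=0, **kwargs):
    """Simulate independent subjects with a reproducible seed."""
    rng = random.Random(seed)
    return [simulate_subject(rng, **kwargs) for _ in range(n_subjects)]


def visit_mean(rows, t):
    """Mean outcome at visit t over the subjects observed at that visit."""
    values = [row[t] for row in rows if row[t] is not None]
    return sum(values) / len(values)
```

Comparing `visit_mean` on the observed rows with the same quantity on the full rows gives a rough feel for the bias of an available-case summary; with `gamma_prev > 0`, subjects with a positive previous outcome drop out more often, so the observed mean at later visits would be expected to understate the full-data mean.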

2.

Inference methods for saturated models in longitudinal clinical trials with incomplete binary data

Song JX 《Pharmaceutical statistics》2006,5(4):295-304

In longitudinal studies with a binary response, it is often of interest to estimate the percentage of positive responses at each time point and the percentage of subjects having at least one positive response by each time point. When missing data exist, the conventional method based on observed percentages could result in erroneous estimates. This study demonstrates two methods of using expectation-maximization (EM) and data augmentation (DA) algorithms in the estimation of the marginal and cumulative probabilities for incomplete longitudinal binary response data. Both methods provide unbiased estimates when the missing at random (MAR) assumption holds for the missingness mechanism. Sensitivity analyses have been performed for cases when the MAR assumption is in question.

3.

A multiple imputation method for incomplete correlated ordinal data using multivariate probit models

Xiao Zhang Quanlin Li Karen Cropsey Xiaowei Yang Kui Zhang Thomas Belin 《Communications in Statistics - Simulation and Computation》2017,46(3):2360-2375

The multiple imputation technique has proven to be a useful tool in missing data analysis. We propose a Markov chain Monte Carlo method to conduct multiple imputation for incomplete correlated ordinal data using the multivariate probit model. We conduct a thorough simulation study to compare the performance of our proposed method with two available imputation methods – multivariate normal-based and chain equation methods for various missing data scenarios. For illustration, we present an application using the data from the smoking cessation treatment study for low-income community corrections smokers.

4.

Optimum designs for estimation of regression parameters in a balanced treatment incomplete block design set-up

Ganesh Dutta Premadhis Das 《Journal of statistical planning and inference》2013

The use of covariates in block designs is necessary when the experimental errors cannot be controlled using only the qualitative factors. The choice of values of the covariates for a given set-up attaining minimum variance for estimation of the regression parameters has attracted attention in recent times. In this paper, optimum covariate designs (OCD) have been considered for the set-up of the balanced treatment incomplete block (BTIB) designs, which form an important class of test-control designs. It is seen that the OCDs depend much on the methods of construction of the basic BTIB designs. The series of BTIB designs considered in this paper are mainly those as described by Bechhofer and Tamhane (1981) and Das et al. (2005). Different combinatorial arrangements and tools such as Hadamard matrices and different kinds of products of matrices viz Khatri-Rao product and Kronecker product have been conveniently used to construct OCDs with as many covariates as possible.

5.

Imputation techniques for incomplete data in quadratic discriminant analysis

《Journal of Statistical Computation and Simulation》2012,82(6):863-877

We have compared the efficacy of five imputation algorithms readily available in SAS for the quadratic discriminant function. Here, we have generated several different parametric-configuration training data with missing data, including monotone missing-at-random observations, and used a Monte Carlo simulation to examine the expected probabilities of misclassification for the two-class quadratic statistical discrimination problem under five different imputation methods. Specifically, we have compared the efficacy of the complete observation-only method and the mean substitution, regression, predictive mean matching, propensity score, and Markov Chain Monte Carlo (MCMC) imputation methods. We found that the MCMC and propensity score multiple imputation approaches are, in general, superior to the other imputation methods for the configurations and training-sample sizes we considered.

6.

Skew-mixed effects model for multivariate longitudinal data with categorical outcomes and missingness

S. Eftekhari Mahabadi E. Rahimi Jafari 《Journal of applied statistics》2018,45(12):2182-2201

A longitudinal study commonly follows a set of variables, measured for each individual repeatedly over time, and usually suffers from an incomplete data problem. A common approach for dealing with longitudinal categorical responses is to use the Generalized Linear Mixed Model (GLMM). This model induces the potential relation between response variables over time via a vector of random effects, assumed to be shared parameters in the non-ignorable missing mechanism. Most GLMMs assume that the random-effects parameters follow a normal or symmetric distribution and this leads to serious problems in real applications. In this paper, we propose GLMMs for the analysis of incomplete multivariate longitudinal categorical responses with a non-ignorable missing mechanism based on a shared parameter framework with the less restrictive assumption of skew-normality for the random effects. These models may contain incomplete data with monotone and non-monotone missing patterns. The performance of the model is evaluated using simulation studies, and a well-known longitudinal data set extracted from a fluvoxamine trial is analyzed to determine the profile of fluvoxamine in ambulatory clinical psychiatric practice.

7.

Examining the robustness of fully synthetic data techniques for data with binary variables

《Journal of Statistical Computation and Simulation》2012,82(6):609-624

There is a growing demand for public use data while at the same time there are increasing concerns about the privacy of personal information. One proposed method for accomplishing both goals is to release data sets that do not contain real values but yield the same inferences as the actual data. The idea is to view confidential data as missing and use multiple imputation techniques to create synthetic data sets. In this article, we compare techniques for creating synthetic data sets in simple scenarios with a binary variable.

8.

Estimation equations for multivariate linear models with Kronecker structured covariance matrices

Anna Szczepańska-Álvarez Chengcheng Hao Yuli Liang Dietrich von Rosen 《Communications in Statistics - Theory and Methods》2017,46(16):7902-7915

9.

Impact of the non-distinctness and non-ignorability on the inference by multiple imputation in multivariate multilevel data: a simulation assessment

Recai Yucel 《Journal of Statistical Computation and Simulation》2017,87(9):1813-1826

Multiple imputation (MI) is an increasingly popular method for analysing incomplete multivariate data sets. One of the most crucial assumptions of this method relates to the mechanism leading to missing data. Distinctness is typically assumed, which indicates a complete independence of the mechanisms underlying missingness and data generation. In addition, missing at random or missing completely at random is assumed, which explicitly states under which conditions missingness is independent of observed data. Despite the common use of MI under these assumptions, the plausibility of these fundamental assumptions and the sensitivity to them have not been well investigated. In this work, we investigate the impact of non-distinctness and non-ignorability. In particular, non-ignorability is due to unobservable cluster-specific effects (e.g. random effects). Through a comprehensive simulation study, we show that MI inferences suggest that non-ignorability due to non-distinctness does not immediately imply dismal performance, while non-ignorability due to missing not at random leads to quite subpar performance.

10.

Comparison of several imputation methods for missing baseline data in propensity scores analysis of binary outcome

Brenda J. Crowe Ilya A. Lipkovich Ouhong Wang 《Pharmaceutical statistics》2010,9(4):269-279

We performed a simulation study comparing the statistical properties of the estimated log odds ratio from propensity scores analyses of a binary response variable, in which missing baseline data had been imputed using a simple imputation scheme (Treatment Mean Imputation), compared with three ways of performing multiple imputation (MI) and with a Complete Case analysis. MI that included treatment (treated/untreated) and outcome (for our analyses, outcome was adverse event [yes/no]) in the imputer's model had the best statistical properties of the imputation schemes we studied. MI is feasible to use in situations where one has just a few outcomes to analyze. We also found that Treatment Mean Imputation performed quite well and is a reasonable alternative to MI in situations where it is not feasible to use MI. Treatment Mean Imputation performed better than MI methods that did not include both the treatment and outcome in the imputer's model. Copyright © 2009 John Wiley & Sons, Ltd.

11.

Markov chain models for multivariate repeated binary data analysis

Wei Tian Stewart J. Anderson 《Communications in Statistics - Simulation and Computation》2013,42(4):1001-1019

Repeated categorical outcomes frequently occur in clinical trials. Muenz and Rubinstein (1985) presented Markov chain models to analyze binary repeated data in a breast cancer study. We extend their method to the setting when more than one repeated outcome variable is of interest. In a randomized clinical trial of breast cancer, we investigate the dependency of toxicities on predictor variables and the relationship among multiple toxic effects.

12.

Testing homogeneity of difference of two proportions for stratified correlated paired binary data

Xi Shen 《Journal of applied statistics》2018,45(8):1410-1425

In ophthalmologic or otolaryngologic studies, each subject may contribute paired organ measurements to the analysis. A number of statistical methods have been proposed for bilateral correlated data. In practice, it is important to detect a confounding effect by treatment interaction, since ignoring a confounding effect may lead to unreliable conclusions. Therefore, stratified data analysis can be considered to adjust for the effect of a confounder on statistical inference. In this article, we investigate and derive three test procedures for testing homogeneity of the difference of two proportions for stratified correlated paired binary data on the basis of the equal correlation model assumption. The performance of the proposed test procedures is examined through Monte Carlo simulation. The simulation results show that the Score test is usually robust in type I error control with high power, and it is therefore recommended among the three methods. One example from an otolaryngologic study is given to illustrate the three test procedures.

13.

Power analysis for cluster randomized trials with binary outcomes modeled by generalized linear mixed-effects models

T. Chen J. Arora I. Katz R. Bossarte H. He 《Journal of applied statistics》2016,43(6):1104-1118

Power analysis for cluster randomized control trials is difficult to perform when a binary response is modeled using the generalized linear mixed-effects model (GLMM). Although methods for clustered binary responses exist such as the generalized estimating equations, they do not apply to the context of GLMM. Also, because popular statistical packages such as R and SAS do not provide correct estimates of parameters for the GLMM for binary responses, Monte Carlo simulation, a popular ad-hoc method for estimating power when the power function is too complex to evaluate analytically or numerically, fails to provide correct power estimates within the current context as well. In this paper, a new approach is developed to estimate power for cluster randomized control trials when a binary response is modeled by the GLMM. The approach is easy to implement and seems to work quite well, as assessed by simulation studies. The approach is illustrated with a real intervention study to reduce suicide reattempt rates among US Veterans.

14.

Joint generalized estimating equations for multivariate longitudinal binary outcomes with missing data: an application to acquired immune deficiency syndrome data

Stuart R. Lipsitz Garrett M. Fitzmaurice Joseph G. Ibrahim Debajyoti Sinha Michael Parzen Steven Lipshultz 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2009,172(1):3-20

Summary. In a large, prospective longitudinal study designed to monitor cardiac abnormalities in children born to women who are infected with the human immunodeficiency virus, instead of a single outcome variable, there are multiple binary outcomes (e.g. abnormal heart rate, abnormal blood pressure and abnormal heart wall thickness) considered as joint measures of heart function over time. In the presence of missing responses at some time points, longitudinal marginal models for these multiple outcomes can be estimated by using generalized estimating equations (GEEs), and consistent estimates can be obtained under the assumption of a missingness completely at random mechanism. When the missing data mechanism is missingness at random, i.e. the probability of missing a particular outcome at a time point depends on observed values of that outcome and the remaining outcomes at other time points, we propose joint estimation of the marginal models by using a single modified GEE based on an EM-type algorithm. The method proposed is motivated by the longitudinal study of cardiac abnormalities in children who were born to women infected with the human immunodeficiency virus, and analyses of these data are presented to illustrate the application of the method. Further, in an asymptotic study of bias, we show that, under a missingness at random mechanism in which missingness depends on all observed outcome variables, our joint estimation via the modified GEE produces almost unbiased estimates, provided that the correlation model has been correctly specified, whereas estimates from standard GEEs can lead to substantial bias.

15.

Likelihood-based approach for analysis of longitudinal nominal data using marginalized random effects models

Keunbaik Lee Sanggil Kang Xuefeng Liu Daekwan Seo 《Journal of applied statistics》2011,38(8):1577-1590

Likelihood-based marginalized models using random effects have become popular for analyzing longitudinal categorical data. These models permit direct interpretation of marginal mean parameters and characterize the serial dependence of longitudinal outcomes using random effects [12,22]. In this paper, we propose a model that expands the use of previous models to accommodate longitudinal nominal data. Random effects using a new covariance matrix with a Kronecker product composition are used to explain serial and categorical dependence. The Quasi-Newton algorithm is developed for estimation. These proposed methods are illustrated with a real data set and compared with other standard methods.

16.

Strategies for handling missing data in longitudinal studies with questionnaires

Nazanin Nooraee Geert Molenberghs Johan Ormel 《Journal of Statistical Computation and Simulation》2018,88(17):3415-3436

Missing data methods, maximum likelihood estimation (MLE) and multiple imputation (MI), for longitudinal questionnaire data were investigated via simulation. Predictive mean matching (PMM) was applied at both item and scale levels, logistic regression at item level and multivariate normal imputation at scale level. We investigated a hybrid approach which is a combination of MLE and MI, i.e. scales from the imputed data are eliminated if all underlying items were originally missing. Bias and mean square error (MSE) for parameter estimates were examined. MLE occasionally seemed to provide the best results in terms of bias, but hardly ever in terms of MSE. All imputation methods at the scale level and logistic regression at item level hardly ever showed the best performance. The hybrid approach is similar to or better than its original MI. The PMM-hybrid approach at item level demonstrated the best MSE for most settings and in some cases also the smallest bias.

17.

Resampling method to estimate intra-cluster correlation for clustered binary data

Hrishikesh Chakraborty Pranab K. Sen 《Communications in Statistics - Theory and Methods》2013,42(8):2368-2377

Various methods have been proposed to estimate intra-cluster correlation coefficients (ICCs) for correlated binary data, and many are very sensitive to the type of design and underlying distributional assumptions. We proposed a new method to estimate ICC and its 95% confidence intervals based on resampling principles and U-statistics, where we resampled with replacement pairs of individuals from within and between clusters. We concluded from our simulation study that the resampling-based estimates approximate the population ICC more precisely than the analysis of variance and method of moments techniques for different event rates, varying number of clusters, and cluster sizes.
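For orientation, the analysis of variance estimator mentioned above as a comparator can be sketched in a few lines. This is not the resampling method the abstract proposes, whose details are not given here; it is the standard one-way ANOVA estimator applied to 0/1 outcomes, and the function name `anova_icc` and its list-of-lists input format are assumptions made for the example.

```python
def anova_icc(clusters):
    """One-way ANOVA estimate of the ICC for clustered 0/1 outcomes.

    `clusters` is a list of non-empty lists, one list of 0/1 values per
    cluster (a hypothetical input format chosen for this sketch).
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    total = sum(sizes)
    if k < 2 or min(sizes) < 1 or total <= k:
        raise ValueError("need at least two non-empty clusters and N > k")

    grand_mean = sum(sum(c) for c in clusters) / total
    cluster_means = [sum(c) / len(c) for c in clusters]

    # Between-cluster and within-cluster mean squares.
    msb = sum(n * (m - grand_mean) ** 2
              for n, m in zip(sizes, cluster_means)) / (k - 1)
    msw = sum((y - m) ** 2
              for c, m in zip(clusters, cluster_means)
              for y in c) / (total - k)

    # Adjusted average cluster size, which allows unequal cluster sizes.
    n0 = (total - sum(n * n for n in sizes) / total) / (k - 1)

    denom = msb + (n0 - 1) * msw
    if denom == 0:
        raise ValueError("ICC is undefined when there is no variation")
    return (msb - msw) / denom
```

The estimate can fall below zero when outcomes within clusters agree less often than chance would suggest, which is one reason ANOVA-type estimators can behave poorly for small clusters or rare events.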

18.

Pairwise- and marginal-likelihood estimation for the mixed Rasch model with binary data

《Journal of Statistical Computation and Simulation》2012,82(3):419-430

A marginal–pairwise-likelihood estimation approach is examined in the mixed Rasch model with a binary response and logit link. This method, which belongs to the broad class of composite likelihood methods, provides estimators with desirable asymptotic properties such as consistency and asymptotic normality. We study the performance of the proposed methodology when the random effect distribution is misspecified. A simulation study was conducted to compare this approach with the maximum marginal likelihood. The different results are also illustrated with an analysis of a real data set from a quality-of-life study.

19.

Exact methods of testing the homogeneity of prevalences for correlated binary data

Xiaobin Liu Zhengyu Yang Song Liu Chang-Xing Ma 《Journal of Statistical Computation and Simulation》2017,87(15):3021-3039

Correlated binary data arise in many ophthalmological and otolaryngological clinical trials. To test the homogeneity of prevalences among different groups is an important issue when conducting these trials. The equal correlation coefficients model proposed by Donner in 1989 is a popular model handling correlated binary data. The asymptotic chi-square test works well when the sample size is large. However, it would fail to maintain the type I error rate when the sample size is relatively small. In this paper, we propose several exact methods to deal with small sample scenarios. Their performances are compared with respect to type I error rate and power. The ‘M approach’ and the ‘E + M approach’ seem to outperform the others. A real work example is given to further explain how these approaches work. Finally, the computational efficiency of the exact methods is discussed as a pressing issue of future work.

20.

A comparison of methods for simulating correlated binary variables with specified marginal means and correlations

《Journal of Statistical Computation and Simulation》2012,82(11):2441-2452

Simulation studies employed to study properties of estimators for parameters in population-average models for clustered or longitudinal data require suitable algorithms for data generation. Methods for generating correlated binary data that allow general specifications of the marginal mean and correlation structures are particularly useful. We compare an algorithm based on dichotomizing multi-normal variates to one based on a conditional linear family (CLF) of distributions [Qaqish BF. A family of multivariate binary distributions for simulating correlated binary variables with specified marginal means and correlations. Biometrika. 2003;90:455–463] with respect to range restrictions induced on correlations. Examples include generating longitudinal binary data and generating correlated binary data compatible with specified marginal means and covariance structures for bivariate, overdispersed binomial outcomes. Results show the CLF method gives a wider range of correlations for longitudinal data having autocorrelated within-subject associations, while the multivariate probit method gives a wider range of correlations for clustered data having exchangeable-type correlations. In the case of a decaying-product correlation structure, it is shown that the CLF method achieves the nonparametric limits on the range of correlations, which cannot be surpassed by any method.
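The first of the two algorithms compared above, dichotomizing multi-normal variates, can be illustrated for the bivariate case. This is a rough sketch rather than the procedure used in the article: the function name `correlated_binary_pairs` and its parameters are assumptions, and the correlation is specified on the latent normal scale, so the correlation of the resulting binary variables generally differs from `latent_rho` and would need a separate calibration step to hit a specified target.

```python
import math
import random
from statistics import NormalDist


def correlated_binary_pairs(n, p1, p2, latent_rho, seed=0):
    """Draw n pairs (y1, y2) of 0/1 values with P(y1=1)=p1 and P(y2=1)=p2.

    The pairs come from thresholding a standard bivariate normal with
    correlation `latent_rho`; p1 and p2 must lie strictly between 0 and 1.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Thresholds chosen so that P(z < cutoff) equals the target marginal mean.
    cut1 = std_normal.inv_cdf(p1)
    cut2 = std_normal.inv_cdf(p2)
    scale = math.sqrt(1.0 - latent_rho ** 2)

    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # z2 is standard normal with correlation latent_rho to z1.
        z2 = latent_rho * z1 + scale * rng.gauss(0.0, 1.0)
        pairs.append((int(z1 < cut1), int(z2 < cut2)))
    return pairs
```

Because the marginal means are fixed by the thresholds alone, they are preserved for any latent correlation; the range restrictions discussed in the abstract concern which binary correlations are attainable for given marginal means.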