首页 | 本学科首页   官方微博 | 高级检索  
     检索      


Methods for missing covariates in logistic regression
Authors:Myunghee Cho Paik
Institution:Division of Biostatistics Columbia University , 10032, New York, 600 W 168th street New York mcp@biostat.columbia.edu
Abstract:Various methods have been suggested in the literature to handle a missing covariate in the presence of surrogate covariates. These methods belong to one of two paradigms. In the imputation paradigm, Pepe and Fleming (1991) and Reilly and Pepe (1995) suggested filling in missing covariates using the empirical distribution of the covariate obtained from the observed data. We can proceed one step further by imputing the missing covariate using nonparametric maximum likelihood estimates (NPMLE) of the density of the covariate. Recently Murphy and Van der Vaart (1998a) showed that such an approach yields a consistent, asymptotically normal, and semiparametric efficient estimate for the logistic regression coefficient. In the weighting paradigm, Zhao and Lipsitz (1992) suggested an estimating function using completely observed records after weighting inversely by the probability of observation. An extension of this weighting approach designed to achieve semiparametric efficient bound is considered by Robins, Hsieh and Newey (RHN) (1995). The two ends of each paradigm (NPMLE and RHN) attain the efficiency bound and are asymptotically equivalent. However, both require a substantial amount of computation. A question arises whether and when, in practical situations, this extensive computation is worthwhile. In this paper we investigate the performance of single and multiple imputation estimates, weighting estimates, semiparametric efficient estimates, and two new imputation estimates. Simulation studies suggest that the sample size should be substantially large (e.g. n=2000) for NPMLE and RHN to be more efficient than simpler imputation estimates. When the sample size is moderately large (n≤ 1500), simpler imputation estimates have as small a variance as semiparametric efficient estimates.
Keywords:Missing covariate  Logistic regression  Imputation  Efficiency  Nonparametric maximum likelihood
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号