Handling missing data by deleting completely observed records |
| |
Authors: | Myunghee Cho Paik; Cuiling Wang |
| |
Institution: | 1. Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 West 168th Street, New York City, NY 10032, USA; 2. Department of Epidemiology and Population Health, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY 10461, USA |
| |
Abstract: | When data are missing, analyzing only the records that are completely observed may cause bias or inefficiency. Existing approaches to handling missing data include likelihood-based methods, imputation, and inverse probability weighting. In this paper, we propose three estimators inspired by deleting some completely observed data in the regression setting. First, we generate artificial observation indicators that are independent of the outcome given the observed data and draw inferences conditioning on the artificial observation indicators. Second, we propose a closely related weighting method. The proposed weighting method has more stable weights than those of the inverse probability weighting method (Zhao, L., Lipsitz, S., 1992. Designs and analysis of two-stage studies. Statistics in Medicine 11, 769–782). Third, we improve the efficiency of the proposed weighting estimator by subtracting the projection of the estimating function onto the nuisance tangent space. When data are missing completely at random, we show that the proposed estimators have asymptotic variances smaller than or equal to the variance of the estimator obtained from using completely observed records only. Asymptotic relative efficiency computations and simulation studies indicate that the proposed weighting estimators are more efficient than the inverse probability weighting estimators under a wide range of practical situations, especially when the missingness proportion is large. |
| |
Keywords: | Missing data; Deletion method; Inverse probability weighting |
This article is indexed in ScienceDirect and other databases. |
|
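The abstract contrasts complete-record analysis with inverse probability weighting (IPW). The sketch below illustrates only that standard contrast, not the paper's proposed estimators, whose details the abstract does not give. It assumes a simple linear model, a covariate `x` that is observed with a probability depending only on the fully observed outcome `y`, and observation probabilities known by design (as in a two-stage study). The function names, the threshold, and all parameter values are hypothetical choices for illustration.

```python
import numpy as np


def weighted_ols(x, y, w):
    """Solve the weighted least-squares normal equations for y ~ 1 + x."""
    X = np.column_stack([np.ones_like(x), x])
    XtW = X.T * w  # scale each record's contribution by its weight
    return np.linalg.solve(XtW @ X, XtW @ y)


def simulate(n=20000, beta=(1.0, 2.0), seed=0):
    """Simulate a regression in which x is missing depending on y (assumed setup)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = beta[0] + beta[1] * x + rng.normal(size=n)
    # Assumed design: x is observed with probability 0.9 when y > 1, else 0.3.
    pi = np.where(y > 1.0, 0.9, 0.3)
    r = rng.random(n) < pi  # r is True when x is observed
    return x, y, pi, r


x, y, pi, r = simulate()

# Complete-record analysis: unweighted fit on records with x observed.
cc = weighted_ols(x[r], y[r], np.ones(r.sum()))

# IPW: weight each complete record by the inverse of its observation probability.
ipw = weighted_ols(x[r], y[r], 1.0 / pi[r])

print("complete-record estimate (intercept, slope):", cc)
print("IPW estimate (intercept, slope):            ", ipw)
```

Because missingness here depends on the outcome, the unweighted complete-record fit is expected to be biased, while the weighted fit should recover coefficients close to the simulated values (1.0, 2.0). With observation probabilities as small as 0.3 the weights are already uneven, which hints at the weight instability the abstract attributes to IPW when the missingness proportion is large.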