Addressing the problem of missing data in decision tree modeling

Authors:	Saiedeh Haji-Maghsoudi, Azam Rastegari, Behshid Garrusi

Institution:	1. Dept. of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran; 2. Modeling in Health Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran; 3. Kerman Neuroscience Research Center, Institute of Neuropharmacology, Kerman University of Medical Sciences, Kerman, Iran

Abstract:	Tree-based models (TBMs) can handle missing data using the surrogate approach (SUR). The aim of this study is to compare the performance of statistical imputation with that of SUR in TBMs. A TBM was constructed using empirical data. Thereafter, 10%, 20%, and 40% of the values of the variable that appeared as the first split were deleted and then imputed without and with the outcome variable in the imputation model (IMP- and IMP+, respectively). This was repeated one thousand times. An absolute relative bias above 0.10 was defined as severe (SARB). Subsequently, in a series of simulations, the following parameters were varied: the degree of correlation among variables, the number of variables truly associated with the outcome, and the missing rate. At a 10% missing rate, the proportion of times SARB was observed in either SUR or IMP- was two times higher than in IMP+ (28% versus 13%). When the missing rate was increased to 20%, all these proportions approximately doubled. Irrespective of the missing rate, IMP+ was about 65% less likely to produce SARB than SUR. Results of IMP- and SUR were comparable up to a 20% missing rate. At a high missing rate, IMP- was 76% more likely to provide SARB estimates. Statistical imputation of missing data, with the outcome variable included in the imputation model, is recommended, even in the context of TBMs.
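The evaluation criterion in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: absolute relative bias is taken as |estimate - reference| / |reference|, where the reference is the value obtained from the complete data, and values are deleted completely at random. The function names are hypothetical and are not taken from the study.

```python
import random


def absolute_relative_bias(estimate, reference):
    """Assumed definition: |estimate - reference| / |reference|."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return abs(estimate - reference) / abs(reference)


def sarb_proportion(estimates, reference, threshold=0.10):
    """Share of replicates whose absolute relative bias exceeds the threshold."""
    flags = [absolute_relative_bias(e, reference) > threshold for e in estimates]
    return sum(flags) / len(flags)


def delete_completely_at_random(values, missing_rate, rng):
    """Return a copy with a fraction of entries set to None.

    Deleting completely at random is an assumption of this sketch; the
    abstract does not name the missingness mechanism.
    """
    n_missing = round(missing_rate * len(values))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


# Four hypothetical replicate estimates against a complete-data value of 1.0:
# two of them deviate by more than 10%, so the SARB proportion is 0.5.
proportion = sarb_proportion([1.0, 1.05, 1.2, 0.8], reference=1.0)

# Remove 20% of a hypothetical split variable before imputing it.
split_variable = list(range(100))
with_gaps = delete_completely_at_random(split_variable, 0.20, random.Random(0))
```

In a full replication, the deletion step would be followed by either surrogate splitting or imputation, the tree would be refitted, and the resulting estimate would be passed to `sarb_proportion` across the one thousand repetitions.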

Keywords:	Tree, missing, surrogate, imputation, prediction
