Model Averaging for Prediction With Fragmentary Data |
| |
Authors: | Fang Fang, Wei Lan, Jingjing Tong, Jun Shao |
| |
Affiliation: | 1. School of Statistics, East China Normal University, Shanghai, China (ffang@sfs.ecnu.edu.cn); 2. Center of Statistical Research, Southwestern University of Finance and Economics, Chengdu, Sichuan, China (lanwei@swufe.edu.cn); 3. Department of Statistics, University of Wisconsin-Madison, WI 53706 |
| |
Abstract: | A major challenge in statistical prediction with data from multiple sources is that, for many sampled subjects, not all of the associated covariate data are available. Consequently, new statistical methodology is needed to handle this type of “fragmentary data,” which has become increasingly common in recent years. In this article, we propose a novel method based on frequentist model averaging that fits a set of candidate models using all available covariate data. The weights in model averaging are selected by delete-one cross-validation based on the data from complete cases. The optimality of the selected weights is rigorously proved under some conditions. The finite sample performance of the proposed method is confirmed by simulation studies. An example of personal income prediction based on real data from a leading e-community of wealth management in China is also presented for illustration. |
| |
Keywords: | Asymptotic optimality; Cross-validation; Heteroscedastic errors; Linear regression models; Multiple data sources |
|
|
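
The weight-selection idea summarized in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: missing covariates are coded as NaN, each candidate model is an ordinary least-squares fit on a chosen subset of covariate columns using every subject for whom those columns are observed, and the weights are restricted to the unit simplex and found with a simple Frank-Wolfe iteration. The function names `loo_predictions` and `simplex_weights` are hypothetical, and the candidate models, criterion, and constraints in the article itself may differ.

```python
import numpy as np


def _design(X):
    # Prepend an intercept column to a 2-D covariate block.
    return np.column_stack([np.ones(X.shape[0]), X])


def loo_predictions(X, y, candidates):
    """Delete-one predictions for the complete cases (assumed setup).

    X          : (n, p) array with np.nan marking unavailable covariates.
    candidates : list of column-index lists, one per candidate model.
    Returns P (complete cases x candidates) and the complete-case row indices.
    """
    observed = ~np.isnan(X)
    complete = np.flatnonzero(observed.all(axis=1))
    P = np.empty((complete.size, len(candidates)))
    for k, cols in enumerate(candidates):
        # Every subject with these covariates observed contributes to model k.
        usable = observed[:, cols].all(axis=1)
        for r, i in enumerate(complete):
            keep = usable.copy()
            keep[i] = False  # delete subject i before refitting
            beta, *_ = np.linalg.lstsq(
                _design(X[keep][:, cols]), y[keep], rcond=None
            )
            P[r, k] = (_design(X[[i]][:, cols]) @ beta)[0]
    return P, complete


def simplex_weights(P, y_cc, n_iter=2000):
    """Minimize ||y_cc - P w||^2 over the unit simplex via Frank-Wolfe."""
    K = P.shape[1]
    w = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        grad = 2.0 * P.T @ (P @ w - y_cc)
        d = -w
        d[np.argmin(grad)] += 1.0  # direction toward the best vertex
        Pd = P @ d
        curvature = 2.0 * (Pd @ Pd)
        if curvature < 1e-12:
            break
        # Exact line search for a quadratic objective, clipped to [0, 1].
        step = np.clip(-(grad @ d) / curvature, 0.0, 1.0)
        w = w + step * d
    return w
```

Under these assumptions, calling `loo_predictions(X, y, candidates)` and then `simplex_weights(P, y[complete])` yields nonnegative weights that sum to one. Each update is a convex combination of the current weights and a vertex of the simplex, so no separate projection step is needed.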