Imputation in Data Fusion of Heterogeneous Data Sets: A Model-Based Numerical Experiment
Authors: Andre Berchtold; Andre Jeannin
Affiliation: 1. Groupe de Recherche sur la Santé des Adolescents, University Hospital Center and University of Lausanne, Lausanne, Switzerland; 2. Institut de Mathématiques Appliquées, University of Lausanne, Lausanne, Switzerland (andre.berchtold@unil.ch); 4. Groupe de Recherche sur la Santé des Adolescents, University Hospital Center and University of Lausanne, Lausanne, Switzerland
Abstract: Given the very large amount of data obtained every day through population surveys, much new research could reuse this information instead of collecting new samples. Unfortunately, relevant data are often scattered across different files obtained through different sampling designs. Data fusion is a set of methods used to combine information from different sources into a single dataset. In this article, we are interested in a specific problem: the fusion of two data files, one of which is quite small. We propose a model-based procedure combining a logistic regression with an Expectation-Maximization algorithm. Results show that, despite the lack of data, this procedure can perform better than standard matching procedures.
Keywords: Binary variable; Data fusion; Data structure; Expectation-Maximization algorithm; Logistic regression; Matching