Simultaneous edit-imputation and disclosure limitation for business establishment data
Authors: Hang J. Kim, Jerome P. Reiter, Alan F. Karr
Institution: 1. Department of Mathematical Sciences, University of Cincinnati, Cincinnati, OH, USA; 2. Department of Statistical Science, Duke University, Durham, NC, USA; 3. RTI International, Research Triangle Park, NC, USA
Abstract: Business establishment microdata typically are required to satisfy agency-specified edit rules, such as balance equations and linear inequalities. Inevitably some establishments' reported data violate the edit rules. Statistical agencies correct faulty values using a process known as edit-imputation. Business establishment data also must be heavily redacted before being shared with the public; indeed, confidentiality concerns lead many agencies not to share establishment microdata as unrestricted access files. When microdata must be heavily redacted, one approach is to create synthetic data, as done in the U.S. Longitudinal Business Database and the German IAB Establishment Panel. This article presents the first implementation of a fully integrated approach to edit-imputation and data synthesis. We illustrate the approach on data from the U.S. Census of Manufactures and present a variety of evaluations of the utility of the synthetic data. The paper also presents assessments of disclosure risks for several intruder attacks. We find that the synthetic data preserve important distributional features from the post-editing confidential microdata, and have low risks for the various attacks.
Keywords: Confidentiality, measurement error, missing, multiple imputation, synthetic