A Probabilistic Perspective on Re-Identifiability
Authors: Matthijs Koot, Michel Mandjes, Guido van 't Noordende, Cees de Laat
Affiliation: Informatics Institute, University of Amsterdam, The Netherlands (koot@uva.nl); Korteweg-de Vries Institute for Mathematics, Amsterdam; Eurandom, Eindhoven University of Technology, Eindhoven; Centrum Wiskunde & Informatica (CWI), Amsterdam, The Netherlands
Abstract: A quasi-identifier is a set of attributes that can be used to re-identify entries in anonymized data sets. A group of individuals is considered about whom quasi-identifying numerical information, such as date of birth, age, weight, and height, is disclosed. The fraction of individuals whose information is unique in that group, and who are hence unambiguously identifiable, is determined. Nonuniformity can be captured well by a single number, the Kullback-Leibler distance. For example sets of real microdata, the given approximations based on Kullback-Leibler distances are accurate. The effect of disclosing more specific or less specific information is analyzed experimentally, and the effect of correlation between numerical attributes is measured. A formula gives the re-identifiability level. The approximations are validated using publicly available demographic data sets.
Keywords: data anonymity; demographic data; Kullback-Leibler distance; privacy; probability theory; security
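
The two quantities named in the abstract, the fraction of individuals with a unique quasi-identifier value and the Kullback-Leibler distance as a measure of nonuniformity, can be illustrated with a minimal Python sketch. This is not the authors' formula or code: the function names are hypothetical, the distance is taken relative to the uniform distribution as an assumption, and the closed form used for comparison is only the textbook uniform case (each of `k` individuals draws independently from `n` equally likely values).

```python
import math
import random
from collections import Counter


def unique_fraction(values):
    """Fraction of records whose value occurs exactly once in the group."""
    counts = Counter(values)
    return sum(1 for v in values if counts[v] == 1) / len(values)


def kl_from_uniform(probs):
    """Kullback-Leibler distance D(p || uniform), in nats, over len(probs) outcomes.

    Assumption: nonuniformity is measured against the uniform distribution.
    Zero-probability outcomes contribute nothing to the sum.
    """
    n = len(probs)
    return sum(p * math.log(p * n) for p in probs if p > 0)


def expected_unique_fraction_uniform(k, n):
    """Expected unique fraction for k individuals drawn uniformly from n values.

    Textbook uniform case only: a given individual is unique when none of
    the other k - 1 individuals shares the same value.
    """
    return (1 - 1 / n) ** (k - 1)


# Illustrative data (synthetic, not from the paper): 50 people, day of birth
# drawn uniformly from 365 days.
rng = random.Random(0)
group = [rng.randrange(365) for _ in range(50)]
print("observed unique fraction:", unique_fraction(group))
print("uniform-case expectation:", expected_unique_fraction_uniform(50, 365))
```

A distance of zero from `kl_from_uniform` corresponds to a perfectly uniform attribute; larger values indicate that some attribute values are much more common than others, which in general lowers the share of unique individuals.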