Network sampling coverage II: The effect of non-random missing data on network measurement

Institution:	1. University of Nebraska-Lincoln, United States; 2. Duke University, United States; 3. King Abdulaziz University, Jeddah, Saudi Arabia

Abstract:	Missing data is an important, but often ignored, aspect of a network study. Measurement validity is affected by missing data, but the level of bias can be difficult to gauge. Here, we describe the effect of missing data on network measurement across widely different circumstances. In Part I of this study (Smith and Moody, 2013), we explored measurement bias due to randomly missing nodes. Here, we drop the assumption that data are missing at random: what happens to estimates of key network statistics when central nodes are more or less likely to be missing? We answer this question using a wide range of empirical networks and network measures. We find that bias is worse when more central nodes are missing. With respect to network measures, Bonacich centrality is highly sensitive to the loss of central nodes, while closeness centrality is not; distance and bicomponent size are more affected than triad summary measures, and behavioral homophily is more robust than degree homophily. With respect to types of networks, larger, directed networks tend to be more robust, but the relation is weak. We end the paper with a practical application, showing how researchers can use our results (translated into a publicly available Java application) to gauge the bias in their own data.

Keywords:	Missing data; Network sampling; Network bias
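
The comparison described in the abstract (losing central versus peripheral nodes and checking how far a network statistic drifts from its true value) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' procedure or their Java application: the toy graph, the function names, the use of degree as the centrality score, the deterministic "drop the top-k or bottom-k nodes" rule, and mean degree as the statistic of interest.

```python
import random

def mean_degree(adj):
    """Average number of ties per node in an undirected adjacency dict."""
    if not adj:
        return 0.0
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def drop_nodes(adj, missing):
    """Return the subgraph induced by the nodes that were not lost."""
    return {n: nbrs - missing for n, nbrs in adj.items() if n not in missing}

def pick_missing(adj, k, mode, rng):
    """Choose k nodes to lose (hypothetical rule: rank nodes by degree)."""
    nodes = sorted(adj)
    if mode == "random":
        return set(rng.sample(nodes, k))
    # "central" drops the highest-degree nodes, "peripheral" the lowest.
    ranked = sorted(nodes, key=lambda n: (len(adj[n]), n),
                    reverse=(mode == "central"))
    return set(ranked[:k])

def relative_bias(adj, k, mode, rng):
    """(observed - true) / true for mean degree after losing k nodes."""
    true_value = mean_degree(adj)
    observed = mean_degree(drop_nodes(adj, pick_missing(adj, k, mode, rng)))
    return (observed - true_value) / true_value

# Toy network (an assumption): one hub tied to six nodes, plus three side ties.
edges = [(0, i) for i in range(1, 7)] + [(1, 2), (3, 4), (5, 6)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

rng = random.Random(0)  # fixed seed so the random condition is repeatable
for mode in ("central", "peripheral", "random"):
    print(mode, round(relative_bias(adj, 1, mode, rng), 3))
```

In this toy case, losing the hub distorts mean degree far more than losing a low-degree node, which mirrors the direction of the main finding. The study itself varies the correlation between centrality and the chance of being missing across many empirical networks and measures, which this sketch does not attempt.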
This article is indexed in ScienceDirect and other databases.
