Alleviating linear ecological bias and optimal design with subsample data |
| |
Authors: | Adam N. Glynn, Jon Wakefield, Mark S. Handcock, Thomas S. Richardson |
| |
Affiliation: | Harvard University, Cambridge, USA; University of Washington, Seattle, USA |
| |
Abstract: | We illustrate that combining ecological data with subsample data, in settings where a linear model is appropriate, provides two main benefits. First, by including the individual-level subsample data, the biases associated with linear ecological inference can be eliminated. Second, available ecological data can be used to design optimal subsampling schemes that maximize information about parameters. We present an application of this methodology to the classic problem of estimating the effect of a college degree on wages, showing that small, optimally chosen subsamples can be combined with ecological data to generate precise estimates relative to a simple random subsample. |
| |
Keywords: | Combining information; Ecological bias; Returns to education; Sample design; Within-area confounding |