Identification and Inference With Many Invalid Instruments |
| |
Authors: | Michal Kolesár, Raj Chetty, John Friedman, Edward Glaeser, Guido W. Imbens |
| |
Affiliation: | 1. Department of Economics and Woodrow Wilson School, Princeton University, Princeton, NJ 08544 (mkolesar@princeton.edu); 2. Department of Economics, Harvard University, Cambridge, MA 02138, and NBER, Cambridge, MA 02138 (chetty@fas.harvard.edu); 3. Kennedy School of Government, Harvard University, Cambridge, MA 02138, and NBER, Cambridge, MA 02138 (john_friedman@harvard.edu); 4. Department of Economics, Harvard University, Cambridge, MA 02138, and NBER, Cambridge, MA 02138 (eglaeser@harvard.edu); 5. Graduate School of Business, Stanford University, Stanford, CA 94305, and NBER, Cambridge, MA 02138 (imbens@stanford.edu) |
| |
Abstract: | We study estimation and inference in settings where the interest is in the effect of a potentially endogenous regressor on some outcome. To address the endogeneity, we exploit the presence of additional variables. Like conventional instrumental variables, these variables are correlated with the endogenous regressor. However, unlike conventional instrumental variables, they also have direct effects on the outcome, and thus are “invalid” instruments. Our novel identifying assumption is that the direct effects of these invalid instruments are uncorrelated with the effects of the instruments on the endogenous regressor. We show that in this case the limited-information-maximum-likelihood (LIML) estimator is no longer consistent, but that a modification of the bias-corrected two-stage-least-squares (TSLS) estimator is consistent. We also show that conventional tests for over-identifying restrictions, adapted to the many-instruments setting, can be used to test for the presence of these direct effects. We recommend that empirical researchers carry out such tests and compare estimates based on LIML and the modified version of bias-corrected TSLS. We illustrate in the context of two applications that such practice can be illuminating, and that our novel identifying assumption has substantive empirical content. |
| |
Keywords: | Instrumental variables; Limited-information-maximum-likelihood; Many instruments; Misspecification; Two-stage-least-squares |
|
|