Abstract: | This paper applies some general concepts in decision theory to a simple instrumental variables model. There are two endogenous variables linked by a single structural equation; k of the exogenous variables are excluded from this structural equation and provide the instrumental variables (IV). The reduced‐form distribution of the endogenous variables conditional on the exogenous variables corresponds to independent draws from a bivariate normal distribution with linear regression functions and a known covariance matrix. A canonical form of the model has parameter vector (ρ, φ, ω), where φ is the parameter of interest and is normalized to be a point on the unit circle. The reduced‐form coefficients on the instrumental variables are split into a scalar parameter ρ and a parameter vector ω, which is normalized to be a point on the (k−1)‐dimensional unit sphere; ρ measures the strength of the association between the endogenous variables and the instrumental variables, and ω is a measure of direction. A prior distribution is introduced for the IV model. The parameters φ, ρ, and ω are treated as independent random variables. The distribution for φ is uniform on the unit circle; the distribution for ω is uniform on the unit sphere with dimension k−1. These choices arise from the solution of a minimax problem. The prior for ρ is left general. It turns out that given any positive value for ρ, the Bayes estimator of φ does not depend on ρ; it equals the maximum‐likelihood estimator. This Bayes estimator has constant risk; because it minimizes average risk with respect to a proper prior, it is minimax. The same general concepts are applied to obtain confidence intervals. The prior distribution is used in two ways. The first way is to integrate out the nuisance parameter ω in the IV model. This gives an integrated likelihood function with two scalar parameters, φ and ρ.
Inverting a likelihood ratio test, based on the integrated likelihood function, provides a confidence interval for φ. This interval lacks finite‐sample optimality, but invariance arguments show that its risk function depends only on ρ and not on φ or ω. The second approach to confidence sets aims for finite‐sample optimality by setting up a loss function that trades off coverage against the length of the interval. The automatic uniform priors are used for φ and ω, but a prior is also needed for the scalar ρ, and no guidance is offered on this choice. The Bayes rule is a highest posterior density set. Invariance arguments show that the risk function depends only on ρ and not on φ or ω. The optimality result combines average risk and maximum risk. The confidence set minimizes the average, taken with respect to the prior distribution for ρ, of the maximum risk, where the maximization is over φ and ω. |