1. Introduction
Does an efficient financial system enhance growth? Does uncertainty depress private investment? Do private schools provide better education than public schools? In answering these and many other empirical questions instrumental variables play a central role.
In a recent paper, Swamy et al. (2015) argue that valid instruments cannot exist in the presence of any model misspecification. Such misspecifications include wrong functional form, omission of relevant explanatory variables, and measurement error in explanatory variables. As a consequence, instrumental variables (IV) and generalized method of moments (GMM) estimation would not work.
No doubt, instruments may be hard to find or difficult to justify in empirical applications. The claim that instruments cannot exist appears to be too strong, however. This note discusses three simple examples where valid instruments can exist and where IV methods will work.
This note proceeds as follows. The next section reproduces the derivations in Swamy et al. (2015) for a simple linear model. This example helps to explain the logic of their nonexistence result for instrumental variables.
Section 3 briefly discusses the nature of structural models. The aim is to clarify some misconceptions that bedevil the interpretation of structural models.
Section 4 reexamines the basic arguments of Swamy et al. (2015) in the context of three extremely simple structural models. All three regressions for estimating the structural parameter of interest are misspecified in these cases. In the first two cases one of the two explanatory variables has been omitted. In the third case the single explanatory variable contains measurement error.
The following will be shown. In the first case the existence of a valid instrument depends on the purpose of the analysis. In the second case instruments can exist and will produce consistent estimates of the structural parameter of interest. In the third case instruments can solve the measurement error problem.
2. Non-Existence of Instruments
The central argument in Swamy et al. (2015) is that the explanatory variables that must be instrumented in an empirical model to obtain consistent estimates are at the same time also part of the error term of the model. All potential instruments are therefore necessarily correlated with the error term. The requirement that instruments must be uncorrelated with the error term is always violated. Hence, valid instruments cannot exist.
Swamy et al. (2015) start with a very general theoretical relationship
$$y_t^* = f_t(x_{1t}^*, \ldots, x_{L_t t}^*) \tag{1}$$
with unknown functional form, where a possibly time dependent number $L_t$ of variables $x_{1t}^*, \ldots, x_{L_t t}^*$ determine $y_t^*$. This theoretical relationship is assumed to be exact. Thus there is no need for an error term.
The derivations in Swamy et al. (2015) are now repeated for a simple linear model where $y_t^*$ is exactly determined by two variables $x_{1t}^*$ and $x_{2t}^*$. This model can be written as
$$y_t^* = \alpha_0 + \alpha_1 x_{1t}^* + \alpha_2 x_{2t}^* \tag{2}$$
Equation (2) is the true model.
Let us assume that we can only observe the measurements $y_t = y_t^* + \nu_{0t}$ and $x_{1t} = x_{1t}^* + \nu_{1t}$, where $\nu_{0t}$ and $\nu_{1t}$ are measurement errors. Thus, the model that we can estimate has the correct functional form, but suffers from measurement error and an omitted variable.
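Swamy et al. (2015) argue that the relationship between the unobserved and the observed determinant can always be written as
$$x_{2t}^* = \lambda_{0t} + \lambda_{1t} x_{1t}^* \tag{3}$$
where $\lambda_{0t}$ is the portion of $x_{2t}^*$ that remains after the effect of $x_{1t}^*$ is removed. Substituting (3) into (2) yields
$$y_t^* = \alpha_0 + \alpha_2 \lambda_{0t} + (\alpha_1 + \alpha_2 \lambda_{1t}) x_{1t}^* \tag{4}$$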
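Accounting for the measurement errors in (4) gives
$$y_t = (\alpha_0 + \alpha_2 \lambda_{0t} + \nu_{0t}) + (\alpha_1 + \alpha_2 \lambda_{1t}) \left(1 - \frac{\nu_{1t}}{x_{1t}}\right) x_{1t}$$
This equation includes the bias arising from omitting $x_{2t}^*$, the bias from measurement error in $x_{1t}$, and the measurement error in $y_t$. It is a simple version of the fully general expression (8) in Swamy et al. (2015). This model can be written as
$$y_t = \gamma_{0t} + \gamma_{1t} x_{1t} \tag{5}$$
where the parameters $\gamma_{0t}$ and $\gamma_{1t}$ are time-varying.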
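Matching terms between the measurement-error form of the model and its time-varying coefficient form shows where the time variation comes from. The following is a sketch under assumed notation that is not confirmed by the source: the true model is $y_t^* = \alpha_0 + \alpha_1 x_{1t}^* + \alpha_2 x_{2t}^*$, the omitted variable is written as $x_{2t}^* = \lambda_{0t} + \lambda_{1t} x_{1t}^*$, the observed values are $y_t = y_t^* + \nu_{0t}$ and $x_{1t} = x_{1t}^* + \nu_{1t}$, and $x_{1t} \neq 0$.

```latex
\begin{align*}
y_t &= y_t^* + \nu_{0t} \\
    &= \alpha_0 + \alpha_2\lambda_{0t}
       + (\alpha_1 + \alpha_2\lambda_{1t})\,x_{1t}^* + \nu_{0t} \\
    &= \underbrace{(\alpha_0 + \alpha_2\lambda_{0t} + \nu_{0t})}_{\gamma_{0t}}
     + \underbrace{(\alpha_1 + \alpha_2\lambda_{1t})
       \Bigl(1 - \frac{\nu_{1t}}{x_{1t}}\Bigr)}_{\gamma_{1t}}\,x_{1t}
\end{align*}
```

The last step uses $x_{1t}^* = x_{1t} - \nu_{1t} = (1 - \nu_{1t}/x_{1t})\,x_{1t}$. Under these assumptions both coefficients vary over time only because they absorb the omitted-variable terms and the measurement errors, not because the structural parameters $\alpha_1$ and $\alpha_2$ change.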
Let us now consider two of the cases discussed in Swamy et al. (2015). In the first case the model is linear. Adding and subtracting the constant parameter model $\gamma_0 + \gamma_1 x_{1t}$ on the right-hand side of (5) yields
$$y_t = \gamma_0 + \gamma_1 x_{1t} + (\gamma_{0t} - \gamma_0) + (\gamma_{1t} - \gamma_1) x_{1t} \tag{6}$$
where the last two terms become the error term in the model to be estimated. A valid instrument must be correlated with $x_{1t}$ and uncorrelated with the error term. But $x_{1t}$ is also in the error term of (6). Thus, every instrument for $x_{1t}$ must be correlated with the error. No valid instrument exists in model (6).
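The other case is a simple example of a model with a measurement error in the explanatory variable. This model is
$$y_t = \beta x_t^* \tag{7}$$
where the measured value of the explanatory variable is $x_t = x_t^* + \nu_t$. The model that can be estimated is
$$y_t = \beta x_t + \varepsilon_t \tag{8}$$
where $\varepsilon_t = -\beta \nu_t$ is the error term. Written as a time-varying coefficient model, (8) becomes
$$y_t = \gamma_t x_t \tag{9}$$
where $\gamma_t = \beta (1 - \nu_t / x_t)$. Adding and subtracting a fixed coefficient model where $\gamma_t = \gamma$ yields
$$y_t = \gamma x_t + (\gamma_t - \gamma) x_t \tag{10}$$
and $x_t$ is again also in the error term of Equation (10). No instrument for $x_t$ would work.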
3. Structural Models
Before we reexamine the arguments in Section 2 we need to clarify what structural models are and what kind of assumptions they encode. This note cannot give a full exposition, of course. Pearl (2009a) provides a comprehensive treatment of structural models and causal inference in general. In particular, Chapter 5 in Pearl (2009a) discusses the interpretation of structural parameters and the error term in structural models in great detail. This section draws heavily on Pearl (2009b), which gives an excellent overview of the foundations of causal modeling with structural equations.
Let us consider the simple linear structural equation
$$y = \beta x + u \tag{11}$$
where $y$ depends on $x$ and $u$, a variable that stands for all other factors that affect $y$ when $x$ is held constant. Particular values of $x$ and $u$ assign a particular value to $y$.
Equation (11) encodes the causal assumption that changing or manipulating $x$ causes $y$ to vary. The strength of this effect is $\beta$. Note that the interpretation of $\beta$ does not depend on $u$. The equation says that the effect of a unit change in $x$ on $y$ is $\beta$, regardless of the values taken by the other variables in the model. Whether or not $x$ is correlated with $u$ plays no role. Equation (11) describes a causal mechanism, not statistical associations. The correlation between $x$ and $u$ becomes important, however, when one attempts to estimate the causal parameter $\beta$ from observational data.
Furthermore, the relation between $x$ and $y$ is asymmetric. The equality sign is therefore somewhat misleading. Rewriting (11) as
$$x = (y - u) / \beta \tag{12}$$
would lead to the misinterpretation that $y$ causes $x$. For example, when $y$ is a symptom of a disease $x$, then (12) would imply that the symptom causes the disease. This, of course, makes no sense.
The graph in Figure 1 makes the causal relationship between $x$ and $y$ explicit. The arrow that points from $x$ to $y$ shows the direction of causality between these variables. The solid nodes indicate that $x$ and $y$ are observable variables. The hollow node indicates that the variable $u$ is unobserved. The absence of a link between $x$ and $u$ in this graph indicates that these two variables are assumed to be independent.
Another important assumption in Equation (11) is the invariance of the target parameter $\beta$. This is an identifying assumption. The assumption implies that the causal link between $x$ and $y$ is stable. One could of course assume that $\beta$ changes over time in some way. But this would imply a very different structural model.
Until now nothing was said about estimating Equation (11). The question is whether the causal effect $\beta$ can be estimated from observational data. When $x$ is uncorrelated with $u$, then $\beta$ is identified and consistently estimable from data on $y$ and $x$. When $x$ and $u$ are dependent, however, we need additional information to consistently estimate $\beta$. This information may come from an instrumental variable.
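The role of the correlation between the explanatory variable and the unobserved factors can be illustrated with a small simulation. This is a minimal sketch, not taken from the source: the names `beta`, `u`, `x_indep`, `x_dep`, the sample size, and all numerical values are hypothetical choices for a linear structural equation in which `y` is caused by `x` and by unobserved factors `u`.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


rng = np.random.default_rng(0)
n = 200_000
beta = 2.0  # hypothetical causal effect of x on y (assumption)

# Unobserved factors that affect y when x is held constant.
u = rng.normal(size=n)

# Scenario 1: x is generated independently of u.
x_indep = rng.normal(size=n)
y_indep = beta * x_indep + u

# Scenario 2: x depends on u, so x and u are correlated.
x_dep = 0.8 * u + rng.normal(size=n)
y_dep = beta * x_dep + u

slope_indep = ols_slope(x_indep, y_indep)
slope_dep = ols_slope(x_dep, y_dep)

print("OLS slope, x independent of u:", round(slope_indep, 3))
print("OLS slope, x correlated with u:", round(slope_dep, 3))
```

In both scenarios the causal mechanism is the same and `beta` is unchanged. Only the estimability of `beta` by a regression of `y` on `x` differs.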
4. Non-Existence Revisited: Instruments Do Exist
Let us now turn again to our simple model given by Equation (2) and let us assume that we can only observe $y$ and $x_1$. We cannot observe $x_2$. For simplicity the intercept $\alpha_0$ is set to zero. The time subscript $t$ is superfluous and therefore dropped.
Figure 2 shows two possible structural models. In model (a) $x_1$ affects $y$ directly and indirectly via its effect on $x_2$. In model (b) the variable $x_2$ is a confounding variable that jointly affects $x_1$ and $y$. In both graphs $z$ is an instrumental variable.
The regression model that is actually estimated is
$$y = \alpha_1 x_1 + \varepsilon$$
where the error term is $\varepsilon = \alpha_2 x_2$. Thus, the model is misspecified because $x_2$ is omitted. In both cases $x_1$ is correlated with the error. Let us now compute the ordinary least squares (OLS) and IV estimates for both cases.
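Case (a): Let $\lambda_1$ denote the effect of $x_1$ on $x_2$. Easy computations show that OLS yields
$$\operatorname{plim} \hat{\alpha}_1^{OLS} = \alpha_1 + \alpha_2 \lambda_1$$
This is the total (i.e., direct + indirect) effect of $x_1$ on $y$. The IV estimate for $\alpha_1$ is
$$\operatorname{plim} \hat{\alpha}_1^{IV} = \alpha_1 + \alpha_2 \lambda_1$$
and therefore the same as the OLS estimate. When the goal is to estimate only the direct effect of $x_1$ on $y$, no instrument for $x_1$ works, as argued in Swamy et al. (2015). If one is interested in the total effect, no instrument is needed anyway.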
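A short simulation can make case (a) concrete. This is a sketch under assumed values, not a result from the source: `a1` stands for the direct effect, `a2` for the effect of the omitted variable, `lam` for the effect of the included variable on the omitted one, and the instrument `z` is assumed to affect only the included variable directly.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def iv_slope(z, x, y):
    """Simple IV estimate with one instrument: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


rng = np.random.default_rng(1)
n = 200_000
a1, a2, lam = 1.0, 0.5, 0.8  # hypothetical structural values (assumption)

z = rng.normal(size=n)                # instrument
x1 = z + rng.normal(size=n)           # included variable, caused by z
x2 = lam * x1 + rng.normal(size=n)    # omitted variable, caused by x1
y = a1 * x1 + a2 * x2                 # outcome

ols_a = ols_slope(x1, y)
iv_a = iv_slope(z, x1, y)
total_effect = a1 + a2 * lam          # direct + indirect effect of x1 on y

print("OLS:", round(ols_a, 3))
print("IV: ", round(iv_a, 3))
print("total effect:", total_effect)
```

In this design both estimators should settle near the total effect rather than near `a1`, because the instrument also reaches `y` through the path from `x1` to `x2`.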
Case (b): Now $x_2$ is a confounding variable. As already mentioned, this model is misspecified and Swamy et al. (2015) would conclude that a valid instrument cannot exist. But in fact a valid instrument can exist. OLS will not work but IV estimation will.
OLS yields
$$\operatorname{plim} \hat{\alpha}_1^{OLS} = \alpha_1 + \alpha_2 \delta$$
where the second term is the well known omitted variable bias. The term $\delta$ is the coefficient from a regression of $x_2$ on $x_1$. Note that this auxiliary regression has no causal content. The regression just measures the statistical association between $x_1$ and $x_2$. Some simple algebra shows that IV estimation now yields
$$\operatorname{plim} \hat{\alpha}_1^{IV} = \alpha_1$$
which is the structural parameter that we wanted to estimate.
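The algebra behind the two estimators can be written out in a few lines. The sketch below uses assumed notation that may differ from the source: the structural equation is $y = \alpha_1 x_1 + \alpha_2 x_2$ with $x_2$ omitted from the regression, $z$ is the instrument, and $\operatorname{cov}(z, x_1) \neq 0$.

```latex
\begin{align*}
\operatorname{plim}\hat{\alpha}_1^{\mathrm{OLS}}
  &= \frac{\operatorname{cov}(x_1, y)}{\operatorname{var}(x_1)}
   = \alpha_1 + \alpha_2\,
     \frac{\operatorname{cov}(x_1, x_2)}{\operatorname{var}(x_1)}, \\[4pt]
\operatorname{plim}\hat{\alpha}_1^{\mathrm{IV}}
  &= \frac{\operatorname{cov}(z, y)}{\operatorname{cov}(z, x_1)}
   = \alpha_1 + \alpha_2\,
     \frac{\operatorname{cov}(z, x_2)}{\operatorname{cov}(z, x_1)}.
\end{align*}
```

When the instrument is independent of the omitted confounder, $\operatorname{cov}(z, x_2) = 0$ and the IV limit reduces to $\alpha_1$. When the instrument reaches the omitted variable through the included variable, as in case (a), $\operatorname{cov}(z, x_2) \neq 0$ and the second term does not vanish.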
Why can an instrument work in case (b) but not in case (a)? In case (a) the error term is caused by the included variable. The error term is thus indeed a function of $x_1$. No instrument for $x_1$ will therefore work if one wants to estimate the direct effect of $x_1$ on $y$.
Case (b) is quite different. In Figure 2b the error is not a function of the included explanatory variable. Equation (3) in Section 2 is thus not consistent with the underlying structural model. Varying $x_1$ does not affect $x_2$. Instead the omitted variable $x_2$ in the error term is a cause of the explanatory variable $x_1$. The instrument $z$ is another cause of $x_1$ that is independent of $x_2$. Thus the error and the instrument are uncorrelated. Moreover, the instrument affects the dependent variable $y$ only via $x_1$. IV estimation works with a proper instrument.
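The confounding case can be checked numerically as well. The following is a minimal sketch with hypothetical values (`a1`, `a2`, `c`, the sample size, and the normal distributions are all assumptions, not taken from the source): the omitted variable causes both the included variable and the outcome, and the instrument `z` is drawn independently of the omitted variable.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def iv_slope(z, x, y):
    """Simple IV estimate with one instrument: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


rng = np.random.default_rng(2)
n = 200_000
a1, a2, c = 1.0, 0.5, 0.8  # hypothetical structural values (assumption)

x2 = rng.normal(size=n)                  # unobserved confounder
z = rng.normal(size=n)                   # instrument, independent of x2
x1 = z + c * x2 + rng.normal(size=n)     # included variable
y = a1 * x1 + a2 * x2                    # outcome; x2 is omitted below

ols_b = ols_slope(x1, y)
iv_b = iv_slope(z, x1, y)

# Sample correlation between the instrument and the omitted variable,
# which makes up the regression error (up to the factor a2).
corr_z_error = np.corrcoef(z, x2)[0, 1]

print("OLS:", round(ols_b, 3))
print("IV: ", round(iv_b, 3))
print("corr(z, omitted variable):", round(corr_z_error, 4))
```

Here the regression of `y` on `x1` alone is misspecified, yet the instrument is uncorrelated with the omitted variable by construction, so the IV estimate should be close to `a1` while OLS picks up the omitted variable bias.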
Let us now turn to the second example in Section 2 where $y_t = \beta x_t^*$ and the explanatory variable $x_t^*$ is measured with error. If $y_t = \beta x_t^*$ is the true structural model then the causal effect of $x_t^*$ on $y_t$ is stable. The constant parameter $\beta$ reflects this. Hence, we cannot simply transform this model into a time-varying coefficient model. Such a model states that the causal effect is unstable and changes over time. The transformed model is therefore inconsistent with the true structural model that we want to estimate. The transformed model has very different implications for the causal link between $x_t^*$ and $y_t$. It is a different structural model.
To be consistent with the original structural model one should estimate the model $y_t = \beta x_t + \varepsilon_t$. Here the error is not a function of $x_t$ and instruments can in principle be found. For example, any cause of $x_t^*$ that is unrelated to the measurement error and does not directly cause $y_t$ would be a valid instrument for $x_t$.
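The measurement error case can be sketched in the same way. All names and values below are hypothetical (`beta`, `x_true`, `x_obs`, `v`, and the instrument `z` are assumptions for illustration): the instrument is a cause of the true explanatory variable, is drawn independently of the measurement error, and has no direct effect on the outcome.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def iv_slope(z, x, y):
    """Simple IV estimate with one instrument: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


rng = np.random.default_rng(3)
n = 200_000
beta = 2.0  # hypothetical constant causal effect (assumption)

z = rng.normal(size=n)                 # instrument: a cause of x_true
x_true = z + rng.normal(size=n)        # true explanatory variable
v = rng.normal(size=n)                 # measurement error, independent of z
x_obs = x_true + v                     # what the analyst observes
y = beta * x_true                      # exact structural relationship

ols_me = ols_slope(x_obs, y)
iv_me = iv_slope(z, x_obs, y)

print("OLS on mismeasured x:", round(ols_me, 3))
print("IV with z:           ", round(iv_me, 3))
```

In this design OLS on the mismeasured variable is attenuated toward zero, while the IV estimate should be close to `beta`, since `z` is correlated with `x_obs` only through `x_true`.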
5. Conclusions
The examples presented in this note demonstrate that instruments can exist. The arguments in Swamy et al. (2015) hold when the structural error is indeed a function of the included explanatory variables. But this is rarely the case in a structural model. For instance, omitted confounding variables are not functions of included variables. The three-variable model (b) in Figure 2 provides a simple example. The model is misspecified, but a valid instrument exists since the omitted confounding variable is not a function of the included variable.
Furthermore, Swamy et al. (2015) assume that any model can be expressed as a time-varying coefficient (TVC) model. The true model would then be a special case of this general model.
Of course, a constant parameter model is a special case of a TVC model. A true structural model with constant parameters cannot be turned into an equivalent TVC model, however. A structural model with constant parameters cannot at the same time be expressed as a model where the parameters vary over time. The latter model has very different causal implications.