We appreciate the effort and thoughtfulness of Raunig’s (2017) attempted critique of Swamy et al. (2015). As we show below, however, it is based on a misunderstanding of the distinction between simultaneous and recursive modeling.
In Section 3 of his comment, Burkhard Raunig opens his argument with a reference to Pearl (2009b) and to Pearl’s treatment of structural models and causal inference in general. But it must be pointed out that in his book on causality, Pearl (2009b) (i) confined his analysis to Markovian (recursive) models and (ii) applied to them the concept of Bayesian subjective probability to answer questions of probabilistic causation. Related treatments of Bayesian and other types of probabilistic causation are those of Skyrms (1988) and, most importantly for us, Basmann (1988), who dealt with simultaneous equations models.
Pearl’s (2009a, pp. 173–82) Bayesian subjective views implied that “if something is real then it cannot be causal, because causality is a mental construct that is not well defined”. By contrast, Basmann (1988, p. 73) found that causality strictly refers to a property of the real world and that causal relations and orderings are unique in the real world, and, since they are unique, they remain invariant under mere changes in the language (including algebraic symbols) used to describe them. Raunig appears to follow Pearl, whereas Swamy et al. (2015) strictly follow Basmann. We present our rebuttal as four comments, designated as (R1) to (R4).
(R1) The core of Raunig’s (2017) thesis is based on his Equation (11), presented as a structural model, for which he asserts that it “encodes the causal assumption that changing or manipulating x causes y to vary. The strength of this effect is β.” We now disprove this statement and, therefore, the assertion that his Equation (11) is a structural model.
Disproof. Equation (11) is a reformulation of Equation (2) which is
where the implicit assumption is that
=
is the error term with mean zero, and
and
are the observed dependent variable and regressor, respectively. For Raunig, Equation (1) above is the true model.
Raunig would be correct in this assessment if (i) Equation (1) above were free of misspecifications and (ii) its coefficients and error term were unique. But do these conditions hold? To satisfy condition (i), let us assume that the linear functional form of Equation (1) above is correct and there is no omitted relevant regressor other than
. To check whether condition (ii) is satisfied, let us do what Pratt and Schlaifer (1984, p. 13) (hereafter PS) did in their paper. They added and subtracted the product of the coefficient (
) of the omitted regressor (
) and the included regressor (
) on the right-hand side of Equation (1) above. Doing so gives
This equation is the same as (1) above and yet
has two different coefficients,
and
, while the error term has two different forms,
and
. Since we cannot prove that the coefficients and the error term of Equation (2) above are inadmissible, the fact that (1) above can be rewritten as (2) above establishes that the coefficients, the error term, and the omitted regressor of (1) above are not unique; hence (1) above is a false model with non-unique coefficients and error term. In light of Basmann’s (1988) insight, models such as (1) above cannot encode causal information, and they surely cannot be structural models.
Q.E.D.
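The add-and-subtract step used in the disproof can be written out as a short worked equation. The symbols in the block are notation assumed here for illustration only (y_t for the dependent variable, x_{1t} for the included regressor, x_{2t} for the omitted regressor, and beta_0, beta_1, beta_2 for the coefficients); it is a sketch of the PS device, not a transcription of Raunig’s or PS’s displays.

```latex
\begin{align*}
y_t &= \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} \\
    &= \beta_0 + \beta_1 x_{1t} + \beta_2 x_{1t} - \beta_2 x_{1t} + \beta_2 x_{2t} \\
    &= \beta_0 + (\beta_1 + \beta_2)\, x_{1t} + \beta_2 \,(x_{2t} - x_{1t})
\end{align*}
```

Under these assumed symbols, the same relation gives the included regressor either the coefficient beta_1 with error term beta_2 x_{2t}, or the coefficient beta_1 + beta_2 with error term beta_2 (x_{2t} - x_{1t}), which is the non-uniqueness the argument relies on.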
(R2) In their JASA paper, Pratt and Schlaifer (1984, p. 13) defined any linear equation with unique coefficients and error term to be “a linear stochastic law” and showed further that because neither the coefficients (
,
,
) in (1) above nor its omitted regressor (
) is unique, the relation in (1) above cannot be considered “a linear stochastic law,” in contradiction to Raunig’s assertion that his Equation (11) “encodes the causal assumption” that changing or manipulating x causes y to vary.
Swamy et al. (2015) use what is essentially Raunig’s Equation (3),
=
+
, and his Equation (2) to obtain
=
+
+ (
+
)
. It can be shown that this equation has unique coefficients and error term and can be called “a stochastic law,” capable of encoding the causal assumption.
Pratt and Schlaifer (1984, p. 13) treated
as the random error term, as do Swamy et al. (2015), who, however, do not assume that this error term has mean zero, in contrast to Raunig, who makes the potentially false assumption that
is the error term with mean zero.
Swamy et al. (2015) also do not assume that the coefficients of (1) above are constant. Raunig’s assumption of the invariance of
is very strong because it is a non-unique coefficient, as is
in (1) and (2) above. However, non-unique coefficients cannot be invariant. Raunig’s claims that “the effect of a unit change in x on y is
, regardless of the values taken by the other variables in the model” and “Whether or not x is correlated with
plays no role” are false because his Equation (11) does not describe a causal mechanism. As in (1) and (2) above, we do not know whether the effect of a unit change in x on y is
,
or some other number. The quantity
is not unique.
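The substitution described in (R2) can be sketched as a short worked equation. The notation is assumed here for illustration and is not taken from the original displays: x_{1t} is the included regressor, x_{2t} the omitted regressor, and the auxiliary relation writes x_{2t} as a random remainder lambda_{0t} plus lambda_1 times x_{1t}.

```latex
\begin{align*}
y_t    &= \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t}, \\
x_{2t} &= \lambda_{0t} + \lambda_1 x_{1t}, \\
\Rightarrow\quad
y_t    &= \beta_0 + \beta_2 \lambda_{0t} + (\beta_1 + \beta_2 \lambda_1)\, x_{1t}.
\end{align*}
```

In this sketch, beta_2 lambda_{0t} plays the role of the error term, which need not have mean zero, and beta_1 + beta_2 lambda_1 is the coefficient on the included regressor.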
(R3) Pratt and Schlaifer (1984, p. 14) proved that although the included regressors cannot be uncorrelated with every omitted relevant regressor that affects
, they can be uncorrelated with the remainder of every such variable. Let us explain this sentence. The variable
is the included regressor and
is the omitted regressor in Raunig’s Equation (13). What PS are saying is that
cannot be uncorrelated with
. Raunig also writes that
is correlated with the error term
=
. Yet Raunig and PS proceed differently from here. On the one hand, Raunig sets
=
to obtain his Equation (14); on the other hand, Pratt and Schlaifer (1984, p. 13) show that in the regression
=
+
+ (
+
)
with unique coefficient (
+
) and unique error term (
), the included regressor
can be uncorrelated with the remainder
. In other words, PS used
as the error term with mean 0. Since
can be uncorrelated with
, assuming that
is uncorrelated with
gives the result that the least squares estimator of the coefficient (
+
) of
is consistent. Raunig’s result differs from PS’s result if
. PS’s assumption that
=
+
is much more reasonable than Raunig’s assumption
=
. In light of the logic underlying the argument of PS, Raunig’s assumption that
=
is questionable and suggests that
is a constant proportion of
, which is an impossibility. Raunig makes the further strong assumption that
=
z to give an instrumental variable interpretation to his estimator (15). This is just an assumption and not a proof of the existence of z.
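The consistency claim in (R3) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors’ or PS’s procedure: the function names, the parameter values, and the normal distributions are all hypothetical, and the omitted regressor is generated as a remainder (with nonzero mean, independent of the included regressor) plus a multiple lam1 of the included regressor.

```python
import random


def ols_slope(x, y):
    """Least squares slope from regressing y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=50_000, beta0=1.0, beta1=2.0, beta2=0.5, lam1=3.0, seed=0):
    """Return (estimated slope, beta1 + beta2 * lam1) for one simulated sample.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Included regressor.
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Remainder of the omitted regressor: independent of x1, mean not zero.
    remainder = [rng.gauss(4.0, 1.0) for _ in range(n)]
    # Omitted regressor = remainder + lam1 * included regressor.
    x2 = [r + lam1 * a for r, a in zip(remainder, x1)]
    # Outcome depends on both regressors; x2 is left out of the regression.
    y = [beta0 + beta1 * a + beta2 * b for a, b in zip(x1, x2)]
    return ols_slope(x1, y), beta1 + beta2 * lam1


if __name__ == "__main__":
    slope, target = simulate()
    print(f"estimated slope: {slope:.4f}, beta1 + beta2*lam1: {target:.4f}")
```

In this setup the estimated slope should settle near beta1 + beta2 * lam1 rather than near beta1 alone, because the included regressor is uncorrelated with the remainder but not with the omitted regressor itself.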
(R4) The so-called instrumental variable estimator in Raunig’s Equation (17) produces
which is a non-unique coefficient of the false model with a non-unique error term in Raunig’s Equation (2). This proves that z in Raunig’s Equation (17) is not a valid instrument.
Skyrms (1988, p. 59) proved that spurious correlations implied by Raunig’s Equation (2) disappear when we control for the confounding variable
by controlling the included variable
via Raunig’s Equation (3). Raunig writes that “Equation (3) in Section 2 is thus not consistent with the underlying structural model.” We have proved in (1) and (2) above that Raunig’s Equation (11) is not a structural model.
Pratt and Schlaifer (1984, p. 14) disproved Raunig’s statement, “Varying
does not affect
,” by proving that the included regressor
cannot be uncorrelated with the omitted regressor
that affects
. We have shown above that the error term of Raunig’s Equation (11) is non-unique. But then, how can a valid instrument be uncorrelated with a non-unique (arbitrary) error term? Thus, Raunig has not proved the existence of a valid instrument.
To summarize, using the model presented by Raunig, we have confirmed the non-existence of instrumental variables. Specifically, we have analyzed four aspects of Raunig’s true model and demonstrated that in each aspect Raunig’s true, or structural, model is neither structural nor true. Under R1, Raunig’s structural or true model has non-unique coefficients and error term, violating Basmann’s definition of causality. Under R2, Raunig’s true model does not conform to PS’s definition of a stochastic law and therefore cannot be causal. Under R3, Raunig’s assumption of proportionality between an omitted regressor and the included regressor is overly restrictive; PS provided a reasonable example of the relationship between an omitted regressor and the included regressor. Finally, under R4, Raunig’s instrumental variable is assumed to be proportional to the included regressor, and an instrumental variable estimator based on this instrument produces a non-unique coefficient of a false model with a non-unique error term.