On the Plausibility of the Latent Ignorability Assumption

The estimation of the causal effect of an endogenous treatment based on an instrumental variable (IV) is often complicated by attrition, sample selection, or non-response in the outcome of interest. To tackle the latter problem, the latent ignorability (LI) assumption imposes that attrition/sample selection is independent of the outcome conditional on the treatment compliance type (i.e. how the treatment behaves as a function of the instrument), the instrument, and possibly further observed covariates. As a word of caution, this note formally discusses the strong behavioral implications of LI in rather standard IV models. We also provide an empirical illustration based on the Job Corps experimental study, in which the sensitivity of the estimated program effect to LI and alternative assumptions about outcome attrition is investigated.


Introduction
A frequently encountered complication when estimating the effect of a potentially endogenous treatment based on instrumental variable (IV) methods is attrition, sample selection, or nonresponse bias in the outcome. To account for this problem, the missing at random (MAR) assumption (e.g. Rubin (1976)) requires outcome attrition to depend only on observable variables. Alternatively, Frangakis and Rubin (1999) propose a latent ignorability (LI) restriction, which assumes attrition to be independent of the outcome conditional on the instrument and the treatment compliance type (i.e. whether one is a complier or non-complier in the notation of Angrist, Imbens, and Rubin (1996)). In the IV framework both assumptions can be combined (e.g. Mealli, Imbens, Ferro, and Biggeri (2004)), imposing independence conditional on the compliance type, the instrument, and further observables.
We argue that LI is nevertheless quite restrictive, as attrition is not allowed to be related to unobservables affecting the outcome in a very general way. Section 2 formally discusses the strong behavioral implications of LI in standard IV models with non-response. This assumption should therefore be cautiously scrutinized in applications. As an example, consider Barnard, Frangakis, Hill, and Rubin (2003), who assess a randomized voucher program for private schooling with noncompliance (where the IV is the randomization and the treatment is private schooling) and attrition in the test score outcomes, because some children did not take the test. Unobservables such as ability or motivation likely affect both test taking and test scores. LI (combined with MAR) requires that conditional on the compliance type (i.e. private schooling as a function of voucher receipt), voucher assignment, and observed covariates, test taking is not related to ability or motivation (and thus, test scores). Among compliers (only in private schooling when randomized in), those taking the test must thus have the same distribution of ability and motivation as those abstaining. However, even within compliers, heterogeneity in ability and motivation may be sufficiently high to selectively affect test taking such that LI fails. Section 3 provides an empirical illustration using the Job Corps experimental study, in which the estimated program effect under LI is compared to alternative assumptions about outcome attrition.

IV models with nonresponse
Assume the following parametric IV model with nonresponse:

$$Y = \alpha_0 + \alpha_1 D + U, \quad D = 1(\beta_0 + \beta_1 Z \geq V), \quad R = 1(\gamma_0 + \gamma_1 D \geq W). \tag{1}$$

Y is the outcome of interest, D is the binary (and potentially endogenous) treatment, R is the response indicator, and U, V, and W are unobserved terms. Note that 1(·) is the indicator function that is equal to one if its argument is satisfied and zero otherwise. Y is only observed if R = 1 and unknown if R = 0, implying nonresponse, sample selection, or attrition.
Z is a randomly assigned instrument affecting D (but not directly Y or R) and assumed to be binary, e.g., the randomization indicator in an experiment.
Angrist, Imbens, and Rubin (1996) define four compliance types, denoted by T, based on how the potential treatment status depends on the instrument: an individual is a complier (defier) if her potential treatment state is one (zero) in the presence and zero (one) in the absence of the instrument, and an always-taker (never-taker) if the potential treatment is always (never) one, independent of the instrument. Assume that $\beta_1$ is positive (a symmetric case could be made for a negative $\beta_1$). Then, an individual is a complier if $\beta_0 + \beta_1 \geq V > \beta_0$, an always-taker if $\beta_0 \geq V$, and a never-taker if $\beta_0 + \beta_1 < V$. Defiers do not exist due to the positive sign of $\beta_1$.
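The thresholds above can be read as a simple classification rule. The following minimal sketch assumes the treatment equation D = 1(β0 + β1 Z ≥ V) with β1 > 0, which is consistent with the inequalities just stated. The function name `compliance_type` and the numerical values are illustrative assumptions and not part of the original analysis.

```python
def compliance_type(v, beta0, beta1):
    """Classify a unit by its treatment unobservable v.

    Assumes D = 1(beta0 + beta1 * Z >= v) with beta1 > 0 (illustrative sketch).
    """
    if beta1 <= 0:
        raise ValueError("this sketch covers the case beta1 > 0 only")
    d1 = beta0 + beta1 >= v  # potential treatment when Z = 1
    d0 = beta0 >= v          # potential treatment when Z = 0
    if d1 and d0:
        return "always-taker"
    if d1 and not d0:
        return "complier"
    if not d1 and not d0:
        return "never-taker"
    return "defier"  # cannot occur when beta1 > 0


# Hypothetical values: beta0 = 0, beta1 = 1
for v in (-0.3, 0.5, 1.7):
    print(v, compliance_type(v, beta0=0.0, beta1=1.0))
```

Because β1 is positive, the last branch is never reached, which mirrors the statement that defiers do not exist in this model.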
We now impose the following latent ignorability (LI) assumption, see Frangakis and Rubin (1999), and critically assess it in the light of our standard IV model with attrition:

Assumption 1 (latent ignorability): $$Y \perp R \mid Z, T,$$

which is equivalent to $Y \perp R \mid Z, D, T$, as Z and T perfectly determine D. Furthermore, we assume that the error term U is continuous, such that Y is continuous. Finally, for the moment we also impose that U = V = W such that the same unobservable (e.g. motivation) affects the outcome (e.g. test score), treatment (e.g. private schooling), and response (e.g. test taking).
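Note that Assumption 1 implies that the distribution of U among compliers (T = c) is the same across response states given the instrument:

$$\begin{aligned} E[f(U) \mid Z=1, T=c, R=1] &= E[f(U) \mid Z=1, T=c, R=0] \\ \Leftrightarrow \; E[f(U) \mid Z=1, \beta_0+\beta_1 \geq V > \beta_0, \gamma_0+\gamma_1 \geq W] &= E[f(U) \mid Z=1, \beta_0+\beta_1 \geq V > \beta_0, \gamma_0+\gamma_1 < W], \end{aligned} \tag{2}$$

where f(·) denotes an arbitrary function with a finite expectation and the second line follows from the parametric model in (1). Obviously, the joint satisfaction of U = V = W and (2) is impossible in this context, as the supports of U conditional on $\gamma_0 + \gamma_1 \geq U$ and on $\gamma_0 + \gamma_1 < U$, respectively, do not overlap. An analogous impossibility result holds for the corresponding condition given Z = 0, $E[f(U) \mid Z=0, T=c, R=1] = E[f(U) \mid Z=0, T=c, R=0]$, which is also implied by Assumption 1.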
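A small simulation may help to see why a common unobservable is incompatible with LI. The sketch below assumes D = 1(β0 + β1 Z ≥ V) and R = 1(γ0 + γ1 D ≥ W) with U = V = W, considers compliers with Z = 1 (so that D = 1), and uses a standard normal U. The distribution, all parameter values, and the function name are illustrative assumptions.

```python
import random


def split_compliers_by_response(n=100_000, beta0=-0.5, beta1=1.0,
                                gamma0=-0.2, gamma1=0.4, seed=0):
    """Return U for responding and non-responding compliers with Z = 1.

    Assumes U = V = W (one common unobservable), as in the illustrative case.
    """
    rng = random.Random(seed)
    responders, nonresponders = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # common unobservable: U = V = W
        if not (beta0 + beta1 >= u > beta0):
            continue  # keep compliers only
        # With Z = 1 compliers have D = 1, so R = 1 iff gamma0 + gamma1 >= W = U
        if gamma0 + gamma1 >= u:
            responders.append(u)
        else:
            nonresponders.append(u)
    return responders, nonresponders


responders, nonresponders = split_compliers_by_response()
# Every responder lies weakly below the response threshold, every non-responder above it
print(max(responders), min(nonresponders))
```

The two groups occupy disjoint ranges of U by construction, so no function f can have equal conditional expectations across response states unless it is constant on the complier range.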
Imposing U = V = W seems too extreme for most applications and was chosen for illustrative purposes. However, even if the unobserved terms in the various equations are not the same but non-negligibly correlated, as commonly assumed in IV models, identification may seem questionable. Suppose, for instance, that $W = \delta_1 V + \epsilon$, where $\epsilon$ is random noise and $\delta_1$ is a coefficient.
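Then, Assumption 1 and the model in (1) imply that

$$E[f(U) \mid Z=1, \beta_0+\beta_1 \geq V > \beta_0, \gamma_0+\gamma_1 \geq \delta_1 V + \epsilon] = E[f(U) \mid Z=1, \beta_0+\beta_1 \geq V > \beta_0, \gamma_0+\gamma_1 < \delta_1 V + \epsilon]. \tag{3}$$

If U is associated with either $\epsilon$, V, or both, the latter equality does not hold in general, but only if the association between U, $\epsilon$, and V is of a very specific form, which raises concerns about Assumption 1.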
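The following sketch illustrates this point numerically for f(U) = U. It assumes D = 1(β0 + β1 Z ≥ V), R = 1(γ0 + γ1 D ≥ W), W = δ1 V + ε, and a hypothetical outcome unobservable U = ρV + sqrt(1 − ρ²)ξ with independent standard normal V and ξ. The correlation parameter `rho`, the distributions, and all numerical values are assumptions made for illustration only.

```python
import math
import random


def complier_response_gap(rho, delta1=1.0, n=200_000, beta0=-0.5, beta1=1.0,
                          gamma0=-0.2, gamma1=0.4, seed=1):
    """Mean of U among non-responding minus responding compliers (Z = 1).

    Assumes W = delta1 * V + eps and U = rho * V + sqrt(1 - rho^2) * xi.
    """
    rng = random.Random(seed)
    total = {True: 0.0, False: 0.0}
    count = {True: 0, False: 0}
    for _ in range(n):
        v = rng.gauss(0.0, 1.0)
        eps = rng.gauss(0.0, 0.5)  # noise in the response equation
        xi = rng.gauss(0.0, 1.0)   # part of U unrelated to V and eps
        if not (beta0 + beta1 >= v > beta0):
            continue  # keep compliers only
        u = rho * v + math.sqrt(1.0 - rho ** 2) * xi
        w = delta1 * v + eps
        responds = gamma0 + gamma1 >= w  # D = 1 for compliers with Z = 1
        total[responds] += u
        count[responds] += 1
    return total[False] / count[False] - total[True] / count[True]


print("rho = 0.0:", complier_response_gap(0.0))  # U unrelated to V and eps
print("rho = 0.8:", complier_response_gap(0.8))  # U correlated with V
```

In this design the gap is close to zero only when U is unrelated to both V and ε. With ρ = 0.8, non-responding compliers have a visibly higher mean of U, so the equality required by LI fails.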
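Finally, we consider an IV model that is more general in terms of functional form assumptions, where Y, D, and R are given by nonparametric functions denoted by $\phi$, $\psi$, and $\eta$, respectively:

$$Y = \phi(D, U), \quad D = 1(\psi(Z, V) \geq 0), \quad R = 1(\eta(D, W) \geq 0). \tag{4}$$

Under this model, Assumption 1 implies that

$$E[f(U) \mid Z=1, \psi(1,V) \geq 0, \psi(0,V) < 0, \eta(1,W) \geq 0] = E[f(U) \mid Z=1, \psi(1,V) \geq 0, \psi(0,V) < 0, \eta(1,W) < 0]. \tag{5}$$

This can be satisfied in special cases, for instance if $U = \pi \cdot 1(\psi(1,V) \geq 0, \psi(0,V) < 0) + \varepsilon$, with $\pi$ denoting the (homogeneous) effect of being a complier and $\varepsilon$ being random noise. Then, (5) simplifies to

$$E[f(\pi + \varepsilon) \mid Z=1, \psi(1,V) \geq 0, \psi(0,V) < 0, \eta(1,W) \geq 0] = E[f(\pi + \varepsilon) \mid Z=1, \psi(1,V) \geq 0, \psi(0,V) < 0, \eta(1,W) < 0],$$

which holds because $\varepsilon$ is independent of W. In general, identification requires that T is a sufficient statistic to control for the endogeneity introduced by conditioning on R. This, however, implies that the association between U, V, and W is quite specific; otherwise, Assumption 1 does not hold.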
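The special case can also be checked by simulation. The sketch below assumes D = 1(ψ(Z, V) ≥ 0) and R = 1(η(D, W) ≥ 0) with hypothetical linear choices for ψ and η, a response unobservable W that is strongly correlated with V, and noise in U that is independent of both. The functional forms, distributions, and parameter values are assumptions for illustration only.

```python
import random


def psi(z, v):
    """Hypothetical treatment index: D = 1(psi(Z, V) >= 0)."""
    return -0.5 + 1.0 * z - v


def eta(d, w):
    """Hypothetical response index: R = 1(eta(D, W) >= 0)."""
    return -0.2 + 0.4 * d - w


def special_case_gap(pi=1.0, n=200_000, seed=2):
    """Mean of U among non-responding minus responding compliers (Z = 1),
    when U = pi * 1(complier) + noise and the noise is independent of V and W."""
    rng = random.Random(seed)
    total = {True: 0.0, False: 0.0}
    count = {True: 0, False: 0}
    for _ in range(n):
        v = rng.gauss(0.0, 1.0)
        w = v + rng.gauss(0.0, 0.5)    # W is correlated with V
        noise = rng.gauss(0.0, 1.0)    # independent of V and W
        is_complier = psi(1, v) >= 0 and psi(0, v) < 0
        if not is_complier:
            continue
        u = pi * 1 + noise             # U depends on (V, W) only through the type
        responds = eta(1, w) >= 0      # D = 1 for compliers with Z = 1
        total[responds] += u
        count[responds] += 1
    return total[False] / count[False] - total[True] / count[True]


print(special_case_gap())  # close to zero up to simulation noise
```

Although V and W are strongly related here, the gap stays near zero because the compliance type absorbs all dependence between U and the response behavior, which is the sufficiency property described above.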

Empirical illustration
As an illustration for treatment evaluation under LI and alternative assumptions about attrition, we consider the experimental evaluation of the U.S. Job Corps program (see for instance Schochet, Burghardt, and Glazerman (2001)).
Reconsidering the IV model of (4), we assume that in each of φ, ψ, and η a vector of observed covariates, denoted by X, may enter as additional explanatory variables. Similar to Frölich and Huber (2014), Section 2.2, we assume that (i) Assumption 1 holds conditional on X (thus combining LI and MAR), (ii) U ⊥ Z | X, T such that the instrument affects the outcome only through the treatment, (iii) T ⊥ Z | X, which is implied by random assignment, (iv) Pr(T = c) > 0 and Pr(T = d) = 0 so that compliers exist and defiers are ruled out, and (v) 0 < Pr(Z = 1 | X) < 1, ensuring common support in the covariates across instrument states. X (measured prior to randomization) includes education, ethnicity, age and its square, school and working status, and receipt of Aid to Families with Dependent Children (AFDC) and food stamps. We compare semiparametric LATE estimation based on the latter assumptions (see Theorem 1 in Frölich and Huber (2014)) with the Wald estimator that ignores sample selection, a MAR-based estimator, and the method of Fricke, Frölich, Huber, and Lechner (2020), which tackles sample selection and treatment endogeneity by two distinct instruments. In the latter approach, which allows for non-ignorable selection related to U in a more general way than LI, we use the number of kids younger than 6 in the household 2.5 years after random assignment as instrument for R. We apply a semiparametric version of the estimator outlined in equation (23) of Fricke, Frölich, Huber, and Lechner (2020) along with the weighting function in their expression (21).
The estimate under LI and MAR (see Theorem 1 in Frölich and Huber (2014)) of 0.12 log points is virtually identical to the Wald estimator, which ignores sample selection bias, and both are statistically significantly different from zero. The MAR-based estimate is one third higher, but the difference is not statistically significant.
The method of Fricke, Frölich, Huber, and Lechner (2020) based on two instruments (2 IVs) yields virtually the same effect as MAR and is statistically significantly different neither from any other estimate nor from zero at any conventional level.
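For readers less familiar with the benchmark, a Wald estimator that ignores sample selection can be sketched as the ratio of the instrument's effect on the outcome to its effect on the treatment, computed here among units with observed outcomes only. Restricting both numerator and denominator to respondents is an assumption of this sketch, as are the function name and the toy records. It is not the implementation used for the results reported above.

```python
def wald_estimate(records):
    """Complete-case Wald estimator.

    records: iterable of (z, d, y, r) with binary z, d, r; y is ignored if r == 0.
    Assumption of this sketch: both stages use respondents (r == 1) only.
    """
    def mean(values):
        return sum(values) / len(values)

    observed = [(z, d, y) for z, d, y, r in records if r == 1]
    y1 = mean([y for z, d, y in observed if z == 1])
    y0 = mean([y for z, d, y in observed if z == 0])
    d1 = mean([d for z, d, y in observed if z == 1])
    d0 = mean([d for z, d, y in observed if z == 0])
    return (y1 - y0) / (d1 - d0)


# Hypothetical toy data: (instrument, treatment, outcome, response)
toy = [
    (1, 1, 2.0, 1), (1, 1, 3.0, 1), (1, 0, 1.0, 1), (1, 1, None, 0),
    (0, 0, 1.0, 1), (0, 0, 1.5, 1), (0, 0, 0.5, 1), (0, 0, None, 0),
]
print(wald_estimate(toy))
```

Units with R = 0 are simply dropped, which is why this estimator is only valid if response is unrelated to the outcome in a suitable sense.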
It seems important to understand the differences in the behavioral assumptions of the estimators. LI + MAR, for instance, assumes that given the covariates and program assignment, unobservables like ability and motivation do not jointly affect employment and wages among compliers. In contrast, the method of Fricke, Frölich, Huber, and Lechner (2020) does not rely on this restriction and allows for more general forms of sample selection, at the cost of also requiring a valid instrument for employment. In our illustration, the results turned out to be rather robust to the different assumptions considered, which need not be the case in other contexts.