Propensity Score Weighting with Mismeasured Covariates: An Application to Two Financial Literacy Interventions

Estimation of the causal effect of a binary treatment on outcomes often requires conditioning on covariates to address selection on observed variables. This is not straightforward when one or more of the covariates are measured with error. Here, we present a new semi-parametric estimator that addresses this issue. In particular, we focus on inverse propensity score weighting estimators when the propensity score is of an unknown functional form and some covariates are subject to classical measurement error. Our proposed solution involves deconvolution kernel estimators of the propensity score and the regression function weighted by a deconvolution kernel density estimator. Simulations and replication of a study examining the impact of two financial literacy interventions on the business practices of entrepreneurs show our estimator to be valuable to empirical researchers.


Introduction
Empirical researchers in economics, finance, management, and other disciplines are often interested in the causal effect of a binary treatment on outcomes. In some cases, randomization is used to ensure comparability across the treatment and control groups. However, researchers must rely on observational data when randomization is not feasible. With observational data, concern over the non-random selection of subjects into the treatment group becomes well-founded. Addressing the possibility of non-random selection places substantial demands on the data at hand. Moreover, even with randomization, demands on the data may be non-trivial since randomization only balances covariates across the treatment and control groups in expectation.
In this paper, we consider the case where adjustment for observed covariates is performed to recover an unbiased estimate of the effect of a treatment. Thus, we are restricting ourselves to the case of selection on observed variables. The econometric and statistics literature on the estimation of causal effects in the case of selection on observed variables has grown tremendously of late (see Imbens and Wooldridge 2009 and Abadie and Cattaneo 2018 for excellent surveys). This has led to the proliferation of statistical methods designed to estimate the causal effect(s) of the treatment, including parametric regression methods, semi- or non-parametric methods based on the propensity score, and combinations thereof.
Despite the growing number of estimation methods, there are only a few that take into account measurement errors in the data. Here, we present a new semi-parametric estimator that partially fills this gap.
In particular, we focus on the case when the propensity score is of an unknown functional form and some covariates are subject to classical measurement error. There are two issues to be dealt with to estimate the treatment effect in such a situation: first, we need to estimate the functional form of the propensity score; second, we need to estimate the moment of a known (or estimated) function of mismeasured covariates. The first issue is solved by using deconvolution kernel regression. For the second issue, as the sample analogue is no longer feasible due to the unobservability of the error-free covariates, we consider integration weighted by a deconvolution kernel density estimator.
We illustrate our estimator both via simulation and by revisiting the randomized controlled trial (RCT) on financial literacy examined in Drexler et al. (2014). In the experiment, micro-entrepreneurs taking out a loan from ADOPEM, a microfinance institution in the Dominican Republic, are randomly assigned to one of three treatment arms to assess the causal effect of financial literacy programs on a firm's financial practices, objective reporting quality, and business performance. The first treatment provided subjects with standard accounting training. The second treatment provided rule-of-thumb training that covered basic financial heuristics. The final group received neither training and serves as the control group. The authors find significant beneficial effects of the rule-of-thumb training, but not the standard accounting training.
We revisit this study for three reasons. First, proper evaluation of financial literacy interventions is critical. As documented in Lusardi and Mitchell (2014), financial literacy in the US and elsewhere seems woefully inadequate for individuals and small business owners to navigate complex financial matters.
McKenzie and Woodruff (2013, pp. 48-49) offer the following vivid description: "Walk into a typical micro or small business in a developing country and spend a few minutes talking with the owner, and it often becomes clear that owners are not implementing many of the business practices that are standard in most small businesses in developed countries.
Formal records are not kept, and household and business finances are combined. Marketing efforts are sporadic and rudimentary. Some inventory sits on shelves for years at a time, whereas more popular items are frequently out of stock. Few owners have financial targets or goals that they regularly monitor and act to achieve." As evidenced in this quote, the lack of financial literacy among micro-entrepreneurs has real consequences. Second, microenterprises are a major source of employment, and developing such enterprises is a key policy concern in most countries, and in particular in developing countries where they employ more than half of the labor force. However, the viability of microenterprises has been found to be heterogeneous, as "a growing literature shows that success cannot be taken for granted" (p. 707). Recent research has focused on sources of this heterogeneity, finding that it is not explained fully by variation in capital (Bruhn et al. 2018). The study by Drexler et al. (2014) addresses this issue by exploring the impact of different types of financial literacy training on firm success.
Finally, our proposed estimator is well-suited to the application. To start, despite training being randomly assigned, the authors control (via regression) for several covariates to increase the precision of the treatment effect estimates. Moreover, one covariate is continuous and potentially suffers from classical measurement error. This covariate reflects the size of the loan received by the entrepreneur. While this variable is unlikely to be mismeasured as it is obtained from bank records, arguably the 'true' covariate of interest is a measure of capital investment in the firm by the entrepreneur. This could be below the official size of the loan due to some funds being diverted to non-business use, or above the official size of the loan due to other funds being used to supplement the loan. As Drexler et al. (2014, p. 2) note, "for microenterprises the boundary between business and personal financial decisions is often blurred." Applying our proposed estimator, we find the results in Drexler et al. (2014) to be generally robust to 'modest' amounts of measurement error. However, for a few outcomes, the magnitude of the estimated treatment effect changes. With greater amounts of measurement error, the results are, not surprisingly, less robust. Typically in such cases we find larger point estimates once measurement error is addressed.
The remainder of the paper is organized as follows. Section 2 provides a brief overview of the literature on measurement error in covariates. Section 3 provides an overview of the potential outcomes framework, discusses identification with and without measurement error in covariates, and presents our proposed estimator. Section 4 contains our application to the assessment of two financial literacy interventions. Section 5 concludes.

Measurement Error in Covariates
A small literature has considered measurement error in an observed covariate when estimating the causal effect of a treatment in the case of selection on observed variables. In a regression context with classical measurement error, it is well known that the Ordinary Least Squares (OLS) estimate of the coefficient on the mismeasured regressor suffers from attenuation bias (see, e.g., Frisch 1934; Koopmans 1937; Reiersøl 1950). However, bias will also impact the estimated treatment effect if treatment assignment is correlated with the true value of the mismeasured covariate (Bound et al. 2001). The sign of this covariance determines the sign of the bias. If the measurement error is correlated with treatment assignment (i.e., it is non-classical), then the direction of the bias depends on whether the partial correlation between the measurement error and treatment assignment is positive or negative (Bound et al. 2001). Finally, if multiple covariates suffer from measurement error, then one is typically unable to sign the bias even under classical measurement error (Bound et al. 2001).
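To make the attenuation result concrete, consider the bivariate regression $Y = \alpha + \beta X + U$ in which only the noisy measure $W = X + \varepsilon$ is observed and the error is classical. The textbook result (see, e.g., Bound et al. 2001) is:

```latex
% OLS on the mismeasured regressor converges to a shrunken coefficient:
\operatorname{plim}\, \hat{\beta}_{OLS}
  = \beta \, \frac{\sigma_X^2}{\sigma_X^2 + \sigma_\varepsilon^2}
  = \lambda \beta, \qquad 0 < \lambda < 1,
```

where $\lambda$ is the reliability ratio: the estimate is biased toward zero, and the bias grows with the measurement error variance $\sigma_\varepsilon^2$.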
With classical measurement error, a consistent estimate of the treatment effect can be recovered using Instrumental Variable (IV) estimation, where the mismeasured covariate(s) are instrumented for using valid exclusion restrictions. However, this solution places further demands on the data as valid instruments must be available. As an aside, it is also important to realize that the estimated treatment effect will still be inconsistent if treatment assignment is correlated with the measurement error (Bound et al. 2001).
Beyond the regression context, several recent papers consider the effect of measurement error in one or more covariates when relying on semi-or non-parametric estimators of the treatment effect. Battistin and Chesher (2014), extending early work in Cochran and Rubin (1973), focus on the bias of treatment effect parameters estimated using semiparametric (propensity score) methods. The bias, which may be positive or negative, is a function of the measurement error variance. The authors consider bias-corrected estimators where the bias is estimated under different assumptions concerning the reliability of the data.
McCaffrey et al. (2013) develop a consistent inverse propensity score weighted estimator for the case when covariates are mismeasured. In particular, the authors consider a weight function of mismeasured covariates whose conditional expectation given the correctly measured covariates equals the error-free inverse propensity score. Their estimator is then constructed by approximating the weight function, projecting the inverse of the estimated propensity score onto a set of basis functions. To estimate the propensity score with mismeasured covariates, knowledge of the measurement error distribution is generally needed. It is worth noting that the measurement error considered in their paper may be non-classical, as only conditional independence between the measurement error and the outcome and treatment, given the correctly measured covariates, is required. As the cost of this extra flexibility, the authors establish only consistency; further characterization of the asymptotic properties is left as a gap to be filled.
Jakubowski (2015) assesses the performance of propensity score matching when an unobserved covariate is proxied by several variables. The author considers two estimation methods. The first is a propensity score matching estimator where the propensity score model includes the proxy variables. The second is also a propensity score matching estimator, except that the propensity score model instead includes an estimate of the unobserved covariate obtained via a factor analysis approach.

Potential Outcomes Framework
Our analysis is couched within the potential outcomes framework (see, e.g., Neyman 1923; Fisher 1935; Roy 1951; Rubin 1974). We consider a random sample of n individuals from a large population, where individuals are indexed by j = 1, ..., n. Define $Y_j(T)$ to be the potential outcome of individual j under treatment T, $T \in \mathcal{T}$. In this paper, we limit ourselves to binary treatments: $\mathcal{T} = \{0, 1\}$. The causal effect of the treatment for a given individual is defined as the individual's potential outcome under treatment (T = 1) relative to the individual's potential outcome under control (T = 0). Formally,

$$\tau_j = Y_j(1) - Y_j(0).$$

In the evaluation literature, several population parameters are of potential interest. Here, attention is given to the average treatment effect (ATE) and the average treatment effect for the treated (ATT),

$$\tau = E[Y(1) - Y(0)], \qquad \tau_{treat} = E[Y(1) - Y(0) \mid T = 1].$$

The ATE is the expected treatment effect of an observation chosen at random from the population, whereas the ATT is the expected treatment effect of an observation chosen at random from the treatment group.
$T_j$ is a binary indicator of the treatment received, $X_j$ is a scalar covariate, and $Z_j$ is a d-dimensional vector of covariates. The covariates included in $X_j$ and $Z_j$ must be pre-determined (i.e., they are not affected by $T_j$) and must not perfectly predict treatment assignment. The observed outcome is

$$Y_j = T_j Y_j(1) + (1 - T_j) Y_j(0),$$

which makes clear that only one potential outcome is observed for any individual. Absent randomization, τ and τ_treat are not identified in general due to the selection problem; that is, the distribution of (Y(0), Y(1)) may depend on T. Even with randomization, the efficiency of estimates can be improved by incorporating the covariates.

Strong Ignorability
To overcome the selection problem, or to improve the efficiency of estimates obtained under randomization, a set of fully observed covariates is commonly assumed, conditional on which (Y(0), Y(1)) and T are independent. This is referred to as the conditional independence or unconfoundedness assumption (Rubin 1974; Heckman and Robb 1985). Formally, this assumption is expressed as follows.

Assumption 1. $(Y(0), Y(1)) \perp T \mid (X, Z)$.

In addition to Assumption 1, the following overlap or common support assumption concerning the joint distribution of treatment assignment and covariates is also needed. Let $p_{X,Z}(x, z) = P(T = 1 \mid X = x, Z = z)$ denote the propensity score, and let $\mathcal{X}$ and $\mathcal{Z}$ denote the supports of X and Z, respectively.

Assumption 2. $0 < p_{X,Z}(x, z) < 1$ for all $(x, z) \in \mathcal{X} \times \mathcal{Z}$.
Assumptions 1 and 2 are jointly referred to as strong ignorability in Rosenbaum and Rubin (1983) and lead to the following well-known result:

$$\tau = E\left[\frac{(T - p_{X,Z}(X,Z))\, Y}{p_{X,Z}(X,Z)\,(1 - p_{X,Z}(X,Z))}\right], \qquad
\tau_{treat} = \frac{1}{p}\, E\left[\frac{(T - p_{X,Z}(X,Z))\, Y}{1 - p_{X,Z}(X,Z)}\right],$$

where p = P(T = 1) is the probability of getting treated; see Proposition 18.3 of Wooldridge (2010). Thus, strong ignorability is sufficient to identify the estimands, τ and τ_treat, when all variables are accurately measured.
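These moment conditions map directly into sample analogues. The sketch below is purely illustrative and is not the paper's estimator: it fits the propensity score with a parametric logit (scikit-learn's `LogisticRegression`) rather than nonparametrically, and the trimming bounds and function names are our own assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_ate(y, t, covariates):
    """Sample-analogue IPW estimate of the ATE with error-free covariates."""
    # Propensity score via logit -- an illustrative stand-in for the
    # nonparametric estimator considered in the paper.
    ps = LogisticRegression().fit(covariates, t).predict_proba(covariates)[:, 1]
    ps = np.clip(ps, 0.01, 0.99)  # crude trimming to enforce overlap (Assumption 2)
    return np.mean(t * y / ps - (1 - t) * y / (1 - ps))

def ipw_att(y, t, covariates):
    """Sample-analogue IPW estimate of the ATT."""
    ps = LogisticRegression().fit(covariates, t).predict_proba(covariates)[:, 1]
    ps = np.clip(ps, 0.01, 0.99)
    return np.mean(t * y - (1 - t) * y * ps / (1 - ps)) / t.mean()
```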

Strong Ignorability with Measurement Error
Consider the case where Assumptions 1 and 2 continue to hold, but the quadruple $\{Y_j, T_j, W_j, Z_j\}$ is observed by the researcher instead of $\{Y_j, T_j, X_j, Z_j\}$. Here, the observed scalar, $W_j$, is assumed to be a noisy measure of $X_j$, generated by

$$W_j = X_j + \varepsilon_j,$$

where $\varepsilon_j$ is measurement error. Let $f_V$ denote the density of a random variable V and $f^{ft}(t) = \int e^{itx} f(x)\, dx$ denote the Fourier transform of a function f, with $i = \sqrt{-1}$. To identify τ and τ_treat in the presence of contaminated data, we impose the following assumption in addition to strong ignorability.

Assumption 3. $\varepsilon \perp (Y, T, X, Z)$, $f_\varepsilon$ is known, and $f_\varepsilon^{ft}$ vanishes nowhere.
Assumption 3 requires the measurement error to be classical. Although this is somewhat restrictive, it is worth noting that this setup is consistent with multiplicative measurement error of the form $W = X\varepsilon$, as this can be transformed to an additive structure by taking the natural logarithm. Moreover, full independence is stronger than what is strictly necessary for identification; a weaker condition, comparable in strength to a conditional mean restriction, would suffice. The assumption of a known error distribution is unlikely to hold in practice, but is imposed here for simplicity. We discuss the relaxation of this assumption when auxiliary information is available, such as repeated measurements of X, in Section 3.5.
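To see why $f_\varepsilon^{ft}$ must vanish nowhere, recall the standard deconvolution identity implied by classical measurement error (a textbook fact, e.g., Meister 2009, rather than a result specific to this paper):

```latex
% Classical error makes the observed density a convolution, f_W = f_X * f_eps,
% so in the Fourier domain the true density can be recovered by division:
f_W^{ft}(t) = f_X^{ft}(t)\, f_\varepsilon^{ft}(t)
\;\Longrightarrow\;
f_X(x) = \frac{1}{2\pi} \int e^{-itx}\, \frac{f_W^{ft}(t)}{f_\varepsilon^{ft}(t)}\, dt.
```

Division by $f_\varepsilon^{ft}$ is only well-defined if it has no zeros, and its rate of decay governs how difficult the inversion is, a point that drives the convergence rates in Section 3.4.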
The identification result in the presence of contaminated data is given in Theorem 1. The intuition behind Theorem 1 is straightforward. Based on (3) and (4), to identify τ and τ_treat, it is sufficient to identify $f_{Y,X,Z|T}$, which follows by applying the convolution theorem to $f_{Y,W,Z|T}$ under Assumption 3. For example, to identify E[Y(1)], which is needed to construct τ, the relevant weight is the function A given in (5). While it is not easy to give an intuitive interpretation of (5), using the law of iterated expectations, the result shown in Appendix A.2 implies that (5) is the equivalent of the inverse propensity score in the contaminated case. As can be seen, the functional form of A depends on $f_{W,Z|T}$ and $f_\varepsilon$. The former, $f_{W,Z|T}$, is identified as {T, W, Z} are directly observed, but extra knowledge of the latter, $f_\varepsilon$, is needed to identify A, which echoes the known-error-distribution part of Assumption 3.
In fact, the functional form of A matters not only for the identification of τ and τ_treat, but also for the convergence rates of estimators of τ and τ_treat. In particular, as will be seen in Section 3.4, varying the smoothness of $f_\varepsilon$ (implying $f_\varepsilon^{ft}$ decays to zero at different rates as $t \to \infty$) alters the convergence rate.
Intuitively, as $f_\varepsilon^{ft}$ appears in the denominator of A, even if the same estimator of $f_{W,Z|T}$ is used to construct the estimator of A and then the estimators of τ and τ_treat, due to the integration, the resulting estimators of τ and τ_treat will converge at different speeds if $f_\varepsilon^{ft}$ decays to zero at different rates.

Estimation
If we directly observe X, τ and τ_treat can be estimated by

$$\hat{\tau} = \frac{1}{n} \sum_{j=1}^{n} \left[ \frac{T_j Y_j}{\hat{p}_{X,Z}(X_j, Z_j)} - \frac{(1 - T_j) Y_j}{1 - \hat{p}_{X,Z}(X_j, Z_j)} \right], \qquad
\hat{\tau}_{treat} = \frac{1}{n \hat{p}} \sum_{j=1}^{n} \left[ T_j Y_j - \frac{(1 - T_j) Y_j\, \hat{p}_{X,Z}(X_j, Z_j)}{1 - \hat{p}_{X,Z}(X_j, Z_j)} \right],$$

where $\hat{p} = \frac{1}{n} \sum_{j=1}^{n} T_j$ and $\hat{p}_{X,Z}$ is a nonparametric estimator of the propensity score, $p_{X,Z}$. These are known as the inverse propensity score weighting (IPW) estimators; see Horvitz and Thompson (1952). However, this estimator is no longer feasible when X is unobserved due to measurement error. To overcome this, note that we can alternatively express τ and τ_treat as integrals weighted by the inverse propensity score, given in equations (6) and (7); derivations of (6) and (7) are discussed in Appendix A.1. To keep the notation simple, we will focus on the case when both X and Z are scalar for the rest of this section. By applying the deconvolution method with $f_\varepsilon$ known and given the i.i.d. sample $\{Y_j, T_j, W_j, Z_j\}_{j=1}^{n}$ of (Y, T, W, Z), the conditional densities $f_{Y,X,Z|T=1}(y, x, z)$ and $f_{Y,X,Z|T=0}(y, x, z)$ can be estimated by

$$\tilde{f}_{Y,X,Z|T=t}(y, x, z) = \frac{1}{n_t b_n^3} \sum_{j: T_j = t} K\!\left(\frac{y - Y_j}{b_n}\right) \mathbb{K}_\varepsilon\!\left(\frac{x - W_j}{b_n}\right) K\!\left(\frac{z - Z_j}{b_n}\right), \qquad t = 0, 1,$$

where $n_t$ is the number of observations with $T_j = t$, and the propensity score $p_{X,Z}(x, z)$ can be estimated by

$$\tilde{p}_{X,Z}(x, z) = \frac{\sum_{j=1}^{n} T_j\, \mathbb{K}_\varepsilon\!\left(\frac{x - W_j}{b_n}\right) K\!\left(\frac{z - Z_j}{b_n}\right)}{\sum_{j=1}^{n} \mathbb{K}_\varepsilon\!\left(\frac{x - W_j}{b_n}\right) K\!\left(\frac{z - Z_j}{b_n}\right)},$$

where $b_n$ is a bandwidth, K is an (ordinary) kernel function, and $\mathbb{K}_\varepsilon$ is a deconvolution kernel function defined as

$$\mathbb{K}_\varepsilon(u) = \frac{1}{2\pi} \int e^{-itu}\, \frac{K^{ft}(t)}{f_\varepsilon^{ft}(t / b_n)}\, dt.$$

Plugging (8), (9), and (10) into (6) and (7), we obtain the estimators of τ and τ_treat given in (11) and (12).
In (11) and (12), $\mathcal{X}$ and $\mathcal{Z}$ separately denote the supports of X and Z, over which the integration is carried out, and the derivations of (11) and (12) parallel those of (6) and (7). For the multivariate case, the same construction applies with product kernels, where $\mathcal{X}_{d_1}$ and $\mathcal{Z}_{d_2}$ separately denote the supports of $X_{d_1}$ and $Z_{d_2}$ for $d_1 = 1, \ldots, d_x$ and $d_2 = 1, \ldots, d_z$, $x = (x_1, \ldots, x_{d_x})$, and $z = (z_1, \ldots, z_{d_z})$. We conjecture that analogous results to our main theorems can be established for the multivariate case.
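To make the deconvolution kernel $\mathbb{K}_\varepsilon$ concrete, the sketch below evaluates it by numerical Fourier inversion for a zero-mean Laplace error. This is our illustration rather than the paper's code: the choice $K^{ft}(t) = (1 - t^2)^3$ on $[-1, 1]$ is one common compactly supported option we assume for simplicity, and the quadrature grid size is arbitrary.

```python
import numpy as np

def kernel_ft(t):
    """Fourier transform of the ordinary kernel K: here (1 - t^2)^3 on [-1, 1],
    a common compactly supported choice (an assumption of this sketch)."""
    return np.where(np.abs(t) <= 1.0, (1.0 - np.minimum(t * t, 1.0)) ** 3, 0.0)

def laplace_ft(t, sd):
    """Characteristic function of a zero-mean Laplace error with std. dev. sd."""
    return 1.0 / (1.0 + 0.5 * sd * sd * t * t)

def deconv_kernel(u, bn, sd, grid_size=2001):
    """Evaluate K_eps(u) = (1/2pi) int e^{-itu} K^ft(t) / f_eps^ft(t/bn) dt
    by simple quadrature; the compact support of K^ft keeps the integral finite."""
    u = np.atleast_1d(np.asarray(u, dtype=float))
    t = np.linspace(-1.0, 1.0, grid_size)
    # cos() suffices because K^ft and the Laplace cf are both real and even.
    integrand = np.cos(np.outer(u, t)) * kernel_ft(t) / laplace_ft(t / bn, sd)
    return np.trapz(integrand, t, axis=1) / (2.0 * np.pi)
```

In the density estimators (8)-(9), $\mathbb{K}_\varepsilon((x - W_j)/b_n)$ simply takes the place that $K((x - X_j)/b_n)$ would occupy if $X_j$ were observed.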
To derive the convergence rates of $\hat{\tau}$ and $\hat{\tau}_{treat}$, we need the following conditions.

Assumption 4.
(i) $\{Y_j, T_j, W_j, Z_j\}_{j=1}^{n}$ is an i.i.d. sample of (Y, T, W, Z). $f_{X,Z}$ and $E[Y(s) \mid X, Z]$ are bounded away from zero, and $f_{X,Z}$ and $E[Y^2(s) \mid X, Z]$ are bounded, for s = 0, 1, over the compact support $\mathcal{X} \times \mathcal{Z}$.
(ii) $f_{X,Z}$, $p_{X,Z}$, and $E[Y(s) \mid X, Z]$ for s = 0, 1 are γ-times continuously differentiable with bounded and integrable derivatives for some positive integer γ.
(iii) K is a γ-th order kernel function whose Fourier transform $K^{ft}$ is bounded and compactly supported.
(iv) As $n \to \infty$, $b_n \to 0$ and $n b_n \to \infty$.
Assumption 4 (i) requires random sampling and regularity of the densities and conditional moments. Assumption 4 (ii) imposes smoothness restrictions on the densities and conditional moments, which are needed to control the magnitude of the bias in the estimation together with the properties of the kernel function K imposed in Assumption 4 (iii). In addition to the standard properties of a high-order kernel function, Assumption 4 (iii) also requires $K^{ft}$ to be compactly supported, which is commonly used in deconvolution problems to truncate the ill-behaved tails of the integrand for regularization purposes.
Meister (2009) discusses how kernels of any order can be constructed quite simply. Assumption 4 (iv) imposes mild bandwidth restrictions. In particular, it simply requires that the bandwidth must decay to zero as the sample size grows, but should not decay too fast. The second part of Assumption 4 (iv) is needed so that the higher order components of the estimation error are asymptotically negligible.
Theorem 2 presents the convergence rates of $\hat{\tau}$ and $\hat{\tau}_{treat}$. The second term, $b_n^{\gamma}$, in the convergence rate characterizes the magnitude of the estimation bias, which is identical to the error-free case. The first term characterizes the magnitude of the estimation variance. Compared to the error-free case, the estimation variance of $\hat{\tau}$ and $\hat{\tau}_{treat}$ decays more slowly due to the extra deconvolution factor involving $f_\varepsilon^{ft}$. In particular, the smoother the error distribution, the larger the estimation variance and, hence, the slower the convergence rate.
As is typical in the nonparametric measurement error literature, to further specify the convergence rates of $\hat{\tau}$ and $\hat{\tau}_{treat}$, we consider two separate cases characterized by different smoothness of the measurement error: the ordinary smooth case and the supersmooth case. For the ordinary smooth case, the error characteristic function decays at a polynomial rate; that is, $|f_\varepsilon^{ft}(t)| \asymp |t|^{-\beta}$ as $|t| \to \infty$ for some $\beta > 0$.
Corollary 1 shows that $\hat{\tau}$ and $\hat{\tau}_{treat}$ converge at a polynomial rate $n^{-r}$ for some constant r > 0. The value of r depends on the choice of the bandwidth $b_n$, which will be discussed in Section 4.
For the supersmooth case, the error characteristic function decays at an exponential rate; that is, $|f_\varepsilon^{ft}(t)| \asymp \exp(-|t|^{\beta}/\mu)$ as $|t| \to \infty$ for some $\beta, \mu > 0$.
Corollary 2 shows that $\hat{\tau}$ and $\hat{\tau}_{treat}$ can only converge at a logarithmic rate, which is much slower than the polynomial rate obtained in the ordinary smooth case. In particular, a normal error makes the estimator much more data-demanding than a Laplace error. Again, the specific rate will depend on the choice of the bandwidth $b_n$, which will be discussed in Section 4.
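Two textbook examples, which also underlie the simulations below, illustrate the dichotomy (stated for zero-mean errors with standard deviation $\sigma$):

```latex
% Laplace error: ordinary smooth of order 2 (polynomial decay)
f_\varepsilon^{ft}(t) = \frac{1}{1 + \sigma^2 t^2 / 2},
\qquad
% Normal error: supersmooth (exponential decay)
f_\varepsilon^{ft}(t) = \exp\!\left(-\sigma^2 t^2 / 2\right).
```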

Case of Unknown Measurement Error Distribution
Assuming the measurement error distribution to be fully known is usually unrealistic in practice. Auxiliary information, such as repeated measurements of X, can be used to relax the assumption of a known error distribution imposed in Assumption 3.
Suppose we have two independent noisy measures of X, W and $W_r$, determined as

$$W_j = X_j + \varepsilon_j, \qquad W_{r,j} = X_j + \varepsilon_{r,j}, \qquad j = 1, \ldots, n.$$

To identify the distribution of ε, we impose the following assumption: $\varepsilon_r \perp (Y, T, X, Z, \varepsilon)$, $\varepsilon_r$ and ε are identically distributed, and $f_\varepsilon$ is symmetric about zero. Under this assumption, $W_j - W_{r,j} = \varepsilon_j - \varepsilon_{r,j}$, whose characteristic function equals $|f_\varepsilon^{ft}(t)|^2$, so the error characteristic function can be estimated by

$$\hat{f}_\varepsilon^{ft}(t) = \left| \frac{1}{n} \sum_{j=1}^{n} \cos\big(t\,(W_j - W_{r,j})\big) \right|^{1/2}.$$

When the measurement error distribution is unknown, we can estimate τ and τ_treat by plugging this estimate into the estimators of Section 3.3, replacing the deconvolution kernel with its counterpart based on the estimated error characteristic function, $\hat{f}_\varepsilon^{ft}$.
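A minimal sketch of this first-stage estimate (our illustration; the estimator of $|f_\varepsilon^{ft}|$ from differenced repeated measurements is in the spirit of Delaigle, Hall, and Meister (2008), and the function name is hypothetical):

```python
import numpy as np

def error_char_hat(t, w, w_r):
    """Estimate |f_eps^ft(t)| from repeated measurements. Since
    W - W_r = eps - eps_r, the characteristic function of the difference
    equals |f_eps^ft(t)|^2 under the assumptions above; cos() exploits the
    symmetry of the difference, and |.| guards against small negative values."""
    d = np.asarray(w) - np.asarray(w_r)   # differences cancel the unobserved X
    phi_diff = np.cos(np.outer(np.atleast_1d(t), d)).mean(axis=1)
    return np.sqrt(np.abs(phi_diff))
```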

Inference
We leave for future work the examination of bootstrap methods for the construction of confidence intervals for our proposed estimators. In particular, a non-parametric bootstrap as in Bissantz et al. (2007) could be considered for the case when the error distribution is known, and a wild bootstrap as in Kato and Sasaki (2019) could be considered for the case when the error distribution is unknown but repeated measurements are available.

Simulation
In this section, we evaluate the finite sample performance of the proposed estimators using Monte Carlo simulation. In particular, we focus on the case with a single covariate for which we can only observe a noisy measurement. The following data generating process is considered:

$$Y = g(T, X) + U,$$

where covariate X is drawn from U[0.5, 1.5] and is independent of U, the error term U is drawn from N(0, 1) and is independent of (T, X), the treatment is assigned according to $P(T = 1 \mid X) = \exp(0.5 - X)$, and three specifications of g are considered, the first of which is DGP1: $g(t, x) = t + x$; the remaining specifications introduce nonlinearity into g. While X is assumed unobserved, we suppose $W = X + \varepsilon$ and $W_r = X + \varepsilon_r$ are observed, where $(\varepsilon, \varepsilon_r)$ are mutually independent and independent of (T, X, U). For the distributions of ε and $\varepsilon_r$, we consider two cases. First, as an example of ordinary smooth errors, we consider the case when $(\varepsilon, \varepsilon_r)$ have a zero-mean Laplace distribution with standard deviation 1/3. Second, as an example of supersmooth errors, we consider the case when $(\varepsilon, \varepsilon_r)$ have a normal distribution with zero mean and standard deviation 1/3.
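For concreteness, a minimal sketch of this design (our code, not the authors'; DGP1 only, with the outcome equation $Y = g(T, X) + U$ as stated above):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, error="laplace"):
    """Draw one sample of (Y, T, W, W_r) under DGP1."""
    x = rng.uniform(0.5, 1.5, n)                                # X ~ U[0.5, 1.5]
    t = (rng.uniform(size=n) < np.exp(0.5 - x)).astype(float)   # P(T=1|X) = exp(0.5 - X)
    y = t + x + rng.normal(0.0, 1.0, n)                         # DGP1: g(t, x) = t + x
    sd = 1.0 / 3.0
    if error == "laplace":                                      # ordinary smooth case
        scale = sd / np.sqrt(2.0)                               # Laplace variance = 2 * scale^2
        e, e_r = rng.laplace(0.0, scale, n), rng.laplace(0.0, scale, n)
    else:                                                       # supersmooth (normal) case
        e, e_r = rng.normal(0.0, sd, n), rng.normal(0.0, sd, n)
    return y, t, x + e, x + e_r                                 # only W and W_r are observed
```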
Throughout the simulation study, we use a kernel function whose Fourier transform equals one on an interval $[-c, c]$, decays smoothly to zero for $c < |t| < 1$, and vanishes for $|t| \ge 1$. This is the infinite-order flat-top kernel proposed by McMurry and Politis (2004), whose Fourier transform is compactly supported and can be used for regularization purposes in the deconvolution estimation.
A trimming term is used in the denominators of the estimators to avoid instability from division by values close to zero. In Tables 1-3, results are reported both for $\hat{\tau}$ and $\hat{\tau}_{treat}$, the proposed estimators for the case when the error distribution is known and only W is observed, and for their counterparts for the case when the error distribution is unknown and both W and $W_r$ are observed. Results are given for the bias (Bias), standard deviation (SD), and root mean squared error (RMSE) of each estimator in different settings.
The results appear encouraging and display several interesting features. First, the estimators perform better with a larger sample size, and performance is better with ordinary smooth errors than in the supersmooth case. We also note that the performance of the estimators when the error distribution is unknown is close to that when the error distribution is known, although using the estimated error distribution generally adds extra noise to the estimation. There are cases when the symmetry of the error distribution may allow the performance of the estimator to be independent of whether the error distribution is estimated or not; see Dong et al. (2020b). Finally, as is to be expected, the proposed estimators have similar performance across the different data generating processes, which implies that they are robust to unobserved nonlinearity in the conditional expectation function.

Application
To illustrate our estimator in practice, we revisit the analysis in Drexler et al. (2014). As our estimator is based on the IPW estimator, we examine each treatment separately. To do so, we restrict the sample to a single treatment arm along with the control group. Thus, our sample sizes diverge from the original study. Nonetheless, we present OLS estimates for comparison to Drexler et al. (2014), and they are essentially identical.
Our results are presented in Tables 4 and 5. The only difference across the two tables is the set of outcomes being analyzed. In each table, Columns 2 and 5 report the OLS estimates of the treatment effect. These are most directly comparable to Drexler et al. (2014), subject to the caveat mentioned above that we assess each treatment separately. Columns 3 and 6 report IPW estimates of the ATE treating all covariates as correctly measured and estimating the propensity score via logit. Finally, Columns 4 and 7 report the results of our estimator for the ATE, treating the size of the loan as potentially mismeasured.
Because the application lacks any auxiliary information on possible measurement error in this covariate, we assume the measurement error is normally distributed with mean zero and consider three different variances, corresponding to increasing levels of measurement error. Specifically, we set the standard deviation of the measurement error to 1/6, 1/3, and 2/3 of the standard deviation of the observed loan values.
The bandwidth for the observed loan values is chosen, as in the simulation experiment, as two times the optimal bandwidth suggested by Delaigle and Gijbels (2004); bandwidths for the other covariates, which are assumed to be error-free, are chosen following Li and Racine (2003).
The results are interesting. In terms of the standard accounting treatment, the results appear robust to modest measurement error in loan size. The only outcome for which the treatment effect is statistically significant when ignoring measurement error is "setting aside cash for business purposes." Here, the OLS and IPW estimates are both 0.07. With modest measurement error, our estimator yields a point estimate of 0.08. It is noteworthy, however, that we find even stronger effects as we increase the variance of the measurement error.
In terms of the rule-of-thumb treatment, the results also appear predominantly robust to modest measurement error in loan size. In Table 4, all OLS and IPW estimates are statistically significant at conventional levels. With modest measurement error, our estimator yields point estimates that are qualitatively unchanged; sometimes slightly larger and sometimes slightly smaller in absolute value. As we increase the variance of the measurement error, however, the point estimates generally increase in magnitude.
Thus, the economic magnitudes of the treatment effects are sensitive to the degree of measurement error.
For example, increasing the standard deviation of the measurement error from 1/6 to 2/3 of the standard deviation of the observed loan values at least doubles the magnitude of the ATE for the outcomes "setting aside cash for business purposes," "keep accounting records," "separate business and personal accounting," and "calculate revenues formally".
In Table 5, the only outcomes for which the treatment effect is statistically significant when ignoring measurement error are "any reporting errors" and "revenue index." For the former, our estimator suggests, if anything, a larger ATE in absolute value once measurement error is addressed. For the latter, our estimator suggests a smaller ATE once measurement error is addressed.

Notes (Tables 4 and 5): The sample includes only those individuals with their own business who were either exposed to the treatment in the column heading or neither treatment. The number of observations appears beneath the results for each model. Standard errors for OLS and IPW are in parentheses and are clustered at the barrio level. IPW-ME reports only point estimates. The IPW-ME estimates in the first row for each outcome assume the standard deviation of the measurement error is 1/6 of the standard deviation of the observed covariate; estimates in {} assume 1/3; estimates in {{}} assume 2/3. IPW and IPW-ME estimates are of the average treatment effect.

Conclusion
Estimation of the causal effect of a binary treatment on outcomes, even in the case of selection on observed covariates, can be complicated when one or more of the covariates are measured with error. In this paper, we present a new semi-parametric estimator that addresses this issue. In particular, we focus on the case when the propensity score is of an unknown functional form and some covariates are subject to classical measurement error. Allowing the propensity score to be an unknown function of unobserved, error-free covariates, we consider an integration weighted by a deconvolution kernel density estimator. Our simulations and replication exercise show our estimator to be valuable to empirical researchers. However, future work is needed to develop inference procedures for this estimator.

A Derivation of Equations
A.1 Derivation of (6) and (7)
For t = 0, 1,

$$E[Y(t)] = \int \left[ \int y\, f_{Y|T=t, X=x, Z=z}(y)\, dy \right] f_{X,Z}(x, z)\, dx\, dz
= P(T = t) \iiint y\, \frac{f_{Y,X,Z|T=t}(y, x, z)\, f_{X,Z}(x, z)}{f_{X,Z|T=t}(x, z)\, P(T = t)}\, dy\, dx\, dz
= P(T = t) \iiint \frac{y\, f_{Y,X,Z|T=t}(y, x, z)}{P(T = t \mid X = x, Z = z)}\, dy\, dx\, dz,$$

where the first equality follows by Assumption 1 (together with iterated expectations) and the last step requires Assumption 2. Similarly,

$$E[Y(0) \mid T = 1] = \int \left[ \int y\, f_{Y|T=0, X=x, Z=z}(y)\, dy \right] f_{X,Z|T=1}(x, z)\, dx\, dz
= \iiint y\, \frac{f_{X,Z|T=1}(x, z)}{f_{X,Z|T=0}(x, z)}\, f_{Y,X,Z|T=0}(y, x, z)\, dy\, dx\, dz
= E\!\left[ Y\, \frac{f_{X,Z|T=1}(X, Z)}{f_{X,Z|T=0}(X, Z)} \,\middle|\, T = 0 \right],$$

where the first step follows by Assumption 1 and the last step requires Assumption 2.