Estimating the Effects of Credit Constraints on Productivity of Peruvian Agriculture

Woutersen, Tiemen; Hauck, Katherine; Khandker, Shahidur R.

doi:10.3390/econometrics12040027


by Tiemen Woutersen ¹,*, Katherine Hauck ² and Shahidur R. Khandker ³,⁴

¹ Department of Economics, Eller College of Management, University of Arizona, P.O. Box 210108, Tucson, AZ 85721, USA

² Department of Agricultural and Resource Economics, University of California at Davis, Davis, CA 95616, USA

³ World Bank Group, 1818 H Street NW, Washington, DC 20433, USA

⁴ International Food Policy Research Institute (IFPRI), Washington, DC 20005, USA

* Author to whom correspondence should be addressed.

Econometrics 2024, 12(4), 27; https://doi.org/10.3390/econometrics12040027

Submission received: 10 June 2024 / Revised: 10 September 2024 / Accepted: 13 September 2024 / Published: 26 September 2024


Abstract

This paper proposes an estimator for endogenous switching regression models with fixed effects. The decision to switch from one regime to the other may depend on unobserved factors, which would cause the state, such as being credit constrained, to be endogenous. Our estimator allows for this endogenous selection and for conditional heteroscedasticity in the outcome equation. Applying our estimator to a dataset on productivity in agriculture substantially changes the conclusions compared to earlier analysis of the same dataset. Intuitively, the reason that our estimate of the impact of switching between states is smaller than previously estimated is that we capture the selection issue: switching between being credit constrained and credit unconstrained may be endogenous to farm production. In particular, we find that being credit constrained has the substantial effect of reducing yield by 11%, but not the previously estimated very dramatic effect of reducing yield by 26%.

Keywords: endogenous switching regression models; credit constraints

JEL Classification: C10; O16; O13; Q10; Q14

1. Introduction

The endogenous switching regression model is useful when analyzing individuals and firms that switch between two regimes, for example, being credit constrained versus being credit unconstrained. A credit constrained business may not be able to make the necessary investments, which may lower the productivity of that business. Similarly, a credit constrained farmer may not be able to purchase fertilizer or tools, which may also cause their productivity to be lower. The decision to switch from one regime to the other could also depend on unobserved factors, which would cause the state, such as being credit constrained, to be endogenous.

Charlier et al. (2001) estimated several switching models and found that the data rejected all of them. The models that they rejected include a linear panel model and Kyriazidou’s (1997) semi-parametric model. Kyriazidou’s (1997) model did not allow for conditional time-varying heteroscedasticity, so it may not be that surprising that the data rejected that model. Allowing for conditional time-varying heteroscedasticity is important because the variance of the error terms often depends on the predictors. For example, the residual variance of farm output may be greater in some years than in other years, and we allow for this possibility.

We propose to generalize the existing fixed effects and random effects models to allow for endogenous switching. This generalization will allow for conditional heteroscedasticity in the outcome equation, a feature of almost any dataset. In particular, Maddala and Nelson’s (1974) switching model is a special case of the proposed model, as is the linear model with fixed effects and heteroscedastic errors.

Maddala and Nelson (1974) and Kyriazidou (1997) both considered the problem of endogenous switching in switching regression models. We expand on these papers in the following ways. Maddala and Nelson (1974) did not allow for fixed effects or time dummies. However, many empirical settings require the use of fixed effects or time dummies to absorb individual-level heterogeneity, such as a farm’s land quality. Therefore, we generalize Maddala and Nelson’s model to allow for fixed effects and time dummies. Kyriazidou (1997), on the other hand, allowed for fixed effects in a switching regression model, but assumed homoscedasticity. However, in many empirical settings, homoscedasticity does not fit the data well. Therefore, we relax the assumption of homoscedasticity to allow for conditional time-varying heteroscedasticity. By (i) allowing for fixed effects and time dummies, and (ii) relaxing the assumption of homoscedasticity to require only conditional time-varying heteroscedasticity, we generalize the existing fixed effects and random effects models to allow for endogenous switching. This generalization allows our model to better fit the data. We demonstrate this better fit by re-estimating the effects of removing credit constraints on farm productivity on a dataset that had been previously analyzed using both a linear panel model and Kyriazidou’s (1997) method.

Specifically, the application we are interested in is agricultural financing in developing countries and micro non-farm financing. Studying the effects of credit constraints on farmers is a focus of the development literature, which seeks to better understand the role of institutions in eliminating rural poverty. Guirkinger and Boucher (2008) applied both a fixed effects model and Kyriazidou’s (1997) model to farmers who may switch between being credit constrained and unconstrained to estimate the effects of such credit constraints on farm productivity. However, it might be argued that the fixed effects models that Charlier et al. (2001) and Guirkinger and Boucher (2008) estimated did not adequately take the endogenous switching decision into account. Specifically, the linear panel model employed by Guirkinger and Boucher (2008) did not incorporate a selection equation. A selection equation is important because it takes the endogenous switching decision into account in the outcome equation. In this context, unobserved factors that impact a farmer’s credit constraint may also impact their farm productivity, so switching between being credit constrained and unconstrained may not be exogenous to farm output. Our model resolves this endogeneity problem.

In addition to our methodological contribution, our paper contributes to the literature on credit constraints in agriculture. We show that accounting for unobserved farmer characteristics that influence both credit constraints and farm output greatly reduces the estimated benefits of removing credit constraints. This result has important policy implications both in the scope of agricultural policy and in other settings. Policy makers attempting to alleviate rural poverty should take into account that removing credit constraints may not increase farm output as much as previously thought. Additionally, many policy settings involve estimating the impact of moving individuals or households from one regime to another, and we demonstrate that accounting for endogenous selection into those programs may be important for understanding the policy outcomes associated with them.

Guirkinger and Boucher (2008) estimated that removing the credit constraint from constrained farmers would increase productivity by 26%. This number is based on an estimate from a fixed effects model without a selection equation. We extend such a model by adding a selection equation, and find that the credit constraint has a much smaller impact (11% versus 26%) on farm production when using the same dataset, demonstrating the importance of having a selection equation.

Aside from the papers previously mentioned, our paper relates to two main strands of literature. First, it relates to the econometric literature about fixed effects and random effects models and selection bias. The methodology of our paper differs from Wooldridge (1995). Like this paper, Wooldridge (1995) addressed selection bias in panel models; however, Wooldridge (1995) did not consider endogenous switching models, which is the focus of this paper. Semykina and Wooldridge (2010) further considered selection bias in panel models, and they used the inverse Mills ratio as an instrumental variable. Our method also uses the inverse Mills ratio, but unlike Semykina and Wooldridge (2010), we do not use it as an instrument.

Second, our paper pertains to the empirical literature on credit constraints in agriculture. Two related papers are Feder et al. (1988, 1990). Like Guirkinger and Boucher (2008) and the present paper, these two papers argued that being credit constrained may be endogenous. However, unlike Guirkinger and Boucher (2008) and the present paper, these papers did not use individual effects to control for the heterogeneity of the quality of the farmland. Controlling for time-invariant individual-level effects is important in many empirical settings, but it complicates generating unbiased estimators of the coefficients of interest in a switching regression model with endogenous switching due to selection. Our model allows for individual-level fixed effects and endogenous switching. More recently, Sekyi et al. (2017) used survey data and a conditional mixed logit model to analyze access to credit and farmer productivity simultaneously. Unlike our paper, their paper does not use an endogenous switching model. Seck (2019) and Zabatantou Louyindoula et al. (2023), which both used a switching model to analyze credit constraints, are most similar to our paper. However, both Seck (2019) and Zabatantou Louyindoula et al. (2023) followed Wooldridge (2010) by using an instrument to account for the endogeneity of credit constraints and farm productivity. Unlike Seck (2019) and Zabatantou Louyindoula et al. (2023), we do not use an instrument. Instead, we account for this endogeneity directly and use an exclusion restriction.

We note that switching models are not only useful for loan decisions, but are also useful for labor supply and household expenditure decisions. For example, Lee (1978) and Adamchik and Bedi (2000) estimated a switching model to analyze wage differences between different sectors of the economy. We expect our extensions to be useful for such applications as well.

This paper is organized as follows. Section 2 introduces the model and states the consistency and asymptotic normality result of our estimator. Section 3 applies the new estimator to data on productivity in Peruvian agriculture. Section 4 concludes.

2. Model and Theorem

In our application, farmers can be credit constrained or credit unconstrained. Being credit constrained may reduce output of the farm, since it may be more difficult to buy the relevant inputs such as fertilizer and machines, as well as to hire farm hands or specialized workers. Thus, being credit constrained may reduce productivity. However, being credit constrained and having low productivity could also be caused by an unobserved shock that impacts both, such as illness of the farmer. Thus, it is important to account for this sample selection, and accounting for this selection is what Maddala and Nelson’s (1974) “switching regression model with endogenous switching” intends to do. In particular, their switching regression model has a selection equation and an outcome equation, and their model is a special case of the more general model we describe below.

Let $W_{it}$ be equal to one if farmer $i$ is credit constrained in period $t$, and zero otherwise. If farmer $i$ is not credit constrained in period $t$ ($W_{it} = 0$), then the productivity of the farm is

$$Y_{it}^{(0)} = \beta' X_{it} + f_{i}^{(0)} + \kappa_{t} + u_{it}^{(0)}, \tag{1}$$

where $X_{it}$ denotes the regressors, $f_{i}^{(0)}$ is an individual-specific fixed effect, $\kappa_{t}$ is a time dummy, and $u_{it}^{(0)}$ is the error term. The models considered by Maddala and Nelson (1974) or Maddala (1983) do not have fixed effects or time dummies, but we use those here. Such individual fixed effects are important in our application to control for the quality of land, and they are also important in many other settings to control for time-invariant individual-level heterogeneity. However, including fixed effects and time dummies complicates obtaining an unbiased estimate of the coefficients of interest in the outcome equation. Kyriazidou (1997) addresses this issue by assuming homoscedasticity. A contribution of our paper is to solve this issue without assuming homoscedasticity by generalizing fixed effects and random effects models to allow for endogenous switching.

Similarly, if farmer $i$ is credit constrained in period $t$ ($W_{it} = 1$), then the productivity of the farm is

$$Y_{it}^{(1)} = \alpha' X_{it} + f_{i}^{(1)} + \tau_{t} + u_{it}^{(1)}, \tag{2}$$

where the fixed effect $f_{i}^{(1)}$ and error term $u_{it}^{(1)}$ are, in general, different from $f_{i}^{(0)}$ and $u_{it}^{(0)}$ in Equation (1).

Maddala and Nelson (1974) assume that the error terms in the selection equation and in the outcome equation are jointly normal. This assumption implies that the error terms in the outcome equations do not have expectation zero conditional on the regressors. Therefore, Maddala and Nelson (1974) and Maddala (1983) subtract the inverse Mills ratio with a known coefficient from the outcome equations. The inverse Mills ratio is the ratio of the probability density function over the complementary cumulative distribution function of a distribution. Specifically, let $X$ be a normally distributed random variable with mean $\mu$ and variance $\sigma^{2}$. Then the inverse Mills ratio appears as the fraction in each of the two truncated means

$$E[X \mid X > \alpha] = \mu + \sigma \frac{\phi\left(\frac{\alpha - \mu}{\sigma}\right)}{1 - \Phi\left(\frac{\alpha - \mu}{\sigma}\right)},$$

$$E[X \mid X < \alpha] = \mu - \sigma \frac{\phi\left(\frac{\alpha - \mu}{\sigma}\right)}{\Phi\left(\frac{\alpha - \mu}{\sigma}\right)}.$$

Above, $\alpha$ denotes a constant, $\phi$ denotes the standard normal density function, and $\Phi$ denotes the standard normal cumulative distribution function.
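
The upper-truncation formula can be checked numerically. The following is a minimal sketch, not part of the paper: the function names are illustrative, and the numerical integration is included only as a sanity check on the closed form for $E[X \mid X > \alpha]$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def inverse_mills(z: float) -> float:
    """Return phi(z) / (1 - Phi(z)) for the standard normal distribution."""
    return STD_NORMAL.pdf(z) / (1.0 - STD_NORMAL.cdf(z))


def mean_above(mu: float, sigma: float, cutoff: float) -> float:
    """E[X | X > cutoff] for X ~ N(mu, sigma^2), via the inverse Mills ratio."""
    return mu + sigma * inverse_mills((cutoff - mu) / sigma)


def mean_above_numeric(mu: float, sigma: float, cutoff: float,
                       steps: int = 200_000) -> float:
    """The same conditional mean by midpoint-rule integration (check only)."""
    dist = NormalDist(mu, sigma)
    upper = mu + 12.0 * sigma  # probability mass beyond this point is negligible
    width = (upper - cutoff) / steps
    numerator = denominator = 0.0
    for k in range(steps):
        x = cutoff + (k + 0.5) * width
        density = dist.pdf(x)
        numerator += x * density * width
        denominator += density * width
    return numerator / denominator
```

For example, `mean_above(1.0, 2.0, 1.5)` and `mean_above_numeric(1.0, 2.0, 1.5)` should agree up to the integration error.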

Like Maddala and Nelson (1974), we also subtract the inverse Mills ratio from the outcome equations. However, since we do not assume that the error terms in the outcome equation are normally distributed, we need to estimate the coefficient of the inverse Mills ratio. In particular, we propose the following procedure, which is the main contribution of our paper.

First, we estimate a selection equation. In our application, this selection equation predicts if a farmer is credit constrained or not. Second, we difference out the fixed effects to obtain unbiased estimates of the coefficients of interest. In our application, these coefficients of interest are the marginal impacts of endowments on credit constrained farm productivity and credit unconstrained farm productivity. This method generalizes fixed effects and random coefficients models to allow for endogenous switching. Specifically, it takes a fixed effect or random effect model, and explicitly incorporates a selection equation as a first stage to account for endogenous selection into either regime (e.g., credit constrained or unconstrained). Further, our method differs from Kyriazidou (1997) because it allows for conditional heteroscedasticity in the outcome equation, meaning that it may better fit the data.

In our model, let $Q_{it}$ denote the regressors of the selection equation of individual $i$ in period $t$, and suppose we observe $N$ individuals for $T$ periods. Our procedure allows for predetermined regressors (step 1A) or for exogenous regressors and correlated random effects (step 1B).

Step 1A (selection equation with predetermined regressors): Estimate a Probit model with predetermined regressors. Let $(\hat{\gamma}_{1}, \dots, \hat{\gamma}_{T})$ denote the quasi maximum likelihood estimator, i.e.,

$$(\hat{\gamma}_{1}, \dots, \hat{\gamma}_{T}) = \arg\max_{\gamma_{1}, \dots, \gamma_{T}} \frac{1}{N} \sum_{i} \frac{1}{T} \sum_{t} \ln\left[ \Phi(\gamma_{t}' Q_{it})^{W_{it}} \{1 - \Phi(\gamma_{t}' Q_{it})\}^{1 - W_{it}} \right]. \tag{3}$$

Using $(\hat{\gamma}_{1}, \dots, \hat{\gamma}_{T})$, calculate

$$\hat{R}_{it} = \frac{\phi(\hat{\gamma}_{t}' Q_{it})}{1 - \Phi(\hat{\gamma}_{t}' Q_{it})}$$

for $i = 1, \dots, N$ and $t = 1, \dots, T$.
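
Step 1A can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: because the objective in Equation (3) is additively separable in $\gamma_{1}, \dots, \gamma_{T}$, it fits one Probit per period, and it uses Fisher scoring as the optimizer, which is a choice made here. The names `solve`, `fit_probit`, and `probit_mills_by_period` are hypothetical.

```python
from statistics import NormalDist

N01 = NormalDist()  # standard normal distribution


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_probit(q_rows, w, iters=50):
    """Probit maximum likelihood by Fisher scoring, starting from zero."""
    k = len(q_rows[0])
    g = [0.0] * k
    for _ in range(iters):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for q, wi in zip(q_rows, w):
            idx = sum(gj * qj for gj, qj in zip(g, q))
            p = min(max(N01.cdf(idx), 1e-12), 1.0 - 1e-12)  # avoid division by 0
            d = N01.pdf(idx)
            s = d * (wi - p) / (p * (1.0 - p))   # score weight
            h = d * d / (p * (1.0 - p))          # expected information weight
            for a in range(k):
                score[a] += s * q[a]
                for b in range(k):
                    info[a][b] += h * q[a] * q[b]
        step = solve(info, score)
        g = [gj + sj for gj, sj in zip(g, step)]
        if max(abs(sj) for sj in step) < 1e-10:
            break
    return g


def probit_mills_by_period(q, w):
    """q[i][t] is the regressor list Q_it and w[i][t] is 0 or 1.

    Returns (gamma_hat, r_hat): one coefficient vector per period and the
    estimated inverse Mills ratio phi / (1 - Phi) for every (i, t)."""
    n, periods = len(q), len(q[0])
    gamma_hat = []
    r_hat = [[0.0] * periods for _ in range(n)]
    for t in range(periods):
        g = fit_probit([q[i][t] for i in range(n)], [w[i][t] for i in range(n)])
        gamma_hat.append(g)
        for i in range(n):
            idx = sum(gj * qj for gj, qj in zip(g, q[i][t]))
            r_hat[i][t] = N01.pdf(idx) / max(1.0 - N01.cdf(idx), 1e-12)
    return gamma_hat, r_hat
```

In practice one would likely rely on an established Probit routine; the hand-written optimizer only keeps the sketch free of external dependencies.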

Step 1B (selection equation with correlated random effects): Estimate a Probit model with strictly exogenous regressors, constant slope coefficients and correlated random effects. Let $(\hat{\gamma}, \hat{\psi}_{1}, \dots, \hat{\psi}_{T})$ denote the quasi maximum likelihood estimator, i.e.,

$$(\hat{\gamma}, \hat{\psi}_{1}, \dots, \hat{\psi}_{T}) = \arg\max_{\gamma, \psi_{1}, \dots, \psi_{T}} \frac{1}{N} \sum_{i} \frac{1}{T} \sum_{t} \ln\left[ \Phi\Big(\gamma' Q_{it} + \frac{1}{T} \sum_{s=1}^{T} \psi_{s}' Q_{is}\Big)^{W_{it}} \Big\{1 - \Phi\Big(\gamma' Q_{it} + \frac{1}{T} \sum_{s=1}^{T} \psi_{s}' Q_{is}\Big)\Big\}^{1 - W_{it}} \right]. \tag{4}$$

Using $(\hat{\gamma}, \hat{\psi}_{1}, \dots, \hat{\psi}_{T})$, calculate

$$\hat{R}_{it} = \frac{\phi\big(\hat{\gamma}' Q_{it} + \frac{1}{T} \sum_{s=1}^{T} \hat{\psi}_{s}' Q_{is}\big)}{1 - \Phi\big(\hat{\gamma}' Q_{it} + \frac{1}{T} \sum_{s=1}^{T} \hat{\psi}_{s}' Q_{is}\big)}$$

for $i = 1, \dots, N$ and $t = 1, \dots, T$.
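
To make the indexing in step 1B concrete, the sketch below evaluates the Probit index and the implied inverse Mills ratio for one individual, given parameter values. It is illustrative only: the function names are hypothetical, and the parameters are taken as given rather than estimated.

```python
from statistics import NormalDist

N01 = NormalDist()  # standard normal distribution


def cre_index(q_i, gamma, psi, t):
    """Index gamma'Q_it + (1/T) * sum over s of psi_s'Q_is for one individual.

    q_i[s] is the regressor list of period s, gamma the common slope vector,
    and psi[s] the coefficient vector attached to period s."""
    periods = len(q_i)
    own = sum(g * x for g, x in zip(gamma, q_i[t]))
    effect = sum(
        sum(p * x for p, x in zip(psi[s], q_i[s])) for s in range(periods)
    ) / periods
    return own + effect


def cre_mills(q_i, gamma, psi, t):
    """Inverse Mills ratio phi / (1 - Phi) evaluated at the index above."""
    idx = cre_index(q_i, gamma, psi, t)
    return N01.pdf(idx) / (1.0 - N01.cdf(idx))
```

Setting every `psi[s]` to the same vector reduces the individual effect to that vector times the time average of the regressors.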

Step 2: After step 1A or step 1B, we need to difference out the fixed effect. The previous literature that relies on Kyriazidou (1997), such as Guirkinger and Boucher (2008), assumes that the propensity of an individual i to be in one category of the selection equation (e.g., to be credit constrained or credit unconstrained) is constant across time. However, if the propensity of individual i to be in one category of the selection equation changes over time, this approach works less well. Therefore, our method relaxes this assumption to difference out the fixed effect while specifically accounting for endogenous switching in the selection equation. For example, in our application, the outcome (farm productivity) and the selection equation (credit constrained or unconstrained) might both depend on factors that change over time, such as farmer health, and our method allows for this possibility.

To difference out the fixed effect, for every time period and every individual for which $W_{it} = W_{i,t-1} = 0$, calculate $\Delta Y_{it}^{(0)} = Y_{it}^{(0)} - Y_{i,t-1}^{(0)}$, $\Delta X_{it} = X_{it} - X_{i,t-1}$, and $\Delta \hat{R}_{it} = \hat{R}_{it} - \hat{R}_{i,t-1}$. Next, regress $\Delta Y_{it}^{(0)}$ on a constant, $\Delta X_{it}$, and $\Delta \hat{R}_{it}$. The constant takes care of the difference in time dummies, $\kappa_{t} - \kappa_{t-1}$. Then, for every time period and every individual for which $W_{it} = W_{i,t-1} = 1$, calculate $\Delta Y_{it}^{(1)} = Y_{it}^{(1)} - Y_{i,t-1}^{(1)}$ and regress $\Delta Y_{it}^{(1)}$ on a constant, $\Delta X_{it}$, and $\Delta \hat{R}_{it}$. Because the fixed effects $f_{i}^{(0)}$ and $f_{i}^{(1)}$ are constant over time, they cancel in these differences, so that we can build moments that do not depend on these fixed effects.
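
The differencing regression of step 2 can be sketched as follows. This is a simplified illustration, not the authors' code: it assumes a single scalar regressor, pools all adjacent-period pairs under one constant (which matches a two-period panel such as the application), and solves the normal equations directly. The names `solve`, `ols`, and `first_difference_ols` are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    return solve(xtx, xty)


def first_difference_ols(y, x, r_hat, w, regime):
    """Regress dY on (1, dX, dR_hat) for individuals who stay in `regime`.

    y[i][t], x[i][t], r_hat[i][t] are scalars and w[i][t] is 0 or 1.
    Only pairs (t-1, t) with w[i][t] == w[i][t-1] == regime are used, so the
    individual fixed effect cancels in every difference."""
    rows, dy = [], []
    for i in range(len(y)):
        for t in range(1, len(y[i])):
            if w[i][t] == regime and w[i][t - 1] == regime:
                rows.append([1.0,
                             x[i][t] - x[i][t - 1],
                             r_hat[i][t] - r_hat[i][t - 1]])
                dy.append(y[i][t] - y[i][t - 1])
    return ols(rows, dy)  # [constant, coefficient on dX, coefficient on dR_hat]
```

Calling the function once with `regime=0` and once with `regime=1` gives the two sets of outcome-equation coefficients; individuals who switch regime between the two periods do not enter either regression.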

If the researcher is willing to make stronger assumptions, then other differences can be used as well. In particular, define $\Delta_{l} Y_{it}^{(1)} = Y_{it}^{(1)} - Y_{i,t-l}^{(1)}$, $\Delta_{l} Y_{it}^{(0)} = Y_{it}^{(0)} - Y_{i,t-l}^{(0)}$, $\Delta_{l} X_{it} = X_{it} - X_{i,t-l}$, and $\Delta_{l} \hat{R}_{it} = \hat{R}_{it} - \hat{R}_{i,t-l}$ for $l = 1, \dots, T$. Then, define the moment

$$g^{(0)}(\theta) = \frac{1}{N} \begin{pmatrix}
\sum_{i} \{\Delta Y_{iT}^{(0)} - \Delta X_{iT} \beta - \Delta \hat{R}_{iT} \delta - (\kappa_{T} - \kappa_{T-1})\} \\
\sum_{i} \Delta X_{iT} \{\Delta Y_{iT}^{(0)} - \Delta X_{iT} \beta - \Delta \hat{R}_{iT} \delta - (\kappa_{T} - \kappa_{T-1})\} \\
\sum_{i} \Delta \hat{R}_{iT} \{\Delta Y_{iT}^{(0)} - \Delta X_{iT} \beta - \Delta \hat{R}_{iT} \delta - (\kappa_{T} - \kappa_{T-1})\} \\
\sum_{i} \{\Delta_{2} Y_{iT}^{(0)} - \Delta_{2} X_{iT} \beta - \Delta_{2} \hat{R}_{iT} \delta - (\kappa_{T} - \kappa_{T-2})\} \\
\sum_{i} \Delta_{2} X_{iT} \{\Delta_{2} Y_{iT}^{(0)} - \Delta_{2} X_{iT} \beta - \Delta_{2} \hat{R}_{iT} \delta - (\kappa_{T} - \kappa_{T-2})\} \\
\sum_{i} \Delta_{2} \hat{R}_{iT} \{\Delta_{2} Y_{iT}^{(0)} - \Delta_{2} X_{iT} \beta - \Delta_{2} \hat{R}_{iT} \delta - (\kappa_{T} - \kappa_{T-2})\} \\
\vdots
\end{pmatrix},$$

where $\kappa_{1}$ is normalized to be zero and $\theta = \{\beta, \delta, \kappa_{2}, \dots, \kappa_{T}\}'$. Let the moment for the other outcome, $g^{(1)}(\bar{\theta})$, be similarly defined, where $\tau_{1} = 0$ and $\bar{\theta} = \{\alpha, \varphi, \tau_{2}, \dots, \tau_{T}\}'$. One can use this generalized method of moments procedure instead of the least squares method in step 2, but we do not consider this in further detail here. In the application, we use a regressor in step 1A that is not used in step 2. This is usually called an exclusion restriction.

We now state the assumptions. These assumptions support the theorem that follows them.

Assumption 1 (Selection equation with predetermined regressors). Let $E(W_{it} \mid Q_{it}) = \Phi(\gamma_{t}' Q_{it})$. Let $E\{Q_{it} Q_{it}'\}$ be nonsingular for $t = 1, \dots, T$. Let the parameter space $\Theta_{A}$ be compact. Define $\omega = \{\alpha, \beta, \gamma_{1}, \dots, \gamma_{T}, \delta, \kappa, \tau, \varphi\}'$, and let the true value $\omega_{0}$ be in the interior of $\Theta_{A}$.

Assumption 1 allows for arbitrary correlation of the error in the selection equation, and also allows the variance of this equation to vary with time. De Jong and Woutersen (2011) discuss dynamic binary choice models, such as the selection equations discussed here, in more detail. An example of a selection equation that satisfies Assumption 1 is

$$W_{it} = \mathit{Health}_{i1} + \mathit{Harvest}_{it} \cdot (t + 1) + \eta_{it} \sqrt{t + 1}, \tag{6}$$

where $\mathit{Health}_{i1}$ is the health of farmer $i$ in period 1, $\mathit{Harvest}_{it}$ is the harvest of farmer $i$ in period $t$, and $\eta_{it}$ is a standard normal error term that is i.i.d. conditional on the regressors. In Equation (6), the variance of $W_{it}$ may be non-constant in $t$, and our model allows for this option. Equation (6) satisfies Assumption 1 because $Q_{it}$ is defined here as $\{\mathit{Health}_{i1}, \mathit{Harvest}_{it}\}$. However, if $E\{Q_{it} Q_{it}'\}$ were a singular matrix (e.g., if $E\{Q_{it} Q_{it}'\} = 0$), then Assumption 1 would not be satisfied. Assumption 1 corresponds to step 1A.
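
To see how Equation (6) maps into the period-specific Probit of Assumption 1, it may help to read it as a threshold-crossing model. That reading is an assumption made here for illustration and is not stated in the text: the binary indicator is taken to equal one when the right-hand side of Equation (6) is non-negative. Under that assumption, and using the symmetry of the standard normal distribution,

```latex
\begin{aligned}
P(W_{it} = 1 \mid Q_{it})
  &= P\!\left( \mathit{Health}_{i1} + \mathit{Harvest}_{it}\,(t+1)
       + \eta_{it}\sqrt{t+1} \ge 0 \right) \\
  &= P\!\left( \eta_{it} \ge
       -\frac{\mathit{Health}_{i1} + \mathit{Harvest}_{it}\,(t+1)}{\sqrt{t+1}} \right) \\
  &= \Phi\!\left( \frac{\mathit{Health}_{i1}}{\sqrt{t+1}}
       + \mathit{Harvest}_{it}\sqrt{t+1} \right)
   = \Phi(\gamma_{t}' Q_{it}),
\qquad
\gamma_{t} = \left( \frac{1}{\sqrt{t+1}},\ \sqrt{t+1} \right)'.
\end{aligned}
```

The coefficient vector then changes with $t$ even though the latent equation has fixed coefficients, which is the kind of time variation that the period-specific $\gamma_{t}$ in step 1A accommodates.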

Assumption 2 (Selection equation with correlated random effects). Let $E(W_{it} \mid Q_{i1}, \dots, Q_{iT}, v_{i}) = \Phi(\gamma' Q_{it} + v_{i})$, where $v_{i} = \frac{1}{T} \sum_{t=1}^{T} \psi_{t}' Q_{it}$. Let the parameter space, $\Theta_{B}$, be compact. Define $\varpi = \{\alpha, \beta, \gamma_{1}, \dots, \gamma_{T}, \psi_{1}, \dots, \psi_{T}, \delta, \kappa, \tau, \varphi\}'$, and let the true value $\varpi_{0}$ be in the interior of $\Theta_{B}$. Let $E\{(Q_{it} - v_{i})(Q_{it} - v_{i})'\}$ be nonsingular for $t = 1, \dots, T$ and all $\psi_{t} \in \Theta_{B}$.

Assumption 2 allows for correlated random effects because $v_{i}$ can depend on the regressors. Such correlated random effects were proposed by Chamberlain (1980). Mundlak (1978) lets the random effect depend on the averages of the regressors, $v_{i} = \psi' \frac{1}{T} \sum_{t=1}^{T} Q_{it}$, and the last assumption also allows for that. Assumption 2 corresponds to step 1B.

The next assumption means that the error term is uncorrelated with the regressors after differencing out the fixed effect. Define

$$\begin{aligned}
\Delta \varepsilon_{it}^{(0)} &= \Delta Y_{it}^{(0)} - (\beta' \Delta X_{it} + \Delta R_{it} \delta + \Delta \kappa_{t}), \\
\Delta \varepsilon_{it}^{(1)} &= \Delta Y_{it}^{(1)} - (\alpha' \Delta X_{it} + \Delta R_{it} \varphi + \Delta \tau_{t}),
\end{aligned} \tag{7}$$

where

$$R_{it} = \frac{\phi(\gamma_{t}' Q_{it})}{1 - \Phi(\gamma_{t}' Q_{it})}$$

is calculated if step 1A (predetermined regressors) is used, and

$$R_{it} = \frac{\phi\big(\gamma' Q_{it} + \frac{1}{T} \sum_{s=1}^{T} \psi_{s}' Q_{is}\big)}{1 - \Phi\big(\gamma' Q_{it} + \frac{1}{T} \sum_{s=1}^{T} \psi_{s}' Q_{is}\big)}$$

if step 1B (correlated random effects) is used.

Assumption 3. Let $Y_{it} = \beta' X_{it} + R_{it} \delta + \kappa_{t} + \varepsilon_{it}^{(0)}$ if $W_{it} = 0$. Let $Y_{it} = \alpha' X_{it} + R_{it} \varphi + \tau_{t} + \varepsilon_{it}^{(1)}$ if $W_{it} = 1$. Let $\Delta \varepsilon_{it}^{(0)}$ and $\Delta \varepsilon_{it}^{(1)}$ be uncorrelated with $\Delta X_{it}$, $\Delta W_{it}$, and $\Delta R_{it}$, and let $E(\Delta \varepsilon_{it}^{(0)}) = E(\Delta \varepsilon_{it}^{(1)}) = 0$. Let $\mathrm{var}(\Delta \varepsilon_{it}^{(0)} \mid \Delta X_{it}, \Delta W_{it}, \Delta R_{it}) > 0$ and $\mathrm{var}(\Delta \varepsilon_{it}^{(1)} \mid \Delta X_{it}, \Delta W_{it}, \Delta R_{it}) > 0$ for all $\Delta X_{it}$, $\Delta W_{it}$, and $\Delta R_{it}$. Further, let

$$E\left[ \{(1, \Delta X_{it}, \Delta W_{it}, \Delta R_{it}) (1, \Delta X_{it}, \Delta W_{it}, \Delta R_{it})'\} \mid W_{it} = W_{i,t-1} = 0 \right]$$

be nonsingular for some $t \in \{1, \dots, T\}$ and $P(W_{it} = W_{i,t-1} = 0) > 0$. Let

$$E\left[ \{(1, \Delta X_{is}, \Delta W_{is}, \Delta R_{is}) (1, \Delta X_{is}, \Delta W_{is}, \Delta R_{is})'\} \mid W_{is} = W_{i,s-1} = 1 \right]$$

be nonsingular for some $s \in \{1, \dots, T\}$ and $P(W_{is} = W_{i,s-1} = 1) > 0$.

Assumption 3 helps to deliver unbiased estimates of the coefficients of interest in the outcome equation because it means that the error terms are uncorrelated with the regressors after differencing out the fixed effect. To generate unbiased estimates of the coefficients of interest, we need to (i) difference out the fixed effect, and (ii) allow for endogenous selection at the individual level. The above assumption addresses (i). Assumptions 1 and 2 address (ii).
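
A short derivation may make point (i) concrete. It compares Equation (1) with the representation in Assumption 3 for the unconstrained regime; equating the two representations is an interpretive step taken here, not one spelled out in the text.

```latex
\begin{aligned}
\text{Equation (1):}\quad
  Y_{it}^{(0)} &= \beta' X_{it} + f_{i}^{(0)} + \kappa_{t} + u_{it}^{(0)}, \\
\text{Assumption 3:}\quad
  Y_{it}^{(0)} &= \beta' X_{it} + R_{it}\delta + \kappa_{t} + \varepsilon_{it}^{(0)}, \\
\Rightarrow\quad
  \varepsilon_{it}^{(0)} &= f_{i}^{(0)} + u_{it}^{(0)} - R_{it}\delta, \\
\Rightarrow\quad
  \Delta\varepsilon_{it}^{(0)}
    &= \Delta u_{it}^{(0)} - \Delta R_{it}\delta
  \qquad \text{for } W_{it} = W_{i,t-1} = 0 .
\end{aligned}
```

The fixed effect $f_{i}^{(0)}$ drops out of the differenced error, so the moment conditions on $\Delta\varepsilon_{it}^{(0)}$ do not involve it. The same steps applied to Equation (2) give $\Delta\varepsilon_{it}^{(1)} = \Delta u_{it}^{(1)} - \Delta R_{it}\varphi$ for individuals with $W_{it} = W_{i,t-1} = 1$.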

Assumption 4. Let $\{Q_{it}, W_{it}, X_{it}, Y_{it}, t = 1, \dots, T\}$, $i = 1, \dots, N$, be i.i.d. across individuals. Let $E(|Q_{it}|^{4}) < M$, $E(|W_{it}|^{4}) < M$, $E(|X_{it}|^{4}) < M$, and $E(|Y_{it}|^{4}) < M$ for all $t = 1, \dots, T$ and $i = 1, \dots, N$, where $M < \infty$.

The above assumption requires boundedness of the fourth moment of $Q_{it}$, $W_{it}$, $X_{it}$, and $Y_{it}$.

Theorem 1 (Consistency and asymptotic normality). Let Assumptions 1 and 3–4 hold. Then

$$\hat{\omega} \underset{p}{\to} \omega_{0} \quad \text{and} \quad \sqrt{NT}\,(\hat{\omega} - \omega_{0}) \underset{d}{\to} N(0, \Omega_{A}),$$

where $\Omega_{A}$ is positive semidefinite.

Let Assumptions 2–4 hold. Then,

$$\hat{\varpi} \underset{p}{\to} \varpi_{0} \quad \text{and} \quad \sqrt{NT}\,(\hat{\varpi} - \varpi_{0}) \underset{d}{\to} N(0, \Omega_{B}),$$

where $\Omega_{B}$ is positive semidefinite.

The main use of Theorem 1 is to extend fixed effect and random effect models to incorporate endogenous switching between regimes. Specifically, our estimator can be used as a helpful check if there is an endogeneity problem in the linear panel model: unobserved factors may impact both the selection equation and the outcome equation, leading to biased estimates, and our estimator addresses this problem. In Appendix A, we show that our estimator is consistent and asymptotically normal.

Our estimator generalizes Kyriazidou (1997). Kyriazidou (1997) imposes exchangeability of the error terms. This assumption implies that the error terms are homoscedastic. However, the outcome equation in this paper allows for conditional heteroscedasticity, and the selection equation of Assumption 1 allows for time-varying variances (see Equation (6)).¹

In order to correct the standard errors for our two step estimator, it is convenient to write the estimator as the maximum of an objective function. This method is similar to Heckman’s (1979) sample selection estimator. The objective function that is used to prove asymptotic normality of our estimator, as well as the asymptotic variance-covariance matrix, is presented in Appendix A.

In our application, however, we bootstrap the estimators. That is, we sample the data with replacement and go through step 1A and step 2 for every dataset that we generated. As Horowitz (2001, Theorem 2.2) shows, bootstrapping an asymptotically normally distributed estimator that can be represented by an influence function yields a consistent variance-covariance matrix and consistent confidence intervals.²
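
The resampling loop can be sketched as follows. This is a minimal illustration under assumptions that the text does not spell out: whole individuals are resampled, so that all periods of a household stay together (in line with the i.i.d.-across-individuals condition of Assumption 4), and `estimator` is a hypothetical placeholder for a function that runs step 1A and step 2 on a resampled dataset and returns one coefficient.

```python
import random
from statistics import stdev


def bootstrap_se(data, estimator, draws=500, seed=0):
    """Bootstrap standard error of a scalar estimator.

    data: list with one entry per individual (all T periods kept together).
    estimator: function mapping such a list to a float, for example a wrapper
    that re-runs the selection step and the differenced regression."""
    rng = random.Random(seed)  # fixed seed so that results are reproducible
    n = len(data)
    estimates = []
    for _ in range(draws):
        sample = [data[rng.randrange(n)] for _ in range(n)]  # with replacement
        estimates.append(estimator(sample))
    return stdev(estimates)
```

Re-estimating the selection equation inside every draw is what lets the bootstrap reflect the first-step estimation error in $\hat{R}_{it}$.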

3. Productivity in Peruvian Agriculture

The method presented in Section 2 generalizes fixed effects and random effects models to allow for endogenous switching. Allowing for endogenous switching is important in many empirical settings because unobserved factors may affect both the selection equation and the outcome equation of an individual. For example, a farmer may experience an illness that impacts both their credit constraints and their farm productivity.

Our model generalizes previous models to allow for both (i) fixed effects and time dummies and (ii) conditional time-varying heteroscedasticity. The previous literature had not allowed for both of these in one model. This generalization may allow our model to better fit the data, and this better fit of the data may lead to different estimates of the coefficients of interest. To illustrate how generalizing fixed effects and random effects models to allow for endogenous switching may alter the estimates of the coefficients of interest, we re-estimate a previous study (Guirkinger and Boucher 2008) that used both a linear panel estimation and Kyriazidou’s (1997) method.

3.1. Guirkinger and Boucher (2008)

To demonstrate how our methodology works, and its benefits over other ways of approaching the problem of endogenous switching in switching regressions, we apply our method to estimating farm productivity in Peru, when the farms may switch between being credit constrained and unconstrained. Specifically, we analyze the same data and ask the same question as Guirkinger and Boucher (2008): “By how much would the productivity of farmers [who were credit] constrained [have] …increase[d] if their credit constraint [had been] removed?”.

A particular challenge in answering this question is that farmers are endogenously sorted between being credit constrained and credit unconstrained. Guirkinger and Boucher (2008) described the agricultural sector in northern Peru, where “the limited liquidity of most small farmers plus the high input requirements of the commercial crops grown in the region combine[d] to make credit a critical determinant of farm production” (Guirkinger and Boucher 2008). They further noted that small farms controlled the majority of agricultural land in northern Peru, making access to credit particularly important.

The panel data analyzed by Guirkinger and Boucher (2008) were collected in 1997 and 2003. This household-level dataset contains information about whether the farmer was credit constrained, which crops were grown on her/his farm, the amount of labor used, etc. In total, 443 households have farm production data for both 1997 and 2003. In 1997, 27.5% of those 443 households had a formal loan, while 25% of the 443 households had a formal loan in 2003. The average net revenue per-hectare of credit unconstrained farmers was about USD 350 higher than for credit constrained farmers. However, as Guirkinger and Boucher (2008) noted, this difference is a naive comparison that does not take into account selection into credit constraints.

To address the selection issue, Guirkinger and Boucher (2008) estimated a switching regression model. They described how theory suggests that for households that were not credit constrained, there was no relationship between farm production and endowments, while for households that were credit constrained, farm production had an inverse relationship with land endowments and a positive relationship with liquidity endowments. The coefficients on these endowment variables were their parameters of interest. They discussed how generating unbiased estimates of these coefficients was a challenge because of (i) correlation between household fixed effects for constrained and unconstrained households, and (ii) endogenous selection into being credit constrained or unconstrained.

Guirkinger and Boucher (2008) applied both a linear fixed effects model and Kyriazidou’s (1997) model to estimate the effect of removing credit constraints on the productivity of credit constrained farmers. We follow Guirkinger and Boucher’s (2008) specification and use regression coefficients that are constant over time. The only difference between our specification and that of Guirkinger and Boucher (2008) is that we apply the methodology of the previous sections: we add a selection equation to account for the endogeneity that arises when unobserved factors affect both the selection equation and the outcome equation.

A linear panel model allowed Guirkinger and Boucher (2008) to estimate the productivity increase caused by removing credit constraints for farmers who were constrained. Guirkinger and Boucher (2008) found that this productivity increase was 26%. However, this result relied on the assumption that the fixed effects of constrained and unconstrained households were the same. On the other hand, Kyriazidou’s (1997) method did not allow for estimating the productivity increase caused by removing credit constraints. It assumed that the propensity for a household to be credit constrained did not change over time. However, consider a farmer who experienced a serious illness between 1997 and 2003. This illness may have impacted both the farmer’s credit constraints (the selection equation) and their farm productivity (the outcome equation). Kyriazidou’s (1997) method did not allow for this type of event. Both a linear panel model and Kyriazidou’s (1997) model imposed strong assumptions, which we relax.

Specifically, we incorporate a selection equation into a fixed effects and random effects model to generalize these models to account for endogenous selection. Further, to allow the propensity of a household to be credit constrained to vary over time in order to account for events like a farmer’s illness shock, we relax Kyriazidou’s (1997) assumption of homoscedasticity to allow for conditional time-varying heteroscedasticity.

We then re-estimate the impact of removing credit constraints on farm productivity with Guirkinger and Boucher’s (2008) dataset. Our method shows that accounting for endogenous selection into credit constraints decreases the estimated impact of removing those constraints on farm productivity for constrained households, relative to Guirkinger and Boucher’s (2008) results, which do not account for selection into credit constraints changing over time. Guirkinger and Boucher (2008) found that removing credit constraints increased the value of farm output by 26%, using a linear panel model that assumes that the fixed effects of constrained and unconstrained households were the same. Using the same dataset, we find that removing credit constraints increases the value of farm output by only 11%. The difference between their estimate and ours arises because our more general model takes into account endogenous switching between being credit constrained and unconstrained. Specifically, switching from being credit constrained to credit unconstrained could depend on unobserved factors that also affect farm productivity, and our model accounts for this endogeneity problem.

3.2. Results from the Application

We estimate the impact of removing credit constraints on farm productivity. The results are reported in Table 1. The first column reports the first stage Probit regression, estimated by step 1A of our method. This first stage is the selection equation which predicts if a farmer is credit constrained or unconstrained. The selection equation could also be estimated using step 1B with exogenous regressors and correlated random effects, but we use predetermined regressors in our application to more closely follow Guirkinger and Boucher (2008).

The second and third columns of Table 1 report columns D and E of Table 6 in Guirkinger and Boucher (2008). These columns give the estimates for the marginal impact of land and liquidity endowments on productivity for credit unconstrained (column D) and constrained (column E) farms, using the linear panel method employed by Guirkinger and Boucher (2008). For example, column D reports that the marginal impact of a farm’s land endowment in hectares on the farm’s output value was −85.57: Guirkinger and Boucher (2008) found that as the farm increased in size, the marginal impact of an additional hectare decreased farm output by USD 86 for credit unconstrained farmers.

The fourth and fifth columns report our corresponding estimates of the marginal impacts. We call these columns D’ and E’. These columns represent the outcome equation that incorporates the selection equation estimated in column 1. Specifically, the difference between Guirkinger and Boucher’s (2008) estimates of the marginal impact of endowments reported in columns D and E and our estimates reported in columns D’ and E’ is that our estimates explicitly incorporate a selection equation to allow for endogenous switching between being credit constrained and unconstrained, and Guirkinger and Boucher’s (2008) estimates did not. For example, while Guirkinger and Boucher (2008) found that the marginal impact of an additional hectare decreased farm output by USD 86 for farmers who are not credit constrained, using a selection equation, we find that this marginal impact decreases farm output by USD 131 for the same group.

The sixth and seventh columns of Table 1 report columns F and G of Table 6 from Guirkinger and Boucher (2008). These columns use a per-hectare linear panel model to give the marginal impact of the household’s endowment of land and liquidity per unit of land on the value of farm output per hectare. The last two columns of Table 1 report the corresponding estimates using our method, which accounts for endogenous selection into being credit constrained or unconstrained. We call these columns F’ and G’. As discussed in Section 2, the standard errors are calculated by bootstrapping the two-step estimation together.
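To illustrate what bootstrapping the two steps together means, the following is a minimal sketch under stated assumptions. The function `two_step` is a deliberately simple placeholder (a linear first stage followed by a regression on the fitted value), not the estimator of Section 2, and the names `ols_line`, `two_step`, `bootstrap_se`, and `make_data` are hypothetical. The point is only that each bootstrap draw resamples whole households and re-runs both steps, so that first-step estimation error carries through to the second-step standard error.

```python
import random
import statistics


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_step(sample):
    """Toy two-step estimator (a placeholder, not the paper's estimator).

    sample: list of (z, w, y) tuples, one tuple per household.
    """
    z = [row[0] for row in sample]
    w = [row[1] for row in sample]
    y = [row[2] for row in sample]
    a0, a1 = ols_line(z, w)               # step 1: stand-in for the selection equation
    fitted = [a0 + a1 * zi for zi in z]   # generated regressor from step 1
    _, slope = ols_line(fitted, y)        # step 2: outcome equation
    return slope


def bootstrap_se(sample, estimator, n_boot=200, seed=0):
    """Bootstrap standard error: resample households, redo BOTH steps each time."""
    rng = random.Random(seed)
    n = len(sample)
    draws = []
    for _ in range(n_boot):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        draws.append(estimator(resample))
    return statistics.stdev(draws)


def make_data(n=300, seed=1):
    """Simulated households; all numbers here are illustrative assumptions."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        w = 1.0 if z + rng.gauss(0.0, 1.0) > 0.0 else 0.0
        y = 2.0 * w + rng.gauss(0.0, 1.0)
        data.append((z, w, y))
    return data


data = make_data()
print("estimate:", two_step(data))
print("bootstrap s.e.:", bootstrap_se(data, two_step))
```

In a panel setting, each resampled unit would carry all of that household's periods, so that within-household dependence is preserved in every draw.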

Our first result is that adding a selection equation to the model dramatically shrinks the differences between the coefficients of the credit constrained and unconstrained farmers. The coefficients of interest represent the marginal impacts of the farm’s endowments of land and liquidity on farm output, and adding a selection equation decreases the difference between credit constrained and unconstrained farmers in these marginal impacts. In other words, accounting for selection into credit constraints makes credit unconstrained farmers “look more like” constrained farmers in terms of their relationship between endowments and farm output. The difference between the estimates for the marginal impact of land size endowments from columns D’ and E’ (our estimates) is smaller than the difference between columns D and E (Guirkinger and Boucher’s (2008) linear panel estimates). Specifically, we find that the difference between the marginal impact of land endowments on farm output for constrained and unconstrained farmers is USD 3.62, while Guirkinger and Boucher (2008) found that the difference is USD 45.05. This difference comes from adding a selection equation that takes into account unobserved factors that may impact both the propensity to be credit constrained and the value of farm output.
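As a quick check of the arithmetic, both gaps follow directly from the land endowment (A) row of Table 1:

$$\begin{aligned} |D - E| &= |-85.57 - (-130.62)| = 45.05, \\ |D' - E'| &= |-131.03 - (-134.65)| = 3.62. \end{aligned}$$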

This finding has important policy implications because the effect of being credit constrained is not as large as previously thought. This decrease in the impact of credit constraints is due to specifically incorporating endogenous selection into the model: a farmer may experience a health shock or other unobserved factor which influences both their credit constraints and their farm productivity. Not incorporating this endogenous selection may overstate the impact of removing credit constraints.
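The mechanism can be illustrated with a small simulation. This is a sketch under assumed numbers, not a reproduction of the Peruvian data: the function name `simulate_naive_gap`, the sample size, and all parameter values are hypothetical. An unobserved shock (think of the health shock above) both lowers output and makes a household more likely to be constrained, so the raw gap in mean output between unconstrained and constrained households exceeds the true gain from removing the constraint.

```python
import random
import statistics


def simulate_naive_gap(n=20000, true_gain=10.0, seed=0):
    """Return (naive_gap, true_gain) for a simulated cross-section of households."""
    rng = random.Random(seed)
    constrained_y, unconstrained_y = [], []
    for _ in range(n):
        shock = rng.gauss(0.0, 1.0)                            # unobserved, e.g. health
        constrained = shock + rng.gauss(0.0, 1.0) < 0.0        # selection depends on the shock
        y = 100.0 + 10.0 * shock + rng.gauss(0.0, 5.0)         # output also depends on the shock
        if constrained:
            constrained_y.append(y - true_gain)                # the constraint itself lowers output
        else:
            unconstrained_y.append(y)
    naive_gap = statistics.fmean(unconstrained_y) - statistics.fmean(constrained_y)
    return naive_gap, true_gain


naive_gap, true_gain = simulate_naive_gap()
print(f"naive gap: {naive_gap:.2f}, true gain: {true_gain:.2f}")
```

Because the shock enters both equations with the same sign, the naive gap mixes the true gain with the difference in average shocks between the two groups, which is the overstatement described above.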

Our statistical test confirms these results. We use a Wald test with 11 degrees of freedom to test the statistical difference between the marginal impacts for credit constrained farmers and credit unconstrained farmers. We implement this test for both the original Guirkinger and Boucher (2008) paper that did not include a selection equation, and for our results, which do include a selection equation, in order to understand how accounting for endogenous selection changes the difference in the marginal impacts of endowments on farm output between credit constrained and unconstrained farmers. Using the specification of Guirkinger and Boucher (2008), we find that the value of the Wald test is 56.62 (comparing columns D and E in Table 1) and 61.17 (comparing columns F and G of Table 1). Adding the selection equation reduces these numbers to 22.92 (comparing columns D’ and E’ of Table 1) and 25.96 (comparing columns F’ and G’ of Table 1) (see Note 3). Thus, the differences in the marginal impact of endowments on farm output between credit constrained and unconstrained farmers do not completely go away, but are substantially reduced.
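The p-values reported in Note 3 can be recovered from the four Wald statistics. The sketch below assumes the usual chi-square reference distribution with 11 degrees of freedom; `chi2_sf` is a hypothetical helper written with the standard library only, using the recurrence for the regularized upper incomplete gamma function.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)               # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # closed form for df = 1
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1),
    # applied until a reaches df / 2.
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Wald statistics quoted in the text, each with 11 degrees of freedom.
wald_stats = {"D vs E": 56.62, "F vs G": 61.17, "D' vs E'": 22.92, "F' vs G'": 25.96}
p_values = {name: chi2_sf(stat, 11) for name, stat in wald_stats.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.4f}")
```

With the selection equation included, the statistics remain above the 5% critical value for 11 degrees of freedom, which is consistent with the statement that the differences shrink but do not vanish.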

Next, we estimate a counterfactual of how farm productivity would change if credit constrained farmers were to become unconstrained. Using our estimator, we find that removing credit constraints would increase productivity by 10.6%, which is substantially smaller than the estimate of 26% from Guirkinger and Boucher (2008). This substantially smaller estimate of the impact of removing credit constraints is our second main result. Moreover, the 10.6% increase we find is not statistically significant.

Guirkinger and Boucher (2008) used the results from their linear panel model (columns D and E in Table 1) to predict the impact of removing credit constraints for constrained households. To estimate this impact, they assumed that the fixed effects of credit constrained and unconstrained households were equal. However, including a selection equation as we do allows us to difference out the fixed effect in step 2 of our method, so we do not assume that the fixed effects of constrained and unconstrained households are equal. We use the same methodology to calculate the change in productivity as Guirkinger and Boucher (2008), but we use the new estimator for the parameter values, using step 1A and step 2. Specifically, we use our estimates of the marginal impact of endowments on the value of farm output from columns D’ and E’ of Table 1. These estimates incorporate the impact of adding a selection equation (column 1). Adding this selection equation means that we are able to account for endogenous switching, and eliminating this endogeneity reduces the estimated impact of removing credit constraints on constrained farmers.

To further contextualize our results, consider the counterfactual posed by Guirkinger and Boucher (2008) and re-addressed in this paper: how would the value of farm output have changed if credit constraints had been removed from credit constrained farmers? We might not expect these farmers’ output, after removing credit constraints, to be the same as the output of farmers who were not credit constrained, because we might expect credit constrained farmers to be unobservably different from credit unconstrained farmers in a way that impacts their farm productivity. Because Guirkinger and Boucher (2008) assumed that the fixed effects of credit constrained and unconstrained farmers were equal, they could not allow for this unobserved difference, which may have caused them to overestimate the impact of removing credit constraints. However, our model can allow for this unobserved difference using a selection equation as a first stage, leading to our qualitatively lower counterfactual estimate of the impact of removing credit constraints.

A full comparison between our estimates for the change in productivity due to removing credit constraints and Guirkinger and Boucher’s (2008) estimates can be seen in Table 2. Columns B, C, and E report Guirkinger and Boucher’s (2008) estimates from columns B, C, and E of their Table 7. Their estimates are reported in the unprimed columns (B, C, and E), and our corresponding results, using our estimator that accounts for endogenous selection, are reported in the primed columns that follow (B’, C’, and E’). Columns B and B’ of Table 2 are estimates of the absolute productivity change, by type of credit constraint, if the credit constraints were removed. Columns C and C’ of Table 2 are the relative productivity increases for each type of constrained household if it were to become unconstrained. The total estimated impact on productivity due to removing credit constraints is found in columns E and E’, which include our overall estimate of 10.6%. This result represents a qualitative difference from Guirkinger and Boucher’s (2008) estimate of the same measure, reported in column E of Table 2, of 26%. Across all categories in Table 2, we find a qualitatively smaller result from removing credit constraints using our estimator that accounts for selection into being credit constrained than do Guirkinger and Boucher (2008), who did not account for selection. In general, our point estimates are roughly one-quarter to one-half of Guirkinger and Boucher’s (2008) estimates of the impact of removing credit constraints, indicating that endogenous selection was a significant element of this setting. This endogenous selection demonstrates a benefit of our estimator.
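The relative size of the two sets of estimates can be read off Table 2 directly. The sketch below simply divides our productivity change (column B’) by the corresponding estimate of Guirkinger and Boucher (2008) (column B) for each row; the variable names are hypothetical and the numbers are copied from the table.

```python
# (column B, column B') from Table 2: productivity change if constraints were removed.
productivity_change = {
    "Quantity rationed": (516.28, 256.37),
    "Risk rationed": (477.71, 129.96),
    "Transaction cost rationed": (412.75, 160.29),
    "Constrained": (482.24, 196.41),
}

# Ratio of our estimate to the earlier estimate, row by row.
ratios = {
    row: ours / earlier
    for row, (earlier, ours) in productivity_change.items()
}

# Same ratio for the overall impact on regional output (columns E and E').
regional_output_ratio = 10.60 / 26.0

for row, ratio in ratios.items():
    print(f"{row}: {ratio:.2f}")
print(f"Regional output (all constrained): {regional_output_ratio:.2f}")
```

These are ratios of point estimates only; the bracketed standard errors in Table 2 are wide, so the ratios themselves are imprecisely estimated.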

Applying our estimator to a dataset on productivity in Peruvian agriculture changes the quantitative and qualitative conclusions compared to earlier results on the same dataset. In particular, adding a selection equation to a model with fixed effects causes the coefficients of the credit constrained farmers to be quite similar to the coefficients of the unconstrained farmers. Because Guirkinger and Boucher (2008) applied Kyriazidou’s (1997) method, they did not allow for conditional heteroscedasticity in the outcome equation. We allow for conditional time-varying heteroscedasticity. Further, when Guirkinger and Boucher (2008) used a linear panel method, they assumed that the fixed effects had the same impact on productivity for credit constrained and unconstrained households. Under this assumption, they found that removing credit constraints would have increased farm productivity by 26%. However, we relax the assumption that the fixed effects had the same impact on productivity for constrained and unconstrained households by explicitly allowing for selection, and we find that relaxing this assumption decreases the estimate of the impact of removing credit constraints on farm productivity from 26% to 11%. Intuitively, the reason that our estimate of the impact of removing credit constraints is smaller than that of Guirkinger and Boucher (2008) is that we captured the selection issue: switching between being credit constrained and credit unconstrained may be endogenous to farm production. If a farmer experienced a negative shock, such as an illness, it may have impacted both their selection into being credit constrained versus unconstrained and their productivity. Therefore, removing the credit constraints of constrained farmers may not increase their productivity up to the level of credit unconstrained farmers.

4. Conclusions

We propose an estimator for the endogenous switching regression models with fixed effects. The estimator allows for endogenous selection and for conditional heteroscedasticity in the outcome equation. Applying the estimator to a dataset on the productivity in Peruvian agriculture shows that the new, more general estimator substantially changes the quantitative and qualitative conclusions compared to the earlier analysis of the same dataset. In particular, adding a selection equation to a model with fixed effects causes the coefficients of the credit constrained farmers to be similar to those of the unconstrained farmers, demonstrating the importance of having a selection equation.

This result has important policy implications, both in terms of credit constraints in agriculture specifically, and more broadly in terms of the importance of accounting for endogenous selection in settings that involve switching between two regimes. We show that accounting for unobserved farmer characteristics that influence both credit constraints and farm output greatly reduces the benefits to removing credit constraints. More generally, many policy settings involve estimating the impact of moving individuals or households from one regime to another, including increasing the scope of healthcare subsidies, job training programs, and government food assistance. We demonstrate that accounting for endogenous selection into those programs may be important for understanding the policy outcomes associated with them.

Relaxing the single index assumption of the selection equation by extending Altonji and Matzkin (2005) is left for future research, as is developing a model that allows for selection between more than two regimes.

Author Contributions

Conceptualization, T.W. and S.R.K.; methodology, T.W.; validation, T.W. and K.H.; formal analysis, T.W. and K.H.; writing—original draft preparation, T.W. and K.H.; writing—review and editing, T.W. and K.H.; supervision, T.W.; project administration, T.W. and K.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Guirkinger and Boucher (2008) published the data used in this article. Further inquiries can be directed to the corresponding author.

Acknowledgments

We thank Wesley Blundell, William J. Martin, Ming-sen Wang, and Roula Yazigi for helpful comments and discussions.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Consistency and Asymptotic Normality Results

In order to prove consistency and asymptotic normality of our two-step estimator, it is convenient to write the estimator as the maximum of an objective function. Let Assumptions 1 and 3–4 hold. Define

$$S(ω) = \frac{1}{N} \sum_{i} S_{i}(ω),$$

where

$$\begin{aligned} S_{i}(ω) = {} & \frac{1}{T} \sum_{t} \ln \left[ Φ\left(γ' Q_{it} + \frac{1}{T} \sum_{t=1}^{T} ψ_{t}' Q_{it}\right)^{W_{it}} \left\{1 - Φ\left(γ' Q_{it} + \frac{1}{T} \sum_{t=1}^{T} ψ_{t}' Q_{it}\right)\right\}^{1 - W_{it}} \right] \\ & - \frac{1}{T} \sum_{t} 1(ΔY_{i}^{(0)} \text{ observed}) \cdot \left\{ΔY_{i}^{(0)} - (β' ΔX_{it} + ΔR_{it} δ + Δκ_{t})\right\}^{2} \\ & - \frac{1}{T} \sum_{t} 1(ΔY_{i}^{(1)} \text{ observed}) \cdot \left\{ΔY_{i}^{(1)} - (α' ΔX_{it} + ΔR_{it} φ + Δτ_{t})\right\}^{2}. \end{aligned}$$

Let $\frac{\partial S(ω)}{\partial ω} |_{ω = ω_{0}}$ denote the derivative of $S(ω)$ with respect to $ω$, evaluated at the true value $ω_{0}$. Let $Λ$ denote the variance-covariance matrix of $\sqrt{N} \frac{\partial S(ω)}{\partial ω} |_{ω = ω_{0}}$. Remember that $ω = \{α, β, γ_{1}, \dots, γ_{T}, δ, κ, τ, φ\}'$. Let $H_{1}$ denote the second derivative of $E\{S(ω)\}$ with respect to the vector $γ = \{γ_{1}, \dots, γ_{T}\}'$, evaluated at the true value of $γ$, i.e.,

$$H_{1} = \partial^{2} E\{S(ω)\} / \partial γ \partial γ' |_{γ = γ_{0}}.$$

Similarly, let $ω_{\text{subset}} = \{α, β, δ, κ, τ, φ\}'$ and

$$H_{2} = \partial^{2} E\{S_{2}(ω_{\text{subset}})\} / \partial ω_{\text{subset}} \partial ω_{\text{subset}}' |_{ω_{\text{subset}} = ω_{\text{subset}, 0}}.$$

Thus, the Hessian of $E\{S(ω)\}$ has the following form:

$$\partial^{2} E\{S(ω)\} / \partial ω \partial ω' |_{ω = ω_{0}} = \begin{pmatrix} H_{1} & 0 \\ 0 & H_{2} \end{pmatrix}.$$

Note that maximizing $S(ω)$ with respect to $ω$ is the same as maximizing

$$\frac{1}{N} \sum_{i} \frac{1}{T} \sum_{t} \ln \left[ Φ\left(γ' Q_{it} + \frac{1}{T} \sum_{t=1}^{T} ψ_{t}' Q_{it}\right)^{W_{it}} \left\{1 - Φ\left(γ' Q_{it} + \frac{1}{T} \sum_{t=1}^{T} ψ_{t}' Q_{it}\right)\right\}^{1 - W_{it}} \right]$$

with respect to $γ$ and maximizing

$$\begin{aligned} & - \frac{1}{N} \sum_{i} \frac{1}{T} \sum_{t} 1(ΔY_{i}^{(0)} \text{ observed}) \cdot \left\{ΔY_{i}^{(0)} - (β' ΔX_{it} + ΔR_{it} δ + Δκ_{t})\right\}^{2} \\ & - \frac{1}{N} \sum_{i} \frac{1}{T} \sum_{t} 1(ΔY_{i}^{(1)} \text{ observed}) \cdot \left\{ΔY_{i}^{(1)} - (α' ΔX_{it} + ΔR_{it} φ + Δτ_{t})\right\}^{2} \end{aligned}$$

with respect to $ω_{\text{subset}}$. Thus, our ‘first step estimator’ is equivalent to the estimator considered by Newey and McFadden (1994, example 1.2). Their conditions are satisfied so that $\hat{γ} - γ_{0} = o_{p}(1)$. Next, note that

$$\begin{aligned} & - \frac{1}{N} \sum_{i} \frac{1}{T} \sum_{t} 1(ΔY_{i}^{(0)} \text{ observed}) \cdot \left\{ΔY_{i}^{(0)} - (β' ΔX_{it} + ΔR_{it} δ + Δκ_{t})\right\}^{2} \\ & - \frac{1}{N} \sum_{i} \frac{1}{T} \sum_{t} 1(ΔY_{i}^{(1)} \text{ observed}) \cdot \left\{ΔY_{i}^{(1)} - (α' ΔX_{it} + ΔR_{it} φ + Δτ_{t})\right\}^{2} \end{aligned}$$

is a concave function. The i.i.d. assumption and the assumption on the moments ensure that this expression converges in probability to its expectation. The full rank assumption then ensures that this expectation has a unique maximum at the true value. Thus, all the assumptions of Newey and McFadden (1994, Theorem 2.7) are satisfied so that ${\hat{ω}}_{\text{subset}} - ω_{\text{subset}} = o_{p}(1)$.

Concerning the asymptotic normality: Note that $S(ω)$ is twice continuously differentiable. Interpret the first derivative as a moment, and note that all the assumptions of Newey and McFadden (1994, Theorem 3.4) are satisfied. The asymptotic normality follows, and

$$Ω = \begin{pmatrix} H_{1}^{-1} & 0 \\ 0 & H_{2}^{-1} \end{pmatrix} Λ \begin{pmatrix} H_{1}^{-1} & 0 \\ 0 & H_{2}^{-1} \end{pmatrix}.$$

The nonzero variation follows from the strictly positive variation of the error terms. The matrix $Ω$ can be estimated using a sample analogue. In particular, define

$$\begin{aligned} {\hat{H}}_{1} &= \frac{\partial^{2} S(ω)}{\partial γ \partial γ'} \Big|_{ω = \hat{ω}}, \\ {\hat{H}}_{2} &= \frac{\partial^{2} S(ω)}{\partial ω_{\text{subset}} \partial ω_{\text{subset}}'} \Big|_{ω = \hat{ω}}, \\ \hat{Λ} &= \begin{pmatrix} \frac{1}{N} \sum_{i} \frac{\partial S_{i}(γ)}{\partial γ} \frac{\partial S_{i}(γ)}{\partial γ'} & \frac{1}{N} \sum_{i} \frac{\partial S_{i}(ω_{\text{subset}})}{\partial ω_{\text{subset}}} \frac{\partial S_{i}(γ)}{\partial γ'} \\ \frac{1}{N} \sum_{i} \frac{\partial S_{i}(γ)}{\partial γ} \frac{\partial S_{i}(ω_{\text{subset}})}{\partial ω_{\text{subset}}'} & \frac{1}{N} \sum_{i} \frac{\partial S_{2,i}(ω_{\text{subset}})}{\partial ω_{\text{subset}}} \frac{\partial S_{2,i}(ω_{\text{subset}})}{\partial ω_{\text{subset}}} \end{pmatrix} \Big|_{ω = \hat{ω}}, \end{aligned}$$

and

$$\hat{Ω} = \begin{pmatrix} {\hat{H}}_{1}^{-1} & 0 \\ 0 & {\hat{H}}_{2}^{-1} \end{pmatrix} \hat{Λ} \begin{pmatrix} {\hat{H}}_{1}^{-1} & 0 \\ 0 & {\hat{H}}_{2}^{-1} \end{pmatrix}.$$

Newey and McFadden (1994, Theorem 4.5) yields that $\hat{Ω} = Ω + o_{p}(1)$.
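To make the sample-analogue construction concrete, the following minimal sketch computes the sandwich form in the simplest scalar case. It is an illustration under assumptions, not the paper's estimator: the objective is $S(b) = -\frac{1}{N}\sum_{i}(y_{i} - b x_{i})^{2}$ for a single slope with no intercept, the data are simulated, and the function name `sandwich_variance` is hypothetical. The per-observation score plays the role of $\partial S_{i}/\partial ω$, its mean square plays the role of $\hat{Λ}$, and the mean second derivative plays the role of $\hat{H}$.

```python
import math
import random
import statistics


def sandwich_variance(x, y):
    """Slope estimate, sandwich variance H^{-1} * Lambda * H^{-1}, and standard error."""
    n = len(x)
    # Maximizer of S(b) = -(1/N) * sum_i (y_i - b * x_i)^2.
    b_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    # Per-observation score dS_i/db evaluated at b_hat.
    scores = [2.0 * xi * (yi - b_hat * xi) for xi, yi in zip(x, y)]
    # Sample analogue of the Hessian: mean of d2S_i/db2 = -2 * x_i^2.
    h_hat = -2.0 * statistics.fmean([xi * xi for xi in x])
    # Sample analogue of Lambda: mean squared score.
    lam_hat = statistics.fmean([s * s for s in scores])
    omega_hat = lam_hat / (h_hat * h_hat)
    # omega_hat estimates the variance of sqrt(N) * (b_hat - b_0).
    std_err = math.sqrt(omega_hat / n)
    return b_hat, omega_hat, std_err


# Simulated data with heteroscedastic errors (all values are assumptions).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
y = [1.5 * xi + abs(xi) * rng.gauss(0.0, 1.0) for xi in x]
b_hat, omega_hat, std_err = sandwich_variance(x, y)
print(f"b_hat = {b_hat:.3f}, s.e. = {std_err:.3f}")
```

In the vector case the same steps apply with the outer product of the score vectors in place of the squared score and matrix inverses in place of the scalar division.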

Now, suppose that Assumptions 2–4 hold. Define

$$\bar{S}(ϖ) = \frac{1}{N} \sum_{i} {\bar{S}}_{i}(ϖ),$$

where

$$\begin{aligned} {\bar{S}}_{i}(ϖ) = {} & \frac{1}{T} \sum_{t} \ln \left[ Φ\left(γ' Q_{it} + \frac{1}{T} \sum_{t=1}^{T} ψ_{t}' Q_{it}\right)^{W_{it}} \left\{1 - Φ\left(γ' Q_{it} + \frac{1}{T} \sum_{t=1}^{T} ψ_{t}' Q_{it}\right)\right\}^{1 - W_{it}} \right] \\ & - \frac{1}{T} \sum_{t} 1(ΔY_{i}^{(0)} \text{ observed}) \cdot \left\{ΔY_{i}^{(0)} - (β' ΔX_{it} + ΔR_{it} δ + Δκ_{t})\right\}^{2} \\ & - \frac{1}{T} \sum_{t} 1(ΔY_{i}^{(1)} \text{ observed}) \cdot \left\{ΔY_{i}^{(1)} - (α' ΔX_{it} + ΔR_{it} φ + Δτ_{t})\right\}^{2}. \end{aligned}$$

Remember that $ϖ = \{α, β, γ_{1}, \dots, γ_{T}, ψ_{1}, \dots, ψ_{T}, δ, κ, τ, φ\}'$. The same reasoning as above holds, but now the consistency of the first step follows from Chamberlain (1980). The function $\bar{S}(ϖ)$ is twice continuously differentiable, and $Ω$ can again be consistently estimated by its sample analogue. Also, the conditions of Horowitz (2001, Theorem 2.2) are satisfied since the estimator is asymptotically normally distributed and this normality follows from an averaging operator, in particular from applying the Lindeberg–Lévy central limit theorem to $\frac{1}{\sqrt{N}} \sum_{i} \frac{\partial S_{i}(ω)}{\partial ω}$ and $\frac{1}{\sqrt{N}} \sum_{i} \frac{\partial {\bar{S}}_{i}(ϖ)}{\partial ϖ}$.

Notes

1	Dustmann and Rochina-Barrachina (2007) discuss empirical identification issues with Kyriazidou’s estimator.
2	Horowitz (2001, Theorem 2.2) averages $g_{n} (X_{i})$ .
3	The p-values for the Wald tests are 0.000, 0.000, 0.0181, and 0.0066, respectively.

References

Adamchik, Vera A., and Arjun S. Bedi. 2000. Wage differentials between the public and the private sectors: Evidence from an economy in transition. Labour Economics 7: 203–24.
Altonji, Joseph G., and Rosa L. Matzkin. 2005. Cross section and panel data estimators for nonseparable models with endogenous regressors. Econometrica 73: 1053–102.
Chamberlain, G. 1980. Analysis of covariance with qualitative data. Review of Economic Studies 47: 225–38.
Charlier, Erwin, Bertrand Melenberg, and Arthur van Soest. 2001. An analysis of housing expenditure using semiparametric models and panel data. Journal of Econometrics 101: 71–107.
De Jong, Robert M., and Tiemen Woutersen. 2011. Dynamic Time Series Binary Choice. Econometric Theory 27: 673–702.
Dustmann, Christian, and María Engracia Rochina-Barrachina. 2007. Selection correction in panel data models: An application to the estimation of females’ wage equations. Econometrics Journal 10: 263–93.
Feder, Gershon, Lawrence J. Lau, Justin Y. Lin, and Xiaopeng Luo. 1990. The relationship between credit and productivity in Chinese agriculture: A microeconomic model of disequilibrium. American Journal of Agricultural Economics 72: 1151–57.
Feder, Gershon, Tongroj Onchan, and Tejaswi Raparla. 1988. Collateral, guaranties and rural credit in developing countries: Evidence from Asia. Agricultural Economics 2: 231–45.
Guirkinger, C., and S. R. Boucher. 2008. Credit Constraints and Productivity in Peruvian Agriculture. Agricultural Economics 39: 295–308.
Heckman, J. J. 1979. Sample Selection Bias as a Specification Error. Econometrica 47: 153–61.
Horowitz, Joel L. 2001. The Bootstrap. In Handbook of Econometrics. Edited by J. J. Heckman and E. Leamer. Amsterdam: North-Holland, vol. 5.
Kyriazidou, Ekaterini. 1997. Estimation of a Panel Data Sample Selection Model. Econometrica 65: 1335–64.
Lee, Lung-Fei. 1978. Unionism and Wage Rates: A Simultaneous Equations Model with Qualitative and Limited Dependent Variables. International Economic Review 19: 415–33.
Maddala, G. 1983. Limited-Dependent and Qualitative Variables in Econometrics. Econometric Society Monographs No. 3. New York: Cambridge University Press.
Maddala, Gangadharrao S., and Forrest D. Nelson. 1974. Maximum likelihood methods for models of markets in disequilibrium. Econometrica 42: 1013–30.
Mundlak, Yair. 1978. On the pooling of time series and cross section data. Econometrica 46: 69–85.
Newey, Whitney K., and Daniel McFadden. 1994. Large Sample Estimation and Hypothesis Testing. In Handbook of Econometrics. Edited by R. F. Engle and D. McFadden. Amsterdam: North-Holland, vol. 4.
Seck, Abdoulaye. 2019. Heterogeneous credit constraints and smallholder farming productivity in the Senegal River Valley. Emerging Markets Finance and Trade 57: 3301–19.
Sekyi, Samuel, Benjamin Musah Abu, and Paul Kwame Nkegbe. 2017. Farm Credit Access, Credit Constraint and Productivity in Ghana: Empirical Evidence from Northern Savannah Ecological Zone. Agricultural Finance Review 77: 446–62.
Semykina, Anastasia, and Jeffrey M. Wooldridge. 2010. Estimating panel data models in the presence of endogeneity and selection. Journal of Econometrics 157: 375–80.
Wooldridge, Jeffrey M. 1995. Selection corrections for panel data models under conditional mean independence assumptions. Journal of Econometrics 68: 115–32.
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data, 2nd ed. Cambridge, MA: The MIT Press.
Zabatantou Louyindoula, Hardy, Charles Alexis Bouity, and Fernand Owonda. 2023. Impact of Agricultural Credit on Productivity. Theoretical Economics Letters 13: 1434–62.

Table 1. Estimation results for productivity.

	Probit (Constant)	(D) Unconstrained Productivity	(E) Constrained Productivity	(D’) Unconstrained Productivity	(E’) Constrained Productivity	(F) Unconstrained Productivity	(G) Constrained Productivity	(F’) Unconstrained Productivity	(G’) Constrained Productivity
	b/se	b/se	b/se	b/se	b/se	b/se	b/se	b/se	b/se
A	0.00	−85.57	−130.62 **	−131.03	−134.65	−88.65	−116.20 *	−137.92	−122.15
	(0.01)	(58.78)	(48.65)	(96.25)	(94.43)	(59.57)	(48.20)	(93.92)	(86.10)
K		14.45	182.67 *	14.61	133.58
		(13.26)	(82.27)	(38.59)	(118.76)
K/A						−24.28	645.98 **	158.88	586.26
						(127.11)	(236.42)	(210.46)	(394.83)
Labor/A						7.72	−34.93	−18.42	−19.27
						(42.06)	(33.08)	(98.43)	(86.17)
Labor	−0.01	−61.50	2.33	−45.55	−15.40
	(0.02)	(48.74)	(38.08)	(69.31)	(58.54)
Depend. ratio	0.11	490.90	10.34	397.34	−67.39	696.21	29.14	472.43	14.82
	(0.23)	(418.87)	(330.09)	(610.32)	(404.30)	(404.46)	(318.23)	(596.20)	(386.36)
Reg income	−0.00	263.85	−14.79	49.97	−30.66	244.34	32.22	52.85	−37.26
	(0.13)	(167.80)	(214.76)	(346.38)	(283.65)	(171.22)	(202.05)	(356.85)	(265.71)
Herd size	0.01	53.66 **	40.89	42.73	52.57	56.55 **	38.94	44.45	54.61
	(0.01)	(20.12)	(21.76)	(55.64)	(53.32)	(20.31)	(21.03)	(53.65)	(49.71)
Rice		632.30 *	93.30	763.05 *	74.44	672.66 *	115.48	713.98	114.22
		(252.99)	(147.59)	(361.37)	(276.05)	(268.99)	(146.88)	(381.22)	(268.85)
Cotton		−279.51	−27.99	−616.28	−292.07	−236.15	−23.86	−681.16	−323.61
		(223.06)	(153.51)	(340.81)	(235.61)	(226.56)	(147.37)	(363.86)	(217.68)
Banana		−374.69	754.11 **	−368.99	669.92	−395.52	759.00 **	−366.49	688.66
		(267.39)	(275.51)	(575.70)	(681.77)	(272.64)	(271.02)	(589.59)	(691.07)
Corn		61.04	−64.55	−0.29	−164.73	12.66	−38.69	−25.66	−137.48
		(186.70)	(117.97)	(280.17)	(188.64)	(185.75)	(116.87)	(280.70)	(193.14)
Durables	−0.02 *	5.03	4.83	24.98	−28.85	5.92 *	6.38	23.62	−37.18
	(0.01)	(2.67)	(28.11)	(29.03)	(44.61)	(2.64)	(26.34)	(26.47)	(45.08)
Constant	0.66 ***	1493.80 ***	977.59 ***	406.41 **	357.68 *	1220.67 ***	948.34 ***	461.99 **	374.78 *
	(0.14)	(344.02)	(262.72)	(150.74)	(158.21)	(344.68)	(248.31)	(157.09)	(157.97)
Title	−0.41 ***
	(0.10)
Network	−1.25 ***
	(0.19)
$δ$				−532.09	421.59			−555.28	497.45
				(667.13)	(498.24)			(693.75)	(464.47)

* means statistically significant at the 10% level. ** means statistically significant at the 5% level. *** means statistically significant at the 1% level.

Table 2. Counterfactuals: the impact of eliminating credit constraints on productivity and regional output.

Type of Credit Constraint	A Frequency in Sample	B Productivity Change	B’ Productivity Change	C Relative Change	C’ Relative Change	D Land Controlled	E Impact on Regional Output	E’ Impact on Regional Output
Quantity rationed	23.50%	516.28	256.37	58.3%	28.90%	20.5%	11.90%	6.41%
		[176]	[667.73]				[4.5%]	[15.7%]
Risk rationed	15.50%	477.71	129.96	68.2%	18.56%	16.0%	10.90%	2.53%
		[175]	[697.25]				[4.7%]	[16.2%]
Transaction cost rationed	10.50%	412.75	160.29	49.0%	19%	7.8%	3.8%	1.52%
		[216]	[620.77]				[2.1%]	[6.0%]
Constrained	49.50%	482.24	196.41	58.9%	24%	44.2%	26%	10.60%
		[149]	[657.27]				[8.4%]	[35.7%]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
