# Indirect Inference: Which Moments to Match?

## Abstract

## 1. Introduction

**Principle 1:** Following Eastwood (1991), implementation of the SNP approach requires choosing the truncation degree in the expansion in an adaptive (i.e., random, data-dependent) manner. While Andersen and Lund (1997; hereafter AL) interpret the results of Eastwood (1991) to suggest that AIC is the optimal model choice strategy for this adaptive truncation, they eventually decide to elicit a “choice of score generator(s) (...) guided by the more conservative HQC and BIC criteria”. In addition, Gallant and Long (1997) and Gallant et al. (1997) also use the BIC to choose the auxiliary model, and the latter stress that “to implement the EMM estimator we require a score generator that fits these data well”, which leads them to use the BIC to balance parsimony against goodness-of-fit.

**Principle 2:** For a given number of terms in the SNP expansion, the score generator can be interpreted as the score of an unconstrained parametric model, which ensures that, by definition, we end up with a just-identified set of moment conditions to match: the number of auxiliary parameters to estimate is exactly the number of components in the score vector. For instance, and in contrast to Gallant and Long (1997), who allow for conditional heterogeneity in the innovation density, AL (see p. 364) “find no evidence that such an extension is required”, and, by the same token, AL eliminate the additional heterogeneity parameter introduced by Gallant and Long (1997) and the corresponding moments to match. However, an alternative approach would have been to add moment conditions that exploit the knowledge that this kind of heterogeneity is not present in the data, which would then lead to an overidentified set of moments to match.

- (i) We argue that the first principle puts too much emphasis on the idea that “to implement the EMM estimator we require a score generator that fits these data well”. Gallant and Long (1997, Lemma 1, p. 135) show that, under convenient regularity conditions, asymptotic efficiency is reached if and only if the linear span of a “true score” (i.e., the score of a well-specified parametric model for the structural model) is asymptotically included (at the true value of the structural parameters) in the linear span of the score of the auxiliary model. This does not require in any way that the auxiliary model be (even asymptotically, for an arbitrarily large number of parameters) a well-specified model, in the sense of being consistent with the Data Generating Process (DGP). Of course, as emphasized by Gallant and Tauchen (1996; hereafter GT), a sufficient condition for this score spanning property is the so-called smooth embedding of the score generator, which means that there is a one-to-one and twice continuously differentiable mapping between the two parametrizations (i.e., the auxiliary and the structural). Score spanning is then just a consequence of the chain rule for derivatives. However, this sufficient condition for score spanning is definitely not necessary and, thus, there is no logical argument for imposing a model selection criterion, like AIC or BIC, to select an auxiliary model. We remind the reader that the purpose of the auxiliary model is not to describe the DGP, but to provide informative estimating equations. After all, it may be possible to satisfy the linear spanning condition by using a vector of moments that define well-suited auxiliary parameters but have no interpretation as the score function of a quasi-likelihood.
- (ii) Second, what determines the efficiency of indirect inference estimators is the moments they match, and not necessarily the auxiliary parameters. Hence, and in contrast to the example given in Principle 2 above, one may well contemplate using a set of moment conditions that overidentify the vector of unknown auxiliary parameters. Of course, moment estimation of the auxiliary parameters will eventually resort to a just-identified set of moment conditions, through the choice of a particular linear combination of the (possibly) overidentified moment conditions. However, we argue that the choice of this just-identified set of moment conditions should not be guided by the efficiency of the resulting estimator of the auxiliary parameters (as would be the case with an efficient two-step Generalized Method of Moments (GMM) estimator) but, on the contrary, by our goal of obtaining an asymptotically efficient indirect estimator of the structural parameters. We demonstrate that this new focus of interest produces a novel way to devise a two-step GMM estimator of the auxiliary parameters that is, in general, different from standard efficient two-step GMM estimators.
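To make the role of the linear combination in point (ii) concrete, the following minimal sketch reduces two overidentifying moments for a scalar auxiliary parameter to one just-identified equation $K{\overline{g}}_{T}(\beta)=0$. Everything in it is an illustrative assumption rather than the paper's setup: the toy data (two noisy measurements of the same mean), the known variances, and the helper name `solve_selected_moments`. The second selection matrix is the standard efficient-GMM choice $K={J}^{\prime}{\Omega}^{-1}$, shown only as the benchmark the text contrasts with, not as the selection matrix the paper advocates.

```python
import numpy as np

def solve_selected_moments(K, a_bar, J):
    """Solve K (a_bar + J beta) = 0 for beta when the moments are linear in beta."""
    return np.linalg.solve(K @ J, -K @ a_bar)

rng = np.random.default_rng(0)
T = 5000
beta0 = 1.5                       # hypothetical true auxiliary parameter
sig = np.array([1.0, 3.0])        # assumed known standard deviations
y = beta0 + rng.normal(size=(T, 2)) * sig   # two noisy measurements of beta0

# g(y, beta) = y - beta * (1, 1)': two moments, one parameter (overidentified)
a_bar = y.mean(axis=0)            # sample mean of the part of g free of beta
J = -np.ones((2, 1))              # derivative of E[g] with respect to beta
Omega = np.diag(sig**2)           # variance of g under the toy DGP

K_first = np.array([[1.0, 0.0]])        # keep only the first moment
K_gmm = J.T @ np.linalg.inv(Omega)      # standard efficient-GMM combination

beta_first = solve_selected_moments(K_first, a_bar, J)
beta_gmm = solve_selected_moments(K_gmm, a_bar, J)
print(beta_first, beta_gmm)
```

Both choices of `K` give a consistent estimator of the auxiliary parameter; they differ only in how the two moments are weighted, which is exactly the degree of freedom that point (ii) proposes to use for the benefit of the structural parameters.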

## 2. Auxiliary versus Structural Models

- **Identification** of the true unknown value ${\theta}^{0}$ of the structural parameters $\theta \in \Theta \subset {\mathbb{R}}^{{d}_{\theta}}$, for a given value (the true unknown one) of the auxiliary parameters.
- **Identification** of the true unknown value ${\beta}^{0}$ of the auxiliary parameters $\beta \in B\subset {\mathbb{R}}^{{d}_{\beta}}$, for a given value (the true unknown one ${\theta}^{0}$) of the structural parameters.

#### 2.1. Auxiliary Model

**Assumption 1.**

- (i) $E\left[g(y,\beta )\right]=0\iff \beta ={\beta}^{0}$.
- (ii) $J=\frac{\partial E\left[g(y,\beta )\right]}{\partial {\beta}^{\prime}}{|}_{\beta ={\beta}^{0}}$ is full column rank.
- (iii) $\Omega =\lim_{T\to \infty}\mathrm{Var}\left[\sqrt{T}{\overline{g}}_{T}\left({\beta}^{0}\right)\right]$ is a positive definite matrix.
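The rank and positive-definiteness conditions in Assumption 1 can be inspected numerically in a given application. The sketch below is only an illustration under stated assumptions: a toy auxiliary model with $\beta =(\text{mean},\text{variance})$ and three moments (the third exploits symmetry), i.i.d. Gaussian data, and $\Omega$ treated as the variance of the scaled sample moments, which for i.i.d. data reduces to the covariance of $g$ (dependent data would call for a long-run variance estimator instead). The function names are hypothetical.

```python
import numpy as np

def g(y, beta):
    """Toy moments: mean, variance, and a zero third central moment (symmetry)."""
    e = y - beta[0]
    return np.column_stack([e, e**2 - beta[1], e**3])

def jacobian_of_mean_moments(y, beta, eps=1e-5):
    """Central-difference estimate of J, the derivative of E[g] in beta."""
    cols = []
    for j in range(beta.size):
        step = np.zeros_like(beta)
        step[j] = eps
        diff = g(y, beta + step).mean(axis=0) - g(y, beta - step).mean(axis=0)
        cols.append(diff / (2 * eps))
    return np.column_stack(cols)

rng = np.random.default_rng(1)
y = rng.normal(loc=0.5, scale=2.0, size=20000)
beta0 = np.array([0.5, 4.0])      # true (mean, variance) of the toy DGP

J_hat = jacobian_of_mean_moments(y, beta0)        # shape (3, 2)
Omega_hat = np.cov(g(y, beta0), rowvar=False)     # i.i.d. case only

rank_J = np.linalg.matrix_rank(J_hat)             # should equal d_beta = 2
min_eig = np.linalg.eigvalsh(Omega_hat).min()     # should be strictly positive
print(rank_J, min_eig)
```

A rank below $d_\beta$ or a smallest eigenvalue close to zero would signal, respectively, a local identification problem for $\beta$ or redundant moments.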

**Assumption 2.**

#### 2.2. The Structural Model

**Assumption 3.**

- (i) ${E}_{\theta}\left[g(y,{\beta}^{0})\right]=0\iff \theta ={\theta}^{0}$.
- (ii) $\Gamma =\frac{\partial {E}_{\theta}\left[g(y,{\beta}^{0})\right]}{\partial {\theta}^{\prime}}{|}_{\theta ={\theta}^{0}}$ is full column rank.
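Assumption 3 is what allows $\theta$ to be recovered by matching moments: at the auxiliary estimate, the simulated moments should vanish only near ${\theta}^{0}$. The sketch below illustrates this idea under assumptions that are not taken from the paper: a hypothetical MA(1) structural model ${y}_{t}={u}_{t}+\theta {u}_{t-1}$ with unit-variance shocks, three auxiliary moments (variance, first autocovariance, and a zero second autocovariance), an identity weighting matrix $W$, and a plain grid search in place of a numerical optimizer.

```python
import numpy as np

def simulate_ma1(theta, shocks):
    """Paths y_t = u_t + theta * u_{t-1}; shocks has shape (H, T + 1)."""
    return shocks[:, 1:] + theta * shocks[:, :-1]

def g_bar(y, beta):
    """Average of g = (y_t^2 - b1, y_t*y_{t-1} - b2, y_t*y_{t-2}) over all draws."""
    return np.array([(y[:, 2:] ** 2).mean() - beta[0],
                     (y[:, 2:] * y[:, 1:-1]).mean() - beta[1],
                     (y[:, 2:] * y[:, :-2]).mean()])

rng = np.random.default_rng(3)
T, H, theta0 = 4000, 10, 0.5      # hypothetical sample size, paths, true value

# "Observed" data and an auxiliary estimate based on the first two moments only
y_obs = simulate_ma1(theta0, rng.normal(size=(1, T + 1)))
beta_hat = np.array([(y_obs[:, 2:] ** 2).mean(),
                     (y_obs[:, 2:] * y_obs[:, 1:-1]).mean()])

shocks = rng.normal(size=(H, T + 1))   # held fixed across theta (common random numbers)
W = np.eye(3)                          # assumed weighting matrix

def objective(theta):
    """Quadratic form in the simulated moments evaluated at beta_hat."""
    m = g_bar(simulate_ma1(theta, shocks), beta_hat)
    return float(m @ W @ m)

grid = np.linspace(-0.9, 0.9, 181)     # invertible region only
theta_hat = grid[np.argmin([objective(t) for t in grid])]
print(theta_hat)
```

Holding the simulated shocks fixed across values of $\theta$ keeps the objective smooth in $\theta$, which matters as soon as the grid search is replaced by a gradient-based optimizer.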

#### 2.3. Moment Selection Criterion

## 3. Which Moments to Match?

#### 3.1. Optimal Selection Matrix K

#### 3.2. Efficient Two-Step Moment Matching

#### 3.3. Efficient Two-Step Parameter Matching

**Assumption 4.**

- (i) There exists a function ${b}_{K}(.)$ such that, for all $\theta \in \Theta $, $${E}_{\theta}\left[Kg(y,{b}_{K}\left(\theta \right))\right]=0.$$
- (ii) ${b}_{K}\left({\theta}^{0}\right)={\beta}^{0}=\operatorname{plim}_{T\to \infty}{\widehat{\beta}}_{T}\left(K\right)$ and ${b}_{K}\left(\theta \right)=\operatorname{plim}_{T\to \infty}{\tilde{\beta}}_{T,H}(\theta ,K)$, where ${\tilde{\beta}}_{T,H}(\theta ,K)$ is the solution of $$\frac{1}{TH}\sum _{t=1}^{T}\sum _{h=1}^{H}Kg\left({y}_{t}^{\left(h\right)}\left(\theta \right),{\tilde{\beta}}_{T,H}(\theta ,K)\right)=0.$$
- (iii) ${b}_{K}\left(\theta \right)={\beta}^{0}\iff \theta ={\theta}^{0}$.
- (iv) The Jacobian matrix $\partial {b}_{K}\left({\theta}^{0}\right)/\partial {\theta}^{\prime}$ has rank ${d}_{\theta}$.
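The objects in Assumption 4 can be approximated by simulation. The sketch below computes ${\tilde{\beta}}_{T,H}(\theta ,K)$ by solving the selected simulated moment conditions, then differentiates it numerically to check the rank condition in (iv). It rests on assumptions that are not from the paper: a hypothetical MA(1) structural model with unit-variance shocks, moments that are linear in $\beta$ (so the solution is available in closed form), an arbitrary illustrative $K$, and central finite differences with the simulated shocks held fixed. The number of simulated paths used for the derivative is deliberately large, in the spirit of the remark in footnote 4.

```python
import numpy as np

def simulate_ma1(theta, shocks):
    """Paths y_t = u_t + theta * u_{t-1}; shocks has shape (H, T + 1)."""
    return shocks[:, 1:] + theta * shocks[:, :-1]

def a_bar(y):
    """Average over t and h of (y_t^2, y_t*y_{t-1}, y_t*y_{t-2})."""
    return np.array([(y[:, 2:] ** 2).mean(),
                     (y[:, 2:] * y[:, 1:-1]).mean(),
                     (y[:, 2:] * y[:, :-2]).mean()])

# g(y, beta) = a(y) + J beta with beta = (variance, first autocovariance);
# the third moment encodes the knowledge that the second autocovariance is zero.
J = np.array([[-1.0, 0.0], [0.0, -1.0], [0.0, 0.0]])

def beta_tilde(theta, K, shocks):
    """Solve (1/TH) sum_t sum_h K g(y_t^(h)(theta), beta) = 0 for beta."""
    return np.linalg.solve(K @ J, -K @ a_bar(simulate_ma1(theta, shocks)))

def binding_jacobian(theta, K, shocks, eps=1e-4):
    """Central-difference derivative of the simulated binding function."""
    diff = beta_tilde(theta + eps, K, shocks) - beta_tilde(theta - eps, K, shocks)
    return (diff / (2 * eps)).reshape(-1, 1)

rng = np.random.default_rng(2)
T, H_star, theta0 = 2000, 50, 0.5             # hypothetical sizes and true value
shocks = rng.normal(size=(H_star, T + 1))     # common random numbers
K = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, -0.5]])              # illustrative selection matrix

b_hat = beta_tilde(theta0, K, shocks)         # population value: (1.25, 0.5)
jac = binding_jacobian(theta0, K, shocks)     # population value: (1, 1)'
rank = np.linalg.matrix_rank(jac)             # should equal d_theta = 1
print(b_hat, jac.ravel(), rank)
```

In this toy model the population binding function is ${b}_{K}(\theta )=(1+{\theta}^{2},\theta {)}^{\prime}$ for any admissible $K$, because the third moment has zero mean at every $\theta$; the choice of $K$ then affects only the sampling variability of the simulated solution.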

#### 3.4. Interpretation of Results and Discussion

## 4. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## Appendix A. Proofs

#### Appendix A.1. Proof of Equation (6)

#### Appendix A.2. Proof of Equation (15)

## References

- Andersen, Torben G., and Jesper Lund. 1997. Estimating continuous-time stochastic volatility models of the short-term interest rate. Journal of Econometrics 77: 343–77.
- Bates, Charles E., and Halbert White. 1993. Determination of estimators with minimum asymptotic covariance matrices. Econometric Theory 9: 633–48.
- Breusch, Trevor, Hailong Qian, Peter Schmidt, and Donald Wyhowski. 1999. Redundancy of moment conditions. Journal of Econometrics 91: 89–111.
- Cheng, Xu, and Zhipeng Liao. 2015. Select the valid and relevant moments: An information-based lasso for GMM with many moments. Journal of Econometrics 186: 443–64.
- Dovonon, Prosper, and Alastair R. Hall. 2018. The asymptotic properties of GMM and indirect inference under second-order identification. Journal of Econometrics 205: 76–111.
- Dovonon, Prosper, and Eric Renault. 2013. Testing for common conditionally heteroskedastic factors. Econometrica 81: 2561–86.
- Eastwood, Brian J. 1991. Asymptotic normality and consistency of semi-nonparametric regression estimators using an upwards F test truncation rule. Journal of Econometrics 48: 151–81.
- Gallant, A. Ronald, and Douglas W. Nychka. 1987. Semi-nonparametric maximum likelihood estimation. Econometrica: Journal of the Econometric Society 55: 363–90.
- Gallant, A. Ronald, and George Tauchen. 1996. Which moments to match? Econometric Theory 12: 657–81.
- Gallant, A. Ronald, and Jonathan R. Long. 1997. Estimating stochastic differential equations efficiently by minimum chi-squared. Biometrika 84: 125–41.
- Gallant, A. Ronald, David Hsieh, and George Tauchen. 1997. Estimation of stochastic volatility models with diagnostics. Journal of Econometrics 81: 159–92.
- Gourieroux, Christian, Alain Monfort, and Eric Renault. 1993. Indirect inference. Journal of Applied Econometrics 8: S85–S118.
- Hansen, Bruce E. 2016. Stein Combination Shrinkage for Vector Autoregressions. Working Paper A39, Sir Clive Granger Building, University Park, PA, USA.
- Kleibergen, Frank. 2005. Testing parameters in GMM without assuming that they are identified. Econometrica 73: 1103–23.
- Sargan, J. D. 1983. Identification and lack of identification. Econometrica: Journal of the Econometric Society 51: 1605–33.
- Smith, A. A., Jr. 1993. Estimating nonlinear time-series models using simulated vector autoregressions. Journal of Applied Econometrics 8: S63–S84.
- Stock, James H., and Jonathan H. Wright. 2000. GMM with weak identification. Econometrica 68: 1055–96.
- Tauchen, George. 1997. New minimum chi-square methods in empirical finance. Econometric Society Monographs 28: 279–317.

1. Similar to the selection matrix K, the weighting matrix W can be replaced by a consistent, data-dependent estimator, say ${W}_{T}$, such that ${W}_{T}{\to}_{p}W$ as $T\to \infty $, without altering the first-order asymptotic theory of the II estimator.
2. See Appendix A for more detailed derivations.
3. A discussion and interpretation of this surprising feasibility (in spite of the fact that ${\beta}^{0}$ is unknown) is given in Section 3.4.
4. Note that one may want to compute this derivative numerically and to choose ${H}^{*}$ very large (and possibly different from H in the rest of the procedure) to take advantage of the smoothness properties of the population moments.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Frazier, D.T.; Renault, E.
Indirect Inference: Which Moments to Match? *Econometrics* **2019**, *7*, 14.
https://doi.org/10.3390/econometrics7010014
