# Generalized Empirical Likelihood-Based Focused Information Criterion and Model Averaging

## Abstract

## 1. Introduction

## 2. Local Misspecification Framework

## 3. Focused Information Criterion

**Assumption 3.1**

1. $\Theta \subset {\mathbb{R}}^{p}$, $\Gamma \subset {\mathbb{R}}^{q}$, and $\mathcal{T}\subset {\mathbb{R}}^{l}$ are compact.
2. $m(y,\theta ,\gamma )$ is continuous in $\theta \in \Theta $ and $\gamma \in \Gamma $ for almost every $y$.
3. $\sup_{\theta \in \Theta ,\gamma \in \Gamma ,\tau \in \mathcal{T}}\left|{\widehat{Q}}_{n}(\theta ,\gamma ,\tau )-{Q}_{n}(\theta ,\gamma ,\tau )\right|\stackrel{p}{\to}0$ under the sequence of ${f}_{n}\left(y\right)$.
4. $\left|{Q}_{n}(\theta ,\gamma ,\tau )-Q(\theta ,\gamma ,\tau )\right|\to 0$ as $n\to \infty $ for all $\theta \in \Theta $, $\gamma \in \Gamma $, and $\tau \in \mathcal{T}$.
5. $E\left[m({y}_{i},\theta ,\gamma )m{({y}_{i},\theta ,\gamma )}^{\prime}\right]$ is nonsingular for all $\theta \in \Theta $ and $\gamma \in \Gamma $.
6. $({\theta}_{0},{\gamma}_{0})$ is the unique solution to $E\left[m({y}_{i},\theta ,\gamma )\right]=0$ and $({\theta}_{0},{\gamma}_{0})\in \mathit{int}(\Theta \times \Gamma )$.
7. $\rho \left(v\right)$ is twice continuously differentiable in a neighborhood of zero.
8. $E\left[{m}_{\theta i}\right]$ and $E\left[{m}_{\gamma i}\right]$ are of full rank.
9. $\sup_{n}{E}_{n}\left[\parallel {m}_{i}{\parallel}^{2+\alpha}\right]<\infty $ for some $\alpha >0$.
10. $m(y,\theta ,\gamma )$ is continuously differentiable in $\theta $ and $\gamma $ in a neighborhood, $\mathcal{N}$, of $({\theta}_{0},{\gamma}_{0})$.
11. $\sup_{(\theta ,\gamma )\in \mathcal{N}}\left|{n}^{-1}{\sum}_{i=1}^{n}\frac{\partial m({y}_{i},\theta ,\gamma )}{\partial {\theta}^{\prime}}-{E}_{n}\left[\frac{\partial m({y}_{i},\theta ,\gamma )}{\partial {\theta}^{\prime}}\right]\right|\stackrel{p}{\to}0$ and $\sup_{(\theta ,\gamma )\in \mathcal{N}}\left|{n}^{-1}{\sum}_{i=1}^{n}\frac{\partial m({y}_{i},\theta ,\gamma )}{\partial {\gamma}^{\prime}}-{E}_{n}\left[\frac{\partial m({y}_{i},\theta ,\gamma )}{\partial {\gamma}^{\prime}}\right]\right|\stackrel{p}{\to}0$ under the sequence of ${f}_{n}\left(y\right)$.
12. $\parallel {E}_{n}\left[{m}_{\theta i}\right]-E\left[{m}_{\theta i}\right]\parallel \to 0$ and $\parallel {E}_{n}\left[{m}_{\gamma i}\right]-E\left[{m}_{\gamma i}\right]\parallel \to 0$ as $n\to \infty $.
13. $\parallel {E}_{n}\left[{m}_{i}{m}_{i}^{\prime}\right]-E\left[{m}_{i}{m}_{i}^{\prime}\right]\parallel \to 0$ as $n\to \infty $.
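As an illustration of the smoothness condition on $\rho \left(v\right)$, the sketch below evaluates three carrier functions commonly used in the generalized empirical likelihood literature (empirical likelihood, exponential tilting, and continuous updating) and checks numerically that each has first and second derivatives equal to $-1$ at zero. The specific functional forms, the normalization ${\rho}^{\prime}(0)={\rho}^{\prime \prime}(0)=-1$, and all function names are assumptions made for illustration; this section does not state which $\rho $ is used.

```python
import math

# Carrier functions commonly used in GEL estimation (assumed forms,
# not taken from this section of the paper).
def rho_el(v):
    """Empirical likelihood: rho(v) = log(1 - v), defined for v < 1."""
    return math.log(1.0 - v)

def rho_et(v):
    """Exponential tilting: rho(v) = -exp(v)."""
    return -math.exp(v)

def rho_cue(v):
    """Continuous updating: rho(v) = -(1 + v)^2 / 2."""
    return -0.5 * (1.0 + v) ** 2

def derivs_at_zero(rho, h=1e-4):
    """Central finite-difference estimates of rho'(0) and rho''(0)."""
    d1 = (rho(h) - rho(-h)) / (2.0 * h)
    d2 = (rho(h) - 2.0 * rho(0.0) + rho(-h)) / h ** 2
    return d1, d2

if __name__ == "__main__":
    for name, rho in [("EL", rho_el), ("ET", rho_et), ("CUE", rho_cue)]:
        d1, d2 = derivs_at_zero(rho)
        print(f"{name}: rho'(0) ~ {d1:.4f}, rho''(0) ~ {d2:.4f}")
```

All three functions are smooth near zero, so each satisfies the differentiability requirement; the finite-difference check is only a numerical sanity test, not a proof.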

**Lemma 3.1** Suppose Assumption 3.1 holds. Then, under the sequence of ${f}_{n}\left(y\right)$, we have:

**Theorem 3.1** Suppose Assumption 3.1 holds. Then, under the sequence of ${f}_{n}\left(y\right)$, we have:

## 4. Model Averaging

## 5. Example

## 6. Monte Carlo Study

**Table 1.** Estimation results. DGP, data generating process; AIC, Akaike information criterion; BIC, Bayesian information criterion; FIC, focused information criterion; Std, standard deviation; RMSE, root mean squared error.

| Model | Statistic | DGP (1) | DGP (2) | DGP (3) | DGP (4) |
|---|---|---|---|---|---|
| Full | Bias | -0.104 | -0.109 | -0.089 | -0.076 |
| | Std | 0.544 | 0.533 | 0.509 | 0.489 |
| | RMSE | 0.554 | 0.544 | 0.516 | 0.495 |
| Reduced | Bias | -0.279 | -0.057 | -0.148 | -0.048 |
| | Std | 0.780 | 0.473 | 0.955 | 0.448 |
| | RMSE | 0.828 | 0.477 | 0.965 | 0.450 |
| AIC | Bias | -0.113 | -0.099 | -0.101 | -0.079 |
| | Std | 0.559 | 0.557 | 0.497 | 0.509 |
| | RMSE | 0.570 | 0.566 | 0.507 | 0.515 |
| BIC | Bias | -0.136 | -0.088 | -0.104 | -0.073 |
| | Std | 0.689 | 0.552 | 0.499 | 0.502 |
| | RMSE | 0.702 | 0.559 | 0.510 | 0.507 |
| FIC | Bias | -0.139 | -0.095 | -0.112 | -0.076 |
| | Std | 0.530 | 0.509 | 0.464 | 0.452 |
| | RMSE | 0.548 | 0.517 | 0.477 | 0.458 |
| Averaging | Bias | -0.139 | -0.092 | -0.107 | -0.074 |
| | Std | 0.511 | 0.476 | 0.455 | 0.444 |
| | RMSE | 0.529 | 0.484 | 0.468 | 0.450 |
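
The three statistics reported in Table 1 are linked by the standard decomposition $\mathrm{RMSE}=\sqrt{{\mathrm{Bias}}^{2}+{\mathrm{Std}}^{2}}$. The sketch below recomputes the RMSE from the reported bias and standard deviation for the DGP (1) column and compares it with the reported RMSE. It assumes that Std denotes the Monte Carlo standard deviation of the estimator; the helper name `rmse` and the tolerance are illustrative choices, and small discrepancies are expected because the table entries are rounded to three decimals.

```python
import math

# (bias, std, reported RMSE) for DGP (1), copied from Table 1.
DGP1 = {
    "Full":      (-0.104, 0.544, 0.554),
    "Reduced":   (-0.279, 0.780, 0.828),
    "AIC":       (-0.113, 0.559, 0.570),
    "BIC":       (-0.136, 0.689, 0.702),
    "FIC":       (-0.139, 0.530, 0.548),
    "Averaging": (-0.139, 0.511, 0.529),
}

def rmse(bias, std):
    """Root mean squared error implied by a bias and a standard deviation."""
    return math.sqrt(bias ** 2 + std ** 2)

def max_discrepancy(rows):
    """Largest absolute gap between implied and reported RMSE."""
    return max(abs(rmse(b, s) - r) for b, s, r in rows.values())

if __name__ == "__main__":
    for name, (b, s, r) in DGP1.items():
        print(f"{name:10s} implied={rmse(b, s):.4f} reported={r:.3f}")
    # Tolerance of 0.002 allows for three-decimal rounding of the inputs.
    print("consistent within rounding:", max_discrepancy(DGP1) < 0.002)
```

The same check can be applied to the other DGP columns by adding their rows to the dictionary.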

## 7. Conclusions

## Acknowledgments

## A. Appendix

^{1.} Although ${y}_{1},\cdots ,{y}_{n}$ is a triangular array, we suppress the additional subscript, $n$, on $y$ for notational simplicity.

^{2.} Simulations were also conducted for different sample sizes. The results are not reported here, because the differences among the candidate models are so small for large $n$ that the RMSEs are almost identical for all models.

