# Information-Criterion-Based Lag Length Selection in Vector Autoregressive Approximations for I(2) Processes

## Abstract


## 1. Introduction

`AIC` ($C_T = 2$) and `BIC` ($C_T = \log T$). The smallest integer $\widehat{h}$ minimizing IC is then the selected lag length (see Note 3).
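
The selection rule can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the criterion is assumed to take the common form $IC(h) = \log\det\widehat{\Sigma}_h + C_T h p^2/T$, the VAR($h$) models are fitted by least squares without a constant, and all models use the same time points (the variant mentioned in Note 1). The function name `select_lag` and the simulated VAR(1) data are hypothetical.

```python
import numpy as np

def select_lag(y, max_lag, c_t):
    """Smallest h in 0..max_lag minimizing the ASSUMED criterion
    IC(h) = log det(Sigma_h) + c_t * h * p**2 / T."""
    T, p = y.shape
    target = y[max_lag:]                      # same time points for every h
    n = target.shape[0]
    ic = []
    for h in range(max_lag + 1):
        if h == 0:
            resid = target                    # no regressors: residual is the data
        else:
            # Regressor block [y_{t-1}, ..., y_{t-h}] for t = max_lag, ..., T-1
            X = np.hstack([y[max_lag - j : T - j] for j in range(1, h + 1)])
            coef, *_ = np.linalg.lstsq(X, target, rcond=None)
            resid = target - X @ coef
        sigma = resid.T @ resid / n           # residual covariance estimate
        _, logdet = np.linalg.slogdet(sigma)
        ic.append(logdet + c_t * h * p**2 / T)
    return int(np.argmin(ic))                 # argmin returns the first (smallest) minimizer

# Hypothetical example data: a stable bivariate VAR(1)
rng = np.random.default_rng(0)
T, p = 500, 2
A = 0.5 * np.eye(p)
y = np.zeros((T, p))
for t in range(1, T):
    y[t] = A @ y[t - 1] + rng.standard_normal(p)

h_aic = select_lag(y, 10, 2.0)          # AIC penalty C_T = 2
h_bic = select_lag(y, 10, np.log(T))    # BIC penalty C_T = log T
print(h_aic, h_bic)
```

Because both criteria share the same fit term and differ only in the penalty weight, the lag selected with the smaller penalty ($C_T = 2$) can never be below the one selected with $C_T = \log T$ once $\log T > 2$.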

`BIC`, we have $\lim_{T\to\infty} \widehat{h}_{BIC}/l(T) = 1$ a.s. (Theorem 6.6.3 of HD), and the same limit holds for $\widehat{h}_{AIC}$ in probability (Theorem 6.6.4 of HD), where $l(T) = \log T/(-2\log\rho_0)$. Consequently, $\rho_0^{l(T)} = \exp(-\log\rho_0 \log T/(2\log\rho_0)) = T^{-1/2}$. Thus, in this case, the approximation error (2) is asymptotically of the order $T^{-1/2}$.
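
The identity $\rho_0^{l(T)} = T^{-1/2}$ can be checked numerically. The short sketch below only evaluates both sides for a few illustrative values of $\rho_0$ and $T$; the helper name `l_of_T` and the chosen values are assumptions made for this example.

```python
import math

def l_of_T(T, rho0):
    """l(T) = log T / (-2 log rho0), defined for 0 < rho0 < 1."""
    return math.log(T) / (-2.0 * math.log(rho0))

for rho0 in (0.5, 0.9):
    for T in (100, 800):
        lhs = rho0 ** l_of_T(T, rho0)   # approximation error order rho0^{l(T)}
        rhs = T ** -0.5                 # claimed order T^{-1/2}
        print(rho0, T, lhs, rhs)
```

Both printed columns agree up to floating-point error, independently of the value of $\rho_0$.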

**Assumption 1.**

`AIC` or `BIC` also in this situation while a theoretical justification is missing.

## 2. Integrated Processes

**Assumption 2.**

- $\alpha ,\beta \in {\mathbb{R}}^{p\times r},0\le r<p$ are full column rank matrices;
- The function $\Pi(z) = (1-z)^{2} I_{p} - \alpha\beta^{\prime} z - \Gamma (1-z) z - \sum_{j=1}^{\infty} (1-z)^{2} \Pi_{j} z^{j}$ (converging absolutely for $|z| < 1+\delta$, $\delta > 0$) fulfills that $|\Pi(z)| = 0$ implies that $|z| > 1$ or $z = 1$;
- With $\beta_{2} = \beta_{\perp}\eta_{\perp}$, $\alpha_{2} = \alpha_{\perp}\xi_{\perp}$ (where $\alpha_{\perp}^{\prime}\Gamma\beta_{\perp} = \xi\eta^{\prime}$, and $\eta, \xi \in \mathbb{R}^{(p-r)\times s}$ are of full column rank $s < p-r$) the matrix $$\alpha_{2}^{\prime}\Big(I_{p} + \Gamma\beta(\beta^{\prime}\beta)^{-1}(\alpha^{\prime}\alpha)^{-1}\alpha^{\prime}\Gamma - \sum_{j=1}^{\infty}\Pi_{j}\Big)\beta_{2}$$

**Theorem 1.**

- (i) $\mathbb{P}(\widehat{h}(C_T) \le M) \to 0$ for each constant $M > 0$.
- (ii) If Assumption A1 holds for $\Sigma_h$ with function $\theta(h)$, then $\widehat{h}(C_T)/h_T^{*} \to 1$ in probability, where $h_T^{*}$ minimizes the function $L_T(h; C_T) = h p^{2}(C_T - 1)/T + \theta(h)$.
- (iii) If $(y_t)_{t\in\mathbb{Z}}$ is an I(2) invertible VARMA process corresponding to the left-coprime pair $(A(z), B(z))$, then $-2\widehat{h}(C_T)\log\rho_0/\log T \to 1$ in probability, where $\rho_0^{-1} = \min\{|z| : z \in \mathbb{C}, |B(z)| = 0\}$.
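
To make the rate in part (iii) concrete, the following Python sketch computes $\rho_0$ and the implied lag length $\log T/(-2\log\rho_0)$ for a hypothetical first-order moving-average polynomial $B(z) = I_p + B_1 z$. The matrix `B1`, the sign convention of $B(z)$, and the helper name `predicted_lag` are assumptions for illustration only. For this first-order case $\det(I_p + B_1 z) = \prod_i (1 + \lambda_i z)$, so the zeros are $z = -1/\lambda_i$ for the nonzero eigenvalues $\lambda_i$ of $B_1$, and $\rho_0$ equals the spectral radius of $B_1$.

```python
import numpy as np

# Hypothetical MA(1) coefficient matrix (not taken from the paper)
B1 = np.array([[-0.9, 0.3],
               [0.0, 0.5]])

# rho0^{-1} = min |z| with det(I + B1 z) = 0  <=>  rho0 = max |eigenvalue of B1|
rho0 = float(np.max(np.abs(np.linalg.eigvals(B1))))

def predicted_lag(T, rho0):
    """Asymptotic lag length log T / (-2 log rho0) from part (iii)."""
    return np.log(T) / (-2.0 * np.log(rho0))

for T in (100, 200, 400, 800):
    print(T, predicted_lag(T, rho0))
```

Doubling the sample size adds the constant $\log 2/(-2\log\rho_0)$ to the predicted lag length, which is the linear growth in $\log T$ described by the theorem.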

`BIC` is (weakly) consistent for VAR($h_0$) processes.

## 3. Simulations

`BIC`. The main effects contained in the theorem are clearly visible: the selected lag length increases with sample size. The logarithmic scale on the x-axis for plot (a) shows the roughly linear increase in $\log T$ (see Note 8). In addition, larger absolute values of $\theta$ result in larger lag lengths. The average selected lag lengths with `BIC` are very similar to the optimal values $h_T^{*}$ (corresponding to the stationary process $(u_t)_{t\in\mathbb{Z}}$). The doubly integrated processes $(y_t)_{t\in\mathbb{Z}}$ require roughly two more lags in all cases compared to the stationary process $\Delta^{2}(y_t)_{t\in\mathbb{Z}}$, except close to $\theta = 0.9$ (which is close to a pole-zero cancellation), where only one additional lag is selected on average.
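
A small Monte Carlo experiment of this type can be sketched as follows. The design is an assumption, since the exact simulation setup is not reproduced here: a scalar MA(1) process $u_t = \varepsilon_t + \theta\varepsilon_{t-1}$ (the sign convention may differ from the one used in the paper), double cumulative summation with zero initial values to obtain $y_t$, and a `BIC` of the assumed form $\log\widehat{\sigma}_h^{2} + h\log T/T$. The names `bic_lag`, `avg_u`, and `avg_y` are hypothetical.

```python
import numpy as np

def bic_lag(x, max_lag):
    """Smallest h in 0..max_lag minimizing log(sigma_h^2) + log(T) * h / T
    for AR(h) models fitted by least squares on the same time points."""
    T = len(x)
    target = x[max_lag:]
    ic = []
    for h in range(max_lag + 1):
        if h == 0:
            resid = target
        else:
            # Columns x_{t-1}, ..., x_{t-h}
            X = np.column_stack([x[max_lag - j : T - j] for j in range(1, h + 1)])
            coef, *_ = np.linalg.lstsq(X, target, rcond=None)
            resid = target - X @ coef
        ic.append(np.log(np.mean(resid**2)) + np.log(T) * h / T)
    return int(np.argmin(ic))

rng = np.random.default_rng(1)
T, theta, max_lag, reps = 400, 0.5, 12, 50   # illustrative settings
lags_u, lags_y = [], []
for _ in range(reps):
    eps = rng.standard_normal(T + 1)
    u = eps[1:] + theta * eps[:-1]      # stationary MA(1), assumed convention
    y = np.cumsum(np.cumsum(u))         # doubly integrated series
    lags_u.append(bic_lag(u, max_lag))
    lags_y.append(bic_lag(y, max_lag))

avg_u, avg_y = np.mean(lags_u), np.mean(lags_y)
print(avg_u, avg_y)   # average selected lag: stationary vs. integrated series
```

Comparing `avg_y` with `avg_u` over a grid of $\theta$ values and sample sizes gives a rough counterpart to the two panels of Figure 1.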

## 4. Conclusions

`AIC` or `BIC` tend to infinity as a function of the sample size, proportionally to $\log T$. The proportionality constant depends on the location of the zero closest to the unit circle. This is identical to the stationary case. The proof of the result indicates that this property is robust to a large number of unit roots being present in the data-generating process.


## Appendix A. Proof of Theorem 1

**Proof.**

`AIC`. Since the lag length estimated for finite $C_T$ is at least as large as the one selected using `BIC`, $\widehat{h}_T \to \infty$ follows, again implying the results.

## Notes

1. Other variants exist, for example, using the same time points in the summation for all models; compare with Kilian and Lütkepohl (2017).
2. Hannan and Deistler (1988) will, in the following, be abbreviated as HD.
3. In the unlikely case of ties, the smallest integer $h$ achieving the minimum is selected.
4.
5. Here and below we use the notation $X_{\perp}$ for a full column rank matrix whose columns span the orthogonal complement of the column space of a full column rank matrix $X$.
6. Since we fix the values $y_0$, $y_1$, and $y_2$, the process is stationary only for appropriate choices of these values; otherwise it is the sum of a stationary process and the effects of the initial values.
7. As pointed out by a reviewer (for which I am grateful), the moment condition in HD is slightly stronger than the assumption of finite fourth moments used in LB.
8. Note that the optima $h_T^{*}$ correspond to $\log T/(-2\log\rho_0)$ rounded to the nearest integer.
9. LB uses the lag length $h+2$ instead of $h$.

## References

- Bauer, Dietmar, and Martin Wagner. 2004. Autoregressive Approximations to MFI(1) Processes. Working Paper No. 174, Reihe Ökonomie/Economics Series; Vienna: Institut für Höhere Studien (IHS).
- Granger, Clive W. J., and Tae-Hwy Lee. 1989. Investigation of Production, Sales and Inventory relationships using multicointegration and non-symmetric error correction models. Journal of Applied Econometrics 4: 145–59.
- Hannan, Edward James, and Manfred Deistler. 1988. The Statistical Theory of Linear Systems. New York: John Wiley.
- Johansen, Søren. 1995. Likelihood-Based Inference in Cointegrated Vector Auto-Regressive Models. Oxford: Oxford University Press.
- Juselius, Katarina. 2006. The Cointegrated VAR Model: Methodology and Applications. Oxford: Oxford University Press.
- Kilian, Lutz, and Helmut Lütkepohl. 2017. Structural Vector Autoregressive Analysis. Cambridge: Cambridge University Press.
- Li, Yuanyuan, and Dietmar Bauer. 2020. Modeling I(2) processes using vector autoregressions where the lag length increases with the sample size. Econometrics 8: 38.
- Lütkepohl, Helmut, and Pentti Saikkonen. 1999. Order Selection in Testing for the Cointegrating Rank of a VAR Process. In Cointegration, Causality, and Forecasting: A Festschrift in Honour of Clive W. J. Granger. Oxford: Oxford University Press, pp. 168–99.
- Ng, Serena, and Pierre Perron. 1995. Unit Root Tests in ARMA Models with Data-Dependent Methods for the Selection of the Truncation Lag. Journal of the American Statistical Association 90: 268–81.
- Paulsen, Jostein. 1984. Order determination of multivariate autoregressive time series with unit roots. Journal of Time Series Analysis 5: 115–27.
- Saikkonen, Pentti, and Helmut Lütkepohl. 1996. Infinite-Order Cointegrated Vector Autoregressive Processes: Estimation and Inference. Econometric Theory 12: 814–44.

**Figure 1.** Average selected lag length for the stationary processes (blue) and the $I(2)$ processes (black, dashed). Red stars denote the optimizers $h_T^{*}$. (**a**) For $\theta = -0.9$ over sample sizes $T = 100, 200, 400, 800$. (**b**) For sample size $T = 800$ over different values of $\theta$.


© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Bauer, D.
Information-Criterion-Based Lag Length Selection in Vector Autoregressive Approximations for I(2) Processes. *Econometrics* **2023**, *11*, 11.
https://doi.org/10.3390/econometrics11020011