VAR Models with an Index Structure: A Survey with New Results

Gianluca Cubadda

doi:10.3390/econometrics13040040

Abstract

The main aim of this paper is to review recent advances in the multivariate autoregressive index model [MAI] and their applications to economic and financial time series. MAI has recently gained momentum because it can be seen as a link between two popular but distinct multivariate time series approaches: vector autoregressive modeling [VAR] and the dynamic factor model [DFM]. Indeed, on the one hand, MAI is a VAR model with a peculiar reduced-rank structure that can lead to a significant dimension reduction; on the other hand, it allows for the identification of common components and common shocks in a similar way as the DFM. Our focus is on recent developments of the MAI, which include extending the original model with individual autoregressive structures, stochastic volatility, time-varying parameters, high-dimensionality, and co-integration. In addition, some gaps in the literature are filled by providing new results on the representation theory underlying previous contributions, and a novel model is provided.

Keywords:

multivariate autoregressive index models; vector autoregressive models; dynamic factor models; reduced-rank regression

JEL Classification:

C32

1. Introduction

The vector autoregressive model [VAR] and the dynamic factor model [DFM] are arguably among the most popular tools for analyzing economic and financial variables over time. Since the seminal contribution of Sims (1980), VARs have been theoretically extended and practically implemented to forecast, structurally analyze, and detect comovements in multivariate time series. DFMs were introduced more recently (Forni et al., 2000; Forni & Lippi, 2001; Stock & Watson, 2002a, 2002b; Bai & Ng, 2002; Bai, 2003), but they have rapidly come to contest the role of the VAR as the workhorse of empirical macroeconomics.

The main reason for the success of the DFM is two-fold. First, it allows handling a much larger number of variables than those that are generally employed in traditional small-scale VARs, thus potentially boosting forecasting accuracy and solving the informational deficiency problems that arise in structural analyses when the agent’s information set is richer than the econometrician’s information set (see, e.g., Forni & Gambetti, 2014). Second, DFM allows for disentangling the shocks that drive the common component of high-dimensional time series and recovering the structural shocks from these common shocks only. Hence, in structural DFMs, the number of shocks is smaller than the number of variables (Forni et al., 2009), which is in line with dynamic, stochastic general equilibrium models (see Fernández-Villaverde et al., 2016 and the references therein) and, more generally, with the standard macroeconomic view that a low number of shocks drives aggregate fluctuations.

Efforts have recently been made to endow the VAR with the above-mentioned features of the DFM. On the one hand, shrinkage estimators have been proposed for medium–large VARs, both from a Bayesian perspective (e.g., Bańbura et al., 2010; Koop, 2013; Carriero et al., 2015) and from a classical standpoint (e.g., Hsu et al., 2008; Kock & Callot, 2015; Hecq et al., 2023). On the other hand, the multivariate autoregressive index (MAI) model—originally proposed by Reinsel (1983) as a convenient approach to dimension reduction in stationary VARs—has recently gained renewed attention.1

Recent advances have shown that the MAI and its variants allow for both forecasting variables and identifying shocks analogously to the DFM but without encountering some issues in model identification and statistical inference that characterize the latter, such as the requirement that the number of variables diverges at a given rate and the need for specific assumptions on both the correlation structure of the idiosyncratic components and the factor loadings (see, e.g., Bai, 2003; Bai & Ng, 2006). Moreover, VARs with index structures have been shown to be able to accommodate features such as stochastic volatility (Carriero et al., 2022) and time-varying parameters (Cubadda et al., 2025), which are not easy to handle within the DFM framework.

The MAI falls within reduced-rank VARs, a general class of models that includes, as special cases, both the cointegrated VAR (see Johansen, 1995 and the references therein) and the common serial correlation [CSC] models (see Cubadda & Hecq, 2022b and the references therein). Although CSC models and the MAI have similar mathematical formulations, their respective goals and properties are rather different: whereas the former are based on the existence of (possibly dynamic) linear combinations of autocorrelated time series that are white noise, VARs with an index structure assume that there is a limited number of channels through which information from the past is transmitted to the variables of interest.

The main aim of this paper is twofold. First, recent developments in the MAI are reviewed, such as the structural MAI (Carriero et al., 2016), the vector heterogeneous index model for realized volatilities (Cubadda et al., 2017), and augmentations of the original model with individual autoregressive structures (Cubadda & Guardabascio, 2019), stochastic volatility (Carriero et al., 2022), time-varying parameters (Cubadda et al., 2025), high-dimensionality (Cubadda & Hecq, 2022a), and cointegration (Cubadda & Mazzali, 2024). Second, new results are provided in terms of representation theory for various models, and a novel model is proposed, namely the cointegrated index-augmented autoregressive model, which combines and extends the results in Cubadda and Guardabascio (2019) and Cubadda and Mazzali (2024).

This paper is organized as follows. Focusing on representation theory, Section 2 reviews previous contributions and provides new insights into some of them. Section 3 presents the new model and deals with its estimation, whereas some details of the estimation procedure are relegated to Appendix A. Finally, Section 4 provides some conclusions.

2. VAR Models with an Index Structure

In this section, we review models that are rooted in the original MAI formulation and provide new results with respect to representation theory for some of them. Analogies and differences with the DFM are discussed in detail. Estimation and identification issues are also covered.

2.1. The Structural Multivariate Autoregressive Index Model

Let us assume that the n-vector time series $Y_{t} = {(y_{1 t}, \dots, y_{n t})}^{'}$ is generated by the following stationary VAR(p) model:

Φ (L) Y_{t} = ε_{t}, t = 1, \dots, T,

(1)

where L is the lag operator, $Φ (L) = I_{n} - \sum_{j = 1}^{p} Φ_{j} L^{j}$; $ε_{t}$ is a vector of n errors with $E (ε_{t} ε_{t}^{'}) = Σ$ (positive definite) and finite fourth moments, $E (ε_{t} | \mathcal{F}_{t - 1}) = 0$; and $\mathcal{F}_{t}$ is the natural filtration of the process $Y_{t}$. For simplicity, deterministic elements are ignored.

The key assumption of MAI (Reinsel, 1983) is the following:

Assumption 1.

The following holds:

{[Φ_{1}^{'}, \dots, Φ_{p}^{'}]}^{'} = {[α_{1}^{'}, \dots, α_{p}^{'}]}^{'} ω^{'},

where ω is a full-rank $n \times q$ matrix with $q < n$, and $α_{j}$ is an $n \times q$ matrix for $j = 1, \dots, p$.

Under Assumption 1, Model (1) can be rewritten as

Y_{t} = \sum_{j = 1}^{p} α_{j} \underbrace{ω^{'} Y_{t - j}}_{f_{t - j}} + ε_{t},

(2)

where the q linear combinations $f_{t} = ω^{'} Y_{t}$ are called indexes. The MAI has at most $n q (p + 1) - q^{2}$ mean parameters, which implies a significant dimension reduction when q is small with respect to n.2
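To make the reduced-rank restriction concrete, the following minimal sketch builds VAR coefficient matrices that satisfy Assumption 1 and compares the parameter counts of the MAI and of the unrestricted VAR. The sizes `n`, `q`, `p`, the random draws, and the helper names `var_mean_params` and `mai_mean_params` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q, p = 6, 2, 3  # illustrative sizes, not taken from the paper

omega = rng.standard_normal((n, q))                       # index weights (n x q)
alphas = [rng.standard_normal((n, q)) for _ in range(p)]  # loadings alpha_1..alpha_p

# Assumption 1: Phi_j = alpha_j omega' for every lag j
phis = [alpha @ omega.T for alpha in alphas]

# The stacked (n p) x n matrix [Phi_1', ..., Phi_p']' has rank q
stacked_rank = np.linalg.matrix_rank(np.vstack(phis))


def var_mean_params(n: int, p: int) -> int:
    """Mean parameters of an unrestricted VAR(p)."""
    return n * n * p


def mai_mean_params(n: int, p: int, q: int) -> int:
    """Upper bound n q (p + 1) - q^2 on the MAI mean parameters."""
    return n * q * (p + 1) - q * q


print(stacked_rank, var_mean_params(n, p), mai_mean_params(n, p, q))
```

With these sizes the unrestricted VAR has 108 mean parameters against at most 44 for the MAI, and the gap widens as n grows with q held fixed.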

By premultiplying both sides of Equation (2) by $ω^{'}$, we get

f_{t} = \sum_{j = 1}^{p} ω^{'} α_{j} f_{t - j} + ω^{'} ε_{t},

(3)

which shows that the indexes follow a VAR(p) process and not a VARMA process, as is generally the case for linear combinations of elements of a VAR (see Cubadda et al., 2009 and the references therein).

Remark 1.

In view of Equations (2) and (3), the MAI resembles the exact DFM [EDFM] (see Lippi, 2019 and the references therein), but there are also some relevant differences. First, in the EDFM, series $Y_{t}$ loads the factors contemporaneously and not only with lags. Second, the factors and the idiosyncratic terms in the EDFM are uncorrelated at any lead and lag, whereas in the MAI, we have $E (f_{t} ε_{t + j}^{'}) = 0$ only for $j > 0$. Third, the contemporaneous variance matrix of the idiosyncratic terms in the EDFM is diagonal, whereas Σ is generally not diagonal.

Placing emphasis on the analogies between MAI and EDFM, Carriero et al. (2016) propose identifying structural shocks as linear transformations of the index shocks only. Starting from the Wold representation of series $Y_{t}$,

Y_{t} = Ψ (L) ε_{t},

and inserting the decomposition of the identity matrix between $Ψ (L)$ and $ε_{t}$, as in Centoni and Cubadda (2003),

I_{n} = Σ ω {(ω^{'} Σ ω)}^{- 1} ω^{'} + ω_{⊥} {(ω_{⊥}^{'} Σ^{- 1} ω_{⊥})}^{- 1} ω_{⊥}^{'} Σ^{- 1},

(4)

where $ω_{⊥}$ denotes a full-rank $n \times (n - q)$ matrix such that $ω^{'} ω_{⊥} = 0$, one obtains the following decomposition of series $Y_{t}$:

Y_{t} = χ_{t} + ι_{t},

(5)

where

χ_{t} = Ψ (L) Σ ω \underline{Σ}^{- 1} ε_{t}^{χ},

(6)

ι_{t} = Ψ (L) ω_{⊥} {(ω_{⊥}^{'} Σ^{- 1} ω_{⊥})}^{- 1} ε_{t}^{ι},

(7)

$\underline{Σ} = ω^{'} Σ ω$, $ε_{t}^{χ} = ω^{'} ε_{t}$, $ε_{t}^{ι} = ω_{⊥}^{'} Σ^{- 1} ε_{t}$, $E (ε_{t}^{χ} ε_{t}^{ι'}) = 0$, and $E (χ_{t} ι_{t - j}^{'}) = 0$ for all j.
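The decomposition of the identity matrix in (4) and the orthogonality of the two shocks can be checked numerically. The sketch below is only an illustration under stated assumptions: the sizes, the random `omega` and `Sigma`, and the construction of the orthogonal complement `omega_perp` from a full QR factorization are choices made here, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 5, 2  # illustrative sizes

omega = rng.standard_normal((n, q))
# Full QR: the last n - q columns of Q are orthogonal to the columns of omega
Q, _ = np.linalg.qr(omega, mode="complete")
omega_perp = Q[:, q:]

B = rng.standard_normal((n, n))
Sigma = B @ B.T + n * np.eye(n)          # a positive definite error covariance
Sigma_inv = np.linalg.inv(Sigma)

# The two terms on the right-hand side of (4)
P_common = Sigma @ omega @ np.linalg.inv(omega.T @ Sigma @ omega) @ omega.T
P_uncommon = (
    omega_perp
    @ np.linalg.inv(omega_perp.T @ Sigma_inv @ omega_perp)
    @ omega_perp.T
    @ Sigma_inv
)

identity_gap = np.abs(P_common + P_uncommon - np.eye(n)).max()

# E(eps_chi eps_iota') = omega' Sigma Sigma^{-1} omega_perp = omega' omega_perp
cross_cov = omega.T @ Sigma @ Sigma_inv @ omega_perp

print(identity_gap, np.abs(cross_cov).max())
```

Both printed quantities should be zero up to floating-point error, which is what makes the common and uncommon shocks uncorrelated by construction.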

Since $ε_{t}^{χ}$ is the innovation of the indexes $f_{t}$ in Equation (3), it may be interpreted as the common shock and $χ_{t}$ as the common component of the series $Y_{t}$. Similarly, $ε_{t}^{ι}$ and $ι_{t}$ can be labeled, respectively, as the uncommon shock and the uncommon component.

Interestingly, post-multiplying by $ω_{⊥}$ both sides of the relation

Ψ (L) (I_{n} - \sum_{j = 1}^{p} α_{j} ω^{'} L^{j}) = I_{n},

we obtain $Ψ (L) ω_{⊥} = ω_{⊥}$, which in turn implies that the Wold polynomial matrix of the MAI has the form

Ψ (L) = I_{n} + \sum_{j = 1}^{\infty} θ_{j} ω^{'} L^{j},

(8)

where $θ_{j}$ is an $n \times q$ matrix for $j > 0$.
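The step from $Ψ (L) ω_{⊥} = ω_{⊥}$ to the form (8) can be spelled out as follows. This is a sketch of the standard argument; the symbol $\Psi_j$ for the j-th Wold coefficient matrix is introduced here for illustration and does not appear in the text above.

```latex
\begin{align*}
\Phi(L)\,\omega_\perp
  &= \Bigl(I_n - \sum_{j=1}^{p} \alpha_j \omega' L^j\Bigr)\omega_\perp
   = \omega_\perp
  && \text{since } \omega'\omega_\perp = 0, \\
\Psi(L)\,\omega_\perp
  &= \Psi(L)\,\Phi(L)\,\omega_\perp
   = \omega_\perp
  && \text{since } \Psi(L)\Phi(L) = I_n, \\
\Psi_j\,\omega_\perp
  &= 0 \quad (j \ge 1)
  && \text{matching powers of } L \text{ in } \Psi(L) = I_n + \sum_{j \ge 1} \Psi_j L^j, \\
\Psi_j
  &= \theta_j\,\omega' \quad (j \ge 1)
  && \text{for some } n \times q \text{ matrix } \theta_j .
\end{align*}
```

The last line uses the fact that a matrix annihilating $\omega_\perp$ from the right has all its rows in the row space of $\omega'$.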

Having substituted $Ψ (L)$ in Equations (6) and (7) with the RHS of Equation (8), we can finally prove the following proposition.

Proposition 1.

In the MAI, the components of $Y_{t}$ in (5) read as follows:

χ_{t} = (Σ ω \underline{Σ}^{- 1} + \sum_{j = 1}^{\infty} θ_{j} L^{j}) ε_{t}^{χ},

ι_{t} = ω_{⊥} {(ω_{⊥}^{'} Σ^{- 1} ω_{⊥})}^{- 1} ε_{t}^{ι},

where the uncommon component $ι_{t}$ is an n-dimensional white noise such that $Rank (E (ι_{t} ι_{t}^{'})) = n - q$.

Corollary 1.

The indexes and the common component are linked through the relation $f_{t} = ω^{'} χ_{t}$, which trivially follows from Proposition 1.

Remark 2.

In view of Proposition 1, the decomposition (5) closely parallels the corresponding decomposition in the EDFM. However, unlike the idiosyncratic terms in the EDFM, the uncommon component $ι_{t}$ is cross-sectionally dependent.3

Carriero et al. (2016) suggest recovering the structural shocks as linear transformations of the common shock $ε_{t}^{χ}$ only. Hence, at most $q < n$ structural shocks can be recovered, as in DFMs and in dynamic stochastic general equilibrium models. In principle, all identification strategies that are available for structural VARs or structural DFMs (see Stock & Watson, 2016 and the references therein) can be adopted.

On the estimation side, Carriero et al. (2016) prove that the iterative maximum likelihood procedure proposed by Reinsel (1983) is consistent when $n = o (\sqrt{T})$. Moreover, they provide an MCMC algorithm for Bayesian estimation and show, by simulations, that the Bayesian approach outperforms the classical one when $n = 15, 20$. Finally, they document the practical value of the structural MAI through two empirical applications: the transmission mechanism of monetary policies and the propagation of demand and supply shocks.

2.2. The Vector Heterogeneous Autoregressive Index Model

The univariate heterogeneous AR model [HAR], originally proposed by Corsi (2009), is a popular tool for analyzing and forecasting daily realized volatility [RV] measures without resorting to more involved long-memory models. Technically speaking, the HAR is a constrained AR(22) model where the predictors are the first lags of the following: (i) the daily RV; (ii) the weekly (5 days) average of the daily RV; (iii) the monthly (22 days) average of the daily RV.
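As an illustration of how the three HAR predictors can be assembled, the sketch below builds them from a one-dimensional array of daily RV and fits the regression by OLS. The function names `trailing_mean` and `har_design`, the simulated log-normal series, and the omission of an intercept (in line with ignoring deterministic elements) are assumptions made for this example.

```python
import numpy as np


def trailing_mean(x: np.ndarray, window: int) -> np.ndarray:
    """Entry k is the mean of x[k : k + window]."""
    c = np.concatenate(([0.0], np.cumsum(x)))
    return (c[window:] - c[:-window]) / window


def har_design(rv):
    """Return (target, X): target is RV on day t + 1, and the columns of X are
    the daily RV and its 5-day and 22-day averages, all ending on day t."""
    rv = np.asarray(rv, dtype=float)
    daily = rv[21:-1]
    weekly = trailing_mean(rv, 5)[17:-1]
    monthly = trailing_mean(rv, 22)[:-1]
    return rv[22:], np.column_stack([daily, weekly, monthly])


rng = np.random.default_rng(0)
rv = rng.lognormal(mean=0.0, sigma=0.5, size=500)  # toy stand-in for daily RV
target, X = har_design(rv)
coef = np.linalg.lstsq(X, target, rcond=None)[0]   # OLS estimates of the HAR
print(X.shape, coef)
```

The first 22 observations are used only to initialize the monthly average, so the regression has 22 fewer rows than the input series.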

Cubadda et al. (2017) propose a multivariate HAR for a set of n daily RV measures, $Y_{t}^{(d)} \equiv {(Y_{1, t}^{(d)}, \dots, Y_{n, t}^{(d)})}^{'}$, that is endowed with an index structure. In particular, the vector heterogeneous autoregressive index model [VHARI] reads as follows:

Y_{t}^{(d)} = α^{(d)} ω^{'} Y_{t - 1 d}^{(d)} + α^{(w)} ω^{'} Y_{t - 1 d}^{(w)} + α^{(m)} ω^{'} Y_{t - 1 d}^{(m)} + ε_{t},

where $(d)$, $(w)$, and $(m)$ denote, respectively, the time horizons of one day, one week, and one month such that

Y_{t}^{(w)} = \frac{1}{5} \sum_{j = 0}^{4} Y_{t - j d}^{(d)}, Y_{t}^{(m)} = \frac{1}{22} \sum_{j = 0}^{21} Y_{t - j d}^{(d)} .

The VHARI enjoys two important properties that are not shared by alternative approaches to dimension reduction in the vector HAR.4 First, the index $f_{t}^{(d)} = ω^{'} Y_{t}^{(d)}$ preserves the temporal cascade structure of the HAR model since

f_{t}^{(w)} = \frac{1}{5} \sum_{j = 0}^{4} f_{t - j d}^{(d)}, f_{t}^{(m)} = \frac{1}{22} \sum_{j = 0}^{21} f_{t - j d}^{(d)} .

Second, pre-multiplying both sides of the VHARI by $ω^{'}$ yields the following:

f_{t}^{(d)} = ω^{'} α^{(d)} f_{t - 1 d}^{(d)} + ω^{'} α^{(w)} f_{t - 1 d}^{(w)} + ω^{'} α^{(m)} f_{t - 1 d}^{(m)} + ω^{'} ε_{t},

which shows that the indexes follow a multivariate HAR model. In particular, when $q = 1$, a univariate HAR model generates all the dynamics of the n RVs.

On the estimation side, Cubadda et al. (2017) suggest using a switching algorithm [SA], an iterative method for the numerical maximization of the log-likelihood of complex models that has a long tradition in time series analysis (see Boswijk & Doornik, 2004 and the references therein). In particular, the proposed SA requires the following steps:

1. Given an (initial) estimate of $ω$, maximize the conditional Gaussian likelihood $ℓ (A, Σ | ω)$, where $A = {[α^{(d)'}, α^{(w)'}, α^{(m)'}]}^{'}$.
2. Given the previously obtained estimates of A and $Σ$, maximize the conditional likelihood $ℓ (ω | A, Σ)$.
3. Repeat steps 1 and 2 until numerical convergence occurs.5
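The steps above can be sketched in code for a generic index model $Y_{t} = \sum_{j} α_{j} ω^{'} X_{j, t} + ε_{t}$, where in the VHARI the regressors $X_{j, t}$ would be the lagged daily, weekly, and monthly RV vectors. This is a minimal illustration and not the authors' implementation: the function name `switching_algorithm`, the random starting value, the stopping rule based on $\log |Σ|$, and the way step 2 is written as a GLS regression for $vec (ω^{'})$ (equivalent to OLS on data whitened by $Σ^{- 1 / 2}$) are all assumptions made here.

```python
import numpy as np


def switching_algorithm(Y, X_list, q, max_iter=500, tol=1e-10, seed=0):
    """Alternating fit of Y_t = sum_j alpha_j omega' X_{j,t} + eps_t.

    Y is (T, n); X_list holds one (T, n) regressor array per term.
    Returns omega (n, q), the list of alpha_j (n, q), Sigma (n, n), and the
    path of log|Sigma|, which falls as the Gaussian likelihood rises.
    """
    T, n = Y.shape
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((n, q))  # arbitrary starting value (assumption)
    alphas, Sigma, path = None, None, []
    for it in range(max_iter):
        if it > 0:
            # Step 2: given (A, Sigma), GLS for vec(omega'), using
            # alpha_j omega' x = (x' kron alpha_j) vec(omega')
            S_inv = np.linalg.inv(Sigma)
            lhs = np.zeros((n * q, n * q))
            rhs = np.zeros(n * q)
            for Xj, aj in zip(X_list, alphas):
                rhs += (aj.T @ S_inv @ Y.T @ Xj).ravel(order="F")
                for Xk, ak in zip(X_list, alphas):
                    lhs += np.kron(Xj.T @ Xk, aj.T @ S_inv @ ak)
            omega = np.linalg.solve(lhs, rhs).reshape(n, q)
        # Step 1: given omega, OLS of Y on the stacked indexes
        F = np.hstack([X @ omega for X in X_list])        # (T, p*q)
        A = np.linalg.lstsq(F, Y, rcond=None)[0].T        # (n, p*q)
        resid = Y - F @ A.T
        Sigma = resid.T @ resid / T
        alphas = [A[:, j * q:(j + 1) * q] for j in range(len(X_list))]
        path.append(np.linalg.slogdet(Sigma)[1])
        if len(path) > 1 and abs(path[-2] - path[-1]) < tol:
            break
    return omega, alphas, Sigma, path


# Toy data with two regressor blocks and a single index (illustrative only)
rng = np.random.default_rng(1)
T, n, q = 400, 4, 1
X_list = [rng.standard_normal((T, n)) for _ in range(2)]
omega0 = rng.standard_normal((n, q))
alpha0 = [rng.standard_normal((n, q)) for _ in range(2)]
Y = sum(X @ omega0 @ a.T for X, a in zip(X_list, alpha0))
Y = Y + rng.standard_normal((T, n))
omega_hat, alphas_hat, Sigma_hat, path = switching_algorithm(Y, X_list, q)
```

Because no normalization is imposed on `omega`, only the products `alpha_j @ omega.T` are comparable across runs; the sequence in `path` should be non-increasing, which is a convenient check that each step improves the likelihood.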

A key point of the above SA is that both steps 1 and 2 require running OLS regressions only. This feature provides the SA with several advantages over Newton-type optimization methods: computational simplicity, with no need for normalization conditions on $ω$; explicit optimization at each step; and the ease of application of regularization schemes or linear restrictions on parameters (see Cubadda and Guardabascio (2019) for additional discussions).

Furthermore, when the SA is initialized with consistent estimates and is iterated sufficiently, the resulting estimator is asymptotically equivalent to the ML one (Hautsch et al., 2023). Cubadda et al. (2017) show by simulation that the suggested SA performs well even when the elements of $ε_{t}$ have a log-normal error distribution with GARCH variances.

Following Patton and Sheppard (2009), Cubadda et al. (2017) use a VHARI to build the optimal linear combination of ten different estimators of the volatility of the same market and evaluate its merits through an out-of-sample forecasting exercise. The VHARI model performs well, often outperforming previously existing methods.

2.3. The Index-Augmented Autoregressive Model

A possible limitation of the MAI as a forecasting tool is that the only predictors of the series $y_{i, t}$ for $i = 1, \dots, n$ are the lagged indexes, whereas the forecasts obtained through the DFM exploit information coming from the past of both the factors and the series $y_{i, t}$ itself (see the seminal contributions by Stock and Watson (2002a, 2002b)). Although the indexes may be interpreted as ‘supervised’ factors that are constructed to emphasize the comovements between the present and the past of the system, some variables may be better predicted by their own lags than by linear combinations of all the variables alone.

In order to overcome such limitations, Cubadda and Guardabascio (2019) extended the basic MAI model by allowing individual AR structures for each element of $Y_{t}$. Their key assumption is the following.

Assumption 2.

The following holds:

ϕ_{i k}^{(j)} = \sum_{m = 1}^{q} α_{i m}^{(j)} ω_{k m},

where $ϕ_{i k}^{(j)}$ is the generic element of the coefficient matrix $Φ_{j}$, $ω_{k m}$ is the generic element of ω, and $α_{i m}^{(j)}$ is the generic element of $α_{j}$ for $j = 1, \dots, p$, $i = 1, \dots, n$, $k = 1, \dots, i - 1, i + 1, \dots, n$.

In other words, Assumption 2 states that there is a reduced number of channels q through which each variable is influenced by the past of the other variables in the system, which is consistent with the common view that few shocks are responsible for most macroeconomic fluctuations.

Under Assumption 2 and using the reparametrization $δ_{i i}^{(j)} = ϕ_{i i}^{(j)} - \sum_{m = 1}^{q} α_{i m}^{(j)} ω_{i m}$, Model (1) can be rewritten as the following index-augmented autoregressive model [IAAR]:

Y_{t} = \sum_{j = 1}^{p} D_{j} Y_{t - j} + \sum_{j = 1}^{s} α_{j} f_{t - j} + ε_{t},

(9)

where $D_{j}$ is an $n \times n$ diagonal matrix with $δ_{i i}^{(j)}$ as its generic diagonal element, and for greater generality, $s \leq p$.

Remark 3.

Since the number of parameters of Model (9) is equal to $n (q s + q + p) - q^{2}$, it is necessary to impose proper upper bounds on either q or s to ensure that the IAAR is more parsimonious than the VAR. To this end, it is easy to see that sufficient conditions are $q < n - 1$ for $s = p \geq 2$ or $s < p - 1$ for any p and $q < n$. However, in empirical applications, the estimated values of q are typically much smaller than n (see Cubadda & Guardabascio, 2019; Carriero et al., 2022).
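The two sufficient conditions in Remark 3 can be verified by brute force over a grid of model sizes. The sketch below is only a numerical check under assumed grid bounds (`max_n`, `max_p`) and with $s \geq 1$; the function names are hypothetical.

```python
def var_params(n: int, p: int) -> int:
    """Mean parameters of an unrestricted VAR(p)."""
    return n * n * p


def iaar_params(n: int, p: int, q: int, s: int) -> int:
    """Parameter count n (q s + q + p) - q^2 of Model (9)."""
    return n * (q * s + q + p) - q * q


def first_condition_holds(max_n: int = 30, max_p: int = 8) -> bool:
    """s = p >= 2 and q < n - 1 imply fewer parameters than the VAR."""
    return all(
        iaar_params(n, p, q, p) < var_params(n, p)
        for n in range(3, max_n + 1)
        for p in range(2, max_p + 1)
        for q in range(1, n - 1)
    )


def second_condition_holds(max_n: int = 30, max_p: int = 8) -> bool:
    """s < p - 1 and q < n imply fewer parameters than the VAR."""
    return all(
        iaar_params(n, p, q, s) < var_params(n, p)
        for n in range(2, max_n + 1)
        for p in range(3, max_p + 1)
        for s in range(1, p - 1)
        for q in range(1, n)
    )


print(first_condition_holds(), second_condition_holds())
print(iaar_params(20, 4, 2, 4), var_params(20, 4))
```

For instance, with n = 20, p = s = 4, and q = 2, the IAAR has 276 parameters against 1600 for the unrestricted VAR.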

Remark 4.

The individual forecasting equation of the IAAR reads

y_{i t + 1} = \sum_{j = 0}^{p - 1} δ_{i i}^{(j + 1)} y_{i t - j} + \sum_{j = 0}^{s - 1} α_{i \cdot}^{(j + 1)'} f_{t - j} + ε_{i t + 1},

(10)

where $α_{i \cdot}^{(j)'}$ is the i-th row of matrix $α_{j}$. Equation (10) is entirely analogous to the individual forecasting equation of the DFM, with one important difference. Whereas factors are typically estimated using principal component methods, which aim to maximize the contemporaneous variability of series $Y_{t}$, the indexes in (10) are constructed explicitly by taking into account the covariability between each series $y_{i t}$ and the lags of the other elements of $Y_{t}$ conditionally on the lags of the series $y_{i t}$.

Remark 5.

Interestingly, by the same argument underlying Proposition 1, we see that, unlike in the MAI, $Ψ (L) ω_{⊥} \neq ω_{⊥}$, which implies, in view of Equation (7), that the uncommon component $ι_{t}$ is generally autocorrelated in the case of the IAAR. Hence, the decomposition (5) for the IAAR closely resembles the analogous decomposition in the approximate DFM (see Lippi (2019) and the references therein). However, the estimation of the index $f_{t}$ does not require that $n \to \infty$, nor does it impose conditions on the autocorrelations and cross-correlations of the elements of $ι_{t}$ or on the loadings $α_{j}$ as in the approximate DFM.

Cubadda and Guardabascio (2019) proposed a two-step SA for the estimation of the IAAR, along with a variant where an $ℓ_{2}$ regularization scheme is applied in both steps. They show, by simulations, that the regularized version of the SA outperforms the standard one with $n = 20$. Regarding model specification, they opt for the use of information criteria [IC], in line with previous contributions showing that IC outperform likelihood ratio tests in the selection of reduced-rank VAR models (see, e.g., Gonzalo & Pitarakis, 1999; Cavaliere et al., 2015; Cavaliere et al., 2018). Finally, the IAAR proves to outperform well-known macroeconomic forecasting methods when applied to systems with n ranging from 4 to 40.

Carriero et al. (2022) endowed the IAAR with stochastic volatility [IAAR-SV] in the error $ε_{t}$ and offered Bayesian estimation using Markov Chain Monte Carlo [MCMC] techniques. Furthermore, they used (4) to decompose the time-varying volatility $E (ε_{t} ε_{t}^{'}) = Σ_{t}$ as follows:

Σ_{t} = \underbrace{Σ_{t} ω {(ω^{'} Σ_{t} ω)}^{- 1} ω^{'} Σ_{t}}_{common} + \underbrace{ω_{⊥} {(ω_{⊥}^{'} Σ_{t}^{- 1} ω_{⊥})}^{- 1} ω_{⊥}^{'}}_{uncommon}
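For a single date t, the volatility decomposition can be evaluated directly, which also gives the share of each variable's error variance attributable to the common part. The sketch below is an illustration only: the sizes, the random `omega` and `Sigma_t`, and the variance-share summary `common_share` are assumptions introduced here rather than quantities reported by Carriero et al. (2022).

```python
import numpy as np

rng = np.random.default_rng(3)
n, q = 4, 1  # illustrative sizes

omega = rng.standard_normal((n, q))
Q, _ = np.linalg.qr(omega, mode="complete")
omega_perp = Q[:, q:]                      # spans the orthogonal complement of omega

B = rng.standard_normal((n, n))
Sigma_t = B @ B.T + np.eye(n)              # error covariance at one date t

common = (
    Sigma_t @ omega @ np.linalg.inv(omega.T @ Sigma_t @ omega) @ omega.T @ Sigma_t
)
uncommon = (
    omega_perp
    @ np.linalg.inv(omega_perp.T @ np.linalg.inv(Sigma_t) @ omega_perp)
    @ omega_perp.T
)

gap = np.abs(common + uncommon - Sigma_t).max()     # should be numerically zero
common_share = np.diag(common) / np.diag(Sigma_t)   # per-variable variance share

print(gap, common_share)
```

Both terms are positive semidefinite, so each entry of `common_share` lies between 0 and 1.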

Carriero et al. (2022) applied the IAAR-SV to analyze the commonality in both levels and volatilities of inflation rates in several countries, and their main finding is that a substantial fraction of inflation volatility can be attributed to a global factor that also drives inflation levels and their persistence.

2.4. The Time-Varying Multivariate Autoregressive Index Model

A further step towards taking parameter instabilities over time into account was made by Cubadda et al. (2025), who proposed the following MAI with time-varying parameters and time-varying volatility [MAI-TVP-TVV]:

Y_{t} = \sum_{j = 1}^{p} α_{j, t} ω^{'} Y_{t - j} + ε_{t},

α_{t} = α_{t - 1} + κ_{t},

where $α_{t} = Vec {(α_{1, t}^{'}, \dots, α_{p, t}^{'})}^{'}$, $ε_{t} \sim N (0, Σ_{t})$, $κ_{t} \sim N (0, Q_{t})$, and $ε_{t}$ and $κ_{t}$ are independent at any lag and lead. Notice that the index loadings are assumed to evolve over time as random walks, while the index weights $ω$ remain stable.

In order to overcome the computational limitation related to MCMC procedures, Cubadda et al. (2025) offer a hybrid estimation method that combines the SA, Kalman filter with forgetting factors (Koop & Korobilis, 2014), and exponentially weighted moving average techniques (Johansson et al., 2023) for the time-varying volatility.

An empirical application, where 25 US quarterly time series are used to forecast three key macroeconomic variables, shows that the MAI-TVP-TVV is one of the best models in a large set of competitors for all targets, improving upon its counterparts, especially for short horizons. Other interesting findings are that once the MAI is endowed with time-varying volatility [MAI-TVV], there are no clear improvements in adding time-varying parameters for point forecasting, but the MAI-TVP-TVV always outperforms the MAI-TVV in density forecasting.

2.5. The Dimension-Reducible VAR

Cubadda and Hecq (2022a) studied the conditions under which the dynamics in a large-dimension VAR are entirely generated by a small-scale VAR. They show that such conditions are met when the coefficient matrices of the large VAR share a common right space and a common left null space. This entails combining Assumption 1 with the following.

Assumption 3.

The following holds:

ω_{⊥}^{'} [Φ_{1}, \dots, Φ_{p}] = 0

Assumption 3 is known in time series econometrics as the CSC (see Cubadda & Hecq, 2022b and the references therein), given that it implies

ω_{⊥}^{'} Y_{t} = ω_{⊥}^{'} ε_{t},

that is, there exist $(n - q)$ linear combinations of the variables $Y_{t}$ that are white noise and, as such, cannot exhibit cyclical behavior.

Taking Assumptions 1 and 3 together leads to the dimension-reducible VAR model [DRVAR]:

Y_{t} = \sum_{j = 1}^{p} ω ϕ_{j} f_{t - j} + ε_{t},

(11)

where $ϕ_{j}$ is a $q \times q$ matrix for $j = 1, \dots, p$.

Assuming, without loss of generality, that $ω^{'} ω = I_{q}$ and $ω_{⊥}^{'} ω_{⊥} = I_{n - q}$, we can decompose series $Y_{t}$ as follows:

Y_{t} = ω f_{t} + ω_{⊥} η_{t},

(12)

where $f_{t}$ is the dynamic component, and $η_{t} = ω_{⊥}^{'} ε_{t}$ is the static one. Premultiplying both sides of the DRVAR by $ω^{'}$, one obtains

f_{t} = \sum_{j = 1}^{p} ϕ_{j} f_{t - j} + ε_{t}^{χ},

where $ε_{t}^{χ} = ω^{'} ε_{t}$, which shows that $f_{t}$ is generated by a q-dimensional VAR(p) process.

By inserting the Wold representation of the dynamic component $f_{t}$ in Equation (12), it follows that

Y_{t} = ω γ (L) ε_{t}^{χ} + ω_{⊥} η_{t},

(13)

where $γ {(L)}^{- 1} = I_{q} - \sum_{j = 1}^{p} ϕ_{j} L^{j}$. Finally, by linearly projecting $ω_{⊥} η_{t}$ on $ε_{t}^{χ}$, we obtain $ω_{⊥} η_{t} = ρ ε_{t}^{χ} + ν_{t}$ with $E (ε_{t}^{χ} ν_{t}^{'}) = 0$, which can be inserted into Equation (13) to obtain

Y_{t} = C (L) ε_{t}^{χ} + ν_{t},

(14)

where $C_{0} = ω + ρ$ and $C_{j} = ω γ_{j}$ for $j > 0$.

Representation (14) highlights that the system dynamics are completely generated by the common reduced form errors $ε_{t}^{χ}$. Consequently, Cubadda and Hecq (2022a) label $ν_{t}$ as the ignorable errors, as they are noise without structural interpretation. Since errors $ε_{t}^{χ}$ and $ν_{t}$ are uncorrelated at any lead and lag, it is then possible to recover the structural shocks solely from the reduced form errors $ε_{t}^{χ}$ of the common component $χ_{t}$ using any of the procedures that are commonly employed in structural VARs or structural DFMs (see Stock and Watson (2016) and the references therein).

In order to estimate the matrix $ω$, one may rely on a nonparametric estimator proposed by Lam et al. (2011). The underlying intuition is that the matrix $ω$ lies in the space generated by the eigenvectors associated with the q nonzero eigenvalues of the symmetric and positive semidefinite matrix

M = \sum_{j = 1}^{p_{0}} Σ_{y} (j) Σ_{y} {(j)}^{'},

where $p_{0} \geq p$, and $Σ_{y} (j)$ is the autocovariance matrix of series $Y_{t}$ at lag j. Under some regularity conditions, the matrix formed by the eigenvectors associated with the q largest eigenvalues of the sample estimate of M is a $\sqrt{T}$-consistent estimator of $ω$ (up to an orthonormal transformation) when q is fixed, $n, T \to \infty$, and $ω_{i}^{'} ω_{i} = O (n)$ for $i = 1, \dots, q$, where $ω = [ω_{1}, \dots, ω_{q}]$. Remarkably, the speed of convergence of the estimator, namely $\sqrt{T}$, is the same as when the dimension n is finite.
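The eigenvector-based estimator described above can be sketched as follows. This is an illustrative implementation under stated assumptions: the function name `estimate_index_space`, the demeaning of the data, the 1/T scaling of the sample autocovariances, and the simulated DRVAR with one lag are choices made here and are not taken from Lam et al. (2011) or Cubadda and Hecq (2022a).

```python
import numpy as np


def estimate_index_space(Y, q, p0):
    """Eigenvectors of M = sum_{j=1}^{p0} S(j) S(j)' for the q largest
    eigenvalues, where S(j) is the sample estimate of E(Y_t Y_{t-j}')."""
    Y = Y - Y.mean(axis=0)
    T, n = Y.shape
    M = np.zeros((n, n))
    for j in range(1, p0 + 1):
        S = Y[j:].T @ Y[:-j] / T
        M += S @ S.T
    eigval, eigvec = np.linalg.eigh(M)     # eigenvalues in ascending order
    return eigvec[:, ::-1][:, :q], eigval[::-1]


# Simulate a DRVAR with p = 1: Y_t = omega phi omega' Y_{t-1} + eps_t
rng = np.random.default_rng(42)
n, q, T = 8, 2, 5000
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
omega = Q[:, :q]                           # orthonormal index weights
phi = np.diag([0.8, -0.7])
Y = np.zeros((T, n))
for t in range(1, T):
    Y[t] = omega @ (phi @ (omega.T @ Y[t - 1])) + rng.standard_normal(n)

omega_hat, eigvals = estimate_index_space(Y, q, p0=2)

# omega is identified only up to rotation, so compare projection matrices
proj_gap = np.linalg.norm(omega_hat @ omega_hat.T - omega @ omega.T)
print(proj_gap, eigvals[:4])
```

Since the estimator recovers the space spanned by $ω$ rather than $ω$ itself, the comparison is made between the two projection matrices; in this design the first q eigenvalues of the sample M should clearly dominate the remaining ones.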

Moreover, Cubadda and Hecq (2022a) provide both the OLS and GLS estimators of the coefficients $ϕ_{j}$ in Equation (11) and consistent information criteria for the selection of q, and they show by simulations that the proposed methodology works well with the temporal and cross-sectional sizes that are typical in macroeconomics. Finally, the approach is applied to analyze a large set of US economic time series and to identify the shock that is responsible for most of the common volatility in the business cycle frequency band.

2.6. The Vector Error-Correction Index Model

The models considered so far do not explicitly deal with the possible presence of unit roots. Given that most macroeconomic and financial time series are characterized by stochastic trends, it is important to understand how a cointegrated VAR model can be augmented with an index structure.

Let us assume that series $Y_{t}$ follows the vector error-correction model [VECM]

Δ Y_{t} = α_{0} β^{'} Y_{t - 1} + \sum_{j = 1}^{p - 1} Π_{j} Δ Y_{t - j} + ε_{t},

(15)

where $α_{0}$ and $β$ are full-rank $n \times r$ ($r < n$) matrices such that $α_{0} β^{'} = - Φ (1)$, $Π_{j} = - \sum_{i > j} Φ_{i}$ for $j = 1, \dots, p - 1$, ${α_{0}}_{⊥}^{'} \bar{Π} β_{⊥}$ is non-singular, and $\bar{Π} = I_{n} - \sum_{j = 1}^{p - 1} Π_{j}$. Under such assumptions, it is well known that the elements of $Y_{t}$ are individually at most $I (1)$ and that they are jointly cointegrated of order 1, in the sense that $β^{'} Y_{t - 1}$ is $I (0)$ (see Johansen (1995) and the references therein).

To possibly reduce the number of parameters in the VECM, Cubadda and Mazzali (2024) made the following assumptions:

Assumption 4.

For $Π = {[Π_{1}^{'}, \dots, Π_{p - 1}^{'}]}^{'}$, the following holds:

Π = A ω^{'},

where ω is a full-rank $n \times q$ matrix with $q < n$, and A is a full-rank $n (p - 1) \times q$ matrix.

Assumption 5.

The following holds:

β = ω γ,

where γ is a full-rank $q \times r$ matrix with $q \geq r$.

Under Assumptions 4 and 5, Model (15) can be rewritten as the following vector error-correction index model [VECIM]:

Δ Y_{t} = α_{0} γ^{'} f_{t - 1} + \sum_{j = 1}^{p - 1} α_{j} Δ f_{t - j} + ε_{t},

where $γ$ is a full-rank $q \times r$ matrix ($q \geq r$), and $α_{j}$ is an $n \times q$ matrix for $j = 1, \dots, p - 1$ such that $rank ({[α_{1}^{'}, \dots, α_{p - 1}^{'}]}^{'}) = q$. Notice that the cointegration matrix is given by $β = ω γ$.

Interestingly, the indexes $f_{t}$ themselves are generated by a q-dimensional VECM:

Δ f_{t} = \underline{α}_{0} γ^{'} f_{t - 1} + \sum_{j = 1}^{p - 1} \underline{α}_{j} Δ f_{t - j} + ε_{t}^{χ},

where $\underline{α}_{j} = ω^{'} α_{j}$, for $j = 0, 1, \dots, p - 1$.

By first inserting the decomposition (4) of the identity matrix between $Ψ (L)$ and $ε_{t}$ in the Wold representation of the first differences $Δ Y_{t}$,

Δ Y_{t} = Ψ (L) ε_{t},

and then further decomposing the common component $χ_{t}$ into permanent and transitory subcomponents as in Centoni and Cubadda (2003), we obtain the following:

Y_{t} = χ_{t} + ι_{t} = π_{t} + τ_{t} + ι_{t},

(16)

where

Δ π_{t} = Ψ (L) Σ ω \underline{Σ}^{- 1} \underline{Σ} \underline{α}_{0 ⊥} {(\underline{α}_{0 ⊥}^{'} \underline{Σ} \underline{α}_{0 ⊥})}^{- 1} \underbrace{\underline{α}_{0 ⊥}^{'} ε_{t}^{χ}}_{ε_{t}^{π}},

(17)

Δ τ_{t} = Ψ (L) Σ ω \underline{Σ}^{- 1} \underline{α}_{0} {(\underline{α}_{0}^{'} \underline{Σ}^{- 1} \underline{α}_{0})}^{- 1} \underbrace{\underline{α}_{0}^{'} \underline{Σ}^{- 1} ε_{t}^{χ}}_{ε_{t}^{τ}},

(18)

Δ ι_{t} = Ψ (L) ω_{⊥} {(ω_{⊥}^{'} Σ^{- 1} ω_{⊥})}^{- 1} \underbrace{ω_{⊥}^{'} Σ^{- 1} ε_{t}}_{ε_{t}^{ι}}.

(19)

Since errors $ε_{t}^{π}$ are the innovations of the common trends in the indexes $f_{t}$ (see, e.g., Johansen, 1995) and errors $ε_{t}^{τ}$ are such that $E (ε_{t}^{π} ε_{t}^{τ'}) = 0$, Cubadda and Mazzali (2024) labeled $π_{t}$ as the common permanent component and $τ_{t}$ as the common transitory component, whereas $ι_{t}$ is the uncommon component given that $E (ε_{t}^{ι} ε_{t}^{π'}) = 0$ and $E (ε_{t}^{ι} ε_{t}^{τ'}) = 0$.

Following a reasoning similar to the one leading to Proposition 1, post-multiplying by $ω_{⊥}$ both sides of the relation

Ψ (L) (Δ I_{n} - \sum_{j = 1}^{p - 1} α_{j} ω^{'} Δ L^{j} - α_{0} γ^{'} ω^{'} L) = Δ I_{n},

we again obtain $Ψ (L) ω_{⊥} = ω_{⊥}$, which in turn implies that the Wold polynomial matrix of the VECIM has the same form as (8). Finally, inserting (8) in Equations (17)–(19), we can prove the following proposition:

Proposition 2.

In the VECIM, the first differences of the components of $Y_{t}$ in (16) read

Δ π_{t} = (Σ ω \underline{Σ}^{- 1} + \sum_{j = 1}^{\infty} θ_{j} L^{j}) \underline{Σ} \underline{α}_{0 ⊥} {(\underline{α}_{0 ⊥}^{'} \underline{Σ} \underline{α}_{0 ⊥})}^{- 1} ε_{t}^{π} \equiv P (L) ε_{t}^{π},

Δ τ_{t} = (Σ ω \underline{Σ}^{- 1} + \sum_{j = 1}^{\infty} θ_{j} L^{j}) \underline{α}_{0} {(\underline{α}_{0}^{'} \underline{Σ}^{- 1} \underline{α}_{0})}^{- 1} ε_{t}^{τ} \equiv T (L) ε_{t}^{τ},

Δ ι_{t} = ω_{⊥} {(ω_{⊥}^{'} Σ^{- 1} ω_{⊥})}^{- 1} ε_{t}^{ι},

where the uncommon component $ι_{t}$ is an n-dimensional random walk such that $Rank (E (Δ ι_{t} Δ ι_{t}^{'})) = n - q$.

Notice that Proposition 2 implies that Corollary 1 applies to the VECIM as well.

Remark 6.

Given that the components in (16) are not correlated with each other at any lag and lead, the VECIM allows one to perform a structural analysis, taking advantage of the features of both the DFM—namely isolating shocks that are common among variables—and the VECM—namely disentangling shocks having transitory or permanent effects. For instance, one may identify the structural transitory shocks as

u_{t} = C^{- 1} D ε_{t}^{τ}

and the impulse response functions as

Θ (L) = T (L) D^{- 1} C

, where D is the matrix formed by the first r rows of

T (0)

, and C is a lower triangular matrix such that

C C^{'} = D {\underset{̲}{α}}_{0}^{'} Σ^{- 1} ω^{'} Σ ω Σ^{- 1} {\underset{̲}{α}}_{0} D^{'}

. Since the first r rows of

Θ (0)

, being equal to C, form a lower triangular matrix, the usual interpretation of structural shocks obtained through a Cholesky factorization applies to

u_{t}

.
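
The recursive identification described in Remark 6 can be sketched in a few lines. The example assumes that the impact matrix T(0) and the covariance matrix of the common transitory errors are already available (here they are random placeholders); the names `cholesky_identification`, `T0`, and `cov_tau` are hypothetical and introduced only for illustration.

```python
import numpy as np

def cholesky_identification(T0, cov_tau):
    """Recursive identification of transitory shocks.

    T0      : (n, r) impact matrix T(0)
    cov_tau : (r, r) covariance matrix of the common transitory errors
    Returns C, the impact responses Theta(0), and the map from errors to shocks.
    """
    r = cov_tau.shape[0]
    D = T0[:r, :]                                  # first r rows of T(0)
    C = np.linalg.cholesky(D @ cov_tau @ D.T)      # lower triangular, C C' = D cov D'
    theta0 = T0 @ np.linalg.solve(D, C)            # Theta(0) = T(0) D^{-1} C
    to_shocks = np.linalg.solve(C, D)              # u_t = C^{-1} D eps_t
    return C, theta0, to_shocks

# Placeholder inputs (assumptions for the example only)
rng = np.random.default_rng(0)
n, r = 5, 2
T0 = rng.standard_normal((n, r))
G = rng.standard_normal((r, r))
cov_tau = G @ G.T + np.eye(r)

C, theta0, to_shocks = cholesky_identification(T0, cov_tau)
cov_u = to_shocks @ cov_tau @ to_shocks.T          # covariance of the structural shocks
```

In this sketch the first r rows of the impact responses coincide with C and the structural shocks have an identity covariance matrix, which is the property that justifies the Cholesky interpretation.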

Cubadda and Mazzali (2024) offered a three-step SA for the estimation of the VECIM and proposed selecting the triple

(p, q, r)

in a unique search by IC. An extensive Monte Carlo study shows that the proposed methodology works reasonably well for n ranging from 6 to 18 when the model is selected by the Hannan–Quinn IC. Moreover, in an empirical application, they identified a shock that maximizes the variability of the common transitory component of unemployment at business cycle frequencies and another one that does the same for the common permanent component of unemployment. These two shocks have a neater economic interpretation than a single main business cycle shock identified according to Angeletos et al. (2020).

3. A New Proposal: The Cointegrated Index-Augmented Autoregressive Model

A possible limitation of the VECIM is that the uncommon component

ι_{t}

is necessarily a random walk, which may be considered restrictive for some applications. For example, Barigozzi et al. (2021) proposed a DFM where the idiosyncratic components may be I(0) or I(1).

In order to overcome this issue, one can combine the VECIM with the IAAR. Formally, this involves using Assumption 5 along with the following one:

Assumption 6.

For the VECM (15), the following holds:

π_{i k}^{(j)} = \sum_{m = 1}^{q} α_{i m}^{(j)} ω_{k m},

where

π_{i k}^{(j)}

is the generic element of the coefficient matrix

Π_{j}

for

j = 1, \dots, p - 1

,

i = 1, \dots, n

,

k = 1, \dots, i - 1, i + 1, \dots, n

.

Under Assumptions 5 and 6, Model (15) can be rewritten as the following cointegrated index-augmented autoregressive model [CIAAR]:

Δ Y_{t} = \sum_{j = 1}^{p - 1} D_{j} Δ Y_{t - j} + α_{0} γ^{'} ω^{'} Y_{t - 1} + \sum_{j = 1}^{s - 1} α_{j} ω^{'} Δ Y_{t - j} + ε_{t},

(20)

where

D_{j}

is an

n \times n

diagonal matrix with

δ_{i i}^{(j)} = π_{i i}^{(j)} - \sum_{m = 1}^{q} α_{i m}^{(j)} ω_{i m}

as a generic diagonal element.
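
To make the structure of Model (20) concrete, the following sketch evaluates its conditional mean and checks that, when p = s, it coincides with an unrestricted-looking VECM whose short-run matrices are built as a diagonal matrix plus a reduced-rank term, so that their off-diagonal elements satisfy Assumption 6. The function name `ciaar_mean` and all inputs are hypothetical and chosen only for illustration.

```python
import numpy as np

def ciaar_mean(Y_prev, dY_lags, deltas, alpha0, gamma, omega, alphas):
    """Conditional mean of the differenced series in Model (20).

    Y_prev  : (n,) lagged level
    dY_lags : list of lagged first differences, most recent first
    deltas  : list of p - 1 vectors holding the diagonals of D_j
    alphas  : list of s - 1 loading matrices alpha_j, each (n, q)
    """
    out = alpha0 @ (gamma.T @ (omega.T @ Y_prev))      # error-correction term
    for delta_j, dY in zip(deltas, dY_lags):
        out = out + delta_j * dY                       # D_j times lagged difference
    for alpha_j, dY in zip(alphas, dY_lags):
        out = out + alpha_j @ (omega.T @ dY)           # index term
    return out

# Random placeholder parameters with p = s = 3 (assumptions for the example only)
rng = np.random.default_rng(0)
n, q, r = 5, 2, 1
omega = rng.standard_normal((n, q))
gamma = rng.standard_normal((q, r))
alpha0 = rng.standard_normal((n, r))
deltas = [rng.standard_normal(n) for _ in range(2)]
alphas = [rng.standard_normal((n, q)) for _ in range(2)]
Y_prev = rng.standard_normal(n)
dY_lags = [rng.standard_normal(n) for _ in range(2)]

# Equivalent VECM short-run matrices: diagonal part plus reduced-rank part
Pis = [np.diag(d) + a @ omega.T for d, a in zip(deltas, alphas)]
vecm_mean = alpha0 @ gamma.T @ omega.T @ Y_prev + sum(P @ x for P, x in zip(Pis, dY_lags))
mean_20 = ciaar_mean(Y_prev, dY_lags, deltas, alpha0, gamma, omega, alphas)
```

The diagonal vectors passed to the function play the role of the elements δ_ii, while the full diagonal of each VECM short-run matrix equals δ_ii plus the corresponding diagonal element of the reduced-rank term.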

When the elements of series

Y_{t}

are I(1), Model (20) includes several earlier models for series

Δ Y_{t}

as special cases, as summarized in Table 1.

Table 1. Previous models as special cases of the CIAAR.

Remark 7.

Interestingly, by the same argument underlying Proposition 2, we see that, unlike in the VECIM,

Ψ (L) ω_{⊥} \neq ω_{⊥}

, which implies, in view of Equation (19), that the first differences of the uncommon component

Δ ι_{t}

are generally autocorrelated in the case of the CIAAR. The uncommon component

ι_{t}

is still stochastically singular with rank

n - q

. Since system (20) has overall

n - r

unit roots, while the common component

χ_{t}

has

q - r

unit roots, the uncommon component

ι_{t}

has

n - q

unit roots (see Deistler & Wagner, 2017; Barigozzi et al., 2020 on the properties of singular I(1) stochastic processes).

Following Cubadda and Guardabascio (2019) and Cubadda and Mazzali (2024), the estimation procedure is based on an SA where each step is designed to increase the Gaussian likelihood of Model (20). In detail, when

0 < r < q

, the procedure goes as follows:

1. Given (initial) estimates of $γ$ , $ω$ , and $D = {[D_{1}, \dots, D_{p - 1}]}^{'}$ , maximize the conditional Gaussian likelihood $L (A^{†}, Σ | γ, ω, D)$ by estimating $A^{†} = {[α_{0}^{'}, A^{'}]}^{'}$ , where $A = {[α_{1}^{'}, \dots, α_{s - 1}^{'}]}^{'}$ , and $Σ$ with OLS on the following equation:

$Δ Y_{t} - \sum_{j = 1}^{p - 1} D_{j} Δ Y_{t - j} = α_{0} γ^{'} ω^{'} Y_{t - 1} + \sum_{j = 1}^{s - 1} α_{j} ω^{'} Δ Y_{t - j} + ε_{t}$
2. Premultiply by $Σ^{- 1 / 2}$ and apply the $Vec$ operator to both sides of Equation (20); then, use the property $Vec (A B C) = (C^{'} \otimes A) Vec (B)$ to obtain

$\begin{matrix} Σ^{- 1 / 2} Δ Y_{t} = & \sum_{j = 1}^{p - 1} (Δ Y_{t - j}^{'} \otimes Σ^{- 1 / 2}) Vec (D_{j}) + \\ & (Y_{t - 1}^{'} \otimes Σ^{- 1 / 2} α_{0} γ^{'} + \sum_{j = 1}^{s - 1} Δ Y_{t - j}^{'} \otimes Σ^{- 1 / 2} α_{j}) Vec (ω^{'}) + Σ^{- 1 / 2} ε_{t}, \end{matrix}$

and reparametrize the above model as

$\begin{matrix} Σ^{- 1 / 2} Δ Y_{t} = & \sum_{j = 1}^{p - 1} [(Δ Y_{t - j}^{'} \otimes Σ^{- 1 / 2}) M] δ_{j} + \\ & (Y_{t - 1}^{'} \otimes Σ^{- 1 / 2} α_{0} γ^{'} + \sum_{j = 1}^{s - 1} Δ Y_{t - j}^{'} \otimes Σ^{- 1 / 2} α_{j}) Vec (ω^{'}) + Σ^{- 1 / 2} ε_{t}, \end{matrix}$

(21)

where $δ_{j}$ is an n-vector such that $D_{j} = diag (δ_{j})$ , and M is a binary $n^{2} \times n$ matrix whose generic element $m_{i k}$ is such that

$m_{i k} = \begin{cases} 1 & \text{if } i = 1 + (k - 1) (n + 1), \; k = 1, \dots, n, \\ 0 & \text{otherwise.} \end{cases}$

Given the previously obtained estimates of $A^{†}$ , $γ$ , and $Σ$ , maximize $L (ω, D | A^{†}, γ, Σ)$ by estimating $Vec (ω^{'})$ and $δ = {[δ_{1}^{'}, \dots, δ_{p - 1}^{'}]}^{'}$ with OLS in Equation (21).
3. Given the previously obtained estimates of $ω$ and D, maximize $L (γ | ω, D)$ by estimating $γ$ as the eigenvectors that correspond to the r largest eigenvalues of the matrix

$S_{11}^{- 1} S_{10} S_{00}^{- 1} S_{01}$

where $S_{i j} = \sum_{t = p + 1}^{T} R_{i, t} R_{j, t}^{'}$ for $i, j = 0, 1$ ; $R_{0, t}$ and $R_{1, t}$ are, respectively, the residuals of an OLS regression of $Δ Y_{t} - \sum_{j = 1}^{p - 1} D_{j} Δ Y_{t - j}$ and $ω^{'} Y_{t - 1}$ on ${[Δ Y_{t - 1}^{'} ω, \dots, Δ Y_{t - s + 1}^{'} ω]}^{'}$ .
4. Repeat steps 1 to 3 until numerical convergence occurs.
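
The algebra behind step 2 rests on two facts: the Vec identity quoted there, and the fact that the selection matrix M maps the diagonal of a matrix into its vectorization. The following sketch verifies both numerically; the helper names `vec` and `selection_matrix` and the random inputs are illustrative assumptions.

```python
import numpy as np

def vec(A):
    """Column-stacking Vec operator."""
    return A.reshape(-1, order="F")

def selection_matrix(n):
    """Binary n^2 x n matrix M such that vec(diag(delta)) = M delta."""
    M = np.zeros((n * n, n))
    for k in range(n):
        M[k * (n + 1), k] = 1.0        # zero-based version of i = 1 + (k - 1)(n + 1)
    return M

# Random placeholder inputs (assumptions for the example only)
rng = np.random.default_rng(0)
n = 4
delta = rng.standard_normal(n)
dY_lag = rng.standard_normal(n)                  # a lagged first difference
S = rng.standard_normal((n, n))                  # stands in for Sigma^{-1/2}
A, B, Cm = (rng.standard_normal((n, n)) for _ in range(3))
M = selection_matrix(n)

lhs_vec = vec(A @ B @ Cm)
rhs_vec = np.kron(Cm.T, A) @ vec(B)              # Vec(ABC) = (C' kron A) Vec(B)

# Regressor block multiplying delta_j in the reparametrized equation
block = np.kron(dY_lag[None, :], S) @ M          # (n, n)
direct = S @ np.diag(delta) @ dY_lag             # Sigma^{-1/2} D_j times the lagged difference
```

Here `block @ delta` reproduces `direct`, which is the reason why the n free diagonal elements can be estimated by OLS jointly with the index weights.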

When

r = 0

, step 3 is clearly not needed, and steps 1 and 2 must be modified as follows:

1.1: Given (initial) estimates of $ω$ and D, maximize $L (A, Σ | ω, D)$ by estimating A and $Σ$ with OLS on the following model:

$Δ Y_{t} - \sum_{j = 1}^{p - 1} D_{j} Δ Y_{t - j} = \sum_{j = 1}^{s - 1} α_{j} ω^{'} Δ Y_{t - j} + ε_{t}$
2.1: Given the previously obtained estimates of A and $Σ$ , maximize $L (ω, D | A, Σ)$ by estimating $Vec (ω^{'})$ and $δ$ with OLS on the following model:

$Σ^{- 1 / 2} Δ Y_{t} = \sum_{j = 1}^{p - 1} [(Δ Y_{t - j}^{'} \otimes Σ^{- 1 / 2}) M] δ_{j} + (\sum_{j = 1}^{s - 1} Δ Y_{t - j}^{'} \otimes Σ^{- 1 / 2} α_{j}) Vec (ω^{'}) + Σ^{- 1 / 2} ε_{t}.$
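
As a rough illustration of how steps 1.1 and 2.1 alternate, the sketch below implements them for the simplest case r = 0 with a single lag (p = s = 2). It is not the author's code: the function name `switching_r0`, the random starting values (used here instead of the initialization in Appendix A), and the fixed number of iterations are all simplifying assumptions.

```python
import numpy as np

def selection_matrix(n):
    M = np.zeros((n * n, n))
    M[np.arange(n) * (n + 1), np.arange(n)] = 1.0
    return M

def switching_r0(dY, q, n_iter=25, seed=0):
    """Alternate steps 1.1 and 2.1 with one lag; dY is a (T, n) array of differences."""
    Y, Z = dY[1:], dY[:-1]                        # current and lagged differences
    n = dY.shape[1]
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((n, q))           # arbitrary start (assumption)
    delta = np.zeros(n)
    M = selection_matrix(n)
    logdets = []
    for _ in range(n_iter):
        # Step 1.1: OLS of the diagonal-adjusted differences on the lagged indexes
        F = Z @ omega
        lhs = Y - Z * delta                       # D is diagonal, so D z = delta * z
        alpha = np.linalg.lstsq(F, lhs, rcond=None)[0].T
        resid = lhs - F @ alpha.T
        Sigma = resid.T @ resid / len(Y)
        logdets.append(np.linalg.slogdet(Sigma)[1])
        # Step 2.1: OLS on the transformed model for delta and Vec(omega')
        w, V = np.linalg.eigh(Sigma)
        S = V @ np.diag(w ** -0.5) @ V.T          # symmetric inverse square root
        X = np.hstack([np.kron(Z, S) @ M, np.kron(Z, S @ alpha)])
        theta = np.linalg.lstsq(X, (Y @ S).ravel(), rcond=None)[0]
        delta = theta[:n]
        omega = theta[n:].reshape((q, n), order="F").T
    # alpha and Sigma come from the last step 1.1; omega and delta from the last step 2.1
    return alpha, omega, delta, Sigma, logdets

# Simulated data from a one-lag model with a diagonal plus rank-one coefficient matrix
rng = np.random.default_rng(1)
n, q, T = 4, 1, 300
Pi = 0.3 * np.eye(n) + (0.2 * rng.standard_normal((n, q))) @ (0.5 * rng.standard_normal((n, q))).T
dY = np.zeros((T, n))
for t in range(1, T):
    dY[t] = Pi @ dY[t - 1] + rng.standard_normal(n)

alpha, omega, delta, Sigma, logdets = switching_r0(dY, q)
```

Since each step maximizes the Gaussian likelihood conditionally on the remaining parameters, the log-determinant of the residual covariance matrix recorded after each step 1.1 should not increase across iterations, which gives a simple diagnostic for an implementation of this kind.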

Finally, when

r = q

, we can assume, without loss of generality, that

γ = I_{q}

. Then, step 3 is, again, not needed, whereas steps 1 and 2 must be modified as follows:

1.3: Given (initial) estimates of $ω$ and D, maximize $L (A^{†}, Σ | ω, D)$ by estimating $A^{†}$ and $Σ$ with OLS in the following model:

$Δ Y_{t} - \sum_{j = 1}^{p - 1} D_{j} Δ Y_{t - j} = α_{0} ω^{'} Y_{t - 1} + \sum_{j = 1}^{s - 1} α_{j} ω^{'} Δ Y_{t - j} + ε_{t}$
2.3: Given the previously obtained estimates of $A^{†}$ and $Σ$ , maximize $L (ω, D | A^{†}, Σ)$ by estimating $Vec (ω^{'})$ and $δ$ with OLS in the following model:

$\begin{matrix} Σ^{- 1 / 2} Δ Y_{t} = & \sum_{j = 1}^{p - 1} [(Δ Y_{t - j}^{'} \otimes Σ^{- 1 / 2}) M] δ_{j} + \\ & (Y_{t - 1}^{'} \otimes Σ^{- 1 / 2} α_{0} + \sum_{j = 1}^{s - 1} Δ Y_{t - j}^{'} \otimes Σ^{- 1 / 2} α_{j}) Vec (ω^{'}) + Σ^{- 1 / 2} ε_{t}. \end{matrix}$

The choice of initial values for the above procedures is discussed in Appendix A, whereas the selection of the quadruple

(p, s, q, r)

can be carried out by IC sequentially or in a unique search, as suggested by Cubadda and Mazzali (2024).

4. Conclusions and Future Research Directions

The DFM and VAR are, arguably, among the most popular tools in macroeconometrics and financial econometrics. The two approaches should be considered complementary rather than substitutive, since each has its own merits. The MAI represents a link between these two methodologies: On the one hand, it is a VAR with a specific reduced-rank structure that alleviates the dimensionality problem; on the other hand, the MAI and its variants have several analogies with the DFM; in particular, they allow for identifying a small number of common reduced-form errors and for recovering structural shocks from those errors only.

However, the MAI is not affected by some theoretical limitations of the DFM, such as the requirement that the cross-sectional dimension diverges to infinity and the need for specific assumptions on the dynamic correlation structure of the idiosyncratic component and on the factor loadings. From a more practical perspective, VARs with an index structure can also handle features such as stochastic volatility (Carriero et al., 2022) and time-varying parameters (Cubadda et al., 2025), which are not easily accommodated in DFMs.

Recent developments in VAR models with index structures have considerably extended the original MAI formulation, endowing the model with individual autoregressive structures, stochastic volatility, time-varying parameters, high dimensionality, and cointegration. These extensions have proven to be useful tools for detecting common components, obtaining efficiency gains through the imposition of parameter restrictions, performing structural analysis, and boosting forecast accuracy.

After reviewing most of the recent advances on the MAI and providing new insights on the representation theory underlying the IAAR and the VECIM, this paper proposed a new model, namely the CIAAR, along with an estimation procedure. The CIAAR extends previous contributions by allowing for individual AR structures in the VECIM and for cointegration in the IAAR.

There is plenty of room for future research that could be developed in at least three directions. First, the practical relevance of the CIAAR must be investigated both empirically and by simulations. Second, sparsity could be introduced in the MAI and its variants, employing regularized

ℓ_{1}

regressions in the SA in place of OLS. This would open up the possibility of tackling both dimension reduction, through the index structure, and sparsity in the model coefficients, through Lasso and its variants. Third, the approaches considered in this survey could be applied to data with more elaborate dependence structures than vector time series, such as spatiotemporal processes or matrix–tensor time series. First contributions along these lines were provided by Pu et al. (2025), Wang et al. (2022), and Hecq et al. (2024).

Funding

The financial support of MUR under the 20223725WE (PRIN 2022) grant is gratefully acknowledged.

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

Previous versions of this paper were presented at an intermediate workshop on methodological and computational issues in large-scale time series models for economics and finance in Messina; the Villa Mondragone time series symposium in honour of Marco Lippi in Monte Porzio Catone (Rome); the 11th ICEEE in Palermo; and the final workshop on methodological and computational issues in large-scale time series models for economics and finance in Monte Porzio Catone (Rome). The author thanks the participants, as well as three anonymous referees, for their helpful comments and suggestions. The usual disclaimers apply.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A

The choice of the initial values for an SA is important. An accurate initialization is not only necessary to boost numerical convergence; the SA estimator is also asymptotically equivalent to the ML estimator when the parameters to be initialized are consistently estimated (Hautsch et al., 2023).

With reference to the SA in Section 3, the initial values for

γ

,

ω

, and D can be obtained as follows:

1. Apply the usual Johansen procedure to Model (15) and obtain estimates ${\hat{α}}_{0}$ , $\hat{β}$ , and ${\hat{Π}}_{j}$ for $j = 1, \dots, m$ , where $m = \max \{p, s\} - 1$ .
2. Construct matrices ${\tilde{Π}}_{j} = {\hat{Π}}_{j} - diag {[{\hat{π}}_{11}^{(j)}, \dots, {\hat{π}}_{n n}^{(j)}]}^{'}$ for $j = 1, \dots, m$ .
3. Construct the matrix $\tilde{Φ} = {[{\tilde{Π}}_{1}^{'}, \dots, {\tilde{Π}}_{m}^{'}, \hat{β} {\hat{α}}_{0}^{'}]}^{'}$ .
4. Compute the singular-value decomposition $\tilde{Φ} = U Λ V^{'}$ , where the singular values are arranged in non-increasing order, and obtain $\hat{ω}$ as the matrix formed by the first q columns of V.
5. Compute the rank-q approximation of $\tilde{Φ}$ as $\bar{Φ} = U \bar{Λ} V^{'}$ , where $\bar{Λ}$ is obtained from $Λ$ by setting the smallest $n - q$ singular values to 0.
6. Construct $\bar{Π} = {[{\bar{Π}}_{1}^{'}, \dots, {\bar{Π}}_{p - 1}^{'}]}^{'}$ as the matrix formed by the first $n (p - 1)$ rows of $\bar{Φ}$ .
7. Construct ${\hat{D}}_{j}$ as a diagonal matrix with the diagonal equal to $diag {[{\hat{π}}_{11}^{(j)} - {\bar{π}}_{11}^{(j)}, \dots, {\hat{π}}_{n n}^{(j)} - {\bar{π}}_{n n}^{(j)}]}^{'}$ for $j = 1, \dots, p - 1$ .
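
The SVD-based part of this initialization can be sketched as follows. The example starts from given estimates of the adjustment coefficients, the cointegrating vectors, and the short-run matrices; the Johansen step itself is not implemented, and the random inputs merely stand in for its output. The function name `initial_values` and its arguments are hypothetical.

```python
import numpy as np

def initial_values(Pi_hat, alpha0_hat, beta_hat, p, q):
    """Initial values for omega and D_1, ..., D_{p-1} from first-stage estimates.

    Pi_hat     : list of m short-run matrices, each (n, n)
    alpha0_hat : (n, r) adjustment coefficients
    beta_hat   : (n, r) cointegrating vectors
    """
    n = alpha0_hat.shape[0]
    # Remove the diagonal from each short-run matrix
    Pi_tilde = [P - np.diag(np.diag(P)) for P in Pi_hat]
    # Stack the blocks vertically; the last block is alpha0 beta'
    Phi_tilde = np.vstack(Pi_tilde + [alpha0_hat @ beta_hat.T])
    # NumPy returns singular values in non-increasing order
    U, s, Vt = np.linalg.svd(Phi_tilde, full_matrices=False)
    omega_hat = Vt[:q].T                               # first q right-singular vectors
    Phi_bar = (U[:, :q] * s[:q]) @ Vt[:q]              # rank-q approximation
    Pi_bar = [Phi_bar[j * n:(j + 1) * n] for j in range(p - 1)]
    D_hat = [np.diag(np.diag(Pi_hat[j] - Pi_bar[j])) for j in range(p - 1)]
    return omega_hat, Pi_bar, D_hat

# Random placeholders standing in for first-stage estimates (assumption)
rng = np.random.default_rng(0)
n, q, r, p, s = 5, 2, 1, 3, 2
m = max(p, s) - 1
Pi_hat = [rng.standard_normal((n, n)) for _ in range(m)]
alpha0_hat = rng.standard_normal((n, r))
beta_hat = rng.standard_normal((n, r))

omega_hat, Pi_bar, D_hat = initial_values(Pi_hat, alpha0_hat, beta_hat, p, q)
```

The resulting `omega_hat` has orthonormal columns by construction, so it can serve directly as a normalized starting value for the index weights.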

The motivation for the above choices is twofold. First, the asymptotic distribution of the Johansen estimator of

β

is not affected by restrictions on the short-run parameters (Johansen, 1995), which implies that

{\hat{α}}_{0}

,

{\hat{Π}}_{j}

, and

{\tilde{Π}}_{j}

are consistent, although inefficient, estimators of the associated parameters. Second, the right-singular vectors that correspond to the q largest singular values of the matrix

\tilde{Φ}

consistently estimate

ω

(see, e.g., Reinsel et al., 2022). By the same argument,

\bar{Π}

provides a consistent estimator of

Π

. Finally, the consistency of

\hat{D} = {[{\hat{D}}_{1}, \dots, {\hat{D}}_{p - 1}]}^{'}

follows trivially from the consistency of

\hat{Π}

and

\bar{Π}

.

Notes

1	As of the end of 2024, the annual citation rate of Reinsel (1983) in Scopus had increased by about 54% over the previous 9 years, with the majority of recent citations coming from econometrics journals.
2	Indeed, the matrix $ω$ , once identified through normalizing restrictions, has $q (n - q)$ free parameters.
3	Remarkably, when the factors in the EDFM are estimated by some principal components of series $Y_{t}$ , the sample variance matrix of the estimated idiosyncratic component has a reduced rank as well.
4	The most obvious alternatives to the VHARI are likely multivariate principal component regression and reduced-rank regression.
5	A general proof of the convergence of this family of iterative procedures is given by Oberhofer and Kmenta (1974).

References

Angeletos, G. M., Collard, F., & Dellas, H. (2020). Business-cycle anatomy. American Economic Review, 110(10), 3030–3070.
Bai, J. (2003). Inferential theory for factor models of large dimensions. Econometrica, 71(1), 135–171.
Bai, J., & Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica, 70(1), 191–221.
Bai, J., & Ng, S. (2006). Confidence intervals for diffusion index forecasts and inference for factor-augmented regressions. Econometrica, 74(4), 1133–1150.
Bańbura, M., Giannone, D., & Reichlin, L. (2010). Large Bayesian vector auto regressions. Journal of Applied Econometrics, 25(1), 71–92.
Barigozzi, M., Lippi, M., & Luciani, M. (2020). Cointegration and error correction mechanisms for singular stochastic vectors. Econometrics, 8(1), 3.
Barigozzi, M., Lippi, M., & Luciani, M. (2021). Large-dimensional dynamic factor models: Estimation of impulse–response functions with I(1) cointegrated factors. Journal of Econometrics, 221(2), 455–482.
Boswijk, H., & Doornik, J. (2004). Identifying, estimating and testing restricted cointegrated systems: An overview. Statistica Neerlandica, 58(4), 440–465.
Carriero, A., Clark, T. E., & Marcellino, M. (2015). Bayesian VARs: Specification choices and forecast accuracy. Journal of Applied Econometrics, 30(1), 46–73.
Carriero, A., Corsello, F., & Marcellino, M. (2022). The global component of inflation volatility. Journal of Applied Econometrics, 37(4), 700–721.
Carriero, A., Kapetanios, G., & Marcellino, M. (2016). Structural analysis with multivariate autoregressive index models. Journal of Econometrics, 192(2), 332–348.
Cavaliere, G., De Angelis, L., Rahbek, A., & Taylor, A. (2015). A comparison of sequential and information-based methods for determining the co-integration rank in heteroskedastic VAR models. Oxford Bulletin of Economics and Statistics, 77(1), 106–128.
Cavaliere, G., De Angelis, L., Rahbek, A., & Taylor, A. (2018). Determining the cointegration rank in heteroskedastic VAR models of unknown order. Econometric Theory, 34(2), 349–382.
Centoni, M., & Cubadda, G. (2003). Measuring the business cycle effects of permanent and transitory shocks in cointegrated time series. Economics Letters, 80(1), 45–51.
Corsi, F. (2009). A simple approximate long-memory model of realized volatility. Journal of Financial Econometrics, 7(2), 174–196.
Cubadda, G., & Guardabascio, B. (2019). Representation, estimation and forecasting of the multivariate index-augmented autoregressive model. International Journal of Forecasting, 35(1), 67–79.
Cubadda, G., Guardabascio, B., & Grassi, S. (2025). The time-varying multivariate autoregressive index model. International Journal of Forecasting, 41(1), 175–190.
Cubadda, G., Guardabascio, B., & Hecq, A. (2017). A vector heterogeneous autoregressive index model for realized volatility measures. International Journal of Forecasting, 33(2), 337–344.
Cubadda, G., & Hecq, A. (2022a). Dimension reduction for high dimensional vector autoregressive models. Oxford Bulletin of Economics and Statistics, 84(5), 1123–1152.
Cubadda, G., & Hecq, A. (2022b). Reduced rank regression models in economics and finance. In Oxford handbook of economic forecasting. Oxford University Press.
Cubadda, G., Hecq, A., & Palm, F. (2009). Studying co-movements in large multivariate data prior to multivariate modelling. Journal of Econometrics, 148(1), 25–35.
Cubadda, G., & Mazzali, M. (2024). The vector error correction index model: Representation, estimation and identification. Econometrics Journal, 27, 126–150.
Deistler, M., & Wagner, M. (2017). Cointegration in singular ARMA models. Economics Letters, 155(4), 39–42.
Fernández-Villaverde, J., Rubio-Ramírez, J., & Schorfheide, F. (2016). Solution and estimation methods for DSGE models. In J. Taylor, & H. Uhlig (Eds.), Handbook of macroeconomics (Vol. 2, pp. 527–724). North Holland.
Forni, M., & Gambetti, L. (2014). Sufficient information in structural VARs. Journal of Monetary Economics, 66(1), 124–136.
Forni, M., Hallin, M., Lippi, M., & Reichlin, L. (2000). The generalized dynamic-factor model: Identification and estimation. Review of Economics and Statistics, 82(4), 540–554.
Forni, M., Hallin, M., Lippi, M., & Reichlin, L. (2009). Opening the black box: Structural factor models with large cross sections. Econometric Theory, 25(5), 1319–1347.
Forni, M., & Lippi, M. (2001). The generalized dynamic factor model: Representation theory. Econometric Theory, 17(6), 1113–1141.
Gonzalo, J., & Pitarakis, J. (1999). Dimensionality effect in cointegration analysis. In Cointegration, causality, and forecasting. A festschrift in honour of Clive WJ Granger (pp. 212–229). Oxford University Press.
Hautsch, N., Okhrin, O., & Ristig, A. (2023). Maximum-likelihood estimation using the zig-zag algorithm. Journal of Financial Econometrics, 21(4), 1346–1375.
Hecq, A., Margaritella, L., & Smeekes, S. (2023). Granger causality testing in high-dimensional VARs: A post-double-selection procedure. Journal of Financial Econometrics, 21(3), 915–958.
Hecq, A., Ricardo, I., & Wilms, I. (2024). Reduced-rank matrix autoregressive models: A medium n approach. Available online: https://arxiv.org/abs/2407.07973 (accessed on 20 January 2025).
Hsu, N. J., Hung, H. L., & Chang, Y. M. (2008). Subset selection for vector autoregressive processes using Lasso. Computational Statistics & Data Analysis, 52(7), 3645–3657.
Johansen, S. (1995). Likelihood-based inference in cointegrated vector autoregressive models. Oxford University Press.
Johansson, K., Ogut, M. G., Pelger, M., Schmelzer, T., & Boyd, S. (2023). A simple method for predicting covariance matrices of financial returns. Foundations and Trends in Econometrics, 12(4), 324–407.
Kock, A. B., & Callot, L. (2015). Oracle inequalities for high dimensional vector autoregressions. Journal of Econometrics, 186(2), 325–344.
Koop, G. M. (2013). Forecasting with medium and large Bayesian VARs. Journal of Applied Econometrics, 28(2), 177–203.
Koop, G. M., & Korobilis, D. (2014). A new index of financial conditions. European Economic Review, 71, 101–116.
Lam, C., Yao, Q., & Bathia, N. (2011). Estimation of latent factors for high-dimensional time series. Biometrika, 98(4), 901–918.
Lippi, M. (2019). Time-domain approach in high-dimensional dynamic factor models. In Oxford research encyclopedia of economics and finance. Oxford University Press.
Oberhofer, W., & Kmenta, J. (1974). A general procedure for obtaining maximum likelihood estimates in generalized regression models. Econometrica, 42(3), 579–590.
Patton, A., & Sheppard, K. (2009). Optimal combinations of realised volatility estimators. International Journal of Forecasting, 25(2), 218–238.
Pu, D., Fang, K., Lan, W., Yu, J., & Zhang, Q. (2025). Reduced rank spatio-temporal models. Journal of Business & Economic Statistics, 43(1), 98–109.
Reinsel, G. (1983). Some results on multivariate autoregressive index models. Biometrika, 70(1), 145–156.
Reinsel, G., Velu, R., & Chen, K. (2022). Multivariate reduced-rank regression. Theory, methods and applications. Springer Nature.
Sims, C. A. (1980). Macroeconomics and reality. Econometrica, 48(1), 1–48.
Stock, J. H., & Watson, M. W. (2002a). Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association, 97(460), 1167–1179.
Stock, J. H., & Watson, M. W. (2002b). Macroeconomic forecasting using diffusion indexes. Journal of Business & Economic Statistics, 20(2), 147–162.
Stock, J. H., & Watson, M. W. (2016). Dynamic factor models, factor-augmented vector autoregressions and structural vector autoregressions in macroeconomics. In J. Taylor, & H. Uhlig (Eds.), Handbook of macroeconomics (Vol. 2A, pp. 415–525). North Holland.
Wang, D., Zheng, Y., Lian, H., & Li, G. (2022). High-dimensional vector autoregressive time series modeling via tensor decomposition. Journal of the American Statistical Association, 117(539), 1338–1356.

Table 1. Previous models as special cases of the CIAAR.

Model	Restrictions on Model (20)	Number of Restrictions
MAI	$α_{0} = 0_{n \times r}, γ = 0_{q \times r}, D_{j} = 0_{n \times n}$ for $j = 1, \dots, p - 1$	$n (p - 1) + r (n + q - r)$
IAAR	$α_{0} = 0_{n \times r}, γ = 0_{q \times r}$	$r (n + q - r)$
VECIM	$D_{j} = 0_{n \times n}$ for $j = 1, \dots, p - 1$	$n (p - 1)$

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
