Next Article in Journal
Numerical Solution of the Nonlinear Convection–Diffusion Equation Using the Fifth Order Iterative Method by Newton–Jarratt
Previous Article in Journal
Adaptive Bayesian Nonparametric Regression via Stationary Smoothness Priors
Previous Article in Special Issue
A High-Dimensional Cramér–von Mises Test
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Estimator’s Properties of Specific Time-Dependent Multivariate Time Series

Université libre de Bruxelles, Solvay Brussels School of Economics and Management and ECARES, CP 114/04, Avenue Franklin Roosevelt, 50, B-1050 Brussels, Belgium
Mathematics 2025, 13(7), 1163; https://doi.org/10.3390/math13071163
Submission received: 27 February 2025 / Revised: 25 March 2025 / Accepted: 27 March 2025 / Published: 31 March 2025
(This article belongs to the Special Issue New Challenges in Time Series and Statistics)

Abstract

:
There is now a vast body of literature on ARMA and VARMA models with time-dependent or time-varying coefficients. A large part of it is based on local stationary processes using time rescaling and assumptions of regularity with respect to time. A recent paper has presented an alternative asymptotic theory for the parameter estimators based on several distinct assumptions that seem difficult to verify at first look, especially for time-dependent VARMA or tdVARMA models. The purpose of the present paper is to detail several examples that illustrate the verification of the assumptions in that theory. These assumptions bear on the moments of the errors, the existence of the information matrix, but also how the coefficients of the pure moving average representation of the derivatives of the residuals (with respect to the parameters and evaluated at their true value) behave. We will do that analytically for two bivariate first-order models, an autoregressive model, and a moving average model, before sketching a generalization to higher-order models. We also show simulation results for these two models illustrating the analytical results. As a consequence, not only the assumptions can be checked but the simulations show how well the small sample behavior of the estimators agrees with the theory.

1. Introduction

Consider a multivariate time series ( x t ; t = 1 , , n ) of length n and dimension r. It is supposed to be generated by an array process ( x t ( n ) ; t = 1 , , n ; n > 0 ) . There is now a vast body of literature on ARMA and VARMA models with time-dependent or time-varying coefficients. For a recent review of VARMA models, including their time-dependent variants, see [1].
The history of time-dependent models for time series started with the works of [2,3,4,5,6]. Several of them, [2,4,5] and also [7,8,9,10,11,12], focussed on the temporal aspects, while others [3,6] were more interested in the spectral point of view.
We will not consider the numerous studies related to the time-dependent spectral approaches, except for the following one that had considerable attraction. We mean the theory based on local stationarity processes (LSP) due to Dahlhaus (see [13,14,15,16,17]). We will not repeat here the many contributions about it since they are nicely summarized in [18], and a few more recent references are mentioned in [19]. See also [20,21,22,23,24,25,26,27,28,29,30].
Other somewhat related approaches to ours include [31] (Chapter 17), which treats tdVAR models by Gaussian maximum likelihood (but does not discuss asymptotic properties), generalized autoregressive score (GAS) models of [32], testing parameter constancy against deterministically time-varying parameters, e.g., ref. [33] (Section 6.3) and references therein, generalized to VAR models in [34], smooth online parameter estimation approach [35], deep-learning approaches [36], using state-space methods [37], a non-parametric approach [38], or by using an explicit representation for ARMA recursions with either deterministically or stochastically varying coefficients [39].
If a few of the above-mentioned papers treat multivariate time-dependent models, like [25,31,36], they are often tdVAR, and, apparently, never tdVMA and tdVARMA models, with the notable exception of [38], which is semi-parametric and uses a kernel-density estimator. Ref. [40] explains that, although in theory a tdMA process can be written as an infinite tdAR process, it is not efficient to fit a high-order model. Since the first edition of [41] in 1970 and the practical studies that followed, it is well known that most time series in many fields like economics, sociology, tourism, agriculture, energy, and so on, are better fitted by ARIMA models, possibly on transformed data, rather than by AR models. Based on 13,238 monthly series taken from the Industrial Short-Term Indicator section of the EUROSTAT database on 15 Member States of the European Union and a few series from the United States and Japan, ref. [42] observed that the airline model, i.e., the seasonal ARIMA(0,1,1)(0,1,1 ) 12 model on the log-transformed data, was best-fitting for 61% of the series, whereas only 2% were best fitted by an autoregression. There is no reason why the primacy of moving averages would not extend to multivariate time series and time-dependent models, although there is no empirical study at this stage. A few of the previously mentioned papers for univariate ARMA models with time-dependent coefficients include MA coefficients like [7,10,11,12]. In the multivariate case, ref. [17] treats time-varying or time-dependent VARMA (tdVARMA) models in the context of local stationarity but only for Gaussian processes and under the assumption that all the eigenvalues of the true spectral density matrix are uniformly bounded from below, a condition that is difficult to verify in practice. More recently, ref. [38] has provided results for a semi-parametric estimator for tdVARMA models.
In a recent paper [43], estimation results were produced for a wide class of vector ARMA array processes with time-dependent coefficients, denoted tdVARM A ( n ) ( p , q ) , which includes as special cases both local stationary processes [18] and cyclically time-dependent processes [11]. The assumptions are rather general but complex at first sight, so it would be worthwhile to demonstrate their applicability. Previously, ref. [44] has already treated examples for cyclically tdVARMA stochastic processes. Ref. [43] could also be seen as a generalization to multivariate processes of [12] devoted to univariate ARMA models with time-dependent coefficients, thereby generalizing the autoregressive moving average (ARMA) models popularized by Box and Jenkins, ref. [41]. In [12], the two cases where the coefficients depend only on time t and both t and n were considered with an accent on the former case. The case where the coefficients depend on t but not series length n was generalized to VARMA models by [44]. Ref. [12] contained very simple univariate examples where the theoretical assumptions for the asymptotic properties were checked. We will see that developing simple examples is much more complex in a multivariate setting.
Previously, ref. [45] had shown that, when specialized to VARMA models with constant coefficients, these assumptions coincide with the assumptions for the standard asymptotic properties of the parametric estimation for these models. The problem is that the assumptions in [43] are complex. For instance, that paper contains remarks to address requests from reviewers who could not believe that these assumptions would work, despite the provided proofs. Consequently, we consider here the case of relatively simple array processes and will check analytically a representative sample of the assumptions for bivariate processes. By representative sample, we mean that the other assumptions can be verified using the same arguments. To simplify, we will restrain ourselves to first-order processes, tdVA R ( n ) (1), when p = 1 and q = 0 , and tdVM A ( n ) (1), when p = 0 and q = 1 . Even for these two special cases, we drastically limit the form of time-dependency and the number m of parameters. We believe, however, that verifying the assumptions of [43] in these two cases is exemplary for more complex models. These models were used, for specific values of the parameters, in a simulation study presented by [43], so that the present paper completes the information. Not only does estimation give results that are predicted by the theory for sufficiently large series, but we will also see that the values selected for the parameters in these simulation experiments fulfill the requirements.
We do not consider high-dimensional models in this paper for two reasons: first, there is little hope to be able to offer simple examples for r > 2 ; and second, since the size of the matrices in the coefficients will be r × r , the number of parameters m increases with r 2 . With n = 100 , p = q = 2 , and r = 4 , the number of parameters is already 4 × 4 2 = 64 , plus 4 × ( 4 + 1 ) / 2 = 10 in the error covariance matrix, leaving only 100 74 = 26 degrees of freedom. Moreover, the dimension of the information matrix (defined below) grows with r 4 , so in that case, it is already 64 × 64 , possibly implying serious computational problems with matrix inversion. A solution would be to extend the sparse identification and estimation approach proposed for VARMA models by [46] and implemented by [47] in the R package bigtime 0.2.3 for R 3.6.0 and above. This is done by using sparsity-inducing convex regularizers. It works even for large-scale VARMA models, under sufficient regularity conditions expressed by the condition r 3 log ( r n ) 0 .
This article is organized as follows. In Section 2, we introduce the general marginally heteroscedastic tdVARM A ( n ) array model with the main notations, describe the crucial assumptions under which [43] have proved the asymptotic properties of the Gaussian maximum likelihood estimator, and present approximations of the true information matrix. Section 3 contains our results: in Section 3.1, we consider a tdVA R ( n ) (1) process and a tdVM A ( n ) (1) process in Section 3.2. In both cases, after reducing the number of parameters progressively, we prove analytically that the assumptions can be verified, provide the constraints on the true values, and simulation results with non-Gaussian errors to assess the quality of the estimates and their standard errors. Section 3.3 and Section 3.4 are short attempts for generalization but, of course, the model complexity does not permit a complete analytical treatment. The paper ends with a discussion of the results in Section 4. There are three appendices for details about Section 3.1 and Section 3.2.

2. Materials and Methods

2.1. The General tdVARMA Array Model

Let θ = ( θ 1 , , θ m ) T , m 1 , where T denotes transposition, be the parameter belonging to Θ , an open set of R m , and θ 0 Θ , the true value of θ . Let P θ be the probability measure on Θ . Let the r × r matrices A t 1 ( n ) , , A t p ( n ) , and B t 1 ( n ) , , B t q ( n ) , as well as g t ( n ) , be deterministic functions of time t and, possibly, of n. Let { ϵ t ; t N } be a sequence of zero-mean independent random variables with a covariance matrix Σ which is invertible, and with finite moments of order 4 + 2 δ with δ > 0 . Denoting ⊗ the Kronecker product, and vec ( . ) which transforms a matrix into a column vector, we let κ t = E vec ( ϵ t ϵ t T ) vec ( ϵ t ϵ t T ) T = E ( ϵ t ϵ t T ) ( ϵ t ϵ t T ) , which can depend on t, but it will not in all our examples. For t < 1 , we suppose that ϵ t and x t ( n ) are both equal to zero. According to [43], the tdVARM A ( n ) ( p , q ) array process can be defined by the equation
x t ( n ) = i = 1 p A t i ( n ) x t i ( n ) + g t ( n ) ϵ t + j = 1 q B t j ( n ) g t j ( n ) ϵ t j .
Note that g t ( n ) implies marginal heteroscedasticity, meaning that the errors do not have a constant covariance matrix but well a time-dependent covariance matrix. Unfortunately, the theory in [43] does not accommodate conditional heteroscedasticity, as seen in VARMA-GARCH models, such as VARMA-BEKK models (see [48,49,50]).
The model consists of replacing the coefficients A t i ( n ) , i = 1 , , p , B t j ( n ) , j = 1 , , q , and g t ( n ) by adding an argument ( θ ) so that the coefficients in (1) correspond to the value at θ = θ 0 . We suppose that the parameters involved in the A t i ( n ) ( θ ) and the B t j ( n ) ( θ ) , on the one hand, and those in g t ( n ) ( θ ) , on the other hand, are distinct and ordered that way. Let m A B m be the number of parameters of the first type so that the derivative of g t ( n ) ( θ ) with respect to θ i is identically zero for i = 1 , , m A B . Of course, we also need to replace g t j ( n ) ϵ t j with the residuals e t j ( θ ) , j = 0 , , q , noting that e t j ( θ 0 ) = g t j ( n ) ϵ t j . Let α t ( n ) ( θ ) = log det ( Σ t ( n ) ( θ ) ) ) + e t ( n ) T ( θ ) Σ t ( n ) 1 ( θ ) e t ( n ) ( θ ) , where Σ t ( n ) ( θ ) = E θ ( e t ( n ) ( θ ) e t ( n ) T ( θ ) ) = g t ( n ) ( θ ) Σ g t ( n ) T ( θ ) . Using a series of length n, we estimate θ by θ ^ ( n ) = arg min θ Θ 1 2 t = 1 n α t ( n ) ( θ ) , which means using the Gaussian quasi-likelihood method. Under the assumptions given in detail in [43] and summarized in Section 2.2, θ ^ ( n ) θ 0 in probability, and n ( θ ^ ( n ) θ 0 ) N ( 0 , V 1 W V 1 ) in law, both when n , where V and W are defined in (4) below. Note that using this so-called sandwich or robust asymptotic covariance matrix V 1 W V 1 should improve the estimated standard errors and reduce their negative bias when the errors are not normally distributed, especially for the parameters related to heteroscedasticity.
The assumptions are based on the behavior of the coefficients of a pure moving average decomposition of the process and the derivatives (up to order 3 and evaluated at θ = θ 0 ) of the residuals, taking into account the assumptions for t < 1 . For the process, the decomposition is x t ( n ) = k = 0 t 1 ψ t k ( n ) g t k ( n ) ϵ t k and the coefficients are denoted ψ t k ( n ) . We will also use the coefficients of the pure autoregressive decomposition of the process x t ( n ) = k = 1 t 1 π t k ( n ) ( θ ) x t k ( n ) + e t ( n ) ( θ ) and let π t k ( n ) = π t k ( n ) ( θ 0 ) . The first-order derivative of e t ( n ) ( θ ) with respect to θ i , i = 1 , , m , is expressed either as
k = 1 t 1 π t k ( n ) ( θ ) θ i x t k ( n ) or k = 1 t 1 ψ t i k ( n ) ( θ ) g t k ( n ) ϵ t k ,
and we let ψ t i k ( n ) = ψ t i k ( n ) ( θ 0 ) . These coefficients can be obtained by recurrence. Indeed, generalizing [44] (Equation (3.8)), it can be shown that
ψ t i k ( n ) = u = 1 k π t u ( n ) ( θ ) θ i θ = θ 0 ψ t u , k u ( n ) .
It is not obvious how to deal with these coefficients that are, however, essential in [43], as will be shown in Section 3.
We need finally to introduce the Hessian matrix V and the outer product of gradient W. They are obtained as limits of averages of time-dependent matrices
V = lim n 1 n t = 1 n V t ( n ) , W = lim n 1 n t = 1 n W t ( n ) ,
where the elements V i j ( n ) and W i j ( n ) for i , j = 1 , , m are obtained, respectively, by
E θ 0 e t ( n ) T ( θ ) θ i Σ t ( n ) 1 ( θ ) e t ( n ) ( θ ) θ j + 1 2 tr Σ t ( n ) 1 ( θ ) Σ t ( n ) ( θ ) θ i Σ t ( n ) 1 ( θ ) Σ t ( n ) ( θ ) θ j θ = θ 0 ,
where E θ 0 is the expectation under P θ 0 , and
1 4 E θ 0 α t ( n ) ( θ ) θ i α t ( n ) ( θ ) θ j .
The theoretical aspects of the computation of V and W are discussed in [51], with the impact of the distribution on κ t for the latter. In particular, as a consequence of our assumptions and notations, for i , j = 1 , , m A B , there is a simple common expression for the terms in (5) of both V and W:
V t , i j ( n ) = W t , i j ( n ) = tr ψ t i k ( n ) T Σ t ( n ) 1 ψ t j k ( n ) Σ t k ( n ) .
For i , j > m A B , the expressions differ: for V, it is based on the second term in (5), while for W, it is based on tr ( Σ t ( n ) ( Σ t ( n ) 1 / θ i ) ) , vec ( g t ( n ) T ( Σ t ( n ) 1 / θ i ) g t ( n ) ) and makes use of κ t (see [51] for details). It is also not obvious at all how to use (4)–(7) for a given tdVARM A ( n ) model, although these matrices V and W are crucial for obtaining the standard errors of the estimators. Also, obtaining a finite-sample approximation will be useful (see Section 2.3).

2.2. Typical Assumptions

It would be very lengthy to check all the assumptions of [43]. We will examine a representative sample of these assumptions, knowing that the others can be checked in the same way. More precisely,
(i) 
the matrices A t i ( n ) ( θ ) , i = 1 , , p , B t j ( n ) ( θ ) , j = 1 , , q , and g t ( n ) ( θ ) are three times continuously differentiable with respect to θ ;
(ii) 
existence of bounds on the Frobenius norm, denoted by . F , of g t ( n ) , κ t , Σ t ( n ) ( θ ) , Σ t ( n ) 1 ( θ ) , and their derivatives with respect to θ at θ = θ 0 ;
(iii) 
upper bounds like k = ν t 1 ψ t i k ( n ) F 2 < N 1 Φ ν 1 and k = ν t 1 ψ t i k ( n ) F 4 < N 2 Φ ν 1 , i = 1 , , m , ν = 1 , , t 1 , where N 1 and N 2 are positive constants and 0 < Φ < 1 , and similar conditions on derivatives up to the third order;
(iv) 
existence of a strictly positive definite matrix V defined by (4) and (5);
(v) 
existence of a positive definite matrix W defined by (4) and (6);
(vi) 
and, for i = 1 , , m , an assumption on triple sums of the kind
1 n 2 d = 1 n 1 t = 1 n d k = 1 t 1 g t k ( n ) F 2 ψ t i k ( n ) F ψ t + d , i , k + d ( n ) F = O 1 n .
Note that, contrary to the theory of local stationarity processes, we do not assume regularity conditions on the dependence with respect to time.

2.3. Obtaining Approximations of True V and W

To validate the simulations and the theory, we have implemented an estimation of V and W as a by-product of estimation (see the computational details in [52]). This was done, in particular, for the bivariate first-order models tdVA R ( n ) (1) and tdVM A ( n ) (1) of [43] (Section 4).
Moreover, for given values of the parameters θ , a program written in Matlab (but outside of the estimation program first described in [53]) computes for t = 1 , , n , successively g t ( n ) ( θ ) , Σ t ( n ) ( θ ) = g t ( n ) ( θ ) Σ g t ( n ) T ( θ ) , A t 1 ( n ) ( θ ) or B t 1 ( n ) ( θ ) , the ψ t i k ( n ) ( θ ) ’s for k = 1 , , t in (3). For the computation of the standard errors of the estimates, we need the information matrix, thus the Hessian V using (5) and the outer product of gradient W using (6). Approximations of V and W are then obtained using the averages in (4), respectively, without taking the limits. Divided by n, they are used to obtain the “theoretical standard errors” in the simulation results. Note that the number of operations is O ( n 2 ) but this is done only once for the true value of the parameters, specifically θ = θ 0 .

3. Results

To begin with, we consider here two first-order models, i.e., the cases of the tdVA R ( n ) (1) model where p = 1 and q = 0 , on the one hand, and the tdVM A ( n ) (1) model where p = 0 and q = 1 , on the other hand. To make it practical, and allow for a fully analytical treatment, we will quickly move to bivariate processes, i.e., the case where r = 2 in (1). Then, we will consider a tdVARM A ( n ) ( 1 , 1 ) model for which we will provide an expression for the coefficients ψ t i k in the moving average representation of the first-order derivatives of the residuals. Finally, for the general case of a tdVARM A ( n ) ( p , q ) model, we will only provide indications on sufficient conditions to demonstrate how it is possible to proceed.

3.1. Treatment of a tdVA R ( n ) (1) Model

In this section, we consider a tdVA R ( n ) (1) model, first in the general case before taking r = 2 . The model is defined by
x t ( n ) = A t ( n ) ( θ ) x t 1 ( n ) + e t ( n ) ( θ ) ,
with e t ( n ) ( θ 0 ) = g t ( n ) ϵ t , g t ( n ) = g t ( n ) ( θ 0 ) , and ϵ t having all moments of order 4 + 2 δ , δ > 0 , and such that κ t F is bounded. Let also A t ( n ) = A t ( n ) ( θ 0 ) . Let us define
A t ( n ) [ k 1 ] = l = 1 k 1 A t l ( n ) , k > 1 ,   and   A t ( n ) [ 0 ] = I r .
It can be checked that ψ t k ( n ) = A t + 1 ( n ) [ k ] , k = 0 , 1 , , and ψ t i k ( n ) = A t ( n ) ( θ ) / θ i | θ = θ 0 A t ( n ) [ k 1 ] , i = 1 , , m . Note that the coefficients of the pure autoregressive decomposition of the process are π t 1 ( n ) ( θ ) = A t ( n ) ( θ ) , and π t k ( n ) ( θ ) = 0 , k > 1 .
To be more specific, assume a bivariate process such that the elements of the matrices A t ( n ) ( θ ) are linear functions of time, and the diagonal elements of g t ( n ) ( θ ) are exponential functions of time. More precisely, we suppose that
A t ( n ) ( θ ) = A 11 A 12 A 21 A 22 + 1 n 1 ( t n + 1 2 ) A 11 A 12 A 21 A 22 , g t ( n ) ( θ ) = exp η 11 n 1 ( t n + 1 2 ) 0 0 exp η 22 n 1 ( t n + 1 2 ) and Σ = σ 11 0 0 σ 22 .
Remark 1. 
We have taken linear functions of time for illustrative purposes but it should be clear that the theory works in whole generality. The case of a linear function of time with the divisor n 1 appearing in (10) is compatible with Dahlhaus LSP theory. This is why we consider array processes instead of stochastic processes. We will come back to this in the discussion.
Now we start examining the typical assumptions of the theory stated in Section 2.2.

3.1.1. Assumptions (i) and (ii)

Assumption (i) is clearly satisfied. Denote L ( t , n ) = 1 n 1 ( t n + 1 2 ) for t = 1 , , n and 1 2 for t 0 . It is obvious that | L ( t , n ) | 1 2 for all t n . This is because L ( 1 , n ) = L ( n , n ) = 1 2 so that we have preferred the denominator n 1 instead of n used in the theory of locally stationary processes. Using the definition,
Σ t ( n ) ( θ ) = σ 11 e 2 η 11 L ( t , n ) 0 0 σ 22 e 2 η 22 L ( t , n ) , Σ t ( n ) 1 ( θ ) = 1 σ 11 e 2 η 11 L ( t , n ) 0 0 1 σ 22 e 2 η 22 L ( t , n )
are matrices whose Frobenius norms are bounded uniformly in t and n, from below by a strictly positive number and also from above, hence Assumption (ii) is satisfied.

3.1.2. Assumption (iii)

Only to simplify the analytical expressions, assume that η 11 = 0 in P θ , that the element ( 2 , 1 ) of A t ( n ) ( θ ) is identically zero, and the element ( 1 , 2 ) of A t ( n ) ( θ ) is a constant A 12 0 . This is to have an upper-triangular form (facilitating the analytical treatment) that does not degenerate into a diagonal matrix (as the latter would imply uncorrelated components since Σ is diagonal). To simplify further the details (although it is not necessary for the principles), in addition to η 11 = 0 , instead of the full vector of parameters θ = ( A 11 , A 12 , A 22 , A 11 , A 22 , η 22 ) T , we put A 22 = A 11 = 0 and A 12 is fixed to A 12 0 , so that the vector of parameters to estimate reduces to θ = ( A 11 , A 22 , η 22 ) T , hence m = 3 and there is one parameter of each kind. The presence of A 12 0 0 makes sure that the two components of x t ( n ) are not independent. There are assumptions about the true value θ 0 = ( A 11 0 , A 22 0 , η 22 0 ) T that will be stated later. Then, for k 2 , we can check by induction that A t , 21 ( n ) [ k 1 ] = 0 and
A t , 11 ( n ) [ k 1 ] = ( A 11 0 ) k 1 , A t , 22 ( n ) [ k 1 ] = ( A 22 0 ) k 1 l = 1 k 1 L ( t l , n ) , A t , 12 ( n ) [ k 1 ] = A 12 0 l = 1 k 1 ( A 11 0 ) k l 1 ( A 22 0 ) l 1 f = k l k 2 L ( t f , n ) δ l f ,
where δ l f = 0 , for l + f k 1 , and δ l f = 1 , for l + f > k 1 , generalizing relations in [44] (Appendix S1). To simplify the notations, we will henceforth omit (n) in the entries of A t ( n ) [ k 1 ] .
From the definition of ψ t i k ( n ) in (3), we deduce ψ t 3 k ( n ) = 0 and
ψ t 1 k ( n ) = 1 0 0 0 A t , 11 [ k 1 ] A t , 12 [ k 1 ] 0 A t , 22 [ k 1 ] = A t , 11 [ k 1 ] A t , 12 [ k 1 ] 0 0 , ψ t 2 k ( n ) = 0 0 0 L ( t , n ) A t , 11 [ k 1 ] A t , 12 [ k 1 ] 0 A t , 22 [ k 1 ] = 0 0 0 L ( t , n ) A t , 22 [ k 1 ] .
We define the constant Φ 1 / 2 = max { | A 11 0 | , 1 2 | A 22 0 | } such that 0 < Φ < 1 . Therefore, we assume | A 11 0 |   < 1 and | A 22 0 |   < 2 . From (13), since L ( t , n ) A t , 22 [ k 1 ] < Φ ( k 1 ) / 2 , we deduce k = ν t 1 ψ t 2 k ( n ) F 2 = k = ν t 1 L ( t , n ) A t , 22 [ k 1 ] 2 k = ν t 1 Φ k 1 < N 1 Φ ν 1 , where N 1 = 1 / ( 1 Φ ) . It is more delicate for ψ t 1 k ( n ) for which
k = ν t 1 ψ t 1 k ( n ) F 2 = k = ν t 1 A t , 11 [ k 1 ] 2 + A t , 12 [ k 1 ] 2 .
The sum of the first term is also bounded by N 1 Φ ν 1 because A t , 11 [ k 1 ] < Φ ( k 1 ) / 2 . Thanks to (12), an upper bound of | A t , 12 [ k 1 ] | equals
| A 12 0 | = 1 k 1 | A 11 0 | k l 1 | A 22 0 | l 1 f = k l k 2 | L ( t f , n ) | δ l f | A 12 0 | ( k 1 ) Φ k / 2 1 ,
hence k = ν t 1 ψ t 1 k ( n ) F 2 < N 1 Φ ν 1 , where N 1 = ( A 12 0 ) 2 Φ / ( 1 Φ ) 3 (see the details in Appendix A.1). We could have used [54] to simplify the investigation.

3.1.3. Assumption (iv)

First, to evaluate the elements ( i , j ) , i = 1 , 2 , of V in (4) with (5), we have to take the limit for n of the sum for t = 1 , , n of V t , i j ( n ) / n , which can be computed using (7). But in the tdVA R ( n ) (1) case, using (2) with π t 1 ( n ) ( θ ) = A t ( n ) ( θ ) , it is more simply expressed as
V t , i j ( n ) = tr A t T ( n ) ( θ ) θ i θ = θ 0 Σ t ( n ) 1 A t ( n ) ( θ ) θ j θ = θ 0 E ( x t 1 ( n ) x t 1 ( n ) T ) , i , j = 1 , , m A B .
The most difficult case is for i = j = 2 : the product of derivatives and Σ t ( n ) 1 has the element ( 2 , 2 ) equal to L 2 ( t , n ) exp ( 2 η 22 0 L ( t , n ) ) / σ 22 . Assume η 22 0 > 0 . Then it can be seen that ( 1 / n ) t = 1 n L 2 ( t , n ) exp ( 2 η 22 0 L ( t , n ) ) converges to a limit when n . We will now prove that E ( x t 1 ( n ) x t 1 ( n ) T ) tends to a limit when n . Denoting E t k = exp { 2 η 22 0 L ( t k , n ) } , it is a sum for k = 1 to t 1 of matrices
σ 11 ( A 11 0 ) 2 ( k 1 ) + σ 22 E t k A t , 12 [ k 1 ] 2 σ 22 E t k A t , 12 [ k 1 ] ( A 22 0 ) k 1 K t n [ k 1 ] σ 22 E t k A t , 12 [ k 1 ] ( A 22 0 ) k 1 K t n [ k 1 ] σ 22 E t k ( A 22 0 ) 2 ( k 1 ) K t n [ k 1 ] ,
where K t n [ k 1 ] = l = 1 k 1 L ( t l , n ) . Using the bounds already presented when discussing ψ t i k , i = 1 , 2 , and given that E t k = exp { 2 η 22 0 L ( t k , n ) } = exp { 2 η 22 0 L ( t , n ) } ( exp { 2 η 22 0 / ( n 1 ) } ) k , it is easy to see that the elements of (16) behave like terms of a geometric series (or terms of two sums of geometric series) for large k, so that, for large t, their sum is finite, strictly positive, and convergent when n . The existence of the limit V 22 then follows from Theorem 2.5 of [55]. The two other cases ( i = j = 1 , and i = 1 , j = 2 ) are straightforward.
The computation of V 33 involves only the second term of the definition in (5). The derivative of Σ t ( n ) with respect to η 22 yields a zero matrix, except the element ( 2 , 2 ) , which is 2 σ 22 L ( t , n ) exp { 2 η 22 0 L ( t , n ) } . The product of Σ t ( n ) 1 by that derivative gives a matrix of 0 except for the element ( 2 , 2 ) which is equal to 2 L ( t , n ) . We have, therefore, to take one-half of the limit for n of the sum for t = 1 , , n of 4 L 2 ( t , n ) / n . However, the sum of t n + 1 2 2 / n is the variance in a discrete uniform distribution over { 1 , 2 , , n } which equals ( n 2 1 ) / 12 . Hence, we obtain ( n + 1 ) / ( 6 ( n 1 ) ) whose limit for n equals 1 / 6 . We can check that the elements not discussed vanish. Note that the factor 1 6 appeared already in a similar univariate example shown by [12] (Example 3).

3.1.4. Assumption (v)

For a Gaussian process, W = V in (6). For a Laplace or a Student distribution (the latter with at least 5 degrees of freedom), in particular, the entries of κ t are not smaller than those for a normal distribution [51] (Section 4), hence W 33 V 33 .

3.1.5. Assumption (vi)

In order to verify Assumption (vi), for example for i = 2 , the simplest case, we have to show that ( 1 / n 2 ) multiplied by d = 1 n 1 t = 1 n d k = 1 t 1 g t k ( n ) F 2 ψ t 2 k ( n ) F ψ t + d , 2 , k + d ( n ) F is O ( 1 / n ) . First g t k ( n ) F 2 equals 1 + exp 2 η 22 0 L ( t k , n ) , which is bounded by 1 + exp η 22 0 . Then, we take an upper bound of ψ t 2 k ( n ) F by Φ k 1 and thus of ψ t + d , 2 , k + d ( n ) F by Φ k + d 1 . The sum for k = 1 , , t 1 of the product Φ k 1 Φ k + d 1 = Φ d + 2 k 2 is bounded by Φ d 2 times a constant 1 / ( 1 Φ 2 ) . By exchanging the two outside summations, we have to find an upper bound of t = 1 n 1 d = 1 n t Φ d 1 by Φ 1 times the sum for t = 1 , , n 1 of a constant 1 / ( 1 Φ ) . Dividing by n 2 , we have O ( 1 / n ) . The case where i = j = 1 is more delicate and will not be detailed but the principle is identical.
Example 1. 
In [43] (Section 4.1.2), simulations results were shown for artificial Gaussian time series generated by a specific case of model Equations (8)–(10), under all the restrictions mentioned above and with the following values for θ 0 = ( A 11 0 , A 22 0 , η 22 ) T = ( 0.8 , 0.75 , 0.7 ) T . Moreover, A 12 = 0.5 is a constant as above. First, note that these values satisfy the requirements. Indeed Φ 1 / 2 = max ( | A 11 0 | , 1 2 | A 22 0 | ) < 1 and η 22 > 0 . Let us consider 1000 time series of length n = 25 , 50 , 100 , 200 , and 400 obtained using multivariate Laplace, on the one hand, and Student with 5 degrees of freedom, on the other hand. The empirical estimation results are shown, respectively, in Table 1 and Table 2.
In both cases (multivariate Laplace and multivariate Student), the estimates in column (a)  (emp.est.) are close to the true value, and nearly always closer when n increases; the sample standard errors in column (b)  (emp.s.e.) decrease with n and are very close to the averages across simulations of the estimated standard errors (obtained using the sandwich formula and estimates V ^ and W ^ of V and W, respectively, see [52] for details) shown in column (c)  (est.s.e.) and the approximate theoretical standard errors in column (d)   (theor.s.e.), also based on a sandwich formula but now on the two finite averages of (4) evaluated at θ 0 , as discussed in Section 2.3; the percentages of simulations where the hypothesis H 0 ( θ i = θ i 0 ) is rejected at the 5% level in column  (e)  (% rej.) are, of course, close to 5%. The results look better for the Laplace than the Student distribution, especially for column (d)   and row η 22 . As a consequence of these simulations, we see that, for n large enough, the estimates become close to the true value, the standard errors that are the by-product of estimation correspond broadly to the empirical results (although less well for the parameter η 22 related to heteroscedasticity), coincide relatively well with the approximated values derived from the theory, and, finally, the level of the test that the parameter equals the true value is relatively close to 5%. Thanks to the sandwich correction, the standard errors are not underestimated. For instance, for η 22 , n = 400 , and the Laplace distribution, the average of the standard errors using V ^ 1 instead of V ^ 1 W ^ V ^ 1 would lead to 0.1239 instead of 0.1801 as shown in column (c)  and 0.1222 instead of 0.1932 in column (d). These numbers are far from the empirical standard deviation of 0.1948 . Finally, the results in [43] for a normal distribution in an otherwise similar setup are closer to the expected values than those shown here. For instance, the last row shows the results 0.6913 , 0.1229 , 0.1235 , 0.1222 , and 4.3 in columns (a) to (e), respectively, with a much better agreement for the empirical, estimated, and theoretical standard errors.

3.2. Treatment of a tdVM A ( n ) (1) Model

Moving average models are more difficult to study than autoregressive ones. We consider a tdVM A ( n ) (1) model defined by
x t ( n ) = e t ( n ) ( θ ) + B t ( n ) ( θ ) e t 1 ( n ) ( θ ) ,
with the same notations as in Section 2, and B t ( n ) ( θ ) = B t 1 ( n ) ( θ ) . For any θ , the pure autoregressive representation of x t ( n ) is
x t ( n ) = e t ( n ) ( θ ) + B t ( n ) ( θ ) x t 1 ( n ) B t ( n ) ( θ ) B t 1 ( n ) ( θ ) x t 2 ( n ) + B t ( n ) ( θ ) B t 1 ( n ) ( θ ) B t 2 ( n ) ( θ ) x t 3 ( n ) ,
hence
e t ( n ) ( θ ) θ i = B t ( n ) ( θ ) θ i x t 1 ( n ) + B t ( n ) ( θ ) B t 1 ( n ) ( θ ) θ i x t 2 ( n ) B t ( n ) ( θ ) B t 1 ( n ) ( θ ) B t 2 ( n ) ( θ ) θ i x t 3 ( n ) +
Replacing x t j ( n ) , j = 1 , 2 , by (17) for θ = θ 0 , given that e t ( n ) ( θ 0 ) = g t ( n ) ( θ 0 ) ϵ t , we obtain
e t ( n ) ( θ ) θ i = B t ( n ) ( θ ) θ i ( g t 1 ( n ) ϵ t 1 + B t 1 ( n ) g t 2 ( n ) ϵ t 2 ) + B t ( n ) ( θ ) B t 1 ( n ) ( θ ) θ i ( g t 2 ( n ) ϵ t 2 + B t 2 ( n ) g t 3 ( n ) ϵ t 3 ) B t ( n ) ( θ ) B t 1 ( n ) ( θ ) B t 2 ( n ) ( θ ) θ i ( g t 3 ( n ) ϵ t 3 + B t 3 ( n ) g t 4 ( n ) ϵ t 4 ) + ,
with g t ( n ) = g t ( n ) ( θ 0 ) and B t ( n ) = B t ( n ) ( θ 0 ) , like before. Let us denote
B t ( n ) [ k 1 ] ( θ ) = j = 0 k 1 B t j ( n ) ( θ ) ,
with elements B t , i j ( n ) [ k 1 ] ( θ ) , i , j = 1 , 2 . Hence, from (3),
ψ t i k ( n ) ( θ ) = ( 1 ) k B t ( n ) [ k 1 ] ( θ ) θ i B t ( n ) [ k 2 ] ( θ ) θ i B t k + 1 ( n ) .
Assume now B t , 21 ( n ) ( θ ) = 0 in P θ . To proceed more in detail, in order to simplify the analytic computations (and this is not necessary for numerical computations), we assume also that B t , 12 ( n ) ( θ ) = B 12 , a constant.
We can now examine the typical assumptions of the theory stated in Section 2.2.

3.2.1. Assumptions (i) and (ii)

To obtain nice analytic expressions, we suppose that the diagonal elements of the matrices B t ( n ) ( θ ) and g t ( n ) ( θ ) are exponential functions of time. The elements ( 1 , 2 ) and ( 2 , 1 ) of g t ( n ) ( θ ) are supposed to be different from zero so that the correlation between the two components of ϵ t varies with time. More precisely, using again L ( t , n ) = 1 n 1 ( t n + 1 2 ) , we suppose that
B t ( n ) ( θ ) = B 11 exp B 11 L ( t , n ) B 12 0 B 22 exp B 22 L ( t , n ) , g t ( n ) ( θ ) = exp η 11 L ( t , n ) α β exp η 22 L ( t , n ) and Σ = σ 11 σ 12 σ 21 σ 22 ,
where α and β are here constants, with η 11 0 and η 22 0 . Denoting g t , i i ( n ) an element of g t ( n ) ( θ ) and omitting the argument θ , Σ t ( n ) ( θ ) is equal to
σ 11 g t , 11 ( n ) 2 + 2 σ 12 α g t , 11 ( n ) + σ 22 α 2 σ 11 β g t , 11 ( n ) + σ 12 ( α β + g t , 11 ( n ) g t , 22 ( n ) ) + σ 22 α g t , 22 ( n ) σ 11 β g t , 11 ( n ) + σ 12 ( α β + g t , 11 ( n ) g t , 22 ( n ) ) σ 11 β 2 + 2 σ 12 β g t , 22 ( n ) + σ 22 g t , 22 ( n ) 2 + σ 22 α g t , 22 ( n ) .
It is easy to see that Assumptions (i) and (ii) are verified.

3.2.2. Assumption (iii)

To simplify the example, since a correlation between the two components is already guaranteed by the presence of a non-diagonal covariance matrix Σ , we will assume that B 12 = 0 so that B t ( n ) [ k 1 ] ( θ ) defined in (18) is diagonal.
To simplify the discussion, we will further suppose, similarly to what we did in Section 3.1, in addition of course to the already introduced parameters η 11 and η 22 , we have only one parameter of each type, e.g., θ 1 = B 11 and θ 2 = B 22 , assuming that B 22 0 and B 11 are fixed constants, respectively denoted B 22 0 and B 11 0 . To summarize, θ = ( B 11 B 22 η 11 η 22 ) T . Contrary to Section 3.1, we recourse to the results of [54] here, although the more lengthy direct analysis is given in Appendix A.2. Indeed, a sufficient condition for (iii) is stated in [54]. It can be shown that the eigenvalues of the MA polynomial are the solutions of the equation ( 1 B 11 exp ( B 11 0 L ( t , n ) ) ) ( 1 B 22 0 exp ( B 22 L ( t , n ) ) ) = 0 . They should be smaller than 1 in absolute value at θ 0 . Therefore, we assume that the true value θ 0 = ( B 11 0 B 22 0 η 11 0 η 22 0 ) T of θ satisfies | B j j 0 | exp ( B j j 0 / 2 ) < 1 , j = 1 , 2 . We denote
Φ 1 / 2 = max { | B 11 0 | exp ( B 11 0 / 2 ) , | B 22 0 | exp ( B 22 0 / 2 ) } < 1 .
Assumption (iii) is, therefore, satisfied. In practice, we don’t assume zero initial values for the process, but that it is invertible before time 1. This leads to additional conditions that | B 11 0 | exp ( B 11 0 L ( 0 , n ) ) < 1 and | B 22 0 | exp ( B 22 0 L ( 0 , n ) ) < 1 . If we were interested in forecasting up to time n + h , we should replace above L ( t , n ) with L ( t + h , n ) to have stronger conditions in (22).

3.2.3. Assumption (iv)

Now, we consider the existence of V in (4) with (5) for the 4-parameter model described in the previous paragraphs. Let us start with the elements V i j , i , j = 3 , 4 , corresponding to the parameters η 11 and η 22 . The formulas in [51] (Theorem 3) are applicable for computing the terms V t , i j ( n ) , i , j = 3 , 4 . Given the expression of g t , ( n ) in (20), the terms in these formulas are all of the forms
v t ( n ) = L ( t , n ) 2 g t , ξ ( n ) ( α β g t , η ( n ) ) 2 ,
where ξ 0 is equal to ρ η , ρ = 0 , 1 , 2 , and = 1 , 2 , or η 11 η 22 , and η = η 11 + η 22 > 0 . See Appendix A.3 for indications on how to prove the existence of the limit of averages of these terms, and even their computation using integration.
Using (7), the elements of V t , i j ( n ) , i , j = 1 , 2 are sums for k = 1 to t 1 of terms of the kind τ t , i j ( n ) ( ψ t i k ( n ) ) i i ( ψ t j k ( n ) ) j j σ t k ( n ) , where τ t , i j ( n ) are elements of Σ t ( n ) 1 and σ t k ( n ) are elements of Σ t k ( n ) . Given (21), the special form of g t ( n ) ( θ ) , σ t ( n ) is a quadratic polynomial of the diagonal elements g t , ( n ) , = 1 , 2 . More precisely, σ t ( n ) can be written as a finite sum of the form c δ 1 δ 2 ( g t , 11 ( n ) ) δ 1 ( g t , 22 ( n ) ) δ 2 , with δ 1 and δ 2 integers such that 0 δ 1 2 , 0 δ 2 2 , and δ 1 + δ 2 2 , and c δ 1 δ 2 is a constant. However, g t k , ( n ) is equal to g t , ( n ) exp ( η 0 k / ( n 1 ) ) . Consequently, σ t k ( n ) is composed of terms c δ 1 δ 2 ( g t k , 11 ( n ) ) δ 1 ( g t k , 22 ( n ) ) δ 2 equal to c δ 1 δ 2 ( g t , 11 ( n ) ) δ 1 ( g t , 22 ( n ) ) δ 2 exp ( ( η 11 0 δ 1 + η 22 0 δ 2 ) k / ( n 1 ) ) . Hence V t , i j ( n ) , i , j = 1 , 2 , is a sum for k = 1 to t 1 of terms of the kind
c δ 1 δ 2 τ t , i j ( n ) ( g t , 11 ( n ) ) δ 1 ( g t , 22 ( n ) ) δ 2 u t , i j δ 1 δ 2 ( n ) ,
where we denote
u t , i j δ 1 δ 2 ( n ) = k = 1 t 1 ( ψ t i k ( n ) ) i i ( ψ t j k ( n ) ) j j exp ( ( η 11 0 δ 1 + η 22 0 δ 2 ) k / ( n 1 ) ) ,
where ( ψ t i k ( n ) ) i i and ( ψ t j k ( n ) ) j j are obtained using (A2) or/and (A3). The terms in (24) are not all positive because of L ( t k + 1 , n ) . However, L ( t k + 1 , n ) < 1 / 2 for all t = 1 , , n and all n, and, therefore, = 1 k 1 L ( t , n ) k / 2 so that the absolute value of the product ( ψ t i k ( n ) ) i i ( ψ t j k ( n ) ) j j can be bounded by Φ k , given (22). We want to show that the triangular array { u t , i j δ 1 δ 2 ( n ) , t = 1 , , n } converges to a constant (dependent on i, j, δ 1 , and δ 2 ) when t , and hence n . First, the absolute value of the kth term of (24) is bounded by Φ k exp ( ( η 11 0 δ 1 + η 22 0 δ 2 ) k / ( n 1 ) ) < Φ k , since η 11 0 δ 1 + η 22 0 δ 2 > 0 , and the limit of the sum of these terms is a convergent geometric series, ensuring convergence of the sequence (24) when t . Now, for i , j = 1 , 2 , V t , i j ( n ) is a sum of terms that are proportional to u t , i j δ 1 δ 2 ( n ) v t , i j δ 1 δ 2 ( n ) , where
v t , i j δ 1 δ 2 ( n ) = τ t , i j ( n ) ( g t , 11 ( n ) ) δ 1 ( g t , 22 ( n ) ) δ 2 .
We have defined τ t , i j ( n ) as elements of Σ t ( n ) 1 , so as elements of the matrix of cofactors of the matrix Σ t ( n ) divided by its determinant D t ( n ) , where D t ( n ) is ( σ 11 σ 22 σ 12 2 ) ( α β g t , 11 ( n ) g t , 22 ( n ) ) 2 . Hence, v t , i j δ 1 δ 2 ( n ) is a finite sum with terms that are proportional to
v t , i j δ 3 δ 4 ( n ) = ( g t , 11 ( n ) ) δ 3 ( g t , 22 ( n ) ) δ 4 ( α β g t , 11 ( n ) g t , 22 ( n ) ) 2 ,
where δ 3 δ 1 and δ 4 δ 2 are integers. We obtain V t , i j ( n ) for i , j = 3 , 4 to show that
lim n 1 n t = 1 n v t , i j δ 3 δ 4 ( n )
exists, is finite, and can be evaluated using an integral. Therefore, we can invoke Theorem 2.5 of [55] to show that
lim n 1 n t = 1 n u t , i j δ 1 δ 2 ( n ) v t , i j δ 3 δ 4 ( n )
exists and is finite. This proves the existence of V i j for i , j = 1 , 2 .

3.2.4. Assumption (v)

The existence of W in (4) with (6) can be studied similarly using κ t , depending on the distribution of ϵ t , see [52] (Section 3.5).

3.2.5. Assumption (vi)

It is also easy to check Assumption (vi), for example for i = 1 . We have to show that ( 1 / n 2 ) multiplied by d = 1 n 1 t = 1 n d k = 1 t 1 g t k ( n ) F 2 ψ t 1 k ( n ) F ψ t + d , 1 , k + d ( n ) F is O ( 1 / n ) . First, it can be shown that g t k ( n ) F 2 is equal to 2 + exp ( 2 η 11 0 L ( t k , n ) ) + exp ( 2 η 22 0 L ( t k , n ) ) , and this is less than or equal to 2 + exp ( η 11 0 ) + exp ( η 22 0 ) . Then, we are exactly in the same situation as in Section 3.1.
Example 2. 
In [43] (Section 4.1.2), simulations results were shown for artificial Gaussian or Student (with 5 degrees of freedom) time series generated by model Equations (17)–(20) with the following values for θ 0 = ( B 11 0 , B 22 0 , η 22 0 ) T = ( 0.8 , 0.4 , 0.7 ) T but with a linear expression for B t ( n ) that we keep here instead of the exponential that was easier for the analytic developments: B t , 11 ( n ) = B 11 + B 11 L ( t , n ) and B t , 22 ( n ) = B 22 + B 22 L ( t , n ) . Moreover, B 11 = 0 , B 22 = 0.25 , and η 11 = 0 are taken as constants, as well as α = β = 0.6 . First, note that these values satisfy the requirements after a linear approximation of the exponentials. Indeed, Φ 1 / 2 = max ( | B 11 0 | exp ( B 11 / 2 ) , | B 22 0 | exp ( B 22 / 2 ) ) < 1 and η 22 > 0 .
Let us consider 1000 time series of length n = 25 , 50 , 100 , 200 , and 400 obtained using multivariate Laplace distribution. The empirical estimation results are shown in Table 3. In all cases, the estimates in column (a)  (emp.est.) are often close to the true value, and closer when n increases; the sample standard errors in column (b)  (est.s.e.) decrease with n and are very close to the averages across simulations of the estimated standard errors (obtained using the sandwich formula and estimates V ^ and W ^ of V and W, respectively, see [52] for details) shown in column (c)  (emp.s.e.) and the approximate standard errors in column (d)  (theor.s.e.), also based on a sandwich formula but now on the two finite averages of (4) evaluated at θ 0 ; the percentages of simulations where the hypothesis H 0 ( θ i = θ i 0 ) is rejected at the 5% level in column (e)   (% rej.) are close to 0.05 at least for large n. Like in Example 1, for n large enough, the estimates become close to the true value, the estimated standard errors are close to the empirical results (but also for the parameter η 22 related to heteroscedasticity) and the approximated values derived from the theory, and, finally, the percentage of rejection in the tests are slightly closer to 5%. Thanks to the sandwich correction, the standard errors are not underestimated. For instance, for η 22 and n = 400 , the average of the standard errors using V ^ 1 instead of V ^ 1 W ^ V ^ 1 would lead to 0.0746 instead of 0.1078 shown in column (c)   and 0.0731 instead of 0.1154 in column (d)   , far from the empirical standard error of 0.1190 shown in column (b). Note that the results are slightly better for a normal distribution of the errors. In Table 4, we present the normal results obtained for n = 400 as we had only shown them for n = 100 in [43]. The agreement is better for η 22 with 0.0747 in row (b)  (emp.s.e.), 0.0744 in row (c1)   (est.s.e.) or 0.0729 in row (c2)   (est.s.e.) or 0.0731 in row (d)   (theor.s.e.). In the normal case, we have computed an estimate of V 33 using the integration approach in [55] (Procedure 6). Using this, refer to the details in Appendix A.3, V 33 = 0.465316 and the corresponding standard error for η 22 when n = 400 is 0.0732986 , in full agreement with the empirical, estimated, and theoretical approximation standard errors in the last column of Table 4.

3.3. Treatment of a tdVARM A ( n ) (1,1) Model

Starting from here, we consider only homoscedastic models. Let { x t ( n ) ; t = 1 , , n , n N } be an r-vector time series satisfying
x t ( n ) = A t ( n ) ( θ ) x t 1 ( n ) + e t ( n ) ( θ ) + B t ( n ) ( θ ) e t 1 ( n ) ( θ ) ,
and g t ( n ) ( θ ) = I r . Hence, (ii) and (v) have no object since Σ t ( n ) ( θ ) = Σ does not depend on θ and W = V . We will not cover the tdVARM A ( n ) ( 1 , 1 ) model in detail, simply show how the coefficients ψ t i k ( n ) can be computed. It is a special case of the model defined in (1) with p = 1 and q = 1 . Now, following [44], in this special case, the coefficients of the pure moving average representation are given by:
ψ t k ( n ) ( θ ) = l = 0 k 2 A t l ( n ) ( θ ) B t k + 1 ( n ) ( θ ) + A t k + 1 ( n ) ( θ ) , for k = 1 , 2 , , t 1 ,
where a product for l = 0 to 1 is set to I r . The coefficients of the pure autoregressive form are
π t k ( n ) ( θ ) = ( 1 ) k 1 l = 0 k 2 B t l ( n ) ( θ ) A t k + 1 ( n ) ( θ ) + B t k + 1 ( n ) ( θ ) ,
so, for i = 1 , , m , their derivatives are given by
π t 1 ( n ) ( θ ) θ i = A t ( n ) ( θ ) θ i + B t ( n ) ( θ ) θ i ,
π t 2 ( n ) ( θ ) θ i = B t ( n ) ( θ ) θ i { A t 1 ( n ) ( θ ) + B t 1 ( n ) ( θ ) } B t ( n ) ( θ ) A t 1 ( n ) ( θ ) θ i + B t 1 ( n ) ( θ ) θ i ,
π t 3 ( n ) ( θ ) θ i = B t ( n ) ( θ ) θ i B t 1 ( n ) ( θ ) A t 2 ( n ) ( θ ) + B t 2 ( n ) ( θ )   + B t ( n ) ( θ ) B t 1 ( n ) ( θ ) θ i A t 2 ( n ) ( θ ) + B t 2 ( n ) ( θ ) + B t ( n ) ( θ ) B t 1 ( n ) ( θ ) A t 2 ( n ) ( θ ) θ i + B t 2 ( n ) ( θ ) θ i ,
Consequently
π t k ( n ) ( θ ) θ i = ( 1 ) k 1 l = 1 k h = 1 k χ t + 1 h , k , l , h , i ( n ) ( θ ) ,
where
χ t , k , l , h , i ( n ) ( θ ) = χ t , k , l , h ( n ) ( θ ) θ i if h = l , χ t , k , l , h ( n ) ( θ ) if h l ,
and
χ t , k , l , h ( n ) ( θ ) = B t ( n ) ( θ ) i f h < k , A t ( n ) ( θ ) + B t ( n ) ( θ ) i f h = k .
Then
ψ t i k ( n ) ( θ ) = u = 1 k l = 1 u h = 1 u χ t + 1 h , k , l , h , i ( n ) ( θ ) h = u + 1 k χ ˜ t + 1 h , k , h ( n ) ( θ 0 ) ,
χ ˜ t + 1 h , k , h ( n ) ( θ ) = A t ( n ) ( θ ) i f h < k , A t ( n ) ( θ ) + B t ( n ) ( θ ) i f h = k .
These results correct the findings presented in the univariate case by [12].
From this expression, it is possible to check (iii) and (vi) relatively easily. Indeed, again let A t ( n ) = A t ( n ) ( θ 0 ) and B t ( n ) = B t ( n ) ( θ 0 ) . If we assume that A t ( n ) F   < Φ and B t ( n ) F   < Φ , for all t, where 0 < Φ < 1 , since only a few factors involve sums like A t ( n ) + B t ( n ) and the others are bounded by Φ , the Frobenius norm of the ψ t k ( n ) and the ψ t i k ( n ) are bounded by Φ k . Of course, checking (iii), the existence of V, depends heavily on the parametrization. For example, if it exists, V will not be invertible if A t ( n ) ( θ 0 ) = B t ( n ) ( θ 0 ) for all t (or even for most t).

3.4. Treatment of a More General tdVARM A ( n ) ( p , q ) Model

Ref. [45] indicates how to handle more generally homoscedastic tdVARM A ( n ) ( p , q ) models, with g t ( n ) ( θ ) = I r . Indeed, it is shown in [54] that (iii) and (vi) are valid if the determinants of the tdVAR and tdVMA polynomials, respectively, I r i = 1 p A t i ( n ) z i and I r + j = 1 q B t j ( n ) z j , do not vanish when | z | 1 . Of course, it is only a sufficient condition. That argument was used in Section 3.2.2 to simplify the treatment, whereas Appendix A.2 does not use that argument for the same model and is therefore lengthier. Again, (ii) and (v) have no object, while (i) and (iii) depend on the specific parametrization.

4. Discussion

The results presented in Section 3.1 and Section 3.2 confirm that the theory exposed in [43] is applicable and that its assumptions can be checked analytically, at least in the two simple bivariate tdVA R ( n ) (1) and tdVM A ( n ) (1) models. The results have already been exploited in the simulation experiments in [43] (Sections 4.1.2 and 4.2.2) as the values of the parameters used there meet the conditions stated here. The treatment of other models by analytical methods is certainly more challenging, although we discussed it in some detail for a tdVARM A ( n ) (1, 1) model. For more complex models, the approach of [54] can also be considered, as it allows us to put the model in tdVAR form, albeit with a higher dimension.
As mentioned in Remark 1, we have taken linear functions of time for illustrative purposes, but it should be clear that the theory works in whole generality. First, a linear function of time with the divisor n 1 appearing in (10) is compatible with Dahlhaus’ LSP theory (see [16]), since the (Frobenius norm of the) coefficient can easily be bounded from above by 1. Anyway, we can consider that linearity is a first attempt if constancy cannot be retained. Second, in [19], the authors have shown a univariate tdA R ( n ) process of order 1, where the coefficient can be greater than 1 during an interval of time that shrinks to 0 as n increases. We can have the same change here, meaning that the process does not need to be locally stationary. What is essential is that the coefficients of the pure tdMA representation are bounded by a decreasing exponential function of the lag. Third, the tdA R ( n ) coefficient does not need to be differentiable with respect to time. The theory in [43] is valid when there are any number of breaks, provided they do not add too many parameters, for instance, if there is a periodicity of 2, with a coefficient equal to θ 1 for t odd, and θ 2 for t even. Indeed, it is a generalization of [12] where such an example is shown. We can even have a periodic behavior, even with an incommensurable period for the different matrix entries, as shown by [44].
The lag of one can be replaced with another integer without any change. For instance, for quarterly data, we can replace it with 4, or 12 for monthly data. The results in Section 3.1 and Section 3.2 are easily adapted. Because the inference is infill, meaning that more and more observations are assumed to be made between the first and last ones, the LSP theory is no longer valid in the case of periodicity. On the contrary, the inference in [43,44] is of type outfill or increasing domain, that is to say, that more years are supposed to be observed, therefore preserving the period of the periodic behavior. Ref. [19] has also shown a tdA R ( n ) (1) process with a coefficient that varies linearly with time but where heteroscedasticity (so g t ( n ) ) is a periodic function of time. Finally, ref. [19] indicates that the so-called multiplicative seasonal ARMA models of [41] cannot be generalized in the LSP framework for the same reason. On the contrary, there is no problem working with those models in our context. That remark made here for the tdVA R ( n ) (1) model extends to the tdVMA(1) model and all the tdVARM A ( n ) models, as well.
It is possible to obtain the coefficients of the pure MA representations by simple recursive relations (see [54] for algorithms). Ref. [39] has even proposed an explicit solution, although it is limited to univariate processes. With these coefficients, everything can be computed, at least for finite n, and the assumptions can be checked. Of course, when we have a finite multivariate time series, it is not immediately apparent to suggest a tdVARMA model. What we propose in practice (see the examples treated in [43,53]) is first fitting a VARMA model, and then adding slopes for the linearly varying coefficients and, possibly, heteroscedasticity, before removing non-significant parameters one by one. At this time, there is no large-scale study on real data using time-dependent multivariate models. Still, the results for univariate models in [21] (with marginal heteroscedasticity only) and [56] seem promising.
We have specified in the introduction that our approach is not adequate for high-dimensional tdVARM A ( n ) models, i.e., when r is greater than a very small integer because the number of parameters grows too quickly. Here, we have the entries in the VAR and VMA polynomials but also the coefficients of polynomials if we extend the linear dependence in (10) or (20) to a polynomial dependence, for instance. In principle, the concept in [46] should be valid in our framework of time-dependent VARMA models, including the case of polynomial dependence, multiple breaks, or threshold models with multiple regimes, although bigtime in [47] should be strongly changed for that purpose. This is left for future research.
It should be noted that, although it is interesting to see that the assumptions of [43] can be verified, they are not particularly helpful in the face of a real multivariate time series. We note that, unfortunately, verifying the same assumptions using the data (one realization of length n) is impossible. The same criticism, however, holds for the entire literature on models with time-dependent coefficients.

Funding

This research received no external funding.

Data Availability Statement

This research did not require data.

Acknowledgments

I acknowledge moral support from my research department ECARES and my university. I thank my longtime coauthor Rajae Azrak and younger previous co-author Abdelkamel Alj for their encouragement. I thank deeply the two reviewers and Danna Zhang for their numerous suggestions that led to an improvement of the paper, and particularly the reviewer who pointed out the computational aspects for handling high-dimensional problems that can be useful for future research. I also thank the reviewers of [43].

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ARAutoregressive
ARIMAAutoregressive integrated moving average
ARMAAutoregressive moving average
MAMoving average
tdARTime-dependent autoregressive
tdMATime-dependent moving average
tdVARTime-dependent vector autoregressive
tdVARMATime-dependent vector autoregressive moving average
tdVMATime-dependent vector moving average
VARMAVector autoregressive moving average

Appendix A. Details for the First-Order Models

Appendix A.1. Details for the tdVAR(n)(1) Model

We refer to the tdVA R ( n ) (1) model of Section 3.1. In checking Assumption (iii) for that model, the derivation of the upper bound for the sum of ψ t 1 k ( n ) F for k = ν to t 1 , after (15), by N 1 Φ ν 1 is not so easy. The first term in (14) can be bounded like for ψ t 2 k ( n ) F . For the second term, we first replace the upper summation bound by . Let S ( ν ) = k = ν ( k 1 ) 2 Φ k 2 . This is called a polynomial geometric series with a polynomial of degree m = 2 here. A standard trick for evaluating them (e.g., ref. [57], and references therein) is to multiply the series by ( 1 Φ ) m + 1 , hence
( 1 Φ ) 3 S ( ν ) = ( 1 3 Φ + 3 Φ 2 Φ 3 ) S ( ν ) = ( ν 1 ) 2 Φ ν 1 + ν 2 Φ ν + ( ν + 1 ) 2 Φ ν + 1 3 { ( ν 1 ) 2 Φ ν + ν 2 Φ ν + 1 } + 3 ( ν 1 ) 2 Φ ν + 1
because the other terms cancel, since ( k + 2 ) 2 3 ( k + 1 ) 2 + 3 k 2 ( k 1 ) 2 = 0 . Therefore, since | Φ | < 1 , an upper bound of | ( 1 Φ ) 3 S ( ν ) | is Φ ν 1 . The arguments are formalized and put a step further in [45].

Appendix A.2. Details for the tdVMA(n)(1) Model

We consider the tdVM A ( n ) (1) model of Section 3.2. Let us start with a direct derivation of bounds for ψ t i k ( n ) . Consider the B t ( n ) [ k 1 ] ( θ ) defined in (18). Omitting the superscripts (n) in the entries of B t ( n ) [ k 1 ] ( θ ) for simplicity, it can be shown by induction that
B t [ k 1 ] ( θ ) = B t , 11 [ k 1 ] ( θ ) B t , 12 [ k 1 ] ( θ ) B t , 22 [ k 1 ] ( θ )
and
B t , 11 [ k 1 ] ( θ ) = j = 0 k 1 ( B t j ( n ) ( θ ) ) 11 , B t , 22 [ k 1 ] ( θ ) = j = 0 k 1 ( B t j ( n ) ( θ ) ) 22 ,
B t , 12 [ k 1 ] ( θ ) = B 12 = 1 k 1 f = 1 k 1 ( B t f + 1 ( n ) ( θ ) ) 11 f = k k 2 ( B t f ( n ) ( θ ) ) 22 .
Hence, for j = 1 , 2 , letting δ l f = 1 if l = f and 0 elsewhere,
B t , j j [ k 1 ] ( θ ) θ i = = 0 k 1 f = 0 k 1 B t , j j ( n ) ( θ ) 1 δ l f B t , j j ( n ) ( θ ) θ i ,
and a more complex expression, with a double sum and a double product, for the derivative of B t , 12 [ k 1 ] ( θ ) .
The exponential evolution of the coefficients in B t ( n ) will simplify the subsequent computations. Indeed, for j = 1 , 2 ,
B t , j j [ k 1 ] ( θ ) = = 0 k 1 B t ( n ) ( θ ) j j = B j j k exp B j j = 0 k 1 L ( t , n ) .
Hence
B t , j j [ k 1 ] ( θ ) B j j = k B j j k 1 exp B j j = 0 k 1 L ( t , n ) ,
and
B t , j j [ k 1 ] ( θ ) B j j = B j j k = 0 k 1 L ( t , n ) exp B j j = 0 k 1 L ( t , n ) .
If $\theta_i = B_{jj}$, (19) and (A1) imply
$$\begin{aligned}
\left( \psi_{tik}^{(n)}(\theta) \right)_{jj} &= (-1)^k \left[ k B_{jj}^{k-1} \exp\left( B'_{jj} \sum_{\ell=0}^{k-1} L(t-\ell, n) \right) - (k-1) B_{jj}^{k-2} \exp\left( B'_{jj} \sum_{\ell=0}^{k-2} L(t-\ell, n) \right) B_{jj}^{0} \exp\left( B_{jj}^{\prime 0} L(t-k+1, n) \right) \right] \\
&= (-1)^k B_{jj}^{k-2} \exp\left( B'_{jj} \sum_{\ell=0}^{k-2} L(t-\ell, n) \right) \left[ k B_{jj} \exp\left( B'_{jj} L(t-k+1, n) \right) - (k-1) B_{jj}^{0} \exp\left( B_{jj}^{\prime 0} L(t-k+1, n) \right) \right],
\end{aligned}$$
and, if $\theta_i = B'_{jj}$, (19) and (A1) imply
$$\begin{aligned}
\left( \psi_{tik}^{(n)}(\theta) \right)_{jj} &= (-1)^k \left[ B_{jj}^{k} \sum_{\ell=0}^{k-1} L(t-\ell, n) \exp\left( B'_{jj} \sum_{\ell=0}^{k-1} L(t-\ell, n) \right) - B_{jj}^{k-1} \sum_{\ell=0}^{k-2} L(t-\ell, n) \exp\left( B'_{jj} \sum_{\ell=0}^{k-2} L(t-\ell, n) \right) B_{jj}^{0} \exp\left( B_{jj}^{\prime 0} L(t-k+1, n) \right) \right] \\
&= (-1)^k B_{jj}^{k-1} \exp\left( B'_{jj} \sum_{\ell=0}^{k-2} L(t-\ell, n) \right) \times \left[ B_{jj} \sum_{\ell=0}^{k-1} L(t-\ell, n) \exp\left( B'_{jj} L(t-k+1, n) \right) - B_{jj}^{0} \sum_{\ell=0}^{k-2} L(t-\ell, n) \exp\left( B_{jj}^{\prime 0} L(t-k+1, n) \right) \right].
\end{aligned}$$
In order to obtain the $\psi_{tik}^{(n)}$ involved in the assumptions, we still need to replace $\theta$ by $\theta^0$. If $\theta_i = B_{jj}$,
$$\left( \psi_{tik}^{(n)} \right)_{jj} = (-1)^k \left( B_{jj}^{0} \right)^{k-1} \exp\left( B_{jj}^{\prime 0} \sum_{\ell=0}^{k-1} L(t-\ell, n) \right), \quad \text{(A2)}$$
and, if $\theta_i = B'_{jj}$,
$$\left( \psi_{tik}^{(n)} \right)_{jj} = (-1)^k L(t-k+1, n) \left( B_{jj}^{0} \right)^{k} \exp\left( B_{jj}^{\prime 0} \sum_{\ell=0}^{k-1} L(t-\ell, n) \right). \quad \text{(A3)}$$
Given that the functions of time are assumed to have distinct parameters, the matrices $\psi_{tik}^{(n)}(\theta, \theta^0)$ are zero except for the diagonal element corresponding to $\theta_i$. Consequently, $\|\psi_{tik}^{(n)}(\theta, \theta^0)\|_F$ is the absolute value of that element. Hence, $(\psi_{t1k}^{(n)})_{11}$ is given by (A2) for $i = j = 1$, and $(\psi_{t2k}^{(n)})_{22}$ is given by (A3) for $i = j = 2$. Therefore, given that $|L(t, n)| \le 1/2$, $\|\psi_{tik}^{(n)}\|_F^2 < \Phi^k$, $i = 1, 2$, and that fulfills the first part of Assumption (iii). The reasoning is similar for the second part.
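The geometric decay in $k$ of (A2) and (A3) can be visualized with a short sketch; the choices $L(t, n) = t/n - 1/2$, $B_{jj}^0 = 0.5$, and $B_{jj}^{\prime 0} = 0.7$ are assumptions for the illustration, chosen so that $|B_{jj}^0| \exp(|B_{jj}^{\prime 0}|/2) < 1$.

```python
import numpy as np

# Illustration of the geometric decay of (A2)-(A3) in k. The choices
# L(t, n) = t/n - 1/2 and the true values B0 = 0.5, Bp0 = 0.7 are
# assumptions; here |B0| * exp(|Bp0| / 2) ~ 0.71 < 1.
n, t = 200, 150
B0, Bp0 = 0.5, 0.7
L = lambda s: s / n - 0.5

for k in range(1, 11):
    sumL = sum(L(t - l) for l in range(k))
    psi_A2 = (-1) ** k * B0 ** (k - 1) * np.exp(Bp0 * sumL)            # (A2)
    psi_A3 = (-1) ** k * L(t - k + 1) * B0 ** k * np.exp(Bp0 * sumL)   # (A3)
    print(k, abs(psi_A2), abs(psi_A3))  # both dominated by 0.71**k up to a constant
```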

Appendix A.3. Details for Computing V for the tdVMA(n)(1) Model

We have seen that $V_{t,ij}^{(n)}$, $i, j = m_{AB} + 1, \dots, m$, is a linear combination of terms of the form $v_t^{(n)}$ in (23). Hence, we can try using Procedure 2.6 of [55] in order to replace the limit for $n \to \infty$ of $(1/n) \sum_{t=1}^{n} v_t^{(n)}$ with an integral that can be computed. The integral has the form
$$\int_{-0.5}^{0.5} x^2 \, \frac{e^{\xi x}}{\left( \alpha - \beta e^{\eta x} \right)^2} \, dx,$$
for which a primitive exists, involving in general the hypergeometric function; see [58] (Chapter 15) or [59]. In particular, for the case of Example 2, there is a simpler expression
$$\frac{345}{5831} \left[ \frac{7x}{175 \, e^{7x/10}} - \frac{x}{9 - 25 e^{7x/10}} + (20 + 7x) \log\left( 1 - \frac{25}{9} e^{7x/10} \right) + 20 (10 + 7x) \operatorname{Li}_2\left( \frac{25}{9} e^{7x/10} \right) - 200 \operatorname{Li}_3\left( \frac{25}{9} e^{7x/10} \right) \right] + C,$$
where $\operatorname{Li}_s(z) = \sum_{k=1}^{\infty} z^k / k^s$, $|z| < 1$, is the polylogarithm function [60], with the dilogarithm $\operatorname{Li}_2(z)$ and trilogarithm $\operatorname{Li}_3(z)$ as special cases. We have given the value of the definite integral at the end of the example.
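When the closed-form primitive is unwieldy, the definite integral can also be evaluated by numerical quadrature; in the following minimal sketch, the values of $\xi$, $\eta$, $\alpha$, and $\beta$ are placeholders, not those of Example 2.

```python
import numpy as np
from scipy.integrate import quad

# Direct numerical evaluation of the definite integral above.
# xi, eta, alpha, beta are placeholder values, not those of Example 2.
xi, eta, alpha, beta = 1.4, 0.7, 1.0, 0.16

def integrand(x):
    return x ** 2 * np.exp(xi * x) / (alpha - beta * np.exp(eta * x)) ** 2

value, abs_err = quad(integrand, -0.5, 0.5)
print(value, abs_err)
```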

References

1. Düker, C.; Matteson, D.S.; Tsay, R.S.; Wilms, I. Vector autoregressive moving average models: A review. WIREs Comput. Stat. 2025, 17, e70009.
2. Quenouille, M.H. The Analysis of Multiple Time Series; Griffin: London, UK, 1957.
3. Priestley, M.B. Evolutionary spectra and non-stationary processes. J. R. Stat. Soc. Ser. B 1965, 27, 204–237.
4. Whittle, P. Recursive relations for predictors of non-stationary processes. J. R. Stat. Soc. Ser. B 1965, 27, 523–532.
5. Subba Rao, T. The fitting of non-stationary time-series models with time-dependent parameters. J. R. Stat. Soc. Ser. B 1970, 32, 312–322.
6. Priestley, M.B.; Tong, H. On the analysis of bivariate non-stationary processes. J. R. Stat. Soc. Ser. B 1973, 35, 153–166.
7. Bordignon, S.; Masarotto, G. Una classe di modelli non stazionari. Statistica 1983, 43, 83–104.
8. Tjøstheim, D. Estimation in Linear Time Series Models II: Some Nonstationary Series; Department of Mathematics, University of Bergen: Bergen, Norway, 1984.
9. Kwoun, G.H.; Yajima, Y. On an autoregressive model with time-dependent coefficients. Ann. Inst. Stat. Math. Part A 1986, 38, 297–309.
10. Grillenzoni, C. Modeling time-varying dynamical systems. J. Am. Stat. Assoc. 1990, 85, 499–507.
11. Bibi, A.; Francq, C. Consistent and asymptotically normal estimators for cyclically time-dependent linear models. Ann. Inst. Stat. Math. 2003, 55, 41–68.
12. Azrak, R.; Mélard, G. Asymptotic properties of quasi-likelihood estimators for ARMA models with time-dependent coefficients. Stat. Inference Stoch. Process. 2006, 9, 279–330.
13. Dahlhaus, R. Maximum likelihood estimation and model selection for locally stationary processes. J. Nonparametric Stat. 1996, 6, 171–191.
14. Dahlhaus, R. On the Kullback-Leibler information divergence of locally stationary processes. Stoch. Process. Their Appl. 1996, 62, 139–168.
15. Dahlhaus, R. Asymptotic statistical inference for nonstationary processes with evolutionary spectra. In Athens Conference on Applied Probability and Time Series Analysis 2; Robinson, P.M., Rosenblatt, M., Eds.; Springer: New York, NY, USA, 1996; pp. 145–159.
16. Dahlhaus, R. Fitting time series models to nonstationary processes. Ann. Stat. 1997, 25, 1–37.
17. Dahlhaus, R. A likelihood approximation for locally stationary processes. Ann. Stat. 2000, 28, 1762–1794.
18. Dahlhaus, R. Locally stationary processes. In Handbook of Statistics; Rao, T.S., Rao, S.S., Rao, C.R., Eds.; Elsevier: Amsterdam, The Netherlands, 2012; Volume 30, Chapter 13; pp. 351–413.
19. Azrak, R.; Mélard, G. Autoregressive models with time-dependent coefficients—A comparison between several approaches. Stats 2022, 5, 784–804.
20. Ombao, H.C.; Raz, J.A.; von Sachs, R.; Malow, B.A. Automatic statistical analysis of bivariate nonstationary time series. J. Am. Stat. Assoc. 2001, 96, 543–560.
21. Van Bellegem, S.; von Sachs, R. Forecasting economic time series with unconditional time-varying variance. Int. J. Forecast. 2004, 20, 611–627.
22. Ombao, H.; von Sachs, R.; Guo, W. SLEX analysis of multivariate nonstationary time series. J. Am. Stat. Assoc. 2005, 100, 519–531.
23. Van Bellegem, S.; Dahlhaus, R. Semiparametric estimation by model selection for locally stationary processes. J. R. Stat. Soc. Ser. B 2006, 68, 721–746.
24. Nason, G.P. A test for second-order stationarity and approximate confidence intervals for localized autocovariances for locally stationary time series. J. R. Stat. Soc. Ser. B 2013, 75, 879–904.
25. Puchstein, R.; Preuss, P. Testing for stationarity in multivariate locally stationary processes. J. Time Ser. Anal. 2015, 37, 3–29.
26. Dette, H.; Wu, W. Prediction in locally stationary time series. J. Bus. Econ. Stat. 2020, 40, 370–381.
27. Killick, R.; Knight, M.I.; Nason, G.P.; Eckley, I.A. The local partial autocorrelation function and some applications. Electron. J. Stat. 2020, 14, 3268–3314.
28. Bardet, J.-M. Stationarity and Goodness-of-Fit Tests for Locally Stationary Time Series. 2024, hal-04675274. Available online: https://hal.science/hal-04675274 (accessed on 24 March 2025).
29. Bourhattas, A.; Laïb, N. Nonparametric Estimation in Nonlinear Time-Varying Autoregressive Locally Stationary Processes with ARCH-Errors. 2024, hal-04611601. Available online: https://hal.science/hal-04611601 (accessed on 24 March 2025).
30. Killick, R.; Knight, M.I.; Nason, G.P.; Nunes, M.A.; Eckley, I.A. Automatic locally stationary time series forecasting with application to predicting UK gross value added time series. J. R. Stat. Soc. Ser. C 2025, 74, 18–33.
31. Lütkepohl, H. New Introduction to Multiple Time Series Analysis; Springer-Verlag: New York, NY, USA, 2005.
32. Creal, D.D.; Koopman, S.J.; Lucas, A. Generalized autoregressive score models with applications. J. Appl. Econom. 2013, 28, 777–795.
33. Teräsvirta, T.; Tjøstheim, D.; Granger, C.W.J. Modelling Nonlinear Economic Time Series; Oxford University Press: Oxford, UK, 2010.
34. Teräsvirta, T.; Yang, Y. Linearity and Misspecification Tests for Vector Smooth Transition Regression Models; CREATES Research Papers No. 2014-04; Aarhus Universitet: Aarhus, Denmark, 2014.
35. El Yaagoubi Bourakna, A.; Pinto-Orellana, M.A.; Fortin, N.; Ombao, H. Smooth online parameter estimation for time varying VAR models with application to rat local field potential activity data. Stat. Its Interface 2023, 16, 227–257.
36. Li, X.; Yuan, J. DeepTVAR: Deep learning for a time-varying VAR model with extension to integrated VAR. Int. J. Forecast. 2023, 40, 1123–1133.
37. Hindrayanto, I.; Koopman, S.J.; Ooms, M. Exact maximum likelihood estimation for non-stationary periodic time series models. Comput. Stat. Data Anal. 2010, 54, 2641–2654.
38. Yan, Y.; Gao, J.; Peng, B. Asymptotics for time-varying vector MA(∞) processes. Econom. Theory 2024, in press.
39. Karanasos, M.; Paraskevopoulos, A.; Magdalinos, A.; Canepa, A. A unified theory for ARMA models with varying coefficients: One solution fits all. Econom. Theory 2024, in press.
40. Triantafyllopoulos, K.; Nason, G.P. A Bayesian analysis of moving average processes with time-varying parameters. Comput. Stat. Data Anal. 2007, 52, 1025–1046.
41. Box, G.E.P.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control, 5th ed.; Wiley: New York, NY, USA, 2015.
42. Fischer, B.; Planas, C. Large scale fitting of regression models with ARIMA errors. J. Off. Stat. 2000, 16, 173–184.
43. Alj, A.; Azrak, R.; Mélard, G. General estimation results for tdVARMA array models. J. Time Ser. Anal. 2025, 46, 137–151.
44. Alj, A.; Azrak, R.; Ley, C.; Mélard, G. Asymptotic properties of QML estimators for VARMA models with time-dependent coefficients. Scand. J. Stat. 2017, 44, 617–635.
45. Mélard, G. An indirect proof for the asymptotic properties of VARMA model estimators. Econom. Stat. 2022, 21, 96–111.
46. Wilms, I.; Basu, S.; Bien, J.; Matteson, D.S. Sparse identification and estimation of large-scale vector autoregressive moving averages. J. Am. Stat. Assoc. 2023, 118, 571–582.
47. Wilms, I.; Basu, S.; Bien, J.; Matteson, D.S. bigtime: Sparse Estimation of Large Time Series Models; R Package Version 0.1.0, 2017. Available online: https://CRAN.R-project.org/package=bigtime (accessed on 24 March 2025).
48. Ling, S.; McAleer, M. Asymptotic theory for a vector ARMA-GARCH model. Econom. Theory 2003, 19, 280–310.
49. Engle, R.F.; Kroner, K.F. Multivariate simultaneous generalized ARCH. Econom. Theory 1995, 11, 122–150.
50. Francq, C.; Zakoïan, J.-M. GARCH Models: Structure, Statistical Inference and Financial Applications; Wiley: New York, NY, USA, 2019.
51. Mélard, G. The information matrix of time-dependent models for vector time series. 2025; submitted.
52. Mélard, G. New computational aspects for estimating time-dependent VARMA models. 2025; submitted.
53. Alj, A.; Jónasson, K.; Mélard, G. The exact Gaussian likelihood estimation of time-dependent VARMA models. Comput. Stat. Data Anal. 2016, 100, 633–644.
54. Mélard, G. Time-dependent processes and time series models: Comments on Marc Hallin's early contributions and a pragmatic view on estimation. In Recent Advances in Econometrics and Statistics, Festschrift in Honour of Marc Hallin; Barigozzi, M., Hörmann, S., Paindaveine, D., Eds.; Springer Nature: Cham, Switzerland, 2024; pp. 429–446.
55. Azrak, R.; Mélard, G. Asymptotic properties of conditional least-squares estimators for array time series. Stat. Inference Stoch. Process. 2021, 24, 525–547.
56. Mélard, G. ARMA models with time-dependent coefficients: Official statistics examples. In Time Series Analysis—New Insights; Rifaat, A., El-Diasty, M., Kostogryzov, A., Makhutov, N., Eds.; IntechOpen: London, UK, 2023; pp. 18–35.
57. Boyadzhiev, K.N.; Dil, A. Geometric polynomials: Properties and applications to series with zeta values. Anal. Math. 2016, 42, 203–224.
58. Abramowitz, M.; Stegun, I.A. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables; Dover Publications: New York, NY, USA, 1965.
59. Erdélyi, A. Higher Transcendental Functions, Volume 1; McGraw-Hill: New York, NY, USA, 1953.
60. Wood, D. The Computation of Polylogarithms; Technical Report 15-92; Computing Laboratory, University of Kent: Canterbury, UK, 1992. Available online: https://www.cs.kent.ac.uk/pubs/1992/110/ (accessed on 24 March 2025).
Table 1. Estimation results for the tdVAR(n)(1) model (8) under (10) with Laplace errors. (a) emp.est. is the empirical estimate (average over the simulation estimates); (b) emp.s.e. is the empirical standard error (standard deviation over the simulation estimates); (c) est.s.e. is the estimated standard error (average of the standard errors obtained as a by-product of estimation); (d) theor.s.e. is the (approximate) theoretical standard error (finite approximation using (4)); (e) % rej. is the percentage of rejections of the test of the hypothesis $H_0: \theta = \theta^0$ across the simulations.

Parameter:    Sample    (a)        (b)        (c)        (d)          (e)
True value    size n    emp.est.   emp.s.e.   est.s.e.   theor.s.e.   % rej.
A11: 0.8      25        0.7507     0.1295     0.0997     0.1136       13.0
              50        0.7843     0.0811     0.0692     0.0767       8.5
              100       0.7889     0.0558     0.0509     0.0530       6.7
              200       0.7951     0.0374     0.0373     0.0370       6.3
              400       0.7971     0.0269     0.0258     0.0260       5.7
A22: 0.75     25        0.6014     0.6437     0.5440     0.7030       15.7
              50        0.7004     0.4759     0.4305     0.4827       9.8
              100       0.7041     0.3461     0.3099     0.3363       7.9
              200       0.7347     0.2364     0.2224     0.2360       6.8
              400       0.7461     0.1690     0.1604     0.1662       6.2
η22: 0.7      25        0.6969     0.7711     0.5630     0.7442       18.3
              50        0.7035     0.5422     0.4310     0.5369       11.8
              100       0.7084     0.3865     0.3249     0.3834       11.9
              200       0.7078     0.2684     0.2449     0.2725       8.3
              400       0.6886     0.1948     0.1801     0.1932       7.5
Table 2. Estimation results for the tdVAR(n)(1) model (8) under (10) with Student errors with 5 degrees of freedom. (a) emp.est. is the empirical estimate (average over the simulation estimates); (b) emp.s.e. is the empirical standard error (standard deviation over the simulation estimates); (c) est.s.e. is the estimated standard error (average of the standard errors obtained as a by-product of estimation); (d) theor.s.e. is the (approximate) theoretical standard error (finite approximation using (4)); (e) % rej. is the percentage of rejections of the test of the hypothesis $H_0: \theta = \theta^0$ across the simulations.

Parameter:    Sample    (a)        (b)        (c)        (d)          (e)
True value    size n    emp.est.   emp.s.e.   est.s.e.   theor.s.e.   % rej.
A11: 0.8      25        0.7608     0.1193     0.0960     0.1136       12.1
              50        0.7773     0.0855     0.0706     0.0767       8.6
              100       0.7918     0.0539     0.0504     0.0530       6.1
              200       0.7938     0.0387     0.0363     0.0370       5.8
              400       0.7977     0.0260     0.0257     0.0260       4.7
A22: 0.75     25        0.6101     0.6910     0.5501     0.7030       18.5
              50        0.7153     0.4564     0.4290     0.4827       9.3
              100       0.7303     0.3191     0.3097     0.3363       6.6
              200       0.7324     0.2343     0.2237     0.2360       6.3
              400       0.7464     0.1698     0.1607     0.1662       6.6
η22: 0.7      25        0.7051     0.6554     0.5366     1.2453       12.6
              50        0.7102     0.4976     0.3946     0.8984       12.4
              100       0.6936     0.3844     0.3092     0.6416       11.2
              200       0.7072     0.2796     0.2391     0.4560       9.0
              400       0.7052     0.2068     0.1803     0.3232       7.1
Table 3. Estimation results for the tdVMA(n)(1) model (8) under (10) but with linear expressions for $B_{t,11}^{(n)}$ and $B_{t,22}^{(n)}$, and with Laplace errors. (a) emp.est. is the empirical estimate (average over the simulation estimates); (b) emp.s.e. is the empirical standard error (standard deviation over the simulation estimates); (c) est.s.e. is the estimated standard error (average of the standard errors obtained as a by-product of estimation); (d) theor.s.e. is the (approximate) theoretical standard error (finite approximation using (4)); (e) % rej. is the percentage of rejections of the test of the hypothesis $H_0: \theta = \theta^0$ across the simulations.

Parameter:    Sample    (a)        (b)        (c)        (d)          (e)
True value    size n    emp.est.   emp.s.e.   est.s.e.   theor.s.e.   % rej.
B11: 0.8      25        0.7827     0.1045     0.0989     0.0913       11.5
              50        0.8014     0.0696     0.0658     0.0614       9.0
              100       0.8043     0.0453     0.0435     0.0423       9.0
              200       0.8026     0.0310     0.0301     0.0295       5.9
              400       0.8012     0.0216     0.0208     0.0207       7.3
B22: 0.4      25        0.3850     0.4856     0.4473     0.4722       15.6
              50        0.4082     0.3412     0.3069     0.3207       10.9
              100       0.3934     0.2308     0.2149     0.2223       7.8
              200       0.3943     0.1606     0.1508     0.1557       6.2
              400       0.4004     0.1127     0.1078     0.1095       6.2
η22: 0.7      25        0.7458     0.5624     0.3567     0.4424       24.4
              50        0.7349     0.3492     0.2626     0.3200       16.7
              100       0.7168     0.2415     0.1973     0.2289       14.1
              200       0.7115     0.1658     0.1455     0.1627       10.0
              400       0.6950     0.1190     0.1078     0.1154       8.7
Table 4. Estimation results for the tdVMA(n)(1) model, n = 400, with normal errors. (a) emp.est. is the empirical estimate (average over the simulation estimates); (b) emp.s.e. is the empirical standard error (standard deviation over the simulation estimates); (c1) and (c2) est.s.e. are the estimated standard errors (average of the standard errors obtained as a by-product of estimation), based on $\hat{V}$ or $\hat{V}^{-1}\hat{W}\hat{V}^{-1}$, respectively; (d) theor.s.e. is the (approximate) theoretical standard error (finite approximation using (4)); (e) % rej. is the percentage of rejections of the test of the hypothesis $H_0: \theta = \theta^0$ across the simulations.

Parameter $\theta_i$                                        B11      B22      η22
True value $\theta_i^0$                                     0.8000   0.4000   0.7000
(a) emp.est.                                                0.8010   0.4007   0.6977
(b) emp.s.e.                                                0.0216   0.1134   0.0747
(c1) est.s.e., based on $\hat{V}$                           0.0210   0.1105   0.0744
(c2) est.s.e., based on $\hat{V}^{-1}\hat{W}\hat{V}^{-1}$   0.0208   0.1095   0.0729
(d) theor.s.e.                                              0.0207   0.1095   0.0731
(e) % rej.                                                  7.1      6.0      5.5