1. Introduction
Autoregressive-moving average (ARMA) models with time-dependent (td) coefficients and marginally heteroscedastic innovations provide a natural alternative to stationary ARMA time series models. Several theories have been developed over the last 25 years for parametric estimation in that context (see [1]).
To simplify our presentation, before considering the autoregressive model of order $p$, or tdAR($p$), let us consider the case of the tdAR(1) model with a time-dependent coefficient $\phi_t^{(n)}$, which depends on time $t$ and also, possibly, on $n$, the length of the series. Marginal heteroscedasticity is introduced using another deterministic sequence $\{g_t^{(n)}\}$. Let $\{\epsilon_t\}$ be a white noise process, consisting of independent random variables, not necessarily identically distributed, with a mean of zero, a standard deviation $\sigma$, and a fourth-order cumulant $\kappa_t$. The model is defined by
$$x_t^{(n)} = \phi_t^{(n)} x_{t-1}^{(n)} + g_t^{(n)} \epsilon_t. \qquad (1)$$
The coefficient $\phi_t^{(n)}$ and the factor $g_t^{(n)}$ depend on $t$ and sometimes, not always, on $n$, but also on the parameters. We denote $\sigma g_t^{(n)}$ the innovation standard deviation. For a given $n$, consider a sequence of observations $x_1^{(n)}, \dots, x_n^{(n)}$ of the process. When $\phi_t^{(n)}$ or $g_t^{(n)}$ depend on $n$, we should speak of a triangular array process, not of a stochastic process. Note that our notation differs slightly from that of [1] in order to comply with the notations introduced in multivariate models (see [2]).
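To make the triangular-array character of model (1) concrete, here is a minimal simulation sketch in Python; the particular coefficient function and heteroscedasticity factor below are our own illustrative assumptions, not taken from [1]:

```python
import numpy as np

def simulate_tdar1(n, phi_fun, g_fun, sigma=1.0, seed=None):
    """Simulate one row x_1, ..., x_n of the triangular array of the
    tdAR(1) model x_t = phi_t^{(n)} x_{t-1} + g_t^{(n)} eps_t, with x_0 = 0."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n + 1)                     # x[0] holds the zero initial value
    eps = rng.normal(0.0, sigma, size=n)    # independent innovations
    for t in range(1, n + 1):
        x[t] = phi_fun(t, n) * x[t - 1] + g_fun(t, n) * eps[t - 1]
    return x[1:]

# Illustrative choices: a coefficient that is linear in t and depends on n,
# and a deterministic seasonal heteroscedasticity factor with period 12.
phi = lambda t, n: 0.5 + 0.4 * (t - (n + 1) / 2) / (n - 1)
g = lambda t, n: 1.0 + 0.5 * np.sin(2 * np.pi * t / 12)

x = simulate_tdar1(200, phi, g, seed=42)
```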
The AR(1) process with a time-dependent coefficient has been considered by Wegman [3], Tjøstheim [4] and Kwoun and Yajima [5]. Hamdoune [6] and Dahlhaus [7] extended the results to autoregressive processes of order $p$. Azrak and Mélard [1], denoted AM (see Appendix A), and Bibi and Francq [8], denoted BF, have considered tdARMA processes. Contrary to AM, in BF the coefficients depend only on $t$, not on $n$. Additionally, although the basic assumptions of BF and AM are different, their asymptotics are somewhat similar but differ considerably from those of Dahlhaus [7] based on locally stationary processes (LSP), where the dependence on $t$ and $n$ is only through their ratio $t/n$ (see the nice overview by Dahlhaus in [9]). For simplicity, we will compare these approaches on autoregressive models.
Two approaches can be sketched for asymptotic theories within nonstationary processes (see [10]). Approach 1 consists of analyzing the behavior of the process when $n$ tends to infinity. That assumes some generating mechanism in the background that remains the same over time. Two examples can be mentioned: processes with periodically changing coefficients and cointegrated processes. It is in the former context that BF [8] have established asymptotic properties for parameter estimators in the case where $n$ goes to infinity. Approach 2 for asymptotics within nonstationary processes consists of determining how estimates that are obtained for a finite and fixed sample size behave. This is the setting for describing, in general, the properties of a test under local alternatives (where the parameter space is rescaled by $n^{-1/2}$) or in nonparametric regression. Approach 2 is the framework considered by Dahlhaus [7] for LSP, which we will briefly summarize now. First, there is an assumption of local stationarity that imposes continuity with respect to time and even differentiability (although [9] tries to replace these assumptions with bounded variation alternatives). Additionally, $n$ is not simply increased to infinity. The coefficients, such as $\phi_t^{(n)}$, are considered as a function of rescaled time $t/n$. Therefore, everything happens as if the time is rescaled to the interval $[0, 1]$. Suppose $\phi_t^{(n)} = \phi(t/n)$ and $g_t^{(n)} = g(t/n)$, where $\phi(u)$ and $g(u)$, $u \in [0, 1]$, depend on a finite number of parameters, are differentiable functions of $u$, and are such that $|\phi(u)| < 1$ for all $u$. The model is written as
$$x_t^{(n)} = \phi\!\left(\frac{t}{n}\right) x_{t-1}^{(n)} + g\!\left(\frac{t}{n}\right) \epsilon_t. \qquad (2)$$
As a consequence, the assumptions made in the LSP theory are quite different from those of AM and BF, for example, due to the different nature of the asymptotics. The AM approach is somewhere between these two approaches, 1 and 2, sharing parts of their characteristics but not all of them. In Section 2, we specialize the assumptions of AM for tdAR($p$) processes and consider the special case of a tdAR(1) process. In Appendix B, we provide an alternative theory for tdAR($p$) processes that relies on a $\rho$-mixing property. The AM theory is further illustrated in Appendix C by simulation results on a tdAR(2) process. The LSP and BF theories are summarized in Section 3 and Section 4, respectively. In Section 5, we compare the Dahlhaus LSP theory with our own AM theory, partly by means of examples. The differences in the basic assumptions are emphasized. Similarly, in Section 6, a comparison is presented between the AM and BF approaches, before the conclusions in Section 7.
2. The AM Theory for Time-Dependent Autoregressive Processes
Let us consider the AM theory in the special case of tdAR($p$) processes. We also want to see if simpler conditions can be derived for the treatment of pure autoregressive processes.
We consider a triangular array of random variables $(x_t^{(n)},\ t = 1, \dots, n;\ n \in \mathbb{N})$ defined on a probability space $(\Omega, \mathcal{F}, P_{\theta})$, with values in $\mathbb{R}$, whose distribution depends on a vector $\theta$ of unknown parameters to be estimated, with $\theta$ lying in an open set $\Theta$ of a Euclidean space. The true value of $\theta$ is denoted by $\theta^0$. By abuse of language, we will nevertheless talk about the process $\{x_t^{(n)}\}$.
Definition 1. The process $\{x_t^{(n)}\}$ is called an autoregressive process of order $p$, with time-dependent coefficients, if and only if it satisfies the equation
$$x_t^{(n)} = \sum_{i=1}^{p} \phi_{it}^{(n)} x_{t-i}^{(n)} + g_t^{(n)} \epsilon_t, \qquad (3)$$
where $\{\epsilon_t\}$ and $g_t^{(n)}$ are as before. We denote again $\sigma g_t^{(n)}$ the innovation standard deviation. The initial values $x_0^{(n)}, \dots, x_{1-p}^{(n)}$ are supposed to be equal to zero. The $r$-dimensional vector $\theta$ contains all the parameters to be estimated, those in the $\phi_{it}^{(n)}$ and those in $g_t^{(n)}$, but not the scale factor $\sigma$, which is estimated separately. We suppose a specific deterministic parameterization as a function of $t$ and $n$. Let $\phi_{it}^{(n)}(\theta)$ be the parametric coefficient, with $\phi_{it}^{(n)} = \phi_{it}^{(n)}(\theta^0)$, and, similarly, $g_t^{(n)}(\theta)$, with $g_t^{(n)} = g_t^{(n)}(\theta^0)$. Let $e_t^{(n)}(\theta)$ be the residual for a given $\theta$:
$$e_t^{(n)}(\theta) = x_t^{(n)} - \sum_{i=1}^{p} \phi_{it}^{(n)}(\theta)\, x_{t-i}^{(n)}. \qquad (4)$$
Note that $e_t^{(n)}(\theta^0) = g_t^{(n)} \epsilon_t$.
Thanks to the assumption about the initial values and by using (3) recurrently, it is possible to write the pure moving average representation of the process:
$$x_t^{(n)} = \sum_{k=0}^{t-1} \psi_{tk}^{(n)}\, g_{t-k}^{(n)}\, \epsilon_{t-k} \qquad (5)$$
(see [1] for a recurrence formula for the $\psi_{tk}^{(n)}$). Let $\mathcal{F}_t$ be the $\sigma$-field generated by $\{\epsilon_s, s \le t\}$ and, hence, by $\{x_s^{(n)}, s \le t\}$, which explains why a superscript $(n)$ is suppressed. To simplify the presentation, we also drop the superscript $(n)$ and write $e_t(\theta)$ and, similarly, $x_t$ and $g_t(\theta)$. We are interested in the Gaussian quasi-maximum likelihood estimator:
$$\hat{\theta}_n = \arg\min_{\theta \in \Theta} \sum_{t=1}^{n} \left\{ \log g_t^{2}(\theta) + \frac{e_t^{2}(\theta)}{g_t^{2}(\theta)} \right\}. \qquad (6)$$
Denote $\alpha_t(\theta)$ the expression between the curly brackets in (6). Note that the first term of $\alpha_t(\theta)$ will sometimes be omitted, corresponding to a weighted least squares method, especially when $g_t$ does not depend on the parameters, or even ordinary least squares, when $g_t$ does not depend on $t$. BF consider that estimation method and, also, a variant where the denominator is replaced by a consistent estimator. Other estimators are also used in the LSP theory (see Section 3).
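To fix ideas, the following sketch evaluates and minimizes the criterion in (6) for a tdAR(1) with a coefficient linear in time; the parameterization (a linear coefficient and a constant, log-parameterized $g_t$) is our own illustrative assumption, not the paper's:

```python
import numpy as np
from scipy.optimize import minimize

def qml_criterion(theta, x):
    """Sum over t of log g_t^2(theta) + e_t^2(theta) / g_t^2(theta), cf. (6),
    for phi_t(theta) = theta[0] + theta[1] * (t - (n+1)/2) / (n-1) and a
    constant g_t(theta) = exp(theta[2]) (exponential ensures positivity)."""
    n = len(x)
    t = np.arange(2, n + 1)                        # residuals for t = 2, ..., n
    phi_t = theta[0] + theta[1] * (t - (n + 1) / 2) / (n - 1)
    e_t = x[1:] - phi_t * x[:-1]                   # e_t(theta) = x_t - phi_t x_{t-1}
    g2 = np.exp(2.0 * theta[2])
    return np.sum(np.log(g2) + e_t ** 2 / g2)

# x is a series, e.g., produced by the simulation sketch in Section 1
res = minimize(qml_criterion, x0=np.zeros(3), args=(x,), method="BFGS")
theta_hat = res.x
```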
We need expressions for the derivatives of $e_t(\theta)$ with respect to $\theta$ using (4). The first derivative is
$$\frac{\partial e_t(\theta)}{\partial \theta_j} = -\sum_{i=1}^{p} \frac{\partial \phi_{it}^{(n)}(\theta)}{\partial \theta_j}\, x_{t-i}. \qquad (7)$$
It will be convenient to write it as a pure moving average process using (5),
for $t = 2, \dots, n$, where the coefficients are obtained by recurrence relations similar to those for the $\psi_{tk}^{(n)}$ (see [1]).
Let us introduce notations for these first-derivative coefficients (with respect to [1], we improved the presentation in light of [8], especially by distinguishing $\theta$ and its true value $\theta^0$; the notations for the innovations are also changed to emphasize that $\epsilon_t$ does not depend on $n$). Similarly, we introduce the coefficients obtained using the second and third derivatives of $e_t(\theta)$ for $t = 2, \dots, n$ and define the corresponding quantities.
Under all the assumptions of Theorem 2' of [1] (see Appendix A), the estimator $\hat{\theta}_n$ converges in probability to $\theta^0$ and
$$n^{1/2}\,(\hat{\theta}_n - \theta^0) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}\!\left(0,\; V^{-1} W V^{-1}\right)$$
when $n \to \infty$, where, with $^{\mathsf T}$ denoting transposition,
$$V = \lim_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} E_{\theta^0}\!\left( \frac{\partial^2 \alpha_t(\theta)}{\partial \theta\, \partial \theta^{\mathsf T}} \right) \qquad (8)$$
and
$$W = \lim_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} E_{\theta^0}\!\left( \frac{\partial \alpha_t(\theta)}{\partial \theta}\, \frac{\partial \alpha_t(\theta)}{\partial \theta^{\mathsf T}} \right), \qquad (9)$$
where $\alpha_t(\theta)$ was defined after (6).
Example 1. The tdAR(1) process.
Let us consider a tdAR(1) process defined by (1) with a parametric coefficient $\phi_t^{(n)}(\theta)$; for example, see (13) or (14) later, or as in [4]. We have, for the coefficients $\psi_{tk}^{(n)}$ in (5),
$$\psi_{tk}^{(n)} = \prod_{j=t-k+1}^{t} \phi_j^{(n)},$$
where a product for an empty set of indices is set to one. Note that, if $\phi_t^{(n)}$ is a constant $\phi$, then $\psi_{tk}^{(n)} = \phi^k$, as for the MA representation of a stationary AR(1) process. Similar expressions hold for the coefficients associated with the first, second and third derivatives. The following is an application of Theorem 2' of [1]. (Note that the assumption in [1] that the expectation of the fourth-order power of the variable is bounded is replaced by a bound for the sum over $k$ of the corresponding coefficients.)
Theorem 1. Consider a tdAR(1) process defined by (1) under the assumptions of Theorem A1 in Appendix A, except that the assumption on the coefficients is replaced by the following: suppose that there exist constants $N$ and $\Phi$, with $0 < \Phi < 1$, such that inequalities of the form
$$\big|\psi_{tk}^{(n)}\big| = \Big|\prod_{j=t-k+1}^{t} \phi_j^{(n)}\Big| \le N \Phi^k$$
hold for all $n$, $i$, $j$, $l$ and $k$, uniformly in $n$, with analogous bounds for the second- and third-order derivatives. Then, the results of Theorem A1 are still valid.
Proof. Let us show the first of the inequalities, since the others are similar; it follows by bounding the factors of the product $\psi_{tk}^{(n)} = \prod_{j=t-k+1}^{t} \phi_j^{(n)}$ uniformly in $n$. ☐
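The product formula for the $\psi_{tk}^{(n)}$ can be verified numerically: the sketch below (with an illustrative coefficient function of our own choosing) reconstructs $x_{t}$ from the moving average representation (5) and checks it against the recursive simulation.

```python
import numpy as np

def psi(t, k, n, phi_fun):
    """psi_{tk}^{(n)}: product of phi_j^{(n)} for j = t-k+1, ..., t;
    an empty product (k = 0) is set to one."""
    return np.prod([phi_fun(j, n) for j in range(t - k + 1, t + 1)])

n = 200
phi = lambda t, n: 0.5 + 0.4 * (t - (n + 1) / 2) / (n - 1)
g = lambda t, n: 1.0 + 0.5 * np.sin(2 * np.pi * t / 12)
eps = np.random.default_rng(1).normal(size=n)

x = np.zeros(n + 1)                      # recursive simulation, x_0 = 0
for t in range(1, n + 1):
    x[t] = phi(t, n) * x[t - 1] + g(t, n) * eps[t - 1]

t0 = 150                                 # MA reconstruction of x_{t0}, cf. (5)
x_ma = sum(psi(t0, k, n, phi) * g(t0 - k, n) * eps[t0 - k - 1]
           for k in range(t0))
assert np.isclose(x[t0], x_ma)           # both representations agree
```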
Remark 1. Note that the first inequality is true when $|\phi_t^{(n)}| \le \Phi < 1$ for all $t$ and $n$, but this is not a necessity. A finite number of those coefficients can be greater than 1 in absolute value without any problem. For example, a coefficient greater than 1 on a short stretch of time would be acceptable, because the interval where the coefficient is greater than 1 shrinks when $n \to \infty$. With this in mind, Example 3 of [1] can be slightly modified to allow the upper bound of the $\phi$'s to be greater than one. This will be illustrated in Section 5. Note also that the other inequalities are easy to check. One of the assumptions of Theorem A1 in Appendix A is particularly strange at first sight, although it could be checked in the examples of Section 4 in [1]. It is interesting to note that, at least in the framework of autoregressive processes, it can be replaced by a more standard $\rho$-mixing condition. This is done in Appendix B. We were unfortunately not able to generalize it to the time-dependent moving average (MA) or ARMA models.
In [1], a few simulations were presented for simple tdAR(1) and tdMA(1) models, and, moreover, the generated series were stationary. That allowed us to assess the theory empirically but not to put it in jeopardy. In Appendix C, we show simulations for a tdAR(2) model, where the true coefficients $\phi_{1t}^{(n)}$ and $\phi_{2t}^{(n)}$ vary considerably. The parameterization used is
$$\phi_{kt}^{(n)} = \phi_k' + \phi_k''\, \frac{t - (n+1)/2}{n-1}, \quad k = 1, 2. \qquad (11)$$
The results are good, although the model cannot be fitted on some of the generated series, especially for short series of length $n = 50$. Except for the behavior at the end of the series, and with a small approximation of $t/(n-1)$ by $t/n$, it will be seen in Section 5 that the data-generating process fulfills the assumptions of the LSP theory, so these simulations can also be seen as illustrative of that theory. It should be noted that there are few simulation experiments mentioned in the LSP literature. The word "simulation" is mentioned twice in [9] but each time to express a request.
Estimation of all tdAR models (and tdARMA models as well) is done by quasi-maximum likelihood estimation (QMLE) using numerical optimization of the exact Gaussian likelihood function. An algorithm for the exact computation of the likelihood is based on a Cholesky factorization for band matrices [11], but an algorithm based on the Kalman filter [12] can be used as well. Since no standard software package can treat this, it has been implemented in a specific program called ANSECH, written in Fortran and included in Time Series Expert (see [13]), which can be made available by the second author.
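For the tdAR(1) case, the Kalman filter route reduces to a scalar prediction-error decomposition, since the state is observed exactly; the sketch below is only indicative of the principle and is not the ANSECH implementation:

```python
import numpy as np

def tdar1_exact_loglik(x, phi_fun, g_fun, sigma=1.0):
    """Exact Gaussian log-likelihood of a tdAR(1) series with x_0 = 0.
    Because x_{t-1} is observed without error, the Kalman filter reduces
    to the one-step prediction-error decomposition."""
    n = len(x)
    loglik = 0.0
    x_prev = 0.0                                # known zero initial value
    for t in range(1, n + 1):
        pred = phi_fun(t, n) * x_prev           # one-step prediction of x_t
        var = (sigma * g_fun(t, n)) ** 2        # prediction error variance
        v = x[t - 1] - pred                     # innovation
        loglik -= 0.5 * (np.log(2 * np.pi * var) + v ** 2 / var)
        x_prev = x[t - 1]
    return loglik
```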
In
Section 6 of [
1], there are illustrative numerical examples of the AM theory. They are all based on Box–Jenkins series A, B and G (see [
14]). These examples involve tdARMA models, not tdAR models. To illustrate a tdAR model, let us consider Box–Jenkins series D (see [
14]). It is a series of 310 chemical process viscosity readings every hour. The fitted model was an AR(1) model with a constant. Here, we replaced the constant autoregressive coefficient with a linear function of time, as in (11) for
k = 1 (but omitting the inverse of
n – 1), and fitted a model on the first
n = 300 observations. We used the exact likelihood function, but (nonlinear) least squares would give very similar results because the series is long enough to remove the effect of the initial value at time
t = 0. The results in Table 1 show the estimates for the three parameters and the corresponding standard errors obtained from the estimator of $V$ deduced from the optimization algorithm and justified by the asymptotic theory. Since the $t$-value for $\phi''$ is equal to −2.7, we can reject the null hypothesis of a constant coefficient. Given these estimates, the coefficient $\phi_t$ varies linearly between 0.9549 and 0.7307. This is not surprising because, if we compute the autocorrelation at lag 1 around time 90, we find 0.87, while around time 210, we find 0.75.
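The local autocorrelation diagnostic used above can be coded in a few lines (the window width is an arbitrary choice of ours):

```python
import numpy as np

def local_lag1_autocorr(x, center, half_width=30):
    """Lag-1 autocorrelation on a window of the series centered at
    `center` (1-based index), as a rough local diagnostic."""
    w = x[max(0, center - 1 - half_width): center + half_width]
    w = w - w.mean()
    return float(np.sum(w[1:] * w[:-1]) / np.sum(w * w))

# For Box-Jenkins series D (not bundled here), one would compare
# local_lag1_autocorr(series_d, 90) with local_lag1_autocorr(series_d, 210).
```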
The AM theory was generalized to vector processes by [2], who treated the case of tdVARMA processes where the model coefficients do not depend on $n$, and by [15] for the general case, called tdVARMA$^{(n)}$ processes. Additionally, [16] provided a better foundation for the asymptotic theory of array processes, a theorem for a reduction of the order of moments from 8 to slightly more than 4, and tools for obtaining the asymptotic covariance matrix of the estimator. In [17], there is an example of vector tdAR and tdMA models on monthly log returns of IBM stock and the S&P 500 index from January 1926 to December 1999, treated first in [18].
3. The Theory of Locally Stationary Processes
We gave in Section 1 some elements of the theory of Dahlhaus [7]. It is based on a class of locally stationary processes (LSP), which means a sequence of stationary processes, based on a stochastic integral representation:
$$x_t^{(n)} = \int_{-\pi}^{\pi} e^{i \omega t} A_t^{(n)}(\omega)\, d\xi(\omega),$$
where $\xi(\omega)$ is a process with independent increments and $A_t^{(n)}(\omega)$ fulfills a condition to be called a slowly varying function of rescaled time $t/n$. The theory is well adapted to time series that will be called locally stationary time series (LSTS).
In the case of autoregressive processes, which are emphasized in this paper, for example, an AR(1) process, that means that the observations around time $t$ are supposed to be generated by a stationary AR(1) process with some coefficient $\phi(t/n)$. Stationarity implies that $|\phi(t/n)| < 1$. Around another time $t'$, fitting is done using the process with coefficient $\phi(t'/n)$. More generally, for AR($p$) processes, the autoregressive coefficients are such that the roots of the autoregressive polynomial are greater than 1 in modulus.
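This idea can be mimicked very simply by fitting a stationary AR(1) by least squares on a sliding window; the sketch below is only a caricature of the principle, not the spectral or Whittle estimators actually used in the LSP literature:

```python
import numpy as np

def local_ar1_coefficient(x, center, half_width=20):
    """Least squares AR(1) fit on a window around `center`, mimicking the
    assumption that observations near rescaled time u = center/n come
    from a stationary AR(1) with coefficient phi(u)."""
    w = x[max(0, center - half_width): center + half_width + 1]
    w = w - w.mean()
    return float(np.sum(w[1:] * w[:-1]) / np.sum(w[:-1] ** 2))
```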
The estimation method is based either on a spectral approach or a Whittle approximation of the Gaussian likelihood. The author of [
19] also sketched out a maximum likelihood estimation method.
As mentioned above, the LSP approach of doing asymptotics relies on rescaling time $t$ in $[0, 1]$. That does not mean that the process is considered in continuous time, but at least that its coefficients are considered in continuous time. Asymptotics is done by assuming an increasing number of observations between two given rescaled time points. That means that the coefficients are considered as a function of $t/n$, not separately as a function of $t$ and $n$. This is nearly the same as was assumed in (11), since $t/(n-1)$ is close to $t/n$ for large $n$. Note, however, that Example 1 of [1] is not in that class of processes. More generally, processes where the coefficients are periodic functions of $t$ are excluded from the class of processes under consideration. Of course, what was said about the coefficients is also valid for the innovation standard deviation. If the latter is a periodic function of time $t$, with a given period, the process is not compatible with time rescaling. We will compare the LSP theory with the AM theory in Section 5.
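The contrast can be made concrete: under time rescaling, increasing $n$ only refines the grid on which the same function of $u = t/n$ is sampled, whereas a coefficient that is periodic in $t$ with a fixed period generates ever more cycles. A small illustration, with functions of our own choosing:

```python
import numpy as np

phi_rescaled = lambda u: 0.5 + 0.3 * u                           # function of u = t/n
phi_periodic = lambda t: 0.5 + 0.3 * np.sin(2 * np.pi * t / 12)  # fixed period 12

for n in (100, 200, 400):
    mid = n // 2
    # Rescaled time: the mid-sample coefficient stabilizes as n grows.
    # Fixed period: the mid-sample coefficient keeps changing with n.
    print(n, phi_rescaled(mid / n), phi_periodic(mid))
```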
That being said, the theory of LSPs has received considerable attention in the statistical literature. In his review [
9], Dahlhaus listed a large number of extensions, generally by other authors, for univariate or multivariate models; for linear and nonlinear models and by parametric, semi-parametric and nonparametric methods. In particular, the following topics are overviewed, with several references for each of them: wavelet LSP, testing of LSP, in particular, testing for stationarity, bootstrap methods for LSP, model misspecification and model selection, likelihood theory and large deviations, recursive estimation, inference for a mean curve, piecewise constant models, long memory LSP, locally stationary random fields, discrimination analysis and applications in forecasting and finance. It is not useful to repeat a large number of references.
Furthermore, since [
9], a large number of articles have appeared in the framework of the LSP theory, so many that it is only possible to mention a few of them. They are about a time-varying general dynamic factor model [
20], time-varying additive models [
21,
22], nonparametric spectral analysis of multivariate series [
23], bootstrapping [
24], comparison of several techniques for identification of nonstationary multivariate autoregressive processes [
25], inference for nonstationary time series autoregressions [
26], the prediction of weakly LSP by autoregression [
27], predictive inference for LSTS [
28], frequency-domain tests for stationarity [
29,
30], cross-validations for LSP [
31], adaptive covariance and spectrum estimation of multivariate LSP [
32], large time-varying parameter VAR models by a nonparametric approach [
33], a co-stationarity test of LSTS [
34], towards a general theory of nonlinear LSPs [
35], a quantile spectral analysis of LSTS [
36], time-dependent dual-frequency coherence of a nonstationary time series [
37] and nonparametric estimation of AR(1) LSP with periodicity [
38].
Several examples illustrating these various methods for LSPs are included in these papers, such as in finance [
21,
26,
36], environmental studies [
22,
28], biology with EEG signals [
37], and also in economics with weekly egg prices [
24].
4. The Theory of Cyclically Time-Dependent Models
Here, we will focus on BF (see [
8]), but part of the discussion is also appropriate for older approaches such as [
4,
5,
6]. In [
8], Bibi and Francq developed a general theory of estimation for linear models with time-dependent coefficients, particularly aimed at the case of cyclically time-dependent coefficients (see also [
39,
40,
41,
42]).
The linear models include autoregressive but also moving average (MA) or ARMA models, as in AM. The coefficients can depend on $t$ in a general way but not on $n$. Hence, $\phi_{it}^{(n)}$ is written $\phi_{it}$. Heteroscedasticity is allowed similarly in the sense that the innovation variance can depend on $t$ (but not on $n$). The estimation method is a quasi-generalized least squares method.
The BF theory supports several classes of models. The periodic ARMA or PARMA models, where the coefficients are periodic functions of time, are an important class. Note that the period does not need to be an integer. However, Section 3 of [8] also considered a switching model based on $A$, a subset of the integers, and its complement $A^c$. For example, $A$ can be associated with weekdays and $A^c$ with the weekend. Then, the coefficient, e.g., $\phi_t$ in (1), depends on a parameter $(\phi^{(1)}, \phi^{(2)})$ in the following way: $\phi_t = \phi^{(1)} 1_A(t) + \phi^{(2)} 1_{A^c}(t)$, where $1_A$ denotes the indicator function, equal to 1 if $t$ belongs to $A$ and 0 otherwise. Consequently, there are two different regimes. However, the composition of $A$ and $A^c$ can also be generated by an i.i.d. sequence of Bernoulli experiments, with some parameter $\pi$, provided they are independent of the white noise process $\{\epsilon_t\}$.
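A minimal sketch of such a two-regime coefficient, using the weekdays versus weekend interpretation (the day-of-week convention and parameter values below are our own illustration):

```python
import numpy as np

def switching_phi(t, phi1, phi2):
    """phi_t = phi1 * 1_A(t) + phi2 * 1_{A^c}(t), with A the fixed subset
    of weekdays (here: positions 1-5 of each 7-day block)."""
    in_A = (t - 1) % 7 < 5
    return np.where(in_A, phi1, phi2)

t = np.arange(1, 29)                    # four weeks of daily data
phi_t = switching_phi(t, phi1=0.7, phi2=0.3)
```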
Under appropriate assumptions, there is a theorem of almost sure consistency and convergence in law of the estimator of $\theta$ to a normal distribution, somewhat as in Theorem A1 of Appendix A. Note that strong consistency is proven here, not just convergence in probability. We will compare the BF theory with the AM theory in Section 6.
The BF approach has seen several developments. In particular, it has been extended to time-dependent bilinear models [
43], periodic bilinear models [
44], periodic ARMA models (PARMA) with periodic bilinear innovations [
45] and to GARCH models with time-varying coefficients [
46,
47]. More recent works on closely related models, such as weak ARMA models with regime changes [
48], prefer to find a stationary and ergodic solution to the model equation. There are few examples in these papers, a remarkable exception being [
47] with an application to daily gas spot prices.
5. A Comparison with the Theory of Locally Stationary Processes
In this section, we compare the AM approach described in
Section 2 with the LSP approach described in
Section 3. The basic model is (3), although the coefficients $\phi_{it}^{(n)}$ and $g_t^{(n)}$ depend on $t$ and $n$ through $t/n$ only. Although LSP can be MA and ARMA processes (see [
7]), the latter are rarely mentioned in the LSP literature. The bibliography of the overview paper [
9] mentions “moving average”, “MA” or “ARMA” only four times.
Dependency on rescaled time of the model coefficients, as well as of the innovation standard deviation, is assumed to be continuous and even differentiable in the initial LSP theory. We have mentioned that [9] suggested replacing that assumption with a bounded variation assumption (bounded variation functions are such that the derivative exists almost everywhere). In comparison, the other theories, including AM and BF (see [8]), accept discrete values of the coefficients according to time, without requiring a slow variation. They instead make assumptions of differentiability in the parameters.
Another point of discussion is as follows. To handle economic and social data with an annual seasonality, Box and Jenkins (see [14]) proposed the so-called seasonal ARMA processes, where the autoregressive and moving average polynomials are products of polynomials in the lag operator $L$ and polynomials in $L^s$, for some $s$, for example, $s = 12$ for monthly data or $s = 4$ for quarterly data. Although the series generated by these stochastic processes are not periodic, with suitable initial values, they can show a seasonality with period $s$. Let us consider such ARMA processes with time-dependent coefficients, for example, a tdAR(12) defined by the equation
$$x_t^{(n)} = \phi_t^{(n)} x_{t-12}^{(n)} + g_t^{(n)} \epsilon_t,$$
with the same notations as in Section 1. There are exactly 11 observations between times $t - 12$ and $t$, and an increase in the total number of observations would not affect that. For such processes, Approach 1 of doing asymptotics, described in Section 1, seems to be the most appropriate, assuming that there is a larger number of years, not that there is a larger number of months within a year. Of course, Approach 2 of doing asymptotics is perfectly valid in all cases where the frequency of observation is more or less arbitrary.
To conclude, the AM approach is better suited for economic time series, where we can imagine that more years will become available (see the left-hand part of
Figure 1). In other contexts, such as in biology and engineering, we can imagine that more data become available with an increased sampling rate (see the right-hand side of
Figure 1). Then, the LSP theory seems more appropriate.
In the following example, we will consider a tdAR(1) process but with the innovation standard deviation being a periodic function of time. Let us first show a unique artificial series of length 128 generated by (1) with
$$\phi_t = \phi' + \phi''\left(t - \frac{n+1}{2}\right), \qquad (13)$$
with given values of $\phi'$ and $\phi''$, where the $\epsilon_t$ are normally and independently distributed with mean 0 and variance 1, and the innovation standard deviation $g_t$ is a periodic function of $t$ with period 12, simulating seasonal heteroscedasticity for monthly data. Furthermore, $g_t$, which does not depend on $n$, assumes the values $1 + g$ and $1 - g$, each during six consecutive time points; hence, here, $g = 0.5$. We omitted the factor $(n-1)^{-1}$ in (13) since only one series length is considered. The series plotted in Figure 2 clearly shows a nonstationary pattern. The choices of $\phi'$ and $\phi''$ are such that the autoregressive coefficient follows a straight line that goes slightly above 1 at the end of the series (see Figure 3). The parameters are estimated using the exact Gaussian maximum likelihood method. The representation of $g_t$ makes use of an old implementation of intervention analysis for the innovation standard deviation [49]. The estimates of $\phi'$, $\phi''$ and $g$ (with their standard errors) are compatible with the true values. We provide the fit of $\phi_t$ and $g_t$, respectively, in Figure 3 and Figure 4.
Figure 5 and
Figure 6 give better insight into the relationships between the observations, broadly showing a negative autocorrelation during the first half of the series and a positive autocorrelation during the second half, as well as a small scatter during half of the year and a large scatter during the other half.
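A sketch of the data-generating process of this example follows; since the exact values of $\phi'$ and $\phi''$ are given only implicitly above, the numbers below are our own assumptions, chosen so that $\phi_t$ crosses 1 near the end of the series and $g_t$ alternates between 1.5 and 0.5 every six months:

```python
import numpy as np

n = 128
t = np.arange(1, n + 1)
rng = np.random.default_rng(0)

# Linear coefficient, cf. (13), ending slightly above 1 (assumed values):
# rises from about 0.18 at t = 1 to about 1.02 at t = 128
phi_t = 0.6 + 0.0066 * (t - (n + 1) / 2)

# Periodic innovation standard deviation with period 12: values 1 + g and
# 1 - g, g = 0.5, each kept during six consecutive months
g_t = np.where((t - 1) % 12 < 6, 1.5, 0.5)

x = np.zeros(n + 1)
eps = rng.normal(size=n)
for i in range(1, n + 1):
    x[i] = phi_t[i - 1] * x[i - 1] + g_t[i - 1] * eps[i - 1]
x = x[1:]
```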
Note, finally, that this example is not compatible with the LSP theory, since $|\phi_t| > 1$ for some $t$, and $g_t$, being piecewise constant, is not a differentiable function of time. Additionally, the asymptotics related to that theory would be difficult to interpret, since $g_t$ is periodic with a fixed period.
We ran Monte Carlo simulations using the same setup, except that polynomials of degree two were fitted for $\phi_t$ instead of a linear function of time. The parameterization is
$$\phi_t = \phi' + \phi''\left(t - \frac{n+1}{2}\right) + \phi'''\left(t - \frac{n+1}{2}\right)^{2}, \qquad (14)$$
and $g_t$ is a periodic function that oscillates between the two values, $1 + g$ and $1 - g$, defined as above. The estimation program is the same as in AM but extended to cover polynomials of time of degree up to three, for the AR (or similarly MA) coefficients as well as for $g_t$. Estimates are obtained by numerically maximizing the exact Gaussian likelihood.
Some 1000 series of length 128 were generated using a program written in MATLAB with Gaussian innovations. Note that the estimation results were obtained for 964 of the series only. They are provided in Table 2. Unfortunately, some estimates of the standard errors were unreliable, so their averages were useless and were replaced by medians. The estimates of the standard errors are quite close to the empirical standard deviations. The fact is that the results are not as good as in the simulation experiments described in Section 5 of [1], at least for series of 100 observations or more, perhaps because the basic assumptions are only barely satisfied, with $\phi_t$ going nearly up to, and slightly above, 1. In Table 3, we fitted the more adequate and simpler model with a linear function of time instead of a quadratic function of time. Then, the results were obtained for 999 series, and the estimated standard errors were always reliable, so that their averages across the simulations are displayed.
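A schematic version of this Monte Carlo protocol, including the rejection of series for which the fit fails and the use of medians for the standard errors, might look as follows; the data-generating values and the quadratic fit mirror the description above, and the covariance proxy from the BFGS inverse Hessian is a rough shortcut of ours, not the estimator of $V^{-1}WV^{-1}$ actually used:

```python
import numpy as np
from scipy.optimize import minimize

def generate(n, rng):
    """One series of the design above: linear phi_t and periodic g_t
    (numerical values assumed, as in the previous sketch)."""
    t = np.arange(1, n + 1)
    phi_t = 0.6 + 0.0066 * (t - (n + 1) / 2)
    g_t = np.where((t - 1) % 12 < 6, 1.5, 0.5)
    x = np.zeros(n + 1)
    eps = rng.normal(size=n)
    for i in range(1, n + 1):
        x[i] = phi_t[i - 1] * x[i - 1] + g_t[i - 1] * eps[i - 1]
    return x[1:]

def fit(x):
    """Gaussian QML fit of a quadratic phi_t, cf. (14), with constant g."""
    n = len(x)
    t = np.arange(2, n + 1)
    c = t - (n + 1) / 2
    def crit(theta):
        phi_t = theta[0] + theta[1] * c + theta[2] * c ** 2
        e = x[1:] - phi_t * x[:-1]
        g2 = np.exp(2.0 * theta[3])
        return np.sum(np.log(g2) + e ** 2 / g2)
    res = minimize(crit, np.zeros(4), method="BFGS")
    # crit is -2 log L up to a constant, so cov(theta_hat) ~ 2 * H^{-1}
    se = np.sqrt(2.0 * np.diag(res.hess_inv)) if res.success else None
    return res.x, se

rng = np.random.default_rng(123)
estimates, ses = [], []
for rep in range(1000):
    theta_hat, se = fit(generate(128, rng))
    if se is not None:                 # drop series where the fit fails
        estimates.append(theta_hat)
        ses.append(se)

med_se = np.median(np.array(ses), axis=0)   # medians, as for Table 2
```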
To end the section on a positive note, let us say that the three illustrative examples in Section 6 of [1] would give approximately the same results under the LSP theory, provided that the same estimation method is used.
6. A Comparison with the Theory of Cyclically Time-Dependent Models
In this section, we compare the AM approach described in
Section 2 with the BF approach described in
Section 4. The basic model is (3) without the superscripts $(n)$, since the coefficients $\phi_{it}$ and $g_t$ do not depend on $n$. Therefore, it is clear that there is no sense in comparing the LSP and the BF theories, which act on disjoint classes of processes.
We mentioned in Section 4 a few classes of processes to which the BF theory is applicable. The periodic ARMA or PARMA processes are surely compatible with the AM theory, even with an irrational period for the periodic functions of time (see also [2]). On the contrary, the switching model based on an i.i.d. sequence of Bernoulli experiments is not a particular case of the models treated in [1]. This is a feature of the BF theory that the two other theories cannot handle at first sight. When the switching model is based on a fixed subset of integers $A$, the AM theory can be adapted, especially in the weekdays versus weekend example. On the other hand, Examples 2–5 of [1] are incompatible with the BF theory, since the coefficients depend on $n$.
The basic assumptions of BF are different from those of AM. A comparison is difficult here, but it is interesting to note the less restrictive assumption of the existence of fourth-order moments, not eighth-order as in AM. Note, however, that [16] has removed that requirement from the AM theory. Also, the expression for $I$ in [8], which corresponds to our $W$ in (9), did not involve fourth-order moments, since no parameter was involved in the heteroscedasticity.
The process considered in Section 5 was an example for which the theory of locally stationary processes would not apply, because of the periodicity in the innovation standard deviation. That process is also an example for which the BF theory would not apply, because the autoregressive coefficient is a function of $t$ and $n$, not only of $t$.
7. Conclusions
This paper was motivated by suggestions to see if the results in [1] simplify much in the case of autoregressive or even tdAR(1) processes and by requests to compare the AM approach more deeply with others and to push it into harder situations. We recalled the main result in Appendix A.
We showed that there are not many simplifications for tdAR processes, perhaps due to the intrinsically complex nature of ARMA processes with time-dependent coefficients. Nevertheless, we were able to simplify one of the assumptions for tdAR(1) processes. We took the opportunity of this study on autoregressive processes with time-dependent coefficients to develop, in Appendix B, an alternative approach based on a $\rho$-mixing condition instead of the strange assumption made in AM. At least we could check the latter assumption in some examples, which is not yet the case for the mixing condition. Note that a mixing approach was the first one we tried, before preferring the AM assumption; the latter could be extended to the tdMA and tdARMA processes, which was not the case for the mixing condition. Although the theoretical results for tdAR(2) processes could not be shown in closed-form expressions, the simulations in Appendix C indicate that the method is robust when causality becomes questionable. We showed simulations that put more stress on the method than those in AM. ARIMA models could have been used for these simulations and examples, but this paper focused on autoregressive processes. Practical examples of tdARMA models were already given in [1] and [8].
We also compared the AM approach to others, especially the Dahlhaus LSP theory [7] and the BF approach [8], aimed at cyclically time-dependent linear processes. Let us comment on this more deeply.
As in the LSP theory, a different process is considered by AM for each $n$. There are, however, several differences between the two approaches: (a) AM can cope with periodic evolutions with a fixed period, either in the coefficients or in the variance; (b) AM does not assume differentiability with respect to time but does with respect to the parameters; (c) to compensate, AM makes other assumptions that are more difficult to check, (d) which may explain why the LSP theory is more widely applicable: other models than just ARMA models, other estimation methods than maximum likelihood, even semi-parametric methods, the existence of a LAN approach, etc.; (e) AM is purely time-domain oriented, whereas the LSP theory is based on a spectral representation. An example with an economic inspiration and its associated simulation experiments showed that some of the assumptions of AM are less restrictive, but there is no doubt that others are more stringent. In our opinion, the field of applications can influence the kind of asymptotics. The Dahlhaus LSP approach is surely well adapted to signal measurements in biology and engineering, where the sampled span of time is fixed and the sampling interval is more or less arbitrary. This is not true in economics and management, where (a) time series models are primarily used to forecast a flow variable, like sales or production, obtained by accumulating data over a given span of time, a month or a quarter, so (b) the sampling period is fixed, and (c) moreover, some degree of periodicity is induced by seasonality. Here, it is difficult to assume that more observations become available during one year without strongly affecting the model. For that reason, even if the so-called seasonal ARMA processes, which are nearly the rule for economic data, are formally special cases of locally stationary processes, the LSP way of doing asymptotics is not adequate for them. For the same reason, rescaling time is not natural when the coefficients are periodic functions of time.
Going now to a comparison of AM with the BF approach, mainly aimed at cyclically time-dependent linear processes, we see that the first fundamental difference is the fact that a different process is considered for each $n$ in AM, not in BF. That assumption of dependency on $n$ as well as on $t$ was introduced to be able to do asymptotics in cases that would not have been possible otherwise (except by adopting the Dahlhaus approach, of course), while, at the same time, making it possible to represent a periodic behavior. When the coefficients depend only on $t$, not on $n$, the AM and BF approaches come close in the sense that (a) the estimation methods are close and (b) the assumptions are quite similar. The example shown to distinguish AM from LSP also illuminates the differences between AM and BF. There remains that the switching model based on an i.i.d. Bernoulli process is not feasible in the AM approach.
In some sense, AM can be seen as partly taking some features of both the LSP and BF approaches. Some features, like the periodicity of the innovation variance, can be handled well in BF, while others, like slowly time-varying coefficients, are in the scope of LSP. However, a cyclical behavior of the innovation variance together with slowly varying coefficients (or the contrary: a cyclical behavior of the coefficients and a slowly varying innovation variance) is not covered by the Dahlhaus and BF theories but is by AM. The example in Section 5 may look artificial but includes all the characteristics that are not covered well by locally stationary processes and the corresponding asymptotic theory. It includes a time-dependent first-order autoregressive coefficient $\phi_t$, which is very realistic for an I(0) (i.e., not integrated) economic time series, and an innovation variance $g_t^2$, which is a periodic function of time (this can be explained by seasonality, as in a winter/summer effect). To emphasize the differences with the LSP approach, we assumed that $\phi_t$ goes slightly outside of the causality (or stationarity, in the Dahlhaus terminology) region for some time and that $g_t$ is piecewise constant and, hence, not compatible with differentiability at each time.
One aspect was not discussed in this paper: how to specify the time dependence of the model coefficients. For the example of Box and Jenkins series D, we mentioned the computation of autocorrelations on parts of the series. Otherwise, our examples were based on polynomial representations with respect to time, and we used tests to possibly reduce the degree of the polynomial. Other parameterizations than polynomials can be considered.