2.1. The Existing NoVaS Method
The NoVaS method is a model-free prediction principle. The main idea lies in applying an invertible transformation $H$, which maps the non-normal vector $\{X_1,\ldots,X_n\}$ to a vector $\{W_1,\ldots,W_n\}$ whose components are (approximately) i.i.d. $N(0,1)$. This leads to the prediction of $X_{n+1}$ by inversely transforming the prediction of $W_{n+1}$ [11]. The starting point to build the transformation of the existing NoVaS method is the ARCH model [12]. Then, Politis [1] made some adjustments to determine the final form of $H$ as:
\begin{equation}
W_t = \frac{X_t}{\sqrt{\alpha s_{t-1}^2 + a_0 X_t^2 + \sum_{i=1}^{p} a_i X_{t-i}^2}},\quad t = p+1,\ldots,n. \tag{1}
\end{equation}
In Equation (1), $\{X_t\}$ is the log-returns vector in this article; $\{W_t\}$ is the transformed vector, which we hope to transform to i.i.d. $N(0,1)$; $\alpha$ is a fixed-scale invariant constant; $s_{t-1}^2$ is calculated by $s_{t-1}^2 = (t-1)^{-1}\sum_{k=1}^{t-1}(X_k - \bar{X})^2$, with $\bar{X}$ being the mean of $\{X_1,\ldots,X_{t-1}\}$; $a_0$ is the coefficient corresponding with the currently observed value $X_t^2$. To reach a qualified transformation function, Equation (2) is required to stabilize the variance:
\begin{equation}
\alpha \ge 0;\quad a_i \ge 0 \ \text{for all}\ 0 \le i \le p;\quad \alpha + \sum_{i=0}^{p} a_i = 1. \tag{2}
\end{equation}
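As a concrete illustration, the transformation of Equation (1) can be sketched in a few lines of Python. The function name and the fixed, unoptimized coefficients below are our own illustrative choices, not part of the original method:

```python
import numpy as np

def novas_transform(x, alpha, a, p):
    """Map a return series x to {W_t} via Equation (1).

    a = (a_0, a_1, ..., a_p); alpha + sum(a) should equal 1 (Equation (2)).
    """
    x = np.asarray(x, dtype=float)
    w = []
    for t in range(p, len(x)):                      # t = p+1, ..., n (0-based index)
        past = x[:t]                                # X_1, ..., X_{t-1}
        s2 = np.mean((past - past.mean()) ** 2)     # s_{t-1}^2 about the sample mean
        denom = alpha * s2 + sum(a[i] * x[t - i] ** 2 for i in range(p + 1))
        w.append(x[t] / np.sqrt(denom))
    return np.array(w)

rng = np.random.default_rng(0)
x = rng.standard_normal(300)                        # toy return series
p = 5
a = np.full(p + 1, 0.8 / (p + 1))                   # illustrative a_i >= 0
alpha = 1.0 - a.sum()                               # enforce Equation (2)
w = novas_transform(x, alpha, a, p)
```

Because the denominator contains $a_0 X_t^2$, every $|W_t|$ is bounded by $1/\sqrt{a_0}$, a fact used repeatedly below.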
Then, $\alpha$ and the $\{a_i\}$ are finally determined by minimizing $\lvert \widehat{\mathrm{KURT}}(W_t) - 3 \rvert$, the distance of the sample kurtosis of $\{W_t\}$ from that of the standard normal distribution. In practice, the transformed $\{W_t\}$ is usually uncorrelated; see [11] for additional processes for correlated $\{W_t\}$. This method is model-free in the sense that we do not assume any particular distribution for the innovation $W_t$ except for matching its kurtosis to 3. Once $H$ is found, $H^{-1}$ can be obtained immediately. For example, the $H^{-1}$ corresponding with Equation (1) is:
\begin{equation}
X_t = \frac{W_t\sqrt{\alpha s_{t-1}^2 + \sum_{i=1}^{p} a_i X_{t-i}^2}}{\sqrt{1 - a_0 W_t^2}}. \tag{3}
\end{equation}
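A short numerical check confirms that Equation (3) indeed inverts Equation (1); the coefficients below are arbitrary illustrative values satisfying Equation (2):

```python
import numpy as np

rng = np.random.default_rng(1)
p, alpha = 3, 0.2
a = np.array([0.1, 0.3, 0.2, 0.2])       # (a_0, a_1, a_2, a_3); alpha + sum(a) = 1
x = rng.standard_normal(50)

t = 40                                    # any index with at least p past values
past = x[:t]
s2 = np.mean((past - past.mean()) ** 2)   # s_{t-1}^2
A = alpha * s2 + sum(a[i] * x[t - i] ** 2 for i in range(1, p + 1))

# Forward map, Equation (1): W_t = X_t / sqrt(A + a_0 * X_t^2)
w_t = x[t] / np.sqrt(A + a[0] * x[t] ** 2)

# Inverse map, Equation (3): recover X_t from W_t and past values only
x_rec = w_t * np.sqrt(A) / np.sqrt(1.0 - a[0] * w_t ** 2)
```

Here `x_rec` equals `x[40]` up to floating-point error. Note the denominator $\sqrt{1 - a_0 W_t^2}$: it is exactly the term that can blow up when $W_t$ approaches $\pm 1/\sqrt{a_0}$, which motivates Section 2.2.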
To obtain the prediction of $X_{n+1}$, Politis [11] defined two types of optimal predictors under the $L_1$ (Mean Absolute Deviation) and $L_2$ (Mean Squared Error) criteria after observing the historical information set $\mathcal{F}_n = \{X_1,\ldots,X_n\}$:
\begin{equation}
\hat{X}_{n+1}^{L_1} = \operatorname{median}\bigl\{\hat{X}_{n+1}^{(m)}\bigr\}_{m=1}^{M},\qquad \hat{X}_{n+1}^{L_2} = \operatorname{mean}\bigl\{\hat{X}_{n+1}^{(m)}\bigr\}_{m=1}^{M}, \tag{4}
\end{equation}
where the $\{W_{n+1}^{(m)}\}_{m=1}^{M}$ underlying the pseudo-predictions $\{\hat{X}_{n+1}^{(m)}\}$ are generated $M$ times from its empirical distribution or a normal distribution. Here, the normal distribution is an asymptotic limit of the empirical distribution of $\{W_t\}$. More details about this procedure and multi-step prediction are presented in Section 2.2.
The pseudo-predictions $\{\hat{X}_{n+1}^{(m)}\}$ are given by plugging $W_{n+1}^{(m)}$ into Equation (3) and setting $t$ as $n+1$. During the optimization process, different forms of the unknown parameters in Equation (2) are applied so that various NoVaS methods are established. Chen [13] pointed out that the Generalized Exponential NoVaS (GE-NoVaS) method, with the exponentially decayed unknown parameters presented in Equation (5), is superior to other NoVaS-type methods:
\begin{equation}
a_i = c' e^{-c i}\ \text{for}\ 1 \le i \le p,\qquad c' = \frac{1 - \alpha - a_0}{\sum_{k=1}^{p} e^{-ck}}, \tag{5}
\end{equation}
where $\alpha$, $a_0$ and $c$ are unknown parameters and $c'$ normalizes the coefficients so that Equation (2) still holds.
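Under our reading of Equation (5), the exponentially decayed coefficients can be generated as follows; treat the exact parametrization (in particular the normalization via $c'$) as an assumption based on the surrounding text:

```python
import numpy as np

def ge_novas_coeffs(alpha, a0, c, p):
    """a_i = c' * exp(-c * i), i = 1..p, with c' chosen so Equation (2) holds."""
    decay = np.exp(-c * np.arange(1, p + 1))
    c_prime = (1.0 - alpha - a0) / decay.sum()   # normalization constant c'
    return c_prime * decay

# Illustrative parameter values, not optimized ones:
a = ge_novas_coeffs(alpha=0.1, a0=0.05, c=0.5, p=10)
```

The attraction of this scheme is that, for any order $p$, only the three scalars $\alpha$, $a_0$ and $c$ need to be searched over, with $c'$ fixed by the variance-stabilizing constraint.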
2.2. A New Method with Fewer Parameters
However, during our investigation, we found that the GE-NoVaS method sometimes returns extremely large predictions under the $L_2$ criterion. The reason for this phenomenon is that the denominator of Equation (3) will be quite small when the generated $W_{n+1}^{(m)}$ (from the empirical or normal distribution) is very close to $\pm 1/\sqrt{a_0}$. In this situation, the prediction error is amplified. Moreover, when long-term-ahead prediction is desired, this amplification accumulates and the final prediction deteriorates. Therefore, a removing-$a_0$ idea is proposed in this article to avoid such issues.
The $H$ and $H^{-1}$ of the GE-NoVaS-without-$a_0$ method can be rewritten as below:
\begin{equation}
W_t = \frac{X_t}{\sqrt{\alpha s_{t-1}^2 + \sum_{i=1}^{p} a_i X_{t-i}^2}},\quad t = p+1,\ldots,n. \tag{6}
\end{equation}
We should notice that, even without the $a_0 X_t^2$ term, the causal prediction rule is still satisfied. It is easy to obtain the analytical form of the first-step-ahead $X_{n+1}$, which can be expressed as below:
\begin{equation}
X_{n+1} = W_{n+1}\sqrt{\alpha s_n^2 + \sum_{i=1}^{p} a_i X_{n+1-i}^2}. \tag{7}
\end{equation}
More specifically, when the first-step GE-NoVaS-without-$a_0$ prediction is performed, $\{W_{n+1}^{(m)}\}_{m=1}^{M}$ are generated $M$ (i.e., 5000 in this article) times from a standard normal distribution by the Monte Carlo method or bootstrapped from its empirical distribution $\hat{F}_W$, which is calculated from Equation (1). Then, plugging these $W_{n+1}^{(m)}$ into Equation (7), $M$ pseudo-predictions $\{\hat{X}_{n+1}^{(m)}\}_{m=1}^{M}$ are obtained. According to the strategy implied by Equation (4), we choose the $L_1$- and $L_2$-risk optimal predictors $\hat{X}_{n+1}$ as the sample median and mean of $\{\hat{X}_{n+1}^{(m)}\}_{m=1}^{M}$, respectively. We can even predict a general form of $X_{n+1}$, such as $X_{n+1}^2$, by adopting the sample mean or median of $\{(\hat{X}_{n+1}^{(m)})^2\}_{m=1}^{M}$. Similarly, the two-steps-ahead $X_{n+2}$ can be expressed as:
\begin{equation}
X_{n+2} = W_{n+2}\sqrt{\alpha s_{n+1}^2 + a_1 X_{n+1}^2 + \sum_{i=2}^{p} a_i X_{n+2-i}^2}. \tag{8}
\end{equation}
When the prediction of $X_{n+2}$ is required, $M$ pairs of $\{W_{n+1}^{(m)}, W_{n+2}^{(m)}\}$ are still generated by the bootstrap or Monte Carlo method from the empirical or standard normal distribution, respectively. In Equation (8), $X_{n+1}$ is replaced by the predicted value $\hat{X}_{n+1}$, which is derived from running the first-step GE-NoVaS-without-$a_0$ prediction with simulated $W_{n+1}^{(m)}$ under the $L_1$ or $L_2$ criterion. Subsequently, we choose the $L_1$- and $L_2$-risk optimal predictors of $X_{n+2}$ as the sample median and mean of $\{\hat{X}_{n+2}^{(m)}\}_{m=1}^{M}$.
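The first-step procedure described above can be sketched as follows, with illustrative coefficients in place of optimized ones and Monte Carlo draws rather than the bootstrap:

```python
import numpy as np

rng = np.random.default_rng(2)
p, alpha = 5, 0.2
a = np.array([0.20, 0.18, 0.16, 0.14, 0.12])   # a_1..a_p (no a_0); alpha + sum(a) = 1
x = rng.standard_normal(250) * 0.01             # toy log-return history X_1..X_n

s2_n = np.mean((x - x.mean()) ** 2)             # s_n^2
A = alpha * s2_n + sum(a[i - 1] * x[-i] ** 2 for i in range(1, p + 1))

M = 5000
w_star = rng.standard_normal(M)                 # M simulated draws of W_{n+1}
x_pseudo = w_star * np.sqrt(A)                  # Equation (7), one value per draw

pred_L1 = np.median(x_pseudo)                   # L1-risk optimal predictor
pred_L2 = np.mean(x_pseudo)                     # L2-risk optimal predictor
pred_sq = np.mean(x_pseudo ** 2)                # predictor of X_{n+1}^2 (volatility proxy)
```

Without the $a_0$ term there is no $\sqrt{1 - a_0 W^2}$ denominator, so no draw of `w_star` can blow the pseudo-prediction up.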
Finally, iterating the process described above, we can accomplish multi-step-ahead NoVaS predictions. The $h$-step-ahead $X_{n+h}$ can be expressed as:
\begin{equation}
X_{n+h} = W_{n+h}\sqrt{\alpha s_{n+h-1}^2 + \sum_{i=1}^{p} a_i X_{n+h-i}^2}. \tag{9}
\end{equation}
To obtain the prediction of $X_{n+h}$, we generate $M$ sets of $\{W_{n+1}^{(m)},\ldots,W_{n+h}^{(m)}\}$ and plug them in, together with the NoVaS predicted values $\{\hat{X}_{n+1},\ldots,\hat{X}_{n+h-1}\}$, which are computed iteratively. The $L_1$- and $L_2$-risk optimal predictors of $X_{n+h}$ are computed by the sample median and mean of $\{\hat{X}_{n+h}^{(m)}\}_{m=1}^{M}$. In short, we can summarize that $\hat{X}_{n+h}$ is determined by:
\begin{equation}
\hat{X}_{n+h} = f\bigl(W_{n+1},\ldots,W_{n+h};\, X_1,\ldots,X_n\bigr). \tag{10}
\end{equation}
Since $\{X_1,\ldots,X_n\}$ is the observed information set, we can simplify the expression of $\hat{X}_{n+h}$ as:
\begin{equation}
\hat{X}_{n+h} = f\bigl(W_{n+1},\ldots,W_{n+h}\bigr). \tag{11}
\end{equation}
For applying the GE-NoVaS method, we can still build the relationship between $\hat{X}_{n+h}$ and $\{W_{n+1},\ldots,W_{n+h}\}$ as:
\begin{equation}
\hat{X}_{n+h} = g\bigl(W_{n+1},\ldots,W_{n+h}\bigr). \tag{12}
\end{equation}
We should notice that the simulated $\{W^{(m)}\}$ for obtaining the GE-NoVaS prediction of $X_{n+h}$ should be generated by the bootstrap or Monte Carlo method from the empirical or trimmed standard normal distribution, respectively. The reason for using the trimmed distribution is that $\lvert W_t \rvert \le 1/\sqrt{a_0}$ from Equation (1). Here, we summarize Algorithm 1 to perform $h$-step-ahead time-aggregated prediction using the GE-NoVaS-without-$a_0$ method. The algorithm of GE-NoVaS can be written out similarly.
Remark (the advantage of removing the $a_0$ term): First, after removing the $a_0 X_t^2$ term, the prediction of the NoVaS method under the $L_2$ criterion is more stable. More details will be shown in Section 2.3. Second, removing $a_0$ can also lead to less time complexity for our new method. The reason for this phenomenon is simple. If we consider the limiting distribution of the $\{W_t\}$ series, $1/\sqrt{a_0}$ is required to be larger than or equal to 3 to ensure that $W_t$ has a sufficiently large range, i.e., $a_0$ is required to be less than or equal to $1/9 \approx 0.111$ (recall that almost all of the mass of standard normal data lies within $[-3, 3]$). However, the optimal combination of NoVaS coefficients may not render a suitable $a_0$. In this situation, we need to increase the NoVaS transformation order $p$ and repeat the normalizing and variance-stabilizing process till the $a_0$ in the optimal combination of coefficients is suitable. This repeating process definitely increases the computational workload.
Algorithm 1: The $h$-step-ahead prediction for the GE-NoVaS-without-$a_0$ method.
Step 1: Define a grid of possible $\alpha$ values, $\{\alpha_1,\ldots,\alpha_K\}$. Fix $\alpha = \alpha_k$, then calculate the optimal combination of coefficients $\{\alpha_k; a_1,\ldots,a_p\}$ of the GE-NoVaS-without-$a_0$ method, which minimizes $\lvert \widehat{\mathrm{KURT}}(W_t) - 3 \rvert$.
Step 2: Derive the analytic form of Equation (11) using $\{\alpha_k; a_1,\ldots,a_p\}$ from the first step.
Step 3: Generate $\{W_{n+1}^{(m)},\ldots,W_{n+h}^{(m)}\}$ $M$ times from a standard normal distribution or the empirical distribution $\hat{F}_W$. Plug these into the analytic form of Equation (11) to obtain $M$ pseudo-values $\{\hat{X}_{n+h}^{(m)}\}_{m=1}^{M}$.
Step 4: Calculate the optimal predictor of $X_{n+h}$ by taking the sample mean (under the $L_2$ risk criterion) or sample median (under the $L_1$ risk criterion) of the set $\{\hat{X}_{n+h}^{(m)}\}_{m=1}^{M}$.
Step 5: Repeat the above steps with different $\alpha$ values from $\{\alpha_1,\ldots,\alpha_K\}$ to get $K$ prediction results.
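A compact sketch of Steps 2 to 4 follows. Instead of deriving the analytic form of Equation (11) symbolically, it propagates each simulated $W$-path through the recursion of Equation (9), which is numerically equivalent; the coefficients are assumed given, so Step 1's kurtosis optimization is omitted:

```python
import numpy as np

def h_step_predict(x, alpha, a, h, M=1000, criterion="L2", rng=None):
    """h-step-ahead GE-NoVaS-without-a_0 predictor of X_{n+h}^2 (sketch).

    Each of the M simulated paths plugs fresh N(0,1) draws of
    W_{n+1}, ..., W_{n+h} into the recursion of Equation (9).
    """
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x, dtype=float)
    p = len(a)
    preds = np.empty(M)
    for m in range(M):
        xs = list(x)                                # simulated sample path
        for _ in range(h):
            past = np.asarray(xs)
            s2 = np.mean((past - past.mean()) ** 2)
            A = alpha * s2 + sum(a[i - 1] * xs[-i] ** 2 for i in range(1, p + 1))
            xs.append(rng.standard_normal() * np.sqrt(A))   # Equation (9)
        preds[m] = xs[-1] ** 2                      # pseudo-value of X_{n+h}^2
    return np.mean(preds) if criterion == "L2" else np.median(preds)

rng = np.random.default_rng(3)
x = rng.standard_normal(250) * 0.01                 # toy log-return window
a = np.array([0.20, 0.18, 0.16, 0.14, 0.12])        # illustrative coefficients
pred = h_step_predict(x, alpha=0.2, a=a, h=5, M=500)
```

Path simulation and the analytic plug-in of Equation (11) give the same pseudo-values; the path form simply trades symbolic bookkeeping for a loop.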
2.3. The Potential Instability of the GE-NoVaS Method
Next, we provide an illustration to compare the GE-NoVaS and GE-NoVaS-without-$a_0$ methods in predicting the volatility of the Microsoft Corporation (MSFT) daily closing price from 8 January 1998 to 31 December 1999, and we show an interesting finding that the long-term time-aggregated predictions of the GE-NoVaS method are unstable under the $L_2$ criterion. Based on the finding of Awartani and Corradi [14], squared log-returns can be used as a proxy for volatility to render a correct ranking of different GARCH models in terms of a quadratic loss function. The log-return series $\{X_t\}$ can be computed by the equation shown below:
\begin{equation}
X_t = \log P_t - \log P_{t-1}, \tag{13}
\end{equation}
where $\{P_t\}$ is the corresponding MSFT daily closing price series. To achieve a comprehensive comparison, we use 250 financial log-returns as a sliding window to perform POOS (pseudo-out-of-sample) 1-step, 5-step and 30-step (long-term) ahead time-aggregated predictions under the $L_2$ criterion. Then, we roll this window through the whole dataset, i.e., we use $\{X_1,\ldots,X_{250}\}$ to predict the aggregated squared log-returns over periods $251$, $251$ to $255$ and $251$ to $280$; then, we use $\{X_2,\ldots,X_{251}\}$ to predict those over periods $252$, $252$ to $256$ and $252$ to $281$, for 1-step, 5-step and 30-step aggregated predictions, respectively, and so on. We can define all 1-step, 5-step and 30-step-ahead time-aggregated predictions as $\hat{Y}_1$, $\hat{Y}_5$ and $\hat{Y}_{30}$, which are presented as below:
\begin{equation}
\hat{Y}_h = \frac{1}{h}\sum_{i=1}^{h}\widehat{X^2_{n+i}},\quad h = 1, 5, 30. \tag{14}
\end{equation}
In Equation (14), the $\widehat{X^2_{n+i}}$ are single-step predictions of squared log-returns by the two NoVaS-type methods. To obtain the "Prediction Errors" of the two methods, we can calculate the "loss" by comparing the aggregated prediction results with the realized aggregated values based on Equation (15):
\begin{equation}
\text{Prediction Error} = \frac{1}{N}\sum_{k=1}^{N}\bigl(\hat{Y}_{h,k} - Y_{h,k}\bigr)^2, \tag{15}
\end{equation}
where the $Y_{h,k}$ are the realized aggregated squared log-returns and $N$ is the number of rolling windows. To show the potential instability of the GE-NoVaS method under the $L_2$ criterion, we take $\alpha$ to be 0.5 to build a toy example. In the algorithm when performing the GE-NoVaS method, $\alpha$ could take an optimal value from a discrete set based on the prediction performance.
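The rolling-window evaluation of Equations (14) and (15) can be mimicked as below. For brevity, a placeholder single-step predictor (the window's sample variance) stands in for the NoVaS predictions, the price series is simulated rather than the MSFT data, and aggregation by averaging follows our reading of Equation (14):

```python
import numpy as np

rng = np.random.default_rng(4)
prices = 100 * np.exp(np.cumsum(rng.standard_normal(601) * 0.01))  # stand-in prices
x = np.diff(np.log(prices))                    # Equation (13): log-returns
win, h = 250, 5                                # sliding window length and horizon

pred_agg, real_agg = [], []
for start in range(len(x) - win - h + 1):
    window = x[start:start + win]
    # Placeholder for the NoVaS single-step predictions of squared returns:
    step_preds = np.full(h, window.var())
    pred_agg.append(step_preds.mean())         # Equation (14): aggregated prediction
    future = x[start + win:start + win + h]
    real_agg.append(np.mean(future ** 2))      # realized aggregated value
pred_agg = np.array(pred_agg)
real_agg = np.array(real_agg)

loss = np.mean((pred_agg - real_agg) ** 2)     # Equation (15): prediction error
```

Swapping the placeholder line for the NoVaS predictors of the previous subsections reproduces the POOS comparison, with the same `loss` scalar per method and horizon.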
From Figure 1, we can clearly see that the GE-NoVaS-without-$a_0$ method better captures the true time-aggregated features at each horizon. On the other hand, the GE-NoVaS method returns unstable results for 30-step-ahead time-aggregated predictions. Besides, we can see that the 1-step-ahead POOS prediction returned by the GE-NoVaS method is almost a flat curve, which is practically meaningless. Similarly, for the 5-step-ahead time-aggregated prediction case, the POOS prediction of the GE-NoVaS method fails to match the true time-aggregated values.