Financial Time Series Forecasting Using Empirical Mode Decomposition and Support Vector Regression

Abstract: We introduce a multistep-ahead forecasting methodology that combines empirical mode decomposition (EMD) and support vector regression (SVR). This methodology is based on the idea that the forecasting task is simplified by using as input for SVR the time series decomposed with EMD. The outcomes of this methodology are compared with benchmark models commonly used in the literature. The results demonstrate that the combination of EMD and SVR can outperform benchmark models significantly, predicting the Standard & Poor’s 500 Index from 30 s to 25 min ahead. The high-frequency components better forecast short-term horizons, whereas the low-frequency components better forecast long-term horizons.


Introduction
Forecasting financial data is a challenging task. Future prices are difficult to predict because market imperfections are quickly discovered, exploited and corrected by market participants. Nonetheless, forecasting financial time series is a very active research area with applications spanning from trading strategies to risk management (Alexander 2001; Aymanns et al. 2016; Caccioli et al. 2016; Clements et al. 2004; Kim 2003; Varga-Haszonits et al. 2016).
There are a vast number of methodologies used for forecasting purposes (Brooks 2014). Over the last few decades, efforts to improve forecasting techniques have also included the use of empirical mode decomposition (EMD) (Huang et al. 1998). The EMD method decomposes the signal into a finite set of nearly orthogonal oscillating components, called intrinsic mode functions (IMFs). IMFs have characteristic time-scales of oscillations defined by the local maxima and the local minima of the data; they are retrieved from the data itself without imposing any functional form.
There are two major challenges associated with forecasting financial time series: (1) nonstationarity (i.e., the statistical properties of the time series change with time); (2) multi-scaling (i.e., the statistical properties of the time series change with time-horizons) (Di Matteo 2007; Nava et al. 2016a, 2016b). EMD brings two elements that directly address both issues. First, the IMF components are locally stationary, oscillating around zero. Indeed, we can describe the EMD as a highly adaptable and granular detrending procedure where, starting from high frequencies, the local trend of each IMF component is contained within the cycle of the next component. Second, each IMF is associated with a characteristic oscillation period, and, as such, each component is associated with a characteristic time-scale and time-horizon. The general idea behind the use of EMD for forecasting purposes is therefore that dividing a signal into IMF components and a residual component can reduce the complexity of the time series, separating trends and oscillations at different scales and improving in this way the forecasting accuracy at specific time-horizons.
The application of EMD to forecasting has already been explored in the literature. For instance, Yu et al. (2008) demonstrated that this timescale decomposition is indeed an efficient methodology following the "divide-and-conquer" philosophy. Such a divide-and-conquer philosophy has been used in different areas: crude oil spot prices (Yu et al. 2008), foreign exchange rates (Lin et al. 2012), market stock indices (Cheng and Wei 2014), wind speed (Wang et al. 2014), computer sales (Lu and Shao 2012), and tourism demand (Chen et al. 2012), to mention a few. For instance, a hybrid EMD combined with an artificial neural network (ANN) approach was used by Liu et al. (2012) to forecast one-, two- and three-steps-ahead wind-speed time series. Different forecasting powers from high- to low-frequency and long-term trend components were investigated by Zeng and Qu (2014), who showed the forecasting effectiveness of EMD combined with ANNs for the Baltic Dry Index (BDI). We note that in the literature, the EMD is often applied to the whole dataset before "forecasting" (Chen et al. 2012; Cheng and Wei 2014; Lin et al. 2012; Lu and Shao 2012; Wang et al. 2014; Yu et al. 2008). This implies that the data to be forecasted are themselves used to construct the EMD, which therefore contains future information not obtainable in real forecasting scenarios. This could explain the good performance of some of the proposed EMD-based models.
In this paper we combine EMD and support vector regression (SVR) (Christianini and Shawe-Taylor 2000; Kim 2003; Smola and Schölkopf 2004; Suykens et al. 2002; Tay and Cao 2001) techniques. The main purpose of the paper is to demonstrate that EMD can improve forecasting and, in particular, that the high- and low-frequency components are associated with different forecasting powers for short and long time-horizons, respectively. We have chosen SVR as a forecasting tool because it is a general nonlinear regression methodology that has been proved to be effective for the prediction of financial time series, and it is particularly suited to handle multiple inputs. We have tested several ways of using the EMD method combined with both direct and recursive SVR forecasting strategies (Kazem et al. 2013; Lu et al. 2009; Tay and Cao 2001). We have used both univariate and multivariate settings, and we have used both single EMD components and combinations of them. Our main finding is the identification of the best EMD-SVR strategies among various combinations. We report results for the prediction of the S&P 500 from 30 s to 25 min ahead, showing that the EMD-SVR methodology significantly outperforms other forecasting methodologies, including SVR applied to the original time series. In this paper, we have divided the dataset into training and testing sets and have applied EMD on the training set only, forecasting the testing set without using information from it.
This paper is organized as follows. Performances of the EMD-SVR methodology for the prediction of the Standard & Poor's 500 (S&P 500) index up to 25 min ahead are reported in Section 2. Discussions of the outcomes are provided in Section 3. The methodology used to combine EMD and SVR, originally proposed in this paper, is outlined in Section 4. Conclusions are drawn in Section 5.

Data
The combination of EMD and SVR for financial time series forecasting proposed in this paper (see Section 4) has been tested on the S&P 500 index. The dataset consisted of 128 days of intraday data sampled at 30 s intervals. The forecasting was performed independently for each single day using a training sample of 500 prices (4 h and 10 min) and forecasting the following h = 50 steps ahead (25 min). We note that the trading day for the S&P 500 index is between 08:30 and 15:15, of which we excluded the first 5 min. We therefore forecasted the period 12:45-13:10. The choice of this forecasting interval was an incidental consequence of the procedure. We also tested some later intervals, finding comparable outcomes. We note that the purpose of this paper is not to produce a methodology for trading but to demonstrate that, by using EMD, one can obtain better forecasting results than with benchmark methods. The problem of the size of the training set is common across all methodologies, and, to avoid trading only in the second half of the day, one can adopt several tricks. However, this is beyond the scope of this paper. The EMD construction and the SVR calibration were performed exclusively using the training sample. Forecasting results were computed exclusively for the following testing sample. We note that in the literature, EMD is instead sometimes applied to the whole dataset, including the testing part (see, e.g., Chen et al. 2012; Cheng and Wei 2014; Lin et al. 2012; Lu and Shao 2012; Wang et al. 2014; Yu et al. 2008). The inclusion of (future) testing data in the forecasting methodology is clearly wrong, providing meaningless "forecasts".
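The timing arithmetic in this section can be checked with a short sketch. The constant and function names below are illustrative assumptions; only the numbers (30 s sampling, 500 training prices, 50 forecast steps and a first sample at 08:35) are taken from the text.

```python
from datetime import datetime, timedelta

SAMPLE_SECONDS = 30   # sampling interval stated in the text
N_TRAIN = 500         # training prices per day
H_MAX = 50            # largest forecast horizon, in steps


def day_windows(first_sample):
    """Return (train_start, train_end, test_end) for one trading day."""
    step = timedelta(seconds=SAMPLE_SECONDS)
    train_end = first_sample + N_TRAIN * step   # 500 * 30 s = 4 h 10 min
    test_end = train_end + H_MAX * step         # 50 * 30 s = 25 min
    return first_sample, train_end, test_end


# The 08:30 open plus the 5 excluded minutes gives 08:35 as the first sample.
train_start, train_end, test_end = day_windows(datetime(2014, 8, 7, 8, 35))
print(train_end.time(), test_end.time())
```

Under these assumptions the training window ends at 12:45 and the testing window ends at 13:10, which matches the forecasting period quoted above.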

Intraday Forecasting: Example of a Single Time Series for 7 August 2014
For the sake of clarity, we first exemplify our forecasting methodology and calibration procedure for one day of the S&P 500 index (7 August 2014), keeping in mind that we performed the same analysis on the time series for all the other days. The day was chosen at random and has no special significance. It appeared to be an "ordinary" trading day. Figure 1 illustrates the behavior of the price during this particular day. The first step of the forecasting methodology is the application of the EMD to the training data (see Section 4.1, Equation (1)). In this particular case, we obtained n = 5 IMF components and one residue by using the stopping criteria of Rilling et al. (2003), which take into consideration the local mean and the local amplitude of the envelope functions. The second step is to create an input vector to predict the h-steps-ahead value of the index $z(t+h)$ from the $m$ previous values $\mathrm{IMF}_i(t), \ldots, \mathrm{IMF}_i(t-m+1)$ for each of the five ($i = 1, \ldots, 5$) IMF components and the residue (see Equations (2), (3) and (5)-(9)).
We tested three input vectors of different lengths m, as follows:
1. m = 1 lagged values of each IMF and the residue.
2. m = 5 lagged values of each IMF and the residue.
3. m = p + d, where p denotes the number of autoregressive terms and d is the number of differentiations of an autoregressive integrated moving average (ARIMA(p, d, q)) model that was fitted to each of the IMFs and to the residue. For the implementation of the ARIMA(p, d, q) models, the auto.arima function available in R was used (Hyndman and Khandakar 2008) (see Appendix A).
We describe separately the results for the univariate and multivariate EMD-SVR frameworks.

Univariate EMD-SVR Results
The univariate EMD-SVR approach was implemented following the methodology described in Section 4.5.1 (Equations (4)-(8)) using both the recursive and the direct strategies. For the recursive strategy, we forecasted all h = 1, 2, ..., 50 steps ahead, but we only report here eight values for h ∈ {1, 2, 3, 5, 10, 20, 30, 50}. For the direct strategy, we trained eight models corresponding to h ∈ {1, 2, 3, 5, 10, 20, 30, 50}. We computed results for three input vectors of lengths m = 1, m = 5 and m = p + d, but for this example, we only report the results for the input vector m = p + d, which produced the best outcomes. We note that in several of the studied cases, m = p + d = 5. The autoregressive terms, p, and the number of differentiations, d, estimated with an ARIMA(p, d, q) model applied to each IMF are reported in Table 1.
Table 1. Order of the autoregressive integrated moving average (ARIMA(p, d, q)) models fitted to each intrinsic mode function (IMF) and to the residue. The number of lagged values m = p + d was used to construct the input vectors for the empirical mode decomposition-support vector regression (EMD-SVR) models.
In Figure 2, we compare the forecasted IMFs using the recursive and the direct strategies. Figure 2a illustrates the results of the first IMF. Figure 2b illustrates forecasting results for the second IMF, and so on, until Figure 2f, which shows the forecasted residue. The black line in each plot represents the input IMF, the blue line represents the recursive-strategy forecasting and the red line is the direct-strategy forecasting. We observe that both strategies captured some of the oscillating patterns of the IMFs.
The forecasted values of the IMFs and the residue have been used to generate a coarse-to-fine reconstruction, which generated six forecasting models. The results of these models are illustrated in Figure 3. The first, coarsest model only considered the residue, which is denoted as R; see Figure 3a. The second model used the forecasting of the residue and the fifth IMF ($R + \mathrm{IMF}_5$); see Figure 3b. We continued this process until we included all the IMFs ($R + \sum_{i=1}^{5} \mathrm{IMF}_i$); see Figure 3f.

Multivariate EMD-SVR Results
The multivariate EMD-SVR approach was implemented following the methodology described in Section 4.5.2 using the direct strategy (Equation (9)). We tested the three input vectors with m = 1, m = 5 and m = p + d, but we only report the results for the input vector m = p + d, which produced the best results. As before, we used the forecast horizons h ∈ {1, 2, 3, 5, 10, 20, 30, 50}.
A forecasting example of the multivariate EMD-SVR models for 7 August 2014 is reported in Figure 4. The black line represents the S&P 500 index values, whereas the red line represents the forecasted values from h = 1 to h = 50 steps ahead.

Intraday Forecasting: Analysis of the Complete Dataset
We performed the same intraday forecasting described in the previous two sub-sections for each of the 128 daily time series of the S&P 500 index. In order to fairly compare the forecasting capabilities of the different methodologies, each time series was decomposed into five IMFs and a residue.
To estimate the performance of the proposed models at each forecast horizon h ∈ {1, 2, 3, 5, 10, 20, 30, 50}, we calculated the mean absolute error (MAE) (see Section 4.7, Equation (10)) over the M = 128 time series. The proposed EMD-SVR models are compared with other commonly used benchmark models, which were applied to the initial time series (without the EMD). These benchmark models were the following:
• Naive model, which keeps constant the last observed value in the time series.
• ARIMA(p, d, q) model fitted to the original time series.
• SVR applied to the original time series, with both the direct and the recursive strategies.
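The naive benchmark is simple enough to state in a few lines. The function name and the dictionary return type below are illustrative assumptions, not part of the original study.

```python
def naive_forecast(history, horizons):
    """Naive benchmark: repeat the last observed value at every horizon."""
    if not history:
        raise ValueError("history must contain at least one observation")
    last = history[-1]
    return {h: last for h in horizons}


# Toy prices (made up for illustration) and the horizons used in the text.
forecasts = naive_forecast([1909.2, 1909.5, 1909.1], [1, 2, 3, 5, 10, 20, 30, 50])
```

Every horizon receives the same value, so any model that fails to beat this benchmark adds no information beyond the last observed price.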
Table 2 reports the comparison between the MAEs of the EMD-SVR models and the benchmarks. The table refers to the case using m = p + d, which returned the best results. For each forecast horizon, the smallest error across the compared models is set in boldface. The cases for input vectors of lengths m = 1 and m = 5 are reported in Tables A1 and A2 in the Appendix. Across the three tables (Tables 2, A1 and A2), the smallest errors are indicated with a dagger (†).
Figure 5 reports the mean MAE versus the forecast horizon for the EMD-SVR models, as well as the benchmarks for m = p + d.It is a graphical representation of Table 2. Cases for m = 1 and m = 5 are reported in Figures A1 and A2 in the Appendix.
Table 2. Mean and standard deviation (std) of the mean absolute error (MAE) computed over all 128 days for all the forecasting models: naive, ARIMA(p, d, q), direct and recursive SVR on the original data, and direct and recursive univariate and multivariate EMD-SVR with an input vector of m = p + d lagged values, the same input vector as the ARIMA(p, d, q) model. Smaller MAE values indicate better forecasting. The smallest MAE of each forecast horizon is set in boldface. The values marked with a dagger (†) indicate the smallest MAE of each horizon across all the models with different input vector lengths m (see Tables A1 and A2).

Statistical Significance
We applied the Wilcoxon test (see Section 4.8) between all the forecasting models and the naive model (benchmark) for each forecast horizon. In Table 3, we report the Z-statistic values of the two-tailed Wilcoxon signed-rank test with M = 128 and m = p + d. The results for m = 1 and m = 5 are reported in Tables A3 and A4 in the Appendix. We recall that in this test, values of Z larger than zero indicate that the model is performing better than the benchmark, whereas negative values indicate that the benchmark is performing better. In general, the larger the deviation from zero, the more significant the test is. In Tables 3, A3 and A4, we mark the 5% and the 1% significance levels with (*) and (**), respectively.

Discussion
The analysis of the forecasting errors' MAE (Table 2) reveals the following:
• Across all forecasting models, the MAE increased with the forecast horizon, following the intuition that the distant future is harder to predict.
• The direct strategy achieved more accurate forecasts than the recursive strategy in almost all the tested models.
• The smallest errors were observed for the input vector of length m = p + d. Similar results were obtained for models with input vector m = 5, as m = p + d was often around 5, whereas the case m = 1 produced poorer results.
• For short time-horizons (h ≤ 5), the best results were obtained by the direct univariate EMD-SVR model that included all IMFs and the residue. For large time-horizons (h ≥ 30), the best results were obtained by the direct univariate EMD-SVR model that included the residue only. The intermediate case of h = 20 favored the inclusion of the last two IMFs.
• The direct EMD-SVR multivariate strategy performed better than the naive and ARIMA(p, d, q) benchmarks across all horizons and performed better than the direct and recursive SVR benchmarks for h ≥ 5.
The statistical significance results (Wilcoxon test; Table 3) show the following:
• The direct EMD-SVR strategy provides consistently better results than the recursive strategy.
• For short time-horizons (h ≤ 5), better results are obtained with models including all the IMFs and the residue. For longer time-horizons (h ≥ 20), models with the residue only or models with only a few slowly oscillating IMFs become significantly better than the naive model.
• The direct EMD-SVR multivariate strategy provides forecasting results that significantly outperform the naive model for all time-horizons greater than h = 1 and outperforms the other models for h ≥ 5.
Overall, these outcomes indicate that statistically significant forecasting of the S&P 500 index can be obtained across all time-horizons from 30 s to 25 min ahead. For all the time-horizons, the decomposition of the time series into IMF components improved the SVR forecasting power. The results also demonstrate that the residue and the low-frequency IMF components contribute to the forecasting of long time-horizons and that the high-frequency components become relevant for the forecast of short time-horizons. This follows the intuition that the forecasting horizon is best captured by components at the same time-scale.
We observe that the best-performing results in terms of the mean MAE were obtained by using direct univariate EMD-SVR with different aggregations of EMD components (Equations (5)-(8)). Conversely, the best-performing results in terms of Z-statistics were obtained by using the direct multivariate EMD-SVR model (Equation (9)). This apparent contradiction is a consequence of the fact that some models with a small mean MAE had a large standard deviation (std) of the MAE, making the overall statistical significance poorer.

EMD
The EMD consists of subdividing a time series $z(t)$ into a number of components $\mathrm{IMF}_i(t)$, $i = 1, \ldots, n$, called IMFs, and a residual $R(t)$.
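In the standard formulation of Huang et al. (1998), this subdivision is additive, so that the original series is recovered exactly by summing all components and the residual. Written with the symbols used above:

```latex
\[
  z(t) \;=\; \sum_{i=1}^{n} \mathrm{IMF}_i(t) \;+\; R(t)
\]
```

This additivity is what allows forecasts of the individual components to be summed into a forecast of the original series.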
IMFs are oscillating components of the signal. Although theoretically there is no guarantee of stationarity, they oscillate around zero, and therefore they are at least locally stationary. IMFs are automatically discovered from local maxima and minima of the data without imposing any functional form. They are nearly orthogonal and oscillate with different characteristic times. There are several implementations for EMD in the literature; in this paper, we adopt a variation of the procedure introduced by Huang et al. (1998) and Flandrin and Gonçalves (2004). The interested reader can see our previous papers (Nava et al. 2016a, 2017) for further details. Typically in the EMD, the number of IMF components n is automatically discovered by the method, which stops when only a nonoscillating residual is left. In our analysis, we have instead imposed n = 5 by ending the decomposition before fully achieving a nonoscillating residual. This helped us to compare results across the entire dataset. Conventionally, indices rank components from high to low frequency, with $\mathrm{IMF}_1$ being the highest and $\mathrm{IMF}_n$ being the lowest.

Forecasting Financial Time Series
Given a time series with values $z_1, \ldots, z_t$, forecasting consists of estimating the (future) values of the time series at h-steps ahead from (present) time t. The challenge is to find the function that best maps some past m values of the time series into the value h-steps ahead, that is,
$$\hat{z}_{t+h} = f(z_t, z_{t-1}, \ldots, z_{t-m+1}).$$
The distance between the forecasted value $\hat{z}_{t+h}$ and the true value $z_{t+h}$ quantifies the accuracy of the forecasting. This is a regression problem between the sets of variables $(z_t, z_{t-1}, \ldots, z_{t-m+1})$ and $z_{t+h}$.
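The regression problem above can be illustrated by building the input/target pairs explicitly. This is a minimal sketch in plain Python; the function name `lagged_pairs` is a hypothetical helper, not something taken from the paper.

```python
def lagged_pairs(series, m, h):
    """Return (x, y) pairs with x = (z_t, z_{t-1}, ..., z_{t-m+1}) and y = z_{t+h}."""
    if m < 1 or h < 1:
        raise ValueError("m and h must be positive integers")
    pairs = []
    # t runs over every index that has m past values and a target h steps ahead.
    for t in range(m - 1, len(series) - h):
        x = [series[t - k] for k in range(m)]   # most recent value first
        pairs.append((x, series[t + h]))
    return pairs


pairs = lagged_pairs([10, 11, 12, 13, 14, 15], m=3, h=2)
# pairs == [([12, 11, 10], 14), ([13, 12, 11], 15)]
```

With a training sample of 500 values, m = 5 and h = 50, this construction yields 500 - 50 - 4 = 446 training pairs, so longer horizons leave fewer pairs to train on.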
In financial time series, the main complications arise from the nonstationarity of the underlying process, the nonlinearity of the regression function and its dependence on the time-scale of the forecasting horizon. In the present EMD-SVR approach, nonstationarity and forecasting horizon time-scales are handled with EMD, whereas nonlinearity is handled with SVR.

Recursive and Direct Strategies for h-Steps-Ahead Forecast
In this paper, we use two common strategies to forecast time series h-steps ahead:
1. The recursive strategy constructs a prediction model, which optimizes the one-step-ahead prediction: $\hat{z}_{t+1} = f(z_t, \ldots, z_{t-m+1})$. Then it uses the same model for the next forecasted value: $\hat{z}_{t+2} = f(\hat{z}_{t+1}, z_t, \ldots, z_{t-m+2})$, with the forecasted value $\hat{z}_{t+1}$ used instead of the true value, which is unknown. The procedure continues recursively:
$$\hat{z}_{t+h} = f(\hat{z}_{t+h-1}, \ldots, \hat{z}_{t+h-m}),$$
where any $\hat{z}_s$ with $s \le t$ stands for the observed value $z_s$. We note that when h becomes larger than m, all the input values are forecasted, and this is likely to deteriorate the accuracy of the prediction.
2. The direct strategy uses a different model for each forecast horizon. The various forecasting models are independently estimated. In this case, the h-step-ahead forecast is expressed as follows:
$$\hat{z}_{t+h} = f_h(z_t, z_{t-1}, \ldots, z_{t-m+1}).$$
The previous forecasted values are not used as inputs; therefore the errors do not propagate through the steps.
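The difference between the two strategies can be sketched with generic model callables. In this sketch, which assumes nothing about the actual regression method, a "model" is any function that maps the input vector (most recent value first) to a number; the linear-extrapolation toy models are placeholders standing in for trained SVR models.

```python
def recursive_forecast(one_step_model, history, m, h):
    """Iterate a single one-step model h times, feeding forecasts back as inputs."""
    if len(history) < m:
        raise ValueError("history must contain at least m values")
    window = list(history[-m:])          # ordered oldest ... newest
    forecasts = []
    for _ in range(h):
        x = window[::-1]                 # (z_t, z_{t-1}, ..., z_{t-m+1})
        z_hat = one_step_model(x)
        forecasts.append(z_hat)
        window = window[1:] + [z_hat]    # the forecast replaces the unknown true value
    return forecasts


def direct_forecast(models_by_horizon, history, m):
    """Apply one independent model per horizon to the same observed inputs."""
    if len(history) < m:
        raise ValueError("history must contain at least m values")
    x = list(history[-m:])[::-1]
    return {h: model(x) for h, model in models_by_horizon.items()}


def extrapolate(steps):
    """Toy placeholder model: linear extrapolation by `steps` steps."""
    return lambda x: x[0] + steps * (x[0] - x[1])


rec = recursive_forecast(extrapolate(1), [1.0, 2.0, 3.0], m=2, h=3)
direct = direct_forecast({1: extrapolate(1), 3: extrapolate(3)}, [1.0, 2.0, 3.0], m=2)
```

For this noiseless toy series both strategies agree; with a real fitted model, the recursive forecasts for h > m are built entirely on earlier forecasts, which is where the error propagation mentioned above comes from.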

Support Vector Regression
We estimate the forecasting function $f(\cdot)$ by using an SVR approach, which is one of the most popular and best-performing nonlinear regression methodologies (Christianini and Shawe-Taylor 2000; Kim 2003; Smola and Schölkopf 2004; Suykens et al. 2002; Tay and Cao 2001). It is based on the same formalism originally developed for support vector machines (SVMs), which are instead classifiers. SVR requires an input training set made of couples of variables $(x_i, y_i)$, from which a nonlinear relation $y = f(x)$ is inferred in the form of a sum of kernel functions $f(x) = \sum_i \alpha_i K(x, x_i)$ (Christianini and Shawe-Taylor 2000; Schölkopf et al. 2001). In our case, we identify $i = t$, $y_i = z_{t+h}$ and $x_i = (z_t, z_{t-1}, \ldots, z_{t-m+1})$.
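To make the kernel expansion concrete, the sketch below evaluates $f(x) = \sum_i \alpha_i K(x, x_i)$ for given coefficients. The Gaussian (RBF) kernel and its `gamma` parameter are assumptions made for illustration, since the text does not state which kernel was used; standard SVR formulations usually also include a constant offset term, which is omitted here to follow the form quoted above.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian kernel K(x, y) = exp(-gamma * ||x - y||^2) (assumed kernel choice)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def kernel_expansion(x, training_inputs, alphas, gamma):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) for already-estimated alphas."""
    if len(training_inputs) != len(alphas):
        raise ValueError("one coefficient is needed per training input")
    return sum(a * rbf_kernel(x, xi, gamma) for a, xi in zip(alphas, training_inputs))


# Made-up inputs and coefficients, for illustration only.
value = kernel_expansion([1.0, 2.0], [[1.0, 2.0], [5.0, 5.0]], [2.0, -1.0], gamma=0.5)
```

In a trained SVR, many of the coefficients are exactly zero, so only a subset of the training inputs (the support vectors) contributes to the sum.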
The regression coefficients $\alpha_i$ are estimated by solving a quadratic optimization problem. The other parameters of the SVR model (the kernel parameters, the regularization constant and the ε-insensitive coefficient) are estimated through a grid search (Bennett et al. 2006).
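A grid search of this kind can be sketched generically as an exhaustive loop over parameter combinations. The names `fit`, `score` and the parameter labels `C`, `epsilon` and `gamma` are assumptions chosen for illustration; the toy scoring function stands in for a real validation error and is not the procedure of Bennett et al. (2006).

```python
from itertools import product


def grid_search(fit, score, grid):
    """Try every parameter combination; keep the one with the lowest error.

    fit(params) returns a model; score(model) returns a validation error.
    """
    names = sorted(grid)
    best_params, best_error = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = score(fit(params))
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


# Toy stand-ins: the "model" is just its parameters, and the error is smallest
# at C = 10 and epsilon = 0.1 by construction.
grid = {"C": [1, 10, 100], "epsilon": [0.01, 0.1], "gamma": [0.5]}
best_params, best_error = grid_search(
    fit=lambda params: params,
    score=lambda model: (model["C"] - 10) ** 2 + (model["epsilon"] - 0.1) ** 2,
    grid=grid,
)
```

In a forecasting setting, the validation error should be computed on data held out from within the training sample, so that the testing sample stays untouched.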

EMD-SVR Forecasting
In this paper, we have used the EMD components for forecasting by applying both a univariate and a multivariate EMD-SVR scheme. Both use the IMF components from EMD as input for forecasting with SVR. The difference is that the univariate approach attempts to forecast each component and the residue independently, whereas the multivariate approach uses all components and the residue to forecast the future value of the signal.

Univariate EMD-SVR
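The univariate EMD-SVR approach can be written as follows:
$$\widehat{\mathrm{IMF}}_i(t+h) = f_i(\mathrm{IMF}_i(t), \ldots, \mathrm{IMF}_i(t-m+1)), \quad i = 1, \ldots, n,$$
$$\hat{R}(t+h) = f_R(R(t), \ldots, R(t-m+1)).$$
The forecasted IMFs are then combined to obtain the forecast for the input time series. With this method, we can create a set of forecasting functions by partially adding component by component starting from the residue:
$$\hat{z}(t+h) = \hat{R}(t+h),$$
$$\hat{z}(t+h) = \hat{R}(t+h) + \widehat{\mathrm{IMF}}_n(t+h),$$
$$\vdots$$
$$\hat{z}(t+h) = \hat{R}(t+h) + \sum_{i=1}^{n} \widehat{\mathrm{IMF}}_i(t+h).$$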
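The coarse-to-fine reconstruction amounts to a cumulative sum that starts from the forecasted residue and adds forecasted IMFs from the lowest to the highest frequency. A minimal sketch for a single horizon, with a hypothetical function name and made-up numbers:

```python
def coarse_to_fine(imf_forecasts, residue_forecast):
    """Return the partial reconstructions R, R + IMF_n, ..., R + sum of all IMFs.

    imf_forecasts is ordered IMF_1 (highest frequency) ... IMF_n (lowest).
    """
    total = residue_forecast
    models = [total]
    for imf_hat in reversed(imf_forecasts):   # IMF_n first, IMF_1 last
        total = total + imf_hat
        models.append(total)
    return models


# Five forecasted IMF values and one forecasted residue (made-up numbers).
models = coarse_to_fine([1, 2, 3, 4, 5], 100)
# models == [100, 105, 109, 112, 114, 115]
```

With n = 5 components this produces six candidate forecasts per horizon, which corresponds to the six models compared in the results.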
With this method, we can use both the direct and recursive forecasting strategies.

Multivariate EMD-SVR
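The multivariate EMD-SVR scheme can be written as follows:
$$\hat{z}(t+h) = f_h(\mathrm{IMF}_1(t), \ldots, \mathrm{IMF}_1(t-m_1+1), \ldots, \mathrm{IMF}_n(t), \ldots, \mathrm{IMF}_n(t-m_n+1), R(t), \ldots, R(t-m_R+1)).$$
A different value of $m_i$ is used for each $\mathrm{IMF}_i$ and for the residue ($m_R$). Only the direct strategy can be used in this multivariate model because the forecasting is done on the complete signal and not on each IMF. One of the advantages of this method is that only a single forecasting model needs to be trained, rather than one for each component, and therefore it results in a faster algorithm.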
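The multivariate input vector is a concatenation of the lagged values of every component, each with its own number of lags. The sketch below assumes the components are stored as equally long Python lists; the function name is a hypothetical helper.

```python
def multivariate_input(components, lags, t):
    """Concatenate lagged values of each component at time index t.

    components: list of series (IMF_1, ..., IMF_n, residue), all the same length.
    lags: number of lagged values m_i to take from each component.
    """
    if len(components) != len(lags):
        raise ValueError("one lag count is needed per component")
    if t < max(lags) - 1:
        raise ValueError("not enough past values at index t")
    x = []
    for series, m in zip(components, lags):
        x.extend(series[t - k] for k in range(m))   # most recent value first
    return x


# Two toy components with 2 lags and 1 lag, respectively (made-up numbers).
x = multivariate_input([[1, 2, 3], [10, 20, 30]], lags=[2, 1], t=2)
# x == [3, 2, 30]
```

The target paired with this vector is the value of the original series h steps later, so a single regression per horizon sees all time-scales at once.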

Model Selection and Parameter Estimation
The estimation of the parameters and the validation of the model are a crucial part of any forecasting strategy. Model parameters are estimated by training the regression on a set of "training data", searching for optimal (penalized) outputs. The performance of the regression is then tested on another set of "testing data". Model selection criteria such as the Akaike information criterion (AIC) and the Schwarz Bayesian information criterion (BIC) can be used to select the best-performing model (Montgomery et al. 2008).

Measure of Performance
The performance of the forecasting model is measured by quantifying the mismatch between the forecasted values $\hat{z}_{t+h}$ and the real values $z_{t+h}$ (in the testing set). In this paper, we quantify this mismatch by using the MAE, which is a commonly used error measure defined as follows (Willmott and Matsuura 2005):
$$\mathrm{MAE} = \frac{1}{M} \sum_{j=1}^{M} \left| \hat{z}^{(j)}_{t+h} - z^{(j)}_{t+h} \right|,$$
where M is the number of forecasted data points and j indexes them.
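The MAE is the average of the absolute differences between forecasts and realized values. A minimal sketch with a hypothetical function name and made-up numbers:

```python
def mean_absolute_error(forecasts, actuals):
    """Average absolute difference between forecasted and true values."""
    if len(forecasts) != len(actuals) or not forecasts:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(forecasts)


# Absolute errors are 0, 2 and 3, so the MAE is 5/3.
mae = mean_absolute_error([1.0, 2.0, 3.0], [1.0, 4.0, 0.0])
```

Because it averages absolute rather than squared errors, the MAE is expressed in the same units as the index and is less dominated by a few large misses than a squared-error measure would be.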

Statistical Significance Test
In order to further evaluate the forecasting performance, we have tested the null hypothesis of equal forecast accuracy between the proposed EMD-SVR models and the benchmark models. Specifically, we have used the Wilcoxon signed-rank test (Wilcoxon 1945), a nonparametric test that assesses whether the difference between a pair of models is statistically significant. We applied the Wilcoxon test to the ranks of the differences of the absolute errors to evaluate the null hypothesis that the two related error samples had the same distribution. The test returns a Z-score that, under the null hypothesis, approximately follows a standard normal distribution for large samples. A positive value of Z indicates that the tested model had smaller errors than the naive model. On the contrary, a negative value indicates that the naive model outperformed the tested model. The larger the absolute value of Z, the more significant the difference between the EMD-SVR and the benchmark models.
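The Z-score of the signed-rank test can be sketched with the usual large-sample normal approximation. This simplified version is an assumption-laden illustration: it drops zero differences, uses average ranks for ties, and applies neither a tie correction to the variance nor a continuity correction, so it may differ slightly from the values produced by statistical packages. The sign convention is chosen to match the text, with positive Z meaning that the tested model has smaller errors than the benchmark.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # average of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def wilcoxon_z(benchmark_errors, model_errors):
    """Normal-approximation Z for paired absolute errors (positive: model better)."""
    diffs = [abs(b) - abs(e) for b, e in zip(benchmark_errors, model_errors)]
    diffs = [d for d in diffs if d != 0]  # zero differences carry no sign
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / std


# Made-up errors: the model beats the benchmark on every pair, so Z is positive.
z = wilcoxon_z([3.0, 4.0, 5.0, 6.0], [1.0, 1.0, 1.0, 1.0])
```

Swapping the two arguments flips the sign of Z, which mirrors the interpretation of negative values given above.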

Conclusions
We have introduced a multistep-ahead forecasting methodology for nonlinear and nonstationary time series on the basis of a combination of EMD and SVR. The EMD can fully capture the local fluctuations of the analyzed time series and can be used as a preprocessor to decompose nonstationary data into a finite set of IMFs and a residue. The extracted IMFs are locally stationary (except the residue), they have simpler structures and they are associated with oscillations within a characteristic time-scale range. The underlying idea that we tested successfully in this paper is that IMFs are better suited for forecasting than the original time series. The construction of EMD is algorithmically very simple and computationally undemanding, with complexity scaling linearly with the time series length.
We tested both univariate and multivariate EMD-SVR forecasting schemes. For the univariate scheme, we forecasted each IMF and the residue separately and then constructed the forecasted input time series as the sum of the forecasted components. We defined coarse-to-fine reconstruction models using the cumulative sum of sequential IMFs, that is, adding details to the low-frequency components. The multivariate EMD-SVR scheme instead combined information of all the IMFs and the residue into one input vector used for forecasting the financial time series. We used two multistep-ahead prediction strategies, the recursive and the direct strategies.
We evaluated the performance of our multistep-ahead forecasting models on intraday data from the S&P 500 index. The results suggest that the multivariate EMD-SVR models perform better than benchmark models. The best results were obtained with the direct strategy applied to the univariate EMD-SVR with an input vector of length m = p + d (p and d obtained from a fitted ARIMA(p, d, q) model with, in our case, p + d ≈ 5).
We observed that the IMFs with high oscillation frequencies contribute to forecasting on short time-horizons but do not improve forecasting on longer horizons. For short-term forecasting, the model using the full reconstruction (all IMFs) performed better. For long-term forecasting, the residue conveyed the most important features for the forecasting of the original data. This was a novel yet expected outcome, because it is intuitive that the main contributors to forecasting on some time-horizons should be those with a similar time-scale. Shorter time-scales introduce essentially only noise, and longer time-scales are slowly varying trends with small effects.
We conclude that the EMD can improve forecasting performance, with the most significant improvements over the benchmark models achieved for longer-term horizons (h ≥ 10). The limited improvement for short-term horizons may be due to the boundary effects of the EMD, which produce swings in the extremes of the IMFs and perturb the first few forecasting steps.
This paper aims to demonstrate the feasibility of using EMD for forecasting purposes. Although the proposed EMD-SVR forecasting methodology has achieved good predictive performance, therefore supporting the starting hypothesis, there is plenty of scope for further refining of the methodology. For instance, the selection of the input vector and the choice of parameters for the SVR can be improved. Further, EMD can be better adapted for forecasting purposes by implementing special handling of the right boundary values and by better processing the propagation of noise from large fluctuations.

Figure 1. Values of the S&P 500 index for the trading day, 7 August 2014. The first 5 min of the trading day are not included.

Figure 2. True (black lines) and forecasted (red and blue lines) intrinsic mode functions (IMFs) and residue extracted from the S&P 500 index for 7 August 2014 shown in Figure 1. The forecasted values were obtained using the univariate empirical mode decomposition-support vector regression (EMD-SVR) model, using both the recursive (blue line) and the direct (red line) strategies. We note that the black lines end when forecasting begins. EMD was performed on the training set only.

Figure 3. True and forecasted values for the S&P 500 index for 7 August 2014 shown in Figure 1. Forecasted values were obtained using partial reconstructions of the univariate empirical mode decomposition-support vector regression (EMD-SVR) model (Equations (5)-(8)), with both the recursive (blue line) and the direct (red line) strategies. We note that the EMD was performed on the training set only (i.e., using only data before the beginning of forecasting).

Figure 4. True (black line) and forecasted (red line) values for the S&P 500 index. Forecasted values were obtained using the multivariate empirical mode decomposition-support vector regression (EMD-SVR) model (Equation (9)). The true S&P 500 index values are the same as in Figures 1 and 3. We note that the EMD was performed on the training set only (i.e., using only data before the beginning of forecasting).

Figure 5. Mean absolute error (MAE) as a function of the forecast horizon for all forecasting models: naive, ARIMA(p, d, q), SVR on the original data, and univariate and multivariate EMD-SVR with an input vector of m = p + d lagged values. Smaller MAE values indicate better forecasts. Direct-strategy univariate EMD-SVR models with different numbers of components (Equations (5)-(8)) outperform all benchmarks at all steps ahead. Multivariate EMD-SVR also outperforms all benchmarks for h ≥ 5.

Figure A1. Mean absolute error (MAE) as a function of the forecast horizon for the considered forecasting models: naive, autoregressive integrated moving average (ARIMA(p, d, q)), and univariate and multivariate empirical mode decomposition-support vector regression (EMD-SVR) with an input vector of m = 1 lagged values.

Figure A2. Mean absolute error (MAE) as a function of the forecast horizon for the considered forecasting models: naive, autoregressive integrated moving average (ARIMA(p, d, q)), and univariate and multivariate empirical mode decomposition-support vector regression (EMD-SVR) with an input vector of m = 5 lagged values.

Table 3. Z-statistic for the Wilcoxon signed-rank test of the difference between the naive model and the other models: autoregressive integrated moving average (ARIMA(p, d, q)), direct and recursive support vector regression (SVR) on the original data, and univariate and multivariate empirical mode decomposition-SVR (EMD-SVR) with input vector m = p + d. Positive values indicate better performance than the naive model; negative values indicate worse performance. The larger the value, the more significant the outperformance with respect to the naive model. Statistics were computed over all 128 days in the dataset. Best-performing models for each step ahead h are highlighted in boldface. * Statistically significant at the 5% significance level. ** Statistically significant at the 1% significance level.

Table A1. Mean absolute error (MAE): mean and standard deviation (std) for the considered forecasting models with an input vector of m = 1 lagged values. The smallest MAE of each forecast horizon is set in boldface.

Table A2. Mean absolute error (MAE): mean and standard deviation (std) for the considered forecasting models: naive, autoregressive integrated moving average (ARIMA(p, d, q)), and univariate and multivariate empirical mode decomposition-support vector regression (EMD-SVR) with an input vector of m = 5 lagged values. The smallest MAE of each forecast horizon is set in boldface.

Table A3. Z-statistic for the Wilcoxon signed-rank test for the null hypothesis that the naive model is as accurate as the studied models: autoregressive integrated moving average (ARIMA(p, d, q)), and univariate and multivariate empirical mode decomposition-support vector regression (EMD-SVR) with input vector m = 1. Top: direct strategy; bottom: recursive strategy. * Statistically significant at the 5% significance level. ** Statistically significant at the 1% significance level.

Table A4. Z-statistic for the Wilcoxon signed-rank test for the null hypothesis that the naive model is as accurate as the studied models: autoregressive integrated moving average (ARIMA(p, d, q)), and univariate and multivariate empirical mode decomposition-support vector regression (EMD-SVR) with input vector m = 5. Top: direct strategy; bottom: recursive strategy. * Statistically significant at the 5% significance level. ** Statistically significant at the 1% significance level.