Article

A Hybrid Method Using HAVOK Analysis and Machine Learning for Predicting Chaotic Time Series

College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410000, China
* Author to whom correspondence should be addressed.
Entropy 2022, 24(3), 408; https://doi.org/10.3390/e24030408
Submission received: 17 February 2022 / Revised: 5 March 2022 / Accepted: 6 March 2022 / Published: 15 March 2022

Abstract
The prediction of chaotic time series has remained a challenging problem in recent decades. A hybrid method using Hankel Alternative View Of Koopman (HAVOK) analysis and machine learning (HAVOK-ML) is developed to predict chaotic time series. HAVOK-ML simulates the time series by reconstructing it as a closed linear model and uses this model for prediction. It decomposes the chaotic dynamics into intermittently forced linear systems by HAVOK analysis and estimates the external intermittent forcing term using machine learning. The prediction performance evaluations confirm that the proposed method has superior forecasting skill compared with existing prediction methods.

1. Introduction

A chaotic system is a deterministic system that exhibits irregular, seemingly random motion; its behavior is aperiodic and difficult to repeat or predict. High sensitivity to initial conditions and inherent long-term unpredictability are the main characteristics of chaotic systems. Chaotic phenomena are ubiquitous in many scientific fields, such as atmospheric motion [1], population dynamics [2,3,4], epidemiology [5], and economics, where they have long attracted wide research attention. It is worth noting that a chaotic system is not completely random, as its name might suggest, but possesses definite structure and patterns. However, because the dynamic mechanisms of chaotic systems are poorly understood, the prediction of chaotic time series remains an important but challenging problem.
With the development of big data and advanced algorithms in machine learning, solving prediction problems of chaotic systems in a data-driven way has become a new research direction. Several empirical models for predicting chaotic time series based on machine learning have been proposed. Many well-known artificial neural network (ANN) models, such as the Radial Basis Function (RBF) neural network [6], the neuro-fuzzy model with Locally Linear Model Tree (LoLiMoT) [7], the feedforward neural network [8], the multi-layer perceptron (MLP) [9], recurrent neural networks (RNN) [10], the finite impulse response (FIR) neural network [11], deep belief nets (DBN) [12], the Elman neural network [11,13,14], and the wavelet neural network (WNN) [15,16], have been introduced in the literature.
However, the performance of these models depends strongly on the settings of the network parameters. Consequently, substantial work has also been devoted to optimization algorithms and parameter settings. Min Gan et al. presented a state-dependent autoregressive (SD-AR) model that uses a set of locally linear radial basis function networks (LLRBFNs) to approximate its functional coefficients [17]. In addition, Pauline Ong and Zarita Zainuddin presented a modified cuckoo search algorithm (MCSA) to initialize WNN models [16]. A hybrid learning algorithm called HGAGD, which combines a genetic algorithm (GA) with gradient descent (GD), has been proposed to optimize the parameters of a quantum-inspired neural network (QNN) [18]. In [13], the embedding method is used together with an Elman neural network (ENN) to predict the residual time series. A hidden Markov model (HMM) combined with fuzzy inference systems has been introduced for time series prediction [19]. Moreover, many hybrid methods have been developed to improve the performance of these prediction models [20,21].
As mentioned above, model structure and parameter tuning are decisive factors in chaotic time series prediction with machine learning, and much research has focused on them. To simplify the learning model, a hybrid method using Hankel Alternative View Of Koopman (HAVOK) analysis and machine learning (HAVOK-ML) is developed in this work to predict chaotic time series. HAVOK analysis was proposed by Brunton et al. [22]. It combines the delay embedding method [23] with Koopman theory [24] to decompose chaotic dynamics into a linear model with intermittent forcing. HAVOK-ML first decomposes the chaotic dynamics into an intermittently forced linear system with HAVOK and then estimates the forcing term using machine learning. Essentially, prediction with HAVOK-ML reduces to solving linear ordinary differential equations, which can be done efficiently. The framework can incorporate different types of regression methods, such as linear regression or Random Forest Regression (RFR) [25], and thus combines the advantages of HAVOK theory and machine learning. Therefore, it can obtain better prediction results than directly applying those machine learning models.
This paper is organized as follows. Section 2 briefly describes the theory of the HAVOK analysis combined with the machine learning method for time series prediction. Section 3 applies the proposed combined method to perform multi-step ahead prediction for some well-known chaotic time series and also compares the obtained prediction performance with that of existing prediction models. Finally, conclusions are given in Section 4.

2. HAVOK-ML Method

Consider a nonlinear system of the form:
$$\frac{d\mathbf{x}(t)}{dt} = \mathbf{f}(\mathbf{x}(t)) \tag{1}$$

where $\mathbf{x}(t) \in \mathbb{R}^n$ is the state of the system at time $t$ and $\mathbf{f}$ denotes the dynamics of the system. For a given state $\mathbf{x}(t_0)$ at time $t_0$, $\mathbf{x}(t_0 + t)$ is given by:

$$\mathbf{x}(t_0 + t) = \mathbf{x}(t_0) + \int_{t_0}^{t_0 + t} \mathbf{f}(\mathbf{x}(\tau))\, d\tau \tag{2}$$
Generally speaking, for an observed chaotic time series $x(t)$, the governing function $\mathbf{f}$ is highly nonlinear and unknown. HAVOK analysis [22] provides linear representations of such unknown nonlinear systems. A Hankel matrix $H$ is constructed from the single measurement $x(t)$ and factored by singular value decomposition (SVD):
$$H = \begin{bmatrix} x(t_1) & x(t_2) & \cdots & x(t_p) \\ x(t_2) & x(t_3) & \cdots & x(t_{p+1}) \\ \vdots & \vdots & \ddots & \vdots \\ x(t_q) & x(t_{q+1}) & \cdots & x(t_m) \end{bmatrix} = U \Sigma V^T \tag{3}$$

where $m = p + q - 1$, and $p$ and $q$ are two parameters that determine the dimensions of $H$. The columns of $H$ are defined by:

$$\mathbf{h}(i) = [x(t_i), x(t_{i+1}), \ldots, x(t_{i+q-1})]^T, \quad i = 1, \ldots, p \tag{4}$$

so that

$$H = [\,\mathbf{h}(1)\;\; \mathbf{h}(2)\;\; \cdots\;\; \mathbf{h}(p)\,] \tag{5}$$
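The construction of $H$ and its truncated SVD in Equations (3)–(5) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code; the function name and the toy sine signal are assumptions, and the array layout follows the $q \times p$ convention above:

```python
import numpy as np

def hankel_svd(x, q, r):
    """Build the q-row Hankel matrix of a scalar series x and
    return its rank-r truncated SVD factors (Eq. (3))."""
    p = len(x) - q + 1                       # number of columns; m = p + q - 1
    H = np.column_stack([x[i:i + q] for i in range(p)])
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    return U[:, :r], s[:r], Vt[:r, :]        # first r modes approximate H

# toy usage: a clean sine sampled at dt = 0.01 (its Hankel matrix has rank 2,
# so a rank-5 truncation reconstructs H essentially exactly)
t = np.arange(0.0, 10.0, 0.01)
x = np.sin(2.0 * np.pi * t)
U, s, Vt = hankel_svd(x, q=40, r=5)
H_approx = U @ np.diag(s) @ Vt               # low-rank reconstruction of H
```

The rows of `Vt` are the eigen-time-delay coordinates $v_1, \ldots, v_r$ used by the HAVOK model below.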
Usually, $H$ can be well approximated by the first $r$ singular vectors of $U$ and $V$. According to the HAVOK analysis [22], the first $r-1$ variables in $V$ can be modeled as a linear system with the last variable $v_r$ acting as a forcing term:

$$\frac{d\mathbf{v}_{r-1}(t)}{dt} = A\,\mathbf{v}_{r-1}(t) + B\,v_r(t) \tag{6}$$

where $\mathbf{v}_{r-1} = [v_1, v_2, \ldots, v_{r-1}]^T$ is the vector of the first $r-1$ eigen-time-delay coordinates and $v_r$ is given by the $r$th column of $V$. Note that Equation (6) is not a closed model because $v_r(t)$ is an external input forcing. In the linear HAVOK model, the matrix $A$ and the vector $B$ may be obtained by the Sparse Identification of Nonlinear Dynamics (SINDy) algorithm [26] or by a straightforward linear regression procedure.
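For the plain linear regression route, $A$ and $B$ can be fit by ordinary least squares on finite-difference derivatives of the delay coordinates. The following is a minimal sketch under the assumption that $V$ has at least $r$ columns; the function name and the central-difference choice are illustrative, not taken from the paper:

```python
import numpy as np

def fit_havok_linear(V, r, dt):
    """Least-squares fit of dv/dt = A v + B v_r (Eq. (6)).
    V: (samples, modes) array of eigen-time-delay coordinates."""
    v = V[:, :r]                                        # first r coordinates
    dv = (v[2:, :r - 1] - v[:-2, :r - 1]) / (2.0 * dt)  # central differences
    X = v[1:-1, :]                                      # regressors [v_1..v_{r-1}, v_r]
    Xi, *_ = np.linalg.lstsq(X, dv, rcond=None)         # dv ≈ X @ Xi
    A = Xi[:r - 1, :].T                                 # (r-1, r-1) state matrix
    B = Xi[r - 1, :].reshape(-1, 1)                     # (r-1, 1) forcing vector
    return A, B
```

As a sanity check, data generated from a known linear system (a pure rotation with zero forcing) recovers the expected $A$ up to finite-difference error.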
A machine learning method is used to predict $v_r(t+1)$ from previously observed values $[x(t - D\Delta t), x(t - (D-1)\Delta t), \ldots, x(t - \Delta t)]$, as shown in Figure 1. Suppose $v_r$ varies linearly within the interval $[t, t+1]$. The evolution of $v_r(t)$ can then be approximated by:

$$\frac{dv_r(t)}{dt} \approx \frac{v_r(t+1) - v_r(t)}{\Delta t} \tag{7}$$
Then, the first $r-1$ variables $\mathbf{v}_{r-1}(t+1)$ are obtained by solving the linear model of Equation (6):

$$\begin{bmatrix} \mathbf{v}_{r-1}(t+1) \\ v_r(t+1) \\ v_r(t+1) - v_r(t) \end{bmatrix} = \exp\!\left( \begin{bmatrix} A\,dt & B\,dt & 0 \\ 0 & 0 & I \\ 0 & 0 & 0 \end{bmatrix} \right) \begin{bmatrix} \mathbf{v}_{r-1}(t) \\ v_r(t) \\ v_r(t+1) - v_r(t) \end{bmatrix} \tag{8}$$
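The update in Equation (8) amounts to one matrix exponential of the augmented system. A minimal sketch, assuming `A` and `B` come from the HAVOK fit and the next forcing value `vr_next` is supplied by the learned regressor (the function name is illustrative):

```python
import numpy as np
from scipy.linalg import expm

def havok_step(v, vr_next, A, B, dt):
    """One step of the closed linear model: propagate the first r-1
    coordinates given the predicted forcing value vr_next (Eq. (8)).
    v: current state [v_1, ..., v_{r-1}, v_r]."""
    n = A.shape[0]
    # augmented generator for d/dt [v_{r-1}; v_r; dv_r], dv_r held constant
    M = np.zeros((n + 2, n + 2))
    M[:n, :n] = A * dt
    M[:n, n] = (B * dt).ravel()
    M[n, n + 1] = 1.0
    state = np.concatenate([v[:n], [v[n]], [vr_next - v[n]]])
    out = expm(M) @ state
    return out[:n + 1]        # [v_{r-1}(t+1); v_r(t+1)]
```

For a scalar system $\dot{v}_1 = v_r$ with constant forcing $v_r = 1$ and $dt = 1$, one step moves $v_1$ from 0 to 1, as expected from the exact solution.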
Assume that the integration starts at time $t_p$ for an input $\mathbf{h}(p)$. Then, the next step $\mathbf{h}(p+1)$ can be written as:

$$\mathbf{h}(p+1) = U \Sigma\, \mathbf{v}(t_{p+1}) \tag{9}$$

where $\mathbf{v} = [v_1, v_2, \ldots, v_{r-1}, v_r]^T$ is the vector containing the first $r$ variables. In order to evaluate the efficiency of HAVOK-ML, the RMSE, NMSE, and $R^2$ score defined below are used as performance indices.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} \tag{10}$$

$$\mathrm{NMSE} = \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \tag{11}$$

$$R^2 = \frac{\sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \tag{12}$$
where $y_i$, $\hat{y}_i$, and $\bar{y}$ represent the observed data, the predicted data, and the mean of the observed data, respectively. The Root Mean Squared Error (RMSE) and the Normalized Mean Squared Error (NMSE) are used to assess the accuracy of the prediction and to compare the results with those in the literature. The $R^2$ score is used to evaluate the machine learning based prediction of $v_r$.
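The three indices in Equations (10)–(12) are straightforward to compute; a sketch follows. Note that the ratio form of $R^2$ above is written out explicitly, since library helpers such as `sklearn.metrics.r2_score` use the alternative $1 - \mathrm{SS_{res}}/\mathrm{SS_{tot}}$ definition:

```python
import numpy as np

def rmse(y, y_hat):
    """Root Mean Squared Error, Eq. (10)."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

def nmse(y, y_hat):
    """Normalized Mean Squared Error, Eq. (11)."""
    return np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def r2(y, y_hat):
    """R^2 score in the explained-variance ratio form of Eq. (12)."""
    return np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)

y = np.array([1.0, 2.0, 3.0, 4.0])
print(rmse(y, y), nmse(y, y), r2(y, y))   # perfect prediction: 0.0 0.0 1.0
```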

3. Numerical Experiments

In this section, three different types of time series—Lorenz [1], Mackey–Glass [26], and Sunspot—are used to verify the proposed HAVOK-ML method. The parameters adopted in the HAVOK analysis of these series are listed in Table 1.

3.1. Lorenz Time Series

The Lorenz system [1] is among the most famous chaotic systems. It is described by:

$$\frac{dx}{dt} = \sigma(y - x), \qquad \frac{dy}{dt} = x(\rho - z) - y, \qquad \frac{dz}{dt} = xy - \beta z \tag{13}$$

The chaotic time series is obtained with parameters $\sigma = 10$, $\rho = 28$, $\beta = 8/3$, and a sampling step of $dt = 0.01$ s. In this study, only the time series of the variable $x(t)$, shown in Figure 2, is considered.
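The $x(t)$ samples can be generated with a standard ODE solver. The sketch below reproduces the stated parameters and the initial condition $(-8, 8, 27)$ from Figure 2; it is an illustrative pipeline, not necessarily the authors' exact one:

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system, Eq. (13)."""
    x, y, z = s
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

dt = 0.01
t_eval = np.arange(0.0, 100.0, dt)
sol = solve_ivp(lorenz, (0.0, 100.0), [-8.0, 8.0, 27.0],
                t_eval=t_eval, rtol=1e-9, atol=1e-9)
x = sol.y[0]          # the measured series used to build the Hankel matrix
```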
In this research, HAVOK-ML decomposes the chaotic dynamics into intermittently forced linear systems by HAVOK analysis; the settings of the HAVOK analysis are given in Table 1, and the sampling time step for each system is consistent with the references listed in Table 2. However, following the advice in [14], the samples of the Lorenz system are interpolated to a 0.001 s resolution for the HAVOK analysis.
By applying HAVOK analysis to the training data, a linear HAVOK model (Equation (6)) is developed. As shown in Figure 3, the matrix $A$ and the vector $B$ are sparse, and the reconstructions of $v_1(t)$ and $x(t)$ are coherent with the actual values over the full time range. Since $v_r$ (Figure 4) is not smooth, extensive experiments showed that the RFR method [25] predicts it best among the candidates tested. Hence, an RFR model is trained to estimate the next step $v_r(t+1)$ from the previously observed values $[x(t-40), x(t-35), \ldots, x(t-10), x(t-5)]$. The samples from the 3rd to the 100th second are split into a training set (first 80%) and a test set (remaining 20%). The $R^2$ score of the RFR method on the test set is 0.87. As shown in Figure 4, the estimated results are largely consistent with the actual values.
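The forcing regression can be sketched with scikit-learn. This is illustrative only: the lag indices follow the $[x(t-40), x(t-35), \ldots, x(t-5)]$ pattern described above, but the synthetic noisy sine used here merely stands in for the true $v_r$ target, and the helper name is an assumption:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def lag_features(x, lags):
    """Stack lagged copies of x; row t holds [x(t - l) for l in lags]."""
    L = max(lags)
    return np.column_stack([x[L - l:len(x) - l] for l in lags])

rng = np.random.default_rng(0)
x = np.sin(np.arange(5000) * 0.01) + 0.01 * rng.standard_normal(5000)

lags = list(range(40, 0, -5))            # 40, 35, ..., 5 steps back
X = lag_features(x, lags)
vr = x[max(lags):]                       # stand-in target for v_r(t+1)

split = int(0.8 * len(vr))               # first 80% train, last 20% test
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:split], vr[:split])
print(model.score(X[split:], vr[split:]))   # held-out R^2 score
```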
In the next experiment, the HAVOK-ML method is used for N-step recursive prediction of the Lorenz time series. In recursive prediction, the current predicted values are used for the next predictions without any correction toward the actual values. A comparison between the multi-step predicted values and the original time series, with 1000 testing samples, is shown in Figure 5. In the initial prediction steps (fewer than 10), the prediction results are coherent with the actual values. The error grows with the number of prediction steps, especially near the extreme points of the curve. The RMSE of the prediction as a function of time is presented in Figure 6. The error increases quickly with the prediction time, which means that long-term prediction is essentially impossible. At step 10, the obtained RMSE is $9.003 \times 10^{-3}$ (Figure 6), which is significantly better than the result reported in the literature (0.014) [27].
Table 2 presents the one-step-ahead prediction errors (RMSE and NMSE) of the proposed method together with results obtained by existing methods, extracted from the literature. The RMSE of the proposed method is the best, while in terms of NMSE the proposed method is second only to the functional weights WNN state-dependent AR (FWWNN-AR) model [15].

3.2. Mackey–Glass Time Series

The Mackey–Glass chaotic time series was introduced as a model of white blood cell production [26]. It is described by:

$$\frac{dx(t)}{dt} = \frac{a\,x(t-\tau)}{1 + x^r(t-\tau)} - b\,x(t), \qquad t > 0 \tag{14}$$

where $a = 0.2$, $b = 0.1$, $r = 10$, and $\tau = 17$, as in the other published papers listed in Table 3. The Mackey–Glass equation is solved using the delay differential equation solver dde23 of MATLAB. A chaotic time series of 25,000 samples, with time step $dt = 0.1$, is generated. The samples from the 300th to the 2000th second, shown in Figure 7, are chosen as the training set, while the rest are used as the test set.
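Absent MATLAB's dde23, the delay equation (14) can be integrated with a simple fixed-step scheme that keeps a history buffer for the delayed term. This is a rough sketch assuming a constant history $x(t \le 0) = 0.8$; forward Euler is less accurate than dde23's adaptive method, and the function name is illustrative:

```python
import numpy as np

def mackey_glass(n_steps, dt=0.1, a=0.2, b=0.1, r=10, tau=17.0, x0=0.8):
    """Forward-Euler integration of Eq. (14) with constant history x0."""
    delay = int(round(tau / dt))             # delay expressed in steps
    x = np.full(n_steps + delay, x0)         # prepend the history segment
    for i in range(delay, n_steps + delay - 1):
        x_tau = x[i - delay]                 # x(t - tau) from the buffer
        x[i + 1] = x[i] + dt * (a * x_tau / (1.0 + x_tau ** r) - b * x[i])
    return x[delay:]                         # drop the pre-history segment

series = mackey_glass(25000)                 # 25,000 samples at dt = 0.1
```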
The HAVOK analysis settings for the Mackey–Glass time series are given in Table 1. The $H$ matrix has $q = 5$ rows, and the rank of the SVD is $r = 5$. More details on the HAVOK analysis and the multi-step-ahead prediction for the Mackey–Glass time series are presented in Figure A1, Figure A2, Figure A3 and Figure A4 in Appendix A. Considering the properties of the $v_r$ curve, an LLN model with the LoLiMoT optimization method [7] was selected through experiments as the regressor for $v_r$. A comparison between the prediction accuracies of the proposed method and other models from the literature is summarized in Table 3. As Table 3 shows, in terms of both RMSE and NMSE, the proposed method outperforms the existing models.

3.3. Sunspot Time Series

The number of sunspots observed on the solar surface varies with a period of approximately 11 years, and this variation has a large impact on Earth and its climate. The monthly smoothed sunspot number time series, provided by the Solar Influences Data Analysis Center (http://sidc.oma.be/index.php (accessed on 7 August 2021)), is used for training and for forecasting the trend of the sunspot variation. In order to compare the results with those of the existing published papers listed in Table 4, the series from November 1834 to June 2001 (2000 data points) is chosen and scaled between 0 and 1. The first 1000 samples are selected as the training set (Figure 8), and the remaining 1000 samples are used as the testing set. The settings of the HAVOK analysis for the sunspot time series are given in Table 1. As for the Mackey–Glass time series, an LLN model is used as the regressor for $v_r$. The details of the HAVOK decomposition and the multi-step prediction for the sunspot time series are shown in Figure A5, Figure A6, Figure A7 and Figure A8 in Appendix A. A comparison between the prediction errors (RMSE and NMSE) obtained on the 1000 testing samples and those of other models is presented in Table 4. It can be seen that the proposed HAVOK-ML method outperforms the existing methods in predicting the sunspot chaotic time series.

4. Discussion and Conclusions

In this paper, a HAVOK-ML method combining HAVOK analysis with machine learning to predict chaotic time series is proposed. Based on the HAVOK analysis, the observed chaotic dynamical system is reconstructed as a linear model with an external intermittent forcing. A machine learning method is applied to predict the external forcing term from previously observed values. Finally, the combination of the HAVOK analysis with machine learning produces a closed model for prediction. It is worth noting that the machine learning method used in HAVOK-ML varies depending on the properties of the external forcing term. The developed method has been validated on multi-step-ahead prediction of several classic chaotic time series (the Lorenz, Mackey–Glass, and Sunspot time series). The experimental results show that the method can produce accurate forecasts even with simple machine learning algorithms. The prediction performance of the proposed method has been compared with that of other forecasting models in the literature, and the comparison shows that the proposed method outperforms the existing ones in forecasting accuracy. Although HAVOK-ML can be combined with different machine learning methods, it does not yet offer guidance on how to choose the machine learning method for a given time series forecasting problem. This is worth studying in the future.

Author Contributions

Conceptualization, J.Y.; Data curation, J.Z. and C.Z.; Formal analysis, J.Z., J.S. and H.L.; Funding acquisition, H.L.; Investigation, J.Y., J.Z. and C.Z.; Methodology, J.Y. and J.Z.; Project administration, J.S. and J.W.; Supervision, J.S. and J.W.; Validation, J.Y. and H.L.; Writing—original draft, J.Y.; Writing—review and editing, J.Y. and J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research is partially supported by the National Natural Science Foundation of China (Grant Nos. 41605070, 61802424).

Data Availability Statement

The code and the dataset supporting the results of this article are available at https://gitee.com/yangjinhui11/havok_py/tree/master (accessed on 14 August 2021).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Figures of Mackey–Glass Time Series and Sunspot Time Series

Figure A1. Decomposition of the Mackey–Glass chaotic series with HAVOK analysis (similar to Figure 3).
Figure A2. The LLN model with the LoLiMoT optimization method, used to predict $v_r$ of the Mackey–Glass chaotic series. The previously observed values $[x(t-5), x(t-4), x(t-3), x(t-2), x(t-1)]$ are used to predict the next value $v_r(t+1)$, with $dt = 0.1$ s.
Figure A3. Comparison of the original time series samples of Mackey–Glass and the multi-step predicted values, with one-step length of 0.1 s.
Figure A4. Error growth of the multi-step prediction of the Mackey–Glass chaotic series for 4000 samples, with one-step length of 0.1 s.
Figure A5. Decomposition of the sunspot series with HAVOK analysis (similar to Figure 3).
Figure A6. The LLN model with the LoLiMoT optimization method, used to predict $v_r$ of the sunspot series. The previously observed values $[x(t-140), x(t-125), x(t-110), x(t-95), \ldots, x(t-35), x(t-20), x(t-5)]$ are used for the prediction.
Figure A7. Multi-step ahead prediction of sunspot series with one-step length of 1 month.
Figure A8. Error growth of the multi-step prediction of the sunspot series with one-step length of 1 month.

References

  1. Lorenz, E.N. Deterministic nonperiodic flow. J. Atmos. Sci. 1963, 20, 130–141.
  2. Bjørnstad, O.N.; Grenfell, B.T. Noisy Clockwork: Time Series Analysis of Population Fluctuations in Animals. Science 2001, 293, 638–643.
  3. Sugihara, G.; May, R.; Ye, H.; Hsieh, C.H.; Deyle, E.; Fogarty, M.; Munch, S. Detecting Causality in Complex Ecosystems. Science 2012, 338, 496–500.
  4. Ye, H.; Beamish, R.J.; Glaser, S.M.; Grant, S.; Hsieh, C.H.; Richards, L.J.; Schnute, J.T.; Sugihara, G. Equation-free mechanistic ecosystem forecasting using empirical dynamic modeling. Proc. Natl. Acad. Sci. USA 2015, 112, E1569.
  5. Sugihara, G.; May, R.M. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature 1990, 344, 734–741.
  6. Chen, S.; Cowan, C.; Grant, P. Orthogonal least squares learning algorithm for radial basis function networks. IEEE Trans. Neural Netw. 1991, 2, 302–309.
  7. Predicting chaotic time series using neural and neurofuzzy models: A comparative study. Neural Process. Lett. 2006, 24, 217–239.
  8. Chen, Y.; Yang, B.; Dong, J.; Abraham, A. Time-series forecasting using flexible neural tree model. Inf. Sci. 2005, 174, 219–235.
  9. Chandra, R.; Zhang, M. Cooperative coevolution of Elman recurrent neural networks for chaotic time series prediction. Neurocomputing 2012, 86, 116–123.
  10. Ma, Q.L.; Zheng, Q.L.; Peng, H.; Zhong, T.W.; Xu, L.Q. Chaotic Time Series Prediction Based on Evolving Recurrent Neural Networks. In Proceedings of the 2007 International Conference on Machine Learning and Cybernetics, Hong Kong, China, 19–22 August 2007; Volume 6, pp. 3496–3500.
  11. Koskela, T.; Lehtokangas, M.; Saarinen, J.; Kaski, K. Time Series Prediction with Multilayer Perceptron, FIR and Elman Neural Networks. In Proceedings of the World Congress on Neural Networks; INNS Press: San Diego, CA, USA, 1996; pp. 491–496.
  12. Kuremoto, T.; Kimura, S.; Kobayashi, K.; Obayashi, M. Time series forecasting using a deep belief network with restricted Boltzmann machines. Neurocomputing 2014, 137, 47–56.
  13. Ardalani-Farsa, M.; Zolfaghari, S. Chaotic time series prediction with residual analysis method using hybrid Elman–NARX neural networks. Neurocomputing 2010, 73, 2540–2553.
  14. Brunton, S.L.; Brunton, B.W.; Proctor, J.L.; Kaiser, E.; Kutz, J.N. Chaos as an Intermittently Forced Linear System. Nat. Commun. 2016, 8, 19.
  15. Inoussa, G.; Peng, H.; Wu, J. Nonlinear time series modeling and prediction using functional weights wavelet neural network-based state-dependent AR model. Neurocomputing 2012, 86, 59–74.
  16. Zhu, L.; Wang, Y.; Fan, Q. MODWT-ARMA model for time series prediction. Appl. Math. Model. 2014, 38, 1859–1865.
  17. Ong, P.; Zainuddin, Z. Optimizing wavelet neural networks using modified cuckoo search for multi-step ahead chaotic time series prediction. Appl. Soft Comput. 2019, 80, 374–386.
  18. Wang, X.; Ma, L.; Wang, B.; Wang, T. A hybrid optimization-based recurrent neural network for real-time data prediction. Neurocomputing 2013, 120, 547–559.
  19. Bhardwaj, S.; Srivastava, S.; Gupta, J.R.P. Pattern-Similarity-Based Model for Time Series Prediction. Comput. Intell. 2015, 31, 106–131.
  20. Smith, C.; Jin, Y. Evolutionary multi-objective generation of recurrent neural network ensembles for time series prediction. Neurocomputing 2014, 143, 302–311.
  21. Ho, D.T.; Garibaldi, J.M. Context-Dependent Fuzzy Systems with Application to Time-Series Prediction. IEEE Trans. Fuzzy Syst. 2014, 22, 778–790.
  22. Takens, F. Detecting strange attractors in turbulence. In Dynamical Systems and Turbulence, Warwick 1980; Rand, D., Young, L.S., Eds.; Springer: Berlin/Heidelberg, Germany, 1981; pp. 366–381.
  23. Tu, J.H.; Rowley, C.W.; Luchtenburg, D.M.; Brunton, S.L.; Kutz, J.N. On dynamic mode decomposition: Theory and applications. J. Comput. Dyn. 2014, 1, 391–421.
  24. Brunton, S.L.; Proctor, J.L.; Kutz, J.N. Discovering governing equations from data: Sparse identification of nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 2015, 113, 3932.
  25. Ao, Y.; Li, H.; Zhu, L.; Ali, S.; Yang, Z. The linear random forest algorithm and its advantages in machine learning assisted logging regression modeling. J. Pet. Sci. Eng. 2019, 174, 776–789.
  26. Mackey, M.C.; Glass, L. Oscillation and Chaos in Physiological Control Systems. Science 1977, 197, 287–289.
  27. Gan, M.; Peng, H.; Peng, X.; Chen, X.; Inoussa, G. A locally linear RBF network-based state-dependent AR model for nonlinear time series modeling. Inf. Sci. 2010, 180, 4370–4383.
  28. Ganjefar, S.; Tofighi, M. Optimization of quantum-inspired neural network using memetic algorithm for function approximation and chaotic time series prediction. Neurocomputing 2018, 291, 175–186.
  29. Woolley, J.W.; Agarwal, P.K.; Baker, J. Modeling and prediction of chaotic systems with artificial neural networks. Int. J. Numer. Methods Fluids 2010, 63, 989–1004.
Figure 1. The architecture of the HAVOK-ML method for one-step prediction. The SVD of the Hankel matrix $H$ yields the eigen time series $V^T$. On the one hand, the HAVOK analysis gives a linear system for the first $r-1$ variables with $v_r(t)$ as an external input. On the other hand, the evolution of $v_r(t)$ is established with the machine learning method. Hence, a closed linear model for the first $r$ variables is available. Symbols with the superscript + stand for values at the next step $t+1$.
Figure 2. The time series $x(t)$ of the Lorenz system. The initial condition is (−8, 8, 27). The training data are chosen from the 3rd to the 100th second.
Figure 3. HAVOK analysis for the Lorenz chaotic series $x(t)$. From upper left to bottom right: matrix $A$, vector $B$, reconstruction of $v_1(t)$ using the linear HAVOK model with forcing $v_r(t)$, reconstruction of $x(t)$, and the external forcing input $v_r(t)$.
Figure 4. The random forest regressor for $v_r$, using previously observed values $[x(t-40), x(t-35), \ldots, x(t-10), x(t-5)]$ to predict the next value $v_r(t+1)$, with $\Delta t = 0.001$.
Figure 5. Comparison of the original time series samples and the multi-step predicted values, with one-step length of 0.01 s, on the Lorenz time series.
Figure 6. RMSE of the multi-step-ahead prediction of the Lorenz time series as a function of the number of steps (N), with one-step length of 0.01 s.
Figure 7. Time series of the Mackey–Glass system. The initial condition is 0.8, and the training data are chosen from the 300th to the 2000th second.
Figure 8. Time series of the sunspot number normalized to [−1, 1]. The training period ranges from November 1834 to March 1918.
Table 1. HAVOK analysis parameters for each system.

| System | Samples | dt | Δt | q | Rank (r) | Regressor for $v_r$ |
|---|---|---|---|---|---|---|
| Lorenz | 20,000 | 0.01 s | 0.001 s | 40 | 11 | RandomForest |
| Mackey–Glass | 50,000 | 0.1 s | / | 5 | 5 | LoLiMoT |
| Sunspot | 2000 | 1 month | 0.02 month | 140 | 7 | LoLiMoT |
Table 2. Comparison of the models in one-step prediction of the Lorenz chaotic series $x(t)$, with 1000 testing samples. The last row represents the proposed HAVOK-ML combined with the RFR method. The highest prediction accuracies achieved by the models are shown in bold.

| Model | RMSE | NMSE | Reference |
|---|---|---|---|
| Deep Belief Network | 1.02 × 10⁻² | / | [12] |
| Elman–NARX neural networks | 1.08 × 10⁻⁴ | 1.98 × 10⁻¹⁰ | [13] |
| WNN | / | **9.84 × 10⁻¹⁵** | [15] |
| Fuzzy Inference System | 3.1 × 10⁻³ | / | [19] |
| Local Linear Neural Fuzzy | / | 9.80 × 10⁻¹⁰ | [7] |
| Local Linear Radial Basis Function Networks | / | 4.53 × 10⁻¹² | [27] |
| WNNs with MCSA | 8.20 × 10⁻³ | 1.22 × 10⁻⁶ | [17] |
| HAVOK_ML(RFR) | **1.43 × 10⁻⁵** | 3.23 × 10⁻¹² | |
Table 3. Comparison of the models in six-step-ahead prediction of the Mackey–Glass time series, with 4000 testing samples. The last row shows the proposed HAVOK-ML method with the LLN model as the regressor. The values in bold are the highest prediction accuracies achieved by the models.

| Model | RMSE | NMSE | Reference |
|---|---|---|---|
| ARMA with Maximal Overlap Discrete Wavelet Transform | / | 5.3373 × 10⁻⁷ | [16] |
| Ensembles of Recurrent Neural Network | 7.533 × 10⁻³ | 8.29 × 10⁻⁴ | [20] |
| Quantum-Inspired Neural Network | 9.70 × 10⁻⁴ | / | [28] |
| Recurrent Neural Network | 6.25 × 10⁻⁴ | / | [18] |
| Type-1 Fuzzy System | 4.8 × 10⁻⁴ | / | [21] |
| Fuzzy Inference System | 7.1 × 10⁻⁴ | / | [19] |
| WNNs with MCSA | 5.60 × 10⁻⁵ | 6.25 × 10⁻⁸ | [17] |
| HAVOK_ML(LLN) | **9.92 × 10⁻⁶** | **1.86 × 10⁻⁹** | |
Table 4. Comparison of the models in one-step-ahead prediction of the sunspot time series, with 1000 testing samples. The last row shows the proposed HAVOK-ML method with the LLN model as the regressor. The values in bold are the highest prediction accuracies achieved by the models.

| Model | RMSE | NMSE | Reference |
|---|---|---|---|
| Elman–NARX Neural Networks | 1.19 × 10⁻² | 5.90 × 10⁻⁴ | [13] |
| Elman Recurrent Neural Networks | 5.58 × 10⁻² | 1.92 × 10⁻² | [29] |
| Ensembles of Recurrent Neural Network | 1.52 × 10⁻² | 9.64 × 10⁻⁴ | [20] |
| Fuzzy Inference System | 1.18 × 10⁻² | 5.32 × 10⁻⁴ | [19] |
| Functional Weights WNNs State-Dependent Autoregressive Model | 1.12 × 10⁻² | 5.24 × 10⁻⁴ | [21] |
| WNNs with MCSA | 1.13 × 10⁻² | 5.30 × 10⁻⁴ | [17] |
| HAVOK_ML(LLN) | **4.25 × 10⁻³** | **7.40 × 10⁻⁵** | |

Citation: Yang, J.; Zhao, J.; Song, J.; Wu, J.; Zhao, C.; Leng, H. A Hybrid Method Using HAVOK Analysis and Machine Learning for Predicting Chaotic Time Series. Entropy 2022, 24, 408. https://doi.org/10.3390/e24030408