Article

A Hybrid Method Using HAVOK Analysis and Machine Learning for Predicting Chaotic Time Series

College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410000, China
* Author to whom correspondence should be addressed.
Entropy 2022, 24(3), 408; https://doi.org/10.3390/e24030408
Submission received: 17 February 2022 / Revised: 5 March 2022 / Accepted: 6 March 2022 / Published: 15 March 2022

Abstract
The prediction of chaotic time series has remained a challenging problem in recent decades. A hybrid method using Hankel Alternative View Of Koopman (HAVOK) analysis and machine learning (HAVOK-ML) is developed to predict chaotic time series. HAVOK-ML simulates the time series by reconstructing it as a closed linear model and uses this model for prediction. It decomposes the chaotic dynamics into intermittently forced linear systems by HAVOK analysis and estimates the external intermittent forcing term using machine learning. The prediction performance evaluations confirm that the proposed method has superior forecasting skill compared with existing prediction methods.

1. Introduction

A chaotic system is a deterministic system that exhibits irregular, seemingly random motion; its behavior is aperiodic and difficult to repeat or predict. High sensitivity to initial conditions and inherent long-term unpredictability are the main characteristics of chaotic systems. Chaotic phenomena are ubiquitous in many scientific fields, such as atmospheric motion [1], population dynamics [2,3,4], epidemiology [5], and economics, where they have long attracted wide research attention. It is worth noting that a chaotic system is not completely random, as its name might suggest, but possesses definite structure and patterns. However, because the dynamic mechanisms of chaotic systems are poorly understood, the prediction of chaotic time series remains an important but challenging problem.
With the development of big data and advanced algorithms in machine learning, solving prediction problems of chaotic systems in a data-driven way has become a new research direction. Several empirical models for predicting chaotic time series based on machine learning have been proposed. Many well-known artificial neural network (ANN) models, such as the Radial Basis Function (RBF) neural network [6], the neuro-fuzzy model with Locally Linear Model Tree (LoLiMoT) [7], the feedforward neural network [8], the multi-layer perceptron (MLP) [9], recurrent neural networks (RNN) [10], the finite impulse response (FIR) neural network [11], deep belief nets (DBN) [12], the Elman neural network [11,13,14], and the wavelet neural network (WNN) [15,16], have been introduced in the literature.
However, the performance of these models depends strongly on the settings of the network parameters. Consequently, substantial work has also been devoted to optimization algorithms and parameter settings. Min Gan et al. presented a state-dependent autoregressive (SD-AR) model that uses a set of locally linear radial basis function networks (LLRBFNs) to approximate its functional coefficients [17]. In addition, Pauline Ong and Zarita Zainuddin presented a modified cuckoo search algorithm (MCSA) to initialize WNN models [16]. A hybrid learning algorithm called HGAGD, which combines a genetic algorithm (GA) with gradient descent (GD), has been proposed to optimize the parameters of a quantum-inspired neural network (QNN) [18]. In [13], the embedding method is used together with an Elman neural network (ENN) to predict the residual time series. A hidden Markov model (HMM) combined with fuzzy inference systems has been introduced for time series prediction [19]. Moreover, many hybrid methods have been developed to improve the performance of these prediction models [20,21].
As mentioned above, model structure and parameter tuning are decisive factors in chaotic time series prediction with machine learning, and much research has focused on them. To simplify the learning model, a hybrid method using Hankel Alternative View Of Koopman (HAVOK) analysis and machine learning (HAVOK-ML) is developed in this work to predict chaotic time series. HAVOK analysis was proposed by Brunton et al. [22]. It combines the delay embedding method [23] with Koopman theory [24] to decompose chaotic dynamics into a linear model with intermittent forcing. HAVOK-ML first decomposes the chaotic dynamics into an intermittently forced linear system with HAVOK and then estimates the forcing term using machine learning. Essentially, prediction with HAVOK-ML reduces to solving linear ordinary differential equations, which can be done efficiently. The framework can incorporate different types of regression methods, such as linear regression or Random Forest Regression (RFR) [25], and thus combines the advantages of HAVOK theory and machine learning. Therefore, it can obtain better prediction results than directly applying those machine learning models.
This paper is organized as follows. Section 2 briefly describes the theory of the HAVOK analysis combined with the machine learning method for time series prediction. Section 3 applies the proposed combined method to perform multi-step ahead prediction for some well-known chaotic time series and also compares the obtained prediction performance with that of existing prediction models. Finally, conclusions are given in Section 4.

2. HAVOK-ML Method

Consider a nonlinear system of the form:
$$\frac{d\mathbf{x}(t)}{dt} = \mathbf{f}(\mathbf{x}(t)) \tag{1}$$

where $\mathbf{x}(t) \in \mathbb{R}^n$ is the state of the system at time $t$ and $\mathbf{f}$ denotes the dynamics of the system. For a given state $\mathbf{x}(t_0)$ at time $t_0$, $\mathbf{x}(t_0 + t)$ is given by:

$$\mathbf{x}(t_0 + t) = \mathbf{x}(t_0) + \int_{t_0}^{t_0 + t} \mathbf{f}(\mathbf{x}(\tau))\, d\tau \tag{2}$$
Generally speaking, for an observed chaotic time series $x(t)$, the governing function $\mathbf{f}$ is highly nonlinear and unknown. HAVOK analysis [22] provides linear representations of such unknown nonlinear systems. A Hankel matrix $H$ is constructed from the single measurement $x(t)$ and factored by singular value decomposition (SVD):
$$H = \begin{bmatrix} x(t_1) & x(t_2) & \cdots & x(t_p) \\ x(t_2) & x(t_3) & \cdots & x(t_{p+1}) \\ \vdots & \vdots & \ddots & \vdots \\ x(t_q) & x(t_{q+1}) & \cdots & x(t_m) \end{bmatrix} = U \Sigma V^T \tag{3}$$

where $m = p + q - 1$, and $p$ and $q$ are two parameters that determine the dimensions of $H$. The columns of $H$ are defined by:

$$\mathbf{h}(i) = [x(t_i), x(t_{i+1}), \ldots, x(t_{i+q-1})]^T, \quad i = 1, \ldots, p \tag{4}$$

so that

$$H = [\,\mathbf{h}(1)\;\; \mathbf{h}(2)\;\; \cdots\;\; \mathbf{h}(p)\,] \tag{5}$$
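The construction of $H$ and its truncated SVD in Equations (3)–(5) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code; the function name and the toy sine signal are assumptions, and the array layout follows the $q \times p$ convention above:

```python
import numpy as np

def hankel_svd(x, q, r):
    """Build the q-row Hankel matrix of a scalar series x and
    return its rank-r truncated SVD factors (Eq. (3))."""
    p = len(x) - q + 1                       # number of columns; m = p + q - 1
    H = np.column_stack([x[i:i + q] for i in range(p)])
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    return U[:, :r], s[:r], Vt[:r, :]        # first r modes approximate H

# toy usage: a clean sine sampled at dt = 0.01 (its Hankel matrix has rank 2,
# so a rank-5 truncation reconstructs H essentially exactly)
t = np.arange(0.0, 10.0, 0.01)
x = np.sin(2.0 * np.pi * t)
U, s, Vt = hankel_svd(x, q=40, r=5)
H_approx = U @ np.diag(s) @ Vt               # low-rank reconstruction of H
```

The rows of `Vt` are the eigen-time-delay coordinates $v_1, \ldots, v_r$ used by the HAVOK model below.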
Usually, $H$ can be well approximated by the first $r$ singular vectors of $U$ and $V$. According to the HAVOK analysis [22], the first $r-1$ variables in $V$ can be modeled as a linear system with the last variable $v_r$ acting as a forcing term:

$$\frac{d\mathbf{v}_{r-1}(t)}{dt} = A\,\mathbf{v}_{r-1}(t) + B\,v_r(t) \tag{6}$$

where $\mathbf{v}_{r-1} = [v_1, v_2, \ldots, v_{r-1}]^T$ is the vector of the first $r-1$ eigen-time-delay coordinates and $v_r$ is given by the $r$th column of $V$. Note that Equation (6) is not a closed model because $v_r(t)$ is an external input forcing. In the linear HAVOK model, the matrix $A$ and the vector $B$ may be obtained by the Sparse Identification of Nonlinear Dynamics (SINDy) algorithm [26] or by a straightforward linear regression procedure.
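For the plain linear regression route, $A$ and $B$ can be fit by ordinary least squares on finite-difference derivatives of the delay coordinates. The following is a minimal sketch under the assumption that $V$ has at least $r$ columns; the function name and the central-difference choice are illustrative, not taken from the paper:

```python
import numpy as np

def fit_havok_linear(V, r, dt):
    """Least-squares fit of dv/dt = A v + B v_r (Eq. (6)).
    V: (samples, modes) array of eigen-time-delay coordinates."""
    v = V[:, :r]                                        # first r coordinates
    dv = (v[2:, :r - 1] - v[:-2, :r - 1]) / (2.0 * dt)  # central differences
    X = v[1:-1, :]                                      # regressors [v_1..v_{r-1}, v_r]
    Xi, *_ = np.linalg.lstsq(X, dv, rcond=None)         # dv ≈ X @ Xi
    A = Xi[:r - 1, :].T                                 # (r-1, r-1) state matrix
    B = Xi[r - 1, :].reshape(-1, 1)                     # (r-1, 1) forcing vector
    return A, B
```

As a sanity check, data generated from a known linear system (a pure rotation with zero forcing) recovers the expected $A$ up to finite-difference error.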
A machine learning method is used to predict $v_r(t+1)$ from previously observed values $[x(t - D\Delta t), x(t - (D-1)\Delta t), \ldots, x(t - \Delta t)]$, as shown in Figure 1. Suppose $v_r$ varies linearly within the interval $[t, t+1]$. The evolution of $v_r(t)$ can then be approximated by:

$$\frac{dv_r(t)}{dt} \approx \frac{v_r(t+1) - v_r(t)}{\Delta t} \tag{7}$$
Then, the first $r-1$ variables $\mathbf{v}_{r-1}(t+1)$ are obtained by solving the linear model of Equation (6):

$$\begin{bmatrix} \mathbf{v}_{r-1}(t+1) \\ v_r(t+1) \\ v_r(t+1) - v_r(t) \end{bmatrix} = \exp\!\left( \begin{bmatrix} A\,dt & B\,dt & 0 \\ 0 & 0 & I \\ 0 & 0 & 0 \end{bmatrix} \right) \begin{bmatrix} \mathbf{v}_{r-1}(t) \\ v_r(t) \\ v_r(t+1) - v_r(t) \end{bmatrix} \tag{8}$$
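The update in Equation (8) amounts to one matrix exponential of the augmented system. A minimal sketch, assuming `A` and `B` come from the HAVOK fit and the next forcing value `vr_next` is supplied by the learned regressor (the function name is illustrative):

```python
import numpy as np
from scipy.linalg import expm

def havok_step(v, vr_next, A, B, dt):
    """One step of the closed linear model: propagate the first r-1
    coordinates given the predicted forcing value vr_next (Eq. (8)).
    v: current state [v_1, ..., v_{r-1}, v_r]."""
    n = A.shape[0]
    # augmented generator for d/dt [v_{r-1}; v_r; dv_r], dv_r held constant
    M = np.zeros((n + 2, n + 2))
    M[:n, :n] = A * dt
    M[:n, n] = (B * dt).ravel()
    M[n, n + 1] = 1.0
    state = np.concatenate([v[:n], [v[n]], [vr_next - v[n]]])
    out = expm(M) @ state
    return out[:n + 1]        # [v_{r-1}(t+1); v_r(t+1)]
```

For a scalar system $\dot{v}_1 = v_r$ with constant forcing $v_r = 1$ and $dt = 1$, one step moves $v_1$ from 0 to 1, as expected from the exact solution.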
Assume that the integration starts at time $t_p$ for an input $\mathbf{h}(p)$. Then, the next step $\mathbf{h}(p+1)$ can be written as:

$$\mathbf{h}(p+1) = U \Sigma\, \mathbf{v}(t_{p+1}) \tag{9}$$

where $\mathbf{v} = [v_1, v_2, \ldots, v_{r-1}, v_r]^T$ is the vector containing the first $r$ variables. In order to evaluate the efficiency of HAVOK-ML, the RMSE, NMSE, and $R^2$ score defined below are used as performance indices.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} \tag{10}$$

$$\mathrm{NMSE} = \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \tag{11}$$

$$R^2 = \frac{\sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \tag{12}$$
where $y_i$, $\hat{y}_i$, and $\bar{y}$ represent the observed data, the predicted data, and the mean of the observed data, respectively. The Root Mean Squared Error (RMSE) and the Normalized Mean Squared Error (NMSE) are used to assess the accuracy of the prediction and to compare the results with those in the literature. The $R^2$ score is used to evaluate the machine learning based prediction of $v_r$.
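The three indices in Equations (10)–(12) are straightforward to compute; a sketch follows. Note that the ratio form of $R^2$ above is written out explicitly, since library helpers such as `sklearn.metrics.r2_score` use the alternative $1 - \mathrm{SS_{res}}/\mathrm{SS_{tot}}$ definition:

```python
import numpy as np

def rmse(y, y_hat):
    """Root Mean Squared Error, Eq. (10)."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

def nmse(y, y_hat):
    """Normalized Mean Squared Error, Eq. (11)."""
    return np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def r2(y, y_hat):
    """R^2 score in the explained-variance ratio form of Eq. (12)."""
    return np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)

y = np.array([1.0, 2.0, 3.0, 4.0])
print(rmse(y, y), nmse(y, y), r2(y, y))   # perfect prediction: 0.0 0.0 1.0
```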

3. Numerical Experiments

In this section, three different types of time series—Lorenz [1], Mackey–Glass [26], and Sunspot—are used to verify the proposed HAVOK-ML method. The parameters adopted in the HAVOK analysis of these series are listed in Table 1.

3.1. Lorenz Time Series

The Lorenz system [1] is among the most famous chaotic systems. It is described by:

$$\frac{dx}{dt} = \sigma(y - x), \qquad \frac{dy}{dt} = x(\rho - z) - y, \qquad \frac{dz}{dt} = xy - \beta z \tag{13}$$

The chaotic time series is obtained with parameters $\sigma = 10$, $\rho = 28$, $\beta = 8/3$, and a sampling step of $dt = 0.01$ s. In this study, only the time series of the variable $x(t)$, shown in Figure 2, is considered.
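The $x(t)$ samples can be generated with a standard ODE solver. The sketch below reproduces the stated parameters and the initial condition $(-8, 8, 27)$ from Figure 2; it is an illustrative pipeline, not necessarily the authors' exact one:

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system, Eq. (13)."""
    x, y, z = s
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

dt = 0.01
t_eval = np.arange(0.0, 100.0, dt)
sol = solve_ivp(lorenz, (0.0, 100.0), [-8.0, 8.0, 27.0],
                t_eval=t_eval, rtol=1e-9, atol=1e-9)
x = sol.y[0]          # the measured series used to build the Hankel matrix
```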
In this research, HAVOK-ML decomposes the chaotic dynamics into intermittently forced linear systems by HAVOK analysis; the settings of the HAVOK analysis are given in Table 1, and the sampling time step for each system is consistent with the references listed in Table 2. However, following the advice in [14], the samples of the Lorenz system are interpolated to a 0.001 s resolution for the HAVOK analysis.
By applying HAVOK analysis to the training data, a linear HAVOK model (Equation (6)) is developed. As shown in Figure 3, the matrix $A$ and the vector $B$ are sparse, and the reconstructions of $v_1(t)$ and $x(t)$ are coherent with the actual values over the full time range. Since $v_r$ (Figure 4) is not smooth, extensive experiments showed that the RFR method [25] predicts it best among the candidates tested. Hence, an RFR model is trained to estimate the next step $v_r(t+1)$ from the previously observed values $[x(t-40), x(t-35), \ldots, x(t-10), x(t-5)]$. The samples from the 3rd to the 100th second are split into a training set (first 80%) and a test set (remaining 20%). The $R^2$ score of the RFR method on the test set is 0.87. As shown in Figure 4, the estimated results are largely consistent with the actual values.
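The forcing regression can be sketched with scikit-learn. This is illustrative only: the lag indices follow the $[x(t-40), x(t-35), \ldots, x(t-5)]$ pattern described above, but the synthetic noisy sine used here merely stands in for the true $v_r$ target, and the helper name is an assumption:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def lag_features(x, lags):
    """Stack lagged copies of x; row t holds [x(t - l) for l in lags]."""
    L = max(lags)
    return np.column_stack([x[L - l:len(x) - l] for l in lags])

rng = np.random.default_rng(0)
x = np.sin(np.arange(5000) * 0.01) + 0.01 * rng.standard_normal(5000)

lags = list(range(40, 0, -5))            # 40, 35, ..., 5 steps back
X = lag_features(x, lags)
vr = x[max(lags):]                       # stand-in target for v_r(t+1)

split = int(0.8 * len(vr))               # first 80% train, last 20% test
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:split], vr[:split])
print(model.score(X[split:], vr[split:]))   # held-out R^2 score
```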
In the next experiment, the HAVOK-ML method is used for N-step recursive prediction of the Lorenz time series. In recursive prediction, the current predicted values are used for the next predictions without any correction toward the actual values. A comparison between the multi-step predicted values and the original time series, with 1000 testing samples, is shown in Figure 5. In the initial prediction steps (fewer than 10), the prediction results are coherent with the actual values. The error grows with the number of prediction steps, especially near the extreme points of the curve. The RMSE of the prediction as a function of time is presented in Figure 6. The error increases quickly with the prediction time, which means that long-term prediction is essentially impossible. At step 10, the obtained RMSE is $9.003 \times 10^{-3}$ (Figure 6), which is significantly better than the result reported in the literature (0.014) [27].
Table 2 presents the one-step-ahead prediction errors (RMSE and NMSE) of the proposed method together with results obtained by existing methods, extracted from the literature. The RMSE of the proposed method is the best, while in terms of NMSE the proposed method is second only to the functional weights WNN state-dependent AR (FWWNN-AR) model [15].

3.2. Mackey–Glass Time Series

The Mackey–Glass chaotic time series was introduced as a model of white blood cell production [26]. It is described by:

$$\frac{dx(t)}{dt} = \frac{a\,x(t-\tau)}{1 + x^r(t-\tau)} - b\,x(t), \qquad t > 0 \tag{14}$$

where $a = 0.2$, $b = 0.1$, $r = 10$, and $\tau = 17$, as in the other published papers listed in Table 3. The Mackey–Glass equation is solved using the delay differential equation solver dde23 of MATLAB. A chaotic time series of 25,000 samples, with time step $dt = 0.1$, is generated. The samples from the 300th to the 2000th second, shown in Figure 7, are chosen as the training set, while the rest are used as the test set.
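Absent MATLAB's dde23, the delay equation (14) can be integrated with a simple fixed-step scheme that keeps a history buffer for the delayed term. This is a rough sketch assuming a constant history $x(t \le 0) = 0.8$; forward Euler is less accurate than dde23's adaptive method, and the function name is illustrative:

```python
import numpy as np

def mackey_glass(n_steps, dt=0.1, a=0.2, b=0.1, r=10, tau=17.0, x0=0.8):
    """Forward-Euler integration of Eq. (14) with constant history x0."""
    delay = int(round(tau / dt))             # delay expressed in steps
    x = np.full(n_steps + delay, x0)         # prepend the history segment
    for i in range(delay, n_steps + delay - 1):
        x_tau = x[i - delay]                 # x(t - tau) from the buffer
        x[i + 1] = x[i] + dt * (a * x_tau / (1.0 + x_tau ** r) - b * x[i])
    return x[delay:]                         # drop the pre-history segment

series = mackey_glass(25000)                 # 25,000 samples at dt = 0.1
```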
The HAVOK analysis settings for the Mackey–Glass time series are given in Table 1. The $H$ matrix has $q = 5$ rows, and the rank of the SVD is $r = 5$. More details on the HAVOK analysis and the multi-step-ahead prediction for the Mackey–Glass time series are presented in Figure A1, Figure A2, Figure A3 and Figure A4 in Appendix A. Considering the properties of the $v_r$ curve, an LLN model with the LoLiMoT optimization method [7] was selected through experiments as the regressor for $v_r$. A comparison between the prediction accuracies of the proposed method and other models from the literature is summarized in Table 3. As Table 3 shows, in terms of both RMSE and NMSE, the proposed method outperforms the existing models.

3.3. Sunspot Time Series

The number of sunspots observed on the solar surface varies with a period of approximately 11 years, and this variation has a large impact on Earth and its climate. The monthly smoothed sunspot number time series, provided by the Solar Influences Data Analysis Center (http://sidc.oma.be/index.php (accessed on 7 August 2021)), is used for training and for forecasting the trend of the sunspot variation. In order to compare the results with those of the existing published papers listed in Table 4, the series from November 1834 to June 2001 (2000 data points) is chosen and scaled between 0 and 1. The first 1000 samples are selected as the training set (Figure 8), and the remaining 1000 samples are used as the testing set. The settings of the HAVOK analysis for the sunspot time series are given in Table 1. As for the Mackey–Glass time series, an LLN model is used as the regressor for $v_r$. The details of the HAVOK decomposition and the multi-step prediction for the sunspot time series are shown in Figure A5, Figure A6, Figure A7 and Figure A8 in Appendix A. A comparison between the prediction errors (RMSE and NMSE) obtained on the 1000 testing samples and those of other models is presented in Table 4. It can be seen that the proposed HAVOK-ML method outperforms the existing methods in predicting the sunspot chaotic time series.

4. Discussion and Conclusions

In this paper, a HAVOK-ML method combining HAVOK analysis with machine learning to predict chaotic time series is proposed. Based on the HAVOK analysis, the observed chaotic dynamical system is reconstructed as a linear model with an external intermittent forcing. A machine learning method is applied to predict the external forcing term from previously observed values. Finally, the combination of the HAVOK analysis with machine learning produces a closed model for prediction. It is worth noting that the machine learning method used in HAVOK-ML varies depending on the properties of the external forcing term. The developed method has been validated on multi-step-ahead prediction of several classic chaotic time series (the Lorenz, Mackey–Glass, and Sunspot time series). The experimental results show that the method can produce accurate forecasts even with simple machine learning algorithms. The prediction performance of the proposed method has been compared with that of other forecasting models in the literature, and the comparison shows that the proposed method outperforms the existing ones in forecasting accuracy. Although HAVOK-ML can be combined with different machine learning methods, it does not yet offer guidance on how to choose the machine learning method for a given time series forecasting problem. This is worth studying in the future.

Author Contributions

Conceptualization, J.Y.; Data curation, J.Z. and C.Z.; Formal analysis, J.Z., J.S. and H.L.; Funding acquisition, H.L.; Investigation, J.Y., J.Z. and C.Z.; Methodology, J.Y. and J.Z.; Project administration, J.S. and J.W.; Supervision, J.S. and J.W.; Validation, J.Y. and H.L.; Writing—original draft, J.Y.; Writing—review and editing, J.Y. and J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research is partially supported by the National Natural Science Foundation of China (Grant Nos. 41605070, 61802424).

Data Availability Statement

The code and the dataset supporting the results of this article are available at https://gitee.com/yangjinhui11/havok_py/tree/master (accessed on 14 August 2021).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Figures of Mackey–Glass Time Series and Sunspot Time Series

Figure A1. Decomposition of the Mackey–Glass chaotic series with HAVOK analysis (similar to Figure 3).
Figure A2. The LLN model with the LoLiMoT optimization method, used to predict $v_r$ of the Mackey–Glass chaotic series. The previously observed values $[x(t-5), x(t-4), x(t-3), x(t-2), x(t-1)]$ are used to predict the next value $v_r(t+1)$, with $dt = 0.1$ s.
Figure A3. Comparison of the original time series samples of Mackey–Glass and the multi-step predicted values, with one-step length of 0.1 s.
Figure A4. Error growth of the multi-step prediction of the Mackey–Glass chaotic series for 4000 samples, with one-step length of 0.1 s.
Figure A5. Decomposition of the sunspot series with HAVOK analysis (similar to Figure 3).
Figure A6. The LLN model with the LoLiMoT optimization method, used to predict $v_r$ of the sunspot series. The previously observed values $[x(t-140), x(t-125), x(t-110), x(t-95), \ldots, x(t-35), x(t-20), x(t-5)]$ are used for the prediction.
Figure A7. Multi-step ahead prediction of sunspot series with one-step length of 1 month.
Figure A8. Error growth of the multi-step prediction of the sunspot series with one-step length of 1 month.

References

  1. Lorenz, E.N. Deterministic nonperiodic flow. J. Atmos. Sci. 1963, 20, 130–141.
  2. Bjørnstad, O.N.; Grenfell, B.T. Noisy Clockwork: Time Series Analysis of Population Fluctuations in Animals. Science 2001, 293, 638–643.
  3. Sugihara, G.; May, R.; Ye, H.; Hsieh, C.H.; Deyle, E.; Fogarty, M.; Munch, S. Detecting Causality in Complex Ecosystems. Science 2012, 338, 496–500.
  4. Ye, H.; Beamish, R.J.; Glaser, S.M.; Grant, S.; Hsieh, C.H.; Richards, L.J.; Schnute, J.T.; Sugihara, G. Equation-free mechanistic ecosystem forecasting using empirical dynamic modeling. Proc. Natl. Acad. Sci. USA 2015, 112, E1569.
  5. Sugihara, G.; May, R.M. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature 1990, 344, 734–741.
  6. Chen, S.; Cowan, C.; Grant, P. Orthogonal least squares learning algorithm for radial basis function networks. IEEE Trans. Neural Netw. 1991, 2, 302–309.
  7. Predicting chaotic time series using neural and neurofuzzy models: A comparative study. Neural Process. Lett. 2006, 24, 217–239.
  8. Chen, Y.; Yang, B.; Dong, J.; Abraham, A. Time-series forecasting using flexible neural tree model. Inf. Sci. 2005, 174, 219–235.
  9. Chandra, R.; Zhang, M. Cooperative coevolution of Elman recurrent neural networks for chaotic time series prediction. Neurocomputing 2012, 86, 116–123.
  10. Ma, Q.L.; Zheng, Q.L.; Peng, H.; Zhong, T.W.; Xu, L.Q. Chaotic Time Series Prediction Based on Evolving Recurrent Neural Networks. In Proceedings of the 2007 International Conference on Machine Learning and Cybernetics, Hong Kong, China, 19–22 August 2007; Volume 6, pp. 3496–3500.
  11. Koskela, T.; Lehtokangas, M.; Saarinen, J.; Kaski, K. Time Series Prediction with Multilayer Perceptron, FIR and Elman Neural Networks. In Proceedings of the World Congress on Neural Networks; INNS Press: San Diego, CA, USA, 1996; pp. 491–496.
  12. Kuremoto, T.; Kimura, S.; Kobayashi, K.; Obayashi, M. Time series forecasting using a deep belief network with restricted Boltzmann machines. Neurocomputing 2014, 137, 47–56.
  13. Ardalani-Farsa, M.; Zolfaghari, S. Chaotic time series prediction with residual analysis method using hybrid Elman–NARX neural networks. Neurocomputing 2010, 73, 2540–2553.
  14. Brunton, S.L.; Brunton, B.W.; Proctor, J.L.; Kaiser, E.; Kutz, J.N. Chaos as an Intermittently Forced Linear System. Nat. Commun. 2016, 8, 19.
  15. Inoussa, G.; Peng, H.; Wu, J. Nonlinear time series modeling and prediction using functional weights wavelet neural network-based state-dependent AR model. Neurocomputing 2012, 86, 59–74.
  16. Zhu, L.; Wang, Y.; Fan, Q. MODWT-ARMA model for time series prediction. Appl. Math. Model. 2014, 38, 1859–1865.
  17. Ong, P.; Zainuddin, Z. Optimizing wavelet neural networks using modified cuckoo search for multi-step ahead chaotic time series prediction. Appl. Soft Comput. 2019, 80, 374–386.
  18. Wang, X.; Ma, L.; Wang, B.; Wang, T. A hybrid optimization-based recurrent neural network for real-time data prediction. Neurocomputing 2013, 120, 547–559.
  19. Bhardwaj, S.; Srivastava, S.; Gupta, J.R.P. Pattern-Similarity-Based Model for Time Series Prediction. Comput. Intell. 2015, 31, 106–131.
  20. Smith, C.; Jin, Y. Evolutionary multi-objective generation of recurrent neural network ensembles for time series prediction. Neurocomputing 2014, 143, 302–311.
  21. Ho, D.T.; Garibaldi, J.M. Context-Dependent Fuzzy Systems with Application to Time-Series Prediction. IEEE Trans. Fuzzy Syst. 2014, 22, 778–790.
  22. Takens, F. Detecting strange attractors in turbulence. In Dynamical Systems and Turbulence, Warwick 1980; Rand, D., Young, L.S., Eds.; Springer: Berlin/Heidelberg, Germany, 1981; pp. 366–381.
  23. Tu, J.H.; Rowley, C.W.; Luchtenburg, D.M.; Brunton, S.L.; Kutz, J.N. On dynamic mode decomposition: Theory and applications. J. Comput. Dyn. 2014, 1, 391–421.
  24. Brunton, S.L.; Proctor, J.L.; Kutz, J.N. Discovering governing equations from data: Sparse identification of nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 2015, 113, 3932.
  25. Ao, Y.; Li, H.; Zhu, L.; Ali, S.; Yang, Z. The linear random forest algorithm and its advantages in machine learning assisted logging regression modeling. J. Pet. Sci. Eng. 2019, 174, 776–789.
  26. Mackey, M.C.; Glass, L. Oscillation and Chaos in Physiological Control Systems. Science 1977, 197, 287–289.
  27. Gan, M.; Peng, H.; Peng, X.; Chen, X.; Inoussa, G. A locally linear RBF network-based state-dependent AR model for nonlinear time series modeling. Inf. Sci. 2010, 180, 4370–4383.
  28. Ganjefar, S.; Tofighi, M. Optimization of quantum-inspired neural network using memetic algorithm for function approximation and chaotic time series prediction. Neurocomputing 2018, 291, 175–186.
  29. Woolley, J.W.; Agarwal, P.K.; Baker, J. Modeling and prediction of chaotic systems with artificial neural networks. Int. J. Numer. Methods Fluids 2010, 63, 989–1004.
Figure 1. The architecture of the HAVOK-ML method for one-step prediction. The SVD of the Hankel matrix $H$ yields the eigen time series $V^T$. On the one hand, the HAVOK analysis gives a linear system for the first $r-1$ variables with $v_r(t)$ as an external input. On the other hand, the evolution of $v_r(t)$ is established with the machine learning method. Hence, a closed linear model for the first $r$ variables is available. Symbols with the superscript + stand for values at the next step $t+1$.
Figure 2. The time series $x(t)$ of the Lorenz system. The initial condition is (−8, 8, 27). The training data are chosen from the 3rd to the 100th second.
Figure 3. HAVOK analysis for the Lorenz chaotic series $x(t)$. From upper left to bottom right: matrix $A$, vector $B$, reconstruction of $v_1(t)$ using the linear HAVOK model with forcing $v_r(t)$, reconstruction of $x(t)$, and the external forcing input $v_r(t)$.
Figure 4. The random forest regressor for $v_r$, using previously observed values $[x(t-40), x(t-35), \ldots, x(t-10), x(t-5)]$ to predict the next value $v_r(t+1)$, with $\Delta t = 0.001$.
Figure 5. Comparison of the original time series samples and the multi-step predicted values, with one-step length of 0.01 s, on the Lorenz time series.
Figure 6. RMSE of the multi-step-ahead prediction of the Lorenz time series as a function of the number of steps (N), with one-step length of 0.01 s.
Figure 7. Time series of the Mackey–Glass system. The initial condition is 0.8, and the training data are chosen from the 300th to the 2000th second.
Figure 8. Time series of the sunspot number normalized to [−1, 1]. The training period ranges from November 1834 to March 1918.
Table 1. HAVOK analysis parameters for each system.

| System | Samples | dt | Δt | q | Rank (r) | Regressor for $v_r$ |
|---|---|---|---|---|---|---|
| Lorenz | 20,000 | 0.01 s | 0.001 s | 40 | 11 | RandomForest |
| Mackey–Glass | 50,000 | 0.1 s | / | 5 | 5 | LoLiMoT |
| Sunspot | 2000 | 1 month | 0.02 month | 140 | 7 | LoLiMoT |
Table 2. Comparison of the models in one-step prediction of the Lorenz chaotic series $x(t)$, with 1000 testing samples. The last row represents the proposed HAVOK-ML combined with the RFR method. The highest prediction accuracies achieved by the models are shown in bold.

| Model | RMSE | NMSE | Reference |
|---|---|---|---|
| Deep Belief Network | 1.02 × 10⁻² | / | [12] |
| Elman–NARX neural networks | 1.08 × 10⁻⁴ | 1.98 × 10⁻¹⁰ | [13] |
| WNN | / | **9.84 × 10⁻¹⁵** | [15] |
| Fuzzy Inference System | 3.1 × 10⁻³ | / | [19] |
| Local Linear Neural Fuzzy | / | 9.80 × 10⁻¹⁰ | [7] |
| Local Linear Radial Basis Function Networks | / | 4.53 × 10⁻¹² | [27] |
| WNNs with MCSA | 8.20 × 10⁻³ | 1.22 × 10⁻⁶ | [17] |
| HAVOK_ML(RFR) | **1.43 × 10⁻⁵** | 3.23 × 10⁻¹² | |
Table 3. Comparison of the models in six-step-ahead prediction of the Mackey–Glass time series, with 4000 testing samples. The last row shows the proposed HAVOK-ML method with the LLN model as the regressor. The values in bold are the highest prediction accuracies achieved by the models.

| Model | RMSE | NMSE | Reference |
|---|---|---|---|
| ARMA with Maximal Overlap Discrete Wavelet Transform | / | 5.3373 × 10⁻⁷ | [16] |
| Ensembles of Recurrent Neural Network | 7.533 × 10⁻³ | 8.29 × 10⁻⁴ | [20] |
| Quantum-Inspired Neural Network | 9.70 × 10⁻⁴ | / | [28] |
| Recurrent Neural Network | 6.25 × 10⁻⁴ | / | [18] |
| Type-1 Fuzzy System | 4.8 × 10⁻⁴ | / | [21] |
| Fuzzy Inference System | 7.1 × 10⁻⁴ | / | [19] |
| WNNs with MCSA | 5.60 × 10⁻⁵ | 6.25 × 10⁻⁸ | [17] |
| HAVOK_ML(LLN) | **9.92 × 10⁻⁶** | **1.86 × 10⁻⁹** | |
Table 4. Comparison of the models in one-step-ahead prediction of the sunspot time series, with 1000 testing samples. The last row shows the proposed HAVOK-ML method with the LLN model as the regressor. The values in bold are the highest prediction accuracies achieved by the models.

| Model | RMSE | NMSE | Reference |
|---|---|---|---|
| Elman–NARX Neural Networks | 1.19 × 10⁻² | 5.90 × 10⁻⁴ | [13] |
| Elman Recurrent Neural Networks | 5.58 × 10⁻² | 1.92 × 10⁻² | [29] |
| Ensembles of Recurrent Neural Network | 1.52 × 10⁻² | 9.64 × 10⁻⁴ | [20] |
| Fuzzy Inference System | 1.18 × 10⁻² | 5.32 × 10⁻⁴ | [19] |
| Functional Weights WNNs State-Dependent Autoregressive Model | 1.12 × 10⁻² | 5.24 × 10⁻⁴ | [21] |
| WNNs with MCSA | 1.13 × 10⁻² | 5.30 × 10⁻⁴ | [17] |
| HAVOK_ML(LLN) | **4.25 × 10⁻³** | **7.40 × 10⁻⁵** | |

Citation: Yang, J.; Zhao, J.; Song, J.; Wu, J.; Zhao, C.; Leng, H. A Hybrid Method Using HAVOK Analysis and Machine Learning for Predicting Chaotic Time Series. Entropy 2022, 24, 408. https://doi.org/10.3390/e24030408