1. Introduction
One of the primary features of mid- and high-latitude atmospheric circulation (AC) is transient variability, which is closely related to the growth and decay of daily weather systems. In the 1970s, Blackmon [1] used filtered data to identify sub-weekly (2.5–6 days) transient eddies over the North Pacific and North Atlantic. He defined the two zonally elongated regions with the most intense transient variability as “storm tracks” (ST), namely the North Pacific ST (NPST) and the North Atlantic ST (NAST). ST activity corresponds closely to cyclone and anticyclone activity and can therefore indicate the development of weather systems. Moreover, as a link for the exchange of heat and kinetic energy between ocean and atmosphere, ST plays an important role in the maintenance of AC and in climate change [2].
ST is crucial to short-term anomalies of AC through its interaction with the low-frequency circulation, and many studies have examined this interaction. Lau [3] studied the seasonal variation of ST and pointed out that the main mode of the variation was related to the teleconnection patterns of the low-frequency circulation in the Northern Hemisphere. Straus et al. [4] found that the ST anomaly was closely related to the sea surface temperature (SST) anomaly in the Kuroshio area. Zhu et al. [5] summarized the correlation between the winter NPST and the Pacific-North America (PNA) and Western Pacific (WP) teleconnection patterns. Ren et al. [6] used the empirical orthogonal function (EOF) to analyze the temporal and spatial variability of the winter NPST and explained its coupled pattern with the mid-latitude air-sea system. Liu et al. [7] determined the correlation between the polar vortex intensity and NPST and the potential mechanisms behind it. Both observational and theoretical studies have indicated a symbiotic relationship between ST and large-scale AC in the Northern Hemisphere. However, most studies are diagnostic analyses of ST variability and correlation. To understand how ST evolves, prediction is becoming an urgent area of research.
However, ST is a highly nonlinear system because of nonlinear processes in the air-sea system, and there has been relatively little research, domestic or international, on numerical or statistical forecasting of ST. This may result from the diversity of influencing factors and the complexity of the mechanisms linking them. In addition, strong transience and the lack of clear regularity also make ST prediction difficult. In meteorological prediction, climate indexes are often used as predictands and predictors to describe the future behavior of the climate. Quantifying the intensity and spatial-temporal variation of ST as indexes is therefore a prerequisite for ST prediction. At present, several indexes based on filtered variance can indicate the possible evolution of ST; their calculation methods include the central point representation [8,9], the regional average [10], and EOF [11]. These studies describe the nonlinear ST system quantitatively by establishing an index. Thus, we can predict the temporal and spatial variation of ST through the ST index.
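As a rough illustration of the regional-average approach, the sketch below computes the temporal variance of a filtered series at each grid point and then takes a cosine-of-latitude weighted mean over a box. The function names, the toy data, the box, and the weighting are assumptions made for illustration; they are not taken from the cited studies.

```python
import math

def temporal_variance(series):
    """Population variance of one grid point's filtered time series."""
    mean = sum(series) / len(series)
    return sum((x - mean) ** 2 for x in series) / len(series)

def regional_average_index(filtered, lats):
    """Area-weighted mean of filtered variance over a latitude-longitude box.

    filtered[i][j] is the (already band-pass-filtered) time series at
    latitude row i and longitude column j; lats[i] is in degrees.
    Rows are weighted by cos(latitude) to approximate grid-cell area
    (an assumption of this sketch).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(filtered, lats):
        w = math.cos(math.radians(lat))
        for series in row:
            weighted_sum += w * temporal_variance(series)
            weight_total += w
    return weighted_sum / weight_total

# Hypothetical 2 x 2 box of filtered anomalies (toy numbers, not real data)
lats = [40.0, 50.0]
filtered = [
    [[1.0, -1.0, 1.0, -1.0], [2.0, -2.0, 2.0, -2.0]],
    [[0.5, -0.5, 0.5, -0.5], [0.0, 0.0, 0.0, 0.0]],
]
index = regional_average_index(filtered, lats)
```

Computing such a value for each month or season would give one index value per period, which is the kind of time-series that a prediction model can then take as its predictand.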
Predicting the ST index is a nonlinear time-series prediction problem. In meteorology and oceanography, data-driven models (i.e., statistical models) are suitable prediction tools because they can be developed quickly and require less information than physically based models. Hong et al. [12] introduced the inversion idea and used a genetic algorithm to reconstruct a nonlinear forecasting model of the subtropical high index from historical data. Liu et al. [13] integrated EOF, wavelet decomposition, and the support vector machine (SVM) method to predict the 500 hPa geopotential height in summer. Zhu et al. [14] conducted a short-term forecast experiment for the index of the tropical atmospheric intraseasonal oscillation (MJO), using both singular spectrum analysis and an auto-regression model. Jia et al. [15] applied correlation analysis and optimal subset regression to select predictors and established a statistical prediction model for the subtropical high index. These statistical methods require a large amount of historical data, yet they process large data sets inefficiently. Most importantly, they have a weak ability to mine and quantitatively express the internal relations in the data. These models are therefore still inadequate for predicting the ST index.
With the rapid development of computer and information acquisition technology, machine learning (ML) and data mining (DM) have opened a new era of artificial intelligence. ML and DM have produced breakthroughs in biology, finance, and medicine [16,17,18], and they have also brought opportunities for prediction technology in meteorology and oceanography. Many scholars have applied ML and DM to meteorological prediction. Yang et al. [19] used association rule mining to analyze a data set of historical North Atlantic hurricane tracks and predicted hurricane intensity based on the mining results. Royston et al. [20] applied a semantic decision tree to mine rules from water level and meteorological data and to build a forecast model for storm surge in the Thames Estuary. Gordon et al. [21] constructed a meteorological prediction model using a neural network (NN) and a frequency domain algorithm to produce refined 24-hour predictions. Teng [22] extracted highly relevant factors and used stepwise regression and SVM to establish a medium-term prediction model of tropical cyclone tracks in the Western Pacific.
To a certain extent, ML and DM can overcome the shortcomings of the statistical methods above and support data mining and reasoning with short development times. However, the ML algorithms above are all deterministic: they give a single value for each prediction time. ST is affected by the nonlinear action of various weather systems and is strongly uncertain. When the intensity and position of ST fluctuate greatly, a deterministic single-point prediction may not achieve the desired accuracy. In contrast, a probabilistic prediction method gives the result as a probability distribution, which carries more complete prediction information.
As a branch of ML theory, the Bayesian network (BN) has begun to be used in meteorology and hydrology [23,24] and makes probabilistic prediction of the ST index feasible. The dynamic Bayesian network (DBN) adds time information to the classical BN and has become a new tool for probabilistic expression and reasoning owing to its ability to deal with uncertainty. ST is affected by many factors in the mid-latitude air-sea system, and these factors interact randomly and nonlinearly both at the same time and across different times. These features match what the DBN represents, so the DBN is a powerful theoretical tool for probabilistic prediction of the ST index. In addition, the time-series of the ST index is non-stationary. The difficulty that prediction methods have with non-stationary data has led to hybrid models, in which the data are first preprocessed to handle the non-stationary characteristics and then passed to a prediction method such as an ML algorithm to cope with the nonlinearity. Wavelet analysis (WA), an effective tool for non-stationary data, has recently been applied to meteorological forecasting. We combine WA with DBN to achieve accurate prediction of the ST index.
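To make the idea of a probabilistic, time-sliced prediction concrete, the following minimal sketch estimates a conditional probability table P(S_t | S_{t-1}) from a discretized series and returns a full distribution for the next step. It covers only a single two-slice link with Laplace smoothing; the three-class discretization, the smoothing constant `alpha`, and the function names are assumptions for illustration and do not represent the model built in this paper.

```python
from collections import Counter, defaultdict

def learn_transition_cpt(states, n_states, alpha=1.0):
    """Estimate P(S_t | S_{t-1}) from a sequence of integer class labels.

    alpha is a Laplace pseudo-count (an assumption of this sketch) that
    keeps unseen transitions from receiving zero probability.
    Returns cpt[prev][cur]; every row sums to 1.
    """
    counts = defaultdict(Counter)
    for prev, cur in zip(states[:-1], states[1:]):
        counts[prev][cur] += 1
    cpt = []
    for prev in range(n_states):
        total = sum(counts[prev].values()) + alpha * n_states
        cpt.append([(counts[prev][cur] + alpha) / total
                    for cur in range(n_states)])
    return cpt

def predict_distribution(cpt, prev_state):
    """Return the probability of each class at the next time step."""
    return cpt[prev_state]

# Hypothetical index discretized into 0 = weak, 1 = normal, 2 = strong
states = [0, 0, 1, 2, 1, 1, 0, 1, 2, 2]
cpt = learn_transition_cpt(states, n_states=3)
dist = predict_distribution(cpt, prev_state=1)
```

Here `dist` assigns a probability to every class instead of committing to one value, which is the contrast with deterministic single-point prediction. A full DBN would add predictor nodes within each time slice and learn the network structure as well as the parameters.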
In this paper, we construct the WA-DBN model to predict the winter NPST intensity index (STII). To deal with non-stationarity, nonlinearity, and uncertainty, we introduce DBN theory and combine it with WA to establish a data-driven model that predicts the monthly STII using large-scale climate indexes as predictors. We first select the climate indexes significantly related to ST as predictors. Then, based on wavelet decomposition, we construct a WA-DBN probabilistic prediction model through structure learning, parameter learning, and probabilistic reasoning. Finally, we compare model performance in more depth using key statistical indicators.
6. Conclusions
Effective short-term prediction of STII is important for research on mid-latitude weather systems, especially for the analysis of abnormal changes. In this study, we applied state-of-the-art artificial intelligence methods to predict the monthly intensity index of NPST with the WA-DBN probabilistic prediction model. Considering the non-stationarity, nonlinearity, and uncertainty of the STII time-series, we first used WA to decompose the intensity index into sub-modes in different frequency bands. We then applied the DBN to make a probabilistic prediction for each sub-mode. Finally, the independent predictions of the sub-modes were combined by wavelet reconstruction.
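The decompose, predict, and reconstruct workflow described above can be sketched as follows. This is a structural illustration only: a simple additive multiresolution split built from moving averages stands in for the wavelet transform, and a no-intercept AR(1) fit stands in for the DBN. Both substitutions, and all function names, are assumptions of this sketch.

```python
def moving_average(x, width):
    """Centered moving average; windows are truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        window = x[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def decompose(x, levels):
    """Split x into `levels` detail series plus one smooth series.

    Stand-in for wavelet decomposition: the components add back to x
    (up to floating-point error), so reconstruction is a plain sum.
    """
    components = []
    approx = list(x)
    for j in range(1, levels + 1):
        smoother = moving_average(approx, 2 ** j + 1)
        components.append([a - s for a, s in zip(approx, smoother)])
        approx = smoother
    components.append(approx)
    return components

def ar1_forecast(component):
    """Stand-in predictor: one-step AR(1) forecast without intercept."""
    num = sum(a * b for a, b in zip(component[1:], component[:-1]))
    den = sum(b * b for b in component[:-1])
    phi = num / den if den > 0 else 0.0
    return phi * component[-1]

def forecast_next(x, levels=2):
    """Predict each component separately, then sum the predictions."""
    return sum(ar1_forecast(c) for c in decompose(x, levels))

# Hypothetical monthly index values (toy numbers, not real data)
series = [3.1, 2.8, 3.6, 4.0, 3.3, 2.9, 3.8, 4.2, 3.5, 3.0, 3.9, 4.4]
parts = decompose(series, levels=2)
next_value = forecast_next(series, levels=2)
```

Because the split is additive, summing the per-component forecasts plays the role that wavelet reconstruction plays in the workflow above.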
To further illustrate the advantages of the model, we conducted multiple sets of STII prediction, fitting, and comparison experiments. The results show that the correlation coefficient reached about 0.6 for prediction and 0.97 for fitting. Moreover, the model predicts extrema well. The WA-DBN model therefore performs relatively better in predicting a nonlinear, uncertain series, as evidenced by a higher correlation coefficient (R) and a smaller root-mean-square error (RMSE). The improved performance of the WA-DBN model is attributable to two aspects:
First, WA decomposes the predictand input series into separate components by frequency, which allows noise to be removed and reveals the quasi-periodic components of the original time-series.
Second, the DBN model considers the relationship between the predictand and the predictors both at the same time and in adjacent time slices. Expressing the causal relationships through network structure and probability distributions handles the uncertainty of prediction better.
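For reference, the two indicators used above as evidence of performance, R and RMSE, can be computed as in the following minimal sketch. The function names and the toy numbers are illustrative assumptions, not values from the experiments.

```python
import math

def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    dev_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    dev_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (dev_o * dev_p)

def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

# Toy values only: the prediction doubles the observation
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 4.0, 6.0, 8.0]
r_value = pearson_r(obs, pred)
error = rmse(obs, pred)
```

In this toy case R is 1 even though the RMSE is large, which is why the two indicators are reported together: R measures how well the variations are tracked, while RMSE measures the size of the errors.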
In summary, the WA-DBN model developed and tested in this study shows good skill in predicting the monthly STII, which provides useful scientific guidance for studying abnormal changes of ST and their mechanisms. Above all, we propose a new intelligent prediction model based on graph theory and probability theory, which has broad application prospects owing to its strong generalization ability and good stability.
Although the WA-DBN probabilistic prediction model works well, some problems remain. First, the selection of predictors for the ST intensity index needs further improvement. Existing studies indicate that if the number of predictors exceeds 10, the computation becomes complex and accuracy does not increase significantly with more predictors; if too few predictors are selected, for example 5, accuracy suffers because information is lost. In this research, we chose the 9 most relevant indicators as predictors. Because predictor selection is crucial to prediction, this step needs further work. Second, the accuracy of long-term prediction with this model is low. Both issues are the focus of future work.