1. Introduction
The Pacific Decadal Oscillation (PDO) is a leading climate mode of sea surface temperature (SST) variability in the North Pacific [
1,
2]. It plays a crucial role in influencing global mean SST changes, such as hiatus periods [
3], and has been a driving force behind marine heatwaves in the Northeast Pacific in recent decades [
4] and has affected local marine ecosystems [
5]. The PDO significantly affects the interdecadal variations of summer precipitation in China [
6,
7,
8] and precipitation patterns in the western United States [
9]. In marine fisheries and coastal disaster prevention, predicting the PDO and its phase changes has practical significance. When the PDO is in its cold phase, SSTs in the mid-latitudes of the North Pacific (such as the Gulf of Alaska) decrease, promoting plankton reproduction and thereby increasing salmon and cod resources [
1]. Additionally, the SST anomalies (SSTAs) during the PDO cold phase suppress typhoon formation through atmospheric circulation regulation and can serve as a precursor signal for reduced typhoon activity in the northwest Pacific [
10]. Therefore, predicting the future evolution of the PDO can provide timely warnings for the onset of short-term events (such as marine heatwaves and hurricanes) modulated by PDO cool/warm phases, which is crucial for mitigating their consequences and supporting policy and decision making.
The prediction of PDO belongs to the category of decadal prediction, which aims to forecast climate change over the next 1–10 years [
11,
12]. Dynamical forecasts using numerical models are a widely employed approach for decadal prediction; examples include models from the North American Multi-Model Ensemble (NMME) project. The accuracy of these dynamical prediction techniques relies on the coupled dynamical models and their initialization schemes. However, the complexity of these coupled models, variations in initialization schemes, and the unclear mechanisms of decadal variability [2,13] make predicting decadal variability highly challenging. Additionally, there is a considerable gap in prediction skill for the North Pacific [
14,
15] compared to other regions [
16,
17] for most of the models [
18]. Recently, a few of these models have reported strong performance in predicting the PDO (e.g., He et al. [19]). Even so, the dynamical models retain relatively low prediction skill, with monthly/annual mean correlation coefficients of 0.5 at lead times of 12 months/2 years [
19,
20].
Artificial intelligence (AI) has demonstrated remarkable achievements in climate and ocean monitoring and prediction. Deep learning methods can take historical observational data as input and directly output predicted values for future states through end-to-end learning, enabling prediction of the dynamic evolution of climate systems. Deep learning has been successfully applied to forecast the El Niño-Southern Oscillation (ENSO), achieving higher forecast skill than traditional dynamical numerical models [
21,
22,
23,
24,
25,
26]. For example, ENSO can now be forecasted with a lead time of up to 23 months [
27]. Advanced deep learning models combining U-net and ConvLSTM have accurately captured the spatial features of SSTAs in the South China Sea, enabling successful short-term predictions of marine heatwaves (MHWs) [
28]. These successes have inspired us to apply AI techniques to PDO prediction and to provide a feasible approach for integrating near-real-time remote sensing data into short-term climate prediction systems. Previous studies have shown that artificial neural networks can predict PDO transitions up to 12 months in advance [
29]. The seasonal gated recurrent unit (SGRU) network has achieved correlation coefficients of 0.65/0.67 for 6-month/3-year lead times on the monthly/annual scale [
30]. Qin et al. further extended the PDO prediction up to 12 months by employing the Deep Spatiotemporal Embedding Network (DSEN) model [
31]. While these AI methods show promise, they still struggle to fully capture the PDO’s complex patterns across different timescales, and lack effective strategies for optimizing model hyperparameters to further enhance prediction accuracy.
The PDO involves complex interactions across multiple time scales, such as interannual and decadal time scales [2]. To improve the accuracy of AI models in representing changes across these different timescales, it is imperative to integrate data mining techniques into AI frameworks. However, this strategy inevitably complicates the model framework, thereby placing higher demands on the choice of training parameters. Typically, AI models rely on empirically set parameters during training, which limits their forecast ability. Another key limitation in training for decadal prediction is that the observational record is too short (e.g., Zhou et al. [
32]). Consequently, to improve the predictive accuracy of AI models for interdecadal changes, dynamically adjusted parameter optimization strategies and sufficient training data are indispensable.
This study proposes a novel BWM model that integrates a bidirectional long short-term memory (BiLSTM) network with the whale optimization algorithm (WOA) and multiple modal decomposition (MMD) techniques. This model effectively predicts the PDO index at both monthly and annual scales (i.e., interannual-decadal time scales). The WOA automatically adjusts and optimizes model hyperparameters based on model performance feedback, while modal decomposition enhances the capture of the PDO's multi-scale features. The prediction performance of our model is compared with the SGRU, the DSEN, and, as a baseline, a traditional statistical time series prediction method, the autoregressive integrated moving average (ARIMA) model. Results demonstrate that our BWM model outperforms these existing approaches. Additionally, the BWM model predicts the future phase of the PDO from May 2025 using the available PDO indices derived from a near-real-time satellite-based SST data product. The remainder of this paper is organized as follows: Section 2 describes the data, methodologies, and experiments, Section 3 presents the results and analysis, and Section 4 contains concluding remarks and discussion.
2. Data and Methods
2.1. Data
According to Mantua et al. [
1], the PDO is computed as the leading Empirical Orthogonal Function (EOF) mode of monthly SSTA over the North Pacific Ocean (20°N–70°N, 120°E–100°W) and the corresponding leading principal component is defined as the PDO index. We use two sets of PDO indices to train and test the model. The first set is calculated from monthly SSTAs simulated by the Community Earth System Model version 2 (CESM2) [
33]. The CESM2 is capable of accurately simulating the observed spatial pattern of the PDO, including SST and surface wind anomalies over the Pacific, and its dominant period [
34]. The second set is the NCEI PDO index, obtained from NOAA's National Centers for Environmental Information (NCEI) for the period January 1854–December 2024, which is calculated from the Extended Reconstructed Sea Surface Temperature (ERSST) v5 dataset. Both datasets share a spatial resolution of 2° × 2°. In addition, we use the Optimum Interpolation Sea Surface Temperature (OISST) v2 dataset provided by NOAA to calculate the OISST PDO index and further validate the model's generalizability. The NOAA Daily OISST is a long-term Climate Data Record that incorporates observations from different platforms (satellites, ships, buoys, and Argo floats) into a regular global grid. The dataset is interpolated to fill gaps on the grid and create a spatially complete map of sea surface temperature. Satellite and ship observations are referenced to buoys to compensate for platform differences and sensor biases. The OISST dataset contains daily global SST data from 1 January 1982 to 31 December 2024 with a spatial resolution of 0.25° × 0.25°. Because OISST provides near-real-time SST data (up to 2 days before the present day), the PDO index derived from it can be used for future prediction. Before the PDO index is calculated, the daily SST values must first be averaged into monthly means. The three PDO index sequences are shown in Figure S1. Among them, the CESM2-simulated and OISST PDO indices were computed as the first principal component (PC) obtained by performing EOF analysis on the corresponding SSTA dataset with the global mean SSTA removed, and both were standardized.
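The index calculation described above can be illustrated with a minimal NumPy sketch. It assumes the SSTA field has already been cut to the North Pacific box and averaged to monthly values; the function name `leading_pc`, the optional `global_mean` argument, and the sqrt(cos(latitude)) area weighting are illustrative assumptions rather than details confirmed by this study.

```python
import numpy as np

def leading_pc(ssta, lat, global_mean=None):
    """Return the standardized leading principal component of an SSTA field.

    ssta        : array (n_time, n_lat, n_lon), monthly anomalies; NaN marks land
    lat         : array (n_lat,), latitudes in degrees
    global_mean : optional array (n_time,), global mean SSTA to subtract first
    """
    ssta = np.asarray(ssta, dtype=float)
    if global_mean is not None:
        ssta = ssta - np.asarray(global_mean, dtype=float)[:, None, None]

    # Assumption: weight by sqrt(cos(latitude)) so that variance is area-weighted.
    weights = np.sqrt(np.cos(np.deg2rad(lat)))[None, :, None]
    field = (ssta * weights).reshape(ssta.shape[0], -1)

    # Keep only grid points that are valid at every time step, then center in time.
    valid = ~np.isnan(field).any(axis=0)
    field = field[:, valid]
    field = field - field.mean(axis=0)

    # EOF analysis via SVD: the columns of u scaled by s are the principal components.
    u, s, _ = np.linalg.svd(field, full_matrices=False)
    pc1 = u[:, 0] * s[0]
    pc1 = (pc1 - pc1.mean()) / pc1.std()      # standardize the index
    explained = s[0] ** 2 / np.sum(s ** 2)    # variance fraction of the leading mode
    return pc1, explained
```

The sign of an EOF mode is arbitrary, so in practice the resulting series would be flipped if needed to match the conventional PDO polarity.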
2.2. Model Architecture
In this study, we propose a novel hybrid model that uses the BiLSTM network, integrated with the MMD method implemented through Empirical Mode Decomposition (EMD), to predict the PDO index (i.e., BiLSTM + MMD). The WOA is further incorporated into the BiLSTM-MMD framework to optimize the hyperparameters. The complete workflow of the model (BiLSTM + WOA + MMD, denoted as BWM) is illustrated in Figure 1. Here, t refers to the current iteration count of the optimization algorithm, and tmax denotes the maximum iteration count, which must be predefined before the algorithm begins. When t = tmax, the optimization algorithm stops updating the hyperparameters. f is a component obtained by applying EMD to the PDO index, and r is the residual.
The process of training and forecasting consists of several key phases. Firstly, the PDO index is decomposed into multiple components of varying frequencies. These components correspond to signals sorted from high to low frequency.
Secondly, we construct a multi-channel BiLSTM network, with each channel dedicated to learning the values of a specific MMD component. The number of channels matches the number of MMD components, with each component assigned to one channel for training and prediction. The pre-predicted PDO index is then obtained by aggregating all predicted components.
Thirdly, the pre-predicted PDO index is used to update the hyperparameters of the multi-channel BiLSTM network via the WOA. The updated hyperparameters are then used to retrain the model, improving its prediction accuracy.
Finally, by repeating these steps iteratively, we obtain optimized parameters for the model. With the new parameters in the multi-channel BiLSTM network, every component of the PDO index is predicted, and the combination of the predicted components is the predicted PDO index.
2.3. Multiple Modal Decomposition
This prediction focuses on the time series itself. As PDO involves complex changes on multiple time scales [
2], we aim to utilize an MMD method to decompose the PDO index into multiple modes with different oscillation frequencies, thereby reducing the learning difficulty of deep learning models. Specifically, we choose to adopt the EMD method, which can decompose complex signals into several intrinsic mode function (IMF) components, each representing the local characteristics of the signal [
35]. The decomposition is based on the different time scales of the signal, without pre-setting any basis functions, and therefore exhibits high adaptability. The detailed steps are found in Huang et al. [
35]. In this study, EMD was implemented using the built-in EMD function in MATLAB 2021a. The number of components is fixed at 6 when decomposing the PDO index, and EMD is applied separately to each PDO index sequence, with the results shown in
Figure S2.
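To make the sifting idea behind EMD concrete, the following is a simplified Python sketch, not the MATLAB implementation used in this study. It assumes linear interpolation for the upper and lower envelopes (standard EMD uses cubic splines), a fixed number of sifting passes, and envelopes pinned to the endpoints; the function names are hypothetical.

```python
import numpy as np

def _envelope(h, idx):
    """Interpolate linearly through the extrema at `idx`, pinned to both endpoints."""
    t = np.arange(len(h))
    knots = np.concatenate(([0], idx, [len(h) - 1]))
    return np.interp(t, knots, h[knots])

def sift_one_imf(x, n_sift=10):
    """Extract one oscillatory component by repeatedly removing the envelope mean."""
    h = x.copy()
    for _ in range(n_sift):
        d = np.diff(h)
        maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
        minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
        if len(maxima) < 2 or len(minima) < 2:
            break
        mean_env = 0.5 * (_envelope(h, maxima) + _envelope(h, minima))
        h = h - mean_env
    return h

def emd(x, max_imfs=6):
    """Decompose x into components (high to low frequency) plus a residual."""
    residual = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        d = np.diff(residual)
        if np.sum(d[:-1] * d[1:] < 0) < 3:   # too few extrema left to sift
            break
        imf = sift_one_imf(residual)
        imfs.append(imf)
        residual = residual - imf
    return np.array(imfs), residual
```

By construction, the extracted components and the residual sum back to the original series, which is what allows the component-wise predictions to be aggregated into a prediction of the index.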
2.4. Whale Optimization Algorithm
In deep learning models, the hyperparameters can affect the prediction. In this study, we integrate the WOA for hyperparameter optimization.
The WOA is a metaheuristic algorithm proposed by Mirjalili and Lewis [36], which simulates the hunting process of whale populations to search for an optimal solution. The WOA mainly includes three steps: encircling prey, bubble-net attack, and search for prey. Compared to traditional methods (e.g., grid search, random search) and other metaheuristics (e.g., particle swarm optimization, genetic algorithms), the WOA can achieve relatively fast convergence, high optimization accuracy, and strong optimization ability [37].
In the WOA, whale individuals are able to locate and encircle prey. Assuming that the current optimal (or near-optimal) solution is the target prey, the other individuals in the whale population approach the whale closest to the target prey and continuously update their own positions as follows:

$$\mathbf{D} = \left| \mathbf{C} \cdot \mathbf{X}^{*}(t) - \mathbf{X}(t) \right| \tag{1}$$

$$\mathbf{X}(t+1) = \mathbf{X}^{*}(t) - \mathbf{A} \cdot \mathbf{D} \tag{2}$$

In (1) and (2), t is the current number of iterations, X(t) is the position vector of the current whale individual, X(t + 1) is the updated whale position vector, and X*(t) is the position vector of the best solution found so far (the target prey). D is the encircling step, given by the distance between the current whale and the C-scaled prey position. A and C are coefficient vectors, calculated as follows:

$$\mathbf{A} = 2a \cdot \mathbf{r} - a \tag{3}$$

$$\mathbf{C} = 2 \cdot \mathbf{r} \tag{4}$$

$$a = 2 - \frac{2t}{t_{max}} \tag{5}$$

Here, r is a random vector within [0, 1], a is the convergence factor, which decreases linearly from 2 to 0 over the iterations, and tmax is the maximum iteration number.
During whale hunting, the bubble-net attack is described by two behaviors: the shrinking encircling mechanism and the spiral position update.
Shrinking encircling mechanism: As the convergence factor a is continually updated, the position coefficient A of the whale individuals also changes, causing them to gradually approach the prey and ultimately enclose it.
Spiral position update: When whales discover their target prey, the distance between each whale and the prey is first calculated, and the whales then approach the prey along a spiral path. The mathematical model for this behavior is as follows:

$$\mathbf{D}^{*} = \left| \mathbf{X}^{*}(t) - \mathbf{X}(t) \right| \tag{6}$$

$$\mathbf{X}(t+1) = \mathbf{D}^{*} \cdot e^{bl} \cdot \cos(2\pi l) + \mathbf{X}^{*}(t) \tag{7}$$

In (6) and (7), D* is the distance between the current whale individual and the target prey, b is a constant defining the shape of the logarithmic spiral, and l is a random number on [0, 1].
When a whale population captures its target prey, individuals randomly choose between the shrinking encircling behavior and the spiral position update to surround the prey. To describe this behavior, a random number p ranging from 0 to 1 is introduced and compared against a selection threshold of 0.5. The mathematical expression for this process is

$$\mathbf{X}(t+1) = \begin{cases} \mathbf{X}^{*}(t) - \mathbf{A} \cdot \mathbf{D}, & p < 0.5 \\ \mathbf{D}^{*} \cdot e^{bl} \cdot \cos(2\pi l) + \mathbf{X}^{*}(t), & p \geq 0.5 \end{cases} \tag{8}$$
The above steps all describe whales searching for the optimal position within the local range of the population. To avoid becoming trapped in local optima and to improve the global search ability of the WOA, the “search for prey” behavior forces a whale to update its position relative to a randomly selected whale in the population whenever the condition “|A| > 1” is satisfied:

$$\mathbf{D} = \left| \mathbf{C} \cdot \mathbf{X}_{rand}(t) - \mathbf{X}(t) \right| \tag{9}$$

$$\mathbf{X}(t+1) = \mathbf{X}_{rand}(t) - \mathbf{A} \cdot \mathbf{D} \tag{10}$$

In (9) and (10), Xrand(t) is the position vector of a randomly selected whale in the population.
In our proposed model, there are many hyperparameters because the MMD components and the multi-channel BiLSTM networks are combined to predict the PDO index, and each channel has its own adjustable parameters. The WOA, with its feedback-driven parameter update, is applied to these adjustable parameters to improve the model’s prediction skill for the PDO. The WOA is initialized randomly, following a uniform distribution across the search range. The pseudocode of the WOA is shown in
Figure S3.
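As a complement to the pseudocode, the following is a minimal Python sketch of the three WOA behaviors applied to a generic objective function. It is not the implementation used in this study: the function name `woa_minimize`, the scalar draws of A and C per whale, the clipping to the search range, and the default settings are all assumptions made for illustration. The spiral parameter l is drawn from [0, 1] as stated in the text, whereas Mirjalili and Lewis draw it from [-1, 1].

```python
import numpy as np

def woa_minimize(objective, lower, upper, n_whales=20, max_iter=200, b=1.0, seed=0):
    """Minimize `objective` over the box [lower, upper] with a basic WOA loop."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    # Random initialization, uniform across the search range.
    X = rng.uniform(lower, upper, size=(n_whales, lower.size))
    fitness = np.array([objective(x) for x in X])
    best_x = X[fitness.argmin()].copy()
    best_f = float(fitness.min())
    history = [best_f]

    for t in range(max_iter):
        a = 2.0 - 2.0 * t / max_iter              # decreases linearly from 2 to 0
        for i in range(n_whales):
            A = 2.0 * a * rng.random() - a        # assumption: scalar per whale
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: encircle the best whale; otherwise search around a random whale.
                target = best_x if abs(A) < 1.0 else X[rng.integers(n_whales)]
                D = np.abs(C * target - X[i])
                X[i] = target - A * D
            else:
                # Spiral position update around the best whale.
                l = rng.random()
                D_star = np.abs(best_x - X[i])
                X[i] = D_star * np.exp(b * l) * np.cos(2.0 * np.pi * l) + best_x
            X[i] = np.clip(X[i], lower, upper)    # keep whales inside the search range

        fitness = np.array([objective(x) for x in X])
        if fitness.min() < best_f:
            best_f = float(fitness.min())
            best_x = X[fitness.argmin()].copy()
        history.append(best_f)
    return best_x, best_f, history
```

For hyperparameter tuning, the objective would wrap one training-and-validation run of the network and return its validation error; integer-valued hyperparameters such as the number of hidden nodes would need to be rounded inside that wrapper, which is a design choice not specified here.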
2.5. Bidirectional Long Short-Term Memory Network
The BiLSTM network adds a backward long short-term memory (LSTM) layer to the original LSTM structure [38]. The difference between BiLSTM and LSTM networks is that the outputs of the forward and backward LSTM layers are concatenated to form the hidden layer’s final output. Having separate hidden layers in both directions means that BiLSTM can simultaneously extract forward and backward features from sequences, which can improve prediction accuracy. In our study, each BiLSTM network consists of an input layer, a hidden layer, a fully connected layer, and an output layer.
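The concatenation of forward and backward hidden states can be illustrated with a forward-pass-only NumPy sketch. The weights are random and untrained, the gate ordering (input, forget, candidate, output) is an assumption, and feeding the final state of each direction to the fully connected layer is one common choice rather than a detail taken from this study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_lstm(rng, n_in, n_hidden):
    """Random (untrained) weights for one LSTM direction."""
    W = rng.normal(scale=0.1, size=(4 * n_hidden, n_in))
    U = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden))
    b = np.zeros(4 * n_hidden)
    return W, U, b

def lstm_forward(x, W, U, b):
    """Run an LSTM over x of shape (T, n_in); return hidden states of shape (T, H)."""
    H = U.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    out = np.zeros((len(x), H))
    for t, x_t in enumerate(x):
        z = W @ x_t + U @ h + b
        i, f, g, o = z[:H], z[H:2 * H], z[2 * H:3 * H], z[3 * H:]
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # cell state update
        h = sigmoid(o) * np.tanh(c)                    # hidden state update
        out[t] = h
    return out

def bilstm_forward(x, fwd, bwd, W_fc, b_fc):
    """Hidden layer (forward + backward LSTM) followed by a fully connected layer."""
    h_fwd = lstm_forward(x, *fwd)
    h_bwd = lstm_forward(x[::-1], *bwd)[::-1]          # re-align to forward time order
    h_seq = np.concatenate([h_fwd, h_bwd], axis=1)     # (T, 2H) concatenated output
    summary = np.concatenate([h_fwd[-1], h_bwd[0]])    # final state of each direction
    return W_fc @ summary + b_fc, h_seq
```

Because the backward layer reads the window from its last value to its first, each row of the concatenated output carries information from both earlier and later time steps within the input window.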
2.6. Training, Forecasts, and Evaluation
In the training phase, we utilize the simulated PDO index sequence from the CESM2 pre-industrial control simulations spanning years 700 to 1200 as training data for the BiLSTM network. The model achieves prediction by inputting a pre-set number of historical PDO index time steps and outputting the PDO index for a number of future time steps. The observed PDO index derived from NCEI, covering the period from January 1854 to December 1902, is employed to optimize and update the model’s hyperparameters to improve the model’s generalization. The NCEI PDO index from January 1903 to December 2024 is used for validation. This separation ensures no leakage of future information from the observed NCEI PDO sequence into validation. We use the mean square error (MSE) as the loss function for the adaptive moment estimation (Adam) optimization algorithm during training. Adam is a widely used adaptive stochastic optimization algorithm. Its core idea is to dynamically adjust the learning rate of each parameter by calculating the adaptive estimates of the first-order and second-order moments of the gradient [
39]. Due to its robustness and efficiency, Adam has become one of the default optimizers in deep learning. The hyperparameters for each BiLSTM channel, optimized via WOA, include the input dimension, the number of hidden layer nodes, the maximum number of epochs, the batch size, and the initial learning rate; the search ranges for these hyperparameters are detailed in
Table S1. Additionally, to manage the training duration for each multi-channel BiLSTM network, we set 30 iterations for WOA to update the model hyperparameters, with 15 multi-channel BiLSTM networks trained per iteration. Under these conditions, the predictive performance of the model trained with the optimized hyperparameters enables longer-term forecasts than SGRU and DSEN models.
In the prediction phase, we use the same method to predict the monthly and annual average PDO indices. Predictions at different lead months (denoted k) are achieved by constructing different multi-channel BiLSTM networks. Take as an example a BiLSTM channel that predicts a monthly PDO index component at a k-month lead, and suppose that the WOA has selected the input dimension d for this channel. To predict the component for the next k months at time step t, the d most recent values of the component up to time step t are input into this BiLSTM channel, which directly generates the result for the next k months.
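One way to picture this direct k-month-lead setup is as a sliding-window sample builder. The sketch below assumes that each training sample pairs the d most recent values of a component with the single value k steps ahead; the function name and this exact target definition are illustrative assumptions, not a confirmed description of the study's input and output layout.

```python
import numpy as np

def make_direct_samples(series, d, k):
    """Build (input window, target) pairs for direct k-step-ahead prediction.

    Each input holds the d values ending at time step t; the target is the
    value at time step t + k (assumed single-value target).
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - d - k + 1
    if n <= 0:
        raise ValueError("series is too short for the chosen d and k")
    X = np.stack([series[i:i + d] for i in range(n)])
    y = series[d - 1 + k: d - 1 + k + n]
    return X, y
```

Training one network per lead k on such pairs avoids feeding predictions back in as inputs, so errors from earlier leads do not propagate into later ones.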
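To evaluate the BiLSTM-WOA-MMD model, we use the correlation coefficient (CC) and the root mean square error (RMSE). The CC is the Pearson correlation coefficient between the predicted and observed PDO index. The two metrics are calculated as follows:

$$\mathrm{CC} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}\,\sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

Here, $y_i$ is the observed value, $\hat{y}_i$ is the predicted value, n is the number of sample points, $\bar{y}$ is the mean of the observed values, and $\bar{\hat{y}}$ is the mean of the predicted values.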
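Both metrics are straightforward to compute; a minimal NumPy sketch, with hypothetical function names, is given here for reference.

```python
import numpy as np

def pearson_cc(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    obs_anom = obs - obs.mean()
    pred_anom = pred - pred.mean()
    return float(np.sum(obs_anom * pred_anom)
                 / np.sqrt(np.sum(obs_anom ** 2) * np.sum(pred_anom ** 2)))

def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))
```

The two metrics are complementary: a prediction that is perfectly correlated with the observations but has the wrong amplitude yields a CC of 1 while still having a nonzero RMSE.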
Given the distinct roles of MMD and WOA in model training, where MMD processes the input data while WOA optimizes the model’s hyperparameters, we infer that MMD can raise the upper limit of model prediction accuracy, while WOA ensures effective training of the model. To evaluate the different advantages of the MMD and WOA methods for prediction, we combined MMD and WOA separately with the BiLSTM network (i.e., BiLSTM-MMD and BiLSTM-WOA) and used the predictive performance of the BiLSTM network as a baseline to quantify the improvement brought about by each method. The results are shown in Section 3.3. The hyperparameters of the BiLSTM and BiLSTM-MMD models were selected by grid search, using the best prediction result at the 1-month lead as the reference; the specific settings are shown in Table S2. Due to computational constraints, every BiLSTM channel of the BiLSTM-MMD model uses the same hyperparameter settings. The searched hyperparameters include the input dimension (10, 15, 20, 30, and 60), the initial learning rate (0.01, 0.005, 0.001, and 0.0001), the number of hidden layer nodes (3, 5, 7, 10, 20, and 40), and the batch size (16, 32, 128, and 256), totaling 480 combinations. The maximum number of training iterations for each BiLSTM network is set to 350, and monitoring and early stopping are applied during training to prevent overfitting.
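The size of this search space can be checked by enumerating it. The sketch below uses only the Python standard library; the dictionary keys and the `evaluate` callback are hypothetical names, with `evaluate` standing in for a routine that trains one network and returns its validation error.

```python
from itertools import product

SEARCH_SPACE = {
    "input_dim": [10, 15, 20, 30, 60],
    "learning_rate": [0.01, 0.005, 0.001, 0.0001],
    "hidden_nodes": [3, 5, 7, 10, 20, 40],
    "batch_size": [16, 32, 128, 256],
}

def iter_configs(space):
    """Yield every combination of the listed values as a dictionary."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))

def grid_search(evaluate, space):
    """Return the configuration with the lowest score from `evaluate`."""
    return min(iter_configs(space), key=evaluate)
```

Enumerating the space gives 5 × 4 × 6 × 4 = 480 configurations, matching the count stated above.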
3. Results
3.1. Predicted Monthly NCEI PDO Index
Figure 2 illustrates the predicted NCEI PDO index generated by the BWM model at lead times of 1 month, 8 months, and 15 months, along with their corresponding prediction biases and bias distributions. Other prediction results can be found in
Figure S4. At the 1-month lead time, the predicted values from the BWM model (red line) closely follow the reference NCEI PDO index (blue line). This is confirmed by the very high correlation coefficient (CC = 0.99) between the predicted and reference indices from 1903 to 2024. Additionally, the model records a low RMSE of 0.12 °C and a small bias ranging from −0.5 to 0.5 °C (16% of the biases fall in the bin centered at zero). However, prediction errors increase with longer lead times. At the 8-month lead time, the predicted values still capture the evolution of the reference NCEI PDO index, with a CC of 0.71 and a bias ranging from −2 to 2 °C (8% of the biases fall in the bin centered at zero). At the 15-month lead time, the CC decreases to 0.56, large absolute biases become more frequent, and less than 8% of the biases fall in the bin centered at zero, although the RMSE remains below 1.
To further assess the predictive performance of each decomposed series, heatmaps of CC and RMSE for each component are displayed in
Figure 3. The high-frequency components, such as the first and second IMF components (i.e., IMF1 and IMF2), are well captured at the 1-month lead. However, starting from the 2-month lead, the CC declines rapidly, reaching values of 0.55 and 0.28 for IMF1 at the 2-month and 3-month leads, respectively. Furthermore, the CC for IMF2 decreases from 0.22 at the 7-month lead to 0 at the 8-month lead, before rising again to 0.39 at the 9-month lead. This behavior can be attributed to IMF1 and IMF2 being high-frequency components with significant noise. Such noise can introduce unpredictable fluctuations during gradient descent optimization in the network, leading to deviations and instability in the learning direction. For IMF4 to IMF6, the CCs remain high for lead times ranging from 1 month to 15 months, indicating that the lower-frequency components are easier to predict at the same lead time. Additionally, the superior prediction of low-frequency IMF components (e.g., IMF3-IMF6) validates MMD’s role in isolating decadal-scale signals, aligning with the physical interpretation of PDO variability. The excellent predictive performance of the BWM model is likely attributable to this.
The heatmaps of the CC and RMSE values for each calendar month and lead time in
Figure 4 illustrate the seasonal variations in the BWM model’s predictive accuracy. Overall, the CC values are predominantly above 0.5 with few exceptions for some months. This suggests that the BWM model demonstrates a robust capability to predict the PDO index across different calendar months. In
Figure 4a, the lowest predicted CC values primarily occur in August, September, and October. For instance, a CC of 0.69 is observed at the 7-month lead in September and October, while decreasing to 0.47 at the 12-month lead in August and to 0.45 at the 15-month lead in October. The RMSE heatmap in
Figure 4b exhibits a similar pattern, with the highest values occurring mainly in July, August, October, and boreal winter. Both CC and RMSE trends show declining predictive skill with increasing lead times, particularly from late boreal summer through autumn, and boreal winter (mainly for RMSE).
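A calendar-month breakdown of this kind can be produced by grouping the verification pairs by month before computing the skill score. The sketch below assumes the pairs are grouped by target calendar month (the grouping convention is not stated here) and uses a hypothetical function name; repeating it for each lead time would give one row of a heatmap per lead.

```python
import numpy as np

def cc_by_calendar_month(obs, pred, months):
    """Pearson CC between obs and pred, computed separately for months 1..12.

    months : array of calendar-month labels (1 = January), one per sample.
    """
    obs, pred, months = (np.asarray(v) for v in (obs, pred, months))
    return {
        m: float(np.corrcoef(obs[months == m], pred[months == m])[0, 1])
        for m in range(1, 13)
    }
```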
3.2. Comparison with Other Methods
Existing deep learning networks for PDO index prediction include the SGRU and DSEN [
30,
31]. The PDO prediction performance (CC and RMSE) of the SGRU and DSEN models has been presented in their respective studies. These models, which focus on modeling monthly PDO index, have demonstrated superior skill compared to dynamic models like CanCM4i in NMME and the COLA-RSMAS-CCSM4 model, where CanCM4i and COLA-RSMAS-CCSM4 maintained CC values above 0.5 for up to 11-month and 7-month lead times, respectively. We compared the prediction performance of BWM with SGRU, DSEN, and statistical model ARIMA [
40]. We used the ARIMA model for comparison because it is a classic time series prediction method that is widely used as a baseline in climatology research, making it a suitable reference for evaluating the predictive capabilities of deep learning models. For consistent comparison across different time periods, we recomputed CC and RMSE using the validation samples from 1979 to 2020 (Figure 5). The BWM model outperforms the other models at the lead time of 1 month and from 7 to 15 months, with larger CCs and smaller RMSEs (
Figure 5). Notably, the BWM model consistently outperforms ARIMA in predicting the PDO index.
As a deep learning model that uses historical PDO indices as training inputs, the SGRU model employs a decomposition method to split the PDO index sequence into three components: seasonal, trend, and residual. Its predictive capability primarily relies on the model’s ability to capture the trend component. In contrast, the BWM model uses the MMD method to convert the PDO signal into components with different oscillation frequencies. By leveraging the strong modeling capability of the multi-channel BiLSTM for low-frequency components, it effectively captures the internal variability of the decadal signal across multiple time scales, yielding superior predictive performance for lead times beyond 3 months. Specifically, at the 6-month lead time, the BWM model achieves a CC of 0.76 compared to 0.65 for the SGRU model, and maintains a CC of 0.65 even at the 12-month lead time.
On the other hand, the BWM model relies solely on the PDO index time series and does not use spatial features as training input. Therefore, compared with the DSEN model, which incorporates SSTA spatial information, the BWM model does not show superior performance at lead times of less than 7 months. However, the BWM model exhibits higher CC values and lower RMSEs at lead times beyond 7 months (
Figure 5). This may be attributed to the error accumulation effect caused by the recursive forecasting mechanism of the DSEN model, while the prediction pattern and hyperparameter optimization module of the BWM model can effectively reduce this error accumulation.
For short-term prediction (1–3 months), the BWM model does not demonstrate significantly better performance. This limitation may be attributed to two factors. Firstly, the time discontinuity between simulated and observed data makes it harder for the network to extract features accurately. Secondly, the BWM model lacks the spatial features used by the DSEN.
3.3. Effect of Incorporating Optimization Algorithms and Multiple Modal Decomposition
The analysis above demonstrates that the BWM model outperforms other deep learning models. In this section, we quantitatively evaluate the improvements achieved by incorporating the WOA and MMD methods into the BWM model, and analyze and verify the different enhancements that the two methods bring to the BiLSTM network.
Figure 6 presents the CC and RMSE results for four models: BiLSTM (baseline), BiLSTM-WOA, BiLSTM-MMD, and BiLSTM-WOA-MMD (BWM). Using the CC threshold of 0.5, both the WOA and MMD methods demonstrate enhanced prediction performance compared to the BiLSTM. The BiLSTM effectively predicts the PDO index only up to the 5-month lead time, with the CC falling to 0.47 at the 6-month lead time. In contrast, incorporating WOA or MMD into BiLSTM increases the CC at the 6-month lead time to 0.69 and 0.72, respectively, extending the effective lead time (CC > 0.5) by 3 or 4 months.
The improvements in PDO index prediction skills offered by the WOA and MMD methods differ. In terms of CC values, BiLSTM-MMD consistently outperforms BiLSTM-WOA during the first 6 months. However, at the 7-month lead time, BiLSTM-MMD shows a sharp decline compared to the 6-month lead time, whereas BiLSTM-WOA maintains a more stable performance. Consequently, the CC for BiLSTM-MMD falls below that of BiLSTM-WOA after the 7-month lead time, highlighting the short-term advantages and long-term limitations of BiLSTM-MMD. In contrast, the BiLSTM-WOA model demonstrates more robust long-term predictive stability, albeit with a lower upper limit in prediction accuracy. This divergence stems from their distinct mechanisms. The WOA dynamically optimizes model hyperparameters during training, enabling adaptive prediction of multi-scale PDO variations. This mitigates abrupt performance degradation in long-term predictions caused by fixed parameters, thereby enhancing stability. The MMD decomposes the PDO index into multiple independent modes. By sequentially superimposing these components, we observe that they resemble the low-frequency trend components dominating the PDO index (obtained by applying the moving average to the PDO index) [
30], achieving a CC of 0.93 (
Figure 7). The decomposition isolates high-frequency noise (e.g., interannual variability linked to ENSO) from the low-frequency signals (e.g., SST anomalies in the central North Pacific), allowing multi-channel BiLSTM to independently model each component. This separation prevents high-frequency noise from interfering with the dominant low-frequency signals during training, which explains the higher CCs and lower RMSEs for the first 5–6 months (
Figure 6). Therefore, it is evident that both methods possess distinct advantages.
The WOA method addresses long-term stability through adaptive parameter tuning, while the MMD method enhances short-term accuracy by physically disentangling signal components and suppressing noise. Their complementary roles provide a balanced framework for multi-scale climate index prediction. Combining both methods, the BWM model extends the effective lead time to 15 months (CC = 0.56), outperforming individual methods. At the 6-month lead time, the baseline BiLSTM yields a CC below 0.5, while BWM achieves 0.76. When BiLSTM-WOA and BiLSTM-MMD achieve CCs of 0.5 (or less than 0.5), the BWM model is still able to achieve a CC exceeding 0.65 at the same lead time. The introduction of the MMD and WOA methods into BiLSTM improves the upper prediction limit, resulting in larger CCs and lower RMSEs for the PDO index.
The improvements in prediction performance of the BWM model can be summarized in three key aspects. Firstly, the model is trained on a substantial amount of simulation data, which, combined with the robust capability of BiLSTM to extract bidirectional time features, allows for the comprehensive extraction of important characteristics. Secondly, the MMD method decomposes the PDO index into distinct frequency components, reducing the temporal complexity of the PDO index and raising the model’s predictive accuracy ceiling. Thirdly, WOA’s iterative hyperparameter optimization ensures effective training and minimizes the probability of sudden increases in prediction errors. The comparison clearly supports these points (
Figure 6). Collectively, these advantages establish the BWM model’s superiority over alternative data-driven approaches for PDO index prediction.
3.4. Predicted Annual Average NCEI PDO Index
In addition to the interannual variability assessed with the monthly PDO index, the PDO exhibits decadal-scale variability, which has received considerable attention. To evaluate the model at the decadal scale, we use the annual average PDO index.
The CC and RMSE are also employed to assess the BWM model’s annual-scale predictions.
Figure 8a presents the prediction results for 1-, 3-, and 5-year lead times. The prediction results for other lead times are shown in
Figure S5.
Figure 8b presents the corresponding prediction performance (RMSE and CC) for different lead times. Throughout the entire validation period (1903–2024), the BWM model maintains an RMSE below 0.9. Additionally, the model achieves a CC above 0.5 for lead times ranging from 1 to 5 years, reaching a CC of 0.55 at the 5-year lead time. This prediction performance can be attributed to the high correlation between the low-frequency signals obtained through the MMD method and the low-frequency PDO index (moving average), which the BiLSTM effectively captures.
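The annual-scale evaluation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CC is taken to be the Pearson correlation coefficient, the function names are hypothetical, and the sample values are made up for demonstration rather than taken from the BWM model or the NCEI PDO index.

```python
import math

def annual_means(monthly, months_per_year=12):
    """Average consecutive 12-month blocks; an incomplete final year is dropped."""
    n_years = len(monthly) // months_per_year
    return [
        sum(monthly[y * months_per_year:(y + 1) * months_per_year]) / months_per_year
        for y in range(n_years)
    ]

def pearson_cc(pred, obs):
    """Pearson correlation coefficient between two equal-length series."""
    if len(pred) != len(obs) or len(pred) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_p * var_o)

def rmse(pred, obs):
    """Root-mean-square error between two equal-length series."""
    if len(pred) != len(obs) or not pred:
        raise ValueError("series must be non-empty and of equal length")
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))

# Hypothetical monthly series covering three years (illustrative values only).
obs_monthly = [-0.5] * 12 + [0.5] * 12 + [1.0] * 12
pred_monthly = [-0.4] * 12 + [0.3] * 12 + [0.9] * 12

obs_annual = annual_means(obs_monthly)
pred_annual = annual_means(pred_monthly)
print("CC:", pearson_cc(pred_annual, obs_annual))
print("RMSE:", rmse(pred_annual, obs_annual))
```

In practice the same two metrics would be computed separately for each lead time, pairing each prediction with the observation at its target date.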
3.5. Test the Generalizability Using Monthly OISST PDO Index
We next test the generalizability of the BWM model using a different dataset. To this end, we computed monthly OISST PDO indices using SST data from the OISST dataset and conducted predictions. Compared to the ERSST v5 dataset used for computing the NCEI PDO index, the OISST dataset provides higher spatial resolution (0.25° × 0.25°), resolving finer spatial detail and supporting a more comprehensive understanding of PDO-associated phenomena. Furthermore, a near-time PDO index (from weekly data) can be calculated from OISST. Such a near-time index can be used for future prediction.
Although the PDO index based on OISST starts from January 1982, we selected January 1991 as the start date due to model input requirements.
Figure 9 compares the predicted OISST PDO index by the BWM model (red) with the observed values (blue) from January 1991 to December 2024, along with the CC,
p-value, and RMSE for different lead times (1, 6, 10, and 15 months), while also showing the distribution of biases. Other prediction results can be found in
Figure S6. The results demonstrate that the model successfully reproduces the temporal variations of the OISST PDO index with high CC values. The CC is 0.92 at the 1-month lead, effectively capturing the evolution of the OISST PDO index. At 6-month and 10-month lead times, the CCs are 0.82 and 0.69, respectively. Additionally, at a 15-month lead time, the model-predicted PDO index maintains a CC of 0.64 with the OISST PDO index. These consistent results validate the robust generalization capability of the BWM model. Notably, no OISST data were used in training, which suggests that the generalization performance primarily stems from the WOA algorithm’s continuous updates and adjustments to the model training, effectively reducing the risk of overfitting caused by a large amount of simulated data.
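One way to summarize such lead-dependent skill is an effective lead time, i.e., the longest evaluated lead at which the CC stays at or above a chosen threshold. The sketch below is a minimal illustration, not the authors' procedure: the function name is hypothetical, the 0.5 threshold is assumed, and only the four OISST CCs quoted above are used, so the result is limited to those evaluated leads.

```python
def effective_lead_time(cc_by_lead, threshold=0.5):
    """Return the longest lead whose CC, and the CC at every shorter
    evaluated lead, is at or above the threshold; None if the first lead fails."""
    effective = None
    for lead in sorted(cc_by_lead):
        if cc_by_lead[lead] < threshold:
            break  # skill has dropped below the threshold; stop searching
        effective = lead
    return effective

# CCs for the OISST PDO index at the lead times (in months) quoted in the text.
oisst_cc = {1: 0.92, 6: 0.82, 10: 0.69, 15: 0.64}

print(effective_lead_time(oisst_cc))                 # 15
print(effective_lead_time(oisst_cc, threshold=0.7))  # 6
```

Stopping at the first lead that falls below the threshold avoids counting an isolated recovery of skill at a longer lead as usable skill.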
3.6. Predict Future PDO Phase
According to the monthly/annual BWM prediction results, the model can reliably predict the PDO index in advance (CC ≥ 0.5). Generally, the sign of the PDO index reflects its phase state (positive for the warm phase, negative for the cool phase). This means that the model can identify the PDO’s phase transitions in the coming months and years by forecasting the monthly NCEI PDO index starting from May 2025 and the annual average NCEI PDO index from 2024 onward. Therefore, we used the BWM model to predict the NCEI PDO index on monthly and annual scales. Additionally, the OISST data are available in near-time: once the daily mean SST data for a given month are obtained (typically on the first day of the following month), the PDO index for that month can be calculated quickly. Owing to this advantage, we also conducted future monthly forecasts for the OISST PDO index starting from June 2025. These predicted values are shown in
Figure 10. The monthly predictions made by the BWM model for the NCEI and OISST PDO indices are broadly consistent. Their CCs from January 1991 to December 2024 at the 1-, 6-, 10-, and 15-month lead times are 0.89, 0.8, 0.6, and 0.49, respectively (
Figure 10b).
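The sign-based reading of the PDO phase described above can be illustrated with a short Python sketch. It assumes a simple zero threshold (positive index for the warm phase, negative for the cool phase); the function names and the sample values are hypothetical and do not reproduce the BWM forecasts.

```python
def pdo_phase(index_value):
    """Classify a PDO index value by its sign."""
    if index_value > 0:
        return "warm"
    if index_value < 0:
        return "cool"
    return "neutral"

def phase_transitions(series):
    """Return (position, from_phase, to_phase) for every sign change.

    Exact zeros are treated as neutral and do not count as a transition.
    """
    transitions = []
    previous = None
    for position, value in enumerate(series):
        phase = pdo_phase(value)
        if phase == "neutral":
            continue
        if previous is not None and phase != previous:
            transitions.append((position, previous, phase))
        previous = phase
    return transitions

# Hypothetical predicted index values (illustrative only).
predicted = [-1.2, -0.8, -0.1, 0.3, 0.5, -0.2]
print([pdo_phase(v) for v in predicted])
print(phase_transitions(predicted))
```

A series that stays negative throughout, as in a persistent cool phase, yields an empty list of transitions.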
Based on the existing PDO index (
Figure S1), the PDO remained predominantly in the cool phase from 2019 to 2024. The forecasts of the monthly NCEI and OISST PDO indices indicate that, within the next 15–16 months starting from May 2025, the PDO will remain in the cool phase. This persistence may be attributed to the frequent occurrence of La Niña events in recent years, which have abnormally strengthened the equatorial easterly winds, leading to the upwelling of cold water in the tropical Pacific. This results in persistent cooling in the eastern equatorial Pacific and the transmission of cooling signals to the North Pacific via the atmospheric bridge. Additionally, the Atlantic Multidecadal Oscillation (AMO) is currently in a positive phase, which induces subsidence over the North Pacific, is unfavorable for the maintenance of the Aleutian Low, and may also cause the PDO to enter a negative phase [
41,
42,
43]. The BWM model’s 10-month lead prediction for the NCEI PDO index indicates that the index will first decrease starting in June 2025 and then show a slight overall increase from July onwards. The predicted OISST PDO index at a 10-month lead leads to the same conclusion. This behavior suggests that the cool phase may gradually weaken but will not yet reach the threshold of a warm phase. Such short-term fluctuations could be related to internal oceanic dynamic processes (such as subtropical gyre adjustments) or localized wind stress anomalies [
1]. Although the cool phase of the PDO is characterized overall by warm anomalies in the central North Pacific and cold anomalies in the tropical eastern Pacific, localized wind field adjustments (such as regional wind stress anomalies) may trigger short-term marine heatwave events in certain areas, such as the Gulf of Alaska. The small upward trend in the monthly PDO index predicted by the model (without reaching the warm phase threshold) may suggest the potential occurrence of such an event. Regarding the 5-year forecast, the results suggest that the PDO will remain in the cool phase until 2029. This prolonged persistence may be associated with the synergistic effects of the AMO or the influence of greenhouse gas forcing on ocean heat absorption [
44]. Additionally, this condition could have sustained impacts on temperature variations (warm blob or marine heatwaves) along the North American west coast and on the precipitation of the East Asian monsoon [
2]. In studies on future phase prediction of the PDO, Du and Chen evaluated the performance of 40 CMIP6 models in simulating the PDO during the historical period (1870–2014) and predicted PDO changes from 2015 to 2100 under four future scenarios (SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5) based on 17 models [
45]. All four prediction results indicate that the PDO will predominantly remain in a cool phase during the period 2025–2030, which is consistent with our prediction results.
4. Conclusions and Discussions
This study presents the BWM deep learning model to predict the monthly and annual PDO index. To address the challenge of insufficient samples for model training, we utilize a 500-year dataset of monthly PDO indices simulated by a climate model to train the BWM model. The model effectively predicts the month-to-month variability of the PDO index at a 1-month lead time, achieving a CC of 0.99, and maintains high prediction skill at a 15-month lead time with a CC of 0.56. For the annual average PDO index, the BWM model also demonstrates strong predictive capabilities, yielding a CC of 0.99 at the 1-year lead time and a CC of 0.55 at the 5-year lead time. The generalization ability of the model is also well verified using the PDO index computed from the OISST dataset. Additionally, the BWM model effectively captures the nonlinear characteristics and interdecadal evolution patterns of the PDO. The monthly and annual forecasts of future PDO phase evolution indicate that the PDO will remain in a cool phase for an extended period. Specifically, the central North Pacific is projected to exhibit positive SST anomalies, while the tropical central-eastern Pacific and coastal regions of North America will maintain negative SST anomalies. The monthly phase prediction results are further validated by OISST data, and the near-time updated OISST data allow future monthly PDO changes to be predicted from a later, more recent starting point (June 2025). The findings provide valuable insights for relevant industries to develop adaptive measures in advance and demonstrate the value of integrating remote sensing into climate prediction research.
Both the WOA and MMD methods contribute to the improvement of the BiLSTM’s prediction skill for the PDO index. The WOA effectively identifies optimal parameters by optimizing the model’s predetermined hyperparameters. After applying WOA and MMD, the PDO prediction performance improves significantly, with the CC at the 6-month lead increasing from 0.47 to 0.69 and 0.72, while the RMSE decreases from 1.07 to 0.91 and 0.79, respectively. Moreover, the model exhibits higher predictive skill during winter, spring, and early summer, highlighting the seasonal dependence of the BWM model. Owing to their complexity, climate models are often affected by spatial characteristics of various types and scales. Incorporating spatial variation into the training process of the model might therefore further improve the prediction skill, which is particularly important for long-term prediction. In the future, introducing spatiotemporal information into this model structure may improve PDO prediction further.
Data-driven deep learning models provide valuable insights for PDO prediction, as exemplified by the SGRU and DSEN models, which effectively predict the monthly PDO index at lead times of 6 and 12 months, respectively. The BWM model outperforms these existing advanced deep learning models (SGRU and DSEN) in predicting the PDO index. Its robust architecture enhances the model’s ability to handle complex time series data, allowing it to capture intricate patterns and dependencies within the data, an essential capability for climate predictions influenced by multiple factors. Furthermore, the hyperparameter optimization module in the framework enables the model to be extended adaptively to additional time series prediction tasks, such as forecasting key ENSO indicators, including the Niño 3.4, Niño 1+2, and Niño 3 indices.
The BWM model is a purely data-driven deep learning model that forecasts based on the PDO index alone. It does not explicitly include the physical drivers of the PDO (such as ENSO and the Aleutian Low). Thus, the physical interpretability of its prediction results is weak. Furthermore, the BWM model uses only the time series of the PDO index and does not incorporate the spatial distribution of SSTA or other variables. This limits the model’s ability to forecast the PDO because connections with other regions are missing. Therefore, integrating multi-dimensional data and constructing a data-driven framework that incorporates hybrid physical information will be a key direction for enhancing the BWM model.
Our research findings demonstrate that deep learning methods significantly facilitate climate system predictions and advance research on forecasting complex dynamical systems. For the PDO, a near-time index can be derived from satellite remote sensing data (such as OISST). Based on this near-time PDO index, future predictions of the PDO index can be produced promptly, which can enhance operational application. Meanwhile, high-resolution remote sensing data can be combined with the predicted PDO index to monitor ocean disasters. Given the limited number of observed PDO indices on an annual scale, using them directly for training and testing would lack statistical significance, making it challenging to evaluate the model’s generalization ability. Therefore, the strong performance of the BWM model is also attributed to the simulated PDO index derived from a climate system model. The ample simulated data provide a solid foundation for training the BWM model and maintaining the prediction performance, which is particularly crucial for multi-time-scale predictions of the PDO.