1. Introduction
Accurate runoff prediction, especially short-term runoff prediction, is critical for water resource management. However, due to human impacts on land and atmospheric systems, runoff exhibits high spatiotemporal variability and uncertainty, making it challenging to effectively capture the dynamics of short-term runoff time series [1]. These issues are exacerbated by anthropogenic effects and climate variability, which further increase the spatiotemporal variability and unpredictability of runoff patterns [2,3,4,5]. To address these challenges, researchers have developed a diverse set of prediction models, which can be roughly classified as physics-based models [6,7,8,9] and data-driven models [10,11,12,13].
Historically, physics-based models have been the foundation of hydrological prediction. These models replicate complicated, nonlinear hydrological processes by forecasting climatic variables (such as precipitation) and calculating runoff using rainfall–runoff models [14]. Such models adequately explain the underlying physical mechanisms and are highly reliable. For example, the Soil and Water Assessment Tool (SWAT) has been combined with the ArcGIS interface to produce a two-dimensional basin-based hydrological model that has been widely used for runoff and water quality modeling in agricultural areas [15]. MIKE SHE and MIKE 3 perform three-dimensional simulations of surface water flow and sediment transport in urban, coastal, and marine environments [16]. Despite their utility, physics-based models are frequently limited by the requirement for high-quality, extensive physical data to solve the governing physical equations. In areas with limited observational networks or poor data quality, their use can be difficult, limiting their usefulness and scalability [17]. In contrast, data-driven hydrological models place less emphasis on the description of physical processes in the hydrological cycle, instead focusing on the correlations between input and output variables learned from data [18,19,20]. These models are highly efficient in learning, can be quickly applied to practical scenarios, and demonstrate good adaptability even in basins with insufficient spatial information or short temporal records. Support Vector Machines (SVMs) are models that minimize output errors by applying linear or nonlinear kernel weighting to input variables [21]. SVMs, like linear regression, can model individual time steps in a time series, and they outperform physical models, such as the MIKE Flood and Storm Water Management Model (SWMM) [22], in terms of modeling speed and accuracy.
Data-driven hydrological models are often mappings between many input features and output targets. To use typical linear and nonlinear machine learning techniques, the data must be reframed for time-series forecasting. To handle input sequences, researchers developed Recurrent Neural Network (RNN) models, such as LSTM and GRU [23,24,25,26,27], which have been successfully used in a variety of investigations. LSTM models have been employed in soil moisture modeling [28], groundwater level prediction, and daily or hourly rainfall–runoff modeling [29,30]. GRU models have been employed in short-term runoff prediction [18], and when combined with techniques such as Variational Mode Decomposition (VMD) [31], they have demonstrated excellent predictive performance.
However, standalone data-driven models face their own set of challenges. While they excel at capturing relationships within historical data, their predictive accuracy often deteriorates with increasing lead times [32,33]. This problem arises because these models rely mainly on past runoff data and ignore future climatic variables, which are important for effective runoff prediction [34]. During runoff generation, hydro-meteorological interactions are very close [35]. In most catchments, runoff is primarily produced either through direct precipitation or snowmelt-driven processes [36]. As such, changes in precipitation or snowmelt are typically positively correlated with runoff volumes: an increase in these inputs often leads to greater runoff, and vice versa. Rising temperatures can enhance potential evapotranspiration, which is particularly impactful in arid regions, where limited precipitation leads to a substantial reduction in water availability for runoff generation [37]. Moreover, extreme weather events can significantly alter runoff dynamics, potentially resulting in hydrological extremes, such as floods or droughts [38]. In scenarios where reliable meteorological forecasts are unavailable [39], this dependence on historical data becomes a significant bottleneck.
Recent breakthroughs in machine learning and meteorological modeling have opened new avenues for addressing these limitations [40]. The emergence of AI-driven weather prediction models, such as Pangu-Weather, has revolutionized the field by significantly enhancing the accuracy and efficiency of meteorological forecasts [39]. Pangu-Weather uses a three-dimensional Earth-Specific Transformer to process meteorological data, integrating spatial and temporal information with unprecedented precision. By mitigating cumulative forecast errors via hierarchical temporal aggregation strategies, this model has outperformed traditional Numerical Weather Prediction (NWP) systems in both speed and accuracy. Leveraging such AI-driven meteorological forecasts in hydrological models can account for uncertainties in future conditions, thereby enhancing the robustness and reliability of runoff predictions.
This study presents a novel approach to runoff forecasting that combines Pangu-Weather, LSTM, and GRU. The LSTM-Pangu and GRU-Pangu models are designed to capitalize on the strengths of both AI-driven meteorological forecasts and data-driven hydrological modeling. By incorporating meteorological predictions into the runoff forecasting process, these models seek to overcome the constraints of classic LSTM and GRU models, particularly in scenarios involving extended lead times or extreme flow conditions.
The paper is arranged as follows. Section 2 describes the LSTM, GRU, and Pangu-Weather models and their structures, followed by the model evaluation criteria. Section 3 provides an overview of the study area, data, and model configuration. Section 4 includes the modeling results and discussions. Finally, Section 5 summarizes this paper.
2. Materials and Methods
2.1. Long Short-Term Memory (LSTM)
LSTM is a form of Recurrent Neural Network (RNN) that was developed to address the shortcomings of standard RNN models in handling long-term dependencies. A typical RNN struggles to effectively preserve and use long-term dependencies in sequential data, as earlier information often loses influence during propagation. In contrast, LSTM, through its unique memory cell structure, excels at preserving and transmitting long-term information [41].
Figure 1 illustrates the basic structure and operational flow of an LSTM unit. At each time step, an LSTM unit maintains a distinct state, known as the cell state, in which information can be stored. The time-series input is shown in Figure 1 as $x_t$, while the output is shown at the top as $h_t$.
Each LSTM cell updates six quantities at each time step. The required steps are described in Equations (1)–(6) [42]:

$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$ (1)
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$ (2)
$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$ (3)
$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$ (4)
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$ (5)
$h_t = o_t \odot \tanh(C_t)$ (6)

The first parameter is the forget gate ($f_t$), which controls how much information from the previous cell state is forgotten. The linear operations at the different steps have their own weight matrices ($W$) and biases ($b$). The closer the forget gate value ($f_t$) is to 0 after the sigmoid function ($\sigma$), the more information from the previous cell state is forgotten. The second parameter is the input gate ($i_t$), which specifies what new information is retained and supplied to the cell state at the current time step; it is obtained by applying the sigmoid function to a linear combination of $h_{t-1}$ and $x_t$. Meanwhile, the candidate cell state ($\tilde{C}_t$) is calculated by applying the tanh (hyperbolic tangent) activation function to a linear combination of $h_{t-1}$ and $x_t$. Next, the cell state ($C_t$) is updated by combining the information retained through the forget gate with the new information from the input gate. The output gate ($o_t$) is then computed by performing a linear operation on $h_{t-1}$ and $x_t$ and applying the sigmoid function. Finally, the output ($h_t$) of the current time step is the product of $o_t$ and the tanh of the cell state ($C_t$).
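For concreteness, the single-step computation described by Equations (1)–(6) can be written out directly. The following NumPy sketch is illustrative only (the models in this study use the Keras LSTM layer described below), and the function and parameter names are our own:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Equations (1)-(6).

    W maps the concatenated vector [h_{t-1}, x_t] to each gate
    (forget, input, candidate, output); b holds the corresponding biases.
    """
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])        # (1) forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])        # (2) input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])    # (3) candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde        # (4) cell-state update
    o_t = sigmoid(W["o"] @ z + b["o"])        # (5) output gate
    h_t = o_t * np.tanh(c_t)                  # (6) hidden state / output
    return h_t, c_t
```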
In this study, given the time cost of converting meteorological factors into runoff and the temporal relationship between past and future runoff, LSTM is well suited as a neural network approach for handling time-series data, such as meteorological and historical runoff data. The LSTM layer is a standard component in many recent machine learning packages [43], making it convenient to use. As a result, the relevant models in this study are implemented using the built-in LSTM layer of the Keras framework.
2.2. Gate Recurrent Unit (GRU)
The LSTM network was first proposed in 1997 and applied to language processing tasks. It is well known for its extraordinary capacity to capture long-term and short-term dependencies. However, because of their relatively complex structure, LSTM neural networks often take longer to construct and train. To overcome this issue, GRU networks were proposed as a simplification of the LSTM network, with a simpler topology [44].
The structure and operational flow of GRU units, shown in Figure 2, differ from LSTM in that the hidden state and cell state are merged into a single state and fewer gates are used for control. In the GRU cell, there are two control gates: the update gate ($z_t$) and the reset gate ($r_t$). The update gate ($z_t$) controls how much information from the previous state ($h_{t-1}$) is transferred to the current time step; the higher the value of the update gate ($z_t$), the more information is transmitted. The reset gate ($r_t$) determines how much information from the previous state is transferred to the current candidate state; the lower the value of the reset gate, the less information is transferred from the previous state. The gate activations are calculated as follows:

$z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)$
$r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)$

where $W_z$, $U_z$, $W_r$, and $U_r$ are the weight vectors of the network; $b_z$ and $b_r$ are the network bias vectors; and $z_t$ and $r_t$ are vectors containing the activation values of the update and reset gates.
The GRU features a simpler structure and faster training while handling long sequence data, and it is suitable for smaller datasets; however, its ability to capture long-term dependencies is slightly inferior to that of LSTM. In this study, considering the relationship between meteorological factors, past runoff, and future runoff over time, the GRU model is also suitable for application. The GRU layer is a standard component in many recent machine learning packages [43], making it convenient to use. Therefore, this study implements the relevant models using the built-in GRU layer of the Keras framework.
2.3. Pangu-Weather
In recent years, AI technology has opened a new pathway for weather forecasting, significantly improving speed compared to traditional methods. However, the forecasting accuracy of most existing AI models still lags behind that of Numerical Weather Prediction (NWP) systems. Nevertheless, Pangu-Weather, proposed by Bi et al. (2023) [39], outperforms current NWP systems in both accuracy and speed.
The structure of Pangu-Weather is shown in Figure 3. By incorporating the vertical dimension (height) into the neural network, the model constructs a 3D architecture capable of explicitly modeling interactions across different atmospheric pressure layers. The 3D data are processed by an encoder–decoder design based on the Swin transformer [45]. To better integrate Earth-specific physical constraints, the researchers replaced the original relative positional bias in Swin with geophysical positional biases, enabling a more accurate spatial encoding of atmospheric variables.
The model partitions atmospheric data into 13 vertical pressure levels (e.g., 500 hPa, 850 hPa) and multiple latitudinal zones (e.g., tropics, mid-latitudes), which helps reflect the geospatial dependencies of atmospheric dynamics. Traditional relative positional encodings are insufficient for capturing latitude-dependent Coriolis effects, vertical coupling across pressure layers (such as interactions between upper-level jet streams and surface wind fields), and land–sea contrasts that influence weather systems (e.g., monsoons). To address this, the model assigns independent positional bias parameters to each pressure level and latitude band, explicitly encoding their absolute spatial relationships. Although this geographic specialization increases the number of bias parameters by a factor of 527 compared to the original Swin architecture, it significantly improves the model’s ability to predict extreme weather events, such as typhoon trajectories, demonstrating the value of incorporating physical priors.
Compared to the baseline, the 3D Earth-Specific Transformer (3DEST) has the same computational cost but a faster convergence speed. To reduce cumulative forecast errors, the authors introduced hierarchical temporal aggregation, a strategy that always applies the deep neural network with the longest available lead time; this substantially decreases the number of iterations. This design significantly enhances the predictive capability of the model across different spatial scales, achieving much higher accuracy than 2D models, such as FourCastNet [46].
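The aggregation step can be illustrated with a small greedy sketch. Assuming, as in the original Pangu-Weather setup, that separate models are trained for 24 h, 6 h, 3 h, and 1 h lead times, a forecast of any horizon is composed by repeatedly applying the model with the longest lead time that still fits; the function name and structure here are illustrative:

```python
def plan_aggregation(target_hours, available=(24, 6, 3, 1)):
    """Greedy hierarchical temporal aggregation: cover the target lead time
    with as few model applications as possible, preferring long lead times."""
    steps = []
    remaining = target_hours
    for lead in sorted(available, reverse=True):
        while remaining >= lead:
            steps.append(lead)
            remaining -= lead
    return steps

# Example: a 31 h forecast is built as 24 h + 6 h + 1 h (3 iterations)
print(plan_aggregation(31))  # [24, 6, 1]
```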
In this study, the version of the Pangu model described by Bi et al. (2023) [39] was adopted, featuring a spatial resolution of 0.25° × 0.25° and a temporal resolution of 24 h. This version was trained using 39 years of ERA5 reanalysis data from 1979 to 2017. After training, the model operates as a predictive system and produces forecasts or simulations given initial conditions. The input data for the Pangu-Weather model include surface variables and upper-air variables. The surface variables (input_surface.npy) form a NumPy array with a shape of (4, 721, 1440), representing four surface variables (mean sea-level pressure, 10 m u-component of wind speed, 10 m v-component of wind speed, and 2 m temperature). The upper-air variables (input_upper.npy) form a NumPy array with a shape of (5, 13, 721, 1440), representing five upper-air variables (Z, Q, T, U, V) at 13 pressure levels, where Z, Q, T, U, and V denote the geopotential, specific humidity, temperature, and the u- and v-components of wind speed, respectively.
2.4. Pangu-Driven Runoff Prediction Model
The prediction framework, shown in Figure 4, combines the Pangu model with multiple runoff predictor sub-models designed to forecast runoff for successive days. The framework consists of m sub-models that predict runoff for the next m days. These sub-models are not independent; the output runoff of the i-th sub-model serves as an input condition for the (i + 1)-th sub-model, so each step of the prediction process relies on the prediction from the previous step. The simple version of the first sub-model (Model 1) is defined as follows:

$\hat{Q}_{n+1} = f_1(X_{1:n})$

and that is

$\hat{Q}_{n+1} = f_1(M_{1:n}, Q_{1:n})$

where n is the time step, $\hat{Q}_{n+1}$ is the predicted runoff for day (n + 1), and $X_{1:n}$ represents the observed data from day 1 to day n. $M_{1:n}$ represents the observed meteorological data for the past n days, and $Q_{1:n}$ represents the observed runoff data for the past n days. The runoff for day (n + 1) is therefore predicted using both historical runoff data and historical meteorological data.
Next, the simple version of the m-th sub-model (Model m) is defined as follows:

$\hat{Q}_{n+m} = f_m(X_{m:n}, \hat{X}_{n+1:n+m-1})$

and that is

$\hat{Q}_{n+m} = f_m(M_{m:n}, Q_{m:n}, \hat{M}_{n+1:n+m-1}, \hat{Q}_{n+1:n+m-1})$

where m is the lead time (m < n); $\hat{Q}_{n+m}$ is the predicted runoff for day (n + m); $X_{m:n}$ are the observed data from day m to day n; and $\hat{X}_{n+1:n+m-1}$ are the predicted values from the first to the (m − 1)-th sub-model. $\hat{M}_{n+1:n+m-1}$ are the meteorological data predicted by the Pangu model for lead times of 1 to (m − 1) days, while $\hat{Q}_{n+1:n+m-1}$ are the runoff data predicted by the first to the (m − 1)-th sub-models. The runoff for day (n + m) is thus predicted using (n − m) runoff observations, (n − m) meteorological observations, (m − 1) runoff predictions, and (m − 1) meteorological predictions.
Therefore, these m models for runoff prediction over m steps are not independent. For example, the first sub-model needs to be trained first, serving two purposes: predicting the runoff for the next day and providing input for the sub-models at steps 2, 3, …, m, and so on.
The Pangu-Weather model takes the observed meteorological fields as input, makes predictions, and feeds the predicted values into each sub-model. The meteorological data include 2 m temperature and specific humidity values at the 50 hPa and 100 hPa levels for 77 grid points in the study region.
Within the model, the initial focus is on daily runoff prediction, which is gradually extended to multi-day predictions. For example, with a 3-day lead time and a 5-day time step, Model 1 forecasts runoff for the first day using historical meteorological and runoff data from the previous 5 days, while Model 2 forecasts runoff for the 2nd day using historical runoff and meteorological data from the previous 4 days, as well as the 1st-day runoff prediction and Pangu’s meteorological data forecast. Model 3 predicts the runoff for the third day using the historic runoff information and meteorological information from the previous three days, along with the runoff predictions for the first and second days from Models 1 and 2, and the forecast of Pangu for the first and second days’ meteorological data.
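A compact sketch of this rolling scheme under the worked example above (a 5-day window and a 3-day lead time) is given below; the sub-models are represented as generic trained predictors, and all names and array layouts are illustrative:

```python
import numpy as np

def rolling_forecast(sub_models, met_obs, runoff_obs, met_preds):
    """Rolling multi-step runoff prediction.

    sub_models : list of m trained predictors; sub_models[k] forecasts lead day k+1.
    met_obs    : observed meteorology for the last n days, shape (n, n_met).
    runoff_obs : observed runoff for the last n days, shape (n,).
    met_preds  : Pangu-predicted meteorology for the next m-1 days, shape (m-1, n_met).
    """
    runoff_preds = []
    for k, model in enumerate(sub_models):
        # Model k+1 drops the k oldest observed days and appends k predicted days
        met_in = np.vstack([met_obs[k:], met_preds[:k]]) if k else met_obs
        run_in = np.concatenate([runoff_obs[k:], runoff_preds[:k]]) if k else runoff_obs
        features = np.column_stack([run_in, met_in])      # (n, 1 + n_met)
        y_hat = model.predict(features[np.newaxis, ...])  # Keras-style call
        runoff_preds.append(float(np.squeeze(y_hat)))
    return runoff_preds  # runoff for lead days 1..m
```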
The sub-models were implemented as either LSTM or GRU networks, yielding the LSTM-Pangu and GRU-Pangu models, which were used as the experimental models. To evaluate their performance, they were compared against two baseline models, the LSTM and the GRU, both of which forecast future runoff using runoff and meteorological data from prior days.
2.5. Performance Evaluation Methods
This study evaluated model performance using four metrics: Nash–Sutcliffe Efficiency (NSE), Pearson Correlation Coefficient (R), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE). NSE and R assess the accuracy and consistency of model predictions, while MAE and RMSE quantify the average deviation between predicted and observed values. Model performance improves as the NSE value approaches one, or as the MAE and RMSE values approach zero [47]. The formulas for the four metrics are presented below:

$NSE = 1 - \dfrac{\sum_{i=1}^{N} (Q_i^{obs} - Q_i^{pre})^2}{\sum_{i=1}^{N} (Q_i^{obs} - \bar{Q}^{obs})^2}$

$R = \dfrac{\sum_{i=1}^{N} (Q_i^{obs} - \bar{Q}^{obs})(Q_i^{pre} - \bar{Q}^{pre})}{\sqrt{\sum_{i=1}^{N} (Q_i^{obs} - \bar{Q}^{obs})^2}\,\sqrt{\sum_{i=1}^{N} (Q_i^{pre} - \bar{Q}^{pre})^2}}$

$MAE = \dfrac{1}{N}\sum_{i=1}^{N} \left|Q_i^{obs} - Q_i^{pre}\right|$

$RMSE = \sqrt{\dfrac{1}{N}\sum_{i=1}^{N} (Q_i^{obs} - Q_i^{pre})^2}$

where N is the total length of the data; $Q_i^{obs}$ is the observed daily runoff; $Q_i^{pre}$ is the predicted daily runoff; $\bar{Q}^{obs}$ is the mean observed runoff; and $\bar{Q}^{pre}$ is the mean predicted runoff.
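For reference, these four metrics can be computed directly from the observed and predicted series; a minimal NumPy sketch (function and variable names are illustrative):

```python
import numpy as np

def evaluate(obs, pred):
    """Return NSE, R, MAE, and RMSE for observed and predicted runoff series."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    nse = 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
    r = np.corrcoef(obs, pred)[0, 1]           # Pearson correlation coefficient
    mae = np.mean(np.abs(obs - pred))
    rmse = np.sqrt(np.mean((obs - pred) ** 2))
    return {"NSE": nse, "R": r, "MAE": mae, "RMSE": rmse}
```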
3. Overview of the Study Area and Model Configurations
3.1. Study Area and Data
The Beipan River Basin is located in southwestern China, spanning the Yunnan and Guizhou provinces. It originates in the Wumeng Mountains of Yunnan and flows through the transitional slope zone from the Yunnan–Guizhou Plateau to the Guangxi hills, ultimately joining the Hongshui River in Wangmo County, Guizhou. The basin lies between longitudes 103°50′ and 106°20′ E and latitudes 24°51′ and 26°45′ N. The main river stretches for 449 km with a total elevation drop of 1985 m, making it the river with the greatest fall in the Pearl River Basin. The geographical location of the basin is shown in Figure 5.
The basin is predominantly characterized by karst topography, featuring deeply incised gorges, such as the Huajiang and Ye Zhong Grand Canyons, with vertical dissection depths ranging from 400 to 1400 m. Due to the high permeability of the karst formations, baseflow in the basin is relatively low compared to non-karst regions, while peak flows typically occur during the flood season. Runoff during the wet season (May to October) accounts for approximately 84% of the annual total, reflecting a high seasonal variability between flood and dry periods [48].
The Guangzhao Hydropower Station is the largest hydropower facility in the basin. It has a normal water storage level of 745 m, a total reservoir capacity of 3.245 billion cubic meters, and an installed capacity of 1040 megawatts, representing about 39% of the basin’s total installed capacity. The dam site is located in a deeply incised valley with exposed bedrock on both sides. The region experiences a subtropical monsoon climate with an average annual precipitation of 1178.8 mm, and peak flood discharges typically occur between June and July.
The Dongqing Hydropower Station is a key downstream facility within the Beipan River Basin, located below the Guangzhao Hydropower Station. It primarily serves power generation, flood control, and water regulation functions. The station has a normal water level of approximately 760 m, a total reservoir capacity of about 1.86 billion cubic meters, and an installed capacity of 600 megawatts. The dam is situated in a mountainous karst region with steep terrain and significant elevation differences. As with other parts of the basin, the area experiences pronounced seasonal runoff variation.
This study selected daily runoff data from the upstream of Dongqing Hydropower Station in the Beipan River Basin, covering the period from 16 April 2015 to 23 May 2023, as the sample dataset. According to the research requirements, the sample data were divided into calibration and validation periods with a 9:1 ratio. The calibration period from 16 April 2015 to 7 August 2022 was used for model training, while the validation period from 8 August 2022 to 23 May 2023 was used for model testing.
Figure 6 depicts the daily runoff series.
The meteorological data were obtained from the Climate Data Store platform [49], which provides various observed data, such as temperature and wind speed. In this study, specific humidity and surface temperature were selected as representative meteorological variables for input into the model. These data have high spatial and temporal resolution and accuracy, and are widely used in climate change research, hydrological model simulations, and ecological environmental assessments. The high-quality meteorological data effectively reflect the actual climatic characteristics of the Beipan River Basin, providing reliable support for model construction and optimization.
To ensure the reliability and applicability of the data, strict quality control and preprocessing were performed, including outlier removal and missing value imputation.
3.2. Model Configurations
In this study, training and testing data were generated using a sliding window function (with a window size of 5d, 7d, 9d, 11d, 14d, 16d, 18d, or 20d), which created fixed-length inputs, including historical meteorological information and runoff data, with the output being the runoff for the target forecast time. Normalization was applied to both the training and testing sets of the dataset, which was split into 90% for training and 10% for testing. Following model training, predictions were made on the test set, and the results were denormalized to return them to their actual values.
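A minimal sketch of the sliding-window construction used here (window length n, lead time m; function and array names are illustrative):

```python
import numpy as np

def make_windows(runoff, met, window, lead):
    """Build (X, y) samples: X holds `window` days of runoff and meteorology,
    y is the runoff `lead` days after the end of the window."""
    X, y = [], []
    for start in range(len(runoff) - window - lead + 1):
        end = start + window
        features = np.column_stack([runoff[start:end], met[start:end]])
        X.append(features)
        y.append(runoff[end + lead - 1])
    return np.array(X), np.array(y)

# Example: a 7-day window with a 3-day lead time
# X, y = make_windows(runoff_series, met_series, window=7, lead=3)
```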
Since the architectures of LSTM and GRU are very similar, the same input and output structure was designed for both models. The input sequence consisted of n days of historical runoff data and meteorological information, and the output sequence represented the runoff m days ahead. Both the input and output data were created using the sliding window approach.
In the LSTM-Pangu and GRU-Pangu models, the model parameters are identical to those of the LSTM and GRU models. In this study, a two-layer LSTM or GRU network was employed to model the runoff relationship based on empirical results. The first layer contained 30 neurons and the second layer 10. The input consisted of two feature groups, historical runoff data and historical meteorological information, and the output layer produced the predicted runoff value for the target time.
The Mean Squared Error (MSE) was used as the loss function:

$MSE = \dfrac{1}{N}\sum_{i=1}^{N} (y_i - \hat{y}_i)^2$

where $y_i$ is the observed value and $\hat{y}_i$ is the forecast value at time step i.
The Adam optimizer was used in the optimization process, with a learning rate of 0.001. To prevent overfitting, an early stopping mechanism was introduced during training: if the loss function changed by less than a predefined threshold (e.g., 0.001) over 10 consecutive iterations, training was stopped early. Ultimately, both the LSTM and GRU networks consist of two recurrent layers, with 30 units in the first layer and 10 in the second, followed by a fully connected layer for the final output.
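A minimal Keras sketch consistent with this configuration follows; the exact training scripts are not given in the paper, so the mapping of the described early-stopping rule to the `min_delta` and `patience` arguments, and all function names, are our assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(window, n_features, cell="LSTM"):
    """Two-layer recurrent network (30 + 10 units) with a dense output layer."""
    Recurrent = layers.LSTM if cell == "LSTM" else layers.GRU
    model = keras.Sequential([
        layers.Input(shape=(window, n_features)),
        Recurrent(30, return_sequences=True),   # first recurrent layer
        Recurrent(10),                          # second recurrent layer
        layers.Dense(1),                        # predicted runoff for the target day
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss="mse")
    return model

# Early stopping: stop if the loss improves by less than 0.001 for 10 epochs
early_stop = keras.callbacks.EarlyStopping(monitor="loss", min_delta=0.001,
                                           patience=10, restore_best_weights=True)
# model = build_model(window=7, n_features=2)
# model.fit(X_train, y_train, epochs=200, batch_size=32, callbacks=[early_stop])
```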
To compare the performance of LSTM, GRU, LSTM-Pangu, and GRU-Pangu models, the LSTM-Pangu and GRU-Pangu models were trained with exactly the same parameter settings as the LSTM and GRU models to ensure the fairness and comparability of the results.
4. Results and Discussions
4.1. Comparison and Analysis of Model Performance Under Different Time Steps
This study conducted two sets of comparative experiments to comprehensively evaluate the effectiveness of the models in forecasting daily runoff for the Beipan River Basin: (1) an analysis of the differences between the LSTM-Pangu and LSTM models; and (2) a comparison between the GRU-Pangu and GRU models. The purpose of these experiments is to investigate how different model configurations and parameter settings affect runoff prediction accuracy.
In the first phase of the experiment, a lead time of 3d was fixed, and different input window lengths (5d, 7d, 9d, 11d, 14d, 16d, 18d, 20d) were set to explore their impact on the prediction accuracy. The sliding window method was employed to generate the corresponding training and testing samples, and prediction metrics were calculated for each model, including NSE, R, MAE, and RMSE. The specific results are shown in Table 1, Table 2, Table 3 and Table 4.
The following are each model’s prediction metrics when the time step is 5d: The LSTM, LSTM-Pangu, GRU, and GRU-Pangu models have respective NSE values of 0.8332, 0.8378, 0.8315, and 0.8364; the R values are 0.9127, 0.9208, 0.9119, and 0.9254; the MAE values are 24.1841, 23.2318, 25.1731, and 23.9230; and the RMSE values are 36.1030, 33.9629, 36.0200, and 33.9645. When the input window increases to 7, the NSE values increase by 0.0289, 0.0415, 0.0284, and 0.0317; the R values increase by 0.0157, 0.0169, 0.0154, and 0.0117; the MAE values decrease by 1.7721, 2.005, 2.8858, and 2.5578; and the RMSE values decrease by 3.1192, 3.7732, 3.1925, and 3.8754.
Overall, the LSTM-Pangu and GRU-Pangu models demonstrate more substantial improvements compared to LSTM and GRU. To quantitatively assess the significance of performance improvements introduced by the Pangu-enhanced models, we conducted paired two-tailed t-tests for each metric across all the tested time steps. The results, shown in Table 5, demonstrate that the improvements in NSE, R, MAE, and RMSE achieved by both the LSTM-Pangu and GRU-Pangu models are statistically significant (p < 0.05) when compared with their baseline counterparts. These findings validate the effectiveness of integrating Pangu weather features into the recurrent neural network frameworks and support our earlier claims of improved model accuracy.
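A paired two-tailed t-test of this kind can be reproduced with SciPy; a minimal sketch (the input arrays are to be filled with the per-time-step metric values, which are not repeated here):

```python
from scipy import stats

def paired_test(metric_baseline, metric_pangu):
    """Paired two-tailed t-test across time steps.

    metric_baseline, metric_pangu: one metric value per tested time step
    (5d, 7d, ..., 20d), paired by time step, e.g. the NSE columns of Tables 1-2.
    """
    t_stat, p_value = stats.ttest_rel(metric_baseline, metric_pangu)
    return t_stat, p_value
```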
However, when the time step is further increased to 11d and 14d, the prediction accuracy of the models no longer improves significantly and even shows a slight decline. The NSE of the LSTM-Pangu model, for instance, is 0.8317 at a time step of 11, which is lower than the 0.8793 at a time step of 7. When the time step is 14, the NSE of the GRU-Pangu model is 0.8367, slightly higher than the 0.8226 at a time step of 11. Meanwhile, the MAE and RMSE exhibit strikingly similar trends, indicating that excessively long time steps may introduce more noise. This is likely attributable to the diminished hydro-meteorological interactions between historical meteorological variables and future runoff observations when the time step becomes too large, thereby interfering with the model's predictive ability.
Line graphs were plotted to show the variation in the prediction accuracy metrics with the time step (as shown in Figure 7), providing a visual representation of the models' performance trends. The results show that a time step of 7 yields the best performance for both the traditional and improved models. Because runoff is inherently influenced by meteorological drivers, such as precipitation, temperature, and solar radiation, a short time step prevents the model from capturing adequate meteorological–runoff correlation information; therefore, as the time step grows, the models' prediction accuracy improves. However, the influence of runoff and meteorological information from earlier days on future runoff predictions gradually weakens, so an excessively long time step may introduce excessive irrelevant noise and thus reduce the prediction accuracy. A time step of 7 was found to strike a balance between capturing short-term correlations and avoiding information redundancy, achieving optimal performance. Further analysis indicates that as the time step rises from 5 to 20, the prediction accuracy of each model first increases, then decreases, and finally stabilizes, as seen in Figure 7.
4.2. Comparison and Analysis of Model Performance Under Different Lead Times
In the second stage of the experiment, the focus shifted to evaluating the prediction accuracy of the models using a fixed input time step of 7d across various lead times (2d, 3d, 4d, 5d). The experimental results are summarized in Table 6, Table 7, Table 8 and Table 9. From the analysis, it is evident that, regardless of whether the traditional LSTM and GRU models or the improved LSTM-Pangu and GRU-Pangu models are employed, the prediction accuracy is good for shorter lead times (e.g., a lead time of 2), and the disparities among the four models are generally small.
For instance, each model performs as follows when the lead time is 2: the NSE values of the LSTM, LSTM-Pangu, GRU, and GRU-Pangu models are 0.9364, 0.9321, 0.9353, and 0.9345; the R values are 0.9676, 0.9654, 0.9671, and 0.9666; the MAE values are 15.7868, 15.7625, 16.3566, and 16.3645; and the RMSE values are 24.1042, 25.2242, 24.2787, and 26.4813, respectively. These findings suggest that the prediction accuracy of the standard and improved models is comparable for shorter lead times, with no discernible benefit from the improvements.
As the lead time increases, both traditional and improved models show a certain degree of decline in the prediction accuracy. However, the improved LSTM-Pangu and GRU-Pangu models exhibit more significant advantages compared to the traditional models for longer lead times (e.g., a lead time of 5). For instance, the LSTM, LSTM-Pangu, GRU, and GRU-Pangu models’ respective NSE values are 0.6909, 0.7587, 0.6836, and 0.7642 at a lead time of 5; the R values are 0.8312, 0.8710, 0.8268, and 0.8741; the MAE values are 30.8651, 28.3561, 31.0889, and 29.4852; and the RMSE values are 51.0841, 45.7546, 50.6629, and 47.1663. Compared to the traditional models, the NSE of the LSTM-Pangu model improves by approximately 8.1%, and the NSE of the GRU-Pangu model improves by approximately 11.7%. This suggests that the improved models exhibit higher robustness and accuracy for runoff predictions over longer lead times.
It is evident from the line graphs in Figure 8 that the models' performance indicators change markedly with increasing lead times. In particular, the NSE and R metrics exhibit a clear downward trend, indicating that the fit and correlation of the models weaken as the lead time increases. Meanwhile, the MAE and RMSE metrics increase significantly, indicating that the prediction errors and biases grow as the lead time lengthens. Further analysis of the bar chart reveals that the gap between the LSTM and LSTM-Pangu models widens in all four performance metrics as the lead time increases, and the gap between the GRU and GRU-Pangu models likewise becomes more pronounced.
This phenomenon suggests that, at shorter lead times (e.g., lead times of 2 or 3), runoff prediction may rely more on the initial conditions, with meteorological factors having less of an impact. Therefore, the inclusion of the Pangu framework has a limited influence on the prediction accuracy. However, as the lead time extends, the influence of meteorological conditions on the runoff process becomes more pronounced. The models need to capture the complex relationship between runoff and meteorological data more accurately. At this stage, the LSTM-Pangu and GRU-Pangu models exhibit significant advantages due to their enhanced ability to extract and process meteorological information, leading to improved prediction accuracy.
In conclusion, the improved LSTM-Pangu and GRU-Pangu models more comprehensively reflect the driving mechanisms of the runoff process at longer lead times, offering higher prediction accuracy and robustness. This indicates that the introduction of the Pangu framework is significant for improving hydrological models, especially in application scenarios with long time spans and high-precision prediction requirements, where its advantages are particularly prominent.
4.3. Comparison and Visualization Analysis of Model Prediction Results
Scatter plots for the LSTM, LSTM-Pangu, GRU, and GRU-Pangu models (with a time step of 7 and a lead time of 3) were created to visually compare their prediction performance (as shown in Figure 9). These plots show the agreement between the predicted and observed values. From the scatter plots, it can be seen that, whether for the traditional LSTM and GRU models or the improved LSTM-Pangu and GRU-Pangu models, the prediction accuracy is high under low-flow conditions, with predicted values closely matching the observed values. This suggests that the variability of runoff during low-flow periods is adequately captured by all the models.
However, during the medium-to-high flow phases, the LSTM-Pangu and GRU-Pangu models significantly outperform the traditional LSTM and GRU models. The enhanced models’ scatter plot distribution is nearer the regression line, suggesting that they more precisely depict the trend and magnitude, with a particularly significant advantage in peak value prediction.
Further analysis indicates that the bias in traditional models during medium-to-high flow phases is primarily due to systematic underestimation, with predicted values deviating significantly from the observed values. This systematic bias mainly stems from the models’ limited ability to capture the complex and nonlinear hydrological responses associated with high-flow events. In particular, runoff during such periods is strongly influenced by intense rainfall or sudden upstream inflows, which introduce substantial variability and nonlinearity into the system. Since these high-flow events occur less frequently in the dataset, traditional models may not adequately learn their corresponding patterns, leading to consistent underprediction.
Traditional models, such as LSTM and GRU, rely solely on historical runoff and meteorological data, and often fail to capture the rapid dynamics of these extreme events—especially in terms of peak magnitudes and timing. In contrast, the improved models (LSTM-Pangu and GRU-Pangu), through the integration with the Pangu model for enhanced meteorological forecasting, can better represent key driving factors and their interactions with runoff. This significantly improves the models’ robustness and accuracy during dynamic high-flow conditions.
On the other hand, during low-flow periods, runoff is primarily governed by relatively stable processes, such as groundwater recharge, soil infiltration, and baseflow, which are less sensitive to short-term meteorological fluctuations. As a result, both traditional and improved models perform well under these conditions.
To further address the issue of systematic underestimation in traditional models during high-flow phases, potential strategies include enriching the input feature set with more detailed hydrometeorological variables, incorporating spatially distributed data, or exploring hybrid or physics-informed deep learning frameworks to better represent the underlying hydrological processes during high-flow events.
For runoff prediction in the Beipan River Basin, accurately predicting peak values is especially crucial as it directly affects water resource management efficiency and flood control effectiveness. The LSTM-Pangu and GRU-Pangu models can predict peak values more accurately, indicating their greater practicality and reliability in real-world applications. Overall, the improved models proposed in this study not only capture the trend changes in the overall runoff in the basin more effectively, but also provide stronger support for predicting extreme events during medium-to-high flows, offering important reference for the subsequent research on runoff prediction.
In summary, due to the high precision of the Pangu model in meteorological prediction, the LSTM-Pangu and GRU-Pangu models achieved significant improvements in runoff prediction compared to traditional models. Their rolling prediction framework not only fully considers different lead times but also demonstrates strong applicability and transferability, with broad potential for real-world applications.
5. Conclusions
In order to enhance the performance of conventional LSTM and GRU models in daily runoff prediction, this study presented the LSTM-Pangu and GRU-Pangu models, which combine the meteorological prediction capability of the Pangu-Weather model with data-driven runoff modeling. Runoff is inherently influenced by meteorological drivers, such as precipitation, temperature, and solar radiation, which directly affect processes like infiltration, evaporation, and snowmelt; by integrating these variables, the models can more accurately reflect the physical processes governing runoff generation, particularly under changing weather conditions. These improved models effectively describe the relationship between predictive factors and runoff. Their advantage is limited at short lead times, but it becomes increasingly pronounced as the lead time grows, particularly in scenarios where accurate meteorological forecast data are unavailable. According to the experimental findings derived from the daily runoff data of the Beipan River Basin, the improved models show a promising predictive performance. The specific conclusions are as follows:
- (1)
For both the LSTM and GRU models, as well as the improved LSTM-Pangu and GRU-Pangu models, the optimal time step is consistently seven days. As the time step increases beyond this, the forecast accuracy gradually declines and then stabilizes.
- (2)
As the lead time increases, every model’s predicted accuracy decreases to some extent. However, the accuracy of the LSTM and GRU models decreases more rapidly. On the other hand, the accuracy of the LSTM-Pangu and GRU-Pangu models declines more gradually, with the advantages of the improved models becoming increasingly pronounced over longer lead times.
- (3)
In daily runoff predictions, the LSTM, GRU, LSTM-Pangu, and GRU-Pangu models all exhibit high accuracy in predicting low runoff levels, with minimal differences among the models. The LSTM-Pangu and GRU-Pangu models, however, perform noticeably better in terms of their forecast accuracy for medium- and high-runoff levels than the conventional LSTM and GRU models.
In conclusion, the observed daily runoff in the Beipan River Basin exhibits significant variability during the summer flood season. Traditional LSTM and GRU models, lacking essential meteorological forecast data, fail to effectively capture runoff dynamics, resulting in a poorer predictive performance. In contrast, the LSTM-Pangu and GRU-Pangu models, by incorporating future meteorological factors, are better equipped to reflect runoff variability accurately.