1. Introduction
Accurate runoff prediction provides essential scientific support for water resources management, aquatic ecological protection, and early warning of cascading extreme drought and flood disasters [1]. However, runoff formation is driven by the coupling of multiple factors, including the spatial and temporal distribution of precipitation, topography, soil type, vegetation cover, and climate change, and is therefore highly nonlinear and dynamically complex [2,3]. At present, runoff prediction relies mainly on three technical approaches: hydrological models, deep learning models, and statistical models [4,5,6]. Hydrological models can explicitly describe spatial heterogeneity, but they are prone to parameter equifinality in data-scarce areas and under extreme-event scenarios; deep learning models have strong nonlinear fitting ability, but they often overfit because of sample imbalance and offer limited physical interpretability; statistical models have a simple structure, but they struggle to reveal the underlying runoff generation and routing mechanisms [7]. Therefore, how to significantly improve the accuracy of runoff prediction while preserving physical meaning has become a frontier topic in international hydrological science [8].
At present, researchers worldwide commonly use the SWAT and LSTM models for runoff prediction. Li et al. [9] combined the SWAT model with eight models from the Coupled Model Intercomparison Project Phase 6 (CMIP6) to estimate and forecast runoff in the source region of the Yellow River under three SSP scenarios. The results showed that the SWAT model can effectively capture the interannual trend in runoff, although the simulations carry inherent uncertainty due to parameter sensitivity. Kassem et al. [10] connected the SWAT model in series with an artificial neural network (ANN) to construct a SWAT-ANN framework for daily runoff prediction. The relative error at the Asmawa and Khanis stations decreased by 18% on average, and the accuracy was significantly better than that of the SWAT model alone. Song et al. [11] compared the performance of the SWAT and LSTM models in fitting historical runoff and in extending to future scenarios, and found that the LSTM model outperformed the SWAT model in simulating historical runoff but drifted significantly under future climate scenarios. The root cause lies in the structural uncertainty of the LSTM model and its strong dependence on the data distribution. Bian et al. [12] established an integrated LSTM-LightGBM model for arid areas whose Nash-Sutcliffe efficiency coefficient reached 0.92, confirming that coupling LSTM with other models can greatly improve prediction accuracy. In recent years, the research paradigm has gradually shifted from single models to combined models. Chen et al. [13] constructed a SWAT-LSTM coupling framework for data-scarce areas that effectively improved the simulation accuracy of watershed runoff, but noted that the framework was not robust enough in dealing with extreme events and did not quantify how uncertainty is apportioned between the models. In summary, coupling the SWAT and LSTM models can combine the benefits of data-driven methods and physical mechanisms. However, simple series coupling does not resolve the problem that the two models have different sources of uncertainty whose relative contributions are unknown. Therefore, there is an urgent need for a method that can systematically quantify and integrate the uncertainty of the physically based model and the data-driven model to improve the reliability and interpretability of the prediction results.
Bayesian model averaging (BMA) is a multi-model statistical ensemble method that weights each model by its posterior probability. It can systematically quantify the uncertainties in model parameters and structure without discarding physical mechanisms, thereby reducing overfitting and improving the prediction accuracy of coupled models [14]. Wu Haijiang et al. [15] used BMA to apply posterior weighting to multiple Vine Copula models and constructed a BVC framework for probabilistic monthly runoff prediction, which increased the coverage rate of the prediction interval at different hydrological stations by 8–12% and verified the robustness of BMA in complex frameworks. Wen Tianfu et al. [16] embedded the BMA method into a time-varying moment model to construct a quantitative attribution method for changes in annual sediment transport in the basin. The results showed that BMA integration reduced the variance of the attribution uncertainty by 35% and effectively combined the advantages of multiple models. He et al. [17] integrated three machine learning models (RF, XGBoost, and LSTM) with the BMA method and found that, compared with the best single model, the mean square error of the daily runoff forecast was reduced by 28% and the reliability of the prediction interval improved significantly. Huo et al. [18] compared the performance of various hydrological models in semi-humid basins and discussed the advantages of using the BMA method to integrate them. The results showed that the BMA method can combine various hydrological models and improve prediction accuracy. However, existing studies mainly integrate models of the same type, such as machine learning models or hydrological models. Few basin runoff studies have systematically applied the BMA method to a coupled “physical-data” system, and long-term forecasts combined with future climate scenarios are especially rare.
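To make the idea of posterior weighting concrete, the following is a minimal sketch of one common way to estimate BMA weights: an expectation-maximization (EM) fit of a Gaussian mixture centered on each model's forecast, with a single variance shared by all models. The function name `bma_em`, the shared-variance assumption, and the synthetic data are illustrative assumptions and are not taken from the studies cited above.

```python
import numpy as np

def bma_em(forecasts, obs, n_iter=200, tol=1e-8):
    """Estimate BMA weights and a shared Gaussian variance by EM.

    forecasts: array of shape (n_times, n_models), one column per model.
    obs:       array of shape (n_times,), observed runoff.
    Assumption: every model shares one error variance (a simplification).
    """
    forecasts = np.asarray(forecasts, dtype=float)
    obs = np.asarray(obs, dtype=float)
    n, k = forecasts.shape
    weights = np.full(k, 1.0 / k)              # start from equal weights
    sq_err = (obs[:, None] - forecasts) ** 2   # squared error per model
    sigma2 = max(sq_err.mean(), 1e-12)         # initial shared variance
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of each model for each observation
        dens = np.exp(-0.5 * sq_err / sigma2) / np.sqrt(2 * np.pi * sigma2)
        mix = dens * weights
        total = np.maximum(mix.sum(axis=1, keepdims=True), 1e-300)
        z = mix / total
        # M-step: update the weights and the shared variance
        weights = z.mean(axis=0)
        sigma2 = max((z * sq_err).sum() / n, 1e-12)
        # Stop when the mixture log-likelihood no longer changes
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, sigma2

# Synthetic illustration (hypothetical data, not from the study):
rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 1.0, size=500)
forecasts = np.column_stack([
    obs + rng.normal(0.0, 0.1, size=500),   # model with small errors
    obs + rng.normal(0.0, 1.0, size=500),   # model with large errors
])
weights, sigma2 = bma_em(forecasts, obs)
bma_mean = forecasts @ weights              # weighted ensemble forecast
```

In this sketch the weights sum to one, and the model whose forecasts sit closer to the observations receives the larger weight. The ensemble mean is the weighted sum of the individual forecasts.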
In view of this, this study takes the Zuli River Basin, where extreme hydrological events occur frequently, as the study area and constructs a new triple-coupled SWAT-LSTM-BMA model on the basis of digital elevation, land use, soil type, and meteorological data. By comparing its simulation accuracy with that of the classic SWAT model and the SWAT-LSTM model, the contribution of the BMA method to the accuracy of the coupled model is quantified. Runoff for 2025–2030 is then predicted using future meteorological data, with the aim of providing technical support and a theoretical basis for analyzing the evolution of extreme events and for flash flood warning in data-scarce areas.
4. Discussion
In runoff prediction, different models produce different results because of their inherent algorithmic and structural differences, which in turn affects the robustness of the predictions [32]. The BMA method used in this study draws on the respective strengths of the SWAT model and the SWAT-LSTM model under different hydrological conditions through a dynamic weight allocation mechanism. It tends to rely on the physical robustness of the SWAT model during the dry season and shifts toward the correction ability of the SWAT-LSTM model during the flood season. BMA therefore not only improves prediction accuracy but also offers value beyond a single series-coupled framework: by enabling scenario-adaptive model selection, it systematically quantifies the structural uncertainty in hybrid modeling.
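The idea of condition-dependent weights can be illustrated with a short sketch that estimates one weight vector per hydrological regime. This is only an illustration: it uses normalized inverse mean squared error as a simple stand-in for the BMA posterior weights, and the boolean flood-season mask `is_flood`, the function names, and the toy numbers are all hypothetical. How this study actually separates flood and dry seasons is not stated here.

```python
import numpy as np

def regime_weights(forecasts, obs, is_flood):
    """Return one weight vector per regime ("flood" and "dry").

    Stand-in weighting: normalized inverse mean squared error,
    not the posterior weights of a full BMA fit.
    Assumes both regimes contain at least one time step.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    obs = np.asarray(obs, dtype=float)
    is_flood = np.asarray(is_flood, dtype=bool)
    out = {}
    for name, mask in (("flood", is_flood), ("dry", ~is_flood)):
        mse = ((forecasts[mask] - obs[mask, None]) ** 2).mean(axis=0)
        inv = 1.0 / np.maximum(mse, 1e-12)   # guard against division by zero
        out[name] = inv / inv.sum()
    return out

def combine(forecasts, is_flood, weights):
    """Blend the model forecasts using the weights of the active regime."""
    forecasts = np.asarray(forecasts, dtype=float)
    is_flood = np.asarray(is_flood, dtype=bool)
    w = np.where(is_flood[:, None], weights["flood"], weights["dry"])
    return (forecasts * w).sum(axis=1)

# Toy example: column 0 is better in the dry season, column 1 in the flood season
obs = np.array([1.0, 1.0, 5.0, 5.0])
is_flood = np.array([False, False, True, True])
forecasts = np.column_stack([
    obs + np.array([0.1, -0.1, 2.0, -2.0]),
    obs + np.array([1.0, -1.0, 0.2, -0.2]),
])
weights = regime_weights(forecasts, obs, is_flood)
blended = combine(forecasts, is_flood, weights)
```

In the toy example the first model dominates the dry-season weights and the second dominates the flood-season weights, which mirrors the seasonal switching described above.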
Compared with single models, coupled models integrate multiple models and account for multiple variables, so they can represent changes in watershed runoff patterns more accurately and improve the accuracy and reliability of runoff predictions. This aligns with the findings of Wang et al. [33]. The coupled model constructed in this study correlates well with the measured runoff data, which has implications for improving the accuracy of runoff predictions. Phetanan et al. [34] reached similar conclusions when predicting flow in the Mekong River Basin by coupling the SWAT and LSTM models. The main reason is that the LSTM model can capture nonlinear processes that are difficult to parameterize in SWAT, thereby enhancing the ability to model complex relationships. At the same time, the BMA method can reduce the bias of a single model by integrating the predictions of multiple models, bringing the predicted values closer to the measured values. This study therefore coupled the SWAT and SWAT-LSTM models using the BMA method for the first time and proposed a new SWAT-LSTM-BMA coupling framework. Unlike research that only integrates similar models, this framework uses posterior weights to quantify the structural uncertainty between “pure physical simulation” and “physically guided data-driven correction”. The coupled model shows excellent overall runoff prediction ability, with R² of 0.846, NSE of 0.822, and MSE of 0.06 in the calibration period, and R² of 0.829, NSE of 0.811, and MSE of 0.08 in the validation period. Huang et al. [35] integrated ANN, RF, and SVM algorithms for runoff prediction through the BMA method. The NSE of their integrated model was 0.8–0.89 in the calibration period and 0.7–0.84 in the validation period, and the BMA-based coupled model performed best. The primary explanation is that the BMA method can reduce the uncertainty of a single model, optimize the parameters of each model, and enhance the model’s predictive capacity. This indicates that the SWAT-LSTM-BMA coupled model can effectively improve the accuracy and robustness of runoff predictions, making it suitable for precise runoff modeling.
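For readers who want to reproduce this kind of evaluation, the three metrics can be computed as in the sketch below. It assumes that R² denotes the squared Pearson correlation between observed and simulated runoff, a common convention in hydrology, although the definition used in this study is not stated here. The function names and the sample values are illustrative only.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - ((obs - sim) ** 2).sum() / ((obs - obs.mean()) ** 2).sum()

def r_squared(obs, sim):
    """Squared Pearson correlation (assumed definition of R^2)."""
    r = np.corrcoef(np.asarray(obs, dtype=float), np.asarray(sim, dtype=float))[0, 1]
    return r ** 2

def mse(obs, sim):
    """Mean squared error, in squared units of the runoff series."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return ((obs - sim) ** 2).mean()

# Illustrative values only (not data from the study)
observed = np.array([1.0, 2.0, 3.0, 4.0])
simulated = np.array([1.5, 2.5, 3.5, 4.5])
scores = {
    "NSE": nse(observed, simulated),
    "R2": r_squared(observed, simulated),
    "MSE": mse(observed, simulated),
}
```

Note that a constant bias lowers NSE and raises MSE but leaves the correlation-based R² unchanged, which is one reason the three metrics are usually reported together.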
To address the uncertainties and nonlinearities in runoff prediction, selecting high-precision, high-resolution future meteorological data is particularly important [36]. CMIP6 data are widely used because of their high computational efficiency and broad applicability [37], and they provide comprehensive scenario analysis and reliable climate change information for future runoff predictions [38]. Zhou et al. [39] used CMIP6 data with the VIC model to forecast future changes in China’s runoff. The results indicate that CMIP6 data play a crucial role in future runoff predictions. Yang et al. [40] evaluated the performance of twenty CMIP6 models in simulating temperature and precipitation in China against historical data. The results indicate that future temperature and precipitation will increase in all ensembles, with a greater increase under the SSP585 scenario. This is consistent with the runoff forecasts of the present study, primarily because the SSP585 scenario assumes higher greenhouse gas emissions, leading to stronger global warming, which may increase precipitation. Shakeri et al. [41] used the SDSM model and FDSM to process data from three RCP scenarios of CMIP5, selecting two predictors for temperature and two for precipitation. The results indicate that the SDSM model can effectively handle large-scale climate model data, but that using fewer predictors leads to higher uncertainty in the results. Therefore, this study uses the SDSM model to process CMIP6 data, selecting 8–10 predictors for temperature and 5–7 predictors for precipitation, and achieves excellent overall accuracy.
This study used the SWAT-LSTM-BMA coupled model to predict runoff and achieved good prediction results. It should be noted that the future predictions rely on a single climate model, which is intended to verify the effectiveness of the coupled model and does not cover all the uncertainties of future climate projections. Nevertheless, the SWAT-LSTM-BMA coupled model provides a direct methodological basis for further synthesizing the uncertainties under multiple GCMs and multiple emission scenarios. As a next step, the latest land use data, meteorological data from more stations, and multi-source climate model data could be incorporated into the SWAT-LSTM-BMA coupled model to construct a more accurate multi-variable coupled model and improve the reliability of future hydrological predictions.