Article

Thermal-Process-Informed Input-Variable Selection for Multi-Site Short-Term River Water-Temperature Forecasting in the Upper and Middle Reaches of the Yangtze River

1 Hubei Field Observation and Scientific Research Stations for Water Ecosystem in Three Gorges Reservoir, Hubei University of Technology, Wuhan 430068, China
2 Key Laboratory of Intelligent Health Perception and Ecological Restoration of River and Lake, Ministry of Education, Hubei University of Technology, Wuhan 430068, China
3 Upper-Middle Yangtze Reservoirs Operation Impacts on River Ecosystem Observation and Research Station, Ministry of Water Resources, Hubei University of Technology, Wuhan 430068, China
4 State Key Laboratory of Water Resources Engineering and Management, Wuhan University, Wuhan 430072, China
* Author to whom correspondence should be addressed.
Water 2026, 18(13), 1574; https://doi.org/10.3390/w18131574 (registering DOI)
Submission received: 14 May 2026 / Revised: 5 June 2026 / Accepted: 25 June 2026 / Published: 26 June 2026

Abstract

River water temperature connects hydrodynamic processes, air–water heat exchange, and aquatic ecological responses. Although data-driven models are increasingly used for short-term water-temperature forecasting, input-variable choice still influences both predictive skill and the interpretation of model errors. This study examined daily water-temperature forecasting at nine hydrological stations in the upper and middle reaches of the Yangtze River. The stations were grouped according to natural hydro-meteorological background, reservoir regulation, and compound disturbance. Based on surface-water heat balance and order-of-magnitude analysis, antecedent water temperature, air temperature, and discharge were selected as process-related candidate inputs and tested using LSTM and xLSTM models. The experiments considered input-window length, learning rate, batch size, and the inclusion of discharge. Under the no-discharge scheme, learning rate had the clearest effect on the predicted water-temperature series. For LSTM, the median predicted-temperature differences caused by changes in window length, learning rate, and batch size were 0.055, 0.077, and 0.056 °C, respectively; the corresponding values for xLSTM were 0.089, 0.102, and 0.073 °C. One-day-ahead forecasts for the selected representative dates produced mean RMSE values of 0.160 °C for LSTM and 0.165 °C for xLSTM, compared with 0.183 °C for a persistence baseline. The reservoir regulation impact group showed the lowest errors, whereas the compound disturbance impact group had higher errors and clear within-group differences. The contribution of discharge varied among stations and models: for LSTM, RMSE decreased at Batang, Panzhihua, and Huanglingmiao, but increased or changed little at Gangtuo, Yichang, and Cuntan; for xLSTM, the average RMSE did not decrease after discharge was added at the seven stations with discharge data. xLSTM showed local advantages at Huanglingmiao and Cuntan. 
These findings show that process-informed input selection offers a consistent basis for comparing multi-site water-temperature forecasts and for interpreting error differences among stations and input schemes.

1. Introduction

River water temperature directly affects dissolved oxygen, nutrient cycling, aquatic metabolism, and fish reproduction, making it a basic control variable in river ecosystems [1,2]. It is shaped by water–air heat exchange, channel morphology, inflow conditions, and local environmental factors; the coupling between air temperature and water temperature is therefore widely used in water-temperature prediction and thermal-regime analysis [3]. Reservoir construction, urban expansion, and basin development have further modified downstream thermal regimes through storage and release, downstream mixing, and the cumulative effects of cascade reservoirs [4,5]. Recent studies in the Yangtze River Basin have also used air temperature, discharge, and semi-physical models to predict water temperature in data-scarce areas [6]. In the upper and middle reaches of the Yangtze River, strong topographic gradients coexist with dense reservoirs and urban reaches, and the water-temperature process can vary substantially from station to station. Reliable short-term forecasting is therefore useful for river thermal-environment management and aquatic ecological protection.
Current water-temperature forecasting methods include process-based models, empirical statistical models, semi-empirical models, and data-driven models. Hydrodynamic water-temperature models explicitly describe heat exchange, advective heat transport, and mixing, but they usually require complete boundary conditions, bathymetric data, and parameter calibration [7]. Empirical statistical models estimate water temperature mainly from the air-temperature–water-temperature relationship. They are simple and require few data points, but have limited capacity to represent hydrodynamic processes and human disturbances in complex reaches [8,9]. Semi-empirical models such as Air2stream use air temperature and discharge as core variables and represent both air-temperature response and discharge regulation [10,11]. Among data-driven approaches, artificial neural networks, random forests, and LSTM-type structures have been applied to daily water-temperature forecasting and multi-site simulation, often with air temperature, discharge, and time-related variables as inputs [12,13,14,15,16,17]. C-vine copula and stochastic methods have also been used to describe variable relationships and daily water-temperature changes [18,19]. More recently, open meteorological data, reservoir-management applications, and deep learning frameworks have expanded the use of water-temperature forecasting [20,21,22,23,24,25], while GPR, NARX, stochastic models, hybrid machine learning models, and differentiable process models have provided additional options [26,27,28,29,30,31,32].
Recent studies have also introduced attention-based and physics-informed frameworks into hydrological and water-temperature prediction. Transformer-type models can represent longer temporal dependence in environmental time series, while physics-informed deep learning can incorporate governing equations or domain constraints into water-temperature simulation [33,34]. These approaches provide useful directions for future development. In this study, LSTM and xLSTM were used as representative recurrent structures under a common input set and parameter-search range, so that the effects of input-variable design, station background, and model structure could be compared under controlled conditions.
Despite these advances, data-driven forecasts are easier to interpret when the input variables have clear process meaning and reliable data quality. Inputs chosen mainly from correlations or empirical combinations can provide error metrics, but they do not always clarify whether the errors are associated with meteorological background, inflow processes, engineering regulation, or local disturbance. In a multi-site setting, meteorological, hydrological, and engineering data also differ in continuity and spatial representativeness. Input-variable expansion therefore needs to consider prediction accuracy, data consistency, and process interpretation together.
LSTM is commonly used for daily river water-temperature forecasting because it can capture memory and lag effects in time series [15,35]. Improved LSTM structures have also been applied to rapid simulation of vertical water-temperature distributions in reservoirs [36]. xLSTM extends the LSTM architecture by modifying gate and state-update mechanisms, which can improve sequence representation in some time-series tasks [37]. For daily river water-temperature forecasting, however, it remains necessary to test whether this additional structural complexity gives stable advantages across stations, input schemes, and parameter combinations. Comparing model structure together with input variables, station background, and hyperparameter response helps clarify whether errors arise mainly from model capacity or from missing process information.
This study conducted multi-site short-term river water-temperature forecasting at nine hydrological stations in the upper and middle reaches of the Yangtze River. Candidate inputs were selected from the perspective of river thermal processes. Discharge was added to antecedent water temperature and air temperature as a process-related variable, and its role was evaluated using ablation experiments. Forecasting errors were compared among stations, seasons, and reach backgrounds, including natural hydro-meteorology, reservoir regulation, and compound disturbance. LSTM and xLSTM were then compared under the same inputs and parameter combinations to evaluate how model structure affected this task. The analysis focuses on error differences among stations, input schemes, and model structures under input conditions that remain process-based and comparable across sites.

2. Materials and Methods

2.1. Study Area and Station Grouping

Nine hydrological stations in the upper and middle reaches of the Yangtze River were selected: Gangtuo, Batang, Panzhihua, Xiangjiaba, Huanglingmiao, Nanjinguan, Yichang, Cuntan, and Badong. The study area and station locations are shown in Figure 1, and the station grouping rationale is summarized in Table 1. Station grouping was used to organize the main water-temperature process characteristics under different reach backgrounds and to support the subsequent interpretation of model errors. The results and discussion therefore relate water-temperature patterns and model accuracy to natural hydro-meteorology, reservoir regulation, urban activity, and local disturbance.
According to reach background, engineering regulation, and local disturbance, the nine stations were divided into three groups: the natural hydro-meteorology group, the reservoir regulation impact group, and the compound disturbance impact group. The natural hydro-meteorology group includes Gangtuo, Batang, and Panzhihua, and mainly reflects the effects of climate, elevation, inflow processes, and natural water–air heat exchange on forecasting errors. The reservoir regulation impact group includes Xiangjiaba, Huanglingmiao, Nanjinguan, and Yichang, and is used to examine how reservoir storage and release, released-water temperature, and downstream mixing reshape short-term water-temperature continuity. The compound disturbance impact group includes Cuntan and Badong. Both stations are influenced by complex human activities and reservoir-zone hydrodynamics, but their local processes differ. Cuntan is affected by backwater near the reservoir tail, the Jialing River inflow, urban shoreline activity, and local hydrodynamics. Badong lies within the influence range of the Three Gorges Reservoir; nearby urban activity and shoreline development may affect local water temperature, while daily variations are also regulated by reservoir storage, water-body thermal inertia, and slow-flow conditions. This grouping was used to distinguish dominant processes and compound backgrounds before comparing errors within and among groups.

2.2. Data Sources, Standardization and Experimental Setting

Daily data from 1 January 2013 to 31 December 2015 were used as the basic sample set, including daily mean water temperature, daily mean air temperature, and daily mean discharge. Because continuous discharge data suitable for modeling were available at seven of the nine stations, two input schemes were designed. The no-discharge scheme covered all nine stations and used air temperature and antecedent water temperature as inputs. The discharge-input scheme covered the seven stations with available discharge data and added discharge to the same two variables. This design allowed the role of discharge to be compared within a consistent experimental structure under different data conditions.
Discharge was introduced as the observed daily mean value and was standardized together with the temperature variables. The main experiments did not add logarithmic transformation or explicit lagged-discharge terms, so the discharge scheme tested the direct information carried by mainstem discharge under the same preprocessing framework as the temperature variables.
Because the variables differ in dimension and numerical range, both input variables and target variables were standardized before model training. The standardization was defined as
$$z = \frac{x - \mu}{\sigma}$$
where x is the original variable, z is the standardized variable, and μ and σ are the mean and standard deviation of the corresponding variable in the training samples. This treatment is consistent with the StandardScaler used in the model scripts [38].
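The standardization step can be sketched as follows. This is a minimal illustration, not the study's script: the function names and the toy series are hypothetical, and the 9:1 chronological split follows the ratio given in Section 2.4. The population standard deviation is used because that is what scikit-learn's StandardScaler computes.

```python
from statistics import fmean, pstdev


def fit_standardizer(train_values):
    """Estimate (mean, std) from the training samples only."""
    mu = fmean(train_values)
    sigma = pstdev(train_values)  # population std (ddof = 0), as in StandardScaler
    if sigma == 0.0:
        raise ValueError("a constant series cannot be standardized")
    return mu, sigma


def standardize(values, mu, sigma):
    """Apply z = (x - mu) / sigma."""
    return [(x - mu) / sigma for x in values]


def destandardize(z_values, mu, sigma):
    """Map standardized values back to the original unit."""
    return [z * sigma + mu for z in z_values]


# Hypothetical daily water temperatures (deg C) in chronological order.
water_temp = [10.0, 10.4, 10.9, 11.5, 12.0, 12.2, 12.9, 13.5, 14.1, 14.8]
split = len(water_temp) * 9 // 10          # chronological 9:1 split
train, valid = water_temp[:split], water_temp[split:]

mu, sigma = fit_standardizer(train)        # statistics come from training data only
z_train = standardize(train, mu, sigma)
z_valid = standardize(valid, mu, sigma)    # validation reuses the training statistics
```

Fitting the statistics on the training portion alone keeps information from the validation period out of the inputs.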
The continuous time series were converted into supervised-learning samples using a sliding-window approach. When the input-window length is N days, the input for sample t is expressed as
$$X_t = \{v_{t-N+1}, v_{t-N+2}, \ldots, v_t\}$$
and the prediction target is
$$y_t = T_{w,t+1}$$
where $v_t$ denotes the input-variable vector and $T_w$ is water temperature. In the no-discharge scheme, $v_t$ consists of air temperature and water temperature; in the discharge-input scheme, it consists of air temperature, water temperature, and discharge. Input-window lengths of 10, 20, and 30 days were tested. Thus, different window lengths represent different amounts of antecedent thermal information and allow the sensitivity of short-term water-temperature forecasting to historical information length to be evaluated.
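The sliding-window construction can be illustrated with a short sketch. The function name `make_windows` and the toy series are hypothetical. Each sample pairs the last N daily input vectors with the water temperature of the following day, as in the two definitions above.

```python
def make_windows(features, water_temp, window):
    """Return (X_t, y_t) pairs: X_t holds rows t-N+1..t, y_t is water temperature at t+1."""
    if len(features) != len(water_temp):
        raise ValueError("features and water_temp must have the same length")
    samples = []
    # t is the index of the last day inside the window; t + 1 must still exist.
    for t in range(window - 1, len(features) - 1):
        x_t = features[t - window + 1 : t + 1]
        y_t = water_temp[t + 1]
        samples.append((x_t, y_t))
    return samples


# Hypothetical toy series; each row is [air temperature, water temperature].
air = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
tw = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5]
features = [[a, w] for a, w in zip(air, tw)]

samples = make_windows(features, tw, window=3)
print(len(samples))  # number of samples = len(series) - window
```

With a series of length L and a window of N days, this yields L - N samples, so longer windows leave slightly fewer training samples from the same record.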

2.3. Thermal-Process-Informed Input-Variable Selection

At the daily scale, changes in surface river water temperature can be treated as a control-volume heat-balance problem [1,3]. If the surface mixed layer is regarded as a control volume with thickness h and heat fluxes are expressed per unit water-surface area, the heat balance can be written as
$$\rho c_p h \frac{dT_w}{dt} = Q_{sw} + Q_{lw,in} - Q_{lw,out} - Q_e - Q_h + \rho c_p \frac{Q}{A}\left(T_{in} - T_w\right) + q_b$$
where $\rho$ is water density, $c_p$ is specific heat capacity, $h$ is the thickness of the surface mixed layer, $T_w$ is water temperature, $Q_{sw}$ is net shortwave radiation, $Q_{lw,in}$ and $Q_{lw,out}$ are incoming and outgoing longwave radiation, $Q_e$ and $Q_h$ are latent and sensible heat fluxes, $Q$ is discharge, $A$ is cross-sectional area, $T_{in}$ is inflow temperature, and $q_b$ denotes the riverbed heat-exchange term.
To clarify the basis for variable selection, the major terms were further assessed by order of magnitude. The heat-storage term of the surface layer can be expressed as
$$S = \rho c_p h \frac{\Delta T_w}{\Delta t}$$
For daily river water-temperature problems, if the surface mixed-layer thickness is on the order of meters and daily water-temperature variation is on the order of $10^{-1}$–$10^{0}$ °C, the heat-storage term can generally reach $10^{1}$–$10^{2}$ W m$^{-2}$. Net water–air heat exchange, which includes shortwave radiation, longwave radiation, sensible heat, and latent heat, is also commonly on the order of $10^{1}$–$10^{2}$ W m$^{-2}$ at the daily scale. Advective and mixing effects can become comparable when discharge is high, inflow-temperature differences are evident, or channel mixing is enhanced. By comparison, riverbed heat exchange usually acts as a secondary correction term in daily surface-water-temperature forecasting.
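The order of magnitude of the heat-storage term can be checked with a small calculation. The sketch below evaluates the heat-storage expression for two illustrative cases. The constants are standard approximate values for fresh water, and the chosen thicknesses and temperature changes are assumptions for illustration rather than values from the study.

```python
RHO = 1000.0              # water density, kg m^-3 (approximate)
CP = 4186.0               # specific heat capacity of water, J kg^-1 K^-1 (approximate)
SECONDS_PER_DAY = 86400.0


def heat_storage_flux(h_m, delta_tw_c, delta_t_s=SECONDS_PER_DAY):
    """Heat-storage term S = rho * c_p * h * dTw / dt, in W m^-2."""
    return RHO * CP * h_m * delta_tw_c / delta_t_s


# Assumed cases: a 1 m layer changing by 0.5 deg C per day,
# and a 2 m layer changing by 1.0 deg C per day.
low = heat_storage_flux(h_m=1.0, delta_tw_c=0.5)
high = heat_storage_flux(h_m=2.0, delta_tw_c=1.0)
print(round(low, 1), round(high, 1))  # about 24.2 and 96.9 W m^-2
```

Both cases fall between 10 and 100 W m$^{-2}$, in line with the range stated above for meter-scale layers and daily changes of a fraction of a degree to about one degree.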
This heat-budget relationship shows that daily river water temperature reflects the combined effects of antecedent thermal state, water–air heat exchange, and inflow transport. For multi-site short-term forecasting, the input variables therefore need to represent the main thermal processes while remaining consistently available across stations. Antecedent water temperature was used to describe the existing thermal state and short-term memory of the river reach. Air temperature was used as a practical indicator of atmospheric thermal forcing, and discharge was used to represent the scale of advective transport and channel mixing. Variables such as radiation, wind speed, humidity, cloud cover, released-water temperature, and tributary inflow can also affect river water temperature. In the present multi-site comparison, these variables were not used as unified model inputs because their continuity and spatial matching differed among stations; they are considered in the interpretation of errors and in the discussion of future improvements.

2.4. Model Structure and Parameter Combinations

LSTM was used as the baseline model [35], and xLSTM was introduced for comparison [37]. Previous daily river water-temperature studies have shown that LSTM can use historical sequence information for short-term prediction [15], and reservoir studies have used improved LSTM structures to simulate vertical water-temperature distributions rapidly [36]. In the present implementation, input variables were first projected to 128 dimensions through a linear layer and then passed into a single-layer LSTM. The hidden state at the last time step was processed by PReLU activation, Dropout (0.2), and a fully connected layer to output next-day water temperature.
In this study, xLSTM denotes the extended recurrent implementation used in the model scripts. It was constructed as a single recurrent layer with the same 128-dimensional input projection and hidden/state dimension as the LSTM baseline. The implementation used an extended recurrent cell without stacked sLSTM/mLSTM blocks. Multi-head matrix-memory blocks were not included; therefore, the number of heads was not applicable. The cell used a tanh candidate state, a sigmoid output gate, exponential input and forget gates, a normalizing accumulator, a max-based stabilization term, and LayerNorm before the hidden-state output. The final hidden state was then passed through PReLU activation, Dropout (0.2), and a fully connected layer. This setting kept the hidden size, layer number, dropout, optimizer, and parameter-search range consistent between LSTM and xLSTM, while allowing the gate and state-update mechanism to differ. The complete experimental design and parameter ranges are summarized in Table 2.
The corresponding implementation can be summarized by the following pseudocode:
Input: sequence X = {v1, v2, …, vN}; hidden/state dimension = 128; recurrent layer = 1
for k = 1 to N:
    xk = Linear128(vk)
    compute tanh candidate state, stabilized exponential input/forget gates, and sigmoid output gate
    update the normalizing accumulator and recurrent state
    hk = output_gate × tanh(LayerNorm(normalized state))
end for
Output: T̂_w(t + 1) = FC(Dropout(PReLU(hN)))
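One way to realize the gate and state update in the pseudocode is sketched below in plain Python. It is not the study's script. It assumes an sLSTM-style formulation in which the max-based stabilizer is tracked in the log domain, it takes gate pre-activations as given instead of computing them from learned weights, and it applies LayerNorm without learnable parameters. The function names and the initial stabilizer value of negative infinity are illustrative choices.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def layer_norm(values, eps=1e-5):
    """LayerNorm over one vector, without learnable scale or shift."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def xlstm_step(z_pre, i_pre, f_pre, o_pre, c_prev, n_prev, m_prev):
    """One elementwise recurrent update from gate pre-activations."""
    c_new, n_new, m_new, normalized = [], [], [], []
    for z, i, f, c, n, m in zip(z_pre, i_pre, f_pre, c_prev, n_prev, m_prev):
        m_t = max(f + m, i)             # max-based stabilizer in the log domain
        i_gate = math.exp(i - m_t)      # stabilized exponential input gate (<= 1)
        f_gate = math.exp(f + m - m_t)  # stabilized exponential forget gate (<= 1)
        c_t = f_gate * c + i_gate * math.tanh(z)  # recurrent state
        n_t = f_gate * n + i_gate                 # normalizing accumulator
        c_new.append(c_t)
        n_new.append(n_t)
        m_new.append(m_t)
        normalized.append(c_t / n_t)
    h = [sigmoid(o) * math.tanh(v) for o, v in zip(o_pre, layer_norm(normalized))]
    return h, c_new, n_new, m_new


# Hypothetical pre-activations (z, i, f, o) for two time steps and three hidden units.
dim = 3
c, n, m = [0.0] * dim, [0.0] * dim, [float("-inf")] * dim
steps = [
    ([0.5, -0.2, 0.1], [800.0, 0.3, -1.0], [0.2, 0.1, 0.0], [0.0, 1.0, -1.0]),
    ([0.1, 0.4, -0.3], [1.0, 0.2, 0.5], [-0.5, 0.3, 0.1], [0.5, 0.5, 0.5]),
]
for z_pre, i_pre, f_pre, o_pre in steps:
    h, c, n, m = xlstm_step(z_pre, i_pre, f_pre, o_pre, c, n, m)
```

The first unit receives an input-gate pre-activation of 800, for which a direct `exp` would overflow. Subtracting the running maximum keeps every exponent at or below zero while leaving the ratio of state to accumulator unchanged.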
All models used the Adam optimizer [39] and MSELoss. The training and validation sets were split chronologically at a ratio of 9:1, and window samples were randomly batched during training. Model training and tensor operations were implemented in Python (Version 3.8.8) using PyTorch (Version 2.3.0+cpu) [40]. Data preprocessing and performance metrics were implemented using scikit-learn (Version 0.24.1), NumPy (Version 1.23.5), and pandas (Version 2.0.3), and figures were prepared using Matplotlib (Version 3.3.4). The validation set was used for model saving, hyperparameter sensitivity analysis, and supplementary $R^2$ evaluation.

2.5. Evaluation Metrics

Model performance was evaluated using RMSE, MAE, and $R^2$, which are commonly used in water-temperature forecasting and watershed model evaluation [41,42]. RMSE was used to compare seasons, station groups, discharge input, and model structure. MAE was used as an auxiliary measure of mean absolute deviation, and $R^2$ was used to evaluate overall validation-set fitting. Because each one-day-ahead forecast for a representative date is a single-point prediction, $R^2$ was not calculated for these single-point forecasts. The maximum predicted-temperature difference reported in Section 3.1 was used only to describe the amplitude of prediction changes caused by parameter perturbation and was not treated as an accuracy metric. The representative-date forecasts were generated using trained models and selected seasonal initiation dates within the study period, and are therefore reported separately from the validation-set metrics. The metrics are defined as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where $y_i$ is observed water temperature, $\hat{y}_i$ is predicted water temperature, $\bar{y}$ is the mean observed water temperature, and $n$ is the number of samples. Lower RMSE and MAE indicate smaller prediction errors, and $R^2$ values closer to 1 indicate stronger explanatory ability for observed water-temperature variations.
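The three metrics translate directly into code. The sketch below uses hypothetical function names and toy values. It also shows why $R^2$ is undefined for a single-point forecast: the denominator, the variance of the observations around their mean, is zero.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((p - y) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - y) for y, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination; undefined when the observations do not vary."""
    mean_obs = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_obs) ** 2 for y in y_true)
    if ss_tot == 0.0:
        raise ValueError("R2 is undefined for constant or single-point observations")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted water temperatures (deg C).
obs = [12.0, 12.4, 12.9, 13.1]
pred = [12.1, 12.3, 13.0, 13.3]

print(round(rmse(obs, pred), 4))  # 0.1323
print(round(mae(obs, pred), 4))   # 0.125
print(round(r2(obs, pred), 4))    # 0.9054
```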
As supplementary checks, a persistence forecast, paired Wilcoxon signed-rank tests, and discharge-lag correlations were used. The persistence forecast used the previous-day observed water temperature as the next-day prediction and was calculated for the same stations and representative dates as the no-discharge model comparison. Paired Wilcoxon tests were applied to the representative-date RMSE values, with pairs matched by station, representative date, input-window length, learning rate, and batch size. To examine whether discharge contained a delayed thermal signal, Spearman correlations between water temperature at day t and discharge at lags of 0, 1, 3, and 7 days were calculated for the seven stations with discharge data at both annual and seasonal scales.
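The persistence forecast described above can be sketched in a few lines. The series is a hypothetical example and the function name is illustrative. Each day's forecast is simply the previous day's observation.

```python
import math


def persistence_forecast(water_temp):
    """Forecasts for days 1..n-1: each equals the observation of the day before."""
    return water_temp[:-1]


# Hypothetical observed daily water temperatures (deg C).
obs = [12.0, 12.2, 12.1, 12.5, 12.6]

pred = persistence_forecast(obs)   # forecasts for obs[1:]
targets = obs[1:]
errors = [p - y for p, y in zip(pred, targets)]
persistence_rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(round(persistence_rmse, 4))  # 0.2345
```

Because the baseline needs no training, any model error above this value indicates that the model adds nothing beyond day-to-day thermal continuity.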

3. Results

3.1. Hyperparameter Sensitivity Analysis

Validation results from the no-discharge scheme at all nine stations were used to analyze how input-window length, learning rate, and batch size affected the predicted water-temperature series. A one-factor-at-a-time design was adopted. The baseline combination used a 20-day input window, a learning rate of $5 \times 10^{-4}$, and a batch size of 32. The tested levels were 10, 20, and 30 days for window length; $1 \times 10^{-4}$, $5 \times 10^{-4}$, and $1 \times 10^{-3}$ for learning rate; and 16, 32, and 64 for batch size. For each station, forecast-initiation period, and validation date, the maximum difference among predicted temperatures under the three levels of a parameter was calculated to quantify the response of the predicted series to that parameter.
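The sensitivity measure can be written compactly: for one parameter, the maximum difference among the three predictions on a given date is the range (largest minus smallest). The sketch below uses hypothetical predictions and an illustrative function name, and then takes the median across dates, as in the summary statistics reported for each model.

```python
from statistics import median


def max_prediction_spread(preds_by_level):
    """Per-date maximum difference among predictions from the tested parameter levels."""
    series = list(preds_by_level.values())
    return [max(day) - min(day) for day in zip(*series)]


# Hypothetical predictions (deg C) on four validation dates for three learning rates,
# with window length and batch size held at their baseline values.
preds = {
    1e-4: [15.10, 15.32, 15.55, 15.80],
    5e-4: [15.12, 15.30, 15.60, 15.78],
    1e-3: [15.05, 15.35, 15.58, 15.90],
}

spread = max_prediction_spread(preds)
print(round(median(spread), 3))  # 0.06
print(round(max(spread), 3))     # 0.12
```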
Figure 2 shows the distributions of predicted-temperature differences under the three hyperparameter perturbations. For LSTM, the predictions changed only slightly when input-window length and batch size were varied, with median differences of 0.055 °C and 0.056 °C, respectively. The median difference associated with learning rate was higher, at 0.077 °C. Within the tested range, LSTM predictions were therefore less sensitive to historical window length and batch size than to learning rate.
xLSTM showed a larger overall parameter response than LSTM. The median differences associated with window length, learning rate, and batch size were 0.089, 0.102, and 0.073 °C, respectively, with learning rate again having the largest effect. Under learning-rate perturbation, the 90th percentile and maximum predicted-temperature differences of xLSTM reached 0.254 °C and 1.475 °C, respectively, meaning that some stations produced clearly separated prediction curves under different learning rates. Compared with LSTM, xLSTM was more dependent on training-parameter choice, even though it has greater sequence-representation capacity.
Differences among stations were also evident. Xiangjiaba consistently showed small differences across both models and all three parameters, indicating stable validation predictions under parameter perturbation. Gangtuo responded more strongly to changes in window length and learning rate, Cuntan responded more strongly to changes in batch size, and Panzhihua showed the largest xLSTM response to learning-rate changes. Hyperparameter sensitivity was therefore related not only to model structure, but also to the variability in the local water-temperature process.
Figure 3 presents station–date cases with relatively large differences. Under LSTM, the relative positions of the prediction curves and observations were largely preserved when window length and batch size changed, and the separation among the three prediction curves was small. Learning-rate changes led to greater local separation. The learning-rate response of xLSTM was more pronounced. At Batang, with forecast initiation on 15 May 2015, the prediction curves for different learning rates remained separated over part of the warming period, indicating that learning-rate selection affected how xLSTM fitted this process.
Taken together, Figures 2 and 3 and Table 3 show that input-window length and batch size had relatively small effects on predicted water-temperature processes at most stations. Learning rate was the more sensitive hyperparameter in both models, and its effect was stronger for xLSTM than for LSTM. The same parameter-search range was therefore used in the subsequent station, season, and model-structure comparisons. For further model optimization, learning rate should be checked before input-window length or batch size is adjusted.

3.2. One-Day-Ahead Prediction Under Different Seasons and Station Groups

One-day-ahead forecasts for the selected representative dates were used to evaluate the combined influence of representative seasonal periods, station groups, and model structure. Table 4 and Table 5 list the mean, maximum, and minimum RMSE values of LSTM and xLSTM for four representative seasonal periods. Each cell was calculated from 27 hyperparameter combinations for the same station and seasonal representative date, so the values reflect both the average error level and the range caused by parameter combinations.
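The 27 combinations per cell are consistent with a full factorial grid over the three tested levels of each hyperparameter (3 × 3 × 3). The sketch below enumerates such a grid and summarizes one cell. The RMSE values are placeholders, not results from the study, and the function name is illustrative.

```python
from itertools import product
from statistics import fmean

WINDOWS = (10, 20, 30)                 # input-window lengths, days
LEARNING_RATES = (1e-4, 5e-4, 1e-3)
BATCH_SIZES = (16, 32, 64)

grid = list(product(WINDOWS, LEARNING_RATES, BATCH_SIZES))
print(len(grid))  # 27


def summarize_cell(rmse_by_combo):
    """Mean, maximum, and minimum RMSE over all combinations for one station and date."""
    values = list(rmse_by_combo.values())
    return {"mean": fmean(values), "max": max(values), "min": min(values)}


# Placeholder RMSE values (deg C) for one station and one representative date.
rmse_by_combo = {combo: 0.10 + 0.001 * k for k, combo in enumerate(grid)}
cell = summarize_cell(rmse_by_combo)
```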
A persistence baseline was calculated to examine whether the deep learning models improved on simple short-term thermal continuity. For the same nine stations and four representative dates, the mean one-day-ahead RMSE of the persistence baseline was 0.183 °C. The corresponding mean RMSE values of LSTM and xLSTM were 0.160 °C and 0.165 °C, representing reductions of 12.9% and 10.2%, respectively, relative to the baseline. The persistence-index values were 0.282 for LSTM and 0.192 for xLSTM. Thus, both recurrent models improved on the previous-day baseline, while the magnitude of improvement remained modest because daily river water temperature is itself highly continuous.
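The two skill measures can be illustrated as follows. The relative reduction is computed from the rounded mean RMSE values quoted above. The persistence index is shown on hypothetical errors and assumes the usual definition, one minus the ratio of the model's squared-error sum to that of the persistence forecast, since the exact aggregation used in the study is not stated here.

```python
def relative_reduction(rmse_model, rmse_baseline):
    """Fractional RMSE reduction relative to the baseline."""
    return (rmse_baseline - rmse_model) / rmse_baseline


def persistence_index(model_errors, persistence_errors):
    """1 - SSE(model) / SSE(persistence); positive values mean the model beats persistence."""
    sse_model = sum(e * e for e in model_errors)
    sse_persist = sum(e * e for e in persistence_errors)
    return 1.0 - sse_model / sse_persist


# Rounded mean RMSE values from the text (deg C).
lstm_reduction = relative_reduction(0.160, 0.183)
xlstm_reduction = relative_reduction(0.165, 0.183)
print(round(100 * lstm_reduction, 1), round(100 * xlstm_reduction, 1))  # 12.6 9.8

# Hypothetical errors (deg C) on two dates, for illustration only.
pi_example = persistence_index([0.1, -0.1], [0.2, -0.2])
print(round(pi_example, 2))  # 0.75
```

The rounded inputs give about 12.6% and 9.8%, slightly below the reported 12.9% and 10.2%. A likely explanation is that the reported percentages were computed from unrounded RMSE values.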
Table 4 and Table 5 show that the reservoir regulation impact group had the lowest overall errors, with mean RMSE values of 0.085 °C for LSTM and 0.087 °C for xLSTM. The natural hydro-meteorology group was intermediate, with mean values of 0.193 °C and 0.206 °C, whereas the compound disturbance impact group was higher overall, with both models averaging 0.259 °C. These differences show that reach background affects the predictability of daily water-temperature sequences. Reservoir regulation changes the natural thermal process, but storage–release and downstream mixing can also strengthen short-term continuity, making antecedent thermal state more informative for the models. When tributary inflow, reservoir-tail hydrodynamics, and urban shoreline activity interact without being fully represented by the input variables, model errors are more likely to increase.
Clear differences also occurred within groups. In the reservoir regulation impact group, Yichang, Nanjinguan, and Xiangjiaba had lower errors, whereas Huanglingmiao had relatively higher mean errors. Thus, stations in regulated reaches can still differ in forecasting difficulty because of differences in operation processes, channel mixing, and local boundary conditions. In the natural hydro-meteorology group, Panzhihua had the most pronounced summer high-temperature error, with mean RMSE values of 0.506 °C for LSTM and 0.581 °C for xLSTM, likely reflecting stronger water–air heat exchange and short-term fluctuation under dry-hot valley conditions. Gangtuo had higher errors in winter and spring, which may be related to high-elevation inflow processes, phase differences between air temperature and water temperature, and snowmelt supply. In the compound disturbance impact group, the mean error was mainly raised by Cuntan, whereas the error at Badong was much lower. This contrast indicates that compound disturbance needs to be interpreted at the station scale: Cuntan is more likely to be affected by the reservoir tail, tributary inflow, and urban-reach disturbance, while Badong may retain stronger daily water-temperature continuity because of reservoir storage, water-body thermal inertia, and slow-flow conditions. The group-wise RMSE statistics are summarized in Table 6.
Figure 4 further shows that representative seasonal period and reach background jointly affected forecasting errors. The reservoir regulation impact group had low errors in autumn and spring, consistent with stronger short-term water-temperature continuity in regulated reaches. The natural hydro-meteorology group had elevated errors during the summer high-temperature period, mainly because of Panzhihua. The compound disturbance impact group had high errors in the spring, summer, and autumn representative periods, mainly because of Cuntan. These group-level patterns were closely linked to station-level processes, including reach background, water-temperature continuity, and local disturbance.

3.3. Conditional Effect of Discharge Input

Discharge represents water transport and channel mixing, but its predictive value also depends on the thermal state of inflow and the timing of local mixing. Based on the seven stations with available discharge data, Table 7 and Figure 5 compare the mean one-day-ahead RMSE under the no-discharge and discharge-input schemes. Nanjinguan and Badong were not included in this comparison because continuous discharge data suitable for this experiment were not available.
Figure 5 shows the one-day-ahead mean RMSE values with and without discharge input; points farther to the left indicate lower RMSE. Table 7 lists the corresponding differences between the discharge-input and no-discharge schemes. The contribution of discharge varied among stations and models. For LSTM, RMSE decreased at Batang, Panzhihua, and Huanglingmiao, was nearly unchanged at Xiangjiaba, and increased at Gangtuo, Yichang, and Cuntan. For xLSTM, ΔRMSE was positive at all seven stations with discharge data, showing that the additional discharge input was associated with higher average errors in this set of comparisons.
The matched-parameter comparison led to the same pattern. Across 756 matched station–date–parameter combinations, adding discharge changed LSTM RMSE by 0.0004 °C on average (median 0.0004 °C; p = 0.534). For xLSTM, the corresponding change was 0.0216 °C (median 0.0079 °C; p = 0.002). These paired results indicate that discharge input produced little systematic change for LSTM, while the additional discharge term tended to increase xLSTM error under the tested settings.
The lag-correlation check further showed that the discharge–water-temperature relationship was not uniform. At the annual scale, the strongest absolute Spearman correlations between $T_w(t)$ and $Q(t - \mathrm{lag})$ across lags of 0, 1, 3, and 7 days ranged from 0.745 to 0.875 among the seven stations. The strongest lag was 0 days at Xiangjiaba, Gangtuo, Batang, Panzhihua, and Cuntan, but 7 days at Huanglingmiao and Yichang. At the seasonal scale, the correlations varied more widely, from −0.409 to 0.948. These results indicate that discharge can contain a thermal signal, but the sign, strength, and lag of this signal depend on station and season, which helps explain why adding discharge did not produce a uniform gain in the forecasting models.
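A lagged Spearman check of this kind can be sketched without external libraries. The functions below are illustrative, and the two series are synthetic: water temperature is constructed to follow discharge with a three-day delay, so the strongest correlation should appear at lag 3.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = avg
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lagged_spearman(tw, q, lag):
    """Spearman correlation between Tw(t) and Q(t - lag)."""
    tw_part = tw[lag:]
    q_part = q[:len(q) - lag]
    return pearson(average_ranks(tw_part), average_ranks(q_part))


# Synthetic series: discharge is a shuffled sequence, and water temperature
# follows discharge with a 3-day delay.
q = [float((7 * k) % 20) for k in range(20)]
tw = [10.0 + 0.1 * q[(t - 3) % 20] for t in range(20)]

corrs = {lag: lagged_spearman(tw, q, lag) for lag in (0, 1, 3, 7)}
best_lag = max(corrs, key=lambda lag: abs(corrs[lag]))
print(best_lag)  # 3
```

Selecting the lag by the largest absolute correlation mirrors the comparison of "strongest absolute Spearman correlations" above, and it keeps negative relationships in view.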
The discharge response also changed with seasonal stage. At some stations, discharge supplemented information on inflow processes or mixing intensity. At others, discharge changes mainly followed seasonal variation that was already reflected by the water-temperature sequence, adding little new information. Seasonal comparisons showed that the mean ΔRMSE of LSTM was about −0.008 °C in winter and −0.007 °C in summer, whereas the mean ΔRMSE was positive in spring and autumn. For xLSTM, the seasonal mean ΔRMSE was non-negative in all four seasons and was larger in spring and summer, which is consistent with the station-level pattern shown in Table 7.
From a process perspective, the predictive contribution of discharge is controlled by both transport magnitude and the temperature-difference signal. For advective heat transport, discharge represents the scale of water movement, but the effect on local water temperature also depends on the difference between inflow temperature and local water temperature. When this temperature difference is small, large discharge does not necessarily carry a clear heat signal; when discharge variation and water-temperature response are temporally misaligned, runoff seasonality may be learned as a water-temperature signal. The increased error after adding discharge at Gangtuo may be related to the lack of synchrony among high-elevation inflow, snowmelt supply, and water-temperature response. In contrast, the reduced LSTM errors at Batang, Panzhihua, and Huanglingmiao indicate that discharge changes at these stations added information on inflow processes or mixing intensity. At Cuntan, the mainstem-discharge variable captured only part of the combined effects of urban shoreline activity, tributary inflow, and local hydrodynamics, and the change in accuracy was limited.
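The dependence on the temperature difference can be made explicit with a simplified two-stream mixing balance. This balance is an illustrative assumption rather than a formulation taken from the study: it neglects surface heat exchange and assumes constant density and heat capacity. The symbols are introduced here: Q_in and T_in are the inflow discharge and temperature, Q_loc and T_loc are the local discharge and temperature, and T_mix is the fully mixed temperature.

```latex
T_{\mathrm{mix}}
  = \frac{Q_{\mathrm{in}} T_{\mathrm{in}} + Q_{\mathrm{loc}} T_{\mathrm{loc}}}
         {Q_{\mathrm{in}} + Q_{\mathrm{loc}}},
\qquad
T_{\mathrm{mix}} - T_{\mathrm{loc}}
  = \frac{Q_{\mathrm{in}}}{Q_{\mathrm{in}} + Q_{\mathrm{loc}}}
    \left( T_{\mathrm{in}} - T_{\mathrm{loc}} \right)
```

Under this balance the temperature change is the product of a discharge fraction and the temperature contrast. A large Q_in therefore produces almost no thermal signal when T_in is close to T_loc, which is why discharge without inflow-temperature information cannot fully describe advective heat transport.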

3.4. Comparison Between LSTM and xLSTM

Table 8 and Figure 6 compare the mean one-day-ahead RMSE of the two models under the no-discharge scheme. LSTM had lower mean RMSE at seven of the nine stations, whereas xLSTM was lower only at Huanglingmiao and Cuntan. The overall mean RMSE was 0.160 °C for LSTM and 0.165 °C for xLSTM.
Table 9 further shows that, in the no-discharge validation set, the RMSE, MAE, and R² values were 0.238 °C, 0.177 °C, and 0.938 for LSTM, and 0.248 °C, 0.186 °C, and 0.935 for xLSTM. The validation-set results and the representative-date one-day-ahead forecasts showed the same overall pattern: LSTM was slightly more stable, while the difference between the two models remained small.
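For reference, the evaluation metrics and a one-day-ahead persistence baseline can be sketched as follows. The R² definition used here (1 − SSE/SST) is an assumption, since the section does not state which definition was applied, and the temperature values are made up for illustration.

```python
import math


def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination, 1 - SSE/SST (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def persistence_pairs(series):
    """One-day-ahead persistence: the forecast for day t is the value on day t-1."""
    targets = series[1:]
    forecasts = series[:-1]
    return targets, forecasts


# Made-up daily water temperatures (°C), for illustration only.
tw = [12.0, 12.2, 12.5, 12.4, 12.8]
targets, forecasts = persistence_pairs(tw)
baseline_rmse = rmse(targets, forecasts)
baseline_mae = mae(targets, forecasts)
baseline_r2 = r2(targets, forecasts)
print(round(baseline_rmse, 3), round(baseline_mae, 3), round(baseline_r2, 3))
```

Scoring a model's forecasts and the persistence forecasts against the same targets gives the kind of baseline comparison reported for the representative dates.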
A paired check across the 972 matched no-discharge combinations gave a mean xLSTM-minus-LSTM RMSE difference of 0.0049 °C (median 0.0042 °C; p = 0.200). The matched comparison therefore supports the station-scale result: the overall difference between the two models was small, and the advantage of either model remained local rather than consistent across the whole set of stations and parameter combinations.
The lower RMSE of xLSTM at Huanglingmiao and Cuntan indicates local predictive gains. With the present input set, these gains show station-specific model behavior but cannot separate the individual contributions of regulation, channel mixing, reservoir-zone hydrodynamics, tributary inflow, or urban-reach disturbance. The local xLSTM advantage is therefore interpreted as being consistent with more complex reach backgrounds, and is not used as evidence that the model captured a specific physical mechanism. Even with xLSTM, the mean RMSE at Cuntan remained 0.369 °C, much higher than at most other stations, showing that local processes outside the input set still contributed to the remaining error.
Overall, LSTM performed well when water-temperature sequences were relatively smooth and antecedent thermal state strongly constrained next-day water temperature. xLSTM produced local benefits under some station backgrounds, but its advantage was not consistent across stations or parameter combinations. In reaches affected by tributary inflow, urban shoreline activity, abrupt regulation, or alpine snowmelt, additional process variables may provide more direct information than model-structure expansion alone.

4. Discussion

4.1. Role and Scope of Process-Informed Input Variables

The experimental results directly support comparisons of RMSE among stations, input schemes, seasons, and model structures. The process discussion below relates these differences to the known reach backgrounds of the selected stations. Because variables such as snowmelt contribution, tributary inflow and water temperature, reservoir operation, released-water temperature, radiation, wind speed, humidity, and urban shoreline heat inputs were not included as unified inputs, the related explanations are treated as process-consistent interpretations rather than direct attribution tests.
Antecedent water temperature, air temperature, and discharge correspond to thermal inertia, water–air heat exchange, and hydrodynamic transport in river water-temperature changes. For daily short-term forecasting, these variables form a basic input combination that is relatively easy to obtain consistently across stations and has clear process meaning. This setting makes it possible to compare stations under the same input conditions and to relate error differences to specific hydro-thermal processes.
The results show that this input scheme can support short-term forecasting at the nine stations, while the contribution of each variable differed among station backgrounds. The no-discharge scheme already produced relatively low errors at most stations, indicating that antecedent water temperature and air temperature contained substantial predictive information. Discharge reduced mean RMSE for LSTM at Batang, Panzhihua, and Huanglingmiao, while its effect was small or negative at other stations and in xLSTM. This pattern is consistent with the physical role of discharge: it represents transport magnitude, but its forecasting value also depends on inflow-temperature differences, channel mixing, and water-temperature response lag.
Error magnitudes reported in previous studies provide a useful reference for the present results. Zhang et al. obtained mean RMSE values of about 1.79 °C and 1.40 °C using linear regression and Air2stream models in data-scarce regions [6]. Feigl et al. compared multiple input combinations and machine learning models and reported an average RMSE of about 0.55 °C for the best machine learning models and a median RMSE of about 0.62 °C for the air-temperature–runoff input combination [14]. Because study areas, forecast horizons, and evaluation samples differ, these values are used here only to indicate the error scale rather than to rank model performance directly. Together with the present results, they show that antecedent water temperature and air temperature provided stable short-term information at most stations, whereas the role of discharge varied. The match between input variables and major thermal processes, together with data consistency among stations, is therefore central to interpreting model-error differences.

4.2. Influence of Reach Background on Model Accuracy

The three station groups show that reach background was closely related to model accuracy. The reservoir regulation impact group had the lowest mean RMSE, the natural hydro-meteorology group was intermediate, and the compound disturbance impact group had the highest errors overall. Reach background affected prediction through disturbance intensity, through the continuity of the daily water-temperature sequence, and through the extent to which relevant processes were represented by antecedent water temperature, air temperature, and discharge.
Here, the grouping served as an interpretive framework based on dominant reach background and disturbance type. It helped organize the comparison of model errors across stations, while quantitative indicators such as storage–runoff ratio, tributary contribution, urban land use, and operation records could further refine the classification in future work.
The reservoir regulation impact group includes Xiangjiaba, Huanglingmiao, Nanjinguan, and Yichang. Although this group is strongly affected by reservoir storage–release and downstream processes, the regulation effect at the one-day-ahead scale appeared mainly as smoothing of the water-temperature series. Release and mixing can make the sequence more continuous from day to day, so antecedent water temperature strongly constrains the next-day value and the models can learn this relationship more readily. The station-level accuracy differences within this group were therefore linked to both sequence stability and reach-specific background.
The compound disturbance impact group further illustrates the importance of within-group process differences. Cuntan and Badong are both located in reaches with complex reservoir-zone hydrodynamics and urban activity, but their errors differed clearly. Cuntan is near the reservoir tail and the Jialing River confluence, and its water temperature is more likely to be affected jointly by tributary inflow, urban shoreline activity, and local hydrodynamic changes. Badong is also affected by urban shoreline activity and reservoir-zone hydrodynamics, but its water temperature evolves under the impounded conditions of the Three Gorges Reservoir. Water-body thermal inertia and slow-flow conditions may weaken short-term local disturbances in daily sequences, allowing antecedent water temperature to remain a strong constraint on next-day water temperature. The lower error at Badong than at Cuntan is therefore consistent with the combined effect of reservoir thermal inertia and local disturbance rather than with the strength of a single factor.

4.3. Role of Discharge Across Stations and Seasons

The discharge-input results showed clear station and model dependence. Among the seven stations with discharge data, adding discharge reduced mean RMSE at Batang, Panzhihua, and Huanglingmiao for LSTM, while Gangtuo, Yichang, and Cuntan showed higher or similar errors. For xLSTM, the mean RMSE increased at all seven stations after discharge was added. These patterns indicate that discharge is useful when its variation tracks the inflow transport and mixing processes that affect water temperature, and less useful when this relationship is weak or seasonally shifted.
In seasonal comparisons, the mean ΔRMSE of LSTM was slightly negative in winter and summer and positive in spring and autumn. This suggests that discharge can add information in some high- or low-temperature periods, when inflow processes or mixing intensity are more closely linked to water-temperature variation. In transition seasons, discharge may mainly reflect shared seasonal variation or lag behind the temperature response. For xLSTM, the seasonal mean ΔRMSE was non-negative in all four seasons, matching the station-level tendency for discharge to add little stable information in this model setting.
The Gangtuo results illustrate the difference between the physical meaning of discharge and its predictive contribution. After discharge was added, the LSTM and xLSTM errors at this station increased by 0.041 °C and 0.080 °C, respectively. Under high-elevation, snowmelt-supply, or complex-inflow backgrounds, discharge variation may mainly reflect runoff processes and seasonal flow changes rather than effective heat transport. Without inflow-temperature information, Q by itself describes water-volume change but not the temperature contrast between inflow and local water. The error reductions at Batang and Panzhihua indicate that discharge can supplement inflow-process information in some natural reaches, and the improvement at Huanglingmiao indicates that discharge can also represent release processes and channel mixing in some regulated reaches. In this setting, discharge is better viewed as a process-related candidate variable whose usefulness varies with station and season. Future model improvement can prioritize variables that describe the relevant processes more directly, such as inflow temperature, tributary water temperature and inflow volume, operation discharge, released-water temperature, radiation, wind speed, and humidity.

4.4. Hyperparameter Response and Model-Structure Applicability

The hyperparameter results show that the predicted series was most sensitive to learning rate in both models. In the no-discharge validation set, the median predicted-temperature differences caused by changes in window length, learning rate, and batch size were 0.055, 0.077, and 0.056 °C for LSTM, and 0.089, 0.102, and 0.073 °C for xLSTM. Compared with LSTM, xLSTM showed larger responses to all three parameter types, especially learning rate.
Station background also influenced parameter response. Xiangjiaba had low sensitivity in both models, indicating a relatively smooth water-temperature sequence in which short-term prediction was mainly controlled by antecedent thermal inertia. Stations such as Gangtuo, Batang, Cuntan, and Panzhihua showed stronger parameter responses, suggesting that seasonal transition, inflow fluctuation, local disturbance, and heat-exchange intensity increased dependence on training conditions.
The model-comparison results were consistent with the hyperparameter-sensitivity results. LSTM had slightly lower mean errors at most stations and a more stable parameter response, making it a suitable baseline model for this study. xLSTM showed local advantages at Huanglingmiao and Cuntan, but was more sensitive to learning rate. Under the tested data scale and parameter-search range, the additional gate complexity of xLSTM mainly led to local rather than basin-wide improvements. In reaches where important local drivers are outside the input set, variables describing tributary inflow, urban shoreline activity, reservoir operation, or inflow thermal state may contribute more directly than further structural expansion.
The one-factor-at-a-time design was used to isolate the response of the predicted series to a single training parameter under directly comparable conditions. All model runs used the same deterministic seed and the same 27 parameter combinations. This design keeps the station, model, and input-scheme comparisons consistent, while repeated-seed experiments and interaction-aware sensitivity methods would provide a fuller uncertainty decomposition in extended evaluations.
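The parameter design can be sketched as follows, using the three levels listed in Table 2 for each parameter. The baseline combination and the example predictions are hypothetical, because this section does not state which baseline was used, and the helper names are introduced here for illustration.

```python
from itertools import product

# Parameter levels as listed in Table 2.
WINDOWS = (10, 20, 30)                  # input window, days
LEARNING_RATES = (0.0001, 0.0005, 0.001)
BATCH_SIZES = (16, 32, 64)

# Full set of parameter combinations: 3 x 3 x 3 = 27.
grid = list(product(WINDOWS, LEARNING_RATES, BATCH_SIZES))

# Hypothetical baseline (an assumption; the study's baseline is not given here).
BASELINE = {"window": 20, "lr": 0.0005, "batch": 32}


def one_factor_levels(name, levels, baseline=BASELINE):
    """Vary one parameter across its levels with the other two fixed at baseline."""
    return [{**baseline, name: level} for level in levels]


def max_prediction_difference(preds_by_level):
    """Largest gap among predictions made under different levels for one date."""
    return max(preds_by_level) - min(preds_by_level)


lr_runs = one_factor_levels("lr", LEARNING_RATES)
# Made-up predictions (°C) for one station and date under the three learning rates.
example_preds = [15.21, 15.28, 15.25]
diff = max_prediction_difference(example_preds)
print(len(grid), lr_runs, round(diff, 3))
```

Collecting `diff` over stations and validation dates and taking the median would give a sensitivity summary of the kind reported for window length, learning rate, and batch size.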

4.5. Data Period and Transferability

The analysis used daily records from 2013 to 2015 as a common data period for all stations, input schemes, and model structures. This design supports a controlled within-period comparison under consistent data conditions. The resulting representative-date forecasts describe predictive behavior within this common period. Broader temporal transferability should be examined with longer records, independent hydrological years, rolling-origin evaluation, or leave-one-year-out testing, especially when the target is rare hydro-climatic extremes or reservoir-operation changes outside the present data window.
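The two evaluation schemes named above can be sketched as index generators. This is a minimal illustration under stated assumptions: the function names, the 730-day initial training length, and the 30-day horizon and step are hypothetical choices, not settings from the study.

```python
def leave_one_year_out(years):
    """Yield (training_years, held_out_year) pairs, one per distinct year."""
    unique = sorted(set(years))
    for held_out in unique:
        yield [y for y in unique if y != held_out], held_out


def rolling_origin_splits(n_samples, initial_train, horizon, step):
    """Yield (train_indices, test_indices) with an expanding training window.

    The training window always ends where the test window begins, so no
    future observation is used to fit the model for a given origin.
    """
    origin = initial_train
    while origin + horizon <= n_samples:
        yield range(0, origin), range(origin, origin + horizon)
        origin += step


# Three years of daily records (2013-2015 contain no leap year: 3 x 365 days).
n_days = 3 * 365
splits = list(rolling_origin_splits(n_days, initial_train=730, horizon=30, step=30))
folds = list(leave_one_year_out([2013, 2014, 2015]))
print(len(splits), folds)
```

Rolling-origin evaluation respects temporal order, while leave-one-year-out tests transfer to an unseen hydrological year. With only three years of data, both schemes leave few independent test periods, which is consistent with the call for longer records.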

5. Conclusions

(1)
Under the no-discharge scheme, the validation-set RMSE, MAE, and R² were 0.238 °C, 0.177 °C, and 0.938 for LSTM, and 0.248 °C, 0.186 °C, and 0.935 for xLSTM. The mean representative-date one-day-ahead RMSE was 0.160 °C for LSTM and 0.165 °C for xLSTM, compared with 0.183 °C for the persistence baseline. The corresponding reductions relative to the baseline were 12.9% and 10.2%. Across 972 matched no-discharge combinations, the xLSTM-minus-LSTM RMSE difference averaged 0.0049 °C (p = 0.200), showing that the overall difference between the two models was small. LSTM was slightly more stable on average, while xLSTM showed local advantages only at Huanglingmiao and Cuntan.
(2)
Learning rate was the main training parameter affecting the predicted water-temperature series in both models. In the no-discharge validation set, the median predicted-temperature differences caused by changes in window length, learning rate, and batch size were 0.055, 0.077, and 0.056 °C for LSTM and 0.089, 0.102, and 0.073 °C for xLSTM. xLSTM showed larger parameter responses, especially to learning rate, indicating that training-parameter selection was more influential for this extended recurrent structure.
(3)
The grouping results showed that the reservoir regulation impact group had the lowest one-day-ahead errors, the natural hydro-meteorology group was intermediate, and the compound disturbance impact group was higher overall but showed clear within-group differences. The high error at Cuntan was consistent with the influence of reservoir-tail effects, tributary inflow, and urban-reach disturbance, whereas the relatively low error at Badong was consistent with reservoir impoundment, water-body thermal inertia, and slow-flow conditions that maintain daily water-temperature continuity. These process interpretations indicate plausible sources of error differences and can be refined with additional observations of tributary inflow, reservoir operation, local mixing, and urban heat inputs.
(4)
The effect of discharge input was station- and model-dependent. For LSTM, adding discharge reduced mean RMSE at Batang, Panzhihua, and Huanglingmiao, while Gangtuo, Yichang, and Cuntan showed higher or similar errors. Across matched combinations, the average LSTM change was only 0.0004 °C (p = 0.534). For xLSTM, the seven stations with discharge data all showed higher mean RMSE after discharge was added, and the paired comparison gave an average increase of 0.0216 °C (p = 0.002). The discharge-lag analysis showed that the strength and lag of the discharge–water-temperature relationship varied by station and season, with seasonal Spearman correlations ranging from −0.409 to 0.948. Discharge is therefore most useful when its variation reflects the inflow transport or mixing processes that control local water-temperature changes.
(5)
Process-informed input-variable selection provides a consistent basis for comparing multi-site water-temperature forecasts and interpreting error differences among stations and input schemes. Antecedent water temperature, air temperature, and discharge correspond to thermal inertia, atmospheric thermal forcing, and hydrodynamic transport, respectively. The results show that variable effectiveness should be considered together with station background, seasonal stage, and inflow thermal state. Future improvements in complex reaches may benefit from variables that describe radiation, wind speed, humidity, inflow temperature, tributary inflow, reservoir operation, and local heat sources more directly.

Author Contributions

J.M.: Conceptualization, data curation, writing—original draft, supervision and funding acquisition; H.H.: methodology, software, validation, data curation, writing—original draft and visualization; D.L.: resources, supervision, and funding acquisition; Y.L.: resources and funding acquisition; Y.X.: conceptualization, methodology, validation and writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Open Project Funding of Hubei Field Observation and Scientific Research Stations for Water Ecosystem in Three Gorges Reservoir, China Three Gorges University (2024YWZ03); the Natural Science Foundation of Hubei Province, China (2024AFD369); the Green Industrial Science and Technology Leading Project of Hubei University of Technology (XJ2024000401); the National Natural Science Foundation of China (52179065, U2040220); and the Open Project Funding of Key Laboratory of Intelligent Health Perception and Ecological Restoration of Rivers and Lakes, Ministry of Education, Hubei University of Technology (HGKFZ01).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Caissie, D. The thermal regime of rivers: A review. Freshw. Biol. 2006, 51, 1389–1406.
  2. Poole, G.C.; Berman, C.H. An ecological perspective on in-stream temperature: Natural heat dynamics and mechanisms of human-caused thermal degradation. Environ. Manag. 2001, 27, 787–802.
  3. Webb, B.W.; Hannah, D.M.; Moore, R.D.; Brown, L.E.; Nobilis, F. Recent advances in stream and river temperature research. Hydrol. Process. 2008, 22, 902–918.
  4. Xie, Q.; Liu, Z.; Fang, X.; Chen, Y.; Li, C.; MacIntyre, S. Understanding the temperature variations and thermal structure of a subtropical deep river-run reservoir before and after impoundment. Water 2017, 9, 603.
  5. He, T.; Deng, Y.; Tuo, Y.; Yang, Y.; Liang, N. Impact of the dam construction on the downstream thermal conditions of the Yangtze River. Int. J. Environ. Res. Public Health 2020, 17, 2973.
  6. Zhang, J.Q.; Ma, J.; Xu, Y.Q.; Liu, D.F.; Wang, Z.P.; Tao, Z.Y.; Wei, H.; Xiao, R. Methods for predicting water temperature in data-scarce areas under different climate regions of China. Water Cycle 2025, 6, 259–271.
  7. Ouellet, V.; St-Hilaire, A.; Dugdale, S.J.; Hannah, D.M.; Krause, S.; Proulx-Ouellet, S. River temperature research and practice: Recent challenges and emerging opportunities for managing thermal habitat conditions in stream ecosystems. Sci. Total Environ. 2020, 736, 139679.
  8. Benyahya, L.; Caissie, D.; St-Hilaire, A.; Ouarda, T.B.M.J.; Bobée, B. A review of statistical water temperature models. Can. Water Resour. J. 2007, 32, 179–192.
  9. Mohseni, O.; Stefan, H.G. Stream temperature/air temperature relationship: A physical interpretation. J. Hydrol. 1999, 218, 128–141.
  10. Piccolroaz, S.; Calamita, E.; Majone, B.; Gallice, A.; Siviglia, A.; Toffolon, M. Prediction of river water temperature: A comparison between a new family of hybrid models and statistical approaches. Hydrol. Process. 2016, 30, 3901–3917.
  11. Piotrowski, A.P.; Napiorkowski, J.J. Performance of the air2stream model that relates air and stream water temperatures depends on the calibration method. J. Hydrol. 2018, 561, 395–412.
  12. DeWeber, J.T.; Wagner, T. A regional neural network ensemble for predicting mean daily river water temperature. J. Hydrol. 2014, 517, 187–200.
  13. Zhu, S.; Nyarko, E.K.; Hadzima-Nyarko, M.; Heddam, S.; Wu, S. Assessing the performance of a suite of machine learning models for daily river water temperature prediction. PeerJ 2019, 7, e7065.
  14. Feigl, M.; Lebiedzinski, K.; Herrnegger, M.; Schulz, K. Machine-learning methods for stream water temperature prediction. Hydrol. Earth Syst. Sci. 2021, 25, 2951–2977.
  15. Qiu, R.; Wang, Y.; Rhoads, B.L.; Wang, D.; Qiu, W.; Tao, Y.; Wu, J. River water temperature forecasting using a deep learning method. J. Hydrol. 2021, 595, 126016.
  16. Piotrowski, A.P.; Napiorkowski, M.J.; Napiorkowski, J.J.; Osuch, M. Comparing various artificial neural network types for water temperature prediction in rivers. J. Hydrol. 2015, 529, 302–315.
  17. Qiu, R.; Wang, Y.; Wang, D.; Qiu, W.; Wu, J.; Tao, Y. Water temperature forecasting based on modified artificial neural network methods: Two cases of the Yangtze River. Sci. Total Environ. 2020, 737, 139729.
  18. Tao, Y.; Wang, Y.; Wang, D.; Ni, L.; Wu, J. A C-vine copula framework to predict daily water temperature in the Yangtze River. J. Hydrol. 2021, 598, 126430.
  19. Graf, R.; Aghelpour, P. Daily river water temperature prediction: A comparison between neural network and stochastic techniques. Atmosphere 2021, 12, 1154.
  20. Weierbach, H.; Lima, A.R.; Willard, J.D.; Hendrix, V.C.; Christianson, D.S.; Lubich, M.; Varadharajan, C. Stream temperature predictions for river basin management in the Pacific Northwest and Mid-Atlantic regions using machine learning. Water 2022, 14, 1032.
  21. Jiang, D.; Xu, Y.; Lu, Y.; Gao, J.; Wang, K. Forecasting water temperature in cascade reservoir operation-influenced river with machine learning models. Water 2022, 14, 2146.
  22. Abdi, R.; Rust, A.; Hogue, T.S. Development of a multilayer deep neural network model for predicting hourly river water temperature from meteorological data. Front. Environ. Sci. 2021, 9, 738322.
  23. Drainas, K.; Kaule, L.; Mohr, S.; Uniyal, B.; Wild, R.; Geist, J. Predicting stream water temperature with artificial neural networks based on open-access data. Hydrol. Process. 2023, 37, e14991.
  24. Ikram, R.M.A.; Mostafa, R.R.; Chen, Z.; Parmar, K.S.; Kisi, O.; Zounemat-Kermani, M. Water temperature prediction using improved deep learning methods through reptile search algorithm and weighted mean of vectors optimizer. J. Mar. Sci. Eng. 2023, 11, 259.
  25. Zhu, S.; Piotrowski, A.P. River/stream water temperature forecasting using artificial intelligence models: A systematic review. Acta Geophys. 2020, 68, 1433–1442.
  26. Grbić, R.; Kurtagić, D.; Slišković, D. Stream water temperature prediction based on Gaussian process regression. Expert Syst. Appl. 2013, 40, 7407–7414.
  27. Sun, J.; Di Nunno, F.; Sojka, M.; Ptak, M.; Luo, Y.; Xu, R.; Xu, J.; Luo, Y.; Zhu, S.; Granata, F. Prediction of daily river water temperatures using an optimized model based on NARX networks. Ecol. Indic. 2024, 161, 111978.
  28. Ahmadi-Nedushan, B.; St-Hilaire, A.; Ouarda, T.B.M.J.; Bilodeau, L.; Robichaud, É.; Thiémonge, N.; Bobée, B. Predicting river water temperatures using stochastic models: Case study of the Moisie River (Québec, Canada). Hydrol. Process. 2007, 21, 21–34.
  29. Graf, R.; Zhu, S.; Sivakumar, B. Forecasting river water temperature time series using a wavelet-neural network hybrid modelling approach. J. Hydrol. 2019, 578, 124115.
  30. Heddam, S.; Ptak, M.; Sojka, M.; Kim, S.; Malik, A.; Kisi, O.; Zounemat-Kermani, M. Least square support vector machine-based variational mode decomposition: A new hybrid model for daily river water temperature modeling. Environ. Sci. Pollut. Res. 2022, 29, 71555–71582.
  31. Segura, C.; Caldwell, P.; Sun, G.; McNulty, S.; Zhang, Y. A model to predict stream water temperature across the conterminous USA. Hydrol. Process. 2015, 29, 2178–2195.
  32. Rahmani, F.; Appling, A.P.; Feng, D.; Lawson, K.; Shen, C. Identifying structural priors in a hybrid differentiable model for stream water temperature modeling. Water Resour. Res. 2023, 59, e2023WR034420.
  33. Orozco López, E.; Kaplan, D.; Linhoss, A. Interpretable transformer neural network prediction of diverse environmental time series using weather forecasts. Water Resour. Res. 2024, 60, e2023WR036337.
  34. He, Y.; Yang, X. A physics-informed deep learning framework for estimating thermal stratification in a large deep reservoir. Water Resour. Res. 2025, 61, e2025WR040592.
  35. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
  36. Zheng, T.G.; Wu, M.X.; Zhang, D.; Jin, J.; Lin, J.Q.; Sun, S.K. Simulating water temperature in reservoir using improved LSTM model. Trans. Chin. Soc. Agric. Eng. 2025, 41, 144–153.
  37. Beck, M.; Pöppel, K.; Spanring, M.; Auer, A.; Prudnikova, O.; Kopp, M.; Klambauer, G.; Brandstetter, J.; Hochreiter, S. xLSTM: Extended long short-term memory. arXiv 2024, arXiv:2405.04517.
  38. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
  39. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.
  40. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 2019, 32, 8024–8035.
  41. Corona, C.R.; Hogue, T.S. Machine learning in stream and river water temperature modeling: A review and metrics for evaluation. Hydrol. Earth Syst. Sci. 2025, 29, 2521–2549.
  42. Moriasi, D.N.; Arnold, J.G.; Van Liew, M.W.; Bingner, R.L.; Harmel, R.D.; Veith, T.L. Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Trans. ASABE 2007, 50, 885–900.
Figure 1. Study area and hydrological stations in the upper and middle reaches of the Yangtze River.
Figure 2. Distributions of predicted-temperature differences under different hyperparameter perturbations. Note: Each boxplot shows the maximum difference among three parameter levels for the same station, forecast-initiation period and validation date. Station abbreviations: XJB, Xiangjiaba; PZH, Panzhihua; YC, Yichang; NJG, Nanjinguan; BD, Badong; HLM, Huanglingmiao; BT, Batang; GT, Gangtuo; CT, Cuntan.
Figure 3. Observed and predicted water-temperature series under different parameter levels in selected validation cases. Note: The black line represents observed water temperature, and colored lines represent predictions under different parameter levels. The other two parameters were fixed at the baseline combination.
Figure 4. Seasonal representative-period mean RMSE of LSTM and xLSTM under different station groups.
Figure 5. Paired comparison of mean RMSE with and without discharge input.
Figure 6. Paired comparison of mean RMSE between LSTM and xLSTM at different stations.
Table 1. Station grouping and modeling rationale.
| Type of Influence | Stations | Main Background | Modeling Focus |
|---|---|---|---|
| Natural hydro-meteorology | Gangtuo, Batang, Panzhihua | Dominated by climate, elevation, inflow processes and natural water–air heat exchange; direct engineering regulation is relatively weak. | Examine the effects of water-temperature thermal inertia, seasonal transition and climatic differences on forecasting errors. |
| Reservoir regulation impact | Xiangjiaba, Huanglingmiao, Nanjinguan, Yichang | Affected by reservoir storage and release, released-water temperature, downstream mixing or reservoir-outlet processes, which may reshape the short-term continuity of the water-temperature series. | Analyze whether reservoir regulation enhances short-term thermal continuity and how regulation-induced smoothing affects model accuracy. |
| Compound disturbance impact | Cuntan, Badong | Cuntan is jointly affected by backwater near the reservoir tail, Jialing River inflow, urban shoreline activity and local hydrodynamics; Badong is influenced by the Three Gorges Reservoir, shoreline activity, reservoir thermal inertia and slow-flow conditions. | Compare forecasting errors and input-variable applicability under compound disturbance backgrounds and different local processes. |
Table 2. Experimental design and parameter ranges.
| Item | Setting | Description |
|---|---|---|
| Model | LSTM, xLSTM | Both models used identical input variables, parameter-search space and training–validation split. |
| Input scheme | No discharge: air temperature, water temperature; with discharge: air temperature, water temperature, discharge | The no-discharge scheme covered nine stations, while the discharge-input scheme covered seven stations with available discharge data. |
| Input window | 10, 20, 30 d | Historical sequence length used to predict next-day water temperature. |
| Learning rate | 0.0001, 0.0005, 0.001 | Controls the step size of parameter updates in the Adam optimizer. |
| Batch size | 16, 32, 64 | Controls the number of samples used for each gradient update. |
| Training setting | epoch = 100, seed = 42, training:validation = 9:1 | Training and validation samples were split chronologically to avoid temporal leakage. |
| Seasonal representative dates | 15 February 2015, 15 May 2015, 15 August 2015, and 15 November 2015 | Represent winter low-temperature, spring warming, summer high-temperature and autumn cooling periods, respectively. |
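The input-window and chronological-split settings in Table 2 can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the helper names `make_windows` and `chronological_split` are hypothetical, the series are synthetic sinusoids, and the window of 20 d is simply one of the three tested levels.

```python
import numpy as np


def make_windows(features, target, window):
    """Stack `window` consecutive days of inputs to predict the next day's target."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n_samples = len(target) - window
    # Sample i uses days i .. i+window-1 as input and day i+window as label.
    X = np.stack([features[i:i + window] for i in range(n_samples)])
    y = target[window:]
    return X, y


def chronological_split(X, y, train_parts=9, val_parts=1):
    """Split in time order (no shuffling) so validation samples follow training."""
    cut = len(X) * train_parts // (train_parts + val_parts)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])


# Synthetic daily series standing in for station records (assumption).
days = np.arange(400)
water = 15.0 + 8.0 * np.sin(2 * np.pi * days / 365)
air = 16.0 + 12.0 * np.sin(2 * np.pi * (days + 20) / 365)
features = np.column_stack([air, water])  # no-discharge input scheme

X, y = make_windows(features, water, window=20)
(train_X, train_y), (val_X, val_y) = chronological_split(X, y)

print(X.shape, len(train_X), len(val_X))
```

Adding discharge as a third column of `features` would correspond to the with-discharge scheme. Because the split is positional rather than random, no validation label precedes a training label in time, which is the leakage concern named in the table.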
Table 3. Statistics of predicted water-temperature differences under hyperparameter perturbations.
| Model | Parameter | Median Difference/°C | Mean Difference/°C | 90th Percentile/°C | Maximum/°C | Low-Difference Station | High-Difference Station |
|---|---|---|---|---|---|---|---|
| LSTM | Batch size | 0.056 | 0.068 | 0.128 | 0.533 | Xiangjiaba (0.039) | Cuntan (0.090) |
| LSTM | Learning rate | 0.077 | 0.097 | 0.185 | 0.877 | Xiangjiaba (0.043) | Gangtuo (0.112) |
| LSTM | Window length | 0.055 | 0.066 | 0.126 | 0.377 | Xiangjiaba (0.033) | Gangtuo (0.105) |
| xLSTM | Batch size | 0.073 | 0.091 | 0.172 | 0.682 | Xiangjiaba (0.052) | Cuntan (0.123) |
| xLSTM | Learning rate | 0.102 | 0.136 | 0.254 | 1.475 | Xiangjiaba (0.068) | Panzhihua (0.163) |
| xLSTM | Window length | 0.089 | 0.104 | 0.187 | 0.611 | Xiangjiaba (0.057) | Gangtuo (0.144) |
Table 4. RMSE statistics of LSTM for different station groups and representative seasonal periods.
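| Seasonal Period | RMSE Statistic | XJB | HLM | NJG | YC | GT | BT | PZH | CT | BD |
|---|---|---|---|---|---|---|---|---|---|---|
| Spring warming | Mean | 0.034 | 0.150 | 0.068 | 0.064 | 0.219 | 0.143 | 0.211 | 0.441 | 0.080 |
| Spring warming | Maximum | 0.098 | 0.215 | 0.169 | 0.167 | 0.458 | 0.369 | 0.277 | 0.897 | 0.183 |
| Spring warming | Minimum | 0.000 | 0.088 | 0.002 | 0.000 | 0.022 | 0.003 | 0.066 | 0.072 | 0.017 |
| Summer high-temp. | Mean | 0.213 | 0.049 | 0.032 | 0.049 | 0.184 | 0.208 | 0.506 | 0.456 | 0.263 |
| Summer high-temp. | Maximum | 0.270 | 0.146 | 0.081 | 0.154 | 0.310 | 0.328 | 0.576 | 0.541 | 0.480 |
| Summer high-temp. | Minimum | 0.164 | 0.001 | 0.000 | 0.001 | 0.061 | 0.106 | 0.448 | 0.376 | 0.097 |
| Autumn cooling | Mean | 0.036 | 0.082 | 0.053 | 0.028 | 0.090 | 0.127 | 0.055 | 0.530 | 0.068 |
| Autumn cooling | Maximum | 0.086 | 0.136 | 0.141 | 0.064 | 0.191 | 0.210 | 0.146 | 0.725 | 0.127 |
| Autumn cooling | Minimum | 0.000 | 0.025 | 0.005 | 0.006 | 0.000 | 0.005 | 0.006 | 0.417 | 0.005 |
| Winter low-temp. | Mean | 0.062 | 0.280 | 0.084 | 0.079 | 0.260 | 0.151 | 0.159 | 0.156 | 0.081 |
| Winter low-temp. | Maximum | 0.125 | 0.371 | 0.188 | 0.233 | 0.380 | 0.283 | 0.308 | 0.276 | 0.242 |
| Winter low-temp. | Minimum | 0.004 | 0.201 | 0.004 | 0.005 | 0.134 | 0.078 | 0.013 | 0.064 | 0.002 |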
Table 5. RMSE statistics of xLSTM for different station groups and representative seasonal periods.
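| Seasonal Period | RMSE Statistic | XJB | HLM | NJG | YC | GT | BT | PZH | CT | BD |
|---|---|---|---|---|---|---|---|---|---|---|
| Spring warming | Mean | 0.058 | 0.114 | 0.074 | 0.081 | 0.208 | 0.243 | 0.131 | 0.324 | 0.053 |
| Spring warming | Maximum | 0.124 | 0.201 | 0.227 | 0.468 | 0.480 | 1.017 | 0.329 | 0.722 | 0.209 |
| Spring warming | Minimum | 0.003 | 0.041 | 0.002 | 0.000 | 0.001 | 0.001 | 0.003 | 0.009 | 0.003 |
| Summer high-temp. | Mean | 0.229 | 0.081 | 0.052 | 0.064 | 0.220 | 0.187 | 0.581 | 0.419 | 0.406 |
| Summer high-temp. | Maximum | 0.266 | 0.248 | 0.122 | 0.142 | 0.628 | 0.350 | 0.820 | 0.615 | 0.621 |
| Summer high-temp. | Minimum | 0.183 | 0.005 | 0.002 | 0.003 | 0.013 | 0.018 | 0.474 | 0.327 | 0.098 |
| Autumn cooling | Mean | 0.033 | 0.049 | 0.082 | 0.048 | 0.061 | 0.156 | 0.119 | 0.556 | 0.058 |
| Autumn cooling | Maximum | 0.107 | 0.120 | 0.245 | 0.138 | 0.286 | 0.354 | 0.312 | 0.775 | 0.177 |
| Autumn cooling | Minimum | 0.001 | 0.003 | 0.000 | 0.004 | 0.004 | 0.040 | 0.009 | 0.226 | 0.001 |
| Winter low-temp. | Mean | 0.054 | 0.205 | 0.072 | 0.091 | 0.328 | 0.109 | 0.124 | 0.179 | 0.076 |
| Winter low-temp. | Maximum | 0.198 | 0.321 | 0.191 | 0.248 | 0.529 | 0.334 | 0.310 | 0.334 | 0.219 |
| Winter low-temp. | Minimum | 0.004 | 0.035 | 0.017 | 0.015 | 0.103 | 0.014 | 0.006 | 0.045 | 0.006 |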
Table 6. Group-wise statistics of one-day-ahead RMSE under the no-discharge scheme.
| Group | Model | Mean RMSE | Standard Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| Reservoir regulation impact | LSTM | 0.085 | 0.077 | 0.000 | 0.371 |
| Reservoir regulation impact | xLSTM | 0.087 | 0.073 | 0.000 | 0.468 |
| Natural hydro-meteorology | LSTM | 0.193 | 0.132 | 0.000 | 0.576 |
| Natural hydro-meteorology | xLSTM | 0.206 | 0.178 | 0.001 | 1.017 |
| Compound disturbance impact | LSTM | 0.259 | 0.209 | 0.002 | 0.897 |
| Compound disturbance impact | xLSTM | 0.259 | 0.208 | 0.001 | 0.775 |
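The group-wise LSTM means in Table 6 can be cross-checked against the seasonal means in Table 4 with a short worked example. The sketch assumes every station and season contributes the same number of cases, so that a mean of station means equals the pooled mean; the source does not state this explicitly. The dictionary and variable names are illustrative.

```python
from statistics import mean

# Seasonal mean RMSE of LSTM (deg C) copied from Table 4,
# ordered spring, summer, autumn, winter.
lstm_seasonal_mean = {
    "XJB": [0.034, 0.213, 0.036, 0.062],
    "HLM": [0.150, 0.049, 0.082, 0.280],
    "NJG": [0.068, 0.032, 0.053, 0.084],
    "YC": [0.064, 0.049, 0.028, 0.079],
    "GT": [0.219, 0.184, 0.090, 0.260],
    "BT": [0.143, 0.208, 0.127, 0.151],
    "PZH": [0.211, 0.506, 0.055, 0.159],
    "CT": [0.441, 0.456, 0.530, 0.156],
    "BD": [0.080, 0.263, 0.068, 0.081],
}

# Station grouping from Table 1.
groups = {
    "Reservoir regulation impact": ["XJB", "HLM", "NJG", "YC"],
    "Natural hydro-meteorology": ["GT", "BT", "PZH"],
    "Compound disturbance impact": ["CT", "BD"],
}

# Equal-weight averaging (assumption: same number of cases per station/season).
station_mean = {s: mean(v) for s, v in lstm_seasonal_mean.items()}
group_mean = {g: mean(station_mean[s] for s in members)
              for g, members in groups.items()}
overall_mean = mean(station_mean.values())

for g, value in group_mean.items():
    print(f"{g}: {value:.4f} deg C")
print(f"All stations: {overall_mean:.4f} deg C")
```

Under this assumption the three group means agree with the LSTM rows of Table 6 (0.085, 0.193 and 0.259 °C) to within rounding, and the nine-station mean agrees with the 0.160 °C reported for LSTM on the representative dates.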
Table 7. Comparison of one-day-ahead mean RMSE differences with and without discharge input.
| Group | Station | LSTM ΔRMSE | xLSTM ΔRMSE | Interpretation |
|---|---|---|---|---|
| Reservoir regulation impact | Xiangjiaba | −0.000 | 0.019 | No improvement for either model |
| Reservoir regulation impact | Huanglingmiao | −0.003 | 0.013 | LSTM decreased; xLSTM did not improve |
| Reservoir regulation impact | Yichang | 0.006 | 0.002 | No improvement for either model |
| Natural hydro-meteorology | Gangtuo | 0.041 | 0.080 | No improvement for either model |
| Natural hydro-meteorology | Batang | −0.026 | 0.019 | LSTM decreased; xLSTM did not improve |
| Natural hydro-meteorology | Panzhihua | −0.017 | 0.009 | LSTM decreased; xLSTM did not improve |
| Compound disturbance impact | Cuntan | 0.002 | 0.009 | No improvement for either model |
Note: ΔRMSE = RMSE with discharge input − RMSE without discharge input. Negative values indicate reduced RMSE after discharge was added; positive values indicate increased or unimproved RMSE. Unit: °C. Nanjinguan and Badong were not included because continuous discharge input was unavailable.
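The sign convention in the note can be applied directly to the tabulated values. The following sketch copies the ΔRMSE entries from Table 7 and summarizes them per model; the helper `summarize` is hypothetical, and the −0.000 entry for Xiangjiaba is treated as zero, in line with the table's "no improvement" interpretation.

```python
# Delta RMSE (deg C) copied from Table 7 as (LSTM, xLSTM);
# delta = RMSE with discharge - RMSE without discharge.
delta_rmse = {
    "Xiangjiaba": (-0.000, 0.019),
    "Huanglingmiao": (-0.003, 0.013),
    "Yichang": (0.006, 0.002),
    "Gangtuo": (0.041, 0.080),
    "Batang": (-0.026, 0.019),
    "Panzhihua": (-0.017, 0.009),
    "Cuntan": (0.002, 0.009),
}


def summarize(model_index):
    """Return the mean delta and the stations whose RMSE strictly decreased."""
    values = [pair[model_index] for pair in delta_rmse.values()]
    improved = [name for name, pair in delta_rmse.items() if pair[model_index] < 0]
    return sum(values) / len(values), improved


lstm_mean, lstm_improved = summarize(0)
xlstm_mean, xlstm_improved = summarize(1)

print(f"LSTM:  mean delta = {lstm_mean:+.4f} deg C, improved at {lstm_improved}")
print(f"xLSTM: mean delta = {xlstm_mean:+.4f} deg C, improved at {xlstm_improved}")
```

With these inputs the strictly negative LSTM entries are Huanglingmiao, Batang and Panzhihua, and no xLSTM entry is negative, so the seven-station xLSTM average is positive.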
Table 8. Comparison of one-day-ahead mean RMSE between LSTM and xLSTM at different stations under the no-discharge scheme.
| Group | Station | LSTM | xLSTM | Difference (xLSTM − LSTM) | Lower-Error Model |
|---|---|---|---|---|---|
| Reservoir regulation impact | Xiangjiaba | 0.086 | 0.093 | 0.007 | LSTM |
| Reservoir regulation impact | Huanglingmiao | 0.140 | 0.112 | −0.028 | xLSTM |
| Reservoir regulation impact | Nanjinguan | 0.059 | 0.070 | 0.011 | LSTM |
| Reservoir regulation impact | Yichang | 0.055 | 0.071 | 0.016 | LSTM |
| Natural hydro-meteorology | Gangtuo | 0.188 | 0.205 | 0.016 | LSTM |
| Natural hydro-meteorology | Batang | 0.157 | 0.174 | 0.016 | LSTM |
| Natural hydro-meteorology | Panzhihua | 0.233 | 0.239 | 0.006 | LSTM |
| Compound disturbance impact | Cuntan | 0.396 | 0.369 | −0.026 | xLSTM |
| Compound disturbance impact | Badong | 0.123 | 0.148 | 0.025 | LSTM |
Table 9. Supplementary validation-set performance of LSTM and xLSTM under the no-discharge scheme.
| Model | Validation RMSE/°C | Validation MAE/°C | Validation R² | Representative-Date One-Day-Ahead RMSE/°C |
|---|---|---|---|---|
| LSTM | 0.238 | 0.177 | 0.938 | 0.160 |
| xLSTM | 0.248 | 0.186 | 0.935 | 0.165 |
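For readers who want to reproduce the kind of metrics listed in Table 9, the sketch below uses the standard definitions of RMSE, MAE and the coefficient of determination, together with a persistence forecast of the kind used as a baseline in the study. It assumes R² is computed as one minus the ratio of residual to total sum of squares and that persistence means "tomorrow equals today"; the article's exact formulas are not shown in this section, and the short series is hypothetical.

```python
import numpy as np


def rmse(obs, pred):
    """Root-mean-square error."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def mae(obs, pred):
    """Mean absolute error."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.mean(np.abs(pred - obs)))


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def persistence_forecast(series):
    """One-day-ahead persistence: the forecast for day t+1 is the value on day t.
    Returns aligned (observed, predicted) arrays."""
    series = np.asarray(series, float)
    return series[1:], series[:-1]


# Hypothetical daily water temperatures (deg C), not data from the paper.
water = [10.0, 10.2, 10.1, 10.4, 10.6]
obs, pred = persistence_forecast(water)

print(f"persistence RMSE = {rmse(obs, pred):.3f} deg C")
print(f"persistence MAE  = {mae(obs, pred):.3f} deg C")
print(f"persistence R2   = {r2(obs, pred):.3f}")
```

Evaluating a trained model with the same three functions on the same dates would allow a like-for-like comparison against the persistence baseline.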
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ma, J.; Huang, H.; Liu, D.; Liu, Y.; Xu, Y. Thermal-Process-Informed Input-Variable Selection for Multi-Site Short-Term River Water-Temperature Forecasting in the Upper and Middle Reaches of the Yangtze River. Water 2026, 18, 1574. https://doi.org/10.3390/w18131574