# Forecasting Water Temperature in Cascade Reservoir Operation-Influenced River with Machine Learning Models


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Study Area

#### 2.2. Methodology

#### 2.2.1. Decision Trees (DT)

#### 2.2.2. Random Forest (RF)

#### 2.2.3. Gradient Boosting (GB)

#### 2.2.4. Adaptive Boosting (AB)

#### 2.2.5. Support Vector Regression (SVR)

#### 2.2.6. MultiLayer Perceptron Neural Network (MLPNN)

## 3. Results

#### 3.1. Importance Ranking of Variables

In the training data, all six models achieved an R^{2} above 0.96 and a Nash efficiency coefficient (NSE) [43] of no less than 0.96; the best-fitting model was GB, which was almost completely accurate for all training data. In the test data, the prediction accuracy of the six models decreased to some extent. However, the overall RMSE of all models remained below 0.64 °C, the R^{2} above 0.93, and the NSE no less than 0.92. The RF model achieved the highest precision of all, with an RMSE of only 0.203 °C. All of the above indicate that the constructed models are robust in terms of variable screening.
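The permutation-importance ranking shown in Figures 4–6 can be sketched with scikit-learn's `permutation_importance`: each predictor is shuffled in turn and the resulting drop in model score is recorded. The data below are synthetic stand-ins (variable names and coefficients are illustrative only, not the study's dataset); a water temperature series dominated by a seasonal signal makes DOY come out on top, as in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 500
# Hypothetical stand-ins for three of the candidate predictors:
# day of year (DOY), discharge (Flow) and air temperature (Ta).
doy = rng.integers(1, 366, n).astype(float)
flow = rng.normal(5000.0, 800.0, n)
ta = 15.0 + 10.0 * np.sin(2.0 * np.pi * doy / 365.0) + rng.normal(0.0, 3.0, n)
X = np.column_stack([doy, flow, ta])
# Water temperature dominated by the seasonal (DOY) signal.
y = 14.0 + 6.0 * np.sin(2.0 * np.pi * doy / 365.0) + 0.0002 * flow + rng.normal(0.0, 0.3, n)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
# Shuffle each predictor in turn and record the mean drop in the R^2 score.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, imp in zip(["DOY", "Flow", "Ta"], result.importances_mean):
    print(f"{name}: {imp:.3f}")
```

With these synthetic data, DOY receives by far the largest importance, mirroring the ranking reported for the real series.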

#### 3.2. Prediction Results of Each Model

Three statistical indicators, RMSE, R^{2} and NSE, were used to compare the various machine learning models. To calibrate and validate the models used for water temperature prediction, the data were divided into a calibration and a validation period at a ratio of 6:4. That is, we used the first 60% of the data to train the models and the remaining 40% to test their performance. This is a widely used paradigm for model development.
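The split and the three evaluation metrics can be written down compactly. The definitions below are the standard formulas (R^{2} as the squared Pearson correlation, NSE as one minus the ratio of residual to observed variance); the array names and the tiny worked example are illustrative, not the study's data.

```python
import numpy as np

def rmse(obs, pred):
    # Root mean square error, in the same units as WT (°C).
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def r2(obs, pred):
    # Squared Pearson correlation between observations and predictions.
    return float(np.corrcoef(obs, pred)[0, 1] ** 2)

def nse(obs, pred):
    # Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance.
    return float(1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - np.mean(obs)) ** 2))

def split_60_40(X, y):
    # First 60% of the record for calibration, remaining 40% for validation.
    k = int(0.6 * len(y))
    return X[:k], X[k:], y[:k], y[k:]

# Tiny worked example with hypothetical observed/predicted temperatures.
obs = np.array([10.0, 12.0, 14.0, 16.0])
pred = np.array([10.5, 11.5, 14.5, 15.5])
print(rmse(obs, pred), r2(obs, pred), nse(obs, pred))  # 0.5, ~0.953, 0.95
```

A perfect model gives RMSE = 0 and R^{2} = NSE = 1; NSE can go negative when predictions are worse than the observed mean.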

For DT, the RMSE and R^{2} values for the three model versions in the training set are 0.200 °C and 0.980, 0.044 °C and 0.996, and 0.017 °C and 0.998, respectively, while in the test dataset they are 0.554 °C and 0.943, 0.330 °C and 0.966, and 0.359 °C and 0.963, respectively. Specifically, DT achieves greater accuracy in versions 2 and 3, both of which improve by approximately two percentage points over version 1. This demonstrates that including Flow as a predictor helps reduce forecast error. Switching from version 2 to version 3, which adds factors other than DOY and Flow as input variables, improved the performance of DT only slightly. The scatterplots and the comparison between observed and predicted WT for the three model versions are shown in Figure 7. As illustrated in the figure, the fits are reasonably good in all three cases, and the scatter around the regression curves is relatively concentrated.
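The three input configurations compared throughout this section (version 1: DOY only; version 2: DOY and Flow; version 3: all predictors) amount to fitting the same estimator on nested column subsets. A minimal sketch with a decision tree on synthetic data (column names, coefficients and the split are placeholders, not the study's series):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
n = 400
doy = rng.integers(1, 366, n).astype(float)
flow = rng.normal(5000.0, 800.0, n)
ta = 15.0 + 10.0 * np.sin(2 * np.pi * doy / 365) + rng.normal(0.0, 2.0, n)
wt = 14.0 + 6.0 * np.sin(2 * np.pi * doy / 365) + 0.0003 * flow + rng.normal(0.0, 0.3, n)

X = np.column_stack([doy, flow, ta])
k = int(0.6 * n)  # 60/40 calibration/validation split, as in the paper
versions = {"version 1 (DOY)": [0],
            "version 2 (DOY+Flow)": [0, 1],
            "version 3 (all)": [0, 1, 2]}

test_rmse = {}
for name, cols in versions.items():
    # Same estimator, nested column subsets.
    model = DecisionTreeRegressor(random_state=0)
    model.fit(X[:k][:, cols], wt[:k])
    pred = model.predict(X[k:][:, cols])
    test_rmse[name] = float(np.sqrt(np.mean((wt[k:] - pred) ** 2)))
    print(name, round(test_rmse[name], 3))
```

Swapping `DecisionTreeRegressor` for `RandomForestRegressor`, `GradientBoostingRegressor`, `AdaBoostRegressor`, `SVR` or `MLPRegressor` reproduces the other five comparisons in outline.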

The RF model obtained high R^{2} and NSE values (R^{2} ≈ 0.950, NSE ≈ 0.949) and a low error value (RMSE ≈ 0.201 °C). In addition, the modeling results reveal that version 3 greatly outperformed the other versions. In the training set, RF2 improved the model's accuracy by reducing the RMSE of RF1 by 49.73 percent, while RF3 reduced it by about 74.39 percent. These proportions change to 51.52 percent and 58.12 percent in the test set, demonstrating that the RF3 and RF2 models perform comparably there. The accuracy of the three versions as a whole demonstrates that version 3 performs best. Figure 8 illustrates the scatterplots and the comparison of observed and predicted WT.

The GB model obtained high R^{2} and NSE values (R^{2} ≈ 0.980, NSE ≈ 0.980) and a low error value (RMSE ≈ 0.194 °C). Additionally, the modeling results indicate that version 3 significantly outperformed the previous versions. Indeed, in the training set, GB2 reduced the root mean square error of GB1 by 99.69 percent, and GB3 reduced it by approximately 100.0 percent. These percentages change to 48.28 and 47.32 percent in the test set, indicating a small performance difference between the GB3 and GB2 models. The accuracy of the three versions as a whole demonstrates that version 3 performs best; Figure 9 illustrates the scatterplots and the comparison of observed and predicted WT.

The AB model attained an R^{2} value of 0.978 when operating on DOY alone (version 1) in the training set. When Flow is added to DOY (version 2), the model's performance improves slightly. In the test set, a similar phenomenon is observed, but with a lower RMSE and a higher R^{2} value (RMSE = 0.292 °C, R^{2} = 0.970) for version 2.

The SVR model achieved an R^{2} value of 0.971 in the training set with DOY alone (version 1). When Flow is added to DOY (version 2), the model's performance slightly degrades, with the RMSE increasing to 0.345 °C and the R^{2} value decreasing to 0.965. In the test set, a similar phenomenon is observed, but with a lower RMSE and a higher R^{2} value (RMSE = 0.323 °C, R^{2} = 0.967) for version 1.

For MLPNN, the RMSE and R^{2} values for the three model versions in the training set are 0.275 °C and 0.972, 0.236 °C and 0.976, and 0.190 °C and 0.981, respectively, while in the test dataset they are 0.311 °C and 0.968, 0.281 °C and 0.971, and 0.386 °C and 0.960, respectively. It is clear that MLPNN performs better in version 2 than in version 1, with version 2 improving by approximately 0.5 percentage points over version 1 in both datasets. This demonstrates that including Flow as a predictor helps reduce forecast error. Switching from version 2 to version 3, which adds input variables other than DOY and Flow, improves accuracy slightly on the training set while decreasing it slightly on the test set. As illustrated in Figure 12, the fits are reasonably good in all three cases, and the scatter around the regression curves is relatively concentrated.

On the training dataset, the DT, RF, GB and MLPNN models achieved their best results, higher R^{2} and NSE values and lower RMSE (°C) values, in version 3. The AB model performed best in version 2 (when DOY and Flow were used as predictors), while the SVR model performed better in version 1 (with DOY as the only input variable). On the test dataset, the six models produced results that were not entirely consistent with those obtained on the training set: version 2 (with combined DOY and Flow inputs) had the highest fitting accuracy for the AB, DT, GB and MLPNN models, while version 3 gave the best result for RF and version 1 for SVR.

In the training dataset, the models generally achieved high R^{2} and NSE values as well as low RMSE values. On average, the six models performed well in predicting river water temperature, with R^{2} and NSE values both greater than 0.95. The DT and GB models fit the curves more precisely than the others, especially at peaks and troughs. Their performance varies more noticeably in the test data: RF performed well in forecasting WT, with an R^{2} greater than 0.97; the AB, DT, GB and MLPNN models performed slightly worse but still exceeded 0.96; and SVR performed poorly in both datasets. One possible explanation is that the data pre-processing and parameter selection steps (normalization range, dimension reduction algorithm and optimal parameter selection algorithm) were not well tuned for SVR.
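One way to probe that explanation is to tune the normalization and the SVR hyperparameters jointly inside a single cross-validated pipeline, so scaling choices cannot leak into or be decoupled from parameter selection. A sketch with scikit-learn follows; the grid values and synthetic DOY-only data are illustrative, not the settings used in the study.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(2)
n = 300
doy = rng.integers(1, 366, n).astype(float).reshape(-1, 1)
wt = 14.0 + 6.0 * np.sin(2 * np.pi * doy[:, 0] / 365) + rng.normal(0.0, 0.3, n)

# Scaling and SVR hyperparameters selected together by cross-validation.
pipe = Pipeline([("scale", StandardScaler()), ("svr", SVR(kernel="rbf"))])
grid = {"svr__C": [1.0, 10.0, 100.0],
        "svr__epsilon": [0.01, 0.1],
        "svr__gamma": ["scale", 1.0]}
search = GridSearchCV(pipe, grid, cv=5, scoring="neg_root_mean_squared_error")
search.fit(doy, wt)
print(search.best_params_, -search.best_score_)  # best grid point and its CV RMSE (°C)
```

Because the scaler is refit inside each cross-validation fold, the reported RMSE reflects the full preprocessing-plus-model chain rather than a scaler fit on all data.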

## 4. Discussion

All six models achieved acceptable accuracy even in the worst case (R^{2} = 0.93, NSE = 0.92). Ultimately, we did not find a single optimal machine learning algorithm when using the same input variables, indicating that the selected machine learning models all capture the nonlinear dynamics of water temperature fairly well. The most parsimonious models were then developed from the six machine learning models using a combination of the three most important inputs. Comparing their performance according to the statistical metrics, the results showed that GB3 and RF3 produce the highest prediction accuracy on the training dataset and the test dataset, respectively (Table 8). This also suggests that choosing an appropriate minimum number of input variables is sufficient for a machine learning model to obtain acceptable prediction results. On the other hand, Figures 7–12 show that as the water temperature increases (>18 °C), more discrete points appear near the fitting curve. This matters because 18 °C is a threshold of great importance for fish spawning: the four major carp species only start spawning when the water temperature rises to 18 °C in April [46]. The results suggest that gaining further insight into the physical dynamics remains the most essential factor for the successful exploitation of ML to better predict water temperature.

## 5. Conclusions

All six models predicted river water temperature well, with low RMSE and high R^{2} and NSE values. DOY is the most significant factor in all forecast models, while air temperature, flow and dew temperature are secondary factors connected to water temperature variation. This is possibly because the cascade reservoirs operate on a year-round basis and thereby reshape the annual cycle of downstream river water temperature; as a result, DOY may be the more precise predictor of river water temperature downstream of the reservoirs. The modeling performance indicates that the machine learning models developed in this study are capable of accurately predicting river water temperature under the influence of cascade reservoirs.

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

- Caissie, D. The thermal regime of rivers: A review. Freshw. Biol.
**2006**, 51, 1389–1406. [Google Scholar] [CrossRef] - Havens, K.E.; Paerl, H.W. Climate Change at a Crossroad for Control of Harmful Algal Blooms. Environ. Sci. Technol.
**2015**, 49, 12605–12606. [Google Scholar] [CrossRef] [PubMed] - Comer-Warner, S.A.; Romeijn, P.; Gooddy, D.C.; Ullah, S.; Kettridge, N.; Marchant, B.; Hannah, D.M.; Krause, S. Thermal sensitivity of CO_{2} and CH_{4} emissions varies with streambed sediment properties (vol 9, 2803, 2018). Nat. Commun.
**2019**, 10, 3093. [Google Scholar] [CrossRef] [PubMed] - Baxter, J.S.; McPhail, J.D. The influence of redd site selection, groundwater upwelling, and over-winter incubation temperature on survival of bull trout (Salvelinus confluentus) from egg to alevin. Can. J. Zool.-Rev. Can. Zool.
**1999**, 77, 1233–1239. [Google Scholar] [CrossRef] - Porcelli, D.; Gaston, K.J.; Butlin, R.K.; Snook, R.R. Local adaptation of reproductive performance during thermal stress. J. Evol. Biol.
**2017**, 30, 422–429. [Google Scholar] [CrossRef] - Turschwell, M.P.; Balcombe, S.R.; Steel, E.A.; Sheldon, F.; Peterson, E.E. Thermal habitat restricts patterns of occurrence in multiple life-stages of a headwater fish. Freshw. Sci.
**2017**, 36, 402–414. [Google Scholar] [CrossRef] [Green Version] - Hannah, D.M.; Garner, G. River water temperature in the United Kingdom: Changes over the 20th century and possible changes over the 21st century. Prog. Phys. Geogr.-Earth Environ.
**2015**, 39, 68–92. [Google Scholar] [CrossRef] [Green Version] - Ouellet, V.; Secretan, Y.; St-Hilaire, A.; Morin, J. Water temperature modelling in a controlled environment: Comparative study of heat budget equations. Hydrol. Process.
**2014**, 28, 279–292. [Google Scholar] [CrossRef] - Webb, B.W.; Hannah, D.M.; Moore, R.D.; Brown, L.E.; Nobilis, F. Recent advances in stream and river temperature research. Hydrol. Process.
**2008**, 22, 902–918. [Google Scholar] [CrossRef] - Maheu, A.; Poff, N.L.; St-Hilaire, A. A Classification of Stream Water Temperature Regimes in the Conterminous USA. River Res. Appl.
**2016**, 32, 896–906. [Google Scholar] [CrossRef] - Dugdale, S.J.; Hannah, D.M.; Malcolm, I.A. River temperature modelling: A review of process-based approaches and future directions. Earth-Sci. Rev.
**2017**, 175, 97–113. [Google Scholar] [CrossRef] - Poole, G.C.; Berman, C.H. An ecological perspective on in-stream temperature: Natural heat dynamics and mechanisms of human-caused thermal degradation. Environ. Manag.
**2001**, 27, 787–802. [Google Scholar] [CrossRef] [PubMed] - Ekwueme, B.N.; Agunwamba, J.C. Trend Analysis and Variability of Air Temperature and Rainfall in Regional River Basins. Civ. Eng. J.
**2021**, 7, 816–826. [Google Scholar] [CrossRef] - Ojha, S.S.; Singh, V.; Roshni, T. Comparison of Meteorological Drought using SPI and SPEI. Civ. Eng. J.
**2021**, 7, 2130–2149. [Google Scholar] [CrossRef] - Li, X.; Zhang, L.; J. O’Connor, P.; Yan, J.; Wang, B.; Liu, D.L.; Wang, P.; Wang, Z.; Wan, L.; Li, Y. Ecosystem Services under Climate Change Impact Water Infrastructure in a Highly Forested Basin. Water
**2020**, 12, 2825. [Google Scholar] [CrossRef] - Seyedhashemi, H.; Moatar, F.; Vidal, J.P.; Diamond, J.S.; Beaufort, A.; Chandesris, A.; Valette, L. Thermal signatures identify the influence of dams and ponds on stream temperature at the regional scale. Sci. Total Environ.
**2021**, 766, 142667. [Google Scholar] [CrossRef] - Olden, J.D.; Naiman, R.J. Incorporating thermal regimes into environmental flows assessments: Modifying dam operations to restore freshwater ecosystem integrity. Freshw. Biol.
**2010**, 55, 86–107. [Google Scholar] [CrossRef] - Webb, B.W.; Walling, D.E. Complex summer water temperature behaviour below a UK regulating reservoir. Regul. Rivers-Res. Manag.
**1997**, 13, 463–477. [Google Scholar] [CrossRef] - Xiao, Q.; Chuanbo, G.; Fangyuan, X.; Wei, X.; Yushun, C.; Wei, S. Characterization of the Fish Community and Environmental Driving Factors during Development of Cascaded Dams in the Lower Jinsha River. J. Hydroecol.
**2020**, 41, 46–56. [Google Scholar] [CrossRef] - Zhu, S.; Heddam, S.; Nyarko, E.K.; Hadzima-Nyarko, M.; Piccolroaz, S.; Wu, S. Modeling daily water temperature for rivers: Comparison between adaptive neuro-fuzzy inference systems and artificial neural networks models. Environ. Sci. Pollut. Res.
**2019**, 26, 402–420. [Google Scholar] [CrossRef] - Yousefi, A.; Toffolon, M. Critical factors for the use of machine learning to predict lake surface water temperature. J. Hydrol.
**2022**, 606, 127418. [Google Scholar] [CrossRef] - Tran, T.T.K.; Bateni, S.M.; Ki, S.J.; Vosoughifar, H. A Review of Neural Networks for Air Temperature Forecasting. Water
**2021**, 13, 1294. [Google Scholar] [CrossRef] - Ouellet, V.; St-Hilaire, A.; Dugdale, S.J.; Hannah, D.M.; Krause, S.; Proulx-Ouellet, S. River temperature research and practice: Recent challenges and emerging opportunities for managing thermal habitat conditions in stream ecosystems. Sci. Total Environ.
**2020**, 736, 139679. [Google Scholar] [CrossRef] [PubMed] - Cho, K.; Kim, Y. Improving streamflow prediction in the WRF-Hydro model with LSTM networks. J. Hydrol.
**2022**, 605, 127297. [Google Scholar] [CrossRef] - Pyo, J.; Park, L.J.; Pachepsky, Y.; Baek, S.S.; Kim, K.; Cho, K.H. Using convolutional neural network for predicting cyanobacteria concentrations in river water. Water Res.
**2020**, 186, 116349. [Google Scholar] [CrossRef] - Miao, Q.; Pan, B.; Wang, H.; Hsu, K.; Sorooshian, S. Improving Monsoon Precipitation Prediction Using Combined Convolutional and Long Short Term Memory Neural Network. Water
**2019**, 11, 977. [Google Scholar] [CrossRef] [Green Version] - Zhang, J.; Zhu, Y.; Zhang, X.; Ye, M.; Yang, J. Developing a Long Short-Term Memory (LSTM) based model for predicting water table depth in agricultural areas. J. Hydrol.
**2018**, 561, 918–929. [Google Scholar] [CrossRef] - Zhi, W.; Feng, D.; Tsai, W.P.; Sterle, G.; Harpold, A.; Shen, C.; Li, L. From Hydrometeorology to River Water Quality: Can a Deep Learning Model Predict Dissolved Oxygen at the Continental Scale? Environ. Sci. Technol.
**2021**, 55, 2357–2368. [Google Scholar] [CrossRef] - Fu, Y.; Hu, Z.; Zhao, Y.; Huang, M. A Long-Term Water Quality Prediction Method Based on the Temporal Convolutional Network in Smart Mariculture. Water
**2021**, 13, 2907. [Google Scholar] [CrossRef] - Yuan, Q.S.; Wang, P.F.; Chen, J.; Wang, C. Influence of cascade reservoirs on spatiotemporal variations of hydrogeochemistry in Jinsha River. Water Sci. Eng.
**2021**, 14, 97–108. [Google Scholar] [CrossRef] - Zhibing, W.; Yongfeng, H.; Jinling, G.; Tingbing, Z.; Zihao, M.; Yi, C.; Deguo, Y. Temporal and Spatial Variation of Phytoplankton Community Structure in the Main Stream of the Jinsha River. Resour. Environ. Yangtze Basin
**2020**, 29, 1356–1365. [Google Scholar] [CrossRef] - Yang, K.; Li, X.; Gonf, L. An Eco-environmental Management, Analysis and Evaluation System for the Lower Reaches of Jinsha River. J. Yangtze River Sci. Res. Inst.
**2021**, 38, 120–124. [Google Scholar] [CrossRef] - Gadomer, L.; Sosnowski, Z.A. Fuzzy Random Forest with C-Fuzzy Decision Trees. Comput. Inf. Syst. Ind. Manag. Cisim
**2016**, 9842, 481–492. [Google Scholar] [CrossRef] - Pekel, E. Estimation of soil moisture using decision tree regression. Theor. Appl. Climatol.
**2020**, 139, 1111–1119. [Google Scholar] [CrossRef] - Kulkarni, V.Y.; Petare, M.; Sinha, P.K. Analyzing Random Forest Classifier with Different Split Measures. In Proceedings of the Second International Conference on Soft Computing for Problem Solving (Socpros 2012), Jaipur, India, 28–30 December 2012; Springer: New Delhi, India, 2014; Volume 236, pp. 691–699, ISBN 978-81-322-1602-5; 978-81-322-1601-8. [Google Scholar] [CrossRef]
- Huynh-Thu, V.A.; Geurts, P. Unsupervised Gene Network Inference with Decision Trees and Random Forests. Methods Mol. Biol.
**2019**, 1883, 195–215. [Google Scholar] [CrossRef] [Green Version] - Natekin, A.; Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobot.
**2013**, 7, 21. [Google Scholar] [CrossRef] [Green Version] - Foo, S.W.; Lim, E.G. Speaker recognition using adaptively boosted classifier. In Proceedings of the IEEE Region 10 International Conference on Electrical and Electronic Technology, Singapore, 19–22 August 2001; pp. 442–446, ISBN 0-7803-7101-1. [Google Scholar] [CrossRef]
- Shilton, A.; Lai, D.T.H.; Palaniswami, M. A Division Algebraic Framework for Multidimensional Support Vector Regression. IEEE Trans. Syst. Man Cybern. Part B-Cybern.
**2010**, 40, 517–528. [Google Scholar] [CrossRef] - Yang, Y.; Wang, Z.; Yang, B.; Liu, X. Optimization of Support Vector Regression Parameters by Flower Pollination Algorithm. In Proceedings of the 2017 5th International Conference on Frontiers of Manufacturing Science and Measuring Technology (FMSMT 2017), Taiyuan, China, 24–25 June 2017; Volume 130, pp. 1607–1612, ISBN 978-94-6252-331-9. [Google Scholar]
- Kumar, P.; Singh, A.K. A Comparison between MLR, MARS, SVR and RF Techniques: Hydrological Time-series Modeling. J. Hum. Earth Future
**2022**, 3, 90–98. [Google Scholar] [CrossRef] - Yaqub, M.; Eren, B.; Eyupoglu, V. Prediction of heavy metals removal by polymer inclusion membranes using machine learning techniques. Water Environ. J.
**2021**, 35, 1073–1084. [Google Scholar] [CrossRef] - Criss, R.E.; Winston, W.E. Do Nash values have value? Discussion and alternate proposals. Hydrol. Process.
**2008**, 22, 2723–2725. [Google Scholar] [CrossRef] - Preece, R.M.; Jones, H.A. The effect of Keepit Dam on the temperature regime of the Namoi River, Australia. River Res. Appl.
**2002**, 18, 397–414. [Google Scholar] [CrossRef] - Todd, C.R.; Ryan, T.; Nicol, S.J.; Bearlin, A.R. The impact of cold water releases on the critical period of post-spawning survival and its implications for Murray cod (Maccullochella peelii peelii): A case study of the Mitta Mitta River, southeastern Australia. River Res. Appl.
**2005**, 21, 1035–1052. [Google Scholar] [CrossRef] - Wang, Y.; Zhang, N.; Wang, D.; Wu, J. Impacts of cascade reservoirs on Yangtze River water temperature: Assessment and ecological implications. J. Hydrol.
**2020**, 590, 125240. [Google Scholar] [CrossRef] - Toffolon, M.; Piccolroaz, S.; Calamita, E. On the use of averaged indicators to assess lakes’ thermal response to changes in climatic conditions. Environ. Res. Lett.
**2020**, 15, 034060. [Google Scholar] [CrossRef] - Toffolon, M.; Piccolroaz, S.; Majone, B.; Soja, A.M.; Peeters, F.; Schmid, M.; Wueest, A. Prediction of surface temperature in lakes with different morphology using air temperature. Limnol. Oceanogr.
**2014**, 59, 2185–2202. [Google Scholar] [CrossRef] [Green Version] - Ren, L.; Song, C.; Wu, W.; Guo, M.; Zhou, X. Reservoir effects on the variations of the water temperature in the upper Yellow River, China, using principal component analysis. J. Environ. Manag.
**2020**, 262, 110339. [Google Scholar] [CrossRef] - Saber, A.; James, D.E.; Hayes, D.F. Effects of seasonal fluctuations of surface heat flux and wind stress on mixing and vertical diffusivity of water column in deep lakes. Adv. Water Resour.
**2018**, 119, 150–163. [Google Scholar] [CrossRef] - He, W.; Lian, J.; Du, H.; Ma, C. Source tracking and temperature prediction of discharged water in a deep reservoir based on a 3-D hydro-thermal-tracer model. J. Hydro-Environ. Res.
**2018**, 20, 9–21. [Google Scholar] [CrossRef] - Soleimani, S.; Bozorg-Haddad, O.; Saadatpour, M.; Loaiciga, H.A. Optimal Selective Withdrawal Rules Using a Coupled Data Mining Model and Genetic Algorithm. J. Water Resour. Plan. Manag.
**2016**, 142, 04016064. [Google Scholar] [CrossRef] - Zouabi-Aloui, B.; Adelana, S.M.; Gueddari, M. Effects of selective withdrawal on hydrodynamics and water quality of a thermally stratified reservoir in the southern side of the Mediterranean Sea: A simulation approach. Environ. Monit. Assess.
**2015**, 187, 292. [Google Scholar] [CrossRef] - Gu, R.C.; McCutcheon, S.; Chen, C.J. Development of weather-dependent flow requirements for river temperature control. Environ. Manag.
**1999**, 24, 529–540. [Google Scholar] [CrossRef] [PubMed] - Richter, B.D.; Thomas, G.A. Restoring environmental flows by modifying dam operations. Ecol. Soc.
**2007**, 12, 12. [Google Scholar] [CrossRef]

**Figure 1.**Distribution map of main dams (blue circle) and hydrological monitoring points (red triangles) in the study area.

**Figure 2.**Time series plot of water temperature (red lines), mean air temperature (black lines), and discharge (blue lines).

**Figure 3.**Workflow summarizing the steps of the comparative analysis of the performance of the different ML methods.

**Figure 4.** Permutation importance in DT and RF (DT: decision trees; RF: random forests). DT on the **top**, RF on the **bottom**; WT in °C.

**Figure 5.** Permutation importance in GB and AB (GB: gradient boosting regression; AB: adaptive boosting regression). GB on the **top**, AB on the **bottom**; WT in °C.

**Figure 6.** Permutation importance in SVR and MLPNN (SVR: support vector regression; MLPNN: multilayer perceptron neural networks). SVR on the **top**, MLPNN on the **bottom**; WT in °C.

**Figure 7.** Model fitting results for the DecisionTree Regressor. Blue dots: X coordinate (observed data), Y coordinate (predicted data); black line: y = x; red dotted line: regression curve of the blue dots. (**a**) only one input variable (DOY); (**b**) two input variables (DOY and Flow); (**c**) all variables.

**Figure 8.** Model fitting results for the RandomForest Regressor. Blue dots: X coordinate (observed data), Y coordinate (predicted data); black line: y = x; red dotted line: regression curve of the blue dots. (**a**) only one input variable (DOY); (**b**) two input variables (DOY and Flow); (**c**) all variables.

**Figure 9.** Model fitting results for the GradientBoosting Regressor. Blue dots: X coordinate (observed data), Y coordinate (predicted data); black line: y = x; red dotted line: regression curve of the blue dots. (**a**) only one input variable (DOY); (**b**) two input variables (DOY and Flow); (**c**) all variables.

**Figure 10.** Model fitting results for the AdaptiveBoosting Regressor. Blue dots: X coordinate (observed data), Y coordinate (predicted data); black line: y = x; red dotted line: regression curve of the blue dots. (**a**) only one input variable (DOY); (**b**) two input variables (DOY and Flow); (**c**) all variables.

**Figure 11.** Model fitting results for SupportVector Regression. Blue dots: X coordinate (observed data), Y coordinate (predicted data); black line: y = x; red dotted line: regression curve of the blue dots. (**a**) only one input variable (DOY); (**b**) two input variables (DOY and Flow); (**c**) all variables.

**Figure 12.** Model fitting results for the Multilayer Perceptron Neural Network. Blue dots: X coordinate (observed data), Y coordinate (predicted data); black line: y = x; red dotted line: regression curve of the blue dots. (**a**) only one input variable (DOY); (**b**) two input variables (DOY and Flow); (**c**) all variables.

**Table 1.** Performances of six models (DT: decision trees, RF: random forests, GB: gradient boosting regression, AB: adaptive boosting regression, SVR: support vector regression, MLPNN: multilayer perceptron neural networks) in predicting water temperature.

| Models | Training RMSE (°C) | Training R^{2} | Training NSE | Test RMSE (°C) | Test R^{2} | Test NSE |
|---|---|---|---|---|---|---|
| DT | 0.0167698 | 0.998298 | 0.998295 | 0.394039 | 0.959443 | 0.959352 |
| RF | 0.0513479 | 0.994788 | 0.994694 | 0.203134 | 0.979092 | 0.978487 |
| GB | 2.59 × 10^{−19} | 1 | 1 | 0.308065 | 0.968292 | 0.968119 |
| AB | 0.0980462 | 0.990049 | 0.989885 | 0.264675 | 0.972758 | 0.972145 |
| SVR | 0.3365923 | 0.965837 | 0.960535 | 0.633647 | 0.934781 | 0.922508 |
| MLPNN | 0.1896209 | 0.980754 | 0.980438 | 0.385593 | 0.960312 | 0.958261 |

**Table 2.** Performances of DecisionTree Regressor in modeling water temperature (WT: °C). DT1: only one input variable (DOY), DT2: with two input variables (DOY and Flow), DT3: with all input variables.

| Model Version | Training RMSE (°C) | Training R^{2} | Training NSE | Test RMSE (°C) | Test R^{2} | Test NSE |
|---|---|---|---|---|---|---|
| DT1 | 0.200361 | 0.979664 | 0.979242 | 0.553881 | 0.942991 | 0.941738 |
| DT2 | 0.043619 | 0.995573 | 0.995553 | 0.329739 | 0.966061 | 0.966209 |
| DT3 | 0.01677 | 0.998298 | 0.998295 | 0.359478 | 0.963 | 0.962799 |

**Table 3.** Performances of RandomForest Regressor in modeling water temperature (WT: °C). RF1: only one input variable (DOY), RF2: with two input variables (DOY and Flow), RF3: with all input variables.

| Model Version | Training RMSE (°C) | Training R^{2} | Training NSE | Test RMSE (°C) | Test R^{2} | Test NSE |
|---|---|---|---|---|---|---|
| RF1 | 0.200538 | 0.979646 | 0.979199 | 0.485061 | 0.950074 | 0.948793 |
| RF2 | 0.100807 | 0.989769 | 0.989547 | 0.23512 | 0.9758 | 0.975281 |
| RF3 | 0.051348 | 0.994788 | 0.994694 | 0.203134 | 0.979092 | 0.978487 |

**Table 4.** Performances of GradientBoosting Regressor in modeling water temperature (WT: °C). GB1: only one input variable (DOY), GB2: with two input variables (DOY and Flow), GB3: with all input variables.

| Model Version | Training RMSE (°C) | Training R^{2} | Training NSE | Test RMSE (°C) | Test R^{2} | Test NSE |
|---|---|---|---|---|---|---|
| GB1 | 0.194251 | 0.980284 | 0.979888 | 0.584754 | 0.939813 | 0.938425 |
| GB2 | 0.000596 | 0.99994 | 0.999939 | 0.302442 | 0.968871 | 0.968273 |
| GB3 | 2.59 × 10^{−19} | 1 | 1 | 0.308065 | 0.968292 | 0.968119 |

**Table 5.** Performances of AdaptiveBoosting Regressor in modeling water temperature (WT: °C). AB1: only one input variable (DOY), AB2: with two input variables (DOY and Flow), AB3: with all input variables.

| Model Version | Training RMSE (°C) | Training R^{2} | Training NSE | Test RMSE (°C) | Test R^{2} | Test NSE |
|---|---|---|---|---|---|---|
| AB1 | 0.202579 | 0.979439 | 0.978967 | 0.535933 | 0.944838 | 0.943332 |
| AB2 | 0.15983 | 0.983778 | 0.983445 | 0.291648 | 0.969982 | 0.969293 |
| AB3 | 0.20119 | 0.97958 | 0.979129 | 0.538943 | 0.944528 | 0.943291 |

**Table 6.** Performances of SupportVector Regression in modeling water temperature (WT: °C). SVR1: only one input variable (DOY), SVR2: with two input variables (DOY and Flow), SVR3: with all input variables.

| Model Version | Training RMSE (°C) | Training R^{2} | Training NSE | Test RMSE (°C) | Test R^{2} | Test NSE |
|---|---|---|---|---|---|---|
| SVR1 | 0.288101 | 0.970759 | 0.970421 | 0.323232 | 0.966731 | 0.966283 |
| SVR2 | 0.34541 | 0.964942 | 0.959725 | 0.630718 | 0.935082 | 0.923919 |
| SVR3 | 0.336592 | 0.965837 | 0.960535 | 0.633647 | 0.934781 | 0.922508 |

**Table 7.** Performances of Multilayer Perceptron Neural Network in modeling water temperature (WT: °C). MLPNN1: only one input variable (DOY), MLPNN2: with two input variables (DOY and Flow), MLPNN3: with all input variables.

| Model Version | Training RMSE (°C) | Training R^{2} | Training NSE | Test RMSE (°C) | Test R^{2} | Test NSE |
|---|---|---|---|---|---|---|
| MLPNN1 | 0.274755 | 0.972113 | 0.970939 | 0.311318 | 0.967957 | 0.965799 |
| MLPNN2 | 0.236238 | 0.976023 | 0.975322 | 0.280818 | 0.971096 | 0.969259 |
| MLPNN3 | 0.189621 | 0.980754 | 0.980438 | 0.385593 | 0.960312 | 0.958261 |

**Table 8.** Best-performing version of each model (DT: decision trees, RF: random forests, GB: gradient boosting regression, AB: adaptive boosting regression, SVR: support vector regression, MLPNN: multilayer perceptron neural networks) in predicting water temperature (WT: °C). 1: only one input variable (DOY); 2: two input variables (DOY and Flow); 3: all variables.

| Models (Training) | RMSE (°C) | R^{2} | NSE | Models (Test) | RMSE (°C) | R^{2} | NSE |
|---|---|---|---|---|---|---|---|
| DT3 | 0.017 | 0.998 | 0.998 | DT2 | 0.330 | 0.966 | 0.966 |
| RF3 | 0.051 | 0.995 | 0.995 | RF3 | 0.203 | 0.979 | 0.978 |
| GB3 | 2.6 × 10^{−19} | 1.000 | 1.000 | GB2 | 0.302 | 0.969 | 0.968 |
| AB2 | 0.160 | 0.984 | 0.983 | AB2 | 0.292 | 0.970 | 0.969 |
| SVR1 | 0.288 | 0.971 | 0.970 | SVR1 | 0.323 | 0.967 | 0.966 |
| MLPNN3 | 0.190 | 0.981 | 0.980 | MLPNN2 | 0.281 | 0.971 | 0.969 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Jiang, D.; Xu, Y.; Lu, Y.; Gao, J.; Wang, K.
Forecasting Water Temperature in Cascade Reservoir Operation-Influenced River with Machine Learning Models. *Water* **2022**, *14*, 2146.
https://doi.org/10.3390/w14142146
