Development and Assessment of Water-Level Prediction Models for Small Reservoirs Using a Deep Learning Algorithm

Abstract: In this study, we aimed to develop and assess a hydrological model using a deep learning algorithm for improved water management. Single-output long short-term memory (LSTM SO) and encoder-decoder long short-term memory (LSTM ED) models were developed, and their performances were compared using different input variables. We used water-level and rainfall data from 2018 to 2020 in the Takayama Reservoir (Nara Prefecture, Japan) to train, test, and assess both models. The root-mean-squared error and Nash–Sutcliffe efficiency were estimated to compare the model performances. The results showed that the LSTM ED model had better accuracy. Using both water levels and water-level changes as input variables presented better results than using water levels alone. However, the accuracy of the model was significantly lower when predicting water levels outside the range of the training datasets. Within this range, the developed model could be used for water management to reduce the risk of downstream flooding, while ensuring sufficient water storage for irrigation, because of its ability to determine an appropriate amount of water for release from the reservoir before rainfall events.


Introduction
Over 150,000 small-to-medium-sized irrigation reservoirs exist in Japan. Many of them are small, ranging from several hundred to 100,000 m³. Approximately 70% of them were built before modern times, or over 150 years ago, for irrigation purposes, and the oldest recorded pond was built over 1500 years ago. Many of the irrigation reservoirs have deteriorated, and the number of suitable reservoirs is continuously decreasing. In recent years, natural disasters such as torrential rains and earthquakes have resulted in floods and the collapse of dikes in several reservoirs. Reservoir floods and collapses induced by such disasters have led to increased secondary disasters in downstream areas. Thus, there has been growing interest in monitoring the hydrological parameters of reservoirs and predicting the risk of flooding using modern sensing and simulation technologies [1,2].
A growing number of studies have focused on the development and application of prediction models for large dams or major rivers, considering the size of the associated areas of interest. However, small reservoirs have attracted relatively little interest from administrators and researchers because of the lack of available data and the relatively low number of beneficiaries. Recent developments in modeling techniques and low-cost information and communication technologies have enabled the development of hydrological models for small reservoirs.
Models for water-level predictions are based on two approaches, namely, conceptual or physically based models and data-driven models [3]. The conceptual or physically based models utilize hydrological variables such as evaporation, infiltration rate, and soil moisture [...] have been developed in various fields, although these models require verification [37]. Therefore, in this study, we attempted to construct a simple and practical water-level prediction model using the LSTM ED model for small-to-medium-sized agricultural reservoirs where observations are limited.
We aimed to develop and assess hydrological models using the deep learning algorithm, LSTM ED, for predicting water levels applicable to agricultural reservoirs. An LSTM SO model was also developed, and its performance was compared with that of the LSTM ED model. Thereafter, using the LSTM ED model, we examined outputs from different learning and testing periods with different hydrological data periods. The water level, rainfall data, and discharge events from 2018 to 2020 in the Takayama Reservoir (Nara Prefecture, Japan) were used in our analysis.

Data Collection
To develop the model, data were obtained from the Takayama Reservoir, located in Takayama Town, Ikoma City, Nara Prefecture, Japan (Figure 1). The outflow from the Takayama Reservoir joins the Tomio River, which is a tributary in the Yamato River Basin. The specifications of the Takayama Reservoir are presented in Table 1. The Takayama Reservoir is one of the largest agricultural reservoirs among approximately 4000 reservoirs in Nara Prefecture. It was constructed in 1956 by the prefectural government and is managed by the Kitayamato Land Improvement District, which is a water-user association with a membership of approximately 900 farm households. Reservoir water is used for paddy rice cultivation and is normally distributed from May (during the land-preparation period) to September, which is approximately 2 weeks before harvesting. All drainage water from the paddy flows into the Tomio River, mostly through a downstream residential area. Therefore, the Takayama Reservoir plays a hydrologically important role in preventing flooding in the downstream area.

A water-level sensor and a rain gauge were installed at the periphery of the reservoir (Figure 1) to monitor hourly water levels and precipitation. Information regarding water discharge from the reservoir was provided by the Kitayamato Land Improvement District. These data were collected from 1 July 2018 to 24 July 2020.

LSTM Model Development
In the equations that describe the LSTM cell, $i_t$ is the input gate, $f_t$ is the forget gate, $c_t$ is the cell state at time step $t$, $o_t$ is the output gate, $h_t$ is the output, $\sigma$ is the sigmoid function, $W$ is the weight of the respective neurons, and $b$ is the bias. Long-term memory is made possible by adding (i) the long-term cell state $c_{t-1}$ and (ii) the short-term feature value $h_{t-1}$ to the input gate.
The LSTM SO model expresses $Q_{t+1}$, the predicted water level after 1 h, as the output of LSTM cells that perform the single output. Its inputs $Q_{t-24 \ldots t}$, $R_{t-24 \ldots t}$, and $D_{t-24 \ldots t}$ indicate the observed water level, precipitation, and discharge from time $t-24$ to $t$, respectively. The calculation was repeated 24 times to make continuous predictions for up to 24 h. The left diagram in Figure 2 shows the structure of the LSTM SO model. As shown in Figure 2, past rainfall and water-level changes were fed to the input layer, and the predicted water level at the time step following the last input value was obtained from the output layer. With this basic structure, the output obtained was used as the next input, and repeating this process extended the prediction up to 24 h.
Similarly, the LSTM ED model expresses $Q_{t+1 \ldots t+24}$, the predicted water level from $t+1$ to $t+24$, as the output of LSTM cells performing the decoder function. The decoder inputs $Q_{t+1 \ldots t+23}$, $R_{t+1 \ldots t+23}$, and $D_{t+1 \ldots t+23}$ indicate the predicted water level, precipitation, and discharge from $t+1$ to $t+23$, respectively. $S$ indicates the feature value (state vector) extracted from past information and is given by Equation (8), the right side of which indicates the LSTM cells performing the encoder function; $Q_{t-24 \ldots t}$, $R_{t-24 \ldots t}$, and $D_{t-24 \ldots t}$ indicate the observed water level, precipitation, and discharge from time $t-24$ to time $t$, respectively. With the LSTM ED model, the rainfall and water-level data for the past 24 h are entered into the encoder for each hour. The state vector extracted from the input layer is entered into the decoder. The first LSTM cells generate the weighting of the intermediate layer and return the output value as the input value for the next time step. This process is repeated for up to 24 h to make predictions. In addition, the predicted rainfall and discharge events are entered in the decoder as input variables.
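For reference, the symbols defined above are consistent with the standard LSTM cell, which can be sketched as follows. This is the common textbook form and is an assumption here, so it may differ in detail from the numbered equations of the study. The input vector $x_t$, the recurrent weights $U$, the candidate state $\tilde{c}_t$, and the element-wise product $\odot$ are notation introduced for this sketch, and $f_{\mathrm{SO}}$, $f_{\mathrm{enc}}$, and $f_{\mathrm{dec}}$ are hypothetical names for the LSTM mappings of the two models.

```latex
\begin{align*}
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh\left(c_t\right) \\[6pt]
Q_{t+1} &= f_{\mathrm{SO}}\left(Q_{t-24 \ldots t},\, R_{t-24 \ldots t},\, D_{t-24 \ldots t}\right) \\
S &= f_{\mathrm{enc}}\left(Q_{t-24 \ldots t},\, R_{t-24 \ldots t},\, D_{t-24 \ldots t}\right) \\
Q_{t+1 \ldots t+24} &= f_{\mathrm{dec}}\left(S,\, Q_{t+1 \ldots t+23},\, R_{t+1 \ldots t+23},\, D_{t+1 \ldots t+23}\right)
\end{align*}
```

In this form, the forget gate scales the previous cell state $c_{t-1}$ and the input gate scales the new candidate, which is how the cell combines long-term and short-term information.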
The difference between the two models is that, for the LSTM ED model, the output is linked to the middle layers, whereas the LSTM SO model does not have such a linkage. A model that predicts the water level 1 h later using the observed data for current and past time points functions as a single unit, and continuous prediction is possible by inputting the predicted value to the next unit.
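The repeated one-step prediction of the single-output approach can be sketched as below. This is a minimal illustration, not the implementation used in the study: `recursive_forecast` and `toy_model` are hypothetical names, the stand-in model is a simple rule rather than a trained LSTM, and supplying rainfall and discharge values for the future hours is an assumption about how the input window is advanced.

```python
import numpy as np

def recursive_forecast(one_step_model, history, future_rain, future_discharge, horizon=24):
    """Roll a one-step-ahead model forward for `horizon` hours.

    history: array of shape (n_lags, 3) with columns
             [water level, rainfall, discharge flag] for the most recent hours.
    one_step_model: callable mapping such a window to the next water level.
    """
    window = np.asarray(history, dtype=float).copy()
    preds = []
    for k in range(horizon):
        next_level = float(one_step_model(window))
        preds.append(next_level)
        # Feed the prediction back in as the newest row and drop the oldest row.
        new_row = np.array([next_level, future_rain[k], future_discharge[k]])
        window = np.vstack([window[1:], new_row])
    return np.array(preds)

def toy_model(window):
    # Hypothetical stand-in for a trained network:
    # last level plus a small response to the last hour of rain.
    return window[-1, 0] + 0.01 * window[-1, 1]

history = np.zeros((24, 3))
history[:, 0] = 220.0  # constant water level, no rain, no discharge
forecast = recursive_forecast(toy_model, history, np.ones(24), np.zeros(24))
```

Because each prediction becomes an input to the next step, any error made early in the horizon is carried forward, which is one plausible reason for errors to grow over longer lead times in this kind of scheme.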

LSTM Model Assessment
We examined the LSTM SO and LSTM ED models with different input and output variables, as shown in Table 2. In many cases, the catchment area and the relationship between the storage capacity and water level are not available for small reservoirs. Without this information, it is not possible to accurately estimate the relationship between the water level and inflow to the reservoir. Furthermore, from a management perspective, it is desirable to predict the water level as directly as possible from simple observation data. Therefore, we attempted to predict the water level from simple input variables, such as the changes in the water level, rainfall intensity at a single observation point, and the occurrence of a discharge event. The water level fluctuates considerably throughout the year; thus, if the water level is simply used as an output, the changes in the water level due to rainfall may appear negligible. Therefore, the amount of change in the water level was used as the input variable. However, in this case, the relationship between the inflow amount and the water level may not be reflected well because of the influence of the geometry of the reservoir, which is normally an inverted cone shape. Therefore, we considered solving both problems by using both the water level and changes in the water level as input variables, and water-level changes as the output variable.
To this end, we prepared the LSTM SO1 and ED1 models, which used both water level and water-level change as input variables. In addition, the LSTM SO2 and ED2 models were constructed, which used only the water level as both input and output variables, whereas the LSTM SO3 and ED3 models used only the water-level change as an input variable for comparison purposes. Table 3 shows the input parameters used in the models. The same parameters were used for all models except for the number of epochs. The number of epochs indicates the number of cycles through the training dataset and differed among models because it was determined using the early stopping method to avoid overfitting. Validation with input lags of 1 to 24 h revealed no further effect on accuracy beyond 15 h. Therefore, the input length for the LSTM ED and LSTM SO models was set to 24 h, as small-scale reservoirs have short time lags in rainfall runoff. The output length for the LSTM ED models was set to 24 h, and that for the LSTM SO models was set to 1 h, with the 1 h prediction repeated for up to 24 h. The number of hidden units, which indicates the number of LSTM cells, was set to 20. The batch size (i.e., the number of samples processed together in each training step to enable parallel processing and increase training efficiency) was validated with values from 1 to 512 and set to 64, which presented the best performance. The Adam optimizer [39], which is based on the stochastic gradient-descent method, was used as the optimization function. Adam is computationally efficient and is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients [39]. The mean squared error, which is widely used as a loss function, was employed as the loss function.
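The 24 h input and 24 h output lengths described above imply a sliding-window arrangement of the hourly series. The following sketch shows one way such training samples could be assembled for a model that takes both water level and water-level change as inputs and predicts water-level change. The function name `make_windows`, the column order, and the zero used for the first hourly change are assumptions for illustration; the future rainfall and discharge inputs of the decoder are omitted for brevity.

```python
import numpy as np

def make_windows(level, rain, discharge, n_in=24, n_out=24):
    """Build (input, target) samples from hourly series of equal length.

    Inputs per hour: [water level, water-level change, rainfall, discharge flag].
    Targets: water-level change for the following n_out hours.
    """
    level = np.asarray(level, dtype=float)
    # Hourly change; the first value has no predecessor and is set to 0.
    change = np.diff(level, prepend=level[0])
    features = np.column_stack([level, change, rain, discharge])
    X, y = [], []
    for t in range(n_in, len(level) - n_out + 1):
        X.append(features[t - n_in:t])   # past n_in hours
        y.append(change[t:t + n_out])    # next n_out hours
    return np.array(X), np.array(y)

hours = 100
level = 220.0 + 0.01 * np.arange(hours)  # synthetic, steadily rising level
X, y = make_windows(level, np.zeros(hours), np.zeros(hours))
```

With 100 hourly records, this produces 53 samples, each pairing a 24 h input block with the following 24 h of water-level changes.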

Ensemble Learning Method
In this study, we used the ensemble learning method. An ensemble consists of a set of individually trained models whose outputs are combined when predicting novel instances [40]. In deep learning, each neuron is weighted during learning, and the differences in the initial weighting value affect the weighting after the final learning step. Ensemble learning using different initial weightings is a simple and effective method for neural networks [40]. We applied the ensemble-learning method by randomly determining the initial weighting to construct 20 individually trained models.
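Combining the outputs of the individually trained members can be sketched as follows. This is a minimal example under stated assumptions: `summarize_ensemble` is a hypothetical helper, the member predictions are synthetic random numbers rather than model outputs, and the band of mean ± 1.96 standard deviations is a normal-approximation choice that may differ from how the intervals in the study were computed.

```python
import numpy as np

def summarize_ensemble(member_preds, z=1.96):
    """Combine member forecasts of shape (n_members, horizon).

    Returns the ensemble mean and an approximate 95% band (mean +/- z * std).
    """
    preds = np.asarray(member_preds, dtype=float)
    mean = preds.mean(axis=0)
    std = preds.std(axis=0, ddof=1)  # sample standard deviation across members
    return mean, mean - z * std, mean + z * std

# Synthetic stand-in for 20 members, each predicting 24 hourly water levels.
rng = np.random.default_rng(0)
members = 224.0 + 0.05 * rng.standard_normal((20, 24))
mean, lower, upper = summarize_ensemble(members)
```

A wide band at a given lead time indicates that the members disagree there, which is the kind of information the ensemble is meant to expose.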

Comparing the Performances of Different Models
The root-mean-squared error (RMSE) was used to evaluate the models. The RMSE can be used to estimate the absolute error between the predicted and measured values and can be expressed as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Y_{\mathrm{obs},i} - Y_{\mathrm{pre},i}\right)^{2}}$$

where $Y_{\mathrm{obs},i}$ and $Y_{\mathrm{pre},i}$ are the observed and predicted water-level changes at time step $i$, respectively, and $N$ is the total number of time steps. The Nash–Sutcliffe efficiency (NSE) is an index that evaluates how well a model reproduces the size of the observed variation. Its range is −∞ to 1, and values closer to 1 reflect greater accuracy. If the NSE index is 0.7 or higher, then the reproducibility of the model is considered high. The NSE is expressed using the following equation:

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{N}\left(Y_{\mathrm{obs},i} - Y_{\mathrm{pre},i}\right)^{2}}{\sum_{i=1}^{N}\left(Y_{\mathrm{obs},i} - Y_{\mathrm{ave}}\right)^{2}}$$

where $Y_{\mathrm{ave}}$ is the average water level observed at all time points.
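The two metrics described above can be computed with a few lines of NumPy. The sketch below uses the standard definitions of the RMSE and the NSE; the function names and the four-point sample arrays are illustrative only.

```python
import numpy as np

def rmse(obs, pre):
    """Root-mean-squared error between observed and predicted values."""
    obs, pre = np.asarray(obs, dtype=float), np.asarray(pre, dtype=float)
    return float(np.sqrt(np.mean((obs - pre) ** 2)))

def nse(obs, pre):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance sum."""
    obs, pre = np.asarray(obs, dtype=float), np.asarray(pre, dtype=float)
    residual = np.sum((obs - pre) ** 2)
    variation = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - residual / variation)

obs = np.array([1.0, 2.0, 3.0, 4.0])
pre = np.array([1.0, 2.0, 3.0, 5.0])
rmse_value = rmse(obs, pre)  # single 1.0 error over 4 points
nse_value = nse(obs, pre)
```

A perfect prediction gives an NSE of 1, and a prediction equal to the observed mean at every time step gives an NSE of 0, so negative values indicate a model that performs worse than the mean.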
In this study, we first compared the LSTM ED and LSTM SO models with different input and output variables using the above two evaluation equations and clarified how the structural differences affected the predictions. Next, the water-level input methods were compared using the LSTM ED1, LSTM ED2, and LSTM ED3 models, assuming that the water level was predicted in a situation where there was a record of the spill event, but there was no quantitative spill data. The performance of the models when using water-level changes and water levels as input variables was also compared.
To confirm that a given model could reproduce the relationship between the water level and reservoir capacity, an analysis was performed to simulate water-level changes after a rainfall event with different initial water levels.
The above analyses were conducted to evaluate the comparative advantages of the LSTM ED models and assess the performance of the developed model with minimal hydrological information.

Field Data
An overview of the data obtained in this study is presented in Figure 3. The maximum water level was equal to the height of the spillway, which is 225 m above sea level. Spill water was observed in both 2018 and 2020. The water-level data could not be obtained from 20 October to 5 November 2018, because of a problem with the data-transmission device. Table 4 summarizes the data obtained during the study.

Comparison of the LSTM Models
Figure 4 shows boxplots representing the results of each set of 20 ensemble learnings (obtained for making predictions) over a 24 h observation period. The figure shows the RMSE and NSE values for the LSTM SO1-SO3 and LSTM ED1-ED3 models using data obtained from 2 July to 23 September 2019. With the short-term predictions (approximately 10 h), no significant difference in the RMSE was observed, but the RMSE of the SO models increased as the predictions were generated over longer periods. This result was similar to that of a previous study [37].
Figure 4a-c shows that, especially with the LSTM SO model, changing the input variables did not lead to a significant reduction in errors. The LSTM ED1 model showed the smallest prediction error at 24 h, with an RMSE of 0.07. Compared with the SO2 and ED2 models, the errors were considerably lower with the SO3 and ED3 models, where the water-level difference was set as an input variable and the output was directly set as the water level. In addition, the smallest errors were found with the ED1 model (RMSE at 24 h of 0.07), where both water level and amount of change in water level were used as input variables. The SO1 and ED1 models had the lowest errors among the three models of each type, SO and ED. The difference in error between these two models was 41%. The percentage decline in the RMSE values was larger in the ED models than in the SO models. In addition, the mean absolute error, which is used in the evaluation of hydrological models, was also evaluated, and the results were similar to those for the RMSE. Figure 4d-f presents boxplots for the computed NSE indexes, which were generated using the test data. The ED1 model presented the best results, with a maximum average NSE index of 0.99. The ED2 model showed large variations in the NSE values.
Therefore, the ED models were considered to provide higher accuracy with both water level and water-level change set as input variables, instead of using either the water level or the water-level change as a single variable.

Relationship between Water-Level Changes and the Reservoir Capacity
To examine the capability of our model to characterize the relationship between the simulated water level and the reservoir geometry, the changes in the water level from different initial water levels were computed. The rainfall data from 26 to 27 July 2019 (total rainfall of 16 mm, which is considered typical for the region) were used, and the water-level changes were computed using 0.5 m differences in the initial water level (Figure 5). The red and black lines represent the observed and simulated water-level changes, respectively. The other colored lines are the simulated water-level changes starting from the different initial water levels. The simulated water-level changes agreed with the observed water-level changes when the simulated and observed water levels were initially the same, and the results showed good reproducibility and accuracy. In addition, a small water-level change was predicted if the inflow started at a high water level, and a large water-level change was predicted if the inflow started at a low water level. The findings reveal that the LSTM ED models learned the relationship between the water level and water-level change and that, by knowing the cross-sectional geometry, they could estimate the inflow volume to the reservoir more accurately.
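The dependence of the water-level change on the initial water level follows directly from the reservoir shape. As a purely geometric illustration (not the Takayama Reservoir itself), the sketch below assumes an idealized inverted cone whose stored volume grows as $V(h) = k h^3$ with depth $h$. The shape coefficient `k`, the inflow volume, and the depth range are hypothetical values chosen only to show the trend.

```python
import numpy as np

def level_rise(initial_depth, inflow_volume, k=100.0):
    """Depth increase in an idealized inverted cone with volume V(h) = k * h**3.

    The same inflow volume is added at every initial depth.
    """
    h0 = np.asarray(initial_depth, dtype=float)
    h1 = np.cbrt(h0 ** 3 + inflow_volume / k)  # solve k*h1^3 = k*h0^3 + inflow
    return h1 - h0

depths = np.arange(2.0, 6.5, 0.5)  # initial depths in 0.5 m steps
rises = level_rise(depths, inflow_volume=5000.0)
```

The computed rise shrinks monotonically as the initial depth increases, because the water surface is wider at higher levels, which matches the qualitative behavior described for the simulated water-level changes.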

Assessment of the LSTM ED1 Model
An analysis was performed to further assess the predictions made using the LSTM ED1 model. The observed data were divided into 10 periods (Table 5). All data, except for one period assigned to the test data, were used for training and validation. As the time lag between the start of rainfall events and water-level changes was approximately 3 h, a rainfall event shown in Table 5 was defined as one where the cumulative rainfall was 5 mm or more, and there was no rainfall for 4 h between the events. Overflows were observed from the spillway during periods 1, 9, and 10. The largest rainfall event (event 1) was observed from 4 to 6 July 2019 in period 1. The second-largest rainfall event occurred from 15 to 16 August 2019, in period 6. No overflow occurred during this period, and event 2 (occurring during period 6) was the largest rainfall event without overflow. The third-largest rainfall event occurred from 18 to 19 June 2020 in period 10 (event 3). No overflow occurred during the rainfall event.
Figure 6 shows the three largest cumulative rainfall observations among the test data. The red dotted lines show the observed water levels, the green dotted lines show the averages of the predicted values of the ensemble members, and the green shaded areas represent the ranges of the 95% confidence intervals of the predicted values of the ensemble members. Event 1 was a simulation of the largest rainfall event when the water reached full capacity and overflow occurred from the spillway. The range of the 95% confidence interval was larger when the water level fluctuated after rainfall, indicating the influence of the initial weighting in the LSTM model. This presented a prediction limitation when the rainfall intensity was larger than the range of the training datasets. In general, variable outputs in ensemble learning indicate overfitting against the training dataset in deep learning [41]. This variation was caused by predictions made using an untrained rainfall event. Therefore, ensemble learning is a valuable technique for ascertaining the range of possible predicted water levels. Even when overflowing was above the spill level of 225 m, the observed water levels did not increase significantly, but the simulated results showed significant increases in the water levels. Only two overflow events were observed, which indicates that water-level fluctuations in the case of overflow could not be predicted well. This implies the need for more training data during overflow events. Similarly, event 2 was the largest rainfall event without overflow. Although the prediction accuracy was better than that for event 1, the variation in the predicted value was larger when the water level fluctuated. In contrast, with event 3, the variation of the predicted value was considerably smaller than those of events 1 and 2, even during a large rainfall event with a small number of observations within the learning period, and the average predicted value after the ensemble was also close to the measured value.
Therefore, the LSTM ED models performed reasonably well when predicting water levels within the range of the training dataset. In the case of a rainfall event or overflow that fell outside the range of the training data values, the variation in prediction was significant. In any case, ensemble learning to assess the variation of outputs can feasibly improve the reliability of hydrological models by presenting a range of outputs. Figure 7 shows an example of the application of the LSTM ED model for different water levels, with or without a prior discharge before the rainfall event. The model provides helpful information regarding the timing and duration of discharge to enable rainwater collection without overflow from the reservoir. The model can quantitatively identify the effect of water discharge to increase the storage capacity before rainfall events occur.
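The rainfall-event definition used for Table 5 (cumulative rainfall of 5 mm or more, with events separated by 4 h without rainfall) can be expressed as a short segmentation routine. The sketch below is one possible reading of that rule, in which a new event starts after at least four consecutive dry hours; the function name `find_rain_events` and the sample hourly series are hypothetical.

```python
def find_rain_events(rain, min_total=5.0, min_dry_hours=4):
    """Split an hourly rainfall series (mm) into events.

    Returns a list of (start_index, end_index, total_mm) with inclusive indices,
    keeping only events whose cumulative rainfall reaches min_total.
    """
    spans = []
    start = last_wet = None
    for t, r in enumerate(rain):
        if r > 0:
            if start is None:
                start = t
            elif t - last_wet - 1 >= min_dry_hours:
                # Enough dry hours have passed: close the previous event.
                spans.append((start, last_wet))
                start = t
            last_wet = t
    if start is not None:
        spans.append((start, last_wet))
    events = []
    for s, e in spans:
        total = sum(rain[s:e + 1])
        if total >= min_total:
            events.append((s, e, total))
    return events

hourly_rain = [0, 2, 3, 0, 0, 1, 0, 0, 0, 0, 6, 0, 0, 0, 0, 1]
events = find_rain_events(hourly_rain)
```

In this sample, the short dry spell inside the first wet period does not split the event, whereas the two four-hour dry gaps do, and the final 1 mm hour is discarded because it falls below the 5 mm threshold.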

Figure 7.
Example application of the developed model for predicting water-level changes when water was released before a forecasted rainfall event.

Conclusions
To manage reservoir operations, it is important to store enough water for irrigation, while reducing the risk of flooding in the downstream area. The latter is especially important in the context of climate change, which may be associated with an increasing frequency of heavy rainfall. Deep learning models can be used for water-level predictions in reservoir management. These models are preferable to simple hydrological models. The LSTM model can potentially be used as a tool for predicting the water levels of small reservoirs with limited available hydrological variables for training, such as water level, rainfall, and discharge events, without the inflow and outflow data. In addition, it may be possible to estimate the storage capacity and the area-volume-elevation curve without field surveys, based on the computed relationship between the water level and water-level change.
In this study, the output of a model with the encoder-decoder structure was compared with that of a common single-output LSTM model. The encoder-decoder structure in the LSTM model provided better simulation results, especially for predictions made over longer durations. In predicting the water level, our results showed similar trends to those of the previous study on runoff analysis using an encoder-decoder model [37]. Further analysis showed that the model accounted for the geometry of the reservoir's cross-section, based on time-dependent water-level information.
On the other hand, the LSTM ED model required more than twice the computation time of the LSTM SO model. As the LSTM ED model has a more complex structure than the LSTM SO model, it is likely that more time was needed per training epoch and for the loss to converge. To reduce the learning cost, the extreme learning machine model has been studied in hydrological models [10,11,42] and could be applied to this problem.
Furthermore, the ensemble learning technique demonstrated a range of simulation errors that could be useful for understanding the ability of model predictions. In addition, ensemble learning can assess the reliability of hydrological models by demonstrating the variation in simulated outputs. The errors can also be significant when the forecasted rainfall intensity is outside the range of the training data. To improve the accuracy, the accumulation of more training data for deep learning architectures or the application of transfer learning [43] should be considered for further studies. Management conditions of small reservoirs are rarely studied and are different from those of dams and large reservoirs. Therefore, further technological development and research focused on small reservoirs will be important.