A Study on the Optimal Deep Learning Model for Dam Inflow Prediction

Abstract: In the midst of climate change, the need for accurate predictions of dam inflow is increasing, both to reduce flood damage and to secure a stable water supply. This study presents the process and method of selecting the optimal deep learning model for predicting dam inflow using hydrologic data from the past 20 years. The study area is Andong Dam and Imha Dam, located upstream of the Nakdong River in South Korea. To select the optimal model for predicting the inflow of the two dams, sixteen scenarios (2 × 2 × 4) were generated considering two dams, two climatic conditions, and four deep learning models. During the drought period, the RNN for Andong Dam and the LSTM for Imha Dam were selected as the optimal models, with the smallest differences from the observations, 4% and 2%, respectively. In typhoon conditions, the GRU for Andong Dam and the RNN for Imha Dam were selected as the optimal models. In the case of Typhoon Maemi, the GRU and the RNN showed differences of 2% and 6% from the observed maximum inflow, respectively. The optimal recurrent neural network-based models selected in this study predicted the observed inflow more closely than the Storage Function Model (SFM), which is currently used to predict the inflow of both dams. For the two dams, different optimal models were selected according to watershed characteristics and rainfall under drought and typhoon conditions. In addition, most of the deep learning models were more accurate than the SFM under various typhoon conditions, but the SFM showed better results under certain conditions. Therefore, for efficient dam operation and management, it is necessary to make a rational decision by comparing the inflow predictions of the SFM and the deep learning models.


Introduction
Due to extreme climate change, accurate analysis of water resources is increasingly demanded for stable water supply and flood damage mitigation. Among various research subjects, dam inflow is an important element in establishing plans for coping with drought and flooding and for operating the dam. The major factors affecting the inflow are climatic factors, including rainfall (the most influential), temperature, and wind speed, as well as topographical factors such as the basin area and the height of the slope [1]. Recently, however, local rainfall events that are difficult to predict have occurred frequently nationwide. In particular, at Andong and Imha Dams in 2015, the inflow decreased to one-third of the average inflow over the past 20 years, and in 2017 and 2018 the discharge rates were adjusted because the dams entered the drought "attention stage." In addition, in 2020, due to the prolonged rainy season, the inflow increased to more than 40%, and floodgate discharge was therefore performed at Andong Dam for the first time in 17 years. It is thus important to predict the inflow more accurately and quickly for the two dams, whose drought and flood conditions change frequently from year to year. The reason for this study is that Andong Dam and Imha Dam are important dams that account for 50% of ...

In this study, deep learning models were used to predict the inflow of Andong and Imha Dams in the Nakdong River watershed in Korea. To build an optimal prediction model based on inflow and rainfall data from the past 20 years, accuracy and reliability were evaluated by generating various scenarios according to the input variables. In addition, the RNN models were applied because dam inflow is time series data and the learning efficiency of the existing ANN model decreases as the number and period of data increase. The prediction model derived from this study is expected to contribute to stable dam operation and management and to disaster response.

ANN and RNNs
In this study, the ANN model and the RNN models were compared and analyzed to derive an optimal model for predicting dam inflow. The flow chart of this study is shown in Figure 1. Deep learning is a class of machine learning algorithms built with deeper structures than conventional neural networks. It can estimate non-linear relationships between input variables and can outperform traditional machine learning algorithms. In machine learning, humans feed the computer a large amount of information and the computer then makes predictions, whereas in deep learning the computer learns and predicts without being specifically taught by humans. The activation functions typically used in the hidden layers of deep learning are the sigmoid, tanh (hyperbolic tangent), and Rectified Linear Unit (ReLU) functions. The sigmoid function is the logistic function, with values between 0 and 1, and is used for simple classification problems. The tanh function takes values between −1 and 1; as the input moves away from the center value, the gradient is lost during backpropagation. The ReLU function addresses this gradient loss problem; it treats all values below 0 as 0, which stops learning for those values [14].

The RNN is a model specialized for processing ordered data. It is mainly applied to time series data, and the previous output is cycled back into the input. Equation (1), the hidden layer calculation of the Convolutional Neural Network (CNN), which processes grid data such as images, can be compared with Equation (2), the hidden layer calculation of the RNN.
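The recurrence described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the standard recurrent update h_t = activation(U x_t + W h_{t−1} + b) built from the symbols the text defines later (U, W, b, x_t, h_t); it is not necessarily the exact form of Equation (2), and the layer sizes and random weights are purely illustrative.

```python
import numpy as np

def relu(z):
    """Rectified Linear Unit: negative inputs are mapped to zero."""
    return np.maximum(z, 0.0)

def rnn_step(x_t, h_prev, U, W, b, activation=np.tanh):
    """One recurrent update: h_t = activation(U x_t + W h_prev + b)."""
    return activation(U @ x_t + W @ h_prev + b)

def rnn_forward(xs, U, W, b, activation=np.tanh):
    """Run the recurrence over a whole sequence; return the last hidden state."""
    h = np.zeros(W.shape[0])
    for x_t in xs:
        h = rnn_step(x_t, h, U, W, b, activation)
    return h

# Illustrative sizes only: 2 inputs per step (rainfall, inflow), 4 hidden units.
rng = np.random.default_rng(0)
n_in, n_hidden, seq_len = 2, 4, 21
U = rng.normal(scale=0.5, size=(n_hidden, n_in))      # input weights
W = rng.normal(scale=0.5, size=(n_hidden, n_hidden))  # cyclic (recurrent) weights
b = np.zeros(n_hidden)                                # bias
xs = rng.random((seq_len, n_in))                      # synthetic inputs in [0, 1)

h_tanh = rnn_forward(xs, U, W, b, np.tanh)  # bounded in [-1, 1]
h_relu = rnn_forward(xs, U, W, b, relu)     # non-negative, unbounded above
```

Swapping the `activation` argument is the only change needed to compare tanh and ReLU hidden units on the same sequence.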
The RNN weighs each data point individually to determine its importance and memorizes it while moving on to the next data point, but information about the distant past is gradually lost in the hidden layer. The LSTM supplements the RNN with a separate memory cell to address this [15]. The LSTM is one of the RNN models and is composed of a forget gate, an input gate, and an output gate. To solve the gradient loss problem that occurs as the time difference increases in the RNN model, the LSTM model introduces a cell. Information is stored in this cell, which prevents the stored information from being lost during the analysis. The gates serve as filters that allow unnecessary information to be forgotten or necessary information to be stored and passed through the cell. This is represented by Equations (3)-(6). The forget gate determines how much of the past data will be forgotten, and the input gate estimates the important values among the incoming data. The output gate is used to keep information from past data and to make the prediction at the same time.

In Equations (3)-(6), σ is the activation function, U is the input weight, W is the cyclic weight, h_{t−1} is the previous stage output, h_t is the new output value, x_t is the current input vector, and b is the bias.

In addition, the Gated Recurrent Unit (GRU) has a structure improved to process faster than the LSTM [16]. The GRU consists of a reset gate and an update gate, which gives it fewer weights to learn; it therefore achieves faster processing with performance similar to that of the LSTM. The reset gate determines the proportion of past data to remove, and the update gate, like the forget gate of the LSTM, determines how much past data to discard and selects only one of the memory data at t − 1 and t.
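For orientation, the gate structure described above is commonly written as follows. This is the widely used textbook formulation, given here as a sketch only: the per-gate subscripts on U, W, and b, the cell state c_t, and the candidate values (marked with a tilde) are notation assumptions, and the form may differ in detail from the paper's Equations (3)-(6).

```latex
% LSTM (standard formulation; \odot is element-wise multiplication)
\begin{aligned}
f_t &= \sigma(U_f x_t + W_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(U_i x_t + W_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(U_o x_t + W_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(U_c x_t + W_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}

% GRU (standard formulation)
\begin{aligned}
r_t &= \sigma(U_r x_t + W_r h_{t-1} + b_r) && \text{(reset gate)} \\
z_t &= \sigma(U_z x_t + W_z h_{t-1} + b_z) && \text{(update gate)} \\
\tilde{h}_t &= \tanh(U_h x_t + W_h (r_t \odot h_{t-1}) + b_h) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

The GRU carries three weight sets against the LSTM's four and has no separate cell state, which is consistent with the lower number of weights noted in the text.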

The Storage Function Model (SFM)
The SFM is a rainfall-runoff model that calculates the runoff from the watershed using the reservoir storage and rainfall as the main input variables, taking impervious area, infiltration, and groundwater into account. The model makes the basic assumption that stream channels (I ∼ O) have a downward slope and that the watershed receives the same amount of precipitation (R_ave), as shown in Figure 2. The runoff from the watershed is calculated by Equation (7) [17],
where f_1 is the primary runoff rate (dimensionless), A is the watershed area (km²), q_f is the unit runoff height of the runoff area (mm/day), q_s is the unit runoff height of the infiltration area (mm/day), f_sa is the unit runoff in seepage areas directly infiltrating groundwater, and q_b is the base runoff (m³/s).

Korea Water Resources Corporation (K-water) operates dams through inflow prediction using the SFM, and the parameters of the SFM corresponding to each dam are optimized in consideration of the characteristics of the dam basin [17].

Study Area
Sufficient training data are required to calculate dam inflow using deep learning. In this study, Andong Dam and Imha Dam on the Nakdong River were selected as the study areas among the multi-purpose dams in Korea that have collected hydrological data for more than 20 years and have the largest water supply and storage capacity in the water system. The locations of Andong Dam and Imha Dam are shown in Figure 3.
Andong Dam was completed in 1976, with a basin area of 1584 km² and a total water storage capacity of 1248 × 10⁶ m³. It was built to reduce flood damage by utilizing 110 × 10⁶ m³ of flood control capacity and facilities. It is responsible for supplying 926 × 10⁶ m³ of water annually, including the Nakdong River's living water, industrial water, and river maintenance flow. Imha Dam was completed in 1993 and has a basin area of 1361 km² and a total storage capacity of 595 × 10⁶ m³. It is a central cutoff-wall type rockfill dam, 73.0 m high and 515.0 m long, built to prevent flood damage in the middle and lower reaches of the Nakdong River and to supply water to the Nakdong River and the southeast coast areas. It supplies 615.3 × 10⁶ m³ of water annually, including living water, industrial water, and river maintenance flow (Table 1).


Database Building
In this study, the time series period required to compare and analyze the four models (ANN, RNN, LSTM, and GRU) was set from 2001 to 2020, and the inflow prediction model was built using the inflow and precipitation data of Andong and Imha Dams for this period. The equations for daily and hourly inflow are shown in Equations (8) and (9). Rainfall data collected from nine rainfall observatories in the Andong Dam basin and eight rainfall observatories in the Imha Dam basin were used.

$$\text{Daily inflow}\ (\mathrm{m^3/s}) = \frac{\text{Water storage (at 24:00 today} - \text{at 24:00 the day before)} \times 10^{6}}{60 \times 60 \times 24} + \text{Daily average outflow} \tag{8}$$

$$\text{Hourly inflow}\ (\mathrm{m^3/s}) = \frac{\text{Water storage (at fixed time} - \text{at 1 h ago)} \times 10^{6}}{60 \times 60} + \text{Hourly average outflow} \tag{9}$$

Considering the inflow of Andong and Imha Dams from 2001 to 2020, the annual inflow of Andong Dam differed by a factor of almost six between 2003 and 2015. The inflows of Andong and Imha Dams during the flood period account for approximately 2/3 of the average annual inflows, and the precipitation and inflow differ between specific periods, such as the normal season and the drought and flood periods. Therefore, when selecting the optimal model later, it is necessary to analyze the data after dividing them into the normal season and the drought and flood periods. Figure 4 shows the rainfall and inflow of the Andong Dam watershed for 20 years.
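Equations (8) and (9) share one form: the storage change over the interval, converted to a flow rate, plus the average outflow over the same interval. A minimal sketch follows, assuming storage is reported in 10⁶ m³ (as the × 10⁶ factor suggests) and outflow in m³/s; the function name and the numbers are hypothetical and are not taken from the dam records.

```python
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 60 * 60 * 24

def inflow_m3s(storage_now, storage_before, mean_outflow, seconds):
    """Inflow (m^3/s) = storage change over the interval + average outflow.

    storage_now, storage_before: reservoir storage in 10^6 m^3 (assumed unit)
    mean_outflow: average outflow over the interval in m^3/s
    seconds: interval length (86400 for Eq. (8), 3600 for Eq. (9))
    """
    return (storage_now - storage_before) * 1e6 / seconds + mean_outflow

# Hypothetical values: storage rises by 1.728 x 10^6 m^3 in one day
# while 10 m^3/s is released on average.
daily = inflow_m3s(500.0, 498.272, 10.0, SECONDS_PER_DAY)

# Hypothetical values: storage rises by 0.036 x 10^6 m^3 in one hour.
hourly = inflow_m3s(500.0, 499.964, 10.0, SECONDS_PER_HOUR)
```

With these made-up inputs the daily case gives about 30 m³/s (20 from the storage change plus 10 released) and the hourly case about 20 m³/s.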

There were four releases through the Andong-Imha connection tunnel from 2019 to 2020. The corresponding discharge was calculated as the inflow of Andong Dam and was therefore excluded during data preprocessing. Since the range of the inflow and precipitation data is wide, the data were normalized to values between 0 and 1 by Min-Max Scaling. In addition, the data for 20 years were divided into a training set, a validation set, and a testing set in a 5:3:2 ratio, as shown in Figure 5.
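The normalization and the 5:3:2 split can be sketched as below. Two points are assumptions rather than statements from the text: the split is taken to be chronological (which is consistent with the 2017-2020 test period used later), and the function names are illustrative.

```python
import numpy as np

def min_max_scale(values, v_min=None, v_max=None):
    """Scale values to [0, 1]; pass v_min/v_max to reuse another set's range."""
    values = np.asarray(values, dtype=float)
    v_min = values.min() if v_min is None else v_min
    v_max = values.max() if v_max is None else v_max
    return (values - v_min) / (v_max - v_min)

def split_532(series):
    """Chronological 5:3:2 split into training, validation, and test parts."""
    n = len(series)
    i_train = int(n * 0.5)
    i_val = int(n * 0.8)
    return series[:i_train], series[i_train:i_val], series[i_val:]

# Stand-in series: one value per year, 2001-2020.
years = np.arange(2001, 2021)
train, val, test = split_532(years)

scaled = min_max_scale([5.0, 10.0, 25.0])
```

On the yearly stand-in series, the split yields 2001-2010 for training, 2011-2016 for validation, and 2017-2020 for testing.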

Input and Output Predictors
In this study, precipitation and dam inflow at previous time steps were used as input data to predict the dam inflow. The number of previous time steps considered is set by the sequence hyperparameter described later. For example, if the sequence is 21, 21 precipitation values (P_t, P_{t−1}, ···, P_{t−20}) and 21 dam inflows (Q_t, Q_{t−1}, ···, Q_{t−20}) are considered simultaneously. P_t and Q_t are the precipitation and dam inflow at the current time, respectively; Q_{t+1} is the dam inflow at the next time step, which is to be predicted; and P_{t−1} and Q_{t−1} are the precipitation and dam inflow at the previous time step that are considered for predicting the dam inflow. Figure 6 shows a schematic diagram of the input and output data of the model with sequence 21.
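The input and output arrangement described above corresponds to a sliding window over the two series. The sketch below is one way to build it, assuming one-step-ahead prediction and a (samples, sequence, 2) input layout; the function name and the synthetic series are illustrative only.

```python
import numpy as np

def make_windows(precip, inflow, sequence):
    """Build inputs (P_t ... P_{t-seq+1}, Q_t ... Q_{t-seq+1}) and targets Q_{t+1}.

    Returns X with shape (samples, sequence, 2) and y with shape (samples,).
    Within each window, rows run from the oldest step to the current step t.
    """
    precip = np.asarray(precip, dtype=float)
    inflow = np.asarray(inflow, dtype=float)
    X, y = [], []
    # t is the index of the "current" step; the target is the next step.
    for t in range(sequence - 1, len(inflow) - 1):
        start = t - sequence + 1
        window = np.column_stack([precip[start:t + 1], inflow[start:t + 1]])
        X.append(window)
        y.append(inflow[t + 1])
    return np.array(X), np.array(y)

# Synthetic series of 30 steps, just to show the shapes.
P = np.arange(30.0)
Q = np.arange(100.0, 130.0)
X, y = make_windows(P, Q, sequence=21)
```

With 30 steps and a sequence of 21, nine windows are produced; the first window covers steps 0 to 20 and its target is the inflow at step 21.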

Optional Hyperparameter
In this study, two hyperparameters (sequence and batch size) were optimized by applying a grid search at the regular intervals shown in Table 2. The trial-and-error method was additionally applied to compensate for a shortcoming of grid search, which can miss optimal hyperparameters when only regularly spaced values are tested. The trial-and-error method searched for optimal values of the sequence length and batch size within the range of 1-100, and the results were compared with those of the grid search. In particular, the sequence length (hour) was selected as 12 because, for flood control at a multi-purpose dam, the outflow discharge is approved by the government one day before the gates are opened and downstream residents are notified in advance. Among the high-accuracy models, when overfitting was observed against the validation and test data, the dropout method was used to supplement the analysis results. The remaining hyperparameters, to which grid search was not applied, were optimized by trial and error. The application ranges of each parameter are shown in Table 2; Learning rate 0.001, Dropout 0.2, and Hidden layer 3 were applied as the optimal values in this study. Each scenario name is formed from the first letters of 'dam name-day/time-application model', followed by the scenario order or optimization. For example, "ADA-S1" means "Andong-Day-ANN-Scenario No.1", and "ADA-Opt" means "Andong-Day-ANN-Optimize".
To evaluate the statistical error and accuracy of each model scenario according to its hyperparameters, the coefficient of determination (R²), mean absolute error (MAE), root mean square error (RMSE), and volume error (VE) presented by Hu et al. [18] were used as performance indicators. Table 3 shows, as a representative example, the ANN model results for Andong Dam among the 8 cases (2 dams × 4 deep learning models) for which the best-performing scenario was analyzed. Among the various scenarios, ADA-S9 for daily data and AHA-S4 for hourly data were selected.

Andong Dam had a validation R² of 0.91, which was the closest to the observations compared to the other models. However, for the peak inflow, the GRU model showed the closest results to the observations. For the peak inflow of the daily data of Imha Dam, the LSTM model showed 925.2 m³/s, the least different from the actual inflow. For the scenarios applying the hourly data of Andong Dam, the R² of the ANN model was 0.94, which, as with the daily data, was the closest to the observations. Unlike at Andong Dam, at Imha Dam the RNN model showed a smaller difference between the actual and predicted peak inflow than the ANN model. In particular, the difference was the smallest in the LSTM model, at 34.5 m³/s.

Performance Evaluation of Optional Scenarios
To evaluate the performance of the scenarios, the RMSE-observed standard deviation ratio (RSR) and the Nash-Sutcliffe efficiency (NSE) were applied among various criteria. The equations for each criterion are shown in Equations (10) and (11). With the calculated RSR and NSE, the model performance can be judged based on the general performance rating (Table 5) [19].
where y_i is the observed value, ȳ is the mean of the observed values, ŷ_i is the predicted value, and n is the number of data points.

Table 6 shows the RSR and the NSE calculated for the validation and test data of the selected scenarios (Table 4), and the performance ratings evaluated with these values. In the validation of the selected scenarios, the RSR values for the Andong Dam daily data were low and similar compared with the Imha Dam results, and the evaluation result was "Very Good" for the ANN model and "Good" for the RNN model. For the hourly data, the ANN model showed the lowest RSR, 0.34, and all models were evaluated as "Very Good". Similar to Andong Dam, Imha Dam was evaluated as "Good" for the RNN model, the ANN model being the exception. For the hourly data, the evaluation was "Very Good" for all models and the NSE value was above 0.90, indicating reliable results.
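The two criteria can be computed directly from paired observed and predicted series. The sketch below uses the commonly cited definitions, RSR as the RMSE divided by the standard deviation of the observations and NSE as one minus the ratio of the squared-error sum to the variance sum of the observations; it is assumed, not confirmed by the text, that Equations (10) and (11) take this form. The sample values are made up.

```python
import numpy as np

def rsr(observed, predicted):
    """RMSE-observed standard deviation ratio (0 is a perfect fit)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    error = np.sqrt(np.sum((observed - predicted) ** 2))
    spread = np.sqrt(np.sum((observed - observed.mean()) ** 2))
    return error / spread

def nse(observed, predicted):
    """Nash-Sutcliffe efficiency (1 is a perfect fit; 0 matches the mean)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    error = np.sum((observed - predicted) ** 2)
    spread = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - error / spread

obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # made-up observations
perfect = obs.copy()                         # prediction equal to the observations
mean_only = np.full_like(obs, obs.mean())    # prediction equal to the observed mean
```

Under these definitions the two scores are linked by NSE = 1 − RSR², so a prediction no better than the observed mean scores RSR = 1 and NSE = 0.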

Drought Period
To select the optimal model by period for Andong Dam and Imha Dam, the inflow by quantile for the total test period (2017-2020) was compared first. Then, the results for each quantile of the inflow during the normal and dry seasons were derived, and the daily inflows of Andong and Imha Dams were used to select the most reliable inflow prediction model for the drought period. The periods of 28 June-20 August 2017 and 13 February-29 March 2018 in the study area were in the 'caution' stage of the drought crisis warning under the "Fundamental Act on Disaster and Safety". Therefore, the data from these periods were used for the drought period analysis.

Table 7 shows the inflows of the 1st (25%), 2nd (50%), and 3rd (75%) quartiles and the peak inflows of ADA-S9, ADR-Opt, ADL-S1, and ADG-S1, which are the optimal scenarios for Andong Dam (Table 4). Over the total period (2017-2020), the 1st, 2nd, and 3rd quartile values of the RNN model were close to the observations, with a maximum difference of 2 m³/s. In the drought period (2017-2018), the RNN predicted the 2nd and 3rd quartile inflows and the maximum inflow closest to the observations, but not the 1st quartile value. The difference in the maximum inflow between the RNN predictions and the observations was 6.25 m³/s, the smallest difference compared to the other RNN models. Figure 7 shows a comparison of the predicted inflow ranges for each model versus the observed ranges for the total and drought periods.

In the case of Imha Dam, the inflows of the 1st, 2nd, and 3rd quartiles and the peak inflows were calculated by applying the optimal scenarios (IDA-S9, IDR-S4, IDL-Opt, IDG-S5). Figure 8 shows a comparison of the predicted inflow ranges and the observed ranges of each model for the total and drought periods at Imha Dam. As shown in Table 8 and Figure 8, the prediction of the RNN shows the largest difference from the quartile values of the measured inflow compared to the other models. On the other hand, the inflow predictions of the LSTM have the smallest differences from the observations in the 1st and 3rd quartiles during the total period and in the 1st and 2nd quartiles and the maximum during the drought period. In the prediction of the maximum inflow, the difference between observation and prediction was 45.14 m³/s, a difference of approximately 10%. The GRU prediction showed the most accurate result in the 3rd quartile of the drought period, with a difference of 0.27 m³/s from the observation. As shown in Table 8, the LSTM was selected as the optimal model for inflow prediction at Imha Dam during the total and drought periods.
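The quartile comparison used in Tables 7 and 8 amounts to summarizing each inflow series by its quartiles and maximum and then differencing the summaries. A minimal sketch follows, assuming NumPy's default (linear interpolation) percentile definition, which may differ from the one used for the tables; the function names and the numbers are illustrative.

```python
import numpy as np

def inflow_summary(series):
    """1st, 2nd, and 3rd quartiles and the maximum of an inflow series."""
    series = np.asarray(series, dtype=float)
    q1, q2, q3 = np.percentile(series, [25, 50, 75])
    return {"Q1": q1, "Q2": q2, "Q3": q3, "max": series.max()}

def summary_differences(observed, predicted):
    """Absolute difference between predicted and observed summary values."""
    obs = inflow_summary(observed)
    pred = inflow_summary(predicted)
    return {key: abs(pred[key] - obs[key]) for key in obs}

# Made-up daily inflows (m^3/s), not values from the tables.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 3.5, 4.5, 5.5]
diffs = summary_differences(observed, predicted)
```

Running the same summary on each model's predictions and comparing the `diffs` dictionaries reproduces the kind of model ranking discussed above.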
In predicting the dam inflow during the drought period, the RNN model for Andong Dam and the LSTM model for Imha Dam were closest to the observed inflow. The reason that the RNN model yielded better results than the LSTM model at Andong Dam lies in the activation function. The existing RNN model uses the tanh activation function, which causes the gradient loss problem. In this study, however, the ReLU function was used to reduce gradient loss during backpropagation learning. The LSTM model was selected as the optimal model for Imha Dam because its loss was smaller than that of the RNN model, owing to the memory function of the LSTM cells. In addition, although the watersheds of the two dams are close to each other, the optimal models differ because various factors, such as land conditions, river slope, and rainfall characteristics, come into play. This shows the importance of an analysis process that takes these points into account to find an appropriate model.

Typhoons
It is important not only to analyze normal or drought periods using daily data, but also to analyze flood conditions using hourly data for flood control. In the case of Imha Dam, for example, the inflow during the flood season (21 June to 20 September) was 157.9 × 10⁶ m³ in 2019 but 743.6 × 10⁶ m³ in 2020. In other words, the inflow differed by a factor of 4.7 over the same period of the year. Accordingly, six major typhoon cases were applied to each model, the maximum observed inflow was compared with the model predictions, and the most accurate model was selected by calculating R². Table 9 lists the six major typhoons applied in this study. In particular, after the rainy season in 2020, typhoons occurred consecutively; approximately 270 mm of rain fell in the basins of Andong and Imha Dams, and a maximum hourly rainfall of 23.4 mm was recorded in the Imha Dam basin. Among the six typhoon cases, Typhoons Maysak and Haishen in 2020 occurred consecutively and are therefore treated as one case.

Tables 10 and 11 show the peak inflow predicted by each deep learning model using hourly inflow data for Andong Dam and Imha Dam, respectively. At Andong Dam, the GRU predictions had the smallest differences from the peak inflows observed during Typhoons Maemi, Kongrei, and Maysak and Haishen (Table 10). At Imha Dam, on the other hand, the RNN predictions showed the smallest differences from the peak inflows observed during Typhoons Rusa, Kongrei, and Mitag (Table 11). Figure 9a,b compare the observed inflow with the inflow predicted by the four models for Typhoons Maysak and Haishen at Andong Dam and Imha Dam, respectively. The GRU for Andong Dam and the RNN for Imha Dam were selected as the optimal models based on the maximum inflow prediction and the R² value under typhoon conditions.
However, because the maximum inflow predictions and R² values differ greatly depending on the characteristics of each typhoon, such as rainfall intensity and antecedent rainfall, as shown in Tables 10 and 11, it is desirable to compare and analyze various models in future flood simulations. K-water, which operates Andong Dam and Imha Dam, currently uses the SFM to predict the inflow of the two dams. Therefore, the inflow predicted by the SFM was compared with the inflow predicted by the GRU (Andong Dam) and the RNN (Imha Dam) under each typhoon condition. The SFM was calibrated by adjusting its parameters so that the predicted inflow was as close as possible to the observed maximum inflow. In some cases, R² increased while the maximum predicted inflow decreased. In practical dam operation, however, the maximum inflow and its arrival time are the more important factors. Therefore, the calibration prioritized matching the maximum inflow over maximizing the R² between the prediction and the observation.
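The two evaluation criteria discussed above, agreement at the peak and R², can disagree, as the following sketch illustrates. It rests on several assumptions: the source does not state which definition of R² was used, so the squared Pearson correlation is assumed here; the function names are hypothetical; and the hourly series are constructed for illustration and are not data from the study.

```python
def peak_metrics(observed, predicted):
    """Peak inflow difference (%) and peak timing offset (in time steps)."""
    obs_peak = max(observed)
    pred_peak = max(predicted)
    peak_diff_pct = 100.0 * (pred_peak - obs_peak) / obs_peak
    timing_offset = predicted.index(pred_peak) - observed.index(obs_peak)
    return peak_diff_pct, timing_offset


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical hourly inflows in m^3/s (illustrative values only).
observed = [100.0, 300.0, 900.0, 1000.0, 600.0, 200.0]
model_a = [120.0, 350.0, 800.0, 900.0, 650.0, 250.0]   # lower peak, similar shape
model_b = [100.0, 500.0, 1000.0, 700.0, 400.0, 150.0]  # same peak value, one step early

for name, series in (("model_a", model_a), ("model_b", model_b)):
    diff_pct, offset = peak_metrics(observed, series)
    print(f"{name}: peak difference {diff_pct:+.1f}%, "
          f"timing offset {offset:+d} steps, R^2 {r_squared(observed, series):.2f}")
```

In this constructed example, the first series has the higher R² but underestimates the peak by 10%, while the second reproduces the peak magnitude with a lower R² and an earlier arrival. Which series is preferable depends on whether peak magnitude, arrival time, or overall fit is prioritized.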
At Andong Dam, the difference between the predicted and observed maximum inflow for Typhoons Kongrei and Mitag was larger for the SFM than for the GRU. In the case of Imha Dam, the SFM inflow was predicted to be lower than the observed value as well as the RNN inflow under all typhoon conditions (Table 12). These results show that the RNN selected in this study is a reliable model when compared with the SFM currently used for dam inflow prediction. Overall, the predictions of the deep learning models were closer to the observed maximum inflow than those of the SFM. On the other hand, during Typhoons Maysak and Haishen at Andong Dam, the SFM predictions agreed better with the observed inflow than those of the deep learning models. Therefore, when making decisions related to dam operation, it is necessary to derive more reasonable results by comparing the predicted values of the SFM and the deep learning models.

Discussion
This study showed the process of predicting and analyzing dam inflow using deep learning models. The study was motivated by the importance of predicting inflow with high accuracy for dam operation in disaster situations such as drought and flood. Most of the prediction results showed that the RNN models had higher accuracy than the ANN model. The reason is that precipitation and inflow are time-series data, and the RNN models feed the results of the previous time step back in as inputs, so that learning proceeds continuously along the sequence with relatively little loss of learning ability. In typhoon and drought conditions, recurrent neural network models (RNN, LSTM, GRU) were selected as the optimal models. In the comparison between the SFM and the deep learning models, the predictions of most deep learning models were closer to the observed maximum inflow than those of the SFM, but the SFM showed better results under certain conditions.
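The recurrence that distinguishes an RNN from a feed-forward ANN can be sketched with a single recurrent unit. This is a deliberately minimal illustration, not the architecture trained in this study: the function name, the scalar hidden state, the hand-picked weights, and the rainfall sequence are all assumptions for the example.

```python
def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def simple_rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a single-unit recurrent cell over a sequence and return all hidden states."""
    h = h0
    states = []
    for x in inputs:
        # The previous hidden state h is fed back in together with the new input x.
        h = relu(w_x * x + w_h * h + b)
        states.append(h)
    return states


# Hypothetical rainfall sequence (mm) and hand-picked weights, for illustration only.
rainfall = [0.0, 10.0, 0.0, 0.0]
states = simple_rnn_forward(rainfall, w_x=0.5, w_h=0.5, b=0.0)
print(states)  # prints [0.0, 5.0, 2.5, 1.25]
```

The rainfall pulse at the second step still influences the hidden state at the third and fourth steps, which is how a recurrent model can carry information about earlier rainfall forward in time. A feed-forward network given only the current input would have no such memory.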
These results suggest that even when dam basins are adjacent, different deep learning models may be selected as the optimal model for each dam owing to factors such as land conditions and rainfall characteristics. Therefore, further studies that include factors not considered in this study, such as land conditions, evaporation, temperature, and wind speed, are needed to predict dam inflow more accurately using deep learning models.

Conclusions
In this study, for efficient water resource management of Andong Dam and Imha Dam, the optimal model was selected through comparison and validation of deep learning models for predicting the inflow to the two dams. Considering that dam inflow prediction is a time-series analysis, RNN models were mainly applied. Four deep learning techniques (ANN, RNN, LSTM, and GRU) were applied to dam hydrology data from the past 20 years to predict the inflow of the dams, and optimal input variables were derived using various indicators.
(1) To evaluate the detailed prediction capability of the deep learning models under each scenario, the data were analyzed by quartile values after separating the entire period and the drought period. To select the deep learning model most suitable for the drought and normal seasons, predictions and observations of the 1st, 2nd, and 3rd quartile inflows and the peak inflow were compared using the daily time-series data. At Andong Dam, the RNN model produced the quartile values closest to the observed inflow in the total period (2017-2020), and its predictions were also closest to the measurements in the normal and drought periods. At Imha Dam, the LSTM model was closest to the observations in the normal season. During the drought period, the LSTM prediction showed the smallest difference from the observations in the 1st and 2nd quartiles, whereas the GRU prediction showed the smallest difference in the 3rd quartile.
(2) A comparative analysis of six past typhoon cases showed that the predictions differed depending on the deep learning model. At Andong Dam, the GRU model showed higher accuracy in inflow prediction than the other models. At Imha Dam, unlike Andong Dam, the inflow predicted by the RNN showed the highest correlation and the best agreement with the observations. For Typhoon Mitag, the R² was high at 0.97 and the difference between the observation and the prediction was 1%, the closest to the measured value among the models. Analysis of the selected models indicates that, because dam inflow and precipitation are time-series data, the RNN produced inflow predictions with relatively high reliability.
(3) Compared with the SFM currently used to predict the inflow into the dams, the selected deep learning models produced maximum inflow predictions that were closer to the observed inflow. In predicting future typhoon inflows, using a conceptual or physical model together with a deep learning model will help in efficient decision making.
The appropriate deep learning model varies with weather conditions such as drought, typhoon, and torrential rain; therefore, it is important to compare various deep learning models in order to cope with uncertain future climate change and to operate reservoirs efficiently and safely. In addition, because the SFM shows better prediction results than the deep learning models under certain typhoons, as shown in the preceding analysis, the analytical ability of practitioners to use both deep learning models and the existing SFM is important. This study, which analyzed inflow predictions using hydrological data and deep learning models, is expected to contribute to stable dam operation and disaster response when used as basic data for inflow prediction models of multi-purpose dams, including Andong and Imha Dams.