Determination of Deep Learning Model and Optimum Length of Training Data in the River with Large Fluctuations in Flow Rates

Abstract: Recently, developing countries have steadily been pushing for the construction of stream-oriented smart cities, breaking away from the old-town-centered development of the past. Because the effects of climate change are accelerating along with such urbanization, urban rivers need a flood warning system that can predict high flow rates with engineering accuracy and that is more practical for disaster prevention than the existing Computational Fluid Dynamics (CFD) models. This study shows that, for streams with missing data or only a small number of observations, the variation in flow rates can be predicted by an appropriate deep learning model using only limited time series flow data. In addition, the selected deep learning model allowed the minimum number of input learning data to be determined. The time series flow rates were predicted by applying deep learning models to the Han River, a highly urbanized stream that flows through Seoul, the capital of Korea, and has a large seasonal variation in flow rate. The deep learning models used are Convolution Neural Network (CNN), Simple Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Bidirectional LSTM (Bi-LSTM) and Gated Recurrent Unit (GRU). Sequence lengths for the time series runoff data were determined first to assess the accuracy and applicability of the deep learning models. From the forecast results for the outflow data of the Han River, a sequence length of 14 days was appropriate in terms of prediction accuracy. In addition, the GRU model is effective among deep learning models that use time series data from a region with large fluctuations in flow rate, such as the Han River. Furthermore, this study made it possible to propose the minimum number of training data that can support an effective flood forecasting and warning system even when the flow rate data secured in new towns developed around rivers are insufficient.


Introduction
South Korea lies in a monsoon climate region and receives 60%~70% of its annual precipitation in the four months from June to September. Managing water resources is therefore difficult in both the flood season and the drought season [1]. In particular, for urban streams during the flood season, predicting flood damage caused by impervious cover and heavy rain has become an essential part of urban disaster prevention [2]. Recently, flood damage around urban rivers has occurred frequently and continues to increase because of localized heavy rains caused by climate change. For this reason, accurate flow rate forecasting techniques are needed to predict high flow rates in urban streams [3].
In the past, physical numerical models such as the stage-storage and discharge-storage methods of flood routing [4] were used to predict the flow rate of streams, but the accuracy of their results depended on the constraints and numerical techniques of the model. Hydrodynamic numerical models based on the one-dimensional Saint-Venant equations [5], the two-dimensional shallow water equations [6] and the three-dimensional Navier-Stokes equations [7] are considered essential for establishing flood prevention measures in rivers. In the event of flood damage, such finely developed mathematical models can be used to predict the flow rates and water surface elevations in the stream and to establish flood reduction measures. However, the main drawback of traditional methods such as dynamic models is that many constraints limit their use as a flood warning tool: user expertise is essential for developing and using the model, and the numerical models require long calculation times. Numerical analysis using the Computational Fluid Dynamics (CFD) method is also mathematically very complex [8], and the accuracy of the numerical solutions varies greatly with the numerical technique used. In addition, an appropriate grid size and time interval within the computational domain must be determined in order to obtain accurate results from hydraulic numerical models. Quick flood forecasts were possible, but methods were needed to effectively improve their accuracy. Therefore, studies have emerged on data-driven models, which can overcome the limitations of existing physical models and replace them [9].
Artificial Neural Network (ANN) models, which have attracted growing interest in the field of data science, can obtain accurate predictions by repeatedly learning the correlation between input and output data, regardless of the physical characteristics and meaning of the data. ANN is an algorithm that mathematically models the neurons in the biological brain of a person or animal so that machines can learn on their own. It is one of the detailed methodologies of Machine Learning (ML) and takes the form of a network of many connected units that act like nerve cells (neurons).
Recently, owing to the remarkable development of Deep Neural Network (DNN) models, the prediction of time series data such as flow rate, water elevation and velocity has been studied actively worldwide in the field of water resources engineering [10][11][12][13][14][15][16][17][18][19][20][21][22]. First, studies were conducted to steadily increase forecast accuracy by using DNN models for water level prediction. The water level in a stream was predicted with an ANN model using only the water level time series as input, without rainfall data [10]. The water level of a stream was predicted using Recurrent Neural Network (RNN), Recurrent Neural Network-Back Propagation Through Time (RNN-BPTT) and Long Short-Term Memory (LSTM) models to predict flood damage in urban areas [11]. Using ANN, RNN and Nonlinear AutoRegressive eXogenous neural network (NARX) models, the water level of the Han River was predicted, and the NARX model produced better results than the ANN and RNN models [12]. Second, studies were conducted to predict streamflow and the inflow into dams. Streamflow was accurately predicted using an RNN model with one or two hidden layers and various hydrological parameters [9]. An attempt was made to use an RNN model to predict streamflow time series [13]. The monthly flow rate of rivers was accurately predicted using an RNN model that considers the time lag of the input series [14]. Daily streamflow was predicted from seven lags of flow and rainfall data using ANN, adaptive neuro-fuzzy and Generalized Regression NN (GRNN) models [15]. The daily flow rate into a dam was predicted using time-lagged RNNs trained by backpropagation; this method predicted both low and peak flow rates well [16]. Reinforced RNNs were used to forecast the amount of flood water flowing into a reservoir when a typhoon occurs [17].
The inflows of multi-purpose dams were predicted using ANN and Elman RNN; ANN was superior to Elman RNN for inflow prediction during the flood period, whereas Elman RNN was more advantageous during the drought period [18]. ANN and LSTM were applied to the hourly, daily and monthly flows of dam reservoirs during the peak period to increase the efficiency of the model according to the maximum iteration [19]. Time series data were analyzed using LSTM to calculate the inflows of multi-purpose dams; the inflows during the flood period were difficult to predict accurately, but the flood inflow predictions improved when rainfall data were used [20]. Three deep learning techniques (RNN, LSTM and Gated Recurrent Unit (GRU)) were used to predict the outflows from a reservoir, and the number of iterations and hidden nodes was found to play a role in improving the accuracy of the models [21]. The performance of various RNN models was compared for predicting dam reservoir inflow, and LSTM was very accurate among them [22].
In other words, the prediction of flow rates using DNN technologies has been studied actively in the water resources sector over the past five years. This research has been dominated by hydraulic variables that change little over time: most flow forecasting studies predicted seasonal water levels in rivers or the inflow to dam reservoirs, where flow rate fluctuations are not significant. Predicting the outflow of urban streams with large seasonal changes in flow rate, however, has not been easy because of the wide difference between low and high flow rates. Thus, if DNN models can predict flood flows from time series flow data with very large fluctuations in an urban stream where actual flood damage occurs, they could replace the existing hydrodynamic models, which would be very meaningful for flood prevention.
In this study, DNN models are used to predict flow rates. Over the past two years, water resources studies have compared one to four models among the Convolution Neural Network (CNN), Simple RNN, LSTM, Bidirectional LSTM (Bi-LSTM) and GRU models. However, few studies have predicted extreme flow rates under seasonal fluctuations like those of the target stream; most have focused on predicting water levels, or flows into streams or dams that vary little. Therefore, we studied how to predict extreme high flow rates accurately by selecting five DNN models (i.e., CNN, Simple RNN, LSTM, Bi-LSTM and GRU) suitable for time series calculations. Unlike previous deep learning studies in water resources engineering, the purpose of this study is to improve the prediction accuracy of high flow rates in the flood season as the length of the time series data changes, by applying the models to an urban stream with very large seasonal flow fluctuations. Furthermore, we propose a way to establish the critical conditions of deep learning that ensure reliability with respect to the sequence length and the training data size entered into the DNN models. We thereby aim to show how an appropriate amount of training data can be used effectively, even when the available record is relatively short, so that a flood warning system can be established in regions where not enough observed data are available.

Applied DNN Models
Various ANN models have been developed, and the DNN models that can be used effectively in the analysis of time series data were investigated. The DNNs applied to time series data in this study are CNN, Simple RNN, LSTM, Bi-LSTM and GRU.

Convolution Neural Network (CNN)
The CNN model is a deep learning model that performs very well in image recognition and analysis. The CNN process starts with convolution and max-pooling, which break an image down into shapes that are analyzed independently. The results of this process are fed to a fully connected neural network that produces the final classification decision (Figure 1). Using the same algorithm as in image analysis, CNN has also been applied effectively to time series data analysis [23,24]. Intuitively, time series data can be expressed as images, so the algorithms used for image recognition can be applied to them through CNN models. Therefore, CNN can be applied to time series data to predict future values.
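To make the convolution and max-pooling steps concrete for a one-dimensional series, the following is a minimal pure-Python sketch. The function names `conv1d` and `max_pool1d`, the kernel values and the sample flow rates are illustrative assumptions, not the layers, weights or data used in this study.

```python
def conv1d(series, kernel):
    """Slide the kernel over the series ('valid' mode, no padding)."""
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]


def max_pool1d(series, pool_size):
    """Keep the maximum of each non-overlapping block; drop any remainder."""
    return [
        max(series[i:i + pool_size])
        for i in range(0, len(series) - pool_size + 1, pool_size)
    ]


# Hypothetical daily flow rates (m^3/s) and a difference-like kernel.
flows = [300.0, 320.0, 900.0, 1500.0, 700.0, 400.0]
features = conv1d(flows, [-1.0, 0.0, 1.0])  # change over two days
pooled = max_pool1d(features, 2)            # largest value per block of two
```

In a trained CNN the kernel values are learned rather than fixed, and the pooled features are passed on to the fully connected layers.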

Simple Recurrent Neural Network (Simple RNN)
Simple RNN is an artificial neural network for learning data that change over time, such as time series data, and it therefore refers recursively to its own past output. Traditional neural networks operate only on the data currently entered, so it is difficult for them to process sequential data. Because time series data are correlated with the previous result (h_{t−1}), the current result (h_t) is estimated through its correlation with the previous day's data, as shown in Figure 2. Among the various DNN algorithms, Simple RNN emerged to process data with sequences. However, the RNN algorithm is limited for long time series because its memory is very short [22,25]. In addition, when multiple hidden layers of the RNN are used for complex time series problems, vanishing and exploding gradient problems are often encountered. Simple RNN is computed as follows (Equation (1)): where σ(·) is an activation function; W are the weight matrices of the hidden layer h; x is an input vector and b is a bias.
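For reference, the standard Simple RNN recurrence that is consistent with the symbols defined above can be written as follows. Splitting W into a recurrent matrix W_h and an input matrix W_x is a notational assumption made here for illustration.

```latex
h_t = \sigma\left(W_h\, h_{t-1} + W_x\, x_t + b\right)
```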

Long Short-Term Memory (LSTM)
LSTM is a model for sequential data that mitigates the long-term memory loss problem of Simple RNN [26]. As shown in Figure 3, LSTM consists of a forget gate, an input gate and an output gate. The key to LSTM is its cell state. The horizontal line from c_{t−1} to c_t at the top of Figure 3 is the cell state, which runs through the entire time series with only simple linear operations. Because of this structure, time series information is transferred continuously to the next time step without memory loss.
where f_t, i_t and o_t are the forget, input and output gates at time t, respectively; W_f, W_i and W_o are the weights mapped to the hidden layers for the forget, input and output gates; b_f, b_i and b_o are bias vectors; tanh(·) is the hyperbolic tangent function; c_{t−1} and c_t are the cell states of the previous time step and the current time step.
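As a reference point, the widely used LSTM formulation that matches the gates and symbols defined above is sketched below. The candidate cell state \tilde{c}_t with its parameters W_c and b_c, and the concatenation [h_{t-1}, x_t], are assumptions of this sketch, since they are not named in the text.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

The fifth line is the linear cell state update that lets information pass from c_{t−1} to c_t.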

Bidirectional LSTM (Bi-LSTM)
As shown in Figure 4, a hidden layer built with Bi-LSTM [27] learns the input sequence in both the forward and backward directions, so higher recognition performance can be expected than from a unidirectional LSTM, which learns only in the forward direction [28].

Gated Recurrent Unit (GRU)
GRU plays a similar role to LSTM but is computationally more efficient because it has a simpler structure: it removes the cell state calculation used by the conventional LSTM and simplifies the three LSTM gates. As shown in Figure 5, the input gate and forget gate are combined into a single update gate [29], so GRU has only two gates, the update gate and the reset gate, and no cell state. GRU uses the sigmoid activation function twice and the tanh function once. Thus, GRU has fewer parameters than LSTM and is faster to train, yet it is capable of long-term memory like LSTM.
where r and z are the reset and update gates, respectively. The reset gate determines how much of the past data is reset and, through the activation function, outputs a value between 0 and 1. The update gate determines the proportions of past and present information in the update: the output value z_t is the amount of data to be passed on at this point in time and 1 − z_t is the amount of data to be forgotten.
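The gate logic described above can be sketched for a single time step as follows. This is a minimal scalar illustration, not the implementation used in the study: the parameter names in the dictionary `p` are hypothetical, and the choice that z_t weights the new candidate state while 1 − z_t weights the previous state is an assumption, since references differ on this convention.

```python
import math


def sigmoid(v):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x_t, h_prev, p):
    """One GRU time step for a scalar input and a scalar hidden state.

    p holds hypothetical scalar parameters: w_* act on x_t,
    u_* act on h_prev and b_* are biases.
    """
    r_t = sigmoid(p["w_r"] * x_t + p["u_r"] * h_prev + p["b_r"])  # reset gate
    z_t = sigmoid(p["w_z"] * x_t + p["u_z"] * h_prev + p["b_z"])  # update gate
    # Candidate state: the reset gate scales how much of the past is used.
    h_cand = math.tanh(p["w_h"] * x_t + p["u_h"] * (r_t * h_prev) + p["b_h"])
    # Assumed convention: z_t weights the candidate, 1 - z_t the previous state.
    return (1.0 - z_t) * h_prev + z_t * h_cand


# With all parameters at zero both gates equal 0.5 and the candidate is 0,
# so the new state is half of the previous state.
zero_params = {k: 0.0 for k in
               ("w_r", "u_r", "b_r", "w_z", "u_z", "b_z", "w_h", "u_h", "b_h")}
h_next = gru_step(1.0, 2.0, zero_params)
```

The sigmoid is evaluated twice (reset and update gates) and tanh once (candidate state), and no separate cell state is carried between steps.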

Application of Models
In this study, we compare the accuracy and performance of flow rate prediction for five models selected, from among various deep learning techniques, as suitable for high flow rate prediction with time series data. The time series of stream runoff is used as both input and output data. Such time series are very important for flood warning and flood defense in the field of water resources. The purpose of this study is to predict future stream flow rates accurately using a single flow rate time series, that is, univariate data, as input.
Through this study, we propose deep learning techniques suitable for extreme flood forecasting, in order to prepare for extreme flooding and to establish an accurate flood warning system, and we propose the length of input learning data that ensures adequate accuracy. First, due to the nature of time series data, the appropriate sequence length of the data must be determined. We therefore select the deep learning technique best suited to predicting the time series and use it to determine the sequence length appropriate for flow rates in a stream with large flow fluctuations. Second, an applicable deep learning model can be proposed by comparing the prediction accuracy of the candidate models, on flow rate data with large fluctuations, at the sequence length determined. Finally, to ensure the prediction accuracy for time series of limited length, the accuracy and performance of flow rate prediction are calculated and compared for different lengths of training data. If the observational record to be entered into the DNN model is relatively short and significant errors in learning and forecasting are feared, a measure is needed to derive the best high flow rate forecast using only the observed data.
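To illustrate what the sequence length controls, the sketch below cuts a univariate flow series into fixed-length input windows, each paired with the value that follows it. The helper name `make_windows`, the one-step-ahead target and the sample values are assumptions for illustration; the text does not specify the forecast horizon or the exact preprocessing.

```python
def make_windows(series, seq_len):
    """Split a univariate series into (input window, next value) pairs."""
    inputs, targets = [], []
    for start in range(len(series) - seq_len):
        inputs.append(series[start:start + seq_len])
        targets.append(series[start + seq_len])
    return inputs, targets


# Hypothetical daily flow rates (m^3/s); seq_len=3 keeps the example short.
flows = [310.0, 295.0, 400.0, 1250.0, 820.0, 460.0]
X, y = make_windows(flows, seq_len=3)
```

A longer sequence length gives each sample more history but leaves fewer samples: a series of n values yields n − seq_len windows.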

Study Area
The Han River, in the highly urbanized Han River basin, was selected (Figure 6). The Han River basin is located in the central part of the Korean Peninsula, between latitudes 36°30′ N and 38°55′ N and longitudes 126°24′ E and 129°02′ E. As shown in Table 1, the Han River has a basin area of 25,953.60 km² (excluding the 9816.81 km² area in North Korea) and is the largest river in South Korea, with a total length of 494.44 km, an average width of 72.35 km and a shape coefficient of 0.146. The Han River basin is a multi-type basin that is a mixture of dendritic form and facsimile form. Historically, the Han River was channelized and lost its sinuosity as a result of flow modifications caused by the urbanization of the Han River basin [30]. Table 1 gives a summary of the channel characteristics. The channelized reach, with an average river width of 1300 m, has an average slope of 0.0016% on the downstream reach of the Han River basin [31,32].

Hydrologic Data
In this study, the runoff data of Hangang Bridge Station (Figure 6), which is observed by the Ministry of Environment, Korea, were used. Flow data from the monitoring station were obtained through the WAMIS website of the Ministry of Environment [33]. The average annual precipitation from 2010 to 2019 was 1313.42 mm, with 60% of the precipitation falling during the monsoon season from July to September, based on the rainfall data provided by the website of the Korea Meteorological Administration [35]. Because seasonal rainfall is concentrated in the summer, the amount of runoff also increases rapidly at that time. In general, the longer the data record, the better the results. In this study, however, the deep learning models were validated using a relatively short record of 2 years and 7 months, so that the critical size of the learning data could be determined by reducing the length of data applied to model predictions. As shown in Table 2 and Figure 7, the observed daily flow rates at the Hangang Bridge station were based on real-time observations from 1 January 2018 to 31 July 2020. To analyze the characteristics of the flow rates, the average, minimum and maximum flow rates were analyzed statistically, as shown in Table 2. The flow rate variation is 425.84 m³/s, which is larger than the average flow rate of 355.97 m³/s, so the seasonal runoff changes at this site are very large. Before the multi-purpose dams were built in the Han River, the Coefficient of Flow Fluctuation (CFF), defined as the ratio of the annual maximum flow rate to the annual minimum flow rate, was 390 [36]. After large-scale multi-purpose dams were built upstream for flood control and water supply, the CFF dropped dramatically to 70.32. However, the seasonal changes are still very large compared with the CFF of foreign rivers (e.g., 3 for the Mississippi River, 8 for the Thames River, 18 for the Rhine River and 30 for the Nile River), which have a nearly constant flow rate throughout the year [36].
As shown in Figure 7, the selected time series was divided for three stages of calculation: training, validation and prediction. Of the entire series, 80% was used as learning data and the remaining 20%, shown as the blue solid line in Figure 7, was held out as forecast data to evaluate the accuracy of the model. The last 10% of the learning data was used as validation data to evaluate the adequacy of training during the learning process and is shown as the red solid line in Figure 7. The training data actually used by the deep learning models is therefore the learning portion excluding the validation data, so that training, validation and prediction data make up about 72%, 8% and 20% of the series, respectively.
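The chronological split described above can be sketched as follows. The function name `chronological_split` and the choice to round the prediction and validation sizes down to whole days are assumptions of this sketch; only the 80/20 split, the 10% validation share of the learning data and the record dates come from the text.

```python
from datetime import date


def chronological_split(n_total, test_frac=0.2, val_frac=0.1):
    """Return (n_train, n_val, n_test) for a time-ordered split.

    The last test_frac of the record is held out for prediction; the last
    val_frac of the remaining learning data is used for validation.
    Sizes are rounded down (an assumption of this sketch).
    """
    n_test = int(n_total * test_frac)
    n_learn = n_total - n_test
    n_val = int(n_learn * val_frac)
    n_train = n_learn - n_val
    return n_train, n_val, n_test


# Daily record from 1 January 2018 to 31 July 2020, both days included.
n_days = (date(2020, 7, 31) - date(2018, 1, 1)).days + 1
n_train, n_val, n_test = chronological_split(n_days)
```

Under these assumptions the record has 943 days and the held-out set has 188 days, which agrees with the fixed prediction length of 188 used in the training-size experiments with GRU.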

Composition of Models
In this study, Python version 3.7.7 [37], an open-source programming language, and TensorFlow version 2.1.0 [38], a representative machine learning library, were used. As shown in Table 3, the deep learning models used in the study were CNN, Simple RNN, LSTM, Bi-LSTM and GRU. For each model, the type of neuron and the number of units in each layer are shown in Table 3. All models except CNN consist of 1 input layer, 2 hidden layers, 1 dropout layer and 2 dense layers; the detailed composition and hyperparameters of each model are given in Table 3. One pass of learning over the entire training data entered into a model was defined as an epoch, and the model results converged sufficiently after 600 epochs of learning. The Adam optimizer was used for training and the Mean Squared Error (MSE) was used as the cost function.

Model Performance Indicators
To evaluate the performance and accuracy of the deep learning models, the model evaluation criteria in Equations (12)-(16) were used. The closer the mean absolute error (MAE), mean squared error (MSE) and root mean squared error (RMSE) are to 0, the better the performance of the model. The closer the Nash-Sutcliffe model efficiency coefficient (NSE) and the coefficient of determination (R²) are to 1, the better the performance of the model.

(1) Mean Absolute Error (MAE)
MAE measures the average magnitude of the errors in a set of predictions, without considering their direction. It is the average over the test sample of the absolute differences between prediction and actual observation where all individual differences have equal weight [39].
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|x_i - y_i\right|$$
where x_i are the observed values of the variables, y_i are the predicted values and N is the number of data.

(2) Mean Squared Error (MSE)
MSE measures the average of the squares of the errors, that is, the average squared difference between the predicted value and the actual observation value [39].
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - y_i\right)^2$$

(3) Root Mean Squared Error (RMSE)
RMSE is a quadratic scoring rule that also measures the average magnitude of the error. It is the square root of the average of squared differences between prediction and actual observation [39][40][41].
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - y_i\right)^2}$$

(4) Coefficient of determination
The coefficient of determination R² is a measure of the goodness of fit of a statistical model [39,42,43].
where ŷ_i are the predicted values from a statistical model and x̄ is the mean of the observed values of the variables.
As shown in Table 4, a model result should not be adopted if R² and NSE are less than 0.5; the model may be adopted if they are greater than 0.5 and less than 0.65; it is good for adoption if they are greater than 0.65 and less than 0.75; and it is very good for adoption if they are above 0.75 [22,39,42,44].
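The error measures and the adoption classes above can be computed as in the following sketch. MAE, MSE and RMSE follow their verbal definitions, and NSE uses the standard Nash-Sutcliffe form (one minus the ratio of the squared error to the squared deviation of the observations from their mean). The function names, the class labels and the handling of values that fall exactly on a class boundary are assumptions; R² is left out because its exact formula is not stated in the text.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(x - y) for x, y in zip(obs, pred)) / len(obs)


def mse(obs, pred):
    """Mean squared error."""
    return sum((x - y) ** 2 for x, y in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(mse(obs, pred))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency (standard definition)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((x - y) ** 2 for x, y in zip(obs, pred))
    sst = sum((x - mean_obs) ** 2 for x in obs)
    return 1.0 - sse / sst


def rate(score):
    """Adoption class after Table 4; boundary handling is an assumption."""
    if score < 0.5:
        return "not appropriate"
    if score < 0.65:
        return "possible"
    if score < 0.75:
        return "good"
    return "very good"
```

For example, `rate(0.693)` falls in the "good" class and `rate(0.312)` in the "not appropriate" class. A forecast that always returns the observed mean has an NSE of 0, which is why NSE is a stricter measure than a small absolute error alone.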

Results on Training, Validation and Prediction Using Various Time Series Deep Learning Models
The learning and prediction results of the 5 deep learning models (CNN, Simple RNN, LSTM, Bi-LSTM and GRU) for the time series data are shown in Figure 8 and Table 5. The CNN model performed poorly, as shown in Figure 8(a1,a2): high flow rates were not predicted well, although low flow rates were predicted adequately. The training and prediction results of Simple RNN were better than those of CNN, but high flow rates were still underestimated in training (Figure 8(b1)) and rather overestimated in prediction (Figure 8(b2)). As shown in Figure 8(c1,c2) and Table 5, the LSTM model showed a sharp improvement in accuracy (NSE = 0.994) in learning the high flow rates, but its forecasts of high flow rates were still overestimated. The learning results of Bi-LSTM and GRU were very accurate (NSE = 0.984-0.994), and the GRU predictions were accurate (NSE = 0.693) over all flow rate ranges (Table 5). Bi-LSTM predicted most flow rate ranges well, as shown in Figure 8(d2), but its predictions of high flow rates were somewhat overestimated and its accuracy was reduced. Therefore, based on the results of this study, the GRU model achieved high accuracy in both learning (NSE = 0.984) and forecasting (NSE = 0.693) for the Han River with its large flow fluctuations (Figure 8(e1,e2)).

Training and Prediction Results of Sequence Length Variation Using GRU
Deep learning calculations for different sequence lengths were performed with all the deep learning models but, reflecting the results of Section 4.1, only the results for GRU, the model with the best predictions, are reported in Figure 9 and Table 6. Thus, the model used for the predictions in this section was GRU and all results were calculated after 600 epochs of learning, as shown in Figure 9 and Table 6. As shown in Figure 9, 72% of the data were used as training data, 8% as validation data and the other 20% as prediction data. Compared with the model suitability criteria in Table 4, the NSE values of the learning results were in the range of 0.961 to 0.994. When the sequence length was 7 days or 14 days, the learning results tended to overfit and the NSE values for the validation data were in the range of 0.507 to 0.549. Because of the small number of validation data, these NSE values were somewhat small, but this was not considered a significant problem for verifying the performance of the overall model.
For sequence lengths of 7 to 35 days, the NSE values of the predictions were 0.632 to 0.693 and the predictions were judged to be sufficiently reliable (Table 6). However, when the sequence lengths were 4 days and 42 days, the NSE values dropped sharply to 0.312 and 0.472, respectively, and the model results deteriorated. Therefore, for the GRU model, it was considered appropriate to select the sequence length from the range of 7 to 35 days for a river with large seasonal fluctuations such as the Han River. The R² values were the same as or slightly smaller than the NSE values, as shown in Table 6. Because of the large changes in the flow rate of the stream, the remaining performance criteria (MAE, MSE and RMSE) remained large even after learning was complete, but the training and prediction accuracy of the model indicates that the results had converged sufficiently.
As shown in (a4)-(c4) of Figure 9, the R² values indicated that high flow rates could be predicted to a reasonable level, although with some under- or overestimation, as the sequence length increased from 7 days to 21 days. Intermediate and low flow rates were all predicted with a high level of accuracy (Figure 9(a2-c2)). Taking the results of this study together, we excluded the sequence length of 7 days to avoid overfitting problems. With a sequence length of 14 days, highly accurate predictions were obtained for all flow rate ranges.

GRU Performance with Changes in Length of Training Data and Prediction Data
In this section, we observed the accuracy of the predicted flow rates while fixing the length of the prediction data at 188 and reducing the size of the training data. The prediction data presented in Table 7 run from 26 January 2020 to 31 July 2020, a fixed total of 188 values. The longest training data run from 1 January 2018 to 25 January 2020. By removing training data from the January 2018 end, a total of four time series of learning data were prepared, and the prediction accuracy of the GRU model was observed as a function of the size of the training data. As shown in Table 7, the ratio of training data to prediction data was calculated from the sum of the number of training data and the number of prediction data. The proportion of training data entered into GRU was thus reduced to 80.0%, 74.9%, 71.4% and 66.8%, respectively, while the absolute size of the prediction data remained 188 in all cases, so the proportion of prediction data in the four groups increased to 20.0%, 25.1%, 28.6% and 33.2%, respectively. The results calculated after 600 epochs using the GRU model at a sequence length of 14 days, with the same model input conditions, are shown in Table 7. No validation calculation was performed because the main purpose was to find the critical input size of the training data; only training and prediction were carried out, to keep the number of training data sufficient.
As shown in Table 7, when the proportion of training data decreased from 80% to 74.9%, NSE increased from 0.606 to 0.622 and the forecast results remained reliable. However, reducing the training data further to 71.4% and below caused NSE to deteriorate rapidly to between 0.500 and 0.484, making the forecast results unreliable.
For the training data in Figure 10(a1,b1), high flow rate data were included in the learning data, which also allowed high flow rates to be predicted, as shown in Figure 10(a2,b2). However, as the training data were shortened further and the high flow rate data were removed from the learning input, the predicted flow rates were increasingly underestimated, as shown in Figure 10(c4,d4). Nevertheless, for the learning data entered into the model as shown in Figure 10(c3,d3), the accuracy of the learning results was still very high (NSE = 0.993-0.997). Thus, based on the results of this section, the proportion of training data must be between 74.9% and 80.0%. These results suggest a minimum input size of data for establishing a flood warning system for urban rivers. Conversely, the length of the period that can be forecast from the available record can be determined appropriately, so that an effective flood warning system can be established for streams in urban development areas where there is not enough observation data.
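The ratio calculation behind the percentages in this section can be sketched as follows. The function name `split_ratios` and the two training lengths are hypothetical and chosen only to illustrate the arithmetic with the fixed prediction length of 188; the actual training lengths are those listed in Table 7.

```python
def split_ratios(n_train, n_pred):
    """Return (training %, prediction %) of the combined record, to one decimal."""
    total = n_train + n_pred
    return round(100.0 * n_train / total, 1), round(100.0 * n_pred / total, 1)


N_PRED = 188  # prediction length fixed in this section

# Hypothetical training lengths, for illustration only.
ratios_long = split_ratios(752, N_PRED)
ratios_short = split_ratios(561, N_PRED)
```

Because the prediction length is fixed, shortening the training record lowers the training share and raises the prediction share at the same time.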

• Most previous DNN studies in water resources engineering [10][11][12][13][14][15][16][17][18][19][20][21][22] mainly used RNN, Bi-LSTM and LSTM models to predict time series data. The GRU model had similar accuracy to the LSTM model but, as shown in Figure 5, it is simplified by omitting the cell state calculation and using fewer than the 3 gates of the LSTM model, and it can effectively calculate large flow rates.

• As seen in previous studies [10][11][12], most applications of deep learning technology in water resources engineering focused on predicting the water levels in streams and the inflows into dams, where the hydraulic variables vary little. When DNN models learned time series data with very high seasonal fluctuations, relatively accurate predictions were possible for low flow rates, but the accuracy at high flow rates was significantly reduced. Thus, unlike the cases in previous studies of flood runoff prediction, if the difference between the minimum and maximum values of the time series is very large, the predictions become inaccurate and vulnerable to error. In this study, the LSTM and GRU models achieved better results than the other RNN models when the seasonal fluctuations in the flow rate of the urban stream were very large. Among them, the GRU model results were the best.

•
In most areas of water resources engineering [9][13][14][15][16][17][18][19][20][21][22], time series data of arbitrary length were used as input to the DNN models that predict water levels and flow rates, without proper consideration of the sequence length of the data. However, it is essential to verify how accuracy changes with the sequence length of the time series data, because the sequence length directly affects the forecast results. In this study, an NSE of 0.5 was selected as the minimum threshold for the sequence length. For the Han River (CFF = 70.32), which has very large flow fluctuations, the applicable range of sequence length was 7 to 35 days, and a sequence length of 14 days gave the best prediction of the flow rates.

•
When a sufficiently long record of observed flow rates cannot be secured, the minimum length of the input time series flow data to be learned must be determined in order to predict the flood flow rates for a specified period with the minimum acceptable accuracy (NSE ≥ 0.5). In previous studies [10][11][12][13][14][15][16][17][18][19][20][21][22], the lengths of the training data and forecast data were determined arbitrarily. In this study, when the length of the training data was within the range of 74.9% to 80% of the total data length, high flow rates were also predicted accurately in the forecast results.
As discussed above, in most cases where deep learning is applied in water resources engineering to predict flow rates or water elevations, the sequence length and the length of the training data were chosen arbitrarily, without clear evidence. This study therefore presents meaningful results that allow the input length of data to be determined quantitatively for the Han River, an urban stream with large seasonal flow rate fluctuations.
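The two quantities discussed in the points above, the NSE criterion and the sequence length, can be sketched as follows. The NSE formula is the standard Nash-Sutcliffe efficiency, NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))². The windowing function is an assumption: it pairs each run of `sequence_length` consecutive values with the value that follows it, which is one common way to feed a time series to a recurrent model, and the source does not confirm that this exact scheme was used. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def nse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def make_windows(
    series: Sequence[float], sequence_length: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (input window, next value) pairs for one-step-ahead prediction.

    Assumption: each sample uses `sequence_length` consecutive values as input
    and the immediately following value as the target.
    """
    n_samples = len(series) - sequence_length
    if n_samples <= 0:
        raise ValueError("series must be longer than sequence_length")
    inputs = [list(series[i:i + sequence_length]) for i in range(n_samples)]
    targets = [series[i + sequence_length] for i in range(n_samples)]
    return inputs, targets


# Usage: a 14-day sequence length on a short toy series (hypothetical values).
toy_series = [float(day) for day in range(20)]
inputs, targets = make_windows(toy_series, sequence_length=14)
print(len(inputs), len(inputs[0]), targets[0])
```

A longer sequence length gives each sample more history but leaves fewer samples from the same record, which is one reason the choice has to be checked against forecast accuracy rather than fixed arbitrarily.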

Critical Conditions of Deep Learning to Ensure Reliability
Through this study, the critical conditions for reliable prediction in urban streams with very large seasonal changes in flow rate, such as the Han River, could be proposed as shown in Figure 11. NSE was used as the criterion for evaluating the performance of the learning and forecasting models. By varying the sequence length and the training data size (%) and plotting the resulting learning and prediction results, the performance of the deep learning model can be expressed as criteria that can be used in future flood forecasting systems, as shown in Figure 11. Based on Table 4, the minimum acceptable NSE was set to 0.5. When the relationship between the two variables is plotted as in Figure 11, the condition under which NSE exceeds 0.5 and reaches its maximum is taken to be the most effective input condition of the deep learning model for accurate flow rate prediction: the flow rate can be predicted accurately by selecting a sequence length between 7 and 21 days and a training data size of 74.9% to 80%. In addition, as shown in Section 4.3, high flow rate data should be kept in, not removed from, the training data entered into the deep learning model. In the sensitivity analysis, the accuracy was calculated while changing the data length in order to determine the appropriate sequence length. This result was derived for rivers with large flow rates and large flow fluctuations, such as the Han River, and provides a basis for engineering judgment in predicting flow rates under seasonal changes. However, sensitivity analysis of the input length and sequence length of data for various other streams requires further research.
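The selection rule described above (keep only conditions with NSE ≥ 0.5, then take the one with the highest NSE) can be written as a short sketch. The function names are hypothetical, and the NSE values in `placeholder_scores` are invented purely to make the example run; they are not results from this study or from Figure 11.

```python
from typing import Dict, List, Tuple

# (sequence length in days, training data size in %)
Condition = Tuple[int, float]

NSE_THRESHOLD = 0.5  # minimum acceptable NSE adopted in the text


def acceptable_conditions(
    scores: Dict[Condition, float], threshold: float = NSE_THRESHOLD
) -> List[Condition]:
    """Return, in sorted order, every condition whose forecast NSE meets the threshold."""
    return sorted(cond for cond, value in scores.items() if value >= threshold)


def best_condition(
    scores: Dict[Condition, float], threshold: float = NSE_THRESHOLD
) -> Condition:
    """Return the acceptable condition with the highest forecast NSE."""
    candidates = acceptable_conditions(scores, threshold)
    if not candidates:
        raise ValueError("no condition reaches the NSE threshold")
    return max(candidates, key=lambda cond: scores[cond])


# PLACEHOLDER forecast NSE values, invented for illustration only.
placeholder_scores: Dict[Condition, float] = {
    (7, 70.0): 0.41,
    (7, 80.0): 0.62,
    (14, 70.0): 0.48,
    (14, 80.0): 0.71,
    (21, 80.0): 0.58,
    (35, 80.0): 0.52,
}

print(acceptable_conditions(placeholder_scores))
print(best_condition(placeholder_scores))
```

In practice each score would come from training and evaluating the model once per combination of sequence length and training data size, so the grid should be kept small enough for the training cost to stay manageable.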

Conclusions
Recently, due to climate change and urban development projects around rivers, flood prediction technology for urban streams has been studied in various ways. Because of concerns over urban flooding caused by the rapid increase in floods due to urbanization, flood prevention measures are being prepared to create safe and sustainable waterfront spaces. To achieve this goal, high flow rates must be predicted accurately, and for rivers with severe seasonal changes in flow rates in particular, it is imperative to prepare a flood warning system. Therefore, in this study, 5 deep learning models (i.e., CNN, simple RNN, LSTM, Bi-LSTM and GRU) were applied and the accuracy of their high flow rate forecasts was evaluated, in order to suggest the deep learning technique best suited to predicting continuous time series data for accurate high flow rate forecasting.
For the Han River, which flows through Seoul, Korea, GRU was chosen as the deep learning model suitable for cases where seasonal fluctuations in flow rate are up to 70.32 times. To predict the time series flow accurately, it is also important to set a sequence length that properly reflects the trend of the preceding time series. By calculating the prediction accuracy for various sequence lengths, a sequence length between 7 and 21 days was found to be appropriate. However, a sequence length of 7 days led to overfitting during training, whereas a sequence length of 14 days minimized the overfitting that can occur during the training process. Finally, for the prediction of the time series flow rate over a given period, the minimum length (%) of the training data could be proposed as 74.9-80%. Under these conditions, reliable forecast results could be obtained.
This study examined the accuracy of the predicted flow rates according to the length of the input flow data in urban streams, where the flow rate increases and fluctuates greatly due to urbanization. If the minimum length of input data is available, deep learning can predict flow rates quickly and accurately, which is regarded as an important advantage over calculations with traditional CFD models.