1. Introduction
Inundation of the hinterland due to dike breaches poses a worldwide flood risk. It is expected that flood losses will significantly increase in the future due to climate change and as societies become wealthier [
1]. Accurate prediction of potential dike breach locations, as well as of the timing of a dike failure and the resulting outflow hydrograph, is crucial for decision-makers to establish appropriate flood mitigation measures such as evacuation plans. Until now, two-dimensional (2D) hydraulic models have generally been used to predict the disastrous consequences of river flood events (e.g., [
2,
3,
4]). However, the computational times of these models are relatively long, especially when large river basins are considered. Furthermore, a probabilistic approach requires many hydraulic simulations to find the potential dike breach locations, since the critical water level at which a dike may fail is highly uncertain and varies spatially along a river reach. In crisis situations, a quick estimate of the dikes most prone to breach is essential. Therefore, 2D hydraulic models cannot be used for real-time flood forecasting purposes, even with a computational cluster.
For this reason, neural networks have been applied more frequently in the field of hydrology in recent years [5]. Neural networks are data-driven models trained on the input–output relations of a physically-based model or on field measurements. The advantages of a neural network are that it is fast, with simulation times of less than a second, that it can handle incomplete and noisy input data, and that it is able to reproduce complex nonlinear behavior between input and output [
6,
7,
8]. Most of the studies that developed neural networks for flood forecasting purposes focused on the prediction of discharges and/or water levels at specific sites based on data of upstream gauge stations without considering the effects of dike failure (e.g., [
9,
10,
11,
12,
13]). Shen and Chang [
8] extended the use of neural networks by developing NARX neural networks that were capable of predicting flood time series in terms of inundation extents in urban areas. These inundations were caused by extreme precipitation events. It was found that a NARX neural network can effectively be used for multistep-ahead forecasts of rainfall-triggered flood events, even when imperfect input data are used [
8]. Bomers et al. [
14] showed the applicability of neural networks in case of dike breaches by predicting maximum water levels during an extreme historic flood event. However, the study made use of predefined dike breach locations while these locations are highly uncertain in reality. Furthermore, breach outflow peaks and hydrographs have already been successfully predicted using neural network approaches (e.g., [
15,
16]). These studies focused on a single breach only and prediction of the dike breach location along a river stretch was not included in the analysis.
Even though the previous studies successfully showed the possibilities of using neural networks in a future flood forecasting system, no models exist that can predict the dike breach location, timing, and outflow volume both quickly and accurately in case of extreme flood events. This study shows the potential of using neural networks to predict whether a dike section will fail during a flood event, when it will fail, and the corresponding outflow hydrograph, since these parameters are the main drivers of the total flood damage during a flood event. Multiple dike breach locations in a river delta with multiple river bifurcations are considered. The developed neural networks should be able to correctly predict outflow hydrographs of the dike breaches based on an upstream discharge wave. To do so, the neural networks should accurately capture the effect of a dike breach on downstream water levels and, consequently, on the failure probabilities of the remaining potential dike breach locations.
To reach the objective of this study, several simplifying assumptions are made. First, a deterministic approach is applied in which dike sections only fail due to the overtopping failure mechanism, independent of the type of structure. Consequently, the potential dike breach locations are known from the hydraulic modeling simulations. Neural networks are only developed for the dike sections that breached during at least one of the multiple flood events simulated with the hydraulic model to generate the training data. The trained neural networks predict whether these dike sections will fail and predict the resulting outflow hydrographs.
If the outflow hydrographs of multiple dike breaches during potential flood events can be predicted both quickly and accurately, this demonstrates the applicability of neural networks in real-time flood forecasting systems in the near future. The Dutch Rhine river delta is used as a case study (
Section 2). However, the proposed methodology can be applied to any river system in the world where flood defenses may fail, resulting in outflow hydrographs. First, training data is generated using a one-dimensional–two-dimensional (1D–2D) coupled hydraulic model in which the river is solved in 1D and the hinterland in 2D. The hydraulic model setup is described in
Section 3.1. The upstream peak discharge and the shape of the discharge wave are varied to ensure that a wide range of potential realistic flood events is simulated (
Section 3.3). These upstream discharge waves result in dike breaches and, consequently, outflow hydrographs in the studied area. These input–output relations of the hydraulic model are used to train the neural networks (
Figure 1), presented in
Section 4. The results are shown in
Section 5. The paper ends with a discussion and conclusions in
Section 6 and
Section 7, respectively.
4. The NARX Setups
Even though many data-driven models exist, we focus on the development of neural networks. Neural networks are the most commonly used response surface surrogate models in water resources problems [
5] since they provide an attractive solution to problems in complex systems because they can, theoretically, handle incomplete and noisy data [
6]. Furthermore, neural networks have been shown to be highly accurate in predicting water levels during flood events (e.g., [
7,
28,
29,
30,
31]) as well as in predicting breach outflow hydrographs at a single location (e.g., [
15,
16]).
This study focuses on the prediction of the outflow hydrographs of potential dike breaches, which requires neural networks that are capable of predicting a time-varying output based on an input time series. For this reason, nonlinear autoregressive with external input (NARX) neural networks are developed for each dike section that breached during at least one of the hydraulic simulations performed to create the training data. A NARX network is a recurrent dynamic neural network with feedback connections suitable for time series prediction [
8,
32]. NARX networks have widely been applied in the field of hydrology ranging from predicting groundwater levels [
32] to floods within urban drainage systems [
33] and rainfall-triggered flood forecasting in urban and rural areas [
8,
34]. This study uses NARX neural networks to predict outflow hydrographs of multiple potential dike breach locations for the first time. A hydraulic model with computational times in the order of 10 h for a single simulation is used to create the training data, limiting the data set that can be created and consequently used to train the NARX networks. A NARX predicts a time series
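y(t) based on d past values of y(t) and an input time series x(t) according to the following equation:
$$y(t) = f\big(y(t-1), y(t-2), \ldots, y(t-d),\; x(t-1), x(t-2), \ldots, x(t-d)\big)$$
where f is the nonlinear mapping learned by the network.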
In this study, y(t) represents the predicted dike breach outflow hydrograph, x(t) the upstream discharge wave, and d the number of past time steps. The upstream discharge waves and the outflow hydrographs, both having a time step of one hour, are normalized because the accuracy of NARX predictions has been shown to increase when the networks are trained on normalized data sets, as neural networks tend to favor inputs that can take larger values [6]. The data sets are normalized to a [−1,1] interval [33].
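The normalization step can be illustrated with a minimal Python sketch. It assumes a plain linear min–max mapping onto [−1, 1], which is one common way to obtain such an interval; the function names and the sample discharge values are hypothetical and are not taken from the study.

```python
import numpy as np

def fit_minmax(series):
    """Return the (lowest, highest) value of a training series."""
    series = np.asarray(series, dtype=float)
    return float(series.min()), float(series.max())

def to_unit_interval(series, lo, hi):
    """Map values linearly so that lo -> -1 and hi -> +1."""
    series = np.asarray(series, dtype=float)
    return 2.0 * (series - lo) / (hi - lo) - 1.0

def from_unit_interval(scaled, lo, hi):
    """Invert to_unit_interval, returning values in physical units."""
    scaled = np.asarray(scaled, dtype=float)
    return (scaled + 1.0) * (hi - lo) / 2.0 + lo

# Hypothetical hourly upstream discharge wave (m^3/s), for illustration only.
discharge = np.array([12000.0, 14500.0, 17000.0, 18000.0, 16000.0, 13000.0])
lo, hi = fit_minmax(discharge)
scaled = to_unit_interval(discharge, lo, hi)
restored = from_unit_interval(scaled, lo, hi)
```

The bounds are taken from the training series and reused for the inverse mapping, so that network outputs can be converted back to m³/s.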
A NARX network is set up for each dike breach location that failed during the hydraulic modeling simulations. It was decided to set up a separate NARX network for each dike breach location instead of a single network for the entire system to reduce the complexity of the neural network structure and the training time required. Furthermore, this approach is justified by the fact that the NARX network should be able to predict the spatial relations already present in the hydraulic model used to create the training data. A dike breach leads to a reduction in downstream water levels. Consequently, dike failure probabilities in downstream branches decrease. The question of whether the trained neural networks are capable of reproducing this highly nonlinear system behavior is addressed in
Section 5.3.
The NARX networks are developed using the MATLAB Deep Learning Toolbox 14.1. A feed-forward network is developed, meaning that the information in the neurons flows in one direction: from an input layer through a hidden layer with a number of neurons to an output layer [
6,
29]. The hyperbolic tangent sigmoid activation function is used to compute an output based on the weighted sum of all inputs. A sigmoid activation function is generally applied in the literature since it is capable of introducing nonlinear behavior to the network [
30,
35].
During the training procedure, the number of neurons present in the hidden layer and the delays
d must be specified by the modeler. Both parameters are case-specific since they depend on the complexity of the system [
36] as well as on the training data availability [
5]. Most commonly, a trial-and-error procedure is applied to determine the appropriate number of neurons and delays [
8,
28,
37]. The same approach is applied in this study. It was found that the NARX networks produce the most reliable results for two neurons in the hidden layer and two time step delays (
Figure 5).
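As an illustration of the resulting network size, the following Python sketch evaluates a NARX-type network with two hidden tanh neurons and two delays, and iterates it over a full input series by feeding its own predictions back. It is a sketch under stated assumptions, not the MATLAB implementation used in the study: the linear output neuron, the random weights, and all names are assumptions made for illustration.

```python
import numpy as np

D = 2  # number of delays (past time steps)
H = 2  # number of neurons in the hidden layer

def narx_step(y_past, x_past, W_in, b_in, w_out, b_out):
    """One-step prediction of y(t) from D past outputs and D past inputs."""
    features = np.concatenate([y_past, x_past])   # shape (2 * D,)
    hidden = np.tanh(W_in @ features + b_in)      # tanh hidden layer, shape (H,)
    return float(w_out @ hidden + b_out)          # linear output neuron (assumed)

def closed_loop(x, y_init, params):
    """Iterate narx_step, feeding earlier predictions back as past outputs."""
    y = list(y_init)                              # the first D outputs are given
    for t in range(D, len(x)):
        y.append(narx_step(np.array(y[t - D:t]), x[t - D:t], *params))
    return np.array(y)

# Random, untrained weights: placeholders for illustration only.
rng = np.random.default_rng(0)
params = (rng.normal(size=(H, 2 * D)), rng.normal(size=H),
          rng.normal(size=H), 0.0)
x = np.linspace(-1.0, 1.0, 24)                    # normalized input series
y_hat = closed_loop(x, y_init=[0.0, 0.0], params=params)
```

With only two hidden neurons and two delays, the network has few trainable weights, which suits a training set limited by the cost of the hydraulic simulations.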
A general problem with response surface surrogate models is overfitting [
29,
31], which means that the surrogate model fits the noise existing in the training data rather than the underlying function [
5]. Two well-established approaches exist to avoid overfitting during the training procedure of neural networks: early stopping using the Levenberg–Marquardt (LM) algorithm and Bayesian regularization [
5,
6]. Both approaches are implemented in the MATLAB Deep Learning Toolbox. The two approaches were tested, and it was found that the LM algorithm resulted in a slightly more accurate NARX setup (i.e., a slightly higher Nash–Sutcliffe coefficient). Furthermore, the LM algorithm is commonly applied to train NARX neural networks (e.g., [
32,
33,
34]), as it is considered a fast and efficient training function [
38]. During the training procedure, the training data set is randomly divided into three sets: training (60%), validation (20%), and testing (20%). The training data set is used to train the NARX network, the validation data set is used to test whether increasing the data set during the training phase results in a more accurate NARX network, and the testing data set represents an independent data set used to validate the NARX network performance after being trained.
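A random 60/20/20 division can be sketched in Python as follows. The function name and the sample count of 80 are illustrative assumptions; the sketch does not depend on whether the samples are simulations or individual time steps.

```python
import numpy as np

def split_indices(n_samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly divide sample indices into training, validation, and testing sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)            # shuffled indices 0..n_samples-1
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]                # remainder goes to testing
    return train, val, test

train_idx, val_idx, test_idx = split_indices(80)
```

Fixing the seed makes the division reproducible, which helps when several networks trained from different starting weights are compared.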
The NARX networks are trained in open-loop form [32], i.e., as single-step predictors, which results in more efficient training compared to a closed-loop form in which predictions are iterated over many time steps. The maximum number of epochs was set to 1000. The mean squared error (MSE) was used as a loss function such that errors in the estimation of the maximum outflow discharge are weighted more heavily than errors in smaller values. Since training a neural network multiple times results in a slightly different network each time as a result of different starting conditions, 10 NARX networks were trained for each breach location. Even though the differences in the overall performance of the various trained networks were small, the NARX networks with the highest accuracy are presented in
Section 5. The accuracy of the NARX networks is evaluated using the Nash–Sutcliffe model efficiency coefficient (NSE), which can be computed with the following equation [39]:
$$\mathrm{NSE} = 1 - \frac{\sum_{n=1}^{N} \left(Q_m^n - Q_o^n\right)^2}{\sum_{n=1}^{N} \left(Q_o^n - \overline{Q_o}\right)^2}$$
where $Q_m$ is the outflow discharge predicted by the NARX network (m³/s), $Q_o$ is the target outflow discharge simulated by the hydraulic model (m³/s), and $\overline{Q_o}$ is the average of the target outflow discharge (m³/s). N represents the total number of simulations included in the calculation and n the index. An NSE value equal to 1 corresponds with a perfect fit between the NARX network predictions and the hydraulic model output, whereas a value below 0 indicates that the mean outflow discharge simulated by the hydraulic model is a better predictor [40].
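The NSE can be computed in a few lines of Python. The sketch below follows the standard Nash–Sutcliffe definition; the sample hydrograph values are hypothetical.

```python
import numpy as np

def nash_sutcliffe(q_model, q_target):
    """Nash-Sutcliffe efficiency of predictions against target values."""
    q_model = np.asarray(q_model, dtype=float)
    q_target = np.asarray(q_target, dtype=float)
    squared_error = np.sum((q_model - q_target) ** 2)
    target_spread = np.sum((q_target - q_target.mean()) ** 2)
    return 1.0 - squared_error / target_spread

# Hypothetical target outflow discharges (m^3/s), for illustration only.
q_target = np.array([0.0, 0.0, 4000.0, 2500.0, 1500.0, 800.0])
perfect = nash_sutcliffe(q_target, q_target)
mean_only = nash_sutcliffe(np.full_like(q_target, q_target.mean()), q_target)
```

A prediction identical to the target gives an NSE of 1, and predicting the target mean at every step gives an NSE of 0, which matches the interpretation given above.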
5. Results
It was found that only two out of the 28 potential dike breach locations failed due to the 80 flood events considered, namely the most upstream potential dike breach location located just downstream of the upstream boundary condition of the hydraulic model, and the most upstream potential dike breach location along the IJssel river (green dots in
Figure 2). These two breach locations are referred to as “the most upstream dike breach location” and “the IJssel river dike breach location”, respectively, from now on. A NARX network is set up only for these two locations to test whether such a neural network is capable of correctly predicting the shape and total volume of a dike breach outflow hydrograph. First, the hydraulic modeling results are presented in
Section 5.1 to gain insight into the system behavior during flood events. This insight is required to identify the main difficulties within the system that should be captured by the NARX networks. The accuracy of the developed NARX networks is described in
Section 5.2,
Section 5.3 and
Section 5.4.
5.1. Hydraulic Modeling Results
During all simulations, the breach location along the IJssel river (
Figure 2) overtopped and thus breached first. The most upstream dike breach location (
Figure 2) only failed if the peak of the upstream discharge wave was larger than approximately 17,100 m³/s. The shape of the outflow hydrograph at this location was not influenced by the dike breach along the IJssel river since the distance between the two locations was sufficiently large for backwater effects to have vanished. However, the most upstream dike breach does significantly affect the shape of the outflow hydrograph of the dike breach at the IJssel river (
Figure 6). This can be explained by the fact that the most upstream dike breach changes the shape of the discharge wave in the river system. The originally smooth discharge wave now has a sudden drop due to the outflow through the breach (
Figure 6). This altered discharge shape consequently changes the shape of the outflow hydrograph at the IJssel river.
The NARX networks at both dike breach locations are set up with the upstream discharge wave as input parameter. Therefore, special attention must be paid to the accuracy of the NARX network of the IJssel breach location for upstream discharges with a peak value lower and higher than 17,100 m³/s, resulting in a smooth and highly altered outflow hydrograph, respectively.
5.2. Validation of the NARX Neural Networks
The hydraulic model cannot be used as an early flood forecasting tool because of the long computational time of a single simulation in the order of 10 h on a standard PC. However, the NARX networks were trained in less than 5 s. After training, it was possible to compute the outflow hydrographs of 80 potential flood events in 0.07 s. Even though creating the training data with the hydraulic model is a huge time investment, equal to approximately 800 h in this study, the NARX networks have great potential to be used for flood forecasting purposes when trained.
During the training, validation, and testing procedures, different upstream discharge shapes were used. It was found that the NARX networks were able to respond to varying upstream discharge shapes accurately, even for the ones highly deviating from the ones present in the training data set. For the most upstream dike breach location and the dike breach location along the IJssel river, an NSE of 0.93 and 0.96 was found, respectively.
Figure 7 shows the regression lines of the two dike breach locations; the predictions of the hydraulic model and the NARX networks are presented for each time step. It shows that the NARX predictions closely resemble the hydraulic model output since almost all data points are present at the linear 1:1 line. However, some differences are present:
If no dike breach has occurred yet (hydraulic model output is equal to 0), the NARX network predicts a small negative or positive discharge. This can be seen from the vertically aligned data points clustered around T = 0 in Figure 7.
The NARX networks predict the peak of the outflow hydrograph one time step later than the hydraulic model. This explains the horizontally aligned data points clustered around Y = 0 in Figure 7.
The maximum outflow discharges seem to be underpredicted by the NARX networks, especially for the most upstream breach location. This can be seen from the data points representing discharges larger than 6000 m³/s that deviate from the 1:1 line (Figure 7a).
These three findings are discussed in more detail in the next section, in which we will focus on the shapes of the predicted outflow hydrographs. The implications of these deviations on the total outflow volume, important for flood extent predictions, are discussed in
Section 5.4. To do so, the results of three upstream discharge waves are presented: one with a peak value lower than 17,100 m³/s such that the upstream dike breach location does not breach and two discharge waves with a peak value larger than 17,100 m³/s, both with a different discharge wave shape (Figure 8).
5.3. Prediction of the Outflow Hydrograph Shapes
5.3.1. The Most Upstream Dike Breach Location
From the hydraulic modeling results (
Section 5.1), it was found that dike breaches only occur at the most upstream dike breach location if the peak value of the upstream discharge wave is larger than 17,100 m³/s. However, the NARX models are sensitive to any change in the input parameter. Even a small change in the upstream discharge results in a different response of the NARX network. Consequently, the NARX network always produces a nonzero discharge prediction (
Figure 9a). This finding explains the data points clustered around the T = 0 location in
Figure 7. However, the predicted discharges are irrelevant for flood forecasting purposes since they are extremely low.
For upstream discharges with a peak value larger than 17,100 m³/s, the predicted outflow hydrograph of the NARX network closely resembles that of the hydraulic model (
Figure 9b,c). The timing of the maximum discharge outflow is predicted quite accurately. The exact timing of the peak was shifted one time step, meaning that the peak occurs one hour later in the NARX predictions compared to the hydraulic model output. This explains the data points clustered around Y = 0 in
Figure 7. If more accurate timing of the peak outflow is required, the time step of the input and output data used to train the NARX network can be decreased to, for example, 1 min. Furthermore, the shape of the outflow hydrograph in the falling stage is correctly predicted by the NARX network (
Figure 9).
Even though the shapes of the outflow hydrographs are predicted accurately, the peak value is underpredicted by 27.9% on average for the most upstream breach location. Multiple loss functions were considered during the training procedure of the networks. However, all showed a similar, or even worse, pattern. A well-known problem with neural networks in general is that they are prone to systematic underprediction of flood series for extreme flood events. This underprediction can be reduced by, for example, postprocessing the NARX predictions by applying an unscented Kalman filter [
34].
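The two peak-related deviations discussed above, the underprediction of the peak value and the shift in peak timing, can be quantified as in the following Python sketch. The function names and hydrograph values are hypothetical and chosen only to illustrate the calculation.

```python
import numpy as np

def peak_error_percent(q_pred, q_ref):
    """Relative error of the predicted peak; negative means underprediction."""
    return 100.0 * (np.max(q_pred) - np.max(q_ref)) / np.max(q_ref)

def peak_lag_steps(q_pred, q_ref):
    """Time steps by which the predicted peak lags the reference peak."""
    return int(np.argmax(q_pred)) - int(np.argmax(q_ref))

# Hypothetical hourly outflow hydrographs (m^3/s), for illustration only.
q_ref = np.array([0.0, 0.0, 8000.0, 3000.0, 2000.0, 1200.0])
q_pred = np.array([5.0, -3.0, 2500.0, 6000.0, 2100.0, 1150.0])
err = peak_error_percent(q_pred, q_ref)
lag = peak_lag_steps(q_pred, q_ref)
```

For these sample series, the predicted peak is 25% lower than the reference peak and occurs one time step later.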
5.3.2. The IJssel River Dike Breach Location
Figure 10 shows multiple outflow hydrographs at the IJssel river dike breach location as predicted by the NARX network and the hydraulic model. It was found that the NARX network is capable of predicting the shape of the outflow hydrographs accurately. This also applies when the peak of the upstream discharge wave is larger than 17,100 m³/s. For these cases, the most upstream dike section breached as well, influencing the shape of the discharge wave traveling in downstream direction through the river system and, consequently, the outflow hydrograph at the IJssel river (
Section 5.1). This shows that the NARX network is capable of correctly identifying the system behavior. As a result, the NARX network is able to establish a relation between the upstream discharge wave and outflow hydrographs, even if the shape of this discharge wave changes in the river system due to other dike breaches. This finding shows the capabilities of using neural networks as an early flood forecasting system in a bifurcating river system where the dynamics due to multiple dike breaches are highly complex.
Furthermore, we found deviations in the predicted outflow hydrograph shapes at the IJssel river similar to those found for the most upstream dike breach location, namely the prediction of small outflow discharges before the dike has failed and a shift of one time step in the timing of the peak value. However, the underprediction of the peak outflow discharge was less severe for the IJssel river breach location than for the upstream breach location. For the IJssel river breach location, the peaks were underpredicted by 3.5% on average. This is mainly the result of the altered outflow hydrograph if the upstream breach location fails. For these situations, no extreme peak is present in the outflow hydrograph of the IJssel river (
Figure 10b,c). However, if the upstream dike breach location does not fail, an extreme peak with a duration of one time step is present, again resulting in an underprediction of this peak value of at most 21.2% (
Figure 10a).
5.4. Prediction of the Outflow Hydrograph Volumes
For proper flood evacuation plans, not only the timing of a dike breach is important, but the total flood volume that may flow into the hinterland is also important since this largely determines the flood extent in the hinterland.
Section 5.3 showed that the NARX networks underpredicted the peak outflow discharge, especially for the upstream breach location. However, since this peak discharge only occurred for a short moment in time just after the dike breached, this underprediction has almost no effect on the total outflow flood volume (
Figure 11 and
Figure 12). On average, the NARX networks predict a total flood volume which is around 1.67% and 1.32% lower compared to the hydraulic model predictions for the most upstream and IJssel river breach location, respectively. Besides the total flood volume, the cumulative flood volume over time is also predicted accurately (
Figure 11 and
Figure 12).
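The limited effect of a short-lived peak on the total volume can be illustrated with a small Python sketch. It assumes a simple rectangle-rule integration over the hourly time step; the hydrograph values and function names are hypothetical.

```python
import numpy as np

DT_SECONDS = 3600.0  # hourly time step of the hydrographs

def cumulative_volume(q, dt=DT_SECONDS):
    """Cumulative outflow volume (m^3) from a discharge series (m^3/s)."""
    q = np.asarray(q, dtype=float)
    return np.cumsum(q) * dt                      # rectangle rule (assumed)

def volume_deviation_percent(q_pred, q_ref, dt=DT_SECONDS):
    """Relative deviation of the total predicted volume from the reference."""
    v_pred = cumulative_volume(q_pred, dt)[-1]
    v_ref = cumulative_volume(q_ref, dt)[-1]
    return 100.0 * (v_pred - v_ref) / v_ref

# Hypothetical hydrographs: the predicted peak is 25% too low.
q_ref = np.array([0.0, 8000.0, 3000.0, 2000.0, 1000.0])
q_pred = np.array([0.0, 6000.0, 4500.0, 2100.0, 1050.0])
total_ref = cumulative_volume(q_ref)[-1]
deviation = volume_deviation_percent(q_pred, q_ref)
```

In this constructed example, a 25% underprediction of the peak translates into a deviation of only 2.5% in the total volume, because the peak lasts a single time step.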
Furthermore, Section 5.3 showed that the NARX networks always compute an outflow discharge, even if the dike has not failed yet. Even though the computed outflow discharges are low, they still result in a cumulative outflow volume in the order of 2 × 10⁷ m³ (Figure 11a). This incorrectly predicted outflow volume can be confused with a dike breach if these networks are used by decision-makers in crisis situations. Therefore, it is recommended to postprocess the neural network predictions by including a threshold. Only if the outflow discharge exceeds this threshold should the dike section be identified as failed and the corresponding outflow volume be computed.
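Such a threshold-based postprocessing step could look like the following Python sketch. The threshold value of 100 m³/s, the choice to set all values before the first exceedance to zero, and the function name are assumptions made for illustration; no specific threshold is prescribed here.

```python
import numpy as np

def apply_breach_threshold(q_pred, threshold):
    """Flag a breach only if the threshold is exceeded, and zero earlier values."""
    q = np.asarray(q_pred, dtype=float)
    above = np.flatnonzero(q > threshold)         # time steps above the threshold
    cleaned = np.zeros_like(q)
    if above.size == 0:
        return False, cleaned                     # no breach: outflow stays zero
    start = above[0]                              # first exceedance marks the breach
    cleaned[start:] = np.clip(q[start:], 0.0, None)
    return True, cleaned

THRESHOLD = 100.0  # m^3/s, hypothetical value

# Hypothetical predictions: small spurious values versus a real breach.
no_breach = np.array([5.0, -3.0, 8.0, 6.0])
breach = np.array([5.0, -3.0, 2500.0, 6000.0, 2100.0])
flag_a, q_a = apply_breach_threshold(no_breach, THRESHOLD)
flag_b, q_b = apply_breach_threshold(breach, THRESHOLD)
```

With this filter, the small spurious discharges no longer accumulate into an outflow volume at locations where the dike has not failed.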
6. Discussion
This study showed the potential of NARX networks to predict multiple outflow hydrographs of potential dike breaches in a bifurcating river system. This approach has the potential to be extended to a real-time flood forecasting system that is also capable of predicting inundation extents to set up evacuation plans. In reality, a flood event and its consequences are highly uncertain and probabilistic in nature. However, assumptions were made in the current approach to reduce the complexity of the system. These assumptions are addressed in more detail below.
6.1. Critical Water Levels
Dikes can fail due to various mechanisms (e.g., overtopping, piping, macrostability), making the critical water level highly uncertain. In this study, it was assumed that the various potential dike breach locations could only fail due to overtopping. For this failure mechanism, the critical water levels were assumed to be equal to the dike crest levels and, subsequently, the neural networks were able to accurately predict the dike breach locations. However, this critical water level is highly uncertain. Furthermore, for other failure mechanisms such as piping and macrostability, the critical water level can be substantially lower than currently considered. For future research, we aim to extend the current deterministic approach by including critical water levels in a probabilistic manner (e.g., [
2,
21,
41,
42]). Using such a probabilistic approach increases the complexity of the system significantly. Now, only two out of the 28 potential dike breach locations failed. Furthermore, the IJssel river dike breach location always failed earlier than the most upstream dike breach location since constant critical water levels were assumed. More dike sections may breach and the order in which the various dike sections fail may change by including multiple failure mechanisms randomly. This increase in the system complexity may require additional feedback layers in the NARX network setups. An option could be to update the water levels in the various river branches due to potential dike breaches during the NARX predictions. As a consequence, the interaction between critical water levels, dike breaches, and upstream and downstream water levels can be considered. How such a NARX network should be set up in detail is recommended for future research.
Furthermore, the critical water levels may change over time. Dike reinforcements are expected at the locations currently identified as potential dike breaches during flood events. If flood probability reduction measures are taken (e.g., dike reinforcements, floodplain widening, construction of a side channel), the NARX networks should be retrained. NARX networks, and data-driven models in general, do not have any physical interpretation. Instead, they are based on the input–output relations of a physically-based model. Therefore, new training data must be created with an updated hydraulic model representing the correct river schematization.
6.2. Infinitely High Dikes
In this study, water could only leave the river system if a dike breached. Overflow was not included in the analysis, assuming infinitely high dikes to reduce the complexity of the system. The considered dike breach locations did not always have the lowest crest level in a specific dike section. Therefore, in reality, overflow may have already happened at surrounding dike sections before a specific dike section failed. However, it is expected that the neural networks are also able to accurately predict outflow hydrographs in case overflow is included since the NARX neural networks have been shown to be able to include the spatial relations present in the system in terms of a changing discharge wave in downstream direction.
6.3. Toward an Early Flood Forecasting System
The NARX neural networks developed in this study only predicted outflow hydrographs. Inundation extents in the hinterland were not predicted, whereas a fast and accurate overview of the inundated areas helps to set up proper evacuation schemes. With the predicted outflow hydrographs, it was possible to compute the total flood volume entering the hinterland. These flood volumes should be transferred to inundation extents. Determination of how this can be efficiently performed using neural networks is recommended as an objective of future work.
7. Conclusions
During extreme flood events, dikes may fail, causing inundations in the hinterland and, consequently, flood damage. Early flood warning systems can help to mitigate the consequences of large flood events. In this study, NARX neural networks were developed for the potential dike breach locations in the Dutch Rhine river system. These NARX networks were able to accurately predict whether a specific dike section will fail and the corresponding outflow hydrographs. The timing of these dike failures was accurately predicted. Even though the maximum outflow discharges were underpredicted, which is a common phenomenon of neural networks under extreme conditions, the total outflow volumes only deviated by 1.67% and 1.31%, on average, for the two breach locations. These outflow volumes have the potential to be used to predict inundation extents in the hinterland, which is recommended to be included in the neural network setups for future research. Furthermore, various failure mechanisms should be included in future work to enable even more realistic prediction of the dike breach locations. By doing so, NARX neural networks have great potential to be used within early flood warning systems in the future.