A Support Vector Machine Forecasting Model for Typhoon Flood Inundation Mapping and Early Flood Warning Systems

Accurate real-time forecasts of inundation depth and extent during typhoon flooding are crucial to disaster emergency response. To manage disaster risk, the development of a flood inundation forecasting model has been recognized as essential. In this paper, a forecasting model that integrates a hydrodynamic model, the k-means clustering algorithm and support vector machines (SVM) is proposed. The study is divided into four parts. First, the SOBEK model is used to simulate inundation hydrodynamics. Second, the k-means clustering algorithm classifies flood inundation data and identifies the dominant clusters of flood gauging stations. Third, SVM yields water level forecasts with 1–3 h lead time. Finally, a spatial expansion module produces flood inundation maps, based on forecasted information from flood gauging stations and consideration of flood causative factors. To demonstrate the effectiveness of the proposed forecasting model, we present an application to the Yilan River basin, Taiwan. The forecasting results indicate that the water level forecasts from the point forecasting module are in good agreement with the observed data, and that the proposed model yields accurate flood inundation maps for 1–3 h lead time. These results indicate that the proposed model accurately forecasts not only flood inundation depth but also inundation extent. This flood inundation forecasting model is expected to be useful in providing early flood warning information for disaster emergency response.


Introduction
According to records from the past 100 years from the Central Weather Bureau, Taiwan, an average of three typhoons strike Taiwan each year [1]. During typhoons, heavy rainfall and widespread flood inundation typically impact the island of Taiwan. To illustrate the extremity of these events, the average annual rainfall in Taiwan is around 2500 mm, with more than 78% of the precipitation concentrated in the typhoon season from May to October [2]. The torrential rainfall brought by typhoons frequently causes flood inundation, which leads to the loss of life and property. For disaster mitigation in the were determined and selected as input in a hybrid model by integrating the principal component analysis, logistic regression and frequency distribution analysis to quantify hazard potential and to map flood characteristics [30]. Among these studies, the flood causative factors are determined by using the correlation coefficient analysis. Nine flood causative factors are identified and adopted as input to the proposed model, including elevation, slope, aspect, curvature, plan curvature, profile curvature, TWI, distance to river, and SPI.
In the present study, flood inundation depths are simulated by the SOBEK model, as calibrated and validated by survey data of flood inundation extents and flood water levels. The flood inundation data are grouped into several clusters by a k-means clustering algorithm, in order to identify the dominant clusters for each flood gauging station. Then, rainfall and water level data are determined as input in developing the point forecasting module. Finally, based on rainfall data, point forecasts and flood causative factors, the spatial expansion module is constructed to expand the flood inundation depth from points to areas as flood inundation extents. To demonstrate the effectiveness of the flood inundation forecasting model, we explore an application to the Yilan River basin in Yilan County, Taiwan. Our study is organized as follows. The study area and data collection are described in Section 2. Section 3 presents the development of the flood inundation forecasting model, including hydrodynamic simulation, classification, point forecasting, and spatial expansion steps. The results and performance of the proposed model are presented and discussed in Section 4. The conclusions are summarized in Section 5.

Study Area and Hydrological Data
The Yilan River basin is located in northeastern Taiwan. The annual average precipitation and temperature are 2522 mm and 22 °C, respectively. The mainstream length is approximately 17.25 km, while the total watershed area is around 149.06 km². Several irrigation and drainage systems have been built in the Yilan River basin. The Meifu drainage system, located on the southern side of the Yilan River, is one of the major drainage systems. Typhoons usually hit this region in the summer and fall, from August to October. During typhoons, severe flood inundations may quickly form in low-lying areas between the Meifu drainage system and the Yilan River, causing serious property loss and damage.
The locations of the rainfall, water level and flood gauging stations in the Yilan River basin are shown in Figure 1. The total average hourly rainfall is calculated with the Thiessen polygon method. Among the Thiessen polygons, the Dajiaoxi rainfall station covers the largest area, which indicates that rainfall at this station strongly affects the forecasting result. Table 1 presents hydrological data from ten typhoons, including duration, cumulative maximum within 36 h and flood inundation extents. The nine flood causative factors were calculated from the digital elevation model (DEM) data with a spatial resolution of 40 m × 40 m, consisting of elevation, slope, aspect, curvature, plan curvature, profile curvature, distance to river, SPI and TWI. The SPI and TWI are expressed as A × tan(B) and ln(A/tan(B)), respectively, where A is the upstream contributing area (m²) and B is the local slope (degree).
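As a minimal sketch of the two index formulas above, the following Python functions evaluate SPI and TWI for a single DEM cell. The function names and the small slope floor used for flat cells are assumptions made for illustration; the text does not say how zero slopes are treated.

```python
import math


def stream_power_index(area_m2, slope_deg):
    """SPI = A * tan(B), with B given in degrees."""
    return area_m2 * math.tan(math.radians(slope_deg))


def topographic_wetness_index(area_m2, slope_deg, min_slope_deg=0.01):
    """TWI = ln(A / tan(B)), with B given in degrees.

    min_slope_deg is an assumed floor that keeps tan(B) > 0 on flat cells;
    it is not part of the formula stated in the text.
    """
    slope = max(slope_deg, min_slope_deg)
    return math.log(area_m2 / math.tan(math.radians(slope)))
```

For a 40 m × 40 m cell draining only itself (A = 1600 m²) on a 45° slope, SPI is about 1600 and TWI is about ln(1600) ≈ 7.38.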

Hydrodynamic Simulation
For hydrologic and hydraulic analysis, the SOBEK model was adopted for solving the Saint-Venant equations [4]. The SOBEK model contains three major modules: a rainfall-runoff module (RR), a 1D flow module (1DF) and a 2D overland flow module (OF). The SOBEK model's finite difference scheme is commonly applied in simulating the unsteady flow velocities, water levels, and inundation extents associated with flooding events, both in rivers and in urban sewer/drainage systems.
Since the study area contains many drainage systems, we divide the basin into rural river areas, urban sewer areas and surface runoff catchments, each of which is simulated by a different analysis module. For rural river areas, the Soil Conservation Service (SCS) curve number is used to analyze the runoff for different rainfall events according to land use. For urban sewer areas, a rational formula is used to calculate the surface runoff discharge. The RR module with rainfall data calculates surface runoff discharge hydrographs.
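The two runoff methods named above can be illustrated with their common textbook forms. This is only a sketch: the text does not give the exact formulation used inside the RR module, so the initial abstraction ratio of 0.2, the metric constants and the function names are assumptions.

```python
def scs_runoff_depth(rain_mm, curve_number, ia_ratio=0.2):
    """Direct runoff depth (mm) from the textbook SCS curve number method.

    S = 25400 / CN - 254 is the potential maximum retention (mm) and
    Ia = ia_ratio * S is the initial abstraction (ia_ratio = 0.2 is assumed).
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve number must lie in (0, 100]")
    retention = 25400.0 / curve_number - 254.0
    initial_abstraction = ia_ratio * retention
    if rain_mm <= initial_abstraction:
        return 0.0
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


def rational_peak_discharge(runoff_coeff, intensity_mm_h, area_km2):
    """Peak discharge (m^3/s) from the rational formula Q = C * i * A / 3.6."""
    return runoff_coeff * intensity_mm_h * area_km2 / 3.6
```

For example, 50 mm of rain on a catchment with CN = 80 gives roughly 13.8 mm of direct runoff, whereas a fully impervious surface (CN = 100) returns the full 50 mm.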
For the 1D flow module, the continuity equation and the momentum equation are the dominant equations and can be expressed as
$$\frac{\partial A_f}{\partial t} + \frac{\partial Q}{\partial x} = q_{lat}$$
$$\frac{\partial Q}{\partial t} + \frac{\partial}{\partial x}\left(\frac{Q^2}{A_f}\right) + g A_f \frac{\partial h}{\partial x} + \frac{g Q |Q|}{C^2 R A_f} - W_f \frac{\tau_{wi}}{\rho_w} = 0$$
where $Q$ is the discharge (m³/s), $x$ is the distance along the channel (m), $g$ is the acceleration due to gravity (m/s²), $t$ is the time (s), $h$ is the water level (m), $R$ is the hydraulic radius (m), $q_{lat}$ is the lateral discharge per unit length (m²/s), $A_f$ is the cross-sectional flow area (m²), $C$ is the Chezy coefficient, $W_f$ is the cross-sectional width at the corresponding water level (m), $\tau_{wi}$ is the wind shear stress (kg/(m·s²)), and $\rho_w$ is the water density (kg/m³).
The 2D overland flow module is described by the continuity equation and the momentum equations in the x- and y-directions, expressed as
$$\frac{\partial h}{\partial t} + \frac{\partial (ud)}{\partial x} + \frac{\partial (vd)}{\partial y} = 0$$
$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} + g\frac{\partial h}{\partial x} + g\frac{uV}{c^2 d} + a u |u| = 0$$
$$\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} + g\frac{\partial h}{\partial y} + g\frac{vV}{c^2 d} + a v |v| = 0$$
where $d$ is the water depth (m); $u$ and $v$ are the velocity components (m/s) in the x-direction and y-direction, respectively; $V$ is the velocity magnitude $\sqrt{u^2 + v^2}$ (m/s); $c$ is the Chezy coefficient; and $a$ is the wall friction coefficient.
The flow chart of the linkage of the integrated model is shown in Figure 2. The 1DF module calculates the hydrographs of overflow flow rate at channel cross-sections when the surface runoff exceeds the design capacity of the drainage system. The overflow hydrographs are subsequently used as sources in the OF module. Historical typhoon event records are used in calibrating some parameters: the SCS curve number (CN) and White-Colebrook (k). The value of the Chezy coefficient $c$ is obtained from the White-Colebrook formula, expressed as
$$c = 18 \log_{10}\left(\frac{12R}{k_n}\right)$$
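The White-Colebrook relation is simple enough to check numerically. The sketch below evaluates it for a given hydraulic radius and roughness value; the function and argument names are illustrative and are not taken from SOBEK.

```python
import math


def chezy_white_colebrook(hydraulic_radius_m, roughness_k_m):
    """Chezy coefficient c = 18 * log10(12 R / k_n)."""
    if hydraulic_radius_m <= 0 or roughness_k_m <= 0:
        raise ValueError("R and k_n must be positive")
    return 18.0 * math.log10(12.0 * hydraulic_radius_m / roughness_k_m)
```

With R = 1 m and k_n = 0.12 m the ratio 12R/k_n is 100, so c = 18 × 2 = 36. A larger roughness value lowers c and therefore increases the friction term in the momentum equations.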

k-means Clustering Algorithm
The k-means clustering algorithm [33] groups data with similar statistical characteristics. Owing to its ease of implementation and efficiency, the k-means algorithm is one of the most commonly used partitional clustering algorithms [34,35], and many studies have confirmed its clustering ability in hydrology [27,36]. Data are grouped into n clusters by their common characteristics. The clusters are recognized by minimizing the average Euclidean distance from individual data points to the cluster center. The k-means clustering algorithm function is written as
$$J = \sum_{i=1}^{n} \sum_{j=1}^{l} w_{ji} \left\lVert x_j - c_i \right\rVert^2$$
where $n$ is the number of clusters, $x_j$ is the input vector, $c_i$ is the ith cluster center and $w_{ji}$ is an element of an l × k data matrix. For more details on the k-means clustering algorithm, refer to [37]. In this study, similar flood inundation data are grouped into the same cluster. Then, clusters of flood inundation data are identified by minimizing their average Euclidean distance to the cluster center. Determining an optimal number of clusters is crucial for efficiently grouping the flood inundation depths, and the grid-search method [38] is used for this purpose. Integrating a classification scheme like k-means into an SVM model is expected to reduce the number of constructed models and increase the forecast accuracy.
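To make the clustering step concrete, here is a minimal pure-Python sketch of Lloyd's iteration for k-means applied to short depth time series, one tuple per grid cell. The farthest-first initialization is an assumption chosen only to make the example deterministic; the text does not describe how the cluster centers are initialized, and the function names are hypothetical.

```python
import math


def farthest_first_centers(points, n_clusters):
    """Deterministic initialization (an assumption, not from the text)."""
    centers = [points[0]]
    while len(centers) < n_clusters:
        # Pick the point farthest from its nearest already-chosen center.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, n_clusters, max_iter=100):
    """Alternate assignment and center updates until the labels stop changing."""
    centers = farthest_first_centers(points, n_clusters)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(n_clusters), key=lambda i: math.dist(p, centers[i]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for i in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old center if a cluster becomes empty
                centers[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


def within_cluster_sse(points, centers, labels):
    """Sum of squared Euclidean distances from each point to its center."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
```

Feeding the function three shallow and three deep depth series with `n_clusters=2` separates them into two groups. Repeating the run for several candidate cluster counts and comparing `within_cluster_sse` is one simple way to carry out the search over the number of clusters.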

Support Vector Machine
Vapnik [39] developed SVM for classification and later extended the technique for regression analysis. Classifier functions are used for separating different groups of training data to construct hyperplanes in the multidimensional space. In the past two decades, SVM has been widely adopted in rainfall-runoff [40,41], flood [27,42-44] and peak-flood analyses [45]. For further details on SVM, refer to [39,46].
Consider N training samples with input vector $x_i$ and output data $y_i$. The objective of the SVM is to find a nonlinear regression function $\hat{y} = f(x) = w^T \varphi(x) + b$ and produce the output $\hat{y}$, which is the optimal approximation of the observed data with an error tolerance of $\varepsilon$, where $\varphi(x)$ is a nonlinear function, $w$ is the weight, and $b$ is the bias of the regression function. According to the structural risk minimization (SRM) induction principle, $w$ and $b$ are calculated by minimizing the structural risk function:
$$R = \frac{1}{2}\lVert w \rVert^2 + C_p \sum_{i=1}^{N} L_\varepsilon(\hat{y}_i)$$
where Vapnik's ε-insensitive loss function is defined as
$$L_\varepsilon(\hat{y}_i) = \begin{cases} 0, & |y_i - \hat{y}_i| \le \varepsilon \\ |y_i - \hat{y}_i| - \varepsilon, & \text{otherwise} \end{cases}$$
The first term of the structural risk function represents model complexity; the second term represents empirical error. The penalty parameter $C_p$ represents the tradeoff between model complexity and empirical error.
The SVM problem can be formulated as the following optimization problem:
$$\min_{w,\,b,\,\xi,\,\xi'} \ \frac{1}{2}\lVert w \rVert^2 + C_p \sum_{i=1}^{N} (\xi_i + \xi_i')$$
$$\text{subject to} \quad y_i - \hat{y}_i \le \varepsilon + \xi_i, \quad \hat{y}_i - y_i \le \varepsilon + \xi_i', \quad \xi_i,\ \xi_i' \ge 0$$
where $\xi$ and $\xi'$ are slack variables representing, respectively, the upper and lower training errors subject to error tolerance $\varepsilon$. The dual Lagrange multipliers, $\alpha$ and $\alpha'$, can be applied to solve the above optimization problem. Consequently, the approximate function can be expressed as follows:
$$f(x) = \sum_{i=1}^{m} (\alpha_i - \alpha_i')\, K(x_i, x) + b$$
where $m$ is the number of support vectors, $x_i$ is the support vector, and $K(x_i, x)$ is a kernel function, used for mapping the SVM input vector into a higher-dimensional feature space. The radial basis function (RBF) is used herein and expressed as follows:
$$K(x_i, x) = \exp\left(-\gamma \lVert x_i - x \rVert^2\right)$$
where $\gamma$ is the gamma term. The correct determination of parameters significantly increases the accuracy of SVM solutions. The penalty parameter $C_p$ and the error tolerance $\varepsilon$ are also crucial SVM parameters. In this study, the grid-search method is employed for determining the optimal combination of the kernel function, $\gamma$, $C_p$ and $\varepsilon$.
Water 2018, 10, 1734
In this study, the SVM is applied to develop the point forecasting module to forecast the flood inundation depth for each flood gauging station. Then, the SVM is used to develop the spatial expansion module to expand the flood inundation depth from points to areas for each cluster.
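The building blocks of the regression function described above can be sketched in a few lines of Python: the RBF kernel, Vapnik's ε-insensitive loss, and the kernel expansion evaluated from already-trained support vectors. Solving the dual problem itself is not shown, and all names (including `dual_coefs`, which stores the difference of the two dual multipliers per support vector) are assumptions for illustration.

```python
import math


def rbf_kernel(x_i, x, gamma):
    """K(x_i, x) = exp(-gamma * ||x_i - x||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x_i, x)))


def eps_insensitive_loss(y, y_hat, eps):
    """Zero inside the eps tube, linear outside it."""
    return max(0.0, abs(y - y_hat) - eps)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Kernel expansion over the support vectors.

    dual_coefs[i] holds the difference of the two dual multipliers of
    support vector i, assumed to come from an SVR solver that has
    already been trained elsewhere.
    """
    return bias + sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )
```

A forecast error of 0.05 m with ε = 0.1 m contributes no loss, while an error of 0.3 m contributes 0.2. This is why only samples on or outside the ε tube end up as support vectors.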

Methodology Construction
To yield flood inundation maps in a time series, a flood inundation forecasting model is developed in this study (see Figure 3).
Flood inundation in the Yilan River basin is simulated using the SOBEK model. The river reach, cross section points of rivers and cross-sectional shapes of rivers are collected as input, along with properties of sewers and hydraulic structures. Moreover, ArcGIS applies the rainfall runoff (RR) module to divide the river and drainage basin and calculate the average rainfall with DEM characteristics. The Manning formula is used to calculate channel roughness along rivers and sewer systems. The White-Colebrook formula is used to calculate subcatchment surface roughness. In general, the higher the DEM spatial resolution, the higher the flooding simulation accuracy. However, increasing the resolution of the DEM does not necessarily improve the accuracy of the simulated flood inundation depth or extent, especially in lowland areas. As shown in Figure 4, elevation in the lowland areas of the Yilan River basin near the downstream boundary is less than 2 m. Test simulations with different DEM spatial resolutions showed that a 40 m × 40 m DEM yields almost the same simulated flood inundation map as a 20 m × 20 m DEM in this study area. Therefore, a 40 m × 40 m DEM is sufficiently accurate for flood inundation simulation in the study area and also offers better computational efficiency, e.g., less CPU time.
According to the typhoon warning information delivered by the Central Weather Bureau (CWB), Taiwan, the sequential 36 h rainfall data with the most intensive rainfall are used for model calibration and forecasting. Moreover, the historical temporal datasets for model inputs, obtained from the CWB, are hourly hydrological data for each typhoon event, including rainfall intensity, tidal sea-level variations, etc. Then, a warm-up period is needed to establish a stable and reasonable hydrodynamic initial condition. The warm-up time for the initial condition is 1 h, during which we impose the same values of initial water depth and tidal water elevation as at t = 0. Given the above information, the SOBEK model can be successfully applied to simulate the hydrodynamic characteristics of flood inundation.

Classification Step
For model efficiency, the k-means clustering algorithm is applied in grouping the flood inundation data into several clusters. First, flood inundation and non-inundation extents have to be identified. In this study, an extent with a flood inundation depth above 0.3 m is regarded as a flood inundation extent. The dominant clusters of flood gauging stations are then identified by minimizing the average Euclidean distance from the flood inundation data to the center of the cluster.

Point Forecasting Step
The point forecasting module for each flood gauging station is constructed in this step. Observed rainfall and water level are used as input for the point forecasting module. The point forecasting module can be written in a general form as
$$\hat{H}_c(t + \Delta t) = f_{\mathrm{SVM}}\left[R(t), R(t-1), \ldots, R(t-L_R), H_c(t), H_c(t-1), \ldots, H_c(t-L_H)\right]$$
where $t$ is the current time, $\Delta t$ is the lead time period (from 1-3 h), $R$ is the rainfall, $H_c$ is the water level at a given flood gauging station, $L_R$ and $L_H$ are the lag lengths of rainfall and water level, respectively, and $\hat{H}_c(t + \Delta t)$ is the point forecasted water level at time $t + \Delta t$.
The determination of the inputs and of appropriate lag lengths is crucial to the effectiveness of the proposed point forecasting module. Rainfall and water level are selected as input, while the lag lengths of input are determined by the grid search method.
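The input structure of the point forecasting module can be illustrated by building lagged training samples from hourly series. This is a sketch under assumed names: `lag_rain` and `lag_level` stand for the lag lengths chosen by the grid search, and `lead` is the lead time in hours.

```python
def build_lagged_samples(rain, level, lead, lag_rain, lag_level):
    """Return (features, target) pairs for one gauging station.

    features = [R(t), ..., R(t - lag_rain), H(t), ..., H(t - lag_level)]
    target   = H(t + lead)
    Both series are assumed to be hourly and of equal length.
    """
    samples = []
    first_t = max(lag_rain, lag_level)
    for t in range(first_t, len(level) - lead):
        features = [rain[t - k] for k in range(lag_rain + 1)]
        features += [level[t - k] for k in range(lag_level + 1)]
        samples.append((features, level[t + lead]))
    return samples
```

With six hourly values, a 1 h lead time and lag lengths of 1, the function yields four samples. The first is `([R(1), R(0), H(1), H(0)], H(2)]`. Trying several lag lengths and keeping the combination with the lowest validation error is the grid search described above.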

Spatial Expansion Step
In this step, the spatial expansion module is developed to expand the forecasts from points to areas. First, the forecast of each flood gauging station is transformed to a flood inundation depth. Then, SVM is applied to expand the flood inundation depth from points to areas for each cluster. Inputs for developing the spatial expansion module include the forecasted flood inundation depths of flood gauging stations with grid coordinates, rainfall and nine flood causative factors. The spatial expansion module can be written as follows:
$$D_n(t + \Delta t) = f_{\mathrm{SVM}}\left[R,\ D_c,\ I_n\right]$$
where $R$ is the rainfall, $D_c$ is the flood inundation depth of a given flood gauging station, $I_n$ is a given flood causative factor, and $D_n(t + \Delta t)$ is the forecasted flood inundation depth of grid $n$ at time $t + \Delta t$.

Model Evaluation and Cross Validation
To evaluate model performance, six measures of error are employed to indicate the discrepancy between the observed and forecasted values: root mean square error (RMSE), mean absolute error (MAE), error of time to peak water level ($E_{Tp}$), error of peak water level ($E_{Wp}$), capture rate (CR) and coefficient of efficiency (CE). Smaller RMSE, MAE, $E_{Tp}$ and $E_{Wp}$ values indicate less significant errors between the observed and forecasted values, whereas a higher CR value means better agreement of flood inundation extents between observed and forecasted values. The CE value is used to evaluate forecasting ability, with a CE value close to 1 representing high performance. In particular, RMSE and CE are selected as performance measures for the point forecasting module.
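Three of these measures have standard definitions that can be sketched directly. The text does not print the formulas, so the versions below are the usual ones, with CE taken as the Nash-Sutcliffe form; that choice is an assumption.

```python
import math


def rmse(observed, forecast):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, forecast)) / n)


def mae(observed, forecast):
    """Mean absolute error."""
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / len(observed)


def coefficient_of_efficiency(observed, forecast):
    """CE = 1 - SSE / (sum of squared deviations of observations from their mean)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / variance_sum
```

For observed levels 1, 2, 3, 4 m and forecasts that are uniformly 0.5 m too high, RMSE and MAE are both 0.5 m and CE is 0.8. A perfect forecast gives CE = 1.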
For the construction of the proposed model, the collected event-based data are usually separated into two sets: training and testing data. Training data are adopted to develop the proposed model, whereas testing data are used to evaluate the performance of the proposed model. Different selections of training and testing data do impact results, sometimes even leading to different conclusions. To evaluate the accuracy and robustness of the proposed model, a statistical technique called cross validation [47] is applied in this study. Cross validation is described in detail as follows. Each single typhoon event is chosen as the testing set in turn, while the remaining events are used as the training sets. Thus, for N typhoons, each of the events is used to test the performance of the proposed model, and the test results and their performance measures are obtained. Then, performance conclusions for the proposed model are drawn on the basis of the averaged performance measures over all testing events.
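The event-based cross validation described above amounts to a leave-one-event-out loop. The sketch below shows the splitting and averaging logic only; `fit` and `score` are hypothetical callables standing in for model training and for whichever performance measure is being averaged.

```python
def leave_one_event_out(events):
    """Yield (training_events, testing_event) with each event held out once."""
    for i, test_event in enumerate(events):
        yield events[:i] + events[i + 1:], test_event


def cross_validate(events, fit, score):
    """Average a performance measure over all held-out events.

    fit(training_events) -> model and score(model, testing_event) -> float
    are placeholders supplied by the caller.
    """
    scores = [
        score(fit(train), test)
        for train, test in leave_one_event_out(events)
    ]
    return sum(scores) / len(scores)
```

With ten typhoon events this trains ten models, each on nine events, so every event contributes exactly one out-of-sample score to the average.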

Calibration and Validation of SOBEK
The observed flood inundation depths of the six water level stations on the Yilan River and Meifu drainage system are adopted in this study to calibrate and validate the SOBEK model. However, the selection of the parameter values of CN, n and k impacts the modeled values for flood inundation extent, depth and velocity. To determine the optimal parameter values for CN, n and k, these parameters are respectively varied from 39 to 98, from 0.015 to 0.035 and from 0.2 to 10. The determined optimal parameter values are similar to previous studies [6,48], which selected the same study area.
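One way to organize such a calibration is an exhaustive sweep over candidate values within the stated ranges. The sketch below is illustrative only: the text does not say how the ranges were sampled, and `objective` is a hypothetical callable that would run the hydrodynamic model for one parameter set and return an error against the observed water levels.

```python
import itertools


def parameter_sweep(param_grid, objective):
    """Return the parameter combination with the lowest objective value.

    param_grid maps each parameter name to a list of candidate values;
    objective(params) -> float is supplied by the caller.
    """
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

A grid such as `{"CN": [39, 60, 80, 98], "n": [0.015, 0.025, 0.035], "k": [0.2, 1.0, 10.0]}` requires 36 model runs. The same helper can serve for the SVM hyperparameter search, with the validation error as the objective.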
The performance of the SOBEK model at each water level station is shown in Table 2. As demonstrated in Figure 5, water level stations located downstream (i.e., Gamalan, Sijie, Dongjin and Zhuangwei) show good agreement between observed and simulated water levels. However, the $E_{Tp}$ and RMSE values at upstream water level stations (i.e., Yixing and Ximen) are worse than those at downstream water level stations. $E_{Tp}$ values below 2 h are acceptable, as are $E_{Wp}$ values below 10%. Meanwhile, CE values above 0.7 are acceptable [49]. Though the $E_{Wp}$ value at Dongjin is greater than 10%, the difference between observed and simulated peak water levels is only 0.5 m; therefore, we accept the errors of the simulated water levels as reasonable. Moreover, the simulated and observed flood inundation extents are in good agreement (see Figure 6). The CR value of 78% indicates that the SOBEK model can accurately simulate the flood inundation extents. We accept the SOBEK model as an accurate and efficient way to simulate flood inundation in the Yilan River basin.
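The acceptance thresholds quoted above can be checked with a small helper. The text does not spell out the formulas for the two peak errors, so the definitions below (absolute timing difference in hours, absolute peak difference as a percentage of the observed peak) are assumptions, as are the function names.

```python
def peak_errors(observed, simulated, dt_hours=1.0):
    """Return (E_Tp in hours, E_Wp in percent) for one hydrograph pair."""
    t_obs = max(range(len(observed)), key=observed.__getitem__)
    t_sim = max(range(len(simulated)), key=simulated.__getitem__)
    e_tp = abs(t_sim - t_obs) * dt_hours
    e_wp = abs(simulated[t_sim] - observed[t_obs]) / abs(observed[t_obs]) * 100.0
    return e_tp, e_wp


def is_acceptable(e_tp, e_wp, ce):
    """Thresholds from the text: E_Tp < 2 h, E_Wp < 10 %, CE > 0.7."""
    return e_tp < 2.0 and e_wp < 10.0 and ce > 0.7
```

A simulated peak of 3.8 m arriving one hour after an observed 4.0 m peak gives $E_{Tp}$ = 1 h and $E_{Wp}$ = 5%, which passes both peak criteria.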

Identification of Clusters
In this subsection, the flood inundation depth within each cluster identified by the classification module is analyzed. The k-means clustering algorithm is applied to obtain the relative information of the flood inundation data in each grid. In this study, the optimal number of clusters is 10, determined by the grid-search method. The dominant clusters of flood gauging stations are listed in Table 3.
Figure 7 shows the results of the k-means clustering algorithm and the location of each flood gauging station. The main flood inundation extents are located between the Yilan River and Meifu drainage system, especially at low elevations (shown in Figures 4 and 7). Clusters 7-10, which contain areas located adjacent to the river, show a distinct flood inundation depth dynamic: peak flood inundation depths in Clusters 7-10 are above 1 m, while peak flood inundation depths in Clusters 1-6 are below 0.8 m. As shown in Figure 8, flood inundation depths are implicitly clustered according to their respective time series, including peak flood inundation depth and amplitude of flood inundation.
A narrower focus on the comparison between Clusters 7, 9 and 10 reveals that the flood inundation depth patterns are similar, with the highest peak in Cluster 10, followed by Cluster 9 and Cluster 7. The timing order of flood inundation is Cluster 10 first, then Cluster 9, and finally Cluster 7. These phenomena are consequences of the geographic distance from the river to each cluster's geographic areas. For the same reason, Clusters 1 and 5 show similar dynamics, as do Clusters 2 and 4. We conclude that the k-means clustering algorithm effectively captures the spatiotemporal distribution of flood inundation depths.

Performance of the Point Forecasting Module
Observed rainfall and water level are the inputs for point forecasting. Based on the grid search method, the lag lengths for rainfall and water level are both determined to be 1, representing 1 h lead time. Current rainfall and water level are both determined as input for 2 to 3 h lead time.
Figure 9 presents the point forecasting results for each flood gauging station for 1-3 h lead time. Most of the water level forecasts from the point forecasting module are in good agreement with the observed data. However, the peak flood inundation depths at ISR2, KXL1, ISR3 and GJL1 are underforecasted for Typhoons Nonmadol and Sinlaku, especially for 2 to 3 h lead time. We therefore employed RMSE and CE to clearly and objectively compare the discrepancies between observed and forecasted water levels. As shown in Table 4, with increasing forecast lead time the RMSE values increase (i.e., worsen) and the CE values decrease (i.e., worsen).
The spatial expansion module performs comparatively worse in Clusters 6-9. Despite this, most of the results confirm that the proposed spatial expansion module effectively forecasts flood inundation depths.
To highlight the forecasting performance of the proposed spatial expansion module, the recent Typhoon Dujuan (2015) is taken as an example. Figure 10 presents the flood inundation data versus corresponding forecasts for flood inundation depth at 1-3 h lead time for Typhoon Dujuan. The forecasted flood inundation depths are generally in good agreement with the flood inundation data. As above, with increasing forecast lead time, the difference between observed values and forecasted values increases. Figure 11 shows MAE performance across the flood inundation map at 1-3 h lead time for Typhoon Dujuan. High performance is indicated by MAE values between 0 and 0.1 m; at 1-3 h lead time we observe high performance covering 85.2%, 84.2% and 83.4% of the map area, respectively. Meanwhile, MAE values exceeding 0.5 m rarely occur. Compared to Chang et al. [18], the proposed model obtains a higher percentage of MAE values between 0 and 0.1 m. This indicates that the proposed model yields accurate flood inundation maps.
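The map-area percentages quoted above can be reproduced in principle by computing an MAE per grid cell and counting the cells in each error band. The sketch below assumes that all cells have equal area (consistent with the 40 m × 40 m grid), so cell share equals area share. The data layout, a list of flattened maps with one per time step, is also an assumption.

```python
def mae_per_cell(observed_maps, forecast_maps):
    """MAE over time for each grid cell; maps are lists of equal-length rows,
    one flattened depth map per time step."""
    n_steps = len(observed_maps)
    n_cells = len(observed_maps[0])
    return [
        sum(abs(observed_maps[t][c] - forecast_maps[t][c]) for t in range(n_steps)) / n_steps
        for c in range(n_cells)
    ]


def area_share(mae_values, lower, upper):
    """Percentage of cells whose MAE lies in [lower, upper]."""
    hits = sum(1 for m in mae_values if lower <= m <= upper)
    return 100.0 * hits / len(mae_values)
```

Calling `area_share(mae_values, 0.0, 0.1)` gives the high-performance share, and `area_share(mae_values, 0.5, float("inf"))` gives the share of cells with large errors.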
The aforementioned results indicate that the proposed model accurately forecasts not only flood inundation depth, but also areal flooding extent. The proposed spatial expansion module accurately and reliably produces flood inundation forecast maps, providing information to help disaster mitigation and reduce loss of life and property.

Conclusions
Accurate forecasts for flood inundation depth and extent play a crucial role in flood early warning systems. For this purpose, an accurate and efficient forecasting model is proposed for producing flood inundation forecast maps.
Based on various measures of error, we find that the SOBEK model accurately simulates flood inundation depths. Flood inundation depth at most locations shows model errors less than 0.1 m, making the proposed flood inundation forecasting model suitable for flood inundation forecasts during typhoons.
The point forecasting module accurately forecasts water level at 1–3 h lead time for most flood gauging stations. Finally, based on the strong performance of the spatial expansion module, we conclude that the proposed model can provide acceptable flood inundation forecast maps at 1–3 h lead time. The accurate forecast maps of inundation depth and extent are expected to be useful for flood early warning systems. They can also help decision makers handle disaster emergency response, reducing human casualties and property damage.
Errors in the simulated flood inundation depths produced by the hydrodynamic modeling may degrade the performance of the flood inundation forecasting model. In the future, the proposed forecasting model could use real-time monitoring data and hence produce more accurate flood inundation forecast maps. Novel clustering algorithms and ANNs may also improve the model performance.

Figure 1. The Yilan River basin and the locations of rainfall, water level and flood gauging stations.
s); c is the Chezy coefficient; and a is the wall friction coefficient. The flow chart of the linkage of the integrated model is shown in Figure 2. The 1DF module calculates the hydrographs of overflow flow rate at channel cross-sections when the surface runoff exceeds the design capacity of the drainage system. The overflow hydrographs are subsequently used as sources in the OF module. Historical typhoon event records are used in calibrating some parameters: the SCS curve number (CN) and the White-Colebrook roughness (k). The value of the Chezy coefficient c is obtained from the White-Colebrook formula, expressed as
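The White-Colebrook relation is commonly written as c = 18 log10(12R/k), where R is the hydraulic radius and k is the roughness height, both in metres. The sketch below assumes that form and SI units; the function and argument names are illustrative assumptions and are not taken from SOBEK.

```python
import math


def chezy_white_colebrook(hydraulic_radius_m, roughness_k_m):
    """Chezy coefficient (m^0.5/s) from the assumed White-Colebrook form.

    c = 18 * log10(12 * R / k), with R and k in metres.
    """
    if hydraulic_radius_m <= 0 or roughness_k_m <= 0:
        raise ValueError("hydraulic radius and roughness must be positive")
    return 18.0 * math.log10(12.0 * hydraulic_radius_m / roughness_k_m)


# Hypothetical channel: R = 1.0 m, k = 0.12 m (values chosen for illustration).
c_example = chezy_white_colebrook(1.0, 0.12)
```

A larger roughness height k gives a smaller c, so a calibrated k acts directly on the simulated friction losses.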

Figure 2. Flowchart of the SOBEK model.

Figure 3. Flowchart of the proposed model.

Figure 4. High-resolution elevation map of the Yilan River basin.

Figure 6. Comparison of simulated and observed flood inundation extents.

Figure 7. Data clustering results from the k-means clustering algorithm, along with locations of flood gauging stations.

Figure 8. Average flood inundation depth of each cluster over time.

Figure 10. Comparison of flood inundation data with forecasts obtained from the proposed model for Typhoon Dujuan.

Figure 11. Distribution of MAE values for the proposed model during Typhoon Dujuan.

Table 1. Hydrological and inundation data for typhoons from 2004 to 2015.

Table 2. ETp, EWp, CE and RMSE values for the water level: simulated results by the SOBEK model.

Table 3. Hydrologic descriptions and dominant clusters for flood gauging stations.