## 1. Introduction

Typhoons are a common weather phenomenon in subtropical areas and usually occur between July and October each year. The heavy rainfall they bring often causes serious inundation in low-lying areas, which not only results in property loss for the local population but also threatens their safety. Because engineering funding is limited, structural protective measures are constrained by their design limits. Once the scale of a typhoon exceeds the designed protection limit, people must rely on nonstructural means of disaster relief during the event, such as evacuating residents from areas at potentially high flood risk. Among nonstructural measures, an accurate forecast of the inundation level over the next several hours is a critical factor in the decision-making and planning of disaster relief actions.

Studies on inundation forecasting are ample and can generally be divided into numerical simulation and black-box modeling. Numerical simulation is based on theoretical deduction from an understanding of the mechanism that leads from rainfall to inundation. Its advantage is that it is fully supported by the theory of the physical mechanism of inundation. The simulation results often have a high degree of accuracy, which makes the method a powerful tool for inundation forecasting. Its disadvantage, however, is its high demand for computing resources and CPU time, which makes it unsuitable for the real-time forecasts required for rapid disaster prevention and rescue response during a typhoon. Black-box modeling, on the other hand, treats the process from rainfall to inundation as a black box. It does not delve into the internal physical mechanism but instead analyzes the input and output data of the system to simulate the relationship between them. Such models cannot explain the physical mechanism involved in the system, but they can simulate the response of the system correctly and effectively, and they compute faster than numerical models [1]. These practical benefits make black-box modeling a suitable forecasting tool for decision making and rescue planning during a typhoon.

Many studies on black-box modeling for inundation forecasting can be found in the literature. For example, Liong et al. [2] developed a river stage forecasting model based on an artificial neural network (ANN) that yielded a very high degree of prediction accuracy even for lead times of up to seven days. Campolo et al. [3] developed an ANN-based flood forecasting model that exploits real-time information available for the basin of the River Arno to predict the evolution of the basin's water level. Keskin et al. [4] proposed a flow prediction method based on an adaptive neural-based fuzzy inference system (ANFIS) coupled with stochastic hydrological models. Shu and Ouarda [5] proposed a methodology using ANFIS for flood quantile estimation at ungauged sites and demonstrated that the ANFIS approach has a much better generalization capability than other alternatives. Kia et al. [6] developed an ANN-based flood model that uses various flood causative factors in conjunction with a geographic information system (GIS) to model flood-prone areas in southern Malaysia. Lin et al. [7] proposed a real-time regional forecasting model that yields inundation maps with 1- to 3-h lead times, based on K-means cluster analysis combined with a support vector machine (SVM). Tehrany et al. [8] proposed a methodology for flood susceptibility mapping that combines SVM and weights-of-evidence (WoE) models and demonstrated that the ensemble method outperforms the individual methods. Del Giudice et al. [9] developed a methodology that formulates models with increasing detail and flexibility and describes their systematic deviations using an autoregressive bias process. Chang and Tsai [10] proposed a spatial–temporal lumping of radar rainfall for modeling inflow forecasts to mitigate time-lag problems and improve forecasting accuracy.

As the above literature shows, most approaches employ sequential data as model inputs. Forecasting models with non-sequential data inputs have been explored much less. This type of model is efficient because fewer inputs need to be processed. The challenge, however, lies in selecting the appropriate combination of non-sequential variables to use as inputs. This study proposes a methodology to tackle this difficulty. The approach integrates a Multi-Objective Genetic Algorithm (MOGA) with models based on ARX (Auto-Regressive models with eXogenous inputs) to search for the optimal combination of non-sequential regressors for the model inputs. Three types of ARX-based models are tested with the proposed methodology: linear ARX (LARX), nonlinear ARX with a Wavelet function (NLARX-W), and nonlinear ARX with a Sigmoid function (NLARX-S). The models are assessed with a number of indexes that examine different aspects of their performance. For each index, the best-performing models are selected from the Pareto optimal sets located by MOGA, and their characteristics are compared and discussed. The remainder of this paper is arranged as follows: Section 2 describes the environmental background of the study area, the ARX models, and the optimal models obtained by MOGA; Section 3 discusses the characteristics of the acquired optimal models; and Section 4 draws conclusions based on the findings.
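To make the idea of non-sequential regressors concrete, the following is a minimal sketch of a linear ARX fit in which only selected lags enter the model. It is not the implementation used in this study: the function names `build_regressors` and `fit_linear_arx`, the synthetic data, and the chosen lags are all hypothetical, and the MOGA search over lag combinations is not shown.

```python
import numpy as np

def build_regressors(rain, level, rain_lags, level_lags, lead):
    """Rows predict level[t + lead] from rain[t - k] and level[t - k] at the chosen lags."""
    max_lag = max(list(rain_lags) + list(level_lags))
    times = range(max_lag, len(level) - lead)
    X = np.array([[rain[t - k] for k in rain_lags]
                  + [level[t - k] for k in level_lags]
                  + [1.0]                      # intercept column
                  for t in times])
    y = np.array([level[t + lead] for t in times])
    return X, y

def fit_linear_arx(X, y):
    """Ordinary least-squares fit; returns one coefficient per column of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic series (made-up numbers): the level one step ahead depends on
# rain three steps back and on the current level.
rng = np.random.default_rng(0)
rain = rng.random(200)
level = np.zeros(200)
for s in range(4, 200):
    level[s] = 0.5 * rain[s - 4] + 0.3 * level[s - 1] + 0.1

# A non-sequential selection: rain at lags 1 and 3 (lag 2 skipped), level at lag 0.
X, y = build_regressors(rain, level, rain_lags=[1, 3], level_lags=[0], lead=1)
coef = fit_linear_arx(X, y)   # order: rain lag 1, rain lag 3, level lag 0, intercept
print(coef)
```

In this noise-free example the fit assigns a weight close to zero to the irrelevant rain lag. A search procedure such as MOGA would instead vary `rain_lags` and `level_lags` and score each candidate on several indexes.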

## 3. Results and Discussion

The result acquired by MOGA is the Pareto optimal model set, in which every model is non-dominated, that is, no other model performs at least as well on all three indexes and strictly better on at least one. For each of the model types LARX, NLARX-W, and NLARX-S, the three models with the best performance in each of the three indexes are selected from the Pareto optimal set. The results are shown in Table 2. The selected models are named according to their model type and featured index: L1 denotes the LARX model with the maximum $\overline{\mathrm{CE}}$, L2 the LARX model with the minimum $\overline{\mathrm{RTS}}$, and L3 the LARX model with the maximum ${\overline{\mathrm{TS}}}_{15}$. Similarly, W1, W2, and W3, as well as S1, S2, and S3, denote the models with the maximum $\overline{\mathrm{CE}}$, minimum $\overline{\mathrm{RTS}}$, and maximum ${\overline{\mathrm{TS}}}_{15}$ in NLARX-W and NLARX-S, respectively. The scores of the nine optimal models on the three indexes and the corresponding chromosomes (i.e., selected regressors) are listed in Table 2. As shown, the selected regressors of each of the nine optimal models follow a non-sequential pattern. This is consistent with the finding of Talei et al. [26] that a model with non-sequential inputs performs better than models with sequential or pruned sequential inputs.
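The selection step described above can be sketched as a non-dominated filter followed by picking the best model for each index. This is an illustrative sketch only: the function `pareto_front` and the score values are hypothetical and are not taken from Table 2. RTS is negated so that all three objectives are maximized.

```python
import numpy as np

def pareto_front(scores):
    """Return indices of non-dominated rows; every column is 'larger is better'."""
    scores = np.asarray(scores, dtype=float)
    keep = []
    for i, s in enumerate(scores):
        # Row i is dominated if some row is >= on every column and > on at least one.
        dominated = np.any(np.all(scores >= s, axis=1) & np.any(scores > s, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Hypothetical (CE, RTS, TS15) scores for five candidate models.
candidates = np.array([
    [0.85, 0.30, 0.80],
    [0.70, 0.10, 0.65],
    [0.83, 0.28, 0.84],
    [0.69, 0.12, 0.60],   # dominated by row 1
    [0.80, 0.35, 0.78],   # dominated by row 0
])

# Flip the sign of RTS (smaller is better) so that every column is maximized.
oriented = candidates * np.array([1.0, -1.0, 1.0])
front = pareto_front(oriented)

best_ce = max(front, key=lambda i: candidates[i, 0])    # analogue of a Type 1 model
best_rts = min(front, key=lambda i: candidates[i, 1])   # analogue of a Type 2 model
best_ts15 = max(front, key=lambda i: candidates[i, 2])  # analogue of a Type 3 model
print(front, best_ce, best_rts, best_ts15)
```

Each of the three selected indices points to a different trade-off within the same front, which mirrors how the Type 1, Type 2, and Type 3 models are drawn from one Pareto optimal set per model type.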

The prediction lead time is crucial for disaster relief action during a typhoon. An appropriate choice of prediction lead time has to be linked to the hydrological behavior of the watershed. In the study area, the variation of the water level has been observed to lag behind the rainfall (see, for example, Figure 2). This time lag between rainfall and water level is a characteristic of the watershed and can be used as an index for selecting the appropriate prediction lead time. To that end, the correlations between the cumulative rainfall data and the water level data shifted back in time by various time lags are analyzed. The results are shown in Figure 5. The circles represent the average CC over the typhoon events, and the error bars denote the maximum and minimum CCs of the events. As seen, the average CC peaks at the time lag of t − 3. This indicates that in the study area the water level is most strongly correlated with the cumulative rainfall 3 h earlier. The prediction lead time in the present study is thus set to 3 h in accordance with this hydrological characteristic of the watershed.
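The lag analysis can be sketched as follows, assuming that CC is the Pearson correlation coefficient and that lags are counted in time steps of the data. The function name `lagged_correlation` and the synthetic series are hypothetical; the example only illustrates how the lag with the peak CC would be located.

```python
import numpy as np

def lagged_correlation(cum_rain, level, max_lag):
    """CC between level[t] and cum_rain[t - lag] for lag = 0, 1, ..., max_lag."""
    cum_rain = np.asarray(cum_rain, dtype=float)
    level = np.asarray(level, dtype=float)
    ccs = []
    for lag in range(max_lag + 1):
        x = cum_rain[: len(cum_rain) - lag]   # rainfall, leading by `lag` steps
        y = level[lag:]                       # water level at the later time
        ccs.append(np.corrcoef(x, y)[0, 1])
    return np.array(ccs)

# Synthetic example (made-up numbers): the level copies the rainfall series 3 steps later.
rng = np.random.default_rng(0)
cum_rain = rng.random(200)
level = np.roll(cum_rain, 3)

ccs = lagged_correlation(cum_rain, level, max_lag=6)
best_lag = int(np.argmax(ccs))
print(best_lag, ccs.round(3))
```

For real events, the same calculation would be repeated per typhoon and the CCs averaged at each lag, which corresponds to the averages and ranges plotted in Figure 5.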

Figure 6a–c compare the performance of the nine models in CE, RTS, and ${\mathrm{TS}}_{15}$. In the comparison of CE, shown in Figure 6a, Model Type 1 (L1, W1, S1) and Model Type 3 (L3, W3, S3) perform quite well, with CE above 0.8 in every case. The CEs of the Type 1 models are slightly higher than those of the Type 3 models, but the differences are not significant. The CE performance of Model Type 2 is relatively poorer, especially for L2 and W2, because Model Type 2 focuses on reducing the time shift error. Nevertheless, the CE of Model S2 still exceeds 0.7. As shown in Figure 6a, the CEs of the nonlinear models (W- and S-series) are somewhat higher than those of the linear L-series, indicating that the relationship between rainfall and water level at the WG2 site is nonlinear. Among the nonlinear models, the CEs of the NLARX-S models (S1, S2, S3) are higher than those of the NLARX-W models (W1, W2, W3), and the difference is particularly evident between S2 and W2. In the comparison of time shift errors, shown in Figure 6b, the RTS of Model Type 2 (L2, W2, S2) is lower than that of Type 1 (L1, W1, S1) and Type 3 (L3, W3, S3). S2 has the lowest RTS of all models, that is, the minimum time shift error in the forecast, followed by W2 with a slightly higher RTS; both are nonlinear models. L2 has the worst RTS among the Type 2 models, and as shown in Figure 6b, even W1 and S1, which focus on CE, perform somewhat better on RTS than L2. In addition, the RTS of the linear L1 and L3 models is much higher than that of the nonlinear W- and S-series models. This again implies a nonlinear relationship between rainfall and water level in the study area.

Figure 6c compares the ${\mathrm{TS}}_{15}$ performance of the nine models. As shown in the figure, both the Type 1 models (L1, W1, S1) and the Type 3 models (L3, W3, S3) perform well on ${\mathrm{TS}}_{15}$. Among the nine models, S3 has the highest ${\mathrm{TS}}_{15}$, closely followed by W3. The linear L3 has the worst ${\mathrm{TS}}_{15}$ among the Type 3 models and, as shown in Figure 6c, scores even lower than the nonlinear W1 and S1, which focus on CE. In summary, the comparisons in Figure 6 show that the nonlinear models perform better overall than the linear models in every aspect. Among the nonlinear models, the NLARX-S models perform slightly better than the NLARX-W models, but the differences are not very significant.

The results of the model predictions might be related to the hydrological behavior of the watershed. As shown in Figure 5, the correlation between the water level and the rainfall in the study area is most prominent at a time lag of 3 h. Table 2 shows that the model regressors optimized by MOGA all include R(t − 3) among the inputs. This result might reflect the hydrological behavior of the study area. It is also noted that the Type 1 and Type 3 models, which exhibit higher CE and ${\mathrm{TS}}_{15}$ values, include not only R(t − 3) but also R(t − 4) or R(t − 2) as regressors. As seen in Figure 2, the correlation between the water level and the rainfall in the study area is also high at time lags of t − 2 and t − 4. This might explain the rather good CE and ${\mathrm{TS}}_{15}$ scores achieved by the Type 1 and Type 3 models.

Figure 7 compares the water level hydrographs predicted by the Type 1 models in validation with the measured data, for a prediction lead time of 3 h. Figure 7a–d show the validation results for Typhoons Saola, Matmo, Trami, and Usagi, respectively. As seen in the figures, the models generally give reasonable forecasts compared with the measured data. All three models, however, exhibit some degree of time shift error on the rising limb of the hydrographs. Since the rising limb of the flood wave is the most important phase for flood forecasting activities, the delayed forecasts are a limitation of the models.

The above comparison of model performance is mainly based on a prediction lead time of 3 h. To investigate the performance at other lead times, the models are applied to forecasts with prediction lead times varying from 0.5 h to 3 h, and the three indexes CE, RTS, and ${\mathrm{TS}}_{15}$ are calculated for each prediction lead time. The results are shown in Figure 8a–c.
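The lead-time comparison can be illustrated with a short sketch. The definitions of the three indexes are given in Section 2 and are not reproduced here, so the sketch covers CE only and assumes that CE is the Nash-Sutcliffe coefficient of efficiency, a common choice in hydrology. The synthetic hydrograph, the lead times in time steps, and the naive persistence forecast used in place of the ARX models are all hypothetical.

```python
import numpy as np

def coefficient_of_efficiency(observed, predicted):
    """Nash-Sutcliffe form: 1 - sum of squared errors / variance sum of the observations."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = np.sum((observed - predicted) ** 2)
    spread = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / spread

# Synthetic single-peak hydrograph (made-up numbers).
t = np.arange(100)
observed = np.exp(-((t - 50) / 10.0) ** 2)

# Stand-in forecast: persistence, i.e. the level `lead` steps ahead equals the current level.
leads = [1, 2, 3, 4, 5, 6]
max_lead = max(leads)
target = observed[max_lead:]          # same evaluation window for every lead
ce_by_lead = []
for lead in leads:
    forecast = observed[max_lead - lead: len(observed) - lead]
    ce_by_lead.append(coefficient_of_efficiency(target, forecast))

print([round(ce, 3) for ce in ce_by_lead])
```

Evaluating every lead time on the same window keeps the scores comparable. With a fitted model, `forecast` would be replaced by the model output for the corresponding lead time.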

Figure 8a shows how the CE of the Type 1 models (L1, W1, and S1) varies with the prediction lead time. The CEs of all three models increase as the prediction lead shortens. For a prediction lead of 0.5 h, the CEs of all models exceed 0.95, with S1 the highest. As the prediction lead time increases, the CEs of the three models gradually decrease. The CE of L1 drops below 0.9 once the lead time exceeds 2 h, while W1 and S1 remain above 0.9 up to a prediction lead of 3 h. For prediction leads from 0.5 to 3 h, S1 gives the best CE performance, closely followed by W1, and L1 is the worst.

Figure 8b shows how the RTS of the Type 2 models (L2, W2, S2) varies with prediction lead times from 0.5 to 3 h. The RTS of all three models gradually increases with the prediction lead time, indicating greater relative time shift errors at longer leads. Of the three models, S2 has the best RTS performance and L2 the worst.

Figure 8c shows how the ${\mathrm{TS}}_{15}$ of the Type 3 models (L3, W3, S3) varies with the prediction lead time. The ${\mathrm{TS}}_{15}$ of all three models increases as the prediction lead shortens. At a lead time of 0.5 h, the ${\mathrm{TS}}_{15}$ of all three models exceeds 0.9, showing very good forecasts. As the prediction lead time increases, the ${\mathrm{TS}}_{15}$ of the three models gradually decreases. In terms of ${\mathrm{TS}}_{15}$, S3 performs best, closely followed by W3, and L3 performs worst.