1. Introduction
Floods are among the most common natural disasters in the world. Because of the property losses and social instability they cause, floods have long been the most prominent of all disaster types [1,2]. Each year, floods kill thousands to tens of thousands of people, affect the lives of hundreds of millions, and cause tens of billions of dollars in damage [3,4,5]. Without timely preventive measures, floods lead to even greater damage and to secondary problems such as traffic jams and plague. It is therefore necessary to determine how to effectively reduce or avoid the damage caused by floods.
In the past, the models used for flood forecasting were mainly physically based [6]. For example, the literature [7] pointed out the correlation between rainfall and runoff and predicted the water level in 72 test cases in eight different cities. The results were very satisfactory, which indicates that calculating the rainfall–runoff conversion directly from rainfall data is both desirable and necessary. The disadvantage is that the computational cost is relatively high. The literature [8] uses three methods (fully dynamic, diffusive, and kinematic waves) to describe the surface flow mathematically. The fully dynamic model predicts flow and water level well, but its performance is strongly limited by the grid size. The literature [9] used a shallow water surface flow model to test an area in the UK; the results were good, but the model's calculations were very complicated. The literature [10] considers the interaction between water flow and man-made structures (bridges, weirs, buildings, etc.) based on flow dynamics and demonstrates a case. The article points out that many factors influence flood dynamics and suggests inserting different types of hydrological data for learning in two-dimensional hydraulics. In practice, however, floods have too many influencing factors to be simulated fully. Although physically based models show a strong ability to predict various flood scenarios, they often require various types of hydrological and geomorphological monitoring data sets, and setting up and running the models is very time-consuming, which hinders short-term prediction [11]. In addition, as mentioned in reference [12], developing physically based models usually requires in-depth knowledge of and expertise in hydrological parameters, which is very challenging and hinders the wider adoption of hydrological models.
With accelerating digitization, a large amount of historical hydrological data has been preserved. In addition, with the improvement of computing power [13], large-scale data-driven models that used to take several months can now converge and obtain optimized results in just weeks or even hours, so data-driven flood forecasting models have appeared in large numbers. Examples are ANNs (Artificial Neural Networks) [14], neuro-fuzzy [15], adaptive neuro-fuzzy inference systems [16], support vector machines (SVM) [17], etc. However, common data-driven models treat flood forecasting as a sequence-to-sequence conversion task, treat the input rainfall data only as a time series, and use the RNN (Recurrent Neural Network) and its variants to analyze time-series information. For example, Xu Yuanhao [18] and his colleagues simulated and predicted the flood and waterlogging process in the middle reaches of the Yellow River based on the LSTM (Long Short-Term Memory) network and extracted temporal characteristics by feeding the rainfall data of 14 areas in the upper reaches of the Yellow River into the LSTM network at one time. Because the spatial distribution of rainfall across areas was not considered, the performance of the whole model decreased significantly when the forecast period exceeded 6 h. Francis Yongwa Dtissibe [19] and others designed a flood forecasting model using a multi-layer perceptron, with discharge as the only input and output variable. This model also treats flood forecasting as a sequence-to-sequence conversion task. It can accurately predict future discharge at the hydrological station on the Gardon d’Anduze, a river in the Gard department (France), Anduze Township. However, the forecast period of the model is only one hour, other factors (rainfall, temperature, etc.) are not considered, and the experiments were conducted at only one station. The model therefore has clear limitations when applied to multiple hydrological stations and multiple rivers. Liang Xiaoxu [20] and others improved the BIGRU (Bidirectional Gated Recurrent Unit) model by introducing an attention mechanism and computing the output as an attention-weighted sum, and achieved high-precision forecasts for the Xixianhe River Basin at a 36 h forecast period. Because the model mines only time-series information, it does not take into account the spatial distribution of rainfall and is insensitive to rainstorms confined to small areas.
Many types of data have obvious spatial distributions, such as the traffic flow distribution in the traffic flow prediction task [21,22], the spatial distribution of pixels in the computer vision task [23,24], and the rainfall data distribution in a given area. In traffic flow prediction, there are many temporal and spatial feature fusion schemes. Based on the principle of the support vector machine, Li Qiaoru [25] and others designed an adaptive spatiotemporal feature fusion model that dynamically updates the weights of spatiotemporal feature fusion. In computer vision, the classical convolutional neural network obtains deep features by mining the spatial distribution of image pixels. Zhao Zhihong [26] and others used the convolutional neural network LeNet-5 to recognize license plates, with an accuracy of 98%. In the field of flood forecasting, Yukai Ding [27] and others proposed an interpretable Spatio-Temporal Attention Long Short Term Memory model (STA-LSTM) based on LSTM and the attention mechanism. The model adjusts its weights dynamically through a temporal attention module and a spatial attention module, which better simulates the real rainfall convergence process. However, it did not describe the specific convergence relationship of the rainfall data in detail and did not use spatial information between rainfall stations to analyze that relationship. Amir Mosavi [28] and others applied the Normalized Difference Vegetation Index (NDVI) to runoff prediction and developed a simple spatiotemporal model based on the Generalized Structure of Group Method of Data Handling (GSGMDH). The inputs of the model are rainfall, runoff, and NDVI data. The NDVI is an important factor reflecting changes in runoff. NDVI values lie between −1 and +1. Water, cloud, and snow cover have negative NDVI values, bare soil and sand have positive values close to zero, and healthy, lush vegetation has higher positive values (0.2–0.8). The NDVI value can truly reflect the geographic information of the area [29]. Although the model achieved good prediction results, acquiring and processing NDVI data is very complicated, so the model is not widely applicable. Furthermore, the increase in runoff predicted by the model in autumn and winter may be related to the similarity of the NDVI values of water, clouds, and snow. Its extraction of spatial information is therefore insufficient.
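The NDVI ranges quoted above follow from the standard definition of the index, NDVI = (NIR − Red) / (NIR + Red), where NIR and Red are the near-infrared and red reflectances. The following is a minimal sketch of that formula and of the rough value ranges given in the text. The function names, the class boundaries used as hard thresholds, and the zero-signal convention are illustrative assumptions, not part of the cited work.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        # Assumed convention for a pixel with no signal in either band.
        return 0.0
    return (nir - red) / total


def cover_class(value: float) -> str:
    """Rough land-cover class using the value ranges quoted in the text.

    The hard thresholds (0 and 0.2) are a simplification for illustration.
    """
    if value < 0:
        return "water, cloud, or snow"
    if value < 0.2:
        return "bare soil or sand"
    return "vegetation"


# Hypothetical reflectance pairs (NIR, Red) for three pixels.
for nir, red in [(0.50, 0.10), (0.10, 0.30), (0.22, 0.20)]:
    value = ndvi(nir, red)
    print(f"NDVI = {value:+.2f} -> {cover_class(value)}")
```

Because water, cloud, and snow all fall in the same negative range, the index alone cannot tell them apart, which is the ambiguity noted above for autumn and winter.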
Effective mining of spatial information is essential to improve prediction accuracy. To fully exploit the spatial distribution information of rainfall and improve the overall predictive ability of the model, this paper uses remote sensing images to extract digital elevation information of the target basin. This elevation information captures the real topography of the basin, so that the convergence relationship between rainfall stations can be expressed more realistically. Each rainfall station is abstracted as a node, and the convergence relationship between stations is determined from the digital elevation information of the target basin and the geographic locations of the stations. On this basis, GA-RNN (Graph Attention Recurrent Neural Network), a spatiotemporal feature fusion model based on the graph attention mechanism [30], is proposed. The contributions are as follows:
- (1) Digital elevation information extracted from remote sensing images is used to convert rainfall data into graph data, and a GA-RNN model based on the graph attention mechanism is designed to extract the spatial characteristics of rainfall.
- (2) A comparison with models without spatial feature extraction demonstrates the performance improvement brought by spatiotemporal feature fusion.
- (3) Ten flood events are selected to evaluate the GA-RNN model.
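The conversion of rainfall stations into graph data can be illustrated with a small sketch. It assumes that candidate links between stations are already known (for example, from the river network) and that water flows from the higher station to the lower one; both the direction rule and all station names and elevations below are hypothetical and are not taken from the paper's data.

```python
from typing import Dict, List, Tuple


def orient_edges(
    elevation_m: Dict[str, float], links: List[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Orient each undirected link as (upstream, downstream) by elevation."""
    directed = []
    for a, b in links:
        if elevation_m[a] == elevation_m[b]:
            # Direction is ambiguous at equal elevation; skipped in this sketch.
            continue
        directed.append((a, b) if elevation_m[a] > elevation_m[b] else (b, a))
    return directed


def upstream_neighbors(
    elevation_m: Dict[str, float], directed: List[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """For every station, list the stations whose rainfall converges into it."""
    neighbors: Dict[str, List[str]] = {station: [] for station in elevation_m}
    for up, down in directed:
        neighbors[down].append(up)
    return neighbors


# Hypothetical stations with elevations in metres and candidate links.
elevation_m = {"S1": 820.0, "S2": 640.0, "S3": 655.0, "S4": 410.0}
links = [("S1", "S2"), ("S3", "S2"), ("S2", "S4")]

edges = orient_edges(elevation_m, links)
inflow = upstream_neighbors(elevation_m, edges)
print(edges)
print(inflow)
```

A graph attention layer would then weight, for each node, the features of the stations listed in `inflow`, so the neighbor lists are the part of this structure that the spatial feature extraction depends on.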
This article is organized as follows. The first part introduces the research significance and current situation. The second part introduces the research data processing scheme and the calculation process of GAT (Graph Attention Mechanism). The third part describes in detail the structural parameters and training results of the GA-RNN model based on the graph attention mechanism and gives the relevant statistics and comparative experiments. The fourth part is the discussion, which explains and analyzes the experimental data of the third part. The fifth part summarizes the research.
4. Discussion
As shown in Table 6 and Figure 14, in the annual forecast results of the model, the goodness of fit of the predicted curve decreases as the forecast period increases, especially around flood peaks. For the same time_step, the historical data available to the model is the same, but the prediction becomes harder as the prediction length increases. When the predicted time is farther from the historical data, the gap between the information available to the model and the real situation may be large, so the accuracy of the model decreases. In practical applications, a shorter forecast period usually requires higher forecast accuracy. From this point of view, the model's behavior is consistent with what is expected of a practical flood forecasting model.
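The relationship between a fixed time_step and a growing forecast period can be sketched as follows. The input window stays the same length while the target moves further into the future. The function name `make_samples`, the parameter `horizon`, and the toy series are illustrative assumptions and do not reproduce the paper's data pipeline.

```python
from typing import List, Sequence, Tuple


def make_samples(
    series: Sequence[float], time_step: int, horizon: int
) -> List[Tuple[List[float], float]]:
    """Pair each window of `time_step` values with the value `horizon` steps later."""
    samples = []
    last_start = len(series) - time_step - horizon
    for start in range(last_start + 1):
        window = list(series[start:start + time_step])
        # The target lies `horizon` steps after the last value in the window.
        target = series[start + time_step + horizon - 1]
        samples.append((window, target))
    return samples


# Toy discharge series; real inputs would be hourly station records.
discharge = [float(v) for v in range(10)]

short = make_samples(discharge, time_step=3, horizon=1)
long = make_samples(discharge, time_step=3, horizon=4)
print(short[0], long[0])
```

Both calls give the model the same first window, but the longer horizon asks for a value that is further from anything in that window, which is why the fit degrades as the forecast period grows.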
As shown in Table 7 and Figure 15, for the 12 h and 24 h forecast periods, the GA-RNN model's predictions of the key indicators met the error standard for all four floods. When the forecast period increases to 36 h, the forecast for the first flood deteriorates and no longer meets the error criterion, whereas the results for the other three floods still meet the standard. A comparison of the first flood with the other three shows that, at its peak, the discharge of the first flood exceeds 1000 m³/s, with a high peak value and rapid change. This large magnitude and fast change make the model less effective over a long forecast period, possibly because the model has learned less from this type of data.
As shown in Table 8 and Figure 15, the predicted discharge curves show that the GA-RNN model based on spatiotemporal feature fusion is superior to the LSTM network based on time-series analysis, especially when a flood occurs, and that it predicts the flood peak and its arrival time well. The RMSE of the GA-RNN model is 13.47% lower than that of the BIGRU model, and
is 8% higher. The RMSE of the GA-RNN model is 26.3% lower than the LSTM model, and the
is 16% higher. Compared with the LSTM model, the BIGRU model can take the context of the input sequence into account in both directions, so its prediction accuracy is higher, but it still performs worse than the GA-RNN model. The comparative experiments show that extracting the spatial features of rainfall data is very important and that the spatiotemporal feature fusion scheme is more accurate than the traditional pure time-series analysis models.
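The percentage comparisons above are relative reductions in RMSE. A minimal sketch of how such a figure is computed is given below. The function names and the sample discharge values are hypothetical and are not taken from Table 8.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and predicted discharge."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline: float, candidate: float) -> float:
    """Percentage by which `candidate` is lower than `baseline`."""
    return (baseline - candidate) / baseline * 100.0


# Hypothetical observed discharge and two model outputs (m³/s).
observed = [120.0, 340.0, 910.0, 560.0]
baseline_pred = [150.0, 300.0, 800.0, 600.0]
candidate_pred = [130.0, 330.0, 860.0, 575.0]

baseline_rmse = rmse(observed, baseline_pred)
candidate_rmse = rmse(observed, candidate_pred)
print(f"RMSE reduction: {relative_reduction(baseline_rmse, candidate_rmse):.1f}%")
```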
As shown in Table 10 and Figure 17, in terms of flood process prediction, the GA-RNN model fits the real flow curve accurately. According to the discriminant rules shown in Table 9, the qualification rate of the GA-RNN flood forecasts reaches 80%, so the model can be rated as a B-level model. Only floods #2 and #3 exceed the allowable error, because the flow of these two floods is too large. This large magnitude and rapid change make the model less effective over a long forecast period, which may be due to the lack of such data in the training set. Compared with floods #2 and #3, the rainfall of floods #1 and #6 is relatively small and gentle, and the peak flow changes more slowly. The prediction curve of the model is therefore relatively flat, and the predicted peak is lower than the true value. Overall, the model can be considered for practical use. Because the training data available during the experiment were limited, especially heavy rainfall data, the performance of the model is limited. With more flood data, the model is expected to perform better.
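The qualification rate is the share of flood events whose forecast error stays within the allowable error. The sketch below shows the arithmetic for ten events. The 20% peak-error tolerance and the grade thresholds (A at 85% or above, B at 70% or above, C at 60% or above) are assumptions based on commonly used hydrological forecasting criteria and stand in for the rules of Table 9, which is not reproduced here. The peak values are hypothetical.

```python
from typing import List, Tuple


def is_qualified(obs_peak: float, pred_peak: float, tolerance: float = 0.20) -> bool:
    """True if the peak error is within `tolerance` of the observed peak (assumed rule)."""
    return abs(pred_peak - obs_peak) <= tolerance * obs_peak


def qualification_rate(peaks: List[Tuple[float, float]], tolerance: float = 0.20) -> float:
    """Percentage of (observed, predicted) peak pairs that are qualified."""
    qualified = sum(is_qualified(obs, pred, tolerance) for obs, pred in peaks)
    return 100.0 * qualified / len(peaks)


def grade(rate: float) -> str:
    """Map a qualification rate to a grade using the assumed thresholds."""
    if rate >= 85.0:
        return "A"
    if rate >= 70.0:
        return "B"
    if rate >= 60.0:
        return "C"
    return "below C"


# Hypothetical (observed, predicted) peak discharges in m³/s for ten floods;
# the second and third events are deliberately outside the tolerance.
peaks = [
    (300.0, 280.0), (1500.0, 1000.0), (1800.0, 1300.0), (420.0, 400.0),
    (510.0, 530.0), (260.0, 240.0), (640.0, 600.0), (700.0, 720.0),
    (380.0, 350.0), (450.0, 470.0),
]

rate = qualification_rate(peaks)
print(f"qualification rate = {rate:.0f}%, grade {grade(rate)}")
```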
5. Conclusions
To address the insufficient mining of rainfall spatial features in existing data-driven flood forecasting models, this paper proposes GA-RNN, a spatiotemporal feature fusion model. The 50 rainfall stations are abstracted as nodes, and the rivers connecting rainfall stations are abstracted as edges. Digital elevation information extracted from remote sensing images is used to determine the direction of each edge. The rainfall data is described as graph data, and the graph attention mechanism is introduced for spatial feature analysis. The flood forecasting experiments show that over 90% of the forecast results of the GA-RNN model meet the standard under different forecast periods, and that the shorter the forecast period, the higher the accuracy. For the ten historical floods, more than 80% of the predicted results also met the standard. According to the hydrological evaluation criteria, the GA-RNN model is assessed as a Class B flood forecasting model. In the model comparison experiment, the RMSE of the GA-RNN model is 13.47% lower than that of the BIGRU model, and is 8% higher. The RMSE of the GA-RNN model is 26.3% lower than the LSTM model, and the is 16% higher. This shows that the prediction accuracy of the model improved significantly once remote sensing images were used to add spatial feature extraction of the rainfall data. However, when the flood peak discharge is high and changes rapidly, the prediction error of the model is relatively large. Increasing the proportion of such events in the training set may further improve the accuracy. In addition, the GA-RNN model considers only rainfall and flow characteristics. If evapotranspiration, weather, vegetation, and other characteristics that affect flood flow are also considered, the accuracy of the model may be further improved.
In addition, machine learning-based models depend heavily on data, and the quality of the data directly affects the results of the model. In the many areas where hydrological information is difficult to collect, the model is hard to apply. Its general applicability is therefore relatively limited, especially because it requires the digital elevation information of the area.