Automated Processing for Flood Area Detection Using ALOS-2 and Hydrodynamic Simulation Data

Abstract: Rapid and frequent mapping of flood areas is essential for monitoring and mitigating flood disasters. The Advanced Land Observing Satellite-2 (ALOS-2) carries an L-band synthetic aperture radar (SAR) capable of rapid and frequent disaster observations. In this study, we developed a fully automatic, fast, and robust method for detecting flood areas using ALOS-2 and hydrodynamic flood simulation data. This study is the first attempt to combine flood simulation data from the Today's Earth system (TE) with SAR-based disaster mapping. We used Bayesian inference to combine the amplitude/coherence data from ALOS-2 with the flood fraction data from TE. Our experiments used 12 sets of flood observation data from Japan and showed high accuracy and robustness under various ALOS-2 observation conditions. The flood simulation contributed to improving the accuracy of flood detection and reducing computation time. Based on these findings, we also assessed the operability of our method and found that the combination of ALOS-2 and TE data with our analysis method is capable of daily flood monitoring.


Introduction
Flooding is the most frequent and widespread natural disaster; it has affected more than 800,000 people and caused USD 300 billion in economic losses over the past decade [1]. In Japan, for example, approximately 41% of the population and 65% of national assets are concentrated in flood-prone areas [2]. Global climate change and rapid urbanization are expected to further increase the risk of flooding in the future [3]. Accurate and rapid mapping of the extent of flooding is thus essential for disaster management and mitigation activities such as search and rescue and floodwater drainage.
Remote sensing technology, particularly synthetic aperture radar (SAR), has long been expected to play an important role in rapid flood monitoring because of its ability to make observations in any weather, day or night [4]. Over the past two decades, many SAR satellites have been launched, creating a constellation of satellites that provide frequent quasi-real-time observation of floods at high resolution. The Advanced Land Observing Satellite-2 (ALOS-2) carries an L-band SAR, the Phased Array-type L-band Synthetic Aperture Radar-2 (PALSAR-2), for monitoring disasters and environmental changes [5]. ALOS-2 is characterized by the unique local solar time (LST) of its observations (at 0:00 and 12:00), which cover hours not monitored by other SAR satellites observing at 6:00 and 18:00 LST (such as TerraSAR-X, COSMO-SkyMed, RADARSAT-2, and Sentinel-1). Observations at 0:00 LST are particularly important because optical remote sensing and ground surveys cannot be performed at that time of day.
To ensure the computational efficiency of the algorithm, we employed Bayesian inference assuming simple Gaussian distributions of SAR data [31] and combined the flood fraction data from TE as prior probabilities of flooding.
Our method generates polygonised (vector) data as a final product because such data are easy for end-users (non-specialists in remote sensing) working on on-site disaster response to understand. Furthermore, polygon data have a much smaller file size than image data and can easily be sent to and managed by geographical information systems (GIS).
To validate the accuracy and robustness of our method, we processed 12 datasets for urban-area flooding in Japan acquired by ALOS-2 under a variety of observation conditions. Previous studies on flood detection by ALOS-2 were case studies that analyzed only a few flood events, which is insufficient for a feasibility assessment. Furthermore, we provisionally implemented our method within the ALOS-2 ground processing system and measured its computational speed to evaluate its suitability for operational use.

SAR Data by ALOS-2
We tested our method with 12 sets of ALOS-2 data corresponding to seven flood events in Japan, as listed in Table 1: flooding in Joso, Ibaraki Prefecture (cases 1 to 5) caused by the heavy rainfall in the Kanto-Tohoku area in September 2015 [32]; flooding in Kitami, Hokkaido Pref. (case 6) by Typhoon Mindulle in August 2016 [33]; flooding in Mabi, Kurashiki city, Okayama Pref. (case 7) by Typhoon Prapiroon in July 2018 [34]; flooding in Saga Pref. (cases 8 and 9) by a stationary front in August 2019 [35]; flooding in Chikuma, Nagano Pref. (case 10) and Naka, Ibaraki Pref. (case 11) by Typhoon Hagibis in October 2019 [36]; and flooding in Chiba Pref. (case 12) by an extratropical cyclone in October 2019 [37]. Figure 1 shows the locations of these flood events.
All of the co-event ALOS-2 data used in this study were acquired in the 3-m resolution Stripmap mode with HH polarization, owing to data availability. Pre-event archived data with the same beam (the same off-nadir angle) as the co-event data were required for InSAR, which is effective for flood detection in built-up areas. However, in cases 5, 9, 10, 11, and 12, pre-event data with different beams had to be used because of a lack of data. In case 10, 50-m resolution ScanSAR data were used for InSAR. In summary, the data for cases 5, 9, 11, and 12 could not be used to detect built-up area flooding accurately, but this was not a significant problem because there were only a few buildings at these sites.
We used data acquired with beam 14 or lower (off-nadir angles < 48°) because observations with excessively large off-nadir angles result in poor accuracy for flood detection [10]. The use of lower off-nadir angles also avoided radar shadows in mountainous areas.
The original ALOS-2 data used in our processing were the Standard Product Level 1.1, which is single-look complex (SLC) data processed and distributed by the Japan Aerospace Exploration Agency (JAXA).

Flood Simulation Data by TE
We used quasi-real-time flood simulation data processed and provided by a Japanese regional version of TE, TE-Japan [27], developed jointly by the University of Tokyo and JAXA. TE-Japan has been providing various physical quantities hourly on a 1/60° grid (corresponding to 1.5-km resolution) over Japan [29].
TE-Japan integrates surface meteorological data from the Meso-Scale Model of the Japan Meteorological Agency (JMA-MSM), the hydrologic land surface model (Minimal Advanced Treatments of Surface Integration and RunOff, MATSIRO) [38,39], and the hydrodynamic river model (Catchment-based Macro-scale Floodplain model, CaMa-Flood) [40,41]. Given the surface meteorological data from the JMA-MSM, MATSIRO solves the water and energy interactions between land and atmosphere. MATSIRO is a grid-based, one-dimensional vertical simulation model and does not consider water and heat transfer to or from neighboring grids. To calculate horizontal water movement on the land surface, CaMa-Flood solves the local inertial equation [42] as follows:

∂Q/∂t + gA(∂h/∂x) + gA(∂z/∂x) + gn²Q|Q|/(R^(4/3)A) = 0, (1)

where Q is the river discharge, A is the flow cross-section area, h is the flow depth, z is the bed elevation, R is the hydraulic radius, g is the gravitational acceleration, and n is the Manning friction coefficient (n = 0.03 in this model). This equation is based on the Saint-Venant equation; the first, second, third, and fourth terms represent the local acceleration, water pressure, terrain gradient, and friction slope, respectively. The variables x and t are the flow distance and time. CaMa-Flood assumes a simple cross-section for river channels (rectangular) and flood plains (trapezoidal), and the flood area is obtained indirectly from the cross-sectional shape and the depth of the water.
This study used the flooded area fraction (FLDFRC) obtained from TE. FLDFRC is the fraction of a grid cell that is flooded. Since CaMa-Flood has a relatively low spatial resolution (1.5 km in TE-Japan) compared with SAR measurements and assumes a simple cross-sectional shape for a water body, FLDFRC serves as a proxy for a large-scale phenomenon rather than a deterministic indicator of local-scale floods.
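To make the time-stepping of Equation (1) concrete, the sketch below implements one explicit update of the discharge with the semi-implicit friction treatment commonly used for local inertial schemes (after Bates et al., 2010). The wide-channel approximation R ≈ h and all function and variable names are our illustrative assumptions, not TE-Japan's actual implementation.

```python
def local_inertial_update(Q, A, h_up, h_dn, z_up, z_dn, dx, dt, n=0.03, g=9.8):
    """One explicit time step of the local inertial equation (Equation (1)).

    Q: discharge [m^3/s]; A: flow cross-section area [m^2];
    h_*: flow depth [m] and z_*: bed elevation [m] at the upstream/downstream cells;
    dx: flow distance [m]; dt: time step [s]; n: Manning friction coefficient.
    """
    # Water-surface slope d(h+z)/dx combines the water-pressure and
    # terrain-gradient terms of Equation (1).
    s = ((h_dn + z_dn) - (h_up + z_up)) / dx
    # Wide-channel approximation: hydraulic radius ~ flow depth (assumption).
    R = h_up
    # Semi-implicit friction keeps the explicit scheme stable:
    #   Q_new = (Q - g*A*dt*s) / (1 + g*dt*n^2*|Q| / (R^(4/3) * A))
    return (Q - g * A * dt * s) / (1.0 + g * dt * n**2 * abs(Q) / (R ** (4.0 / 3.0) * A))
```

With a falling water surface the discharge grows; on a flat surface only friction acts and the discharge decays toward zero.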

Other Ancillary Data
A land-cover/land-use map was used to correct misclassifications that occur in specific land-use categories. This study used the High-Resolution Land-Use and Land-Cover map (HRLULC), a 10-m resolution, 10-category land-use/land-cover product processed and provided by JAXA.
Terrain data (digital elevation model, DEM) were used in the orthorectification of SAR data. This study used the DEM from the Fundamental Geospatial Data provided by the Geospatial Information Authority of Japan (GSI), with 10-m resolution and approximately 5-m elevation accuracy.
For evaluating our method, we compared our flood detection results with the reference flood extent data available from GSI [32][33][34][35][36][37] (shown in Figure 1 and listed in Table 2). Disaster survey experts at GSI produced these maps by manually interpreting high-resolution (approximately 20 cm) aerial photographs.
Figure 3 details the pre-processing for preparing the SAR amplitude, coherence, FLDFRC, and HRLULC images, which were used in the main processing (Section 3.2).

Pre-Processing
Multi-looked (2 × 2 pixels) amplitude images were made from the co-registered SAR data and speckle-filtered by a Frost filter [43] with a 3 × 3-pixel sliding window. The Frost filter is computationally efficient, preserves point targets, and is thus effective for flood detection [44]. A larger filter window (e.g., 5 × 5 pixels) reduces more of the minute misclassifications; however, it was not effective in this study because the final polygonization process already removed most of the noise in the classification results. If multiple pre-event amplitude images a_pre,i were available, their pixel-wise minimum a_pre_min was computed and used together with the co-event amplitude image a_co. Other statistics, such as the maximum, average, and temporal variance, may be useful but were not used in this study for reasons of simplicity and computational efficiency. The minimum value is the most effective for capturing peak seasonal inundation (typically in rice paddies) and preventing these areas from being falsely detected as flooded.
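The pre-event minimum composite described above can be sketched as follows (a plain-Python illustration with hypothetical names; an operational system would use array libraries):

```python
def min_composite(pre_images):
    """Pixel-wise minimum over a stack of pre-event amplitude images (dB).

    pre_images: list of 2-D lists, all the same shape. Pixels that were dark
    (water-like) in ANY pre-event scene, e.g., seasonally irrigated paddies,
    keep a low pre-event value, so they are not flagged as newly flooded.
    """
    rows = len(pre_images[0])
    cols = len(pre_images[0][0])
    return [[min(img[r][c] for img in pre_images) for c in range(cols)]
            for r in range(rows)]
```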
Interferometric coherences were calculated in a 10 × 10-pixel window, i.e., a 5 × 5-pixel sliding window following the 2 × 2-pixel multi-looking, for the pair of the co-event and latest pre-event images (γ_pre-co) and the pair of the latest two pre-event images (γ_pre-pre). Many other combinations of coherence pairs are conceivable but were not used, to avoid complexity. Coherence can be decreased not only by flooding but also by factors such as spatial and temporal decorrelation and thermal noise [45]. In the case of ALOS-2, temporal decorrelation can be significant because of the long time separation between observations. To compensate for decorrelation unrelated to flooding, we applied histogram matching to the two coherence images and then calculated their difference, Δγ = γ_pre-co − γ_pre-pre. As normal histogram matching turned out to overcorrect the decorrelation, we proposed a masked histogram matching (detailed in Appendix A).
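The masked histogram matching itself is detailed in the paper's Appendix A; as a rough illustration of the general idea, the sketch below performs quantile-based histogram matching in which an optional mask restricts which pixels define the matching function. The function name and the mask semantics are our assumptions, not the authors' exact procedure.

```python
import bisect

def histogram_match(source, reference, mask=None):
    """Map `source` values so their empirical distribution matches `reference`.

    source, reference: flat lists of pixel values (e.g., coherence in [0, 1]).
    mask: optional list of booleans; only True pixels define the matching
    function, but the mapping is applied to every source pixel.
    """
    idx = range(len(source)) if mask is None else [i for i, m in enumerate(mask) if m]
    src = sorted(source[i] for i in idx)
    ref = sorted(reference[i] for i in idx)
    out = []
    for v in source:
        # Empirical CDF position of v in the (masked) source distribution ...
        r = bisect.bisect_left(src, v)
        q = r / max(len(src) - 1, 1)
        # ... mapped to the value at the same quantile of the reference.
        out.append(ref[min(int(round(q * (len(ref) - 1))), len(ref) - 1)])
    return out
```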
Hourly FLDFRC data around the time of the SAR observation can be used as prior information on flood risk. However, because of the nature of model-based simulation, the simulated flood may contain timing errors. For this reason, the temporal peak value of FLDFRC over 24 hours, FLDFRC_max, was used to capture the peak of the flood probability.
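Taking the temporal peak of the hourly simulation can be sketched per pixel as follows (illustrative names; a real pipeline would operate on gridded arrays):

```python
def fldfrc_max_24h(hourly_fldfrc):
    """Pixel-wise temporal peak of hourly FLDFRC over a 24-h window around the
    SAR observation, absorbing timing errors of the simulated flood.

    hourly_fldfrc: list of hourly snapshots, each a flat list of pixel values.
    Returns one flat list with the per-pixel maximum over time.
    """
    return [max(vals) for vals in zip(*hourly_fldfrc)]
```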
Finally, all SAR images (a_co, a_pre_min, and Δγ) were orthorectified and geocoded from radar coordinates to the Universal Transverse Mercator (UTM) projection with 5-m spatial resolution. The FLDFRC_max and HRLULC images, originally provided in an equirectangular latitude/longitude projection, were re-sampled to the same projection and resolution as the SAR data.
Note that the co-event SAR data were co-registered with the latest pre-event SAR data before these steps. Since rapid products of co-event data may have larger geometric errors in actual disaster response, the pre-event data are more reliable in geometry and should be used as the basis for co-registration.


Main Processing-Flood Detection
Our method employed two types of complementary data, i.e., low-resolution simulation data (FLDFRC) and high-resolution SAR data. The pre-processed SAR data (co-event amplitude, pre-event minimum amplitude, and the coherence difference) were unified into a random vector x = (a_co, a_pre_min, Δγ)^T and used for pixel-based classification; FLDFRC_max was used as the prior probability of flooding in the Bayesian inference.
To reduce the computation time, pixels with significantly low flood probability (FLDFRC_max < 0.05) were masked and excluded from processing.
Since flooded areas can have different characteristics in SAR images depending on the land-cover type, a four-category classification was initially performed: F_1: permanently non-water; F_2: permanent water (non-flooded); F_3: open-water flooding; F_4: flooded built-up area. Figure 4 depicts a schematic illustration of the four categories. Similar to Giustarini et al. [31], the conditional probability P(F_i|x) was computed by Bayesian inference as follows:

P(F_i|x) = P(x|F_i)P(F_i) / Σ_j P(x|F_j)P(F_j), (2)

where P(F_i) is the prior probability computed from FLDFRC_max and P(x|F_i) is the probability density function described in Equation (4). The denominator of Equation (2) is the total probability density of x. Since FLDFRC_max represents the peak value of the flood risk over 24 h, it inevitably tends to overestimate the extent of flooding. Thus, similar to D'Addabbo et al. [30], we used a logistic function to convert FLDFRC_max to the prior probability of flooding f as follows:

f = A / (1 + exp(−B(FLDFRC_max − C))), (3)

where A, B, and C are the parameters of the logistic function; we set A = 0.5, 1/B = 0.05, and C = 0.2 (detailed in Appendix B). The probability density function P(x|F_i) in Equation (2) is a Gaussian distribution:

P(x|F_i) = N(x; µ_i, σ), (4)

where N represents the Gaussian distribution and µ_i and σ are its parameters, empirically defined as shown in Table 3. For instance, F_3 (open-water flooding) was modeled as a smaller co-event amplitude (a_co = t_a − ε_a), a larger pre-event amplitude (a_pre_min = t_a + ε_a), and a small change in coherence (Δγ ≈ 0). The threshold t_a for distinguishing water from non-water was set based on a previous study (for beam 8, for instance, −14 dB [10]). That study intensively investigated the threshold dependency of flood detection accuracy and showed that variations of 1 dB in the threshold did not significantly affect accuracy; this enabled us to use coarse thresholds with 1-dB steps. The smaller the incidence angle, the larger the required threshold, because of the large backscattering at steep angles.
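The pixel-wise Bayesian classification of Equations (2)-(4) can be sketched as follows. The class means, standard deviations, the value B = 20 (from 1/B = 0.05), and the assignment of the prior f to the flooded categories are our illustrative assumptions standing in for the paper's Table 3 and Appendix B, not the authors' exact parameters.

```python
import math

# Illustrative parameters (stand-ins for the paper's Table 3, which varies by beam):
T_A = -14.0              # amplitude threshold t_a for beam 8 [dB]
EPS_A = 2.0              # assumed deviation eps_a [dB]
SIG = (2.0, 2.0, 0.15)   # assumed std. devs for (a_co, a_pre_min, d_gamma)

# Class means mu_i = (a_co, a_pre_min, d_gamma), following the qualitative
# description: F3 = water now / land before / coherence unchanged;
# F4 = large coherence drop in built-up areas.
MU = {
    "F1_no_water":    (T_A + EPS_A, T_A + EPS_A,  0.0),
    "F2_perm_water":  (T_A - EPS_A, T_A - EPS_A,  0.0),
    "F3_open_flood":  (T_A - EPS_A, T_A + EPS_A,  0.0),
    "F4_built_flood": (T_A + EPS_A, T_A + EPS_A, -0.4),
}

def gauss(v, mu, sigma):
    """1-D Gaussian density (one factor of Equation (4))."""
    return math.exp(-0.5 * ((v - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def prior_flood(fldfrc_max, A=0.5, B=20.0, C=0.2):
    """Logistic prior from FLDFRC_max (Equation (3))."""
    return A / (1.0 + math.exp(-B * (fldfrc_max - C)))

def classify(x, fldfrc_max):
    """Return the maximum-posterior category and the normalized posteriors."""
    f = prior_flood(fldfrc_max)
    priors = {"F1_no_water": 1 - f, "F2_perm_water": 1 - f,
              "F3_open_flood": f, "F4_built_flood": f}
    post = {}
    for cat, mu in MU.items():
        lik = 1.0
        for v, m, s in zip(x, mu, SIG):  # independent Gaussians per component
            lik *= gauss(v, m, s)
        post[cat] = lik * priors[cat]    # Bayes numerator, Equation (2)
    total = sum(post.values())           # Bayes denominator, Equation (2)
    return max(post, key=post.get), {c: p / total for c, p in post.items()}
```

A pixel that is dark now but was bright before the event, with unchanged coherence, lands in F_3; a pixel that stays bright but loses coherence lands in F_4.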
The slightly higher thresholds at large incidence angles (beam 12 and later) stem from the high noise level at shallow angles and long ranges. ε_a is the deviation of the probability density function; the larger ε_a is, the fuzzier the flood indication in the SAR data. At large incidence angles, the backscatter from rough soil is small and easily confused with water [46]; hence, we set a larger ε_a so that the prior probability contributed more to flood detection. F_4 (flooded built-up area) was modeled as a large decrease in coherence due to decorrelation (Δγ < −0.3) caused by changes in the propagation path of microwaves.
Our method still works even if some data are not available. If SAR coherence information is not available, the number of classification categories is reduced (only F_1 to F_3), and the dimension of the random vector is reduced accordingly, i.e., x = (a_co, a_pre_min)^T. In this case, built-up area floods cannot be detected well because of the lack of coherence information. If there is no FLDFRC or similar prior information on the flood, an equal prior probability (f = 0.5) can be given so that our method still works. Conversely, it is also possible to expand the number of classification categories and random variables. For instance, flooded vegetation, which can be detected by an increase in double-bounce scattering (but was ignored in the present study), can be included by adding a new category F_5 (flooded vegetation) and expanding the random vector to (a_co, a_pre_min, Δγ, Δa)^T, where Δa is the increase in amplitude (co-event − pre-event).
The initial four-category classification map was obtained by selecting the category with the largest posterior probability given by Equation (2). However, floods in rice paddies, which should be classified as F_3 (open-water flooding), tend to be classified as F_2 (permanent water) because of irrigation. To correct this misclassification, the following correction was applied to pixels in rice paddies: pixels classified as F_2 were reclassified as F_3 if at least 5% of the pixels in a 21 × 21 window were F_3. This correction is based on the fact that, if paddy levees and the roads around paddy fields are submerged, the inner paddy fields can also be regarded as flooded. The window size used in this correction is based on the typical size of paddy fields in Japan (approximately 100 m, i.e., 20 pixels).
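The neighborhood-based rice-paddy correction can be sketched as follows (a plain-Python illustration; the string category labels and the clipping of the window at image edges are our assumptions):

```python
def paddy_correction(classes, paddy_mask, win=21, frac=0.05):
    """Reclassify F2 (permanent water) pixels inside rice paddies as F3
    (open-water flooding) when at least `frac` of the surrounding
    win x win window is already F3.

    classes: 2-D list of category labels ("F1".."F4");
    paddy_mask: 2-D list of booleans from the land-use map (HRLULC-like).
    """
    h, w = len(classes), len(classes[0])
    half = win // 2
    out = [row[:] for row in classes]
    for r in range(h):
        for c in range(w):
            if not (paddy_mask[r][c] and classes[r][c] == "F2"):
                continue
            # Window clipped at the image border.
            r0, r1 = max(0, r - half), min(h, r + half + 1)
            c0, c1 = max(0, c - half), min(w, c + half + 1)
            n = (r1 - r0) * (c1 - c0)
            n_f3 = sum(classes[rr][cc] == "F3"
                       for rr in range(r0, r1) for cc in range(c0, c1))
            if n_f3 >= frac * n:
                out[r][c] = "F3"
    return out
```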
A binary (flood and non-flood) map was then obtained by merging F_1 and F_2 into non-flooded and F_3 and F_4 into flooded. We then polygonised the binary map and eliminated minute polygons smaller than 400 square meters, equivalent to a minimum mapping unit (MMU) of 44 pixels. Similar MMU values were used in other studies, e.g., 30 pixels [25]. To prevent a possible increase in data size due to unexpected errors, we also introduced a failsafe that removes all but the 200 largest polygons in the image. The data size of the initial polygon data was redundantly large because every pixel on the boundary of a flood area was converted into a vertex of the polygon. The shape of each polygon was therefore simplified by the Ramer-Douglas-Peucker algorithm (tolerance ε = 20 m in this study) [47].
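The final simplification step can be sketched as a standard recursive Ramer-Douglas-Peucker implementation (ε in the same units as the coordinates, e.g., 20 m as in this study; vertex lists and tuple coordinates are our illustrative representation):

```python
def rdp(points, eps):
    """Ramer-Douglas-Peucker polyline simplification.

    points: list of (x, y) vertices; eps: distance tolerance.
    Vertices closer than eps to the chord between the endpoints are dropped.
    """
    if len(points) < 3:
        return points[:]
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = (dx * dx + dy * dy) ** 0.5
    # Find the vertex farthest from the chord between the endpoints.
    dmax, imax = -1.0, 0
    for i in range(1, len(points) - 1):
        x0, y0 = points[i]
        if norm == 0:
            d = ((x0 - x1) ** 2 + (y0 - y1) ** 2) ** 0.5
        else:
            d = abs(dy * x0 - dx * y0 + x2 * y1 - y2 * x1) / norm
        if d > dmax:
            dmax, imax = d, i
    if dmax <= eps:
        return [points[0], points[-1]]
    # Keep the farthest vertex and recurse on both halves.
    left = rdp(points[: imax + 1], eps)
    right = rdp(points[imax:], eps)
    return left[:-1] + right
```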

Accuracy Evaluation
We compared the obtained flood maps with the reference data from GSI and calculated four accuracy indicators: the Kappa coefficient κ, a robust accuracy index particularly for unbalanced datasets (where flood and non-flood samples are not equal in number) [48]; Recall, i.e., the fraction of actual floods that are correctly detected (also known as the true positive rate); Precision, i.e., the fraction of detected flood areas that were actually flooded; and the F-measure F, the harmonic mean of Recall and Precision. Overall accuracy (the fraction of all pixels classified correctly) was not used in this study because it overestimates accuracy on unbalanced datasets [10].
Figure 5 shows the step-by-step outputs of our method for case 7 as an example: the initial classification maps before (a) and after (b) the rice-paddy correction, and the polygons of flood areas before (c) and after (d) simplification; (a)-(d) are results obtained using FLDFRC. Figure 5e shows the simplified polygons obtained without FLDFRC for comparison. The result obtained using FLDFRC (d) is more similar to the reference data (f), while the result without FLDFRC (e) contains several over-detections around the actual flood areas. The rice-paddy correction (from (a) to (b)) using HRLULC effectively improved the under-detection of flood areas.
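The four indicators can be computed from paired binary labels as follows (a minimal sketch; pixel weighting and no-data handling are ignored):

```python
def flood_metrics(pred, truth):
    """Kappa, Recall, Precision, and F-measure from paired binary labels
    (1 = flood, 0 = non-flood). Overall accuracy is deliberately omitted:
    it overstates performance when non-flood pixels dominate."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    n = tp + fp + fn + tn
    po = (tp + tn) / n                                            # observed agreement
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2   # chance agreement
    kappa = (po - pe) / (1 - pe) if pe < 1 else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f = 2 * recall * precision / (recall + precision) if recall + precision else 0.0
    return kappa, recall, precision, f
```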

Overview of the Results
Table 4 and Figure 6 summarize the flood detection accuracy and computation time for each case, comparing the results obtained with and without FLDFRC to show its effectiveness. For most cases, the use of FLDFRC improved accuracy and reduced computation time. Case 7 had the highest accuracy (κ ≈ 0.9), cases 4, 5, and 9 had low accuracy (κ ≈ 0.5), and the average accuracy was about κ ≈ 0.7. Precision tended to be higher than Recall.

Result of Each Flood Event
From the results of the flood in Joso, the accuracy improvement from using FLDFRC was relatively small for the observations at smaller off-nadir angles (cases 1-3 and 5); that is, in these cases, the SAR data could sufficiently detect flood areas without the flood simulation. These results also show the expansion of the water area in the Watarase Reservoir, the largest flood-control basin in Japan (Figure 7b,d). These areas are not included in the accuracy verification because there were no reference data outside the verification area. FLDFRC was particularly effective in case 4, where the off-nadir angle was large. The 24-h maximum value of FLDFRC based on the time of the first observation (September 11) was used for cases 1 to 5 because FLDFRC is based on a model of the normal shape of rivers and could not simulate the prolonged flooding caused by the levee collapse.
The flood in Kitami (case 6) suffered from low coherence in the pre-event InSAR data but still yielded a relatively good result.
The flood in Mabi (case 7) was observed under good conditions; multi-temporal and highly coherent pre-event data were available with the same off-nadir angle. The results agreed well with the reference data; the accuracy was κ = 0.8 without FLDFRC and κ = 0.9 with FLDFRC. The floods detected outside the validation area (blue circles in Figure 7n) were likely overland floods of rice-paddy fields.
The result for the flood in Saga on 28 August (case 9) was relatively poor because of undetected floods in crop fields (due to the large incidence angle) and built-up areas (due to the lack of coherence information).
The data for the flood in Chikuma (case 10) required extra consideration in the pre-processing. Since there were no interferometric (same off-nadir angle) pre-event data from the high-resolution Stripmap mode, pre-event data from the low-resolution ScanSAR mode were used instead. The coherence was calculated with a larger multi-look window of 10 × 40 pixels (2 × 8 looks in range × azimuth and a 5 × 5-pixel sliding window, corresponding to 100-m resolution) to avoid decorrelation due to the different bandwidths and pulse repetition frequencies (PRFs). Although this resolution is considerably larger than individual buildings, ScanSAR is still useful for detecting changes in clusters of buildings [49].
In cases 8 and 11, accuracy validation was difficult because of the large difference in observation time between the SAR and reference data. We assumed that the reference data capture the peak of the flood area and that the SAR-derived flood area is off-peak and contained within the peak flood area. In these cases, only Precision could be computed.
Note that the computation time shown in Figure 6c covers only the main processing (Section 3.2) and does not include the time for pre-processing (Section 3.1) and evaluation (Section 3.3). The process was run on the ALOS Geoscience and Application Processor (AGAP, a subsystem of the ALOS-2 ground system used for processing high-level products) with Intel XEON Gold (2.6 GHz) processors; eight processors were used in parallel. Figure 7 shows the flood maps obtained with and without FLDFRC. As mentioned above, in cases 8 and 11 (not shown in Figure 7), the SAR-derived flood maps were small because of the observation time differences. For the other cases, flood maps similar to the reference data were successfully obtained.


The Effect of Using the Flood Simulation
Using FLDFRC improved flood detection accuracy, especially by reducing over-detection. Although the accuracy measures did not appear to improve significantly in some cases (e.g., case 1 in Table 4), FLDFRC removed many over-detections outside the validation area (as shown in Figure 5d,e). Improvements outside the validation areas are not reflected in the accuracy measures because of the lack of reference data.
The degree of accuracy improvement depended on the observation conditions of the SAR data. FLDFRC particularly improved results under adverse conditions such as large off-nadir angles. In case 4, which had the largest off-nadir angle in this study, the use of FLDFRC was essential to achieve the minimum requirement for flood detection accuracy (approximately 60% for the Japanese disaster community). The effect of FLDFRC in reducing calculation time also depended on the data: the fewer the simulated flood areas, the more calculations could be skipped.
As FLDFRC is calculated by assuming a rectangular river channel and a trapezoidal flood plain, without taking levee collapse into account [40], it sometimes underestimates flood probability. In case 1, some floods could be detected without FLDFRC but not when using FLDFRC (indicated by the blue circles in Figure 7b). This error was caused by a very small FLDFRC value that brought the prior probability to almost zero. The low resolution of FLDFRC (1.5 km in Japan) compared with SAR (5 m in this study) also contributed to these errors, and the even lower resolution of FLDFRC outside of Japan (25 km) may present a problem when applying our method on a global scale. Higher-resolution, more accurate flood simulations will be needed for better flood estimation. Although FLDFRC is effective for the coarse estimation of floods, SAR data are still essential for deciding whether flooding has occurred.

Accuracy
The most important factor that influences flood detection accuracy is the off-nadir angle [10]. In this study, the accuracy of cases 4 (48° off-nadir angle) and 9 (45°) was relatively poor. A large undetected flood in case 4 (blue circle in Figure 7g,h) and many discrepancies with the reference data in case 9 (Figure 7p) arose from the misclassification of rice paddies and crop fields. Backscattering from the ground (rough soil or low vegetation) was small at large incidence angles and comparable to that of a water surface [46]. Furthermore, the loss of power due to the long range increased the noise level and made it more difficult to distinguish a water surface.
Even at smaller off-nadir angles, under-detection of floods occurred in the low-backscatter crop fields (e.g., blue arrows in Figure 7k,l). This is a characteristic of the L-band and requires shorter wavelengths (X- and C-bands) for radical improvement.
The quality of archived pre-event data also affected accuracy. Pre-event data that are too old and interferometric pairs with too large a temporal separation degrade accuracy. As this study performed InSAR processing using temporally distant data, temporal decorrelation was sometimes mistakenly detected as flooding. The long observation interval is due to the satellite's specifications, i.e., ALOS-2 returns to the same path (the same ground track) every 14 days, but this does not mean that observations are made every 14 days. As the swath width of ALOS-2 (3-m resolution Stripmap mode) is approximately 50 km, while the interval between paths is approximately 200 km, it takes 2-3 months to completely cover the land surface. This issue is expected to be solved by the follow-on satellite, ALOS-4 [50], which will extend the swath width to 200 km and enable observations every 14 days. Once the frequency of data availability is increased by ALOS-4, using all pre-event data acquired in the year preceding a disaster will effectively reduce the influence of seasonal changes.
In cases where the off-nadir angle of the pre-event data differed from that of the co-event data (cases 5, 9, and 12), floods in built-up areas could not be detected because of the lack of coherence information. Case 10 had to use ScanSAR-Stripmap interferometry, resulting in lower resolution and lower accuracy due to the different bandwidths and PRFs. Consequently, acceptable accuracy could be obtained from data acquired with beam 14 or lower, but beams 6 to 9 were preferred because they already had abundant archived data ready for time-series analysis and interferometric processing.
Many studies have already reported the flood detection accuracy of ALOS-2 using the same reference data as this study, i.e., the Joso (reported accuracies in Kappa or F-measure of 0.6 to 0.7) [10,12,44,51-54], Mabi (reported accuracy ~0.8) [54,55], and Chikuma (~0.5) [49] floods. Compared to these previous studies, our method using FLDFRC naturally obtained higher accuracies because of the contribution of the flood simulation. Our results without FLDFRC tended to be less accurate because some of these past studies used more multi-temporal amplitude/coherence information, which is currently not included in our method for simplicity. Furthermore, the past studies analyzed a limited number of cases and may be too specific to the test case, resulting in an overestimation of accuracy. Our method was validated with many datasets covering a variety of flood events, confirming its robustness and repeatability.
Another major cause of error was the difference between the observation times of the SAR and the reference data. In cases 3, 6, 7, 8, and 11, the SAR/reference time differences exceeded 10 h, as shown in Tables 1 and 2. However, case 7 (Mabi) was reservoir-type flooding in which the water did not drain naturally, so the large time difference did not significantly affect accuracy.
The present method assumed a simple Bayesian inference and Gaussian distributions that considered only decreases in amplitude and coherence. Thus, other parameters (e.g., amplitude increases caused by the double-bounce effect, more multi-temporal data) and more complex probability distributions should be considered to further improve accuracy. Table 5 summarizes the characteristics of our methodology in terms of flood detection from ALOS-2 data. Our method is computationally fast and meets user requirements, although its accuracy can be improved. The method requires no parameter adjustments other than the threshold, which should be set according to the beam number as shown in Table 3. Note that Table 3 covers all the ALOS-2 beams used for flood observation in Japan, but other satellites with different frequencies, S/N ratios, and resolutions require different thresholds. The repeatability of the method was confirmed with various ALOS-2 datasets in this study, but the robustness of more complex models, such as machine learning, should be verified in the future using similar datasets.
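The kind of pixel-wise Bayesian inference described above can be sketched as follows. This is a minimal illustration assuming Gaussian class-conditional densities of amplitude and coherence change; the means and standard deviations below are hypothetical placeholders, not the values used in the study:

```python
import numpy as np
from scipy.stats import norm

def flood_posterior(d_amp, d_coh, prior_f,
                    flood_stats=((-4.0, 2.0), (-0.3, 0.1)),
                    nonflood_stats=((0.0, 2.0), (0.0, 0.1))):
    """Posterior P(flood | amplitude change, coherence change) per pixel.

    d_amp:   co-event minus pre-event amplitude (dB)
    d_coh:   co-event minus pre-event coherence
    prior_f: prior flood probability (e.g., derived from FLDFRC)
    The Gaussian (mean, sigma) pairs are illustrative; floods are assumed
    to lower both amplitude and coherence, as in the paper's model.
    """
    (ma, sa), (mc, sc) = flood_stats
    (na, ta), (nc, tc) = nonflood_stats
    # Class-conditional likelihoods, assuming independence of the two cues.
    like_f = norm.pdf(d_amp, ma, sa) * norm.pdf(d_coh, mc, sc)
    like_n = norm.pdf(d_amp, na, ta) * norm.pdf(d_coh, nc, tc)
    # Bayes' rule: a prior of 0 (or 1) forces the decision regardless of SAR data.
    return like_f * prior_f / (like_f * prior_f + like_n * (1.0 - prior_f))
```

Note how a prior of 0 from the flood simulation suppresses detection entirely, matching the behavior discussed for FLDFRC, while a neutral prior of 0.5 leaves the decision to the SAR evidence alone.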

Operability of Flood Observation
Based on the above results, we discuss the feasibility of daily flood monitoring using ALOS-2. Our results showed that ALOS-2 data observed in the 3-m resolution Stripmap mode up to beam 14 (a 48.0° off-nadir angle) can detect flood areas with acceptable accuracy. Figure 8 and Table 6 show that the use of beams up to 14 is sufficient for our primary purpose, i.e., daily flood monitoring. For example, at noon (on a descending path) on day 1, the satellite flies southward over path 19 (the thick red line in Figure 8a) and can observe the flood area with beam 5 in right-looking mode.
The same area can be observed again at midnight (in an ascending orbit) from path 117 with beam 14 in left-looking mode. In total, two observations can be made on day 1.
ALOS-4, the follow-on satellite capable of similar observations with an extended swath width, is scheduled to be launched into the same orbital plane and operated simultaneously with ALOS-2. Table 7 shows the number of observations in each half-day for three configurations, i.e., ALOS-2 only, ALOS-2 and -4 with 180° of orbit separation, and ALOS-2 and -4 with 103° of separation. With 180° of separation, ALOS-4 would follow the same observation paths seven days after ALOS-2, but observation gaps would remain (the number of observations is zero in Table 7) at noon on days 2 and 9 and at midnight on days 3 and 10. Based on orbital theory, there are 14 possible separation angles (n/14 × 360°) in the orbital plane; at 103° (n = 4), ALOS-4 would follow the same observation paths six days after ALOS-2, enabling gapless twice-daily observation. Therefore, this study recommends 103° of separation between the two satellites for rapid and frequent disaster monitoring.
The complementary use of other SAR satellites should also be considered in the future. For example, the European Sentinel-1 mission can observe any location at six-day intervals using two satellites. As these satellites have different radar frequencies, S/N ratios, and scattering characteristics, their thresholds would need to be modified.
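The candidate separation angles can be enumerated directly; a minimal sketch of the n/14 × 360° arithmetic above:

```python
# With a 14-day repeat cycle, orbital theory allows 14 discrete in-plane
# separations of n/14 * 360 degrees (n = 0..13) between ALOS-2 and ALOS-4.
# The text singles out n = 7 (the 180-deg case) and n = 4 (~103 deg).
angles = {n: n / 14 * 360 for n in range(14)}

print(f"n = 7: {angles[7]:.1f} deg")  # 180-deg configuration
print(f"n = 4: {angles[4]:.1f} deg")  # ~102.9 deg, i.e., the ~103-deg case
```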

Conclusions
In this study, we developed a fully automatic, computationally fast, and robust method for detecting flood areas using two complementary data types: SAR data from ALOS-2 and hydrodynamic flood simulation data from TE. We adopted Bayesian inference to combine the SAR amplitude/coherence and simulated flood fraction data. From the experimental results obtained from 12 datasets (corresponding to seven flood events) of urban-area flooding in Japan, the following conclusions were reached.

1. The simulated flood fraction data contributed to improving flood detection results, with accuracy higher than in previous studies. Note that the accuracy depended on the observation conditions, i.e., appropriate off-nadir angles (<50° for acceptable accuracy) and the availability of sufficient time-series pre-event data.
2. The flood fraction also effectively reduced computation time by skipping unnecessary processing for non-flooded areas and fulfilled the user requirement of processing within two hours.
3. Our method proved robust against non-ideal observation conditions, e.g., data with large off-nadir angles, different off-nadir angles or observation modes between co- and pre-event data, and long temporal intervals among time-series data.
4. The robustness of our method and the observation capability of the ALOS-2 satellite enable daily flood monitoring. If ALOS-4 is also used, the satellites can provide twice-daily monitoring.
In summary, our method using ALOS-2 and TE can effectively monitor urban-area flooding and can be transferred to operational use.
For simplicity and fast computation, the present method assumed simple Bayesian inference, simple Gaussian probability density functions, and a limited number of random variables. More complex models and more random variables are expected to further improve accuracy; for example, the use of more time-series SAR amplitude/coherence data and the consideration of amplitude increases caused by the double-bounce effect were not included. The present study focused on urban floods in Japan, but future work should also consider floods in other areas, which requires adding the category F5 (flooded vegetation) that was ignored in the present study. The limited resolution of the flood simulation, particularly outside of Japan (25 km), should be improved for better flood estimation. Furthermore, this study only addressed the use of ALOS-2; the inclusion of other SAR satellites should be considered.
Hydrodynamic simulation data with high temporal frequency and SAR data with high spatial resolution can play complementary roles in flood monitoring. Although this study showed how simulation can improve SAR analysis, the extent to which SAR could improve the resolution and accuracy of the model should also be studied.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Appendix A describes the benefit of using histogram matching for coherence images in the pre-processing (Section 3.1). In case 2 of this study, for instance, the co-event coherence was highly decorrelated (Figure A1b) because of the long temporal distance, compared to the pre-event coherence (Figure A1a), resulting in large negative values in the coherence difference (Figure A1c). Histogram matching is the process of modifying pixel values so that an image's histogram is equalized to that of another image; in this study, the histogram of the co-event coherence is equalized to that of the pre-event coherence. With standard histogram matching, however, the low-coherence areas caused by floods were forcibly raised to fit the pre-event histogram, resulting in large positive values in the coherence difference (Figure A1d).
We propose masked histogram matching, which builds the matching table only from pixels where FLDFRCmax is very small (<0.05) to avoid erroneous matching in flood areas. The matched co-event coherence was obtained by applying the matching table to the entire image. The differential coherence image obtained from the matched co-event coherence adequately showed the decrease in coherence in flooded objects (Figure A1e). Figure A1. Process of histogram matching used for coherence images in the pre-processing: (a) pre-event coherence, (b) original co-event coherence, (c) coherence difference (co-event − pre-event) using the original co-event coherence (b), (d) using standard histogram matching, (e) using the proposed masked histogram matching.
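The masked histogram matching described above can be sketched as follows, assuming NumPy; the quantile-based construction of the matching table is an implementation assumption, not a detail taken from the paper:

```python
import numpy as np

def masked_histogram_match(co_coh, pre_coh, fldfrc_max,
                           mask_thresh=0.05, n_quantiles=256):
    """Match the co-event coherence histogram to the pre-event one.

    The matching table is built ONLY from pixels where the simulated
    flood fraction is very small (FLDFRCmax < mask_thresh), so that
    genuinely decorrelated flood pixels do not distort the mapping,
    then applied to the entire co-event image.
    """
    mask = fldfrc_max < mask_thresh
    q = np.linspace(0.0, 1.0, n_quantiles)
    src_q = np.quantile(co_coh[mask], q)   # co-event quantiles (masked pixels)
    dst_q = np.quantile(pre_coh[mask], q)  # pre-event quantiles (masked pixels)
    # Piecewise-linear matching table applied to the whole image.
    return np.interp(co_coh, src_q, dst_q)
```

Because flood pixels are excluded when the table is built, their low coherence survives the matching as a genuine decrease rather than being forced up to fit the pre-event histogram.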

Appendix B
Appendix B describes how to calculate the flood prior probability f from FLDFRCmax in the main processing (Section 3.2). f takes a value from 0 to 1; f = 0 forces the result to be non-flooded regardless of the SAR data; f = 1 forces the result to be flooded; f = 0.5 is neutral (i.e., equivalent to estimating flood areas only from SAR data without using FLDFRC). FLDFRCmax also takes a value from 0 to 1 and is the 24-hour peak value of flood risk. Since the peak value inevitably tends to overestimate the extent of flooding, it should only be used to suppress flood detection in areas where the probability of flooding is very low; that is, the 0-to-1 range of FLDFRCmax should map to a range of f from a small non-zero value to 0.5. Furthermore, flood areas smaller than the grid size (1.5 km) are averaged into lower FLDFRC values within the grid. In our experience, if FLDFRCmax > 0.3, the area is sufficiently likely to be flooded (f ~ 0.5).
For these reasons, instead of using FLDFRCmax itself as a prior probability, we introduced the logistic function shown in Equation (3). Based on the requirements described above, A, which controls the maximum value of f, is set to 0.5; 1/B (the sharpness of the rising edge of f) is 0.05; and C (the position of the rising edge) is 0.2 in Equation (3). Thus, f reaches its maximum value (0.5) around FLDFRCmax = 0.3, as shown in Figure A2.
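Assuming a standard logistic form for Equation (3) (the exact expression is not reproduced in this section, so the parameterization f = A / (1 + exp(-(x - C)/(1/B))) is an assumption consistent with the described behavior), the prior can be computed as:

```python
import numpy as np

def flood_prior(fldfrc_max, A=0.5, inv_B=0.05, C=0.2):
    """Prior flood probability f from FLDFRCmax via a logistic function.

    A caps the prior at 0.5 (neutral), inv_B (= 1/B) sets the sharpness
    of the rising edge, and C its position, as described above. The
    logistic form itself is an assumed reconstruction of Equation (3).
    """
    x = np.asarray(fldfrc_max, dtype=float)
    return A / (1.0 + np.exp(-(x - C) / inv_B))
```

With these parameters, f stays near zero for very small FLDFRCmax (suppressing detection where flooding is simulated as very unlikely) and saturates at the neutral value 0.5 around FLDFRCmax = 0.3 and above.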