Detection of Flood Extent Using Sentinel-1A/B Synthetic Aperture Radar: An Application for Hurricane Harvey, Houston, TX

Abstract: The increasing number of flood events combined with coastal urbanization has contributed to significant economic losses and damage to buildings and infrastructure. Development of higher resolution SAR flood mapping that accurately identifies flood features at all scales can be incorporated into operational flood forecasting tools, improving response and resilience to large flood events. Here, we present a comparison of several methods for characterizing flood inundation using a combination of synthetic aperture radar (SAR) remote sensing data and machine learning methods. We implement two applications with SAR GRD data, an amplitude thresholding technique applied, for the first time, to Sentinel-1A/B SAR data, and a machine learning technique, DeepLabv3+. We also apply DeepLabv3+ to a false color RGB characterization of dual polarization SAR data. Analyses at 10 m pixel spacing are performed for the major flood event associated with Hurricane Harvey and associated inundation in Houston, TX in August of 2017. We compare these results with high-resolution aerial optical images over this time period, acquired by the NOAA Remote Sensing Division. We compare the results with NDWI produced from Sentinel-2 images, also at 10 m pixel spacing, and statistical testing suggests that the amplitude thresholding technique is the most effective, although the machine learning analysis is successful at reproducing the inundation shape and extent. These results demonstrate the effectiveness of flood inundation mapping at unprecedented resolutions and its potential for use in operational emergency hazard response to large flood events.


Introduction
Flooding is one of the most frequent hydro-meteorological hazards, with annual losses totaling approximately $10 billion (USD) [1], and average losses are expected to increase to more than $1 trillion annually by 2050 [2]. In 2004, it was estimated that more than half a billion people are impacted every year worldwide. That number could double by 2050 [3]. It is anticipated that increasing populations, regional subsidence, and climate change will exacerbate both annual flooding and extreme events, particularly in large coastal cities [4]. Improved remote sensing instrumentation and analysis will be critical tools in the assessment of flood risk through improved characterization of flood inundation, providing insights into the dynamics of coastal and riverine water bodies, their incorporation into flood models, and improved flood risk mapping, impact assessments, forecasting, alerting, and emergency response systems.
Accurate information about impending and ongoing hazards is critical to aid effective preparation and subsequent response to reduce the impact of large flood events [5][6][7].
Technological advancements have enabled rapid dissemination of hazard information via mobile communication and social networking [8,9]. DisasterAWARE®, a platform operated by the Pacific Disaster Centre (PDC), provides warning and situational awareness information support through mobile apps and web-based platforms to millions of users worldwide for multiple hazards. DisasterAWARE® is developing a component that will integrate high-resolution SAR images of flood extent and inundation depth into hydrological models of flood forecasting and impact assessment.
Due to the extensive cloud cover during large precipitation events, flood mapping poses particular challenges for many types of remote sensing. For example, while optical satellite imagery such as that from Landsat or the Moderate Resolution Imaging Spectroradiometer (MODIS) has been successfully employed to derive flood inundation maps [10][11][12][13], its operational effectiveness can be limited by extended periods of precipitation and cloud cover. However, because SAR is an all-weather collection system that sees through clouds, it is extremely useful for real-time, or near real-time, flood mapping [14]. In this context, real-time flood mapping would produce images as flood waters rise, approximately daily or even hourly. Because SAR satellites that produce freely available images today operate with repeat times of 6-to-12 days, SAR near real-time flood mapping often is limited to a few days into the flood event.
In this work, we compare advanced methods for discriminating between land and water pixels at spatial resolutions of 10 m, using ground range detected (GRD) imagery from the European Space Agency (ESA) Sentinel-1A/B C-band SAR satellites. We apply an amplitude thresholding algorithm to SAR data to identify inundation associated with Hurricane Harvey (Harvey). Harvey, a slow-moving Category 4 event, struck the Houston, TX, region on 26 August 2017. We also implemented a machine learning (ML) algorithm, DeepLabv3+, to increase image quality and improve identification of water pixels. We present the results from applying DeepLabv3+ to both the original SAR GRD data and to the results of applying a false RGB classification scheme using the different SAR polarizations.

Materials and Methods
In this work, we apply two different methods to SAR GRD data with a pixel spacing of 10 m, to produce flood inundation maps at 10 m resolution. The first is a thresholding technique [37,38], while the second is based on an RGB classification scheme [31]. We also apply a machine learning technique, DeepLabv3+, to the same GRD data and to the classification outputs. All analyses are applied to SAR GRD data with 10 m pixel spacing, and the results are decimated to 10 m spacing. We use both the Normalized Difference Water Index (NDWI) estimated from ESA's Sentinel-2 optical images and the NOAA data to validate the results.

Remote Sens. 2022, 14, 2261

Data
For this study, we downloaded high-resolution ground range detected (GRD) images from ESA's Sentinel-1A/B satellites (C-band SAR, IW mode) with a 6-12 day repeat time. Images were downloaded from the National Aeronautics and Space Administration's (NASA) Alaska Satellite Facility Distributed Active Archive Center (ASF DAAC) (https://search.asf.alaska.edu/ (accessed on 9 June 2021)). Here, we downloaded GRD images at 10 m resolution for 29 August 2017. Figure 1 shows the Houston, TX, study area and the footprints of the Sentinel-1 and Sentinel-2 scenes used in this study. The northern SAR image is from Path 34, Frame 95, and the southern SAR scene is from Path 34, Frame 90.
Additionally, we preprocess the GRD product into a σ⁰ (sigma nought) product using the Sentinel-1 Toolbox Command Line Tool [39]. σ⁰ is the backscatter coefficient, the normalized measure of the radar return from a distributed target per unit area on the ground. This preprocessing consists of the following algorithms attached to the Sentinel-1 Toolbox: Apply Orbit File, Remove GRD Border Noise, Calibration, Speckle Filter, and Terrain Correction. In the Calibration step, we derive the σ⁰ band [40]. Thus, we call the final, preprocessed product the σ⁰ SAR image.
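As a rough illustration of what the Calibration step computes, the sketch below converts GRD digital numbers (DN) to linear σ⁰ using the relation documented for Sentinel-1 Level-1 products, σ⁰ = DN² / A², where A is the sigma nought look-up value from the product's calibration annotation. In practice the Sentinel-1 Toolbox performs this step, including interpolation of the look-up table across the image; the function name `calibrate_sigma0` and the sample numbers are hypothetical and are not taken from the study.

```python
import numpy as np


def calibrate_sigma0(dn, sigma_nought_lut):
    """Convert GRD digital numbers to linear sigma nought.

    dn: 2-D array of detected amplitudes (digital numbers).
    sigma_nought_lut: calibration values A, broadcastable to dn's shape
        (assumed here to be already interpolated to every pixel).
    """
    dn = np.asarray(dn, dtype=np.float64)
    lut = np.asarray(sigma_nought_lut, dtype=np.float64)
    return dn ** 2 / lut ** 2


# Hypothetical 2 x 3 patch of DN values and a per-column calibration vector.
dn_patch = np.array([[120.0, 80.0, 40.0],
                     [200.0, 60.0, 20.0]])
lut_row = np.array([400.0, 400.0, 500.0])
sigma0_patch = calibrate_sigma0(dn_patch, lut_row)
```

The output is in linear power units; conversion to decibels is a separate step.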
The National Oceanic and Atmospheric Administration (NOAA) Remote Sensing Division acquired airborne digital optical imagery of the Houston area between 27 August and 3 September 2017 to support emergency response activities after Hurricane Harvey. The images were acquired from an altitude of 2500 to 5000 feet using a Trimble Digital Sensor System (DSS). Individual images were combined into a mosaic and tiled for distribution (Figure 2). Approximate ground sample distance (GSD) for each pixel is between 35 and 50 cm (https://storms.ngs.noaa.gov/storms/harvey/download/metadata.html (accessed on 1 February 2021)). These data were used to identify flooded polygons for ground truthing of our results.
The Dartmouth Flood Observatory (DFO) provides detailed information on flood inundation for global events in recent decades, largely derived from satellite data, and includes an "Active Archive of Large Flood Events, 1985-present". Collection of these data has been carried out by G. R. Brakenridge and co-workers, first at Dartmouth College and then at the University of Colorado Boulder [41] (http://floodobservatory.colorado.edu (accessed on 22 February 2021)). For this event, DFO flood estimates are derived from NASA MODIS, ESA Sentinel-1, COSMO-SkyMed, and RADARSAT-2 satellite data. Here, we use the DFO MODIS data, at 250 m pixel spacing. Figure 3a shows the global water mask (GWM), derived from the Global Surface Water map [42], where blue is permanent water. Figure 3b shows the MODIS-derived DFO water map for flooding from Hurricane Harvey at pixel spacing of 250 m, 28 August-4 September 2017 [41], with the GWM removed to characterize temporary water only.
For our NDWI analysis of this period, we acquired two Sentinel-2 scenes. We selected the two best Sentinel-2 scenes that are the closest in time to our SAR scene (footprints shown in Figure 1). The first scene is a large swath from 30 August 2017, and the second scene is a smaller swath from 1 September 2017. Since our SAR data were acquired on 29 August 2017, this gives time differences of 1 and 3 days, respectively, between the Sentinel-2 images and the Sentinel-1 images. We obtained 37 Sentinel-2 images from the Google Earth Engine (GEE) platform [43], 31 acquired on 30 August 2017, and 6 acquired on 1 September 2017.

Methods
Methods most commonly applied to the detection of inundation include automatic histogram thresholding-based methods [37,38], multitemporal change detection-based methods [44,45], and machine learning and neural network methods [46][47][48]. Here, we implement both a thresholding technique and a false RGB classification scheme [31].

NDWI Analysis
For additional comparison with our results, we estimated the Normalized Difference Water Index (NDWI) on multispectral Sentinel-2 imagery over Houston [49]. Using Google Earth Engine [43], we extracted only the B03 and B08 bands from the Sentinel-2 MSI Level-1C products, at 10 m pixel spacing. These bands correspond to the wavelengths of green and near-infrared (NIR) light, respectively.
The majority of the Sentinel-1 area is covered by the large Sentinel-2 swath, as it encompasses a larger area. Unfortunately, a large cloud covers the northeast corner of this image. To compensate, we use the smaller Sentinel-2 swath to fill in some of the area in the northeast. Both images contain smaller clouds, but the calculation of NDWI should mask out these clouds sufficiently for accurate comparison.
The NDWI equation is given by
$$\mathrm{NDWI} = \frac{\mathrm{Green} - \mathrm{NIR}}{\mathrm{Green} + \mathrm{NIR}} \qquad (1)$$
where Green and NIR are the B03 and B08 band values, respectively. The result is an image with values in the range [−1, 1]. To achieve a ground truth binary classification of water pixels and non-water pixels, we must select a threshold value to make this distinction. According to [49], positive values are determined to be open water, although others have suggested that manually choosing a threshold leads to higher accuracy in classifying water [50]. However, we opted for the threshold of 0, as in [49], for simplicity and to avoid any potential bias incurred in a manual decision.
We then stitch the two resulting NDWI images together and crop and resample to match the resolution and bounds of the SAR data. The resulting NDWI map is shown in Figure 4, where water is shown in blue and pixel spacing is 10 m.
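The NDWI computation and the zero threshold described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the processing chain used in the study: the function name `ndwi_water_mask`, the handling of pixels where both bands are zero (left as NaN and treated as non-water), and the sample reflectance values are all assumptions made for the example.

```python
import numpy as np


def ndwi_water_mask(green, nir, threshold=0.0):
    """Return (ndwi, water) for green (B03) and NIR (B08) band arrays.

    Pixels where green + nir == 0 get NDWI = NaN and are classed as
    non-water (an assumption of this sketch).
    """
    green = np.asarray(green, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    total = green + nir
    ndwi = np.full(total.shape, np.nan)
    # Divide only where the denominator is non-zero; other pixels stay NaN.
    np.divide(green - nir, total, out=ndwi, where=total != 0)
    # Strictly positive NDWI is classed as open water; NaN compares False.
    water = ndwi > threshold
    return ndwi, water


# Hypothetical 2 x 2 reflectance patch.
green_band = np.array([[0.10, 0.04], [0.00, 0.06]])
nir_band = np.array([[0.02, 0.20], [0.00, 0.06]])
ndwi, water = ndwi_water_mask(green_band, nir_band)
```

With a threshold of exactly 0, a pixel whose green and NIR values are equal is classed as non-water, which matches the rule that only positive values denote open water.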

SAR Water Detection
Thresholding

The first method applied to the SAR images in this study of Hurricane Harvey flooding was a thresholding technique that maps inundated regions based on the low backscatter coefficient of water in the GRD SAR data. The distribution of SAR amplitudes, with the appropriate power transform, is a bimodal Gaussian distribution that can be used to separate water and non-water pixels [37]. This characteristic can be exploited to automatically identify a threshold of the pixel values to classify inundated areas.
The SAR image is subdivided into tiles, due to the wide SAR swath. The maximum normalized between-class variance (BCV) is used to identify those pixels within each tile that have a bimodal distribution [37,51]. From simulations, a distribution can be assumed bimodal for a maximum value of BCV greater than 0.65 [37,38]. Each tile is further split into an array of s × s pixels, where the value of s is varied to determine an optimal threshold for each tile. Bimodal pixels are identified, and an automatic threshold is selected from either the local minimum (LM) separating the peaks in the bimodal distribution or the mode of the distribution itself. Finally, the threshold for the entire tile is equal to the mean of all the thresholds for every subset of s × s pixels. We repeat this process for every tile to generate a binary output that identifies the pixels classified as water. Note that this method classifies permanent water bodies as well as transient flooded pixels. For flood hazard characterization, it is therefore necessary to account for permanent water, using either the classifications common to an earlier set of images, assuming sufficient temporal coverage, or the Global Surface Water map [42] (https://global-surface-water.appspot.com/download (accessed on 17 May 2021)). This method also can be applied to VV (vertical-vertical) and VH (vertical-horizontal) SAR polarizations, separately or jointly, to improve accuracy.
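The tile-based procedure above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of [37,38]: the normalized BCV is computed from a histogram in the manner of Otsu's method, and the threshold for each bimodal s × s subset is taken at the BCV maximum rather than at the local minimum or mode described in the text. The function names, the 256-bin histogram, and the synthetic decibel values are hypothetical.

```python
import numpy as np


def normalized_bcv(values, bins=256):
    """Return (max normalized between-class variance, threshold at that max)."""
    values = np.asarray(values, dtype=np.float64).ravel()
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)              # weight of the class at or below each bin
    w1 = 1.0 - w0                  # weight of the class above each bin
    m = np.cumsum(p * centers)     # cumulative first moment
    mu_t = m[-1]                   # overall mean
    total_var = np.sum(p * (centers - mu_t) ** 2)
    if total_var == 0:
        return 0.0, float(centers[0])
    valid = (w0 > 1e-12) & (w1 > 1e-12)
    bcv = np.zeros_like(w0)
    bcv[valid] = (mu_t * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    k = int(np.argmax(bcv))
    return float(bcv[k] / total_var), float(edges[k + 1])


def tile_threshold(tile, s, bcv_min=0.65):
    """Mean threshold over all s x s subsets of a tile that look bimodal."""
    rows, cols = tile.shape
    thresholds = []
    for r in range(0, rows - s + 1, s):
        for c in range(0, cols - s + 1, s):
            score, t = normalized_bcv(tile[r:r + s, c:c + s])
            if score > bcv_min:    # keep only subsets assumed bimodal
                thresholds.append(t)
    return float(np.mean(thresholds)) if thresholds else None


# Synthetic tile in dB: water near -20 dB, land near -8 dB (hypothetical values).
rng = np.random.default_rng(0)
is_water = rng.random((128, 128)) < 0.5
tile_db = np.where(is_water,
                   rng.normal(-20.0, 1.0, (128, 128)),
                   rng.normal(-8.0, 1.0, (128, 128)))
threshold_db = tile_threshold(tile_db, s=64)
water_mask = tile_db < threshold_db
```

`tile_threshold` returns `None` when no subset exceeds the BCV criterion, so a caller would need to fall back to a neighbouring tile or a global threshold in that case.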

RGB Classification
We also implemented an RGB classification scheme, providing an additional data set for input into the machine learning algorithm using the same SAR images. In this application, dual-polarization SAR images, VV and VH, are used to construct a false color RGB image [44]. We leverage the dual polarization of Sentinel-1 GRD products to construct a similar 3-channel image using the σ⁰ SAR images from the SNAP pre-processing above. We designate the channels as colors (red, green, and blue), but it is important to note that the colors do not represent the same physical features as those in an optical image. In the RGB composite here, the VH band is assigned to the red channel, the VV band is assigned to the green channel, and the ratio of VH to VV (VH/VV) is assigned to the blue channel.
The following steps are necessary to transform the data before constructing the RGB composite. First, pixel values are converted from a linear scale to a decibel scale, as in [44]:
$$\sigma^0_{\mathrm{dB}} = 10 \log_{10}\left(\sigma^0\right)$$
Here, σ⁰ denotes the σ⁰ SAR image. Then, the ratio of the VH polarization to the VV polarization is computed. To prevent numerical instability, VH/VV = 0 when VV = 0. We then perform quantile clipping to diminish the effect of outliers or extremely bright scatterers. To accomplish this, we find the 99th percentile of each band and set all values above the 99th percentile equal to the value at the 99th percentile. The 99th percentile was chosen empirically because it reduced the outliers sufficiently with minimal change in the pixel distribution. Finally, all three bands are mapped to the interval [0, 1].
Figure 5a shows the original σ⁰ SAR image, while Figure 5b is the false color RGB image derived from the image in Figure 5a, where VH is red, VV is green, and VH/VV is blue.
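The transformation steps above can be sketched as a short NumPy routine. Several details are assumptions made for this sketch rather than choices documented in the study: the ratio is computed on the decibel-scaled bands (following the order of the steps in the text), a small floor is applied before the logarithm to avoid log10(0), and the mapping to [0, 1] is a simple min-max scaling. All function names and the synthetic input are hypothetical.

```python
import numpy as np


def to_db(sigma0, floor=1e-6):
    """Linear sigma nought to decibels; `floor` (assumed) avoids log10(0)."""
    sigma0 = np.asarray(sigma0, dtype=np.float64)
    return 10.0 * np.log10(np.maximum(sigma0, floor))


def clip_and_scale(band, percentile=99.0):
    """Clip values above the given percentile, then min-max scale to [0, 1]."""
    upper = np.percentile(band, percentile)
    clipped = np.minimum(band, upper)
    lo, hi = clipped.min(), clipped.max()
    if hi == lo:                   # constant band: nothing to scale
        return np.zeros_like(clipped)
    return (clipped - lo) / (hi - lo)


def false_color_rgb(sigma0_vv, sigma0_vh):
    """Build an (H, W, 3) composite with R = VH, G = VV, B = VH/VV."""
    vv_db = to_db(sigma0_vv)
    vh_db = to_db(sigma0_vh)
    # Ratio is set to 0 wherever VV is 0, as stated in the text.
    ratio = np.zeros_like(vh_db)
    np.divide(vh_db, vv_db, out=ratio, where=vv_db != 0)
    return np.dstack([clip_and_scale(vh_db),
                      clip_and_scale(vv_db),
                      clip_and_scale(ratio)])


# Synthetic linear backscatter for a 64 x 64 patch (hypothetical values).
rng = np.random.default_rng(0)
vv = rng.gamma(2.0, 0.05, (64, 64))
vh = rng.gamma(2.0, 0.01, (64, 64))
rgb = false_color_rgb(vv, vh)
```

Clipping before scaling matters: without it, a handful of very bright scatterers would compress most of the image into a narrow range near zero.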

Machine Learning and DeepLabv3+
The goal of this phase of the project is to train a machine learning model to detect differences between pixels in our SAR scenes and, once the model has been adequately trained, attribute these differences to inundation. We first identified a training data set by identifying flooded pixels on the SAR GRD image from 29 August 2017, as shown in Figure 6 [52,53]. Here, we apply DeepLabv3+ [54], an encoder-decoder structure used in deep neural networks for semantic segmentation tasks, to the SAR GRD image shown in Figure 6.
We chose a training region (Figure 6a), manually delineated all flooded areas within it with the guidance of grids (Figure 6b,c), and obtained 535 training polygons. We converted the training polygons and the SAR GRD image into image and label tiles with a buffer size of 1000 m and an adjacent overlap of 160 pixels, as detailed in [52]. These tiles ranged in size from around 200 by 200 pixels to 480 by 480 pixels and were ready for training DeepLabv3+. While there is no fixed rule for the ratio of training to validation data, it typically ranges from 70%/30% to 90%/10%, with a larger dataset allowing for a larger ratio. Here, we have 10,000 tiles, so we chose 90% of those tiles for training data and the remainder as validation data, as in [55]. To increase the volume and diversity of training data, we used a Python package named "imgaug" (imgaug.readthedocs.io) to augment the tiles by flipping, rotating, blurring, cropping, and scaling them. We trained a DeepLabv3+ model with up to 30,000 iterations with a batch size of eight and learning rate of 0.007. We adopted Xception65 [56] as network architecture and the corresponding pre-trained model based on ImageNet [57]. We used the trained model to identify flooded pixels in the entire SAR GRD images by following steps including tiling, predicting, and mosaicking [53]. We used the same training polygons and trained different models for the false RGB data by following the same procedures described above.
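The tiling and data-split steps can be illustrated with a small sketch. It is not the tiling code of [52]: here a regular sliding window with a maximum tile size of 480 pixels and an overlap of 160 pixels is assumed, the 90%/10% split is a random permutation, and plain NumPy flips and rotations stand in for the "imgaug" augmentations. All function names and the image dimensions are hypothetical.

```python
import numpy as np


def tile_windows(height, width, tile=480, overlap=160):
    """Return (row0, row1, col0, col1) windows that cover the whole image."""
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    windows = []
    for r0 in range(0, max(height - overlap, 1), stride):
        for c0 in range(0, max(width - overlap, 1), stride):
            # Edge tiles are clipped to the image bounds, so they are smaller.
            windows.append((r0, min(r0 + tile, height),
                            c0, min(c0 + tile, width)))
    return windows


def split_train_val(n_tiles, train_fraction=0.9, seed=0):
    """Randomly split tile indices into training and validation sets."""
    order = np.random.default_rng(seed).permutation(n_tiles)
    n_train = int(round(train_fraction * n_tiles))
    return order[:n_train], order[n_train:]


def augment_pair(image, label, rng):
    """Apply the same random rotation and flip to an image and its label."""
    k = int(rng.integers(0, 4))            # number of 90-degree rotations
    image, label = np.rot90(image, k), np.rot90(label, k)
    if rng.random() < 0.5:                 # horizontal flip half the time
        image, label = np.fliplr(image), np.fliplr(label)
    return image, label


windows = tile_windows(1000, 1400)         # hypothetical image size in pixels
train_idx, val_idx = split_train_val(10_000)
```

The same windowing can be reused at prediction time, with the overlapping margins of adjacent tiles reconciled when the predictions are mosaicked.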

Results
Figure 7 shows the results of applying the thresholding technique, as detailed in Thresholding, above, to the SAR GRD image from 29 August 2017. Note that a greater number of flooded water pixels are identified here than in the NDWI image of Figure 4, although a direct comparison is difficult because the images used to construct them were acquired on different dates. Hurricane Harvey made landfall on 26 August 2017, the Sentinel-1 SAR GRD data used in the thresholding analysis were acquired on 29 August 2017, and the Sentinel-2 data for the NDWI analysis were acquired on 30 August and 1 September 2017.
Figure 8 shows the results from applying the DeepLabv3+ ML analysis to the SAR GRD data of 29 August 2017 shown in Figure 6a. We also applied DeepLabv3+ separately to the RGB data from Figure 5b; the results are shown in Figure 9. The DeepLabv3+ analysis of the SAR GRD data (Figure 8) identifies more small-scale features, while the DeepLabv3+ analysis of the false color RGB data identifies the most water pixels, particularly in the southern half of the image closest to the coast.
Figure 10 presents enlargements of the results from Figures 7-9 that approximately correspond to the region overflown by NOAA, as shown in Figure 2. For comparison, we also provide enlargements for the DFO and NDWI results for the same region. Note that we again have removed the GWM from the results of Figure 10c-e. The same pattern emerges as observed in the larger images. The higher resolution analyses (10 m pixel spacing), whether the NDWI from the Sentinel-2 optical data or the various Sentinel-1 SAR methods, all provide more detail than the DFO MODIS data at 250 m pixel spacing. Again, the thresholding analysis, which is based on images acquired closer in time to the hurricane landfall, provides more detail than the NDWI analysis, while the water pixels in the DeepLabv3+ ML analysis are denser, particularly closer to the coast.
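Removing the GWM from a classified water map, as done for the Figure 10 panels, amounts to a per-pixel mask operation once both rasters share the same grid. The sketch below assumes the water map and the permanent-water mask have already been resampled to identical shape and alignment; the function name `temporary_water` and the small example arrays are hypothetical.

```python
import numpy as np


def temporary_water(water_mask, permanent_water_mask):
    """Keep only water pixels that are not permanent water."""
    water = np.asarray(water_mask, dtype=bool)
    permanent = np.asarray(permanent_water_mask, dtype=bool)
    if water.shape != permanent.shape:
        raise ValueError("masks must be on the same grid")
    return water & ~permanent


# Hypothetical 3 x 3 classification and permanent-water mask.
classified = np.array([[1, 1, 0],
                       [1, 0, 0],
                       [1, 1, 1]], dtype=bool)
gwm = np.array([[1, 0, 0],
                [1, 0, 0],
                [0, 0, 0]], dtype=bool)
flood_only = temporary_water(classified, gwm)
```

In this example, the two pixels flagged in both inputs are dropped, leaving only the pixels that are water in the classification but not in the permanent-water mask.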
While the enlargements of Figure 10 provide additional detail on the differences in extent and coverage between these analyses, they are obtained over significantly different time intervals and spatial scales. To evaluate the effectiveness of each method at identifying actual water pixels, we compare the different results for three subregions of the NOAA optical images, as defined in Figure 2. The NOAA data were acquired between 27 August and 3 September 2017, covering this entire time period, and at a higher resolution than the NDWI or SAR analyses, with a GSD between 35 and 50 cm. Again, for comparison, we provide the original NOAA optical data, the DFO MODIS data, and the NDWI analysis from Sentinel-2, as described above. Note that we do not remove the GWM from our analysis, as the optical data contain both permanent and temporary water, although permanent water is masked out of the DFO MODIS data.   Figure 6a. We also applied the ML analysis, DeepLabv3+, separately, to the RGB data from Figure 5b. The results are shown in Figure 9. The DeepLavb3+ analysis of the SAR GRD data, Figure 8, identifies more, smaller scale features in the data, while the DeepLavb3+ analysis of the false color RGB data identifies the most water pixels, particularly in the southern half of the image closest to the coast.    Figure 2. For comparison, we also provide enlargements for the DFO and NDWI results for the same region. Note that we again have removed the GWM from the results of Figure 10c,d and e. The same pattern emerges as observed in the larger images. The higher resolution analyses (10 m pixel spacing), whether the NDWI from the Sentinel-2 optical or the various Sentinel-1 SAR methods, all provide more detail than the DFO MODIS data at 250 m pixel spacing. 
Again, the thresholding analysis provides more detail than the NDWI analysis, based on images acquired closer in time to the hurricane landfall, while the water pixels in DeepLabv3+ ML analysis are denser, particularly closer to the coast.   (Figure 11f). Note that all the results identify the flood waters, with the DFO and NDWI effectively the same, with different resolutions, while the SAR analyses, at higher resolutions, present only minor differences. Figure 12 shows a comparison between subregion 2 of Figure 2 for the original optical data (Figure 12a), the DFO MODIS data (Figure 12b), Sentinel-2 NDWI analysis (Figure 12c), the DeepLabv3+ ML analysis of the SAR GRD data, 29 August 2017 (Figure 12d), the thresholding analysis of the SAR GRD data, 29 August 2017 (Figure 12e), and the DeepLabv3+ ML analysis of the RGB classification data (Figure 12f). Here the results for the DFO and NDWI again fail to identify the bulk of the floodwaters. The SAR analyses succeed in identifying the permanent and flooded water pixels, capturing the sinuosity of the local channel, although the DeepLabv3+ analysis of the SAR GRD data (Figure 12d) identifies fewer of the flood pixels and the false color RGB analysis even fewer. The Sentinel-2 images employed in this analysis were collected on 30 August and 1 September 2017, while the SAR data were collected on 29 August 2017, and Hurricane Harvey made landfall on 26 August 2017. 
As noted above, flood inundation extent certainly varied over the time period, and it is likely that the flood waters had receded some between the time of the Sentinel-1 and Sentinel-2 acquisitions, accounting for the inability of the NDWI analysis to properly classify water pixels in Figure 12c  data (Figure 11a), the DFO MODIS data (Figure 11b), Sentinel-2 NDWI analysis ( Figure  11c), the DeepLabv3+ ML analysis of the SAR GRD data, 29 August 2017 (Figure 11d), the thresholding analysis of the SAR GRD data, 29 August 2017 (Figure 11e), and the DeepLabv3+ ML analysis of the RGB classification data (Figure 11f). Note that all the results identify the flood waters, with the DFO and NDWI effectively the same, with different resolutions, while the SAR analyses, at higher resolutions, present only minor differences.  Figure 13 shows a comparison between subregion 3 of Figure 2 for the original optical data (Figure 13a), the DFO MODIS data (Figure 13b), Sentinel-2 NDWI analysis (Figure 13c), the DeepLabv3+ ML analysis of the SAR GRD data, 29 August 2017 (Figure 13d), the thresholding analysis of the SAR GRD data, 29 August 2017 (Figure 13e), and the DeepLabv3+ ML analysis of the RGB classification data (Figure 13f). Here, flood waters are much narrower and the DFO analysis can only identify 250 m pixels at what appears to be random locations. The NDWI analysis (10 m pixel spacing) identifies permanent water pixels in the channels and adjacent ponding. The thresholding SAR analysis (Figure 13e) identifies the permanent water pixels and small ponding areas but also misidentifies large roadways as flooded pixels, probably because they are wet, with similar backscatter properties to a flat water surface (Figure 13e). The DeepLabv3+, however, captures the larger flood areas (Figure 13d), although the ML analysis of the RGB classification data shows large false positive areas associated with sinusoidal shapes (Figure 13f). 
Figure 2; (b) water pixels identified by MODIS data, 250 m pixel spacing, courtesy of the DFO [41]; (c) water pixels identified from NDWI analysis of Sentinel-2 data, 10 m pixel spacing; (d) water pixels identified by the DeepLabv3+ analysis of SAR GRD data, 29 August 2017, 10 m pixel spacing; (e) water pixels identified from thresholding analysis of the same SAR GRD data, 10 m pixel spacing; and (f) water pixels identified by the classification analysis of the SAR GRD data, 10 m pixel spacing. The GWM is not removed from the analyses of (b) through (f). Water pixels are shown in orange.

Discussion
Characterization of flood inundation poses a unique problem at the intersection of remote sensing and hazard estimation, as evidenced by this study. Because Landsat and Sentinel-2 images are only usable under cloud-free conditions, their data are not always viable. Sentinel-1 data are only available every 12 days, a schedule that does not always coincide with maximum inundation, making them difficult to incorporate into real-time hazard analysis. While large amounts of remote sensing data exist for Hurricane Harvey, the acquisition times of the different data sets do not overlap. In addition, variable spatial scales add to the difficulty of comparing the various analyses. However, the collection of high-resolution optical data by the NOAA Remote Sensing Division presents a unique opportunity to assess the qualitative advantages of the analyses shown here.
The high-resolution SAR GRD thresholding analysis, at 10 m pixel spacing, identifies water surfaces, both narrow and broader flooded areas, with good reliability. In all three comparisons with the NOAA data (Figures 11e, 12e and 13e), this method is successful in identifying not only larger flood areas, but also small, narrow ponding areas. However, it also identifies a number of false positives in Figure 13e, associated with wet road surfaces.
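The behavior described above can be illustrated with a minimal sketch of amplitude thresholding. It is not the processing chain used in this study: the threshold of -18 dB, the function names, and the toy backscatter values are all hypothetical, chosen only to show why a smooth, wet road can be labeled as water.

```python
import math


def amplitude_to_db(amplitude):
    """Convert a linear backscatter amplitude to decibels (power = amplitude squared)."""
    return 20.0 * math.log10(amplitude)


def classify_water(backscatter_db, threshold_db=-18.0):
    """Flag a pixel as water (True) when its backscatter is below the threshold.

    threshold_db is a hypothetical value used for illustration only.
    """
    return [[value < threshold_db for value in row] for row in backscatter_db]


# Toy 2 x 3 scene in dB (made-up values).
scene_db = [
    [-22.0, -19.5, -9.0],   # open water, wet road, vegetation
    [-21.0, -8.5, -10.0],   # open water, vegetation, vegetation
]

water_mask = classify_water(scene_db)
for row in water_mask:
    print(row)
```

In this toy scene the wet road pixel falls below the assumed threshold and is flagged together with the open water, which is the kind of false positive noted for Figure 13e. A single global threshold cannot separate the two surfaces when their backscatter values overlap.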
The DeepLabv3+ ML analysis of the SAR GRD images, again at 10 m pixel spacing, is also very effective at identifying inundated water surfaces in all three test cases (Figures 11d, 12d and 13d). While the thresholding analysis may do better at identifying small features, the ML analysis has fewer false positives, and better characterizes the shape and nature of the significant flood areas.
The DeepLabv3+ ML analysis of the false color RGB data, also at 10 m pixel spacing, does well in large, flooded areas (Figure 11f), but does not do as well as the other methods in areas with more sinuous or smaller flood regions (Figures 12f and 13f). This is despite the fact that the analysis is performed on three channels (VH, VV, and VH/VV), three times as much data as the ML analysis of the SAR GRD data. Additional studies should compare analyses of the three channels separately, to better understand the factors affecting their results and their ability to appropriately characterize flooding.
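To make the three-channel input concrete, the following minimal sketch builds one false color pixel from dual polarization backscatter. The channel order (R = VH, G = VV, B = VH/VV), the stretch ranges, and the function names are assumptions made for illustration; the source does not specify them here.

```python
def scale_to_byte(value, low, high):
    """Linearly rescale value from [low, high] to an integer in 0..255, clipping outliers."""
    fraction = (value - low) / (high - low)
    return round(255 * min(1.0, max(0.0, fraction)))


def false_color_pixel(vv_db, vh_db):
    """Return an (R, G, B) tuple from VV and VH backscatter given in dB.

    The channel assignment and the stretch ranges are hypothetical.
    """
    # A ratio VH/VV in linear units becomes a difference in decibels.
    ratio_db = vh_db - vv_db
    return (
        scale_to_byte(vh_db, -30.0, -5.0),     # R: cross-polarized channel
        scale_to_byte(vv_db, -25.0, 0.0),      # G: co-polarized channel
        scale_to_byte(ratio_db, -15.0, 0.0),   # B: polarization ratio
    )


# Made-up values: a dark, smooth surface and a bright, rough surface.
dark_pixel = false_color_pixel(vv_db=-25.0, vh_db=-30.0)
bright_pixel = false_color_pixel(vv_db=0.0, vh_db=-5.0)
print(dark_pixel, bright_pixel)
```

The third channel is derived from the other two, so it rearranges existing information instead of adding an independent measurement. That is one reason a separate analysis of each channel, as suggested above, could be informative.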
A visual comparison of all three high-resolution analyses, at 10 m pixel spacing, suggests that they all do a better job of characterizing flood inundation than either the DFO MODIS data (250 m pixel spacing) or the NDWI (10 m pixel spacing) for smaller scale flood features. It also suggests that the SAR data have a unique ability to characterize flood inundation, a function of both their ability to identify water surfaces and their ability to see through the cloud cover associated with large storms. However, a direct comparison is difficult because none of the images were acquired on the same dates.
To better evaluate the performance of the three classification methods (DeepLabv3+ ML analysis of the SAR GRD data, thresholding, and DeepLabv3+ ML analysis of the false color RGB data), we performed a predictive analysis using confusion matrices (Figure 14). We compared each method to a subset of the data for the Sentinel-2 NDWI analysis of Figure 4. The test region ranges from −96.25° to −95.20° E longitude and 29.25° to 30.25° N latitude, which encompasses most of the area overflown by NOAA (Figure 2) without including the Gulf of Mexico to the south. This is not a perfect comparison because the Sentinel-2 images employed in that analysis were collected on 30 August and 1 September 2017, while the SAR data were collected on 29 August 2017, and Hurricane Harvey made landfall on 26 August 2017. As noted above, flood inundation extent certainly varied over the time period, and it is likely that the flood waters had receded somewhat between the SAR and Sentinel-2 acquisitions. However, the Sentinel-2 NDWI is produced at the same 10 m pixel spacing as the SAR GRD results, providing for a direct comparison without rescaling. Table 1 shows the results of computing the precision, p [p = tp/(tp + fp)], recall, r [r = tp/(tp + fn)], and F1-score [f1 = (2·p·r)/(p + r)], where tp, fp, and fn are the numbers of true positive, false positive, and false negative water pixels, respectively. The precision of the classification model identifies how many, out of all instances that were predicted to be water, actually were water. The recall identifies how many of the actual instances of water were predicted correctly. Finally, the F1-score combines the precision and the recall into a single metric by taking their harmonic mean. The F1-score can be used to compare the performance of two or more classifiers. The higher the F1-score, the better the classifier. The results from Table 1 suggest that the thresholding technique provides a better result than the DeepLabv3+ analysis of either the SAR GRD data or the false color RGB data.
This may be because of the lower number of false positives associated with the thresholding method or, inversely, the larger number associated with the ML analysis. The higher number of false positives associated with the ML analysis is likely because the training data largely rely on the shape of coherent pixels. Additional training data over a wider variety of regions and sizes should be investigated in the future.
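The metrics defined above can be computed directly from two co-registered water masks. The following is a minimal sketch with made-up masks, not the data behind Table 1; the function names are hypothetical, and each mask is flattened to a simple sequence of booleans (True for water) with the reference mask standing in for the NDWI result.

```python
def confusion_counts(predicted, reference):
    """Count true positives, false positives, false negatives, and true negatives."""
    tp = fp = fn = tn = 0
    for pred, ref in zip(predicted, reference):
        if pred and ref:
            tp += 1
        elif pred:
            fp += 1
        elif ref:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Return (p, r, f1) with p = tp/(tp+fp), r = tp/(tp+fn), f1 = 2pr/(p+r)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up masks for eight pixels (True = water).
predicted = [True, True, True, False, False, True, False, False]
reference = [True, True, False, True, False, True, False, False]

tp, fp, fn, tn = confusion_counts(predicted, reference)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, tn, precision, recall, f1)
```

Because the reference here is itself a classification acquired on different dates, low precision in such a comparison may partly reflect real changes in flood extent and not only classifier error.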

[Figure 14: confusion matrices for the classification methods. Recovered panel title: DeepLabv3+; recovered axis label: Predicted.]

Conclusions
Disaster resilience is a widely used concept that focuses on increasing the ability of a community or a set of infrastructure to recover after disasters by responding, recovering, and adapting with minimal loss within a short period of time [58–62]. Resilience is impacted by societal, political, and cultural variables, but it also is advanced through technological innovations [63,64]. Because resilience is a multi-dimensional concept, its operationalization requires achieving four properties: robustness, resourcefulness, rapidity, and redundancy [59,65]. In particular, improved risk assessment and risk communication are critical factors in increasing resilience through better preparedness, mitigation, and response [66,67]. For example, today technological advances enable the rapid dissemination of disaster information via mobile communication and social media [8,9].
Large flood events, combined with coastal urbanization, present a unique challenge to coastal cities and megacities, contribute to significant economic losses, and damage buildings and infrastructure. Detailed and accurate characterization of impending and ongoing flood hazards is critical to aid effective preparation and subsequent response to reduce the impact of large flood events. However, no communication platform currently exists that delivers rapid flood assessment and impact analysis worldwide. DisasterAWARE®, a platform operated by the Pacific Disaster Center (PDC) that provides warning and situational awareness information through mobile apps and web-based platforms to millions of users worldwide, is developing a component that will provide flood forecasting and impact assessment [68].
DisasterAWARE® implements an integrated modelling approach that consists of (i) a Model of Models (MoM) to integrate hydrological models for flood forecasting and risk assessment, (ii) flood extent and depth modelling using SAR imagery at a granular level for high severity floods identified in (i), and (iii) infrastructure impact assessment using high-resolution optical imagery and geospatial data sets. Specifically, the MoM generates flood extent output at the watershed level at regular time intervals (currently 6 h) during a flood event using outputs from established hydrologic models, e.g., GloFAS (Global Flood Awareness System) and GFMS (Global Flood Monitoring System) [68,69]. The MoM output then identifies SAR imagery for high flood severity locations so that flood extent can be generated at high resolution, such as the 10 m pixel spacing presented here, along with optical imagery and geospatial datasets to assess impacts. Integration of the SAR flood mapping data from the research presented here will allow for the generation of alerts about imminent flood hazard, flooding locations and severity, and flood impacts to infrastructure. For Sentinel-1A data, these will be limited by the repeat time of the acquisitions, currently at 12 days with the malfunction of Sentinel-1B. However, integration of additional data sets, such as C-band SAR from the Radarsat Constellation Mission (RCM) or the upcoming L-band NISAR satellite, could lower the repeat times to 4 to 6 days. The addition of commercial data sets from small satellite constellations such as ICEYE could provide updates with repeat times of less than one day.
Here, we present a comparison of several methods for identifying flood inundation using a combination of SAR remote sensing data and ML methods that can be incorporated into operational flood forecasting systems such as DisasterAWARE®, and we evaluate their effectiveness relative to an NDWI analysis of Sentinel-2 data. These methods employ SAR data to characterize flooding at unprecedented resolutions of 10 m pixel spacing, for Hurricane Harvey, which struck Houston, TX, on 26 August 2017. We present two applications applied, for the first time, to Sentinel-1 GRD data: an amplitude thresholding technique and a machine learning technique, DeepLabv3+. We also apply DeepLabv3+ to a false color RGB characterization of dual polarization SAR data. We compare these 10 m pixel spacing results with high-resolution aerial optical images over this time period, acquired by the NOAA Remote Sensing Division, DFO MODIS data [15], and an NDWI estimation using Sentinel-2 images, also at 10 m pixel spacing.
Although the thresholding method is most effective at identifying small scale flood features in terms of precision, recall, and F1-score, results show more false positives, associated with flat, wet surfaces such as roadways (Figure 13e). The DeepLabv3+ ML analysis also is very effective, although it does produce fewer true positives, as well as more false negatives (Figure 14). Future studies should investigate potential improvements using expanded training data sets. Both methods (applied to the SAR GRD data) are more effective at identifying both large- and small-scale flood inundation over this time period than the DFO data. Although this is difficult to quantify, given that the acquisition dates are different for all data sources, visual comparison with NOAA optical imagery suggests that both methods can successfully identify small and large ponding areas, as well as permanent water bodies (Figure 13).
Future work should investigate improvements in these SAR methods, including longer time series from historical archives, such as Radarsat-1/2, ALOS, and ERS/ENVISAT; incorporation of DEM data (height and slope) into both the thresholding and ML analyses; quantitative comparison; and improved quantity and quality of training data for the ML method. Detailed studies of the false color RGB data may provide insights into their ability to characterize water and other land cover types. In addition, while the Sentinel-1/2 constellation and the recent Landsat-8 and upcoming Landsat-9 missions have significantly improved the temporal and spatial mapping of large flood extents, additional satellite data can be used, both pre- and post-flood, to improve that temporal resolution in the future. Accuracy and sensitivity testing of other SAR frequency bands will provide important information on the potential for incorporating data from the upcoming NISAR (NASA-ISRO SAR) mission, which will have both L- and S-band SAR sensors, into operational flood forecasting.