Multi-Index Image Differencing Method (MINDED) for Flood Extent Estimations

Satellite remote sensing data are often used to extract water surfaces related to extreme events like floods. This study presents the Multi INDEx Differencing (MINDED) method, an innovative procedure to estimate flood extents, aiming at improving the robustness of single water-related indices and threshold-based approaches. MINDED consists of a change detection approach integrating specific sensitivities of several indices. Moreover, the method also allows to quantify the uncertainty of the Overall flood map, based on both the agreement level of the stack of classifications and the weight of every index obtained from the literature. Assuming the lack of ground truths to be the most common condition in flood mapping, MINDED also integrates a procedure to reduce the subjectivity of thresholds extraction focused on the analysis of water-related indices frequency distribution. The results of the MINDED application to a case study using Landsat images are compared with an alternative change detection method using Sentinel-1A data, and demonstrate consistency with local fluvial flood records.


Introduction
Floods are amongst the most important weather-driven hazards, being capable of inducing considerable damage, including economic losses, and threatening of human lives [1]. The effects of floods are dependent on several factors, including flow velocity and depth. Even standing water can produce damage, depending on the persistence time and land cover affected (e.g., damages to crops). Floods may result from heavy or persistent rainfall, flooding by water bodies, water tables rising, snowmelt, or they may originate from artificial sources [2]. Mapping flood extent is, thus, important for several scientific (e.g., better understating of hydrological and, more generally, earth system science processes) and operational (e.g., emergency management; risk and damage assessment, insurance claims) applications.
Flood hazard modelling chains usually start with hydrologic modelling to obtain hydrographs for assigned return periods (e.g., the 100-year flow), which are consequently incorporated into hydraulic flow propagation models to estimate water surface elevations and corresponding inundation areas [3][4][5][6]. Table 1. Advantages and disadvantages of bi-temporal change detection methods.

Method
Advantages Disadvantages

Post classification comparisons
Wide application [35]. Implemented by processing either single or multiple bands (including spectral relationships) [35].
Not ideal for analyzing the process of flooding and post-flooding events, due to the spatially continuous variation of soil wetness [39].

Multi-temporal spectral unmixing
Makes it possible to determine the water proportion changes in every pixel (flood detection) [35].
More complex and less straightforward for mapping purposes (compared to other hard classification techniques). Difficulty in defining classes of change.
Requires a priori information about the study area and end-members [38].

Data transformation
Transformations are capable of sorting modifications of state according to different orders (including those resulting from water-related conditions) [40].
Finding the meaning for each order of change is often a complex task and requires a deep knowledge of the study area (scene-dependent) [40].

Change vector
Multivariate technique, capable of simultaneously incorporating multiple layers. Great potential to recognize and analyze the amount and type of changes [35].
Selection of different thresholds is in practice very complex [35].

Image ratioing
Pixel-by-pixel based analysis of either two-date images or image transformations [35,41].
The non-normal distribution of results has been criticized for being statistically invalid, preventing the computation of thresholds based on standard deviation functions [35,41].

Univariate image differencing
Wide application [35]. Despite its simplicity, it can achieve better performances compared to other methods [41].
Unable to directly specify the type of change [41].
Univariate image differencing is the most widely applied bi-temporal approach. It consists of subtracting spectral or transformed data (e.g., using water-related indices), producing positive and negative values, depending on the type of change [35]. In theoretical and ideal conditions, no-change areas should result in zero values; however, in real conditions, this is not the case as an effect of spatial and spectral co-registration errors, as well as natural time-dependent changes. One or more thresholds may be required to define two or more classes of change (density-slicing), which may provide hints about amounts and types of change [41].
Both optical and SAR data can be used to detect floods using bi-temporal change detection methods, which consist of comparing images acquired before and after a certain event. In principle, the images pre-dating the event should be the most recent from the available dataset, while the post-event the oldest. Regarding the pre-event images, a wide period between acquisition and the event will also imply natural changes resulting from phenological cycles as well as seasonal water body variations [12]. As for the post-event images, the corresponding time span should be as short as possible in order to record the most accurate picture of the flooded area. However, it can be difficult to obtain optical satellite imagery approximately coeval to flood occurrence, since hydro-meteorological Remote Sens. 2019, 11, 1305 4 of 29 events are typically associated with clouds and long periods of adverse weather conditions. Concerning SAR data instead, when applying change detection methods, particular attention must be paid for taking into account all the differences between the image changes occurred to the image's targets (e.g., a comparison between images acquired by the same sensor but with different orbit/geometry of acquisitions).

Optical Water-Related Indices
Many optical Water-related Indices (WrI) have been defined and applied in the literature to detect surface water from remote sensing imagery. This section introduces those most commonly found in flood-related literature.
The Normalized Difference Vegetation Index (NDVI) [42], is obtained with spectral bands corresponding to Red and NIR regions. Without being specific for water detection purposes, it has a theoretical threshold of zero, being the negative values generally associated with water occurrence.
The Normalized Difference Water Index (NDWI) [7] is one of the first specific indices to detect water, considering Green and NIR bands in a similar structure to the NDVI. The NDWI includes the same theoretical threshold of zero, but with positive values being interpreted as water. The Modified Normalized Difference Water Index (MNDWI) was proposed by [43], as an alternative to the NDWI, using SWIR (e.g., band 5 of Landsat TM) instead of the NIR band. The MNDWI has become one of the most popular water detection indices [44].
The Automated Water Extraction Index (AWEI) [45] was conceived for sensors covering the visible to SWIR electromagnetic spectrum range and consists of multiple subtractions and additions of bands through several empirically based coefficients, developed to maximize the separability between water and non-water pixels. It was developed to improve classification accuracy in areas including shadow and dark surfaces, where most other indices tend to fail. It includes two versions: the 'AWEI no shadow' (AWEI_NS) for situations where shadows are not major problems and the 'AWEI shadow' (AWEI_S) intended to effectively eliminate shadows or other dark surfaces [44]. Both versions were developed for a theoretical imposed threshold of zero, with water pixels corresponding to positive values. Nevertheless, as with other indices, in practice, the optimal threshold is usually scene-dependent.
The Tasseled Cap method was first developed by [46] for Landsat MSS sensor and then applied to Landsat TM [47]. It consists of a multispectral sensor-based transformation which makes it possible to obtain new outputs such as Brightness, Greenness, and Wetness. Tasseled Cap Wetness (TCW) [48] can be used to identify water, using a theoretical value of zero to separate water (positive values) from non-water pixels [49]. TCW is usually obtained with sensor-specific coefficients applied to either Top of Atmosphere (TOA) (e.g., [50,51]) or ground reflectance [52]. Table 2 summarizes the main advantages and disadvantages of each of the previous WrI as resulting from the literature. Despite the overall advantages, there are additional challenges for flood detection purposes. Suitability and accuracies of each index are affected by local conditions, which, depending on the study area heterogeneity, may vary within short distances (e.g., land cover type, topography, atmospheric conditions). Good performance in mountain shadow areas [54] Good overall accuracy [55,56] Good performance in large urban areas with cloud-free conditions [49] Weaker abilities to extract water bodies [54] Sensitive to built-up land signals, which often results in an overestimation of water bodies [55] MNDWI [43] The capacity of removing interferences of built-up feature signals [55,56] Better suited for open water mapping [55] High variability of optimal thresholds [55] Unable to remove certain shadow noises effectively [55] NDVI = (NIR − RED)/(NIR + RED) Remote Sens. 2019, 11, 1305 5 of 29 Table 2. Cont.

WrI Name Advantages Disadvantages Equation
AWEI [45] Good performance in large urban areas [49] Difficulty in finding optimal thresholds [38] Unable of totally removing mountain shadows [38] May misclassify high albedo surfaces [44,45,55] TCW [48] Good ability to extract water bodies [54] Fails to suppress mountain shadow [54] TCW = aB1 + bB2 + . . . . + iBj * * Sensor dependent (examples given in Table 3). Regardless of the method chosen, mixed pixels are a common problem affecting remote sensing applications, particularly those resulting from moderate spatial resolution (e.g., 30 m) satellite images [60]. As referred by [49], another limitation of water classification methods concerning the handling of the variability of reflectance spectra which changes according to water properties (e.g., concentrations of phytoplankton and sediments, depth, or substratum type). Another difficulty relates to the discrimination between shadows cast and clouds, and the steep topography, quarries and tall buildings, which may produce signals similar to those of water. In the case of flood analysis, the presence of clouds and cloud shadows are very common, and one of the first steps should be masking such features (e.g., [61,62]).
For reproducibility reasons, when analyzing different epochs and locations, the simplest approach would be to use single global thresholds for each WrI. Finding an optimal scene-dependent threshold may provide better accuracy, even though this can be a difficult task due to high temporal and spatial variability [39].

Flood-Related Synthetic Aperture Radar Applications
According to [33], the use of a simple global threshold is the most frequent approach in flood delineation from SAR imagery. Although this approach can be easily implemented, and thus, is particularly suitable for operational applications related to flood risk emergency management, the criteria used for defining the threshold value affect drastically flood mapping results [30]. Such approaches for image thresholding can be based on the visual inspection of the grey-scale frequency histogram [63][64][65][66], or automatic algorithms (e.g., [67,68]). Given the recently increased availability of SAR images, in the last years, several thresholding methods for flood detection have been developed [26,27,30,[69][70][71]. These methods exploit the characteristics of SAR systems to produce image histograms which are characterized by a bimodal distribution (representative of the classes flooded/non-flooded) that can be used to define an optimal threshold for detecting inundated areas [26,29,30,69,70]. Such rational can be applied both to a multitemporal dataset (change detection methods) or to a single-event image. In the first case, the flood mapping algorithm can be represented Remote Sens. 2019, 11, 1305 6 of 29 as a binary classification problem in which the "change" class (i.e., the flooded area) must be resolved from the "no change" class (which represents the "background" dominant class of the image) [30]. In the second case, instead, the water/flooded pixels correspond to the darker areas of the image that must be discriminated from the other (surrounding) land cover classes which are characterized by higher backscattering values [72]. the effect of the spatial-spectral misregistration between t1 and t2 imagery, changes resulting from phenomena other than flooding, and finally the effective sensitivity of WrI to detect surface water. The main consequence of these conditions is that the real distribution of Nc is represented by a bellshaped range of ∆WrI values located around zero (Figure 2c,d). Given the above considerations and the fact that image differencing prevents discriminating among the types of change, but only change signal and intensity, we assume to classify flooded areas into the categories Low-Magnitude change (LMc) and High-Magnitude change (HMc) as a function of the ∆WrI value (Figure 2b-d). This assumption requires the definition of two thresholds, between Nc-LMc (TL) and LMc-HMc (TH), which are then used to apply density slicing [10,41] to the multitemporal imagery. These thresholds should be ideally defined with analysis of ground truth data. If such information is not available, thresholds may be obtained by only analyzing the frequency distribution of data. The latter approach is a key point of MINDED, allowing us to perform semiautomatic remote sensing procedures to extract flooded area extent from satellite imagery. We   In order to define the study area and the imagery dataset to process, as a preliminary step, we assume that a certain area could have experienced a flooding event within a certain period. If historical flood records are not available, other indirect data sources may be considered (e.g., precipitation measurements, river water levels, local news).

Proposed Method
For flooded areas, WrI variation occurs between the epochs t1 and t2 (respectively prior and after the given flood event). For a change detection method based on image differencing, no-change areas (Nc) are theoretically represented by the digital value zero. Assuming the time span (t2 − t1) is reasonably short, in principle, Nc should be the majority of the pixel image distribution, corresponding to the modal value of the frequency distribution. In contrast, digital values different to zero represent change areas, and they tend to be located toward both tails of the frequency distribution. When WrI differencing (∆WrI) is used, we expect to locate flooded areas changes in only one of the tails, either positive (e.g., ∆NDWI, ∆MNDWI, ∆TCW, ∆AWEI), or negative (e.g., ∆NDVI) ( Figure 2). The higher the distance from the modal ∆WrI value, the higher the magnitude of change, until a complete change of state from dry to flooded surface. If only such kind of flood changes occur, this results in an ideal bi-modal distribution of the function (f ) where the discrimination between Nc and flooded areas is unambiguous (thresholding interval- Figure 2a). In practice, different flooding conditions (water thickness and suspended materials, water surface roughness), as well as different initial conditions (land cover, substrate properties, surface roughness and wetness, and their spatial distribution in respect to pixel size), imply a continuous distribution of ∆WrI values beyond Nc ( Figure 2b). Moreover, when analyzing the frequency distribution, one should also take into account the effect of the spatial-spectral misregistration between t1 and t2 imagery, changes resulting from phenomena other than flooding, and finally the effective sensitivity of WrI to detect surface water. The main consequence of these conditions is that the real distribution of Nc is represented by a bell-shaped range of ∆WrI values located around zero (Figure 2c,d).
Given the above considerations and the fact that image differencing prevents discriminating among the types of change, but only change signal and intensity, we assume to classify flooded areas into the categories Low-Magnitude change (LMc) and High-Magnitude change (HMc) as a function of the ∆WrI value (Figure 2b-d). This assumption requires the definition of two thresholds, between Nc-LMc (TL) and LMc-HMc (TH), which are then used to apply density slicing [10,41] to the multitemporal imagery. These thresholds should be ideally defined with analysis of ground truth data. If such information is not available, thresholds may be obtained by only analyzing the frequency distribution of data. The latter approach is a key point of MINDED, allowing us to perform semi-automatic remote sensing procedures to extract flooded area extent from satellite imagery. We assume that these thresholds correspond to a sudden variation of ∆WrI frequency, which is a consequence of the effects that the appearance of water-related conditions may induce on sets of pixels. The variation may be more ( Figure 2c) or less ( Figure 2d) pronounced depending on the occurrence of change-related secondary modal values. In the first case, the first order derivative of the function (df ) is a useful tool to define the thresholds which are assumed to be located where the change of signs of df occurs. In other cases (Figure 2d), df continuously approaches zero without reaching it. Therefore, we chose as a threshold the ∆WrI value where the second order derivative function (d2f ) reaches a local maximum, corresponding to a sudden variation of both f and df. In this approach, we assumed that this option would ensure the reproducibility of the method while reducing its subjectivity. In practice, for a given scene differencing, the distribution of ∆WrI may follow both the conditions of Figure 2c,d around either TL or TH.
Without ground truth information, the interpretation of HMc may be nonetheless considered quite obvious, because it should represent a complete change of state from dry land to water surface. However, the same does not apply to LMc, which may be expected to represent pixels changing from dry to wet/saturated surfaces, as well as wet/saturated to water surfaces. Moreover, depending on the duration of the time span (t2 − t1), LMc may also correspond to those flooded areas that underwent drying/drainage processes after the flooding event. This is particularly interesting for those situations whenever it is impossible to obtain cloud-free satellite images immediately after a flood event. Finally, Nc areas are expected to include permanent water bodies, continuously wet/saturated surfaces, as well as any other kind of permanently dry surfaces. function (d2f) reaches a local maximum, corresponding to a sudden variation of both f and df. In this approach, we assumed that this option would ensure the reproducibility of the method while reducing its subjectivity. In practice, for a given scene differencing, the distribution of ∆WrI may follow both the conditions of Figure 2c,d around either TL or TH. Ideal distribution: no misregistration effects, gradual transition from Nc to change; (c) Real distribution 1: misregistration effects, gradual well separated transition from Nc to change, and corresponding first (df) and second order (d2f) derivatives; (d) Real distribution 2: misregistration effects, gradual change from Nc to change, and corresponding first and second order derivatives.

Figure 2.
Density slicing classification (Nc-No change; LMc-Low-Magnitude change; HMc-High-Magnitude change) and threshold selections (TL-threshold between Nc-LMc; TH-threshold between LMc-HMc) for different types of frequency (f ) distribution histograms of ∆WrI: (a) Ideal perfect distribution: no misregistration effects, perfect separation between Nc and change areas; (b) Ideal distribution: no misregistration effects, gradual transition from Nc to change; (c) Real distribution 1: misregistration effects, gradual well separated transition from Nc to change, and corresponding first (df ) and second order (d2f ) derivatives; (d) Real distribution 2: misregistration effects, gradual change from Nc to change, and corresponding first and second order derivatives.
WrI are determined for t1 and t2 using the corresponding spectral bands ( Table 2). As for TCW the equation and parameters of transformation are sensor-dependent (examples are given in Table 3).
After calculating ∆WrI, we perform density slicing based on TL and TH, to obtain a stack of six coeval thematic change maps. These maps represent changes caused by flooding according to each WrI specific sensitivity. The stack is then analyzed to extract two outputs, the Overall flood map and the Uncertainty map. The Overall flood map integrates the information from each WrI and is obtained by picking the absolute majority among the frequency of the change classes Nc, LMc, HMc (at least four consistent classifications over six). If an absolute majority does not occur, pixels are classified as 'Mixed'.
The Uncertainty map is obtained by Equation (1) which integrates pixel statistics of change classes within the stack, together with WrI specific average a priori accuracies (Acc WrI ) obtained from literature values (Table 4), where i is a given pixel and ρ i its uncertainty of change classification. This implies that the parameter Acc WrI is a weighting factor for change classification statistics (count), while Acc WrI is the limiting value for the definition of the uncertainty scale range. The expected range of ρ i is 0 -≈ 3.8, where ρ i = 0 represents the lowest uncertainty corresponding to six each-other coherent change classes. This approach also handles pixel statistical ties, by choosing the combination of indices which provides the highest cumulative Acc WrI .

Pre-Processing Considerations
As with any change-detection methods, geometrically corrected surface reflectance data should be used as inputs to implement the MINDED method correctly. When these products are not available, additional preprocessing steps are required, i.e., geometric co-registration and radiometric calibrations [10,73].
In principle, the implementation of MINDED should be applied to cloud-free areas, while minimizing the time span (t2 − t1), in order to reduce the influence of other changes than those caused by flooding (e.g., phenological cycles, land cover modifications) [35]. This is made by choosing the newest image pre-dating and the oldest post-dating the flood event. In practice, a trade-off between time span duration and cloud coverage has to be applied.
Regardless, considering the characteristics of optical satellite imagery and the sensitivity of water detection indices, before implementing MINDED, preprocessing is necessary to extract a global mask including clouds, cloud shadows, and topographic shadows. Both cloud and topographic shadows are dark areas with very low reflectance values in the same bands, where the water spectral signature is also characterized by low values [73][74][75]. For this reason, most WrI are susceptible to collect such features as water, and it is necessary to mask such pixels.
Topographic shadows can be deduced from sun elevation and azimuth parameters that are usually included in satellite products metadata. In order to perform topographic shadows masking, these parameters are integrated with a Digital Elevation Model (DEM), e.g., the Shuttle Radar Topography Mission (SRTM) Version 3.0 1-arc-sec DEM [81], the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) GDEM [82], the ALOS World 3D (AW3D30) Global DEM [83], or the Multi-Error-Removed Improved-Terrain DEM (MERIT) [84].

Study Area: The Aveiro Region (Portugal)
The Aveiro Region was chosen due to its context of recurrent floods and non-complex topography, presenting a favorable context for remote sensing-based flood detection. It is located on the Northwestern part of continental Portugal (Figure 3

Results
Herein, we illustrate the MINDED method for flood extent estimation through an application to the Aveiro Region study area. In order to preserve the use of free software only, all image processing tasks were performed with GRASS GIS (v7.2.2), while map compositions were made with QGIS (v2.18.15).

Selection of Events
Considering the lack of a long-term systematic record of floods in the Aveiro region, the selection of events was based on an indirect method. Daily precipitation data were extracted from a monitoring database portal [90], corresponding to 23 meteorological stations located inside the Vouga river catchment area (between 30 December 1979 and 12 September 2017). Maximum daily precipitation records were ranked from largest to smallest, for every meteorological station. Return periods were calculated from Equation (2), derived from a Weibull distribution (Equation (3)), where, Tx is the period expressed in number of years, Px is the probability of exceedance, n is the number of The study area includes the 11 municipalities of the Baixo Vouga Subregion (NUTS III), corresponding to the lower section of the Vouga River Watershed. This river is the main fluvial course draining into the Aveiro Lagoon, a shallow coastal lagoon, with a length of 45 km (N-S direction) and a maximum width of 10 km (W-E direction) [85]. The lagoon and adjacent areas are recognized as a complex system, integrating urban areas and a wide range of natural and semi-natural habitats, classified as a Special Protection Area (Natura 2000 Network) and Site of Community Importance [86].
Moreover, the study area includes another shallow natural lagoon, the Pateira de Fermentelos. It is classified as a Ramsar Site and is a part of the lowland area of the Cértima River, immediately located on the confluence with the Águeda River (an affluent of the Volga River). Due to its relatively compact and elongated catchment area, the Águeda River induces inflow to the Pateira de Fermentelos, producing floods during heavy rain events [87]. At about 2.5 km to the east, is located Águeda, one of the largest cities of the study area, which, despite successive efforts of authorities, it is known for being regularly affected by urban floods.
Within the study area, the following municipalities are recurrently affected by floods due to intense precipitations: Aveiro, Estarreja, Oliveira do Bairro, Sever do Vouga, Vagos, and Mira. Besides, considering the coastal influence of this territory, there are also reports of recurring floods due to spring tides in Estarreja, Ílhavo, and Murtosa [88,89].

Results
Herein, we illustrate the MINDED method for flood extent estimation through an application to the Aveiro Region study area. In order to preserve the use of free software only, all image processing tasks were performed with GRASS GIS (v7.2.2), while map compositions were made with QGIS (v2.18.15).

Selection of Events
Considering the lack of a long-term systematic record of floods in the Aveiro region, the selection of events was based on an indirect method. Daily precipitation data were extracted from a monitoring database portal [90], corresponding to 23 meteorological stations located inside the Vouga river catchment area (between 30 December 1979 and 12 September 2017). Maximum daily precipitation records were ranked from largest to smallest, for every meteorological station. Return periods were calculated from Equation (2), derived from a Weibull distribution (Equation (3)), where, Tx is the period expressed in number of years, P x is the probability of exceedance, n is the number of observations and r is the rank number [91].
Table A1 resumes the 50 largest daily precipitation events, providing maximum daily precipitations and maximum return periods (calculated independently for each of the 23 meteorological stations). The highest record corresponds to a return period of 32.7 years and maximum daily precipitation of 180.0 mm. We used this list in order to search for usable couples (t1 and t2) of imagery to be processed (see Section 4.2).

Satellite Data Selection
In order to maintain the simplicity of this methodological application, the selection of satellite data was restricted to freely available orthorectified products of multispectral surface reflectance. Landsat Level-2 products (from Landsat 4 to 8) are accessible 'On-demand' (i.e., usually requiring few hours between the order and delivery of the products) in the NASA's EarthExplorer portal (https://earthexplorer.usgs.gov/). Landsat 5 TM, 7 ETM+, and 8 OLI have the same spatial resolution (30 m in the visible, NIR and SWIR bands) and temporal resolution (16 days, that decrease if more satellites are accounted for a multitemporal analysis). Sentinel 2 Level-2A products, instead, can be downloaded from the Copernicus Open Access Hub website (https://scihub.copernicus.eu/). Sentinel 2 are acquired by the same sensor carried by a constellation of two satellites (Sentinel 2A and 2B). Sentinel 2 data have a spatial resolution that varies from 10 m to 60 m in the visible, NIR, SWIR range (10 m to 20 m for the exploited bands) and a constellation temporal resolution of 5 days at equator [92].
From the list of the 50 largest precipitation events (Table A1), we searched images matching the methodological requirements of this study, i.e., cloud-free conditions and a reasonably short time span between t1 and t2 (we assumed approximately 1 month prior and 15 days after each event). This process resulted in four suitable events to be analyzed (Table 5). Amongst these, only two events match reports of flood occurrences. This is likely the consequence of flooding not being dependent from daily precipitation only, since other factors, such as precipitation intensity, dam breach/discharges, etc. may be relevant. Besides, flood records may be incomplete, since they are more focused on recent years. Also, we observe data discontinuities from several meteorological stations, indicating that some of them did not acquire date continuously. Finally, as with most Portuguese rivers, those within the study area are densely dammed, contributing to regulate water levels during intense precipitation periods (especially during drier months).   Table A1.

Pre-Processing
In order to perform masking of topographic shadows using only free available data, we decided to use the ALOS World 3D-30 m (AW3D30) Global DEM provided by the Japan Aerospace Exploitation Agency (JAXA at http://www.eorc.jaxa.jp/ALOS/en/aw3d30/data/index.htm), due to its horizontal resolution and improved vertical accuracy [93,94]. Shaded relief maps, where produced Those selected events were analyzed with MINDED, using available Landsat 5, 7 and 8 data. Landsat 7 images acquired after 31 May 2003 have data gaps caused by a failure in the Scan Line Corrector (SLC). Nevertheless, we decided to include such images in our analysis by excluding the corresponding faulty pixels in both epochs (t1 and t2). The list of all available images is reported in Table A1.

Pre-Processing
In order to perform masking of topographic shadows using only free available data, we decided to use the ALOS World 3D-30 m (AW3D30) Global DEM provided by the Japan Aerospace Exploitation Agency (JAXA at http://www.eorc.jaxa.jp/ALOS/en/aw3d30/data/index.htm), due to its horizontal resolution and improved vertical accuracy [93,94]. Shaded relief maps, where produced for t1 and t2 (using the r.relief command of GRASSGIS).
Regarding clouds and cloud shadows, we decided to use the quality assessment bands of Landsat Level-2 products, which provides probabilities of occurrence of such features. Using the information of 'pixel_qa' band, we selected those matching the following attributes in order to extract cloud-related masks: 'Cloud Shadows', 'Snow/Ice', 'Cloud', 'Low cloud confidence', 'Medium cloud confidence' and 'High cloud confidence' [95]. Note that we assumed a conservative approach by including the lowest levels of confidence.
Topographic shadows and cloud-related masks were merged into a global mask (using GRASSGIS command r.mapcalc).

Index Calculation and Differencing
As introduced in Section 2, we first determined the WrI and ∆WrI imagery. Then, we obtained the ∆WrI frequency distribution histograms (f ) and corresponding first and second order derivatives (respectively df and d2f ). Given the difference of scales between normalized and non-normalized ∆WrI, we used different binning ranges to obtain frequency functions, respectively based on 255 and 5000 classes. Moreover, smoothing of f (by means of moving averages) was useful to aid the interpretation of df and d2f in order to extract thresholds.
The threshold extraction procedure is exemplified in Figure 5 for ∆NDWI, ∆NDVI and ∆TCW using the same flooding event of 2003. The different shape of the ∆TCW distribution (as well as the other non-normalized ∆WrI), compared to ∆NDWI and ∆NDVI (Figure 5a), is due to the compression of scale caused by the high magnitude of the extreme values of the parameter. As the f distribution does not show secondary modal values beyond the Nc mode, these ∆WrI frequency functions are interpreted as being representative of the example in Figure 2d. For these reasons, instead of interpreting df (Figure 5b), we analyzed d2f to obtain TL and TH (Figure 5c). Since NDVI detects water as negative values, threshold selection analysis was focused on the negative side of the ∆NDVI frequency distribution histogram, unlike the other indices. Following the theoretical premises of Figure 2d, the selection of TL corresponds to the absolute modal value of d2f, which identification was straightforward for the entire dataset under analysis. Instead, the selection of TH required further statistical processing due to a less favorable signal to noise ratio conditions caused by the lower frequency of pixels having ∆WrI larger than the TH. For this reason, we performed a series of smoothing procedures by using unweighted moving averages, until finding the setup which provided the best delineation of the following local maximum (TH) of d2f.
Finally, the application of TL and TH allowed us to obtain the thematic spatial representations shown in Figure 5c. Table 6 includes the threshold list for all ∆WrI and for every selected event.
further statistical processing due to a less favorable signal to noise ratio conditions caused by the lower frequency of pixels having ∆WrI larger than the TH. For this reason, we performed a series of smoothing procedures by using unweighted moving averages, until finding the setup which provided the best delineation of the following local maximum (TH) of d2f.
Finally, the application of TL and TH allowed us to obtain the thematic spatial representations shown in Figure 5c. Table 6 includes the threshold list for all ∆WrI and for every selected event.

Flood Extent Estimation Results
After the selection of the thresholds, we performed density slicing [10,41] for each ∆WrI, resulting in a stack of six thematic maps for each of the four events. The overall flood extent map is produced using mathematical operators that assign each pixel to pick the class that corresponds to the majority within the stack of coeval thematic classifications (i.e., at least four consistent classifications). Whenever the condition of a majority is not met, pixels are classified as 'Mixed'.
Taking the largest precipitation event (19 January 2003) as an example, we compared each ∆WrI coeval thematic map ( Figure 6). In every map, a noticeable concentration of LMc and HMc areas along both the Pateira de Fermentelos lagoon and the major riverbeds of the region of interest (ROI) may be observed. In particular, HMc areas are concentrated to the Northwestern part of the ROI (towards both sides of the Vouga River section). Another array of HMc areas is located towards the Northeastern side of the Pateira de Fermentelos lagoon, near the confluence with the Águeda River. As for the half-southern part of the ROI (along the Águeda river, the Cértima river and its confluence with the Pateira de Fermentelos), HMc change areas are generally detected except for ∆NDVI, which detect them as LMc. In general, with respect to maps obtained by normalized indices, those from non-normalized indices are characterized by larger extents of randomly distributed change pixels located away from fluvial areas. Table 7 summarizes the information about the extent of the coeval thematic classes obtained within the ROI from each ∆WrI, for the 2003 event. Most LMc areas are detected by ∆AWEI_NS and ∆TCW (respectively 24.2%, and 21.8%), while most HMc areas are detected by ∆AWEI_S, ∆AWEI_S and ∆TCW (respectively 9.3%, 7.8%, and 3.8%). Amongst all indices, normalized indices are overall less sensitive to detect HMc areas, with ∆NDVI being the least one (1.2% of the ROI). The Overall flood map ( Figure 7a) and the Uncertainty map ( Figure 7b) are obtained by a procedure based on stacking the six coeval ∆WrI thematic maps (Section 2.1). We observe that the frequency of 'Mixed' pixels (9.1% of the ROI) is smaller than the sum of the remaining classes (Nc + LMc + HMc = 87.1%). Moreover, 'Mixed' pixels are also clearly less widespread than the randomly distributed change pixels located away from fluvial areas (Figure 7a vs. Figure 6). Larger regions of 'Mixed' pixels mostly make a rim between Nc-LMc and LMc-HMc. The quantitative uncertainty map makes it possible to perform a spatial analysis of classification consistencies. Low uncertainty values are predominant (0 to 2 − 81.9% of the ROI), whereas the highest uncertainties (3 to ≈ 4 − 0.9% of the ROI and 0.4% of the whole study area) are concentrated around areas of change, with special incidence into the major riverbeds. The Overall flood map (Figure 7a) and the Uncertainty map (Figure 7b) are obtained by a procedure based on stacking the six coeval ΔWrI thematic maps (Section 2.1). We observe that the frequency of 'Mixed' pixels (9.1% of the ROI) is smaller than the sum of the remaining classes (Nc + LMc + HMc = 87.1%). Moreover, 'Mixed' pixels are also clearly less widespread than the randomly distributed change pixels located away from fluvial areas (Figure 7a vs. Figure 6). Larger regions of Water-related Overall change maps obtained for all four selected events have been compared (Figure 8). Quantitative results are represented in Table 8. For the event of 2004, we detected almost only Nc areas. In 2009, only few LMc (4.4%) and HMc (0.4%) areas were detected by MINDED, most of them corresponding to saltmarshes and mudflat areas located inside the Aveiro lagoon. As for the ROI, the few LMc and HMc areas are located within the Pateira de Fermentelos lagoon. Most LMc and HMc areas that appear in the maps of Figure 8 are related to the two largest precipitation events. In 2016, the lower parts of the Vouga (Albergaria-a-Velha) and Antuã rivers (Estarreja) a large extent of LMc and HMc areas is mapped. Most changes are located around the Pateira de Fermentelos lagoon, along the Cértima and Águeda rivers (the second in a smaller extent), intersecting Águeda and Oliveira do Bairro municipalities. Regarding the event of 2003, which is characterized by the largest return period (26.5 years- Table 5), we detected the largest extent of change pixels, a total of 23,660 ha (corresponding to 8.2% of the whole study area). For this largest event, change areas are consistent with the records of fluvial flood occurrences (see Section 4.2) which were detected in all the reported municipalities. Once more, several change areas within the Aveiro Lagoon correspond to intertidal saltmarshes and mudflat areas.  Water-related Overall change maps obtained for all four selected events have been compared (Figure 8). Quantitative results are represented in Table 8. For the event of 2004, we detected almost only Nc areas. In 2009, only few LMc (4.4%) and HMc (0.4%) areas were detected by MINDED, most of them corresponding to saltmarshes and mudflat areas located inside the Aveiro lagoon. As for the ROI, the few LMc and HMc areas are located within the Pateira de Fermentelos lagoon. Most LMc and HMc areas that appear in the maps of Figure 8 are related to the two largest precipitation events. In 2016, the lower parts of the Vouga (Albergaria-a-Velha) and Antuã rivers (Estarreja) a large extent lagoon, along the Cértima and Águeda rivers (the second in a smaller extent), intersecting Águeda and Oliveira do Bairro municipalities. Regarding the event of 2003, which is characterized by the largest return period (26.5 years- Table 5), we detected the largest extent of change pixels, a total of 23,660 ha (corresponding to 8.2% of the whole study area). For this largest event, change areas are consistent with the records of fluvial flood occurrences (see Section 4.2) which were detected in all the reported municipalities. Once more, several change areas within the Aveiro Lagoon correspond to intertidal saltmarshes and mudflat areas.

Comparison with SAR Data
Given SAR potentialities for mapping water bodies and the environmental characteristics of the study area (i.e., an absence of complex topography, snow and dry sand surface, relevant urban areas), Sentinel 1 images can be considered as a reliable source of data to perform comparisons with the MINDED outputs. Hence, we compared our results with flooding maps obtained by the Hierarchical Split-Based Approach (HSBA) [30]. This is a straightforward automatic change detection method based on Sentinel-1 imagery differencing.
Considering the availability of Sentinel 1A data, the HSBA algorithm was implemented to analyze the event of 2016 (13 February 2016), using three Sentinel 1A images, one for a period before the event, and two for a period after. Thus, we selected the same reference image from 6 February 2016 (7 days before the event), and two others, one from 18 February 2016 (5 days after the event) and another from 1 March 2016 (1 day after the LS7 image used as t2, 16 days after the event) as post-event images.
Each Sentinel 1A scene was obtained in IWS (interferometric wide swath) mode, as ESA GRDH products (20 × 22 m resolution, resampled to 10 × 10 pixels). All selected Sentinel-1A products have been acquired with the same relative orbit (125). The algorithm was applied using the following options: Lee Sigma speckle filter (3 × 3), SRTM 1 arc sec geocoding, 5 minimum levels for HSBA, 2.4 Ashman coefficient, a minimum number of 1.000 pixels for the image tiles to process, and by considering to eliminate objects below 10 pixels.
Wind conditions during the acquisition affect water surface roughness, and therefore, the accuracy of both water surface and flood extent mapping. According to meteorological data of 'Estrada' and 'Tentugal' stations [90], the wind speed at the acquisition times of all the considered images was less than 2.0 m/s, which has no effects in inland waters [96].
The comparison between MINDED and HSBA results for the event of 2016 is illustrated in Figure 9.
From the analysis of Figure 9b,c, it is possible to observe that the extent of flooded areas detected by the HSBA algorithm reduced over time (i.e., from 18 February 2016 to 1 March 2016), with particular incidence to North and West of the Pateira de Fermentelos lagoon (along the Águeda and Vouga rivers). This is confirmed by the multi-temporal Sentinel 1A RGB composite (Figure 9d), which highlights in purple those areas flooded by 18 February 2016, but recovered toward the initial conditions on 1 March 2016. Those areas still flooded by 1 March 2016 (dark blue) are mostly concentrated to Southwest of the Pateira de Fermentelos lagoon (along the Cértima River).
A quantitative pixel location comparison of MINDED is presented within the form of confusion matrixes (Table 9), using MINDED as reference data (although neither method provides ground truth reference). For comparison purposes, HSBA results were reprojected and resampled to the same grid of MINDED (i.e., the resolution of Landsat images, 30 × 30m), using bilinear interpolation. Class accuracies, commission errors (i.e., overestimation), omission errors (i.e., underestimation) and overall accuracies are provided.

Discussion
We performed a multi-index differencing method (MINDED) to detect flooded areas aimed at improving the robustness of single-index approaches. This benefits from the sensitivities of individual indices for detecting water with different characteristics, to mitigate their specific limitations, and to assess the consistency among flood detection results within the so-called Overall flood map.
In digital change detection approaches, thresholds control the results of image differencing, which may be a critical undertaking when no ground truth data is available [35]. In practice, missing ground truth is the typical condition when analyzing archive data, and in particular when studying extreme events like floods, for which systematic collection of field data may be problematic. For these reasons, MINDED includes an expedited procedure aimed at reducing subjectivity for thresholds selection. The procedure is based on the analysis of the frequency distribution of water-related index differencing (∆WrI) data, as well as the corresponding first and second order derivatives. Different thresholds may be selected to discriminate between No change (Nc) areas and different types of change, based on both magnitude and sign of ∆WrI values. Nc areas are expected to include any kind of permanently dry surfaces, as well as permanent water bodies and stable wet/saturated surfaces. High Magnitude change (HMc) areas represent a complete change of state from dry land to water surfaces. Low Magnitude change (LMc) areas may include different conditions of subtle change like either areas changing from dry to wet/saturated, or from wet/saturated to water surfaces. Moreover, when post-event (t2) images are acquired several days after the flood, LMc may also include areas which were initially flooded, then underwent the process of recovering toward their original status. For this reason, we consider that MINDED widens the period in which to obtain usable images to extract maximum flood extents. This is particularly relevant for optical satellite imagery, due to their heavy dependence on cloud-free conditions, which tend to be less frequent immediately after heavy precipitation periods. The multi-index approach also makes it possible to extract a 'Mixed' class resulting from a majority analysis implemented within MINDED, as well as an Uncertainty map (Equation (1)) which quantitatively represents the spatial distribution of coherence among the different WrI results. The determination of uncertainty is also based on accuracies known from the literature (Table 4), which are used as weighting factors. As a consequence, MINDED incorporates the assumed performance of single WrI, even though such accuracies may have been obtained by different methods, assumptions, and sensors, as well as environmental conditions.
As for the implementation of MINDED in the study area, further conclusions may be drawn. Regarding the selection of events to perform the analysis, it is necessary to acknowledge the importance of acquisition time for both pre-and post-event images (respectively t1 and t2). When working with archive imagery related to past events, for which details about spatiotemporal flooding evolution are unknown, the reference image t1 may represent already altered conditions, due to either previous flooding or high water discharge levels. In these cases, the outputs of MINDED will result in underestimation of the overall flooded area. As for t2, the detection capabilities could vary depending on surface permeability, drainage behavior, and post-event anthropic management, which may limit the available time for usable post-event image acquisition. This is particularly relevant for the 13/02/2016 flood, which was studied with a Landsat (LS) 7 image acquired 16 days after the event. In this case, MINDED was able to detect tracks of floods in naturalized areas (e.g., mostly HMc areas nearby the Pateira de Fermentelos lagoon), even though only LMc areas were detected in the Águeda city center, where several streets were reported to be previously inundated. It is important to highlight that the effects of flooding on impervious surfaces, like those of urban areas, are ephemeral and few hours/days may be enough to return to the pre-event surface reflectivity conditions. Thus, we can expect the more t2 is acquired later than the event, the lower is the accuracy of MINDED to map the maximum flood extent. Moreover, it is expected to observe a progressive transition from HMc to LMc classes. MINDED cannot solve these issues related to the dynamics of post-flood processes, without the integration of ground truths and/or ancillary data.
As a general condition, flood water characteristics may affect the results obtained by indices based on different wavelength ranges of the electromagnetic spectrum (e.g., NIR or SWIR [43,49,97]). This issue can be relevant for those events lacking information collected in the field (as the ones considered within this paper). Nevertheless, concerning the 2003 event, no relevant differences may be identified between the post-event (t2) NIR and SWIR bands (Figure 4). Such similarities are consequently reflected in ∆MNDWI and ∆NDWI results (Table 7). In this case, the characteristics of flood water produced similar effects on both the NIR and SWIR bands, which can be interpreted as an indicator of low turbidity. In fact, the t2 image was acquired 5 days after the flood event, allowing time for sedimentation to take place in slow-moving or standing water areas. Among all indices, ∆NDVI was the one detecting fewest HMc areas, confirming its weaker capabilities for recognizing water surfaces [7,39].
Regarding the statistics of the ∆WrI, when using different sensors for t1 and t2, we can expect to observe the occurrence of non-zero centered modal values. This should not be interpreted as a shortcoming since we can still assume the whole distribution, as well as the Nc condition, to be centered with the main modal value. This is verified for the 2009 and 2016 events, for which, considering the limited availability of cloud-free images during the flood periods, we decided to use different sensors for t1 and t2. The width of the "bell" shape of each histogram is expected to be input-data related and variable according to the distribution of both Nc and LMc/HMc areas, spatial and radiometric co-registration between t1 and t2 images, and natural time-depending changes of features (e.g., phenology changes in natural or agricultural areas). However, other sudden changes rather than flooding may also occur, which could be detected as false alarms (e.g., land cover conversions from agricultural practices, effects of fires). The latest represent intrinsic errors of any image-differencing methods, such as MINDED, which nonetheless may be recognized if such changes occur in places falling out of potentially floodable areas (e.g., by analyzing DEM).
Considering that neither of the analyzed events resulted in an ideal multi-modal histogram (Figure 2c), in order to extract the thresholds, we could not use the first order derivative (df ), so we analyzed the second order derivative (d2f ) instead. While the extraction of TL, between Nc and LMc, was a straightforward procedure, the extraction of TH, between LMc and HMc, required further statistical processing aimed at improving the signal-to-noise ratio in the considered ∆WrI range. This last task is more prone to user-subjectivity, so TH should be more reasonably considered as a range of values, which spatial effects have to be verified at map scale. Being a typical expert-dependent analysis, this segment of MINDED may be hardly implemented as a fully automatic tool procedure.
An advantage of MINDED, resulting from the approach of combining results from different WrI, is the possibility to recognize eventually erratic classifications from a certain index in respect to the others. As observed for the 2003 event, all the ∆WrI maps are characterized by apparently random distributed change pixels located away from fluvial areas ( Figure 6). In a single-index flood detection approach, these pixels represent a false alarm condition which reduces the quality of the flooding map. This is particularly relevant for the non-normalized ∆WrI maps, which may result from ineffective cloud masking. Also, higher sensitivities to terrain shadowing caused by local relief (Table 2) may produce similar effects, due to rough masking from global-scale DEM. A more detailed analysis of such areas highlights the fact that the spatial distributions of such pixels are mainly non-overlapping amongst the ∆WrI maps. As a consequence, the process of stacking implemented within MINDED efficiently handles these false alarms by classifying most of them as Nc, or 'Mixed' in a smaller degree (e.g., 9.3% of the ROI for the 2003 event- Figure 7). This observation highlights the benefit of MINDED approach with respect to the standard single-index alternatives. The 'Mixed' class is mostly located nearby the transition areas between Nc-LMc and LMc-HMc, suggesting that they result from the occurrence of subtle changes which are differently recognized by each WrI. Furthermore, it also shows that the combined results from every ∆WrI are each other consistent. Another indication of the effectiveness of MINDED is the analysis of the spatial distribution of uncertainty. Highest uncertainty values are concentrated around the major riverbeds and areas of change, and their frequency is low (ρ = 3 to ≈ 4 − 0.4% of the study area). This suggests that, even though every WrI may provide different results about change, MINDED integrates these results giving a more robust representation of changes.
As expected, we observed larger extents of flooded areas with higher return period events. However, in some cases (2009, 2004 and 2003), several intertidal areas were classified as being flooded (mostly as LMc). This is a consequence of differences of tidal states between t1 and t2, meaning that t1 image likely was acquired during lower tide. Besides, we have to assume that MINDED might detect false alarms within permanent water bodies, such as the Pateira de Fermentelos lagoon, due to variations of water thickness/composition or floating debris/vegetation which might be classified as LMc or even HMc areas.
Regarding the comparison between MINDED and HSBA methods, results show that the location of Flooded and HMc classes areas is coherent. MINDED is more sensitive to detect tracks from recent flood events, particularly in LMc areas, which implies detection of larger flooded areas. On the other hand, despite the apparent improved spatial resolution of Sentinel 1A imagery (which is not directly comparable with the spatial resolution of optical data [98]), the HSBA algorithm seems to be less sensitive to discriminate between water 'saturated' areas and drier surfaces during the days following a flood event (resulting in higher commission errors). We consider that the low agreement level between Change and Flooded classes (Table 9), is likely related to the unavailability of Sentinel 1 images acquired during the flood event (or immediately after), as well as to the HSBA procedure which eliminates change clusters smaller than 10 pixels.

Conclusions
This study presents an innovative method based on the integration of change-detection concepts which are known from the literature. MINDED makes it possible to detect the extent of past and future flood events, combining multiple water-related indices derived from optical satellite data within a change detection approach, and benefiting from long-term image catalogs.
The method implements image differencing and provides a consistent procedure to analyze the frequency distribution of water-related indices for the extraction of different thresholds depending on the magnitude of changes, as a reproducible alternative to ground truthing. Then, types of change are discriminated considering the sign of difference and applying density slicing based on thresholds.
High magnitude changes include surfaces changing from 'dry' to 'flooded' state, while low magnitude changes correspond to surfaces changing from 'dry' to 'saturated', or 'saturated' to 'flooded' states. Unchanged areas do not discriminate between permanent water bodies and permanent dry surfaces, which is not a limitation for flood hazard analysis. By integrating different water-related indices with different sensitivities to water characteristics, and considering the agreement among the resulting classifications, the method is capable of obtaining both an Overall flood map and an Uncertainty map, estimating uncertainty bounds for flooded areas. This approach increases the robustness of results, particularly in the transition areas between flooded and non-flooded surfaces, as well as where randomly distributed changes not related to water may occur. Moreover, the possibility of analyzing flood events by comparing imagery acquired by different sensors may help to widen the temporal window for obtaining suitable optical satellite images. For these reasons, we consider MINDED being a new valid method to be integrated along with others, including SAR-based approaches (e.g., data-fusion methods), to obtain the best representation of flooded areas.
The performance of MINDED was tested with an application to a study area in Northwestern Portugal, using Landsat imagery to perform flood detection analyses for several events. The results are consistent with known historical flood records. When processing images acquired several days after a triggering precipitation event, MINDED showed capability of detecting tracks of floods both in naturalized and agricultural areas.
Since no ground truth data was available, we compared results from the same event with those obtained by processing multi-date post-event Sentinel-1A scenes using the Hierarchical Split-Based Approach algorithm. The best agreement between methods was obtained by processing the earliest post-event Sentinel-1A image, supporting the idea that MINDED provides a close representation of the maximum flood extent.
The current implementation of MINDED has the potential for automatization improvement, which would contribute to its use as a more expedite method for operational purposes. Further developments of MINDED should be tested in other regions and contexts, as well as by integrating more water-related indices and a different number of indices. Moreover, there is potential to implement MINDED using data obtained from other sensors, including improved spatial resolution alternatives (e.g., Sentinel-2), without requiring relevant adjustments to the procedures, still providing an evaluation of uncertainty. Finally, we highlight that the threshold selection procedure developed in this paper, when applied to either other indices or data, has the potential to be applied to change detection studies other than flooding.

Acknowledgments:
The authors thank Ramona Pelich from LIST, for presenting the Hazard Flood Mapping algorithm and the GPOD service. The authors also thank the reviewers, who definitely helped to improve the final contents of this paper.

Conflicts of Interest:
The authors declare no conflict of interest.