The Global Water Body Layer from TanDEM-X Interferometric SAR Data
1. Introduction
2. Data
2.1. The TanDEM-X Interferometric Global Data Set
2.2. Auxiliary Data
- Shadow and layover mask (SLM): Geometric distortions, such as shadow and layover, are observed in SAR images as low-coherence areas. This effect mainly occurs over mountainous terrain and may lead to the wrong classification of water bodies when using approaches based on coherence thresholding [40]. For each considered scene, we detected such areas by applying the approach proposed in [41], which takes into account the properties of each SAR scene and its acquisition geometry (orbit height, baseline, and incidence angle). By combining such information with an external reference DEM (in this case, the edited version of the SRTM DEM [18], detailed later on), it is possible to detect the low-coherence areas of the SAR image that correspond to geometric distortions, namely the shadow and layover regions. Figure 1 shows an example of the shadow and layover map, derived for a TanDEM-X image over the Alps, Europe. Amplitude and coherence images are presented as a reference, together with the obtained shadow and layover map. Additionally, a map of the local slope is generated by computing the two-dimensional gradient of the reference DEM.
- TanDEM-X quality check products: TanDEM-X acquisitions are interferometrically processed by the operational Integrated TanDEM-X Processor (ITP) [42]. Throughout the processing chain, the ITP provides direct feedback on the acquisition quality, to ensure high performance of the produced single-scene DEMs [32]. Remaining error sources that may cause larger DEM errors, namely phase unwrapping and on-board oscillator synchronization problems [43], are annotated in the ITP quality check products. The single-scene DEMs generated by the ITP are then combined in the subsequent TanDEM-X DEM mosaicking and calibration processor (MCP) [44]. Residual errors, due to remaining phase unwrapping problems or the presence of heavy rain clouds, are annotated in the MCP quality check products. Both the ITP and MCP quality check products are also considered in the weighted mosaicking process for the generation of the TDM WBL.
- SRTM (Shuttle Radar Topography Mission, [45]): An enhanced, edited version of the SRTM DEM has been used as reference for the detection of shadow and layover regions in each TanDEM-X image. This SRTM DEM merges an edited SRTM C-band DEM, adjusted with ICESat, with the Global Land One-km Base Elevation (GLOBE) DEM, as described in [46].
- MODIS (moderate resolution imaging spectroradiometer, [47]): The global snow and ice map, provided monthly by MODIS, at a spatial resolution up to 500 m, is used to detect TanDEM-X scenes affected by the presence of snow, which might lead to a misclassification of water surfaces.
- OSM (OpenStreetMap, [48]): OSM provides a global skeleton of rivers, which is used as input for the watershed classification algorithm, to improve the placement of the user-defined water markers at the resolution of TanDEM-X quicklooks.
- GlobCover [49]: The backscatter values over sandy desert regions are often close to the TanDEM-X SAR system sensitivity and can lead to an incorrectly estimated low interferometric coherence. In order to avoid the misclassification of such areas as water bodies, the GlobCover classification map has been used to mask out desert regions, as already done for the global TanDEM-X forest/non-forest map [36].
- Copernicus water and wetness (WAW) layer [50]: The Copernicus WAW layer, available over Europe, has been used for the validation of the TDM WBL. The WAW layer is part of the pan-European high-resolution layers (HRL), which provide information on specific land cover characteristics at a 20 m × 20 m resolution.
- TanDEM-X WAM Layer (TDM WAM) [51]: The TDM WAM is delivered together with the TDM global DEM and has been generated during the mosaicking of the full-resolution TanDEM-X DEM at 12 m × 12 m. It is an occurrence counter mask based on the thresholding of both the amplitude and the interferometric coherence. In particular, two fixed thresholds for all acquisitions have been defined for the amplitude, a relaxed threshold of −15 dB and a strict threshold of −18 dB, while the threshold applied to the interferometric coherence is 0.23. For each single TDM image used for the generation of the global DEM, image pixels showing values below these thresholds are flagged as water. Then, during the mosaicking process, and on a geocell basis, the total occurrence of detected water from overlapping scenes is evaluated and coded into an 8-bit map. In other words, for each pixel in the WAM, a coded value is saved, which reflects the number of overlapping acquisitions under the specified thresholds, up to a maximum of 3 occurrences. The complete description of the WAM bit coding can be found in [51]. In order to convert the multiple-bit coding of the WAM into binary layers, we split the WAM information into different categories, leading to the generation of up to 13 different binary water maps. All possible combinations and the relative products are summarized in Table 1. For each generated water map, pixels coded with other bit combinations in the WAM have been considered invalid. Note that the column “All counters” means that water has been detected in at least one acquisition for the corresponding WAM binary layer.
- ESA CCI water map [6]: The freely available global map of open permanent water bodies obtained from the Land Cover (LC) project of the Climate Change Initiative (CCI), provided by ESA, at 150 m × 150 m resolution, is used for a large-scale inter-comparison of the produced water maps.
- FROM-GLC water map [14]: The FROM-GLC (Finer Resolution Observation and Monitoring of Global Land Cover) water map has been used for the large-scale inter-comparison of the TDM WBL. This water map has been generated using a machine learning random forests classifier, trained on Landsat data, and updated to 2017 using additional Sentinel-2 data. It is more up-to-date than the ESA CCI one and has been generated at a resolution of 10 m.
- GSW occurrence map [5]: The GSW (global surface water) occurrence map from the European Commission (EC) Joint Research Centre (JRC) is based on Landsat imagery. It shows, at 30 m resolution, the frequency with which water was detected on the surface, from 1984 up to 2015, at the global scale. In order to generate a binary layer to be used for the comparison with the TDM WBL, we set an empirical threshold at the 50% water occurrence.
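Three of the auxiliary-data operations described in this section lend themselves to a short numerical illustration: the local slope from the two-dimensional gradient of the reference DEM, the WAM-style thresholding with an occurrence counter capped at 3, and the binarization of the GSW occurrence map at 50%. The following is a minimal sketch, not the operational processing. All function and parameter names are hypothetical, the inputs are assumed to be co-registered NumPy arrays, and the choice of strict versus inclusive comparisons is an assumption, since the text does not specify it.

```python
import numpy as np


def local_slope_deg(dem, pixel_spacing_m):
    """Local slope in degrees from the two-dimensional gradient of a DEM.

    Assumes a regular grid with the same spacing (in meters) on both axes.
    """
    dz_dy, dz_dx = np.gradient(np.asarray(dem, dtype=float), pixel_spacing_m)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


def water_flags(amplitude_db, coherence, amp_threshold_db=-15.0, coh_threshold=0.23):
    """Flag pixels below the amplitude and coherence thresholds quoted for the WAM.

    Use amp_threshold_db=-18.0 for the strict amplitude threshold.
    Strict '<' comparisons are an assumption.
    """
    amp_flag = np.asarray(amplitude_db) < amp_threshold_db
    coh_flag = np.asarray(coherence) < coh_threshold
    return amp_flag, coh_flag


def occurrence_count(flag_stack, max_count=3):
    """Per-pixel number of acquisitions flagged as water, capped at max_count.

    flag_stack has shape (n_acquisitions, rows, cols). The actual WAM bit
    coding is documented in [51] and is not reproduced here.
    """
    counts = np.count_nonzero(np.asarray(flag_stack, dtype=bool), axis=0)
    return np.minimum(counts, max_count).astype(np.uint8)


def binarize_occurrence(occurrence_percent, threshold_percent=50.0):
    """Binary water layer from a water-occurrence map given in percent.

    The inclusive '>=' comparison is an assumption.
    """
    return np.asarray(occurrence_percent) >= threshold_percent
```

For example, `occurrence_count(np.stack([flags_1, flags_2, flags_3, flags_4]))` returns 3 for a pixel flagged in all four acquisitions, since the counter saturates at the stated maximum.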
3. Methods
3.1. Data Preparation
3.2. Seeds Placement
- Reliable data takes: In the proposed method, data takes that are affected by snow or clouds, that show low interferometric quality, or that have a height of ambiguity lower than 25 m are considered non-reliable acquisitions and are not used further. Excluding data takes affected by snow increases the probability of correctly setting water seeds over seasonally frozen lakes, thanks to a more likely usage of summer acquisitions, if available. Data takes acquired with low heights of ambiguity (or, equivalently, large normal baselines) typically show low coherence values over forested areas because of the high impact of volume decorrelation, which can mislead the classification [39].
- Water and non-water seeds: Once the reliable data takes have been selected, we define seeds for both water and non-water bodies by properly thresholding the input coherence. The reference threshold values have been empirically defined, after a statistical analysis of more than 200 TanDEM-X images, acquired using different geometries and acquisition parameters. By comparing these images with the ESA CCI water map [6], it has been possible to statistically characterize the expected coherence values for water and non-water bodies. For water bodies, a coherence reference value of 0.22 has been obtained, similar to the one employed by [28], and relatively close to the lower coherence bias. For non-water bodies, the coherence value depends on the land cover type under evaluation. A coherence reference minimum value of 0.5 has been selected as representative for all land cover types. These coherence reference values are the input parameters for the watershed algorithm.
- Super pixels: For a given pixel location, if all available coherence images from multiple overlapping acquisitions show a coherence value above 0.6, the pixel is directly set as non-water in the final mosaic, since persistently high coherence values are a reliable indicator of the absence of water [38].
- OSM rivers skeleton: At a pixel resolution of 50 m × 50 m on ground, narrow river beds smaller than a pixel cell are challenging to detect. The backscattered signal of the surrounding land is merged with the response of such small water regions, and the obtained coherence is higher than the one expected for pure water bodies, which makes the positioning of water seeds difficult. In order to correctly detect such water bodies, we complement the seed detection with OpenStreetMap (OSM), which provides a global skeleton of rivers [48]. We extracted this information using the OSMxtract Python package [53]. The skeletons of narrow rivers are tagged as waterways in OSM, and their ground coordinates are set as water seeds.
- Seeds mosaic: Finally, for each single output coordinate on ground (latitude × longitude), we consider all the N available overlapping acquisitions simultaneously. If it holds that:
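The coherence thresholds quoted in this section for water seeds (0.22), non-water seeds (0.5), and super pixels (0.6) can be illustrated with a minimal sketch. It covers only those three thresholds and the use of a rasterized river skeleton as additional water seeds; it does not reproduce the seeds mosaic condition. All names are hypothetical, the marker encoding (0, 1, 2) is an assumption, and so are the strict comparisons and the choice to let river seeds override coherence-based seeds.

```python
import numpy as np

WATER_COH_REF = 0.22        # reference coherence for water seeds (from the text)
NON_WATER_COH_REF = 0.5     # reference minimum coherence for non-water seeds
SUPER_PIXEL_COH_REF = 0.6   # all acquisitions above this value: non-water

UNLABELED, WATER_SEED, NON_WATER_SEED = 0, 1, 2  # hypothetical marker codes


def place_seeds(coherence, river_mask=None):
    """Marker image for a watershed-type classifier from one coherence image.

    river_mask is an optional boolean raster of a river skeleton (for example
    derived from OSM waterways); its pixels are set as water seeds.
    """
    coherence = np.asarray(coherence, dtype=float)
    markers = np.full(coherence.shape, UNLABELED, dtype=np.uint8)
    markers[coherence < WATER_COH_REF] = WATER_SEED
    markers[coherence > NON_WATER_COH_REF] = NON_WATER_SEED
    if river_mask is not None:
        # Assumption: river skeleton pixels override coherence-based seeds.
        markers[np.asarray(river_mask, dtype=bool)] = WATER_SEED
    return markers


def super_pixels(coherence_stack, valid_stack):
    """Pixels whose coherence exceeds 0.6 in every available acquisition.

    Both inputs have shape (n_acquisitions, rows, cols); valid_stack marks
    where an acquisition actually covers the pixel.
    """
    coherence_stack = np.asarray(coherence_stack, dtype=float)
    valid_stack = np.asarray(valid_stack, dtype=bool)
    high = coherence_stack > SUPER_PIXEL_COH_REF
    # Acquisitions without coverage must not veto the decision.
    all_high = np.where(valid_stack, high, True).all(axis=0)
    return all_high & valid_stack.any(axis=0)
```

The resulting marker image has the form expected by marker-based watershed implementations, where labeled pixels act as seeds and unlabeled pixels are assigned during flooding.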
3.3. Single-Scene Water Classification
3.4. Reliability Weights
- quantifies the reliability loss caused by the presence of snow on ground. The coherence over areas covered by dry snow is typically degraded because of volume decorrelation effects, while, in the presence of wet snow and bare ice, such a phenomenon is negligible at the X band [56]. The snow coverage information is obtained from MODIS [47]. If the percentage of snow, indicated by MODIS over an image, is higher than the empirical threshold of 20%, then this is considered as a moderate source of uncertainty and .
- characterizes acquisitions affected by heavy-rain clouds, which appear in the coherence images as low coherent areas and could be identified as water surfaces. The clouds information is obtained from the MCP quality check products, described in Section 2.2. If heavy rain events are detected, they are considered as critical error sources and .
- is associated with the presence of acquisition problems, possibly annotated in the ITP quality check products, introduced in Section 2.2. Additionally, in this case, if anomalies are reported, .
- quantifies the reliability of winter acquisitions. Specifically, during this season, water bodies without constantly flowing water, such as lakes, may be frozen. This condition changes their backscattering properties, and they appear as more coherent areas. In this case, . One should note that we define as winter the time period between October and April for data takes acquired over regions at latitudes higher than N and between April and October for data takes acquired over regions at latitudes lower than S.
- accounts for the variability of the interferometric coherence with respect to the height of ambiguity [38]. For low values of the height of ambiguity, the coherence over forested areas can be degraded to values close to the lower bias [39]. On the contrary, this effect is significantly mitigated with increasing height of ambiguity, which corresponds to smaller perpendicular baselines. Therefore, we set a different value, depending on specific height-of-ambiguity intervals and seasonal time, as summarized in Table 2.
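The dependence of the height-of-ambiguity weight on interval and season, as listed in Table 2, can be expressed as a small lookup. This is a sketch under stated assumptions: the function name is hypothetical, the overlapping entries ">60 m" and ">80 m" are read so that the larger interval takes precedence, and heights of ambiguity between 40 m and 60 m raise an error because the table as given lists no value for them.

```python
def height_of_ambiguity_weight(hoa_m, is_winter):
    """Reliability weight for one data take, following the rows of Table 2.

    hoa_m: height of ambiguity in meters.
    is_winter: True for a winter data take, False for a summer one.
    """
    if hoa_m > 80.0:
        summer, winter = 4.0, 1.0
    elif hoa_m > 60.0:
        summer, winter = 2.0, 0.5
    elif hoa_m < 40.0:
        summer, winter = 0.5, 0.5
    else:
        # Table 2 as given lists no weight for this interval.
        raise ValueError("no weight listed for 40 m <= height of ambiguity <= 60 m")
    return winter if is_winter else summer
```

With these values, a summer data take with a height of ambiguity of 90 m contributes eight times the weight of a winter data take at 70 m (4.0 versus 0.5).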
3.5. Final Mosaicking
3.5.1. Frozen Water
3.6. Additional Output Information Layers
- Coverage map (CM): A map indicating the number of mosaicked acquisitions for each latitude/longitude pixel coordinate of the TDM WBL.
- Acquisition information files (AIF): The acquisition information files list all the acquisitions used in the generation of the TDM WBL map on a geocell level. The list contains the data take acquisition identifier, its scene number, and the date of the acquisition.
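As an illustration of these two layers, the sketch below counts the mosaicked acquisitions per pixel and models one AIF entry as a record with the three fields named above. The class, field, and function names are hypothetical, and the on-disk formats of the CM and AIF are not described in this section.

```python
from dataclasses import dataclass
from datetime import date

import numpy as np


@dataclass(frozen=True)
class AcquisitionInfo:
    """One entry of an acquisition information file (hypothetical field names)."""
    acquisition_id: str
    scene_number: int
    acquisition_date: date


def coverage_map(valid_stack):
    """Number of mosaicked acquisitions per pixel.

    valid_stack has shape (n_acquisitions, rows, cols) and is True where an
    acquisition contributes to the mosaic at that pixel.
    """
    valid = np.asarray(valid_stack, dtype=bool)
    return np.count_nonzero(valid, axis=0).astype(np.uint16)
```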
3.7. Accuracy Assessment
- The overall accuracy (OA) represents the fraction of correctly classified pixels, with respect to the total number of classified pixels, and is defined as: $$\mathrm{OA} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{6}$$ where $TP$, $TN$, $FP$, and $FN$ denote the numbers of true positive, true negative, false positive, and false negative pixels of the confusion matrix, with water as the positive class. The OA is provided for completeness, since it is well known that it shows optimistic results, especially on imbalanced data sets. Indeed, in the case of water mapping at a global scale, the proportion of water is often marginal, with respect to the non-water class, as shown in [6].
- The F-score, also called the F1-score, is an accuracy metric that ranges between 0 and 1 and can be expressed as: $$F = \frac{2\,TP}{2\,TP + FP + FN}.$$ The F-score is mainly used to evaluate binary classifications, and it is especially useful when dealing with imbalanced data sets. The overall accuracy in Equation (6) has the advantage of being easily interpretable, but the disadvantage of not being very robust when the data are unevenly distributed. The F-score represents a useful alternative when dealing with such data sets.
- The Matthews correlation coefficient (MCC) measures the statistical relationship between classified and reference classes and is defined as: $$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}.$$ The MCC is often used to assess the quality of binary classification, since it is generally regarded as a balanced measure of accuracy, even in the presence of classes with very different population sizes [57,58]. The MCC varies between −1 and 1: MCC = 1 represents a perfect agreement between the classification and reference maps, MCC = 0 means that the classification approach is no better than a random prediction, and MCC = −1 indicates an absolute disagreement between classification and reference maps. The MCC assumes a high score only if a good classification is obtained in all four terms of the confusion matrix.
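The three metrics can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions of OA, F-score, and MCC, with water as the positive class. Function names are hypothetical, and returning 0 when a denominator vanishes is a convention chosen here, not something stated in the text.

```python
import math

import numpy as np


def confusion_counts(classified_water, reference_water):
    """Return (tp, tn, fp, fn) for two boolean water masks of equal shape."""
    c = np.asarray(classified_water, dtype=bool)
    r = np.asarray(reference_water, dtype=bool)
    tp = int(np.count_nonzero(c & r))
    tn = int(np.count_nonzero(~c & ~r))
    fp = int(np.count_nonzero(c & ~r))
    fn = int(np.count_nonzero(~c & r))
    return tp, tn, fp, fn


def overall_accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified pixels."""
    return (tp + tn) / (tp + tn + fp + fn)


def f_score(tp, tn, fp, fn):
    """F1-score; tn is unused but kept for a uniform signature."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1]."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0
```

A small imbalanced example shows why the OA alone is optimistic. With ten pixels, of which two are water in the reference and two are classified as water but only one coincides, the counts are tp = 1, tn = 7, fp = 1, fn = 1, giving OA = 0.8, F-score = 0.5, and MCC = 0.375.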
4. Results
4.1. The Global TanDEM-X Water Body Layer
4.2. Accuracy Assessment
4.2.1. TDM WBL Validation
4.2.2. Comparison with TDM WAM
4.2.3. Intercomparison with Global Water Maps
5. Discussion
6. Conclusions
Author Contributions
Data Availability Statement
Conflicts of Interest
Table 1. Binary water maps derived from the TDM WAM.
WAM Binary Layer | All Counters | Acq. Counter 3 | Acq. Counter 2 | Acq. Counter 1 |
Coh + Amp | x | - | - | - |
Coherence | x | x | x | x |
Amp. < −15 dB | x | x | x | x |
Amp. < −18 dB | x | x | x | x |
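Table 2. Reliability weight as a function of the height of ambiguity, for summer and winter data takes (DT).
Height of Ambiguity | Summer DT | Winter DT |
<40 m | 0.5 | 0.5 |
>60 m | 2.0 | 0.5 |
>80 m | 4.0 | 1.0 |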
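TDM WBL | Reference Map: Water | Reference Map: Non-Water |
Water | TP | FP |
Non-water | FN | TN |
TP, FP, FN, TN: numbers of true positive, false positive, false negative, and true negative pixels, with water as the positive class.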
Region of Interest | Water Map | Type | All Counters | Acq. Counter 3 | Acq. Counter 2 | Acq. Counter 1 |
Europe (902 geocells) | WBL | Weighted | 83.16 | - | - | - |
Europe (902 geocells) | WBL | Temp + Perm | 83.03 | - | - | - |
Europe (902 geocells) | WAM | Coh + Amp | 67.16 | - | - | - |
Europe (902 geocells) | WAM | Coherence | 78.31 | 81.58 | 71.27 | 55.62 |
Europe (902 geocells) | WAM | Amp. < −15 dB | 69.57 | 65.46 | 72.38 | 61.29 |
Europe (902 geocells) | WAM | Amp. < −18 dB | 81.04 | 69.68 | 77.20 | 68.44 |
Alps (48 geocells) | WBL | Weighted | 77.54 | - | - | - |
Alps (48 geocells) | WBL | Temp + Perm | 74.32 | - | - | - |
Alps (48 geocells) | WAM | Coh + Amp | 40.68 | - | - | - |
Alps (48 geocells) | WAM | Coherence | 50.27 | 69.22 | 33.41 | 16.59 |
Alps (48 geocells) | WAM | Amp. < −15 dB | 45.45 | 66.05 | 50.80 | 31.63 |
Alps (48 geocells) | WAM | Amp. < −18 dB | 62.56 | 74.62 | 60.26 | 38.81 |
Scandinavia (143 geocells) | WBL | Weighted | 84.63 | - | - | - |
Scandinavia (143 geocells) | WBL | Temp + Perm | 90.91 | - | - | - |
Scandinavia (143 geocells) | WAM | Coh + Amp | 77.02 | - | - | - |
Scandinavia (143 geocells) | WAM | Coherence | 86.97 | 66.56 | 77.97 | 78.36 |
Scandinavia (143 geocells) | WAM | Amp. < −15 dB | 78.66 | 57.15 | 74.68 | 70.67 |
Scandinavia (143 geocells) | WAM | Amp. < −18 dB | 84.90 | 57.78 | 79.87 | 79.35 |
Region of Interest | Nr. Geocells | | | OA | | | F-Score | | | MCC | | |
Canada | 1386 | 1366 | 1370 | 94.62 | 94.15 | 95.13 | 74.31 | 74.71 | 73.65 | 72.53 | 71.12 | 71.46 |
USA and Mexico | 927 | 893 | 878 | 98.55 | 98.78 | 98.67 | 86.90 | 89.04 | 90.22 | 84.09 | 86.67 | 87.25 |
Central and South America | 999 | 823 | 892 | 98.66 | 98.65 | 98.01 | 87.70 | 86.60 | 90.52 | 82.49 | 84.65 | 85.08 |
Europe | 1080 | 1042 | 1014 | 98.42 | 98.64 | 98.52 | 86.81 | 88.78 | 90.14 | 85.17 | 87.23 | 88.50 |
Africa | 791 | 717 | 689 | 99.17 | 99.15 | 99.05 | 90.85 | 91.43 | 95.27 | 88.24 | 90.08 | 92.91 |
Asia | 2288 | 2162 | 2033 | 95.76 | 95.62 | 95.37 | 68.97 | 70.03 | 73.24 | 67.18 | 68.44 | 70.85 |
Oceania | 997 | 840 | 903 | 99.08 | 99.21 | 98.32 | 94.24 | 95.72 | 97.25 | 88.66 | 90.90 | 90.62 |
Greenland | 252 | 222 | 161 | 95.21 | 82.58 | 77.05 | 80.01 | 58.38 | 78.19 | 74.54 | 42.53 | 37.24 |
Bueso-Bello, J.-L.; Martone, M.; González, C.; Sica, F.; Valdo, P.; Posovszky, P.; Pulella, A.; Rizzoli, P. The Global Water Body Layer from TanDEM-X Interferometric SAR Data. Remote Sens. 2021, 13, 5069.