Compilation and Validation of SAR and Optical Data Products for a Complete and Global Map of Inland/Ocean Water Tailored to the Climate Modeling Community

Accurate maps of surface water extent are of paramount importance for water management, satellite data processing and climate modeling. Several maps of water bodies based on remote sensing data have been released during the last decade. Nonetheless, none has a truly (90°N/90°S) global coverage while being thoroughly validated. This paper describes a global, spatially-complete (void-free) and accurate mask of inland/ocean water for the 2000–2012 period, built in the framework of the European Space Agency (ESA) Climate Change Initiative (CCI). This map results from the synergistic combination of multiple individual SAR and optical water body and auxiliary datasets. A key aspect of this work is the original and rigorous stratified random sampling designed for the quality assessment of binary classifications where one class is marginally distributed. Input and consolidated products were assessed qualitatively and quantitatively against a reference validation database of 2110 samples spread throughout the globe. Using all samples, overall accuracy was always very high among all products, between 98% and 100%. The CCI global map of open water bodies provided the best water class representation (F-score of 89%) compared to its constitutive inputs. When focusing on the challenging areas for water body mapping, such as shorelines, lakes and river banks, all products yielded substantially lower accuracy figures with overall accuracies ranging between 74% and 89%. The inland water area of the CCI global map of open water bodies was estimated to be 3.17 million km² ± 0.24 million km². The dataset is freely available through the ESA CCI Land Cover viewer.


Introduction
Fresh surface water is one of the most precious resources on Earth, fulfilling social, economic and environmental services [1,2]. Climate change and population growth increasingly affect, with large spatial disparities, the availability and quality of water resources [3], hydrological flows [4] and related biodiversity [5]. Although access to freshwater is an integral part of the Millennium Development Goals of ensuring a sustainable environment [6], about 1.2 billion people live in water-scarce areas [7]. Reliable assessment of the world's water resources is therefore of paramount importance for decision making, governance and mitigation [8]. Maps depicting the distribution and extent of surface water also support hydrological simulation analyses, climate modeling and satellite data processing. Maps of open water bodies support the retrieval of key climate variables, such as evaporation, water/land surface temperature and energy balance, the selection of appropriate aerosol algorithms and the sharing of a common coastline map between processes.
Traditionally, cartographic methods with the support of Earth observation data have been used to delineate water bodies. The Global Lakes and Wetlands Database (GLWD) [9] compiled large-scale and regional sources for lakes, reservoirs and wetlands greater than 0.1 km² dating prior to 2000. The Global Insight Plus database [10] contains drainage features represented as lines/polylines at 1:1 M scale with a horizontal accuracy below or equal to 2048 m. The major caveats of such data products are the coarse representation of water bodies, geolocation flaws and the risk of becoming obsolete.
Remote sensing is the primary tool to provide accurate, detailed and up-to-date characterization of inland water bodies on a systematic basis for any location on Earth. A variety of methods and datasets have been developed in the last decade to map open water bodies at global or near-global scale (Table 1), using active radar and passive optical satellite observation, with moderate (250-1000 m) and high spatial resolution (<30 m).
Table 1. Available water body products and their compliance with the user requirements of spatial extent, completeness, thematic accuracy, inland water/ocean discrimination and spatial resolution. GLWD, Global Lakes and Wetlands Database; SWBD, SRTM Water Body Dataset; WBI, Water Body Indicator; GFC, Global Forest Change; GIW, Global Inland Water; G3WBM, Global 3 Arc-Second Water Body Map.
The first wall-to-wall map of water bodies for large parts of the globe was obtained with Synthetic Aperture Radar (SAR) data acquired during the Shuttle Radar Topography Mission (SRTM). The SRTM Water Body Dataset (SWBD) [11] was obtained as a by-product of the main target of the mission, i.e., a digital elevation model of all land masses imaged by the SAR. The SWBD has a spatial resolution of 90 m and is void-free, with river continuity being ensured by a thorough post-processing of the initial classification from the SAR data [18]. The SWBD map represents the water extent from 11 to 21 February 2000, between 60°N and 54°S. More recently, multi-temporal SAR metrics derived from Envisat ASAR data acquired between 2005 and 2012 were exploited to generate a nearly global dataset of permanent open water bodies. The dataset referred to as SAR-based Water Body Indicator (SAR-WBI) covers land masses between 84°N and 60°S and has a spatial resolution of 150 m, except in areas with predominant coarse resolution ASAR data takes (1000 m) [12]. The SAR-WBI was found to accurately characterize the spatial distribution of water bodies, primarily in the northern latitudes. The major caveat of the SAR-WBI is the omission of water along shorelines and the absence of water features smaller than twice the pixel size.
The first water bodies dataset based on optical remote sensing data and reported in the literature is the Global Raster Water Mask at 250-m resolution (MOD44W) [13]. This dataset builds on the SWBD, by filling gaps with MODIS optical data available for years 2000 and 2001. To achieve a truly global extent, the classification was complemented with water detections from the MODIS data north of 60°N and with a mosaic of Antarctica land masses [19] south of 60°S [13]. In regions where water was detected using MODIS data, water bodies smaller than 2-3 pixels may have been missed [13]. MODIS data at 250- and 500-m resolution were also used in the Global Water Pack methodology to derive the daily temporal dynamics of water [20], but no product has been released so far. PROBA-V acquisitions from January 2014 to present at a spatial resolution of 1000 m are used to generate the Copernicus Global Land Service Collection 2 (Copernicus WB) dataset, consisting of water body maps every 10 days between 80°N and 60°S [14].
High-resolution (30 m) Landsat time series were intensively exploited in the last few years to detect water surfaces and monitor water dynamics. The Global Forest Change (GFC) product depicts forest extent and change from 2000-2012 [15]. The sub-dataset "datamask" (hereafter, GFC-datamask) includes the class "permanent water bodies" covering all land masses between 80°N and 57°S. The 30-m Landsat Global Land Survey (GLS) data acquired for the 2000 epoch were used to generate an inland surface water classification between 90°N and 60°S referred to as the Global Inland Water (GIW) v1.0 product [16]. Using multi-temporal GLS images from 1990-2010, [17] produced a map of permanent and seasonal water bodies referred to as the Global 3 Arc-Second Water Body Map (G3WBM). The data product spans 90°N to 60°S and has a spatial resolution of 90 m. Pan-sharpened Landsat 7 (14.25 m spatial resolution) imagery circa 2000 was used to generate the Global Water Bodies database (GLOWABO), including all lakes larger than 0.002 km² [21]. More recently, [22] presented the Deltares Aqua Monitor and [23] the Landsat-based 30-year global surface water dynamics with a spatial resolution of 30 m.
Most efforts reported in this section were not triggered by requirements expressed by a target community of users. In the context of the European Space Agency (ESA) Climate Change Initiative, the climate and remote sensing communities expressed the need for a global (90°N-90°S and 180°W-180°E), spatially-complete, accurate (maximum 10% error) mask of the open water body product with a moderate resolution of a minimum of 300 m [24]. Transparency regarding the degree of quality was required, as well. Distinction between inland water and oceans was requested as an extra feature.
As shown in Table 1, the data products reviewed here fulfill such requirements only partially. The extent was not global in most cases. Several data products presented voids (i.e., classes other than land and water, like "no data", cloud, snow, etc.) and were therefore not complete. Inland water/ocean discrimination was seldom reported. The SWBD, the GFC-datamask, the Copernicus WB and the Global Insight Plus were not thematically validated. Validation of these water body products, when performed, followed various strategies. Accuracy assessments of the MOD44W [13], the GIW v1.0 [16], the GLWD [9], the GLOWABO [21] and the G3WBM [17] rely on the comparison of the total water bodies area between existing products. These methods of comparison are not site-specific, as no information is provided about the location of disagreements between products. In addition, results are not validated with reference data. To validate the SAR-WBI, confusion matrices were built over limited areas using independent maps [12]. Santoro and Wegmüller [12] reported the Overall Accuracy (OA), Producer Accuracy (PA), User Accuracy (UA) and Kappa indices. The variety of approaches does not allow comparing strengths and weaknesses of available water body products. In addition, reference validation datasets cover limited areas and are only used to validate a given product, without making any performance comparison with other existing ones.
As part of the land cover component of the ESA CCI, the overall objective of this study has been to generate the CCI global map of open water bodies at 150 m that fulfills all criteria of the climate modeling community. The moderate spatial resolution was found adequate for contemporary global circulation, regional and emerging convection-permitting models that run at a current horizontal spatial resolution coarser than 150 m [25–27]. Rather than developing new classification schemes, we focused on the synergistic combination of multiple individual datasets. One key aspect of this work is an original and rigorous stratified random sampling designed for the quality assessment of binary classifications where one class is marginally distributed (in this case, water). This sampling design is used to validate the CCI global map of open water bodies and allows comparing its performance with that of its constitutive inputs and key existing products.
This article is structured as follows. The selection of water body and auxiliary datasets relevant for our objective is first presented (Section 2). Then, the methodology adopted to combine and consolidate the selected input products is described along with the stratified random sampling design (Section 3). The CCI global map of open water bodies and its assessment are presented in Section 4 and discussed in Section 5. Finally, a set of conclusions and future outlooks are included in Section 6.

Potential Products to Build the CCI Global Map of Open Water Bodies
With reference to the water body products listed in the Introduction, the GLWD, the Global Insight Plus and the Copernicus WB were discarded a priori because they did not fulfill the target requirement of a spatial resolution of less than 300 m. Furthermore, the MOD44W data product was not considered due to the MODIS high view zenith angle that implies that individual observations regularly cover several adjacent grid cells [28]. This effect can significantly increase the actual observation footprint and reduce the effective spatial resolution. The GLOWABO database [21], the G3WBM [17], the Deltares Aqua Monitor [22] and the product from [23] were not available at the time of this study.
As a consequence, only the SWBD, the SAR-WBI, the GFC-datamask and the GIW v1.0 dataset were retained as candidate inputs to build the CCI global map of open water bodies. The SWBD captured the actual water state of 10 days in February 2000 from the orthorectified SRTM radar image [11]. Permanent water pixels of the GFC-datamask were selected as having a fraction of water-flagged observations for all non-cloudy observations greater or equal to 50% of the Landsat image time series selected during the growing season [29]. A thresholding algorithm was applied to multi-temporal SAR metrics to map water bodies of the SAR-WBI [12]. In the GIW v1.0, water pixels were the ones with the highest water/non-water probability calculated on each Landsat scene of the GLS 2000 collection [16]. Product characteristics are presented in Table 2.

Auxiliary Datasets
Each of the selected datasets presents local errors related to imperfect delineation of glaciers. These were corrected with the Randolph Glacier Inventory v3.2 (RGI) [30,31]. The RGI compiles glacier outlines as a complement to the Global Land Ice Measurements from Space (GLIMS) initiative [30]. The RGI was selected for its wall-to-wall coverage of glaciers and frequent outline improvements.
To fill the data gap that would occur south of 60°S when combining the input water body datasets, the Scientific Committee on Antarctic Research Antarctic Digital Database (SCAR ADD) [32] was selected. The SCAR ADD is a seamless compilation of coastline and topographic data for the continent of Antarctica. The aim of SCAR ADD is to provide the best currently available data over Antarctica with a maximum offset of 1000 m with respect to the true coastline [32].
Distinction between inland and ocean water relied on the Global Self-consistent, Hierarchical, High-resolution Shoreline (GSHHS) dataset [33]. The GSHHS combines the World Vector Shorelines (WVS) and the CIA World Data Bank II [33]. The WVS is the basis for shorelines with a working scale of approximately 1:100,000. The GSHHS database considers rivers as inland water bodies limited by a straight line located inland at no more than 1.85 km from the river mouth [34]. The GSHHS dataset was selected because of its global coverage and the accurate delineation of coastlines [33].

Method
The flowchart describing the compilation of the CCI global map of open water bodies is illustrated in Figure 1. First, a qualitative assessment of the input datasets was necessary to identify their strengths, weaknesses and potential synergies (Section 3.1). These were formalized in fusion rules with the aim of achieving a truly global extent and a void-free dataset (Section 3.2). A consolidation step was then implemented to remove macroscopic errors, ensure the completeness and eliminate temporary water bodies (Section 3.3). The consolidated map was finally spatially resampled to the target spatial resolution of 150 m (Section 3.4), and inland water was distinguished from ocean water (Section 3.5). As a complement, a tool was developed to adapt format and projection according to user needs (Section 3.6). All maps forming the CCI global map of open water bodies and the final product were quantitatively assessed. The validation methodology is described in Section 3.7.

Qualitative Assessment of Input Water Body Maps
The qualitative assessment of the SAR-WBI, the GFC-datamask, the GIW v1.0 and the SWBD was guided by the following user requirements: completeness, quality and spatial resolution. None of these products fulfilled the requirements on global extent and inland/ocean water delineation.
The completeness of each product was assessed by calculating the percentage and location of no data values or thematic classes different from land or water. The GIW v1.0 included the highest proportion of invalid data (9%), spread over land masses. Invalid data belong to classes "no data", "snow/ice", "cloud shadows" and "clouds". Invalid data in the SAR-WBI and GFC-datamask were localized in contiguous areas and summed up to 3% and 7% of the total number of land pixels, respectively. None of the three products includes Antarctica, and only ocean water close to the coast was explicitly mapped. Islands in the North of Canada, Greenland, Svalbard, Northern Russia and the islands in the Pacific Ocean were not included in the GFC-datamask. The northernmost latitudes and Svalbard were not included in the GIW v1.0 dataset. No classification could be obtained in the SAR-WBI for south Panama, north Australia and several isolated islands due to a lack of ASAR observations. In addition, a 1-degree longitudinal belt between 84°N and 83°N was removed because of permanent sea ice, systematically classified as land. Classification was not undertaken for the Greenland ice sheet because permanent water bodies are not included [12]. The SWBD was complete between 54°S and 60°N. This investigation showed that missing information was not systematically located over the same areas among all products, so that their fusion could contribute to achieving completeness.
The quality of the products was assessed in terms of errors and their location. Errors were related to misclassifications of temporary water events as permanent water (e.g., snow or ice melt, floods), incorrect coastline delineation, icebergs classified as land, confusion between water and dark landscape features, such as black lava or shadows in mountainous terrain, classification of wetlands or irrigated fields as permanent water, processing-induced artifacts (e.g., seams) and defective sensors.
With regard to the spatial resolution, the best characterization of water bodies was observed in the GFC-datamask and GIW v1.0 products. The 30-m resolution indeed better ensured river connectivity and delineation of small water bodies, such as thermokarst lakes and narrow tributaries. The analysis of these two products furthermore revealed their complementarity in the sense that errors and omissions were not systematic in both (Section 4.1). Despite the coarser spatial resolution, the spatial distribution of water bodies in the SWBD and the GFC-datamask was similar. The added value of the SWBD in this context is the presence of islands, which were not included elsewhere. Because of the coarser resolution (150 m), the delineation of water bodies in the SAR-WBI was less accurate compared to the 30-m data products. Nevertheless, the very high density of observations by ASAR [35] resulted in a more precise characterization of coastlines, lakes and river systems. In addition, some artificial lakes created during the last decade were detected in the SAR-WBI only.

Combination of Water Body Products
The GFC-datamask was selected as the primary source of information due to the 30-m resolution, the high quality of the water body delineations and the tendency to map the minimum water extent. The GFC-datamask was supplemented with the GIW v1.0 water class, which brought additional spatial details to the water characterization and increased the spatial completeness. The SWBD was used to replace water in correspondence to islands missing in the GFC-datamask and the GIW v1.0 datasets. Finally, the water and land classes from the SAR-WBI, resampled at 30 m, filled remaining voids north of 68°N.
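The priority order of these fusion rules can be sketched per pixel as follows. This is only an illustration under stated assumptions: the class codes, the function name `fuse_pixel` and the `island_missing` flag are hypothetical, and the rule that GIW v1.0 land fills GFC-datamask voids is one possible reading of "increased the spatial completeness", not a documented detail of the processing chain.

```python
# Hypothetical class codes; the real products use their own encodings.
LAND, WATER, VOID = 0, 1, 255

def fuse_pixel(gfc, giw, swbd, sar_wbi, lat_deg, island_missing=False):
    """Combine four co-registered 30-m labels for one pixel.

    island_missing is an assumed flag marking islands absent from both
    Landsat-based products (GFC-datamask and GIW v1.0).
    """
    # 1. The GFC-datamask is the primary source.
    label = gfc
    # 2. The GIW v1.0 water class supplements it; filling voids with
    #    GIW land is an assumption made for this sketch.
    if giw == WATER:
        label = WATER
    elif label == VOID and giw == LAND:
        label = LAND
    # 3. The SWBD replaces the label over islands missing elsewhere.
    if island_missing and swbd != VOID:
        label = swbd
    # 4. The SAR-WBI land/water classes fill remaining voids north of 68°N.
    if label == VOID and lat_deg > 68.0 and sar_wbi != VOID:
        label = sar_wbi
    return label
```

Applying the rules in this fixed order keeps the 30-m Landsat information wherever it exists and uses the radar-based products only where it is absent.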

Consolidation
Consolidation of the combined product obtained with the procedure outlined in Section 3.2 served to improve it in terms of completeness and accuracy. The SCAR ADD Antarctica layer was added to extend the data product to 90°S. The RGI was used to fill gaps on glaciers and correct for water commission errors. At this stage, the data product presented voids only over oceans, which were manually corrected to water to reach completeness.
Macroscopic errors due to imperfections in each of the input products were finally corrected for. To this end, the land surface was divided into a regular grid of cells of 1 × 1 degree inside which the individual datasets were compared. Hotspots of disagreement were furthermore cross-checked with high spatial resolution imagery from Bing Maps and Google Earth. Confirmed errors were manually delimited and removed. With the aim of correcting for water omissions, the SAR-WBI was introduced south of 68°N.
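The grid-based comparison can be illustrated with a short sketch that bins pixel-level disagreements between two products into 1 × 1 degree cells and ranks the cells. The function names, the input layout and the 10% threshold are hypothetical; the text does not state how hotspots were defined before the visual cross-check.

```python
import math
from collections import defaultdict

def disagreement_by_cell(pixels):
    """Fraction of disagreeing pixels per 1 x 1 degree cell.

    pixels: iterable of (lat, lon, label_a, label_b) tuples, where the two
    labels come from two co-registered water body products.
    """
    total = defaultdict(int)
    differ = defaultdict(int)
    for lat, lon, label_a, label_b in pixels:
        # A cell is identified by its south-west corner in whole degrees.
        cell = (math.floor(lat), math.floor(lon))
        total[cell] += 1
        if label_a != label_b:
            differ[cell] += 1
    return {cell: differ[cell] / total[cell] for cell in total}

def hotspots(rates, threshold=0.1):
    """Cells whose disagreement rate reaches the (assumed) threshold, worst first."""
    selected = [cell for cell, rate in rates.items() if rate >= threshold]
    return sorted(selected, key=lambda cell: -rates[cell])
```

The ranked cells would then be candidates for inspection against high-resolution imagery.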

Spatial Resampling
The nearest neighbor algorithm was chosen for resampling the combined data product to the final spatial resolution of 150 m. The 150-m spatial resolution was chosen because it is the coarsest resolution of the data layers used to generate the CCI water body product. This choice will be discussed in Section 5. To compensate for the artificial increase of water bodies after resampling, the percentage of water mapped at 30 m in the target 150-m grid cell was computed as a separate layer.
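A minimal sketch of this resampling step is given below, assuming that each 150-m cell covers an aligned block of 5 × 5 pixels at 30 m and that water is coded as 1. The function name and the grid layout (a list of rows) are illustrative choices, not the implementation used for the product.

```python
def resample_block(grid30, factor=5):
    """Resample a 30-m binary grid to 150 m (factor 5).

    Returns two grids: the nearest neighbor label (the 30-m pixel at the
    centre of each block) and the percentage of 30-m water pixels per block.
    """
    rows, cols = len(grid30), len(grid30[0])
    nearest, percent = [], []
    for r0 in range(0, rows - factor + 1, factor):
        nn_row, pc_row = [], []
        for c0 in range(0, cols - factor + 1, factor):
            block = [grid30[r][c0:c0 + factor] for r in range(r0, r0 + factor)]
            # Nearest neighbor: the pixel closest to the block centre.
            nn_row.append(block[factor // 2][factor // 2])
            # Water percentage: share of 30-m water pixels (value 1).
            water = sum(1 for row in block for value in row if value == 1)
            pc_row.append(100.0 * water / (factor * factor))
        nearest.append(nn_row)
        percent.append(pc_row)
    return nearest, percent
```

A block in which only the centre pixel is water is labeled as water by the nearest neighbor rule although its water fraction is 4%, which is the kind of artificial increase that the separate percentage layer makes visible.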

Differentiation of Inland/Ocean Water
The main constraint in defining the land/water boundary was to maintain the detailed coastline from the input water bodies while including the rivers flowing into the ocean in the inland water class.
Around river mouths, the GSHHS was used to define the limit between the inland section of the rivers and the ocean. Due to discrepancies between the coastline of the input water bodies and the one defined by the GSHHS database, a positive buffer of 0.033 degrees (~3.6 km at the Equator) was applied to ensure extracting rivers from oceans without affecting the coastlines of the global map of open water bodies. Since the GSHHS database considered rivers as inland water bodies limited by a straight line located inland at no more than 1.85 km from the river mouth, the resulting rivers are represented as inland water bodies limited by a straight line located inland at no more than ~5.45 km from the river mouth. Elsewhere, the coastline is defined by the water detection implemented in the input water body products.

User Tool
The CCI global map of open water bodies is delivered at 150-m spatial resolution in a Plate-Carrée projection. To support a wide range of communities requesting a different spatial resolution and/or projection, a stand-alone software tool was developed to allow sub-setting, re-scaling and re-projection (Table 3). Re-scaling generates the fractional area of each class in the target cell and the class value with the largest fractional area.

Table 3 (only the sub-setting entry is recoverable). Sub-setting: predefined regional subset; free specification of regional subset (4 corner coordinates).

Sampling Scheme
Unlike traditional accuracy assessments relying on simple random sampling, a two-fold stratified random sampling was used to avoid undersampling rarely occurring map classes, such as "water" [36].
The first level of stratification was geographic in order to obtain a homogeneous distribution of validation samples everywhere. It generated 21 "Level-1" strata (open oceans and polar areas excluded) defined by bioclimatic and remote sensing criteria [37]. The number of samples per Level-1 stratum was proportional to its area.
Good practices of accuracy assessment suggest that class-based stratification reduces standard errors of class-specific accuracy estimates [38]. However, because inland water corresponds to a marginal class with respect to global land cover, using water and land as strata would be beneficial for the user accuracy, but could result in optimistic producer accuracy results due to the reduced probability of sampling water omissions. The second level of stratification was therefore developed based on the a priori confidence of correctly representing map classes. This confidence-based stratification was categorized into three Level-2 strata: high confidence in correctly mapping the land class (Stratum 1), high confidence in correctly mapping the water class (Stratum 2) and error-prone areas (Stratum 3). The combination of the MOD44W [13], the GLWD [9] and the Global Insight Plus water layer [10] was used to obtain the three strata. Stratum 1 corresponded to land agreement between the three maps, Stratum 2 to water agreement and Stratum 3 to discrepancies between at least two of the three maps. The surface of Stratum 3, i.e., error-prone areas, corresponded to 76% of the total surface of inland water.
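The assignment of a location to a Level-2 stratum follows directly from the agreement rule above and can be written as a small function. The class codes and the function name are hypothetical, and the three inputs are assumed to be co-registered land/water labels from the three reference maps.

```python
# Hypothetical class codes for the three co-registered reference maps.
LAND, WATER = 0, 1

def level2_stratum(mod44w, glwd, insight_plus):
    """Return the Level-2 stratum (1, 2 or 3) for one location."""
    labels = (mod44w, glwd, insight_plus)
    if all(label == LAND for label in labels):
        return 1  # high confidence in the land class
    if all(label == WATER for label in labels):
        return 2  # high confidence in the water class
    return 3      # error-prone: the maps disagree
```

Any disagreement among the three maps is enough to place a location in Stratum 3, which is why shorelines and river banks dominate that stratum.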
The sample size, S, was optimized with regard to the expected accuracy of the CCI global map of open water bodies, and the confidence interval was derived according to the binomial distribution [39]:
\[ S = \frac{Z_\alpha^{2} \, p \, (1 - p)}{E^{2}} \qquad (1) \]
where $E$ is the allowable error in the sample (half of the confidence interval), $Z_\alpha$ is the critical value drawn from the normal distribution for a given level of confidence and $p$ is the targeted accuracy of the product. A confidence interval of 4% with a confidence level of 95% ($Z_\alpha = 1.96$) was chosen.
Our assumption is that the accuracy of water classification is lower where different maps disagree, while water bodies are usually classified with high to very high overall accuracy in areas of agreement. The targeted accuracy was therefore set to be at least 85% in the error-prone area, which corresponds to approximately 1200 samples. An additional 1200 samples were distributed equally to the other two strata, where the targeted accuracy was at least 93%.
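The sample sizes quoted above can be checked with the usual binomial sample size relation, S = Z² p (1 − p) / E², taking E as half of the 4% confidence interval. The function name is illustrative, and rounding up to the next integer is a choice made for this sketch.

```python
import math

def sample_size(p, half_width, z=1.96):
    """Binomial sample size for target accuracy p and allowable error half_width."""
    return math.ceil(z * z * p * (1.0 - p) / half_width ** 2)

# Error-prone stratum: targeted accuracy of 85%, E = 0.02.
n_error_prone = sample_size(0.85, 0.02)
# Agreement strata: targeted accuracy of 93%, E = 0.02.
n_agreement = sample_size(0.93, 0.02)
print(n_error_prone, n_agreement)
```

Under these assumptions the relation gives 1225 samples for the error-prone stratum and 626 for a 93% target, in line with the approximately 1200 samples allocated to Stratum 3 and the 600 samples allocated to each of the two other strata.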

Generation of the Validation Database
The sampling unit was the pixel materialized with a footprint of 150 m × 150 m. These samples were visually interpreted independently from the product using high resolution Google Earth imagery. Careful attention was paid to interpret and record the permanent, as well as the temporary character of snow and water evenly across the globe by extensive use of historical imagery. According to the photo-interpretation practices building on the convergence of evidence [40,41], it was possible to identify water presence at the time of imaging, but also surfaces that can be seasonally flooded. In particular, these surfaces concern dry river beds, flood-prone areas, irrigated agriculture, mangroves/inundated forests, ephemeral streams, salt pans and snow packs. Samples were labeled as water when at least half of the sample was covered with open surface water. Samples showing temporary snow or water were labeled as land, but the temporal aspect was also recorded. For all samples, the date of the high resolution imagery was recorded. In addition, wetlands and swamps were also recorded.

Accuracy Assessment
The CCI global map of open water bodies, the SAR-WBI, the GIW v1.0 dataset and the GFC-datamask were validated against the reference samples of the validation database, assuming that these represent the true Earth surface state. The SWBD was not validated given its minor contribution to the CCI global map of open WB (0.69% of the total inland water surface). Here, accuracy was assessed at three levels. One assessment included all samples and took into account all strata. A second assessment exclusively focused on the error-prone stratum of Level-2. A third assessment focused on the samples recorded as temporary water, with the aim of evaluating whether seasonal water bodies affect the water body product.
The assessment was quantified in terms of confusion matrices built by comparing each class of the map ($n_i$) to the reference sample classes ($n_j$). Each confusion matrix reported the Overall Accuracy (OA), the User's Accuracy (UA), the Producer's Accuracy (PA) [42] and the F-score. OA represents the proportion of all cases correctly classified (Equation (2)), with $n$ being the total number of samples and $q$ the total number of classes (water and non-water). Because the sampling probability was different among the three Level-2 strata, global index values were weighted according to the sampling probability.
In Equation (2), $w_s$ is the weight of the stratum, which is inversely proportional to the sampling effort. The weights are computed for each stratum based on Equation (3), where $S_s$ is the area of the stratum and $n_s$ is the number of sample points in the stratum.
UA corresponds to the probability that a randomly selected pixel from the map is correctly classified according to the reference sample. PA corresponds to the probability that a reference sample is correctly classified in the map. Therefore, UA is related to the commission error, while PA informs about the omission error. They are calculated following Equations (4) and (5) inside each stratum and thereafter weighted using the same method as for the global overall accuracy.
The F-score (Equation (6)) represents for a class $k$ the harmonic mean of the user and producer accuracies and ranges between 0 and 1:
\[ F_k = \frac{2 \, UA_k \, PA_k}{UA_k + PA_k} \qquad (6) \]
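The weighted indices can be illustrated with the sketch below. It assumes a stratum weight of the form w_s = S_s / n_s (stratum area divided by the number of samples) and pools the weighted counts into a single area-weighted confusion matrix before deriving OA, UA, PA and the F-score. This pooling is a common stratified estimator and a simplification of the per-stratum weighting described in the text; the function names and the data layout are hypothetical.

```python
def weighted_matrix(strata):
    """Pool per-stratum confusion counts into one area-weighted matrix.

    strata: list of (area, counts) pairs, where counts maps
    (map_label, reference_label) to a number of samples.
    """
    total = {}
    for area, counts in strata:
        n_s = sum(counts.values())
        w_s = area / n_s  # assumed weight: inverse of the sampling effort
        for key, n in counts.items():
            total[key] = total.get(key, 0.0) + w_s * n
    return total

def accuracy_metrics(matrix, k="water"):
    """Return (OA, UA, PA, F-score) for class k; k must occur in map and reference."""
    labels = {label for key in matrix for label in key}
    correct = sum(matrix.get((label, label), 0.0) for label in labels)
    oa = correct / sum(matrix.values())
    mapped_k = sum(v for (m, _), v in matrix.items() if m == k)
    reference_k = sum(v for (_, r), v in matrix.items() if r == k)
    ua = matrix.get((k, k), 0.0) / mapped_k      # commission side
    pa = matrix.get((k, k), 0.0) / reference_k   # omission side
    f_score = 2.0 * ua * pa / (ua + pa)          # harmonic mean of UA and PA
    return oa, ua, pa, f_score
```

With a large land-dominated stratum and a small water stratum, a handful of commission errors in the large stratum lowers the UA of water sharply while OA stays high, which mirrors the behaviour reported for the global figures.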
A McNemar test [43] was applied to evaluate if the values reported in the confusion matrix for each individual product were significantly different.
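For reference, a McNemar test on paired classification results can be sketched as follows. The text does not state which variant was used; this example assumes the chi-square form with continuity correction and one degree of freedom, and the function name is illustrative. The inputs are the two discordant counts: samples classified correctly by one product only.

```python
import math

def mcnemar(b, c):
    """McNemar chi-square test with continuity correction.

    b: samples correct in product A and wrong in product B.
    c: samples wrong in product A and correct in product B.
    Returns (chi2, p_value) for one degree of freedom.
    """
    if b + c == 0:
        return 0.0, 1.0  # no discordant samples, no evidence of a difference
    chi2 = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, 25 and 10 discordant samples give a statistic of 5.6, which exceeds the 3.84 critical value at the 95% confidence level, so the two products would be considered significantly different.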

The CCI Global Map of Open Water Bodies
The CCI global map of open water bodies is illustrated in Figure 2. This global, void-free dataset consists of two separate layers: an inland water/ocean repartition at 150-m spatial resolution and an inland water fraction, in percent of the 150-m grid cell. The total inland water area is 3.41 million km². The complementarity of the GFC-datamask and the GIW v1.0 products was key to obtain an exhaustive detection and delineation of water bodies. Figure 3 illustrates the increase of river continuity brought by the GIW v1.0 (Figure 3a,b), compensations for water body omissions present in the GFC-datamask (Figure 3c) and "no data" filling by the GFC-datamask (Figure 3d). The SAR-WBI contributed significantly north of 68°N (Figure 3e). In addition, it contributed to updating the classification based on the Landsat data in areas with more recent artificial basins. Figure 3f illustrates the Indira Sagar Dam, commissioned in May 2005, which was included neither in the GIW v1.0 nor in the GFC-datamask.
Excluding Greenland, Antarctica and islands south of 60°S, the consolidation affected 29% of the inland water class. Figure 4 illustrates a few examples of the a posteriori consolidation aided by the auxiliary datasets and manual corrections of macroscopic errors. Macroscopic commission errors along the Ob River in the GIW v1.0 dataset were manually removed and replaced with the classification of the GFC-datamask (Figure 4a). Black lava in Saudi Arabia was misclassified by both the GIW v1.0 and the GFC-datamask; manual correction was applied here (Figure 4b). Land contamination and incompleteness in the GIW v1.0 dataset could be corrected for with the aid of the SWBD (Figure 4c). Water commission over glaciers was corrected with the aid of the RGI dataset (Figure 4d).

Accuracy Assessment
The reference database included 2400 samples spread over land masses with the exclusion of polar areas (Figure 5). Of these, 2121 corresponded to valid data in each of the datasets used to generate the CCI water body product. Eleven samples were further discarded because the Google Earth imagery could not be interpreted reliably due to cloud coverage, unavailability of images or uncertain interpretation. Of the remaining 2110 samples, 1030 were included in the error-prone stratum of Level-2, and 234 corresponded to temporary water bodies like ephemeral streams, beaches, irrigated crops and salt lakes.
The overall accuracy was always very high, between 98% and 100% (Table 4). This was a consequence of the overwhelming proportion of the land class at the global scale compared to the marginal water class. Yet, water surfaces were not identified with high accuracy in any of the input datasets (Table 4). The PAs of water in the input datasets were always lower than the value obtained for the CCI water body map, with considerable differences among the individual input datasets. The PA of the CCI global map of open water bodies (92%) outperformed the best PA of the input datasets by 13%. In contrast, the UA of the GFC-datamask (97%) was higher than the UA of the CCI global map of open water bodies (86%). The CCI global map of open water bodies overestimated water, while the GFC-datamask underestimated it. The underestimation of water by the GFC-datamask was also a finding of the qualitative assessment and justified the introduction of the water class from the GIW v1.0 in the consolidation. The GFC-datamask water omissions were typically found along lake banks, in shallow water and at dams. The minimum water extent of the GFC-datamask is probably due to its definition of water, which uses a strict threshold of at least 50% water detections in the Landsat image time series [29]. Commission errors are marginal and occur along some lakes and over black lava rocks (e.g., Ethiopia). The F-score of the CCI global map of open water bodies (89%) was significantly higher than the value obtained for the GFC-datamask. The F-score of the SAR-WBI was the lowest of all values (71%), mainly due to an underestimation of water (low PA). Omission errors were located mainly along coastlines, water body boundaries and in mountainous areas [12]. In areas where the SAR-WBI was based primarily on data with a spatial resolution of 1000 m, water bodies were either missed, imprecisely delineated or only partially detected in fragments. Most commission errors corresponded to temporary water (e.g., inundated areas, floodplains, deserts and salars) and, to a lesser extent, to coastlines, irrigated croplands and mountainous areas [12].
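The accuracy indices discussed above can be illustrated with a minimal sketch. It assumes the usual definitions: OA is the share of correctly classified samples, PA and UA of the water class are the complements of the omission and commission error rates, and the F-score is the harmonic mean of PA and UA. The function names and the confusion counts are hypothetical, and the sketch works on raw counts rather than on the area-weighted figures reported in Table 4.

```python
def f_score(pa, ua):
    """Harmonic mean of producer accuracy (PA) and user accuracy (UA)."""
    return 2 * pa * ua / (pa + ua)


def accuracy_metrics(tp, fp, fn, tn):
    """Accuracy indices for the water class from a 2 x 2 confusion matrix.

    tp: reference water mapped as water
    fp: reference land mapped as water (commission)
    fn: reference water mapped as land (omission)
    tn: reference land mapped as land
    """
    total = tp + fp + fn + tn
    pa = tp / (tp + fn)  # 1 - omission error rate
    ua = tp / (tp + fp)  # 1 - commission error rate
    return {
        "OA": (tp + tn) / total,
        "PA": pa,
        "UA": ua,
        "F": f_score(pa, ua),
    }


# Hypothetical counts: water is rare, so OA stays high even with many omissions.
metrics = accuracy_metrics(tp=90, fp=10, fn=30, tn=9870)
```

With these illustrative counts, OA is 99.6% although one quarter of the reference water is omitted (PA of 75%), which mirrors the effect of the dominant land class described above. Under the same definition, a PA of 92% and a UA of 86% give an F-score that rounds to 89%, in line with the value reported for the CCI global map of open water bodies.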
The second accuracy analysis focused on error-prone areas (Table 5). Compared to the accuracy figures of Table 4, the PAs and UAs for this stratum were substantially lower (on average, by 14% and 21%, respectively). The trend in OA and the PA of the non-water class in the SAR-WBI and the GIW v1.0 datasets did not differ when restricting the analysis to error-prone areas only (see Tables 4 and 5). In contrast, the PA and the F-score of the water class of the SAR-WBI were substantially lower. This was due to the frequent omission of water in the SAR-WBI; however, since water represented a small proportion of the classes being mapped, the effect of omission was not visible in the statistics derived for the overall assessment.
Similarly, the GFC-datamask and the CCI global map of open water bodies gave identical trends in the results for OA and F-scores for the class non-water (see Tables 4 and 5). The GFC-datamask was prone to water omission and the CCI global map of open water bodies to water commission. The underestimation of water in the GFC-datamask (PA of 61%) was exacerbated in error-prone areas that include a high proportion of shorelines, lakes and river banks. The PA of the CCI global map of open water bodies reached 75%, indicating that the large omissions of the GFC-datamask and of the GIW v1.0 were effectively compensated for.
The OA obtained with the third accuracy analysis based on the 234 samples corresponding to temporary water was 79%, 89%, 94% and 99% for the SAR-WBI, the GIW v1.0, the CCI global map of open water bodies and the GFC-datamask, respectively. This ranking was in line with the results highlighted in Table 4. The GFC-datamask tended to map a minimum water extent and always showed low rates of water commission (high UA). These results will be discussed in the next section.

Assessing Total Water Surface
According to the CCI global map of open water bodies, inland water covers an area of 3.41 million km². Following [38], this area was corrected by weighting the actual area of the land and water classes in each Level-2 stratum by the corresponding UA figures of the accuracy assessment using all samples. The maximal error on this area was calculated by taking into account the actual error of each class in each Level-2 stratum. This error was derived using Equation (1) with the actual number of samples within each class of each Level-2 stratum. The result was an estimated inland water area of 3.17 million km² ± 0.24 million km².
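One possible reading of the UA-based area correction described above is sketched below. It assumes that, within each stratum, the mapped water area is scaled by the UA of water and that the share of mapped land that is actually water, (1 - UA of land), is added back. This interpretation, the function name and all numbers are illustrative assumptions; they are not the procedure of [38] nor the areas and accuracies of this study.

```python
def corrected_water_area(strata):
    """Sum, over strata, the mapped areas that are expected to be water.

    Each stratum is a dict with hypothetical keys:
      water_area, land_area : mapped areas (any consistent unit)
      ua_water, ua_land     : user accuracies of the two classes (0..1)
    """
    total = 0.0
    for s in strata:
        # Mapped water that is confirmed as water by the reference data.
        total += s["water_area"] * s["ua_water"]
        # Mapped land that the reference data identify as water (omission).
        total += s["land_area"] * (1.0 - s["ua_land"])
    return total


# Hypothetical strata (areas in million km2), for illustration only.
example_strata = [
    {"water_area": 0.0, "land_area": 100.0, "ua_water": 1.0, "ua_land": 1.0},
    {"water_area": 2.0, "land_area": 0.0, "ua_water": 1.0, "ua_land": 1.0},
    {"water_area": 1.0, "land_area": 4.0, "ua_water": 0.8, "ua_land": 0.95},
]
area = corrected_water_area(example_strata)
```

In this toy case the mapped water area is 3.0 and the corrected area is also about 3.0, because the commission removed in the discrepancy stratum (0.2) happens to equal the omission added back (0.2); with real figures the two terms generally differ, which is what moves the estimate away from the mapped area.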
The CCI global map inland water area is in the range of 3.05-4.57 million km² reported by [44] for a series of global Earth observation products with spatial resolutions from 30 m to 1000 m. Estimations provided by [16] for the GIW v1.0, by [13] for the MOD44W, by [17] for the G3WBM and by [45] were also in agreement with this range. However, an area of ~5 million km² was reported by [21] for the lakes of the GLOWABO product.

Discussion
This study demonstrated that the combination and consolidation of existing water body products leads to a global map of open water bodies that meets the climate modelers' needs for adequate spatial resolution, maximal spatial extent and completeness, along with high accuracy.
The 150-m spatial resolution of the CCI global map of open water bodies was found adequate for contemporary climate models, which currently run at horizontal spatial resolutions far coarser than 150 m. Global circulation models typically have resolutions between 250 and 600 km [25], and regional models provide so-called "high resolution" simulations at 10-20 km [26]. Convection-permitting climate models are now emerging and provide more reliable climate information on regional to local scales [27]. Those models operate at the kilometer scale, with resolutions as fine as 0.5 km [46].
According to the survey conducted by [24], a resolution finer than 300 m is also of interest for the broader land cover and "climate-related" communities. In 20% of the answers, the current global standard spatial resolution (300-1000 m) would even be sufficient. Finally, a spatial resolution of 150 m was found to be suitable for communities studying large-scale global dynamics and monitoring the Earth's surface at 250 m or coarser with satellite observations such as MODIS, Envisat MERIS and PROBA-V, whose continuity is ensured by Sentinel-3 [47].
Achieving maximal spatial extent and completeness required up to seven different water body and auxiliary products. The fusion of products that differ in acquisition period and quality can cause inconsistencies in the water body representation, but the high accuracy of the CCI map of open water bodies showed that the adopted methodology overcame this issue. The fusion methodology gave priority to high resolution and minimum water extent mapping, and the consolidation helped remove macroscopic errors and include recent water bodies. However, the major drawback of such an interactive, systematic and very comprehensive consolidation is the lack of repeatability.
It is expected that the joint and systematic use of Sentinel-1 (S1) data acquired every 6-12 days at a 10-20-m spatial resolution [48], Sentinel-2 (S2) 10-m multispectral data acquired every five days (S2-A and S2-B) [49] and synergies between SAR and optical data [50,51] will greatly improve consistency and allow the CCI global map of open water bodies to be updated in the future. The Sentinels' high revisit frequency will provide completeness and reduce manual and time-consuming post-editing by confirming water detection in space and time. A truly global extent at 20-m spatial resolution might be achieved, as the S2 spatial coverage between latitudes 56°S and 84°N [49] could be extended to polar environments with S1 monitoring.

Confidence-Based Stratification
In this study, we proposed a rigorous stratified random sampling designed for the quality assessment of a binary classification where one class is marginally distributed. It is also interesting to evaluate to what extent the random sampling scheme is representative of the correctly and incorrectly classified pixels of the validated product, as it highlights the actual precision of the accuracy indices. Table 6 gives the probability of sampling pixels correctly or incorrectly classified as water or land, according to the class distribution, for the three different random sampling schemes. With simple random sampling, every pixel of the map has an equal probability of being sampled. The two other sampling schemes, which are stratified, first distribute the sample points according to a given value of the map. For the commonly-used class-based stratified sampling scheme, half of the sampled pixels would have been randomly selected inside the water class of the CCI global map of open water bodies, and the other half would fall in the land class. Our proposed sampling scheme used three strata resulting from the combination of independent global water datasets: half of the samples were selected in the error-prone areas (Stratum 3), while the areas with high confidence in correctly mapping land (Stratum 1) and water (Stratum 2) each received one quarter of the samples.
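The three-strata allocation described above can be sketched as follows. The sketch assumes that a pixel falls in Stratum 1 when all independent maps label it land, in Stratum 2 when all label it water, and in Stratum 3 otherwise, and that samples are split 25%/25%/50%. The function names, the input structure and the pixel counts are hypothetical, and the sketch ignores any higher-level stratification used in the study.

```python
import random

# Share of the total sample assigned to each stratum (from the text above).
ALLOCATION = {1: 0.25, 2: 0.25, 3: 0.50}


def assign_stratum(labels):
    """labels: class ("land" or "water") given to one pixel by each map."""
    unique = set(labels)
    if unique == {"land"}:
        return 1  # land agreement
    if unique == {"water"}:
        return 2  # water agreement
    return 3      # discrepancy between maps (error-prone)


def stratified_sample(pixels, n_total, seed=0):
    """pixels: dict of pixel id -> tuple of labels. Returns stratum -> ids."""
    rng = random.Random(seed)
    by_stratum = {1: [], 2: [], 3: []}
    for pixel_id, labels in pixels.items():
        by_stratum[assign_stratum(labels)].append(pixel_id)
    sample = {}
    for stratum, share in ALLOCATION.items():
        # Never request more samples than the stratum contains.
        n = min(round(n_total * share), len(by_stratum[stratum]))
        sample[stratum] = rng.sample(by_stratum[stratum], n)
    return sample


# Hypothetical population: 100 land-agreement, 20 water-agreement, 30 discrepancy pixels.
pixels = {}
pixels.update({f"l{i}": ("land", "land", "land") for i in range(100)})
pixels.update({f"w{i}": ("water", "water", "water") for i in range(20)})
pixels.update({f"d{i}": ("land", "water", "land") for i in range(30)})
sample = stratified_sample(pixels, n_total=40)
```

Although discrepancy pixels form only one fifth of this toy population, they receive half of the 40 samples, which is the intended oversampling of error-prone areas.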
The probability of capturing incorrectly classified pixels in any of the random sampling schemes was low when the overall accuracy was high (OA of 99%). Compared with simple random sampling and class-based stratified random sampling, the probability of 3.5% of sampling cases of water omission allowed us to verify a posteriori that the use of a confidence-based stratification improved the precision of the producer accuracy estimation. Indeed, while the class-based stratified sampling allowed sampling more in the marginal class (50%), it further reduced the probability of detecting water omissions. The stratification using confidence-based strata increased the probability of sampling the two types of incorrectly classified pixels. The distribution of samples between cases of water omission and commission depended on the datasets used in the stratification.
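The reasoning above can be reproduced on a toy population. The sketch below computes the probability that one sample falls in each category (correct land, correct water, water omission, water commission) under the three schemes. The stratum shares and within-stratum proportions are invented for illustration and do not correspond to Table 6; only the 25%/25%/50% allocation and the 50%/50% class-based split come from the text.

```python
CATEGORIES = ("land_ok", "water_ok", "omission", "commission")

# Hypothetical population: (area share of stratum, category proportions within it).
POPULATION = {
    "stratum1": (0.95, {"land_ok": 0.999, "water_ok": 0.0, "omission": 0.001, "commission": 0.0}),
    "stratum2": (0.02, {"land_ok": 0.0, "water_ok": 0.99, "omission": 0.0, "commission": 0.01}),
    "stratum3": (0.03, {"land_ok": 0.60, "water_ok": 0.25, "omission": 0.10, "commission": 0.05}),
}
ALLOCATION = {"stratum1": 0.25, "stratum2": 0.25, "stratum3": 0.50}


def simple_random(pop):
    """Every pixel is equally likely: probabilities equal the area proportions."""
    return {c: sum(share * within[c] for share, within in pop.values()) for c in CATEGORIES}


def class_based(pop):
    """Half of the samples in mapped water, half in mapped land."""
    overall = simple_random(pop)
    mapped_land = overall["land_ok"] + overall["omission"]
    mapped_water = overall["water_ok"] + overall["commission"]
    return {
        "land_ok": 0.5 * overall["land_ok"] / mapped_land,
        "omission": 0.5 * overall["omission"] / mapped_land,
        "water_ok": 0.5 * overall["water_ok"] / mapped_water,
        "commission": 0.5 * overall["commission"] / mapped_water,
    }


def confidence_based(pop, allocation):
    """Samples are distributed over strata according to the fixed allocation."""
    return {c: sum(allocation[name] * within[c] for name, (_, within) in pop.items())
            for c in CATEGORIES}


p_simple = simple_random(POPULATION)
p_class = class_based(POPULATION)
p_conf = confidence_based(POPULATION, ALLOCATION)
```

In this toy population the probability of sampling a water omission is lowest with the class-based scheme, intermediate with simple random sampling and highest with the confidence-based scheme, which is the ordering described above. The exact values depend entirely on the assumed proportions.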
The proposed stratification relied on independent datasets that are not always available. This can be an obstacle for the assessment of individual maps, but in the case of a comparison between products, the area of discrepancy could also be derived from the difference between products. Congalton and Green [52] also suggested that the rarity of one class could be compensated by strata defined by expert-based knowledge. For instance, in the validation of land cover change, they used a buffer surrounding the change mask or land cover classes where change was more likely to occur in order to stratify the sampling of a change/no change map.
The accuracy of the G3WBM, unavailable at the time of the CCI global map of open water bodies compilation, was evaluated with the same validation reference database used in Section 5.1. For the sake of consistency, original classes were grouped as follows: "land", "land (no Landsat observation)", "snow", "wet soil/wet vegetation/lava", "salt marsh" and "temporal flooded area" were merged into one "land" class, while the classes "permanent water", "permanent water (added by SWBD)" and "ocean (given by external land/sea mask)" were merged into one "water" class. G3WBM global accuracy figures weighted by the actual surface of the land and water classes and accuracy figures focused on error-prone areas (Table 7) are compared to the results of Tables 4 and 5, respectively. The global assessment revealed that the OAs were not significantly different between the G3WBM and the CCI global map of open water bodies. However, the types of errors were not evenly distributed in each database. The UA of the G3WBM was larger than that of the CCI global map of open water bodies, while the CCI global map of open water bodies had a larger PA. The CCI water body map minimized the omission errors, while the G3WBM minimized the commission errors. The same conclusion was obtained for the error-prone area, where the CCI map of open water bodies was 1% (not significantly) better. The proportion of correctly-classified pixels using the 234 samples related to temporary water was 93%. The class "permanent water (added by SWBD)" contributed to 16% of these errors.
These results show that two different methods, one based on the harmonization and consolidation of existing water body products and one based on the classification of multi-temporal images, produced water body maps with similarly high accuracies.
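The grouping of the G3WBM classes into a binary land/water legend, as used for the comparison above, amounts to a simple lookup. The sketch below uses the class names as they are quoted in the text; the dictionary and function names are hypothetical, and the numeric codes of the actual G3WBM product are not given here and are therefore not used.

```python
# Class names as quoted in the text, mapped to the binary legend.
G3WBM_TO_BINARY = {
    "land": "land",
    "land (no Landsat observation)": "land",
    "snow": "land",
    "wet soil/wet vegetation/lava": "land",
    "salt marsh": "land",
    "temporal flooded area": "land",
    "permanent water": "water",
    "permanent water (added by SWBD)": "water",
    "ocean (given by external land/sea mask)": "water",
}


def to_binary(g3wbm_class):
    """Return "land" or "water" for a G3WBM class name; unknown names raise KeyError."""
    return G3WBM_TO_BINARY[g3wbm_class]


binary_labels = [to_binary(c) for c in ("snow", "temporal flooded area", "permanent water")]
```

Note that temporarily flooded areas are counted as land in this grouping, so the comparison addresses permanent open water only.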

Permanent versus Temporary Water Bodies
According to the Food and Agriculture Organization Land Cover Classification System (LCCS) [53], identified as the most appropriate land cover classification system [54], non-perennial, i.e., temporary or seasonal, water corresponds to a surface covered with water for less than three months a year.
In the reference validation database, the use of the historical imagery of Google Earth and the interpretation of the context made it possible to record information on temporary water. Yet, the threshold of three months could not be strictly verified because historical imagery regularly spread throughout the year was lacking. However, following photo-interpretation practices that build on the convergence of evidence [40,41], it was possible to identify not only water presence at the time of imaging, but also surfaces that can be seasonally flooded depending on water availability.
Currently, no water body product provides an LCCS-compatible definition of the water status, i.e., temporary or permanent. Yet, water body products, including the SAR-WBI, GIW v1.0 and the GFC-datamask, define water using thresholds on the water detections generated on multi-temporal series of images (Section 2.1).
Differentiating permanent from temporary water bodies could not be achieved in the GIW v1.0, which used the GLS data collection for the 2000 epoch only. Of the 234 validation samples corresponding to temporary water, 11% were mapped as water in the GIW v1.0 (Section 4.2). This issue of water seasonality was already mentioned in [16], and the area of temporary water included in the GIW v1.0 was evaluated as 0.17 million km² [17]. In addition, the number and time spread of the GLS images limited the GIW v1.0 completeness to 91% of the terrestrial surface.
Although both the GFC-datamask and the SAR-WBI relied on multi-temporal images over several years, the GFC-datamask was more representative of permanent water. The reason is that the GFC-datamask definition of water is based on a stricter threshold on the number of water detections in the image time series. Nevertheless, the GFC-datamask missed water bodies created towards the end of the time interval covered by the Landsat data. These water bodies are permanent, but make only a minor contribution to the water frequency in the multi-temporal dataset. For the CCI map of open water bodies, temporary water bodies were seldom classified as permanent because of the consolidation steps adopted to minimize temporary water and account for missing water bodies.
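The effect of a frequency threshold on recently created water bodies can be illustrated with a minimal sketch. It assumes a per-pixel series of water detections and the threshold of at least 50% of water detections mentioned for the GFC-datamask. The function names and the example series are hypothetical and do not reproduce the actual processing of any of the products.

```python
def water_frequency(detections):
    """Fraction of valid observations flagged as water.

    detections: sequence of 1 (water), 0 (not water) or None (no valid observation).
    Returns None when the pixel has no valid observation at all.
    """
    valid = [d for d in detections if d is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


def is_water(detections, threshold=0.5):
    """Label a pixel as water when its water frequency reaches the threshold."""
    freq = water_frequency(detections)
    return freq is not None and freq >= threshold


# Hypothetical 12-observation series.
permanent_lake = [1] * 12               # water in every observation
seasonal_pond = [1, 1, 1] + [0] * 9     # water in a quarter of the observations
new_reservoir = [0] * 8 + [1] * 4       # filled only near the end of the period

labels = [is_water(s) for s in (permanent_lake, seasonal_pond, new_reservoir)]
```

The strict threshold rejects the seasonal pond, as intended, but it also rejects the new reservoir: its water frequency is only one third even though it is permanent from the time of its creation. This is the kind of omission that the consolidation steps of the CCI map were designed to correct.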

Conclusions
A global map of open water bodies was built within the European Space Agency Climate Change Initiative (ESA CCI) through the combination and consolidation of existing nearly global water body and auxiliary datasets. The CCI global map of open water bodies is tailored to the climate modeling community by providing a complete land/inland water and ocean classification for any location of the Earth's surface at 150-m spatial resolution. An inland water fraction, in percent of the 150-m grid cell, is delivered as a separate layer for use within a broader land cover community. Both layers are freely available at: http://maps.elie.ucl.ac.be/CCI/viewer.
The inland water area of the CCI global map of open water bodies was estimated as 3.17 million km² ± 0.24 million km². It is in the range of 3.05-4.57 million km² reported by [44]. Estimations for the GIW v1.0 [16], the MOD44W [13], the G3WBM [17] and reported by [45] were also in agreement with this range.
The CCI global map of open water bodies and its constitutive inputs were thoroughly validated against an independent reference database of 2110 samples spread over all land masses, excluding polar regions. This research proposed an original sampling scheme for a better documentation of product quality and a better differentiation among products. A confidence-based stratified random sampling was developed to avoid undersampling rarely occurring map classes, such as "water". The stratification was based on the a priori confidence of correctly representing map classes as defined by independent water body maps. It resulted in three strata corresponding to land agreement between the maps (Stratum 1), to water agreement between the maps (Stratum 2) and to discrepancies between the maps (Stratum 3). Using all samples, overall accuracy was always very high among all products, between 98% and 100%. The CCI global map of open water bodies provided the best water class representation (F-score of 89%) compared to its constitutive inputs, but it tended to slightly overestimate the water area (user accuracy of 86%). When focusing on the challenging areas for water body mapping (Stratum 3), such as shorelines, lakes and river banks, all products yielded substantially lower accuracy figures, with overall accuracies ranging between 74% and 89%. The producer accuracy of the CCI global map of open water bodies for the water class (75%) was higher than the producer accuracies of its constitutive inputs, which ranged between 23% and 67%. This indicated that the large omissions of its input products were effectively compensated for by their combination. The OA obtained based on the 234 samples corresponding to temporary water was 94% for the CCI global map of open water bodies.
The update and improvement of the CCI global map of open water bodies are foreseen with Sentinel-1 and Sentinel-2, taking full advantage of the synergy between SAR and optical acquisitions with a high revisit frequency, while targeting global coverage. Such a product will fulfill the needs of the broader land cover community and the next generation of climate models at high resolution.

Figure 1. Flowchart outlining the compilation of the CCI global map of open water bodies. RGI, Randolph Glacier Inventory; SCAR-ADD, Scientific Committee on Antarctic Research Antarctic Digital Database; GSHHS, Global Self-consistent, Hierarchical, High-resolution Shoreline.

Figure 2. The CCI global map of open water bodies.

Figure 3. Examples illustrating the complementarity of selected input data sources to the CCI global map of open water bodies: (a) the lack of continuity in the GIW v1.0 river network was complemented by the GFC-datamask; (b) the lack of continuity in the GFC-datamask river network was complemented by the GIW v1.0; (c) artifacts from the Scan Line Corrector-off issue in Landsat ETM+ in the GFC-datamask were compensated with inclusion of water from the GIW v1.0; (d) GIW v1.0 unprocessed or atmospherically-contaminated areas were filled with the GFC-datamask; (e) detections in the SAR-WBI in areas north of 68°N, not included in other water body products; (f) map updating with the SAR-WBI for more recent water bodies.

Figure 5. Location of the 2110 samples (area of 150 m × 150 m) selected for the validation of the water body products. For each sample, the Level-2 stratum is specified (land agreement, water agreement and discrepancies).

Table 2. Characteristics of the water body data products selected to compile the CCI global map of open water bodies.

Table 3. Set of options included in the user tool.

Table 4. Input water body maps and CCI water body map with their accuracy estimates in percent (%) based on an evaluation of 2110 samples.

Table 6. Probability, in percent (%), to sample correctly and incorrectly classified pixels depending on the sampling scheme.