High-resolution measurements of the geometry and dynamics of surface water are essential for many environmental applications, such as flood forecasting and warning, agricultural and urban water management, and simulation of pollutant transport in water bodies [1]. While many applications focus on detailed modeling of individual rivers, lakes, and coastal areas, a consistent, high-resolution (finer than 100 m) dataset on surface water geometry and dynamics at planetary scale is not yet available. An important use of global surface water datasets is in global hydrological and hydraulic models, which in many cases include the fraction of surface water per grid cell in their parameterization (see, for example, [2]). At the same time, these models are moving away from their typical resolution of around 0.5 degrees toward the hyper-resolution domain [3] and include more elaborate routing schemes, thus increasing the need for accurate surface water mapping.
Most methods to detect water from multispectral (satellite) imagery exploit the fact that water strongly absorbs radiation at near-infrared wavelengths and beyond. This makes it easy to detect clear water using a spectral index, such as the Normalized Difference Water Index (NDWI), which represents the slope of the water spectral curve [5]. The Modified Normalized Difference Water Index (MNDWI) [7] appears to be more sensitive because it uses a shortwave infrared band instead of the near-infrared band used in NDWI, resulting in better water feature detection. While detection of clear water appears to be trivial, many factors make it more difficult in real-world situations. Clouds, snow, and ice cause many errors. Additionally, errors of commission (false positive detection of water) can be observed in areas shadowed by topography or by clouds. Methods for cloud, cloud shadow, and snow detection are described in [8]. These methods make use of information on satellite sensor view angle and solar zenith/azimuth and are also used to produce Landsat Surface Reflectance products [10]. These parameters, combined with elevation data, can also be used to detect hill shadows [11]. An alternative approach to exclude clouds and shadows is the use of average reflectance composites instead of instantaneous images [12].
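The two indices discussed above can be sketched in a few lines. This is a minimal illustration, assuming the common green/near-infrared formulation of NDWI and the green/shortwave-infrared formulation of MNDWI; the function names and the reflectance values are hypothetical and are not taken from the study.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b); 0.0 is returned when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndwi(green, nir):
    """NDWI, assuming the green / near-infrared formulation."""
    return normalized_difference(green, nir)


def mndwi(green, swir1):
    """MNDWI: the near-infrared band is replaced by a shortwave infrared band."""
    return normalized_difference(green, swir1)


# Hypothetical surface reflectances (illustrative values only).
water = {"green": 0.06, "nir": 0.02, "swir1": 0.01}
soil = {"green": 0.10, "nir": 0.25, "swir1": 0.30}

for name, px in (("water", water), ("soil", soil)):
    print(name, ndwi(px["green"], px["nir"]), mndwi(px["green"], px["swir1"]))
```

With these illustrative values both indices are positive for the water pixel and negative for the soil pixel, and MNDWI separates the two classes more strongly than NDWI.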
Because water is almost never clear in the real world, its spectral curve changes, making it difficult to separate water pixels from non-water pixels using spectral indices with a single threshold. Typical variations of threshold values for different spectral water indices can be found in [14]. One way to overcome this problem is to derive the threshold from a histogram of all NDWI values in a given area. One such method is Otsu thresholding [15]. This method is very similar to the k-means method applied to the histogram of spectral index values [17].
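Histogram-based threshold selection can be illustrated with a plain implementation of Otsu's criterion, which picks the threshold that maximizes the between-class variance of the two resulting groups. This is a generic sketch rather than the procedure used in the study; the function name, the number of bins, and the sample index values are assumptions.

```python
def otsu_threshold(values, bins=64):
    """Return the threshold that maximizes between-class variance (Otsu)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # all values identical, nothing to separate
    width = (hi - lo) / bins

    # Build a histogram of the index values.
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]

    total = len(values)
    total_sum = sum(c * h for c, h in zip(centers, hist))

    best_score, best_bin = -1.0, 0
    count0, sum0 = 0, 0.0
    for k in range(bins - 1):
        count0 += hist[k]
        sum0 += centers[k] * hist[k]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue
        mean0 = sum0 / count0
        mean1 = (total_sum - sum0) / count1
        # Proportional to the between-class variance for a split after bin k.
        score = count0 * count1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_bin = score, k
    return lo + (best_bin + 1) * width  # upper edge of the last "low" bin


# Hypothetical NDWI values: a large land cluster and a smaller water cluster.
samples = [-0.4] * 50 + [-0.3] * 50 + [0.5] * 20 + [0.6] * 20
threshold = otsu_threshold(samples)
water_pixels = [v for v in samples if v > threshold]
```

When the histogram is clearly bimodal, as in this example, any threshold between the two clusters gives the same score, and the sketch returns the lowest such value.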
Significant improvements in water detection can be achieved by combining optical satellite imagery with elevation data, assuming that most permanent surface water is found in valleys. The use of elevation data to extract a drainage network is also common practice in hydrological applications [18]. A drainage network can be used to construct a Height Above the Nearest Drainage (HAND) map [20], which can be used to separate areas where water can occur from those where surface water is very unlikely. A contoured HAND map can also be used to estimate potentially flooded areas [21], similar to other geomorphological methods [22]. Many efforts to map open water focus on raw information from remote sensing, such as optical multispectral imagery, synthetic-aperture radar (SAR) imagery, or radar altimetry [23]. The combination of remote sensing with existing GIS vector datasets that contain information on water occurrence gets much less attention, even less so when combined with geomorphological information (i.e., HAND). GIS vector datasets are frequently surveyed using high-resolution GPS devices or created by manually digitizing aerial or satellite imagery. As a result, these datasets usually demonstrate much better precision and contain semantic information, such as feature names, types, and other attributes. However, their quality and completeness are not uniform around the globe. One key global vector dataset containing water information is OpenStreetMap (OSM) [24], initiated in 2004 and currently including more than 3 billion objects. Of these, more than 8 million objects relate to water [25]. Many environmental applications use OSM, including the extraction of paved area and surface water coverage for hydrological applications [26].
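The HAND map introduced earlier in this paragraph can be illustrated with a small sketch. It assumes that flow directions and a drainage mask have already been derived from the DEM, and it represents the terrain as a handful of named cells instead of a raster. The cell names, elevations, and the 15 m cut-off are hypothetical.

```python
def hand(elevation, downstream, drainage):
    """Height of each cell above the first drainage cell on its flow path.

    elevation:  dict mapping cell id -> elevation in metres
    downstream: dict mapping cell id -> next cell along the flow direction
    drainage:   set of cell ids that belong to the drainage network
    Cells whose flow path never reaches the drainage network get None.
    """
    result = {}
    for cell in elevation:
        current, visited = cell, set()
        while current is not None and current not in drainage:
            if current in visited:  # guard against loops in the flow graph
                current = None
                break
            visited.add(current)
            current = downstream.get(current)
        result[cell] = None if current is None else elevation[cell] - elevation[current]
    return result


# Hypothetical hillslope: a -> b -> c -> d, where c and d are stream cells.
elevation = {"a": 120.0, "b": 110.0, "c": 100.0, "d": 95.0, "e": 130.0}
downstream = {"a": "b", "b": "c", "c": "d"}
drainage = {"c", "d"}

heights = hand(elevation, downstream, drainage)
# Cells within an assumed 15 m of the drainage are kept as candidate water areas.
candidates = {cell for cell, h in heights.items() if h is not None and h <= 15.0}
```

Thresholding the resulting heights is one simple way to separate valley bottoms, where water can occur, from hillslopes, where it is unlikely.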
The volunteered nature of OSM is the main factor limiting its adoption by GIS professionals [27], which underlines the importance of developing automated methods and tools to validate its quality against other datasets. A good example of how OSM quality can be assessed can be found in [28], which shows how positional differences between linear and polygonal features can be addressed. Furthermore, the “increasing buffer” method [29] can be used to estimate the quality of linear features [24]. An excellent overview of papers focusing on OSM quality can be found in [30]. The latter also suggests using elements of ISO 19113:2002 [31] (such as completeness, errors of commission/omission, or positional differences) to evaluate OSM quality. Unfortunately, to our knowledge, no academic literature focuses solely on the quality of water features in OSM using global or nearly global remote sensing datasets.
The main reason why OSM was chosen in the current study over local, authoritative datasets is that it provides global coverage, even though its local coverage and quality may vary. Additional research would be required to perform a detailed comparison of the datasets presented in this paper with local Australian authoritative datasets, such as Surface Hydrology [32], Water Observations from Space [33], or the 5 m Digital Elevation Model (DEM) of Australia [34].
We derive a high-resolution water mask using Landsat 8 imagery and OpenStreetMap, as well as a (potential) drainage network using 30 m SRTM. Extracting a water mask from OpenStreetMap data is relatively straightforward (Section 2.2), but the other data sources require a specialized workflow. Our approach to deriving a surface water mask from Landsat 8 imagery is described in Section 2.4, including the steps to compute cloud-free average reflectance composites in Section 2.4.1. Additionally, we introduce a new non-parametric unsupervised method to detect water in flat areas (Section 2.4.2). We also propose a supervised classification step to refine the water mask in hilly areas (Section 2.4.3).
We make use of several open geospatial and remote sensing datasets to construct an open water map. Section 2 provides an overview of the main input datasets used in the study, as well as the methods applied or developed to detect water and to compare the resulting water masks. The main input datasets include (1) images acquired by the NASA Landsat 8 mission [35]; (2) a new revision of a nearly-global 30 m DEM, measured by the SRTM mission [36]; and (3) OSM data for the Murray-Darling basin in Australia.
The research demonstrates clear benefits of the use of the new imagery acquired by Landsat 8 to detect water bodies when combined with water masks derived from other sources. We have also found the SRTM to be an excellent complementary dataset enabling improvement of the water mask detection method for hilly areas, after its transformation into HAND.
However, none of the three water masks was found to be perfect regarding positional differences and completeness. The main issue with the water mask derived from Landsat 8 is its limited ability to detect small water features, such as small rivers or man-made canals, and water bodies (partially) covered by riparian or surface water vegetation. The main limitation of the drainage network derived from SRTM is its inability to detect river features in flat terrain. Flat terrain constituted a major part of our study area, although this might not be the case elsewhere. An additional challenge is the presence of high-frequency noise and the relatively poor quality of SRTM near water bodies. The noise can be explained by the radar origin of the dataset.
One logical next step for the present research could be the development of a data fusion algorithm that exploits the benefits of all three datasets. Such development would require the introduction of objective criteria for the confidence of each water mask, depending on topographic and other conditions. Another step might be to perform the same analysis globally. However, a global analysis would require significantly larger computational effort, covering both detection of the water mask and estimation of HAND at 30 m resolution. Additional validation of OSM and its fusion with datasets produced by local governmental agencies (Surface Hydrology, Water Observations from Space, local high-resolution elevation models) will help in harmonizing existing vector and raster water mask datasets.
Possible improvements to the method of water detection might include the use of the panchromatic band and entropy-based methods, in addition to the spectral methods. Additional significant improvements can be achieved through the use of other medium- or high-resolution satellite missions such as Sentinel-2, PlanetLabs, and SkyBox. The use of higher-resolution imagery would allow detection of much smaller (width < 30 m) river features, resulting in improved coverage.
The method of water detection can be easily extended to use Landsat 7 or any other multi-spectral imagery, for example, to generate inter-annual water masks or to study water dynamics.
The proposed method of water detection might face difficulties in areas where too few cloud-free observations are available, for example, in very wet or cold climates. In this case, it might be difficult to determine a correct range of cloud-free percentiles to be used for water detection.
The new method allows detection of a high-resolution water mask using optical multispectral satellite imagery. The resulting water mask, together with the newly-generated 30 m-resolution drainage network derived from SRTM, was compared to the water mask extracted from OSM.
Our results reveal that the best surface water mask should be generated by combining all three datasets analyzed in the current study: water masks extracted from OSM, water masks derived from optical satellite imagery, and, for hilly areas, the drainage network derived from high-resolution digital elevation models.
We found good agreement in positional accuracy between river water features from OSM and water masks derived from Landsat 8. However, only 32% of the total OSM and Landsat 8 water mask area matches when the actual intersecting surface area is analyzed.
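One way to express such an area-based agreement is the ratio of the intersecting area to the combined area of both masks. The sketch below assumes this intersection-over-union reading and equal-area raster cells; the exact definition behind the reported percentage may differ, and the function name and the tiny masks are hypothetical.

```python
def overlap_fraction(mask_a, mask_b):
    """Intersection area divided by union area for two same-shaped binary masks."""
    cells_a = {(r, c) for r, row in enumerate(mask_a) for c, v in enumerate(row) if v}
    cells_b = {(r, c) for r, row in enumerate(mask_b) for c, v in enumerate(row) if v}
    union = cells_a | cells_b
    if not union:
        return 0.0  # neither mask contains water
    return len(cells_a & cells_b) / len(union)


# Hypothetical 2 x 3 rasters: 1 marks a water cell.
osm_mask = [[1, 1, 0],
            [0, 1, 0]]
landsat_mask = [[0, 1, 1],
                [0, 1, 1]]

agreement = overlap_fraction(osm_mask, landsat_mask)
```

A low value of this ratio is compatible with good positional agreement of river centerlines, because narrow or partly detected features contribute little intersecting area.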
The newly generated Landsat 8 water mask reveals many water bodies not previously present in OSM or in any other vector dataset we have explored. A large part of these consists of small agricultural reservoirs located in the northern and southern parts of the catchment.