An Optical and SAR Based Fusion Approach for Mapping Surface Water Dynamics over Mainland China

Earth Observation (EO) data is a critical information source for mapping and monitoring water resources over large inaccessible regions where hydrological in-situ networks are sparse. In this paper, we present a simple yet robust method for fusing optical and Synthetic Aperture Radar (SAR) data for mapping surface water dynamics over mainland China. This method uses a multivariate logistic regression model to estimate monthly surface water extent over a four-year period (2017 to 2020) from the combined use of Sentinel-1, Sentinel-2 and Landsat-8 imagery. Multi-seasonal high-resolution images from the Chinese Gaofen satellites are used as a reference for an independent validation, which shows a high degree of agreement (overall accuracy 94%) across a diversity of climatic and physiographic regions and demonstrates potential scalability beyond China. Through inter-comparison with similar global scale products, this paper further shows how this new mapping technique provides improved spatio-temporal characterization of inland water bodies and better captures smaller water bodies (< 0.81 ha in size). The relevance of the results is discussed, and we find that this new enhanced monitoring approach has the potential to advance the use of Earth observation for water resource management, planning and reporting.


Introduction
The dedicated goal on water in the 2030 Agenda for Sustainable Development has put the spotlight on water policy at the global level and in national planning. It also highlights the need to take action on water resource management in order to meet the challenge of a looming crisis of water scarcity, i.e., a situation where freshwater resources cannot meet standard demands [1]. A recent study from the World Resources Institute (WRI) documented that water scarcity is already real and far more commonplace than previously thought. WRI found that water withdrawals globally have more than doubled since the 1960s due to growing demand, and show no signs of slowing down. Population growth, socioeconomic development and urbanization are all contributing to increased water demand, while climate change induced impacts on precipitation patterns and temperature extremes further reduce water resource availability and predictability [2]. The Sustainable Development Goals (SDGs), especially the goal of 'clean water for all' (SDG 6) and the 'climate action goal' (SDG 13), therefore need particular attention to avoid an accelerating 'water crisis' towards 2030.
A 'water crisis' is ultimately a management issue that can be mitigated through the application of sound policies and initiatives. The need for proper and timely information on

Study Area
China is an ideal test bed for the development of new hydrological monitoring methods because the country hosts several major river systems (Yangtze, Yellow River, Pearl River, Songhua River, and Heilongjiang-Amur River) that span a wide variety of climatic and physiographic conditions, as well as major water infrastructure projects. China is home to roughly 6% of the world's total freshwater resources, but accommodates 20% of the global population [16]. Especially during the past decades, the combination of unevenly distributed water resources, rapid economic development and climate change has pushed China towards increasingly severe water scarcity [17]. To date, the increasing scarcity of water resources has not been effectively managed, with existing laws and regulations usually focusing on principles, but lacking mechanisms and procedures for enforcement [18]. Better and more timely information on surface water dynamics can help decision-makers and practitioners to implement more evidence-based planning, evaluation and compliance monitoring, and can provide statistics for reporting in response to the global water agenda.
Many studies have demonstrated how EO can be applied to investigate various aspects and drivers of surface water changes in China [19][20][21]. In 2014, the first complete picture of inland water bodies in mainland China was published using Landsat imagery acquired over the 2005 to 2008 period [22]. This has been followed by additional studies also using Landsat data to provide a 2015 nationwide map of surface water extent [23] and to characterize long-term changes to inland water bodies [24]. Satellite altimetry data has been used in combination with surface water extent maps from JRC-GSWE to assess the spatial variation of lake and reservoir water storage [25]. Most recently, a 10 m monthly Water Body Dataset was generated from Sentinel-1 imagery, yet due to overestimations caused mainly by snow, the final product was constrained to be within the maximum water extent of JRC-GSWE [26]. To the best of the authors' knowledge, there has been no attempt to use an optical-SAR sensor fusion approach to map the extent and track the recent changes in China's surface water resources.

Background
Surface water mapping is an established application in remote sensing, yet a review of the extensive literature on the subject shows that the approaches vary according to the objectives and scale of the study, the sensor used, and the environmental setting. In the optical domain, a common approach for mapping surface water is image thresholding on spectral bands or water indices such as the Normalized Difference Water Index (NDWI), Normalized Difference Vegetation Index (NDVI), Modified Normalized Difference Water Index (MNDWI) or Automated Water Extraction Index (AWEI) [27][28][29][30][31][32][33][34]. Theoretically, the thresholds for separating water from non-water are universal, but in practice thresholds tend to vary with the regional settings and atmospheric conditions. The thresholding approach therefore tends to be less reproducible, especially over large areas, as it often requires the intervention of experts to determine the optimal threshold. Water indices and thresholding are also impacted by factors such as water turbidity, shadows from terrain, buildings, and clouds, as well as the presence of snow and ice. For instance, the AWEI comes with optimized coefficients for water extraction in situations with shadows and/or other dark surfaces. Yet, in areas with highly reflective surfaces such as ice, snow and reflective roofs, the optimization may lead to misclassifications, and limit the applicability over larger diverse landscapes [27]. Topographic issues, caused by cast shadows from the terrain, can be suppressed by developing masks from Digital Elevation Models, e.g., terrain shadow masking [35] or using the Height Above Nearest Drainage (HAND) index [36], to eliminate commission errors located in areas where water is not expected to accumulate [7,37].
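As a minimal sketch of the index-and-threshold approach described above, the following Python snippet computes NDWI and MNDWI for single pixels and applies a threshold. The reflectance values and the threshold of 0 are illustrative assumptions and are not taken from this study; as noted above, a suitable threshold usually depends on the region and atmospheric conditions.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b); 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndwi(green: float, nir: float) -> float:
    """NDWI: contrasts green and near-infrared reflectance."""
    return normalized_difference(green, nir)


def mndwi(green: float, swir: float) -> float:
    """MNDWI: uses shortwave infrared instead of near-infrared."""
    return normalized_difference(green, swir)


def is_water(index_value: float, threshold: float = 0.0) -> bool:
    """Label a pixel as water when its index exceeds the threshold."""
    return index_value > threshold


# Illustrative reflectances (assumed values, not from the study).
clear_water = {"green": 0.08, "nir": 0.02, "swir": 0.01}
vegetation = {"green": 0.05, "nir": 0.40, "swir": 0.20}

for name, px in (("clear water", clear_water), ("vegetation", vegetation)):
    value = mndwi(px["green"], px["swir"])
    print(name, round(value, 2), is_water(value))
```

Passing a different `threshold` to `is_water` is the point at which expert intervention typically enters such a workflow.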
Machine learning (ML) represents another avenue to optical water detection and may overcome some of the inherent issues with threshold-based approaches. A range of methods exist that can be broadly separated into unsupervised and supervised approaches. Unsupervised algorithms require fewer input parameters and decide themselves how to structure the data, for example through clustering (e.g., k-means) or automatic thresholding (e.g., the Otsu method [38]), with a label being applied afterwards. However, since unsupervised algorithms are self-learning, they depend heavily upon the distribution of the data itself, which can become problematic in certain situations, such as when the data is not naturally bimodal [39]. In contrast, supervised algorithms learn to predict based on labeled input data and are therefore less reliant on the distributions of the target data. A wide range of supervised algorithms has been applied to water classification, including support vector machines [40], decision trees [12], random forests [41] and convolutional neural networks (CNNs) [42]. These algorithms require the labeled data to be applicable across the study area in order to effectively classify unseen data. At larger scales, this can be problematic due to the diversity of water conditions such as turbidity, chlorophyll and aquatic vegetation, which poses a challenge to creating universally applicable training data [43].
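To illustrate automatic thresholding, the sketch below implements the Otsu criterion (maximizing the between-class variance) on a short list of values. The sample values mimic backscatter in dB for a clearly bimodal water/land mixture and are assumptions made for illustration; the function is a simplified exhaustive search over split points, not the implementation used in any of the cited studies.

```python
from typing import Sequence


def otsu_threshold(values: Sequence[float]) -> float:
    """Return the split that maximizes between-class variance.

    Candidate thresholds are midpoints between consecutive distinct
    sorted values; classes are v <= t and v > t.
    """
    data = sorted(values)
    n = len(data)
    total = sum(data)
    best_t, best_var = data[0], -1.0
    cumulative = 0.0
    for i in range(n - 1):
        cumulative += data[i]
        if data[i] == data[i + 1]:
            continue  # a split is only possible between distinct values
        w0 = (i + 1) / n                      # weight of the lower class
        w1 = 1.0 - w0                         # weight of the upper class
        mu0 = cumulative / (i + 1)            # mean of the lower class
        mu1 = (total - cumulative) / (n - i - 1)  # mean of the upper class
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var = between
            best_t = (data[i] + data[i + 1]) / 2
    return best_t


# Assumed bimodal sample: low values (water-like) and high values (land-like).
sample = [-23, -22, -21, -22.5, -10, -9, -8, -9.5, -11]
print(otsu_threshold(sample))
```

When the input is not bimodal, the function still returns a threshold, which is exactly why the result can be misleading in such cases.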
Regardless of the methodology, cloud cover remains a core limitation of optical imagery, and achieving cloud-free coverage will often require some form of temporal image compositing [44]. Nonetheless, in areas with persistent cloud cover and/or lack of sunlight, the required compositing period may compromise the dynamics of the phenomenon being studied. For example, Schneider et al. [45] used full annual time series of Landsat 7 and 8 optical imagery, but due to clouds, the authors failed to extract an accurate annual water mask over the Brahmaputra. The generated mask was conservative, only representing low flow conditions because clouds prevented monitoring during the high flow period. Another more direct way to overcome issues with clouds is to use weather and light independent microwave sensors such as Synthetic Aperture Radar (SAR). SAR sensors transmit a microwave signal towards the target and detect the backscattered portion of the signal. The strength of the returned signal can be used to identify different targets; however, it can vary depending on the polarization used to transmit and receive. Typically, cross-polarizations (HV, VH) have been shown to be more informative for volume scatterers, such as forests, whilst co-polarizations (HH, VV) are more effective for surface scatterers, such as water bodies. Sentinel-1 in IW mode operates using the polarizations VH and VV across most parts of the world. Studies support a slight increase in performance of co-polarization (VV) compared to cross-polarization (VH) in land-water classification [37,46]. More generally, the composition of the area in question will influence the choice of polarization for land-water classification. Numerous studies show SAR's effectiveness for water inundation and flood mapping [37,46][47][48][49][50]; however, the separation of land and water based on backscatter values alone is rarely perfect.
Inundated vegetation can cause complex scattering mechanisms and is a frequently reported source of omission errors [51,52]. Similarly, wind and rain may cause omission errors by increasing the water's surface roughness, resulting in higher backscatter coefficients than expected [53]. Conversely, dry bare sand and alluvial sediment often exhibit low backscatter properties, which can be confused with water and lead to commission errors [54,55].
Speckle noise or interference is another undesirable effect, which can result in spurious water detection. Filtering, image segmentation and texture analysis have been widely applied in attempts to overcome this, and to incorporate the assumption of smoothness, i.e., nearby pixels in space or time are more closely related than distant pixels [56,57]. Moreover, the side-looking geometry of SAR imagery combined with topographic relief can create shadow effects that reduce backscatter and distort the image geometry [58].
There are limitations to mapping surface water when relying solely on either optical or SAR-based instruments. Therefore, the synergistic use of optical and SAR data emerges as an interesting alternative, with the potential to take advantage of the individual sensors' strengths, while minimizing their weaknesses. The fusion of optical and radar data is not new. A review of more than one hundred data fusion studies for land use/land cover assessments concluded that the combination of optical and SAR data provided results with higher accuracy than using either of the datasets individually [59]. However, for large-scale operational applications, data fusion approaches have been limited by the lack of systematic SAR data. This situation has now changed following the emergence of the European Copernicus program. In Northern Ireland, Bioresita et al. [60] investigated a decision level Sentinel-1 and -2 data fusion approach for surface water mapping, whereas Market et al. [61] harmonized Sentinel-1 and Landsat time-series into a common water index over a cloud prone region in the lower Mekong. Whilst both these studies showed promise, they were performed on a limited scale, but as demonstrated by Leeuwen et al. [62], there is also a potential for optical-SAR data fusion for larger national scale surface water mapping. Furthermore, continuous improvements in computational capacity, availability of cloud-based platforms, access to global archives of satellite data and the required tools for big data analytics have further incentivized mapping at scale.
Remote Sens. 2021, 13, 1663
Although surface water mapping is an established remote sensing application, an accurate and efficient approach for fusing the latest generation of free and open optical and SAR data is still needed to support surface water mapping at a large scale and to provide operational services. Whilst accuracy should be the foremost concern when choosing a mapping approach, simplicity and ease of implementation are also important to consider, as they increase understanding, maintainability, and potential scalability. At one end of the spectrum are simple, low-compute thresholding approaches; however, their accuracies may be lower compared to more complex learning models (e.g., CNNs), which require greater computational resources and specialized hardware. Whilst both approaches are viable, the relative strengths and weaknesses should be evaluated. Models that rely upon linear distributions are often simpler and generalize well, and therefore do not require high quality training labels. Whilst that, in itself, is a useful attribute, it can also become problematic when the observed phenomenon does not always adhere to that assumption. Land-water classification has some non-linear exceptions such as cloud, shadows, and snow. A more complex nonlinear model can often overcome these issues when provided with good training data, or alternatively a logic-based system can be integrated into the decision-making process of a simpler model, whereby the problematic areas are removed through specific thresholds or basic decision trees. We chose to use a simple logistic regression model, in combination with logic-based masking, to ensure sufficient accuracy, whilst maintaining computational efficiency and simplicity that facilitates analysis and understanding at scale.

Surface Water Mapping
The developed approach for mapping surface water extent (SWE) and dynamics can be described as an optical and SAR-based fusion model (Figure 1). The approach integrates all multi-temporal observations from Sentinel-1, Sentinel-2 and Landsat-8 from 2017 to 2020 to map surface water at monthly intervals across the time-series, and to derive insights into seasonal and inter-annual water dynamics. The full mapping workflow was implemented in Google Earth Engine (GEE), a cloud-computing platform that provides the necessary combination of large-scale computing resources with instant access to planetary scale archives of satellite imagery [63]. All the data is acquired from GEE. The Sentinel-1 dataset on GEE has already been pre-processed, including standard processing routines for orthorectification, radiometric and atmospheric correction.

Optical Data Processing
All Sentinel-2 and Landsat-8 imagery for the study period was acquired as Top-Of-Atmosphere (TOA) Level-1C orthoimage products. Further pre-processing was required to mask clouds, cloud shadow and snow. In the Sentinel-2 imagery, clouds were classified and removed using the s2cloudless machine learning algorithm and snow was removed using Sen2Cor Scene Classification (SCL) [64]. Cloud shadows were masked using information on the solar azimuth, known cloud locations and by applying a cloud shadow algorithm [65]. The Landsat-8 data was resampled to match the resolution of the Sentinels using the nearest neighbor method in order to preserve the input data values. Subsequently, the Landsat data was masked for clouds, cloud shadows and snow using the bit-mapped values of the C Function of Mask (CFMask) algorithm [66,67]. In mountainous terrain, Sentinel-2 and Landsat-8 both require topographic (i.e., terrain) cast shadows to be removed, which was achieved using the solar azimuth and the GLO-30 Digital Elevation Model (DEM) [68]. A final terrain mask was applied to remove pixels with low light illumination using a hill shadow algorithm.
The Sentinel-2 time series data was aggregated into monthly composites using the best-available-pixel (BAP) by 'cloudiness', a conservative approach chosen to reduce inaccuracies from imperfect cloud masks. Landsat composites were derived by simply taking the mean of the clear observations. A range of spectral indices with demonstrated sensitivity to water were computed, including NDVI, NDWI, MNDWI and AWEI, and used for model training and prediction. Only the best performing indices were retained.
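The two compositing strategies can be contrasted with a small per-pixel sketch. It assumes that each observation within a month is a pair of an index value and a cloud probability, that BAP selects the observation with the lowest cloud probability, and that an observation counts as clear below a hypothetical cutoff `max_cloud`; none of these details are specified in the text.

```python
from statistics import mean
from typing import List, Optional, Tuple

# (index value, cloud probability in [0, 1]) for one pixel and one month
Observation = Tuple[float, float]


def best_available_pixel(obs: List[Observation]) -> Optional[float]:
    """Return the value of the least cloudy observation, if any."""
    if not obs:
        return None
    return min(obs, key=lambda o: o[1])[0]


def mean_of_clear(obs: List[Observation], max_cloud: float = 0.2) -> Optional[float]:
    """Return the mean value of observations at or below the cloud cutoff."""
    clear = [value for value, cloud in obs if cloud <= max_cloud]
    return mean(clear) if clear else None


# Assumed example: three acquisitions in one month.
month = [(0.5, 0.4), (0.7, 0.05), (0.3, 0.1)]
print(best_available_pixel(month))  # value of the least cloudy acquisition
print(mean_of_clear(month))         # average over the two clear acquisitions
```

Selecting a single observation avoids averaging in values from residual, undetected cloud, which is one way to read the 'conservative' motivation given above.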

SAR Data Processing
We used Level-1 Ground Range Detected (GRD) Sentinel-1 data acquired in Interferometric Wide Swath (IW) mode. Within GEE, the data was accessed and processed to generate a calibrated, ortho-corrected product free of geometric terrain distortion and projected onto the same 10 m spatial grid as the optical data [69]. As the GEE Sentinel-1 dataset does not apply terrain flattening, we used a DEM to mask out areas affected by radar shadow and layover effects. Rather than using spatial filtering, we computed the monthly mean composites of the VV and VH polarizations to reduce some of the inherent speckle effects. Only the polarization with the highest training accuracy was used in the model prediction. With SAR data, it can be difficult to differentiate water from other surfaces with low backscatter, for example sand and airports. To minimize potential commission errors (i.e., from non-water areas with permanently low backscatter), we used an exclusion layer to remove SAR water predictions where the optical water predictions did not confirm water presence within the preceding three-month period.
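The exclusion rule can be sketched for a single pixel as follows. The sketch assumes monthly boolean SAR predictions, monthly optical predictions that are `True`, `False` or `None` (no clear observation), and a window covering the three months before the month in question; whether the current month is included in the window is not stated in the text, so its exclusion here is an assumption.

```python
from typing import List, Optional


def apply_exclusion(
    sar_water: List[bool],
    optical_water: List[Optional[bool]],
    window: int = 3,
) -> List[bool]:
    """Keep a SAR water prediction only if optical data confirmed water
    in at least one of the preceding `window` months."""
    filtered = []
    for month, sar in enumerate(sar_water):
        recent = optical_water[max(0, month - window):month]
        confirmed = any(flag is True for flag in recent)
        filtered.append(bool(sar and confirmed))
    return filtered


# Assumed series: SAR always reports water (e.g., smooth sand), while
# optical data confirms water only in the first and last month.
sar = [True, True, True, True, True, True]
optical = [True, None, None, None, None, True]
print(apply_exclusion(sar, optical))
```

A side effect of such a rule is that the first months of a time series cannot be confirmed unless earlier optical predictions are available.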

Data Fusion
The monthly composites were used as inputs to a logistic regression model trained using labels from Pickens et al. [12]. These labels were acquired using a random sampling approach across all months in 2019 and were sampled across five different strata defined using the RESOLVE ecoregions dataset (Figure 1) [70]. Stratification improved performance across the complex biogeography of China, from the arid Tibetan plateau to the tropical and subtropical regions of China.
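For readers unfamiliar with logistic regression, the following self-contained sketch fits a two-feature model by batch gradient descent and returns a water probability. The features (an MNDWI value and VV backscatter in dB divided by 10), the toy training samples, the learning rate and the number of epochs are all assumptions chosen for illustration; the actual model was trained in GEE on the labels described above.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Water probability for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(
    X: List[Sequence[float]], y: List[int], lr: float = 0.5, epochs: int = 2000
) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy samples (assumed): [MNDWI, VV backscatter in dB / 10], label 1 = water.
X = [[0.6, -2.2], [0.5, -2.0], [0.7, -2.4], [0.4, -1.9],
     [-0.3, -0.9], [-0.4, -1.1], [-0.2, -0.8], [-0.1, -1.2]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_logistic(X, y)
print(round(predict_proba(w, b, [0.55, -2.1]), 3))   # water-like pixel
print(round(predict_proba(w, b, [-0.25, -1.0]), 3))  # land-like pixel
```

The output is a probability rather than a hard label, which is what allows the later thresholding step to be tuned locally.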
One logistic regression model was created for each of the optical-SAR sensor combinations and assigned a priority score as follows: (1) Sentinel-1, (2) Landsat-8, (3) Landsat-8 and Sentinel-1, (4) Sentinel-2, and (5) Sentinel-2 and Sentinel-1. The scoring dictated the model priority when compositing the logistic regression outputs into a single water probability score. The scores were determined based upon the model accuracies during training and a priority in spatial resolution, i.e., a preference for Sentinel-2 over Landsat. This fusion approach enables all-weather capabilities.
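A possible reading of the priority compositing, for one pixel and one month, is sketched below. It assumes that each model either returns a probability or `None` when its input sensors have no valid observation, and that the output is taken from the available model with the highest score; the dictionary keys are hypothetical labels for the five combinations listed above.

```python
from typing import Dict, Optional, Tuple

# Priority scores as listed in the text; a higher score takes precedence.
MODEL_PRIORITY: Dict[str, int] = {
    "S1": 1,
    "L8": 2,
    "L8+S1": 3,
    "S2": 4,
    "S2+S1": 5,
}


def composite(
    probabilities: Dict[str, Optional[float]]
) -> Optional[Tuple[str, float]]:
    """Return (model name, probability) of the highest-priority model
    that produced a prediction, or None if no model did."""
    available = {k: p for k, p in probabilities.items() if p is not None}
    if not available:
        return None
    best = max(available, key=lambda name: MODEL_PRIORITY[name])
    return best, available[best]


# Clear month: all sensors contribute, so the combined S2+S1 model wins.
print(composite({"S1": 0.9, "L8": 0.4, "L8+S1": 0.5, "S2": 0.2, "S2+S1": 0.3}))
# Cloudy month: only the SAR-only model has valid input.
print(composite({"S1": 0.8, "L8": None, "L8+S1": None, "S2": None, "S2+S1": None}))
```

The cloudy-month case shows how the Sentinel-1 model acts as a fallback, which is the basis of the all-weather capability.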
The resulting probability was returned in geographic projection (EPSG:4326) and thresholded to produce a binary water/non-water classification. A bitmask of the sensor combinations was also returned. The algorithm has been prepared to ensure applicability over large areas spanning many different eco-regions, but the inherent bitmask and probability outputs ensure flexibility and adaptability for local optimization through the selection of improved thresholds.
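The two outputs can be illustrated with a short sketch. The bit assignment for the sensors and the default threshold of 0.5 are hypothetical, since the text specifies neither; the example only shows how a bitmask records which sensors contributed and how a locally chosen threshold changes the binary label.

```python
from typing import Iterable, List

# Hypothetical bit assignment: one bit per sensor.
SENSOR_BITS = {"S1": 1, "L8": 2, "S2": 4}


def encode_sensors(sensors: Iterable[str]) -> int:
    """Pack the contributing sensors into a single integer bitmask."""
    mask = 0
    for sensor in sensors:
        mask |= SENSOR_BITS[sensor]
    return mask


def decode_sensors(mask: int) -> List[str]:
    """Unpack a bitmask into the list of contributing sensors."""
    return [name for name, bit in SENSOR_BITS.items() if mask & bit]


def classify(probability: float, threshold: float = 0.5) -> bool:
    """Convert a water probability into a binary water label."""
    return probability >= threshold


mask = encode_sensors(["S2", "S1"])
print(mask, decode_sensors(mask))
print(classify(0.3), classify(0.3, threshold=0.2))
```

Keeping the probability and bitmask alongside the binary map means a user can re-threshold a region without rerunning the models.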
Lastly, a 'post-processing' stage was completed whereby a maximum surface water extent mask was created from the Sentinel-2 imagery across the entire time-series. This mask is used to remove commission errors in the Landsat predictions, caused by less accurate cloud and cloud shadow masking, as well as some residual noise in the Sentinel-1 predictions. Additionally, urban areas can cause confusion in both optical and SAR imagery, for example misclassification of roads, warehouses, and solar PV systems. These areas were removed using the Global Human Settlement Layer (GHSL) [71], which was modified to retain water occurrence in urban areas by using summer (shadow free) SWE predictions only.
The algorithm is referred to as GRAS-Surface Water Dynamics (GRAS-SWD).

Validation
An independent reference dataset was created from the classification and interpretation of imagery from the China High-resolution Earth Observation System (CHEOS). Five sample locations, each 20 × 20 km, were selected and distributed across the major biomes of mainland China (cf. Figure 2). For each site, we obtained imagery acquired by various Gaofen (GF) missions (GF-1, GF-2 and GF-6), with pixel sizes ranging from 1 to 2 m (panchromatic) and from 3.24 to 8 m (multi-spectral). All images were from 2019 and chosen to ensure seasonal representation. The Gaofen images were orthorectified and the geolocation was visually inspected in order to ensure that it matched the Sentinel-2 data. The water extent was demarcated and digitized by visual interpretation of the panchromatic and multispectral bands with a minimum mapping unit of 10 m × 10 m. The defined water extents were confirmed by a second interpreter in order to ensure high quality. In addition, frozen water (i.e., ice) was classified separately in the winter images, enabling an evaluation of its impact upon the classification.
After classification, stratified random sampling was used to generate reference points within three strata across the land-water continuum: permanent water, seasonal water, and non-water. The three strata were derived from the GRAS-SWD multi-annual (2017-2020) water occurrence and defined according to the following thresholds: permanent water > 90%; 0% < seasonal water < 90%; and non-water = 0%. In total, over 13,000 points were randomly sampled across the strata using disproportionate sampling to ensure enough samples in classes with less areal coverage. At each sample location, the majority class from the reference image was extracted from the overlapping 10 m × 10 m grid cell and compared to the corresponding monthly product from GRAS-SWD, under the assumption that SWE is relatively stable at monthly intervals. Accuracy assessments are reported as percentages of pixels correctly classified using the metrics overall accuracy (OA), user's accuracy (UA) and producer's accuracy (PA).
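The stratification thresholds and the three accuracy metrics can be written out as a short sketch. The sample labels are invented for illustration. Note that the stated thresholds leave an occurrence of exactly 90% unassigned; the sketch places it in the seasonal stratum, which is an assumption.

```python
from typing import Dict, Sequence


def stratum(occurrence_pct: float) -> str:
    """Assign a stratum from multi-annual water occurrence (percent)."""
    if occurrence_pct > 90:
        return "permanent"
    if occurrence_pct > 0:
        return "seasonal"  # includes exactly 90% (assumption)
    return "non-water"


def accuracy_metrics(
    reference: Sequence[bool], predicted: Sequence[bool]
) -> Dict[str, float]:
    """Overall accuracy plus user's and producer's accuracy for water."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # water, mapped as water
    fp = sum(1 for r, p in pairs if not r and p)      # commission error
    fn = sum(1 for r, p in pairs if r and not p)      # omission error
    tn = sum(1 for r, p in pairs if not r and not p)  # land, mapped as land
    return {
        "OA": (tp + tn) / len(pairs),
        "UA_water": tp / (tp + fp) if tp + fp else float("nan"),
        "PA_water": tp / (tp + fn) if tp + fn else float("nan"),
    }


# Invented sample: 4 reference water points and 6 non-water points.
reference = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]
print(stratum(95), stratum(40), stratum(0))
print(accuracy_metrics(reference, predicted))
```

UA reflects commission errors (mapped water that is not water in the reference), whereas PA reflects omission errors (reference water that was missed).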

Product Intercomparison
In order to further evaluate the extent to which this new mapping approach contributed new information, we compared GRAS-SWD results with two global surface water products, JRC-GSWE and GLAD-GSWD. The JRC-GSWE product is readily available as a monthly binary product, whereas the GLAD-GSWD product is available as a monthly water frequency; therefore, a threshold of 50% was used to convert it to a binary product. For each product, the total area of permanent and seasonal water was calculated and evaluated across the time series, using the definition described in Section 3.3. Furthermore, various locations characterized by specific hydrological features (i.e., Poyang Lake, Miyun reservoir, Hunan province and an aquaculture area located near Haizhou bay) were identified and visually compared to demonstrate the improved detail of a 10 m surface water product, relative to the 30 m products.
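The conversion and the area calculation amount to simple arithmetic, sketched below under stated assumptions: frequencies are given in percent, a value at or above the threshold counts as water (whether the comparison is inclusive is not stated in the text), and pixels are treated as squares of the nominal pixel size.

```python
from typing import List, Sequence


def to_binary(frequency_pct: Sequence[float], threshold: float = 50.0) -> List[bool]:
    """Convert monthly water frequencies (percent) to binary water labels."""
    return [value >= threshold for value in frequency_pct]


def area_km2(n_pixels: int, pixel_size_m: float) -> float:
    """Area in square kilometers covered by n square pixels."""
    return n_pixels * pixel_size_m ** 2 / 1e6


labels = to_binary([0, 49.9, 50, 100])
print(labels)
print(area_km2(sum(labels), 30))  # area of the water pixels at 30 m
print(area_km2(1000, 10), area_km2(1000, 30))
```

The last line illustrates the difference in footprint between the products: the same number of pixels covers nine times more ground at 30 m than at 10 m.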

Water Occurrence Maps and Dynamics
When looking at the multi-annual water occurrence map, it is clear that surface water resources are unevenly distributed in mainland China (Figure 3). Our results indicate that about 1%-3% of the land surface in mainland China is covered by open inland water (i.e., lakes, rivers and reservoirs) compared to a global average of around 3%-4% [11]. Still, large regional disparities exist, with the majority of surface water in the south-east and south-west. The southeastern region is the most densely populated region and the water extent and dynamics are largely controlled by anthropogenic factors [72]. The southwestern part of China, dominated by the Tibetan Plateau, is a vast, sparsely populated region where the water landscape and its dynamics are controlled by natural forces, in particular stream runoff from glacial meltwater [22]. The current four-year time series is too short to draw any firm conclusions about long-term water transitions. However, the trends still provide useful illustrations of intra- and inter-annual water dynamics.
Poyang Lake, located in the northern part of Jiangxi Province, is the largest freshwater body in China. It receives its water primarily from five major rivers (i.e., the Ganjiang, Xinjiang, Raohe, Xiuhe, and Fuhe Rivers), and discharges into the Yangtze River at Hukou at its northern end. The lake resides in a subtropical monsoon region with a wet season from April to September, and a dry season from October to March. The temporal variations in the surface water area reflect this seasonality, and high intra-annual fluctuations create an extensive wetland around the lake, providing vital wintering habitats for migratory birds, many of which are critically endangered [73]. Nonetheless, the annual fluctuations are also a concern, especially in recent years, when record high water levels caused damaging floods in July 2017, 2019 and 2020 [74][75][76][77]. These events are all clearly captured in the temporal development profile of the surface water area for Poyang Lake (cf. Figure 4).
Surface water area also responds to human manipulations. Miyun Reservoir, located just northeast of Beijing, is one of the largest water protection areas in the world [78]. The reservoir covers an area of 180 square kilometers, with a reservoir capacity of 4 billion cubic meters and an average depth of 30 m, making it the largest drinking-water supply for over 20 million inhabitants of Beijing [79]. Since water began to be channeled from central China's Hubei Province to the capital in December 2014 via the south-to-north water diversion project, the reservoir's water level gradually increased up until 2018 [80]. However, in recent years, the trend has leveled off as Miyun Reservoir has supplied over 4 million cubic meters of water to rivers running dry in its downstream area [81]. The time series consistently reflects these trends (Figure 5).

Classification Accuracy
Table 1 lists the overall classification accuracy for each of the five sites. Overall, accuracies are satisfactory for all sites (>90%), with user's and producer's accuracies generally near or above 90%. The one exception is Daxihaizi during winter. While OA remains high (>95%), water PA is markedly lower (79%) compared to the other locations and months. This may be explained by the fact that the Daxihaizi reservoir is shallow and exhibits seasonal changes controlled by upstream discharge. As a result, the classification may be impacted by land-water mixing at the pixel scale and stronger bottom reflectance from shallow waters. Around two-thirds (68%) of the misclassified water pixels were from the seasonal stratum, and one-third (33%) were mixed pixels containing both land and water according to the validation. Furthermore, a local threshold optimization could offer improvements, since around half (54%) of the incorrectly labeled water pixels have a GRAS-SWD water probability greater than 20%.
The seasonal variation in the accuracies is also apparent in the Yellow River Upstream location, albeit to a lesser extent. Here the performance is reduced by 3 percentage points in winter (cf. month 11) for the overall accuracy (Table 1). This lower accuracy is likely caused by the presence of ice. According to the validation, 75% of the water samples contained some fraction of ice, and yet the accuracy is not substantially diminished. This could be explained by the fact that we did not use a dedicated ice mask, but rather chose to consider ice as water in the binary classification. More generally, we note that the Yellow River validation site was considerably complex, with very limited vegetation cover and wetland characteristics, including many small and shallow water pools, whose extents vary according to upstream discharge and seasonality. This may explain the lower overall accuracies here. The only other site with notable presence of ice is Sanggan (44% of the water samples impacted by ice) and yet this site ranks third in terms of overall accuracy and also performs well at the class level.
Aggregating all the samples across the sites results in an overall accuracy of 94%, with lower overall and per-class accuracies observed in the seasonal water stratum (Table 2). This is to be expected since the seasonal class is the most dynamic. It is also arguably the most important stratum in which to ensure accuracy, since it encompasses the land-water boundary. The lowest scoring metric appears to be water PA, which may suggest that the total water extent is underestimated. However, the UA is arguably more important, especially when aggregating through time.
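The relationship between OA, UA and PA discussed above can be illustrated with a minimal sketch for a binary water/land confusion matrix. The function name and the sample counts are hypothetical and are not taken from Table 1 or Table 2; the sketch also ignores the area weighting that a stratified sampling design would normally require.

```python
def accuracy_metrics(tp, fp, fn, tn):
    """Overall, user's and producer's accuracy for the water class.

    tp: reference water mapped as water
    fp: reference land mapped as water (commission error)
    fn: reference water mapped as land (omission error)
    tn: reference land mapped as land
    """
    total = tp + fp + fn + tn
    return {
        "OA": (tp + tn) / total,        # share of all samples mapped correctly
        "UA_water": tp / (tp + fp),     # reliability of mapped water
        "PA_water": tp / (tp + fn),     # share of reference water that was found
    }


# Hypothetical counts chosen so that OA stays high while water PA is low,
# the pattern described for a site dominated by land samples.
metrics = accuracy_metrics(tp=79, fp=4, fn=21, tn=896)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, OA remains high because land samples dominate, while the omission of water samples lowers the water PA. This is the same mechanism that allows a high OA to coexist with a low water PA.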

Area and Timeseries Comparisons with JRC-GSWE and GLAD GSWD
The surface water area (km²) over mainland China was calculated for each month from 2017 to 2020 and compared to JRC-GSWE and GLAD-GSWD (Figure 6). Our algorithm detects the highest extent in all months except for May and June, where GLAD-GSWD is higher. The higher area detected by GLAD-GSWD is due to a high number of commission errors observed mainly over urban centers with tall buildings, glaciers (particularly when mixed with debris) and cloud shadow over dark, dense forests [12]. The smallest annual variations in surface water area, which reflect seasonality, are found in GRAS-SWD and the largest in JRC-GSWE. In general, all datasets show lower SWE during the winter months than in summer, which is consistent with precipitation patterns and lower runoff due to the upstream freezing of water bodies and winter precipitation being temporarily stored as snow. JRC-GSWE shows the lowest extent in all months, except for February and March where GLAD-GSWD is lower. In the darkest months (December and January), JRC-GSWE reports as little as 20,000 km², which is less than 15% of the extent mapped by GRAS-SWD. Albeit to a lesser extent, GLAD-GSWD also returns markedly lower areas during winter. This discrepancy is not a true reflection of the surface water extent, but is rather due to the lack of suitable Landsat observations during the winter at higher northern latitudes, where sun angles are low and optical water detection is constrained by light. This inherent issue with Landsat-derived SWE makes it difficult to derive meaningful insights into water dynamics across all months.
Surface water dynamics were investigated through annual comparisons of permanent vs. seasonal water extent. Each algorithm appears to be consistent through time, with small deviations in area, and in the proportion of the total water, from year to year (cf. Figure 7). Furthermore, the permanent water estimates are consistent across the algorithms at approximately 100,000 km². However, there appear to be significant differences in the seasonal water estimates, ranging in area from 200,000 km² to 360,000 km², with JRC-GSWE estimating the smallest area and GLAD-GSWD the largest. We suspect the difference in seasonal water is due to some key methodological differences: JRC-GSWE uses stricter rules, including a sun-angle threshold, a temporal sliding window to reduce cloud shadows, and building shadow removal, whereas GLAD relies upon a machine learning approach to solve these issues. Seasonal water varies between 64.7% and 79.1% of the total annual maximum surface water extent, whilst permanent water varies between 21.1% and 35.3% depending on the algorithm.

Visual Evaluation and Spatial Comparison with JRC-GSWE and GLAD-GSWD
Surface water frequency maps were visually compared over a site dominated by aquaculture and located on the east coast of China near Haizhou Bay (Figure 8). The improved spatial detail of our algorithm is evident when compared to JRC-GSWE and GLAD-GSWD. Whilst it is possible to define the boundaries of the larger aquaculture sites in both JRC-GSWE and GLAD-GSWD, the internal divisions within these are hardly discernable at a 30 m resolution. It is only in the GRAS-SWD algorithm that it is possible to fully appreciate the richness and diversity of the aquaculture sites in this region. This is a significant advantage for counting small water features and estimating their surface water extent.
The issue of capturing small water bodies was explored in more depth, visually and statistically, in Hunan province, located in the subtropical biome in the south of China. The region is therefore relatively cloudy, and the topography is somewhat complex. Comparing the seasonal and permanent water extent shows considerable differences, with significantly more water bodies identified by GRAS-SWD both in terms of numbers and in terms of consistency, as most of the water bodies are classified as 'permanent' (Figure 9). GLAD-GSWD identifies some water features; however, they appear to be identified inconsistently, since they are classified as 'seasonal'. Furthermore, some appear to be misclassifications when compared to the Google Earth base map. JRC-GSWE identifies the fewest, and the large water body in the southwest is only present as a 'seasonal' water feature. These differences may be due to cloud coverage limiting optical-only observations, or to the relatively small size of the water bodies, which makes them difficult to capture at the Landsat resolution.
To illustrate the point further, we extracted permanent water bodies with a size <0.81 ha and counted their numbers within each 10 km² grid cell for the entirety of the Hunan province administrative region (Figure 10). The 0.81 ha size threshold was chosen to be within the range of the smallest size categories in global lake inventories [82] and at the same time to respect the resolution of the evaluated SWE products, i.e., 3 × 3 Landsat 30 m pixels (JRC-GSWE; GLAD-GSWD) vs. 9 × 9 Sentinel 10 m pixels (GRAS-SWD).
We found 632,081 features smaller than 0.81 ha with a total area of 350 km² using our algorithm. We also note that the spatial distribution of these water bodies aligns well with cropland extent, suggesting their possible importance for food production and domestic water supply. In comparison to our counts, JRC-GSWE identified only 0.2% of this number, and GLAD-GSWD even fewer at 0.1%. This significantly lower count is apparent in the figure, where fewer than 10 water bodies per cell were identified in nearly all grid cells by both JRC-GSWE and GLAD-GSWD (Figure 10).
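The size threshold and the counting of small water bodies can be sketched as follows. This is a minimal illustration only: the 4-connectivity rule, the function names and the toy mask are assumptions, and the sketch does not reproduce the processing chain actually used for Figure 10.

```python
from collections import deque

PIXEL_SIZE_M = 10    # Sentinel pixel edge length in meters
MAX_AREA_HA = 0.81   # size threshold from the text (9 x 9 pixels of 10 m)


def component_areas_ha(mask, pixel_size_m=PIXEL_SIZE_M):
    """Areas (ha) of 4-connected water bodies in a binary mask (list of rows)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one water body.
            queue, n_pixels = deque([(r, c)]), 0
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                n_pixels += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(n_pixels * pixel_size_m ** 2 / 10_000)  # m2 to ha
    return areas


def count_small(mask, max_area_ha=MAX_AREA_HA):
    """Number of water bodies strictly smaller than the size threshold."""
    return sum(1 for area in component_areas_ha(mask) if area < max_area_ha)


# Toy permanent-water mask (1 = water), purely illustrative.
toy_mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
print(count_small(toy_mask))

# Mean size implied by the reported totals: 350 km2 over 632,081 features.
mean_area_ha = 350 * 100 / 632_081
print(round(mean_area_ha, 3))
```

The last two lines show that the reported totals imply a mean size of well under 0.1 ha per feature, i.e., only a handful of 10 m pixels and less than a single 30 m Landsat pixel (0.09 ha).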

Discussion
To the authors' knowledge, this is the first assessment of surface water over mainland China at a 10 m resolution using an optical-SAR sensor fusion approach. The results demonstrate that the use of 10 m resolution Sentinel data offers significant improvements and reveals new insights over Landsat-based products at a 30 m resolution, both in terms of spatial detail and in terms of better and more consistent seasonal monitoring of surface water extent.

Accuracy
Validation of the algorithm showed good performance (94% OA) across a diversity of ecoregions, and over landscapes influenced by several of the known challenges for satellite-based surface water mapping, including topography, dense vegetation, clouds, low sun angles, low backscatter land covers, as well as snow and ice.
The effects of topography are manifold, including image distortion, deformation, and impact on signal values. In order to reduce these effects, a DEM is required. Most studies until recently have relied upon the Shuttle Radar Topography Mission (SRTM) DEM [84]. However, there are some known quality issues with the SRTM DEM, including its reference year of 2000 [85]. Our algorithm therefore used a new enhanced DEM (cf. GLO-30 DEM), which we suspect, although this was not directly quantified, helped us achieve high accuracies in mountainous regions (cf. the validation sites in Xijiang and Guxian).
Permanently low backscatter areas (flat impervious areas and dry sandy surfaces) can pose a challenge for SAR-based water detection [86], such as in the Daxihaizi validation site, which is situated in the desert biome. As a mitigation, and to maintain high mapping accuracies in such environments, our algorithm takes a conservative approach by excluding SAR if there is low backscatter and an absence of optical-derived water in the previous three months. Despite this conservative approach, SAR offers the benefit of filling the temporal observation gaps caused by clouds and light restrictions, which is most notably demonstrated by the increased water detection during winter months compared to JRC-GSWE and GLAD-GSWD, and by the ability to accurately map permanent small water bodies in the cloud-prone Hunan province. The cost of using a SAR exclusion mask is the reduced ability of our algorithm to detect short-lived floods in their entirety. Still, as demonstrated in the case of Poyang Lake, the water dynamics, including the timing and extent of floods, are well captured. This is a significant improvement, as many studies limit their investigations to the dry season due to cloud cover [87].
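The SAR exclusion rule described above can be expressed as a simple per-pixel decision. The sketch below is illustrative only: the function name, the input layout and in particular the -20 dB backscatter threshold are assumptions, since no threshold is stated here.

```python
def use_sar(backscatter_db, optical_water_history, low_backscatter_db=-20.0):
    """Decide whether a SAR observation may contribute to the water estimate.

    backscatter_db: SAR backscatter of the pixel for the current month (dB).
    optical_water_history: chronological list of booleans, True where
        optical data detected water in that month.
    low_backscatter_db: hypothetical threshold below which backscatter is
        treated as "low" (assumed value, not taken from the paper).
    """
    is_low = backscatter_db < low_backscatter_db
    # Only the three most recent months are inspected, as in the text.
    recent_optical_water = any(optical_water_history[-3:])
    # SAR is excluded only when both conditions hold: low backscatter
    # and no optical-derived water in the previous three months.
    return not (is_low and not recent_optical_water)


# Dry sandy surface: low backscatter, never seen as water optically.
print(use_sar(-23.0, [False, False, False]))
# Same backscatter, but optical water was detected two months ago.
print(use_sar(-23.0, [False, True, False]))
```

The rule trades sensitivity for reliability: a pixel that floods briefly under persistent cloud cover has no recent optical confirmation, so its SAR signal is ignored, which is consistent with the reduced ability to capture short-lived floods noted above.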
Snow and ice represent a special challenge as they are both part of the hydrological cycle and therefore need specific considerations. Snow for the most part forms on the land surface, whereas ice represents the frozen state of inland water bodies. For that reason, our approach masks snow as part of the pre-processing, and indirectly considers ice as water in the sense that we did not attempt to include a specific ice predictor as part of the classification model. Two of the validation sites (Yellow River Upstream and Sanggan) had a substantial presence of ice during winter and we found that it did not significantly reduce accuracies when considered as water. Classifying ice as water can of course be disputed or unwanted for certain applications, and a future development could be to expand our algorithm to a multi-class classification with separate predictions of water and ice, or explicitly remove ice.
There are multiple other challenges for surface water detection and not all were evaluated in the current study. Our results may include uncertainties due to dense inundated vegetation, which is a known problem for both optical and SAR-based water detection. Nonetheless, inundated vegetation is not expected to be a dominant issue at the scale of mainland China and is therefore unlikely to have a significant impact on our results. Strong winds are also known to influence the SAR signal by roughening the water surface and making the contrast between land and water less discernible. While not directly addressed, we believe the influence of wind to be relatively minor, as we use multitemporal compositing rather than single SAR acquisitions, which are more susceptible to wind effects. In addition, mapping surface water extent in urban areas is very complex from the perspective of both optical and SAR data. For optical images the main challenge is building shadows, whereas SAR data is confounded by multiple other factors, including layover effects from tall buildings and corner reflections resulting in double/triple bounce effects [88]. Our algorithm is designed with large-scale operational monitoring of surface water in mind, and whilst rates of urbanization in China have quadrupled over the past four decades [72], urban areas still represent only a fraction of the landscape. Therefore, our approach handles the urban complexity with a relatively simple masking approach that allows urban waters to be detected within a specific seasonal window when the sun is high in the sky and building shadows are at a minimum. More studies dedicated to mapping and monitoring surface water in urban areas are needed to further resolve this specific challenge (see e.g., Reference [89]).

Temporal and Spatial Evaluation
Temporally, the Sentinel data provides a denser (×3) time-series than Landsat, which enables a better characterization of the seasonal dynamics of the water extent across the entire calendar year. Incorporating this into our algorithm results in two major advances. First, the higher revisit frequency of Sentinel-2 increases the chances of obtaining cloud-free imagery. Second, where Sentinel-2 is still constrained by persistent clouds and/or low-light conditions, our algorithm utilizes cloud- and light-insensitive SAR data from Sentinel-1, enabling surface water detection irrespective of cloud and light conditions. In comparison to the other SWE algorithms, there are significant differences. These are most notable during winter months (November to February), when clouds are more prevalent and light is a limiting factor. During this period, large parts of mainland China are not processed by the JRC-GSWE algorithm, as it includes a threshold for low sun angles. As a result, there is no JRC-GSWE data available for the Daxihaizi validation site in December. A similar exclusion criterion is not used by the GLAD-GSWD algorithm, which nevertheless exhibits substantially lower SWE estimates (compared to GRAS-SWD) during winter, primarily due to the lack of suitable Landsat observations. Our algorithm produces a better SWE estimate due to the increased number of available optical observations and the fusion with SAR imagery. It thereby overcomes the light limitations inherent in passive sensor systems and provides a more reliable monthly SWE estimate throughout the year.
Aggregations through time facilitate the analysis of water dynamics. We defined two classes, permanent water and seasonal water, based upon the frequency of water observations normalized by the total number of valid observations. Our comparisons with JRC-GSWE and GLAD-GSWD show a large amount of agreement for permanent water at approximately 100,000 km². In contrast to permanent water, the variation in seasonal water extent is large, ranging from 200,000 km² (JRC-GSWE) to 360,000 km² (GLAD-GSWD). Pickens et al. [12] suggest that JRC-GSWE underreports seasonal waters because of mischaracterization of the water regime, whilst both sources acknowledge and report lower accuracies for classifying seasonal water [11,12]. The GRAS-SWD estimate of seasonal water falls in between JRC-GSWE and GLAD-GSWD, and a direct comparison suggests our algorithm is better at capturing seasonal dynamics, although more quantitative research is required to confirm this. The seasonal consistency of our algorithm is further underpinned by temporal profiles showing patterns of expected seasonal change in Poyang Lake and Miyun Reservoir. This is also corroborated by other studies using our algorithm to investigate spatio-temporal variation of water resources in the North China Plain [90]. Despite the convincing visual assessment, our algorithm also showed reduced performance in the seasonal stratum compared to the land or water strata. We therefore recommend future research on quantifying the performance of surface water detection algorithms within the seasonal stratum.
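The frequency-based definition of permanent and seasonal water can be sketched as below. The two frequency thresholds (0.75 and 0.25) and the function name are hypothetical placeholders chosen for illustration, since the class boundaries are not restated in this section.

```python
def classify_water_regime(observations, permanent_min=0.75, seasonal_min=0.25):
    """Classify one pixel from its time series of monthly observations.

    observations: list with True (water), False (no water) or None
        (no valid observation, e.g., masked by cloud or snow).
    permanent_min, seasonal_min: assumed frequency thresholds, not the
        values used in the paper.
    """
    valid = [obs for obs in observations if obs is not None]
    if not valid:
        return None  # frequency is undefined without valid observations
    # Water frequency normalized by the number of valid observations.
    frequency = sum(valid) / len(valid)
    if frequency >= permanent_min:
        return "permanent"
    if frequency >= seasonal_min:
        return "seasonal"
    return "land"


# Water in every valid month, with two months lacking valid data.
print(classify_water_regime([True, None, True, True, None, True]))
# Water in half of the valid months.
print(classify_water_regime([True] * 6 + [False] * 6))
```

Normalizing by valid observations rather than by calendar months matters for the comparison above: a pixel with few valid winter observations is not penalized for the missing months, whereas a product with systematic winter gaps can still mischaracterize the regime if the gaps coincide with the dry or wet season.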
Spatially, the 10 m resolution facilitates a better overall delineation of the land-water boundary by improving the detection of smaller inland water bodies and land features within water bodies. In comparison to JRC-GSWE and GLAD-GSWD, we show that most of the small farm ponds in Hunan province cannot be mapped at the scale of Landsat pixels. It may be somewhat surprising that our algorithm, which captures more water bodies than JRC-GSWE or GLAD-GSWD, arrives at almost the same areal estimates at the national scale. However, this is explained by the fact that although small water bodies are far more numerous than larger water bodies, their cumulative extent is only a fraction of the surface water area of large water bodies [22]. Our results highlight the need for time series at higher spatial resolution to more accurately capture the surface water dynamics in China.

Operational Water Monitoring
Over the past decades, human activities and climate change have driven complex physical and ecological changes to China's inland water bodies. One of the most visible effects has been the steady annual increase in reservoirs since the 1970s [22]. Hydrological processes are significantly influenced by dam construction and regulation, and there are valid concerns about the impact large dams may have on the flow regime. It is well known that spaceborne sensors can support monitoring of the state, change and release of large reservoirs [91], and our study verifies this by showing the consistency with which surface water area changes of large reservoirs can be tracked. Furthermore, a recent study demonstrates how our algorithm can be used in conjunction with altimetry and gravity missions to obtain a more complete picture of the spatio-temporal variation of water resources over the North China Plain [90].
Whilst monitoring large dams is important, especially in China, where the greatest rate of increase in storage capacity comes from large reservoirs (>10 km²), the monitoring of smaller reservoirs and lakes (<0.1 km²), which by far outnumber the larger ones [22], is also important. Science and government measures have traditionally focused on studying and monitoring larger water bodies [92]. However, there is growing awareness of the importance of small water bodies for freshwater biodiversity and in delivering ecosystem services, as well as of their increasing exposure to threats from human activities [3]. In China, small waters have provided hydrologic, biogeochemical, ecological, and socioeconomic benefits for thousands of years, but the hydrological regime of small waters is changing, and their management represents a challenge [93]. One of the primary requirements for improving our understanding of small water bodies, specifically with regard to recommended policy actions that protect and integrate them into water management, is reliable monitoring programs [3]. Our study shows that, of the products compared, only the higher resolution Sentinel imagery is capable of capturing and tracking these smaller water bodies.
In the context of climate change, the information and mapping approach presented in this paper is also valuable. For example, the Tibetan Plateau has seen an increase in surface waters caused by higher temperatures and annual precipitation, which has accelerated stream runoff and glacial meltwater [94]. In contrast, Inner Mongolian lakes have shrunk and even vanished in response to an observed warming trend beginning in the 1950s [22]. The ability to consistently map surface water extent over time and at scale, and to compare it to related climatological trends, can improve our understanding of the mechanisms behind, and the extent of, climate change impacts on surface waters.
Whether driven by human or climatic factors, better monitoring of water resources in both space and time is needed to prevent a potential water crisis from unfolding unabated. It is therefore important to consider how this dynamic information is synthesized and operationalized. Monthly deviation measures could help prevent water stress through actionable management of upstream reservoirs and behavioral changes in downstream use. Longer-term monitoring could focus on the annual aggregations of permanent and seasonal water, which have been shown to be consistent through time. Since permanent water represents the minimum water supply available, it reflects potential water scarcity and hence is an important indicator of the water available to ensure basic societal needs for, e.g., irrigation, hydropower, navigation, and domestic usage. Seasonal waters, in contrast, represent the intermittent freshwater supply that is important for recharging groundwater and restocking local man-made water reservoirs, thereby helping to meet the year-round demand. They are also vital for sustaining ecosystem integrity, providing variable river flows to natural river systems and wetlands, which support 40% of all species. If the long-term and seasonal water regime undergoes significant changes, the capacity of wetlands to moderate the effects of extreme droughts and rainfall may be diminished, along with a reduction in habitat and biodiversity and significant ecological losses. We therefore suggest that monthly surface water area, annual permanent and seasonal area, as well as annual counts of small water bodies would be useful monitoring metrics in an operational context.
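One possible form of such a monthly deviation measure is the relative departure of each month's surface water area from the multi-year mean for the same calendar month. The sketch below is only one of several reasonable definitions; the function name, the data layout and the example values are hypothetical.

```python
from statistics import mean


def monthly_deviation(series):
    """Relative deviation of monthly area from the calendar-month mean.

    series: dict mapping (year, month) to surface water area in km2.
    Returns a dict with the same keys and values such as 0.2 for an
    area 20% above the multi-year mean of that calendar month.
    """
    by_month = {}
    for (_, month), area in series.items():
        by_month.setdefault(month, []).append(area)
    baseline = {month: mean(areas) for month, areas in by_month.items()}
    return {
        key: (area - baseline[key[1]]) / baseline[key[1]]
        for key, area in series.items()
    }


# Hypothetical July areas (km2) for a single reservoir.
areas = {(2017, 7): 100.0, (2018, 7): 120.0, (2019, 7): 80.0}
for key, deviation in sorted(monthly_deviation(areas).items()):
    print(key, f"{deviation:+.0%}")
```

Comparing each month against its own calendar-month baseline removes the regular seasonal cycle, so that a low value flags an unusually dry month rather than an ordinary dry season.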

Conclusions
To date, most applications for surface water monitoring have tended to rely on either optical or SAR images, and only a few studies have explored the advantages of a multi-sensor approach. Using independent reference data and comparisons with existing global maps, we demonstrate how the fusion of optical and SAR data brings new, enhanced information about the surface water dynamics in mainland China. We have demonstrated the value of using higher resolution (10 m) and denser time-series to identify previously unmapped water bodies consistently across years and seasons. This improved monitoring capacity has important management and planning implications in China, but also beyond. When world leaders adopted the SDGs, they also pledged to develop and support a global indicator framework needed to measure and report progress on achieving the 17 SDGs and their 169 associated targets. Our mapping approach is consistent with the monitoring requirements and methodological guidelines for SDG indicator 6.6.1 "Change in the extent of water-related ecosystems over time" [95]. With the Sentinel satellite constellation, data availability has continued to improve greatly, and combined with the advances in technical infrastructures for big data analysis, it is now within the reach of countries to implement national EO-based surface water monitoring systems to support more evidence-based planning and management of water resources and SDG reporting. It is worth remembering that the SDG guidelines operate with a 2001-2005 baseline period; hence the historic Landsat records still hold invaluable information. Thus, there is a strong incentive and justification for continued research into the usage of Landsat and its continuation with the Sentinels (and the increasing number of other new sensors) for tracking long-term surface water dynamics.