1. Introduction
The subarctic water cycle has been affected by climate change [
1]. To understand the impact of climate change upon the region, widespread broad-scale monitoring of water dynamics and related hydrogeological phenomena has been conducted using satellite remote sensing [
2,
3]. Among the satellite-observable quantities—terrestrial water storage [
4,
5], snow cover or snow water equivalent [
6], soil moisture [
7], and surface water [
8]—surface water is the factor that directly interacts with human activity and is a key indicator of local and global hydrological cycles [
2].
Spatiotemporally heterogeneous features of surface water created by thermokarst lakes and river networks, which reflect the thawing/freezing of snow (or permafrost) and the related hydrometeorological regime at sub-seasonal to interannual scales, require observation at a high spatiotemporal resolution to understand the regional water dynamics. Detailed surface water maps are important ancillary data for land-surface modeling [
9], soil moisture retrievals [
10], and the understanding of methane emissions [
11]. Various past studies have provided surface water maps at global scales [
12,
13,
14,
15,
16]; however, either spatial or temporal details tend to be lost in such existing maps. Maps with high spatial resolution only provide monthly–annual aggregated values, whereas those with high temporal frequency (e.g., daily) rarely have high spatial resolution.
Monitoring with both spatially and temporally fine resolution using a single satellite sensor is generally challenging, owing to technical and financial limitations [
17]. Progress involving a large constellation of optical microsatellites [
18] may provide spatiotemporally fine-resolution datasets, except in areas frequently obscured by dense cloud cover. However, geometric- and radiometric-calibration activities for each sensor and between sensors are still in progress [
19], and their uncertainty is likely to be too large to precisely observe land-surface properties. In addition, because currently such datasets are not free (i.e., they are available for a charge), users often need to limit data purchases to small areas of interest to save on costs. To avoid such concerns, another solution involves data-fusion approaches [
20,
21] among well-calibrated, open, and free satellite datasets. Traditionally, spatiotemporal data fusion has focused upon the integration of data representing the same land-surface or atmospheric properties (e.g., surface reflectance [
22,
23], evapotranspiration [
24], land-surface temperature [
25], leaf-area index [
26], and precipitable water vapor [
27]) derived from multiple optical or thermal sensors based on physical or quasi-physical models [
22,
23].
However, empirical approaches using machine learning (ML) have been used for a flexible fusion of datasets with highly different features [
28,
29]. The fundamental process involves ML training with matched pairs between different types of data and then using the training results to predict spatially high-resolution but temporally low-resolution data from counterpart data (temporally high-resolution but spatially low-resolution). Suzuki and Matuo [
2] mentioned that integration between sensors with different features, particularly microwave and optical sensors, may offset the shortcomings of each sensor and add value to hydrological monitoring in the subarctic. In such fusions between different spectral domains, the use of sophisticated ML techniques such as convolutional neural networks (CNNs) has been recommended as an attractive approach (e.g., [
21]).
Therefore, we have aimed for data fusion between microwave and optical sensors by combining popular (i.e., random forest) and sophisticated (i.e., pix2pix) ML approaches to obtain open water maps that effectively show subarctic thermokarst lakes with daily frequency. For microwave and optical data, we selected the Advanced Microwave Scanning Radiometer 2 (AMSR2) and Moderate Resolution Imaging Spectroradiometer (MODIS) sensors. AMSR2, which yields passive microwave data, is characterized by a wide swath and high observational frequency (daily), cloud-penetration ability, and coarse spatial resolution (from several to several tens of kilometers, owing to the long microwave wavelength). In contrast, MODIS, which yields optical data, has better spatial resolution (500 m) and is characterized by a relatively low chance (less than daily) of observation of the ground surface owing to cloud interference. Such contrasting features motivated us to apply data fusion between these sets [
30].
The pix2pix method involves a type of conditional generative adversarial network (GAN) that interfaces two deep CNNs (generator and discriminator [
31]) and enables image-to-image translation [
32]. Applications of GAN-based image classification [
33], segmentation [
34], and super-resolution optical imagery [
35] have emerged in the remote sensing community. However, owing to the computational cost and nontrivial setup, the application of this technique for monitoring boreal surface water has barely been explored [
3]. The novelty of our research is that it addresses the apparent gap between such applications of state-of-the-art ML methods from computer science and spatiotemporal data fusion for satellite remote sensing. Through several experiments (including the timing of the water-index thresholding and input-feature selections), we also aim to provide practical knowledge for the application of sophisticated ML to surface water monitoring.
3. Results
3.1. Preliminary Experiments by Random Forest Method
A comparison between the NDWI prediction and the ML-water map (i.e., fusion-then-thresholding vs. thresholding-then-fusion) clearly showed that the former created better water maps for all accuracy criteria and all cases of input data (
Table 4). Direct water-map prediction omitted a large proportion of the water pixels, resulting in a large MB, large RMSE, and low PA. Thus, given the current experimental conditions, water maps should not be predicted directly by ML; rather, they should be derived from the NDWI maps predicted in advance by ML.
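The fusion-then-thresholding order and the map-accuracy criteria can be illustrated with a minimal Python sketch. It assumes the conventional confusion-matrix definitions of producer's accuracy (PA), user's accuracy (UA), and overall accuracy (OA) for the water class; the NDWI values, the reference labels, the threshold of 0.0, and the function names are hypothetical and are not taken from the study.

```python
def threshold_ndwi(ndwi, threshold):
    """Label a pixel as water when its NDWI is at or above the threshold."""
    return [value >= threshold for value in ndwi]


def _percent(numerator, denominator):
    """Return a percentage, or NaN when the denominator is zero."""
    return 100.0 * numerator / denominator if denominator else float("nan")


def water_map_accuracies(reference, predicted):
    """Compute PA, UA, and OA (in %) for the water class of two binary maps."""
    pairs = list(zip(reference, predicted))
    tp = sum(r and p for r, p in pairs)          # water in both maps
    fn = sum(r and not p for r, p in pairs)      # water omitted by the prediction
    fp = sum(p and not r for r, p in pairs)      # water committed by the prediction
    tn = sum(not r and not p for r, p in pairs)  # non-water in both maps
    return {
        "PA": _percent(tp, tp + fn),
        "UA": _percent(tp, tp + fp),
        "OA": _percent(tp + tn, len(pairs)),
    }


# Hypothetical ML-predicted NDWI for ten pixels, thresholded after prediction.
ndwi_predicted = [0.4, 0.3, -0.1, 0.2, -0.3, -0.4, -0.5, -0.2, -0.6, -0.3]
reference = [True, True, True, False, False, False, False, False, False, False]

predicted = threshold_ndwi(ndwi_predicted, threshold=0.0)  # assumed threshold
scores = water_map_accuracies(reference, predicted)
print(scores)
```

Because water covers only a small share of the pixels, OA is dominated by correctly labeled non-water pixels, so PA and UA are the more informative criteria for comparing the two orders of operation.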
Based on NDWI map prediction by ML, the best accuracies (PA, UA, and OA) for the water map were obtained using input-features case 3 (83.9%, 85.4%, and 99.4%, respectively). An overall tendency for the addition of more features to create greater accuracy was observed: case 2 (with six features) was superior to case 1 (with four features), and case 3 (with 23 features) was superior to cases 1 and 2. However, the relative MB and RMSE did not always show better results with more features; for example, the relative RMSE in case 3 (40.7%) was better than that in case 2 (41.7%), but the relative MB was not. This may relate to overfitting of the ML model with the limited number of training samples, suggesting a need to reduce the number of input features by selecting key features.
To explore the key features, we investigated variable importance for the 23 possible features in the random forest data (
Table 5). The most important feature was DOY (sine and cosine), followed by soil temperature at 7–28 cm depth, microwave indices (
NDPI;
FWS18.7, V), volumetric soil water at 100–289-cm depth, total evapotranspiration, etc. To avoid overfitting and reduce the computational cost, only the top 10 features (
DOYsin and
DOYcos;
STL2;
NDPI;
FWS18.7, V;
SWVL4;
E;
FWS36.5, V;
STL4; and
BWI) were selected for further processing.
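The sine and cosine DOY features can be sketched as a cyclic encoding of the day of year. The period of 365.25 days and the function names below are assumptions made for illustration; this section does not state the exact formula used in the study.

```python
import math


def encode_doy(doy, period=365.25):
    """Map a day of year onto the unit circle as (sine, cosine)."""
    angle = 2.0 * math.pi * doy / period
    return math.sin(angle), math.cos(angle)


def encoded_distance(doy_a, doy_b):
    """Euclidean distance between two days in the encoded (sin, cos) space."""
    return math.dist(encode_doy(doy_a), encode_doy(doy_b))


# 1 January and 31 December are adjacent in the encoded space,
# although their raw DOY values differ by 364.
print(encoded_distance(1, 365))
print(encoded_distance(1, 183))
```

Using both components avoids the artificial jump between late December and early January that a raw DOY value would present to the model, and the pair identifies each day uniquely within the year.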
The importance of a variable did not necessarily correspond to its independence from the other variables (
Table 6). Naturally, some variables exhibited very high VIF values because the properties of the land surface (and microwave indices) are closely interlinked.
DOY,
E,
FWS18.7,V, and
SWVL4 had relatively low VIF values among the top 10 features and thus seemed to add value to the ML implementation.
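For reference, the variance inflation factor (VIF) of a feature is conventionally defined as follows; this is the standard textbook definition, and how the values in Table 6 were computed is not shown in this section.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```

Here $R_j^{2}$ is the coefficient of determination obtained by regressing feature $j$ on all the other input features. A value of 1 indicates no linear dependence on the other features, and the value grows without bound as $R_j^{2}$ approaches 1.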
The time-series accuracies of the random forest predictions using the top 10 input features are shown in
Figure 4 and
Figure 5. Both the relative MB (−9.55%) and RMSE (40.3%) of the total water fraction across the study site were better than those of any case in the preliminary experiments (
Table 4), and the temporal mean PA (83.6%), UA (85.4%), and OA (99.4%) of the water maps (
Figure 5) were comparable with those for case 3 (PA = 83.9%, UA = 85.4%, and OA = 99.4%). This confirmed the importance of selecting key features instead of numerous potentially redundant features.
There was an erroneous outlier day (17 May 2013) on which the water fraction was underestimated by ~0.04 and PA was much lower (i.e., large omission of water pixels). According to the original NDWI map for that day (
Figure 6), the site experienced an irregularly large fraction of water surface, probably because of ice-jam flooding in the spring season. Without this outlier, the performance of the first prediction was: relative MB = −3.14% and relative RMSE = 15.1%.
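The sensitivity of the relative MB and RMSE to a single irregular day can be illustrated with a short Python sketch. It assumes that both metrics are normalized by the mean observed water fraction, which is a common convention but is not confirmed by this section. The five daily water fractions are synthetic values chosen only to mimic one day with an underestimate of about 0.04.

```python
import math


def relative_mb_rmse(observed, predicted):
    """Return (relative MB, relative RMSE) in % of the mean observed value."""
    n = len(observed)
    mean_observed = sum(observed) / n
    if mean_observed == 0:
        raise ValueError("mean observed value is zero; relative metrics undefined")
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return 100.0 * mean_bias / mean_observed, 100.0 * rmse / mean_observed


# Synthetic daily water fractions; index 3 plays the role of the flood day.
observed = [0.020, 0.022, 0.021, 0.060, 0.019]
predicted = [0.019, 0.022, 0.020, 0.020, 0.019]

mb_all, rmse_all = relative_mb_rmse(observed, predicted)

# Repeat the evaluation with the irregular day excluded.
keep = [i for i in range(len(observed)) if i != 3]
mb_clean, rmse_clean = relative_mb_rmse(
    [observed[i] for i in keep], [predicted[i] for i in keep]
)
print(mb_all, rmse_all, mb_clean, rmse_clean)
```

Because the squared error weights large deviations heavily, one poorly predicted day can dominate the RMSE of an otherwise well-predicted time series.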
The similarity of monthly climatological and predicted maps (
Figure 6) suggests that seasonal information is the primary control on the predicted NDWI maps. This is consistent with the fact that DOY was the most important feature in the random forest prediction (
Table 5). However, the predicted maps also resembled the original maps (other than the climatological maps), suggesting the importance of secondary features (e.g.,
STL2;
NDPI;
FWS18.7, V) for representing fluctuation or interannual change (other than regular seasonal change).
3.2. Overall Performance of the Algorithm
Based on the results of the preliminary experiment, we trained pix2pix using the top 10 input features to predict bias contained in the first prediction result by the random forest method. The best performance of pix2pix was obtained after 21 epochs of training, from which we created bias-corrected NDWI maps. Although it seemed challenging for pix2pix to further improve the well-tuned random forest results with the limited number of training samples, the final accuracy was somewhat improved (p = 0.14 in the squared error) from MB = −9.55% and RMSE = 40.3% to MB = −8.68% and RMSE = 39.1% with the irregular inundation day (17 May 2013), and from MB = −3.14% and RMSE = 15.1% to MB = −2.43% and RMSE = 14.7% without this outlier. This suggests the potential for this sophisticated ML technique to predict more accurate water maps, including irregular events.
The final NDWI maps predicted for all days during the study period are provided as a
Supplementary Video (S1). In the original maps, numerous pixels were screened as cloud cover, cloud shadow, or snow, as well as implausibly high NDWI pixels (probably caused by the cloud cover and cloud shadow that survived the screening process). Hence, the data-available pixels across the study site constituted only 30.3% on average over the entire study period. The combination of the random forest and pix2pix methods successfully filled the gaps and corrected the errors for each day, resulting in virtually 100% availability of MODIS data (except for the period during which AMSR2 was unavailable).
Based on the final predicted NDWI maps, the time series of the water fraction across the study site was estimated at a fine temporal resolution and compared with that from SWAMPS (
Figure 7). Seasonal change patterns were generally consistent between them; in particular, both products estimated similar water fractions during the spring and autumn. However, during the summer, our product tended to estimate a larger water fraction. Large discrepancies were observed in 2012 and 2013 in particular.
Sentinel 1 and Landsat show several different surface water observation characteristics from our maps (
Figure 8). For instance, Sentinel 1 and Landsat (i.e., JRC maps) could observe small-scale thermokarst lakes, particularly along the right bank of the Lena River because of their superior spatial resolutions. In contrast, MODIS is likely to omit sub-grid-scale (< 500 m) thermokarst lakes.
Sentinel 1 is also relatively unstable in automated surface water extraction, because the backscatter coefficient is affected by various factors, such as roughness of the water surface due to wind, the existence of vegetation, and the incidence angle. Accordingly, we tentatively determined that σ0 < −20.5 dB indicated water, based on visual interpretation. Hence, a large proportion of the surface water along the Lena River and some thermokarst lakes were omitted by thresholding (particularly on 27 July 2017 and 20 August 2017). Based on our visual interpretation of the raw σ0 images, the surface water extent of the Lena River apparently does not tend to reflect dynamic seasonal change. In contrast, MODIS-based water maps showed continuous seasonal changes characterized by a seasonal increase in the water extent in spring to summer and a seasonal decrease in summer to autumn. The seasonal decrease in surface water extent from July to August was also confirmed in the Landsat-based product, implying consistency between the results obtained from optical sensors.
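The tentative backscatter rule (σ0 < −20.5 dB indicates water) amounts to a simple per-pixel threshold, sketched below in Python. Only the threshold value comes from the text; the small σ0 grid and the function names are hypothetical.

```python
WATER_SIGMA0_DB = -20.5  # tentative threshold stated in the text


def water_mask(sigma0_db):
    """Flag pixels whose backscatter (dB) is strictly below the threshold."""
    return [[value < WATER_SIGMA0_DB for value in row] for row in sigma0_db]


def water_fraction(mask):
    """Fraction of pixels flagged as water in a 2-D boolean mask."""
    flat = [flag for row in mask for flag in row]
    return sum(flat) / len(flat)


# Hypothetical 2 x 3 scene of sigma0 values in dB.
sigma0 = [
    [-24.1, -22.0, -12.3],
    [-20.5, -18.7, -9.8],
]
mask = water_mask(sigma0)
print(mask, water_fraction(mask))
```

A fixed threshold of this kind omits any water pixel whose backscatter is raised above the cut-off, for example by wind-induced roughness, which is consistent with the omissions described above.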
Sentinel 1 can also easily distinguish frozen river surfaces from liquid water (e.g., on 19 October 2017), whereas MODIS NDWI could not do so. In the predicted water maps, snow masking could not be applied because of the lack of original cloud-free data. Hence, in early May and late October, when the water surface is likely to freeze, the predicted maps in the absence of snow masking may overestimate surface water. Excluding those differences inevitably arising from the sensor features, MODIS and Sentinel 1 created consistent water maps for relatively large-scale water surfaces (21 June 2017).
Our algorithm created wall-to-wall water maps on a daily basis, whereas JRC maps contain no-data pixels even on a monthly basis, probably because of fewer observation opportunities by Landsat, cloud interruption, and error in the scan-line corrector in Landsat 7 (shown as a stripe no-data pattern in May).
4. Discussion
A combination of the random forest and sophisticated (i.e., pix2pix) ML methods created gap-filled MODIS-like water maps with MB = −2.43% and RMSE = 14.7% (excluding the irregular inundation event). Given that valid pixels in the original MODIS dataset covered only 30.3% of the study site on average over the study period, the substantial improvement in the observation frequency is clearly useful for detailed monitoring of seasonal changes in surface water, particularly during key phenological periods such as foliation and defoliation, river-runoff increase, seasonal permafrost thawing and snow melting, and freezing. The improved temporal frequency enabled wall-to-wall water maps without any no-data pixels to be provided, as compared to the existing JRC water map [
16].
Our preliminary experiments revealed that ML prediction of the NDWI maps followed by thresholding created more accurate water maps than those derived solely from direct ML prediction. Similar to previous research reporting that index maps directly predicted by data fusion were more accurate than those calculated from the reflectance predicted by data fusion (i.e., “index-then-blend” was found to be better than “blend-then-index” [
61]), we also confirmed the robustness of the index-based approach in terms of different aspects of the data-fusion application. The index-based approach can also provide other options, such as flexible post-tuning of the water threshold, depending upon the location or satellite scenes [
62], or relating the index to other hydrological parameters.
The preliminary experiments also revealed that the use of key features in addition to the AMSR2-derived indices improved fusion accuracy. The original 23 input variables closely depend on one another (
Table 6), and some are likely to have duplicated information for ML. High VIF does not necessarily mean that the variable in the ML is of lower importance; however, it may be used as an indicator to omit some redundant variables. Principal component analysis is also promising to make the input variables independent from each other. As previous research has reported [
28], DOY information clearly helped to describe seasonality (
Figure 6;
Table 5), which may be primarily important in characterizing variations in the river and thermokarst lakes associated with the annual cycles of snow (accumulation/melting), water (freezing/melting/ice jamming), and the active permafrost layer (freezing/thawing). However, AMSR2-derived indices (e.g.,
NDPI and
FWS18.7, V) are still necessary for tracking interannual variation and sub-seasonal-scale fluctuation, apart from the regular seasonal cycles. Some meteorological parameters, such as soil temperature at 7–28-cm depth (
STL2), volumetric soil water at 100–289-cm depth (
SWVL4), and total evapotranspiration (
E) are also important ancillary data. The soil layers of
STL2 and
SWVL4 (deeper part of the active layer) may be more important than the others because of the sensitivity of those layers to the hydrological and thermal features of the land surface and the active layer in this region; however, confirming this would require further site-scale investigation.
Contrary to expectations, snowpack in the previous winter and snowmelt in the spring across the basin [
40] were not very useful for predictive purposes (
Table 5). This is likely due to the spatiotemporal aggregation of those data across very large areas (~2,400,000 km² of the entire Lena basin and over several months), suggesting the importance of selecting suitable spatiotemporal scales for input features during the data fusion stage. Furthermore, unlike previous research [
28], the total precipitation was found to be a variable of lower importance, which can be partly attributed to the fact that the reference high-quality MODIS data were inevitably collected from clear-sky days, upon which the rainfall and snowfall were unlikely to be observed (the so-called clear-sky bias). High-resolution microwave data (i.e., synthetic-aperture radar (SAR), such as Sentinel 1 and the Phased Array-type L-band Synthetic Aperture Radar (PALSAR) series) can be promising alternatives to optical data, because they are less affected by cloud cover, although fewer historical SAR data are available than optical data.
Our water maps tended to estimate a greater water fraction than SWAMPS across the study site, particularly during the summer. SWAMPS provides the water fraction at a relatively coarse (25 km) resolution on the basis of unmixing of the passive microwave data (the Special Sensor Microwave/Imager; the Special Sensor Microwave Imager Sounder) [
59], and because it does not focus on local mapping in Siberia, it may omit sub-grid-scale surface water and seasonal changes of the Lena River. Our achieved 500 m resolution enabled extraction of not only the Lena River but also thermokarst lakes (
Figure 6) to some extent. The contrasting density of the thermokarst lakes between the right (1–5%) and the left (0–1%) banks of the Lena River [
63,
64] was consistently depicted by our water maps (e.g., right bank: 2.4%, left bank: 0.1% in the temporal mean water fraction), thus supporting the validity of our water maps.
Given the progress in producing high-resolution water maps [
15,
16], our water maps should be further downscaled in conjunction with high-resolution data archives (e.g., Landsat series [
56]) to extract small-scale surface water data [
3], given that the small-scale thermokarst lakes constitute a nonnegligible proportion of the surface water in the region [
65]. The omission of more detailed surface water was indeed observed by comparison between our result and Sentinel 1 (
Figure 8). The same comparison also revealed that water extraction by MODIS NDWI was more robust than that derived from the simple thresholding of the VV-polarized backscatter signal of C-band SAR. In contrast, freezing and vegetation cover mixed in with the surface water are also likely to affect NDWI, helping to explain the discrepancy in seasonal changes between the MODIS and Sentinel 1 data. Utilizing SAR’s ability to penetrate vegetation cover and distinguish liquid water from the frozen surface may improve the data-fusion accuracy in creating water maps.
Expansion of the study period using other historical passive microwave data (e.g., AMSR and AMSR-E) is also important for future work. Watts et al. [
8] reported a trend of increasing water fraction across wide areas of the subarctic covered by continuous permafrost, as shown by AMSR-E for the period 2003–2010, whereas our more recent water maps (2013–2018) did not exhibit such a trend according to the Mann–Kendall test, in either the annual mean or the annual maximum water fraction across the study site. Creating long data records by expanding our analysis to both the past and the future will provide an in-depth understanding of the surface water dynamics of the region due to climate change.
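The Mann–Kendall trend test can be sketched in a few lines of Python. The version below uses the standard S statistic with a tie-corrected variance and a two-sided normal approximation. The implementation details of the study are not given here, and with only six annual values (2013–2018) the normal approximation is rough, so this is an illustration rather than a reproduction. The two input series are hypothetical.

```python
import math
from collections import Counter


def mann_kendall(values):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test."""
    n = len(values)
    # S counts concordant minus discordant pairs in time order.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis, corrected for tied values.
    ties = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5) - ties) / 18.0
    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return s, z, p_value


# Hypothetical six-year series: one monotonic, one without a clear trend.
s_up, z_up, p_up = mann_kendall([1, 2, 3, 4, 5, 6])
s_flat, z_flat, p_flat = mann_kendall([3, 1, 4, 2, 6, 5])
print(s_up, p_up, s_flat, p_flat)
```

Because the test depends only on the ranks of the values, it makes no assumption about the distribution of the water fraction, which suits short records of annual means or maxima.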
The sophisticated ML (i.e., pix2pix) showed the potential to create more accurate water maps that include irregular events such as ice-jam inundation in the spring season. Although the setup and tuning of such ML techniques can be laborious [
3], they are worth applying to address difficult mapping tasks. Expanding the study period may also contribute to obtaining more matched pairs for the ML training, resulting in a more accurate prediction.