Monitoring of an Indonesian Tropical Wetland by Machine Learning-Based Data Fusion of Passive and Active Microwave Sensors

In this study, a novel data fusion approach was used to monitor the water-body extent in a tropical wetland (Lake Sentarum, Indonesia). Monitoring is required in the region to support the conservation of water resources and biodiversity. The developed approach, random forest database unmixing (RFDBUX), makes use of pixel-based random forest regression to overcome the limitations of the existing lookup-table-based approach (DBUX). The RFDBUX approach with passive microwave data (AMSR2) and active microwave data (PALSAR-2) was used from 2012 to 2017 in order to obtain PALSAR-2-like images with a 100 m spatial resolution and three-day temporal resolution. In addition, a thresholding approach for the obtained PALSAR-2-like backscatter coefficient images provided water body extent maps. The validation revealed that the spatial patterns of the images predicted by RFDBUX are consistent with the original PALSAR-2 backscatter coefficient images (r = 0.94, RMSE = 1.04 on average), and that the temporal pattern of the predicted water body extent can track the wetland dynamics. The PALSAR-2-like images should be a useful basis for further investigation of the hydrological/climatological features of the site, and the proposed approach appears to have the potential for application in other tropical regions worldwide.


Introduction
Lake Sentarum in Kapuas Hulu Regency, West Kalimantan Province, Indonesia, is a well-known seasonal wetland that experiences drastic seasonal changes in its water body. The lake includes the floodplain of the Kapuas River, which is the longest river in Indonesia. During the rainy season, more than 25% of the river water enters the lake. On the other hand, during the dry season, 50% of the lake water exits to the river [1,2]. The lake, therefore, plays an important role as a flood buffer for the Kapuas River [3], and the monitoring of the water-body area is necessary for watershed management and preservation. Lake Sentarum is also known for its biodiversity. It has been designated as a Ramsar Convention site since 1994 and has been part of the Danau Sentarum National Park (DSNP) since 1999. However, there have been forest destruction and forest fires in the area, and the government has continued to lead several development projects in the watershed to catch up with the economic development
A technique to mitigate this trade-off, spatiotemporal fusion (STF) [28], has recently been developed. This technique combines data with a lower spatial resolution but higher observation frequency (hereinafter referred to as lower resolution images) with other data that have a higher spatial resolution but lower observation frequency (higher resolution images) to create a dataset with the best available spatiotemporal resolution (i.e., a dataset having the spatial resolution of the higher resolution images and the observation frequency of the lower resolution images). Examples of STF include the spatial and temporal adaptive reflectance fusion model (STARFM) [29] and its modifications (e.g., ESTARFM) [26]. Another approach is mixed pixel decomposition [30,31], which estimates the composition of multiple land-cover categories in each pixel [32].
Mizuochi et al. [33,34] developed another type of STF called database unmixing (DBUX). One of the drawbacks of other STFs is their inability to predict sudden changes in the land surface and the necessity of having ancillary data, such as land-cover maps. The DBUX approach can overcome these shortcomings by making use of a lookup table (LUT) for the relation between lower resolution and higher resolution images based on historical data. Using this method, Mizuochi et al. monitored the long-term changes of seasonal wetlands in a semi-arid region of northern Namibia by combining AMSR-series microwave radiometer data (for lower resolution data) and Moderate-Resolution Imaging Spectroradiometer (MODIS) optical data (for higher resolution data). They also applied DBUX to Landsat to create a higher spatial resolution dataset.
This method could be effective for monitoring the extent of water bodies, but there is a problem with the application of DBUX to this study region. Because it is a statistical approach, it requires a sufficient amount of historical data to create the LUT, which is the essential aspect of a successful DBUX. A limited amount of input data will cause gaps in the DBUX LUT and result in data fusion failure. Unlike in arid or semi-arid regions, such as northern Namibia, clear-sky data from optical sensors such as MODIS and Landsat are much less available in tropical climate zones, such as Indonesia and other tropical regions worldwide [35,36]. Given its potential for use under cloudy conditions, SAR is more applicable; however, SAR's limited observation opportunities in low-latitude regions remain an issue. Although freely available SAR constellation images (i.e., from Sentinel-1) potentially provide a six-day repeat cycle, the actual temporal resolution depends on the latitude and acquisition modes. In fact, when the availability of the primary conflict-free mode (interferometric wide swath: IW) of Sentinel-1 was checked, it was found that the temporal resolution of the product was 24 or 48 days until 2016 (i.e., before the constellation) and 12 days since 2017 (i.e., after the constellation). As a result, on average, only 18 images per year (or one to two images per month) were available from 2015 to 2017 in the study region for this work.
To address the issue, the DBUX was upgraded for this study by replacing the LUT with a machine-learning algorithm (random forest [37]). As noted above, the shortcoming of DBUX is the possibility of LUT gaps, which are especially likely when insufficient training data are available. The new approach, random forest DBUX (RFDBUX), can overcome this shortcoming by creating relations without setting bins through random forest regression. For the lower resolution images in the STF, images from the Advanced Microwave Scanning Radiometer 2 (AMSR2) were used, which is a successor to the AMSR-E microwave radiometer. Taking into consideration the utility of the HH polarization band, budgetary efficiency, and its capability of penetrating through vegetation canopy [18], the L-band SAR (PALSAR-2) was chosen for the higher resolution images in the STF. Using the RFDBUX prediction of backscatter coefficient images of PALSAR-2, a long-term dataset with 100 m spatial resolution and several-day temporal resolution was created with the potential to track the dynamics of the water extent in Lake Sentarum.
Using the PALSAR-2-like dataset predicted by RFDBUX, water extent maps of this region were created that have the same spatiotemporal resolution as the dataset (i.e., 100 m, several days). The time series of the water extent was evaluated by checking an independent data source (i.e., water gauge and time-lapse camera data) obtained by ground observation.

Materials and Methods
Remote Sens. 2018, 10, x FOR PEER REVIEW 4 of 19 Figure 1 shows the flowchart of the analysis. It consisted of four parts, pre-processing (described in Section 2.2 in detail), training (Section 2.3), cross-validation (Section 2.4), and long-term prediction (Section 2.5).

Study Site and Period
The study area (Figure 2) comprises Lake Sentarum and its surroundings. It is centered at 0°48′ N, 112°09′ E and extends 78 km east-west and 66 km north-south. The area corresponds to x = 589,000 to 667,000 and y = 55,500 to 121,500 in the UTM49N projection, which is the original projection of the PALSAR-2 images. The rainy season lasts from October to June and the dry season lasts from July to September, with annual fluctuations [1]. To obtain ground references, a water gauge sensor (CO-U20-001-03, Onset Computer Corporation, Bourne, MA, USA) and a time-lapse camera (GardenWatchCam, Brinno Inc., Walnut, CA, USA) were installed at the site in March 2017. Considering the AMSR2 availability (since 3 July 2012) and comparability with the ground references (only for a part of 2017), the target period was set from 3 July 2012 (the beginning of the dry season) to 30 October 2017 (the beginning of the rainy season).

Pre-Processing of Satellite Data

AMSR2
The AMSR2 is a microwave radiometer on the GCOM-W1 satellite of the Japan Aerospace Exploration Agency (JAXA). It has similar characteristics to its predecessor (AMSR-E), along with several improvements [38]. It provides brightness temperature data in both ascending and descending orbits in two polarizations (H and V) and at seven frequencies (ranging from 6.925 GHz to 89.0 GHz). The AMSR2 has been applied to the retrievals of precipitation [39], soil moisture [40], and tropospheric water vapor [41]. The Level 3 brightness temperature images at 36.5 GHz were utilized here (available at the GCOM-W1 data providing system, https://gportal.jaxa.jp, last accessed on 18 July 2018) for the entire study period, which totaled 3887 scenes. The images are provided on a 0.1-degree (about 10 km) or 0.25-degree (about 25 km) lattice on a latitude-longitude projection; the 0.1-degree product was chosen. To enhance the surface water on the images, the normalized difference polarization index (NDPI) for each pixel on each scene was computed as follows [42]:
$$\mathrm{NDPI} = \frac{TB_{36.5V} - TB_{36.5H}}{TB_{36.5V} + TB_{36.5H}} \quad (1)$$
where TB_{36.5V} and TB_{36.5H} are the vertical and horizontal brightness temperatures at 36.5 GHz, respectively. The NDPI is sensitive to the wetness of the ground surface; hence, it is useful as a wetness index [43]. Three-day average NDPI maps were produced that included both ascending and descending orbits within the three-day period. This averaging procedure prevented bias between the ascending and descending orbits and gaps in the daily observation data, which can occur every three to five days in this region. Finally, the average NDPI map was resampled at a 100 m interval on the UTM49N projection by the nearest neighbor method to match it with the PALSAR-2 images described in the next section.
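The NDPI pre-processing described above can be sketched as follows. This is only an illustrative sketch, not the authors' code: the function names are hypothetical, the NDPI is written in the usual normalized-difference form (V minus H over V plus H), and the resampling is reduced to an integer-factor nearest-neighbor repeat, whereas the actual processing also involves reprojection from the latitude-longitude lattice to UTM49N.

```python
import numpy as np

def ndpi(tb_v, tb_h):
    """NDPI from vertical and horizontal brightness temperatures (K)."""
    tb_v = np.asarray(tb_v, dtype=float)
    tb_h = np.asarray(tb_h, dtype=float)
    return (tb_v - tb_h) / (tb_v + tb_h)

def three_day_mean(ndpi_stack):
    """Average a (scene, row, col) stack of NDPI maps, ignoring NaN gaps.

    The stack is assumed to hold all ascending and descending scenes
    that fall within one three-day window.
    """
    return np.nanmean(ndpi_stack, axis=0)

def upsample_nearest(coarse, factor):
    """Nearest-neighbor upsampling by an integer factor (simplification:
    no map reprojection is performed here)."""
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)

# Usage with made-up values: two scenes, one of which has a gap.
stack = np.array([[[0.10, np.nan]],
                  [[0.30, 0.20]]])
composite = three_day_mean(stack)          # gap filled by the other scene
fine = upsample_nearest(composite, 2)      # each coarse pixel repeated 2 x 2
```

Averaging with a NaN-aware mean is one simple way to let a valid overpass fill a gap left by another within the same window.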

PALSAR-2
Level 2.1 WBD HH data from PALSAR-2, a SAR on JAXA's ALOS-2 satellite, were used. The Level 2.1 data were orthorectified from Level 1.1 data using a digital elevation model. Nineteen scenes were obtained during the target period (Table 1). The data were provided on a 25 m grid in the UTM49N projection. To suppress speckle noise and the influence of positional error, averaging was applied to 5 × 5 windows and the results were resampled at 100 m intervals. The radar backscatter coefficient (σ⁰) was then computed using Equation (2):
$$\sigma^{0} = 10 \log_{10}\left(DN^{2}\right) + CF \quad (2)$$
where DN is a digital number and CF is a calibration factor. CF was set to −83.0 dB, based on the calibration result of the ALOS-2/PALSAR-2 JAXA standard products [44].
Figure 3 shows a PALSAR-2 image obtained close to the peak of the rainy season, used to make a rough estimate of the possible extent of the seasonal wetlands. The base map (black and white) is the image obtained during the rainy season. The blue area shows the seasonal wetlands, which were delineated by highlighting the pixels on a dry season image (29 August 2016) having σ⁰ values >3 dB brighter than those in the rainy season image (9 May 2016). Among the PALSAR-2 images studied, these images had the smallest (29 August 2016) and largest (9 May 2016) areas of lake water. A threshold of 3 dB was selected based on a visual comparison of these two images.
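As a rough illustration of the PALSAR-2 pre-processing, the sketch below applies a 5 × 5 moving average to the digital numbers, samples every fourth pixel (25 m to 100 m), and then converts to decibels with the calibration factor of −83.0 dB. The function names are hypothetical, and averaging the DN values (rather than power) before conversion is an assumption drawn from the order of the steps in the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

CF_DB = -83.0  # calibration factor stated in the text

def smooth_and_resample(dn, window=5, step=4):
    """5 x 5 moving average ("valid" region only), then every 4th pixel.

    With a 25 m input grid, step=4 yields 100 m spacing.
    """
    windows = sliding_window_view(np.asarray(dn, dtype=float), (window, window))
    smoothed = windows.mean(axis=(2, 3))
    return smoothed[::step, ::step]

def sigma0_db(dn, cf_db=CF_DB):
    """Backscatter coefficient in dB: 10 * log10(DN^2) + CF."""
    dn = np.asarray(dn, dtype=float)
    return 10.0 * np.log10(dn ** 2) + cf_db

# Usage with a made-up constant scene: DN = 1000 gives 60 - 83 = -23 dB.
dn = np.full((9, 9), 1000.0)
sigma0 = sigma0_db(smooth_and_resample(dn))
```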

Training of Random Forest Database Unmixing (RFDBUX)
The DBUX approach integrates (A) images having a higher spatial resolution but lower temporal resolution with (B) other images having a lower spatial resolution but higher temporal resolution. For simplicity, dataset A will be referred to as the 'spatial images' and dataset B as the 'temporal images' in the algorithm. For example, in Mizuochi et al. [33], the MODIS images (500 m spatial resolution and more than several-day temporal resolution due to cloud cover) are the spatial images and the AMSR-series images (25 km spatial resolution and one-day temporal resolution) are the temporal images. In this study, the PALSAR-2 HH σ⁰ images are the spatial images and the AMSR2 NDPI images are the temporal images.
Before prediction, DBUX creates a LUT for the location of each spatial image pixel. Each LUT is a mapping (i.e., a function) that converts the pixel value of a temporal image into the pixel value of a spatial image. Here, this mapping was created empirically based on match-ups in the historical data, where a match-up is a combination of a spatial image and a temporal image captured at a close timing interval (mostly within one or several days). If there are enough match-ups in the historical archive, and if the mapping from the pixel values of spatial images to the pixel values of temporal images is unique, it should be possible to use this LUT to predict a spatial image from a temporal image when a spatial image is not available (Figure 1).
Because the number of available match-ups is finite, there are gaps in the pixel values of the temporal images in the match-ups. The creation of a LUT, therefore, requires an interpolation of the gaps. In the original DBUX [33,34], a bin-average approach was used to fill the gaps. That is, the match-ups were allocated to the bins with a regular interval of pixel values of the temporal images, and then the averages were obtained for each bin, which resulted in a step-like function, with a step at each bin. However, this approach requires enough variety and volume of match-ups to cover all of the bins. If there are not enough match-ups, there can be bins without any corresponding match-ups, resulting in gaps in the LUT. In this study, this can occur because of the lesser availability of PALSAR-2 data than MODIS data in the case of Mizuochi et al. [33].
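The bin-average LUT and the way gaps arise can be illustrated with a small sketch for a single pixel location. The function name, bin edges, and sample values are made up for illustration and do not come from the DBUX source code.

```python
import numpy as np

def bin_average_lut(temporal_vals, spatial_vals, bin_edges):
    """Per-bin mean of spatial pixel values for one pixel location.

    Bins are defined on the temporal-image pixel value. A bin that
    receives no match-up is left as NaN, i.e., a gap in the LUT.
    """
    temporal_vals = np.asarray(temporal_vals, dtype=float)
    spatial_vals = np.asarray(spatial_vals, dtype=float)
    n_bins = len(bin_edges) - 1
    idx = np.digitize(temporal_vals, bin_edges) - 1  # bin index of each match-up
    lut = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            lut[b] = spatial_vals[in_bin].mean()
    return lut

# Three match-ups and four regular bins: the two middle bins stay empty.
edges = np.linspace(0.0, 0.04, 5)
lut = bin_average_lut([0.005, 0.006, 0.035], [-8.0, -10.0, -16.0], edges)
```

With few match-ups, a temporal pixel value that falls into an empty bin has no corresponding spatial value, which is the failure mode described above.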
The DBUX was upgraded by replacing the bin-average approach with the more robust random forest regression interpolation method [37]. Random forest is an algorithm that generates many decision trees and uses the majority or average of the outputs from all of them. In the RFDBUX case, the algorithm generates a regression function that converts the pixel value of a temporal image to the pixel value of a spatial image, both of which come from the match-ups. After conducting feasibility studies of the model's performance, the optimal number of decision trees in a forest was set at 100 and the maximum number of decision tree levels was set at two. Both the DBUX and RFDBUX source codes are available at https://github.com/hmizuochi.
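A minimal sketch of the pixel-based regression is given below. It assumes scikit-learn's `RandomForestRegressor` and reads "100 trees" and "two tree levels" as `n_estimators=100` and `max_depth=2`; the authors' actual implementation is the one in their repository and may differ. The function names and array layout (match-up, row, column) are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def train_pixel_models(temporal_stack, spatial_stack, seed=0):
    """Fit one regressor per pixel from the match-ups.

    Both stacks have shape (n_matchups, rows, cols) and are co-registered.
    """
    _, rows, cols = spatial_stack.shape
    models = np.empty((rows, cols), dtype=object)
    for r in range(rows):
        for c in range(cols):
            rf = RandomForestRegressor(
                n_estimators=100,  # assumed mapping of "100 decision trees"
                max_depth=2,       # assumed mapping of "two tree levels"
                random_state=seed,
            )
            x = temporal_stack[:, r, c].reshape(-1, 1)
            rf.fit(x, spatial_stack[:, r, c])
            models[r, c] = rf
    return models

def predict_image(models, temporal_image):
    """Predict a spatial image from one temporal image, pixel by pixel."""
    rows, cols = temporal_image.shape
    out = np.empty((rows, cols))
    for r in range(rows):
        for c in range(cols):
            out[r, c] = models[r, c].predict([[temporal_image[r, c]]])[0]
    return out
```

Because a regression forest averages leaf means, it returns a value for any input, so no temporal pixel value is left without a prediction. The per-pixel loop is written for clarity, not speed.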

Cross-Validation
The prediction accuracy was assessed by cross-validation. Because there were 19 match-ups, one was excluded and the RFDBUX was trained using the remaining 18. Then, the spatial image was predicted from the temporal image of the excluded match-up, and was compared with the real spatial image in the match-up. This procedure was repeated for each match-up, resulting in 19 cases of comparison between the predicted and real spatial images (i.e., 'leave-one-out' method).
To compare the images, the correlation coefficient and root mean square error (RMSE) between the predicted and real spatial images were computed for each case. The correlation coefficients were evaluated for the entire image and for the possible seasonal wetland area (delineated in Figure 3), which was the area of particular interest. In addition, the correlation coefficient was calculated for two definitions of the variable: the pixel value (HH σ⁰) and the anomaly of the pixel value, where the anomaly is the pixel value minus the average of the pixel values at the same position in all 19 spatial images. The anomaly removes the contrast that depends on topography and land cover, and thus indicates signals of seasonal or inter-annual change.
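The leave-one-out procedure and the three accuracy measures can be sketched as follows. The predictor is passed in as a callable so that the sketch stays independent of any particular regression library; `mean_predictor` is a hypothetical stand-in used only to make the example runnable, not the RFDBUX model.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation between two arrays, flattened."""
    return float(np.corrcoef(np.ravel(a), np.ravel(b))[0, 1])

def rmse(a, b):
    """Root mean square error between two arrays."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

def leave_one_out(temporal_stack, spatial_stack, fit_predict, mask=None):
    """Exclude each match-up in turn, predict it, and score the prediction.

    fit_predict(train_temporal, train_spatial, test_temporal) -> image.
    mask optionally restricts scoring (e.g., to the seasonal wetland area).
    """
    n = spatial_stack.shape[0]
    mean_image = spatial_stack.mean(axis=0)  # average of all spatial images
    if mask is None:
        mask = np.ones(mean_image.shape, dtype=bool)
    results = []
    for k in range(n):
        keep = np.arange(n) != k
        pred = fit_predict(temporal_stack[keep], spatial_stack[keep],
                           temporal_stack[k])
        real = spatial_stack[k]
        results.append({
            "r": pearson_r(pred[mask], real[mask]),
            "r_anomaly": pearson_r((pred - mean_image)[mask],
                                   (real - mean_image)[mask]),
            "rmse": rmse(pred[mask], real[mask]),
        })
    return results

def mean_predictor(train_temporal, train_spatial, test_temporal):
    """Hypothetical stand-in: predicts the mean of the training images."""
    return train_spatial.mean(axis=0)
```

A predictor that ignores the temporal image, such as the stand-in above, can still reach a high correlation on raw pixel values because the static land-cover contrast dominates, which is why the anomaly correlation is the more demanding measure.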
The statistical significance of the correlation coefficients was evaluated by estimating their p-values using the z-transformation. The p-value strongly depends on the degrees of freedom, but because there is spatial autocorrelation in the images, one cannot assume the number of pixels to be the number of degrees of freedom. The autocorrelation was estimated and it was found that it disappeared at about 10 km. This corresponds to about 42 (6 × 7) subregions of 10 km × 10 km in the target area. Thus, the degrees of freedom were assumed to be 42.
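One way to carry out such a significance test is sketched below using the Fisher z-transformation. The function name is hypothetical, and treating the value of 42 as the effective sample size n (so that the standard error is 1/sqrt(n − 3)) is an assumption; the text does not state exactly how the degrees of freedom enter the test.

```python
import math

def fisher_z_p_value(r, n_eff=42):
    """Two-sided p-value for the null hypothesis of zero correlation.

    Uses z = atanh(r) * sqrt(n_eff - 3), which is approximately standard
    normal under the null hypothesis. n_eff is the assumed effective
    sample size after accounting for spatial autocorrelation.
    """
    if abs(r) >= 1.0:
        return 0.0  # atanh diverges; the correlation is perfect
    z = math.atanh(r) * math.sqrt(n_eff - 3)
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))

p_strong = fisher_z_p_value(0.94)  # very small p-value
p_weak = fisher_z_p_value(0.10)    # not significant at the 5% level
```

Using the number of independent subregions rather than the number of pixels makes the test far more conservative, which is consistent with weak anomaly correlations turning out to be insignificant.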

Long-Term Prediction
The RFDBUX approach was used to predict the spatial images for the entire study period (3 July 2012 to 30 October 2017), and the extent of seasonal water was roughly estimated using three thresholds: 2 dB, 2.5 dB, and 3 dB. For each threshold, the area was delineated in which the spatial image obtained near the peak of the dry season (29 August 2016) minus the predicted spatial image exceeded the threshold. The water gauge and time-lapse camera data were obtained from March to August 2017, and were compared with the time series of the extent of seasonal water. The absolute pressure measured by the water gauge sensor was converted to the water level by assuming a hydrostatic condition, in conjunction with the daily average surface air pressure from the ERA-Interim data [45].
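The two steps in this section can be sketched as below: thresholding the difference from the dry-season reference image, and converting gauge pressure to water level under a hydrostatic assumption. The function names are hypothetical, pressures are assumed to be in pascals, and the fresh-water density of 1000 kg/m³ is an assumed value.

```python
import numpy as np

PIXEL_AREA_KM2 = 0.1 * 0.1  # one 100 m x 100 m pixel

def seasonal_water_area(dry_ref_db, predicted_db, threshold_db):
    """Mask and area (km^2) where the dry-season reference is brighter
    than the predicted image by more than threshold_db."""
    water = (np.asarray(dry_ref_db) - np.asarray(predicted_db)) > threshold_db
    return water, float(water.sum()) * PIXEL_AREA_KM2

def water_level_m(p_abs_pa, p_air_pa, rho=1000.0, g=9.80665):
    """Hydrostatic water depth above the sensor: (P_abs - P_air) / (rho * g)."""
    return (p_abs_pa - p_air_pa) / (rho * g)

# Usage with made-up values and the three thresholds from the text.
dry_ref = np.zeros((2, 2))
predicted = np.array([[-1.0, -2.2],
                      [-2.7, -3.5]])
areas = {t: seasonal_water_area(dry_ref, predicted, t)[1] for t in (2.0, 2.5, 3.0)}
```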

Additional Experiments
To investigate the effect of the selection of spatial and temporal images on the prediction result, the following three experiments were conducted. First, Sentinel-1 images were used as supplementary spatial images in addition to PALSAR-2. Then, the day-of-year (DOY) and Global Precipitation Measurement (GPM) data were each used as temporal images instead of AMSR2.

Integrated use of Sentinel-1
In the random forest regression of RFDBUX, increasing the number of spatial images in addition to the original PALSAR-2 images may improve the accuracy. To investigate the feasibility of this approach, Sentinel-1 data were used as the spatial images in RFDBUX along with the original PALSAR-2 images. The same pre-processing as that used for PALSAR-2 (i.e., noise reduction, calculation of the backscatter coefficient, and resampling to 100 m resolution) was applied on the Google Earth Engine [46] to all of the available IW-mode Sentinel-1 data during the study period. Unfortunately, the primary observation mode (i.e., IW) did not include HH polarization, but only VV and VH. Thus, VV polarization was chosen for the Sentinel-1 data because of its slightly better ability to map the surface water than VH [8,9]. However, the differences between PALSAR-2 and Sentinel-1 in the polarizations (i.e., HH and VV), the microwave bands used (i.e., L-band and C-band), and other aspects of the sensor systems or data processing were likely to affect the result. To mitigate this effect, a linear regression analysis was conducted by comparing corresponding pixels that were acquired within three days of each other (randomly sampled, N = 10,000), and the Sentinel-1 images were calibrated to resemble PALSAR-2 images (Figure 4). Then, the calibrated Sentinel-1 (49 scenes) and original PALSAR-2 (19 scenes) images were applied to RFDBUX (vs. AMSR2) and the created dataset was validated using the same cross-validation method described in the previous section (leave-one-out for the 19 PALSAR-2 scenes, and the same hereafter).
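The linear calibration step can be sketched as follows: corresponding pixels are randomly sampled, a first-order fit is made from Sentinel-1 VV to PALSAR-2 HH, and the fit is applied to the Sentinel-1 image. The function names are hypothetical, and a single global slope and intercept is an assumption; the text does not specify whether the regression was fitted globally or per scene pair.

```python
import numpy as np

def fit_calibration(s1_vv_db, palsar_hh_db, n_samples=10000, seed=0):
    """Least-squares line HH = slope * VV + intercept from sampled pixels.

    Both inputs are co-registered images (dB) acquired within a few days
    of each other. Pixels with non-finite values are ignored.
    """
    s1 = np.ravel(np.asarray(s1_vv_db, dtype=float))
    p2 = np.ravel(np.asarray(palsar_hh_db, dtype=float))
    valid = np.isfinite(s1) & np.isfinite(p2)
    s1, p2 = s1[valid], p2[valid]
    rng = np.random.default_rng(seed)
    pick = rng.choice(s1.size, size=min(n_samples, s1.size), replace=False)
    slope, intercept = np.polyfit(s1[pick], p2[pick], deg=1)
    return float(slope), float(intercept)

def calibrate(s1_vv_db, slope, intercept):
    """Convert a Sentinel-1 VV image to a PALSAR-2-like HH image."""
    return slope * np.asarray(s1_vv_db, dtype=float) + intercept
```

A linear fit can only correct an overall gain and offset between the two sensors; land-cover-dependent differences between C-band VV and L-band HH responses remain in the calibrated images.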

Use of DOY
The selection of the temporal images (i.e., selection of an explanatory variable) is an essential aspect of a successful RFDBUX. Although the AMSR2 NDPI was confirmed to be able to detect surface water [24,34], exploring other candidate variables to explain the surface water extent, such as DOY and precipitation, is worthwhile for improving the understanding of STF applications for water monitoring. Thus, images with 100 m pixel spacing were created that show the DOY of the prediction day (i.e., spatially uniform images), and they were used as temporal images for RFDBUX instead of AMSR2. The spatial images were the calibrated Sentinel-1 plus PALSAR-2 images.

Use of precipitation
In addition to the DOY, the spatial pattern of precipitation also relates to the surface water extent. Recent Global Precipitation Measurement (GPM) data with 0.1 deg × 0.1 deg spatial resolution and one-day temporal resolution (GPM IMERG Final Precipitation L3 [47]) were obtained from the National Aeronautics and Space Administration (NASA) Goddard Earth Sciences Data and Information Services Center (GES DISC). The GPM data were resampled to 100 m resolution and applied to RFDBUX as temporal images. Again, the spatial images were the calibrated Sentinel-1 plus PALSAR-2 images.

Results
Table 2 shows the cross-validation results. The correlation coefficients were generally about 0.9 or higher for both the entire area and the possible wetland area, with a high level of statistical significance for all of the dates. The correlation coefficients decreased in the anomaly maps and were sometimes close to 0 or even negative, making more than half of the scenes statistically insignificant. The RMSE values ranged from approximately 1.0 to 2.0 dB in most scenes over the entire area. A lower accuracy (i.e., a lower correlation coefficient and higher RMSE) was obtained when only the seasonal wetland pixels were considered rather than the entire area, although this was not necessarily the case with the anomaly maps. The averaged correlation coefficient and RMSE of the entire area were 0.94 and 1.05 dB, respectively.
Figures 5 and 6, respectively. Visually, the spatial pattern of the lake was predicted in all of the cases. However, some biases were found between the predicted and original maps, especially in the bad cases (e.g., 30 March 2015 and 15 February 2016). The biases do not appear to be temporally systematic (i.e., not always positive or always negative for each pixel), but rather somewhat temporally random in the scenes.
'Source' is the AMSR2 NDPI composite image, 'predicted' is the prediction result of RFDBUX under the leave-one-out condition, 'original' is the PALSAR-2 HH σ⁰ image on that day, and 'pred−orig' is 'predicted' minus 'original'.
Obvious seasonal patterns are evident. A relatively sudden increase in the seasonal wetlands at the beginning of the rainy season (middle of October to the beginning of November) is followed by a gradual decrease until the beginning of the dry season (July). The predicted areas of the seasonal wetlands were consistent with those calculated from the original PALSAR-2 image in most cases. However, on the days when relatively large areas of seasonal wetlands appeared (e.g., 15 February 2016, 28 March 2016 and 9 May 2016), the predicted areas tended to be underestimated, as shown by the colored symbols in Figure 7.
Figure 8 shows a comparison of the in situ water level and the possible seasonal wetland area, in which the water level shows dynamic change. In the rainy season, it was relatively stable at about 3.0 m and then it rapidly dropped to near zero in June. It temporarily increased to as high as 1.5 m in July, dropped again in August, and rose again in September. These changes probably reflect the nonuniform patterns of the rain and surface inflow from the river. These water dynamics were also confirmed by the time-lapse photos (Figure 8). The pattern was substantially well tracked by the extracted possible wetland area, including the drop/recover pattern observed during the dry season (middle of June to August) and the local minimum around DOY 111 (April).
Contrary to expectations, the integrated use of Sentinel-1 degraded all of the accuracy criteria (Table 3). Only one day in the rainy season (8 May 2017) was found for which the correlation coefficient and RMSE of the seasonal wetland improved (data not shown). In addition, Table 3 shows that the use of DOY or GPM (precipitation) instead of AMSR2 slightly improved the correlation coefficient and RMSE, especially in the dry season, but substantially degraded the correlation coefficient of the anomaly maps and the RMSE of the seasonal wetland. The use of precipitation yielded an even worse result than the use of DOY, except for the correlation coefficient of the seasonal wetland in the dry season.
Table 3. Comparison of averaged cross-validation results (for all of the data, for the rainy season, and for the dry season) among the original and experimental RFDBUX. (1) Original RFDBUX, (2) integrated use of Sentinel-1, (3) use of DOY instead of Advanced Microwave Scanning Radiometer 2 (AMSR2), and (4) use of GPM (precipitation) instead of AMSR2. The abbreviations of the accuracy criteria are the same as in Table 2.

Table 3 note: the accuracy criteria are shown as the average of all data; the rainy season's data; the dry season's data.

Discussion
The RFDBUX approach predicted the spatial pattern of the wetlands with r = 0.94 and RMSE = 1.05 dB on average (Table 2), and filled the large temporal gaps in the original PALSAR-2 data. The resultant temporal resolution (three days) probably could not be realized by other STF methods, such as STARFM, because they do not make use of passive microwave data (AMSR2 in this case) integrated with spatially high-resolution data (PALSAR-2 in this case). STARFM and other semi-physical fusion models are limited to the STF of physical values that can be derived from optical or thermal sensors, which are inevitably disturbed by clouds. In fact, a search of Landsat-8 images (as an example of optical images) over Lake Sentarum (path: 120, row: 059) using Google Earth Engine found 18 cloud-free images (defined as less than 50% cloud cover) during the study period. Only three images were cloud-free when the half year (April-November) was considered, which spanned the center of the rainy season. This makes it problematic to observe the temporal change of the wetlands with optical images, especially in the rainy season, and supports the validity of using microwave images. Thus, in regions where cloud cover is an issue, such as tropical climate zones [35], empirical or machine-learning-based fusion models that can utilize microwave data are a promising option.
The RFDBUX approach is likely to provide better gap-filling than DBUX, because DBUX, as noted in Section 2.3, uses the bin-average approach, which creates LUTs that may have gaps [34], especially when the number of match-ups is insufficient. These gaps can cause prediction failures and degrade the temporal resolution of the resultant dataset. In addition, DBUX is somewhat arbitrary in its choice of bin intervals (i.e., how to slice the entire range of values of the temporal images) when creating a LUT. The RFDBUX approach overcomes these problems through machine learning, which provides more robust gap-filling. A quantitative comparison of RFDBUX and DBUX (and other STFs) is an important subject for future study.
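The gap problem of a bin-average LUT can be illustrated with a minimal sketch. This is not the DBUX implementation; the function names, the match-up values, and the bin edges below are all hypothetical and chosen only to show how an empty bin leads to a failed prediction.

```python
from statistics import mean


def build_bin_average_lut(coarse_values, fine_values, bin_edges):
    """Average the fine-resolution value within each bin of the coarse value.

    Returns one entry per bin; a bin without any match-up stays None (a gap).
    """
    n_bins = len(bin_edges) - 1
    members = [[] for _ in range(n_bins)]
    for coarse, fine in zip(coarse_values, fine_values):
        for i in range(n_bins):
            if bin_edges[i] <= coarse < bin_edges[i + 1]:
                members[i].append(fine)
                break
    return [mean(m) if m else None for m in members]


def lookup(lut, bin_edges, coarse):
    """Return the LUT value for a coarse observation, or None if it hits a gap."""
    for i in range(len(lut)):
        if bin_edges[i] <= coarse < bin_edges[i + 1]:
            return lut[i]
    return None


# Hypothetical match-ups for one pixel: coarse index vs. backscatter (dB).
ndpi = [0.01, 0.02, 0.07, 0.08]
sigma0_db = [-8.0, -9.0, -17.0, -18.0]
edges = [0.00, 0.03, 0.06, 0.09]  # arbitrary choice of bin intervals

lut = build_bin_average_lut(ndpi, sigma0_db, edges)
print(lut)                        # the middle bin has no match-up
print(lookup(lut, edges, 0.045))  # an observation in the gap cannot be predicted
```

With few match-ups, narrower bins leave more gaps, while wider bins blur the relationship. A regression model fitted to the same match-ups returns a value for any input, which is the property RFDBUX relies on.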
It was also shown that there is still room for improving the RFDBUX accuracy. The correlation coefficients between the image predicted by RFDBUX and the original image were a little lower when the focus was on the seasonal wetland pixels, and dropped notably with respect to the anomaly maps. Generally speaking, it is natural that a lower correlation coefficient was observed when the focus was only on temporally variable pixels (i.e., seasonal wetlands), because making predictions for these areas is much more difficult than it is for the temporally stable pixels. However, given that seasonal wetland areas are the subject of interest, improving the algorithm for use in studying the wetlands must be considered. The poor accuracy in the anomaly maps can be partly attributed to their low spatial variation rather than being an error of RFDBUX or the satellites used. When the predicted or the original map was similar to the average map of the 19 PALSAR-2 maps, the calculated anomaly values approached zero at all points. Because of the nature of the correlation coefficient (r), these 'no contrast' samples probably yield low r values. However, even taking this effect into account, there still seems to be uncertainty associated with the approach discussed here. One of the potential error sources is the effect of inundated vegetation [14], which may create double-bounce backscatter or volume backscatter [18], thereby increasing the backscatter coefficient and resulting in the omission of water under the vegetation. As this effect seems to relate to complex factors (e.g., microwave incident angle, vegetation structure, density, and water levels), completely removing the effect is difficult when solely using the backscatter intensity. A potential solution is to use polarimetry decomposition [18,48] to distinguish the inundated vegetation, although it is not easy to obtain a substantial number of full polarimetry data, which is necessary for accurate polarimetry decomposition [18].
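The 'no contrast' effect on the anomaly maps can be reproduced with a small numerical sketch. The four-pixel maps below are hypothetical values in dB, not data from this study. They show that two maps can correlate almost perfectly while their anomalies, taken relative to a shared average map, do not.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical maps (dB): both are the average map plus deviations of 0.1 dB.
average_map = [-20.0, -15.0, -10.0, -5.0]
predicted = [-19.9, -15.1, -9.9, -5.1]
original = [-19.9, -14.9, -10.1, -5.1]

# Anomaly = map minus the temporal average map, pixel by pixel.
pred_anomaly = [p - m for p, m in zip(predicted, average_map)]
orig_anomaly = [o - m for o, m in zip(original, average_map)]

r_raw = pearson_r(predicted, original)            # dominated by the spatial pattern
r_anomaly = pearson_r(pred_anomaly, orig_anomaly)  # only the small deviations remain
print(round(r_raw, 4), round(r_anomaly, 4))
```

In this sketch the 0.1 dB deviations are negligible against the 15 dB spatial range, so r stays close to 1 for the raw maps. Once the shared pattern is subtracted, only the deviations are compared and r falls to about zero.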
Additional experiments showed that the combined use of Sentinel-1 and PALSAR-2 does not contribute to any substantial improvement of the accuracy, but rather degrades it. This outcome was attributed to the remaining discrepancy between them in the polarizations used (VV and HH) or bands used (C-band and L-band) even after the calibration. The calibration through the linear regression analysis somewhat mitigated the discrepancy between them. Also, the combined use of the calibrated Sentinel-1 increased the number of input spatial images for RFDBUX and seemed to contribute to the accuracy improvement on one day in the rainy season (8 May 2017); however, the effect of the discrepancy and the reduced ability of VV to map water [9,11] could not be completely addressed. Dense clouds may also affect the observation, especially for shorter wavelength sensors (i.e., Sentinel-1) [49]. Thus, it was concluded that their combined use is an attractive avenue, but requires further research.
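As a rough illustration of calibration through linear regression, the sketch below fits an ordinary least-squares line that maps one sensor's backscatter onto the other's. The direction of the fit (Sentinel-1 VV as predictor, PALSAR-2 HH as target), the function names, and the sample values are assumptions made for this example, not the procedure used in the study.

```python
def fit_linear_calibration(source_db, target_db):
    """Ordinary least-squares fit of target = slope * source + intercept."""
    n = len(source_db)
    mean_x = sum(source_db) / n
    mean_y = sum(target_db) / n
    sxx = sum((x - mean_x) ** 2 for x in source_db)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(source_db, target_db))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_calibration(source_db, slope, intercept):
    """Map source backscatter values onto the target sensor's scale."""
    return [slope * x + intercept for x in source_db]


# Hypothetical co-located samples (dB) from the two sensors.
sentinel1_vv = [-20.0, -16.0, -12.0, -8.0]
palsar2_hh = [-12.0, -10.0, -8.0, -6.0]

slope, intercept = fit_linear_calibration(sentinel1_vv, palsar2_hh)
calibrated = apply_calibration(sentinel1_vv, slope, intercept)
print(slope, intercept, calibrated)
```

A single line can only remove a linear offset and scale difference between the sensors. Differences that depend on the surface type, such as the response of VV and HH to inundated vegetation, remain after such a fit, which is consistent with the residual discrepancy noted above.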
An additional investigation on the effectiveness of the DOY and precipitation (GPM) instead of AMSR2 showed that in the stable situation (i.e., dry season), the DOY or precipitation have the potential to make an accurate map. However, in describing the temporal variation (i.e., rainy season and anomaly), the performance of the AMSR2 NDPI was better. As the water extent of this region is controlled by precipitation, it does not show a regular seasonality every year and is largely impacted by the fluctuating inflow from nearby rivers in the rainy season [3]. Thus, the DOY and precipitation should not be used as primary explanatory variables for the dynamic change of the water extent of the lake. Rather, although AMSR2 has a relatively coarse resolution (10 km), it can directly provide information about the surface wetness of each pixel, which is a better proxy of the surface water extent than the DOY and precipitation for this region. Adding the DOY and precipitation (or other potential candidates) to RFDBUX as a secondary explanatory variable in conjunction with AMSR2 NDPI may be an interesting approach for future work.
The potential seasonal wetland areas calculated from the predicted and the original PALSAR-2 were consistent with each other (Figure 7, solid lines and colored symbols, respectively), except for the days when relatively large areas of seasonal wetlands appeared (e.g., 15 February 2016, 28 March 2016, and 9 May 2016). For such days, the prediction underestimated the area, probably because of an insufficient training dataset (i.e., match-ups between PALSAR-2 and AMSR2) for RFDBUX. To improve the accuracy, enlarging the training dataset and training RFDBUX separately for the rainy season and the dry season would be beneficial [34].
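The area comparison can be sketched as a pixel count under a backscatter threshold. The 100 m pixel size follows the resolution of the PALSAR-2-like images; the threshold value, the function name, and the small grids are hypothetical, and the sketch counts all water pixels rather than reproducing this study's definition of seasonal wetland.

```python
PIXEL_SIZE_M = 100.0        # spatial resolution of the PALSAR-2-like images
WATER_THRESHOLD_DB = -15.0  # hypothetical threshold, for illustration only


def water_area_km2(sigma0_db_grid,
                   threshold_db=WATER_THRESHOLD_DB,
                   pixel_size_m=PIXEL_SIZE_M):
    """Area (km^2) of pixels whose backscatter falls below the threshold."""
    n_water = sum(1 for row in sigma0_db_grid for v in row if v < threshold_db)
    return n_water * pixel_size_m ** 2 / 1e6


# Hypothetical 2 x 3 grids (dB) for a day with a large flooded area.
original = [[-18.0, -17.0, -16.0],
            [-16.5, -12.0, -9.0]]
# The prediction is slightly too bright for the marginal water pixels.
predicted = [[-18.0, -16.0, -14.5],
             [-14.0, -11.0, -9.0]]

area_original = water_area_km2(original)
area_predicted = water_area_km2(predicted)
print(area_original, area_predicted)
```

In this sketch, a prediction that is only 1 to 2.5 dB too bright near the threshold halves the mapped area. This shows how a modest bias in the predicted backscatter on days of extensive flooding can become a visible underestimate of the area.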
According to the time-series comparison of the potential seasonal wetland area and the in-situ water level data, RFDBUX appears to track the patterns of seasonal change successfully, which supports the validity of the time-series data derived from PALSAR-2. The actual values of the areas of the seasonal wetlands derived from this approach were not calibrated or validated in this study. In addition, on the days when relatively large areas of seasonal wetlands appeared (e.g., 15 February 2016, 28 March 2016, and 9 May 2016), the predictions tended to underestimate the area of the seasonal wetlands. This underestimation may be due to too much generalization in the random forest regression and may result in uncertainty in the quantitative time-series analyses of wetland extent.
As RFDBUX is a machine-learning-based approach, it can potentially integrate multiple datasets that have different observation mechanisms (e.g., microwave/optical or radiometer/radar) with more flexibility than other major approaches [26,29]. Different combinations of broad-scale, long-term data, other than AMSR2/PALSAR-2, should be tested over different landscapes, such as snowy mountainous regions, forests, and agricultural fields. Increasing the number of available match-ups by expanding the operation period will contribute to improving the accuracy. Also, improving algorithms by including other machine-learning techniques (e.g., support vector machine or deep neural network) and searching for other water indices would be reasonable in future work.

Conclusions
A data fusion approach was developed and tested using pixel-based random forest regression (RFDBUX) to study the tropical wetlands in Indonesia. The RFDBUX approach integrated passive microwave data (AMSR2) and active microwave data (PALSAR-2) to fill large temporal gaps in the original PALSAR-2 data. The validation showed that the spatial patterns of the PALSAR-2 backscatter coefficient images predicted by RFDBUX are consistent with the original PALSAR-2 images (r = 0.94; RMSE = 1.05 dB on average). The accuracy degraded when the focus was only on the temporally variable pixels (i.e., seasonal wetlands) or anomaly maps. The potential seasonal wetland areas calculated from the predicted and the original PALSAR-2 were consistent with each other, except for the days when relatively large areas of seasonal wetlands appeared. Potential error sources were double-bounce or volume scatter from inundated vegetation and an insufficient training dataset (i.e., match-ups between PALSAR-2 and AMSR2) for RFDBUX. Advanced SAR analysis (e.g., polarimetry decomposition) and the integration of other satellite or climatological data sources may contribute to the accuracy improvement. Nonetheless, the integrated use of Sentinel-1 with PALSAR-2, or the integrated use of DOY and precipitation with AMSR2, requires further research. A comparison with the in situ water level data showed that the temporal pattern of the predicted PALSAR-2 images can track the wetland dynamics. This attempt represents a first step in creating high-spatiotemporal-resolution water maps for this site and should contribute to the further investigation of hydrological/climatological features, the management of water resources, and the conservation of biodiversity at the site. The RFDBUX approach is potentially applicable to other combinations of datasets and other landscapes worldwide. Quantitative comparisons of RFDBUX with DBUX and with other fusion models such as STARFM, in addition to the application of RFDBUX to other data combinations and other landscapes, are also important research directions.
Author Contributions: H.M. developed RFDBUX, designed the research, implemented the analysis (processing satellite data including Sentinel-1, GPM, PALSAR-2, AMSR2 and water level data, running RFDBUX, and making validation and time-series analysis), and wrote the manuscript. C.N. implemented the analysis (processing PALSAR-2 and AMSR2, running RFDBUX, and validation), conducted the field survey, and wrote the manuscript. I.R. conducted the field survey, provided study site information and water level data, and supported the writing of the manuscript. K.N.N. designed the research, implemented the analysis (running RFDBUX, validation, map visualization, and time-series analysis), and wrote the manuscript.