Water Body Extraction from Sentinel-3 Image with Multiscale Spatiotemporal Super-Resolution Mapping

: Water body mapping is signiﬁcant for water resource management. In the view of 21 spectral bands and a short revisit time of no more than two days, a Sentinel-3 OLCI (Ocean and Land Colour Instrument) image could be the optimum data source in the near-real-time mapping of water bodies. However, the image is often limited by its low spatial resolution in practice. Super-resolution mapping (SRM) is a good solution to generate ﬁner spatial resolution maps than the input data allows. In this paper, a multiscale spatiotemporal super-resolution mapping (MSST_SRM) method for water bodies is proposed, particularly for Sentinel-3 OLCI images. The proposed MSST_SRM method employs the integrated Normalized Di ﬀ erence Water Index (NDWI) images calculated from four near-infrared (NIR) bands and Green Band 6 of the Sentinel-3 OLCI image as input data and combined the spectral, multispatial, and temporal terms into one objective function to generate a ﬁne water body map. Two experiments in the Tibet Plate and Daye lakes were employed to test the e ﬀ ectiveness of the MSST_SRM method. Results revealed that by using multiscale spatial dependence under the framework of spatiotemporal super-resolution Mapping, MSST_SRM could generate ﬁner water body maps than the hard classiﬁcation method and the other three SRM-based methods. Therefore, the proposed MSST_SRM method shows marked e ﬃ ciency and potential in water body mapping using Sentinel-3 OLCI images. accuracy. The comparable visual results and higher precision values of the new MSST_SRM method show that MSST_SRM could be a promising approach in producing high spatiotemporal water body maps from Sentinel-3 OLCI images.


Introduction
Water resources are important natural resources for the development of human beings and the ecological sustainability of ecosystems. However, many countries suffer from water scarcity because of several reasons, such as land-use change, natural calamities, and water pollution [1,2]. Water body extraction and mapping are essential and useful in many water applications, including water resource management [3,4], wetland protection [5,6], lake change detection [7,8], flood detection and assessment [9][10][11], and cyanobacteria blooms [12].
Advances in remote sensing technology provide access to water body monitoring and mapping [13,14]. Thus far, many papers have proposed water body mapping approaches from satellite images. Water indices are often employed as a popular and valid way to extract water because they enhance the distinction between water and nonwater through band calculations [15,16]. Existing water extraction methods are mostly based on single-source remotely sensed images such as Moderate small patches and linear features. The accuracy needs to be improved because it only uses one coarse spatial resolution Sentinel-3 image as the input. The successful launch of numerous high-resolution satellites has produced abundant high spatial resolution remotely sensed images, with which finer water body maps can be generated. These archived finer water body maps can provide prior spatial information for the processing of subpixel locating. Spatiotemporal super-resolution mapping (STSRM), which is developed based on SRM, can realize the fusion of multisource remote sensing data with different spatial and temporal resolutions. Finally, by combining the advantages of a short period, the many spectral bands of Sentinel-3 images, and the abundant archived high spatial resolution images, high spatial-temporal resolution water body maps can be generated.
In this study, a water body extraction method under the multiscale spatial-temporal super-resolution mapping (MSST_SRM) method is put forward for Sentinel-3 OLCI images to overcome the above problem of water body extraction. First, an integrated NDWI image calculated directly from the original Sentinel-3 image is used as model input for the processing of water body mapping. Then, MSST_SRM is implemented based on four terms: the spectral water index term, the subpixel/pixel spatial dependence term, and the temporal term. In particular, the spectral water index term is grounded on the Markov random field (MRF) approach; spatial dependence is calculated at the multiscales, and a transfer matrix is employed in the temporal term to calculate the change in the water body.
The rest of the paper is organized as follows. Detailed introductions of the MSST_SRM method is presented in Section 2. Results and analysis of the implementations are outlined in Section 3. Discussions are shown in Section 4. Finally, Section 5 concludes. Figure 1 shows the flowchart of the proposed MSST_SRM approach. After preprocessing the Sentinel-3 OLCI image, five bands, four NIR bands, and one green band were employed to calculate the NDWI images. Normally, in Sentinel-3, Band 17 is regarded as the NIR band in the NDWI calculation. However, using one NIR band of Sentinel-3 OLCI would be too narrow compared with Landsat sensors because the wavelength range of NIR in Landsat covers almost four NIR bands (Bands [17][18][19][20] in Sentinel-3. Hence, to ensure the same bandwidth of NIR and a comprehensive description of the water features, NIR bands from 17 to 20 of Sentinel-3 were used to calculate the NDWI images, and all precalculated NDWI images were combined into one image as input for the MSST_SRM model. MSST_SRM is implemented by the following four terms. Detailed information on these four terms is discussed in Sections 2.1-2.4. In addition, Section 2.5 discusses the implementation of MSST_SRM and some validation methods.

MSST_SRM
The target of MSST_SRM is to generate a finer water body map, X, from the combined NDWI image Y with the assistance of the previous water body map . Suppose the spatial resolution of the Sentinel-3 image is R and the scale factor is . Then, the spatial resolution of NDWI images Y = [ , , … … ] is also R because it is calculated from the Sentinel-3 image. Here, is a single NDWI image, and n is equal to 4. Four NDWI images comprise the integrated input NDWI image. The spatial resolution of and X are both . Each pixel in image Y includes × pixels in and X. In general, the optimal MSST_SRM results, X, is a minimized optimization problem. The objective function of MSST_SRM is composed of four terms and expressed as follows: where _ , _ , _ , and are the energy functions of the four parts: a spectral term for the water index, a spatial term at the subpixel scale, a spatial term at the pixel scale, and a temporal term, respectively. The weighting coefficients α, , and β control the weights of the four terms.

Spectral Water Index Term
The goal of this term in MSST_SRM is to make the class proportions in the resultant water body map consistent with the input NDWI images. The Markov random field (MRF) method [39,40] has been successfully applied as a spectral term in SRM to mitigate the effects of classification errors on change detection. The energy function _ is expressed as the following:

MSST_SRM
The target of MSST_SRM is to generate a finer water body map, X, from the combined NDWI image Y with the assistance of the previous water body map X f ormer . Suppose the spatial resolution of the Sentinel-3 image is R and the scale factor is z. Then, the spatial resolution of NDWI images Y = [y 1 , y 2 , y 3 . . . y i . . . y n ] is also R because it is calculated from the Sentinel-3 image. Here, y i is a single NDWI image, and n is equal to 4. Four NDWI images comprise the integrated input NDWI image. The spatial resolution of X f ormer and X are both R/z . Each pixel in image Y includes z × z pixels in X f ormer and X. In general, the optimal MSST_SRM results, X, is a minimized optimization problem. The objective function of MSST_SRM is composed of four terms and expressed as follows: where U index_spectral , U spaial_sp , U spatial_cp , and U temporal are the energy functions of the four parts: a spectral term for the water index, a spatial term at the subpixel scale, a spatial term at the pixel scale, and a temporal term, respectively. The weighting coefficients α, δ, and β control the weights of the four terms.

Spectral Water Index Term
The goal of this term in MSST_SRM is to make the class proportions in the resultant water body map consistent with the input NDWI images. The Markov random field (MRF) method [39,40] has been successfully applied as a spectral term in SRM to mitigate the effects of classification errors on change detection. The energy function U index_spectral is expressed as the following: where z is the scale factor, equal to 10, and C is the number of classes, equal to 2. D represents the number of pixels in NDWI image Y. x i is a coarse pixel in NDWI image at the current time, y(x i ) is the observed pixel's spectral value of x i , assuming that x i is allocated with the normal distribution of the mean vector V i and the covariance matrix M i , according to the category proportions.
where f ω i is the category proportion of class ω in pixel x i . The value of f ω i is obtained from the fraction images of Y.

Multispatial Term
The multispatial term includes two parts, namely, the subpixel spatial dependence term and the pixel spatial dependence term. In MSST_SRM, each subpixel is surrounded by two neighborhood systems. According to the maximum spatial dependence rule, the center subpixel is affected by the neighboring pixels and subpixels. Suppose the pixel neighborhood window size is 5, and the subpixel neighborhood window size is 7. Figure 2 shows an example to illustrate the multispatial neighborhood system. The center subpixel is affected by the surrounding 48 neighboring subpixels and 24 neighboring pixels. Detailed subpixel and pixel spatial dependence and calculation methods are described in where z is the scale factor, equal to 10, and C is the number of classes, equal to 2. represents the number of pixels in NDWI image Y.
is a coarse pixel in NDWI image at the current time, y( ) is the observed pixel's spectral value of , assuming that is allocated with the normal distribution of the mean vector and the covariance matrix , according to the category proportions.
where is the category proportion of class in pixel . The value of is obtained from the fraction images of Y.

Multispatial Term
The multispatial term includes two parts, namely, the subpixel spatial dependence term and the pixel spatial dependence term. In MSST_SRM, each subpixel is surrounded by two neighborhood systems. According to the maximum spatial dependence rule, the center subpixel is affected by the neighboring pixels and subpixels. Suppose the pixel neighborhood window size is 5, and the subpixel neighborhood window size is 7. Figure 2 shows an example to illustrate the multispatial neighborhood system. The center subpixel is affected by the surrounding 48 neighboring subpixels and 24 neighboring pixels. Detailed subpixel and pixel spatial dependence and calculation methods are described in Sections 2.3.1 and 2.3.2.

Subpixel-Based Spatial Dependence Term
This term intends to locally smoothen the resultant water body map. In SRM, the maximum spatial dependence rule is a popular method for expressing the spatial correlation and spatial pattern among subpixels. Under this rule, subpixels that are nearby may have similar class labels than those with longer distances [19,30]. MSST_SRM adopts the maximum spatial dependence rule to express the spatial relationships between the center subpixel and its neighbors. The subpixel spatial energy function _ is formulated as follows:

Subpixel-Based Spatial Dependence Term
This term intends to locally smoothen the resultant water body map. In SRM, the maximum spatial dependence rule is a popular method for expressing the spatial correlation and spatial pattern among subpixels. Under this rule, subpixels that are nearby may have similar class labels than those with longer distances [19,30]. MSST_SRM adopts the maximum spatial dependence rule to express the spatial relationships between the center subpixel and its neighbors. The subpixel spatial energy function U spatial_sp is formulated as follows: where x j,i is the jth subpixel in ith coarse pixel. x l is the neighboring subpixel of x j,i . N sp x j,i represents the subpixel neighborhood system for x j,i . Suppose the neighborhood window size is w, the neighborhood system of x j,i is a w × w square window. x j,i is located in the center of the window, with (w × w − 1) subpixel neighbors. c x j,i is the class label of x j,i and c(x l ) is the class label of x l . θ c x j,i , c(x l ) aims to show the land cover spatial dependence and is calculated as 1 if c x j,i equals c(x l ). Otherwise, it is 0. The spatial weighting function η x j,i , x l weighs the contribution of x l to x j,i , which is calculated as follows: where Ω is the selected normalization constant to ensure l N sp (x j,i ) η x j,i , x l = 1. d x j,i , x l is the distance between x j,i and x l .

Pixel-Based Spatial Dependence Term
The goal of this term is to provide holistic water information for the results. Under the principle of this term, the class label of each center subpixel is decided by its adjacent coarse pixels. The objective function of this term is to maximize the spatial correlation of neighboring coarse pixels and it is expressed as where x j,i is the jth subpixel in ith coarse pixel, x t is the tth neighboring coarse pixel around subpixel x j,i . N cp x j,i is the pixel neighborhood system for x j,i . Suppose W is the pixel neighborhood window size, which has the same meaning as w in Section 2.3.1. φ c x j,i , c(x t ) is the spatial coefficient of the neighboring coarse pixel x t at the class of c x j,i indicating the contribution of x t . The spatial weighting function η x j,i , x t is formulated as follows: where ε is the distance-decay parameter, and d x j,i , x t represents the Euclidean distance of x j,i and its neighboring coarse pixel x t . The spatial coefficient c x j,i , c(x t ) is calculated according to the category proportion of the neighboring coarse pixels. The category proportion can be obtained from the fraction image through spectral unmixing. Suppose x q is another neighboring coarse pixel of subpixel where f c(x j,i ) x q represents the category proportion of the pixel x q when the class label is c x j,i .
The spatial correlation function η x q , x t is used to calculate the spatial correlation between x q and x t .
The expression of η x q , x t is similar to Equation (8). After f c(x j,i ) x q and η x q , x t are obtained, the solution of φ c x j,i , c(x t ) can be resolved by the matrix in Equation (10): The spatial energy function U spaial combining subpixel and coarse pixel spatial dependence terms are expressed as follows: where δ controls the weight of the subpixel spatial term.

Temporal Term
Earlier archived images with high spatial resolution can provide prior information and spatial patterns for the processing of subpixel location. However, the water body changes over time. In this paper, transition probability P, which represents the change and transfer laws of the land-cover class, is employed in the temporal term to maintain the correlation between the previous water body map X f ormer and the current NDWI image Y. In the subpixel temporal neighborhood system N t , the subpixel at the current time is affected by its temporal neighbors at the former time. The degree of influence is usually expressed as the transition probability. In general, the higher the transfer probability for a certain class of subpixel, the lower the temporal energy function of this class should be. Therefore, the objective function U temporal can be modeled as follows: where c 2 x j,i is the class label of subpixel x j,i at the current time. c 1 (x k ) is the class label of the fine pixel x k at the previous time. P c 2 x j,i c 1 (x k ) is the transition probability from x k at the previous time to x j,i at the current time. In MSST_SRM, P is calculated based on the global transition model by comparing the previous water body map and current SR map pixel by pixel. Suppose the land cover class is C. In this study, there are only two classes, so C equals to 2. Transition probability matrix T is composed of C × C transition probabilities P.
where c 1 (•) is the class label of the arbitrary fine pixel at the previous time, and c 2 (•) represents the class labels of an arbitrary subpixel at the current time. P is calculated as [40].
where I(c 1 (x m ) = ω a ) is an indicator function equaled to 1 when c 1 (x m ) = ω a and 0 otherwise.

Implementation of MSST_SRM and Accuracy Validation
Generally, the process of obtaining the optimal solution of the MSST_SRM algorithm is to minimize the objective function. In this paper, the iterative conditional model (ICM) is adopted as a fast convergence algorithm to minimize global energy iteratively. The main steps of the MSST_SRM are as follows: (1) Initialize the input data and parameters and generate NDWI images by calculating Band 6 and Bands 17-20 from Sentinel-3 OLCI; set the proper values for scale factor z, key parameters α, δ, and β, and window sizes w and W. (2) Extract the endmember and then obtain fraction images by using the linear unmixing method.
Then, initialize the SR map based on it. (3) Calculate the transition probability matrix. (4) Update the subpixel labels iteratively all over the whole image. The class label for each subpixel that contributes to the last iteration process is regarded as the final class label for this subpixel. (5) Stop the iteration after the algorithm converges or the iteration exceeds the setting number; otherwise, step 4 is repeated.
Kappa coefficient, overall accuracy (OA), PULC, and PCLC are employed to assess the accuracy of the proposed method. PULC and PCLC represent the correct classification ratio of the unchanged and changed pixels, respectively.

Study Area
The Tibetan Plateau is located in central Asia, and it is the Earth's broadest and highest highland. With its rich water resources, the Tibetan Plateau is also called the "Water Tower of Asia". The Tibetan Plateau is not only the source of many large rivers in East Asia and Southeast Asia, but it also has a great number of alpine lakes and is the most concentrated area of lakes in China. The study area ( Figure 3) is located in the Tibetan Plateau. (1) Initialize the input data and parameters and generate NDWI images by calculating Band 6 and Bands 17-20 from Sentinel-3 OLCI; set the proper values for scale factor z, key parameters α, δ, and β, and window sizes and .
(2) Extract the endmember and then obtain fraction images by using the linear unmixing method.
Then, initialize the SR map based on it. (3) Calculate the transition probability matrix. (4) Update the subpixel labels iteratively all over the whole image. The class label for each subpixel that contributes to the last iteration process is regarded as the final class label for this subpixel. (5) Stop the iteration after the algorithm converges or the iteration exceeds the setting number; otherwise, step 4 is repeated.
Kappa coefficient, overall accuracy (OA), PULC, and PCLC are employed to assess the accuracy of the proposed method. PULC and PCLC represent the correct classification ratio of the unchanged and changed pixels, respectively.

Study Area
The Tibetan Plateau is located in central Asia, and it is the Earth's broadest and highest highland. With its rich water resources, the Tibetan Plateau is also called the "Water Tower of Asia". The Tibetan Plateau is not only the source of many large rivers in East Asia and Southeast Asia, but it also has a great number of alpine lakes and is the most concentrated area of lakes in China. The study area ( Figure 3) is located in the Tibetan Plateau.

Data Sources and Preprocessing
Three kinds of data were used in Experiment 1 ( Figure 4) as inputs, namely, the current Sentinel-3 image, previous Landsat image providing additional prior information for the processing of subpixel locating, and current Landsat image used for model validation. Specifically, the Sentinel-3A OLCI image was acquired on 25 July 2019 and downloaded from the European Space Agency (https://scihub.copernicus.eu). The Sentinel Toolboxes of Sentinel Application Platform (SNAP) is used to preprocess the Sentinel-3A image for cloud detection, atmospheric correction, and coordinate transformation. Image correction for atmospheric effects (iCOR) is used for OLCI Sentinel-3 atmospheric correction. The coverage area of the Sentinel-3A image is 12 × 12 km, containing 40 × 40 pixels and 21 bands. The two Landsat images were obtained from the United States Geological Survey (USGS) Earth Explorer (https://earthexplorer.usgs.gov). The previous image of Landsat-5 was obtained for 28 August 2009, and the reference image of Landsat-8 was acquired on 25 September 2019. The Landsat acquisition period was the same as the Sentinel-3A image in the summer season to ensure the same water level in the Sentinel-3A image. The two Landsat images were divided into water and nonwater classes to extract water body maps and they contained 400 × 400 pixels and 9 bands. The changed pixels of water accounted for 11.31% of all water pixels.
Water 2020, 12, x FOR PEER REVIEW 9 of 19

Data Sources and Preprocessing
Three kinds of data were used in Experiment 1 ( Figure 4) as inputs, namely, the current Sentinel-3 image, previous Landsat image providing additional prior information for the processing of subpixel locating, and current Landsat image used for model validation. Specifically, the Sentinel-3A OLCI image was acquired on 25 July 2019 and downloaded from the European Space Agency (https://scihub.copernicus.eu). The Sentinel Toolboxes of Sentinel Application Platform (SNAP) is used to preprocess the Sentinel-3A image for cloud detection, atmospheric correction, and coordinate transformation. Image correction for atmospheric effects (iCOR) is used for OLCI Sentinel-3 atmospheric correction. The coverage area of the Sentinel-3A image is 12 × 12 km, containing 40 × 40 pixels and 21 bands. The two Landsat images were obtained from the United States Geological Survey (USGS) Earth Explorer (https://earthexplorer.usgs.gov). The previous image of Landsat-5 was obtained for 28 August 2009, and the reference image of Landsat-8 was acquired on 25 September 2019. The Landsat acquisition period was the same as the Sentinel-3A image in the summer season to ensure the same water level in the Sentinel-3A image. The two Landsat images were divided into water and nonwater classes to extract water body maps and they contained 400 × 400 pixels and 9 bands. The changed pixels of water accounted for 11.31% of all water pixels. Model parameters were initialized after data preprocessing. The pixel-and subpixel-based neighborhood window sizes were both set to 7, which indicates that the center subpixel has 48 neighboring pixels and subpixels. The value of scale factor was 10. Weighting coefficients of α, δ, and β were gained by trying and testing. Kappa coefficient, overall accuracy (OA), PULC, and PCLC were employed to calculate the accuracy of the result.

Results and Analysis
Four additional approaches, namely, hard classification (HC), MSS_SRM, MCT_SRM, and MST_SRM were used for comparison. HC is a pixel-based method that labels the principal component class with the pixel class label according to the fraction images. MSS_SRM is an SRMbased method with a multiscale spatial term, but it uses only one single remotely sensed image regardless of the temporal term. MCT_SRM and MST_SRM are methods based on spatiotemporal super-resolution mapping (STSRM), but they use only single scale spatial dependence. Specifically, MCT_SRM adopts a pixel-based spatial dependence without considering its subpixel spatial neighbors, while MST_SRM utilizes subpixel-based spatial dependence. The resultant water body maps, generated through the different methods, are shown in Figure 5. Table 1 presents the confusion matrices of the five different methods. Based on these confusion matrices, the corresponding quantitative results of these five different methods are obtained, as shown in Table 2. Model parameters were initialized after data preprocessing. The pixel-and subpixel-based neighborhood window sizes were both set to 7, which indicates that the center subpixel has 48 neighboring pixels and subpixels. The value of scale factor z was 10. Weighting coefficients of α, δ, and β were gained by trying and testing. Kappa coefficient, overall accuracy (OA), PULC, and PCLC were employed to calculate the accuracy of the result.

Results and Analysis
Four additional approaches, namely, hard classification (HC), MSS_SRM, MCT_SRM, and MST_SRM were used for comparison. HC is a pixel-based method that labels the principal component class with the pixel class label according to the fraction images. MSS_SRM is an SRM-based method with a multiscale spatial term, but it uses only one single remotely sensed image regardless of the temporal term. MCT_SRM and MST_SRM are methods based on spatiotemporal super-resolution mapping (STSRM), but they use only single scale spatial dependence. Specifically, MCT_SRM adopts a pixel-based spatial dependence without considering its subpixel spatial neighbors, while MST_SRM utilizes subpixel-based spatial dependence. The resultant water body maps, generated through the different methods, are shown in Figure 5. Table 1 presents the confusion matrices of the five different methods. Based on these confusion matrices, the corresponding quantitative results of these five different methods are obtained, as shown in Table 2. From the visual and qualitative perspective of the above results, the MSST_SRM-based method is significantly superior to the other four methods. In Figure 5b, the resultant water body map of the HC-based method has many jagged boundaries and shows poor continuity because, in HC, only pixel scale land cover information is considered, making it easy to lose spatial details. In Figure 5c, multiscale maximal spatial dependence was adopted in MSS_SRM, which considers the spatial neighborhood relationship at both pixel and subpixel scales. The resultant water body map of MSS_SRM becomes locally oversmoothed because of the multispatial dependence model and lack of temporal term. Thus, while the MCT_SRM method can provide a relatively continuous water body map, it contains many discontinuous speckles because of the lack of spatial constraint at the subpixel level ( Figure 5d). Compared with the three previously mentioned methods, the resultant water body map of MST_SRM has better visual results (Figure 5e) and higher resulting accuracy (Table 2). However, some detailed information on the resultant water body map in Figure 5e was lost, especially for the linear features and small water blocks (highlighted by red circles).
By contrast, as shown in the red circle, the resultant water body map generated by MSST_SRM (Figure 5f) can maintain better spatial patterns and water structure information than the other four results. Moreover, MSST_SRM also had the highest quantitative evaluation results ( Table 2). The main reason for the good results is that the MSST_SRM method considers multiscale spatial dependence and additional prior temporal information. With the help of a former water body map of Landsat-5, a valuable spatial pattern can be provided for the current Sentinel-3 image. For unchanged areas, the class label of the subpixels can be obtained directly from the former water body map of Landsat-5 of 2009. Thus, using multiscale spatial dependence, the neighboring subpixels and pixels can provide prior spatial information for the center subpixel. From the visual and qualitative perspective of the above results, the MSST_SRM-based method is significantly superior to the other four methods. In Figure 5b, the resultant water body map of the HC-based method has many jagged boundaries and shows poor continuity because, in HC, only pixel scale land cover information is considered, making it easy to lose spatial details. In Figure 5c, multiscale maximal spatial dependence was adopted in MSS_SRM, which considers the spatial neighborhood relationship at both pixel and subpixel scales. The resultant water body map of MSS_SRM becomes locally oversmoothed because of the multispatial dependence model and lack of temporal term. Thus, while the MCT_SRM method can provide a relatively continuous water body map, it contains many discontinuous speckles because of the lack of spatial constraint at the subpixel level (Figure 5d). Compared with the three previously mentioned methods, the resultant water body map of MST_SRM has better visual results (Figure 5e) and higher resulting accuracy (Table 2). However, some detailed information on the resultant water body map in Figure 5e was lost, especially for the linear features and small water blocks (highlighted by red circles).
By contrast, as shown in the red circle, the resultant water body map generated by MSST_SRM (Figure 5f) can maintain better spatial patterns and water structure information than the other four results. Moreover, MSST_SRM also had the highest quantitative evaluation results ( Table 2). The main reason for the good results is that the MSST_SRM method considers multiscale spatial dependence and additional prior temporal information. With the help of a former water body map of Landsat-5, a valuable spatial pattern can be provided for the current Sentinel-3 image. For unchanged areas, the class label of the subpixels can be obtained directly from the former water body map of Landsat-5 of 2009. Thus, using multiscale spatial dependence, the neighboring subpixels and pixels can provide prior spatial information for the center subpixel.  The proposed approach was compared with a previous approach, which was also calculated for Tibet Plate Lake. Wang et al. [36] used a USWBM method based on a spectral water index to produce a 30-m water body map from the Sentinel-3A image and obtained optimal results with kappa = 0.8430 and OA = 92.37%. By contrast, MSST_SRM generated better results, with kappa = 0.8984 and OA = 95.06%, which indicates that MSST_SRM can produce better results closer to the reference water body maps.

Study Area
A Sentinel-3 OLCI image and Landsat multispectral images (Figure 6), taken over at Huangshi City, were employed in this experiment to further explore the proposed MSST_SRM method. Daye Lake is located in the southeast of Daye city; it crosses Daye city, Yangxin county, the Xisai mountain area, and the development zone of Huangshi and then flows into the Yangtze River from west to east at the Sigu floodgate. Daye Lake is rich in natural bait and suitable for fish and algae reproduction, which makes it an important water source for aquatic economic plants and fishery breeding.

Data Sources and Preprocessing
The experiment was implemented with the Sentinel-3A OLCI image acquired on 25 July 2019 and two Landsat images acquired on 29 August 1997 and 25 July 2019 (Figure 7). The Landsat-5 TM image from 1997 was divided into nonwater and water classes and used to offer additional prior information for the processing of subpixel location. The Landsat-8 OLI image from 2019 was also classified into two categories and used as a reference water body map. The acquisition and data preprocessing methods of these three images are the same as those in Experiment 1. The study area in the Sentinel-3A OCLI image contains 70 × 40 pixels and 21 bands. The two Landsat images contain 700 × 400 pixels and 9 bands. Scale factor was set to 10 and the neighborhood window of subpixels and pixels was set to 7. The changed pixels of water accounted for 32.63% of all water pixels.

Results and Analysis
Comparative experiments among HC, MSS_SRM, MCT_SRM, MST_SRM, and MSST_SRM were conducted to verify the proposed MSST_SRM method for Daye Lake (Figure 8). Confusion matrices of the different methods are showed in Table 3. Four quantitative statistical results were adopted to

Data Sources and Preprocessing
The experiment was implemented with the Sentinel-3A OLCI image acquired on 25 July 2019 and two Landsat images acquired on 29 August 1997 and 25 July 2019 (Figure 7). The Landsat-5 TM image from 1997 was divided into nonwater and water classes and used to offer additional prior information for the processing of subpixel location. The Landsat-8 OLI image from 2019 was also classified into two categories and used as a reference water body map. The acquisition and data preprocessing methods of these three images are the same as those in Experiment 1. The study area in the Sentinel-3A OCLI image contains 70 × 40 pixels and 21 bands. The two Landsat images contain 700 × 400 pixels and 9 bands. Scale factor z was set to 10 and the neighborhood window of subpixels and pixels was set to 7. The changed pixels of water accounted for 32.63% of all water pixels.

Data Sources and Preprocessing
The experiment was implemented with the Sentinel-3A OLCI image acquired on 25 July 2019 and two Landsat images acquired on 29 August 1997 and 25 July 2019 (Figure 7). The Landsat-5 TM image from 1997 was divided into nonwater and water classes and used to offer additional prior information for the processing of subpixel location. The Landsat-8 OLI image from 2019 was also classified into two categories and used as a reference water body map. The acquisition and data preprocessing methods of these three images are the same as those in Experiment 1. The study area in the Sentinel-3A OCLI image contains 70 × 40 pixels and 21 bands. The two Landsat images contain 700 × 400 pixels and 9 bands. Scale factor was set to 10 and the neighborhood window of subpixels and pixels was set to 7. The changed pixels of water accounted for 32.63% of all water pixels.

Results and Analysis
Comparative experiments among HC, MSS_SRM, MCT_SRM, MST_SRM, and MSST_SRM were conducted to verify the proposed MSST_SRM method for Daye Lake (Figure 8). Confusion matrices of the different methods are showed in Table 3. Four quantitative statistical results were adopted to

Results and Analysis
Comparative experiments among HC, MSS_SRM, MCT_SRM, MST_SRM, and MSST_SRM were conducted to verify the proposed MSST_SRM method for Daye Lake (Figure 8). Confusion matrices of the different methods are showed in Table 3. Four quantitative statistical results were adopted to assess the results (Table 4). Tests and errors were used to achieve the best value of the weighting coefficients.
Water 2020, 12, x FOR PEER REVIEW 13 of 19 assess the results (Table 4). Tests and errors were used to achieve the best value of the weighting coefficients. The comparison maps in Figure 8 show that the results of MSST_SRM can maintain better water structure. Results in Figure 8b have boundaries with many jags and low continuity because the HC method classification operation is calculated only on the pixel scale. The resultant water body map produced by MSS_SRM contains many over-smoothed patches (Figure 8c) because of the lack of temporal term. The kappa coefficient and OA of MCT_SRM have a higher value than those of HC and MSS_SRM (Table 4), and thus, the resultant water body map generated by MCT_SRM has low continuity (Figure 8d). In Figure 8e, the resultant water body map contains many speckle-like artifacts. Although MCT_SRM and MST_SRM methods have employed the former finer resolution water body map to learn the spatial pattern, the resultant water body maps still did not perform well because both methods only applied single scale spatial dependence either on the pixel scale or on the subpixel scale.
Compared with Tibet Plate Lake, the water in Daye Lake has more mixture, such as algae and sediments, which may be reflected in the NIR, making it more difficult to extract water from Daye Lake. However, among the above methods, the resultant water body map generated by MSST_SRM has the best performance ( Figure 8f) and the highest OA and kappa coefficient values (Table 4). Qualitative and quantitative results show that multiscale spatial dependence and additional temporal terms are significant in increasing the accuracy of the results. The comparison maps in Figure 8 show that the results of MSST_SRM can maintain better water structure. Results in Figure 8b have boundaries with many jags and low continuity because the HC method classification operation is calculated only on the pixel scale. The resultant water body map produced by MSS_SRM contains many over-smoothed patches (Figure 8c) because of the lack of temporal term. The kappa coefficient and OA of MCT_SRM have a higher value than those of HC and MSS_SRM (Table 4), and thus, the resultant water body map generated by MCT_SRM has low continuity ( Figure 8d). In Figure 8e, the resultant water body map contains many speckle-like artifacts. Although MCT_SRM and MST_SRM methods have employed the former finer resolution water body map to learn the spatial pattern, the resultant water body maps still did not perform well because both methods only applied single scale spatial dependence either on the pixel scale or on the subpixel scale.
Compared with Tibet Plate Lake, the water in Daye Lake has more mixture, such as algae and sediments, which may be reflected in the NIR, making it more difficult to extract water from Daye Lake. However, among the above methods, the resultant water body map generated by MSST_SRM has the best performance (Figure 8f) and the highest OA and kappa coefficient values (Table 4). Qualitative and quantitative results show that multiscale spatial dependence and additional temporal terms are significant in increasing the accuracy of the results.

Influences of Weighting Coefficients among Spectral, Multispatial, and Temporal Terms
The objective function includes spectral, multispatial, and temporal terms. Weighting coefficients α and β, which are used to balance the three big terms, play significant roles in water body mapping. α is a spatial weight coefficient that adjusts the contribution value of the whole multispatial term. The higher the value of α, the greater the spatial dependence among pixels. β is used as a temporal weigh coefficient. A higher value of β means the previous water body map has a greater effect on the results. Different water body maps can be generated by using different αs and βs. Four typical water body maps are displayed here to elaborate on the effects of the different coefficient combinations on the results. In Figure 9a, the temporal coefficient β is equal to 0, indicating the ineffectiveness of temporal dependence. Moreover, spatial coefficient α was at a low value of 0.01. Finally, the resultant water body map in Figure 9a has many jagged boundaries and poor continuity because of the absence of prior information and imperfect spatial pattern. In Figure 9b, α was set to 1, and β was set at a low value of 0.1, which means the previous water body map only plays a minor role in the process of water body generation and that the results would be extremely dependent on the spatial term. As a consequence, the results were over-smoothed, and the linear water body features could not be classified correctly. In Figure 9c, the temporal coefficient β was set to 1. Spatial coefficient α was set to 0, indicating that the multispatial term was unavailable in such a situation. The resultant water body map contained many speckle-like artifacts because of the missing constraint of spatial dependence. By contrast, with appropriate weight coefficients, the results displayed in Figure 9d had the best performance, with a better spatial pattern that not only represented linear water body features correctly but also contained fewer speckle-like artifacts. The quantitative accuracy values of the proposed MSST_SRM method are shown in Table 5. When β is set to 0, no matter how α increases, the kappa coefficient will not exceed 0.76 and OA remains at a relatively low value because, when β equals 0, the temporal term will be useless in the object function, which would result in great uncertainty in subpixel location. Meanwhile, if the value of α is fixed, the kappa coefficient, PULC, and OA will be higher when β is higher. However, the accuracy values tend to be stable. After that, the accuracy values remain almost unchanged, no matter how much β increases. The final optimal parameters were obtained by trial and error. In general, the value of PULC is higher than PCLC, especially when β is higher. The main reason for this result is that the previous finer water body map is ineffective for changed pixels. Therefore, the correct classification probability of changed pixels is lower than for unchanged pixels. The quantitative accuracy values of the proposed MSST_SRM method are shown in Table 5. When β is set to 0, no matter how α increases, the kappa coefficient will not exceed 0.76 and OA remains at a relatively low value because, when β equals 0, the temporal term will be useless in the object function, which would result in great uncertainty in subpixel location. Meanwhile, if the value of α is fixed, the kappa coefficient, PULC, and OA will be higher when β is higher. However, the accuracy values tend to be stable. After that, the accuracy values remain almost unchanged, no matter how much β increases. The final optimal parameters were obtained by trial and error. In general, the value of PULC is higher than PCLC, especially when β is higher. The main reason for this result is that the previous finer water body map is ineffective for changed pixels. Therefore, the correct classification probability of changed pixels is lower than for unchanged pixels.

Influences of Change Rate
Land cover changes always occur, and the change rate of the water body should be considered in practical applications. The change rate of the water body means the proportion of changed water pixels in all water pixels, which is calculated by comparing the current image with the previous image.
In general, the accuracy of the results, such as kappa and OA, is determined by the classification precision of changed and unchanged water bodies. For unchanged pixels, the class labels can be inherited from the former water body map directly. Therefore, the result accuracy will be higher when the change rate is smaller. For changed pixels, the spatial pattern of the water body is obtained from the current image and the spatial dependence rule. Obviously, in the processing of SRM, the uncertainty of changed pixels is greater than with the unchanged ones.
In this study, the change rate of water in the two experiments is 11.31% and 32.63%, respectively. The time interval of the former and latter images in Experiment 2 is 22 years. Even so, the two results show that the MSST_SRM method can still obtain the best result among the contrast experiment methods (HC, MSS_SRM, MCT_SRM, and MST_SRM). The above experiments also suggest the following: (1) A higher precision water body map is more likely to be produced at a smaller change rate because the proposed MSST_SRM method usually gains more accurate results for unchanged pixels than changed ones; (2) among the three STSRM-based methods that employ the former fine resolution water body map as prior information, the proposed MSST_SRM method is superior to the other methods (MCT_SRM and MST_SRM) in quantity and quality, thereby indicating that with the help of multiscale spatial dependence, MSST_SRM can provide better spatial patterns and maintain the water structure well.
The changed and unchanged water body classifications affect the resultant water body map. Therefore, to achieve higher result accuracy, the trade-off between changed and unchanged pixels should be considered. First, a balance should be made between the previous finer water body map and the current low-resolution image. Using training data to adjust the weighting parameters is a significant and credible way to obtain balance and gain higher accuracy. Second, using multiscale spatial dependence is an effective and promising means of promoting the performance of the current image. Third, it is better to use Sentinal-3 OLCI and Landsat-8 images acquired at the nearest time and in the same season to reduce the water change rate and to make full use of the prior information provided by the previous high spatial resolution remotely sensed images. In future research, adopting a series of previous finer resolution water body maps to acquire land-cover change rules may be a worthwhile method for further improving the result accuracy of unchanged pixels.

Conclusions
In this work, a multiscale spatiotemporal super-resolution method was proposed to produce a water body map with high spatiotemporal resolution from Sentinel-3 images. The proposed MSST_SRM method not only extends the application field of Sentinel-3 but also provides data support for water monitoring and water mapping. First, to ensure the stable performance of the proposed algorithm under different water features, MSST_SRM utilizes four combined NDWI images as input data from the band calculations of the Sentinel-3 image. Then, MSST_SRM unites the spectral, multispatial, and temporal terms into one objective function, where the spectral term is established according to the MRF criterion; the multispatial term is realized by the maximal spatial dependence principle on the pixel and subpixel scales, and the temporal term is constructed through the transition probability matrix between the latter and former images. As a consequence, by using temporal dependence, the spatial pattern uncertainty caused by a single image can be effectively reduced. With the help of multiscale spatial dependence, the integrity and spatial continuity of the entire image can be maintained well.
Two experiments at Tibet Plate Lake and Daye Lake were employed to validate the performance of MSST_SRM. Compared with the pixel-based method (HC), the no-temporal-information SRM-based method (MSS_SRM), and the two STSRM-based methods of single scale spatial dependence (MST_SRM and MCT_SRM), the proposed MSST_SRM method could generate a water body map with high result accuracy. The comparable visual results and higher precision values of the new MSST_SRM method show that MSST_SRM could be a promising approach in producing high spatiotemporal water body maps from Sentinel-3 OLCI images.