Spatiotemporal Continuous Impervious Surface Mapping by Fusion of Landsat Time Series Data and Google Earth Imagery

: The monitoring of impervious surfaces in urban areas using remote sensing with ﬁne spatial and temporal resolutions is crucial for monitoring urban development and environmental changes in urban areas. Spatiotemporal super-resolution mapping (STSRM) fuses ﬁne-spatial-coarse-temporal remote sensing data with coarse-spatial-ﬁne-temporal data, allowing for urban impervious surface mapping at both ﬁne-spatial and ﬁne-temporal resolutions. The STSRM involves two main steps: unmixing the coarse-spatial-ﬁne-temporal remote sensing data to class fraction images, and downscaling the fraction images to sub-pixel land cover maps. Yet, challenges exist in each step when applying STSRM in mapping impervious surfaces. First, the impervious surfaces have high spectral variability (i.e., high intra-class and low inter-class variability), which impacts the accurate extraction of sub-pixel scale impervious surface fractions. Second, downscaling the fraction images to sub-pixel land cover maps is an ill-posed problem and would bring great uncertainty and error in the predictions. This paper proposed a new Spatiotemporal Continuous Impervious Surface Mapping (STCISM) method to deal with these challenges in fusing Landsat and Google Earth imagery. The STCISM used the Multiple Endmember Spectral Mixture Analysis and the Fisher Discriminant Analysis to minimize the within-class variability and maximize the between-class variability to reduce the spectral unmixing uncertainty. In addition, the STCISM adopted a new temporal consistency check model to incorporate temporal contextual information to reduce the uncertainty in the time-series impervious surface prediction maps. Unlike the traditional temporal consistency check model that assumed the impervious-to-pervious conversion is unlikely to happen, the new model allowed the bidirectional conversions between pervious and impervious surfaces. The temporal consistency check was used as a post-procession method to correct the errors in the prediction maps. The proposed STCISM method was used to predict time-series impervious surface maps at 5 m resolution of Google Earth image at the Landsat frequency. The results showed that the proposed STCISM outperformed the STSRM model without using the temporal consistency check and the STSRM model using the temporal consistency check based on the unidirectional pervious-to-impervious surface conversion rule.


Introduction
Urbanization is an ongoing dynamic process worldwide [1], which has transformed many earthen surfaces in natural ecosystems into impervious surfaces [2]. Impervious the group of residential and commercial areas) impervious surface changes. Nevertheless, to the best of our knowledge, few spatiotemporal image fusion methods have been applied to generate impervious surface maps at higher spatial resolution (under 10 m) that are suitable for local scale (such as roads, sidewalks, paved areas, and rooftops which are mixed in the medium resolution remote sensing image) impervious surface changes.
Spatio-Temporal Super-Resolution land cover Mapping (STSRM) is a new method that directly generates fine-spatial-temporal resolution land cover maps from coarse-spatialfine-temporal resolution remote sensing imagery together with a few fine-spatial-coarsetemporal resolution land cover maps. Compared with the spatiotemporal image fusion methods such as FSDAF and ESTARFM, STSRM predicts land cover maps directly without introducing other classification approaches, reducing the uncertainty introduced from the classification process. In addition, STSRM can predict finer land cover maps than the medium resolution Landsat data [30]. In STSRM, the coarse-spatial-fine-temporal resolution image (such as Landsat) was used to provide class fraction information at the prediction time. The fine-spatial-coarse-temporal resolution image (such as Google Earth) data, which are temporally close to the prediction time, were used to provide ancillary fine spatial resolution land cover temporal contextual information in downscaling the class fraction images to sub-pixel scale. In recent years, various STSRM algorithms have been employed in the monitoring of land cover change [31], time-series land covers [32], forest [33], burned areas [34], and water [35]. However, to the best of our knowledge, no STSRM is designed for mapping impervious surfaces at present. The main challenges that exist in mapping impervious surfaces in STSRM can be concluded below.
First, the impervious surfaces have high spectral variability, affecting the accurate extraction of sub-pixel impervious surface fractions used as the STSRM input. The high spectral variability is caused by the variations in material components and properties, illumination, and temporal conditions [36]. In general, the STSRM uses the linear mixture model with a fixed endmember combination for land cover types in estimating the impervious surface fractions, but fails to consider the spectral variability of endmembers. In order to incorporate endmember variability, a few studies have used variable endmember combination as a substitute for fixed endmember in STSRM. For instance, Wu et al. [31] used a cross correlogram spectral matching technique to find appropriate endmember combinations for each pixel in STSRM. Li et al. [37] used the spectral angle and spectral distance to measure the similarity between the reference and candidate pixel spectra to find the most suitable endmember combination for each pixel. Although the uncertainty of the impervious surface fractions was decreased by using multiple endmember combinations, these STSRM algorithms still use the reflectance values in the unmixing. The impervious surfaces have high spectral variability in the reflectance values, and demonstrate high intra-class and low inter-class variability in the reflectance that may cause problems such as the collinearity in the spectral unmixing approach [38]. Studies have shown that the data dimensionality reduction can reduce the high spectral variability in spectral unmixing [39]. For instance, Xu et al. [38] combined the fisher discriminant analysis with multiple endmember spectral mixture analysis in extracting impervious surface fractions, which increased the between-class variability and decreased the within-class variability compared with the unmixing method using spectral reflectance values. However, its application in STSRM has not been reported yet.
Second, the STSRM is an ill-posed problem that predicts sub-pixel land cover spatial distribution from the original remote sensing data. The result contains considerable uncertainty in the time-series predictions. To reduce the results' uncertainty, it is necessary to incorporate temporal contextual information to post-process the time-series predictions. However, there are no appropriate post-processing methods for the sub-pixel scale STSRM predictions at present. As a result, the misclassification in the STSRM prediction map would inevitably result in errors in analyzing the dates when the impervious surface change [40]. Compared to the sub-pixel scale time series analysis, many pixel-scale temporal consistency check methods have been proposed to post-process the time-series impervious surface Remote Sens. 2021, 13, 2409 4 of 22 classification maps. For example, Li et al. [41] proposed a temporal consistency check model (TCC) that combines temporal filtering and heuristic reasoning to post-process the 30 m annual impervious surface maps. The TCC has been widely used in post-processing the Landsat derived time-series impervious surface maps [40][41][42]. Although more accurate predictions can be achieved after the temporal consistency check, the basic assumption is that impervious surface is stable in the temporal domain. Therefore, the TCC only considered the unidirectional conversion from the pervious surface to the impervious surface in the temporal domain. Thus the TCC may not be suitable for the sub-pixel scale STSRM predictions, because the information observed at one scale cannot be directly applied as that observed at another scale [43]. The conversion from pervious-to-impervious surfaces, such as the green space recovery from urban land, is nonnegligible and reported in many studies [44][45][46][47]. Both the pervious-to-impervious surface conversion and impervious-topervious surface conversion are typical urban renewal processes. The latter was viewed as a way to counter-balance the rapid urbanization [48]. Therefore, exploring both the pervious-to-impervious surface conversion and impervious-to-pervious surface conversion in the temporal consistency check in STSRM is necessary.
This paper proposed a Spatiotemporal Continuous Impervious Surface Mapping (STCISM) that fuses Landsat and Google Earth images to generate a series of fine-spatialtemporal resolution (Google-Earth-image-spatial-resolution and Landsat-temporalresolution) impervious surface maps. The STCISM is designed to deal with the challenges related to high spectral variability and the various temporal transition directions between impervious surfaces in STSRM. Specifically, the STCISM used the combined method of Multiple Endmember Spectral Mixture Analysis (MESMA) and Fisher Discriminant Analysis (FDA) proposed by Xu et al. [38] to transform spectra from the reflectance image to a Fisher feature space that maximized the inter-class variability and minimized the intraclass variability, and used multiple endmembers for each mixed Landsat pixel to solve the endmember variability problem. In addition, the STCISM used a new temporal consistency check as a post-procession to integrate the temporal contexture information to correct the outliers in the time-series impervious surface maps. The new temporal consistency check considered the bidirectional conversion between the pervious surface and impervious surface, in order to reconstruct the land cover trajectories of both pervious-to-impervious surface conversion and impervious-to-pervious surface conversion happen in the urban renewal process. To test the performance of the proposed method in generating a series of impervious surfaces at fine spatiotemporal resolutions, we selected a region that has undergone rapid urbanization.

Study Area
Daqiao New Town in Jiangxia District, Wuhan City, Hubei Province, China (30 • 24 36"N, 114 • 17 34"E) was selected in this study ( Figure 1). This area covers approximately 63 km 2 , and has undergone a drastic urbanization process in recent years with an increase of impervious surfaces such as an industrial park, residential houses, and roads in this area.

Coarse-Spatial-Fine-Temporal Landsat Data
All cloud-free Landsat TM and OLI data in the study area from 2006 to 2020 were downloaded from Google Earth Engine (GEE, https://earthengine.google.org, accessed on 29 April 2021). These data were Tier 1 surface reflectance and have been atmospherically corrected by the Landsat Ecosystem Disturbance Adaptive Processing System [49] used in Landsat 5 data and Landsat Surface Reflectance Code [50] used in Landsat 8 data. The multispectral bands, including Blue (B), Green (G), Red (R), Near InfraRed (NIR), Short Wave InfraRed 1 (SWIR1), and Short Wave InfraRed 2 (SWIR2), were used. The numbers of Landsat images in each year used in the study site, covered by the tile path/row 123/39, are shown in Figure 2. A total of 43 scenes were used, including 19 Landsat 5 images acquired before 2012 and 24 Landsat 8 images acquired after 2012.

Fine-Spatial-Coarse-Temporal Data
Seven cloud-free fine-spatial-coarse-temporal remote sensing images, including six Google Earth images (4.12 m resolution) acquired on 2006/12/17, 2010/08/14, 2013/06/13, 2015/08/04, 2017/12/09, and 2020/02/25 and a Ziyuan-3 image (5.8 m resolution) acquired on 2012/08/18, were adopted as input data for STCISM model in this study. A Gaofen-1 image (the four 8 m multispectral bands were pan-sharpened with the 2 m panchromatic band based on the Gram-Schmidt spectral sharpening to generate a 2 m multispectral image) acquired on 2019/12/11 and other Google Earth images as listed in Tables 1 and 2 were adopted as reference data for accuracy assessment. The ZY-3 image and GF-1 image were georeferenced to the Google Earth Imagery. All fine-spatial-coarse-temporal images

Fine-Spatial-Coarse-Temporal Data
Seven cloud-free fine-spatial-coarse-temporal remote sensing images, including six Google Earth images (4.12 m resolution) acquired on 2006/12/17, 2010/08/14, 2013/06/13, 2015/08/04, 2017/12/09, and 2020/02/25 and a Ziyuan-3 image (5.8 m resolution) acquired on 2012/08/18, were adopted as input data for STCISM model in this study. A Gaofen-1 image (the four 8 m multispectral bands were pan-sharpened with the 2 m panchromatic band based on the Gram-Schmidt spectral sharpening to generate a 2 m multispectral image) acquired on 2019/12/11 and other Google Earth images as listed in Tables 1 and 2 were adopted as reference data for accuracy assessment. The ZY-3 image and GF-1 image were georeferenced to the Google Earth Imagery. All fine-spatial-coarse-temporal images were reprojected into UTM (zone 49, north) projection as same as Landsat series. Then, the nearest neighbor resampling technique was used to resample the fine-spatial-coarsetemporal images into a 5 m resolution.  The fine-spatial-coarse-temporal data were classified to generate 5 m impervious surface maps, which are the input of STSRM. The object-based classification approach, which takes advantage of spectral and spatial information such as textural, morphological, and contextual characteristics [51], was applied to the seven Google Earth and ZY-3 data to generate the 5 m resolution impervious surface maps. The object-based classification overcomes the problem of salt-and-pepper effects caused in traditional pixel-based classification methods [52]. This object-based classification was performed using the Trimble eCognition Developer software. First, the multi-resolution segmentation and spectral difference segmentation were performed. The multi-resolution segmentation procedure starts with one-pixel objects and merges similar neighboring objects in subsequent steps until it reaches a heterogeneity threshold set by the "scale parameter" [53]. In this study, different segment parameters ranging from 15, 25, 30, to 40 were examined for the high-resolution images, and the scale parameter of 15 was used through many trials. Next, the spectral difference segmentation algorithm was used to merge some adjacent segmented objects with similar brightness values from the multi-resolution segmentation algorithm. Then, a decision tree classifier (including selecting, training, and applying V-I-S-W samples) was used to generate the classification map. Finally, the post-classification processing (including merging vegetation, soil, and water classes into pervious surface class) was used to generate the final impervious map at a 5 m spatial resolution. An independent set of validation samples (150 for each class) was used to assess the accuracy of the classification results, and the average Kappa coefficient was higher than 0.95.

Methodology
The flowchart of the STCISM, which generated fine-spatial-temporal resolution impervious surface maps from Landsat time-series and Google Earth images, is shown in Figure 3. The STCISM contains three main steps. First, the Fisher transformed Multiple Endmember Spectral Mixture Analysis (F-MESMA) algorithm [38] was used to unmix Landsat time-series images and generate a series of 30 m resolution impervious surface fraction images. Then, based on the 30 m resolution impervious surface fraction images and the 5 m resolution ancillary impervious surface maps, a Bayesian-based STSRM (Bayesian-STSRM), was used to predict every 5 m impervious surface maps at the acquisition time of each Landsat data. Finally, a novel temporal consistency check that considered the bidirectional conversions between pervious and impervious surfaces was used to post-process the initial 5 m time series impervious surface maps to generate the final predictions.  The F-MESMA unmixing proposed by Xu et al. [38] was used to generate the 43 scenes of impervious surface fraction maps from the Landsat time-series images acquired during 2006-2020. The F-MESMA unmixing procedure consists of three steps. First, reflectance images and generic spectral training samples of vegetation, impervious surface, bare soil, and water (V-I-S-W) are transformed into Fisher feature images via a Fisher transformation. Then, the MESMA method is applied to generate V-I-S-W fractions. A more detailed description of this method is presented in [38].
In STCISM, the F-MESMA was modified to incorporate local endmember information about the study area. The V-I-S-W endmembers for unmixing in the F-MESMA method were selected globally from online spectral libraries [38], but were not representative for the site in this study. Thus, the STCISM derived the V-I-S-W endmembers directly from the Landsat images of this study site. Since the spectral separability of impervious surfaces from other land cover types was maximized in summer [54], and to reduce the disagreement of surface reflectance between different sensors (Landsat TM and Landsat OLI), images acquired on 2011/06/08 and 2018/09/05 were applied for endmember selection. With the help of all available higher resolution Google Earth images, the training samples whose type remained stable during the study period were finally selected. A total of 17 vegetation, 42 impervious surface, 29 soil, and 5 water training samples were manually selected from the Landsat images in this study. More specifically, 17 vegetation samples included grass, shrub, and tree samples; 42 impervious surface samples included roads, rooftops, and parking lots samples; 29 soil samples included bare soil, sand and rock samples, and water samples included pond, river, and lake samples.

Bayesian-STSRM
The Bayesian-STSRM was used to predict each 5 m resolution impervious surface map (X) at Landsat image acquisition time t p (p = 1, . . . , N, where N is the total number of Landsat data). The Bayesian-STSRM used the 30 m resolution impervious surface fraction image (Y) unmixed from the Landsat image at time t p and two 5 m resolution impervious surface maps (X pre ; X post ), one pre-date and the other post-date of t p , respectively. Based on the Bayesian theory, the 5 m resolution impervious surface map (X) at time t p was formulated by solving the maximization problem: where Z is a normalizing constant, P posterior (X|Y, X pre , X post ) is the posterior probability of X, given Y, X pre , X post , and U posterior (X|Y, X pre , X post ) is the posterior energy function of X, given Y, X pre , X post : where U fraction (Y|X) is the class fraction term that represents the inconsistency between Y and X, and U spatial (X) and U temporal (X|X pre , X post ) are the spatial and temporal terms, respectively. The class fraction term is used to minimize the difference between the unmixed and synthetic impervious surface fraction: The spatial term is used to maximize the spatial dependence of land cover classes: where s is the scale factor between the Landsat and Google Earth image (s = 6), a ijk is the k th 5 m resolution pixel in the Landsat pixel (i,j), N a ijk is a 3 × 3 neighborhood system with the 5 m fine pixel a ijk as its center, and d a ijk , a l is the Euclidean distance between the 5 m fine pixels a ijk and a l . c a ijk and c(a l ) are the class labels of pixels a ijk and a l .
con c a ijk , c(a l ) equals to 1 if c a ijk and c(a l ) are the same, and 0 otherwise. The temporal term is used to make the predicted fine-resolution map X similar to the pre-and post-dated maps X pre and X post : where a ijk , a pre ijk and a post ijk are the k th 5 m resolution pixel in the Landsat pixel (i,j) in the maps X, X pre and X post . λ pre ij and λ post ij are the weights for the Landsat pixel (i,j) in the maps X pre and X post , which can be calculated using the exponential functions in [30].

Temporal Consistency Check
A new temporal consistency check was used to incorporate temporal-contextual information and correct the outliers in the Bayesian-STSRM time-series maps. The proposed STCISM considered pervious-to-impervious surface conversion and vice versa, which is different from the models that assumed the impervious-to-pervious surface conversion is unlikely to happen, and the conversion of the pervious surface to the impervious surface is unidirectional [7,41]. The basis is that the impervious-to-pervious surface conversion is nonnegligible in the fine-spatial-resolution scale. For instance, in Figure 4, the region was covered by water in 2006, and was changed to impervious surfaces due to the construction of factory buildings in 2013, and was changed to pervious surfaces of bare land in 2020. Figure 4 indicates that both the pervious-to-impervious surface conversion and imperviousto-pervious surface conversion are essential in the urban renewal process, which should be both considered in the temporal consistency check. The main steps of the temporal consistency check are introduced below.
Remote Sens. 2021, 13, x FOR PEER REVIEW 10 of 23 A new temporal consistency check was used to incorporate temporal-contextual information and correct the outliers in the Bayesian-STSRM time-series maps. The proposed STCISM considered pervious-to-impervious surface conversion and vice versa, which is different from the models that assumed the impervious-to-pervious surface conversion is unlikely to happen, and the conversion of the pervious surface to the impervious surface is unidirectional [7,41]. The basis is that the impervious-to-pervious surface conversion is nonnegligible in the fine-spatial-resolution scale. For instance, in Figure 4, the region was covered by water in 2006, and was changed to impervious surfaces due to the construction of factory buildings in 2013, and was changed to pervious surfaces of bare land in 2020. Figure 4 indicates that both the pervious-to-impervious surface conversion and impervious-to-pervious surface conversion are essential in the urban renewal process, which should be both considered in the temporal consistency check. The main steps of the temporal consistency check are introduced below. (1) Detect change points A significant change point, also called 'break point', would be present in the time series predictions if a pervious-to-impervious surface or impervious-to-pervious surface conversion occurs. The temporal consistency check first detects the change points in the time series class labels for each predicted 5 m resolution pixel. The break point detection algorithm identifies abrupt changes based on the significant change in the statistical mean values on either side of a data point in the impervious time series data (in which "0" means pervious surface, and "1" means impervious surface) while minimizing the total residual error [55,56]. The change points are iteratively detected. First, the suspected change point (1) Detect change points A significant change point, also called 'break point', would be present in the time series predictions if a pervious-to-impervious surface or impervious-to-pervious surface conversion occurs. The temporal consistency check first detects the change points in the time series class labels for each predicted 5 m resolution pixel. The break point detection algorithm identifies abrupt changes based on the significant change in the statistical mean values on either side of a data point in the impervious time series data (in which "0" means pervious surface, and "1" means impervious surface) while minimizing the total residual error [55,56]. The change points are iteratively detected. First, the suspected change point divides the time series into two segments, and an empirical estimate of the mean for each segment is calculated. Then, within each segment, the deviation of each point from the empirical estimate is calculated, and the deviations for all points are added. Next, the deviations are added segment-to-segment to find the total residual error. Finally, the location of the suspected change point is changed until the total residual error attains a minimum: where J represents the smallest total residual error value, x i is the corresponding label of a temporal sequence and its value is either "0" (i.e., pervious surface) or "1" (i.e., impervious surface), and h is the actual change point in temporal sequence. Finally, according to the number (M) of change points, the temporal sequence is divided into M + 1 segments.
(2) Update class labels for each segment The temporal consistency check assumes the class labels within a time series segment are the same. Thus, each segment's class labels were updated based on the domain class within the segment. Specifically, assuming the probability of this segment belonging to the impervious surface is Prob imp , which is calculated according to Equation (7): where W is the length of the segment, and con(c(w), 1) equals one if the w th label in the segment belongs to impervious surface or 0 otherwise. The class labels in the segment are set as "1" (i.e., impervious surface) if Prob imp is higher than 0.5 and "0" (i.e., pervious surface) otherwise. In this way, any outlier predicted by Bayesian-STSRM in each timeseries segment could be corrected. The flowchart of the temporal consistency check is in Figure 5.

Accuracy Assessment
To test the effectiveness of the novel temporal consistency check, the proposed STCISM was compared with the Bayesian-STSRM method in Section 2.3.3. The Bayesian-STSRM did not post-process the predicted time-series impervious surface maps (Figure 3). In addition, we also compared our results derived from STCISM with three existing impervious surface products, the Global Artificial Impervious Areas (GAIA) dataset, the Normalized Urban Areas Composite Index (NUACI) dataset, and the Multisource, Multitemporal (MSMT) dataset, with multiple epochs. These three products are all at a 30 m spatial resolution.
The GAIA dataset (http://data.ess.tsinghua.edu.cn, accessed on 29 April 2021) is the 30 m resolution annual dataset spanning over 30 years acquired at a global scale. An automatic mapping procedure developed this dataset on the GEE platform with the full acquisition of Landsat time-series images and ancillary datasets including nighttime light data and Sentinel-1 Synthetic Aperture Radar data [57]. This study employed GAIA impervious surface maps acquired in 2010 and 2015.
where W is the length of the segment, and ( ( ),1) con c w equals one if the w th label in the segment belongs to impervious surface or 0 otherwise. The class labels in the segment are set as "1" (i.e., impervious surface) if imp Prob is higher than 0.5 and "0" (i.e., pervious surface) otherwise. In this way, any outlier predicted by Bayesian-STSRM in each timeseries segment could be corrected. The flowchart of the temporal consistency check is in Figure 5.

Accuracy Assessment
To test the effectiveness of the novel temporal consistency check, the proposed STCISM was compared with the Bayesian-STSRM method in Section 2.3.3. The Bayesian-STSRM did not post-process the predicted time-series impervious surface maps ( Figure  3). In addition, we also compared our results derived from STCISM with three existing impervious surface products, the Global Artificial Impervious Areas (GAIA) dataset, the Normalized Urban Areas Composite Index (NUACI) dataset, and the Multisource, Multitemporal (MSMT) dataset, with multiple epochs. These three products are all at a 30 m spatial resolution.
The GAIA dataset (http://data.ess.tsinghua.edu.cn, accessed on 29 April 2021) is the 30 m resolution annual dataset spanning over 30 years acquired at a global scale. An automatic mapping procedure developed this dataset on the GEE platform with the full acquisition of Landsat time-series images and ancillary datasets including nighttime light The MSMT dataset was produced by using the Multisource, Multitemporal Random Forest classification (MSMT_RF) approach based on the GEE cloud-based platform (https:// doi.org/10.5281/zenodo.3505079, accessed on 29 April 2021). This product was developed by integrating Landsat-8 OLI optical images, Sentinel-1 SAR images, and Visible Infrared Imaging Radiometer Suite (VIRS) NTL images. It is a global impervious surface map at a resolution of 30 m for 2015 [59].
In this study, a series of independent validation samples were used for accuracy assessment. The reference data were from the high-spatial-resolution Google Earth images and Gaofen-1 image acquired temporally adjacent to classification maps for validation. Based on the "stratified random sampling" rule [60], the samples were randomly selected and labeled as either "impervious surfaces" or "pervious surfaces" via visual interpretation. More specifically, the "pervious surface" samples included vegetation, soil, and water land cover types. To eliminate the impact of resolution and uncertainty at object boundaries, we selected a sample location at the center of each relevant geographic object (e.g., building, water, etc.). A total of 2600 samples (1300 for the pervious surface class, and the others for the impervious surface class) were collected. The metrics of overall accuracy (OA), producer's accuracy (PA), user's accuracy (UA), and kappa coefficient were used to assess the results quantitatively.

Comparison between Bayesian-STSRM and STCISM
Both the proposed STCISM (the temporal consistency check model used) and the original Bayesian-STSRM (the temporal consistency check model not used) were assessed using 11 high-resolution images including four cloud-free images and six partly cloud-free images (Table 1). In general, the STCISM has overall accuracies higher than 90% and kappa coefficients higher than 0.85, demonstrating its ability to accurately map the impervious surfaces at a sub-pixel scale. Furthermore, the STCISM generates higher overall accuracies and kappa coefficients than Bayesian-STSRM. Specifically, the STCISM increases the overall accuracy and kappa coefficient by about 1.12% and 0.2, respectively, showing the efficiency of the temporal consistency check in correcting the outliers in the Bayesian-STSRM results. In addition, the STCISM has the average producer's accuracy of 90.06% and average user's accuracy of 97.21% compared with 88.70% and 96.24% for Bayesian-STSRM, respectively. The result indicates that the temporal consistency check can reduce both omission and commission errors. Figure 6 presents a comparison of Bayesian-STSRM and STCISM in two selected regions of the study area. In Figure 6A, the Google Earth image shows buildings that were partly predicted in the Bayesian-STSRM and modified by STCISM with more impervious surface area predicted. This finding shows the temporal consistency check used in STCISM can correct the omission error from Bayesian-STSRM for the impervious surface objects. In Figure 6B, many soil objects in the Google Earth image were incorrectly predicted as impervious surfaces in Bayesian-STSRM, and most of them were correctly predicted as pervious surfaces in the final STCISM map. This figure shows the temporal consistency check used in STCISM can correct the commission error of the impervious surface objects from the Bayesian-STSRM predictions.

Comparison with Other Impervious Surface Products
The quantitative comparison between STCISM and several impervious surface products is shown in Table 2. The STCISM outperforms other impervious surface products based on various metrics. For the year 2010, the STCISM achieves the highest overall accuracy of 97.33% and kappa coefficient of 0.95 compared with 79.33% and 0.59 for both NUACI and GAIA. For the year 2015, STCISM achieves the highest overall accuracy of 92.33% and kappa coefficient of 0.85 compared with 76.67% and 0.53 for MSMT and 70.33% and 0.41 for GAIA. These results indicate that the impervious surface maps predicted from STCISM were in better agreement with the reference data than the comparison products. Besides, the producer's accuracies of both NUACI and GAIA products for 2010 are lower than 80%. The producer's accuracies of both MSMT and GAIA products for 2015 are also lower than 80%, and the user's accuracy of GAIA product for 2015 is lower than 70%. Because the three comparison products are all global impervious surface maps with a 30 m resolution, they are seriously affected by the mixed pixel problem in the heterogeneous region, such as shown in Figure 7. By contrast, the STCISM predictions are

Comparison with Other Impervious Surface Products
The quantitative comparison between STCISM and several impervious surface products is shown in Table 2 and 0.41 for GAIA. These results indicate that the impervious surface maps predicted from STCISM were in better agreement with the reference data than the comparison products. Besides, the producer's accuracies of both NUACI and GAIA products for 2010 are lower than 80%. The producer's accuracies of both MSMT and GAIA products for 2015 are also lower than 80%, and the user's accuracy of GAIA product for 2015 is lower than 70%. Because the three comparison products are all global impervious surface maps with a 30 m resolution, they are seriously affected by the mixed pixel problem in the heterogeneous region, such as shown in Figure 7. By contrast, the STCISM predictions are 5 m spatial resolution impervious surface maps in which the mixed pixel problem in the Landsat image can be reduced to some extent. A visual comparison of different impervious surface maps is shown in Figure 7. In general, the STCISM maps generated at a 5 m spatial resolution were more similar to the reference maps than the 30 m impervious surface map products. In Figure 7C, the NUACI product fails to capture fragmented impervious surfaces and only maps part of the houses, resulting in a low producer's accuracy for impervious surfaces in Table 2. In Figure 7E, the GAIA product overestimates the impervious surface area, i.e., it misclassifies some pervious surfaces (such as vegetation and soils) as impervious surfaces, resulting in  Figure 7. In general, the STCISM maps generated at a 5 m spatial resolution were more similar to the reference maps than the 30 m impervious surface map products. In Figure 7C, the NUACI product fails to capture fragmented impervious surfaces and only maps part of the houses, resulting in a low producer's accuracy for impervious surfaces in Table 2. In Figure 7E, the GAIA product overestimates the impervious surface area, i.e., it misclassifies some pervious surfaces (such as vegetation and soils) as impervious surfaces, resulting in the lowest user's accuracy for impervious surfaces (Table 2). In Figure 7F, the MSMT product underestimates the impervious surfaces and fails to predict the spatial distribution of road networks, resulting in the lowest producer's accuracy for impervious surfaces of 60.00% ( Table 2).

A visual comparison of different impervious surface maps is shown in
The 5 m resolution STCISM maps mapped most of the linear-shaped roads, while GAIA, NUACI, and MSMT fail to predict roads' spatial distribution (Figure 7, rows C, E, and F). The main reason is that the linear road is relatively narrow in spatial distribution with a width of about 10 m or less in this region. Thus, the road class is not the domain land cover class in the 30 m Landsat pixel, and the 30 m products such as GAIA may assign the mixed Landsat pixel as the pervious surface. Furthermore, the boundaries of impervious surface objects from the 30 m impervious surface products are jagged (Figure 7, rows B and F). This indicates that the prediction maps with high resolution provide more detailed information about impervious surfaces.

Annual Change of Impervious Surface
The fine-spatial-temporal impervious surface results allow the analysis of continuous pervious-to-impervious surface change. Figure 8 shows a map of the year of pervious-toimpervious surface change in the entire study area, in which the urban expansion process is evident. Figure 9 shows the enlarged areas of four regions that have undergone the expansion of impervious surfaces. Figure 9A,B are located in two university towns, and the expansion of impervious surfaces shows the construction of universities. Figure 9C shows the development of the athletes' village of the 7th Military World Games, in which the construction of the athlete's village resulted in an expansion of impervious surfaces from 2015 to 2019. Figure 9D presents the development of the manufacturing industrial park, in which many companies were built from 2006 to 2020.
The annual impervious surface change map provides an opportunity to visualize the process of urban expansion better. Many educational and industrial facilities, residential houses, and infrastructure projects, including main roads, ring roads, and subways were constructed from 2006 to 2020. The impervious surface change map, which depicts the finespatial-temporal impervious surface changes, helps us understand the rapid urbanization process.
is evident. Figure 9 shows the enlarged areas of four regions that have undergone the expansion of impervious surfaces. Figure 9A,B are located in two university towns, and the expansion of impervious surfaces shows the construction of universities. Figure 9C shows the development of the athletes' village of the 7th Military World Games, in which the construction of the athlete's village resulted in an expansion of impervious surfaces from 2015 to 2019. Figure 9D presents the development of the manufacturing industrial park, in which many companies were built from 2006 to 2020. Figure 8. The annual change of impervious surface maps in Daqiao New Town. Considering both the impervious-topervious and pervious-to-impervious changes were considered, and there may be more than one pervious-to-impervious change that occurred in a single pixel, Figure 8 shows the year of the latest pervious-to-impervious change if more than one pervious-to-impervious change was detected. In addition, Figure 8 only shows the change years for pixels, which are labeled as impervious surfaces in the year 2020. The white color represents impervious surfaces in 2006 and before; the gradient colors from shallow to deep represent the annual change of impervious surfaces. Figure 8. The annual change of impervious surface maps in Daqiao New Town. Considering both the impervious-topervious and pervious-to-impervious changes were considered, and there may be more than one pervious-to-impervious change that occurred in a single pixel, Figure 8 shows the year of the latest pervious-to-impervious change if more than one pervious-to-impervious change was detected. In addition, Figure 8  The annual impervious surface change map provides an opportunity to visualize the process of urban expansion better. Many educational and industrial facilities, residential houses, and infrastructure projects, including main roads, ring roads, and subways were constructed from 2006 to 2020. The impervious surface change map, which depicts the

Fine Spatial and High-Frequency Mapping of Impervious Surface at the Local Scale
The primary purpose of generating 5 m resolution impervious surface maps at Landsat frequency is to capture the rapid and spatially explicit dynamics of urban expansion. The global impervious surface maps such as NUACI, MSMT, and GAIA are usually generated annually, or every five years, or even longer. These products are for global scale impervious monitoring, but may not capture the rapid urbanization process. To map impervious surfaces on a regional scale with a high frequency, many researchers have tried to map annual impervious surface maps using Landsat time-series data at 30 m [40,41]. These outputs are suitable for monitoring the temporally consistent impervious changes for large cities that have undergone a rapid urbanization process according to the adoption of new development policies [61]. However, even the same city could have different urbanization processes in different districts. For instance, the city's new district usually has undergone a rapid urbanization process, while the old or mature district in the city has undergone a relatively low urbanization process. At the city's local scale of rapid urbanization district, the annual 30 m resolution impervious surface mapping methods is relatively coarse in both time-frequency and spatial resolution. Although the fine spatial resolution remote sensing image such as WorldView allows the impervious surface mapping at the very high spatial resolution, the relatively coarse time interval and high cost would limit the monitoring of urban expansion at high frequency. Unlike the methods based on Landsat or high-spatial-resolution data, the proposed method fused both the Landsat and Google Earth images, thus allowing the monitoring of local scale (such as in the rapid developed new districts in the city) impervious surface changes at both fine spatial resolution and fine temporal frequency.

Incorporating Bidirectional Conversion in the Temporal Consistency Check in the Long-Term Mapping of Impervious Surface
The STSRM is an ill-posed problem, and the individual STSRM predictions would contain considerable uncertainty and result in outliers in the time series maps. The temporal consistency check, which uses the temporal context of impervious surface, can be used to correct the outliers in the time series impervious surface maps. This paper proposed a new temporal consistency check to post-process the time-series impervious surface maps from the individual STSRM predictions. The new temporal consistency check considered the bidirectional conversion (pervious-to-impervious surface conversion and impervious-topervious surface conversion), considering that both conversions are typical urban renewal processes. To validate the effectiveness of the new temporal consistency check, the proposed STCISM was compared with the popular unidirectional temporal consistency check model referred to as TCC in this paper [41]. Only the TCC was compared because no other STSRM algorithm was designed for the impervious surface mapping to the best of our knowledge. The TCC assumes the impervious-to-pervious surface conversion is unlikely to happen, and only considers the pervious-to-impervious surface conversion. The TCC, which is composed of temporal filtering and logic reasoning steps, was directly applied to the Bayesian-STSRM predictions. The TCC was validated using the same validation samples as described in Section 2.3.4 in the selected dates. The quantitative results between TCC and the proposed STCISM are shown in Figure 10.
In Figure 10, the proposed STCISM increased the overall accuracy and kappa compared with that of TCC, showing that bidirectional conversion in the temporal consistency check can better map the impervious surfaces at local scales. The TCC has a higher producer's accuracy (lower omission error) for impervious surfaces and lower user's accuracy (higher commission error) for impervious surfaces. The main reason is that if a pixel is labeled as an impervious surface at time t p in the Bayesian-STSRM, TCC would probably revise the pervious surface labels, which represent the time that post-dates time t p , as impervious surface labels according to the unidirectional conversion model. However, suppose the impervious surface label is incorrect at time tp. In that case, the TCC may erroneously revise the post-dated pervious surface labels as impervious surface labels, thus resulting in an over-estimate of impervious surfaces in the time series maps. In other words, the TCC that considers the unidirectional conversion is more sensitive to the commission error of the impervious surface, and the error would propagate to the prediction of the post-dated pervious surface labels. In contrast, the new temporal consistency check in the STCISM is based on only the statistical characters of the time series data, and treats the pervious-to-impervious surface conversion and impervious-to-pervious surface conversion equally. The result shows that the STCISM is more robust to the commission errors of the impervious surface, and could generate a more accurate map (in terms of overall accuracy and kappa) in monitoring the sub-pixel scale high-frequency impervious surface.
Remote Sens. 2021, 13, x FOR PEER REVIEW 18 of 23 Figure 10. Accuracy of the traditional TCC model that only considers the unidirectional perviousto-impervious conversion with the proposed STCISM that considered the bidirectional conversion.
In Figure 10, the proposed STCISM increased the overall accuracy and kappa compared with that of TCC, showing that bidirectional conversion in the temporal consistency check can better map the impervious surfaces at local scales. The TCC has a higher producer's accuracy (lower omission error) for impervious surfaces and lower user's accuracy (higher commission error) for impervious surfaces. The main reason is that if a pixel is labeled as an impervious surface at time tp in the Bayesian-STSRM, TCC would probably revise the pervious surface labels, which represent the time that post-dates time tp, as impervious surface labels according to the unidirectional conversion model. However, suppose the impervious surface label is incorrect at time tp. In that case, the TCC may erroneously revise the post-dated pervious surface labels as impervious surface labels, thus resulting in an over-estimate of impervious surfaces in the time series maps. In other words, the TCC that considers the unidirectional conversion is more sensitive to the commission error of the impervious surface, and the error would propagate to the prediction Besides the accuracy metrics between TCC and STCISM, Figure 11 shows zoomed-in area predictions between TCC and STCISM, demonstrating the rationality of the bidirectional conversion model in the temporal consistency check. In the zoomed-in area, many houses were found on 12 November 2010, which were converted to bare soil on 31 June 2013. This zoomed-in area has undergone an impervious surface to pervious surface conversion during this period. In the TCC, the impervious surface is predicted as mostly unchanged during this period. The TCC, which assumed the impervious-to-pervious conversion is unlikely to happen, fails to predict the impervious-to-pervious surface change.
In contrast, the STCISM with the bidirectional conversion rule correctly predicts most of the impervious-to-pervious surface conversion.

face.
Besides the accuracy metrics between TCC and STCISM, Figure 11 shows zoomed-i area predictions between TCC and STCISM, demonstrating the rationality of the bidirec tional conversion model in the temporal consistency check. In the zoomed-in area, man houses were found on 12 th Nov. 2010, which were converted to bare soil on 31 th Jun. 201 This zoomed-in area has undergone an impervious surface to pervious surface conversio during this period. In the TCC, the impervious surface is predicted as mostly unchange during this period. The TCC, which assumed the impervious-to-pervious conversion unlikely to happen, fails to predict the impervious-to-pervious surface change. In contras the STCISM with the bidirectional conversion rule correctly predicts most of the imperv ous-to-pervious surface conversion. Figure 11. An example showing the difference between TCC (using unidirectional pervious-to-impervious surface conversion rule) and STCISM (using the bidirectional conversion rule between pervious and impervious surface) in mapping the impervious-to-pervious surface conversion. Red and white represent impervious and pervious surfaces, respectively.

Limitation and Future Work
Although STCISM increased the accuracy in mapping impervious surfaces compare with the 30 m global impervious surface products, artifacts still exist. For instance, ther are artefacts such as the "salt-and-pepper" noise as the flaws in Figure 6. The main reaso is that STCISM is based on the performance of Bayesian-STSRM, which is an ill-pose problem that predicts a 5 m sub-pixel impervious map from the original 30 m medium resolution Landsat data. The varisou factors, including the uncertainties introduced b endmember variability, the complexity in the algorithm for solving the ill-posed problem (the Bayesian-STSRM method used the simulated annealing for searching the optimal so lution), as well as the sizeable spatial resolution gap between the input and output (th Figure 11. An example showing the difference between TCC (using unidirectional pervious-toimpervious surface conversion rule) and STCISM (using the bidirectional conversion rule between pervious and impervious surface) in mapping the impervious-to-pervious surface conversion. Red and white represent impervious and pervious surfaces, respectively.

Limitation and Future Work
Although STCISM increased the accuracy in mapping impervious surfaces compared with the 30 m global impervious surface products, artifacts still exist. For instance, there are artefacts such as the "salt-and-pepper" noise as the flaws in Figure 6. The main reason is that STCISM is based on the performance of Bayesian-STSRM, which is an ill-posed problem that predicts a 5 m sub-pixel impervious map from the original 30 m mediumresolution Landsat data. The varisou factors, including the uncertainties introduced by endmember variability, the complexity in the algorithm for solving the ill-posed problem (the Bayesian-STSRM method used the simulated annealing for searching the optimal solution), as well as the sizeable spatial resolution gap between the input and output (the scale factor is set to 6 in this paper), could decrease the performance in mapping the impervious surface maps such as the Bayesian-STSRM predictions in Figure 6. In addition, the STCISM applied the temporal consistency check as a post-procession to the Bayesian-STSRM predictions. However, the temporal consistency check only considered the temporal contextual information about land covers, but did not use the spatial contextual information. Future works will focus on using spatial filter [39] or the spatiotemporal Markov random field [38] to incorporate spatial or spatiotemporal contextual information to further eliminate artifacts such as the salt and pepper noise to improve the STCISM results.
In STCISM, the object-based classification was used to generate the impervious surface maps from high-resolution Google Earth images. With the development of deep learning algorithms, the convolutional neural networks, which can extract the much abstract highlevel features of impervious surfaces from a high-resolution image, are promising in the use of STCISM. In addition, this paper only used Landsat data as the coarse-spatial-fine-temporal data in the STCISM. Other medium resolution data such as Sentinel-2 optical images could also be used to generate temporal-dense impervious surface maps.

Conclusions
The mapping of impervious surface changes at a fine-spatial-temporal resolution is crucial to the monitoring of the urbanization process. In this paper, a new spatiotemporal impervious surface mapping method (STCISM) was proposed to deal with the trade-off between the spatial and temporal resolution of remote sensing imagery to map fine-spatialresolution (5 m Google-Earth-image-resolution) impervious surfaces at Landsat frequency. Compared with traditional STSRM designed to map multiple land cover classes, the STCISM is proposed for mapping impervious surfaces, and has two main advantages. First, the STCISM considered the high spectral variability in the impervious surface, and used the F-MESMA to minimize the within-class variability and maximize the between-class variability for urban classes. As a result, the uncertainty in spectral unmixing could be reduced. Second, the STCISM considered the considerable uncertainty in the STSRM algorithm, which is ill-posed, and used a new temporal consistency check to post-process the predicted time series impervious surface maps. Results show that using the temporal consistency check can correct the time series impervious surface maps and reduce the uncertainty in time series impervious mapping to some extent. In addition, the temporal consistency check using the unidirectional conversion rule is sensitive to the commission error of the impervious surface and may fail to predict the actual impervious-to-pervious surface conversion, resulting in a high commission error of the impervious surface. In contrast, the temporal consistency check using the bidirectional conversion rule outperformed that using the unidirectional conversion rule both visually and quantitatively.
The results demonstrated that the STCISM is suitable for mapping local scale impervious surface changes at fine spatiotemporal resolutions. The STCISM was compared with the global impervious surface products which were generated at 30 m resolution and at annual or multi-year temporal frequency. These global products are relatively coarse in both spatial and temporal resolutions in mapping local scale impervious surface changes both visually and quantitatively. In contrast, the STCISM predicted 5 m resolution impervious surface maps at Landsat frequency. The STCISM is essentially suitable for monitoring local scale impervious surface changes for regions with rapid urbanization progress in the city. The STCISM maps were able to represent the details of small impervious objects such as buildings and roads. With the increased spatial-temporal resolutions, detailed information about urbanization and urban expansion can be extracted. These fine-spatial-temporal resolution impervious surface change maps can help better monitor the urbanization process and help decision-makers respond quickly to illegal construction and make land-use policies for urban planning and sustainable city development.