A Two-Stage Fusion Framework to Generate a Spatio–Temporally Continuous MODIS NDSI Product over the Tibetan Plateau

: The Tibetan Plateau (TP) is an important component of the global environmental system, on which the snow cover greatly a ﬀ ects the regional climate and ecology. Moderate resolution imaging spectroradiometer (MODIS) snow cover products have been demonstrated to be appropriate for investigating the snow cover over the TP. However, they are subject to cloud obscuration, and the TP’s extremely complex terrain makes the snow monitoring di ﬃ cult. Therefore, in this paper, we propose a two-stage spatio–temporal fusion framework for the cloud removal of MODIS C6 snow products, including an adjusted Terra and Aqua combination (TAC) and a spatio–temporal fusion based on Gaussian kernel function and error correction (STF-GKF-EC). To the best of our knowledge, this is the ﬁrst time that a spatio–temporally continuous daily 500-m MODIS normalized di ﬀ erence snow index (NDSI) product has been generated for the TP, which greatly improves the spatial and temporal resolutions of the current snow cover products. The main stage, STF-GKF-EC, adaptively weights the spatial and temporal correlations by the Gaussian kernel function, and further takes the rapid changes of snow cover into consideration through the error correction. The experiments indicated that STF-GKF-EC removes clouds completely, achieving an overall accuracy (OA) and mean absolute error (MAE) of 91.48% and 3.88, respectively. Based on the cloud-removed results, during 2001–2017, as far as the intra-annual variation is concerned, a large proportion of the snow cover appears between October and May, with a peak in February / March, and the variation is mainly controlled by temperature. For the inter-annual variation, an obvious increasing trend of 0.68 / year for NDSI is observed before 2005, followed by a slight decreasing trend of 0.16 / year, in which precipitation is a better explanation factor than temperature. The proposed two-stage spatio–temporal fusion framework includes an adjusted Terra and Aqua combination (TAC) and a spatio–temporal fusion based on Gaussian kernel function and error correction (STF-GKF-EC). Firstly, the TAC based on an adjusted priority scheme, is used as the ﬁrst stage to reduce the cloud cover. STF-GKF-EC is then implemented according to the following procedure: (1) neighborhood interpolation; (2) GKF-based weighting integration of the spatio–temporal blocks; and (3) error correction. It should be noted that this process is recursively performed on the NDSI product until no cloud-covered pixels remain. Since the TAC has been extensively demonstrated to be highly accurate, only STF-GKF-EC was evaluated in this paper. The simulated experiment suggests that STF-GKF-EC can obtain an average OA of 91.48%. In addition, the MAE and the MAE for snow-covered areas (MAE_S) are 3.88 and 12.06, respectively. Overall, STF-GKF-EC has three advantages: (1) it takes into account regional inconsistency through the block processing; (2) it simultaneously considers the spatial and temporal information based on the uniﬁed spatio–temporal weight; and (3) it further improves the accuracy by the error correction. Nevertheless, there are still some disadvantages: (1) STF-GKF-EC may lead to a few misclassiﬁcations at the edge of the snow cover; and (2) it is di ﬃ cult to recover the cloud-covered pixels with a high precision when continuous cloud cover appears in the snow accumulation or ablation period. For our future work, remote sensing data from di ﬀ erent sensors, such as Sentinel data, will be fused to produce higher-quality NDSI products. The key is to improve the accuracy of the areas covered by continuous cloud during the snow accumulation and ablation.


Introduction
As the "third pole of the planet", the Tibetan Plateau (TP) has a considerable impact on regional climate and ecology. With global warming in recent decades, the TP is undergoing glacier retreat, abnormal snow cover change, permafrost degradation, and weakening of the ecosystem [1]. Snow is a significant and sensitive element of the climate system, affecting the surface energy balance [2] and hydrological circulation [3,4]. For example, when the snow melts early, the peak of river runoff will shift from summer and autumn, when water demand is highest, to winter and spring, when water demand is lower [5], which will seriously affect the development of agriculture and the supply of municipal water in the downstream areas. Given these facts, it is essential to monitor the snow cover variability over the TP. the C6 products across China [38]. For cloud removal, some studies have first converted the NDSI to SCA or FSC, and then removed the clouds, as in the C5 products [39][40][41]. In addition, using eightday MOD10C2 data, Li et al. [42] confirmed that the snow cover patterns on the TP exhibit significant spatio-temporal heterogeneity, and that the snow cover fraction decreased slightly by ~1.1% from 2001 to 2014. In general, for the snow cover variability over the TP, the current research based on NDSI products is still scarce and the spatial and temporal resolutions are coarse.
As a result, in this paper, we propose a novel two-stage spatio-temporal fusion framework to generate a cloud-free MODIS NDSI product, and we further investigate the snow cover variability over the TP at higher spatial and temporal resolutions than previous studies. First, a framework integrating an adjusted Terra and Aqua combination (TAC) and a spatio-temporal fusion based on Gaussian kernel function and error correction (STF-GKF-EC) is introduced. The main stage, STF-GKF-EC, integrates the spatial and temporal information based on a unified spatio-temporal weight, and greatly mitigates the effects of rapid snow cover changes through the error correction. Then, the cloud removal effect and accuracy of this method are tested in the simulated experiment. Finally, the method is applied to generate the daily cloud-free NDSI product with a spatial resolution of 500 m over the TP during 2001-2017, which is used to analyze the intra-annual and inter-annual variabilities of snow cover and their response to temperature and precipitation.

Study Area
The TP is located in central Eurasia, extending from 26°00′12″N to 39°46′50″N and from 73°18′52″E to 104°46′59″E, covering an area of approximately 2.57 × 10 6 km 2 (see Figure 1). It is the largest ice bank outside of the polar regions, almost entirely because of its elevation, which averages more than 4000 m a.s.l. In this study, we also analyzed the snow cover variability in 10 subregions of the TP, including six river basins (Indus, Brahmaputra, Salween, Mekong, Yangtze, and Yellow), three northern regions (Tarim, Qaidam, Hexi), and the remaining central region (Inner).

Data
The MODIS C6 Terra/Aqua snow products over the TP cover six tiles, in which the NDSI is in the range of 0, 10-100. The products were obtained from the National Aeronautics and Space Administration (NASA, https://search.earthdata.nasa.gov/search). The pre-processing included mosaicking the six tiles, re-projecting them into Universal Transverse Mercator (UTM) 45N, and extracting the TP. Thereafter, 6141 MOD10A1 and 5644 MYD10A1 images during 2001-2017 were acquired, as a few images with incorrect mosaics were discarded [43].
The Shuttle Radar Topographic Mission (SRTM) 90 m Digital Elevation Model (DEM) data downloaded from the United States Geological Survey (USGS) were used as ancillary data. Except for resampling the DEM into a 500-m spatial resolution, the pre-processing was similar to that for the MODIS snow products.
Monthly average temperature data and monthly average precipitation data from 87 meteorological stations over the TP (see Figure 1), provided by the China Meteorological Information Center (http: //data.cma.cn), were used to analyze the relationship between snow cover variation and climate change.
Inland water is marked as 237 in the MODIS NDSI products [36], but there are omissions. Therefore, lake data for the TP (obtained from http://dx.doi.org/10.6084/m9.figshare.3145369 [44]) were used to extract a water mask for the MODIS NDSI product.

Methodology
In this study, the clouds of the MODIS NDSI product were removed under a two-stage spatio-temporal fusion framework including the TAC and STF-GKF-EC, as shown in Figure 2. The MODIS C6 Terra/Aqua snow products over the TP cover six tiles, in which the NDSI is in the range of 0, 10-100. The products were obtained from the National Aeronautics and Space Administration (NASA, https://search.earthdata.nasa.gov/search). The pre-processing included mosaicking the six tiles, re-projecting them into Universal Transverse Mercator (UTM) 45N, and extracting the TP. Thereafter, 6141 MOD10A1 and 5644 MYD10A1 images during 2001-2017 were acquired, as a few images with incorrect mosaics were discarded [43].
The Shuttle Radar Topographic Mission (SRTM) 90 m Digital Elevation Model (DEM) data downloaded from the United States Geological Survey (USGS) were used as ancillary data. Except for resampling the DEM into a 500-m spatial resolution, the pre-processing was similar to that for the MODIS snow products.
Monthly average temperature data and monthly average precipitation data from 87 meteorological stations over the TP (see Figure 1), provided by the China Meteorological Information Center (http://data.cma.cn), were used to analyze the relationship between snow cover variation and climate change.
Inland water is marked as 237 in the MODIS NDSI products [36], but there are omissions. Therefore, lake data for the TP (obtained from http://dx.doi.org/10.6084/m9.figshare.3145369 [44]) were used to extract a water mask for the MODIS NDSI product.

Methodology
In this study, the clouds of the MODIS NDSI product were removed under a two-stage spatiotemporal fusion framework including the TAC and STF-GKF-EC, as shown in Figure 2.

Adjusted Terra and Aqua Combination (TAC)
The TAC [17], which blends the two MODIS snow products from Terra and Aqua, performs well in both effectiveness and accuracy. Therefore, it has been widely used as a pre-step for the cloud removal of MODIS C5 binary snow products, in which the priority is generally determined as snow>no snow>cloud. Although the TAC was proposed for use with MODIS binary snow products, we made appropriate adjustments according to the characteristics of the NDSI to make it applicable to C6 products. The specific priority scheme is given as high value > low value > cloud. This step provides the preliminary result for the next step.

Adjusted Terra and Aqua Combination (TAC)
The TAC [17], which blends the two MODIS snow products from Terra and Aqua, performs well in both effectiveness and accuracy. Therefore, it has been widely used as a pre-step for the cloud removal of MODIS C5 binary snow products, in which the priority is generally determined as snow > no snow > cloud. Although the TAC was proposed for use with MODIS binary snow products, we made appropriate adjustments according to the characteristics of the NDSI to make it applicable to C6 products. The specific priority scheme is given as high value > low value > cloud. This step provides the preliminary result for the next step.

Spatio-Temporal Fusion Based on Gaussian Kernel Function and Error Correction (STF-GKF-EC)
Based on the output of the TAC, the remaining clouds are further removed by the second stage, i.e., STF-GKF-EC, which makes full use of the spatio-temporal information of the snow cover product. STF-GKF-EC includes three parts: (1) neighborhood interpolation; (2) GKF-based weighting integration of the spatio-temporal blocks; and (3) error correction. In order to ensure the accuracy, the three parts are performed recursively, to remove the clouds iteratively (see Figure 2). Taking a single cycle as an example, the details are provided in the following.

Neighborhood Interpolation
Neighborhood interpolation is a traditional method which has a high precision for filling small areas [22]. In order to ensure the reconstruction accuracy under cloud cover, only the cloud pixels adjacent to the known pixels are filled by the neighborhood interpolation in each cycle of STF-GKF-EC. As can be seen in Figure 3, for the m-th loop, the cloud pixels within a buffer no greater than 2m − 1 from known pixels, are interpolated by known pixels. Then, after the GKF-based weighting integration and error correction, if the cloud cover rate is larger than zero, the next loop begins to further remove the clouds (see Figure 2).
A variety of interpolation methods can be used in this step, the choice of which has little impact on the accuracy of the small-area cloud removal. In this paper, inverse distance weighting (IDW) with geographical features is applied, in which each cloud pixel is assigned a weighted average of the reliable clear-sky neighbors. Statistical analysis shows that when the elevation difference over the TP is greater than 50 m, there may be a large difference in snow conditions. Therefore, a pixel whose elevation difference from the target pixel is more than 50 m is considered unreliable and is excluded from the analysis. The formula can be expressed as follows: where NDSI T i and NDSI T (i,n) represent the NDSI of the i-th cloud pixel and its n-th neighboring reference pixel in the target image, respectively. N is the total number of selected neighbors. ∆d (i,n) is the spatial distance between the n-th reference pixel and the i-th target pixel, which is no more than a given threshold 2 × m. In this way, the clouds near the clear-sky pixels can be removed efficiently and accurately, which is the basis for the next step. Based on the output of the TAC, the remaining clouds are further removed by the second stage, i.e., STF-GKF-EC, which makes full use of the spatio-temporal information of the snow cover product. STF-GKF-EC includes three parts: (1) neighborhood interpolation; (2) GKF-based weighting integration of the spatio-temporal blocks; and (3) error correction. In order to ensure the accuracy, the three parts are performed recursively, to remove the clouds iteratively (see Figure 2). Taking a single cycle as an example, the details are provided in the following.

Neighborhood Interpolation
Neighborhood interpolation is a traditional method which has a high precision for filling small areas [22]. In order to ensure the reconstruction accuracy under cloud cover, only the cloud pixels adjacent to the known pixels are filled by the neighborhood interpolation in each cycle of STF-GKF-EC. As can be seen in Figure 3, for the m-th loop, the cloud pixels within a buffer no greater than 2m-1 from known pixels, are interpolated by known pixels. Then, after the GKF-based weighting integration and error correction, if the cloud cover rate is larger than zero, the next loop begins to further remove the clouds (see Figure 2).
A variety of interpolation methods can be used in this step, the choice of which has little impact on the accuracy of the small-area cloud removal. In this paper, inverse distance weighting (IDW) with geographical features is applied, in which each cloud pixel is assigned a weighted average of the reliable clear-sky neighbors. Statistical analysis shows that when the elevation difference over the TP is greater than 50 m, there may be a large difference in snow conditions. Therefore, a pixel whose elevation difference from the target pixel is more than 50 m is considered unreliable and is excluded from the analysis. The formula can be expressed as follows: where NDSI i T and NDSI (i,n) T represent the NDSI of the i-th cloud pixel and its n-th neighboring reference pixel in the target image, respectively. N is the total number of selected neighbors. ∆d (i,n) is the spatial distance between the n-th reference pixel and the i-th target pixel, which is no more than a given threshold 2m. In this way, the clouds near the clear-sky pixels can be removed efficiently and accurately, which is the basis for the next step.

GKF-based Weighting Integration of the Spatio-Temporal Blocks
To ensure accuracy, neighborhood interpolation is applied to recover a small range of cloudcovered pixels surrounding the non-cloud pixels. More importantly, GKF-based weighting integration of the spatio-temporal blocks is proposed to completely remove the residual clouds, which synthetically takes advantage of the spatio-temporal information in the reference images (see Figure 2). In short, it consists of block division, reference block selection, and weighted average calculation. The detailed description is as follows: (1) Block division. Due to the complex terrain of the TP, the snow cover variation often presents regional inconsistency. In this regard, the minimum bounding rectangle of the TP is divided into b1×b2 blocks uniformly, so that the subsequent steps can be performed block by block.

GKF-based Weighting Integration of the Spatio-Temporal Blocks
To ensure accuracy, neighborhood interpolation is applied to recover a small range of cloud-covered pixels surrounding the non-cloud pixels. More importantly, GKF-based weighting integration of the spatio-temporal blocks is proposed to completely remove the residual clouds, which synthetically takes advantage of the spatio-temporal information in the reference images (see Figure 2). In short, it consists of block division, reference block selection, and weighted average calculation. The detailed description is as follows: (1) Block division. Due to the complex terrain of the TP, the snow cover variation often presents regional inconsistency. In this regard, the minimum bounding rectangle of the TP is divided into b1 × b2 blocks uniformly, so that the subsequent steps can be performed block by block.
(2) Reference block selection. In order to ensure the reconstruction accuracy, only similar blocks are used as reference blocks. Since the variation of snow cover is rapid and complex, it is insufficient to measure the similarity by the temporal distance alone. Instead, the optimal reference blocks are selected from the candidate blocks within a given time window (Tw) via the combination of the correlation coefficient (r), non-cloud fraction ( f n ), and temporal distance (∆t) as follows: where f C&T n represents the overlapping non-cloud fraction of the candidate block and the target block. f C n is the non-cloud fraction of the candidate block. According to rule 1, a candidate block with a correlation coefficient to the target block higher than 0.7 is selected as a reference block, provided that f C&T n is high enough (0.3) to ensure that the calculated correlation coefficient is representative. The two thresholds were obtained from our experimental experience. In addition, if rule 1 fails, two reference blocks are selected by rule 2 to improve the efficiency of the cloud removal, with the sum of ∆t −1 and f C n reaching a maximum in the 8 days preceding or following.
(3) Weighted average calculation. Firstly, based on the GKF, the weights of the corresponding cloud-free pixels in the reference blocks are calculated. As shown in Figure 4, each cloud pixel in the target block is preliminarily assigned the weighted average of the reference pixels, as shown in Equation (3): where NDSI P j is the preliminary NDSI of pixel j in the target block. NDSI R ( j,st) represents the NDSI in the reference block. M denotes the number of reference blocks, with each containing N reference pixels. The weight w ( j,st) of the reference pixel, which is determined by its temporal and spatial distance to the target pixel j, consists of the temporal weight w ( j,t) and spatial weight w ( j,s) , as shown in Equation (4): where r t is the correlation coefficient between the reference block and the target block. In particular, it will be ignored when the reference block is selected by rule 2 (namely, r t = 1). ∆t ( j,t) and ∆d ( j,s) represent the normalized temporal distance and normalized spatial distance between the reference pixel and the target pixel, respectively. In addition, σ t and σ s are the corresponding standard deviations, which control the full width at half-maximum of each Gaussian curve.
where r t is the correlation coefficient between the reference block and the target block. In particular, it will be ignored when the reference block is selected by rule 2 (namely, r t =1). ∆t (j,t) and ∆d (j,s) represent the normalized temporal distance and normalized spatial distance between the reference pixel and the target pixel, respectively. In addition, σ t and σ s are the corresponding standard deviations, which control the full width at half-maximum of each Gaussian curve.

Error Correction
Based on the previous step, the preliminary cloud-removed result of each block can be obtained. However, large errors may occur during periods of rapid snow cover changes. It has been demonstrated that error correction with spatial correlation can further improve the effect of cloud removal [45]. Error correction is also adopted in this work (see Figure 2). According to Figure 5, the detailed process is described below. Based on the previous step, the preliminary cloud-removed result of each block can be obtained. However, large errors may occur during periods of rapid snow cover changes. It has been demonstrated that error correction with spatial correlation can further improve the effect of cloud removal [45]. Error correction is also adopted in this work (see Figure 2). According to Figure 5, the detailed process is described below. (1) Known error calculation. As shown in Figure 5, the errors of the original cloud-free areas (i.e., region 1) are calculated by Equation (5): where NDSI 1 P and NDSI 1 T represent the NDSI in region 1 of the preliminary result and the original target block, respectively.
(2) Unknown error estimation. In consideration of the superiority in smooth approximation, the triangulation-based natural neighbor interpolation method [46] is adopted to estimate the errors in region 2. In particular, only the known errors on the boundary of region 2 (∂R2) are utilized as a reference. The equation can be expressed as: where u ʹ (x,y) is the estimation at (x,y) based on the weighted combination of N references u(x n ,y n ). ϕ n is the weight associated with the known neighboring pixel n . Figure 6 shows the Voronoi diagrams for weight calculation. Specifically, Figure 6a shows the Voronoi diagram of pixels 1-5 based on the Delaunay triangles, in which f-h are the three vertices. When a target pixel T is included, its natural neighbors are those to which it is connected through the sides of the Delaunay triangles (pixels 1-5). A new Voronoi diagram is generated, as shown by the green line in Figure 6b, in which a-e are the five vertices in the Voronoi cell of pixel T. Taking pixel 1 as an example, the weight is given as follows: where A afghe is the overlapping area of the Voronoi cell of pixel 1 in Figure 6a and that of pixel T in Figure 6b. A abcde is the total area of the Voronoi cell of pixel T. (1) Known error calculation. As shown in Figure 5, the errors of the original cloud-free areas (i.e., region 1) are calculated by Equation (5): where NDSI P 1 and NDSI T 1 represent the NDSI in region 1 of the preliminary result and the original target block, respectively.
(2) Unknown error estimation. In consideration of the superiority in smooth approximation, the triangulation-based natural neighbor interpolation method [46] is adopted to estimate the errors in region 2. In particular, only the known errors on the boundary of region 2 (∂R 2 ) are utilized as a reference. The equation can be expressed as: where u (x, y) is the estimation at (x, y) based on the weighted combination of N references u(x n , y n . φ n is the weight associated with the known neighboring pixel n. Figure 6 shows the Voronoi diagrams for weight calculation. Specifically, Figure 6a shows the Voronoi diagram of pixels 1-5 based on the Delaunay triangles, in which f-h are the three vertices. When a target pixel T is included, its natural neighbors are those to which it is connected through the sides of the Delaunay triangles (pixels 1-5).
A new Voronoi diagram is generated, as shown by the green line in Figure 6b, in which a-e are the five vertices in the Voronoi cell of pixel T. Taking pixel 1 as an example, the weight is given as follows: where A a f ghe is the overlapping area of the Voronoi cell of pixel 1 in Figure 6a and that of pixel T in Figure 6b. A abcde is the total area of the Voronoi cell of pixel T.
(3) Error correction. The corrected result is determined by the subtraction of the new errors from the preliminary result. After all the blocks are processed, the next loop of STF-GKF-EC begins until there is no cloud (see Figure 2). (3) Error correction. The corrected result is determined by the subtraction of the new errors from the preliminary result. After all the blocks are processed, the next loop of STF-GKF-EC begins until there is no cloud (see Figure 2).

Validation Method
The proposed method was evaluated in terms of classification accuracy and numerical precision. The details are provided below.

Classification Accuracy
The metrics used in the research to evaluate the classification accuracy are the overall accuracy (OA), the commission error (CE), the omission error (OE) [24], and the F-score (FS) [38]. Table 1 shows the confusion matrix on which all the metrics are based, as well as the definition of the metrics. Specifically, pixels with an NDSI greater than 0 are classified as snow [36]. In contrast, the others are considered as snow-free land surface.

Numerical Precision
The numerical precision is based on the mean absolute error (MAE) and root-mean-square error (RMSE), as shown in Equations (8) and (9). Furthermore, since the two indicators are greatly affected by the large proportion of snow-free surface, which makes them unrealistically low, MAE and RMSE

Validation Method
The proposed method was evaluated in terms of classification accuracy and numerical precision. The details are provided below.

Classification Accuracy
The metrics used in the research to evaluate the classification accuracy are the overall accuracy (OA), the commission error (CE), the omission error (OE) [24], and the F-score (FS) [38]. Table 1 shows the confusion matrix on which all the metrics are based, as well as the definition of the metrics. Specifically, pixels with an NDSI greater than 0 are classified as snow [36]. In contrast, the others are considered as snow-free land surface.

Numerical Precision
The numerical precision is based on the mean absolute error (MAE) and root-mean-square error (RMSE), as shown in Equations (8) and (9). Furthermore, since the two indicators are greatly affected Remote Sens. 2019, 11, 2261 9 of 21 by the large proportion of snow-free surface, which makes them unrealistically low, MAE and RMSE for the snow-covered areas (MAE_S and RMSE_S) are utilized for a more comprehensive evaluation.
where x 1,k and x 2,k denote the NDSI of pixel k in the validation data and the cloud-removed results.
K is the number of all pixels in the evaluation area.

Results
This section describes the accuracy evaluation in the simulated experiment, in which the parameters were set as shown in Table 2. Since the TAC has been widely demonstrated to have a high accuracy, only the accuracy of STF-GKF-EC was evaluated. The spatial and temporal parameters in the GKF.

Accuracy Evaluation of STF-GKF-EC
The evaluation was based on the approach developed by Gafurov and Bárdossy [18]. Firstly, 10 images (see Table 3) with negligible cloud cover after the TAC from 2016/2017 were considered as "ground truth". Cloud masks extracted from the images with high cloud coverage of 40-85% were applied to the "ground truth". Then, the artificially cloud-covered images (called "observed data") were reconstructed by STF-GKF-EC. Finally, the cloud-removed results were compared with the corresponding "ground truth".
Since STF-GKF-EC was recursively performed, the clouds were removed iteratively. In the simulated experiment, the average cloud cover of the original data, the three intermediate products, and the final results on the 10 days were 62.61%, 26.58%, 2.38%, 0.09%, and 0, respectively. It should be noted that the number of iterations depends on the spatio-temporal cloud cover. The method automatically adjusts the number of iterations based on the remaining cloud coverage. For brevity, only two instances are shown in Figure 7. As can be seen from group (a), although the cloud cover of the "observed data" on 24 March 2016, is as high as 67%, STF-GKF-EC completely removes the clouds, and the cloud-free result shown in Figure 7 (a3) has a good consistency with the "ground truth". Group (b) shows the results for 10 September 2017, with an extremely high cloud cover of 88% in the "observed data". In this case, STF-GKF-EC restores the high values in snow-covered areas and retains the normal low values in snow-free areas. However, there are also a few misclassified pixels in the cloud removal result, mainly in the southeast of Brahmaputra and the interior of Qaidam. In these areas, the original reference images misidentified salt lakes, clouds, etc. as snow, leading to commission of the results. On the whole, the proposed method can remove all the clouds and restore the snow patterns with a high reliability.  To further verify the experimental results, the details of the red rectangular areas A3 and A4 (see Figure 7) are displayed in Figure 8. For group (a), it is clear that the result in (a3) is very pleasing, restoring the detailed information of the snow cover and preserving the spatial continuity with the original NDSI. As for group (b), the snow patterns of the cloud-removed result in (b3) are similar to those of the "ground truth" in (b1). However, some misclassified pixels can be observed at the edge of the snow cover (see the black rectangular areas). The reason is that the reference information is scarce owing to the continuous cloud presence in both the time sequence and the spatial neighborhood. In general, when there is no effective information in the adjacent five days during the period of snow accumulation or ablation, the reconstruction accuracy will be relatively poor in the area without the neighborhood information. Overall, STF-GKF-EC can recover all the NDSI of cloudcontaminated pixels, with a high agreement with the "ground truth". To further verify the experimental results, the details of the red rectangular areas A3 and A4 (see Figure 7) are displayed in Figure 8. For group (a), it is clear that the result in (a3) is very pleasing, restoring the detailed information of the snow cover and preserving the spatial continuity with the original NDSI. As for group (b), the snow patterns of the cloud-removed result in (b3) are similar to those of the "ground truth" in (b1). However, some misclassified pixels can be observed at the edge of the snow cover (see the black rectangular areas). The reason is that the reference information is scarce owing to the continuous cloud presence in both the time sequence and the spatial neighborhood. In general, when there is no effective information in the adjacent five days during the period of snow accumulation or ablation, the reconstruction accuracy will be relatively poor in the area without the neighborhood information. Overall, STF-GKF-EC can recover all the NDSI of cloud-contaminated pixels, with a high agreement with the "ground truth". In addition, the quantitative evaluation results for the 10 simulated images are shown in Table 3. On the one hand, in terms of the classification accuracy, STF-GKF-EC obtains the robust OA ranging from 89.37% to 94.29%. The average OA reaches 91.48%, of which 4.22% is overestimated and 4.30% is underestimated. The FS varies from 0.50 to 0.88, which is basically consistent with the increase of the snow fraction. On the other hand, for the numerical precision, the average MAE and RMSE of the cloud-removed results are 3.88 and 9.98, compared with the "ground truth". Moreover, the average MAE_S and RMSE_S reach 12.06 and 16.20, respectively. That is to say, STF-GKF-EC can also achieve a satisfactory numerical precision in snow-covered areas. From this comprehensive comparison, almost all the indicators suggest that 17 April 2017, has the highest accuracy. However, different conclusions about the most inaccurate day are obtained according to the different indicators. Since the snow fraction on 10 September 2016, is extremely low, the FS, MAE_S, and RMSE_S magnify the errors to some extent, as they only consider snow-covered areas. Hence, based on the OA, MAE, and RMSE of all areas, we conclude that the accuracy of the cloud removal for 17 April 2016, is relatively poor. The reason is that continuous cloud cover in time and space occurs during the snowmelt period. In summary, STF-GKF-EC can achieve satisfactory classification accuracy and numerical precision in both snow-covered and snow-free areas during all year. To further verify the experimental results, the details of the red rectangular areas A3 and A4 (see Figure 7) are displayed in Figure 8. For group (a), it is clear that the result in (a3) is very pleasing, restoring the detailed information of the snow cover and preserving the spatial continuity with the original NDSI. As for group (b), the snow patterns of the cloud-removed result in (b3) are similar to those of the "ground truth" in (b1). However, some misclassified pixels can be observed at the edge of the snow cover (see the black rectangular areas). The reason is that the reference information is scarce owing to the continuous cloud presence in both the time sequence and the spatial neighborhood. In general, when there is no effective information in the adjacent five days during the period of snow accumulation or ablation, the reconstruction accuracy will be relatively poor in the area without the neighborhood information. Overall, STF-GKF-EC can recover all the NDSI of cloudcontaminated pixels, with a high agreement with the "ground truth". In general, iterative processing can lead to error propagation. Therefore, the relation between the number of iterations and the accuracy of cloud removal was investigated. Table 4 shows the average accuracy of each iteration of 6 simulated images in 2017. The OA drops from 93.66% after the first iteration to 91.86% after the fourth iteration. For the numerical precision, the MAE and RMSE increase by 1.00 and 1.78, respectively. The MAE_S and RMSE_S increase to 11.73 and 15.79, respectively. It reveals that as the number of iterations increases, the classification accuracy and numerical precision decrease. However, as stated above, the STF-GKF-EC synthetically utilizes the spatial and temporal information, so that the cloud removal efficiency is high and the number of iterations is small. In our experiments, even in summer with extremely high cloud coverage, the number of iterations is no more than seven. Therefore, the final error of the STF-GKF-EC is still small.

Discussion of STF-GKF-EC
As stated above, the STF-GKF-EC can achieve robust accuracy for cloud removal. The two advantages of this method are: (1) the spatial and temporal weights are unified by the GKF; and (2) the estimation errors are reduced by the error correction. For the two aspects, there are two issues to be discussed.

Parameter Determination
Since the spatial and temporal weights are unified by the GKF, the two parameters in the GKF need to be determined. Figure 9 shows the results of the parametric test for 24 March 2016, and 1 October 2016, including the RMSE, RMSE_S, and OA values corresponding to different matrices of [σ s , σ t ]. For simplicity, the parameter testing was only performed on "A1" and "A2" (see Figure 1), and error correction in STF-GKF-EC was omitted. The different colors and vertices of the polygon represent different spatial and temporal parameters, respectively. The vertical comparison shows that the impact of the spatial parameter on the accuracy is highly consistent in the different regions, while that of the temporal parameter is relatively complex. From the horizontal perspective, as the value of the spatial parameter increases, the OA and RMSE are improved, while the RMSE_S is worse. The reason is that the OA and RMSE are significantly affected by the proportion of snow-free areas, while RMSE_S is the opposite. Specifically, when the spatial parameter value is 0.5 or 0.7, the OA and RMSE of both regions show a higher accuracy, while a value of 0.5 performs significantly better than 0.7 in terms of RMSE_S on A2. Therefore, σ s = 0.5 is regarded as the optimal parameter setting. In addition, when the temporal parameter value is set as 0.5, the RMSE and RMSE_S of "A1" and the RMSE_S of "A2" show good performance. Thus, the parameter matrix [0.5, 0.5] is recommended in STF-GKF-EC.

Discussion of STF-GKF-EC
As stated above, the STF-GKF-EC can achieve robust accuracy for cloud removal. The two advantages of this method are: (1) the spatial and temporal weights are unified by the GKF; and (2) the estimation errors are reduced by the error correction. For the two aspects, there are two issues to be discussed.

Parameter Determination
Since the spatial and temporal weights are unified by the GKF, the two parameters in the GKF need to be determined. Figure 9 shows the results of the parametric test for 24 March 2016, and 1 October 2016, including the RMSE, RMSE_S, and OA values corresponding to different matrices of [σ s , σ t ]. For simplicity, the parameter testing was only performed on "A1" and "A2" (see Figure 1), and error correction in STF-GKF-EC was omitted. The different colors and vertices of the polygon represent different spatial and temporal parameters, respectively. The vertical comparison shows that the impact of the spatial parameter on the accuracy is highly consistent in the different regions, while that of the temporal parameter is relatively complex. From the horizontal perspective, as the value of the spatial parameter increases, the OA and RMSE are improved, while the RMSE_S is worse. The reason is that the OA and RMSE are significantly affected by the proportion of snow-free areas, while RMSE_S is the opposite. Specifically, when the spatial parameter value is 0.5 or 0.7, the OA and RMSE of both regions show a higher accuracy, while a value of 0.5 performs significantly better than 0.7 in terms of RMSE_S on A2. Therefore, σ s =0.5 is regarded as the optimal parameter setting. In addition, when the temporal parameter value is set as 0.5, the RMSE and RMSE_S of "A1" and the RMSE_S of "A2" show good performance. Thus, the parameter matrix [0.5, 0.5] is recommended in STF-GKF-EC.

Effect of Error Correction
As stated above, the error correction further improves the numerical accuracy and spatial consistency of the NDSI data, especially during periods of rapid changes in snow cover. The effect of the error correction is affected by the quantity and distribution of the original effective data. Figure 10 shows three specific effects (groups (c)-(e)) extracted from the results for 24 March 2016, where the cloud removal results of STF-GKF-EC are compared with those of the method without error correction. Group (c) presents an example with an excellent correction effect. Since the snow cover in this area varies greatly over a few days, the NDSI values filled with the information of the reference images are significantly underestimated, as shown in (c3). In this case, based on the original effective information of the target image, the error correction can greatly improve the reconstruction accuracy. However, group (d) reveals that the correction is likely to be insufficient when the spatial information is scarce. In addition, original effective data with an extremely uneven spatial distribution may introduce incorrect error information, due to the lack of representativeness, resulting in excessive correction, as shown in group (e). The NDSI in (e3) is slightly overestimated at the edge of the snow-covered areas, and is covered by clouds in other snow-covered areas. Accordingly, nearby cloud-covered areas are mistakenly considered to be overestimated, thus correcting these overestimates and leading to underestimation. In general, although there are some insufficient and excessive corrections, the error correction can effectively eliminate the temporal errors introduced by the reference images, indicating that it is an essential component of the proposed method.

Effect of Error Correction
As stated above, the error correction further improves the numerical accuracy and spatial consistency of the NDSI data, especially during periods of rapid changes in snow cover. The effect of the error correction is affected by the quantity and distribution of the original effective data. Figure  10 shows three specific effects (groups (c)-(e)) extracted from the results for 24 March 2016, where the cloud removal results of STF-GKF-EC are compared with those of the method without error correction. Group (c) presents an example with an excellent correction effect. Since the snow cover in this area varies greatly over a few days, the NDSI values filled with the information of the reference images are significantly underestimated, as shown in (c3). In this case, based on the original effective information of the target image, the error correction can greatly improve the reconstruction accuracy. However, group (d) reveals that the correction is likely to be insufficient when the spatial information is scarce. In addition, original effective data with an extremely uneven spatial distribution may introduce incorrect error information, due to the lack of representativeness, resulting in excessive correction, as shown in group (e). The NDSI in (e3) is slightly overestimated at the edge of the snowcovered areas, and is covered by clouds in other snow-covered areas. Accordingly, nearby cloudcovered areas are mistakenly considered to be overestimated, thus correcting these overestimates and leading to underestimation. In general, although there are some insufficient and excessive corrections, the error correction can effectively eliminate the temporal errors introduced by the reference images, indicating that it is an essential component of the proposed method.

Snow Cover Variability
Based on the cloud-removed results of the proposed method, the variability of snow cover over the TP during 2001-2017 was analyzed from three aspects. Specifically, the intra-annual and inter-annual variabilities were revealed with respect to different elevation zones and/or subranges. The response of snow cover variability to temperature and precipitation in the different subranges was also investigated.

Intra-Annual Variability of Snow Cover
The intra-annual variability was analyzed at both the daily and monthly scales. First, the average daily snow fraction (the ratio of NDSI greater than 0) and the average daily NDSI for snow-covered areas in a hydrological year are shown in Figures 11 and 12, respectively. The gray error bars represent the standard deviation. It can be seen that the daily snow fraction over the TP fluctuates greatly, with a maximum of 34.87% on 20 January and a minimum of 8.45% on 5 August. In comparison, the average daily NDSI for the snow-covered areas rises earlier and falls later, with a maximum of 46.11 on 24 October and a minimum of 31.80 on 9 August. In other words, on the one hand, snow accumulation first occurs in areas with high NDSI values, and then spreads to snow-free areas. On the other hand, snow ablation starts from areas with low NDSI values, and then moves to areas with high NDSI values. However, in general, their changes are highly consistent, with a correlation coefficient of 0.91.
Based on the cloud-removed results of the proposed method, the variability of snow cover over the TP during 2001-2017 was analyzed from three aspects. Specifically, the intra-annual and interannual variabilities were revealed with respect to different elevation zones and/or subranges. The response of snow cover variability to temperature and precipitation in the different subranges was also investigated.

Intra-Annual Variability of Snow Cover
The intra-annual variability was analyzed at both the daily and monthly scales. First, the average daily snow fraction (the ratio of NDSI greater than 0) and the average daily NDSI for snow-covered areas in a hydrological year are shown in Figures 11 and 12, respectively. The gray error bars represent the standard deviation. It can be seen that the daily snow fraction over the TP fluctuates greatly, with a maximum of 34.87% on 20 January and a minimum of 8.45% on 5 August. In comparison, the average daily NDSI for the snow-covered areas rises earlier and falls later, with a maximum of 46.11 on 24 October and a minimum of 31.80 on 9 August. In other words, on the one hand, snow accumulation first occurs in areas with high NDSI values, and then spreads to snow-free areas. On the other hand, snow ablation starts from areas with low NDSI values, and then moves to areas with high NDSI values. However, in general, their changes are highly consistent, with a correlation coefficient of 0.91.  Furthermore, the analysis at the monthly scale considers different elevations and subregions. Figure 13 shows the average monthly NDSI at different elevation zones. On the TP, this reveals that a large proportion of the snow cover lasts from October to May, with a peak of 13.43 in February.  the TP during 2001-2017 was analyzed from three aspects. Specifically, the intra-annual and interannual variabilities were revealed with respect to different elevation zones and/or subranges. The response of snow cover variability to temperature and precipitation in the different subranges was also investigated.

Intra-Annual Variability of Snow Cover
The intra-annual variability was analyzed at both the daily and monthly scales. First, the average daily snow fraction (the ratio of NDSI greater than 0) and the average daily NDSI for snow-covered areas in a hydrological year are shown in Figures 11 and 12, respectively. The gray error bars represent the standard deviation. It can be seen that the daily snow fraction over the TP fluctuates greatly, with a maximum of 34.87% on 20 January and a minimum of 8.45% on 5 August. In comparison, the average daily NDSI for the snow-covered areas rises earlier and falls later, with a maximum of 46.11 on 24 October and a minimum of 31.80 on 9 August. In other words, on the one hand, snow accumulation first occurs in areas with high NDSI values, and then spreads to snow-free areas. On the other hand, snow ablation starts from areas with low NDSI values, and then moves to areas with high NDSI values. However, in general, their changes are highly consistent, with a correlation coefficient of 0.91.  Furthermore, the analysis at the monthly scale considers different elevations and subregions. Figure 13 shows the average monthly NDSI at different elevation zones. On the TP, this reveals that a large proportion of the snow cover lasts from October to May, with a peak of 13.43 in February.  Furthermore, the analysis at the monthly scale considers different elevations and subregions. Figure 13 shows the average monthly NDSI at different elevation zones. On the TP, this reveals that a large proportion of the snow cover lasts from October to May, with a peak of 13.43 in February. Rapid accumulation and ablation occur in September/October and May/June, respectively. Moreover, a slight decrease appears in December, which is mainly due to the dryness in winter over the TP. However, the snow cover variation is different in the different elevation zones. In areas with an altitude of less than 4000 m, the period of snow presence is fairly short, mainly in January and February. Meanwhile, the areas above 6000 m are almost permanently covered with snow, where the lowest average NDSI is as high as 50.05. This is a large difference compared with the areas in the lower elevation zone.
Furthermore, since about half of the TP has an altitude of 4000-5000 m, the intra-annual snow variation curve of the areas is similar to that of the entire TP. However, its maximum appears in March, which is the same as the areas with an altitude of 5000-6000 m.
Rapid accumulation and ablation occur in September/October and May/June, respectively. Moreover, a slight decrease appears in December, which is mainly due to the dryness in winter over the TP. However, the snow cover variation is different in the different elevation zones. In areas with an altitude of less than 4000 m, the period of snow presence is fairly short, mainly in January and February. Meanwhile, the areas above 6000 m are almost permanently covered with snow, where the lowest average NDSI is as high as 50.05. This is a large difference compared with the areas in the lower elevation zone. Furthermore, since about half of the TP has an altitude of 4000-5000 m, the intra-annual snow variation curve of the areas is similar to that of the entire TP. However, its maximum appears in March, which is the same as the areas with an altitude of 5000-6000 m. In view of the different subregions (see Figure 1), there are also some commonalities and individualities in the intra-annual snow cover variation, as shown in Figure 14. On the one hand, all the subregions quickly accumulate snow in September/October. Except for Tarim and Indus, all subregions have a lower NDSI in December. Furthermore, the snow cover in each subregion reaches a minimum in July/August. On the other hand, snowmelt onset occurs between January and March, depending on the subregion. Hexi, for example, has relatively high snow cover in November, but the snow begins to melt as early as January. The snow in Indus is almost exclusively concentrated between January and March, and then drops sharply to a very low proportion. In addition, it can be noticed that the Inner subregion has three peaks, which occur in January, May, and October, respectively. In contrast, Tarim, with a high average NDSI, has a smoother snow cover curve, retaining some permanent snow in summer. Moreover, the intra-annual snow variation curves of the Mekong and Yangtze river basins are the closest to that of the entire TP. In view of the different subregions (see Figure 1), there are also some commonalities and individualities in the intra-annual snow cover variation, as shown in Figure 14. On the one hand, all the subregions quickly accumulate snow in September/October. Except for Tarim and Indus, all subregions have a lower NDSI in December. Furthermore, the snow cover in each subregion reaches a minimum in July/August. On the other hand, snowmelt onset occurs between January and March, depending on the subregion. Hexi, for example, has relatively high snow cover in November, but the snow begins to melt as early as January. The snow in Indus is almost exclusively concentrated between January and March, and then drops sharply to a very low proportion. In addition, it can be noticed that the Inner subregion has three peaks, which occur in January, May, and October, respectively. In contrast, Tarim, with a high average NDSI, has a smoother snow cover curve, retaining some permanent snow in summer. Moreover, the intra-annual snow variation curves of the Mekong and Yangtze river basins are the closest to that of the entire TP.
Rapid accumulation and ablation occur in September/October and May/June, respectively. Moreover, a slight decrease appears in December, which is mainly due to the dryness in winter over the TP. However, the snow cover variation is different in the different elevation zones. In areas with an altitude of less than 4000 m, the period of snow presence is fairly short, mainly in January and February. Meanwhile, the areas above 6000 m are almost permanently covered with snow, where the lowest average NDSI is as high as 50.05. This is a large difference compared with the areas in the lower elevation zone. Furthermore, since about half of the TP has an altitude of 4000-5000 m, the intra-annual snow variation curve of the areas is similar to that of the entire TP. However, its maximum appears in March, which is the same as the areas with an altitude of 5000-6000 m. In view of the different subregions (see Figure 1), there are also some commonalities and individualities in the intra-annual snow cover variation, as shown in Figure 14. On the one hand, all the subregions quickly accumulate snow in September/October. Except for Tarim and Indus, all subregions have a lower NDSI in December. Furthermore, the snow cover in each subregion reaches a minimum in July/August. On the other hand, snowmelt onset occurs between January and March, depending on the subregion. Hexi, for example, has relatively high snow cover in November, but the snow begins to melt as early as January. The snow in Indus is almost exclusively concentrated between January and March, and then drops sharply to a very low proportion. In addition, it can be noticed that the Inner subregion has three peaks, which occur in January, May, and October, respectively. In contrast, Tarim, with a high average NDSI, has a smoother snow cover curve, retaining some permanent snow in summer. Moreover, the intra-annual snow variation curves of the Mekong and Yangtze river basins are the closest to that of the entire TP.

Inter-Annual Variability of Snow Cover
Based on the cloud-free NDSI product, the inter-annual snow cover variability was also analyzed. The variation trend over the entire TP is described by a piecewise linear regression model [35,47]: where NDSI t denotes the NDSI in the t-th year, ε is the residual error, t 0 is the breakpoint determined by the least-squares error, namely t 0 = argmin( ε 2 ), β i is the regression coefficient, and β 1 and β 1 + β 2 are the slopes of the two equations, which can quantify the trends before and after the breakpoint. Over the past decade, the annual average NDSI has fallen from more than 11 to about eight. In addition, since the snow cover reaches its lowest value in July and August, we can assume that the snow cover during this period is permanent snow cover. Figure 15b shows the changes of average NDSI for July and August over the 17 years. It can be seen that, after a rapid rise during 2001-2003, a slight decreasing trend of 0.03/year has lasted for 14 years.
Moreover, according to the magnitude of the average NDSI, the 10 subregions over the TP are ranked as Qaidam, Yellow, Inner, Yangtze, Mekong, Hexi, Brahmaputra, Indus, Salween, and Tarim (see Figure 16). This reveals that the snow cover is high in the western, southern, and northern margins of the TP, and relatively low in the central and eastern regions. In addition, the snow cover in each subregion has fluctuated greatly over the 17 years. The linear regression equation of Salween has the smallest slope of −0.13/year. However, this trend is not statistically significant (α = 0.10).

Inter-Annual Variability of Snow Cover
Based on the cloud-free NDSI product, the inter-annual snow cover variability was also analyzed. The variation trend over the entire TP is described by a piecewise linear regression model [35,47]: NDSI t = β 0 +β 1 t+ε, t≤t 0 β 0 +β 1 t+β 2 (t-t 0 )+ε, t>t 0 , (10) where NDSI t denotes the NDSI in the t -th year, ε is the residual error, t 0 is the breakpoint determined by the least-squares error, namely t 0 = argmin(‖ε‖ 2 ), β i is the regression coefficient, and β 1 and β 1 +β 2 are the slopes of the two equations, which can quantify the trends before and after the breakpoint. Firstly, the annual averages of NDSI over the entire TP from 2001 to 2017 are shown in Figure  15a. An apparent increasing trend can be observed during 2001-2005, as well as a slight decreasing trend during 2005-2017. Using the piecewise linear regression model, the trends are determined as being +0.68/year and −0.16/year, respectively. The trends are statistically significant (α=0.10). Over the past decade, the annual average NDSI has fallen from more than 11 to about eight. In addition, since the snow cover reaches its lowest value in July and August, we can assume that the snow cover during this period is permanent snow cover. Figure 15b shows the changes of average NDSI for July and August over the 17 years. It can be seen that, after a rapid rise during 2001-2003, a slight decreasing trend of 0.03/year has lasted for 14 years.
Moreover, according to the magnitude of the average NDSI, the 10 subregions over the TP are ranked as Qaidam, Yellow, Inner, Yangtze, Mekong, Hexi, Brahmaputra, Indus, Salween, and Tarim (see Figure 16). This reveals that the snow cover is high in the western, southern, and northern margins of the TP, and relatively low in the central and eastern regions. In addition, the snow cover in each subregion has fluctuated greatly over the 17 years. The linear regression equation of Salween has the smallest slope of −0.13/year. However, this trend is not statistically significant (α = 0.10).  To further investigate the correlation of the inter-annual snow cover variation between the subregions and the entire TP, Table 5 lists some indicators describing the variation and regional topography. On the one hand, the inter-annual snow cover variation of the four subregions of Inner, Yangtze, Mekong, and Salween has a high consistency with that of the TP, obtaining a correlation coefficient of above 0.82. On the other hand, the specific variability of the snow cover shows significant regional inconsistency. For example, Tarim has the largest average NDSI of 18.28, which is twice that of the TP; however, their average elevation difference is only 83 m. In comparison, the snow cover variability over the Mekong subregion with a similar average elevation, is more consistent with the entire TP. At the same time, significantly inconsistent variability can be observed in Hexi. Furthermore, in terms of the standard deviation and coefficient of variation, the snow cover changes most dramatically in the Indus valley, followed by Hexi, Mekong, and the Yellow River basin. On the whole, the inter-annual variations of the different subregions are highly inconsistent.  number of pixels (Pixel), regional-average elevation (Elevation), correlation coefficient (R) and p-value.

Response of Snow Cover to Climate Change
As we know, snow is very sensitive to climate change, of which the most important factors are temperature and precipitation. Accordingly, based on the meteorological station data, the effects of temperature and precipitation on the intra-annual and inter-annual variabilities of snow cover were analyzed. As the meteorological stations are mainly distributed in the east of the TP, this analysis only covered the seven subregions in this area. From the perspective of intra-annual variability, Figure 17 reveals that the snow cover over the TP has a significant negative correlation with temperature and precipitation. In all the subregions, the impact of temperature on the snow cover variation is greater than that of precipitation, and the degree is slightly magnified as the snow cover increases. In particular, the monthly NDSI in Qaidam has little correlation with precipitation. The reason is that the precipitation of Qaidam is extremely low, with an average of about 8.6 mm over To further investigate the correlation of the inter-annual snow cover variation between the subregions and the entire TP, Table 5 lists some indicators describing the variation and regional topography. On the one hand, the inter-annual snow cover variation of the four subregions of Inner, Yangtze, Mekong, and Salween has a high consistency with that of the TP, obtaining a correlation coefficient of above 0.82. On the other hand, the specific variability of the snow cover shows significant regional inconsistency. For example, Tarim has the largest average NDSI of 18.28, which is twice that of the TP; however, their average elevation difference is only 83 m. In comparison, the snow cover variability over the Mekong subregion with a similar average elevation, is more consistent with the entire TP. At the same time, significantly inconsistent variability can be observed in Hexi. Furthermore, in terms of the standard deviation and coefficient of variation, the snow cover changes most dramatically in the Indus valley, followed by Hexi, Mekong, and the Yellow River basin. On the whole, the inter-annual variations of the different subregions are highly inconsistent. The indicators represent the average NDSI (Avg.), standard deviation (Std.), coefficient of variation (CV), number of pixels (Pixel), regional-average elevation (Elevation), correlation coefficient (R) and p-value.

Response of Snow Cover to Climate Change
As we know, snow is very sensitive to climate change, of which the most important factors are temperature and precipitation. Accordingly, based on the meteorological station data, the effects of temperature and precipitation on the intra-annual and inter-annual variabilities of snow cover were analyzed. As the meteorological stations are mainly distributed in the east of the TP, this analysis only covered the seven subregions in this area. From the perspective of intra-annual variability, Figure 17 reveals that the snow cover over the TP has a significant negative correlation with temperature and precipitation. In all the subregions, the impact of temperature on the snow cover variation is greater than that of precipitation, and the degree is slightly magnified as the snow cover increases. In particular, the monthly NDSI in Qaidam has little correlation with precipitation. The reason is that the precipitation of Qaidam is extremely low, with an average of about 8.6 mm over the 17 years according to the four meteorological stations. Generally speaking, the intra-annual snow cover variation over the TP is mainly controlled by temperature rather than precipitation.
Remote Sens. 2019, 11, x FOR PEER REVIEW 18 of 21 the 17 years according to the four meteorological stations. Generally speaking, the intra-annual snow cover variation over the TP is mainly controlled by temperature rather than precipitation. According to the Pearson correlation coefficients (PCCs) of the average monthly NDSI and temperature/precipitation from 2001 to 2016, Table 6 describes the correlation between the interannual variation of snow cover and the climatic factors. It can be seen that temperature has little effect on the inter-annual snow cover variation, with only February showing a significant negative correlation. Precipitation, in contrast, is typically positively correlated with the snow cover variation, especially in January and October. Moreover, the overall PCCs from October to March also indicate that precipitation has a greater effect on the inter-annual variability than temperature. However, the regional inconsistency is also significant, for example, the correlation coefficient is relatively low around the Yangtze River. In the Salween River basin, there is even an alternating positive/negative correlation. As a result, compared to the intra-annual variability in snow cover, inter-annual variability in snow cover exhibits a weaker correlation with both temperature and precipitation.

Month
Temperature

Conclusions
The investigation of snow cover variability over the TP is of great importance. MODIS snow products are widely adopted to monitor the snow cover variation over the TP, owing to the appropriate temporal and spatial resolutions. However, removing cloud cover has been a challenge According to the Pearson correlation coefficients (PCCs) of the average monthly NDSI and temperature/precipitation from 2001 to 2016, Table 6 describes the correlation between the inter-annual variation of snow cover and the climatic factors. It can be seen that temperature has little effect on the inter-annual snow cover variation, with only February showing a significant negative correlation. Precipitation, in contrast, is typically positively correlated with the snow cover variation, especially in January and October. Moreover, the overall PCCs from October to March also indicate that precipitation has a greater effect on the inter-annual variability than temperature. However, the regional inconsistency is also significant, for example, the correlation coefficient is relatively low around the Yangtze River. In the Salween River basin, there is even an alternating positive/negative correlation. As a result, compared to the intra-annual variability in snow cover, inter-annual variability in snow cover exhibits a weaker correlation with both temperature and precipitation.

Conclusions
The investigation of snow cover variability over the TP is of great importance. MODIS snow products are widely adopted to monitor the snow cover variation over the TP, owing to the appropriate temporal and spatial resolutions. However, removing cloud cover has been a challenge of snow monitoring due to the extremely complex topography over the TP. Therefore, this paper is intended as a precedent for producing cloud-free MODIS NDSI product using spatio-temporal information.
The proposed two-stage spatio-temporal fusion framework includes an adjusted Terra and Aqua combination (TAC) and a spatio-temporal fusion based on Gaussian kernel function and error correction (STF-GKF-EC). Firstly, the TAC based on an adjusted priority scheme, is used as the first stage to reduce the cloud cover. STF-GKF-EC is then implemented according to the following procedure: (1) neighborhood interpolation; (2) GKF-based weighting integration of the spatio-temporal blocks; and (3) error correction. It should be noted that this process is recursively performed on the NDSI product until no cloud-covered pixels remain. Since the TAC has been extensively demonstrated to be highly accurate, only STF-GKF-EC was evaluated in this paper. The simulated experiment suggests that STF-GKF-EC can obtain an average OA of 91.48%. In addition, the MAE and the MAE for snow-covered areas (MAE_S) are 3.88 and 12.06, respectively. Overall, STF-GKF-EC has three advantages: (1) it takes into account regional inconsistency through the block processing; (2) it simultaneously considers the spatial and temporal information based on the unified spatio-temporal weight; and (3) it further improves the accuracy by the error correction. Nevertheless, there are still some disadvantages: (1) STF-GKF-EC may lead to a few misclassifications at the edge of the snow cover; and (2) it is difficult to recover the cloud-covered pixels with a high precision when continuous cloud cover appears in the snow accumulation or ablation period. For our future work, remote sensing data from different sensors, such as Sentinel data, will be fused to produce higher-quality NDSI products. The key is to improve the accuracy of the areas covered by continuous cloud during the snow accumulation and ablation.
Based on the cloud-removed product, the variability of snow cover over the TP was analyzed. It reveals that the snow cover over the TP is relatively high from October to May. The average NDSI during the snow period reaches a trough in December and a peak in February/March. The snow cover over the TP presents an upward trend of 0.68/year from 2001 to 2005, and a downward trend of 0.16/year from 2005 to 2017. Moreover, on the one hand, the explanatory power of temperature for the intra-annual snow cover variation is higher than that of precipitation. On the other hand, precipitation has a relatively significant positive effect on the inter-annual variation. With global warming, monitoring the variability of snow cover is an urgent concern. Undoubtedly, the decreasing trend of snow cover over the TP after 2005 is a warning signal. Therefore, more research is needed on this issue.