Gap-Filling of a MODIS Normalized Di ﬀ erence Snow Index Product Based on the Similar Pixel Selecting Algorithm: A Case Study on the Qinghai–Tibetan Plateau

: Cloud contamination has largely limited the application of the Moderate Resolution Imaging Spectroradiometer(MODIS) normalized di ﬀ erence snow index (NDSI). Here, a novel gap-ﬁlling method based on spatial-temporal similar pixel interpolation was proposed to remove cloud occlusions in MODIS NDSI products. First, the widely used Terra and Aqua combination and three-day temporal ﬁlter methods were applied. The remaining missing NDSI information was estimated by using similar eligible pixels in the remaining cloud-free portion of a target image through a spatial-temporal similar pixel selecting algorithm (SPSA). The MODIS daily NDSI product data from 2003 to 2018 in the Qinghai–Tibetan Plateau (China) was used as a case study. The results demonstrate that the three-step methodology can generate almost completely cloud-free, daily MODIS NDSI images, reducing the cloud-gap fraction from > 45% to less than 1.5% on average. The validation results of the SPSA method exhibited a high accuracy, with a high R 2 exceeding 0.78, a low mean absolute error of 2.77%, a root mean square error of 3.78%, and a 96.92% overall accuracy. The proposed method can ﬁll cloud gaps without a signiﬁcant loss of accuracy, especially during snow cover transition periods (autumn and spring), which may provide more accurate cloud-free NDSI data for climate change and energy balance studies.


Introduction
As an integral part of the Earth's climate system, seasonal snow is one of the most variable land cover types throughout the year and has a strong impact on the radiation and energy balance, hydrological and biogeochemical cycles, and even human activities [1,2]. Snow cover characteristics, such as the cover extent and duration, are also crucial parameters in hydrological and ecological process models [3]. Moreover, meltwater from glaciers and seasonal snow packs provide water for nearly one-sixth of the world's population [4] and more than one-fifth of China's population, with the Tibetan Plateau being the main source of meltwater in China. Therefore, in recent years, the study of snow snow cover (FSC) products [55]. In 2016, MODIS version 6 was released, which no longer provides binary snow cover products but instead offers NDSI products, which can be regarded as continuous numerical products [52,56]. Compared with binary snow cover products, NDSI products can provide more information, which is more conducive to hydrological and ecological process simulations. Additionally, the increased data content (NDSI product) allows for flexibility in using the datasets for specific regions and end-user applications [56]. However, traditional cloud removal methods have mainly focused on cloud removal for discrete MODIS binary snow cover products (i.e., snow or snow-free). Currently, there are relatively few studies on cloud removal algorithms for continuous snow cover products (such as the NDSI product in version 6), and there have not been many studies that have performed cloud removal for fractional snow cover (FSC, V5) products. Dozier et al. [42], for example, first applied the spatiotemporal cube concept to snow products in 2008, filling in the gaps in the cloud by interpolating cubic splines in time on each pixel. Tang et al. [25] also used the cubic spline interpolation algorithm to de-cloud MODIS FSC products monitoring the Qinghai-Tibetan Plateau. Dozier's algorithm, however, considers only a small number of local neighbors; Tang's algorithm is a simple one-dimensional time interpolation algorithm that only considers the time information associated with snow cover and does not include the spatial characteristics of snow cover. In the case of a substantial number of continuous cloud cover days, a large number of outliers will be generated.
Dong et al. proposed a cloud removal algorithm for binary snow cover products based on station observations and optical data, whereby the conditional probability that a given cloudy pixel represents snow cover can be calculated to reclassify the residual cloudy pixels [36,57,58] on the condition that the snow depth (SD) is higher than zero at a nearby station [14]. The presence of snow in one cloudy pixel can be predicted with data from neighboring weather stations. This method has been reported as effective, especially during snow season [36,57,59]; however, the distribution and number of meteorological stations limit, to a certain degree, the predictive ability of the snow cover reconstruction [14]. Cheng et al. [60] developed a novel approach based on a similar pixel replacement method. A spatiotemporal Markov random fields (STMRF) function was developed to identify the most appropriate cloud-free pixels to replace a cloudy pixel in MODIS land surface temperature products. Hou et al. [30] further developed Cheng's algorithm, applied it to MODIS FSC snow products, and conducted experiments in Xinjiang (China), which yielded good results. The validation results based on cloud mask assumptions exhibited a high accuracy, with a high R 2 exceeding 0.8, a lower root mean square error (RMSE) of 0.1, an overestimation error (Equation (13)) of 1.13%, an underestimation error (Equation (14)) of 1.4%, an overall accuracy (Equation (12)) of 93.72%, and a spatial efficiency of 0.78 [30].
Here, we propose a novel gap-filling method based on non-local spatiotemporal similar pixels and conditional probabilities to eliminate cloud occlusion in daily MODIS NDSI products. This algorithm first determines the possible NDSI range of cloudy pixels through spatial similarity, after which, the temporal similarity is used as an index to select similar pixels in the cloud-free areas to fill the cloud-covered area in the MODIS NDSI products. Therefore, our study focuses on the following three goals: (1) a comprehensive assessment of cloud contamination severity in MODIS NDSI products (version 6) across the Qinghai-Tibetan Plateau (TP); (2) an evaluation of the potential applicability of existing widely-used binary product-based cloud removal methods to continuous NDSI products; and (3) the development and accuracy assessment of an innovative cloud removal algorithm based on spatiotemporal pixel similarity.

Study Area
The Qinghai-Tibetan Plateau (TP) is located in the central Eurasian continent. It ranges 1532 km from north to south (26 • 00 12"N to 39 • 46 50"N) and 2945 km from west to east (73 • 18 52"E to 104 • 46 59"E), covering an area of over 2.5 million km 2 ( Figure 1). The TP has a mean elevation Remote Sens. 2020, 12, 1077 4 of 33 exceeding 4000 m above sea level (m.a.s.l.) and is among the areas most sensitive to global climate change [61]. The TP features a wet and warm summer (from June to August) and a cool and dry winter (from December to February) [62]. Winter temperatures exhibit a large spatial variation, ranging from −25 • C in the west to −15 • C in the east [63], providing favorable conditions for the preservation of snow cover. Summer temperatures across the TP are generally above 0 • C; however, due to very high altitudes and low temperatures (given the decreasing rate of temperature with altitude), the abundant precipitation in the summer still causes snowfall in high altitude areas. In the southeastern TP, the East Asian Summer Monsoon and Indian Monsoon produce heavy precipitation in the summer season. In the western part, westerly winds bring abundant precipitation in winter and early spring, mostly in the form of snow [64].
Unlike in northeastern China and northern Xinjiang, China, the snow cover in the TP is relatively low (less than 1 cm), where the status of the snow cover changes rapidly due to numerous factors, such as high solar radiation and high wind speeds [10], which may result in sublimation and snow drifting. In certain parts of the TP, snow cover can change substantially, even during a single day. Moreover, as it is affected by the East Asian Summer Monsoon and Indian Monsoon [62], the southeast of the TP exhibits abundant water vapor, which may lead to large amounts of cloud cover (especially in the spring and summer). In certain areas, the average annual cloud cover duration is as long as 300 days, where the maximum number of continuous cloud-covered days is greater than 20 days. The rapid change in snow cover status and high cloud-gap fraction provides significant challenges to the study of cloud removal algorithms for snow cover products across the TP. winter (from December to February) [62]. Winter temperatures exhibit a large spatial variation, ranging from −25 °C in the west to −15 °C in the east [63], providing favorable conditions for the preservation of snow cover. Summer temperatures across the TP are generally above 0 °C; however, due to very high altitudes and low temperatures (given the decreasing rate of temperature with altitude), the abundant precipitation in the summer still causes snowfall in high altitude areas. In the southeastern TP, the East Asian Summer Monsoon and Indian Monsoon produce heavy precipitation in the summer season. In the western part, westerly winds bring abundant precipitation in winter and early spring, mostly in the form of snow [64]. Unlike in northeastern China and northern Xinjiang, China, the snow cover in the TP is relatively low (less than 1 cm), where the status of the snow cover changes rapidly due to numerous factors, such as high solar radiation and high wind speeds [10], which may result in sublimation and snow drifting. In certain parts of the TP, snow cover can change substantially, even during a single day. Moreover, as it is affected by the East Asian Summer Monsoon and Indian Monsoon [62], the southeast of the TP exhibits abundant water vapor, which may lead to large amounts of cloud cover (especially in the spring and summer). In certain areas, the average annual cloud cover duration is as long as 300 days, where the maximum number of continuous cloud-covered days is greater than 20 days. The rapid change in snow cover status and high cloud-gap fraction provides significant challenges to the study of cloud removal algorithms for snow cover products across the TP.

MODIS NDSI Products
Two versions of MODIS daily snow products have been widely used in most studies, i.e., version 5 [51] and version 6 [52]. In version 5, binary and fractional snow cover are provided, both of which are calculated based on the NDSI.

MODIS NDSI Products
Two versions of MODIS daily snow products have been widely used in most studies, i.e., version 5 [51] and version 6 [52]. In version 5, binary and fractional snow cover are provided, both of which are calculated based on the NDSI.
In contrast, version 6 of the product contains several significant changes. For instance, binary and fractional snow cover estimates are no longer available, while only the NDSI is provided. The NDSI snow cover for a given pixel is reported as 0-100% [52]. The pixels with the other eight codes may represent missing data (200), no decision (201), nighttime (211), inland water (237), ocean (239), clouds (250), detector saturated (254), and filled (255) [52]. There may be two main reasons for the release of NDSI products instead of binary snow products or fractional snow cover (FSC) products in version 6: First, the threshold of the binary snow cover product (or regression coefficient of the FSC product) has significant regional heterogeneity such that the global unified standard cannot achieve an optimal effect. Second, both binary snow cover products and FSC products are generated based on NDSI products, which can provide more abundant snow information. Therefore, the NDSI product released by version 6 of MODIS is more conducive to user customization.
MODIS Terra and MODIS Aqua daily NDSI products (MOD10A1 and MYD10A1, Collection 006) from 1 January 2003 to 31 December 2018 were collected from National Snow and Ice Data Center (NSIDC) (http://nsidc.org/). Six tiles (i.e., h23v05, h24v05, h25v05, h26v05, h25v06, and h26v06) were required to cover the entire TP. The MODIS Reprojection Tool was then used to mosaic each of the six tiles via a nearest neighbor resampling method and re-project them from the sinusoidal projection into the Universal Transverse Mercator (UTM zone 45) projection with the World Geodetic System 1984 (WGS84) datum at a 500-m resolution.

Digital Elevation Model (DEM) Data
The NASA Shuttle Radar Topography Mission Digital Elevation Model data (SRTM3, 90 m) were provided by the Consortium for Spatial Information (CGIAR-CSI) (http://srtm.csi.cgiar.org). The DEM data were mosaicked, re-projected, and resampled to achieve consistency with the MODIS NDSI images. The elevation and aspect maps were extracted from processed DEM data for further analysis. To further analyze the influence of topographical factors (e.g., elevation and aspect) on the accuracy of the proposed algorithm, the elevation from 3000 to 6000 m was divided into 31 100-m intervals (regions with an altitude greater than 6000 m were considered as one interval). Similarly, the aspect from 0 • to 360 • was divided into 36 10 • intervals.

Analysis of Cloud Gaps in MODIS NDSI Products
To determine the number of cloud gaps and cloud cover duration in the MODIS NDSI products across the TP, we first reclassified the MODIS NDSI products into two categories: cloud-free (NDSI value (0 to 100), where inland water was 137), and clouds (250). Some special cases, such as detector saturated (254), missing data (200), and no decision (201), were classified as clouds due to the small amount of data available (less than 1%). After reclassification, the monthly and annual cloud coverage was calculated to analyze the influence of cloud gaps on the MODIS NDSI products across the TP.

Theoretical Basis
Unlike construction land or vegetation (such as cropland), snow cover is less affected by human activities, and thus, is mainly influenced by natural factors, which play an important role in the Remote Sens. 2020, 12, 1077 6 of 33 formation and melting of snow. The distribution of snow cover is mainly affected by meteorological factors (e.g., moisture content, atmospheric and land surface temperature, and ocean currents), topographic factors (e.g., altitude, slope, shady slope and sunny slope, and windward or leeward slopes), and other factors (e.g., solar radiation intensity causing snow to melt, snow drifting and subsequent snow redistribution, and land-cover type causing different snow melting rates). Therefore, when the meteorological conditions, topographic conditions, and other conditions of the two pixels are very similar, the snow cover status of the two pixels, which can be measured using NDSI to some extent, might also be very similar. This similar information not only includes the local neighborhood similarity (according to the first law of geography), but also includes the non-local similarity [65,66].
When a pixel is covered by clouds, its NDSI information is lost. Fortunately, this information can be estimated from similar pixels from the remaining known region at the same date and time. Similar pixels have the following characteristics: (1) they have nearly equal mean NDSI values in a relatively small time window [30], (2) they are relatively close to each other in space [67,68], and (3) they have similar change trends in multi-temporal images [69]. Further, (4) when they are simultaneously observed, they should have close NDSI values. Thus, in this study, we selected continuous multi-temporal images to provide auxiliary information. Figure 2 illustrates the NDSI distribution of MOD10A1 in a sample area (within 31 × 31 pixels) acquired on (a) 7 November 2010 and (b) 14 November 2010. The fifteen-day continuous NDSI values of three pixels are presented in Figure 2c. Compared with the pixel in the red box, the blue pixel better matched the trend and average value over time of the target pixel, even though it was farther away from the target pixel. Therefore, we suggest that the blue pixel was similar to the target pixel, whereas the red pixel was not.
Remote Sens. 2019, 11, x FOR PEER REVIEW 6 of 34 neighborhood similarity (according to the first law of geography), but also includes the non-local similarity [65,66]. When a pixel is covered by clouds, its NDSI information is lost. Fortunately, this information can be estimated from similar pixels from the remaining known region at the same date and time. Similar pixels have the following characteristics: (1) they have nearly equal mean NDSI values in a relatively small time window [30], (2) they are relatively close to each other in space [67,68], and (3) they have similar change trends in multi-temporal images [69]. Further, (4) when they are simultaneously observed, they should have close NDSI values. Thus, in this study, we selected continuous multi-temporal images to provide auxiliary information. Figure 2 illustrates the NDSI distribution of MOD10A1 in a sample area (within 31 × 31 pixels) acquired on (a) 7 November 2010 and (b) 14 November 2010. The fifteen-day continuous NDSI values of three pixels are presented in Figure 2c. Compared with the pixel in the red box, the blue pixel better matched the trend and average value over time of the target pixel, even though it was farther away from the target pixel. Therefore, we suggest that the blue pixel was similar to the target pixel, whereas the red pixel was not.
Based on this principle, the missing NDSI values of the cloudy pixels can be calculated from similar eligible pixels in the clear sky region on the same day with the help of multi-temporal reference images [30]. (c) Normalized difference snow index (NDSI) curve of the target pixel (black), similar pixel (blue), and non-similar pixel (red) from 2 November 2010 to 16 November 2010 as an example. The NDSI product is a product that has been de-clouded by the Terra and Aqua combination (TAC) and three-day temporary filtering (3DTF) methods (see Section 3.2.2).
Based on this principle, the missing NDSI values of the cloudy pixels can be calculated from similar eligible pixels in the clear sky region on the same day with the help of multi-temporal reference images [30].

Gap-Filling Method
An innovative cloud removal method based on similar pixel replacement was proposed to retrieve the missing NDSI information beneath the cloud gaps in the daily MODIS NDSI products. The Terra and Aqua combination (TAC) method and a three-day temporal filtering procedure (3DTF) were executed first to fill some gaps before the application of a similar pixel selecting algorithm (SPSA). Figure 3 illustrates a detailed cloud removal procedure via the MODIS NDSI product gap-filling method. (c) Normalized difference snow index (NDSI) curve of the target pixel (black), similar pixel (blue), and non-similar pixel (red) from 2 November 2010 to 16 November 2010 as an example. The NDSI product is a product that has been de-clouded by the Terra and Aqua combination (TAC) and threeday temporary filtering (3DTF) methods (see Section 3.2.2).

Gap-Filling Method
An innovative cloud removal method based on similar pixel replacement was proposed to retrieve the missing NDSI information beneath the cloud gaps in the daily MODIS NDSI products. The Terra and Aqua combination (TAC) method and a three-day temporal filtering procedure (3DTF) were executed first to fill some gaps before the application of a similar pixel selecting algorithm (SPSA). Figure 3 illustrates a detailed cloud removal procedure via the MODIS NDSI product gapfilling method.

Terra and Aqua Combination (TAC)
The daily MOD10A1 and MYD10A1 (hereinafter referred to as MOD and MYD, respectively) products were first combined due to the characteristics of clouds that move over time [14]. The priority was assigned to MOD, as numerous validation studies demonstrated that it had more accurate retrievals than did MYD [26,70]. The specific combination strategy was as follows [30]: when a pixel was cloud-free in both MOD and MYD, we used the NDSI value in MOD. When a pixel was cloud-free only in one of the products, we used the cloud-free NDSI value.

Three-Day Temporal Filter (3DTF)
Adjacent temporal filtering is based on the assumption that snow will persist on the land surface for a certain amount of time [30]. In previous studies, the size of the time filter window (days) varied from one to eight days. Due to the special climate conditions of the TP (high wind speeds easily lead to snow redistribution and a thin snow layer is easy to melt quickly), the snow state changes rapidly; therefore, a long composite period will introduce errors. In this study, the composite days were set to three days (i.e., the day of the cloud coverage, as well as the days before and after cloud coverage). When a pixel was covered by clouds and was cloudless the day before and after, the NDSI value could be calculated using the following formula: where NDSI_predict , is the predicted NDSI value of a cloud-covered pixel , at date T, and

Terra and Aqua Combination (TAC)
The daily MOD10A1 and MYD10A1 (hereinafter referred to as MOD and MYD, respectively) products were first combined due to the characteristics of clouds that move over time [14]. The priority was assigned to MOD, as numerous validation studies demonstrated that it had more accurate retrievals than did MYD [26,70]. The specific combination strategy was as follows [30]: when a pixel was cloud-free in both MOD and MYD, we used the NDSI value in MOD. When a pixel was cloud-free only in one of the products, we used the cloud-free NDSI value.

Three-Day Temporal Filter (3DTF)
Adjacent temporal filtering is based on the assumption that snow will persist on the land surface for a certain amount of time [30]. In previous studies, the size of the time filter window (days) varied from one to eight days. Due to the special climate conditions of the TP (high wind speeds easily lead to snow redistribution and a thin snow layer is easy to melt quickly), the snow state changes rapidly; therefore, a long composite period will introduce errors. In this study, the composite days were set to three days (i.e., the day of the cloud coverage, as well as the days before and after cloud coverage). When a pixel was covered by clouds and was cloudless the day before and after, the NDSI value could be calculated using the following formula: It should be noted that the conditions for the application of 3DTF were very strict. Particularly, the predicted NDSI value of the cloudy pixel could be calculated only when the pixel was cloudless the day before and after. The input data for this step were the output products of the previous step (i.e., a daily combination of MOD and MYD).

Similar Pixel Selecting Algorithm (SPSA)
The SPSA can be divided into two steps: (a) determine the possible range of the NDSI and (b) search for the most similar pixels within the possible range and calculate the predicted NDSI value.
(A) Determination of the potential NDSI range The physical process of snow accumulation and melting follows a predictable spatial pattern that occurs yearly [50]. This recurrence is widely acknowledged; however, to our knowledge, it has not been implemented in gap-filling procedures for snow cover products. By employing multiple years of MODIS snow cover images, this recurrence pattern was used to develop a snow dynamics model to reconstruct the missing NDSI information. Specifically, the NDSI value of a certain pixel should fluctuate from the multi-year average NDSI value of the pixel at a given moment with a certain deviation degree, which can be expressed as: where Range is the possible range of the NDSI value, Average is the multi-year average NDSI value at a certain time (date T as an example), Delta is the anomaly, and ε is the error term. Therefore, we can use this formula to determine the approximate range of the possible NDSI value of a pixel. Suppose that pixel P is a cloud-covered pixel at time T in 2015 (here we take 2015 as an example), and P' is the nearest N clear-sky pixels surrounding pixel P at time T in 2015 ( Figure 4a). With pixel P as the center, the search radius increases from 10 to 50, with a step length of 5, and keeps expanding until the termination condition is satisfied (N cloud-free pixels are found or the window size reaches 50). In this study, N was set to 20. NDSI images at time T of each year from 2003 to 2018 were selected as auxiliary data ( Figure 4c). According to these historical images, the average NDSI value of P and P' at time T over 15 years can be calculated using the following formula ( Figure 4b): where P 2003−2018 denotes the multi-year average NDSI value of pixel P at date T during 2003-2018 and C is the number of cloud-free pixels at date T from 2003 to 2018. According to Tobler's first law of geography (TFL) [67,68], nearby objects are more closely related than distant objects. Thus, closer pixels would have more similar attribute values than pixels further apart [71]. When pixel P is covered by the cloud, the deviation degree between the cloud-free pixels near pixel P and the multi-year average NDSI of their corresponding position can theoretically represent, to a certain extent, the deviation degree between the NDSI value of pixel P and its multi-year average. Therefore, based on the NDSI value of N clear-sky pixels at time T in 2015 (the pixels marked as P' in Figure 4) and the multi-year average NDSI value at time T in their corresponding position, it can be calculated that the Delta value of the P pixels (cloud-covered pixels) at time T in 2015 deviated from the multi-year average P pixels at time T as follows: where Delta is the anomaly between the NDSI value of P at time T and the multi-year average NDSI value of P at time T, and N is the number of cloud-free pixels that are closest to the cloudy pixel P. In this study, N was set to 20. Here, P I 2015 is the NDSI value of the ith clear-sky pixel at time T in 2015, P I 2001−2018 is the multi-year average NDSI value of date T for the ith clear-sky pixel from 2003 to 2018, and Delta can be either positive or negative. When Delta is positive (or negative), this indicates that the NDSI value of pixel P in the target image (take time T in 2015 as an example) is greater than (or less than) the multi-year average NDSI value, which can also indicate that the snow cover of the target image is more than (or less than) the average historical value.
Therefore, the value range of pixel P can be roughly determined as follows: where NDSI P range is the range of the predicted NDSI value of pixel P, and P 2003−2018 is the multi-year average NDSI value of pixel P from 2003 to 2018 at time T. ε is the error term, where its value in this study was 10. The lower and upper limits for the value range of NDSI were 0 and 100, respectively.
Remote Sens. 2019, 11, x FOR PEER REVIEW 9 of 34 (or less than) the multi-year average NDSI value, which can also indicate that the snow cover of the target image is more than (or less than) the average historical value. Therefore, the value range of pixel P can be roughly determined as follows: , where is the range of the predicted NDSI value of pixel P, and is the multiyear average NDSI value of pixel P from 2003 to 2018 at time T. is the error term, where its value in this study was 10. The lower and upper limits for the value range of NDSI were 0 and 100,

B) Selecting similar pixels
We then extracted M cloud-free pixels from the target image, which had an NDSI value in the within a 61 × 61 pixel search window centered at location P, as it is exceedingly timeconsuming to search for similar pixels in the entire study area [60]. To guarantee the precision of the SPSA, we let M account for at least 3000 pixels in the search window. If this condition was not satisfied, the search window was consecutively expanded to 101 × 101, 141 × 141, and 181 × 181 pixels, etc., until the entire study area was covered. The search radius was continuously and dynamically expanded with a step length of 20. These M cloud-free pixels are referred to as candidate similar pixels in this paper. Here, a 21-day time-series dataset from T − 10 to T + 10 was employed as auxiliary data ( Figure 5). Then, the similarity of each pixel was calculated between these M candidate similar pixels and the cloudy pixel P using the following formula: , ∈ 1, , ∈ 10, 10 , where is the similarity between cloud-covered pixel and the ith cloud-free candidate pixel We then extracted M cloud-free pixels from the target image, which had an NDSI value in the NDSI P range within a 61 × 61 pixel search window centered at location P, as it is exceedingly time-consuming to search for similar pixels in the entire study area [60]. To guarantee the precision of the SPSA, we let M account for at least 3000 pixels in the search window. If this condition was not satisfied, the search window was consecutively expanded to 101 × 101, 141 × 141, and 181 × 181 pixels, etc., until the entire study area was covered. The search radius was continuously and dynamically expanded with a step length of 20. These M cloud-free pixels are referred to as candidate similar pixels in this paper. Here, a 21-day time-series dataset from T − 10 to T + 10 was employed as auxiliary data ( Figure 5). Then, the similarity of each pixel was calculated between these M candidate similar pixels and the cloudy pixel P using the following formula: Remote Sens. 2020, 12, 1077 10 of 33 where SIMI P P i is the similarity between cloud-covered pixel P and the ith cloud-free candidate pixel P i (the value of i ranges from 1 to M) and NDSI P,t and NDSI P i ,t are the NDSI observations of pixels P and P i at time t, respectively. We note that this calculation can be performed only if both pixels P and P i have NDSI observations at time t. Here, N count is the number of days in the 21-day time window when both pixels P and P i have NDSI observations. SIMI P P i represents the degree of similarity between these two pixels, whose value ranges from 0 to 100.
Remote Sens. 2019, 11, x FOR PEER REVIEW 10 of 34 window when both pixels and have NDSI observations. represents the degree of similarity between these two pixels, whose value ranges from 0 to 100. We note that if the number of valid simultaneous observations between a candidate pixel and the target pixel is small in the 21-day time domain, the similarity between these two pixels may not be reliable. Therefore, the pixel will be removed from the candidate pixel list. In this study, the minimum threshold of simultaneous effective observations was set to 11 times.
Then, according to the similarity, K cloud-free candidate similar pixels, P', which were the most similar to the cloud-covered pixel, P, were selected. The K value can be customized according to user requirements and was set to 20 in this study. Finally, the predicted NDSI value for cloudy pixel P was calculated as the arithmetic NDSI mean of the K candidate similar pixels.
The software used in this study was the Interactive Data Language (IDL, version 8.4).

Validation Method Based on a "Cloud Mask Assumption"
Previous studies have typically used snow depth (SD) observations of meteorological stations to estimate cloud removal algorithms for the MODIS binary products; however, this validation method has problems regarding the verification of MODIS NDSI products. First, there is no clear linear relationship between the SD and NDSI [72]; second, the SD of the weather station cannot represent the entire pixel at a scale of 500 m (given the scale mismatch between the weather station and the pixel). Finally, the distribution of meteorological stations across the TP is mostly concentrated in the bottom of the valley at a lower altitude (Figure 1), which is hardly representative of the entire region due to its low spatial density; furthermore, there are no records from certain inaccessible and highly mountainous areas where the snow cover duration is longer. Therefore, another validation method based on a "mask assumption" was adopted in this study. The cloud gaps from other cloudy MODIS We note that if the number of valid simultaneous observations between a candidate pixel and the target pixel is small in the 21-day time domain, the similarity between these two pixels may not be reliable. Therefore, the pixel will be removed from the candidate pixel list. In this study, the minimum threshold of simultaneous effective observations was set to 11 times.
Then, according to the similarity, K cloud-free candidate similar pixels, P', which were the most similar to the cloud-covered pixel, P, were selected. The K value can be customized according to user requirements and was set to 20 in this study. Finally, the predicted NDSI value for cloudy pixel P was calculated as the arithmetic NDSI mean of the K candidate similar pixels.
The software used in this study was the Interactive Data Language (IDL, version 8.4).

Validation Method Based on a "Cloud Mask Assumption"
Previous studies have typically used snow depth (SD) observations of meteorological stations to estimate cloud removal algorithms for the MODIS binary products; however, this validation method has problems regarding the verification of MODIS NDSI products. First, there is no clear linear relationship between the SD and NDSI [72]; second, the SD of the weather station cannot represent the entire pixel at a scale of 500 m (given the scale mismatch between the weather station and the pixel). Finally, the distribution of meteorological stations across the TP is mostly concentrated in the bottom of the valley at a lower altitude (Figure 1), which is hardly representative of the entire region due to its low spatial density; furthermore, there are no records from certain inaccessible and highly mountainous areas where the snow cover duration is longer. Therefore, another validation method based on a "mask assumption" was adopted in this study. The cloud gaps from other cloudy MODIS NDSI images were used to simulate cloud masks for the "cloud-free" MODIS NDSI image, which is regarded as the true ground information [40,48,73].
The accuracy of the three sub-steps was evaluated independently and sequentially. The cloud removal results of the former method were the input data of the latter method. That is, the input data of the TAC method were the MOD10A1 and MYD10A1, while the output results were the TAC products, which were used as the input products of the 3DTF method. The cloud removal result (3DTF product) of the 3DTF method was used as the input product for the SPSA cloud removal method, and finally, the cloud-free MODIS NDSI product was produced using the SPSA method ( Figure 3).
The first two steps (TAC and 3DTF) do not include spatial neighborhood information from cloudy pixels and only consider the temporal information from cloudy pixels. Therefore, their verification method is relatively simple. The specific method is as follows: (A) For the TAC method: cloud-free pixels with NDSI values ranging from 0 to 100, both in the MOD10A1 and MYD10A1, were selected as test samples from 2003-2018. Pixels from MOD10A1 were taken as the true values, while the corresponding pixels from the MYD (at the same location) were taken as the predicted value. Equations (7)-(11) express the evaluation indicators. (B) For the 3DTF method: from 2003 to 2018, the cloud-free pixels with an NDSI value ranging from 0 to 100 within three consecutive days were selected as test samples, while the observed value of the middle day (date T) was taken as the true value. The arithmetic mean of the previous day's (date T − 1) NDSI and the latter day's (date T + 1) NDSI was taken as the predicted value. Equations (7)-(11) express the evaluation index. It should be noted that the evaluation of these two cloud removal methods (TAC and 3DTF) was performed day-by-day from 2003 to 2018. Each day, all pixels that satisfied these conditions were selected as verification samples. Thus, the accuracy indexes were available for each day. (C) For the SPSA method: a one-day cloud mask assumption was employed for validation. The specific details are outlined below.
The empirical cumulative distribution function (ECDF) [21,30] of the cloud-gap fraction was calculated based on the 3DTF product from 2003 to 2018 and is shown in Figure 6. Cloud contamination with a cloud-gap fraction corresponding to the 25th, 50th, and 75th percentile of the ECDF (hereafter abbreviated as P25, P50, and P75, respectively) are considered low, moderate, and high rates of cloud contamination [30]. The images with a cloud gap fraction closest to P25, P50, and P75 for each month were selected and the corresponding cloud masks were extracted for each month. We then selected three MODIS NDSI images with the lowest cloud-gap fraction for each month from 2003 to 2018 to be designated as "cloud-free" images (i.e., a total of 36 images were selected as "true" images). Each selected "cloud-free" NDSI image was related to three mask images, resulting in a total of 108 validation images from the 16-year study period. The acquisition date, cloud-gap fraction, and snow fraction information from the true images and cloud mask images selected in this study are summarized in Table 1.

Performance Metrics for NDSI
Five performance evaluation criteria (i.e., mean error (ME), mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE), and the coefficient of determination (R 2 )) were employed and calculated using the following equations: where x Predicted i is the estimated NDSI value of the ith cloud mask pixel, x Observed i is the corresponding true NDSI value, N is the total number of cloud mask pixels, and σ Predicted and σ Observed denote the standard deviations of the estimated and true NDSI value, respectively.

Classification Accuracy
Binary snow cover products (snow-covered or snow-free) are still widely used in water cycles, global climate change, water and heat energy balances, and other research fields. NDSI products can be converted into binary snow cover products according to specific thresholds [51], and thus, they serve as input parameters for numerous hydrological and climate models. To evaluate the accuracy of cloud-free NDSI products obtained by the proposed cloud removal algorithm when converted into binary snow cover products, the confusion [74] matrix method (Table 2), which is recommended by several studies [14,32,35,75,76], was used in this study to evaluate the classification accuracy of different cloud removal algorithms. The threshold for the binary classification used in this study was 40, which is recommended by the MODIS standard snow cover product. Table 2. Confusion matrix comparing the observed and predicted MODIS binary snow cover product.

NDSI (≥σ) NDSI (<σ)
Predicted NDSI NDSI (≥σ) ss sn NDSI (<σ) ns nn *ss, sn, ns, and nn (s refers to snow and n refers to no snow) are the number of correctly hit, false positives, false negatives, and correctly rejected pixels, respectively. The σ is the defined threshold values of the observed and predicted NDSI values that determine whether a MODIS pixel is covered by snow. In this study, the standard threshold of 40, recommended by the MODIS products, was adopted, i.e., σ was set to 40.
Three validation indices are defined as follows: where the overall accuracy (OA) is the probability that a MODIS pixel is correctly classified as snow or snow-free; UE and OE represent the underestimation and overestimation snow error, respectively. A perfect estimation of MODIS gives OA = 1, and OE = UE = 0. A completely failed estimation of MODIS is OA = 0, and UE + OE = 1 [30].

Cloud Gaps in MODIS NDSI Products across the TP
An understanding of the spatial and temporal distribution characteristics of clouds across the TP is the basis and premise for selecting an appropriate and effective cloud removal method [30]. After investigating the cloud gaps in the MODIS NDSI products of the TP, we found that the annual average cloud-gap fraction of the study area was 41.62-47.38% for Terra and 50.61-56.32% for Aqua from 2003 to 2018, with mean cloud-gap fractions of 44.06% and 53.55%, respectively (Figure 7). The results show that the mean number of cloudy days was 151-172 days for MODIS Terra NDIS images and 184-206 days for MODIS Aqua NDSI images, with means of 161 and 195 cloudy days, respectively (Figures 7 and 8). Although there were slight differences in the extent and duration of cloud contamination from year to year, the spatial distribution and pattern of cloud coverage from year to year across the TP seemed relatively stable. where the overall accuracy (OA) is the probability that a MODIS pixel is correctly classified as snow or snow-free; UE and OE represent the underestimation and overestimation snow error, respectively. A perfect estimation of MODIS gives OA = 1, and OE = UE = 0. A completely failed estimation of MODIS is OA = 0, and UE + OE = 1 [30].

Cloud Gaps in MODIS NDSI Products across the TP
An understanding of the spatial and temporal distribution characteristics of clouds across the TP is the basis and premise for selecting an appropriate and effective cloud removal method [30]. After investigating the cloud gaps in the MODIS NDSI products of the TP, we found that the annual average cloud-gap fraction of the study area was 41.62%-47.38% for Terra and 50.61%-56.32% for Aqua from 2003 to 2018, with mean cloud-gap fractions of 44.06% and 53.55%, respectively ( Figure  7). The results show that the mean number of cloudy days was 151-172 days for MODIS Terra NDIS images and 184-206 days for MODIS Aqua NDSI images, with means of 161 and 195 cloudy days, respectively (Figures 7 and 8). Although there were slight differences in the extent and duration of cloud contamination from year to year, the spatial distribution and pattern of cloud coverage from year to year across the TP seemed relatively stable.  where the overall accuracy (OA) is the probability that a MODIS pixel is correctly classified as snow or snow-free; UE and OE represent the underestimation and overestimation snow error, respectively. A perfect estimation of MODIS gives OA = 1, and OE = UE = 0. A completely failed estimation of MODIS is OA = 0, and UE + OE = 1 [30].

Cloud Gaps in MODIS NDSI Products across the TP
An understanding of the spatial and temporal distribution characteristics of clouds across the TP is the basis and premise for selecting an appropriate and effective cloud removal method [30]. After investigating the cloud gaps in the MODIS NDSI products of the TP, we found that the annual average cloud-gap fraction of the study area was 41.62%-47.38% for Terra and 50.61%-56.32% for Aqua from 2003 to 2018, with mean cloud-gap fractions of 44.06% and 53.55%, respectively ( Figure  7). The results show that the mean number of cloudy days was 151-172 days for MODIS Terra NDIS images and 184-206 days for MODIS Aqua NDSI images, with means of 161 and 195 cloudy days, respectively (Figures 7 and 8). Although there were slight differences in the extent and duration of cloud contamination from year to year, the spatial distribution and pattern of cloud coverage from year to year across the TP seemed relatively stable.  Overall, the MYD10A1 (obtained from Aqua at 13:30) had approximately 9% more cloud coverage than MOD10A1 (obtained from Terra at 10:30). This may have resulted from temperature increases, vegetation transpiration, and increases in soil evaporation throughout each day, which created conditions that favored cloud formation. The distribution of cloud contamination across the TP was very uneven, with high values mainly concentrated on the southeastern part of the TP, especially in the Hengduan mountain regions. The annual average cloud cover duration of these areas exceeds 250 days, making cloud removal very challenging.
MODIS Terra NDSI products were taken as an example to further analyze the relationship between cloud coverage and time. The monthly average cloud coverages in different elevation zones are shown in Figure 9. Based on Figure 9, the monthly multi-year average cloud coverage for each elevation zone demonstrated that the proportion of cloud cover in the TP changed with time, which occurred as follows: all areas (except for areas with altitudes above 6000 m) showed a similar seasonal cycle, reaching their maximum values in summer (June, July, and August), and had an average cloud cover of 54.1% (for all areas except the areas above 6000 m). When autumn arrived, the cloud coverage began to decline rapidly from September and reached the minimum value in November (with a mean value of 25.7%). Winter and the following spring showed a gradually increasing trend. The opposite trend was observed for areas over 6000 m.a.s.l. Cloud coverage peaks in the winter were as high as 80% in January, and were relatively low in the summer, at approximately 40%. This may have been due to MODIS misinterpreting the snow cover as clouds at high altitudes [77]. Overall, the MYD10A1 (obtained from Aqua at 13:30) had approximately 9% more cloud coverage than MOD10A1 (obtained from Terra at 10:30). This may have resulted from temperature increases, vegetation transpiration, and increases in soil evaporation throughout each day, which created conditions that favored cloud formation. The distribution of cloud contamination across the TP was very uneven, with high values mainly concentrated on the southeastern part of the TP, especially in the Hengduan mountain regions. The annual average cloud cover duration of these areas exceeds 250 days, making cloud removal very challenging.
MODIS Terra NDSI products were taken as an example to further analyze the relationship between cloud coverage and time. The monthly average cloud coverages in different elevation zones are shown in Figure 9. Based on Figure 9, the monthly multi-year average cloud coverage for each elevation zone demonstrated that the proportion of cloud cover in the TP changed with time, which occurred as follows: all areas (except for areas with altitudes above 6000 m) showed a similar seasonal cycle, reaching their maximum values in summer (June, July, and August), and had an average cloud cover of 54.1% (for all areas except the areas above 6000 m). When autumn arrived, the cloud coverage began to decline rapidly from September and reached the minimum value in November (with a mean value of 25.7%). Winter and the following spring showed a gradually increasing trend. The opposite trend was observed for areas over 6000 m.a.s.l. Cloud coverage peaks in the winter were as high as 80% in January, and were relatively low in the summer, at approximately 40%. This may have been due to MODIS misinterpreting the snow cover as clouds at high altitudes [77]. Despite the high cloud cover over the TP, most of the cloud cover was concentrated in the summer months. This was beneficial for cloud removal in snow products, as previous studies [33,34] indicate that, in the summer, when the temperature is at its highest, the remaining snow can be considered permanent snow cover, which only exists on the peaks of the highest mountains. This type of snow can be considered perennial (i.e., there is always snow under the cloud cover). This significantly reduces the need for cloud removal in binary snow products [33,34]. In autumn and spring, when the snow cover was sensitive to global climate change, cloud cover was relatively low, which facilitated the cloud removal procedure. Despite the high cloud cover over the TP, most of the cloud cover was concentrated in the summer months. This was beneficial for cloud removal in snow products, as previous studies [33,34] indicate that, in the summer, when the temperature is at its highest, the remaining snow can be considered permanent snow cover, which only exists on the peaks of the highest mountains. This type of snow can be considered perennial (i.e., there is always snow under the cloud cover). This significantly reduces the need for cloud removal in binary snow products [33,34]. In autumn and spring, when the snow cover was sensitive to global climate change, cloud cover was relatively low, which facilitated the cloud removal procedure. Figure 10 presents the mean cloud-gap fraction in different months after the execution of each gap-filling step. The TAC and 3DTF reduced the cloud gaps by an average of 8.19% (compared to the original Terra data) and 9.43%, respectively. The SPSA method removed the remaining ≈25% of cloud gaps, leaving only less than 1.5% on average.  Figure 10 presents the mean cloud-gap fraction in different months after the execution of each gap-filling step. The TAC and 3DTF reduced the cloud gaps by an average of 8.19% (compared to the original Terra data) and 9.43%, respectively. The SPSA method removed the remaining ≈25% of cloud gaps, leaving only less than 1.5% on average.  Figure 11 illustrates the spatial distribution of the average number of cloud-gap days (CGDs) for the MOD, TAC, and 3DTF products at a seasonal scale. The results indicate that the cloud occlusion in the TP was significantly mitigated by the TAC and 3DTF methods. The TAC algorithm reduced the number of the CGDs for the MOD products from 41.9, 49.0, 30.6, and 31.8 days to 35.3, 38.8, 23.5, and 26.6 days, for spring, summer, autumn, and winter, respectively. Moreover, the 3DTF algorithm further reduced these values to 26.1, 30.6, 16.2, and 19.3 days, respectively. For the 3DTF product, the areas with CGDs below 30 days in the autumn and winter accounted for 88.9% and 84.1% of the total area of the TP, respectively, which suggests a relatively small need for cloud removal in autumn and winter. During the spring and summer, areas with more than 60 days of CGD accounted for 3.4% and 6.9% of the entire TP. This part of the region presented the most difficulties for cloud removal via the SPSA method due to a lack of effective temporal information.  Figure 11 illustrates the spatial distribution of the average number of cloud-gap days (CGDs) for the MOD, TAC, and 3DTF products at a seasonal scale. The results indicate that the cloud occlusion in the TP was significantly mitigated by the TAC and 3DTF methods. The TAC algorithm reduced the number of the CGDs for the MOD products from 41.9, 49.0, 30.6, and 31.8 days to 35.3, 38.8, 23.5, and 26.6 days, for spring, summer, autumn, and winter, respectively. Moreover, the 3DTF algorithm further reduced these values to 26.1, 30.6, 16.2, and 19.3 days, respectively. For the 3DTF product, the areas with CGDs below 30 days in the autumn and winter accounted for 88.9% and 84.1% of the total area of the TP, respectively, which suggests a relatively small need for cloud removal in autumn and winter. During the spring and summer, areas with more than 60 days of CGD accounted for 3.4% and 6.9% of the entire TP. This part of the region presented the most difficulties for cloud removal via the SPSA method due to a lack of effective temporal information. Figure 11. Seasonal average number of cloud-gap days (CGDs) based on different cloud removal methods. The left column refers to the MOD product, the middle column refers to the TAC product, and the right column refers to the 3DTF product. (a), (b), (c), and (d) refer to spring, summer, autumn, and winter, respectively. Figure 12 shows a time-series example of cloud-free NDSI distribution maps after the TAC, 3DTF, and SPSA gap-filling steps. The remaining cloud-gapped fraction decreased to less than 1% after the multi-step cloud removal procedure. Figure 12 displays a complete snowfall and snowmelt dynamic process. From 3 January 2010, it first began to snow in western TP. Over the next few days, with the movement of the weather system, the snow rapidly spread across the eastern part of the Tibetan plateau, reaching the maximum snow-covered area on 5 January. Over the next four days, Figure 11. Seasonal average number of cloud-gap days (CGDs) based on different cloud removal methods. The left column refers to the MOD product, the middle column refers to the TAC product, and the right column refers to the 3DTF product. (a), (b), (c), and (d) refer to spring, summer, autumn, and winter, respectively. Figure 12 shows a time-series example of cloud-free NDSI distribution maps after the TAC, 3DTF, and SPSA gap-filling steps. The remaining cloud-gapped fraction decreased to less than 1% after the multi-step cloud removal procedure. Figure 12 displays a complete snowfall and snowmelt dynamic process. From 3 January 2010, it first began to snow in western TP. Over the next few days, with the movement of the weather system, the snow rapidly spread across the eastern part of the Tibetan plateau, reaching the maximum snow-covered area on 5 January. Over the next four days, the snow melted rapidly. The entire process of snowfall and melting occurred very rapidly, in less than a week, which reflected the rapid changes in the characteristics of the TP snow state. the snow melted rapidly. The entire process of snowfall and melting occurred very rapidly, in less than a week, which reflected the rapid changes in the characteristics of the TP snow state.    Based on Figure 13a, the MAE and RMSE had a similar seasonal cycle. Figure 13a shows that the low value of the MAE and RMSE occurred in winter (December, January, and February) and summer (June, July, and August), with MAE = 2.57 and RMSE = 3.77 for winter and MAE = 2.45 and RMSE = 3.41 for summer. The high value occurred in the spring (March, April, and May) and autumn (September, October, and November), with MAE = 3.38 and RMSE = 4.28 for spring and MAE = 2.86 and RMSE = 3.81 for autumn. MAE (or RMSE) was lower in winter because the temperature was at its lowest in the year, such that it was easy to preserve the snow cover and the snow cover state did Based on Figure 13a, the MAE and RMSE had a similar seasonal cycle. Figure 13a shows that the low value of the MAE and RMSE occurred in winter (December, January, and February) and summer (June, July, and August), with MAE = 2.57 and RMSE = 3.77 for winter and MAE = 2.45 and RMSE = 3.41 for summer. The high value occurred in the spring (March, April, and May) and autumn (September, October, and November), with MAE = 3.38 and RMSE = 4.28 for spring and MAE = 2.86 and RMSE = 3.81 for autumn. MAE (or RMSE) was lower in winter because the temperature was at its lowest in the year, such that it was easy to preserve the snow cover and the snow cover state did not change easily. Therefore, the observed values for the MOD (crossing at 10:30) and MYD (crossing at 13:30) are relatively close. The reason for the low MAE (or RMSE) in summer was that the snow cover on the TP only appeared in high-altitude areas of the highest mountains in the summer. The average NDSI of the entire TP was the minimum NDSI of the entire year, such that the MAE (or RMSE) was also relatively low.

Effectiveness of the Gap-Filling Methodology
Spring, however, was the season in which snow began to melt. With the gradual increase in temperature, the snow status changed rapidly, such that the observed values of MOD (crossing at 10:30) and MYD (crossing at 13:30) could change due to temperature and solar radiation. MAE (or RMSE) also showed a trend that first increased and then decreased in the spring. R 2 was lower from June to September, averaging only 0.52, while the annual average R 2 was 0.67.
The classification accuracy and performance metrics for NDSI had similar characteristics ( Figure 13). The lowest value of the overall accuracy appeared in the spring (OA = 96%), followed by autumn (OA = 97%). High values for the OA occurred in winter (OA = 98.1%) and summer (OA = 98.2%). The error in the OA was mainly derived from the UE. The UE showed a significant trend over time, with high values appearing in spring (UE = 3.2%) and autumn (UE = 2.6%). The reason for this could be that the NDSI values observed by the MYD were lower than those observed by the MOD due to the snow melting caused by the temperature rise at noon. This phenomenon was most evident in the spring.
Overall, the NDSI values observed using MOD and MYD had a good consistency, with an annual average MAE of 2.82. In winter, the consistency was higher, with an MAE of only 2.57. In spring and autumn, the MAE values were 3.38 and 2.86, respectively. This indicates that the overall de-clouding accuracy of the MOD and MYD fusion method was high, and therefore, this method can be employed for cloud removal. Figure 14 illustrates the performance metrics for NDSI and the classification accuracy of the 3DTF. Both the OA and UE changed with time (season), while the OE showed a relatively insignificant change with time. The OA was the lowest in spring and autumn (97.2% and 97.6%, respectively) and the highest in winter and summer (both exceeding 99%). The annual OE was maintained at a very low level (only 0.5%) and depended weakly on time. This indicates that the decrease in the OA was mainly due to the contribution of the UE, which was also consistent with the findings demonstrated in Figure 14. The annual average OA was high, reaching 98%; the annual average value of the UE was 2%, which was greater than that of the OE (0.5%).

Accuracy Assessment of the Three-Day Temporal Filter (3DTF) Method
The error associated with the 3DTF method mainly derives from the UE, where the peak UE value appeared in the spring and autumn, which was consistent with the snow cover characteristics of the TP. The snow layer on the TP was thin, the snow patch was relatively broken, the solar radiation was strong, the wind speed was high, sublimation was common, and the snow state changed rapidly. As the snow depth was shallow and the snow episodes were short, the snow cover that accumulated in the morning of date T may have melted in the afternoon of the same date or at date T + 1. In other words, in such a scenario, the NDSI of date T − 1 was zero (the accumulation of snow had not yet begun), the NDSI of date T was high (snow began to accumulate on date T), and the NDSI of date T + 1 was low (snow cover began to melt). In this case, the 3DTF method underestimated the NDSI value. Moreover, this phenomenon was more common in snow-accumulation (autumn) and snow-melting periods (spring). On the other hand, in winter, due to the low temperature, the snow cover state was more likely to be consistent, such that the UE was low. The reason for the low UE in summer was that the snow cover was only distributed on the highest mountains of the TP in summer. The snow cover is generally considered to be permanent snow cover, which is in a relatively stable state. Therefore, the UE in summer was low and the OA was high. The error associated with the 3DTF method mainly derives from the UE, where the peak UE value appeared in the spring and autumn, which was consistent with the snow cover characteristics of the TP. The snow layer on the TP was thin, the snow patch was relatively broken, the solar radiation was strong, the wind speed was high, sublimation was common, and the snow state changed rapidly. As the snow depth was shallow and the snow episodes were short, the snow cover that accumulated in the morning of date T may have melted in the afternoon of the same date or at date T + 1. In other words, in such a scenario, the NDSI of date T − 1 was zero (the accumulation of snow had not yet In general, the accuracy of the 3DTF method was very high. The annual average OA reached 98% but the accuracy varied with time. The OA was high in stable periods (winter and summer) but relatively low in the snow-accumulation (autumn) and snow-melting periods (spring).

Accuracy Assessment of the Similar Pixel Selecting Algorithm (SPSA)
According to the validation method introduced previously, the cloud removal experiment was conducted on 108 samples, and the performance of the SPSA algorithm was evaluated regarding the two aspects of performance metrics for NDSI and classification accuracy (Table 3). Overall, the accuracy of the SPSA algorithm was high. In terms of the performance metrics for NDSI, the annual averages were as follows: MAE was 2.77, RMSE was 3.78, and R 2 was 0.78. In terms of the classification accuracy, the annual average OA was 96.92%, the OE was quite low at only 1.10%, and the UE was greater than the OE at 1.98%. Therefore, it can be inferred that the main error source of the SPSA algorithm was the UE.
The performance of the SPSA algorithm was different for the four seasons (Table 3), among which, the accuracies for autumn, winter, and spring were close, whereas the situation in summer was unique. Among autumn, winter, and spring, autumn had the highest accuracy, followed by spring and winter. It is worth noting that, although the absolute indicators (i.e., the MAE, RMSE, and OA) in autumn were better than those in winter and spring, the relative indicators (MAPE) in autumn were slightly higher than those in winter and spring at 37.84%. This was mainly due to less snow cover in autumn compared with winter and spring, such that the relative error was larger. Summer had the lowest MAE (1.69) and RMSE (2.86) of the year and the highest overall accuracy (OA = 98.54%) for the whole year, but its relative error was the worst (MAPE = 71.12%). The R 2 was also the lowest for the whole year at only 0.62. Therefore, a high summer accuracy (low MAE and RMSE, high OA) was considered to be a "false high precision." The reason for this "false high precision" was that there was less snow cover in the summer and more snow-free areas. Even if the snow was misclassified in snowy areas, it accounted for only a small portion of the total, such that the OA was still high. The SPSA algorithm for summer was limited as summer was the season with the highest number of days covered by clouds in a year (Figure 11(b3)), i.e., the number of days covered by clouds in certain pixels during summer was nearly 80 days. These pixels did not provide valid temporal information. There was little temporal information available during the similarity pixel calculation, such that there was a reduction of the accuracy in the SPSA algorithm (R 2 was only 0.62). However, summer snow usually exists only in the high-altitude mountains, which is considered permanent snow [33,34]. The snow state is relatively stable, such that the remaining clouds in summer can be removed by other methods [33,34]. Figure 15 details the SPSA algorithm. The date associated with the data was 08/010, with an image size of 400 × 400 pixels (approximately 40000 km 2 ) (a) and 600 × 600 pixels (approximately 90,000 km 2 ) (b). Based on Figure 15, the SPSA algorithm could remove almost all cloud contamination. From a visual perspective, the SPSA algorithm could restore the missing snow information beneath the cloud, rendering the predicted snow distribution pattern highly consistent with the observed snow distribution pattern. However, the SPSA algorithm also had certain missing points. The omission areas were mostly concentrated in the transition area between snow and land, but not in the center of large areas covered with snow. We speculate that the reason for this was a relatively thin snow layer in this part of the snow-land boundary, as well as rapid changes in the snow state, particularly with newly fallen thin snow, which melted within a short period. Therefore, the information provided in the time dimension (both before and after the snow) had a lower NDSI value, such that the selection of a similar pixel introduced error. This was also consistent with the conclusion listed in Table 3, i.e., the UE of the SPSA algorithm was approximately twice the OE. The SPSA algorithm performed well for large cloud areas (Figure 15b). Even when the cloud coverage exceeded 75%, the missing NDSI value could be reconstructed. However, certain errors were identified within large cloud patches, such as region I in Figure 15b. This may have been due to the large size of the cloud patch areas, which increased the distance between the target pixel and similar eligible pixels, thus reducing the accuracy of the algorithm. 90,000 km 2 ) (b). Based on Figure 15, the SPSA algorithm could remove almost all cloud contamination. From a visual perspective, the SPSA algorithm could restore the missing snow information beneath the cloud, rendering the predicted snow distribution pattern highly consistent with the observed snow distribution pattern. However, the SPSA algorithm also had certain missing points. The omission areas were mostly concentrated in the transition area between snow and land, but not in the center of large areas covered with snow. We speculate that the reason for this was a relatively thin snow layer in this part of the snow-land boundary, as well as rapid changes in the snow state, particularly with newly fallen thin snow, which melted within a short period. Therefore, the information provided in the time dimension (both before and after the snow) had a lower NDSI value, such that the selection of a similar pixel introduced error. This was also consistent with the conclusion listed in Table 3, i.e., the UE of the SPSA algorithm was approximately twice the OE. The SPSA algorithm performed well for large cloud areas (Figure 15b). Even when the cloud coverage exceeded 75%, the missing NDSI value could be reconstructed. However, certain errors were identified within large cloud patches, such as region I in Figure 15b. This may have been due to the large size of the cloud patch areas, which increased the distance between the target pixel and similar eligible pixels, thus reducing the accuracy of the algorithm.

Impact Factor Analysis of the SPSA Algorithm
To further analyze the relationship between the accuracy of the SPSA algorithm and the cloud occlusion rate, we randomly selected five images with a cloud-gap fraction of less than 3%: 08/313 (CGF = 1.69%), 10/077 (CGF = 2.18%), 12/092 (CGF = 2.26%), 13/321 (CGF = 2.86%), and 16/318 (CGF = 1.61%). According to the cumulative frequency of the cloud-gap fraction for the 3DTF product ( Figure 6), the cloud mask images were selected according to the proportion of the annual mean cloud-gap fraction from small to large. A total of 40 cloud mask images were selected. Table 4 lists the dates of the true and cloud mask image and the cloud fraction. For each true image scene, 40 cloud masks with a different cloud-gap fraction (from 3.42% to 64.68%) were employed. Thus, a total of 200 combinations were validated. The accuracy of the SPSA algorithm was then evaluated according to the previously mentioned cloud mask assumption method.  Figure 16 shows the accuracy indexes. In general, with these 200 images as verification samples, the average MAE of the SPSA cloud removal method was 3.78, the average RMSE was 4.60, the average R 2 was 0.88, and the average OA was 95.49%, which was consistent with the previous results in this study.
Subsequently, we calculated the relationship between the accuracy of these 200 groups of samples (RMSE and OA were selected as evaluation indicators), the cloud-gap fraction, and the mean NDSI of all pixels under the cloud (mean NDSI of all cloudy pixels).  Table 4). Figure 17 shows the scatterplots of the relationships among the various accuracy indexes. The results show that the accuracy of the SPSA cloud removal method (RMSE and OA) was independent of the cloud-gap fraction (R 2 = 0.04 for RMSE and 0.01 for OA). However, both the RMSE and OA had a strong correlation with the average mean NDSI value of the pixels under the cloud (R 2 = 0.84 and 0.74, respectively). The RMSE was positively correlated with the mean NDSI of the cloudy pixels, while the OA exhibited a negative correlation.  Table 4). Figure 17 shows the scatterplots of the relationships among the various accuracy indexes. The results show that the accuracy of the SPSA cloud removal method (RMSE and OA) was independent of the cloud-gap fraction (R 2 = 0.04 for RMSE and 0.01 for OA). However, both the RMSE and OA had a strong correlation with the average mean NDSI value of the pixels under the cloud (R 2 = 0.84 and 0.74, respectively). The RMSE was positively correlated with the mean NDSI of the cloudy pixels, while the OA exhibited a negative correlation. We then further analyzed the prediction results of 108 scene verification datasets used in Section 3.3.1 according to the zoning of elevation and aspect. The relationship between accuracy and topographical factors (elevation and aspect) also provided indirect evidence for our conclusions. Figure 18 demonstrates the accuracy's changing pattern with elevation and aspect. Accuracy generally decreased with increasing elevation. Aspect also affected the accuracy of the SPSA algorithm ( Figure 18). Specifically, the error associated with the sunny slope was small, while that of the shaded slope was large. This effect was especially notable between 4000 and 6000 m above sea level, but not below 4000 m or above 6500 m. Low areas (<4500 m) exhibited an overestimation and high areas (>4500 m) an underestimation (Figure 18b).
Previous studies have demonstrated a relationship between snow cover and elevation, which generally indicates that snow cover duration increases with elevation [6,62,[78][79][80][81]. Moreover, snow cover duration is lower in sunny slopes relative to shaded slopes due to solar irradiation [78,80]. The absolute accuracy of the SPSA algorithm can also be attributed to the consistent spatial distribution of snow, which is consistent with our previous observations. We then further analyzed the prediction results of 108 scene verification datasets used in Section 3.3.1 according to the zoning of elevation and aspect. The relationship between accuracy and topographical factors (elevation and aspect) also provided indirect evidence for our conclusions. Figure 18 demonstrates the accuracy's changing pattern with elevation and aspect. Accuracy generally decreased with increasing elevation. Aspect also affected the accuracy of the SPSA algorithm ( Figure 18). Specifically, the error associated with the sunny slope was small, while that of the shaded slope was large. This effect was especially notable between 4000 and 6000 m above sea level, but not below 4000 m or above 6500 m. Low areas (<4500 m) exhibited an overestimation and high areas (>4500 m) an underestimation (Figure 18b).
Previous studies have demonstrated a relationship between snow cover and elevation, which generally indicates that snow cover duration increases with elevation [6,62,[78][79][80][81]. Moreover, snow cover duration is lower in sunny slopes relative to shaded slopes due to solar irradiation [78,80]. The absolute accuracy of the SPSA algorithm can also be attributed to the consistent spatial distribution of snow, which is consistent with our previous observations. This result shows that the accuracy of the proposed method was not affected by the cloud-gap fraction and had a good robustness under different cloud cover circumstances. However, the NDSI itself affected the accuracy. If the mean NDSI value of cloudy pixels was large, cloud removal was more difficult, such that there was a reduction in the accuracy. When the mean NDSI was small (e.g., clouds were above snowless areas), cloud removal was much easier.

Comparison between TAC, 3DTF, and SPSA
Previous studies have demonstrated the satisfactory accuracy of existing MODIS cloud gapfilling methods during the stable-snow-cover phase, but these approaches cannot achieve acceptable accuracies in the transition phase characterized by snow cover accumulation and ablation [23,47]. Compared with the TAC and 3DTF methods, the SPSA algorithm showed no significant difference in accuracy during the stable snow-packed phase (winter) and snow-transition phase (spring and autumn) ( Table 5). The accuracies during autumn and spring (snow-cover accumulation and ablation phase) were not lower than that of winter (snow-packed phase), but slightly higher, which suggests that the SPSA method was also effective during the snow transition period. This may have been because cloud coverage reached a minimum in autumn, during which, time-series information was more abundant, thus facilitating the accurate selection of similar pixels.
The TAC and 3DTF algorithms are widely used in numerous cloud removal algorithms. Typically, these two algorithms are implemented first, followed by other algorithms. However, there are relatively few studies on accuracy evaluations for these two methods. This study demonstrated that both the TAC and the 3DTF methods had high accuracies, which could provide a theoretical basis for future studies. The SPSA algorithm could eliminate the remaining ≈25% cloud gaps without a significant loss of accuracy, which was the value of the SPSA algorithm. When using the proposed cloud removal method, users can specify how to combine the three algorithms according to their requirements. Additionally, the SPSA method can be used independently.  This result shows that the accuracy of the proposed method was not affected by the cloud-gap fraction and had a good robustness under different cloud cover circumstances. However, the NDSI itself affected the accuracy. If the mean NDSI value of cloudy pixels was large, cloud removal was more difficult, such that there was a reduction in the accuracy. When the mean NDSI was small (e.g., clouds were above snowless areas), cloud removal was much easier.

Comparison between TAC, 3DTF, and SPSA
Previous studies have demonstrated the satisfactory accuracy of existing MODIS cloud gap-filling methods during the stable-snow-cover phase, but these approaches cannot achieve acceptable accuracies in the transition phase characterized by snow cover accumulation and ablation [23,47]. Compared with the TAC and 3DTF methods, the SPSA algorithm showed no significant difference in accuracy during the stable snow-packed phase (winter) and snow-transition phase (spring and autumn) ( Table 5). The accuracies during autumn and spring (snow-cover accumulation and ablation phase) were not lower than that of winter (snow-packed phase), but slightly higher, which suggests that the SPSA method was also effective during the snow transition period. This may have been because cloud coverage reached a minimum in autumn, during which, time-series information was more abundant, thus facilitating the accurate selection of similar pixels.
The TAC and 3DTF algorithms are widely used in numerous cloud removal algorithms. Typically, these two algorithms are implemented first, followed by other algorithms. However, there are relatively few studies on accuracy evaluations for these two methods. This study demonstrated that both the TAC and the 3DTF methods had high accuracies, which could provide a theoretical basis for future studies. The SPSA algorithm could eliminate the remaining ≈25% cloud gaps without a significant loss of accuracy, which was the value of the SPSA algorithm. When using the proposed cloud removal method, users can specify how to combine the three algorithms according to their requirements. Additionally, the SPSA method can be used independently.

Comparison between SPSA and the Multi-Temporal Backward Filter
The multi-temporal backward filter (MTBF) has been successfully used by numerous studies with the MODIS standard snow products and other satellite data [40,58]. Using the MTBF, Hall et al. [82] generated cloud-gap filled daily snow cover extent map products. The cloud pixels on the current day of M*D10A1 were replaced with the most recent previous cloud-free pixels (snow/snow-free) from the M*D10A1 (see Hall and colleagues [82,83] for details). The method proposed in Hall et al. [82] is the method that was selected for the NASA MODIS standard snow cover extent (SCE) products because of its ease of use and effectiveness, as well as due to its reliance on data from only one sensor at a time to produce the results [83]. This method can also be employed for MODIS NDSI products [83].
We compared the performance of the SPSA and MTBF algorithms. The input data was the 3DTF data for both algorithms, including 108 validation images (see Section 3.3.1). The size of the MTBF's backward temporal window ranged from 1 to 10 days. Based on Figure 19a-c, for the MTBF algorithm, with an increase in the size of the backward temporal window, the cloud-gapped fraction gradually decreased from 17.2% to 0.87%. We note that with an increasing length of the temporal window, the cloud-gap fraction decreased dramatically during the first few days. However, as the window size increased, the accuracy of the MTBF algorithm decreased concurrently. For the MTBF, there was a trade-off between cloud removal efficiency and accuracy.
For different seasons, improvements to the accuracy of the SPSA in the transition period (autumn and spring) were significantly better than that in the stable period (winter). The MAE, R 2 , and OA of the SPSA improved by 1.92 (1.73), 0.05 (0.09), and 2.9% (2.3%), respectively, in spring (autumn) while those for winter only improved by 0.72, 0.01, and 1.0%, respectively. This shows that the advantage of the SPSA algorithm for snow transitions was more evident.

Advantages and Limitations of the SPSA
The precision of the SPSA was not affected by the cloud amount (i.e., cloud-gap fraction in a given day), which highlights the robustness of this approach. The SPSA algorithm could comprehensively account for the spatiotemporal correlation of the MODIS NDSI product time series, and thus, it could fully use the spatiotemporal correlation (similarity) of pixels, which also contributed to the high precision of the SPSA algorithm, especially during transition periods. Nevertheless, given that the SPSA algorithm depends on temporal information (i.e., the algorithm requires continuous time-series information to calculate the SIMI similarity index), when clouds covered a pixel for a long period, the algorithm was unable to provide effective temporal information, such that similar pixels could not be identified. In other words, the SPSA algorithm was limited in areas (or seasons) with a large number of consecutive cloudy days, which was why the accuracy of the SPSA algorithm was lower in summer ( Figure 11

Advantages and Limitations of the SPSA
The precision of the SPSA was not affected by the cloud amount (i.e., cloud-gap fraction in a given day), which highlights the robustness of this approach. The SPSA algorithm could comprehensively account for the spatiotemporal correlation of the MODIS NDSI product time series, and thus, it could fully use the spatiotemporal correlation (similarity) of pixels, which also contributed to the high precision of the SPSA algorithm, especially during transition periods. Nevertheless, given that the SPSA algorithm depends on temporal information (i.e., the algorithm requires continuous time-series information to calculate the SIMI similarity index), when clouds covered a pixel for a long period, the algorithm was unable to provide effective temporal information, such that similar pixels could not be identified. In other words, the SPSA algorithm was limited in areas (or seasons) with a large number of consecutive cloudy days, which was why the accuracy of the SPSA algorithm was lower in summer ( Figure-11(b3)).
Moreover, the SPSA algorithm still has certain limitations that require improvements. For example, in this study, we used fixed parameters M (3000) and K (20) to select M candidate similar Moreover, the SPSA algorithm still has certain limitations that require improvements. For example, in this study, we used fixed parameters M (3000) and K (20) to select M candidate similar pixels and K pixels with the highest similarity, which was arbitrary and sub-optimal. In future studies, we will attempt to use a variable adaptive algorithm to optimize the parameter selection process of the SPSA algorithm. Finally, cloud removal algorithms for pixels that have been covered by clouds for long periods should also be further developed.

Conclusions
Snow cover is one of the Global Climate Observing System (GCOS) essential climate variables. MODIS snow cover products have been widely used to investigate the distribution, extent, and duration of snow, along with snowmelt timing, which are critical for characterizing Earth's climate system and its changes [82,83]. However, clouds frequently create gaps in the snow cover data, which is the most important factor affecting the ability to accurately map the snow cover extent or duration.
Compared with binary snow cover products, the MODIS NDSI products (version 6) released in 2016 provide more abundant and continuous information of the snow cover status, thus facilitating the development of new cloud removal methods for snow cover products.
This study first systematically assessed the severity of the cloud contamination of MODIS NDSI products (version 6) across the TP. The results demonstrated that clouds were mainly concentrated in the southeast. Furthermore, cloud contamination was found to be largely season-dependent.
A novel cloud removal method was proposed for MODIS NDSI products, which mainly included three parts: (a) TAC, (b) 3DTF, and (c) SPSA. The validation results elucidated the differences in the efficiency and precision of each algorithm. The cloud removal efficiency of the TAC was approximately 9%, reducing the cloud-gap fraction from 45.2% (MOD products) to 35.7% (TAC products). The accuracy of the TAC was generally high (MAE = 2.82, R 2 = 0.67, and OA = 97.08%); however, this was significantly affected by time (season), with high accuracies observed in winter and summer, and low accuracies in spring and autumn.
The 3DTF method could remove approximately 10% of the cloud gaps and reduce the cloud-gap fraction from 35% (TAC products) to 25% (3DTF products). The 3DTF method exhibited a high accuracy with MAE = 1.45, R 2 = 0.88, and OA = 98.38%. The accuracy of this method was also season-dependent. Specifically, in stable periods, the 3DTF yielded a high precision, whereas, in the transition phase of snow cover accumulation (autumn) and ablation (spring), this algorithm had a relatively low accuracy, which was similar to the accuracy variation of the TAC method over time.
The SPSA method fully considered the effective information of the temporal and spatial dimensions to estimate the NSDI for a cloud-covered pixel. The possible range of the cloudy pixel NDSI value was first determined using spatial information, and then similar pixels within this potential range in the surrounding clear-sky area were selected based on the temporal similarity index to predict the missing NDSI information. This algorithm was efficient at filling almost all remaining cloud gaps without a significant loss of accuracy. The validation results from 108 one-day cloud-masked images based on cloud assumptions showed that the SPSA method provided good stability and applicability. The annual averages were as follows: MAE was 2.77, MAPE was 42.40%, RMSE was 3.78, and R 2 was 0.78. The annual average AO, OE, and UE were 96.92, 1.10, and 1.98%, respectively. Even if the cloud-gap fraction exceeded 60% (3DTF products), the SPSA still maintained a high accuracy. Compared with MTBF, the SPSA performed better, especially in transition periods (autumn and spring).
The accuracy of the snow products depended on numerous factors. In this study, we mainly focused on the accuracy of the gap-filling method but did not address the inherent accuracy of the MODIS snow maps because that has been documented elsewhere by numerous previous studies. Our study demonstrated that the uncertainty in the SPSA was greater in areas (or seasons) with frequent and persistent cloud cover, such as in southeastern TP in summer.
Various remote sensing products for land surface properties suffer from cloud gaps, which largely limit their applications. In this study, the application of NDSI data recovery produced good results. The ability of the SPSA for the reconstruction of other data and the gap-filling of continuous numerical physical quantities with recurring patterns, such as land surface temperature (LST) or the vegetation index (VI), should be examined in the future. Certain aspects of this study will be the subject of future investigations. Additionally, the applicability of the SPSA algorithm still requires further study to better characterize its strengths and limitations.
Data Availability: The cloud-free MODIS NDSI data product across the Tibetan Plateau generated in this study will be made publicly available from Zenodo at https://zenodo.org/record/3700229#.XmkR-KgzY2w (last access: 10 March 2020). The codes are also available for download.
Author Contributions: M.L. designed the study and developed the algorithm, conducted the data analysis, and wrote the manuscript. Y.P., X.Z., and N.L. significantly edited and revised the manuscript. All authors have read and agreed to the published version of the manuscript.