Gap-Filling of MODIS Fractional Snow Cover Products via Non-Local Spatio-Temporal Filtering Based on Machine Learning Techniques

: Cloud obscuration leaves signiﬁcant gaps in MODIS snow cover products. In this study, an innovative gap-ﬁlling method based on the concept of non-local spatio-temporal ﬁltering (NSTF) is proposed to reconstruct the cloud gaps in MODIS fractional snow cover (SCF) products. The ground information of a gap pixel was estimated by using the appropriate similar pixels in the remaining known part of an image via an automatic machine learning technique. We take the MODIS SCF product cloud gap ﬁlling data from 2001 to 2016 in Northern Xinjiang, China as an example. The results demonstrate that the methodology can generate almost continuous spatio-temporal, daily MODIS SCF images, and it leaves only 0.52% of cloud gaps long-term, on average. The validation results based on “cloud assumption” exhibit high accuracy, with a higher R 2 exceeding 0.8, a lower RMSE of 0.1, an overestimated error of 1.13%, an underestimated error of 1.4%, and a spatial efﬁciency ( SPAEF ) of 0.78. The validation based on 50 in situ snow depth observations demonstrates the superiority of the methodology in terms of accuracy and consistency. The overall accuracy is 93.72%. The average omission and commission error have increased approximately 1.16 and 0.53% compared with the original MODIS SCF products under a clear sky term. 0.7, and 0.75, the gap-ﬁlled ﬁctitious MODIS SCF assumed ground truth MODIS SCF image. there was gap-ﬁlling lower coverage level of ATF, ATF NSTF cases. the second step the results of previous step. As the previous step the input information of the second step and error non-local spatio-temporal ﬁltering (NSTF), and ﬁnal images.


Introduction
Snow cover is one of the fastest changing objects on the Earth's surface. It strongly affects the climate system, radiation, and energy balance between the atmosphere and the Earth, hydrological circulation, biogeochemical cycling, and even human activities [1,2]. In the Northern Hemisphere, the snow coverage ranges from approximately 45.2 to 1.9 million square kilometers, which occurs in January and August, respectively. The drastic variation of snow cover spatial extent is a sensitive indicator of climate change [3]. The melting of seasonal snow cover supplies the dominant water source for river runoff and ground water discharge, particularly for arid and semi-arid regions [4]. Approximately 1/6 of the world population and more than 1/5 of the Chinese population depend on water resources supplied by snow cover or glaciers [5]. Snow cover is also a critical parameter in hydrological and ecological process models [6]. There is an urgent requirement to accurately map the spatial distribution of snow cover. This is not only because of a rising demand for fresh water management but also due to the concerns about the effects of climate change. Cheng et al. (2014) developed a new approach based on the idea of similar pixel replacement [35]. The spatio-temporal Markov random fields (STMRF) function was constructed to search for the most appropriate similar non-cloud pixels to replace a cloud-contaminated pixel in MODIS land surface temperature products. Inspired by this work, we presented an innovative gap-filling method, based on the concept of the non-local spatio-temporal filter (NSTF), to eliminate the cloud contamination in daily MODIS SCF products. The ground information of a cloud pixel is estimated through selected similar pixels in the remaining known region. We anticipate that twofold goals can be achieved: (1) reconstructing the ground information of MODIS SCF products obscured by cloud cover and (2) achieving the overall accuracy of MODIS SCF source products under clear a sky term after filling the gaps.

Study Area
North Xinjiang, located between 42-50 • N and 79-92 • E and covering a total area of 0.39 million km 2 (Figure 1), is one of the three major seasonal snow cover areas in China. The area is characterized by distinct topographic relief, with the Altai Mountains in the north, the Tianshan Mountains in the south, and the Junggar basin in between. The elevation of the study area ranges from 187 to 5950 m. The land cover, which can severely influence the accuracy of snow cover mapping [8], mainly consists of grasslands and shrubs (42.6%), as well as bare ground (40.7%). Cropland (7.7%), water bodies (2.5%), urban and built up land (0.6%), and forests (5.9%) cover only 16.7% of the total area [34]. Xinjiang has a temperate continental climate and is a typical arid and semi-arid region. The annual mean precipitation is approximately 145 mm, and the evaporation capacity is about 200 mm [36]. The study area has a long, cold winter and a short, hot summer, with an extreme minimum temperature of −40 • C. Over half of the precipitation in the study area falls as snow in the winter. The seasonal snow cover period is approximately 120 days, with a mean snow depth of 0.6 m [25]. The considerably heterogeneous nature of the topography, diverse land cover types, and changeable climatic conditions have created complex spatial and temporal snow cover patterns.
Remote Sens. 2019, 11, x FOR PEER REVIEW 3 of 26 number of outliers in the case of continuous cloud coverage whereas the spatial distribution characteristics of snow cover was not considered. Cheng et al. (2014) developed a new approach based on the idea of similar pixel replacement [35]. The spatio-temporal Markov random fields (STMRF) function was constructed to search for the most appropriate similar non-cloud pixels to replace a cloud-contaminated pixel in MODIS land surface temperature products. Inspired by this work, we presented an innovative gap-filling method, based on the concept of the non-local spatio-temporal filter (NSTF), to eliminate the cloud contamination in daily MODIS SCF products. The ground information of a cloud pixel is estimated through selected similar pixels in the remaining known region. We anticipate that twofold goals can be achieved: (1) reconstructing the ground information of MODIS SCF products obscured by cloud cover and (2) achieving the overall accuracy of MODIS SCF source products under clear a sky term after filling the gaps.

Study Area
North Xinjiang, located between 42-50°N and 79-92°E and covering a total area of 0.39 million km 2 (Figure 1), is one of the three major seasonal snow cover areas in China. The area is characterized by distinct topographic relief, with the Altai Mountains in the north, the Tianshan Mountains in the south, and the Junggar basin in between. The elevation of the study area ranges from 187 to 5950 m. The land cover, which can severely influence the accuracy of snow cover mapping [8], mainly consists of grasslands and shrubs (42.6%), as well as bare ground (40.7%). Cropland (7.7%), water bodies (2.5%), urban and built up land (0.6%), and forests (5.9%) cover only 16.7% of the total area [34]. Xinjiang has a temperate continental climate and is a typical arid and semi-arid region. The annual mean precipitation is approximately 145 mm, and the evaporation capacity is about 200 mm [36]. The study area has a long, cold winter and a short, hot summer, with an extreme minimum temperature of −40 °C. Over half of the precipitation in the study area falls as snow in the winter. The seasonal snow cover period is approximately 120 days, with a mean snow depth of 0.6 m [25]. The considerably heterogeneous nature of the topography, diverse land cover types, and changeable climatic conditions have created complex spatial and temporal snow cover patterns.    31 December 2016 were collected from NASA's EOSDIS "Earthdata" (https://earthdata.nasa.gov/), respectively. Three tiles (i.e., h23v03, h23v04, and h24v04) were required to cover the whole study area. The MODIS Reprojection Tool (MRT) was used to mosaic each of the three tiles via nearest neighbor resample method and reprojected the primitive sinusoidal projection into the Universal Transverse Mercator (UTM zone 45) projection with the WGS84 datum and 500 m resolution.

Meteorological Observation, Land Cover and Digital Elevation Model data
The daily snow depth (SD) recorded by 50 meteorological stations were provided by the Chinese Meteorological Data Sharing Service System (CMDS) (http://cdc.cma.gov.cn/). Most of these meteorological stations are located in the inhabited and low elevation valleys. Only three stations are located at elevations above 2000 m ( Figure 1). The SD observations were recorded in centimeters, and the fractional part is rounded up.
The land cover data was extracted from land cover products of China, which was provided by the Cold and Arid Regions Science Data Center at Lanzhou (http://westdc.westgis.ac.cn). The overall classification accuracy of this land cover map was 71%, which was higher than the accuracies of other land cover maps (i.e., global land cover dataset of IGBP, MODIS land cover map, a global land cover map produced by the University of Maryland, and the land cover map produced by the global land cover for the year 2000 (GLC 2000) project. The overall classification accuracy of these four land cover maps is between 54.17-59.28%) [37,38]. The land cover map uses a unified 1000 m resolution and ALBERS projection. We reprojected it into UTM projection with the WGS84 datum and resampled it into 500 m resolution so that it has consistent projection and resolution with MODIS SCF data.
The NASA Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM) data (SRTM3, 90 m) was provided by the Consortium for Spatial Information (CGIAR-CSI) (http://srtm.csi.cgiar.org). The eighteen DEM tiles are mosaicked, reprojected, and resampled to achieve consistency with the MODIS SCF images. The elevation, aspect, and slope maps were extracted from processed DEM data.

Investigation on Cloud Gaps in MODIS SCF Products
How much cloud gaps are in the MODIS SCF products? How long is the cloud cover duration? To answer these two questions, we first reclassified the MODIS SCF products into two categories: fractional snow (0 to 100) and cloud (250). After reclassification, annual cloud coverage is calculated to analyze the influence of cloud gaps on MODIS SCF products. Furthermore, to explore the relationship between elevation and cloud cover frequency, we also take elevation into account in our calculations. For example, the daily cloud coverage for elevations above the specified threshold Z T of day d can be calculated using the following formula.
where N d (Z > Z T ) and N Tol (Z > Z T ) are the numbers of cloud pixels and total number of pixels in the elevation above Z T for date d in the study area. If Z T is taken as the minimum elevation (Z min ) of study area, the study area is considered in its entirety. Existing studies have found that there are always numerous similar pixels in a remote sensing image. Similar pixels do not exist only in an adjacent location but also appear in non-local neighboring regions [35,39]. Similar pixels have the following characteristics. (1) They have approximately equal spectral characteristics and, (2) despite images obtained at different times having different spectral characteristics, similar pixels have approximately the same reflectance changes over time and relative coincident positions in multi-temporal images [40]. For example, Figure 2 shows the snow distribution of MOD10A1 in a small region (the size is 31 × 31 pixels, with the upper left corner is located at (45. Figure 2a are the similar pixels of the target pixel (black box), which are also similar pixels of target pixels in Figure 2b. The land cover types of these pixels are cropland, and these pixels have close reflectance within an image and consistent changes with time. Taking this basic concept, the ground information of a cloud pixel within a MODIS SCF image can be estimated by using the appropriate similar pixels in the remaining known region with the help of multitemporal reference images.

Theoretical Basis
Existing studies have found that there are always numerous similar pixels in a remote sensing image. Similar pixels do not exist only in an adjacent location but also appear in non-local neighboring regions [35,39]. Similar pixels have the following characteristics. (1) They have approximately equal spectral characteristics and, (2) despite images obtained at different times having different spectral characteristics, similar pixels have approximately the same reflectance changes over time and relative coincident positions in multi-temporal images [40]. For example, Figure 2 shows the snow distribution of MOD10A1 in a small region (the size is 31 31 × pixels, with the upper left corner is located at (45.05°N, 86.00°E)) of study area on (a) 30 January 2003 and (b) 18 February 2003. The pixels displayed in the red box in Figure 2a are the similar pixels of the target pixel (black box), which are also similar pixels of target pixels in Figure 2b. The land cover types of these pixels are cropland, and these pixels have close reflectance within an image and consistent changes with time. Taking this basic concept, the ground information of a cloud pixel within a MODIS SCF image can be estimated by using the appropriate similar pixels in the remaining known region with the help of multitemporal reference images. However, some other studies point out that a pixel would also show considerable reflectance change where land cover type changes, therefore causing the phenomenon of temporal difference measurement [41,42]. For example, regarding the pixel we picked out with a green square in Figure  2a, its land cover type is bare land. It is also the similar pixel of the target pixel before snow melts in Figure 2a, but it is no longer the similar pixel of target pixels in Figure 2b after partial snow melting. This may be caused by different snow melting rates under changed underlying surface conditions. To tackle this problem, the pixel which had the same land cover type with the target pixel was regarded as a similar pixel [41][42][43].

Gap-Filling Method
An innovative gap-filling method via NSTF is proposed to reconstruct the ground information of the cloud gaps in the daily MODIS SCF products. As there are significant gaps in large cloud regions, it is unreliable and unrealistic to rely on the remaining known regions of the image to match similar pixels. Therefore, a combination of Terra and Aqua, and temporal filtering are executed first However, some other studies point out that a pixel would also show considerable reflectance change where land cover type changes, therefore causing the phenomenon of temporal difference measurement [41,42]. For example, regarding the pixel we picked out with a green square in Figure 2a, its land cover type is bare land. It is also the similar pixel of the target pixel before snow melts in Figure 2a, but it is no longer the similar pixel of target pixels in Figure 2b after partial snow melting. This may be caused by different snow melting rates under changed underlying surface conditions. To tackle this problem, the pixel which had the same land cover type with the target pixel was regarded as a similar pixel [41][42][43].

Gap-Filling Method
An innovative gap-filling method via NSTF is proposed to reconstruct the ground information of the cloud gaps in the daily MODIS SCF products. As there are significant gaps in large cloud regions, it is unreliable and unrealistic to rely on the remaining known regions of the image to match similar pixels. Therefore, a combination of Terra and Aqua, and temporal filtering are executed first to fill some gaps before NSTF. However, the NSTF method can be used independently. The detailed schematic of the MODIS SCF products gap-filling procedure is presented in Figure 3. to fill some gaps before NSTF. However, the NSTF method can be used independently. The detailed schematic of the MODIS SCF products gap-filling procedure is presented in Figure 3.

Daily MOD10A1 and MYD10A1 Combination
Daily MOD10A1 and MYD10A1 products were combined first by using the composite rules where priority was assigned to MOD10A1, as the validation studies demonstrated that it had more accurate retrievals than MYD10A1 [13]. Hereafter we refer to MOD10A1, MYD10A1, and the combination products to MOD, MYD, and MOYD, respectively. The specific combination strategy is the following. When a pixel is cloud-free in both MOD and MYD, the value in MOD is taken. When a pixel is cloud-free in only one of the products, the cloud free observation is taken. In addition, because the MYD product was not available until July 2002, we only used the MOD product prior to that date.

Adjacent Temporal Filtering
Adjacent temporal filtering (ATF) rests on the fundamental assumption that snow will persist on the surface for a certain period of time. Previous studies of one-day to eight-day temporal filters have confirmed that the overall accuracy of multi-days combined snow cover products improves with the increase of composition days [26,27,32,44,45]. However,  found that the overall accuracy of multi-day combined snow cover products decreases with increasing composition days [19]. To determine the optimum composition days, we calculated a cloud persistence index (CPI) for each pixel via finding the average of the continuously cloud covered days [25]. The result indicates that three days is the optimum number of composition days, as the mean CPI for the study area from 2001 to 2016 is 2.72 days. In addition, to guarantee the real-time application ability of the proposed method, we only use images from the previous date. Thus, we determined the three-days backward temporal filtering method used in this study. For a cloudy pixel, we searched images from the previous three days and, if a cloud-free observation was obtained from two or more days, the last valid SCF at this location was assigned to this cloud pixel.

NSTF
To match the similar pixels in the remaining non-cloud covered regions to fill the gap pixel, the following three criteria need to be considered.

Daily MOD10A1 and MYD10A1 Combination
Daily MOD10A1 and MYD10A1 products were combined first by using the composite rules where priority was assigned to MOD10A1, as the validation studies demonstrated that it had more accurate retrievals than MYD10A1 [13]. Hereafter we refer to MOD10A1, MYD10A1, and the combination products to MOD, MYD, and MOYD, respectively. The specific combination strategy is the following. When a pixel is cloud-free in both MOD and MYD, the value in MOD is taken. When a pixel is cloud-free in only one of the products, the cloud free observation is taken. In addition, because the MYD product was not available until July 2002, we only used the MOD product prior to that date.

Adjacent Temporal Filtering
Adjacent temporal filtering (ATF) rests on the fundamental assumption that snow will persist on the surface for a certain period of time. Previous studies of one-day to eight-day temporal filters have confirmed that the overall accuracy of multi-days combined snow cover products improves with the increase of composition days [26,27,32,44,45]. However,  found that the overall accuracy of multi-day combined snow cover products decreases with increasing composition days [19]. To determine the optimum composition days, we calculated a cloud persistence index (CPI) for each pixel via finding the average of the continuously cloud covered days [25]. The result indicates that three days is the optimum number of composition days, as the mean CPI for the study area from 2001 to 2016 is 2.72 days. In addition, to guarantee the real-time application ability of the proposed method, we only use images from the previous date. Thus, we determined the three-days backward temporal filtering method used in this study. For a cloudy pixel, we searched images from the previous three days and, if a cloud-free observation was obtained from two or more days, the last valid SCF at this location was assigned to this cloud pixel.

NSTF
To match the similar pixels in the remaining non-cloud covered regions to fill the gap pixel, the following three criteria need to be considered. and have an approximately similar SCF variation in the time domain; (2) Considering the continuity of snow distribution in the spatial domain, the similar pixels have an approximately similar SCF proximity with their respective surrounding neighbors; (3) The similar pixels have approximately similar topographic features. According to the above three criteria, three objective functions are defined. The similar pixels are the solution of this three-objective optimization problem which requires the simultaneous minimization of three objective functions under equality constraints. However, the three objective functions are almost impossible to minimize simultaneously because there is often conflict between multiple objectives. To provide a means to assess trade-off between three conflicting objectives, an automatic machine learning technique of the fast elitist and non-dominated sorting multi-objective genetic algorithm (NSGA-II) is adopted [46,47].
First, for a cloud pixel P on target image, it is assumed that P is the pixel at the same position as P on an auxiliary reference image. The historical image closest to the target image with B(P ) is a 3 × 3 non-cloud block centered at P (Figure 4). This was selected as the most appropriate auxiliary reference image. We then extracted the M non-cloud common pixels of the target image and reference image within a 51 × 51 pixels search window centered at location P, as is too time-consuming to search for similar pixels in the entire study area [41]. To guarantee the precision of the NSTF, we let M account for at least 30% of the total number of pixels in searching window. If this condition was not satisfied, the search window was expanded to 101 × 101 pixels then 151 × 151 pixels, etc., until the entire study area is encompassed.
Remote Sens. 2019, 11, x FOR PEER REVIEW 7 of 26 topographic features. According to the above three criteria, three objective functions are defined. The similar pixels are the solution of this three-objective optimization problem which requires the simultaneous minimization of three objective functions under equality constraints. However, the three objective functions are almost impossible to minimize simultaneously because there is often conflict between multiple objectives. To provide a means to assess trade-off between three conflicting objectives, an automatic machine learning technique of the fast elitist and non-dominated sorting multi-objective genetic algorithm (NSGA-II) is adopted [46,47]. First, for a cloud pixel P on target image, it is assumed that P ′ is the pixel at the same position as P on an auxiliary reference image. The historical image closest to the target image with ( ) × non-cloud block centered at P ′ ( Figure 4). This was selected as the most appropriate auxiliary reference image. We then extracted the M non-cloud common pixels of the target image and reference image within a 51 × 51 pixels search window centered at location P , as is too timeconsuming to search for similar pixels in the entire study area [41]. To guarantee the precision of the NSTF, we let M account for at least 30% of the total number of pixels in searching window. If this condition was not satisfied, the search window was expanded to 101 × 101 pixels then 151 × 151 pixels, etc., until the entire study area is encompassed. (a) Target image T , in which P is a cloud pixel, P′′ (graymarked pixel) is the spatial non-cloud pixel of first order adjacency to P , s P is a candidate similar pixel of P ; (b) Reference image R , in which P ′ is non-cloud pixel in the same position relative to Second, the three-objective MODIS SCF products gap-filling problem can be expressed as: where s P is a candidate similar pixel of P . LC is the land cover type.

A: Minimize Variation in Time Domain
The objective considers that similar pixels having similar change trends over time, and relative coincident positions in multi-temporal images can be mathematically defined as (a) Target image T, in which P is a cloud pixel, P (gray-marked pixel) is the spatial non-cloud pixel of first order adjacency to P, P s is a candidate similar pixel of P; (b) Reference image R, in which P is non-cloud pixel in the same position relative to P, B(P ) is a 3 × 3 non-cloud block (red box) centered at P , P s is a candidate similar pixels of P .
Second, the three-objective MODIS SCF products gap-filling problem can be expressed as: Minimize Subject to : LC(P S ) = LC(P) where P s is a candidate similar pixel of P. LC is the land cover type.

A: Minimize Variation in Time Domain
The objective considers that similar pixels having similar change trends over time, and relative coincident positions in multi-temporal images can be mathematically defined as where the subscript T represents the target image, S T (P S ) is the SCF value of P S , S T is the average SCF value of pixels in a corresponding location P S in the target image, and P S is a candidate similar pixel of P which was calculated in advance ( Figure 4). The calculation procedure of P S is Subject to : LC(P S ) = LC(P ) where B(P S ) is a 3 × 3 non-cloud block centered at location P S ; subscript R represents the reference image; S R (P ) is the SCF value of P ; and S R (B(P )) and S R (B(P S )) represent the average SCF value of blocks B(P ) and B(P S ), respectively. d is a free parameter related to the sensor [41]; it is set as 0.01 for the MODIS case. Equation (4) is the similarity measurement to ensure that the difference between the two blocks is extremely slight.

B: Minimize Discontinuity in Spatial Domain
The objective function f 2 considers the spatial continuity of snow cover distribution. The continuity between a cloud pixel and surrounding spatial neighbors can be presented as where P is the spatial non-cloud pixel of first order adjacency to P (Figure 4), and S T (P ) and S R (P ) represent the average SCF values of P in the target and reference image, respectively.

C: Minimize the Topographic Differences
Similar to [21], we also assume that there is equivalent exposure and the same amount of snow cover under the same terrain conditions. Thus, the objective function f 3 is used to enforce similar pixels having similar topographic features.
where E, S, and A represent normalized elevation, slope and aspect, respectively. Among them, E and S adopt the Min-Max normalization method, and the normalization method of A was: Third, the NSGA-II algorithm was executed to search for N (the value is 10 in the present study) similar pixels among the M non-cloud common pixels. We encoded the chromosome (CM, a solution to this three-objective minimal optimization problem) as the combination of the location of the N similar pixels.
where (x i , y i ) is the location of i-th similar pixels, i = 1, 2, . . . , N.
The detailed execution process of NSGA-II can be referred to in [46]. Finally, the arithmetic mean SCF value of N similar pixels (that is, the average SCF value of 10 pixels in corresponding locations of (x i , y i )) is arranged for cloud gap pixel P.

Validation Methodology
It is noticeable that the most appropriate validation technique is to use in situ SD observations as the true ground information. However, as most of the 50 meteorological stations in the area are located in the inhabited and low elevation valleys (Figure 1), the in situ SD observations can hardly provide the overall picture of the entire region because of its low spatial density and because there are no records from some inaccessible, high-mountainous areas. Therefore, another methodology based on "cloud assumption" was adopted. The cloud gaps from some more clouded MODIS SCF images Remote Sens. 2019, 11, 90 9 of 24 were borrowed to mask the "cloud-free" MODIS SCF image, which is regarded as the true ground information [27,30,45].

Validation Based on "Cloud Assumption"
Considering that percentages of cloud obstruction and the duration of the clouds will affect the accuracy assessment results directly, two strategies are adopted to carry out validation [25].

One-Day Cloud Mask
The MODIS SCF image with the lowest cloud coverage per month from 2001 to 2016 is considered the "cloud-free" image. Then, to ensure that the artificial fictitious cloud gaps would occur as close to real life as possible, we calculated the cloud coverage level of 25, 50, and 75% probability of occurrence (PO) via the empirical cumulative distribution function (ECDF) to describe the cloud pattern of low, moderate, and high occurrence in each month. The closest images in time to 25% PO, 50% PO, and 75% PO (abbreviated to P25, P50, and P75 here after) in each month were identified, and the corresponding cloud gaps were extracted to mask the selected "cloud-free" image of this month. Consequently, each selected "cloud-free" SCF image was related to three mask images, and it co-produced a total of 576 validation images during the study period of 16 years.

Multi-Day Cloud Mask
To imitate the real natural conditions of consecutive overcast days, the spatio-temporal additional cloud masked images were generated. Because the mean CPI of the study area during the period from 2001 to 2016 is 2.72 days, we chose two groups of three consecutive days characterized by minimum cloud cover every snow season (i.e., November to March next year as a snow season) to be the "cloud-free" images. Additionally, the cloud gaps in the corresponding images of the previous snow cover season records were extracted to overlap onto the selected "cloud-free" images. Moreover, we chose additional groups of two, four, five, and seven consecutive days every snow season to launch more extensive validation. This co-produced a total of 90 validation groups.
where x s i is the estimated SCF value of the i-th fictitious cloud pixel, x T i is the corresponding ground truth MODIS SCF value, and N is the total number of fictitious cloud masked pixels.
In addition to the four above traditional evaluation criteria, a new bias-insensitive metric called spatial efficiency (SPAEF) was used for snow cover spatial pattern validation, the SPAEF is defined as follows [48]: where A is the correlation coefficient between the gap-filled and ground truth MODIS SCF image, B is the fraction of coefficient of variation, and C is the percentage of histogram intersection. The optimal value of SPAEF is 1, which means that the gap-filled fictitious MODIS SCF image is exactly the same as the ground truth MODIS SCF image.

Validation Based on In Situ SD Observations
The original and gap-filled MODIS SCF images were compared with the in situ SD observations corresponding to the 50 meteorological stations. We applied the validation strategy based on the confusion matrix (Table 1), which is recommended by extensive studies such as [49].

MODIS SCF
a, b, c, and d are the number of pixels correctly hit, commission alarmed, omission alarmed, and correctly rejected, respectively. e and f are the cloud pixels in the MODIS SCF image when the meteorological station reports snow and no snow, respectively. The ε 1 and ε 2 are the defined threshold values of observed SD and MODIS SCF to determine whether a station or a MODIS pixel is covered by snow, respectively.
The four validation indices are defined as follows.
Overall accuracy (OA) is the probability a MODIS pixel that correctly classified; MU and MO represent the underestimated and overestimated snow event, respectively; and Bias refers to the probability of snow event being correctly identified by MODIS. The perfect estimation of MODIS is OA= 1, Bias = 1 (we also set Bias to 1 when a = b = c = e = 0), and MU = MO = 0. The imperfect estimation of MODIS is OA= 0, MU = MO = 1, Bias = 0 or ∞ (we set Bias to 0 when a = c = e = 0 and b = 0, which means no snow is observed whereas MODIS indicates snow cover).

Cloud Gaps in MODIS SCF Products Over Northern Xinjiang
After investigating the cloud gaps in MODIS SCF products of the study area, we found that the extent of cloud obstruction is even worse than we had desired. Annual average cloud coverage of the study area is 43.53-49.85% for Terra and 42.43-52.31% for Aqua from 2001 to 2016, with a mean cloud coverage of 46.48 and 47.78%, respectively ( Table 2). The results are effectively consistent with previous studies of [24]. The cloud cover duration maps demonstrate that the mean number of cloud observations is 94-275 days for MODIS Terra SCF images and 96-316 days for MODIS Aqua SCF images, with mean cloud days of 169 and 175, respectively ( Figure 5).  We took MODIS Terra SCF products as an example (though there were similar patterns for MODIS Aqua). The calculated annual cloud coverage in different elevation zones are shown in Figure  6. From Figure 6, the annual average cloud coverage for each elevation zone demonstrates that the cloud obstruction probability increases with elevation.  53.09, and 46.52%, respectively. The systematic trends of cloud obstruction for the entire study area are the most consistent with those of the elevation zone of 2000-3000 m. The influence of topography on cloud cover is also confirmed in Figure 5, in which the number of cloud cover days have shown a positive pattern in relation to elevation. This result is partly attributable to the water vapor uplift caused by orography, making its condensation into clouds more likely. Meanwhile, atmospheric circulation may be contributing to clearing the clouds at low-altitude regions [37]. We took MODIS Terra SCF products as an example (though there were similar patterns for MODIS Aqua). The calculated annual cloud coverage in different elevation zones are shown in Figure 6. From Figure 6, the annual average cloud coverage for each elevation zone demonstrates that the cloud obstruction probability increases with elevation.  46.52%, respectively. The systematic trends of cloud obstruction for the entire study area are the most consistent with those of the elevation zone of 2000-3000 m. The influence of topography on cloud cover is also confirmed in Figure 5, in which the number of cloud cover days have shown a positive pattern in relation to elevation. This result is partly attributable to the water vapor uplift caused by orography, making its condensation into clouds more likely. Meanwhile, atmospheric circulation may be contributing to clearing the clouds at low-altitude regions [37].
respectively. The systematic trends of cloud obstruction for the entire study area are the most consistent with those of the elevation zone of 2000-3000 m. The influence of topography on cloud cover is also confirmed in Figure 5, in which the number of cloud cover days have shown a positive pattern in relation to elevation. This result is partly attributable to the water vapor uplift caused by orography, making its condensation into clouds more likely. Meanwhile, atmospheric circulation may be contributing to clearing the clouds at low-altitude regions [37].

The Effectiveness of Gap-Filling Methodology
The proposed gap-filling methodology generated almost continuous spatio-temporal daily MODIS SCF products, leaving only 0.52% of cloud gaps on long-term average. The maximum gaps remaining is 5.92%. The mean cloud coverage remaining in different elevation zones after the execution of each gap-filling step is presented in Figure 7. We found an extremely consistent change trend between the single year of 2004 and the average after 16 years. Similar conditions were found in other years, which will not be explored here. MOYD, ATF, and NSTF reduced cloud cover by an average of 8.19% (compared to the original Terra data) or 9.72% (compared to the original Aqua data), 21.4%, and 16.43%, respectively.

The Effectiveness of Gap-Filling Methodology
The proposed gap-filling methodology generated almost continuous spatio-temporal daily MODIS SCF products, leaving only 0.52% of cloud gaps on long-term average. The maximum gaps remaining is 5.92%. The mean cloud coverage remaining in different elevation zones after the execution of each gap-filling step is presented in Figure 7. We found an extremely consistent change trend between the single year of 2004 and the average after 16 years. Similar conditions were found in other years, which will not be explored here. MOYD, ATF, and NSTF reduced cloud cover by an average of 8.19% (compared to the original Terra data) or 9.72% (compared to the original Aqua data), 21.4%, and 16.43%, respectively.  07% of the clouds on those days, respectively, and closes the gap-filling procedure. By consecutively performing three gap-filling steps, even if more than 90% of the ground information is obscured by clouds, extensive cloud gaps can be reconstructed and more reasonable ground information is obtained. This agrees with the general knowledge of snow spatial distribution characteristics in Northern XinJiang.  07% of the clouds on those days, respectively, and closes the gap-filling procedure. By consecutively performing three gap-filling steps, even if more than 90% of the ground information is obscured by clouds, extensive cloud gaps can be reconstructed and more reasonable ground information is obtained. This agrees with the general knowledge of snow spatial distribution characteristics in Northern XinJiang. 25.39, 70.87, 45.25, and 29.78% of the cloud coverage on those days, respectively. NSTF removes the remaining 59. 55, 12.22, 17.48, and 12.07% of the clouds on those days, respectively, and closes the gap-filling procedure. By consecutively performing three gap-filling steps, even if more than 90% of the ground information is obscured by clouds, extensive cloud gaps can be reconstructed and more reasonable ground information is obtained. This agrees with the general knowledge of snow spatial distribution characteristics in Northern XinJiang.

Accuracy: An Example of One-Day Cloud Mask
We take the special case of January 2004 as an example. The clearest MODIS Terra SCF image appeared on 21 January with 18.18% cloud gaps. This was regarded as the "cloud-free" MODIS SCF image ( Table 3). The ECDF of cloud coverage in each month for the MODIS Terra SCF product in 2004 is shown in Figure 9. The closest images to P25, P50, and P75 in January were the images from 1 January 2004, 20 January 2004, and 8 January 2004, with cloud coverages of 47.55, 55.08 and 69%, respectively. After these three extracted images were assigned to the clearest SCF image on 21 January, the percentage of cloud cover in the three artificially cloud-filled images were 53.59, 59.78 and 71.87%, respectively. Note that we also performed this cloud-filled procedure for the corresponding MODIS Aqua SCF image.

Accuracy: An Example of One-Day Cloud Mask
We take the special case of January 2004 as an example. The clearest MODIS Terra SCF image appeared on 21 January with 18.18% cloud gaps. This was regarded as the "cloud-free" MODIS SCF image ( Table 3). The ECDF of cloud coverage in each month for the MODIS Terra SCF product in 2004 is shown in Figure 9. The closest images to P25, P50, and P75 in January were the images from 1 January 2004, 20 January 2004, and 8 January 2004, with cloud coverages of 47.55, 55.08 and 69%, respectively. After these three extracted images were assigned to the clearest SCF image on January 21, the percentage of cloud cover in the three artificially cloud-filled images were 53.59, 59.78 and 71.87%, respectively. Note that we also performed this cloud-filled procedure for the corresponding MODIS Aqua SCF image.   After the gap-filling procedure was finished, the ground information was perfectly reconstructed visually. Despite the clouds obscuring more than half of the study area, the SCF values of the reconstructed image were extremely close to the true value of the "cloud-free" MODIS SCF image in all three cloud patterns ( Figure 10). MOYD reduced 14.75, 19.85, and 17.31% of the cloud cover and ATF removed 33.7, 45.02, and 49.47% of the cloud cover for P25, P50, and P75 cloud-masked conditions, respectively. NSTF eliminates the remaining clouds and completely cloudless images were produced. As seen from Table 4, the proposed gap-filling approach can achieve the preferred accuracy. The RMSE was 0.07-0.13, OE was 0.59%-2.07%, UE was below 1%, and R 2 was 0.86-0.96. The spatial pattern validation metric of SPAEF for the MODY, ATF, and NSTF steps, and final images exceeded 0.8, 0.65, 0.7, and 0.75, respectively. This illustrates that the gap-filled fictitious MODIS SCF image is very consistent with assumed ground truth MODIS SCF image. As expected, there was better gap-filling performance under lower cloud coverage level conditions. In addition, the performance of MOYD was superior to ATF, while ATF was superior to NSTF in most cases. This was because the implementation of the second step was based on the results of the previous step. As such, the imprecision of the previous step was passed through the input information of the second step and error was accumulated.
MODIS SCF image is very consistent with assumed ground truth MODIS SCF image. As expected, there was better gap-filling performance under lower cloud coverage level conditions. In addition, the performance of MOYD was superior to ATF, while ATF was superior to NSTF in most cases. This was because the implementation of the second step was based on the results of the previous step. As such, the imprecision of the previous step was passed through the input information of the second step and error was accumulated.  b2, c2, d2) The corresponding artificially cloud-filled images. The third to fifth rows represent the consecutive applications of MOYD, ATF, and NSTF. Table 4. Quantitative performance evaluation results for the one-day cloud mask. In view of the fact that the reconstruction of SCF information under consecutive cloud cover in the snow season has more significant importance for practical applications, we take the special case of the 2015 to 2016 snow season as an example. Three consecutive days characterized by lowest average cloud obstruction for MODIS Terra SCF images on February 26, February 27, and February 28 of 2016 are selected as the "cloud-free" true SCF images. The cloud coverages during these three days were 17.50, 13.12, and 64.68%, respectively ( Table 5). The corresponding images on the same dates in 2015 had cloud coverages of 66.03, 44.06, and 45.50% were used to mask the selected "cloud-free" images. The three artificially cloud-filled images were produced with cloud coverages of 70.43, 50.44, and 79.59%. The same was also done for the corresponding Aqua images.
From Table 6, we can see that the performance of this consecutive cloud cover case is slightly worse, compared to the one-day cloud mask case. MOYD reduced the cloud cover by 8.8, 16.92, and 16.73% on the three days, respectively. ATF reduced the cloud cover by 34.71, 11.58, and 35.59% on the three days, respectively. Although the NSTF step eliminated a lot of cloud cover (26.02, 20.94, and 26.42% on the three days, respectively), it does not completely eliminate the cloud cover. Tiny amounts (i.e., 0.9, 1, and 0.95% on the three days, respectively) of cloud gaps were still retained. Higher accuracy was also achieved with lower overall RMSE values of 0.09-0.11, OE values of 1.15-1.58%, UE values of 0.15-2.41%, higher R 2 values of 0.93-0.96, and higher SPAEF values of 0.8-0.88. Despite the significant differences in terms of both cloud percentage and cloud patterns, there was little performance difference among these three days. Table 5. Cloud coverage of selected "cloud-free" MODIS Terra SCF images, three extracted mask images, three artificially filled images, and after the implementation of each gap-filling step for a consecutive three-days cloud mask. D1-D3 refers to the selected first to third days.  Table 6. Quantitative performance evaluation results for a consecutive three-day cloud mask.

Validation Results Based on "Cloud Assumption"
The generalization validation results of 576 one-day and 90 groups of multi-day (2-7 days) cloud masked images are shown in Figure 11. The comparison of the result of each gap-filing step with the selected "cloud-free" true MODIS SCF image confirms the efficacy of the proposed gap-filling method. For the one-day case, most cloud-free images on each month can be found with an average cloud coverage less than 1%. The average cloud coverage of the artificially cloud-filled images was 30, 47, and 66%, respectively, when P25, P50 and P75 additional cloud cover was introduced. The average cloud coverage of the selected 2-7 consecutive "cloud-free" days was approximately 11-27% and 50-64% after artificially spatiotemporal additional cloud cover was incorporated, as shown in Figure 11a. The gap-filling efficiency is further confirmed by Figure 11b. More than 99% of gaps were reconstructed. The average amount of cloud gaps reduced by MODY, ATF, and NSTF are 9.21, 25.19, and 17.28%, which are basically similar with the results averaged from 16 years given in Section 4.2. Based on Figures 11c-f and 12, although there are significant differences in both cloud percentage and cloud patterns among different validation strategies, all of them are characterized by lower values of RMSE, OE, and UE, as well as higher values of R 2 and SPAEF. The average RMSE was about 0.1, R 2 was greater than 0.8, OE was 1.13%, UE was 1.4%, and SPAEF was 0.78 for the final images. The performance of MOYD and ATF was very prominent, with the SPAEF below 0.1, the OE and UE below 1%, and SPAEF exceeding 0.8 in all validation strategies. However, the performance of NSTF was slightly worse, particularly for the validation strategy of four to seven consecutive cloud covered days. The RMSE reached up to 0.15, some disagreement errors exceeded 2%, and the average SPAEF was approximately 0.65. This is because, after three days of backward temporal filtering in ATF, it will inevitably leave out a larger percentage of gaps. There will also be less available non-cloud pixels for NSTF. Together with the accumulative effect of error, these factors lead to a slightly inferior performance of NSTF.

Validation Based on In-Situ SD Observations
Four validation indices were calculated by using an SD threshold value of 1 =3 ε and MODIS SCF threshold value of 2 =50% ε . We identified a pixel as snow covered when the SD or SCF exceeds the threshold value, otherwise it was classified as a land pixel. As shown in Figure 13, we found that, for each elevation zones, the performance of the MODIS Terra SCF product was superior to MODIS Aqua's. The MODIS Terra SCF product had a slightly higher OA , lower M U and MO , and a Bias closer to 1. This may be attributed to the MODIS Aqua SCF mapping algorithm that uses band 7 instead of band 6 and the changed underlying surface conditions [50]. The performance of each gapfilling step is varied. For OA , the average value increased from 50.86% (Terra)/48.55% (Aqua) to 58.15% after MOYD, to 78% after ATF, and 93.72% after NSTF. The improvement of OA is more significant in elevation zones of <2000 m. The OA remains almost unchanged in the zones of elevation above 2000 m. This may be because only three in situ observation stations were located in this area, and it may not be representative enough. For Bias , three consecutive gap-filling steps bring it closer to 1. This proves that the probability of MODIS observed snow events is increasing. The M U and MO errors for the study area for 16 years were 2.16 and 2.43%, respectively, for the MODIS SCF product under clear-sky conditions, and they were 3.32 and 2.96%, respectively, for the final, gapfilled SCF product.

Validation Based on In-Situ SD Observations
Four validation indices were calculated by using an SD threshold value of ε 1 = 3 and MODIS SCF threshold value of ε 2 = 50%. We identified a pixel as snow covered when the SD or SCF exceeds the threshold value, otherwise it was classified as a land pixel. As shown in Figure 13, we found that, for each elevation zones, the performance of the MODIS Terra SCF product was superior to MODIS Aqua's. The MODIS Terra SCF product had a slightly higher OA, lower MU and MO, and a Bias closer to 1. This may be attributed to the MODIS Aqua SCF mapping algorithm that uses band 7 instead of band 6 and the changed underlying surface conditions [50]. The performance of each gap-filling step is varied. For OA, the average value increased from 50.86% (Terra)/48.55% (Aqua) to 58.15% after MOYD, to 78% after ATF, and 93.72% after NSTF. The improvement of OA is more significant in elevation zones of <2000 m. The OA remains almost unchanged in the zones of elevation above 2000 m. This may be because only three in situ observation stations were located in this area, and it may not be representative enough. For Bias, three consecutive gap-filling steps bring it closer to 1. This proves that the probability of MODIS observed snow events is increasing. The MU and MO errors for the study area for 16 years were 2.16 and 2.43%, respectively, for the MODIS SCF product under clear-sky conditions, and they were 3.32 and 2.96%, respectively, for the final, gap-filled SCF product.

Validation Based on In-Situ SD Observations
Four validation indices were calculated by using an SD threshold value of 1 =3 ε and MODIS SCF threshold value of 2 =50% ε . We identified a pixel as snow covered when the SD or SCF exceeds the threshold value, otherwise it was classified as a land pixel. As shown in Figure 13, we found that, for each elevation zones, the performance of the MODIS Terra SCF product was superior to MODIS Aqua's. The MODIS Terra SCF product had a slightly higher OA , lower M U and MO , and a Bias closer to 1. This may be attributed to the MODIS Aqua SCF mapping algorithm that uses band 7 instead of band 6 and the changed underlying surface conditions [50]. The performance of each gapfilling step is varied. For OA , the average value increased from 50.86% (Terra)/48.55% (Aqua) to 58.15% after MOYD, to 78% after ATF, and 93.72% after NSTF. The improvement of OA is more significant in elevation zones of <2000 m. The OA remains almost unchanged in the zones of elevation above 2000 m. This may be because only three in situ observation stations were located in this area, and it may not be representative enough. For Bias , three consecutive gap-filling steps bring it closer to 1. This proves that the probability of MODIS observed snow events is increasing. The M U and MO errors for the study area for 16 years were 2.16 and 2.43%, respectively, for the MODIS SCF product under clear-sky conditions, and they were 3.32 and 2.96%, respectively, for the final, gapfilled SCF product.
The results indicated that a higher OA (above 90%) and lower MU and MO (below 2%) are achieved for all the different threshold combinations in the non-snow season, but Bias is closest to 1 when the threshold value combination of ε 1 = 3 and ε 2 = 50% is used. This illustrates that the inconsistency between the final gap-filled MODIS SCF product and in situ SD observations is minimal. In snow seasons, OA is over 80% and Bias is also extremely close to 1 for all different threshold combinations, which further confirms that there is favorable consistency between the final gap-filled MODIS SCF product and in situ SD observations. The MODIS SCF threshold value of ε 2 = 50% produced a relatively a higher MU and lower MO. When the ε 2 is fixed, a large value of ε 1 produces relatively lower MU and higher MO values. For example, MU is 12% and MO is 0 for the threshold combination of ε 1 = 1 and ε 2 = 50%, and MU is 8.2% and MO is 9.08% for the threshold combination of ε 1 = 4 and ε 2 = 50% in March. The above results demonstrate that the most suitable ε 1 should be between 1 and 4cm and ε 2 should be between 15 and 50%. Therefore, we drew the inference that the threshold combination of ε 1 = 3 and ε 2 = 50% is the best choice for the MODIS accuracy assessment in the study area.

The Sensitivity of Different SCF and SD Threshold Combinations
When comparing the MODIS SCF value with in situ SD observations, the most suitable SD threshold value ( 1 ε ) and SCF threshold value ( 2 ε ) must be chosen. 1 [12,55] have been used in the literature. We use these thresholds to analyze the sensitivity of different thresholds to validate results, as shown in Figure 14.
The results indicated that a higher OA (above 90%) and lower MU and MO (below 2%) are achieved for all the different threshold combinations in the non-snow season, but Bias is closest to 1 when the threshold value combination of 1 =3 ε and 2 =50% ε is used. This illustrates that the inconsistency between the final gap-filled MODIS SCF product and in situ SD observations is minimal. In snow seasons, OA is over 80% and Bias is also extremely close to 1 for all different threshold combinations, which further confirms that there is favorable consistency between the final gap-filled MODIS SCF product and in situ SD observations. The MODIS SCF threshold value of 2 =50% ε produced a relatively a higher MU and lower MO . When the 2 ε is fixed, a large value of 1 ε produces relatively lower MU and higher MO values. For example, MU is 12% and MO is 0 for the threshold combination of 1 =1 ε and 2 =50% ε , and MU is 8.2% and MO is 9.08% for the threshold combination of 1 =4 ε and 2 =50% ε in March. The above results demonstrate that the most suitable 1 ε should be between 1 and 4cm and 2 ε should be between 15 and 50%. Therefore, we drew the inference that the threshold combination of 1 =3 ε and 2 =50% ε is the best choice for the MODIS accuracy assessment in the study area.

Comparison with Another Gap-Filling Method
We compared the proposed method with the existing cubic spline temporal interpolation approach. Figure 15 is the monthly/annual variation of the four validation indices. Although the cubic spline interpolation method uses information from both preceding and following days and the proposed method only uses information from preceding days, it was found that the proposed method performs more excellently, particularly during the snow season. Bias was closer to 1, and the average OA was 87%, MU was 6.85%, and MO was 6.3%. These values are 2% higher, 1.92% lower, and 0.04% lower, respectively than the corresponding values for the cubic spline interpolation method. It must be noted that the Bias for the period from June to September is far more than 1 for Figure 14. Monthly/annual variation of four validation indices based on different SD threshold values (ε 1 ) and SCF threshold value (ε 2 ). Validation strategies 1-6 represent threshold value combinations of ε 1 = 4 and ε 2 = 50%, ε 1 = 3 and ε 2 = 50%, ε 1 = 1 and ε 2 = 50%, ε 1 = 4 and ε 2 = 15%, ε 1 = 3 and ε 2 = 15%, and ε 1 = 4 and ε 2 = 15%, respectively.

Comparison with Another Gap-Filling Method
We compared the proposed method with the existing cubic spline temporal interpolation approach. Figure 15 is the monthly/annual variation of the four validation indices. Although the cubic spline interpolation method uses information from both preceding and following days and the proposed method only uses information from preceding days, it was found that the proposed method performs more excellently, particularly during the snow season. Bias was closer to 1, and the average OA was 87%, MU was 6.85%, and MO was 6.3%. These values are 2% higher, 1.92% lower, and 0.04% lower, respectively than the corresponding values for the cubic spline interpolation method. It must be noted that the Bias for the period from June to September is far more than 1 for both gap-filling methods. The value even exceeds 3 for the spline interpolation approach. This means that both methods have poor consistency with in situ SD observations during this period. On the other hand, the OA values of both gap-filling methods are very high. Their average values even reach 99%. This seems to be contradictor. The cause of this phenomenon is that there are only a few permanent snow covers in the summer and some slight overestimations occurred.
Remote Sens. 2019, 11, x FOR PEER REVIEW 21 of 26 both gap-filling methods. The value even exceeds 3 for the spline interpolation approach. This means that both methods have poor consistency with in situ SD observations during this period. On the other hand, the OA values of both gap-filling methods are very high. Their average values even reach 99%. This seems to be contradictor. The cause of this phenomenon is that there are only a few permanent snow covers in the summer and some slight overestimations occurred.

Dependence of Gap-Filling Performance on Land Cover Types
The spatio-temporal distribution and "accumulation-melting" pattern of snow cover is closely related to land cover types [8,10,45]. Thus, we examined the performance of each gap-filling step among different land cover types, as shown in Table 7. We found that the OA of original MODIS SCF products is approximately 50%. The highest accuracy appears in cropland, and relatively low accuracies occur in grasslands and shrubs. The disagreement between MU and MO errors occurred in the various land cover types, particularly in grassland and unused land. The performance of OA and Bias improved a lot, whereas there was no significant increase in MU and MO error. The significant improvements were found in cropland, forests, and urban and built up land, with OA values exceeding 95%. The improvements of grasslands and shrubs, and unused land are also obvious, with OA values of 87.08 and 91.17%, respectively. Note that, only 49 observation stations were used in our analysis because another observation station (Daxigou) was located at the eastern edge of the Glacier No. 1 in Tianshan, may be due to the errors of geolocation or the land cover type data, the corresponding land cover type of this observation station was permanent glacier, but we found that the cumulative number of snow covered days in 16 years at this station was 1517 days, this means that this observation station was not sufficiently representative as it is generally believed that the annual average number of snow days in permanent snow areas is more than 320 days, we therefore give it up. In addition, the final result is related to the number and location of the observation sites, it may not be enough to explain the essence of the problem. The various land cover type may lead to different spatio-temporal distribution and "accumulation-melting" patterns of snow cover. Varying intensities of illumination of the heterogeneous terrain in complex mountainous areas, wind-drifting snow, avalanche, and snow cover left in garden and streets caused by man-made operations, all these factors make the problem more complicated, we will explore the dependence of gap-filling performance on land cover types by introduce higher resolution data from Landsat OLI 8, Sentinel-2, geostationary satellites in the future.

Dependence of Gap-Filling Performance on Land Cover Types
The spatio-temporal distribution and "accumulation-melting" pattern of snow cover is closely related to land cover types [8,10,45]. Thus, we examined the performance of each gap-filling step among different land cover types, as shown in Table 7. We found that the OA of original MODIS SCF products is approximately 50%. The highest accuracy appears in cropland, and relatively low accuracies occur in grasslands and shrubs. The disagreement between MU and MO errors occurred in the various land cover types, particularly in grassland and unused land. The performance of OA and Bias improved a lot, whereas there was no significant increase in MU and MO error. The significant improvements were found in cropland, forests, and urban and built up land, with OA values exceeding 95%. The improvements of grasslands and shrubs, and unused land are also obvious, with OA values of 87.08 and 91.17%, respectively. Note that, only 49 observation stations were used in our analysis because another observation station (Daxigou) was located at the eastern edge of the Glacier No. 1 in Tianshan, may be due to the errors of geolocation or the land cover type data, the corresponding land cover type of this observation station was permanent glacier, but we found that the cumulative number of snow covered days in 16 years at this station was 1517 days, this means that this observation station was not sufficiently representative as it is generally believed that the annual average number of snow days in permanent snow areas is more than 320 days, we therefore give it up. In addition, the final result is related to the number and location of the observation sites, it may not be enough to explain the essence of the problem. The various land cover type may lead to different spatio-temporal distribution and "accumulation-melting" patterns of snow cover. Varying intensities of illumination of the heterogeneous terrain in complex mountainous areas, wind-drifting snow, avalanche, and snow cover left in garden and streets caused by man-made operations, all these factors make the problem