MODIS Fractional Snow Cover Mapping Using Machine Learning Technology in a Mountainous Area

Abstract: To improve the poor accuracy of the MODIS (Moderate Resolution Imaging Spectroradiometer) daily fractional snow cover product over the complex terrain of the Tibetan Plateau (RMSE = 0.30), unmanned aerial vehicle and machine learning technologies are employed to map the fractional snow cover based on MODIS over this terrain. Three machine learning models, namely random forest, support vector machine, and back-propagation artificial neural network models, are trained and compared in this study. The results indicate that, compared with the MODIS daily fractional snow cover product, introducing a highly accurate snow map acquired by unmanned aerial vehicles as a reference into the machine learning models can significantly improve the MODIS fractional snow cover mapping accuracy. The random forest model shows the best accuracy among the three machine learning models, with an RMSE (root-mean-square error) of 0.23, and performs especially well over forestland and shrubland, with RMSEs of 0.13 and 0.18, respectively. Although the accuracies of the support vector machine and back-propagation artificial neural network models are worse over forestland and shrubland, their average errors are still better than that of MOD10A1. Different fractional snow cover gradients also affect the accuracy of the machine learning algorithms. Nevertheless, the random forest model remains stable across different fractional snow cover gradients and is, therefore, the best machine learning algorithm for MODIS fractional snow cover mapping in Tibetan Plateau areas with complex terrain and severely fragmented snow cover.


Introduction
Snow is an important water resource, and snowmelt water constitutes the main source for many rivers in mountainous areas, especially in arid regions [1]. The high albedo and low thermal conductivity of snow cover strongly affect the global radiant energy balance [2,3]. Known as the "Third Pole", the Tibetan Plateau (TP) has an average elevation exceeding 4000 m.a.s.l. Snow in the TP is the main water source for many major rivers in Asia, including the Yellow, Yangtze, Mekong, Salween, and Ganges Rivers. Previous studies have shown that the snow thickness in the TP can significantly affect the Indian Ocean summer monsoon and precipitation [4,5] and even the frequency of Eurasian heat waves [6]. Seasonal snow cover is the main factor that controls the phenology and growth of alpine grassland in the TP [7,8]. In addition, excessive snowfall causes the death of many livestock and great economic losses, making it one of the major natural disasters in the TP [9].
Due to terrain ruggedness and hostile conditions, it is very difficult to monitor snow cover using manual field-based techniques; thus, remote sensing technology has become the most effective method to monitor snow cover in the TP. However, due to the impacts from strong solar radiation, snow melts ...

Remote Sens. 2020, 12, x FOR PEER REVIEW
... varying accuracy of MOD10A1 among different regions. Therefore, obtaining real FSC information observed on the ground forms the basis of the MODIS FSC mapping algorithm and represents the main approach to improve the accuracy of snow cover monitoring. Moreover, the extremely high spatial resolution of UAVs can provide observations close to the ground, thereby providing reliable input parameters for the FSC mapping algorithm.
In the context of global warming, there is an urgent need for a real-time and accurate snow cover monitoring algorithm for the TP. Therefore, this study employs three machine learning algorithms based on UAV and MODIS data to improve the accuracy of the MODIS FSC product over the TP. The goals of this study are threefold: 1) to introduce true FSC values obtained by UAVs into the MODIS FSC inversion model for the TP, which is characterized by complex terrain; 2) to compare the efficiency of various machine learning algorithms in the FSC inversion process; 3) to develop an optimal FSC inversion model that can produce a more accurate FSC product based on MODIS data in the TP.

Study Area
The TP (26.00°N-52.77°N, 73.30°E-105.77°E) is located in Southwest China and exhibits an average elevation exceeding 4000 m.a.s.l. The total area of the TP is approximately 2.5 million km², 41.5% of which is covered by snow in winter [10]. The topography is complex, the elevation ranges from 100 to 8848 m across the plateau, and the land cover types include mainly grassland, forestland, shrubland, bare land and cropland (Figure 1). One of the most important resources offered by the TP is its water resources, which are stored mainly in the form of glaciers and snow. Meltwater is an important source of several major rivers in China and neighboring Asian countries [30,31].

UAVs
During a field survey from February 2017 to March 2018, 24 images were captured across the TP by using a Dajiang Inspire 1 Pro equipped with a GoPro Hero 3+ camera (Figure 1a). The coverage of each image was controlled to 750 m × 750 m, and then 500 m × 500 m images were clipped to correspond to MODIS pixels. Each image was composed of approximately 200 photos, in which the adjacent photos had an overlap of over 60%. Figure 2 shows four images as examples over different land cover types. Reference FSC data were then produced based on these UAV images to train the FSC machine learning models.

Landsat
For this study, seven Landsat 8 Operational Land Imager (OLI) images acquired under nearly clear-sky conditions were collected from the United States Geological Survey website (http://www.usgs.gov/) (Figure 1a, Table 1). These images contain all land cover types across the TP, and each land cover type has at least 10,000 OLI pixels.

MODIS
For this study, the MODIS daily snow cover product (MOD10A1 V006), surface spectral reflectance product (MOD09GA) and vegetation index product (MOD13A1) were utilized, and the data from these products correspond to the same dates as the UAV and Landsat data (http://www.nasa.gov). The MOD10A1 V006 product, which is an improved version of V005, includes four main data layers: the NDSI_Snow_Cover, NDSI, albedo and data quality assessment layers. In this study, cloudy pixels and meaningless pixels were eliminated. The FSC calculation formula for MOD10A1 (MOD) is as follows [33]:
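The equation itself is missing from the text above. As a minimal sketch, assuming the relation intended is the widely used linear NDSI regression for Terra MODIS (FSC = -0.01 + 1.45 × NDSI), with the NDSI computed from the green (band 4) and shortwave-infrared (band 6) reflectances, the calculation could look as follows. The function names, the handling of a zero denominator and the clipping to [0, 1] are illustrative assumptions, not taken from the paper.

```python
def ndsi(green, swir):
    """Normalized difference snow index from green and SWIR reflectance."""
    total = green + swir
    if total == 0:
        return 0.0  # assumption: treat a zero denominator as no snow signal
    return (green - swir) / total


def fsc_from_ndsi(ndsi_value, intercept=-0.01, slope=1.45):
    """Linear NDSI-to-FSC regression, clipped to the physical range [0, 1]."""
    fsc = intercept + slope * ndsi_value
    return min(1.0, max(0.0, fsc))
```

Keeping the intercept and slope as parameters makes it easy to substitute other regression coefficients if a different sensor or calibration is assumed.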

Auxiliary Data
The Shuttle Radar Topography Mission digital elevation model (SRTM-DEM) with a spatial resolution of 90 m was used in this study (http://www.usgs.gov). The DEM was resampled to a resolution of 500 m and used as the input elevation data to train the machine learning model. MCD12Q1 is an annual land cover type product (https://lpdaac.usgs.gov/products/mcd12q1v006/) with a spatial resolution of 500 m [32]. In this study, to reduce complexity and improve efficiency, the MCD12Q1 land cover types in our study area during 2017 in the International Geosphere-Biosphere Programme (IGBP) classification scheme were reclassified into six categories, namely, grassland, forestland, shrubland, bare land, cropland and other lands.

Landsat Snow Cover Mapping Algorithm
The SNOWMAP algorithm was used to generate a Landsat 8 OLI binary snow map, and the normalized difference forest snow index (NDFSI) and normalized difference vegetation index (NDVI) were employed during binary snow mapping to improve the detection of snow in forested areas [29]. The snow map was then aggregated to the FSC product with a spatial resolution of 500 m.
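The aggregation step can be sketched as a block average: the FSC of each coarse cell is the share of fine binary snow pixels that fall inside it. This is a simplified illustration that assumes the fine grid divides evenly into square blocks of `factor` × `factor` pixels; because 500 m is not an integer multiple of the 30 m OLI pixel size, a real workflow would need an area-weighted resampling instead. The function name and the example grid are hypothetical.

```python
def aggregate_to_fsc(snow_map, factor):
    """Average a binary snow map (1 = snow, 0 = no snow) over factor x factor blocks."""
    rows, cols = len(snow_map), len(snow_map[0])
    if rows % factor or cols % factor:
        raise ValueError("grid size must be a multiple of factor")
    fsc = []
    for r0 in range(0, rows, factor):
        fsc_row = []
        for c0 in range(0, cols, factor):
            # Collect all fine pixels that belong to this coarse cell.
            block = [snow_map[r][c]
                     for r in range(r0, r0 + factor)
                     for c in range(c0, c0 + factor)]
            fsc_row.append(sum(block) / len(block))
        fsc.append(fsc_row)
    return fsc


# Hypothetical 4 x 4 binary snow map aggregated to a 2 x 2 FSC grid.
binary = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 1],
]
coarse = aggregate_to_fsc(binary, 2)
```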

Machine Learning
Three popular machine learning algorithms in snow cover mapping, including the random forest (RF), support vector machine (SVM) and back-propagation artificial neural network (BP-ANN) algorithms, were used to train the FSC inversion model in our study area. The BP-ANN algorithm is a commonly used machine learning algorithm that is composed of an input layer, a hidden layer and an output layer [34,35]. In this study, the parameters labeled error.criterium, Stao and method were set to "LMS", "NA" and "ADAPTgdwm", respectively. The number of neurons in the input layer and the number of neurons in the hidden layer are the two most important parameters of a neural network model. However, an excessive number of neurons will lead to overfitting or unnecessary operation of the model, which will lead to a low modeling efficiency [27,36]. In this study, 240 BP-ANN models were constructed on the basis of a previous empirical model [25], where the numbers of neurons in the input and hidden layers were 6 and 48, respectively, which resulted in the optimal BP-ANN model for this study.
The RF model is a nonlinear algorithm that employs a series of decision trees to achieve sufficient prediction accuracy. The theoretical basis of this model is the classification tree algorithm [37]. The bootstrap sampling method is used; accordingly, the final prediction result is selected from multiple decision trees by voting. The higher the degree of repetition is, the better the effect of the RF model.
The SVM model is a supervised learning algorithm that is often used in data analysis and pattern recognition [38]. The classification and regression processes in an SVM compose its core algorithm and are based on a group of hyperplanes constructed in high-dimensional space. The main difficulty in constructing an SVM model is setting the gamma and cost parameters; in this study, these two parameters were tuned by using the tune.svm function. The kernel function is another important choice; in this study, the Gaussian kernel function was used.
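For reference, the Gaussian (radial basis function) kernel is commonly written as K(x, x') = exp(-gamma * ||x - x'||^2), where gamma is the parameter being tuned. The short sketch below evaluates this expression for two feature vectors; the function name is illustrative and the snippet is not the implementation used in the study.

```python
import math


def gaussian_kernel(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```

A larger gamma makes the kernel value fall off faster with distance, so each training sample influences a smaller neighbourhood; this is one reason why gamma and cost are usually tuned together.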

Validation
For all three models described above, the 10-fold cross-validation method was utilized to adjust the model structure [39]. The FSC distribution retrieved from the UAV was used as the output layer, and the corresponding factors, namely, the NDSI, R1, R2, R3, R4, R5 and DEM, were utilized as the input layer; the samples were randomly divided into 10 parts for cross-validation. For the RF, SVM and BP-ANN models, one part was selected in each iteration as the test set, and the remaining 9 parts were used as the training set; a total of 10 iterations were performed. The goodness of fit (R²), RMSE and bias were used to evaluate the fitting ability of each model, and the mean value over the 10 iterations was taken. After adjusting the model structure, FSC model mapping and verification were carried out. The average accuracy (ACC), RMSE, positive average error (PAE), negative average error (NAE) and average error (AE) were adopted to evaluate the FSC models [25]. In the formulas for these measures, y_i represents the real FSC, y_i' represents the simulated value, i represents a specific sample, n represents the number of samples, and m and r represent the numbers of pixels for which the simulated value is larger and smaller than the real FSC, respectively.
Figure 3 shows the snow maps retrieved from the UAV based on supervised classification. The FSC was calculated and used as the ground truth to train the three machine learning models. Before a model is trained, however, the input parameters should be screened [25]. The method employed to screen the parameters was univariate regression, which was applied between the ground truth and each parameter, and the significance of each parameter was tested by R² and Pearson's correlation coefficient. The test results are shown in Table 2.
As a result, the surface reflectance in R1, R2, R3, R4 and R5, the NDSI and altitude were ultimately selected as the modeling parameters through the comprehensive consideration of R² and Pearson's correlation coefficients.
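The screening step can be illustrated with a small sketch. For a univariate linear regression, the coefficient of determination R² equals the square of Pearson's correlation coefficient, so both statistics follow from one computation. The function names are hypothetical, and the sample values below are made up for illustration; they are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def screen_parameter(values, reference_fsc):
    """Return (r, R^2) of a univariate linear regression of FSC on one parameter."""
    r = pearson_r(values, reference_fsc)
    return r, r * r


# Hypothetical NDSI samples and reference FSC values.
ndsi_samples = [0.1, 0.3, 0.5, 0.7]
fsc_samples = [0.2, 0.4, 0.6, 0.8]
r, r2 = screen_parameter(ndsi_samples, fsc_samples)
```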

FSC Mapping
A sample of verification areas was randomly selected, and the FSC results retrieved from the Landsat OLI imagery, RF, SVM, BP-ANN and MOD10A1 (MOD) are displayed in Figure 4. Compared with the MOD10A1 FSC, the FSC distributions extracted by the three machine learning algorithms are close to the OLI FSC. The machine learning algorithms seem to overestimate the FSC slightly, but MOD10A1 definitively underestimates the FSC. Furthermore, machine learning algorithms have advantages in extracting snow cover in areas with fragmented snow cover, especially in the transition zone between areas with and without snow cover. In particular, the RF model performs the best among the three machine learning models compared with the OLI data, and the BP-ANN model significantly overestimates the FSC at the edge of the snow distribution.
Figure 5 shows that the BP-ANN and SVM models perform worse than the RF model and MOD10A1 in forestland and shrubland; moreover, these two models tend to overestimate the FSC more than the RF model and MOD10A1, and the RMSE is larger than 0.3. In particular, the RMSE of the BP-ANN in bare land reaches 0.34. In contrast, MOD10A1 performs worse in grassland, cropland and other lands. However, the RF performs best among the three machine learning models. In each land cover type except forestland, MOD10A1 underestimates the FSC in comparison with the Landsat OLI. The FSC distributions of the three machine learning models are close to those of the Landsat OLI in each land cover type; the AE of the RF model is especially good, between −0.07 and 0.05. However, the BP-ANN and SVM overestimate the FSC in forestland and shrubland, and the AE is larger than 0.3 for both models. The PAE of the RF model is the lowest, and the NAE of the BP-ANN is the lowest over each land cover type. Overall, the RF model performs the best among the other machine learning models on the TP; the RMSE, AE, PAE and NAE of the RF model are all lower than those of the other models, with average values of 0.23, 0.01, −0.17 and 0.15, respectively. In particular, the RF model performs better in forestland and shrubland, with RMSEs of only 0.13 and 0.18, respectively. Table 3 shows the overall accuracy of each FSC inversion model in the TP. These results further confirm that the three machine learning models overestimate the FSC and that MOD10A1 underestimates the FSC in the TP.
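The error measures reported above can be sketched from their verbal definitions in the Validation section. The sketch assumes that the error is defined as simulated minus real FSC, so that PAE averages over the m overestimated pixels and NAE over the r underestimated pixels. The sign convention and the function name are assumptions, and ACC is omitted because its formula is not given in the text.

```python
import math


def error_metrics(real, simulated):
    """Return RMSE, AE, PAE and NAE for paired real and simulated FSC values."""
    errors = [s - y for y, s in zip(real, simulated)]  # assumption: simulated - real
    n = len(errors)
    positive = [e for e in errors if e > 0]  # the m overestimated pixels
    negative = [e for e in errors if e < 0]  # the r underestimated pixels
    return {
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "AE": sum(errors) / n,
        "PAE": sum(positive) / len(positive) if positive else 0.0,
        "NAE": sum(negative) / len(negative) if negative else 0.0,
    }


# Hypothetical real and simulated FSC values for four pixels.
metrics = error_metrics([0.2, 0.5, 0.8, 1.0], [0.3, 0.4, 0.8, 0.9])
```

Reporting PAE and NAE alongside AE is useful because over- and underestimates can cancel in AE even when both are large.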

Accuracy Verification
Combined with Figure 6, the results in Table 4 further indicate that the RF model is more suitable than the other models and the MOD10A1 product for inverting the FSC in the TP, with an overall ACC of 0.84.

Table 4.
        MOD    RF     SVM    ANN
ACC     0.77   0.84   0.80   0.81
To explore the accuracy of each model under different FSC gradients, the verified pixels are divided into four classes based on snow maps retrieved from Landsat OLI, namely, areas with low values (FSC between 0% and 25%), medium-low values (FSC between 25% and 50%), medium-high values (FSC between 50% and 75%), and high values (FSC between 75% and 100%). Figure 7 shows the accuracy distributions of each FSC model under various fractional snow cover gradients over the Tibetan Plateau. Results indicate that the effects of FSC gradients on each model are different. From areas with low FSC values to high FSC values, the ACC of MOD10A1 decreases significantly.
The ACCs of RF and BP-ANN in areas with low and high FSC values are higher than those in areas with medium FSC values. For the SVM, the ACC in areas with medium FSC values is higher than that in areas with low and high FSC values. In low-FSC areas, MOD10A1 performs the best with an ACC of 0.95, whereas the SVM performs the worst with an ACC of only 0.78. In areas with medium-low and medium-high FSC values, the SVM model performs best with a combined ACC of 0.85. However, the BP-ANN and MOD10A1 perform the worst in areas with medium-low and medium-high FSC values with ACCs of 0.50 and 0.59, respectively. In high FSC areas, the BP-ANN model performs the best, and the ACC reaches 0.94; however, the ACC of MOD10A1 in high FSC areas is only 0.53. Nevertheless, the RF model does not exhibit outstanding performance under any FSC gradient; instead, the RF model displays relatively stable performance under all FSC gradients.
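The grouping into FSC gradients can be sketched as a simple binning function. The text does not say to which class the boundary values 25%, 50% and 75% belong, so the half-open intervals used here are an assumption, as are the function names.

```python
def fsc_gradient(fsc_percent):
    """Assign a reference FSC value (in percent) to one of four gradient classes."""
    if not 0 <= fsc_percent <= 100:
        raise ValueError("FSC must lie between 0 and 100 percent")
    if fsc_percent < 25:
        return "low"
    if fsc_percent < 50:
        return "medium-low"
    if fsc_percent < 75:
        return "medium-high"
    return "high"


def group_by_gradient(reference, simulated):
    """Group (reference, simulated) FSC pairs by the class of the reference value."""
    groups = {}
    for ref, sim in zip(reference, simulated):
        groups.setdefault(fsc_gradient(ref), []).append((ref, sim))
    return groups
```

Each group can then be passed to whichever accuracy measure is being compared, so that the models are evaluated per gradient rather than over all pixels at once.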
Analysis of variance (F-test) between samples is an important process to verify whether the differences in model accuracy obtained under different conditions are statistically significant. The land cover types and FSC gradients have significant influences on the model accuracy (p < 0.05) (Figures 6 and 7). The results show that the accuracies of the three machine learning models are significantly higher than that of MOD10A1; therefore, these machine learning algorithms can significantly improve the FSC mapping accuracy relative to the linear regression algorithm employed by MOD10A1.
The effects of the land cover type and FSC gradient on the accuracy of each FSC inversion model in this study are combined. Figure 8 demonstrates that the effects of forestland and shrubland on the FSC inversion of each model are the same; similarly, the influences of grassland, cropland, bare land and other lands on the FSC inversion of each model are also the same. Overall, in forestland and shrubland, MOD10A1 and the RF model provide FSC distributions that are closer to those derived from Landsat OLI, while the SVM and BP-ANN models overestimate the FSC. In grassland, cropland, bare land and other lands, the FSC distributions from the RF model, SVM model and BP-ANN model are close to the Landsat OLI FSC; however, MOD10A1 underestimates the FSC compared to Landsat OLI.
Furthermore, the agreement between MOD10A1 and Landsat OLI decreases with an increase in the FSC for each land cover type. Similarly, the agreement between the RF model and Landsat OLI decreases with increasing FSC in forestland and shrubland, and the reverse FSC trend becomes closer to the Landsat OLI FSC distribution in other land cover types. The SVM overestimates the FSC in forestland and shrubland when the FSC is smaller than 50% and exhibits an underestimation trend when the FSC is larger than 50% in each land cover type. Overall, the BP-ANN overestimates the FSC in each land cover type.

Discussion
Although the accuracy of MOD10A1 is high in Europe, America and other regions, many problems are still encountered when adapting to the complex terrain of the TP. Accordingly, the complexity of snow monitoring throughout the TP has resulted in considerable challenges to important work involving the determination of snow cover. However, in recent years, with the introduction of machine learning and high-precision UAV data, the combination of these technologies has become an effective way to improve the accuracy of snow monitoring in mountainous areas with complex terrain, such as the TP.
The introduction of the BP-ANN algorithm into the FSC inversion for the Heihe River basin improved the FSC inversion accuracy compared with the MODIS global daily FSC product [26]. In addition, the RF method was employed in a snow depth inversion model based on passive microwave data in the Reynolds Creek Mountains, and the results demonstrated that the RMSE of the snow depth compared with ground-based observations declined to 0.09 m [40]. The SVM algorithm is also capable of accurately extracting and distinguishing areas with dry snow, wet snow and no snow in permafrost regions [41]. The SVM algorithm was further found to be able to extract snow information in mountainous areas with sufficient accuracy [42].
In this study, different machine learning models were employed for FSC inversion and compared with the global MODIS daily snow cover product. The results indicated that the machine learning algorithms are more accurate than the MOD10A1 product in the TP. Among the three machine learning models, the RF model combined good efficiency with the best accuracy in the FSC inversion over the complex terrain of the TP, especially in forestland and shrubland. However, the models performed inconsistently under different FSC gradients, mainly as a result of their algorithm structures. Although the BP-ANN has a strong self-learning ability [43], it has obvious defects. First, the BP-ANN algorithm is a local search optimization method that easily falls into local extrema, which leads to training failure [44]. Second, no unified and complete theoretical guidance exists for selecting the network structure; consequently, the network parameters can be selected only by experience, even though the structure of the network directly affects its training and prediction abilities [45]. In this study, once the training ability improved beyond a certain threshold, the prediction ability decreased; that is, overfitting appeared, mainly because of an excessive number of neurons [46,47]. Hence, the BP-ANN performed worse in the transition zone between snow-covered and snow-free areas and overestimated the FSC in regions with fragmented snow cover. In contrast, nonlinear mapping constitutes the theoretical basis of the SVM algorithm, which uses an inner-product kernel function instead of an explicit nonlinear mapping to a high-dimensional space [48]. However, while the SVM model is mature for classification problems, it still needs to be improved for regression problems and encounters difficulties with large sample datasets; moreover, it often fails because of storage and computation limitations [49]. Finally, in the RF model, each tree randomly selects some samples and some features to avoid overfitting; consequently, the model has good noise resistance and stable performance [50,51]. Furthermore, the RF model can handle very high-dimensional data without a separate feature selection step [52].
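The random selection of samples and features described for the RF model can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names, the sample and feature counts, and the seed are hypothetical, and the feature subset is drawn once per tree to follow the description above, whereas common RF implementations redraw it at every split.

```python
import random


def draw_tree_inputs(n_samples, n_features, max_features, rng):
    """Pick the rows and columns that one tree of the ensemble would see."""
    # Bootstrap: draw row indices with replacement, as many as there are samples.
    rows = [rng.randrange(n_samples) for _ in range(n_samples)]
    # Feature subset: draw column indices without replacement.
    cols = sorted(rng.sample(range(n_features), max_features))
    return rows, cols


def ensemble_mean(tree_predictions):
    """Average the per-tree FSC predictions for a single pixel."""
    return sum(tree_predictions) / len(tree_predictions)


# Hypothetical sizes: 100 training pixels, 8 input variables, 3 variables per tree.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows, cols = draw_tree_inputs(n_samples=100, n_features=8, max_features=3, rng=rng)
```

Because every tree is fitted to a different resampling of the pixels and a different subset of variables, the errors of individual trees are only partly correlated, and averaging them with `ensemble_mean` damps the noise that a single model would fit.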
Traditional remote sensing approaches to FSC inversion mainly comprise two methods: linear regression and spectral mixture analysis. However, previous studies have shown that these two algorithms perform poorly in the Tibetan Plateau because of snow heterogeneity [24,25]. Ground observations indicate an average snow depth of around 5 cm, and because the snow in the Tibetan Plateau is this shallow, it is difficult to monitor the snow cover accurately using remote sensing [53]. Strong solar radiation, wind-blown snow, and rugged terrain are the main reasons why the snow is patchy [10,35]. This study concludes that machine learning algorithms can obtain satisfactory FSC inversion accuracy over complex terrain and regions with fragmented snow distribution in the TP. Specifically, the RF model displays acceptable accuracy when snow is patchy and is robust across different land cover types. However, machine learning is a form of artificial intelligence, which itself belongs to computer science, and the interpretability of its computing processes is limited; this is the so-called "black box" of artificial intelligence. With further understanding of snow radiation characteristics and distribution patterns, the selection of the variables that affect the FSC inversion can be fully controlled. Machine learning can then carry out optimized simulations based on the selected variables, avoiding the influence of human factors on model construction. Therefore, machine learning provides a reliable choice for snow cover inversion based on remote sensing in mountainous areas. However, more work is required to obtain more effective input variables that improve the efficiency of machine learning.

Conclusions
In this study, three FSC machine learning models (RF, SVM and BP-ANN) are trained and compared based on MODIS, UAV and other auxiliary data. Then, the fitting ability is evaluated using the 10-fold cross-validation method with Landsat 8 OLI data under various land cover types and FSC gradients. The results indicate that introducing a highly accurate UAV-acquired snow map as a reference into a machine learning model can significantly improve the MODIS FSC mapping accuracy. The main conclusions are as follows:
(1) Compared with MOD10A1, the machine learning FSC algorithms employed in this study can significantly improve the FSC mapping accuracy. The FSC distributions generated by the three machine learning models are close to (or slightly overestimate) the real FSC obtained by Landsat, whereas MOD10A1 severely underestimates the FSC in the TP.
(2) The land cover type is the main factor affecting the FSC inversion accuracies of the machine learning algorithms and MOD10A1. Machine learning algorithms can significantly reduce the effects of different land cover types on the extraction of FSC using MODIS. In particular, the RF algorithm significantly improves the FSC extraction accuracy in forestland and shrubland. The RMSEs of the RF model in forestland and shrubland are only 0.13 and 0.18, respectively, whereas the corresponding MOD10A1 RMSEs are 0.21 and 0.23, respectively.
(3) Various FSC gradients also affect the FSC inversion accuracy. The accuracy of MOD10A1 decreases with an increasing FSC gradient. The accuracies of the three machine learning models also change with the FSC gradient, although they still achieve acceptable accuracy compared to MOD10A1. Ultimately, the RF model performs the best, with the most stable accuracy among the three machine learning models.
(4) Finally, the RF algorithm shows high accuracy that is generally unaffected by changes in the land cover types and FSC gradients in the TP.
Therefore, this study proposes the RF algorithm as the most effective method for resolving the problems associated with fragmented snow cover in the complex terrain of the TP. Applied to MODIS data, the RF model improves the FSC inversion accuracy compared with MOD10A1.
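The RMSE comparisons by land cover type reported above rest on a simple stratified calculation, which can be sketched in Python as follows. This is a minimal illustration, not the evaluation code of this study: the function names are hypothetical, and the four FSC values are made-up toy numbers, not data from the paper.

```python
import math
from collections import defaultdict


def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference FSC values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_by_group(predicted, reference, groups):
    """Compute one RMSE per group label, e.g. per land cover type."""
    buckets = defaultdict(lambda: ([], []))
    for p, r, g in zip(predicted, reference, groups):
        buckets[g][0].append(p)  # predictions for this group
        buckets[g][1].append(r)  # reference values for this group
    return {g: rmse(ps, rs) for g, (ps, rs) in buckets.items()}


# Toy values for illustration only (not results from this study).
pred = [0.5, 0.7, 0.2, 0.4]    # model FSC per pixel
ref = [0.4, 0.6, 0.2, 0.8]     # reference FSC per pixel
groups = ["forestland", "forestland", "shrubland", "shrubland"]
by_class = rmse_by_group(pred, ref, groups)
```

The same function can stratify by FSC gradient instead of land cover if each pixel is labeled with the interval that its reference FSC falls in.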
Author Contributions: All authors contributed significantly to this manuscript. X.H. designed this study and provided editorial advice and feedback on the manuscript. C.L. performed the data analysis and prepared the manuscript, and X.L. helped with the data analysis. Professor T.L. helped with the manuscript preparation. All authors have read and agreed to the published version of the manuscript.