A New Application of Random Forest Algorithm to Estimate Coverage of Moss-Dominated Biological Soil Crusts in Semi-Arid Mu Us Sandy Land, China

Biological soil crusts (BSCs) play an essential role in desert ecosystems. Knowledge of the distribution and disappearance of BSCs is vital for the management of ecosystems and for desertification researches. However, the major remote sensing approaches used to extract BSCs are multispectral indices, which lack accuracy, and hyperspectral indices, which have lower data availability and require a higher computational effort. This study employs random forest (RF) models to optimize the extraction of BSCs using band combinations similar to the two multispectral BSC indices (Crust Index-CI; Biological Soil Crust Index-BSCI), but covering all possible band combinations. Simulated multispectral datasets resampled from in-situ hyperspectral data were used to extract BSC information. Multispectral datasets (Landsat-8 and Sentinel-2 datasets) were then used to detect BSC coverage in Mu Us Sandy Land, located in northern China, where BSCs dominated by moss are widely distributed. The results show that (i) the spectral curves of moss-dominated BSCs are different from those of other typical land surfaces, (ii) the BSC coverage can be predicted using the simulated multispectral data (mean square error (MSE) < 0.01), (iii) Sentinel-2 satellite datasets with CI-based band combinations provided a reliable RF model for detecting moss-dominated BSCs (10-fold validation, R2 = 0.947; ground validation, R2 = 0.906). In conclusion, application of the RF algorithm to the Sentinel-2 dataset can precisely and effectively map BSCs dominated by moss. This new application can be used as a theoretical basis for detecting BSCs in other arid and semi-arid lands within desert ecosystems.


Introduction
Biological soil crusts (BSCs) containing microphytic communities (i.e., cyanobacteria, lichens, liverworts, and mosses), grow within or directly on top of soil [1]. BSCs are the primary producers, sinks of carbon and nitrogen, and soil stabilizers, and they mainly exist in arid and semi-arid areas that cover over 35% of global land surfaces [1][2][3][4]. BSCs are a top management priority in desertified lands because of their extreme vulnerability to disturbances from human activities and climate change, which have recently been shown to negatively affect such areas [1]. It is essential to obtain accurate Yanchi County. The annual mean temperature ranges from 6.0 °C to 8.5 °C [19]; precipitation occurs mainly in July and September (particularly during August) and this accounts for 60-75% of the annual total precipitation. The potential annual evaporation is 2300 mm, which is six times that of annual precipitation on average. Northwest winds prevail in winter, spring, and autumn, and southeast winds prevail during summer [19]. The soil type is loose aeolian sandy soil, and the land is barren and vulnerable to wind erosion. More than 80% of sandy areas are covered by sandy grassland, and the dominant species is Artemisia ordosica [20]. Moss-dominated BSC is widespread within the A. ordosica community and is a potent indicator of the fixation phase of sand dunes [20]. Other natural vegetation types, including steppe, meadow, and shrub exist within Mu Us Sandy Land, and farmlands are distributed along the river or scattered in the sandy grasslands, artificial forests, and shrubs [21]. Mu Us Sandy Land is one of the 12 sandy zones in China, but is the only one located in an intermediate region between the typical steppe and the desert. As it belongs to semiarid continental climate, Mu Us Sandy Land is sensitive to climate change as well as changes in land utilization [21]. In this study, three field campaigns were undertaken in 2017 and 2018 within Mu Us Sandy Land during the late growing season to determine the growth peak of BSCs.

In-Situ Hyperspectral Dataset
A field survey was conducted between 28 June and 4 July, 2017. A portable spectrometer (ASD Field Spec Handheld 2) was used to measure in-situ spectral reflectance at 352 points ( Figure 2a) on mixed and typical land surfaces such as BSCs, bare sand, and different types of plants. The handheld instrument was used to obtain measurements at wavelength increments of 2 nm between 325 and 1075 nm, with a 15° field of view (FOV), at a height of approximately 1 m above the ground. Spectral measurements were taken under bright and sunny conditions from 10:00 to 15:00 Chinese standard time (UTC+8). Ten spectral curves were measured at each point and the calculated mean values were

In-Situ Hyperspectral Dataset
A field survey was conducted between 28 June and 4 July, 2017. A portable spectrometer (ASD Field Spec Handheld 2) was used to measure in-situ spectral reflectance at 352 points ( Figure 2a) on mixed and typical land surfaces such as BSCs, bare sand, and different types of plants. The handheld instrument was used to obtain measurements at wavelength increments of 2 nm between 325 and 1075 nm, with a 15 • field of view (FOV), at a height of approximately 1 m above the ground. Spectral measurements were taken under bright and sunny conditions from 10:00 to 15:00 Chinese standard time (UTC+8). Ten spectral curves were measured at each point and the calculated mean values were taken as the final reflectance spectra. In all, BSC spectra were obtained from 138 plots. Furthermore, spectral data of other ground objects were from 214 plots. taken as the final reflectance spectra. In all, BSC spectra were obtained from 138 plots. Furthermore, spectral data of other ground objects were from 214 plots.

Simulated Multispectral Dataset
To determine the subtle spectral features that could be used to estimate different BSC coverage from multispectral satellite images, in-situ hyperspectral data were firstly resampled using multispectral Landsat-8 and Sentinel-2 channels (see Table 1) by employing the spectral response functions of the respective sensors to simulate the satellite dataset [22,23]. As the distance between the handheld spectrometer and the objects on the ground was short, the BSC coverage was recorded in real-time (by analyzing the instantaneous digital photos (Figure 2c,d; see details in Section 2.2.3). The simulated dataset was not influenced by issues, such as atmospheric effects or the delay between the field survey and remote sensing data acquisition, which occur in remote sensing images [17,24]. As wavelengths in the 325-400 nm and 900-1075 nm ranges were affected by severe noise, only the 400-900 nm wavelength range was considered in the analysis (Table 1).

Simulated Multispectral Dataset
To determine the subtle spectral features that could be used to estimate different BSC coverage from multispectral satellite images, in-situ hyperspectral data were firstly resampled using multispectral Landsat-8 and Sentinel-2 channels (see Table 1) by employing the spectral response functions of the respective sensors to simulate the satellite dataset [22,23]. As the distance between the handheld spectrometer and the objects on the ground was short, the BSC coverage was recorded in real-time (by analyzing the instantaneous digital photos (Figure 2c,d; see details in Section 2.2.3). The simulated dataset was not influenced by issues, such as atmospheric effects or the delay between the field survey and remote sensing data acquisition, which occur in remote sensing images [17,24]. As wavelengths in the 325-400 nm and 900-1075 nm ranges were affected by severe noise, only the 400-900 nm wavelength range was considered in the analysis (Table 1).

Satellite Multispectral Dataset
To enable a comparison with the simulated multispectral dataset, bands of satellite images ranging from 400 to 900 nm were considered (Table 1). These geometrically corrected sensor data are available for Landsat-8 as Surface Reflectance (SR) images defined in the Worldwide Reference System (WRS) path/row coordinate system [25,26] and for Sentinel-2 as Level-2A products defined in Bottom-Of-Atmosphere (BOA) granules, also called tiles, which are 100 × 100 km 2 ortho-images in UTM (Universal Transverse Mercator) WGS84 (World Geodetic System 1984) projection [27]. To validate their applicability to BSC mapping, 5 scenes of Landsat-8 OLI image data and 15 scenes of Sentinel-2 MSI image data covering the entire study area under cloudless conditions were acquired and analyzed ( Table 2). The Landsat-8 has a swath of approximately 185 km (15" FOV from a height of 705 km) and offers global area coverage every 16 days [25]. The SR products have a 30 m spatial resolution and are produced at the Earth Resources Observation and Science (EROS) Center using the Landsat Surface Reflectance Code (LaSRC) [22].
The Sentinel-2 swath is approximately 290 km (20.6" FOV from a height of 786 km) and offers global area coverage every 10 days [23,27]. The Sen2Cor tool serves to generate and format the Sentinel-2 Level 2A product to correct for the atmosphere, terrain, and cirrus from Top-Of-Atmosphere (TOA) Level 1C input data [28]. The Level 2A BOA product includes three different resolutions of 60, 20, and 10 m [27,28]. The resolutions of Sentinel-2 dataset used were converted to 20 m.

BSC Coverage on Pixel Scale obtained from Quadrat Survey Data
During the second and third fieldwork conducted on 16-28 August and 23-29 October, 2018, the species composition of plants and general BSC coverage were obtained from a total of 319 surveyed units ( Figure 1). A rope square measuring 30 m × 30 m was arranged in the field within each survey unit (based on the spatial resolution of the Landsat-8 image is 30 m) to simulate one pixel of remote sensing data. To avoid corresponding problems between the image pixel and ground pixel, the survey pixel located the middle of the area with uniform and consistent landscape in a scale of approximately 500 × 500 area was chosen. To calculate the area covered by BSCs, a 1 m × 1 m wire square was employed on foot within each rope square [29]. Quadrat surveys were assisted by an unmanned aerial vehicle (UAV), and the UAV was flown at a height of 8 m to obtain a detailed coverage of BSCs ( Figure 3). Furthermore, the coordinates of each survey unit were recorded using a GNSS receiver (Garmin GPS 72, ±15 m) to validate the BSC distribution maps via remote sensing data ( Figure 1). GNSS receiver (Garmin GPS 72, ±15 m) to validate the BSC distribution maps via remote sensing data ( Figure 1).

Band Combinations from BSC Indices
To capture the subtle spectral features of BSCs, reflectance feature spaces composed of multispectral indices similar to the CI [10] and BSCI [11], but covering all possible combinations of bands, were calculated to build models using the following: where R is the reflectance spectra at a wavelength of λ , λ , and λ ; and R is the mean reflectance of R , R , and R . To amplify the absolute difference between R and R , L was set to 2 as an adjustment parameter, based on observations of Chen et al. [11]. Different bands were recombined to generate both the CI-based band combinations (for Landsat data, there are C = 10 conditions; for Sentinel data, there are C = 21 conditions), and the BSCI-based band combinations (for Landsat data, there are C = 10 conditions; for Sentinel data, there are C = 35 conditions).

Random Forest (RF) Regression Models
The RF models are generated from an association between the bagging method and randomized subspace method [15]. Every decision tree grows until it reaches a predefined minimum node (nodesize) via a random feature selection in the training dataset. In this study, the number of trees (ntree) was set to 10,000. The size of the variable's subset (mtry) and the nodesize was set to 5 [16]. The optimization of the parameters was conducted using the randomForest package based on R Version 3.5 [15]. The randomForest() function was used to set ntree. The tuneRF() function was used to set mtry. The treesize() function was used to set nodesize. RF is good at measuring the importance of variables (i.e., the importance of every variable to the performance of a model) [18], which can assist in ranking the useful spectral band combinations from the multispectral data employed to

Band Combinations from BSC Indices
To capture the subtle spectral features of BSCs, reflectance feature spaces composed of multispectral indices similar to the CI [10] and BSCI [11], but covering all possible combinations of bands, were calculated to build models using the following: where R is the reflectance spectra at a wavelength of λ i , λ j , and λ k ; and R mean λ i λ j λ k is the mean reflectance of R λ i , R λ j , and R λ k . To amplify the absolute difference between R λ i and R λ j , L was set to 2 as an adjustment parameter, based on observations of Chen et al. [11]. Different bands were recombined to generate both the CI-based band combinations (for Landsat data, there are C 2 5 = 10 conditions; for Sentinel data, there are C 2 7 = 21 conditions), and the BSCI-based band combinations (for Landsat data, there are C 3 5 = 10 conditions; for Sentinel data, there are C 3 7 = 35 conditions).

Random Forest (RF) Regression Models
The RF models are generated from an association between the bagging method and randomized subspace method [15]. Every decision tree grows until it reaches a predefined minimum node (nodesize) via a random feature selection in the training dataset. In this study, the number of trees (ntree) was set to 10,000. The size of the variable's subset (mtry) and the nodesize was set to 5 [16]. The optimization of the parameters was conducted using the randomForest package based on R Version 3.5 [15]. The randomForest() function was used to set ntree. The tuneRF() function was used to set mtry. The treesize() function was used to set nodesize. RF is good at measuring the importance of variables (i.e., the importance of every variable to the performance of a model) [18], which can assist in ranking the useful spectral band combinations from the multispectral data employed to estimate BSC coverage. The first measure of variable importance is calculated from permuting out-of-bag (OOB) Remote Sens. 2019, 11, 1286 7 of 18 data (%IncMSE), and the mean square error (MSE) for each tree on the OOB portion of the dataset is computed. After permuting each predictor variable, the MSE is computed again, and the mean value (the difference between the two MSEs among all the trees) is then calculated, and normalized by the standard deviation of the differences [18]. The second measure of variable importance is the residual sum of squares for the regression of the total decrease in node impurities from splitting the variables, and this measure is also averaged over all trees (IncNodePurity) [18]. For different combinations of bands based on CI and BSCI indices, RF is iteratively fitted, that is, at each iteration, new forests are developed in the model one after another (starting with the most important ones) [30]. The rfcv function in the R package randomForest is used to show the cross-validated prediction performance (error.cv: corresponding vector of MSEs at each step) of models with a descending number of predictors (n.var: ranked by variable importance) based on a nested cross-validation procedure [31]. Therefore, the nested subset of the combination of bands in the IncNodePurity ranking that had the lowest error rate was used as the optimal band combinations for predicting the coverage of BSCs.

Accuracy Assessment
Four evaluation parameters appropriate for the continuous model were selected: Coefficient of Determinant (R 2 ), Mean Absolute Error (MAE), Mean Square Error (MSE) and Normalized Mean Square Error (NMSE) [31], as follows: For models on a hoop scale, a 10-fold cross-validation was chosen, as it is one of the most preferred techniques used to evaluate models and is acknowledged to be better than the use of residually based metrics [31,32].
For models on a pixel scale, the 319 quadrat survey plots previously mentioned were separated into two parts: 269 plots were randomly chosen as training data for calibrating the model by 10-fold cross-validation (white points in Figure 1) and the remaining 50 plots were selected as testing data for ground validation (red points in Figure 1).
The methods described above were conducted using R Version 3.5 ( Figure 4). The band combination steps using BSCs indices were implemented using the R package, hsdar [24,33]. The R packages, caret [34] and randomForest [18], was used for training and testing the RF model, and the rgdal package [35] was used for processing geospatial data. Remote Sens. 2019, 11, x FOR PEER REVIEW 8 of 19 Figure 4. Flow chart of the study process.

BSC Reflectance Features from In-Situ Spectral Measurements
The reflectance curves revealed the spectral features of BSCs, bare sand, and plants ( Figure 5). The reflectance of plants showed distinctive features with a maximum value in the green band, an absorption maximum in the red band, and a notable soar from 700 to 800 nm (the vegetation red edge band). All plant species generally had the lowest values of reflectance in the blue band and an absorption of approximately 500 nm. Spectral reflectance values of physical crusts closely resembled those of bare sand, with an intersect occurring around the visible red band. The spectral curves of varying BSC coverage showed intermediate reflectance values compared with other ground objects throughout the visible spectrum. BSCs also showed absorption in both the blue and red bands, but this was much weaker than plants.

BSC Reflectance Features from In-Situ Spectral Measurements
The reflectance curves revealed the spectral features of BSCs, bare sand, and plants ( Figure 5). The reflectance of plants showed distinctive features with a maximum value in the green band, an absorption maximum in the red band, and a notable soar from 700 to 800 nm (the vegetation red edge band). All plant species generally had the lowest values of reflectance in the blue band and an absorption of approximately 500 nm. Spectral reflectance values of physical crusts closely resembled those of bare sand, with an intersect occurring around the visible red band. The spectral curves of varying BSC coverage showed intermediate reflectance values compared with other ground objects throughout the visible spectrum. BSCs also showed absorption in both the blue and red bands, but this was much weaker than plants.

Implementation of RF Model with the Simulated Multispectral Dataset
The band combinations were ranked according to their importance in estimating BSC coverage as determined by the RF algorithm (Figure 6e-h). The simulated Landsat dataset, which used combinations of bands based on CI, had the lowest error rate when the top five important band combinations were employed (Figure 6a). The green-red bands were the most important band combination (Figure 6e). The red-NIR combination of bands, which represent the band combination of classical vegetation index (the Normalized Difference Vegetation Index (NDVI)), was ranked in second place. The RF model, using the Sentinel dataset with band combinations based on BSCI, achieved the best result among the top 18 important band combinations (Figure 6d), of which the blue-green-red bands, blue-red-red edge bands, and blue-green-red edge bands were the top three important performing combinations (Figure 6h). The best results for other models were obtained when all combinations of bands were selected (Figure 6b,c). The sensitive bands of the top three BSCIbased band combinations, which used the simulated Landsat data, were bands 1 to 4, without the NIR band (Figure 6f). Similarly, the NIR band did not rank among the top three most important band combinations when using both the CI-and BSCI-based formula with the simulated Sentinel models (Figure 6g,h). In contrast, the blue, green, red, and vegetation red edge bands were found to be important for predicting BSC coverage using the simulated Sentinel-2 dataset.

Implementation of RF Model with the Simulated Multispectral Dataset
The band combinations were ranked according to their importance in estimating BSC coverage as determined by the RF algorithm (Figure 6e-h). The simulated Landsat dataset, which used combinations of bands based on CI, had the lowest error rate when the top five important band combinations were employed (Figure 6a). The green-red bands were the most important band combination (Figure 6e). The red-NIR combination of bands, which represent the band combination of classical vegetation index (the Normalized Difference Vegetation Index (NDVI)), was ranked in second place. The RF model, using the Sentinel dataset with band combinations based on BSCI, achieved the best result among the top 18 important band combinations (Figure 6d), of which the blue-green-red bands, blue-red-red edge bands, and blue-green-red edge bands were the top three important performing combinations (Figure 6h). The best results for other models were obtained when all combinations of bands were selected (Figure 6b,c). The sensitive bands of the top three BSCI-based band combinations, which used the simulated Landsat data, were bands 1 to 4, without the NIR band (Figure 6f). Similarly, the NIR band did not rank among the top three most important band combinations when using both the CIand BSCI-based formula with the simulated Sentinel models (Figure 6g,h). In contrast, the blue, green, red, and vegetation red edge bands were found to be important for predicting BSC coverage using the simulated Sentinel-2 dataset. Figure 7 shows the corresponding 10-fold cross-validation results of estimated coverage versus measured coverage of BSCs on a hoop scale. The models using simulated multispectral data for predicting BSC coverage exhibited high performance of R 2 > 0.950 and MSE < 0.010. There were no significant differences between different datasets with different band combinations with respect to their ability to estimate BSC coverage.   Table 1). Figure 7 shows the corresponding 10-fold cross-validation results of estimated coverage versus measured coverage of BSCs on a hoop scale. The models using simulated multispectral data for predicting BSC coverage exhibited high performance of R 2 > 0.950 and MSE < 0.010. There were no significant differences between different datasets with different band combinations with respect to their ability to estimate BSC coverage.

Quantification of BSC Surface Cover in Mu Us Sandy Land
Compared to the simulated multispectral data, there was a decrease in the model performance with satellite images (Figure 7 vs. Figure 8), and the performance of 10-fold cross-validation (shown as red points in Figure 8) dropped on average, but did not significantly decrease (R 2 = 0.974 vs. For ground validation (shown as gray triangles in Figure 8), BSC coverage was predicted with R 2 equal to 0.557 (CI-based band combination) and 0.588 (BSCI-based band combination) using Landsat-8 images. These performances were significantly inferior to those of models using the simulated Landsat dataset (Figure 7a vs. Figure 8a, Figure 7b vs. Figure 8b). The BSC coverage was predicted with R 2 = 0.906 (CI-based band combination) and R 2 = 0.899 (BSCI-based band combination) using Sentinel-2 scenes (Figure 8c,d). The ground validation of models showed high performance for pixels with low BSCs coverage, whereas pixels with high BSC coverage (observed BSCs coverage > 0.125) were underestimated by models.
Remote Sens. 2019, 11, x FOR PEER REVIEW 12 of 19 Compared to the simulated multispectral data, there was a decrease in the model performance with satellite images (Figure 7 vs. Figure 8), and the performance of 10-fold cross-validation (shown as red points in Figure 8) dropped on average, but did not significantly decrease (R 2 = 0.974 vs. R 2 = 0.944). For ground validation (shown as gray triangles in Figure 8), BSC coverage was predicted with R 2 equal to 0.557 (CI-based band combination) and 0.588 (BSCI-based band combination) using Landsat-8 images. These performances were significantly inferior to those of models using the simulated Landsat dataset (Figure 7a vs. Figure 8a, Figure 7b vs. Figure 8b). The BSC coverage was predicted with R 2 = 0.906 (CI-based band combination) and R 2 = 0.899 (BSCI-based band combination) using Sentinel-2 scenes (Figure 8c,d). The ground validation of models showed high performance for The general BSC distribution was roughly similar for the four models ( Figure 9). All models showed BSC widely distributed in the northeastern and southeastern corner of Mu Us Sandy land and sparse distribution on sand dunes in the in the southwest of Otog Front Banner (administrative regions marked in Figure 1) (Figure 9a-d). In Ejin Horo Banner, Jingbian County, Dingbian County, and Yanchi County, the models using Landsat data with band combinations based on CI showed the highest distribution of BSC coverage (Figure 9a). There was a distinct gradient distribution of BSCs from south to north in Uxin Banner from the Sentinel data models using CI-based band combinations (Figure 9c). and sparse distribution on sand dunes in the in the southwest of Otog Front Banner (administrative regions marked in Figure 1) (Figure 9a-d). In Ejin Horo Banner, Jingbian County, Dingbian County, and Yanchi County, the models using Landsat data with band combinations based on CI showed the highest distribution of BSC coverage (Figure 9a). There was a distinct gradient distribution of BSCs from south to north in Uxin Banner from the Sentinel data models using CI-based band combinations (Figure 9c).

Reflectance Features of BSCs
Two primary absorption characteristics of BSCs of approximately 520 and 680 nm, which have been described in previous studies, are believed to be related to the existence of carotenoids and chlorophyll a, respectively [8,9,36]. Our study also detected these absorptions, which were relatively weak compared to those of plants ( Figure 5). The increase in BSC reflectance within the green band can distinguish plants from BSCs. Moreover, bare sand showed no absorption around 680 nm,

Reflectance Features of BSCs
Two primary absorption characteristics of BSCs of approximately 520 and 680 nm, which have been described in previous studies, are believed to be related to the existence of carotenoids and chlorophyll a, respectively [8,9,36]. Our study also detected these absorptions, which were relatively weak compared to those of plants ( Figure 5). The increase in BSC reflectance within the green band can distinguish plants from BSCs. Moreover, bare sand showed no absorption around 680 nm, whereas BSC showed weak absorption in the red band. This can be used to differentiate between the BSCs and the bare sand. However, when the pixel filled only with plants and soil, without BSCs, the model might confuse the existence of BSCs and recognize plants as BSCs.
Nevertheless, indices calculated using single band combination such as CI [10], may not enable the precise detection of BSCs. Linear regressions between the band combination based on CI and the coverage of BSCs showed that the most sensitive band combination was band 4 (red) with band 5 (NIR) for both the simulated Landsat-8 dataset (Figure 10a) and actual Landsat-8 images (Figure 10c). This red-NIR combination is also recognized as the band combination of the NDVI. For Sentinel-2 channel, Band 7 (vegetation red edge) with band 8A (NIR) and band 6 (vegetation red edge) with band 8A (NIR) were the most sensitive band combination in the simulated dataset ( Figure 10b) and real images (Figure 10d), respectively. It appears that the vegetation red edge with the NIR band is sensitive to BSCs in Sentinel-2 channels. However, all these combinations are also sensitive to vegetation. Our research found that a single pair of bands is unsuitable for spectrally discriminating BSCs, due to the strong spectral characteristics of vegetation, and thus we investigated the deeper detection of sensitive spectral information using the machine learning method.
BSCs and the bare sand. However, when the pixel filled only with plants and soil, without BSCs, the model might confuse the existence of BSCs and recognize plants as BSCs.
Nevertheless, indices calculated using single band combination such as CI [10], may not enable the precise detection of BSCs. Linear regressions between the band combination based on CI and the coverage of BSCs showed that the most sensitive band combination was band 4 (red) with band 5 (NIR) for both the simulated Landsat-8 dataset (Figure 10a) and actual Landsat-8 images (Figure 10c). This red-NIR combination is also recognized as the band combination of the NDVI. For Sentinel-2 channel, Band 7 (vegetation red edge) with band 8A (NIR) and band 6 (vegetation red edge) with band 8A (NIR) were the most sensitive band combination in the simulated dataset ( Figure 10b) and real images (Figure 10d), respectively. It appears that the vegetation red edge with the NIR band is sensitive to BSCs in Sentinel-2 channels. However, all these combinations are also sensitive to vegetation. Our research found that a single pair of bands is unsuitable for spectrally discriminating BSCs, due to the strong spectral characteristics of vegetation, and thus we investigated the deeper detection of sensitive spectral information using the machine learning method.

Implementation of RF Model with the Simulated Multispectral Dataset
Karnieli [10] found that BSCs on a bare substrate have higher values of reflectance in the blue band than bare soil. However, this spectral feature was not determined in our study ( Figure 5), nor in the results of Chen et al. [11] and Weber et al. [8]. Weber et al. [8] believed that spectral information of minerals with strong absorption features in the blue band, may interfere with spectral measurements in the field. The training plots investigated in the field were mixed by BSCs, plants and soil, because this is the actual landscape. However, as with the combination of red and blue bands employed by CI [10], our models also found that the combination of red and blue bands is important in CI-based RF models (Figure 6a,c). Another two important CI-based band combinations in our models were the green-red bands and the red-NIR bands (Figure 6a,c) where the combination of green-red bands separated BSCs from plants, and the combination of red-NIR bands (which is the band combination of the NDVI) separated BSCs from bare sand. Although the BSCI proposed by Chen et al. [11] included the NIR band, the NIR band was not included in the top four most sensitive band combinations in our models using the BSCI-based formula, whereas the frequency of the NIR band that appeared in our models was the highest (Figure 6b,d). These results indicate that important band combinations used in our models are able to detect BSCs information and are superior to the multispectral BSC indices that combine only two or three bands.
The high capability of the RF algorithm in detecting BSC information from multispectral channels is seen in Figure 7. Since Pirotti et al. [37] have proved the best performance of random forest for classification, the standard deviation (SD) of Mean Square Errors (MSEs) was calculated in each fold validation of 10-fold validation ( Table 3). The SDs are all < 0.01. This further proved the powerful ability of the RF algorithm for regression. The hyperspectral dataset requires high computational efforts and has difficulties in large-scale data acquisition. Therefore, use of the RF model with multispectral remote sensing data to detect BSCs is more convenient and efficient than using hyperspectral datasets such as studies of Weber et al. [8], Chamizo et al. [9], and Rodríguez-Caballero et al. [5]. This has already been proven while estimating BSCs using the CI [10] or BSCI [11]. In addition, the use of regression models in our study enabled extraction of BSCs and the estimation of quantitative BSC coverage.

Quantification of BSC Surface Cover in Mu Us Sandy Land
The models trained by the simulated multispectral dataset on a hoop scale for both Landsat and Sentinel channels showed high performance ( Figure 7). However, the performance reduced when the models were applied to remote sensing data (Figure 8), which could be attributed to the time gaps between the ground survey and acquisition of satellite images [17]. The model on a hoop scale provided high performance as no time gaps exists, because it was built using the data of the simulated multispectral data (resampled by in-situ hyperspectral data) for the precise plot where BSC coverage had been determined (by analyzing the instantaneous digital photos). The issues with time gap problems for models on a pixel scale posed no problem with our research, but they pose a difficulty for all research applying ground surveys to remote sensing dataset.
Our results emphasize that the RF algorithm provides the highest estimation of BSC coverage when using Sentinel-2 satellite sensors. Models using the direct Sentinel-2 dataset performed almost as well as the simulated Sentinel-2 dataset. It seems the multispectral Sentinel-2 sensors provide a better spatial resolution than Landsat-8 sensors. Landsat-8 had a lower spatial resolution and its performance in ground validation was inferior (Figure 8a,b). In addition, higher spectral resolution and band setting of Sentinel-2 (three vegetation red edge bands) might be one of the reasons for the superior performance to estimate BSC coverage. Future satellite missions may offer better-suited data sources to enable BSC mapping.
Furthermore, our best result of BSC distribution (Figure 9c) was generally matched with the results of aboveground biomass (AGB) distribution [38], vegetation coverage distribution [39], and sand dune distribution [40] in Mu Us Sandy Land. Moreover, some researchers believe that moss-dominated BSCs have a positive correlation with perennial plant coverage and soil organic matter [41,42]. Our method overestimated BSCs under the coverage of 30% (Figures 7 and 8). The main reasons might be the influence of vegetation and unbalanced training datasets that were used. As we discussed in Section 4.1, when the pixel mixed only by plants and soil, without BSCs, the model might recognize plants as BSCs. One of the most common landscapes in Mu Us Sandy Land, however, is sparsely distributed with some vascular plants without BSCs. Therefore, the training datasets including plants without BSCs may lead to overestimating results of BSCs. Our future work could focus on the spatial analysis of the relationships between BSC, vegetation, and bare sand to quantify the impact of plants and soil on the determination of BSCs. The seasonality and different dry-wet conditions of BSCs have been studied earlier [43,44], which highlight the seasonal changes of BSCs in arid and semi-arid land. The BSC coverage predicted by our research provides a snapshot of Mu Us Sandy Land at the end of the growing season in 2018. However, our study did not make an assessment of trends and phenological changes in the region. BSCs are poikilohydric plants that lie dormant when dry [44]. Soil moisture and precipitation can increase the CI, BSCI, and NDVI of moss-dominated BSC [43]. The training data on a pixel scale, which was investigated on the ground, sometimes were collected when it was raining. However, it is difficult to obtain satellite images with good quality during the rainy seasons. These artificial errors might be one of the reasons for underestimations of high BSC coverage in our models. Further research, consequently, needs to focus on improving the BSC monitoring by considering the season, weather, and soil moisture content. Finally, it is recommended to unify season and weather conditions while collecting spectral information or BSC coverage to build BSC RF models.

Conclusions
In this study, a new application of RF was proposed to quantitatively detect moss-dominated BSCs. This application not only can attain more accurate results than multispectral indices, but also is more efficient than hyperspectral methods. A spectral analysis of the main ground objects in Mu Us Sandy Land was initially conducted, which provided sufficient information to distinguish moss-dominated BSCs. However, using a simple band combination proved difficult in discriminating between plants and BSCs. Thus, we implemented the RF algorithm to analyze the simulated multispectral dataset, which provided promising results. The Sentinel-2 dataset was shown to be suitable for use in training reliable RF models that can predict BSC coverage using band combinations based on the CI. The ultimate aim of this study was to derive regional scale maps of BSC in Mu Us Sandy Land, which are urgently required to obtain accurate spatial information relating to desertification. Such applications are essential for local people and politicians in maintaining ecosystem services, and the methods used in this study can help map BSC coverage in other arid and semi-arid areas.