Extracting Information on Rocky Desertification from Satellite Images: A Comparative Study

Abstract: Rocky desertification occurs in many karst terrains of the world and poses major challenges for regional sustainable development. Remotely sensed data can provide important information on rocky desertification. In this study, three common open-access satellite image datasets (Sentinel-2B, Landsat-8, and Gaofen-6) were used for extracting information on rocky desertification in a typical karst region (Guangnan County, Yunnan) of southwest China, using three machine-learning algorithms implemented in the Python programming language: random forest (RF), bagged decision tree (BDT), and extremely randomized trees (ERT). Comparative analyses of the three data sources and three algorithms show that: (1) The Sentinel-2B image has the best capability for extracting rocky desertification information, with an overall accuracy (OA) of 85.21% using the ERT method. This can be attributed to the higher spatial resolution of the Sentinel-2B image compared with the Landsat-8 and Gaofen-6 images, and to Gaofen-6's lack of the shortwave infrared (SWIR) bands suitable for mapping carbonate rocks. (2) The ERT method produces the best classification results for rocky desertification. Compared with the RF and BDT methods, the ERT method has stronger randomness in modeling and can effectively identify important feature factors for extracting information on rocky desertification. (3) The combination of the Sentinel-2B images and the ERT method provides an effective, efficient, and free approach to information extraction for mapping rocky desertification. The study can provide a useful reference for effective mapping of rocky desertification in similar karst environments of the world, in terms of both satellite image sources and classification algorithms. It also provides important information on the total area and spatial distribution of different levels of rocky desertification in the study area to support decision making by local governments for sustainable development.


Introduction
Karst areas cover approximately 22 million km² and are an important part of Earth's surface system, accounting for 15% of the Earth's land area [1,2]. Karst areas are ecologically fragile and susceptible to land degradation [3]. Rocky desertification is an extreme manifestation of land degradation in karst areas that results in vegetation deterioration, soil erosion, and bedrock exposure, under the dual action of natural processes and human interference [4,5]. Accurate mapping of rocky desertification is essential for controlling land degradation in karst areas.
While the geomorphological/hydrological characteristics, geological causes, and development processes of karst areas have been well investigated, studies on rocky desertification are relatively new [4,6–10]. Information on the degree and spatial distribution of rocky desertification is essential for creating a baseline for sustainable land management in karst areas. Although traditional ground surveying and mapping can accurately ...
In addition to comparative analyses of open-access image data for mapping rocky desertification, it is worthwhile to compare the performance of different information extraction methods applied to the image data. As in other applications of remote sensing, traditional manual image interpretation has been used for mapping rocky desertification [6,29]. However, visual interpretation is a time-consuming process that requires a great deal of expert knowledge [30]. Some quantitative and efficient methods have been employed, such as spectral mixture analysis (SMA) [21], the dimidiate pixel model (DPM) [20], maximum likelihood classification (MLC) [17], and decision tree (DT) [31]. The SMA and DPM methods can effectively reduce the influence of terrain on satellite images and improve the accuracy of information extraction in uneven and rugged terrains. Karst areas are characterized by high levels of landscape heterogeneity, and one pixel usually contains mixed spectral information of vegetation, bedrock, and bare soil [20]. However, maps of rocky desertification produced with the SMA and DPM alone are usually of low accuracy in karst areas, because the surface color of rocks may vary depending on the intensity of human disturbance and the degree of weathering and erosion. The MLC and DT methods are parametric and non-parametric, respectively, and have good performance and widespread application [17,32].
However, the MLC method needs a lot of time in the process of supervised classification, while the DT method does not take into account the relationships between attribute data [33]. Random forest (RF) and bagged decision tree (BDT) were proposed to strengthen the function of DT and achieve ensemble learning [34,35]. However, both the RF and BDT algorithms use the bagging algorithm; therefore, their threshold values in classification are not obtained completely randomly, resulting in a large variance [36].
Remote Sens. 2021, 13, 2497
The extremely randomized trees (ERT) algorithm has been increasingly applied to solving classification and regression problems [37]. It is a new tree-based classification and regression method. Based on many DT models, ERT conducts classification in a holistic and random way, with smaller variance and more stable prediction results [38]. The ERT algorithm can obtain and integrate variable knowledge in many aspects and enhance the randomness of the classification operation. It has the advantages of RF and BDT and seems to be suitable for classification of objects with complex heterogeneity [39]. As a typical classification and regression method, the ERT model is mainly applied in medical science, biology, and atmospheric science [40–42]. At present, the model has not been used for the classification of Earth surface features. From the perspective of remote sensing, it is only used in the study of air pollution identification and prediction [43]. The ERT model is worthy of further exploration and discussion for ground object identification. Therefore, a comparative analysis of the performance of RF, BDT, and ERT for extracting rocky desertification would be desirable.
Located in the center of East Asia, the karst area in southwest China is one of the three largest karst regions in the world, with a total area of 540,000 km² and a population of over 220 million [44,45]. Its rocky desertification is extremely serious, becoming the third biggest ecological problem in China after the desertification in northwest China and soil erosion in the Loess Plateau [46]. This paper presents a comparative analysis of open-access satellite images for extracting information on rocky desertification, in terms of both data and methods, in a typical karst area in southwest China. The objectives are: (1) evaluating the capabilities of three open-access satellite image datasets (Sentinel-2, Landsat-8, and Gaofen-6) for mapping karst rocky desertification; (2) comparing the performance of three machine-learning algorithms (RF, BDT, and ERT) for extracting information on rocky desertification; (3) discussing the advantages and limitations of the three open-access image datasets and the machine-learning algorithms for extracting information on rocky desertification in karst areas.

Study Area
The study area, Guangnan County (area: 7730 km²) in Yunnan Province, southwest China (104°31′–105°39′ E, 23°29′–24°28′ N), is a typical region with karst rocky desertification (Figure 1). Carbonate rocks (karst) are distributed in most regions of the county, with shale and clastic rocks in the north and mid-east regions. The mountainous region accounts for 94.7% of the total area of the county, and the flat region accounts for 5.3%, with an average altitude of 1280 m. The county is characterized by a subtropical plateau monsoon climate, with an annual average sunshine of 1857 h, an annual average temperature of 16 °C, and an annual average rainfall of 1057 mm. There are significant differences between dry and rainy seasons.
The county also has a prominent problem of impoverishment. For a long time, it has been in a vicious cycle of "fragile ecology → extensive development → rocky desertification → enhanced poverty" [47]. Due to the wide distribution of rocky desertification, Guangnan County was listed as the pilot county of China's comprehensive control of rocky desertification in 2008. Meanwhile, the per capita net income of the county was about RMB 8700 in 2018, only 30 percent of the national average of the same year, making it one of the most important counties for poverty alleviation in China. Through the implementation of various projects on ecological control and targeted poverty alleviation, rocky desertification in isolated parts of the study area has been alleviated in recent years. Nonetheless, deforestation, overgrazing, and other ecologically destructive behaviors remain a challenge. Thus, the overall situation of rocky desertification in the study area has not been fundamentally reversed.

Figure 1. Location of study area and sample pictures: most of the exposed rocks in the karst region show characteristics of small area, dotted shapes on image, and dense distribution.

Satellite Image Data and Other Data
Three open-access satellite image datasets (Sentinel-2B, Landsat-8, and Gaofen-6) are used in this study, representing products from three remote sensing satellite series. Six Sentinel-2B L2A images acquired in March 2019 were downloaded from the data sharing site (Copernicus Open Access Hub) of the European Space Agency (ESA) (Table 1). The L2A products of Sentinel-2B have undergone radiometric calibration, atmospheric correction, geometric correction, and orthorectification. Therefore, the L2A products of Sentinel-2B are bottom-of-atmosphere corrected reflectance data. A mosaic of the six images was then created using the SNAP software and clipped by the vector boundary of Guangnan County.
Four Landsat-8 L1T images acquired in March 2019 were downloaded from the USGS data sharing site (Earth Explorer) (Table 1). The L1T products of Landsat-8 were created after radiometric calibration and geometric precision correction. The atmospheric correction and orthorectification were carried out using the ENVI 5.3 software, to obtain the bottom-of-atmosphere corrected reflectance of Landsat-8. Then, a mosaic of the four images was produced using the ENVI 5.3 software, and the image was clipped by the vector boundary of Guangnan County.
Table 1 (partial). ... Land use data in the study area and the administrative boundary. No. 7: Soil data of Guangnan County; source: Bureau of Agricultural Technology of Guangnan County; use: the data were used to extract types and thickness of soil as factors for assessing the status of rocky desertification.
One Gaofen-6 L1T image acquired in February 2019 was downloaded from the CNSA data sharing site (CNSA-GEO) (Table 1), as the wide-format camera (800 km) carried by Gaofen-6 can cover the study area with a single image. The Gaofen-6 image is an uncorrected basic satellite imagery product. We used the "Domestic Satellite Support Tool of China" plug-in of the ENVI 5.3 software to load the image and completed the radiometric calibration, atmospheric correction, geometric precision correction, and orthorectification, to obtain the bottom-of-atmosphere corrected reflectance. Then, the Gaofen-6 image was clipped by the vector boundary of Guangnan County using the ENVI 5.3 software.
The multispectral image data of the study area during the dry season in the first half of 2019 were obtained by downloading and preprocessing the images of the open-access Sentinel-2B, Landsat-8, and Gaofen-6 satellites. The open-access image data and other datasets are listed in Table 1. The relevant parameters of the image data are shown in Table 2.
The main factors affecting rocky desertification include vegetation coverage, exposed rocks, topography, and lithology [18,31]. Additional factors may include land use, soil type, soil thickness, and human activities [48,49]. These factors were mainly obtained through processing the other auxiliary data, including NASA DEM, Rocky Types, Land Use Database, and Soil Data (Table 1).

Methods
First, after downloading and pre-processing remotely sensed data and ancillary data, we calculated the factors required for information extraction of rocky desertification. Then, we compared the separability of the fractional vegetation cover (FVC) and exposed bedrock fraction (EBF) obtained by three types of images (Sentinel-2B, Landsat-8, and Gaofen-6) on distinguishing rocky desertification by Euclidean distance. Finally, we built ERT, RF, and BDT models through the Python programming language and selected the best model based on their performance in extracting rocky desertification from the three types of images. The methodology flowchart is shown in Figure 2, and some major components of the flowchart are described in the following sections.

Calculation of Remotely Sensed Factors
After image preprocessing, the Normalized Difference Vegetation Index (NDVI) of the three types of image data was obtained through the "band math" tool of the ENVI 5.3 software.
where NDVI_g and NDVI_0 denote the maximum NDVI and minimum NDVI, respectively; KBRI_r and KBRI_0 denote the maximum KBRI and minimum KBRI, respectively; CRI_r and CRI_0 denote the maximum CRI and minimum CRI, respectively; NIR, Red, Blue, and SWIR1 refer to the near-infrared band, red band, blue band, and the 1st shortwave infrared band of the multispectral image data. The FVC in the study area was obtained by using NDVI (Equations (1) and (2)). Pei et al. proposed a Karst Bare-Rock Index (KBRI) (Equation (3)), which showed better extraction capability of exposed rocks compared with the previous Normalized Difference Rock Index (NDRI) [4]. For Sentinel-2B and Landsat-8 satellite images, we calculated KBRI to represent surface rock features. However, as the Gaofen-6 image does not contain shortwave infrared (SWIR) bands, the KBRI cannot be calculated for the Gaofen-6 image. The Carbonate Rock Index (CRI) proposed by Xie et al. [50] (Equation (4)) was obtained from the Gaofen-6 image instead. Then, the KBRI or CRI was used to obtain the EBF of the study area (Equation (5)). The Sentinel-2B images were processed using the SNAP software, and the Landsat-8 and Gaofen-6 images were processed using the ENVI 5.3 software (Exelis Visual Information Solutions, USA).
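The NDVI and scaling steps described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: NDVI uses the standard (NIR - Red) / (NIR + Red) form, and FVC is assumed to follow the usual dimidiate pixel form, (NDVI - NDVI_0) / (NDVI_g - NDVI_0), with the same min-max scaling assumed for EBF from a rock index. The function names and reflectance values are hypothetical, and the KBRI and CRI formulas themselves are not reproduced here.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    # Pixels with zero total reflectance get NDVI = 0 instead of a division error.
    return np.divide(nir - red, denom, out=np.zeros_like(denom), where=denom != 0)

def minmax_fraction(index, low, high):
    """Scale an index to [0, 1] between its minimum (low) and maximum (high).

    Assumed form for both FVC (from NDVI) and EBF (from KBRI or CRI).
    """
    frac = (np.asarray(index, dtype=float) - low) / (high - low)
    return np.clip(frac, 0.0, 1.0)

# Made-up surface reflectance values for three pixels.
nir = np.array([0.40, 0.30, 0.25])
red = np.array([0.10, 0.20, 0.25])

v = ndvi(nir, red)
fvc = minmax_fraction(v, low=v.min(), high=v.max())
print(v, fvc)
```

In practice the minimum and maximum would be taken over the whole image (or from chosen percentiles) rather than from three pixels.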


Calculation of Other Factors
Some other factors that may be needed for extracting rocky desertification information have been selected based on field investigation and existing studies [18,31,48,49]. They include the elevation (EL), slope (SL), slope aspect (SP), lithology (RP), land use (LU), soil type (ST), and soil thickness (STK) in the study area. We downloaded the latest NASADEM data (spatial resolution: 12.5 m) to extract the EL, SL, and SP of the study area. The data of RP, LU, ST, and STK were obtained from the Bureau of Natural Resources and Bureau of Agricultural Technology of Guangnan County (Table 1). They were also used for information extraction of rocky desertification after rasterization.

Method for Comparing Separability
Euclidean distance is a parameter based on the sample similarity to express the aggregation extent of similar samples and the separability of different samples.
where d_{i,j} is the Euclidean distance between the i-th and j-th level of rocky desertification; x_i and x_j are the average values of the i-th and j-th level; σ² is the variance of the samples. The greater the Euclidean distance between the two levels, the stronger the separability between the levels; conversely, the smaller the Euclidean distance between the two levels, the weaker the separability between the levels (Equation (6)). In this study, the Euclidean distance between different levels of rocky desertification is used to characterize the separability of the EBF and FVC indexes obtained from the three open-access satellite images. The capability of information extraction is obtained by comparing the available remotely sensed indexes.
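As a rough illustration of the separability comparison, the sketch below computes a variance-normalized distance between the mean index values of two levels. The exact form of Equation (6) is not shown here, so the formula used (absolute difference of the level means divided by the standard deviation of the pooled samples) is an assumption, as are the function name and the EBF sample values.

```python
from itertools import combinations
from statistics import fmean, pstdev

def level_distance(samples_i, samples_j):
    """Assumed separability measure: |mean_i - mean_j| / sigma of the pooled samples."""
    sigma = pstdev(list(samples_i) + list(samples_j))
    if sigma == 0:
        # All samples are identical, so the two levels cannot be separated.
        return 0.0
    return abs(fmean(samples_i) - fmean(samples_j)) / sigma

# Hypothetical EBF samples for levels 2 (mild), 3 (moderate), and 4 (severe).
ebf_by_level = {
    2: [0.30, 0.35, 0.40],
    3: [0.55, 0.60, 0.65],
    4: [0.80, 0.85, 0.90],
}

distances = {
    f"{i}-{j}": level_distance(ebf_by_level[i], ebf_by_level[j])
    for i, j in combinations(sorted(ebf_by_level), 2)
}
for pair, d in distances.items():
    print(pair, round(d, 3))
```

A larger value for a pair such as "2-4" than for "3-4" would indicate that mild and severe levels are easier to separate with this index than moderate and severe levels.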

Model Construction for Information Extraction
Extremely randomized trees (ERT), an ensemble learning algorithm based on decision trees (DT), was proposed by Geurts in 2006 [38]. Based on the classic top-down approach, a set of "free-growing" DTs is constructed randomly. The characteristic of the ERT is that the split values are selected completely at random, and each tree uses all the training samples. Random forest (RF) and bagged decision tree (BDT) are classifiers that are trained and make predictions based on the bagging algorithm [34,35]. We implemented the ERT, RF, and BDT classifiers using the ExtraTreesClassifier, RandomForestClassifier, and BaggingClassifier classes from the sklearn.ensemble package of Python under the Anaconda environment. The following are the major steps for the ERT, RF, and BDT classification:
(1) Based on research literature on rocky desertification [20,31,51], the degree of rocky desertification in the study area was divided into five levels: no rocky desertification, potential rocky desertification, mild rocky desertification, moderate rocky desertification, and severe rocky desertification. Sample data were collected during field surveys (Table 3), and the ERT, RF, and BDT models were constructed using the Python programming language.
(2) All feature factors were extracted from remotely sensed data and ancillary data and saved as text or table files, which were fed into the ERT, RF, and BDT classification models.
(3) Iterative operations of sampling and tree building were carried out to identify the best-performing model. Our test results suggest that the classification accuracy is stable when the number of DTs is over 10,000. Thus, 10,000 was set as the number of DTs in the ERT, RF, and BDT models in this study.
(4) The prediction results of all the base classifiers are counted, and the final classification results are produced by majority voting. The generated ERT, RF, and BDT results are tested with the validation samples (Table 3).
(5) We selected the better model by comparing accuracy. Additionally, we selected the factors needed to extract the information of rocky desertification according to the importance of each factor during operation. We ran the algorithm 100 times and selected the run with the highest accuracy. The selected result is the classification result of rocky desertification.
(6) By importing the classification result into the study area, the final area and spatial distribution map of rocky desertification are obtained.
Table 3 (partial). Number of training samples, validation samples, and total samples for each level of rocky desertification, with the criterion for each level.
... The proportion of exposed rock is less than 30%.
Mild Level: 319, 136, 455. The proportion of exposed rock is between 30% and 50%.
Moderate Level: 377, 161, 538. The proportion of exposed rock is between 50% and 70%.
Severe Level: 395, 169, 564. The proportion of exposed rock is more than 70%.
Total: 1707, 729, 2436.
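The classification steps above can be sketched with scikit-learn as follows. This is a minimal illustration, not the study's actual script: the data are synthetic (nine features and five classes generated with make_classification), the number of trees is reduced to 100 to keep the run short (the study used 10,000), and all variable names are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    ExtraTreesClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the nine feature factors and five desertification levels.
X, y = make_classification(
    n_samples=600, n_features=9, n_informative=6, n_redundant=0,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# Roughly 7:3 split into training and validation samples.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    # By default, ExtraTreesClassifier trains each tree on all training samples.
    "ERT": ExtraTreesClassifier(n_estimators=100, random_state=0),
    # RF and BDT draw bootstrap samples for each tree by default.
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    # The default base estimator of BaggingClassifier is a decision tree.
    "BDT": BaggingClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_val, model.predict(X_val))

# Feature importance from the ERT model; the scores sum to 1.
importances = models["ERT"].feature_importances_
print(scores)
print(importances.round(3))
```

On synthetic data the ranking of the three models carries no meaning; the sketch only shows how the three classifiers can be trained and validated on the same split.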

Accuracy Assessment
Accuracy assessment is an important step after classification [40,52]. Through extensive field investigations and analyses of 2019 Google Earth high-resolution images, we collected 2436 samples that are evenly distributed in the study area (Figure 1), covering all levels of rocky desertification. The samples were randomly divided into training samples and validation samples at a ratio of approximately 7:3, which were used for information extraction and for accuracy assessment after classification, respectively (Table 3). The overall accuracy (OA), producer's accuracy (PA), and user's accuracy (UA) were used to measure the performance of the open-access satellite data and classification methods.
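The three accuracy measures can be computed from a confusion matrix, as in the sketch below. The definitions are the standard ones: OA is the share of correctly classified validation samples, PA is computed per reference class, and UA per predicted class. The function name and the three-class matrix are hypothetical and are not results of this study.

```python
def accuracy_metrics(cm):
    """Return (OA, PA, UA) for a square confusion matrix.

    cm[i][j] is the number of validation samples whose reference class is i
    and whose predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    diag = [cm[k][k] for k in range(n)]
    row_sums = [sum(cm[k]) for k in range(n)]                        # reference totals
    col_sums = [sum(cm[i][k] for i in range(n)) for k in range(n)]   # predicted totals
    oa = sum(diag) / total
    pa = [d / r if r else float("nan") for d, r in zip(diag, row_sums)]
    ua = [d / c if c else float("nan") for d, c in zip(diag, col_sums)]
    return oa, pa, ua

# Hypothetical confusion matrix for three classes.
cm = [
    [40, 5, 5],
    [4, 30, 6],
    [1, 5, 44],
]
oa, pa, ua = accuracy_metrics(cm)
print(round(oa, 4), [round(p, 4) for p in pa], [round(u, 4) for u in ua])
```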

Contrast of Spectral Reflection Information
According to the calculation equations of EBF and FVC, the bands that can be used to extract rocky desertification information in Sentinel-2B, Landsat-8, and Gaofen-6 images mainly include the Blue band, Red band, NIR band, and SWIR1 band. The spectral ranges of the three types of images are similar in the Blue band and Red band, but there are significant differences in the NIR band and SWIR1 band (Table 4). The NIR band of Landsat-8 avoids the water vapor absorption feature at 825 nm and is clearly narrower than that of Sentinel-2B and Gaofen-6, with fluctuation [23]. The NIR bands of Sentinel-2B and Gaofen-6 have a wide range, covering the "red-edge" band, which is the most important band for vegetation spectrum recognition and in which reflectance rises sharply with increasing wavelength between 760 and 790 nm. There is no significant difference between the SWIR1 bands of Sentinel-2B and Landsat-8, while Gaofen-6 does not have a SWIR band. The spectral reflectance of Sentinel-2B shows serrated fluctuation, while that of Landsat-8 increases gradually. These four bands are important for calculating EBF and FVC, and their differences also indicate that the three open-access images are worth comparing for information extraction of rocky desertification.

Separability of FVC and EBF for Rocky Desertification
Through the image processing methods described earlier, the spatial distribution maps of FVC and EBF from the three types of satellite images were obtained (Figure 3). For the FVC, there is not much difference in the results (FVC_S, FVC_L, and FVC_G), except for some differences in the northern parts of the study area. Since the FVC and EBF represent vegetation and rock (non-vegetation) coverage, respectively, they show opposite characteristics in the results of Sentinel-2B and Landsat-8. That is, the areas with high FVC have lower EBF, and vice versa. However, because it is calculated with a different index, the EBF of Gaofen-6 differs from that of the other two satellites, and most of the EBF raster cells from Gaofen-6 have low values, so it is not effective for extracting rocky desertification.

Table 4 (partial). Similarities and differences of the spectral bands of the three types of images.
Blue band: The band range is not very different.
Red band: The band range is not very different.
NIR band: The band range of Landsat-8 is relatively narrow, ignoring the "red-edge" band of 760–790 nm, which is an important range for vegetation detection. The other two satellites have "red-edge" bands, with similar wavelength ranges.
SWIR1 band: There is not much difference between Sentinel-2B and Landsat-8, while Gaofen-6 is missing this band.
Note: "√" means that the spectral band is carried on the satellite; "×" means that the band is not carried.

Figure 4. Euclidean distance accumulation histogram of five indices for classification of rocky desertification ("S", "L", and "G" represent the three types of images, Sentinel-2B, Landsat-8, and Gaofen-6; 0, 1, 2, 3, and 4 represent no rocky desertification, potential level, mild level, moderate level, and severe level, respectively. For example, "3-4" is the Euclidean distance between moderate and severe rocky desertification).

Comparison of the Classification Accuracy of Three Methods
We added the nine feature factors calculated earlier to the ERT, RF, and BDT algorithms one at a time (for ERT and RF, the order of addition is determined by the importance of each factor in the model; the BDT algorithm does not provide factor importance, so it uses the order of the ERT model). In terms of classification stability, the OA of the ERT model increases steadily as feature factors are added. Although the OA of the RF and BDT models also increases overall, it occasionally decreases slightly along the way (Figure 5). When the classification accuracy of the three models reaches its maximum, the OA of ERT is the highest: 85.21%, 74.15%, and 73.60% for the Sentinel-2B, Landsat-8, and Gaofen-6 images, respectively. The ERT model uses all training samples and provides better classification stability and accuracy. The OA of the other two models is lower. Since the RF and BDT models adopt a bagging algorithm for random sampling rather than using all the training samples, they have weaker classification stability and accuracy. The OA values of RF are 80.85%, 70.86%, and 72.64% for the three types of images, respectively, about 2.87 percentage points lower on average than those obtained by ERT. The OA values of BDT are 78.93%, 68.40%, and 70.31%, respectively, about 5.11 percentage points lower on average than those obtained by ERT.
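The average gaps quoted above follow directly from the reported OA values, as the short check below shows. The numbers are taken from the text; the variable names are only illustrative.

```python
# Reported OA (%) for the Sentinel-2B, Landsat-8, and Gaofen-6 images.
oa = {
    "ERT": [85.21, 74.15, 73.60],
    "RF": [80.85, 70.86, 72.64],
    "BDT": [78.93, 68.40, 70.31],
}

mean_oa = {name: sum(values) / len(values) for name, values in oa.items()}

# Average gap to ERT, in percentage points.
gap = {name: mean_oa["ERT"] - mean_oa[name] for name in ("RF", "BDT")}
print({name: round(value, 2) for name, value in gap.items()})
```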

By comparing the stability and accuracy of the three models, it can be seen that ERT has certain advantages for extracting information on rocky desertification in karst areas. This is consistent with our hypothesis. In karst areas with strong landscape heterogeneity, ERT is more robust and has a better classification capability for information extraction of rocky desertification. Thus, the extraction results of rocky desertification obtained by ERT were used for subsequent analysis.

Comparison of Classification Results Based on Factors Selection
The ERT algorithm is used to analyze and evaluate the importance of each feature factor (Figure 5), so as to select the feature factors that improve the accuracy of classification and to reduce data redundancy. The higher the importance score, the greater the contribution of the factor to the classification (the importance scores of all factors sum to 1).
For Sentinel-2B, the importance of EBF_S for information extraction of rocky desertification is the highest (importance = 0.28), so it plays the strongest role in classification. The OA reaches 45% when using the EBF_S factor alone. The importance of FVC_S is also high (importance = 0.18). These two factors play a strong role in rocky desertification classification: the OA reaches 62% when both are considered. The importance of the other feature factors is less than or equal to 0.10, with weaker effects on classification. In particular, SP, ST, and RP have no obvious effect on classification.
For Landsat-8, although EBF_L and FVC_L have the highest importance for rocky desertification classification (importance = 0.17 and 0.14), the importance values are lower than those derived from the Sentinel-2B images. When using only EBF_L and FVC_L, the OA of rocky desertification classification is only 38% (24 percentage points lower than for the Sentinel-2B image). Therefore, it is also important that other feature factors are taken into account in the ERT model of Landsat-8. The importance values of SL, STK, EL, and LU are all around 0.12, while the importance values of SP, ST, and RP are lower than 0.10. As the number of feature factors is gradually increased, the classification accuracy for the Landsat-8 images improves significantly, and more so than in the ERT model of the Sentinel-2B images.
For the Gaofen-6 image, the most important feature factor of rocky desertification classification is FVC_G (importance = 0.16). Even so, the classification accuracy is only 24% when using this factor alone. The three factors of SL, STK, and EL have similar importance (between 0.12 and 0.14). After employing these three factors, the OA is improved significantly. However, in the ERT model of the Gaofen-6 image, EBF_G has a lower importance (0.12) and a lower ranking, indicating that it plays a small role in the rocky desertification classification. The importance of other factors is lower than 0.14, and the ST and RP have the lowest importance.
Based on the analysis of factor importance and classification accuracy in the ERT models constructed from the three types of open-access satellite images, the feature factors with higher importance contribute more to classification. Therefore, factors can be grouped and screened by importance. The EBF and FVC indexes (except EBF_G) contribute most to extracting rocky desertification, followed by LU, SL, EL, and STK, while SP, ST, and RP contribute least. We therefore divided the feature factors into three groups on the basis of their importance values and built further ERT models for extracting rocky desertification from the images. The first group (ET1) includes only the EBF and FVC factors, to clarify how well rocky desertification information can be extracted when only remote sensing factors are considered. The second group (ET2) adds the LU, SL, EL, and STK factors to ET1, since the importance of these factors is at the intermediate level. The third group (ET3) adds the SP, ST, and RP factors, which have lower importance, to ET2.
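The grouping step can be illustrated with a minimal Python sketch. It is not the code used in this study: the function name nested_groups and the cut-offs high and low are hypothetical, and the grouping in the text was made by inspecting the importance values rather than by fixed thresholds. The importance values follow those reported above for Landsat-8, except that "around 0.12" is taken as exactly 0.12 and the three values reported only as "lower than 0.10" are placeholders set to 0.07.

```python
# Importance values for the Landsat-8 ERT model. EBF and FVC are as reported;
# the 0.12 entries are "around 0.12" in the text, and the three 0.07 entries
# are placeholders (the text only states "lower than 0.10").
importance = {
    "EBF": 0.17, "FVC": 0.14,
    "SL": 0.12, "STK": 0.12, "EL": 0.12, "LU": 0.12,
    "SP": 0.07, "ST": 0.07, "RP": 0.07,
}


def nested_groups(importance, high=0.14, low=0.10):
    """Return cumulative feature sets ET1, ET2, ET3 ordered by importance.

    The cut-offs `high` and `low` are illustrative assumptions.
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    et1 = [f for f in ranked if importance[f] >= high]
    et2 = et1 + [f for f in ranked if low <= importance[f] < high]
    et3 = et2 + [f for f in ranked if importance[f] < low]
    return {"ET1": et1, "ET2": et2, "ET3": et3}


groups = nested_groups(importance)
for name, factors in groups.items():
    share = sum(importance[f] for f in factors)
    print(f"{name}: {len(factors)} factors, cumulative importance {share:.2f}")
```

With these inputs, ET1 contains only EBF and FVC, ET2 adds the four intermediate factors, and ET3 contains all nine, mirroring the nesting described above.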

Accuracy Evaluation of Extraction Results
We mainly analyzed the accuracy of the extraction results in ET1, ET2, and ET3, since there was no significant difference in OA between different levels of rocky desertification (Figure 6). In ET1, because only the two factors EBF and FVC are used, the producer's accuracy and user's accuracy are not very high: they reach about 60% for Sentinel-2B, while those for the Landsat-8 and Gaofen-6 images are slightly lower. In ET2 and ET3, the producer's accuracy and user's accuracy of all three models improve, indicating that rocky desertification information can be extracted more accurately after incorporating feature factors related to the occurrence of exposed rock, such as topography, soil, and lithology. Therefore, considering the changes in OA, producer's accuracy, and user's accuracy together, we selected all nine feature factors to extract the information on rocky desertification. The results obtained with the nine feature factors are consistent with the previous results of the Euclidean distance analysis.
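For readers less familiar with the three accuracy measures, the sketch below shows how OA, producer's accuracy, and user's accuracy are commonly derived from a confusion matrix. The function name accuracy_metrics and the three-class matrix are hypothetical and are not data from this study. The sketch assumes that rows hold the reference (field) classes and columns hold the classes assigned by the model.

```python
def accuracy_metrics(matrix):
    """Compute OA, producer's accuracy (PA), and user's accuracy (UA).

    matrix[i][j] is the number of samples of reference class i
    that the classifier assigned to class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    row_totals = [sum(matrix[i]) for i in range(n)]  # reference totals
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # predicted totals
    return {
        "OA": correct / total,
        # PA: share of each reference class that was correctly mapped
        "PA": [matrix[i][i] / row_totals[i] for i in range(n)],
        # UA: share of each mapped class that is correct on the ground
        "UA": [matrix[j][j] / col_totals[j] for j in range(n)],
    }


# Hypothetical confusion matrix for three classes (illustration only).
confusion = [
    [45, 4, 1],
    [6, 30, 4],
    [3, 2, 25],
]
metrics = accuracy_metrics(confusion)
print(f"OA = {metrics['OA']:.3f}")
print("PA =", [round(v, 3) for v in metrics["PA"]])
print("UA =", [round(v, 3) for v in metrics["UA"]])
```

The sketch does not guard against a class with no reference or predicted samples, which would cause a division by zero.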

Comparison of Extraction Results of Rocky Desertification
Using all nine feature factors, the spatial distributions of each level of rocky desertification are finally classified (Figure 7). The Sentinel-2B image revealed the largest area (3469 km²) of rocky desertification, followed by the Landsat-8 image (3137 km²) and the Gaofen-6 image (2790 km²). Among the rocky desertification levels extracted from the three images, the areas of the moderate rocky desertification are not much different, while the areas of the other levels are different. For no rocky desertification and severe rocky desertification, the order of the areas extracted from the three images is Sentinel-2B < Landsat-8 < Gaofen-6, while the order of potential rocky desertification and mild rocky desertification is opposite: Sentinel-2B > Landsat-8 > Gaofen-6 (Figure 7d).
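Areas such as those above are typically obtained by counting classified pixels and multiplying by the ground area of one pixel. The sketch below illustrates that arithmetic only. The text does not describe how the areas were computed, so the pixel-counting approach, the function name area_km2, and the pixel counts are assumptions.

```python
def area_km2(pixel_count, pixel_size_m):
    """Area in km² covered by `pixel_count` square pixels of side `pixel_size_m` metres."""
    return pixel_count * pixel_size_m ** 2 / 1_000_000


# The same number of classified pixels covers very different areas
# at 30 m (Landsat-8) and at an assumed 10 m pixel size.
coarse = area_km2(1_000_000, 30)
fine = area_km2(1_000_000, 10)
print(coarse, fine)
```

Because one 30 m pixel covers the same ground as nine 10 m pixels, a single misclassified coarse pixel shifts the area total far more than a single misclassified fine pixel.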

Using photos from typical field locations, we compared the different levels of rocky desertification extracted from the Sentinel-2B, Landsat-8, and Gaofen-6 images with the actual situation on the ground (Figure 8). There are some subtle differences between the potential rocky desertification and mild rocky desertification extracted from the three images. The classification of moderate and severe rocky desertification and the surrounding areas differs considerably among the images. The results extracted from the Sentinel-2B image are more detailed and more accurately reflect the actual range of moderate and severe rocky desertification. The classification results of the other two images are relatively less accurate.
In general, rocky desertification is mainly distributed in the karst areas of Guangnan County (Figures 1 and 8). The severe and moderate rocky desertification levels are mainly distributed in the southern, mid-west, and mid-east regions. The mid-east parts are prone to rocky desertification mainly because of natural conditions such as high mountains and steep slopes, while the southern and mid-west parts are prone to it because of long-term unreasonable human activities such as deforestation and overgrazing. On the whole, the distribution of rocky desertification extracted from the three images is consistent with the actual distribution observed in the field survey of the study area. In particular, the rocky desertification information extracted from the Sentinel-2B image is more reliable than that from the other two images.

Accuracy of Information Extraction from Three Types of Images
The rocky desertification information extracted from Landsat-8 and Gaofen-6 is of similar accuracy. Zhu et al. evaluated the capability of Gaofen-1 and Landsat-8 satellite images to identify rocky desertification and reported that the separability of the Landsat-8 image is slightly better than that of Gaofen-1 for rocky desertification classification [23]. As members of the same satellite series, Gaofen-1 and Gaofen-6 have similar effective spectral bands for extracting rocky desertification information. The results of Zhu et al. [23] are similar to ours, supporting the finding that Landsat-8 and Gaofen-6 images have similar separability of rocky desertification. The classification accuracy of rocky desertification is also consistent with the separability results. The Landsat-8 image is mainly limited by its 30 m spatial resolution, and the Gaofen-6 image cannot yield an ideal EBF because it lacks the SWIR1 band; both limitations affected the accuracy of rocky desertification extraction [53–55]. The Sentinel-2B image has significant advantages in both respects. On the one hand, spatial resolution is important in determining the accuracy of ground object identification [56]. In particular, during the field investigation we found that most of the exposed rocks in the karst region are small in area, appear as dots on the image, and are densely distributed (Figures 1 and 8). This situation has also been reported in a previous study [57]. Therefore, the high spatial resolution of satellite images may provide more accurate identification of the small areas of exposed rocks, as reflected in the classification capability of EBF (Figures 4 and 5). On the other hand, as shown in Table 4, the Sentinel-2B image data, with their "red-edge" bands, are spectrally more continuous than those of Landsat-8 [23], which is the basis for the accurate calculation of FVC [25] and indirectly leads to more accurate mapping of rocky desertification. Although the Gaofen-6 image also includes the "red-edge" band, it has lower spatial resolution than the Sentinel-2B image and lacks the SWIR1 band, in which carbonate rocks usually have higher reflectance than in the blue band [58].
In the ERT classification model, EBF and FVC extracted from Sentinel-2B and Landsat-8, as well as FVC extracted from Gaofen-6, have the highest importance. This shows that the surface information obtained from remotely sensed images plays the key role in the classification of rocky desertification [16,59]. Because the EBF and FVC of Sentinel-2B separate the rocky desertification levels well, the other feature factors are less important. Because the separability of the EBF and FVC extracted from Landsat-8 and Gaofen-6 is somewhat poorer, other factors need to be added to assist the extraction of rocky desertification information. On the whole, since rocky desertification is a complex phenomenon caused by multiple factors [8,10], it is still not straightforward to extract rocky desertification information from traditional remotely sensed indexes alone. When the terrain, soil, and lithology of the study area are integrated with remotely sensed factors, rocky desertification information might be extracted more accurately [21,31].

Applicability of ERT for Extracting Information of Rocky Desertification
The structure of the ERT algorithm is similar to that of RF and BDT: each consists of multiple decision trees (DTs) built in parallel [26]. The major difference is that RF and BDT use the bagging algorithm for sampling, whereas ERT uses all the sample data for modeling [60]. In addition, the split threshold at each node of a DT in ERT is selected at random, so the classification features finally used by each tree are random [38]. Therefore, combining the results of multiple DTs increases the objectivity of classification, and the results can be better and more stable than those from RF and BDT [39]. However, the ERT algorithm also places higher demands on the samples: only when samples and feature factors are accurately obtained can the results be superior to those of other classification models [36]. In this study, the classification accuracy of ERT for rocky desertification information is higher than 70% and is better than that of the RF and BDT models. It should be noted that only two of the nine feature factors are remotely sensed factors. If more related feature factors are used, the classification accuracy could be further improved, which is one of the directions we intend to follow up in the future.
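The two differences described above, the sampling scheme and the choice of split threshold, can be made concrete with a small pure-Python sketch. It is a simplified illustration under stated assumptions, not the implementation used in this study. All function names and the six (FVC, class) pairs are hypothetical, and only one feature and one split are considered. In the original extremely randomized trees formulation, one random cut-point is drawn for each of several candidate features and the best of these is kept; the sketch draws a single cut-point for a single feature.

```python
import random


def bootstrap_sample(samples, rng):
    """Bagging (RF, BDT): draw len(samples) items with replacement."""
    return [rng.choice(samples) for _ in samples]


def full_sample(samples):
    """ERT: every tree is grown on the whole training set."""
    return list(samples)


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def split_impurity(samples, threshold):
    """Weighted Gini impurity after splitting (value, label) pairs at threshold."""
    left = [lab for val, lab in samples if val <= threshold]
    right = [lab for val, lab in samples if val > threshold]
    n = len(samples)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_threshold(samples):
    """RF/BDT-style split: search midpoints between sorted values for the lowest impurity."""
    values = sorted({val for val, _ in samples})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return min(candidates, key=lambda t: split_impurity(samples, t))


def random_threshold(samples, rng):
    """ERT-style split: draw the threshold uniformly within the feature's range."""
    values = [val for val, _ in samples]
    return rng.uniform(min(values), max(values))


rng = random.Random(0)
# Hypothetical (FVC, class) pairs: low vegetation cover -> "severe", high -> "none".
data = [(0.10, "severe"), (0.15, "severe"), (0.30, "severe"),
        (0.55, "none"), (0.70, "none"), (0.90, "none")]

bag = bootstrap_sample(data, rng)                  # may repeat or omit samples
t_best = best_threshold(data)                      # optimized cut-point
t_rand = random_threshold(full_sample(data), rng)  # random cut-point
print(f"optimized threshold: {t_best:.3f}, impurity {split_impurity(data, t_best):.3f}")
print(f"random threshold:    {t_rand:.3f}, impurity {split_impurity(data, t_rand):.3f}")
```

A single random threshold is usually worse than the optimized one, which is why the approach relies on averaging over many trees, and why accurate samples matter when every tree uses the full training set.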

Study Limitations
We conducted comparative analyses regarding the capabilities of three types of open-access satellite images and three machine-learning algorithms for extracting rocky desertification information. Some limitations of the study are presented below.
First, the inherent landscape heterogeneity within each pixel of remotely sensed images usually leads to uncertainty in indexes such as EBF and FVC. In particular, the landscape features in karst rocky desertification regions are very fragmented [20]. Unreasonable land use in the last century aggravated landscape heterogeneity in the study area [61], which reduces the classification accuracy of rocky desertification. To better address the mixed-pixel problem, combining an object-based method with the ERT model might reduce the uncertainty caused by landscape heterogeneity [61].
Second, the karst terrain in the study area varies greatly, and because Guangnan County is located near the Tropic of Cancer, the sun is seldom directly overhead, especially in the dry season. As a result, the images acquired in February and March used in this study were taken under oblique solar illumination, which produces backlit (i.e., shadowed) regions on the Earth's surface [21,62]. These shadows affect the extraction of rocky desertification and need to be considered in the future [20].
Third, due to the constraints of topography and weather in the karst rocky desertification region, it is difficult to obtain a large number of field samples during investigation [63], which makes classification assessment and accuracy control difficult.
Fourth, this study is an experiment in using the ERT model to extract information on ground objects, and the classification accuracy of the ERT model for rocky desertification still needs to be improved. The factors that play a role in extracting the information have not been fully exploited and need further investigation.
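The illumination issue raised in the second limitation can be illustrated with a rough calculation of the solar elevation at local solar noon. The sketch below uses a common cosine approximation of the solar declination and is not part of the analysis in this study. The function names, the latitude of 23.44° (used here as a stand-in for a site near the Tropic of Cancer), and the choice of day 60 (about 1 March) are assumptions. Satellite overpasses occur before solar noon, so the actual elevation at acquisition would be lower than the value computed here.

```python
import math


def solar_declination_deg(day_of_year):
    """Approximate solar declination in degrees (simple cosine model)."""
    return -23.44 * math.cos(2 * math.pi * (day_of_year + 10) / 365)


def noon_elevation_deg(latitude_deg, day_of_year):
    """Solar elevation at local solar noon for a given latitude and day."""
    return 90.0 - abs(latitude_deg - solar_declination_deg(day_of_year))


latitude = 23.44  # assumed latitude, close to the Tropic of Cancer
early_march = noon_elevation_deg(latitude, 60)    # about 1 March
june_solstice = noon_elevation_deg(latitude, 172)  # about 21 June
print(f"noon elevation, early March:   {early_march:.1f} degrees")
print(f"noon elevation, June solstice: {june_solstice:.1f} degrees")
```

Under these assumptions the sun is nearly overhead only around the June solstice, while in early March it stays well below the zenith even at noon, so slopes facing away from the sun are more likely to fall in shadow.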

Conclusions
Based on the comparative analyses of three open-access satellite image datasets (Sentinel-2B, Landsat-8, and Gaofen-6) and three machine-learning algorithms (random forest, bagged decision tree, and extremely randomized trees) implemented in the Python programming language for extracting information on rocky desertification in a typical karst region of southwest China, the following conclusions are drawn:
(1) Among the three open-access satellite images (Sentinel-2B, Landsat-8, and Gaofen-6), the Sentinel-2B image has the best capability for extracting rocky desertification information, and its OA in the ERT model reaches 85.21%. This can be attributed to the higher spatial resolution of the Sentinel-2B image than that of the Landsat-8 and Gaofen-6 images, and to Gaofen-6's lack of the shortwave infrared (SWIR) bands suitable for mapping carbonate rocks.
(2) Among the three models of ERT, RF, and BDT, the ERT model gives the best classification results for rocky desertification information. Compared with the RF and BDT models, the ERT model has stronger randomness in modeling and can effectively identify important feature factors for extracting information on rocky desertification. Therefore, the results from ERT have less variance, better stability, and higher accuracy.
(3) The combination of the Sentinel-2B images and the ERT method provides an effective, efficient, and free approach to information extraction for mapping rocky desertification. The study can provide a useful reference for effective mapping of rocky desertification in similar karst environments of the world, in terms of both satellite image sources and classification algorithms.
(4) The study provides important information on the total area and spatial distribution of different levels of rocky desertification in the study area. The total rocky desertification area in Guangnan County is 3469 km². The more serious rocky desertification levels, namely severe and moderate, are mainly distributed in the southern, mid-west, and mid-east parts of the county. These results can be used to support decision making by local governments for sustainable development.