Dynamic Monitoring of Desertification in Ningdong Based on Landsat Images and Machine Learning

Abstract: The ecological stability of mining areas in Northwest China has long been threatened by desertification. Remote sensing information combined with machine learning algorithms can effectively monitor and evaluate desertification. However, because the geological environment of a mining area is easily affected by factors such as resource exploitation, it is challenging to accurately track the development of desertification there. To make better use of remote sensing technology and machine learning algorithms in monitoring desertification in mining areas, we used a variety of machine learning algorithms and feature combinations, based on Landsat images, to monitor desertification in the Ningdong coal base. The performance of each monitoring model was evaluated with several performance indexes. The optimal monitoring model was then selected to extract long-term desertification information for the base, and the spatial-temporal characteristics of desertification were analyzed from several aspects. Finally, the factors driving desertification change were studied quantitatively. The results showed that random forest with the best feature combination had better recognition performance than the other monitoring models. Its accuracy was 87.2%, kappa was 0.825, Macro-F1 was 0.851, and AUC was 0.961. From 2003 to 2017, desertification land in Ningdong first increased and then slowly improved. In 2021, the desertification situation deteriorated. The driving force analysis showed that human economic activities such as coal mining have become the dominant factor controlling desertification change in the Ningdong coal base, with changes in rainfall playing an auxiliary role. The study comprehensively analyzed the spatial-temporal characteristics and driving factors of desertification in the Ningdong coal base and can provide a scientific basis for combating desertification and for the construction of green mines.

validation, W.D.
and X.K.; formal analysis, P.C.; investigation, P.C. and J.S.; resources, G.W.; data curation, P.C. and S.Z.; writing—original draft preparation, P.C. and P.L.; writing—review and editing, P.L.; visualization, P.L.; supervision, P.L.; project administration, P.L.; funding acquisition, P.L.


Introduction
Desertification occurs as land degrades in arid, semiarid, and dry subhumid areas, owing to various factors, including climate variability and human activities [1]. China is one of the countries most severely affected by desertification. The fifth national census on desertification and sandification showed that the desertification land area in China is 2.6116 million km², which is 27.2% of the total land area, located mainly in Xinjiang, Inner Mongolia, Gansu, and other western regions [2]. Wind desertification, water desertification, freeze-thaw desertification, and soil salinization are the main types of desertification in China [3]. Most desert areas are in a state of serious degradation.
Land desertification has been aggravated by the large-scale development of coal resources in the mining areas of northwestern China [4]. To promote the co-ordinated development of energy industries and ecological civilization, in-depth research is needed on desertification caused by human activities such as mining development. Such research should clarify how climate change, mining development, and other factors affect desertification, and provide a scientific basis for reducing the negative environmental impacts of mining.
In the 1970s, high-resolution aerial images were used as data sources for desertification research in China and abroad, which laid the foundation for remote sensing monitoring of desertification [5,6]. Remote sensing technology has since become an important means of desertification monitoring. Among the traditional remote sensing methods, visual interpretation has high accuracy but is time-consuming and requires considerable expert experience [7]. The Albedo-NDVI feature space model proposed by Zeng et al. [8] is widely used because of its simple principle and easy implementation [9][10][11]. Later, some scholars considered the influence of soil background on NDVI, replaced NDVI with MSAVI, and proposed the Albedo-MSAVI feature space model [11]. However, the inversion accuracy of the feature space model depends on the calculation accuracy of typical surface parameters, and poorly chosen sampling points reduce the inversion accuracy [12]. Comprehensive assessment models built from multisource data such as climate, vegetation, and soil, using methods such as AHP, are also commonly used in desertification monitoring [13][14][15]. However, this approach suffers from cumbersome data collation, strong subjectivity, and coupling among the input data [16][17][18]. Mixed pixel decomposition is also used in desertification monitoring [19]. It moves the classification scale from the pixel level to the subpixel level [20], which can effectively improve the classification accuracy and reduce the influence of terrain [21]. However, the related algorithms are mainly designed for hyperspectral images, which makes long-term and large-scale desertification monitoring difficult. If multispectral imagery is used for mixed pixel decomposition, it is difficult to ensure the accuracy of the results.
Recently, machine learning has played a positive role in promoting desertification monitoring via remote sensing. Compared with traditional desertification monitoring methods, machine learning can better exploit the latent information in classification features [22]. Unsupervised classification methods such as K-means can classify images without samples, which is fast and has a low labor cost. However, the effect of unsupervised classification depends heavily on the quality of the image data [23]. Supervised classification algorithms, such as decision trees, require sampling points to be selected manually, based on expert knowledge, to build the training samples [24]. Their classification results are more accurate than those of unsupervised classification. Studies that monitor desertification with supervised classification differ mainly in their choice of classification features and machine learning algorithms. When building multifeature datasets, most studies choose spectral features such as NDVI, Albedo, and TGSI [25][26][27][28], which together capture vegetation, soil, radiance, and other information. On this basis, some studies introduce texture features to make the classification results more accurate [29]. However, few studies have considered how the quality of features affects the prediction performance and training speed of machine learning models. It is therefore necessary to examine the performance of machine learning algorithms on feature combinations of different quality to verify their anti-interference ability. The algorithms commonly used in desertification monitoring include decision tree, random forest, and support vector machine [25,30,31]. Most studies rely on a single algorithm or a narrow selection of algorithms and do not discuss how other algorithms perform in desertification monitoring [21,32].
To further improve the technologies and methods of desertification monitoring, it is necessary to comprehensively evaluate the performance of various machine learning algorithms in desertification monitoring. In this paper, we take the Ningdong coal base, a typical ecologically fragile mining area in northwest China, as the research object. Based on Landsat images, 17 features were obtained, and three different feature combinations were established. The performance differences of 11 machine learning algorithms on the different feature combinations were compared using four performance metrics. Then, the spatial-temporal distribution and dynamic evolution of desertification in the Ningdong coal base from 2003 to 2021 were statistically analyzed using the optimal algorithm and feature combination. Finally, the driving factors of desertification in the Ningdong coal base were explored and discussed. The main purposes of this study are: (1) to evaluate the performance of various machine learning classification algorithms in desertification monitoring; (2) to monitor the dynamic changes of desertification in the Ningdong coal base over the past 19 years; and (3) to reveal the factors driving desertification change in the Ningdong coal base.

Study Area
The Ningdong coal base is located in the central-eastern part of the Ningxia Hui Autonomous Region, in the transition zone between the Ordos platform and the Yinchuan plain. It is a semidesert desertification zone and one of the 14 large coal bases in China with a capacity exceeding 100 million tons. It is composed of the Hongshiwan, Maliantai, Renjiazhuang, Shigouyi, Qingshuiying, Lingxin, Meihuajing, Yangchangwan, Zaoquan, Hongliu, Shuangma, Jinfeng, and other mines with abundant coal resources. The terrain is flat, with an average altitude of 1480 m, and the area has a mid-temperate continental climate. It is characterized by drought, concentrated rainfall, long sunshine durations, and large temperature differences between day and night. Sandstorms are frequent in spring. The vegetation consists mainly of shrub and shrub-grass communities composed of Salix psammophila, Allium mongolicum, and other xerophytic species, with sparse and uneven distributions and poor resilience [33]. The soil types mainly include light gray calcium soil, aeolian sand, silty fine sand, and saline alkali soil [34]. The soil is barren and the ecological environment is highly fragile. Figure 1 shows the location of the Ningdong coal base and its coal mines.

Data Collection

Imaging Data and Preprocessing
The imaging data used in this study were Landsat products (downloaded from https://earthexplorer.usgs.gov/ (accessed on 21 November 2021)) with a spatial resolution of 30 m, suitable for long-term analysis and change monitoring of surface information [35]. Landsat images of the Ningdong coal base from 2003, 2005, 2007, 2010, 2014, 2017, and 2021 were downloaded. To avoid misclassification due to seasonal factors, only images acquired from June to September, when vegetation is at its densest, were selected. Cloud cover in all images was <2%.
Radiometric calibration and atmospheric correction of the Landsat images were performed using the general calibration tool and the FLAASH model in ENVI, respectively, to eliminate the effects of sensors, atmosphere, and other factors. Landsat T1 products already have good geometric accuracy. To improve the reliability of the results, a more accurate geometric correction using ground control points (GCPs) was applied. The images were then clipped to the study area. The improved normalized difference water index [36,37], the enhanced construction land index [38], and the Fmask algorithm [39,40] were used to mask water, construction land, and cloud and cloud shadow, respectively.


Topographic Data
The digital elevation model (DEM) was obtained from the ASTGTM2 DEM data provided by the Geospatial Data Cloud (https://www.gscloud.cn/ (accessed on 8 December 2021)). The spatial resolution of the DEM data is 30 m. The DEM tiles were first mosaicked, reprojected, and clipped to match the Landsat data. Based on the DEM data, we calculated the slope and aspect of the study area. We then explored how the spatial distribution of desertification changes with altitude, slope, and aspect.

Desertification Driving Force Analysis Data
Based on previous studies of the driving mechanism of desertification in northwestern China [9,28,41,42], and considering the mining characteristics of this region, 12 indexes were selected to represent natural and human driving factors. These indexes included the average temperature, annual rainfall, total agricultural output value, total animal husbandry output value, number of main livestock, and coal production. They were used to analyze the driving force of desertification at the Ningdong coal base and to examine the main controlling factors and statistical characteristics of desertification evolution. All data were obtained from the Chinese meteorological data service center (http://data.cma.cn/ (accessed on 29 December 2021)) and the statistical yearbook for the Ningxia Hui Autonomous Region (http://nxdata.com.cn/publish.htm?cn=G01/ (accessed on 1 January 2022)).

Data Analysis
We used Landsat images of the Ningdong coal base from seven periods between 2003 and 2021. From the preprocessed images, 9 spectral features and 8 texture features were obtained, and missing values, abnormal values, and differences in feature scale were handled. Then, the importance and correlation of the features were evaluated using a tree-model-based feature selection method and the Pearson correlation coefficient method. According to the evaluation results, three feature combinations of different quality were established. Based on these three feature combinations and 11 machine learning algorithms, different desertification monitoring models were built. The performance of the monitoring models was compared using four indicators: accuracy, kappa, Macro-F1, and AUC. The best monitoring model was used to extract the long-term desertification information of Ningdong from 2003 to 2021. Then, the spatio-temporal change patterns and driving factors of desertification were analyzed using the gravity center migration model, the dynamic change intensity index, and PCA. Figure 2 shows a flowchart of this study.

Desertification Classification System and Sample Selection
The development of desertification changes the structure and coverage of surface vegetation and the soil capacity. Different degrees of desertification show different vegetation coverage and landscape characteristics [43]. Based on the monitoring and evaluation indicator system for sandy desertification [44], as well as the regional characteristics of the Ningdong coal base, the degree of land desertification was divided into four grades: nondesertification, light desertification, moderate desertification, and severe desertification. Table 1 lists the indexes and UAV image features of each category. With reference to UAV images and Google Earth high-resolution images from the same period, nearly 2000 sampling points were randomly selected in QGIS to establish the sample pool. Severe, moderate, light, and nondesertification samples accounted for about 22%, 21%, 22%, and 35% of the pool, respectively. The hold-out method [45] was used to split the samples in a stratified manner: 75% formed the training dataset and 25% the validation dataset. Figure 3 shows the distribution of the training sample points.
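The stratified hold-out split described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function name `stratified_holdout`, the random seed, and the synthetic label list (built to mimic the reported class shares) are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict

def stratified_holdout(labels, train_fraction=0.75, seed=0):
    """Split sample indices so each class keeps about the same share in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)                       # random order within the class
        cut = round(len(members) * train_fraction)
        train_idx.extend(members[:cut])            # 75% of this class for training
        val_idx.extend(members[cut:])              # remaining 25% for validation
    return sorted(train_idx), sorted(val_idx)

# Hypothetical pool of 2000 points with roughly the class shares reported above.
labels = ["severe"] * 440 + ["moderate"] * 420 + ["light"] * 440 + ["non"] * 700
train_idx, val_idx = stratified_holdout(labels)
print(len(train_idx), len(val_idx))
```

Splitting class by class keeps the 22/21/22/35 proportions in both subsets, so the validation metrics are not distorted by one grade being over- or under-represented.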

Desertification Classification Indicators
Based on the preprocessed Landsat image data, 17 feature indicators, comprising 9 spectral features and 8 textural features and representing information on the vegetation, soil, surface radiation, and texture, were calculated. The corresponding feature values of the sample points were extracted using the Point Sampling Tool in QGIS to establish the initial dataset. Spectral information, such as surface vegetation, soil, and albedo, is widely used in desertification research [46][47][48][49][50]. Using this information effectively is crucial for accurately extracting the degree of surface desertification. Textural features reflect the structure and spatial arrangement of surface objects; therefore, they are widely used in remote sensing image processing and pattern recognition [51,52]. Desertified land has distinctive textural features, so adding them can effectively improve the recognition accuracy. Table 2 lists the equations for each feature. Of these, the tassel cap transformation coefficients for Landsat 5 TM were proposed by Crist and Cicone in 1984 [53], and those for Landsat 8 OLI were proposed by Baig in 2014 [54]. The textural features were calculated from the gray-level co-occurrence matrix (GLCM) [55][56][57].


Feature Preprocessing
Owing to errors in the data itself or improper manual interpretation, the dataset contains missing values, outliers, and features of inconsistent magnitude. To improve the accuracy of desertification extraction, the dataset should first be preprocessed [58].

• Missing values and abnormal value management
Samples with missing or abnormal values should be deleted, or the values should be filled in using the mean, the mode, model prediction, interpolation, or weighting methods [59,60]. The dataset had only a few missing or abnormal values; all of these samples were deleted.
• Feature standardization
Features with different dimensions can reduce the convergence rate of the algorithm and affect the accuracy of the analysis; therefore, data standardization is necessary. The Max-Min normalization method was used to linearly map the feature values to the range 0-1 to eliminate the influence of dimensional differences on model accuracy:
$$ X = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \quad (1) $$
where $x$ is the original value, $X$ is the mapped value of $x$, $x_{\max}$ is the maximum value of the dataset, and $x_{\min}$ is the minimum value of the dataset.
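As a minimal sketch of the Max-Min normalization described above, the following function rescales one feature column to the range 0-1. The function name `min_max_normalize` and the sample values are illustrative assumptions; returning zeros for a constant column is also an assumption made here to avoid division by zero, not something the text specifies.

```python
def min_max_normalize(values):
    """Linearly map a list of feature values to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Made-up values for a single feature column (for example, a vegetation index).
feature = [0.05, 0.12, 0.30, 0.45]
scaled = min_max_normalize(feature)
print(scaled)
```

In practice the minimum and maximum would be taken from the training data and reused for the validation data, so that both sets are scaled identically.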

Feature Combinations
To explore the influence of feature combinations of different quality on the performance of the algorithms used for desertification monitoring, we used a tree-model-based feature selection method and the Pearson correlation coefficient method to evaluate the importance and correlation of the features. Both methods were implemented in Python:
• Tree model feature selection method
First, noise interference was added to the out-of-bag data. The importance of each feature variable was then obtained by calculating the decrease in the Gini index or the residual sum of squares caused by that feature variable in each decision tree [61,62]. Finally, all feature variables were ranked by importance, and features were selected using a given threshold.

• Pearson correlation coefficient
The Pearson correlation coefficient shown in Equation (2) measures the degree of linear correlation between two variables. It ranges from −1 to +1; positive values indicate a positive correlation and negative values a negative correlation [63]. The greater the absolute value of the coefficient, the stronger the correlation between the two variables. Strongly correlated features can reduce the efficiency of data use and the speed of model operation:
$$ r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \quad (2) $$
where $r$ is the Pearson correlation coefficient, $n$ is the total number of samples, $X_i$ and $Y_i$ are the values of the X and Y variables for the i-th sample, and $\bar{X}$ and $\bar{Y}$ are the means of the X and Y variables.
Figures 4 and 5 show the feature importance and the Pearson correlation coefficients, respectively. We used an importance of >0.03 and an absolute Pearson correlation coefficient of <0.65 as thresholds (features with an absolute correlation coefficient ≥0.65 were considered strongly correlated, in which case only one feature with high importance was kept) to establish three feature combinations: ① the feature combination after importance and correlation screening (features of high importance and weak correlation), ② the feature combination after importance screening (features of high importance but strong correlation), and ③ all features (including features with low importance and strong correlation). Table 3 shows the composition of each feature combination.
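The two-step screening described above (importance threshold first, correlation threshold second) can be sketched as follows. This is a hedged illustration, not the study's code: the greedy rule that keeps the most important feature of each strongly correlated group is an assumed reading of the text, and the function names, feature values, and importance scores are all made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def screen_features(columns, importance, min_importance=0.03, max_abs_r=0.65):
    """Keep important features, dropping any that correlate strongly with a kept one."""
    candidates = sorted(
        (f for f in columns if importance[f] > min_importance),
        key=lambda f: importance[f],
        reverse=True,                      # most important features are considered first
    )
    kept = []
    for f in candidates:
        if all(abs(pearson(columns[f], columns[k])) < max_abs_r for k in kept):
            kept.append(f)
    return kept

# Hypothetical feature columns and importance scores for five sample points.
columns = {
    "NDVI": [0.10, 0.20, 0.30, 0.40, 0.50],
    "MSAVI": [0.12, 0.19, 0.33, 0.41, 0.52],    # nearly collinear with NDVI
    "Albedo": [0.30, 0.10, 0.40, 0.15, 0.35],
    "Texture": [0.50, 0.40, 0.60, 0.45, 0.55],
}
importance = {"NDVI": 0.30, "MSAVI": 0.25, "Albedo": 0.20, "Texture": 0.01}
kept = screen_features(columns, importance)
print(kept)
```

With these made-up numbers, the low-importance texture feature fails the first threshold and MSAVI is dropped because it is strongly correlated with the more important NDVI.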

Intelligent Algorithms
Recently, the computational efficiency and reliability of artificial intelligence technology and machine learning algorithms have improved considerably, and these methods are now widely used in many application scenarios. To compare the performance of different machine learning algorithms in desertification monitoring, 11 commonly used classification algorithms were selected for experiments. All algorithms were written in Python, and the parameters of each algorithm were optimized by grid search and cross-validation to eliminate the influence of differences in features and make the results more reliable.

1. Multinomial Logistic Regression (MLR)
Logistic regression models the probability of a binary dependent variable. It assumes a linear relationship between the log odds of the dependent variable and the independent variables [64], and it applies to binary classification. MLR generalizes logistic regression to multiclass problems by passing the outputs of linear functions through the softmax function [65].
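To make the role of the softmax function concrete, the sketch below turns the linear scores of four classes into class probabilities. It illustrates only the prediction step; the weights, biases, and feature values are invented for the example and are not fitted to any data from the study.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)                          # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mlr_predict_proba(features, weights, biases):
    """One linear score per class, followed by softmax."""
    scores = [
        sum(w * x for w, x in zip(class_weights, features)) + b
        for class_weights, b in zip(weights, biases)
    ]
    return softmax(scores)

# Hypothetical model with 2 features and 4 desertification grades.
weights = [[2.0, -1.0], [1.0, 0.0], [-1.0, 1.0], [-2.0, 2.0]]
biases = [0.0, 0.0, 0.0, 0.0]
proba = mlr_predict_proba([0.8, 0.1], weights, biases)
predicted_class = proba.index(max(proba))
print(proba, predicted_class)
```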

2. Linear Discriminant Analysis (LDA)
LDA projects high-dimensional sample feature data onto a lower-dimensional space and finds the most discriminative vector space, thereby extracting classification information and compressing the dimension of the feature space. The projection ensures that the samples have the largest interclass distance and the smallest intraclass distance in the new subspace, that is, the classes have the best separability in that space [66].

3. Quadratic Discriminant Analysis (QDA)
QDA is a variant of LDA. It also assumes that the observations of each class come from a Gaussian distribution and inserts the estimated parameters into Bayes' theorem for prediction. The difference is that LDA assumes that all classes share the same covariance matrix, whereas QDA estimates a separate covariance matrix for each class, which is the basic reason why QDA is more flexible than LDA [67].

4. Classification and Regression Tree (CART)
CART searches for the best classification rules in a complex set of irregularly distributed data. CART sets a heterogeneity threshold. When the heterogeneity of a candidate split reaches the threshold range, a classification node is generated; otherwise, the split is reselected and recombined from the many categorical attributes, and the process repeats until the heterogeneity reaches the threshold range [68].

5. Support Vector Machines (SVM)
When dealing with classification problems, SVM maps the input vector into a high-dimensional space through a nonlinear function and then solves for the optimal separating surface to perform the classification [69]. SVM classifiers achieve high accuracy in remote sensing image classification and, in theory, are resistant to overfitting.

6. Naive Bayes classifier (NB)
The classification principle of NB is based on Bayes' formula. The posterior probability of an object, that is, the probability that the object belongs to a certain class, is calculated from its prior probability. It is a way to make decisions at the probability level [70]. NB converges quickly and is suitable for classifying small amounts of data.

7. K-Nearest Neighbor (KNN)
KNN does not assume a specific distribution of the data; it classifies samples by measuring the distance between their feature values. For a newly input sample, if most of its k nearest samples in the feature space belong to a certain category, the new sample is assigned to that category as well [71].
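The majority-vote rule of KNN can be sketched with the standard library alone. The Euclidean distance, the value k = 3, and the tiny two-feature training set are assumptions chosen for illustration; the study's tuned parameters are not given in the text.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Assign the query point to the majority class among its k nearest neighbours."""
    # Euclidean distance from the query to every training sample, nearest first.
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

# Hypothetical normalized samples: (vegetation index, albedo).
train_X = [(0.10, 0.90), (0.15, 0.85), (0.20, 0.80),
           (0.80, 0.20), (0.85, 0.15), (0.90, 0.10)]
train_y = ["severe", "severe", "severe", "non", "non", "non"]

print(knn_predict(train_X, train_y, (0.75, 0.25)))
```

Because the vote depends on raw distances, KNN is one of the algorithms for which the feature standardization step matters most.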

8. Random Forests (RF)
RF is an ensemble learning algorithm proposed by Breiman et al. [72]. RF uses decision trees built on randomly selected features and sample sets as weak learners and determines the final classification result from the votes of all decision tree classifiers. The samples for each tree are drawn by random sampling with replacement (bootstrap sampling) from the original dataset; this is repeated N times to generate N different, unpruned decision trees. At each node of a decision tree, K features are randomly selected from all features, and the candidate splits are compared using the Gini index; the split that reduces the Gini index the most is chosen. The randomness in both sample and feature selection makes random forest resistant to overfitting and gives it good robustness to noise [61].
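The Gini-based choice of a split can be illustrated for a single feature. The sketch computes the Gini impurity of a node and the decrease obtained by a candidate threshold; it covers one split of one tree only, and the feature values, labels, and thresholds are invented for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a set of class labels (0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(values, labels, threshold):
    """Impurity of the parent node minus the size-weighted impurity of the two children."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

# Hypothetical values of one feature and the grades of six samples.
values = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8]
labels = ["severe", "severe", "severe", "non", "non", "non"]

for threshold in (0.15, 0.45):
    print(threshold, gini_decrease(values, labels, threshold))
```

A tree in the forest would evaluate such decreases for the K randomly chosen features at each node and keep the split with the largest value.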

9. Extremely Randomized Trees (ERT)
ERT is an ensemble learning algorithm proposed by Pierre Geurts et al. in 2006 [73]. Its principle is similar to that of RF: the voting results of multiple decision trees are combined to determine the final classification result. Unlike RF, each decision tree in ERT is trained on the whole original dataset, and when splitting a node, ERT selects the split value of a feature at random.

10. AdaBoost (AB)
AB was proposed by Freund et al. in 1997 [74]. It is widely used because of its fast speed, low complexity, and good compatibility. AB combines multiple weak classifiers into a strong classifier. It works iteratively: a weak classifier is trained in each iteration, and its result is carried into the next iteration.

11. Gradient Boosting Machine (GBM)
GBM is a machine learning algorithm proposed by Friedman on the basis of AB [75]. Its basic principle is to train each newly added weak classifier on the negative gradient of the loss function of the current model and then add the trained weak classifier to the existing model. The main innovation of GBM is that it estimates the basis functions with a nonparametric method and uses "gradient descent" in function space to obtain an approximate solution.

Performance Index
Four indicators were selected to compare the performance of these algorithms.

Accuracy
Accuracy is the proportion of correctly classified samples among all samples, with values between 0 and 1. The higher the accuracy, the better the classification result. Accuracy is easily distorted by imbalances in the class sample sizes.
$$ Acc = \frac{1}{m} \sum_{i=1}^{m} 1\big(f(x_i) = y_i\big) $$
where $m$ is the total number of samples, $f(x_i)$ is the predicted result for sample $x_i$, $y_i$ is the true label corresponding to $x_i$, and $1(x)$ is the indicator function: when $x$ is true, the value is 1; when $x$ is false, the value is 0.

Kappa
The kappa coefficient measures the agreement between the predicted and actual classes after correcting for the agreement expected by chance.
$$ Kappa = \frac{Acc - \dfrac{\sum_{i=1}^{c} a_i b_i}{m^2}}{1 - \dfrac{\sum_{i=1}^{c} a_i b_i}{m^2}} $$
where $Acc$ is the accuracy, $c$ is the number of classes, $a_i$ is the actual number of samples in class $i$, $b_i$ is the predicted number of samples in class $i$, and $m$ is the total number of samples.
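Accuracy and kappa, as defined by the symbols above, can be computed directly from the true and predicted labels. The sketch below is illustrative only; the function names and the eight-sample label lists (classes coded 0 to 3) are made up.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Share of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def kappa(y_true, y_pred):
    """Agreement corrected for chance: (Acc - p_e) / (1 - p_e)."""
    m = len(y_true)
    actual = Counter(y_true)       # a_i: actual number of samples per class
    predicted = Counter(y_pred)    # b_i: predicted number of samples per class
    p_e = sum(actual[c] * predicted[c] for c in actual) / (m * m)
    return (accuracy(y_true, y_pred) - p_e) / (1 - p_e)

# Hypothetical validation labels for four desertification grades.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 0]

print(accuracy(y_true, y_pred), kappa(y_true, y_pred))
```

Here six of eight samples are correct (accuracy 0.75), while the chance agreement is 0.25, so kappa is lower than accuracy.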

Macro-F1
F1 is the harmonic mean of the precision and recall, which can be used to measure the accuracy of the binary classification, with values between 0 and 1. Its high value corresponds to a high accuracy. The precision is the proportion of the number of samples classified correctly to the total number of prediction results in the category. Recall is the proportion of the number of samples classified correctly to the actual total number in the category. The Macro-F1 is a variant model of the F1 that decomposes the multiclassification into multiple binary classifications [76].
$$ Macro\text{-}F1 = \frac{1}{n} \sum_{i=1}^{n} \frac{2 P_i R_i}{P_i + R_i} $$
where $n$ is the number of binary classifications into which the multiclass problem is decomposed, $P_i$ is the precision of the i-th binary classification, and $R_i$ is the recall of the i-th binary classification.
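A minimal sketch of the Macro-F1 computation follows: each class is treated as one binary problem, its precision, recall, and F1 are computed, and the per-class F1 values are averaged. Setting a score to 0 when its denominator is 0 is an assumption made here, and the label lists are invented.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0   # assumption: 0 if undefined
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)

# Hypothetical validation labels for four desertification grades.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 0]

print(macro_f1(y_true, y_pred))
```

Because every class contributes equally to the mean, Macro-F1 penalizes poor performance on a small class more than accuracy does.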

AUC
The AUC is the area under the ROC curve, which can directly evaluate the performance of the binary classification model [77], with values between 0.5 and 1. Larger values indicate better performance. Here, 0.5 represents the random guessing performance while 1 represents the optimal performance [78]. For multiclassification, the AUC can be calculated by decomposing multiples into binary classifications.
$$ AUC = \frac{1}{n} \sum_{k=1}^{n} \frac{1}{m^{+} m^{-}} \sum_{x^{+} \in D^{+}} \sum_{x^{-} \in D^{-}} \left[ 1\big(f(x^{+}) > f(x^{-})\big) + \tfrac{1}{2}\, 1\big(f(x^{+}) = f(x^{-})\big) \right] $$
where, for each binary classification, $m^{+}$ and $m^{-}$ are the numbers of positive and negative samples, $D^{+}$ and $D^{-}$ are the sets of positive and negative samples, and $f(x^{+})$ and $f(x^{-})$ are the predicted values of the positive and negative samples, respectively. $n$ is the number of binary classifications into which the multiclass problem is decomposed. $1(x)$ is the indicator function: when $x$ is true, the value is 1; when $x$ is false, the value is 0.
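The pairwise interpretation of the AUC can be sketched as follows: for one binary problem, the AUC is the share of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half. The one-vs-rest decomposition used for the multiclass average is an assumption, since the text does not state which decomposition was applied; all names and scores are illustrative.

```python
def binary_auc(scores_pos, scores_neg):
    """Share of (positive, negative) pairs ranked correctly; ties count one half."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def macro_ovr_auc(y_true, proba):
    """Average the binary AUC over classes (one-vs-rest, an assumed decomposition)."""
    classes = sorted(set(y_true))
    aucs = []
    for c in classes:
        pos = [p[c] for t, p in zip(y_true, proba) if t == c]
        neg = [p[c] for t, p in zip(y_true, proba) if t != c]
        aucs.append(binary_auc(pos, neg))
    return sum(aucs) / len(aucs)

# Hypothetical predicted probabilities for a two-class toy problem.
y_true = ["a", "b"]
proba = [{"a": 0.9, "b": 0.1}, {"a": 0.2, "b": 0.8}]

print(binary_auc([0.9, 0.4], [0.6, 0.1]), macro_ovr_auc(y_true, proba))
```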

Gravity Center Migration Model
The gravity center migration model is an effective method for describing changes in the position of the center of gravity of a research object. It was first proposed to examine the center of gravity of the population distribution and is now widely used to examine changes in the spatial patterns of the human economy and ecological landscapes, among others [79][80][81][82]. In this study, the gravity center migration was calculated to analyze the spatial variation in desertification land at the Ningdong coal base from 2003 to 2021. The gravity center coordinates were calculated as follows:

$$X_j = \frac{\sum_{i=1}^{n} C_{ji} X_{ji}}{\sum_{i=1}^{n} C_{ji}}, \qquad Y_j = \frac{\sum_{i=1}^{n} C_{ji} Y_{ji}}{\sum_{i=1}^{n} C_{ji}}$$

where $X_j$ and $Y_j$ are the longitude and latitude coordinates of the gravity center of a certain type of desertification land in the $j$-th year, $n$ is the number of patches of that type in the $j$-th year, $C_{ji}$ is the area of the $i$-th patch of that type in the $j$-th year, and $X_{ji}$ and $Y_{ji}$ are the geometric center coordinates of the $i$-th patch of that type in the $j$-th year.
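The area-weighted center described above can be sketched in a few lines. The patch areas and coordinates below are hypothetical and only illustrate the calculation; the function name is likewise an assumption.

```python
def gravity_center(patches):
    """Area-weighted center of a list of (area, x, y) patches for one year."""
    total_area = sum(area for area, _, _ in patches)
    x = sum(area * px for area, px, _ in patches) / total_area
    y = sum(area * py for area, _, py in patches) / total_area
    return x, y


# Hypothetical patches: (area in km^2, longitude, latitude).
patches_year_a = [(2.0, 106.5, 38.0), (1.0, 106.8, 37.7), (1.0, 106.6, 37.9)]
patches_year_b = [(1.0, 106.5, 38.0), (2.0, 106.8, 37.7), (1.0, 106.6, 37.9)]

xa, ya = gravity_center(patches_year_a)
xb, yb = gravity_center(patches_year_b)

# The sign of the shift gives the migration direction between the two years.
print(round(xa, 3), round(ya, 3))
print(round(xb - xa, 3), round(yb - ya, 3))
```

In this toy case the center moves east and south, because the largest patch in the second year lies in the southeast.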

Desertification Dynamic Change Intensity Index
As listed in Table 4, to explore the dynamic changes in desertification land at the Ningdong coal base, we defined six levels of change intensity according to the number of desertification grades spanned: severe degradation, moderate degradation, light degradation, light improvement, moderate improvement, and significant improvement. The desertification change intensity index, $T$, was introduced to characterize the intensity of desertification change in each period. In its definition, $T_w$ is the degradation intensity; $T_r$ is the improvement intensity; $T$ is the overall change intensity; $i = -3, -2, -1$ correspond to severe, moderate, and light degradation, respectively; $j = +1, +2, +3$ correspond to light, moderate, and significant improvements, respectively; $S_i$ and $S_j$ represent the desertification area of the corresponding level; and $S$ represents the total area of desertification change.
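The formula for $T$ is not reproduced in this text, so the sketch below is only one plausible reading of the symbol definitions, not the authors' confirmed formula. It assumes that each change level is weighted by its area share, that $T_w$ sums the negative levels, $T_r$ sums the positive levels, and $T = T_w + T_r$. The areas are hypothetical.

```python
def change_intensity(area_by_level):
    """Area-weighted change intensity (assumed form).

    area_by_level maps a change level (-3..-1 degradation, +1..+3 improvement)
    to the area that changed by that level.
    """
    total = sum(area_by_level.values())  # S: total area of desertification change
    t_w = sum(lv * a for lv, a in area_by_level.items() if lv < 0) / total
    t_r = sum(lv * a for lv, a in area_by_level.items() if lv > 0) / total
    return t_w, t_r, t_w + t_r


# Hypothetical areas (km^2) for one period.
areas = {-3: 5.0, -2: 10.0, -1: 30.0, 1: 35.0, 2: 15.0, 3: 5.0}
t_w, t_r, t = change_intensity(areas)
print(round(t_w, 2), round(t_r, 2), round(t, 2))
```

Under this reading, a positive $T$ means that improvement outweighs degradation over the period, and a negative $T$ means the opposite.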

Dimidiate Pixel Model
Vegetation cover is an important ecological factor affecting the degree of surface desertification; vegetation degradation leads to the deterioration of surface ecosystem conditions. Under low vegetation cover, wind, rainfall, solar radiation, freeze-thaw changes, and other factors act on the surface soil and trigger different degrees of surface desertification. The dimidiate pixel model is a vegetation coverage estimation model frequently used for its simplicity and practicality [83][84][85]; it assumes that a pixel in the image is composed of both vegetation and nonvegetation coverage [86]. We used the NDVI as the parameter of the dimidiate pixel model to extract vegetation coverage information for the Ningdong coal base:

$$FVC = \frac{NDVI - NDVI_{soil}}{NDVI_{veg} - NDVI_{soil}}$$

where $FVC$ is the fractional vegetation cover, and $NDVI_{soil}$ and $NDVI_{veg}$ are the NDVI values of pure soil and pure vegetation pixels, respectively.
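A minimal sketch of the dimidiate pixel model follows. The end-member values for pure soil and pure vegetation are hypothetical placeholders, since the text does not state how they were chosen, and clipping the result to the range 0 to 1 is an added assumption for pixels that fall outside the two end members.

```python
def fractional_vegetation_cover(ndvi, ndvi_soil, ndvi_veg):
    """Dimidiate pixel model: linear mix of a pure-soil and a pure-vegetation NDVI."""
    fvc = (ndvi - ndvi_soil) / (ndvi_veg - ndvi_soil)
    # Assumption: clip values outside the end members to the valid range.
    return min(1.0, max(0.0, fvc))


# Hypothetical end members and pixel values (not taken from the study).
NDVI_SOIL = 0.05
NDVI_VEG = 0.65

for ndvi in (0.02, 0.35, 0.70):
    print(ndvi, round(fractional_vegetation_cover(ndvi, NDVI_SOIL, NDVI_VEG), 2))
```

A pixel with an NDVI halfway between the two end members is read as half covered by vegetation, which is the linear mixing assumption behind the model.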

PCA
PCA is a multivariate statistical method for dimension reduction, commonly used for data simplification and multi-index comprehensive evaluation [87]. It is widely used in social economics, medicine, meteorology, environmental science, and other fields. PCA linearly transforms multiple correlated variables into uncorrelated comprehensive index variables [88]. Because a few components summarize most of the information in the original data, PCA reduces the effect of subjective preference in the evaluation. In this study, PCA was used to extract the driving factors of desertification at the Ningdong coal base and to analyze their weights, in order to reveal the driving mechanism of desertification.

Figure 6 shows the performance of the different desertification monitoring models. The algorithms differed in prediction performance and training speed depending on the feature combination. In terms of feature combination, combination 1 gave good results with RF, GBM, AB, KNN, ET, NB, CART, and SVM, but its accuracy was low with MLR, QDA, and LDA. A possible reason is that MLR, QDA, and LDA are classification algorithms based on geometric principles, whereas the linear correlation among the features of combination 1 was greatly weakened by the correlation screening. The overall performance of combination 2 was slightly worse than that of combination 1, but its accuracy was much higher than that of combination 1 with MLR, QDA, and LDA. Combination 3 performed slightly better than combination 2 only with the MLR and LDA algorithms, and its training cost was high: the training time was several times that of combination 2. In terms of training time, feature selection effectively improved the training speed of the machine learning models. Overall, combination 1, screened by importance and correlation, was the most widely applicable and was better suited to extracting desertification information at the Ningdong base.

Comparison of Desertification Monitoring Models
RF and SVM, the intelligent algorithms most often used in desertification monitoring, also performed well in this study. RF performed best: under the best combination, its accuracy was 0.872, Kappa was 0.825, Macro-F1 was 0.851, and AUC was 0.961. All four indexes were the best among the 11 algorithms, which shows that RF had high accuracy, consistency, and strong generalization ability. GBM and AB also performed well, with prediction results only slightly below those of RF and SVM. The accuracy of GBM and AB was similar, but the training time of GBM was about twice that of AB, which may be because GBM corrects the error of the previous base learner by fitting its (quasi) residual [89]. KNN had high accuracy with combinations 1 and 2, but its accuracy decreased by about 13% with combination 3, which indicates that the algorithm was vulnerable to noise. Although CART and ET had good anti-interference abilities and fast training rates, their accuracy was lower than that of the algorithms above. MLR, QDA, and LDA had low accuracy and unstable performance, which was strongly affected by the decision boundary and feature quality. NB is a classifier that assumes that the features are independent of each other, so the relationships between the features have a large impact on its prediction results [90]. Its performance was the worst among the 11 algorithms: its best Accuracy, Kappa, Macro-F1, and AUC were only 0.584, 0.442, 0.549, and 0.584, respectively. Overall, RF had higher accuracy and stability than the other algorithms, so it was more suitable for extracting desertification information at the Ningdong coal base.

Spatial and Temporal Distribution of Desertification
Based on combination 1, RF was used to extract desertification at the Ningdong coal base (Figures 7 and 8).

Desertification Variations under Different Topographic Conditions
To examine desertification changes at the Ningdong coal base under different terrain conditions, the topographic conditions were classified based on the elevation, aspect, and slope. Table 5 lists the classification index information.


There were different distributions of the desertification types at different elevations. With an increase in elevation, the proportion of the severe desertification area showed a decreasing trend; the proportion of moderate desertification also decreased gradually from 1275 m. The area of light desertification increased with elevation and tended to become stable at elevations > 1475 m. The area of nondesertification land decreased first and then increased, with the smallest proportion at elevations of 1475–1575 m.

Desertification Variations at Different Slopes
Figure 10 shows the proportions of the desertification types at different slopes at the Ningdong coal base. There was a correlation between the proportions of the different desertification types and changes in the slope. With an increase in the slope, the proportions of severe and moderate desertification land decreased gradually; there was almost no severe desertification land in areas with slopes > 25°. Light desertification land first increased and then decreased with an increase in the slope, reaching its maximum at a moderately steep slope. The proportion of nondesertification land showed an increasing trend, with the largest proportion at very steep slopes.

Figure 11 shows the proportions of desertification types at different aspects of the Ningdong coal base. Different aspects had negligible impacts on desertification; the proportion of each desertification type in the different slope direction areas fluctuated by no more than 5%. With the change in the aspect from north to south, land with severe, moderate, and light desertification first increased and then decreased. Land with severe and moderate desertification accounted for the largest proportion on the east slope, while land with light desertification accounted for the largest proportion across the south and southwest slopes. The proportion of nondesertification land first decreased and then increased from north to south; the proportion was the smallest on the east slope.

Figure 12 and Table 7

From 2007-2010, the desertification situation in a few areas in the northern part of the base was seriously degraded. There were only slight changes from 2010-2014. From 2014-2017, a large area in the northern part of the base showed moderate improvements; the deterioration areas were mainly concentrated in the middle, mostly with a light degradation trend. From 2017-2021, a large area of land had a moderate degradation, with changes across the northern and southern parts of the base.

Analysis of Desertification Driving Factors
The load matrix and variance contribution rate of the desertification driving factors were obtained based on the PCA using statistical data from 2007-2020 with SPSS, as listed in Table 8. Three principal components with eigenvalues > 1 were extracted. The cumulative contribution rate reached 86.13%, covering most of the data. These three principal components were selected to discuss the impact that the natural driving factors and human activities had on the desertification status of the Ningdong coal base.
The contribution rate of the first principal component was 57.61%, including the total output value of agriculture, animal husbandry, main livestock, traffic freight volume, industrial wastewater discharge, and industrial solid waste production. Human activities played a leading role in the change in land desertification at the Ningdong base. The contribution rate of the second principal component was 19.71%, mainly including the number of mining enterprises, coal industry personnel, annual coal output, and total output value of the coal industry. This explained the driving effect that human activities, as dominated by coal resource mining, had on desertification at the Ningdong coal base. The contribution rate of the third principal component was 8.81%, mainly including annual rainfall, which indicates that climate change was another important factor affecting desertification at the Ningdong coal base. Overall, human activities dominated the evolution of desertification at the Ningdong coal base, followed by climate change.
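The reported rates are internally consistent: 57.61% + 19.71% + 8.81% = 86.13%, and the first two components together give 77.32%. The selection rule itself (keep components with eigenvalues > 1, then report each component's share of the total variance) can be sketched as follows. The eigenvalues are hypothetical, for five standardized indicators, and are not the values from Table 8; the function name is also an assumption.

```python
def select_components(eigenvalues, threshold=1.0):
    """Keep components whose eigenvalue exceeds the threshold.

    Returns the kept eigenvalues, their contribution rates in percent,
    and the cumulative contribution rate in percent.
    """
    total = sum(eigenvalues)  # equals the number of variables for standardized data
    kept = [ev for ev in sorted(eigenvalues, reverse=True) if ev > threshold]
    rates = [100.0 * ev / total for ev in kept]
    return kept, rates, sum(rates)


# Hypothetical eigenvalues of a correlation matrix of five indicators.
eigenvalues = [2.9, 1.2, 0.5, 0.3, 0.1]
kept, rates, cumulative = select_components(eigenvalues)

print(kept)
print([round(r, 1) for r in rates], round(cumulative, 1))
```

With these toy values, two components pass the threshold and together account for about 82% of the variance.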

Human Activity Factors
The cumulative contribution rate of the first and second principal components reached 77.32%, and both were mainly related to human activities. Human economic activities were the core driving forces of the positive and negative changes in desertification at the Ningdong coal base. The change in the ecological environment across the study area was partially reflected by indicators such as the agricultural output value, animal husbandry output value, and number of main livestock. Because agricultural technology lags behind, the irrational use of land resources, such as the rapid reclamation of farmland and overgrazing, destroys the growth environment of the original surface vegetation, leading to vegetation degradation and the expansion of desertification.
The exploitation of coal resources is the most important driving force for the development and change in desertification at the Ningdong coal base. The large-scale exploitation of mineral resources destroys the stress balance in the rock strata above the goaf, resulting in movement and fracture. This causes surface subsidence in the mining area, destroys the original soil structure, and changes the soil void ratio. Discontinuous deformation, such as ground fissures, accelerates the evaporation rate of soil moisture and reduces water retention in the surface soil [91]. Soil erosion accelerates the generation and development of surface desertification [92]. The expansion of the resource exploitation scale can increase indicators such as the personnel density, transportation volume, and waste emissions in the mining area. The deterioration of desertification will intensify when the damage exceeds the bearing capacity of the environment itself. Figure 15 shows

Natural Factors
Annual rainfall had the largest load on the third principal component, indicating that it was closely related to desertification at the Ningdong coal base. Rainfall drives the development of desertification by affecting vegetation growth. An abnormal reduction in rainfall can cause the disappearance of vegetation over a large area and aggravate the degree of land desertification. Figure 17 shows the average annual temperature and precipitation at the Ningdong coal base from 2003–2020. From 2004–2006, the rainfall decreased abnormally, then increased, and reached a peak in 2014. Annual rainfall first increased and then decreased from 2015–2020. The average temperature in the mining area fluctuated from 8.5–10.2 °C. Before 2013, the average temperature changed significantly. Rainfall showed a notable correlation with desertification at the Ningdong coal base, but the temperature did not have any notable relationship with it.

Synergy of Human Activities and Natural Factors
Farmland reclamation, livestock stocking and grazing, mineral exploitation, rainfall changes, and other factors affect the ecological structure of surface vegetation, which leads to the intensification or alleviation of desertification. The dimidiate pixel model was used to estimate the FVC of the Ningdong coal base. According to the ecological characteristics of vegetation in the mining area, the FVC was divided into five grades: very low, low, moderate, moderately high, and high. The combined influence of human and natural factors was examined to understand the evolution of vegetation cover and its driving effect on desertification land changes at the Ningdong coal base. Figure 18 shows



Practicability of Different Machine Learning Algorithms in Desertification Monitoring
In practical applications, a model needs not only high accuracy but also a strong anti-interference ability, so that it is not easily affected by factors such as noise variables and feature correlations. RF and SVM algorithms are commonly used in remote sensing monitoring research, and their good performance has been verified in many studies [25,61,93,94]. They also showed a good recognition effect in this study. In addition, we found that the two boosting algorithms, GBM and AB, are also very useful in desertification monitoring. Like random forest, boosting is based on the idea of ensemble learning. The difference is that random forest tries to make its decision trees uncorrelated, while boosting uses each weak learner to make up for the shortcomings of all previous learners. Both are accurate, insensitive to noise variables and feature correlation, and stable in performance. The accuracy of KNN decreases significantly when there are noise variables. This is because KNN does not consider response variables in classification, which makes it vulnerable to noise variables [95]. Therefore, when using KNN for desertification monitoring, it is necessary to filter out the noise variables in the dataset. In contrast to KNN, CART considers the influence of the feature variables on the response variable when dividing regions, and uses only one splitting variable at a time [68]. Therefore, CART is suitable for high-dimensional space and is not susceptible to noise variables. This also explains why CART performed similarly on the three feature combinations. It is worth mentioning that RF, AB, GBM, and ET are all ensemble algorithms based on CART, so they have good anti-interference abilities. SVM, MLR, LDA, and QDA are all algorithms based on geometric principles, but the stability of SVM in desertification monitoring is much better than that of the latter three. This may be because an SVM is usually determined by only a few support vectors, so it is less affected by noise variables. MLR, LDA, QDA, and NB have a poor recognition effect, and their stability is easily affected by data quality and type, so they are not very suitable for desertification monitoring. In addition, the feature combination that gives the highest accuracy differs between algorithms, indicating that feature selection needs to adapt to the algorithm rather than assume that more features, or fewer, are always better. Feature selection is therefore important, especially when the amount of data is huge.

Driving Mechanism of Desertification in Ningdong
Land desertification is a complex process affected by both natural factors and human activities. It is not rigorous to judge the driving causes of desertification without regard to research scale and regional characteristics [96]. Ningdong coal base is located in the arid and semiarid region of northwestern China, which is a typical ecologically fragile area. Abnormal changes in natural conditions and human activities are the most rapid and direct causes of desertified land changes in this region. The average precipitation in Ningxia dropped to 199 mm in 2005, 31% lower than the multi-year average, which made it a once-in-a-decade dry year [97,98]. Vegetation growth was seriously degraded, and desertification worsened. At this time, the catastrophic drought was the main reason for the serious deterioration of desertification in Ningdong. Around 2007, with the government's policy of closing hills and prohibiting grazing in an effort to return farmland to forest, the cultivated land area decreased, the forest land area increased, and the ecological environment of the base improved. After 2010, Ningdong coal base began large-scale coal mining activities.
Mining is one of the engineering activities with the strongest impact on the geological environment, and its negative impact on desertification in mining areas is very large: mineral exploitation directly destroys the original topography and biological communities in the mining area, resulting in serious degradation of surface ecosystems; deep mining also changes the aquifer and destroys groundwater resources, resulting in vegetation degradation; large quantities of industrial waste generated in mining development are piled up around the mining area, forming many coal gangue piles and dumps; a large amount of sand and dust spreads under the action of wind, which adversely affects the surrounding ecological environment; and the growth of industrial personnel density, traffic volume, mine construction, and road construction all have a negative impact on the surrounding environment. Although the government has implemented a number of ecological protection and restoration projects in mining areas and enacted relevant environmental protection policies, the status of desertification in the base remains severe because of the weak ecological resilience of the base itself and the impact of mineral resource exploitation and related industries. A clear grasp of the driving mechanism of desertification is the key to controlling and preventing desertification in the base and deserves sufficient attention.

Recommendations for Desertification Control
Lucid waters and lush mountains are invaluable assets. While human beings make endless demands on nature, they should respect, protect, and comply with nature to achieve coordinated and sustainable development of the social economy and the ecological environment. Ningdong coal base is located in the fragile ecological area of the arid and semiarid zone in the northwest, and the desertification control situation there is very serious. Future governance and restoration can be carried out from the following aspects:
(1) The ecological protection and engineering of mining areas should be carried out in depth, and the restoration and management of abandoned industrial and mining land should be strengthened. An artificial windbreak and sand fixation forest should be established at the boundary of the mining area to prevent the desertification land from spreading to the surrounding areas. The focus should be on protecting the existing vegetation and cultivating new vegetation for wind prevention and sand fixation. On the premise of improving soil texture, increasing the organic matter content of desertification land, improving its fertility, and enhancing the environmental carrying capacity of mining areas should be considered.
(2) Mineral enterprises should reasonably arrange the mining, production, and business activities of mineral resources and control the intensity of mineral exploitation. They should also further optimize the resource mining technology to minimize the negative impact of mining activities on the ecological environment.
(3) A monitoring and early warning program for desertification in mining areas should be built. A supervision and monitoring system should be formed by means of administrative supervision, remote sensing monitoring, etc., combined with technologies such as big data and cloud platforms. At the same time, reasonable early warning programs should be set up to prevent desertification from worsening.
(4) The government should establish a sound legal guarantee system to ensure the smooth implementation of ecological projects in the form of legislation. Law enforcement departments should improve the legal monitoring system, strengthen law enforcement, and severely crack down on activities such as deforestation, reclamation, and illegal exploitation. Law popularization departments should strengthen legal publicity and enhance public legal awareness to prevent land desertification caused by human factors.

Shortcomings and Prospects of Research
This paper explored the performance of different machine learning algorithms in the remote sensing monitoring of desertification and extracted the desertification information of Ningdong coal base for many years, with good results. However, there are still some deficiencies in this paper, which need to be further improved:
(1) This study only used the spectral and textural information generated by satellite images to establish a dataset, without considering the impact of soil, meteorology, and other factors. In future research, multisource data can be used to improve the desertification monitoring performance of machine learning models.
(2) For long-term desertification monitoring, a machine learning model established only from the training samples of single-phase images is prone to overfitting when predicting and segmenting images from other years. Although this study selected training samples for each year, field survey data were lacking, so sample labelling relied only on Google Earth images and UAV images, which can easily introduce subjective misjudgment and affect the accuracy of the model. In future research, new methods for accurately identifying wrongly labelled pixels in sample data are worth developing.
(3) When discussing the factors driving the desertification process in Ningdong, we only discussed the correspondence between the desertification status and each factor in the time dimension and lacked mapping verification in the space dimension. In future research, it is necessary to use buffer zone analysis and other techniques to discuss the driving causes of desertification in different areas of the mining area.
(4) In this paper, ENVI, QGIS, Python, and other tools were used in the whole monitoring process. The workflow was scattered and time-consuming, which is not conducive to large-scale desertification monitoring. In future research, it will be of great significance for desertification control to establish a comprehensive remote sensing monitoring platform with a unified process and simple operation to realize large-scale, long-time-sequence, high-frequency, and high-precision desertification monitoring.

Conclusions
Combining quantitative remote sensing and machine learning, this paper discussed the performance of various machine learning models in desertification monitoring and analyzed the spatial-temporal changes and driving factors of desertification land in Ningdong coal base over the last 19 years. The main conclusions are as follows:
(1) Among the 11 algorithms, RF, SVM, GBM, and AB performed well in desertification monitoring, with reliable and stable accuracy. RF was especially effective and performed best in this study.
(2) In 2003–2017, the area of desertification land first increased rapidly and then decreased slowly. In 2017–2021, the desertification situation deteriorated, and a large area of nondesertified land turned into light desertification land.
(3) The driving analysis showed that human economic activities, dominated by coal mining, played a major role in driving desertification in mining areas, and natural driving forces such as rainfall played a secondary role.
In future research, a comprehensive monitoring and evaluation system of cloud platforms based on machine learning, big data, and remote sensing should be established to control the desertification in mining areas comprehensively.