Effect of Environmental Factors on Soil Nutrient Loss under Conditions of Mining Disturbance in a Coalﬁeld

: Underground coal mining can result in land deformation (e.g., land subsidence and ground ﬁssures), and may consequently change the soil nutrients. Soil organic matter (SOM), total nitrogen (TN), and available phosphorus (AP) are critical indicators of soil fertility and eco-restoration in mining areas. In this study, soil samples (depth: 0–20 cm) were collected twice from 20 sampling points in pre-mining and post-mining in the No.12 panel of Caojiatan coalﬁeld, in the Loess Plateau of China. SOM, TN, and AP in soil samples were measured, and the nutrient loss was evaluated. Ten environmental factors affecting soil nutrient loss were identiﬁed from a 5-m resolution digital elevation map (DEM). The paired t -test was utilized to evaluate the differences between SOM, TN, and AP in pre-mining and post-mining soil. The mechanisms of the effects of environmental factors on soil nutrient loss were revealed based on multiple linear regression, redundancy analysis (RDA), and the random forest algorithm (RF). Ordinary kriging and RF were utilized to predict and optimize the spatial distribution of the soil nutrient loss. The results showed that signiﬁcant differences existed between the SOM, TN, and AP in the pre-mining and post-mining soil. The model established by RF provided a higher accuracy in terms of ﬁtting the correlation between soil nutrient loss and environmental factors compared to the model established by multiple linear regression, and the feature importance obtained by RF showed that proﬁle curvature, distance to working panel margin, and surface roughness were the most signiﬁcant factors affecting the loss of SOM, TN, and AP, respectively. This study provides a theoretical reference for eco-restoration, as well as soil and water conservation, in subsided lands in coalﬁelds. that the correlation between SOM and The correlation between and and and in both pre-mining and post-mining


Introduction
Coal is the most abundant fossil fuel in many countries, including Australia, India, Indonesia, the United Kingdom, and South Africa [1,2]. Coal is also the dominant fuel for power generation, industrial production, and residential consumption in many developing countries [3,4]. Although large-scale and high-intensity exploitation of coal resources has promoted economic development, it extensively damages the ecosystem and causes severe environmental problems, such as heavy metal pollution, soil erosion, and land degradation [5][6][7][8][9][10]. China is the largest producer and consumer of coal in the world, and coal accounts for about 70% of its primary energy [11]. More than 90% of coal

Soil Sampling and Lab Analysis
Soil samples were collected twice from 20 sampling points in pre-mining and postmining in the No.12 panel of the coalfield. For the first sampling period, twenty soil samples were collected in April of 2019, using a soil auger, at a depth of 0-20 cm, in the pre-mining soil prior to mining disturbance ( Figure 1). Sampling was performed using the five-point sampling method, in which soil was collected from five points within a 50 m radius and combined to generate a mixed sample. The soil samples were air-dried and sieved through a 2 mm sieve prior to the chemical and physical analyses. The second sampling period was in October of 2019, post-mining. To ensure comparability with the first set of sampled data, the sampling points and processes were the same. Land deformation (e.g., ground fissures, uneven subsidence, ephemeral gullies) was found in the post-mining topsoil for all 20 sampling points. The land use type of the 20 sampling points was shrubland, and the shrub community was Caragana korshinskii Kom.
Three soil fertility indicators were chosen to evaluate the fundamental properties in the pre-and post-mining soil. Our analytical items included soil organic matter, total nitrogen, and available phosphorous. Soil organic matter was measured using a K2Cr2O7 heating method, TN was measured using a semi-micro Kjeldahl method, and AP via alkaline hydrolysis NaHCO3-extraction with Mo-Sb-Vc-colormetry [10]. We measured each sample three times, and the mean value of each indicator was taken as the analysis value. All of the aforementioned properties were analyzed at the laboratory of the Chinese Academy of Agricultural Sciences. The soil nutrient loss was calculated using the following formula:  Three soil fertility indicators were chosen to evaluate the fundamental properties in the pre-and post-mining soil. Our analytical items included soil organic matter, total nitrogen, and available phosphorous. Soil organic matter was measured using a K2Cr2O7 heating method, TN was measured using a semi-micro Kjeldahl method, and AP via alkaline hydrolysis NaHCO3-extraction with Mo-Sb-Vc-colormetry [10]. We measured each sample three times, and the mean value of each indicator was taken as the analysis value. All of the aforementioned properties were analyzed at the laboratory of the Chinese Academy of Agricultural Sciences. The soil nutrient loss was calculated using the follow- A K-S test was performed to check the normality assumption and the paired t-test was utilized to evaluate the difference between soil nutrients in pre-mining and post-mining soil. The ordinary kriging coupled with RF were utilized to demonstrate the macro pattern of the DSOM, DTN, and DAP.

Extraction of Environmental Factors
A total of 10 environmental factors that could potentially affect the nutrient loss were considered. They were relief amplitude, slope gradient, variation of slope gradient, profile curvature, plan curvature, topography wetness index, drainage density, distance to working panel margin, surface roughness, and the normalized difference vegetation index. A digital elevation model (DEM) with a 5-m resolution was applied to extract the slope gradient, variation of slope gradient, curvature, and drainage. The stream network was determined from DEM using a stream threshold of 400. The DEM was conducted by Resources Satellite Three (ZY-3), the first civil high-resolution stereoscopic Earth mapping satellite of China. It carries three high-resolution panchromatic cameras and an infrared multispectral scanner. The relief amplitude (Formula 1), TWI (Formula 2), and surface roughness (Formula 3) were calculated as below [46]: A K-S test was performed to check the normality assumption and the paired t-test was utilized to evaluate the difference between soil nutrients in pre-mining and post-mining soil. The ordinary kriging coupled with RF were utilized to demonstrate the macro pattern of the DSOM, DTN, and DAP.

Extraction of Environmental Factors
A total of 10 environmental factors that could potentially affect the nutrient loss were considered. They were relief amplitude, slope gradient, variation of slope gradient, profile curvature, plan curvature, topography wetness index, drainage density, distance to working panel margin, surface roughness, and the normalized difference vegetation index. A digital elevation model (DEM) with a 5-m resolution was applied to extract the slope gradient, variation of slope gradient, curvature, and drainage. The stream network was determined from DEM using a stream threshold of 400. The DEM was conducted by Resources Satellite Three (ZY-3), the first civil high-resolution stereoscopic Earth mapping satellite of China. It carries three high-resolution panchromatic cameras and an infrared multispectral scanner. The relief amplitude (Formula (1)), TWI (Formula (2)), and surface roughness (Formula (3)) were calculated as below [46]: Sur f ace roughness = 1/ cos(S * π/180) where h max is the maximum elevation value of all pixels in the neighborhood, and h min is the minimum elevation value of all pixels in the neighborhood. AS is the specific catchment area in square kilometers, and S is the local slope gradient in degrees. The working panel was delineated using the coordinates provided by the Xi'an Research Institute of China Coal Technology and Engineering Group, and the distance to mining was calculated using the Euclidean distance. All of the aforementioned environmental factors were calculated and extracted using ArcGIS 10.2. The Normalized Difference Vegetation Index (NDVI) was computed from Sentinel 2 (4 January 2019, 10 m resolution), by ENVI 5.3 software.

RDA and Multiple Linear Regression
Redundancy analysis (RDA) is a sorting method that combines regression analysis and principal component analysis. By analyzing the correlation between the dependent variable and the independent variables, the reasons for the variation of the dependent variables can be assessed [47]. This method uses the nutrient loss as the dependent variable and the environmental variables as the independent variables to establish a linear regression model to describe the proportion of the dependent variable variation caused by each independent variable. A higher proportion represents a more significant correlation [48].
The multiple linear regression can be expressed as follows (4): where a 1 , . . . , a 10 are the partial correlation coefficients and x 1 , . . . , x 10 are the environmental factors. The standardized regressive coefficient calculated based on the partial correlation coefficient demonstrates the effect of the environmental factors on soil nutrient loss.

Random Forest Regression Trees
A random forest (RF) regression was used to model the non-linear correlation between nutrient loss and environmental factors under mining disturbance [49]. The machine learning algorithm requires a large dataset to obtain an optimized and stable model, avoiding underfitting. However, the powerful predicting ability of RF allows it to acquire an optimized model with a small dataset. The RF algorithm uses a bootstrapping method based on regression tree analysis to predict a continuous response variable. It fits a collection of decision tree models to the dataset. Each tree, trained using different bootstrap samples of the training data, acts as a regression function on its own, and the final output given by the regression corresponds to the average of the individual tree outputs [50]. The samples that are not in the bootstrap sample are called out-of-bag (OOB) samples; they are used to test the accuracy of the decision trees and estimate the overall model's misclassification error and variable importance [50]. The OOB ensures that each decision tree in the RF is a relatively accurate classification model; thus, the ensemble of these trees achieves a higher accuracy than the normally utilized BRT method. Hence, OOB is able to strictly control errors. Due to its cross-validation capability, RF regression provides realistic prediction error estimates during the training process, which gives it a powerful generalization ability to predict the unmeasured value [51]. Other advantages of RF include its minimized risk of overfitting, the possibility of including categorical along with continuous explanatory variables, and the small number of model parameters that need to be specified compared to other modeling approaches [50]. RF also provides several metrics to aid in interpretation: for instance, it automatically computes a variable importance score that assesses the contribution of individual predictors to the final model. After the optimized model was established based on RF, it also provided a powerful prediction ability to simulate soil nutrient loss in other unmeasured points. The kriging was utilized to predict the spatial distribution of soil nutrient loss, and RF was subsequently utilized to predict the other 20 unmeasured subsided points. These 20 unmeasured points were added to the observation points into kriging as measured points, to optimize the macro-spatial pattern of nutrient loss. The RF model was evaluated using mean absolute error (MAE) (Formula (5)), root mean squared error (RMSE) (Formula (6)) and mean squared error (MSE) (Formula (7)) as follows: These three parameters combined with the mean error (ME), root mean square standardized error (RMSSE), and average squared error (ASE) were utilized to inaugurate the appropriate kriging model presentation. The formulae of ME, RMSSE, and ASE can be found in a previous study [10].

Comparison between SOM, TN, and AP in Pre-and Post-Mining Soil
Comparisons of SOM, TN, and AP in the pre-mining and post-mining soil were shown in Figure 2. In the pre-mining period, SOM, TN, and AP varied from 2.6 to 14.4 g kg −1 , 0.273 to 0.891 g kg −1 , and 3.4 to 32.5 mg kg −1 , with an average of 8.05 g kg −1 , 0.539 g kg −1 , and 7.73 mg kg −1 , respectively. In the post-mining soil, SOM, TN, and AP varied from 0.5 to 12.5 g kg −1 , 0.182 to 0.890 g kg −1 , and 1.1 to 23.4 mg kg −1 , with an average of 5.2 g kg −1 , 0.412 g kg −1 , and 4.8 mg kg −1 , respectively. The variation of the three soil nutrient indexes is shown in Table 1. SOM, TN, and AP decreased in the post-mining soil in the range of 0.1 to 7.1 g kg −1 , 0.001 to 0.364 g kg −1 , and 0.1 to 9.1 mg kg −1 , with an average of 2.9 g kg −1 , 0.127 g kg −1 , and 2.9 mg kg −1 , respectively. The data of the samples satisfied the normality assumption, with the p values of SOM, TN, AP, DSOM, DTN, and DAP all above 0.05 (p > 0.05). The paired t-test was conducted to evaluate the differences between SOM, TN, and AP in the pre-mining and post-mining soil ( Table 1). The t values of SOM, TN, and AP were 3.338, 4.539, and 4.742, with p < 0.005, p < 0.001, and p < 0.001, respectively. The average decrease of SOM, TN, and AP compared to the pre-mining soil were 35.49%, 23.56%, and 38.06%, respectively. The results of the t-tests demonstrated a significant difference between the three soil nutrient indexes in the pre-mining and post-mining soil. We found that the soil nutrients dramatically decreased due to mining activities.   As shown in Figure 3, the liner regression correlating SOM and TN in the pre-mining and post-mining soil was fitted. The value of R square was 0.54 in the linear fitting of SOM and TN in the pre-mining soil, while it reached to 0.95 in the linear fitting of SOM and TN in the post-mining soil, demonstrating that mining activities enhanced the correlation between SOM and TN. The correlation between SOM and AP, and TN and AP, in both premining soil and post-mining soil, was not significant.  As shown in Figure 3, the liner regression correlating SOM and TN in the pre-mining and post-mining soil was fitted. The value of R square was 0.54 in the linear fitting of SOM and TN in the pre-mining soil, while it reached to 0.95 in the linear fitting of SOM and TN in the post-mining soil, demonstrating that mining activities enhanced the correlation between SOM and TN. The correlation between SOM and AP, and TN and AP, in both pre-mining soil and post-mining soil, was not significant. Figure 3, the liner regression correlating SOM and TN in the pre-mining and post-mining soil was fitted. The value of R square was 0.54 in the linear fitting of SOM and TN in the pre-mining soil, while it reached to 0.95 in the linear fitting of SOM and TN in the post-mining soil, demonstrating that mining activities enhanced the correlation between SOM and TN. The correlation between SOM and AP, and TN and AP, in both premining soil and post-mining soil, was not significant.

The Relationship between the Decrease of Soil Nutrients and Environmental Factors
The correlation between environmental factors and soil nutrient loss is shown in Tables 2 and 3, using RDA. For SOM, the total contribution was 15.7%, and surface roughness, distance to working panel margin, drainage density, and relief amplitude

The Relationship between the Decrease of Soil Nutrients and Environmental Factors
The correlation between environmental factors and soil nutrient loss is shown in Tables 2 and 3, using RDA. For SOM, the total contribution was 15.7%, and surface roughness, distance to working panel margin, drainage density, and relief amplitude contributed 21.0%, 18.9%, 17.6%, and 16.6%, respectively. These four environmental factors contributed significantly to the total explanation (p < 0.05) of DSOM. For TN, the total increased to 19.0%, and profile curvature, relief amplitude, and surface roughness contributed 47.6%, 15.8%, and 14.6%, respectively. These three factors significantly contributed to the total explanation of DTN. In terms of AP, the total contribution reached 44.8%, and relief amplitude, profile curvature, and distance to mining contributed 47.8%, 13.3%, and 12.4%, respectively. These three factors significantly contributed to the total explanation of DAP.    Among the three predictors, the DAP model showed the most powerful predicting function, with the highest total explanation. The DSOM model showed the lowest total explanation, demonstrating that the environmental factors were not significant in terms of predicting the decreased value of SOM. Among the environmental factors, distance to working panel margin and drainage density were significantly correlated with DSOM and DAP, while they scarcely contributed to DTN. Profile curvature was significantly correlated with DAP and DTN, while it scarcely contributed to DSOM. Surface roughness was significantly correlated with DSOM and DTN, while it scarcely contributed to DAP. Relief amplitude was significantly correlated with all soil nutrients, while NDVI and TWI scarcely contributed to any of the predictors.

Modelling DSOM, DTN, and DAP Using Random Forest
The multiple linear regression between DSOM, DTN, and DAP with the environmental factors showed a relatively low value of R square, demonstrating that the correlations may be better fitted using a non-linear algorithm. The result of feature importance conducted by random forest is shown in Figure 4. The MAE, MSE, and RMSE values in the model correlated with DSOM and environmental factors were 0.072, 0.013, and 0.112, respectively. The feature importance showed that plan curvature (0.228) was the most significant factor affecting DSOM, followed by profile curvature (0.191), NDVI (0.124), TWI (0.095), drainage density (0.094), distance to working panel margin (0.083), surface roughness (0.062), slope variation (0.058), slope gradient (0.038), and relief amplitude (0.027). In terms of the DTN model, the MAE, MSE, and RMSE were 0.093, 0.032, and 0.181, respectively. The feature importance showed that distance to working panel margin (0.222) was the most significant factor affecting DTN, followed by profile curvature (0.164), NDVI (0.149), slope variation (0.108), drainage density (0.098), plan curvature (0.072), TWI (0.053), relief amplitude (0.049), surface roughness (0.046), and slope gradient (0.039). As for the DAP model, the MAE, MSE, and RMSE were 0.05, 0.004, and 0.065, respectively. The feature importance showed that surface roughness (0.398) was the most significant factor affecting DAP, followed by slope gradient (0.217), slope variation (0.153), relief amplitude (0.127), profile curvature (0.076), drainage density (0.008), NDVI (0.006), distance to working panel margin (0.006), TWI (0.005), and plan curvature (0.003). The results showed that random forest had a higher accuracy in terms of predicting DSOM, DTN, and DAP. Thus, the feature importance could be a more reliable index to analyze the influence of environmental factors on soil nutrient decrease, compared to the standardized coefficients.

Predicting the Spatial Distribution of DSOM, DTN, and DAP by Combining Random Forest and Kriging
The sampling points were relatively deficient in the present study. Thus, random forest was utilized to predict the correlation of DSOM, DTN, and DAP with environmental factors, using 20 further mining disturbance points. These points were added to the study area with all environmental factors extracted. The tendency of the spatial distribution was smooth with the addition of the 20 points predicted by random forest (Figure 5). SOM decreased drastically in the north-east of the study area, adjacent to the working panel margin, and the decline slowed away from the working panel margin. The SOM decreased least in the north-west of the study area. TN decreased drastically in the northern and eastern area, except for the south-west corner of the study area, and this result was reversed from that of SOM. TN also decreased by the least in the middle and southwest corner of the study area. In terms of AP, the spatial distribution was homogeneous, with the greatest decline occurring in the middle and south-east of the study area, and the smallest decline in the west of the study area.
The accuracy of kriging interpolation was validated by 10-cross validation (Table 4). With ME and MSE close to 0, RMSE close to ASE, and RMSSE close to 1, the interpolation was perceived as accurate. The ME

Predicting the Spatial Distribution of DSOM, DTN, and DAP by Combining Random Forest and Kriging
The sampling points were relatively deficient in the present study. Thus, random forest was utilized to predict the correlation of DSOM, DTN, and DAP with environmental factors, using 20 further mining disturbance points. These points were added to the study area with all environmental factors extracted. The tendency of the spatial distribution was smooth with the addition of the 20 points predicted by random forest (Figure 5). SOM decreased drastically in the north-east of the study area, adjacent to the working panel margin, and the decline slowed away from the working panel margin. The SOM decreased least in the north-west of the study area. TN decreased drastically in the northern and eastern area, except for the south-west corner of the study area, and this result was reversed from that of SOM. TN also decreased by the least in the middle and south-west corner of the study area. In terms of AP, the spatial distribution was homogeneous, with the greatest decline occurring in the middle and south-east of the study area, and the smallest decline in the west of the study area.  Note: 20 points-soil nutrients, using 20 points as measured points; 40 points-soil nutrients, using 20 original points and 20 RF predicted points as measured points. ME, mean error; RMSE, root mean square error; MSE, mean square error; RMSSE, root mean standard square error; ASE, average sampling error.

Effects of Mining Disturbance on Soil Nutrient Loss
As shown in Table 2 and Figure 2, the three measured soil nutrients declined significantly in the post-mining soil, demonstrating severe destruction of the topsoil and soil nutrient loss under conditions of mining activity. With the implementation of the "Grain for Green" project, the soil nutrients increased in the study area with the treatment of a covering of caragana microphylla due to its nitrogen fixation capacity, litter residue, and root system, which protect the soil aggregates from soil erosion [45,52]. However, the soil nutrients drastically decreased in the post-mining soil, demonstrating that the root system, as well as the soil aggregates, were destroyed by underground mining. Surface deformation may significantly alter the soil texture in the topsoil, as clay and silt particles may be entrained through the cracks and sustain vertical loss. These particles are enriched with soil nutrients, and their loss directly causes soil nutrient decline in post-mining soil. Due to the loss of these finer particles, nutrient-holding capacity is also drastically reduced [53,54].
The soil microbial community is essential for the accumulation and decomposition of SOM, and microbes are responsible for decomposition of plant litter and animal residues [28]. They also contribute to the decomposition of organic matter that produces slimes and gums, which aid in the formation of soil structures and humus [28]. Previous studies have shown that the diversity and richness of the bacterial community drastically decrease with mining disturbance, which affects the stability of the soil aggregates and leads to the decrease of SOM and TN [27,35]. The loss of these soil nutrients induced by mining disturbance may directly affect the richness and diversity of the bacteria, thus affecting the mineralization of organic N and p, and dissolution of inorganic N and P. Meanwhile, surface subsidence and ground fissures can alter the soil physical properties (e.g., porosity, soil bulk density). With an increase of soil porosity, the SOM loss can be The accuracy of kriging interpolation was validated by 10-cross validation (Table 4). With ME and MSE close to 0, RMSE close to ASE, and RMSSE close to 1, the interpolation was perceived as accurate. The ME of DSOM, DTN, and DAP using 20 points and 40 points was 0.504 and 0.053, −0.051 and −0.002, and −0.094 and −0.025, respectively. The MSE of DSOM, DTN, and DAP using 20 and 40 points was 0.146 and 0.029, −0.620 and −0.020, and −0.041 and −0.010, respectively. The RMSSE of DSOM, DTN, and DAP using 40 points were all closer to 1 than those using 20 points. Thus, the kriging combined with random forest provided an optimized spatial pattern of soil nutrient decline.

Effects of Mining Disturbance on Soil Nutrient Loss
As shown in Table 2 and Figure 2, the three measured soil nutrients declined significantly in the post-mining soil, demonstrating severe destruction of the topsoil and soil nutrient loss under conditions of mining activity. With the implementation of the "Grain for Green" project, the soil nutrients increased in the study area with the treatment of a covering of caragana microphylla due to its nitrogen fixation capacity, litter residue, and root system, which protect the soil aggregates from soil erosion [45,52]. However, the soil nutrients drastically decreased in the post-mining soil, demonstrating that the root system, as well as the soil aggregates, were destroyed by underground mining. Surface deformation may significantly alter the soil texture in the topsoil, as clay and silt particles may be entrained through the cracks and sustain vertical loss. These particles are enriched with soil nutrients, and their loss directly causes soil nutrient decline in post-mining soil. Due to the loss of these finer particles, nutrient-holding capacity is also drastically reduced [53,54].
The soil microbial community is essential for the accumulation and decomposition of SOM, and microbes are responsible for decomposition of plant litter and animal residues [28]. They also contribute to the decomposition of organic matter that produces slimes and gums, which aid in the formation of soil structures and humus [28]. Previous studies have shown that the diversity and richness of the bacterial community drastically decrease with mining disturbance, which affects the stability of the soil aggregates and leads to the decrease of SOM and TN [27,35]. The loss of these soil nutrients induced by mining disturbance may directly affect the richness and diversity of the bacteria, thus affecting the mineralization of organic N and p, and dissolution of inorganic N and P. Meanwhile, surface subsidence and ground fissures can alter the soil physical properties (e.g., porosity, soil bulk density). With an increase of soil porosity, the SOM loss can be aggravated [24,55]. SOM is the major source of N and P; with the decrease of SOM, the aggregation and the availability of nutrients could drastically decline, which may lead to a decrease of TN and AP in post-mining soil [3,10]. The R 2 values of the correlation of TN and SOM were 0.54 and 0.95 in the pre-and post-mining soil, respectively, and an increased correlation could be induced by the loss of the inorganic form N. Plant and microbial N uptake is likely less in post-mining soil. Due to the cracks and increased infiltration, inorganic N is potentially susceptible to greater losses from leaching, volatilization, and conversion to gaseous forms than organic forms of N [10,28]. With the development of ground fissures, surface runoff may gather though these tunnels, while the runoff stored by topsoil may decrease. Ground fissures also promote the evaporation of soil moisture [15]. As a result, soil moisture in areas exposed to mining disturbance, especially areas with ground fissures, is likely to decrease. AP is related to the soil moisture, and with the decrease of soil moisture, AP drastically decreases. AP loss can generally be attributed to low primary productivity and high nutrient immobilization by calcium and magnesium [15]. Meanwhile, cracks, which result in the leakage of soil water, increase the effective flow path of water and promote soil water movement to the bottom layers of the soil, which further induces the loss of AP [2,10]. SOM, TN, and AP are the dominant factors for the development of soil bacteria communities.

Effects of Environmental Factors on Soil Nutrient Loss
Based on the results of the linear regression and random forest, plan curvature was the most significant factor and was negatively correlated with DSOM, demonstrating that a convex surface is related to divergence of the flow across the topsoil surface, and can aggravate the SOM loss under the conditions of mining disturbance. Ground fissures and subsidence in the topsoil can aggravate the SOM loss with divergence of the flow, as the cracks increase the potential of preferential flow and thus aggravate the damage to the humus layer in the topsoil [2]. Divergence of flow also contributes to the formation of ephemeral gullies, which generate the incision of overland flow into the cracks, and thus aggravate soil erosion and SOM loss [56]. Soil materials are voided by slumping of these unstable gullies and cracks, and are subsequently transported by the flowing runoff in both the vertical and horizontal directions [56]. SOM decreases slightly under a convergence of the flow, demonstrating that the accumulation of runoff could alleviate the SOM loss in the topsoil. Previous studies have demonstrated that convergence of the flow may cause less tensile stress compared to divergence of the flow, which may prevent the development of ground fissures [57]. Convergence of the flow could also contribute to the accumulation of SOM in the center of the subsidence, which would also alleviate the SOM loss. Profile curvature contributes significantly to the decrease of the SOM, suggesting that accelerated flow could aggravate the loss of SOM, and decelerated flow could alleviate the loss of SOM. The erosive power of accelerated flow could increase and thus aggravate the destruction of soil aggregates, and lead to SOM loss [58].
The distance to working panel margin significantly affected TN loss, with a negative correlation. Ground fissures occurred on the edge of the working panel due to tensile stress, and these are not able to spontaneously recover [10]. Ground fissures in the middle part of the working panel first expand and then compress, and may eventually recover though self-healing [21]. Thus, N-fixing bacteria could prevent TN loss in the middle part of the working panel; these bacteria are susceptible to mining disturbance, and their community and abundance also drastically decrease at the edge of the working panel [59]. Profile curvature also contributed significantly to TN loss, as accelerated flow aggravates the leaching of TN. NDVI significantly contributed to both SOM and TN loss, with a positive correlation, demonstrating that plants and their root systems can increase the nutrient-holding capacity and prevent soil erosion under uneven subsidence.
Surface roughness and slope gradient significantly affect AP loss, with a negative correlation. With elevating surface roughness, the soil erosion is aggravated, increasing AP loss. The availability of P is limited in the study area. Thus, the AP in the topsoil is susceptible to ground fissures, and can be eroded through cracks [10]. The uneven subsidence, which induces an additional slope, can also aggravate AP loss. With an elevated slope gradient, AP loss is aggravated as the clay and silt particles in which P is enriched are transported and eroded.
The standardized coefficient of relief amplitude was greatest in the linear regression. However, it barely affected DSOM, DTN, and DAP in the RF model. Meanwhile, the standardized coefficient of NDVI was low in the linear regression; it significantly affected DSOM and DTN in the RF model. Thus, it could be inferred that the intricate non-linear regressions among the nutrient loss and environmental factors could be better simulated using a machine learning algorithm.

Optimized Spatial Distribution Modelling Based on RF
By comparing the results of the kriging interpolation, we inferred that RF contributed to the improved accuracy of the macro-spatial pattern, with a smooth distribution and tendency. By using ordinary kriging, the nutrient loss at the unmeasured points was predicted only by the semi-variance. With the addition of predicted values using RF, the correlation with the soil nutrient loss and environmental factors achieved a higher accuracy and well-distributed measured points, which significantly enhanced the performance of the interpolation of DSOM, DTN, and DTN.

Conclusions
This study analyzed the effects of environmental factors on soil nutrient loss in topsoil under land deformation (e.g., land subsidence and ground fissures) in the Caojiatan coalfield. Significant differences were found between SOM, TN, and AP in pre-mining soil and post-mining soil. The soil nutrients at 20 measured points all decreased after the mining disturbance, demonstrating severe destruction of the soil fertility caused by mining activities. The results of RDA analysis demonstrated that the distance to working panel margin, profile curvature, and relief amplitude significantly contributed to the decrease of SOM, TN, and AP. Multiple linear regression demonstrated that relief amplitude was the most dominant factor affecting the decrease of SOM, TN, and AP. The results of RF showed that plan curvature, distance to working panel margin, and surface roughness were the most significant factors affecting DSOM, DTN, and DAP. Mining disturbance caused loss of the litter layer and microbes, which affected the availability of soil nutrients and nutrient-holding capacity in the topsoil. Inorganic N is potentially subject to greater losses from leaching, volatilization, and conversion to gaseous forms than organic forms of N. Meanwhile, the profile curvature contributed to the formation of ephemeral gullies, and thus aggravated SOM loss. The distance to working panel margin contributed to the formation of fissures, and thus aggravated TN loss. The surface roughness contributed to aggravating the soil erosion intensity and AP loss. RF predicted the soil nutrients at unmeasured points, and thus optimized the macro-spatial pattern of DSOM, DTN, and DAP.