Article

Interpretable Machine Learning and Remote Sensing Data Reveal Soil Biogeochemistry Patterns in Agricultural Systems

1
Laboratory of Soil Science, Ufa Institute of Biology, Ufa Federal Research Centre, Russian Academy of Sciences, 450054 Ufa, Russia
2
Laboratory for Plant Biotechnology, Department of Multidisciplinary Scientific Research of the Karelian Research Centre, Russian Academy of Sciences, 185910 Petrozavodsk, Russia
3
Laboratory for Ecological Monitoring and Modeling, Department of Multidisciplinary Scientific Research of the Karelian Research Centre, Russian Academy of Sciences, 185910 Petrozavodsk, Russia
4
Department of Earth and Space Sciences, Ural Federal University Named After the First President of Russia B.N. Yeltsin, 620002 Yekaterinburg, Russia
5
Department of Geodesy, Cartography and Geographic Information Systems, Ufa University of Science and Technology, 450076 Ufa, Russia
6
Department of Environmental Protection and Prudent Exploitation of Natural Resources, Ufa State Petroleum Technological University, 450064 Ufa, Russia
7
Centre of Excellence on Sustainable Land Management, Indian Council of Forestry Research and Education, Dehradun 248006, Uttarakhand, India
*
Author to whom correspondence should be addressed.
Land 2025, 14(9), 1881; https://doi.org/10.3390/land14091881
Submission received: 15 August 2025 / Revised: 9 September 2025 / Accepted: 12 September 2025 / Published: 15 September 2025
(This article belongs to the Special Issue Digital Soil Mapping and Precision Agriculture)

Abstract

Soil condition represents a critical factor for ensuring sustainable agricultural development and food security. In this study, we examined the content of key soil properties and their patterns using an interpretable machine learning framework in combination with remote sensing data (Sentinel-2A) across several land use types in northwestern Russia. The analyzed soil properties in 64 samples included soil organic carbon (Corg), total nitrogen (N), mobile phosphorus (Pmob), total phosphorus (Ptot), and mobile potassium (Kmob) sampled across three land use types: cropland, hayfield, and forest. For machine learning interpretability, model-agnostic methods were utilized, including permutation importance and SHapley Additive exPlanations (SHAP) with spatial visualization. Our results revealed the highest concentrations of Corg (6.1 ± 4.3%), Kmob (78.3 ± 42.1%), and N (31.2 ± 14.5 mg/100 g) in forested areas, while both types of phosphorus (Ptot and Pmob) peaked in croplands (0.075 ± 0.024 and 0.023 ± 0.015%, respectively). The lowest values of Corg were observed in hayfields, and the lowest values of Kmob and N in croplands. Model validation demonstrated that Corg and N were predicted most accurately (R2 = 0.53 and 0.55, respectively), with SWIR bands from Sentinel-2A satellite imagery as the key predictors. The generated soil property maps and spatial SHAP values showed distinct patterns correlated with land use types, reflecting distinct biogeochemical processes across landscapes. Our findings demonstrate how land management practices fundamentally alter soil parameters, creating diagnostic spectral signatures that can be captured through interpretable machine learning and remote sensing.

1. Introduction

The abandonment of agricultural land is a complex, multidimensional, and nonlinear phenomenon that has varying impacts on agro-biodiversity, the environment, and society [1]. This practice is widespread across global land use systems and is driven by political, demographic, economic, technological, and socio-cultural factors [2]. From the perspective of ecosystem functions performed by soil cover in the biosphere, abandoned lands can contribute to the restoration of degraded ecosystems and enhance biodiversity. On the other hand, they may also be reclaimed for agricultural use if needed [3,4]. Additionally, such abandoned lands play a beneficial role in mitigating climate change by participating in biospheric carbon sequestration processes and reducing anthropogenic CO2 emissions. This occurs through vegetation recovery and increased soil organic matter [5,6].
However, post-agricultural soils undergo divergent pedogenic processes. For instance, in southeastern Spain, soils with low levels of agronomically valuable structural aggregates and organic matter have exhibited accelerated erosion [7]. Nevertheless, erosion processes can be mitigated by restoring plant cover, which serves as a key factor in enhancing aggregate formation and water infiltration rates, ultimately reducing soil degradation [8]. In mountainous landscapes (Central Pyrenees, Spain), abandoned lands have undergone natural revegetation by forests and meadows, leading to the accumulation of soil organic carbon (Corg) and total nitrogen (N). The rate of accumulation has varied depending on vegetation type and age [9]. In the Amazon rainforest, intensive agricultural use has degraded soils, reducing exchangeable cations and nutrient content. This degradation later became a limiting factor for biomass accumulation in post-abandonment ecosystems [10]. These findings highlight how agricultural intensification depletes base cations, constraining ecosystem recovery after abandonment. On China’s Loess Plateau, abandoned croplands exhibited increased microbial diversity and richness compared with actively farmed soils. The key factors shaping microbial communities included available N and phosphorus, total carbon, and soil moisture [11].
In European Russia, post-Soviet agricultural abandonment was widespread. The primary drivers were low cereal crop yields, remote field locations (far from settlements of 500+ inhabitants), and the cessation of state agricultural subsidies [3]. In northwestern Russia’s Leningrad Oblast (Karelian Isthmus, between Lake Ladoga and the Baltic Sea), the proportion of agricultural land declined from 18% to 9% between 1939 and 2005 [12]. In the Republic of Karelia (located north of Leningrad Oblast), under the Land Degradation Neutrality framework (where cropland abandonment is classified as degradation while natural revegetation is considered a positive change), approximately 2% of agricultural lands are abandoned. This assessment utilizes global databases analyzing land cover dynamics, productivity, and Corg stocks [13].
The Republic of Karelia has experienced significant reductions in agricultural areas due to changing land use systems, with arable land decreasing by 37%, pastures by 47%, and hayfields by 70%, driven by both economic factors and declining land quality [14,15]. To enhance sustainable land management and agricultural productivity, comprehensive inventories and agroecological assessments of abandoned lands are needed to develop improvement strategies and implement effective land resource management measures [16,17,18]. Research on post-agrogenic landscape transformation in Karelia remains limited and sporadic [19], though studies of soil property changes have demonstrated that sod-podzolic soils show particularly high restoration potential. Their long-term recovery processes depend on both the cessation of agricultural activity and local soil–climatic conditions, with these soils naturally developing dry meadows containing characteristic vegetation that enables cost-effective reintegration into agricultural use [20]. Recent research on the northern soils of Russia has shown that key soil properties are vulnerable to land use change [21]. For instance, Nizamutdinov et al. [22] demonstrated that Corg and key nutrients, including mineral forms of N, mobile phosphorus (Pmob), and mobile potassium (Kmob), declined over a 25-year period after arable land was abandoned. Changes in the properties of Retisols under different land use types are also being studied worldwide [23,24,25].
Digital soil mapping (DSM) workflows are particularly valuable for monitoring soil property dynamics across landscapes undergoing land use changes, as they enable the integration of spatially explicit environmental covariates with field observations. Remote sensing data serve as a critical covariate in these workflows [26], especially when spectral features (e.g., bare soil reflectance or vegetation indices) are essential for establishing the predictive relationships between soil properties and land cover. For instance, multispectral or hyperspectral imagery can capture variations in Corg content [27], moisture [28], salinity [29], or erosion susceptibility [30], which often shift significantly during transitions across different land use types [31,32]. DSM methodologies account for these relationships using state-of-the-art predictive approaches, such as machine learning techniques. Besides producing digital products for an area of interest, this methodology allows us to identify and rank the key environmental drivers of biogeochemistry patterns. Furthermore, explainable methods make it possible to reveal the behavior of the model and its predictions, which is important for understanding changes due to post-agrogenic transformation or other land use changes.
The objectives of this study are as follows: (1) to evaluate and compare the content of key soil properties across different land use types; (2) to estimate their spatial distribution using machine learning algorithms and remote sensing data; (3) to validate the performance of the models; and (4) to interpret the influence of key predictors.

2. Materials and Methods

2.1. Study Area

This study was conducted in the Republic of Karelia, within the Prjazhinskij National Municipal District (Figure 1). The climate is moderately continental, with a mean annual air temperature of 2.5 °C. The coldest months are January and February (average temperature: −11 °C), while the warmest is July (average temperature: +15.8 °C). Summer begins in early June and ends in late August, lasting approximately 90 days. The frost-free period averages 126 days, while persistent sub-zero temperatures prevail for about 114 days annually. The region experiences excessive moisture, with mean annual precipitation of 614 mm—ranging from a minimum of 26 mm in March to a peak of 81 mm in August. Snow cover persists for an average of 155 days per year, typically forming in mid-October and melting completely by early May, with an average snow depth of 32 cm.
The soil cover of the district is characterized by loamy–sandy sod-podzolic soils or Retisols, according to the WRB [33]. These soils are poorly suited for agriculture in their natural state. The high acidity, low nutrient availability, increased rockiness, and unfavorable water–physical properties of the soils all limit agricultural development.
A chronosequence encompassing several land use types was selected for this study. The study objects comprised soils from both natural (forest) and anthropogenically altered biocenoses (cropland and sown hayfields exhibiting various stages of regrowth). The selection of these areas for research was based on the prevalence of forest land, the distribution of agricultural land (primarily hayfields and arable land), and the presence of abandoned agricultural land in the district.
The species present in the forest plot (area: 0.12 km2) were Pinus sylvestris L., Alnus glutinosa L., Salix caprea L., Betula pendula L., Sorbus aucuparia L., Picea abies L., Calluna vulgaris L., Vaccinium myrtillus L., Vaccinium vitis-idaea L., Rubus saxatilis L., Calamagrostis arundinacea L., Poa pratensis L., Chamerion angustifolium, Potentilla erecta L., Trientalis europaea, Lycopodium annotinum, Pleurozium schreberi Mitt., Dicranum scoparium Hedw., and Polytrichum commune Hedw.
The hayfield (abandoned arable land, aged 7–10 years; area: 0.15 km2) is characterized by weedy, mixed-herb vegetation with low floristic diversity. The projective cover of the herbaceous layer is 80%, and the average height of the vegetation is 120 cm. The following species are present: Anthriscus sylvestris L., Cirsium vulgare L., Elytrigia repens L., Poa pratensis L., Deschampsia cespitosa L., Phleum pratense L., Sonchus arvensis L., Veronica longifolia L., Stellaria media L., Galium aparine L., Ranunculus acris L., Vicia cracca L., Alchemilla micans L., and Taraxacum officinale F.H. Wigg.
The arable field (area: 0.09 km2), under a row crop rotation system following several years of perennial grasses, was planted with potatoes when the study was conducted.
The distance between biocenoses ranged from 0 to 1500 m. Cropland and unused hayfields occupy adjacent plots sharing a boundary up to 150 m long. The forested plot is located 0.5–1.5 km from the arable land and unused hayfield (Figure 1b). Due to the significant spatial (i.e., variation in location across the landscape) and temporal (i.e., variation in time) heterogeneity of regrowth processes, five stages of forest development on agricultural lands were identified based on projective cover. This scale was adapted for the present study, conducted under the specific soil and climatic conditions of Karelia. The transformation of hayfields was studied in relation to plant community succession, focusing on the description of natural woody and shrub vegetation under a clumped regrowth pattern (1–5 and 5–10 years old).

2.2. Soil Sampling and Analyses

Soil samples were collected from three land use types (cropland, hayfield, and forest) using a targeted sampling scheme [34]. Sampling locations were selected to reflect the significant spatial heterogeneity within the study area and to capture the range of variability associated with different management practices. A total of 20 samples (0–20 cm depth) were collected from the cropland, 24 from the hayfield, and 20 from the forest. A complete soil profile was described for the plot exhibiting the clumped regrowth stage under the hayfield. The morphological description of the soil profile and soil sample analysis were performed using standard soil science methods [33].
For chemical analysis, dried samples were passed through a 1 mm sieve, and the following determinations were made: total N by the Kjeldahl method on a Buchi nitrogen analyzer; total phosphorus (Ptot) by combustion in a muffle furnace followed by spectrophotometric determination (spectrophotometer SF-2000, OKB-spectr, Saint-Petersburg, Russia); Pmob and mobile potassium (Kmob) by the Kirsanov method (0.2 n HCl), with spectrophotometric (UV-1800 Shimadzu spectrophotometer for phosphorus, Kyoto, Japan) and atomic emission (AA-7000 Shimadzu atomic absorption spectrophotometer for potassium, Kyoto, Japan) determination; and Corg by high-temperature catalytic combustion on a Shimadzu TOC-L CPN analyzer. Analyses were performed in triplicate. These soil parameters were selected as they are fundamental indicators of soil health. Moreover, they are highly sensitive to land use change, directly reflecting alterations in organic matter input, decomposition, and nutrient status. The Kruskal–Wallis and Dunn’s tests were used to compare soil properties across the three land use types because the data might not have met the assumptions of parametric tests (e.g., normality). Data were statistically processed in R (version 4.4.2).
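As an illustration of the rank-based comparison described above, the following minimal sketch computes the Kruskal–Wallis H statistic in plain Python. It is not the authors' R code: the sample values are hypothetical, no tie correction is applied, and the p-value uses the chi-square approximation, which for exactly three groups (two degrees of freedom) reduces to exp(−H/2). Dunn's post hoc test is not covered.

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (simplification: no tie correction)."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted_sum, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_sum += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted_sum - 3 * (n_total + 1)


def p_value_three_groups(h):
    """Chi-square survival function with 2 degrees of freedom."""
    return math.exp(-h / 2)


# Hypothetical Corg values (%) for illustration only, not the study data.
cropland = [2.1, 2.6, 1.9, 3.0]
hayfield = [1.7, 2.0, 1.5, 2.3]
forest = [5.8, 7.2, 4.9, 9.1]

h_stat = kruskal_wallis_h(cropland, hayfield, forest)
print(f"H = {h_stat:.2f}, p = {p_value_three_groups(h_stat):.4f}")
```

With small groups such as these, the chi-square approximation is rough, so a statistical package with exact or tie-corrected routines would normally be preferred.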

2.3. Digital Mapping Procedure

We tested several machine learning techniques for the spatial prediction of soil properties: Cubist, K-Nearest Neighbors (KNN), Random Forest (RF), and Support Vector Machines (SVM). The choice of these algorithms was driven by a study by Khaledian and Miller [35], who showed their effectiveness with a small training data set. Their full description and details can be found in the above-mentioned study [35].
The predictor variables (covariates) used in the machine learning models were derived from raw Sentinel-2A satellite imagery, accessed via Google Earth Engine [36]. We selected a cloud-free scene from 13 May 2023 to align closely with the timing of field sampling. Among the available spectral bands, B1 (coastal aerosol), B9 (water vapor), and B10 (cirrus) were excluded. Thus, the visible (red, green, blue), near-infrared (NIR), and short-wave infrared (SWIR) bands were used as predictors. Their technical characteristics are presented in Table 1. We resampled all bands to a spatial resolution of 20 m × 20 m. In addition, the Normalized Difference Vegetation Index (NDVI) was computed to enhance the predictive capability of the models.
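For reference, NDVI is the normalized difference of near-infrared and red reflectance. The sketch below assumes the conventional Sentinel-2 pairing of band 8 (NIR) and band 4 (red); the text does not state which bands were used, and the pixel values are hypothetical.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); None where the denominator is zero."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Hypothetical reflectance values for three pixels (B8 = NIR, B4 = red).
pixels = [
    {"B8": 3200, "B4": 800},   # dense vegetation
    {"B8": 1500, "B4": 1400},  # sparse cover or bare soil
    {"B8": 0, "B4": 0},        # no-data pixel
]
ndvi_values = [ndvi(p["B8"], p["B4"]) for p in pixels]
print(ndvi_values)
```

Returning None for a zero denominator keeps no-data pixels distinguishable from genuinely low NDVI values.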
To optimize the predictive performance of the models, we applied recursive feature elimination (RFE), a backward selection technique that iteratively removes the least important variables based on their permutation importance scores [37]. The process begins with the full set of covariates and progressively eliminates the weakest predictors in each iteration, retraining the model until only the most influential variables remain. This approach helps reduce model complexity, mitigate overfitting, and improve computational efficiency while maintaining or even enhancing prediction accuracy.

2.4. Model Validation and Interpretability

The performance of the machine learning models was assessed using a 5-fold cross-validation (CV) approach with fifty repetitions. Model accuracy was quantified using the root mean square error (RMSE), the coefficient of determination (R2), and the Ratio of Performance to Deviation (RPD). Lower RMSE values indicate better predictive precision, while a higher R2 value reflects a stronger explanatory power of the model in capturing the variability of the observed soil properties. According to the RPD metric, a model is poor when RPD < 1.4, fair when 1.4 < RPD < 1.8, good when 1.8 < RPD < 2.0, and very good when RPD > 2.
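The validation scheme and metrics above can be sketched as follows. This is an illustrative plain-Python version rather than the authors' workflow; it assumes that R2 is computed as 1 − SSres/SStot (some packages report the squared correlation instead), that RPD uses the sample standard deviation of the observations, and that boundary values of RPD fall into the lower class.

```python
import math
import random


def rmse(observed, predicted):
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def r_squared(observed, predicted):
    """Assumed definition: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """Ratio of Performance to Deviation: SD of observations / RMSE."""
    mean_obs = sum(observed) / len(observed)
    sd = math.sqrt(
        sum((o - mean_obs) ** 2 for o in observed) / (len(observed) - 1)
    )
    return sd / rmse(observed, predicted)


def rpd_class(value):
    """Classes from the text; boundary handling is an assumption."""
    if value < 1.4:
        return "poor"
    if value < 1.8:
        return "fair"
    if value <= 2.0:
        return "good"
    return "very good"


def repeated_kfold(n_samples, k=5, repeats=50, seed=0):
    """Yield (train, test) index lists for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        for fold in range(k):
            test = indices[fold::k]
            held_out = set(test)
            train = [i for i in indices if i not in held_out]
            yield train, test


splits = list(repeated_kfold(64))  # 64 samples, as in this study
print(len(splits), rpd_class(1.38), rpd_class(1.50))
```

Applied to the RPD values reported in Section 3.3, this classification places the Corg model (1.38) in the poor class and the N model (1.50) in the fair class.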
To evaluate the contribution of each predictor variable in the machine learning models, we employed model-agnostic methods, which are techniques applicable to any trained model regardless of its internal structure. Specifically, we applied the permutation-based variable importance method. This approach works by randomly shuffling the values of a single predictor and measuring the resulting decrease in model accuracy. A larger drop in performance indicates the higher importance of the permuted variable, as the model relies more heavily on it for making accurate predictions.
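The shuffling procedure described above can be illustrated with a short, model-agnostic sketch. The `predict` function here is a hypothetical stand-in for a trained model, the covariate values are invented, and importance is measured as the mean increase in RMSE over repeated shuffles; the actual study may have used a different loss or number of repetitions.

```python
import math
import random


def rmse(observed, predicted):
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Mean increase in RMSE after shuffling each column in turn."""
    rng = random.Random(seed)
    baseline = rmse(targets, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            permuted = [
                r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)
            ]
            error = rmse(targets, [predict(r) for r in permuted])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical covariates per sample: [band 11 reflectance, NDVI].
rows = [
    [1800, 0.61], [2100, 0.55], [2400, 0.48], [2700, 0.40],
    [3000, 0.33], [3300, 0.25], [3600, 0.20], [3900, 0.12],
]


def predict(row):
    """Stand-in 'trained model' that depends only on band 11."""
    return 8.0 - 0.002 * row[0]


targets = [predict(r) for r in rows]
importance = permutation_importance(predict, rows, targets)
print(dict(zip(["B11", "NDVI"], importance)))
```

Because the stand-in model ignores NDVI, shuffling that column leaves the error unchanged and its importance is zero, whereas shuffling band 11 degrades the predictions.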
Furthermore, we utilized SHAP (SHapley Additive exPlanations) [38], a game-theoretic approach that computes Shapley values to distribute a prediction fairly among the features, quantifying the individual contribution of each. We also explored the spatial distribution of the generated Shapley values by analyzing their relationship with key remote sensing variables. This provided spatially explicit insights into the drivers of soil biogeochemistry across the study area, showing not only the overall importance of each feature but also how its influence varied geographically.
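To make the idea of a Shapley value concrete, the sketch below computes exact Shapley values for a single prediction by enumerating all feature coalitions. It is a simplified illustration, not the SHAP library: absent features are replaced by a single baseline vector rather than averaged over background data, and the model and band values are hypothetical. Exact enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        contributions.append(total)
    return contributions


def predict(x):
    """Hypothetical linear stand-in model on [band 11, band 12]."""
    return 5.0 - 0.001 * x[0] + 0.002 * x[1]


instance = [2000, 1500]  # pixel being explained (hypothetical)
baseline = [3000, 1000]  # reference pixel, e.g., mean band values (assumed)

phi = shapley_values(predict, instance, baseline)
print(phi, predict(instance) - predict(baseline))
```

The contributions always sum to the difference between the prediction for the pixel and the prediction for the baseline, which is the additivity property that makes spatial maps of Shapley values interpretable in the units of the soil property.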

3. Results

3.1. Morphological Description of Soil Profile

The morphological description of the soil profile (Figure 2) revealed that the studied biocenoses were located on either sod-medium podzolic sandy loam soil over lake–glacial sandy loam deposits or alfhumic sandy loam cultivated land over lake–glacial deposits. The soil profile under the hayfield showed an eluvial–illuvial distribution of Corg (AY(0–4)—2.25%, P1(4–15)—1.13%, P2(15–25)—1.79%, BP(25–50)—1.81%, B2(50–80)—0.09%, BC(80–100)—0.08%, and C(100–120)—0.10%), with an accumulation of Corg in the illuvial horizon BP (25–50 cm)—1.81%, a characteristic feature of podzolic soils. The top horizons (AY, P1) were compacted and gradually passed into the lower ones. With depth, the density and moisture of the horizons increased.

3.2. Summary Statistics of Soil Properties

Table 2 presents the mean values and standard deviations for the soil properties across three land use types: cropland, hayfield, and forest. The topsoil Corg content showed a contrasting pattern, with a significantly higher mean in forest (6.1 ± 4.3%) than in cropland (2.4 ± 1.1%) and hayfield (1.9 ± 0.8%) (Figure 3). Kmob, N, and Ptot contents responded differently across the land use types. Kmob concentrations demonstrated a clear increasing trend from cropland (38.2 ± 15.6%) to hayfield (58.9 ± 36.7%) and forest (78.3 ± 42.1%), although the standard deviations highlight considerable variability. N content followed a similar pattern, increasing from cropland (7.6 ± 3.2 mg/100 g) to hayfield (15.2 ± 8.9 mg/100 g) and forest (31.2 ± 14.5 mg/100 g). In contrast, the highest mean values of Ptot and Pmob were observed in cropland (0.075 ± 0.024 and 0.023 ± 0.015%, respectively), with decreasing trends observed in hayfield and forest. Pmob content was significantly lower in forest than in cropland and hayfield. The standard deviations indicated substantial variability in soil properties.
To determine the correlation among the soil properties, Spearman’s correlation coefficients were calculated (Figure 4). We found a positive correlation between Ptot and Pmob, between Kmob and N, between Kmob and Corg, and between Corg and N. Pmob and N content had a negative relationship.
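Spearman's coefficient is the Pearson correlation of the ranked values, which makes it sensitive to monotonic rather than strictly linear relationships. A minimal plain-Python sketch with hypothetical paired measurements (not the study data) is:

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired values for illustration only.
corg = [1.8, 2.4, 3.1, 5.6, 7.9]
nitrogen = [9.0, 8.0, 14.0, 26.0, 35.0]
print(round(spearman(corg, nitrogen), 2))
```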

3.3. Soil Predictive Models

The prediction accuracy of the tested predictive models varied across the soil properties (Figure 5). The KNN and RF algorithms achieved the best performance for most target variables. These models explained 48–53% of the variance in Corg, 45–49% in Pmob, and 54–55% in N. However, the R2 values did not exceed 0.29 for the Kmob and Ptot models. The SVM approach showed intermediate performance, typically ranking third behind RF and Cubist.
Table 3 presents the best predictive models for each soil parameter. The RF model for Corg produced an RMSE of 2.15%, an R2 of 0.53, and an RPD of 1.38. For N content, the RF model demonstrated a relatively good performance, with an RMSE of 9.87 mg/100 g and the highest R2 (0.55) and RPD (1.50). For Ptot, the KNN technique performed best, achieving an RMSE of 0.02%, an R2 of 0.19, and an RPD of 1.22, whereas Pmob was predicted more accurately by the RF approach, with a lower RMSE (0.01%), a higher R2 (0.49), and an RPD of 1.43. The prediction of Kmob using the Cubist approach yielded an RMSE of 31.92%, an R2 of 0.29, and an RPD of 1.16.

3.4. Spatial Predictions

The predicted spatial patterns of soil properties differed clearly across the land use types (Figure 6). The highest Corg content was observed in the western, forested part of the site, while the lowest was under the croplands in the east. The Kmob content showed distinct differences across all three land use types. The highest N levels were observed under forest and the lowest under arable land, whereas the lowest Pmob content was under the forest.
Among the most important variables, band 11 (SWIR 1) was key for predicting Corg content (Figure 7). Bands 5 (red edge 1), 3 (green), and 12 (SWIR 2) were also important. Spatial variations in Kmob content were mainly explained by red edge band 5. For the N model, several bands were crucial simultaneously, including 12 (SWIR 2), 5 (red edge 1), and 2 (blue). Bands 12 (SWIR 2), 4 (red), and 3 (green) were the most important for Pmob predictions across the study area, whereas bands 3 (green) and 8 (NIR) contributed to Ptot modeling. Notably, NDVI was not a key predictor in any of the soil predictive models.
The dependence plots from the SHAP approach were generated for the best predictive models (i.e., Corg, N, and Pmob) and their key predictors (SWIR bands 11 and 12), assuming the most reliable results (Figure 8). Here, each plot shows how a single band affects soil property predictions in terms of the positive or negative contribution. The presented plots were not characterized by a linear relationship. Specifically, the reflectance values of band 11 up to 2500 had a positive contribution to Corg content predictions (under forest areas), whereas after 3000 the model tended to predict lower Corg values. A similar trend occurred for N content, where at band 12 values between 1500 and 1750, the SHAP values were positive, implying soils with higher N levels. After the threshold value of 2000, these values were negative, which corresponds to cropland and hayfield. In the case of Pmob, the graph was tortuous, showing a peak between band values 2200–2700. A noticeable negative contribution to the predictions was found at low band values (up to 1750).
The spatial visualization of the Shapley values clearly confirmed the above findings (Figure 9). For Corg and N, the positive Shapley values were located in the western parts, represented by forest vegetation, while the opposite was true under cropland and hayfield. The lowest Pmob values were concentrated under the forest, which was confirmed by the spatial patterns of the Shapley values that had negative values. The highest values were observed and predicted under cropland and hayfield, which had positive Shapley values (up to 0.005%).

4. Discussion

4.1. Soil Properties Across Land Use Types

The natural and climatic characteristics of the study area (temperate continental climate, undulating plain relief, sandy fluvioglacial soil-forming parent material, and a predominance of coniferous forests) promoted the formation of a soil cover significantly dominated by podzolic loamy sandy soils. As previously mentioned, these soils are poorly suited for agricultural use due to their unfavorable hydrophysical and agrochemical properties. However, the socio-economic conditions of the region’s development contributed to their active involvement in agriculture. Subsequently, in the post-Soviet period, low yields and poor economic efficiency led to the abandonment of some of the land [3,13].
If the pristine, anthropogenically undisturbed soil under forest vegetation is taken as the baseline standard for this study area, then long-term agricultural use contributed to a decrease in Corg content of 61–69% (arable and hayfield sites). Moreover, even the 10-year period that the site was under hayfield management did not promote the restoration of Corg content, which is generally characteristic of soils in this region. This is related to the direction in which the organic matter transformation process occurs. The transformation of organic matter annually entering the soil cover under the hayfield continues along the path of mineralization, whereas in soils located under forest vegetation, this process is directed towards its humification [39,40,41]. As a result, the notably lower Corg levels in cropland and hayfield soils result from the continual export of biomass, which removes carbon inputs, and from tillage practices that enhance decomposition rates.
A similar trend was observed with N, where its content decreased by 76% in the arable field and was 51% lower in the hayfield compared with the forest. However, in contrast to the situation with Corg, the recovery and accumulation of N are occurring in the hayfield. The absence of tillage in the hayfield reduces mineralization rates and prevents the rapid loss of N. This can also be related to the fixation of N compounds from the atmosphere by nodule bacteria, which actively function in the rhizosphere of fine root hairs at the end of the plant root system [42]. Furthermore, the positive relationship between N and Corg suggests a close coupling of the soil organic matter cycle with N availability, likely because these nutrients are released simultaneously during the decomposition of organic material.
Regarding mobile potassium, a reduced Kmob content was also observed compared with the forested site, with a decrease of 51% in the arable field and 25% in the hayfield. The decrease in potassium content in the arable soil occurs because, on the one hand, potassium does not form strong complexes with soil organic matter and tends to migrate, and, on the other hand, it is removed with agricultural products [15]. This is also supported by the lack of correlation between Kmob and Corg. In the hayfield soil, it is likely that the migration activity of potassium decreased, and a trend towards its accumulation appeared under the influence of the established vegetation.
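The relative decreases quoted above for Corg, N, and Kmob follow directly from the mean values reported in Section 3.2, taking the forest mean as the baseline. A short sketch of the arithmetic (the dictionary layout and function name are illustrative):

```python
# Mean values from Section 3.2 (units as reported there).
means = {
    "Corg": {"forest": 6.1, "cropland": 2.4, "hayfield": 1.9},
    "N": {"forest": 31.2, "cropland": 7.6, "hayfield": 15.2},
    "Kmob": {"forest": 78.3, "cropland": 38.2, "hayfield": 58.9},
}


def percent_decrease(baseline, value):
    """Decrease relative to the baseline, in percent."""
    return (baseline - value) / baseline * 100


decreases = {
    prop: {
        land_use: round(percent_decrease(vals["forest"], vals[land_use]))
        for land_use in ("cropland", "hayfield")
    }
    for prop, vals in means.items()
}
print(decreases)
```

The rounded results reproduce the figures in the text: 61% and 69% for Corg, 76% and 51% for N, and 51% and 25% for Kmob in cropland and hayfield, respectively.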
Meanwhile, the higher content of Pmob in the arable field (63% higher) and the hayfield (26% higher) compared with the forested site is due to the application of mineral phosphorus fertilizers. The content of Ptot is a relatively stable agrochemical indicator that is less susceptible to anthropogenic impact and more closely related to the properties of the soil-forming parent material. In our case, the varying content of this indicator in the studied sites is explained by the natural spatial heterogeneity of its distribution [43]. Moreover, the negative correlation between Pmob and N may suggest contrasting biogeochemical cycling in plant uptake or microbial processes.

4.2. Spatial Predictions of Soil Properties

According to the model validation, the highest accuracy was achieved for the N and Corg predictive models, while the Pmob model performed slightly worse (Table 3). In contrast, the accuracy of the Kmob and Ptot models was low. Given that the accuracy of spatial soil models rarely exceeds 0.7 (R2) [44], limitations may arise from the lack of relevant predictors explaining soil property variations, as well as errors in soil property measurements [45].
As previous studies have demonstrated, Corg is usually modeled more effectively than nutrients [46], based on the use of remote sensing data on bare soils such as cropland [47]. This is mainly explained by its distinct spectral signature and the strong correlation between soil color and Corg content [48]. Across heterogeneous landscapes with diverse land use types, spectral responses vary, and these variations can serve as indicators of underlying soil properties [32,49].
We demonstrated that remote sensing data alone clearly revealed a trend of property content differentiation across land use types due to distinct spectral signatures associated with croplands, hayfields, and forests (Figure 6). The observed differences in soil property distribution across land use types likely stem from variations in organic input, disturbance regimes, and management practices. Specifically, the highest concentrations of Corg, Kmob, and N were predicted in forested areas in the western part of the study site, while the lowest were found in croplands to the east.
Despite using NDVI as a predictor, its contribution to predicting all properties was negligible, whereas Sentinel-2A spectral bands proved to be key. According to a permutation approach (Figure 7), the most critical bands for building more reliable models were band 11 SWIR 1 (for Corg), band 5 red edge (for Kmob), and band 12 SWIR 2 (for N and Pmob). Similar results have previously been reported in other studies showing that SWIR bands were most significant for Corg prediction [50,51] and other properties [52]. The superior performance of specific spectral bands over NDVI highlights the importance of selecting tailored predictors for different soil properties [53]. These findings suggest that vegetation indices like NDVI may oversimplify soil–vegetation interactions, whereas narrow spectral bands better capture the heterogeneity of soil characteristics.
Future research aimed at improving prediction accuracy in heterogeneous landscapes should focus on (1) incorporating ancillary covariates (e.g., terrain attributes, climate variables, or land use history) [54,55] to enhance model explanatory power, (2) integrating multi-temporal remote sensing data to account for seasonal dynamics [56,57], and (3) testing methods that explicitly account for spatial autocorrelation (e.g., geostatistical approaches or hybrid machine learning models) [58]. These steps could significantly improve prediction accuracy across diverse environments.

5. Conclusions

Soils are receiving increased attention worldwide due to their importance in agriculture and food security. The results of this study revealed that sod-podzolic soils under different land use types (cropland, hayfield, and forest) differ in their soil property contents. The interpreted predictive models based on the remote sensing data showed clear patterns and relationships between land use types and soil parameters. Forest soils exhibited the highest Corg and N contents, mainly due to continuous organic matter accumulation and minimal external disturbance. Croplands, in contrast, showed elevated phosphorus levels, likely reflecting fertilization practices, but the lowest N and Kmob values due to anthropogenic influence. Hayfields displayed intermediate characteristics, with reduced Corg associated with periodic biomass removal. The generated digital maps of soil properties and Shapley values demonstrated that the soil properties differed clearly across land use types, which is explained by the distinct spectral reflectance of each landscape. This study offers valuable insights for precision agriculture, carbon sequestration strategies, and ecosystem restoration efforts in similar boreal regions. The emphasis on interpretability in machine learning models provides a template for subsequent research aiming to understand and predict soil property changes resulting from land use alterations.

Author Contributions

Conceptualization, R.S. and M.Y.; Funding acquisition, R.S., M.Y. and O.B.; Investigation, R.S., M.Y., O.B., T.P. and D.Z.; Methodology, R.S., M.Y., T.P., G.M. and A.S.; Project administration, O.B.; Software, A.K., A.V., D.Z. and A.S.; Visualization, D.Z., A.V. and A.S.; Writing—original draft, R.S., M.Y., A.K., A.V. and A.S.; Writing—review and editing, all authors. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Science and Higher Education of the Russian Federation № FMEN 2022–0013, FMEN 2022–0014.

Data Availability Statement

The data presented in this study are available on reasonable request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Subedi, Y.R.; Kristiansen, P.; Cacho, O. Drivers and consequences of agricultural land abandonment and its reutilisation pathways: A systematic review. Environ. Dev. 2022, 42, 100681.
  2. Cao, J.; Zhang, M.; Chen, E. The Dynamic Effects of Ecosystem Services Supply and Demand on Air Quality: A Case Study of the Yellow River Basin, China. Pol. J. Environ. Stud. 2025.
  3. Prishchepov, A.V.; Müller, D.; Dubinin, M.; Baumann, M.; Radeloff, V.C. Determinants of agricultural land abandonment in post-Soviet European Russia. Land Use Policy 2013, 30, 873–884.
  4. Prishchepov, A.V.; Ponkina, E.V.; Sun, Z.; Bavorova, M.; Yekimovskaja, O.A. Revealing the intentions of farmers to recultivate abandoned farmland: A case study of the Buryat Republic in Russia. Land Use Policy 2021, 107, 105513.
  5. Novara, A.; Gristina, L.; Sala, G.; Galati, A.; Crescimanno, M.; Cerdà, A.; Badalamenti, E.; La Mantia, T. Agricultural land abandonment in Mediterranean environment provides ecosystem services via soil carbon sequestration. Sci. Total Environ. 2017, 576, 420–429.
  6. Bell, S.M.; Barriocanal, C.; Terrer, C.; Rosell-Melé, A. Management opportunities for soil carbon sequestration following agricultural land abandonment. Environ. Sci. Policy 2020, 108, 104–111.
  7. Romero-Díaz, A.; Pérez-Morales, A.; Marín-Sanleandro, P. Prevalence, causes, and consequences of agricultural land abandonment: A case study in the Region of Murcia, Spain. Catena 2024, 241, 108071.
  8. Rodrigo-Comino, J.; Martínez-Hernández, C.; Iserloh, T.; Cerdà, A. Contrasted impact of land abandonment on soil erosion in Mediterranean agriculture fields. Pedosphere 2018, 28, 617–631.
  9. Nadal-Romero, E.; Rubio, P.; Kremyda, V.; Absalah, S.; Cammeraat, E.; Jansen, B.; Lasanta, T. Effects of agricultural land abandonment on soil organic carbon stocks and composition of soil organic matter in the Central Spanish Pyrenees. Catena 2021, 205, 105441.
  10. Rebola, L.C.; Paz, C.P.; Gamarra, L.V.; Burslem, D.F.R.P. Land use intensity determines soil properties and biomass recovery after abandonment of agricultural land in an Amazonian biodiversity hotspot. Sci. Total Environ. 2021, 801, 149487.
  11. Wu, L.; Ren, C.; Jiang, H.; Zhang, W.; Chen, N.; Zhao, X.; Wei, G.; Shu, D. Land abandonment transforms soil microbiome stability and functional profiles in apple orchards of the Chinese Loess Plateau. Sci. Total Environ. 2024, 906, 167556.
  12. Rautiainen, A.; Virtanen, T.; Kauppi, P.E. Land cover change on the Isthmus of Karelia 1939–2005: Agricultural abandonment and natural succession. Environ. Sci. Policy 2016, 55, 127–134.
  13. Andreeva, O.V.; Kust, G.S. Land assessment in Russia based on the concept of land degradation neutrality. Reg. Res. Russ. 2020, 10, 593–602.
  14. Ageenko, P.A. The problem of rational land use as a factor in ensuring food security in the Republic of Karelia. Bull. Fac. Land Manag. St.-Petersburg State Agrar. Univ. 2020, 6, 47–50.
  15. Kotova, Z.P.; Kotov, S.E.; Kuznetsova, L.A. Dynamics of soil fertility indicators’ changes in agricultural lands of Karelia. Proc. Petrozavodsk. State Univ. 2017, 2, 32–38.
  16. Frieva, N.A. The efficiency of use of land as a factor of development of the agrarian sector of the European North of Russia. Sci. Bull. South. Inst. Manag. 2018, 4, 33–44.
  17. Svidskaia, I.A.S.; Khorolskii, I.V. Involvement of unused or inefficiently used land in the economic turnover as a way to manage territories. Experience of Karelia. Prop. Relat. Russ. Fed. 2021, 9, 21–25.
  18. Polunin, G.A. Establishing unfavorable territories for agricultural production, depending on the suitability of the land. Econ. Labor Manag. Agric. 2020, 11, 81–87.
  19. Moshkina, Y.V.; Medvedeva, M.V.; Tuyunen, A.V.; Karpechko, A.Y.; Genikova, N.V.; Dubrovina, I.A.; Mamai, A.V.; Sidorova, V.A.; Tolstoguzov, O.V.; Kulakova, L.M. Patterns of natural forest ecosystem regeneration in abandoned farmland (the case of the southern agro-climatic district of Karelia). Biosfera 2019, 11, 134–145.
  20. Sidorova, V.A. Evaluation of possibilities of using fallow land in agriculture in Karelia. Mod. Sci. Success 2016, 5, 146–149.
  21. Enchilik, P.R.; Chechenkov, P.D.; Yu, G.-H.; Semenkov, I.N. Partitioning of Available P and K in Soils During Post-Agricultural Pine and Spruce Reforestation in Smolensk Lakeland National Park, Russia. Forests 2025, 16, 845.
  22. Nizamutdinov, T.; Yang, S.; Abakumov, E. Post-Agricultural Shifts in Soils of Subarctic Environment on the Example of Plaggic Podzols Chronosequence. Agronomy 2025, 15, 584.
  23. Bryk, M. Study on the Physical Properties of a Forest Glossic Retisol Developed from Loess in the Lublin Upland, SE Poland. Soil Sci. Ann. 2023, 74, 1–17.
  24. Kochiieru, M.; Lamorski, K.; Feizienė, D.; Feiza, V.; Šlepetienė, A.; Volungevičius, J. Land Use and Soil Types Affect Macropore Network, Organic Carbon and Nutrient Retention, Lithuania. Geoderma Reg. 2022, 28, e00473.
  25. Muraškienė, M.; Armolaitis, K.; Varnagirytė-Kabašinskienė, I.; Baliuckas, V.; Aleinikovienė, J. Evaluation of Soil Organic Carbon Stability in Different Land Uses in Lithuania. Sustainability 2023, 15, 16042.
  26. Mulder, V.L.; de Bruin, S.; Schaepman, M.E.; Mayr, T.R. The Use of Remote Sensing in Soil and Terrain Mapping—A Review. Geoderma 2011, 162, 1–19.
  27. Odebiri, O.; Odindi, J.; Mutanga, O. Basic and Deep Learning Models in Remote Sensing of Soil Organic Carbon Estimation: A Brief Review. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102389.
  28. Babaeian, E.; Sadeghi, M.; Jones, S.B.; Montzka, C.; Vereecken, H.; Tuller, M. Ground, Proximal, and Satellite Remote Sensing of Soil Moisture. Rev. Geophys. 2019, 57, 530–616.
  29. Suleymanov, A.; Gabbasova, I.; Abakumov, E.; Kostecki, J. Soil Salinity Assessment from Satellite Data in the Trans-Ural Steppe Zone (Southern Ural, Russia). Soil Sci. Annu. 2021, 72, 132233.
  30. Wang, J.; Zhen, J.; Hu, W.; Chen, S.; Lizaga, I.; Zeraatpisheh, M.; Yang, X. Remote Sensing of Soil Degradation: Progress and Perspective. Int. Soil Water Conserv. Res. 2023, 11, 429–454.
  31. Kebebew, S.; Bedadi, B.; Erkossa, T.; Yimer, F.; Wogi, L. Effect of Different Land-Use Types on Soil Properties in Cheha District, South-Central Ethiopia. Sustainability 2022, 14, 1323.
  32. Suleymanov, A.; Abakumov, E.; Polyakov, V.; Kozlov, A.; Saby, N.; Kuzmenko, P.; Telyagissov, S.; Coblinski, J. Estimation and Mapping of Soil pH in Urban Landscapes. Geoderma Reg. 2025, 40, e00919.
  33. IUSS Working Group WRB. World Reference Base for Soil Resources. In International Soil Classification System for Naming Soils and Creating Legends for Soil Maps, 4th ed.; International Union of Soil Sciences: Vienna, Austria, 2022.
  34. Piikki, K.; Wetterlind, J.; Söderström, M.; Stenberg, B. Perspectives on Validation in Digital Soil Mapping of Continuous Attributes—A Review. Soil Use Manag. 2021, 37, 7–21.
  35. Khaledian, Y.; Miller, B.A. Selecting Appropriate Machine Learning Methods for Digital Soil Mapping. Appl. Math. Model. 2020, 81, 401–418.
  36. Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-Scale Geospatial Analysis for Everyone. Remote Sens. Environ. 2017, 202, 18–27.
  37. Gregorutti, B.; Michel, B.; Saint-Pierre, P. Correlation and Variable Importance in Random Forests. Stat. Comput. 2017, 27, 659–678.
  38. Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 4768–4777.
  39. Dubrovina, I.A.; Moshkina, E.V.; Tuyunen, A.V.; Genikova, N.V.; Karpechko, A.Y.; Medvedeva, M.V. Dynamics of soil properties and ecosystem carbon stocks for different types of land use (Middle Taiga of Karelia). Eurasian Soil Sci. 2022, 55, 1209–1221.
  40. Dubrovina, I.A.; Moshkina, E.V.; Tuyunen, A.V.; Genikova, N.V.; Karpechko, A.Y.; Medvedeva, M.V. Ecosystem carbon stock in iron-metamorphic soils with different types of land use in South Karelia. Eurasian Soil Sci. 2024, 57, 1567–1578.
  41. Dubrovina, I.A.; Sidorova, V.A.; Moshkina, E.V.; Tuyunen, A.V.; Karpechko, A.Y.; Genikova, N.V.; Medvedeva, M.V.; Mamai, A.V.; Tolstoguzov, O.V.; Kulakova, L.M. The impact of land use on soil properties and structure of ecosystem carbon stocks in the Middle Taiga Subzone of Karelia. Eurasian Soil Sci. 2021, 54, 1756–1769.
  42. Dorosinsky, L.M. Competitive ability of nodule bacteria. In Biological Nitrogen in Agriculture of the USSR; Mishustin, E.N., Ed.; Nauka: Moscow, Russia, 1989; pp. 27–34.
  43. Dubrovina, I.A. Change in the content of total carbon, nitrogen and phosphorus in the boreal soils of the Republic of Karelia when used in agriculture. Tomsk. State Univ. J. Biol. 2018, 41, 27–41.
  44. Žížala, D.; Minařík, R.; Skála, J.; Beitlerová, H.; Juřicová, A.; Reyes Rojas, J.; Penížek, V.; Zádorová, T. High-Resolution Agriculture Soil Property Maps from Digital Soil Mapping Methods, Czech Republic. CATENA 2022, 212, 106024.
  45. Heuvelink, G.B.M. Uncertainty and Uncertainty Propagation in Soil Mapping and Modelling. In Pedometrics; McBratney, A.B., Minasny, B., Stockmann, U., Eds.; Progress in Soil Science; Springer International Publishing: Cham, Switzerland, 2018; pp. 439–461. ISBN 978-3-319-63439-5.
  46. Liu, F.; Wu, H.; Zhao, Y.; Li, D.; Yang, J.-L.; Song, X.; Shi, Z.; Zhu, A.-X.; Zhang, G.-L. Mapping High Resolution National Soil Information Grids of China. Sci. Bull. 2022, 67, 328–340.
  47. Suleymanov, A.; Suleymanov, R.; Gabbasova, I.; Saifullin, I. Field-Scale Digital Mapping of Top- and Subsoil Chernozem Properties. Precis. Agric. 2024, 25, 1636–1657.
  48. Castaldi, F.; Palombo, A.; Santini, F.; Pascucci, S.; Pignatti, S.; Casa, R. Evaluation of the Potential of the Current and Forthcoming Multispectral and Hyperspectral Imagers to Estimate Soil Texture and Organic Carbon. Remote Sens. Environ. 2016, 179, 54–65.
  49. Mahmoudabadi, E.; Karimi, A.; Haghnia, G.H.; Sepehr, A. Digital Soil Mapping Using Remote Sensing Indices, Terrain Attributes, and Vegetation Features in the Rangelands of Northeastern Iran. Environ. Monit. Assess. 2017, 189, 500.
  50. Urbina-Salazar, D.; Vaudour, E.; Baghdadi, N.; Ceschia, E.; Richer-de-Forges, A.C.; Lehmann, S.; Arrouays, D. Using Sentinel-2 Images for Soil Organic Carbon Content Mapping in Croplands of Southwestern France. The Usefulness of Sentinel-1/2 Derived Moisture Maps and Mismatches between Sentinel Images and Sampling Dates. Remote Sens. 2021, 13, 5115.
  51. Žížala, D.; Minařík, R.; Zádorová, T. Soil Organic Carbon Mapping Using Multispectral Remote Sensing Data: Prediction Ability of Data with Different Spatial and Spectral Resolutions. Remote Sens. 2019, 11, 2947.
  52. Vaudour, E.; Gomez, C.; Fouad, Y.; Lagacherie, P. Sentinel-2 Image Capacities to Predict Common Topsoil Properties of Temperate and Mediterranean Agroecosystems. Remote Sens. Environ. 2019, 223, 21–33.
  53. Kasraei, B.; Schmidt, M.G.; Zhang, J.; Bulmer, C.E.; Filatow, D.S.; Arbor, A.; Pennell, T.; Heung, B. A Framework for Optimizing Environmental Covariates to Support Model Interpretability in Digital Soil Mapping. Geoderma 2024, 445, 116873.
  54. Dvornikov, Y.A.; Vasenev, V.I.; Romzaykina, O.N.; Grigorieva, V.E.; Litvinov, Y.A.; Gorbov, S.N.; Dolgikh, A.V.; Korneykova, M.V.; Gosse, D.D. Projecting the Urbanization Effect on Soil Organic Carbon Stocks in Polar and Steppe Areas of European Russia by Remote Sensing. Geoderma 2021, 399, 115039.
  55. Hu, B.; Geng, Y.; Shi, K.; Xie, M.; Ni, H.; Zhu, Q.; Qiu, Y.; Zhang, Y.; Bourennane, H. Fine-Resolution Baseline Maps of Soil Nutrients in Farmland of Jiangxi Province Using Digital Soil Mapping and Interpretable Machine Learning. CATENA 2025, 249, 108635.
  56. Chinilin, A.; Lozbenev, N.; Shilov, P.; Fil, P.; Levchenko, E.; Kozlov, D. Synergetic Use of Bare Soil Composite Imagery and Multitemporal Vegetation Remote Sensing for Soil Mapping (A Case Study from Samara Region’s Upland). Land 2024, 13, 2229.
  57. Maynard, J.J.; Levi, M.R. Hyper-Temporal Remote Sensing for Digital Soil Mapping: Characterizing Soil-Vegetation Response to Climatic Variability. Geoderma 2017, 285, 94–109.
  58. Heuvelink, G.B.M.; Webster, R. Spatial Statistics and Soil Mapping: A Blossoming Partnership under Pressure. Spat. Stat. 2022, 50, 100639.
Figure 1. The location of the study area (a,b); the investigation area with the sample points across different land use types (b) (image source—Google maps).
Figure 2. Morphological description of the soil profile under hayfield.
Figure 3. Kruskal–Wallis’s and Dunn’s analysis (p < 0.05). Different letters indicate significant differences in the distribution of the values across the land use types.
Figure 4. Pairwise correlation matrix among soil properties.
Figure 5. Comparison of machine learning methods for each soil property in terms of R².
Figure 6. Digital maps of soil properties using machine learning procedures and remote sensing data.
Figure 7. Variable importance assessment in the models after the RFE procedure.
Figure 8. Effects of SWIR bands (11 and 12) on soil properties (Corg, N, and Pmob) based on SHAP approach. The x-axis shows the value of each band, and the y-axis shows the SHAP value (positive or negative) related to that specific band. Each dot represents a single data point (soil sample with a measured property) from the dataset. The green dashed line is a smoothed curve fitted on the SHAP values.
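To make the SHAP values plotted on the y-axis more concrete, the following Python sketch computes exact Shapley values for a two-predictor toy model. It is a simplified illustration under stated assumptions: features outside a coalition are replaced by a single baseline value (SHAP implementations typically average over background data instead), and the model, the baseline, and the sample reflectances are hypothetical and unrelated to the fitted models in this study.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size (excluding feature i).
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(row):
    """Hypothetical stand-in for a fitted model on [B11, B12] reflectance."""
    b11, b12 = row
    return 6.0 - 10.0 * b11 - 5.0 * b12 + 8.0 * b11 * b12


baseline = [0.20, 0.15]  # hypothetical reference reflectance values
sample = [0.10, 0.10]    # hypothetical pixel with lower SWIR reflectance
phi = shapley_values(toy_model, sample, baseline)
for name, contribution in zip(["B11", "B12"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the property that lets each dot in a dependence plot be read as one band's share of a single prediction.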
Figure 9. Spatial visualization of the Shapley values for SWIR bands (11 and 12). Lighter color corresponds to a positive contribution to soil property predictions (i.e., higher values) and darker color indicates a negative contribution (i.e., lower values).
Table 1. Derived Sentinel-2A bands and spectral attributes used for spatial modeling.
| Band/Spectral Index | Resolution | Central Wavelength | Description |
|---|---|---|---|
| B2 | 10 m | 490 nm | Blue |
| B3 | 10 m | 560 nm | Green |
| B4 | 10 m | 665 nm | Red |
| B5 | 20 m | 705 nm | Red edge 1 |
| B6 | 20 m | 740 nm | Red edge 2 |
| B7 | 20 m | 783 nm | Red edge 3 |
| B8 | 10 m | 842 nm | Near-infrared 1 (NIR) |
| B8a | 20 m | 865 nm | Near-infrared 2 (NIR) |
| B11 | 20 m | 1610 nm | Short-wave Infrared 1 (SWIR) |
| B12 | 20 m | 2190 nm | Short-wave Infrared 2 (SWIR) |
| NDVI | 20 m | - | Normalized Difference Vegetation Index |
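For reference, NDVI is the normalized difference between near-infrared and red reflectance, (NIR − Red) / (NIR + Red). The short Python sketch below applies that formula. Pairing B8 with B4 is an assumption here, since the table does not state which near-infrared band was used, and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat no-signal pixels as 0 rather than raising
    return (nir - red) / total


# Hypothetical surface reflectance values (B8 = NIR, B4 = red) for three pixels.
pixels = {
    "dense canopy": (0.45, 0.05),
    "sparse cover": (0.30, 0.15),
    "bare soil": (0.25, 0.20),
}
for label, (b8, b4) in pixels.items():
    print(f"{label}: NDVI = {ndvi(b8, b4):.2f}")
```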
Table 2. The average value (±standard deviation) of soil parameters across the land use types.
| Soil Property/Land Use Type | Cropland | Haying | Forest |
|---|---|---|---|
| Corg, % | 2.4 ± 1.1 | 1.9 ± 0.8 | 6.1 ± 4.3 |
| Kmob, % | 38.2 ± 15.6 | 58.9 ± 36.7 | 78.3 ± 42.1 |
| N, mg/100 g | 7.6 ± 3.2 | 15.2 ± 8.9 | 31.2 ± 14.5 |
| Ptot, % | 0.075 ± 0.024 | 0.058 ± 0.019 | 0.046 ± 0.021 |
| Pmob, % | 0.023 ± 0.015 | 0.017 ± 0.008 | 0.005 ± 0.003 |
Table 3. Prediction performance of the best predictive model in terms of error metrics.
| Soil Property | Model | RMSE ¹ | R² | RPD |
|---|---|---|---|---|
| Corg | RF | 2.15 | 0.53 | 1.38 |
| Kmob | Cubist | 31.92 | 0.29 | 1.16 |
| N | RF | 9.87 | 0.55 | 1.50 |
| Ptot | KNN | 0.02 | 0.19 | 1.22 |
| Pmob | RF | 0.01 | 0.49 | 1.43 |
¹ RMSE values reported for each soil property are expressed in the same units as the respective property (Table 2).
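The three metrics in Table 3 can be sketched in Python as follows. The definitions are the commonly used ones and are assumptions here: R² is taken as 1 − SS_res/SS_tot, and RPD as the sample standard deviation of the observations divided by the RMSE. The article may define them differently (for example, R² as a squared correlation), and the observed and predicted values below are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observed property."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rpd(observed, predicted):
    """Ratio of performance to deviation: sample SD of observations / RMSE."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sd = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / (n - 1))
    return sd / rmse(observed, predicted)


# Hypothetical observed and predicted values for one soil property.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.0, 4.5, 4.5]
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"R2   = {r_squared(observed, predicted):.3f}")
print(f"RPD  = {rpd(observed, predicted):.3f}")
```

If RPD is defined this way, the Corg row (RMSE 2.15, RPD 1.38) would correspond to a standard deviation of roughly 2.15 × 1.38 ≈ 2.97 in the evaluation data.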
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Suleymanov, R.; Yurkevich, M.; Bakhmet, O.; Popova, T.; Kungurtsev, A.; Zakirov, D.; Vittsenko, A.; Mishra, G.; Suleymanov, A. Interpretable Machine Learning and Remote Sensing Data Reveal Soil Biogeochemistry Patterns in Agricultural Systems. Land 2025, 14, 1881. https://doi.org/10.3390/land14091881
