Article

The Basic Soil Structure Parameters and Their Spatial Prediction Using Machine Learning and Remote Sensing Data in Semi-Arid Trans-Ural Steppe Zone, Russia

by Azamat Suleymanov *, Mikhail Komissarov, Ruslan Suleymanov and Ilyusya Gabbasova
Laboratory of Soil Science, Ufa Institute of Biology, Ufa Federal Research Centre, Russian Academy of Sciences, Ufa 450054, Russia
* Author to whom correspondence should be addressed.
Soil Syst. 2026, 10(1), 11; https://doi.org/10.3390/soilsystems10010011
Submission received: 10 November 2025 / Revised: 6 January 2026 / Accepted: 8 January 2026 / Published: 12 January 2026

Abstract

Soil structure is one of the key soil water-physical properties that determine the water–air regime and ultimately affect soil fertility. This study aimed to test different machine learning (ML) methods in combination with environmental variables (soil and climate) and remote sensing data derived from Landsat 8 for prediction of key structure parameters of topsoil (0–25 cm) in semi-arid areas (Trans-Ural steppe zone, Republic of Bashkortostan, Russia). All studied soil types (Chernozems (n = 24), Solonchaks (n = 9), and Solonetzes (n = 12)) were characterized by an “excellent” aggregate state (the average structural coefficient (Ks) was 6.52, 11.23, and 5.70, respectively) and “good” resistance of aggregates to destruction by water (soil aggregate stability coefficient (Ksas) of 0.67, 0.65, and 0.70, respectively). The soils had a high proportion of agronomically valuable aggregates (0.25–10 mm, mesoaggregates (MEA)), and a low proportion of blocky/lumpy (>10 mm, macroaggregates (MAA)) and fine/dusty (<0.25 mm, microaggregates (MIA)) ones. In particular, the average share of MIA, MEA, and MAA in Chernozem was 7.63, 83.20, and 11.73%, and in Solonchak, 4.24, 87.91, and 8.48%, respectively. After wet sieving, no water-stable macroaggregates (WSMAA) were identified in any of the studied soils (they were destroyed by water); the proportions of water-stable mesoaggregates (WSMEA) and microaggregates (WSMIA) were 65.92 and 39.67% in Chernozems, 74.95 and 22.54% in Solonchaks, and 66.77 and 33.22% in Solonetz soils, respectively. Under the ML framework, the best model was achieved for Ksas predictions (R2 = 0.50 and RMSE = 0.17), where spectral indices (NDWI, EVI, SAVI, and NDVI) were the main predictors. Models for the remaining properties explained 22–30% of the variance.
The findings of this study can be valuable in further endeavors for soil water-physical mapping and accelerate the adoption of measures for land management/reclamation planning for landscapes with similar (arid and semi-arid) natural climatic conditions.

1. Introduction

The sustainable management of agricultural and natural ecosystems in semi-arid regions depends on an understanding of soil physical properties and their spatial patterns. These characteristics are important determinants of hydrological processes [1], nutrient cycling [2], and vegetation productivity [3]. In addition, water-physical properties, along with agrochemical ones, determine soil fertility and resistance to degradation processes. The key water-physical properties of soil include its structure—i.e., a collection of aggregates of various shapes and sizes, mechanical strength, and water resistance, into which the soil disintegrates in its natural state. Soil structure affects water permeability (and moisture preservation), air permeability, and nutrient retention. Good soil structure promotes root development and improves plant growth. Knowing the structure allows for choosing optimal soil tillage methods and preventing its compaction and erosion [4]. Moreover, in the face of land degradation and climate change, accurately assessing the spatial variability of water-physical properties is important for developing effective soil conservation strategies. However, traditional field-based methods for mapping these properties are prohibitively expensive, time-consuming, and inadequate for capturing their complex spatial patterns.
Digital soil mapping (DSM) has provided a powerful framework for these tasks using state-of-the-art predictive techniques at different scales. Among the approaches, machine learning (ML) algorithms are robust tools due to their ability to handle non-linear relationships [5]. Currently, numerous techniques are used for DSM purposes, including for modeling of soil physical parameters. Khosravani et al. [6] tested three ML algorithms for spatial modeling of physical and mechanical properties in Iran. Alonso-Sarria et al. [7] predicted clay, silt, and sand contents using four ML methods in Spain. Ließ et al. [8] compared regression tree and random forest models in the spatial prediction of soil texture.
While ML approaches have shown promise in spatial modeling of soil physical properties, their application to soil structural stability, under both dry and wet sieving, remains limited [9,10]. We hypothesize that ML algorithms can predict soil aggregate stability by integrating multiple environmental variables, thus addressing the current gap in understanding how ML methods perform in modeling soil structure across different aggregation conditions. At the same time, as practice shows, various predictive methods can demonstrate different performance [5]. Hence, selecting the appropriate ML model for specific soil parameters is necessary.
Given this context, the key aims of this study are (1) to investigate the key soil structure parameters (structural state, including aggregate distribution and its stability) across several soil types; (2) to compare the predictive performance of multiple ML algorithms for spatial prediction of soil properties; and (3) to identify the most important environmental predictors, including single-date Landsat 8 imagery alongside climatic and auxiliary soil variables, governing the spatial distribution of soil properties.

2. Materials and Methods

2.1. Study Area and Sampling

The study was conducted in the Trans-Ural steppe zone within the Republic of Bashkortostan, Russia (Figure 1). The climate in the study area is continental, with cold winters and warm, dry summers (Dfb according to the Köppen climate classification [11]). Mean annual temperatures range from 1 to 3 °C, with January averages reaching −16 °C and July averages around 20 °C, according to the closest meteorological station, “Akyar” (51.867543 N, 58.220405 E). Annual precipitation is relatively low, averaging 300–400 mm, and is concentrated primarily in the summer months; however, rainfall is infrequent, which leads to periodic drought conditions. Dominant soil types include Chernozems (black soils) and salt-affected soils (Solonchaks and Solonetzes), according to the WRB system [12]. Salt-affected soils mostly occur in combination with Chernozems and do not form large continuous areas; instead, they appear as localized hotspots. The salinity types of these soils are sulfate, chloride-sulfate, and mixed. Soil salinization is associated with a high content of water-soluble salts inherited from Tertiary seas, mineralized groundwater, and the arid climate. The land use/land cover of the research site is mostly represented by virgin or abandoned agricultural lands. The vegetation is poorly developed and mainly represented by steppe plants and various halophytes (e.g., Volga fescue (Festuca valesiaca), European feather grass (Stipa pennata), picklegrass (Salicornia), etc.) [13]. The predominant parent materials are diluvial yellow-brown carbonate clays and heavy loams [13].
A total of 45 geo-referenced soil samples were collected from the topsoil layer (0−25 cm) using a targeted sampling scheme with an emphasis on soil type variability (Figure 1c). Samples were taken (using a small shovel) from three types of soils: Chernozem (n = 24), Solonchak (n = 9), and Solonetz (n = 12). The geographical coordinates of each sampling point were recorded with a high-precision GPS receiver (Garmin, Olathe, KS, USA).

2.2. Soil Analyses

Stones and tree/plant roots were removed from the undisturbed (not ground or sieved) samples; the samples were then air-dried to constant weight and analyzed for water-physical soil properties according to the methodology of Vadyunina and Korchagina [14]. In particular, the structural-aggregate composition was determined by dry sieving using meshes of 10 and 0.25 mm. Blocky/lumpy aggregates (>10 mm) were categorized as macroaggregates (MAA, %), agronomically valuable aggregates (0.25–10 mm) as mesoaggregates (MEA, %), and fine/dust aggregates (<0.25 mm) as microaggregates (MIA, %). The structural coefficient (Ks), the main indicator of the quality of soil aggregate composition, was estimated according to Equation (1):
$$K_s = \frac{MEA}{MAA + MIA} \tag{1}$$
Soil aggregate stability (SAS) was measured using a Baksheev device (Vibrotehnic, Saint Petersburg, Russia) under a wet sieving procedure, in which dry aggregates were swung in water for 10 min on meshes of 10 and 0.25 mm, in three replications. The proportions of water-stable macro- (>10 mm), meso- (0.25–10 mm), and microaggregates (<0.25 mm) (WSMAA, WSMEA, and WSMIA, respectively) were then determined by the standard weight method. The SAS coefficient (Ksas) was calculated from Equation (2):
$$K_{sas} = \frac{\Sigma_w}{\Sigma_d} \tag{2}$$
where Σw is the sum of aggregates > 0.25 mm under wet sieving (water-stable aggregates) and Σd is the sum of aggregates > 0.25 mm under dry sieving.
Ks and Ksas were graded into categories according to the Russian classification [14]. For Ks, a value > 1.5 is classified as “excellent”, 0.67–1.5 as “good”, and <0.67 as “unsatisfactory”. Ksas was classified as follows: <0.30, “unsatisfactory”; 0.30–0.40, “satisfactory”; 0.40–0.75, “good”; and >0.75, “excessively high”.
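As an illustration of Equations (1) and (2) and the grading thresholds, the following Python sketch computes both coefficients for a single sample and assigns the categories. It is not the authors' workflow (the study used R); the function names and the example fractions are hypothetical, the lowest Ks class is assumed to cover values below 0.67, and the handling of values falling exactly on a class boundary is also an assumption, since the text does not specify it.

```python
def structural_coefficient(mea, maa, mia):
    """Ks = MEA / (MAA + MIA), Equation (1); inputs are dry-sieving fractions in %."""
    coarse_and_fine = maa + mia
    if coarse_and_fine == 0:
        # Assumption: a sample made up only of mesoaggregates has unbounded Ks.
        return float("inf")
    return mea / coarse_and_fine


def stability_coefficient(wet_gt_025, dry_gt_025):
    """Ksas = sum_w / sum_d, Equation (2); shares of aggregates > 0.25 mm in %."""
    if dry_gt_025 <= 0:
        raise ValueError("dry-sieved share of aggregates > 0.25 mm must be positive")
    return wet_gt_025 / dry_gt_025


def classify_ks(ks):
    """Grade Ks; boundary handling (>= 0.67 counted as 'good') is an assumption."""
    if ks > 1.5:
        return "excellent"
    if ks >= 0.67:
        return "good"
    return "unsatisfactory"


def classify_ksas(ksas):
    """Grade Ksas; boundary handling at 0.30, 0.40 and 0.75 is an assumption."""
    if ksas < 0.30:
        return "unsatisfactory"
    if ksas < 0.40:
        return "satisfactory"
    if ksas <= 0.75:
        return "good"
    return "excessively high"


# Hypothetical sample: 80% MEA, 12% MAA, 8% MIA after dry sieving;
# 60% of the mass stays in aggregates > 0.25 mm after wet sieving.
ks = structural_coefficient(mea=80.0, maa=12.0, mia=8.0)
ksas = stability_coefficient(wet_gt_025=60.0, dry_gt_025=80.0 + 12.0)
print(ks, classify_ks(ks))
print(round(ksas, 2), classify_ksas(ksas))
```

Note that the mean Ks of a soil type is the average of per-sample ratios, so it generally differs from the ratio computed from mean fractions.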

2.3. Environmental Predictors

A suite of environmental covariates, representing key soil-forming factors, was assembled to serve as predictors for ML models. The covariate dataset included remote sensing data (RSD), climate, and existing soil maps.
RSD was represented by spectral data from Landsat 8 satellite imagery. A “single-date” approach was chosen to capture the soil and surface conditions at specific, phenologically significant moments [15]. To minimize the influence of atmospheric contaminants, cloud-free, atmospherically preprocessed scenes for the study area were identified and downloaded using Google Earth Engine. Three scenes from the 2025 growing season were acquired for analysis: 4 May, 28 May, and 7 July 2025. Covariates included the spectral bands as well as a set of derived spectral indices (Equations (3)–(7)) computed for each scene date: the Normalized Difference Vegetation Index (NDVI) [16], the Soil-Adjusted Vegetation Index (SAVI) [17], the Enhanced Vegetation Index (EVI) [18], the Normalized Difference Water Index (NDWI) [19], and the Normalized Difference Salinity Index (NDSI) [20].
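$$NDVI = \frac{NIR - R}{NIR + R} \tag{3}$$
$$SAVI = \frac{(NIR - R) \times (1 + 0.5)}{NIR + R + 0.5} \tag{4}$$
$$EVI = 2.6 \left( \frac{NIR - R}{NIR + 6 \times R - 7.5 \times B + 1} \right) \tag{5}$$
$$NDWI = \frac{G - NIR}{G + NIR} \tag{6}$$
$$NDSI = \frac{R - NIR}{R + NIR} \tag{7}$$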
where B, G, R, and NIR are blue, green, red, and near-infrared bands, respectively.
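The five indices can be computed per pixel from surface reflectance values, as in the Python sketch below. It assumes the usual difference-over-sum forms for NDVI, NDWI, and NDSI and reflectances scaled to 0–1; the function name, parameter names, and example pixel are hypothetical. The EVI gain defaults to 2.6 as printed in Equation (5), although the commonly cited EVI formulation uses 2.5, so the gain is left as a parameter.

```python
def spectral_indices(blue, green, red, nir, soil_factor=0.5, evi_gain=2.6):
    """Return NDVI, SAVI, EVI, NDWI and NDSI for one pixel.

    Inputs are surface reflectances (assumed to be scaled to 0-1).
    soil_factor is the SAVI soil-brightness term (0.5 in Equation (4));
    evi_gain follows Equation (5) as printed (2.6).
    """
    ndvi = (nir - red) / (nir + red)
    savi = (nir - red) * (1 + soil_factor) / (nir + red + soil_factor)
    evi = evi_gain * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)
    ndwi = (green - nir) / (green + nir)
    ndsi = (red - nir) / (red + nir)
    return {"NDVI": ndvi, "SAVI": savi, "EVI": evi, "NDWI": ndwi, "NDSI": ndsi}


# Hypothetical sparsely vegetated pixel.
pixel = spectral_indices(blue=0.05, green=0.08, red=0.10, nir=0.30)
for name, value in pixel.items():
    print(f"{name}: {value:.3f}")
```

With these definitions NDSI is the negative of NDVI, so the two indices carry the same information for a linear model; tree-based and kernel methods are likewise unaffected by the sign.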
Climate data included land surface temperature (day and night), precipitation, and solar radiation [21,22]. An existing digital soil type map, as well as maps of soil organic carbon (SOC) content and pH derived from Suleymanov et al. [13,23], were included as soil covariates. To unify the spatial resolution, all covariates were harmonized to 30 m/pixel, matching the Landsat data.
To evaluate the individual predictive power of each satellite acquisition date, three distinct modeling scenarios were tested. Each scenario utilized a unique set of covariates: the RSD set from one of the three single-date Landsat scenes, combined with the static climate and auxiliary soil data (type, SOC, and pH).

2.4. Predictive Algorithms

We tested four state-of-the-art ML algorithms that are suitable for small datasets [5] and compared their performance in predicting soil properties.
Cubist is a rule-based ensemble method that combines regression trees with instance-based learning and linear models [5]. The algorithm generates a series of rules (conditional statements) based on the input variables, each associated with a multivariate linear model. A unique feature of Cubist is the use of committees—ensembles of rule-based models—where each subsequent committee attempts to correct errors made by previous ones.
The Random Forest (RF) algorithm constructs multiple decision trees during training and, for regression tasks, outputs the mean of the predictions of the individual trees [24]. This ensemble approach reduces the risk of overfitting and improves prediction accuracy compared to single decision trees. Each tree is trained on a random subset of the data and a random subset of the features, further enhancing the model’s robustness and ability to generalize to unseen data.
Elastic Net is a regularized linear regression approach that combines both L1 (lasso) and L2 (ridge) penalties, addressing limitations of either method used independently [25]. The L1 penalty promotes sparsity by performing automatic variable selection, effectively setting coefficients of irrelevant features to zero, while the L2 penalty handles multicollinearity by shrinking coefficients of correlated variables toward each other. This dual regularization makes Elastic Net particularly suitable for datasets with numerous potentially correlated environmental covariates, as it maintains model interpretability while preventing overfitting. The method excels in scenarios where the number of predictors may exceed the number of observations, a common situation in soil modeling applications.
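The interplay of the two penalties can be seen in the coefficient update for a single standardized predictor. The Python sketch below assumes the widely used parameterization with an overall strength `lam` and a mixing weight `alpha` (1 gives the lasso, 0 gives ridge); the text does not state which implementation or parameterization was used, so both names and the example values are illustrative only.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma and set it to exactly zero if |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_coefficient(z, lam, alpha):
    """Coefficient of one standardized predictor under the elastic net penalty.

    z is the unpenalized least-squares coefficient of the predictor on the
    current residuals. The L1 part (lam * alpha) can zero the coefficient;
    the L2 part (lam * (1 - alpha)) shrinks it proportionally.
    """
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))


# A weak predictor (z = 0.1) is dropped by the lasso but only shrunk by ridge.
print(elastic_net_coefficient(0.1, lam=0.2, alpha=1.0))  # lasso
print(elastic_net_coefficient(0.1, lam=0.2, alpha=0.0))  # ridge
print(elastic_net_coefficient(0.5, lam=0.2, alpha=0.5))  # mixed penalty
```

A full fit would cycle this update over all predictors until the coefficients stop changing; only the single-coefficient step is shown here.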
Support Vector Machines (SVMs) are a kernel-based method that implicitly maps the input space into a higher-dimensional feature space, where nonlinear relationships can be modeled linearly [5]. The radial basis function kernel enables the model to capture complex patterns and interactions among variables without explicitly specifying the transformation. For regression, the method fits a function that tolerates small deviations from the observations within a margin and penalizes larger errors, with regularization controlling model complexity.

2.5. Model Validation and Uncertainty Assessment

The performance of all models was assessed using a repeated 5-fold cross-validation procedure. This process was repeated 50 times with different random partitions to ensure the results were stable and not dependent on a single random split of the data. Statistical metrics such as mean absolute error (MAE), root mean squared error (RMSE), squared Pearson’s correlation (R2), and Nash–Sutcliffe model efficiency coefficient (NSE) were used as error metrics.
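A minimal Python sketch of the four metrics and of the repeated 5-fold partitioning is given below. It is not the authors' R/caret code: the function names are hypothetical, the fold assignment is a simple shuffle-and-slice scheme, and whether metrics were averaged per fold or pooled across folds is not stated in the text.

```python
import math
import random


def mae(obs, pred):
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2_pearson(obs, pred):
    """Squared Pearson correlation; insensitive to bias and scale of predictions."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mo = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mo) ** 2 for o in obs)
    return 1.0 - sse / sst


def repeated_kfold_indices(n, k=5, repeats=50, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]
        for f, test in enumerate(folds):
            train = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield rep, f, train, test


# Predictions with a constant bias of +1: perfectly correlated, yet inaccurate.
obs, pred = [1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]
print(mae(obs, pred), rmse(obs, pred), r2_pearson(obs, pred), nse(obs, pred))

splits = list(repeated_kfold_indices(n=45))
print(len(splits))  # 5 folds x 50 repeats
```

The biased example shows why R2 (as squared correlation) and NSE are reported together: the former stays at 1 under a constant offset, while the latter drops.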
Uncertainty assessment of predictions was quantified using the Quantile Regression Forest (QRF) method [26]. QRF is an extension of the RF algorithm that not only predicts the mean value of the target variable but also models the conditional distribution, allowing for the estimation of prediction intervals and uncertainty. We used the 90% prediction interval width (PI90) as an uncertainty measure.
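The PI90 measure is the distance between the 5th and 95th percentiles of the predicted conditional distribution. The Python sketch below shows only that final step and assumes the conditional distribution is already available as an equally weighted sample of response values; a real QRF derives observation weights from the forest's leaves, which is not reproduced here. The function names and example values are hypothetical.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty sample, with 0 <= q <= 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def pi90_width(conditional_sample):
    """Width of the 90% prediction interval: 95th minus 5th percentile."""
    return quantile(conditional_sample, 0.95) - quantile(conditional_sample, 0.05)


# Hypothetical Ksas values representing the conditional distribution at one pixel.
sample = [0.52, 0.58, 0.61, 0.64, 0.66, 0.67, 0.70, 0.73, 0.78, 0.85]
print(round(pi90_width(sample), 3))
```

Wider intervals at a pixel indicate that the training observations judged similar to it span a broader range of values.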
All steps to implement ML algorithms, including covariate preprocessing, variable importance assessment, cross-validation, and visualization, were performed in the R programming environment (v. 4.5.2) using the “caret” [27] and other basic packages.

3. Results

3.1. Summary Statistics of Soil Properties

Most of the measured soil structure parameters showed a wide range of values, indicating variability within the study area (Table 1). Ks ranged between 0.01 and 34.71, with a mean of 7.24 and high variability (CV = 83%). The aggregate-structural composition was dominated by MEA, which ranged from 63.90 to 100%, with an average value of 85.01% and the lowest CV (10%). MIA and MAA ranged from 0 to 23.10% and from 0 to 33.80%, respectively. The mean value for MEA (85.01%) was substantially higher than for MIA (5.93%), but its CV was lower (10 vs. 76%). Among the properties related to aggregate stability, WSMIA and WSMEA varied from 9.16 to 80.30% and from 19.70 to 90.84%, respectively. Ksas was a relatively stable property, with a CV of 0.33 and a mean of 0.67.
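The coefficient of variation used to compare variability across properties is the standard deviation divided by the mean. The Python sketch below expresses it in percent and assumes the sample (n − 1) standard deviation, which the text does not specify; the function name and values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical MEA shares (%) for three samples.
print(coefficient_of_variation([8.0, 10.0, 12.0]))
```

Because it is scaled by the mean, the CV allows properties measured on different scales, such as Ks and MEA (%), to be compared directly.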
The ANOVA results indicated a statistically significant difference between soil types only for the Ks property (p-value < 0.01). Although no significant differences appeared for the remaining soil structure parameters, we found notable patterns in the variability of some of them (Figure 2). Specifically, although the mean and median values of WSMIA (%) and WSMEA (%) were not dramatically different across soil types, the variability was consistently lower for Solonchak compared to Chernozem and Solonetz soils.

3.2. Model Performance and Key Variables

The cross-validation results revealed varying performance across different soil structure parameters (Table 2). Modeling Ksas resulted in the most accurate model, which had RMSE 0.17 and R2 = 0.50. The model performance for other parameters was lower and did not exceed R2 = 0.30. The analysis found that for most properties, satellite scenes from May combined with climate and soil covariates were the most informative. The July scene provided the best accuracy for Ks predictions.
Figure 3 presents a comparison of model performance. In terms of the RMSE and R2 metrics, SVM was the best algorithm for most properties. RF had the best performance in the prediction of MAA. The Elastic Net method demonstrated the worst accuracy for most soil structure parameters.
The permutation-based variable importance for the best predicted variables (Ksas and MEA) showed different patterns (Figure 4). Four of the top five covariates in the Ksas model were spectral indices (NDWI, EVI, SAVI, and NDVI); the fifth was a climate variable, daytime land surface temperature. The spatial distribution of MEA was controlled by climate variables (land surface temperature and solar radiation) and Landsat bands (green and NIR).
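Permutation importance measures how much a model's error grows when the values of one covariate are shuffled across samples, breaking its link to the target. The Python sketch below is a generic illustration that uses the increase in RMSE as the score; the exact procedure, error metric, and number of repeats used in the study are not given in the text, and `predict` is a hypothetical callable standing in for any fitted model.

```python
import math
import random


def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def permutation_importance(predict, rows, obs, n_repeats=20, seed=0):
    """Mean increase in RMSE after shuffling each covariate column in turn.

    predict: callable mapping one row (list of covariate values) to a prediction.
    rows: list of rows; obs: observed target values in the same order.
    """
    rng = random.Random(seed)
    baseline = rmse(obs, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(rmse(obs, [predict(r) for r in permuted]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy model that uses only the first covariate; the second is ignored.
rows = [[float(i), float((3 * i) % 8)] for i in range(8)]
obs = [2.0 * r[0] for r in rows]
print(permutation_importance(lambda r: 2.0 * r[0], rows, obs))
```

In the toy example the ignored covariate receives an importance of exactly zero, while shuffling the used covariate raises the error.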

3.3. Spatial Distribution of Soil Properties

The generated maps are presented for the best predicted variables (Ksas and MEA), with RF predictions shown for comparison and uncertainty assessed using the QRF algorithm (Figure 5). The Ksas maps produced by SVM and RF showed comparable patterns: Ksas decreased from north to south, with the lowest values under croplands. However, RF predicted a wider range of values (0.35–0.91) than SVM (0.65–0.87). The uncertainty map, based on the 90% prediction interval from QRF, showed increasing uncertainty in southern areas. The digital map of MEA showed a decrease in MEA content from north to south. RF also produced a wider range of predicted MEA (68–95%) than SVM (79–93%). The greatest uncertainties in MEA predictions were found in the northern areas of the territory.

4. Discussion

4.1. Physical Soil Properties of the Study Area

The structural state of the topsoil in all studied soil types was characterized as “excellent” (Ks > 1.5). The best structure was found in Solonchaks (Ks = 11.23), followed by Chernozems (Ks = 6.52) and Solonetz soils (Ks = 5.70). Under natural conditions, Chernozems usually have an “excellent” structure. For example, the Ks of Chernozem topsoil in other parts of Russia has similar values: in the Kursk Oblast it ranged from 1.5 to 3.2 [28], and in the Ufimsky District (Republic of Bashkortostan) it varied from 4.2 to 8.4; however, as degradation processes intensify (frequent plowing, irrational irrigation, and erosion), the Ks of Chernozems decreases [29]. The Ks of the studied Solonetz soils was close to that in the dry steppe zone of the Lower Volga Basin (Volgograd Oblast), where it varied in topsoil from 2.2 to 3.2 [30]. The “excellent” structure of the studied soils is related to the high proportion of agronomically valuable aggregates (MEA), which prevailed and were of the same order in all soil types: 83.20% in Chernozems, 87.91% in Solonchaks, and 86.47% in Solonetz soils. The shares of blocky/lumpy (MAA) and fine/dust aggregates (MIA) were also similar: 11.73 and 7.63% in Chernozems, 8.48 and 4.24% in Solonchaks, and 9.74 and 3.78% in Solonetz soils, respectively. Under other natural and climatic conditions, in particular in the European part of Russia, MAA and MIA show a similar distribution in the aggregate-structural composition of the studied soil types. For example, the share of MIA in Chernozems of the Kursk Oblast was approximately the same (3–10%), but the share of MAA was somewhat higher (17–32%) [28], which is associated with the soil cultivation technology (plowing with layer turnover, without harrowing or disking, which preserves MAA). A similar situation was observed in Chernozems of the Rostov Oblast (Russia), where MAA amounted to 18–26% and MIA to 5–11% [31].
Soils of Solonetzic complexes in the southern Privolzhskaya Upland (Volgograd Oblast) contain approximately 2–9% MAA and 5–20% MIA [30]. It should be noted that anthropogenic impacts, such as excessive and monotonous plowing [32,33], use of heavy machinery [34], intensive irrigation [35], irrational agricultural management (e.g., the frequent use of row crops in crop rotation) [36], deforestation or disturbance of vegetation cover [37], and overgrazing [38], as well as natural processes such as water erosion [39] or deflation [40], lead to soil structure deterioration. In the study area, we observed only slight soil erosion [41], confined to steep slopes and ravines, so we consider the soil structure across most of the studied territory to be in good condition. Moreover, the dominant land use is fallow, which promotes self-restoration and improvement of the soil structure [42].
Water resistance is one of the main indicators of aggregate stability (the ability of soil to resist breaking apart when wet), and it also underpins erosion resistance. Ksas is a key quantitative indicator of the level of aggregate stability. The resistance of soil aggregates to water in all studied soils was “good” (Ksas = 0.65–0.70). The obtained soil aggregate stability values are consistent with the data of other researchers. For example, Khasanova et al. [43], studying the physical properties of Chernozems in the same Trans-Ural region, found that Ksas varied within 0.35–0.73, depending on the land use type. The proportions of MIA and MAA were also close to ours, varying from 8.2 to 12.6% and from 15.7 to 24.0%, respectively. Bartlová and Badalíková [44] concluded that the soil aggregate stability of Chernozem in the Czech Republic worsens depending on the soil cultivation type, decreasing from 70% (minimal tillage) to 50% (classical plowing). Values of soil aggregate stability in salt-affected soils of the Volgograd Oblast (Russia) were 38–58% [30], i.e., slightly lower (by up to 17%) than our results. The contents of WSMEA and WSMIA in Chernozem and Solonetz soils were close: 65.92 and 39.67%, and 66.77 and 33.22%, respectively; Solonchaks had more WSMEA (74.95%) and less WSMIA (22.54%). Similar values were obtained for Chernozems in the Rostov and Kursk Oblasts, where WSMIA content ranged from 36 to 48% [31] and WSMEA from 60 to 77% [28]. The content of WSMIA in Solonchak increased under the impact of perennial grasses: WSMIA was 15.8% in arable soil and 36.6% under alfalfa cultivation [45].
In summary, the studied soils have good structure and water resistance; however, additional soil conservation measures can help maintain and improve these indicators, for example, soil aeration [46], adding organic and mineral fertilizers [47], mulching [48], minimum tillage [49], and avoiding over-watering and over-drying [50].

4.2. Importance of Environmental Variables

We found that RSD and climate were key covariates in the best predictive models. As demonstrated earlier, RSD serves as a powerful tool for the digital mapping of soil physical properties [51], particularly in arid and semi-arid regions [9,52]. For instance, Khosravani et al. [53] found that RSD was a major variable for spatial modeling of soil aggregate stability indices in Iran. The effectiveness of RSD across the study area is explained by its inhomogeneous surface: there are arable lands with bare surfaces, as well as distinctive Solonchaks. The absence of dense vegetation on bare arable fields allows satellite sensors to capture the direct spectral signatures of the soil itself, which are influenced by key physical properties such as texture, mineralogy, and surface moisture content [54,55]. Solonchaks are characterized by a white surface crust with distinct spectral absorption features, especially in the near- and short-wave infrared regions [56,57]. Hence, the combination of bare soils and Solonchaks simplifies the relationships for ML algorithms. For this reason, Landsat satellite images acquired in May proved to be the most valuable because of the sparse vegetation at that time. Future work should incorporate multi-temporal satellite composites (e.g., multi-date Landsat or Sentinel-2) to better represent seasonal dynamics and reduce the sensitivity of optical–soil relationships to single-date conditions [58]. Although climate is typically the most important predictor in soil spatial modeling across large areas [59], land surface temperature also contributed to the predictions in our study. The land surface temperature variables are derived from thermal infrared RSD, which may explain this contribution. Earlier, Sayão and Demattê [51] found significant correlations between soil texture and land surface temperature.

4.3. Model Performance

In the spatial modeling of water-physical soil properties, studies have reported differing levels of accuracy. Bousbih et al. [60] tested RF and SVM with fourteen Sentinel images for clay content modeling and obtained poor accuracy from all models. In another study, Vaudour et al. [58] predicted texture fractions (clay, silt, sand) using RSD in two areas with an R2 between 0.03 and 0.42. In contrast, Zeraatpisheh et al. [10] demonstrated that the RF model was the best for mean weight diameter (R2 = 0.74), geometric mean diameter (R2 = 0.75), and water-stable aggregates (R2 = 0.58) predictions.
Among the investigated soil properties, Ksas was relatively well predicted. We assume that a primary limitation on model performance for the other properties is the small difference in their values between the key soil types (Figure 2). As a result, although the soil types differ in their surface signals, these properties were not directly visible to the sensors and correlated only weakly with the covariates. Furthermore, the inherent complexity and high spatial variability of soil-forming factors in these environments can lead to non-stationary relationships that are difficult for models to capture without a very large and densely distributed training dataset. We also emphasize the possible limitation of the initial spatial resolution of climatic and soil variables, as a single pixel may represent multiple soil types or land covers. This did not allow for the detection of short-scale variations in soils and their properties, which are characteristic of this region [41]. The uncertainty maps reveal substantial spatial variability in prediction confidence, with wider intervals in areas where sampling was sparse or where covariates were complex and poorly represented in the training data.
Among the tested ML algorithms, the SVM model demonstrated superior predictive performance for the majority of the soil physical properties, as evidenced by its error metrics. This highlights its effectiveness in capturing the complex, potentially non-linear relationships between covariates and soil characteristics. However, the performance of ML algorithms is often data-specific, and no single method universally outperforms all others in soil applications [5,61,62]. Therefore, it remains a critical best practice to test and compare a suite of multiple algorithms. Subsequent initiatives on DSM may test hybrid methods that take into account spatial autocorrelation of soil properties [63].
We acknowledge the relatively small sample size as a limitation of this study. To mitigate this issue, a targeted sampling scheme was adopted with an emphasis on sampling from three key local soil types. This design was intended to maximize the informational content and structural contrast within the dataset rather than simply increasing the number of observations. Furthermore, previous work at local scales has shown that ML algorithms, including RF, can perform well even with limited sample sizes, and that increasing the number of samples does not necessarily lead to substantial gains in prediction accuracy [64,65].

5. Conclusions

This study investigated topsoil structure parameters in semi-arid areas and evaluated an ML framework, in combination with environmental predictors, for their spatial prediction. The key findings of this study are summarized below:
(1) All studied soil types (Chernozems, Solonchaks, and Solonetzes) were characterized by an “excellent” aggregate state, with an average structural coefficient (Ks) of 6.52, 11.23, and 5.70, respectively. The soils demonstrated a “good” resistance of aggregates to destruction by water, with aggregate stability coefficients (Ksas) of 0.67, 0.65, and 0.70, respectively. The average shares of MIA, MEA, and MAA were 7.63, 83.20, and 11.73% in Chernozem; 4.24, 87.91, and 8.48% in Solonchak; and 3.78, 86.47, and 9.74% in Solonetz, respectively. The proportions of water-stable mesoaggregates (WSMEA) and microaggregates (WSMIA) were 65.92 and 39.67% in Chernozems, 74.95 and 22.54% in Solonchaks, and 66.77 and 33.22% in Solonetz soils, respectively.
(2) Among the ML approaches tested, SVM achieved the best accuracy for predictions of most properties (Ks, Ksas, MEA, WSMIA, and WSMEA), according to the cross-validation strategy. RF and Elastic Net were the best algorithms for MAA and MIA predictions, respectively. The best prediction model was achieved for Ksas under the SVM algorithm and resulted in R2 = 0.50 and RMSE = 0.17. The second most accurate model (R2 = 0.30 and RMSE = 7.70%) was obtained for MEA modeling.
(3) The best modeled soil properties (Ksas and MEA) were controlled by different variables. The spatial distribution of Ksas was controlled by spectral indices, mainly due to the distinctive spectral response of Solonchak soils. Climate variables (land surface temperature and solar radiation) and Landsat bands (green and NIR) were crucial for the MEA model, highlighting correlations between different soil types and their surface spectral signal.
(4) Despite the results obtained, subsequent studies should focus on including a more extensive data set and testing additional independent variables reflecting the variability of soil properties.

Author Contributions

Conceptualization: A.S.; data curation: A.S.; formal analysis: A.S., M.K., R.S. and I.G.; funding acquisition: A.S.; investigation: A.S., M.K., R.S. and I.G.; methodology: A.S. and M.K.; project administration: A.S.; resources: I.G.; software: A.S.; supervision: I.G.; validation: M.K. and R.S.; visualization: A.S.; writing—original draft: A.S. and M.K.; writing—review and editing: R.S. and I.G. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Russian Science Foundation, project No. 25-74-00002.

Data Availability Statement

The data presented in this study are available upon reasonable request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

ML: machine learning; Ks: soil structural coefficient; SAS: soil aggregate stability; Ksas: soil aggregate stability coefficient; MEA: mesoaggregates or agronomically valuable aggregates, mm or %; MAA: macroaggregates, mm or %; MIA: microaggregates, mm or %; WSMEA: water-stable mesoaggregates, mm or %; WSMIA: water-stable microaggregates, mm or %; WSMAA: water-stable macroaggregates, mm or %; DSM: digital soil mapping; RSD: remote sensing data; SOC: soil organic carbon; NDVI: normalized difference vegetation index; NDSI: normalized difference salinity index; SAVI: soil-adjusted vegetation index; EVI: enhanced vegetation index; NDWI: normalized difference water index; RF: random forest; SVM: support vector machine; MAE: mean absolute error; RMSE: root mean squared error; R2: squared Pearson’s correlation; NSE: Nash–Sutcliffe model efficiency coefficient; QRF: quantile regression forest; SD: standard deviation; CV: coefficient of variation.

References

  1. Deng, Z.; Lan, H.; Li, L.; Sun, W. Vegetation-Induced Modifications in Hydrological Processes and the Consequential Dynamic Effects of Slope Stability. CATENA 2025, 251, 108793. [Google Scholar] [CrossRef]
  2. Lu, M.; Yang, M.; Yang, Y.; Wang, D.; Sheng, L. Soil Carbon and Nutrient Sequestration Linking to Soil Aggregate in a Temperate Fen in Northeast China. Ecol. Indic. 2019, 98, 869–878. [Google Scholar] [CrossRef]
  3. Sainju, U.M.; Liptzin, D.; Jabro, J.D. Relating Soil Physical Properties to Other Soil Properties and Crop Yields. Sci. Rep. 2022, 12, 22025. [Google Scholar] [CrossRef] [PubMed]
  4. Bronick, C.J.; Lal, R. Soil Structure and Management: A Review. Geoderma 2005, 124, 3–22. [Google Scholar] [CrossRef]
  5. Khaledian, Y.; Miller, B.A. Selecting Appropriate Machine Learning Methods for Digital Soil Mapping. Appl. Math. Model. 2020, 81, 401–418. [Google Scholar] [CrossRef]
  6. Khosravani, P.; Baghernejad, M.; Moosavi, A.A.; Rezaei, M. Digital Mapping and Spatial Modeling of Some Soil Physical and Mechanical Properties in a Semi-Arid Region of Iran. Environ. Monit. Assess. 2023, 195, 1367. [Google Scholar] [CrossRef]
  7. Alonso-Sarria, F.; Blanco-Bernardeau, A.; Gomariz-Castillo, F.; Jiménez-Bastida, H.; Romero-Diaz, A. Estimation of Soil Properties Using Machine Learning Techniques to Improve Hydrological Modeling in a Semiarid Environment: Campo de Cartagena (Spain). Earth Sci. Inform. 2025, 18, 323. [Google Scholar] [CrossRef]
  8. Ließ, M.; Glaser, B.; Huwe, B. Uncertainty in the Spatial Prediction of Soil Texture: Comparison of Regression Tree and Random Forest Models. Geoderma 2012, 170, 70–79. [Google Scholar] [CrossRef]
  9. Bouslihim, Y.; Rochdi, A.; Aboutayeb, R.; El Amrani-Paaza, N.; Miftah, A.; Hssaini, L. Soil Aggregate Stability Mapping Using Remote Sensing and GIS-Based Machine Learning Technique. Front. Earth Sci. 2021, 9, 748859. [Google Scholar] [CrossRef]
  10. Zeraatpisheh, M.; Ayoubi, S.; Mirbagheri, Z.; Mosaddeghi, M.R.; Xu, M. Spatial Prediction of Soil Aggregate Stability and Soil Organic Carbon in Aggregate Fractions Using Machine Learning Algorithms and Environmental Variables. Geoderma Reg. 2021, 27, e00440. [Google Scholar] [CrossRef]
  11. Beck, H.E.; Zimmermann, N.E.; McVicar, T.R.; Vergopolan, N.; Berg, A.; Wood, E.F. Present and Future Köppen-Geiger Climate Classification Maps at 1-Km Resolution. Sci. Data 2018, 5, 180214. [Google Scholar] [CrossRef]
  12. IUSS Working Group WRB. World Reference Base for Soil Resources 2014, Update 2015. In International Soil Classification System for Naming Soils and Creating Legends for Soil Maps; World Soil Resources Reports No. 106; FAO: Rome, Italy, 2015. [Google Scholar]
  13. Suleymanov, A.; Asylbaev, I.; Suleymanov, R.; Mirsayapov, R.; Gabbasova, I.; Tuktarova, I.; Belan, L. Assessing and Mapping of Soil Organic Carbon at Multiple Depths in the Semi-Arid Trans-Ural Steppe Zone. Geoderma Reg. 2024, 38, e00855. [Google Scholar] [CrossRef]
  14. Vadyunina, A.F.; Korchagina, Z.A. Methods of Studying the Physical Properties of Soils; Agropromizdat: Moscow, Russia, 1986; p. 416. (In Russian) [Google Scholar]
  15. Langley, S.K.; Cheshire, H.M.; Humes, K.S. A Comparison of Single Date and Multitemporal Satellite Image Classifications in a Semi-Arid Grassland. J. Arid Environ. 2001, 49, 401–411. [Google Scholar] [CrossRef]
  16. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring vegetation systems in the Great Plains with ERTS. NASA Spec. Publ. 1974, 351, 309. [Google Scholar]
  17. Huete, A.R. A Soil-Adjusted Vegetation Index (SAVI). Remote Sens. Environ. 1988, 25, 295–309. [Google Scholar] [CrossRef]
  18. Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the Radiometric and Biophysical Performance of the MODIS Vegetation Indices. Remote Sens. Environ. 2002, 83, 195–213. [Google Scholar] [CrossRef]
  19. Gao, B. NDWI—A Normalized Difference Water Index for Remote Sensing of Vegetation Liquid Water from Space. Remote Sens. Environ. 1996, 58, 257–266. [Google Scholar] [CrossRef]
  20. Khan, N.M.; Rastoskuev, V.V.; Sato, Y.; Shiozawa, S. Assessment of hydrosaline land degradation by using a simple approach of remote sensing indicators. Agric. Water Manag. 2005, 77, 96–109. [Google Scholar] [CrossRef]
  21. Fick, S.E.; Hijmans, R.J. WorldClim 2: New 1-Km Spatial Resolution Climate Surfaces for Global Land Areas. Int. J. Climatol. 2017, 37, 4302–4315. [Google Scholar] [CrossRef]
  22. Karger, D.N.; Conrad, O.; Böhner, J.; Kawohl, T.; Kreft, H.; Soria-Auza, R.W.; Zimmermann, N.E.; Linder, H.P.; Kessler, M. Climatologies at High Resolution for the Earth’s Land Surface Areas. Sci. Data 2017, 4, 170122. [Google Scholar] [CrossRef]
  23. Suleymanov, A.; Kuzyakov, Y.; Asylbaev, I.; Rusakov, I.; Suleymanov, R.; Tuktarova, I.; Belan, L. Mechanisms and Drivers of Soil pH Assessed by Shapley Additive Explanation. CATENA 2025, 259, 109301. [Google Scholar] [CrossRef]
  24. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  25. Zou, H.; Hastie, T.; Zou, H.; Hastie, T. Regularization and Variable Selection via the Elastic Net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320. [Google Scholar] [CrossRef]
  26. Meinshausen, N. Quantile Regression Forests. J. Mach. Learn. Res. 2006, 7, 983–999. [Google Scholar]
  27. Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: New York, NY, USA, 2013; ISBN 978-1-4614-6848-6. [Google Scholar]
  28. Dubovik, E.V.; Dubovik, D.V. Relationships between the Organic Carbon Content and Structural State of Typical Chernozem. Eurasian Soil Sci. 2019, 52, 150–161. [Google Scholar] [CrossRef]
  29. Komissarov, M.A.; Gabbasova, I.M. Erosion of Agrochernozems under Sprinkler Irrigation and Rainfall Simulation in the Southern Forest-Steppe of Bashkir Cis-Ural Region. Eurasian Soil Sci. 2017, 50, 253–261. [Google Scholar] [CrossRef]
  30. Zolotareva, B.N.; Bukhonov, A.V.; Demkin, V.A. The Structural State of Buried and Surface Soils of Solonetzic Complexes in the Dry Steppe Zone of the Lower Volga Basin. Eurasian Soil Sci. 2012, 45, 690–699. [Google Scholar] [CrossRef]
  31. Gaevaya, E.A.; Bezuglova, O.S. Use of Analysis of the Principal Components to Identify the Relationships between Soil Aggregates and Parameters of Soil Fertility of Migration–Segregation Chernozems. Moscow Univ. Soil Sci. Bull. 2025, 80, 228–238. [Google Scholar] [CrossRef]
  32. Lal, R. Tillage Effects on Soil Degradation, Soil Resilience, Soil Quality, and Sustainability. Soil Till. Res. 1993, 27, 1–8. [Google Scholar] [CrossRef]
  33. Komissarov, M.A.; Klik, A. The Impact of No-Till, Conservation, and Conventional Tillage Systems on Erosion and Soil Properties in Lower Austria. Eurasian Soil Sci. 2020, 53, 503–511. [Google Scholar] [CrossRef]
  34. Alaoui, A.; Diserens, E. Changes in Soil Structure Following Passage of a Tracked Heavy Machine. Geoderma 2011, 163, 283–290. [Google Scholar] [CrossRef]
  35. Murray, R.S.; Grant, C.D. The impact of irrigation on soil structure. In Land and Water Australia; The National Program for Sustainable Irrigation (Land & Water Australia): Canberra, Australia, 2007; pp. 1–31. [Google Scholar]
  36. Ball, B.; Bingham, I.; Rees, B.; Watson, C.; Litterick, A. The Role of Crop Rotations in Determining Soil Structure and Crop Growth Conditions. Can. J. Soil Sci. 2005, 85, 557–577. [Google Scholar] [CrossRef]
  37. Gholoubi, A.; Emami, H.; Caldwell, T. Deforestation Effects on Soil Aggregate Stability Quantified by the High Energy Moisture Characteristic Method. Geoderma 2019, 355, 113919. [Google Scholar] [CrossRef]
  38. Villamil, M.B.; Amiotti, N.M.; Peinemann, N. Soil degradation related to overgrazing in the semi-arid southern Caldenal area of Argentina. Soil Sci. 2001, 166, 441. [Google Scholar] [CrossRef]
  39. Chalise, D.; Kumar, L.; Kristiansen, P. Land Degradation by Soil Erosion in Nepal: A Review. Soil Syst. 2019, 3, 12. [Google Scholar] [CrossRef]
  40. Sterk, G.; Riksen, M.; Goossens, D. Dryland Degradation by Wind Erosion and Its Control. Ann. Arid Zone 2001, 41, 351–367. [Google Scholar]
  41. Suleymanov, A.; Polyakov, V.; Komissarov, M.; Suleymanov, R.; Gabbasova, I.; Garipov, T.; Saifullin, I.; Abakumov, E. Biophysicochemical Properties of the Eroded Southern Chernozem (Trans-Ural Steppe, Russia) with Emphasis on the 13C NMR Spectroscopy of Humic Acids. Soil Water Res. 2022, 17, 222–232. [Google Scholar] [CrossRef]
  42. Fedorov, N.; Shirokikh, P.; Zhigunova, S.; Baisheva, E.; Komissarov, M.; Muldashev, A.; Gabbasova, D.; Akhmetova, M.; Tuktamyshev, I.; Bikbaev, I.; et al. Productivity, Carbon Sequestration and Species Diversity in Virgin and Secondary Meadow Steppes of the Bashkir Cis-Urals. Sci. Rep. 2025, 15, 17268. [Google Scholar] [CrossRef]
  43. Khasanova, R.F.; Suyundukova, M.B.; Suyundukov, Y.T.; Akhmetov, F.R. Optimization of agrophysical properties of ordinary chernozem under the influence of perennial grasses. Fund. Res. 2014, 8–5, 1095–1099. (In Russian) [Google Scholar]
  44. Bartlová, J.; Badalíková, B. Water Stability of Soil Aggregates in Different Systems of Chernozem Tillage. Acta Univ. Agric. Silvic. Mendel. Brun. 2011, 59, 25–30. [Google Scholar] [CrossRef]
  45. Kursakova, V.S. The Effect of Perennial Herbs on the Physical Properties of Saline Soils. Eurasian Soil Sci. 2006, 39, 748–752. [Google Scholar] [CrossRef]
  46. Khan, A.R.; Chandra, D.; Quraishi, S.; Sinha, R.K. Soil aeration under different soil surface conditions. J. Agron. Crop Sci. 2000, 185, 105–112. [Google Scholar] [CrossRef]
  47. Chen, M.; Zhang, S.; Liu, L.; Wu, L.; Ding, X. Combined Organic Amendments and Mineral Fertilizer Application Increase Rice Yield by Improving Soil Structure, P Availability and Root Growth in Saline-Alkaline Soil. Soil Till. Res. 2021, 212, 105060. [Google Scholar] [CrossRef]
  48. Chen, Y.; Wen, X.; Sun, Y.; Zhang, J.; Wu, W.; Liao, Y. Mulching Practices Altered Soil Bacterial Community Structure and Improved Orchard Productivity and Apple Quality after Five Growing Seasons. Sci. Hortic. 2014, 172, 248–257. [Google Scholar] [CrossRef]
  49. Nunes, M.R.; Karlen, D.L.; Moorman, T.B. Tillage Intensity Effects on Soil Structure Indicators—A US Meta-Analysis. Sustainability 2020, 12, 2071. [Google Scholar] [CrossRef]
  50. Askaraliev, B.; Musabaeva, K.; Koshmatov, B.; Omurzakov, K.; Dzhakshylykova, Z. Development of modern irrigation systems for improving efficiency, reducing water consumption and increasing yields. Mach. Energetics 2024, 15, 47. [Google Scholar] [CrossRef]
  51. Sayão, V.M.; Demattê, J.A.M. Soil Texture and Organic Carbon Mapping Using Surface Temperature and Reflectance Spectra in Southeast Brazil. Geoderma Reg. 2018, 14, e00174. [Google Scholar] [CrossRef]
  52. Shabou, M.; Mougenot, B.; Chabaane, Z.; Walter, C.; Boulet, G.; Aissa, N.; Zribi, M. Soil Clay Content Mapping Using a Time Series of Landsat TM Data in Semi-Arid Lands. Remote Sens. 2015, 7, 6059–6078. [Google Scholar] [CrossRef]
  53. Khosravani, P.; Moosavi, A.A.; Baghernejad, M.; Kebonye, N.M.; Mousavi, S.R.; Scholten, T. Machine Learning Enhances Soil Aggregate Stability Mapping for Effective Land Management in a Semi-Arid Region. Remote Sens. 2024, 16, 4304. [Google Scholar] [CrossRef]
  54. Jeihouni, M.; Alavipanah, S.K.; Toomanian, A.; Jafarzadeh, A.A. Digital Mapping of Soil Moisture Retention Properties Using Solely Satellite-Based Data and Data Mining Techniques. J. Hydrol. 2020, 585, 124786. [Google Scholar] [CrossRef]
  55. Swain, S.R.; Chakraborty, P.; Panigrahi, N.; Vasava, H.B.; Reddy, N.N.; Roy, S.; Majeed, I.; Das, B.S. Estimation of Soil Texture Using Sentinel-2 Multispectral Imaging Data: An Ensemble Modeling Approach. Soil Till. Res. 2021, 213, 105134. [Google Scholar] [CrossRef]
  56. Delavar, M.A.; Naderi, A.; Ghorbani, Y.; Mehrpouyan, A.; Bakhshi, A. Soil Salinity Mapping by Remote Sensing South of Urmia Lake, Iran. Geoderma Reg. 2020, 22, e00317. [Google Scholar] [CrossRef]
  57. Ngabire, M.; Wang, T.; Xue, X.; Liao, J.; Sahbeni, G.; Huang, C.; Duan, H.; Song, X. Soil Salinization Mapping across Different Sandy Land-Cover Types in the Shiyang River Basin: A Remote Sensing and Multiple Linear Regression Approach. Remote Sens. Appl. Soc. Environ. 2022, 28, 100847. [Google Scholar] [CrossRef]
  58. Vaudour, E.; Gomez, C.; Fouad, Y.; Lagacherie, P. Sentinel-2 Image Capacities to Predict Common Topsoil Properties of Temperate and Mediterranean Agroecosystems. Remote Sens. Environ. 2019, 223, 21–33. [Google Scholar] [CrossRef]
  59. Barthold, F.K.; Wiesmeier, M.; Breuer, L.; Frede, H.-G.; Wu, J.; Blank, F.B. Land Use and Climate Control the Spatial Distribution of Soil Types in the Grasslands of Inner Mongolia. J. Arid Environ. 2013, 88, 194–205. [Google Scholar] [CrossRef]
  60. Bousbih, S.; Zribi, M.; Pelletier, C.; Gorrab, A.; Lili-Chabaane, Z.; Baghdadi, N.; Ben Aissa, N.; Mougenot, B. Soil Texture Estimation Using Radar and Optical Data from Sentinel-1 and Sentinel-2. Remote Sens. 2019, 11, 1520. [Google Scholar] [CrossRef]
  61. Kaya, F.; Başayiğit, L.; Keshavarzi, A.; Francaviglia, R. Digital Mapping for Soil Texture Class Prediction in Northwestern Türkiye by Different Machine Learning Algorithms. Geoderma Reg. 2022, 31, e00584. [Google Scholar] [CrossRef]
  62. Zeng, P.; Song, X.; Yang, H.; Wei, N.; Du, L. Digital Soil Mapping of Soil Organic Matter with Deep Learning Algorithms. ISPRS Int. J. Geo. Inf. 2022, 11, 299. [Google Scholar] [CrossRef]
  63. Guo, P.-T.; Li, M.-F.; Luo, W.; Tang, Q.-F.; Liu, Z.-W.; Lin, Z.-M. Digital mapping of soil organic matter for rubber plantation at regional scale: An application of random forest plus residuals kriging approach. Geoderma 2015, 237–238, 49–59. [Google Scholar] [CrossRef]
  64. Ma, T.; Brus, D.J.; Zhu, A.-X.; Zhang, L.; Scholten, T. Comparison of Conditioned Latin Hypercube and Feature Space Coverage Sampling for Predicting Soil Classes Using Simulation from Soil Maps. Geoderma 2020, 370, 114366. [Google Scholar] [CrossRef]
  65. Schmidinger, J.; Schröter, I.; Bönecke, E.; Gebbers, R.; Joerg, R.; Eckart, K.; Mulder, V.L.; Heuvelink, G.; Vogel, S. Effect of Training Sample Size, Sampling Design and Prediction Model on Soil Mapping with Proximal Sensing Data for Precision Liming. Precis. Agric. 2024, 25, 1–27. [Google Scholar] [CrossRef]
Figure 1. Locations: (a) the Republic of Bashkortostan on the European continent; (b) study site in the southern part of the Republic of Bashkortostan and the location of the weather station (the blue triangle within the study area); (c) sampling points (red dots) within the study area. Source: Google Maps.
Figure 2. Boxplots of physical soil structure parameters across key soil types. The horizontal black line inside each box represents the median of the data, and the numbers are the mean values.
Figure 3. Model performance of ML algorithms using a repeated 5-fold cross-validation procedure.
Figure 4. Variable importance (top 5) for the best predicted soil structure parameters: Ksas (a) and MEA (b).
Figure 5. Generated digital maps of Ksas (top row) and MEA (bottom row) parameters under SVM and RF approaches. Uncertainty maps showing the 90% prediction interval width (PI90) generated using the QRF technique.
Table 1. Descriptive statistics of soil structure parameters (all soil types).
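| Parameter | Min | Max | Mean | Median | SD ¹ | CV ² |
|---|---|---|---|---|---|---|
| Ks | 0.01 | 34.71 | 7.24 | 5.25 | 6.03 | 0.83 |
| Ksas | 0.21 | 0.93 | 0.67 | 0.76 | 0.22 | 0.33 |
| MIA, % | 0 | 23.10 | 5.93 | 5.30 | 4.50 | 0.76 |
| MEA, % | 63.90 | 100.00 | 85.01 | 85.00 | 8.62 | 0.10 |
| MAA, % | 0 | 33.80 | 10.55 | 10.80 | 7.45 | 0.71 |
| WSMIA, % | 9.16 | 80.30 | 34.53 | 26.00 | 21.77 | 0.63 |
| WSMEA, % | 19.70 | 90.84 | 67.95 | 74.50 | 20.07 | 0.30 |

¹ standard deviation; ² coefficient of variation.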
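As a quick consistency check on Table 1, the coefficient of variation can be recomputed from the reported means and standard deviations. The sketch below assumes the conventional definition CV = SD / mean, expressed as a fraction; the table does not spell out the formula. The values are transcribed from Table 1, and the function name is hypothetical.

```python
# Rows transcribed from Table 1: (parameter, mean, SD, reported CV).
TABLE_1 = [
    ("Ks", 7.24, 6.03, 0.83),
    ("Ksas", 0.67, 0.22, 0.33),
    ("MIA", 5.93, 4.50, 0.76),
    ("MEA", 85.01, 8.62, 0.10),
    ("MAA", 10.55, 7.45, 0.71),
    ("WSMIA", 34.53, 21.77, 0.63),
    ("WSMEA", 67.95, 20.07, 0.30),
]


def coefficient_of_variation(mean, sd):
    """Assumed definition: standard deviation divided by the mean."""
    return sd / mean


# Recompute CV to two decimals for comparison with the reported column.
recomputed = {
    name: round(coefficient_of_variation(mean, sd), 2)
    for name, mean, sd, _ in TABLE_1
}
```

Under this assumption, all seven recomputed values match the reported CV column to two decimals. MEA is the least variable parameter (CV 0.10), while Ks, MIA, and MAA are the most variable (CV above 0.70).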
Table 2. Accuracy of the best predictive models for each soil property based on five-fold cross-validation with 50 repetitions.
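| Soil Property | RSD Date | Method | R2 | RMSE | MAE | NSE |
|---|---|---|---|---|---|---|
| Ks | 20250707 | SVM | 0.24 | 5.10 | 3.35 | 0.10 |
| Ksas | 20250528 | SVM | 0.50 | 0.17 | 0.12 | 0.38 |
| MAA, % | 20250504 | RF | 0.24 | 6.79 | 5.20 | 0.02 |
| MEA, % | 20250504 | SVM | 0.30 | 7.70 | 6.04 | 0.10 |
| MIA, % | 20250504 | Elastic Net | 0.22 | 4.29 | 3.27 | −0.20 |
| WSMIA, % | 20250528 | SVM | 0.24 | 20.17 | 16.60 | 0.08 |
| WSMEA, % | 20250504 | SVM | 0.27 | 18.49 | 13.95 | 0.08 |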
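The four accuracy measures in Table 2 can be sketched in a few lines. The example below follows the Abbreviations list in treating R2 as the squared Pearson correlation and uses the standard definitions of RMSE, MAE, and NSE. The observed and predicted values are hypothetical and chosen only to show how R2 and NSE can diverge, as they do for MIA (R2 = 0.22, NSE = −0.20); they are not the study's data, and this is not a claim about why the MIA model behaved that way.

```python
import math


def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def accuracy_metrics(obs, pred):
    """Return R2 (squared Pearson r), RMSE, MAE, and NSE."""
    n = len(obs)
    errors = [p - o for o, p in zip(obs, pred)]
    sse = sum(e * e for e in errors)  # sum of squared errors
    mean_o = sum(obs) / n
    return {
        "R2": pearson_r(obs, pred) ** 2,
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(e) for e in errors) / n,
        # NSE compares model error with the error of predicting the mean.
        "NSE": 1.0 - sse / sum((o - mean_o) ** 2 for o in obs),
    }


# Hypothetical data: predictions are perfectly correlated but biased by +3.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [4.0, 5.0, 6.0, 7.0, 8.0]
scores = accuracy_metrics(observed, predicted)
```

In this toy case the squared correlation is 1 because the predictions track the observations exactly, yet NSE is negative because the constant bias makes the model worse than simply predicting the observed mean. Squared Pearson correlation ignores bias and scale errors, whereas NSE penalizes them, which is why the two columns of Table 2 can disagree.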