Article

Long-Term Assessment of Soil Carbon Dynamics in Post-Fire Conditions: Evidence from Digital Soil Mapping Approaches

1
LEAF—Linking Landscape, Environment, Agriculture and Food Research Center, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, 1349-017 Lisbon, Portugal
2
Associate Laboratory TERRA, Instituto Superior de Agronomia, Universidade de Lisboa, Tapada da Ajuda, 1349-017 Lisbon, Portugal
*
Author to whom correspondence should be addressed.
Soil Syst. 2026, 10(1), 17; https://doi.org/10.3390/soilsystems10010017
Submission received: 1 December 2025 / Revised: 9 January 2026 / Accepted: 15 January 2026 / Published: 20 January 2026
(This article belongs to the Special Issue Use of Modern Statistical Methods in Soil Science)

Abstract

This study examined long-term changes in soil carbon stock dynamics 11 and 19 years after fire under different severities at 0–5 and 0–25 cm depths with a digital soil mapping approach. Linear (MLR) and non-linear models (RF, SVR, XGBoost) combined with feature selection methods (r < 0.8, FFS, Boruta) were used to predict bulk density (BD), total C, and C stock. Distributional biases were evaluated with Kolmogorov–Smirnov statistics and corrected by Quantile Mapping (QM). RF-FFS performed best for BD and total C at 0–5 cm, while SVR-FFS performed best for C stock at 0–5 cm and for all properties at 0–25 cm. Total C was 49% higher at 0–5 cm, whereas C stock was 7.57 times greater at 0–25 cm. Both models underestimated variability, especially for C stock. At 0–25 cm, bulk density decreased after fire, particularly under conditions of medium severity, while total C increased following the same tendency. The results showed that fire’s legacy is still present in the ecosystem after one and two decades. This is particularly evident at greater depths, where long-term C stock is lower.

1. Introduction

Forest ecosystems underpin a broad spectrum of ecosystem services essential to human societies. Among the most critical are the provision of food and non-food resources, regulation of the global climate, supply of clean fresh water, mitigation of natural disturbances, and the maintenance of productive soils [1]. Forest soils represent one of the planet’s largest and most stable C reservoirs, owing to their capacity to sequester and retain carbon over extended temporal scales [2,3].
The United Nations Convention to Combat Desertification (UNCCD) has adopted the soil organic carbon stock (C stock) as one of the three key indicators of Land Degradation Neutrality, committing to monitor and report regularly on their trends, as a metric to define climate change mitigation and adaptation strategies [4,5,6].
Beyond its importance as a global indicator, C stock and its covariates are critical for multiple reasons. Soil organic carbon regulates ecosystem services such as nutrient provisioning, water holding capacity, soil drainage, soil stability, and greenhouse gas emissions [7,8].
Forest soils store more than half of the world’s C stock with 584 Pg C (56%) at 30 cm and 1525 Pg C (54%) at 100 cm [9]. However, forests worldwide have experienced considerable alterations over recent decades, driven by climate change and human activities that directly or indirectly affect soil carbon storage [10]. Mediterranean forest soils, despite occupying a small region, contribute substantially to Europe’s C stock, primarily associated with forest ecosystems [9,11]. Nevertheless, these forests are also among the most vulnerable to soil degradation processes [12].
Among the various drivers that alter Mediterranean forest ecosystems, fire stands out as the most significant, and its intensity and frequency are expected to increase under future climate scenarios [13,14,15]. In Mediterranean countries, wildfires burn over 400,000 ha/year, with projections indicating that this area will triple by the end of the 21st century [16]. In this context, Portugal has been identified as one of the regions most at risk of extreme wildfire events [17], particularly in the south, where soil organic carbon levels are among the lowest in the Mediterranean basin [18].
Fire can induce substantial alterations in C stock. In the short term, fire can affect different soil properties, such as bulk density [19] and texture [20], and cause the total or partial loss of organic matter [21,22]. These changes also increase soil vulnerability to erosion, further accelerating soil organic carbon losses [23]. Nonetheless, the magnitude and direction of these effects vary considerably with soil type and its intrinsic characteristics and conditions, vegetation cover and type, topography, and fire intensity [19,24]. Importantly, most immediate effects are concentrated in the upper soil layer (up to 5 cm depth) but can trigger cascading effects in deeper layers [25,26].
Over longer timescales, fire may contribute to increasing C stock, although the mechanisms that regulate it are complex. For example, combustion can transform biomass or organic matter into more recalcitrant forms, or pyrogenic C, which is chemically resistant to decomposition. Likewise, fire can lead to changes in soil moisture, nutrient dynamics, and microbial activity that can modify decomposition rates and stimulate vegetation regeneration, such as pyrophytic shrubs, that contribute new C to the soil. Given that nearly 70% of the world’s topsoil carbon occurs in fire-prone regions, these processes underscore that fire does not universally deplete soil carbon, and that, depending on local conditions and fire regimes, it can even stabilize or increase C content in the long-term [27].
Understanding the spatial distribution of C stock following fire events is crucial for designing sustainable management strategies and informed environmental policies. Yet assessing these dynamics at management-relevant scales requires approaches that are both efficient and cost-effective. In this context, digital soil mapping (DSM) techniques have increasingly replaced conventional field-based methods, which, while accurate, are often labor-intensive, time-consuming, and costly [28].
Recent technological advances have enabled the production of digital soil maps by spatially representing soil-forming factors with digital datasets and linking them to the quantitative and qualitative attributes of soil properties obtained through field and laboratory analyses [29]. A key component of digital soil mapping (DSM) is the use of environmental covariates, which capture the soil–environment relationship. These typically include remotely sensed (RS) spectral data from satellite imagery (e.g., Sentinel-2, Landsat 8 OLI), as well as derivatives of digital elevation models (DEMs) such as SRTM, which provide topographic and relief information. Remotely sensed imagery contributes valuable insights into soil characteristics through empirical methods. Enhancements in the quality and resolution of these covariates directly improve the accuracy of DSM predictions, thereby opening new avenues for soil research in Portugal.
The most recent national LiDAR survey—conducted by the Direção-Geral do Território [30]—provides high-resolution elevation data across continental Portugal, with a point density of ~10 pts/m2, altimetric accuracy of ~10 cm, and openly available terrain and surface models at 50 cm and 2 m resolution [31]. Although its full potential has not yet been explored, this unprecedented dataset creates new opportunities for fine-scale digital mapping of soil properties, as well as broader applications in ecosystem monitoring and natural resource management.
The models applied in DSM range from linear methods to geostatistical and non-linear approaches. Among linear techniques, multiple linear regression (MLR) has been widely adopted to estimate SOC stocks, owing to its robustness with small sample sizes, methodological simplicity, and straightforward interpretability [32,33]. Geostatistical approaches such as ordinary kriging (e.g., [34]) and geographically weighted kriging [35] account for spatial correlation. Nevertheless, statistical and geostatistical models rest on three assumptions (linearity, stationarity, and noncollinearity) [36], which are often hard to meet. Moreover, it is difficult to limit the number of candidate covariates and thus avoid collinearity in the absence of a priori knowledge of large-scale SOC drivers [37]. With the increasing availability of environmental data, recent advances in modern statistical methods have expanded the scope of digital soil mapping beyond traditional geostatistical interpolation. These approaches integrate statistical learning, high-dimensional variable selection, and distribution-aware model evaluation to address non-linear relationships, multicollinearity, and uncertainty inherent in soil carbon data. In particular, modern statistical frameworks emphasize not only predictive accuracy but also bias, variance, and the representation of full response distributions, which are critical for robust inference in disturbance-driven ecosystems such as fire-affected forests. Consequently, a shift from geostatistical techniques such as kriging toward machine learning (ML) algorithms as the primary tools for spatial prediction has recently been observed [35]. Machine learning approaches have consistently demonstrated higher accuracy compared to traditional methods and offer additional advantages, including explicit consideration of uncertainty, reproducibility, and the ability to overcome the limitations of conventional soil mapping [38].
Nevertheless, selecting the most suitable prediction algorithm remains a challenge, as performance depends on the complexity of relationships between environmental covariates and soil characteristics. Each algorithm generates distinct spatial patterns of SOC distribution depending on its underlying statistical or mathematical principles, the sampling design and sample size, the feature selection approach, and the precision of the covariates used [39]. Among the most widely applied methods, Random Forest (RF), support vector regression (SVR), and XGBoost have generally provided the most accurate predictions. In addition, feature selection varies in its techniques and greediness in selecting the appropriate covariates. Methods such as Boruta, forward selection, or correlation analysis are the most common [40]. However, each algorithm has specific strengths and weaknesses, and their predictive accuracy can vary depending on the dataset and study design.
Several studies have aimed to quantify C stock at the global [41], continental [42], national [43,44,45], and regional scales [46,47,48] with large sample sizes and at different resolutions. However, to the best of our knowledge, no studies have investigated long-term changes in C stock and its covariates following fire using the DSM approach, especially at a fine scale with a small sample size. This is particularly relevant in cork oak forests, which are emblematic of the Mediterranean region and important for conservation (under the Natura 2000 network). In light of the above, this study aims to assess the performance of ML models combined with different feature selection approaches in quantifying, at high resolution, C stock and its covariates in cork oak forests that burned one and two decades ago. It further aims to identify the main drivers of these properties and to describe how C stock dynamics change across burn severity and fire history.

2. Materials and Methods

2.1. Characterization of Study Area and Soil Sampling

The study was conducted in the Serra do Caldeirão, a mountainous region located in the northeastern Algarve, southern Portugal (Figure 1). The area covers approximately 81,000 ha and is characterized by cork oak (Quercus suber L.) forest ecosystems with varying tree densities, interspersed with shrublands dominated by Cistaceae and Ericaceae species. The region has low altitudes (<500 m) but steep slopes (3–35%), making it prone to erosion. Soils are relatively shallow, developed on schists and greywackes of the Mira Formation (Carboniferous; [49]), and are mainly classified as Leptosols and Cambisols [50]. The climate is Mediterranean (Csa, Köppen classification; [51]), with mean annual precipitation of 600–800 mm and mean annual temperatures ranging between 12 and 24 °C (1981–2020; [52]).
The area was subject to two large wildfires that occurred in different years, the 2004 fire (two decades ago) that burned 17% and the 2012 fire (one decade ago) that burned 30% of the study area. The sampling took into consideration the three fire-history scenarios (one decade fire, two decade fire along with unburned area) (Figure 1).
A total of 47 plots (~441 m2 each) were randomly distributed using ArcGIS Pro 3.4.0 across these scenarios: 12 unburned, 25 burned one decade ago, and 10 burned two decades ago. Plots were placed randomly across the landscape, with sample sizes set so that all LULC categories were covered. Sampling was performed in autumn 2020; at each plot, three soil subsamples were collected along the slope gradient to account for within-plot variability.
At each sampling point, soil samples were taken randomly along the slope gradient, with accessibility taken into consideration. In total, 141 samples were taken at 0–5 cm depth (three per plot) and 96 samples were taken at 0–25 cm depth, corresponding to approximately two per plot. For each sample, both undisturbed and disturbed samples were collected. Undisturbed cores (5 cm diameter, 2 cm height) were collected for bulk density (BD) determination. Bulk density for the 0–25 cm layer was harmonized from measured 0–5 cm values using depth-specific scaling factors derived from SoilGrids bulk density profiles [41], following GSOCmap methodology [53], while disturbed soil samples were homogenized for physicochemical analysis. Disturbed samples (<2 mm fraction) were used for the determination of soil total C by a dry combustion method in an elemental analyzer [54]. To calculate the amount of organic carbon stock, Equation (1) was applied for each depth, 0–5 cm and 0–25 cm separately [55].
$$ SOC_{i}\ \text{stock}\ (\mathrm{Mg\,C\,ha^{-1}}) = OC_{i} \times BD_{\mathrm{fine},i} \times (1 - vG_{i}) \times t \times 0.1 \qquad (1) $$
where for the evaluation depth i, SOC is the soil organic carbon stock (MgC/ha); OC is the organic carbon content (g/kg) in the soil fraction (<2 mm); BDfine is the bulk density of fine soil; vG is the volumetric content of coarse fraction (>2 mm), t is the thickness of the soil; and 0.1 is the conversion factor.
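As a worked illustration of Equation (1), the following Python sketch computes the stock of a single layer. The function name and the input values are hypothetical. It assumes OC in g/kg, BDfine in g/cm3, vG expressed as a volume fraction between 0 and 1, and t in cm, which is the unit combination under which the factor 0.1 yields Mg C/ha.

```python
def soc_stock_mg_ha(oc_g_kg, bd_fine_g_cm3, coarse_vol_fraction, thickness_cm):
    """Equation (1): SOC stock (Mg C/ha) of one soil layer.

    Assumed units (not stated explicitly in the text): OC in g/kg,
    BDfine in g/cm3, vG as a 0-1 volume fraction, thickness in cm.
    """
    if not 0.0 <= coarse_vol_fraction < 1.0:
        raise ValueError("coarse fraction must be in [0, 1)")
    return (oc_g_kg * bd_fine_g_cm3 * (1.0 - coarse_vol_fraction)
            * thickness_cm * 0.1)


# Made-up example values for a 0-5 cm layer (not data from the study).
stock_mg_ha = soc_stock_mg_ha(oc_g_kg=30.0, bd_fine_g_cm3=1.2,
                              coarse_vol_fraction=0.10, thickness_cm=5.0)

# 1 Mg/ha equals 0.1 kg/m2, the unit used for the mapped stocks.
stock_kg_m2 = stock_mg_ha * 0.1
print(round(stock_mg_ha, 2), round(stock_kg_m2, 3))
```

With these illustrative inputs the layer holds 16.2 Mg C/ha, or 1.62 kg C/m2.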

2.2. Predictor Variables for Modeling

Several drivers and indicators influence soil carbon stock, including climatic factors such as temperature, precipitation, and solar radiation, which can control C inputs into the soil and decomposition process [56], topography, vegetation, and land use factors that can modify the distribution or rate of C loss or accumulation [57], along with intrinsic soil characteristics such as texture, structure, and organic matter type [58,59].
Fire, both as a natural and anthropogenic factor, strongly affects C stock, either directly through losses by combustion or volatilization [60,61], or indirectly through erosion [62] or changes in soil biological activity [63,64].
Data used for this study were obtained from various sources. All data were downloaded from the providers for the date of soil sampling or the closest available date. For remote sensing data, cloud cover was less than 2%. All covariates were calculated with different methodologies and converted to raster layers with a 10 m spatial resolution in a GIS environment, using the open-source software QGIS 3.16.1.
We used different types of environmental covariates: topographic, soil, climatic, land cover, and remote sensing indices. Topographic variables were extracted from the latest national DTM provided by DGT [30] at 10 m resolution, while soil data were downloaded from the SoilGrids database [41], except for pH, which was interpolated using the inverse distance weighting (IDW) algorithm. Soil moisture data were obtained from the work of Sungmin et al. [65]. Soil temperature and climate data were downloaded from the ECMWF ERA5-Land reanalysis (Copernicus Climate Change Service [C3S], [66]). Land cover was accounted for using the official national land use/land cover (LULC) map [67].
Remote sensing data were derived primarily from Sentinel-2 multispectral imagery (Level 2A), processed through Google Earth Engine, where various multispectral indices were derived, and complemented with Landsat 8 OLI surface temperature.
In total, 48 different covariates were compiled and prepared at a harmonized spatial resolution of 10 m (remote sensing and topography) or the native resolution of the source product (climate and soil). A full list of covariates, sources, resolutions, and references is provided in Table A1.

2.3. Modeling Approaches

Several algorithms were tested to achieve the best performance with the data collected for the variables analyzed. According to the literature, and given the sample size available here, MLR, XGBoost, RF, and SVR have shown the best predictive accuracy [47,48,68,69,70,71,72]. To reduce redundancy among predictors and improve model interpretability, several feature selection methods were applied prior to model calibration, a necessary step to reduce noise and guard against overfitting. Each feature selection method was tested independently in combination with different algorithms to ensure a fair comparison and to identify the best-performing combination of feature selection method and model, as shown in Table 1.

2.3.1. Feature Selection Methods

The Variance Inflation Factor (VIF) quantifies how much the variance of a regression coefficient is inflated due to collinearity among predictors, with higher values indicating stronger multicollinearity [73,74]. A commonly used threshold is VIF > 10, beyond which predictors are considered to exhibit problematic collinearity.
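The VIF screening can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation, which relied on R. It uses the identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix. The covariate names and values are made up, and the helper names `pearson`, `invert`, and `vif` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: perfectly collinear predictors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: inv[i][i] for i, name in enumerate(names)}


# Made-up covariate values, for illustration only.
covariates = {
    "elevation": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "slope": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "ndvi": [5.0, 3.0, 6.0, 1.0, 4.0, 2.0],
}
scores = vif(covariates)
flagged = [name for name, value in scores.items() if value > 10]  # VIF > 10
print(scores, flagged)
```

In practice, predictors are often removed one at a time, starting with the highest VIF, and the scores are recomputed after each removal. The text does not say whether that iterative variant was used.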
Pearson pairwise correlation analysis is a widely used feature selection method to identify and reduce redundancy among predictors. By excluding one variable from pairs with very high correlation, this approach minimizes collinearity while retaining representative covariates, thereby improving model stability [75]. Such correlation-based filtering is commonly applied in environmental modeling and digital soil mapping to handle highly interrelated terrain covariates, typically by excluding covariates correlated at |r| ≥ 0.8 [48,76].
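A minimal Python sketch of such a filter is given below. The text does not specify which member of a highly correlated pair is discarded, so this sketch assumes a simple greedy rule: covariates are visited in input order and one is kept only if its |r| with every covariate already kept is below the threshold. The function names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_filter(columns, threshold=0.8):
    """Keep a covariate only if |r| < threshold with all kept covariates.

    Assumption: ties are resolved by input order (earlier covariate wins).
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold
               for k in kept):
            kept.append(name)
    return kept


# Made-up values: "b" is almost a multiple of "a", "c" is only
# moderately correlated with "a".
covariates = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.5],
    "c": [3.0, 1.0, 2.0, 5.0, 4.0],
}
print(correlation_filter(covariates))
```

Other tie-breaking rules are possible, for example dropping the member of the pair with the larger mean absolute correlation to all remaining covariates.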
The Boruta algorithm, a wrapper feature selection method based on Random Forests, was applied to identify predictors significantly related to the target variable. The method evaluates the importance of each predictor by comparing it with “shadow” attributes created through random permutation of the original predictors. Predictors whose importance consistently exceeded that of the shadow attributes were classified as confirmed, while those performing no better than random were rejected. This method proved to be highly efficient in digital soil mapping as it reduces overfitting, improves model performance, and highlights the environmental covariates most relevant for digital soil mapping [67,68].
Forward Feature Selection (FFS) is a greedy search algorithm that iteratively builds the predictor set by adding one variable at a time. The procedure starts with no predictors and, at each step, evaluates the performance of candidate models created by adding one remaining predictor. The variable that leads to the greatest improvement in model performance (e.g., lowest RMSE or highest R2) is retained, and the procedure stops when no further improvement is achieved. FFS has proven well suited to spatial variable selection and can be combined with cross-validation to identify the predictor set that maximizes performance [77].
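The greedy loop described above can be sketched as follows. This is an illustrative outline, not the implementation used in the study. The scoring callback `score_fn` stands in for whatever error estimate is chosen (for example a cross-validated RMSE), `min_rel_gain` is a hypothetical parameter for a minimum relative improvement, and the toy scores are invented.

```python
def forward_feature_selection(candidates, score_fn, min_rel_gain=0.0):
    """Greedy forward selection that minimizes an error score.

    score_fn(list_of_features) must return an error to minimize.
    A feature is added only if it lowers the best score so far by
    more than min_rel_gain (as a fraction of that score).
    """
    selected = []
    best = float("inf")
    remaining = list(candidates)
    while remaining:
        # Score every model obtained by adding one remaining feature.
        trial_score, trial_feature = min(
            (score_fn(selected + [f]), f) for f in remaining)
        if not trial_score < best * (1.0 - min_rel_gain):
            break  # no sufficient improvement: stop
        selected.append(trial_feature)
        remaining.remove(trial_feature)
        best = trial_score
    return selected, best


def toy_rmse(features):
    """Invented error surface: each feature lowers RMSE by a fixed amount."""
    gains = {"silt": 0.30, "clay": 0.15, "curvature": 0.001}
    return 1.0 - sum(gains.get(f, 0.0) for f in features)


chosen, final_rmse = forward_feature_selection(
    ["silt", "clay", "curvature", "noise"], toy_rmse, min_rel_gain=0.01)
print(chosen, round(final_rmse, 3))
```

In this toy run "curvature" is rejected because it lowers the error by less than 1% of the current best, which shows how a relative threshold keeps marginal predictors out of the final set.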

2.3.2. Regression Algorithms

Multiple linear regression (MLR) models a response as a linear combination of several quantitative predictors. It is commonly used in soil studies, especially with small sample sizes, to assess the relationship between environmental covariates and a soil property through the regression coefficients (b), which are estimated by least squares [29,78,79].
Random Forest (RF) is a well-known ML algorithm developed by Breiman [80]. It operates by generating multiple bootstrap samples (ntree) from the original dataset and selecting random subsets of predictors (mtry) at each split.
The model trains one decision tree per bootstrap sample and combines the trees to produce a single prediction for each observation. Observations left out of a given bootstrap sample (out-of-bag, OOB) are used to estimate prediction error. The algorithm outputs both the OOB error and variable importance measures, which evaluate prediction accuracy and quantify each variable’s contribution.
RF is widely applied in digital soil mapping (DSM) for its robustness across diverse data sources and soil heterogeneity on local and global scales [41,43,48,67,68,81].
Support Vector Machine (SVM), introduced by Vapnik [82], is a kernel-based algorithm widely used to analyze non-linear relationships in a high-dimensional induced feature space and capable of handling both classification and regression tasks. Its regression variant, support vector regression (SVR), is particularly effective with small training datasets, often achieving high predictive accuracy [83]. It is an effective machine learning method for mapping soil properties and has been widely used by soil mappers in recent years [84,85]. The performance of the SVR model is strongly influenced by the choice of kernel function.
eXtreme Gradient Boosting (XGBoost) is a supervised learning algorithm belonging to the family of gradient boosting models [86]. It is recognized for its speed, accuracy, and ability to handle both classification and regression tasks through additive training. By iteratively fitting new learners to the residuals of previous ones and regularizing the objective, XGBoost improves predictive performance while limiting overfitting, making it a frequent winner in machine learning competitions [87,88]. Training involves multiple boosting rounds until stopping criteria are met, with model performance strongly dependent on careful hyperparameter tuning.

2.3.3. Prediction Validation and Uncertainties Estimation

With respect to feature selection methods, the Variance Inflation Factor (VIF) threshold was set to less than 10. Pearson pairwise correlation-based filtering was implemented, excluding covariates correlated at |r| ≥ 0.8. Forward Feature Selection (FFS) was implemented with the inclusion criterion that a variable is added only if it reduces RMSE by more than 1%. Finally, Boruta was run for up to 200 iterations (maxRuns = 200), and only “Confirmed” attributes were retained for modeling.
MLR was implemented using only VIF as a feature selection method to assess collinearity [72,73]. For categorical predictors, we used the Generalized VIF (GVIF), adjusted following Fox & Monette [89] and Fox & Weisberg [90], to allow comparability with numeric predictors. In contrast, for the other algorithms, three feature selection methods were applied (Pearson pairwise comparison, Boruta, and FFS). RF was implemented with the ranger package in R, using 1000 trees. SVR was implemented with the radial basis function (RBF) kernel to minimize bias (kernlab engine), and predictors were standardized before training. XGBoost was implemented using 1000 boosting rounds.
All algorithms were implemented with the tidymodels, car, Boruta, ranger, purrr, and ggplot2 packages in R (version 4.4.2). Hyperparameter tuning was performed using a grid search method with 10-fold cross-validation. This approach, widely applied in digital soil mapping, divides the dataset into multiple training and testing subsets to ensure robust assessment. Cross-validation is particularly suitable for studies with limited sample sizes in regions with very limited data availability, such as mountainous forests [91,92]. The root mean square error (RMSE) was used as the primary selection metric, while mean absolute error (MAE) and the coefficient of determination (R2) with their associated standard errors were also reported for model evaluation.
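For readers who want to see the evaluation step spelled out, the sketch below implements the three metrics and a k-fold splitter in plain Python. It is illustrative only, since the study used tidymodels in R. R2 is computed here as 1 − SSres/SStot, which is one common definition (the squared Pearson correlation is another), and the text does not state which one was reported. The function names and the toy data are hypothetical.

```python
import random
from math import sqrt


def rmse(obs, pred):
    """Root mean square error."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k shuffled, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


# Toy example: cross-validate a mean-only baseline on made-up values.
obs = [float(v) for v in range(1, 21)]
fold_rmse = []
for train, test in kfold_indices(len(obs), k=5, seed=1):
    mean_train = sum(obs[i] for i in train) / len(train)
    fold_rmse.append(rmse([obs[i] for i in test], [mean_train] * len(test)))
cv_rmse = sum(fold_rmse) / len(fold_rmse)
print(round(cv_rmse, 3))
```

In a grid search, this loop would be repeated for every hyperparameter combination and the combination with the lowest mean cross-validated RMSE retained.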

2.3.4. Feature Importance, Model Validation, and Uncertainties Mapping

We quantified the influence of environmental covariates on each target variable using a harmonized, permutation-based feature importance. Each predictor was independently permuted (n = 200), and the resulting increase in root mean square error (ΔRMSE) relative to the baseline model quantified its importance. The mean ΔRMSE represented the raw importance, and its standard deviation provided uncertainty estimates. To enable comparison across models and depth intervals, raw importances were normalized as percentages of total positive importance.
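The permutation procedure can be outlined in Python as follows. This is a sketch under stated assumptions, not the authors' code. The model is represented by a hypothetical `predict` callable that takes a dictionary of covariates, and "percentage of total positive importance" is interpreted as clipping negative mean ΔRMSE values to zero before normalizing. The toy model and data are invented.

```python
import random
from math import sqrt
from statistics import mean, stdev


def rmse(obs, pred):
    """Root mean square error."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def permutation_importance(predict, rows, target, features,
                           n_repeats=200, seed=0):
    """Mean and SD of delta-RMSE per feature, plus normalized percentages."""
    rng = random.Random(seed)
    baseline = rmse(target, [predict(r) for r in rows])
    raw = {}
    for f in features:
        deltas = []
        for _ in range(n_repeats):
            shuffled = [r[f] for r in rows]
            rng.shuffle(shuffled)  # break the link between f and the target
            permuted = [dict(r, **{f: v}) for r, v in zip(rows, shuffled)]
            deltas.append(
                rmse(target, [predict(r) for r in permuted]) - baseline)
        raw[f] = (mean(deltas), stdev(deltas))
    # Assumption: negative importances are clipped to zero.
    total = sum(max(m, 0.0) for m, _ in raw.values())
    pct = {f: (100.0 * max(m, 0.0) / total if total > 0 else 0.0)
           for f, (m, _) in raw.items()}
    return raw, pct


# Toy model that depends only on "total_c"; "nbr" is ignored.
rows = [{"total_c": float(i), "nbr": float(i % 3)} for i in range(12)]
target = [2.0 * r["total_c"] for r in rows]
raw, pct = permutation_importance(
    lambda r: 2.0 * r["total_c"], rows, target, ["total_c", "nbr"],
    n_repeats=50, seed=42)
print(pct)
```

Because the toy model ignores "nbr", permuting it leaves the error unchanged and all of the normalized importance falls on "total_c".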
Predicted soil property maps derived from digital soil mapping models often exhibit systematic biases relative to measured field data. Such discrepancies typically affect not only the mean values but also the spread and shape of predicted distributions. Predicted values were compared against observations using the Kolmogorov–Smirnov (KS) test [93,94], and the resulting maps were subsequently bias-corrected using quantile mapping (QM) [95,96]. Uncertainty was mapped using the standard deviation (SD) derived from 50 model runs, following the approach proposed by the Global Map Project [97,98,99].
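To make the distributional check concrete, the sketch below computes the two-sample KS distance and applies an empirical quantile-mapping correction, the approach named in the abstract. The exact variant used in the study is not described, so the rank-based empirical form here is an assumption, and the function names and numbers are hypothetical.

```python
from bisect import bisect_left, bisect_right


def ks_distance(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    return max(abs(bisect_right(sa, v) / len(sa)
                   - bisect_right(sb, v) / len(sb)) for v in sa + sb)


def quantile(sorted_vals, p):
    """Linearly interpolated quantile of sorted values, p in [0, 1]."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def quantile_map(pred, obs):
    """Replace each prediction by the observed value at the same rank."""
    sp, so = sorted(pred), sorted(obs)
    n = len(sp)
    mapped = []
    for v in pred:
        # Mid-rank of v among the predictions, rescaled to [0, 1].
        p = 0.5 if n == 1 else (
            (bisect_left(sp, v) + bisect_right(sp, v) - 1) / (2 * (n - 1)))
        mapped.append(quantile(so, p))
    return mapped


# Made-up values: predictions are over-smoothed relative to observations.
observed = [0.0, 2.0, 4.0, 6.0, 10.0]
predicted = [3.0, 4.0, 5.0, 6.0, 7.0]
corrected = quantile_map(predicted, observed)
print(ks_distance(predicted, observed), ks_distance(corrected, observed))
```

The mapping preserves the rank order of the predictions and therefore their spatial pattern, while stretching the narrow predicted range back to the spread of the observations.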

2.4. Long-Term Soil Property Dynamics Following Fire

C stock was assessed across severity levels and years since the last fire to better understand its underlying dynamics. Changes in total C, bulk density, and C stock were assessed individually for both depths (0–5 cm and 0–25 cm). To do so, burn severity was assessed using the Differenced Normalized Burn Ratio (dNBR), a spectral index commonly applied in burned area and burn severity mapping [62,100,101]. The dNBR quantifies fire-induced ecosystem changes by calculating differences in near-infrared and short-wave infrared reflectance between pre- and post-fire conditions [102,103]. Two dNBR maps were generated by differencing the pre- and post-fire NBR images for the 2012 and 2004 fire events. Landsat 5 imagery was used for the 2004 fire, while Landsat 7 ETM+ data were employed for the 2012 fire. The resulting dNBR maps were then reclassified following the burn severity thresholds presented in Table A2.
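The index itself is simple to compute. The sketch below uses the standard definitions NBR = (NIR − SWIR)/(NIR + SWIR) and dNBR = NBR(pre-fire) − NBR(post-fire), applied pixel by pixel. The reflectance values are made up, the function names are hypothetical, and the severity classes of Table A2 are deliberately not reproduced here.

```python
def nbr(nir, swir):
    """Normalized Burn Ratio for one pixel (NaN if both bands are zero)."""
    total = nir + swir
    return (nir - swir) / total if total != 0 else float("nan")


def dnbr(pre_nir, pre_swir, post_nir, post_swir):
    """Differenced NBR: pre-fire NBR minus post-fire NBR."""
    return nbr(pre_nir, pre_swir) - nbr(post_nir, post_swir)


# Made-up surface reflectances for three pixels (pre- and post-fire).
pre = [(0.40, 0.10), (0.35, 0.15), (0.30, 0.20)]   # (NIR, SWIR)
post = [(0.20, 0.30), (0.30, 0.18), (0.30, 0.20)]

dnbr_values = [dnbr(pn, ps, qn, qs)
               for (pn, ps), (qn, qs) in zip(pre, post)]
print([round(v, 3) for v in dnbr_values])
```

Higher dNBR values indicate a larger drop in NBR after the fire, and the third pixel, whose reflectance is unchanged, has a dNBR of zero.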

3. Results

3.1. Model Performance

The models’ performance in the surface layer (0–5 cm) varied depending on soil properties and feature-selection methods (Figure 2). With respect to bulk density, the Random Forest with Forward Feature Selection (RF–FFS) model achieved the highest accuracy and lowest error (R2 = 0.63, RMSE = 0.15, MAE = 0.09). The SVR–Boruta model performed comparably (R2 = 0.62, RMSE = 0.16, MAE = 0.11). RF–Boruta and RF–Corr < 0.8 both yielded R2 = 0.61, with RMSE between 0.16 and 0.17 and MAE between 0.10 and 0.12. The XGBoost and MLR models showed substantially lower performance (R2 ≤ 0.37, RMSE ≥ 0.21, MAE ≥ 0.17) (Figure 2).
In addition, the RF–FFS model provided the best prediction for total C (R2 = 0.63, RMSE = 12.6, MAE = 8.55). This was closely followed by SVR–Boruta (R2 = 0.61, RMSE = 12.4, MAE = 8.24) and SVR–Corr < 0.8 (R2 = 0.60, RMSE = 13.1, MAE = 8.23). The XGBoost–Boruta and XGBoost–Corr < 0.8 models achieved R2 values of 0.57–0.59, with RMSE values between 13.6 and 14.0. MLR–VIF was the weakest performer (R2 = 0.36, RMSE = 17.5, MAE = 12.9). C stock prediction achieved the highest overall accuracy among all soil properties and the lowest variation in model performance. SVR-FFS performed best (R2 = 0.84, RMSE = 0.05, MAE = 0.04), followed by RF–Boruta (R2 = 0.82, RMSE = 0.06, MAE = 0.04), RF–Corr < 0.8 (R2 = 0.81, RMSE = 0.06, MAE = 0.04), and MLR–VIF (R2 = 0.74, RMSE = 0.07, MAE = 0.05). XGBoost models were the least effective (R2 ≤ 0.57, RMSE ≥ 0.10, MAE ≥ 0.08) (Figure 2). Across all properties for that depth, Forward Feature Selection (FFS) consistently provided the highest R2 and lowest RMSE and MAE values, particularly in RF and SVR models. Boruta occasionally enhanced performance for surface total C and C stock, but its advantage was marginal compared to FFS. Correlation-based filtering (Corr < 0.8) did not yield significant improvements (Figure 2).
Model performance decreased slightly at 0–25 cm with an increase in variation in performance across properties, more prominently for bulk density (Figure 2). Bulk density predictions were low across all models. However, the SVR–FFS model achieved the best prediction accuracy (R2 = 0.43, RMSE = 0.198, MAE = 0.155), followed by XGBoost–FFS (R2 = 0.34, RMSE = 0.217, MAE = 0.180). RF models performed moderately (R2 = 0.23–0.29, RMSE = 0.219–0.241, MAE = 0.187–0.205), while MLR–VIF remained weakest (R2 = 0.23, RMSE = 0.227, MAE = 0.192). For total C, the SVR–FFS model ranked first (R2 = 0.61, RMSE = 5.33, MAE = 4.25), closely followed by XGBoost–FFS (R2 = 0.58, RMSE = 5.51, MAE = 4.82). The RF and SVR models with Boruta or Corr < 0.8 achieved intermediate results (R2 = 0.45–0.48, RMSE = 6.3–6.5, MAE = 5.0–5.1), with MLR–VIF ranking last (R2 = 0.34, RMSE = 7.75, MAE = 6.28) (Figure 2).
Carbon stock predictions at this depth showed high model agreement: the SVR and RF models had similar accuracies (R2 = 0.82–0.83), except for RF–Corr < 0.8 (R2 = 0.79). RMSE varied with the feature selection method applied. Overall, SVR provided the best performance with FFS (R2 = 0.83, RMSE = 0.80, MAE = 0.98). XGBoost-FFS models reached moderate accuracy (R2 = 0.78–0.82), while MLR–VIF remained the least accurate (R2 = 0.72, RMSE = 1.28, MAE = 1.02). FFS remained the dominant feature selection approach, producing the highest R2 and the lowest RMSE and MAE across BD, total C, and C stock predictions. Boruta performed moderately but did not exceed FFS for any property (Figure 2).

3.2. The Main Drivers and Uncertainties: Bulk Density, Total C, and C Stock Spatial Distribution

Different features were important for each soil property studied, and their importance changed across depths (Figure 3). At 0–5 cm depth, the spatial distribution of bulk density was mainly influenced by silt content, which accounted for 44.8% of the total importance, followed by clay content (22.8%), TC greenness (19.3%), and minimal curvature (13.1%). On the other hand, surface temperature (29.3%) and bulk density (22.8%) were the dominant predictors of total C, followed by terrain ruggedness index (13.9%), silt content (12.3%), curvature (11.1%), and northness (10.6%). For C stock, total C was by far the most important predictor (49.3%), followed by NBR (16.2%), TC wetness (15.7%), and bulk density (13.1%) (Figure 3).
At 0–25 cm depth, the relative importance of predictors shifted for all three soil properties (Figure 3). Bulk density spatial predictions were primarily influenced by aspect (35.2%), followed by Band 8 (19.8%), eastness (17.9%), NBR2 (16.0%), and Band 4 (11.2%). For total C, a broader set of variables made similar contributions; the most important were GNDVI (14.1%), bulk density (13.1%), and surface temperature (11.5%), followed by a set of terrain-derived parameters and silt content. For C stock, total C again dominated (36.3%), followed by northness (18.3%), elevation (17.9%), GNDVI (10.3%), temperature seasonality (9.3%), and slope length (7.9%) (Figure 3).
Soil property values at 0–5 and 0–25 cm depth are summarized in Table 2, and their respective maps are shown in Figure 4 and Figure 5. For the 0–5 cm soil layer, the predicted values for bulk density ranged between 1.06 and 1.82 g cm−3, while total C ranged between 23.42 and 65.09 g kg−1. The predicted C stock values for this layer varied between 0.10 and 0.81 kg m−2. For that depth, the RF model slightly underestimated mean values for bulk density (−1.3%) and total C (−3.0%) and substantially underestimated variability (SD reduced by 38–55%), with KS distance ranging between 17% and 39%, indicating moderate deviation between predicted and observed distributions. The predicted ranges were narrower, with the maximum values underestimated and the minimum values overestimated, confirming that the model produced over-smoothed predictions.
On the other hand, for the 0–25 cm soil layer, the observed bulk density values ranged between 0.82 and 1.93 g cm−3, while the predicted values ranged between 1.12 and 2.02 g cm−3. The observed total C values varied between 5.28 and 42.70 g kg−1, compared to 4.70 and 35.68 g kg−1 for the predicted values. Similarly, observed C stock ranged between 0.22 and 12.25 kg m−2, whereas the predicted values ranged between 0.41 and 10.54 kg m−2 (Table 2).
At this depth, SVR exhibited more pronounced smoothing, which was most evident in the mean values. Bulk density was slightly overestimated (–2%), while the mean deviation was larger for total C (9%) and highest for C stock (26%) (Table 2). Standard deviations were lower than observed, with moderate deviation between distributions (KS distance between 16 and 26%). Predicted ranges were truncated, with a tendency to underestimate the maximum values, except for bulk density, where both range limits were overestimated. Quantile Mapping reduced the KS distance for all properties to less than 15%, even for the most deviating property, across both depths (Table 2).
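The KS distance used above to compare predicted and observed distributions is the largest vertical gap between the two empirical cumulative distribution functions. The sketch below shows a pure-Python version of the two-sample statistic. The function name `ks_distance` and the sample values are hypothetical, and expressing the result as a percentage (statistic × 100) is an assumption about how the values in the text are reported.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    for x in a + b:  # the empirical CDFs only change at observed values
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical bulk density values (g cm-3): predictions are over-smoothed
observed = [0.8, 1.1, 1.3, 1.5, 1.7, 1.9]
predicted = [1.2, 1.3, 1.4, 1.5, 1.5, 1.6]
d = ks_distance(observed, predicted)
print(f"KS distance: {100 * d:.0f}%")
```

A statistic of 0 means the two empirical distributions coincide; values closer to 1 indicate that predictions occupy a different part of the value range than the observations.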

3.3. Long-Term Soil Property Dynamics Following Fire

The severity mapping results showed that more than half of the total area was affected by the fires under study: 23,500 ha (29%) in 2004 and 24,900 ha (31%) in 2012 (Table A3). Although the affected areas were extensive, most of the burned areas had low (68.3%) or medium–low (20.2%) severity (Table A3). The area affected by medium–high severity corresponded to less than 1% of the total area, so this was combined with the medium–low severity category to create a single “medium severity” class. Areas classified as having high post-fire severity represented less than 0.01% and were excluded from the subsequent analysis.
Changes in bulk density, total C, and C stock according to severity levels and year since last fire for 0–5 cm and 0–25 cm are shown in Figure 6.
Overall, changes at the 0–5 cm depth were minimal. Bulk density values ranged between 1.49 and 1.51 g cm−3, remaining stable across all years and treatments. Total C concentrations averaged 39.76 g kg−1 in unburned soils and increased slightly after fire, ranging from 44.35 to 44.62 g kg−1 for low severity and 44.67 to 44.88 g kg−1 for medium severity after 11 and 19 years, respectively. The C stock values averaged 0.32 kg m−2 in unburned soils and changed only marginally over time, with averages of 0.36 kg m−2 in 2012 and 0.35 kg m−2 in 2019 for both low and medium severities (Figure 6).
On the other hand, at the 0–25 cm depth, modest differences were observed across years and severity levels. Bulk density averaged 1.50 g cm−3 in unburned soils and decreased after fire, especially under medium severity (1.39 g cm−3 after 11 years). By 19 years, values had increased slightly to 1.43 g cm−3, though they remained below pre-fire levels. Total C concentrations averaged 20.78 g kg−1 in unburned soils and increased slightly over time, particularly under medium severity, reaching 22.65 and 24.42 g kg−1 after 11 years and 23.45 and 26.00 g kg−1 after 19 years for low and medium severity, respectively. C stock averaged 3.97 kg m−2 in unburned soils and declined following fire, reaching 3.69 and 3.41 kg m−2 for low and medium severity plots, respectively, 11 years after burning. At 19 years after fire, low severity sites showed partial recovery (3.77 kg m−2), while medium-severity plots remained largely unchanged.
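The size of the 0–25 cm C stock decline relative to unburned soils can be worked out directly from the averages quoted above. The short sketch below is only an illustration of that arithmetic; the helper name `percent_change` is hypothetical.

```python
def percent_change(reference, value):
    """Relative change of value with respect to reference, in percent."""
    return (value - reference) / reference * 100.0


# Mean C stock at 0-25 cm quoted in the text (unburned soils as reference)
unburned_c_stock = 3.97                            # kg m-2
c_stock_11_years = {"low": 3.69, "medium": 3.41}   # kg m-2, 11 years after fire

changes = {
    severity: round(percent_change(unburned_c_stock, value), 1)
    for severity, value in c_stock_11_years.items()
}
print(changes)
```

With these inputs, the decline 11 years after fire is roughly 7% under low severity and 14% under medium severity.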

4. Discussion

4.1. Model Performance

The performance of machine learning models can vary because each model operates differently due to its unique structure. Each algorithm has distinct capabilities that interact with the complexity, dimensionality, and non-linearity of the predictors chosen.
Across all three properties, non-linear models consistently outperformed the linear MLR baseline except for C stock at 0–25 cm. Soil–environment relationships in Mediterranean forests are non-linear, multicollinear, and interactive [104]. MLR’s weak performance reinforces that linear additive models are inadequate for capturing the complex feedback between vegetation, microclimate, and soil development in forested Mediterranean ecosystems.
Overall, SVR and RF were the top-performing algorithms, a result that is consistent with numerous studies across the Mediterranean Basin [48,105,106]. For the 0–5 cm depth, RF and SVR were the top performers, both coupled with FFS. RF excelled for bulk density and total C, achieving R = 0.63 with relatively low RMSE and MAE. SVR showed the best performance for carbon stock estimation, achieving R = 0.84 and the lowest RMSE of all combinations analyzed.
The high environmental heterogeneity captured by the different remote sensors and covariates favors RF at the surface depth, as it excels at handling high-dimensional data, variable collinearity, and non-linear interactions. XGBoost builds trees sequentially, each correcting the errors of the previous ensemble, while optimizing a regularized objective function. However, the low sample size limits its performance considerably: the regularization parameters and tree depth settings that prevent overfitting can also suppress subtle patterns that are key to soil prediction.
At 0–25 cm, a lower accuracy was observed for bulk density, whereas the change in accuracy for total C and C stock was negligible. A decrease in accuracy along the depth gradient is commonly observed and has been reported in the literature [41,78]. This is probably due to the deeper, more heterogeneous soil processes, as well as the limited ability of some covariates, such as multispectral bands, to capture underlying subsoil processes: subsurface properties are less directly sensed by remote imagery and more affected by cumulative processes such as leaching, root turnover, or compaction.
At this depth, SVR became the clearly dominant algorithm across all properties, indicating superior generalization to deeper, more heterogeneous soil processes. It achieved R2 = 0.43 for bulk density (lower than in the surface layer), R2 = 0.61 for total C, and R2 = 0.83 for C stock. RF ranked second, while XGBoost–FFS ranked last.
This suggests that SVR’s kernel-based flexibility effectively captured non-linear depth-dependent relationships that tree-based models (RF, XGBoost) struggled with. The superior performance of SVR with RBF kernel reflects its ability to capture smooth, non-linear gradients driven by topography, climate, and vegetation, whereas RF relies more heavily on surface heterogeneity that diminishes with depth.
Across all properties and depths studied, Forward Feature Selection (FFS) was consistently associated with the highest R2 and lowest RMSE and MAE values, particularly in RF and SVR models. Boruta occasionally enhanced performance for surface total C and C stock, but its advantage was marginal compared to FFS. Although different in structure and functioning, both feature selection methods produce small sets of largely uncorrelated covariates, which makes model interpretation more convenient. Feature selection studies have shown that model-based selection (FFS, RFE) enhances accuracy by 5–10% over filter-based methods, especially for SVR and RF [47,77].
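The general idea behind forward selection can be illustrated with a generic greedy loop: starting from an empty set, the covariate that most improves a model score is added until no candidate brings a further gain. This is a simplified sketch and not the FFS implementation used in the study. The function `forward_feature_selection`, the toy scoring function, and the gain values are all hypothetical; in practice the score would be a cross-validated accuracy measure from the fitted model.

```python
def forward_feature_selection(candidates, score_fn, min_gain=0.0):
    """Greedy forward selection: add the feature that most improves the score."""
    selected = []
    best_score = float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score every one-feature extension of the current subset
        trials = [(score_fn(selected + [f]), f) for f in remaining]
        trial_score, trial_feature = max(trials, key=lambda t: t[0])
        if trial_score - best_score <= min_gain:
            break  # no remaining feature improves the score enough
        selected.append(trial_feature)
        remaining.remove(trial_feature)
        best_score = trial_score
    return selected, best_score


# Toy stand-in for a cross-validated score (hypothetical gains, additive)
TOY_GAINS = {"silt": 0.30, "clay": 0.15, "tc_greenness": 0.10, "noise": -0.05}


def toy_score(subset):
    return sum(TOY_GAINS[name] for name in subset)


selected, score = forward_feature_selection(list(TOY_GAINS), toy_score)
print(selected, round(score, 2))
```

Because each step re-evaluates the model with the candidate added, a wrapper of this kind tends to discard redundant covariates, which is consistent with the compact predictor sets described above.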

4.2. The Main Drivers and Uncertainties: Bulk Density, Total C, and C Stock Spatial Distribution

The strong influence of silt and clay contents on bulk density in the 0–5 cm layer highlights the dominant role of soil texture in determining soil compaction and porosity near the surface. Finer-textured soils (high clay and silt fractions) tend to have greater aggregate stability. The contribution of Tasseled Cap (TC) greenness further indicates that vegetation cover influences surface soil structure through root activity and organic matter inputs, which improve aggregation and reduce compaction.
However, at 0–25 cm, aspect, eastness, and spectral bands B4 and B8 became more influential. Aspect is an important factor in determining the amount of incoming solar radiation, soil moisture, and vegetation cover. On dry aspects (e.g., south-facing in the Northern Hemisphere), reduced moisture and lower organic matter accumulation lead to diminished soil structure and increased compaction, resulting in higher bulk density. In contrast, cooler, moister aspects (e.g., north-facing) support greater plant productivity, root penetration, and organic matter input, which promote larger pore volume and lower bulk density [107]. The spectral bands B8 and B4 jointly act as a proxy for vegetation-related attributes such as structure and biomass. These variables likely capture microclimatic gradients (light and moisture availability) and differences in vegetation structure. The presence of NBR2 among the key predictors suggests that long-term vegetation and soil surface properties, possibly influenced by historic fire events or other disturbances, may continue to shape subsurface bulk density patterns, mainly through post-fire changes in vegetation recovery that remain visible. NBR and NBR2 have proven to be solid indicators of long-term fire legacy spanning more than a decade [108]. However, a direct causal relationship cannot be established, especially at that depth, since the indicated covariates are multispectral and largely related to surface reflectance.
At 0–5 cm, surface temperature and bulk density emerged as key predictors of total C, reflecting the combined influence of microclimate and soil physical properties on organic matter accumulation. Areas with lower surface temperatures often have dense vegetation cover and retain more organic carbon through litter input and limited decomposition rates, whereas high bulk density mediates aeration and thus microbial activity [109,110]. In addition, total C, as a proxy of organic matter, strongly influences bulk density and vice versa [111]. The contribution of terrain ruggedness, curvature, and northness underscores the role of topographic heterogeneity in redistributing organic matter through erosion and deposition [112,113].
In contrast to previous works in which the spatial distribution of total C content was governed only by vegetation [68], at the deeper layer (0–25 cm) total C spatial variability was influenced by an ensemble of different features, such as vegetation indices (GNDVI), bulk density, surface temperature, topographic indices (slope, curvature, elevation), B2, and silt content. Their contributions were similar in magnitude, suggesting that the interplay between vegetation density and landscape position jointly influences subsoil carbon distribution, mediated by bulk density. The moderate importance of surface temperature at this depth implies a continuing, though weaker, climatic control on carbon storage, for the same reasons described above for the uppermost layer.
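Aspect is a circular variable, so it is commonly decomposed into northness and eastness before modeling. The sketch below uses the standard cosine and sine transformation and assumes aspect is given in degrees clockwise from north. How the study derived its northness and eastness layers is not stated here, and the function name `aspect_components` is hypothetical.

```python
import math


def aspect_components(aspect_degrees):
    """Return (northness, eastness) for an aspect measured clockwise from north."""
    radians = math.radians(aspect_degrees)
    northness = math.cos(radians)  # +1 = north-facing, -1 = south-facing
    eastness = math.sin(radians)   # +1 = east-facing,  -1 = west-facing
    return northness, eastness


for label, degrees in [("north", 0), ("east", 90), ("south", 180), ("west", 270)]:
    northness, eastness = aspect_components(degrees)
    print(f"{label:>5}: northness={northness:+.2f}, eastness={eastness:+.2f}")
```

Both components range from −1 to 1, so slopes facing 359° and 1° receive nearly identical values instead of appearing at opposite ends of a 0–360 scale.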
C stock is one of the most studied parameters in the literature; however, features that dictate its spatial variability are highly variable depending on the region, the study area, the chosen covariates, and the resolution.
For both depth intervals, total C was the dominant determinant of carbon stock, as expected from their direct relationship. At 0–5 cm, NBR and TC wetness ranked second, with similar importance, a pattern that is rarely reported in comparable studies, where land cover, vegetation, geomorphometric variables, or climatic variables are usually the most prominent [48,114,115]. The continued relevance of NBR suggests the persistence of fire legacy on C stock even 10 and 20 years after fire [108,116].
Even though TC wetness is primarily an indicator of surface moisture, several authors have shown its relevance, along with NBR and NBR2, for distinguishing post-fire succession dynamics and long-term fire effects over more than eight years [108,117,118].
Despite its direct relationship with C stock, bulk density was among the least important predictors of C stock, along with EVI 90. EVI 90 indicates higher vegetation density, higher humidity, lower surface temperature, and higher total C input, which, along with sensitivity to surface moisture, are all closely associated with organic carbon accumulation. The continued relevance of NBR in predicting C stock may reflect not only current vegetation conditions but also persistent post-disturbance differences in vegetation composition and canopy density.
At 0–25 cm, along with total C as the most influential feature, C stock was primarily predicted by topographic predictors (northness, elevation, slope length), vegetation density (GNDVI), and temperature seasonality; no index directly related to fire effects was selected. This suggests that topographic and climatic variables exert more consistent control on carbon accumulation and stabilization. The presence of GNDVI suggests that vegetation density influences C stock even at deeper levels through its role in organic matter input and composition. Together, these results support the idea that predictive sensitivity to disturbance legacies is strongest in surface soils, while subsoil carbon stocks are more influenced by terrain-driven redistribution together with climatic and vegetation variability [48].
The spatial predictions show that total C was 49% higher in the 0–5 cm layer than in the 0–25 cm layer. In contrast, C stock was 7.57 times greater at 0–25 cm than at 0–5 cm.
FFS-RF and FFS-SVR achieved good predictive accuracy but showed consistent tendencies to underestimate variability and truncate distributional extremes, with the highest distributional differences observed for bulk density and total C. However, the lowest mean differences were observed for bulk density and the highest for C stock.
RF produced more accurate mean estimates but substantially underestimated variability due to ensemble averaging. Its predictions were thus centered correctly but compressed. In contrast, SVR produced smoother, more homogeneous predictions with both mean and range bias. This is attributed to the regularization parameter (C) and kernel function, which prioritize generalization at the expense of capturing extremes. This behavior is a well-documented limitation of ensemble and kernel-based learning algorithms when modeling environmental variables with strong spatial heterogeneity and skewed distributions [80,119]. After QM, both models achieved comparable accuracy and distributional consistency, suggesting that post-processing techniques such as QM can compensate for intrinsic algorithmic smoothing bias while preserving the spatial dispersion and distributional characteristics [120,121].
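The effect of quantile mapping on compressed predictions can be illustrated with an empirical variant: each prediction is located within the distribution of predictions and replaced by the observed value at the same quantile. This is a minimal sketch of one common form of QM and is not necessarily the procedure applied in the study. The function names and sample values are hypothetical.

```python
from bisect import bisect_right


def empirical_position(sorted_values, value):
    """Quantile level (0..1) of value within sorted_values, by interpolation."""
    if value <= sorted_values[0]:
        return 0.0
    if value >= sorted_values[-1]:
        return 1.0
    i = bisect_right(sorted_values, value) - 1
    span = sorted_values[i + 1] - sorted_values[i]
    return (i + (value - sorted_values[i]) / span) / (len(sorted_values) - 1)


def quantile(sorted_values, q):
    """Linearly interpolated quantile (0 <= q <= 1) of a sorted list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def quantile_map(predictions, reference_predicted, reference_observed):
    """Replace each prediction with the observed value at the same quantile."""
    pred_sorted = sorted(reference_predicted)
    obs_sorted = sorted(reference_observed)
    return [quantile(obs_sorted, empirical_position(pred_sorted, p))
            for p in predictions]


# Hypothetical bulk density values (g cm-3): predictions are compressed
observed = [0.8, 1.1, 1.3, 1.5, 1.7, 1.9]
predicted = [1.2, 1.3, 1.4, 1.5, 1.5, 1.6]
corrected = quantile_map(predicted, predicted, observed)
print(corrected)
```

The mapping is monotonic, so the ranking of locations (and therefore the spatial pattern) is kept while the spread of the corrected values is stretched toward that of the observations.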

4.3. Long-Term Soil Property Dynamics Following Fire

At the surface layer (0–5 cm), all soil properties showed little to no variation over time or among burn severities. Bulk density values were stable across all years and severity levels. Total C concentrations increased after fire, but these changes were minor and did not indicate a clear temporal or severity-related pattern. Similarly, C stock changed only marginally over time and did not differ between severities; its fluctuations were minimal and not ecologically meaningful.
Changes at the 0–25 cm depth were more visible, with modest differences observed across years and severity levels. Bulk density decreased following fire, especially under medium severity. By 19 years, values had increased slightly, although they remained below pre-fire levels. This tendency was accompanied by an increase in total C, which was more pronounced under medium severity 11 years after fire and continued for both severities 19 years after fire. The decrease in bulk density and the increase in total C point to the important long-term role that vegetation plays in both parameters in the study area. After fire, changes in soil physicochemical properties facilitate the invasion and growth of shrubs, which supply more leaf litter and promote greater microbial activity than in unburned areas [122]. In addition, wildfires induce changes in organic matter structure and composition, facilitating its transition into stable carbon pools by modulating its flow pathways [123,124], an effect that may last decades. Organic matter also tends to be the main factor driving soil nutrient dynamics at superficial depths [122], so its consumption and composition are highly mediated by the vegetation. All these factors contribute to a greater increase in soil organic matter and, therefore, total C. Soils rich in organic matter exhibit higher porosity and lower bulk density. While this effect was not visible in the uppermost layer, the continuous accumulation of organic matter was observed at greater depths, highlighting the contribution of vegetation and carbon storage processes in the soil. The severity of fire appears to have contributed through its effect on the composition and density of the vegetation, as some shrubs are more competitive than others under post-fire conditions.
Although total C is a strong determinant of C stock in this study area, C stock decreased in the 0–25 cm layer, more markedly under medium than under low severity. The differences observed in C stock across depths highlight the importance of total C in shaping C stock dynamics in post-fire environments and suggest that the composition and nature of organic matter strongly influence C stock through their effect on soil structure and, in turn, bulk density. In addition, greater depths may reveal the true long-term effect of fire in increasing total C and, subsequently, the C stock capacity of such soils in Mediterranean environments.

5. Conclusions

This study enabled the estimation of bulk density, total C, and C stock in cork oak forests in a post-fire environment with different fire history and severity levels, using different combinations of machine learning algorithms and feature selection methods judged to be the most efficient in the field for two depths. The results showed that employing modern statistical methods allowed for reliable and statistically robust estimation of bulk density, total C, and C stock in fire-affected cork oak forests, where non-linear models (SVR and RF) outperformed the linear MLR baseline and XGBoost across all soil properties. RF achieved the highest accuracy at the uppermost depth, while SVR was best at predicting C stock and all properties at the greater depth. The performance of both was improved when combined with Forward Feature Selection (FFS), confirming its robustness and adaptability among the models evaluated.
Spatially, total C was 49% higher at 0–5 cm than at 0–25 cm, while C stock was 7.57 times greater at 0–25 cm. Both RF and SVR showed a consistent tendency to underestimate variability and truncate distributional extremes, although RF produced more accurate means. Mean prediction errors were smallest for bulk density and highest for C stock, indicating that C stock was more challenging to estimate accurately.
Distribution-aware post-processing using Quantile Mapping proved essential for addressing intrinsic algorithmic biases. Quantile Mapping effectively corrected the distribution and mean bias, restored the spread of the predicted distributions, and better matched the observed variability and extremes while preserving the underlying spatial patterns. These post-processing methods can compensate for intrinsic algorithmic biases in non-linear models and significantly improve the reliability of spatial estimates.
The variables that influenced the determination of bulk density, total C, and C stock showed not only the influence of intrinsic soil characteristics such as texture, but also environmental and topographic conditions, with variables denoting the importance of vegetation and fire legacy, evidenced at the surface and at depth. Total C was influenced by a broader set of environmental and spectral characteristics, reflecting the combined effects of vegetation, soil, and topographic conditions. Although C stock was mainly governed by total C, topographic factors (northness, elevation, slope length), vegetation density (GNDVI), and temperature seasonality were also relevant, highlighting a transition from fire-driven controls near the surface to terrain and vegetation processes at depth.
Bulk density, total C, and C stock showed no influence of severity or time elapsed after the fire at the surface level, in the first 5 cm. However, at greater depths (0–25 cm), an increase in total C was observed, favored by conditions of medium severity and associated with a greater input of organic matter derived from post-fire regeneration, leading to a reduction in bulk density and lower C stock in the long term. These depth-dependent patterns demonstrate that the legacy of fire persists for decades and remains embedded in subsurface soil carbon dynamics.

Author Contributions

Conceptualization, D.A. and E.S.S.; methodology, D.A. and Y.B.; validation, Y.B., D.A. and E.S.S.; formal analysis, Y.B.; investigation, Y.B., E.S.S. and D.A.; data curation, Y.B. and D.A.; writing—original draft preparation, Y.B.; writing—review and editing, Y.B., E.S.S. and D.A.; supervision, D.A.; project administration, E.S.S.; funding acquisition, E.S.S. and D.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by national funds through FCT—Fundação para a Ciência e a Tecnologia, I.P., under the projects UID/04129/2025 of LEAF-Linking Landscape, Environment, Agriculture and Food, Research Unit and LA/P/0092/2020 of Associate Laboratory TERRA, and under CEEC INST 2ed with DOI 10.54499/CEECINST/00081/2021/CP2809/CT0002.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. List of the different covariates along with their sources.
| Covariate Name | Min | Mean | Max | SD |
| --- | --- | --- | --- | --- |
| Annual_mean_temperature | 14.5 | 15.32 | 16.9 | 0.52 |
| Annual_solar_radiation | 215 | 218.77 | 222.9 | 1.91 |
| Annual_relative_humidity | 74 | 76.32 | 79.9 | 1.63 |
| Annual_precipitation | 14.5 | 15.32 | 16.9 | 0.52 |
| water_vapour_pressure | 12.8 | 13.18 | 14.2 | 0.31 |
| Temperature_seasonality | 4.5 | 4.88 | 5.10 | 0.13 |
| precipitation_wettest_month | 92 | 107.32 | 125 | 7.87 |
| precipitation_driest_month | 0.1 | 0.42 | 1.2 | 0.27 |
| Min_Temperature_coldest_month | 8.1 | 8.87 | 10.9 | 0.59 |
| Max_Temperature_warmest_month | 22.3 | 23.87 | 25.30 | 0.76 |
| Surface_temperature | 22.78 | 30.71 | 43.72 | 2.80 |
| B12 | 0.11 | 0.23 | 1.62 | 0.04 |
| B11 | 0.11 | 0.29 | 1.62 | 0.05 |
| B8 | 0.10 | 0.29 | 1.71 | 0.04 |
| B4 | 0.09 | 0.16 | 1.79 | 0.03 |
| B2 | 0.04 | 0.13 | 1.99 | 0.02 |
| GNDVI | −0.25 | 0.32 | 0.66 | 0.07 |
| RGR | 0.61 | 1.05 | 1.93 | 0.07 |
| Curvature | 0.16 | 6.83 | 0.11 | 0.02 |
| Slope | 0.09 | 19.18 | 87.99 | 8.58 |
| Aspect | 1 | | 8 | |
| LULC | 1 | | 6 | |
| Slope_length | 0 | 42.41 | 1429 | 61.84 |
| Northness | −1 | −0.24 | 1 | 0.58 |
| topographicPosition index | −24.10 | 5.38 | 23.13 | 1.00 |
| DEM | 7.26 | 183.18 | 580.79 | 184.18 |
| NDVI | −0.28 | 0.29 | 0.69 | 0.09 |
| NDMI | −0.62 | 0.01 | 0.47 | 0.1 |
| NBR2 | 0.02 | 0.13 | 0.22 | 0.03 |
| NBR | −0.64 | 0.12 | 0.61 | 0.11 |
| EVI | −0.97 | 0.27 | 0.96 | 0.09 |
| EVI_90 | 0 | | 1 | |
| TCW | −1.89 | −0.23 | 0.05 | 0.05 |
| TCG | −1.54 | −0.04 | 0.27 | 0.04 |
| TCB | 0.23 | 0.48 | 3.97 | 0.06 |
| SAVI | −0.23 | 0.21 | 0.65 | 0.07 |
| Sand_content | 2.98 | 432.61 | 504.36 | 24.77 |
| Silt_content | 31.06 | 350.60 | 418.13 | 16.29 |
| Clay_content | 0.40 | 216.09 | 299.25 | 19.61 |
| Soil_temperature | 25.40 | 26.50 | 27.10 | 0.40 |
| Hillshade | −1 | −0.01 | 1 | 0.61 |
| TWI | 0 | 4.50 | 32.21 | 2.87 |
| TRI | 0 | 8.90 | 32.64 | 3.53 |
| eastness | −1 | 0.39 | 1 | 0.68 |
| MRRTF | 0 | 0.05 | 3.87 | 0.26 |
| MRVBF | 0 | 0.05 | 3.87 | 0.26 |
| Minimal_curvature | −1.22 | 0.01 | 2.46 | 0.04 |
| Tangentiel_curvature | −0.87 | 0.0003 | 0.95 | 0.01 |
| Plan_curvature | −27.34 | 8.27 | 36.84 | 0.23 |
| Profile_curvatrure | −1.05 | 0.01 | 0.97 | 0.01 |
| Maximal_curvature | −0.02 | 0.01 | 3.61 | 0.07 |
Table A2. Severity levels classified according to their dNBR ranges [125].
| Severity Level | dNBR Range (Scaled by 10^3) |
| --- | --- |
| High post-fire growth | −500 to −251 |
| Low post-fire growth | −250 to −101 |
| Unburned | −100 to +99 |
| Low severity | +100 to +269 |
| Moderate–low severity | +270 to +439 |
| Moderate–high severity | +440 to +659 |
| High severity | +660 to +1300 |
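The thresholds in Table A2 translate directly into a lookup, sketched below. The class boundaries are those listed in the table. The function names `scaled_dnbr` and `classify_dnbr`, the rounding to whole scaled units, and the example NBR values are assumptions made for illustration, and dNBR is taken as pre-fire NBR minus post-fire NBR multiplied by 1000.

```python
# (lower bound, upper bound, label), dNBR scaled by 10^3, from Table A2
SEVERITY_CLASSES = [
    (-500, -251, "High post-fire growth"),
    (-250, -101, "Low post-fire growth"),
    (-100, 99, "Unburned"),
    (100, 269, "Low severity"),
    (270, 439, "Moderate-low severity"),
    (440, 659, "Moderate-high severity"),
    (660, 1300, "High severity"),
]


def scaled_dnbr(nbr_prefire, nbr_postfire):
    """Differenced NBR scaled by 10^3 (pre-fire minus post-fire)."""
    return (nbr_prefire - nbr_postfire) * 1000


def classify_dnbr(dnbr_scaled):
    """Return the Table A2 severity label, or None if outside all ranges."""
    value = round(dnbr_scaled)  # class limits are given in whole scaled units
    for lower, upper, label in SEVERITY_CLASSES:
        if lower <= value <= upper:
            return label
    return None


# Hypothetical pixel: NBR drops from 0.45 before the fire to 0.10 after it
print(classify_dnbr(scaled_dnbr(0.45, 0.10)))
```

In the analysis described in Section 3.3, the two moderate classes were subsequently merged into a single medium severity class, which would amount to relabeling the two corresponding rows.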
Table A3. Severity classes percentage along with the % total area burned in each severity class.
| Severity Class | Burned Area (ha) | % of Total Area |
| --- | --- | --- |
| 2004: 23,500 ha (29%) | | |
| Unburned | 3139.6 | 13.36 |
| Low | 13,839.15 | 58.89 |
| Medium–Low | 6509.5 | 27.70 |
| Medium–High | 4.7 | 0.02 |
| High | 0.00 | 0.00 |
| Very High | 0.00 | 0.00 |
| 2012: 24,900 ha (31%) | | |
| Unburned | 2049.27 | 8.23 |
| Low | 19,474.29 | 78.21 |
| Medium–Low | 3366.48 | 13.52 |
| Medium–High | 0.00 | 0.00 |
| High | 0.00 | 0.00 |
| Very High | 0.00 | 0.00 |

References

  1. Mori, A.S.; Bradford, M.A.; Martínez-Rodríguez, M.R. Biodiversity and ecosystem services in forest ecosystems: A global-scale synthesis. J. Appl. Ecol. 2017, 54, 1133–1144.
  2. Lal, R. Soil carbon sequestration to mitigate climate change. Geoderma 2004, 123, 1–22.
  3. Villat, J.; Nicholas, K.A. Quantifying soil carbon sequestration from regenerative agricultural practices in crops and vineyards. Front. Sustain. Food Syst. 2024, 7, 1234108.
  4. Batjes, N.H. Technologically achievable soil organic carbon sequestration in world croplands and grasslands. Land Degrad. Dev. 2019, 30, 25–32.
  5. UNCCD—United Nations Convention to Combat Desertification. Report of the Conference of the Parties on Its Fourteenth Session, Held in New Delhi, India, from 2 to 13 September 2019. Available online: https://www.unccd.int/sites/default/files/sessions/documents/2019-12/ICCD_COP%2814%29_23_Add.1-1918355E.pdf (accessed on 18 September 2025).
  6. Minasny, B.; Malone, B.P.; McBratney, A.B.; Angers, D.A.; Arrouays, D.; Chambers, A.; Chaplot, V.; Chen, Z.S.; Cheng, K.; Das, B.; et al. Soil carbon 4 per mille. Geoderma 2017, 292, 59–86.
  7. Jackson, R.B.; Lajtha, K.; Crow, S.E.; Hugelius, G.; Kramer, M.G.; Piñeiro, G. The ecology of soil carbon: Pools, vulnerabilities, and biotic and abiotic controls. Annu. Rev. Ecol. Evol. Syst. 2017, 48, 419–445.
  8. Davidson, E.A.; Janssens, I.A. Temperature sensitivity of soil carbon decomposition and feedbacks to climate change. Nature 2006, 440, 165–173.
  9. Crézé, C.; Saatchi, S.; Kwon, N.; Yang, Y.; Li, S. High-resolution global map (100 m) of soil organic carbon reveals critical ecosystems for carbon storage. Earth Syst. Sci. Data Discuss. 2025, 2025, 1–46.
  10. IPBES. Thematic Assessment Report on the Underlying Causes of Biodiversity Loss and the Determinants of Transformative Change and Options for Achieving the 2050 Vision for Biodiversity of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services; O’Brien, K., Garibaldi, L., Agrawal, A., Eds.; IPBES Secretariat: Bonn, Germany, 2024.
  11. Food and Agriculture Organization of the United Nations; Plan Bleu. State of Mediterranean Forests 2018; FAO: Rome, Italy; Plan Bleu: Marseille, France, 2018; ISBN 978-92-5-131047-2/978-2-912081-52-0.
  12. Ferreira, C.S.S.; Seifollahi-Aghmiuni, S.; Destouni, G.; Ghajarnia, N.; Kalantari, Z. Soil degradation in the European Mediterranean region: Processes, status and consequences. Sci. Total Environ. 2022, 805, 150106.
  13. Jones, M.W.; Abatzoglou, J.T.; Veraverbeke, S.; Andela, N.; Lasslop, G.; Forkel, M.; Smith, A.J.P.; Burton, C.; Betts, R.A.; van der Werf, G.R.; et al. Global and regional trends and drivers of fire under climate change. Rev. Geophys. 2022, 60, e2020RG000726.
  14. El Garroussi, S.; Di Giuseppe, F.; Barnard, C. Europe faces up to tenfold increase in extreme fires in a warming climate. npj Clim. Atmos. Sci. 2024, 7, 30.
  15. Coogan, S.C.P.; Robinne, F.N.; Jain, P.; Flannigan, M.D. Scientists’ warning on wildfire—A Canadian perspective. Can. J. For. Res. 2019, 49, 1015–1023.
  16. van der Schriek, T.; Varotsos, K.V.; Karali, A.; Giannakopoulos, C. Wildfire Burnt Area and Associated Greenhouse Gas Emissions under Future Climate Change Scenarios in the Mediterranean: Developing a Robust Estimation Approach. Fire 2024, 7, 324.
  17. Shen, L.; Zhang, W.; Zhai, D.; Han, S.; Tian, S. Estimation of soil organic carbon content and dynamics in Mediterranean climate regions considering long-term monthly climatic conditions. Ecol. Indic. 2024, 168, 112746.
  18. Meier, S.; Strobl, E.; Elliott, R.J.R.; Kettridge, N. Cross-country risk quantification of extreme wildfires in Mediterranean Europe. Risk Anal. 2022, 42, 2444–2463.
  19. Xofis, P.; Buckley, P.G.; Kefalas, G.; Chalaris, M.; Mitchley, J. Mid-term effects of fire on soil properties of North-East Mediterranean ecosystems. Fire 2023, 6, 337.
  20. Miesel, J.R.; Hockaday, W.C.; Kolka, R.K.; Townsend, P.A. Soil organic matter composition and quality across fire severity gradients in coniferous and deciduous forests of the southern boreal region. J. Geophys. Res. 2015, 120, 1124–1141.
  21. Zhang, Y.; Biswas, A. The effects of forest fire on soil organic matter and nutrients in boreal forests of North America: A review. In Adaptative Soil Management: From Theory to Practices; Rakshit, A., Abhilash, P.C., Bahadur, H.S., Ghosh, S., Eds.; Springer: Singapore, 2017; pp. 465–476.
  22. Wittenberg, L.; Pereira, P. Fire and soils: Measurements, modelling, management and challenges. Sci. Total Environ. 2021, 782, 145964.
  23. Pellegrini, A.F.A.; Harden, J.; Georgiou, K.; Hemes, K.S.; Malhotra, A.; Nola, C.J.; Jackson, R.B. Fire effects on the persistence of soil organic matter and long-term carbon storage. Nat. Geosci. 2022, 15, 5–13.
  24. Agbeshie, A.A.; Abugre, S.; Atta-Darkwa, T.; Awuah, R. A review of the effects of forest fire on soil properties. J. For. Res. 2022, 33, 1419–1441.
  25. Araya, S.N.; Meding, M.; Berhe, A.A. Thermal alteration of soil physico-chemical properties: A systematic study to infer response of Sierra Nevada climosequence soils to forest fires. Soil 2016, 2, 351–366.
  26. Fonseca, F.; de Figueiredo, T.; Nogueira, C.; Queirós, A. Effect of prescribed fire on soil properties and soil erosion in a Mediterranean mountain area. Geoderma 2017, 307, 172–180.
  27. McBratney, A.B.; Mendonça Santos, M.L.; Minasny, B. On digital soil mapping. Geoderma 2003, 117, 3–52.
  28. Brown, K.S.; Libohova, Z.; Boettinger, J. Digital Soil Mapping. In Soil Survey Manual, USDA Handbook 18; Ditzler, C., Scheffe, K., Monger, H.C., Eds.; Government Printing Office: Washington, DC, USA, 2017; pp. 295–354.
  29. Forkuor, G.; Hounkpatin, O.K.L.; Welp, G.; Thiel, M. High Resolution Mapping of Soil Properties Using Remote Sensing Variables in South-Western Burkina Faso: A Comparison of Machine Learning and Multiple Linear Regression Models. PLoS ONE 2017, 12, e0170478.
  30. Direção-Geral do Território. Levantamento LiDAR de Portugal Continental: Dados LiDAR de Portugal Continental. Direção-Geral do Território. 2025. Available online: https://dados.gov.pt/pt/datasets/dados-lidar-de-portugal-continental/ (accessed on 15 June 2025).
  31. Meersmans, J.; De Ridder, F.; Canters, F.; De Baets, S.; Van Molle, M. A multiple regression approach to assess the spatial distribution of soil organic carbon (SOC) at the regional scale (Flanders, Belgium). Geoderma 2008, 143, 1–13.
  32. Rial, M.; Martínez Cortizas, A.; Taboada, T.; Rodríguez-Lado, L. Soil organic carbon stocks in Santa Cruz Island, Galapagos, under different climate change scenarios. Catena 2017, 156, 74–81.
  33. Piccini, C.; Marchetti, A.; Francaviglia, R. Estimation of soil organic matter by geostatistical methods: Use of auxiliary information in agricultural and environmental assessment. Ecol. Indic. 2014, 36, 301–314.
  34. Webster, R.; Oliver, M.A. Geostatistics for Environmental Scientists, 2nd ed.; Wiley: Hoboken, NJ, USA, 2007.
  35. Veronesi, F.; Schillaci, C. Comparison between geostatistical and machine learning models as predictors of topsoil organic carbon with a focus on local uncertainty estimation. Ecol. Indic. 2019, 101, 1032–1044.
  36. Chen, L.; Ren, C.; Li, L.; Wang, Y.; Zhang, B.; Wang, Z.; Li, L. A Comparative Assessment of Geostatistical, Machine Learning, and Hybrid Approaches for Mapping Topsoil Organic Carbon Content. ISPRS Int. J. Geo-Inf. 2019, 8, 174.
  37. Eldeiry, A.A.; Garcia, L.A. Comparison of ordinary kriging, regression kriging, and cokriging techniques to estimate soil salinity using LANDSAT images. J. Irrig. Drain. Eng. 2010, 136, 355–364.
  38. Naimi, S.; Ayoubi, S.; Demattê, J.A.M.; Zeraatpisheh, M.; Amorim, M.T.A.; Mello, F.A.O. Spatial prediction of soil surface properties in an arid region using synthetic soil image and machine learning. Geocarto Int. 2021, 37, 8230–8253. [Google Scholar] [CrossRef]
  39. Wadoux, A.M.-C.; Minasny, B.; McBratney, A.B. Machine learning for digital soil mapping: Applications, challenges and suggested solutions. Earth-Sci. Rev. 2020, 210, 103359. [Google Scholar] [CrossRef]
  40. Chen, S.; Richer-de-Forges, A.C.; Mulder, V.L.; Martelet, G.; Loiseau, T.; Lehmann, S.; Arrouays, D. Digital mapping of the soil thickness of loess deposits over a calcareous bedrock in central France. Catena 2021, 198, 105062. [Google Scholar] [CrossRef]
  41. Hengl, T.; De Jesus, J.M.; Heuvelink, G.B.M.; Gonzalez, M.R.; Kilibarda, M.; Blagotić, A.; Shangguan, W.; Wright, M.N.; Geng, X.; Bauer-Marschallinger, B.; et al. SoilGrids250m: Global Gridded Soil Information Based on Machine Learning. PLoS ONE 2017, 12, e0169748. [Google Scholar] [CrossRef] [PubMed]
  42. Hengl, T.; Miller, M.A.E.; Križan, J.; Shepherd, K.D.; Sila, A.; Kilibarba, M.; Antonijević, O.; Glušica, L.; Dobermann, A.; Haefele, S.M.; et al. African soil properties and nutrients mapped at 30 m spatial resolution using two-scale ensemble machine learning. Sci. Rep. 2021, 11, 6130. [Google Scholar] [CrossRef] [PubMed]
  43. Bravo-García, J.; CamarilloNaranjo, J.M.; Blanco-Velázquez, F.J.; Anaya-Romero, M. Soil Organic Carbon Mapping Through Remote Sensing and In Situ Data with Random Forest by Using Google Earth Engine: A Case Study in Southern Africa. Land 2025, 14, 1436. [Google Scholar] [CrossRef]
  44. Chen, Z.; Shuai, Q.; Shi, Z.; Arrouays, D.; Richer-de-Forges, A.C.; Chen, S. National-scale mapping of soil organic carbon stock in France: New insights and lessons learned by direct and indirect approaches. Soil Environ. Health 2023, 1, 100049. [Google Scholar] [CrossRef]
  45. Bahri, H.; Raclot, D.; Barbouchi, M.; Lagacherir, P.; Annabi, M. Mapping soil organic carbon stocks in Tunisian topsoils. Geoderma Reg. 2022, 30, e00561. [Google Scholar] [CrossRef]
  46. Fiantis, D.; Rudiyanto Ginting, F.I.; Agtalarik, A.; Arianto, D.T.; Wichaksono, P.; Irfan, R.; Nelson, M.; Gusnidar, G.; Jeon, S.; Minasny, B. Mapping peat thickness and carbon stock of a degraded peatland in West Sumatra, Indonesia. Soil Use Manag. 2023, 40, e12954. [Google Scholar] [CrossRef]
  47. Meliho, M.; Boulmane, M.; Khattabi, A.; Dansou, C.E.; Orlando, C.A.; Mhammdi, N.; Noumonvi, K.D. Spatial Prediction of Soil Organic Carbon Stock in the Moroccan High Atlas Using Machine Learning. Remote Sens. 2023, 15, 2494. [Google Scholar] [CrossRef]
  48. Agaba, S.; Ferré, C.; Musetti, M.; Comolli, R. Mapping Soil Organic Carbon Stock and Uncertainties in an Alpine Valley (Northern Italy) Using Machine Learning Models. Land 2024, 13, 78. [Google Scholar] [CrossRef]
  49. Oliveira, J.T. Carta Geológica de Portugal, Escala 1_200000. Notícia Explicativa, 8; Serviços Geológicos de Portugal: Lisboa, Portugal, 1982. [Google Scholar]
  50. IUSS Working Group; W.R.B. World Reference Base for Soil Resources. International Soil Classification System for Naming Soils and Creating Legends for Soil Maps, 4th ed.; International Union of Soil Sciences (IUSS): Vienna, Austria, 2022. [Google Scholar]
  51. Köppen, W. Das geographische System der Klimate. In Handbuch der Klimatologie, Vol. 1, Part C; Köppen, W., Geiger, R., Eds.; Gebrüder Borntraeger: Berlin, Germany, 1936. [Google Scholar]
  52. IPMA—Instituto Português do Mar e da Atmosfera. Normais Climatológicas. 2025. Available online: https://www.ipma.pt/pt/oclima/nor-527362mais.clima/ (accessed on 18 June 2025).
  53. FAO; ITPS. Global Soil Organic Carbon Map V1.5: Technical Report; FAO: Rome, Italy, 2020. [Google Scholar] [CrossRef]
  54. Walkley, A.; Black, I.A. An examination of the Degtjareff method for determining soil organic matter and a proposed modification of the chromic acid titration method. Soil Sci. 1934, 37, 29–38. [Google Scholar] [CrossRef]
  55. FAO. Measuring and Modelling Soil Carbon Stocks and Stock Changes in Livestock Production Systems: Guidelines for Assessment (Version 1); Livestock Environmental Assessment and Performance (LEAP) Partnership; FAO: Rome, Italy, 2019; p. 170. [Google Scholar]
  56. Wiesmeier, M.; Urbanski, L.; Hobley, E.; Lang, B.; von Lützow, M.; Marin-Spiotta, E.; van Wesemael, B.; Rabot, E.; Ließ, M.; Garcia-Franco, N.; et al. Soil organic carbon storage as a key function of soils—A review of drivers and indicators at various scales. Geoderma 2019, 333, 149–162. [Google Scholar] [CrossRef]
  57. Jobbagy, E.G.; Jackson, R.B. The vertical distribution of soil organic carbon and its relation to climate and vegetation. Ecol. Appl. 2000, 10, 423–436. [Google Scholar] [CrossRef]
  58. Von Lützow, M.; Kogel-Knabner, I.; Ekschmitt, K.; Matzner, E.; Guggenberger, G.; Marschner, B.; Flessa, H. Stabilization of organic matter in temperate soils: Mechanisms and their relevance under different soil conditions—A review. Eur. J. Soil Sci. 2006, 57, 426–445. [Google Scholar] [CrossRef]
  59. Matus, F.; Garrido, E.; Hidalgo, C.; Paz Pellat, F.; Etchevers, J.; Merino, C.; Báez-Pérez, A. Carbon saturation in the silt and clay particles in soils with contrasting mineralogy. Terra Latinoam. 2016, 34, 311–319. [Google Scholar]
  60. Armas-Herrera, C.M.; Martí, C.; Badía, D.; Ortiz-Perpiñá, O.; Girona-García, A.; Porta, J. Immediate effects of prescribed burning in the Central Pyrenees on the amount and stability of topsoil organic matter. Catena 2016, 147, 238–244. [Google Scholar] [CrossRef]
  61. Meira-Castro, A.; Shakesby, R.A.; Espinha Marques, J.; Doerr, S.H.; Meixedo, J.P.; Teixeira, J.; Chamine, H.I. Effects of prescribed fire on surface soil in a Pinus pinaster plantation, northern Portugal. Environ. Earth Sci. 2015, 73, 3011–3018. [Google Scholar] [CrossRef]
  62. Vieira, D.C.S.; Borrelli, P.; Jahanianfard, D.; Benali, A.; Scarpa, S.; Panagos, P. Wildfires in Europe: Burned soils require attention. Environ. Res. 2023, 217, 114936. [Google Scholar] [CrossRef]
  63. Barreiro, A.; Díaz-Raviña, M. Fire impacts on soil microorganisms: Mass, activity, and diversity. Curr. Opin. Environ. Sci. Health 2021, 22, 100264. [Google Scholar] [CrossRef]
  64. Mataix-Solera, J.; Cerdà, A.; Arcenegui, V.; Jordán, A.; Zavala, L.M. Fire effects on soil aggregation: A review. Earth-Sci. Rev. 2011, 109, 44–60. [Google Scholar] [CrossRef]
  65. Sungmin, O.; Orth, R.; Weber, U.; Park, S.K. High-resolution European daily soil moisture derived with machine learning (2003–2020). Sci. Data 2022, 9, 701. [Google Scholar] [CrossRef]
  66. Copernicus Climate Change Service. ERA5-Land Monthly Averaged Data from 1950 to Present; Copernicus Climate Change Service (C3S): Reading, UK; Climate Data Store (CDS): Online, 2022. [Google Scholar] [CrossRef]
  67. Direção-Geral do Território. Sistema de Monitorização da Ocupação do Solo (SMOS), 2025. Available online: https://smos.dgterritorio.gov.pt/cosvgi/ (accessed on 18 July 2025).
  68. Bouslihim, Y.; John, K.; Miftah, A.; Azmi, R.; Aboutayeb, R.; Bouasria, A.; Hssaini, L. The effect of covariates on Soil Organic Matter and pH variability: A digital soil mapping approach using random forest model. Ann. GIS 2024, 30, 215–232. [Google Scholar] [CrossRef]
  69. Mosaid, H.; Barakat, A.; John, K.; Faozi, E.; Bustillo, V.; Garnaoui, M.E.; Heung, B. Improved soil carbon stock spatial prediction in a Mediterranean soil erosion site through robust machine learning techniques. Environ. Monit Assess 2024, 196, 130. [Google Scholar] [CrossRef]
  70. dos Santos, W.P.; Vaz, C.M.P.; Martin-Neto, L.; Anselmi, A.; Tomasella, J.; de Costa, F.; Albuquerque, J.A.; de Jong, Q.; Galbieri, R.; PerinaF, J. Predicting bulk density in Brazilian soils for carbon stocks calculation: A comparative study of multiple linear regression and Random Forest models using continuous and categorical variables. Discov. Soil 2025, 2, 7. [Google Scholar] [CrossRef]
  71. Guo, H.; Wang, J.; Zhang, D.; Cui, J.; Yuan, Y.; Bao, H.; Yang, M.; Gou, J.; Chen, F.; Zhou, W.; et al. Mapping surface soil organic carbon density of cultivated land using machine learning in Zhengzhou. Environ. Geochem Health 2025, 47, 1. [Google Scholar] [CrossRef]
  72. Han, L.; Wang, Z. Estimation of Soil Organic Carbon Content in the Ebinur Lake Wetland, Xinjiang, China, Based on Multisource Remote Sensing Data and Ensemble Learning Algorithms. Sensors 2022, 22, 2685. [Google Scholar] [CrossRef] [PubMed]
  73. Kutner, M.H.; Nachtsheim, C.J.; Neter, J.; Li, W. Applied Linear Statistical Models, 5th ed.; McGraw-Hill, Irwin: New York, NY, USA, 2005. [Google Scholar]
  74. O’Brien, R.M. A caution regarding rules of thumb for variance inflation factors. Qual. Quant. 2007, 41, 673–690. [Google Scholar] [CrossRef]
  75. Dormann, C.F.; Elith, J.; Bacher, S.; Buchmann, C.; Carl, G.; Carré, G.; García Marquéz, J.R.; Gruber, B.; Lafourcade, B.; Leitão, P.J.; et al. Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography 2013, 36, 27–46. [Google Scholar] [CrossRef]
  76. Ladd, J.T.C.; Smeaton, C.; Skov, M.W.; Austin, W.E.N. Best practice for upscaling soil organic carbon stocks in salt marshes. Geoderma 2022, 428, 116188. [Google Scholar] [CrossRef]
  77. Meyer, H.; Reudenbach, C.; Hengl, T.; Katurji, M.; Nauss, T. Improving performance of spatio-temporal machine learning models using forward feature selection and target-oriented validation. Environ. Model. Softw. 2018, 101, 1–9. [Google Scholar] [CrossRef]
  78. Padarian, J.; Minasny, B.; McBratney, A.B. Machine learning and soil sciences: A review aided by machine learning tools. Soil 2020, 6, 35–52. [Google Scholar] [CrossRef]
  79. Li, H.; Wang, J.; Zhang, J.; Liu, T.; Acquah, G.E.; Yuan, H. Combining Variable Selection and Multiple Linear Regression for Soil Organic Matter and Total Nitrogen Estimation Using MIR Spectroscopy. Agronomy 2022, 12, 638, In the study they compare MLR and PLSR for soil SOM and TN using MIR data. [Google Scholar]
  80. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  81. Richardson, H.J.; Hill, D.J.; Denesiuk, D.R.; Fraser, L.H. A comparison of geographic datasets and field measurements to model soil carbon using random forests and stepwise regressions (British Columbia, Canada). GISci. Remote Sens. 2017, 54, 573–591. [Google Scholar] [CrossRef]
  82. Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 1995; pp. 841–842. [Google Scholar]
  83. Khaledian, Y.; Miller, B.A. Selecting appropriate machine learning methods for digital soil mapping. Appl. Math. Model. 2020, 81, 401–418. [Google Scholar] [CrossRef]
  84. Pham, T.D.; Yokoya, N.; Nguyen, T.T.T.; Le, N.N.; Ha, N.T.; Xia, J.; Takeuchi, W.; Pham, T.D. Improvement of mangrove soil carbon stocks estimation in North Vietnam using Sentinel-2 data and machine learning approach. GISci. Remote Sens. 2021, 58, 68–87. [Google Scholar] [CrossRef]
  85. Were, K.; Bui, D.T.; Dick, Ø.B.; Singh, B.R. A comparative assessment of support vector regression, artificial neural networks, and random forests for predicting and mapping soil organic carbon stocks across an Afromontane landscape. Ecol. Indic. 2015, 52, 394–403. [Google Scholar] [CrossRef]
  86. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16), San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794. [Google Scholar] [CrossRef]
  87. Nielsen, D. Tree Boosting with XGBoost—Why Does XGBoost Win “Every” Machine Learning Competition? Master’s Thesis, Norwegian University of Science and Technology (NTNU), Trondheim, Norway, 2016. [Google Scholar]
  88. Pham, T.D.; Le, N.N.; Ha, N.T.; Nguyen, L.V.; Xia, J.; Yokoya, N.; To, T.T.; Trinh, H.X.; Kieu, L.Q.; Takeuchi, W. Estimating mangrove above-ground biomass using Extreme Gradient Boosting decision trees with fused Sentinel-2 and ALOS-2 PALSAR-2 data in Can Gio Biosphere Reserve, Vietnam. Remote Sens. 2020, 12, 777. [Google Scholar] [CrossRef]
  89. Fox, J.; Monette, G. Generalized collinearity diagnostics. J. Am. Stat. Assoc. 1992, 87, 178–183. [Google Scholar] [CrossRef]
  90. Fox, J.; Weisberg, S. An R Companion to Applied Regression, 3rd ed.; Sage: Thousand Oaks, CA, USA, 2019. [Google Scholar]
  91. Piikki, K.; Wetterlind, J.; Söderström, M.; Stenberg, B. Perspectives on validation in digital soil mapping of continuous attributes—A review. Soil Use Manag. 2020, 37, 7–21. [Google Scholar] [CrossRef]
  92. Tajik, S.; Ayoubi, S.; Zeraatpisheh, M. Digital mapping of soil organic carbon using ensemble learning model in Mollisols of Hyrcanian forests, northern Iran. Geoderma Reg. 2020, 20, e00256. [Google Scholar] [CrossRef]
  93. Kolmogorov, A.N. Sulla determinazione empirica di una legge di distribuzione. G. Dell’istituto Ital. Degli Attuari 1933, 4, 83–91. [Google Scholar]
  94. Smirnov, N. Table for estimating the goodness of fit of empirical distributions. Ann. Math. Stat. 1948, 19, 279–281. [Google Scholar] [CrossRef]
  95. Nguyen, N.Y.; Tran, N.A.; Nguyen, H.D.; Dang, D.K. Quantile mapping technique for enhancing satellite-derived precipitation data in hydrological modelling: A case study of the Lam River Basin, Vietnam. J. HydroInform. 2024, 26, 2026–2044. [Google Scholar] [CrossRef]
  96. Lombardo, L.; Saia, S.; Schillaci, C.; Mai, P.M.; Huser, R. Modeling soil organic carbon with Quantile Regression: Dissecting predictors’ effects on carbon stocks. arXiv 2017, arXiv:1708.03859. [Google Scholar] [CrossRef]
  97. Wadoux, A.M.J.-C.; Román-Dobarco, M.; McBratney, A.B. Perspectives on data-driven soil research. Eur. J. Soil Sci. 2020, 71, 1093–1107. [Google Scholar] [CrossRef]
  98. Arrouays, D.; Poggio, L.; Salazar Guerrero, O.A.; Mulder, V.L. Digital soil mapping and GlobalSoilMap. Main advances and ways forward. Geoderma Reg. 2020, 21, e00265. [Google Scholar] [CrossRef]
  99. Peralta, G.; Di Paolo, L.; Luotto, I. Global Soil Organic Carbon Sequestration Potential Map—GSOCseq V.1.1.; FAO: Rome, Italy, 2022. [Google Scholar] [CrossRef]
  100. Llorens, R.; Sobrino, J.A.; Fernández, C.; Fernández-Alonso, J.M.; Vega, J.A. A methodology to estimate forest fires burned areas and burn severity degrees using Sentinel-2 data. Application to the October 2017 fires in the Iberian Peninsula. Int. J. Appl. Earth Obs. Geoinf. 2021, 95, 102243. [Google Scholar] [CrossRef]
  101. Miller, J.D.; Thode, A.E. Quantifying burn severity in a heterogeneous landscape with a relative version of the delta Normalized Burn Ratio (dNBR). Remote Sens. Environ. 2007, 109, 66–80. [Google Scholar] [CrossRef]
  102. Key, C.H. Ecological and Sampling Constraints on Defining Landscape Fire Severity. Fire Ecol. 2006, 2, 34–59. [Google Scholar] [CrossRef]
  103. Key, C.H.; Benson, N.C. Landscape Assessment (LA). In FIREMON: Fire Effects Monitoring and Inventory System; Gen. Tech. Rep. RMRS-GTR-164-CD; Lutes, D.C., Keane, R.E., Caratti, J.F., Key, C.H., Benson, N.C., Sutherland, S., Gangi, L.J., Eds.; U.S. Department of Agriculture, Forest Service, Rocky Mountain Research Station: Fort Colins, CO, USA, 2006; p. 55. [Google Scholar]
  104. Vaudour, E.; Gholizadeh, A.; Castaldi, F.; Saberioon, M.; Borůvka, L.; Urbina-Salazar, D.; Fouad, Y.; Arrouays, D.; Richer-de-Forges, A.C.; Biney, J.; et al. Satellite Imagery to Map Topsoil Organic Carbon Content over Cultivated Areas: An Overview. Remote Sens. 2022, 14, 2917. [Google Scholar] [CrossRef]
  105. Castaldi, F.; Chabrillat, S.; van Wesemael, B. Sampling Strategies for Soil Property Mapping Using Multispectral Sentinel-2 and Hyperspectral EnMAP Satellite Data. Remote Sens. 2019, 11, 309. [Google Scholar] [CrossRef]
  106. Chagas, C.d.S.; Pinheiro, H.S.K.; de Carvalho Junior, W.; dos Anjos, L.H.C.; Pereira, N.R.; Bhering, S.B. Data mining methods applied to map soil units on tropical hillslopes in Rio de Janeiro, Brazil. Geoderma Reg. 2017, 9, 47–55. [Google Scholar] [CrossRef]
  107. Qin, Y.; Feng, Q.; Holden, N.M.; Cao, J. Variation in soil organic carbon by slope aspect in the middle of the Qilian Mountains in the upper Heihe River Basin, China. Catena 2016, 147, 308–314. [Google Scholar] [CrossRef]
  108. Hislop, S.; Jones, S.; Soto-Berelov, M.; Skidmore, A.; Haywood, A.; Nguyen, T.H. Using Landsat Spectral Indices in Time-Series to Assess Wildfire Disturbance and Recovery. Remote Sens. 2018, 10, 460. [Google Scholar] [CrossRef]
  109. Sothe, C.; Gonsamo, A.; Arabian, J.; Snider, J. Large scale mapping of soil organic carbon concentration with 3D machine learning and satellite observations. Geoderma 2022, 405, 115402. [Google Scholar] [CrossRef]
  110. Tan, Q.; Han, W.; Li, X.; Wang, G. Clarifying the response of soil organic carbon storage to increasing temperature through minimizing the precipitation effect. Geoderma 2020, 374, 114398. [Google Scholar] [CrossRef]
  111. Périé, C.; Ouimet, R. Organic carbon, organic matter and bulk density relationships in boreal forest soils. Can. J. Soil Sci. 2008, 88, 315–325. [Google Scholar] [CrossRef]
  112. Li, X.; McCarty, G.W.; Karlen, D.L.; Cambardella, C.A. Topographic metric predictions of soil redistribution and organic carbon in Iowa cropland fields. Catena 2018, 160, 222–232. [Google Scholar] [CrossRef]
  113. Ma, Y.; Minasny, B.; Viaud, V.; Walter, C.; Malone, B.; McBratney, A. Modelling the whole profile soil organic carbon dynamics considering soil redistribution under future climate change and landscape projections over the Lower Hunter Valley, Australia. Land 2023, 12, 255. [Google Scholar] [CrossRef]
  114. Carey, C.J.; Weverka, J.; DiGaudio, R.; Gardali, T.; Porzig, E.L. Exploring variability in rangeland soil organic carbon stocks across California (USA) using a voluntary monitoring network. Geoderma Reg. 2020, 22, e00304. [Google Scholar] [CrossRef]
  115. Emami, M.; Khormali, F.; Pahlavan-Rad, M.R.; Ebrahimi, S. Predicting the spatial distribution of organic carbon in soil by combining machine learning algorithms and spline depth function in a part of Golestan Province, Iran. Soil Tillage Res. 2025, 251, 106530. [Google Scholar] [CrossRef]
  116. Cocke, A.E.; Fulé, P.Z.; Crouse, J.E. Comparison of burn severity assessments using Differenced Normalized Burn Ratio and ground data. Int. J. Wildl. Fire 2005, 14, 189. [Google Scholar] [CrossRef]
  117. Pickell, P.D.; Hermosilla, T.; Frazier, R.J.; Coops, N.C.; Wulder, M.A. Forest recovery trends derived from Landsat time series for North American boreal forests. Int. J. Remote Sens. 2016, 37, 138–149. [Google Scholar] [CrossRef]
  118. Parker, B.M.; Lewis, T.; Srivastava, S.K. Estimation and evaluation of multi-decadal fire severity patterns using Landsat sensors. Remote Sens. Environ. 2015, 170, 340–349. [Google Scholar] [CrossRef]
  119. Hengl, T.; Nussbaum, M.; Wright, M.N.; Heuvelink, G.B.M.; Gräler, B. Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables. PeerJ 2018, 6, e5518. [Google Scholar] [CrossRef]
  120. Themeßl, M.J.; Gobiet, A.; Heinrich, G. Empirical-statistical downscaling and error correction of daily precipitation from regional climate models. Int. J. Climatol. 2012, 32, 1531–1548. [Google Scholar] [CrossRef]
  121. Cannon, A.J.; Sobie, S.R.; Murdock, T.Q. Bias correction of GCM precipitation by quantile mapping: How well do methods preserve relative changes in quantiles and extremes? J. Clim. 2015, 28, 6938–6959. [Google Scholar] [CrossRef]
  122. Benhalima, Y.; Santos, E.; Arán, D. Two Decades of Soil Responses to Fire in a Mediterranean Ecosystem: A Multi-Indicator Analysis. Catena 2026, 262, 109647. [Google Scholar] [CrossRef]
  123. Li, J.; Tian, L.; Chang, Z.; Li, X.; Li, F.; Wu, J.; Zhou, Q.; Zhang, P.; Pan, B. Effect of Wildfires on Soil Organic Carbon Content and Carbon Flow Pathways: The Evidence of BPCAs Molecular Markers and ^13C Natural Abundance. Catena 2025, 260, 109468. [Google Scholar] [CrossRef]
  124. Certini, G.; Nocentini, C.; Knicker, H.; Arfaioli, P.; Rumpel, C. Wildfire Effects on Soil Organic Matter Quantity and Quality in Two Fire-Prone Mediterranean Pine Forests. Geoderma 2011, 167–168, 148–155. [Google Scholar] [CrossRef]
  125. Keeley, J.E. Fire intensity, fire severity and burn severity: A brief review and suggested usage. Int. J. Wildland Fire 2009, 18, 116–126. [Google Scholar] [CrossRef]
Figure 1. Location of the study area, indicating the areas affected by the forest fires of 2004 (two decades) and 2012 (one decade) with the corresponding soil sampling points.
Figure 2. Cross-validated performance of model–feature selection combinations for predicting (a) bulk density, (b) total C, and (c) C stock at 0–5 cm and 0–25 cm depths. Points show mean values across folds and bars indicate ± SE. Columns display R², RMSE, and MAE; dashed lines mark median values per metric.
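The fold-wise summary described in the Figure 2 caption (mean ± SE of R², RMSE, and MAE) can be sketched as follows. This is a minimal illustration, not the authors' code: the fold values are hypothetical, R² is assumed to be the coefficient of determination (1 − SS_res/SS_tot), and SE is assumed to be the sample standard deviation across folds divided by the square root of the number of folds.

```python
import math
from statistics import mean, stdev


def fold_metrics(obs, pred):
    """Return (R2, RMSE, MAE) for one validation fold."""
    n = len(obs)
    residuals = [o - p for o, p in zip(obs, pred)]
    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    obs_mean = mean(obs)
    ss_tot = sum((o - obs_mean) ** 2 for o in obs)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                      # assumed R2 definition
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


def mean_and_se(values):
    """Mean and standard error (sample SD / sqrt(k)) across k folds."""
    k = len(values)
    return mean(values), stdev(values) / math.sqrt(k)


# Hypothetical observed/predicted bulk density values for three folds.
folds = [
    ([1.0, 1.2, 1.4, 1.6], [1.1, 1.2, 1.3, 1.5]),
    ([0.9, 1.1, 1.5, 1.7], [1.0, 1.2, 1.4, 1.5]),
    ([1.0, 1.3, 1.4, 1.8], [1.1, 1.3, 1.5, 1.6]),
]

per_fold = [fold_metrics(obs, pred) for obs, pred in folds]

# Transpose so each metric is summarised across folds.
for name, values in zip(("R2", "RMSE", "MAE"), zip(*per_fold)):
    m, se = mean_and_se(values)
    print(f"{name}: {m:.3f} ± {se:.3f}")
```

If the study computed R² as a squared correlation instead, only the `r2` line would change; the fold-wise averaging stays the same.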
Figure 3. Feature Importance for predicted soil bulk density, total carbon, and carbon stock at two depth intervals (0–5 cm and 0–25 cm).
Figure 4. Maps of bulk density, total C, and C stock for the 0–5 cm depth along with their uncertainties.
Figure 5. Maps of bulk density, total C, and C stock for the 0–25 cm depth along with their uncertainties.
Figure 6. Changes in bulk density, total C, and C stock according to severity levels and year since last fire for 0–5 cm and 0–25 cm.
Table 1. Feature selection methods and the regression algorithms each was paired with.
| Feature Selection Method | Model |
|---|---|
| VIF | MLR |
| Multicollinearity check (\|r\| < 0.8) | RF; SVM; XGBoost |
| Boruta algorithm | RF; SVM; XGBoost |
| Forward feature selection | RF; SVM; XGBoost |
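The |r| < 0.8 multicollinearity check listed in Table 1 can be illustrated with a small greedy filter. This is a sketch under stated assumptions: the covariate names and values are hypothetical, Pearson correlation is assumed, and the rule for deciding which member of a correlated pair to drop (here, the one encountered later) is a choice of this example rather than something the table specifies.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # Note: a constant covariate (sxx or syy equal to 0) would divide by zero.
    return sxy / math.sqrt(sxx * syy)


def drop_collinear(covariates, threshold=0.8):
    """Keep a covariate only if |r| < threshold with every covariate kept so far."""
    kept = []
    for name, values in covariates.items():
        if all(abs(pearson_r(values, covariates[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical covariates sampled at five points (illustrative values only).
covariates = {
    "elevation": [100, 200, 300, 400, 500],
    "slope": [5, 9, 16, 19, 26],          # strongly correlated with elevation
    "ndvi": [0.6, 0.2, 0.7, 0.3, 0.5],    # weakly correlated with both
}

print(drop_collinear(covariates))  # ['elevation', 'ndvi']
```

Because the filter is order-dependent, placing the covariates judged most relevant first keeps them in preference to their correlated counterparts.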
Table 2. Summary of distributional statistics and Kolmogorov–Smirnov (KS) test results for observed and predicted soil properties along with Quantile Mapping (QM) modeling at 0–5 and 0–25 cm depths. Significance is shown as: * p < 0.05; *** p < 0.001.
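| Property | Depth (cm) | Model | Dataset | Mean Δ% (Obs–Pred) | SD Δ% (Obs–Pred) | Min–Max (Obs) | Min–Max (Pred) | KS |
|---|---|---|---|---|---|---|---|---|
| BD (g cm⁻³) | 0–5 | RF | Obs/Pred | 1.34 | 38.46 | 0.82–1.93 | 1.06–1.82 | 0.26 *** |
| BD (g cm⁻³) | 0–5 | RF | Pred (QM) | | | | 0.82–1.92 | 0.09 |
| Total C (g kg⁻¹) | 0–5 | RF | Obs/Pred | 2.97 | 54.94 | 7.38–112.2 | 23.42–65.09 | 0.22 *** |
| Total C (g kg⁻¹) | 0–5 | RF | Pred (QM) | | | | 7.38–112.2 | 0.09 |
| C stock (kg m⁻²) | 0–5 | SVR | Obs/Pred | 3.03 | 46.67 | 0.09–0.84 | 0.15–0.55 | 0.17 * |
| C stock (kg m⁻²) | 0–5 | SVR | Pred (QM) | | | | 0.1–0.81 | 0.12 |
| BD (g cm⁻³) | 0–25 | SVR | Obs/Pred | −2.03 | 26.92 | 0.82–1.93 | 1.12–2.02 | 0.17 |
| BD (g cm⁻³) | 0–25 | SVR | Pred (QM) | | | | 0.84–1.93 | 0.12 |
| Total C (g kg⁻¹) | 0–25 | SVR | Obs/Pred | 9.04 | 28.74 | 5.28–42.7 | 4.7–35.68 | 0.21 * |
| Total C (g kg⁻¹) | 0–25 | SVR | Pred (QM) | | | | 6.22–42.49 | 0.07 |
| C stock (kg m⁻²) | 0–25 | SVR | Obs/Pred | 26.3 | 48.72 | 0.22–12.25 | 0.41–10.54 | 0.39 |
| C stock (kg m⁻²) | 0–25 | SVR | Pred (QM) | | | | 0.22–12.25 | 0.09 |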
BD: Bulk density; RF: Random Forest; SVR: Support Vector Regression.
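The comparison summarised in Table 2 (a KS statistic between observed and predicted distributions, then again after Quantile Mapping) can be sketched with the standard library alone. This is a generic empirical quantile-mapping variant, not necessarily the one used in the study; the sample values are hypothetical, and p-values (the significance stars in the table) are not computed here.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The maximum gap between two step functions occurs at a data point.
    for x in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


def quantile(sorted_vals, p):
    """Linearly interpolated quantile of sorted values for p in [0, 1]."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def quantile_map(pred, obs):
    """Replace each prediction by the observed value at the same rank-based probability."""
    obs_sorted, pred_sorted = sorted(obs), sorted(pred)
    n = len(pred_sorted)
    corrected = []
    for value in pred:
        rank = bisect_right(pred_sorted, value) - 1  # index of last element <= value
        p = rank / (n - 1) if n > 1 else 0.5
        corrected.append(quantile(obs_sorted, p))
    return corrected


# Hypothetical bulk density samples: predictions are narrower than observations.
obs = [0.82, 1.10, 1.25, 1.40, 1.93]
pred = [1.06, 1.20, 1.28, 1.35, 1.82]

print("KS before QM:", round(ks_statistic(obs, pred), 2))
print("KS after QM: ", round(ks_statistic(obs, quantile_map(pred, obs)), 2))
```

With equal sample sizes and an in-sample mapping, as in this toy case, the corrected values reproduce the observed distribution exactly, so the KS statistic drops to zero. In practice the mapping would be fitted on calibration data and applied to new predictions, so some residual mismatch is expected.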

Benhalima, Y.; Santos, E.S.; Arán, D. Long-Term Assessment of Soil Carbon Dynamics in Post-Fire Conditions: Evidence from Digital Soil Mapping Approaches. Soil Syst. 2026, 10, 17. https://doi.org/10.3390/soilsystems10010017
