Bridging Pedology and Data Science: Machine Learning Applications for Soil Organic Matter and Carbon Analysis
Abstract
1. Introduction
1.1. Background and Significance
1.2. Traditional Approaches and Their Evolution
1.3. The Machine Learning Revolution
1.4. Objectives
1.5. Novel Contributions and Review Scope
1.6. Review Type and Methodology
2. Soil Organic Matter
2.1. Definition and Composition
2.2. Forms and Stability of Soil Organic Matter: Implications for Predictive Modelling
2.3. Factors Affecting Soil Organic Matter Content
2.4. Challenges in Soil Organic Matter Assessment
3. Classical Approaches to Soil Organic Matter and Carbon Analysis
3.1. Laboratory Analytical Methods
3.1.1. Wet Chemical Oxidation
3.1.2. Dry Combustion (Elemental Analysis)
3.1.3. Loss-on-Ignition
3.1.4. Spectroscopic Methods
3.2. Field Sampling and Geostatistical Analysis
3.2.1. Sampling Strategies
3.2.2. Geostatistical Methods
3.3. Remote Sensing in Classical Frameworks
3.4. Strengths and Limitations of Classical Approaches
4. Machine Learning Approaches to Soil Organic Matter Analysis
4.1. Overview of Machine Learning Techniques
4.2. Regression Methods
4.2.1. Linear and Polynomial Regression
4.2.2. Support Vector Regression
4.2.3. Random Forests
4.2.4. Gradient Boosting Machines
4.3. Neural Networks and Deep Learning
4.3.1. Artificial Neural Networks
4.3.2. Convolutional Neural Networks
4.3.3. Recurrent Neural Networks
5. Challenges and Limitations of ML Approaches
5.1. Data Distribution Challenges Specific to SOC
5.2. Data Scarcity
5.3. Interpretability and the Black Box Problem
5.4. Causality Versus Correlation
5.5. Spatial Autocorrelation and Data Leakage
5.6. Non-Stationarity
5.7. Reproducibility and Model Transparency
6. Comparative Analysis: Classical Versus Machine Learning Approaches
6.1. Prediction Accuracy
6.2. Cost and Efficiency
6.3. Data Requirements and Availability
6.4. Interpretability and Scientific Understanding
6.5. Uncertainty Quantification
6.6. Transferability, Domain Shift, and Generalisation
6.7. Temporal Dynamics and Change Detection
6.8. Hybrid Frameworks: Architecture and Implementation
6.8.1. Fusion Architecture Descriptions
6.8.2. Structured Implementation Workflow
6.8.3. Synthesised Case Studies
7. Challenges and Opportunities
8. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
| --- | --- |
| SOM | Soil Organic Matter |
| ML | Machine Learning |
| RF | Random Forest |
| OM | Organic Matter |
| SOC | Soil Organic Carbon |
| LOI | Loss-on-Ignition |
| VNIR | Visible and Near-Infrared |
| FTIR | Fourier Transform Infrared |
| MIR | Mid-Infrared |
| NDVI | Normalised Difference Vegetation Index |
| vis-NIR | Visible-Near-Infrared |
| PLSR | Partial Least Squares Regression |
| LS-SVM | Least Squares Support Vector Machines |
| ELM | Extreme Learning Machines |
| EPO | External Parameter Orthogonalisation |
| VNIR-MIR | Vis-NIR and Mid-Infrared |
| NiOA | Ninja Optimisation Algorithm |
| SVR | Support Vector Regression |
| SOCD | Soil Organic Carbon Density |
| GBRT | Gradient-Boosted Regression Trees |
| ANNs | Artificial Neural Networks |
| CNNs | Convolutional Neural Networks |
| RNNs | Recurrent Neural Networks |
| LSTM | Long Short-Term Memory |
| GRUs | Gated Recurrent Units |
| BiGRU | Bidirectional Gated Recurrent Units |
| TPML | Two-Point ML |
| AfSIS | Africa Soil Information Service |
References
- Georgiou, K.; Jackson, R.B.; Vindušková, O.; Abramoff, R.Z.; Ahlström, A.; Feng, W.; Harden, J.W.; Pellegrini, A.F.A.; Polley, H.W.; Soong, J.L.; et al. Global stocks and capacity of mineral-associated soil organic carbon. Nat. Commun. 2022, 13, 3797. [Google Scholar] [CrossRef] [PubMed]
- Lal, R. Soil Carbon Sequestration Impacts on Global Climate Change and Food Security. Science 2004, 304, 1623–1627. [Google Scholar] [CrossRef]
- Smith, P.; House, J.I.; Bustamante, M.; Sobocká, J.; Harper, R.; Pan, G.; West, P.C.; Clark, J.M.; Adhya, T.; Rumpel, C.; et al. Global change pressures on soils from land use and management. Glob. Change Biol. 2015, 22, 1008–1028. [Google Scholar] [CrossRef]
- Batjes, N.H. Total carbon and nitrogen in the soils of the world. Eur. J. Soil Sci. 1996, 47, 151–163. [Google Scholar] [CrossRef]
- Post, W.M.; Kwon, K.C. Soil carbon sequestration and land-use change: Processes and potential. Glob. Change Biol. 2000, 6, 317–327. [Google Scholar] [CrossRef]
- Beillouin, D.; Corbeels, M.; Demenois, J.; Berre, D.; Boyer, A.; Fallot, A.; Feder, F.; Cardinael, R. A global meta-analysis of soil organic carbon in the Anthropocene. Nat. Commun. 2023, 14, 3700. [Google Scholar] [CrossRef]
- Bokati, L.; Somenahally, A.; Kumar, S.; Robatjazi, J.; Talchabhadel, R.; Sarkar, R.; Perepi, R. Temporal adjustment approach for high-resolution continental scale modeling of soil organic carbon. Sci. Rep. 2025, 15, 6483. [Google Scholar] [CrossRef]
- Celestina, C.; Hunt, J.R.; Sale, P.W.G.; Franks, A.E. Attribution of crop yield responses to application of organic amendments: A critical review. Soil Tillage Res. 2019, 186, 135–145. [Google Scholar] [CrossRef]
- Wilpiszeski, R.L.; Aufrecht, J.A.; Retterer, S.T.; Sullivan, M.B.; Graham, D.E.; Pierce, E.M.; Zablocki, O.D.; Palumbo, A.V.; Elias, D.A. Soil Aggregate Microbial Communities: Towards Understanding Microbiome Interactions at Biologically Relevant Scales. Appl. Environ. Microbiol. 2019, 85, e00324-19. [Google Scholar] [CrossRef]
- Țopa, D.C.; Căpșună, S.; Calistru, A.E.; Ailincăi, C. Sustainable Practices for Enhancing Soil Health and Crop Quality in Modern Agriculture: A Review. Agriculture 2025, 15, 998. [Google Scholar] [CrossRef]
- Apesteguia, M.; Plante, A.F.; Virto, I. Methods assessment for organic and inorganic carbon quantification in calcareous soils of the Mediterranean region. Geoderma Reg. 2018, 12, 39–48. [Google Scholar] [CrossRef]
- Gomez, C.; Chevallier, T.; Moulin, P.; Bouferra, I.; Hmaidi, K.; Arrouays, D.; Jolivet, C.; Barthès, B.G. Prediction of soil organic and inorganic carbon concentrations in Tunisian samples by mid-infrared reflectance spectroscopy using a French national library. Geoderma 2020, 375, 114469. [Google Scholar] [CrossRef]
- Huang, B.; Yang, G.; Lei, J.; Wang, X. A partitioned conditioned Latin hypercube sampling method considering spatial heterogeneity in digital soil mapping. Sci. Rep. 2025, 15, 12851. [Google Scholar] [CrossRef]
- Paul, S.S.; Coops, N.C.; Johnson, M.S.; Krzic, M.; Smukler, S.M. Evaluating sampling efforts of standard laboratory analysis and mid-infrared spectroscopy for cost effective digital soil mapping at field scale. Geoderma 2019, 356, 113925. [Google Scholar] [CrossRef]
- Wadoux, A.M.J.C. Artificial intelligence in soil science. Eur. J. Soil Sci. 2025, 76, e70080. [Google Scholar] [CrossRef]
- Bouslihim, Y.; Rochdi, A.; Aboutayeb, R.; El Amrani-Paaza, N.; Miftah, A.; Hssaini, L. Soil Aggregate Stability Mapping Using Remote Sensing and GIS-Based Machine Learning Technique. Front. Earth Sci. 2021, 9, 748859. [Google Scholar] [CrossRef]
- Wu, B.; Zhang, M.; Zeng, H.; Tian, F.; Potgieter, A.B.; Qin, X.; Yan, N.; Chang, S.; Zhao, Z.; Dong, Q. Challenges and opportunities in remote sensing-based crop monitoring: A review. Natl. Sci. Rev. 2022, 10, nwac290. [Google Scholar] [CrossRef]
- Xiao, X.; He, Q.; Ma, S.; Liu, J.; Sun, W.; Lin, Y.; Yi, R. Environmental variables improve the accuracy of remote sensing estimation of soil organic carbon content. Sci. Rep. 2024, 14, 18964. [Google Scholar] [CrossRef]
- Chen, Q.; Wang, Y.; Zhu, X. Soil organic carbon estimation using remote sensing data-driven machine learning. PeerJ 2024, 12, e17836. [Google Scholar] [CrossRef]
- John, K.; Abraham Isong, I.; Michael Kebonye, N.; Okon Ayito, E.; Chapman Agyeman, P.; Marcus Afu, S. Using Machine Learning Algorithms to Estimate Soil Organic Carbon Variability with Environmental Variables and Soil Nutrient Indicators in an Alluvial Soil. Land 2020, 9, 487. [Google Scholar] [CrossRef]
- Miao, T.; Ji, W.; Li, B.; Zhu, X.; Yin, J.; Yang, J.; Huang, Y.; Cao, Y.; Yao, D.; Kong, X. Advanced Soil Organic Matter Prediction with a Regional Soil NIR Spectral Library Using Long ShortTerm Memory–Convolutional Neural Networks: A Case Study. Remote Sens. 2024, 16, 1256. [Google Scholar] [CrossRef]
- Liu, L.; Zhou, W.; Guan, K.; Peng, B.; Xu, S.; Tang, J.; Zhu, Q.; Till, J.; Jia, X.; Jiang, C.; et al. Knowledge-guided machine learning can improve carbon cycle quantification in agroecosystems. Nat. Commun. 2024, 15, 357. [Google Scholar] [CrossRef]
- Sunantha, O.; Shao, Z.; Pattama, P.; Potchara, A.; Huang, X.; Zeeshan, A. Machine learning-based estimation of soil organic carbon in Thailand’s cash crops using multispectral and SAR data fusion combined with environmental variables. Geo-Spat. Inf. Sci. 2025, 28, 2721–2743. [Google Scholar] [CrossRef]
- Abrar, M.M.; Waqas, M.A.; Mehmood, K.; Fan, R.; Memon, M.S.; Khan, M.A.; Siddique, N.; Xu, M.; Du, J. Organic carbon sequestration in global croplands: Evidenced through a bibliometric approach. Front. Environ. Sci. 2025, 13, 1495991. [Google Scholar] [CrossRef]
- Fortuna, A.M.; Starks, P.J.; Moriasi, D.N. Estimation of soil organic carbon as a function of soil pretreatment and spectral features of radiometers within the visible and near-infrared spectra. J. Soil Water Conserv. 2025, 80, 476–490. [Google Scholar] [CrossRef]
- Conant, R.T.; Ryan, M.G.; Ågren, G.I.; Birge, H.E.; Davidson, E.A.; Eliasson, P.E.; Evans, S.E.; Frey, S.D.; Giardina, C.P.; Hopkins, F.M.; et al. Temperature and soil organic matter decomposition rates—Synthesis of current knowledge and a way forward. Glob. Change Biol. 2011, 17, 3392–3404. [Google Scholar] [CrossRef]
- König, A.; Wiesenbauer, J.; Gorka, S.; Marchand, L.; Kitzler, B.; Inselsbacher, E.; Kaiser, C. Reverse microdialysis: A window into root exudation hotspots. Soil Biol. Biochem. 2022, 174, 108829. [Google Scholar] [CrossRef]
- Zhou, Z.; Ren, C.; Wang, C.; Delgado-Baquerizo, M.; Luo, Y.; Luo, Z.; Du, Z.; Zhu, B.; Yang, Y.; Jiao, S.; et al. Global turnover of soil mineral-associated and particulate organic carbon. Nat. Commun. 2024, 15, 5329. [Google Scholar] [CrossRef]
- Wagai, R.; Mayer, L.M.; Kitayama, K.; Knicker, H. Climate and parent material controls on organic matter storage in surface soils: A three-pool, density-separation approach. Geoderma 2008, 147, 23–33. [Google Scholar] [CrossRef]
- Patton, N.R.; Lohse, K.A.; Seyfried, M.S.; Godsey, S.; Parsons, S.B. Topographic controls of soil organic carbon on soil-mantled landscapes. Sci. Rep. 2019, 9, 6390. [Google Scholar] [CrossRef]
- Wang, X.; Chi, Y.; Song, S. Important soil microbiota’s effects on plants and soils: A comprehensive 30-year systematic literature review. Front. Microbiol. 2024, 15, 1347745. [Google Scholar] [CrossRef]
- Engell, I.; Gerigk, J.; Linsler, D.; Joergensen, R.G.; Potthoff, M. Tillage and land use management effects on soil organic matter and soil microbial biomass in a field network of practical farms in France, Romania, and Sweden. Appl. Soil Ecol. 2024, 202, 105584. [Google Scholar] [CrossRef]
- Wan, Q.; Zhu, G.; Guo, H.; Zhang, Y.; Pan, H.; Yong, L.; Ma, H. Influence of Vegetation Coverage and Climate Environment on Soil Organic Carbon in the Qilian Mountains. Sci. Rep. 2019, 9, 17623. [Google Scholar] [CrossRef]
- Hartley, I.P.; Hill, T.C.; Chadburn, S.E.; Hugelius, G. Temperature effects on carbon storage are controlled by soil stabilisation capacities. Nat. Commun. 2021, 12, 6713. [Google Scholar] [CrossRef]
- Huang, Y.; Wei, F. Climate controls the global distribution of soil organic and inorganic carbon. Ecol. Indic. 2025, 175, 113514. [Google Scholar] [CrossRef]
- Schapel, A.; Marschner, P.; Churchman, J. Clay amount and distribution influence organic carbon content in sand with subsoil clay addition. Soil Tillage Res. 2018, 184, 253–260. [Google Scholar] [CrossRef]
- Dalzell, B.J.; Fissore, C.; Nater, E.A. Topography and land use impact erosion and soil organic carbon burial over decadal timescales. CATENA 2022, 218, 106578. [Google Scholar] [CrossRef]
- Zhang, P.; Shao, M. Spatial Variability and Stocks of Soil Organic Carbon in the Gobi Desert of Northwestern China. PLoS ONE 2014, 9, e93584. [Google Scholar] [CrossRef] [PubMed]
- Piñeiro-Juncal, N.; Mateo, M.Á.; Leiva-Dueñas, C.; Serrano, E.; Inostroza, K.; Soler, M.; Apostolaki, E.T.; Lavery, P.; Duarte, C.M.; Lafratta, A.; et al. Soil organic carbon depth profiles and centennial and millennial decay rates in tidal marsh, mangrove and seagrass blue carbon ecosystems. Commun. Earth Environ. 2025, 6, 504. [Google Scholar] [CrossRef]
- Davis, M.; Alves, B.; Karlen, D.; Kline, K.; Galdos, M.; Abulebdeh, D. Review of Soil Organic Carbon Measurement Protocols: A US and Brazil Comparison and Recommendation. Sustainability 2018, 10, 53. [Google Scholar] [CrossRef]
- Bravo-García, J.; Camarillo-Naranjo, J.M.; Blanco-Velázquez, F.J.; Anaya-Romero, M. Soil Organic Carbon Mapping Through Remote Sensing and In Situ Data with Random Forest by Using Google Earth Engine: A Case Study in Southern Africa. Land 2025, 14, 1436. [Google Scholar] [CrossRef]
- Lv, J.; Huang, Z.; Luo, L.; Zhang, S.; Wang, Y. Advances in Molecular and Microscale Characterization of Soil Organic Matter: Current Limitations and Future Prospects. Environ. Sci. Technol. 2022, 56, 12793–12810. [Google Scholar] [CrossRef]
- Walkley, A.; Black, I.A. An Examination of The Degtjareff Method for Determining Soil Organic Matter, and A Proposed Modification of The Chromic Acid Titration Method. Soil Sci. 1934, 37, 29–38. [Google Scholar] [CrossRef]
- Matus, F.J.; Escudey, M.; Förster, J.E.; Gutiérrez, M.; Chang, A.C. Is the Walkley–Black Method Suitable for Organic Carbon Determination in Chilean Volcanic Soils? Commun. Soil Sci. Plant Anal. 2009, 40, 1862–1872. [Google Scholar] [CrossRef]
- Burgos Hernández, T.D.; Slater, B.K.; Shaffer, J.M.; Basta, N. Comparison of methods for determining organic carbon content of urban soils in Central Ohio. Geoderma Reg. 2023, 34, e00680. [Google Scholar] [CrossRef]
- Pallasser, R.; Minasny, B.; McBratney, A.B. Soil carbon determination by thermogravimetrics. PeerJ 2013, 1, e6. [Google Scholar] [CrossRef] [PubMed]
- Hoogsteen, M.J.J.; Lantinga, E.A.; Bakker, E.J.; Groot, J.C.J.; Tittonell, P.A. Estimating soil organic carbon through loss on ignition: Effects of ignition conditions and structural water loss. Eur. J. Soil Sci. 2015, 66, 320–328. [Google Scholar] [CrossRef]
- Salehi, M.H.; Beni, O.H.; Harchegani, H.B.; Borujeni, I.E.; Motaghian, H.R. Refining Soil Organic Matter Determination by Loss-on-Ignition. Pedosphere 2011, 21, 473–482. [Google Scholar] [CrossRef]
- Schulte, E.E.; Hopkins, B.G. Estimation of Soil Organic Matter by Weight Loss-On-Ignition. In Soil Organic Matter: Analysis and Interpretation; Soil Science Society of America, Inc.: Madison, WI, USA, 2015; pp. 21–31. [Google Scholar]
- Viscarra Rossel, R.A.; Behrens, T.; Ben-Dor, E.; Brown, D.J.; Demattê, J.A.M.; Shepherd, K.D.; Shi, Z.; Stenberg, B.; Stevens, A.; Adamchuk, V.; et al. A global spectral library to characterize the world’s soil. Earth-Sci. Rev. 2016, 155, 198–230. [Google Scholar] [CrossRef]
- Ng, W.; Minasny, B.; Jeon, S.H.; McBratney, A. Mid-infrared spectroscopy for accurate measurement of an extensive set of soil properties for assessing soil functions. Soil Secur. 2022, 6, 100043. [Google Scholar] [CrossRef]
- Margenot, A.J.; Parikh, S.J.; Calderón, F.J. Fourier-transform infrared spectroscopy for soil organic matter analysis. Soil Sci. Soc. Am. J. 2023, 87, 1503–1528. [Google Scholar] [CrossRef]
- Walden, L.; Sepanta, F.; Viscarra Rossel, R.A. FT-MIR Spectroscopic Analysis of the Organic Carbon Fractions in Australian Mineral Soils. Eur. J. Soil Sci. 2025, 76, e70084. [Google Scholar] [CrossRef]
- Shepherd, K.D.; Walsh, M.G. Development of Reflectance Spectral Libraries for Characterization of Soil Properties. Soil Sci. Soc. Am. J. 2002, 66, 988–998. [Google Scholar] [CrossRef]
- Carter, T.L.; Schaecher, C.; Monteith, S.; Ferguson, R. Using combustion analysis to simultaneously measure soil organic and inorganic carbon. Geoderma 2024, 451, 117066. [Google Scholar] [CrossRef]
- Virk, S.; Tucker, M.; Harris, G.; Smith, A.; Levi, M.; Lessl, J. Efficacy and Economics of Different Soil Sampling Grid Sizes for Site-Specific Nutrient Management in Southeastern USA. Agronomy 2025, 15, 903. [Google Scholar] [CrossRef]
- Marcaida, M.; Workman, K.; Czymmek, K.J.; Ketterings, Q.M. Grid-based soil sampling for Northeast Region phosphorus index assessment. Soil Sci. Soc. Am. J. 2025, 89, e70156. [Google Scholar] [CrossRef]
- Brus, D.J.; Kempen, B.; Heuvelink, G.B.M. Sampling for validation of digital soil maps. Eur. J. Soil Sci. 2011, 62, 394–407. [Google Scholar] [CrossRef]
- Adamchuk, V.I.; Viscarra Rossel, R.A.; Marx, D.B.; Samal, A.K. Using targeted sampling to process multivariate soil sensing data. Geoderma 2011, 163, 63–73. [Google Scholar] [CrossRef]
- Long, J.; Liu, Y.; Xing, S.; Qiu, L.; Huang, Q.; Zhou, B.; Shen, J.; Zhang, L. Effects of sampling density on interpolation accuracy for farmland soil organic matter concentration in a large region of complex topography. Ecol. Indic. 2018, 93, 562–571. [Google Scholar] [CrossRef]
- Radočaj, D.; Jug, I.; Vukadinović, V.; Jurišić, M.; Gašparović, M. The Effect of Soil Sampling Density and Spatial Autocorrelation on Interpolation Accuracy of Chemical Soil Properties in Arable Cropland. Agronomy 2021, 11, 2430. [Google Scholar] [CrossRef]
- Brenning, A. Spatial prediction models for landslide hazards: Review, comparison and evaluation. Nat. Hazards Earth Syst. Sci. 2005, 5, 853–862. [Google Scholar] [CrossRef]
- Goovaerts, P. Geostatistics in soil science: State-of-the-art and perspectives. Geoderma 1999, 89, 1–45. [Google Scholar] [CrossRef]
- Fongaro, C.T.; Demattê, J.A.M.; Rizzo, R.; Lucas Safanelli, J.; Mendes, W.D.S.; Dotto, A.C.; Vicente, L.E.; Franceschini, M.H.D.; Ustin, S.L. Improvement of Clay and Sand Quantification Based on a Novel Approach with a Focus on Multispectral Satellite Images. Remote Sens. 2018, 10, 1555. [Google Scholar] [CrossRef]
- Yan, K.; Wang, D.; Feng, Y.; Hou, S.; Zhang, Y.; Yang, H. Digital mapping of soil organic carbon in a plain area based on time-series features. Ecol. Indic. 2025, 171, 113215. [Google Scholar] [CrossRef]
- Dhawale, N.M.; Adamchuk, V.I.; Prasher, S.O.; Viscarra Rossel, R.A. Evaluating the Precision and Accuracy of Proximal Soil vis–NIR Sensors for Estimating Soil Organic Matter and Texture. Soil Syst. 2021, 5, 48. [Google Scholar] [CrossRef]
- Whalen, E.D.; Grandy, A.S.; Geyer, K.M.; Morrison, E.W.; Frey, S.D. Microbial trait multifunctionality drives soil organic matter formation potential. Nat. Commun. 2024, 15, 10209. [Google Scholar] [CrossRef] [PubMed]
- Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109–130. [Google Scholar] [CrossRef]
- Sothe, C.; Gonsamo, A.; Arabian, J.; Snider, J. Large scale mapping of soil organic carbon concentration with 3D machine learning and satellite observations. Geoderma 2022, 405, 115402. [Google Scholar] [CrossRef]
- van Wesemael, B.; Chabrillat, S.; Dias, A.S.; Berger, M.; Szantoi, Z. Remote Sensing for Soil Organic Carbon Mapping and Monitoring. Remote Sens. 2023, 15, 3464. [Google Scholar] [CrossRef]
- Stockmann, U.; Adams, M.A.; Crawford, J.W.; Field, D.J.; Henakaarchchi, N.; Jenkins, M.; Minasny, B.; McBratney, A.B.; Courcelles, V.R.; Singh, K.; et al. The knowns, known unknowns and unknowns of sequestration of soil organic carbon. Agric. Ecosyst. Environ. 2013, 164, 80–99. [Google Scholar] [CrossRef]
- Roper, W.R.; Robarge, W.P.; Osmond, D.L.; Heitman, J.L. Comparing Four Methods of Measuring Soil Organic Matter in North Carolina Soils. Soil Sci. Soc. Am. J. 2019, 83, 466. [Google Scholar] [CrossRef]
- Burgess, T.M.; Webster, R. Optimal interpolation and isarithmic mapping of soil properties: I The semi-variogram and punctual kriging. Eur. J. Soil Sci. 2019, 70, 11–19. [Google Scholar] [CrossRef]
- Wang, W.; Li, Q. Smart farming revolution: Leveraging machine learning for sustainable agriculture. J. Clean. Prod. 2025, 527, 146434. [Google Scholar] [CrossRef]
- Xing, Y.; Xie, Y.; Wang, X. Enhancing soil health through balanced fertilization: A pathway to sustainable agriculture and food security. Front. Microbiol. 2025, 16, 1536524. [Google Scholar]
- Yang, M.; Xu, D.; Chen, S.; Li, H.; Shi, Z. Evaluation of Machine Learning Approaches to Predict Soil Organic Matter and pH Using vis-NIR Spectra. Sensors 2019, 19, 263. [Google Scholar] [CrossRef]
- Mundada, S.; Jain, P. Predicting soil organic carbon with ensemble learning techniques by using satellite images for precision farming. Sci. Rep. 2025, 15, 28760. [Google Scholar] [CrossRef]
- Kakhani, N.; Taghizadeh-Mehrjardi, R.; Omarzadeh, D.; Ryo, M.; Heiden, U.; Scholten, T. Towards Explainable AI: Interpreting Soil Organic Carbon Prediction Models Using a Learning-Based Explanation Method. Eur. J. Soil Sci. 2025, 76, e70071. [Google Scholar] [CrossRef]
- Hutengs, C.; Eisenhauer, N.; Schädler, M.; Cesarz, S.; Lochner, A.; Seidel, M.; Vohland, M. Enhanced VNIR and MIR proximal sensing of soil organic matter and PLFA-derived soil microbial properties through machine learning ensembles and external parameter orthogonalization. Geoderma 2024, 450, 117037. [Google Scholar] [CrossRef]
- Ben Ghorbal, A.; Grine, A.; Eid, M.M.; El-kenawy, E.S.M. Sustainable soil organic carbon prediction using machine learning and the ninja optimization algorithm. Front. Environ. Sci. 2025, 13, 1630762. [Google Scholar] [CrossRef]
- Ngu, N.H.; Trung, N.H.; Shinjo, H.; Chotpantarat, S.; Thanh, N.N. Improving spatial prediction of soil organic matter in central Vietnam using Bayesian-enhanced machine learning and environmental covariates. Arch. Agron. Soil Sci. 2025, 71, 1–17. [Google Scholar] [CrossRef]
- Narváez-Ortiz, W.A.; Reyes-Valdés, M.H.; Cabrera-De la Fuente, M.; Benavides-Mendoza, A. Multiple Linear and Polynomial Models for Studying the Dynamics of the Soil Solution. Soil Syst. 2022, 6, 42. [Google Scholar] [CrossRef]
- Rukhovich, D.; Koroleva, P.; Rukhovich, A.; Komissarov, M. A detailed mapping of soil organic matter content in arable land based on the multitemporal soil line coefficients and neural network filtering of big remote sensing data. Geoderma 2024, 447, 116941. [Google Scholar] [CrossRef]
- Law, T.; Shawe-Taylor, J. Practical Bayesian support vector regression for financial time series prediction and market condition change detection. Quant. Financ. 2017, 17, 1403–1416. [Google Scholar] [CrossRef]
- Elhallaoui Oueldkaddour, F.Z.; Wariaghli, F.; Brirhet, H.; Yahyaoui, A.; Jaziri, H. Comparison of Machine Learning Models for Real-Time Flow Forecasting in the Semi-Arid Bouregreg Basin. Limnol. Rev. 2025, 25, 6. [Google Scholar] [CrossRef]
- Were, K.; Bui, D.T.; Dick, Ø.B.; Singh, B.R. A comparative assessment of support vector regression, artificial neural networks, and random forests for predicting and mapping soil organic carbon stocks across an Afromontane landscape. Ecol. Indic. 2015, 52, 394–403. [Google Scholar] [CrossRef]
- Laref, R.; Losson, E.; Sava, A.; Siadat, M. On the optimization of the support vector machine regression hyperparameters setting for gas sensors array applications. Chemom. Intell. Lab. Syst. 2019, 184, 22–27. [Google Scholar] [CrossRef]
- Greve, M.H.; Kheir, R.B.; Greve, M.B.; Bøcher, P.K. Quantifying the ability of environmental parameters to predict soil texture fractions using regression-tree model with GIS and LIDAR data: The case study of Denmark. Ecol. Indic. 2012, 18, 1–10. [Google Scholar] [CrossRef]
- Siqueira, R.G.; Moquedace, C.M.; Fernandes-Filho, E.I.; Schaefer, C.E.G.R.; Francelino, M.R.; Sacramento, I.F.; Michel, R.F.M. Modelling and prediction of major soil chemical properties with Random Forest: Machine learning as tool to understand soil-environment relationships in Antarctica. CATENA 2024, 235, 107677. [Google Scholar] [CrossRef]
- Valavi, R.; Elith, J.; Lahoz-Monfort, J.J.; Guillera-Arroita, G. Modelling species presence-only data with random forests. Ecography 2021, 44, 1731–1742. [Google Scholar] [CrossRef]
- Ho, V.H.; Morita, H.; Bachofer, F.; Ho, T.H. Random forest regression kriging modeling for soil organic carbon density estimation using multi-source environmental data in central Vietnamese forests. Model. Earth Syst. Environ. 2024, 10, 7137–7158. [Google Scholar] [CrossRef]
- Gao, D.; Zhang, Y.X.; Zhao, Y.H. Random forest algorithm for classification of multiwavelength data. Res. Astron. Astrophys. 2009, 9, 220–226. [Google Scholar] [CrossRef]
- Tang, F.; Ishwaran, H. Random forest missing data algorithms. Stat. Anal. Data Min. ASA Data Sci. J. 2017, 10, 363–377. [Google Scholar] [CrossRef] [PubMed]
- Probst, P.; Wright, M.N.; Boulesteix, A. Hyperparameters and tuning strategies for random forest. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2019, 9, e1301. [Google Scholar] [CrossRef]
- Adeniyi, O.D.; Brenning, A.; Maerker, M. Spatial prediction of soil organic carbon: Combining machine learning with residual kriging in an agricultural lowland area (Lombardy region, Italy). Geoderma 2024, 448, 116953. [Google Scholar] [CrossRef]
- Wang, L.; Abramowitz, G.; Wang, Y.P.; Pitman, A.; Viscarra Rossel, R.A. An ensemble estimate of Australian soil organic carbon using machine learning and process-based modelling. SOIL 2024, 10, 619–636. [Google Scholar] [CrossRef]
- Hateffard, F.; Szatmári, G.; Novák, T.J. Applicability of machine learning models for predicting soil organic carbon content and bulk density under different soil conditions. Soil Sci. Annu. 2023, 74, 165879. [Google Scholar] [CrossRef]
- Li, Y.; Yao, G.; Li, S.; Dong, X. Predicting and Mapping of Soil Organic Matter with Machine Learning in the Black Soil Region of the Southern Northeast Plain of China. Agronomy 2025, 15, 533. [Google Scholar] [CrossRef]
- Natekin, A.; Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobot. 2013, 7, 21. [Google Scholar] [CrossRef]
- Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining—KDD ’16; Association for Computing Machinery: New York, NY, USA, 2016; Volume 1, pp. 785–794. [Google Scholar]
- Hamzehpour, N.; Shafizadeh-Moghadam, H.; Valavi, R. Exploring the driving forces and digital mapping of soil organic carbon using remote sensing and soil texture. CATENA 2019, 182, 104141. [Google Scholar] [CrossRef]
- Kalambukattu, J.G.; Kumar, S.; Das, B.; Roy, T. Digital mapping of soil organic carbon in the hilly and mountainous landscape of Indian Himalayan region employing machine-learning techniques. Discov. Soil 2025, 2, 35. [Google Scholar] [CrossRef]
- Zou, X.; Wu, Z.; Fan, D.; Wu, Z.; Zhu, Y.; Ou, J. Exploring the Main Control Factors of Soil Organic Carbon in Riparian Farmland by Using Gradient Boosting Decision Tree. Eurasian Soil Sci. 2025, 58, 93. [Google Scholar]
- López, O.A.M.; López, A.M.; Crossa, J. Fundamentals of Artificial Neural Networks and Deep Learning. In Multivariate Statistical Machine Learning Methods for Genomic Prediction; Springer: Cham, Switzerland, 2022; pp. 379–425. [Google Scholar]
- Hornik, K. Approximation capabilities of multilayer feedforward networks. Neural Netw. 1991, 4, 251–257. [Google Scholar] [CrossRef]
- Pacci, S.; Dengiz, O.; Alaboz, P.; Saygın, F. Artificial neural networks in soil quality prediction: Significance for sustainable tea cultivation. Sci. Total Environ. 2024, 947, 174447. [Google Scholar] [CrossRef] [PubMed]
- Nawar, S.; Mouazen, A. Comparison between Random Forests, Artificial Neural Networks and Gradient Boosted Machines Methods of On-Line Vis-NIR Spectroscopy Measurements of Soil Total Nitrogen and Total Carbon. Sensors 2017, 17, 2428. [Google Scholar] [CrossRef] [PubMed]
- Carbajal, M.; Ramírez, D.A.; Turin, C.; Schaeffer, S.M.; Konkel, J.; Ninanya, J.; Rinza, J.; Mendiburu, F.D.; Zorogastua, P.; Villaorduña, L.; et al. From Rangelands to Cropland, Land-Use Change and Its Impact on Soil Organic Carbon Variables in a Peruvian Andean Highlands: A Machine Learning Modeling Approach. Ecosystems 2024, 27, 899–917. [Google Scholar] [CrossRef]
- Tian, X.; Ahrens, B.; Rossdeutscher, L.; Alonso, L.; Parente, L. Soil science-informed neural networks for soil organic carbon density modelling under scarce bulk density data. EGUsphere 2026. [Google Scholar] [CrossRef]
- Emadi, M.; Taghizadeh-Mehrjardi, R.; Cherati, A.; Danesh, M.; Mosavi, A.; Scholten, T. Predicting and Mapping of Soil Organic Carbon Using Machine Learning Algorithms in Northern Iran. Remote Sens. 2020, 12, 2234. [Google Scholar] [CrossRef]
- Ding, Z.; Liu, K.; Grunwald, S.; Smith, P.; Ciais, P.; Wang, B.; Wadoux, A.M.J.C.; Ferreira, C.; Karunaratne, S.; Shurpali, N.; et al. Advancing Soil Organic Carbon Prediction: A Comprehensive Review of Technologies, AI, Process-Based and Hybrid Modelling Approaches. Adv. Sci. 2025, 12, e04152. [Google Scholar] [CrossRef]
- Triantakonstantis, D.; Karakostas, A. Soil Organic Carbon Monitoring and Modelling via Machine Learning Methods Using Soil and Remote Sensing Data. Agriculture 2025, 15, 910. [Google Scholar] [CrossRef]
- Sarkar, R.; Ray, R.L. Application of artificial neural network algorithms to estimate spatial soil organic carbon stock in Prairie lands from remote sensing and soil data pool. Geocarto Int. 2025, 40, 2597423. [Google Scholar] [CrossRef]
- Honorato, M.; Coelho, A.P.; Fernandes, C.; Matheus, S.; Claudia, M. Estimation of soil organic matter content by modeling with artificial neural networks. Geoderma 2019, 350, 46–51. [Google Scholar] [CrossRef]
- Guo, P.T.; Wu, W.; Sheng, Q.; Li, M.F.; Liu, H.; Wang, Z.Y. Prediction of soil organic matter using artificial neural network and topographic indicators in hilly areas. Nutr. Cycl. Agroecosystems 2013, 95, 333–344. [Google Scholar] [CrossRef]
- Mansur, N.; Abbod, M. Machine learning-based estimation of soil organic matter using RGB values. DYSONA-Appl. Sci. 2026, 7, 73–81.
- Khan, A.; Sohail, A.; Zahoora, U.; Qureshi, A.S. A survey of the recent architectures of deep convolutional neural networks. Artif. Intell. Rev. 2020, 53, 5455–5516.
- Alhatami, E.; Huang, M.; Bhatti, U.A.; Bhatti, U.A. Chapter 14: Remote sensing image fusion based on deep learning and convolutional neural network technique. In Deep Learning for Earth Observation and Climate Monitoring; Elsevier: Amsterdam, The Netherlands, 2025; pp. 265–277. Available online: https://www.sciencedirect.com/science/article/pii/B9780443247125000166 (accessed on 21 May 2026).
- Tziolas, N.; Tsakiridis, N.; Heiden, U.; van Wesemael, B. Soil organic carbon mapping utilizing convolutional neural networks and Earth observation data, a case study in Bavaria state Germany. Geoderma 2024, 444, 116867.
- Wang, H.; Sun, Q.; Niu, X.; Liu, K.; Zhang, J.; Hao, Z.; Xu, D. Soil Organic Carbon Prediction Using an Efficient Channel Attention-Enhanced CNN-LSTM Model with LUCAS Spectral Library. Eur. J. Soil Sci. 2025, 76, e70202.
- Guo, L.; Gao, Q.; Zhang, M.; Cheng, P.; He, P.; Li, L.; Ding, D.; Liu, C.; Muga, F.C.; Kamal, M.; et al. Soil Organic Matter Content Prediction Using Multi-Input Convolutional Neural Network Based on Multi-Source Information Fusion. Agriculture 2025, 15, 1313.
- Li, T.; Xia, A.; McLaren, T.I.; Pandey, R.; Xu, Z.; Liu, H.; Manning, S.; Madgett, O.; Duncan, S.; Rasmussen, P.; et al. Preliminary Results in Innovative Solutions for Soil Carbon Estimation: Integrating Remote Sensing, Machine Learning, and Proximal Sensing Spectroscopy. Remote Sens. 2023, 15, 5571.
- Wang, H.; Zhang, L.; Zhao, J. Application of a Fusion Attention Mechanism-Based Model Combining Bidirectional Gated Recurrent Units and Recurrent Neural Networks in Soil Nutrient Content Estimation. Agronomy 2023, 13, 2724.
- Zhang, L.; Cai, Y.; Huang, H.; Li, A.; Yang, L.; Zhou, C. A CNN-LSTM Model for Soil Organic Carbon Content Prediction with Long Time Series of MODIS-Based Phenological Variables. Remote Sens. 2022, 14, 4441.
- Pavlovic, M.; Ilic, S.; Ralevic, N.; Antonic, N.; Raffa, D.W.; Bandecchi, M.; Culibrk, D. A Deep Learning Approach to Estimate Soil Organic Carbon from Remote Sensing. Remote Sens. 2024, 16, 655.
- Broeg, T.; Blaschek, M.; Seitz, S.; Taghizadeh-Mehrjardi, R.; Zepp, S.; Scholten, T. Transferability of Covariates to Predict Soil Organic Carbon in Cropland Soils. Remote Sens. 2023, 15, 876.
- Zhang, L.; Yang, L.; Ma, Y.; Zhu, A.-X.; Wei, R.; Liu, J.; Greve, M.H.; Zhou, C. Regional-scale soil carbon predictions can be enhanced by transferring global-scale soil–environment relationships. Geoderma 2025, 461, 117466.
- Novielli, P.; Magarelli, M.; Romano, D.; Di Bitonto, P.; Stellacci, A.M.; Monaco, A.; Amoroso, N.; Bellotti, R.; Tangaro, S. Leveraging explainable AI to predict soil respiration sensitivity and its drivers for climate change mitigation. Sci. Rep. 2025, 15, 12527.
- Suleymanov, A.; Komissarov, M.; Suleymanov, R.; Gabbasova, I. The Basic Soil Structure Parameters and Their Spatial Prediction Using Machine Learning and Remote Sensing Data in Semi-Arid Trans-Ural Steppe Zone, Russia. Soil Syst. 2026, 10, 11.
- Kmoch, A.; Harrison, C.T.; Choi, J.; Uuemaa, E. Spatial autocorrelation in machine learning for modelling soil organic carbon. Ecol. Inform. 2025, 86, 103057.
- Chinilin, A.; Savin, I.Y. Combining machine learning and environmental covariates for mapping of organic carbon in soils of Russia. Egypt. J. Remote Sens. Space Sci. 2023, 26, 666–675.
- Yin, Y.; Gao, B.; Xu, H.; Wang, Y.; Xie, D.; Liu, Y.; Wang, C. Soil organic matter mapping in complex terrains considering spatial heterogeneity. Environ. Model. Softw. 2025, 192, 106569.
- Heuvelink, G.B.M.; Angelini, M.E.; Poggio, L.; Bai, Z.; Batjes, N.H.; van den Bosch, R.; Bossio, D.; Estella, S.; Lehmann, J.; Olmedo, G.F.; et al. Machine learning in space and time for modelling soil organic carbon change. Eur. J. Soil Sci. 2020, 72, 1607–1623.
- Hengl, T.; Mendes de Jesus, J.; Heuvelink, G.B.M.; Ruiperez Gonzalez, M.; Kilibarda, M.; Blagotić, A.; Shangguan, W.; Wright, M.N.; Geng, X.; Bauer-Marschallinger, B.; et al. SoilGrids250m: Global gridded soil information based on machine learning. PLoS ONE 2017, 12, e0169748.
- Poggio, L.; de Sousa, L.M.; Batjes, N.H.; Heuvelink, G.B.M.; Kempen, B.; Ribeiro, E.; Rossiter, D. SoilGrids 2.0: Producing soil information for the globe with quantified spatial uncertainty. SOIL 2021, 7, 217–240.
- McBratney, A.B.; Mendonça Santos, M.L.; Minasny, B. On digital soil mapping. Geoderma 2003, 117, 3–52.
- Qu, L.; Lu, H.; Tian, Z.; Schoorl, J.M.; Huang, B.; Liang, Y.; Qiu, D.; Liang, Y. Spatial prediction of soil sand content at various sampling density based on geostatistical and machine learning algorithms in plain areas. CATENA 2024, 234, 107572.
- Lundberg, S.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. arXiv 2017.
- Padarian, J.; Minasny, B.; McBratney, A.B. Machine learning and soil sciences: A review aided by machine learning tools. SOIL 2020, 6, 35–52.
- Paciorek, C.J.; Stone, D.A.; Wehner, M.F. Quantifying statistical uncertainty in the attribution of human influence on severe weather. Weather Clim. Extrem. 2018, 20, 69–80.
- Lark, R.M.; Lapworth, D.J. Quality measures for soil surveys by lognormal kriging. Geoderma 2012, 173–174, 231–240.
- Shi, Y.; Wei, P.; Feng, K.; Feng, D.C.; Beer, M. A survey on machine learning approaches for uncertainty quantification of engineering systems. Mach. Learn. Comput. Sci. Eng. 2025, 1, 11.
- Luo, L.; Chen, B.; Zeng, S.; Li, Y.; Chen, X.; Zhang, J.; Guo, X.; Li, S.; Ruan, L.; Zhu, S.; et al. Machine learning integrates region-specific microbial signatures to distinguish geographically adjacent populations within a province. Front. Microbiol. 2025, 16, 1586195.
- Amoli, M.G.; Hasanlou, M.; Samadzadegan, F.; Taghizadeh-Mehrjardi, R.; Dadrass Javan, F. Estimating soil organic carbon using time series Band 11 (SWIR) of multispectral Sentinel-2 satellite images and machine learning algorithms. Remote Sens. Appl. Soc. Environ. 2025, 40, 101736.
- Luo, D.; Xie, Y.; Tang, J.; Xu, J.; Zhang, M.; Cheng, H.; Luo, H.; Ouyang, W. Improving the prediction accuracy of soil organic matter: Addressing the challenge of soil moisture variability. Ecol. Indic. 2025, 179, 114249.
- Scharlemann, J.P.; Tanner, E.V.; Hiederer, R.; Kapos, V. Global soil carbon: Understanding and managing the largest terrestrial carbon pool. Carbon Manag. 2014, 5, 81–91.
- Lin, Z.; Dai, Y.; Mishra, U.; Wang, G.; Shangguan, W.; Zhang, W.; Qin, Z. Global and regional soil organic carbon estimates: Magnitude and uncertainties. Pedosphere 2023, 34, 685–698.
- Chen, S.; Arrouays, D.; Leatitia Mulder, V.; Poggio, L.; Minasny, B.; Roudier, P.; Libohova, Z.; Lagacherie, P.; Shi, Z.; Hannam, J.; et al. Digital mapping of GlobalSoilMap soil properties at a broad scale: A review. Geoderma 2022, 409, 115567.
- Kebonye, N.M.; John, K.; Delgado-Baquerizo, M.; Zhou, Y.; Agyeman, P.C.; Seletlo, Z.; Heung, B.; Scholten, T. Major overlap in plant and soil organic carbon hotspots across Africa. Sci. Total Environ. 2024, 951, 175476.
- Garrett, L.G.; Byers, A.K.; Chen, C.; Lan, Z.; Bahadori, M.; Wakelin, S.A. The hidden depths of forest soil organic carbon chemistry in a pumice soil. Geoderma Reg. 2024, 36, e00760.
- Feeney, C.; Cosby, B.J.; Robinson, D.A.; Thomas, A.; Emmett, B.; Henrys, P. Multiple soil map comparison highlights challenges for predicting topsoil organic carbon concentration at national scale. Sci. Rep. 2022, 12, 1379.
- Nair, M.; Svedberg, P.; Larsson, I.; Nygren, J.M. A comprehensive overview of barriers and strategies for AI implementation in healthcare: Mixed-method design. PLoS ONE 2024, 19, e0305949.
- Habib, S.; Tahir, F.; Hussain, F.; Macauley, N.; Al-Ghamdi, S.G. Current and emerging technologies for carbon accounting in urban landscapes: Advantages and limitations. Ecol. Indic. 2023, 154, 110603.
- Bui, E.N.; Searle, R.D.; Wilson, P.R.; Philip, S.R.; Thomas, M.; Brough, D.; Harms, B.; Hill, J.V.; Holmes, K.; Henry, J.; et al. Soil surveyor knowledge in digital soil mapping and assessment in Australia. Geoderma Reg. 2020, 22, e00299.
- Sun, Z.; Liu, F.; Wu, H.; Zhang, G.L. Developing a national black soil map of China through machine learning classification. CATENA 2024, 240, 107993.
- Fu, H.; Zhao, H.; Liu, G.; Zhang, Y.; Huangfu, X.; Jiang, J. Forest aboveground carbon storage estimation and uncertainty analysis by coupled multi-source remote sensing data in Liaoning Province. Ecol. Indic. 2025, 176, 113729.
- Goyette, J.O.; Loiselle, A.; Mendes, P.; Cimon-Morin, J.; Pellerin, S.; Poulin, M.; Dupras, J. Above and belowground carbon stocks among organic soil wetland types, accounting for peat bathymetry. Sci. Total Environ. 2024, 946, 174177.
- Jiang, R.; Sui, Y.; Zhang, X.; Lin, N.; Zheng, X.; Li, B.; Zhang, L.; Li, X.; Yu, H. Estimation of soil organic carbon by combining hyperspectral and radar remote sensing to reduce coupling effects of soil surface moisture and roughness. Geoderma 2024, 444, 116874.
- Chen, Z.; Chen, Y.; Shi, T.; Chen, X.; Pan, X.; Lei, J.; Wu, T.; Li, Y.; Liu, Q.; Liu, X.; et al. Estimation of soil organic carbon in tropical rainforest regions by combining UAV hyperspectral and LiDAR data. CATENA 2025, 258, 109195.




| Geographic Focus | Soil Type/Region | Method(s) Tested | Key Covariates | Outputs/Target | Performance (R²/RMSE/MAE) | Reference |
|---|---|---|---|---|---|---|
| Paddy soils, China | Agricultural paddy | ELM, PLSR, LS-SVM, Cubist | Vis-NIR spectra, feature selection | SOM prediction | R² = 0.81, RMSE = 5.17 g kg⁻¹ (ELM) | [76] |
| Germany | Croplands (LUCAS dataset) | RF, Neural Networks, Linear regression | Soil properties, environmental data | SOC | Different algorithms prioritised different features (trees: topography; NN: pH) | [78] |
| Germany | Diverse soils | Stacking ensemble vs. gradient boosting | Satellite imagery, soil chemical properties | SOC | Training R² = 0.95 (GBM); test R² higher for stacking | [77] |
| Field conditions, Europe | Various | VNIR-MIR ML (Cubist, SVMs) vs. PLSR | Vis-NIR and mid-IR spectra; EPO calibration transfer | SOC | R² improved 40% with the EPO algorithm for field-collected spectra | [79] |
| Global optimisation | Various | SVR with Ninja Optimisation Algorithm | Spectral bands, soil properties | SOC prediction | RMSE reduced 99.98% vs. untuned SVR | [80] |
| Tropical forests, Australia | Forest soils, diverse | Bayesian-enhanced RF | Rainfall, distance to coast/water, altitude | SOM spatial mapping | R² > 0.74 for key environmental covariates | [81] |
| Agricultural lowland, Italy (Lombardy) | Agricultural soils | RF vs. kriging vs. regression kriging | Soil, climatic, topographic, and RS data | SOC stocks | R² = 0.6 (SVR best); RMSE = 14.9 Mg C ha⁻¹ | [86] |
| Across Australia | Continental scale | RF, multiple algorithms | Environmental and remote sensing variables | SOC stocks | R² > 0.8 (RF); southeastern/southwestern regions highest | [96] |
| Central Vietnamese forests | Mixed forest ecosystems | RF + regression kriging | Topographic wetness index, relative position, slope | SOCD | High accuracy with spatial autocorrelation accounted for | [91] |
| East Hungary (two fields) | Agricultural soils | RF, XGBoost, other ML | Terrain attributes, satellite vegetation indices | SOC content/stock | R² ≈ 0.80 (RF); R² slightly higher for XGBoost + optimisation | [97] |
| Tieling County, China | Agricultural/forest mixed | RF with spatial mapping | NDVI, elevation, land use | SOM | R² = 0.77, RMSE = 2.85 g kg⁻¹ | [98] |
| Urmia Lake region, Iran | Semi-arid mixed | Gradient boosting machine | EVI, sand content, wetness indices | SOC | R² = 0.435, RMSE = 0.23% | [101] |
| Indian Himalayas | Mountainous terrain | XGBoost, RF, SVR | Climatic, topographic, soil, and satellite covariates | SOC at 30 m resolution | R² ~0.60+ (context-dependent) | [102] |
| Peixian County, North China Plain | Riparian agricultural | Gradient boosting decision trees | Precipitation, temperature, distance to settlement/lake | SOC | R² = 0.68 | [103] |
| Hilly terrain | Diverse upland soils | ANN vs. multiple linear regression | Topographic wetness, relative position, slope length | SOM | ANN R² = 0.87 vs. MLR R² = 0.82 | [115] |
| Diverse soils (large dataset) | Various mineral soils | Multilayer perceptron ANN | Routine chemical soil attributes | SOM | Calibration R² = 0.92; validation R² = 0.76 | [114] |
| Bavarian soils, Germany | Diverse cropland | CNN framework | Spectral pre-treatment variants | SOC | R² = 0.64, RMSE = 12.03 g kg⁻¹ | [119] |
| Soil spectral libraries | Laboratory and field | LSTM-CNN hybrid | NIR spectral data | SOM | R² = 0.96, RMSE = 1.66 g kg⁻¹ | [120] |
| Soil spectral libraries (global) | Multi-source fusion | CNN-LSTM-ECA (channel attention) | Spectra, texture, colour information | SOC | R² = 0.92 | [120] |
| Soil images | Diverse soils | Three-branch CNN (spectra + texture + colour) | Spectral bands, image texture, colour features | SOM | R² = 0.87 (23% improvement over single-input) | [121] |
| Multiple regions | Various | Att-BiGRU-RNN (attention mechanism) | Vegetation phenology, environmental data | Soil nutrients (OM, N, P, K) | R² = 0.959 (OM) | [123] |
| MODIS time-series | Regional scale | CNN-LSTM (vegetation phenology) | 10-year MODIS phenology, environmental variables | SOC regional mapping | R² improved vs. traditional RF | [124] |
| Tuscany, Italy | Mixed soils | Deep neural networks | Spectral and environmental data | SOC | R² = 0.26 | [125] |
| Greece | Diverse | Shallow neural networks | Environmental covariates | SOC | Modest; intensive CV required | [112] |
| Transfer learning (Bavaria ↔ Baden-Württemberg, Germany) | Cropland soils | RF transfer models | Environmental covariates | SOC | Reduced accuracy in transferred model; overprediction at high values | [126] |
| Global scale transfer | Multiple regions | Domain adaptation pre-training + fine-tuning | Diverse environmental data | SOC regional | MAE improved ~11% in target region | [127] |
| Estonia | Diverse soils | RF with spatial covariates | Environmental variables + spatial covariates | SOC | R² improved by 0.02 with spatial variables; spatial CV R² ~0.45 vs. random CV R² ~0.66 | [130] |
| Complex terrain, Argentina | Mountainous | Two-point ML (global + local models) | Environmental covariates | SOC | Performance varies with local heterogeneity | [132] |
| Argentina (1982–2017) | Agricultural | Temporal ML (time-series) | NDVI, climate data, temporal records | SOC change detection | Uncertainty high due to uneven temporal distribution | [133] |
| Spectral data analysis | Paddy soils, China (extended) | PLSR, SVM ensemble, Cubist optimisation | Visible-near-infrared spectroscopy | SOM from spectra | Ensemble methods outperformed single algorithms | [76] |
| Spectral indices integration | Various European regions | Multiple regression, PLS regression | Satellite NDVI, NDSI indices + field data | SOM/SOC | R² typically in the 0.3–0.6 range | [65,66,67,68] |
| Classical kriging comparison | Multiple regions | Ordinary kriging, co-kriging | Spatial semivariograms, secondary variables | SOC mapping | Baseline for ML comparison; provides uncertainty estimates | [62,63] |
| Wet chemistry method (Walkley-Black) | Diverse global soils | Laboratory oxidation method | Chemical soil oxidation | SOC quantification | Recovers 70–80% of total organic carbon; site-specific correction needed | [43,44] |
| Dry combustion (Elemental analysis) | Research-grade soils | Combustion analysis | Complete oxidation at 900–1000 °C | Total SOC | Gold standard; R² = 1.0 (by definition) for measured samples | [45,46] |
| Loss-on-Ignition (LOI) | Various mineral soils | Heating to 360–550 °C, mass loss | Volatile matter measurement | Crude SOM estimate | Unreliable for soils with hydrous minerals or carbonates | [47,48,49] |
| Spectroscopic methods (VNIR, MIR, FTIR) | Global soil libraries | Spectroscopic calibration vs. wet chemistry | Infrared absorption/reflectance bonds | SOM from spectra | Requires calibration; MIR shows strong correlation with measured OM | [50,51,52,53,54,55] |
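
The Walkley-Black row in the table reports that wet oxidation recovers 70–80% of total organic carbon and that a site-specific correction is needed. As a minimal sketch of what such a correction amounts to, assuming the recovery fraction for a site is already known, the measured value can be divided by that fraction. The function name `correct_walkley_black` and the 12.0 g/kg reading are hypothetical illustrations, not values from the reviewed studies.

```python
def correct_walkley_black(soc_wb, recovery):
    """Scale a Walkley-Black SOC reading to an estimate of total organic carbon.

    soc_wb   -- SOC measured by wet oxidation (any concentration unit)
    recovery -- assumed fraction of total organic carbon oxidised, 0 < r <= 1
    """
    if not 0.0 < recovery <= 1.0:
        raise ValueError("recovery must be in (0, 1]")
    return soc_wb / recovery


# Hypothetical reading of 12.0 g/kg, bracketed by the 70-80% recovery range
reading = 12.0
low = correct_walkley_black(reading, 0.80)   # smallest correction
high = correct_walkley_black(reading, 0.70)  # largest correction
print(f"corrected SOC between {low:.2f} and {high:.2f} g/kg")
```

The spread between the two bounds shows why a single generic factor can be a poor substitute for a recovery fraction calibrated against dry combustion on the soils in question.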

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Dolatabadian, A.; Kariman, K. Bridging Pedology and Data Science: Machine Learning Applications for Soil Organic Matter and Carbon Analysis. Appl. Sci. 2026, 16, 5412. https://doi.org/10.3390/app16115412

