Tree-Based Modeling for Large-Scale Management in Agriculture: Explaining Organic Matter Content in Soil
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Compilation and Processing
2.2. Explainable Tree-Based Models
2.2.1. Model Evaluation
2.2.2. Assessment Statistics
3. Results
3.1. Comparison of Selected Tree-Based Models
3.2. Feature Overall Importance Analysis
3.3. Feature Partial Dependence Analysis
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Bodria, F.; Giannotti, F.; Guidotti, R.; Naretto, F.; Pedreschi, D.; Rinzivillo, S. Benchmarking and survey of explanation methods for black box models. Data Min. Knowl. Discov. 2023, 37, 1719–1778.
- Ryo, M. Explainable artificial intelligence and interpretable machine learning for agricultural data analysis. Artif. Intell. Agric. 2022, 6, 257–265.
- Miller, T. Explanation in Artificial Intelligence: Insights from the Social Sciences. Artif. Intell. 2019, 267, 1–38.
- Belle, V.; Papantonis, I. Principles and practice of explainable machine learning. Front. Big Data 2021, 4, 688969.
- Saeed, W.; Omlin, C. Explainable AI (XAI): A systematic meta-survey of current challenges and future opportunities. Knowl.-Based Syst. 2023, 263, 110273.
- Wang, M.; Fu, W.J.; He, X.N.; Hao, S.J.; Wu, X.D. A survey on large-scale machine learning. IEEE Trans. Knowl. Data Eng. 2022, 34, 2574–2594.
- Visser, O.; Sippel, S.R.; Thiemann, L. Imprecision farming? Examining the (in)accuracy and risks of digital agriculture. J. Rural Stud. 2021, 86, 623–632.
- Dundon, S.J. Agricultural ethics and multifunctionality are unavoidable. Plant Physiol. 2003, 133, 427–437.
- Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215.
- Finger, R. Digital innovations for sustainable and resilient agricultural systems. Eur. Rev. Agric. Econ. 2023, 50, 1277–1309.
- Hoang, N.T.; Taherzadeh, O.; Ohashi, H.; Yonekura, Y.; Nishijima, S.; Yamabe, M.; Matsui, T.; Matsuda, H.; Moran, D.; Kanemoto, K. Mapping potential conflicts between global agriculture and terrestrial conservation. Proc. Natl. Acad. Sci. USA 2023, 120, e2208376120.
- Chouldechova, A. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data 2017, 5, 153–163.
- Schuett, J. Risk management in the Artificial Intelligence Act. Eur. J. Risk Regul. 2023, 1–19.
- Thomson Reuters. LAWnB IP Exclusive Report: 2023 Domestic and International AI Regulatory and Policy Trends; Thomson Reuters Korea: Seoul, Korea, 2023; pp. 5–8.
- Breiman, L. Statistical modeling: The two cultures. Stat. Sci. 2001, 16, 199–231.
- Guidotti, R.; Monreale, A.; Turini, F.; Pedreschi, D.; Giannotti, F. A survey of methods for explaining black box models. ACM Comput. Surv. 2018, 51, 1–42.
- Adadi, A.; Berrada, M. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access 2018, 6, 52138–52160.
- Barredo Arrieta, A.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; Garcia, S.; Gil-Lopez, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 2020, 58, 82–115.
- Theissler, A.; Spinnato, F.; Schlegel, U.; Guidotti, R. Explainable AI for time series classification: A review, taxonomy and research directions. IEEE Access 2022, 10, 100700–100724.
- Yuan, H.; Yu, H.; Gui, S.; Ji, S. Explainability in graph neural networks: A taxonomic survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 45, 5782–5799.
- Pichler, M.; Hartig, F. Machine learning and deep learning—A review for ecologists. Methods Ecol. Evol. 2023, 14, 994–1016.
- Rudin, C.; Chen, C.; Chen, Z.; Huang, H.; Semenova, L.; Zhong, C. Interpretable machine learning: Fundamental principles and 10 grand challenges. Stat. Surv. 2022, 16, 1–85.
- Antle, J.M.; Basso, B.; Conant, R.T.; Godfray, H.C.J.; Jones, J.W.; Herrero, M.; Howitt, R.E.; Keating, B.A.; Munoz-Carpena, R.; Rosenzweig, C.; et al. Towards a new generation of agricultural system data, models and knowledge products: Design and improvement. Agric. Syst. 2017, 155, 255–268.
- Smith, P.; Davies, C.A.; Ogle, S.; Zanchi, G.; Bellarby, J.; Bird, N.; Boddey, R.M.; McNamara, N.P.; Powlson, D.; Cowie, A.; et al. Towards an integrated global framework to assess the impacts of land use and management change on soil carbon: Current capability and future vision. Glob. Chang. Biol. 2012, 18, 2089–2101.
- Hu, T.; Zhang, X.; Bohrer, G.; Liu, Y.; Zhou, Y.; Martin, J.; Li, Y.; Zhao, K. Crop yield prediction via explainable AI and interpretable machine learning: Dangers of black box models for evaluating climate change impacts on crop yield. Agric. For. Meteorol. 2023, 336, 109458.
- Paustian, K.; Lehmann, J.; Ogle, S.; Reay, D.; Robertson, G.P.; Smith, P. Climate-smart soils. Nature 2016, 532, 49–57.
- Lal, R. Challenges and opportunities in soil organic matter research. Eur. J. Soil Sci. 2009, 60, 158–169.
- Conway, G.R. The properties of agroecosystems. Agric. Syst. 1987, 24, 95–117.
- Spencer, J.E.; Stewart, N.R. The nature of agricultural systems. Ann. Assoc. Am. Geogr. 1973, 63, 529–544.
- NAS. Chemical Data for Soil Test. National Institute of Agricultural Sciences, Rural Development Administration 2023. Available online: www.data.go.kr/data/15073569/openapi.do (accessed on 30 January 2023).
- RDA. Precision Soil Maps. Rural Development Administration 2023. Available online: https://soil.rda.go.kr (accessed on 19 January 2023).
- NSDI. Digital Elevation Model. National Spatial Data Infrastructure 2020. Available online: https://data.nsdi.go.kr/dataset/20001 (accessed on 11 August 2020).
- KMA. Climate Change Scenarios. 2023. Available online: https://www.climate.go.kr/home/CCS/contents_2021/35_download.php (accessed on 20 December 2023).
- NEO. Net Primary Productivity (1 year—TERRA/MODIS). NASA Earth Observations 2023. Available online: https://neo.gsfc.nasa.gov (accessed on 20 February 2023).
- Horn, B.K.P. Hill shading and the reflectance map. Proc. IEEE 1981, 69, 14–47.
- Quinn, P.F.; Beven, K.J.; Lamb, R. The ln(a/tan β) index: How to calculate it and how to use it within the TOPMODEL framework. Hydrol. Process. 1995, 9, 161–182.
- EGIS. Land Cover Maps. Environmental Geographic Information Service 2023. Available online: https://egis.me.go.kr/intro/land.do (accessed on 13 January 2023).
- Statistics Korea. Arable land in Korea. 2021. Available online: https://kosis.kr/statHtml/statHtml.do?orgId=101&tblId=DT_1EB001&conn_path=I2 (accessed on 6 October 2022).
- R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2023.
- Heung, B.; Ho, H.C.; Zhang, J.; Knudby, A.; Bulmer, C.E.; Schmidt, M.G. An overview and comparison of machine-learning techniques for classification purposes in digital soil mapping. Geoderma 2016, 265, 62–77.
- Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 53.
- Lagacherie, P. Digital Soil Mapping: A State of the Art. In Digital Soil Mapping with Limited Data; Hartemink, A.E., McBratney, A., Mendonça-Santos, M.d.L., Eds.; Springer: Dordrecht, The Netherlands, 2008; pp. 3–14.
- Hastie, T.; Tibshirani, R.; Friedman, J. Unsupervised Learning. In The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Hastie, T., Tibshirani, R., Friedman, J., Eds.; Springer: New York, NY, USA, 2009; pp. 485–585.
- Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 4768–4777.
- Zhang, W.; Wu, C.; Zhong, H.; Li, Y.; Wang, L. Prediction of undrained shear strength using extreme gradient boosting and random forest based on Bayesian optimization. Geosci. Front. 2021, 12, 469–477.
- Liu, J.; Li, C.; Ouyang, P.; Liu, J.; Wu, C. Interpreting the prediction results of the tree-based gradient boosting models for financial distress prediction with an explainable machine learning approach. J. Forecast. 2023, 42, 1112–1137.
- Rodríguez-Pérez, R.; Bajorath, J. Interpretation of machine learning models using Shapley values: Application to compound potency and multi-target activity predictions. J. Comput.-Aided Mol. Des. 2020, 34, 1013–1026.
- Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
- Lundy, M.E.; Pittelkow, C.M.; Linquist, B.A.; Liang, X.Q.; van Groenigen, K.J.; Lee, J.; Six, J.; Venterea, R.T.; van Kessel, C. Nitrogen fertilization reduces yield declines following no-till adoption. Field Crop. Res. 2015, 183, 204–210.
- Pittelkow, C.M.; Linquist, B.A.; Lundy, M.E.; Liang, X.Q.; van Groenigen, K.J.; Lee, J.; van Gestel, N.; Six, J.; Venterea, R.T.; van Kessel, C. When does no-till yield more? A global meta-analysis. Field Crop. Res. 2015, 183, 156–168.
- Machmuller, M.B.; Kramer, M.G.; Cyle, T.K.; Hill, N.; Hancock, D.; Thompson, A. Emerging land use practices rapidly increase soil organic matter. Nat. Commun. 2015, 6, 6995.
- Czerwinska, U. Interpretability of Machine Learning Models. In Applied Data Science in Tourism: Interdisciplinary Approaches, Methodologies, and Applications; Egger, R., Ed.; Springer International Publishing: Cham, Switzerland, 2022; pp. 275–303.
- Pittelkow, C.M.; Liang, X.Q.; Linquist, B.A.; van Groenigen, K.J.; Lee, J.; Lundy, M.E.; van Gestel, N.; Six, J.; Venterea, R.T.; van Kessel, C. Productivity limits and potentials of the principles of conservation agriculture. Nature 2015, 517, 365–368.
- Petersen, B.; Snapp, S. What is sustainable intensification? Views from experts. Land Use Policy 2015, 46, 1–10.
- Adamtey, N.; Musyoka, M.W.; Zundel, C.; Cobo, J.G.; Karanja, E.; Fiaboe, K.K.M.; Muriuki, A.; Mucheru-Muna, M.; Vanlauwe, B.; Berset, E.; et al. Productivity, profitability and partial nutrient balance in maize-based conventional and organic farming systems in Kenya. Agric. Ecosyst. Environ. 2016, 235, 61–79.
- Petersen, E.H.; Hoyle, F.C. Estimating the economic value of soil organic carbon for grains cropping systems in Western Australia. Soil Res. 2016, 54, 383–396.
- Mikhailova, E.A.; Groshans, G.R.; Post, C.J.; Schlautman, M.A.; Post, G.C. Valuation of soil organic carbon stocks in the contiguous United States based on the avoided social cost of carbon emissions. Resources 2019, 8, 153.
- Dube, B.; White, A.; Ricketts, T.; Darby, H. Valuation of Soil Health Ecosystem Services; The University of Vermont: Burlington, VT, USA, 2022.
- Hacisalihoglu, S.; Toksoy, D.; Kalca, A. Economic valuation of soil erosion in a semi-arid area in Turkey. Afr. J. Agric. Res. 2010, 5, 1–6.
- Kane, D.A.; Bradford, M.A.; Fuller, E.; Oldfield, E.E.; Wood, S.A. Soil organic matter protects US maize yields and lowers crop insurance payouts under drought. Environ. Res. Lett. 2021, 16, 044018.
- Sparling, G.P.; Wheeler, D.; Vesely, E.T.; Schipper, L.A. What is soil organic matter worth? J. Environ. Qual. 2006, 35, 548–557.
- Fan, F.; Henriksen, C.B.; Porter, J. Valuation of ecosystem services in organic cereal crop production systems with different management practices in relation to organic matter input. Ecosyst. Serv. 2016, 22, 117–127.
- Reyes, J.J.; Elias, E. Spatio-temporal variation of crop loss in the United States from 2001 to 2016. Environ. Res. Lett. 2019, 14, 074017.

Category | Variable (Abbreviation) | Unit | Resolution | Source |
---|---|---|---|---|
Soil | Organic matter (OM) | g kg⁻¹ | Field | [30] |
Soil | Available phosphate (AP) | mg kg⁻¹ | Field | |
Soil | Available silicate (AS) | mg kg⁻¹ | Field | |
Soil | Exchangeable magnesium (Mg) | cmol₊ kg⁻¹ | Field | |
Soil | Exchangeable potassium (K) | cmol₊ kg⁻¹ | Field | |
Soil | Exchangeable calcium (Ca) | cmol₊ kg⁻¹ | Field | |
Soil | pH (1:5 H₂O) | - | Field | |
Soil | Electric conductivity (EC) | dS m⁻¹ | Field | |
Soil map | Topsoil texture (TT) | class | 250 m | [31] |
Soil map | Drainage (DC) | class | 250 m | |
Soil map | Soil order (OR) | group | 250 m | |
Soil map | Soil structure (SS) | class | 250 m | |
Soil map | Parent material (PM) | type | 250 m | |
Soil map | Erosion (EG) | grade | 250 m | |
Terrain | Elevation (DEM) | m | 90 m | [32] |
Terrain | Slope ¹ | radians | 90 m | |
Terrain | Aspect ¹ | radians | 90 m | |
Terrain | Flow direction (flowdir) ¹ | m | 90 m | |
Terrain | Roughness ¹ | m | 90 m | |
Terrain | Hill shade (hill) ² | - | 90 m | |
Terrain | Topographic position index (TPI) ¹ | - | 90 m | |
Terrain | Terrain ruggedness index (TRI) ¹ | - | 90 m | |
Terrain | Upslope contributing area (a) ¹ | - | 90 m | |
Terrain | Topographic wetness index (TWI) ¹ | - | 90 m | |
Climate | Mean annual temperature (TA) | °C | 1 km | [33] |
Climate | Maximum annual temperature (TAMAX) | °C | 1 km | |
Climate | Minimum annual temperature (TAMIN) | °C | 1 km | |
Climate | Mean annual precipitation (RN) | mm | 1 km | |
Climate | Solar irradiation (SI) | MJ m⁻² | 1 km | |
Climate | Relative humidity (RHM) | % | 1 km | |
Climate | Wind speed (WS) | m s⁻¹ | 1 km | |
Vegetation | Net primary productivity (NPP) | g C m⁻² y⁻¹ | 11 km | [34] |

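
The topographic wetness index in the terrain rows is conventionally defined as ln(a / tan β), following the index named in the Quinn et al. reference, where a is the upslope contributing area and β is the local slope in radians. The sketch below shows that calculation for a single grid cell. It is written in Python for illustration only (the reference list points to R as the computing environment), and the function name and the small slope floor for flat cells are assumptions, not details taken from the paper.

```python
import math


def topographic_wetness_index(upslope_area: float, slope_rad: float,
                              min_slope_rad: float = 1e-3) -> float:
    """Return ln(a / tan(beta)) for one grid cell.

    upslope_area  -- upslope contributing area a (must be positive)
    slope_rad     -- local slope beta in radians, as in the terrain rows
    min_slope_rad -- hypothetical floor that avoids division by zero on flat cells
    """
    if upslope_area <= 0:
        raise ValueError("upslope_area must be positive")
    beta = max(slope_rad, min_slope_rad)
    return math.log(upslope_area / math.tan(beta))


# Example: a cell with a = 100 on a 10 % gradient (tan(beta) = 0.1).
twi = topographic_wetness_index(100.0, math.atan(0.1))
print(round(twi, 3))
```

Larger values indicate cells that collect more upslope flow on gentler slopes; how flat cells are handled varies between implementations, so the floor used here should be treated as one possible choice.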

Variable | Units | Mean | Median | s.d. | Min. | Max. |
---|---|---|---|---|---|---|
Organic matter | g kg⁻¹ | 24.2 | 21.9 | 11.1 | 0.6 | 74.0 |
Available phosphate | mg kg⁻¹ | 303.7 | 199.4 | 281.5 | 0.8 | 3589.7 |
Available silicate | mg kg⁻¹ | 191.8 | 175.9 | 153.9 | 0.1 | 1896.0 |
Exchangeable magnesium | cmol₊ kg⁻¹ | 1.82 | 1.60 | 1.08 | 0.04 | 14.56 |
Exchangeable potassium | cmol₊ kg⁻¹ | 0.74 | 0.52 | 0.68 | 0.01 | 8.31 |
Exchangeable calcium | cmol₊ kg⁻¹ | 6.43 | 6.03 | 3.08 | 0.11 | 31.90 |
pH (1:5 H₂O) | - | 6.30 | 6.30 | 0.68 | 4.00 | 9.50 |
Electric conductivity | dS m⁻¹ | 0.81 | 0.56 | 1.03 | 0.01 | 20.00 |
Elevation | m | 71.38 | 35.89 | 81.40 | 0.25 | 671.00 |
Mean annual temperature | °C | 12.97 | 13.26 | 1.028 | 6.12 | 16.02 |
Maximum annual temperature | °C | 18.55 | 18.57 | 0.78 | 11.22 | 20.40 |
Minimum annual temperature | °C | 8.17 | 8.68 | 1.46 | 2.01 | 13.19 |
Mean annual precipitation | mm | 1358.28 | 1269.33 | 215.95 | 934.89 | 2636.66 |
Solar irradiation | MJ m⁻² | 13.70 | 13.71 | 0.53 | 10.62 | 15.78 |
Relative humidity | % | 71.20 | 72.43 | 2.98 | 59.61 | 76.81 |
Wind speed | m s⁻¹ | 1.84 | 1.93 | 0.41 | 0.19 | 5.46 |
Net primary productivity | g C m⁻² y⁻¹ | 184.86 | 188.00 | 10.41 | 138.00 | 255.00 |

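
The summary columns above (mean, median, s.d., minimum, maximum) can be reproduced for any one variable with a few lines of standard-library Python. The sketch below is illustrative only: the sample readings are hypothetical, not the survey data, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the table does not state which estimator was used.

```python
import statistics


def summarize(values):
    """Return the summary columns used in the table for one variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1), assumed
        "min": min(values),
        "max": max(values),
    }


# Hypothetical organic matter readings in g kg^-1, for illustration only.
om_sample = [12.0, 18.5, 21.9, 24.0, 30.1, 38.7]
summary = summarize(om_sample)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A mean well above the median, as the table shows for available phosphate, is the kind of pattern this summary makes visible at a glance.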
Model Parameter | Parameter Grid | Decision Tree | Random Forest | Gradient Boosting |
---|---|---|---|---|
Maximum depth of a tree | [6, 8, 10, 12] | 6 | 12 | 8 |
Minimum samples per leaf | [8, 12, 18] | 12 | 8 | 18 |
Minimum number of samples | [8, 16, 20] | 8 | 8 | 8 |
Number of trees | [10, 100] | - | 100 | 100 |
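
To make the size of this search space concrete, the sketch below enumerates every combination in the parameter grid column. It is a minimal illustration, not the authors' tuning code: the dictionary keys borrow scikit-learn parameter names as an assumption, mapping "Minimum number of samples" to `min_samples_split` is a guess at the intended meaning, and the helper `expand_grid` is hypothetical.

```python
from itertools import product

# Candidate values from the "Parameter Grid" column of the table.
# Key names follow scikit-learn conventions; this mapping is an assumption.
PARAM_GRID = {
    "max_depth": [6, 8, 10, 12],
    "min_samples_leaf": [8, 12, 18],
    "min_samples_split": [8, 16, 20],
    "n_estimators": [10, 100],
}


def expand_grid(grid, skip=()):
    """Yield one dict per parameter combination, omitting keys listed in `skip`."""
    keys = [k for k in grid if k not in skip]
    for combo in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, combo))


# A single decision tree has no tree count, so that key is skipped.
tree_candidates = list(expand_grid(PARAM_GRID, skip=("n_estimators",)))
ensemble_candidates = list(expand_grid(PARAM_GRID))
print(len(tree_candidates), len(ensemble_candidates))
```

Under this reading of the grid there are 36 candidate settings for the decision tree and 72 for each ensemble, and the selected values in the table appear among them. Each candidate would then be scored by whatever validation scheme the study used, which is not reproduced here.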
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lee, W.; Lee, J. Tree-Based Modeling for Large-Scale Management in Agriculture: Explaining Organic Matter Content in Soil. Appl. Sci. 2024, 14, 1811. https://doi.org/10.3390/app14051811