Potential Model Overfitting in Predicting Soil Carbon Content by Visible and Near-Infrared Spectroscopy
Abstract
1. Introduction
2. Materials and Methods
2.1. Site Description
2.2. Soil Sampling, Total Carbon Analysis and Spectral Measurement
2.3. Spectral Pre-Processing and Reflectance Analysis
2.4. Cross-Validation and Partial Least Squares Regression
3. Results and Discussion
3.1. Soil Total Carbon
3.2. Spectral Pre-Processing
3.3. Effect of Soil TC on Reflectance
3.4. Cross-Validation (CV) and Partial Least Squares Regression
3.4.1. Cross-Validation
3.4.2. PLSR Calibration
4. Conclusions
Supplementary Materials
Acknowledgments
Author Contributions
Conflicts of Interest
Soil order | Depth (cm) | Average | Standard Deviation |
---|---|---|---|
Andisol | 0–5 | 6.4 (12) | 2.42 |
Andisol | 5–20 | 4.4 (12) | 1.65 |
Andisol | 20–40 | 3.0 (12) | 1.29 |
Ultisol | 0–5 | 5.3 (12) | 1.55 |
Ultisol | 5–20 | 4.5 (11) | 1.26 |
Ultisol | 20–40 | 2.7 (11) | 1.78 |
SG Filter | Number of LVs | R² | RMSE | Outliers ID |
---|---|---|---|---|
(5, 1, 0) | 2 | 0.82 | 0.61 | 21, 60 |
(5, 2, 0) | 2 | 0.82 | 0.61 | 21, 60 |
(5, 1, 1) | 2 | 0.58 | 1.22 | 20, 57 |
(5, 2, 2) | 1 | 0.23 | 1.51 | – |
(11, 1, 0) | 2 | 0.82 | 0.61 | 21, 60 |
(11, 2, 0) | 2 | 0.82 | 0.61 | 21, 60 |
(11, 1, 1) | 1 | 0.54 | 1.08 | – |
(11, 2, 2) | 1 | 0.36 | 1.29 | 20 |
(17, 1, 0) | 2 | 0.82 | 0.61 | 21, 60 |
(17, 2, 0) | 2 | 0.82 | 0.61 | 21, 60 |
(17, 1, 1) | 2 | 0.58 | 1.48 | – |
(17, 2, 2) | 1 | 0.26 | 1.59 | 20 |
(17, 1, 0) + Log(1/R) | 2 | 0.79 | 0.66 | 21, 60 |
(17, 1, 0) + centering | 2 | 0.79 | 0.66 | 21, 60 |
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Reyna, L.; Dube, F.; Barrera, J.A.; Zagal, E. Potential Model Overfitting in Predicting Soil Carbon Content by Visible and Near-Infrared Spectroscopy. Appl. Sci. 2017, 7, 708. https://doi.org/10.3390/app7070708