PCA- and PLSR-Based Machine Learning Model for Prediction of Urea-N Content in Heterogeneous Soils Using Near-Infrared Spectroscopy
Abstract
1. Introduction
2. Materials and Methods
2.1. Methodological Overview
- (i) Sample Collection
- (ii) Laboratory and Soil Spectroscopy Dataset
- (iii) Sample Preparation and Urea-N Injection
- (iv) NIR Spectroscopy Analysis
- (v) Data Pre-processing
- (vi) PLSR Modeling and Dataset Partition
- (vii) Prediction
2.2. Soil Sampling and Sample Preparation
- The soil was passed through a 2.0 mm sieve, dried at 85 °C for 48 h, and then weighed.
- Granular urea was ground into powder using a pestle and mortar, then weighed.
- The urea was added to the soil and mixed thoroughly until no lumps remained.
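The weighing steps above imply a small mass-balance calculation, sketched below. The source does not state how the urea-N percentage is defined, so this sketch assumes it is the mass fraction of urea in the final soil and urea mixture (w/w). The function name and the 10 g soil aliquot are hypothetical.

```python
def urea_mass_for_target(soil_mass_g: float, target_percent: float) -> float:
    """Mass of urea powder [g] to add to `soil_mass_g` of dry soil so that urea
    makes up `target_percent` of the total mixture mass (w/w, assumed definition).
    """
    if not 0.0 <= target_percent < 100.0:
        raise ValueError("target_percent must be in [0, 100)")
    w = target_percent / 100.0
    # w = m_urea / (m_urea + m_soil)  =>  m_urea = w * m_soil / (1 - w)
    return w * soil_mass_g / (1.0 - w)


# Example with a hypothetical 10 g aliquot of dried, sieved soil
for pct in (0.0, 2.5, 10.0, 20.0):
    print(f"{pct:5.1f} % -> {urea_mass_for_target(10.0, pct):.3f} g urea")
```

If the percentage were instead defined relative to the soil mass alone, the calculation would reduce to `w * soil_mass_g`; the two definitions diverge noticeably at the higher concentrations.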
2.3. Examination of Soil Composition Properties
- (i) Brescia #1 and Ketotsuchi #1 were acidic soils (pH around 5);
- (ii) Brescia #2 was an alkaline soil (pH around 8);
- (iii) The electrical conductivity of the soils, an indicator of the concentration of soluble salts and ions, varied widely, from about 90 to 2200 μS/cm;
- (iv) The available potassium was much lower in Kiryuzuna #1 (123 mg/kg) than in Brescia #1 (13,900 mg/kg);
- (v) The available nitrogen in Ketotsuchi #1 (19.2 g/kg) was higher than in the other soils;
- (vi) The organic matter content in Cassino #2 and Kiryuzuna #1 (<1.1 g/kg) was lower than in the other soils.
2.4. Examination of Soil Spectral Characteristics
2.5. Measurement System
2.6. IR Spectral Processing and Sample Set Partitioning
- First-derivative spectra with SG smoothing (FD-SG): the Savitzky-Golay (SG) filter is combined with the first derivative of the spectrum, which improves peak detection.
- Second-derivative spectra with SG smoothing (SD-SG): like FD-SG, but the SG filter is applied to the second derivative, which emphasizes fine spectral features.
- PCA: reduces the dimensionality of the problem by projecting the spectra into a lower-dimensional space in which each spectrum is represented as a single point.
- Enhanced peak resolution: peaks become sharper and easier to identify.
- Suppressed baseline variations: noise and baseline drift are minimized.
- Improved subsequent data analysis: FD-SG pre-processed spectra are more amenable to chemometric modeling.
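The pre-processing steps listed above can be illustrated with a from-scratch NumPy sketch that shows why SG derivatives suppress baseline variations and how PCA turns each spectrum into one point. It is an illustration under assumptions, not the authors' code: the window length (11), polynomial order (2), and toy spectra are hypothetical, and the derivative is expressed per sample step rather than per nanometre.

```python
from math import factorial

import numpy as np


def savgol_weights(window: int, polyorder: int, deriv: int) -> np.ndarray:
    """Savitzky-Golay weights for the centre point of an odd-length window."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Column k of the design matrix holds offsets**k (local polynomial basis)
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse is the least-squares estimate of the
    # k-th polynomial coefficient; times k! it is the k-th derivative at 0
    return np.linalg.pinv(design)[deriv] * factorial(deriv)


def sg_derivative(spectra, window=11, polyorder=2, deriv=1) -> np.ndarray:
    """Smoothed derivative of each row; edge points without a full window are dropped."""
    w = savgol_weights(window, polyorder, deriv)
    # np.convolve flips its kernel, so pass the reversed weights
    return np.array([np.convolve(row, w[::-1], mode="valid") for row in spectra])


def pca_scores(X, n_components=2) -> np.ndarray:
    """Project mean-centred rows onto the leading principal components."""
    centred = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(centred, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


# Toy spectra: one peak at four heights, then a constant offset and linear drift
x = np.arange(60, dtype=float)
peak = np.exp(-0.5 * ((x - 30.0) / 4.0) ** 2)
clean = np.vstack([a * peak for a in (0.5, 1.0, 1.5, 2.0)])
drifted = clean + 0.3 + 0.01 * x

fd = sg_derivative(drifted, deriv=1)   # constant offset removed
sd = sg_derivative(drifted, deriv=2)   # offset and linear drift removed
scores = pca_scores(fd, n_components=2)  # one row (point) per spectrum
```

In this sketch the first derivative cancels any constant baseline offset and the second derivative also cancels a linear drift, at the price of amplifying high-frequency noise, which is why the derivative is computed through the smoothing filter instead of by simple differencing.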
2.7. Accuracy Evaluation of the Model
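The figures of merit reported in the result tables (R², RMSE or RMSEP, bias, and RPD) can be computed as in the sketch below. The exact definitions used by the authors are not given in this text, so the ones here are assumptions based on common chemometric practice: R² as one minus the ratio of residual to total sum of squares, bias as the mean of predicted minus reference values, and RPD as the sample standard deviation of the reference values divided by the RMSE. The function name and the five data points are hypothetical.

```python
import numpy as np


def regression_metrics(y_true, y_pred) -> dict:
    """R2, RMSE, bias and RPD under the assumed definitions stated above."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_pred - y_true
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,                       # RMSEP when y_true is a prediction set
        "bias": float(np.mean(residuals)),  # sign convention: predicted - reference
        "rpd": float(np.std(y_true, ddof=1)) / rmse,
    }


# Hypothetical reference and predicted urea-N contents [%]
reference = [0.0, 5.0, 10.0, 15.0, 20.0]
predicted = [1.0, 4.0, 11.0, 14.0, 21.0]
metrics = regression_metrics(reference, predicted)
print(metrics)
```

Under these definitions, a least-squares model with an intercept has zero mean residual on its own calibration data, so calibration bias values on the order of 10⁻¹⁵ are consistent with floating-point round-off and not with a real offset.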
3. Results and Discussion
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Panagos, P.; Montanarella, L.; Barbero, M.; Schneegans, A.; Aguglia, L.; Jones, A. Soil priorities in the European Union. Geoderma Reg. 2022, 29, e00510.
- Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions EU. Soil Strategy for 2020 Reaping the Benefits of Healthy Soils for People, Food, Nature and Climate. Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex:52021DC0699 (accessed on 1 July 2025).
- Bremner, J.M. Determination of nitrogen in soil by the Kjeldahl method. J. Agric. Sci. 1960, 55, 11–33.
- Novamsky, I.; Van Eck, R.; Van Schouwenburg, C.; Walinga, I. Total nitrogen determination in plant material by means of the indophenol-blue method. Neth. J. Agric. Sci. 1974, 22, 3–5.
- Piccone, L.I.; Cabrera, M.L.; Franzluebbers, A.J. A rapid method to estimate potentially mineralizable nitrogen in soil. Soil Sci. Soc. Am. J. 2002, 66, 1843–1847.
- Dudala, S.; Dubey, S.K.; Goel, S. Microfluidic Soil Nutrient Detection System: Integrating Nitrite, pH, and Electrical Conductivity Detection. IEEE Sens. J. 2020, 20, 4504–4511.
- Kundu, A.; Shawon, S.M.R.H.; Kapali, S.; Helal, A.; Ali, K. Hairpin Resonator-Based Microwave Sensor for Detection of Nitrogenous Fertilizers in Soil and Water. IEEE Sens. J. 2024, 24, 27436–27445.
- Estrada-López, J.J.; Castillo-Atoche, A.A.; Vázquez-Castillo, J.; Sánchez-Sinencio, E. Smart Soil Parameters Estimation System Using an Autonomous Wireless Sensor Network with Dynamic Power Management Strategy. IEEE Sens. J. 2018, 18, 8913–8923.
- Chen, H.; Xie, J.; Xu, L.; Feng, Q.; Lin, Q.; Cai, K. Feature Selection for Portable Spectral Sensing Data of Soil Using Broad Learning Network in Fusion with Fuzzy Technique. IEEE Sens. J. 2024, 24, 5644–5653.
- Qaswar, M.; Bustan, D.; Mouazen, A.M. Economic and Environmental Assessment of Variable Rate Nitrogen Application in Potato by Fusion of Online Visible and Near Infrared (Vis-NIR) and Remote Sensing Data. Soil Syst. 2024, 8, 66.
- de Lima, B.C.; Demattê, J.A.M.; dos Santos, C.H.; Tiritan, C.S.; Poppiel, R.R.; Nanni, M.R.; Falcioni, R.; de Oliveira, C.A.; Vedana, N.G.; Zimmermann, G.; et al. The Use of Vis-NIR-SWIR Spectroscopy and X-ray Fluorescence in the Development of Predictive Models: A Step forward in the Quantification of Nitrogen, Total Organic Carbon and Humic Fractions in Ferralsols. Remote Sens. 2024, 16, 3009.
- Zhang, Y.; Li, M.; Zheng, L.; Zhao, Y.; Pei, X. Soil nitrogen content forecasting based on real-time NIR spectroscopy. Comput. Electron. Agric. 2016, 124, 29–36.
- Sinitambirivoutin, M.; Milne, E.; Schiettecatte, L.-S.; Tzamtzis, I.; Dionisio, D.; Henry, M.; Brierley, I.; Salvatore, M.; Bernoux, M. An updated IPCC major soil types map derived from the harmonized world soil database v2.0. CATENA 2024, 244, 108258.
- Tan, B.; You, W.; Tian, S.; Xiao, T.; Wang, M.; Zheng, B.; Luo, L. Soil Nitrogen Content Detection Based on Near-Infrared Spectroscopy. Sensors 2022, 22, 8013.
- Nawar, S.; Buddenbaum, H.; Hill, J.; Kozak, J.; Mouazen, A.M. Estimating the soil clay content and organic matter by means of different calibration methods of vis-NIR diffuse reflectance spectroscopy. Soil Tillage Res. 2016, 155, 510–522.
- Munawar, A.A.; Yunus, Y.; Devianti; Satriyo, P. Calibration models database of near infrared spectroscopy to predict agricultural soil fertility properties. Data Brief 2020, 30, 105469.
- ISO/IEC 17025:2018; General Requirements for the Competence of Testing and Calibration Laboratories. ISO: Geneva, Switzerland, 2018.
- Stenberg, B.; Rossel, R.A.V.; Mouazen, A.M.; Wetterlind, J. Chapter Five—Visible and Near Infrared Spectroscopy in Soil Science. In Advances in Agronomy; Sparks, D.L., Ed.; Academic Press: Cambridge, MA, USA, 2010; Volume 107, pp. 163–215. ISSN 0065-2113.
- Available online: https://webbook.nist.gov/cgi/cbook.cgi?Name=urea&Units=SI (accessed on 1 July 2025).
- Minopoulou, E.; Dessipri, E.; Chryssikos, G.D.; Gionis, V.; Paipetis, A. Panayiotou. Use of NIR for structural characterization of Urea-formaldehyde resins. Int. J. Adhes. Adhes. 2003, 23, 473–484.
- Barra, I.; Haefele, S.M.; Sakrabani, R.; Kebede, F. Soil spectroscopy with the use of chemometrics, machine learning and pre-processing techniques in soil diagnosis: Recent advances–A review. TrAC Trends Anal. Chem. 2021, 135, 116166.
- Guo, P.; Li, T.; Gao, H.; Chen, X.; Cui, Y.; Huang, Y. Evaluating Calibration and Spectral Variable Selection Methods for Predicting Three Soil Nutrients Using Vis-NIR Spectroscopy. Remote Sens. 2021, 13, 4000.
- Vestergaard, R.-J.; Vasava, H.B.; Aspinall, D.; Chen, S.; Gillespie, A.; Adamchuk, V.; Biswas, A. Evaluation of Optimized Preprocessing and Modeling Algorithms for Prediction of Soil Properties Using VIS-NIR Spectroscopy. Sensors 2021, 21, 6745.
- Zhao, A.-X.; Tang, X.-J.; Zhang, Z.-H.; Liu, J.-H. The parameters optimization selection of Savitzky-Golay filter and its application in smoothing pretreatment for FTIR spectra. In Proceedings of the 2014 9th IEEE Conference on Industrial Electronics and Applications, Hangzhou, China, 9–11 June 2014; pp. 516–521.
- Young, K.; Govind, V.; Sharma, K.; Studholme, C.; Maudsley, A.A.; Schuff, N. Multivariate statistical mapping of spectroscopic imaging data. Magn. Reson. Med. 2010, 63, 20–24.
- Galvão, R.K.H.; Araujo, M.C.U.; José, G.E.; Pontes, M.J.C.; Silva, E.C.; Saldanha, T.C.B. A method for calibration and validation subset partitioning. Talanta 2005, 67, 736–740.
- Tian, H.; Zhang, L.; Li, M.; Wang, Y.; Sheng, D.; Liu, J.; Wang, C. Weighted SPXY method for calibration set selection for composition analysis based on near-infrared spectroscopy. Infrared Phys. Technol. 2018, 95, 88–92.
- Metzger, K.; Liebisch, F.; Herrera, J.M.; Guillaume, T.; Bragazza, L. Prediction Accuracy of Soil Chemical Parameters by Field- and Laboratory-Obtained vis-NIR Spectra after External Parameter Orthogonalization. Sensors 2024, 24, 3556.
- Ahmadi, A.; Emami, M.; Daccache, A.; He, L. Soil Properties Prediction for Precision Agriculture Using Visible and Near-Infrared Spectroscopy: A Systematic Review and Meta-Analysis. Agronomy 2021, 11, 433.
- Yin, Z.; Lei, T.; Yan, Q.; Chen, Z.; Dong, Y. A near-infrared reflectance sensor for soil surface moisture measurement. Comput. Electron. Agric. 2013, 99, 101–107.
- Mahmood, H.S.; Bartholomeus, H.M.; Hoogmoed, W.B.; van Henten, E.J. Evaluation and implementation of vis-NIR spectroscopy models to determine workability. Soil Tillage Res. 2013, 134, 172–179.
- Riedel, F.; Denk, M.; Müller, I.; Barth, N.; Gläßer, C. Prediction of soil parameters using the spectral range between 350 and 15,000 nm: A case study based on the Permanent Soil Monitoring Program in Saxony, Germany. Geoderma 2018, 15, 188–198.
- Dhawale, N.M.; Adamchuk, V.I.; Prasher, S.O.; Viscarra Rossel, R.A.; Ismail, A.A.; Kaur, J. Proximal soil sensing of soil texture and organic matter with a prototype portable mid-infrared spectrometer. Eur. J. Soil Sci. 2015, 66, 661–669.
Soil Type | Country | Coordinate or Commercial Product ID | Soil Texture |
---|---|---|---|
Cassino #1 | Sant’Angelo in Theodice, Cassino, South of Italy | 41.45° N, 13.83° E | Clayey and sandy |
Cassino #2 | Caira, Cassino, South of Italy | 41.530° N, 13.81° E | Clayey and sandy |
Brescia #1 | Montichiari, Brescia, North of Italy | 45.41° N, 10.41° E | Sandy loam and alluvial |
Brescia #2 | Mompiano, Brescia, North of Italy | 45.56° N, 10.23° E | Sandy loam and alluvial |
Kiryuzuna #1 | Northern island of Hokkaido, Japan | Bonsaischule Wenddorf DE-KIRYU-02 | Volcanic |
Ketotsuchi #1 | Northern island of Honshu, Japan | Crespi Bonsai A518/02 | Volcanic |
Soil Type | Content of Urea-N [%] | Uncertainty [%] | Number of Samples |
---|---|---|---|
Cassino #1 | 0, 2.5, 5, 7.5, 10, 12.5, 15, 17.5 | 0.17 | 8 |
Cassino #2 | 0, 2.5, 5, 7.5, 10, 12.5, 15, 17.5 | 0.17 | 8 |
Brescia #1 | 0, 2.5, 5, 7.5, 10, 12.5, 15, 17.5, 20 | 0.17 | 9 |
Brescia #2 | 0, 2.5, 5, 7.5, 10, 12.5, 15, 17.5 | 0.17 | 8 |
Kiryuzuna #1 | 0, 2.5, 5, 7.5, 10, 12.5, 15, 17.5 | 0.17 | 8 |
Ketotsuchi #1 | 0, 2.5, 5, 7.5, 10, 12.5, 15, 17.5, 20 | 0.17 | 9 |
Total number of samples | | | 50 |
Soil Type | pH | Electrical Conductivity @ 20 °C [µS/cm] | Available Nitrogen [g/kg] | Available Potassium [mg/kg] | Available Phosphorus [mg/kg] | SOM [g/kg] |
---|---|---|---|---|---|---|
Cassino #1 | 7.5 | 414 | 1.3 | 2790 | 276 | 53 |
Cassino #2 | 7.7 | 852 | 0.1 | 4420 | 869 | <1 |
Brescia #1 | 5.5 | 1710 | 1.1 | 13,900 | 968 | 116 |
Brescia #2 | 7.9 | 1900 | 4.4 | 4740 | 717 | 131 |
Kiryuzuna #1 | 6.0 | 93 | 2.6 | 123 | 218 | <1 |
Ketotsuchi #1 | 5.1 | 2230 | 19.2 | 670 | 340 | >207 |
Soil Attribute | Method | R² (Calibration) | RMSE (Calibration) | Bias (Calibration) |
---|---|---|---|---|
Urea-N [%] | 1300–1650 nm FD and PCA | 0.9 | 1.70 | −4.5 × 10⁻¹⁵ |
Urea-N [%] | 1300–1650 nm SD and PCA | 0.93 | 1.53 | −1.6 × 10⁻¹⁵ |
Soil Attribute | Method | Calibration | |||
---|---|---|---|---|---|
R2 | RMSEP | BIAS | RPD | ||
Urea-N [%] | 1300–1650 nm FD and PCA | 0.77 | 4.36 | −2.9 | 2.06 |
Urea-N [%] | 1300–1650 nm SD and PCA | 0.65 | 4.69 | −1.62 | 1.77 |
Soils | Min Prediction Error [%] | Mean Prediction Error [%] | Max Prediction Error [%] |
---|---|---|---|
Cassino #1 | −5.89 | −1.33 × 10⁻¹⁵ | 3.98 |
Cassino #2 | −7.23 | −1.22 × 10⁻¹⁵ | 2.53 |
Brescia #1 | −6.82 | 1.25 | 5.94 |
Brescia #2 | −2.94 | −1.11 × 10⁻¹⁶ | 2.00 |
Kiryuzuna #1 | −6.05 | 7.77 × 10⁻¹⁶ | 3.79 |
Ketotsuchi #1 | −5.24 | 0.94 | 7.68 |
Study (Reference) | No. of Different Soils | No. of Samples | Contaminant to Detect in Soil | Wavelengths Used | Machine Learning | Prediction Accuracy (RMSEP or RPD) | Online / Offline | Computational Time | Cost |
---|---|---|---|---|---|---|---|---|---|
Yin, Z.; Lei, T.; Yan, Q.; Chen, Z.; Dong, Y. A near-infrared reflectance sensor for soil surface moisture measurement. Comput. Electron. Agric. 2013, 99, 101–107. https://doi.org/10.1016/j.compag.2013.08.029. [30] | 4 | ~52 | Soil moisture | 900–1700 nm | SVR (reported as linear regression) | 4.1% | Online | Low | High |
Dhawale et al. (2015), Proximal soil sensing of soil texture and organic matter with a prototype portable mid-infrared spectrometer, Eur. J. Soil Sci. 66(4), 661–669. https://doi.org/10.1111/ejss.12226 [33] | 4 | 60 (≈48 validation) | Sand, clay and SOM | 2500–4000 nm | PLSR | 10% (sand), 10% (clay), 2.3% (SOM) | Offline | Medium | High |
Munawar A. A. et al. (2020), Calibration models database of near infrared spectroscopy to predict agricultural soil fertility properties, Data Brief 30, 105469. https://doi.org/10.1016/j.dib.2020.105469 [16] | 10 | 40 | N, P, K, pH | 1000–2500 nm | PCR and PLSR | RPD > 2 | Offline | Medium | High |
Tan, B.; You, W.; Tian, S.; Xiao, T.; Wang, M.; Zheng, B.; Luo, L. Soil Nitrogen Content Detection Based on Near-Infrared Spectroscopy. Sensors 2022, 22, 8013. https://doi.org/10.3390/s22208013 [14] | 1 | 43 | N | 900–1670 nm | Random Forest | RMSEP = 0.141 g/kg (~0.0141%) | Offline | High | High |
Nawar et al. (2016), Estimating the soil clay content and organic matter by means of different calibration methods of vis-NIR diffuse reflectance spectroscopy, Soil Tillage Res. 155, 510–522. https://doi.org/10.1016/j.still.2015.07.021 [15] | 5 | 75 | SOM | 350–2500 nm | OSC-PLSR, SNV-PLSR, MSC-PLSR | Best (MSC-PLSR): RMSEP = 4.8% (RPD = 2.1) | Offline | Medium | High |
This article | 6 | 50 | Urea-N | 1100–1700 nm | SG + PCA + PLSR | RMSEP = 4.36%, RPD = 2.06 | Online | Low | Low |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Crescini, D.; Mascialino, G.; Moggia, N.; Piubeni, G.; Serpelloni, M.; Sardini, E. PCA- and PLSR-Based Machine Learning Model for Prediction of Urea-N Content in Heterogeneous Soils Using Near-Infrared Spectroscopy. Sensors 2025, 25, 4176. https://doi.org/10.3390/s25134176