# Laboratory Visible and Near-Infrared Spectroscopy with Genetic Algorithm-Based Partial Least Squares Regression for Assessing the Soil Phosphorus Content of Upland and Lowland Rice Fields in Madagascar

## Abstract

R^{2} = 0.796) with a good prediction ability (residual predictive deviation (RPD) = 2.211). Selected wavebands in the GA-PLS model did not perfectly match the wavelengths of previously known absorption features of soil nutrients, but in most cases, the selected wavebands were within 20 nm of previously known wavelength regions. Bootstrap procedures (N = 10,000) using the selected wavebands also confirmed the improvements in accuracy and robustness of the GA-PLS model compared to those of the ISE-PLS and FS-PLS models. These results suggest that soil oxalate-extractable P can be predicted from Vis-NIR spectroscopy and that GA-PLS regression has the advantage of tuning optimum bands for PLS regression, contributing to a better predictive ability.

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Study Site and Soil Sampling and Chemical Analyses

#### 2.2. Vis-NIR Diffuse Reflectance Measurement

^{2}. A Spectralon (Labsphere Inc., Sutton, NH, USA) reference panel (white reference) was used to optimize the ASD instrument prior to taking Vis-NIR reflectance measurements for each sample.

#### 2.3. Overview of Data Processing

#### 2.4. Preprocessing of Spectral Data

#### 2.5. Standard Full-Spectrum Partial Least Squares (FS-PLS) Regression

$$y = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_i x_i + \varepsilon$$

where x_{1} to x_{i} are surface reflectance or FDR values for spectral bands 1 to i (400, 401, …, 2400 nm), respectively; β_{1} to β_{i} are the estimated weighted regression coefficients; and ε is the error vector. The latent variables were introduced to simplify the relationship between response variables and predictor variables. To determine the optimal number of latent variables (NLV), a LOO-CV was performed to avoid over-fitting of the model and was based on the minimum value of the root mean squared error of cross-validation (RMSECV). The RMSECV was calculated as follows:

$$\mathrm{RMSECV} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - y_p)^2}{n}}$$

where y_{i} and y_{p} represent the respective measured and predicted soil parameters for sample i, and n is the number of samples in the data sets (n = 103).
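The leave-one-out procedure and the RMSECV definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the helper names (`rmse`, `loo_cv_predictions`, `fit_line`, `predict_line`) are hypothetical, and a univariate least-squares line stands in for the PLS model so that the example stays dependency-free.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def loo_cv_predictions(x, y, fit, predict):
    """Leave-one-out CV: refit without sample i, then predict sample i."""
    predictions = []
    for i in range(len(y)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(x_train, y_train)
        predictions.append(predict(model, x[i]))
    return predictions

# Hypothetical stand-in model: a univariate least-squares line (NOT PLS).
def fit_line(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_line(model, xi):
    slope, intercept = model
    return slope * xi + intercept

# Toy data that are exactly linear, so every held-out sample is predicted
# (numerically) perfectly and the RMSECV is close to zero.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
rmsecv = rmse(y, loo_cv_predictions(x, y, fit_line, predict_line))
print(f"RMSECV = {rmsecv:.6f}")
```

To choose the NLV as described, the same loop would be repeated for each candidate number of latent variables and the value giving the smallest RMSECV retained.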

#### 2.6. Iterative Stepwise Elimination Partial Least Squares (ISE-PLS) Regression

_{i}), described as follows:

where s_{i} is the standard deviation and β_{i} is the regression coefficient; both s_{i} and β_{i} correspond to the predictor variable of waveband i.

_{i} is evaluated, and the less informative wavebands are eliminated. Subsequently, the PLS model is re-calibrated with the remaining predictors [43]. The model-building procedure is repeated until the final model is calibrated with the maximum predictive ability.
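One elimination cycle can be illustrated as follows. The importance formula used here, |β_{i}| s_{i} normalized by its sum over all wavebands, is an assumption about the form of the measure built from the two quantities defined above (the regression coefficient and the standard deviation of each waveband); the function names and toy numbers are hypothetical.

```python
def importance(coefficients, std_devs):
    """Assumed importance measure: |b_i| * s_i divided by the sum over all bands."""
    weighted = [abs(b) * s for b, s in zip(coefficients, std_devs)]
    total = sum(weighted)
    return [w / total for w in weighted]

def eliminate_least_informative(bands, coefficients, std_devs, n_drop=1):
    """Drop the n_drop wavebands with the smallest importance; keep band order."""
    z = importance(coefficients, std_devs)
    ranked = sorted(range(len(bands)), key=lambda i: z[i], reverse=True)
    kept = sorted(ranked[: len(bands) - n_drop])
    return [bands[i] for i in kept]

# Toy example with three wavebands (values are illustrative only).
bands = [400, 401, 402]
coefficients = [0.5, -0.1, 2.0]
std_devs = [1.0, 1.0, 0.5]

z = importance(coefficients, std_devs)
remaining = eliminate_least_informative(bands, coefficients, std_devs)
print(z, remaining)
```

In a full ISE-PLS run, the PLS model would be refitted on `remaining` after each cycle and the cycle with the best cross-validated performance kept.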

#### 2.7. Genetic Algorithm Partial Least Squares (GA-PLS) Regression

#### 2.8. Predictive Ability of the PLS Models

R^{2}), RMSECV and the residual predictive deviation (RPD) using a LOO-CV. High R^{2} and low RMSECV values indicate the best model for predicting the soil parameters. The RPD is defined as the ratio of the standard deviation (SD) of the reference data to the RMSECV. For calibration models, an RPD of 3 has been suggested for agricultural applications; RPD values between 2 and 3 indicate a model with good prediction ability; 1.5 < RPD < 2 indicates an intermediate model needing some improvement; and an RPD < 1.5 indicates poor prediction ability.
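As a worked check of the RPD definition, the sketch below recomputes the RPD from the SD of the whole data set (319.10, n = 103) and the RMSECV values reported for the three models. The function names are hypothetical, and the handling of the exact category boundaries (1.5, 2 and 3) is an assumption. Rescaling the sample SD to an n denominator reproduces the reported RPD values to three decimals; that the authors computed it this way is an inference from the numbers, not something stated in the text.

```python
import math

def rpd(sd_reference, rmse_cv):
    """Residual predictive deviation: SD of the reference data divided by RMSECV."""
    return sd_reference / rmse_cv

def classify_rpd(value):
    """Map an RPD value to the categories in the text (boundary handling assumed)."""
    if value >= 3.0:
        return "suggested for agricultural applications"
    if value >= 2.0:
        return "good"
    if value >= 1.5:
        return "intermediate"
    return "poor"

n = 103
sd_sample = 319.10                                   # SD of the whole data set
sd_population = sd_sample * math.sqrt((n - 1) / n)   # assumed n-denominator SD

rmsecv_by_model = {"FS-PLS": 179.146, "ISE-PLS": 152.984, "GA-PLS": 143.625}
for name, rmsecv in rmsecv_by_model.items():
    value = rpd(sd_population, rmsecv)
    print(f"{name}: RPD = {value:.3f} ({classify_rpd(value)})")
```

With the unscaled sample SD, the GA-PLS value would be about 2.22 instead of 2.211, which does not change its category.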

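R^{2} and the root mean squared error of prediction (RMSEP) from 10,000 runs in the test data set. The RMSEP was calculated as follows:

$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - y_p)^2}{n}}$$

where y_{i} and y_{p} are the measured and predicted values for sample i, and n is the number of samples in the test data set.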
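The repeated evaluation on independent training and test data can be sketched as below. Only the split sizes (n = 69 for training, n = 34 for testing) and the number of runs (10,000) are given in this article; drawing each split by shuffling the sample indices is an assumption, the function names are hypothetical, and a straight-line model again stands in for PLS. The demo uses 200 runs to keep it fast.

```python
import math
import random

def rmse(observed, predicted):
    """Root mean squared error between paired observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def fit_line(x, y):
    """Hypothetical stand-in model: univariate least-squares line (NOT PLS)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_line(model, xi):
    slope, intercept = model
    return slope * xi + intercept

def repeated_split_rmsep(x, y, n_train, n_runs, seed=0):
    """Randomly split into training/test sets n_runs times; return all RMSEPs."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    scores = []
    for _ in range(n_runs):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        model = fit_line([x[i] for i in train], [y[i] for i in train])
        predicted = [predict_line(model, x[i]) for i in test]
        scores.append(rmse([y[i] for i in test], predicted))
    return scores

# Toy data: 103 exactly linear samples, split 69/34 as in the study design.
x = [float(i) for i in range(103)]
y = [2.0 * xi + 1.0 for xi in x]
scores = repeated_split_rmsep(x, y, n_train=69, n_runs=200)
mean_rmsep = sum(scores) / len(scores)
print(f"mean RMSEP over {len(scores)} runs = {mean_rmsep:.6f}")
```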

#### 2.9. Assessing Significant Wavelengths

VIP_{k}(a) is the importance of the kth predictor variable based on a model with a factors, W_{ak} is the corresponding loading weight of the kth variable in the ath PLS regression factor, SSY_{a} is the explained sum of squares of y obtained from a PLS regression model with a factors, SSY_{t} is the total sum of squares of y, and m is the total number of predictor variables. A high VIP score (>1) indicates an important x variable (waveband) [47,49].
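The VIP calculation can be illustrated with a small, dependency-free sketch. It uses one common formulation, in which the squared (normalized) loading weights are weighted by the y-variance explained by each factor and divided by the total explained variance; this normalization is an assumption and may differ from the authors' exact expression by a common scaling factor. The function name and the toy weights are hypothetical.

```python
import math

def vip_scores(weights, explained_ssy):
    """VIP scores from PLS loading weights.

    weights[a][k]    : loading weight of variable k in factor a
    explained_ssy[a] : sum of squares of y explained by factor a
    """
    m = len(weights[0])                 # number of predictor variables
    total = sum(explained_ssy)
    scores = []
    for k in range(m):
        acc = 0.0
        for w_a, ssy_a in zip(weights, explained_ssy):
            norm_sq = sum(w * w for w in w_a)   # normalize each weight vector
            acc += ssy_a * (w_a[k] ** 2) / norm_sq
        scores.append(math.sqrt(m * acc / total))
    return scores

# Toy example: two factors, three wavebands (values are illustrative only).
weights = [[0.6, 0.8, 0.0],
           [0.0, 0.6, 0.8]]
explained_ssy = [3.0, 1.0]

vip = vip_scores(weights, explained_ssy)
important = [k for k, v in enumerate(vip) if v > 1.0]
print(vip, important)
```

Under this normalization the squared VIP scores sum to the number of predictors, which is why a threshold of 1 separates above-average from below-average variables.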

## 3. Results and Discussion

#### 3.1. A Wide Range of Soil Oxalate-Extractable P Contents in Upland and Lowland Rice Fields

^{−1} and 30.73–826.64 mg kg^{−1}, respectively. The mean value of the upland data set (588.74 mg kg^{−1}) was significantly higher than that of the lowland data set (319.41 mg P kg^{−1}) (p < 0.001, two-sample t-test). The lower soil oxalate-extractable P in lowland fields is probably due to the lower fertilizer input compared with upland fields in the central highlands of Madagascar. Similarly, the soil TC was significantly higher (p < 0.05) in lowland soils owing to anaerobic conditions, whereas soil clay content did not differ significantly (Figure S1). These results suggest that the physicochemical properties of the upland and lowland soils, which were collected from nearby fields, were not inherently different but have been altered by agricultural practices, i.e., fertilization and flooding.

^{−1} (±319.10 mg kg^{−1}), with a range of 30.73–1225.16 mg kg^{−1} and CV = 65.91%. The SD and range of the sample affect the accuracy of soil property predictions using Vis-NIR spectroscopy [34]. In the present study, the range of soil oxalate-extractable P values was considered sufficiently large to develop the calibration models using PLS regression analyses. Our data set also demonstrated that the oxalate-extractable P content had a good correlation with the total P content in soils [6].

#### 3.2. Soil Spectral Response and Its Correlation to Oxalate-Extractable P in Soil

#### 3.3. Selected Wavebands from ISE-PLS and GA-PLS Models

#### 3.4. Waveband Selection with Cross-Validated Calibration Results

R^{2} and lowest RMSECV values were obtained with the GA-PLS model for estimating the soil oxalate-extractable P content (R^{2} = 0.796 and RMSECV = 143.625). Because the RPD exceeded 2 in both the GA-PLS and ISE-PLS models, these models can be considered to have a good predictive ability.

#### 3.5. Evaluation of Predictive Ability Using Modified Bootstrapping

R^{2} and RMSECV in the training data set (n = 69) and R^{2}, RMSEP and the percent difference of RMSEP (ΔRMSEP) between FS-PLS and ISE-PLS or GA-PLS models in the test data set (n = 34). In addition, Figure 7 shows the distribution of R^{2} values in the test data set. The mean optimum NLV ranged from 5.364 in the GA-PLS model to 7.285 in the FS-PLS model. In the training data set, GA-PLS obtained the best mean R^{2} (0.782) and the lowest mean RMSECV (148.930 mg P kg^{−1}), and ISE-PLS performed better than FS-PLS. Similarly, in the test data set, the GA-PLS model obtained the best mean R^{2} (0.787) and the lowest mean RMSEP (149.013 mg P kg^{−1}) for estimating soil oxalate-extractable P. Relative to the FS-PLS model, the ΔRMSEP indicated greater predictive accuracy for both the ISE-PLS (−16.21%) and GA-PLS (−24.69%) models.
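The ΔRMSEP values follow directly from the mean RMSEP values reported for the test data set. The short check below assumes that ΔRMSEP is the difference from the FS-PLS RMSEP expressed as a percentage of the FS-PLS RMSEP; the function name is hypothetical.

```python
def delta_rmsep(rmsep_model, rmsep_reference):
    """Percent difference in RMSEP relative to a reference model."""
    return (rmsep_model - rmsep_reference) / rmsep_reference * 100.0

# Mean RMSEP values (mg P kg^-1) reported for the test data set.
rmsep_fs, rmsep_ise, rmsep_ga = 197.860, 165.786, 149.013

delta_ise = delta_rmsep(rmsep_ise, rmsep_fs)
delta_ga = delta_rmsep(rmsep_ga, rmsep_fs)
print(f"ISE-PLS: {delta_ise:.2f}%  GA-PLS: {delta_ga:.2f}%")
```

Both results agree with the reported −16.21% and −24.69% after rounding to two decimals.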

## 4. Conclusions

^{−1}) can be rapidly and non-destructively predicted by Vis-NIR spectroscopy for rice fields irrespective of cropping system and geographical location, and that the predictive ability was improved by GA-based waveband selection coupled with PLS regression analysis. GA-based waveband selection in the PLS calibration suggested that the important wavebands for estimating soil oxalate-extractable P made up 4.7% of all 2001 wavebands in the 400–2400 nm range. The selected wavebands differed from previously published absorption peaks of specific materials. However, most were within 20 nm of such peaks and were apparently relevant to the chemical associations of oxalate-extractable P with Al and Fe oxides and organic compounds in soils. Thus, the wavebands selected in our study should be considered informative for estimating soil oxalate-extractable P contents. Based on the selected FDR wavebands, the GA-PLS model provided a good prediction of soil oxalate-extractable P (RPD = 2.211), with errors of 20.4% and 21.3% in cross-validation and in testing on the independent data set, respectively. Such timely P sensing in soils might allow Madagascar’s farmers to implement better fertilizer management.

## Supplementary Materials

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Dogbe, W.; Sogbedji, J.M.; Buah, S.S.J. Site-specific Nutrient Management for Lowland Rice in the Northern Savannah Zones of Ghana. Curr. Agric. Res. J. **2015**, 3, 109–117.
2. Kone, B.; Fofana, M.; Sorho, F.; Diatta, S.; Ogunbayo, A.; Sie, M. Nutrient constraint of rainfed rice production in foot slope soil of Guinea Forest in Côte d’Ivoire. Arch. Agron. Soil Sci. **2014**, 60, 735–746.
3. Koné, B.; Amadji, G.L.; Aliou, S.; Diatta, S.; Akakpo, C. Nutrient constraint and yield potential of rice on upland soil in the south of the Dahoumey gap of West Africa. Arch. Agron. Soil Sci. **2011**, 57, 763–774.
4. Tamburini, F.; Bernasconi, S.M.; Paytan, A. Phosphorus in the environment. In Eos; White, P.J., Hammond, J.P., Eds.; Springer Netherlands: Dordrecht, The Netherlands, 2012; Volume 93, p. 405. ISBN1 978-1-4020-8434-8. ISBN2 978-1-4020-8435-5.
5. Balemi, T.; Negisho, K. Management of soil phosphorus and plant adaptation mechanisms to phosphorus stress for sustainable crop production: A review. J. Soil Sci. Plant Nutr. **2012**, 12, 547–562.
6. Nishigaki, T.; Tsujimoto, Y.; Rinasoa, S.; Rakotoson, T.; Andriamananjara, A.; Razafimbelo, T. Phosphorus uptake of rice plants is affected by phosphorus forms and physicochemical properties of tropical weathered soils. Plant Soil **2018**, 435, 27–38.
7. Wuenscher, R.; Unterfrauner, H.; Peticzka, R.; Zehetner, F. A comparison of 14 soil phosphorus extraction methods applied to 50 agricultural soils from Central Europe. Plant Soil Environ. **2015**, 61, 86–96.
8. Helfenstein, J.; Tamburini, F.; von Sperber, C.; Massey, M.S.; Pistocchi, C.; Chadwick, O.A.; Vitousek, P.M.; Kretzschmar, R.; Frossard, E. Combining spectroscopic and isotopic techniques gives a dynamic view of phosphorus cycling in soil. Nat. Commun. **2018**, 9, 3226.
9. Rabeharisoa, L.; Razanakoto, O.R.; Razafimanantsoa, M.P.; Rakotoson, T.; Amery, F.; Smolders, E. Larger bioavailability of soil phosphorus for irrigated rice compared with rainfed rice in Madagascar: Results from a soil and plant survey. Soil Use Manag. **2012**, 28, 448–456.
10. Sims, J.T.; Sharpley, A.N.; Condron, L.M.; Turner, B.L.; Cade-Menun, B.J. Chemistry and Dynamics of Soil Organic Phosphorus. Phosphorus Agric. Environ. **2005**, 87–121.
11. Viscarra Rossel, R.A.; Walvoort, D.J.J.; McBratney, A.B.; Janik, L.J.; Skjemstad, J.O. Visible, near infrared, mid infrared or combined diffuse reflectance spectroscopy for simultaneous assessment of various soil properties. Geoderma **2006**, 131, 59–75.
12. Miller, C.E. Chemical principles of near-infrared technology. In Near Infrared Technology in the Agricultural and Food Industries; Williams, P.C., Horris, K.H., Eds.; American Association of Cereal Chemists: St. Paul, MN, USA, 2001; pp. 19–37.
13. Cañasveras, J.C.; Barrón, V.; del Campillo, M.C.; Torrent, J.; Gómez, J.A. Estimation of aggregate stability indices in Mediterranean soils by diffuse reflectance spectroscopy. Geoderma **2010**, 158, 78–84.
14. Cañasveras Sánchez, J.C.; Barrón, V.; del Campillo, M.C.; Viscarra Rossel, R.A. Reflectance spectroscopy: A tool for predicting soil properties related to the incidence of Fe chlorosis. Span. J. Agric. Res. **2012**, 10, 10.
15. Viscarra Rossel, R.A.; Behrens, T.; Ben-Dor, E.; Brown, D.J.; Demattê, J.A.M.; Shepherd, K.D.; Shi, Z.; Stenberg, B.; Stevens, A.; Adamchuk, V.; et al. A global spectral library to characterize the world’s soil. Earth-Sci. Rev. **2016**, 155, 198–230.
16. Yang, H.; Kuang, B.; Mouazen, A.M. Quantitative analysis of soil nitrogen and carbon at a farm scale using visible and near infrared spectroscopy coupled with wavelength reduction. Eur. J. Soil Sci. **2012**, 63, 410–420.
17. Vohland, M.; Ludwig, M.; Thiele-Bruhn, S.; Ludwig, B. Determination of soil properties with visible to near- and mid-infrared spectroscopy: Effects of spectral variable selection. Geoderma **2014**, 223, 88–96.
18. Kawamura, K.; Tsujimoto, Y.; Rabenarivo, M.; Asai, H.; Andriamananjara, A.; Rakotoson, T. Vis-NIR spectroscopy and PLS regression with waveband selection for estimating the total C and N of paddy soils in Madagascar. Remote Sens. **2017**, 9, 1081.
19. Bolster, K.L.; Martin, M.E.; Aber, J.D. Determination of carbon fraction and nitrogen concentration in tree foliage by near infrared reflectance: A comparison of statistical methods. Can. J. For. Res. **1996**, 26, 590–600.
20. Kawamura, K.; Watanabe, N.; Sakanoue, S.; Inoue, Y. Estimating forage biomass and quality in a mixed sown pasture based on PLS regression with waveband selection. Grassl. Sci. **2008**, 54, 131–146.
21. Boggia, R.; Forina, M.; Fossa, P.; Mosti, L. Chemometric study and validation strategies in the structure-activity relationships of new cardiotonic agents. Quant. Struct. Relatsh. **1997**, 16, 201–213.
22. Centner, V.; Massart, D.L.; de Noord, O.E.; de Jong, S.; Vandeginste, B.M.; Sterna, C. Elimination of uninformative variables for multivariate calibration. Anal. Chem. **1996**, 68, 3851–3858.
23. Li, H.; Liang, Y.; Xu, Q.; Cao, D. Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration. Anal. Chim. Acta **2009**, 648, 77–84.
24. Nørgaard, L.; Saudland, A.; Wagner, J.; Nielsen, J.P.; Munck, L.; Engelsen, S.B. Interval partial least-squares regression (iPLS): A comparative chemometric study with an example from near-infrared spectroscopy. Appl. Spectrosc. **2000**, 54, 413–419.
25. Jiang, J.H.; James, R.; Siesler, B.H.W.; Ozaki, Y. Wavelength interval selection in multicomponent spectral analysis by moving window partial least-squares regression with applications to mid-infrared and near-infrared spectroscopic data. Anal. Chem. **2002**, 74, 3555–3565.
26. Leardi, R. Application of a genetic algorithm to feature selection under full validation conditions and to outlier detection. J. Chemom. **1994**, 8, 65–79.
27. Leardi, R. Application of genetic algorithm-PLS for feature selection in spectral data sets. J. Chemom. **2000**, 14, 643–655.
28. Leardi, R.; González, A.L. Genetic algorithms applied to feature selection in PLS regression: How and when to use them. Chemom. Intell. Lab. Syst. **1998**, 41, 195–207.
29. Lucasius, C.B.; Kateman, G. Understanding and using genetic algorithms Part 2. Representation, configuration and hybridization. Chemom. Intell. Lab. Syst. **1994**, 25, 99–145.
30. Kawamura, K.; Watanabe, N.; Sakanoue, S.; Lee, H.J.; Inoue, Y.; Odagawa, S. Testing genetic algorithm as a tool to select relevant wavebands from field hyperspectral data for estimating pasture mass and quality in a mixed sown pasture using partial least squares regression. Grassl. Sci. **2010**, 56, 205–216.
31. Kawamura, K.; Watanabe, N.; Sakanoue, S.; Lee, H.J.; Lim, J.; Yoshitoshi, R. Genetic algorithm-based partial least squares regression for estimating legume content in a grass-legume mixture using field hyperspectral measurements. Grassl. Sci. **2013**, 59, 166–172.
32. Bogrekci, I.; Lee, W.S. Spectral soil signatures and sensing phosphorus. Biosyst. Eng. **2005**, 92, 527–533.
33. Maleki, M.R.; Van Holm, L.; Ramon, H.; Merckx, R.; De Baerdemaeker, J.; Mouazen, A.M. Phosphorus Sensing for Fresh Soils using Visible and Near Infrared Spectroscopy. Biosyst. Eng. **2006**, 95, 425–436.
34. Kuang, B.; Mouazen, A.M. Calibration of visible and near infrared spectroscopy for soil analysis at the field scale on three European farms. Eur. J. Soil Sci. **2011**, 62, 629–636.
35. Tsujimoto, Y.; Horie, T.; Randriamihary, H.; Shiraiwa, T.; Homma, K. Soil management: The key factors for higher productivity in the fields utilizing the system of rice intensification (SRI) in the central highland of Madagascar. Agric. Syst. **2009**, 100, 61–71.
36. IUSS Working Group, WRB. World Reference Base for Soil Resources 2014, Update 2015 International Soil Classification System for Naming Soils and Creating Legends for Soil Maps; World Soil Resources Reports No. 106; Food and Agriculture Organization of the United Nations: Rome, Italy, 2015.
37. Soil Survey Staff. Keys to Soil Taxonomy, 12th ed.; USDA-Natural Resources Conservation Service: Washington, DC, USA, 2014.
38. Schwertmann, U. The differentiation of iron oxides in soils by extraction with ammonium oxalate solution. Z. Pflanz. Bodenkd. **1964**, 105, 194–202.
39. van Veldhoven, P.P.; Mannaerts, G.P. Inorganic and organic phosphate measurements in the nanomolar range. Anal. Biochem. **1987**, 161, 45–48.
40. Reeves, J.; McCarty, G.; Mimmo, T. The potential of diffuse reflectance spectroscopy for the determination of carbon inventories in soils. Environ. Pollut. **2002**, 116, S277–S284.
41. Savitzky, A.; Golay, E.J.M. Smoothing and difference of data by simplified least squares procedures. Anal. Chem. **1964**, 36, 1627–1639.
42. Brunet, D.; Barthès, B.G.; Chotte, J.-L.; Feller, C. Determination of carbon and nitrogen contents in Alfisols, Oxisols and Ultisols from Africa and Brazil using NIRS analysis: Effects of sample grinding and set heterogeneity. Geoderma **2007**, 139, 106–117.
43. Forina, M.; Lanteri, S.; Oliveros, M.C.C.; Millan, C.P. Selection of useful predictors in multivariate calibration. Anal. Bioanal. Chem. **2004**, 380, 397–418.
44. Leardi, R.; Boggia, R.; Terrile, M. Genetic Algorithms as a strategy for feature selection. J. Chemom. **1992**, 6, 267–281.
45. Ding, Q.; Small, G.W.; Arnold, M.A. Genetic algorithm-based wavelength selection for the near-infrared determination of glucose in biological matrixes: Initialization strategies and effects of spectral resolution. Anal. Chem. **1998**, 70, 4472–4479.
46. Leardi, R.; Seasholtz, M.B.; Pell, R.J. Variable selection for multivariate calibration using a genetic algorithm: Prediction of additive concentrations in polymer films from Fourier transform-infrared spectral data. Anal. Chim. Acta **2002**, 461, 189–200.
47. Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. **2001**, 58, 109–130.
48. Chong, I.-G.; Jun, C.-H. Performance of some variable selection methods when multicollinearity is present. Chemom. Intell. Lab. Syst. **2005**, 78, 103–112.
49. Li, B.; Liew, O.W.; Asundi, A.K. Pre-visual detection of iron and phosphorus deficiency by transformed reflectance spectra. J. Photochem. Photobiol. B Biol. **2006**, 85, 131–139.
50. Ben-Dor, E. Quantitative remote sensing of soil properties. In Advances in Agronomy; Academic Press: New York, NY, USA, 2002; Volume 75, pp. 173–243. ISBN 9780120007936.
51. Drăguţ, L.; Dornik, A. Land-surface segmentation as a method to create strata for spatial sampling and its potential for digital soil mapping. Int. J. Geogr. Inf. Sci. **2016**, 30, 1359–1376.
52. Scheinost, A.C.; Chavernas, A.; Barrón, V.; Torrent, J. Use and limitations of second-derivative diffuse reflectance spectroscopy in the visible to near-infrared range to identify and quantify Fe oxide minerals in soils. Clays Clay Miner. **1998**, 46, 528–536.
53. Mortimore, J.L.; Marshll, L.-J.R.; Almond, J.M.; Hollins, P.; Matthews, W. Analysis of red and yellow ochre samples from Clearwell Caves and Çatalhöyük by vibrational spectroscopy and other techniques. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. **2004**, 60, 1179–1188.
54. Shonk, J.L.; Gaultney, L.D.; Schulze, D.G.; Van Scoyoc, G.E. Spectroscopic sensing of soil organic-matter content. Trans. ASAE **1991**, 34, 1978–1984.
55. Daniel, K.W.; Tripathi, N.K.; Honda, K. Artificial neural network analysis of laboratory and in situ spectra for the estimation of macronutrients in soils of Lop Buri (Thailand). Aust. J. Soil Res. **2003**, 41, 47–59.
56. Knadel, M.; Viscarra Rossel, R.A.; Deng, F.; Thomsen, A.; Greve, M.H. Visible–Near Infrared Spectra as a Proxy for Topsoil Texture and Glacial Boundaries. Soil Sci. Soc. Am. J. **2013**, 77, 568.
57. Hunt, G.H.; Salisbury, J.W. Visible and Near Infrared Spectra of Minerals and Rocks: XI. Sedimentary Rocks. Mod. Geol. **1976**, 5, 211–217.
58. Katuwal, S.; Knadel, M.; Moldrup, P.; Norgaard, T.; Greve, M.H.; de Jonge, L.W. Visible–Near-Infrared Spectroscopy can predict Mass Transport of Dissolved Chemicals through Intact Soil. Sci. Rep. **2018**, 8, 11188.
59. Ramaroson, V.H.; Becquer, T.; Sá, S.O.; Razafimahatratra, H.; Delarivière, J.L.; Blavet, D.; Vendrame, P.R.S.; Rabeharisoa, L.; Rakotondrazafy, A.F.M. Mineralogical analysis of ferralitic soils in Madagascar using NIR spectroscopy. CATENA **2018**, 168, 102–109.
60. Ben-Dor, E.; Inbar, Y.; Chen, Y. The reflectance spectra of organic matter in the visible near-infrared and short wave infrared region (400–2500 nm) during a controlled decomposition process. Remote Sens. Environ. **1997**, 61, 1–15.
61. Clark, R.N.; King, T.V.V.; Klejwa, M.; Swayze, G.A.; Vergo, N. High spectral resolution reflectance spectroscopy of minerals. J. Geophys. Res. **1990**, 95, 12653–12680.
62. Clark, R.N. Spectroscopy of rocks and minerals, and principles of spectroscopy. In Manual of Remote Sensing; John Wiley and Sons, Inc.: Chichester, UK, 1999; pp. 3–58.
63. Turner, B.L. Organic phosphorus in Madagascan rice soils. Geoderma **2006**, 136, 279–288.
64. Viscarra Rossel, R.A.; Fouad, Y.; Walter, C. Using a digital camera to measure soil organic carbon and iron contents. Biosyst. Eng. **2008**, 100, 149–159.
65. Hunt, G.R. Spectral Signatures of Particulate Minerals in the Visible and Near Infrared. Geophysics **1977**, 42, 501–513.
66. Sherman, D.M.; Waite, D.T. Electronic spectra of Fe3+ oxides and oxide hydroxides in the near IR to near UV. Am. Mineral. **1985**, 70, 1262–1269.
67. Stenberg, B.; Viscarra Rossel, R.A.; Mouazen, A.M.; Wetterlind, J. Visible and Near Infrared Spectroscopy in Soil Science. Adv. Agron. **2010**, 107, 163–215.
68. Fourty, T.; Baret, F.; Jacquemoud, S.; Schmuck, G.; Verdebout, J. Leaf optical properties with explicit description of its biochemical composition: Direct and inverse problems. Remote Sens. Environ. **1996**, 56, 104–117.
69. Bishop, J.L.; Pieters, C.M.; Edwards, J.O. Infrared spectroscopic analyses on the nature of water in montmorillonite. Clays Clay Miner. **1994**, 42, 702–716.
70. Darvishzadeh, R.; Skidmore, A.; Atzberger, C.; van Wieren, S. Estimation of vegetation LAI from hyperspectral reflectance data: Effects of soil type and plant architecture. Int. J. Appl. Earth Obs. Geoinf. **2008**, in press, 358–373.
71. Wang, Z.; Kawamura, K.; Sakuno, Y.; Fan, X.; Gong, Z.; Lim, J. Retrieval of chlorophyll-a and total suspended solids using iterative stepwise elimination partial least squares (ISE-PLS) regression based on field hyperspectral measurements in irrigation ponds in Higashihiroshima, Japan. Remote Sens. **2017**, 9, 264.
72. Holst, J.; Liu, C.; Yao, Z.; Brüggemann, N.; Zheng, X.; Han, X.; Butterbach-Bahl, K. Importance of point sources on regional nitrous oxide fluxes in semi-arid steppe of Inner Mongolia, China. Plant Soil **2007**, 296, 209–226.
73. Leardi, R.; Nørgaard, L. Sequential application of backward interval partial least squares and genetic algorithms for the selection of relevant spectral regions. J. Chemom. **2004**, 18, 486–497.

**Figure 1.** Location of studied regions and soil sampling points. Source in (**a**), (**b**) and (**d**): Esri, DigitalGlobe, GeoEye, Earthstar Geographics, CNES/Airbus DS, USDA, USGS, AeroGRID, IGN, and the GIS User Community. Source in (**c**) and (**e**): The ASTER GDEM version 2 data were downloaded via EarthExplorer (https://earthexplorer.usgs.gov/).

**Figure 3.** Box plot (**a**) and histogram (**b**) of soil oxalate-extractable P in the whole, lowland and upland data sets.

**Figure 4.** Raw reflectance spectra (**a**) and first derivative reflectance (FDR) spectra on a log10 scale (**b**) of the soil samples and their correlation coefficients with soil oxalate-extractable P in each waveband (**c**, **d**).

**Figure 5.** (**a**) Selected wavebands in the ISE-PLS analysis (green bars) and GA-PLS analysis (blue bars in each run) using the FDR data set (n = 103) to estimate oxalate-extractable P contents of upland and lowland soils, with commonly selected wavebands from five GA-PLS runs (red bars), (**b**) regression coefficients in the FS-PLS model, and (**c**) VIP score (>1, grey bars). Specific absorption wavebands for the different bonds present in soil are specified on the top x-axis (modified from [58]).

**Figure 6.** Relationships between observed and predicted values of soil oxalate-extractable P contents using (**a**) FS-PLS, (**b**) ISE-PLS and (**c**) GA-PLS regressions.

**Figure 7.** Comparisons of the frequency distributions of R^{2} values in the test data (n = 34) from (**a**) FS-PLS, (**b**) ISE-PLS and (**c**) GA-PLS using FDR, with mean (red line) ± standard deviation (SD) values.

Parameter | Condition |
---|---|
Population size | 30 chromosomes |
Regression method | PLS |
Response | Cross-validated percent explained variance (five deletion groups; the number of components is determined by cross-validation) |
Maximum number of variables selected for the same chromosome | 30 |
Probability of mutation | 1% |
Maximum number of latent variables | 15 |
Number of runs | 100 |
Window size for smoothing | 3 |
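To make the role of these settings concrete, the following is a deliberately simplified genetic algorithm for waveband selection that uses the population size (30), mutation probability (1%) and cap of 30 selected variables per chromosome from the table above. It is a sketch under stated assumptions, not the GA-PLS algorithm used in the study: the tournament selection, uniform crossover, elitism, number of generations and the toy fitness function are all illustrative choices, and in the study the fitness would instead be the cross-validated percent explained variance of a PLS model.

```python
import random

def run_ga(n_vars, fitness, pop_size=30, p_mut=0.01, max_vars=30,
           n_generations=50, seed=0):
    """Simplified GA for binary variable selection (illustrative only)."""
    rng = random.Random(seed)

    def repair(chrom):
        # Enforce the cap on selected variables by switching off random extras.
        selected = [i for i, gene in enumerate(chrom) if gene]
        for i in rng.sample(selected, max(0, len(selected) - max_vars)):
            chrom[i] = 0
        return chrom

    def random_chromosome():
        # Start with a few randomly selected variables (assumed: 5).
        chrom = [0] * n_vars
        for i in rng.sample(range(n_vars), min(5, n_vars)):
            chrom[i] = 1
        return chrom

    population = [random_chromosome() for _ in range(pop_size)]
    history = []
    for _ in range(n_generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_population = [ranked[0][:]]          # elitism: keep the best as-is
        while len(next_population) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            parent_a = max(rng.sample(ranked, 3), key=fitness)
            parent_b = max(rng.sample(ranked, 3), key=fitness)
            # Uniform crossover followed by bit-flip mutation.
            child = [a if rng.random() < 0.5 else b
                     for a, b in zip(parent_a, parent_b)]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            next_population.append(repair(child))
        population = next_population
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

# Hypothetical toy fitness: reward four "informative" bands, penalize size.
INFORMATIVE = {3, 10, 17, 25}

def toy_fitness(chrom):
    hits = sum(chrom[i] for i in INFORMATIVE)
    return hits - 0.05 * sum(chrom)

best, history = run_ga(n_vars=40, fitness=toy_fitness)
print(sum(best), history[0], history[-1])
```

Because the best chromosome is carried over unchanged, the best fitness never decreases between generations; the remaining table entries (number of runs, maximum number of latent variables, smoothing window) are not modeled in this sketch.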

**Table 2.** Descriptive statistics of soil oxalate-extractable P data. n, number of samples; SD, standard deviation; CV, coefficient of variation (SD/mean × 100%).

Data Set | n | Min (mg kg^{−1}) | Max (mg kg^{−1}) | Median (mg kg^{−1}) | Mean (mg kg^{−1}) | SD (mg kg^{−1}) | CV (%) |
---|---|---|---|---|---|---|---|
Whole | 103 | 30.73 | 1225.16 | 496.84 | 484.15 | 319.10 | 65.91 |
Upland | 63 | 30.78 | 1225.16 | 609.23 | 588.74 | 324.93 | 55.70 |
Lowland | 40 | 30.73 | 826.64 | 245.79 | 319.41 | 223.26 | 69.89 |

**Table 3.** Commonly selected wavebands from five GA-PLS runs to estimate soil oxalate-extractable P using the FDR data set (n = 103) and possible soil components. An empty first cell means that the row belongs to the selected waveband listed above it.

Selected Waveband (nm) | Previously Known Waveband (nm) | Related Soil Component | Reference |
---|---|---|---|
454–457 | 400–700 | organic matter (color) | [11,64] |
 | 470 | Fe^{3+}, ferric oxide | [65] |
506–508, 517, 518 | 488–499 | ferrihydrite | [52] |
 | 495, 510 | hematite | [66] |
660 | 660 | goethite | [67] |
 | 655 | schwertmannite | [52] |
1732 | 1720 | organic matter | [55] |
 | 1726 | aliphatic C-H stretch, cellulose, lignin, starch, pectin, wax, humic acid | [60] |
 | 1730 | protein, cellulose, aliphatic C-H stretch, lignin, starch, pectin, wax, humic acid | [68] |
1847–1849 | 1730–1852 | methyl (C-H) | [61,62] |
1957–1961 | 1950 | sugar, starch, cellulose, lignin, protein | [68] |
 | 1961 | phenolics (C-OH) | [61,62] |
 | 1970 | smectite, shoulder due to absorbed water | [69] |
2105, 2107, 2109 | 2111 | organic matter, cellulose, glucan, pectin | [60] |
2312 | 2300 | C-H stretch fundamentals | [61] |
 | 2307–2469 | methyl | [62] |
 | 2309 | aliphatic C-H, aromatic stretch, humic acid wax, starch | [60] |
 | 2310 | oil | [54] |

**Table 4.** Optimum number of latent variables (NLV), coefficient of determination (R^{2}), root mean squared error of cross-validation (RMSECV), and residual predictive deviation (RPD) from cross-validation of the FS-PLS, ISE-PLS and GA-PLS models on the whole data set (n = 103), with the number of selected wavebands (NW) and its percentage of the full spectrum (NW%). NW% = NW / 2001 bands × 100%.

Regression Method | NLV | R^{2}_{CV} | RMSECV | RPD | NW | NW% |
---|---|---|---|---|---|---|
FS-PLS | 7 | 0.686 | 179.146 | 1.773 | | |
ISE-PLS | 7 | 0.770 | 152.984 | 2.076 | 158 | 7.9 |
GA-PLS | 6 | 0.796 | 143.625 | 2.211 | 94 | 4.7 |

**Table 5.** Mean values of NLV, R^{2} and RMSECV/RMSEP from N = 10,000 evaluations using independent training (n = 69) and test (n = 34) data sets with FS-PLS, ISE-PLS and GA-PLS.

Regression Method | Mean NLV | Training Mean R^{2} | Training Mean RMSECV | Test Mean R^{2} | Test Mean RMSEP | ΔRMSEP ^{1} (%) |
---|---|---|---|---|---|---|
FS-PLS | 7.285 | 0.659 | 188.560 | 0.638 | 197.860 | |
ISE-PLS | 6.419 | 0.751 | 160.180 | 0.742 | 165.786 | −16.21 |
GA-PLS | 5.364 | 0.782 | 148.930 | 0.787 | 149.013 | −24.69 |

^{1} ΔRMSEP, percent difference in the RMSEP relative to FS-PLS.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kawamura, K.; Tsujimoto, Y.; Nishigaki, T.; Andriamananjara, A.; Rabenarivo, M.; Asai, H.; Rakotoson, T.; Razafimbelo, T.
Laboratory Visible and Near-Infrared Spectroscopy with Genetic Algorithm-Based Partial Least Squares Regression for Assessing the Soil Phosphorus Content of Upland and Lowland Rice Fields in Madagascar. *Remote Sens.* **2019**, *11*, 506.
https://doi.org/10.3390/rs11050506
