Soil Properties Prediction for Precision Agriculture Using Visible and Near-Infrared Spectroscopy: A Systematic Review and Meta-Analysis

Abstract: Reflectance spectroscopy for soil property prediction is a non-invasive, fast, and cost-effective alternative to the standard laboratory analytical procedures. Soil spectroscopy has been under study for decades now with limited application outside research. The recent advancement in precision agriculture and the need for the spatial assessment of soil properties have raised interest in this technique. The performance of soil spectroscopy differs from one site to another depending on the soil’s physical composition and chemical properties, but it also depends on the instrumentation, mode of use (in-situ/laboratory), spectral range, and data analysis methods used to correlate reflectance data to soil properties. This paper uses the systematic review procedure developed by the Centre for Evidence-Based Conservation (CEBC) for an evidence-based search of soil property prediction using Visible (V) and Near-InfraRed (NIR) reflectance spectroscopy. Constrained by inclusion criteria and defined methods for literature search and data extraction, a meta-analysis is conducted on 115 articles collated from 30 countries. In addition to the soil properties, findings are also categorized and reported by different aspects like date of publication, journals, countries, employed regression methods, laboratory or in-field conditions, spectra preprocessing methods, samples drying methods, spectroscopy devices, wavelengths, number of sites and samples, and data division into calibration and validation sets. The arithmetic means of the coefficient of determination (R²) over all the reports for different properties ranged from 0.68 to 0.87, with better predictions for carbon and nitrogen content and lower performance for silt and clay. After over 30 years of research on using V-NIR spectroscopy to predict soil properties, this systematic review reveals solid evidence from a literature search that this technology can be relied on as a low-cost and fast alternative for standard methods of soil properties prediction with acceptable accuracy.


Introduction
Monitoring the soil status is in great demand in precision agriculture to adjust practices such as tillage, fertilization, and irrigation. A good understanding of the soil characteristics can assist growers with their farming decisions and, more generally, can improve the application of operations, practices, and treatments in soil management [1,2]. However, standard analytical procedures like wet chemistry require specialized equipment and can be extremely time-consuming and expensive, especially when dealing with a high spatial sampling density [3,4]. As an alternative to the standard wet chemistry, soil Visible and Near-InfraRed (V-NIR) reflectance spectroscopy has proven to be a fast, cost-effective, nondestructive, environmentally friendly, repeatable, and reproducible analytical technique [5][6][7]. V-NIR reflectance spectroscopy has now been used for more than 30 years to predict an extensive variety of soil properties such as organic and inorganic carbon [8][9][10][11][12][13], nitrogen [14,15], organic matter [16,17], moisture [18,19], texture [20,21], and salinity [22,23]. Although the results of these studies are encouraging, each relies on a specific dataset and a single or a handful of analysis procedures. Therefore, accumulating their findings systematically and appraising the pool of outcomes can help establish the ability of V-NIR spectroscopy to predict soil properties.
While numerous individual studies have shown the ability of V-NIR spectra to provide reliable information on soil physical, chemical, and biological properties [11][12][13][24], their findings are rarely compiled so that they can be compared, contrasted, and critically appraised. Although some reviews show the ability of reflectance spectroscopy to predict soil physical, chemical, and biological properties [25][26][27][28][29][30], to the best knowledge of the authors, none of them followed the highly structured approach of a "systematic review". This is particularly important because, contrary to ordinary literature reviews, a "systematic review" reduces the risk of bias that arises from excluding or including specific literature that could have a considerable influence on the study outcomes [31]. Another issue with the available reviews is that they mainly focus on a single property; examples are the review of soil organic matter in [28] and the review of soil carbon content in [26]. Moreover, available reviews usually analyze both V-NIR and Mid-InfraRed (MIR) spectroscopy and compare their results [26][27][28]. It is reported that MIR generally produces better predictions than V-NIR [6,27]. However, the performance of MIR spectroscopy is highly influenced by soil moisture content due to the strong water absorption bands in the MIR range, and MIR technology can hardly be employed in portable devices. Its use is therefore mainly limited to laboratory conditions, and it is not suitable for the on-the-go and in-field sensors that are much needed in precision agriculture applications. Another reason that justifies the focus on V-NIR is the lower cost of this technology compared with MIR, which makes it more accessible to both farmers and researchers.
Given the demand for fast and cost-effective soil property monitoring in modern agricultural activities, and the necessity of evidence-based approaches to evaluate the capacity of alternatives to standard procedures, we present here the findings from a systematic review and meta-analysis of the ability of V-NIR reflectance spectroscopy to predict various soil properties. Although miscellaneous soil properties are reported in the literature to be predicted by V-NIR spectroscopy with acceptable accuracy (Soriano-Disla et al., 2014), the focus of this article is on six main properties that play decisive roles in precision agriculture and farming practices: 1. Carbon content (as Total Carbon (TC), Soil Organic Carbon (SOC), Inorganic Carbon (IC)), 2. Nitrogen content (as Total Nitrogen (TN)), 3. Organic matter (as Soil Organic Matter (SOM)), 4. Water or moisture content (as Moisture Content (MC)), 5. Soil salinity (as Soil Salinity Content (SSC)), and 6. Texture (as Sand, Clay, and Silt). Having a fast, cost-effective, and reliable estimation of these properties can result in more efficient farming decisions and practices. Therefore, the research question of this study is: How accurately can V-NIR spectroscopy predict soil carbon, nitrogen, organic matter, moisture, salinity, and texture?

Methodology
The systematic review used in this paper follows the procedure developed by the Centre for Evidence-Based Conservation (CEBC). This includes drafting a protocol to define a literature search followed by data extraction based on a defined set of 'inclusion criteria' [32].
Following this protocol, the research question stated earlier was broken down into PECO (Population, Exposure, Comparator, and Outcome) components as follows: (i) Population: soil samples, (ii) Exposure: Visible and Near-Infrared spectroscopy, (iii) Comparator: standard laboratory methods of measurement like wet chemistry, and (iv) Outcome: predictive accuracy of soil properties through quantitative measures of performance like the coefficient of determination (R²), the Ratio of Performance to Deviation (RPD), and the Root Mean Square Error (RMSE).
Articles reporting original research employing V-NIR spectroscopy to predict at least one of these six selected properties were included: carbon, nitrogen, moisture, texture, salinity, and organic matter. Only articles published in English after 1990 were included in the search process. Both lab-based and in-field measurements were added to the manuscript pool. All articles that did not satisfy these inclusion criteria were eliminated.
Unique PECO keywords were defined, and the same search string was used in four of the most common scientific search engines: Science Direct, Scopus, Web of Science, and Google Scholar. To limit the number of records collected from Google Scholar and to eliminate records that were less related to the research question, only the first fifty relevant hits were stored. In total, 1,314 references were collated in bibliographic management software (Mendeley). After eliminating the duplicated records, the number of remaining records was reduced to 589.
According to the proposed PECO, an article included in the meta-analysis must expose soil samples to V-NIR spectroscopy and compare the results to standard laboratory procedures through quantitative measures of performance. Article screening occurred in two main stages: first by investigating titles and then by reviewing abstracts. In the first stage, the titles of all stored articles were analyzed thoroughly to check their relevance to this systematic review's research question. On the basis of the designed PECO, any article unable to satisfy all the inclusion criteria cited above was excluded. After exclusion by title, the next step was exclusion by abstract using the same PECO inclusion/exclusion criteria. To reduce the likelihood of personal mistakes and/or bias, two independent reviewers did the screening in parallel and double-checked the disagreements.
From the 589 unique articles selected, 326 records were excluded by title and 148 by abstract. The results from the remaining 115 articles were extracted and used to conduct the meta-analysis.
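The screening counts reported above can be checked with simple arithmetic. The short Python sketch below only restates the numbers given in this section; the variable names are illustrative and are not taken from the review protocol.

```python
# Record counts reported in the Methodology section.
collated = 1314              # references gathered from the four search engines
unique = 589                 # records left after removing duplicates
excluded_by_title = 326      # removed in the first screening stage
excluded_by_abstract = 148   # removed in the second screening stage

duplicates_removed = collated - unique
after_title_screening = unique - excluded_by_title
included = after_title_screening - excluded_by_abstract

print(f"duplicates removed:     {duplicates_removed}")
print(f"left after title stage: {after_title_screening}")
print(f"included in analysis:   {included}")
```

The final count agrees with the 115 articles used in the meta-analysis.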

Number of Relevant Papers
The number of V-NIR spectroscopy articles for soil property prediction has noticeably increased in the last decade. The research was conducted in 30 different countries, predominantly in China (32) and the USA (16), followed by Italy (8), Germany (6), and Brazil (6) (Figure S3 in the supplementary data). In the last four years, more research has employed V-NIR technology for soil spectroscopy than at any time before. This may be due to the need to identify soil characteristics in a fast and cost-effective manner, and it also suggests that soil V-NIR spectroscopy is gaining more attention and trust in the research community.
The 115 articles reviewed in this research were published in 54 different peer-reviewed journals, with Geoderma (18), Soil Science Society of America Journal (11), and Journal of Near-Infrared Spectroscopy (8) in the lead (Figure S1 in the supplementary data).

Instrumentation
V-NIR spectroscopy application in soil studies is not restricted to a specific type or brand of instrument (Figure S2 in the supplementary data). However, the ASD FieldSpec (a field-portable spectroradiometer) and the FOSS NIR System (a laboratory-based instrument) appear to be the most popular instruments for this type of application, with the ASD FieldSpec used in 39% of the reported outcomes compared to 21% for the FOSS NIR System. It is worth mentioning that the authors of some of the included articles designed their own instruments [33,34]. Out of the 115 selected articles, only 25 used NIR spectroscopy alone (wavelengths of 700-2500 nm), while the rest used both visible and NIR (wavelengths of 400-2500 nm). In terms of spectral resolution, analysis at a 2 nm interval was the most common, accounting for 34% of the total reported outcomes. This is mainly due to the features of the employed instruments.

Soil Samples Preparation
Soil spectroscopy conducted in the laboratory on dried and sieved samples was reported in 85% of the reviewed articles. In situ measurements on the soil surface represent 11% of the selected articles, while the remaining 4% used on-the-go soil spectroscopy mounted on mobile platforms. Some international protocols exist for sample preparation for reflectance measurements in the laboratory, and some of the reviewed studies employed them [35][36][37], but there are fewer predefined standards for in situ and on-the-go measurements. Most of the studies (almost 74% of the reports) took their soil samples from fewer than 10 sites, and very few (7%) relied on existing samples from soil banks. The number of soil samples analyzed varied from fewer than 50 to more than 700, with 50 to 150 samples being the most common. The number of samples mostly depends on the variety and heterogeneity of the soil, the soil characteristics under study, the experimental design, and the application. Out of the 115 articles, 109 dried their soil samples before analysis, of which 67% used air-drying and the remaining 33% used oven-drying. The oven-drying duration was mostly (56%) up to 24 hours (Table S5 in the supplementary data). In total, 98 articles reported grinding soil samples before analysis, and 79 of them used a 2 mm sieve size. Although drying and grinding were carried out before the spectroscopy analysis, these procedures usually coincided with the requirements of the reference analytical methods (e.g., wet chemistry) and were sometimes influenced by them.

Preprocessing Methods
For the purpose of smoothing the spectral data, a variety of preprocessing methods and mathematical pretreatments are used in the reviewed articles. Savitzky-Golay (24%), first derivative (21%), and absorbance (i.e., log(1/reflectance)) (20%) were the most common methods (Table S4 in the supplementary data). Most articles employed only one preprocessing method; however, some studies used different spectra pretreatments and compared their results [38]. Furthermore, in some studies, multiple methods were used in combination [39,40].
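To make the three most common pretreatments concrete, the following is a minimal sketch in plain Python, not the procedure of any reviewed article. It assumes base-10 logarithms for absorbance, a fixed 5-point Savitzky-Golay smoothing window (the standard quadratic/cubic weights -3, 12, 17, 12, -3 divided by 35), and a central-difference derivative. The function names and the reflectance values are hypothetical.

```python
import math

# Standard 5-point quadratic/cubic Savitzky-Golay smoothing weights.
SG5_WEIGHTS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG5_NORM = 35.0


def to_absorbance(reflectance):
    """Convert reflectance values in (0, 1] to apparent absorbance, log10(1/R)."""
    return [math.log10(1.0 / r) for r in reflectance]


def savgol_smooth_5pt(values):
    """5-point Savitzky-Golay smoothing; the two points at each edge are left unchanged."""
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window)) / SG5_NORM
    return smoothed


def first_derivative(values, step_nm):
    """Central-difference first derivative, with one-sided differences at the edges."""
    n = len(values)
    deriv = [0.0] * n
    for i in range(1, n - 1):
        deriv[i] = (values[i + 1] - values[i - 1]) / (2.0 * step_nm)
    deriv[0] = (values[1] - values[0]) / step_nm
    deriv[-1] = (values[-1] - values[-2]) / step_nm
    return deriv


# Hypothetical reflectance readings at a 2 nm interval (illustrative only).
reflectance = [0.20, 0.21, 0.23, 0.22, 0.25, 0.26, 0.28, 0.27, 0.30]
absorbance = to_absorbance(reflectance)
smoothed = savgol_smooth_5pt(absorbance)
slope = first_derivative(smoothed, step_nm=2.0)
```

Chaining the steps in this order (absorbance, smoothing, derivative) is one possible combination; the reviewed studies report several others, and window length and polynomial order are choices that would normally be tuned.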

Analyzed Soil Properties
Most of the 115 articles included in the meta-analysis reported more than one soil property. Although we restricted our search to only six main properties, up to 81 soil properties were reported in the reviewed articles to be predicted by V-NIR spectroscopy (Table S6 in the supplementary data). Among the properties that this research focused on, SOC (28.2%) and TN (11.5%) have the highest number of reports. These properties are also among the most important to measure for agricultural activities.

Machine Learning Methods
In the reviewed articles, various machine learning regression methods are used to extract soil properties from the V-NIR reflectance spectra. The input of these machine learning models is the reflectance of soil samples at different wavelengths, and the models are trained with the standard laboratory measurements of soil properties. The goodness of the algorithms' predictions is evaluated by comparing their output with the standard measurements through quantitative measures of performance such as the coefficient of determination (R²) and the Root Mean Square Error (RMSE). Although most articles have reported the goodness of their prediction through the coefficient of determination, it is important to note that this indicator may be misleading, since it depends not only on the explained variance but also on the variance of the data set: the more variable the data set, the more easily a high R² may be achieved. PLSR (Partial Least Squares Regression) is by far the most employed machine learning method (62.3%). As almost 70% of the studies have used the Partial Least Squares (PLS) family of methods (i.e., PLSR and MPLSR), it can be inferred that this family has proven suitable for the prediction of soil properties using V-NIR spectroscopy. Beyond the fact that most studies have chosen this method over other regression algorithms (e.g., linear regression), two inherent features can explain the suitability of PLSR for soil spectroscopy:
1. As mentioned earlier, most studies employ instruments measuring soil reflectance in both the visible and near-infrared regions of the electromagnetic spectrum (i.e., wavelengths from 400 to 2500 nm), with a resolution of 2 nm. This leads to hundreds of variables in the matrix of predictors, making it too large to be handled by standard regression models.
2. Similar to other hyperspectral spectroscopy applications, soil V-NIR spectroscopy usually deals with high-resolution input data (mostly 2 nm, as revealed by this systematic review). At this resolution, multicollinearity among input values is expected and must be dealt with. In contrast to standard regression algorithms, PLSR, which shares some similarities with principal components regression, can handle this multicollinearity very well.
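The way PLSR copes with many collinear predictors can be illustrated with a minimal single-response PLS (PLS1) sketch in plain Python. This is a textbook NIPALS-style formulation written for illustration under simplifying assumptions, not the implementation used in any reviewed article; the function names and the toy data are hypothetical. Each component projects the spectra onto the direction that has the largest covariance with the measured property and then removes (deflates) that direction before the next component is extracted.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by successive deflation of X and y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: spectral direction with maximal covariance with y.
        w = [_dot([row[j] for row in Xc], yc) for j in range(p)]
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in Xc]  # sample scores
        tt = _dot(t, t)
        load = [_dot([row[j] for row in Xc], t) / tt for j in range(p)]  # X loadings
        q = _dot(yc, t) / tt  # y loading
        # Deflate X and y so the next component models what is left.
        Xc = [[row[j] - t[i] * load[j] for j in range(p)] for i, row in enumerate(Xc)]
        yc = [v - q * t[i] for i, v in enumerate(yc)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one new spectrum x."""
    x_mean, y_mean, components = model
    xc = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = _dot(xc, w)
        y_hat += q * t
        xc = [v - t * l for v, l in zip(xc, load)]
    return y_hat


# Toy data: four perfectly collinear "bands" driven by one latent factor.
latent = [1.0, 2.0, 3.0, 4.0, 5.0]
profile = [0.5, 1.0, 1.5, 2.0]
X = [[s * b for b in profile] for s in latent]
y = [2.0 * s + 5.0 for s in latent]

model = pls1_fit(X, y, n_components=1)
prediction = pls1_predict(model, [6.0 * b for b in profile])
```

With these perfectly collinear toy bands, ordinary least squares has no unique solution, whereas a single PLS component recovers the underlying linear relation, so `prediction` should be close to 17. In practice the number of components would be chosen by validation, and an established library would normally be preferred over hand-written code.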
According to our meta-analysis, 82% of reports divided their whole datasets into calibration and validation sets, while 18% of reports used a held-out cross-validation approach. Of the reported outcomes from the 115 studies, 24.6% used 61%-65% of their datasets for the calibration set (Figure S4 in the supplementary data). This aligns with the standard practice of machine learning applications [41].
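A calibration/validation split of the kind described above can be sketched as follows. The 65% calibration fraction is taken from the upper end of the most frequently reported range; the random selection, the fixed seed, and the function name are assumptions made for illustration, and the reviewed articles may have used other selection schemes.

```python
import random


def split_calibration_validation(n_samples, calibration_fraction=0.65, seed=0):
    """Return two sorted lists of sample indices: calibration and validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_cal = round(n_samples * calibration_fraction)
    return sorted(indices[:n_cal]), sorted(indices[n_cal:])


# Example: 100 soil samples split into 65 calibration and 35 validation samples.
calibration_idx, validation_idx = split_calibration_validation(100)
```

The model is then fitted on the calibration samples only, and the performance measures are computed on the validation samples.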

Meta-Analysis
To delineate the extracted data's statistical features more precisely, the "violin plot" was used for the meta-analysis instead of the classic box and whiskers plots. In addition to the basic summary statistics of the box plots such as minimum, maximum, interquartile range, median (with a white spot), and mean (with the bold horizontal line), the "violin plot" provides insights on the density shape and distribution of data which facilitates data analysis and exploration [42]. Accordingly, each violin plot illustrates at least nine different statistical values extracted from the reviewed articles' analysis. Detailed values of all the statistical features demonstrated in Figures 7, 8, and 10 are also presented in Tables S1, S2, and S3 in the article's supplementary data, respectively.
The coefficient of determination (R²), the Ratio of Performance to Deviation (RPD), and the Root Mean Square Error (RMSE) are the most common statistical features reported in the selected articles to evaluate the performance of soil property prediction through V-NIR spectroscopy. Therefore, these measures are chosen in this study to report the accuracy of prediction across soil properties (Figure 7) and regression models (Figure 8). While higher R² and RPD values represent better predictions, lower values of RMSE indicate higher accuracy, as RMSE is a measure of error.
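For reference, the three performance measures can be computed as in the sketch below. It assumes commonly used definitions, which individual articles may vary: R² as one minus the ratio of the residual to the total sum of squares, RMSE as the root of the mean squared residual, and RPD as the sample standard deviation of the measured values divided by the RMSE. The function names and example values are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root Mean Square Error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rpd(measured, predicted):
    """Ratio of Performance to Deviation: SD of measured values (n - 1) over RMSE."""
    n = len(measured)
    mean = sum(measured) / n
    sd = math.sqrt(sum((m - mean) ** 2 for m in measured) / (n - 1))
    return sd / rmse(measured, predicted)


# Hypothetical laboratory measurements and spectroscopic predictions.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
scores = (r_squared(measured, predicted), rmse(measured, predicted), rpd(measured, predicted))
```

Because the total sum of squares sits in the denominator of R² and the standard deviation in the numerator of RPD, both measures rise with the variability of the data set even when the RMSE stays the same.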
According to Figure 7, V-NIR spectroscopy has reasonably predicted all ten selected soil properties (Table S1 in the supplementary data). A discrepancy in the reported results has been observed mainly for soil texture, with an average RMSE across the reported studies of 5.31% for clay and 6.05% for sand (Table S1 in the supplementary data). This suggests that, when using V-NIR spectroscopy, predicting soil texture as the percentages of clay, silt, and sand (i.e., soil physical properties) is harder and more uncertain than predicting chemical properties such as carbon and nitrogen content. One reason for this high uncertainty may be the compositional nature of soil texture data [43]. In other words, soil clay, silt, and sand proportions are relative information: they are parts of a whole and should always add up to 100%. However, since machine learning methods estimate the clay, silt, and sand fractions independently, the predicted fractions do not necessarily add up to 100%. To overcome this obstacle, the authors of [44] applied a log-ratio transformation that allowed all three particle size fractions to be modeled simultaneously while meeting the constraint that all size fractions add up to 100%. This method may help reduce the high errors in soil texture prediction using V-NIR spectroscopy.
The Partial Least Squares Regression (PLSR), the Modified Partial Least Squares Regression (MPLSR), and the Support Vector Machine Regression (SVMR) are the most employed regression tools in the reviewed articles to predict soil properties. By analyzing the reported outcomes, we can conclude that the performances of PLSR and MPLSR are similar and comparable, and slightly better than that of SVMR in most cases. In general, MPLSR performed better than PLSR for TC and TN prediction, with R² values of 0.82 and 0.89 for MPLSR compared to 0.78 and 0.79 for PLSR, respectively.
The goodness of prediction for all 10 soil properties is illustrated in the accompanying figure. According to [45], when RPD is greater than 3 and R² is greater than 0.9, the prediction is "excellent". RPD values from 2.5 to 3.0 and R² values from 0.82 to 0.9 denote "good" predictions. "Approximate quantitative predictions" are indicated by RPD values between 2.0 and 2.5 and R² values in the range from 0.66 to 0.81. The possibility "to distinguish between high and low values" is revealed by RPD values between 1.5 and 2 and R² values between 0.5 and 0.65. "Unsuccessful" predictions have RPD values lower than 1.5 or R² values lower than 0.5.
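The interpretation classes of [45] listed above can be expressed as a small lookup, sketched here in Python. Two choices are assumptions of this sketch rather than part of [45]: values falling in the small gaps between the quoted R² ranges (e.g., between 0.81 and 0.82) are assigned using the lower bound of each class, and when R² and RPD point to different classes the lower class is reported. The function names are hypothetical.

```python
CLASSES = [
    "unsuccessful",
    "distinguishes high from low values",
    "approximate quantitative",
    "good",
    "excellent",
]


def _rank_r2(r2):
    """Class index implied by R² alone, using the lower bound of each range."""
    if r2 > 0.90:
        return 4
    if r2 >= 0.82:
        return 3
    if r2 >= 0.66:
        return 2
    if r2 >= 0.50:
        return 1
    return 0


def _rank_rpd(rpd):
    """Class index implied by RPD alone."""
    if rpd > 3.0:
        return 4
    if rpd >= 2.5:
        return 3
    if rpd >= 2.0:
        return 2
    if rpd >= 1.5:
        return 1
    return 0


def classify_prediction(r2, rpd=None):
    """Return the interpretation class; with both metrics, report the lower one."""
    ranks = [_rank_r2(r2)]
    if rpd is not None:
        ranks.append(_rank_rpd(rpd))
    return CLASSES[min(ranks)]


# Example with R² only: the mean R² of 0.75 reported for SOC in this review.
soc_class = classify_prediction(0.75)
```

Applied to the mean R² values reported in this review (0.68 to 0.87), this lookup returns either "approximate quantitative" or "good", depending on the property.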
The classification of the reported outcomes indicates that most of the predictions for all the soil properties are satisfactory (i.e., approximate quantitative predictions, good, and excellent), and the results are rarely unsuccessful. In the corresponding figure, each black circle is an independent report from a reviewed article. A further comparison contrasts the results for Soil Organic Carbon (SOC) predicted in the laboratory with those obtained in situ (in the field). Because some of the reviewed articles named SOC as the target of their analysis and others named SOM, we followed the same terminology and kept the results separate. SOC was the only property with enough data to compare laboratory and in-field analysis. As expected, the results from laboratory analysis are slightly better than those from in situ conditions. Although the difference in performance between the two modes of using V-NIR spectroscopy, as reported by the selected articles, is not sharp, the pools from which those numbers are obtained differ greatly in size, with many more records available for lab-based analysis. This may be the main reason for the greater uncertainty in the results of lab-based studies, indicated by the larger spans and wider distributions of the data points.
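Returning to the compositional constraint on soil texture discussed earlier in this section, the idea behind a log-ratio transformation can be sketched as follows. The exact transformation used in [44] is not specified here, so this example assumes the additive log-ratio (ALR) with clay as the reference fraction; the function names and the texture values are hypothetical. Models are fitted to the two unconstrained log-ratios, and the back-transformation guarantees that the predicted fractions are positive and sum to 100%.

```python
import math


def alr(sand, silt, clay):
    """Map three positive fractions to two unconstrained log-ratios (clay as reference)."""
    return math.log(sand / clay), math.log(silt / clay)


def alr_inverse(z_sand, z_silt):
    """Map two log-ratios back to sand, silt, and clay percentages that sum to 100."""
    e_sand, e_silt = math.exp(z_sand), math.exp(z_silt)
    total = e_sand + e_silt + 1.0
    return 100.0 * e_sand / total, 100.0 * e_silt / total, 100.0 / total


# Hypothetical texture of one sample (percent sand, silt, clay).
z = alr(40.0, 35.0, 25.0)
sand, silt, clay = alr_inverse(*z)

# Any pair of predicted log-ratios maps back to a valid composition.
predicted_texture = alr_inverse(0.3, -1.2)
```

A limitation of this approach is that the fractions must be strictly positive, so samples with a zero fraction would need special handling before the transformation.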

Conclusions
To analyze soil, as the most complicated biomaterial on the planet and the most valuable ecosystem in the world [46], we need rigorous experiments and accurate analyses. Traditional analytical procedures for soil property prediction are time-consuming and expensive, especially when a large number of soil samples are needed. In this study, a systematic review and meta-analysis conducted on papers published in the last 30 years showed clear evidence that V-NIR reflectance spectroscopy could be used as an alternative to the traditional wet chemistry for soil properties prediction. The arithmetic means of the coefficient of determination (R²) over all the analyzed reports for SOC, TN, SOM, TC, Clay, SSC, Sand, MC, IC, and Silt are 0.75, 0.81, 0.73, 0.80, 0.70, 0.76, 0.76, 0.87, 0.79, and 0.68, respectively (Table S1 in the supplementary data). Organic content and total nitrogen are the properties most often analyzed with V-NIR spectroscopy, with promising results. For other parameters, such as soil texture (mainly sand and silt), few records are available, which could suggest poor performance of this technique in predicting those properties. The compositional and relative nature of texture data can be one reason for this. Another reason may be the presence of non-soil mineral material, such as SOM and carbonate, which can result in inaccurate particle size measurements. To overcome this problem, such constituents should be removed from soil samples before conducting V-NIR spectroscopy experiments [6].
Although this systematic review aims to provide an unbiased collation of the available literature on V-NIR soil spectroscopy, it has some inherent methodological limitations. The scientific search engines used in this review have picked mostly peer-reviewed articles. The unintentional exclusion of unpublished and gray literature might have created publication bias, since studies with significant "positive" results are more likely to be published than those with negative outcomes.
As shown in Tables S1, S2, and S3 in the supplementary data, the number of records diverges markedly from one soil property to another. For example, as can be seen in Table S1, there are more than 200 reports on SOC, encompassing all three performance measures (i.e., R², RPD, RMSE), whereas there are only 10 reports on silt, limited to R² values only. In this research, the minimum number of reports in each category included in the meta-analysis was nine, since fewer reports can be misleading and biased. Moreover, we assumed that all the reviewed articles have the same quality, and we avoided weighting their reports. This can be another limitation of this systematic review, since not all the articles have the same level of thoroughness in their experimental design, regression modeling, and even the presentation of their findings. To minimize the effect of this limitation, we excluded gray literature and included peer-reviewed articles only.
PLSR was by far the most common regression method used, while MPLSR performed better mainly for TC and TN. The ASD FieldSpec and the FOSS NIR System are the most popular instruments for this type of research. The most common spectral resolution was 2 nm, and sample sizes above 50 appeared suitable for more robust prediction models.
After over 30 years of using V-NIR spectroscopy to predict soil properties, the findings of this research document the capability of this technology in a systematic way. V-NIR spectroscopy presents a good trade-off between resources and the required accuracy for soil properties prediction. We also expect more conclusive results from this technique in the future, owing to the growing need for soil spatial variability assessment, advances in instrumentation technology, developments in data mining techniques, and the availability of a large global spectral library. More in-field experiments are needed to show the ability of this method to deliver fast and effective field measurements. As shown by this study, there is currently not enough evidence to systematically establish the accuracy of in-field measurements for most of the soil properties. In addition, future studies should concentrate on soil texture to show the potential of V-NIR spectroscopy to replace standard texture analysis, which involves sieving and sedimentation of suspended soil in solution, is time-consuming, and requires laboratory personnel and instrumentation. More broadly, many other frontiers remain to be explored in understanding the soil's physical, chemical, and biological properties. Conducting systematic reviews of the ability of MIR spectroscopy and considering other soil properties are some of the possible future works to fill this gap.
Precision agriculture, as a technology to improve profitability while reducing the impact of agriculture on the environment, essentially deals with variabilities [47,48]. To enable all growers to make the best decisions, especially low-income farmers from underdeveloped and developing regions of the world, fast and cost-effective techniques are required to substitute traditional means of measuring and understanding these variabilities. On the basis of the findings of this research, soil V-NIR spectroscopy can be relied on for this purpose.
Supplementary Materials: The following are available online at www.mdpi.com/2073-4395/11/3/433/s1, Figure S1: Percentage of reviewed articles published in each journal, Figure S2: Percentage of the reviewed articles using different spectroscopy devices, Figure S3: Worldwide distribution of the reviewed studies, Figure S4: Percentage of the data used as the calibration set across the reports, Table S1: The values of the statistical features of the violin plots presented in Figure 7, Table S2: The values of the statistical features of the violin plots presented in Figure 8, Table S3: The values of the statistical features of the violin plots presented in Figure 10, Table S4: The proportion of spectra pre-processing methods reported in the reviewed articles, Table S5: The proportion of drying duration in the reviewed articles, Table S6: The list of 81 properties that were predicted by V-NIR spectroscopy in the reviewed articles, Table S7: Definition of the abbreviations used for soil properties.