Estimating Soil Properties and Nutrients by Visible and Infrared Diffuse Reflectance Spectroscopy to Characterize Vineyards

Abstract: Visible, near, and shortwave infrared (VIS-NIR-SWIR) reflectance spectroscopy, a cost-effective and rapid means of characterizing soils, was used to predict soil sample properties for four vineyards (central and north-western Spain). Sieved and air-dried samples were measured using a portable spectroradiometer (350–2500 nm), and pistol grip (PG) and contact probe (CP) setups were compared. Raw data processed using standard normal variate (SVN) and detrending transformation (DT) were grouped into four subsets (VIS: 350–700 nm; NIR: 701–1000 nm; SWIR: 1001–2500 nm; and full range: 350–2500 nm) in order to identify the most suitable range for determining soil characteristics. The performance of partial least squares regression (PLSR) models in predicting soil properties from reflectance spectra was evaluated by cross-validation. The four spectral subsets and transformed reflectances for each setup were used as PLSR predictor variables. The best performing PLSR models were obtained for pH, electrical conductivity, and phosphorous (R² values above 0.92), while models for sand, nitrogen, and potassium showed moderately good performances (R² values between 0.69 and 0.77). The SWIR subset and SVN + DT processing yielded the best PLSR models for both the PG and CP setups. VIS-NIR-SWIR reflectance spectroscopy shows promise as a technique for characterizing vineyard soils for precision viticulture purposes. Further studies will be carried out to corroborate our findings.


Introduction
Knowledge of soil properties and mapping is regarded as key to decision making in precision viticulture, mainly because of a growing interest in more environmentally friendly and sustainable practices [1]. Chemical and physical characteristics are especially important in evaluating soil fertility and understanding soil dynamics [2]. Modern viticulture requires the evaluation of a wide range of soil properties in a timely and cost-effective way. However, conventional methods for laboratory analysis of soils are expensive, time-consuming, and environmentally unfriendly (they require the use of chemical reagents), and need a whole range of sophisticated protocols and equipment [3]. Soil assessment using visible (VIS), near infrared (NIR), and shortwave infrared (SWIR) spectroscopy, although it cannot replace laboratory chemical analysis, is a fast, cost-effective, environmentally friendly, non-destructive, reproducible, and repeatable analytical technique [4]. It is also easy to use since samples only require minimal preparation, and, furthermore, it requires no chemicals or reagents and so does not generate chemical waste [5]. A single spectrum may contain comprehensive information that can predict various soil components [6]. Spectroscopic applications to the soil include NIR, VIS-NIR, and mid-infrared (MIR) analyses comprising Fourier transform infrared (FTIR), FTIR-attenuated total reflection (FTIR-ATR), and Raman spectroscopy [3].
Spectroscopic techniques are physical characterisation methods that involve studying electromagnetic wave interaction with the material under consideration in the ultraviolet, VIS, and infrared (IR) wavelengths [7]. Furthermore, spectroscopy, when coupled with multivariate data analysis, has been shown to be a powerful tool for developing quantitative and classification models in many disciplines, including food technology [8], petroleum engineering [9], and soil science [10], as described by Barra et al. [3]. VIS-NIR spectroscopy is an empirical method based on an analysis of diffuse reflectance radiation in relation to a material's characteristics and the assumption that the concentration of a given constituent is a linear combination of several absorption features [11].
Ben-Dor [12] described the principles and mechanisms of soil-radiation interactions in relation to quantitative remote sensing of soil properties, noting problematic factors that prevent direct spectral analysis of electromagnetic signals and reviewing studies that describe advances in this quantitative method. The same author previously published research focused on the reflectance spectrum in the VIS-NIR-SWIR regions, together with proposals for practical applications [13]. Stenberg et al. [14] comprehensively reviewed the literature on soil VIS and IR diffuse reflectance spectroscopy (including fundamentals, studied soil properties, conditioning factors, calibrations, field analyses, and practical applications), while Kuang et al. [15] reviewed the sensing concept applied to soil properties (basics and brief theory, factors affecting results, and relationship between sensor output and soil properties).
When electromagnetic radiation is directed to a soil sample, it causes individual molecular bonds to vibrate (they bend or stretch), resulting in a characteristic absorption spectrum [15]. The resulting spectrum has a specific shape dependent on soil composition that can be used for physical and chemical analyses [14]. Soil content in carbon (C), nitrogen (N), water, and clay minerals are properties with direct NIR spectral responses that can be attributed to overtones of OH and overtones and/or combinations of C-H + C-H, C-H + C-C, OH + minerals, and N-H. Moreover, absorption bands in the VIS range (400-700 nm), due to electron excitation, are related to soil colour [15,16]. Numerous studies have used VIS-NIR spectroscopy in an attempt to predict soil content in total and organic C, total N, clay minerals, and water. Other studies have focused specifically on sand and silt content, pH, electrical conductivity (Ec), total content in N, extractable phosphorous (P), extractable potassium (K), extractable calcium (Ca), extractable iron (Fe), extractable sodium (Na), extractable manganese (Mn), extractable magnesium (Mg), and cation exchange capacity (CEC) [16][17][18][19]. However, results for those studies have been typically modest and also highly variable, as they were based on co-variations in constituents that are spectrally active.
The availability of commercial spectroscopic equipment and software packages for multivariate calibration has led to VIS-NIR spectroscopy becoming widely used for soil characterisation purposes. Standards and protocols for reflectance measurements of soils in the laboratory have been proposed by Pimstein et al. [20] and Ben-Dor et al. [21], while Kuang et al. [15] have reviewed several studies of different VIS-NIR reflectance sensors, including laboratory, non-mobile/field (in situ), and mobile/field (online) sensors.
Diffuse reflectance spectra in soil are non-specific, since scatter effects caused by structure result in overlapping absorption features. Therefore, multivariate techniques are required to extract absorption patterns and to correlate spectra with soil properties. Calibration methods for soil applications include linear regression approaches, such as stepwise multiple linear regression (SMLR), principal component regression (PCR), and partial least squares regression (PLSR), and also data mining techniques, such as neural networks (NN), multivariate adaptive regression splines (MARS), boosted regression trees (BRT), random forests (RF), and support vector machines (SVM), along with their combinations [14].
In agriculture, quantitative and qualitative analyses of soil properties yield accurate information to guide the management of soil fertility and productivity through adjusted fertiliser formulations and recommendations [22]. The rapid development of portable and handheld spectrometers allows analyses to be conducted in situ [23]. As a key factor for site-specific management practices, Angelopoulou et al. [24] recently reviewed laboratory and proximal sensing spectroscopy in the VIS, NIR, and SWIR wavelength regions for soil organic matter estimates. MIR spectroscopy and laser diffraction analysis (LDA) have also been demonstrated to be useful for calculating organic matter and clay content in soils [25].
Spectroscopy has previously been applied to viticulture. For vineyards located in Australia, Cozzolino et al. [23] evaluated use of a portable NIR spectrophotometer in the field to predict soil chemical properties, fitting PLSR models with coefficients of determination (R²) that ranged from 0.69 for P to 0.95 for total N content. Muganu et al. [26] demonstrated the great potential of NIR-acoustic optically tuneable filter (AOTF) spectroscopy in assessing grape quality, noting the influence of soil management practices on vine and grape characteristics. Páscoa et al. [27] developed a method for indirect soil differentiation based on grapevine leaf spectra, demonstrating that leaf spectral information can be used to define soil maps for vineyards. For Northern Portugal, Lopo et al. [28] demonstrated the ability of NIR spectroscopy to discriminate between vineyard soil types, showing that water content is not a significant factor in differentiating between soils.
As reported by Marín-González et al. [19], VIS-NIR spectroscopy can be used to detect soil properties using laboratory, in situ, and online measurements. This technique is effective mainly for assessing primary soil properties with direct spectral responses in the VIS-NIR range, e.g., water, C, N, and clay [15], as well as other soil chemical parameters in the laboratory [17]. Few studies, however, have described evaluations of soil properties without direct spectral responses in the VIS-NIR-SWIR range or have compared different approaches to spectral preprocessing and the use of different accessories. Marín-González et al. [19] evaluated models to estimate soil properties without direct spectral responses in the NIR spectroscopy range (CEC, pH, and extractable Ca and Mg), reporting very good accuracy for pH and moderately good accuracy for CEC and Mg. Munnaf et al. [29] explored accuracy improvements to VIS-NIR spectroscopy estimates of secondary soil properties (pH and extractable K, Mg, Ca and Na) by laboratory fusion approaches, finding that exclusively online spectrum or hybrid models (50% online scanned spectra and laboratory spectra) significantly improved online prediction accuracies. Note, however, that those works were based on online spectral measurements obtained by specialist industrial-grade instruments mounted on heavy soil-tilling machinery, and so they are not applicable to multi-year crops such as vineyards.
The objectives of this study were (1) to compare spectral signatures of soils as measured in two setups, using a pistol grip (PG) and fibre optic cable, with light provided by an external illuminator lamp, and using a contact probe (CP), with light provided by an internal halogen bulb; and (2) to assess the ability of linear regression models to calculate soil properties (mainly without direct spectral responses in the VIS-NIR-SWIR range) from preprocessed and non-preprocessed spectral data. Thus, two measurement methods (PG and CP) and two modelling approaches (with and without preprocessing) were applied and compared in order to define a suitable protocol to predict vineyard soil composition by VIS-NIR-SWIR spectroscopy.

Study Area and Soil Sampling
Soils were sampled in four different commercial vineyards belonging to three Designations of Origin (DOs): Bierzo (northwest Spain), Ribera del Duero (north-central Spain), and Rueda (northwest-central Spain). A total of 12 soil samples were collected from each of the vineyards, yielding 48 samples in total. Table 1 summarises the main characteristics of the sampled sites, which were very diverse in terms of soil textures, crops, and landscapes. Geographic coordinates refer to WGS84. The soil classification system is that of the IUSS Working Group WRB [30].
Soil samples were collected in the 0-0.40 m layer between June and August 2015. Soil cores were air-dried and sieved by hand (10-mesh), selecting fractions <2 mm before chemical analyses, which were performed in the Instrumental Techniques Laboratory attached to León University (certified by UNE-EN ISO 9001). The following official analytical measurement methods [31] were used: particle-size distribution of clay, silt, and sand (%) by the pipette method; pH in a 1:2.5 soil/water suspension; Ec (dS m⁻¹) in a 1:5 soil/water suspension; organic matter (%) by the Walkley-Black method; N (%) as total Kjeldahl nitrogen; P (mg kg⁻¹) extracted with 0.5 M NaHCO₃ at pH 8.5 and determined by UV/VIS optical spectrometry; K and Ca (cmol kg⁻¹) extracted with 1 N AcONH₄ at pH 7 and determined by ICP-AES; Mn and Fe (mg kg⁻¹) extracted with DTPA at pH 7.3 and determined by ICP-AES; and CEC (cmol kg⁻¹) measured by extraction with 0.1 M BaCl₂ and ICP-AES analysis.

Spectral Reflectance Acquisition
Soil samples were air-dried and spread in black soil cores (20 × 20 cm). Spectral reflectances were recorded at 1 nm intervals from 350 nm to 2500 nm using an ASD Field-Spec 4 Portable Spectroradiometer (Analytical Spectral Devices, Inc., Boulder, CO, USA). Measurements were made, using a 1.5 m fibre optic cable (25° field-of-view), in two ways: (1) PG setup, with two tungsten halogen lamps supporting the fibre optic; and (2) CP setup, with an internal halogen bulb attached by cable.
Data were collected following spectroradiometer manufacturer recommendations [32]. Spectral measurements corresponded to reflectance calculated as the ratio of reflected soil sample energy to reflected energy of a reference calibration panel, consisting of a white reflectance panel providing a diffuse homogeneous mix of full-source energy at nearly 100%. Recalibration was performed after each measurement of five soil samples.
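As a minimal sketch of the ratio described above, reflectance at each wavelength can be obtained by dividing the signal recorded over the soil sample by the signal recorded over the white reference panel. The function name and the numeric readings below are hypothetical and are not taken from the study or from the instrument software.

```python
def reflectance(sample_signal, reference_signal):
    """Return per-wavelength reflectance as the sample / white-reference ratio."""
    if len(sample_signal) != len(reference_signal):
        raise ValueError("sample and reference must cover the same wavelengths")
    return [s / r for s, r in zip(sample_signal, reference_signal)]


# Hypothetical raw readings at three wavelengths (arbitrary instrument units).
sample = [120.0, 300.0, 450.0]
white_panel = [400.0, 600.0, 900.0]
soil_reflectance = reflectance(sample, white_panel)
```

In practice the reference reading would be refreshed at each recalibration, so that every batch of samples is divided by its most recent white-panel signal.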

PG Setup Measurements
The geometry parameters of measurements (lamp to soil sample and fibre optic to soil sample distances, and the angle between those two distances) were set to ensure homogeneous illumination, with the spot area falling on the sample surface. To ensure a representative spectrum for each soil sample, four reflectance readings were taken (turning the soil core 90° clockwise before each capture), each representing the average of 15 individual measurements.
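The two-level averaging described above (individual measurements averaged within each rotation, then rotations combined into one spectrum per sample) might be sketched as follows. The nested-list layout and the function names are assumptions made for illustration; the source does not specify how the four readings were combined beyond stating that spectra per sample were averaged.

```python
from statistics import fmean


def mean_spectrum(spectra):
    """Average equally long spectra wavelength by wavelength."""
    return [fmean(band) for band in zip(*spectra)]


def sample_spectrum(readings_by_rotation):
    """Combine rotations (each a list of individual scans) into one spectrum.

    Each rotation is first reduced to the mean of its scans; the rotation
    means are then averaged to give a single spectrum for the soil sample.
    """
    per_rotation = [mean_spectrum(scans) for scans in readings_by_rotation]
    return mean_spectrum(per_rotation)
```

For the PG setup the outer list would hold four rotations of 15 scans each; the same helper could be reused for the five CP measurement points.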

CP Setup Measurements
The CP accessory with an internal halogen bulb allowed the fibre optic to be attached at a fixed measurement angle of 35°, reducing noise caused by shadows and other errors associated with stray light [33]. The sensed spot had a diameter of 10 mm, so measurements were made five times at five different points of the samples and then averaged.

Preprocessing
The spectral signatures were preprocessed to identify outliers, and the spectra measured for each sample were averaged. To identify the most suitable range to estimate soil properties, wavelengths were grouped into four spectral subsets: VIS (350-700 nm), NIR (701-1000 nm), SWIR (1001-2500 nm), and full range (350-2500 nm).
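The four spectral subsets can be expressed as inclusive wavelength ranges. The sketch below assumes spectra stored as plain lists sampled at 1 nm from 350 nm to 2500 nm; the dictionary and function names are hypothetical.

```python
# Inclusive wavelength limits (nm) for each subset, as defined in the text.
SUBSETS = {
    "VIS": (350, 700),
    "NIR": (701, 1000),
    "SWIR": (1001, 2500),
    "FULL": (350, 2500),
}


def subset(wavelengths, spectrum, name):
    """Return the reflectance values whose wavelength lies in the named subset."""
    low, high = SUBSETS[name]
    return [r for w, r in zip(wavelengths, spectrum) if low <= w <= high]


wavelengths = list(range(350, 2501))   # 1 nm sampling interval
spectrum = [0.0] * len(wavelengths)    # placeholder reflectance values
band_counts = {name: len(subset(wavelengths, spectrum, name)) for name in SUBSETS}
```

With 1 nm sampling, the subsets contribute 351 (VIS), 300 (NIR), 1500 (SWIR), and 2151 (full range) predictor variables, respectively.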
Standard normal variate (SVN) and detrending transformation (DT) were used for scatter correction following previous studies of soil composition estimation by spectroscopy [23,34]. SVN removes multiplicative interferences of scatter and particle size effects from spectral data by centring and scaling each spectral signature [34]. DT removes nonlinear trends in spectroscopic data by calculating a baseline function as the least squares fit of a polynomial to the sample spectrum [34].
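The two transformations can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the software used in the study: the standard deviation uses the n − 1 denominator, the detrending polynomial is second order (the text does not state the degree), and the wavelength axis is rescaled to [−1, 1] purely for numerical stability. All function names are hypothetical.

```python
from statistics import fmean, stdev


def snv(spectrum):
    """Centre a spectrum on its own mean and scale by its own standard deviation."""
    mean, sd = fmean(spectrum), stdev(spectrum)
    return [(value - mean) / sd for value in spectrum]


def polyfit(x, y, degree):
    """Least squares polynomial coefficients (lowest order first) via normal equations."""
    m = degree + 1
    ata = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    aty = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(m)]
    for col in range(m):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, m), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, m):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, m):
                ata[r][c] -= factor * ata[col][c]
            aty[r] -= factor * aty[col]
    coeffs = [0.0] * m
    for r in reversed(range(m)):  # back substitution
        tail = sum(ata[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (aty[r] - tail) / ata[r][r]
    return coeffs


def detrend(spectrum, degree=2):
    """Subtract a least squares polynomial baseline from a spectrum."""
    n = len(spectrum)
    if n <= degree:
        raise ValueError("spectrum too short for the requested polynomial degree")
    x = [2.0 * i / (n - 1) - 1.0 for i in range(n)]  # rescaled band index
    coeffs = polyfit(x, spectrum, degree)
    baseline = [sum(c * xi ** k for k, c in enumerate(coeffs)) for xi in x]
    return [value - base for value, base in zip(spectrum, baseline)]


def snv_dt(spectrum, degree=2):
    """Apply SVN followed by detrending (the combined SVN + DT treatment)."""
    return detrend(snv(spectrum), degree)
```

Both functions operate on one spectrum at a time, so each sample is corrected independently of the others.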

Soil Property Estimation by PLSR
We used PLSR to estimate soil properties (predicted variables) from spectral signatures (predictor variables), given that (as explained above) diffuse reflectance spectra are correlated with soil properties. Since soil spectra show an overlap of weak overtones and combinations of fundamental vibrational bands, multivariate calibration methods were required to quantitatively determine soil properties [35]. PLSR is a generalisation of linear multiple regression that reduces a large number of collinear variables (e.g., reflectance values) to a few non-correlated hidden (latent) variables or factors (see Geladi and Kowalski [36] and Wold et al. [37] for comprehensive descriptions of PLSR).
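To illustrate how PLSR compresses collinear predictors into a few latent factors, the following sketch implements a single-response (PLS1) model with the classical NIPALS-style deflation scheme. It is a didactic, pure-Python example; the study does not state which algorithm or software was used, and the function names are hypothetical.

```python
from math import sqrt


def pls1_fit(X, y, n_factors):
    """Fit a single-response PLS model; X is a list of spectra, y a list of values."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred predictors
    f = [value - y_mean for value in y]                        # centred response
    weights, loadings, coefs = [], [], []
    for _ in range(n_factors):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain in the response
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate predictors and response before extracting the next factor.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        loadings.append(load)
        coefs.append(q)
    return x_mean, y_mean, weights, loadings, coefs


def pls1_predict(model, x):
    """Predict the response for one spectrum using a fitted PLS1 model."""
    x_mean, y_mean, weights, loadings, coefs = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in zip(weights, loadings, coefs):
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat
```

Each factor is a weighted combination of all wavelengths, so a model with a handful of factors can still draw on the full spectral subset supplied as predictors.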
We fitted several models in order to identify the most suitable procedure. The three reflectance datasets considered were non-preprocessed data and SVN and DT processed data. Additionally, in order to fit simpler and more effective models, an independent model was fitted for each dataset considering the following subsets as independent variables in the PLSR: VIS (350-700 nm), NIR (701-1000 nm), SWIR (1001-2500 nm), and the full range (350-2500 nm).
The resulting models were compared against the requirements for a robust PLSR model: a small number of factors, small errors in leave-one-out cross-validation (CV), and a high R² [38]. Because of the small number of soil samples, we used the leave-one-out CV procedure to validate the regression models. R² and root mean square error (RMSE) values for CV were calculated to test the prediction accuracy of each model, together with standard error (SE) values for CV. The ratio of performance to deviation (RPD), i.e., the standard deviation (SD) to SE ratio, was used to test the usability of the calibrated models [38], with an RPD value of 2 or more considered appropriate for soil analysis by spectroscopy [35]. These statistics were calculated from the predicted values (y), the measured values (z), and the number of samples (n); RPD additionally used the SD of the measured variable. The PLSR factors used in the models were selected on the basis of the lowest RMSE and highest R² [39]: another factor was added to the model if the RMSE was reduced by >2% and the explained variance increased. The maximum number of factors ultimately selected was seven.
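The validation statistics and the factor selection rule can be sketched as below. Because the exact expressions are not reproduced here, the forms used are assumptions based on common chemometric practice: R² as one minus the ratio of residual to total sum of squares, SE as the bias-corrected standard deviation of the residuals, and RPD as the sample SD of the measured values divided by SE. The cap of seven factors mirrors the maximum reported above, and all function names are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev


def rmse(predicted, measured):
    """Root mean square error between predicted (y) and measured (z) values."""
    n = len(measured)
    return sqrt(sum((y - z) ** 2 for y, z in zip(predicted, measured)) / n)


def standard_error(predicted, measured):
    """Bias-corrected standard error of the residuals (assumed definition)."""
    residuals = [y - z for y, z in zip(predicted, measured)]
    bias = fmean(residuals)
    return sqrt(sum((r - bias) ** 2 for r in residuals) / (len(residuals) - 1))


def r_squared(predicted, measured):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    z_mean = fmean(measured)
    ss_res = sum((z - y) ** 2 for y, z in zip(predicted, measured))
    ss_tot = sum((z - z_mean) ** 2 for z in measured)
    return 1.0 - ss_res / ss_tot


def rpd(predicted, measured):
    """Ratio of performance to deviation: SD of measured values over SE."""
    return stdev(measured) / standard_error(predicted, measured)


def select_n_factors(rmse_by_factor, explained_by_factor, max_factors=7, min_gain=0.02):
    """Pick the number of factors; list element 0 corresponds to one factor.

    A further factor is accepted only if it lowers RMSE by more than
    `min_gain` (2%) and raises the explained variance; the search stops
    at the first factor that fails either condition.
    """
    chosen = 1
    limit = min(max_factors, len(rmse_by_factor))
    for k in range(1, limit):
        gain = (rmse_by_factor[k - 1] - rmse_by_factor[k]) / rmse_by_factor[k - 1]
        if gain > min_gain and explained_by_factor[k] > explained_by_factor[k - 1]:
            chosen = k + 1
        else:
            break
    return chosen
```

In a leave-one-out setting, `predicted` would hold the value estimated for each sample by a model fitted to the remaining samples.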

Soil Reflectance Spectra
Soil spectra were mainly dominated by combinations of fundamental vibrational bands for H-C, H-N, and H-O bonds and by weak overtones, especially from the MIR region [35]. The range of reflectance values for the sampled soils and average spectral signatures for the PG and CP setups are shown in Figure 1. As was expected, the spectral signatures derived from the PG and CP setups were similar, while the reflectance values were higher for the CP setup due to its greater illumination intensity. Reflectance is influenced by the physical structure of soil [35]; the size, shape, and arrangement of particles and voids affect the length of the light transmitted through a soil sample, thereby influencing spectral signatures [40,41].
All spectral signatures followed the typical shape in each wavelength region, i.e., low values in VIS that rise in NIR and SWIR, while showing water absorbance features at around 1400 nm, 1900 nm, and 2200 nm. The 1400-1900 nm absorption bands dominated for water (O-H bonds), even though the peak at 1400 nm was associated with aliphatic C-H and the peak at 1900 nm was associated with amide N-H [15,42]. The spectral shape at 2200 nm was associated with groups such as phenolic O-H, amide N-H, amine N-H, and aliphatic C-H [2]. In sum, the three major reflectance peaks identified were caused by absorbances of O-H bonds of hygroscopically bound water, clay lattices, and various oxides [43].
Table 2 shows basic statistics for the chemical and physical properties of the soil samples. Since the soil dataset reflected four different locations with different chemical and physical soil properties, values were very diverse. In fact, the coefficients of variation (CoV) obtained for P, Ca, and Fe were large. Generally, the variability observed in the soil samples for some chemical and physical properties was considered appropriate for spectroscopic calibrations, while the variability of other properties (clay and organic matter) was not great enough to build robust PLSR models. An important issue in chemometric calibration is collinearity in the analytical values conditioning the validity of the results [23]. Pearson correlations (not shown) were calculated for all the soil properties; the greatest correlations were observed for silt with sand (r = 0.95) and for pH with Ec (r = 0.94), while the lowest correlation was found for Fe with K.
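The variability and collinearity checks mentioned above rely on two simple statistics. As a hedged sketch, the coefficient of variation is taken here as the sample SD expressed as a percentage of the mean, and the Pearson correlation is computed from its textbook definition; the function names are hypothetical and no values from Table 2 are used.

```python
from math import sqrt
from statistics import fmean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / fmean(values)


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long series."""
    a_mean, b_mean = fmean(a), fmean(b)
    cov = sum((x - a_mean) * (y - b_mean) for x, y in zip(a, b))
    var_a = sum((x - a_mean) ** 2 for x in a)
    var_b = sum((y - b_mean) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)
```

Properties with a very small coefficient of variation offer little range for calibration, while pairs with |r| close to 1 may be predicted through each other rather than through their own spectral response.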

PLSR Model Predictions
Only variables with R² values above 0.60 were considered for PLSR in this research. PLSR predictions using the full VIS + NIR + SWIR range (350-2500 nm) and non-preprocessed data, summarised in Table 3, served as the reference for the preprocessing results. Broadly speaking, the PLSR calibration results indicated good predictions for pH, Ec, and P, and reasonably good predictions for sand, N, K, and Mn. The CP setup models had higher R² and lower RMSE values for pH, Ec, P, Ca, and Mn, while the PG setup models had higher R² and lower RMSE values for sand, N, and K. Regarding the number of factors, CP setup models required fewer factors than PG setup models. Cozzolino and Morón [2] suggest that calibration models developed for soil composition by spectroscopy can be classified according to RPD as poor (<1.6), acceptable (1.6-2.0), or excellent (>2.0). According to this classification, the fitted PLSR models proved excellent for pH, P, and Ca and acceptable for sand, Ec, N, K, and Mn. Chang et al. [35] suggest that spectroscopic prediction models in the intermediate category could be improved using different calibration strategies. The strategy used in this research was SVN and DT preprocessing to achieve models that reduce errors (RMSE and SE) and number of factors and increase R². Table 4 shows PLSR results for preprocessed reflectance. Regarding the SVN transformation, R² increased for all variables except for P, Ec, and N. PLSR performance improved more for the CP setup models. RMSE decreased except for N and K, while the number of factors was also reduced except for N. Regarding DT preprocessing, R² did not increase except for K. Results were generally better for the PG setup models. RMSE values were maintained or increased except for P and Mn, while the number of factors was reduced except for N. The reduction in the number of factors was less for PG setup models.
For SVN preprocessing, R² values increased except for Ec and pH, which remained constant, while RMSE values decreased. Results were better for the models based on the CP setup, while the number of factors was also reduced, with the exception of the PG models estimating Mn (+2 factors) and the CP setup models estimating N and P (+1 factor). For DT preprocessing, although not significantly greater, R² and RMSE values were better for the CP setup than the PG setup. The main improvement was the simplification of the models in reducing the number of factors. Finally, applying SVN + DT, R² values increased and RMSE values decreased, while the number of factors was reduced, with the exception of N (7 factors).
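The RPD classes attributed above to Cozzolino and Morón [2] translate directly into a small helper. The function name is hypothetical, and treating the boundary values 1.6 and 2.0 as "acceptable" follows the ranges as quoted in the text.

```python
def classify_rpd(rpd_value):
    """Classify a calibration model as poor, acceptable, or excellent by RPD."""
    if rpd_value < 1.6:
        return "poor"
    if rpd_value <= 2.0:
        return "acceptable"
    return "excellent"
```

For example, `classify_rpd(2.69)` returns "excellent", while `classify_rpd(1.8)` returns "acceptable".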

Soil Property Predictions
Cross-validation results for the PLSR models were different for the three particle-size distributions (clay, silt, and sand). For sand, results were satisfactory (R² = 0.75 and R² = 0.70 for the PG and CP data, respectively), and also corroborated other published results [17,39,44]. For clay, however, results were quite poor (R² = 0.53 and R² = 0.51 for the PG and CP data, respectively), and likewise for silt (R² = 0.51 and R² = 0.49 for the PG and CP data, respectively). Those unexpectedly low R² values may be due to narrow variability in clay content (min = 10 and max = 32) and silt content (min = 14 and max = 42) of the analysed soils.
Soil content in N was estimated by spectroscopy because it is quite sensitive to IR radiation. The R² values obtained were 0.68 (RMSE = 0.017%) and 0.62 (RMSE = 0.018%) for the PG and CP setups, respectively, lower than the values of R² = 0.80-0.98 cited elsewhere [45] and the R² = 0.92 (SE = 2.19) obtained by Cozzolino et al. [23]. Our result can be explained by the fact that N estimation by spectroscopy is soil-dependent, due mainly to varying carbonate contents [46]. While MIR-ATR spectroscopy can, in fact, predict nitrate concentration in soil pastes by direct measurement, prediction accuracy is strongly conditioned by water and soil constituents [47][48][49].
Previous research has reported the ability of soil reflectance spectroscopy to accurately determine soil organic matter [24]. However, our results for organic matter were R² = 0.29 (RMSE = 0.438%) and R² = 0.27 (RMSE = 0.445%) for the PG and CP setups, respectively. These poor results may be due to the fact that the analysed soils have low organic matter content (0.37-2.40%) and high sand content (24-76%). In fact, spectroscopic predictions of organic matter are less accurate in soils with low C content [50] and high sand content [14].
Soil pH is a key factor for agriculture as an important fertility regulator of nutrient solubility and plant root development, biological activity, decomposition, mineralisation, etc. Because pH is a soil property with no direct spectral responses in the NIR spectroscopy range [15], calibrations rarely perform better than an RMSE of one-third or half a pH unit [14]. However, soil pH has been predicted quite successfully in several studies [19,51,52].
The pH prediction performance of our PLSR models was excellent (RPD > 3.3) for both PG and CP reflectances (R² = 0.92 (RMSE = 0.340) and R² = 0.92 (RMSE = 0.329), respectively). Our results were similar to those reported by Kuang et al. (RMSE = 0.36; RPD = 2.02) and better than those reported by Sorenson et al. [52] (R² = 0.68) and by Marín-González et al. [19] (R² = 0.86; RPD = 2.69). Results for pH predictions may be explained by co-variation with spectrally active soil constituents such as organic matter and clay [35] or by soil mineralogy and carbonate content [15]. Note that pH calibrations tend to vary from one dataset to another because they reflect different scenarios.
Since our PG and CP setup models had comparable predictive capacities (R²) and accuracies (RMSE), it can be concluded that the CP setup, more versatile for field measurements, is preferable for soil property estimates by VIS-NIR-SWIR spectroscopy. Note that while Rosero-Blasova et al. [33] reported a PG setup to perform better than a CP setup, they detected no statistically significant differences between the two setups. Figure 2 shows distributions of the weighted regression coefficients over the full spectral range for both the PG and CP setups and the considered soil properties (to highlight differences, the regression coefficients for each soil property are offset by 3.0 units). Evident are several peaks in wavelength bands located in the VIS and NIR regions, attributable to colour, water, organic matter, and clay minerals [16]. Regarding sand, K, P, and Mn, the main peaks in the VIS range are associated with the blue and green regions around 450 nm and 550 nm, respectively, demonstrating that colour contributes similarly to predicting those properties. Mouazen et al. [16] reported a similar distribution of regression coefficients to ours, identifying the spectral range between 1800 nm and 2450 nm as the most active for P and K estimates. As for pH and Ec, these are mainly associated with the blue and green regions, denoting the influence of Fe oxides associated with clay minerals [58]. Predictions of N content are little affected by colour, while Ca predictions are influenced in the red region.

PLSR Model Performance
As was expected, regression coefficient distributions were very similar for both PG and CP setups, thereby corroborating measurement and prediction consistency between both. The main difference was reported for N estimates, which can be attributed to PLSR algorithm calculations instead of actual differences in spectral signatures. Figure 3 shows the R 2 values for the PLSR models obtained for the different data subsets, different pre-processing approaches, and PG and CP setups. Using only part of the spectrum (VIS, NIR, or SWIR), general trends for R 2 were similar: R 2 values with SWIR were the best, followed by R 2 values with VIS, then R 2 values with NIR. RMSE values were highest with NIR, while the lowest values were obtained with SWIR (see Table 4 above). The main disadvantage of using SWIR subsets was that some models (e.g., those for N and Mn) needed a greater number of factors than the VIS and NIR subsets, which potentially delays computational calculations. as the most active for P and K estimates. As for pH and Ec, these are mainly associated with the blue and green regions, denoting the influence of Fe oxides associated with clay minerals [58]. Predictions of N content are little affected by colour, while Ca predictions are influenced in the red region. Figure 2. Weighted regression coefficient distribution over the spectral range obtained for PLSR models for the CP setup (coloured unbroken lines) and PG setup (coloured broken lines). Soil properties analysed for cross-validation are sand content (Sand), pH, electrical conductivity (Ec), total nitrogen content (N), extractable phosphorous (P), extractable potassium (K), extractable calcium (Ca), and extractable manganese (Mn). Black broken lines represent zero correlation, offset by 3.0 units for clarity of presentation.
Table 5 shows the best fitting models for each soil property, each type of preprocessing, each spectral subset, and each setup. Generally, sand, pH, Ec, N, P, and K were best predicted with models using the SWIR subset, Ca with models using the VIS subset, and Mn with models using the full spectrum. The best models for sand, Ec, K, Ca, and Mn were obtained for SVN + DT preprocessing, while models for N and P only required SVN preprocessing, and models for pH required no preprocessing.
Agronomy 2021, 11, x FOR PEER REVIEW
Figure 3. Variation in coefficients of determination for spectral subsets and preprocessing approaches. R² values obtained for PLSR models for the CP setup (coloured unbroken lines) and PG setup (coloured broken lines). Spectral subsets were VIS (350-700 nm), NIR (701-1000 nm), SWIR (1001-2500 nm), and VIS + NIR + SWIR (350-2500 nm). Spectral preprocessing approaches were standard normal variate (SVN), detrending transformation (DT), and SVN + DT. Soil properties are sand content (Sand), pH, electrical conductivity (Ec), total nitrogen content (N), extractable phosphorous (P), extractable potassium (K), extractable calcium (Ca), and extractable manganese (Mn).
No one specific kind of preprocessing ensures the effectiveness of models. Spectral signatures of soils are influenced by chemical composition and structural properties that produce non-linear light scattering effects. Regression model performance depends on the soil dataset, the analysed soil property, and the variability of the data [59], so a specific model needs to be fitted that reflects each scenario. Furthermore, it has been reported that spectral preprocessing has a minor influence on results when PLSR models are used [60].
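For readers unfamiliar with the two transformations, the following is a minimal sketch of SVN followed by DT. It rests on assumptions not stated in the text: SVN is taken as row-wise centring and scaling by the sample standard deviation, and DT as subtraction of a second-order polynomial fitted to each spectrum. The function names and the synthetic spectra are hypothetical.

```python
import numpy as np

def standard_normal_variate(spectra):
    """Centre and scale each spectrum (row) by its own mean and standard deviation."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def detrend(spectra, wavelengths, degree=2):
    """Subtract a least-squares polynomial baseline from each spectrum (row)."""
    # Rescale wavelengths to [-1, 1] so the polynomial fit is well conditioned.
    span = wavelengths.max() - wavelengths.min()
    x = 2.0 * (wavelengths - wavelengths.min()) / span - 1.0
    coeffs = np.polyfit(x, spectra.T, degree)       # shape (degree + 1, n_samples)
    baseline = np.vander(x, degree + 1) @ coeffs    # shape (n_bands, n_samples)
    return spectra - baseline.T

# Synthetic example: one underlying spectrum seen with different
# multiplicative (scale) and additive (offset) scatter effects.
wl = np.arange(350.0, 2501.0, 1.0)
base = 0.3 + 1e-4 * (wl - 350.0) - 0.05 * np.exp(-0.5 * ((wl - 2200.0) / 40.0) ** 2)
rng = np.random.default_rng(0)
scale = rng.uniform(0.8, 1.2, size=(5, 1))
offset = rng.uniform(-0.05, 0.05, size=(5, 1))
raw = scale * base + offset

svn_only = standard_normal_variate(raw)
processed = detrend(svn_only, wl)   # SVN + DT
```

In this noise-free example the five processed spectra coincide, which illustrates how SVN removes additive and multiplicative scatter while DT removes the remaining curvilinear baseline.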

Data Preprocessing
Stenberg et al. [14] report that SVN combined with DT is one of the more commonly used means of improving PLSR performance, as this approach usually enhances weak soil spectral signals. In our research, while SVN + DT increased R² and reduced RMSE (see Tables 3 and 4), the improvement depended on the studied soil property and was modest, probably because the raw reflectance data were quite stable and consistent. Other authors [33,61] report, for VIS-NIR spectroscopy, that preprocessing of spectral samples is data-specific, so no single technique or combination of techniques is generally applicable. In fact, different preprocessing methods should be used for different calibration techniques, different datasets, and different soil conditions [59]. Table 5 confirms that the predictive performance of soil property PLSR spectroscopic models varies with different kinds of preprocessing. Furthermore, use of different accessories results in different illumination setups and observation geometries that condition measurement and that consequently may affect the performance of models [21]. Model effectiveness is also probably conditioned by variability in the data [59]. In fact, for properties where standard deviations are greater, more variance is explained and greater accuracy is achieved.
Figure 4 represents Pearson coefficient values reflecting correlations between soil properties and wavelengths. The correlograms grouped by correlation structure are Fe and Mn; N and organic matter; pH and Ec; and sand and K. Analysing the correlograms, within groups, the correlation structure is quite redundant; only Fe and organic matter have direct optical features, while predictions for the remaining properties are based on spurious correlations [62]. Patterns in the groups can be attributed to dominant chemical characteristics (e.g., iron oxides and clay minerals in the group consisting of Fe and Mn) and to the aggregate effect of several optically active minerals [62]. Regression models based on spurious correlations depend on underlying geology and soil parameters and so are only useful for our studied plots.
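A correlogram of the kind described above can be computed as the Pearson coefficient between one soil property and the reflectance at each wavelength. The sketch below is illustrative only: the function name `correlogram`, the coarse wavelength grid, and the synthetic data (a property tied to a single band at 2200 nm) are assumptions, not the study's data or code.

```python
import numpy as np

def correlogram(spectra, prop):
    """Pearson r between a soil property and reflectance at every wavelength."""
    Xc = spectra - spectra.mean(axis=0)     # centre each band
    yc = prop - prop.mean()                 # centre the property
    return (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))

# Synthetic example: 60 samples on a coarse 50 nm grid.
rng = np.random.default_rng(1)
wl = np.arange(350, 2501, 50)
spectra = rng.normal(size=(60, wl.size))
band = int(np.argmin(np.abs(wl - 2200)))    # index of the 2200 nm band
prop = 3.0 * spectra[:, band] + rng.normal(scale=0.1, size=60)

r = correlogram(spectra, prop)
strongest = wl[np.argmax(np.abs(r))]
print(f"strongest correlation at {strongest} nm")
```

A high coefficient at a given band indicates only a statistical association in the sampled soils, which is consistent with the caution expressed above about models that rely on indirect correlations.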
Interpretation of the regression coefficient curves and correlograms (Figures 2 and 4, respectively) is complicated by the overlapping absorption patterns of soil constituents. The studied chemical properties do not have direct spectral responses in the considered spectral regions. The prediction of these properties, namely, sand, pH, Ec, N, P, K, Ca, and Mn, can be attributed to locally present co-variations in spectrally active constituents (mainly organic C and clay minerals). Furthermore, correlations of some soil properties with NIR spectroscopy are still unknown and so require further investigation [16]. In fact, Miller [63] acknowledged that it is difficult to identify relevant effects in the NIR spectrum based on chemistry and spectroscopy of samples alone. Therefore, further studies are needed to understand why, in our study and using VIS-NIR-SWIR spectroscopy, some properties were estimated with excellent accuracy (pH, Ec, P, and Ca) and others with acceptable accuracy (sand, N, and Mn).
Our results suggest that it is possible to estimate variables such as sand, pH, Ec, N, P, K, Ca, and Mn that are optically non-active chemical properties with featureless spectra, because those elements are bonded to spectrally active soil components, mainly iron oxides, organic matter, and clay minerals, in such a way that the bonds constitute a key predictive mechanism [62]. Similar conclusions have been published by Martínez-Carreras et al. [64] and Wu et al. [65].

Conclusions
Vineyard soil parameters were estimated by relating spectral signatures to laboratory analytical determinations using PLSR. Reflectance measurements were made using PG and CP setups. Our findings suggest that proximal soil spectroscopy is a useful technique for soil characterisation and monitoring. The great advantage of the spectroscopic approach is that it is cost-effective and rapid, although its prediction accuracy is lower than that of laboratory analyses. The predictive capacity (R²) and accuracy (RMSE) of the PLSR models depend on setup (PG or CP), preprocessing (SVN and/or DT), spectral subset (VIS, NIR, SWIR, or full spectrum), and individual soil properties. The best predictions, with R² values above 0.915, were obtained for pH, Ec, and P, while moderately accurate predictions, with R² values of 0.69 to 0.77, were obtained for sand, N, and K.
In conclusion, PLSR models can be useful for monitoring overall changes in soil properties. Further studies aimed at more effective precision viticulture practices will focus on vineyard soil characterisation using VIS-NIR-SWIR spectroscopy combined with geographical information system (GIS) data.