3.4. Determination of the Total Phenolic Fraction
For TPC analysis, three individual plants from each of the 45 accessions per replication were collected five weeks after sowing. For each plant, 3 or 4 leaves were washed with tap water, weighed to assess their biomass, and placed in Ziploc-type freezer bags at −20 °C for post-harvest storage. The samples were freeze-dried prior to TPC analysis.
The concentration of total phenolic compounds (TPC) was estimated by a modified version of the Folin–Ciocalteu method [31], using gallic acid as the standard, for which a calibration curve was run with solutions of 50, 100, 200, 300, 400, 500 and 600 mg/L of this compound. A 0.06 mL aliquot of extract, 1.58 mL of distilled water, 0.1 mL of Folin–Ciocalteu reagent (FCR) and 0.3 mL of Na₂CO₃ (20% w/v) were mixed and heated at 50 °C for 5 min. After 30 min, the absorbance was measured at 765 nm against a blank prepared in the same way but containing a 70:30 ethanol–water mixture (pH 3.2) instead of extract. Sodium carbonate (Panreac), FCR and gallic acid (both from Sigma–Aldrich) were used to determine the total phenolic fraction. The absorbance was measured with a ThermoSpectronic UV–visible spectrometer (Thermo Fisher Scientific, Waltham, MA, USA).
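As an illustration of how a gallic acid calibration curve of this kind can be used, the following minimal sketch fits a straight line to the seven standard concentrations and converts a sample absorbance into gallic acid equivalents. The absorbance values are placeholders generated from an assumed line (slope 0.001, intercept 0.02), not measured data, and the function names are hypothetical.

```python
# Gallic acid standards listed in the text (mg/L).
STANDARDS_MG_L = [50, 100, 200, 300, 400, 500, 600]

# ASSUMPTION: placeholder absorbances at 765 nm generated from an
# arbitrary line; real values would come from the spectrometer.
ABSORBANCE = [0.001 * c + 0.02 for c in STANDARDS_MG_L]


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def absorbance_to_gae(absorbance, slope, intercept):
    """Invert the calibration line to get mg/L gallic acid equivalents."""
    return (absorbance - intercept) / slope


SLOPE, INTERCEPT = fit_line(STANDARDS_MG_L, ABSORBANCE)
```

A sample reading is then converted with `absorbance_to_gae(reading, SLOPE, INTERCEPT)`; readings outside the 50–600 mg/L range of the standards would fall outside the calibrated interval.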
3.5. NIRS Analysis
All samples previously analysed by the reference method were then scanned in a NIRS monochromator (NIR Systems mod. 6500, NIR Systems, Inc., Silver Spring, MD, USA). Each sample was recorded in triplicate, and each spectrum was obtained as an average of 32 scans over the sample plus 16 scans over the standard ceramic. Samples were placed in a small ring cup (3.75 cm diameter) using the spinning sample module, and their spectra were collected from 400 to 2500 nm, registering the absorbance values (log 1/R) at 2 nm intervals.
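The two data-handling steps implied here, expressing reflectance R as log(1/R) and averaging the triplicate spectra of a sample, can be sketched as follows. This is only an illustration; the instrument software performs these operations internally, and the function names are hypothetical.

```python
import math


def reflectance_to_absorbance(reflectance):
    """Convert reflectance values R (0 < R <= 1) to log10(1/R)."""
    return [math.log10(1.0 / r) for r in reflectance]


def mean_spectrum(replicates):
    """Average replicate spectra point by point.

    `replicates` is a list of equally long spectra recorded
    on the same wavelength grid.
    """
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]
```

For a sample recorded in triplicate, `mean_spectrum([scan_1, scan_2, scan_3])` returns a single spectrum on the same wavelength grid.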
Prediction equations for DFF, LP and TPC were developed using the ISI program CALIBRATE (WINISI II, Infrasoft International, LLC, Port Matilda, PA, USA) with the modified partial least squares (MPLS) regression option. MPLS was used to correlate the spectral information (raw optical data or derived spectra) of the samples with the TPC and TCC contents determined by the reference method, using a different number of wavelengths from 400 to 2500 nm for the calculation. The objective was to perform a linear regression in a new coordinate system with a lower dimensionality than the original space of the independent variables. The PLS loading factors (latent variables) were determined by the maximum variance of the independent (spectral) variables and by the maximum correlation with the dependent (chemical) variables. The model retained only the most important factors, the “noise” being confined to the less important ones.
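The idea of regressing on a small number of latent variables can be illustrated with a plain single-response PLS (PLS1, NIPALS-style) sketch. This is not the MPLS algorithm of the WINISI software, whose internal details are not reproduced here; it is a textbook PLS1 written with the standard library only, and all names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_pls1(X, y, n_factors):
    """Fit PLS1 with up to `n_factors` latent variables."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    # Centred copies of the spectral matrix and the reference values.
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    f = [v - y_mean for v in y]
    factors = []
    for _ in range(n_factors):
        # Weight vector: direction of maximum covariance with y.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(m)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]          # scores
        tt = dot(t, t)
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]
        q = dot(f, t) / tt                      # regression of y on scores
        # Deflate X and y before extracting the next factor.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        factors.append((w, p, q))
    return x_mean, y_mean, factors


def predict_pls1(model, x):
    """Predict the reference value for one spectrum `x`."""
    x_mean, y_mean, factors = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in factors:
        t = dot(e, w)
        y_hat += q * t
        e = [ev - t * pv for ev, pv in zip(e, p)]
    return y_hat
```

Using fewer factors than there are wavelengths discards the directions with little covariance with the reference values, which is the sense in which the “noise” is left in the less important factors.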
Cross-validation was performed on the calibration set to determine both the ability to predict unknown samples and the best number of terms to use in the equation [21]. Cross-validation is an internal validation method that, like external validation, seeks to validate the calibration model on independent test data, but it does not set data aside for testing only, as occurs in external validation. This procedure is useful because all available chemical analyses for all individuals can be used to determine the calibration model, without the need to maintain separate validation and calibration sets. The method is carried out by splitting the calibration set into M segments and then calibrating M times, each time testing about a 1/M part of the calibration set.
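The M-segment procedure can be sketched as below. The interleaved assignment of samples to segments and the simple straight-line model used as a stand-in for the calibration are assumptions made for illustration; the source does not state how the software forms its segments.

```python
def cv_segments(n_samples, n_segments):
    """Assign sample indices to M segments in an interleaved way."""
    return [list(range(k, n_samples, n_segments)) for k in range(n_segments)]


def cross_validate(x, y, n_segments, fit, predict):
    """Calibrate M times; return one held-out prediction per sample."""
    y_cv = [None] * len(y)
    for test_idx in cv_segments(len(y), n_segments):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            y_cv[i] = predict(model, x[i])
    return y_cv


# Stand-in model (ASSUMPTION): a univariate least-squares line.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    return slope, my - slope * mx


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept
```

Every sample is predicted exactly once by a model that did not see it, so the residuals between `y_cv` and the reference values can be pooled into a single cross-validation error.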
Calibration equations were computed on the calibration set using four mathematical treatments, written as (derivative, gap, smooth, second smooth): 0, 0, 1, 1; 1, 4, 4, 1; 1, 10, 10, 1; and 2, 5, 5, 1. Standard normal variate and detrend transformations were used to correct for scattering, and two outlier-elimination passes were chosen to remove spectra with a standardized Mahalanobis distance from the mean (H) > 3, by using principal component analysis (PCA). The objective of this procedure was to detect and, if necessary, remove samples whose spectra differed from the other spectra in the set.
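A minimal sketch of the two scatter corrections follows. It assumes the commonly used definitions: standard normal variate (SNV) centres and scales each spectrum by its own mean and sample standard deviation, and detrend subtracts a second-degree polynomial fitted along the spectrum. The polynomial order and the function names are assumptions, since the source does not specify them.

```python
import statistics


def snv(spectrum):
    """Centre and scale one spectrum by its own mean and sample SD."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(a - mean) / sd for a in spectrum]


def _solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(3):
        pivot = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, 3):
            factor = M[r][c] / M[c][c]
            for k in range(c, 4):
                M[r][k] -= factor * M[c][k]
    x = [0.0, 0.0, 0.0]
    for r in reversed(range(3)):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x


def detrend(spectrum):
    """Subtract a least-squares quadratic baseline (ASSUMED order 2)."""
    n = len(spectrum)
    # Position along the spectrum, rescaled to [-1, 1] for stability.
    u = [2 * i / (n - 1) - 1 for i in range(n)]
    basis = [[1.0, ui, ui * ui] for ui in u]
    A = [[sum(b[r] * b[c] for b in basis) for c in range(3)] for r in range(3)]
    rhs = [sum(b[r] * a for b, a in zip(basis, spectrum)) for r in range(3)]
    c0, c1, c2 = _solve3(A, rhs)
    return [a - (c0 + c1 * ui + c2 * ui * ui) for a, ui in zip(spectrum, u)]
```

Applied in sequence as `detrend(snv(spectrum))`, the first step removes multiplicative scatter differences between spectra and the second removes the remaining curved baseline.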
Wavelengths from 400 to 2500 nm, every 2 nm, were used for calibration. The standard error of calibration (SEC), coefficient of determination (R²cv), standard error of cross-validation (SECV) and 1-VR (1 minus the ratio of unexplained variance to total variance) statistics were used to characterise the different equations obtained and to determine the best calibration equation [32]. SECV was used as an estimate of the standard error of performance (SEP) [33]. The best calibration equations were obtained with the 2,5,5,1 mathematical treatment (second derivative of the raw optical data, with a gap of 5 nm, and 5 and 2 nm for the first and second smooth, respectively) for all traits evaluated. The regression vectors of the three factors generated by the MPLS method applied to the 2,5,5,1 mathematical treatment were calculated. The loading plots show the regression coefficients of each wavelength for the parameter being calibrated, for each factor of the equation.
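The error statistics named above can be computed from paired reference and predicted values as in the sketch below. Two simplifications are assumed: the standard error is taken as the root mean square of the residuals with N in the denominator (calibration software may apply a different degrees-of-freedom correction), and 1-VR is taken literally as 1 minus the ratio of unexplained to total variance. Applied to calibration fits the first function gives an SEC-type value, and applied to cross-validation predictions it gives an SECV-type value. The function names are hypothetical.

```python
import math


def standard_error(reference, predicted):
    """Root mean square of the residuals (ASSUMED denominator: N)."""
    n = len(reference)
    sse = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(sse / n)


def one_minus_vr(reference, predicted):
    """1 minus unexplained variance over total variance."""
    mean_ref = sum(reference) / len(reference)
    unexplained = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    total = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - unexplained / total
```

Comparing these values across the four mathematical treatments is one way to rank candidate equations, with lower SECV and higher 1-VR indicating the better calibration.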