#### 3.1. Plant Material and Greenhouse Experiments

Blackberry (Rubus fruticosus L.) cv. Tupy was chosen for the field trials.

The plants were transplanted on 29 November 2013 into a 600 m^{2} greenhouse at the IFAPA Centre La Mojonera, Almería (36°47′19″ N, 02°42′11″ W; 142 m a.s.l.), following standard cultural practices for disease control, insect pest management and plant nutrition.

The blackberry plants were transplanted into 15 L polypropylene containers filled with coconut fibre substrate. The irrigation water had a pH of 8.1 and an electrical conductivity of 1.26 mS·cm^{−1}; the nutrient solution had a pH of 5.8 and an electrical conductivity of 2.50 mS·cm^{−1}.

The trial (Figure 6) was designed as a randomized complete block with 3 replicates and 20 plants per replicate. Thirty fruits were collected from each plant and stored at −80 °C, then lyophilized (Telstar LyoQuest, Terrassa, Spain) and ground in a mill (Janke & Kunkel, model A10, IKA^{®}-Labortechnik). The samples were lyophilized to remove the strong absorbance of water in the infrared region, which overlaps with important bands of nutritional parameters present in low concentration [40].

Two samplings were performed at the time of maximum production (21 April 2014 and 20 May 2014).

The harvested fruits were classified by colour with a colorimeter to avoid fruit-to-fruit variation in ripeness; fruits were considered ripe when the CIE L*a*b* (CIELAB) values were L*: 21.11; a*: 0.835; b*: 0.073 and C*: 1.27.

#### 3.4. NIRS Analysis Calibration and Validation Development

One hundred and twenty freeze-dried blackberry samples were analysed by NIRS (90 for calibration, 30 for validation). A spectrometer (Model 6500, Foss-NIRSystems, Inc., Silver Spring, MD, USA) was used to record the spectra in reflectance mode from 400 to 2500 nm at 2 nm intervals.

Freeze-dried, ground samples of the blackberries were placed in the sample holder (3 cm diameter, 10 mL volume approximately) until it was full (sample weight: 3.50 g) and then were scanned. Their spectra were acquired at 2 nm wavelength resolution as log 1/R (R is reflectance) over a wavelength range from 400 to 2500 nm (visible and near-infrared regions).
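As a small illustration of the log 1/R transformation mentioned above, the sketch below converts reflectance readings to apparent absorbance. It assumes a base-10 logarithm and reflectance expressed as a fraction between 0 and 1; the function name and the example readings are hypothetical and are not taken from the instrument software.

```python
import math

def reflectance_to_log_inv_r(reflectance):
    """Convert reflectance fractions (0 < R <= 1) to apparent absorbance log10(1/R)."""
    absorbance = []
    for r in reflectance:
        if not 0.0 < r <= 1.0:
            raise ValueError("reflectance must lie in the interval (0, 1]")
        absorbance.append(math.log10(1.0 / r))
    return absorbance

# Hypothetical reflectance readings at three wavelengths (not measured data)
example_reflectance = [1.0, 0.5, 0.1]
example_absorbance = reflectance_to_log_inv_r(example_reflectance)
```

A perfectly reflecting sample (R = 1) gives log 1/R = 0, and lower reflectance gives a larger apparent absorbance.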

The spectral variability and structure of the sample population were evaluated using the CENTER algorithm; samples with a statistical value > 3 were considered anomalous spectra or outliers [42].
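The outlier screening described above can be illustrated with a generic distance-to-centre calculation. The sketch below is not the CENTER implementation in WINISI; it assumes that the statistic is a squared Mahalanobis distance in principal-component score space, divided by the number of components so that the population average is close to 1. The function name, the number of components and the synthetic spectra are all hypothetical.

```python
import numpy as np

def standardized_distance(spectra, n_components=3):
    """Standardized squared Mahalanobis distance of each spectrum from the
    mean spectrum, computed on principal-component scores."""
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)                        # centre on the mean spectrum
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    score_var = scores.var(axis=0, ddof=1)         # variance of each score column
    d2 = np.sum(scores ** 2 / score_var, axis=1)   # squared Mahalanobis distance
    return d2 / n_components                       # average over samples is close to 1

# Synthetic demonstration data (not from the study)
rng = np.random.default_rng(0)
wavelength_index = np.linspace(0.0, 1.0, 20)
demo_spectra = 0.5 + 0.01 * rng.normal(size=(50, 20))
demo_spectra[0] += wavelength_index                # one deliberately distorted spectrum

distances = standardized_distance(demo_spectra)
outliers = np.flatnonzero(distances > 3.0)         # same cut-off as in the text
```

With this construction the distorted spectrum has by far the largest distance and is flagged by the > 3 rule.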

Calibration equations for total phenolic content (TPC) and total carotenoid content (TCC) were developed on the whole calibration set (n = 90) using the application GLOBAL v. 1.50 (WINISI II, Infrasoft International, LLC, Port Matilda, PA, USA). Calibration equations were computed using different mathematical treatments, although only those with the highest predictive capacity are shown: (1,4,4,1), (1,10,10,1), (2,5,5,2) and (2,20,20,2), where the four terms are, in order, the derivative order of the log 1/R data (R being the reflectance), the segment of the derivative, the first smooth and the second smooth. In addition to derivatives, standard normal variate and de-trending (SNV-DT) transformations [43] were used; these algorithms correct baseline offsets caused by scattering effects (differences in particle size and path length among samples) and improve the accuracy of the calibration.
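The pre-treatments named above can be sketched in a few lines of NumPy. This is a simplified illustration rather than the WINISI routines: SNV scales each spectrum by its own mean and standard deviation, de-trending is assumed here to subtract a second-order polynomial fitted against wavelength, and the derivative is approximated by a plain gap difference without the two smoothing steps. All function names and default values are hypothetical.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) individually."""
    X = np.asarray(spectra, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def detrend(spectra, wavelengths, degree=2):
    """Subtract a least-squares polynomial baseline from each spectrum."""
    X = np.asarray(spectra, dtype=float)
    w = np.asarray(wavelengths, dtype=float)
    w_scaled = (w - w.mean()) / (w.max() - w.min())   # rescale for numerical stability
    V = np.vander(w_scaled, degree + 1)               # design matrix, one row per wavelength
    coef, *_ = np.linalg.lstsq(V, X.T, rcond=None)    # one column of coefficients per spectrum
    return X - (V @ coef).T

def gap_derivative(spectra, order=1, gap=4):
    """Approximate derivative as repeated differences between points `gap` apart."""
    D = np.asarray(spectra, dtype=float)
    for _ in range(order):
        D = D[:, gap:] - D[:, :-gap]
    return D

def snv_dt(spectra, wavelengths):
    """SNV followed by de-trending."""
    return detrend(snv(spectra), wavelengths)
```

Each gap difference shortens the spectrum by `gap` points, so a second-order treatment with a gap of 5 points drops 10 points from each spectrum.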

Modified partial least squares (PLSm) was used as the regression method to correlate the spectral information (raw optical data or derived spectra) of the samples with the TPC and TCC determined by the reference method, using different numbers of wavelengths from 400 to 2500 nm for the calculation. The objective was to perform a linear regression in a new coordinate system with a lower dimensionality than the original space of the independent variables. The PLS loading factors (latent variables) were determined by the maximum variance of the independent (spectral data) variables and by a maximum correlation with the dependent (chemical) variables. The model obtained used only the most important factors, the “noise” being encapsulated in the less important factors.
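To make the idea of regressing on a few latent variables concrete, the following sketch fits an ordinary single-response PLS model with the NIPALS algorithm. It is not the modified PLS variant used in WINISI, whose additional modification is not reproduced here; the function names are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit ordinary PLS1 by NIPALS; returns regression vector and centring terms."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean                 # residual matrices, deflated per factor
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                               # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = Xr @ w                                  # scores (latent variable)
        tt = t @ t
        p = Xr.T @ t / tt                           # spectral loadings
        c = yr @ t / tt                             # regression of y on the scores
        Xr = Xr - np.outer(t, p)                    # remove the explained part
        yr = yr - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)             # coefficients in the original variables
    return B, x_mean, y_mean

def pls1_predict(X, B, x_mean, y_mean):
    """Predict the reference value from spectra with a fitted PLS1 model."""
    return (np.asarray(X, dtype=float) - x_mean) @ B + y_mean
```

Keeping `n_components` small retains only the dominant factors; raising it to the full rank of the spectral matrix reproduces an ordinary least-squares fit, noise included.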

Cross-validation was performed on the calibration set to determine both the ability to predict unknown samples and the best number of terms to use in the equation [44]. The number of principal component terms used in the equation to explain the analyte variance was also taken into account before selecting the equation for use. The cross-validation process used in the software should prevent overfitting of the equation to the calibration set, as the optimum number of terms is selected when the standard error of cross-validation (SECV) is at its lowest and R^{2}_{CV} is at its highest. Adding more terms than necessary increases the prediction error and overfits the equation to its calibration set, resulting in poor predictive performance on samples outside the calibration set. Usually a medium-sized model is preferred.

An external validation on 30 independent samples was carried out to evaluate the accuracy and precision of the calibration equations for total phenolic and carotenoid content, following the protocol outlined by Shenk et al. [44]. The 30 samples of the validation set were selected by taking one of every 5 samples in the 120-sample set; the calibration set consisted of the 90 remaining samples. The standard error (SE) and coefficient of determination were calculated for cross-validation (R^{2}_{CV}) and external validation (Q^{2}). The predictive ability of the equations was assessed in the external validation from the Q^{2} coefficient, the RPD (the ratio of the standard deviation of the validation samples to the standard error of prediction (performance), SEP) and the RER (the ratio of the range of the reference data in the validation set to the SEP).

NIR models can be classified according to the Q^{2} from the external validation [36]: if 0.26 < Q^{2} < 0.49, the models show a low correlation; if 0.50 < Q^{2} < 0.64, the models can be used for rough predictions of samples; if 0.65 < Q^{2} < 0.81, the models can be used to discriminate between low and high values of the samples; if 0.82 < Q^{2} < 0.90, the models give good predictions; and if Q^{2} > 0.90, the models show excellent precision. RPD values > 3 are desirable for excellent calibration equations, whereas equations with an RPD < 1.5 are unusable [35]. The RER should be at least 10 [36].
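The classification rules above translate directly into a small lookup. In the sketch below, the function names and category labels are hypothetical, and values falling in the small gaps between the published intervals (for example 0.495, or exactly 0.90) are assigned to the class whose lower bound they reach; that handling is an assumption, not something stated in the cited guidelines.

```python
def classify_q2(q2):
    """Map an external-validation Q2 to the qualitative classes listed in the text."""
    if q2 > 0.90:
        return "excellent precision"
    if q2 >= 0.82:
        return "good prediction"
    if q2 >= 0.65:
        return "discriminates low from high values"
    if q2 >= 0.50:
        return "rough prediction"
    if q2 >= 0.26:
        return "low correlation"
    return "not classified"          # below the lowest interval given in the text

def classify_rpd(rpd):
    """RPD > 3 is desirable; RPD < 1.5 is unusable; values in between are not named."""
    if rpd > 3.0:
        return "excellent"
    if rpd < 1.5:
        return "unusable"
    return "intermediate"

def rer_is_adequate(rer):
    """The RER should be at least 10."""
    return rer >= 10.0
```

For example, `classify_q2(0.85)` returns `"good prediction"` and `classify_rpd(1.2)` returns `"unusable"`.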

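The mathematical expressions of these statistics are as follows:

$$\mathrm{RPD}=\mathrm{SD}{\left[\frac{\sum_{i=1}^{N}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}}{N-K-1}\right]}^{-1/2}$$

$$\mathrm{RER}=\mathrm{range}{\left[\frac{\sum_{i=1}^{N}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}}{N-K-1}\right]}^{-1/2}$$

where

${y}_{i}$ = laboratory reference value for the ith sample;

${\widehat{y}}_{i}$ = NIR value for the ith sample;

K = number of wavelengths used in an equation;

N = number of samples;

SD = standard deviation of the reference values in the validation set;

range = range of the reference values in the validation set.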
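As a worked illustration of these statistics, the sketch below computes the SEP, RPD and RER from paired reference and NIR values. It assumes that the standard error uses an N − K − 1 denominator, which is what the symbols K and N listed above suggest, and that SD is the sample standard deviation of the reference values; the function names and the numbers are hypothetical, not data from the study.

```python
import math
import statistics

def sep(reference, predicted, k):
    """Standard error of prediction with an assumed N - K - 1 denominator."""
    n = len(reference)
    if n - k - 1 <= 0:
        raise ValueError("need more samples than K + 1")
    ss = sum((y - y_hat) ** 2 for y, y_hat in zip(reference, predicted))
    return math.sqrt(ss / (n - k - 1))

def rpd(reference, predicted, k):
    """Ratio of the SD of the reference values to the SEP."""
    return statistics.stdev(reference) / sep(reference, predicted, k)

def rer(reference, predicted, k):
    """Ratio of the range of the reference values to the SEP."""
    return (max(reference) - min(reference)) / sep(reference, predicted, k)

# Hypothetical reference and NIR-predicted values (not data from the study)
reference_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
nir_values = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9]
example_sep = sep(reference_values, nir_values, k=1)
example_rpd = rpd(reference_values, nir_values, k=1)
example_rer = rer(reference_values, nir_values, k=1)
```

Because both ratios share the SEP as denominator, they differ only in how the spread of the reference data is expressed (standard deviation versus range).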