Multi-Chemical Profiling of Strawberry as a Traceability Tool to Investigate the Effect of Cultivar and Cultivation Conditions

The chemical composition of foods is tightly regulated by multiple genotypic and agronomic factors, which can thus serve as potential descriptors for traceability and authentication purposes. In the present work, we performed a multi-chemical characterization of strawberry fruits from five varieties (Aromas, Camarosa, Diamante, Medina, and Ventana) grown in two cultivation systems (open/closed soilless systems) during two consecutive campaigns with different climatic conditions (rainfall and temperature). For this purpose, we analyzed multiple components closely related to the sensory and health characteristics of strawberry, including sugars, organic acids, phenolic compounds, and essential and non-essential mineral elements, and various complementary statistical approaches were applied for selecting chemical descriptors of cultivar and agronomic conditions. Anthocyanins, phenolic acids, sucrose, and malic acid were found to be the most discriminant variables among cultivars, while climatic conditions and the cultivation system were behind changes in polyphenol contents. These results thus demonstrate the utility of combining multi-chemical profiling approaches with advanced chemometric tools in food traceability research.


Introduction
The composition of foods, in terms of nutrients, bioactive compounds, and other components, is tightly regulated by multiple factors, such as the genotype, geographical origin, environmental factors, and agronomic conditions. These factors in turn influence the sensory, nutritional, and nutraceutical properties of food products, which makes the implementation of quality control strategies mandatory to ensure their authenticity and traceability. In this vein, it should be noted that food quality and safety may be influenced by a myriad of factors throughout the entire supply chain, from initial food production to packaging, processing, and transport, until final commercialization [1]. This is particularly important for processed foods, which usually involve more complex operations and therefore call for efficient traceability initiatives. To address these needs, the food industry demands novel and powerful analytical methods that can reliably guarantee the authenticity and traceability of food products.
Strawberry (Fragaria × ananassa Duch.) is one of the most commonly consumed berry fruits around the world and is considered a functional food because of its chemical composition, which is rich in essential and bioactive compounds. Strawberry has been demonstrated to lower post-prandial oxidative stress,

Experimental Design and Sampling
Strawberry fruits (Fragaria × ananassa Duch.) were collected in two consecutive campaigns (years 2015 and 2016) from the same experimental plantations located in Huelva (southwest Spain), at the same commercial ripeness (>75% of the surface showing red color). The first campaign was characterized by higher total radiation, while higher rainfall and higher maximum and minimum temperatures were registered in the second. Five varieties of strawberries, genetically characterized by the vendor (Aromas, Camarosa, Diamante, Medina, and Ventana) and grown in two soilless systems (closed and open systems, i.e., with and without recirculation of the nutrient solution, respectively), were investigated. Plants were grown in a polycarbonate-covered greenhouse using elevated horizontal troughs filled with coconut fiber as a substrate, and with natural daylight as a radiation source. The temperature ranged from 25 °C during the day to 8 °C at night, with relative humidity held at 75 ± 5%.
Several fruits (n = 10) were collected for each variety and cultivation system to generate a representative pooled sample. Immediately after harvesting, fruits were sorted, frozen in situ in a deep freezer, and shipped to the laboratory in polystyrene punnets. Then, fruits were washed, sepals were dissected, and pooled fruits (n = 10) were gently homogenized by using a kitchen mixer to obtain a puree (approximately 100–150 mL). Samples were subsequently aliquoted and stored for up to 2 months at −21 °C, until further analysis. For each study condition (i.e., cultivar, campaign, and cultivation conditions), three replicates (i.e., three pooled and homogenized samples) were prepared.

Analysis of Sugars and Organic Acids
Sugars and organic acids were analyzed using an Agilent 110 series high-performance liquid chromatography (HPLC) system coupled to ultraviolet (UV) and refractive index (RI) detectors (Agilent Technologies, Santa Clara, CA, USA), following the methodology previously described [7]. Approximately 1 g of the homogenate was accurately weighed, diluted to 10 mL with ultrapure water (Millipore, Bedford, MA, USA), and centrifuged at 10,000 rpm for 10 min (BHG-Hermle Z 365, Wehingen, Germany). The supernatant was filtered through a 0.45 µm PVDF (polyvinylidene difluoride) filter prior to HPLC analysis.
In a single chromatographic run, three sugars (glucose, fructose, and sucrose) and six organic acids (oxalic, citric, tartaric, malic, succinic, and lactic) were separated using a Metacarb 87H hydrogen-form cation-exchange resin-based column (300 × 7.8 mm i.d.) packed with sulfonated polystyrene. The mobile phase, 5 mM sulfuric acid, was delivered in isocratic mode at a flow rate of 0.5 mL min⁻¹ for 15 min, and the injection volume was 20 µL. UV detection of organic acids was performed at 210 nm, while sugars were analyzed using the RI detector. Compounds were identified by comparing retention times (and UV spectra for organic acids) with those of reference standards.
The identification of phenolic compounds was achieved by comparing their retention times and UV spectra with those for commercial standards. For quantification, the following wavelengths were employed: 260 nm for ellagic acid and derivatives, 280 nm for benzoic acids and flavan-3-ols, 320 nm for cinnamic acids, 360 nm for flavonols, and 520 nm for anthocyanins.

Analysis of Mineral Elements
For mineral content analysis, 0.5 g of fruit was placed in a Teflon vessel and digested with 3 mL of a mixture of nitric and hydrochloric acids, both 1.5 M. Digestion was carried out for 2 min using a microwave furnace at 250 W. After cooling, the digest was filtered, transferred to a 25 mL flask, and made up to volume with ultrapure water.
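The dilution arithmetic implied by this procedure can be illustrated with a short sketch that converts a concentration measured in the final digest (mg L⁻¹) into a content in the fruit (mg kg⁻¹). Only the 0.5 g sample mass and the 25 mL final volume come from the text; the function name and the example reading are hypothetical and are not taken from the study.

```python
def digest_to_fruit_content(conc_mg_per_l, final_volume_ml=25.0, sample_mass_g=0.5):
    """Convert a digest concentration (mg/L) into a fruit content (mg/kg).

    Defaults follow the procedure described above (0.5 g of fruit made up
    to 25 mL); the function itself is an illustrative assumption.
    """
    if sample_mass_g <= 0 or final_volume_ml <= 0:
        raise ValueError("sample mass and final volume must be positive")
    analyte_mass_mg = conc_mg_per_l * (final_volume_ml / 1000.0)  # mg in the flask
    return analyte_mass_mg / (sample_mass_g / 1000.0)             # mg per kg of fruit


# Hypothetical reading of 2.0 mg/L in the 25 mL flask: about 100 mg/kg in the fruit
print(digest_to_fruit_content(2.0))
```

With these defaults every mg L⁻¹ in the digest corresponds to 50 mg kg⁻¹ in the fruit, so the conversion is a single multiplicative factor as long as the sample mass and final volume are constant.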

Statistical Analysis
One-way analysis of variance (ANOVA), multivariate analysis of variance (MANOVA), and pattern recognition techniques, including principal component analysis (PCA), linear discriminant analysis (LDA), soft independent modeling of class analogy (SIMCA), and partial least squares discriminant analysis (PLS-DA), were carried out to investigate the differences among strawberry varieties and/or cultivation systems. All statistical analyses were conducted in Statistica 7.1 (StatSoft Inc., Tulsa, OK, USA) and SIMCA-P™ 11.5 (UMetrics AB, Umeå, Sweden).
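For readers unfamiliar with the univariate step, the F statistic of a one-way ANOVA can be sketched in a few lines. This is a minimal illustration in plain Python, not the Statistica routine used in the study; the function name and the three small groups of values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more samples than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for one compound in three groups (not study data)
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)  # F is 13 with (2, 6) degrees of freedom, up to rounding
```

A p value would then be obtained from the F distribution with the returned degrees of freedom, for example with a statistics library such as SciPy.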

Multi-Chemical Profiling of Strawberry
Mean concentrations for all the analyzed compounds (i.e., sugars, organic acids, polyphenols, and mineral elements) are listed in Table 1 for the five strawberry cultivars investigated. Soluble sugars identified and quantified in strawberry fruits were fructose, glucose, and sucrose; monosaccharides were the major species in all varieties, except for "Camarosa", which showed higher sucrose contents. The ratio of fructose to glucose content was about the same, regardless of the cultivar, in agreement with our previous findings [10]. With regard to organic acids, citric acid was the most concentrated metabolite, followed by malic acid, in consonance with previous studies [7,11]. In agreement with results found in the literature, anthocyanins were the predominant polyphenol class in strawberry [14], followed by phenolic acids, with pelargonidin 3-glucoside, pelargonidin 3-rutinoside, and cyanidin 3-glucoside being the three major anthocyanin species [8,15], which were found at similar levels to those reported by Crespo et al. [16]. The mineral profile was dominated by five major elements (K, P, Ca, Na, and Mg), with potassium showing the highest concentrations (average content of 2834.5 mg kg⁻¹). Phosphorus, calcium, magnesium, and sodium were also present in high concentrations, representing approximately 20% of the total mineral content, while other elements (Fe, Cu, Zn, and Sr) accounted for less than 1% of the mineral profile. These results are in line with previous findings [7].
Multivariate analysis of variance (MANOVA) was applied to test the effects of the cultivar and cultivation system on the chemical profile, and analysis of variance (ANOVA) with a Tukey HSD post hoc test was used to evaluate the statistical significance of the differences for each compound or element measured. The multivariate test showed that both factors had a significant effect on the content of sugars, organic acids, and polyphenols (p < 0.001), but not on the mineral profile (p > 0.1 and p > 0.5 for the variety and cultivation system, respectively). Univariate results for each variable are shown in Table 1. "Camarosa" and "Ventana" were found to be the richest cultivars in total sugars and organic acids. In particular, "Camarosa" strawberries showed the highest content of sucrose and malic acid. The "Ventana" cultivar presented the richest profile in phenolic acids, mainly dominated by ellagic acid, while "Camarosa" and "Aromas" varieties showed higher concentrations of total polyphenols, mainly anthocyanins.
Table 1. Concentrations (expressed as the mean ± standard deviation) of sugars (g kg⁻¹), organic acids (g kg⁻¹), phenolic compounds (mg kg⁻¹), and mineral elements (mg kg⁻¹) in each strawberry cultivar, and p values obtained by ANOVA.

Application of Pattern Recognition Tools for Selecting Chemical Descriptors of Cultivar and Agronomic Conditions
Several chemometric techniques, including unsupervised and supervised pattern recognition procedures, were employed to achieve a reliable differentiation between strawberry samples according to the cultivar, cultivation system, and/or campaign.
A preliminary data exploration was carried out by principal component analysis (PCA), using autoscaled data and only considering the principal components (PCs) with eigenvalues greater than 1. This PCA model explained 84% of the total variance with five components. As shown in the scores plot built using the first two principal components (Figure 1A), a clear separation was observed along PC1 between samples collected in the two consecutive campaigns. The first PC explained 28% of the variance, and was positively related to fructose and tartaric acid, and negatively associated with pelargonidin 3-glucoside, total flavonoids, and total polyphenols. That is, the content of anthocyanins and total polyphenols was greater during the second campaign, when rainfall and maximum and minimum temperatures were higher, whereas fructose and tartaric acid were more abundant in the first campaign, when total radiation was higher. In this vein, it has previously been described that the content of many phenolic compounds and the antioxidant capacity increase in berry fruits as the temperature increases [17]. Moreover, low light intensity and high temperatures have also been demonstrated to decrease the synthesis of sugars and ascorbic acid [7,11,18].
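The two preprocessing choices mentioned here, autoscaling and retaining only components with eigenvalues greater than 1 (often called the Kaiser criterion), can be sketched as follows. This is an illustrative NumPy implementation under stated assumptions, not the SIMCA-P or Statistica procedure used in the study; the function name and the synthetic data matrix are hypothetical.

```python
import numpy as np


def pca_kaiser(X):
    """Autoscale X (samples x variables) and keep PCs with eigenvalue > 1.

    Returns all eigenvalues (descending), the fraction of variance each
    explains, and the scores on the retained components.
    Assumes no variable is constant (its standard deviation would be zero).
    """
    X = np.asarray(X, dtype=float)
    # Autoscaling: mean-center each variable and divide by its standard deviation
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Covariance of autoscaled data equals the correlation matrix of X
    corr = (Z.T @ Z) / (Z.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    keep = eigvals > 1.0                     # eigenvalue-greater-than-1 rule
    scores = Z @ eigvecs[:, keep]
    return eigvals, explained, scores


# Synthetic matrix standing in for samples x chemical variables (not study data)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 6))
eigvals, explained, scores = pca_kaiser(X_demo)
print(scores.shape, explained[: scores.shape[1]].sum())
```

Because autoscaled variables each have unit variance, the eigenvalues sum to the number of variables, so an eigenvalue above 1 marks a component that carries more variance than any single original variable.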
On the other hand, plotting the second and fourth PCs provided some differentiation according to the cultivar (Figure 1B), with "Camarosa" and "Aromas" varieties distributed on the left side of the projection, and the rest of the samples located on the right side. The most relevant compounds contributing to this separation were anthocyanins (increased in "Camarosa" and "Aromas") and phenolic acids (decreased in "Camarosa" and "Aromas"), in accordance with the results obtained by ANOVA. After this preliminary data exploration, several supervised chemometric tools were employed to build classification models with the aim of assessing the potential of the multi-chemical profile investigated in this work to authenticate strawberries according to the variety and cultivation conditions. For this purpose, multiple supervised pattern recognition procedures have recently been proposed in food research to solve authentication problems for various foods with a high commercial value, such as strawberry [11,15], olive oil [19–21], or wine [22,23]. In the present study, three complementary statistical techniques were tested: linear discriminant analysis (LDA), soft independent modeling of class analogy (SIMCA), and partial least squares discriminant analysis (PLS-DA).
Linear discriminant analysis (LDA) was first applied to all the study variables, yielding a model capable of explaining 96% of the total variance with a 95% prediction ability. Applying forward stepwise analysis, cyanidin 3-glucoside, pelargonidin 3-rutinoside, p-coumaric acid, phosphorus, malic acid, caffeic acid, and quercetin were identified as the most discriminant variables among cultivars. As shown in Figure 2A, all samples were correctly classified, with the exception of two samples of the "Medina" cultivar, which were classified as "Diamante". In line with the results from PCA, "Aromas" and "Camarosa" cultivars were clearly differentiated from the rest of the samples along the first root, while the second one described almost complete separation between the other three cultivars.
Soft independent modeling of class analogy (SIMCA) was subsequently applied to the same data matrix used in LDA, with the aim of looking for possible overlap among the study groups. Using a seven-fold cross-validation procedure, 3-PC-based models were obtained explaining 96.4%, 94.5%, 95.0%, 92.5%, and 97.2% of variance for the classes "Aromas", "Camarosa", "Diamante", "Medina", and "Ventana", respectively. These models also provided very good results in terms of their prediction ability, with 86.7%, 81.5%, 80.8%, 71.3%, and 88.5% correct prediction for the five cultivars. In this line, representation of the corresponding Coomans plot showed a correct classification of strawberries according to the variety based on their chemical composition (Figure 2B). However, SIMCA modeling did not provide suitable results for the classification of strawberry samples according to agronomic conditions, with samples appearing in the overlapping area from the Coomans plots (figure not shown).
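The seven-fold cross-validation mentioned above can be illustrated with a simple fold generator. The text does not report how samples were assigned to folds, so the interleaved assignment below (sample i goes to fold i mod 7) is an assumption, as are the function name and the sample count of 60, which is merely what the design of 5 cultivars × 2 systems × 2 campaigns × 3 replicates would imply.

```python
def k_fold_indices(n_samples, k=7):
    """Yield (train, test) index lists; sample i is held out in fold i % k."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


# Assumed sample count (5 cultivars x 2 systems x 2 campaigns x 3 replicates)
for train, test in k_fold_indices(60, k=7):
    # A class model would be fitted on `train` and evaluated on `test` here
    print(len(train), len(test))
```

Each sample is held out exactly once, so the predictions collected over the seven folds cover the whole data set without any sample being predicted by a model that was trained on it.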
Finally, partial least squares discriminant analysis (PLS-DA) was also employed as a more powerful technique for class differentiation and for the selection of the most discriminant variables. A five-component model was obtained with a good quality of fit (R²X = 0.744) and predictive ability (Q² = 0.413) for the classification of strawberry samples according to the cultivar (Figure 2C). The most important chemical descriptors driving this separation were anthocyanins and phenolic acids, in line with previous findings from ANOVA and LDA. Interestingly, PLS-DA modeling also enabled the discrimination of samples grown in the two cultivation systems (i.e., open and closed soilless systems). The PLS-DA model explained 70.2% of the variance (Figure 2D), with p-hydroxybenzoic acid, ferulic acid, unknown derivatives of pelargonidin, glucose, pelargonidin acetylglucoside, and cyanidin 3-glucoside being the most discriminant variables.
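To make the predictive-ability figure easier to interpret, the sketch below computes Q² using its common definition, Q² = 1 − PRESS/SS, where PRESS is the sum of squared differences between observed and cross-validated predicted responses and SS is the total sum of squares around the mean. The exact computation performed by SIMCA-P (for example, how it accumulates over components) is not described in the text, so this single-response version is an assumption; the function name and the toy values are hypothetical.

```python
def q2(y_observed, y_cv_predicted):
    """Q2 = 1 - PRESS / SS_total for one response variable.

    y_cv_predicted must contain cross-validated predictions, i.e. each value
    predicted by a model that did not see the corresponding sample.
    """
    if len(y_observed) != len(y_cv_predicted) or not y_observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_observed) / len(y_observed)
    press = sum((yo - yp) ** 2 for yo, yp in zip(y_observed, y_cv_predicted))
    ss_total = sum((yo - mean_y) ** 2 for yo in y_observed)
    if ss_total == 0:
        raise ValueError("observed response has no variation")
    return 1.0 - press / ss_total


# Toy class-membership coding (1 = in class, 0 = not); values are hypothetical
print(q2([1, 1, 0, 0], [0.8, 0.6, 0.3, 0.1]))  # about 0.70
```

Under this definition, a value of 1 means perfect cross-validated prediction and a value of 0 means the model predicts no better than the mean response.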

Conclusions
In this work, we have evaluated the potential of combining multi-chemical profiling and complementary statistical techniques to investigate the effect of the genotype and cultivation conditions on the chemical composition of strawberry fruits. The five cultivars investigated showed clear differences in the content of anthocyanins, phenolic acids, sucrose, and malic acid. On the other hand, climatic conditions (e.g., rainfall and temperature) were responsible for slight changes in the polyphenolic profile, with an increased content of anthocyanins and total polyphenols in strawberry fruits grown under higher rainfall and more extreme temperatures. Similarly, the cultivation conditions (i.e., open/closed soilless system) also induced minor changes in concentrations of several anthocyanins and phenolic acids. The present work therefore demonstrates that multi-chemical profiling can be used to differentiate among strawberry cultivars grown under different agronomic conditions, thus showing a great applicability for food traceability. In future studies, this approach could also be tested to search for characteristic patterns associated with the geographical origin, ripeness status, and other factors related to food production.