Phytochemical Characterization of Olea europaea L. Cultivars of Cilento National Park (South Italy) through NMR-Based Metabolomics

Olea europaea germplasm comprises a huge number of cultivars, each characterized by specific features. In this context, endemic cultivars evolved over a very long period in a precise local area, developing very specific traits. These characteristics include the production and accumulation of phytochemicals, many of which are also responsible for the nutraceutical value of the drupes and of the oils obtained from them. With the aim of obtaining information on the phytochemical profile of drupes of autochthonous cultivars of the Cilento, Vallo di Diano and Alburni National Park, a metabolomics-based study was carried out on 19 selected cultivars. Multivariate data analysis of 1H-NMR data and 2D NMR analyses allowed the rapid identification of metabolites that varied qualitatively and/or quantitatively among the cultivars. This study identified the cultivars Racioppella, Guglia, Pizzulella, Oliva amara, and Racioppa as the richest in health-promoting phenolic compounds. Furthermore, it showed significant variability among the different cultivars, suggesting that metabolic fingerprinting approaches could be used for cultivar differentiation once further studies have assessed the influence of growing conditions and environmental factors on the chemical profiles of single cultivars.


Introduction
Olive (Olea europaea L.) belongs to the Oleaceae family and is a tree of significant biological, economic, and cultural importance. Its long record of cultivation led to multiple phenotypic expressions, usually described as cultivars, each characterized by specific morphometric and biological features [1]. It is becoming increasingly evident that different cultivars are also characterized by the production of different specialized metabolites [2][3][4], and since these compounds are responsible for many properties of olives and olive oil, it is crucial to study this aspect.
The world olive germplasm is made up of more than 2600 cultivars, 600 of which are cultivated in Italy [5]. This high variability is certainly due to the fact that olive is an allogamous species characterized by a high level of hetero-pollination. However, it is also the result of a long history of selection and adaptation to specific environmental conditions. Endemic cultivars, in particular, evolved over a very long period in a specific local area and developed adaptive traits that are well-integrated with the environmental, agronomic, cultural, and traditional landscape features of the site [6]. This huge number of cultivars is an enormous resource in terms of biodiversity, but on the other hand, their
The 1H-NMR spectra obtained from extracts of the plant material in a 1:1 mixture of CD3OD and phosphate buffer in D2O were processed, and bucketing was carried out. The resulting integral table was analyzed via multivariate data analysis.
The PCA score plot (Figure 1) showed a separation into two clusters along PC1, i.e., a cluster described by the positive scores and one characterized by the negative scores.
Figure 1. PCA score plot of the analyzed cultivars (see Table 1). Numbers 1-3 indicate three independent biological replicates for each cultivar.
From the analysis of the loading plot (Figure 2A), it was possible to identify the variables that were correlated with PC1. PC1 showed a direct correlation with signals in the sugar region and an inverse correlation with signals in the region of the aromatic and aldehydic protons. Furthermore, an inverse correlation was also observed with signals in the regions 1.72-2.00 ppm and 2.68-3.12 ppm, along with a few other signals in the aliphatic region.
Within each cluster, a gradient was observed along PC2. This component showed an inverse correlation with signals in the aliphatic region and a direct correlation with signals in the aromatic region and a few aldehydic signals. As for the aldehydic and aromatic signals, it was clear that those directly correlated with PC1 were inversely correlated with PC2 and vice versa (Figure 2). Therefore, the metabolites generating the signals resonating in these regions might be important for the discrimination of the different cultivars. Furthermore, signals potentially belonging to olefinic protons also showed a direct correlation with PC2.
Although there was no clear separation into groups along PC2, but rather a gradient, each cluster could still be split into two distinct groups according to the position of the samples on the PCA score plot. Overall, the samples could thus be divided into four groups based on their positions in the different PCA quadrants.
Since PC1 is inversely correlated with the aromatic and aldehydic signals and PC2 is directly correlated with the same signals, it is already possible to state that the samples located in the lower-right quadrant of the PCA score plot (Group I, Figure 1) were characterized by smaller amounts of aromatic compounds compared to the other cultivars. Ricippudda (RI) showed only traces of these signals and is the cultivar with the lowest amount, but Salella (SA), Rotondella (RT), and Nostrale (NO) were also not particularly rich in these signals (Figure 3).
All of the other cultivars were characterized by a very rich aromatic region. The samples in the upper-right quadrant of the PCA score plot (Group II, Figure 1) were characterized by signals in the aromatic and olefinic regions, but most notably by the presence of metabolites characterized by an aldehydic signal at δH 9.09. Although other signals in the aldehydic region were also detected in the cultivars closer to the zero value of the PC2 axis, in the Marinella samples (MA) the signal at δH 9.09 was still the predominant one (Figure 4).
The cultivars Marinella (MA), Provenzale (PR), Sanginara (SN), Oliva amara (AM), and Cammarotana (CA) constituted Group II and were characterized by the presence of this signal at δH 9.09 and further signals in the aromatic region.
Since the signal at δH 9.09 was directly correlated with PC2, we also expected it to have a significant intensity in the samples of Racioppella (RO) and Guglia (GU), located in the upper-left quadrant of the PCA score plot (Group III, Figure 1). However, in these spectra, signals inversely correlated with PC1 were also detected in the aromatic region, in particular doublets in the regions 6.00-6.50 ppm and 7.00-7.50 ppm (Figure 5). In the samples closer to the zero value of the PC2 axis (PI, FT, and GR), several signals of other aldehydic protons were also detected.
Finally, the samples in the lower-left quadrant (Group IV, Figure 1) again showed signals of aromatic and aldehydic protons, but with different chemical shifts and multiplicities compared to the other cultivars (Figure 6). Furthermore, these metabolites were less abundant in the cultivars of Group IV than in the cultivars of Groups II and III.
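The quadrant-based grouping described above can be illustrated with a short sketch. It assumes that the PC1 and PC2 scores of each sample are already available; the function name `assign_group` and the example scores are hypothetical and are not values from this study.

```python
def assign_group(pc1: float, pc2: float) -> str:
    """Map PC1/PC2 scores to the quadrant-based groups used in the text.

    Group I: lower-right, Group II: upper-right,
    Group III: upper-left, Group IV: lower-left.
    """
    if pc1 >= 0:
        return "II" if pc2 >= 0 else "I"
    return "III" if pc2 >= 0 else "IV"


# Hypothetical scores, for illustration only (not measured data).
example_scores = {
    "sample_a": (2.1, -1.3),
    "sample_b": (1.4, 0.9),
    "sample_c": (-1.8, 1.1),
    "sample_d": (-0.7, -2.0),
}

groups = {name: assign_group(pc1, pc2) for name, (pc1, pc2) in example_scores.items()}
print(groups)
```

Because a gradient rather than a sharp separation was observed along PC2, samples close to the zero value of the PC2 axis sit near a group boundary, and a hard threshold such as this one should be read with that caveat in mind.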
Although the signals in the aliphatic region appeared to change in all four groups of cultivars described here, most of them changed in accordance with the aromatic, olefinic, and aldehydic signals. Therefore, they might arise from protons belonging to the same molecules. In order to prove this and to identify the chemicals responsible for the observed separation, 2D NMR analyses were carried out.

Identification of the Metabolites in the Extracts
The PCA analysis showed that the compounds generating the aldehydic and some aromatic and olefinic signals might be important for the cultivar discrimination. Therefore, 2D NMR analyses (Supplementary Figures S1-S10) were carried out on selected samples with the aim of identifying the compounds generating these signals in the NMR spectra.
Among the cultivars of Group II, Cammarotana samples were useful for the identification of two of the compounds detected in the extracts. Indeed, two aldehydic signals were detected: one at δH 9.24 and another one at δH 9.09. Based on the 2D NMR data, it was possible to assign the proton at δH 9.24 to compound 1 (Figure 7); the HMBC experiment showed the correlations reported in Table 2. A TOCSY experiment was useful to confirm the spin systems belonging to this compound. From the 2D NMR of this extract, it was also possible to identify the dihydroxyphenylethanol-elenolic acid dialdehyde (DHPEA-EDA, 2). The presence of the aromatic signals as two doublets and a doublet of doublets, respectively, at δH 6.70, 6.79, and 6.58, the benzylic proton at δH 2.77, and the diagnostic signals of the iridoid moiety at δH 9.09, 6.72, and 1.90, along with the correlations reported in Table 2, allowed us to identify this compound in the extract. The data were in accordance with the literature [20].
Figure 7. Compounds 1-4 identified in the extracts and responsible for the differentiation of the samples.
In the spectra of Group III, aldehydic signals not yet identified were present, especially in Pizzulella (PI). However, since the same signals were also detected, along with other related ones, in the Pisciottana (PS) cultivar, 2D NMR analyses were carried out on the latter, allowing the identification of several metabolites at the same time. A doublet at δH 9.04 and a doublet of doublets at δH 9.00 were detected and observed to always be in a 1:1 ratio. Based on the HSQC and HMBC data (Table 2), it was possible to identify the dialdehydic form of oleuropein, also known as oleomissional [21] (3, Figure 7). Finally, another aldehydic compound detected in the extracts was oleocanthal (4). In this case too, its identity was confirmed thanks to 2D NMR data (Table 2), which were in accordance with those reported in the literature [22].
Oleuropein (5), demethyloleuropein (6), nüzhenide (7), and the hydrated form of the aglycone of oleuropein (8) (Figure 8) were present only in a few cultivars and in low amounts. Their identity was therefore putatively established by comparison with previously published data [3,20,23]. The presence of oleuropein was suggested by the singlets at δH 5.85 (H-1) and 7.54 (H-3), the quartet at δH 6.06 (H-8), the methyl at δH 1.58 (H-10), and the anomeric proton of glucose at δH 4.60. Furthermore, signals of the hydroxytyrosol moiety were detected. Diagnostic signals of demethyloleuropein were the singlets at δH 5.76 (H-1) and 7.27 (H-3) and the quartet at δH 6.00 (H-8). The presence of nüzhenide was, on the other hand, suggested by the singlets at δH 5.96 (H-1) and 7.55 (H-3), the quartet at δH 6.10 (H-8), and the methyl at δH 1.78 (H-10). Finally, the hydrated form of the aglycone of oleuropein was detected thanks to the olefinic proton at δH 7.59 (H-3), the methine at δH 3.26 (H-5), the signal at δH 4.86 (H-1), and the methyl at δH 1.28 (H-10). This compound is present as different stereoisomers, but it is not possible to discriminate them in the mixture.
Besides the signals of the hydroxytyrosol and tyrosol secoiridoid derivatives, signals belonging to cornoside and to halleridone [24] (9 and 10, Figure 8) were also detected and identified based on the correlations observed in the Grossale cultivar (Table 2). Finally, the Racioppa extract allowed us to attribute signals belonging to verbascoside [20,25] (11, Table 2 and Figure 8).

Classifications of the Analyzed Cultivars Based on Their Metabolite Content
Although several other metabolites are present in the extracts, from the previous analyses, it was clear that the discrimination between the different cultivars was possible thanks to the compounds generating aromatic, olefinic, and aldehydic signals. The signals in this region were therefore integrated and normalized to the integral of the internal standard TMSP. The cultivars are listed in Table 3 in order of decreasing content of these metabolites (RO was the cultivar with the highest value for the integral of the region between 5.5 and 9.5 ppm). Based on diagnostic signals of the identified metabolites, it was also possible to tell which metabolites are present in each cultivar (Table 3).
Table 3. Phytochemical compositions of the analyzed cultivars (columns: cultivar and compounds 1-11).
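The normalization described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the TMSP window (−0.02 to 0.02 ppm) is borrowed from the region excluded in the multivariate analysis rather than from a stated integration range, and the spectrum is synthetic.

```python
import numpy as np


def region_integral(ppm, intensity, lo, hi):
    """Trapezoidal integral of a spectrum between lo and hi (in ppm)."""
    ppm = np.asarray(ppm, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    order = np.argsort(ppm)  # NMR axes are often stored from high to low ppm
    ppm, intensity = ppm[order], intensity[order]
    mask = (ppm >= lo) & (ppm <= hi)
    x, y = ppm[mask], intensity[mask]
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def normalized_content(ppm, intensity, region=(5.5, 9.5), ref=(-0.02, 0.02)):
    """Integral of the 5.5-9.5 ppm region divided by the TMSP integral.

    The reference window is an assumption made for this sketch.
    """
    return region_integral(ppm, intensity, *region) / region_integral(ppm, intensity, *ref)


# Synthetic spectrum for illustration only (not measured data).
ppm = np.linspace(10.0, -0.5, 10501)


def peak(center, width, height):
    """Gaussian-shaped peak on the synthetic ppm axis."""
    return height * np.exp(-(((ppm - center) / width) ** 2))


tmsp = peak(0.0, 0.002, 100.0)
aromatic = peak(6.70, 0.01, 5.0) + peak(9.09, 0.01, 2.0)

content = normalized_content(ppm, tmsp + aromatic)
content_doubled = normalized_content(ppm, tmsp + 2.0 * aromatic)
print(content, content_doubled)
```

Ranking cultivars by such a ratio reproduces the ordering logic of Table 3, provided that the same amount of internal standard is present in every sample.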

Discussion
The metabolomics analysis of the drupes of 19 autochthonous olive cultivars endemic to the Cilento, Vallo di Diano and Alburni National Park (PNCVDA) allowed us to classify them based on their richness in potential nutraceutical compounds, from both a quantitative and a qualitative point of view. Focusing on the variables identified through PCA of the NMR data (Figures 1 and 2), and after attributing these variables to metabolites thanks to the 2D NMR analyses of selected spectra (Table 3, Supplementary Figures S1-S10), it was possible to classify the studied cultivars based on their richness in these metabolites. Furthermore, it was possible to define the composition of each cultivar with respect to the metabolites of interest.
The cultivars Racioppella, Guglia, Pizzulella, Oliva amara, and Racioppa were the ones characterized by the highest content of specialized metabolites (Table 3). The main metabolite in these cultivars was DHPEA-EDA (Figures 4-8). However, they were all characterized by the presence of several different metabolites (Table 3). In the cultivars Pizzulella and Racioppa, signals of oleacein and verbascoside, respectively, also stood out.
The cultivars characterized by the lowest concentration of the target compounds were Nostrale, Salella, Rotondella, and Ricippudda (Figure 3). Among these, however, Nostrale had a more diversified profile, since signals from many different compounds were detected (Table 3).
Among the detected compounds, some (1-4) seemed to be rather common in the analysed cultivars, while the less common cornoside and halleridone were only detected in specific cultivars (Table 3). Oleuropein and its derivatives seemed to have a nonuniform distribution as well. Finally, verbascoside was also only detected in some of the cultivars. Could any of these compounds be considered as a potential biomarker? None of the compounds was typical of only one cultivar; however, the specific profile of each cultivar could be taken into consideration to establish a sort of fingerprinting approach for cultivar identification. At this point, it is important to highlight that the content of specialized metabolites in any plant material can be affected by several variables [20,26,27]. In this study, one of the most important was excluded thanks to the specific approach for sample collection: all of the drupes were collected at the same ripening stage. However, many other factors that could impact the olive metabolome will need to be investigated, particularly altitude, rainfall, and temperature. It has been reported, for example, that fruits harvested at higher altitudes are characterized by higher tocopherol and total phenolic contents [28].
The present study only shows that, for most of the cultivars, a specific and unique profile was detected. A crucial next step will be to address the variability in the metabolome of each cultivar based on the cultivation site and environmental conditions. Although previous studies have explored either the influence of the cultivar or the influence of the environment on the content of specialized metabolites [2,3,29], metabolomics now makes it possible to design studies in which the combined influences of the different factors can be addressed. This information is crucial to assess the nutraceutical value of each cultivar because the latter is directly correlated with the content of some phytochemicals.
Based on the data discussed here, and on current knowledge, the cultivars Racioppella, Guglia, Pizzulella, Oliva amara, and Racioppa are those potentially characterized by the highest nutraceutical value. Among these cultivars, Racioppella, Guglia, Pizzulella, and Racioppa were particularly interesting for the presence of oleocanthal. This compound has a well-proven anti-inflammatory activity [16,17], which is shared with verbascoside [30]. Phenolic compounds, in general, have been extensively studied for their antioxidant and radical scavenging properties. Oleuropein and oleocanthal, in particular, have been tested not only in vitro but also in vivo or at least in in vivo models [31]. The olive drupes analyzed here contained, however, relatively low amounts of oleuropein, which seems to be abundant in leaves [32]. These compounds, along with the other secoiridoid derivatives described, are also characterized by documented antimicrobial activity against several pathogens [33] and potential anticancer activity [32]. It has been shown, however, that phytochemical mixtures might have a different activity than the pure compounds due to synergistic, additive, but also antagonistic effects.
Further studies are therefore needed to experimentally assess the nutraceutical potential of the analyzed cultivars. This has to go hand-in-hand with the evaluation of the variability of the content of these metabolites in the cultivars of interest depending on different factors.

Sample Collection
In this study, 19 olive cultivars (Table 1) were collected in the PNCVDA and analyzed. In particular, the metabolite content of the pulps of the drupes was explored by an NMR-based metabolomics approach.
Since the ripening stage strongly affects the metabolite content, it was important to choose the right moment for the sample collection. The ripening stage was assessed based on a previously reported method and on weekly monitoring of the sugar content of the drupes [33,34]. Furthermore, at harvesting time, attention was paid to the phytosanitary state of the plants and of the drupes, in particular by selecting those that did not show any sign of pathogen attack.
For each cultivar, three biological replicates were collected, each consisting of a 50-mL Falcon tube (Falcon, Berlin, Germany) filled with drupes. Samples were immediately frozen in liquid nitrogen and freeze-dried as soon as they were transported to the laboratory. The pulps of the drupes were then separated from the seeds and freeze-dried again. The samples were then ground in liquid nitrogen and stored at −80 °C until further analyses.

NMR Analysis
NMR spectra were recorded at 25 °C on a Varian (Palo Alto, CA, USA) Mercury Plus 300 Fourier transform NMR operating at 300.03 MHz for 1H and 75.45 MHz for 13C. CD3OD was used as the internal lock. One-dimensional and 2D NMR spectra were acquired using Varian standard pulse sequences and as previously described [20].
Two-dimensional NMR analyses were carried out on selected samples using standard library sequences for HSQC, HMBC, COSY, and TOCSY.

Multivariate Data Analysis
1H NMR spectra were scaled to total intensity and bucketed, reducing them to integral segments with a width of 0.04 ppm with ACDLABS 12.0 1H NMR processor (ACDLABS, Toronto, ON, Canada). The regions at δ −0.02-0.02, 4.70-5.00, and 3.30-3.34 were excluded from the analysis (by indicating them as dark regions before integration) because of the residual TMSP and solvent signals. Principal component analysis (PCA) was performed with the SIMCA-P software (version 14.0, Umetrics, Umeå, Sweden) with Pareto scaling.
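For readers without access to the commercial packages, the processing chain described above (bucketing at 0.04 ppm, exclusion of the dark regions, scaling to total intensity, and Pareto-scaled PCA) can be approximated with NumPy. The sketch below is not the ACDLABS or SIMCA-P implementation: the function names, the bucket start point, and the synthetic spectra are assumptions introduced for illustration.

```python
import numpy as np

# Excluded (dark) regions in ppm, as listed in the text.
EXCLUDED = [(-0.02, 0.02), (3.30, 3.34), (4.70, 5.00)]


def bucket_spectrum(ppm, intensity, width=0.04, lo=-0.02, hi=10.0):
    """Sum intensities into fixed-width buckets, drop buckets overlapping
    the excluded regions, and scale the profile to total intensity.

    The start of the bucket grid (lo) is an assumption of this sketch.
    """
    ppm = np.asarray(ppm, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    edges = np.arange(lo, hi + width / 2, width)
    sums, _ = np.histogram(ppm, bins=edges, weights=intensity)
    left, right = edges[:-1], edges[1:]
    keep = np.ones(sums.size, dtype=bool)
    eps = 1e-9  # tolerance so buckets that only touch a region are kept
    for a, b in EXCLUDED:
        keep &= ~((left < b - eps) & (right > a + eps))
    centers = 0.5 * (left + right)
    kept = sums[keep]
    return centers[keep], kept / kept.sum()


def pareto_pca(X, n_components=2):
    """PCA via SVD after Pareto scaling (mean-centering, then division
    by the square root of each variable's standard deviation)."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # constant buckets are left unscaled
    scaled = centered / np.sqrt(sd)
    U, S, Vt = np.linalg.svd(scaled, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    loadings = Vt[:n_components].T
    explained = (S**2 / np.sum(S**2))[:n_components]
    return scores, loadings, explained


# Synthetic spectra for illustration only (not measured data).
rng = np.random.default_rng(0)
ppm_axis = np.linspace(10.0, -0.5, 4201)


def synthetic_spectrum(aromatic, sugar):
    """Noise plus one 'aromatic' and one 'sugar' peak of given heights."""
    noise = 0.01 * rng.random(ppm_axis.size)
    peaks = (aromatic * np.exp(-(((ppm_axis - 6.7) / 0.02) ** 2))
             + sugar * np.exp(-(((ppm_axis - 3.8) / 0.02) ** 2)))
    return noise + peaks


heights = [(5, 1), (4, 1), (6, 1), (1, 5), (1, 4), (1, 6)]
X = np.vstack([bucket_spectrum(ppm_axis, synthetic_spectrum(a, s))[1] for a, s in heights])
scores, loadings, explained = pareto_pca(X)
print(scores.shape, explained)
```

Pareto scaling is a compromise between no scaling and unit-variance scaling: it reduces the dominance of intense signals such as sugars while amplifying baseline noise less than autoscaling would.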

Conclusions
The present study, aimed at determining the phytochemical profiles of the drupes of different olive cultivars of the PNCVDA, showed specific and unique profiles for most of the analyzed cultivars. Whether or not this information can be used for the classification of cultivars needs to be assessed through further studies aimed at determining the variability of the metabolome of each of the analyzed cultivars depending on several factors. Nevertheless, knowledge of the specialized metabolites present in each cultivar is essential to assess their nutraceutical potential. The present data showed that the cultivars Racioppella, Guglia, Pizzulella, and Racioppa were particularly interesting for the presence of oleocanthal, a molecule with several proven biological activities, although several other cultivars were characterized by the presence of potential health-promoting compounds.

Data Availability Statement:
The raw data supporting the conclusions of this article will be made available by the authors upon request, without undue reservation.