Preliminary Discrimination of Commercial Extra Virgin Olive Oils from Brazil by Geographical Origin and Olive Cultivar: A Call for Broader Investigations

Abstract: Extra virgin olive oil (EVOO) production in Brazil has been recently established and is growing, but only a few studies have been published on the topic, particularly involving commercial EVOO samples. A preliminary discrimination of Brazilian EVOOs according to olive cultivar and region of production was conducted. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were performed based on the results of recently published work by our research group on the evaluation of the quality parameters, the metabolic profile, and other typical features of commercial EVOOs from Brazil. One of the oleuropein aglycone isomers, linoleic acid, α-tocopherol, and free sterols were found to be the most discriminating variables within the models. PLS-DA also revealed the region of production as a significant factor in sample clustering. The present work provides a preview of the typicity of Brazilian EVOOs and highlights the urgent need for further investigations with a higher number of commercial samples from different olive cultivars and production regions. A comprehensive definition of the identity of their chemical profiles could provide Brazilian oils with significant added value, and possibly reveal distinctive features that could motivate the future establishment of a protected designation of origin.


Introduction
Brazil is one of the largest global importers of olive oil, with an increase in imports of 20% between the 2018/19 and 2019/20 harvest seasons. It is therefore an expanding market to be explored by national production, which would reach consumers more quickly and so provide a significant competitive advantage in terms of the oil's freshness. Brazil started olive oil production relatively recently, with the first known industrial batch dating from 2008, in the Southeast region. Since then, national olive oil production has spread to the South region and has been rising in both producing areas, in parallel with the need for research to build an identity profile of this emerging product [1,2].
Studying the quality indices and chemical composition of Brazilian olive oils is essential to demonstrate their distinctiveness. Applying multivariate analysis to these data could then allow olive oils to be discriminated by cultivar and geographical origin based on the samples' compositional profiles, as previously shown by different authors working with samples produced in other countries [3,4]. Regarding Brazilian olive oils, in a recent study, commercial olive oils (monovarietal oils from four different cultivars and two blends) were tentatively discriminated based on their chemical profile, and oil blends were located midway between their monovarietal cultivars [5]. Preliminary discrimination of non-commercial samples by geographical origin and cultivar has been reported in several research works [2,6–10]. However, to the best of our knowledge, this geographical origin clustering has not been reported for Brazilian commercial samples to date.
The aim of the present work was to achieve a preliminary discrimination of commercial olive oils produced in Brazil according to olive cultivar and region of production by applying multivariate analysis to the oils' compositional profiles.

Materials and Methods
Data from a recent publication by our group [11] describing quality parameters, metabolic profile, and other typical features of ten commercial extra virgin olive oils (EVOOs) from Brazil were further analyzed by means of multivariate statistics.
A description of EVOO sampling is available in Section 3.2 of our published work [11]. In short, ten samples of commercial monovarietal EVOOs produced in Brazil were deeply characterized, together with five Spanish EVOOs acquired in the Brazilian market, which were used as a representative reference for comparisons. All data used to perform multivariate analysis are available in the previously mentioned publication [11], as follows: (1) minor component content in Table S2a of the cited work (ref [11]) (phenolic and triterpenic compounds determined by reverse-phase liquid chromatography coupled to mass spectrometry), Table S2b of the cited work (ref [11]) (tocopherols, phytosterols and pigments determined by normal-phase liquid chromatography with fluorescence and diode array detection), and Table S3 of the cited work (ref [11]) (volatile and semi-volatile compounds determined by solid-phase microextraction-gas chromatography coupled to mass spectrometry); (2) antioxidant capacity, oxidative stability index, total phenolic content, free acidity value, peroxide value, specific extinction coefficients (K232 and K270) values, and p-anisidine value in Table 1 of the cited work (ref [11]); (3) M:P ratio and fatty acids contents in Table 2 of the cited work (ref [11]); and (4) sample cultivar, geographical origin of production, and altitude in Table 3 of the cited work (ref [11]).
Principal component analysis (PCA) was first applied to investigate the natural clustering of the samples, and partial least squares discriminant analysis (PLS-DA) was then used to build supervised discrimination models based on the same data matrix. From a total of 85 evaluated variables, considering quality parameters and chemical components, seven variables were excluded before running these multivariate statistics because they showed missing values for over half of the samples (contents below the limit of quantification). The 78 selected variables (concentrations of 63 minor components, contents of 6 fatty acids and the M:P ratio, antioxidant capacity, oxidative stability index, total phenolic content, free acidity value, peroxide value, specific extinction coefficients (K232 and K270) values, and p-anisidine value) were mean-normalized before conducting data analysis to avoid excessive influence of a specific variable on principal components. The same data matrix with 78 normalized variables was subjected to PLS-DA in a supervised approach, considering as PLS2 Y-variables: (1) samples' cultivar (Arbequina or Koroneiki), (2) geographical origin of production (South of Brazil, Southeast of Brazil, or Spain), and (3) sample altitude (defined by quartiles: lowest, 1st quartile, <215 m; intermediate, 2nd and 3rd quartiles, 215 to 874 m; and highest, 4th quartile, >874 m). For all the analyses, p-values ≤ 0.05 were considered significant.
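The preprocessing and unsupervised steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the workflow actually run in The Unscrambler: the data are synthetic, the function names are hypothetical, "mean-normalized" is assumed here to mean dividing each variable by its own mean, and the handling of any missing values remaining after filtering is not specified in the text, so the example assumes there are none.

```python
import numpy as np

def drop_sparse_variables(X, max_missing_fraction=0.5):
    """Drop columns with missing values (NaN) in more than the given fraction of samples."""
    missing_fraction = np.isnan(X).mean(axis=0)
    keep = missing_fraction <= max_missing_fraction
    return X[:, keep], keep

def mean_normalize(X):
    """Divide each variable by its column mean (assumed reading of 'mean-normalized')."""
    return X / X.mean(axis=0)

def pca(X, n_components=2):
    """PCA by singular value decomposition of the column-centered matrix."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates
    loadings = Vt[:n_components].T                    # variable contributions
    explained = (s ** 2) / np.sum(s ** 2)             # fraction of variance per PC
    return scores, loadings, explained[:n_components]

# Synthetic stand-in for a samples-by-variables matrix (15 samples, 8 variables).
rng = np.random.default_rng(0)
X_raw = rng.uniform(1.0, 10.0, size=(15, 8))
X_raw[:9, 0] = np.nan  # variable 0 is missing in 9 of 15 samples, so it is dropped

X_kept, kept_mask = drop_sparse_variables(X_raw)
X_norm = mean_normalize(X_kept)
scores, loadings, explained = pca(X_norm)
print(X_kept.shape, explained)
```

Plotting the two columns of `scores` against each other gives the scores plot, and the two columns of `loadings` give the loadings plot.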
PCA and PLS-DA were performed with The Unscrambler® software, version 9.7 (CAMO Software, Oslo, Norway). Altitude quartiles were calculated using the software GraphPad Prism (version 8.0.1, GraphPad Software, San Diego, CA, USA).
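For readers without access to the commercial software, the supervised step can be approximated as follows. The sketch below is an assumption-laden illustration rather than a reproduction of the analysis: it codes the class labels as dummy (0/1) Y-variables, derives altitude classes from quartiles, and fits a PLS2 model with a textbook NIPALS loop. All function names, labels, altitudes, and data values are hypothetical, and NumPy's default percentile interpolation may differ from the quartile method used by GraphPad Prism.

```python
import numpy as np

def altitude_classes(altitudes_m):
    """Label each altitude as low (< Q1), high (> Q3), or intermediate."""
    q1, q3 = np.percentile(altitudes_m, [25, 75])
    return ["low" if a < q1 else "high" if a > q3 else "intermediate"
            for a in altitudes_m]

def one_hot(labels):
    """Dummy-code a list of class labels into a 0/1 matrix (one column per class)."""
    classes = sorted(set(labels))
    Y = np.array([[1.0 if lab == c else 0.0 for c in classes] for lab in labels])
    return Y, classes

def pls2_nipals(X, Y, n_components=2, tol=1e-12, max_iter=1000):
    """PLS2 by NIPALS; returns X-scores and explained X- and Y-variance fractions."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    ssx_total, ssy_total = np.sum(X ** 2), np.sum(Y ** 2)
    T, x_expl, y_expl = [], [], []
    for _ in range(n_components):
        u = Y[:, [np.argmax(np.sum(Y ** 2, axis=0))]]  # start from the widest Y column
        for _ in range(max_iter):
            w = X.T @ u
            w /= np.linalg.norm(w)                     # X-weights, unit length
            t = X @ w                                  # X-scores
            q = Y.T @ t / (t.T @ t)                    # Y-loadings
            u_new = Y @ q / (q.T @ q)                  # Y-scores
            converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p = X.T @ t / (t.T @ t)                        # X-loadings
        X_next, Y_next = X - t @ p.T, Y - t @ q.T      # deflate both blocks
        x_expl.append((np.sum(X ** 2) - np.sum(X_next ** 2)) / ssx_total)
        y_expl.append((np.sum(Y ** 2) - np.sum(Y_next ** 2)) / ssy_total)
        X, Y = X_next, Y_next
        T.append(t[:, 0])
    return np.column_stack(T), np.array(x_expl), np.array(y_expl)

# Hypothetical labels and altitudes for 12 synthetic samples.
cultivar = ["Arbequina"] * 6 + ["Koroneiki"] * 6
origin = ["South", "Southeast", "Spain"] * 4
altitude = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200]
alt_class = altitude_classes(altitude)

Y = np.hstack([one_hot(cultivar)[0], one_hot(origin)[0], one_hot(alt_class)[0]])
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 6))
X[6:, 0] += 3.0  # give the second cultivar a shifted first variable

T, x_expl, y_expl = pls2_nipals(X, Y)
print(x_expl, y_expl)
```

The two returned arrays play the role of the per-component explained X- and Y-variance percentages that such software reports alongside the scores plot.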

Results
For PCA, using the non-supervised approach, the first two principal components explained 95% of the data variance and showed a natural clustering of samples by olive cultivar and, partially, by region of production, as can be observed in the scores plot (Figure 1a). Among the variables responsible for this grouping tendency, oleuropein aglycone (isomer 3), linoleic acid, α-tocopherol, and free sterols were the most influential, as shown in the loadings plot (Figure 1b). Koroneiki EVOOs from Brazil were grouped separately depending on the region of production: samples from the Southeast, highly influenced by the content of oleuropein aglycone (isomer 3), were clustered separately from those from the South, which were more strongly influenced by the content of α-tocopherol, except for brand G. Arbequina EVOOs were not clustered according to their country of production, although samples from the Southeast of Brazil seemed to be more influenced by the contents of linoleic acid and free sterols, whereas free sterols clustered Arbequina oils from the South of Brazil and from Spain. Sample cultivar, geographical origin, and altitude were considered as discriminating factors in PLS-DA. The supervised approach showed a grouping tendency consistent with PCA. The first two components in PLS-DA explained 95% of the X-variance (PC1: 89%; PC2: 6%) and 65% of the Y-variance (PC1: 47%; PC2: 18%). The analysis presented samples from cv. Arbequina clustered separately from cv. Koroneiki, and, partially, according to geographical origin of production. Koroneiki samples from Brazil were grouped according to the region of production (Figure 1c), with samples from the Southeast (higher altitudes) apart from those from the same cultivar produced in the South, except for brand G, which could be explained by the altitude of its production site (it is closer to the cluster of samples from the Southeast that showed intermediate altitudes).
In addition to altitude, the content of α-tocopherol strongly influenced Koroneiki sample clustering from the South region, and the content of oleuropein aglycone (isomer 3) seemed to be responsible for Koroneiki sample clustering from the Southeast region. Arbequina samples were more influenced by free sterol content (Figure 1d), also showing a clustering tendency affected by altitude; samples from the Southeast of Brazil (higher altitudes) were grouped separately from samples from the South. Spanish samples were spread across samples from both producing regions in Brazil but tended to cluster closer to samples from the South region, which also present intermediate altitudes.
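Reading the most influential variables off a loadings plot, as done for Figure 1b and 1d, amounts to looking for the variables that lie farthest from the origin. A minimal sketch of that idea is given below, assuming a variance-weighted distance over the first two components as the influence measure; this measure, the function name, and the variable names are illustrative choices and the data are synthetic, so the result says nothing about the actual oils.

```python
import numpy as np

def rank_variables_by_loading(X, names, n_components=2):
    """Rank variables by their variance-weighted loading magnitude on the first PCs."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    weights = s[:n_components] ** 2 / np.sum(s ** 2)          # explained variance per PC
    influence = np.sqrt((Vt[:n_components].T ** 2) @ weights)  # one value per variable
    order = np.argsort(influence)[::-1]
    return [(names[i], float(influence[i])) for i in order]

# Synthetic data: "var_b" is given by far the largest spread across samples.
rng = np.random.default_rng(1)
names = ["var_a", "var_b", "var_c", "var_d"]
X = rng.normal(size=(15, 4)) * np.array([0.1, 5.0, 0.1, 1.0])

ranking = rank_variables_by_loading(X, names)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Because the variables are not scaled to unit variance in this sketch, those with the largest spread across samples dominate the ranking.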

Discussion
Even though a limited number of olive oil samples were analyzed in the current work, two principal components explained a high percentage (>90%) of the data variance, and the same variables were consistently extracted as discriminating factors in both PCA and PLS-DA.
Olive cultivars Koroneiki and Arbequina were clustered apart from each other, in agreement with previous studies showing that the compositional profile of EVOOs was strongly influenced by the olive's genetic origin [12–15]. Separate clustering of non-commercial Arbequina and Koroneiki EVOOs based on their phenolic compound profile and content was previously reported [6]. Furthermore, differences in the phenolic compound profile of EVOOs from Southern Brazil were found between two harvest years [10]. Discrimination of Koroneiki and Arbequina commercial samples grown in the same region seems to be strongly influenced by fatty acid and phenolic compound profiles, but it should be stressed that only one sample of each cultivar grown in the same region was analyzed [5].
Our findings also indicate that free sterols and α-tocopherol can serve as cultivar discriminant factors, which might prove helpful in future investigations with increased sample size. Sterols and tocopherols have been previously reported as influencing factors when discriminating EVOO by geographical origin and cultivar [12,16].
Within Koroneiki EVOOs, most samples were clustered by geographical origin, except for brand G, which was produced at a slightly higher altitude. In addition to altitude, brand G showed higher contents of maslinic acid, which might have influenced its clustering closer to EVOOs from the Southeast. Arbequina samples were clustered by region of production in Brazil, but EVOO cv. Arbequina produced in Spain was not discriminated from those from either Brazilian region. Clustering of Brazilian Arbequina EVOOs according to their geographical origin has been previously pursued, but the low number of samples did not allow a clear separation when using oxidative stability, pigments, color, and fatty acids profile as variables [7]. Nevertheless, two different production regions could be distinguished when clustering the same samples on the basis of coenzyme Q10, tocopherols, and phenolic compound contents [8]. This finding reinforces the need for a broad chemical profile of the samples, especially of the olives' secondary metabolites, to better trace sample origin.
As indicated by PLS-DA, altitude of olive orchards may be an influential factor to be considered when aiming to discriminate EVOOs from the same cultivar by producing region. This fact could be explained by the influence of altitude on olive oil composition, as previously demonstrated, for instance, on fatty acids, tocopherols, and phenolic compound profiles [17–19].

Conclusions
EVOO grouping by olive cultivar and, partially, by geographical origin, confirms the utility of using a detailed chemical profile for sample discrimination, even when sample availability is limited. The present work indicates the need for future studies regarding Brazilian EVOO composition to further contribute to the establishment of compositional patterns that could be useful when looking for biomarkers of origin and quality.

Conflicts of Interest:
The authors declare no conflict of interest.