Elucidation of the Origin of the Monumental Olive Tree of Vouves in Crete, Greece

The olive tree of Vouves in Crete, is considered the oldest producing olive tree in the world with an estimated age exceeding 4000 years. In the present study, we sequenced two samples (from the bottom and the top of the tree) to elucidate the genetic relation of this ancient tree with other olive cvs as well as to gain some insights about its origin. Our results showed that both samples have different genetic origins, proving that this ancient tree has been grafted at least one time. On the basis of whole genome sequences the sample from the top of the Vouves tree showed relation of the same order than half-siblings to one accession corresponding to the present-day Greek cv ‘Mastoidis’. Nevertheless, in the framework of a microsatellite analysis it was found to cluster with the ‘Mastoidis’ samples. The Vouves rootstock (bottom sample) showed a clear grouping with the oleaster samples in a similar way to that of ‘Megaritiki’ Greek cv although it does not show any signal of introgression from them. The genomic analyses did not show a strong relation of this sample with the present-day Greek cvs analyzed in this study so it cannot be proved that it has been used as a source for cultivated olive tree populations represented by available genome sequences. Nevertheless, on the basis of microsatellite analyses, the Vouves rootstock showed affinity with two present-day Greek cvs, one “ancient” rootstock from continental Greece as well as monumental trees from Cyprus. The analysis of the impact of the variants in the gene space revealed an enrichment of genes associated to pathways related with carbohydrate and amino acid metabolism. This is in agreement with what has been found before in the sweep regions related with the process of domestication. The absence of oleaster gene flow, its old age and its variant profile, similar to other cultivated populations, makes it an excellent reference point for domestication studies.


Introduction
Olive (Olea europaea subsp. europaea) is the dominant tree crop in the Mediterranean countries. In fact, over the 90% of the global olive production is realized in this region [1]. The products of this emblematic crop, namely olive oil and table olives are popular in the framework of a healthy lifestyle [2,3]. Even in parts of the world far from its traditional cultivation area, such as Eastern Asia, Australia and America, there is a strong interest in growing olive trees and consuming its nutritional and rich in biological value products. The world annual gross production value exceeds 18.3 billion dollars for 2014, 2015 and 2016 [4] depicting the significant contribution in the economic life of producer countries. Even though that the olive tree has gained a lot of attention for the health benefits of its products as well as for its central role in history and culture, the roots of its domestication have not been unambiguously identified [5]. Even in our days, there is an open debate about whether one major domestication event was realized in the Eastern Mediterranean followed by subsequent dispersion westwards [6] or more than one independent domestication incidents shaped the richness of olive genetic resources available to date [7]. In any case, it is common belief that the wild olive Olea europaea var. sylvestris is the ancestor of the cultivated olive Olea europaea var. europaea [8][9][10]. Interestingly, wild olive trees still survive in some Mediterranean forests [11].
Archaeological studies in Greece have identified olive pollen dating to 7000 BC [12] while they suggest that human was firstly attracted by the olive wood during the third millennium B.C. and that fruit use followed the transformation of olive bush to a tree through pruning [13]. Indeed, early historical references on the use of fruit for its oil content appear much later [14].
The olive tree was considered as sacred in ancient Greece and it was the highest prize for the winner athletes in the Olympic Games [15]. In an attempt to revive and honor this tradition, during the modern Olympic Games held in Greece in 2004, the winners were crowned with an olive wreath from the millennial olive tree of Vouves in Crete, which is considered the oldest producing olive tree in the world (https://en.wikipedia.org/ wiki/Olive_tree_of_Vouves, accessed on 1 August 2021) with an estimated age exceeding 4000 years. The oldest indication for the incidence of olive oil based on analysis of residue on pottery is dated at the 4th millennium in the Gerani Cave in Crete [16]. In the case of Prepalatial Chrysokamino, it was found that olive oil was used to cover wine for preventing its oxidation to vinegar [17]. In another instance, olive oil was noticed in the Early Minoan I site of Aphrodite's Kephali [18]. After a long period of time, Late Bronze Age findings were reported from the cemetery of Armenoi [19] and Pseira [20] in Crete. Taking into consideration all these findings it is concluded that proof for the production and use of olive oil in Crete surely precedes the Late Bronze Age.
The millennial olive tree of Vouves in Crete is considered to consist of a Greek cv resembling 'Mastoidis' grafted onto an unknown ancient rootstock yielding a present-day "monumental" tree. A three-dimensional model of its trunk has been elaborated while the overall physiognomy is discussed [21]. Nevertheless, the tree trunk is presented as a compact object without differentiating between the two parts (top vs. bottom rootstock). In ancient times, olive domestication was based on selection of wild trees with desirable properties and grafting on other trees, as reported by Theophrastus more than 2000 years ago [22].
The study of ancient olive trees is valuable for detecting unknown genetic resources and for shedding light in the historical processes of olive domestication [23]. Previous studies on genotyping of old olive trees revealed ancient cvs and confirmed the selection of cvs [24] as well as rootstocks [25].
The similarity in genetic structure of naturally growing populations with the suckers of old cultivated trees implies that wild trees were used as rootstocks [26]. Pollen archaeological studies suggest that a cultivation process seems to have occurred in the Aegean (Crete)-whether as an independent large-scale management event or as a result of knowledge and/or seedling transfer from the southern Levant around the fourth millennium BC [27].
In the present study, we sequenced the genome of the Olive Tree of Vouves, which is considered the oldest producing olive tree in the world. The upper part of the tree, the scion, producing fruit and the lower part of the tree, the rootstock, providing the roots for supplying water and nutrients, were sequenced separately ( Figure 1). Further, in order to gain an initial understanding into the relative placement of the two Vouves parts within the overall present-day Greek cv diversity landscape, a separate microsatellite (SSR) analysis was conducted. For that, samples were selected so as to span placement across the SSR similarity dendrogram from [36] and analyzed anew. In total 17 samples were genotyped including 12 present-day Greek cvs (alphabetically; Adramytini, Amfissis, Chalkidikis, Gaidourelia, Karydolia, Koroneiki, Mastoidis, Megareitiki, Pierias, Pikrolia, Tragolia and Vasilikada), the two Vouves samples, one "ancient" rootstock from Peloponnese and one Olea europaea subsp. cuspidata (Wall. and G. Don) Cif. sample as an outgroup).  [32]; O. europaea subps. europaea, cv. "Picual" version Oleur0.6.1 [33] and the most recent assembly, O. europaea subps. europaea, cv. "Arberquina" version Oe_Rao [34]. Also, some genotyping-by-sequencing studies are available such as [35].
In the present study, we sequenced the genome of the Olive Tree of Vouves, which is considered the oldest producing olive tree in the world. The upper part of the tree, the scion, producing fruit and the lower part of the tree, the rootstock, providing the roots for supplying water and nutrients, were sequenced separately ( Figure 1). Further, in order to gain an initial understanding into the relative placement of the two Vouves parts within the overall present-day Greek cv diversity landscape, a separate microsatellite (SSR) analysis was conducted. For that, samples were selected so as to span placement across the SSR similarity dendrogram from [36] and analyzed anew. In total 17 samples were genotyped including 12 present-day Greek cvs (alphabetically; Adramytini, Amfissis, Chalkidikis, Gaidourelia, Karydolia, Koroneiki, Mastoidis, Megareitiki, Pierias, Pikrolia, Tragolia and Vasilikada), the two Vouves samples, one "ancient" rootstock from Peloponnese and one Olea europaea subsp. cuspidata (Wall. and G. Don) Cif. sample as an outgroup). Our aims were to (i) verify that the tree is consisted of two different genotypes united through grafting, (ii) reveal the genetic identity of the two genotypes forming this historical and millennial tree, (iii) elucidate their genetic relation with other olive cvs, especially of Greek origin, (iv) characterize their genetic differences in relation to the Olea europaea var. sylvestris reference genome and (v) propose sound hypotheses on the origin of the Vouves monumental olive tree. Our aims were to (i) verify that the tree is consisted of two different genotypes united through grafting, (ii) reveal the genetic identity of the two genotypes forming this historical and millennial tree, (iii) elucidate their genetic relation with other olive cvs, especially of Greek origin, (iv) characterize their genetic differences in relation to the Olea europaea var. sylvestris reference genome and (v) propose sound hypotheses on the origin of the Vouves monumental olive tree.  [33] and reanalyzed in this work. These samples included 'Kalamon', 'Koroneiki', 'Mastoidis', 'Mavreya', 'Megaritiki' and 'Myrtolia'. Results are summarized in Table 1. The levels of heterozygosity were similar between all the samples, ranging from 1.31 heterozygous variants/100 bp of bottom of the Vouves tree to 2.16 of heterozygous variants/100 bp 'Megaritiki'. The high levels of heterozygosity are concordant with other projects in which an olive genome was sequenced such as the 'Farga' (5.4%) [30] and 'Picual' genomes (2.02%) [33] whereupon similar values were determined. In a different work, genotyping of an olive panel using Genotyping-By-Sequencing (GBS) delivered values ranging from 1.28% of a cv called 'Zhonglan' to 6.36% of the Italian cv 'Nociara' [35], including some Greek samples such as 'Koroneiki' with 2.19%. Olive genome heterozygosity is much higher than in other tree crops. For example, the apple (Malus domestica) variety 'Golden Delicious' is considered highly heterozygous reaching values of 0.32 heterozygous variants each 100 bp [37]. Peach (Prunus persica) is another example of a tree crop where the average heterozygosity for cvs and wild relatives are 0.07% and 0.25%, respectively [38]. Avocado trees (Persea americana) have heterozygosity levels of the same order with olive trees. Indeed, estimated heterozygosity of the 'Hass' variety is 1.05% [39].

Origin of the Vouves Monumental Olive Tree in the Context of Olive Domestication
All RNASeq data from NCBI SRA project PRJNA525000 [6] as well as the Whole Genome DNA Resequencing (WGR) data from the SRA project PRJNA556567 [33] were used in an effort to propose sound hypotheses regarding the origins and phylogenomic/ Plants 2021, 10, 2374 5 of 18 phylogenetic relations of the Vouves' olive tree. The first dataset contains 56 samples of wild and cultivated olive trees from 14 different countries across the Mediterranean basin. The second dataset, eventually used to produce Figure 2a, contains 41 different cultivated varieties (Olea europaea subsp. europaea) as well as 10 wild accessions (i.e., a total of 51 taxons), including different subspecies such as laperrinei and guanchica and wild Olea europaea subsp. europaea varieties (Olea europaea subsp. europaea var. sylvestris), syn Olea europaea var. sylvestris (also called oleasters). After the read mapping, variant calling and filtering, 299, 435 biallelic SNPs were obtained for 117 individuals. Subsequently, samples coming from RNASeq and WGR were compared so as to assess if it is feasible to combine data sets produced from two different methodologies (i.e., RNASeq and WGR). It was found that samples clustered by methodology and not by origin or cv ( Figure S1). Consequently, and based on this result, data derived from RNASeq analyses were filtered out, retaining only the WGR data for subsequent analyses. An additional filtering was applied to remove linked variants obtaining a total of 71,040 biallelic SNPs.
The distance tree produced using these variants was employed to construct the phylogenomic NJ tree depicted in Figure 2a. Accession (Olea europaea subsp. laperrinei) termed 'Adjelella10' was employed as an outgroup. In Figure 2a it can be observed that accession 'Gran Canaria' is sister to the outgroup accession as is expected for a different subspecies (Olea europaea subsp. guanchica) although the other guanchica accession, 'Tenerife', is nested with the oleaster accessions (Olea europaea var. sylvestris). Accession 'Dokkar' is also nested with the oleaster accessions. The rest of the accessions are part of the same clade. There are three oleaster accessions nested with the cultivated accessions, 'Croatia' acting as an outgroup of the cultivated accessions and 'Extremadura' and 'Morocco' that are nested with three accessions which originate from southern Spain ('Temprano', 'Zarza' and 'Lechin de Sevilla') and the Algerian accession ('Chemlal De Kabylie').
In the phylogenomic tree it can be seen that the Vouves tree bottom sample (more than 4000-year-old) is external to all the cultivated samples except for 'Megaritiki'. The Vouves top tree sample is sister to the 'Mastoidis' accession and it clusters with other present Greek samples. The Italian samples ('Frantoio', 'Leccino' and 'Grappolo') are a monophyletic group as well as all the Syrian and Iranian accessions. The Spanish samples are divided into four groups. The first one is nested with two oleaster accessions ('Extremadura' and 'Morocco'). The 1 ('Pinonera' and 'Menya') is sister to the Greek accession 'Mavreya'. The third one ('Farga', 'Llumeta' and 'Forastera de Tortosa') are sister to the Israel accession 'Barnea'. The fourth group contains accessions from southern Spain and is sister to the Syrian/Iranian clade. The 'Kalamon' accession is sister to the Turkish accession 'Uslu'. In the UPGMA similarity tree produced with 11 SSR loci ( Figure 2b) it can be seen that 'Vouves bottom' exhibits high similarity with another "ancient" rootstock from the Greek province of Peloponnese. This province is isolated from Crete by sea. Further, the Peloponnese rootstock shares high similarity with present-day Greek cvs "Pikrolia" and "Vasilikada'. This is in full agreement with a subsequent-yet unpublished-populational study involving few hundred olive tree samples from all over Greece genotyped with SSR markers. In this study each Greek cv is represented by a series of newly analyzed independent genotypes (data not shown).
A 299, 435 biallelic SNPs were obtained for 117 individuals. Subsequently, samples coming from RNASeq and WGR were compared so as to assess if it is feasible to combine data sets produced from two different methodologies (i.e., RNASeq and WGR). It was found that samples clustered by methodology and not by origin or cv ( Figure S1). Consequently, and based on this result, data derived from RNASeq analyses were filtered out, retaining only the WGR data for subsequent analyses. An additional filtering was applied to remove linked variants obtaining a total of 71,040 biallelic SNPs.  [36] reprocessed within the framework of present study). Distances are indicated above each tree branch.
The distance tree produced using these variants was employed to construct the phylogenomic NJ tree depicted in Figure 2a. Accession (Olea europaea subsp. laperrinei) termed 'Adjelella10' was employed as an outgroup. In Figure 2a it can be observed that accession 'Gran Canaria' is sister to the outgroup accession as is expected for a different subspecies (Olea europaea subsp. guanchica) although the other guanchica accession, 'Tenerife', is  The topology of the phylogenomic tree is similar to previously published phylogenomic trees [31,33], with Italian and Syrian/Iranian samples as monophyletic groups, Spanish samples grouped in two branches with one of them being a sister group to the Syrian/Iranian groups. Greek samples were also distributed in a similar fashion with 'Kalamon' grouped with the Syrian samples and 'Myrtolia', 'Mastoidis' and 'Koroneiki' comprising a monophyletic group sister to the Italian one. 'Megaritiki' appeared as one of The topology of the phylogenomic tree is similar to previously published phylogenomic trees [31,33], with Italian and Syrian/Iranian samples as monophyletic groups, Spanish samples grouped in two branches with one of them being a sister group to the Syrian/Iranian groups. Greek samples were also distributed in a similar fashion with 'Kalamon' grouped with the Syrian samples and 'Myrtolia', 'Mastoidis' and 'Koroneiki' comprising a monophyletic group sister to the Italian one. 'Megaritiki' appeared as one of the most outer taxa of the cultivated olives similarly to 'Dokkar', so it is possible that this cv has some contribution from the wild olive populations [31,33].
The clustering analysis using STRUCTURE software and DAPC evidenced four clusters as the most probable number ( Figure S2), with two, three and six being the alternative scenarios. The grouping of the different samples using four groups with Structure showed a first group (G1) composed by non-cultivated olives such as O. europaea subsp. laperrinei, O. europaea subsp. guanchica and most of the O. europaea var. sylvestris with the exception of the 'Extremadura', 'Morocco' and the 'Croatia' accessions that cluster in the groups "G2", "G2" and "G4", respectively. 'Dokkar' and the sample of the bottom of the Vouves tree also cluster in the group "G1". Both samples show some component of the group "G4". The second group, "G2", is composed by accessions from southern Spain such as 'Lechin de Sevilla', 'Zarza' and 'Temprano'. In this group, there can also be found accessions with some components of the groups "G1 + G4" such as 'Chemlal Kabile' and "G4 + G3" such as 'Forastera T.'. A third group, "G3", is composed by most of the cvs from Syria such as 'Abou Kanami', 'Mari' and 'Barri'. In this group, there can also be found the Greek sample 'Kalamon' as well as some southern Spain samples such as 'Verdial', 'Ocal' or 'Picudo' as admixture between this "G3" group and the "G2". The fourth group, "G4" is composed mostly by Greek, Italian and Northwest Spanish accessions including the top of Vouves tree sample (Figure 4). The results for the DAPC analysis are similar ( Figure S3). The clustering analysis is also similar to the previously published analysis if the groups G2 and G3 are considered as a single group [33] or G1 and G2 are considered as one group and then G3 and G4 as another one [31]. A recent work with a wider sampling of wild accessions evidenced two separate groups; one comprising O. europaea var. sylvestris and another comprising O. europaea subsp. guanchica [40] in contrast to the present work where only one group is detected. This could evidence one of the limitations of the sampling of the present study associated with non-representative number of accessions for some groups, such as guanchica. Nevertheless, the sylvestris accessions appears as a group separate from the group of cvs giving enough resolution to distinguish both groups even if this study did not sample hundreds of accessions. a first group (G1) composed by non-cultivated olives such as O. europaea subsp. la O. europaea subsp. guanchica and most of the O. europaea var. sylvestris with the ex of the 'Extremadura', 'Morocco' and the 'Croatia' accessions that cluster in the "G2", "G2" and "G4", respectively. 'Dokkar' and the sample of the bottom of the tree also cluster in the group "G1". Both samples show some component of the "G4". The second group, "G2", is composed by accessions from southern Spain 'Lechin de Sevilla', 'Zarza' and 'Temprano'. In this group, there can also be foun sions with some components of the groups "G1 + G4" such as 'Chemlal Kabile' an G3" such as 'Forastera T.'. A third group, "G3", is composed by most of the cvs fro such as 'Abou Kanami', 'Mari' and 'Barri'. In this group, there can also be found th sample 'Kalamon' as well as some southern Spain samples such as 'Verdial', 'O 'Picudo' as admixture between this "G3" group and the "G2". The fourth group, composed mostly by Greek, Italian and Northwest Spanish accessions including of Vouves tree sample (Figure 4). The results for the DAPC analysis are similar S3). The clustering analysis is also similar to the previously published analysi groups G2 and G3 are considered as a single group [33] or G1 and G2 are consid one group and then G3 and G4 as another one [31]. A recent work with a wider sa of wild accessions evidenced two separate groups; one comprising O. europaea v vestris and another comprising O. europaea subsp. guanchica [40] in contrast to the work where only one group is detected. This could evidence one of the limitation sampling of the present study associated with non-representative number of acc for some groups, such as guanchica. Nevertheless, the sylvestris accessions appe group separate from the group of cvs giving enough resolution to distinguish both even if this study did not sample hundreds of accessions.  Examining the composition of the individuals in admixture analysis it appears possible that some introgression and gene flow has taken place between the wild and some cultivated accessions such as "Dokkar". Consequently, and in order to test this possibility, an ABBA-BABA analysis was performed using the Italian varieties as sister group, the wild accessions "Minorca", "Palma del Rio" and "Jaen. The Olea europaea subsp. laperrinei was used as outgroup. The results evidenced that three cvs ("Chemlal Kabil", "Megaritiki" and "Dokkar") experienced introgression from wild olive trees: (Table S1). This is in agreement with the position of these three cvs in the phylogenetic tree (Figure 2a). The Vouves tree The relation between the top of the Vouves tree sample and the Greek accession 'Mastoidis' was tested calculating the relatedness Ajk statistic between all the samples ( Figure S4). Individuals with themselves will have values of 1 or higher, individuals in the same population will have values close to 0 and unrelated individuals will have negative values. The top of the Vouves tree sample has an Ajk statistics value of 0.53 with the 'Mastoidis' accession indicating a relation of the same order than half-siblings, and verifying that the Vouves tree was grafted with a present-day cv. This is in full agreement with present SSR analysis whereupon Vouves top sample fell within the 'Mastoidis' cluster ( Figure 2b Greek cvs whose genomes were available for inclusion in the present study (Table 1; 'Kalamon', 'Koroneiki', 'Mastoidis', 'Mavreya', 'Megaritiki', 'Myrtolia') didn't show any high Ajk values with the Vouves bottom sample so it cannot be documented that the original monumental tree has a special role in the development of these present-day Greek varieties. Nevertheless, previous SSR studies showed that the Vouves tree is genetically related with other "monumental" trees from the 'Sotira' area in Cyprus [24] in addition to an "ancient" tree from Peloponnese)-a region in continental Greece (present study). Both localities are geographically isolated from Crete, by sea, the first by appr. 700 Km and the second by appr. 300 Km. Similar to the Vouves bottom sample, the Sotira area "monumental" samples appear genetically remote by comparison to present-day local, Cypriot cvs. Specifically, [25] had performed Maximum Likelihood (ML) clustering analysis, employing 17 SSR loci data, of 51 old rootstocks (also dubbed 'living fossils' or 'centennial olive germplasm') and 12 present-day cvs from Cyprus. They showed that the 'Vouves tree' ('Vouves bottom' of present study) is genetically related to other monumental trees of the Sotira area in Cyprus (the two islands of Cyprus and Crete are 700 Km apart). The same authors subsequently performed coalescent modelling employing the same data as for ML. Similar to what is found for 'Vouves bottom' in the present study, [25] concluded that "most of the rootstocks were positioned externally to the core of the olive entries, thus underlining their lack of genetic affinity, but without ruling out the possible contribution to the establishment of the current cultivars".
Overall, we believe that we have shown that the bottom of the Vouves tree is a wellsupported separate branch with no gene flow from the sylvestries trees indicating that probably its cultivation was a separate event and that, at another level, wild cultivars from the eastern cluster together with those from the western Mediterranean basin.
The analysis of the Vouves tree can bring forth some interesting points about the date of the early diversification of the East/West cultivated populations. The 'Farga' taxon represents an ancient branch of domesticated olive trees dated between 300 and 1000 years old [41]. The phylogenetic tree shows a clear divergency of some lineages (e.g., Italian or some Greek popular accessions) from the 'Farga' lineage indicating that the ancestor of these cvs could have existed more than a thousand years ago. The estimated age of diversification between Eastern and Western cultivated populations is dated around 6000 years ago [5]. The position of the bottom of the Vouves tree sample, dated more than 4000 years ago could indicate a later diversification (appears as an outgroup for the cultivated olives without any apparent gene flow with the oleaster populations) although more ancient monumental trees should be studied before any solid conclusion is drown.
Taking together the conclusions of [25,41] and of the present study it could be proposed that, in the Mediterranean Basin, there existed an ancient olive tree common genetic pool which is only partly represented in few present-day cvs.

Gene Space Variation in the Vouves Monumental Olive Tree
The genome resequencing allows for analysis of possible changes occurred in the gene space of a genome. Most of the variants between the reference genome (Oe451) and the Vouves tree samples were intergenic variants (55.30% and 46.53% of the annotated variants for bottom and top samples, respectively), followed by 5 Kb downstream (14.90% and 12.98%) and upstream gene variants (20.63% and 16.81%). Intron variants accounted for 5.12% and 4.15% of the total variants in both datasets. The impact of the variants in the Vouves samples was compared with other Greek genotypes ( Table 2). Between 36.73% (Vouves Top) and 56.36% ('Megaritiki') of the genes presented at least one variant with high impact. Most of these types of variants are associated with frameshift or gaining of a stop codon (Table 3).  In the present study, 11,553 genes with high impact variants were shared between all the Greek accessions and the Vouves tree samples. Indeed, 13,269 and 12,909 genes with high impact variants were shared between the Greek accessions and the bottom and the top Vouves tree sample, respectively, meanwhile 14,984 genes were shared between both Vouves samples ( Figure 5). To further understand the processes in which these genes may be involved, a Gene Set Enrichment Analysis (GSEA) was performed in the different groups. The first group, formed by 4859 genes with high impact variants in the Vouves bottom tree sample, presented an enrichment in eight terms: "IMP Salvage" (GO: 0032264), "Heme Oxidation" (GO: 0006788), "rRNA processing" (GO: 0006364), "cellulose biosynthetic process" (GO: 0030244), "Glycolytic process" (GO: 0006096), "Telomere maintenance" (GO: 0000723), "DNA repair" (GO: 0006281) and "Carbohydrate metabolic process" (GO: 0005975) ( Figure 6A). The second group had 2,278 genes with high impact variants for the Vouves top tree sample. It presented enrichment in the following terms "Regulation of proton transport" (GO: 0010155), "Mitochondrial electron transport ubiquinol to cytochrome c" (GO: 0006122), "Terpenoid biosynthetic process" (GO: 0016114), "Response to metal ion" (GO: 0010038), "Biosynthetic process" (GO: 0009058), "L-phenylalanine catabolic process" (GO: 0006559), "tyrosine metabolic process" (GO: 0006570), "tyrosine biosynthetic process" (GO: 0006571), "Metabolic process" (GO: 0008152), "Double-strand break repair via homologous recombination" (GO: 0000724), "Anaphase-promoting complex dependent catabolic process" (GO: 0031145), "Dolichol-linked oligosaccharide biosynthetic process" (GO: 0006488), "Phytochelatin biosynthetic process" (GO:0046938) and "Photosynthesis light harvesting" (GO: 0009765) ( Figure 6B). 1  14,752  5954  5790  2860  17,086  3603  Koroneiki  2  16,355  6503  6204  3275  17,037  4003  Mastoidis  3  14,896  5798  5570  2757  16,170  3593  Mavreya  5  15,829  6318  6071  3111  17,307  3736  Megaritiki  3  16,924  6956  6629  3386  19,322  4199  Myrtolia  1  14,565  5789  5451  2747  16,332  3427 In the present study, 11,553 genes with high impact variants were shared between all the Greek accessions and the Vouves tree samples. Indeed, 13,269 and 12,909 genes with high impact variants were shared between the Greek accessions and the bottom and the top Vouves tree sample, respectively, meanwhile 14,984 genes were shared between both Vouves samples ( Figure 5). To further understand the processes in which these genes may be involved, a Gene Set Enrichment Analysis (GSEA) was performed in the different groups. The first group, formed by 4,859 genes with high impact variants in the Vouves bottom tree sample, presented an enrichment in eight terms: "IMP Salvage" (GO: 0032264), "Heme Oxidation" (GO: 0006788), "rRNA processing" (GO: 0006364), "cellulose biosynthetic process" (GO: 0030244), "Glycolytic process" (GO: 0006096), "Telomere maintenance" (GO: 0000723), "DNA repair" (GO: 0006281) and "Carbohydrate metabolic process" (GO: 0005975) ( Figure 6A). The second group had 2,278 genes with high impact variants for the Vouves top tree sample. It presented enrichment in the following terms "Regulation of proton transport" (GO: 0010155), "Mitochondrial electron transport ubiquinol to cytochrome c" (GO: 0006122), "Terpenoid biosynthetic process" (GO: 0016114), "Response to metal ion" (GO: 0010038), "Biosynthetic process" (GO: 0009058), "L-phenylalanine catabolic process" (GO: 0006559), "tyrosine metabolic process" (GO: 0006570), "tyrosine biosynthetic process" (GO: 0006571), "Metabolic process" (GO: 0008152), "Doublestrand break repair via homologous recombination" (GO: 0000724), "Anaphase-promoting complex dependent catabolic process" (GO: 0031145), "Dolichol-linked oligosaccharide biosynthetic process" (GO: 0006488), "Phytochelatin biosynthetic process" (GO:0046938) and "Photosynthesis light harvesting" (GO: 0009765) ( Figure 6B).  This group presented a GO enrichment for the terms "valyl-tRNA aminoacylation" (GO: 0006438), "ribosomal large subunit biogenesis" (GO: 0042273), "ribosomal large subunit export from nucleus" (GO: 0000055), "dimethylallyl diphosphate biosynthetic process" (GO: 0050992), "DNA integration" (GO: 0015074), "proteolysis" (GO: 0006508), "DNA topological change" (GO: 0006265), "DNA repair" (GO: 0006281), "photosynthetic electron transport chain" (GO: 0009767), "photosynthetic electron transport in photosystem II" (GO: 0009772), "methylation" (GO: 0032259), "transcription DNA-templated" (GO: 0006351), "isopentenyl diphosphate biosynthetic process methylerythritol 4-phosphate pathway" (GO: 0019288), "transcription initiation from RNA polymerase III promoter" (GO: 0006384), "recognition of pollen" (GO: 0048544), "retrograde transport endosome to Golgi" (GO: 0042147), "protein N-linked glycosylation via asparagine" (GO: 0018279), "protein phosphorylation" (GO: 0006468) and "telomere maintenance" (GO: 0000723) ( Figure 6D). The comparison of the Gene Ontology Terms of the genes with high impact variants between the Greek samples and the Vouves bottom and top tree didn't show any term related with fatty acid metabolism or accumulation. This comparison didn't reveal other terms related with adaptation to biotic or abiotic stresses. Nevertheless, it is interesting that pathways related with carbohydrate and amino acid metabolism were found, in the same way that has been found in the sweep regions related with the process of domestication [31]. An important proportion of these genes are related with the cell wall biosynthesis (e.g., "cellulose synthase 1" or "heteroglycan glucosidase 1"). The impact of these variations in the phenotype of the different varieties is uncertain. These genes are part of extensive gene families (e.g., 42 genes were annotated as "cellulose synthase" in the reference genome used) while mutations in some of the copies can be a part of the process of non-functionalization of multiple gene copies that may be risen by duplications. The study of the evolution of the cellulose synthase revealed that some of them are indeed pseudogenes [42] so it is expected to find high impact variants on those. It is also interesting to find a high number of genes with HI variants associated with DNA repair. Most of these genes have the descriptor "PIF1 helicase". PIF1 DNA helicases are important players in the stability of the genome [43], but their multiple copies and their redundancy has limited the number of functional studies associated to this gene family. The olive genome has annotated 103 genes under the descriptor of "PIF1 helicase". Indeed, 9, 2, 7 PIF1 Helicase genes presented high impact variations in the Vouves bottom and top samples and the other Greek accessions, respectively. One possibility of this elevated number of PIF1 genes as well as the high impact variations on those may be related with the fact that Helitron HelRel proteins have an S1 helicase domain similar to the PIF1 helicase [44]. This may lead to the misidentification of these helitrons as PIF1 helicases.

Conclusions
The olive tree of Vouves, which is considered to be one of the oldest olive trees in the world, was studied to reveal its genetic relations with present-day cultivated and wild genotypes and to shed light into potential origin or routes of domestication of olive. The top of the Vouves tree sample has an Ajk statistics value of 0.53 with the 'Mastoidis' accession indicating a relation of the same order than half-siblings, and verifying that the Vouves tree was grafted with a present-day cv. The Vouves rootstock (bottom sample) showed a clear grouping with the oleaster samples in a similar way that the 'Megaritiki' Greek cv. An ABBA-BABA analysis showed that it doesn't have any introgression from the oleaster population. The Greek cvs didn't show any high Ajk values with the Vouves bottom sample so it cannot be documented that the original monumental tree has a special role in the development of some important present-day Greek varieties. Nevertheless, previous results showed that the Vouves tree is genetically related with other monumental trees of the Sotira area in Cyprus. An ongoing survey on centennial olive trees allover Greece and the comparative analysis of resequencing data with other projects will shed more light on the history of olive growing and potentially reveal ancient varieties to be reserved as natural heritage and genotypes with desirable traits to be exploited as invaluable pre-breeding material.

Plant Samples and DNA Extraction
Fresh healthy leaves were collected during July 2018 separately (i) from the highest part of the Vouves monumental olive tree (referred as 'top') and (ii) from the lowest part of the tree at the base of the trunk (referred as 'bottom') and were immediately transferred in a cool container to the laboratory for DNA extraction. For all analysis described herein, total genomic DNA was isolated from leaf material using the DNeasy Plant Mini kit (Qiagen cat. No 69104, Düsseldorf, Germany) according to manufacturer's instructions. The hypothesis that the tree is consisted of two genotypes united through grafting was tested.

DNA Sequencing and SSR Analysis
1 ug of DNA for each of the Vouves monumental olive tree (top and bottom) were sent to Novogene Co. Ltd. (Sacramento, CA, USA, https://en.novogene.com/, accessed on 25 June 2018) for whole genome sequencing. The libraries were prepared following the standard Illumina Nextera ® protocol for insert sizes~350 bp. Then, they were sequenced with an Illumina Novaseq6000 platform, paired-end 150 bp. SSR analysis followed [45] with the inclusion, in the SSR set, of two new loci while omitting one, previously used. Total SSR loci presently employed were 11 (DCA3, DCA5, DCA9, DCA14, DCA16, DCA18, EMO90, GAPU71B, GAPU101, GAPU103A, UDO99). Allelic data were used to construct a UPGMA similarity dendrogram based on Jaccard distances, similarly to [36].

Origin Analysis
RNA-Seq reads representing 55 olive accessions from 14 different countries were downloaded from GenBank SRA (ProjectID PRJNA525000 [6] as well as 50 olive accessions Whole Genome DNA Resequencing (WGR) data from the SRA project PRJNA556567 [33]. Reads were processed with Fastq-mcf v1.04.676 from the Ea-utils package [46] and them mapped to the reference genome Olea europaea var. sylvestris genome reference v1.0 [32] using Hisat2 v2.1.0 [53]. Variants were called using Freebayes v1.3.1-16-g85d7bfc [50] with a minimum read coverage of 5, a minimum mapping quality of 20 and discarding complex variants. The VCF file was filtered using Vcftools v0.1.15 [51] removing the variants that were not present in all the samples and keeping only biallelic Single Nucleotide Polymorphisms (SNPs). The VCF file was upload in RStudio v1.1.463 running R v3.5.1 using Adegenet v2.1.1 [54] and Poppr v2.8.3 [55] packages. A distance matrix between all the samples' SNP was calculated with the function dist() with the default parameters. A distance tree was calculated with the function aboot() function from the Poppr package using Nei distance, NJ tree and 1000 samples. A Principal Components Analysis (PCA) was performed using the prcomp() function from the Stats R core package with the default parameters and it was plotted with the Ggplot2 v3.2.0 package. The DAPC analysis was performed with the function dapc() from the Adegenet package.
Population structure was inferred using two alternative procedures: (1) a Bayesian, model-based algorithm employed through STRUCTURE software (release: V2.3.4, July 2012) [56] and (2) Discriminant Analysis of Principal Components [57] which produces genetic clusters using a few "synthetic" variables constructed as linear combinations of the original variables (alleles). These alleles are in turn selected as having the largest between-group variance and the smallest within-group variance.
The ABBA-BABA analysis was performed using the same VCF file that was described before. The VCF files was converted to the EIGENSTRAT format with the script convertVCFtoEigenstrat.sh from Joanam at Github (https://github.com/joanam/scripts/ blob/master/convertVCFtoEigenstrat.sh/, accessed on 8 September 2021). The "admixr" R package v0.9.1 was used to perform the ABBA-BABA analysis. In summary, the EIGEN-STRAT files were uploaded in R with the eigenstrat() function. The sister group to the targets was the Italian cvs ("Frantoio", "Grappolo", "Leccino) being a monophyletic branch in the Figure 2. The wild accessions were the O. europaea var. sylvestris accessions: "Mi-