Hidden Genetic Variability, Can the Olive Moth Prays oleae (Lepidoptera: Yponomeutidae or Praydidae?) be a Species’ Complex?

Prays oleae is the second most important pest in Mediterranean olive groves, causing substantial damage on olive production. We used mitochondrial [cytochrome c oxidase subunit I (COI), and NADH dehydrogenase subunit 5 (nad5)] and nuclear [ribosomal protein S5 (RpS5)] amplicons to assess the population variability in five main olive producing regions from Tunisia, to support or dismiss the existence of two non-monophyletic groups within the species, as found within Portugal. Our phylogenetic analysis with cytochrome c oxidase subunit I (COI) indeed displayed two distinct and well-supported clades of P. oleae, which were corroborated by the haplotype network reconstructed with both mitochondrial and nuclear amplicons. We were also able to dismiss the hypothesis that one of the clades would not develop on olive fruits. No correlation was observed between clades differentiation and geographic distribution. The existence of cryptic species can impact on the management of agroecosystems and on the perception of how these moths responds to environmental changes.


Introduction
Global crop losses by insects are estimated to be 13% per annum despite the usage of multiple pesticides [1]. The insect order Lepidoptera includes many crop pests [2]; generally larval stages are crop destroyers that include defoliators, shoot/root borers, and seed predators causing significant agricultural losses. Some Lepidoptera species are known to cause damage to the olive trees, the major agro-ecosystem in the Mediterranean Basin, being the most significant in terms of impact the olive moth, Prays oleae (Bernard, 1978) (Lepidoptera, Yponomeutidae or Praydidae) [3,4].
The population dynamics of this moth is intrinsically dependent on the host-plant characteristics and development as its three yearly larval generations depend on the olive tree: (i) the phylophagous generation feeds on leaves; (ii) the anthophagous generation feeds on olive tree flowers and develops during the plant blooming; and (iii) the carpophagous generation feeds on olive fruits. The dietary preferences of adults are, however, poorly known but likely they feed on floral nectar and on a variety of other liquids similarly to most Lepidopteran adults [5]. Such a close and intricate connection between the olive moth and the olive tree should be reflected in the olive moth population structure and its co-evolutionary history. This co-history is already known and accepted for Bactrocera oleae (Diptera, Tephritidae) [6] whose larvae are monophagous, feeding exclusively on the tissue of olive fruits.
Recently, Nobre and co-workers [4] have questioned the species status of P. oleae as the reconstructed phylogeny based on the available data resolves this species as non-monophyletic. Moreover, the same study suggested the co-existence of two sympatric evolutionary lineages of morphologically cryptic olive moth populations. These two lineages overlapped geographically throughout the extensive sampling in Portugal [4]. Given this scenario, a local diversification could be hypothesized, particularly because the Iberian Peninsula is known as one of the most important Pleistocene glacial refugia in Europe. This claim is well supported by several lines of evidence, also for Lepidoptera species (e.g. two genetic lineages of Aglaope infausta with a likely differentiation center in Iberia [7]; the phylogeography of Melitaea cinxia shows the importance of the Iberian refugia in current structure [8], several refugia in the Iberian Peninsula have been inferred for the protected species Graellsia isabellae and its recognized plant host [9]). However, the currently available data on Prays oleae COI collected outside Portugal (one from Spain and three from Tunisia) suggest a similar pattern on the other side of the Mediterranean Sea [4].
For Bactrocera oleae, Segura and co-workers [10] found that the most southerly of the Mediterranean populations sampled (Tunisia) differed significantly from the remaining populations. However, Nardi and co-workers [6] suggested that those divergent Tunisian samples might in fact belong to a cluster of olive fruit flies in Central/Western Mediterranean area. More recently, Iberian and Italian B. oleae populations were shown to clearly split, giving rise to the existence of at least three well separated Mediterranean Basin populations: the Western Mediterranean, the Italian (including Greece and Western Turkey) and the Eastern Mediterranean clusters [11].
These findings triggered the present work [4,6,11]: we have sampled P. oleae in five important olive grove regions in Tunisia and proceed similarly with the previous approach done in Portugal [4], to search for population diversity and co-presence of the previously identified lineages.

Specimen Collection
Seventy-nine P. oleae specimens from Tunisia were sampled at five localities (Bouficha, Chaffar, Hajeb, Sidi Bouali, and Zarzis) and three specimens of P. oleae from Greece (that have emerged from olives collected in Crete island in 2019) ( Figure 1). Tunisian specimens were sampled using commercial traps with specific pheromones (Biosani) during the year 2017.

DNA Extraction, PCR and Sequencing
Adult specimens were stored at −20 °C in 96% ethanol until DNA extraction. DNA extraction was performed from whole specimen body following extraction protocol described in [4]. DNA was eluted in 50 µl of sterile ultra-pure water and stored at −20 °C for posteriorly utilization in PCR reactions. Partial cytochrome c oxidase subunit I (COI), NADH dehydrogenase subunit 5 (nad5) and ribosomal protein S5 (RpS5) were amplified using primers: 1) LCO1490 (5′-GGT CAA CAA ATC ATA AAG ATA TTG G -3' and HCO2198 (5′-TAA ACT TCA GGG TGA CCA AAA AAT CA -3′) for
The assembly and editing of sequences were performed using GeneStudio program. Sequences were aligned with the Muscle algorithm in MEGA X [15] and were organized as haplotypes in DnaSP6 [16]. All haplotypes obtained in this study were submitted to the GenBank database (GenBank accession numbers (Supplementary Table S1): RPS5 (from MT096181 to MT096259), COI (from MT096260 to MT096341) and nad5 (from MT106246 to MT106324).

Phylogenetic Analysis
COI haplotypes observed in present study were combined with those available for Prays species on GenBank, and we performed a Bayesian phylogenetic reconstruction of Prays oleae using COI sequences in BEAST version v.4.2.8 [17]. Gamma site Model with 4 categories, and rate frequencies were estimated, other settings were kept as default (10 000 000 generations). BEAST results were analysed by Tracer v.1.6. Consensus tree was obtained using TreeAnnotator, first 10% trees were removed. To check phylogenetic reconstruction congruence, we performed Maximum Likelihood and Neighbor-Joining methods as implemented in MEGA X [15].

Variability and Population Structure
Mitochondrial and nuclear sequences (COI, nad5 and RpS5) variability analysis was performed on DnaSP6 [16]. Genetic diversity (haplotype diversity [Hd] and nucleotide diversity [Pi]) was estimated, synonymous and non-synonymous sites were analysed together (as separation would imply a too low number of sites to yield reliable results [18]). Tajima's D statistics was used to compare pairwise differences with the number of segregating sites [19]. ZnS statistics (the r2 squared allele frequency correlation [20]) was used to test linkage disequilibrium in all sampled fragments, based on the parsimony informational sites. The statistical support for the Zns and D of Tajima was evaluated by coalescent simulations with 10000 replicates in DnaSP6 [16], considering all segregating sites (α = 0.05). Analyses were performed for the three amplicons independently and concatenated. The reconstructed haplotype networks, both concatenated and in the three separate regions, were used to visualize the relationships among the sequences and were built using the TCS network (95% connection limit) in PopART [21].

Results
All Tunisian specimens were sequenced for the COI, nad5 and RpS5 amplicons. The specimens from Greece were only used for phylogenetic reconstruction and were only sequenced at the COI amplicon. Phylogenetic reconstruction of the Prays genus based on the cytochrome oxidase region (Figure 2) showed that Prays oleae has a non-monophyletic group (the Maximum-likelihood -Supplementary Figure S1-and Neighbor-joining -Supplementary Figure S2-methods corroborated this typology).
Nuclear and mitochondrial DNA amplicons variability of Tunisian P. oleae group (COI, nad5 and RpS5) is shown in Table 1.    Considering the complete dataset together, linkage disequilibrium was not detected, and Tajima statistics were not significant for all amplicons, suggesting that these DNA sequences evolved randomly ('neutrality') ( Table 2). By analyzing clade 1 and clade 2 separately, Tajima D statistics was significant for both mitochondrial markers (Table 2). Tunisian P. oleae haplotype network (Figure 3a,b) consistently shows two groups separated by 11 mutational steps, for the haplotype network of both mtDNA amplicons, unlikely if the specimens represent a single species. The two lineages signal is faded when looking at the protein-coding nuclear gene region RpS5 amplicon (Figure 3c). No relation with sampling location was found (Supplementary Figure S3).

Discussion
Prays oleae reconstructed phylogeny resulted in two separate lineages with specimens forming a clade (clade 2) with Prays fraxinella, and a sister group clade (clade 1) with only P. oleae samples; this corroborates previous findings where we also found this distinct mitochondrial subdivision in the partial COI gene fragment [4]. The question on whether we are dealing with a single species or a group of cryptic species thus remains. On the raised question of whether these two differentiated lineages do have different ecological niches [4], we can now add to the discussion that we have observed specimens belonging to both clades emerging from olive fruits.
The fact that individuals of both lineages of putative P. oleae emerge from olive fruits is not surprising, but just a confirmation, as specimens of both studies were captured in olive groves. Several examples of phylogenetically related species complexes, hybrid specimen's or cryptic species that present similar behavior and have no apparent phenotypic differences are reported [22][23][24][25][26]. The resource of molecular identification through species-specific markers help identify a given species quickly with a higher degree of accuracy (e.g. Barcode of Life) [27][28][29][30]. DNA barcoding is an extremely powerful tool and it was long established that the mitochondrial gene cytochrome c oxidase I (COI) is the core of the global identification system for animals [27]. However, the use of a mitochondrial amplicon only has also inherent limitations, including on hybrid identification for example [31,32]. Also, in phylogeography, mitochondrial DNA (mtDNA) has been extensively used due to its fast substitution rate, lack of recombination, small effective population size resulting in fast lineage sorting, and high sensitivity to demographic events (e.g. [33]). Even though the sole use of mtDNA reveals only a small part of the evolutionary history of a species, it has provided valuable phylogeographic data (e.g. [11,29,34,35]).

Discussion
Prays oleae reconstructed phylogeny resulted in two separate lineages with specimens forming a clade (clade 2) with Prays fraxinella, and a sister group clade (clade 1) with only P. oleae samples; this corroborates previous findings where we also found this distinct mitochondrial subdivision in the partial COI gene fragment [4]. The question on whether we are dealing with a single species or a group of cryptic species thus remains. On the raised question of whether these two differentiated lineages do have different ecological niches [4], we can now add to the discussion that we have observed specimens belonging to both clades emerging from olive fruits.
The fact that individuals of both lineages of putative P. oleae emerge from olive fruits is not surprising, but just a confirmation, as specimens of both studies were captured in olive groves. Several examples of phylogenetically related species complexes, hybrid specimen's or cryptic species that present similar behavior and have no apparent phenotypic differences are reported [22][23][24][25][26]. The resource of molecular identification through species-specific markers help identify a given species quickly with a higher degree of accuracy (e.g. Barcode of Life) [27][28][29][30]. DNA barcoding is an extremely powerful tool and it was long established that the mitochondrial gene cytochrome c oxidase I (COI) is the core of the global identification system for animals [27]. However, the use of a mitochondrial amplicon only has also inherent limitations, including on hybrid identification for example [31,32]. Also, in phylogeography, mitochondrial DNA (mtDNA) has been extensively used due to its fast substitution rate, lack of recombination, small effective population size resulting in fast lineage sorting, and high sensitivity to demographic events (e.g. [33]). Even though the sole use of mtDNA reveals only a small part of the evolutionary history of a species, it has provided valuable phylogeographic data (e.g. [11,29,34,35]).
In the present study, the reconstruction of COI phylogeny supports the existence of two mitochondrial lineages in P. oleae (clade 1 and clade 2) coexisting in Tunisian and Portuguese olive groves. These data suggest the existence of a Central/Western Mediterranean olive moth group and not of a North-South Mediterranean differentiation. The only three available specimens from the Greek population belong to Prays clade 2, asking for further analyses to understand if a putative Central/Western-Eastern differentiation boundary exists and where it lies, and whether the Italian cluster (including Greece and Western Turkey) identified for B. oleae [11] is represented also in the P. oleae population structure.
The haplotype network shows the two groups separated by several mutational steps, suggesting that specimens sampled unlikely represent a single species [36], given the strong structure among populations. The haplotype network of both mtDNA amplicons suggests the hypothesis that P. oleae may comprise more than one species. The nuclear region used, RpS5, likely evolves too slow to be able to discriminate the two clades signature. To be able to address the central question on whether this sub-division is uniquely mitochondrial or if it is also present at the nuclear level, several nuclear encoded marker(s) should be analysed. What seems clear from the mtDNA analyses is that there is a clear differentiation into two groups and ultimately these can correspond to two separate cryptic species [8,34,37]. Several studies demonstrate the effectiveness of using the COI gene fragment to discriminate known species and signal new ones, alone or in association with other genes fragments [26,30,34,37]. Despite the concerns of using this region alone [35], several studies show that it can reveal the existence of taxa with low divergence rate or recent radiation [26].
Prays oleae, although one of the most relevant pest species of olive groves, has been gained no or little attention from researchers dealing with population dynamics and structure. Our present results thus provide new insights, as they expand the geographic span of the available data and corroborate the existence of two different olive moth lineages highly differentiated and reconstructed as non-monophyletic on the basis of COI amplicon.

Conclusions
The Prays oleae paradigm will only be disentangled with a well-designed phylogenetic study comprising geographical meaningful samples of the two P. oleae lineages and P. fraxinella. From an agronomic perspective, the existence of cryptic pest species can have a high impact on agroecosystem management including in the perception of pest species reply to drastic environmental changes. From an evolutionary perspective, the impact of understanding population structure and eventual speciation is even higher by the fact that Prays belongs to the Yponomeutoidea superfamily which is thought to be one of the oldest among extant Ditrysian Lepidoptera [38].
Supplementary Materials: The following are available online at http://www.mdpi.com/2075-4450/11/4/204/s1, Table S1. Data on the specimen and sampling coordinates of Prays oleae collected in Tunisia and Greece. GenBank accession numbers are given per sample; Figure S1. Maximum likelihood inference for Prays oleae samples using Tamura-Nei model. Nodes values are bootstrap statistic. The tree with the highest log likelihood is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. Evolutionary analyses were conducted in MEGA X. h = haplotype, Grec = Greece, Pt = Portugal, Sp = Spain, Tun = Tunisia; Figure S2: Neighbor joining inference for Prays oleae samples. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (10,000 replicates) are shown next to the branches. All positions containing gaps and missing data were eliminated (complete deletion). Evolutionary analyses were conducted in MEGA X. h = haplotype, Grec = Greece, Pt = Portugal, Sp = Spain, Tun = Tunisia. Figure S3. TCS haplotype network based on COI (a), nad5 (b) and RpS5 (c) amplicons of Prays oleae (circles, scaled to relative frequency of each haplotype in the data set); Note the absence of correlation between sampling locality and group (clade1 or clade 2; Figure 3). Funding: This research was funded by Program Alentejo 2020; project 'A Protecção Integrada do olival alentejano. Contributos para a sua inovação e melhoria contra os seus inimigos-chave' Reference: ALT20-03-0145-FEDER-000029. TN was supported by PTDC/ASP-PLA/30650/2017 ("Fundação para a Ciência e Tecnologia", FCT Portugal). This work is funded by National Funds through FCT-Foundation for Science and Technology under the Project UIDB/05183/2020.

Conflicts of Interest:
The authors declare no conflict of interest.