Molecular Characterization of Mitogenome of Cacopsylla picta and Cacopsylla melanoneura , Two Vector Species of ‘ Candidatus Phytoplasma mali’

: The mitochondrial genomes of two vector psyllids of the ‘ Candidatus Phytoplasma mali’, Cacopsylla picta and C. melanoneura , were sequenced using high-throughput sequencing on the Illumina platform. The main objective of the study was to describe their mitogenome and characterize their genetic variability and the potential changes in the context of the observed global warming. The four complete sequences for C. picta , 14,801 bp and 14,802 bp in length, two complete and one partial sequence for C. melanoneura , ranging from 14,879 bp to 14,881 bp in length, were obtained for the ﬁrst time for these European apple psyllids. The detected intraspecies mtDNA identity was highly similar (99.85–99.98%), the identity’s similarity with other Cacopsylla species varied between 79.79 and 86.64%. The mitogenomes showed a typical mitochondrial DNA structure with 13 protein-coding genes, 2 rRNA genes and 22 tRNA genes; the presence of CGGA motif in the ND1-trnS2 junction was detected in both species. Phylogenetic analysis placed both species in close relationship with C. burckhardti within the Cacopsylla clade-I O group. The analysis of complete mitogenomes and of partial COI sequences of ﬁfty-two Cacopsylla individuals showed a high homogeneity of genotypes over 15 years and among the different localities in the Czech Republic.


Introduction
Psyllids, or jumping plant lice, are phytophagous insects that comprise about 4000 species with a worldwide distribution, the majority of which are found in tropical and subtropical areas, but only 100 which are considered to be economically important pests in agriculture or forestry. They are hemimetabolous phloem feeders with stylets that penetrate intercellularly into phloem sieve elements [1][2][3]. Some species of this group are vectors of serious bacterial pathogens. Among them, one of the most important are phytoplasmas ('Candidatus Phytoplasma sp.'), which are the causal agents of serious diseases of more than 1000, mainly dicotyledonous, herbaceous plants and trees [4,5].

DNA Isolation
Individual adult psyllids were homogenized in ATL buffer in 1.5 mL microtube using a micropestle homogenizer and total DNA was isolated using the QIAmp DNA Blood mini kit (Qiagen, Hilden, Germany, Cat. No. 51106) and eluted in 100 µL of deionized water. Purified DNA was stored at a temperature of −20 °C until used in HTS or PCR assays.

PCR Amplification and Sanger Sequencing
For the species confirmation and to study of the variability of the larger sample set, the partial COI gene was amplified using the universal primer pair CPF4 (5′-TAA-GAACTAACCATAAGATTATCGG-3′) and CPR4 (5′-CACTTCAGGGTGTCCAAA-GAATC-3′) according to the work of Kang et al. [24]. The PCR reaction mix consisted of MyTaq polymerase buffer (1×), MyTaq DNA polymerase (1U) (both Bioline, London, UK, Cat. No. BIO-21105) and 0.2 mM of each primer, and 2 µL of isolated DNA in a concentration of 50-200 ng, in total volume of 25 µL. The PCR reaction was conducted under the following conditions: 95 °C for 3 min, followed by 40

DNA Isolation
Individual adult psyllids were homogenized in ATL buffer in 1.5 mL microtube using a micropestle homogenizer and total DNA was isolated using the QIAmp DNA Blood mini kit (Qiagen, Hilden, Germany, Cat. No. 51106) and eluted in 100 µL of deionized water. Purified DNA was stored at a temperature of −20 • C until used in HTS or PCR assays.

PCR Amplification and Sanger Sequencing
For the species confirmation and to study of the variability of the larger sample set, the partial COI gene was amplified using the universal primer pair CPF4 (5 -TAAGAACTAACC ATAAGATTATCGG-3 ) and CPR4 (5 -CACTTCAGGGTGTCCAAAGAATC-3 ) according to the work of Kang et al. [24].

High-Throughput Sequencing
The 50 ng of DNA was fragmented in 50 µL of Sonication buffer [TE buffer (10 mM Tris, 1 mM EDTA), pH 7.5-8.0] using the Bioruptor Plus (Diagenode, Denville, NJ, USA) five times for 30 s at the HIGH setting. The sheared DNA was used to prepare a sequencing library using the NEBNext Ultra™ II DNA Library Prep Kit for Illumina (New England Biolabs, Ipswich, MA, USA, Cat. No. E7645L) with a mean fragment size of 500 bp. The library was sequenced on an Illumina NovaSeq 6000 to generate 2 × 160 or 2 × 250 bp paired-end reads to achieve at least 100-fold mitochondrial genome coverage. The samples were sequenced in multiple runs using the NovaSeq6000 at CATRIN-UPOL, Czech Republic.

Assembly of HTS Reads and the Genome Annotation
The high-throughput reads were paired by name, trimmed using BBDuk, merged and de novo assembled using Spades assembler, under standard conditions, all algorithms implemented in Geneious Prime version 2023.1.2. The resulting contigs were screened using the BLASTN algorithm [37] against the NCBI Cacopsylla mitochondrion sequences and circularized based on the identical 5 -and 3 -overhangs. The derived mtDNAs were subjected to mitogenome analysis using the MITOS Web server (http://mitos.bioinf.unileipzig.de, accessed on 22 June 2023) [38] under the invertebrate genetic code (code 5), and the annotation files and tRNA structure analyses were generated. The position of the identified genes was manually verified based on the additional comparisons with the available Cacopsylla mitogenomes using MEGA11 version 11.0.9 [39] and Geneious Prime 2023.1.2 software. Geneious Prime 2023.1.2 was also used for the visualization of the mitogenome map.

Phylogenetic Analysis
Multiple alignments of the obtained HTS mtDNA sequences and complete mtDNA sequences of Cacopsylla sp. available in the GenBank were performed using ClustalW algorithm. The obtained nt matrices were used for the estimation of genetic variability and phylogenetic relationships: (1) genetic variability was calculated using p-distance algorithm; (2) frequency of synonymous and nonsynonymous codons was calculated using distance analysis and the Nei-Gojobori model; (3) the phylogenetic trees were constructed on nucleotide sequences of complete mitogenome and discrete genes using the neighborjoining method and the Tajima-Nei (G+I) model (selected using the model algorithm). The 1000 bootstrap replicates were used for all analyses, and trees were rooted using the appropriate Psylla alni sequences. The phylogenetic tree was visualized using Tree Explorer. All of the above analyses were performed using MEGA11 version 11.0.9 [39]. A similar strategy was used for the analysis of partial COI sequences except for the selection of the Tamura-Nei three-parametric model (G+I) used in the phylogenetic analysis.

Results
In order to gain a broader insight into the structure of the mitogenome of the 'Candidatus Phytoplasma mali' vector species, Cacopsylla picta and Cacopsylla melanoneura, four individuals of C. picta and three of C. melanoneura collected in 2005, 2006 and 2022 in the southern Moravia and the eastern Bohemia of the Czech Republic were subjected to highthroughput sequencing.

HTS Analysis
The four complete mtDNAs of C. picta, and two complete and one partial mtDNAs of C. melanoneura were obtained after de novo assembly of the HTS reads (Table 2); one overlapping contig was obtained for each sample. The sequences were deposited in the GenBank under Acc. Nos. OR346833-OR346839, OR351141-OR351187.
The length of the C. picta mtDNA varied between 14,801 and 14,802 bp, with an indel -/A present at position 1314. These mtDNA sequences showed 99.85-99.98% similarity with each other and 79.79-86.51% similarity with complete mitochondrial sequences of other Cacopsylla species available in GenBank, namely C. burckhardti, C. pyri, C. coccinea, C. yukiungi, C. citrinella, C. fuscicella. The correct morphological determination of the analyzed individuals was confirmed via BLASTN analysis, in which 99.32-100% similarity was found with the different partial COI sequences of C. picta available in the GenBank database.
Two complete mitochondrial sequences (gt S3 and 498-2) and one partial sequence (gt AO282), lacking the complete A+T rich control region, measuring 14,879-14,881 bp in length were obtained for C. melanoneura samples. The sequences differ according to the presence of specific indels in non-coding regions, with A/-at position 6034, AT/--at position 14,400, T/at position 14,608; and they showed 99.86-99.98% similarity with each other, 80.02-86.64% with the complete mitochondrial sequences of the other Cacopsylla sp. and 99.73-100% similarity with various C. melanoneura sequences available in GenBank, respectively. All molecules showed the standard organization of the mitogenome consisting of 36 typical genes, 13 protein-coding genes, 2 ribosomal (rRNA) RNA genes and 22 transfer RNA (tRNA) genes ( Figure 2, Table 3 and Table S1). All mtDNAs obtained showed high frequency of A and T bases, with 39.1% of A, 35% of T, 16.3% of C, 9.6% of G and a GC content of 25.8% for the complete sequence and slightly higher, 26.9%, for protein-coding genes for the C. picta. A similar situation was found for the C. melanoneura with the frequencies 38.8% of A, 35.1 of T, 16.1% of C, 10.1 of G, and 26.2% of GC in the complete sequence and 27.5% of GC, in protein-coding genes. respectively. The analysis of the overall codon usage shows a clear preference for codons with A or T in the third position compared to G or C ( Figure S1). As expected, the shortest gene is ATP8 with a length of 153 bp for both species, and the longest genes are ND5 with length of 1618 bp, followed by the COI gene, which varies in length from 1530 bp in C. picta to 1533 bp in C. melanoneura. The length of the protein coding genes was identical for the both species, except for the COI gene mentioned above and the ND4L gene (1249 and 1246 bp). Minimal nucleotide variability was found among the genes within each Cacopsylla species. The interspecies comparisons reflected the species differences. The lowest diversity (12.3%) was found for the COI gene, while the most variable genes were ND4L (21.7-24.1% diversity), ATP8 (27.2%), and ND6 (28.9-28.7%). The analysis of the frequency of synonymous and nonsynonymous mutations showed the more frequent presence of synonymous codons at intra-species and inter-species levels, with intra-species dN-dS values close to zero and inter-species dN-dS values ranging from −0.225 to −1.175 (Table 4), implying the effect of negative selection in all genes, except for ND6 in C. melanoneura.
The mitogenome harbors two ribosomal RNA genes, both in the reverse orientation positioned between the ND1 and A+T rich Control region, the 16S rRNA gene of 1157 bp in length for C. picta and 1154 bp in length for C. melanoneura, and the 12S rRNA gene of 747 bp and 748 bp in length, respectively. Both species differ in the length of the A+T rich control regions, which are 689 bp and 757-758 bp in length, respectively, and contain a 20 bp and 23 bp long poly-T motif.   Abbreviations used as follows: ND2-NADH dehydrogenase subunit 2; COI-cytochrome c oxidase subunit I; COII-cytochrome c oxidase subunit II; ATP8-ATP synthase membrane subunit 8; ATP6-ATP synthase membrane subunit 6; COIII-cytochrome c oxidase subunit III; ND3-NADH dehydrogenase subunit 3; ND5-NADH dehydrogenase subunit 5; ND4-NADH dehydrogenase subunit 4; ND4L-NADH dehydrogenase subunit 4L; ND6-NADH dehydrogenase subunit 6; ND1-NADH dehydrogenase subunit 1; CYTB-cytochrom b; A+T rich-A+T rich control region; trn-tRNA (amino acids marked by IUPAC code); Fw-forward orientation; Rev-reverse orientation.
The mtDNA of both psyllid studies is similar to those described to date, namely for Cacopsylla species, i.e., C. burckhardti, C. pyri, C. coccinea, C. yukiungi, C. citrinella, C. fuscicella psyllids shows a standard structure typical for the Cacopsylla mitogenome. The 13 proteincoding genes have been identified there, of which 9 (ND2, COI, COII, APT6, ATP8, COIII, ND3, ND6 and CYTB) are localized on the H strand and 4 (ND5, ND4, ND4L, ND1) on the L strand. The genes are flanked by AT-start codons, mainly ATG, followed by ATA and ATT (specific TTG as a start codon has been detected for the ND5 gene, at the 5 end) and usually by TAA and less frequently TAG termination codons at the 3 end. The incomplete T-termination, which is thought to be converted to TAA after specific adenylation is present in the COII, COIII, ND4 and ND5 genes ( Table 4). The coding region consists of 3604 codons, and the most frequent codons were ATT (9.40 and 9.49%), TTA (8.70% and 8.46%), TTT (8.34% and 8.10%) and ATA (5.55% and 5.55%) for C. picta and C. melanoneura, respectively. The analysis of the overall codon usage shows a clear preference for codons with A or T in the third position compared to G or C ( Figure S1). As expected, the shortest gene is ATP8 with a length of 153 bp for both species, and the longest genes are ND5 with length of 1618 bp, followed by the COI gene, which varies in length from 1530 bp in C. picta to 1533 bp in C. melanoneura. The length of the protein coding genes was identical for the both species, except for the COI gene mentioned above and the ND4L gene (1249 and 1246 bp). Minimal nucleotide variability was found among the genes within each Cacopsylla species. The interspecies comparisons reflected the species differences. The lowest diversity (12.3%) was found for the COI gene, while the most variable genes were ND4L (21.7-24.1% diversity), ATP8 (27.2%), and ND6 (28.9-28.7%). The analysis of the frequency of synonymous and nonsynonymous mutations showed the more frequent presence of synonymous codons at intra-species and inter-species levels, with intra-species dN-dS values close to zero and inter-species dN-dS values ranging from −0.225 to −1.175 (Table 4), implying the effect of negative selection in all genes, except for ND6 in C. melanoneura. The analysis of the variability of the mtDNAs of both species did not show any significant differences between the psyllids collected in 2005, 2006 and 2022, and between the samples collected in the Czech Republic in the regions of Eastern Bohemia and Southern Moravia.
The mitogenome harbors two ribosomal RNA genes, both in the reverse orientation positioned between the ND1 and A+T rich Control region, the 16S rRNA gene of 1157 bp in length for C. picta and 1154 bp in length for C. melanoneura, and the 12S rRNA gene of 747 bp and 748 bp in length, respectively. Both species differ in the length of the A+T rich control regions, which are 689 bp and 757-758 bp in length, respectively, and contain a 20 bp and 23 bp long poly-T motif.
Phylogenetic analysis of the complete mitochondrial sequences using neighbor-joining analysis allowed the construction of a phylogenetic tree, whose topology placed C. picta and C. melanoneura sequences in relationship to C. burckhardti (Figure 3). Both species Agronomy 2023, 13, 2210 9 of 14 formed a homogeneous branch that differed significantly from the other sequences analyzed. Phylogenetic trees of similar topology with the strong support for the C. picta and C. melanoneura branches were obtained not only for the COI, 16S rRNA and CYTB genes used for taxonomic classification, but also for all other protein-coding genes.
Phylogenetic analysis of the complete mitochondrial sequences using neighbor-join ing analysis allowed the construction of a phylogenetic tree, whose topology placed C picta and C. melanoneura sequences in relationship to C. burckhardti (Figure 3). Both species formed a homogeneous branch that differed significantly from the other sequences ana lyzed. Phylogenetic trees of similar topology with the strong support for the C. picta and C. melanoneura branches were obtained not only for the COI, 16S rRNA and CYTB genes used for taxonomic classification, but also for all other protein-coding genes.  With the aim of mapping the variability of the studied psyllids and the change in the Czech population structure in the context of global warming and the possible spread of different genotypes in the area over more than 15 years, the partial COI sequence analysis was performed on the set of C. picta and C. melanoneura psyllids collected in the 2005-2022 period from different localities in the Czech Republic. The species-specific sequences were almost identical, showing 99.7-100% similarity between each other and compared to Italian samples available in GenBank for C. picta, and 99.3-100% for C. melanoneura, respectively. The only exception was the sample adult genotype 495-1, which was determined to be C. melanoneura and showed only 91.5-92.0% similarity to other C. melanoneura sequences.
In the phylogenetic analysis using neighbor-joining analysis (Figure 4), samples of C. picta collected from different localities in Moravia and eastern Bohemia formed one major and four minor clusters, but without the statistical support of bootstrap values, among which the sequences were distributed together with the Italian sequences available in GenBank without correlating with their origin or the date of their collection. The C. melanoneura psyllids formed a homogeneous cluster covering all analyzed sequences without any correlation with the origin or date of collection. The specific situation of adult male 495-1 was also confirmed in this analysis, when this sequence formed a significant single branch (bootstrap value 97) that was distinct from the other C. melanoneura and C. picta psyllids.

Discussion
In our study, we have obtained the first complete mitochondrial genomes of two apple psyllids Cacopsylla picta and Cacopsylla melanoneura. The length of the mitogenomes, 14,801-14,802 bp and 14,879-14,881 bp is similar to the mitogenomes of Cacopsylla species described so far, with the detected length of around 14.8-14.9 kb [29][30][31]33,40]. The phylogenetic analysis of the complete mtDNA places the two species characterized here in a close relationship with C. burckhardti. Despite the lack of complex data publicly available for further comparisons, it could be deduced that both of them belong to the Cacopsylla clade I of the O-group of psyllids, psyllids of the Old and New World [29], which is in agreement with the expectation based on their distribution limited to the Europe and the variability of the partial cytochrome c oxidase subunit I gene used for the DNA barcoding and molecular identification of psyllid species [11,22,[25][26][27][28]41].

Discussion
In our study, we have obtained the first complete mitochondrial genomes of two apple psyllids Cacopsylla picta and Cacopsylla melanoneura. The length of the mitogenomes, 14,801-14,802 bp and 14,879-14,881 bp is similar to the mitogenomes of Cacopsylla species described so far, with the detected length of around 14.8-14.9 kb [29][30][31]33,40]. The phylogenetic analysis of the complete mtDNA places the two species characterized here in a close relationship with C. burckhardti. Despite the lack of complex data publicly available for further comparisons, it could be deduced that both of them belong to the Cacopsylla clade I of the O-group of psyllids, psyllids of the Old and New World [29], which is in agreement with the expectation based on their distribution limited to the Europe and the variability of the partial cytochrome c oxidase subunit I gene used for the DNA barcoding and molecular identification of psyllid species [11,22,[25][26][27][28]41].
The A/T content is the same for mitochondrial DNA from other members of the Cacopsylla species, typically ranging from 72.04 for C. citrisuga to 73.85% for C. pyri, and 73.69% for C. burckhardti, respectively. The codon bias, with a preference for A/T in the third position of the codon, also reflects this situation. TTA, ATT, TTT and ATA codons are the most frequently observed codons in their mitogenomes, and their frequency in C. picta (31.99%) and C. melanoneura (31.6%) belongs to the upper limit of their frequency, showing again a high similarity with the C. burckhardti (31.89-32% frequency of the mentioned codons) and the use of these codons in the range of 28.91-31.69% by the other Cacopsylla psyllid species [33].
The variability of all protein-coding genes and the observed identical topology of the phylogenetic tree constructed for each of them support the previous suggestion that all of them and not only COI could be used for the species discrimination and their molecular identification. Thus, the most variable genes, ND6 and ATP8, could be efficient in the study of population variability at least for the C. picta and C. melanoneura. On the other hand, the conservative character and low variability of the COI gene, also confirmed in our study, regardless of whether we discuss the complete or partial sequences, shows that this gene is not so suitable for intrapopulation variability studies, although it is still used for them. The combination of chromosomal and mitochondrial markers might be more useful [24,26,28,42]. In addition to this gene, the mitochondrial genes, such as ND5, COIII or ND4L, are recommended for the study of interspecies variability [33]; however, their different variability among the species or higher taxonomic groups shows that the further studies of the mitogenome structure of the other Cacopsylla species are necessary for a better understanding of their evolution and their adaptations to the environment and life strategies. Their study is still under progress.
The finding of genetic and phylogenetic distance of sample genotype 495-1, male determined as Cacopsylla melanoneura (Pavel Lauterer, personal communication), which shows almost 8.5% distance from any other C. melanoneura plant lice and even higher with the other representatives of the genus Cacopsylla, is surprising. The significant phylogenetic position between C. melanoneura and C. burckhardti and its distance from them raises questions about the identity of this insect. The two distinct lineages have been previously observed for the Cacopsylla chinensis in Japan, and China and Taiwan [27,42], and one of them has recently been described as a new species, Cacopsylla jukyungi [33]. Given our high respect for the morphological determination by Dr. P. Lauterer, the possible existence of the two cryptic C. melanoneura species could not be excluded, similar to the existence of two cryptic species of C. pruni and the existence of A and B genotypes [43,44]. However, further efforts to clarify the situation could be complicated by the age of the sample analyzed and by the fact that individuals could occur quite randomly in the area, in minimal abundance, due to their random transfer by air currents [13] during the re-emigration from their from their wintering grounds and that their natural population occurs in the other area. Their different ability to transmit phytoplasma 'Ca. Phytoplasma mali' could not be excluded.

Conclusions
The complete mitogenome of two apple plant lice, C. picta and C. melanoneura, is characterized for the first time for these species. It brings comprehensive information about the mitogenome of European members of the genus Cacopsylla, and its stability on the set of samples collected more than seventeen years apart, at least in the center of Europe, Czech Republic. Both species share common characteristics with other members of the genus described to date, such as length of mtDNA, its molecular and genetic structure, including codon bias, structure of the A+T rich control region, and carry the CGGTA motif in the non-coding region between ND1 and tRNA-S2 (-orientation). On the other hand, the observed variability in the length of this region between the individuals within the same species of C. picta implies that the length and conservation of this repetitive region might not be so strict. The obtained results show a high genetic homogeneity of the mtDNA of both psyllid species occurring in the studied Czech localities and clear differences over a 15-year period were not detected.
Increasing the knowledge about the structure of the mitogenome of other European psyllids of the genus Cacopsylla, not only C. picta and C. melanoneura described in our study, could be beneficial in the future.