Phylogenetic Analysis of Two New Mitochondrial Genomes of Singapora shinshana and Seriana bacilla from the Karst Region of Southwest China

Leafhoppers are a serious threat to many plants. To explore the characteristics of their mitogenomes and reveal the phylogenetic positions of two species in Typhlocybinae, the complete mitogenomes of Singapora shinshana and Seriana bacilla were sequenced and annotated for the first time, with lengths of 15,402 bp and 15,383 bp, respectively. Both mitogenomes contain 13 protein-coding genes (PCGs), 22 tRNA genes and 2 rRNA genes. The genome content, gene order, nucleotide composition, codon usage and amino acid composition are similar to those of other typical mitogenomes of Typhlocybinae. All 13 PCGs started with ATN codons, except for atp8 (TTA) and nad5 (TTG). All tRNAs folded into a typical cloverleaf secondary structure, except for tRNA-Ser1 and tRNA-Val. Moreover, phylogenetic trees were constructed and analyzed based on all the PCGs from 42 mitogenomes using maximum likelihood (ML) and Bayesian inference (BI) methods. The results support the monophyly of all eleven subfamilies and place S. shinshana and S. bacilla within Erythroneurini. S. shinshana is very closely related to the genus Empoascanara, with the topology ((((Empoascanara sipra + Empoascanara wengangensis) + Empoascanara dwalata) + Empoascanara gracilis) + S. shinshana), and S. bacilla is closely related to the genus Mitjaevia, with the topology ((Mitjaevia dworakowskae + Mitjaevia shibingensis) + S. bacilla). These results provide valuable information for future studies of evolutionary relationships in Typhlocybinae.


Introduction
Phytophagous piercing-sucking insects feed on the undersides of leaves, inserting their piercing-sucking mouthparts into plant tissue to remove plant juices [1]. They are among the main pests of food crops, vegetables and ornamental plants [2]. Cicadellidae, the largest family of Hemiptera, includes species harmful to cereals, vegetables, fruit trees and other trees [3]. S. shinshana and S. bacilla (Hemiptera: Cicadellidae: Typhlocybinae: Erythroneurini) cause significant damage to the leaves of peach trees and potatoes through their sucking behavior, which produces yellow-whitish spots on the leaves [4,5]. This damage seriously impairs the normal growth of agricultural and forestry crops and causes a decline in fruit yield.
The mitochondrial genome plays a crucial part in the phylogenetic and evolutionary analysis of insects. It is widely used to study genome structure and function, population genetic structure and phylogenetic relationships at various taxonomic levels [6–13].

Sample Collection and DNA Extraction
The specimens were collected with a sweep net in karst areas, and the morphological terminology used in this work follows Dietrich (2005) and Song and Li (2013) [46,47]. Detailed collection information is shown in Table 2. The specimens were preserved in 95% ethanol and stored in the insect specimen storage room of Guizhou Normal University. After species identification, specimens were selected and their heads, wings and abdomens were removed. Genomic DNA was extracted from thorax muscle tissues and legs using a DNeasy Blood & Tissue kit (QIAGEN, Beijing, China). The tissues were ground and incubated at 56 °C for 6 h for complete lysis, total genomic DNA was eluted in 50 µL of double-distilled water (ddH2O), and the remaining steps were performed according to the manufacturer's protocol. Genomic DNA was stored at −20 °C.

Mitochondrial Genome Sequencing and Assembly
The complete mitochondrial genomes were sequenced at Berry Genomics (Beijing, China) on an Illumina NovaSeq 6000 platform (Illumina, Alameda, CA, USA) with 150 bp paired-end reads. First, the raw reads were filtered following Zhou et al. [48], and the remaining high-quality reads were assembled with IDBA-UD, an iterative de Bruijn graph de novo assembler, using a similarity threshold of 98% and k values of 40 and 160 bp [49]. The mitogenome was then assembled in Geneious Prime 2021 v2021.1.1 from the clean paired reads with default parameters, using the mitogenome of Mitjaevia protuberanta Song, Li et Xiong, 2011 (Hemiptera: Cicadellidae: Typhlocybinae) (GenBank accession number: NC_047465.1) as the reference [50]. Finally, the clean reads were manually mapped to the resulting mitochondrial scaffolds in Geneious Prime to check the accuracy of the assembly.

Mitogenome Annotation and Sequence Analysis
The assembled mitogenome sequence was subsequently annotated using Geneious Prime, with the mitogenome of M. protuberanta (GenBank accession number: NC_047465.1) as the reference. All tRNA genes were identified with the MITOS Web Server (http://mitos.bioinf.uni-leipzig.de/index.py, accessed on 12 December 2021) [51]. The annotated mitogenome sequences of S. shinshana and S. bacilla were deposited in GenBank under the accession numbers OM048770.1 and OM048922.1. The typical secondary structures of the tRNAs were drawn with Adobe Illustrator 2021 in accordance with the MITOS predictions. The circular mitogenomic maps were visualized with the CGView server (http://stothard.afns.ualberta.ca/cgview_server/, accessed on 12 December 2021). The base composition, codon usage and relative synonymous codon usage (RSCU) of all protein-coding genes (PCGs) were calculated using MEGA 7.0 [52]. Strand asymmetry was calculated using the following formulae: AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C) [53].
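The two skew formulae above can be illustrated with a minimal Python sketch. The function name `base_skews` and the toy sequence are assumptions made for illustration only; the study itself reports using MEGA 7.0 for base composition, so this is not the authors' implementation.

```python
from collections import Counter


def base_skews(seq: str) -> dict:
    """Return A+T content, AT-skew and GC-skew for a nucleotide sequence.

    AT-skew = (A - T) / (A + T), GC-skew = (G - C) / (G + C).
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c  # ambiguous bases such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, T, G or C")
    return {
        "at_content": (a + t) / total,
        "at_skew": (a - t) / (a + t) if a + t else 0.0,
        "gc_skew": (g - c) / (g + c) if g + c else 0.0,
    }


# Made-up toy sequence (4 A, 3 T, 1 G, 2 C); not taken from either mitogenome.
toy = "AAAATTTGCC"
result = base_skews(toy)
print(result)
```

A positive AT-skew means A is more frequent than T on the analysed strand, and a negative GC-skew means C is more frequent than G.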

Phylogenetic Analyses
Forty species of Cicadellidae and two outgroups available in GenBank were selected to construct the phylogenetic tree (Table 1). Phylogenetic analyses were performed using the thirteen PCGs of the two newly sequenced species and the other leafhopper species. Each PCG was aligned with MAFFT as a protein alignment [54,55]. Gblocks version 0.91b was used to remove gaps and ambiguously aligned sites [52]. The 13-PCG dataset was then used to construct phylogenetic trees based on maximum likelihood (ML) and Bayesian inference (BI) using RAxML 8.0.2 [56] and MrBayes 3.2.6 [57], respectively. ML analysis was performed with 1000 rapid bootstrapping replicates using IQ-TREE, and GTR+I+G was considered the best model. BI analysis was performed under the GTR+I+G nucleotide substitution model in MrBayes 3.2.7a, with 2 independent runs of 1,000,000 generations, 4 chains and sampling of the chains every 1000 generations [58].
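A 13-PCG dataset of this kind is commonly built by concatenating the trimmed per-gene alignments into one supermatrix. The text does not say which tool was used for this step, so the following Python sketch is only an assumption about how it could be done; `read_fasta`, `concatenate` and the taxon names are hypothetical.

```python
def read_fasta(text: str) -> dict:
    """Parse FASTA-formatted text into {taxon: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate(alignments: list) -> dict:
    """Join per-gene alignments end to end, one row per taxon.

    A taxon missing from a gene is padded with gaps so that all
    rows of the supermatrix keep the same length.
    """
    taxa = sorted(set().union(*(aln.keys() for aln in alignments)))
    matrix = {taxon: "" for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment differ in length")
        width = lengths.pop()
        for taxon in taxa:
            matrix[taxon] += aln.get(taxon, "-" * width)
    return matrix


# Toy alignments with made-up sequences and hypothetical taxon names.
cox1 = read_fasta(">sp_A\nATGGCA\n>sp_B\nATGGCT\n")
atp8 = read_fasta(">sp_A\nTTGAAA\n>sp_C\nATAAAA\n")
supermatrix = concatenate([cox1, atp8])
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

Padding missing genes with gaps keeps every taxon in the matrix, which is one common choice; dropping taxa with missing genes is another.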

Genome Organization and Composition
As reported for most leafhoppers, genome organization and composition are relatively conserved [8]. The complete mitogenomes of S. shinshana and S. bacilla are double-stranded circular molecules with lengths of 15,204 bp and 15,383 bp, respectively (Figure 1), and contain 13 PCGs, 22 tRNA genes, 2 rRNA genes and a control region (Table 3). However, their sequence lengths differ from those of other complete leafhopper mitochondrial genomes because of differences in the length of the intergenic spacers and A+T-rich regions. The two species show no difference in mitochondrial genome organization, and in both the content of A and T is higher than that of G and C. Twenty-three genes are located on the majority strand (J), whereas fourteen genes are encoded on the minority strand (N). In S. shinshana, the shortest intergenic spacer (1 bp) is located between tRNA-Ser2 and nad1, and the longest (5 bp) between tRNA-Cys and tRNA-Tyr. In S. bacilla, the shortest intergenic spacer (1 bp) is located between atp6 and cox3, and the longest (11 bp) between tRNA-Gln and tRNA-Met. Additionally, 13 PCGs were found to overlap by a total of 37 bp and 35 bp in S. shinshana and S. bacilla, respectively (Table 3). The conserved 8 bp overlap between tRNA-Trp and tRNA-Cys in the two species is extremely common in Cicadellidae [6,18,42,59], whereas the 8 bp overlap between nad6 and cytb in S. shinshana and the 10 bp overlap between tRNA-Ser2 and nad1 in S. bacilla may be incidental.
[...]1%, 35.3%), G (9.4%, 10.8%) and C (11.7%, 13.0%). Like most insects [60], both species had a strong A+T bias: the A+T content (78.9%, 76.1%) was much higher than the G+C content (21.1%, 23.9%) (Table 4). The overall positive AT skews (0.09, 0.07) and negative GC skews (−0.11, −0.09) indicate that the A content was higher than the T content and that the C content was higher than the G content. The AT-skew and GC-skew values of the two species are consistent with those reported earlier in leafhoppers [17].
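As a quick check, the reported GC skews can be recomputed from the G and C percentages given above using GC-skew = (G − C)/(G + C). This is only a worked illustration based on the rounded percentages in the text, so small rounding differences are expected (for example, G + C for S. bacilla sums to 23.8% here rather than the reported 23.9%).

```latex
\begin{align*}
\text{GC-skew}_{S.\ shinshana} &= \frac{9.4 - 11.7}{9.4 + 11.7} = \frac{-2.3}{21.1} \approx -0.11,\\
\text{GC-skew}_{S.\ bacilla} &= \frac{10.8 - 13.0}{10.8 + 13.0} = \frac{-2.2}{23.8} \approx -0.09.
\end{align*}
```

Both values agree with the reported skews of −0.11 and −0.09.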

Protein-Coding Genes and Codon Usage
The total lengths of all the PCGs of S. shinshana and S. bacilla were 11,400 bp and 10,973 bp, accounting for 72.0% and 71.3% of the entire mitogenome, respectively. The order of the PCGs is shown in Figure 2. The longest PCG is nad5, with 1674 bp and 1675 bp, respectively, and the shortest is atp8, with 153 bp. This pattern also occurs in other species of Cicadellidae [18]. Twelve PCGs of the S. shinshana and S. bacilla mitogenomes are initiated with the start codon ATN (ATG, ATA, ATT); the exception is atp8 (TTG). TAA is the most frequent stop codon, but cox2 and nad1 in S. shinshana and cox2 and nad5 in S. bacilla use a single T as an incomplete stop codon, which is converted into a complete TAA codon by the addition of a polyadenylated tail at the 3′ end [55]. Only four genes (nad5, nad4, nad4L and nad1) are located on the N-strand; the remaining genes (cox1, cox2, cox3, atp8, atp6, nad2, nad3, nad6 and cytb) are located on the J-strand.
[...] (Table 4 and Figure 3). A previous study of Cicadellidae reported AUA as the most frequently used codon, which is inconsistent with the present result [18]. In S. bacilla, the most frequently used codons, in decreasing order, were AAA-Lys (261), AUU-Ile (255), AAU-Asn (210), UUA-Leu2 (205) and AUA-Met (201) (Table 5). Codon usage analysis revealed that adenine and uracil usually occupy the third codon position. Because the most frequently used codons end with A or U, the A+T content of the PCGs is higher than the G+C content, which in turn contributes to the high A+T content of the whole mitogenome.
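Codon counts and RSCU of the kind discussed above can be sketched in a few lines of Python. RSCU is taken here in its usual sense: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The `FAMILIES` table is only an illustrative subset of the invertebrate mitochondrial code, the sequence is made up, and MEGA's exact handling (for example of stop codons) may differ from this sketch.

```python
from collections import Counter

# Illustrative subset of synonymous codon families under the invertebrate
# mitochondrial code (NCBI translation table 5); NOT the full table.
FAMILIES = {
    "Lys": ["AAA", "AAG"],
    "Asn": ["AAU", "AAC"],
    "Met": ["AUA", "AUG"],
    "Ile": ["AUU", "AUC"],
}


def count_codons(cds: str) -> Counter:
    """Count in-frame codons of a coding sequence, written as RNA."""
    rna = cds.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # drop a trailing partial codon (e.g. a lone T stop)
    return Counter(rna[i:i + 3] for i in range(0, usable, 3))


def rscu(counts: Counter, families: dict) -> dict:
    """RSCU = observed count * family size / total count of the family."""
    values = {}
    for codons in families.values():
        total = sum(counts[codon] for codon in codons)
        for codon in codons:
            values[codon] = counts[codon] * len(codons) / total if total else 0.0
    return values


# Made-up coding sequence ending in an incomplete single-T stop codon.
toy_cds = "ATG" + "AAA" * 2 + "AAG" + "ATT" * 3 + "ATC" + "AAT" + "T"
counts = count_codons(toy_cds)
values = rscu(counts, FAMILIES)
print(counts.most_common(3))
print(values)
```

An RSCU above 1 indicates a codon used more often than expected under equal usage within its family, which is how a bias toward A- or U-ending codons shows up.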

Ribosomal and Transfer RNA Genes
The 16S rRNA gene is 1186 bp and 1185 bp long in S. shinshana and S. bacilla, respectively, and is located between tRNA-Leu2 and tRNA-Val; the 12S rRNA gene is 736 bp and 731 bp long and is situated between tRNA-Val and the D-loop. The A+T contents of 16S rRNA and 12S rRNA were 83.4%, 85.4% and 81.4%, 80.0%, respectively. The 22 tRNA genes ranged from 61 to 71 bp in both species. Among them, 14 genes are located on the J-strand and 8 genes on the N-strand. The arrangement of tRNA-Ile and tRNA-Gln found here is common in Cicadellidae, but a few rearrangements of tRNA-Ile and tRNA-Gln have been observed in the mitogenomes of other subfamilies of Cicadellidae [6,8]. In insects, most tRNA genes fold into the typical cloverleaf secondary structure; however, tRNA-Ser1 lacks a dihydrouridine (DHU) arm [61] (Figure 4). Canonical Watson-Crick base pairings (A-U and C-G) are usually observed in the tRNA genes, but 16 and 20 noncanonical base pairings (G-U) are found in the DHU arms. The A+T contents of the tRNAs of S. shinshana and S. bacilla are 79.3% and 77.0%, with a positive AT skew (0.06, 0.02) and a negative GC skew (−0.07, −0.04) (Table 4).
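The distinction between canonical and G-U pairings can be made concrete with a small Python sketch that tallies pair types in a single stem. The function `classify_stem` and the example strands are hypothetical; in the study the secondary structures came from MITOS predictions, not from code like this.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify_stem(strand5: str, strand3: str) -> dict:
    """Tally pair types in one stem.

    Both strands are written 5'->3'; the stem pairs antiparallel, so the
    first base of strand5 pairs with the last base of strand3.
    """
    s5 = strand5.upper().replace("T", "U")
    s3 = strand3.upper().replace("T", "U")
    if len(s5) != len(s3):
        raise ValueError("stem strands must have equal length")
    tally = {"watson_crick": 0, "gu_wobble": 0, "mismatch": 0}
    for pair in zip(s5, reversed(s3)):
        if pair in WATSON_CRICK:
            tally["watson_crick"] += 1
        elif pair in WOBBLE:
            tally["gu_wobble"] += 1
        else:
            tally["mismatch"] += 1
    return tally


# Made-up 4 bp stem: pairs are G-C, G-U, A-U and U-A.
tally = classify_stem("GGAU", "AUUC")
print(tally)
```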

A+T-Rich Region
The mitochondrial A+T-rich region plays an important role in the initiation and regulation of replication and transcription in insects [62]. The A+T-rich region is located between 12S rRNA and tRNA-Ile in the mitogenome, with a total length of 926 bp and 1045 bp (Table 3) in S. shinshana and S. bacilla, respectively. The A+T content is 96.7%, with a negative AT skew (−0.24, −0.25) and a positive GC skew (0.48, 0.47) (Table 4).

Phylogenetic Relationship
Phylogenetic trees were constructed from the 13 PCGs using the BI and ML methods and then merged into one phylogenetic tree. The results indicated that the monophyly of the two tribes (Typhlocybini and Erythroneurini) within the subfamily Typhlocybinae was generally well supported, which is consistent with the findings of some previous molecular phylogenetic studies [65,66]. Ten species of Typhlocybini and eight species of Erythroneurini each clustered together, and most relationships received high nodal support in both the ML (BS = 100) and BI (PP = 1) analyses. Our results further confirmed that S. shinshana is closely related to the genus Empoascanara, while S. bacilla is more closely related to the genus Mitjaevia (Figure 5). In terms of morphological identification, S. shinshana is relatively easy to distinguish from other leafhoppers by its appearance, as its body is yellow or yellow-green, but the genera Seriana, Mitjaevia and Empoascanara are difficult to distinguish, so dissection of the male genitalia is necessary. Based on external appearance combined with the shape of the genitalia, S. shinshana and S. bacilla belong to Erythroneurini (Hemiptera: Cicadellidae: Typhlocybinae). Although the two species belong to Erythroneurini, other species were included to analyze phylogenetic relationships at the mitochondrial DNA sequence level in order to elucidate their phylogenetic status and verify consistency with traditional taxonomy.
Figure 5. Phylogenetic trees of forty species of Cicadellidae and two outgroups inferred by maximum likelihood (ML) and Bayesian inference (BI) methods based on protein-coding genes. The red tick marks the species sequenced in this study.

Conclusions
Consistent with previous results for other Typhlocybinae species, the mitogenomes of S. shinshana and S. bacilla presented in this study are highly conserved in gene size and organization, A+T-biased base composition, codon usage of protein-coding genes and secondary structures of tRNAs. In addition, no gene rearrangements were found. This work provides basic information for comparative analyses and discussion of mitogenome evolution in Erythroneurini. Phylogenetic analyses support all subfamilies as monophyletic groups and place S. shinshana and S. bacilla within Erythroneurini. However, larger-scale studies with more taxa are still needed to enrich the mitochondrial genome database and to construct more comprehensive phylogenies that support the results of our study. Our study offers valuable data and an efficient framework for future phylogenetic research on Typhlocybinae.