The Complete Mitochondrial Genome of Box Tree Moth Cydalima perspectalis and Insights into Phylogenetics in Pyraloidea

Simple Summary: The mitochondrial genome (mitogenome) has been extensively employed to investigate phylogenetic relationships at different taxonomic levels, and insect mitogenomes are important for understanding insect evolution and relationships. Herein, the entire mitogenome of Cydalima perspectalis was sequenced and characterized. Comparative mitogenomic and phylogenetic analyses were performed within the Pyraloidea. Our comparative studies show that mitochondrial genomes are a useful tool for phylogenetic studies at the subfamily level in the Pyraloidea.

Abstract: The mitochondrial genome (mitogenome) has been widely used to resolve and reconstruct phylogenetic relationships within Pyraloidea at different taxonomic levels. In this research, the complete mitogenome of Cydalima perspectalis was sequenced, and the phylogenetic position of C. perspectalis was inferred from this sequence in combination with other available sequence data. The circular mitochondrial genome is 15,180 bp in length. It contains 22 transfer RNA genes (tRNAs), two ribosomal RNA genes (rRNAs), 13 typical protein-coding genes (PCGs), and a non-coding control region. The gene arrangement of the C. perspectalis mitogenome differs from that of the putative ancestral arthropod mitogenome. All of the PCGs are initiated by ATN codons, except for the cytochrome c oxidase subunit 1 (cox1) gene, which is initiated by CGA. Five genes have incomplete stop codons that contain only 'T'. All tRNA genes display the typical cloverleaf structure of mitochondrial tRNA, except for trnS1 (AGN). The control region contains an 'ATAGG(A)'-like motif followed by a poly-T stretch. Based on the mitochondrial data, phylogenetic analysis within Pyraloidea was carried out using Bayesian inference (BI) and maximum likelihood (ML) analyses. Phylogenetic analysis showed that C. perspectalis is more closely related to Pygospila tyres within Spilomelinae than to the other species of Crambidae and Pyraloidea.


Introduction
Lepidoptera, with more than 157,000 known species in 137 families and 43 superfamilies, is the world's third-largest insect order after Diptera and Coleoptera [1]. One of these superfamilies, Pyraloidea, includes the families Pyralidae and Crambidae. To date, over 15,500 species of Pyraloidea have been described worldwide [2]. Pyraloidea includes a large number of economically significant pests that affect forests, agriculture, stored goods, and ornamental plants, and its members have been used as model insects to study biodiversity, community ecology, management, behavioral ecology, genetics, and the evolution of pheromone communication networks [3][4][5][6][7].

Ethics Standards
The committees of Yancheng Teachers University and Nanjing University of Chinese Medicine approved the animal protocols, and all experiments were performed in accordance with the applicable standards (approval nos. YCTU-2020007 and SP-2020003, respectively).

Sample Collection and DNA Extraction
Moths of C. perspectalis were collected in Yancheng, Jiangsu Province, China. The specimens were stored in 100% ethanol at −20 °C until DNA extraction. Total genomic DNA was extracted from the legs of the moths using the Ezup Column Animal Genomic DNA Purification Kit (Sangon Biotech, Shanghai, China) in accordance with the manufacturer's protocol.

Mitogenome Sequencing
Universal primer sets based on mitogenomic sequences from other lepidopteran insects were designed to amplify the C. perspectalis mitogenome [17][18][19][20]. PCR was conducted under the following conditions: 3 min at 94 °C, followed by 35 cycles of 30 s at 94 °C, 1-3 min at 50-62 °C, and 10 min at 72 °C. All amplifications were conducted in 50 µL reaction volumes using a Mastercycler gradient and an Eppendorf Mastercycler. The PCR products were separated by agarose gel electrophoresis (1% w/v) and then purified using a DNA Gel Extraction Kit (Vazyme, Nanjing, China). The purified PCR products were ligated into T-vector (Sangon Biotech, Shanghai, China) and sequenced at least three times.

Gene Annotation and Sequence Assembly
Sequences were annotated using the NCBI BLAST web service and the MITOS web server (http://mitos2.bioinf.uni-leipzig.de/index.py (accessed on 10 January 2023)). The PCGs of C. perspectalis were aligned with those of other Pyraloidea mitogenomes using MAFFT 17. Composition skew was calculated using the following formulas: GC-skew = (G − C)/(G + C) and AT-skew = (A − T)/(A + T), where A, T, G, and C are the counts of the respective nucleotides. Nucleotide composition statistics and codon usage were computed using PhyloSuite [21].
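The two skew formulas above can be illustrated with a minimal Python sketch. The function name `skews` and the toy sequence are assumptions made for illustration only; this is not the PhyloSuite implementation.

```python
from collections import Counter


def skews(sequence: str) -> tuple:
    """Return (AT-skew, GC-skew) for a DNA sequence.

    AT-skew = (A - T) / (A + T); GC-skew = (G - C) / (G + C).
    """
    counts = Counter(sequence.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    if a + t == 0 or g + c == 0:
        raise ValueError("sequence must contain A/T and G/C bases")
    return (a - t) / (a + t), (g - c) / (g + c)


# Hypothetical toy sequence (not taken from the mitogenome): A=2, T=3, G=1, C=2.
at_skew, gc_skew = skews("AATTTGCC")
print(f"AT-skew = {at_skew:.3f}, GC-skew = {gc_skew:.3f}")
```

A negative AT-skew means that T is more frequent than A on the analyzed strand, and a negative GC-skew means that C is more frequent than G.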

Phylogenetic Analysis
Mitogenome sequences of the Pyraloidea species used for the mitogenomic phylogeny were obtained from GenBank to determine the phylogenetic relationships among Pyraloidea insects based on nucleotide alignments (Table 1). Spodoptera litura was used as an outgroup. Each of the 13 mitochondrial PCGs was aligned individually using MAFFT [24] with default settings, and the alignments were then concatenated. Gblocks was applied to select conserved regions and eliminate unreliably aligned positions in the datasets [25]. Phylogenetic analyses were performed using Bayesian inference (BI) in MrBayes v3.2.2 [22] and maximum likelihood (ML) in IQ-TREE [23]. For the ML and BI analyses, GTR + I + G was selected as the best-fit model for the nucleotide sequences by MrModeltest 2.3 under Akaike's information criterion (AIC) [26]. The Bayesian analysis was conducted with the following settings: 10,000,000 generations, four chains, a sampling frequency of 100, and the first 5000 generations discarded as burn-in. We evaluated the reliability of the results using two criteria: first, the average standard deviation of split frequencies was lower than 0.01 in the Bayesian analysis; second, the effective sample size (ESS) was over 200. Together, these indicated that the runs had converged. The resulting phylogenetic trees were visualized [27].
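The per-gene alignment and concatenation step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `concatenate_alignments`, the taxon labels, and the toy alignments are hypothetical, and the code does not reproduce the behavior of any particular tool used in the study.

```python
def concatenate_alignments(gene_alignments, gene_order):
    """Join per-gene alignments into one supermatrix.

    gene_alignments: dict mapping gene name -> {taxon: aligned sequence}.
    gene_order: list of gene names giving the concatenation order.
    Taxa missing from a gene are padded with gap characters.
    """
    taxa = sorted({taxon for aln in gene_alignments.values() for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for gene in gene_order:
        alignment = gene_alignments[gene]
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in {gene} are not the same length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(alignment.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical toy alignments for two genes and two taxa.
toy = {
    "cox1": {"taxonA": "ATGAAA", "taxonB": "ATGAAT"},
    "nad2": {"taxonA": "ATT---", "taxonB": "ATTCCC"},
}
supermatrix = concatenate_alignments(toy, ["cox1", "nad2"])
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

Keeping the gene order explicit makes it straightforward to record partition boundaries if a partitioned model is wanted later.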

Base Composition and Genome Organization
The complete mitogenome of C. perspectalis is a closed circular molecule 15,180 bp in length. Its gene content is the same as that of other Pyraloidea mitogenomes: 13 PCGs (cox1-3, nad1-6, nad4L, cob, atp6, and atp8), 22 tRNA genes, two mitochondrial rRNA genes (rrnS and rrnL), and a major non-coding region known as the AT-rich region (Figure 1, Table 2). The majority strand (F strand) encodes 23 genes, and the minority strand (R strand) encodes the remaining 14: four of the 13 PCGs (nad1, nad4, nad4L, and nad5), eight tRNAs (trnQ, trnV, trnY, trnF, trnC, trnP, trnH, and trnL [CUN]), and two rRNAs (rrnS and rrnL). The nucleotide composition of the C. perspectalis mitogenome is as follows (Table 3): A = 6058 (39.9%), T = 6231 (41.0%), G = 1162 (7.7%), and C = 1729 (11.4%). The A + T content of the mitogenome is 81.0%. The AT-skew and GC-skew of the whole mitogenome were −0.014 and −0.196, respectively, as follows from the nucleotide counts above. The slightly negative AT-skew indicates that T nucleotides are marginally more abundant than A nucleotides, and the negative GC-skew indicates that C nucleotides outnumber G nucleotides. In addition, the AT-skew (0.014) and GC-skew (0.185) of the tRNAs indicate that the tRNAs contain more As and Gs than Ts and Cs. Similarly, the AT-skew (0.050) and GC-skew (0.337) of the rRNAs indicate that the rRNAs have more As and Gs than Ts and Cs.

Protein-Coding Genes
In total, the 13 PCGs of C. perspectalis contain 3723 codons, excluding termination codons. The start and stop codons of the 13 PCGs in the C. perspectalis mitogenome are presented in Table 2. All of the PCGs were initiated by ATN codons, with the exception of cox1, which was initiated by CGA (encoding arginine). The CGA start codon of cox1 is highly conserved across almost all insect groups [28][29][30]. In the C. perspectalis mitogenome, eight PCGs (atp6, atp8, cox1, cox3, nad2, nad3, nad6, and cob) had the complete stop codon TAA, whereas the other five (nad1, nad4, nad4L, cox2, and nad5) had the incomplete stop codon T. The average A + T content of the 13 PCGs was 79.6%. Moreover, the 13 PCGs had a slightly negative AT-skew and a marginally positive GC-skew (Table 3). The relative synonymous codon usage (RSCU) values for the C. perspectalis mitogenome are summarized in Table 4 and Figure 2; the RSCU values of NNT and NNA codons were higher than 1.0, apart from Leu (CUR), showing a strong bias toward T or A at the third codon position. Leu (UUR) (484), Ile (469), and Phe (374) are the most frequent amino acids in the mitochondrial proteins (Figure 3).
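As an illustration of how an RSCU value is obtained, the sketch below divides the observed count of each codon by the mean count across its synonymous codons, which is the standard definition of RSCU. The function name `rscu` and the codon counts are hypothetical and are not taken from Table 4.

```python
def rscu(codon_counts):
    """Relative synonymous codon usage for one synonymous codon family.

    RSCU = observed count of a codon / mean count over all synonymous codons.
    A value above 1.0 means the codon is used more often than expected
    under equal usage of the synonymous codons.
    """
    total = sum(codon_counts.values())
    if total == 0:
        raise ValueError("codon family has no observations")
    expected = total / len(codon_counts)
    return {codon: count / expected for codon, count in codon_counts.items()}


# Hypothetical counts for a four-fold degenerate family (illustrative only).
alanine = {"GCU": 60, "GCA": 30, "GCC": 8, "GCG": 2}
for codon, value in rscu(alanine).items():
    print(codon, round(value, 2))
```

In this toy family the codons ending in U and A have RSCU values above 1.0, which is the pattern that an A/T bias at the third codon position would produce.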

Control Region
The control region (AT-rich region) plays a crucial role in the initiation of transcription and replication of the mitogenome [31]. The AT-rich region (288 bp) of the C. perspectalis mitogenome is located between rrnS and trnM. The A + T content of this region was 96.2%, the highest in the C. perspectalis mitogenome. The GC-skew and AT-skew of the AT-rich region were 0.26 and 0.01, respectively (Table 3). Both values are positive, showing that G and A are more abundant than C and T.
Several conserved structures were found in the AT-rich region of C. perspectalis (Figure 4). The first is the motif ATAGG followed by a 17 bp poly-T stretch downstream of rrnS, which may represent the origin of minority (light) strand replication [32,33]. A microsatellite-like repeat, (AT)14, was also detected in the A + T-rich region. In addition, a 10 bp poly-A stretch was found immediately upstream of trnM. Multiple tandem repeat elements are usually present in the A + T-rich regions of most insects; however, no such tandem repeats were found in the A + T-rich region of the C. perspectalis mitogenome (Figure 4).
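A simple pattern search can locate features of this kind in a control-region sequence. The sketch below is an illustration only: the function name `scan_control_region`, the length thresholds, and the toy sequence are assumptions, and a real analysis would also need to consider the reverse complement, since such motifs are usually reported relative to one strand.

```python
import re


def scan_control_region(seq, min_poly_t=10, min_at_repeats=5):
    """Find an ATAGG(A) + poly-T motif and an (AT)n microsatellite.

    Returns a dict with (start index, poly-T length) for the motif and
    (start index, repeat count) for the first (AT)n run, or None if absent.
    Thresholds are illustrative assumptions, not values from the study.
    """
    seq = seq.upper()
    hits = {}
    motif = re.search(rf"ATAGGA?(T{{{min_poly_t},}})", seq)
    hits["motif_poly_t"] = (motif.start(), len(motif.group(1))) if motif else None
    micro = re.search(rf"(?:AT){{{min_at_repeats},}}", seq)
    hits["at_repeat"] = (micro.start(), len(micro.group(0)) // 2) if micro else None
    return hits


# Hypothetical toy sequence built to contain both features.
toy_region = "CCG" + "ATAGGA" + "T" * 17 + "GC" + "AT" * 14 + "G" + "A" * 10 + "C"
print(scan_control_region(toy_region))
```

The regular expressions are greedy, so each reported run covers the full stretch of consecutive T bases or AT repeats.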

Rearrangement of Gene
The gene arrangement of Pyraloidea mitogenomes is remarkably conserved. However, the gene order of C. perspectalis differs from that of the putative ancestral arthropod mitogenome. In the C. perspectalis mitogenome, the trnM gene is arranged as trnM-trnI-trnQ-nad2, whereas in the ancestral insect order trnM is located between trnQ and nad2 (Figure 5). The ancestral placement of the trnM gene cluster has been found in ghost moths [34]. The rearrangement in C. perspectalis therefore supports the view that the trnM gene cluster was rearranged after Hepialoidea diverged from the Pyraloidea lineages. tRNA rearrangements are thought to result from tandem duplication of part of the mitogenome, followed by random or non-random loss of the duplicated copies [35][36][37][38].
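The difference between the two gene orders can be made explicit with a short comparison. The sketch below assumes the ancestral cluster order trnI-trnQ-trnM-nad2 (consistent with trnM lying between trnQ and nad2) and uses a hypothetical helper, `single_translocations`, that reports every gene whose removal makes the two orders identical.

```python
ANCESTRAL = ["trnI", "trnQ", "trnM", "nad2"]  # assumed ancestral insect order
OBSERVED = ["trnM", "trnI", "trnQ", "nad2"]   # order reported for C. perspectalis


def single_translocations(reference, observed):
    """List genes whose relocation alone explains the difference in order.

    A gene qualifies if deleting it from both lists leaves identical orders.
    For a swap of two adjacent genes, both genes are reported.
    """
    if sorted(reference) != sorted(observed):
        raise ValueError("gene content differs between the two orders")
    if reference == observed:
        return []
    moved = []
    for gene in reference:
        ref_rest = [g for g in reference if g != gene]
        obs_rest = [g for g in observed if g != gene]
        if ref_rest == obs_rest:
            moved.append(gene)
    return moved


print(single_translocations(ANCESTRAL, OBSERVED))
```

For these two orders only trnM is reported, which matches a single translocation of trnM to the position upstream of trnI.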

Phylogenetic Analyses
Phylogenetic trees were constructed from the MAFFT nucleotide alignments (NT dataset) of the 13 mitochondrial PCGs using two methods (ML and BI), with S. litura as the outgroup. The BI and ML trees had identical topologies, and the monophyly of the families and subfamilies was strongly supported, in agreement with morphological characteristics and complete-mitogenome phylogeny [39]. Both trees based on the 13-PCG dataset show high node support values (Figure 6). The phylogenetic analysis shows that C. perspectalis is more closely related to Pygospila tyres than to the other species, indicating that C. perspectalis belongs to Spilomelinae (Crambidae, Pyraloidea). As shown in Figure 6, the monophyly of each superfamily is generally well supported, typically with posterior probabilities (PP) greater than 0.9 and bootstrap support (BS) greater than 75. The trees place three families within the Pyraloidea: Thyrididae, Pyralidae, and Crambidae. Regier et al. presented a molecular phylogenetic study of Pyraloidea using five nuclear genes, which led to a new classification of Crambidae into two sister lineages: the 'PS clade' (Pyraustinae and Spilomelinae) and the 'non-PS clade' (Glaphyriinae, Acentropinae, Crambinae, Schoenobiinae, and Scopariinae) [40]. Our phylogenetic results are consistent with the topologies derived from traditional classifications and from molecular data.
In Pyralidae, the relationships among four subfamilies, Galleriinae + (Phycitinae + (Pyralinae + Epipaschiinae)), have been widely supported by various combinations of mitogenomic data or multiple gene markers [40][41][42][43], and these relationships were also recovered using data from 14 nuclear genes. Meanwhile, the limited availability of mitogenomes precluded the Chrysauginae from being sampled in this study [44]. Orybina was regarded as a member of Pyralinae on morphological grounds. However, the molecular phylogenetic analysis placed Orybina away from Pyralinae and close to Galleriinae, which is consistent with a previous study that reported significant support values [45]. Within the Crambidae, Pyraustinae and Spilomelinae formed sister lineages in the 'PS clade', while the 'non-PS clade' was divided into two sister lineages: one included Glaphyriinae and Odontiinae, and the other included the remaining four subfamilies (Schoenobiinae, Crambinae, Scopariinae, and Nymphulinae). The subfamily-level topology of the 'non-PS clade' can be described as (Glaphyriinae + Odontiinae) + (Schoenobiinae + (Crambinae + (Scopariinae + Nymphulinae))); these results were strongly supported (BS ≥ 95, PP = 1.00) and consistent with previous research [1,46]. Nevertheless, since only a single sample was sequenced in this study, a better understanding of Pyraloidea mitogenomes requires expanded genome and taxon sampling, especially for Orybina and Chrysauginae.

Conclusions
In this study, we reported the complete mitogenome of Cydalima perspectalis and inferred the phylogenetic position of C. perspectalis using nucleotide sequences. The gene arrangement of the C. perspectalis mitogenome is similar to that of other Pyraloidea mitogenomes. All of the PCGs were initiated by ATN codons, except for cox1, which was initiated by CGA. Five genes had incomplete stop codons that contain only 'T'. All tRNA genes displayed the typical cloverleaf structure of mitochondrial tRNA, except for trnS1 (AGN). The control region contained an 'ATAGG(A)'-like motif followed by a poly-T stretch. Phylogenetic trees within Pyraloidea were constructed using the BI and ML methods. The results showed that C. perspectalis is more closely related to Pygospila tyres within Spilomelinae than to the other species of Crambidae and Pyraloidea. These molecular phylogenies support the morphological classification of the relationships within the Pyraloidea.