Nine Mitochondrial Genomes of the Pyraloidea and Their Phylogenetic Implications (Lepidoptera)

Simple Summary The Pyraloidea is a large superfamily of Lepidoptera in species composition. To date, the higher-level phylogenetic relationships in this group remain unresolved, and many taxa, with taxonomic positions historically established by morphological characters, need to be confirmed through sequencing of DNA, including mitochondrial genome sequences (mitogenomes). Here, we newly generated nine complete mitogenomes for Pyraloidea that shared identical gene content, and arrangements that are typical of Lepidoptera. The current phylogenetic results confirmed previous multilocus studies, indicating the effectiveness of mitogenomes for inference of Pyraloidea higher-level relationships. Unexpectedly, Orybina Snellen was robustly placed as basal to the remaining Pyralidae taxa, rather than nested in the Pyralinae of Pyralidae as morphologically defined and placed. Our results bring a greater understanding to Pyraloidea phylogeny, and highlight the necessity of sequencing more pyraloid taxa to reevaluate their phylogenetic positions. Abstract The Pyraloidea is one of the species-rich superfamilies of Lepidoptera and contains numerous economically important pest species that cause great loss in crop production. Here, we sequenced and annotated nine complete mitogenomes for Pyraloidea, and further performed various phylogenetic analyses, to improve our understanding of mitogenomic evolution and phylogeny of this superfamily. The nine mitogenomes were circular, double-stranded molecules, with the lengths ranging from 15,214 bp to 15,422 bp, which are comparable to other reported pyraloid mitogenomes in size. Gene content and arrangement were highly conserved and are typical of Lepidoptera. Based on the hitherto most extensive mitogenomic sampling, our various resulting trees showed generally congruent topologies among pyraloid subfamilies, which are almost in accordance with previous multilocus studies, indicating the suitability of mitogenomes in inferring high-level relationships of Pyraloidea. However, nodes linking subfamilies in the “non-PS clade” were not completely resolved in terms of unstable topologies or low supports, and future investigations are needed with increased taxon sampling and molecular data. Unexpectedly, Orybina Snellen, represented in a molecular phylogenetic investigation for the first time, was robustly placed as basal to the remaining Pyralidae taxa across our analyses, rather than nested in Pyralinae of Pyralidae as morphologically defined. This novel finding highlights the need to reevaluate Orybina monophyly and its phylogenetic position by incorporating additional molecular and morphological evidence.


Introduction
The Pyraloidea is one of the largest superfamilies in Lepidoptera and includes more than 16,000 described extant species with a worldwide distribution except Antarctica [1][2][3][4]. The food plants of Pyraloidea are highly diverse and contain many widely cultivated crops,

Samples, DNA Extraction and Mitogenome Sequencing
Adults were collected at Mountains Yaoshan and Jigongshan of Henan Province, China, from 2019 to 2020. Identification was conducted through morphology, by blasting the standard mitochondrial cox1 barcode to the GenBank database, or a combination thereof [27]. A total of nine species were sequenced, six of the Pyralidae, and the remaining three species belong to Crambidae. Detailed specimen information is shown in Table  S1, and voucher specimens are deposited in the Biology Laboratory of Zhoukou Normal University, China.
Total genomic DNA was extracted from the thoracic tissue of a single individual using DNeasy tissue kit (Qiagen, Germany), following the manufacturer's instructions. A total of nine libraries (one for each species) were constructed, and sequencing was conducted using an Illumina HiSeq 2500 platform with a strategy of 150 paired-ends.

Mitogenome Assembly and Annotation
Raw sequences were checked for quality control using the FastQC (http://www. bioinformatics.babraham.ac.uk/projects/fastqc, accessed on 16 June 2021). Clean paired reads were obtained by AdapterRemoval version 2 [28] and SOAPdenovo version. 2.01 [29]. Then, the mitogenome was assembled using the Geneious R11 [30]. The "map to reference" strategy was selected to map all cleaned reads to an "anchor" of the standard mitochondrial cox1 barcoding sequence amplified earlier using insect primer pair Lco1490 (F) and Hco2198 (R) [31]. After iteration up to 100 times with custom sensitivity, a target sequence with high coverage was generated. The beginning and end of the target sequence were checked, and a complete mitochondrial genome was generated and circularized using MEGA X [32] to delete the overlapping sequence. The mitogenome sequence was annotated using MITOS2 webserver [33] with invertebrate genetic code, and the gene boundaries were confirmed using MEGA X [32] by aligning the new mitogenome with previously reported pyraloid mitogenomes available on GenBank. The mitogenome map of the O. regalis, a representative of the nine species with mitogenomes sequenced in this study, was depicted on Tutools platform (http://www.cloudtutu.com, accessed on 12 September 2021).

Multiple Alignment
A total of 90 mitogenomes were analyzed, which include the nine newly sequenced in the present study, 60 downloaded from GenBank for Pyraloidea and the remaining 21 from Noctuoidea, Geometroidea, Gelechioidea and Bombycoidea as outgroup sequences (Table 1). Among the 37 mitochondrial genes, 13 PCGs were individually aligned with the codon-based mode in TranslatorX online platform [83]. A total of two rRNAs and 22 tRNAs were independently aligned with Q-INS-i algorithm as implemented in MAFFT online platform [84]. The aligned tRNA and rRNA sequences were filtered using ClipKIT [85] to delete ambiguously aligned sites with -g 0.8 algorithm.

Nucleotide Composition, Substitution Saturation and Heterogeneity of Sequence Divergence
Nucleotide composition was calculated using MEGA X [32]. Strand asymmetry was calculated according to the formulas: [86]. Tests of substitutional saturation were conducted with DAMBE version 5.3.74 [87,88] based on the Iss (i.e., index of substitutional saturation) statistic for different datasets. For this method, if Iss is smaller than Iss.c (i.e., critical Iss), the sequences may have experienced little substitutional saturation [89]. The heterogeneity of sequence divergences was detected by using AliGROOVE [90], with the default sliding window size and the gaps treated as ambiguous characters.

Phylogenetic Analyses
A total of five datasets were generated using MEGA X [32] in combination with PhyloSuite version 1. Maximum likelihood (ML) analyses were conducted using IQ-TREE 2.0.4 [91] under the partitioning schemes and corresponding substitution models (Tables S2 and S3) determined by ModelFinder [92]. Branch supports (BS) were calculated using 1000 ultrafast bootstrap replicates [93]. Bayesian inference (BI) analyses were performed with MrBayes version 3.2.6 [94] with the partitioned models (Tables S4 and S5) determined by PartitionFinder version 2.1.1 [95]. A total of twelve processors were used to perform Insects 2021, 12, 1039 6 of 16 two independent runs, each with six chains (five heated and one cold) simultaneously for 500,000 to 10,000,000 generations sampled every 100 generations. Convergences were considered to be reached when the estimated sample size (ESS) value was above 200, established by Tracer version 1.7 [96], and the potential scale reduction factor (PSRF) approached 1.0 [94]. The first 25% of samples were discarded as burn-in and the remaining trees were used to calculate posterior probabilities (PP) in a 50% majority-rule consensus tree. In addition, BI analyses were also performed using PhyloBayes MPI 1.5a [97,98]. The site-heterogeneous mixture model CAT-GTR was imposed for all datasets. Each analysis involved two independent runs, and the two runs were regarded convergent satisfactorily with the maxdiff < 0.1 calculated through the "bpcomp" program implemented in PhyloBayes MPI. A consensus tree was generated with the first 1000 trees of each run as burn-in.

General Features of the Sequenced Mitogenomes
A total of nine complete mitogenomes were annotated for nine species covering four subfamilies, two families of the Pyraloidea, with the lengths ranging from 15,214 bp (S. plagialis) to 15,422 bp (D. rubella) ( Table 2). All mitogenomes ( Figure 1) contained 37 mitochondrial genes typical in insects and showed identical gene organization to other reported pyraloid mitogenomes, which is also typical of Lepidoptera [14,16]. Similar to other insect mitogenomes [17], the A + T content of the nine mitogenomes were highly biased, showing 79.8% (D. rubella) to 81.7% (S. taiwanalis) in nucleotide composition. AT-skew and GC-skew are routinely used to describe the base composition of mitogenomes [86,99]. The negligible AT-skew and moderate GC-skew detected in the nine mitogenomes are similar to other Lepidoptera and most insect species [100]. The annotation information for the nine mitogenomes is summarized in Table S6, and all of them have been submitted to GenBank with the accession numbers shown in Table 2.

Tests of Substitution Saturation and Heterogeneity of Sequence Divergence
The final alignment yielded 11,211 bp of 13 combined PCGs, 2171 bp of two combined rRNAs and 1506 bp of 22 combined tRNAs. Tests of substitution saturation (Table 3) showed all observed values of Iss in the first and second coding positions of 13 PCGs, two rRNAs and 22 tRNA, were significantly lower than Iss.c values for both symmetrical and asymmetrical topologies. For the third coding positions of 13 PCGs, the value of Iss was significantly higher than the Iss.c value for asymmetrical topology, indicating some of these sites have suffered substantial saturation. Relative to the first and second coding positions of mitochondrial PCG, the third coding positions generally show a faster evolutionary rate due to synonymous mutation and might contain noise information in inferring high-level phylogenetic relationships [101]. Thus, in subsequent phylogenetic analyses, multiple datasets associated with the inclusion and exclusion of the third coding positions were considered. Analyses of sequence divergence heterogeneity (Figure 2) showed little heterogeneity among all sequences except for the E. angustea (KJ508052), H. undalis (KJ636057) and O. plangonalis (MF568543), which possibly ascribed to the existence of missing genes or gene fragments in these sequences.

Tests of Substitution Saturation and Heterogeneity of Sequence Divergence
The final alignment yielded 11,211 bp of 13 combined PCGs, 2171 bp of two combined rRNAs and 1506 bp of 22 combined tRNAs. Tests of substitution saturation (Table 3) showed all observed values of Iss in the first and second coding positions of 13 PCGs, two rRNAs and 22 tRNA, were significantly lower than Iss.c values for both symmetrical and asymmetrical topologies. For the third coding positions of 13 PCGs, the value of Iss was significantly higher than the Iss.c value for asymmetrical topology, indicating some of

Phylogenetic Analyses
By adding nine newly sequenced mitogenomes to 60 existing mitogenomes from GenBank, we performed a comprehensive phylogenetic analysis of Pyraloidea based on the hitherto most extensive mitogenome sampling, using five datasets and three inference methods.
The resulting trees (Figures 3-5) consistently showed the two families as monophyletic with strong supports (BS = 100, PP = 1.00) across our analyses. In Pyralidae, the relationship among four subfamilies were inferred as Galleriinae + (Phycitinae + (Pyralinae + Epipaschiinae)), which is identical with previous studies based on mitogenomic data [11,14,53,54] or multilocus data [4,10] regardless of the Chrysauginae that was not sampled herein because of the lack of an available mitogenome. Unexpectedly, Orybina Snellen, regarded as a member of Pyralinae morphologically [102,103], was consistently basal to the remaining taxa in the Pyralidae clade by all our analyses with high supports (BS > 90, PP > 0.90), rendering the Pyralinae paraphyletic. It should be noted that Polyterpnes, an Australian chrysaugine, was found to be basal to the Pyralidae as well [10]. Orybina was established in 1895, having O. flaviplaga from India as the type species. A total of nine species have been recorded for this genus with the distribution range generally covering the whole of Southeast Asia [102,103]. To date, molecular phylogenetic analysis of this genus has never been conducted. Zhu et al. [11] sequenced the partial mitogenome of O. plangonalis, but in their mitogenomic phylogenetic investigations this sequence was not sampled, probably due to its mitogenomic incompleteness. In this study, we sequenced the first complete mitogenome for Orybina, and conducted phylogenetic analyses with Orybina included for the first time. The phylogenetic position of this genus recovered herein indicated that its monophyly phylogenetic position, especially its association with the Pyralinae and Chrysauginae of Pyralidae, urgently need to be investigated, based on more extensive molecular data and morphological reassessment. In addition, the phylogenetic positions of the five other species with mitogenomes sequenced for Pyralidae were in accordance with morphological studies [1].
based on increased taxon sampling and molecular data (including mitogenomic clear genes or even the nuclear genome data) are needed to clarify the higher-lev tionships, and to confirm or revise the groups or taxa that have never been incl previous molecular phylogenetic studies.
The three mitogenomes sequenced for Crambidae in this study were all nes Spilomelinae in our resulting trees, reinforcing their positions in this subfamil lished by morphological evidence [8]. The Spilomelinae, with 4132 species assigne genera [4], represents the most speciose subfamily in Pyraloidea. The classificatio speciose subfamily had long been regarded as inconclusive, until recent studies con by Mally et al. [13] and Léger et al. [4]. However, great efforts are still needed to the taxonomically unplaced genera or unexamined genera in molecular phylogen vestigations to the Spilomelinae tribes [13].    In Crambidae, 46 available mitogenomes (including three sequenced herein) representing eight subfamilies were sampled. All our analyses assigned the eight subfamilies into two groups, generally corresponding the "PS clade" and "non-PS clade" defined by Regier et al. [10]. The subfamilies Pyraustinae and Spilomelinae constituted the "PS clade" with strong support (BS ≥ 95, PP = 1.00), in accordance with previous studies based on mitogenomic data [11,14,30,33,50] or multilocus data [4,10]. In the "non-PS clade", three close relationships among the six subfamilies can be recognized in most of our analyses. Of these, one was the Glaphyriinae and Odontiinae that corresponds to the "OG clade" in Regier et al. [10] and Léger et al. [4], although the two Glaphyriinae taxa sampled herein often did not cluster with each other, as also recovered by Qi et al. [14] using mitogenomic data. The sister relationship between Schoenobiinae and Acentropinae was recovered by all our analyses, except the PhyloBayes method of PCG123 dataset that placed Schoenobiinae as sister to Scopariinae but with weak support (PP < 0.5). The third was the close relationship between Crambinae and Scopariinae, that was also defined by Léger et al. [4] and in our analyses; only the ML method of PCGR dataset and PhyloBayes method of PCG123 dataset rejected this relationship. The "non-PS clade" has received intense attention in recent molecular phylogenetic investigations and several revisions have been proposed, which effectively supplemented the morphologically taxonomic systems [4,[10][11][12]14]. However, nodes linking some subfamilies in the "non-PS clade" remain unresolved. A reason for this may be that the taxon sampling in our present study and related studies remains limited relative to this speciose group [4]. Consequently, future investigations based on increased taxon sampling and molecular data (including mitogenomic and nuclear genes or even the nuclear genome data) are needed to clarify the higher-level relationships, and to confirm or revise the groups or taxa that have never been included in previous molecular phylogenetic studies.
.  The three mitogenomes sequenced for Crambidae in this study were all nested into Spilomelinae in our resulting trees, reinforcing their positions in this subfamily established by morphological evidence [8]. The Spilomelinae, with 4132 species assigned to 340 genera [4], represents the most speciose subfamily in Pyraloidea. The classification of this speciose subfamily had long been regarded as inconclusive, until recent studies conducted by Mally et al. [13] and Léger et al. [4]. However, great efforts are still needed to assign the taxonomically unplaced genera or unexamined genera in molecular phylogenetic investigations to the Spilomelinae tribes [13].

Conclusions
In this study, nine complete mitogenomes were determined for Pyraloidea, and comparative mitogenomics showed these mitogenomes were conserved in nucleotide composition, gene content and gene organization in Pyraloidea and typical for Lepidoptera. Based on the hitherto most extensive mitogenomic sampling, various phylogenetic trees of five datasets and three inference methods showed the relationships among the twelve included pyraloid subfamilies, which were generally congruent and provided robust supports for previous multilocus studies, indicating the suitability of the mitogenomes for inferring higher-level relationships of the Pyraloidea. Unexpectedly, O. regalis, a member of Pyralinae morphologically, was consistently basal to the remaining Pyralidae taxa together with O. plangonalis, raising the need to reevaluate the taxonomic status of Orybina by incorporating molecular and morphological evidence.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/insects12111039/s1, Table S1: Information of samples with mitogenomes sequenced in this study, Table S2: The partitioning schemes and corresponding substitution models determined by ModelFinder for the PCG123R dataset, Table S3: The partitioning schemes and corresponding substitution models determined by ModelFinder for the PCGAA dataset, Table S4: The partitioning schemes and corresponding substitution models determined by PartitionFinder for the PCG123R dataset, Table S5: The partitioning schemes and corresponding substitution models determined by PartitionFinder for the PCGAA dataset, Table S6