Characterization of the First Complete Mitochondrial Genome of Cyphonocerinae (Coleoptera: Lampyridae) with Implications for Phylogeny and Evolution of Fireflies

Simple Summary The classification of Lampyridae has been extensively debated. Although some recent efforts have provided deeper insight into it, few genes have been analyzed for Cyphonocerinae in the molecular phylogenies, which undoubtedly influence elucidating the relationships of fireflies. In this study, we generated the first complete mitochondrial genome for Cyphonocerinae, with Cyphonocerus sanguineus klapperichi as the representative species. The comparative analyses of the mitogenomes were made between C. sanguineus klapperichi and that of well-characterized species. The results showed that the mitogenome of Cyphonocerinae was conservative in the organization and characters, compared with all other fireflies. Like most other insects, the cox1 gene was most converse, and the third codon positions of the protein-coding genes were more rate-heterogeneous than the first and second ones in the fireflies. The phylogenetic analyses suggested that Cyphonocerinae as an independent lineage was more closely related to Drilaster (Ototretinae). Nevertheless, more sampler species are needed in the reconstruction of fireflies’ phylogeny to verify this result. Abstract Complete mitochondrial genomes are valuable resources for phylogenetics in insects. The Cyphonoceridae represents an important lineage of fireflies. However, no complete mitogenome is available until now. Here, the first complete mitochondrial genome from this subfamily was reported, with Cyphonocerus sanguineus klapperichi as a representative. The mitogenome of C. sanguineus klapperichi was conserved in the structure and comparable to that of others in size and A+T content. Nucleotide composition was A+T-biased, and all genes exhibited a positive AT-skew and negative GC-skew. Two types of tandem repeat sequence units were present in the control region (136 bp × 2; 171 bp × 2 + 9 bp). For reconstruction of Lampyridae’s phylogeny, three different datasets were analyzed by both maximum likelihood (ML) and Bayesian inference (BI) methods. As a result, the same topology was produced by both ML analysis of 13 protein-coding genes and 2rRNA and BI analysis of 37 genes. The results indicated that Lampyridae, Lampyrinae, Luciolinae (excluding Emeia) were monophyletic, but Ototretinae was paraphyletic, of which Stenocladius was recovered as the sister taxon to all others, while Drilaster was more closely related to Cyphonocerinae; Phturinae + Emeia were included in a monophyletic clade, which comprised sister groups with Lampyridae. Vesta was deeply rooted in the Luciolinae.


Introduction
Lampyridae Rafinesque, 1815 is a cosmopolitan family consisting of about 100 genera and 2200 species [1][2][3]. It is an amazing bioluminescent beetle group, with all species for long-term storage and further molecular analyses, which were deposited in the Museum of Hebei University (MHBU, accession No. 2CAN0196).
The whole mitochondrial genome sequence was sequenced using an Illumina Novaseq 6000 platform with 150 bp paired-end reads at BerryGenomics, China. The sequence reads were first filtered by the programs following Zhou et al. [34], and then the remaining high-quality reads were assembled using IDBA-UD [35], under similarity threshold 98%, and k values minimum 40 and maximum 160 bp. The gene cox1 was amplified by polymerase chain reaction (PCR) using universal primers as 'reference sequences' to target mitochondrial scaffolds by IDBA-UD [35] to acquire the best-fit, which is under at least 98% similarity. Geneious 2019.2 [36] software was used to manually map the clean readings to the obtained mitochondrial scaffolds to check the accuracy of the assembly.

Genome Annotation and Analyses
Gene annotation was done by Geneious 2019.2 [36] software and the MITOS web server (http://mitos.bioinf.uni-leipzig.de/index.py, accessed on 20 March 2021) [37]. The positions and secondary structures of 22 tRNAs were estimated by a combination of the results predicted by an ARWEN and tRNAscan-SE Search Server v.1.21 [38,39]. The mitogenomic circular map was produced using a CGView Server (http://stothard.afns.ualberta.ca/ cgview_server, accessed on 20 March 2021) [40]. The skewness was determined with base composition of nucleotide sequences by using the formula: AT skew = [A − T]/[A + T], GC skew = [G − C]/[G + C] [41]. The Tandem Repeat Finder program was used to predicted tandem repeats in A + T-rich region [42]. The relative synonymous codon usage (RSCU) was analyzed by MEGA 7.0 [43]. DnaSP v5. 10.01 [44] was used to calculate the nucleotide diversity (Pi) and sliding window analysis (a sliding window of 200 bp and a step size of 20 bp) based on 13 aligned protein-coding genes (PCGs) and non-synonymous (Ka)/synonymous (Ks) substitution rates among the 13 PCG. The base composition and component skew were analyzed using PhyloSuite v1.2.2 [45]. The genetic distances were computed using MEGA 7.0 with the Kimura-2-parameter model. SymTest v2.0.47 [46] with Bowker's matching pair symmetry test was used to analyze the differences of heterogeneous sequences in the datasets, and the heat maps were generated according to the inferred p-values.

Phylogenetic Analysis
We followed the classification of Lampyridae by Martin et al. [3]. In addition to the newly sequenced genome here for Cyphonocerinae, another 32 species representing four subfamilies of Lampyridae were selected as the ingroups, which are the previously published complete or almost complete mitochondrial genomes downloaded from GenBank (Table 1). Two species of Rhagophthalmidae (Rhagophthalmus ohbai Wittmer, 1994) and Phengodidae (Phrixothrix hirtus Olivier, 1909) were chosen as the outgroups [2].
Data standardization and information extraction were performed by PhyloSuite v 1.2.2 [45]. The 13 PCGs were aligned using the MAFFT algorithm implemented in Transla-torX [47] with the L-INS-i strategy. The 2 rRNAs and 22 tRNAs were aligned with MAFFT version 7 online services using the G-INS-i strategy. Gblocks v 0.91b [48] was used to remove the gaps and ambiguously aligned sites. The aligned data were concatenated with Sequence Matrix v.1.7.8 [49] and PhyloSuite v 1.2.2. The alignment of the individual gene was concatenated into three datasets: (i) the PCGrRNA matrix, including 13 PCGs and 2 rRNA genes (12,875 bp), (ii) the PCG12RNA matrix, including the first and second codon positions of the PCGs and 2 rRNA genes (9236 bp), and (iii) the PCGRNA matrix, including 13 PCGs, 2 rRNA genes and 22 tRNA genes (14,321 bp).
All datasets were analyzed using maximum likelihood (ML) on the IQ-TREE web server (http://iqtree.cibiv.univie.ac.at/, accessed on 6 April 2021) [50] with the GTR+G+I substitution model (Supplementary Tables S1-S3). Bayesian inference (BI) was also used for the phylogenetic analyses either by PhyloBayes MPI v.1.7a [51] (for the PCG12RNA and PCGrRNA matrixes) with the site-heterogeneous mixture CAT + GTR model (Supplemen-tary Table S4), or by MrBayes 3.2.6 [52] (for the PCGRNA matrix) with two independent Markov Chain Monte Carlo chains run of 2 × 10 6 generations, of which the tree was sampled every 1000 generations, and the initial 25% of sampled data were discarded as burn. ITOL (http://itol.embl.de/, accessed on 10 April 2021) [53] was used to annotate and beautify the phylogenetic tree. Note: "unpublished" means the sequence with an accession number of the species could be download from the NCBI, but the publication could not be found.

Genomic Structure and Base Compositions
As in other fireflies, the mitogenome of C. sanguineus klapperichi ( Figure 1) is a typical double-strand circular molecule and contains 13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs), 2 ribosomal RNA(rRNAs) genes, and a control region (CR) or ATrich region, in which 14 genes (8 tRNAs, 4 PCGs, and 2 rRNAs) are transcribed from the minority strand (N-strand), while others (14 tRNAs and 9 PCGs) from the majority strand (J-strand). The annotated sequence was registered in GenBank with accession number MW365445. This is the first complete mitogenome record for Cyphonocerinae. Seven gene overlaps are present in C. sanguineus klapperichi mitogenome (Supplementary Table S5), ranging from 1 to 4 bp, with the longest overlap (4 bp) occurring between atp6 and atp8, and also nad4 and nad4l, respectively. The overlap between atp6 and atp8 is also found in mitogenomes of other arthropods [66][67][68]. Moreover, there are 13 intergenic spacer regions between genes (Supplementary Table S5), of which the total length is 211 bp, with the longest intergenic spacer (72 bp) exists between trnC and trnW. This result shows that the number and length of gene spacers are significantly higher than those of gene overlaps.
Known complete mitogenomes of fireflies range from 15,950 bp (Asymmetricata circumdata) to 18,054 bp (Pyrocoelia thibetana) (Supplementary Table S6). The mitogenome of C. sanguineus klapperichi is 16,443 bp in length (Supplementary Table S5) and slightly shorter than most others, of which the average length is 16,855 bp.
For fireflies' mitogenomes, the sizes of the control region vary greatly among different species, whereas the PCGs, tRNAs, and rRNAs show little variation in length (Figure 2A, Supplementary Table S7). This suggests that the mitogenome size of different fireflies is largely determined by the size of control regions, like other insects [68]. The base composition of C. sanguineus klapperichi is A (42.5%), T (34.2%), C (13.9%), and G (9.4%), respectively ( Table 2). It contains a slightly lower A+T content (76.7%) and a higher G + C content (23.3%), compared to that of other firefly species, which have an average value of A + T content being 78.0%, varying from 75.7% to 80.7% (Supplementary  Table S6). This lower A + T bias in C. sanguineus klapperichi is reflected in all components of its genome, except tRNAs are near to the average value ( Figure 2B). The nucleotide skew analysis shows that the full mitogenome of C. sanguineus klapperichi exhibits a positive AT-skew (0.11) and a negative GC-skew (−0.19) ( Table 2). A similar pattern is found in all other firefly mitogenomes, with the AT-skew ranging from 0.06 (Photinus pyralis) to 0.14 (Luciola cruciata) and a GC-skew varying from −0.08 (Diaphanes nubilus) to −0.24 (Lamprigera yunnana) (Supplementary Table S6). These results indicate that Ototretinae (Drilaster sp. and Stenocladius sp.) has the strongest A skew and weakest C skew, while Cyphonocerinae (C. sanguineus klapperichi) has an average value of AT-skew and GC-skew in comparison with other known firefly mitogenomes ( Figure 2C,D). This base composition bias has been suggested to be associated with replication and transcription of the mitochondrial genome [67].

Protein-Coding Genes
The overall size (excluding stop codons) of 13 PCGs of C. sanguineus klapperichi is 11,008 bp in length, accounting for 66.95% of the total genome ( Table 2). Like the full mitogenome, the whole PCGs show a slightly lower A+T content (75.1%), of which the third codon position (80.2%) is higher than those of the first and second codon positions (72.3% and 72.9%, respectively). The AT-skew (0.09) is positive, while GC-skew (−0.17) is negative for the PCGs, reflecting a bias towards nucleotides A and C than their counterparts.
All PCGs of C. sanguineus klapperichi are initiated with the standard ATN codons and terminated with TAA/TAG or a truncated termination codon T (Supplementary Table S5). These incomplete stop codons are thought to be ubiquitous in metazoan [26] and have been supposed to be completed through posttranscriptional polyadenylation [69].
The codon usage analysis of C. sanguineus klapperichi shows that the most frequently used codons are UUA-Leu (352), AUU-Ile (346), UUU-Phe (326), and AUA-Met (235) (Figure S1A, Supplementary Table S8). The UUA-Leu also has the highest RSCU value (3.9), further indicating that UUA is the most preferred codon. The RSCU values of the PCGs reveal that there is a higher frequency in the usage of AT than that of GC in the third codon positions (Figure S1B, Supplementary Table S8).
Sliding window analysis was implemented to study the nucleotide diversity of 13 PCGs among fireflies exhibited in Figure 3A. Nucleotide diversity values range from 0.187 (cox1) to 0.337 (atp8). Among the genes, atp8 (Pi = 0.337) has the highest variability, followed by nad6 (Pi = 0.303), nad2 (Pi = 0.282), and nad3 (Pi = 0.257) (Supplementary  Table S9). In contrast, cox1 (Pi = 0.187) and nad1 (Pi = 0.198) have relatively low values and are the most conserved of the 13 PCGs. This result indicates that the nucleotide diversity is highly variable among the 13 PCGs.
The ratio (ω) of non-synonymous (Ka) to synonymous (Ks) substitution rates, which is a diagnostic statistical method to detect molecular adaption [70,71], is used to estimate the evolutionary rate among insects. In Lampyridae, the genes atp8 (0.777), nad6 (0.641), and nad2 (0.501) have comparatively high Ka/Ks ratios, while cox1 (0.165), cox2 (0.266), and cox3 (0.282) have relatively low values ( Figure 3B, Supplementary Table S9). The average Ka/Ks (ω) of 13 PCGs of the fireflies are all less than 1, indicating that these genes are under purifying selection [72]. Heterogeneity of nucleotide divergence was examined under pairwise comparisons in a multiple sequence alignment ( Figure 4). The datasets PCGrRNA and PCGRNA exhibit higher heterogeneous sequence divergence than PCG12rRNA, indicating that the third codon positions are more rate-heterogeneous than the first and second ones. The higher compositional heterogeneity may result in systematic errors in phylogenetic analyses [27].

Transfer and Ribosomal RNA Genes
The complete set of 22 typical tRNAs were all found in the mitochondrial genome of C. sanguineus klapperichi, and their secondary structures are shown in Supplementary Figure S2. The total length of tRNAs is 1432 bp, ranging from 62 bp (trnF, trnG, trnH, and trnL1) to 71 bp (trnK) in size (Supplementary Table S5). The AT-skew (0.07) is positive, and GC-skew (−0.15) is negative (Table 2), indicating a preference for using A base over T and G over C.
Most tRNAs exhibit the typical clover-leaf structures, except trnS1 missing the dihydrouridine (DHU) arm (Supplementary Figure S2). Lacking the DHU arm in trnS1 is a common feature for most metazoan mitogenomes [26]. These aberrant tRNAs are supposed to sustain their function via a posttranscriptional RNA editing mechanism [73,74]. In the tRNAs, except the classic base pairs (A-U and C-G), 22 non-canonical base pairings (G-U and A-C), and 8 other mismatched base pairs (U-U, A-A, A-G) were found in the arms (Supplementary Figure S2).
In the mitogenome of C. sanguineus klapperichi, the rrnL (1268 bp) and rrnS (766 bp) are located in the conserved positions between trnL and trnV, and trnV and the control region, respectively (Figure 1). There is a little variation in the sizes of both rRNAs among firefly species (Figure 2A, Supplementary Table S7). The A+T content of rrnL and rrnS are 81.5% and 78.7%, respectively (Supplementary Table S7), which are slightly lower than most other firefly species ( Figure 2B). The overall rRNA shows a positive AT-skew (0.17) and a negative GC-skew (−0.34), which shows a slight bias toward using A and an obvious bias toward G ( Table 2).

Control Region
The control region of C. sanguineus klapperichi was identified by the position between rrnS and trnI, spanning 1776 bp in length ( Figure 1, Table 2). This is comparable to that of most other fireflies, which have an average length of 1810 bp (Supplementary Table S7), ranging from 1400 bp (Photinus pyralis) to 2341 bp (Diaphanes sp.). Generally, the length of the control region varies more than other components in the fireflies (Figure 2A). The control region is supposed to be involved in the initiation of replication and transcription of mitogenomes [75].
The control region of mitogenome in C. sanguineus klapperichi has a slightly lower A+T content (81.2%) compared to most other firefly species ( Figure 2B), of which the average value is 86.6% (Supplementary Table S7). The AT-skew (0.12) is positive, while the GC-skew (−0.23) is negative (Table 2), showing a bias towards using A and G.
Within the control region, two tandem repeat sequence units are present in the mitogenome of C. sanguineus klapperichi; their positions and length are shown in Figure S3. They are a 136 bp-sequence tandemly repeated twice and a 171 bp-sequence tandemly repeated twice with a partial third repeat (9 bp), respectively. In Lampyridae, the tandem repeat sequences within the control region are quite diverse. They have been found in most firefly species which are all located between rrnS and trnI and vary widely in size and number of repeat units but are absent in Luciola substriata (Supplementary Figure S3). The species has two different types of repeat units at least, and every unit repeated at least twice, except Ellychnia corrusca, Bicellonycha lividipennis, and Asymmetricata circumdata, which have only one type of repeat unit. The most diverse tandem repeat sequence units happen in Pyrocoelia praetexta, which includes five types of repeat units. The longest repeat unit was found in Diaphanes mendax, which contained two 303 bp repeats. However, in Diaphanes pectinealis, the tandem repeat region is the shortest, with a 9 bp repeat unit tandemly repeated six times plus a 6 bp partial sequence.

Phylogenetic Analysis
Analyses of the three datasets resulted in nearly identical and fully resolved topologies with high nodal support values under ML and BI methods. What is noted, the BI reconstruction of 37 genes dataset and ML reconstruction of the 13 PCGs and 2rRNA dataset produced the same topology shown in Figure 5.
In all topologies ( Figure 5, Supplementary Figures S4-S7), the monophyly of Lampyridae was well supported based on mitogenomes of different genes datasets (PP = 1/0.98/0.92, BS = 100/100/96). Furthermore, the monophyly of Lampyrinae and Luciolinae Lacordaire, 1857 (with Emeia Fu, Ballantyne et Lambkin, 2012 excluded), including 11 and 16 representative species respectively, were highly supported (PP = 1, BS = 100). The clade composing Photurinae Lacordaire, 1857 (only one species included) and Emeia was suggested to be a monophyly (PP = 1, BS = 100), which was a recovered sister to Lampyrinae with high support value (PP = 1, BS = 100). However, the monophyly of Ototretinae McDermott, 1964 was not recovered, with the sampler genera Drilaster Kiesenwetter, 1879 and Stenocladius Fairmaire in Deyrolle and Fairmaire, 1878 splitting in different clades. The monophyly of the monotypic Cyphonocerinae with a single species here could not be tested.  Figure  S4), they were a recovered sister to Luciolinae but with very low support (BS = 35). However, based on the other two datasets of BI analyses (Supplementary Figures S6 and S7), C. sanguineus klapperichi was solely suggested as a sister to Lampyrinae + (Photurinae + Emeia), also with high support values (PP = 0.99/0.89), while Drilaster was in uncertain relationships with or sister to these taxa.  Figure S4), but with a low support value (BS = 56).

Features of Mitochondrial Genomes in Lampyridae
In all known mitochondrial genomes of Lampyridae, both the size and A+T content vary greatly for the control region, but a less variation for PCGs, tRNAs, and rRNAs, respectively, indicating that the control region is an important component that heavily affects the size and total A+T content of fireflies' mitogenome.
Fireflies exhibit the typical A+T-biased composition of insect mitogenomes [26,76,77], in either the full genome or each component, all over 75.1%. The biological reasons for such A+T-biased compositional heterogeneity have been extensively investigated [78][79][80], and one of the hypotheses is the energy efficiency trade-offs which has been experimentally rested [79]. The hypothesis suggests that the resources for nucleotide production are limited, and synthesis of G+C consumes more energy and nitrogen than A+T, so A and T are preferred nucleotides [79]. Although the mitogenome of fireflies is A+T-biased, Cyphonocerinae have a slightly lower value of A+T content in comparison with most others (Figure 2A). What is more interesting, the size of the full genome of Cyphonocerinae is about 400 bp shorter than those of the other fireflies (Supplementary Table S6). Therefore, it is presumed that the higher G+C content in this group seems to be compensated by the shortened mitogenome, which is likely to be shaped by selection for efficient usage [30]. Additionally, this is suggested to be a molecular strategy to ensure a reliable protein synthesis under high temperatures [76].
In Lampyridae, the AT-skew is all positive while GC-skew is negative, indicating the base composition bias towards A and C than their counterparts. Although the cases for such skewed strand composition are multifactorial, most of the hypotheses suggest that the strand asymmetry is the result of mutations and selection pressures [81], and the value of GC-skew of insect mitogenomes seem to be associated with replication orientation [79].
The analyses of nucleotide diversity, pairwise genetic distances, and Ka/Ks (ω) all showed that in the Lampyridae, the genes atp8, nad6, and nad2 to be more variable and evolve faster, while cox1 is more conserved and evolves comparatively slowly. Nucleotide diversity analyses are useful for designing species-specific markers, especially in taxa where morphological identification is difficult and ambiguous [82,83]. The cox1 gene is often used as a universal barcode for species identification for the insects [84][85][86], but for the fireflies, its low variability indicates that it is more suitable for exploring the phylogenetic relationships among the higher grades. In contrast, those genes exhibiting an optimal combination of fast evolution and sufficiently large size, such as nad6, should be evaluated as potential DNA markers for species and/or population identification.
The subfamily Ototretinae was established by McDermott [92] and redefined in a broad sense by Crowson [14], which was followed by Lawrence and Newton [91] and Brancucci and Geiser [93]. This group consists of several genera, including Drilaster and Stenocladius. However, Branham and Wenzel [16] excluded the latter taxa from Lampyridae on the basis of a morphological phylogeny, which was supported by Lawrence et al. [15], who placed them in Elateriformia incertae sedis, but against the molecular phylogenetic analyses by Bocakova et al. [22] and Sagegami-Oba et al. [94], also not adopted by Geisthardt and Satô [95]. Recently, this subfamily was revised by Janisova and Bocakova [96], based on the comparative morphology of adults, and transferred the above genera to Lampyridae again. In the most recent molecular phylogenetics, the members of Ototretinae were deeply rooted in Lampyridae with high support value [2,3,20] but often suggested to be a paraphyletic group [2,3], which is congruent with the present study.
Furthermore, our analyses indicated that Stenocladius was most probably the basal lineage of Lampyridae, which correlated with that of Li et al. [97]. In addition, Cyphonocerus (Cyphonocerinae) was recovered more closely related to Drilaster, which is in agreement with that of Martin et al. [20]. The clade of Drilaster and Cyphonocerus seemed more closely related to Lampyrinae but Luciolinae, which is in agreement with Suzuki [21], but against those results of Martin et al. [20] or Chen et al. [2].
Unfortunately, the monophyly of Cyphonocerinae could not be tested in this study due to a lack of more material of Cyphonocerus. Furthermore, the phylogenetic relationships among Psilocladus Blanchard, 1846 and Pollaclasis Newman, 1838 were not evaluated because of a deficiency of the complete mitogenome data. The three taxa were included in the subfamily Psilocladinae by Jeng [4,5], which was redefined by Martin et al. [3] as a monotypic subfamily (including only Psilocladus) and Pollaclasis in Lampyridae incertae sedis, meanwhile, Cyphonocerus left as the sole member of Cyphonocerinae. The efforts need to be made to clarify their relationships in the future when the material is available for all these genera.
Previous works have recovered Lamprigera in various positions within Lampyridae [3,20,61,97]. In this study, Lamprigera was either a recovered sister to Luciolinae, which is congruent with Martin et al. [3] and Chen et al. [2], or a sister to the remaining fireflies except for Stenocladius, similar to that of Li et al. [97], but never be a member of Lampyrinae [98]. Given this incongruence, the exact position of Lamprigera remains uncertain, as what has been done by Martin et al. [3,20], placing it as Lampyridae incertae sedis. To rigorously test the classification of Lamprigera relative to other subfamilies, an expanded taxon sampling including deeper species coverage of this genus will be needed.
Additionally, here we recovered Vesta in the Luciolinae for the first time and sister to Pristolycus with strong support and congruence among all of our analyses. It was once placed in Amydetinae [92] or Lampyrinae [90] and was noted to be a paraphyletic group with some species from Photurinae and Lampyrinae by Jeng [98]. Evidence from individual or multi-molecular markers supported the position of Vesta near Photurinae in the phylogenetics [2,3,20,97] and was placed in the Lampyridae incertae sedis. However, this placement was based on a single Vesta species; as a specious genus, more taxa should be included in the future study to verify this result.
It is surprising that the monotypic Emeia and Bicellonycha (Photurinae) comprised a monophyletic clade in all of our analyses. Regardless of this result, their morphological characteristics, biology, and luminous behaviors are substantially different [99]. In addition, its placement in Luciolinae has been well supported by both morphological [99] and molecular phylogenies [2,3]. Since only one species of Photurinae was included in our analyses, an expanded sampling taxon will be needed to test this placement, so we do not make any taxonomic change here.
Above all, the molecular phylogenies, including this study, were analyzed on the basis of a minority part of species or limited molecular data in comparison to an estimated 2200 species of fireflies worldwide. Therefore, many more species need to be included in future analysis to establish a solid and dependable classification of Lampyridae. Particularly, the complete mitochondrial genomes should be encouraged to accumulate more for Lampyridae, in view of its high value in investigating phylogenetic relationships of the insects.

Conclusions
In the present study, we generated and analyzed the first complete mitochondrial genome for Cyphonocerinae, with C. sanguineus klapperichi as a representative. Compared with that of all well-characterized mitochondrial genomes of fireflies, the mitogenome of Cyphonocerinae is highly conserved in structure and size. It is a highly A+T-biased composition, with a positive AT-skew and negative GC-skew. It has conserved codon usage of protein-coding genes and secondary structures of tRNAs, as well as a unique type of tandem repeat sequence units present in the control region. This provides the basic information to perform comparative analyses and further discussion of the mitogenomes' evolution of Lampyridae.
Furthermore, the nucleotide diversity, genetic distance substitution rates, and heterogeneity of nucleotide divergence were analyzed and examined. The result indicates that in Lampyridae, the genes atp8, nad6, and nad2 are more variable and evolve faster, while cox1 is more conserved and evolves comparatively slowly. Furthermore, the third codon positions are more rate-heterogeneous than the first and second ones.
Moreover, the phylogenetic trees of Lampyridae were reconstructed based on three different datasets by both maximum likelihood (ML) and Bayesian inference (BI) methods. The result suggests that Lampyridae, Lampyrinae, Luciolinae (excluding Emeia) are monophyletic. Ototretinae is paraphyletic, of which Stenocladius is at the basal lineage and sister to all others, while Drilaster is more closely related to Cyphonocerinae. Lampyridae + (Photurniae + Emeia) comprises sister groups, and the latter two are a monophyletic clade.
Here Vesta is recovered in Luciolinae for the first time. Nevertheless, large-scale analyses with denser taxon sampling are needed to confirm the present results.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/insects12070570/s1, Table S1: The best partitioning schemes and models for the Maximum likelihood (ML) method in the analysis of the PCG12rRNA matrix; Table S2: The best partitioning schemes and models for the Maximum likelihood (ML) method in the analysis of the PCGrRNA matrix; Table S3: The best partitioning schemes and models for the Maximum likelihood (ML) method in the analysis of the PCGRNA matrix; Table S4: The best partitioning schemes and models for the Bayesian inference (BI) method in the analysis of the PCGRNA matrix; Table S5: Mitogenomic organization of C. sanguineus klapperichi. Table S6: Comparison of length, nucleotide composition, and skewness of the representative fireflies' mitochondrial genomes; Table S7: Comparison of length and A+T% in different components of mitogenomes of the representative fireflies; Table S8: The count and relative synonymous codon usage (RSCU) of C. sanguineus klapperichi; Table S9: Nucleotide diversity, synonymous and non-synonymous substitution and genetic distance of representative fireflies. Figure S1: The codon usage numbers (A) and the RSCU (B) of amino acids used for the construction of 13 protein-coding genes of C. sanguineus klapperichi. Codon families are labeled below the X-axis. Figure S2: Secondary structures of the tRNAs in the mitochondrial genome of C. sanguineus klapperichi. Non-canonical base pairings are indicated in green and mismatched base pairs in red. Figure S3: Repeat units of the control region of the mitochondrial genome from C. sanguineus klapperichi and the representative species from other firefly subfamilies. Different colors of square bars indicate different tandem repeats. rrnS and tRNAs are indicated by the gray and purple square bars, respectively. Figure S4: Phylogenetic tree of Lampyridae produced from ML analysis of PCGRNA dataset; Figure S5: Phylogenetic tree of Lampyridae produced from ML analysis of PCG12rRNA dataset; Figure S6: Phylogenetic tree of Lampyridae produced from BI analysis of PCGrRNA dataset; Figure S7: Phylogenetic tree of Lampyridae produced from BI analysis of PCG12rRNA dataset.

Data Availability Statement:
The sequence generated in this study is deposited in GenBank with accession number (MW365445).