Complete Mitochondrial Genomes of Metcalfa pruinosa and Salurnis marginella (Hemiptera: Flatidae): Genomic Comparison and Phylogenetic Inference in Fulgoroidea

The complete mitochondrial genomes (mitogenomes) of two DNA barcode-defined haplotypes of Metcalfa pruinosa and one of Salurnis marginella (Hemiptera: Flatidae) were sequenced and compared to those of other Fulgoroidea species. Furthermore, the mitogenome sequences were used to reconstruct phylogenetic relationships among fulgoroid families. The three mitogenomes, including that of the available species of Flatidae, commonly possessed distinctive structures in the 1702–1836 bp A+T-rich region, such as two repeat regions at each end and a large centered nonrepeat region. All members of the superfamily Fulgoroidea, including the Flatidae, consistently possessed a motiflike sequence (TAGTA) at the ND1 and trnS2 junction. The phylogenetic analyses consistently recovered the familial relationships of (((((Ricaniidae + Issidae) + Flatidae) + Fulgoridae) + Achilidae) + Derbidae) in the amino acid-based analysis, with the placement of Cixiidae and Delphacidae as the earliest-derived lineages of fulgoroid families, whereas the monophyly of Delphacidae was not congruent between tree-constructing algorithms.


Introduction
In Hemiptera,~1670 mitochondrial genomes (mitogenomes) are available, and for the suborder Auchenorrhyncha, which is composed of two infraorders, Cicadomorpha and Fulgoromorpha, 982 mitogenomes are available (as of August 2021). In Auchenorrhyncha, the sequences of 558 mitogenomes are available for the infraorder Fulgoromorpha, which consists of the monotypic superfamily Fulgoroidea and comprises 13,428 species of 2350 genera in 21 families, but approximately half of these are from the same species (e.g., Lycorma delicatula in Fulgoridae) [1,2]. Therefore, complete or near complete mitogenome sequences are available for a total of 20 species (excluding nine unknown species) of 22 genera from 10 subfamilies and eight families. Therefore, mitogenome sequences are available for a limited number of Fulgoroidea, presenting only one genus in many subfamilies. These circumstances hamper phylogenetic scrutiny, reconstruction and evolutionary genomic comparison at various levels of taxonomic groups and, thus, require additional mitogenome sequences from a diverse taxonomic group.
The mitogenome sequences in Fulgoroidea have been used in several studies of evolution, including those involving mitogenomic announcement, illustration of intraspecies diversity, genomic characteristics, diversity of rearrangement, and phylogenetic inference .
For the illustration of phylogenetic relationships among fulgoroid families, a diverse set of morphological characteristics has been scrutinized, including the number of spines on the second segment of the hind tarsi [24], primarily for adult morphology, including the female genitalia [25], adult and larval morphology [26], adult female genitalia [27], and larval metatarsi [28]. Additionally, multiple genes, such as 18S rRNA, 28S rRNA, Histone3, Wingless [29], 18S rRNA, 28S rRNA, 16S rRNA, and CytB [30] have been used to infer phylogenetic relationships in the Fulgoroidea. Furthermore, mitogenome-based analyses have also been performed in several studies with varying degrees of ingroup diversity, mainly using 13 protein-coding gene (PCG) sequences [11,13,15,16,21,22]. These studies have greatly improved our understanding of the phylogenetic relationships of fulgoroid families, but additional studies are still required, particularly those that investigate conflicting relationships and include a diverse taxonomic group ( Figure 1).
Curr. Issues Mol. Biol. 2021, 1, FOR PEER REVIEW 2 The mitogenome sequences in Fulgoroidea have been used in several studies of evolution, including those involving mitogenomic announcement, illustration of intraspecies diversity, genomic characteristics, diversity of rearrangement, and phylogenetic inference .
For the illustration of phylogenetic relationships among fulgoroid families, a diverse set of morphological characteristics has been scrutinized, including the number of spines on the second segment of the hind tarsi [24], primarily for adult morphology, including the female genitalia [25], adult and larval morphology [26], adult female genitalia [27], and larval metatarsi [28]. Additionally, multiple genes, such as 18S rRNA, 28S rRNA, His-tone3, Wingless [29], 18S rRNA, 28S rRNA, 16S rRNA, and CytB [30] have been used to infer phylogenetic relationships in the Fulgoroidea. Furthermore, mitogenome-based analyses have also been performed in several studies with varying degrees of ingroup diversity, mainly using 13 protein-coding gene (PCG) sequences [11,13,15,16,21,22]. These studies have greatly improved our understanding of the phylogenetic relationships of fulgoroid families, but additional studies are still required, particularly those that investigate conflicting relationships and include a diverse taxonomic group (Figure 1).  [24] based on the number of spines on the second segment of the hind tarsi. (B) Asche [25] based primarily on adult morphological characteristics, including features of the female genitalia. (C) Emeljanov [26] based on adult and larval morphology. (D) Bourgoin [27] based on adult female genitalia. (E) Chen and Yang [28] based  [24] based on the number of spines on the second segment of the hind tarsi. (B) Asche [25] based primarily on adult morphological characteristics, including features of the female genitalia. (C) Emeljanov [26] based on adult and larval morphology. (D) Bourgoin [27] based on adult female genitalia. (E) Chen and Yang [28] based on larval metatarsi. (F,G) Urban and Cryan [29] based on 18S rDNA, 28S rDNA, Histone3, and Wingless using the Parsimony method and Bayesian inference (BI) method, respectively. (H,I) Song and Liang [30] based on 18S rDNA, 28S rDNA, 16S rDNA, and CytB using the Maximum Likelihood (ML) and BI methods, respectively. (J) Zhang et al. [11] based on 13 protein-coding genes (PCGs) of mitochondrial genomes (mitogenomes), using the Neighbor-Joining method. (K,L) Song et al. [15] based on 13 PCG, 22 tRNA, and two rRNA of mitogenomes, using the ML and BI methods, respectively. (M) Huang and Qin [13] based on 13 PCGs of mitogenomes using the ML method. (N) Yu and Liang [16] based on 13 PCGs of mitogenomes using the ML and BI methods. (O) Wang et al. [21] based on 13 PCGs of mitogenomes using the ML and BI methods. (P) Xu et al. [22] based on 13 PCGs of mitogenomes using the ML and BI methods.
The flatid planthopper, Metcalfa pruinosa (Hemiptera: Flatidae), is a polyphagous species that causes economic and aesthetic damage to diverse types of plants and is one of the most common species in eastern North America [31,32]. However, it was accidentally introduced to European countries, Russia, and Australia [33,34] and was also recorded in South Korea in 2005 [35]. Salurnis marginella (Hemiptera: Flatidae) is distributed in several Asian countries, and it invaded South Korea in 2013 from southwest China [36,37]. Currently, the damage caused by the species is negligible due to its small population size, but ecological adaptations to South Korean ecosystems are seriously concerning [38].
In this study, we determined the complete mitogenome sequences from two individuals of M. pruinosa and one of S. marginella. The two mitogenome sequences of M. pruinosa represent two haplotypes on the basis of the DNA barcoding sequences [39]. Currently, a mitogenome sequence is available for only for a single genus and species of Flatidae (Geisha distinctissima) [5]. Therefore, these three newly sequenced mitogenomes were incorporated into 42 available mitogenome sequences, representing eight families, to address several phylogenetic issues, including the phylogenetic position of Flatidae and familial relationships in Fulgoroidea. Furthermore, the genomic sequences were compared to those of other species of Fulgoroidea to better understand the evolutionary characteristics of Fulgoroidea, including gene rearrangement dynamics. Finally, intraspecies sequence divergence of each gene, including the DNA barcoding region, was categorized to identify suitable genes as molecular markers for population-level studies.

Next-Generation Sequencing for M. pruinosa
Each individual of M. pruinosa found to possess different haplotypes (H1 and H3) from DNA barcoding sequencing [39] was subjected to next-generation sequencing, whereas S. marginella was sequenced by the Sanger method. The individual with a H1 haplotype was collected from the Korean locality of Gimhae, Gyeongsangnam-do Province, Republic of Korea, located in the south eastern region of Korea (35 •  For library construction, approximately 40 ng of genomic DNA were isolated and randomly sheared using the Covaris System (Woburn, MA, USA) to generate inserts of 300 bp fragments. Library construction was performed using the TruSeq Nano DNA Kit (Illumina, San Diego, CA, USA) following the manufacturer's guidelines. Quality and DNA size of libraries were assessed using Agilent 2100 BioAnalyzer (Agilent Technologies, Palo Alto, CA, USA). Libraries were quantified by qPCR using the CFX96 Real Time System (BioRad, Hercules, CA, USA). After normalization, sequencing of the prepared library was conducted on the Illumina MiSeq platform using a Mid Output v2 kit to produce 150 bp paired-end reads (Illumina, San Diego, CA, USA).
Quality analysis of the raw sequence data was performed using FastQC software [40]. Adapter sequence reduction and trimming of low quality 5 -and 3 -ends of the reads were performed using Skewer ver. 0.2.2. [41]. Base-calling errors or insertions/deletions (indels) were corrected from the filtered set of reads using the alignment-based error correction tool Karect [42]. Consequently,~1.45 Gb of nucleotides from 2.4 million reads for the Gimhae sample and~1.63 Gb of nucleotides from~2.87 million reads for the Montpellier sample were obtained. The Phred quality score (Q) indicated that base call accuracy was 86% for the Gimhae sample and 87.2% for the Montpellier sample from the Q30 score.

Assembly and Gap Filling
The two M. pruinosa mitogenomes were assembled from the Illumina reads using a baiting and iterative mapping approach with the software MITObim ver. 1.9 [43]. The assembled mitogenomes were remapped with the whole genome sequence reads using Bowtie2 [44] before conducting manual curation. Mismatch calling and correction of the assembled sequences were conducted using GATK [45]. Finally, primarily annotation of PCGs, tRNAs, rRNAs, and the A+T-rich region of each mitogenome was carried out using MITOS Web-Server [46] (http://mitos.bioinf.uni-leipzig.de/index.py, accessed on 9 September 2021).
For gap filling, two long overlapping fragments (LF1 and LF2) encompassing COI to CytB and CytB to COI were amplified, then each five (gap 1-gap 5) and two short fragments (SFs) (gap 2 and gap 3) for H1 and H3 haplotypes, respectively, were individually amplified using the primers designed in this study (Table S1). PCR was conducted using AccuPower ® PCR PreMix (Bioneer, Daejeon, South Korea) under the following conditions: denaturation for 5 min at 94 • C; 30 cycles of 1 min denaturation at 94 • C; 1 min annealing at 48-52 • C; 1 min extension at 72 • C; and a final extension of 7 min at 72 • C. Except for gap 2, the remaining gap regions were cloned after PCR amplification for sequencing. Cloning was carried out using a T-Blunt TM PCR Cloning Kit (SolGent Co., Daejeon City, South Korea) and DH5α competent cells (Real Biotech Co., Banqiao City, Taiwan). The resultant plasmid DNA was isolated using an AccuPrep ® Plasmid Mini Extraction Kit (Bioneer Co., Daejeon City, South Korea). DNA sequencing was conducted using the ABI PRISM ® BigDye ® Terminator v3.1 Cycle Sequencing Kit and an ABI PRISMTM 3100 Genetic Analyzer (PE Applied Biosystems, Foster City, CA, USA). All products were sequenced from both directions.

S. marginella Sequencing by the Sanger Method
For S. marginella, a hind leg was used to extract DNA using a Wizard Genomic DNA Purification Kit (Promega, Madison, WI, USA) according to the manufacturer's instructions. Four primer sets that amplify four long overlapping fragments (Table S2; LF1, LF2, LF3, and LF4) were designed using previously reported mitogenome sequences of G. distinctissima [5] and the two current M. pruinosa, all of which belonged to the family Flatidae: LF1, LF2, LF3, and LF4 amplified COI and ND3 (~3.7 kb), COIII to ND4 (~3.7 kb), ND5 to srRNA (5.3 kb), and lrRNA to COI (~3.8 kb), respectively. Amplification of the LFs was conducted using LA Taq TM (Takara Biomedical, Tokyo, Japan) under the following conditions: 96 • C for 2 min, 30 cycles of 98 • C for 10 s and 48 • C for 15 min, and a final extension step of 72 • C for 10 min. Thereafter, these amplicons were used as templates to amplify 30 overlapping SFs using AccuPower ® PCR PreMix (Bioneer, Daejeon, South Korea) under the following conditions: initial denaturation for 5 min at 94 • C, followed by 35 cycles of 30 s at 94 • C, 1 min at 48-52 • C, and 1 min at 72 • C, and a final 7 min extension at 72 • C. The primers for SFs were also designed using G. distinctissima and the two M. pruinosa (Table S2). Individual SF sequences were assembled manually into the complete mitogenome using SeqMan (DNASTAR, Madison, WI, USA).

Gene Annotation
Annotations were performed using MITOS WebServer (http://mitos.bioinf.uni-leipzig.de/ index.py, accessed on 9 September 2021) with the search mode set as default, Mito/Chloroplast set as the searching source, and the genetic code of invertebrate mitogenomes set for tRNA isotype prediction [46]. Overall, 21 tRNA genes were identified, and the boundaries were delimitated based on these parameters. However, trnS 1 , which has a truncated dihydrouridine (DHU) arm, was detected using a hand-drawn secondary structure in conjunction with an alignment of the predicted trnS 1 regions of co-familial species G. distinctissima [5], and the anticodon was given particular consideration [7,11]. Start and stop codons of PCGs were further confirmed by alignment against mitochondrial (mt) PCGs of the fulgoroid species [4,5,7,14]. The nucleotide sequences of the PCGs were translated based on the invertebrate mitochondrial DNA (mtDNA) genetic code. Sequence data were deposited into the GenBank database under accession nos. MK303326 and MN417319 for H1 and H3 haplotypes of M. pruinosa, respectively, and MT628542 of S. marginella.

Comparative Genome Analyses
For the comparative analysis, 42 fulgoroid mitogenome sequences, which represent 27 species (including unidentified species) of 20 genera in 10 subfamilies of eight families, were downloaded from the GenBank database. The mitogenome sequences lacking generic names and a substantial genic sequence were excluded from genomic comparison and phylogenetic analysis. Further, among the 81 mitogenome sequences of L. striatellus reported by Sun et al. [2], only two representing each haplotype group were included. These sequences, along with the three mitogenome sequences obtained in the present study, were compared for several genomic characteristics. The A+T content of each gene, whole genome, and codon position of the PCGs were calculated using DNASTAR (Madison, USA). Codon usage was determined by MEGA 6 [47], and the gene overlap and intergenic space sequences were hand counted. The genetic distance at each taxonomic category was calculated using unrooted pairwise distance using PAUP ver. 4.01b10 [48]. These values were plotted using boxplots implemented in JMP software ver. 11.1.1 (SAS Institute, Cary, NC, USA).
Compositional skewness, which measures the relative number of As to Ts [AT skew = (A − T)/(A + T)] and Gs to Cs (GC skew = [(G − C)/(G + C)]), was calculated to determine the base composition of nucleotide sequences [49]. Ka and Ks, along with the Ka:Ks ratio, were estimated to determine the degree of genetic divergence of the whole genome, PCGs encoded in each strand, each individual PCG in fulgoroid species, and each PCG in each family using a model that averages parameters across 14 candidate models [50] with the KaKs Calculator ver. 1.2 [51]. Thrips imaginis in the order Thysanoptera [52] was used as a criterion sequence for calculations. Subsequently, a series of one-way analyses of variance (ANOVAs) and Tukey's honestly significant difference multiple range tests were performed using JMP software ver. 15.10 (SAS Institute) to detect the statistical difference of the mean Ka/Ks values obtained by pairwise comparisons.

Phylogenetic Analysis
For the phylogenetic reconstruction of the superfamily Fulgoroidea, the nucleotide sequence of each PCG was aligned based on the codons using RevTrans ver. 2.0 [53]. The well-aligned blocks within each PCG were selected using GBlocks 0.91b 9 [54], with the maximum number of contiguous non-conserved positions set to 11. Gap positions were excluded within the final blocks. Each of the 11 aligned PCGs (excluding ND1 and ND3, which are unavailable in some species) was then concatenated to generate the nucleotide (NU) sequences of the PCG dataset (7716 bp excluding gaps for the NU sequence dataset). For amino acid (AA) sequence-based analysis, the NU sequences of the 11 PCGs were recorded into AA sequences using RevTrans ver. 2.0 [53], and these were concatenated into a single data matrix (2271 AAs including gaps for the AA dataset).
PartitionFinder2 was used to search for the optimal partitions and the corresponding optimal models of substitution using the 'greedy' search [55][56][57], with the inclusion of the evolutionary models available in RAxML [58] and MrBayes [59]. As a result, five partition schemes for the NU data matrix were obtained, providing three different substitution models (GTR + I + G for subset 1, 2, and 4; TVM + Gg for subset 3; and HKY + G for subset 5), and two partition schemes for the AA data matrix were obtained, providing two different substitution models (MTART + I + G + F for subset 1 and MTZOA + I + G + F for subset 2). These partition schemes and substitution models for each data matrix were applied for each phylogenetic analysis.
To reconstruct the phylogeny of the Fulgoroidea, we used both the maximum likelihood (ML) and Bayesian inference (BI) algorithms using RAxML ver. 8.2.10 [58] and MrBayes ver. 3.2.7 [59], respectively, implemented within the CIPRES Portal ver. 3.1 [60]. For BI analysis, two independent runs of four incrementally heated Markov and Monte Carlo chains (one cold chain and three hot chains) were simultaneously run for 10 million generations, with tree sampling conducted at every 500 generations. The first 25% of the sampled trees were discarded as burn-in. Partitioned analyses were conducted with each partition unlinked in each parameter (revmat, statefreq, shape, pinvar, and tratio). An average split frequency of less than 0.01 was used to represent the convergence of the two simultaneous runs. For ML analysis, the RAxML algorithm was applied, which uses a "rapid" bootstrapping approach and searches for the best-scoring tree [58]. Confidence values for BI trees were obtained from the Bayesian posterior probabilities (BPPs), and those for ML trees were determined with 1,000 bootstrap (BS) iterations. Durgades nigropicta and Populicerus populi from another infraorder Cicadomorpha, which has traditionally been known as the sister group to Fulgoroidea in Auchenorrhyncha, were selected as outgroups [61,62]. The phylogenetic trees were visualized using iTOL ver. 4 [63].

General Mitochondrial Genome Features
The three mitogenomes contained 37 typical genes (13 PCGs, 22 tRNA genes, and two rRNA genes) and one non-coding A+T-rich region, found in animals [64]. The mitogenome sizes of the two M. pruinosa haplotypes (H1 and H3) and S. marginella were 16,312, 16,314, and 16,126 bp, respectively, which were well within the range previously reported for complete mitogenomes of Fulgoroidea, and the same is true for the number of codons of 3656, 3656, and 3637, respectively, excluding termination codons (Table 1; Table S3). The A/T nucleotide composition of the whole genome was 76.61%, 76.66%, and 75.73% in the M. pruinosa H1, M. pruinosa H3, and S. marginella mitogenomes, respectively, indicating biased A/T nucleotides. The A/T content among M. pruinosa genes and region was the highest for lrRNA, 79.53% and 79.53%; followed by srRNA, 77.96% and 77.96%; the A + T-rich region, 77.46% and 77.54%; tRNAs, 76.82% and 76.82%; and PCGs, 75.78% and 75.79%, in H1 and H3, respectively. However, this trend was detected only in the M. pruinosa mitogenomes in the Fulgoroidea, presenting a diverse pattern in the Fulgoroidea, even in the co-familial species S. marginella and G. distinctissima (Table 2).  Note: The gene abbreviations are as follows: COI, COII, and COIII refer to the cytochrome oxidase subunits; CytB refers to cytochrome b; ND1-6 refers to NADH dehydrogenase components; srRNA and lrRNA refer to small and large subunit ribosomal RNA (rRNA) genes, respectively. tRNAs are denoted as one-letter symbols in accordance with the IUPAC-IUB single-letter amino acid codes, except those encoding leucine and serine, which are labeled trnL 1 for the CTN codon family, trnL 2 for the TTR codon family, trnS 1 for the AGN codon family, and trnS 2 for the TCN codon family. Superscripts indicate identical start and stop codons among skipper species. +, genes encoded in major strand; −, genes encoded in minor strand. Values in parentheses indicate gene size (bp).
The biased A/T content was reflected in the form of codon usage. M. pruinosa H1, M. pruinosa H3, and S. marginella PCGs had TTA (leucine), ATT (isoleucine), TTT (phenylalanine), and ATA (methionine) as the four most frequently used codons with frequencies of 35.61%, 35.64%, and 34.78%, respectively, among the 64 available codons and the frequency of the co-familial species G. distinctissima was the lowest at 32.01% (Table S4). A similar pattern was also found in all sequenced Fulgoroidea, ranging in frequency from 30.14% (Magadhaideus luodiana) to 40.45% (Saccharosydne procerus) (Table S4). These four codons comprised A or T nucleotides, indicating the biased usage of A/T nucleotides in Fulgoroidea.
The two haplotypes of M. pruinosa shared an identical start codon, and S. marginella used different codons to those of M. pruinosa in COI, ND3, and ND5 (e.g., ATC in M. pruinosa vs. ATG in S. marginella for COI). All fulgoroid species, including the three current mitogenomes, had the canonical start codon ATN, but one species in Delphacidae (Ugyops sp.) used the atypical codon CTG for COI ( Table 2; Table S3). Previously, the insect orders Lepidoptera and Coleoptera have often shown non-canonical start codons, such as GCA in Lepidoptera [65] and AAT/AAC in Coleoptera for COI, but CTG is highly unconventional. GTG, which is not typical, but is a canonical start codon in insects, was also used in several fulgoroid species for ND5 and ND1 (Table S3). Examples of the GTG start codon in insects are found in Drosophila formosana in Diptera for ND5 [66] and Saturnia jonasii in Lepidoptera for COII [65].
A total of 22 tRNA genes (one specific for each amino acid and two for leucine and serine) were identified for each mitogenome sequenced in this study ( Figure 2). All tRNAs except trnS 1 , which lacked the DHU loop, were shown to be folded into the expected cloverleaf secondary structures. In other fulgoroid species, which are available for their tRNA structure trnS 1 , were also truncated at the DHU loop (11 mitogenomes in 10 species) [6,9,16,21,22]. This incomplete trnS 1 structure has been detected in the mitogenomes of other animals, including insects [67]. The lengths of trnS 1 of the 11 fulgoroid species, including those of current studies, were nearly always the shortest among tRNA isotypes, ranging in size from 54 to 63 bp, due to incomplete formation of the DHU loop, whereas the longest tRNA was variable as trnV, trnK, trnW, or trnD, ranging in size from 71 to 73 bp (Table S5). The postulated tRNA cloverleaf structure of M. pruinosa (H1 and H3) and S. marginella harbors an invariable 7 bp in the aminoacyl stem, 5 bp in the anticodon stem, and 7 bp in the anticodon loop, whereas the DHU arm and the TΨC arm, particularly within the loops, were variable in length ( Figure 2). A total of 51 and 45 unmatched base pairs were detected in the M. pruinosa (H1 and H3) and S. marginella tRNAs, respectively, but 31 and 24, respectively, were G-U pairs, which form a weak bond in the tRNAs. The remaining mismatches were found either in the aminoacyl stem (nine in M. pruinosa and S. marginella), DHU stem (zero in M. pruinosa and one in S. marginella), anticodon stem (eight in M. pruinosa and S. marginella), or TΨC stem (three in M. pruinosa and S. marginella), indicating that the amino-acyl stem has the highest mismatch. In each tRNA the number of mismatches ranged from one in M. pruinosa to three in S. marginella tRNAs. Curr. Issues Mol. Biol. 2021, 1, FOR PEER REVIEW 10

Compositional Skew
Skewness was calculated to infer the compositional bias of base in terms of whole mitogenome, whole PCGs, major-strand PCGs, and minor-strand PCGs in Fulgoroidea ( Figure 3; Table S6). The three mitogenomes and co-familial species G. distinctissima in Flatidae (measured from the major strand) was slightly A-skewed (ranged from 0.242 to 0.265) and moderately C-skewed (−0.274-−0.248) in the whole genome ( Figure 3; Table S6). Within fulgoroid families and subfamilies, all species of Delphacinae in Delphacidae, excluding another subfamily, Asiracinae, were obviously less A-skewed (0.089-0.133) compared to other families, including the Flatidae (0.183-0.299), suggesting a different evolutionary force even within fulgoroid families ( Figure 3). Variability in skewness has also been found in other taxonomic groups of Hemiptera. For example, the superfamily Membracoidea in the infraorder Cicadomorpha in Auchenorrhyncha showed variable degrees of A-or T-skewness and G-or C-skewness [68], whereas the superfamily Aphidoidea in another suborder Sternorrhyncha showed A-and C-skewness in the whole genome [69], as seen in Fulgoroidea (Figure 3).

Compositional Skew
Skewness was calculated to infer the compositional bias of base in terms of whole mitogenome, whole PCGs, major-strand PCGs, and minor-strand PCGs in Fulgoroidea ( Figure 3; Table S6). The three mitogenomes and co-familial species G. distinctissima in Flatidae (measured from the major strand) was slightly A-skewed (ranged from 0.242 to 0.265) and moderately C-skewed (−0.274-−0.248) in the whole genome ( Figure 3; Table  S6). Within fulgoroid families and subfamilies, all species of Delphacinae in Delphacidae, excluding another subfamily, Asiracinae, were obviously less A-skewed (0.089-0.133) compared to other families, including the Flatidae (0.183-0.299), suggesting a different evolutionary force even within fulgoroid families (Figure 3). Variability in skewness has also been found in other taxonomic groups of Hemiptera. For example, the superfamily Membracoidea in the infraorder Cicadomorpha in Auchenorrhyncha showed variable degrees of A-or T-skewness and G-or C-skewness [68], whereas the superfamily Aphidoidea in another suborder Sternorrhyncha showed A-and C-skewness in the whole genome [69], as seen in Fulgoroidea (Figure 3).  showed different evolutionary bias in Delphacinae compared to other families. However, with regard to GC-skewness, the major strand PCGs of all fulgoroid species evidenced moderate to strong C-skewness, whereas those of the minor strand exhibited moderate to strong G-skewness, without an obvious difference between species that were not of the subfamily Delphacinae and those that were. Therefore, the degree of bias differed between the two strands and among taxonomic groups, although such strand-based inequalities in base frequencies have yet to be fully understood.

Genetic Divergence Inferred from the Ka/Ks Ratio
Ka and Ks were estimated to infer the extent of genetic divergence in terms of whole PCGs, major-strand PCGs, minor-strand PCGs, and individual PCGs in Fulgoroidea (Table S7). The ratios of Ka:Ks were well below 1 for all species, including M. pruinosa and S. marginella in whole PCGs (0.1122 at maximum in Sivaloka damnosus in Issidae), majorstrand PCGs (0.0953 at maximum in Pyrops candelaria in Fulgoridae), minor-strand PCGs (0.1566 at maximum in S. damnosus in Issidae), and individual PCGs (Figure 4; Table S7), suggesting that the PCGs in Fulgoroidea are under purifying selection. The mean Ka:Ks ratio was higher in the order of minor-strand PCGs, whole PCGs, and major-strand PCGs in Fulgoroidea, and the ANOVA test (Table S8) showed a significant difference among them at the level of p < 0.0001. Individual gene analysis provided three statistically different groups in the mean Ka:Ks ratio: ATP8, ND4L, and ND6 as the highest group; ND1, ND2, and ND3 as the next highest group; and the remaining genes as the lowest group ( Figure 4B). Such different divergence patterns among PCGs might be useful for different purposes, depending on the degree of necessary variability, such as higher variability for species delimitation and conserved variation for the discrimination of higher taxonomic groups.
At the familial analysis, Flatidae, which comprises M. pruinosa and S. marginella, also showed a similar pattern to that of the whole fulgoroid species, in that ATP8, ND4L, and ND6 showed the highest mean Ka:Ks ratio, ND1, ND2, and ND3 showed the next highest ratio, and the remaining genes showed the lowest mean Ka:Ks ratio, but no group of genes differed with statistical significance ( Figure 4C; Table S8). Similar patterns were also observed in Fulgoridae, Achilidae, and Issidae, but Ricaniidae included ND1, ND2, and ATP8 as the highest group, instead of ND4L and ND6, without statistical significance, and Delphacidae included ND3 along with ND4L and ND6 as the highest group, instead of ATP8, with statistical support only in ND4L and ND6 ( Figure 4C; Table S8). These results indicate that the evolution of fulgoroid families is largely consistent but also has distinctions in certain families, such as Ricaniidae and Delphacidae.

Individual Gene Divergence within Species
One of the main utilities of mtDNA sequences includes the property of COI sequences as a "DNA Barcode" [70]. Further, this region has been recommended for early insight into the patterning of genomic diversity within species [71]. In fact, the barcode region in COI has been used extensively to understand geographic variation and the origin of invasion of species [72][73][74][75]. Nevertheless, low variability and low numbers of haplotypes have necessitated additional sequence regions that can be used independently to barcode regions or in combination with barcode regions for the detection of population structures and hidden lineages [72,75,76], including introduced species [39,[77][78][79].
In order to quantify sequence divergence at each species, a pairwise comparison of each gene, including the DNA barcoding region, was performed from the species available for multiple genome sequences (six sequences for L. striatellus, three for Lycorma delicatula, eight for Nilaparvata lugens, and two for Sogatella furcifera), including two M. pruinosa haplotypes ( Figure 5). In the case of M. pruinosa, CytB, ND2, COI, DNA barcoding region, ATP6, and ND5 provided variation in the order of the lowest to highest of average median value (0.089-0.417), with the exclusion of ND3 due to unavailability in some other species ( Figure 5). Therefore, COI, which includes the DNA barcoding region, was not the most variable region. Similarly, other species also provided genes with higher variability than the DNA barcoding region, with varying numbers of genes ( Figure 5). These more variable genes might be informative for intra-specific analysis for diverse purposes. ratio was higher in the order of minor-strand PCGs, whole PCGs, and major-strand PCGs in Fulgoroidea, and the ANOVA test (Table S8) showed a significant difference among them at the level of p < 0.0001. Individual gene analysis provided three statistically different groups in the mean Ka:Ks ratio: ATP8, ND4L, and ND6 as the highest group; ND1, ND2, and ND3 as the next highest group; and the remaining genes as the lowest group ( Figure 4B). Such different divergence patterns among PCGs might be useful for different purposes, depending on the degree of necessary variability, such as higher variability for species delimitation and conserved variation for the discrimination of higher taxonomic groups.  At the familial analysis, Flatidae, which comprises M. pruinosa and S. marginella, also showed a similar pattern to that of the whole fulgoroid species, in that ATP8, ND4L, and ND6 showed the highest mean Ka:Ks ratio, ND1, ND2, and ND3 showed the next highest ratio, and the remaining genes showed the lowest mean Ka:Ks ratio, but no group of genes differed with statistical significance ( Figure 4C; Table S8). Similar patterns were also observed in Fulgoridae, Achilidae, and Issidae, but Ricaniidae included ND1, ND2, and ATP8 as the highest group, instead of ND4L and ND6, without statistical significance, and Delphacidae included ND3 along with ND4L and ND6 as the highest group, instead of ATP8, with statistical support only in ND4L and ND6 ( Figure 4C; Table S8). These results indicate that the evolution of fulgoroid families is largely consistent but also has distinctions in certain families, such as Ricaniidae and Delphacidae. species ( Figure 5). Therefore, COI, which includes the DNA barcoding region, was not the most variable region. Similarly, other species also provided genes with higher variability than the DNA barcoding region, with varying numbers of genes ( Figure 5). These more variable genes might be informative for intra-specific analysis for diverse purposes. Figure 5. Boxplot distribution of intraspecies genetic divergence for 12 protein-coding genes (excluding ND3 gene), the barcode region, and two rRNAs in fulgoroid species available for multiple genome sequences. Dots in the colored bars indicate maximum, average, and minimum divergence (%), respectively; and the reddish horizontal line represents the median of sequence divergence (%) in Fulgoroidea.

Intergenic Spacer Regions and Potential Motif Sequences
The genes of the two M. pruinosa haplotypes and S. marginella are interleaved with 147 and 68 bp, which are spread over 21 and 11 regions, ranging in size between 1 and 25 and 1 and 35 bp, respectively ( Figure 6; Table S9). The majority of intergenic spacer sequences (ISSs) are short (1-2 bp), but five and two locations in M. pruinosa and S. marginella, respectively, have ISSs that are longer than 10 bp. The longest, 25 bp, located between trnR and trnN in the two M. pruinosa haplotypes, which both have two identical poly-runs of adenine nucleotides, is composed of 84% A/T nucleotides. The next longest, 17 bp, ISS is located between trnA and trnR and also has poly-running adenine nucleotides, which also have higher A/T nucleotides. In S. marginella, the longest 35-bp-long ISS is located between trnG and ND3, but it has interspersed poly-running TAA sequences composed of 91.4% A/T nucleotides and the next longest, 18 bp, is located between trnI and trnQ. On the other hand, the co-familial species G. distinctissima does not have ISSs that are longer than 10 bp in the 51 bp of ISSs that are interspersed at 13 locations, indicating that there is no consistent location and length of ISSs among Flatidae species.

Intergenic Spacer Regions and Potential Motif Sequences
The genes of the two M. pruinosa haplotypes and S. marginella are interleaved with 147 and 68 bp, which are spread over 21 and 11 regions, ranging in size between 1 and 25 and 1 and 35 bp, respectively ( Figure 6; Table S9). The majority of intergenic spacer sequences (ISSs) are short (1-2 bp), but five and two locations in M. pruinosa and S. marginella, respectively, have ISSs that are longer than 10 bp. The longest, 25 bp, located between trnR and trnN in the two M. pruinosa haplotypes, which both have two identical poly-runs of adenine nucleotides, is composed of 84% A/T nucleotides. The next longest, 17 bp, ISS is located between trnA and trnR and also has poly-running adenine nucleotides, which also have higher A/T nucleotides. In S. marginella, the longest 35-bp-long ISS is located between trnG and ND3, but it has interspersed poly-running TAA sequences composed of 91.4% A/T nucleotides and the next longest, 18 bp, is located between trnI and trnQ. On the other hand, the co-familial species G. distinctissima does not have ISSs that are longer than 10 bp in the 51 bp of ISSs that are interspersed at 13 locations, indicating that there is no consistent location and length of ISSs among Flatidae species.
In other fulgoroid families, the total length of ISSs tended to be longer than those of Flatidae, providing 55-1,090 bp with a few exceptions (Table S9). In the case of Peltatavertexalis horizontalis in Achilidae, one spacer located between lrRNA and trnV was exceptionally long at 1,008 bp. Except for this extreme case, all members of Delphacinae in Delphacidae were obviously longer than any other members of Delphacidae, ranging from 178 to 856 bp, with an exception of 131 bp found in L. striatellus collected in China [4], indicating different genomic evolution in Delphacinae than other fulgoroid families. ssues Mol. Biol. 2021, 1, FOR PEER REVIEW 15 In other fulgoroid families, the total length of ISSs tended to be longer than those of Flatidae, providing 55-1,090 bp with a few exceptions (Table S9). In the case of Peltatavertexalis horizontalis in Achilidae, one spacer located between lrRNA and trnV was exceptionally long at 1,008 bp. Except for this extreme case, all members of Delphacinae in Delphacidae were obviously longer than any other members of Delphacidae, ranging from 178 to 856 bp, with an exception of 131 bp found in L. striatellus collected in China [4], indicating different genomic evolution in Delphacinae than other fulgoroid families.
It has been reported that an intergenic spacer region located at the ND1 and trnS2 junction contains a conserved motif sequence as TTAGTAT in Lepidoptera and TAGTA It has been reported that an intergenic spacer region located at the ND1 and trnS 2 junction contains a conserved motif sequence as TTAGTAT in Lepidoptera and TAGTA in Coleoptera with a slight modification in some species [65,80,81]. Such motif sequences have been signified for their role as binding sites for the transcription termination peptide of mtDNA (mtTERM) because the sequence signals the termination of the transcription of PCGs encoded on the major strand (CytB) in the circular mtDNA [82,83]. We searched the homolog in the Fulgoroidea genomes and found the motiflike sequence, TAGTA, at the ND1 and trnS 2 junction (Figure 7). However, only the members of Delphacidae had a consistent length of ISS and contained the motiflike sequence within a well-aligned frame between the two genes, but a slight modification was also present in some species. On the other hand, the remaining family members lacked the ISS but had the motiflike sequence at the 3 -end of ND1 or at the 5 -end of trnS 2 , with a slight modification in some species. A functional study for its role as mtTERM in Fulgoroidea will be necessary for further decisive conclusions on the role of the sequence.
have been signified for their role as binding sites for the transcription termination peptide of mtDNA (mtTERM) because the sequence signals the termination of the transcription of PCGs encoded on the major strand (CytB) in the circular mtDNA [82,83]. We searched the homolog in the Fulgoroidea genomes and found the motiflike sequence, TAGTA, at the ND1 and trnS2 junction (Figure 7). However, only the members of Delphacidae had a consistent length of ISS and contained the motiflike sequence within a well-aligned frame between the two genes, but a slight modification was also present in some species. On the other hand, the remaining family members lacked the ISS but had the motiflike sequence at the 3′-end of ND1 or at the 5′-end of trnS2, with a slight modification in some species. A functional study for its role as mtTERM in Fulgoroidea will be necessary for further decisive conclusions on the role of the sequence.

A+T-rich Region Structure
Located between the srRNA and trnI, the A+T-rich region of fulgoroid species was mostly longer than that for other insect orders (e.g.,~350 bp in Lepidoptera), with the size well over 1000 bp, with only a few exceptions ( Table 1). The two M. pruinosa haplotypes were 1788 (H1) and 1790 bp (H3), presenting a unique length difference in the whole genome, including genic regions and ISSs between them ( Table 1). The S. marginella A+T-rich region was slightly longer at 1836 bp but was well within the range found in other fulgoroid species (Table S3).
The A+T-rich region of the two M. pruinosa is structured into four regions: one repeat region; a large nonrepeat region; another repeat region; and a short nonrepeat region from the 3 -end of srRNA to the 5 -end of trnI ( Figure 8A). The first repeat region is composed of tandem triplicated repeats, each of which consists of two units (named repeat units A and B) that have an identical sequence and length, but the third copy lacks the repeat unit B. The second repeat region (named repeat unit C), which is located closer to trnI, is repeated 20 times, each with an identical 21 bp. A large nonrepeat region contains intermittent variable length poly-AT, poly-T, and poly-A that are scattered in the region. At the end of the A+T-rich region abutting to the 5 -end of trnI, one short, nonrepeat sequence, which contains one poly-A, was located. A similar structure was also found in S. marginella and G. distinctissima, in that they are all composed of four regions, but the copy number and repeat length differ between them ( Figure 8B,C). In particular, the repeat C in G. distinctissima, composed of six copies, contains indels and substitutions among them, providing four different versions ( Figure 8D). One of the common interpretations for the occurrence of such tandem repeats in animal mitogenomes is slipped-strand mispairing, in concert with unequal crossing over during DNA replication, resulting in an expanded repeat [84][85][86].

A+T-rich Region Structure
Located between the srRNA and trnI, the A+T-rich region of fulgoroid species was mostly longer than that for other insect orders (e.g., ~350 bp in Lepidoptera), with the size well over 1000 bp, with only a few exceptions ( Table 1). The two M. pruinosa haplotypes were 1788 (H1) and 1790 bp (H3), presenting a unique length difference in the whole genome, including genic regions and ISSs between them ( Table 1). The S. marginella A+Trich region was slightly longer at 1836 bp but was well within the range found in other fulgoroid species (Table S3).
The A+T-rich region of the two M. pruinosa is structured into four regions: one repeat region; a large nonrepeat region; another repeat region; and a short nonrepeat region from the 3′-end of srRNA to the 5′-end of trnI ( Figure 8A). The first repeat region is composed of tandem triplicated repeats, each of which consists of two units (named repeat units A and B) that have an identical sequence and length, but the third copy lacks the repeat unit B. The second repeat region (named repeat unit C), which is located closer to trnI, is repeated 20 times, each with an identical 21 bp. A large nonrepeat region contains intermittent variable length poly-AT, poly-T, and poly-A that are scattered in the region. At the end of the A+T-rich region abutting to the 5′-end of trnI, one short, nonrepeat sequence, which contains one poly-A, was located. A similar structure was also found in S. marginella and G. distinctissima, in that they are all composed of four regions, but the copy number and repeat length differ between them ( Figure 8B,C). In particular, the repeat C in G. distinctissima, composed of six copies, contains indels and substitutions among them, providing four different versions ( Figure 8D). One of the common interpretations for the occurrence of such tandem repeats in animal mitogenomes is slipped-strand mispairing, in concert with unequal crossing over during DNA replication, resulting in an expanded repeat [84][85][86]. Due to the similarity in structural arrangement in the A+T-rich region among Flatidae species, the possibility of a common origin of each repeat unit was considered by sequence alignment, but no reasonable alignment was obtained (data not shown), suggesting an independent origin of each repeat unit in each species. Furthermore, an alignment of each repeat unit to its own mitogenome sequences, including ISSs, did not yield any fruitful results (data not shown), suggesting that the origin of the repeat units is probably else-where in the mitogenome. This inference is further supported by the substantially lower A/T composition of the repeat units than that of the nonrepeat regions (62.58-65.61% vs. 86.57-89.70%; Figure 8E), along with other genic and non-genic regions in the genome in each species (Table 1). In contrast to mitochondrion-to-nucleus transfer of DNA, such as nuclear-encoded mt pseudogenes [87][88][89][90], nucleus-to-mitochondrion transfer of DNA has not critically been examined in animals, including insects, and no citable example could be found. However, such an event can casually be exemplified in plant mitogenomes [91][92][93]. Therefore, the potential origin of the repeat units from chromosomal DNA cannot completely be excluded in insects, including Flatidae species. If no chromosomal sequences of any Flatidae species are available from a homology search of NCBI through BLAST, this may be limiting, particularly due to the lower ratio of query cover in many repeat units (e.g., 26% in repeat B in G. distinctissima and 29% in repeat A in M. pruinosa; Table S10). Nevertheless, several repeat units showed a substantial sequence homology to the chromosomal DNA originating from a diverse organism, where even their lengths were shorter, hinting at the potential origin of the repeat unit from other organisms during the long evolutionary history. For example, the repeat units A, B, and C of M. pruinosa showed 94.12% sequence identity to the chimpanzee Pan troglodytes, 100% to the radish Raphanus sativus, and 100% to the tapeworm Spirometra erinaceieuropaei, respectively, and a similar spectrum of homology was obtained from the repeat units of other Flatidae species (Table S10). As some repeat units showed higher sequence homology to parasites (e.g., tapeworms and bacteria), genomic transfer to mitogenomes via parasites can also be considered, although it is not decisive. Therefore, the evolutionary mechanism responsible for the incorporation of such repeat sequences in the A+T-rich region may require further close analysis with chromosomal genomic sequences.
Previously, the detailed analysis of the A+T-rich region of the order Hemiptera classified the region into three compositional types and found several Fulgoromorpha species, including G. distinctissima and species of Cicadomorpha to have two repeat regions separated in the region [94], as we have shown for G. distinctissima in Figure 8C. Further, the stem-and-loop structure has often been detected in other Hemiptera species [94], but we could not casually locate such a structure in any species of Flatidae. Li and Liang [95] further scrutinized the A+T-rich region registered in the GenBank database, which covers 11 infraorders of Hemiptera, including Fulgoromorpha, and reported one longer poly-A stretch (23 bp) closer to the 3 -end of srRNA. However, in Flatidae, only M. pruinosa and G. distinctissima had such a poly-A stretch ( Figure 8). Therefore, generalization of the structural arrangement in the A+T-rich region requires further data accumulation of Fulgoromorpha, particularly for Flatidae.

Gene Arrangements
The orientation and gene order of Flatidae species, including M. pruinosa and S. marginella, were identical to those of the species of Ricaniidae, Issidae, Achilidae, Fulgoridae, Derbidae, subfamily Asiracinae of Delphacidae, and the two outgroup species, which belong to another infraorder, Cicadomorpha (A type; Figure 9). This arrangement has been hypothesized to be ancestral for insects and found in diverse insect orders [64]. However, members of the Delphacinae in Delphacidae represented by nine species in seven genera are highly interesting in that members of the subfamily uniquely presented two different arrangements (B and C types). The C type arrangement was found only in one mitogenome of L. striatellus collected from Beijing, China [4], whereas other individuals of the same species [8,18,19], including 81 sequences of the species [2], along with the remaining species in the subfamily, all possessed the B type ( Figure 9). Furthermore, among the eight N. lugens mitogenomes reported from China and South Korea each, two and one sequences from China and South Korea, respectively [8,10,17,19], presented a triplicated trnC at the ND2 and trnY junction with the same rearrangement to B type (named B' type), indicating intra-specific changes in gene arrangement or duplication (Figure 9). nome of L. striatellus collected from Beijing, China [4], whereas other individuals of the same species [8,18,19], including 81 sequences of the species [2], along with the remaining species in the subfamily, all possessed the B type ( Figure 9). Furthermore, among the eight N. lugens mitogenomes reported from China and South Korea each, two and one sequences from China and South Korea, respectively [8,10,17,19], presented a triplicated trnC at the ND2 and trnY junction with the same rearrangement to B type (named B' type), indicating intra-specific changes in gene arrangement or duplication (Figure 9). Compared to the insect ancestral type [64], two regions had rearranged genes in the B type. One rearranged region involved the translocation of ND6 to a position upstream of ND4L (underlining indicates a gene inversion) and a translocation between trnT and trnP without gene inversion, resulting in ND6-trnP-trnT, instead of ancestral trnT-trnP-ND6 arrangement at the ND4L and CytB junction. Another region involves transposition between trnW and trnC, resulting in the trnC-trnW arrangement at the ND2 and trnY junction. In addition, the C type arrangement has a translocation of trnH from the ND5 and ND4 junction to the upstream region of ND6, resulting in trnH-ND6-trnP-trnT at the ND4L and CytB junction.
As nearly all members of the subfamily Delphacinae in Delphacidae have both ND6-trnP-trnT and trnC-trnW rearrangements, they are synapomorphy for the subfamily, but the taxonomic expansion of these rearrangements to other subfamilies in Delphacidae should wait, as a mitogenome sequence in another subfamily, Asiracinae, at present, is available and contains only a single species (Ugyops sp.), which has A type arrangement.
Previously, gene rearrangement and gene multiplication in a certain species within a genus have been exemplified in the honeybee genus, Apis, in Hymenoptera. For example, Figure 9. Schematic illustration of the available mitochondrial gene arrangements in Fulgoroidea. Gene sizes are not drawn to scale. Gene names that are not underlined indicate a forward transcriptional direction, whereas underlining indicates a reverse transcriptional direction. tRNAs are denoted by one-letter symbols in accordance with the IUPAC-IUB single-letter amino acid codes. Dotted lines above the gene names indicate rearranged genes relative to ancestral arrangement in insects.
Compared to the insect ancestral type [64], two regions had rearranged genes in the B type. One rearranged region involved the translocation of ND6 to a position upstream of ND4L (underlining indicates a gene inversion) and a translocation between trnT and trnP without gene inversion, resulting in ND6-trnP-trnT, instead of ancestral trnT-trnP-ND6 arrangement at the ND4L and CytB junction. Another region involves transposition between trnW and trnC, resulting in the trnC-trnW arrangement at the ND2 and trnY junction. In addition, the C type arrangement has a translocation of trnH from the ND5 and ND4 junction to the upstream region of ND6, resulting in trnH-ND6-trnP-trnT at the ND4L and CytB junction.
As nearly all members of the subfamily Delphacinae in Delphacidae have both ND6-trnP-trnT and trnC-trnW rearrangements, they are synapomorphy for the subfamily, but the taxonomic expansion of these rearrangements to other subfamilies in Delphacidae should wait, as a mitogenome sequence in another subfamily, Asiracinae, at present, is available and contains only a single species (Ugyops sp.), which has A type arrangement.
Previously, gene rearrangement and gene multiplication in a certain species within a genus have been exemplified in the honeybee genus, Apis, in Hymenoptera. For example, mitogenomes of eight species of Apis present two different arrangements; trnE-trnS 1 (E-S 1 type) and trnS 1 -trnE (S 1 -E type) at the A+T-rich region and ND2 junction [96], indicating that gene rearrangement occurs in a certain species within a genus. Of species with the S 1 -E type, only A. koschevnikovi has tandemly triplicated copies of trnM at the downstream of trnE, and only A. andreniformis has five additional copies of trnL 1 at the downstream of trnE, along with a likely original copy of this tRNA located at the ND1 and lrRNA junction, the position of which is ancestral in insects [96]. Considering these examples, gene duplication and translocation could possibly only occur independently in a species after the divergence of the particular species from its ancestor.
On the other hand, the intraspecies variation of gene arrangement might be very rare and unconventional, although fair judgment for its extensiveness might limitedly be possible due to the limited availability of multi-individual sequences in a species. Furthermore, gene rearrangement data, independent of nucleotide sequences, have been signified to have phylogenetic signals for the inference of evolutionary relationships in higher levels of taxonomic groups, rather than at individual levels [97][98][99]. Nevertheless, it might be noteworthy that Delphacinae is the only subfamily that has evolved a different gene arrangement (B, B', and C types) compared to other families and subfamilies in Fulgoroidea. Therefore, members of the subfamily may, under slightly different evolutionary forces, be prone to evolve gene rearrangement and the incorporation of additional copies of the tRNAs in the genome. As more studies focusing on these aspects become available, further close inference of gene arrangement for delphacid planthoppers may be possible.

Phylogenetic Relationships
Four phylogenetic analyses were performed in combination with different datasets (AA for amino acid sequences and NU for nucleotide sequences) and algorithms (ML and BI methods). The four analyses (AA-ML, AA-BI, NU-ML, and NU-BI) resulted in three slightly, but importantly different, topologies, providing two AA-based trees (topology A by AA-ML dataset and topology B by AA-BI dataset) and only one NU-based tree (topology C both by NU-ML and NU-BI datasets; Figure 10). All phylogenetic analyses divided the eight fulgoroid families into two clades: one consisted of Ricaniidae, Issidae, Flatidae, Fulgoridae, Achilidae, and Derbidae, and the other consisted of Delphacidae and Cixiidae, each of which was supported relatively strongly in all analyses (BS = 98-100%, BPP = 1; Figure 10). The three topologies differ by the changes in positions and relationships among Fulgoridae, Achilidae, and Derbidae, with regard to consistent relationships among Ricaniidae, Issidae, and Flatidae (RIF; between topologies A, B vs. topology C) and the fluctuation in the monophyly of Delphacidae with regard to Cixiidae (between topology A vs. topologies B, C), whereas monophyly of Fulgoroidea was always supported with the highest nodal supports (BS = 100%, BPP = 1) ( Figure 10).
Among the eight families included in the analyses, six were represented by more than two taxa and five of them, excluding Delphacidae, were supported as the monophyletic groups, almost always with the highest nodal supports (BS = 97-100%, BPP = 1.0) (Table S11; Figure 10). In particular, M. pruinosa and S. marginella belonging to Flatidae formed a strong monophyletic group with the co-familial species G. distinctissima in all analyses (BS = 100%, BPP = 1.0). However, the most specious Delphacidae, represented by two subfamilies (Asiracinae and Delphacinae), formed a non-monophyletic group in the AA-BI analysis by clustering the Asiracinae and Cixiidae, each of which was represented by a single species (topology B; Figure 10B), whereas monophyly of Delphacidae was supported in the AA-ML, NU-ML, and NU-BI analyses (topologies A and C) with moderate to high nodal supports (BS = 72, 93%, BPP = 1; Figure 10A,C), presenting an equivocal relationship with regard to the monophyly of Delphacidae. A previous study based on DNA nucleotide sequence data from three nuclear (nr) genes, mt COI, and 132 morphological characters also showed equivocal relationships, forming each member of Delphacidae and Cixiidae, unusual sister groups in a maximum parsimony (MP) analysis [100]. Further, other molecular and morphological studies have not yet reached a full conclusion with regard to their monophyly [29,101].
Nevertheless, several studies have shown mitogenome arrangement, independent of gene sequences, as an important evolutionary characteristic, which is used to infer the relationships between organismal phylogeny and rearrangement evolution in a diverse array of taxonomic groups [64,[97][98][99][102][103][104][105][106]. If the gene rearrangement data are implemented to the current phylogeny, the clustering of Asiracinae in Delphacidae and Cixiidae, both of which evidenced an ancestral arrangement (type A), excluding rearranged subfamily Delphacinae (B, B', and C types), may truly signal phylogenetic closeness between Asiracinae and Cixiidae, resulting in non-monophyly of Delphacidae ( Figure 10B), an expanded sampling will likely result in better inference of their relationships. Among the eight families included in the analyses, six were represented by more than two taxa and five of them, excluding Delphacidae, were supported as the monophyletic groups, almost always with the highest nodal supports (BS = 97-100%, BPP = 1.0) (Table S11; Figure 10). In particular, M. pruinosa and S. marginella belonging to Flatidae formed a strong monophyletic group with the co-familial species G. distinctissima in all analyses (BS = 100%, BPP = 1.0). However, the most specious Delphacidae, represented by two subfamilies (Asiracinae and Delphacinae), formed a non-monophyletic group in the AA-BI analysis by clustering the Asiracinae and Cixiidae, each of which was represented by a single species (topology B; Figure 10B), whereas monophyly of Delphacidae was supported in the AA-ML, NU-ML, and NU-BI analyses (topologies A and C) with moderate to high nodal supports (BS = 72, 93%, BPP = 1; Figure 10A,C), presenting an equivocal relationship with regard to the monophyly of Delphacidae. A previous study based on DNA nucleotide sequence data from three nuclear (nr) genes, mt COI, and 132 morphological characters also showed equivocal relationships, forming each member of Delphacidae and Cixiidae, unusual sister groups in a maximum parsimony (MP) analysis [100]. Further, other molecular and morphological studies have not yet reached a full conclusion with regard to their monophyly [29,101].
Nevertheless, several studies have shown mitogenome arrangement, independent of gene sequences, as an important evolutionary characteristic, which is used to infer the relationships between organismal phylogeny and rearrangement evolution in a diverse array of taxonomic groups [64,[97][98][99][102][103][104][105][106]. If the gene rearrangement data are implemented to the current phylogeny, the clustering of Asiracinae in Delphacidae and Cixiidae, both of which evidenced an ancestral arrangement (type A), excluding rearranged With regard to familial relationships, RIF as one clade with the highest nodal supports of all analyses formed Ricaniidae and Issidae as the sister group with moderate (BS = 58-72%) to higher nodal supports (BPP = 0.99-1.0) (Table S11; Figure 10). Although nodal support by ML analyses was comparatively lower than that of BI, this trend has previously been recognized [107].
Previously, several types of data have supported RIF as an inclusive group: morphology ( Figure 1A-D), concatenated gene segments ( Figure 1F,G), and mitogenome sequences ( Figure 1J-P). Among them, the sister relationship between Ricaniidae and Issidae, leaving Flatidae as the sister lineage of the two families, was supported only in the studies that used several concatenated gene segments and mitogenome sequences ( Figure 1F-N) but not those from morphological data ( Figure 1A-E). For example, Urban and Cryan (2007) performed phylogenetic analyses using DNA nucleotide sequences from seven genes (18S rDNA, 28S rDNA, histone H3, histone 2A, wingless, COI, and ND4) and 86 in-group exemplars representing all major lineages of Hemiptera. They showed the sister relationship between Ricaniidae and Issidae with higher nodal support by ML analysis (BS = 91%) and placed Flatidae as the sister to the Ricaniidae + Issidae group, with the higher nodal support (BS = 87%) ( Figure 1G). Furthermore, several mitogenome-based studies, using 13 PCGs by the Neighbor-Joining method, ML method alone, and both ML and BI methods, all strongly supported the sister relationships between Ricaniidae and Issidae, leaving Flatidae as the basal lineage of the two families ( Figure 1J-N), but exceptions also exist ( Figure 1O,P).
On the other hand, the morphology-based analysis presented either unresolved relationships ( Figure 1A,B) or alternative relationships among RIF, other than the sister relationships between Ricaniidae and Issidae ( Figure 1C-E). In particular, the sister relationship between Ricaniidae and Flatidae is supported by the loss of posterior tentorial arms in the female genitalia ( Figure 1D) [27] and a piercin-excavating ovipositor, which inserts the eggs into the plant [25]. Support for the sister relationship between Flatidae and Ricaniidae is also found in the concatenated gene-based analysis of the Fulgoroidea (66 species in 23 families) using nr (18S rRNA and 28S rRNA) and mt (16S rRNA and CytB) DNA sequences [30] (Figure 1H,I). They support the sister relationship between Ricaniidae and Flatidae with the higher nodal supports in ML and BI trees (BS = 95%, BPP = 1), whereas the sister group to the Ricaniidae + Flatidae group was unequivocal by the placement of moderately supported Achilidae in ML analysis ( Figure 1H) or by the unresolved relationships among the Issidae + Achilidae + Fulgoridae group in the BI tree ( Figure 1I). Therefore, whereas the relationships among RIF were equivocal among previous studies, the current study strongly supports a ((R + I) + F) topology.
In contrast to consistent relationships among RIF in our analyses, the relationships of the other families fluctuate, providing two different topologies (topologies A and B vs. topology C; Figure 10). In topologies A and B, the sister relationships between RIF and Fulgoridae, with the placement of Achilidae as an early derived lineage and Derbidae as the most basal lineage in non-Delphacidae and Cixiidae clades, were supported, presenting familial relationships of (((RIF + Fulgoridae) + Achilidae) + Derbidae) ( Figure 10A,B). On the other hand, topology C supported Derbidae as the sister to the RIF group, instead of Fulgoridae, presenting familial relationships of (((RIF + Derbidae) + Achilidae) + Fulgoridae) ( Figure 10C). Nodal support for the sister relationships between RIF and Fulgoridae in topologies A and B was either lower (BS = 42%; Table S11) or higher (BPP = 0.94), but the sister relationships between RIF and Derbidae in topologies C were always lower (BS = 50%, BPP = 0.66). Taken together, the sister group to RIF is not consistent among datasets and algorithms, although the RIF + Fulgoridae group was slightly better supported. In contrast to the fluctuating relationships of the sister group to RIF, the placement of Achilidae as an earlier-derived lineage to the RIF + Fulgoridae group was strongly supported in AA-ML and AA-BI analyses ( Figure 10A,B; BS = 99%, BPP = 1), whereas the placement of Achilidae as an earlier-derived lineage to the RIF + Derbidae group was poorly supported both in NU-ML and NU-BI analyses ( Figure 10C; BS = 56%, BPP = 0.66).
Within the non-Delphacidae and Cixiidae clade, the placement of Achilidae as an early derived lineage to the RIF + Fulgoridae group has been supported using morphological data ( Figure 1C) and multiple genes by the MP method ( Figure 1F), as was supported in topologies A and B ( Figure 10A,B). However, this result sharply contrasts with those of other studies, such as those that support the sister relationships between Achilidae and Derbidae using morphology ( Figure 1D) and unresolved relationships among the RIF + Achilidae + Fulgoridae + Derbidae group ( Figure 1B) and multiple genes by the BI method ( Figure 1G). In particular, recent two mitogenome-based analyses, which include Achilidae but not Derbidae, both placed the Achilidae and Fulgoridae as the sister group and placed this group as the sister to the RIF group ( Figure 10I,P). It seems that newly added available mitogenome sequences in this study, subsequent to previous studies [21,22], affect the position and relationships of Achilidae and Fulgoridae, although this study is also limited by taxon diversity.
Delphacidae and Cixiidae, each as independent groups (topologies A and C) or a sister group between Cixiidae and Asiracinae, excluding another subfamily, Delphacinae in Delphacidae (topology B), were placed as the most early derived lineage in fulgoroid families ( Figure 10). Nodal supports for the placement of the two families as the most basal lineages were strongly supported in all analyses (BS = 100%, BPP = 1; Figure 10). A close relationship between Cixiidae and Delphacidae has previously been identified. Asche (1987) showed that the two families share an orthopteroid ovipositor and fused gonapophyses on abdominal segment 9. Additionally, the placement of the two families as the most early derived lineage to the remaining fulgoroid families was supported by several types of data, such as morphological data ( Figure 1D), concatenated gene sequence data ( Figure 1F-I), and mitogenome-based studies ( Figure 1K-N), whereas the placement of Delphacidae alone as an earlier-derived lineage than Cixiidae has been suggested based only on adult and larval morphology ( Figure 1C). However, no previous studies support the placement of Cixiidae alone as the most basal lineage of fulgoroid families, whereas our topologies A and C did ( Figure 10A,C).
In summary, our phylogenetic analysis using mitogenome sequences consistently supported the relationships of ((R + I) + F), but the early derived lineages to the RIF group were supported either as (((RIF + Fulgoridae) + Achilidae) + Derbidae) by AAbased analysis or (((RIF + Derbidae) + Achilidae) + Fulgoridae) by NU-based analyses. Overall, slightly higher supports for the former topology suggest that AA sequences, which recorded NU data into AA data, better support the phylogenetic positions of the Fulgoridae, Achilidae, and Derbidae, although limited taxon diversity may also be responsible for such incongruence. The classification of Delphacidae as a non-monophyletic group is unconventional, but gene rearrangement data supported such a possibility by having both ancestral and derived arrangements in the members of Delphacidae.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/cimb43030099/s1, Table S1: List of primers used to amplify the gapped regions of Metcalfa pruinosa H1 and H3 haplotypes; Table S2: Primers used to amplify and sequence the mitochondrial genome of Salurnis marginella;  Table S9: Overlapping and intergenic-spacer sequences of Fulgoroidea mitochondrial genomes; Table S10: BLAST results of the repeat sequences found in the A+T-rich region of Flatidae species; Table S11: Node supports for monophyletic families and inter-familial relationships within Fulgoroidea.