First Mitochondrial Genome from Nemouridae (Plecoptera) Reveals Novel Features of the Elongated Control Region and Phylogenetic Implications

The complete mitochondrial genome (mitogenome) of Nemoura nankinensis (Plecoptera: Nemouridae) was sequenced as the first reported mitogenome from the family Nemouridae. The N. nankinensis mitogenome was the longest (16,602 bp) among reported plecopteran mitogenomes, and it contains 37 genes including 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes and two ribosomal RNA (rRNA) genes. Most PCGs used standard ATN as start codons, and TAN as termination codons. All tRNA genes of N. nankinensis could fold into the cloverleaf secondary structures except for trnSer (AGN), whose dihydrouridine (DHU) arm was reduced to a small loop. There was also a large non-coding region (control region, CR) in the N. nankinensis mitogenome. The 1751 bp CR was the longest and had the highest A+T content (81.8%) among stoneflies. A large tandem repeat region, five potential stem-loop (SL) structures, four tRNA-like structures and four conserved sequence blocks (CSBs) were detected in the elongated CR. The presence of these tRNA-like structures in the CR has never been reported in other plecopteran mitogenomes. These novel features of the elongated CR in N. nankinensis may have functions associated with the process of replication and transcription. Finally, phylogenetic reconstruction suggested that Nemouridae was the sister-group of Capniidae.


Introduction
The mitochondrial genome (mitogenome) has become one of the most widely used molecular markers in insect taxonomy, population genetics, evolutionary biology and phylogenetics [1]. An insect mitogenome is generally a double-stranded circular molecule of 14-20 kb. It usually contains a typical set of 37 genes: 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes and two ribosomal RNA (rRNA) genes [2,3]. It also contains a non-coding control region (CR), which is involved in the initiation and regulation of transcription and replication of the mitogenome [4][5][6]. The CR is the most variable region in both A+T content and length, and several structural elements are expected to be present in a CR: (1) a poly-T stretch near the 5′ end of the CR; (2) a poly-[TA(A)]n stretch following the poly-T stretch; (3) conserved stem-loop (SL) structures with a 5′ flanking TATA motif and a 3′ flanking G(A)nT motif; and (4) a G+A-rich region located downstream of the CR [7]. The role of these structures in replication has been well discussed, but the transcriptional features of insect mitogenomes remain poorly understood [6][7][8][9].
The Plecoptera (stoneflies) are an ancient group of insects that are important for reconstructing the evolutionary history of insects and serve as bioindicators of water quality [10]. To date, 15 complete or nearly complete plecopteran mitogenomes have been reported, and several attempts have been made to reconstruct the phylogeny of Plecoptera from the growing body of mitogenomic data [11][12][13][14][15][16][17][18][19][20][21][22][23][24]. However, the phylogenetic position of Plecoptera within Insecta and the phylogenetic relationships among stoneflies remain controversial.
To facilitate the study of mitogenome phylogeny in Plecoptera, we report the complete mitogenome of Nemoura nankinensis, which was the first sequenced mitogenome from Nemouridae. In this study, the organization, nucleotide composition, codon usage, secondary tRNA structures, and novel features of the elongated CR in the N. nankinensis mitogenome were analyzed. Finally, the phylogenetic relationships of N. nankinensis and other stoneflies were reconstructed based on PCG sequences.

Genome Annotation and Base Composition
The complete mitogenome of N. nankinensis was 16,602 bp in length, longer than any other reported stonefly mitogenome. It contained the 37 typical mitochondrial genes (13 PCGs, 22 tRNAs and two rRNAs) and a large non-coding control region; 23 genes (nine PCGs and 14 tRNAs) were located on the majority strand (J-strand) and 14 genes (four PCGs, eight tRNAs and two rRNAs) on the minority strand (N-strand) (Figure 1, Table 1). The highly conserved gene arrangement of the N. nankinensis mitogenome was identical to that of other stoneflies and of the model insect Drosophila yakuba, which has the putative ancestral arthropod arrangement [25]. The N. nankinensis mitogenome contained 36 overlapping nucleotides in 11 pairs of neighboring genes, with individual overlaps of 1-8 bp. The longest overlap (8 bp) was located between trnCys and trnTrp. Excluding the large control region, a total of 72 intergenic nucleotides (IGN) were found in 13 locations, ranging in size from 1 to 31 bp.
In the N. nankinensis mitogenome, the A+T content of the whole mitogenome, PCGs, tRNAs, rRNAs and the control region was 71.2%, 69.0%, 71.4%, 73.5% and 81.9%, respectively (Table 2). Across the 37 genes, the A+T content ranged from 60.6% in trnTyr to 90.9% in trnGlu, showing a bias toward A and T nucleotides. The AT skew and GC skew of the N. nankinensis mitogenome were calculated and showed a biased use of A and C nucleotides (Table 2). For the J-strand, the AT skew of PCGs was negative and the GC skew of tRNA genes was positive, which was inconsistent with the strand bias of most other insects (positive AT skew and negative GC skew for the J-strand) [26].
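The composition statistics above can be illustrated with a minimal sketch. The text does not state the formulas used, so the commonly used definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C) are assumed here; the function name and the toy sequence are hypothetical and are not taken from the N. nankinensis data.

```python
from collections import Counter

def composition_stats(seq: str) -> dict:
    """Return A+T content (%), AT skew and GC skew for a DNA sequence.

    Assumed definitions (not stated in the text):
    AT skew = (A - T) / (A + T), GC skew = (G - C) / (G + C).
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c  # ambiguous bases such as N are ignored
    return {
        "at_content": 100.0 * (a + t) / total,
        "at_skew": (a - t) / (a + t),
        "gc_skew": (g - c) / (g + c),
    }

# Toy sequence for illustration only.
stats = composition_stats("AAATTTTTGC")
print(stats)
```

A negative AT skew means T is used more often than A on the analysed strand, and a negative GC skew means C is used more often than G. The sketch does not guard against sequences lacking A/T or G/C, which would cause a division by zero.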

Protein-Coding Genes, Transfer RNA and Ribosomal RNA Genes
The 13 PCGs of N. nankinensis were similar in length and arrangement to those of other sequenced stonefly mitogenomes. Eleven PCGs initiated with the standard start codon ATN (ATT and ATG), while cox1 used CCG and nad1 used TTG as a start codon. Ten PCGs had complete termination codons (TAA or TAG), whereas cox1, cox2 and nad5 ended with the incomplete termination codon T, which can be completed to TAA by post-transcriptional polyadenylation [27].
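The distinction between complete and incomplete termination codons can be sketched as a simple check on the length and 3′ end of a coding sequence. This is an illustrative sketch only: the function name and the toy sequences are hypothetical, and the annotation procedure actually used for N. nankinensis is not described in this section.

```python
COMPLETE_STOPS = {"TAA", "TAG"}

def classify_stop(cds: str) -> str:
    """Classify the 3' end of a coding sequence given in reading frame.

    A trailing T or TA outside the last full codon is treated as an
    incomplete stop that polyadenylation can complete to TAA.
    """
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0:
        return "complete" if cds[-3:] in COMPLETE_STOPS else "no stop codon"
    tail = cds[-remainder:]
    if tail in ("T", "TA"):
        return "incomplete (" + tail + ")"
    return "no stop codon"

# Toy sequences for illustration only.
print(classify_stop("ATGAAATAA"))  # ends with a full TAA codon
print(classify_stop("ATGAAAT"))    # ends with a single trailing T
```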

The total length of the 22 tRNA genes was 1404 bp, and individual genes ranged from 63 to 71 bp, with an average A+T content of 71.4%. All tRNA genes of N. nankinensis were predicted to fold into typical cloverleaf secondary structures (Figure 3), except trnSer (AGN), in which the dihydrouridine (DHU) arm was reduced to a small loop, as is common in many other metazoan mitogenomes [28]. In addition, 30 mismatched base pairs were identified in the tRNA genes, all of them G-U pairs. In N. nankinensis, the anticodons of the 22 tRNAs were identical to those of other stoneflies, and the AGG codon was translated as Lys instead of Ser, which indicated the use of a variant of the invertebrate mitochondrial genetic code in this mitogenome (Table 1). This phenomenon has been well discussed by Abascal et al., who concluded that shifts between alternative genetic codes occur frequently within the main arthropod lineages [29,30].
The large ribosomal RNA (rrnL) gene of N. nankinensis was 1327 bp in length with an A+T content of 74.9%, and the small ribosomal RNA (rrnS) gene was 790 bp with an A+T content of 71.3% (Table 1). The two rRNA genes were located between trnLeu (CUN) and the control region, consistent with other stonefly species.
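The effect of reading AGG as Lys rather than Ser can be shown with a small translation sketch. It builds the standard invertebrate mitochondrial code (NCBI translation table 5) from its compact amino acid string and then overrides AGG; the function names and the toy sequence are hypothetical, and the override is only an illustration of the variant code discussed above, not the annotation pipeline used in the study.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 5 (invertebrate mitochondrial); codons are ordered
# TTT, TTC, TTA, TTG, TCT, ... with the first base varying slowest.
TABLE5_AAS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"

def build_code(agg_as_lys: bool = False) -> dict:
    """Return a codon -> amino acid map, optionally with AGG read as Lys (K)."""
    codons = ("".join(p) for p in product(BASES, repeat=3))
    code = dict(zip(codons, TABLE5_AAS))
    if agg_as_lys:
        code["AGG"] = "K"  # variant code: AGG = Lys instead of Ser
    return code

def translate(cds: str, code: dict) -> str:
    """Translate full codons only; unknown codons become X."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(code.get(cds[i:i + 3], "X") for i in range(0, usable, 3))

# Toy sequence for illustration only.
toy = "ATGAGGTGA"
print(translate(toy, build_code()))                 # AGG read as Ser
print(translate(toy, build_code(agg_as_lys=True)))  # AGG read as Lys
```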

The Control Region
The control region (CR) of the N. nankinensis mitogenome is currently the longest known CR (1751 bp) with the highest A+T content (81.8%) among stoneflies, and was located at the conserved position between rrnS and trnIle (Figure 1, Tables 1 and 3). First, a large repeat region (positions 15,021-16,049) was identified, which was 1029 bp long and contained 3.1 tandem repeats (Figure 4). Each of the three repeated sequences could be folded into the same trnAsn-like structure encoded on the N-strand. These long tandem repeats might explain the large size of the CR in N. nankinensis.
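The arithmetic behind the repeat region can be checked with a short sketch. The coordinates, CR length and copy number are taken from the text; the length of a single repeat unit is not stated, so the value computed here is only an estimate implied by the rounded copy number of 3.1. The variable and function names are hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1

repeat_region = span_length(15021, 16049)    # reported as 1029 bp
cr_length = 1751                             # reported CR length in bp
copies = 3.1                                 # reported number of tandem repeats

# Estimates implied by the reported values (assumption: equal-length units).
implied_unit = repeat_region / copies        # about 332 bp per repeat unit
fraction_of_cr = repeat_region / cr_length   # about 0.59 of the CR

print(repeat_region, round(implied_unit), round(fraction_of_cr, 2))
```

Under these assumptions the tandem repeats account for more than half of the CR, which is in line with the suggestion that they explain its large size.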

Five potential stem-loop (SL) structures were also detected in the CR ([...] 552-16,572). The proposed "G(A)nT" motif was detected after SL-1 and SL-4, but it was modified to "GTA" after SL-2 and SL-3 and to "TGA" after SL-5. These SL structures are considered to be associated with the initiation of mitogenome replication and transcription [31]. Interestingly, a trnGln-like structure was found between SL-4 and SL-5, encoded on the majority strand. The presence of tRNA-like structures in the CR has not been reported in other plecopteran mitogenomes, and the underlying mechanisms are unclear. These tRNA-like structures may have signaling functions in the process of transcription [32].
When compared with the 11 available CRs of other stoneflies, four conserved sequence blocks (CSBs) were identified in N. nankinensis (Figure 5). These CSBs ranged in size from 35 to 100 bp, and their sequence identity among species was generally over 50% (Figure 5). However, the function of these CSBs is still unclear.
To date, sequenced stonefly mitogenomes have come mainly from the superfamily group Systellognatha; only N. nankinensis and the three Capniidae species belong to the group Euholognatha. The CR of the Capnia zijinshana (Plecoptera: Capniidae) mitogenome has been completely sequenced and is 1513 bp in length, the second longest among stoneflies. Accordingly, we speculate that as more mitogenomes are sequenced from Plecoptera, especially from Euholognatha, highly varied mitogenome sizes and more novel structural features will be found, and their functions and phylogenetic implications will become clearer.
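The search for the "G(A)nT" motif flanking the stem-loop structures can be sketched with a regular expression. This is an illustration only: it assumes n ≥ 1 (a G, one or more A, then a T), the function name is hypothetical, and the toy sequence is not part of the N. nankinensis CR. The method actually used to detect the motif is not described in this section.

```python
import re

# Assumption: n >= 1, i.e. a G, one or more A, then a T.
GA_N_T = re.compile(r"GA+T")

def find_ga_n_t(seq: str) -> list:
    """Return (0-based start, matched motif) for every G(A)nT occurrence."""
    return [(m.start(), m.group()) for m in GA_N_T.finditer(seq.upper())]

# Toy sequence for illustration only.
hits = find_ga_n_t("CCGAATCCGTACCTGA")
print(hits)
```

The modified forms "GTA" and "TGA" do not match this pattern, so they would have to be searched for separately.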

Phylogenetic Analyses
The phylogenetic analyses were based on the concatenated nucleotide sequences of the 13 PCGs from 14 available stonefly mitogenomes, with one Ephemeroptera species included as the outgroup (Table 3). Bayesian inference (BI) and maximum likelihood (ML) analyses generated similar tree topologies (Figures 6 and 7). In both analyses, N. nankinensis was recovered as the sister group of the three species from Capniidae, and the species from the other five families were grouped together. This agrees with the taxonomic view that both Nemouridae and Capniidae belong to the superfamily group Euholognatha, while the other five families belong to Systellognatha. In addition to Capniidae, the clade containing the species from Perlidae was well supported in both trees. However, the phylogenetic positions of four families, Pteronarcyidae, Chloroperlidae, Styloperlidae and Peltoperlidae, remained unclear. These results were generally consistent with the recent study by Chen and Du [12]. The uncertainty and inconsistency of recent mitogenomic phylogenetic studies of Plecoptera may result from the limited mitogenomic data, and more sequencing work is needed to resolve this problem.


Sample Preparation and DNA Extraction
Specimens of N. nankinensis were collected in February 2016 from Zijin Mountain, Jiangsu Province, China. Our research activities were not prohibited by any organization or individual and did not involve endangered or protected species. Specimens used in this study were identified, preserved in 100% ethanol and stored at −20 °C. Genomic DNA was extracted from adults using the Column mtDNAout kit (Tianda, Beijing, China) and stored at −20 °C until used for PCR.

PCR Amplification and Sequencing
Five pairs of LA-PCR primers were used to amplify segments of the N. nankinensis mitogenome (Table A1). LA-PCR conditions were as follows: initial denaturation at 93 °C for 2 min; 40 cycles of denaturation at 92 °C for 10 s, annealing at 54 °C for 30 s and elongation at 68 °C for 8 min, with the elongation time increased by 20 s per cycle in the final 20 cycles; and a final elongation at 68 °C for 10 min. PCR products were separated by 1.0% agarose gel electrophoresis and purified with an Axygen DNA Gel Extraction Kit (Axygen Biotechnology, Hangzhou, China). All PCR fragments were sent to Map Biotech Company (Shanghai, China) for sequencing. First, the LA-PCR fragments were partially sequenced by shotgun sequencing combined with primer walking (Table A1). Then, 15 specifically designed primer pairs were used to close the remaining gaps, including the CR (Table A1).
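The elongation step of the LA-PCR program can be written out per cycle as a short sketch. It assumes that the first 20 cycles use the 8 min base time and that the 20 s increment is first applied in cycle 21; the text does not specify the exact cycle at which the increment starts, and the function name and parameter names are hypothetical.

```python
def elongation_schedule(base_s: int = 480, step_s: int = 20,
                        fixed_cycles: int = 20, ramp_cycles: int = 20) -> list:
    """Elongation time in seconds for each cycle.

    Assumption: the increment is first applied in the first ramp cycle,
    so cycle 21 runs for base_s + step_s seconds.
    """
    times = [base_s] * fixed_cycles
    times += [base_s + step_s * (i + 1) for i in range(ramp_cycles)]
    return times

schedule = elongation_schedule()
print(len(schedule), schedule[19], schedule[20], schedule[-1])
print(sum(schedule) / 3600)  # total elongation time in hours
```

Under this reading, the last cycle elongates for 880 s (14 min 40 s) and the elongation steps alone take 6.5 h.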

Phylogenetic Analyses
Phylogenetic analyses were based on the nucleotide sequences of the 13 PCGs from N. nankinensis and 13 other stonefly mitogenomes available in GenBank (Table 3). Parafronurus youi (Accession No. EU349015), from the insect order Ephemeroptera, was used as the outgroup. Each of the 13 PCGs was aligned with Clustal X as implemented in MEGA v. 6.0 using default settings, and the alignments, excluding stop codons, were then concatenated [39]. The final alignment was 11,154 nucleotides long. The best-fit nucleotide substitution model was determined with MEGA v. 6.0 using the Bayesian Information Criterion (BIC), and the GTR+G+I model was selected for the analyses. Bayesian inference (BI) and maximum likelihood (ML) analyses were performed using MrBayes v. 3.1.2 [40] and the RAxML Web-Server (http://embnet.vital-it.ch/raxml-bb/index.php) [41], respectively. The BI analysis was run for 3 million generations with sampling every 100 generations, four chains (one cold and three hot) and a burn-in of 25% of the sampled trees. After 3 million generations, stationarity of all runs was checked with Tracer v. 1.5 (effective sample sizes exceeded 200) [42]. Support values on the BI tree are Bayesian posterior probabilities. In the ML analysis, 1000 bootstrap replicates were performed under the GTRGAMMA substitution model. Finally, the phylogenetic trees were drawn with FigTree v. 1.4.2 [43].
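The number of trees retained from the BI analysis follows from the settings above and can be checked with a short calculation. The sketch counts samples per run and ignores the starting state, which some programs also record; the variable names are hypothetical.

```python
generations = 3_000_000   # MCMC generations, from the text
sample_every = 100        # sampling frequency, from the text
burnin_fraction = 0.25    # burn-in, from the text

samples = generations // sample_every        # sampled trees per run
discarded = int(samples * burnin_fraction)   # trees discarded as burn-in
retained = samples - discarded               # trees used for the consensus

print(samples, discarded, retained)
```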

Conclusions
The mitogenome of N. nankinensis is the longest among reported stonefly mitogenomes. Its gene arrangement is highly conserved and identical to that of other stoneflies. Novel features were found in the elongated CR, including a large tandem repeat region, five SL structures, four tRNA-like structures and four CSBs. These structural elements may have functions associated with replication and transcription. Phylogenetic analyses supported Nemouridae as the sister group of Capniidae, consistent with previous studies.