Characterization, Comparison of Two New Mitogenomes of Crocodile Newts Tylototriton (Caudata: Salamandridae), and Phylogenetic Implications

Mitochondrial genomes (mitogenomes) are valuable resources in molecular and evolutionary studies, such as phylogeny and population genetics. The complete mitogenomes of two crocodile newts, Tylototriton broadoridgus and Tylototriton gaowangjienensis, were sequenced, assembled, and annotated for the first time using next-generation sequencing. The complete mitogenomes of T. broadoridgus and T. gaowangjienensis were 16,265 bp and 16,259 bp in lengths, which both composed of 13 protein-coding genes (PCGs), 2 rRNA genes, 22 tRNA genes, and 1 control region. The two mitogenomes had high A + T content with positive AT-skew and negative GC-skew patterns. The ratio of non-synonymous and synonymous substitutions showed that, relatively, the ATP8 gene evolved the fastest and COI evolved the slowest among the 13 PCGs. Phylogenetic trees from BI and ML analyses resulted in identical topologies, where the Tylototriton split into two groups corresponding to two subgenera. Both T. broadoridgus and T. gaowangjienensis sequenced here belonged to the subgenus Yaotriton, and these two species shared a tentative sister group relationship. The two mitogenomes reported in this study provided valuable data for future molecular and evolutionary studies of the genus Tylotoriton and other salamanders.


Introduction
Vertebrate mitochondrial genome (mitogenome) is double-stranded circular DNA, typically 16-17 kb in length [1,2]. It encodes, usually, 13 protein-coding genes (PCGs), 2 rRNA genes, 22 tRNA genes, and 1 non-coding control region (CR) that contains information for initiating and regulating gene replication and transcription [3][4][5]. The mitogenome has many characteristics, such as low levels of recombination, multiple copy numbers, simple structure with conserved coding regions, rapid evolutionary rate, and maternal inheritance [6]. For some of these features, mitochondrial DNAs (mtDNAs) have been extensively used as molecular markers for reconstructing phylogenetic relationships, revealing population genetic structures, estimating divergence times, identifying relatedness between recently diverged species, etc. [7,8].
The salamandrid genus Tylototriton mainly inhabit montane waterside areas throughout the eastern Himalaya to Indo-China peninsular, including India, Nepal, Myanmar, Thailand, Laos, Vietnam, and central and southern China [9]. The species of this genus are known as crocodile newts because they have a very peculiar appearance like crocodiles: a flat head, large mouth, and highly rough skin with varying-sized warts lined on the

Sample Collection and Sequencing
Samples of T. broadoridgus were collected from the type locality, Badagongshan National Nature Reserve in Sangzhi County in Hunan Province, China. T. gaowangjienensis was collected from Gaowangjie National Nature Reserve in Guzhang County, and also in Hunan Province of China. The permissions of field survey for scientific purposes were approved by the local Bureau of National Nature Reserve, and the collection of newts used in this study complied with the Wildlife Protection Act of China. According to the "3R principle" (Reduction, Replacement, and Refinement) of animal sampling, only one sample in each population was used in this study. All the procedures of animal collection and treatment were complied with the guidance of the Code of Practice for the Housing and Care of Animals. The specimens were brought back into the laboratory smoothly, and then euthanatized and preserved in 95% alcohol as voucher specimens deposited in Jishou University (T. broadoridgus, voucher no. JWS20221095; T. gaowangjienensis, voucher no. JWS20210100). A small part of the tail samples was used for molecular analysis of the two species. The total DNA was extracted using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany), and then the DNA library was constructed, and high-throughput sequencing was conducted in paired-end mode on the DNBSEQ-T7 platform (Complete Genomics and MGI Tech, Shenzhen, China). As the estimated genome size was 25 Gb in Tylototriton, approximately 100 Gb raw reads of each sample, with 150 bp read length, were finally generated.

Sequence Assembly, Annotation, and Analysis
The complete mitogenomes of the two samples were assembled using three popular tools, the NOVOPlasty 4.3 [24], MitoZ [25], and MEANGS [26], to increase the success rate and facilitate the mutual correction of undefined sites. The annotation of the final assembled mitogenomes was conducted within the online servers of both MITOS2 [27] and GeSeq [28]. The tRNAscan-SE 1.21 online tool was adopted to predict the secondary structure of tRNAs [29]. The nucleotide composition and codon usage of PCGs were calculated using MEGA 11.0 [30]. The AT skew and GC skew were analyzed using the formula: AT-skew = [A − T]/[A + T] and GC-skew = [G − C]/[G + C] [1]. The ratio of non-synonymous (Ka) and synonymous (Ks) substitutions were calculated using DNASP 6.0 [31] based on 12 species of Tylototritons (10 species were downloaded from NCBI). The plots of codon usage frequencies and Ka/Ks ratio were drawn using the Origin software [32].

Phylogenetic Analysis
To reveal the phylogenetic position of the two species we sequenced in this study, another 31 species of Caudata were downloaded from NCBI, whereas the Batrachuperus pinchonii in the family Hynobiidae was selected as the outgroup. All of the 13 PCGs were extracted and checked manually through MEGA 11.0 [30], and then each of the PCG alignments based on 33 species were concatenated to make a combined dataset. The best-fit partitioning scheme and partition-specific models were calculated using Partitionfinder 2.1.1 [33], and the sites of codons 1, 2, and 3 of each PCG were assigned. Phylogenetic relationships were reconstructed under Bayesian inference (BI) and maximum likelihood (ML) methods. BI trees were analyzed using MrBayes 3.2.6 [34], running 1,000,000 generations and sampling every 100 generations; after discarding the first 25% samples as burn-in, posterior probabilities (PP) were calculated into a consensus tree. ML trees were performed using RaxML 8.0.2 [35] by executing 10 runs of random additional sequences and generating the bootstrap values following 1000 rapid bootstrap replicates.

Mitogenome Assembly and Undefined Sites Identification
NOVOPlasty was capable to assemble the fully circled mitogenomes of both samples, with a length of 16,265 bp for T. broadoridgus and 16,259 bp for T. gaowangjienensis. However, the MitoZ can only fully assemble the sample of T. broadoridgus, and the MEANGS can only assemble several fragments (or contigs) for both species. The results of NOVOPlasty, although better-resolved, revealed two undefined sites (probably SNPs) in T. gaowangjienensis that presented as degenerate codons, including the loci of Y (2976) and Y (4110). However, these sites assembled from other two tools were presented either as defined sites (MitoZ) or SNPs within multiple assembled fragments (MEANGS). From a conservative concern, we used the assembly results from NOVOPlasty as the basic sequences, and replaced these undefined sites by the results of MitoZ and MEANGS. The two undefined sites of T. gaowangjienensis were, therefore, corrected as C (2976) and C (4110) accordingly. The final mitogenomes of T. broadoridgus and T. gaowangjienensis without any undefined sites were submitted and used for the following analyses.

Mitogenome Annotation and Nucleotide Composition
The complete mitogenomes of T. broadoridgus and T. gaowangjienensis were 16,265 bp and 16,259 bp in length; of which, both composed 13 PCGs (ATP6, ATP8, CYTB, ND4L, COI-III, ND1-6), 2 rRNA (12S rRNA and 16S rRNA), 22 tRNA genes, and a control region ( Figure 1, Table 1). Most of the PCGs, tRNA genes, and rRNA genes were encoded on the heavy strand (H-strand), except ND6 and eight tRNA genes that encoded on the light strand (L-strand). All of the PCGs of T. broadoridgus and T. gaowangjienensis started with ATG except the ATP6 gene in T. broadoridgus, and the COI gene in both species used GTG as the start codon. There were four kinds of stop codons of the PCGs, whereas the ND2, COI, ATP8, ATP6, ND3, NDL, and ND5 ended with TAA; the ND1 ended with TAG; ND6 ended with AGA; and the COII, COIII, ND4, and CYTB used incomplete T(AA) as the stop codon ( Table 1). The final mitogenomes of the two species with annotated information have been deposited in GenBank (accession number: OP598114 and ON764431).    Each of the PCGs of the two species was identical in length, but with different lengths in non-PCG regions, such as the 16S and 12S rRNA gene in T. broadoridgus which were 3 bp and 1 bp longer than that of T. gaowangjienensis, respectively (Table 1). Both the mitogenomes of T. broadoridgus and T. gaowangjienensis contained a total of 41 bp overlapping sites, which were shared in 10 pairs of neighboring genes, ranging from 1 to 15 bp in length. The longest one (15 bp) was overlapped between ND4L and ND4. For T. broadoridgus, a total of 146 bp intergenic nucleotides (IGN) was dispersed in 15 locations, ranging from 1 to 108 bp in length. The longest one (108 bp) was that between tRNA Thr and tRNA Pro . The IGN of T. gaowangjienensis distributed a similar pattern to that of T. broadoridgus, but only had 145 bp sites ( Table 1).

Characteristics of PCGs and Codon Usage
The total length of PCGs in both T. broadoridgus and T. gaowangjienensis were identical to 11 Table 3). All the PCGs presented positive AT-skews, except a negative value existed in COI and ND6, and all the PCGs presented negative GC-skews, except a positive value existed in ND6 (Figure 2). Given that the different codon positions might have different codon bias in PCGs, we also examined nucleotide compositions from the three codon positions of T. broadoridgus and T. gaowangjienensis (Table 3). Interestingly, the A + T contents were slightly increasing from the first to third codon positions. In detail, there were 55.4% for PCGs-first, 58.7% and 58.4% for PCGs-second, and 60.8% and 60.0% for PCGs-third for the two species, respectively. All of the codons showed positive AT-skews, except the second codon showed a slightly negative value, whereas all GC-skews in three codon positions were negative (Table 3).
Codon usage bias would drive genes to evolve at different rates [36]. Statistics on the relative synonymous codon usage (RSCU) of T. broadoridgus and T. gaowangjienensis showed they shared very similar patterns (Figure 3). In terms of codon frequencies, the CUA (Leu), CCA (Pro), CGA (Arg), and UCA (Ser1) were the most abundant codons in both T. broadoridgus and T. gaowangjienensis. The calculation of the Ka/Ks ratio of each PCG would assess the different evolutionary rate [37]. Among the analyzed 12 species of Tylototriton, the ATP8 gene evolved relatively fast and exhibited the highest Ka/Ks value, whereas COI, on the contrary, showed the lowest Ka/Ks ( Figure 4); however, the Ka/Ks for all 13 PCGs were below 0.7, and did not show positive selection signals. Table 3. Nucleotide composition and skewness of the mitogenomes of T. broadoridgus (TB) and T. gaowangjienensis (TG). TB  TG  TB  TG  TB  TG  TB  TG  TB  TG  TB  TG  TB  TG  TB  in three codon positions were negative (Table 3). Codon usage bias would drive genes to evolve at different rates [36]. Statistics on the relative synonymous codon usage (RSCU) of T. broadoridgus and T. gaowangjienensis showed they shared very similar patterns (Figure 3). In terms of codon frequencies, the CUA (Leu), CCA (Pro), CGA (Arg), and UCA (Ser1) were the most abundant codons in both T. broadoridgus and T. gaowangjienensis. The calculation of the Ka/Ks ratio of each PCG would assess the different evolutionary rate [37]. Among the analyzed 12 species of Tylototriton, the ATP8 gene evolved relatively fast and exhibited the highest Ka/Ks value, whereas COI, on the contrary, showed the lowest Ka/Ks ( Figure 4); however, the Ka/Ks for all 13 PCGs were below 0.7, and did not show positive selection signals. Table 3. Nucleotide composition and skewness of the mitogenomes of T. broadoridgus (TB) and T. gaowangjienensis (TG). TB  TG  TB  TG  TB  TG  TB  TG  TB  TG  TB  TG  TB  TG  TB  TG  D-

Characteristics of rRNAs, tRNAs, and the Control Region
There were two rRNA genes of both T. broadoridgus and T. gaowangjienensis: the 16S rRNA was located between tRNA Val and tRNA Leu , with corresponding lengths of 1563 bp and 1560 bp, whereas the 12S rRNA was located between tRNA Phe and tRNA Val , with corresponding lengths of 928 bp and 927 bp ( Table 2). The nucleotide composition of the two rRNAs were similar, and the AT-skews were positive and the GC-skews were negative for the two species (Table 3).
The mitogenomes of both T. broadoridgus and T. gaowangjienensis contained 22 tRNA genes; of which, eight genes, including tRNA Gln , tRNA Ala , tRNA Asn , tRNA Cys , tRNA Tyr ,

Characteristics of rRNAs, tRNAs, and the Control Region
There were two rRNA genes of both T. broadoridgus and T. gaowangjienensis: the 16S rRNA was located between tRNA Val and tRNA Leu , with corresponding lengths of 1563 bp and 1560 bp, whereas the 12S rRNA was located between tRNA Phe and tRNA Val , with corresponding lengths of 928 bp and 927 bp ( Table 2). The nucleotide composition of the two rRNAs were similar, and the AT-skews were positive and the GC-skews were negative for the two species (Table 3).
The mitogenomes of both T. broadoridgus and T. gaowangjienensis contained 22 tRNA genes; of which, eight genes, including tRNA Gln , tRNA Ala , tRNA Asn , tRNA Cys , tRNA Tyr , tRNA Ser , tRNA Glu , and tRNA Pro , were on the L-strand, and the rest were on the H-strand ( Table 1). The total length of tRNAs was 1537 bp in both species, and the individual length of each tRNA gene was generally identical, except tRNA Phe and tRNA Gly have a 1 bp difference between the two species. It ranged from 66-75 bp of all the tRNA genes, with the longest tRNA Leu and the shortest tRNA Cys (Table 3). All tRNA genes, except tRNA Ser , can be folded into a typical cloverleaf structure.
The non-coding control region, also known as the D-loop, was usually the sequence region with greatest variations across the mitogenome. Here, the D-loops of T. broadoridgus and T. gaowangjienensis were 716 bp and 715 bp in length, located between tRNA Pro and tRNA Phe , with similar A + T contents, AT-skew, and GC-skew values (Table 3).
The genus Tylototriton was one group of the primitive newts, which divided into, as expected, two major groups corresponding to the two subgenera, Tylototriton and Yaotriton. The subgenus Yaotriton was further divided into two subgroups. The first subgroup included T. biegleri and T. asperrimus, and the second one included T. wenxianensis, and both T. broadoridgus and T. gaowangjienensis that we sequenced in this study. T. broadoridgus and T. gaowangjienensis were revealed as sister groups, and then clustered with T. wenxianensis. The subgenus Tylototriton also divided into two subgroups. The first one included T. pseudoverrucosus and T. taliangensis, and the second one included five species that diverged in the following sequences: T. kweichowensis, T. shanorum, T. yangi, T. verrucosus, and T. shanjing.

Species Verification from ND2 and 16S rRNA Gene
The species verification was fundamental for reporting a new mitogenome. Following the suggestions of Sangster and Luksenburg (2021) [38], we verified the identity of our mitogenome sequence of T. broadoridgus with reference sequences of two commonly used markers in Tylotoriton systematics [7]: the ND2 (1035 bp; n = 107, incl. three of T. broadoridgus, KC147814, KY800837, and OK539842) and 16S rRNA (508 bp; n = 89, incl. two of T. broadoridgus, KY800569 and KY800570). In each of these analyses, our sequence of T. broadoridgus clustered with the reference sequences of T. broadoridgus, indicating that our sample was correctly identified. As there are no reference sequences of T. gaowangjiensis, we added our newly obtained ND2 and 16S rRNA sequences into a previous dataset from a phylogenetic study that included the most species of Tylotoriton so far [7]. A simple neighbor-joining tree based on both ND2 and 16S rRNA sequences revealed the sister species of T. gaowangjiensis was T. dabienicus, which was collected from Shangcheng County, Anhui Province of China [7], with genetic distances that ranged from 1% to 2%.

Discussion
The genome sizes of species would affect the rate of assembly success for both nuclear and organellar genomes, with a putatively positive relationship between genome sizes and sequence complexities [39]. Because the genome sizes of Caudata amphibians were relatively large-for instance, the genome sizes of Tylotoriton that we studied werẽ 25 G [40]-the two samples we sequenced were assembled using three popular tools, NOVOPlasty, MitoZ, and MEANGS, for comparisons. Generally, NOVOPlasty performed best, as it assembled fully circled mitogenomes for both samples; however, it produced several undefined loci that presented as degenerate codons. However, these sites assembled from the other two tools were presented as either defined sites (MitoZ) or SNPs in multiple assembled fragments (MEANGS). This might result from the different strategies of the three software in balancing the assembly performance and error-tolerance rates [24][25][26]. We carried out a strategy to correct the undefined sites from NOVOPlasty with defined sites or more abundant SNPs from MitoZ and MEANGS, and, thus, we obtained the final whole mitogenomes without any undefined sites. This approach would be helpful for the mitogenome assembly of other species with relatively large genome sizes. As far as we know, the mitogenomes of the two crocodile newts, T. broadoridgus and T. gaowangjienensis, were assembled in this study for the first time. The characteristics of the mitogenomes of the two species were very similar, in terms of mitogenome size and organization ( Figure 1, Table 1); and nucleotide composition of PCGs, rRNAs, tRNAs, control region, or codon usage of PCGs (Figures 2 and 3, Table 3). They also showed very similar patterns with other Tylototriton species reported previously [11,[41][42][43][44][45][46], even though some minor differences remained. For example, the ND3 used the incomplete "T-" as the stop codon in T. wenxianensis [41], but this gene used the conventional stop codon "TAA" in both T. broadoridgus and T. gaowangjienensis. Similarly, the ND5 used "TAG" as the stop codon in T. taliangensis [42], but it stopped with "TAA" in both T. broadoridgus and T. gaowangjienensis (Table 1). Whether the diverse usage of stop codons among the closely-related species was generated randomly or with some meaningful preferences was an interesting question of selection, but was not given much attention.
In this study, the phylogenetic relationships within Salamandridae were able to be reconstructed, while using 32 representative species, including 2 newly obtained in this study, as the ingroups, and B. pinchonii in Hynobiidae as the outgroup. The relationship of the three subfamilies of the Salamandridae has been highly supported and broadly consistent with previous studies [11,47]. However, some new, but different, inter-generic relationships were also revealed. For instance, while using only 16S rRNA and ND2 as gene markers, Wang et al. (2018) recovered a sister group relationship of Triturus and Neurergus, but, here, we revealed the sister group of Triturus was Calotriton ( Figure 5). Although the phylogenetic hypotheses would be changed based on different DNA markers [12,48], it was speculated that more sequences used, such as the 13 PCGs here, would be generally better to understand the real phylogenetic relationships.

Conclusions
In summary, we have successfully sequenced and assembled the complete mitogenomes of two rare species of crocodile newts, T. broadoridgus and T. gaowangjienensis, for the first time. We further provided detailed characteristics of the two mitogenomes in aspects of gene orders, nucleotide composition, and codon usages from different regions, such as rRNAs, tRNAs, PCGs, and the non-coding control region. The phylogenetic trees using the BI and ML methods based on 13 PCGs of 32 species have provided well-supported major clade relationships within Salamandridae for reference, as well as revealed new relationships among Tylotoriton, which we were concerned with the most. The two mitogenomes reported here and the detailed analyses in this study would provide valuable materials and data for future taxonomic and evolutionary studies of the genus Tylotoriton and other salamanders. The crocodile newts, Tylotoriton, were recovered as a monophyletic group split into two well-supported subgenera (Yaotriton and Tylotoriton) ( Figure 5). The inter-specific relationships were broadly consistent with the previous findings that were based on several gene fragments [7,14,18,49,50], but some interesting differences were also revealed and are worth attention. For example,   [49] found that in the subgenus Tylotoriton, T. verrucosus branched off first, followed by T. shanjing and T. yangi. In contrast, our study showed that T. yangi divided first and then grouped with T. shanjing plus T. verrucosus. Both T. broadoridgus and T. gaowangjienensis that we sequenced in this study were recovered as members in the subgenus Yaotriton, which was consistent with the morphological studies. It was also reasonable in terms of the distribution area of this unique group [17]. Although the phylogenetic tree revealed that T. broadoridgus and T. gaowangjienensis were sister groups, this relationship would be tentative, as the Tylotoriton species with reported mitogenomes were still very limited. The whole picture of the biogeography of Tylotoriton would be presented, whereas more molecular data, such as mitogenomes from more species, would be available in the future.

Conclusions
In summary, we have successfully sequenced and assembled the complete mitogenomes of two rare species of crocodile newts, T. broadoridgus and T. gaowangjienensis, for the first time. We further provided detailed characteristics of the two mitogenomes in aspects of gene orders, nucleotide composition, and codon usages from different regions, such as rRNAs, tRNAs, PCGs, and the non-coding control region. The phylogenetic trees using the BI and ML methods based on 13 PCGs of 32 species have provided well-supported major clade relationships within Salamandridae for reference, as well as revealed new relationships among Tylotoriton, which we were concerned with the most. The two mitogenomes reported here and the detailed analyses in this study would provide valuable materials and data for future taxonomic and evolutionary studies of the genus Tylotoriton and other salamanders.