The First Complete Mitochondrial Genome of Eucrate crenata (Decapoda: Brachyura: Goneplacidae) and Phylogenetic Relationships within Infraorder Brachyura

Characterizing the complete mitochondrial genome (mitogenome) of an organism is useful for genomic studies in taxonomy and evolution. The mitogenomic characteristics of Eucrate crenata (Decapoda: Brachyura: Goneplacidae) have never been studied. The present study decodes the first mitogenome of E. crenata by high-throughput sequencing (HTS). The length of the mitogenome is 15,597 bp, and it contains 13 protein-coding genes, 2 ribosomal RNA genes (rrnS and rrnL), and 22 transfer RNA genes. There are 14 and 23 genes observed on the heavy and light strands, respectively. E. crenata possesses a trnH-cac translocation, with the trnH-cac shifted between trnE-gaa and trnF-ttc instead of the usual location between nad5 and nad4 in decapods. Phylogenetic analyses based on the current dataset of 33 Brachyuran mitogenomes indicate that E. crenata. is closely related to Ashtoret lunaris of Matutidae. The similar codon usage and rearrangements in the two species provide evidence for their close phylogenetic relationship. Positive selection analysis showed that one residue located in cox1 was identified as a positively selected site with high BEB value (>95%), indicating that this gene was under positive selection pressure. This study is the first complete mitogenome record for the family Goneplacidae, and the results obtained may improve the understanding of the phylogeny of Goneplacidae in Brachyura.


Introduction
Mitochondria are involved in various processes, such as ATP generation, signaling, cell differentiation, growth, and apoptosis [1]. Based on this, the mitochondrial genome (mitogenome) has been widely used in studies of molecular evolution and the reconstruction of phylogeny [2,3]. Nucleotide sequences or amino acid data, and the gene arrangement of mitogenomes, are greatly useful for inferring relationships among metazoan lineages [3,4]. In general, a metazoan mitogenome is a closed circular DNA molecule, ranging in size from 14 to 20 kb, and typically contains 37 genes: 13 protein-coding genes (PCGs) (atp6, atp8, cox1-3, cob, nad1-6, and nad4l), 2 rRNA genes (rrnS and rrnL), 22 tRNA genes, and an ATrich region (also known as the control region) that contains some of the initiation sites for genome transcription and replication [5]. The structural characteristics and arrangements of these genomic features of mitogenome differ across each species [6,7]. Mitogenomic composition can thus be examined for species identification, genetic diversity assessment,

Illumina Sequencing, Mitogenome Assembly and Annotation
The DNA was first isolated, and then the 2 µg purified DNA was fragmented. Pairedend libraries (350 bp) were performed using the NEBNext ® Ultra™ DNA Library Prep Kit for Illumina (NEB, USA) following manufacturer's instructions, then sequenced using an Illumina NovaSeq 6000 at the Beijing Novogene Bioinformatics Technology Co., Ltd. (Beijing, China).
Raw reads were filtered by Trimmomatic v0.39 before the assembly [20]. This filtering step was performed to remove the reads with connectors, the reads showing a quality score below 20 (Q < 20), the reads containing a percentage of uncalled bases ("N" characters) equal to or greater than 10%, and repeated sequences. A combination of de novo and reference-guided assemblies was applied to reconstruct the mitochondrial genome, and the following three steps were performed to assemble the mitogenome. First of all, the filtered reads were assembled into contigs with SPAdes 3.14.1 (http://bioinf.spbau.ru/spades, accessed on 30 December 2021). Then, BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi, accessed on 30 December 2021) were used to align contigs with reference mitogenomes from the infraorder Brachyura species, and aligned contigs (≥80% similarity and query coverage) were ranked according to the reference mitogenomes. Finally, GapCloser 1.12 with default parameters were used to map the clean reads to the assembled draft mitogenome, and most gaps were filled by local assembly.
The mitochondrial genes were annotated using the online MITOS tool (http://mitos. bioinf.uni-leipzig.de/index.py, accessed on 30 December 2021), and the protein-coding genes, transfer RNA (tRNA) genes, and ribosome RNA (rRNA) genes were predicted with default parameters. BLAST searches for reference mitochondrion genes were used to determine the location of each coding gene. Manual corrections of genes for start/stop codons were carried out in SnapGene Viewer by referring to the reference mitogenomes. The circular map of mitogenome of E. crenata was drawn by the CGview tool (http:// stothard.afns.ualberta.ca/cgview_server/, accessed on 30 December 2021). Several publicly available protein databases were functionally annotated using sequence-similarity Blast searches with a typical cut-off E-value of 10 −5 against: NCBI non-redundant (Nr) protein database, Swiss-Prot, Clusters of Orthologous Groups (COGs), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Ontology (GO) terms.

Sequence Analysis
Relative synonymous codon usage (RSCU) was computed with cusp (EMBOSS v6.6.0.0). The AT and GC skews were calculated using the following formulas: AT skew = (A − T)/ (A + T) and GC skew = (G − C)/(G + C) [21], where A, T, G and C refer to the occurrences of the four nucleotides. The complete mitochondrial DNA sequence has been deposited in GenBank with the accession number ON150678.

Phylogenetic Analysis
The phylogeny of the infraorder Brachyura was reconstructed using mitogenome data from 30 species (Table 1). Two species in the family Alpheidae were used as the outgroup (Table 1). Our data set was based on nucleotide sequences from 13 mitochondrial PCGs (atp6, atp8, cox1, cox2, cox3, cob, nad1, nad2, nad3, nad4, nad4l, nad5, and nad6). Multiple alignments of all 13 genes were performed with MUSCLE 3.8.31. In aligned sequences, blurred aligned regions and gaps were removed using Gblocks server 0.91b with the default setting. JModeltest v2.1.10 was used to select the most suitable evolutionary model GTR + I + G for the amino acid dataset. Maximum likelihood (ML) analysis of phylogenetic trees and model parameters was performed, and the best model was determined through AIC and BIC scores. The phylogenetic tree (http://www.atgc-montpellier.fr/phyml/, accessed on 23 February 2022) was constructed by PhyML v3.0 and graphically edited using the iTOL 3.4.3 (https://itol.embl.de/itol.cgi, accessed on 23 February 2022).

Positive Selection Analysis
The comparison of the nonsynonymous/synonymous substitution ratios (ω = dN/dS) is meaning for quantifying the impact of natural selection on molecular evolution [22]. Ω > 1, =1, and <1 respectively represent positive selection, neutrality, and purifying selection [23]. The codon-based maximum likelihood (CodeML) method, which implemented in the PAML package, [24] was used to estimate the ω values. A combined database of all 13 mitochondrial PCGs was used for selection pressure analyses. The ML phylogenetic tree was used as the working topology in CodeML analyses.
In this study, branch models were used to evaluate positive selection within the infraorder Brachyura. First, the one-ratio model (M0) estimated the distribution of the ω values as a baseline under the assumption that there is no adaptive evolution of the gene sequences, which assumes a single ω ratio for all branches of the phylogeny [25]. Then, positive selection acting on branches of interest was detected using a two-ratio model (M2), which allows the background and foreground lineages to have different ω ratio values [26,27]. Finally, the free-ratio model (M1), which allows the ω ratio variations between branches, was used to estimate the ω value on each branch [27]. The one-ratio model (M0) and the free-ratio model (M1) were compared to determine whether different clades in Brachyura had different ω values. The one-ratio model (M0) and the two-ratio model (M2) were compared to determine whether the ω values of the foreground and background branches were significantly different. In the one-ratio model, ω represents the value for the overall evolutionary tree. In the two-ratio model, ω 0 and ω 1 represent the values for the background branch and foreground branch, respectively. The paired models were compared with critical values of the Chi square (χ 2 ) distribution using likelihood ratio tests (LRTs), where the test statistic was estimated to be twice the difference in log likelihood (2∆L) and the degrees of freedom were estimated as the difference in the parameter number for each model.
In addition, we fit branch site models to check positive selection of some sites in the infraorder Brachyura. The branch-site models allow ω to vary between sites in proteins and between branches in trees. Branch-site model A (positive selection model) was used to identify the positively selected sites in Brachyuran species lineages (marked as foreground lineages). Comparing the significance level detection between Model A and the null Model with ω 2 = 1 can be directly used to determine positive selection sites in the foreground Branch. The presence of sites with ω > 1 indicates that the data fitting of model A is significantly better than the corresponding null model. Bayes Empirical Bayes analysis was used to calculate a posterior probabilities in order to identify positive selection sites on the foreground lineages in the case of significant LRTs [28].

E. crenata Mitogenome Organization
The Illumina NovaSeq runs resulted in 38,947,830 paired-end reads from the E. crenata library with an insert size of 350 bp. According to selective-assembly analysis, 167,844 reads were assembled into a 15,597 bp circular molecule (ON150678), which represented the complete mitogenome of E. crenata ( Figure 1 and Table 2). This length is shorter than that of the complete mitogenome of most Brachyura crabs, including Pseudohelice subquadrata, Macrophthalmus pacificus, Hemigrapsus penicillatus, and Grapsus tenuicrustatus [29][30][31][32]. The genome encodes 37 genes, including 13 PCGs, two rRNA genes, and 22 tRNA genes (duplication of trnL and trnS). Metazoan mitogenomes usually have a control region (Dloop area), which varies in length between species. However, the D-loop area was not obvious in E. crenata mitogenome. As opposed to other Brachyura crabs [15,17], only 14 genes are found on the heavy (H) strand, whereas the other 23 genes are found on the light (L) strand. In addition, there are nine overlaps between adjacent genes in the E. crenata mitogenome with a length range of 1 to 46 bp (cox3/atp6, atp6/atp8, trnL2-ttA/cox1, trnC-tgc/trnW-tga, trnW-tga/nad2, trnV-gta/rrnL, rrnL/trnS2-tca, cob/nad6, nad4l/nad4) ( Table 2). The E. crenata mitogenome also contains 941 bp intergenic spacers between genes in 20 regions ranging from 1 to 614 bp (Table 2), indicating the occurrence of tandem repeats and the loss of redundant genes.    Mitochondrial gene rearrangement is known as an effective phylogenetic signal for studying mitochondrial evolution [8]. Several studies and results have confirmed that gene rearrangements in metazoan mitochondrial genomes are conserved [33] and gene rearrangements occur relatively randomly and rarely [10,[33][34][35]. However, it can be used as direct evidence of evolutionary relationships between species [36]. Gene order and strand arrangement in E. crenata is unidentical to that reported before in co-order species with mitochondrial genomes deposited in Genbank, i.e., Perisesarma bidens, Parasesarma tripectinis, and Metopaulias depressus [16,37,38]. After comparison and analysis with the ancestor of Decapoda [15], we found that E. crenata had a trnH-cac translocation, where the trnH-cac shifted into trnE-gaa and trnF-ttc, instead of the usual location between nad5 and nad4. This phenomenon also occurs in several other species in Crustacea, such as Grapsus albolineatus, Metopograpsus frontalis, and Macrophthalmus pacificus [39][40][41]. As known, the tandem duplication/random loss model can explain the movement of trnH-cac, which is caused by tandem duplication in the region between trnE-gaa and nad4, followed by deletions of redundant genes that produce trnH-trnF-nad5.
The E. crenata mitogenome has a nucleotide composition of 36.19% A, 10.84% C, 16.90% G, and 36.07% T, with an overall AT content of 72.26% and GC content of 27.74%. This value of AT content was lower than the minimum value of 74.22% reported for Parasesarma tripectinis in the family Sesarmidae [36]. AT and GC skews are a measure of compositional asymmetry. The values of AT and GC skews found in E. crenata mitogenome were different in those of most animal mitogenomes. For E. crenata, the AT skew is 0.00166, and the GC skew is 0.218. As a representative of the family Goneplacidae, the AT skew and GC skew of E. crenata mitogenome showed different results with the species in the family Grapsidae, in which the AT skew and GC skew are always negative [15]. AT-skewed mitochondrial genomes are often reported in metazoan clades, such as crustaceans and brachyuran crabs [42][43][44], whereas GC-skewed mitochondrial genomes are not. The higher value of GC skew implies an apparent bias towards G in nucleotide composition.
A large number of studies have shown that metazoan mitogenomes always have a bias towards higher representation of nucleotides A and T, resulting in subsequent bias in the corresponding encoded amino acids [51][52][53][54]. In the mitogenome of E. crenata, the A+T contents of PCGs and third codon positions are 69.87% and 69.70%, respectively. The amino acid usage and relative synonymous codon usage (RSCU) values in the PCGs of E. crenata are summarized in Figure 2. The mitogenome encodes a total of 3683 amino acids, among which serine (16.1%) and tryptophan (1.6%) are the most and the least frequently used, respectively. The RSCU values indicate that the six most commonly used codons are TTA (Leu), AGA (Ser), TCT (Ser), CCT (Pro), ACT (Thr), GCT (Ala) (Figure 2). Based on RSCU and amino acid composition in the PCGs, comparative analyses showed that the codon usage pattern of E. crenata is not conserved, varying from those of most Brachyuran species, such as Grapsus albolineatus and Metopaulias depressus [15,16]. In particular, leucine (9.8%) was found to be used less than serine, and AGA (Ser) plays a key role in the codon usage. TTA (Leu), AGA (Ser), TCT (Ser), CCT (Pro), ACT (Thr), GCT (Ala) (Figure 2). Based RSCU and amino acid composition in the PCGs, comparative analyses showed that t codon usage pattern of E. crenata is not conserved, varying from those of most Brachyur species, such as Grapsus albolineatus and Metopaulias depressus [15,16]. In particular, leuci (9.8%) was found to be used less than serine, and AGA (Ser) plays a keyrole in the cod usage.

Ribosomal RNA and Transfer RNA Genes
The rrnL and rrnS genes of E. crenata are 1398 bp and 819 bp in length, respectively. Consistent with many Brachyura crabs [15,16], rrnL and rrnS are typically separated by trnV (gta) ( Table 2). The AT and GC content of rRNAs are 78.3% and 21.7%, and the AT skew and GC skew are 0.022 and 0.29, respectively, suggesting an apparent bias toward the use of A and G.
Twenty-two tRNA genes were identified in the mitogenome of E. crenata, which is typical for metazoans. As opposed to most Brachyura crabs, i.e., Eurypanopeus depressus, Panopeus herbstii, Rhithropanopeus harrisii and Grapsus albolineatus [15,17], only eight of them are encoded by the heavy strand, and the rest are encoded on the light strand. The total length of tRNAs was 1477 bp, accounting for 9.47% of the mitogenome, and the length of these genes ranges from 62 (trnC-tgc) to 73 (trnV-gta) bp ( Table 2). The regions contained 3.06 times more A-T content (75.36%) than G-C content (24.64%). Both AT skew and GC skew were positive, and their values were 0.035 and 0.11, respectively, showing a slight bias toward the use of A and an apparent bias toward G. In addition, the codons with A and T in the third position are used more frequently than other synonymous codons. These characteristics reflect codon usage with A and T bias at the third codon position, similar to the bias found in many metazoans [55][56][57].

Phylogenetic Relationships
In the present study, the phylogenetic relationships were analyzed based on the sequences of the 13 PCGs to clarify the relationships in E. crenata and 33 other known Brachyuran species (Figure 3). Phylogenetic analyses were carried out based on nucleotide sequences of 13 mitochondrial PCGs (atp6, atp8, cox1, cox2, cox3, cob, nad1, nad2, nad3, nad4, nad4l, nad5, and nad6) by the ML method. In the phylogenetic tree, some species from the same superfamily such as Xanthoidea, Grapsoidea, and Leucosioidea are not closest related, which may be caused by the selection methods of species here or differences in gene expression and morphological classification. As for the family Goneplacidae, it united well with Matutidae and Leucosiidae, and this also confirms a phylogenetic position of the family Goneplacidae close to Matutidae and Leucosiidae in the infraorder Brachyura. Among the 33 species included in the phylogenetic tree, each species in the tree has a high nodal support value in the clade. It is obvious that Ashtoret lunaris in the family Matutidae is closely related to E. crenata. Phyrhila pisum belonging to the family Leucosiidae also reveals a close relationship with E. crenata, which is second only to Ashtoret lunaris. This may improve the understanding of the phylogenetic position of E. crenata in Brachyura.   atp6, atp8, cox1, cox2, cox3, cob, nad1, nad2, nad3, nad4, nad4L,  nad5 and nad6). Species belonging to the same superfamily are marked with the same color. Numbers on branches represent ML bootstrap values.

Positive Selection Analysis
Purifying selection is the dominant force in mitotic evolution. Weak and/or episodic positive selection may occur under the strong purifying selection of reduced oxygen availability or greater energy requirements, because mitochondria are the primary site of aerobic respiration and are essential for energy metabolism [59][60][61]. Numerous studies have demonstrated that mitochondrial PCGs experienced positive selection in animals that survived in low oxygen environments, such as Tibetan humans, Ordovician bivalves, diving cetaceans, and insects [62][63][64][65].
Potential positive selection in E. crenata using CodeML implemented in the PAML package was examined to investigate whether marine ecosystems influence the function of mitochondrial genes. In the analysis of branch models, the ω (dN/dS) ratio that calculated under the one-ratio model (M0) was 0.03928 for the 13 mitochondrial PCGs of E. As shown in the previous study [58], Ashtoret lunaris also possesses a trnH-cac translocation. That is, the trnH-cac shifted into trnE-gaa and trnF-ttc. Cob exhibits an incomplete (T) stop codon, as does E. crenata. According to data analysis based on the Ashtoret lunaris genome sequence, serine (13.8%) and tryptophan (0.9%) are the most and the least frequently used, respectively. The similar rearrangements and codon usage between Ashtoret lunaris and E. crenata support the results above and provide further evidence for their close phylogenetic relationship.

Positive Selection Analysis
Purifying selection is the dominant force in mitotic evolution. Weak and/or episodic positive selection may occur under the strong purifying selection of reduced oxygen availability or greater energy requirements, because mitochondria are the primary site of aerobic respiration and are essential for energy metabolism [59][60][61]. Numerous studies have demonstrated that mitochondrial PCGs experienced positive selection in animals that survived in low oxygen environments, such as Tibetan humans, Ordovician bivalves, diving cetaceans, and insects [62][63][64][65].
Potential positive selection in E. crenata using CodeML implemented in the PAML package was examined to investigate whether marine ecosystems influence the function of mitochondrial genes. In the analysis of branch models, the ω (dN/dS) ratio that calculated under the one-ratio model (M0) was 0.03928 for the 13 mitochondrial PCGs of E. crenata (Table 3), indicating that these genes have underwent constrained selection pressure to maintain the main function. In the two-ratio model, ω 0 and ω 1 , which represent the values for the background branch and foreground branch, were 0.04084 and 999.00000, respectively. The ω ratio of the foreground branch (ω 1 = 999.00000) was significantly more than 1, indicating a strong positive selection of E. crenata by the infraorder Brachyura. In addition, many studies have shown that positive selection usually occurs over a short evolutionary time and performed on only a few sites [59,66]. Therefore, the signals of positive selection are often overwhelmed by those for successive purifying selection that occur at most sites in the gene sequence [59,61,66]. In the current study, branch-site models were used to determine possible positively selected sites in E. crenata (Table 4). One residue, located in cox1, was identified as a positively selected site with high BEB value (>95%). It is known that mitochondrial PCGs are crucial in the oxidative phosphorylation pathway. Cytochrome c oxidase catalyzes the oxygen terminal reduction, and the catalytic core is encoded by three mitochondrial PCGs (cox1, cox2 and cox3) [61,67]. Cytochrome c oxidase has been confirmed to be a particularly important site of the positive selection in marine anoxic adaptation [68]. It has been reported that the cytochrome c oxidase requires reactive oxygen species (ROS) when the living cells are exposed to anoxia, and the increase in ROS concentration contributes to stabilize Hif-1α, which then results in the induction of Hif-1dependent nuclear hypoxic genes [61,69,70]. The positive selection site in cox1 suggests a phylogenetic adaption of ATP synthase in the presence of reduced oxygen availability or increased energy requirements. In addition, cox1 in mitochondrion is involved in the regulation of platelet aggregation and vasomotor to maintain the stability of physiological functions of cells, tissues, and organs [61]. For E. crenata, functional modification mediated by positive selection mutations may increase the affinity between the enzyme and oxygen, and then efficiently utilize oxygen utilization under hypoxia conditions and maintain basic metabolic levels.  It's well known that the marine environment is features with a lack of high pressure, variable temperatures, and low dissolved oxygen. Several studies have confirmed that all the above environmental factors could influence the mitochondrial aerobic respiration process [56,71,72]. In this study, the potentially adaptive residue was identified in the cox1 gene, supporting the adaptive evolution of the E. crenata mitogenome. Our results provide an important basis for understanding how marine crabs maintain aerobic respiration to obtain adequate energy in marine chemosynthesis environment at the mitochondrial level.

Conclusions
In this study, the mitogenome of E. crenata was sequenced by HTS sequencing, thereby the first complete mitogenome for the family Goneplacidae was revealed. The E. crenata mitogenome is a typical closed-circular animal molecule including 13 PCGs, 22 tRNA genes, and two rRNA genes. Different from the published results for most Brachyura crabs, the AT skew and GC skew in the mitogenome are both positive. E. crenata shows a trnH-cac translocation between trnE-gaa and trnF-ttc. The phylogenetic analyses indicate that E. crenata is closely related to Ashtoret lunaris of Matutidae. Positive selection analysis showed that one residue located in cox1 was identified as a positively selected site with high BEB value, which may suggest that this gene was under positive selection pressure in the marine ecosystem. The findings of this study could help to deepen the understanding of the phylogeny of Goneplacidae and the adaptive evolution at the mitochondrial level for marine crabs.