Characterization of the Complete Mitochondrial Genome of the Bromeliad Crab Metopaulias depressus (Rathbun, 1896) (Crustacea: Decapoda: Brachyura: Sesarmidae)

Metopaulias depressus is a non-marine crab endemic to Jamaica that dwells in rainforest bromeliads and exhibits elaborate active parental care behavior. Genomic resources for M. depressus are scarce, which limits our understanding of adaptation to terrestrial life in species that evolved from marine ancestors. This study reports the complete mitochondrial genome of M. depressus assembled using Sanger sequencing. The AT-rich mitochondrial genome of M. depressus is 15,765 bp in length and comprises 13 protein-coding genes (PCGs), 2 ribosomal RNA genes, and 22 transfer RNA genes. A single 691 bp-long intergenic space is assumed to be the control region (CR) or D-loop. A set of selective pressure analyses indicates that all PCGs experience purifying selection. Compared to the rest of the PCGs, cox1, cox2, nad5, cox3, and atp6 experience strong purifying selection, whereas atp8 experiences weak purifying selection. The secondary structures of most tRNA genes exhibit a standard ‘cloverleaf’ structure, with the exception of trnS1, which lacks the dihydroxyuridine (DHU) arm but not the loop; trnH, which lacks the thymine-pseudouracil-cytosine (T) loop but not the arm; and trnM, which exhibits an overly developed T loop. A maximum likelihood phylogenetic analysis based on all PCGs indicated that M. depressus is more closely related to the genera Clistocoeloma, Nanosesarma, and Parasesarma than to Chiromantes, Geosesarma, and Orisarma. This study contributes to deciphering the phylogenetic relationships within the family Sesarmidae and represents a new genomic resource for this iconic crab species.


Introduction
Within the Decapoda, crabs belonging to the Infraorder Brachyura are recognized for their astonishing anatomical, ecological, physiological, and behavioral diversity [1,2]. Among them, the family Sesarmidae is a speciose clade that has successfully colonized marine intertidal and supratidal zones [3–5]. Some lineages have even radiated into freshwater and terrestrial habitats, and these non-marine sesarmids often exhibit abbreviated larval development and complex parental-offspring interactions [6]. Metopaulias depressus sets

Specimen Collection and Mitochondrial Genome Sequencing
The specimen used in this study was collected during a field trip to the Windsor Great House near Sherwood (Trelawny) in Cockpit Country, Jamaica. Collecting permits were obtained beforehand. DNA extraction was conducted using the DNeasy Tissue Kit (Qiagen, Hilden, Germany), following the manufacturer's protocol. Next, the mitochondrial genome of M. depressus was assembled using a primer-walking strategy with the set of primer pairs developed by [26]. More specifically, the whole mitochondrial genome of M. depressus was first amplified in three long overlapping PCR products. These products were then used as templates for amplifying shorter fragments (PCR products > 800 bp), which were sequenced with the Sanger method following a primer-walking strategy. For more details, such as primer sequences and PCR conditions, see [26].
The nucleotide composition of the entire mitochondrial chromosome and each protein coding gene (PCG) was estimated with the software MEGA 7 [30].
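The composition estimate described above amounts to counting bases and reporting their proportions. The following is a minimal Python sketch of that calculation, not the MEGA 7 implementation; the function name `base_composition` and the toy fragment are hypothetical and serve only as an illustration.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return per-base fractions plus AT and GC content for a DNA string.

    Ambiguous characters (e.g., N) are ignored in the totals.
    """
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    a, c, g, t = (counts[b] for b in "ACGT")
    return {
        "A": a / total,
        "C": c / total,
        "G": g / total,
        "T": t / total,
        "AT_content": (a + t) / total,
        "GC_content": (g + c) / total,
    }


# Hypothetical toy fragment, not taken from the M. depressus mitogenome.
toy = "ATTATAGCATTTAAT"
comp = base_composition(toy)
print(f"AT content: {comp['AT_content']:.3f}")
```

The same function can be applied to the whole mitochondrial chromosome or to each PCG separately once the sequences are loaded as plain strings.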
To explore selective pressures on each mitochondrial PCG, a pairwise comparison was performed between M. depressus and Clistocoeloma sinense (GenBank: NC_033866). The number of nonsynonymous substitutions per nonsynonymous site (Ka), the number of synonymous substitutions per synonymous site (Ks), and the ratio Ka/Ks (ω) were estimated using the software KaKs_calculator 2.0 [6]. If a PCG evolves neutrally, then ω = 1. Negative or purifying selection is indicated by values of ω < 1, whereas positive or diversifying selection is denoted by values of ω > 1. The γ-MYN model was used to account for variable mutation rates along each sequence during calculations [34].
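The interpretation rule for ω stated above can be expressed in a few lines of Python. This is only a sketch of the decision rule, assuming Ka and Ks have already been estimated by an external program; it does not reproduce the γ-MYN calculations of KaKs_calculator 2.0. The function names and the Ka and Ks inputs are hypothetical.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Return omega = Ka/Ks; Ks must be positive for the ratio to be defined."""
    if ks <= 0:
        raise ValueError("Ks must be > 0 to compute Ka/Ks")
    return ka / ks


def classify_selection(omega: float, tol: float = 1e-9) -> str:
    """Map omega onto the three regimes described in the text."""
    if omega < 0:
        raise ValueError("omega cannot be negative")
    if abs(omega - 1.0) <= tol:
        return "neutral"
    return "purifying" if omega < 1.0 else "positive"


# Hypothetical Ka and Ks values used only to exercise the functions.
omega = ka_ks_ratio(ka=0.02, ks=0.5)
print(omega, classify_selection(omega))
```

Note that the rule alone does not test significance; in practice, each ω estimate is accompanied by a p value from the estimation software.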
tRNA genes and their secondary structures were predicted using the program MiTFi [35], as implemented in MITOS and MITOS2. The secondary structure of each tRNA was visualized using the FORNA web server (http://rna.tbi.univie.ac.at/forna/, accessed on 15 May 2021 [35,36]).

Phylogenetic Position of Metopaulias depressus
The phylogenetic position of M. depressus among other representatives of the family Sesarmidae was examined based on PCGs. Our analysis was conducted with amino acids instead of nucleotides because the phylogenetic signal from nucleotide characters alone has the potential to be saturated. The newly sequenced and annotated mitogenome of M. depressus, together with those of 11 other species (6 genera) belonging to the family Sesarmidae available in GenBank (consulted: 19 December 2021), was used for the phylogenetic analysis conducted using the software MitoPhAST V2.0 [40].
Outgroups included species from each of the families Grapsidae, Gecarcinidae, Ocypodidae, Xenograpsidae, and Varunidae. MitoPhAST first extracted all 13 PCG nucleotide sequences from the species available in GenBank and any others provided by the user (i.e., M. depressus). Next, each PCG nucleotide sequence was translated to amino acids and each PCG amino acid sequence was then aligned using Clustal Omega [41,42]. Poorly aligned regions were removed with trimAl v1.2.0 [43] before the dataset was partitioned and the best fitting models of sequence evolution were selected with ProtTest3 v3.4 [44]. Lastly, the concatenated and partitioned PCG amino acid alignments were used to perform a maximum likelihood phylogenetic tree search in the software IQ-TREE [45]. The robustness of the ML tree topology was ascertained by 1000 bootstrap pseudoreplicates of the tree search.
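The concatenation and partitioning step in this workflow can be illustrated with a short Python sketch. It is not the MitoPhAST code; it only shows, under simplifying assumptions, how per-gene amino acid alignments can be joined into one matrix while the coordinates of each partition are recorded. The function name, taxon labels, and toy sequences are hypothetical.

```python
def concatenate_alignments(gene_alignments):
    """Join per-gene alignments into one matrix and record partition ranges.

    gene_alignments maps gene name -> {taxon: aligned amino acid string}.
    Taxa missing from a gene are padded with gaps. Partition coordinates
    are 1-based and inclusive.
    """
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    pieces = {t: [] for t in taxa}
    partitions = []
    start = 1
    for gene, aln in gene_alignments.items():
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
        length = lengths.pop()
        for t in taxa:
            pieces[t].append(aln.get(t, "-" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    matrix = {t: "".join(parts) for t, parts in pieces.items()}
    return matrix, partitions


# Hypothetical toy alignments; real PCG alignments are far longer.
toy = {
    "cox1": {"taxonA": "MKL-", "taxonB": "MKLF"},
    "atp8": {"taxonA": "MP", "taxonB": "MP", "taxonC": "LP"},
}
matrix, parts = concatenate_alignments(toy)
print(matrix)
print(parts)
```

Recording the partition ranges is what allows a separate substitution model to be assigned to each PCG during the tree search.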
Relative synonymous codon usage (RSCU) and amino acid composition in the PCGs of M. depressus are summarized in Figure 2. The most frequently used codons (amino acids) were: TTA (Leu) used 434 times (73%), ATT (Ile) used 336 times (94%), TTT (Phe) used 317 times (91%), and ATA (Met) used 225 times (92%). Codons (amino acids) that were the least commonly used to encode their respective amino acids (excluding stop codons) included CGC (Ala), used one time (0.01%), CTG (Leu), used one time (undefined %), CGG (Arg), used one time (0.02%), AGC (Ser), used two times (0.01%), and CCC (Pro), used two times (0.02%) (Supplementary Table S2). RSCU and amino acid composition of PCGs in M. depressus are similar to those reported before in other representatives of the family Sesarmidae. For instance, the most frequently used codons in P. affine, O. sinense, and P. bidens were those coding for Leu, Ile, and Phe, in agreement with that observed in this study for M. depressus [5,48,49]. As in M. depressus, codons for Met are frequently used in P. pictum [6]. All the codons coding for the aforementioned amino acids are AT-rich, in line with the observed overrepresentation of A and T nucleotides in the mitogenome of M. depressus and other confamilial crabs [5,46].
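For reference, RSCU is commonly defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below implements this standard definition; it is not the software used in this study, and the codon counts in the example are hypothetical rather than values from Supplementary Table S2.

```python
def rscu(codon_counts, synonymous_families):
    """Compute RSCU for each codon in the supplied synonymous families.

    codon_counts: {codon: observed count}
    synonymous_families: {amino acid: [codons encoding it]}
    RSCU = observed count / (family total / number of synonymous codons).
    Families with no observed codons are skipped.
    """
    values = {}
    for codons in synonymous_families.values():
        total = sum(codon_counts.get(c, 0) for c in codons)
        if total == 0:
            continue
        expected = total / len(codons)
        for c in codons:
            values[c] = codon_counts.get(c, 0) / expected
    return values


# Hypothetical counts for the two Phe codons, chosen only for illustration.
families = {"Phe": ["TTT", "TTC"]}
counts = {"TTT": 9, "TTC": 1}
values = rscu(counts, families)
print(values)
```

An RSCU above 1 indicates a codon used more often than expected under equal usage, and a value below 1 indicates the opposite.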
In the mitochondrial genome of M. depressus, the Ka/Ks ratios estimated for all PCGs were < 1 (all p values < 0.05), indicating that purifying selection is acting upon all these PCGs. The Ka/Ks ratio estimated for atp8 is the highest observed value (0.16017) compared to the rest of the PCGs and indicates that purifying selection is relatively weak in this gene. In turn, Ka/Ks ratios calculated for cox1, cox2, nad5, cox3, and atp6 are the lowest observed values (0.029, 0.01069, 0.02027, 0.03302, and 0.03196, respectively) and indicate strong selective pressure affecting these PCGs (Figure 3). Selective pressure in PCGs has not been studied before in any other crab belonging to the family Sesarmidae. However, a strong pattern of purifying selection has been reported for many other brachyuran crabs, crustaceans, and arthropods in general ([34] and references therein). A recent study of caridean shrimps (genus Synalpheus) found a relationship between PCG length and the strength of purifying selection, with short genes (e.g., atp8) being subject to weaker purifying selection than longer PCGs [55]. Our observations are in agreement with the aforementioned pattern. Whether or not an association between gene length and the strength of purifying selection exists in sesarmid and other brachyuran crabs remains to be addressed.
In the mitochondrial genome of M. depressus, 19 out of the 22 tRNA genes exhibited a cloverleaf secondary structure (Figure 4). The trnS1 gene exhibited a deletion of the dihydroxyuridine (DHU) arm, having only its loop. Other confamilial crabs, including O. sinense, P. pictum, P. affine, P. bidens, G. faustum, G. penangense, C. sinense, and C. haematocheir, presented the same deletion of the DHU arm in the trnS1 gene [5,6,46,48,49,51–58], with the exception of O. neglectum [47], in which all tRNAs exhibited the typical cloverleaf secondary structure. A truncated trnS1 gene represents a conserved mitochondrial feature in eumetazoans, including crabs and other decapod crustaceans [6,49,56].
In M. depressus, the 691 bp-long control region (CR) is located between the rrnS and trnQ genes, starting at position 13,480 and ending at position 14,170. The length of the CR falls within the range (630 to 751 bp) previously reported in other crabs belonging to the family Sesarmidae [5,6,47–49,58]. The Microsatellite Repeats Finder analysis found 18 TA-rich microsatellites (SSRs) distributed from position 57 to 680. Most SSRs exhibited TA, AA, and TT di-nucleotide repeats (Supplementary Table S3). The tandem repeat analysis identified one TA-rich tandem repeat, 17 bp in length, repeated four times and located between positions 572 and 637 of the CR. The RNAstructure Web Server tool revealed 20 possible secondary structures (Gibbs free energy (∆G) ranged from −77.8 to −76.7 kcal/mol, Supplementary Figure S2), and in all of them, hairpin structures were observed along most of the length of this region. A detailed characterization of the CR is not available for any other sesarmid crab. However, SSRs, tandem repeats, and numerous hairpin secondary structures are often observed in the CR of other brachyuran crabs as well as other closely or more distantly related decapod crustaceans (e.g., [34] and references therein).
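Two of the quantities reported above are easy to verify or approximate computationally. The Python sketch below checks the CR length from its inclusive start and end coordinates and scans a sequence for runs of a repeated dinucleotide with a regular expression. It is a simplified illustration, not the algorithm of the Microsatellite Repeats Finder; the function names, the `min_units` threshold, and the toy sequence are assumptions.

```python
import re


def interval_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates."""
    return end - start + 1


def find_dinucleotide_repeats(seq: str, min_units: int = 3):
    """Return (start, end, unit, n_units) for each dinucleotide run.

    Coordinates are 1-based and inclusive. A run is reported when the
    same two-base unit occurs at least min_units times in a row.
    """
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_units - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        n_units = (m.end() - m.start()) // 2
        hits.append((m.start() + 1, m.end(), unit, n_units))
    return hits


# Coordinates of the control region as reported in the text.
cr_length = interval_length(13480, 14170)

# Hypothetical toy sequence, not taken from the M. depressus CR.
hits = find_dinucleotide_repeats("GGTATATATACC")
print(cr_length, hits)
```

The coordinate check reproduces the 691 bp reported for the CR, and the scan returns the single TA run in the toy sequence.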
The ML phylogenetic tree with various representatives of the Thoracotremata (25 terminals, 3695 amino acid characters, and 1074 informative sites) fully supports the monophyly of the family Sesarmidae and the other selected crab families (Ocypodidae, Grapsidae, Varunidae, Xenograpsidae, and Gecarcinidae), with bootstrap values (bv) of 100 (except for the Gecarcinidae, with bv = 80). Even if interfamilial relationships are not fully resolved, clear trends become visible. The Ocypodidae, with fiddler and ghost crabs, splits off first, so that all other included families group together as a clear-cut monophylum. This serves as additional evidence that the former superfamily Ocypodoidea has to be redefined with exclusion of the family Macrophthalmidae, which our analysis recovers as a sister taxon to the Varunidae (bv = 98) (see also [23–25]). This will require redefinition of the Grapsoidea at the same time, and one solution is to create a separate superfamily for the Sesarmidae. Within this family, two well-supported clades comprise representatives belonging to the genera Clistocoeloma + Metopaulias + Nanosesarma + Parasesarma (CMNPP clade, bv = 98) and Chiromantes + Geosesarma + Orisarma (CCGO clade, bv = 97). In the first (CMNPP) clade, Metopaulias and Clistocoeloma form a well-supported clade (bv = 85), sister to representatives of the genera Parasesarma and Nanosesarma (bv = 100). Within the latter subclade, the genus Parasesarma appears paraphyletic due to the position of Nanosesarma minutum, but the latter genus is in need of revision, because it currently includes all small-sized representatives of the family (Figure 5). In the second (CCGO) clade, the real Chiromantes haematocheir (as Cristarma eulimene in GenBank, molecularly re-assigned in [60]) is sister to all other species comprised in this clade. The two species of Geosesarma used in this analysis cluster together as a fully supported monophyletic clade.
With the species re-assignment [60], the monophyly of the genus Orisarma becomes well supported, considering that the record of "Chiromantes haematocheir" is shown to be another representative of Orisarma sinense [60], and the two together are sister to a second clade that comprises O. dehaani and O. neglectum (Figure 5). Overall, the phylogenetic relationships among genera and families reported in this study are not in full agreement with inferences drawn by previous phylogenetic studies that used complete mitochondrial genomes. However, those studies included fewer species belonging to the family Sesarmidae, as well as fewer and partly different members of the Thoracotremata, than the present study ([6] and references therein).

Conclusions
This study sequenced and characterized in detail the mitochondrial genome of the bromeliad crab Metopaulias depressus. The complete mitochondrial genome of M. depressus enhances the genomic resources available for the family Sesarmidae, and for the Thoracotremata and Brachyura in general, particularly with respect to the radiation of these crabs into semi-terrestrial and terrestrial environments. Present and future mitochondrial genomes assembled for other species in these taxa will permit exploring the link between the colonization of harsh non-marine (including terrestrial) environments by lineages with marine ancestors and the selective pressures and rates of molecular evolution in mitochondrial genomes.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/genes13020299/s1, Figure S1: tRNA-M gene secondary structure of Metopaulias depressus exhibiting an unusually developed loop in the T arm; Figure S2: Secondary structure prediction of the control region (CR) in the mitochondrial genome of Metopaulias depressus; Table S1: Nucleotide usage, AT-content, and GC-content in crabs belonging to the family Sesarmidae; Table S2: Codon usage analysis of protein coding genes (PCGs) in the mitochondrial genome; Table S3