Next-Generation Genome Sequencing of Sedum plumbizincicola Sheds Light on the Structural Evolution of Plastid rRNA Operon and Phylogenetic Implications within Saxifragales

The genus Sedum, with about 470 recognized species, is classified in the family Crassulaceae of the order Saxifragales. Phylogenetic relationships within Saxifragales remain unresolved and controversial. In this study, the plastome of S. plumbizincicola is presented for the first time, with a focus on the structural analysis of the rrn operon and phylogenetic implications within the order Saxifragales. The assembled complete plastome of S. plumbizincicola is 149,397 bp in size, with the circular, double-stranded, quadripartite structure typical of angiosperms. It contains 133 genes, including 85 protein-coding genes (PCGs), 36 tRNA genes, 8 rRNA genes, and four pseudogenes (one ycf1, one rps19, and two ycf15). The predicted secondary structure of S. plumbizincicola 16S rRNA includes three main domains organized in 74 helices. Further, our results confirm that the 4.5S rRNA of higher plants is associated with fragmentation of the 23S rRNA progenitor. Notably, we also found that the sequence of the putative rrn5 promoter has evolutionary implications within the order Saxifragales. Moreover, our phylogenetic analyses suggest that S. plumbizincicola is more closely related to S. sarmentosum than to S. oryzifolium, and support the taxonomic revision of Phedimus. These findings will be useful for further investigation of the evolution of the plastid rRNA operon and of phylogenetic relationships within Saxifragales.


Introduction
The genus Sedum comprises more than 420 recognized species, making it the most species-rich genus of the family Crassulaceae [1,2]. Some species formerly classified as Sedum are now assigned to the segregate genera Hylotelephium and Rhodiola [3][4][5][6]. The family Crassulaceae, together with 14 other families, is classified in the order Saxifragales. Recently, increasing research

Genome Assembly, Gene Annotation, and Sequence Analyses
The paired-end reads were first checked with FastQC [28] and then quality-trimmed using Trimmomatic 0.39 [29]. The resulting clean reads were filtered and assembled with GetOrganelle 1.5.2 [30], using the chloroplast genome of S. sarmentosum [7] as the reference. The chloroplast genome was annotated with GeSeq [31]. The cloverleaf secondary structures of tRNAs were identified using the tRNAscan-SE web server [32]. The secondary structures of rRNAs were predicted by comparison with those of other plant species [33].

Phylogenetic Analysis
To resolve the phylogenetic relationships among Saxifragales species, two phylogenetic approaches were applied: the maximum likelihood (ML) method in RAxML GUI 1.5b2 [34] and the Bayesian inference (BI) method in MrBayes 3.2.7a [35]. After exclusion of the termination codons, 79 protein-coding genes (PCGs) and 4 rRNAs of 37 Saxifragales species were used to construct an evolutionary tree. A phylogenomic study by Yang et al. [36] revealed a sister-group relationship between Saxifragales and Rosids. We therefore selected two Vitales species within Rosids (Vitis heyneana, NC_039796; V. vinifera, NC_007957) as outgroups. For the ML analyses, we performed ten runs of thorough bootstrap with 1000 replicates under the GTRCAT model using RAxML GUI. For the BI analyses, the best-fit models for the 83 genes were first selected based on Bayesian information criterion (BIC) values in ModelGenerator 0.85 [53]; two simultaneous runs, each with eight independent Markov chains, were then run for 10,000,000 generations (sampling every 1000 generations).
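The BIC-based model choice described above can be illustrated with a minimal sketch. It uses the standard definition BIC = k ln(n) - 2 lnL, where lower values are preferred. The log-likelihoods, parameter counts, and alignment length below are hypothetical values chosen for illustration only, and taking n as the number of alignment sites is an assumption; none of this reproduces the ModelGenerator run reported here.

```python
import math


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * lnL (lower is better)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def best_model(fits: dict, n_sites: int) -> str:
    """Return the name of the model with the lowest BIC.

    `fits` maps a model name to (log-likelihood, number of free parameters).
    """
    return min(fits, key=lambda name: bic(fits[name][0], fits[name][1], n_sites))


# Hypothetical fits for a single gene alignment (illustrative values only).
fits = {
    "JC": (-5200.0, 0),
    "HKY": (-5100.0, 4),
    "GTR": (-5095.0, 8),
}
n_sites = 1000  # assumed alignment length

for name, (lnl, k) in fits.items():
    print(f"{name}: BIC = {bic(lnl, k, n_sites):.2f}")
print("best-fit model:", best_model(fits, n_sites))

# Sampling arithmetic for the BI settings stated in the text:
# 10,000,000 generations sampled every 1000 generations,
# not counting the starting state.
samples_per_run = 10_000_000 // 1000
print("sampled generations per run:", samples_per_run)
```

In this toy case the richer GTR model has the highest likelihood, but the penalty term k ln(n) outweighs the gain, so the intermediate model is selected.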

General Features of S. plumbizincicola Plastome
Based on Bowtie2 mapping, a total of 19,610,999 reads (21.5% of total reads) were mapped to the reference genome (S. sarmentosum, NC_023085), with a mean coverage of 1969× (min 1286×, max 3664×, standard deviation 71). The assembled complete plastome of S. plumbizincicola (accession number: MN185459.1) is 149,397 bp in size, with the circular, double-stranded, quadripartite structure typical of angiosperms. The plastome has two identical inverted repeats (IRs, 25,565 bp) separated by a small single copy (SSC, 16,669 bp) and a large single copy (LSC, 81,598 bp), as shown in Figure 1. Approximately 52.0%, 4.3%, and 1.83% of the genome encodes proteins, rRNAs, and tRNAs, respectively. The remaining 41.87% consists of non-coding regions, including introns, intergenic spacers, and pseudogenes. Along with new data from this study, we comparatively investigated the structures and properties of plastomes from 44 species, representing 11 families in Saxifragales, as shown in Table 1.
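As a quick consistency check on the figures above, the quadripartite region sizes should add up to the total plastome length, and the coding percentages should leave the stated non-coding share. The sketch below uses only numbers given in the text; the function and variable names are illustrative.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) reported for S. plumbizincicola.
LSC, SSC, IR = 81_598, 16_669, 25_565
TOTAL = 149_397

total_from_regions = quadripartite_length(LSC, SSC, IR)
print("sum of regions:", total_from_regions, "bp")

# Reported coding shares (% of genome) and the implied non-coding remainder.
coding_share = {"protein": 52.0, "rRNA": 4.3, "tRNA": 1.83}
noncoding_share = round(100.0 - sum(coding_share.values()), 2)
print("non-coding share:", noncoding_share, "%")
```

Both checks agree with the reported values (149,397 bp and 41.87%).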
The size of plastomes of Saxifragales ranges from 147,048 bp (Phedimus kamtschaticus) to 160,410 bp (Liquidambar formosana), as shown in Table S1, and the total of G + C content varies from 36.40% (Myriophyllum spicatum) to 38.55% (Paeonia brownii).
[8] reported that infA and rpl32 have been lost from three species of Paeonia plastome (Paeonia brownii,
Figure 1. Chloroplast genome annotation map for Sedum plumbizincicola. Genes lying outside the circle are transcribed in a clockwise direction, whereas genes inside are transcribed in a counterclockwise direction. Different colors represent different functional groups. The dashed darker and lighter gray in the inner circle denote G + C and A + T contents of chloroplast genome, respectively. LSC, SSC, and IRs mean long single copy, small single copy, and inverted repeat regions, respectively.

Structure of 16S rRNA
As in most other plants, rrn16 of S. plumbizincicola is 1490 bp long. In all Saxifragales species examined, rrn16 has the same length as in S. plumbizincicola, except in the family Paeoniaceae, which has a single-nucleotide insertion (U) between positions 576 and 577. As shown in Table S2, the G + C content of rrn16 in Saxifragales ranges from 56.5% (Rhodiola rosea) to 56.9% (Fortunearia sinensis and Sinowilsonia henryi). The average G + C content for typical land plants is 56%, whereas this value falls from 52% to 28% in holoparasitic angiosperms, which carry an increasingly greater number of mutations [78].
We next examined the predicted secondary structure of 16S rRNA in S. plumbizincicola. The structure is similar to the models proposed for other plants [78][79][80], comprising three main domains organized in 74 helices. In total, 72 mismatched pairs were detected, most of which (58/72) are G-U wobble pairs, as shown in Figure 2. Furthermore, position 123 of the 16S rRNA is cytosine (123-C) in S. plumbizincicola, whereas it is uracil in the other Saxifragales species examined. To rule out a sequencing error, we confirmed the U123C mutation using transcriptomic data of S. plumbizincicola (accession numbers: SRR5118122-SRR5118124). For further analysis, the 16S rRNAs from 3125 reference plastomes of land plants deposited in GenBank were investigated. This survey indicated that only 13 species carry 123-C, including two hyperaccumulator plants, Alpinia oxyphylla and Curcuma longa [81,82]. In contrast to the non-canonical G-U pair, the U123C mutation allows a stable C-G base pair to form in helix H120, as shown in Figure 2. However, the underlying biological mechanisms of the U123C mutation are still unknown.
As can be seen from Table S2, the size of rrn23 spans from 2089 bp (Sedum) to 2857 bp (Paeonia suffruticosa), and the G + C content ranges from 55.0% (Corylopsis coreana, Loropetalum subcordatum, and Chrysosplenium aureobracteatum) to 55.4% (M. spicatum), with an average value of 55.1%. In contrast to rrn23, the rrn4.5 of Saxifragales is remarkably conserved in size (103 bp), with a mean G + C content of 56.7%. The rrn4.5 and rrn23 genes are separated by 98-99 bp intergenic spacers (IGd), with G + C content between 57.1% and 60.2%, as shown in Table S2. The predicted secondary structure of 23S rRNA in S. plumbizincicola is similar to the models of Gutell [80,83], containing 149 helices and six domains, as shown in Figure 3.
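The pair categories used in the 16S rRNA analysis above (canonical pairs, G-U wobble pairs, and other mismatches) can be made concrete with a short sketch. The function name and the three labels are illustrative choices, not taken from the software used in this study; the example simply shows why U123C converts a G-U wobble pair into a canonical C-G pair.

```python
# Watson-Crick pairs and the G-U wobble pair, stored as unordered sets.
CANONICAL = {frozenset("AU"), frozenset("GC")}
WOBBLE = frozenset("GU")


def classify_pair(a: str, b: str) -> str:
    """Classify two paired RNA bases as 'canonical', 'wobble', or 'mismatch'.

    DNA-style input is accepted: T is treated as U.
    """
    x = a.upper().replace("T", "U")
    y = b.upper().replace("T", "U")
    pair = frozenset((x, y))
    if pair in CANONICAL:
        return "canonical"
    if pair == WOBBLE:
        return "wobble"
    return "mismatch"


# Position 123 pairs with a G: uracil gives a wobble pair,
# cytosine (the U123C state) gives a canonical pair.
print("U-G:", classify_pair("U", "G"))
print("C-G:", classify_pair("C", "G"))

# Share of G-U wobble pairs among the 72 mismatched pairs reported for 16S rRNA.
wobble_share = 100.0 * 58 / 72
print(f"wobble share: {wobble_share:.1f}%")
```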
Moreover, a total of 135 mismatched pairs, including 101 G-U wobble pairs, were found in the structure. We then comparatively analyzed the 23S rRNA secondary structures of all investigated taxa in Saxifragales. Remarkably, as shown in Figure 4, the hairpin loops near helix H550 were more divergent than the others, showing both nucleotide substitutions and indels. These divergent hairpin loops may have phylogenetic implications. For instance, all species of Crassulaceae are characterized by six nucleotides (5'-CACUGG-3') in these hairpin loops. In addition, in contrast to S. plumbizincicola, P. suffruticosa has an extra 46-nt insertion between helices H1684 and H2037 of 23S rRNA. Our study further shows that this insertion may form two additional helices, as shown in Figure 5. Notably, 4.5S rRNA is a unique component of plastid ribosomes from nonvascular (bryophytes) to vascular plants (pteridophytes, gymnosperms, and angiosperms), and is located on the large subunit. Several previous studies of 4.5S rRNA failed to find known homologues in other types of ribosomes [84][85][86]. In follow-up research, 4.5S rRNA was identified as structurally homologous to the 3' terminus of bacterial, cyanobacterial, and green algal 23S rRNA [19,84,87-90]. Based on sequence identity analysis, the 4.5S rRNA of S. plumbizincicola and the 3' terminus of Escherichia coli 23S rRNA (accession number: J01695) share 62.9% nucleotide identity. Interestingly, despite a considerable number of nucleotide substitutions and indels between these two regions, their secondary structures exhibit a similar topology, as shown in Figure 6. This finding confirms once again that the 4.5S rRNA of higher plants is associated with fragmentation of the 23S rRNA progenitor.
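A nucleotide identity value such as the one reported for 4.5S rRNA and the 3' terminus of E. coli 23S rRNA is computed from a pairwise alignment. The sketch below shows one common way to do this; the text does not state whether gapped columns were counted, so both conventions are offered, and the short aligned strings are invented for illustration rather than taken from the actual sequences.

```python
def percent_identity(aln_a: str, aln_b: str, count_gaps: bool = True) -> float:
    """Percent identity between two aligned sequences ('-' marks a gap).

    Columns where both sequences have a gap are always ignored. If
    `count_gaps` is False, columns with a gap in either sequence are
    ignored as well.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aln_a.upper(), aln_b.upper()):
        if x == "-" and y == "-":
            continue
        if (x == "-" or y == "-") and not count_gaps:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 7 identical columns and 2 gapped columns.
a = "ACGU-ACGU"
b = "ACGUUAC-U"
print(f"gaps counted:  {percent_identity(a, b):.2f}%")
print(f"gaps excluded: {percent_identity(a, b, count_gaps=False):.2f}%")
```

The two conventions can differ substantially for indel-rich regions, which is why the choice should be stated alongside any reported identity.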

Structure of 5S rRNA and Evolutionary Implications of Its Putative Promoter
Structurally, 5S rRNA is the smallest RNA component of the large ribosomal subunit in all known organisms [91]. In the S. plumbizincicola plastome, rrn5 and rrn4.5 are physically linked by an intergenic region (IGe) of 219 bp, as shown in Table S2. The predicted secondary structure of S. plumbizincicola 5S rRNA is similar to those reported in other studies [92,93], harboring five helices, as shown in Figure 7. Furthermore, our comparative sequence analysis identified a perfectly conserved 121-bp rrn5 among Saxifragales, with a medium G + C content (about 52%), as shown in Table S2. In this study, we also used the 5SRNAdb (http://combio.pl/rrna/) to survey the G + C content of plastomic rrn5. A total of 839 sequences were downloaded and analyzed. The mean G + C content is 50.73%, with the lowest value in Euglena viridis (32.26%) and the highest in Staurastrum punctulatum (59.84%). The survey shows that the G + C content of rrn5 varies greatly among photosynthetic euglenoids and green algae.
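The G + C survey described above reduces to computing the G + C percentage of each sequence and then summarizing the mean and the extremes. The following is a minimal sketch of that calculation; the record names and sequences are toy data, not entries from 5SRNAdb, and the handling of ambiguous symbols (they are skipped) is an assumption.

```python
def gc_content(seq: str) -> float:
    """G + C percentage of a nucleotide sequence, ignoring non-ACGTU symbols."""
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous nucleotides")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def summarize_gc(records: dict) -> tuple:
    """Return (mean G + C %, name of lowest record, name of highest record)."""
    values = {name: gc_content(seq) for name, seq in records.items()}
    mean = sum(values.values()) / len(values)
    lowest = min(values, key=values.get)
    highest = max(values, key=values.get)
    return mean, lowest, highest


# Toy records (hypothetical sequences for illustration).
records = {
    "seq_a": "ATATATGC",
    "seq_b": "GGCCGGAT",
    "seq_c": "ATGCATGC",
}
mean, lowest, highest = summarize_gc(records)
print(f"mean G + C: {mean:.2f}%  lowest: {lowest}  highest: {highest}")
```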
Based on similarity of nucleotide sequences, Audren et al. [94] found that a prokaryotic-type promoter, closely related to the bacterial consensus, is located upstream of rrn5 and downstream of the stem-loop structure in spinach. However, the putative promoter is inactive both in vivo and in vitro, likely due to the high GC content of the sextama box (TTGGGG) [94,95]. A number of studies have demonstrated that the 5S rRNA gene is transcribed together with the other ribosomal genes within the same operon [19,94,96,97]. Notably, the spinach putative promoter was also detected in the corresponding region of all 44 Saxifragales species. As shown in Figure 8, it contains a sextama box (−35 region, T₁₀₀T₁₀₀G₁₀₀G₁₀₀G₁₀₀G₁₀₀) and a pribnow box (−10 region, C₅₇A₁₀₀A₁₀₀T₁₀₀A₁₀₀T₈₆) separated by 8-29 bp within Saxifragales. Interestingly, we found that the sequences of the putative rrn5 promoters have evolutionary implications. For example, the spacers between the −35 and −10 boxes of all 44 investigated species share 16 common nucleotides (CCTCACAATCACTAGC), except for Liquidambar formosana (CCTCTAGC). Through nucleotide insertions, deletions, and substitutions, the ancestral sequence then evolved further into different apomorphies in the diversified lineages within Saxifragales.
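The promoter layout described above (a −35 box TTGGGG, a spacer of 8-29 nt, and a −10 box whose four central positions are AATA) can be searched for with a simple pattern. This is a rough sketch rather than the procedure used in this study: the regular expression, the function name, and the flanking bases of the test sequences are assumptions, and the Liquidambar-like example pairs the reported 8-nt spacer with the consensus −10 box purely for illustration.

```python
import re

# -35 box, lazy 8-29 nt spacer, then a -10 box with the conserved core AATA;
# the first and last positions of the -10 box are left free because they are
# not fully conserved (C in 57% and T in 86% of species, per the text).
PROMOTER = re.compile(r"(TTGGGG)([ACGT]{8,29}?)([ACGT]AATA[ACGT])")


def find_putative_promoter(seq: str):
    """Return the -35 box, spacer, and -10 box of the first match, or None."""
    match = PROMOTER.search(seq.upper())
    if match is None:
        return None
    return {
        "minus35": match.group(1),
        "spacer": match.group(2),
        "minus10": match.group(3),
        "spacer_length": len(match.group(2)),
    }


# Test sequences built from the motifs quoted in the text; the two-base
# flanks on either side are hypothetical.
common = "GA" + "TTGGGG" + "CCTCACAATCACTAGC" + "CAATAT" + "TC"
liquidambar_like = "GA" + "TTGGGG" + "CCTCTAGC" + "CAATAT" + "TC"

for label, seq in (("common", common), ("Liquidambar-like", liquidambar_like)):
    hit = find_putative_promoter(seq)
    print(label, hit["spacer"], hit["spacer_length"], hit["minus10"])
```

The lazy quantifier makes the search prefer the shortest spacer that is followed by a valid −10 box, which matters because the common spacer itself contains an AAT run.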

Phylogenetic Implications
To investigate the evolutionary relationships within the order Saxifragales, we performed phylogenetic analyses using 83 plastid genes of 44 species. Two species of Vitaceae (V. heyneana and V. vinifera) were employed as outgroups. After alignment, the concatenated sequences are 74,751 bp long. The trees derived from the ML and BI analyses display the same topology, as shown in Figure 9. According to the Angiosperm Phylogeny Group (APG) system IV [98], the order Saxifragales comprises 15 families, 11 of which were chosen for the phylogenetic analyses. The order Saxifragales can be generally divided into two clades: the core Saxifragales clade (maximum likelihood bootstrap [BS] = 100 and Bayesian posterior probability [PP] = 1.0) and Paeoniaceae plus the woody clade ([BS] = 89 and [PP] = 1.0). The former clade is subdivided into two subclades: one containing Crassulaceae, Haloragaceae, and Penthoraceae, and the other comprising three families of the Saxifragaceae alliance (Grossulariaceae, Saxifragaceae, and Iteaceae). The latter clade includes Paeoniaceae, Altingiaceae, Cercidiphyllaceae, Daphniphyllaceae, and Hamamelidaceae. In general, the framework of relationships within Saxifragales generated in this study agrees with those reported by Jian et al. [11], Moore et al. [99], and Soltis et al. [12]. In the present study, we found that S. plumbizincicola is more closely related to S. sarmentosum than to S. oryzifolium. Furthermore, Sedum is sister to (Phedimus + Rhodiola). Species of Phedimus, previously treated as members of Sedum, have been classified as a separate genus [100,101]. Our data support this taxonomic revision of Phedimus.
Our results also support the monophyly of the woody clade, which is sister to the family Paeoniaceae. It is noteworthy that deep-level relationships within Hamamelidaceae are strongly supported. Nevertheless, the closest relatives of this family and the relationships among these woody families remain unresolved in our analysis. This might partially be attributed to an ancient, rapid radiation [11]. Therefore, further detailed analyses need to be conducted to evaluate the relationships within the woody clade.

Conclusions
In the present study, we sequenced and analyzed the plastome of S. plumbizincicola for the first time. The genome structure and gene order were revealed, including 85 PCGs, 36 tRNA genes, 8 rRNA genes, and four pseudogenes. We then focused on the primary and secondary structures of the plastid rRNA genes. Notably, we found that the sequence of the putative rrn5 promoter has evolutionary implications within the order Saxifragales. Based on 83 plastid genes from 44 species, phylogenetic analyses demonstrated that S. plumbizincicola is more closely related to S. sarmentosum than to S. oryzifolium. The findings reported here shed light on the structural evolution of the plastid rRNA operon and on phylogenetic relationships within Saxifragales.