Article

The First Complete Chloroplast Genome of Cordia monoica: Structure and Comparative Analysis

by Rana M. Alshegaihi 1, Hassan Mansour 2,3, Shouaa A. Alrobaish 4, Najla A. Al Shaye 5,* and Diaa Abd El-Moneim 6

1 Department of Biology, College of Science, University of Jeddah, Jeddah 21493, Saudi Arabia
2 Department of Biological Sciences, Faculty of Science & Arts, King Abdulaziz University, Rabigh 21911, Saudi Arabia
3 Department of Botany and Microbiology, Faculty of Science, Suez Canal University, Ismailia 41522, Egypt
4 Department of Biology, College of Science, Qassim University, Buraydah 52377, Saudi Arabia
5 Department of Biology, College of Science, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
6 Department of Plant Production (Genetic Branch), Faculty of Environmental Agricultural Sciences, Arish University, El-Arish 45511, Egypt
* Author to whom correspondence should be addressed.
Genes 2023, 14(5), 976; https://doi.org/10.3390/genes14050976
Submission received: 8 February 2023 / Revised: 10 April 2023 / Accepted: 20 April 2023 / Published: 26 April 2023
(This article belongs to the Special Issue Advances in Chloroplast Genomics and Proteostasis)

Abstract

Cordia monoica is a member of the Boraginaceae family. The plant is widely distributed in tropical regions and has considerable medicinal value and economic importance. In the current study, the complete chloroplast (cp) genome of C. monoica was sequenced, assembled, annotated, and reported. The circular cp genome is 148,711 bp long and has a quadripartite structure consisting of a pair of inverted repeat regions (26,897–26,901 bp), a large single-copy region (77,893 bp), and a small single-copy region (17,020 bp). Of the 134 genes encoded by the cp genome, 89 are protein-coding genes, 37 are transfer RNA (tRNA) genes, and 8 are ribosomal RNA (rRNA) genes. A total of 1387 tandem repeats were detected, with the hexanucleotide class making up 28% of the repeats. The protein-coding regions of C. monoica contain 26,303 codons; leucine is the most frequently encoded amino acid and cysteine the least. In addition, 12 of the 89 protein-coding genes were found to be under positive selection. The phyloplastomic clustering of the Boraginaceae species provides further evidence that chloroplast genome data are reliable for resolving phylogeny not only at the family level but also at the genus level (e.g., Cordia).

1. Introduction

In green plants, chloroplasts (cp) play essential roles in photosynthesis and carbon fixation, transforming light energy into chemical energy. They originated from photosynthetic bacteria that were taken up by non-photosynthetic hosts through endosymbiosis [1,2,3,4]. In addition to producing starch, amino acids, lipids, vitamins, and pigments in flowers, chloroplasts participate in several sulfur and nitrogen metabolism pathways [5]. Chloroplast genomes encode 110–130 genes and range in size from 120 to 180 kb, with highly conserved gene order and content [6,7,8,9]. Most angiosperm cp genomes display a quadripartite circular structure consisting of two identical inverted repeats (IRs) separated by a large and a small single-copy region (LSC and SSC, respectively) [7,9]. Several angiosperm lineages have nevertheless undergone large-scale genome rearrangements and gene losses [10,11].
As angiosperm chloroplast genomes exhibit uniparental inheritance, stable structures, and moderate evolutionary rates, they offer sufficient genetic markers to conduct genome-wide evolutionary studies [12,13,14]. In the era of high-throughput sequencing technologies, we have been able to sequence complete genomes and analyze whole plastomes. As a result, large amounts of valuable information can be gathered, and phylogenomic analyses based on the whole plastomes can be conducted rather than specific loci [15,16,17].
Cordia species are deciduous shrubs or trees belonging to the subfamily Cordioideae of the Boraginaceae family, which was previously treated as a separate family, Cordiaceae (Tropicos.org). Approximately 300 species are known from both hemispheres, including Mexico, Central America, South America, the Arabian Peninsula, Pakistan, Sri Lanka, India, and East and West Africa (e.g., Nigeria and Ghana) [18]. The tree grows up to 6 m tall and bears white flowers and yellow fruit, with ovate leaves up to four inches long. The fruit typically measures 0.5 to 1 inch long, and flowering and fruiting occur between October and December [19]. Many Cordia species have been used in traditional medicine for centuries to treat a range of ailments; C. monoica, for example, showed significant anti-ulcer activity. Additionally, C. monoica leaves are used in vapor baths for leprosy, its roots for vomiting, and its stem bark for chest pains [18,20,21,22,23]. Currently, GenBank holds chloroplast genomes for only a few Boraginaceae species, and that of C. monoica has never been sequenced. Further research on the chloroplast genomes of this family remains important because their lengths vary considerably. For example, the cp genomes of Pholisma arenarium (GenBank accession: NC_039719) and Lennoa madreporoides (GenBank accession: NC_039720) are only 81,198 and 83,675 bp long, respectively.
This study aimed to sequence and characterize the whole chloroplast genome of C. monoica and to compare it with those of other species of the family Boraginaceae. Evaluating interspecific variation among genera of the Boraginaceae family makes it possible to develop markers and to distinguish Boraginaceae species using newly generated chloroplast genomes.

2. Materials and Methods

2.1. Sample Collection and DNA Extraction

Fresh leaves of C. monoica were collected from the Faifa mountains in the Jazan province of Saudi Arabia (17°15′ N 43°06′ E). Harvested fresh leaves were immediately placed in a container with silica gel and stored at 4 °C until DNA extraction. Genomic DNA was extracted using the WizPrep™ gDNA Mini Kit (Cell/Tissue, Seoul, Republic of Korea); DNA concentration and quality were assessed using a Quantus™ Fluorometer (Promega, Madison, WI, USA) and electrophoresis on a 1% agarose gel, respectively.

2.2. Cp-Genome Sequencing, Assembly, and Annotation

Following the instructions of the library construction kit, purified high-quality genomic DNA was sheared into fragments of approximately 350 bp to construct paired-end libraries, which were sequenced in 150 bp paired-end mode on an Illumina HiSeq 4000 (Novogene Technologies, Beijing, China). Adapters and low-quality sequences were removed from the raw reads to obtain high-quality reads. The clean, filtered reads were assembled de novo using the single-contig approach [24,25]. GeSeq was used to annotate the assembled chloroplast genome [26], and OrganellarGenomeDRAW (OGDRAW) [27] was used to draw the map of the C. monoica chloroplast genome. All tRNAs were confirmed with the tRNAscan-SE 2.0 search server [28]. Geneious Prime was used to check and correct annotations and coding sequences [29].

2.3. Genome Analysis, Codon Usage, and Tandem Repeats Structures

SNPs and indels in the LSC, SSC, and IR regions were detected using Geneious Prime. MEGA 11 [30] was used to analyze codon usage frequency and relative synonymous codon usage (RSCU) for all protein-coding genes of C. monoica. Tandem repeats in the cp genome sequences were detected using Phobos v3.3, as implemented in Geneious Prime.

2.4. Sequence Divergence in Boraginaceae Family and Region Boundaries

The complete chloroplast genome of C. monoica was compared with those of other Boraginaceae species available in the GenBank database, namely P. arenarium, L. madreporoides, Borago officinalis, and Onosma fuyunensis, using mVISTA in Shuffle-LAGAN mode [31], with the C. monoica cp genome as the reference. The borders of the LSC, SSC, and IR regions were compared according to their annotations using the IRScope online tool (https://irscope.shinyapps.io/irapp/, accessed on 12 October 2022).

2.5. Synonymous (dS) and Non-Synonymous (dN) Substitution Rate Analysis

To identify genes under selection pressure, the nonsynonymous (dN) and synonymous (dS) substitution rates and their ratio (ω = dN/dS) were calculated for each protein-coding gene. Values of ω > 1, ω = 1, and ω < 1 were taken to indicate positive, neutral, and purifying selection, respectively [32,33].
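The decision rule above can be sketched in a few lines of Python. This is only an illustration of the ω thresholds, not the software used in the study: the function name `classify_selection`, the tolerance `tol`, and the handling of dS = 0 (where ω is undefined) are assumptions that the text does not specify.

```python
import math


def classify_selection(dn: float, ds: float, tol: float = 1e-9) -> str:
    """Classify the selection regime of one gene from omega = dN/dS.

    Hypothetical helper: the 'undefined' label for dS == 0 is an assumption.
    """
    if ds == 0:
        return "undefined"  # omega cannot be computed without synonymous changes
    omega = dn / ds
    if math.isclose(omega, 1.0, abs_tol=tol):
        return "neutral"
    return "positive" if omega > 1 else "purifying"
```

In practice the equality test needs a tolerance, because estimated rates are floating-point values and will rarely equal 1 exactly.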

2.6. Phylogenetic Analyses

The phylogenetic analysis was based on the LSC, SSC, and IR regions of C. monoica and of the other Boraginaceae species downloaded from the GenBank database. The chloroplast genome sequences of all five species were aligned using MAFFT [34]. Alignments were adjusted manually and concatenated to construct a phylogenetic tree. Maximum likelihood (ML) analysis was performed with FastTree v2 [35] under the generalized time-reversible (GTR) model with default settings, and maximum parsimony (MP) analysis was performed on all sites in MEGA 11 with default parameters.
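The concatenation step can be illustrated with a minimal Python sketch. It assumes that each region has already been aligned (for example with MAFFT) and is held as a dictionary mapping taxon name to aligned sequence; the function name `concatenate_alignments`, the taxon labels, and the toy sequences are hypothetical and are not taken from the study.

```python
def concatenate_alignments(regions):
    """Join per-region alignments (taxon -> aligned sequence) into one supermatrix."""
    taxa = sorted(regions[0])
    for aln in regions:
        if sorted(aln) != taxa:
            raise ValueError("every region must contain the same taxa")
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError("sequences within a region must have equal aligned length")
    # Regions are joined in the order given, e.g. LSC, then SSC, then IR
    return {taxon: "".join(aln[taxon] for aln in regions) for taxon in taxa}


# Toy alignments (hypothetical data)
lsc = {"sp_A": "ATG-C", "sp_B": "ATGGC"}
ssc = {"sp_A": "TTA", "sp_B": "TAA"}
ir = {"sp_A": "GG", "sp_B": "GC"}
supermatrix = concatenate_alignments([lsc, ssc, ir])
```

Checking that every region holds the same taxa and equal-length rows before joining avoids silently misaligned columns in the concatenated matrix.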

3. Results

3.1. Complete Chloroplast Genome Sequence of C. monoica

The complete cp genome of C. monoica is 148,711 bp long and has the quadripartite structure typical of angiosperms. It contains a pair of inverted repeat regions (IRA and IRB; 26,897–26,901 bp), which are separated by a small single-copy region (17,020 bp) and a large single-copy region (77,893 bp) (Figure 1). A total of 134 genes are found in the cp genome, including 8 ribosomal RNA (rRNA) genes, 37 transfer RNA (tRNA) genes, and 89 protein-coding genes (PCGs). Of these, 22 genes contain introns: 2 (clpP and pafI) contain two introns, and 20 (13 PCGs and 7 tRNAs) contain one intron. Twenty-one genes, including ndhB, rpl2, rpl22, rps3, rps7, rps12, rps19, ycf2, rrn4.5, rrn5, rrn16, rrn23, trnA-UGC, trnI-CAU, trnL-CAA, trnR-ACG, trnN-GUU, and trnV-GAC, are duplicated in the IR regions. Notably, the rps12 gene of C. monoica is trans-spliced, with its 3′ end duplicated in the IR regions and its 5′ end located in the LSC region.
The chloroplast genome has a GC content of 38.2%, while the LSC, SSC, and IR regions have GC contents of 36.3%, 32.1%, and 42.7%, respectively. The nucleotide frequencies are 30.6% A, 19.5% C, 18.7% G, and 31.3% T. More than half of the cp genome (60.9%) is occupied by coding regions (90,615 bp), of which CDS (78,796 bp; 52.98%) form the largest portion, followed by rRNA genes (9050 bp; 6.08%) and tRNA genes (2769 bp; 1.86%). The remaining 39.1% consists of intergenic regions, introns, and pseudogenes (Table 1).
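The composition figures in this section can be cross-checked with simple arithmetic. The Python sketch below uses only the lengths and base frequencies reported in the text; the variable names and the helper `pct` are illustrative assumptions.

```python
# Region lengths in bp, as reported in the text
LSC, SSC = 77_893, 17_020
IR_COPIES = (26_897, 26_901)
TOTAL = 148_711

# The four regions should account for the whole circular genome
assert LSC + SSC + sum(IR_COPIES) == TOTAL

# Coding components in bp, as reported in the text
CDS, RRNA, TRNA = 78_796, 9_050, 2_769
coding = CDS + RRNA + TRNA  # 90,615 bp


def pct(part: int, whole: int = TOTAL) -> float:
    """Share of the genome occupied by `part`, as a percentage."""
    return round(100 * part / whole, 2)


coding_pct = pct(coding)
cds_pct = pct(CDS)

# GC content is the sum of the reported C and G frequencies
gc_pct = round(19.5 + 18.7, 1)
```

Small differences in the last decimal place relative to the published percentages can arise from rounding versus truncation.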

3.2. Tandem Repeats Sequence

The chloroplast genomes of C. monoica, B. officinalis, and O. fuyunensis were examined and show a total of 1387 tandem repeats in the noncoding regions, with a repeat unit ranging from 8 to 86 bp. Repeats are found predominantly in the LSC region (61%), with lower proportions in the IR (31%) and SSC (8%) regions. Most of the dinucleotide repeats are of the AT type (67%), and the majority of the other repeat classes are also rich in A or T.
By repeat class, hexanucleotides (28%) are the most abundant, followed by pentanucleotides (20%), tetranucleotides (17%), trinucleotides (14%), dinucleotides (8%), heptanucleotides (6%), octanucleotides (3%), nonanucleotides (2%), and decanucleotides (1%; Table 2). The most abundant motifs are AAG and AAT among trinucleotide repeats, AAAT among tetranucleotide repeats, AAAAT among pentanucleotide repeats, AAATAG among hexanucleotide repeats, AAAAAAT among heptanucleotide repeats, AAAAAAAT among octanucleotide repeats, and AAATGTTCC among nonanucleotide repeats.
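To make the repeat classes concrete, the following Python sketch finds perfect tandem repeats of a given unit length with a regular-expression backreference. It is a deliberately naive illustration and is not the Phobos algorithm used in the study, which also handles imperfect repeats; the function name `find_tandem_repeats`, the `min_copies` threshold, and the toy sequence are assumptions.

```python
import re


def find_tandem_repeats(seq, unit_len, min_copies=3):
    """Return (start, motif, copies) for perfect tandem repeats of one unit length."""
    # A unit of `unit_len` bases followed by at least `min_copies - 1` exact copies
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # Skip motifs that are themselves built from a shorter unit (e.g. "ATAT")
        if any(
            motif == motif[:d] * (unit_len // d)
            for d in range(1, unit_len)
            if unit_len % d == 0
        ):
            continue
        hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


toy = "GGCATATATATCCAAGAAGAAGTT"  # hypothetical sequence
di = find_tandem_repeats(toy, 2)   # dinucleotide class
tri = find_tandem_repeats(toy, 3)  # trinucleotide class
```

Running the function once per unit length (2 through 10) and tallying the hits would give class proportions of the kind reported in Table 2, under the perfect-repeat simplification stated above.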

3.3. Codon Usage Bias of C. monoica

The frequency of codon usage in the C. monoica cp genome was calculated from the sequences of the protein-coding genes. Using the standard set of 64 codons, 26,303 codons encode the 20 amino acids. All amino acids except methionine and tryptophan display codon preferences. Arginine, serine, and leucine are each encoded by six codons, while the remaining amino acids with synonymous codons are encoded by two to four codons. Leucine is encoded by 2766 codons (10.5%), compared with 302 codons for cysteine (1.1%). The RSCU values of all codons are shown in Figure 2. Of the 29 codons with RSCU > 1, all except UUG end in A or U, indicating that A/U is preferred at the third codon position. No bias is observed in the frequency of the AGU and UGG codons, which encode serine and tryptophan, respectively (RSCU = 1).
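RSCU compares the observed count of a codon with the count expected if all synonymous codons of that amino acid were used equally, so a value above 1 marks a preferred codon and a single-codon amino acid always scores 1. The Python sketch below illustrates the calculation; it is not the MEGA 11 implementation, and the function name `rscu` and the toy codon counts are hypothetical. The two amino acid percentages are recomputed from the totals reported in the text.

```python
def rscu(counts, family):
    """Relative synonymous codon usage for one amino acid's codon family."""
    total = sum(counts.get(codon, 0) for codon in family)
    if total == 0:
        return {codon: 0.0 for codon in family}
    expected = total / len(family)  # count per codon under uniform usage
    return {codon: counts.get(codon, 0) / expected for codon in family}


CYS = ["UGU", "UGC"]  # cysteine codons
TRP = ["UGG"]         # tryptophan has a single codon
toy_counts = {"UGU": 30, "UGC": 10, "UGG": 25}  # hypothetical counts

cys_rscu = rscu(toy_counts, CYS)
trp_rscu = rscu(toy_counts, TRP)

# Amino acid shares recomputed from the codon totals reported in the text
TOTAL_CODONS = 26_303
leu_pct = round(100 * 2_766 / TOTAL_CODONS, 1)
cys_pct = round(100 * 302 / TOTAL_CODONS, 1)
```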

3.4. Comparative Analysis of Chloroplast Genome in Boraginaceae Family

The chloroplast alignment reveals numerous differences between C. monoica and the related species (P. arenarium, L. madreporoides, B. officinalis, and O. fuyunensis). The main variations among the cp genomes are differences in the length of each region and in the positions of the region boundaries (Table 3). Genome size ranges from 81,198 bp (P. arenarium) to 150,612 bp (O. fuyunensis), a substantial difference among the studied species of the family Boraginaceae. In two of the five species (P. arenarium and L. madreporoides), the cp genome is severely reduced, being roughly 45% shorter than that of C. monoica, yet all four cp regions can still be annotated. The reduction is asymmetric across regions: the LSC and SSC bear most of the losses, whereas the IR regions are highly conserved and only 17.3% shorter than the IR regions of C. monoica. According to the mVISTA alignment, the missing regions contain coding genes, including those for ATP synthase subunits, RNA polymerase, photosystems I and II, assembly and stability factors, and NADH dehydrogenase subunits (Figure 3).
We therefore focused on the non-gapped regions to define the hypervariable regions of the studied cp genomes. The coding genes matK, rbcL, and rpl16, the rps16 intron, and the intergenic spacers rps18-rpl22, trnM-ycf2, rps15-ycf1, ycf1, trnV-rps12, and trnM-rpl23 show the lowest similarity among the four Boraginaceae species compared with C. monoica. Because of this massive variation, the two most divergent species (P. arenarium and L. madreporoides) were excluded from further analysis to avoid inflating the numbers of SNPs and indels.

3.5. IR Expansion and Contraction

Although the IR region is the most conserved region of the chloroplast genome, contractions and expansions at its borders are responsible for much of the variation in chloroplast genome length during evolution. The junction sites between regions are denoted JLB (IRb/LSC), JSA (SSC/IRa), JSB (IRb/SSC), and JLA (IRa/LSC). In the current study, the four junctions (JLA, JLB, JSA, and JSB) were compared among C. monoica, B. officinalis, and O. fuyunensis (Figure 4). Size variation among the plastomes is accompanied by shifts in the IR boundaries. The JLB boundary is similar in C. monoica and B. officinalis in position and gene synteny and lies between rpl16 and rps3. In O. fuyunensis, by contrast, the boundary is located after rpl16, with the junction shifted toward rpl22, rps19, and rpl2. The JSB boundary is located within the ndhF gene in all three species. The JSA boundary crosses the ycf1 gene in O. fuyunensis and B. officinalis but not in C. monoica. The JLA boundary is located between rps3 and trnH in B. officinalis and C. monoica, whereas in O. fuyunensis it is located between rpl2 and rps19.
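Junction coordinates follow directly from cumulative region lengths, which is why an expansion or contraction of one region shifts every downstream boundary. The Python sketch below illustrates this for C. monoica using the region lengths reported in Section 3.1. It assumes a linearized order LSC, IRb, SSC, IRa starting at position 1, and it arbitrarily assigns the 26,897 bp and 26,901 bp lengths to IRb and IRa; neither assumption is stated in the text, so the resulting coordinates are illustrative only.

```python
from itertools import accumulate

# (region, length in bp); the order and the IRb/IRa assignment are assumptions
regions = [("LSC", 77_893), ("IRb", 26_897), ("SSC", 17_020), ("IRa", 26_901)]

# Last base of each region on the linearized genome
ends = list(accumulate(length for _, length in regions))

# Each junction sits at the end of the region that precedes it
junctions = dict(zip(["JLB", "JSB", "JSA", "JLA"], ends))
```

Under these assumptions JLA coincides with the total genome length, because the IRa/LSC junction closes the circle.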

3.6. SNPs, Indels, and Selective Pressure Analysis

Using the O. fuyunensis cp genome as the reference sequence, single nucleotide polymorphisms (SNPs) and insertion/deletion (indel) loci were assessed across the protein-coding genes of C. monoica, B. officinalis, and O. fuyunensis. The results reveal a total of 5580 variations, including 5398 SNPs and 113 indels (55 deletions and 58 insertions). Of these indels, 30 (26.5%) are single-base indels, and indel size ranges from 1 to 21 bp. Indel sites are most abundant in the IR region, followed by the SSC and LSC regions, and the highest numbers of indels are recorded in ycf1, ycf2, and rpoC2. All SNPs are classified as either synonymous or nonsynonymous: there are 3050 synonymous and 2348 nonsynonymous SNPs in the protein-coding genes. The LSC region contains the largest share of the SNPs (48%), followed by the SSC region (29%) and the IR region (15%). The most substitutions are found in the rpoC2 gene, followed by the ycf1 and ycf2 genes.
To detect selective pressure on the PCGs of the C. monoica, B. officinalis, and O. fuyunensis cp genomes, the rates of synonymous (dS) and nonsynonymous (dN) substitutions and the dN/dS ratio were calculated. The dS values range from 0 (psbL) to 376 (rpoC2), with an overall average of 39.39, while the dN values range from 0 (pbf1, petN, psaJ, psbF, psbM, and psbI) to 519 (ycf1), with an overall average of 31.32. Most dN/dS ratios are less than 1, suggesting that most cp genes are under purifying selection. Twelve cp genes (rps15, ccsA, ndhF, psbH, rps7, rpoA, rps16, rpl23, psbK, matK, ycf1, and ycf2) have dN/dS values > 1, indicating that they are under positive selection, and only four genes (psaI, psbT, rpl33, and rpl36) have dN/dS values = 1.

3.7. Phylogenetic Analysis

To clarify the relationships among the five Boraginaceae species, phylogenetic trees were constructed from the combined sequences of the LSC, SSC, and IR regions (Figure 5). The ML and MP analyses yielded identical topologies with generally high support values. In the phylogenetic tree, the five Boraginaceae species fall into two well-supported clades. P. arenarium groups with L. madreporoides, both of which are heterotrophic parasitic plants, and C. monoica is placed as their sister group within the same clade, while B. officinalis and O. fuyunensis form the other clade.

4. Discussion

Chloroplast genomes have been used in taxonomic and evolutionary studies to evaluate evolutionary relationships and determine genome structure, especially among closely related species [36,37]. This study sequenced and assembled the first complete cp genome of C. monoica, which was sampled from the Faifa mountains in Saudi Arabia. For the comparative analysis, four additional Boraginaceae chloroplast genomes were retrieved from the GenBank database. This study contributes to the database’s ever-expanding resources and is valuable for further studies on molecular identification, genetic diversity, and phylogenetics of Boraginaceae.
Angiosperm cp genomes typically exist as double-stranded circular molecules with two inverted repeats (IRs), one large single-copy (LSC) region, and one small single-copy (SSC) region [38,39]. Our assembly and annotation results show that the C. monoica cp genome is 148,711 bp long, which is within the range of other Boraginaceae species [40,41], and that it displays a similar genome structure and gene arrangement. While the tRNA and rRNA gene compositions of the three Boraginaceae species are similar, some differences are observed in the number of PCGs. The cp genome of C. monoica encodes 89 PCGs, whereas those of O. fuyunensis and B. officinalis encode 84 and 83 PCGs, respectively. This variation is due to the pseudogenization and location of ycf1 and rpl23 in the IR region. Angiosperm cp genomes evolve relatively fast, and gene losses and inversions occur during their evolution [42].
Among the four related species compared with C. monoica, two (P. arenarium and L. madreporoides) have chloroplast genomes markedly shorter than those of most angiosperms. Most angiosperm chloroplast genomes are 120 to 160 kb in length [43], whereas those of P. arenarium and L. madreporoides are 81,198 and 83,675 bp, respectively. Compared with most angiosperms, the sizes of the four regions in P. arenarium and L. madreporoides differ considerably; the most conspicuous changes occur in the LSC and SSC, which are reduced by about 40 and 10 kb, respectively. Thus, these two Boraginaceae species have smaller chloroplast genomes mainly because of the contraction of the single-copy regions rather than changes in the IRs. Several chloroplast genomes that are considerably smaller than those of most other plants have been reported. Small chloroplast genomes are usually found in parasitic plants, such as Epifagus virginiana (Orobanchaceae, Lamiales) [11] and Cuscuta chinensis (Convolvulaceae, Solanales) [44].
At lower taxonomic levels, tandem repeats have been shown to be an important molecular marker for species discrimination and population genetics [45]. Additionally, they have been used in a wide range of studies, including estimating genetic variation, analyzing gene flow, and exploring animal and plant populations [46,47]. Previously reported findings agree with those of the present study. In chloroplast genomes, poly-A or poly-T repeats are combined with tandem guanine or cytosine repeats [48], resulting in AT-rich chloroplast genomes [49,50].
Codon usage plays an essential role in the expression of genetic information [51] and is correlated with gene expression level, GC content, amino acid conservation, and transcriptional selection [52]. In C. monoica, codons encoding leucine are the most frequent and codons encoding cysteine the least frequent. The same pattern has been reported in other species, such as Cinnamomum camphora [53] and Ocotea species [54]. As in most chloroplast genomes of land plants, the preference for A/U-ending codons is stronger than that for G/C-ending codons [55,56].
A dynamic expansion or contraction of the four IR boundaries frequently occurs during the evolution of cp genomes, which results in further changes in the cp genome size. Researchers previously discovered that chloroplast genome size can change as a result of gene deletions [57] and intergenic variation [58], as well as contraction or expansion of the IR regions [59]. Due to their contraction and expansion at the borders, IR regions explain size variation between cp genomes despite being the most conserved in cp genome sequences [60,61,62,63].
In spite of the highly conserved genome of the cp, SNPs are clustered in “hotspots” [64], resulting in highly variable loci. In addition, variable hotspots containing indels have also been reported [65]. It is likely that the hotspots in the cp genome produce several highly variable cp genome markers. In contrast to commonly used molecular markers, the cp genome has a conserved sequence length of 110 to 160 kb, allowing for greater variation between closely related species [66]. A significant amount of structural variation (SNPs and indels) is found across cp genomes. As a result, some mutation hotspot regions could be tested as DNA markers specific to Boraginaceae (i.e., the coding genes, matK, rbcL, and rpl16; the non-coding regions, rps16 intron; and the intergenic spacers rps18-rpl22, trnM-ycf2, rps15-ycf1, ycf1, trnV-rps12, and trnM-rpl23). In this list, matK and rbcL are known as standard DNA barcode sequences. The genetic variation within these regions might also be sufficient to resolve the phylogenetic relationship of Boraginaceae species.
It is important to analyze the adaptive evolution of genes to understand how the substitution rate affects gene structure and function. An estimate of the dN/dS ratio can give insight into the constraints imposed on organisms by natural selection [67,68]. A sequence divergence analysis of protein-coding genes was conducted in the present study, and twelve of them (rps15, ccsA, ndhF, psbH, rps7, rpoA, rps16, rpl23, psbK, matK, ycf1, and ycf2) have a dN/dS ratio > 1, as expected of genes under positive selection. Among these, the rpl and rps genes encode ribosomal proteins, which have more divergent sequences than proteins related to photosynthesis [69]; the psbH gene is associated with photosystem II [70]; the matK gene is involved in the splicing of group II introns from RNA transcripts [71]; rpoA encodes a protein involved in transcription [72]; and ccsA encodes a protein involved in cytochrome synthesis [73]. Furthermore, the psbK and ndhF genes have roles in photosynthesis and carbon fixation [74,75]. The genes ycf1 and ycf2 are two of the largest cp genes and encode putative membrane proteins [76,77]. All of these genes are essential for plants to adapt to their environments and survive [78].
In the past two decades, studies of chloroplast DNA sequences have greatly enhanced our understanding of evolutionary relationships among angiosperms [79]. In the present study, the ML and MP analyses produced phylogenetic trees with similar topologies. The phylogenetic analysis shows that species can be delimited by paraphyletic clustering based on their genetic variation. However, the large deletions found among the studied accessions violate molecular clock assumptions and impede accurate inference of divergence times [80]. In addition, a much larger number of sequences is necessary to resolve the relationships within Boraginaceae more accurately.

5. Conclusions

The complete chloroplast genome of C. monoica was sequenced, assembled, and compared with those of related species. The chloroplast genome of C. monoica is conserved in structure and gene order. Tandem repeats found in the noncoding regions might be useful for studying population genetics within the family Boraginaceae. A number of highly variable hotspots are also detected in the protein-coding genes of Boraginaceae species, providing candidate genetic markers for species identification and phylogeny. Additionally, three closely related species were compared in terms of IR expansion and contraction. Analysis of coding gene sequence divergence reveals that twelve genes are under positive selection. The data obtained in this study will be helpful for future research on Boraginaceae diversity, ecology, taxonomy, phylogenetic evolution, and conservation.

Author Contributions

Conceptualization, R.M.A., H.M., S.A.A., N.A.A.S. and D.A.E.-M.; methodology, R.M.A. and H.M.; software, H.M., S.A.A. and D.A.E.-M.; validation, N.A.A.S. and D.A.E.-M.; formal analysis, R.M.A., H.M., S.A.A., N.A.A.S. and D.A.E.-M.; investigation, D.A.E.-M.; resources, R.M.A., H.M., S.A.A., N.A.A.S. and D.A.E.-M.; data curation, N.A.A.S. and D.A.E.-M.; writing—original draft preparation, R.M.A., H.M., S.A.A., N.A.A.S. and D.A.E.-M.; writing—review and editing, R.M.A., H.M., N.A.A.S. and D.A.E.-M.; visualization, H.M. and D.A.E.-M.; supervision, D.A.E.-M.; project administration, N.A.A.S. and D.A.E.-M.; funding acquisition, S.A.A., N.A.A.S. and D.A.E.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R187), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The complete sequence of C. monoica was deposited into the NCBI GenBank, accession number OP224515.

Acknowledgments

The authors extend their appreciation to Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R187), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Cho, K.-S.; Yun, B.-K.; Yoon, Y.-H.; Hong, S.-Y.; Mekapogu, M.; Kim, K.-H.; Yang, T.-J. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum). PLoS ONE 2015, 10, e0125332. [Google Scholar] [CrossRef]
  2. Howe, C.J.; Barbrook, A.C.; Koumandou, V.L.; Nisbet, R.E.R.; Symington, H.A.; Wightman, T.F. Evolution of the Chloroplast Genome. Philos. Trans. R. Soc. Lond. Ser. B Biol. Sci. 2003, 358, 99–107. [Google Scholar] [CrossRef] [PubMed]
  3. Esmail, S.M.; Aboulila, A.A.; El-Moneim, D.A. Variation in several pathogenesis—Related (PR) protein genes in wheat (Triticum aestivum) involved in defense against Puccinia striiformis f.sp. tritici. Physiol. Mol. Plant Pathol. 2020, 112, 101545. [Google Scholar] [CrossRef]
  4. Neuhaus, H.; Emes, M. Nonphotosynthetic Metabolism in Plastids. Annu. Rev. Plant Biol. 2000, 51, 111. [Google Scholar] [CrossRef]
  5. Bausher, M.G.; Singh, N.D.; Lee, S.-B.; Jansen, R.K.; Daniell, H. The Complete Chloroplast Genome Sequence of Citrus sinensis (L.) Osbeck Var’Ridge Pineapple’: Organization and Phylogenetic Relationships to Other Angiosperms. BMC Plant Biol. 2006, 6, 21. [Google Scholar] [CrossRef] [PubMed]
  6. Cui, Y.; Nie, L.; Sun, W.; Xu, Z.; Wang, Y.; Yu, J.; Song, J.; Yao, H. Comparative and Phylogenetic Analyses of Ginger (Zingiber officinale) in the Family Zingiberaceae Based on the Complete Chloroplast Genome. Plants 2019, 8, 283. [Google Scholar] [CrossRef] [PubMed]
  7. Shetty, S.M.; Md Shah, M.U.; Makale, K.; Mohd-Yusuf, Y.; Khalid, N.; Othman, R.Y. Complete Chloroplast Genome Sequence of Musa Balbisiana Corroborates Structural Heterogeneity of Inverted Repeats in Wild Progenitors of Cultivated Bananas and Plantains. Plant Genome 2016, 9, plantgenome2015-09. [Google Scholar] [CrossRef]
  8. Wambugu, P.W.; Brozynska, M.; Furtado, A.; Waters, D.L.; Henry, R.J. Relationships of Wild and Domesticated Rices (Oryza AA Genome Species) Based upon Whole Chloroplast Genome Sequences. Sci. Rep. 2015, 5, 13957. [Google Scholar] [CrossRef]
  9. Wicke, S.; Schneeweiss, G.M.; Depamphilis, C.W.; Müller, K.F.; Quandt, D. The Evolution of the Plastid Chromosome in Land Plants: Gene Content, Gene Order, Gene Function. Plant Mol. Biol. 2011, 76, 273–297. [Google Scholar] [CrossRef]
  10. Lee, H.-L.; Jansen, R.K.; Chumley, T.W.; Kim, K.-J. Gene Relocations within Chloroplast Genomes of Jasminum and Menodora (Oleaceae) Are Due to Multiple, Overlapping Inversions. Mol. Biol. Evol. 2007, 24, 1161–1180. [Google Scholar] [CrossRef]
  11. Wolfe, K.H.; Mordent, C.W.; Ems, S.C.; Palmer, J.D. Rapid Evolution of the Plastid Translational Apparatus in a Nonphotosynthetic Plant: Loss or Accelerated Sequence Evolution of TRNA and Ribosomal Protein Genes. J. Mol. Evol. 1992, 35, 304–317. [Google Scholar] [CrossRef]
  12. Dong, W.; Liu, J.; Yu, J.; Wang, L.; Zhou, S. Highly Variable Chloroplast Markers for Evaluating Plant Phylogeny at Low Taxonomic Levels and for DNA Barcoding. PLoS ONE 2012, 7, e35071. [Google Scholar] [CrossRef]
  13. Wu, F.-H.; Chan, M.-T.; Liao, D.-C.; Hsu, C.-T.; Lee, Y.-W.; Daniell, H.; Duvall, M.R.; Lin, C.-S. Complete Chloroplast Genome of Oncidium Gower Ramsey and Evaluation of Molecular Markers for Identification and Breeding in Oncidiinae. BMC Plant Biol. 2010, 10, 68. [Google Scholar] [CrossRef] [PubMed]
  14. Zhang, Y.; Iaffaldano, B.J.; Zhuang, X.; Cardina, J.; Cornish, K. Chloroplast Genome Resources and Molecular Markers Differentiate Rubber Dandelion Species from Weedy Relatives. BMC Plant Biol. 2017, 17, 34. [Google Scholar] [CrossRef] [PubMed]
  15. Huo, Y.; Gao, L.; Liu, B.; Yang, Y.; Kong, S.; Sun, Y.; Yang, Y.; Wu, X. Complete Chloroplast Genome Sequences of Four Allium Species: Comparative and Phylogenetic Analyses. Sci. Rep. 2019, 9, 12250. [Google Scholar] [CrossRef]
  16. Martin, W.; Deusch, O.; Stawski, N.; Grünheit, N.; Goremykin, V. Chloroplast Genome Phylogenetics: Why We Need Independent Approaches to Plant Molecular Evolution. Trends Plant Sci. 2005, 10, 203–209. [Google Scholar] [CrossRef] [PubMed]
  17. Sanitá Lima, M.; Woods, L.C.; Cartwright, M.W.; Smith, D.R. The (in)Complete Organelle Genome: Exploring the Use and Nonuse of Available Technologies for Characterizing Mitochondrial and Plastid Chromosomes. Mol. Ecol. Resour. 2016, 16, 1279–1286. [Google Scholar] [CrossRef]
  18. Quattrocchi, U. CRC World Dictionary of Medicinal and Poisonous Plants: Common Names, Scientific Names, Eponyms, Synonyms, and Etymology (5 Volume Set); CRC Press: Boca Raton, FL, USA, 2012; ISBN 1-4200-8044-X. [Google Scholar]
  19. Ramana, K.V.; Trivedi, M.H.; Reddy, P.R.K.; Rao, C.V. Anti Ulcer Activity of Cordia monoica Roxb Root. Adv. Pharmacol. Toxicol. 2014, 15, 57. [Google Scholar]
  20. Glover, P.E.; Stewart, J.; Gwynne, M.D. Masai and Kipsigis Notes on East African Plants: Part III—Medicinal Uses of Plants. East Afr. Agric. For. J. 1966, 32, 200–207. [Google Scholar] [CrossRef]
  21. Oza, M.J.; Kulkarni, Y.A. Traditional Uses, Phytochemistry and Pharmacology of the Medicinal Species of the Genus Cordia (Boraginaceae). J. Pharm. Pharmacol. 2017, 69, 755–789. [Google Scholar] [CrossRef]
  22. Pradheeps, M. Ethnobotany and Utilization of Plant Resources in Irula Villages (Sigur Plateau, Nilgiri Biosphere Reserve, India). J. Med. Plants Res. 2013, 7, 267–276. [Google Scholar]
  23. Ruffo, C.K.; Birnie, A.; Tengnäs, B. Edible Wild Plants of Tanzania, RELMA Technical Handbook (TH) Series; Palzer, C., Ed.; Regional Land Management Unit (RELMA), Swedish International Development Cooperation Agency (Sida): Nairobi, Kenya, 2002; TH No. 26; ISBN 9966-896-60-0.
  24. Magdy, M.; Ou, L.; Yu, H.; Chen, R.; Zhou, Y.; Hassan, H.; Feng, B.; Taitano, N.; van der Knaap, E.; Zou, X.; et al. Pan-Plastome Approach Empowers the Assessment of Genetic Variation in Cultivated Capsicum Species. Hortic. Res. 2019, 6, 108. [Google Scholar] [CrossRef] [PubMed]
  25. Magdy, M.; Ouyang, B. The Complete Mitochondrial Genome of the Chiltepin Pepper (Capsicum annuum var. glabriusculum), the Wild Progenitor of Capsicum annuum L. Mitochondrial DNA Part B 2020, 5, 683–684. [Google Scholar] [CrossRef] [PubMed]
  26. Tillich, M.; Lehwark, P.; Pellizzer, T.; Ulbricht-Jones, E.S.; Fischer, A.; Bock, R.; Greiner, S. GeSeq—Versatile and Accurate Annotation of Organelle Genomes. Nucleic Acids Res. 2017, 45, W6–W11. [Google Scholar] [CrossRef] [PubMed]
  27. Greiner, S.; Lehwark, P.; Bock, R. OrganellarGenomeDRAW (OGDRAW) Version 1.3.1: Expanded Toolkit for the Graphical Visualization of Organellar Genomes. Nucleic Acids Res. 2019, 47, W59–W64. [Google Scholar] [CrossRef]
  28. Chan, P.P.; Lin, B.Y.; Mak, A.J.; Lowe, T.M. tRNAscan-SE 2.0: Improved Detection and Functional Classification of Transfer RNA Genes. Nucleic Acids Res. 2021, 49, 9077–9096. [Google Scholar] [CrossRef] [PubMed]
  29. Kearse, M.; Moir, R.; Wilson, A.; Stones-Havas, S.; Cheung, M.; Sturrock, S.; Buxton, S.; Cooper, A.; Markowitz, S.; Duran, C.; et al. Geneious Basic: An Integrated and Extendable Desktop Software Platform for the Organization and Analysis of Sequence Data. Bioinformatics 2012, 28, 1647–1649. [Google Scholar] [CrossRef]
  30. Tamura, K.; Stecher, G.; Kumar, S. MEGA11: Molecular Evolutionary Genetics Analysis Version 11. Mol. Biol. Evol. 2021, 38, 3022–3027. [Google Scholar] [CrossRef]
  31. Frazer, K.A.; Pachter, L.; Poliakov, A.; Rubin, E.M.; Dubchak, I. VISTA: Computational Tools for Comparative Genomics. Nucleic Acids Res. 2004, 32, W273–W279. [Google Scholar] [CrossRef]
  32. Nei, M.; Gojobori, T. Simple Methods for Estimating the Numbers of Synonymous and Nonsynonymous Nucleotide Substitutions. Mol. Biol. Evol. 1986, 3, 418–426. [Google Scholar]
  33. Nielsen, R. Molecular Signatures of Natural Selection. Annu. Rev. Genet. 2005, 39, 197–218. [Google Scholar] [CrossRef] [PubMed]
  34. Katoh, K.; Standley, D.M. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Mol. Biol. Evol. 2013, 30, 772–780. [Google Scholar] [CrossRef]
  35. Price, M.N.; Dehal, P.S.; Arkin, A.P. FastTree 2—Approximately Maximum-Likelihood Trees for Large Alignments. PLoS ONE 2010, 5, e9490. [Google Scholar] [CrossRef] [PubMed]
  36. Henriquez, C.L.; Ahmed, I.; Carlsen, M.M.; Zuluaga, A.; Croat, T.B.; McKain, M.R. Evolutionary Dynamics of Chloroplast Genomes in Subfamily Aroideae (Araceae). Genomics 2020, 112, 2349–2360. [Google Scholar] [CrossRef] [PubMed]
  37. Mehmood, F.; Shahzadi, I.; Waseem, S.; Mirza, B.; Ahmed, I.; Waheed, M.T. Chloroplast Genome of Hibiscus Rosa-Sinensis (Malvaceae): Comparative Analyses and Identification of Mutational Hotspots. Genomics 2020, 112, 581–591. [Google Scholar]
  38. Jansen, R.K.; Cai, Z.; Raubeson, L.A.; Daniell, H.; Depamphilis, C.W.; Leebens-Mack, J.; Müller, K.F.; Guisinger-Bellian, M.; Haberle, R.C.; Hansen, A.K. Analysis of 81 Genes from 64 Plastid Genomes Resolves Relationships in Angiosperms and Identifies Genome-Scale Evolutionary Patterns. Proc. Natl. Acad. Sci. USA 2007, 104, 19369–19374. [Google Scholar] [CrossRef]
  39. Moore, M.J.; Bell, C.D.; Soltis, P.S.; Soltis, D.E. Using Plastid Genome-Scale Data to Resolve Enigmatic Relationships among Basal Angiosperms. Proc. Natl. Acad. Sci. USA 2007, 104, 19363–19368. [Google Scholar] [CrossRef] [PubMed]
  40. Chen, Q.; Zhang, D. The Complete Chloroplast Genome Sequence of Onosma Paniculatum Bur. et Franch. (Boraginaceae), a Medicinal Plant in Yunnan and Its Adjacent Regions. Mitochondrial DNA Part B 2019, 4, 3330–3332. [Google Scholar] [CrossRef]
  41. Wu, J.; Li, H.; Lei, J.; Liang, Z. The Complete Chloroplast Genome Sequence of Trigonotis Peduncularis (Boraginaceae). Mitochondrial DNA Part B 2022, 7, 456–457. [Google Scholar] [CrossRef]
  42. Lei, W.; Ni, D.; Wang, Y.; Shao, J.; Wang, X.; Yang, D.; Wang, J.; Chen, H.; Liu, C. Intraspecific and Heteroplasmic Variations, Gene Losses and Inversions in the Chloroplast Genome of Astragalus Membranaceus. Sci. Rep. 2016, 6, 21669. [Google Scholar] [CrossRef]
  43. Ruhlman, T.A.; Jansen, R.K. The Plastid Genomes of Flowering Plants. In Chloroplast Biotechnology; Springer: Berlin/Heidelberg, Germany, 2014; pp. 3–38. [Google Scholar] [CrossRef]
  44. Park, I.; Song, J.-H.; Yang, S.; Kim, W.J.; Choi, G.; Moon, B.C. Cuscuta Species Identification Based on the Morphology of Reproductive Organs and Complete Chloroplast Genome Sequences. Int. J. Mol. Sci. 2019, 20, 2726. [Google Scholar] [CrossRef] [PubMed]
  45. Provan, J.; Powell, W.; Hollingsworth, P.M. Chloroplast Microsatellites: New Tools for Studies in Plant Ecology and Evolution. Trends Ecol. Evol. 2001, 16, 142–147. [Google Scholar] [CrossRef]
  46. Addisalem, A.; Esselink, G.D.; Bongers, F.; Smulders, M. Genomic Sequencing and Microsatellite Marker Development for Boswellia Papyrifera, an Economically Important but Threatened Tree Native to Dry Tropical Forests. AoB Plants 2015, 7, plu086. [Google Scholar] [CrossRef] [PubMed]
  47. Ebert, D.; Peakall, R. Chloroplast Simple Sequence Repeats (CpSSRs): Technical Resources and Recommendations for Expanding CpSSR Discovery and Applications to a Wide Array of Plant Species. Mol. Ecol. Resour. 2009, 9, 673–690. [Google Scholar] [CrossRef] [PubMed]
  48. Yi, X.; Gao, L.; Wang, B.; Su, Y.-J.; Wang, T. The Complete Chloroplast Genome Sequence of Cephalotaxus Oliveri (Cephalotaxaceae): Evolutionary Comparison of Cephalotaxus Chloroplast DNAs and Insights into the Loss of Inverted Repeat Copies in Gymnosperms. Genome Biol. Evol. 2013, 5, 688–698. [Google Scholar] [CrossRef]
  49. Asaf, S.; Waqas, M.; Khan, A.L.; Khan, M.A.; Kang, S.-M.; Imran, Q.M.; Shahzad, R.; Bilal, S.; Yun, B.-W.; Lee, I.-J. The Complete Chloroplast Genome of Wild Rice (Oryza minuta) and Its Comparison to Related Species. Front. Plant Sci. 2017, 8, 304. [Google Scholar] [CrossRef] [PubMed]
  50. Kuang, D.-Y.; Wu, H.; Wang, Y.-L.; Gao, L.-M.; Zhang, S.-Z.; Lu, L. Complete Chloroplast Genome Sequence of Magnolia Kwangsiensis (Magnoliaceae): Implication for DNA Barcoding and Population Genetics. Genome 2011, 54, 663–673. [Google Scholar] [CrossRef]
  51. Chen, X.; Li, Q.; Li, Y.; Qian, J.; Han, J. Chloroplast Genome of Aconitum Barbatum Var. Puberulum (Ranunculaceae) Derived from CCS Reads Using the PacBio RS Platform. Front. Plant Sci. 2015, 6, 42. [Google Scholar] [CrossRef]
  52. Sharp, P.M.; Emery, L.R.; Zeng, K. Forces That Influence the Evolution of Codon Bias. Phil. Trans. R. Soc. B 2010, 365, 1203–1212. [Google Scholar] [CrossRef]
  53. Chen, C.; Zheng, Y.; Liu, S.; Zhong, Y.; Wu, Y.; Li, J.; Xu, L.-A.; Xu, M. The Complete Chloroplast Genome of Cinnamomum Camphora and Its Comparison with Related Lauraceae Species. PeerJ 2017, 5, e3820. [Google Scholar] [CrossRef]
  54. Trofimov, D.; Cadar, D.; Schmidt-Chanasit, J.; Rodrigues de Moraes, P.L.; Rohwer, J.G. A Comparative Analysis of Complete Chloroplast Genomes of Seven Ocotea Species (Lauraceae) Confirms Low Sequence Divergence within the Ocotea Complex. Sci. Rep. 2022, 12, 1120. [Google Scholar] [CrossRef] [PubMed]
  55. Guo, S.; Guo, L.; Zhao, W.; Xu, J.; Li, Y.; Zhang, X.; Shen, X.; Wu, M.; Hou, X. Complete Chloroplast Genome Sequence and Phylogenetic Analysis of Paeonia Ostii. Molecules 2018, 23, 246. [Google Scholar] [CrossRef]
  56. Zhou, J.; Cui, Y.; Chen, X.; Li, Y.; Xu, Z.; Duan, B.; Li, Y.; Song, J.; Yao, H. Complete Chloroplast Genomes of Papaver Rhoeas and Papaver Orientale: Molecular Structures, Comparative Analysis, and Phylogenetic Analysis. Molecules 2018, 23, 437. [Google Scholar] [CrossRef] [PubMed]
  57. Wakasugi, T.; Tsudzuki, J.; Ito, S.; Nakashima, K.; Tsudzuki, T.; Sugiura, M. Loss of All Ndh Genes as Determined by Sequencing the Entire Chloroplast Genome of the Black Pine Pinus Thunbergii. Proc. Natl. Acad. Sci. USA 1994, 91, 9794–9798. [Google Scholar] [CrossRef]
  58. Tang, J.; Xia, H.; Cao, M.; Zhang, X.; Zeng, W.; Hu, S.; Tong, W.; Wang, J.; Wang, J.; Yu, J. A Comparison of Rice Chloroplast Genomes. Plant Physiol. 2004, 135, 412–420. [Google Scholar] [CrossRef]
  59. Wicke, S.; Naumann, J. Molecular Evolution of Plastid Genomes in Parasitic Flowering Plants. In Advances in Botanical Research; Elsevier: Amsterdam, The Netherlands, 2018; Volume 85, pp. 315–347. ISBN 0065-2296. [Google Scholar] [CrossRef]
  60. Kode, V.; Mudd, E.A.; Iamtham, S.; Day, A. The Tobacco Plastid AccD Gene Is Essential and Is Required for Leaf Development. Plant J. 2005, 44, 237–244. [Google Scholar] [CrossRef]
  61. Raubeson, L.A.; Peery, R.; Chumley, T.W.; Dziubek, C.; Fourcade, H.M.; Boore, J.L.; Jansen, R.K. Comparative Chloroplast Genomics: Analyses Including New Sequences from the Angiosperms Nuphar Advena and Ranunculus Macranthus. BMC Genom. 2007, 8, 174. [Google Scholar] [CrossRef] [PubMed]
  62. Wang, R.-J.; Cheng, C.-L.; Chang, C.-C.; Wu, C.-L.; Su, T.-M.; Chaw, S.-M. Dynamics and Evolution of the Inverted Repeat-Large Single Copy Junctions in the Chloroplast Genomes of Monocots. BMC Evol. Biol. 2008, 8, 36. [Google Scholar] [CrossRef]
  63. Yao, X.; Tang, P.; Li, Z.; Li, D.; Liu, Y.; Huang, H. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis. PLoS ONE 2015, 10, e0129347. [Google Scholar] [CrossRef] [PubMed]
  64. Scarcelli, N.; Barnaud, A.; Eiserhardt, W.; Treier, U.A.; Seveno, M.; d’Anfray, A.; Vigouroux, Y.; Pintaud, J.-C. A Set of 100 Chloroplast DNA Primer Pairs to Study Population Genetics and Phylogeny in Monocotyledons. PLoS ONE 2011, 6, e19954. [Google Scholar] [CrossRef] [PubMed]
  65. Aldrich, J.; Cherney, B.W.; Merlin, E. The Role of Insertions/Deletions in the Evolution of the Intergenic Region Between psbA and trnH in the Chloroplast Genome. Curr. Genet. 1988, 14, 137–146. [Google Scholar] [CrossRef]
  66. Li, X.; Yang, Y.; Henry, R.J.; Rossetto, M.; Wang, Y.; Chen, S. Plant DNA Barcoding: From Gene to Genome. Biol. Rev. 2015, 90, 157–166. [Google Scholar] [CrossRef] [PubMed]
  67. Shi, H.; Yang, M.; Mo, C.; Xie, W.; Liu, C.; Wu, B.; Ma, X. Complete Chloroplast Genomes of Two Siraitia Merrill Species: Comparative Analysis, Positive Selection and Novel Molecular Marker Development. PLoS ONE 2019, 14, e0226865. [Google Scholar] [CrossRef] [PubMed]
  68. Zhang, X.; Zhou, T.; Yang, J.; Sun, J.; Ju, M.; Zhao, Y.; Zhao, G. Comparative Analyses of Chloroplast Genomes of Cucurbitaceae Species: Lights into Selective Pressures and Phylogenetic Relationships. Molecules 2018, 23, 2165. [Google Scholar] [CrossRef]
  69. Xu, J.-H.; Liu, Q.; Hu, W.; Wang, T.; Xue, Q.; Messing, J. Dynamics of Chloroplast Genomes in Green Plants. Genomics 2015, 106, 221–231. [Google Scholar] [CrossRef] [PubMed]
  70. de Souza, U.J.B.; Nunes, R.; Targueta, C.P.; Diniz-Filho, J.A.F.; Telles, M.P.D.C. The Complete Chloroplast Genome of Stryphnodendron Adstringens (Leguminosae-caesalpinioideae): Comparative Analysis with Related Mimosoid Species. Sci. Rep. 2019, 9, 1–12. [Google Scholar]
  71. Hertel, S.; Zoschke, R.; Neumann, L.; Qu, Y.; Axmann, I.M.; Schmitz-Linneweber, C. Multiple Checkpoints for the Expression of the Chloroplast-Encoded Splicing Factor MatK. Plant Physiol. 2013, 163, 1686–1698. [Google Scholar] [CrossRef] [PubMed]
  72. Zhou, T.; Zhu, H.; Wang, J.; Xu, Y.; Xu, F.; Wang, X. Complete Chloroplast Genome Sequence Determination of Rheum Species and Comparative Chloroplast Genomics for the Members of Rumiceae. Plant Cell Rep. 2020, 39, 811–824. [Google Scholar] [CrossRef]
  73. Tyagi, S.; Jung, J.-A.; Kim, J.S.; Won, S.Y. A Comparative Analysis of the Complete Chloroplast Genomes of Three Chrysanthemum Boreale Strains. PeerJ 2020, 8, e9448. [Google Scholar] [CrossRef]
  74. Gao, C.; Deng, Y.; Wang, J. The Complete Chloroplast Genomes of Echinacanthus Species (Acanthaceae): Phylogenetic Relationships, Adaptive Evolution, and Screening of Molecular Markers. Front. Plant Sci. 2019, 9, 1989. [Google Scholar] [CrossRef]
  75. Kofer, W.; Koop, H.-U.; Wanner, G.; Steinmüller, K. Mutagenesis of the Genes Encoding Subunits A, C, H, I, J and K of the Plastid NAD (P) H-Plastoquinone-Oxidoreductase in Tobacco by Polyethylene Glycol-Mediated Plastome Transformation. Mol. Gen. Genet. MGG 1998, 258, 166–173. [Google Scholar] [CrossRef]
  76. Drescher, A.; Ruf, S.; Calsa, T., Jr.; Carrer, H.; Bock, R. The Two Largest Chloroplast Genome-encoded Open Reading Frames of Higher Plants Are Essential Genes. Plant J. 2000, 22, 97–104. [Google Scholar] [CrossRef] [PubMed]
  77. Kikuchi, S.; Bédard, J.; Hirano, M.; Hirabayashi, Y.; Oishi, M.; Imai, M.; Takase, M.; Ide, T.; Nakai, M. Uncovering the Protein Translocon at the Chloroplast Inner Envelope Membrane. Science 2013, 339, 571–574. [Google Scholar] [CrossRef] [PubMed]
  78. Chen, J.; Xie, D.; He, X.; Yang, Y.; Li, X. Comparative Analysis of the Complete Chloroplast Genomes in Allium Section Bromatorrhiza Species (Amaryllidaceae): Phylogenetic Relationship and Adaptive Evolution. Genes 2022, 13, 1279. [Google Scholar] [CrossRef]
  79. Särkinen, T.; George, M. Predicting Plastid Marker Variation: Can Complete Plastid Genomes from Closely Related Species Help? PLoS ONE 2013, 8, e82266. [Google Scholar] [CrossRef]
  80. Yang, Z.; Rannala, B. Molecular Phylogenetics: Principles and Practice. Nat. Rev. Genet. 2012, 13, 303–314. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Gene map of the chloroplast genome of C. monoica. Genes shown inside and outside the outer circle are transcribed clockwise and counterclockwise, respectively. The colored bars denote gene functional groups. In the inner circle, the dark gray area indicates the GC content and the light gray area the AT content. Asterisks (*) mark genes with introns.
Figure 2. Relative synonymous codon usage (RSCU) of 20 amino acids and stop codons of the C. monoica chloroplast genome. * = stop codons.
Figure 3. mVista comparison of four Boraginaceae species against C. monoica. Genes are annotated with gray arrows; exons are highlighted in blue. Continuous regions are marked as contigs with arrows within each track. Each track represents one species: 1, P. arenarium; 2, L. madreporoides; 3, B. officinalis; 4, O. fuyunensis.
Figure 4. Comparison of the borders of the LSC, SSC, and IR regions among C. monoica, B. officinalis, and O. fuyunensis. Genes are represented by colored boxes, and arrows show the coordinate positions of each gene near the junctions. Abbreviations denote the junction sites of the plastid genome: JLA (IRa/LSC), JLB (IRb/LSC), JSA (SSC/IRa), and JSB (IRb/SSC).
Figure 5. Phylogenetic trees of C. monoica and other species belonging to the Boraginaceae family, inferred using maximum likelihood from the LSC, SSC, and IR regions together. The ML and MP bootstrap values are shown at each node.
Table 1. Summary of the newly assembled chloroplast genome of C. monoica.

| Feature | C. monoica | Feature | C. monoica |
|---|---|---|---|
| Total cp DNA size (bp) | 148,711 | Intergenic sequences (%) | 39.0% |
| LSC size (bp) | 77,893 | Number of genes | 133 |
| SSC size (bp) | 17,020 | Number of different protein-coding genes | 90 |
| IR size (bp) | 26,897 | Number of different tRNA genes | 37 |
| Protein-coding regions (%) | 52.98% | Number of different rRNA genes | 8 |
| rRNA and tRNA (%) | 7.94% | Number of different duplicated genes | 21 |
| Introns size (% total) | 13.58% | GC content | 38.2% |
Table 2. Repeated sequences in the C. monoica chloroplast genome, including repeat class, repeat abundance, and percentage abundance.

| Repeat Class | Repeat Abundance | Abundance (%) |
|---|---|---|
| Dinucleotide | 110 | 8% |
| Trinucleotide | 194 | 14% |
| Tetranucleotide | 241 | 17% |
| Pentanucleotide | 281 | 20% |
| Hexanucleotide | 388 | 28% |
| 7-nucleotide | 77 | 6% |
| 8-nucleotide | 47 | 3% |
| 9-nucleotide | 30 | 2% |
| 10-nucleotide | 13 | 1% |
| Total | 1387 | 100% |
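As a quick arithmetic check on Table 2, the following Python sketch recomputes each percentage from the repeat counts. The dictionary and variable names are illustrative and not part of the original analysis. The reported total of 1387 is used as the denominator because that is the figure the table gives.

```python
# Repeat counts copied from Table 2 (repeat class -> abundance).
repeat_counts = {
    "Dinucleotide": 110,
    "Trinucleotide": 194,
    "Tetranucleotide": 241,
    "Pentanucleotide": 281,
    "Hexanucleotide": 388,
    "7-nucleotide": 77,
    "8-nucleotide": 47,
    "9-nucleotide": 30,
    "10-nucleotide": 13,
}
reported_total = 1387  # total stated in the last row of Table 2

# Percentage of the reported total, rounded to the nearest integer.
percentages = {
    cls: round(100 * count / reported_total)
    for cls, count in repeat_counts.items()
}

# Sum of the nine classes that the table actually lists.
listed_sum = sum(repeat_counts.values())

for cls, pct in percentages.items():
    print(f"{cls:<16}{repeat_counts[cls]:>5}{pct:>5}%")
print(f"Sum of listed classes: {listed_sum} (reported total: {reported_total})")
```

The rounded values reproduce the percentages in the table. The nine listed classes sum to 1381, six fewer than the reported total of 1387; the table does not say where the difference comes from, and the rounded percentages are the same with either denominator.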
Table 3. Summary of the cp lengths, variation in cp regions, and similarity percentage to C. monoica of five Boraginaceae species.

| Species | Total Length (bp) | LSC (bp) | SSC (bp) | IR (bp) | Similarity (%) | Accession No. |
|---|---|---|---|---|---|---|
| Cordia monoica | 148,711 | 77,893 | 17,020 | 26,897 | 100% | - |
| Borago officinalis | 149,835 | 78,840 | 16,967 | 27,014 | 88% | NC_046796 |
| Onosma fuyunensis | 150,612 | 82,931 | 17,281 | 25,200 | 84.8% | NC_049569 |
| Pholisma arenarium | 81,198 | 30,262 | 6454 | 22,241 | 39.8% | NC_039719 |
| Lennoa madreporoides | 83,675 | 30,881 | 6830 | 22,982 | 40% | NC_039720 |
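The region lengths in Table 3 can be checked against the quadripartite structure, in which the total length equals one LSC, one SSC, and two IR copies. The sketch below is a minimal consistency check, assuming both IR copies have the length listed in the table; the function and variable names are illustrative and not taken from the study.

```python
# (LSC, SSC, IR, reported total) in bp, copied from Table 3.
genomes = {
    "Cordia monoica": (77_893, 17_020, 26_897, 148_711),
    "Borago officinalis": (78_840, 16_967, 27_014, 149_835),
    "Onosma fuyunensis": (82_931, 17_281, 25_200, 150_612),
    "Pholisma arenarium": (30_262, 6_454, 22_241, 81_198),
    "Lennoa madreporoides": (30_881, 6_830, 22_982, 83_675),
}


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Length implied by one LSC, one SSC, and two equal-length IR copies."""
    return lsc + ssc + 2 * ir


# Reported total minus the length implied by the listed regions.
differences = {}
for species, (lsc, ssc, ir, total) in genomes.items():
    differences[species] = total - quadripartite_length(lsc, ssc, ir)
    print(f"{species:<22} reported {total:>7,} bp, difference {differences[species]} bp")
```

Four of the five genomes close exactly. C. monoica leaves a 4 bp difference, which is consistent with the IR range of 26,897–26,901 bp given in the abstract if the two IR copies differ in length by 4 bp while the table lists only the shorter one.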
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
