Complete Chloroplast Genome Sequence of Endangered Species in the Genus Opisthopappus C. Shih: Characterization, Species Identification, and Phylogenetic Relationships

Opisthopappus C. Shih is a rare genus of the family Asteraceae native to the Taihang Mountains in China. Owing to its narrow distribution, poor reproductive ability, and human harvesting, Opisthopappus is threatened with extinction. However, the limited genetic information available for Opisthopappus impedes conservation efforts and bioprospecting. Therefore, in this study, we report the complete chloroplast (cp) genome sequences of two Opisthopappus species, Opisthopappus taihangensis and Opisthopappus longilobus. The cp genomes of O. taihangensis and O. longilobus were 151,117 and 151,123 bp, respectively, and each contained 88 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Analyses of repeat sequences, codon usage, and RNA-editing sites, together with comparative analyses, revealed a high degree of conservation between the two species. The ycf1 gene was identified as a potential molecular marker. The phylogenetic tree demonstrated that O. longilobus is a separate species and not a synonym or variety of O. taihangensis. The molecular clock showed that the two species diverged over a large time span: O. longilobus diverged at 15.24 Mya (million years ago), whereas O. taihangensis diverged at 5.40 Mya. We found that Opisthopappus and Ajania are closely related, which provides new ideas for the development of Opisthopappus. These results provide biological information and an essential basis for understanding the evolutionary history of Opisthopappus species, which will aid future bioprospecting and conservation of endangered species.


Introduction
Opisthopappus C. Shih, a genus of the family Asteraceae, is generally considered to include two species, O. taihangensis and O. longilobus [1]. It is native to the Taihang Mountains in China, where it grows on cliffs at an altitude of 1000 m [2,3]. Species of this genus are rich in phenolic compounds and flavonoids and have been consumed as medicine and food in their native areas [4]. Because of its narrow distribution, poor reproductive ability, and human harvesting, Opisthopappus is now endangered and has been listed as rare and endangered [5]. Furthermore, species classification within Opisthopappus is not clear. Based on morphological and molecular evidence, O. taihangensis and O. longilobus were previously classified as the same species [6]. The phylogenetic relationship between the two species has remained unclear because of the limited resolving power of morphological methods and the limited detection capacity of single nuclear gene sequences [7].
In green plants, the chloroplast (cp) plays an important role in photosynthesis [8]. Cp genomes are highly conserved in genome size, structure, gene content, and organization [9,10]. Therefore, the cp genome has been used as an ideal tool for phylogenetic analysis, plant molecular identification, and genetic diversity evaluation.
Recently, whole cp genome sequences have been used as a plant super-barcode for discriminating closely related species in some taxa [11,12]. However, few cp genome resources are available for Opisthopappus: only the cp genome of O. taihangensis has been sequenced [13], and cp genome information for O. longilobus is still missing. In addition, comparative genomics within the genus Opisthopappus has not been studied, which limits phylogenetic and genetic diversity studies in this genus.
In this study, the complete cp genomes of two Opisthopappus species were sequenced and annotated. We analyzed their general characteristics and compared their structural characteristics to determine their origin and phylogenetic relationships. The results provide abundant genetic information on the genus Opisthopappus and will help formulate effective conservation and molecular identification approaches for these crucial and endangered medicinal plants.
[14]. The DNA sequencing was performed using the Illumina NovaSeq system (Illumina, San Diego, CA, USA).

cp Genome Assembly and Annotation
The cp genomes were assembled from clean reads using NOVOPlasty v.3.8.3 [15], and the assembled sequences were annotated using the CpGAVAS2 web service (accessed on 20 May 2022) [16]. The tRNA genes were identified using tRNAscan-SE v.1.3.1 [17]. The annotated whole-genome sequences were submitted to GenBank (accession numbers: MZ779049, MZ779050). Finally, circular maps of the cp genomes were drawn using OGDRAW [18].
In addition, the Predictive RNA Editor for Plants (PREP) suite was used with a cutoff value of 0.8 to predict potential RNA editing sites in the protein-coding genes of the cp genomes [25,26].

Phylogenetic and Divergence Time Analysis
In total, 20 complete cp genomes of subtribe Chrysantheminae were downloaded from NCBI for phylogenetic analysis, and Xanthium spinosum (NC_054222) was used as the outgroup. The cp genome sequences were aligned using MAFFT v.7.307 [27]. Subsequently, a phylogenetic tree was inferred from the alignment with the maximum likelihood (ML) method using RAxML v.8.2.4 [28]. The detailed parameters were set to "raxmlHPC-PTHREADS-SSE3 -f a -N 1000 -m GTRGAMMA -x 551314260 -p 551314260 -o Xanthium_spinosum_NC_054222 -T 20". The reliability of the phylogenetic tree was assessed using the bootstrap method with 1000 replicates.
A molecular clock tree was constructed in MEGA using an ML method to estimate the origin and divergence times of Opisthopappus and related genera [29]. X. spinosum was again selected as the outgroup. Temporal constraints were derived from the TimeTree Resource database (http://www.timetree.org/) (accessed on 25 May 2022) [30], whose Node Time search was used to find the divergence time between two genera.
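The RAxML call quoted above can be assembled programmatically, which keeps the seed, outgroup, and replicate count in one place. The following is a minimal sketch, not the authors' script: the quoted command omits the input alignment (-s) and run name (-n) that RAxML requires, so the file name and run label used here are hypothetical placeholders.

```python
import shlex


def raxml_command(alignment, run_name, outgroup,
                  seed=551314260, replicates=1000, threads=20):
    """Build the argument list for a rapid-bootstrap ML search in RAxML."""
    return [
        "raxmlHPC-PTHREADS-SSE3",
        "-f", "a",               # rapid bootstrap plus best-scoring ML tree search
        "-N", str(replicates),   # number of bootstrap replicates
        "-m", "GTRGAMMA",        # substitution model
        "-x", str(seed),         # rapid-bootstrap random seed
        "-p", str(seed),         # parsimony random seed
        "-o", outgroup,          # outgroup taxon label in the alignment
        "-T", str(threads),      # number of threads
        "-s", alignment,         # input alignment (hypothetical file name)
        "-n", run_name,          # run label (hypothetical)
    ]


cmd = raxml_command("chrysantheminae_cp.aln.fasta", "opisthopappus_ml",
                    "Xanthium_spinosum_NC_054222")
print(shlex.join(cmd))
```

The list form can be passed to subprocess.run directly once RAxML is installed; printing it first makes it easy to check against the parameters reported in the text.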

Characteristics of O. taihangensis and O. longilobus cp Genomes
The total lengths of the cp genomes of O. taihangensis and O. longilobus were 151,117 and 151,123 bp, respectively (Figure 1). Both cp genomes showed the typical quadripartite structure, comprising a large single-copy (LSC) region (82,901 and 82,895 bp), a small single-copy (SSC) region (18,306 and 18,320 bp), and a pair of inverted repeat (IR) regions (24,955 and 24,954 bp each). The overall GC contents were 37.46% and 37.44%, and GC content was unevenly distributed across the cp genome (Table 1). GC content was highest in the IRs (43.07% and 43.08%), followed by the LSC (35.54% and 35.52%), and lowest in the SSC region (30.82% and 30.80%). In both species, the overall G + C content was lower than the A + T content, which is a general feature of angiosperm cp genomes.
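The reported figures can be cross-checked with simple arithmetic: the genome length should equal LSC + SSC + 2 × IR, and the overall GC content should be close to the length-weighted mean of the regional GC contents. The sketch below uses only the values quoted above; the dictionary layout and function names are illustrative.

```python
# Region lengths (bp) and GC percentages as reported in the text.
GENOMES = {
    "O. taihangensis": {
        "total": 151_117, "LSC": 82_901, "SSC": 18_306, "IR": 24_955,
        "gc": {"LSC": 35.54, "SSC": 30.82, "IR": 43.07}, "gc_total": 37.46,
    },
    "O. longilobus": {
        "total": 151_123, "LSC": 82_895, "SSC": 18_320, "IR": 24_954,
        "gc": {"LSC": 35.52, "SSC": 30.80, "IR": 43.08}, "gc_total": 37.44,
    },
}


def quadripartite_length(g):
    """Genome length implied by one LSC, one SSC, and two IR copies."""
    return g["LSC"] + g["SSC"] + 2 * g["IR"]


def weighted_gc(g):
    """Length-weighted mean GC percentage over LSC, SSC, and both IRs."""
    weighted = (g["LSC"] * g["gc"]["LSC"]
                + g["SSC"] * g["gc"]["SSC"]
                + 2 * g["IR"] * g["gc"]["IR"])
    return weighted / quadripartite_length(g)


for name, g in GENOMES.items():
    print(name, quadripartite_length(g) == g["total"], round(weighted_gc(g), 2))
```

Because the regional GC values are rounded to two decimals, the weighted mean is expected to match the reported overall value only to within rounding error.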
The cp genomes of O. taihangensis and O. longilobus were highly conserved, with nearly identical gene content and gene order and no structural rearrangements. In total, 133 genes were predicted in both species, consisting of 88 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Most of the genes were classified into three categories (Table 2): self-replication, photosynthesis-related, and other genes. Like most angiosperms, Opisthopappus has lost some genes during evolution, such as chlB, chlL, and ycf68 [33,34]. The two cp genomes were also highly conserved in intron number and type. In total, 19 genes contained introns. Of these, eight (rpl2 (×2), ndhB (×2), trnA-UGC (×2), and trnE-UUC (×2)) were located in the IR regions, ten (rps16, atpF, rpoC1, ycf3, clpP, petB, trnK-UUU, trnS-CGA, trnL-UAA, and petD) were located in the LSC region, and ndhA was the only one located in the SSC region. Among them, clpP and ycf3 had two introns (Table S1).
To investigate genomic differences, we compared the cp genome features of the published O. taihangensis sequence NC_042787 and MZ779049 from this study. As shown in Table 3, the gene content and organization of MZ779049 were similar to those of NC_042787. However, MZ779049 had a longer genome and a higher number of genes, which indicates that MZ779049 represents a more complete cp genome.

Gene Function | Group of Genes | Gene Names | Amount

Repetitive Sequence
Simple sequence repeats (SSRs) consist of short tandem repeat units and are widely distributed throughout the cp genome. In this study, 42 and 44 SSRs were detected in the cp genome sequences of O. taihangensis and O. longilobus, respectively (Figure 2a). Mononucleotide SSRs were the most abundant, followed by dinucleotide and tetranucleotide SSRs; other types of SSRs were not detected. Moreover, SSRs were more abundant in intergenic regions than in protein-coding regions, as in most angiosperm cp genomes [35].
In addition, the long repeat sequences of the cp genomes of the two species were similar. O. taihangensis showed 20 forward and 22 palindromic repeats, whereas O. longilobus displayed 20 forward and 21 palindromic repeats (Figure 2b). These repetitive sequences provide meaningful clues for studies of genetic diversity and the development of molecular markers.
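To illustrate how SSRs of different motif lengths can be located, the following minimal sketch scans a sequence with regular expressions. It is not the tool used in the study, and the minimum repeat counts are illustrative assumptions, since the thresholds are not stated in this section.

```python
import re

# Minimum number of repeat units per motif length.
# These thresholds are illustrative assumptions, not the study's settings.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples; start is 0-based."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" or "ATAT", which belong to a shorter motif class.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


example = "GGC" + "A" * 12 + "CGC" + "AT" * 6 + "GCC"
print(find_ssrs(example))
```

Classifying each hit as genic or intergenic would additionally require the annotation coordinates, which are not reproduced here.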

Codon Usage Analysis and RNA-Editing Sites
The protein-coding regions of O. taihangensis and O. longilobus were 78,636 and 78,624 bp long, respectively, and were used to calculate relative synonymous codon usage (RSCU) values. An RSCU value greater than 1 indicates that a codon is used more frequently than expected under uniform synonymous codon usage. As shown in Figure 3, the encoded protein sequences consisted of 21 amino acids. High RSCU values are shown in red and low RSCU values in blue. The heatmap shows that 30 codons were used frequently in the two species. With the exception of UUG, all preferred codons end in A or U, reflecting the A/T-rich nature of the cp genome.
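RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes it for a toy coding sequence using the standard genetic code; the function name and the example sequence are illustrative and are not taken from the study.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue                          # amino acid absent from the input
        expected = total / len(codons)        # count under uniform usage
        for c in codons:
            values[c] = counts[c] / expected
    return values


toy = "TTATTATTATTG" + "GCTGCC"               # Leu: TTA x3, TTG x1; Ala: GCT, GCC
for codon, value in sorted(rscu(toy).items()):
    if value > 0:
        print(codon, round(value, 2))
```

Within each amino acid family the RSCU values sum to the number of synonymous codons, so a value above 1 marks a codon used more often than its synonyms on average.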

Additionally, potential RNA editing sites were predicted in 18 genes in the cp genomes of O. taihangensis and O. longilobus. In total, 47 RNA editing sites were identified in the two species. In both cp genomes, the gene with the most RNA editing sites was ndhB. The majority of the detected RNA-editing sites were at the second codon position and involved cytosine-to-uracil (C-U) conversions. Serine-to-leucine (S-L) conversions were the most frequent amino acid changes. In addition, a large number of RNA-editing sites resulted in hydrophobic amino acids such as phenylalanine (F), tyrosine (Y), leucine (L), and valine (V) (Table S2).
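The link between a C-to-U edit at the second codon position and a serine-to-leucine change can be seen by translating a codon before and after the edit. This is a small illustrative sketch using the standard genetic code; the codons shown are textbook examples, not sites reported in Table S2.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code in the RNA alphabet, codons ordered UUU, UUC, UUA, ...
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def apply_c_to_u(codon, position):
    """Return the codon after a C-to-U edit at a 1-based codon position."""
    codon = codon.upper().replace("T", "U")
    if codon[position - 1] != "C":
        raise ValueError("no cytosine at the requested position")
    return codon[:position - 1] + "U" + codon[position:]


def amino_acid_change(codon, position):
    """Return (amino acid before editing, amino acid after editing)."""
    before = codon.upper().replace("T", "U")
    after = apply_c_to_u(before, position)
    return CODE[before], CODE[after]


for codon, pos in [("UCA", 2), ("UCG", 2), ("CCA", 2), ("CAU", 1)]:
    print(codon, pos, amino_acid_change(codon, pos))
```

Second-position edits of UCA and UCG (serine) yield UUA and UUG (leucine), which is consistent with S-L being a common outcome of C-to-U editing.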

Phylogenetic Analysis and Divergence Time Analysis
The ML phylogenetic tree (Figure 4) reveals that Opisthopappus is closely related to Chrysanthemum and Ajania in the subtribe Chrysantheminae; together they formed a strongly supported sister relationship with Artemisia. The O. taihangensis sequenced in this study clustered with the published O. taihangensis (NC_042787), whereas O. longilobus was more closely related to Ajania pacifica than to O. taihangensis. This indicates that O. longilobus is a separate species and not a synonym or variety of O. taihangensis. Closely related plants tend to be chemically similar and may share pharmacological properties; therefore, ethnobotanists have used a range of phylogenetic methods for bioprospecting [36]. A previous study reported that Ajania has anti-inflammatory, anthelmintic, and antimalarial properties [37]. These results provide new ideas for the exploitation of Opisthopappus. The cp genomes thus appear to provide solid support for reconstructing phylogenetic relationships among these taxa.

Comparative cp Genomic Analysis
To explore the sequence divergence between the two Opisthopappus species, nucleotide diversity (Pi) was estimated to identify variable plastid regions. Pi values ranged from 0 to 0.01. The region at 124,370-126,369 bp showed high nucleotide diversity (Pi: 0.0067-0.01) and was identified as the protein-coding gene ycf1 (Figure 6).
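A sliding-window estimate of Pi can be sketched as the mean proportion of differing sites over all sequence pairs within each window. The code below is a minimal illustration, not the software used in the study; the default window length and step size are assumptions, because the settings are not given in this section.

```python
from itertools import combinations


def pairwise_diff(a, b):
    """Proportion of differing sites, skipping columns with a gap or N."""
    compared = diffs = 0
    for x, y in zip(a, b):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        diffs += x != y
    return diffs / compared if compared else 0.0


def sliding_pi(aligned, window=600, step=200):
    """Yield (start, end, Pi) per window; coordinates are 1-based columns.

    The window and step defaults are illustrative assumptions.
    """
    seqs = [s.upper() for s in aligned]
    length = len(seqs[0])
    if len(seqs) < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need at least two sequences of equal aligned length")
    pairs = list(combinations(range(len(seqs)), 2))
    for start in range(0, max(length - window, 0) + 1, step):
        end = min(start + window, length)
        total = sum(pairwise_diff(seqs[i][start:end], seqs[j][start:end])
                    for i, j in pairs)
        yield start + 1, end, total / len(pairs)


seq_a = "A" * 20
seq_b = "A" * 10 + "CC" + "A" * 8
for row in sliding_pi([seq_a, seq_b], window=10, step=5):
    print(row)
```

With only two sequences, as in the comparison of O. taihangensis and O. longilobus, Pi in each window reduces to the fraction of sites that differ between them.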

The phylogenetic tree revealed a close relationship between Opisthopappus and A. pacifica. The cp genome sequences of the three species were therefore compared using O. taihangensis (NC_042787) as the reference sequence. As shown in Figure 7, variability was lowest in the IR regions and relatively high in the LSC and SSC regions, which may be attributable to the presence of highly conserved rRNA sequences in the IR regions. The majority of the protein-coding genes were conserved among the three species; however, the rpl16 gene showed a large variation. In addition, O. longilobus and Ajania showed greater variation relative to O. taihangensis. The variations were predominantly localized in intergenic regions, such as petN-psbM, psbE-petL, psbA-matK, trnT-UGU-trnL-UAA, and trnR-UCU-trnG-UCC, which could be considered as possible molecular genetic markers.
Changes in the length of the cp genome are frequently caused by contraction and expansion at the boundaries of the IR regions [38]. Figure 8 shows the IR contraction and expansion of the cp genomes. The rps19 gene spanned the LSC/IRb border, lying mainly in the LSC with 60-61 bp in the IRb. The gene on the IRb/SSC border of Opisthopappus was ycf1; however, no gene was found on this border in A. pacifica. The ycf1 gene, which lay mainly in the SSC with 558 bp in the IRa, spanned the SSC/IRa border in all species. The trnH gene was found at the LSC/IRa border. However, rps19 was in the IRa, close to the IRa/LSC boundary, in O. taihangensis (NC_042787). Overall, A. pacifica had the highest variability.

Conclusions
This is the first study to report the cp genome of O. longilobus; we also resequenced O. taihangensis and performed a comparative analysis with other genomes. Opisthopappus is an endemic cave plant in China, and its harsh growing environment exposes it to drought and other stresses. The study of the cp genome can provide more biological information for the sustainability of Opisthopappus. Overall, the Opisthopappus cp genomes had similar structure and gene composition. However, the sliding-window results revealed that O. taihangensis and O. longilobus varied greatly in ycf1, which could be employed as a potential molecular marker to distinguish the two species. Furthermore, we reconstructed a phylogenetic tree from complete cp genomes. The results indicated that O. longilobus is a separate species and not a synonym or variety of O. taihangensis. Interestingly, we found a close relationship between Opisthopappus and Ajania, which provides new ideas for the exploitation of Opisthopappus. Overall, these results provide biological information and an essential basis for understanding the evolutionary history of Opisthopappus species, which will aid future bioprospecting and conservation of endangered species.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes13122410/s1, Table S1: The lengths of introns and exons for the splitting genes. Table S2: RNA editing sites analyses of the five Paris plastomes.
Author Contributions: X.Z. and Y.J. conceived the ideas; L.H. and G.Z. contributed to the sampling; X.Z. performed the experiments and analyzed the data. The manuscript was written by X.Z. and edited by L.H. All authors have read and agreed to the final version of the manuscript.
Data Availability Statement: All sequences used in this study are provided as attachments. We have submitted the data to NCBI, but they have not yet been released. At present, we have provided the data to the journal and reviewers as an attachment and have urged NCBI to release them as soon as possible. The datasets generated and/or analyzed during the current study are deposited in GenBank under accession numbers MZ779049 and MZ779050.

Conflicts of Interest: The authors declare no conflict of interest.