Complete Chloroplast Genome Sequence of the Endemic and Endangered Plant Dendropanax oligodontus: Genome Structure, Comparative and Phylogenetic Analysis

Dendropanax oligodontus, which belongs to the family Araliaceae, is an endemic and endangered species of Hainan Island, China. It has potential economic and medicinal value owing to the presence of phenylpropanoids, flavonoids, triterpenoids, etc. The analysis of the structure and characteristics of the D. oligodontus chloroplast genome (cpDNA) is crucial for understanding the genetic and phylogenetic evolution of this species. In this study, the cpDNA of D. oligodontus was sequenced for the first time using next-generation sequencing methods, assembled, and annotated. We observed a circular quadripartite structure comprising a large single-copy region (86,440 bp), a small single-copy region (18,075 bp), and a pair of inverted repeat regions (25,944 bp). The total length of the cpDNA was 156,403 bp, and the GC% was 37.99%. We found that the D. oligodontus chloroplast genome comprised 131 genes, with 86 protein-coding genes, 8 rRNA genes, and 37 tRNAs. Furthermore, we identified 26,514 codons, 13 repetitive sequences, and 43 simple sequence repeat sites in the D. oligodontus cpDNA. The most common amino acid encoded was leucine, with a strong A/T preference at the third position of the codon. The prediction of RNA editing sites in the protein-coding genes indicated that RNA editing was observed in 19 genes with a total of 54 editing sites, all of which involved C-to-T transitions. Finally, the cpDNA of 11 species of the family Araliaceae were selected for comparative analysis. The sequences of the untranslated regions and coding regions among 11 species were highly conserved, and minor differences were observed in the length of the inverted repeat regions; therefore, the cpDNAs were relatively stable and consistent among these 11 species. The variable hotspots in the genome included clpP, ycf1, rnK-rps16, rps16-trnQ, atpH-atpI, trnE-trnT, psbM-trnD, ycf3-trnS, and rpl32-trnL, providing valuable molecular markers for species authentication and regions for inferring phylogenetic relationships among them, as well as for evolutionary studies. Evolutionary selection pressure analysis indicated that the atpF gene was strongly subjected to positive environmental selection. Phylogenetic analysis indicated that D. oligodontus and Dendropanax dentiger were the most closely related species within the genus, and D. oligodontus was closely related to the genera Kalopanax and Metapanax in the Araliaceae family. Overall, the cp genomes reported in this study will provide resources for studying the genetic diversity and conservation of the endangered plant D. oligodontus, as well as resolving phylogenetic relationships within the family.


Introduction
The genus Dendropanax belongs to the family Araliaceae, comprising shrubs or trees with approximately 93 species worldwide [1]. The species are distributed around tropical America and eastern Asia, with 16 species spread from the south-western to south-eastern Owing to the remarkable medicinal and health properties as well as the ecological value of plants belonging to the Araliaceae family, this family has garnered considerable interest in recent years [3][4][5][6][7][8][9][10]40,41]. Particularly, species belonging to the genus Panax, including P. ginseng, P. notoginseng, and P. quinquefolius, have been extensively studied [3,40,[42][43][44]. However, studies on plants of the genus Dendropanax have mostly focused on Dendropanax morbifera [4,7,8,45], and little is known about D. oligodontus. This species is endemic to Hainan Island [11,12] and is characterised as CR [13]. Therefore, basic research on the speciation and evolution of D. oligodontus is crucial for the conservation and development of tropical resources.
Analysis of the cpDNA of D. oligodontus is vital for the examination of genetic and phylogenetic evolution as well as its conservation and development. Data corresponding to the cpDNA of the Araliaceae family have been published in the NCBI for 34 species in 12 genera. However, of these, only D. morbifera (NC_027607.1) [46] and D. dentiger (NC_026546.1) [47] of the genus Dendropanax have been reported. Therefore, in this study, the cpDNA of D. oligodontus was sequenced and assembled using next-generation sequencing technology, and its genes were annotated and submitted to NCBI (MT909827). Next, the genomic structure, codon usage preferences, RNA editing, and SSR of the species were analysed. Furthermore, 10 closely related species were selected to compare data such as sequencing differences, nucleotide diversity (Pi), evolutionary selection pressure, and the expansion and contraction of IR regions. Finally, a phylogenetic tree was constructed using the cpDNA sequences of 24 species to analyse the phylogeny of D. oligodontus. The results of this study will provide a valid reference for the conservation, resource development, and utilisation of this endangered species.

Plant Material and DNA Extraction
Leaf samples were collected from Baoting Li Autonomous County, Hainan Province (18 • 42 N, 109 • 42 E) in July 2020. Fresh leaves were collected and then stored on ice until returning to the laboratory. In addition, the sample specimens were also preserved in the Herbarium of Hainan Normal University (reference number: WY-202007-1). The chloroplast genomic DNA of D. oligodontus leaves was extracted using a modified CTAB method [48]. The quality and concentration of the genomic DNA were evaluated using agarose electrophoresis and NanoDrop 2000 (Thermo Fisher Scientific, Inc., Waltham, MA, USA), and the genomic DNA with the necessary quality was then used for library construction.

Sequencing, Assembly, and Gene Annotation of cpDNA
The genome was sequenced on an Illumina Novaseq 6000 platform (Illumina, San Diego, CA, USA) with 150 bp paired-end reads. For enhanced accuracy in subsequent assembly, the clean read data were obtained after quality filtering and trimming. Highquality reads were de novo assembled into a complete cp genome by SPAdes3.11.0 software, using a k-mer set of 21, 45, 65, 85, and 105 [49]. Finally, the complete cp genome was annotated using PGA [50] with the cp genome of D. dentiger (NCBI accession number: NC_026546.1) as a reference, coupled with manual checks and adjustment. Additionally, tRNA genes were identified by the tRNAscan-SE 1.21 program [51]. Annotation of rRNA was performed using the RNAmmer 1.2 Server (http://www.cbs.dtu.dk/services; accessed on 17 August 2020), supplemented by homologous sequence comparison to correct for boundary ranges. After sequence annotation, Sequin was used to edit and generate a file for submission to the GenBank database. The edited GenBank annotation files were submitted to Organellar Genome DRAW (OGDRAW) v1.2 for annotation mapping.

Analyses of Amino Acid Frequencies, Codon Preferences, and RNA Editing Sites
Codon usage bias analysis and calculation of the RSCU values were performed in the program CodonW v1.4.2 (https://sourceforge.net/projects/codonw; accessed on 8 April 2022). RSCU > 1 indicated that the codon was more frequently used and was a preferred codon, whereas RSCU < 1 indicated low usage, and RSCU = 1 indicated that the codon had no usage preference. Potential RNA editing sites were identified using the PREP-Cp online prediction tool (http://prep.unl.edu; accessed on 9 April 2022) with the cut-off value set as 0.8 [53].

Comparative Genomic Analysis and Nucleotide Diversity in the Chloroplast Genome
To investigate the sequence divergence of the chloroplast genome among the analysed Araliaceae family species, the whole chloroplast genome sequences of the 11 species (D. morbifera, D. dentiger, Fatsia japonica, Kalopanax septemlobus, Metapanax delavayi, Brassaiopsis hainla, Eleutherococcus gracilistylus, Schefflera heptaphylla, P. ginseng, and P. notoginseng) were analysed using the mVISTA program (http://genome.lbl.gov/vista/mvista/submit. shtml; accessed on 5 April 2022) in Shuffle-LAGAN mode [54]. For the 11 aligned cp genome sequences, the Pi values of the compared chloroplast genomes were calculated using DnaSp v6.12.03 in the sliding window for DNA polymorphism analysis. The window length was set to 600 bp with a step size of 100 bp.

IR Contraction and Expansion Analysis
To observe the expansion and contraction of the IR regions, the IRscope software (https://irscope.shinyapps.io/irapp; accessed on 1 April 2022) [55] was used to map the four adjacent boundary regions (LSC/IRa, IRa/SSC, SSC/IRb, and IRb/LSC) of the cpDNA for the aforementioned 11 species.

Analysis of Evolutionary Selection Pressure
The protein-coding genes of D. oligodontus cpDNA were aligned and formatted using ParaAT2.0 with default parameters. The protein-coding gene sequences annotated in the cpDNA were then analysed separately using KaKs_Calculator v.2.0 based on the YN method, and the value of nonsynonymous substitution rate (Ka)/synonymous substitution rate (Ks) was calculated. Ka/Ks > 1 indicated a positive selection effect for the gene, whereas Ka/Ks = 1 indicated a neutral selection and Ka/Ks < 1 indicated a purifying selection effect.

Phylogenetic Analysis
Homologous single-copy genes from 24 species were selected (see Table S4) and compared using MUSCLE v3.8.1551. Conserved sequences were then extracted using Gblocks v0.91b and concatenated for constructing phylogenetic trees. The best amino acid substitution model was selected using prottest v3.4: HIVb + I + G + F, with Chrysanthemum indicum (NC_020320) as the outgroup. Finally, the phylogenetic tree was constructed using RAxML v8.2.12.

Analysis of Long Repeats and Simple Sequence Repeats
Long-repeat sequence analysis of D. oligodontus cpDNA using the REPuter software revealed a total of 13 repetitive sequences, including five forward repeats and eight palindromic repeats ( Table 2). The repetitive sequences ranged in length from 30 to 48 bp and were primarily localised on the gene ycf2. As many as eight of the sequences were present in two forward repeats (48 bp and 30 bp) localised in the IRA and IRB regions, respectively, and four palindromic repeats (48 bp and 30 bp) present in both IRA and IRB regions. The ndhA gene was distributed in one forward repeat (IRB; SSC) and one palindromic repeat (SSC; IRA) region. Both regions were 34 bp in length and located within the intron region. The intergenic spacer (IGS) region of the trnS-GGA gene contained one 30 bp palindromic repeat sequence. In addition, one 44 bp and one 40 bp palindromic repeat sequence were distributed in the LSC and SSC regions, respectively ( Table 2).

Codon Usage Bias
D. oligodontus cpDNA has 26,514 codons encoding a total of 20 amino acids (without stop codons). Codon classification was performed. The results showed that the most commonly encoded amino acid was leucine (Leu) with 2805 codons (10.58%) and included 6 synonymous codons, with UUA being the most common codon (Table 3, Supplementary Figure S1). This was followed by isoleucine (Ile, 8.39%), serine (Ser, 7.72%), glycine (Gly, 6.92%), arginine (Arg, 6.03%), and phenylalanine (Phe, 5.62%). Cysteine (Cys, 1.11%) was the least-encoded amino acid. Among these codons, the most frequent was lysine (Lys) encoded by AAA (with 1054 codons), and the least frequently used codon was UGC encoding cysteine (with 79 codons) ( Table 3). A total of 30 codons had RSCU values higher than 1, indicating that they had codon usage preferences (Table 3). In the third position, 16 codons ended in U(T), 13 ended in A, and only one ended in G, indicating that the codons of D. oligodontus cpDNA had a strong A/T preference in the third position. In addition, both the start codon AUG and the Trp-encoding codon UGG had RSCU values equal to 1, indicating no preference, whereas the termination codon UAA had an RSCU value greater than 1, indicating a preference.

Prediction of RNA Editing Sites in Protein-Coding Genes
The Prep-Cp software was used to predict the RNA editing sites for 86 proteinencoding genes of D. oligodontus cpDNA (Supplementary Table S2). Of these, RNA editing was observed in 19 genes with a total of 54 editing sites, all of which were C-to-T transitions. In addition, 6 (11.11%), 48 (88.89%), and 0 editing sites were found at the first, second, and third positions of the codon, respectively. These editing sites were primarily present in the genes ndhB (nine sites), ndhD (eight sites), ndhA (five sites), rpoB (five sites), matK (four sites), and accD (three sites), whereas the remaining genes demonstrated one or two sites (Supplementary Table S2). The most common transition induced by site editing was the transition of serine codon TCA (S) to leucine codon TTA (L) (37.04%), followed by proline (P) to leucine (L) (9.2%). H (histidine) to Y (tyrosine) and S (serine) to F (phenylalanine) both showed a frequency of 7.4%. Lower frequencies were observed for R (arginine) to W (tryptophan), P (proline) to L (leucine), T (threonine) to I (isoleucine), P (proline) to S (serine), T (threonine) to M (methionine), A (alanine) to V (valine), L (leucine) to F (phenylalanine), and other transitions.

Comparative Analysis of the Chloroplast Genomes of 11 Araliaceae Species
Using D. oligodontus cpDNA as a reference, the cpDNA of 10 plants from eight genera of the Araliaceae family (including two species of the genus Dendropanax, two species of the genus Panax, and one species each in the genera Eleutherococcus, Schefflera, Fatsia, Kalopanax, Metapanax, and Brassaiopsis) were selected for comparative analysis ( Table 4). The cpDNAs of 11 species showed sizes ranging from 155,613 bp (in Fatsia japonica) to 156,770 bp (in Eleutherococcus gracilistylus), with a maximum difference of 1157 bp and a minimum difference of 10 bp. The size of the LSC region ranged from 86,111 bp (in P. notoginseng) to 86,729 bp (in E. gracilistylus), with a maximum difference of 618 bp and a minimum difference of 21 bp. The size of the SSC region ranged from 17,867 bp (in F. japonica) to 18,554 bp (in P. notoginseng), with a maximum difference of 687 bp and a minimum difference of 5 bp. The size of the IR region ranged from 25,629 bp (in F. japonica) to 26,074 bp (in P. ginseng), with a maximum difference of 445 bp and a minimum difference of 3 bp. Comparative analysis revealed that the GC content of cpDNA in all 11 species was approximately 37.9%, with a maximum difference of only 0.19%. Eight species (D. morbifera, D. dentiger, F. japonica, Kalopanax septemlobus, Metapanax delavayi, Brassaiopsis hainla, E. gracilistylus and P. notoginseng) had the same total gene number, which was 132 for each, including 87 protein-coding genes, 37 tRNAs and eight rRNA genes ( Table 4). The other three species, D. oligodontus, P. ginseng, and Schefflera heptaphylla, had 131, 129, and 123 genes, respectively. They shared eight rRNA genes, too. However, compared with the nine species mentioned above, D. oligodontus lacked the ycf15 gene in the PCGs; P. ginseng lacked petN, rpl2 and trnG-UCC; and S. heptaphylla lacked psbN, trnK-UUU, trnG-UCC, trnL-UAA, trnV-UAC, two trnI-GAC, and two trnA-UGC. Using the mVISTA program for the visual display of multiple-sequence alignment, a comparative analysis showed that the sequences of the untranslated region (UTR) was highly conserved among the 11 Araliaceae species, with some differences in the sequences of the non-coding and exon regions ( Figure 3). However, the non-coding sequences showed more differences than the exon sequences; for example, at about 4-11 kb, 28-34 kb, 49-50 kb, etc. (Figure 3). Three species of the genus Dendropanax (D. oligodontus, D. morbifera, and D. dentiger) and two species of the genus Panax (P. ginseng and P. notoginseng) had high intra-genus plant sequence similarity, and F. japonica, a plant of the genus Fatsia, showed more differences in some non-coding regions when compared with other species.
The nucleotide diversity (Pi) of cpDNA in 11 Araliaceae species was analysed using DnaSP software, as shown in Figure 4. The Pi values ranged from 0 to 0.04882, with an overall genomic Pi mean value of 0.00639. Regions with high nucleotide diversity were distributed in the LSC and SSC, and the vast majority were present in the spacer regions of genes, such as trnK-rps16, rps16-trnQ, atpH-atpI, trnE-trnT, psbM-trnD, ycf3-trnS, and rpl32-trnL. The genes clpP and ycf1 had relatively higher Pi values (Figure 4). The mean Pi values of the LSC and SSC regions were 0.00854 and 0.01056, respectively. The IR region had lower nucleotide diversity and was more conserved than the single-copy region sequence, with a mean Pi value of only 0.00323. The nucleotide diversity (Pi) of cpDNA in 11 Araliaceae species was analysed using DnaSP software, as shown in Figure 4. The Pi values ranged from 0 to 0.04882, with an overall genomic Pi mean value of 0.00639. Regions with high nucleotide diversity were distributed in the LSC and SSC, and the vast majority were present in the spacer regions of genes, such as trnK-rps16, rps16-trnQ, atpH-atpI, trnE-trnT, psbM-trnD, ycf3-trnS, and rpl32-trnL. The genes clpP and ycf1 had relatively higher Pi values (Figure 4). The mean Pi values of the LSC and SSC regions were 0.00854 and 0.01056, respectively. The IR region had lower nucleotide diversity and was more conserved than the single-copy region sequence, with a mean Pi value of only 0.00323.

IR Contraction and Expansion Analysis
Variations in the sizes of cpDNA between species are closely related to the expansion and contraction of the IR and SC (LSC and SSC) regions, which is a relatively common phenomenon in cpDNA evolution and may reflect phylogenetic relationships to a certain extent. As shown in Figure 5, the JLB boundaries of all species were located within the rps19 gene. The base lengths of the rps19 gene sequences distributed in the LSC (231 bp) and IRb (48 bp) regions were identical in the three species of the genus Dendropanax, and differed by only 1 bp in the two species of the genus Panax ( Figure 5). The results were slightly different for the rps19 sequence of E. gracilistylus, which had only 37 bp distributed in the IRb region, whereas all other species had between 45 to 51 bp. The JSB boundaries of 11 species were all located in the ycf1 pseudogene (D. dentiger, F. japonica, M. delavayi, B. hainla, and P. notoginseng) or the ycf1 gene (D. oligodontus, D. morbifera, K. septemlobus, E. gracilistylus, S. heptaphylla, and P. ginseng), whereas the JSA boundaries of 11 species were all in the ycf1 gene ( Figure 5). In addition, the number of bases at the JSB boundary of the ycf1 pseudogene or ycf1 gene deep within the SSC region varied greatly ranging from as low as 3 bp (B. hainla) and 4 bp (F. japonica) to as high as 262 bp (P. notoginseng) ( Figure 5). Moreover, the ndhF gene of 11 species was distributed in the SSC region. How-

IR Contraction and Expansion Analysis
Variations in the sizes of cpDNA between species are closely related to the expansion and contraction of the IR and SC (LSC and SSC) regions, which is a relatively common phenomenon in cpDNA evolution and may reflect phylogenetic relationships to a certain extent. As shown in Figure 5, the JLB boundaries of all species were located within the rps19 gene. The base lengths of the rps19 gene sequences distributed in the LSC (231 bp) and IRb (48 bp) regions were identical in the three species of the genus Dendropanax, and differed by only 1 bp in the two species of the genus Panax ( Figure 5). The results were slightly different for the rps19 sequence of E. gracilistylus, which had only 37 bp distributed in the IRb region, whereas all other species had between 45 to 51 bp. The JSB boundaries of 11 species were all located in the ycf1 pseudogene (D. dentiger, F. japonica, M. delavayi, B. hainla, and P. notoginseng) or the ycf1 gene (D. oligodontus, D. morbifera, K. septemlobus, E. gracilistylus, S. heptaphylla, and P. ginseng), whereas the JSA boundaries of 11 species were all in the ycf1 gene ( Figure 5). In addition, the number of bases at the JSB boundary of the ycf1 pseudogene or ycf1 gene deep within the SSC region varied greatly ranging from as low as 3 bp (B. hainla) and 4 bp (F. japonica) to as high as 262 bp (P. notoginseng) ( Figure 5). Moreover, the ndhF gene of 11 species was distributed in the SSC region. However, it was closely adjacent to the JSB boundary (only 5 bp) in the P. ginseng species. Except for the ycf1 gene of B. hainla, which was only 1725 bp long, the ycf1 genes of 10 species were between 5577 to 5760 bp in length at the JSA boundary. The ycf1 gene sequences distributed within the IRA region were between 1387 to 1497 bp in length with a smaller difference. At the JLA boundary, the rpl2 gene (118 bp from the boundary) and the trnH gene (2 bp across the boundary) were identical in the three plants of the genus Dendropanax, whereas the length of the trnH gene across the boundary in the remaining species ranged from 0 to 13 bp. Finally, the rps19 pseudogene was observed at the JLA boundary in some species ( Figure 5). ever, it was closely adjacent to the JSB boundary (only 5 bp) in the P. ginseng species. Except for the ycf1 gene of B. hainla, which was only 1725 bp long, the ycf1 genes of 10 species were between 5577 to 5760 bp in length at the JSA boundary. The ycf1 gene sequences distributed within the IRA region were between 1387 to 1497 bp in length with a smaller difference. At the JLA boundary, the rpl2 gene (118 bp from the boundary) and the trnH gene (2 bp across the boundary) were identical in the three plants of the genus Dendropanax, whereas the length of the trnH gene across the boundary in the remaining species ranged from 0 to 13 bp. Finally, the rps19 pseudogene was observed at the JLA boundary in some species ( Figure 5).

Analysis of Evolutionary Selection Pressure
During plant evolution, Ka, Ks, and the ratio of Ka/Ks are commonly used to evaluate the rate of evolution between gene sequences and to better elucidate whether selection pressure is associated with a particular protein-coding gene. In this study, the results (Table S3) showed that most of the genes of Dendropanax species without Ka/Ks values indicated more conservation, whereas some genes such as ycf1/2, nadA/F, and rpoA/B/C1/C2 showed neutral or purifying selection trends. Compared with the species outside the genus Dendropanax, atpF (except vs. P. ginseng), clpP (vs. P. ginseng and P. notoginseng), matK (vs. K. septemlobus), ndhA (vs. P. ginseng), petB (vs. S. heptaphylla), rpl22 (vs. E. gracilistylus and F. japonica), rps12 (vs. K. septemlobus, M. delavayi, and B. hainla) and ycf1/2 (vs. F. japonica, B. hainla, S. heptaphylla, P. ginseng, and P. notoginseng) in D. oligodontus showed positive selection effects (Table S3). A few genes had values close to 1, indicating a slow evolutionary rate and some conservation in D. oligodontus cpDNA.

Phylogenetic Analysis
To analyse the phylogenetic position of D. oligodontus in Araliaceae, a phylogenetic tree was reconstructed for 22 species of the family Araliaceae (see Table S4) using the 54 common protein-coding genes among the 22 complete chloroplast genome sequences. The best amino acid substitution model was selected using prottest v3.4: HIVb + I + G + F, with Chrysanthemum indicum of the Asteraceae family and Angelica gigas of the Apiaceae family (Apiales) as an outgroup cluster. The results showed that the entire family Araliaceae was clustered into one large branch. Since the Araliaceae family also belongs to Apiales, A. gigas was clustered with the Araliaceae family as an extra-family taxon. The Araliaceae taxon was divided into two evolutionary branches (Figure 6, I and II). Branch I included species of the Plerandreae Group, containing 12 species of seven genera. D. oligodontus was most closely related to D. dentiger, and together with D. morbifera were clustered into one branch as the genus Dendropanax. Branch II represents the Aralieae-Panaceae group, which includes 10 species of the genera Panax and Aralia.

Phylogenetic Analysis
To analyse the phylogenetic position of D. oligodontus in Araliaceae, a phylogenetic tree was reconstructed for 22 species of the family Araliaceae (see Table S4) using the 54 common protein-coding genes among the 22 complete chloroplast genome sequences. The best amino acid substitution model was selected using prottest v3.4: HIVb + I + G + F, with Chrysanthemum indicum of the Asteraceae family and Angelica gigas of the Apiaceae family (Apiales) as an outgroup cluster. The results showed that the entire family Araliaceae was clustered into one large branch. Since the Araliaceae family also belongs to Apiales, A. gigas was clustered with the Araliaceae family as an extra-family taxon. The Araliaceae taxon was divided into two evolutionary branches (Figure 6, I and II). Branch I included species of the Plerandreae Group, containing 12 species of seven genera. D. oligodontus was most closely related to D. dentiger, and together with D. morbifera were clustered into one branch as the genus Dendropanax. Branch II represents the Aralieae-Panaceae group, which includes 10 species of the genera Panax and Aralia.

Basic Characteristics of the D. oligodontus Chloroplast Genome
Chloroplast DNA typically exists as a double-stranded circular molecule, and the cpDNA of most higher-order plants is a highly conserved tetrameric structure comprising an LSC (approximately 81-90 kb long), SSC (length 18-20 kb), and two IR sequences, IRa and IRb (approximately 20-30 kb long) [15,16]. However, a few plants, such as those belonging to the genus Erodium [56], Medicago truncatula, Cicer arietinum, Trifolium repens, and other legumes [57], do not show a tetrameric structure because of the loss of an inverted repeat region. In this study, we sequenced and assembled D. oligodontus cpDNA for the first time. The analysis results showed a typical tetrameric structure that was 156,403 bp long with an 86,440 bp LSC, 18,075 bp SSC, 25,944 bp IR, and 37.99% GC content. It comprised 131 genes, including 86 protein-coding genes, 37 tRNA genes, and 8 rRNA genes, which is similar to the reported structural features of the cpDNA of D. morbifera [46], D. dentiger [47], and the genus Panax [3,42] in the Araliaceae family. These results indicate that the cpDNA structure of D. oligodontus was relatively conserved.
Repeat sequences and SSRs are widely present in plant cpDNA, and their type, number, and distribution vary among various plants [58]. Chloroplast SSRs have been widely used in studies on the genetic diversity and phylogeny of plant populations, as well as population analysis involving structure, classification, and phylogeographic distribution [16,38]. Analysis of D. oligodontus cpDNA revealed a total of 13 repetitive sequences and 43 SSR loci, of which the repetitive sequences were mainly distributed in the coding region of the ycf2 gene (61.5%), the intron region of the ndhA gene, and the IGS region of the trnS-GGA gene. Most of the repeats were attributed to simultaneous distribution in the IR region. The SSR loci were mainly distributed in the LSC region (83.7%) and primarily comprised A or T base repeats. These results are similar to the reported repetitive sequences and SSR distribution characteristics of the chloroplast genomes of nine species in the Araliaceae family, including plants of the genera Brassaiopsis, Eleutherococcus, Kalopanax, and Macropanax [59,60]. These species had 35 to 46 cpDNA SSR loci, mostly including single-nucleotide repeats (A/T), and these loci were mainly present on the IGS and intron sequences [59,60]. These data may provide useful genetic references for further research into the molecular ecology of this species.

Gene Codon Usage and RNA Editing Site Prediction in the D. oligodontus Chloroplast Genome
Plant codons exhibit certain usage preferences (codon usage bias, CUB), which are considered an important evolutionary feature of the genome. RSCU is often used as an important indicator of codon preference [61]. Codon usage bias is related to factors such as GC content, tRNA abundance, and protein structure, and mutation and natural selection are the dominant factors affecting the degree of codon bias. The examination of codon preferences is beneficial for understanding phylogenetic patterns in particular species [61,62]. In this study, it was shown that Leu was the most predominant amino acid in D. oligodontus cpDNA, and 29 of the 30 codons with an RSCU value higher than 1 ended in A/U, indicating that the codons of D. oligodontus cpDNA had a strong A/T preference in the third position. This result is identical to that of the chloroplast genes of Panax species belonging to the same family, which also mostly ended in A/T at the third base [59,60]. This is also consistent with the results of previous studies showing that dicotyledon codons use biased A/T endings, whereas monocotyledon codons use biased G/C endings [63,64].
RNA editing may be yet another conserved mechanism for the expansion of genetic information that has been generated in organisms over long periods of evolution, resulting in increased diversity in gene transcription and function [65]. A single gene sequence can be modified by editing to synthesise multiple different proteins involved in the control of different traits or biological processes. Several RNA edits are also observed in the genomes of cellular organelles (including mitochondria and chloroplasts). For instance, the maize chloroplast gene rpl2 with the start codon ACG must undergo C-to-T editing to form the correct start codon, ATG, before it can initiate transcription and translation [65]. The physiological functions of chloroplast RNA editing primarily involve gene expression regulation, and the effects are manifested in the protein structure. For instance, RNA editing in the non-coding region is important for mRNA splicing, and RNA editing in the coding region of the gene leads to amino acid changes [66,67]. In this study, 54 RNA editing sites were predicted in 86 protein-coding genes of D. oligodontus cpDNA. The sites were distributed across 19 genes. All involved C-to-U transitions, with 11.11% and 88.89% of the editing sites observed at the first and second bases of the codons, respectively (see Supplementary Table S2). The predicted number of sites was higher than that observed in an average angiosperm cpDNA, which typically has approximately 30 editing sites [68]. The type of editing observed in this study is consistent with the fact that only C-to-U transitions are found in the chloroplast of higher-order plants, whereas frequent U-to C-transitions are present in the chloroplasts of lower-order plants of the Pteridophyta and Bryophyta taxa [68]. Furthermore, in angiosperms, most editing sites are present in the ndhB gene (up to 15). Certain sites are present on several genes, such as rpoC2, rpoB, ndhA, rpoB, and ndhF, but the number of editing sites is considerably reduced in lower-order plants [66,68]. In the present study, the editing sites in D. oligodontus cpDNA were mainly found on the genes ndhB (9), ndhD (8), ndhA (5), rpoB (5), matK (4), and accD (3), which are involved in biological processes such as photosynthetic electron chaining, energy metabolism, gene transcription of proteins, and fatty acid metabolism. These results are consistent with the general characteristics of angiosperms; however, they are also species-specific. In addition, the analysis in this study also revealed that amino acids undergo more than one shift from Ser to Leu after RNA editing, resulting in a shift from hydrophilic to hydrophobic amino acids which increases the hydrophobicity of proteins. These results are in line with the general characteristics of chloroplast RNA in higher-order plants [68].

Comparative Analysis of the Chloroplast Genome of D. oligodontus and Closely Related Species
The size of the cpDNA in angiosperms ranges from 120 to 180 kb, and the size of the IR region ranges from 20 to 30 kb [15,16]. In the present study, statistical analysis of the cpDNA from 11 species belonging to eight genera in the Araliaceae family, including D. oligodontus (Table 4), revealed that the genome size ranged from 155.6 to 156.7 kb. The LSC, SSC, and IR regions ranged from 86.1 to 86.7 kb, 17.8 to 18.5 kb, and 25.6 to 26.0 kb, respectively, which was consistent with the characteristics of angiosperm cpDNA [15,16]. In addition, all 11 of these cpDNAs had a GC content of approximately 37.9%, and the total number of genes in most plants was approximately 132 (except D. oligodontus, 131; P. ginseng, 129; and S. heptaphylla, 123), which demonstrates the evolutionary conservation among 11 species of the Araliaceae family. Furthermore, the results of multiple-sequence alignment using the mVISTA program has also confirmed the above conclusions. Here, considerable similarities in the genome UTR and coding regions were identified among the cpDNAs of these 11 species, whereas small differential sequences were primarily found in the non-coding regions (Figure 3).
For further identification of highly diverse regions, nucleotide polymorphism (Pi) analysis in the cpDNA of 11 species was conducted using DnaSP software. The results ( Figure 4) showed that regions with high Pi values were most distributed in the IGSs of the LSC and SSC. The average value of Pi in the LSC and SSC regions was higher than that in the IR region. We also identified some mutational hotspots, and these highly variable fragments might be adopted as appropriate loci for population genetic and phylogeographic studies.
In addition, a general evolutionary phenomenon in the cpDNA involves expansion and contraction events in the four IR boundaries, which leads to variations-to a considerable extent-in the entire cpDNA of the same or different plant groups. For example, the IR region of the genus Pelargonium is up to 75 kb in length, and the IR regions of Cryptomeria japonica and Pisum sativum are lost [18,23]. In the present study, small differences were observed in the length of the IR regions in the cpDNA of 11 species from eight genera. These results indicated that the cpDNA of the Araliaceae family did not show such changes in the IR regions, which were relatively stable and consistent ( Figure 5). However, minor differences in the sequence lengths of rps19 and ycf1 genes across the IR boundary regions were observed in different species, indicating the diversity of IR/SC boundaries across Araliaceae family species. Moreover, the boundary genes, such as the genes rps19, rpl2, and trnH on the IRA-LSC boundary, showed relatively conserved characteristics within the genus Dendropanax.
Ka, Ks, and Ka/Ks values can be used to evaluate the rate of evolution in gene sequences. The analysis of these values can provide a reference for our understanding of the adaptive relationship between gene function and structure. A Ka/Ks value of higher than 1 indicates that the gene is subject to positive environmental selection, which helps to determine if the gene is undergoing adaptive evolution [69]. Therefore, in this study, some genes involved in RNA transcription and redox metabolism (e.g., nadA/F, and rpoA/B/C1/C2) were probably subjected to neutral or purifying selection among the Araliaceae family (Table S3). Correspondingly, the results also showed that some genes (e.g., atpF, clpP, matK, ndhA, petB, rpl22, rps12, and ycf1/2) involved in energy metabolism, gene information transfer, and photosynthetic electron transfer likely underwent positive selection effects. Interestingly, the atpF gene of the genus Dendropanax showed strong positive selection compared with other plants of the Araliaceae family (except P. ginseng), especially with Ka/Ks values as high as 36.958 compared with those of E. gracilistylus (Table S3). These results are consistent with those of Yin et al. [70] who reported that, when compared with deciduous plants, the atpF gene of the evergreen sclerophyll Quercus aquifolioides showed strong positive selection, and considerable purifying selection was observed within the evergreen sclerophyll taxon [70]. The gene atpF is a subunit of H + -ATP synthase, which is important for chloroplast electron transport, photorespiration, and resistance to adversity in plants [71]. D. oligodontus is also a tropical evergreen plant, and the results of this study imply that the atpF gene may be important in the environmental adaptation and species evolution of Dendropanax.

Phylogenetic Analysis of the Genus Dendropanax and the Araliaceae Family
Taxonomic studies performed using plant cpDNA sequences have high resolution and can solve many morphological classification problems. Such studies are now widely performed to explore the phylogenetic relationships between species [16,18]. In the present study, the results showed that most nodes of the phylogenetic tree have also obtained a high support rate. There was a clear developmental relationship between each genus or between each species within a genus. The phylogenetic tree showed that D. oligodontus was most closely related to D. dentiger within the genus Dendropanax. This result is consistent with the geographical distribution of the two species. D. dentiger is widely distributed in the southern region of China, whereas D. oligodontus is distributed only on Hainan Island. Hence, D. oligodontus may have evolved from D. dentiger due to long-term geographic isolation. Furthermore, D. morbifera is primarily distributed in the regions of the Korean Peninsula and Japan [72] and is more distant from the distribution areas of D. oligodontus and D. dentiger. Thus, evolutionarily, this is also reflected in the results of the mVISTA comparison between the cpDNA of these three species. For instance, regions with lower similarity were located in prtB (at approximately 77 kb), trnL-CAA (at approximately 96 kb and 147 kb), and ycf1 (at approximately 127 kb) ( Figure 3). In this study, earlier separation of the genera Schefflera and Fatsia in the Plerandreae (I) clade, as well as the clustering of the genera Eleutherococcus and Brassaiopsis into one evolutionary branch, are consistent with the results of Dong et al. [60]. In our results, the genera Metapanax, Kalopanax, and Dendropanax were clustered into one evolutionary branch, but low node support was shown among these three genera in the phylogenetic tree ( Figure 6). This result seems to be consistent with the reports of Dong et al. [60] and Kim et al. [46] which show that Kalopanax and Metapanax were more closely related. However, as indicated by Li et al. [73], the genus Metapanax is clustered with the genera Eleutherococcus and Brassaiopsis into one evolutionary branch, forming a sister relationship with the genera Kalopanax and Schefflera. These inconsistencies require verification in further studies.

Conclusions
In conclusion, this study used the Illumina Novaseq 6000 platform to obtain the first complete chloroplast sequences of D. oligodontus. Similar to the cpDNA of most higher-order plants, the cpDNA of D. oligodontus was present as a circular double-stranded molecule with a tetrameric structure, encoding a total of 131 genes. We conducted a structural feature analysis of D. oligodontus cpDNA, including repetitive sequence and SSR site identification, the prediction of RNA editing sites in PCGs, the analysis of codon preference, and multiple comparisons of cpDNA sequences among closely related species. These results provide useful information for studying the phylogeny and conservation of D. oligodontus, as well as the phylogenetic analysis, identification, classification, and examination of genetic diversity in the Araliaceae family.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/genes13112028/s1, Figure S1: Codon content of 20 amino acids and stop codons in the D. oligodontus cpDNA; Table S1: Summary of the SSRs in chloroplast gene of D. oligodontus; Table S2: Prediction of the RNA editing sites in chloroplast genome of D. oligodontus; Table S3: The rate of Ka/Ks in the chloroplast genomes of 11 Araliaceae species; Table S4: Plant materials for data matrix.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy. The complete chloroplast genome sequence of D. oligodontus was deposited at NCBI (GenBank accession number: MT909827).