
Complete Chloroplast Genome Sequence and Phylogenetic Analysis of Quercus bawanglingensis Huang, Li et Xing, a Vulnerable Oak Tree in China

Key Laboratory of Tree Breeding and Cultivation of State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
Research Institute of Forest Ecology, Environment and Protection, Chinese Academy of Forestry, Beijing 100091, China
Author to whom correspondence should be addressed.
Forests 2019, 10(7), 587;
Original submission received: 5 June 2019 / Revised: 7 July 2019 / Accepted: 12 July 2019 / Published: 15 July 2019


Quercus bawanglingensis Huang, Li et Xing, an evergreen oak of the genus Quercus (Fagaceae) endemic to China, is currently listed in the Red List of Chinese Plants as a vulnerable (VU) plant. No chloroplast (cp) genome information is currently available for Q. bawanglingensis, although such information is essential for establishing guidelines for its conservation and breeding. In the present study, the cp genome of Q. bawanglingensis was sequenced and assembled into double-stranded circular DNA with a length of 161,394 bp. Two inverted repeats (IRs) with a total of 51,730 bp were identified, and the rest of the sequence was separated into two single-copy regions, namely, a large single-copy (LSC) region (90,628 bp) and a small single-copy (SSC) region (19,036 bp). The genome of Q. bawanglingensis contains 134 genes (86 protein-coding genes, 40 tRNAs and eight rRNAs). More forward (29) than inverted (21) long repeats are distributed in the cp genome. A simple sequence repeat (SSR) analysis showed that the genome contains 82 SSR loci, 84.15% of which are A/T mononucleotides. Sequence comparisons among nine complete cp genomes, namely those of Q. bawanglingensis, Q. tarokoensis Hayata (NC036370), Q. aliena var. acutiserrata Maxim. ex Wenz. (KU240008), Q. baronii Skan (KT963087), Q. aquifolioides Rehd. et Wils. (KX911971), Q. variabilis Bl. (KU240009), Fagus engleriana Seem. (KX852398), Lithocarpus balansae (Drake) A. Camus (KP299291) and Castanea mollissima Bl. (HQ336406), demonstrated that the diversity of the SC regions was higher than that of the IR regions, which might facilitate identification of the relationships within this extremely complex family. A phylogenetic analysis showed that Fagus engleriana and Trigonobalanus doichangensis occupy the basal positions of the resulting evolutionary tree. Q. bawanglingensis and Q. tarokoensis, which belong to group Ilex, share the closest relationship. The analysis of the cp genome of Q. bawanglingensis provides crucial genetic information for further studies of this vulnerable species and of the taxonomy, phylogenetics and evolution of Quercus.

1. Introduction

The cp genomes of most gymnosperms are paternally inherited, whereas those of the majority of angiosperms are maternally inherited [1]. In most angiosperms, the cp genomes, which encode approximately 130 genes and range from 76 to 217 kb [2,3], are typical double-stranded circular DNA molecules composed of four regions: two copies of an inverted repeat (IRa and IRb) and two single-copy regions (LSC and SSC) [4,5]. Owing to its uniparental inheritance, highly conserved structure, general lack of recombination and small effective population size, cp DNA has been deemed useful for evolutionary research and the exploration of plant systematics [6,7,8,9]. In fact, the availability of sufficient data on cp genomes is crucial for phylogenetic reconstruction, e.g., the assessment of relationships within angiosperms [10,11,12] and the identification of members of Pinaceae [13] and Pinus [14], and for adequate comparisons, e.g., of cp genomes from sister species [15] and possibly multiple individuals [16]. At present, approximately 3000 plastid genomes of Eukaryota are available in the National Center for Biotechnology Information (NCBI) database owing to improvements in sequencing technologies. In addition, molecular genetic methodologies based on nuclear and organellar genomes are crucial for conservation studies [17], particularly the conservation of threatened species for which there is scarce information on the genetic variation among populations [18]. Comprehensive analysis of both cpDNA and nDNA sequences could provide supplementary and often contrasting information on the genetic diversity among populations [17,19,20,21], which could be used to explore the causes of species threats and to formulate appropriate conservation measures. In addition, DNA barcodes have broad applications for rapid and accurate species identification [22].
Although the design of universal primers for single-copy nuclear sequences related to species boundaries is difficult, these nuclear primers might also be used for species discrimination in the future [23]. Barcodes based on whole plastid analyses that show interspecific discrepancies are expected to yield more information at the species and population levels for species identification to reveal new species and aid in biodiversity surveys and thus offer useful conservation strategies [24,25].
Oaks (Quercus L., Fagaceae) encompass approximately 500 species that are located throughout the northern hemisphere, although they mainly thrive in northern South America and Indonesia [26,27], and are dominant, diverse angiosperms valued for their economic, ecological, religious and cultural benefits [28]. In biological research, oaks are widely used to study both hybridization and introgression. Oaks were originally formalized as belonging to the subgenera Cyclobalanopsis and Quercus based entirely on their morphological characteristics [27,29], and the shift from a morphology- to a molecular-based classification changed the classification of oaks to the following two major clades (each comprising three groups): a Palearctic-Indomalayan clade (group Ilex, group Cerris and group Cyclobalanopsis) and a predominantly Nearctic clade (group Protobalanus, group Lobatae and group Quercus) [28]. Most recently, based on morphology, molecular features and evolutionary history, Quercus was split into two subgenera, Quercus and Cerris, which together include eight groups: for subgenus Quercus, group Protobalanus, group Ponticae, group Virentes, group Quercus and group Lobatae, and for subgenus Cerris, group Cyclobalanopsis, group Ilex and group Cerris [30]. In subsequent studies, the main challenge in oak classification will be infrasectional classification. With the rapid development of sequencing technology, genomic databases are becoming increasingly vital for in-depth studies of plant phylogenetics [31,32]. However, incongruent phylogenies between plastid and nuclear data have been observed in not only Quercus but also other genera [33,34,35]. In fact, high-resolution phylogenomic approaches can be used to assess the nuclear genome (e.g., RAD-sequencing) and likely provide even more important sources of information for phylogenetic and evolutionary studies, particularly in American oaks, such as Lobatae, Protobalanus and Quercus [36,37,38].
Plastid genomes are also important, because they can provide supplementary information that can be somewhat hidden in nuclear genomes (e.g., population–area relationships, ancient taxa histories and relationships) [38,39,40]. Hence, it is necessary to obtain preliminary cp genome data that can be used in future studies for species identification, for the assessment of relationships and eventual phenomena, such as reticulation, isolation, and introgression, and for establishing adequate conservation strategies.
Q. bawanglingensis is an endemic and vulnerable plant in China that is included in the Red List of Chinese Plants at the D2 VU level [41] based on the following criteria: a restricted area of occupancy (AOO; <20 km²) or a restricted number of locations (≤5). Nevertheless, its genetic background and resources have not been widely studied. Deng et al. (2017) reported that Q. bawanglingensis, which belongs to the Q. setulosa complex of the phylogenetic group Ilex, was more closely related to Q. setulosa in terms of leaf epidermal features [42]. As recorded in the Flora of China, Q. bawanglingensis is considered a distinct species related to Q. phillyreoides, but its genetic traits and taxonomic status are uncertain. Thus, a high-resolution and well-supported molecular phylogenetic tree is necessary. Obtaining cp genome information is necessary due to the lack of data on Q. bawanglingensis, and the importance and availability of information on the plastid genomes of oaks for detailed comparisons are increasing [40,43,44,45,46]. Polymorphic chloroplast microsatellite markers designed based on a cp genome analysis can be utilised to understand the levels and patterns of the geographical structure and genetic diversity of Q. bawanglingensis, and this information can subsequently be used to formulate an effective protection strategy.
In this study, we first sequenced and described the complete cp genome of Q. bawanglingensis and performed a comparative analysis of the cp genomes of multiple Quercus species in order to (1) investigate the structural patterns of the whole chloroplast genome of Quercus species including the genome structure, gene order and gene content; (2) examine abundant simple sequence repeats (SSRs) and large repeat sequences in the whole cp genome of Q. bawanglingensis to provide markers for phylogenetic and genetic studies; and (3) construct a chloroplast phylogeny for Fagaceae species using their whole cp DNA sequences.

2. Materials and Methods

2.1. Chloroplast DNA Extraction, Illumina Sequencing, Assembly, Annotation and Sequence Analyses

A single individual of Q. bawanglingensis (height: 3.3 m, diameter at breast height (DBH): 7.8 cm) was sampled from Mount Exianling (109°06′35.88″E; 19°00′45.65″N) on Hainan Island (Figure A1) [47]. Hainan, a part of the Indo-Burma Biodiversity Hotspot and the second largest island in China, is located at the northern edge of tropical Southeast Asia. Mount Exianling, which harbours the largest and best-preserved tropical limestone rainforest on Hainan Island, is situated in the western part of the island [47]. The mountain covers 2000 ha and ranges in altitude from 100 to 1238 m. The island is characterised by a typical tropical monsoon and continental climate, with a rainy season (May to November) and a dry season (December to April of the following year). The annual average temperature is 24.5 °C, and the annual precipitation is 1647 mm.
Fresh leaves of the individual were collected, flash-frozen in liquid nitrogen and stored in a freezer (−80 °C) until DNA extraction. DNA extraction was performed using the modified CTAB method [48]. DNA quality was assessed with a one drop spectrophotometer (OD-1000, Shanghai Cytoeasy Biotech Co., Ltd., Shanghai, China), and integrity was evaluated using a 0.8% agarose gel. Sequencing was performed on an Illumina HiSeq 4000 platform (Genepioneer Biotechnologies Co. Ltd., Nanjing, China) with PE250 based on Sequencing by Synthesis (SBS), with at least 5.74 GB of clean data obtained for Q. bawanglingensis. We then used FastQC v0.11.3 to trim the raw reads, and the cp-like reads were extracted through a BLAST analysis of the trimmed reads against the references (Q. tarokoensis and Q. tungmaiensis). We subsequently assembled the cp-like reads using NOVOPlasty [49]. Genome annotation was performed using CPGAVAS [50], and the results were checked using DOGMA and BLAST [51]. The tRNAs were identified by tRNAscan-SE [52], and we then mapped the entire genome using the OGDRAW v1.2 programme [53]. The cp genome sequence of Q. bawanglingensis has been deposited in GenBank (MK449426). SSRs and long repeats were determined using the MIcroSAtellite (MISA) identification tool [54] and REPuter [55], respectively. We also analysed the guanine and cytosine (GC) content, codon usage, diversification in synonymous codon usage, and relative synonymous codon usage (RSCU) values.

2.2. Genome Comparison

Paired sequence alignment was performed using MUMmer [56]. mVISTA [57] was used to examine the genetic divergence among nine complete cp genomes, namely, those of Q. bawanglingensis, Q. tarokoensis (NC036370.1), Q. aliena var. acutiserrata (KU240008), Q. baronii (KT963087), Q. aquifolioides (KX911971), Q. variabilis (KU240009), Fagus engleriana (KX852398), Lithocarpus balansae (KP299291) and Castanea mollissima (HQ336406), in the Shuffle-LAGAN mode [58] with the genome of Q. tarokoensis as the reference genome. The cp genome sequences of Q. bawanglingensis, Q. tarokoensis, Q. aliena var. acutiserrata, Q. baronii, Q. aquifolioides and Q. variabilis were aligned using MAFFT v.5 [56], and a sliding window analysis was performed to detect the nucleotide diversity of the cp genomes using DnaSP v5 [59].
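The sliding-window nucleotide diversity (pi) calculation described above can be illustrated with a minimal Python sketch. It assumes an already aligned set of equal-length, upper-case sequences and skips alignment columns containing gaps or ambiguous bases. The function names (`pairwise_pi`, `sliding_pi`), the toy alignment and the window and step sizes are hypothetical; this is not the DnaSP implementation, and the window settings used in the study are not stated here.

```python
from itertools import combinations

def pairwise_pi(seqs):
    """Average number of pairwise differences per compared site."""
    # Keep only alignment columns where every sequence has A, C, G or T.
    cols = [c for c in zip(*seqs) if all(b in "ACGT" for b in c)]
    if not cols or len(seqs) < 2:
        return 0.0
    pairs = list(combinations(range(len(seqs)), 2))
    diffs = sum(col[i] != col[j] for col in cols for i, j in pairs)
    return diffs / (len(pairs) * len(cols))

def sliding_pi(seqs, window, step):
    """Return (window_start, pi) for each full window along the alignment."""
    length = len(seqs[0])
    return [(s, pairwise_pi([q[s:s + window] for q in seqs]))
            for s in range(0, length - window + 1, step)]

# Toy alignment (hypothetical data, not from the study).
alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]
for start, pi in sliding_pi(alignment, window=5, step=5):
    print(start, round(pi, 4))
```

In the toy alignment only the first window contains a variable site, so its pi is non-zero while the second window is invariant; windows with elevated pi correspond to what the text calls divergence hotspots.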

2.3. Phylogenetic Analysis

The phylogenetic analysis was based on complete cp genome sequences from 29 taxa, namely, 24 Fagaceae species, three Betulaceae species and two outgroups (Populus trichocarpa and Theobroma cacao); all sequences except that of Q. bawanglingensis were downloaded from the NCBI. MAFFT v.5 [56] was used to align the cp genomes of the 29 species. The multiple sequence alignment was then adjusted manually using BioEdit [60], and a maximum likelihood (ML) tree was reconstructed using FastTree version 2.1.10 [61].

3. Results

3.1. Features of the Chloroplast Genome of Q. bawanglingensis

In total, at least 5.74 GB of clean data was obtained for Q. bawanglingensis, and these data were assembled into a double-stranded circular DNA with a length of 161,394 bp (Figure 1 and Table 1). The total lengths of the LSC, SSC and IRs are 90,628, 19,036 and 51,730 bp, respectively, and the sequences encode 134 genes, including eight rRNA genes, 40 tRNA genes and 86 protein-coding genes (Table A1). Different sections exhibit different distributions of genes: eight rRNA genes, 14 tRNA genes and 13 protein-coding genes within IR regions; one tRNA gene and 12 protein-coding genes in the SSC region; and 25 tRNA genes and 61 protein-coding genes within the LSC region (Figure 1 and Table 1). Furthermore, the GC contents of the entire cp genome and the IR, SSC and LSC regions are 36.80%, 42.70%, 30.90% and 34.60%, respectively, which are equivalent to the values obtained for other species in this study (Table 1 and Table A1).
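As a quick consistency check on the figures reported above, the following short Python sketch sums the region lengths and the per-region gene counts. The dictionary layout and variable names are illustrative only, and the zero rRNA counts for the SSC and LSC are inferred from the statement that all eight rRNA genes lie within the IRs.

```python
# Figures reported in Section 3.1; the dictionary layout is illustrative.
region_bp = {"LSC": 90_628, "SSC": 19_036, "IRs": 51_730}
gene_counts = {
    "IRs": {"rRNA": 8, "tRNA": 14, "protein": 13},
    "SSC": {"rRNA": 0, "tRNA": 1, "protein": 12},   # rRNA = 0 is inferred
    "LSC": {"rRNA": 0, "tRNA": 25, "protein": 61},  # rRNA = 0 is inferred
}

genome_bp = sum(region_bp.values())
per_type = {
    kind: sum(region[kind] for region in gene_counts.values())
    for kind in ("rRNA", "tRNA", "protein")
}

print(genome_bp)                # 161394
print(region_bp["IRs"] // 2)    # 25865 bp per IR copy, assuming equal copies
print(per_type, sum(per_type.values()))
```

The sums reproduce the reported genome length of 161,394 bp and the total of 134 genes (eight rRNA, 40 tRNA and 86 protein-coding genes).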
The results of the codon usage analysis are summarized in Table A2. Overall, these identified genes consist of 26,801 codons, and the most and least frequent amino acids in these codons are leucine (2828, 10.55%) and cysteine (308, 1.15%), respectively. The majority of the codons end in A- and U-.
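For readers unfamiliar with RSCU, the value for a codon is its observed count divided by the mean count of all codons encoding the same amino acid. The sketch below applies this standard definition to the cysteine and termination codon counts listed in Table A2; the function name `rscu` is a hypothetical helper and not part of any tool used in the study.

```python
def rscu(counts):
    """RSCU = observed count / mean count across the synonymous codons."""
    mean = sum(counts.values()) / len(counts)
    return {codon: n / mean for codon, n in counts.items()}

# Codon counts taken from Table A2.
cys = {"UGU": 222, "UGC": 86}
ter = {"UAA": 47, "UAG": 21, "UGA": 18}

for family in (cys, ter):
    print({codon: round(value, 2) for codon, value in rscu(family).items()})

# Share of cysteine among all 26,801 codons, in percent.
print(round(100 * sum(cys.values()) / 26_801, 2))
```

The rounded values agree with those given in Table A2 (for example, 1.44 and 0.56 for UGU and UGC) and with the cysteine share of 1.15% reported in the text.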
The statistics of exons and introns are provided in Table A3 and Table A4. The genome contains 23 intron-containing genes, two of which (clpP and ycf3) contain two introns; in addition, ten of these intron-containing genes are located in the LSC region, and only ndhA is found in the SSC region. The longest intron (2511 bp) is found in trnK-UUU, and the smallest intron (483 bp) is located in trnL-UAA.

3.2. Analysis of Long Repeats and SSRs

The long-repeat analysis of the Q. bawanglingensis cp genome revealed 29 forward and 21 inverted long repeats, i.e., eight more forward than inverted repeats (Table A5). The majority of the repeats are located in the LSC region (40), followed by the SSC region (12) and IRs (8). Moreover, a large proportion of repeats are located in intergenic regions (34, 68%), most of which are distributed in the LSC region, and the minority are found in the trnS-GCU, trnS-UGA, trnG-GCC (exon), trnG-GCC, psaB, psaA, clpP, rpl2, ndhF, ndhI, ndhA (intron), ycf1, trnV-UAC, trnA-UGC and rpl2 genes. Notably, no repeats longer than 31 bp were found in the Q. bawanglingensis cp genome; the repeats range from 18 to 31 bp.
Based on the SSR polymorphism results, we found 82 SSRs in the Q. bawanglingensis cp genome. Most of the SSRs are distributed in the LSC region (62, 75.61%), followed by the SSC region (16, 19.51%) and IRs (4, 4.88%), whereas 64 are located in intergenic spaces and 18 in genes, such as trnK-UUU, trnG-GC, atpF, rpoC2, rpoC1, rpoB, atpB, accD, clpP, petB, petD, ndhF, ndhD, ndhA and ycf1 (Table A6). Furthermore, rpoC1 and rpoC2 contain more SSR loci than the other genes. The cpSSRs in the cp genome generally consist of 69 mononucleotide SSRs (poly A or poly T), six dinucleotide SSRs and seven trinucleotide SSRs.
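To make the idea of a mononucleotide cpSSR concrete, the following minimal Python sketch scans a sequence for runs of a single base. It is a simplified stand-in for MISA, not its implementation: the function name `find_mono_ssrs`, the toy sequence and the minimum length of 10 bp are assumptions (the MISA thresholds used in the study are not stated here), and di- and trinucleotide repeats are not handled.

```python
import re

def find_mono_ssrs(seq, min_len=10):
    """Return (start, base, length) for each single-base run >= min_len."""
    # One base followed by at least (min_len - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]

# Toy sequence (hypothetical): a 12-bp poly-A, a 10-bp poly-T and a
# 9-bp poly-A that falls below the assumed threshold.
toy = "GGC" + "A" * 12 + "CGC" + "T" * 10 + "GCG" + "A" * 9
print(find_mono_ssrs(toy))

# Share of mononucleotide SSRs among the 82 loci reported above, in percent.
print(round(100 * 69 / 82, 2))
```

The last line reproduces the 84.15% share of A/T mononucleotide repeats from the counts given in this section (69 of 82 loci).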

3.3. Comparison of Complete Chloroplast Genomes among Fagaceae Species

We compared the sequences of the nine cp genomes using mVISTA, with the cp genome of Q. tarokoensis as the reference (Figure 2). The results showed that the entire genome is well conserved across all species with the exception of F. engleriana. The SC regions have a substantially higher nucleotide diversity than the IRs, and more variation was found in the noncoding regions than in the coding regions. These patterns are consistent with the nucleotide variability (pi) values of the LSC, SSC and IRb, which are 0.004906, 0.007103 and 0.000729, respectively, among the six species (Q. bawanglingensis, Q. variabilis, Q. aliena var. acutiserrata, Q. aquifolioides, Q. baronii, and Q. tarokoensis); this information is graphically presented in Figure 3. Importantly, both the mVISTA analysis and the assessment of nucleotide variability showed that numerous divergence hotspot regions, such as rbcL-accD (pi: 0.02365, 0.02317), accD (pi: 0.02365), trnS-trnG (pi: 0.01865, 0.01802), ycf1 (pi: 0.01643, 0.01627), trnG-trnR (pi: 0.0173), trnK-rps16 (pi: 0.01627), ndhF (pi: 0.01619) and trnH-psbA (pi: 0.01548), are located entirely within the SC regions (Figure 2, Figure 3 and Figure 4). In addition, more variable sites are located in intergenic regions than in coding genes, which offers potential for the development of DNA barcodes for species identification and taxonomic studies of the genus Quercus.
The IRs are extremely conserved in Quercus (Figure 5), consistent with the observations shown in Figure 2, Figure 3 and Figure 4, but differ slightly from those of the other genera investigated in this study. The rps19 gene is located within the LSC, 10 bp from the LSC/IRb border, in all species with the exception of C. mollissima (0 bp). The trnH gene in the LSC is located 16 bp from the IRa/LSC border in all species except C. mollissima, in which the distance is 8 bp. At the LSC/IRb boundary, the rpl2 gene is located 62 bp from the LSC, whereas longer distances were found in C. mollissima (67 bp) and F. engleriana (65 bp). In Quercus species, the LSC/IR boundaries are highly conserved, whereas the IR/SSC borders are highly variable. The IR/SSC borders are generally located in variable sites of the ycf1 and ndhF genes. The portions of ycf1 on either side of the SSC/IRa junction, i.e., within the SSC and IRa regions, vary in length (Q. bawanglingensis: 4653 and 1038 bp; Q. tarokoensis: 4625 and 1064 bp; Q. aliena var. acutiserrata: 4615 and 1043 bp; Q. variabilis: 4620 and 1041 bp; Q. baronii: 4611 and 1047 bp; Q. aquifolioides: 4513 and 1057 bp; C. mollissima: 4623 and 1059 bp; L. balansae: 4626 and 828 bp; and F. engleriana: 4633 and 1049 bp). The ndhF gene, which is relevant for photosynthesis, is located 1 to 159 bp from the IRb/SSC junction.

3.4. Phylogenetic Analysis

With P. trichocarpa and T. cacao as the outgroups, a phylogenetic tree was generated using ML based on the above-described whole-cp genome data (Figure 6). The phylogenetic analysis resolved 29 nodes with bootstrap support values of 52–100, which generally strongly supports the hypothesis that the Fagaceae species form a single clade. F. engleriana is located at the top node as a sister to T. doichangensis with high support, whereas L. balansae and the Castanopsis species, which are closely related to group Cyclobalanopsis, are nested within Quercus. The clade formed by Quercus clearly includes group Quercus, whereas members of group Ilex cluster separately with group Cyclobalanopsis and with group Cerris. Q. bawanglingensis is located in one clade that includes several evergreen oaks. The phylogenetic tree also revealed that Q. bawanglingensis is a sister to Q. tarokoensis with a 100% bootstrap value.

4. Discussion

In general, the complete cp genome of Q. bawanglingensis has a strong resemblance to those of other Quercus species in the aspects of genome size and structure, GC content, genes and gene order, which illustrates that the cp genomes are conserved in Quercus [43,44,45,62,63]. Nonetheless, changes in the border of LSC/IRb and the nucleotide variability were detected, which are relatively common in plants [15,46,64]. The maximum difference in genome size among the nine Fagaceae species is 3055 bp, whereas the largest difference in the LSC region is 2981 bp, which could indicate that the divergence in the LSC length leads to variation in the size of the cp genomes based on IR contraction or expansion [31]. Differences in the four IR boundaries among species frequently appear during the process of cp genome evolution, which leads to further changes in the cp genome size. Hence, IR regions are used to explain size differences between cp genomes due to their contraction and expansion at the borders, even though they are the most conserved regions in cp genome sequences [64,65,66,67].
Higher nucleotide diversity has been found in SCs compared with IRs and in noncoding regions compared with coding regions, which is in accordance with the results found for other taxa [43,44,45,63,68], although exceptions have been identified [32,69]. A cp genome has a copy-dependent repair mechanism that ensures the uniformity and stability of the two IR regions in sequence and enhances the stability and conservation of the genome [70,71], which might explain the lower sequence divergence in the IRs compared with the LSC or SSC regions; in addition, under natural selection, coding regions are more conserved than non-coding regions [72]. In our study, both the mVISTA analysis and the nucleotide variability (pi) assessment showed that numerous divergence hotspot regions are primarily situated in the SCs of the cp genome and that more variable sites are located in intergenic regions than in coding genes; these regions can be directly utilized for the development of new molecular markers for research on Quercus species identification and taxonomy. Among these divergence hotspot regions, trnH-psbA has already been selected as a suitable barcode for plants [40,73], as have rbcL-accD, trnS-trnG [74], ndhF [40,75], ycf1 [69,76], accD [67,77], trnG-trnR [78] and trnK-rps16 [79]. In this study, the ycf1, ndhF and accD genes were found to be optimal genetic markers based on their high substitution variability, repeat sequence diversity and SSC/IR junction length variability. The accD gene, which encodes the acetyl-CoA carboxylase (ACCase) enzyme, is crucial for maintenance of the plastid compartment and for leaf development in tobacco [67] and might be considered a locus for obtaining insights into chloroplast genome evolution [77] in Quercus. As an NADH dehydrogenase gene, ndhF is frequently used in studies of plant taxonomy and evolution [79,80,81].
The ycf1 gene, which has the largest open reading frame, is crucial for the translocon at the inner envelope membrane of chloroplasts (TIC) complex, which is related to plant survival, because Tic214/Tic20 provides cps with access to external proteins [82]. The ycf1 gene is also important for examining the diversification of the cp genome in algae and other plants [83]. Further research is necessary to examine whether these divergence hotspot regions could be used for assessing the taxonomic evolution of Quercus or could be considered candidate DNA barcodes.
The observed GC content is generally consistent with the results of previous intensive studies [3,84,85,86,87], which confirms that the cp genomes of Fagaceae species are rich in adenine and thymine (AT). GC skew is considered an indicator of replication termini, replication origins, and leading and lagging strands [88,89,90] as well as a dominant factor in codon bias. Several studies [3,84,87,91] have suggested that high AT richness is the major reason for synonymous codons ending in A/U. This phenomenon might be shaped by natural selection and mutation during the process of evolution.
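The two quantities discussed in this paragraph can be computed directly from a sequence. The sketch below uses the standard definitions, GC content = (G + C) / (A + C + G + T) and GC skew = (G − C) / (G + C); the function name `gc_stats` and the toy sequence are hypothetical and are not taken from the study.

```python
def gc_stats(seq):
    """Return (GC content in %, GC skew) for a DNA sequence."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    total = g + c + seq.count("A") + seq.count("T")  # ignores N and gaps
    gc_content = 100 * (g + c) / total if total else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return gc_content, gc_skew

# Toy sequence (hypothetical): 3 A, 3 T, 3 G and 1 C.
print(gc_stats("ATGCGGATTA"))
```

Applied to the LSC base composition in Table A1 (G 16.90%, C 17.70%), the same skew formula gives approximately −0.023, i.e., a slight excess of C over G on the reported strand.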
cpSSRs, which are typically uniparentally inherited, have been used extensively in analyses of taxonomic status, phylogenetic relationships, the maternal structure of communities, diversity and differentiation [92,93,94]. SSR polymorphisms result from a mutational mechanism in which SSRs with a length of at least 10 bp are prone to slipped-strand mispairing [95]. We found 82 SSRs in the Q. bawanglingensis cp genome that were mostly distributed in the LSC (62, 75.61%) and intergenic spaces (64, 78.05%). The uneven distribution of cpSSRs might provide auxiliary information for selecting efficient molecular markers for phylogenetic and phylogeographical studies [38,96,97]. In addition, the majority of the mononucleotide and dinucleotide cpSSRs in the Q. bawanglingensis cp genome are composed of A and T, which might be related to the high AT richness of the nucleotide composition, similar to the results found for other cp genomes [39,43,46,98].
Previous studies on the origin time of Fagaceae have shown that fossils of T. doichangensis were the first to appear in the fossil record and that Cyclobalanopsis is closer than Quercus to the ancestral group in Fagaceae [99]. Based on the phylogenetic trees, F. engleriana and T. doichangensis are located in the basal phylogeny, and the evolutionary tree is consistent with the fossil record [99]. Q. bawanglingensis and Q. tarokoensis have a close relationship, and in accordance with their morphological features, both of these species belong to section Engleriana in group Ilex [73,100]. Importantly, the Quercus species were not shown to form a clade, similar to the findings in other research [44,45,46]. Group Ilex within group Cerris forms a Cerris-Ilex clade, which is identified by inferences from primarily chloroplast haplotypes between group Cerris and its sister group, Ilex [39]. A group comprising Heterobalanus (corresponding to group Ilex) and Cyclobalanopsis matches the traditional taxonomy, which formalized both Cyclobalanopsis and Ilex as one lineage [101]. Overall, the relationships among the other branches in Fagaceae are mostly consistent with those inferred from nuclear data [33,102].

5. Conclusions

In the present study, we successfully assembled the whole cp genome of the vulnerable oak tree Q. bawanglingensis using next-generation sequencing technology. In comparing the Q. bawanglingensis cp genome with those of previously published Quercus species from NCBI, we found that it was very similar in cp genome structure and gene content. Nevertheless, clearly heterogeneous sequence divergence was revealed in different regions among Quercus cp genomes. The divergence hotspot regions and abundant SSRs identified in the cp genome could be used for molecular marker development for further population genetics studies on whether and how natural populations have adapted to their local environments, to predict their responses to future habitat alterations and to establish adequate conservation strategies for this vulnerable species. The phylogenetic relationships of Q. bawanglingensis in Fagaceae were robustly resolved based on the cp genome data, strongly supporting the sister relationship between Q. bawanglingensis and Q. tarokoensis in the group Ilex lineage. Overall, the data obtained will contribute to further studies on the diversity, ecology, taxonomy, phylogenetic evolution and conservation of Chinese Quercus species.

Author Contributions

The experiments were conceived and designed by Z.-P.J. and J.-F.L.; X.L., E.-M.C., Y.-N.H., Y.W., and N.Y. were involved in the collection of the study materials. X.L. and E.-M.C. participated in the DNA extraction and data analyses. X.L. wrote and J.-F.L. revised the manuscript. All authors read and approved the final manuscript.


This study was funded by the Fundamental Research Funds for the Central Non-Profit Research Institution of CAF (CAFYBB2018ZB001).


The authors sincerely thank Mingzhi Li of Genepioneer Biotechnologies Co. Ltd., Nanjing, China for the assistance provided with this study. In addition, the authors sincerely thank the reviewers for their careful reading and helpful comments on this manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Figure A1. Location of Mt. Exianling, Hainan Island, China.
Table A1. Base composition of the Q. bawanglingensis chloroplast genome.
Region | CDS | tRNA Genes | rRNA Genes | A (%) | T (U) (%) | C (%) | G (%) | G + C (%)
LSC | 61 | 25 | | 32.00 | 33.40 | 17.70 | 16.90 | 34.60
SSC | 12 | 1 | | 34.40 | 34.70 | 16.30 | 14.60 | 30.90
Table A2. Detailed statistics on codon usage, diversification in synonymous codon usage, relative synonymous codon usage (RSCU) values and codon-anticodon recognition patterns of the Q. bawanglingensis chloroplast genome.
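Amino Acid | Codon | No. | RSCU | tRNA | Amino Acid | Codon | No. | RSCU | tRNA
Phe | UUU | 986 | 1.29 | | Glu | GAG | 354 | 0.5 |
Leu | CUU | 590 | 1.25 | | Ser | UCG | 192 | 0.57 |
Leu | CUC | 203 | 0.43 | | Ser | AGU | 394 | 1.17 |
Leu | CUG | 196 | 0.42 | | Pro | CCU | 408 | 1.47 |
Ile | AUU | 1137 | 1.45 | | Pro | CCC | 225 | 0.81 | trnP-GGG
Ile | AUA | 758 | 0.97 | | Pro | CCG | 163 | 0.59 |
Met | AUG | 618 | 1.00 | trnM-CAU, trnI-CAU | Thr | ACU | 535 | 1.59 |
Val | GUU | 509 | 1.42 | | Thr | ACC | 247 | 0.73 |
Val | GUG | 204 | 0.57 | | Ala | GCU | 632 | 1.80 |
Tyr | UAU | 798 | 1.58 | | Ala | GCC | 221 | 0.63 |
Ter | UAA | 47 | 1.64 | | Ala | GCG | 169 | 0.48 |
Ter | UAG | 21 | 0.73 | | Cys | UGU | 222 | 1.44 |
Ter | UGA | 18 | 0.63 | | Cys | UGC | 86 | 0.56 | trnC-GCA
His | CAU | 486 | 1.54 | | Trp | UGG | 463 | 1.00 | trnW-CCA
Gln | CAG | 215 | 0.45 | | Arg | CGA | 354 | 1.32 | trnR-ACG
Asn | AAU | 1012 | 1.54 | | Arg | CGG | 122 | 0.45 |
Asn | AAC | 302 | 0.46 | | Arg | AGA | 505 | 1.88 | trnR-UCU
Lys | AAA | 1070 | 1.47 | | Arg | AGG | 186 | 0.69 |
Asp | GAU | 872 | 1.61 | | Gly | GGC | 208 | 0.46 | trnG-GCC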
Table A3. List of annotated genes in the Q. bawanglingensis chloroplast genome.
Category for Genes | Group of Gene | Name of Gene
Photosynthesis related genes | Photosystem I | psaA, psaB, psaC, psaI, psaJ
Photosynthesis related genes | Photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ
Photosynthesis related genes | Cytochrome b/f complex | petA, petB^1, petD^1, petL, petG, petN
Photosynthesis related genes | ATP synthase | atpA, atpB, atpE, atpF^1, atpH, atpI
Photosynthesis related genes | Cytochrome c synthesis | ccsA
Photosynthesis related genes | Assembly/stability of photosystem | ycf3^2, ycf4
Photosynthesis related genes | NADPH dehydrogenase | ndhA^1, ndhB^1,d, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
Transcription and translation related genes | Transcription | rpoC1^1, rpoC2, rpoA, rpoB
Transcription and translation related genes | Ribosomal proteins | rps2, rps3, rps4, rps7^d, rps8, rps11, rps12^d, rps14, rps15, rps16^1, rps18, rps19
Transcription and translation related genes | Large subunit | rpl2^1, rpl14, rpl16^1, rpl20, rpl22, rpl23^d, rpl32, rpl33, rpl36
RNA genes | Ribosomal RNA | 4.5S rRNA^d, 5S rRNA^d, 16S rRNA^d, 23S rRNA^d
RNA genes | Transfer RNA | trnH-GUG, trnK-UUU^1, trnQ-UUG, trnS-GCU, trnG-GCC^1, trnR-UCU, trnC-GCA, trnD-GUC, trnY-GUA, trnE-UUC, trnT-GGU^d, trnM-CAU, trnS-UGA, trnG-UCC, trnfM-CAU, trnS-GGA, trnT-UGU, trnL-UAA^1, trnF-GAA, trnV-UAC^1, trnW-CCA, trnP-UGG, trnP-GGG, trnL-CAA^d, trnV-GAC^d, trnI-GAU^1,d, trnR-ACG^d, trnL-UAG, trnN-GUU^d, trnA-UGC^1,d, trnI-CAU^d
Other genes | RNA processing | matK
Other genes | Carbon metabolism | cemA
Other genes | Fatty acid synthesis | accD
Other genes | Translational initiation factor | infA
Genes of unknown function | Conserved reading frames | ycf1^d, ycf2^d
^1, genes containing only one intron; ^2, genes containing two introns; ^d, two gene copies in the IRs.
Table A4. The lengths of introns and exons in genes in the Q. bawanglingensis chloroplast genome.
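| Gene | Strand | Location | Exon1 (bp) | Exon2 (bp) | Intron1 (bp) | Exon3 (bp) | Intron2 (bp) |
|---|---|---|---|---|---|---|---|
| trnA-UGC | | IRB | 38 | 35 | 800 | | |
| rps12 | + | IRB | | 232 | 536 | 26 | |
| rps12 | - | IRA | | 231 | 537 | 30 | |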
Table A5. Repeated sequences of the Q. bawanglingensis chloroplast genome.
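| ID | Size (bp) | Repeat Start 1 | Type | Size (bp) | Repeat Start 2 | Mismatch (bp) | E-Value | Region | Gene |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 18 | 325 | F | 18 | 4926 | 0 | 1.07 × 10^-1 | LSC | |
| 2 | 21 | 6821 | R | 21 | 6821 | 0 | 1.67 × 10^-3 | LSC | |
| 3 | 19 | 6835 | R | 19 | 6835 | 0 | 2.67 × 10^-2 | LSC | |
| 4 | 18 | 7431 | R | 18 | 7431 | 0 | 1.07 × 10^-1 | LSC | |
| 5 | 18 | 8884 | R | 18 | 8884 | 0 | 1.07 × 10^-1 | LSC | |
| 6 | 18 | 9988 | R | 18 | 9988 | 0 | 1.07 × 10^-1 | LSC | |
| 7 | 31 | 11,852 | R | 31 | 11,852 | 0 | 1.59 × 10^-9 | LSC | |
| 8 | 22 | 30,370 | F | 22 | 30,388 | 0 | 4.16 × 10^-4 | LSC | |
| 9 | 20 | 10,290 | F | 20 | 31,747 | 0 | 6.66 × 10^-3 | LSC | |
| 10 | 19 | 8557 | R | 19 | 35,014 | 0 | 2.67 × 10^-2 | LSC | |
| 11 | 20 | 4925 | F | 20 | 36,722 | 0 | 6.66 × 10^-3 | LSC | |
| 12 | 18 | 325 | F | 18 | 36,723 | 0 | 1.07 × 10^-1 | LSC | |
| 13 | 21 | 9531 | F | 21 | 40,098 | 0 | 1.67 × 10^-3 | LSC | trnS-GCU, trnS-UGA |
| 14 | 20 | 40,206 | R | 20 | 40,206 | 0 | 6.66 × 10^-3 | LSC | |
| 15 | 22 | 11,376 | F | 22 | 41,438 | 0 | 4.16 × 10^-4 | LSC | trnG-GCC (exon), trnG-GCC |
| 16 | 18 | 43,688 | F | 18 | 45,912 | 0 | 1.07 × 10^-1 | LSC | psaB, psaA |
| 17 | 19 | 21,298 | R | 19 | 54,384 | 0 | 2.67 × 10^-2 | LSC | |
| 18 | 21 | 54,575 | F | 21 | 54,594 | 0 | 1.67 × 10^-3 | LSC | |
| 19 | 20 | 56,125 | R | 20 | 56,125 | 0 | 6.66 × 10^-3 | LSC | ndhC |
| 20 | 21 | 62,263 | R | 21 | 62,263 | 0 | 1.67 × 10^-3 | LSC | |
| 21 | 19 | 64,976 | R | 19 | 64,976 | 0 | 2.67 × 10^-2 | LSC | |
| 22 | 21 | 69,245 | R | 21 | 69,245 | 0 | 1.67 × 10^-3 | LSC | |
| 23 | 18 | 69,246 | R | 18 | 69,246 | 0 | 1.07 × 10^-1 | LSC | |
| 24 | 18 | 69,246 | F | 18 | 69,247 | 0 | 1.07 × 10^-1 | LSC | |
| 25 | 18 | 69,247 | R | 18 | 69,247 | 0 | 1.07 × 10^-1 | LSC | |
| 26 | 19 | 71,499 | R | 19 | 71,499 | 0 | 2.67 × 10^-2 | LSC | |
| 27 | 19 | 72,775 | R | 19 | 72,775 | 0 | 2.67 × 10^-2 | LSC | |
| 28 | 18 | 18,660 | F | 18 | 76,843 | 0 | 1.07 × 10^-1 | LSC | clpP |
| 29 | 18 | 52,390 | F | 18 | 87,369 | 0 | 1.07 × 10^-1 | LSC | |
| 30 | 20 | 91,234 | F | 20 | 91,254 | 0 | 6.66 × 10^-3 | IRB | rpl2 |
| 31 | 20 | 105,557 | F | 20 | 105,575 | 0 | 6.66 × 10^-3 | IRB | |
| 32 | 23 | 113,771 | F | 23 | 113,802 | 0 | 1.04 × 10^-4 | IRB | |
| 33 | 18 | 69,461 | F | 18 | 116,760 | 0 | 1.07 × 10^-1 | LSC, SSC | ndhF |
| 34 | 21 | 117,268 | R | 21 | 117,268 | 0 | 1.67 × 10^-3 | SSC | ndhF |
| 35 | 18 | 66,388 | F | 18 | 118,801 | 0 | 1.07 × 10^-1 | LSC, SSC | |
| 36 | 18 | 4934 | F | 18 | 118,916 | 0 | 1.07 × 10^-1 | LSC, SSC | |
| 37 | 19 | 18,660 | R | 19 | 119,064 | 0 | 2.67 × 10^-2 | LSC, SSC | |
| 38 | 23 | 119,066 | R | 23 | 119,066 | 0 | 1.04 × 10^-4 | SSC | |
| 39 | 19 | 10,289 | F | 19 | 126,141 | 0 | 2.67 × 10^-2 | LSC, SSC | |
| 40 | 18 | 31,747 | F | 18 | 126,142 | 0 | 1.07 × 10^-1 | LSC, SSC | |
| 41 | 19 | 73,588 | F | 19 | 127,650 | 0 | 2.67 × 10^-2 | LSC, SSC | ndhA |
| 42 | 25 | 127,669 | F | 25 | 127,693 | 0 | 6.51 × 10^-6 | SSC | ndhA (intron) |
| 43 | 20 | 119,064 | R | 20 | 130,690 | 0 | 6.66 × 10^-3 | SSC | |
| 44 | 19 | 18,660 | F | 19 | 130,691 | 0 | 2.67 × 10^-2 | LSC, SSC | |
| 45 | 18 | 10,551 | F | 18 | 133,570 | 0 | 1.07 × 10^-1 | LSC, SSC | ycf1 |
| 46 | 24 | 116,026 | F | 24 | 135,972 | 0 | 2.60 × 10^-5 | IRB, IRA | ycf1 |
| 47 | 23 | 138,197 | F | 23 | 138,228 | 0 | 1.04 × 10^-4 | IRA | |
| 48 | 20 | 57,490 | F | 20 | 142,313 | 0 | 6.66 × 10^-3 | LSC, IRA | trnV-UAC, trnA-UGC |
| 49 | 20 | 146,427 | F | 20 | 146,445 | 0 | 6.66 × 10^-3 | IRA | |
| 50 | 20 | 160,748 | F | 20 | 160,768 | 0 | 6.66 × 10^-3 | IRA | rpl2 |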
Table A6. Simple sequence repeats (SSRs) in the Q. bawanglingensis chloroplast genome.
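| ID | Repeat Motif | Length (bp) | Start | End | Region | Gene |
|---|---|---|---|---|---|---|
| 1 | (A)11 | 11 | 333 | 343 | LSC | |
| 2 | (A)10 | 10 | 1796 | 1805 | LSC | |
| 4 | (C)12(A)11 | 23 | 4426 | 4448 | LSC | |
| 5 | (T)13 | 13 | 4690 | 4702 | LSC | |
| 6 | (A)11 | 11 | 4934 | 4944 | LSC | |
| 7 | (A)11 | 11 | 5134 | 5144 | LSC | |
| 8 | (T)11 | 11 | 6967 | 6977 | LSC | |
| 9 | (A)10 | 10 | 8139 | 8148 | LSC | |
| 10 | (A)16 | 16 | 8555 | 8570 | LSC | |
| 11 | (A)10 | 10 | 8889 | 8898 | LSC | |
| 12 | (A)11 | 11 | 10,153 | 10,163 | LSC | |
| 13 | (T)11 | 11 | 10,293 | 10,303 | LSC | |
| 15 | (A)11 | 11 | 13,552 | 13,562 | LSC | |
| 17 | (T)10 | 10 | 15,319 | 15,328 | LSC | |
| 18 | (A)12 | 12 | 18,667 | 18,678 | LSC | |
| 25 | (T)10 | 10 | 29,642 | 29,651 | LSC | |
| 26 | (C)13 | 13 | 30,442 | 30,454 | LSC | |
| 27 | (T)11 | 11 | 31,750 | 31,760 | LSC | |
| 28 | (A)10 | 10 | 32,113 | 32,122 | LSC | |
| 29 | (A)12 | 12 | 34,229 | 34,240 | LSC | |
| 30 | (A)13 | 13 | 35,021 | 35,033 | LSC | |
| 31 | (A)11 | 11 | 36,731 | 36,741 | LSC | |
| 32 | (A)11 | 11 | 39,921 | 39,931 | LSC | |
| 33 | (AT)6 | 12 | 40,068 | 40,079 | LSC | |
| 34 | (T)14 | 14 | 40,210 | 40,223 | LSC | |
| 35 | (A)13 | 13 | 40,365 | 40,377 | LSC | |
| 36 | (A)10 | 10 | 40,882 | 40,891 | LSC | |
| 37 | (A)10(A)10 | 89 | 52,317 | 52,405 | LSC | |
| 38 | (T)11 | 11 | 53,423 | 53,433 | LSC | |
| 39 | (A)10 | 10 | 53,932 | 53,941 | LSC | |
| 40 | (T)10 | 10 | 54,316 | 54,325 | LSC | |
| 41 | (A)10 | 10 | 55,210 | 55,219 | LSC | |
| 42 | (T)10 | 10 | 59,813 | 59,822 | LSC | atpB |
| 43 | (T)11 | 11 | 60,285 | 60,295 | LSC | |
| 45 | (T)11 | 11 | 64,317 | 64,327 | LSC | accD |
| 46 | (A)10 | 10 | 64,492 | 64,501 | LSC | |
| 47 | (AT)7 | 14 | 64,795 | 64,808 | LSC | |
| 48 | (T)11 | 11 | 65,170 | 65,180 | LSC | |
| 49 | (T)10 | 10 | 66,211 | 66,220 | LSC | |
| 50 | (T)14 | 14 | 66,389 | 66,402 | LSC | |
| 51 | (T)10 | 10 | 68,836 | 68,845 | LSC | |
| 52 | (A)19(AT)6 | 86 | 69,247 | 69,332 | LSC | |
| 53 | (C)11 | 11 | 70,943 | 70,953 | LSC | |
| 54 | (T)13 | 13 | 73,588 | 73,600 | LSC | |
| 56 | (A)14(A)13 | 34 | 76,829 | 76,862 | LSC | clpP |
| 58 | (TA)7 | 14 | 83,134 | 83,147 | LSC | petD |
| 59 | (A)10 | 10 | 85,981 | 85,990 | LSC | |
| 66 | (T)15 | 15 | 118,801 | 118,815 | SSC | |
| 67 | (A)10 | 10 | 118,917 | 118,926 | SSC | |
| 68 | (A)12t(A)11 | 24 | 119,066 | 119,089 | SSC | |
| 69 | (T)14 | 14 | 119,222 | 119,235 | SSC | |
| 70 | (A)12 | 12 | 120,003 | 120,014 | SSC | |
| 71 | (T)12 | 12 | 122,398 | 122,409 | SSC | |
| 72 | (A)10 | 10 | 122,745 | 122,754 | SSC | ndhD |
| 73 | (A)11 | 11 | 124,071 | 124,081 | SSC | |
| 74 | (T)10 | 10 | 126,004 | 126,013 | SSC | |
| 75 | (T)11 | 11 | 126,145 | 126,155 | SSC | |
| 76 | (A)11(T)12(A)11 | 77 | 127,622 | 127,722 | SSC | ndhA |
| 77 | (T)10 | 10 | 130,474 | 130,483 | SSC | |
| 78 | (A)12 | 12 | 130,698 | 130,709 | SSC | |
| 79 | (T)10 | 10 | 133,670 | 133,679 | SSC | ycf1 |
| 80 | (T)13 | 13 | 134,247 | 134,259 | SSC | ycf1 |
| 81 | (A)10 | 10 | 137,718 | 137,727 | IRA | |
| 82 | (A)10 | 10 | 161,371 | 161,380 | IRA | |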
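To illustrate how loci such as those in Table A6 can be located, the following is a minimal sketch of a regular-expression scan for perfect mono- and dinucleotide repeats. It is not the software used in the study. The thresholds (at least 10 repeats for mononucleotides, at least 6 for dinucleotides) are assumptions chosen to match the shortest entries visible in the table, the function name `find_ssrs` is hypothetical, and compound or interrupted SSRs are not handled.

```python
import re

# Assumed minimum repeat counts per motif length (not taken from the paper).
MIN_REPEATS = {1: 10, 2: 6}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (motif, repeat_count, start, end) tuples, 1-based inclusive coordinates."""
    seq = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # Group 2 captures the motif; \2 requires at least (min_rep - 1) further copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            if motif_len > 1 and len(set(motif)) == 1:
                continue  # e.g. "AA" runs are mononucleotide repeats, not dinucleotide
            count = len(m.group(1)) // motif_len
            hits.append((motif, count, m.start() + 1, m.end()))
    return sorted(hits, key=lambda h: h[2])


# Toy sequence: an (A)11 run and an (AT)7 run separated by short spacers.
toy = "GC" + "A" * 11 + "CGC" + "AT" * 7 + "GG"
ssrs = find_ssrs(toy)
print(ssrs)
```

With 1-based inclusive coordinates, the length of each locus equals end minus start plus one, which is the relationship that holds for the simple repeats listed in Table A6.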


  1. Birky, C.W.; Maruyama, T.; Fuerst, P. An approach to population and evolutionary genetic theory for genes in mitochondria and chloroplasts, and some results. Genetics 1983, 103, 513–527. [Google Scholar] [PubMed]
  2. Sugiura, M. The chloroplast genome. In 10 Years Plant Molecular Biology; Springer: Dordrecht, The Netherlands, 1992; pp. 149–168. [Google Scholar]
  3. Tangphatsornruang, S.; Sangsrakru, D.; Chanprasert, J.; Uthaipaisanwong, P.; Yoocha, T.; Jomchai, N.; Tragoonrung, S. The chloroplast genome sequence of mungbean (Vigna radiata) determined by high-throughput pyrosequencing: Structural organization and phylogenetic relationships. DNA Res. 2009, 17, 11–22. [Google Scholar] [CrossRef] [PubMed]
  4. Shinozaki, K.; Ohme, M.; Tanaka, M.; Wakasugi, T.; Hayashida, N.; Matsubayashi, T.; Zaita, N.; Chunwongse, J.; Obokata, J.; Yamaguchi-Shinozaki, K. The complete nucleotide sequence of the tobacco chloroplast genome: Its gene organization and expression. EMBO J. 1986, 5, 2043–2049. [Google Scholar] [CrossRef]
  5. Zhang, T.; Fang, Y.; Wang, X.; Deng, X.; Zhang, X.; Hu, S.; Yu, J. The complete chloroplast and mitochondrial genome sequences of Boea hygrometrica: Insights into the evolution of plant organellar genomes. PLoS ONE 2012, 7, e30531. [Google Scholar] [CrossRef] [PubMed]
  6. Cosner, M.E.; Raubeson, L.A.; Jansen, R.K. Chloroplast DNA rearrangements in Campanulaceae: Phylogenetic utility of highly rearranged genomes. BMC Evol. Biol. 2004, 4, 27. [Google Scholar] [CrossRef] [PubMed]
  7. Nock, C.J.; Waters, D.L.; Edwards, M.A.; Bowen, S.G.; Rice, N.; Cordeiro, G.M.; Henry, R.J. Chloroplast genome sequences from total DNA for plant identification. Plant Biotechnol. J. 2011, 9, 328–333. [Google Scholar] [CrossRef]
  8. Reith, M. Complete uncleotide sequence of the Porphyra purpurea chloroplast genome. Plant Mol. Biol. Rep. 1995, 13, 327–332. [Google Scholar] [CrossRef]
  9. Wicke, S.; Schneeweiss, G.M.; Müller, K.F.; Quandt, D. The evolution of the plastid chromosome in land plants: Gene content, gene order, gene function. Plant Mol. Biol. 2011, 76, 273–297. [Google Scholar] [CrossRef]
  10. Samson, N.; Bausher, M.G.; Lee, S.B.; Jansen, R.K.; Daniell, H. The complete nucleotide sequence of the coffee (Coffea arabica L.) chloroplast genome: Organization and implications for biotechnology and phylogenetic relationships amongst angiosperms. Plant Biotechnol. J. 2007, 5, 339–353. [Google Scholar] [CrossRef]
  11. Moore, M.J.; Bell, C.D.; Soltis, P.S.; Soltis, D.E. Using plastid genome-scale data to resolve enigmatic relationships among basal angiosperms. Proc. Natl. Acad. Sci. USA 2007, 104, 19363–19368. [Google Scholar] [CrossRef][Green Version]
  12. Jansen, R.K.; Cai, Z.; Raubeson, L.A.; Daniell, H.; Depamphilis, C.W.; Leebens-Mack, J.; Müller, K.F.; Guisinger-Bellian, M.; Haberle, R.C.; Hansen, A.K. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc. Natl. Acad. Sci. USA 2007, 104, 19369–19374. [Google Scholar] [CrossRef] [PubMed][Green Version]
  13. Lin, C.-P.; Huang, J.-P.; Wu, C.-S.; Hsu, C.-Y.; Chaw, S.-M. Comparative chloroplast genomics reveals the evolution of Pinaceae genera and subfamilies. Genome Biol. Evol. 2010, 2, 504–517. [Google Scholar] [CrossRef] [PubMed]
  14. Parks, M.; Cronn, R.; Liston, A. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes. BMC Biol. 2009, 7, 84. [Google Scholar] [CrossRef] [PubMed]
  15. Liu, L.; Wang, Y.; He, P.; Li, P.; Lee, J.; Soltis, D.E.; Fu, C. Chloroplast genome analyses and genomic resource development for epilithic sister genera Oresitrophe and Mukdenia (Saxifragaceae), using genome skimming data. BMC Genom. 2018, 19, 235. [Google Scholar] [CrossRef] [PubMed]
  16. Li, Z.-H.; Ma, X.; Wang, D.-Y.; Li, Y.-X.; Wang, C.-W.; Jin, X.-H. Evolution of plastid genomes of Holcoglossum (Orchidaceae) with recent radiation. BMC Evol. Biol. 2019, 19, 63. [Google Scholar] [CrossRef]
  17. Haig, S.M. Molecular contributions to conservation. Ecology 1998, 79, 413–425. [Google Scholar] [CrossRef]
  18. Juchum, F.; Leal, J.; Santos, L.; Almeida, M.; Ahnert, D.; Corrêa, R. Evaluation of genetic diversity in a natural rosewood population (Dalbergia nigra Vell. Allemão ex Benth.) using RAPD markers. Genet. Mol. Res. 2007, 6, 543–553. [Google Scholar]
  19. Lira, C.F.; Cardoso, S.R.S.; Ferreira, P.C.G.; Cardoso, M.A.; Provan, J. Long-term population isolation in the endangered tropical tree species Caesalpinia echinata Lam. revealed by chloroplast microsatellites. Mol. Ecol. 2003, 12, 3219–3225. [Google Scholar] [CrossRef]
  20. McCauley, D.E. The use of chloroplast DNA polymorphism in studies of gene flow in plants. Trends Ecol. Evol. 1995, 10, 198–202. [Google Scholar] [CrossRef]
  21. Ennos, R.A. Using organelle markers to elucidate the history, ecology and evolution of plant populations. Mol. Syst. Plant Evol. 1999, 1–19. [Google Scholar] [CrossRef]
  22. Gregory, T.R. DNA barcoding does not compete with taxonomy. Nature 2005, 434, 1067. [Google Scholar] [CrossRef] [PubMed]
  23. Qian, W.; Qiu-Shi, Y.; Jian-Quan, L. Are nuclear loci ideal for barcoding plants? A case study of genetic delimitation of two sister species using multiple loci and multiple intraspecific individuals. J. Syst. Evol. 2011, 49, 182–188. [Google Scholar]
  24. Barrett, C.F.; Davis, J.I.; Leebens-Mack, J.; Conran, J.G.; Stevenson, D.W. Plastid genomes and deep relationships among the commelinid monocot angiosperms. Cladistics 2013, 29, 65–87. [Google Scholar] [CrossRef]
  25. Li, X.; Yang, Y.; Henry, R.J.; Rossetto, M.; Wang, Y.; Chen, S. Plant DNA barcoding: From gene to genome. Biol. Rev. 2015, 90, 157–166. [Google Scholar] [CrossRef] [PubMed]
  26. Nixon, K.C.; Crepet, W.L. Trigonobalanus (Fagaceae): Taxonomic status and phylogenetic relationships. Am. J. Bot. 1989, 76, 828–841. [Google Scholar] [CrossRef]
  27. Nixon, K.C. Infrageneric classification of Quercus (Fagaceae) and typification of sectional names. In Annales des Sciences Forestières; EDP Sciences: Les Ulis, France, 1993. [Google Scholar]
  28. Hubert, F.; Grimm, G.W.; Jousselin, E.; Berry, V.; Franc, A.; Kremer, A. Multiple nuclear genes stabilize the phylogenetic backbone of the genus Quercus. Syst. Biodivers. 2014, 12, 405–423. [Google Scholar] [CrossRef]
  29. Ørsted, A.S. Bidrag til kundskab om Egefamilien i Nutid og Fortid; Mathematisk-naturvidenskabelig Klass: Skrifter Udgivne af Videnskabs-Selskabet i Christiana; Bianco Lunos Bogtr.: Copenhagen, Denmark, 1871. [Google Scholar]
  30. Denk, T.; Grimm, G.W.; Manos, P.S.; Deng, M.; Hipp, A.L. An updated infrageneric classification of the oaks: Review of previous taxonomic schemes and synthesis of evolutionary patterns. In Oaks Physiological Ecology. Exploring the Functional Diversity of Genus Quercus L.; Springer: Cham, Switzerland, 2017. [Google Scholar]
  31. Kim, K.; Lee, S.-C.; Lee, J.; Yu, Y.; Yang, K.; Choi, B.-S.; Koh, H.-J.; Waminal, N.E.; Choi, H.-I.; Kim, N.-H. Complete chloroplast and ribosomal sequences for 30 accessions elucidate evolution of Oryza AA genome species. Sci. Rep. 2015, 5, 15655. [Google Scholar] [CrossRef] [PubMed]
  32. Yang, J.; Yue, M.; Niu, C.; Ma, X.-F.; Li, Z.-H. Comparative analysis of the complete chloroplast genome of four endangered herbals of Notopterygium. Genes 2017, 8, 124. [Google Scholar] [CrossRef] [PubMed]
  33. Oh, S.-H.; Manos, P.S. Molecular phylogenetics and cupule evolution in Fagaceae as inferred from nuclear CRABS CLAW sequences. Taxon 2008, 57, 434–451. [Google Scholar]
  34. Pelser, P.B.; Kennedy, A.H.; Tepe, E.J.; Shidler, J.B.; Nordenstam, B.; Kadereit, J.W.; Watson, L.E. Patterns and causes of incongruence between plastid and nuclear Senecioneae (Asteraceae) phylogenies. Am. J. Bot. 2010, 97, 856–873. [Google Scholar] [CrossRef] [PubMed][Green Version]
  35. Pérez-Escobar, O.A.; Balbuena, J.A.; Gottschling, M. Rumbling orchids: How to assess divergent evolution between chloroplast endosymbionts and the nuclear host. Syst. Biol. 2015, 65, 51–65. [Google Scholar] [CrossRef] [PubMed]
  36. Hipp, A.L.; Eaton, D.A.; Cavender-Bares, J.; Fitzek, E.; Nipper, R.; Manos, P.S. A framework phylogeny of the American oak clade based on sequenced RAD data. PLoS ONE 2014, 9, e93975. [Google Scholar] [CrossRef] [PubMed]
  37. McVay, J.D.; Hipp, A.L.; Manos, P.S. A genetic legacy of introgression confounds phylogeny and biogeography in oaks. Proc. R. Soc. B Biol. Sci. 2017, 284, 20170300. [Google Scholar] [CrossRef] [PubMed][Green Version]
  38. Pham, K.K.; Hipp, A.L.; Manos, P.S.; Cronn, R.C. A time and a place for everything: Phylogenetic history and geography as joint predictors of oak plastome phylogeny. Genome 2017, 60, 720–732. [Google Scholar] [CrossRef] [PubMed]
  39. Simeone, M.C.; Cardoni, S.; Piredda, R.; Imperatori, F.; Avishai, M.; Grimm, G.W.; Denk, T. Comparative systematics and phylogeography of Quercus Section Cerris in western Eurasia: Inferences from plastid and nuclear DNA variation. PeerJ 2018, 6, e5793. [Google Scholar] [CrossRef] [PubMed]
  40. Yang, J.; Vázquez, L.; Chen, X.; Li, H.; Zhang, H.; Liu, Z.; Zhao, G. Development of chloroplast and nuclear DNA markers for Chinese oaks (Quercus subgenus Quercus) and assessment of their utility as DNA barcodes. Front. Plant Sci. 2017, 8, 816. [Google Scholar] [CrossRef]
  41. Qin, H.; Yang, Y.; Dong, S.; He, Q.; Jia, Y.; Zhao, L.; Yu, S.; Liu, H.; Liu, B.; Yan, Y. Threatened species list of China’s higher plants. Biodivers. Sci. 2017, 25, 744. [Google Scholar] [CrossRef]
  42. Deng, M.; Jiang, X.-L.; Song, Y.-G.; Coombes, A.; Yang, X.-R.; Xiong, Y.-S.; Li, Q.-S. Leaf epidermal features of Quercus Group Ilex (Fagaceae) and their application to species identification. Rev. Palaeobot. Palynol. 2017, 237, 10–36. [Google Scholar] [CrossRef]
  43. Yang, Y.; Hu, Y.; Ren, T.; Sun, J.; Zhao, G. Remarkably conserved plastid genomes of Quercus group Cerris in China: Comparative and phylogenetic analyses. Nord. J. Bot. 2018, 36, e01921. [Google Scholar] [CrossRef]
  44. Yang, Y.; Zhou, T.; Duan, D.; Yang, J.; Feng, L.; Zhao, G. Comparative analysis of the complete chloroplast genomes of five Quercus species. Front. Plant Sci. 2016, 7, 959. [Google Scholar] [CrossRef]
  45. Yang, Y.; Zhu, J.; Feng, L.; Zhou, T.; Bai, G.; Yang, J.; Zhao, G. Plastid genome comparative and phylogenetic analyses of the key genera in Fagaceae: Highlighting the effect of codon composition bias in phylogenetic inference. Front. Plant Sci. 2018, 9, 82. [Google Scholar] [CrossRef] [PubMed]
  46. Li, X.; Li, Y.; Zang, M.; Li, M.; Fang, Y. Complete chloroplast genome sequence and phylogenetic analysis of Quercus acutissima. Int. J. Mol. Sci. 2018, 19, 2443. [Google Scholar] [CrossRef] [PubMed]
  47. Zhang, R.; Qin, X.; Chen, H.; Chan, B.P.L.; Xing, F.; Xu, Z. Phytogeography and floristic affinities of the limestone flora of Mt. Exianling, Hainan Island, China. Bot. Rev. 2017, 83, 38–58. [Google Scholar] [CrossRef]
  48. Doyle, J.J. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem. Bull. 1987, 19, 11–15. [Google Scholar]
  49. Dierckxsens, N.; Mardulyn, P.; Smits, G. NOVOPlasty: De novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 2016, 45, e18. [Google Scholar]
  50. Liu, C.; Shi, L.; Zhu, Y.; Chen, H.; Zhang, J.; Lin, X.; Guan, X. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences. BMC Genom. 2012, 13, 715. [Google Scholar] [CrossRef]
  51. Wyman, S.K.; Jansen, R.K.; Boore, J.L. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 2004, 20, 3252–3255. [Google Scholar] [CrossRef][Green Version]
  52. Schattner, P.; Brooks, A.N.; Lowe, T.M. The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res. 2005, 33, W686–W689. [Google Scholar] [CrossRef]
  53. Lohse, M.; Drechsel, O.; Bock, R. Organellar Genome DRAW (OGDRAW): A tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Curr. Genet. 2007, 52, 267–274. [Google Scholar] [CrossRef]
  54. Mudunuri, S.B.; Nagarajaram, H.A. IMEx: Imperfect microsatellite extractor. Bioinformatics 2007, 23, 1181–1187. [Google Scholar] [CrossRef]
  55. Kurtz, S.; Choudhuri, J.V.; Ohlebusch, E.; Schleiermacher, C.; Stoye, J.; Giegerich, R. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001, 29, 4633–4642. [Google Scholar] [CrossRef] [PubMed]
  56. Katoh, K.; Kuma, K.-I.; Toh, H.; Miyata, T. MAFFT version 5: Improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005, 33, 511–518. [Google Scholar] [CrossRef] [PubMed]
  57. Mayor, C.; Brudno, M.; Schwartz, J.R.; Poliakov, A.; Rubin, E.M.; Frazer, K.A.; Pachter, L.S.; Dubchak, I. VISTA: Visualizing global DNA sequence alignments of arbitrary length. Bioinformatics 2000, 16, 1046–1047. [Google Scholar] [CrossRef] [PubMed]
  58. Frazer, K.A.; Pachter, L.; Poliakov, A.; Rubin, E.M.; Dubchak, I. VISTA: Computational tools for comparative genomics. Nucleic Acids Res. 2004, 32, W273–W279. [Google Scholar] [CrossRef] [PubMed]
  59. Librado, P.; Rozas, J. DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinformatics 2009, 25, 1451–1452. [Google Scholar] [CrossRef] [PubMed]
  60. Hall, T.A. BioEdit: A user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 1999, 41, 95–98. [Google Scholar]
  61. Price, M.N.; Dehal, P.S.; Arkin, A.P. FastTree 2-approximately maximum-likelihood trees for large alignments. PLoS ONE 2010, 5, e9490. [Google Scholar] [CrossRef]
  62. Hu, H.-L.; Zhang, J.-Y.; Li, Y.-P.; Xie, L.; Chen, D.-B.; Li, Q.; Liu, Y.-Q.; Hui, S.-R.; Qin, L. The complete chloroplast genome of the daimyo oak, Quercus dentata Thunb. Conserv. Genet. Resour. 2018, 1–3. [Google Scholar] [CrossRef]
  63. Zhang, X.; Hu, Y.; Liu, M.; Lang, T. Optimization of Assembly Pipeline may Improve the Sequence of the Chloroplast Genome in Quercus spinosa. Sci. Rep. 2018, 8, 8906. [Google Scholar] [CrossRef]
  64. Wang, R.-J.; Cheng, C.-L.; Chang, C.-C.; Wu, C.-L.; Su, T.-M.; Chaw, S.-M. Dynamics and evolution of the inverted repeat-large single copy junctions in the chloroplast genomes of monocots. BMC Evol. Biol. 2008, 8, 36. [Google Scholar] [CrossRef]
  65. Yao, X.; Tang, P.; Li, Z.; Li, D.; Liu, Y.; Huang, H. The first complete chloroplast genome sequences in Actinidiaceae: Genome structure and comparative analysis. PLoS ONE 2015, 10, e0129347. [Google Scholar] [CrossRef] [PubMed]
  66. Raubeson, L.A.; Peery, R.; Chumley, T.W.; Dziubek, C.; Fourcade, H.M.; Boore, J.L.; Jansen, R.K. Comparative chloroplast genomics: Analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus. BMC Genom. 2007, 8, 174. [Google Scholar] [CrossRef] [PubMed]
  67. Kode, V.; Mudd, E.A.; Iamtham, S.; Day, A. The tobacco plastid accD gene is essential and is required for leaf development. Plant J. 2005, 44, 237–244. [Google Scholar] [CrossRef] [PubMed]
  68. Nguyen, P.A.T.; Kim, J.S.; Kim, J.-H. The complete chloroplast genome of colchicine plants (Colchicum autumnale L. and Gloriosa superba L.) and its application for identifying the genus. Planta 2015, 242, 223–237. [Google Scholar] [CrossRef] [PubMed]
  69. Firetti, F.; Zuntini, A.R.; Gaiarsa, J.W.; Oliveira, R.S.; Lohmann, L.G.; Van Sluys, M.A. Complete chloroplast genome sequences contribute to plant species delimitation: A case study of the Anemopaegma species complex. Am. J. Bot. 2017, 104, 1493–1509. [Google Scholar] [CrossRef] [PubMed]
  70. Perry, A.S.; Wolfe, K.H. Nucleotide substitution rates in legume chloroplast DNA depend on the presence of the inverted repeat. J. Mol. Evol. 2002, 55, 501–508. [Google Scholar] [CrossRef]
  71. Khakhlova, O.; Bock, R. Elimination of deleterious mutations in plastid genomes by gene conversion. Plant J. 2006, 46, 85–94. [Google Scholar] [CrossRef] [PubMed]
  72. Shaw, J.; Lickey, E.B.; Schilling, E.E.; Small, R.L. Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in angiosperms: The tortoise and the hare III. Am. J. Bot. 2007, 94, 275–288. [Google Scholar] [CrossRef] [PubMed]
  73. Hollingsworth, P.M.; Forrest, L.L.; Spouge, J.L.; Hajibabaei, M.; Ratnasingham, S.; van der Bank, M.; Chase, M.W.; Cowan, R.S.; Erickson, D.L.; Fazekas, A.J.; et al. A DNA barcode for land plants. Proc. Natl. Acad. Sci. USA 2009, 106, 12794–12797. [Google Scholar] [CrossRef][Green Version]
  74. Shaw, J.; Shafer, H.L.; Leonard, O.R.; Kovach, M.J.; Schorr, M.; Morris, A.B. Chloroplast DNA sequence utility for the lowest phylogenetic and phylogeographic inferences in angiosperms: The tortoise and the hare IV. Am. J. Bot. 2014, 101, 1987–2004. [Google Scholar] [CrossRef] [PubMed][Green Version]
  75. Yang, Z.; Zhao, T.; Ma, Q.; Liang, L.; Wang, G. Comparative genomics and phylogenetic analysis revealed the chloroplast genome variation and interspecific relationships of Corylus (Betulaceae) Species. Front. Plant Sci. 2018, 9, 927. [Google Scholar] [CrossRef]
  76. Dong, W.; Xu, C.; Li, C.; Sun, J.; Zuo, Y.; Shi, S.; Cheng, T.; Guo, J.; Zhou, S. ycf1, the most promising plastid DNA barcode of land plants. Sci. Rep. 2015, 5, 8348. [Google Scholar] [CrossRef] [PubMed]
  77. Li, J.; Su, Y.; Wang, T. The Repeat Sequences and Elevated Substitution Rates of the Chloroplast accD Gene in Cupressophytes. Front. Plant Sci. 2018, 9, 533. [Google Scholar] [CrossRef] [PubMed]
  78. Nagalingum, N.S.; Schneider, H.; Pryer, K.M. Molecular phylogenetic relationships and morphological evolution in the heterosporous fern genus Marsilea. Syst. Bot. 2007, 32, 16–25. [Google Scholar] [CrossRef]
  79. Zecca, G.; Abbott, J.R.; Sun, W.-B.; Spada, A.; Sala, F.; Grassi, F. The timing and the mode of evolution of wild grapes (Vitis). Mol. Phylogenet. Evol. 2012, 62, 736–747. [Google Scholar] [CrossRef] [PubMed]
  80. Díaz, J.G.; Bauters, K.; Xanthos, M.; Larridon, I. Scleria diversity in Madagascar: Evolutionary links to mainland Africa. Royal Botanic Gardens, Kew 2017. [Google Scholar]
  81. Moura, M.N.; Santos-Silva, F.; Gomes-da-Silva, J.; de Almeida, J.P.P.; Forzza, R.C. Between Spines and Molecules: A Total Evidence Phylogeny of the Brazilian Endemic Genus Encholirium (Pitcairnioideae, Bromeliaceae). Syst. Bot. 2019. [Google Scholar] [CrossRef]
  82. Kikuchi, S.; Bédard, J.; Hirano, M.; Hirabayashi, Y.; Oishi, M.; Imai, M.; Takase, M.; Ide, T.; Nakai, M. Uncovering the protein translocon at the chloroplast inner envelope membrane. Science 2013, 339, 571–574. [Google Scholar] [CrossRef]
  83. De Cambiaire, J.-C.; Otis, C.; Lemieux, C.; Turmel, M. The complete chloroplast genome sequence of the chlorophycean green alga Scenedesmus obliquus reveals a compact gene organization and a biased distribution of genes on the two DNA strands. BMC Evol. Biol. 2006, 6, 37. [Google Scholar] [CrossRef]
  84. Clegg, M.T.; Gaut, B.S.; Learn, G.H.; Morton, B.R. Rates and patterns of chloroplast DNA evolution. Proc. Natl. Acad. Sci. USA 1994, 91, 6795–6801. [Google Scholar] [CrossRef]
  85. Guo, S.; Guo, L.; Zhao, W.; Xu, J.; Li, Y.; Zhang, X.; Shen, X.; Wu, M.; Hou, X. Complete chloroplast genome sequence and phylogenetic analysis of Paeonia ostii. Molecules 2018, 23, 246. [Google Scholar] [CrossRef] [PubMed]
  86. Kuang, D.-Y.; Wu, H.; Wang, Y.-L.; Gao, L.-M.; Zhang, S.-Z.; Lu, L. Complete chloroplast genome sequence of Magnolia kwangsiensis (Magnoliaceae): Implication for DNA barcoding and population genetics. Genome 2011, 54, 663–673. [Google Scholar] [CrossRef] [PubMed]
  87. Shimda, H.; Sugiuro, M. Fine structural features of the chloroplast genome: Comparison of the sequenced chloroplast genomes. Nucleic Acids Res. 1991, 19, 983–995. [Google Scholar] [CrossRef] [PubMed]
  88. Lobry, J.R. Asymmetric substitution patterns in the two DNA strands of bacteria. Mol. Biol. Evol. 1996, 13, 660–665. [Google Scholar] [CrossRef] [PubMed][Green Version]
  89. Necşulea, A.; Lobry, J.R. A new method for assessing the effect of replication on DNA base composition asymmetry. Mol. Biol. Evol. 2007, 24, 2169–2179. [Google Scholar]
  90. Tillier, E.R.; Collins, R.A. The contributions of replication orientation, gene direction, and signal sequences to base-composition asymmetries in bacterial genomes. J. Mol. Evol. 2000, 50, 249–257. [Google Scholar] [CrossRef]
  91. Delannoy, E.; Fujii, S.; Colas des Francs-Small, C.; Brundrett, M.; Small, I. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol. Biol. Evol. 2011, 28, 2077–2086. [Google Scholar] [CrossRef] [PubMed]
  92. Flannery, M.; Mitchell, F.; Coyne, S.; Kavanagh, T.; Burke, J.; Salamin, N.; Dowding, P.; Hodkinson, T.J. Plastid genome characterisation in Brassica and Brassicaceae using a new set of nine SSRs. Theor. Appl. Genet. 2006, 113, 1221–1231. [Google Scholar] [CrossRef]
  93. Provan, J. Novel chloroplast microsatellites reveal cytoplasmic variation in Arabidopsis thaliana. Mol. Ecol. 2000, 9, 2183–2185. [Google Scholar] [CrossRef]
  94. Bryan, G.; McNicoll, J.; Ramsay, G.; Meyer, R.; De Jong, W. Polymorphic simple sequence repeat markers in chloroplast genomes of Solanaceous plants. Theor. Appl. Genet. 1999, 99, 859–867. [Google Scholar] [CrossRef]
  95. Asaf, S.; Khan, A.L.; Khan, M.A.; Waqas, M.; Kang, S.-M.; Yun, B.-W.; Lee, I.-J. Chloroplast genomes of Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea: Structures and comparative analysis. Sci. Rep. 2017, 7, 7556. [Google Scholar] [CrossRef] [PubMed]
  96. Schroeder, H.; Cronn, R.; Yanbaev, Y.; Jennings, T.; Mader, M.; Degen, B.; Kersten, B. Development of molecular markers for determining continental origin of wood from white oaks (Quercus L. sect. Quercus). PLoS ONE 2016, 11, e0158221. [Google Scholar] [CrossRef] [PubMed]
  97. Powell, W.; Morgante, M.; Andre, C.; McNicol, J.; Machray, G.; Doyle, J.; Tingey, S.; Rafalski, J. Hypervariable microsatellites provide a general source of polymorphic DNA markers for the chloroplast genome. Curr. Biol. 1995, 5, 1023–1029. [Google Scholar] [CrossRef][Green Version]
  98. Li, X.; Gao, H.; Wang, Y.; Song, J.; Henry, R.; Wu, H.; Hu, Z.; Yao, H.; Luo, H.; Luo, K.; et al. Complete chloroplast genome sequence of Magnolia grandiflora and comparative analysis with related species. Sci. China Life Sci. 2013, 56, 189–198. [Google Scholar] [CrossRef] [PubMed]
  99. Zhou, Z. Fossils of the Fagaceae and their implications in systematics and biogeography. Acta Phytotaxon. Sin. 1999, 37, 369–385. [Google Scholar]
  100. Pu, C.; Zhou, Z.; Luo, Y. A cladistic analysis of Quercus (Fagaceae) in China based on leaf epidermic and architecture. Acta Bot. Yunnanica 2002, 24, 689–698. [Google Scholar]
  101. Editorial Committee of Flora of China. Flora of China; Science Press: Beijing, China, 1998. [Google Scholar]
  102. Denk, T.; Grimm, G.W. The oaks of western Eurasia: Traditional classifications and evidence from two nuclear markers. Taxon 2010, 59, 351–366. [Google Scholar] [CrossRef]
Figure 1. Map of the chloroplast genome of Q. bawanglingensis. Genes drawn inside the circle are transcribed clockwise, and genes drawn outside the circle are transcribed counterclockwise. Genes belonging to different functional groups are colour-coded. In the inner circle, the darker grey indicates the G + C content and the lighter grey indicates the A + T content. The grey arrows denote the direction of the genes.
Figure 2. Sequence identity plots of the nine Fagaceae cp genomes generated by mVISTA, with the Q. tarokoensis genome as the reference. The vertical axis shows the sequence identity (50% to 100%), and the horizontal axis shows the position along the sequence. Annotated genes are displayed along the top.
Figure 3. Nucleotide variability (pi) values. X-axis: position of the midpoint of a window. Y-axis: nucleotide diversity of each window.
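A sliding-window nucleotide diversity profile of the kind plotted in Figure 3 can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: pi is computed here as the mean proportion of differing sites over all sequence pairs, positions with gaps or ambiguous bases are skipped for the affected pair, and the window and step sizes in the toy call are hypothetical. The function names `pairwise_diversity` and `sliding_pi` are likewise illustrative.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    proportions = []
    for a, b in combinations(seqs, 2):
        compared = 0
        mismatched = 0
        for x, y in zip(a, b):
            if x in "ACGT" and y in "ACGT":  # skip gaps and ambiguous bases
                compared += 1
                if x != y:
                    mismatched += 1
        if compared:
            proportions.append(mismatched / compared)
    return sum(proportions) / len(proportions) if proportions else 0.0


def sliding_pi(alignment, window, step):
    """Yield (window midpoint, pi) for every full window; midpoints are 1-based."""
    length = len(alignment[0])
    for start in range(0, length - window + 1, step):
        block = [s[start:start + window] for s in alignment]
        midpoint = start + (window + 1) / 2
        yield midpoint, pairwise_diversity(block)


# Toy alignment of three sequences (hypothetical data).
alignment = ["ACGTACGT", "ACGTACGA", "ACCTACGT"]
profile = list(sliding_pi(alignment, window=4, step=4))
print(profile)
```

Each tuple pairs a window midpoint (the X-axis of the figure) with the diversity of that window (the Y-axis), so peaks in the profile mark the more variable regions.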
Figure 4. Comparison of the cp genomes from four Fagaceae species. The two outermost rings, which point in opposite directions, show the coding sequences (CDS), rRNA genes, and tRNA genes. The three inner rings show the BLAST results for Q. bawanglingensis vs. L. balansae, Q. tarokoensis and Q. variabilis, respectively. Positive GC skew (green) means G > C, whereas negative GC skew (purple) means G < C.
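The GC skew track described in the Figure 4 caption follows the sign convention stated there: positive where G exceeds C and negative where C exceeds G. A minimal sketch, assuming the common definition (G - C) / (G + C) computed per window, is given below. The function name `gc_skew` and the window and step sizes are illustrative assumptions, not parameters reported for the figure.

```python
def gc_skew(sequence, window, step):
    """Return (1-based window start, skew) pairs with skew = (G - C) / (G + C)."""
    seq = sequence.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if g + c else 0.0  # windows without G or C score 0
        result.append((start + 1, skew))
    return result


# Toy sequence: a G-rich window followed by a C-rich window (hypothetical data).
toy = "GGGCAT" + "CCCGAT"
skews = gc_skew(toy, window=6, step=6)
print(skews)
```

In the toy example the first window has a positive skew and the second a negative skew, corresponding to the green and purple segments of the figure, respectively.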
Figure 5. Comparison of the borders for the LSC and SSC regions and IRs among the nine Fagaceae cp genomes.
Figure 6. Maximum likelihood (ML) phylogenetic tree of 29 species of Fagaceae constructed using their chloroplast genomes. Populus trichocarpa and Theobroma cacao were used as the outgroups.
Table 1. Comparison of features of nine Fagaceae chloroplast genomes.
| Genome Features | Genome Size (bp) | LSC Length (bp) | SSC Length (bp) | IRs Length (bp) | Number of Genes | Number of Protein Coding Genes | Number of tRNA Genes | Number of rRNA Genes | GC Content (%) |
|---|---|---|---|---|---|---|---|---|---|
| Q. bawanglingensis Huang, Li et Xing | 161,394 | 90,628 | 19,036 | 51,730 | 134 | 86 | 40 | 8 | 36.8 |
| Q. tarokoensis Hayata | 161,355 | 90,602 | 19,033 | 51,720 | 134 | 86 | 40 | 8 | 36.9 |
| Q. aliena var. acutiserrata Maxim. ex Wenz. | 161,153 | 90,457 | 19,044 | 51,652 | 134 | 86 | 40 | 8 | 36.8 |
| Q. variabilis Bl. | 161,077 | 90,387 | 19,056 | 51,634 | 134 | 86 | 40 | 8 | 36.8 |
| Q. baronii Skan | 161,072 | 90,341 | 19,045 | 51,686 | 134 | 86 | 40 | 8 | 36.8 |
| Q. aquifolioides Rehd. et Wils. | 161,225 | 90,535 | 19,000 | 51,690 | 134 | 86 | 40 | 8 | 36.8 |
| Fagus engleriana Seem. | 158,346 | 87,667 | 18,895 | 51,784 | 131 | 83 | 40 | 8 | 37.1 |
| Lithocarpus balansae (Drake) A. Camus | 161,020 | 90,596 | 19,160 | 51,264 | 134 | 87 | 39 | 8 | 36.7 |
| Castanea mollissima Bl. | 160,799 | 90,432 | 18,995 | 51,372 | 130 | 83 | 37 | 8 | 36.8 |
SSC, a small single-copy region; LSC, a large single-copy region; IRs, two inverted repeats.
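Because the LSC, the SSC and the two IRs together make up the whole circular genome, the region lengths in Table 1 should sum to the genome size. The short check below illustrates this with the values copied from the table. It assumes that the IRs column reports the combined length of both repeats, as the footnote indicates; the variable names are illustrative.

```python
# (genome size, LSC, SSC, both IRs combined) in bp, copied from Table 1.
table1 = {
    "Q. bawanglingensis": (161394, 90628, 19036, 51730),
    "Q. tarokoensis": (161355, 90602, 19033, 51720),
    "Q. aliena var. acutiserrata": (161153, 90457, 19044, 51652),
    "Q. variabilis": (161077, 90387, 19056, 51634),
    "Q. baronii": (161072, 90341, 19045, 51686),
    "Q. aquifolioides": (161225, 90535, 19000, 51690),
    "Fagus engleriana": (158346, 87667, 18895, 51784),
    "Lithocarpus balansae": (161020, 90596, 19160, 51264),
    "Castanea mollissima": (160799, 90432, 18995, 51372),
}

# Species whose region lengths do not add up to the reported genome size.
mismatches = {
    name: size - (lsc + ssc + irs)
    for name, (size, lsc, ssc, irs) in table1.items()
    if size != lsc + ssc + irs
}

# Length of a single inverted repeat, assuming the two copies are identical in length.
single_ir = {name: irs // 2 for name, (_, _, _, irs) in table1.items()}

print(mismatches)
print(single_ir["Q. bawanglingensis"])
```

All nine genomes pass the check, and under the stated assumption each IR copy of Q. bawanglingensis is 25,865 bp long.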

Liu, X.; Chang, E.-M.; Liu, J.-F.; Huang, Y.-N.; Wang, Y.; Yao, N.; Jiang, Z.-P. Complete Chloroplast Genome Sequence and Phylogenetic Analysis of Quercus bawanglingensis Huang, Li et Xing, a Vulnerable Oak Tree in China. Forests 2019, 10, 587.
