Complete Chloroplast Genome of Cercis chuniana (Fabaceae) with Structural and Genetic Comparison to Six Species in Caesalpinioideae

The subfamily Caesalpinioideae of the Fabaceae has long been recognized as non-monophyletic due to its controversial phylogenetic relationships. Cercis chuniana, endemic to China, is a representative species of Cercis L. placed within Caesalpinioideae in the older sense. Here, we report the whole chloroplast (cp) genome of C. chuniana and compare it to six other species from the Caesalpinioideae. Comparative analyses of gene synteny and simple sequence repeats (SSRs), as well as estimation of nucleotide diversity, the relative ratios of synonymous and nonsynonymous substitutions (dn/ds), and Kimura 2-parameter (K2P) interspecific genetic distances, were all conducted. The whole cp genome of C. chuniana was found to be 158,433 bp long with a total of 114 genes, 81 of which code for proteins. Nucleotide substitutions and length variation are present, particularly at the boundaries among large single copy (LSC), inverted repeat (IR) and small single copy (SSC) regions. Nucleotide diversity among all species was estimated to be 0.03, the average dn/ds ratio 0.3177, and the average K2P value 0.0372. Ninety-one SSRs were identified in C. chuniana, with the highest proportion in the LSC region. Ninety-seven species from the old Caesalpinioideae were selected for phylogenetic reconstruction, the analysis of which strongly supports the monophyly of Cercidoideae based on the new classification of the Fabaceae. Our study provides genomic information for further phylogenetic reconstruction and biogeographic inference of Cercis and other legume species.


Introduction
The chloroplast (cp) is widely present in algae and plants with important functions in photosynthesis, carbon fixation, and stress response [1,2]. The cp genome in most angiosperms is a circular molecule with a typically quadripartite structure, comprising a large single copy (LSC) region and a small single copy (SSC) region separated by two copies of a large inverted repeat (IR) region [3][4][5][6]. Although the cp genome is highly conserved, some differences in gene synteny, simple sequence repeats (SSRs) and pseudogenes have been observed [7][8][9] and an accelerated rate of evolution has been observed in some cp regions at different taxonomic levels [10,11]. A complete cp genome

Genome Organization and Features of C. chuniana
A total of 1,917,920 paired-end reads (2 × 250 bp) were produced, with 1.17 Gb of clean data. All read data were deposited in the NCBI Sequence Read Archive (SRA) under accession number SRP118607. In total, 102 contigs (N50 = 8438 bp) were generated for C. chuniana. The size of the complete cp genome is 158,433 bp (Figure 1; Table 1). The cp genome displays a typical quadripartite structure, including a pair of IR regions (25,505 bp each) separated by the LSC (88,063 bp) and SSC (19,360 bp) regions (Figure 1 and Table 1). The G + C content of the cp genome is 36.10% for C. chuniana, closely matching that of C. canadensis (36.20%) (Table 1). When duplicated genes in the IR regions were counted only once, the cp genome of C. chuniana was found to encode 114 predicted functional genes, including 81 protein-coding genes (PCGs), 29 tRNA genes, and four rRNA genes, all of which are comparable to the numbers in C. canadensis and other related species (Table 1). The remaining non-coding regions include introns, intergenic spacers, and pseudogenes. Nineteen genes are duplicated in the IR regions, including eight PCGs, seven tRNA genes, and four rRNA genes (Figure 1 and Table S1). Fifteen genes (nine PCGs and six tRNA genes) contain one intron, and two PCGs (clpP and ycf3) have two introns each (Table S1). The maturase K (matK) gene in the cp genome is located within the trnK intron, consistent with its location in C. canadensis and in most other plant species [31]. In the IR regions of C. chuniana, the four rRNA genes and two tRNA genes (trnE and trnA) are clustered as 16S-trnE-trnA-23S-4.5S-5S. This differs from the cp genomes of C. canadensis and most legumes, which show a cluster of 16S-trnI-trnA-23S-4.5S-5S [32][33][34][35][36][37].
Figure 1. Gene map of the Cercis chuniana cp genome. The genes lying inside and outside the outer circle are transcribed in clockwise and counterclockwise direction, respectively (as indicated by arrows). Colors denote the genes belonging to different functional groups. The hatch marks on the inner circle indicate the extent of the inverted repeats (IRa and IRb) that separate the small single copy (SSC) region from the large single copy (LSC) region. The dark gray and light gray shading within the inner circle correspond to percentage G + C and A + T content, respectively.
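As a quick consistency check on the figures reported in this section, the quadripartite structure implies that the total genome length equals the LSC length plus the SSC length plus twice the IR length, and the gene total equals the sum of the three gene classes. The short Python sketch below uses only the values given in the text; the function name `quadripartite_length` is a hypothetical helper introduced for illustration.

```python
# Region lengths (bp) as reported in the text for C. chuniana.
LSC = 88_063
SSC = 19_360
IR = 25_505  # length of ONE inverted-repeat copy (assumed from the text)


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a circular cp genome that carries two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)

# Gene classes as reported, with IR duplicates counted once.
gene_total = 81 + 29 + 4  # PCGs + tRNA genes + rRNA genes

print(total)       # 158433
print(gene_total)  # 114
```

Both results agree with the reported genome size of 158,433 bp and the 114 predicted functional genes.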

Comparative Analysis of Genomic Structure
Synteny analysis detected no genome rearrangements or inversions in the cp genome sequences among the seven species (Figure S1). Genomic structure, including gene number and gene order, is therefore highly conserved among the seven species. However, some nucleotide substitutions and indels as well as length variation are still present, particularly at the LSC/IR/SSC boundaries (Figure 2 and Figure S2).
Pseudogenes are frequently identified in cp genomes [38,39]. Four pseudogenes were identified in the current study, i.e., Ψrps19, Ψycf1, ΨinfA and ΨaccD (Table 2). The rps19 and ycf1 genes are partially repeated in the IR regions, and the partial copies were generally found to be pseudogenized. The rps19 gene is 279 bp in all species (Figure 2) with length variation in the IR regions, from 73 bp in Tamarindus indica to 107 bp in Libidibia coriaria. It has the same length (152 bp) in both C. chuniana and C. canadensis in the IR regions (Figure 2). Because rps19 is only partially duplicated in the IR regions, the duplicated copy has lost its protein-coding ability, thus producing the pseudogene Ψrps19. Two nonsynonymous substitutions were detected in the Ψrps19 gene between C. chuniana and C. canadensis. Among the seven species, 28 substitutions (seven in the IRb region and 21 in the LSC region) and 4 indels with length variation from 4 to 47 bp were identified (Figure 2; Table 2). The same was found for ycf1: the IRb/SSC junction is located within the ycf1 CDS region and only a partial gene is duplicated in the IRa region, thus producing the pseudogene Ψycf1. This is generally the case in the dicots. The length of the Ψycf1 pseudogene in the IR regions ranges from 385 bp in C. chuniana to 899 bp in Mezoneuron cucullatum. Four nonsynonymous substitutions were detected between C. chuniana and C. canadensis. Altogether 20 substitutions (19 in the IRa region and one in the SSC region) and 7 indels with length variation ranging from 1 to 33 bp are present among the seven species (Figure 2; Table 2). The infA gene is pseudogenized in all species except Ceratonia siliqua, with a length of 135 bp in both C. chuniana and C. canadensis and with length ranging from 192 to 252 bp among the other four species. A total of 23 substitutions and 6 indels ranging from 1 to 13 bp in length occur in ΨinfA (Figure 2; Table 2).
The pseudogenized ΨinfA gene has also been frequently found in other angiosperm chloroplast genomes [40][41][42]. The pseudogenized ΨaccD gene is present in all species except T. indica and M. cucullatum, with a length of 1473 bp in both C. chuniana and C. canadensis and with length ranging from 1395 to 1500 bp in the other three species. Six indels ranging from 3 to 36 bp in length and 101 substitutions were detected in ΨaccD (Table 2).

Characterization of Simple Sequence Repeats
Variable copy numbers and the resulting length variation have led to the wide use of cp SSRs in plant population genetics and biogeographic studies, especially at lower taxonomic levels [43,44]. A total of 91 SSRs of ≥10 bp in length were found in both C. chuniana and C. canadensis. These two species exhibit the highest number of SSRs among the seven species (Table 3). The lowest number of SSRs was detected in Haematoxylum brasiletto, with only 38 SSRs in total (Table 3). Most SSRs are present in the LSC regions, accounting for an average of 75.00% of the total SSRs in each species. Among all of the SSRs, the mononucleotide A or T repeat units were found in highest proportion, with an average of 78.10% of the total SSRs in each species. The SSRs have a remarkably high A + T content, with only 15 compound SSRs containing the nucleotides C or G in C. chuniana (Table S2). The lengths of SSRs in the seven species range from 10 to 20 bp, whereas the compound SSRs range from 21 to 275 bp. SSRs of 10 to 13 bp in length are most common, with an average of 77.00% among all species (Figure 3). No pentanucleotide or hexanucleotide SSRs were detected among the seven species.
Shared interspecific SSRs, defined as identical repeats located in homologous regions, were identified among species (Table 4). Cercis chuniana and C. canadensis share the highest number of common SSRs (19). Conversely, Tamarindus indica has the lowest number of shared SSRs (≤3). Altogether 13 SSRs were isolated, and corresponding primer pairs were designed for each of the di-, tri- and tetranucleotide SSRs of C. chuniana (Table S3). These SSRs are expected to be useful in the assessment of genetic diversity and population structure as well as in investigations of biogeographic patterns among the species of Cercis.

Sequence Divergence and Nucleotide Diversity
A complete cp genome is valuable for plant taxonomic analyses, phylogenetic reconstruction, studies of speciation processes, and biogeographical inferences at different taxonomic levels [45][46][47][48][49]. Highly variable regions among cp genomes can provide useful data for phylogenetic reconstruction. In the current study, the average nucleotide variability (Pi) between C. chuniana and C. canadensis was estimated to be 0.006, based on the comparative analysis with DnaSP (Figure 4a). The highest variation was found in the LSC and SSC regions. The IR regions had a much lower nucleotide diversity with Pi < 0.006. Eight regions (trnS-trnT, atpF-atpH, trnT-psbD, trnL-trnF-ndhJ, accD-psaI, rps3-rps19, ycf1-ndhF and the ndhA intron) were highly variable, with Pi values > 0.030. The first five loci are present in the LSC, whereas the remaining two are present in the SSC region. In contrast, much higher nucleotide diversity with Pi = 0.038 was detected among the seven species (Figure 4b). Five regions (psbZ-trnG, trnT-trnL, rps3-rps19, rpl32, and ycf1) exhibit the highest nucleotide diversity, all with Pi > 0.12. These loci are thus suggested as useful regions for phylogenetic analysis at higher taxonomic levels in the Fabaceae.

dn/ds Ratio and Kimura 2-Parameter (K2P) Genetic Distance
A total of 76 PCGs in all seven species were used to estimate dn/ds ratios. The dn and ds values range from 0 to 0.1713 and 0.0046 to 0.5330, respectively. If dn or ds is 0, the dn/ds ratio cannot be calculated. Among all genes, 67 possess dn/ds ratios < 0.5, indicating purifying selection (Figure 5a). In ndhD, Ψycf1, ΨinfA and rpl23 the dn/ds ratios were > 1, indicating positive selection (Figure 5a). Among the different regions, the dn/ds ratio was the highest in the IR regions (0.9022) and the lowest in the LSC region (0.2205). Based on the K2P model, we calculated the interspecific genetic distance among the seven species using 80 PCGs. The average K2P interspecific genetic distance was found to be 0.0373 (Figure 5b). The minimum K2P values were identified in ndhB and rps7 (0.0030) and the maximum in psaB (0.2020).
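For reference, the K2P distance between two aligned sequences follows from the proportions of transitional (P) and transversional (Q) differences: d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The sketch below is a minimal Python illustration of that formula, not the MEGA implementation used in this study; the function name `k2p_distance` and the choice to skip gapped or ambiguous sites are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    valid = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        valid += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if valid == 0:
        raise ValueError("no comparable sites")
    p = transitions / valid
    q = transversions / valid
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition (A<->G) in ten sites: P = 0.1, Q = 0.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

For highly diverged (saturated) sequences the logarithm's argument can become non-positive, in which case the distance is undefined and `math.log` raises an error.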

Phylogenetic Analyses
A total of 97 representative species from the old Caesalpinioideae and Mimosoideae were selected to reconstruct phylogenetic relationships (Table S4). Cucumis sativus (DQ119058) was used as the outgroup. Bayesian inference (BI) and maximum likelihood (ML) resulted in highly similar phylogenetic trees based on the complete cp genome sequences and 61 protein-coding genes (PCGs) (Figure 6). The total aligned length was 302,882 bp for the complete cp genome sequences and 69,253 bp for the PCGs, and the number of parsimony-informative sites was 163,470 and 25,698, respectively. The ML trees from the two data sets exhibit completely congruent topologies, with higher bootstrap support values in the tree based on complete cp genome sequences than in the tree based on the PCGs (Figure 6a). The relationship between subfamilies Cercidoideae and Detarioideae was not stable in the BI analysis, but otherwise high support values were detected in both the ML and BI analyses based on the two data sets (Figure 6b). All analyses recover the monophyly of both the Cercidoideae and Detarioideae with strong support. Our results are consistent with [50] and strongly support the new classification system of the Fabaceae [21].

Ethics Statement
Sample collection and transplanting were carried out for scientific purposes. Cercis chuniana was collected from the field in Dadongshan Natural Reserve in Guangdong Province, China. The management of the reserve permitted one individual seedling to be transplanted and grown in the greenhouse at the College of Life Sciences, South China Agricultural University (SCAU, Guangzhou, China).

Plant Samples
Fresh leaves were collected from C. chuniana growing at SCAU. The voucher (LWZ109) is deposited in the herbarium of SCAU (CANT). The cp genome of C. canadensis (KF856619) was downloaded from NCBI and used as the reference sequence in the assembly of C. chuniana. Five additional species from the old Caesalpinioideae were used for comparison, i.e., Tamarindus indica

DNA Extraction and PCR Amplification
Total genomic DNA was extracted with the modified cetyl trimethyl ammonium bromide (CTAB) method [51]. The DNA concentration was quantified with a Nanodrop spectrophotometer (Thermo Scientific, Carlsbad, CA, USA), and a final DNA concentration of >30 ng/µL was used. Sequences of the complete cp genome of C. chuniana were amplified with fifteen universal primer pairs developed by Zhang et al. [52]. The PCR amplification was performed in a total volume of 25 µL,

Chloroplast Genome Sequencing, Assembly and Annotation
A paired-end library was constructed with the Nextera XT DNA Library Prep Kit (Illumina Inc., San Diego, CA, USA). The genomic DNA mixture was fragmented into ~300 bp fragments by the Nextera XT transposome. Sequencing generated 2 × 250 bp paired-end reads on an Illumina MiSeq Desktop Sequencer at South China Botanical Garden, Chinese Academy of Sciences. Reads of the C. chuniana cp genome were initially filtered for quality, and then adapters were removed, errors were checked, and contigs and scaffolds were generated, all with the A5-miseq pipeline [53]. Scaffolds from the assembly with k-mer values of 35 to 145 were matched to reference cp genome sequences to determine their relative positions and directions. We assembled the cp genome using Geneious 9.1.4 (Biomatters Ltd., Auckland, New Zealand) [54] with BLAST 2.0.3+ (National Institutes of Health, Bethesda, MD, USA) [55] and map-to-reference tools. Contigs with BLAST hits to the consensus sequence from the "map to reference" function were assembled manually to construct the complete cp genome. DOGMA (available online: http://dogma.ccbb.utexas.edu/) [56] and Geneious (Biomatters Ltd., Auckland, New Zealand) were used for annotating the cp genome in comparison with that of C. canadensis (KF856619) [57]. The annotation of tRNA genes was confirmed with the ARAGORN program (Lund University, Lund, Sweden) [58] and then manually adjusted with Geneious. The circular genome map of C. chuniana was illustrated with the Organellar Genome DRAW tool (OGDRAW, available online: http://ogdraw.mpimp-golm.mpg.de/) [59]. To further refine the draft genome, its quality and coverage were confirmed by remapping reads. The raw reads are available in the NCBI Sequence Read Archive (SRA) under accession number SRP118607. The annotated cp genomic sequence of C. chuniana was deposited in GenBank (Accession Number: MF741770).

Genome Comparison
The cp genome sequences from the finalized data set were aligned with MAFFT v7.0.0 (Osaka University, Suita, Japan) [60] and adjusted manually when necessary. The expansion/contraction of the IR regions can lead to changes in the structure of the cp genome, resulting in the length variation of angiosperm cp genomes and contributing to the formation of pseudogenes [9,61,62]. Therefore, we conducted a comparative analysis to detect the variation in the LSC/IR/SSC boundaries among the seven species included in comparisons. Gene synteny analysis was performed with MAUVE (University of Wisconsin, Madison, WI, USA) [63] as implemented in Geneious with default settings. To elucidate the level of sequence divergence, the complete cp genomes were compared and plotted with the mVISTA program in Shuffle-LAGAN mode [64][65][66].

Simple Sequence Repeats Analysis
MISA (available online: http://pgrc.ipk-gatersleben.de/misa/misa.html) [67] is a tool for the identification and location of perfect simple sequence repeat loci (SSRs) and compound SSRs (the latter being two individual SSRs that are interrupted by a certain number of bases). We used MISA to search for potential SSRs in the cp genomes of the seven species. The minimum number of repeat units (threshold) was set to 10, 6, 5, 5, and 5 for mono-, di-, tri-, tetra-, and pentanucleotide SSRs, respectively. All SSRs, motif types and length variants were manually verified, and redundant ones were removed. We investigated the shared repeats among the cp genomes of the seven species, based on the criterion that repeats of identical length located in homologous regions are considered to be shared repeats. Using the program Primer 3-1.1.1 (Premier Biosoft International, Palo Alto, CA, USA) [68], we developed SSR primers specific for C. chuniana for potential application in further analyses.
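To make the threshold scheme concrete, the following Python sketch finds perfect SSRs with a regular expression per motif length, using the minimum repeat counts stated above (10, 6, 5, 5, 5). It is an illustrative stand-in and not MISA itself: the function name `find_perfect_ssrs`, the 0-based coordinates, and the omission of compound-SSR detection are all assumptions of this example.

```python
import re

# Minimum repeat counts per motif length, following the thresholds in the
# text (mono 10, di 6, tri 5, tetra 5, penta 5).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, copies) tuples for perfect SSRs.

    Coordinates are 0-based with an exclusive end. Compound SSRs are not
    merged in this sketch.
    """
    seq = seq.upper()
    hits = []
    for unit, min_n in sorted(min_repeats.items()):
        # A motif of `unit` bases followed by at least (min_n - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"); those belong to the shorter motif class.
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            copies = (m.end() - m.start()) // unit
            hits.append((m.start(), m.end(), motif, copies))
    return sorted(hits)


# Toy sequence containing an A(12), an (AT)6 and an (AAG)5 repeat.
demo = "CCG" + "A" * 12 + "GGC" + "AT" * 6 + "CGTC" + "AAG" * 5 + "C"
ssrs = find_perfect_ssrs(demo)
```

Because skipped degenerate matches still consume sequence during the scan, a production tool needs more careful overlap handling than this sketch provides.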

Sequence Divergence, dn/ds Ratio and K2P Genetic Distance
Comparative analyses of the nucleotide diversity (Pi) among the complete cp genomes of the seven species were performed with DnaSP 6 (Universitat de Barcelona, Barcelona, Spain) [69,70], as based on a sliding window analysis. The window length was 600 bp and step size was 200 bp. The 80 PCGs were extracted and aligned with MAFFT. We estimated the dn/ds ratio for each PCG as well as the interspecific genetic distance with DnaSP 6 and MEGA 6.0 (Tokyo Metropolitan University, Hachioji, Tokyo, Japan) [71], as based on the Kimura 2-parameter (K2P) model.
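The sliding-window procedure can be sketched as follows: for each 600 bp window, advanced in 200 bp steps, nucleotide diversity is taken as the mean proportion of differing sites over all sequence pairs. This Python sketch is a simplified illustration rather than the DnaSP algorithm; the helper names `pairwise_diff` and `sliding_pi` are hypothetical, and the pairwise handling of gaps here is an assumption that may differ from DnaSP's treatment of alignment gaps.

```python
from itertools import combinations


def pairwise_diff(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap or ambiguous base in either sequence are ignored.
    """
    compared = diffs = 0
    for x, y in zip(a, b):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    return diffs / compared if compared else 0.0


def sliding_pi(alignment, window=600, step=200):
    """Yield (window_start, pi) for each full window of the alignment."""
    seqs = [s.upper() for s in alignment]
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(range(len(seqs)), 2))
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        total = sum(pairwise_diff(chunk[i], chunk[j]) for i, j in pairs)
        yield start, total / len(pairs)


# Toy alignment with a tiny window so the arithmetic is easy to follow.
toy = ["AAAAAAAAAA", "AAAAATAAAA", "AAAAAAAAAA"]
profile = list(sliding_pi(toy, window=5, step=5))
```

In the toy alignment, the first window is invariant (Pi = 0), while the second contains one variable site shared by two of the three pairs, giving Pi = (0.2 + 0 + 0.2) / 3.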

Phylogenetic Analysis
Altogether 97 representative species from the old Caesalpinioideae and Mimosoideae were selected for phylogenetic analyses (Table S4). Cucumis sativus (DQ119058) was used as the outgroup. Two data sets, the complete cp genome sequences and the PCGs, were used for phylogenetic reconstruction with two methods, Bayesian inference (BI) and maximum likelihood (ML). All analyses were performed on the high-performance computer cluster available in the CIPRES Science Gateway 3.3 (available online: www.phylo.org) [72]. Gaps were treated as missing data. BI was performed by using MrBayes v. 3.2.6 (Swedish Museum of Natural History, Stockholm, Sweden) [73] with base frequencies estimated from the data. We ran four Markov chain Monte Carlo (MCMC) chains for 50 million generations using default settings for priors and saved one tree every 1000 generations. The first 10% of the trees were discarded as burn-in, as determined with the aid of the program Tracer version 1.6 (University of Auckland, Auckland, New Zealand) [74]. The posterior probability (PP) of each clade (i.e., the "clade credibility value") was estimated with 50% majority-rule consensus trees. We conducted ML using RAxML 8.2.10 (Heidelberg Institute for Theoretical Studies, Heidelberg, Germany) [75] and the RAxML graphical interface (raxmlGUI v. 1.3) (Research Institute Senckenberg, Frankfurt, Germany) [76]. RAxML was run by using Python v.2.7.6 (available online: http://www.python.org/ftp/python/2.7.6/python-2.7.6.msi) with 1000 rapid bootstrap replicates. The general time-reversible (GTR) model was chosen with a gamma model of rate heterogeneity.

Conclusions
We report the complete cp genome of C. chuniana endemic to China, which belongs to Cercis L., an intercontinentally disjunct genus. Using a high-throughput sequencing method, we sequenced and annotated the whole genome, detected the arrangement of the genes, and identified SSRs in C. chuniana. We compared the cp genomic characteristics of C. chuniana to its congener C. canadensis and five other species from the old Caesalpinioideae. The current study is the first structural and gene comparison among the cp genomes of seven species from three subfamilies of legumes, including Cercidoideae, Detarioideae and Caesalpinioideae at the genomic level. Nearly 100 representative species from the old Caesalpinioideae and Mimosoideae were used for phylogenetic reconstruction, strongly corroborating the monophyly of Cercidoideae and Detarioideae in the sense of the new classification of Fabaceae. Our study contributes to the taxonomy, phylogenetic reconstruction and biogeographical research of Cercis and other legume species.

Conflicts of Interest:
The authors declare no conflict of interest.