Population Structure and Genetic Diversity in the Natural Distribution of Neolamarckia cadamba in China

Neolamarckia cadamba (Roxb.) Bosser is a fast-growing deciduous tree species and belongs to the Neolamarckia genus of the Rubiaceae family. This species has great economic and medical values in addition to being an important timber species for multiple industrial purposes. However, few studies have examined the genetic diversity and population structure in the natural distribution of this species in China. Here, we applied both the haploid nrDNA ITS (619 bp for aligned sequences) and mtDNA (2 polymorphic loci) markers to investigate 10 natural populations (239 individuals in total) that covered most of the distribution of the species in China. The results showed that the nucleotide diversity was π = 0.1185 ± 0.0242 for the nrDNA ITS markers and π = 0.00038 ± 0.00052 for the mtDNA markers. The haplotype diversity for the mtDNA markers was h = 0.1952 ± 0.2532. The population genetic differentiation was small (Fstn = 0.0294) for the nrDNA ITS markers but large (Fstm = 0.6765) for the mtDNA markers. There were no significant effects of isolation by distance (IBD), by elevation, and by two climatic factors (annual average precipitation and tem perature). A geographic structure among populations (Nst<Gst) was absent. Phylogenetic analysis showed a highly genetic mixture among individuals of the ten populations. Pollen flow was substantially greater than seed flow (mp/ms ≫ 1.0) and played a dominant role in shaping population genetic structure. The nrDNA ITS sequences were neutral and all local populations did not undergo demographic expansion. The overall results provide fundamental information for the genetic conservation and breeding of this miraculous tree.


Introduction
N. cadamba (2n = 44) is an evergreen deciduous and fast-growing tree species and belongs to the Neolamarckia genus of the Rubiaceae family [1,2]. This species is naturally distributed in Yunnan, Guangxi, and Guangdong provinces of South China and other subtropical and tropical regions, including Vietnam, Malaysia, Myanmar, India, Sri Lanka, and Australia [1,2]. N. cadamba can grow to 40-45 m in height and 100-160 cm in diameter [2]. It prefers high temperature and strong light habitats as well as relatively moist and fertile soil [2]. It is characterized by a rounded crown, straight trunk, and rapid growth. Its timber has the properties of straight texture, easy planning, fast drying, and hardness [3,4]. It is widely used for multiple industrial and commercial purposes, such as wood board, pulp, and paper making [5,6]. The species is also used as a material for woody forage to feed livestock [7], for nectar preparation, and for nutraceutical beverages [8,9]. In addition, the species has rich secondary metabolites in different tissues, such as cadambine, alkaloids, and triterpenoids, which have antioxidant, anti-inflammatory, antibacterial, and antimalarial effects [10][11][12][13][14]. Thus, this species is traditionally used as a medicinal plant to cure several diseases, such as diabetes, anemia, stomatitis, leprosy, cancer, infection, and other diseases [12]. Up to now, this "miraculous tree" species has been the subject of extensive provenance trials, clonal propagation, and plantations in South China [15,16].
Recent studies on genetic variation and breeding of N. cadamba covered wide aspects. Provenance trials in China were conducted to assess the genetic variation among provenances in height, DBH (the diameter at the breast height), and wood volume [15][16][17]. A progeny test was also reported in India [18] to estimate the genetic parameters of phenotypic traits and to select better individuals for genetic improvement. Modern biotechniques were applied to developing plant propagation and identifying functional genes, including tissue culture and propagation [19,20], transcriptome analysis of gene expression [21,22], expression sequence tags (ESTs) in xylem tissue [23], gene discovery of developing xylem tissue [4], the evolution of gene families [24], and polyploidization breeding [25]. A previous study used ISSR (inter-simple sequence repeat) dominant markers to analyze the genetic differentiation between two artificial populations and among six natural populations in Malaysia [26]. The results showed that 8.71% of the total genetic variation occurred between two artificial populations and 20.13% of the total genetic variation occurred among natural populations. Recently, Wang et al. [27] reported a complete sequence of the mitochondrial genome of this species and analyzed its phylogenetic relationships with other species of the Rubiaceae family. Zhao et al. [28] reported the sequences of nuclear genomes, which provided useful references for designing molecular markers for phylogenetic and population structure analyses. Nevertheless, population genetic structure and genetic diversity have not been investigated in China, which limits our understanding of a general picture of the genetic variation in the natural distribution of this species.
The ecological and evolutionary processes underlying the present population structure have not been examined as well. Although we could exclude the impacts of interspecific processes, such as incomplete lineage sorting and introgression/hybridization, other processes, such as seed/pollen flow and local adaptation, were likely involved in shaping population structure. Previous provenance trials showed that significant population differentiation occurred in growth traits [16], suggesting that selection participated in local population adaption. However, the provenance trials did not provide information on the effect of gene flow in shaping population structure. Given that N. cadamba is an anemophilous plant, pollen grains may be dispersed to long distances. Seeds are potentially locally distributed around mother trees due to gravity effects or dispersal by animals [29]. Thus, we hypothesized that pollen flow would dominantly contribute to the inter-population gene flow.
Using nuclear and organelle markers, we investigated population genetic structure and genetic diversity in the natural distribution of N. cadamba in China. Gene flow between populations could be inferred from the genetic structure analysis using molecular markers. For the nuclear genomes, we used the nuclear ribosomal internal transcribed spacer (nrDNA ITS) markers, which were frequently applied to studying phylogeography and population structure of both plant and animal species. For the organelle genomes, we used mitochondrial DNA (mtDNA) markers based on our previous sequencing of mitochondrial genomes of a few individuals where polymorphic loci were observed in some samples [27]. However, polymorphic loci were not found although several samples were tested according to the sequence of chloroplast DNA (cpDNA) [30]. Therefore, the mtDNA marker was used as the second marker in this study. It is well known that the nrDNA ITS markers are biparentally inherited, and their inter-population gene flow is achieved through seed and pollen flow. The mtDNA markers are maternally inherited and their inter-population gene flow is achieved through seed flow only. Combining the results from two types of markers helps to elucidate the relative effects of pollen versus seed flow on shaping population structure [31,32]. In addition, to infer historical dynamics, we examined if the existent natural populations underwent expansion after bottleneck effects. This would also provide additional background information for breeding and genetic conservation of this species in the future.

Population Sampling and DNA Extraction
Leaf samples were collected from ten natural populations (Table 1 and Figure 1), including three populations in Guangxi province (Longzhou, Fangchenggang, and Nanning), two populations in Guangdong province (Guangzhou and Yunfu), and five populations in Yunnan province (Baoshan, Dehong, Jinghong, Mangshi, and Mengla). These populations covered most of the natural distribution of N. cadamba in China. All sampling sites except for Baoshan were located at elevations below 1000 m. Trees were separated at least 50 m away in natural forest stands. A total of 239 individuals were analyzed in this study, with the sample size ranging from 15 to 32 among populations. DNA samples were extracted from silica-gel-dried leaves by following the CTAB 2X protocol [33]. The quality of DNA extraction was checked by 2% (w/v) agarose gel electrophoresis. All quantified DNA samples were stored at −20 • C for polymerase chain reaction (PCR) amplification. dynamics, we examined if the existent natural populations underwent expansion after bottleneck effects. This would also provide additional background information for breeding and genetic conservation of this species in the future.

Population Sampling and DNA Extraction
Leaf samples were collected from ten natural populations (Table 1 and Figure 1), including three populations in Guangxi province (Longzhou, Fangchenggang, and Nanning), two populations in Guangdong province (Guangzhou and Yunfu), and five populations in Yunnan province (Baoshan, Dehong, Jinghong, Mangshi, and Mengla). These populations covered most of the natural distribution of N. cadamba in China. All sampling sites except for Baoshan were located at elevations below 1000 m. Trees were separated at least 50 m away in natural forest stands. A total of 239 individuals were analyzed in this study, with the sample size ranging from 15 to 32 among populations. DNA samples were extracted from silica-gel-dried leaves by following the CTAB 2X protocol [33]. The quality of DNA extraction was checked by 2% (w/v) agarose gel electrophoresis. All quantified DNA samples were stored at −20°C for polymerase chain reaction (PCR) amplification.

Primer Screening, PCR Amplification, and Sequencing
For the mitochondrial genomes, we tried 11 pairs of primers that were selected from the relevant literature (Table S1), but their amplifications did not have single nucleotide polymorphisms (SNPs). We designed two pairs of primers from our previous sequencing of the mitochondrial genome of N. cadamba [27], which contained SNPs in their amplifications. These two pairs of primers were GAACATGGATTAGCATTATGTC/AT GCTAAGAGAGGGATGCTTCGC for the F1-R1 pair, and TTAGGGTCCG CTTACTTTG A/AACCGGGTAG ATGCTAAGAG for the F2-R2 pair.
For the nrDNA ITS markers, the forward and reverse primers were TCCTCCGCTTA TTGATATGC and GGAAGGAGAAGT CGTAACAAGG, respectively.
Each PCR amplification was conducted in a 25 µL reaction volume. The amplification volume included 1 µL of template DNA, 1.0 µL of each forward and reverse primer, 9.5 µL of ddH 2 O, and 12.5 µL of a mixed enzyme (0.1 U Tap polymerase, 500 µmol/L dNTPs, 20 mmol/LTris-HCl, 3 mmol/L MgCl 2 , and 100 mM KCl). The PCR amplification for nrDNA ITS and mtDNA primers is detailed below: (i) preheating at 95 • C for 4 min; (ii) annealing at a 50-65 • C temperature interval for 30 s (55 • C for ITS, 58 • C for F1-R1, and 53 • C for F2-R2); (iii) extension at 72 • C for 1 min, and repetition of the cycle of denaturation-annealing-extension for 35 times; (iv) 72 • C final extension for 10 min to make PCR amplification products be fully extended; and, finally, 4 • C to stop the reaction program. Then, the PCR products were run on 2% agarose gel electrophoresis. The results of gel running were examined by a gel imager, and the amplification products with clear and single bands were sent to Shanghai Biotech Biological Co., Ltd. (Shanghai, China) for sequencing. The Sanger method was used for sequencing. We checked all sequence data obtained from the company using Chromatogram Explorer 3.2 software. The sequences with high quality and without mixed peak signals were used for downstream analyses.

Analysis of Genetic Diversity
We aligned the sequences of both nrDNA ITS and mtDNA markers from 10 populations with MEGA7 [34] and removed those instable bases at the front and end. Two datasets were generated for analysis: one for the concatenated sequences from amplifications of primers F1-R1 and F2-R2 (Table S2) and the other for the nrDNA ITS sequences (Table S3).
DNAsp v5 [35] was used to estimate the haplotype diversity (h) of the mtDNA markers, the nucleotide diversity (π), and the effective population size-scaled mutation rate (θ) for both mtDNA and haploid nrDNA ITS markers [36][37][38]. The nucleotide diversity (π) refers to the pairwise nucleotide differences per nucleotide site [39]. The parameter θ per nucleotide site was estimated from the number of segregating sites among sequences in a sample [36]. TCS v1.21 [40] was used to map the evolutionary network among mitochondrial haplotypes.

Analysis of Population Genetic Structure
Population genetic structure was analyzed using DNAsp v5 [35] and Arlequin v3.0 [41] packages. Population differentiation was measured using G st [38], F st [42], N st [43,44], and φ st [45]. The three indices (G st , φ st , and N st ) have the same biological meaning as F st except that G st is used for the case of loci with more than two alleles and that φ st and N st are estimated with haploid sequences. AMOVA (analysis of molecular variance) was performed using Arlequin v3.0 to estimate the distribution of variances within and between populations and the genetic differentiation coefficient (φ st ). The geographical structure was inferred by testing if N st was significantly greater than G st [43].
Isolation by distance (IBD) was tested through the regression of F st /(1 − F st ) on the logarithm of geographical distances [46], Genes 2023, 14, 855

of 15
The geographical distance between two populations was calculated using their latitude and longitude coordinates (Table 1). A significant difference in the regression coefficient b from 0 means the presence of IBD effects. Pearson correlation between F st and geographical distance was also tested to check IBD effects in the case of many pairwise populations with complete differentiation (F st = 1) or without differentiation (F st = 0). In addition, isolation tests by elevation and by two climatic factors were performed. Mantel tests were conducted to test all types of isolation effects using the mantel function in R.
Phylogenetic relationships among individuals of the ten populations were analyzed using MEGA 7 [35]. The phylogenetic tree among populations was drawn using the ML (maximum likelihood) method [35]. Population structure was investigated using ADMIXTURE version 1.3.0 for the haploid nrDNA ITS sequences [47] and the results were plotted using the barplot function in R.
Gene flow was assessed under the assumption of the classical island model [48]. For maternally inherited mtDNA markers, population genetic differentiation, denoted by F st(m) , is expressed by where N e is the effective population size and m s is the migration rate of seeds. For the biparental nrDNA ITS markers, population genetic differentiation, denoted by F st(n) , is expressed by where m p is the migration rate of pollen [30,31]. From Ennos [30], the relative rate of pollen to seed flow is estimated by comparing F st(m) and F st(n) , Standard deviation of the point estimate m P /m s was estimated using the same method as Xiao [49]. The Jacknife method was used to estimate the variances in F st(m) and F st(n) [50], which were then used to calculate the standard deviation of m P /m s .

Analysis of Population Demography
We tested the neutrality of the nrDNA ITS markers using Tajima's D [39] and Fu's F statistics [51]. Tajima's D was calculated as the normalized difference of π − θ. The neutral variation was implied when D was not significant from zero. This was the same case in testing neutrality using Fu's F statistics. Significant negative values of Tajima's D and Fu's F also implied that a population had expanded after bottleneck effects. To further infer population demographic changes, mismatch distribution was analyzed using the Arlequin v3.0 package [41]. Under a hypothesis of population expansion after bottleneck effects, an unimodal distribution of the frequency of the observed number of pairwise different sites is expected, which fits for a single-peaked Poisson distribution. This analysis tested if the expected sum of square deviations (SSD) and the Harpending's raggedness index (Rag) were greater than the observed SSD and Rag [52], respectively. Relevant parameters were θ 0 = 2N 0 µ and θ 1 = 2N 1 µ, where N 0 and N 1 were the population sizes before and after population expansion, and τ (t) was the time elapsed since a sudden expansion.

Genetic Diversity
The segments produced by the two primer pairs were about 650 bp for the primer F1-R1 amplification and about 400 bp for the primer F2-R2 amplification. Sequence alignment analysis confirmed that both amplified segments were located on genome 2 (GenBank Access No. MT364442) of Wang et al. [27]. We obtained 161 samples in total for the mtDNA marker (Table S2).
For the haploid nrDNA ITS markers, the PCR amplification size was about 750 bp. We obtained 239 samples of sequences for downstream analyses (Table S3).
For the mtDNA markers, a total of 4 haplotypes were identified among the 161 sequenced individuals (Figure 2). Haplotype H1 (A-A) was the most frequent haplotype, accounting for 31% of all 161 samples, and occurred in 7 populations. This was followed by H2 (C-A) accounting for 28%, H3 (A-C) for 25%, and H4 (C-C) for 15% of all samples. Figure 1 shows the geographical distribution of these four haplotypes in the ten populations. Six populations were fixed by monomorphic haplotype (Table 2)

Genetic Diversity
The segments produced by the two primer pairs were about 650 bp for the primer F1-R1 amplification and about 400 bp for the primer F2-R2 amplification. Sequence alignment analysis confirmed that both amplified segments were located on genome 2 (Gen-Bank Access No. MT364442) of Wang et al. [27]. We obtained 161 samples in total for the mtDNA marker (Table S2).
For the haploid nrDNA ITS markers, the PCR amplification size was about 750 bp. We obtained 239 samples of sequences for downstream analyses (Table S3).
For the mtDNA markers, a total of 4 haplotypes were identified among the 161 sequenced individuals (Figure 2). Haplotype H1 (A-A) was the most frequent haplotype, accounting for 31% of all 161 samples, and occurred in 7 populations. This was followed by H2 (C-A) accounting for 28%, H3 (A-C) for 25%, and H4 (C-C) for 15% of all samples. Figure 1 shows the geographical distribution of these four haplotypes in the ten populations. Six populations were fixed by monomorphic haplotype (Table 2)     For the haploid nrDNA ITS markers, the nucleotide diversity ranged from π = 0.0806 in FS to 0.1616 in BS, with a mean of 0.1185 ± 0.0242 ( Table 2). The effective population size-scaled mutation rate (θ) ranged from 0.0985 in YF to 0.1543 in LZ, with the mean of 0.1345 ± 0.0180. The θ estimates were slightly greater than the π estimates in all populations except for BS. All populations except for FS, YF, and DH had relatively large nucleotide diversity (π > 0.1).

Population Genetic Structure
Analysis of molecular variance (AMOVA) indicated significant genetic differentiation among populations for both mtDNA and nrDNA ITS markers (Table 3). For the mtDNA markers, population differentiation coefficients were φ st = 0.5533 at the locus amplified by the F1-R1 primers, 0.7955 by the F2-R2 primers, and 0.6755 for their concatenated sequences. For the nrDNA ITS markers, the population differentiation coefficient was φ st = 0.0246. A major proportion of genetic variation occurred within populations, contrasting with the results of the analysis of the mtDNA markers. A comparison of N st with G st showed that N st (=0.6764) was slightly smaller than G st (= 0.6893) for the mtDNA markers. This was the same case for the haploid nrDNA markers where N st (=0.0300) was slightly smaller than G st (=0.0340). The results indicated that the phylogeographic structure was absent in the spatial haplotype distribution ( Figure 1 for the mtDNA markers only). Figure 3 shows that the IBD effects were not significant for both types of markers. For the concatenated mtDNA markers ( Figure 3A), Pearson's correlation coefficient between F st and geographic distance was r = 0.0815, p-value = 0.2934. For the haploid nrDNA ITS markers, analysis of the regression of F st /(1 − F st ) on the geographic distance was not significant, F st /(1 − F st ) = 0.0534 + 0.0034 ln(geographical distance), R 2 = 0.0095, p-value = 0.5232. This correlation was also tested by the Mantel test and was not significant (Pearson's correlation r = 0.0182, p-value = 0.32). Mantel tests also showed insignificant correlations between F st and the elevation difference (r = −0.0989, p-value = 0.896), between F st and the difference in annual average precipitation (r = 0.0582, p-value = 0.140), and between F st and the difference in annual average temperature (r = 0.0165, p-value = 0.336). Analysis of the phylogenetic relationship showed that individuals from the ten populations were genetically well mixed, indicating the presence of close genetic relationships among individuals using both the mtDNA ( Figure 4A) and nrDNA ITS markers ( Figure 4B). However, samples were genetically less mixed at the mtDNA loci than at the nrDNA ITS loci, consolidating that population genetic differentiation was relatively larger at the mtDNA loci. The phylogenetic relationship among populations did not match their geographical relationships (Figures 1 and 5), consolidating the result of the absent geographical structure of genetic variation in the nrDNA ITS loci. Further analysis with ADMIXTURE showed that the optimal number of subpopulations was K = 6, with the minimum crossvalidation (CV) error (CV error = 0.2993; Figure 6A). Figure 6B shows that individuals across populations were genetically well mixed by different proportions of the optimal six populations. Analysis of the phylogenetic relationship showed that individuals from the ten populations were genetically well mixed, indicating the presence of close genetic relationships among individuals using both the mtDNA ( Figure 4A) and nrDNA ITS markers ( Figure  4B). However, samples were genetically less mixed at the mtDNA loci than at the nrDNA ITS loci, consolidating that population genetic differentiation was relatively larger at the mtDNA loci. The phylogenetic relationship among populations did not match their geographical relationships (Figures 1 and 5), consolidating the result of the absent geographical structure of genetic variation in the nrDNA ITS loci. Further analysis with ADMIX-TURE showed that the optimal number of subpopulations was K = 6, with the minimum cross-validation (CV) error (CV error = 0.2993; Figure 6A). Figure 6B shows that individuals across populations were genetically well mixed by different proportions of the optimal six populations.   To estimate the standard deviation of ( ) , we used the two estimates of from AMOVA at the two polymorphic sites of the mtDNA markers. The mean estimates were 0.6765 ± 0.1712. The average and standard deviations of ( ) estimated by the Jacknife method were 0.0294 and 0.0001, respectively. According to the method of Xiao [49], the ratio of pollen to seed flow was estimated, ⁄ = 66.8440 ± 0.3351. Thus, pollen flow dominantly contributed to the gene flow among populations.

Population Demography
With the nrDNA ITS markers, Tajima  To estimate the standard deviation of F st(m) , we used the two estimates of φ st from AMOVA at the two polymorphic sites of the mtDNA markers. The mean estimates were 0.6765 ± 0.1712. The average and standard deviations of F st(n) estimated by the Jacknife method were 0.0294 and 0.0001, respectively. According to the method of Xiao [49], the ratio of pollen to seed flow was estimated, m p /m s = 66.8440 ± 0.3351. Thus, pollen flow dominantly contributed to the gene flow among populations.

Population Demography
With the nrDNA ITS markers, Tajima an expansion after bottleneck effects to some extents. However, there was no a unimodal distribution for the frequency of the observed number of pairwise different sites in each population. Thus, we concluded that N. cadamba species potentially underwent a weak expansion in a few local regions but did not exhibit global expansion.

Genetic Diversity
Although N. cadamba is a valuable species for multiple purposes, its molecular genetic variation has rarely been investigated in natural populations. We evaluated the genetic diversity in ten natural populations of N. cadamba. Both the nrDNA ITS and mtDNA markers were employed. These two types of markers were frequently applied to studying genetic diversity and population structure within and between species. The nrDNA ITS sequences were detected to be neutral from Tajima's D tests and the results of isolation by climatic factors. The effects of deleterious mutation did not produce significantly negative values of Tajima's D [39]. As expected, the overall level of nucleotide diversity was much lower for the mtDNA markers (π = 0.00038 ± 0.00052) than for the nrDNA ITS markers (π = 0.1185 ± 0.0242, θ = 0.1345 ± 0.0180) due to the lower mutation rate of the mitochondrial genome in plant species [53]. The haplotype diversity of the mtDNA markers was quite variable among populations, with the mean of h = 0.1952 ± 0.2532. A total of 6 populations (60%) were fixed by different haplotypes. There were absent phylogeographic structures (G st > N st ) in the spatial distribution of genetic diversity.

Population Genetic Structure
As expected in theory [30,31], the observed patterns of the population genetic structure were contrasted for the maternally inherited mtDNA markers versus the biparentally inherited nrDNA ITS markers. For the nrDNA ITS markers, small but significant population genetic differentiation occurred, which accounted for about 2.94% of the total genetic variation. This result was in line with previous work on nuclear ISSRs marker variation in the natural populations of N. cadamba in Malaysia [5]. However, a major proportion of total genetic variation (more than 60%) occurred among populations for the mtDNA markers. These contrasting patterns could also be inferred from the phylogenetic relationships among individuals derived from the mtDNA markers versus the nrDNA ITS markers.
Two possible explanations could be responsible for such contrasting patterns. One explanation is related to the asymmetric dispersal between nuclear and mitochondrial genes. The fruits of N. cadamba are fleshy and spherical. Mature cones are about 6 cm in diameter [32]. Seeds are mainly locally distributed because the heavy cones are affected by gravity or are dispersed by animals. Most fallen seeds decay in the wet soil. Although mature cones are edible, cones are rarely directly consumed in China. Human activities could not effectively contribute to seed flow. However, the pollen of N. cadamba is windborne and can spread to long distances, especially in the small and scattered populations in South China where barriers to pollen flow are weak. The nrDNA lTS markers may be dispersed through both pollen and seed flow. The mtDNA markers are dispersed through seed flow only. The spatial distribution of mtDNA haplotypes implies restrictive seed flow among populations despite absent IBD effects. Therefore, these differences yielded the contrasting patterns of population structure.
The second explanation is related to the reproductive ecology of N. cadamba although its mating system has not been scored. The mode of the reproductive system influences pollen dispersal and species range [67,68]. The high genetic diversity but small population genetic differentiation, observed from analysis of the neutral nrDNA ITS markers, imply that N. cadamba is potentially predominantly outcrossing, which facilitates pollen flow. Our estimate of the relative rate of pollen to seed flow supported the hypothesis that pollen flow played a dominant role in shaping the population genetic structure of N. cadamba. Compared with other tropical plants, such as Laelia rubescens (m p /m s = 13.67) [69], Dactylorhiza umbrosa (8.4-12.01) [70], and T. ciliata (1.3741 at the species level) [49], N. cadamba more heavily relies on pollen dispersal to shape population structure.

Implications for Genetic Resource Management
Two implications could be gained from the findings of this study. One implication is concerned with genetic conservation since population structure provides fundamental information for managing genetic resources [71]. Although genetic conservation is currently not a major issue in this species, it could be potentially important in the future [26]. Because of its economic and medical value, N. cadamba has been overharvested and, therefore, the natural population density is declining due to its narrow distribution in South China. Our results implicate that most local populations did not undergo demographic changes. Thus, on the evidence of small population differentiation, it would be appropriate to focus on a few populations in genetic conservation.
The second implication is concerned with breeding that is currently being conducted [16,17]. Although small population differentiation was observed with the neutral nrDNA ITS markers, the small population differentiation could also occur for some adaptive quantitative traits. Previous provenance trials showed that significant but small differences were present among provenances in tree height, DBH, and wood volume [17]. From the results of both population structure and provenance trials, it is speculated that provenance selection could be effective only for the adaptive traits but not for the neutral or nearly neutral traits. However, the genetic gain from provenance selection could be small, given the extensive pollen flow that effectively reduced population genetic differentiation. Therefore, artificial selection should concentrate on selection from families within provenances, or selection from individuals within families within provenances in breeding [72,73].

Conclusions
N. cadamba is a fast-growing timber tree species in South China. The species, also known as a miraculous tree, is exploited as a medicinal plant in addition to its use for industrial purposes. Here, we used both the mtDNA and nrDNA ITS markers to investigate ten populations that covered most of the natural distribution of N. cadamba in South China. Genetic diversity for both types of markers was randomly distributed in space, without phylogeographic structure. The genetic diversity was high from the analysis of the nrDNA ITS markers but low from the analysis of the mtDNA markers. Population genetic variation was mainly distributed within populations for the nrDNA ITS markers but among populations for the mtDNA markers. Effects of isolation by distance were absent among populations. The phylogenetic analysis showed close genetic relationships among individuals of the ten populations or among populations. The relative rate of pollen to seed flow was much greater than one, implying that pollen flow played a dominant role in contributing to gene flow of N. cadamba. Populations had not experienced expansion events and the nrDNA ITS sequences were selectively neutral. The overall results provide fundamental information for genetic conservation and breeding of N. cadamba in South China.

Supplementary Materials:
The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/genes14040855/s1, Table S1: Dataset of 13 pairs of mitochondrial DNA primers tested in N. cadamba; Table S2: Dataset of 161 samples of mitochondrial sequences of N. cadamba. Each sample was a concatenated sequence of F1-R1 and F2-R2 amplified segments; Table S3