Analysis of Genetic Diversity and Structure Pattern of Indigofera Pseudotinctoria in Karst Habitats of the Wushan Mountains Using AFLP Markers

Indigofera pseudotinctoria Mats is an agronomically and economically important perennial legume shrub with a high forage yield, protein content and strong adaptability, which is subject to natural habitat fragmentation and serious human disturbance. Until now, our knowledge of the genetic relationships and intraspecific genetic diversity for its wild collections is still poor, especially at small spatial scales. Here amplified fragment length polymorphism (AFLP) technology was employed for analysis of genetic diversity, differentiation, and structure of 364 genotypes of I. pseudotinctoria from 15 natural locations in Wushan Montain, a highly structured mountain with typical karst landforms in Southwest China. We also tested whether eco-climate factors has affected genetic structure by correlating genetic diversity with habitat features. A total of 515 distinctly scoreable bands were generated, and 324 of them were polymorphic. The polymorphic information content (PIC) ranged from 0.694 to 0.890 with an average of 0.789 per primer pair. On species level, Nei’s gene diversity (Hj), the Bayesian genetic diversity index (HB) and the Shannon information index (I) were 0.2465, 0.2363 and 0.3772, respectively. The high differentiation among all sampling sites was detected (FST = 0.2217, GST = 0.1746, G’ST = 0.2060, θB = 0.1844), and instead, gene flow among accessions (Nm = 1.1819) was restricted. The population genetic structure resolved by the UPGMA tree, principal coordinate analysis, and Bayesian-based cluster analyses irrefutably grouped all accessions into two distinct clusters, i.e., lowland and highland groups. The population genetic structure resolved by the UPGMA tree, principal coordinate analysis, and Bayesian-based cluster analyses irrefutably grouped all accessions into two distinct clusters, i.e., lowland and highland groups. This structure pattern may indicate joint effects by the neutral evolution and natural selection. Restricted Nm was observed across all accessions, and genetic barriers were detected between adjacent accessions due to specifically geographical landform.


Introduction
As a fundamental component of biodiversity, genetic diversity is indispensable to the population sustainability and evolution of plant species in changing environments. Its levels and patterns reflect the evolutionary history of the species (e.g., shifts in distribution, habitat fragmentation, population isolation, bottleneck effects and founder events) and the interactions of various processes, such as genetic drift, breeding system, selection and gene flow [1,2]. Consequently, reliable estimates of population differentiation may play a crucial role in understanding connectivity between populations and could be a prerequisite for conserving and using sources of wild germplasm [3]. Some studies on finer-scale spatial genetic structure have shown the importance of gene flow and its impact on genetic structure at a local scale [4]. Gene flow interruption is the primary step towards reproductive isolation under geographical isolation, thereby increasing interspecific differentiation and inbreeding while decreasing intraspecific variation [5]. However, at a small spatial scale, micro-geographical differentiation among local populations without discrete isolation is attributed more to species characteristics such as population size, breeding system and seed dispersal than to selection [6]. These neighboring populations may suffer from habitat fragmentation that increases spatial isolation, reduces population size and density, leads to strong divergence and decreases genetic diversity [7,8].
Although limitation of gene flow remains the main reason to spatial genetic structure, historical events, eco-geographic conditions and anthropic effects can also influence the spatial distribution and structure pattern of a species [9,10]. Typical of mountain ranges, elevation gradients are special cases of landscape variation that influence both genetic structure and diversity of populations, particularly those located at the upper and lower distributional limits [11]. Driven by the joint effects of natural selection and neutral evolution, ecological divergence could interfere the balance between genetic drift and gene flow across the landscape and alter genetic structure within a species [12,13]. If natural selection has affected the genome of a species, the outlier loci should show limited differentiation between populations in the case of balancing selection, or extensive differentiation if divergent selection has had an effect. Alternatively, if natural selection has not affected specific sites throughout the genome, there should be no evidence of outlying loci [14]. Therefore, analysis across multiple populations allows separation of demographic and environmental factors in determining population structure of species, and infers the primarily diverged force due to selectively neutral evolution or natural selection [15].
Karsts represent distinctive landforms and hydrological systems on Earth. Karst regions are extensive in southern China and have generated the extraordinary floras present on these landscapes [16,17]. The high edaphic and topographic heterogeneity of karst landscapes offers a multitude of ecological niches for plants to diversify and specialize; accordingly, more than 4287 plant species have been found in karst areas in southern China, of which approximately 28% are karst endemics [18]. However, karst ecosystems are extremely fragile eco-geo-environments. The high dissolution rate of soluble and porous bedrock (e.g., carbonate rocks, limestone) makes the soil alkaline and barren with a double-layered structure, creating conditions of low soil water content and periodic water deficiency, which are harmful to the growth and reproduction of plant communities [19]. In addition, due to topographic heterogeneity (e.g., deep canyons, caves and rivers formed by geological events) and global warming, many endemic plants in karst areas tend to suffer from fragmentation of suitable habitats and reduction in species diversity [20,21]. Furthermore, the unique biota in karst areas has been sharply affected by recent human activities, such as large-scale logging, agricultural expansion, livestock overgrazing and fire, which have led to extensive water and soil erosion and serious land degradation in the form of rocky desertification, reducing the stability of the karst ecosystem [22,23]. Under increasing habitat pressure caused by both human disturbance and global climate change, many plant communities in the karst area have become rare or endangered, especially species endemic to the area [16]. However, little is known about naturally heterogeneous landscape affecting genetic diversity and structure in endemic plants to karst mountains in southern China, especially at fine spatial scales. The information on the distribution of genetic variation over the scale of small geographical regions is crucial to our understanding of the ecological adaptation of any endemic species.
Indigofera L., a large pantropical genus belonging to the Fabaceae, contains 750 species worldwide and occurs on all major land masses, most abundantly in Africa and Asia [24,25]. Among these, Indigofera pseudotinctoria Mats (false indigo), a perennial diploid shrub (2n = 2x = 16) [26], is an economically and agronomically important species distributed in China, Japan, and Korea, occurring sparsely in grasslands, hillsides and shrublands at elevations ranging from 100 m to 2000 m [27]. Because of its strong nitrogen fixation and adaptation capacity as well as its various organic compounds and mineral elements, it serves as medicine and animal feed and plays an important role in ecological protection in southern China, especially in water and soil conservation, by providing slope stability in karst areas [28,29]. Although this plant can effectively resist environmental degradation in karst landscapes, it suffers from a fragmented distribution caused by human disturbance and climate change; the population size and density of I. pseudotinctoria have shown a continuous and rapid decrease, even threatening the karst ecosystem [30]. Therefore, a good knowledge of its habitat factors as well as the genetic diversity and structure of native I. pseudotinctoria populations will provide a better understanding of the relevant ecological and genetic processes, which is a necessary prerequisite for both ecological restoration and propagation of the remnant populations in the karst areas of southern China [31].
The amplified fragment length polymorphism technique (AFLP) is a high resolution PCR-based, multi-locus molecular marker assay with high reproducibility with no requirement for prior sequence information of the genome being studied [32]. It is capable of generating a large number of reproducible fragments with genome wide distribution [3,32]. Due to high efficiency and reproducibility of AFLP, it has been widely used to assess genetic relationships and intraspecific genetic diversity of wild or natural plant populations [5,33]. To date, no AFLP marker has been used to assess the genetic diversity of native I. pseudotinctoria populations of China. The objective of present study were to estimate genetic structure and diversity in local I. pseudotinctoria accessions in Wushan Mountain of southern China, among which all collection sites could represent different eco-climatic conditions. Since increased environmental variability on a small geographical scale favors genetic differentiation influenced by differential local adaptation [4,5,34], our hypothesis was that genetic accession structure would exist within I. pseudotinctoria collections from Wushan Mountain in Southwest China, and that genetic variability would be associated with certain environmental factors (altitude, annual mean temperature and annual precipitation, etc). This information may be useful to elaborate the relationship between local adaptation and genetic diversity as well as make the desirably managing strategies of mesquite resource.

Gene Diversity of I. pseudotinctoria
At species level, amplification and polymorphism of the six AFLP primer pairs were showed in Table 1. A total of 515 AFLP bands were clearly visible in 364 individuals, with an average of 85.83 bands generated by each pair of primers. Among all the bands, 324 (62.91%) were polymorphic. The number of polymorphic bands ranged from 49 for primer pair E59M55 to 57 for E76M85 and E85M60, while the percentage of polymorphic bands varied from 54.90% for primer pair E53M55 to 70.37% for E76M85. In addition, the polymorphic information content (PIC) for each primer pair ranged from 0.1690 to 0.3029, with an average of 0.2243. The Nei's gene diversity (H j ) of primer was between 0.2308 for primer pair E53M55 and 0.3980 for E76M85. The Shannon information index (I) of primer was also computed, and the results varied from 0.1998 to 0.3189 with a mean of 0.2774. The parameters of genetic diversity of 15 accessions are shown in Table 2. The highest number of polymorphic loci (N p ) was 296 in ace.E, with 91.36% polymorphic loci, while the lowest N p was 202 in popO, with 62.35% polymorphic loci. A linkage disequilibrium (LD) analysis for each accession was conducted that the percentage of LD ranged from 9.01% in acc.B to 12.72% in acc.F, while the total rate of LD was 10.73%. This result revealed relatively low recombination and replacement. The observed number of alleles per locus (N a ) ranged from 1.6235 to 1.9136, and the effective number of alleles per locus (N e ) varied from 1.2874 in acc.F to 1.4491 in acc.B, with average values of 1.7757 and 1.3661, respectively. Different genetic diversity indexes were calculated and showed high genetic diversity among accessions. Nei's gene diversity index (H j ) ranged from 0.2020 in acc.F to 0.2992 in acc.B, the overall H j was 0.2986, and the average was 0.2465. Similarly, the Shannon information index (I) found the highest diversity within acc.B (0.4130) and the lowest within acc.F (0.2708); the total H o was 0.4291, and the mean was 0.3407. In contrast, the Bayesian gene diversity index (H B ) revealed the highest diversity within acc.A (0.2869), although the lowest diversity was still found within acc.F (0.1993), and the total and average of H B were 0.2895 and 0.2362, respectively.  For studying the genetic structure of I. pseudotinctoria, an unweighted pair-group method with arithmetic mean (UPGMA) clustering analysis was conducted to estimate the genetic relationships among the 15 accessions using Nei's genetic distance matrix (Figure 1a). The result showed that these accessions were classified into two main clusters, which was confirmed by Bayesian model-based cluster analysis with a clear peak value of the ∆K statistic (Figure 1b) at K = 2. In addition, STRUCTURE plots were created for 15 accessions (Figure 1c) and 364 individuals (Figure 1d) at K = 2. The cluster analysis (UPGMA tree and STRUCTURE analysis) strongly suggested that these accessions gathered into clusters based on similarity of elevation. The Cluster I contained acc.A, acc.B, acc.C, acc.D, acc.M, acc.N, and acc.O, which were situated at 311~913 m. The Cluster II contained the remaining eight accessions, from acc.E to acc.L, which were situated at 1002~1612 m. Moreover, each cluster could be divided into two sub-clusters. The first sub-cluster belonging to Cluster I contained acc.C and acc.D, and the second sub-cluster of Cluster I contained acc.A, acc.B, acc.M, acc.N and acc.O. In Cluster II, acc.E to acc.I and acc.J to acc.L represented first sub-cluster and second sub-cluster, respectively. Overall, the cluster analysis demonstrated that 15 accessions gathered into two clusters that could be approximated by elevation. To further investigate accession structure, principal coordinate analysis (PCoA) was performed on the entire dataset of 324 genotypes ( Figure 2). PCoA based on a similarity matrix explained 59.56% and 3.37% of the genetic variation with the first and the second PCoA axes, respectively. This result revealed that these individuals could basically be divided into two groups. The individuals from acc.A, acc.B, acc.C, acc.D, acc.M, acc.N, and acc.O gathered together in the upper part of the plot. The individuals from acc.E to acc.L mainly gathered at the lower part of the plot. This result confirms that of the cluster analysis. Furthermore, the individuals within each of acc.A, acc.B, acc.F, acc.G, and acc.K gathered relatively closely, and this result was also visualized in the STRUCTURE analysis.  To further investigate accession structure, principal coordinate analysis (PCoA) was performed on the entire dataset of 324 genotypes ( Figure 2). PCoA based on a similarity matrix explained 59.56% and 3.37% of the genetic variation with the first and the second PCoA axes, respectively. This result revealed that these individuals could basically be divided into two groups. The individuals from acc.A, acc.B, acc.C, acc.D, acc.M, acc.N, and acc.O gathered together in the upper part of the plot. The individuals from acc.E to acc.L mainly gathered at the lower part of the plot. This result confirms that of the cluster analysis. Furthermore, the individuals within each of acc.A, acc.B, acc.F, acc.G, and acc.K gathered relatively closely, and this result was also visualized in the STRUCTURE analysis. To further investigate accession structure, principal coordinate analysis (PCoA) was performed on the entire dataset of 324 genotypes ( Figure 2). PCoA based on a similarity matrix explained 59.56% and 3.37% of the genetic variation with the first and the second PCoA axes, respectively. This result revealed that these individuals could basically be divided into two groups. The individuals from acc.A, acc.B, acc.C, acc.D, acc.M, acc.N, and acc.O gathered together in the upper part of the plot. The individuals from acc.E to acc.L mainly gathered at the lower part of the plot. This result confirms that of the cluster analysis. Furthermore, the individuals within each of acc.A, acc.B, acc.F, acc.G, and acc.K gathered relatively closely, and this result was also visualized in the STRUCTURE analysis.

Genetic Differentiation and Gene Flow
For estimating the partitioning of genetic variance within I. pseudotinctoria, an analysis of molecular variance (AMOVA) was conducted ( Table 3). The results attributed 22.17% of the variance to differences among accessions, while 77.83% could be attributed to variability within accessions. Based on cluster analysis, we divided the 15 accessions into two Clusters according their elevation. The accessions situated at 311~913 m were considered the lowland Cluster (corresponding to Cluster I in UPGMA tree and STRUCTURE analysis), and the accessions at elevations of 1002~1612 m were considered the highland Cluster (corresponding to Cluster II in UPGMA tree and STRUCTURE analysis). Then, AMOVA was calculated for the two Clusters, which showed that 8.98% of the genetic variance occurred between the two classified Clusters, while 25.43% of the genetic variance occurred within accessions. To further study the variance of I. pseudotinctoria at a small scale, a variety of genetic differentiation coefficients were calculated (Table 4).

Variable
All   Table 5. The smallest mean deviance information criterion (DIC), 19,375.6, was obtained under the f = 0 model, suggesting that the f = 0 model was more suitable than other models for this dataset. The mean θ B (analogous to F ST ) in the f = 0 model was 0.1844 (standard deviation (SD) = 0.0022) among accessions, and the magnitude of genetic differentiation was analogous to the above coefficients.
In addition, gene flow (N m ) was calculated to be 1.1819 individuals per generation between accessions (Table 4). Further, a genetic barrier prediction analysis using Monmonier's maximum difference algorithm revealed three likely barriers to gene flow ( Figure 3). The first barrier (aa) isolated the acc.A site from the rest of the collection sites, the second (bb) separated acc.M and acc.N from the neighboring accessions, and the third (cc) was predicted to isolate the acc.O site from acc.B. Moreover, we have calculated the pairwise F ST and N m for 15 I. pseudotinctoria accessions (Table 6). It revealed that the highest genetic differentiation (F ST = 0.2730) and lowest gene flow (N m = 0.6658) were between acc.A and acc.F. According to above barrier analysis, the F ST and N m between acc.A and acc.O were 0.2176 and 0.8989, respectively. mean θ B (analogous to FST) in the f = 0 model was 0.1844 (standard deviation (SD) = 0.0022) among accessions, and the magnitude of genetic differentiation was analogous to the above coefficients. In addition, gene flow (Nm) was calculated to be 1.1819 individuals per generation between accessions (Table 4). Further, a genetic barrier prediction analysis using Monmonier's maximum difference algorithm revealed three likely barriers to gene flow (Figure 3). The first barrier (aa) isolated the acc.A site from the rest of the collection sites, the second (bb) separated acc.M and acc.N from the neighboring accessions, and the third (cc) was predicted to isolate the acc.O site from acc.B. Moreover, we have calculated the pairwise FST and Nm for 15 I. pseudotinctoria accessions (Table 6). It revealed that the highest genetic differentiation (FST = 0.2730) and lowest gene flow (Nm = 0.6658) were between acc.A and acc.F. According to above barrier analysis, the FST and Nm between acc.A and acc.O were 0.2176 and 0.8989, respectively.

Genetic Diversity Associated with Environmental Factors
Among the 324 AFLP loci, the F ST values of 21 loci (6.48%) were significantly high deviating from 95% confidence intervals In Mcheza program, thus the 21 loci were potential outliers under divergent selection (Figure 4)

Genetic Diversity Across all Accessions
Genetic diversity is a vital basis for species studies because its amount and distribution are likely to affect the evolutionary potential of species or populations, which can include expanding distribution range and adapting to habitat [35]. Otao et al. [27] have researched the genetic diversity of three I. pseudotinctoria populations comprising two native populations in Japan and one population transported from China using 14 microsatellite primers and found no LD in any of the three studied populations, while 10.73% LD was detected in our study. Theoretical studies have noted that many interrelated factors are involved in the formation of patterns of LD, including genetic drift, population admixture, and natural selection [36,37]. The heterozygosity (He = 0.346) and effective number of alleles per locus (Ne = 3.310) found by Otao et al. were higher than in our study (Hj = 0.2986, Ne = 1.4433). These differences are due to inherent differences between the AFLP and microsatellite DNA (simple sequence repeat, SSR) techniques because SSR markers are co-dominant, whereas AFLP is dominant [38].
Most of the genetic diversity of I. pseudotinctoria is distributed within accessions (77.83%). This high level of genetic diversity within accessions may be because I. pseudotinctoria is an outcrossing perennial species; such species usually contain higher genetic diversity than that in autogamous annual species, as shown in other studies [39][40][41]. The pattern of genetic diversity found in I. pseudotinctoria most likely reflected its ecological and life history traits; that is, historical fragmentation derived from geological movement and subsequent climatic oscillation would hinder gene flow within accessions. A high-longevity species rather than losing genetic diversity could maintain genetic diversity over a longer period [42,43]. For the interpretation of FST, it has been suggested that a value lying in the range a value between 0.15 and 0.25 meant great differentiation

Genetic Diversity Across all Accessions
Genetic diversity is a vital basis for species studies because its amount and distribution are likely to affect the evolutionary potential of species or populations, which can include expanding distribution range and adapting to habitat [35]. Otao et al. [27] have researched the genetic diversity of three I. pseudotinctoria populations comprising two native populations in Japan and one population transported from China using 14 microsatellite primers and found no LD in any of the three studied populations, while 10.73% LD was detected in our study. Theoretical studies have noted that many interrelated factors are involved in the formation of patterns of LD, including genetic drift, population admixture, and natural selection [36,37]. The heterozygosity (H e = 0.346) and effective number of alleles per locus (N e = 3.310) found by Otao et al. were higher than in our study (H j = 0.2986, N e = 1.4433). These differences are due to inherent differences between the AFLP and microsatellite DNA (simple sequence repeat, SSR) techniques because SSR markers are co-dominant, whereas AFLP is dominant [38].
Most of the genetic diversity of I. pseudotinctoria is distributed within accessions (77.83%). This high level of genetic diversity within accessions may be because I. pseudotinctoria is an outcrossing perennial species; such species usually contain higher genetic diversity than that in autogamous annual species, as shown in other studies [39][40][41]. The pattern of genetic diversity found in I. pseudotinctoria most likely reflected its ecological and life history traits; that is, historical fragmentation derived from geological movement and subsequent climatic oscillation would hinder gene flow within accessions. A high-longevity species rather than losing genetic diversity could maintain genetic diversity over a longer period [42,43]. For the interpretation of F ST , it has been suggested that a value lying in the range a value between 0.15 and 0.25 meant great differentiation so our study has revealed high genetic differentiation among accessions (F ST = 0.2217). This explanation might be the small size and fragmented distribution of I. pseudotinctoria accessions due to heterogeneous environment and hunman activities. So gene flow among accessions is limited and genetic drift might lead to the loss of genetic diversity and increase of genetic differentiation for some accessions, especially for those accessions with small size [44,45].

Genetic Diversity Related to Geographic and Climate Parameters
Autocorrelation, derived from the first law of geography, is a very general property of ecological variables along a time series (temporal autocorrelation) or across geographic space (spatial autocorrelation) [46], and it has been used in plant in situ conservation. According to spatial autocorrelation analysis, populations that are physically closer together are more easily affected by each other with respect to intra-population diversity and inter-population variance [47]. In this study, the spatial autocorrelation analysis was conducted with Moran's Index, revealing very slight interactions among I. pseudotinctoria accessions (r = 0.146, p < 0.01); thus, all the accessions were suitable, and all of them were used for calculations of genetics with no bias.
Remarkable variation occurs in the distribution patterns of genetic population diversity along elevation gradients, and greater genetic diversity is generally found within lowland populations [48,49]; however, genetic diversity peaks on higher slopes are also observed. In this study, a regression analysis between [F ST /(1 − F ST )] and geographic altitude revealed a strong and significant negative correlation (r = 0.755, p < 0.01) between within-accession diversity and site elevation, i.e., lower diversity at highland sites and higher diversity at lowland sites. As an insect-pollinated plant species, I. pseudotinctoria is thought to breed by xenogamy, which requires pollinators for outbreeding [25]. Schuster et al. [12] have found that large-scale elevation gradients represented significant barriers to the movement of pollinators and seed dispersers, limiting gene flow and further aggravating the divergence among upper-and lower-elevation Podocarpus parlatorei populations [50]. The empirical research revealed that lower size, density and visiting frequency of pollinators at higher altitudes [49]; in other words, I. pseudotinctoria accessions at the upper distributional range would have less opportunity for allelic exchange than lowland accessions that exhibit less genetic diversity. Because the ecological isolation of I. pseudotinctoria in karst areas driven by habitat heterogeneity and fragmentation could hinder gene flow among accessions and cause genetic erosion by genetic drift, local adaptation and inbreeding [50,51].
Climate factors such as temperature and precipitation can strongly influence population diversity and structure for a species [9]. A slight correlation was found between [F ST /(1 − F ST )] and annual mean temperature (r = 0.250, p < 0.01) in this study. In fact, temperatures decline along with rising elevation gradients; thus, the extremely low temperature at peaks can threaten local life, decreasing population diversity and size and even resulting in extinction [42]. This is a potential reason to explain the lower genetic diversity in higher altitudes. Because of the difference of precipitation among these sampling sites was very little, so that the annual precipitation was not a factor affecting accession diversity of I. pseudotinctoria at this small spatial scale (r = 0.050, p < 0.01).

Structure Pattern and Gene Flow
Generally, outcrossing species have higher levels of gene flow (N m ) that can resist the effect of genetic drift within populations and prevent the differentiation of populations [52]. When the value of N m is greater than 1, this resistance is stronger; when the value of N m is less than 1, genetic drift can lead to genetic differentiation among populations [53]. In this study, N m was 1.1819 (p < 0.01) among all sampling accessions, indicating a small magnitude of gene flow among the accessions. This restricted gene flow reflected that the natural landforms in the sampling area led to ecological isolation, which reduced gene flow and increased genetic divergence among the accessions [43].
Reductions in gene flow between accessions due to divergent selection can also enhance the impact of selectively neutral mechanisms of evolution (e.g., genetic drift), further accelerating differentiation between accessions [54,55]. So some genetic differentiation was found between lowland group and highland group (Table 4: F CT = 0.0898). The gene flow (N m ) was calculated for two groups in different elevations Based on F ST . We found that gene flow within lowland group (N m = 1.7421) and highland group (N m = 2.2128) were both high, suggesting that gene flow resisted intra-accession drift and hindered inter-accession differentiation. However, the lower level of N m among the lowland accessions was not consistent with the lower frequency of pollinators at higher altitudes [9], suggesting that the pollinator difference derived from the elevation gradient was not the main factor affecting gene flow in lowland and highland accessions in the present study, but rather the divergent evolution by natural selection under specially geographic landforms. It was demonstrated in divergent outlier analysis that 21 loci were selected under natural evolution. As a preferred leguminous feed for livestock, especially goats, I. pseudotinctoria in lowland sites was sharply disturbed by local animal grazing because these livestock directly change the accession size and distribution by ingestion and spreading the indigestible seeds [56]. In addition, human activities, such as deforestation and urban expansion also partly change the community structure, complexity and heterogeneity and increase the fragmentation of the I. pseudotinctoria in different elevation gradients [18].
The genetic BARRIER analysis using Monmonier's maximum difference algorithm [57] revealed a highly significant genetic barrier (aa) between acc.A and acc.B, and then we calculated the relevant coefficients G ST and N m , which indicated high genetic differentiation (G ST = 0.3162, p < 0.01) and very low gene flow (N m = 0.5406) between them. The second genetic barrier (bb) isolated acc.M and acc.N from the adjacent accessions, resulting in high genetic differentiation (G ST = 0.2498, p < 0.01) and low gene flow (N m = 0.7508). A third genetic barrier occurred between acc.B and acc.O, and similar results were observed (G ST = 0.2184, N m = 0.8947). Although the distance between adjacent accessions is not great, gene flow has been seriously restricted by genetic barriers derived from habitat fragmentation; for example, acc.M and acc.N are located in deep canyons surrounded by high mountains, while acc.O is separated from acc.A and acc.B by the wide Yangtze River. These ecological barriers have promoted divergence between adjacent accessions by the joint effects of neutral evolution and natural selection and even led to reproductive isolation in future generations.

Study Sites and Sampling
Wushan Mountain is located in the northeast of China's Chongqing Municipality, flowed across by the Yangtze River and its several branches. It has a subtropical monsoon climate and the typical karst landform of south China, filled with many deep valleys, sharp cliffs and slopes. The elevation drop from top to bottom is over 2000 m, and the various vegetation types significantly change along with the increase of elevation [58]. We analyzed 364 adult individuals of Indigofera pseudotinctoria collected from 15 native accessions located in Wushan Mountain ( Table 7) Figure 5). The annual mean temperature and annual precipitation in the sampling area were approximately 15.04 • C and 1124.5 mm, respectively. The sampling area was montane karst, and the soils were loess soil and purple soil, mostly alkaline, with soil organic matter varying from 6.2 g/kg in the acc.A site to 66.3 g/kg in the acc.F site. Between 19 and 30 individuals were collected for each accession, and the geographical coordinates of each accession were recorded using a global positioning system (GPS) device. For genetic analysis, samples from each individual consisted of 5-10 young leaves that were preserved in zip-lock plastic bags containing silica gel and stored at −20 • C for DNA isolation.

DNA Isolation
The leaves were rinsed with 70% ethanol, dried and then ground to a fine powder under liquid nitrogen using a mortar and pestle. Genomic DNA was isolated from 200 mg of the ground leaf tissue from each sample using a modified CTAB method [59]. The quality and quantity of genomic DNA were verified by 0.8% (w/v) agarose gel electrophoresis in 1X TBE buffer (0.09 M Tris-borate, 2 mM EDTA, pH 8.3), run 45 min at 100 V, stained with ethidium bromide and visualized under UV light. The DNA obtained was stored at −20 °C in TE buffer (Tris-EDTA) until use.

AFLP Procedure
AFLP reactions were performed as described by Vuylsteke et al. [32] with minor modifications. Six selective pairs of MseI (M-GACGATGAGTCCTGAG) and EcoRI (E-CTCGTAGACTGCGTACC) primers were used in the selective amplification, having been identified as the most informative and reliable from a set of 80 primer pairs during a preliminary investigation. Amplification products were purified with a spin column purification kit for PCR products as following: 1 PCR buffer, 2 mM MgCl2, 2 mM dNTP, 40 ng of each of EcoRI primer and MseI primer, 1 U Taq polymerase, and 5.0 μL

DNA Isolation
The leaves were rinsed with 70% ethanol, dried and then ground to a fine powder under liquid nitrogen using a mortar and pestle. Genomic DNA was isolated from 200 mg of the ground leaf tissue from each sample using a modified CTAB method [59]. The quality and quantity of genomic DNA were verified by 0.8% (w/v) agarose gel electrophoresis in 1X TBE buffer (0.09 M Tris-borate, 2 mM EDTA, pH 8.3), run 45 min at 100 V, stained with ethidium bromide and visualized under UV light. The DNA obtained was stored at −20 • C in TE buffer (Tris-EDTA) until use.

AFLP Procedure
AFLP reactions were performed as described by Vuylsteke et al. [32] with minor modifications. Six selective pairs of MseI (M-GACGATGAGTCCTGAG) and EcoRI (E-CTCGTAGACTGCGTACC) primers were used in the selective amplification, having been identified as the most informative and reliable from a set of 80 primer pairs during a preliminary investigation. Amplification products were purified with a spin column purification kit for PCR products as following: 1 PCR buffer, 2 mM MgCl 2 , 2 mM dNTP, 40 ng of each of EcoRI primer and MseI primer, 1 U Taq polymerase, and 5.0 µL pre-amplified template DNA. The fragments amplified in the latter step were subjected to capillary electrophoresis using an ABI 3500 (Applied Biosystems, Foster City, CA, USA), and the fluorescent fragments of each sample were treated using GeneMarker (version 2.6, SoftGenetics, State College, PA, USA). In order to evaluate the reproducibility of six AFLP primer pairs selected by the pre-test experiment, we compared the profiles from five replicates for 10 randomly chosen plants from all accessions, which yielded the average reproducibility of 96.8%. Non-reproducible markers were not taken into consideration.

Band Statistics
The fluorescent AFLP patterns were visualized as electropherograms, in which each peak of fluorescence corresponded to an amplified DNA fragment. Each AFLP band between 80 and 500 bp was considered a single biallelic locus with an amplifiable (1) and a null (0) allele to generate AFLP binary matrices, then an integrated binary matrix of AFLP phenotypes was constructed for further analysis.

Gene Diversity
Linkage disequilibrium (LD) was calculated using ARLEQUIN version3.0 [60] to detect whether recent accession bottlenecks and extinction-replacement had occurred at a significance level of 0.05. Then, the following genetic diversity parameters were calculated for each primer set: the total number of bands (TNB), polymorphic bands (PB), polymorphism rate (P) and polymorphic information content (PIC). Parameters were calculated using Excel 2013 software using the formula: PIC = 1 − ∑P 2 ij , where P ij is the frequency of the jth allele for the ith AFLP locus. Genetic diversity (H j ) and Shannon information index (I) for primers were calculated in POPGENE (version 1.32, University of Alberta, Edmonton, AB, Canada) [61]. At the species level, the number of polymorphic loci (N p ), percentage of polymorphic loci (PLP), observed number of alleles per locus (N a ), effective number of alleles per locus (N e ), and Shannon information index (I) with its standard error were calculated using POPGENE version 1.32 [61]. Nei's gene diversity (H j ) with its standard error was computed using AFLP-SURV version 1.0 [61] using the approach of Lynch and Milligan and the Bayesian method with non-uniform prior distribution to calculate allelic frequencies assuming Hardy-Weinberg equilibrium [62]. Nevertheless, taking into account a possible bias caused by this assumption, Bayesian gene diversity (H B ) and its standard error were also calculated using Hickory version 1.1 [63], which does not assume Hardy-Weinberg equilibrium within accessions.

Genetic Structure
To further research the genetic relationship among 15 accessions, Nei's genetic distance (GD) matrix of 15 accessions was calculated with a Bayesian method using AFLP-SURV version 1.0 [61] with non-uniform distribution and 10,000 bootstrapping replicates, and then the results were used as inputs for an UPGMA clustering analysis using the CONSENSE module in the PHYLIP software package, version 3.69 [64]. Meanwhile, a PCoA for 364 individuals was computed using R package "stats" [65] to ordinate the spatial relationships among the individuals. Moreover, Bayesian model-based cluster analysis was performed to infer genetic structure and to define the number of clusters in the dataset using the software STRUCTURE, version 2.3.4 [33]. Correlated allele frequencies and an admixed model were applied with a burn-in period of 50,000 and 100,000 MCMC replicates after burn-in. The range of cluster number (K) was predefined from 1 to 14 with 10 runs for each K. The optimum value of K was determined by examination of the ∆Ks and L(K) using STRUCTURE HARVESTER (version 0.6.94, UCSC, Santa cruz, CA, USA) [66]; these metrics usually plateaued or increased slightly after the "optimum K" was reached. A consensus STRUCTURE plot was obtained from the admixture repeats using the greedy algorithm in CLUMPP version 1.1 [67], and final plots were visualized in STRUCTURE PLOT [68].

Differentiation and Gene Flow
To further study the magnitude of genetic differentiation (F ST ), the partitioning of variation at different levels was calculated by AMOVA using "vegan" package [5] in R program with 1000 permutations. Additionally, Nei's coefficient of genetic differentiation (G ST ) was calculated using AFLP-SURV version 1.0 [61] with 10,000 permutations according to the formula: G ST = (H T − H S )/H T , in which H T and H S were total Nei's gene diversity and average Nei's gene diversity within accessions, respectively. Furthermore, Shannon's coefficient of genetic differentiation (G' ST ) was also estimated as G' ST = (I sp − I pop )/I sp , in which I sp was total Shannon diversity and I pop was average Shannon diversity within accessions. Bayes posteriors were obtained for a full model (f, inbreeding coefficient, and θ, accession differentiation, were estimated), a model in which f was set to 0, a model in which θ B was set to 0 and an f -free model (estimates θ B without estimating f ) using Hickory version 1.10 [69]. The DIC was used to estimate the fit of the different models, and those with lower DIC values and differences no larger than 6 units between them were chosen. According to Nei's genetic differentiation (G ST ), gene flow N m was estimated following the method: N m = (1 − G ST )/4G ST ; N m allows accessions to resist the effect of genetic drift and prevents the differentiation of accessions for N m > 1; for N m < 1, genetic drift can lead to genetic differentiation among accessions [70].

Genetic Barriers and Divergence Outlier Analysis
A specific test was devised to suggest historical barriers to gene flow among collection sites using the program BARRIER (version 2.2, Syracuse University, NY, USA) [57], according to Monmonier's maximum difference algorithm with the geographic coordinates of each sampling site and the Nei's GD calculated in AFLP-SURV as inputs. For detecting loci under selection, divergence outlier analysis (DOA) was conducted using the Mcheza program [71]. Pairwise comparison of F ST values between the accessions was conducted with default settings. The expected F ST distribution under the null hypothesis of neutrality was calculated with 50,000 simulations and a false discovery rate of 0.1 within the 95% confidence interval in 10 different simulations. As calculation of the initial mean F ST found in the dataset includes all potentially selected loci (i.e., loci falling outside of the confidence interval), those loci were removed from calculation of the "neutral" mean F ST [72]. The mean simulated F ST was forced to approximate the initial mean F ST by application of a bisection algorithm over repeated simulations.

Correlation Analysis
A spatial autocorrelation analysis was conducted using SAM version 4.0 [73] to detect interrelation among minimum units obeying Moran's Index. Then, a regression analyses between [F ST /(1 − F ST )] and environment parameters were calculated using "ade4" package [74] in R program with 95% confidence level.

Conclusions
The results of this study demonstrate that environmental conditions in different habitats are a major factor influencing the hereditary characteristics of I. pseudotinctoria. The gene diversity of I. pseudotinctoria was high and decreased with increasing elevation. The accessions from lowland and highland areas clustered into separate groups and the structure pattern of I. pseudotinctoria was affected by the joint effects of neutral evolution and natural selection. Higher gene diversity and differentiation occurred in lowland accessions than in highland accessions. Restricted gene flow was observed at different spatial scales, whereas genetic drift was detected in adjacent accessions due to geographic barriers. Suffering from the degeneration due to climatic changing and human activities, the accession size and density of I. pseudotinctoria gradually reduce, thus we should protect the local environment for maintaining the diversity of I. pseudotinctoria. Furthermore, the very high proportion of genetic diversity residing within locations encourages the intensive selection and collection of elite genotypes from each location for the genetic improvement of I. pseudotinctoria.