Genetic Evaluation of Juniperus sabina L. (Cupressaceae) in Arid and Semi-Arid Regions of China Based on SSR Markers

: Juniperus sabina L., a shrub distributed in patches in arid and semi-arid areas of the northern hemisphere, plays an important role in preventing land desertiﬁcation and maintaining ecosystems. However, few studies have reported genetic diversity and genetic structure of widely distributed populations of J. sabina in northwest China. Here, we evaluated the genetic diversity and genetic structure and predicted the isolation barriers among 11 populations based on 20 simple sequence repeats (SSRs). A total of 134 alleles were generated and the average number of alleles per locus was 6.70. The Shannon diversity index ranged from 0.659 to 0.951, with an average of 0.825. Population structure analysis revealed that the populations were assigned into two genetic groups. The analysis of molecular variance (AMOVA) indicated that 88% of genetic variation existed within populations. Moderate population differentiation was occurred with F ST value of 0.090. Finally, we concluded that geographic isolation is the main factor affecting the genetic structure of J. sabina populations. The results of this study provide a foundation for the strategies for J. sabina genetic conservation and management.


Introduction
Juniperus sabina L., savin juniper, is a coniferous evergreen shrub of Cupressaceae family with both sexual and asexual reproduction strategies and erect and creeping growth types [1]. From Spain to Kazakhstan, northern China, Mongolia, and Siberia, there are widely intermittent distribution of this plant [2]. In China, it is naturally distributed in the Tian Mountain to Altai Mountain in Xinjiang; Helan Mountain in Ningxia; Qilian Mountain in Gansu; northeastern Qinghai; Shenmu, Yulin, and Hengshan counties in Shaanxi; the western part of Yin Mountain, Manhan Mountain, Mu Us Sandland, and Hunshandak Sandland. Because of its strong sprouting ability (adventitious root formation after covering them with soil) and its resistance to branch pruning, this shrub species plays an important role in improving the environment in arid and semi-arid areas, preventing land desertification and improving urban landscaping. In addition, J. sabina contains chemical components such as camphor and podophyllotoxin, that have insecticide activity and can also be used as medicine for treating diseases such as rheumatoid arthritis [3,4].
The genetic diversity of a species or group determines the evolutionary potential and the ability to adapt to the environment [5]. The study of genetic diversity can not only reveal the variation types, population genetic structure, and evolutionary characteristics of species, but also clarify the relationship between genetic diversity and geographic distribution, ecological environment, and climate types. Juniperus L. plants have always been a hotspot in genetic diversity [6][7][8][9][10] and phylogenetic [2,7] research. This evergreen species is characterized by its very distinctive, fleshy, 'berry'-like cones show discontinuous distribution patterns around the Mediterranean and North America in the northern hemisphere. Studies have shown that intermittently distributed juniper plants are likely to evolve from the remaining parts of the Madrean-Tethyan vegetation belt which is distributed at low latitudes in the middle of tertiary and is composed of evergreen plant groups [11].
Savin juniper is one of the most widely distributed plants in arid and semi-arid areas of China. However, shrinking habitats due to human activities and extinction caused by drought have influenced its natural distribution. Therefore, it is essential to conserve genetic resources and reconstruct the habitats of this species. Understanding the genetic diversity and genetic structure of natural populations of J. sabina is a basic prerequisite for proposing conservation strategies. Protecting J. sabina resources is of great significance for studying the impact of climate change on plant growth.
In recent years, several studies have reported the genetic diversity and structures of Juniperus plants, such as J. thurifera L. [2,6], J. rigida Siebold et Zucc [7], J. osteosperma (Torr.) Little [8], J. przewalskii Kom. [9], J. communis L. [10], and J. cedrus Webb et Berthel. [12]. Currently, the research on J. sabina mainly focuses on its geographic distribution [2,13], physiological and ecological characteristics [14][15][16][17], reproduction and regeneration [18,19], and biogeography [20,21]. However, there are only a few studies on the genetic diversity of J. sabina in Asia. Only Hong et al. [22] and Geng et al. [23] studied four or two wild populations in Inner Mongolia based on RAPD and SSR markers, respectively. The sampling strategy of small-scale wild populations could not fully reveal the level of genetic diversity of the species. Besides, the special climatic conditions in arid and semi-arid distribution areas in China might play an essential impact on the formation of the genetic diversity of J. sabina.
With the development of molecular biology, using molecular markers to study plant genetic diversity has become a necessary method for the protection, maintenance, and genetic improvement of forest species diversity. Simple sequence repeats (SSRs) are widely distributed in the genome with large number, high rate of polymorphism, good duplicability, and codominant inheritance; therefore, they are regarded as the most effective markers to detect intraspecific genetic diversity and interspecific genetic differentiation [24].
In the present study, we employed 20 SSR markers, based on 333 samples from 11 natural populations of J. sabina in China, to (1) evaluate the level of genetic diversity and structure among populations; (2) explore whether the geographical distance, barrier isolation, and climate factors influenced the genetic structure. We believe that these findings will provide a theoretical basis for more efficient conservation management plans for this juniper species.

Plant Materials
A total of 333 samples were collected from 11 natural populations of J. sabina in Inner Mongolia, Shaanxi, Gansu, and Qinghai provinces (Table 1), with an individual spacing of at least 50 m. We took more than 10g fresh leaves from each sample, dried with silica gel and brought back to the laboratory for storage at −80 • C. The leaves of all samples were preserved in Forest genetics and Breeding Laboratory of Forestry College of Inner Mongolia Agricultural University.

DNA Extraction
Total genomic DNA of all samples was isolated using a Plant Genomic DNA Kit (TIANGEN, Beijing, China). The quality and concentration of DNA were verified by a NanoDrop2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and 1.0% agarose gel electrophoresis. The samples were diluted to a final concentration of 50 ng µL −1 for PCR.

Statistical Analyses
GenALEx 6.5 [27] software was used to assess genetic diversity parameters, including observed number of alleles (Na), effective number of alleles (Ne), observed and expected heterozygosity (Ho and He), Shannon's information index (I), fixation index (F), percentage of polymorphic loci (PPL), inbreeding coefficients (Fis) and perform Hardy-Weinberg Equilibrium (HWE) test. The frequency of null alleles (FNA) was estimated using the Cervus v3.0.7 software [28]. Linkage disequilibrium (LD) was estimated as the correlation coefficient r 2 between all pairs of SSRs using the TASSEL ver. 2.1 software [29]. Pairs of loci were considered to have significant LD if p < 0.01 [30]. The polymorphic information content (PIC) value was calculated by PICcalc software [31]. Nei's genetic distances (D) [32] between populations and individuals were calculated in GenALEx 6.5. The same software was also used for a principal coordinates analysis (PCoA). Cluster analysis was performed based on the unweighted pair-group method with arithmetic means (UPGMA) and neighbor joining (NJ) methods using the genetic distances matrix of 11 populations and 333 individuals in the MEGA 7.0 software with 1000 bootstraps [33], respectively.
To evaluate the degree of differentiation among and within populations of J. sabina, the analysis of molecular variance (AMOVA) was performed by ARLEQUIN v3.11 [34]. We also analyzed the genetic structure of 11 populations using STRUCTURE v2.3.4 software with Bayesian clustering method [35]. The length of the burn-in period and value of MCMC (Markov chain Monte Carlo) was set to 100,000 and 1,000,000 times, respectively. The K value was estimated from 2 to 10 with 10 replicates. After running the program, the results were compress and upload to Structure Harvester [36] to obtain the optimal K value, which was determined according to the relationship between ∆K and K value referring to Evanno et al. [37].
In order to further understand the causes of population differentiation of J. sabina, BARRIER Version 2 [38] software was used to detect the existence of gene flow barriers in populations distribution area. In addition, to explore whether the genetic variation of J. sabina influenced by isolation by distance (IBD) and isolation by environment (IBE), the geographical and environmental distances between populations were calculated. Firstly, for the calculation of the environmental distance matrix, we obtained bioclimatic data of 11 different regions of J. sabina populations from the WorldClim dataset (https://www.worldclim.org/data/worldclim21.html, accessed date: 14 April 2021) at 2.5 arc-min resolution by using ArcGIS software version 10.2. Finally, the genetic distance matrix was compared with the geographic and environmental distance (Euclidean distance) matrices using the simple Mantel test [39] in PASSaGE 2.0 software with 1000 permutations [40]. In addition, bioclimatic variables were used to create the clustering heat map by using R package tool 'pheatmap'.

Results
Among 333 individuals, a total of 134 alleles, ranging from 2 to 14, were generated by 20 SSR loci, with an average of 6.70 alleles per locus, of which JS17 possessed the largest number of alleles (14 alleles), while JS66 generated the smallest number of alleles (2 alleles).
The PIC values ranged from 0.179 (JS5) to 0.751 (JS33) with an average of 0.450 (Table 2). Furthermore, null alleles were found at loci JS15 and JS20. Four loci (JS17, JS20, JS35, and JS61) showed significant deviations from Hardy-Weinberg equilibrium (p < 0.001) ( Table 2). Among the 11 studied populations, the number of alleles (Na) was in the range of 2.8 (NMYQ) to 4.300 (NMTK), with a mean value of 3.641. The highest level of observing heterozygosity was found in NMDM. The highest level of expected heterozygosity was found in NMTK, while the lowest level was found in GS. The Shannon diversity index (SI) varied from 0.659 (GS) to 0.951 (NMTK), along with PPL from 70% to 100% (Table 3). The fixation index (F) was significantly higher than zero, except in NMYQ and GS. The genetic diversity of J. sabina populations was high, and the order of genetic diversity level (SI) of 11 populations from high to low was as follows: NMTK > SXHS > NMDM > NMXM > NMKQ > NMNL > NMAZ > NMTL > QH > NMYQ > GS (Table 3). According to the geographical location of the populations, the populations are divided into three groups: Eastern group (NMDM, NMXM, and NMKQ), Central group (NMTK, SXHS, NMNL, and NMTL) and Western group (NMAZ, QH, GS, and NMYQ), then the genetic diversity of these groups showed the trend of Central group > Eastern group > Western group.
Fis values ranged from 0.010 (NMKQ) to 0.282 (SXHS) with an average of 0.135 (Table 3). In the LD analysis results, only the last 13 rows of data have a P value of less than 0.01, accounting for 6.8% of the total 190, which was significant and in a state of linkage disequilibrium (Table S1), but according to the Wang's [30] study, the data has little effect on the results and can be ignored. The AMOVA analysis was conducted, and the result showed that the total variation mainly occurred among 333 individuals and accounted for 88%, whereas the variation among the populations was only 12% (Table 4). The results of principal coordinate analysis (PCoA) were shown in Figure 1. PCoA of populations revealed that 91.4% of the genetic variation was explained by the first and second axis, and the variation rates of the two axes were 65.9% and 25.6%, respectively ( Figure 1A). Both PCoA analysis of populations and individuals showed that NMYQ, NMTK, NMNL, and SXHS populations were separated from the remaining populations ( Figure 1A The UPGMA dendrogram of 11 J. sabina populations is shown in Figure 2. From the dendrogram, it is clearly visible that the studied populations can be divided into four groups (Figure 2A): NMYQ from Western group; GS, QH and NMAZ from Western group; NMTK, SXHS, and NMNL from Central group; NMDM, NMXM, and NMKQ from Eastern group and NMTL from Central group. Except for NMNL, the clustering results were basically corresponding to the geographical location of the populations (Eastern group: Yin Mountain and Hunshandak Sandland, Western group: Qilian Mountain, Helan Mountain, Central group: Mu Us Sandland). On the other hand, the NJ dendrogram of 333 individuals was consistent with that of PCoA results. NMYQ, NMTK, NMNL, and SXHS populations were clustered into one clade ( Figure 2B). To further analyze the genetic structure of 11 populations, we used STRUCTURE software. The most probable division with the highest ∆K value was detected at K = 2 ( Figure 3A). This implied that the studied populations were mainly divided into two main genetic groups ( Figure 3C). The individual assignment by STRUCTURE also showed that the populations of NMTK, NMNL, NMYQ, and SXHS were clustered separately from the other populations regardless of the K value (2, 3, or 4) ( Figure 3B). The results of the STRUCTURE analysis for the estimated population structure for K = 2 found to be consistent with the PCoA results in Figure 1.
A simple Mantel test identified significant correlations (r = 0.282, p = 0.041) between genetic and geographic distance, proving an influence of isolation by distances (IBD) pattern ( Figure 4B). Isolation by environment (IBE), however, did not contribute to the genetic variability, as proven by a lack of significant correlation between genetic and bioclimatic distances (r = 0.254 p = 0.313) ( Figure 4A). In addition, as shown in Figure S1, the result of the cluster heat map analysis of 11 J. sabina populations which was constructed based on 19 climatic variables was inconsistent with the cluster analysis based on genetic distance (Figure 2A), indicating that the genetic structure of J. sabina may be affected by geographic distance rather than climatic conditions. Barrier v22 software was used to analyze isolation barriers among 11 populations. The results supported the above genetic structure and revealed that there were strong geographical isolation barriers (a and b) between populations of NMYQ, NMTK, and NMNL in Inner Mongolia and SXHS in Shaanxi Province with the other seven populations of J. sabina ( Figure 5). Judging from the geographical location, the barriers may be caused by Badain Jaran Desert and Mu Us Sandland.

Discussion
Compared with the other species in Cupressaceae family, the estimated average number of alleles (6.70) recorded in the this study was much higher than in J. cedrus (4.31) reported by Rumeu et al. [41] and J. brevifolia (Seub.) Antoine reported by Bettencourt et al. [42], and lower than a mean number of alleles (6.08) in Calocedrus macrolepis Kurz reported by Liao et al. [43].
It is well known that SSRs, as putative neutral markers, provide more information's on the neutral evolutionary processes. However, the level of population genetic diversity may represent the adaptive potential of the population [44,45]. The higher the genetic diversity, the stronger the ability to resist the environment. In the present study, the average Shannon diversity index (SI) analyzed by SSR markers was 0.825, which was higher than SI (0.428) reported by Zhang et al. [46] based on Random Amplified Polymorphic DNA (RAPD) markers. Simultaneously, PPL value of the latter was only 80.65%, while the PPL value calculated by SSR markers was 95.91%. The above results indicate that J. sabina has strong adaptability, which is consistent with the characteristics of drought resistance and cold resistance in plant species.
As a measure of genetic diversity, the order of Shannon's diversity index (SI) in different populations reported by Zhang et al. [46] were as follows: Mu Us Sandland (0.386) > Yin Mountain (0.376) > Hunshandak Sandland (0.345) > Helan Mountain (0.281). Similar results were revealed in the present study: NMTK and SXHS located in Mu Us Sandland had the higher SI value (0.951 and 0.891, respectively), followed by NMDM from Yin Mountain, NMAZ from Helan Mountain, and NMXM and NMKQ from Hunshandak Sandland. These results indicate that the genetic diversity of sandland populations is higher than that of mountain populations, and the special habitat conditions of sand land may be the direct reason for the higher genetic diversity among populations. Although the sexual regeneration of J. sabina is relatively low due to poor seed quality, the savin juniper seeds can be buried for a long time by the natural sand cover, and later germinate and grow to form a new community when conditions are suitable [47]. That means that the sandland populations can be renewed quickly and, in that case, gene exchange can be frequent. In addition, according to the field survey, some of the populations in the mountain area were affected by human activities; that could be one of the reasons for generally low genetic diversity of that populations. In addition, ability of multiplication of adventitious roots produced by sand burial and a better environment may be the decisive factors for the higher genetic diversity of sand populations than mountain populations.
In terms of population genetic diversity in the distribution area, the population genetic diversity in the Mu Us Sandland was relatively high. The same is true for other marker research results [20,25]. Therefore, Mu Us Sandland was very likely to be one of the centers of genetic diversity for J. sabina populations and should be conserved in-situ, because a high level of genetic diversity can provide abundant gene selection. Furthermore, there are very few ancient J. sabina individuals in Mu Us Sandland, which may have experienced more historical events and should be protected. In contrast, populations which owed the lower level of genetic diversity like GS and NMYQ should be conserved ex-situ.
The population structure analysis revealed the presence of two genetic groups among the 11 populations of J. sabina. The first group contained NMTK, SXHS, NMNL, and NMYQ; and the second group included NMDM, NMXM, NMKQ, NMTL, NMAZ, QH, and GS. Except for NMYQ, the first group of populations distributed in Mu Us Sandland, indicating that the distribution population of Mu Us Sandland and NMYQ of Longshou Mountain could differ from other populations. During the sampling process, it was found that the NMYQ population in Longshou Mountain was mixed with J. przewalskii. Therefore, interspecific hybridization or introgression may also be the cause of the differentiation of the NMYQ population and other J. sabina populations [48]. However, for other three populations, almost no coexistence with other species was found. We assume that the most important reason for clustering of NMYQ with Eastern populations is that J. sabina originated in the Tertiary Miocene, and there may be multiple refuges in the Quaternary. Among them, existing J. sabina populations distributed in Mu Us Sandland may have evolved from the original vegetation residues on the Loess Plateau [25]. The existing genetic structure is most likely caused by the special biogeographic history of the species.
The results of principal coordinate analysis based on Nei's genetic distance of 333 individuals and 11 populations were consistent with the results of genetic structure analysis, but the cluster diagram of population is slightly divergent from the above groups. NMTK, SXHS, and NMNL belong to the second group, which may be caused by different clustering algorithms.
In this study, we found two barriers between NMYQ, NMTK, NMNL, SXHS, and other seven populations. The barrier found between NMYQ and GS is probably the result of their geographical location; they are located in Longshou Mountain and Qilian Mountain, respectively, that is, on both sides of Hexi Corridor. From history to now, Hexi Corridor is not only an important traffic artery, but also an agricultural area. That is to say, human factors may be the key factor of NMYQ and GS group isolation. Furthermore, the geographical isolation caused by the vast Badain Jaran Desert and Helan Mountain between NMYQ and NMAZ may be the main obstacle to the difference between the two groups. In addition, NMTL, NMNL, NMTK, and SXHS are all located in Mu Us Sandland, but due to the difference of micro topography, isolation barriers may also appear.
Savin juniper is specifically distributed in arid and semi-arid areas of China. It is reasonable to believe that climate may have an impact on the differentiation of population genetic structure, but our Mantel test did not confirm this. Heat map clustering was also inconsistent with the genetic structure. More evidence will be needed in future to explain this statement especially ecological niches under different climatic conditions. In contrast, a significant correlation was found between genetic and geographical distance matrices. This indicates the existence of isolation by distance (IBD) pattern in the researched populations, characterized by increasing genetic differentiation with increasing geographic distance between populations. Similar observations were reported for other woody species as well: Quercus chenii Nakai [49] and Emmenopterys henryi Oliv. [50].
AMOVA analysis based on SSR markers showed that the genetic variation of J. sabina mainly occurred within the population (variation within population accounted for 88% of the total variation), which was highly consistent with the previously published data with RAPD and ITS markers [21,22]. In this study, the genetic differentiation index (F ST ) of the population was 0.090 (p < 0.05), with a moderate degree of differentiation [51]. In addition, the overall gene flow was 2.534, which was a high level. Generally, gymnosperms present higher gene flow than angiosperms, and the high level of gene flow may be the result of high pollen mobility. In other words, the wind, birds, and water can transport pollen of Juniperus plants over a long distance, thus promoting the delivery of high gene flow [42,52].

Conclusions
In the present study, we analyzed 333 individuals from 11 natural populations of J. sabina to assess genetic diversity, population structure and isolation barriers based on 20 SSR markers. Our results suggested that J. sabina has a high level of genetic diversity, with a trend of Central populations > Eastern populations > Western populations. The genetic diversity of mountain populations was lower than that of sandland populations. In addition, we revealed that geographic isolation may affect population genetic structure. Finally, we believe that the findings of this study can provide a strong theoretical basis for the protection, management, and utilization of germplasm resources of studied species. However, further research should advance in analyzing the genetic diversity, population structure in a larger scale, preferably in Eurasia, and pay more attention to NMYQ population because of special variation.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/f13020231/s1, Figure S1: Cluster heat map analysis of 11 J. sabina populations based on 19 climate factors, Table S1: The result of LD analysis.

Data Availability Statement:
Original data presented in the study are included in the main text, and further inquiries can be directed to the corresponding author.