Genetic Diversity and Structure of Pinus densiflora Siebold & Zucc. Populations in Republic of Korea Based on Microsatellite Markers

Pinus densiflora Siebold & Zucc. is a widely distributed conifer species in the Republic of Korea with economic and ecologic importance. However, P. densiflora is negatively influenced by various factors, such as forest fires, clearing of large numbers of trees infected with Pinus disease, and dieback. We aimed to use microsatellite markers to estimate the genetic diversity, differentiation, and structure of P. densiflora populations in the Republic of Korea. A total of 1844 samples from 60 populations were evaluated using 11 polymorphic microsatellite markers. The observed heterozygosity and expected heterozygosity were 0.652 and 0.673, respectively. The mean genetic differentiation among the populations was 0.013. Moreover, P. densiflora showed high genetic diversity and low genetic differentiation compared with conifer species, including Pinus species with similar life histories. Principal coordinates analysis and Bayesian clustering showed that P. densiflora has a weak geographical structure. The P. densiflora population at Mt. Halla, Jeju Island, showed the lowest genetic diversity and significant genetic differentiation compared with other mainland populations due to genetic drift and restricted gene flow. These findings can be useful for designing new conservation, management, and breeding strategies for P. densiflora populations in response to future environmental changes.


Introduction
Pinus densiflora Siebold & Zucc., usually called the Japanese red pine, is a conifer species belonging to the family Pinaceae and genus Pinus L. Japanese red pine is the most widely distributed conifer species in the Republic of Korea [1]. Its distribution in the Korean Peninsula ranges from Hamgyeongbukdo (N 43 • ) to Jeju Island (N 33 • ) [2]. This species is also found throughout Japan, Manchuria, the Shandong Peninsula in China, and eastern Russia [3]. P. densiflora is known to have undergone speciation from Pinus sylvestris L. [4]. P. sylvestris diverged to a variety of species termed 'sylvestriformis' in the Manchuria province of China and eventually differentiated into P. densiflora [5]. Most P. densiflora forests are secondary and mixed stand forests that primarily compete with Quercus mongolica Fisch. ex Ledeb. and Quercus serrata Murray [6]; thus, it is an ecologically important species [7]. Moreover, due to the wide range of local environmental growth, it also exhibits differences in growth depending on the matrix or soil type. Human disturbance is closely related to the wide distribution and decline in P. densiflora forests. Deciduous broad-leaved forests were replaced with P. densiflora during agricultural harvesting of wood for fuel purposes [8]. P. densiflora forests were then conserved by law and subjected to national moderation, including the plantation policy during the Koryo (918-1392) and the Chosun (1392-1910) Dynasties [9]. P. densiflora was specifically used as important building material for the royal palace and Buddhist temples. However, with time the P. densiflora forest declined due to large-scale logging after infection with pine gall midges (Thecodiplosis japonensis) or pine   Table 1.

Microsatellite Analysis
Microsatellite markers have been developed for P. densiflora over the past two decades [26][27][28]. A total of 11 markers were used for the current analysis (Table 2). Polymerase chain reaction (PCR) amplification was performed using a PTC-0240 DNA Engine Tetrad ® 2 Peltier Thermal Cycler (Bio-Rad, Hercules, CA, USA). The reaction conditions were modified and varied depending on the use of different primers in a 12 µL reaction volume containing 4 µL of 20 ng final concentration of DNA, 1.2 µL of 10x buffer, 0.24 µL 10 µM of each primer (forward was labeled with florescent FAM dye), 0.12 µL of 10 mM dNTPs, 0.32 µL of 25 mM MgCl 2 , and 0.24 µL of 0.1 unit Taq DNA polymerase (BioFACT TM , Daejeon, Republic of Korea) to amplify Pdms009, Pdms030, Pdms065, Pdms221, Pde14, and Lop5. Except for Lop5, amplification was conducted using the following cycling parameters: initial denaturation at 94 • C for 4 min, followed by 6 cycles of touchdown consisting of denaturation at 94 • C for 1 min, and annealing at 55 • C for 1 min, decreasing by 1 • C each cycle, extension at 72 • C for 2 min, and subsequently 32 cycles of amplification consisting of denaturation at 94 • C for 30 s, annealing at 52 • C for 30 s, extension at 72 • C for 1 min, and a final extension at 72 • C for 5 min. For the Lop5 primer, amplification conditions included an initial denaturation at 94 • C for 4 min, followed by 5 cycles of touchdown consisting of denaturation at 94 • C for 1 min, annealing at 50 • C for 1 min, decreasing by 1 • C per cycle, extension at 72 • C for 2 min, and 30 cycles of amplification consisting of denaturation at 94 • C for 30 s, annealing at 45 • C for 30 s, extension at 72 • C for 1 min, and a final extension at 72 • C for 5 min. To amplify CPDE0039, CPDE0058, CPDE0060, CPDE0076, and CPDE0106, a 16 µL reaction volume containing 4 µL of 20 ng final concentration of DNA, 1.6 µL of 10× buffer, 0.32 µL of 10 µM of each primer, 0.32 µL of 10 mM dNTPs, 0.96 µL of 25 mM MgCl 2 , and 0.1 U of Taq DNA polymerase was used. The conditions included an initial denaturation at 95 • C for 5 min, followed by 5 cycles of touchdown consisting of denaturation at 94 • C for 30 s, annealing at 61 • C for 30 s, decreasing by 1 • C per cycle; extension at 72 • C for 30 s; and 29 cycles of amplification consisting of denaturation at 94 • C for 30 s, annealing at 56 • C for 30 s, extension at 72 • C for 30 s, and a final extension at 72 • C for 10 min.
The amplification products were fractionated by capillary electrophoresis using an ABI Prism 3730xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). The amplified fragment size was identified using the GeneScan TM -500 ROX size standard (Life Technologies, Carlsbad, CA, USA). Allele binning and genotyping were performed using GeneMapper v5.0 (Life Technologies, Carlsbad, CA, USA). All loci were checked for the occurrence of null alleles, large allele dropout, and stutter bands using Micro-Checker v2.2.3 software [29]. Number of alleles per locus (Na) and polymorphism information content (PIC) per locus were calculated based on allele frequency using the program Cervus v2.0 [30].

Data Analysis
The number of alleles (A) and effective alleles (A e ) were calculated. The observed heterozygosity (H o ) and expected heterozygosity (H e ) were estimated from the allele frequencies. Pairwise values of Nei's genetic distance between populations were estimated. Principal coordinates analysis (PCA) was performed using Nei's genetic distance values. We also performed a Mantel test with 999 random permutations between the matrices obtained for pairwise population differentiation (F ST ) and geographic distance based on the isolation by distance (IBD). All analyses were performed using GeneAlEx 6.41 software [31].
Allelic richness (A R ) was estimated using a minimum sample size of 15 individuals. The inbreeding coefficient (F IS ) for each population and the value of genetic differentiation between pairs of populations (F ST ) were calculated. The significance level of deviation of F IS from zero and pairwise F ST values were determined using the FSTATv2.9.3 software [32] with 1000 permutations.
We employed both the sign test and Wilcoxon signed-rank test for each population using Bottleneck v1.2 software [33]. We estimated the significance value to detect recently bottlenecked populations. We selected both the infinite allele mutation model (IAM) and the two-phase mutation model (TPM) and set the option to test according to previously published analysis on P. densiflora in Japan [34]. The two models were selected because IAM has greater statistical power than the step mutation model (SMM) when using highly polymorphic markers. TPM is suitable when allele size is not equally distributed.
Bayesian clustering analysis was employed to identify the genetic structure of populations using Structure v2.3 software [35]. The number of assumed clusters (K) was set from 1 to 10. Each simulation was performed assuming a K value of 1-10. All simulations were performed ten times. A correlated allele frequency model was selected. The running length was set to 100,000 Markov chain Monte Carlo (MCMC) sampling, after a burn-in period of 100,000 iterations. The ∆K value was calculated according to the method of Evanno et al. [36] using the Structure Harvester program [37]. Clustering analysis results were visualized using CLUMPP v. 1.1.2 [38] and DISTRUCT v1.1 software [39]. The proportional membership of each cluster was generated and visualized. The spatially explicit Bayesian clustering methods facilitated the estimation of genetic structure by considering landscape factors [40]. Bayesian clustering analysis was used to consider spatial distribution with an algorithm implemented in v3.4.1 of the R package Geneland (v4.0.8). We assumed a spatial model with correlated allele frequency and set the cluster numbers from 1 to 5. The simulations were run for 100,000 iterations, using a thinning value of 100. Hierarchical AMOVA was also performed to estimate the degree of genetic differences among clusters using GeneAlEx 6.41 software [31]. We set the number of clusters obtained from the Geneland analysis.

Genetic Diversity within a Population
We

Genetic Differentiation and Population Structure
Pairwise F ST values ranged from 0.004 to 0.049, and the mean genetic differentiation among populations (F ST ) was 0.013 (Table S2). Mt. Halla (code 60) had significantly high F ST estimates relative to other populations (p < 0.001), as depicted in the PCA scatter diagram (Figure 2). Mt. Halla populations were genetically distant from other populations. Using the Mantel test, significant results were identified among the 60 populations (r = 0.343, p < 0.01, Figure S1), indicating a weak correlation between genetic differentiation and geographic distance. From the Structure analysis, the mean value of Ln P(X/K) was estimated to be similar when K was 1-10, which was found to gradually decrease using the method of Pritchard et al. [35] ( Figure 3A). However, the value of ΔK was the highest at K = 5, based on the theory suggested by Evanno et al. [36] (Figure 3B). When K was set at 5, the proportion of the five clusters was visualized in all populations ( Figure 3C). Mt. Halla (code 60) was  From the Structure analysis, the mean value of Ln P(X/K) was estimated to be similar when K was 1-10, which was found to gradually decrease using the method of Pritchard et al. [35] ( Figure 3A). However, the value of ∆K was the highest at K = 5, based on the theory suggested by Evanno et al. [36] (Figure 3B). When K was set at 5, the proportion of the five clusters was visualized in all populations ( Figure 3C). Mt. Halla (code 60) was dominant in cluster 5-V. Other populations showed proportions similar to the five clusters. Four clusters were divided from spatially explicit clustering using the Geneland program ( Figure 4). The proportion of each cluster within the populations was clearer than that in the Structure analysis.  From the Structure analysis, the mean value of Ln P(X/K) was estimated to be similar when K was 1-10, which was found to gradually decrease using the method of Pritchard et al. [35] ( Figure 3A). However, the value of ΔK was the highest at K = 5, based on the theory suggested by Evanno et al. [36] (Figure 3B). When K was set at 5, the proportion of the five clusters was visualized in all populations ( Figure 3C). Mt. Halla (code 60) was dominant in cluster 5-V. Other populations showed proportions similar to the five clusters. Four clusters were divided from spatially explicit clustering using the Geneland program ( Figure 4). The proportion of each cluster within the populations was clearer than that in the Structure analysis.    Table 1.  Table  1.
In contrast to Structure, Geneland detected the genetic structure among populations. However, the genetic differences between the four clusters including Mt. Halla was 1.4%, whereas the difference between the three clusters except Mt. Halla was less than 1% using hierarchical AMOVA (Table 3). Table 3. Hierarchical AMOVA of P. densiflora populations. In contrast to Structure, Geneland detected the genetic structure among populations. However, the genetic differences between the four clusters including Mt. Halla was 1.4%, whereas the difference between the three clusters except Mt. Halla was less than 1% using hierarchical AMOVA (Table 3).  [42]). According to the microsatellite metadata study, genetic diversity is influenced by the life-history parameters of various species, such as life span, geographic range, breeding system, and seed dispersal [43]. P. densiflora is wind pollinated, and its seeds are dispersed by wind [44], whereas the mating system of this species involves outcrossing [45]. The level of species genetic diversity associated with life-history traits such as being long-lived perennials, wide spread, outcrossing, and seed dispersal by wind is high [43]. The level of genetic diversity in P. densiflora (H e = 0.673) was higher than that in species with lifehistory traits such as being a long-lived perennial (H e = 0.  [49]). The mean inbreeding coefficient was 0.048. There have been several studies to investigate heterozygote deficiency in conifer species. For instance, in the P. radiata D.Don study, a high degree of selfing was noted among individuals in the population under bottleneck [50]. The authors noted that homozygote excess was the result of mating among survival-selfing individuals. Additionally, in the P. contorta Douglas ex Loudon study, homozygote excess was interpreted as a subpopulation structure developed from different seed trees during regeneration following fire [51]. In the Tsuga canadensis (L.) Carrière study, it was reported that inbreeding increased due to a decline in large-scale insect outbreaks [52]. From the bottleneck test, we did not find any evidence of a recent bottleneck effect.
Mating system studies for P. densiflora in South Korea showed a high outcrossing rate (t m = 0.798-0.967) [15,53,54]. However, despite a high outcrossing rate within the population, a limited number of effective pollen contributors may result in non-random mating [15]. The number of effective pollen contributors varies depending on stand density, stand age, stand structure, and environmental conditions [54]. If the pollen donor is related, there is a possibility of biparental inbreeding depression. Moreover, genetically biased pollen causes changes in genetic diversity and genetic structure within the population. The estimated F IS value was 0.025 for Buan. This value is smaller than those for Anmyeondo (F IS = 0.067) and Mt. Juwang (F IS = 0.072). Additionally, the number of effective pollen contributors in Buan (N ep = 83.3, [54]) was higher than that in Anmyeondo (N ep = 1.3, [15]) and Mt. Juwang (N ep = 3.9, [53]).

Genetic Differentiation among Populations
The level of genetic differentiation for P. densiflora (F ST = 0.013) was smaller than that of other conifer species in the Republic of Korea, including P. jezoensis (F ST = 0.102; [41]), A. koreana (F ST = 0.053; [42]), and A. nephrolepis (F ST = 0.049; [42]). From the microsatellite metadata study [43], the level of genetic differentiation of species with life-history traits such as being a long-lived perennial (F ST = 0.190), wide spread (F ST = 0.250), outcrossing (F ST = 0.220), and seed dispersal by wind (F ST = 0.130) were low. The level of genetic differentiation in P. densiflora (F ST = 0.013) was lower than that in species with a similar life history. The genetic differentiation was small compared to that of P. densiflora in Japan (F ST = 0.013, [34]), P. koraiensis (F ST = 0.020, [47]), P. sylvestris (F ST = 0.058, [46]), P. strobus (F ST = 0.060, [48]), and P. cambra (F ST = 0.065; [49]). Therefore, the level of genetic differentiation in P. densiflora is small compared to other Pinus species with similar lifehistory traits. These results agree with previous studies based on isozyme (F ST = 0.037; [19]) and ISSR (Φ ST = 0.080; [7]) data. The level of genetic differentiation is affected by gene flow between populations and is determined by pollen and seed dispersal [55]. Gene flow may increase in distant populations. Parentage analysis in P. densiflora has focused on the longdistance dispersal of seeds [56]. As a result of parental analysis of seeds produced in the natural P. densiflora forest,~20% of the sampled seeds had gametes derived from maternal trees in other populations. The effect of gene exchange by seed migration increases when many conspecific populations are distributed around the population [44]. Long-distance dispersal of P. densiflora seeds may have the potential for gene exchange within and among populations, and is likely to represent the primary cause of generally low differentiation among natural populations.

Population Structure
Both PCA and Bayesian clustering suggested that Mt. Halla (code 60) was genetically differentiated from other populations. Pairwise F ST between Mt. Halla and other populations also showed significant genetic differentiation from other populations (Table S2). Bayesian analysis assigned the Mt. Halla population to a distinct genetic cluster. Mt. Halla, located on Jeju Island, was a refuge for a variety of climatic species from the cold-tolerant to subtropical during the late postglacial period, and conifer species including P. densiflora have been habitant in Mt. Halla [57]. Genetic drift has led to the loss of genetic diversity of populations on the island, as well as genetic differentiation from mainland populations [58]. Mt. Halla showed the lowest genetic diversity (H e = 0.609) compared to the other mainland populations. Therefore, we conclude that genetic drift and restricted gene flow have affected the genetic diversity of Mt. Halla.
To understand the genetic structure of populations, it is necessary to consider the interactions with the landscape and environmental features (roads, mountains, and deforested areas) as well as microevolution factors, such as gene flow, genetic drift, and selection [59]. According to a comparative simulation study on Bayesian clustering methodology, it has been suggested that spatially explicit Bayesian clustering analysis should be performed to minimize interpretation errors for species in which populations are continuously distributed [60]. When comparing the two types of Bayesian clustering analysis results, spatially explicit Bayesian clustering depicted the structure between populations. However, hierarchical AMOVA showed weak genetic differentiation between clusters with no geographic patterns identified among the clusters. This is consistent with the weak correlation between geographic and genetic distance.
Wind-pollinated species have a large effective size and weak geographic structure due to extensive pollen-mediated gene flow [61]. As such, the habitat range of P. densiflora extends from mountainous areas to those near human habitats, and can thus be influenced by evolutionary factors and by human activities. Indeed, human activities and changes in land use have been reported to affect the genetic diversity and genetic structure of P. densiflora in Japan [34]. Those authors discussed that agricultural activities have created an open gap in the landscape and caused the rapid expansion of the Pinus forest. In cluster 4-III in our study, Mt. Geumo (code 33), Angang (code 35), and Goseong (code 54) were estimated to have highly significant inbreeding coefficients (Table 1). There was a deficiency in heterozygotes in the population that did not result from bottlenecks (Table 3). Although empirical evidence is lacking, a history of disturbance was detected in these studies. Until the 1970s, residents were logging and cultivating in these regions [62], and in 1994 a forest fire destroyed 12 ha of pine forests. According to recent studies, broad-leaved oak trees are widely distributed and Larix kaempferi has been planted [63]. Additionally, Angang is a region that was heavily impacted by the initial spread of pine wilt disease [64], and in the late 1990s the dieback of pine trees was observed [12]. In 2009, a large-scale dieback in an area of 8416 ha began in the southern regions, including Goseong and Kimhae, which is still ongoing to date.

Implication for Forest Conservation and Management of P. densiflora
According to the climate change scenario (RCP 8.5), 18.7% of P. densiflora forests are expected to decrease by 2070 [65]. In addition, the spread of pine nematode disease, largescale forest fires, and increases in the area of human activity are causes of forest decline. P. densiflora is the most widely distributed species in the Republic of Korea, highlighting the difficulty associated with conserving all populations effectively. Therefore, it is important to establish strategies for the conservation of genetic diversity and sustainable management of species, which can be used to select a conservation-priority population based on the genetic diversity and genetic structure of P. densiflora. The selected population can be designated as a protected area, and it should be conserved and managed over the long term at the national level. Breeding strategies that consider genetic diversity are also important for supplying seeds and sustainable management.
Certain limitations were noted in the current study. For instance, the genetic diversity and structure of P. densiflora populations were investigated using microsatellite markers, revealing neutral microsatellite variation. However, it is not possible to detect associations using adaptation traits. Therefore, to develop a strategy for the sustainable conservation of P. densiflora that withstands climate change, it is necessary to develop DNA markers and estimate genetic variations related to adaptation. Accordingly, genomic studies on P. densiflora and the development of SNP markers related to adaptive traits are underway. However, research on population-level application will require additional studies.

Conclusions
This is the first study to use microsatellite markers to estimate the genetic diversity, genetic differentiation, and genetic structure of P. densiflora populations in the Republic of Korea. Compared to other species with similar life-history traits, we identified high genetic diversity and low genetic differentiation, and a weak geographic structure was observed for the genetic structure of population. These results can be attributed to the continuous distribution of populations and the extensive gene flow between populations by wind dispersal. Although a weak genetic structure was detected, we believe that the results should be of practical significance concerning the current state of species and landscape-based ecological perspectives. P. densiflora in the Republic of Korea continues to be affected by various disturbances, including forest fires, clearing a large number of trees infected with Pinus disease, and dieback. As climate change continues, pine forests are predicted to decrease, which will alter their genetic diversity and structure in response to the continued decrease in the effective size of the population. Our study provides insights into the genetic diversity and genetic structure of P. densiflora and will prove valuable for establishing strategies for conservation, management, and breeding in response to future environmental changes.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10 .3390/f12060750/s1, Table S1: Probability values of 60 P. densiflora populations from bottleneck test, Table S2: Pairwise F ST values among populations of P. densiflora, Figure S1: Mantel test results between population differentiation (F ST ) and geographic distance based on the isolation by distance (IBD).

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.