Genetic Structure, Differentiation and Originality of Pinus sylvestris L. Populations in the East of the East European Plain

To support the conservation and rational use of forest resources, the main forest-forming plant species need to be studied in detail. Scots pine (Pinus sylvestris L., Pinaceae) is mainly found in the boreal forests of Eurasia and is less common in the east of the East European Plain. The aim of this study was to assess the genetic diversity, structure and differentiation of Scots pine populations in the east of the East European Plain. We studied ten populations of P. sylvestris using the Inter Simple Sequence Repeats (ISSR)-based DNA polymorphism detection method. The natural populations are characterized by relatively high levels of genetic diversity (He = 0.167; ne = 1.279; I = 0.253). At the same time, the genetic diversity of the studied populations of P. sylvestris tends to decrease from west to east. Analysis of the genetic structure shows that the studied populations are highly differentiated (GST = 0.439); the intrapopulation component accounts for about 56% of the genetic diversity. Various algorithms for determining the spatial genetic structure show that the studied populations form two groups in accordance with their geographic location. Using a genetic originality coefficient, populations with specific and typical gene pools are identified. They are recommended as sources of genetic diversity and as reserves for the conservation of the genetic resources of the species.


Introduction
The problem of preserving the boreal forests of Eurasia in the face of climate change is highly relevant for Europe and the whole world [1,2]. Studying the genetic diversity, the spatial and genetic structure, and the intra- and interspecific differentiation of coniferous species, which are of great biosphere and resource importance, is one of the important tasks of population biology [3]. Scots pine (Pinus sylvestris L.; Pinaceae), one of the most widespread economically important forest-forming species, plays an extremely important role in the formation of the structure and functions of forest ecosystems [4–6]. Its wood has many beneficial qualities and is versatile in use. In addition to the physical and mechanical properties valued in construction, pine contains a number of useful substances that are promising raw materials for the wood chemical industry. Scots pine contains various biologically active substances (BAS), such as terpenoids, steroids, alkaloids, flavonoids and others [7,8]. Investigations of the genetic diversity and differentiation of P. sylvestris populations are also important for studying how the content of resin acids, which are widely used in the wood chemical industry and are promising BAS, varies in trees under different growing conditions [9–11].
Russia holds two-thirds of the total area of boreal forests and up to 80% of the total reserves of coniferous wood, which are of great economic importance. Although the forest is a renewable resource, the volume of felling exceeds the number of new plantings (renewals). In investigations of crimes related to illegal logging, the main problem is building an expert evidence base [12]. Therefore, it is necessary to develop measures to detect and control illegal logging, as well as methods for identifying wood species at the population level. Only on the basis of accurate information about the population genetic structure of woody plants can one assess their genetic potential and develop a set of measures aimed at preserving genetic diversity in the course of their use and reproduction.
In the east of the East European Plain, Scots pine occupies about 20% of the area of all types of conifers. The largest areas are concentrated in the north of the region. In the subarctic region, P. sylvestris grows continuously from Scandinavia to Siberia. Along the southern boundary of the range, post-glacial warming during the Holocene led to isolated and fragmented populations. Range fragmentation can lead to loss of genetic diversity and population extinction [13].
Large-scale studies of the molecular genetic phylogeography of P. sylvestris have been carried out both in Europe and in Russia [2,4,6,8,11,14–16]. Analysis of mitochondrial marker sequences suggests that P. sylvestris is genetically homogeneous throughout the area from the east of the East European Plain to at least the Yenisei River [15]. A similar picture was obtained earlier on the basis of allozyme markers [15]. However, significantly greater geographical differentiation of P. sylvestris populations is found in the Mediterranean and in the southern part of the distribution range [16]. At the same time, the analysis of chloroplast sequences has revealed significant genetic heterogeneity of the species throughout its distribution area. These different patterns may reflect the use of different types of DNA markers with different modes of inheritance, and may also be a consequence of different genetic processes. However, the genetic conservatism of the internal transcribed spacer of ribosomal genes and of chloroplast gene sequences within a particular species makes these sequences unsuitable for studying intraspecific diversity. Ideally, markers should be widely distributed across the genome [17] and, above all, be applicable to any species, including those that have not yet been studied. Such DNA markers include all PCR-based DNA fingerprinting variants of the Random Amplified Polymorphic DNA (RAPD) method [18], such as Inter Simple Sequence Repeat (ISSR) [19], Palindromic Sequence-targeted (PST) PCR [20] and others [21]. This list can also be supplemented with methods based on interspersed repeat sequences in genomes, including a range of retrotransposon-related elements [22]. ISSR analysis is a simple and accessible method for studying genetic polymorphism in plant species.
Due to the high copy number of microsatellite sequences and their abundance in eukaryotic genomes, the use of SSR sequences for PCR-based DNA fingerprinting is convenient and effective. To obtain a general picture of the genetic and ecological diversity of natural populations and to study the biogeography of Scots pine in the east of the East European Plain, monitoring over a larger area is necessary. Therefore, studying the molecular genetic diversity and structure of P. sylvestris populations of the East European Plain with PCR-based DNA profiling methods is promising for developing and optimizing methods for assessing the state of the gene pools of boreal coniferous species. This is an urgent task for the conservation of forest woody species that are productive and resistant to various environmental factors.
The objectives of this study were to investigate the genetic diversity, population structure and genetic relationships of P. sylvestris samples collected from ten natural populations growing in the East European Plain, using the Inter Simple Sequence Repeats (ISSR) PCR-based DNA profiling technique.

Materials and Methods
Ten natural populations of Scots pine located within the East European Plain in the Russian Federation, six in Perm Krai and four in Kirov Oblast, were selected for the study. The studied populations of P. sylvestris in Perm Krai are located in Berezniki's (Ps_Br), Polazna's (Ps_Pl), Gainy's (Ps_Gn), Karagay's (Ps_Kg), Perm's (Ps_Uk) and Bolshesosnovsky's (Ps_Bs) forestries, and the populations from Kirov Oblast in Darovskoy's (Ps_Dr), Yuryansky's (Ps_Ur), Slobodskoy's (Ps_Sl) and Belokholunitsky's (Ps_Bh) forestries (Table S1 and Figure 1). The plant material was collected from trees located at least 100-150 m from each other. Geographic distances between populations varied from a minimum of 50 km (populations Ps_Sl and Ps_Bh, located in Slobodskoy's and Belokholunitsky's forestries) to a maximum of 516 km (populations Ps_Br and Ps_Dr, located in the northern part of Perm Krai and in Kirov Oblast). Pairwise geographic distances between all studied populations are presented in Table S2.

For the study, needle samples were collected individually from 25-30 trees in each of the ten populations of P. sylvestris. DNA was isolated according to the procedure for complex biological samples [23]. Each needle sample weighed 20 mg. A NanoDrop 2000 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) was used to determine the concentration and quality of the DNA.
The ISSR method was used to assess the genetic diversity and genetic structure of the populations [19,22]. PCR was performed in a 25 µL reaction mixture. Each reaction mixture contained 50 ng of template DNA, 1× PCR buffer with 2.5 mM MgCl2, 1 µM ISSR primer, 0.25 mM of each dNTP, and 2 U of Taq DNA polymerase (Sileks M, Moscow, Russia). PCR amplification was carried out in a SimpliAmp™ Thermal Cycler (Thermo Fisher Scientific Inc.) under the following conditions: an initial denaturation step at 94 °C for 2 min, followed by 32 cycles of 94 °C for 20 s, 52-64 °C (depending on the primer sequence) for 30 s and 72 °C for 60 s, and a final extension at 72 °C for 3 min.
The ISSR primers were designed by Kalendar et al. [24], and in this study ISSR primers for P. sylvestris were used [25]. DNA amplification was conducted using a slightly modified protocol [25,26]. A total of 20 primers were tested initially, and five polymorphic primers producing clearly identifiable and repeatable bands were selected for further analyses (Table 1). The reproducibility of the ISSR profiles was verified by comparing the electrophoretic profiles of randomly selected P. sylvestris samples. Data were generated and compared in three replicates. Gels were then checked to identify ISSR profiles in one or both replicates (original gel photos are provided in the Supplementary Materials, Figures S1-S21). All ISSR primers were tested to assess the genetic diversity of P. sylvestris using PCR amplification for DNA profiling. PCR products were separated by electrophoresis at 70 V for 5 h in a 1.5% agarose gel with 1× TBE buffer, stained with ethidium bromide and photographed in transmitted ultraviolet light using a GelDoc XR gel documentation system (Bio-Rad Laboratories, Inc., Hercules, CA, USA). To determine the length of DNA fragments, a molecular weight marker (100 bp DNA Ladder (Cat. 07-11-00050); Solis BioDyne, Tartu, Estonia) and the Quantity One program (Bio-Rad Laboratories, Inc.) were used. In total, ISSR profiles obtained with five primers were analysed for polymorphism in 293 trees, giving 1465 individual profiles of P. sylvestris (293 trees × 5 primers).
To assess genetic polymorphism and determine the genetic structure of the ten studied natural populations of P. sylvestris, the data, taking into account reproducibility in repeated experiments, were recorded as a binary matrix in which the presence or absence of a fragment of a given size in a profile was scored as 1 or 0, respectively.
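The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration: the function name `build_binary_matrix`, the tree labels and the fragment sizes are hypothetical, and exact size matching stands in for the band-matching tolerance that real gel scoring would need.

```python
from typing import Dict, List, Set, Tuple


def build_binary_matrix(
    bands: Dict[str, Set[int]]
) -> Tuple[List[int], Dict[str, List[int]]]:
    """Turn per-sample sets of scored fragment sizes (bp) into 0/1 rows."""
    # Every fragment size seen in any sample becomes one column (one locus).
    loci = sorted(set().union(*bands.values()))
    # 1 = fragment present in the sample's profile, 0 = fragment absent.
    matrix = {
        sample: [1 if size in present else 0 for size in loci]
        for sample, present in bands.items()
    }
    return loci, matrix


# Hypothetical fragment sizes for three trees amplified with one primer.
scored = {
    "tree_01": {200, 450, 900},
    "tree_02": {200, 900, 1550},
    "tree_03": {450, 900},
}
loci, matrix = build_binary_matrix(scored)
for sample, row in matrix.items():
    print(sample, row)
```

Each column of the resulting matrix is treated as one dominant ISSR locus in the analyses that follow.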
Data were processed using POPGENE 1.31 [27] and the specialized GenAlEx6 macro [28] for MS-Excel to determine the number of alleles (na), the effective number of alleles (ne) [29], the expected heterozygosity (He) and Shannon's information index (I). The following parameters were used to describe the genetic structure of the populations: the expected proportion of heterozygous genotypes in the entire population (HT), as a measure of total genetic diversity; the expected proportion of heterozygous genotypes in a subpopulation (HS), as a measure of intrapopulation diversity; and the share of interpopulation genetic diversity in total diversity, or coefficient of gene differentiation (GST). In addition, the Analysis of Molecular Variance (AMOVA) package was used to calculate the PhiPT index (population subdivision index) with 1000 rounds of permutations [30]. Genetic distances between populations (DN) were determined using the formula of M. Nei and W-H Li [31]. To determine the correlation in the general group of populations between the genetic differentiation parameters (DN and PhiPT) and geographic distances, the generally accepted Mantel test was used [28]. The specificity of the gene pools of P. sylvestris populations was characterized using the genetic originality coefficient (GOC), which makes it possible to characterize populations in terms of the proportion of rare and typical alleles.
Based on the binary matrix, a matrix of genetic distances was calculated [27]. From it, dendrograms reflecting the degree of similarity of the studied populations and trees by their profiles were constructed using the Unweighted Pair-Group Method Using Arithmetic Average (UPGMA) in POPGENE v1.31 (University of Alberta, Edmonton, AB, Canada). In addition, to verify the data obtained, Principal Coordinate Analysis (PCoA) was carried out using the GenAlEx6 program [28].
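As a rough illustration of how the diversity indices named above can be derived from a binary ISSR matrix, the sketch below uses a common treatment of dominant markers. It assumes Hardy-Weinberg equilibrium and estimates the null-allele frequency as the square root of the band-absence frequency. This is an assumption made for the example, not a statement of the exact estimators or corrections applied by POPGENE or GenAlEx in the study; the function names and the toy matrix are hypothetical.

```python
import math
from typing import Dict, List


def locus_stats(column: List[int]) -> Dict[str, float]:
    """He, ne and Shannon's I for one dominant locus (1 = band, 0 = no band)."""
    absent = column.count(0) / len(column)
    q = math.sqrt(absent)  # assumed null-allele frequency under Hardy-Weinberg
    p = 1.0 - q            # frequency of the band-producing allele
    homozygosity = p * p + q * q
    shannon = -sum(f * math.log(f) for f in (p, q) if f > 0)
    return {"He": 1.0 - homozygosity, "ne": 1.0 / homozygosity, "I": shannon}


def population_stats(matrix: List[List[int]]) -> Dict[str, float]:
    """Average the per-locus values over all loci (rows = trees)."""
    per_locus = [locus_stats(list(col)) for col in zip(*matrix)]
    return {
        key: sum(stats[key] for stats in per_locus) / len(per_locus)
        for key in ("He", "ne", "I")
    }


# Toy matrix: four trees, two loci (the second locus is monomorphic).
toy = [[1, 1], [1, 1], [1, 1], [0, 1]]
result = population_stats(toy)
print({key: round(value, 3) for key, value in result.items()})
```

In this toy case the first locus has an absence frequency of 0.25, so q = p = 0.5, while the monomorphic second locus contributes no diversity; the population values are the averages of the two.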
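The UPGMA clustering step can be sketched as follows. This is a minimal pure-Python illustration of the averaging rule, not the POPGENE implementation; the function name `upgma`, the population labels A-D and the distances are invented for the example.

```python
from typing import Dict, FrozenSet, List, Tuple


def upgma(
    names: List[str], dist: Dict[FrozenSet[str], float]
) -> List[Tuple[str, str, float]]:
    """Return the merges as (cluster_a, cluster_b, distance) in merge order."""
    sizes = {name: 1 for name in names}  # cluster label -> number of members
    d = dict(dist)                       # working copy of pairwise distances
    merges = []
    while len(sizes) > 1:
        pair = min(d, key=d.get)         # closest pair of current clusters
        a, b = sorted(pair)
        merges.append((a, b, d[pair]))
        merged = f"({a},{b})"
        for c in sizes:
            if c in (a, b):
                continue
            # Size-weighted mean, i.e. the average over all member pairs.
            dac = d.pop(frozenset((a, c)))
            dbc = d.pop(frozenset((b, c)))
            d[frozenset((merged, c))] = (
                sizes[a] * dac + sizes[b] * dbc
            ) / (sizes[a] + sizes[b])
        del d[pair]
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return merges


# Hypothetical genetic distances between four populations.
example = {
    frozenset(("A", "B")): 0.04, frozenset(("A", "C")): 0.30,
    frozenset(("A", "D")): 0.34, frozenset(("B", "C")): 0.32,
    frozenset(("B", "D")): 0.30, frozenset(("C", "D")): 0.06,
}
merges = upgma(["A", "B", "C", "D"], example)
for step in merges:
    print(step)
```

With these invented distances, A joins B and C joins D before the two pairs are joined to each other, which is the kind of two-cluster topology a dendrogram would display.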

Genetic Diversity
Molecular genetic analysis of ten populations of P. sylvestris revealed 132 ISSR amplicons (Figure 2). Each of the ISSR primers used detected from 24 to 29 PCR amplicons, and the maximum number of amplicons was obtained with primer X10. On average, a single primer produced 26.4 PCR bands. PCR amplicon lengths ranged from 200 to 1550 base pairs. Genetic diversity indices of the studied populations are given in Table 2. The correlation analysis revealed a dependence of genetic diversity on the longitudinal location of populations (r² = 0.5857). In the studied populations of P. sylvestris, the level of genetic diversity decreases from west to east (Figure S1).

Population Genetic Structure
Analysis of the genetic structure of the studied populations of P. sylvestris revealed that the expected proportion of heterozygous genotypes for the total sample (HT) was 0.297, and the expected proportion of heterozygous genotypes in a subpopulation (HS) was 0.167. The coefficient of gene differentiation (GST) shows that the interpopulation component accounts for 0.439 of the total genetic diversity. The greatest differentiation among P. sylvestris populations was found with the CR-215 primer (Table S3). The pairwise PhiPT genetic distances obtained with the AMOVA package varied from 0.139 (Ps_Gn/Ps_Bs) to 0.614 (Ps_Uk/Ps_Bh). Differences in genetic distances between populations were statistically significant (Table S4). For the total sample of P. sylvestris, the PhiPT index was 0.476, which approximately corresponds to GST = 0.439. Thus, the analysis of molecular variance confirmed that genetic diversity is distributed approximately equally between the interpopulation and intrapopulation components (48% and 52%, respectively; Table 3).
Table 3 notes: df, degrees of freedom; SS, sum of squares; MS, mean square; %, percentage of total genetic diversity; p, significance level based on 1000 permutations.
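The relationship between the three quantities reported above can be checked with the standard definition GST = (HT − HS) / HT. The short sketch below applies it to the rounded values of HT and HS given in the text; the function name `gene_differentiation` is hypothetical.

```python
def gene_differentiation(h_t: float, h_s: float) -> float:
    """Share of total gene diversity that lies among populations."""
    return (h_t - h_s) / h_t


g_st = gene_differentiation(0.297, 0.167)
print(f"GST from rounded HT and HS: {g_st:.3f}")
print(f"Intrapopulation share:      {1 - g_st:.3f}")
```

The value recovered from the rounded inputs is about 0.438, slightly below the reported 0.439; the small gap presumably reflects rounding of the published HT and HS or the way the software averages over loci. The complementary share of about 0.56 matches the intrapopulation component quoted in the abstract.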
The smallest genetic distance was noted between the Ps_Uk and Ps_Bs populations (DN = 0.036), and the largest between the Ps_Uk and Ps_Bh populations (DN = 0.337) (Table S5). Based on the matrix of pairwise genetic distances (DN), a cluster analysis was carried out using the UPGMA method, and a dendrogram was constructed reflecting the degree of similarity of the ISSR profiles of the studied populations (Figure 3A). On the dendrogram, the studied populations formed two clusters corresponding to their geographical position.
The division of populations into two clusters is confirmed by the results of Principal Coordinate Analysis (PCoA) carried out on the basis of the PhiPT index calculated using the AMOVA package. Upon ordination, the populations were unevenly distributed (Figure 3B). Two groups were clearly distinguished: the first included the Ps_Gn, Ps_Uk and Ps_Bs populations, which were joined by the Ps_Kg, Ps_Pl and Ps_Br populations. The second group is formed by the Ps_Dr, Ps_Ur, Ps_Sl and Ps_Bh populations.
Thus, analyses using various algorithms for determining the spatial and genetic structure revealed the division of the ten studied populations of P. sylvestris into two groups: Pre-Ural's (Ps_Uk, Ps_Bs, Ps_Kg, Ps_Pl, Ps_Br, Ps_Gn) and North Transvolga's (Ps_Dr, Ps_Ur, Ps_Sl, Ps_Bh).
The spatial and genetic structure of the P. sylvestris populations in the East European Plain was also checked for compliance with the "isolation-by-distance" model. In a pairwise comparison of all ten studied populations, the Mantel test revealed a moderate positive correlation (r² = 0.421) between geographical and genetic (DN) distances (Figure S2).
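For readers unfamiliar with the Mantel test, the sketch below shows the general idea: the correlation between two distance matrices is compared with the correlations obtained after randomly relabelling the populations in one of them. It is a simplified one-tailed version written for illustration, not the GenAlEx routine used in the study; the function names and the four-population matrices are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(m: Matrix, order: List[int]) -> List[float]:
    """Pairwise values above the diagonal after relabelling rows and columns."""
    n = len(order)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(geo: Matrix, gen: Matrix, permutations: int = 999,
           seed: int = 0) -> Tuple[float, float]:
    """Return (observed r, one-tailed permutation p-value)."""
    identity = list(range(len(geo)))
    x = upper_triangle(geo, identity)
    r_obs = pearson(x, upper_triangle(gen, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel populations in the genetic matrix only
        if pearson(x, upper_triangle(gen, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical example: four populations on a line, 100 km apart, with
# genetic distance exactly proportional to geographic distance.
positions = [0.0, 100.0, 200.0, 300.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.001 * value for value in row] for row in geo]
r, p = mantel(geo, gen)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, p = {p:.3f}")
```

The function returns r, so it has to be squared before comparison with an r² value such as the one reported above. With only four populations there are few distinct relabellings, so the p-value in this toy case is not informative.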
When AMOVA was carried out taking the two groups of populations into account, the largest share of the total genetic diversity was found within populations (45%), while variability between the groups of populations accounted for 29% and the interpopulation component within groups made up 26% of all observed genetic diversity. This indicates that differentiation between groups of populations is expressed to the same extent as between individual populations within groups.
Using the approach we proposed for woody plant species to identify specific and typical gene pools, based on the genetic originality coefficient (GOC), the populations in the two identified groups were investigated and characterized [32].

Discussion
In our study of 293 P. sylvestris trees from the east of the East European Plain, five ISSR primers produced 132 PCR amplicons, with an average of 26 highly polymorphic amplicons per primer. Similar studies were conducted for Asian samples (135 trees) using a smaller number of ISSR markers (108 amplicons) [33]. In general, our results on ISSR profiles in P. sylvestris agree with previous work [34]. The analysis revealed a high level of genetic diversity in P. sylvestris populations, which is consistent with the data obtained earlier by Vidyakin et al. [35] in the northeast of the Russian Plain. The high genetic diversity observed in P. sylvestris can be attributed mainly to its long life cycle, cross-pollination by wind and high fecundity [36,37]. It may also be facilitated by the large geographical range of P. sylvestris, across which climatic and growing conditions differ. On the other hand, the expected heterozygosity of P. sylvestris in this study was higher than that reported for other European populations using the RAPD method (He = 0.110), but lower than that obtained with the ISSR method in Portugal (He = 0.467) [38] and with the SSR method in the Carpathians (He = 0.610) [38]. These differences can be explained by the different types of molecular markers (including the ISSR primers themselves) and the technologies used to detect them. The level of genetic diversity in the group of populations growing in the east of the East European Plain is higher than in the group of populations in the Ural region, which corresponds to the decrease in genetic diversity from west to east, from the East European Plain to Transbaikal. This may be the result of a loss of variability during multiple "bottleneck" events in the course of the eastward spread of the species [39,40].
Genetic drift plays an important role in the diversity and divergence of Scots pine populations over a small area [41]. Similar results were obtained by other authors for other species of the genus Pinus: Pinus strobus L. [42,43], Pinus coulteri D.Don [44] and Pinus cembra L. [41].

Population Genetic Structure
Our results showed that the level of genetic differentiation of P. sylvestris is very high (GST = 0.439). According to Wright [45], given the dominant marker type, genetic differentiation among populations is considered high when the coefficient of genetic differentiation exceeds 0.25 [38]. The studied populations are divided into two large groups according to their geographical location: North Transvolga and Pre-Ural. According to the AMOVA data, the studied populations of P. sylvestris are strongly differentiated: about half (45%) of the observed genetic diversity is concentrated within populations, while the rest is due to differences between the two study regions (29%) and among individual populations (26%). The data obtained indicate that several genetically differentiated populations and population groups of P. sylvestris have formed in the study region. The revealed level of population differentiation is high and significantly exceeds that obtained in other studies of the genetic structure of P. sylvestris populations, namely isoenzyme analyses in the Southern Urals (4%) and microsatellite analyses in the Eastern and Southern Carpathians (6%) [46]. The probable reason for this difference may be related to the chosen PCR-based DNA fingerprinting method, since ISSR loci have a high copy number and are abundant in eukaryotic genomes. In similar studies for the East European Plain in which the ISSR method was also used, the level of genetic differentiation largely coincided with that obtained in our study [35]. High differentiation may be a consequence of the fragmentation of the P. sylvestris range in the studied region. The significant differentiation between North Transvolga's populations and Pre-Ural's populations (29%) is possibly due to the history of dispersal of the species in the Pleistocene.
Despite the fact that the South Ural refugium occupies a dominant position in both groups, it makes a much larger contribution to the Ural's population than to the North Transvolga's populations [47,48].
There is also a moderate positive correlation (r² = 0.421; p = 0.004) between genetic and geographic distances, so population differentiation is not due to geographic distance alone. A significant additional factor influencing the genetic structure is probably the relief of the Ural Mountains.
The heterogeneity of the habitat determines the multidimensional complexity of the population structure of any species. However, the complex spatial structure of populations is only to some extent reflected in their differentiation, while most of the interpopulation differences are of polygenic nature. The uniqueness, diversity and historical factors of the development of natural systems of Ural also determine the complexity of the population structure of P. sylvestris in the region.
The populations within the North Transvolga's group of P. sylvestris are generally homogeneous; none of them is characterized by a specific gene pool. Among the populations of the Pre-Ural's group, there are populations with both specific and typical gene pools. Populations with a specific gene pool, such as the population from Berezniki's forestry (Ps_Br), can serve as a source of genetic diversity in reforestation programs. Populations with a more typical gene pool, carrying the alleles most common in the region, can be preserved as forest genetic reserves to conserve the genetic resources of the species. Examples are the populations from Gainy's (Ps_Gn) and Perm's (Ps_Uk) forestries, as well as the North Transvolga populations with the most typical gene pools.

Conclusions
P. sylvestris is a conifer species of biosphere and resource importance. In the East European Plain, the range of this species is fragmented. In the present study, we evaluated the genetic diversity and population structure of ten natural populations in the east of the East European Plain using inter-microsatellite DNA polymorphism analysis. First, a high level of genetic diversity and a strong degree of genetic differentiation of the studied natural populations of P. sylvestris were established. Secondly, a downward trend in the genetic diversity of the studied P. sylvestris populations from west to east was detected. Thirdly, the studied populations were found to divide into two groups: North Transvolga and Pre-Urals. One way to use the data on the genetic diversity of the ten natural populations of P. sylvestris and their division into two groups is to study the content of resin acids with antimicrobial activity in Scots pine trees with different genotypes under different growing conditions in the east of the East European Plain. Fourthly, populations with specific (Ps_Br) and typical (Ps_Uk, Ps_Gn, Ps_Dr, Ps_Ur, Ps_Sl, Ps_Bh) gene pools were identified; they are recommended for conservation as a source of genetic diversity (Ps_Br) and as reserves for the conservation of the genetic resources of the species (Ps_Uk, Ps_Gn, Ps_Dr, Ps_Ur, Ps_Sl, Ps_Bh).
The results of studies of the spatial and genetic structure and differentiation of natural populations of woody plants can be used to draw up comprehensive programs for the control of felling and the rational use of forest genetic resources, as well as genetically based programs for their conservation and restoration. They can also help clarify the history of dispersal, which will make it possible to plan the conservation and protection of population gene pools.