Spatial Genetic Structure within and among Seed Stands of Pinus engelmannii Carr. and Pinus leiophylla Schiede ex Schltdl. & Cham, in Durango, Mexico

Studies of spatial genetic structure (SGS) are important because they offer detailed insights into historical demographic and evolutionary processes and provide important information regarding species conservation and management. Pinus engelmannii and P. leiophylla var. leiophylla are two important timber tree species in Mexico, covering about 2.5 and 1.9 million hectares, respectively. However, studies in relation to population genetics are unfortunately scant. The aim of this research was to use amplified fragment length polymorphisms (AFLP) analysis to identify potential differences in spatial genetic structure within and among seven Pinus engelmannii and nine P. leiophylla var. leiophylla seed stands in Durango, Mexico. Within the 16 seed stands of the two tested pine species, no significant SGS was detected, although SGS was detected among the seed stands. We concluded that the collection of seed in only some seed stands should not significantly alter the degree of genetic differentiation within the (collected) seed. Distances between seed orchards and pollen propagators of more than 24 km for P. engelmannii and 7 km for P. leiophylla may be sufficient to limit contamination. Finally, local seeds should be used for (re)forestation.


Introduction
Non-random spatial distribution of genotypes in natural populations, also known as spatial genetic structure (SGS), can occur as a result of various processes, including limited gene dispersal, selection pressures, and historical events. The genetic distance between neighboring individuals is therefore lower than that between more distant individuals, if dispersal is limited [1]. Indeed, the SGS among plant populations is mainly the result of the gene flow (exchange) through seed and pollen dispersal. Thus, geographically fragmented populations (i.e., those with a patchy spatial structure of individuals) are expected to have a higher level of SGS than continuous populations

Study Area
Seven Apache pine (Pinus engelmannii (PE)) and nine Chihuahua pine (P. leiophylla var. leiophylla (PL)) seed stands in the Sierra Madre Occidental, Durango, Mexico were examined, as shown in Figure 1 and Table 1.
Each seed stand consisted of 130 adult trees that looked clearly superior in their dimension, quality, and disease resistance compared to the mean tree in each stand (for selection criteria, see Wehenkel et al. [23]. The size of the seed stands varied between 4.35 and 33.85 ha depending on the tree density. The stands were situated in natural populations mixed with other pine species (P. teocote, P. arizonica, and P. cooperi) as well as oak (Quercus sideroxyla), Juniper (Juniperus flacciada, J. deppeana), and Arbutus spp. For analysis of the SGS, samples of needles were randomly collected from 23-35 trees (from 130 trees) in each seed stand in 2013 (a total of 238 needle samples from PE and 298 from PL) (see more in Hernández-Velasco et al. [14]; Table 1).

Genetic Analysis
AFLP analysis of the sampled needles was performed according to Vos et al. [30]. As the genome of conifers is huge, the protocol was modified to include a greater number of adapters (restriction/ligature, 6 times) and primers (pre-AFLP and selective AFLP, 6 times). DNA was extracted using the QIAGEN DNeasy96 plant kit and digested with the restriction enzymes EcoRI (5 -GACTGCGTACCAATTCNNN-3 ) and MseI (5 -GATGAGTCCTGAGTAANNN-3 ). Polymerase Chain Reaction (PCR) amplification was then performed with double-stranded EcoRI and MseI adaptors ligated to the end of the restriction fragments, to produce template DNA. In pre-AFLP amplification, the PCR products were treated with the primer combination E01/M03 (EcoRI-A/MseI-G). Selective amplification was carried out with the fluorescent-labelled (FAM) primer pair E35 (EcoRI-ACA-3) and M63+C (MseI-GAAC) for PE and PL. The fourth selective base was added to reduce the large number of weak signals. The amplified restriction products were electrophoretically separated in a Genetic Analyzer (ABI 3100 16 capillaries, Foster City, CA, USA), along with the internal size standard GeneScan 500 ROX (fluorescent dye, obtained from Applied Biosystems, Foster City, CA, USA). The size of the AFLP fragments was resolved with the GeneScan 3.7 and Genotyper 3.7 software packages (Applied Biosystems, Foster City, CA, USA) (see details in Ávila-Flores et al. [31]).
Two binary AFLP matrices (one per species) were produced from the presence (code 1) or absence (code 0) at potential band positions. The appearance of a band indicates a dominant genetic variant (the "plus phenotype") and the absence of a band indicates the recessive genetic (allelic) variants at the given position (locus) (the "minus phenotype") [32,33]. Only AFLP matrices with frequencies of genetic variants above 0.05 and below 0.95 were included in further analyses.

Spatial Structural Analysis of AFLP Data
The geographical (spatial) distance was determined from the Euclidean distance between individuals, separately for each of the two tree species analyzed. The SGS within and among the seed stands was determined by analyzing three different coefficients of genetic distances. Tanimoto distance (TD) [34], Genetic Distance of Huff (GD) [35], and pairwise kinship coefficients (F ij ) [9] were all considered because different mathematical approaches can produce contrasting conclusions (see details below). The distances were computed using Spatial Genetic Software v1.0 (SILVOLAB Guyane INRA Station de Recherches Forestières, Campus agronomique, Kourou cedex, French Guiana) [5], SPAGeDi v1.4 (Laboratoire de Gènètique et Ecologie vegetales, Universitè Libre de Bruxelles, Bruxelles, Belgium) [1], and GenAlEx v6.501 (Evolution, Ecology and Genetics, Research School of Biology, The Australian National University, Canberra, Australia and Department of Ecology, Evolution and Natural Resources, School of Environmental and Biological Sciences, Rutgers University, New Brunswick, NJ, USA) [14,36].
The "width" dimension of spatial distance classes was defined as including a minimum of 30 pairwise comparisons per distance class [37]. As the smallest width of the distance class could not be less than 150 m, all SGS statistics within the 16 seed stands were therefore calculated for the 150 m distance-class distributions.
For each spatial distance class, the observed values of the genetic distances were compared with the random value obtained using 999 permutations (in the case of Spatial Genetic Software, and SPAGeDi) or 999 bootstraps (using GenAlEx). To test whether SGS was significantly different from zero, a 99% confidence interval was then calculated for each genetic distance [5,38]. For each spatial distance class, the probability value (P) was also computed for each genetic distance. Bonferroni correction [39] was also performed to reduce the risk of obtaining false-positive results (type I errors). The corrected critical p value (significance level α* = 0.00036) was determined by dividing the original critical p value (0.025) by the number of comparisons or hypotheses (m = 69). Consequently, the Bonferroni critical value of P was 1 − α* = 0.9997 [14].
The genetic dissimilarity (D ij ) between the binary vectors of two individuals, i and j, was calculated by using Spatial Genetic Software v1.0, as described for the TD [34]. Genetic distograms were then constructed to estimate the SGS. The overall mean value of the TD between all trees was used as a reference for random spatial structure [5].
The F ij for dominant markers in diploids was calculated to measure the occurrence of identical alleles at a given locus in a pair of individuals [9] by using SPAGeDi software ver.1.4 [40,41]. The F ij based on genetic markers actually estimates relative kinships that can be defined as ratios of the differences of impossible identities [42]. To detect the SGS, the F ij values were averaged over a group of distance classes and plotted against the Euclidean spatial distance. Positive values of F ij can be obtained for some individuals that are more closely related than random individuals, but negative F ij values occur if the individuals are less closely related than random individuals [9].
An individual-by-individual pairwise GD matrix (N × N) was created using GenAlEx v6.501. In regard to the GD, a comparison where individuals show the same state according to the AFLP (e.g., 0 vs. 0 and 1 vs. 1 comparison) yields a value of 0, while any comparison of different states (e.g., 0 vs. 1 and 1 vs. 0) yields a value of 1 [36]. An autocorrelation coefficient (r) was then computed using the matrices including pairwise geographic (Euclidean) distances and pairwise GD, which yielded the genetic distance between pairs of individuals within a defined distance class [36].

Kruskal Wallis Test
The Kruskal Wallis post-hoc test (Tukey and Kramer approach) [43,44] was used to check whether the observed differences in the values of the TD, GD, and F ij between each PE and PL stand occur as random events, rather than being caused by directed forces (significance level α = 0.05). The test was implemented using the R 3.2.2 statistical package (Foundation for Statistical Computing, Vienna, Austria) [45] and the PMCMR package (Federal Institute of Hydrology, Koblenz, Rhineland-Palatinate, Germany) [46].

Principal Coordinate Analysis
The binary AFLP data matrix was also analyzed by Principal Coordinate Analysis (PCoA, Gresham College, London, UK), implemented with GenAlEx v6. Genetic differences between the seed stands and trees were calculated as Nei s Genetic Distance (Brown University, Providence, RI, USA) [47,48]. The two first coordinates were used to represent genetic separation of populations and individuals.

Results
In total, all AFLP primer combinations tested in this study yielded 303 polymorphic bands of 77-450 bp across all Pinus engelmannii individuals and 325 polymorphic bands in the P. leiophylla individuals. Analysis of the SGS (using the Tanimoto distance) with 150 m distance-class distributions did not reveal any significant autocorrelation (after Bonferroni correction) within most of the P. engelmannii and P. leiophylla stands, except for the distance class of 300-450 m in the PL-CO stand (p ≥ 0.9997) ( Table 2, Figures 2 and 3). The mean genetic distance between P. engelmannii trees in the most southern population (Li stand) was significantly the smallest, whereas the significantly largest genetic distance in this study was found in the two most northern populations (MM and MP3 stands). However, the mean genetic differentiation (TD) between P. leiophylla trees was the smallest in a northern population (M stand), and the largest mean genetic differentiation observed in this study was found in another northern population (PO stand) ( Table 2 and Table S1). On the other hand, statistically significant SGS was detected in almost all distance classes at large scale (among the seed stands) (Tables 3 and 4, Tables S2 and S3, Figures 3 and 4). Table 2. Analysis of fine-scale spatial genetic structure in seven seed stands of Pinus engelmannii and nine seed stands of P. leiophylla using a distance class width of 150 m, 999 permutations, and a 99% confidence interval. The Tanimoto distance (TD) was calculated using Spatial Genetic Software 1.0 [5]. Note: N = sample size; MD = mean distance to the next pine under study; TD = calculated average of the Tanimoto distance; MT = minimum tree pairs per class (distance class); CI = confidence interval; P(TD) < −CI and P(TD) > +CI, highest probability of autocorrelation in each stand, showing the distance class (m) where it was found; + indicates a significant difference after Bonferroni correction. For definitions of the stand codes, see in Table 1. Table 3. Analysis of spatial genetic structure among P. engelmannii seed stands by considering a distance class width of 160 m in the northern stands and 24 km in the southern stands, 999 permutations, and a 99% confidence interval. The mean Tanimoto distance (TD) was calculated using Spatial Genetic Software 1.0 [5].   Table 3. Analysis of spatial genetic structure among P. engelmannii seed stands by considering a distance class width of 160 m in the northern stands and 24 km in the southern stands, 999 permutations, and a 99% confidence interval. The mean Tanimoto distance (TD) was calculated using Spatial Genetic Software 1.0 [5]. Note: TD = mean Tanimoto distance; MT = minimum number of pairs per class (distance class), MD = mean distance to the next tree; CI = confidence interval; P(TD) < −CI and P(TD) > +CI, highest probability of autocorrelation in each stand, showing the distance class (m or km) where it was found; + indicates a significant difference after Bonferroni correction. Table 4. Analysis of spatial genetic structure among seed stands of Pinus leiophylla by considering a distance class width of 8 km in the seed stands, 999 permutations, and a 99% confidence interval. The Tanimoto distance (TD) was calculated using Spatial Genetic Software 1.0 [5]. Note: TD = mean Tanimoto distance; MT = minimum of pairs per class (distance class), MD = mean distance to the next tree; P = probability of autocorrelation; CI = confidence interval; + indicates a significant difference after Bonferroni correction.  Table 4. Analysis of spatial genetic structure among seed stands of Pinus leiophylla by considering a distance class width of 8 km in the seed stands, 999 permutations, and a 99% confidence interval. The Tanimoto distance (TD) was calculated using Spatial Genetic Software 1.0 [5]. Note: TD = mean Tanimoto distance; MT = minimum of pairs per class (distance class), MD = mean distance to the next tree; P = probability of autocorrelation; CI = confidence interval; + indicates a significant difference after Bonferroni correction.    The results of the Kruskal-Wallis test showed that only the mean GD was statistically significantly larger in the P. leiophylla seed stands and not in the P. engelmannii seed stands (p = 0.013).

Code N MD TD P(TD) < (−CI) P(TD) > (+CI) MT Classes
At the seed stand level, the PCA revealed clear separation between the stands for both species ( Figure 5). The first three coordinates explained 85.93% (P. engelmannii) and 79.11% (P. leiophylla), respectively, of the variation found in the AFLP study. P. leiophylla stands were more genetically different than the P. engelmannii stands. By plotting the first two coordinates, which together describe 75.78% and 64.95%, respectively, of the AFLP variation, the P. engelmannii (PE) and the P. leiophylla (PL) populations were divided into three clusters, while the P. leiophylla (PL) stands were distributed in four   The results of the Kruskal-Wallis test showed that only the mean GD was statistically significantly larger in the P. leiophylla seed stands and not in the P. engelmannii seed stands (p = 0.013).
At the seed stand level, the PCA revealed clear separation between the stands for both species ( Figure 5). The first three coordinates explained 85.93% (P. engelmannii) and 79.11% (P. leiophylla), respectively, of the variation found in the AFLP study. P. leiophylla stands were more genetically different than the P. engelmannii stands. By plotting the first two coordinates, which together describe 75.78% and 64.95%, respectively, of the AFLP variation, the P. engelmannii (PE) and the P. leiophylla (PL) populations were divided into three clusters, while the P. leiophylla (PL) stands were distributed in four clusters (   In a PCoA conducted at the individual level, the first three coordinates explained a much lower percentage of the variation in the AFLP study (13.77% in P. engelmannii and 14.42% in P. leiophylla). Plotting the first two coordinates (which together explained 11.01% and 11.48%, respectively, of the variation in AFLP) showed that only P. engelmannii individuals were divided into separate groups (PE-Li vs. the rest of seed stands). P. leiophylla individuals per seed stand were genetically more dissimilar than the P. engelmannii individuals ( Figure S1). In a PCoA conducted at the individual level, the first three coordinates explained a much lower percentage of the variation in the AFLP study (13.77% in P. engelmannii and 14.42% in P. leiophylla). Plotting the first two coordinates (which together explained 11.01% and 11.48%, respectively, of the variation in AFLP) showed that only P. engelmannii individuals were divided into separate groups (PE-Li vs. the rest of seed stands). P. leiophylla individuals per seed stand were genetically more dissimilar than the P. engelmannii individuals ( Figure S1).

Discussion
As predicted, no significant SGS was detected within the 16 seed stands of the two tested pine species, Apache pine (P. engelmannii) and Chihuahua pine (P. leiophylla var. leiophylla) at the local level ( Table 2 and Table S1). This matches several previous reports describing that no SGS could be observed in North American stands of conifer species such as Picea chihuahuana, Pseudotsuga menziesii, Pinus strobiformis [13], P. clausa [11], P. strobus [12], P. lumholtzii [6], P. cembroides, P. discolor, P. durangensis, and P. teocote [14]. According to Parker et al. [11] and Vekemans and Hardy [1], the main reason for the lack of significant SGS observed in these two pine species at the local level was probably the wind-mediated dispersal mating system, which leads to a strong omnidirectional gene flow.
However, there are other causes of random spatial distribution of genetic variants: (i) seed shadows of neighboring adult trees sometimes overlap; (ii) germination of seeds from numerous neighboring adult trees within relatively small stand patches; (iii) demographic mortality, which may also reduce the genetic relatedness at the seedling and sapling stages [6,11,[49][50][51]; (iv) the disguising of SGS by high densities of adult trees; and (v) alteration of the tree stands by human intervention [14,50,52,53].
Among the P. engelmannii and P. leiophylla seed stands under study, and for almost all distance classes considered, the genetic and geographical distances between individual trees were significantly related (Tables 3 and 4, Figures 2a and 3, Tables S2 and S3). This confirms the theory of isolation by distance (i.e., limited dispersal caused by geographic distance) [54]. The limited dispersal mainly facilitates the genetic divergence of populations through drift and local adaptation. On the other hand, in the medium or long term restricted dispersal of pollen and seed can limit the genetic exchange between populations by reproductive segregation in the form of phenological isolation or ecological specialization [55].

Conclusions
Our findings have some important practical implications for future pine seed stand identification and management that can help avoid reforestation failure or inefficiencies, since non-local genotypes often are considered to have a fitness disadvantage over local genotypes [29]. First, the genetic variants were randomly mixed within the seed stands and were detected by non-significant genetic differences between the individual trees. In these stands, the maximum geographical distance between the studied trees was 600-1200 m (Tables 2 and 3). Thus, the collection of seed in only some parts of these seed stands should not significantly alter the degree of genetic differentiation within the (collected) seed. Second, analysis of the seed stands showed that genetic differences between individuals at geographical distances of seven to eight kilometers for P. leiophylla and up to 24 km for P. engelmannii were significantly smaller than random genetic differences (Tables 3 and 4, Tables S2 and S3), i.e., there was probably still sufficient genetic exchange between the trees at up to 8 km for P. leiophylla and up to 24 km for P. engelmannii. Thus, future seed orchards within these geographic distances may be exposed to external pollen contamination from other stands from the same tree species, which can decrease the genetic gain of a seed orchard crop (a contamination level of about 50% reduces the genetic gain by about 25%) [56].
The genetic exchange detected at larger scales is supported by Varis et al. [57] who reported a dispersal distance up to 600 km for P. sylvestris, and by Williams [58], who indicated a dispersal distance up to 40 km for P. taeda; although these findings were probably due to stepping stone events and not to single long distance results. Moreover, Robledo-Arnuncio [59] observed a significant level of effective pollen flow (up to 4.4%) from a large population located at a distance of 100 km. Moreover, the problem of pollen contamination in pine seed orchards due to insufficient distances from other stands was observed in various studies. For example, the estimated level of pollen contamination in Scot's pine seed orchards varied between 2% and 74% [60]. Similar pine species could contaminate the seed orchards by hybridization within these distances, which seems to be common in P. engelmannii [31].
However, the seed stands under study here were already significantly genetically different in the distance classes of 24-48 km of P. engelmannii and 7-14 km of P. leiophylla (Tables 3 and 4). Thus, distances between seed orchards and pollen propagators of more than 24 km for P. engelmannii and more than 7 km for P. leiophylla may be sufficient to limit the contamination. Finally, the significant differences in genetic structures lead us to conclude the following: (i) local seeds should be used for (re)forestation because they may also be locally adapted [61]; and (ii) a fine-scale network of forest seed stands is appropriate for conserving the different genetic structures [14].
Supplementary Materials: The following are available online at www.mdpi.com/1999-4907/8/1/22/s1, Figure S1: Results of Principal Coordinates Analysis (PCoA) of the individual Pinus engelmannii (a) and Pinus leiophylla trees (b). The first two coordinates together explained 11.01% and 11.48%, respectively, of the variation in AFLP. For definitions of the stand codes, see in caption of Figure 5 in the paper, Table S1: Analysis of fine-scale with GenAlex and SPAGeDi in seven seed stands of Pinus engelmannii and nine seed stands of P. leiophilla by considering a distance class width of 150 m, 999 bootstraps or 999 permutations and a 99% confidence interval. The genetic distance of Huff (GD) was calculated using GenAlEx 6.501. The pairwise kinship coefficient (F ij ) was calculated using SPAGeDi 1.4, Table S2: Analysis of spatial genetic structure among seven Pinus engelmannii seed stands by considering a distance class width of 160 m in the northern stands and 24 km in the southern stands, 999 bootstraps and permutations and a 99% confidence interval. The genetic distance of Huff (GD) was calculated using GenAlEx 6.501, and the pairwise kinship coefficient (F ij ) was calculated using SPAGeDi 1.4, Table S3: Analysis of large-scale spatial genetic structure in nine seed stands of Pinus leiophylla by considering a distance class width of 8 km in the stands, 999 bootstraps or 999 permutations and a 99% confidence interval. The genetic distance of Huff (GD) was calculated using GenAlEx 6.501, and the pairwise kinship coefficient (F ij ) was calculated using SPAGeDi 1.4.