Phylogeographic Analysis of Lodgepole Pine (Pinus contorta) Reveals Limited Subspecies Differentiation and Evidence for Glacial Refugia

Fazekas, Aron J.; Yeh, Francis C.

doi:10.3390/dna6020020


by Aron J. Fazekas ¹ and Francis C. Yeh ²,*

¹ The Arboretum, University of Guelph, Guelph, ON N1G 2W1, Canada

² Department of Renewable Resources, University of Alberta, Edmonton, AB T6G 2H1, Canada

* Author to whom correspondence should be addressed.

DNA 2026, 6(2), 20; https://doi.org/10.3390/dna6020020

Submission received: 13 January 2026 / Revised: 11 March 2026 / Accepted: 27 March 2026 / Published: 16 April 2026


Abstract

Lodgepole pine (Pinus contorta Dougl.) exhibits pronounced morphological variation across its range, historically attributed to allopatric differentiation during the Wisconsin glaciation. However, whether genetic divergence aligns with morphological differentiation—a fundamental prediction of allopatric speciation theory—remains untested. We conducted a comprehensive phylogeographic analysis of chloroplast DNA (trnL intron and trnL/trnF spacer) and mitochondrial DNA (nad1 b/c intron) across 31 populations representing all four recognized subspecies to test hypotheses of refugial isolation and to evaluate the genetic basis of current taxonomic classification. Contrary to predictions of allopatric divergence, both organellar genomes showed striking genetic uniformity (π = 0.000178–0.000186; intersubspecific genetic distances: 1.06 × 10⁻⁴ to 3.96 × 10⁻⁴) with no phylogenetic structure corresponding to morphological boundaries. Significant negative neutrality test values (Tajima’s D = −2.26, p < 0.02; Fu and Li’s D* = −4.52, p < 0.02) suggest recent demographic expansion rather than equilibrium divergence. A distinctive 5 bp indel in coastal populations provides molecular evidence for a northern Pacific refugium, and its occurrence in interior populations is consistent with post-glacial pollen-mediated gene flow, though this directionality remains inferential pending nuclear genomic confirmation. These findings suggest that morphological divergence reflects rapid adaptive evolution in heterogeneous environments rather than deep phylogenetic divisions. This pattern exemplifies gene flow-selection balance, in which divergent selection maintains local adaptation despite extensive gene flow—supporting an ecotypic rather than a phylogenetic interpretation of intraspecific diversity. 
The persistence of morphological variation despite genetic homogeneity indicates strong selection on ecologically important traits, likely driven by variation in fire regimes, differential seed predation, and climate gradients. These results have critical implications for understanding adaptive evolution rates in widespread conifers and for developing conservation strategies that emphasize adaptive processes over taxonomic categories.

Keywords: Pinus contorta; phylogeography; glacial refugia; gene flow; adaptive evolution; ecotypes; chloroplast DNA; mitochondrial DNA; migration-selection balance

1. Introduction

The lodgepole pine (Pinus contorta Dougl.) complex illustrates a fundamental paradox in evolutionary biology: how do widespread species maintain substantial morphological variation across environmental gradients while still experiencing sufficient gene flow to prevent complete reproductive isolation? This species’ broad distribution—from the Yukon Territory through British Columbia and western Alberta, south to California, and east along the Rocky Mountains to Colorado [1]—encompasses dramatic climatic and ecological differences that have led to notable phenotypic divergence. Understanding the evolutionary mechanisms underlying this variation requires integrating insights from population genetics, biogeography, and adaptation theory.

Two competing theoretical perspectives offer contrasting predictions about the evolutionary origins of morphological variation in lodgepole pine. The classical allopatric divergence theory posits that geographic isolation—particularly during Pleistocene glaciations—should produce phylogeographic patterns reflecting historical population fragmentation [2,3,4]. Under this model, morphologically distinct subspecies should exhibit corresponding genetic differentiation across the genome, with the magnitude of divergence proportional to the duration of isolation and the effective population size. Critchfield’s influential morphological analysis [2] identified four subspecies—P. contorta contorta (coastal), P. contorta murrayana (Sierra Nevada), P. contorta latifolia (Rocky Mountains and interior), and P. contorta bolanderi (Mendocino pygmy forest)—based primarily on cone architecture and foliar characteristics. This taxonomic framework assumes that morphological differences indicate separate evolutionary trajectories established through prolonged isolation during the Wisconsin glaciation (~100,000–12,000 years ago) [4,5].

An alternative theoretical perspective emphasizes migration-selection balance in maintaining local adaptation across species ranges [6,7,8]. This framework predicts that morphological differentiation can evolve and persist despite ongoing gene flow if divergent selection is sufficiently strong relative to migration’s homogenizing effects. Critically, this scenario predicts genetic homogeneity at neutral markers alongside substantial differentiation at loci under selection—a genomic landscape fundamentally different from that expected under allopatric divergence. The conditions under which selection can overcome gene flow have been extensively modeled [6,7,8], demonstrating that even moderate selection coefficients can maintain adaptive differentiation when gene flow is moderate, particularly for polygenic traits where selection acts simultaneously across multiple loci [9,10].

These contrasting frameworks generate testable predictions about lodgepole pine’s evolutionary history. Allopatric divergence predicts: (1) phylogenetic structure in neutral markers that aligns with subspecies boundaries; (2) genetic distances proportional to presumed periods of isolation; (3) reciprocal monophyly or distinct haplotype clusters for different subspecies; and (4) concordance between nuclear and organellar phylogeographic patterns reflecting genome-wide divergence. In contrast, the gene flow-selection balance model predicts: (1) genetic homogeneity at neutral loci despite morphological differentiation; (2) weak or absent phylogeographic structure in organellar genomes; (3) morphological clines aligned with environmental rather than geographic gradients; and (4) potential discordance between adaptive and neutral genetic variation.

Our earlier research using nuclear markers revealed limited population genetic structure in lodgepole pine, inconsistent with long-term subspecies isolation [11] and contradicting expectations under traditional vicariance scenarios. Subsequently, mitochondrial DNA minisatellite analysis identified unexpected refugial zones in Haida Gwaii (Queen Charlotte Islands) and the Alexander Archipelago [12], suggesting a more complex biogeographic history than previously recognized. These findings raised critical questions: Does the apparent genetic homogeneity at nuclear markers extend to organellar genomes? If so, how can substantial morphological variation be maintained? What role did glacial refugia play in shaping current diversity patterns? And fundamentally, do recognized subspecies represent phylogenetically distinct lineages or ecotypic variants maintained by divergent selection?

Organellar genomes offer distinct advantages for reconstructing evolutionary history because of their uniparental inheritance, lack of recombination, and generally slower mutation rates than nuclear DNA [13]. In conifers, chloroplast DNA (cpDNA) is paternally inherited and dispersed through pollen, facilitating long-distance gene flow and rapid homogenization across landscapes. Conversely, mitochondrial DNA (mtDNA) is maternally inherited and transmitted via seeds, which typically disperse over shorter distances, thereby preserving stronger signatures of historical population structure and colonization routes [14,15,16]. This fundamental difference in dispersal biology creates contrasting spatial genetic structures: cpDNA typically exhibits less geographic structure because pollen disperses extensively, whereas mtDNA maintains stronger phylogeographic signals reflecting seed-mediated demographic processes.

These contrasting inheritance patterns enable powerful inferences about evolutionary history. Concordant phylogeographic structure across both genomes strongly suggests historical isolation of entire populations, whereas discordance may indicate sex-biased dispersal or recent gene flow primarily through pollen. Complete genetic homogeneity in cpDNA, coupled with some mtDNA structure, would suggest that extensive pollen flow has homogenized paternal lineages while maternal lineages retain traces of historical fragmentation. Conversely, homogeneity in both genomes, despite morphological variation, would provide compelling evidence that adaptive differentiation has occurred without prolonged geographic isolation.

This study addresses three interconnected objectives to discriminate among competing evolutionary hypotheses:

First, we test whether organellar genome variation supports existing subspecies classifications, directly evaluating predictions from allopatric divergence theory. If subspecies represent phylogenetically distinct lineages established through prolonged isolation, organellar genomes should exhibit: (a) significant genetic distances between subspecies exceeding typical intraspecific variation; (b) reciprocally monophyletic or at least distinct haplotype groups; and (c) phylogeographic structure aligned with taxonomic boundaries. Alternatively, if subspecies represent ecotypic variants, organellar genomes should show minimal differentiation, regardless of morphological distinctiveness.

Second, we evaluate the biogeographic hypothesis that Wisconsin glaciation caused prolonged population isolation by comparing coalescent expectations for genetic structure with empirical data. Classical vicariance models predict that ~100,000 years of isolation in separate refugia should generate detectable phylogenetic structure even at slowly evolving organellar loci. We assess whether the observed genetic diversity and demographic signatures are consistent with prolonged isolation or suggest more recent common ancestry and rapid post-glacial expansion.

Third, we identify potential glacial refugia and post-glacial colonization pathways by integrating molecular data with paleoecological evidence. Unique haplotypes concentrated in specific regions may indicate refugia, and their distribution patterns can illuminate colonization routes and the relative importance of seed versus pollen dispersal in range expansion.

By combining chloroplast and mitochondrial data within clear theoretical frameworks from coalescent theory, migration-selection models, and phylogeography, we seek to clarify the roles of historical demography and current selection in the evolution of lodgepole pine. These findings have wider implications for understanding conifer diversification processes and guiding conservation efforts that balance taxonomic identity with adaptive potential.

2. Materials and Methods

2.1. Study Populations and Hierarchical Sampling Design

We employed a hierarchical sampling strategy to capture both broad-scale phylogeographic patterns and fine-scale population genetic variation while minimizing environmental effects on genetic analyses. Needle tissue was collected from 139 individuals across 31 populations representing all four recognized subspecies. Sampling was conducted in established provenance plantations maintained by the British Columbia Ministry of Forests in Prince George and Lake Cowichan. These ex situ collections, derived from natural stands throughout the species’ range, provided standardized growing conditions that eliminated contemporary environmental variation while preserving the geographic and genetic signal from source populations—a critical advantage for disentangling genetic from plastic phenotypic variation.

Our sampling design included 16 populations of P. c. latifolia, 11 of P. c. contorta, 3 of P. c. murrayana, and 1 of P. c. bolanderi (Figure 1). Four or five trees were sampled per population, except in four populations (121, 126, 135, and 141) where low survival in the provenance trial limited sampling to 1–3 trees. Although modest per-population sample sizes limit the detection of rare haplotypes, the total sample of 139 individuals across 31 populations provides enough statistical power to detect biologically meaningful subspecies-level differences in AMOVA [17] (see Section 2.5 for a formal power analysis). Fresh tissue was immediately frozen in liquid nitrogen and stored at −20 °C to prevent DNA degradation. For P. c. bolanderi, which did not survive in provenance trials due to its extreme adaptation to nutrient-poor soils, we germinated archived seeds and grew seedlings in controlled greenhouse conditions for six weeks before tissue collection.

2.2. DNA Extraction and PCR Amplification

Total genomic DNA was extracted using a modified CTAB protocol optimized for coniferous tissues [18], which effectively removes polyphenolic compounds and polysaccharides that can inhibit downstream enzymatic reactions. The chloroplast trnL intron and trnL/trnF intergenic spacer regions were amplified as a single fragment of approximately 1100 bp using universal primers c and f [19]. These regions were chosen for their proven usefulness in resolving phylogenetic relationships among closely related taxa, due to their moderate substitution rates that balance phylogenetic signal with alignability [20,21,22,23]. The mitochondrial nad1 b/c intron was amplified with primers NAD1B1F and NAD1C1R [24], producing an approximately 1537 bp fragment from which a 339 bp variable region was later sequenced using nested internal primers NAD1B3F and NAD1C3R. This region contains tandemly repeated 34 and 32 bp elements that differ among pine species [24,25], potentially offering valuable phylogeographic markers. Sequences of the amplification and sequencing primers used in this study are detailed in Table 1.

PCR reactions were conducted in 25 μL volumes containing 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 2.5 mM MgCl₂, 200 μM of each dNTP, 0.5 μM of each primer, 0.5 units of Taq polymerase, and 20–40 ng of template DNA. Thermal cycling involved an initial denaturation at 95 °C for 2 min, followed by 35 cycles of 95 °C for 1 min (denaturation), 55 °C for 1 min (annealing), and 72 °C for 2 min (extension), with a final extension at 72 °C for 5 min to ensure complete synthesis of all products. PCR products were verified by agarose gel electrophoresis, and amplicons displaying a single band of the expected size were purified using standard protocols before sequencing.

2.3. DNA Sequencing and Quality Control

Direct bidirectional sequencing was carried out using fluorescently labeled primers on a Li-Cor 4200 automated sequencer with SequiTherm EXCEL II DNA sequencing chemistry (Epicentre Technologies, Maharashtra, India). This method allowed for high-quality sequence determination without the need for cloning, which is suitable for organellar genomes where intracellular homogeneity reduces concerns about heteroplasmy. Both forward and reverse strands were sequenced to enhance accuracy and resolve ambiguous base calls. Raw chromatograms were inspected and edited with Chromas version 2.6.6 [26], paying close attention to trace quality scores and possible sources of sequencing artifacts.

2.4. Sequence Alignment and Phylogenetic Analysis

Sequences were aligned using ClustalW version 2 within MEGA11 [27] with the following parameters: gap opening penalty = 15.0 and gap extension penalty = 6.66 for both the pairwise and multiple alignment phases; IUB weight matrix; transition weight = 0.5. Alignments were then inspected manually. The trnL intron and trnL/trnF spacer datasets were combined for analysis because these regions are physically linked and inherited as a single unit, maximizing phylogenetic signal while avoiding pseudo-replication. Indel events were coded as binary characters using simple indel coding, which treats multi-base insertions or deletions as single evolutionary events, thereby retaining phylogenetic information while avoiding inappropriate weighting of indel length variation. We chose simple indel coding [28] over alternative schemes (e.g., the modified complex indel coding or complete deletion of indel-containing sites) because of its demonstrated applicability to datasets with few, non-overlapping indels, as observed here. The 5 bp and 26 bp deletions occur at non-overlapping alignment positions; thus, the ambiguities inherent in more complex coding schemes do not apply. Nonetheless, to verify that the indel coding scheme did not affect phylogenetic conclusions, we re-ran maximum parsimony analyses with (a) indels treated as missing data and (b) indels excluded entirely; both alternatives recovered the same unresolved polytomy, confirming that conclusions are robust to indel treatment.
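The coding step described above can be illustrated with a minimal Python sketch. It assumes an already aligned set of sequences with `-` as the gap symbol and handles only the simple case relevant here (few, non-overlapping indels); the function names and the toy alignment are hypothetical and are not the study's data or software.

```python
def gap_runs(seq, gap="-"):
    """Return the set of (start, end) index pairs for each run of gaps in seq."""
    runs, start = set(), None
    for i, ch in enumerate(seq):
        if ch == gap and start is None:
            start = i                      # a new gap run begins
        elif ch != gap and start is not None:
            runs.add((start, i - 1))       # the gap run ended at the previous site
            start = None
    if start is not None:                  # gap run extends to the end of the sequence
        runs.add((start, len(seq) - 1))
    return runs


def simple_indel_coding(alignment):
    """Code each distinct gap (unique start/end) as one binary character.

    Returns (indels, matrix): indels is a sorted list of (start, end) pairs and
    matrix maps each sequence name to a list of 0/1 states (1 = gap present).
    Overlapping gaps, which full simple indel coding scores as inapplicable,
    are not handled in this sketch.
    """
    per_seq = {name: gap_runs(seq) for name, seq in alignment.items()}
    indels = sorted(set().union(*per_seq.values()))
    matrix = {name: [1 if ind in runs else 0 for ind in indels]
              for name, runs in per_seq.items()}
    return indels, matrix


# Toy alignment (hypothetical sequences, not the study data)
toy = {
    "consensus": "ACGTTAAATGCA",
    "coastal_1": "ACGT-----GCA",   # one 5 bp deletion becomes a single character
    "interior_1": "ACGTTAAATGCA",
}
indels, matrix = simple_indel_coding(toy)
print(indels)   # [(4, 8)]
print(matrix)   # {'consensus': [0], 'coastal_1': [1], 'interior_1': [0]}
```

The point of the sketch is that a five-base deletion contributes one character rather than five, so indel length does not inflate its weight in the parsimony analysis.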

Nucleotide diversity (π) was estimated following Nei [29], providing a measure of average pairwise sequence divergence within and among populations. Genetic distances reported represent mean pairwise distances computed across all individual sequences within each subspecies group (inter-subspecific distances) or among individuals within each subspecies (intra-subspecific distances, shown on the diagonal). Genetic distances between subspecies were calculated using Tajima-Nei corrections [30], which account for differences in rates of transition and transversion substitutions and saturation effects. These analyses were implemented in MEGA11 [27]. Summary statistics describing sequence variation and neutrality tests were computed using DnaSP version 3.53 [31]. Specifically, we calculated Tajima’s D [32] and Fu and Li’s D* [33], which test for departures from neutral evolution expectations by comparing different estimators of the population mutation parameter θ. Significantly negative values of these statistics indicate an excess of rare alleles relative to neutral expectations, potentially reflecting either purifying selection removing deleterious mutations or demographic expansion that increases population size and thereby generates many recent, rare mutations.
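As an illustration of the diversity estimate, the following Python sketch computes uncorrected nucleotide diversity as the mean number of pairwise differences per site. It is a simplified stand-in for the MEGA11 and DnaSP calculations, not a reproduction of them: it applies no substitution-model correction, skips gapped sites pair by pair, and divides by the full alignment length. The toy sequences are hypothetical.

```python
from itertools import combinations


def pairwise_differences(a, b, gap="-"):
    """Count sites where a and b differ, skipping sites with a gap in either."""
    return sum(1 for x, y in zip(a, b) if gap not in (x, y) and x != y)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (uncorrected estimate of pi)."""
    n, length = len(seqs), len(seqs[0])
    total = sum(pairwise_differences(a, b) for a, b in combinations(seqs, 2))
    n_pairs = n * (n - 1) // 2
    return total / n_pairs / length


# Toy data (hypothetical): four 10 bp sequences, one carrying a singleton variant
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
pi = nucleotide_diversity(toy)
print(pi)  # 3 of 6 pairs differ at 1 of 10 sites, i.e. 0.05
```

When nearly all individuals share one haplotype, most pairs contribute zero differences, which is why π can be very small even with several polymorphic sites in the alignment.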

2.5. Statistical Power Analysis

To assess whether the sampling design had enough statistical power to detect biologically meaningful structure at all levels of the AMOVA (among subspecies, Φ_CT; among populations within subspecies, Φ_SC; and within populations), we performed a formal power analysis for the three-level nested design. Power was estimated by partitioning the total variance across three levels with the following degrees of freedom: df_groups = k − 1 = 3 (among four subspecies), df_pops (groups) = P − k = 27 (among 31 populations within four subspecies), and df_within = N − P = 108 (within 31 populations). The non-centrality parameter at each level is given by λ = df × nc × Φ/(1 − Φ), where nc is the adjusted average sample size appropriate for each comparison.

For the among-subspecies contrast, Φ_CT is tested in the hierarchical design against MS_pops (groups) rather than MS_within; the appropriate denominator degrees of freedom are therefore df = 27. The adjusted group size is nc = 27.00 (from N = 139, with per-subspecies totals n_latifolia = 76, n_contorta = 46, n_murrayana = 12, n_bolanderi = 5). Power to detect Φ_CT = 0.10 is 0.6974; at Φ_CT = 0.15, power is 0.893; and at Φ_CT = 0.20, power reaches 0.95 (Supplementary Table S1). The minimum detectable Φ_CT at 80% power is 0.13.
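The design quantities above can be checked with a short Python sketch. The degrees of freedom and the expression for λ are taken from the text; the formula used for nc, (N − Σnᵢ²/N)/(k − 1), is an assumption (the usual adjusted average for unequal group sizes) that happens to reproduce the reported value of 27.00. The function names are hypothetical, and the sketch stops at λ rather than computing power.

```python
def amova_design(group_sizes, n_populations):
    """Degrees of freedom and adjusted mean group size for a nested design."""
    N = sum(group_sizes)
    k = len(group_sizes)
    df_groups = k - 1                      # among subspecies
    df_pops = n_populations - k            # among populations within subspecies
    df_within = N - n_populations          # within populations
    # Assumed formula: standard adjusted average size for unequal groups
    n_c = (N - sum(n * n for n in group_sizes) / N) / (k - 1)
    return df_groups, df_pops, df_within, n_c


def noncentrality(df, n_c, phi):
    """Non-centrality parameter: lambda = df * n_c * phi / (1 - phi)."""
    return df * n_c * phi / (1.0 - phi)


sizes = [76, 46, 12, 5]                    # latifolia, contorta, murrayana, bolanderi
df_g, df_p, df_w, n_c = amova_design(sizes, n_populations=31)
lam = noncentrality(df_g, n_c, phi=0.10)
print(df_g, df_p, df_w, round(n_c, 2))     # 3 27 108 27.0
print(round(lam, 2))                       # 9.0
```

Power would then follow from a non-central F distribution with 3 and 27 degrees of freedom and non-centrality λ, a step that needs a statistics library and is left out here.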

For the among-populations-within-subspecies contrast (Φ_SC), the relevant per-population sample size is approximately 4.48 (mean individuals per population: N/P = 139/31), with numerator and denominator degrees of freedom of 27 and 108. Power to detect Φ_SC = 0.15 is 0.647; at Φ_SC = 0.20, power increases to 0.841; and the smallest detectable Φ_SC at 80% power is 0.187 (Supplementary Table S2). This indicates that the per-population sample sizes of 3–5 are sufficient to identify moderate-to-large population-level differentiation (Φ_SC ≥ 0.19) but may miss weaker among-population structure within subspecies (Φ_SC < 0.15). We acknowledge this limitation: our design was optimized for broad coverage across 31 populations rather than deep sampling within individual populations. Detecting fine-scale within-subspecies population structure would require targeted, intensive resampling.

For rare haplotype detection, the probability of observing at least one copy of a haplotype at a population frequency p is 1 − (1 − p)ⁿ. With n = 3, haplotypes at p = 0.10, 0.20, and 0.30 are detected with probabilities of 0.27, 0.49, and 0.66, respectively; with n = 5, these increase to 0.41, 0.67, and 0.83. Per-population sample sizes of 3–5 are therefore insufficient for cataloging the full range of within-population haplotype diversity, especially for alleles at p < 0.20. However, the implications for inference are asymmetric: undetected rare haplotypes are private variants limited to individual populations, and their existence would increase within-population variance and further reduce any detectable among-subspecies signal. Their non-detection, therefore, cannot lead to inflated Φ_CT values and does not undermine the main conclusion of negligible subspecies-level differentiation. Instead, it emphasizes a limitation in our ability to resolve fine-scale haplotype diversity within populations, which we acknowledge as an area for future detailed resampling studies.
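The detection probabilities quoted above follow directly from 1 − (1 − p)ⁿ and can be reproduced with a few lines of Python. The helper `min_sample_size` and its 95% target are illustrative additions, not values used in the study.

```python
from math import ceil, log


def detection_probability(p, n):
    """Probability of sampling at least one copy of a haplotype at frequency p."""
    return 1.0 - (1.0 - p) ** n


def min_sample_size(p, target):
    """Smallest n with detection probability >= target (illustrative helper)."""
    return ceil(log(1.0 - target) / log(1.0 - p))


for n in (3, 5):
    probs = [round(detection_probability(p, n), 2) for p in (0.10, 0.20, 0.30)]
    print(n, probs)
# 3 [0.27, 0.49, 0.66]
# 5 [0.41, 0.67, 0.83]

print(min_sample_size(0.10, 0.95))  # 29
```

Under this simple model, a haplotype at p = 0.10 would need roughly 29 sampled trees per population for a 95% chance of detection, which puts the 3–5 trees sampled here in context.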

To empirically evaluate whether small per-population sample sizes of 3–5 individuals could bias AMOVA results, we conducted complementary simulation-based and rarefaction analyses using the observed alignment (π = 0.00018, L = 1400 bp, N = 139). For the simulation, a five-haplotype model representing the observed cpDNA diversity (14 variable sites; dominant haplotype frequency ≈ 0.65) was used to generate 500 structured datasets per target Φ_CT value following a four-group island model that matched the subspecific sampling design (n_latifolia = 76, n_contorta = 46, n_murrayana = 12, n_bolanderi = 5; Supplementary Figure S1A). Power was 8.2% at Φ_CT = 0.10, 40.0% at Φ_CT = 0.20, and only reached 80% at roughly Φ_CT ≈ 0.30, confirming that the low power results from the scarcity of segregating sites caused by the exceptionally low nucleotide diversity of this locus, not the small per-population sample size. In the rarefaction analysis, individuals were subsampled (r = 2–5 per population; 500 replicates per level) from null datasets simulated with the same haplotype model and no among-group structure. The estimated Φ_CT remained near zero across all rarefaction levels (mean = 0.018 at r = 2; 0.007 at r = 5; 95% CI including zero throughout; false-positive rate < 1.5% at all levels; Supplementary Figure S1B), indicating that the near-zero Φ_CT observed in the empirical data cannot be attributed to an artifact of small per-population sample size.
These simulation-based estimates (8.2% at Φ_CT = 0.10) are considerably lower than the analytical F-test power (64% at the same Φ_CT) because they measure fundamentally different quantities: the analytical F-test assumes that the variance components are freely estimable from any allele frequency contrast and represents the maximum theoretical power of the AMOVA framework given the sample structure, whereas the simulation is bounded by the actual haplotype pool available in this dataset (five haplotypes, 14 variable sites, π = 0.00018). With so few polymorphic sites, most individual pairs across populations carry identical sequences regardless of the true Φ_CT, severely limiting the realized discriminatory power of the test. The simulation-based power curve is therefore the more relevant guide to the detectability of structure in this specific dataset, and it confirms that the near-zero empirical Φ_CT is not an artifact of insufficient sampling but a genuine reflection of organellar genetic uniformity across subspecies.

3. Results

3.1. Organellar Genome Variation and Geographic Structure

Both organellar genomes exhibited remarkably low genetic variation across the species’ entire range, a pattern inconsistent with predictions from allopatric divergence theory. The chloroplast trnL intron (487 bp sequenced) contained only five polymorphic sites, yielding a nucleotide diversity of π = 0.000178. The adjacent trnL/trnF spacer (385 bp) revealed four polymorphic sites, with π = 0.000186. These diversity values are among the lowest reported for widespread conifer species and suggest recent common ancestry, severe historical bottlenecks, or both.

The most striking chloroplast variant was a 5 bp indel (deletion of TAAAT) at positions 404–408 of the trnL intron (Table 2). This deletion occurred in four individuals from three geographically disjunct populations: two coastal populations (49 and 95, both P. c. contorta from the Queen Charlotte Islands region) and one interior population (36, P. c. latifolia from the Rocky Mountains). The predominance of this marker in coastal populations, combined with its rarity in interior locations, provides molecular evidence for a northern Pacific coastal refugium and subsequent eastward pollen-mediated gene flow following deglaciation. Additional structural variation in the trnL/trnF spacer included a single-bp deletion in population 135 (P. c. murrayana) and a 26 bp deletion in population 31 (P. c. latifolia). However, the phylogeographic significance of these variants remains unclear given their single occurrences (Table 3).

Most notably, the mitochondrial nad1 b/c intron showed complete sequence homogeneity across all 139 sampled individuals, despite containing tandemly repeated 34- and 32 bp elements known to vary in other pine species [24,25]. This total lack of mtDNA variation sharply contrasts with Pinus ponderosa, where variation in repeat number clearly delineates distinct eastern and western lineages established during Pleistocene glaciations [34]. The monomorphic mitochondrial genome offers no phylogeographic signal, making it impossible to infer seed-mediated colonization routes or maternal lineage structure and suggesting either recent bottlenecks or extremely slow mtDNA evolution in this species. However, it is important to recognize that the absence of variation may partly reflect the choice of locus rather than true species-wide maternal uniformity. The nad1 b/c intron is among the more conserved regions in pine mitochondrial genomes [35], and several other introns (e.g., nad5 intron 1 [36], nad7 introns [37], cox1 introns [38]) or mitochondrial simple sequence repeats (mtSSRs) are known to show substantially higher polymorphism in Pinus [12]. Future research incorporating additional mtDNA targets or mtSSR markers would give a more complete picture of maternal lineage structure in P. contorta and should be considered before making firm conclusions about mtDNA uniformity across the species. Benchmarks of locus-specific variability in conifers suggest that the nad1 b/c intron may be too conserved to resolve intraspecific structure even if such structure exists [35,36,37,38].

Spatial analysis revealed no geographic clustering of chloroplast haplotypes that corresponded to subspecies boundaries or major geographic regions. The most common haplotype (consensus sequence) was ubiquitous across all four subspecies and often the only variant within populations, indicating recent common ancestry and extensive gene flow. Single-nucleotide polymorphisms showed no apparent geographic structure, appearing randomly distributed across the range without the clinal patterns expected under isolation-by-distance or the discrete clusters expected under refugial isolation. Notably, rare haplotypes were disproportionately found in peripheral rather than core populations, contrary to theoretical predictions that range-margin populations should exhibit reduced diversity due to founder effects and genetic drift during range expansion (“leading edge” effects). This unexpected pattern may reflect (1) sampling artifacts given modest within-population sample sizes, (2) persistence of ancestral variation in historically stable peripheral refugia, or (3) mutation accumulation in long-isolated marginal populations. Distinguishing among these alternatives would require more intensive sampling of both peripheral and core populations.

3.2. Subspecies Differentiation and Phylogenetic Relationships

Genetic distances among recognized subspecies were minimal, ranging from 1.06 × 10⁻⁴ to 3.96 × 10⁻⁴ (Table 4), well within the range typically observed for intraspecific variation in conifers. Within-subspecies variation was comparable to between-subspecies variation, failing to support recognition of subspecies as genetically distinct evolutionary units. The largest genetic distance separated P. c. latifolia and P. c. contorta (3.96 × 10⁻⁴), whereas the smallest occurred between P. c. contorta and P. c. bolanderi (1.06 × 10⁻⁴). These small distances provide no evidence for deep evolutionary divergence among subspecies and suggest that morphological differences have evolved without substantial neutral genetic differentiation.

Maximum parsimony analysis of the combined chloroplast dataset produced a poorly resolved tree dominated by a large polytomy encompassing most individuals, with no recovery of morphologically defined subspecies as monophyletic groups; individuals from different subspecies clustered together throughout the topology. Bootstrap support values were uniformly low, indicating insufficient phylogenetic signal. The few resolved nodes separated individual haplotype variants rather than subspecies groups, suggesting either recent divergence or extensive homogenization of organellar lineages.

3.3. Demographic Signatures: Evidence for Recent Expansion

Neutrality tests revealed significant departures from equilibrium expectations, providing insights into demographic history. Tajima’s D was significantly negative (−2.26, p < 0.02), as was Fu and Li’s D* (−4.52, p < 0.02), both indicating an excess of rare alleles relative to neutral equilibrium expectations. In non-coding organellar regions where purifying selection is unlikely to operate strongly, a recent demographic expansion following glacial retreat is the most parsimonious explanation for this signature. Rapid range expansion from restricted southern and possibly northern refugia would amplify common ancestral alleles throughout the expanded range while rare variants remain localized near refugial areas, generating the observed site frequency spectrum characteristic of population growth.

Under the standard neutral model, Tajima’s D compares the average number of pairwise differences (π) with the number of segregating sites (S), with negative values indicating excess rare variants. Fu and Li’s D* similarly compares the number of external branch mutations (those appearing in only one sequence) with the total number of mutations, with negative values indicating an excess of singletons. The concordance between these two tests strengthens the inference that the observed site frequency spectrum departs from neutral equilibrium. However, a fundamental caveat must be acknowledged: negative values of both Tajima’s D and Fu and Li’s D* are consistent with both recent demographic expansion and purifying or positive selection at linked sites, and standard frequency-spectrum statistics cannot formally separate these processes [34,35]. The two explanations can, in principle, be distinguished using full or approximate Bayesian coalescent inference [39,40], which fits competing demographic models to the full data and returns posterior probabilities for each scenario. In the present study, demographic expansion is nonetheless the most parsimonious interpretation for three convergent reasons. First, the analyzed loci are non-coding intergenic spacers and intron sequences for which purifying selection is unlikely to be substantial. Second, the demographic-expansion signal is consistent with extensive paleoecological evidence (pollen records documenting rapid northward range expansion from glacial refugia following deglaciation [41,42]), providing independent corroboration from a completely different data class. Third, selection on organellar genomes would require either very strong purifying selection uniformly affecting all surveyed non-coding loci, or a selective sweep of recent origin, both of which are difficult to reconcile with the moderate and geographically homogeneous nucleotide diversity observed.
We therefore retain demographic expansion as the working hypothesis, while explicitly flagging that formal Bayesian model comparison remains outstanding and is recommended as a priority for future work (see Section 4.6).
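To make the comparison between π and S concrete, the following Python sketch computes Tajima's D for a small alignment using the standard textbook formula. It is an illustration only: the function name and the toy sequences are hypothetical, gaps and missing data are not handled, and it is not the software or the data used in this study.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (toy version, no gaps)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is zero below that)")
    # S: number of segregating (polymorphic) sites
    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if S == 0:
        raise ValueError("no segregating sites; D is undefined")
    # k: mean number of pairwise differences (pi summed over sites)
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants that depend only on the sample size
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation
    return (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


if __name__ == "__main__":
    # Hypothetical alignment in which every variant is a singleton
    rare_variants = ["AAAAAA", "TAAAAA", "ATAAAA", "AATAAA", "AAAAAA"]
    print(tajimas_d(rare_variants))
```

In the toy alignment every polymorphism is a singleton, so S/a1 exceeds the mean pairwise difference and D is negative, which is the direction of the signal reported above.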

3.4. Estimation of Coalescence Time and Demographic History

Nucleotide diversity (π ≈ 0.00018) can be used to estimate the time to the most recent common ancestor (T_MRCA) under the standard neutral coalescent relationship θ ≈ π = 4Neμ [43,44]. However, this calculation depends critically on two uncertain quantities: the per-site per-year substitution rate (μ) for the trnL/trnF region, and the long-term effective population size (Ne). Both quantities carry substantial uncertainty in non-equilibrium populations such as those that experienced glacial contraction and post-glacial expansion. We therefore present a sensitivity analysis rather than a single-point estimate, systematically varying μ and Ne over plausible ranges reported in the conifer literature.

Published fossil-calibrated substitution rates for non-coding cpDNA in pines range from approximately 0.5 to 5.0 × 10⁻⁹ substitutions per site per year [13,45,46]. The lower end of this spectrum is typical for slowly evolving intron regions, while the upper end reflects faster-evolving intergenic spacers. For Ne, lodgepole pine is a widespread, wind-pollinated conifer with large census population sizes [47]. However, chloroplast Ne is expected to be considerably lower than census size due to paternal inheritance [48], selective sweeps [49], and repeated glacial bottlenecks [50]. Empirical estimates of chloroplast Ne in related pines range from roughly 10,000 to about 150,000 [51,52]. Since T_MRCA = π/(2μ) under a diploid model or π/μ for haploid organellar genomes under the infinite-sites model (Table 5), the estimated coalescence times cover nearly two orders of magnitude across the plausible parameter space.

Table 5 shows estimated T_MRCA for five values of μ across the published range, evaluated at five effective population sizes: low (Ne = 10,000), lower-intermediate (Ne = 25,000), intermediate (Ne = 50,000), upper-intermediate (Ne = 100,000), and high (Ne = 200,000). Across the full parameter space, T_MRCA varies from approximately 4000 years (fast rate, low Ne) to roughly 720,000 years (slow rate, high Ne). The central scenario (μ = 1.5 × 10⁻⁹, Ne = 50,000) produces a T_MRCA of approximately 60,000 years, which falls squarely within the timeframe of the Wisconsin glaciation (~100,000–12,000 years ago). In scenarios with lower rates or larger Ne, coalescence predates the Wisconsin glaciation, suggesting that the observed low diversity might also reflect ancestral population structure prior to the most recent glacial cycle. Scenarios with faster rates and smaller Ne are consistent with post-glacial coalescence, suggesting a significant bottleneck during the late-glacial period or the Holocene.
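The arithmetic behind the central scenario can be sketched in a few lines of Python. The sketch evaluates only the two closed forms quoted above, π/(2μ) and π/μ. The rate grid is an assumption: the text states only the endpoints of the published range and the central value, so the two intermediate rates are illustrative fillers. The Ne dimension of Table 5 is not reproduced here because its scaling is not spelled out in this section.

```python
PI = 0.00018  # nucleotide diversity quoted in the text

# Assumed rate grid (substitutions per site per year). Only the endpoints
# (0.5e-9, 5.0e-9) and the central value (1.5e-9) come from the text;
# 1.0e-9 and 3.0e-9 are illustrative fillers.
RATES = [0.5e-9, 1.0e-9, 1.5e-9, 3.0e-9, 5.0e-9]


def coalescence_times(pi, mu):
    """Return the two closed-form estimates quoted in the text, in years."""
    return {"pi/(2mu)": pi / (2 * mu), "pi/mu": pi / mu}


if __name__ == "__main__":
    for mu in RATES:
        t = coalescence_times(PI, mu)
        print(f"mu = {mu:.1e}: pi/(2mu) = {t['pi/(2mu)']:,.0f} yr, "
              f"pi/mu = {t['pi/mu']:,.0f} yr")
```

With μ = 1.5 × 10⁻⁹ the first form gives 60,000 years, matching the central scenario, and the second gives 120,000 years.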

The standard equilibrium model used in this calculation also assumes a stable population size, but there is strong independent evidence—from the significantly negative Tajima’s D and Fu and Li’s D* statistics (Section 3.3) and paleoecological pollen records [41,42]—that lodgepole pine experienced significant post-glacial range expansion. Population growth after a bottleneck shortens coalescent branch lengths compared to the equilibrium expectation, leading π to underestimate the actual T_MRCA and skewing the site-frequency spectrum toward rare variants (exactly the pattern observed). Therefore, the estimates at the lower end of Table 5 are probably artificially young, and the actual coalescence time may be much older than any single-rate estimate suggests. Given this non-equilibrium demographic context, the T_MRCA estimates in Table 5 should be seen as approximate bounds rather than precise point estimates. Formal Bayesian inference with BEAST or similar software, using an appropriate demographic model and multiple fossil calibration points, would greatly reduce this uncertainty and is recommended for future research [40,53]. The key finding for this study’s conclusions is not a specific coalescence time, but rather that all scenarios within the plausible parameter space are incompatible with the deep, long-term isolation (millions of years) required to produce the level of organellar structure observed in well-differentiated conifer lineages.

Under neutral coalescent theory, lineage sorting over 100,000+ years should produce detectable phylogeographic structure even at slowly evolving organellar loci, especially given the large effective population sizes of widespread conifers [54,55,56]. The absence of such structure in lodgepole pine—consistent across the full sensitivity range in Table 5—suggests either (1) severe bottlenecks during glaciation eliminated most ancestral variation, leaving insufficient polymorphism for lineage differentiation; (2) rapid post-glacial expansion from a single or only a few closely related source populations homogenized variation before lineage sorting could occur; (3) cryptic northern refugia maintained connectivity among populations throughout glaciation, preventing complete isolation; or (4) post-glacial gene flow has been extensive enough to erase refugial signatures. Differentiating among these scenarios requires integrating paleoecological data and additional genetic information.

4. Discussion

4.1. Discordance Between Genetic and Morphological Variation: Theoretical Implications

Our results reveal a striking paradox: lodgepole pine exhibits pronounced morphological variation, warranting recognition of four distinct subspecies [2], yet its organellar genomes show virtually no phylogenetic structure that aligns with these taxonomic boundaries. This pattern matches Avise’s Category V phylogeographic structure—defined by widespread haplotypes, low genetic diversity, and shallow genealogical roots that point to recent expansion from a common ancestor [57]. This discordance between phenotypic and neutral genetic divergence challenges traditional assumptions underlying subspecies classifications in widespread conifers and requires careful evaluation within contemporary evolutionary theory.

From a theoretical perspective, the maintenance of morphological differentiation despite genetic homogeneity can arise through two non-mutually exclusive mechanisms: (1) rapid adaptive evolution driven by strong divergent selection, with gene flow-selection balance maintaining local adaptation across the species range [6,7,8]; or (2) phenotypic plasticity producing environmentally induced morphological variation without underlying genetic differentiation [58]. The former scenario predicts strong differentiation at loci underlying adaptive traits, whereas the latter predicts uniformly low differentiation across genomic regions. Distinguishing these alternatives requires genomic data targeting adaptive loci, though ecological observations provide initial insights. A critical limitation of the present study is that organellar loci, being neutrally evolving and uniparentally inherited, are blind to cryptic adaptive structure in the nuclear genome. Populations that appear homogeneous at chloroplast and mitochondrial markers may harbour substantial differentiation at nuclear loci under selection—a pattern documented in several conifers where organellar homogeneity coexists with pronounced nuclear adaptive divergence [59]. Consequently, conclusions about the relative roles of selection and gene flow should be regarded as hypotheses to be tested with nuclear genomic evidence rather than firm empirical findings.

Several lines of evidence favor adaptive evolution over pure phenotypic plasticity as the sole explanation for morphological divergence, indicating a substantial genetic basis for adaptive differentiation. First, common-garden experiments in which diverse provenances were grown under uniform conditions showed that growth (height, diameter, and volume), morphology (branch length, branch width, and branch angle), and specific gravity varied considerably across both traits and geographic regions, reflecting the complex genetic basis of these characteristics [60]. If variation in growth, morphology, and specific gravity were attributable solely to phenotypic plasticity, common-garden studies should eliminate these differences because all genotypes experienced identical environments.

Second, populations show the highest survival and growth in environments matching their origin, and performance declines when they are transplanted to foreign environments [61]. This pattern of local adaptation, in which each population performs best “at home”, requires genetic differentiation in fitness-related traits and cannot arise from phenotypic plasticity alone, because adaptive plasticity should optimize performance across all environments. In addition, morphological clines align with environmental gradients in predictable ways: coastal forms exhibit adaptations to fog-belt environments, interior forms show fire-adapted cone characteristics, and the pygmy forest subspecies displays extreme dwarfism on nutrient-poor soils [62].

Third, specific morphological traits show genetic control and rapid evolutionary responses that are inconsistent with phenotypic plasticity. Cone serotiny (the retention of closed cones that require fire-generated heat for seed release) correlates strongly with fire frequency [63]. Populations experiencing frequent stand-replacing fires show mean serotiny levels of 70–90%, whereas populations in fire-infrequent environments show less than 20% serotiny. Reciprocal seed-sowing experiments demonstrate rapid selection against non-serotinous phenotypes in high-fire environments and against serotinous phenotypes in low-fire environments [64]. Similarly, cone morphology responds to differential seed predation by red squirrels (Tamiasciurus hudsonicus) and crossbills (Loxia spp.), creating a geographic mosaic of predator-driven selection [65]. Where squirrels are abundant, selection favors cones with thick scales and strong attachment to branches (squirrel-resistant morphology), whereas crossbill predation favors elongated cones with weak scale closure (crossbill-resistant morphology). These opposing selection pressures maintain variation in cone morphology across the range despite potential gene flow. Taken together, these patterns suggest local adaptation rather than random phenotypic variation.

Contemporary evolutionary theory increasingly recognizes that substantial phenotypic divergence can occur rapidly, on ecological rather than evolutionary timescales, when selection is sufficiently strong [66,67]. The gene flow-selection balance framework offers a coherent explanation for our observations. Under this model, pollen-mediated gene flow (as reflected in chloroplast DNA) homogenizes neutral genetic variation across the landscape, while divergent selection maintains adaptive differentiation at functional loci despite ongoing migration [6,7,8,9]. The requisite selection coefficients need not be prohibitively large: theoretical models show that even moderate selection can maintain local adaptation under moderate levels of gene flow, particularly for traits with polygenic architectures, where selection can act on multiple loci simultaneously [6,8,9,10]. In lodgepole pine, wind pollination facilitates extensive pollen dispersal (as evidenced by near-complete chloroplast DNA homogeneity), whereas shorter seed dispersal distances (as reflected in slightly more structured mitochondrial patterns in previous studies [12]) create opportunities for seedling establishment in locally adaptive microhabitats.
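The logic of gene flow-selection balance can be illustrated with a deliberately simple model. The Python sketch below assumes a single haploid locus in one local population that receives migrants fixed for the alternative allele. This is far simpler than the polygenic architectures discussed above, and the values of s and m are hypothetical rather than estimates for lodgepole pine. Under these assumptions, the locally favoured allele persists whenever m < s/(1 + s).

```python
def next_generation(p, s, m):
    """One generation of selection followed by migration (haploid locus).

    p: frequency of the locally favoured allele
    s: selective advantage of that allele in the local environment
    m: fraction of the population replaced each generation by migrants
       that all carry the alternative allele
    """
    p_after_selection = p * (1 + s) / (1 + s * p)
    return (1 - m) * p_after_selection


def equilibrium(s, m):
    """Closed-form equilibrium frequency; 0 if migration swamps selection."""
    return max(0.0, 1 - m * (1 + s) / s)


def iterate(p, s, m, generations=2000):
    """Apply the recursion repeatedly and return the final frequency."""
    for _ in range(generations):
        p = next_generation(p, s, m)
    return p


if __name__ == "__main__":
    # Hypothetical parameter values chosen for illustration only
    for s, m in [(0.10, 0.02), (0.10, 0.20)]:
        print(f"s={s}, m={m}: simulated {iterate(0.5, s, m):.4f}, "
              f"closed form {equilibrium(s, m):.4f}")
```

In the first case selection outweighs migration and the favoured allele settles at an intermediate equilibrium frequency; in the second, migration swamps selection and the allele is lost.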

4.2. Biogeographic History and Glacial Refugia: Reconciling Molecular and Paleoecological Evidence

The genetic homogeneity seen across the range of lodgepole pine offers limited support for traditional biogeographic hypotheses that suggest long-term isolation in multiple glacial refugia [4,5]. Classical vicariance models predict that the Wisconsin glaciation (which covered most of the species’ current range for about 100,000 years) would fragment populations into isolated refugia, leading to distinct genetic lineages tied to specific refugial areas. According to coalescent expectations [43,44], the low observed nucleotide diversity (π ≈ 0.00018) indicates a relatively recent time to the most recent common ancestor (T_MRCA) for the sampled lineages. Using a chloroplast substitution rate of roughly 1.0–3.0 × 10⁻⁹ substitutions per site per year for pine trnL/trnF regions [13], the sensitivity analysis in Section 3.4 and Table 5 shows that plausible T_MRCA estimates range from approximately 4000 to 720,000 years depending on the assumed substitution rate and effective population size. The central scenario broadly aligns with the Wisconsin glaciation timeframe, but the broad uncertainty means we cannot confidently assign a specific coalescence date. Regardless of the assumed rate, lineage sorting over these timescales under neutral coalescent theory should produce detectable phylogeographic structure, given the effective population sizes typical of widespread conifers [54,55,56].

However, the absence of such structure in lodgepole pine, which contrasts sharply with the clear east–west phylogeographic division in Pinus ponderosa [34,68], suggests either more recent common ancestry or exceptionally effective post-glacial homogenization. Several scenarios could explain this pattern: (1) severe bottlenecks during glaciation eliminated ancestral variation, leaving insufficient polymorphism for lineage differentiation; (2) rapid post-glacial expansion from a single or a few closely related source populations homogenized variation before lineage sorting could complete; (3) cryptic northern refugia maintained connected populations throughout glaciation, preventing complete isolation; or (4) post-glacial gene flow has been sufficiently extensive to erase any signature of refugial isolation.

The identification of a unique 5 bp chloroplast indel, predominantly in coastal populations (Queen Charlotte Islands and adjacent mainland), provides direct molecular evidence for a northern Pacific refugium, consistent with paleoecological data indicating ice-free coastal zones during the Wisconsin glaciation [69,70,71]. The presence of this marker in interior populations (population 36, P. c. latifolia) is consistent with post-glacial pollen-mediated gene flow from coastal to interior regions. Wu and Ying [61] reported that population 36 exhibits phenotypic characteristics intermediate between coastal and interior forms, displaying coastal-type morphology (extensive browsing by snowshoe hare, Lepus americanus) at the test site while showing interior-type growth patterns at sites without wildlife damage. The geographic position of this population, situated in a river valley that descends toward the coast, facilitates pollen deposition from coastal air masses. The high frequency of non-serotinous cones observed in this population, a diagnostic trait distinguishing coastal from interior subspecies, provides additional evidence of coastal genetic influence. These observations are consistent with asymmetric gene flow from coastal to interior populations. However, we note that this interpretation is inferential rather than directly demonstrated, as organellar markers alone cannot establish directionality of gene flow. Direct confirmation would require nuclear genomic data, such as genome-wide SNP panels, to distinguish between coastal and interior allele pools. We propose that directional gene flow, bringing coastal alleles into interior populations, in combination with selection gradients associated with climatic variation, has been instrumental in establishing the observed geographic cline of local adaptation. This east–west connectivity in P. contorta [3,12,69] contrasts with the more isolated north–south refugial pattern proposed for P. ponderosa [34,68] and may reflect differences in refugial geography or post-glacial dispersal dynamics.

Recent syntheses of glacial refugia across western North American tree species [69,70] reveal considerable heterogeneity in refugial patterns, with some species exhibiting strong phylogeographic structure (e.g., Pinus ponderosa [34,68], Pseudotsuga menziesii [71]) and others showing relative genetic homogeneity (e.g., Picea glauca [72]). These differences likely reflect species-specific combinations of refugial distribution, effective population sizes, dispersal capability, and time since range expansion. Lodgepole pine’s pattern suggests either rapid expansion from limited refugia or maintenance of population connectivity through stepping-stone colonization during deglaciation [73], with subsequent gene flow effectively homogenizing organellar variation [74].

4.3. Mechanisms of Rapid Morphological Evolution: Life History and Ecological Context

The maintenance of substantial morphological variation despite neutral genetic homogeneity raises important questions about the mechanisms and tempo of adaptive evolution in lodgepole pine. Several ecological and life-history factors may facilitate rapid morphological evolution in this species, providing insights into broader patterns of conifer adaptation.

First, lodgepole pine’s demographic characteristics create conditions conducive to rapid local adaptation. The species exhibits early reproductive maturity, often producing cones by age 5–10 years [75], high fecundity [76], and large effective population sizes [56,77], collectively providing substantial standing genetic variation on which selection can act. Indeed, rates of contemporary evolution in natural populations can be remarkably rapid, with measurable phenotypic shifts occurring over tens to hundreds of generations when selection is strong [66,67,78]. In environments with strong selective pressures—particularly variable fire regimes—rapid evolutionary change can occur over relatively few generations [79,80,81], potentially producing substantial phenotypic divergence on timescales shorter than those required for neutral lineage sorting at organellar loci.

Second, specific ecological interactions generate strong divergent selection on morphological traits across the species’ range. Cone serotiny provides perhaps the most compelling example: populations experiencing frequent stand-replacing fires evolve high serotiny levels that maximize post-fire regeneration, whereas populations in fire-infrequent environments evolve non-serotinous cones that facilitate annual seed dispersal [63,75]. This trait shows rapid evolutionary responses in reciprocal transplant experiments [64]. Similarly, cone morphology responds to differential seed predation by red squirrels and crossbills, creating a geographic mosaic of predator-driven selection that maintains variation in cone characteristics [65].

Third, the polygenic architecture of morphological traits may facilitate rapid evolution despite gene flow. Unlike simple Mendelian traits, in which migration can readily override selection, polygenic traits with many loci of small effect can evolve and maintain local adaptation even with moderate levels of gene flow [9,82]. Selection acting simultaneously across multiple loci generates stronger overall differentiation than would be predicted from single-locus models, potentially explaining how morphological variation persists across lodgepole pine’s range despite extensive pollen flow. This polygenic scenario remains a working hypothesis, however: empirical support requires identification of adaptive loci through genome-wide association studies or selective-sweep analyses using dense nuclear SNP data. Future work should explicitly contrast F_ST at putatively neutral loci against Q_ST for quantitative morphological traits across subspecies to provide direct, rather than inferred, evidence for polygenic local adaptation.
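The proposed Q_ST versus F_ST contrast rests on a simple ratio of variance components. The Python sketch below uses the standard definition for a diploid, outcrossing species, Q_ST = V_B / (V_B + 2 V_W), where V_B is the additive genetic variance among populations and V_W the additive genetic variance within populations. The function name and all numerical values are hypothetical; in practice the variance components would come from common-garden data, and the comparison with F_ST would rest on confidence intervals rather than point estimates.

```python
def q_st(var_between, var_within_additive):
    """Q_ST for a quantitative trait in a diploid, outcrossing species.

    var_between: additive genetic variance among populations (V_B)
    var_within_additive: additive genetic variance within populations (V_W)
    """
    if var_between < 0 or var_within_additive < 0:
        raise ValueError("variance components must be non-negative")
    total = var_between + 2 * var_within_additive
    if total == 0:
        raise ValueError("Q_ST is undefined when both components are zero")
    return var_between / total


if __name__ == "__main__":
    # Hypothetical variance components and neutral F_ST, for illustration only
    trait_qst = q_st(var_between=0.5, var_within_additive=0.25)
    neutral_fst = 0.05
    print(f"Q_ST = {trait_qst:.2f}, neutral F_ST = {neutral_fst:.2f}")
    print("Q_ST exceeds F_ST" if trait_qst > neutral_fst else "Q_ST does not exceed F_ST")
```

A Q_ST well above neutral F_ST is the pattern expected under divergent selection, whereas Q_ST close to F_ST is compatible with drift alone.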

4.4. Taxonomic Implications: Rethinking Subspecies as Adaptive Ecotypes

Our findings, based on neutral organellar markers with limited polymorphism, raise questions about the degree to which current subspecies classifications reflect deep phylogenetic divisions in lodgepole pine. The genetic homogeneity observed across recognized subspecies boundaries, combined with continuous morphological variation between forms [2], is more consistent with an ecotypic interpretation than with deep phylogenetic divergence. However, we stress that this conclusion is necessarily tentative given the limited number of polymorphic sites resolved and the absence of data from adaptive genomic regions. Neutral markers are inherently insensitive to divergent selection acting on ecologically important traits; thus, the lack of neutral genetic structure does not preclude substantial adaptive differentiation. An ecotypic framework—in which subspecies represent adaptive solutions to local environmental challenges rather than evolutionarily independent lineages—is consistent with the data, but a definitive taxonomic reinterpretation would require corroborating evidence from genome-wide scans, quantitative genetic analyses, and assessments of reproductive isolation.

The ecotypic interpretation better accommodates several empirical observations. First, morphological transitions between subspecies occur gradually along environmental gradients rather than at discrete boundaries, particularly between P. c. murrayana and P. c. latifolia, where elevation-associated variation produces continuous clinal variation [2,3]. Second, diagnostic characters (cone serotiny, needle length, growth form) correlate more strongly with environmental variables than with geographic distance, suggesting adaptive responses to local conditions [2,60,61,62]. Third, provenance performance depends critically on environmental matching, with local adaptation evident across relatively fine spatial scales [61,83]. These patterns collectively indicate that morphological variation reflects ongoing adaptation to heterogeneous environments rather than historical isolation.

However, this ecotypic interpretation does not diminish the biological significance of recognized forms. Rather, it reframes our understanding of their evolutionary origin and maintenance: subspecies represent dynamic adaptive responses maintained by spatially varying selection rather than static entities isolated by reproductive barriers. This perspective has important implications for nomenclature and taxonomy. While retaining subspecies designations for communication and management may be pragmatic, we should recognize that these categories mark points along adaptive continua rather than discrete evolutionary units. This interpretation aligns with emerging conceptual frameworks that emphasize the evolutionary process over pattern in defining biological diversity [84].

4.5. Conservation and Management Implications

The ecotypic interpretation of lodgepole pine diversity has significant implications for conservation prioritization and management strategies. Traditional approaches that emphasize protecting taxonomic units (subspecies, varieties) may inadequately capture the adaptive processes that maintain morphological variation along environmental gradients. Instead, conservation efforts should focus on preserving the ecological contexts and selective regimes that promote local adaptation.

For assisted migration and reforestation programs, our results indicate that genotype-environment matching should take precedence over subspecies identity in seed source selection. Provenance trials showing strong local adaptation [60,61,83] indicate that transferring populations to climatically mismatched sites—even within the same subspecies—can reduce fitness and compromise regeneration success. Conversely, populations from different subspecies may perform similarly when matched to appropriate environmental conditions. This finding supports the development of seed transfer guidelines based on climate models and provenance performance data rather than taxonomic boundaries [85,86].

Climate change adds urgency to these considerations. As environmental conditions shift, populations must either adapt in situ, migrate to track suitable habitats, or face extirpation [87,88]. The genetic homogeneity we observe suggests substantial connectivity across the range, potentially facilitating adaptive allele flow to populations experiencing novel climates. However, the strong local adaptation evident in morphological traits suggests that adaptation may need to be rapid to keep pace with environmental change. Conservation strategies should therefore maximize both within-population genetic diversity (providing raw material for adaptation) and landscape connectivity (facilitating gene flow), while recognizing that historical taxonomic units may not effectively capture functionally important variation.

The genetic basis of adaptation will be critical to determining evolutionary responses to climate change. While neutral genetic diversity provides evolutionary potential, the architecture of adaptive traits—their heritability, genetic correlations, and pleiotropic effects—will determine the rate and direction of evolutionary change. Understanding these genetic architectures through genomic approaches will enable more informed predictions of population responses and more effective conservation interventions.

4.6. Broader Implications for Conifer Phylogeography and Evolution, and Priorities for Future Bayesian Demographic Inference

Lodgepole pine’s pattern of morphological divergence without corresponding genetic structure may characterize conifers more broadly, particularly among species with large ranges and high potential for gene flow. The growing body of phylogeographic studies in western North American conifers reveals considerable heterogeneity: some species show strong phylogeographic structure (e.g., P. ponderosa [34,68], Pseudotsuga menziesii [71]), whereas others exhibit relative homogeneity (e.g., P. contorta [this study], Picea glauca [72]). Understanding the factors that underlie these contrasting patterns remains a significant challenge.

Several characteristics may predispose lodgepole pine to rapid morphological evolution and genetic homogenization. First, the species’ ecological generalism—occupying sites ranging from coastal fog belts to interior montane forests, from nutrient-poor soils to productive sites—exposes populations to diverse selective pressures that drive adaptive differentiation. Second, wind pollination, combined with early reproductive maturity, facilitates extensive gene flow that homogenizes neutral variation while allowing adaptive loci to differentiate under selection. Third, the species’ boom-and-bust demography associated with fire disturbance creates periodic selective episodes that can drive rapid evolutionary change. These factors collectively create conditions that favor the evolution and maintenance of ecotypic variation in the absence of deep phylogenetic structure.

More broadly, our results suggest that, at least for lodgepole pine, inferences of evolutionary independence based solely on morphological differences should be approached cautiously. Whether this conclusion applies widely to other common conifers remains an open empirical question: patterns of disagreement between morphological traits and neutral genetic divergence are not universal (as the contrasting example of Pinus ponderosa illustrates), and each species needs individual assessment. The traditional method of recognizing subspecies, primarily based on morphology, may overstate phylogenetic divergence in taxa where morphological traits are strongly selected and can evolve rapidly, but this caveat applies only when organellar and nuclear evidence align. Combining genomic data that target both neutral and adaptive variation offers a more detailed view of evolutionary processes, distinguishing between historical demographic changes (as reflected in neutral markers) and current adaptation (as reflected in functional traits and their genetic basis).

A particular priority is to formally separate demographic expansion from selection when explaining the negative neutrality-test statistics discussed here. As noted in the Introduction and Section 3.3, Tajima’s D and Fu and Li’s D* are inherently ambiguous regarding this distinction. Two complementary Bayesian approaches could clarify this ambiguity in future work. First, approximate Bayesian computation (ABC; [39]) provides a flexible likelihood-free method where observed summary statistics are compared to distributions simulated under different demographic models (e.g., constant size, exponential expansion, bottleneck followed by expansion). This approach allows for efficient estimation of posterior model probabilities, with nuisance parameters—including the substitution rate and effective population size—integrated over, rather than fixed at specific values, thereby carrying the uncertainty from the sensitivity analysis in Section 3.4. Second, full Bayesian coalescent inference using software such as BEAST [40] can directly fit flexible demographic priors (e.g., the Bayesian skyline plot) to sequence data, estimating posterior distributions of effective population size over time. Applying these methods to the current dataset, along with additional nuclear loci to distinguish organellar from nuclear demographic histories, would significantly bolster the conclusion that the site-frequency-spectrum signature reflects post-glacial range expansion rather than selection on linked sites. It would also produce posterior-supported population size trajectories consistent with paleoecological reconstructions.
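The rejection form of ABC model choice described above can be sketched with the Python standard library alone. The sketch simulates coalescent trees under a constant-size and an exponential-growth model, summarizes each simulation by Tajima's D, and counts the simulations that fall close to the observed value (D = −2.26). Everything else is an assumption made for illustration: the sample size n = 30, the uniform priors on θ and on the growth rate, the tolerance, and the use of a single summary statistic. It compares demographic models only, so it cannot by itself exclude selection, and it is not the analysis proposed for the real dataset.

```python
import math
import random


def poisson(lam, rng):
    """Exact Poisson draw: count unit-rate arrivals in [0, lam]."""
    count, total = 0, rng.expovariate(1.0)
    while total < lam:
        count += 1
        total += rng.expovariate(1.0)
    return count


def simulate_sfs(n, theta, alpha, rng):
    """Site frequency spectrum of one simulated coalescent tree.

    alpha = 0 is constant size; alpha > 0 is exponential growth toward the
    present. Time is in coalescent units; theta is the scaled mutation rate.
    """
    sizes = [1] * n   # number of sampled descendants of each lineage
    sfs = [0] * n     # sfs[i]: mutations carried by exactly i samples
    t = 0.0
    while len(sizes) > 1:
        k = len(sizes)
        w = rng.expovariate(k * (k - 1) / 2)
        # Rescale the waiting time when the population shrinks backward in time.
        wait = w if alpha == 0 else math.log1p(alpha * w * math.exp(-alpha * t)) / alpha
        for size in sizes:
            sfs[size] += poisson(theta / 2 * wait, rng)
        t += wait
        i, j = sorted(rng.sample(range(k), 2))   # pick two lineages to merge
        merged = sizes[i] + sizes[j]
        sizes.pop(j)
        sizes.pop(i)
        sizes.append(merged)
    return sfs


def tajimas_d_from_sfs(sfs):
    """Tajima's D from an unfolded SFS; None if the statistic is undefined."""
    n, S = len(sfs), sum(sfs)
    if n < 4 or S == 0:
        return None
    k = sum(c * i * (n - i) for i, c in enumerate(sfs)) / (n * (n - 1) / 2)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    c1 = (n + 1) / (3 * (n - 1)) - 1 / a1
    c2 = 2 * (n * n + n + 3) / (9 * n * (n - 1)) - (n + 2) / (a1 * n) + a2 / a1**2
    var = (c1 / a1) * S + (c2 / (a1**2 + a2)) * S * (S - 1)
    return (k - S / a1) / math.sqrt(var)


def abc_model_choice(d_obs, n, n_sims, tol, rng):
    """Rejection ABC: count accepted simulations under each model."""
    accepted = {"constant": 0, "expansion": 0}
    for _ in range(n_sims):
        model = rng.choice(["constant", "expansion"])   # equal prior weight
        theta = rng.uniform(1.0, 10.0)                   # assumed prior
        alpha = rng.uniform(1.0, 50.0) if model == "expansion" else 0.0  # assumed prior
        d = tajimas_d_from_sfs(simulate_sfs(n, theta, alpha, rng))
        if d is not None and abs(d - d_obs) <= tol:
            accepted[model] += 1
    return accepted


if __name__ == "__main__":
    counts = abc_model_choice(d_obs=-2.26, n=30, n_sims=2000, tol=0.3,
                              rng=random.Random(1))
    total = sum(counts.values())
    for model, c in counts.items():
        share = c / total if total else float("nan")
        print(f"{model}: accepted {c}, approximate posterior probability {share:.2f}")
```

With equal prior weight on the two models, the share of accepted simulations under each model approximates its posterior probability. A realistic application would use several summary statistics, informative priors, and many more simulations.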

4.7. Limitations, Inferential Status of Key Conclusions, and Future Directions

Several conclusions in this manuscript warrant explicit clarification of their evidential basis. The following distinguishes conclusions directly supported by our data from those that are inferential or hypothesis-generating. (1) Directly supported: low organellar nucleotide diversity (π = 0.000178–0.000186) and absence of subspecies-level phylogenetic structure in cpDNA and mtDNA; significantly negative neutrality test values (Tajima’s D = −2.26; Fu and Li’s D* = −4.52), which are consistent with demographic expansion; and a 5 bp chloroplast indel concentrated in coastal populations. (2) Inferential or hypothesis-generating: asymmetric coastal-to-interior pollen-mediated gene flow; rapid polygenic adaptation as the primary driver of morphological divergence; and the claim that selection is strong enough to overcome gene flow across the genome. These latter claims are logically consistent with the organellar data but require nuclear genomic evidence for direct confirmation.

Another limitation is that, because organellar loci evolve neutrally, our study cannot detect any hidden adaptive nuclear structures that might exist. It is plausible—and, based on evidence from common-garden and provenance studies [60,61,83], likely—that populations show significant divergence at adaptive nuclear loci despite organellar uniformity. Future research should therefore focus on: (i) genome-wide SNP genotyping (such as RADseq or whole-genome resequencing) to measure nuclear F_ST and find outlier loci under selection; (ii) environmental association analyses that connect genomic variants to climate and disturbance gradients; (iii) Q_ST–F_ST comparisons to differentiate between adaptive and neutral divergence; and (iv) coalescent demographic modeling with nuclear loci to formally test divergence scenarios and gene flow directionality. Until such data are available, our ecotypic reinterpretation should be seen as a well-supported working hypothesis rather than a definitive conclusion.

Figure 2 presents a conceptual model showing how gene flow (pollen versus seed dispersal), divergent selection, and morphological differentiation interact in lodgepole pine. It illustrates how cpDNA homogenization through long-distance pollen flow can happen alongside adaptive morphological divergence driven by spatially varying selection on fire- and climate-related traits.

5. Conclusions

This phylogeographic analysis uncovers a key paradox in lodgepole pine evolution: morphological differences clear enough to have motivated subspecies recognition occur alongside almost complete genetic uniformity at organellar loci. This pattern conflicts with the expectation, under allopatric divergence in separate glacial refugia, that organellar markers would show deep phylogenetic splits among subspecies, and it underscores the complex interactions among gene flow, natural selection, and demographic history in shaping current biodiversity patterns.

Our organellar data do not support the existence of deep phylogenetic divisions among the recognized subspecies. Instead, the data align with the idea that morphological differences result from rapid adaptive evolution in spatially diverse environments rather than from long-term geographic separation. This view reinterprets subspecies as potential ecotypes maintained by ongoing divergent selection despite extensive gene flow. We emphasize, however, that this interpretation remains hypothesis-generating: testing it directly requires nuclear genomic data, including adaptive-locus scans and quantitative-genetic studies, which we identify as a critical priority for future research.

The discovery of a distinctive 5 bp chloroplast indel concentrated in coastal populations offers molecular evidence for a northern Pacific refugium during the Wisconsin glaciation. Its presence in interior populations is consistent with post-glacial pollen-mediated gene flow, although the direction of that gene flow remains inferential. However, the overall pattern of genetic uniformity suggests either severe bottlenecks during glaciation or highly effective post-glacial homogenization, in contrast to the more structured phylogeographic patterns seen in related species. These findings add to the growing understanding that glacial refugia and post-glacial colonization patterns differ markedly among co-distributed species, influenced by species-specific factors such as population size, dispersal ability, and demographic history.

The maintenance of morphological variation despite organellar genetic homogeneity has important theoretical implications: it suggests that strong divergent selection can overcome the homogenizing effects of gene flow on ecologically important traits. This pattern supports models of local adaptation despite migration and highlights that evolutionary independence cannot be determined solely from phenotypic differences. The capacity for rapid morphological change seen in lodgepole pine appears to be common among conifers. Phylogeographic and adaptive genetic research across Pinus species, including P. sylvestris [89], P. pinea L. [90], and P. mugo [91], as well as Picea species such as Picea abies [92] and Picea mariana [93], consistently reveals marked phenotypic differences across environmental gradients despite limited neutral genetic divergence. These studies collectively indicate that strong divergent selection can sustain locally adapted ecotypes even with substantial gene flow, a pattern that seems to be a hallmark of widespread, wind-pollinated conifers. This recurring pattern across diverse conifer lineages suggests that traditional morphology-based taxonomy may not accurately represent genetic relationships or evolutionary history in this group.

These results have direct implications for conservation and management strategies. As detailed in Section 4.5, a process-oriented approach that prioritizes preservation of ecological gradients, selective regimes, and landscape connectivity—rather than taxonomic units per se—is recommended; genotype–environment matching should guide seed source selection in reforestation and assisted migration programs.

Future research that combines genomic methods with ecological niche modeling can clarify the genetic basis of adaptation and the environmental factors that drive morphological differences. Such studies can test whether putatively adaptive loci are more strongly differentiated than neutral genomic regions, which would directly demonstrate the role of local adaptation in maintaining genetic variation. By integrating genetic, morphological, environmental, and demographic data within clear theoretical frameworks, we can develop a fuller understanding of evolutionary processes in widespread species and create more effective strategies for conserving biodiversity amid rapid environmental change.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/dna6020020/s1. All Supplementary files are in the ZIP DNA-4122203_All Supplementary_Files.

Author Contributions

Conceptualization, A.J.F. and F.C.Y.; methodology, A.J.F. and F.C.Y.; data collection, A.J.F.; data curation, A.J.F. and F.C.Y.; validation, A.J.F. and F.C.Y.; formal analysis, A.J.F.; writing—original draft preparation, A.J.F.; writing—review and editing, A.J.F. and F.C.Y.; funding acquisition, F.C.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by grants from the Natural Sciences and Engineering Research Council of Canada (STR0118500 and A2282 to F.C.Y.) and by a University of Alberta, Department of Biological Sciences, Challenge Grant in Biodiversity to A.J.F.

Data Availability Statement

The DNA sequences generated in this study have been deposited in GenBank under accession numbers PZ132966-PZ133105 (trnL intron); PZ133106-PZ133245 (trnL/trnF spacer); PZ133246-PZ133385 (nad1 b/c intron). Supplementary File S1 has the unedited and edited alignments.

Acknowledgments

We acknowledge the infrastructure support provided by the Department of Renewable Resources at the University of Alberta. We are grateful to the Research Branch, B.C. Ministry of Forests, for permission to collect samples from their lodgepole pine provenance trials. We thank Chao Wu of SCAU for assisting with replotting Figure 1.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Little, E.L., Jr. Checklist of Native and Naturalized Trees of the United States (Including Alaska); Forest Service: Washington, DC, USA, 1953.
2. Critchfield, W.B. Geographic Variation in Pinus contorta; Harvard University Press: Cambridge, MA, USA, 1957.
3. Wheeler, N.C.; Guries, R.P. Biogeography of lodgepole pine. Can. J. Bot. 1982, 60, 1805–1814.
4. Critchfield, W.B. The late Quaternary history of lodgepole and jack pines. Can. J. For. Res. 1985, 15, 749–772.
5. Dyke, A.S.; Prest, V.K. Late Wisconsinan history of the Laurentide ice sheet. Géograph. Phys. Quat. 1987, 41, 327–364.
6. Lenormand, T. Gene flow and the limits to natural selection. Trends Ecol. Evol. 2002, 17, 183–189.
7. Tigano, A.; Friesen, V.L. Genomics of local adaptation with gene flow. Mol. Ecol. 2016, 25, 2144–2164.
8. Yeaman, S.; Whitlock, M.C. The genetic architecture of adaptation under migration-selection balance. Evolution 2011, 65, 1897–1911.
9. Yeaman, S. Evolution of polygenic traits under global vs local adaptation. Genetics 2022, 220, iyab134.
10. Le Corre, V.; Kremer, A. The genetic differentiation at quantitative trait loci under local adaptation. Mol. Ecol. 2012, 21, 1548–1566.
11. Fazekas, A.J.; Yeh, F.C. Postglacial colonization and population genetic relationships in the Pinus contorta complex. Can. J. Bot. 2006, 84, 223–234.
12. Godbout, J.; Fazekas, A.; Newton, C.H.; Yeh, F.C.; Bousquet, J. Glacial vicariance in the Pacific Northwest: Evidence from a lodgepole pine mitochondrial DNA minisatellite for multiple genetically distinct and widely separated refugia. Mol. Ecol. 2008, 17, 2463–2475.
13. Wolfe, K.H.; Li, W.H.; Sharp, P.M. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc. Natl. Acad. Sci. USA 1987, 84, 9054–9058.
14. Dong, J.; Wagner, D.B.; Yanchuk, A.D.; Carlson, M.R.; Magnussen, S.; Wang, X.; Szmidt, A.E. Paternal chloroplast DNA inheritance in Pinus contorta and Pinus banksiana: Independence of parental species or cross direction. J. Hered. 1992, 83, 419–422.
15. Dong, J.; Wagner, D.B. Paternally inherited chloroplast polymorphism in Pinus: Estimation of diversity and population subdivision and tests of disequilibrium with a maternally inherited mitochondrial polymorphism. Genetics 1994, 136, 1187–1194.
16. Neale, D.B.; Sederoff, R.R. Paternal inheritance of chloroplast DNA and maternal inheritance of mitochondrial DNA in loblolly pine. Theor. Appl. Genet. 1989, 77, 212–216.
17. Excoffier, L.; Smouse, P.E.; Quattro, J.M. Analysis of molecular variance inferred from metric distances among DNA haplotypes: Application to human mitochondrial DNA restriction data. Genetics 1992, 131, 479–491.
18. Doyle, J.J.; Doyle, J.L. Isolation of plant DNA from fresh tissue. Focus 1990, 12, 13–15.
19. Taberlet, P.; Gielly, L.; Pantou, G.; Bouvet, J. Universal primers for amplification of three non-coding regions of chloroplast DNA. Plant Mol. Biol. 1991, 17, 1105–1109.
20. Gielly, L.; Taberlet, P. The use of chloroplast DNA to resolve plant phylogenies: Noncoding versus rbcL sequences. Mol. Biol. Evol. 1994, 11, 769–777.
21. Gielly, L.; Taberlet, P. A phylogeny of the European gentians inferred from chloroplast trnL (UAA) intron sequences. Bot. J. Linn. Soc. 1996, 120, 57–75.
22. Bellstedt, D.U.; Linder, H.P.; Harley, E.H. Phylogenetic relationships in Disa based on non-coding trnL-trnF chloroplast sequences: Evidence of numerous repeat regions. Am. J. Bot. 2001, 88, 2088–2100.
23. González, D.; Vovides, A.P. Low intralineage divergence in Ceratozamia (Zamiaceae) detected with nuclear ribosomal DNA ITS and chloroplast DNA trnL-F non-coding region. Syst. Bot. 2002, 27, 654–661.
24. Mitton, J.B.; Kreiser, B.R.; Rehfeldt, G.E. Primers designed to amplify a mitochondrial nad1 intron in ponderosa pine, Pinus ponderosa, limber pine, P. flexilis, and Scots pine, P. sylvestris. Theor. Appl. Genet. 2000, 101, 1269–1272.
25. Mitton, J.B.; Kreiser, B.R.; Latta, R.G. Glacial refugia of limber pine (Pinus flexilis James) inferred from the population structure of mitochondrial DNA. Mol. Ecol. 2000, 9, 91–97.
26. Chromas, Version 2.6; Technelysium Pty Ltd.: Helensvale, QLD, Australia, 2018.
27. Tamura, K.; Stecher, G.; Kumar, S. MEGA11: Molecular Evolutionary Genetics Analysis Version 11. Mol. Biol. Evol. 2021, 38, 3022–3027.
28. Simmons, M.P.; Ochoterena, H. Gaps as characters in sequence-based phylogenetic analyses. Syst. Biol. 2000, 49, 369–381.
29. Nei, M. Molecular Evolutionary Genetics; Columbia University Press: New York, NY, USA, 1987.
30. Tajima, F.; Nei, M. Estimation of evolutionary distance between nucleotide sequences. Mol. Biol. Evol. 1984, 1, 269–285.
31. Rozas, J.; Rozas, R. DnaSP version 3: An integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 1999, 15, 174–175.
32. Tajima, F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 1989, 123, 585–595.
33. Fu, Y.X.; Li, W.H. Statistical tests of neutrality of mutations. Genetics 1993, 133, 693–709.
34. Potter, K.M.; Hipkins, V.D.; Mahalovich, M.F.; Means, R.E. Mitochondrial DNA haplotype distribution patterns in Pinus ponderosa (Pinaceae): Range-wide evolutionary history and implications for conservation. Am. J. Bot. 2013, 100, 1562–1579.
35. Soranzo, N.; Alía, R.; Provan, J.; Powell, W. Patterns of variation at a mitochondrial sequence-tagged-site locus provides new insights into the postglacial history of European Pinus sylvestris populations. Mol. Ecol. 2000, 9, 1205–1211.
36. Chen, K.; Abbott, R.J.; Milne, R.I.; Tian, X.-M.; Liu, J. Phylogeography of Pinus tabulaeformis Carr. (Pinaceae), a dominant species of coniferous forest in northern China. Mol. Ecol. 2008, 17, 4193–4214.
37. Naydenov, K.D.; Senneville, S.; Beaulieu, J.; Tremblay, F.; Bousquet, J. Glacial vicariance in Eurasia: Mitochondrial DNA evidence from Scots pine for a complex heritage involving genetically distinct refugia at mid-northern latitudes and in Asia Minor. BMC Evol. Biol. 2007, 7, 233.
38. Ran, J.-H.; Wei, X.-X.; Wang, X.-Q. Molecular phylogeny and biogeography of Pinus armandii and its relatives: Heterogeneous contributions of geography and climate changes to the genetic differentiation and diversification of Chinese white pines. PLoS ONE 2014, 9, e85920.
39. Beaumont, M.A.; Zhang, W.; Balding, D.J. Approximate Bayesian computation in population genetics. Genetics 2002, 162, 2025–2035.
40. Drummond, A.J.; Rambaut, A.; Shapiro, B.; Pybus, O.G. Bayesian coalescent inference of past population dynamics from molecular sequences. Mol. Biol. Evol. 2005, 22, 1185–1192.
41. MacDonald, G.M.; Cwynar, L.C. A fossil pollen based reconstruction of the late Quaternary history of lodgepole pine (Pinus contorta ssp. latifolia) in the western interior of Canada. Can. J. For. Res. 1985, 15, 1039–1044.
42. Cwynar, L.C.; MacDonald, G.M. Geographical variation of lodgepole pine in relation to population history. Am. Nat. 1987, 129, 463–469.
43. Kingman, J.F.C. The coalescent. Stoch. Process. Their Appl. 1982, 13, 235–248.
44. Kingman, J.F.C. Origins of the Coalescent: 1974–1982. Genetics 2000, 156, 1461–1463.
45. Eckert, A.J.; Hall, B.D. Phylogeny, historical biogeography, and patterns of diversification for Pinus (Pinaceae): Phylogenetic tests of fossil-based hypotheses. Mol. Phylogenet. Evol. 2006, 40, 166–182.
46. Muse, S.V. Examining rates and patterns of nucleotide substitution in plants. Plant Mol. Biol. 2000, 42, 25–43.
47. Dawn Marshall, H.; Newton, C.; Ritland, K. Chloroplast phylogeography and evolution of highly polymorphic microsatellites in lodgepole pine (Pinus contorta). Theor. Appl. Genet. 2002, 104, 367–378.
48. Birky, C.W.; Fuerst, P.; Maruyama, T. Organelle gene diversity under migration, mutation, and drift: Equilibrium expectations, approach to equilibrium, effects of heteroplasmic cells, and comparison to nuclear genes. Genetics 1989, 121, 613–627.
49. Charlesworth, B.; Morgan, M.T.; Charlesworth, D. The effect of deleterious mutations on neutral molecular variation. Genetics 1993, 134, 1289–1303.
50. Hewitt, G.M. Some genetic consequences of ice ages, and their role in divergence and speciation. Biol. J. Linn. Soc. 1996, 58, 247–276.
51. Ledig, F.T.; Conkle, M.T.; Bermejo-Velázquez, B.; Eguiluz-Piedra, T.; Hodgskiss, P.D.; Johnson, D.R.; Dvorak, W.S. Evidence for an extreme bottleneck in a rare Mexican pinyon: Genetic diversity, disequilibrium, and the mating system in Pinus maximartinezii. Evolution 1999, 53, 91–99.
52. Provan, J.; Soranzo, N.; Wilson, N.J.; Goldstein, D.B.; Powell, W. A low mutation rate for chloroplast microsatellites. Genetics 1999, 153, 943–947.
53. Ho, S.Y.W.; Phillips, M.J. Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Syst. Biol. 2009, 58, 367–380.
54. Hudson, R.R. Gene genealogies and the coalescent process. Oxf. Surv. Evol. Biol. 1990, 7, 1–44.
55. Avise, J.C.; Arnold, J.; Ball, R.M.; Bermingham, E.; Lamb, T.; Neigel, J.E.; Reeb, C.A.; Saunders, N.C. Intraspecific phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annu. Rev. Ecol. Syst. 1987, 18, 489–522.
56. Hamrick, J.L.; Godt, M.J.W.; Sherman-Broyles, S.L. Factors influencing levels of genetic diversity in woody plant species. New For. 1992, 6, 95–124.
57. Avise, J.C. Phylogeography: Retrospect and prospect. J. Biogeogr. 2009, 36, 3–15.
58. Schlichting, C.D. The evolution of phenotypic plasticity in plants. Annu. Rev. Ecol. Syst. 1986, 17, 667–693.
59. Savolainen, O.; Pyhäjärvi, T.; Knürr, T. Gene flow and local adaptation in trees. Annu. Rev. Ecol. Evol. Syst. 2007, 38, 595–619.
60. Hu, X.S.; Yanchuk, A.; Yeh, F.C. Quantitative genetic architecture and evolutionary potential of lodgepole pine: Insights from a multi-zone provenance trial. J. For. Res. 2026, 37, 70.
61. Wu, H.X.; Ying, C.C. Geographic pattern of local optimality in natural populations of lodgepole pine. For. Ecol. Manag. 2004, 194, 177–198.
62. Aitken, S.N.; Libby, W.J. Evolution of the pygmy-forest edaphic subspecies of Pinus contorta across an ecological staircase. Evolution 1994, 48, 1009–1019.
63. Lotan, J.E. Cone serotiny-fire relationships in lodgepole pine. In Tall Timbers Fire Ecology Conference and Fire and Land Management Symposium 14; Tall Timbers Research Station: Tallahassee, FL, USA, 1976; pp. 267–278.
64. Benkman, C.W.; Parchman, T.L.; Favis, A.; Siepielski, A.M. Reciprocal selection causes a coevolutionary arms race between crossbills and lodgepole pine. Am. Nat. 2003, 162, 182–194.
65. Benkman, C.W.; Holimon, W.C.; Smith, J.W. The influence of a competitor on the geographic mosaic of coevolution between crossbills and lodgepole pine. Evolution 2001, 55, 282–294.
66. Carroll, S.P.; Hendry, A.P.; Reznick, D.N.; Fox, C.W. Evolution on ecological time-scales. Funct. Ecol. 2007, 21, 387–393.
67. Messer, P.W.; Ellner, S.P.; Hairston, N.G. Can population genetics adapt to rapid evolution? Trends Genet. 2016, 32, 408–418.
68. Potter, K.M.; Hipkins, V.D.; Mahalovich, M.F.; Means, R.E. Nuclear genetic variation across the range of ponderosa pine (Pinus ponderosa): Phylogeographic, taxonomic and conservation implications. Tree Genet. Genomes 2015, 11, 38.
69. Roberts, D.R.; Hamann, A. Glacial refugia and modern genetic diversity of 22 western North American tree species. Proc. R. Soc. B Biol. Sci. 2015, 282, 20142903.
70. Jaramillo-Correa, J.P.; Beaulieu, J.; Khasa, D.P.; Bousquet, J. Inferring the past from the present phylogeographic structure of North American forest trees: Seeing the forest for the genes. Can. J. For. Res. 2009, 39, 286–307.
71. Gugger, P.F.; Sugita, S.; Cavender-Bares, J. Phylogeography of Douglas-fir based on mitochondrial and chloroplast DNA sequences: Testing hypotheses from the fossil record. Mol. Ecol. 2010, 19, 1877–1897.
72. Anderson, L.L.; Hu, F.S.; Nelson, D.M.; Petit, R.J.; Paige, K.N. Ice-age endurance: DNA evidence of a white spruce refugium in Alaska. Proc. Natl. Acad. Sci. USA 2006, 103, 12447–12450.
73. Hewitt, G.M. The genetic legacy of the Quaternary ice ages. Nature 2000, 405, 907–913.
74. Ennos, R.A. Estimating the relative rates of pollen and seed migration among plant populations. Heredity 1994, 72, 250–259.
75. Lotan, J.E.; Critchfield, W.B. Pinus contorta Dougl. ex Loud. lodgepole pine. In Silvics of North America; USDA: Washington, DC, USA, 1990; Volume 1, pp. 302–315.
76. Koch, P. Lodgepole Pine in North America; Forest Products Society: Ruston, LA, USA, 1996.
77. Wheeler, N.C.; Guries, R.P. Population structure, genic diversity, and morphological variation in Pinus contorta Dougl. Can. J. For. Res. 1982, 12, 595–606.
78. Gingerich, P.D. Rates of evolution. Annu. Rev. Ecol. Evol. Syst. 2009, 40, 657–675.
79. Lamont, B.B.; Pausas, J.G.; He, T.; Witkowski, E.T.; Hanley, M.E. Fire as a selective agent for both serotiny and nonserotiny over space and time. Crit. Rev. Plant Sci. 2020, 39, 140–172.
80. Tinker, D.B.; Romme, W.H.; Hargrove, W.W.; Gardner, R.H.; Turner, M.G. Landscape-scale heterogeneity of lodgepole pine serotiny. Can. J. For. Res. 1994, 24, 897–903.
81. Pausas, J.G. Evolutionary fire ecology: Lessons learned from pines. Trends Plant Sci. 2015, 20, 318–324.
82. Zwaenepoel, A.; Sachdeva, H.; Fraïsse, C. The genetic architecture of polygenic local adaptation and its role in shaping barriers to gene flow. Genetics 2024, 228, iyae140.
83. Rehfeldt, G.E.; Ying, C.C.; Spittlehouse, D.L.; Hamilton, D.A., Jr. Genetic responses to climate in Pinus contorta: Niche breadth, climate change, and reforestation. Ecol. Monogr. 1999, 69, 375–407.
84. de Queiroz, K. Species concepts and species delimitation. Syst. Biol. 2007, 56, 879–886.
85. Ying, C.C.; Yanchuk, A.D. The development of British Columbia’s tree seed transfer guidelines: Purpose, concept, methodology, and implementation. For. Ecol. Manag. 2006, 227, 1–13.
86. Wang, T.; O’Neill, G.A.; Aitken, S.N. Integrating environmental and genetic effects to predict responses of tree populations to climate. Ecol. Appl. 2010, 20, 153–163.
87. Aitken, S.N.; Yeaman, S.; Holliday, J.A.; Wang, T.; Curtis-McLane, S. Adaptation, migration or extirpation: Climate change outcomes for tree populations. Evol. Appl. 2008, 1, 95–111.
88. Alberto, F.J.; Aitken, S.N.; Alía, R.; González-Martínez, S.C.; Hänninen, H.; Kremer, A.; Lefèvre, F.; Lenormand, T.; Yeaman, S.; Whetten, R.; et al. Potential for evolutionary responses to climate change – Evidence from tree populations. Glob. Change Biol. 2013, 19, 1645–1661.
89. Pyhäjärvi, T.; Kujala, S.T.; Savolainen, O. 275 years of forestry meets genomics in Pinus sylvestris. Evol. Appl. 2020, 13, 11–30.
90. Vendramin, G.G.; Fady, B.; González-Martínez, S.C.; Hu, F.S.; Scotti, I.; Sebastiani, F.; Soto, A.; Petit, R.J. Genetically depauperate but widespread: The case of an emblematic Mediterranean pine. Evolution 2008, 62, 680–688.
91. Dzialuk, A.; Boratynski, A.; Boratynska, K.; Burczyk, J. Geographic patterns of genetic diversity of Pinus mugo (Pinaceae) in Central European mountains. Dendrobiology 2012, 68, 31–41.
92. Heuertz, M.; De Paoli, E.; Källman, T.; Larsson, H.; Jurman, I.; Morgante, M.; Lascoux, M.; Gyllenstrand, N. Multilocus patterns of nucleotide diversity, linkage disequilibrium, and demographic history of Norway spruce [Picea abies (L.) Karst.]. Genetics 2006, 174, 2095–2105.
93. Gamache, I.; Jaramillo-Correa, J.P.; Payette, S.; Bousquet, J. Diverging patterns of mitochondrial and nuclear DNA diversity in subarctic black spruce: Imprint of a founder effect associated with postglacial colonization. Mol. Ecol. 2003, 12, 891–901.

Figure 1. Locations of 31 populations of Pinus contorta used in this study.

Figure 2. Conceptual model of migration-selection balance in Pinus contorta. The upper panel shows a coast-to-interior transect across the species range, illustrating wind-borne pollen dispersal (dashed green arrow) that homogenizes chloroplast DNA across all four subspecies. The lower panel summarizes the three interacting processes. On the left: pollen-mediated gene flow maintains near-identical cpDNA haplotype frequencies across the range (nucleotide diversity π = 0.000178–0.000186), erasing any phylogeographic structure at neutral loci. In the center: spatially varying divergent selection, driven by fire regime (cone serotiny), climate gradient (growth form and phenology), and seed predation by crossbills (Loxia spp.) and red squirrels (Tamiasciurus hudsonicus), exerts strong, locally specific selection pressures on polygenic morphological traits, where the net selection coefficient greatly exceeds the migration rate (s >> m). On the right: the net result is pronounced morphological differentiation (diagnostic traits shown in the trait grid) alongside minimal genetic distances between subspecies (1.06–3.96 × 10⁻⁴). Taken together, the data support an ecotypic rather than a phylogenetic interpretation of subspecies identity, consistent with Avise Category V phylogeographic structure. This framework predicts that adaptive differentiation will be detectable at functional loci in genome-wide scans, even though neutral markers are uniform.

Table 1. Sequences of the amplification and sequencing primers used in this study.

	Primer	Sequence 5′–3′
Chloroplast	c	CGAAATCGGTAGACGCTACG
	d	GGGGATAGAGGGACTTGAAC
	e	GGTTCAAGTCCCTCTATCCC
	f	ATTTGAACTGGTGACACGAG
	trnF-Rb	CTTGCCAGGAACCAGATTTG
Mitochondrial	NAD1B1F	ATGCCGCCCGTTTCCATTTC
	NAD1C1R	TGCTGCAAAGGGTTAGGGGG
	NAD1B3F	CTTTTTGGTTTGCTTATTGGGTGGGGGG
	NAD1C3R	TTTTAAGTGACTCGCCCGACC

Table 2. Relative positions of single-nucleotide polymorphisms and a 5 bp indel in the trnL intron in 9 individuals from 7 lodgepole pine populations. The numbers above indicate the position relative to the intron start. Individual identifiers show population number followed by tree number within that population (e.g., 33-3 = population 33, individual tree 3); ‘–’ indicates the consensus (wild-type) base; INDEL indicates presence of the 5 bp deletion (TAAAT) at positions 404–408. Population numbers in this table correspond to the provenance codes in Figure 1. For all but two populations (131 and 135), five trees were sampled per population. Three and four trees were sampled from Populations 135 and 131, respectively.

Nucleotide Position
	92	111	389	400	404–408	420
Consensus	C	T	A	G	TAAAT	G
Individual
33-3 (latifolia)	T	–	–	–	–	–
33-14 (latifolia)	–	G	–	–	–	–
36-3 (latifolia)	–	–	–	–	INDEL	–
95-3 (contorta)	–	–	–	–	INDEL	–
49-11 (contorta)	–	–	–	–	INDEL	A
49-18 (contorta)	–	–	–	–	INDEL	–
123-3 (murrayana)	–	–	–	T	–	–
131-26 (murrayana)	–	–	C	–	–	–
135-8 (murrayana)	–	–	C	–	–	–
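As a small illustration of how the indel distribution in Table 2 can be tallied, the sketch below transcribes the table rows into a presence/absence list and counts carriers of the 5 bp deletion by subspecies. The variable and function names are hypothetical. Because Table 2 lists only variant-bearing individuals, the counts are carriers among listed trees, not population frequencies.

```python
from collections import Counter

# (individual, subspecies, carries_5bp_deletion), transcribed from Table 2.
TABLE2_ROWS = [
    ("33-3", "latifolia", False),
    ("33-14", "latifolia", False),
    ("36-3", "latifolia", True),
    ("95-3", "contorta", True),
    ("49-11", "contorta", True),
    ("49-18", "contorta", True),
    ("123-3", "murrayana", False),
    ("131-26", "murrayana", False),
    ("135-8", "murrayana", False),
]


def indel_carriers_by_subspecies(rows):
    """Count listed individuals carrying the deletion, per subspecies."""
    return Counter(subsp for _, subsp, has_indel in rows if has_indel)


carriers = indel_carriers_by_subspecies(TABLE2_ROWS)
```

The tally gives three carriers in the coastal subspecies contorta and one in the interior subspecies latifolia, matching the pattern described in the text.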

Table 3. Relative positions of single nucleotide polymorphisms, a 1 bp indel, and a 26 bp indel in the trnL/F spacer in 6 lodgepole pine populations. The numbers above indicate the position relative to the start of the spacer. Individual identifiers show population number followed by tree number within that population (e.g., 31-5 = population 31, individual tree 5); ‘–’ indicates the consensus (wild-type) base; INDEL indicates the presence of a deletion polymorphism at the indicated position. The 26 bp deletion in individual 31-15 spans positions 228–253 of the spacer alignment. Population numbers in this table correspond to the provenance codes in Figure 1. For all but three populations (135, 141, 154), five trees were sampled per population. Three trees were sampled from Populations 135 and 141. Four trees were sampled from Population 154.

				Nucleotide Position
	11	91	103	228	240	244	253
Consensus	C	T	G	AATTATTCAATTGCAGTCCATTTTTA
Individual
31-5 (latifolia)	–	–	–	T		–
31-15 (latifolia)	–	–	–	————— INDEL (26 bp) ————
141-26 (latifolia)	–	–	T	–		–
143-10 (latifolia)	–	–	–	–		G
145-5 (latifolia)	–	–	–	–		G
154-11 (latifolia)	A	–	–	–		–
135-13 (murrayana)	–	INDEL	–	–		–

Table 4. Evolutionary distance within (diagonal) and between the four subspecies of lodgepole pine based on the chloroplast trnL intron and the trnL/trnF spacer, using the method of Tajima and Nei [30]. Diagonal values represent mean pairwise within-subspecies distances; off-diagonal values represent mean pairwise between-subspecies distances, each calculated over all individual sequence pairs. Per-subspecies population (p) and sample sizes (n): latifolia p = 16 and n = 76; contorta p = 11 and n = 46; murrayana p = 3 and n = 12; bolanderi p = 1 and n = 5. The dash (–) for the bolanderi within-subspecies distance reflects zero polymorphism within this subspecies (n = 5 with identical sequences). Subspecies names in italic follow the authority of Critchfield [2].

Subspecies	latifolia	murrayana	contorta	bolanderi
latifolia	2.09 × 10⁻⁴
murrayana	1.31 × 10⁻⁴	5.45 × 10⁻⁴
contorta	3.96 × 10⁻⁴	3.13 × 10⁻⁴	4.93 × 10⁻⁵
bolanderi	1.06 × 10⁻⁴	2.88 × 10⁻⁴	2.56 × 10⁻⁵	–
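The within- and between-group averaging behind a matrix of this kind can be sketched as follows. For simplicity the sketch uses the uncorrected p-distance (proportion of differing sites) rather than the Tajima and Nei correction applied in the study, and the toy sequences and function names are hypothetical.

```python
from itertools import combinations, product


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites, skipping alignment gaps ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def mean_within(group):
    """Mean distance over all pairs of sequences inside one group."""
    pairs = list(combinations(group, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(group_1, group_2):
    """Mean distance over all pairs with one sequence from each group."""
    pairs = list(product(group_1, group_2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy aligned sequences (hypothetical, not study data).
group_a = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
group_b = ["ACGTACGTAC", "ACGAACGTAC"]
within_a = mean_within(group_a)
between_ab = mean_between(group_a, group_b)
```

When most sequences are identical, as in this toy set, the between-group mean is barely larger than the within-group mean, which is the qualitative pattern seen in Table 4.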

Table 5. Sensitivity analysis of T_MRCA estimates (years × 10³) across plausible cpDNA substitution rates and effective population sizes (π = 0.00018, haploid model: T_MRCA = π/μ). Shaded cells correspond to parameter combinations compatible with the Wisconsin glaciation interval (~12–100 ka). All estimates should be treated as illustrative order-of-magnitude bounds given non-equilibrium demography (see Section 3.4).

μ (×10⁻⁹ subs/site/y)	N_e = 10,000	N_e = 25,000	N_e = 50,000	N_e = 100,000	N_e = 200,000
0.5	36	90	180	360	720 †
1.0	18	45	90	180	360
1.5 (central)	12	30	60	120	240
3.0	6	15	30	60	120
5.0	4	9	18	36	72

Green shading = Wisconsin glaciation interval (12–100 ka). Yellow shading = central scenario (μ = 1.5, N_e = 50,000). † Exceeds the Pleistocene epoch boundary; likely an artefact of assuming equilibrium demography in a post-glacial expansion species. All values are approximations; formal Bayesian inference is recommended for precise estimation.
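The arithmetic behind the caption's stated relationship, T_MRCA = π/μ, can be checked with a short sketch. It assumes only the π value and substitution rates given in Table 5; with these inputs π/μ reproduces the N_e = 100,000 column. How the other columns scale with N_e is not specified in the caption, so that dimension is not modeled here. The names are hypothetical.

```python
PI = 0.00018  # nucleotide diversity used in Table 5
RATES = [0.5e-9, 1.0e-9, 1.5e-9, 3.0e-9, 5.0e-9]  # substitutions/site/year


def tmrca_ka(pi: float, mu: float) -> float:
    """T_MRCA = pi / mu, converted from years to thousands of years."""
    return pi / mu / 1000.0


def in_wisconsin_interval(t_ka: float) -> bool:
    """True if the estimate falls within ~12-100 ka (interval from Table 5)."""
    return 12.0 <= t_ka <= 100.0


# (rate, estimate in ka, compatible with the Wisconsin interval?)
estimates = [
    (mu, round(tmrca_ka(PI, mu)), in_wisconsin_interval(tmrca_ka(PI, mu)))
    for mu in RATES
]
```

Under these assumptions only the two fastest rates (3.0 and 5.0 × 10⁻⁹) place the estimate inside the Wisconsin interval, which illustrates how strongly the dating depends on the assumed substitution rate.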

