Genome-Wide Variation in DNA Methylation Predicts Variation in Leaf Traits in an Ecosystem-Foundational Oak Species

: Epigenetic modiﬁcations such as DNA methylation are a potential mechanism for trees to respond to changing environments. However, it remains controversial the extent to which DNA methylation impacts ecologically important traits that inﬂuence ﬁtness. In this study, we used reduced-representation bisulﬁte sequencing to associate genomic and epigenomic variation with seven phenotypic traits related to growth, leaf function, and disease susceptibility in 160 valley oak ( Quercus lobata ) saplings planted across two common gardens in California. We found that DNA methylation was associated with a signiﬁcant fraction of phenotypic variance in plant height, leaf lobedness, powdery mildew infection, and trichome density. Two of the seven traits were signiﬁcantly associated with DNA methylation in the CG context, three traits were signiﬁcantly associated with CHG methylation, and two traits were signiﬁcantly associated with CHH methylation. Notably, controlling for genomic variation in SNPs generally reduced the amount of trait variation explained by DNA methylation. Our results suggest that DNA methylation may serve as a useful biomarker to predict phenotypic variation in trees, though it remains unclear the degree to which DNA methylation is a causal mechanism driving phenotypic variation in forest tree species.


Introduction
Due to rapidly changing environmental conditions caused by drivers of global change, understanding the potential role of epigenetics in plant response to the environment has become a topic of interest for ecologists and evolutionary biologists [1][2][3][4][5]. DNA methylation, where cytosine bases are methylated by distinct molecular pathways in either the CG, CHG, or CHH (H = C, A, or T) sequence contexts, is a prominent epigenetic modification [4,6,7] that may play a role in adaptive phenotypic responses to the environment [8][9][10][11]. In addition, phenotypic traits mediated by DNA methylation can be heritable [10,[12][13][14][15][16], providing the raw material for evolution to occur and a mechanism for induced phenotypic responses to the environment to persist across generations. Several studies have shown DNA methylation to be associated with phenotypic traits, such as drought tolerance in Arabidopsis [10], flowering time in flax [17], adaption to temperature in Arabidopsis [18], and various traits in other herbaceous species [10,17,19,20]. However, studies resolving the associations between DNA methylation and phenotype are less common in tree species despite the advantages epigenetic modifications may bring for plants with long lifespans to be able to respond phenotypically to rapid climate change [21]. growth and altered gene expression when DNA methylation is inhibited experimentally [31]. Moreover, oaks have an unusual pattern of DNA methylation where subcontext-specific patterns of DNA methylation associated with transposable elements reveal that the genome has broadly distributed heterochromatin in intergenic regions [48], which is a pattern more similar to grasses than to other angiosperms [49]. Using data from 160 valley oak saplings planted across two common gardens in California, we addressed the following questions: Is there an association between DNA methylation and plant growth and leaf traits? How does this relationship vary by sequence context? If we control for the underlying genetic variation across individuals, do associations between DNA methylation and phenotypic traits remain? In addition, we perform a series of data simulations to test the power and precision of our analysis approach in an effort to guide future studies.

Study Species
Quercus lobata Née (valley oak) is an ecosystem-foundational oak species endemic to California that occurs along the foothills of the Sierra Nevada and Coastal mountain ranges ( Figure 1). As a dominant species in woodlands and savannas, valley oak has high ecological importance [50], in addition to its high cultural value to Native Americans [51]. In general, valley oak populations have high local genetic diversity, low genetic structure, and significant associations between genotypic and climatic gradients across its range [52,53]. Previous studies of valley oak suggest that DNA methylation may be involved in local adaptation to climate, especially in the CG context, which shows high differentiation among populations and strong associations with temperature gradients [26,27].

Valley Oak Provenance Trial
All phenotypic and sequencing data used in this study come from the valley oak provenance trial, and the full details on the design and establishment of the valley oak provenance trial are available in [54]. In 2012, >10,000 open-pollinated acorns were collected from 674 adult valley oaks in 95 localities distributed across the species range. These

Valley Oak Provenance Trial
All phenotypic and sequencing data used in this study come from the valley oak provenance trial, and the full details on the design and establishment of the valley oak provenance trial are available in [54]. In 2012, >10,000 open-pollinated acorns were collected from 674 adult valley oaks in 95 localities distributed across the species range. These acorns were germinated in a greenhouse and in 2014, 6945 seedlings were planted at two sites in Placerville (IFG) and Chico, California ( Figure 1). Seedlings were planted with 2.25 m spacing intervals in a blocked design, such that each family (i.e., half-sib progeny collected as acorns from a single adult) had at least one individual in each block at each site. Both sites were irrigated and weeded to increase the probability of seedling establishment, as is commonly done in provenance trials.

Phenotypic Measurements
In 2016, when saplings were approximately four years old, we measured a set of seven phenotypic traits related to performance and leaf function: height of the tallest stem (cm), leaf area (cm 2 ), leaf lobedness, specific leaf area (SLA, cm 2 g −1 ), leaf thickness (mm), trichome density (categorical, see below), and powdery mildew infection intensity (categorical, see below) [55]. Height of the tallest stem in valley oak seedlings is strongly correlated with other metrics of size and growth, such as diameter at breast height (dbh), and likely plays a key role in early stage survival for avoiding herbivory and surviving abiotic stress [56]. For leaf traits, we sampled mature leaves that were~10 cm from the branch tip from each side of the tree. We used one leaf sample per individual for estimating trichome density, two samples for leaf thickness and 4-5 leaves per individual for leaf area, specific leaf area, and leaf lobedness. Pairwise correlations among phenotypic traits were all R < 0.46 (Table 1). We measured leaf area by scanning fully hydrated leaves and estimating area using the ImageJ software [57]. To estimate leaf lobedness, we estimated leaf perimeter using ImageJ and then divided leaf perimeter 2 by leaf area. Leaf lobedness in oaks is negatively associated with hydraulic resistance and may be a mechanism for optimizing water balance under dry conditions [58]. We then pressed and dried the leaves to calculate SLA (leaf area divided by dried mass) [59]. Higher SLA has been shown to be related to higher leaf nitrogen concentration, higher potential relative growth rates, and lower leaf longevity and lower investment in secondary compounds [59]. We measured leaf thickness at the midpoint between the perimeter and midrib, midway between the base and the tip using digital calipers, making sure to avoid secondary veins, as recommended by [59]. Higher leaf thickness is related to longer leaf longevity, higher toughness, higher photosynthetic capacity per unit leaf area, and leaf nitrogen per unit area, which may be advantageous when under high levels of moisture availability [59,60]. We estimated trichome density by viewing the abaxial side of the leaf under a microscope at 1500× magnification following [61] and [62] and categorizing trichome density into 6 categories: <15, 15-24, 25-34, 35-44, 45-54, and ≥55 trichomes in the field of view. Higher trichome densities reduce water loss by decreasing the rate of transpiration and may be advantageous in dry conditions [63]. We categorized powdery mildew infection based on the proportion of leaves on the entire sapling displaying signs of infection: no signs of infection, low infection (<33% of leaves infected), medium infection (33-66% of leaves infected), high infection (>66% of leaves displaying sign of infection).
We analyzed the relationship between SNP, methylation, and phenotypic variation using both raw phenotypic trait values and phenotypic trait values adjusted to control for local environmental effects of each common garden. Raw phenotypic values include the potential contribution of environmentally induced DNA methylation on phenotypic traits, while the adjusted phenotypic values include the contribution of DNA methylation to a phenotypic trait after controlling for local environmental effects. To accomplish this, we fit a separate linear regression for each phenotypic trait using the phenotypic trait as the response variable and common garden location and block nested within common garden location as random effects using the 'lmer' function in the 'lme4' R package [64]. We then took the residuals of each model as the adjusted phenotypic values.

Bisulfite Sequencing
We chose 160 focal saplings for bisulfite sequencing, where we simultaneously identified genomic SNPs and variation in DNA methylation across the genome using the same DNA sample (see below). The saplings came from a total of 40 maternal families, sampling 2 progeny from each family in each planting site. We followed the same bisulfite sequencing protocol described elsewhere [27]. Briefly, we extracted total genomic DNA from leaf tissue using a prewash protocol [65] and the Qiagen Dneasy plant extraction kit. We modified the protocol of Feng et al. [66] to prepare reduced-representation bisulfite sequencing (RRBS) libraries. We first digested total genomic DNA with Mspl (CCGG) that was then end-repaired and adenylated with Klenow fragments (3' to 5' exo-). We ligated Illumina TruSeq adapters to fragments in each library. We performed size selection using AMPureXP beads to target fragments of 200 to 500 bp (including~120 bp of adapter sequence). We treated libraries with sodium bisulfite (EpiTect, Qiagen) to perform the conversion of unmethylated cytosines to uracil, which then are subsequently read as thymine during sequencing. The prepared libraries were PCR amplified with Illumina primers and then pooled in batches of 12. Pooled libraries were sequenced on multiple lanes on an Illumina HiSeq 2000 v3 in 100-bp single end mode at the core facilities of the Broad Stem Cell Center, UCLA.

Methylation and SNP Calling
We filtered Illumina reads that failed the Illumina chastity test and then converted from QSEQ to FASTQ format, demultiplexed by sample, and applied the EAMSS quality score correction. We used TrimGalore (https://github.com/FelixKrueger/TrimGalore, accessed on 31 October 2019) in 'rrbs' mode to trim 3-4 bases at the ends of reads after examining mbias plots generated in MethylDackel (https://github.com/dpryan79/MethylDackel, accessed on 31 October 2019), ensuring reads were at least 20 bases with a quality score of 20. We aligned trimmed reads using bwa-meth [67] to v3.0 of the valley oak genome [68] using default settings. We extracted the methylation levels separately for the CG, CHG, and CHH context using MethylDackel, with minimum mapping quality of 40, a minimum depth of coverage of 10×, and a maximum depth of coverage of 1000×. These methylation levels range from 0 to 1 and estimate the fraction of cells containing DNA methylation at a locus for a given sample. We filtered methylation sites such that each site had <10% missing data and had a standard deviation >0.01, to select for variable sites. We then removed any methylation site that had a high correlation (R 2 ≥ 0.70) with any other methylation site in the same sequence context within 500 bp until no such pairs remained, leaving a set of uncorrelated methylation sites. To reduce the influence of extreme methylation levels at individual samples in analysis, we winsorized methylation levels at each site using the 'Winsorize' function in the DescTools R package [69]. Winsorizing limits extreme values in a dataset and reduces the effect of spurious outliers by setting all data below the 5th percentile to the 5th percentile and all data above the 95th percentile to the 95th. Finally, we removed two individual valley oaks for having ≥25% missing data across all methylation sites. After filtering, we obtained 6175 mCG, 2831 mCHG, and 4350 mCHH variable and high-quality methylation sites across the valley oak genome.
To call SNPs from the RRBS reads [70], we used the BISulfite-seq CUI Toolkit (BISCUIT) (https://github.com/zhou-lab/biscuit, accessed on October 31 2019). We set a minimum base quality of 20 and mapping quality of 20 during SNP calling. We then used the Genome Analysis Tool Kit [71] to filter variants with an allele count AC < 4, allele frequency AF < 0.10, and genotype quality GQ < 20. We further filtered SNPs using vcftools [72] such that all SNPs were biallelic, had < 10% missing data, minor allele frequency > 0.05, and a mean depth of coverage over all individuals ≥5× and ≤500×. To reduce the occurrence of spurious SNP calls from the RRBS reads, we used vcftools to exclude variants that were out of Hardy-Weinberg Equilibrium with p < 0.0000001. Then, we used plink [73] to filter for linkage disequilibrium among SNPs, similar to the methylation sites, where a SNP was removed if it was highly correlated (R 2 ≥ 0.70) with any other SNP in a 500 bp window until no such pairs remain. After filtering, we retained 8450 high quality and variable SNPs.

Trait Associations
To determine the association between DNA methylation and our set of phenotypic traits, we used the GCTA (Genome-wide Complex Trait Analysis) software [47]. GCTA estimates the proportion of phenotypic variance of a trait explained by genomic data (e.g., SNP or DNA methylation data) using relatedness matrices and genome-based restricted maximum likelihood (GREML) [44]. The use of relatedness matrices to explain variation in complex traits has grown recently in ecology and evolutionary biology [43,46] due to the increasing availability of reduced representation sequencing that is well-suited for these types of analyses. Additionally, GCTA is able to estimate the separate contributions of multiple relatedness matrices to phenotypic variance. In this way, we were able to separate the amount of phenotypic variance explained by DNA methylation after controlling for underlying genetic variation in the sampled SNPs. Importantly though, SNPs outside of the sampled sequences may still play an important role in predicting phenotypic variation, which would not be accounted for due to lack of sampling. The approach does not relate individual SNPs or methylation sites to individual traits, but rather uses variation in SNPs and DNA methylation sampled across the entire genome to predict phenotype.
We calculated relatedness matrices from the SNP and DNA methylation for each sequence context separately (e.g., CG, CHG, CHH) using OSCA (OmicS-data-based Complex trait Analysis) [74]. In all model runs, we included the first 2 principal components of the relatedness matrix or matrices included in the analysis to help control for population stratification ( Figure S1). We also included mean depth of coverage to control for differences in sequencing coverage among individuals. Mean depth of coverage across sites was highly correlated across SNP, CG, CHG, and CHH sites (Pearson's R > 0.98 for all comparisons), and thus for simplicity we only used depth of coverage at SNP sites as a covariate in the GCTA analysis. We used the expectation-maximization restricted maximum likelihood (EM-REML) algorithm option with no constraint and 10,000 maximum iterations to aid in model convergence.
We included 154 saplings in the final trait analyses (two excluded for missing methylation data and four excluded for missing trait data). With this sample size, we expect high uncertainty and low power to detect trait associations. However, our goal in this study was to determine if there is a detectable association between DNA methylation, SNPs, and phenotypic traits, rather than precisely estimate the amount of variance DNA methylation can explain, which would require much larger sample sizes. To this end, we used a permutation-based approach to test whether the amount of variance explained by SNPs or DNA methylation in each phenotypic trait was higher than would be expected by chance, under the null hypothesis that there was no association between SNPs, DNA methylation, and our phenotypic traits, following the approach of [75]. To accomplish this, for each model run, we randomized phenotypes among individuals n = 10,000 times and in each iteration calculated the percent of variance explained by DNA methylation and SNPs in permuted phenotypes. We compared this null distribution of expected percent variance explained in phenotypes with the observed phenotypic variance explained in the observed data. If the observed amount of phenotypic variance was >95% of permuted values, we interpret this as a signal of a significant association between the trait of interest and DNA methylation of SNP variation. If more than one relatedness matrix was included in the model run, we separately estimated the null distribution of phenotypic variance explained for each matrix.
Additionally, we performed a series of data simulations to test the validity and power of our permutation-based approach. We simulated a series of different scenarios that varied the number of individuals sampled and the amount of trait variance explained by either SNP or DNA methylation variation. In each simulation repetition, we first simulated a SNP dataset with 1000 markers all with minor allele frequency >0.05. We then simulated a set of 1000 DNA methylation markers that varied in methylation levels from 0.0-1.0 at each marker. We next simulated independent polygenic additive effect sizes for each SNP and methylation marker that was then used to simulate a phenotypic trait. The effect sizes of SNP and methylation markers were independent from each other as well. We varied the true amount of variance in the phenotypic trait explained by either the SNP or DNA methylation markers from 0.00001 (required to be a non-zero number), 0.10, 0.20, 0.30, 0.40. We also varied the number of individuals being simulated from 50, 150, 250, 500, to 1000 individuals. For each parameter combination, we employed the permutation-based testing approach described above for detecting a significant effect of either SNP or DNA methylation in predicting trait variation. We simulated 100 repetitions of each parameter combination. We combined the results to calculate false positive rates when DNA methylation explained close to 0 trait variance and the statistical power of the permutation-based approach when a non-zero amount of trait variance was explained by DNA methylation across sample sizes.

Results
Overall, DNA methylation alone in any context without controlling for genetic variation explained a significant fraction of variance in four of the seven traits we measured ( Figure 2). However, the amount of trait variation explained by DNA methylation was generally reduced and often was no longer statistically significant after controlling for genetic variation (Figure 2). The few exceptions to this pattern were CHG and CHH methylation predicting plant height and CHG methylation predicting extent of powdery mildew infection, which both explained a significant fraction of trait variation even after controlling for genetic variation (Figure 2). Traits varied in which DNA methylation contexts were able to explain variation, with some traits associated with CG methylation (e.g., leaf lobedness, trichome density), others with CHG methylation (e.g., height, extent of powdery mildew infection, trichome density), and others with CHH methylation (e.g., height, trichome density, Figure 2). Leaf area, specific leaf area, and leaf thickness did not show significant associations with DNA methylation in any context ( Figure 2). SNP variation explained a significant fraction of trait variation for leaf area, leaf lobedness, and trichome density ( Figure 2).
Adjusting phenotypic trait values for local environmental effects (i.e., controlling for variation across common garden sites and among blocks within sites) affected estimates of the amount of variation explained primarily for height, trichome density, and extent of powdery mildew infection ( Figure 2). In these cases, adjusted phenotypes showed weaker associations with variation in DNA methylation when local environmental effects were accounted for (Figure 2). The amount of trait variation explained by SNPs was not affected by controlling for local environmental effects (Figure 2).
Using simulated data, we found that our analysis approach of permuting phenotypes to simulate the null hypothesis of no association between SNP or DNA methylation data and traits had an acceptable false discovery rate (~5%) for testing the null hypothesis of no association between trait variation and SNP/DNA methylation variation at sample sizes similar to those used in this study (n = 150 individuals, Figure 3). However, the analysis approach had low power to detect associations (e.g., 20% power to detect association if DNA methylation explains 30% of trait variance) at these sample sizes (Figure 3). Furthermore, we found high levels of variation and uncertainty in the estimates of variation explained when sample sizes were <500 individuals ( Figure S2). Adjusting phenotypic trait values for local environmental effects (i.e., controlling for variation across common garden sites and among blocks within sites) affected estimates of the amount of variation explained primarily for height, trichome density, and extent of powdery mildew infection ( Figure 2). In these cases, adjusted phenotypes showed weaker associations with variation in DNA methylation when local environmental effects were accounted for ( Figure 2). The amount of trait variation explained by SNPs was not affected by controlling for local environmental effects ( Figure 2).
Using simulated data, we found that our analysis approach of permuting phenotypes to simulate the null hypothesis of no association between SNP or DNA methylation data and traits had an acceptable false discovery rate (~5%) for testing the null hypothesis of no association between trait variation and SNP/DNA methylation variation at sample sizes similar to those used in this study (n = 150 individuals, Figure 3). However, the analysis approach had low power to detect associations (e.g., 20% power to detect association if DNA methylation explains 30% of trait variance) at these sample sizes (Figure 3). Furthermore, we found high levels of variation and uncertainty in the estimates of variation explained when sample sizes were <500 individuals ( Figure S2).

Discussion
Overall, we found significant associations between DNA methylation and phenotype for four of the seven traits measured in this study (plant height, leaf lobedness, disease susceptibility, and trichome density). The phenotypic traits varied in their association with DNA methylation based on sequence context, with traits such as height associated

Discussion
Overall, we found significant associations between DNA methylation and phenotype for four of the seven traits measured in this study (plant height, leaf lobedness, disease susceptibility, and trichome density). The phenotypic traits varied in their association with DNA methylation based on sequence context, with traits such as height associated with mCHG and mCHH methylation, while leaf lobedness associated most strongly with mCG sites. In addition, the amount of variation explained by DNA methylation was substantially reduced in some cases after controlling for genetic variation in SNPs, indicating that variation in DNA methylation and genomic background may be confounded in some instances. Resolving the associations between methylation and phenotypic traits is important for understanding the role of epigenetic modifications in plant response to the environment. This study provides one of the few examples of significant associations between DNA methylation and phenotype for a tree species.
The height of 4-year-old valley oak saplings showed strong associations with variation in mCHG and mCHH methylation and no associations with SNPs or mCG methylation. When we adjusted height to control for local environmental effects (i.e., differences within and across common gardens), the significant association between height and mCHG and mCHH no longer remained. In general, plant height and growth are plastic traits largely influenced by environmental conditions experienced by individual plants, though growth in valley oak can be partially explained by SNP variation [76]. The fact that the association between DNA methylation and height was removed after controlling for plastic phenotypic responses suggests mCHG and mCHH methylation exhibit plastic responses to the environment even within common gardens, which correspond to changes in plant height. A previous experimental study where valley oak seedlings that were chemically demethylated suggests a potential mechanistic link between DNA methylation and plant height in valley oaks [31]. When DNA methylation was reduced in all sequence contexts, seedlings showed a~50% reduction in new growth compared to controls [31]. Furthermore, associations between DNA methylation and plant growth have been demonstrated in other systems, including Eucalyptus [34,77,78], suggesting that plant growth may be a phenotypic trait particularly sensitive to variation in DNA methylation. At the same time, plant growth is also a fitness component which may reflect the consequences of traits on plant performance rather than a direct effect on polygenic traits associated with growth.
Leaf lobedness, a trait associated with leaf hydraulic resistance [58], showed strong associations with variation in both SNPs and mCG methylation, but no association with mCHG or mCHH methylation. Leaf lobedness has been shown to vary across natural populations spanning the valley oak range [79] and to be positively associated with photosynthetic rates [80], and thus might be a target of divergent selection. Our results support the idea that leaf lobedness is a heritable trait where variation in SNPs explain a significant amount of variation. In addition, mCG methylation by itself explained variation in leaf lobedness, though this amount was reduced when also controlling for variation in SNPs. Given that mCG methylation itself can be heritable, it seems likely that the association between mCG methylation and leaf lobedness is driven by an underlying correlation between variation in mCG methylation and SNPs. Because most CG methylation is in gene bodies and it has been shown that gene body methylation may not directly influence gene expression, the reason for this association will need more investigation. Indeed, the underlying causal mechanism driving phenotypic variation is likely to be more associated with the genomic background rather than mCG methylation itself. Similarly, trichome density also varies across populations in the valley oak range [79] and showed strong associations with variation in SNPs and all three methylation sequence contexts. The associations with sequence contexts were substantially reduced after controlling for genetic variation, suggesting that variation in SNPs is likely to be the underlying driver of variation in trichome density. However, further studies, potentially using demethylating chemical agents to experimentally manipulate methylation levels [81], are needed to conclusively determine causal pathways and disentangle the influence of variation in DNA methylation levels from potential confounding effects of genomic background.
Given the complexities in the way that DNA methylation is associated with phenotypic variation depending on many factors, such as context of the methylation [7], TE insertions impacts (e.g., [45]), and variation in DNA methylation homeostasis (e.g., [16]), we acknowledge the limitations in this study. Once candidate phenotypes are identified, future experimental work would be necessary to identify the mechanisms underlying the patterns reported here.
In general, studies associating DNA methylation with phenotypic traits in non-model plant systems are rare, and the sample size of 160 individuals achieved in this study are high for current standards of ecological epigenomic studies. However, such a sample size is relatively low for human studies and studies in model plant systems [82]. As a consequence, our study was limited by low statistical power and high uncertainty in estimates of percent variance explained in phenotypic traits. Because of this limitation, we advise caution in interpreting the values of percent variance explained as exact estimates and recognize that the lack of a significant association between SNPs, DNA methylation, and a phenotypic trait could arise from low statistical power rather than a lack of a true association. In addition, detecting associations between polygenic traits and variation in SNPs or DNA methylation can be difficult, relying on adequate coverage of markers across the genome and the sequenced markers being in linkage disequilibrium with causal variants or potentially casual variants themselves [83]. From our simulation analysis (Figure 3, Figure S2), samples sizes on the order of 500-1000 would be needed to obtain~75-100% statistical power depending on the strength of the association between DNA methylation and phenotypic trait. Achieving the sample sizes necessary to obtain sufficient statistical power and reduced uncertainty will become more feasible as epigenomic sequencing techniques continue to become more accessible in ecology and evolutionary biology research [4,70].
In summary, this study provides compelling evidence that DNA methylation can predict variation in ecologically relevant phenotypic traits, such as plant height, leaf lobedness, disease susceptibility, and trichome density in valley oak. Future studies that combine whole-genome bisulfite sequencing, gene expression analysis, TE insertion polymorphisms and phenotypic trait associations will provide a stronger and more detailed understanding of the potential mechanistic links between methylation at particular genomic regions and phenotypic traits.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/f12050569/s1, Figure S1: Results of PCA for SNP and DNA methylation data, Figure S2: Proportion of variance explained by simulated phenotypic and DNA methylation data.  Mix, A. Lentz, A. Albarran-Laras, P. Gugger, L. Crane, and many others for their help in establishing and maintaining the valley oak provenance trial. We thank E. Eskin, B. Pasaniuc, S. Sankararaman, and members of the Sork lab for their helpful comments and discussion. Genomic studies were supported by NSF PGRP grant (IOS-1444661) to V.L.S. Provenance trials were supported by the US Department of Agriculture Forest Service Pacific Southwest Research Station. Any use of product names is for informational purposes only and does not imply endorsement by the US Government.

Conflicts of Interest:
The authors declare no conflict of interest.