The D165H Polymorphism of QiMYB-like-1 Is Linked to Interactions between Tannin Accumulation, Herbivory and Biogeographical Determinants of Quercus ilex

The accumulation in the leaves and young stems of phenolic compounds, such as hydrolyzable and condensed tannins, constitutes a defense mechanism of plants against herbivores. Among other stressing factors, chronic herbivory endangers Quercus ilex, a tree playing a central role in Mediterranean forests. This work addressed the connections between the chemical defenses of Q. ilex leaves and their susceptibility to herbivory, quantitative traits whose relationships are modulated by environmental and genetic factors that could be useful as molecular markers for the selection of plants with improved fitness. A search for natural variants detected the polymorphism D165H in the effector domain of QiMYB-like-1, a TT2-like transcription factor whose family includes members that control the late steps of condensed tannins biosynthesis in different plant species. QiMYB-like-1 D165H polymorphism was screened by PCR-RFLP in trees from six national parks in Spain where Q. ilex has a relevant presence, revealing that, unlike most regions that match the Hardy-Weinberg equilibrium, homozygous plants are over-represented in “Monfragüe” and “Cabañeros”, among the best examples to represent the continental Mediterranean (cM) ecosystem. Accordingly, the averages of two stress-related quantitative traits measured in leaves, herbivory index and accumulation of condensed tannins, showed asymmetric distributions depending on the clustering of trees based on ecological and genetic factors. Thus, the impact of herbivory was greater in managed forests with a low density of trees from the cM region, among which QiMYB-like-1 D165 homozygotes stand out, whereas condensed tannins accumulation was higher in leaves of QiMYB-like-1 H165 homozygotes from low-density forests, mainly in the Pyrenean (Py) region. Besides, the correlation between the contents of condensed tannins and total tannins vanished after clustering by the same factors: the cM region singularity, forest tree density, and QiMYB-like-1 genotype, among which homozygous shared the lowest link. The biogeographical and genetic constraints that modulate the contribution of condensed tannins to chemical defenses also mediated their interactions with the herbivory index, which was found positively correlated with total phenolics or tannins, suggesting an induction signal by this biotic stress. In contrast, a negative correlation was observed with condensed tannins after tree clustering by genetics factors where associations between tannins were lost. Therefore, condensed tannins might protect Q. ilex from defoliation in parks belonging to the cM ecosystem and carrying genetic factor(s) linked to the QiMYB-like-1 D165H polymorphism.


Introduction
Seven national parks in the Iberian Peninsula share a significant presence of Quercus ilex, with a continuous distribution at the Center-South and with a more scattered distribution in the North ( [1,2], Figure S1). Holm oaks are suffering oak decline, a severe disease that endangers many Quercus species [3]. The selection of Q. ilex trees resistant to biotic stress is critical to improving the sustainability of the Mediterranean forest.
Tannins are a broad spectrum of phenolic compounds that play a role as chemical defenses against biotic stress in plants [4]. These molecules are accumulated in the leaves of Quercus species to protect themselves from herbivore damage by inhibition of digestive proteases [5]. Two classes of tannins are synthesized in plants: hydrolyzable tannins that are derived from the common precursor 3-dehydroshikimate, and proanthocyanidins or condensed tannins (CT), which are flavonoid molecules produced from phenylpropanoids [6]. Gene expression for late enzymes that synthesize condensed tannins is regulated by a tripartite complex of transcription factors including an R2-R3 MYB factor, a basic helix-loophelix protein, and a WD40 protein, respectively named TT2, TT8, and TTG1, evoking the transparent testa phenotypes conferred by mutations in the flavonoid pathway of Arabidopsis thaliana [6]. Depending on plant species, between 100-200 paralogs encode the R2-R3 type MYB-family, whose members regulate development processes and the responses to biotic and abiotic stresses by controlling multiple steps of metabolic processes, mainly the pathway for flavonoid biosynthesis but also to lignin and other cell wall components [7]. Accordingly, QsMYB1, whose transcript was detected in the external bark of Quercus suber [8] and accumulates specifically under cork-producing conditions [9], including heat and drought stresses and plant recovery [10], binds to promoter regions of genes involved in lignin and suberin biosynthesis [11].
Biotic stress by herbivory has been shown to produce signals that regulate tannin biosynthesis [12] and, reciprocally, defoliation by insects was found to be modulated by the genetic background and positively correlated with phenolic content of leaves from Quercus robur, indicating a compensatory feeding response [13]. Furthermore, defoliation of Q. ilex increases the gene expression of its group III shikimate dehydrogenase isoenzyme (SDH2) [14], the role of which has been assigned to the synthesis of gallic acid from 3-dehydroshikimate in Vitis vinifera [15]. Therefore, the only known enzyme involved in the biosynthesis of hydrolyzable tannins in plants could be induced by the attack of herbivores, although, nevertheless, the simultaneous infection by the oomycete that causes the decline of holm oak in Mediterranean forests, Phytophthora cinnamomi, blocks this response, probably worsening plant health when subjected to this double biotic stress [14].
The oak genome database makes available the chromosomal and coding sequences of Q. robur [16] and provides information to analyze genes of closely related species such as Q. ilex, which sequence has been recently described by preprint although not yet been made available (doi.org/10.1101/2022.10.09.511480, accessed on 6 December 2022). This work addressed the identification of TT2-like protein polymorphism in Q. ilex, its geographic distribution pattern, and possible association with phenolics traits and/or herbivore susceptibility.

Geographical Distribution of the QiMYB-like-1 Genotypes
The occurrence of genotypes carrying QiMYB-like-1 alleles was analyzed in holm oaks from six national parks and found in Hardy-Weinberg equilibrium in all but one case ( Table 2). A higher ratio of homozygous plants than expected in trees from Monfragüe National Park promoted the detected deviation in the HW equilibrium. Although Cabañeros was not statistically significant, a similar tendency against heterozygous in this  Table 1. (C) Resolution in agarose gel (1.5%) electrophoresis (0.5 × TBE) of MboI fragments from QIMYB-like-1 specific PCR. Yellow, DNA polymorphisms; red/pink, protein polymorphisms; red, polymorphism detected by RFLP.

Geographical Distribution of the QiMYB-like-1 Genotypes
The occurrence of genotypes carrying QiMYB-like-1 alleles was analyzed in holm oaks from six national parks and found in Hardy-Weinberg equilibrium in all but one case ( Table 2). A higher ratio of homozygous plants than expected in trees from Monfragüe National Park promoted the detected deviation in the HW equilibrium. Although Cabañeros was not statistically significant, a similar tendency against heterozygous in this locus was found in the other region representing the cM biogeographical context, a common ecosystem widely spread in the southern half of the Iberian Peninsula [1,2].

Non-Random Distribution of Quantitative Traits Related to Chemical Defenses Depends on Biogeographic, Ecologic, or Genetic Factors
The leaves of Q. ilex sampled in the six national parks accumulated significant amounts of the three families of molecules determined as representatives of holm oak chemical defenses: TP, TT, and CT ( Figure 3). The comparison of TP averages among national parks evidences similar values, whereas the TT contents of the trees in the Pyrenean parks were the lowest and CT accumulation presents a decreasing gradient from northern to southern regions. Besides, the highest susceptibility to herbivores was found in Cabañeros, which presented in turn one of the lowest CT contents. When contrasted according to the regional singularity Py vs cM, their differences in susceptibility to defoliation and between their contents in CT and TT are maximized ( Figure 3). On the other hand, none of the averages of the quantitative variables differed according to the density of the forest or the distribution in the genotypes defined for QiMYB-like-1.  The significance of the differences was statistically assessed by using a two-tailed alternative t-test. *** p < 0.001 (confidence level 99.9%).
Regional (cM vs. Py), habitat type (LDL vs. HDL), and genetic (QiMYB-like-1 D165H polymorphism) interactions depending on the distribution of quantitative traits related to biotic stress were only evident for CT content and herbivory index, while TT content only showed highly significant variation (p < 0.001) with respect to regional grouping (Table 3). Thus, we observed that herbivory susceptibility varied by region (p < 0.001) and its interaction with genotype (G × R, p < 0.027)) and with density (R × D, p < 0.001), although not by all three simultaneously. Similarly, the CT content shows significant differences at the regional (p < 0.001), ecological (p < 0.003), and genetic (p < 0.033) levels, individually and according to the GxD interaction (p < 0.028), although also not for all the three variables simultaneously (Table 3). Furthermore, QiMYB-like-1 H165 homozygous (HH) trees grown in low-density forests accumulate higher amounts of CT, while plants from the cM region are more prone to attack by herbivores when located in low-density forests or sharing the QiMYB-like-1 homozygous D165 genotype (DD; Figure 4). Table 3. Interaction effects (two-way ANOVA) of genotype, region, and forest density on chemical defense production and herbivore susceptibility.   Figure 3). The bar graph represents the mean and standard error for a confidence interval of 95%. The dashed line indicates the mean value of the measured samples. Confidence levels were assessed by two-way ANOVA: * 95%, ** 99%, *** 99.9%.

Susceptibility to Defoliation and/or Chemical Defenses Depends on Biogeographic, Ecological, or Genetic Factors
Accumulation of the three quantitative traits indicative of the phenolic defenses analyzed in this work (TP, TT, and CT) were found to be correlated with each other in populations of Q. ilex from the Iberian Peninsula, in agreement with the overlap between these families of molecules (Table 4). Best fits were found within every National Park, where regression coefficients reached above or near 0.8 for TP and TT, although slopes between TP and CT decreased to 0.4-0.5 from northern to southern parks. In contrast, TT and CT contents were the less correlated chemical defenses suggesting that CT contribution to total tannins and/or phenolics is more variable in holm oaks when latitude decreases in the Iberian Peninsula (Table 4) [1,2]. Moreover, after classifying the plants according to the three singularities defined as independent variables for this study, the contribution of CT loses the link with TP or TT accumulations in the cM region. Similarly, QiMYB-like-1 homozygous plants had TT and TP or CT and TP contents more closely correlated with each other than CT and TT, whose association was completely lost after grouping by biogeographical criteria, as also occurs with classification based on forest density (Table 4).  Figure 3). The bar graph represents the mean and standard error for a confidence interval of 95%. The dashed line indicates the mean value of the measured samples. Confidence levels were assessed by two-way ANOVA: * 95%, ** 99%, *** 99.9%.

Susceptibility to Defoliation and/or Chemical Defenses Depends on Biogeographic, Ecological, or Genetic Factors
Accumulation of the three quantitative traits indicative of the phenolic defenses analyzed in this work (TP, TT, and CT) were found to be correlated with each other in populations of Q. ilex from the Iberian Peninsula, in agreement with the overlap between these families of molecules (Table 4). Best fits were found within every National Park, where regression coefficients reached above or near 0.8 for TP and TT, although slopes between TP and CT decreased to 0.4-0.5 from northern to southern parks. In contrast, TT and CT contents were the less correlated chemical defenses suggesting that CT contribution to total tannins and/or phenolics is more variable in holm oaks when latitude decreases in the Iberian Peninsula (Table 4) [1,2]. Moreover, after classifying the plants according to the three singularities defined as independent variables for this study, the contribution of CT loses the link with TP or TT accumulations in the cM region. Similarly, QiMYB-like-1 homozygous plants had TT and TP or CT and TP contents more closely correlated with each other than CT and TT, whose association was completely lost after grouping by biogeographical criteria, as also occurs with classification based on forest density (Table 4). On the other hand, the associations between chemical defenses and the defoliation index presented a variable sign, being mostly positive for TP and TT, and negative for CT, although the latter relationship is strongly dependent on the cM singularity (Table 5), the same factor that dissociates the CT and TT contents (Table 4). Again, regional singularity dominates over ecological or genetic variables, since negative correlations between herbivory index and CT content are only found for QiMYB-like-1 homozygous or plants grown on low-density forests in the cM national parks, where these tannin molecules seem to protect leaves from insect attack.

Discussion
Leaves from Q. ilex trees grown in six national parks of the Iberian Peninsula [1,2] have been analyzed in search of possible links among genetic, ecologic, and biogeographical determinants of biotic stress. Results shown in this work evidence that these factors might modulate interactions between herbivore susceptibility and the accumulation of chemical defenses, represented in holm oaks by phenolic compounds, mainly tannins and particularly condensed tannins [17].
Genomic sequences encoding four TT2-like polypeptides, the R2R3-MYB component of the MBW tripartite complex in A. thaliana [6], were detected in a preliminary version of the oak genome database (Figure 1) [16]. The four R2R3-MYB proteins presented polymorphisms between the two Quercus species, Q. robur and Q. ilex, with most variants located in the C-terminus, a poorly conserved and acidic-rich domain with an effector function to regulate mRNA expression from the target gene (s), whereas the two predicted DNA binding motifs (HTH1 and HTH2) at the N-terminus are less prone to inter-species evolution ( Figure 1). D165H, a polymorphism detected in the effector domain of QiMYBlike-1, presents a biased distribution of genotypes in the continental-Mediterranean (cM) region (Table 2), a biogeographical context dominated by Mediterranean forests that share almost all environmental factors (latitude, altitude, temperature, and raining), suggesting a genetic selection by biogeographical constraint (s). After this clustering, the bias from HW equilibrium reaches a maximum only for cM, where a skewed distribution of the QiMYB-like-1 polymorphism adds genetic support to its singularity.
The two quantitative characters that represent chemical defenses in the leaves of Q. ilex, TT, and CT, as well as the defoliation index that is directly related to biotic stress, are differentially biased towards the biogeographic singularities cM and Py (Figure 3): defoliation and TT content are lower and CT accumulation is higher in Py than in cM national parks. The two other independent variables utilized in this work, the ecological constraints involving forest density, and the genetic factor(s) linked to QiMYB-like-1 polymorphism, do not affect the distributions of averages, although ANOVA detects interactions among them on susceptibility to defoliation and CT contents (Table 3), two out of the three quantitative traits related to biological stress whose averages presented asymmetric distributions ( Figure 3). However, different QiMYB-like-1 genotypes are associated with both, defoliation and CT accumulation (Figure 4), explaining why the three-way ANOVA for the defoliation index is just above the significance threshold based on biogeographic, forest density, and QiMYB-like-1 polymorphism singularities (Table 3). Interestingly, QiMYB-like-1 homozygotes are the overrepresented genotypes in the cM biogeographical singularity, where DD plants have the lowest defoliation rate whilst the HH genotype shares the highest CT content in low-density forests, which suggests that at least one genetic determinant for each of these biotic stress indices might be linked to the QiMYB-like-1 locus.
The analysis of the regression fits between plant parameters related to biotic stress complements the comparison of their averages, confirming the impact of biogeographical, ecological, and genetic singularities on the quantitative characters of Q. ilex. Thus, all chemical defenses were found positively correlated among themselves, although clustering by biogeographic singularities clearly breaks the associations between CT and TT in the cM region, probably indicating that contribution of the former is distinct and more variable in this environment (Table 4). This effect dominates over ecological or genetic singularities since classification by forest density or QiMYB-like-1 genotype has little or no effect unless the plants considered are from the cM region. The correlation analysis between the herbivory index and chemical defenses showed an asymmetric distribution of the averages of these variables, although additional associations were detected, like the positive link between TP and defoliation rates under almost any condition used for plant classification (Table 5). Positive correlations were also found between TT and defoliation susceptibility when the total of plants is considered, although it is lost when we grouped according to the biogeographic region except for Py, LDF, and DH heterozygotes. Although the upregulation of one trait by the other cannot be excluded, this phenomenon could also be explained if both the TP and TT chemical defenses and the rate of herbivory responded to the same environmental signals. In contrast, defoliation rates and CT contents in the cM singularity were found negatively correlated (Table 5), consistently with the biased distributions of the same variables ( Figure 4) and indicating a protective effect of CT in plants associated with biogeographic, ecological, and genetic determinants.

Plant Material
Quercus ilex subsp. rotundifolia was sampled from six national parks in the Iberian Peninsula with a significant presence of holm oaks [1,2]; Figure S1. Ordered from highest to lowest latitude, these parks were: Aigüestortes, Cabañeros, Guadarrama, Monfragüe, Ordesa, and Sierra Nevada. Among them, four parks represent the cM and Py geo-climatic areas mentioned in this work. Cabañeros and Monfragüe national parks (continental-Mediterranean, cM) are characterized by mild, short winters and long and very hot summers, with altitudes over 500 m, 16 • C average annual temperature, and irregular rainfall around 500 mm. In contrast, Aigües Tortes and Ordesa (Pyrenean, Py) present cold and long winters and mild summers, altitude over 1300 m, 8 • C average temperature, and over 1000 mm rainfall. The two other national parks screened, Guadarrama and Sierra Nevada, present mixed conditions. Sierra Nevada is a high mountain park, from 900 to 3400 m, with the highest peaks in the Iberian Peninsula and an average temperature of 4 • C. On the contrary, the proximity to the Mediterranean Sea provides strong insolation and relative drought, with around 700 mm of rainfall. Guadarrama, also a high mountain (from 900 m), has a less Mediterranean influence, with very cold winters and mild summers, average temperatures below 7 • C, and rainfall of 1200 mm.

Experimental Design, Treatments, and Tissue Processing
The healthy leaves (dark green color, regardless of traces of insect attack) were collected in June 2015, when their development is finished and stabilized, taking them from branches located around 2 m above the ground, avoiding those that are excessively shaded and/or accessible to herbivorous mammals. From the six national parks, two habitats were screened ( Figure S2). First, the "dehesa" or low-density forest (LDF, hereafter in this work) corresponds to the Mediterranean agroforestry system characterized by scattered oak trees with undergrowth grazed by livestock, whereas the second habitat is represented by dense or closed forest formations of oaks and undergrowth cover in a high proportion (>40 percent) of the ground (high-density forest, HDF). An average of 100 leaves were collected for each of the 360 trees sampled according to the following scheme: six national parks, two habitats per park, three locations per habitat, and 10 adult plants were selected to represent the entire range of ages per location. Leaves were stored at −80 • C until processed for analysis of their defoliation index (fraction of defoliated leaves per tree), accumulation of chemical defenses, and DNA polymorphism. Defoliation was visually estimated in all leaves collected, and the percentage of leaves with any damage compatible with herbivory by insects per every tree was registered.

Chemical Analyses of Phenolics
Frozen leaf tissues were hand-ground using a mortar and pestle in liquid nitrogen. Fifty milligrams of milled tissue were extracted with 250 µL 50% (v/v) aqueous methanol for 60 min in an ultrasonic bath. The crude extracts were centrifuged at 10,000 rpm for 5 min at 4 • C and the supernatant was collected and stored at −80 • C.
Total phenolic (TP) content was determined by the Folin-Ciocalteu method according to [18] and following the modifications by [14]. Briefly, 200 µL of appropriately diluted extract (1:50) were mixed with 1 mL of 10% Folin-Ciocalteu reagent (Sigma-Aldrich, Madrid, Spain) followed by 800 µL of 7.5% (w/v) Na 2 CO 3 . After 40 min incubation in the dark at room temperature, absorbance was measured at 725 nm. Gallic acid (Sigma-Aldrich, Madrid, Spain) was the reference standard, and the results were expressed as mg of gallic acid equivalents (G.A.E.) per g dry leaf material.
Total tannin (TT) content was determined by the radial diffusion method in the presence of BSA [17] and following the modifications by [19]. Gel Petri dishes were prepared using a solution of 50 mM acetic acid and 60 µM ascorbic acid and adjusting to pH 5. One percent agarose (Sigma-Aldrich, Madrid, Spain) was added to the solution and boiled (heated) until homogenized. After cooling to 42 • C, 0.1% bovine serum albumin (Sigma Aldrich, Madrid, Spain) was added and stirred until it dissolves. Uniform wells were made on plates containing the same volume of solidified agarose and constant volumes (10 µL) of samples were loaded, incubated at 30 • C for 120 h, and the sizes of the halos of precipitation were measured and transformed into TT by means of extrapolation in the corresponding calibration, which was performed with tannic acid (Sigma Aldrich, Madrid, Spain) and results were expressed as mg of tannic acid equivalents (T.A.E.) per g of dry leaf material.
Condensed tannins (CT) concentration was measured according to the method described by [20] and following the modifications by [14]. Crude extracts were mixed thoroughly with 100 volumes of n-butanol/acetone 1:1 (46% each) plus HCL (1.85%) and ferric ammonium sulfate (4%). The mixture was heated at 70 • C for 45 min and, after being cooled, the absorbance at 550 nm was measured. A calibration was performed using procyanidin B2 as reference (Sigma-Aldrich, Madrid, Spain), and results were expressed as mg of procyanidin B equivalents (PB.E.) per g dry leaf material.

Purification of Nucleic Acids
Frozen leaf tissues were hand-ground using a mortar and pestle in liquid nitrogen. DNA was isolated from 100 mg of tissue using an i-genomic Plant DNA Extraction Mini kit (Intron Biotechnology, Gyeonggi, Korea), according to the manufacturer's protocol. Nucleic acids were analyzed and quantified by spectrophotometry (260 nm/280 nm), and the integrity of the preparations was further evaluated by agarose gel electrophoresis.

PCR and RFLP Analyses
Four TT2-like coding sequences were amplified from Q. ilex by using specific primers (Table 1) in a Prime G Thermal Cycler (Techne, supplied by Thermo Fisher Scientific, Waltham, MA, USA). The amplified PCR products were used for MboI (Invitrogen, supplied by Thermo Fisher Scientific, massachusetts, USA) digestion, which was performed using a 20 µL mixture containing 18 µL purified PCR product (MEGAquick-spin kit, Intron Biotechnology, Gyeonggi, Korea) and one unit of restriction endonuclease in the appropriate reaction buffer. The incubation temperature was set according to the manufacturer's instructions for the restriction endonuclease. The resulting restriction fragments were separated on 1.2% agarose gels stained with SYBR-safe DNA gel stain (Invitrogen, supplied by Thermo Fisher Scientific, Waltham, MA, USA). The genotyping of plants by PCR-RFLP was performed by comparison of fragment sizes with the molecular weight marker Gene Ruler DNA Ladder 100 bp Plus (Thermo Fisher Scientific, Waltham, MA, USA) loaded in each agarose gel.

Statistical Analysis
The effects of the biogeographical region (cM vs. Py), forest type (LDF vs. HDF), and genotype on the quantitative traits used to measure biological stress were analyzed through analysis of variance (two-way ANOVA). The region, forest type, and genotype were used as random factors whereas the content of total phenolics (TP), content of total tannins (TT), content of condensed tannins (CT), and defoliation (in%) were dependent variables. The relationships among the 4 quantitative traits representing plant chemical defenses (TP, TT, CT) and susceptibility to herbivory (defoliation) were analyzed by using linear Pearson correlation.

Conclusions
One main finding of this work is that distinct phenolic compounds that accumulated in the leaves of Q. ilex might interact differentially on susceptibility to insect herbivory. Thus, TP and TT do not seem to play significant protection against this biotic stress, since the three traits are parallelly induced. However, the negative correlation of CT content and susceptibility to herbivory suggests that, in the cM biogeographic region that best represents the agroforestry ecosystem known as "dehesa", the trees that grow in lowdensity forests (LDF) and are QiMYB-like-1 D165 or H165 homozygotes would present a higher content of CT and a lower susceptibility to herbivory. This fact could explain the anomalously high frequency found of these genotypes in the best preserved "dehesas" in Spain and could be useful to improve the management of their oak forests. Funding: This study was funded by the Spanish Ministry of Agriculture, Food and Environment (Ministerio de Agricultura, Alimentación y Medio Ambiente, Organismo Autónomo Parques Nacionales), project 956, "Determinants of biotic resistance in a model tree species: a new tool for adaptive management in national parks".
Institutional Review Board Statement: Not applicable.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.