Genome-Wide Association Study for Resistance to Tan Spot in Synthetic Hexaploid Wheat

Synthetic hexaploid wheat (SHW) has shown effective resistance to a diversity of diseases and insects, including tan spot, an important foliar disease caused by Pyrenophora tritici-repentis that can attack all types of wheat and several grasses. In this study, 443 SHW plants were evaluated for their resistance to tan spot under controlled environmental conditions. Additionally, a genome-wide association study was conducted by genotyping all entries with the DArTSeq technology to identify marker-trait associations for tan spot resistance. Of the 443 SHW plants, 233 showed resistant and 183 moderately resistant reactions, and only 27 were moderately susceptible or susceptible to tan spot. Durum wheat (DW) parents of the SHW showed moderately susceptible to susceptible reactions. A total of 30 significant marker-trait associations were found on chromosomes 1B (4 markers), 1D (1 marker), 2A (1 marker), 2D (2 markers), 3A (4 markers), 3D (3 markers), 4B (1 marker), 5A (4 markers), 6A (6 markers), 6B (1 marker) and 7D (3 markers). Increased resistance in the SHW in comparison to the DW parents, along with the significant association of resistance with the A and B genomes, supported the concept of an activating epistatic interaction across the three wheat genomes. Candidate genes coding for F-box and cytochrome P450 proteins, which play significant roles in biotic stress resistance, were identified for the significant markers. The identified resistant SHW lines can be deployed in wheat breeding for tan spot resistance.


Introduction
Diseases are major threats that significantly reduce yield when crops are grown under disease-favoring conditions. Wheat foliar diseases have gained importance in recent years due to factors such as the adoption of conservation agriculture practices, the commercial cultivation of susceptible varieties, and the high evolutionary dynamics of the causal pathogens [1]. Furthermore, climate change often results in severe disease epidemics that significantly limit grain yield and quality in wheat [2]. About 12-14% of global wheat production is lost each year due to diseases [3]. The causative agents of these diseases, mainly fungal pathogens, infect multiple wheat tissues such as root, stem, leaf, spike, and grain. Based on the frequency and severity of disease epidemics, the diseases that infect the leaf and the spike/grain are considered of greater importance. In this sense, many researchers agree that "stripe rust" caused by Puccinia striiformis f. sp. tritici; "tan spot" by Pyrenophora tritici-repentis (Died.) Drechs. (anamorph Drechslera tritici-repentis (Died.) Shoem.); "Septoria nodorum blotch" by Parastagonospora nodorum (syn. ana. Stagonospora; teleo. Phaeosphaeria) (Berk.) Quaedvlieg, Verkley & Crous, and "Septoria tritici [...]
The current GWAS was conducted on a diverse panel of 443 SHW plants in order to (1) evaluate their resistance to tan spot under controlled environmental conditions and (2) identify possible new genomic regions for tan spot resistance.

Resistance to Tan Spot at the Seedling Stage
Uniform and consistent tan spot development was observed during seedling evaluation in the greenhouse. Analysis of variance (ANOVA) showed significant differences among SHW plants (p < 0.001) in their reaction to tan spot. The checks Erik, Glenlea, 6B-662, and 6B-365 displayed scores of 1.0, 4.8, 2.5 and 3.4, respectively (Table 1), verifying the identity of P. tritici-repentis and successful inoculation.
Of the 40 DW parents, six (15%) had reaction scores of 1.0-1.5 (R) and 12 (30%) had reaction scores of 1.6-2.5 (MR), developing mostly small dark to maroon lesions on the leaves. Twenty-two entries (55%) had mean reaction scores between 2.6 and 4.3 and were considered MS to S, with large necrotic lesions with or without chlorosis (Table 1, Supplementary Table S1).

Genome-Wide Association Mapping under Different Reference Maps
Using the markers mapped on the 100K consensus map, the first two principal components (PCs) separated two clear groups of entries of similar sizes, with some entries in between, and explained around 34% of the total variability (Supplementary Figure S1). As described in Section 4, possible population structure was controlled by fitting the first three PCs from the correlation matrix as a fixed variate. In addition, the coefficient of parentage, used as a random variable when fitting the GWAS mixed linear model (MLM), effectively controlled the population structure remaining after fitting the first three PCs.
Thirteen markers aligned to the durum wheat cultivar Svevo and the Ae. tauschii reference genomes were significantly associated with tan spot resistance. These markers were located on chromosomes 1B (4), 2D (2), 3A (2), 4A (1), 5A (1), 6A (2) and 7D (1) (Table 4 and Figure 4). Only three markers from Table 4 coincided with the significant markers found in Tables 2 and 3. Marker 3026113, on chromosome 1B in Svevo, was significant on chromosome 1D when aligned to the physical map of CS. Similarly, marker 1125862, on chromosome 3A in Svevo, aligned to chromosome 3D in the physical map of CS (Table 3). Marker 16793126 aligned to chromosome 7D in both the Ae. tauschii and CS physical maps (Table 3). The highest allele substitution effects ranged from −0.20 to −0.27 and belonged to markers located on chromosomes 1B, 3A, 5A, and 6A.
Table 2. Significant markers associated with seedling resistance to tan spot detected with the consensus genetic maps. Allele ID, genetic position in centimorgans (cM), F statistic, probability (Prob), marker R², −log10 p-value and the effect of allele substitution are given for each marker.

Table 5 summarizes the 30 genomic regions identified with the different maps. Realignment of the sequences to the ABD, AB and D genomes verified the physical positions of several of the significant SNPs. Furthermore, 16 SNPs were found within annotated high-confidence gene sequences. Eight of these 16 possible candidate genes were annotated in the CS reference genome, four in Svevo and the remaining four in the Ae. tauschii reference genome (Supplementary Table S2).

Marker-Trait Associations and QTL for Tan Spot Resistance
The allele frequency correlations (R²) among the markers were used to estimate LD. Based on the physical positions of the observed marker-trait associations in the CS reference genome, three potential QTL were identified, one on each of chromosomes 3A, 5A and 6A. Of the four significant markers on chromosome 3A (marker IDs 1125872, 1668224, 1019955, and 1065211), the latter two were positioned at 474,447,292 bp and 474,447,226 bp, respectively, only 66 bp apart, with an R² of 0.89 and a significant LD p-value of 8.62E-16. The marker with ID 1668224, despite being located 5.9 Mb from these two, still had R² values of 0.87 and 0.89 and significant LD p-values of 6.54E-16 and 2.30E-16 with the two SNPs, respectively. Therefore, these three markers can be considered a single QTL for resistance to tan spot. Marker 1125872, however, was located far from the markers mentioned above and therefore likely represents an independent QTL.
Likewise, two markers on chromosome 5A (100034112 and 3064590) and four markers on chromosome 6A (1254459, 2266481, 100027398, and 1862737) were in LD within each chromosome and thus represented a single QTL per chromosome, whereas all the remaining SNPs identified in our study represented independent QTL, given their mutually unlinked physical positions.
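As an illustration of the LD reasoning above, the following minimal sketch computes R² as the squared correlation between the allele codes of two markers. It assumes homozygous lines coded 0/1 with `None` for missing calls; the function name `ld_r2` and the coding scheme are hypothetical choices for this example, not the procedure implemented in the software used in the study.

```python
from typing import Optional, Sequence


def ld_r2(marker_a: Sequence[Optional[int]], marker_b: Sequence[Optional[int]]) -> float:
    """Squared Pearson correlation between two markers coded 0/1 (None = missing)."""
    # Keep only lines genotyped at both markers.
    pairs = [(a, b) for a, b in zip(marker_a, marker_b) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no lines genotyped at both markers")
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("a monomorphic marker has no defined correlation")
    return (cov * cov) / (var_a * var_b)


# Physical distance between the two closely linked 3A markers reported above.
distance_bp = abs(474_447_292 - 474_447_226)
```

Markers with a high R² and a short physical distance, as in the chromosome 3A example, would be grouped into one QTL under this reasoning.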

Discussion
The development of genetically resistant wheat cultivars is an effective and environmentally friendly means of controlling diseases such as tan spot. In the following subsections, we discuss the findings of this GWAS in relation to previous studies.

Tan Spot Resistance in SHW
Modern bread wheat cultivars have only a few broad-spectrum sources of resistance to the major foliar spotting diseases, such as tan spot [25], and great efforts have been made in recent decades to identify and introduce new sources of resistance. Despite the number of studies published on wheat diseases, only a few included SHW. For example, Bhatta et al. [26] studied 125 SHW plants for their resistance to diseases and pests such as rust, crown rot, cereal cyst nematodes, and Hessian fly. To the best of our knowledge, no GWAS has so far been performed to evaluate SHW for tan spot resistance.
Our study indicates that SHW plants present considerable resistance to tan spot due to the diverse genetic backgrounds of these lines. The DW parents mostly showed MS and S reaction types, suggesting that the resistance in the SHW was derived either from Ae. tauschii or from a possible favorable epistatic interaction (activation) between the A/B- and D-genomes.

Significant Markers Found in the D-Genome Chromosomes
Our study found significant marker-trait associations for tan spot resistance on chromosomes 1D (marker ID 3026113), 2D (marker IDs 1217275, 1046621), 3D (marker IDs 987556, 1125862, 1217411), 4D (marker ID 4993454) and 7D (marker IDs 16793126, 991140, 993425). Thus, this is the first study to detect several significant genomic regions for tan spot resistance in the D-genome, in addition to the few loci reported previously. Phuke et al. [24] found a significant marker on chromosome 7D located at 550,216,751 bp in CS. The closest significant marker on chromosome 7D in this study (marker ID 993425) was positioned at 620,252,508 bp, which is physically distant and suggests that at least two of the three marker-trait associations on chromosome 7D in this study are novel. The physical position of the third marker, 991140, in CS could not be determined.
Tadesse et al. [27] studied resistance to tan spot in segregating F2:3-derived populations of SHW using simple sequence repeat (SSR) markers. The authors found that the loci tsn3a, tsn3b and tsn3c are all located in the vicinity of the marker Xgwm2a on chromosome 3D. The physical distance of this SSR marker to the SNP markers in our study was difficult to determine. Gurung et al. [17] performed GWAS in spring wheat landraces using DArT markers to identify chromosome regions associated with resistance to tan spot races 1 and 5. The authors found significant markers, among others, on chromosomes 1D and 7D associated with tan spot race 1 and in regions of chromosomes 2D and 7D associated with tan spot race 5. As with the study by Tadesse et al. [27], genomic regions could not be compared because different genotyping platforms were used.

Significant Markers Found at the A and B Genome Chromosomes
The present study found significant marker-trait associations on the A-genome chromosomes [...]
Phuke et al. [24] also found several marker-trait associations in the A- and B-genomes. The authors found a significant marker on chromosome 2A, but in a different position from the one found in this study. Their significant locus on chromosome 1B mapped to a physical position of 465,584,555 bp and was also distant from the markers on chromosome 1B in this study, located at 340,462,174 bp and 558,561,647 bp. Their significant marker on chromosome 6A was located at 596,903,177 bp and coincided with the QTL found in this study at 599,622,814 bp, 601,233,092 bp, 602,989,232 bp, and 602,745,555 bp, thus representing the same QTL. The marker located on chromosome 5A in Phuke et al. [24] mapped to the physical position of 597,291,565 bp, whereas the markers identified in this study as forming a QTL are located some distance away, at 454,770,615 bp, 471,723,681 bp, and 470,186,523 bp, thus likely representing a novel QTL.
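The position comparisons with Phuke et al. [24] reduce to simple distance arithmetic, sketched below. The sketch treats all coordinates quoted in the text as base-pair positions on the CS physical map, which is an assumption of this example; the helper names `nearest_distance_bp` and `to_mb` are hypothetical.

```python
from typing import Sequence


def nearest_distance_bp(reference_bp: int, positions_bp: Sequence[int]) -> int:
    """Distance in bp from a reference marker to the closest marker in a list."""
    return min(abs(pos - reference_bp) for pos in positions_bp)


def to_mb(distance_bp: int) -> float:
    """Convert a distance in base pairs to megabases."""
    return distance_bp / 1_000_000


# Reference positions are the markers quoted from Phuke et al. [24];
# the lists are the positions quoted for the markers of this study.
comparisons = {
    "1B": (465_584_555, [340_462_174, 558_561_647]),
    "5A": (597_291_565, [454_770_615, 471_723_681, 470_186_523]),
    "6A": (596_903_177, [599_622_814, 601_233_092, 602_989_232, 602_745_555]),
}

if __name__ == "__main__":
    for chromosome, (reference, positions) in comparisons.items():
        gap = nearest_distance_bp(reference, positions)
        print(f"{chromosome}: nearest marker is {to_mb(gap):.2f} Mb away")
```

Under these assumptions the nearest 6A marker lies within a few megabases of the previously reported one, whereas the 1B and 5A markers are tens of megabases or more away, which is the basis for calling the former the same QTL and the latter distinct.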
The study by Kokhmetova et al. [28] detected three significant loci on chromosome 1B within a range of 86.7-92.2 cM, not distant from marker ID 1089962 located at 83.6 cM in this study using the same 100K consensus map. Furthermore, the QTL on chromosome 6A was in proximity to the markers found by Kokhmetova et al. [28] on the same chromosome. Kalia et al. [29] performed bi-parental QTL mapping for resistance to tan spot race 1 in a population with an SHW parent and identified QTL only on the A-genome chromosomes 1A, 6A, and 7A. Because DArT markers were used in that study, the physical positions of the QTL were, once again, difficult to compare. Similarly, Chu et al. [30] identified QTL on chromosomes of the A- and B-genomes (2A, 5A and 5B) in a bi-parental mapping population with an SHW parent. The authors hypothesized that the expression of tan spot resistance genes in DW is suppressed (or diluted) but is activated when DW is crossed with Ae. tauschii, which could be due to inter-locus interaction (epistatic effects) between loci on the A/B- and D-genomes. In the current study, the increased resistance of the SHW in comparison to their direct DW parents supports this hypothesis.

Underlying Candidate Genes Based on Protein Annotation
Two markers, one on chromosome 5A (marker ID 3064590) positioned at 470,186,523 bp and the other on chromosome 6A (marker ID 1862737) at 599,622,814 bp, were of particular interest in this study as they were positioned within genes that code for disease resistance-related proteins, i.e., TraesCS5A02G254500/TRITD5Av1G155700 (F-box protein) and TraesCS6A02G378800/TRITD6Av1G217060 (cytochrome P450).
Candidate genes TraesCS5A02G254500/TRITD5Av1G155700 code for F-box proteins, which play a role in protein regulation and degradation and in plant photoperiodic and hormone signal transduction. A total of 1796 F-box proteins have been identified and classified in wheat [31], many of which have been related to biotic stresses, particularly to fungal pathogens. In addition, F-box proteins have been observed to affect plant metabolism and the regulation of plant enzymes involved in several diverse cellular processes [31]. F-box proteins have been found to act at different developmental stages in a wheat cultivar. The identification of candidate genes related to specific disease resistance should offer an opportunity to further elucidate the biological functions of F-box genes and proteins in wheat.
The cytochrome P450 (CYP) enzyme in plants is involved in the biosynthetic pathway of phytoalexins, which are synthesized by plants to deter hostile organisms [32]. CYP enzymes also play an important role in the metabolism of herbicides, a key factor in providing tolerance to some species and thus selectivity between crops and weeds. Plants encounter various biotic and abiotic factors at different stages of their growth and development, and the group of CYP enzymes is important in the synthesis of certain metabolites that play a fundamental part in the response to biotic stresses. CYP enzymes participate in the formation of numerous secondary metabolites that protect plants from biotic and abiotic stresses [33]. The mycotoxin deoxynivalenol (DON) is a virulence factor for the development of Fusarium head blight in wheat. A wheat cytochrome P450 subfamily found on chromosomes 3A, 3B and 3D of the wheat genome was activated in wheat spikelets in response to the mycotoxin DON [34].

Plant Material
A total of 443 SHW plants generated by the CIMMYT Wheat Wide Crosses Program over several years were evaluated (Supplementary Table S1). These SHW plants were selected from a group of 1,524 SHW plants for resistance to diseases such as Fusarium head blight, Septoria tritici blotch, and rusts, and for phenological traits such as plant height and days to heading. The SHW plants were derived from crosses involving 40 DW parents and 277 Ae. tauschii accessions, where the DW parents were used in 1 to 54 crosses and the Ae. tauschii accessions were used in 1 to 7 crosses (Supplementary Table S1).

Phenotypic Evaluation for Tan Spot
The disease screening for tan spot was carried out in a greenhouse at CIMMYT, El Batán, Mexico (19°31′ N, 98°50′ W, elevation 2249 m above sea level) in 2018-2019. In addition to the 443 SHW plants, the 40 DW parents were also evaluated, while the Ae. tauschii parents could not be screened due to their challenging phenology as a wild species. The SHW seeds were vernalized (7 days at 4 °C) to break dormancy and to obtain even germination. Greenhouse experiments were arranged in a randomized complete block design with 12 replicates for each of the SHW entries and eight replicates for the DW parents. Each entry in each experiment had four plants. Four entries with different levels of resistance were included as checks: Erik (resistant), Glenlea (susceptible), 6B-365 (moderately susceptible), and 6B-662 (moderately resistant). Plants were grown in plastic trays as experimental units to derive mean values for subsequent analysis. The seedlings were grown under controlled conditions at a temperature of 22-25/16-18 °C (day/night) and with a 16 h photoperiod.
For the induction of disease, the Mexican P. tritici-repentis isolate CIMFU 531-Ptr1 (race 1), well characterized by the CIMMYT Wheat Pathology Laboratory, was used. This isolate produces ToxA, based on inoculation experiments with differential genotypes, infiltration experiments, and PCR with the ToxA specific marker (data not shown). The isolate was grown on V8-PDA media [9], and the conidia concentration for inoculation was adjusted to 4 × 10³ spores mL⁻¹ using a Fuchs-Rosenthal counting chamber, with one drop of Tween 20 (a surfactant reagent) per 100 mL added to the spore suspension.
At the two-leaf stage, when the second leaf was fully expanded two weeks after sowing, the seedlings were inoculated with a conidial suspension of the CIMFU 531-Ptr1 isolate until runoff. Subsequently, the trays were moved to a mist chamber (RH 100%, 21-22 °C) to facilitate infection. After 24 h, the plants were transferred back to the greenhouse bench. Seedling response was evaluated seven days post inoculation following the 1-5 lesion rating scale developed by Lamari and Bernier [9]. The readings from the 12 and 8 inoculation experiments of the SHW plants and DW parents, respectively, were used to calculate the average seedling response, which was used for subsequent statistical analysis. The scale used for the tan spot reaction was based on continuous data given by the mean of the replicates: 1.0-1.5 = Resistant (R); 1.6-2.5 = Moderately Resistant (MR); 2.6-3.5 = Moderately Susceptible (MS); 3.6-5.0 = Susceptible (S).
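The reaction classes defined above can be expressed as a small lookup, sketched here. The function name `classify_reaction` is hypothetical, and rounding the mean to one decimal before classification is an assumption made so that every value falls into exactly one of the published intervals.

```python
def classify_reaction(mean_score: float) -> str:
    """Map a mean seedling score on the 1-5 lesion scale to a reaction class."""
    # Assumption: means are rounded to one decimal, matching the reported scores.
    score = round(mean_score, 1)
    if not 1.0 <= score <= 5.0:
        raise ValueError("mean score must lie between 1.0 and 5.0")
    if score <= 1.5:
        return "R"   # resistant
    if score <= 2.5:
        return "MR"  # moderately resistant
    if score <= 3.5:
        return "MS"  # moderately susceptible
    return "S"       # susceptible
```

Applied to the check scores reported in the Results (Erik 1.0, Glenlea 4.8, 6B-662 2.5, 6B-365 3.4), this mapping returns the classes the checks were chosen to represent.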

Plant Genotyping
Genomic DNA was extracted from 10-day-old seedlings of each SHW line using the modified cetyltrimethylammonium bromide (CTAB) method described in the CIMMYT laboratory protocols [35]. The DArTseq™ technology [36] was applied to all samples at the Genetic Analysis Service for Agriculture (SAGA) in CIMMYT, Mexico. DArTseq uses a complexity reduction method including two enzymes (PstI and HpaII) to create a genome representation of the samples. A PstI-RE site-specific adapter is then tagged with 96 different barcodes, enabling the multiplexing of a 96-well microtiter plate with equimolar amounts of amplification products to run on an Illumina NovaSeq 6000 sequencer (Illumina Inc., San Diego, CA, USA). The successfully amplified fragments were sequenced up to 83 bases.
A pipeline developed by DArT P/L was used to generate allele calls for SNP and SilicoDArT (presence/absence variation markers) [36]. A 100K consensus map [37] [40].
A total of 67,436 DArTSeq SNP markers were originally scored, of which about half (34,790) were aligned to the reference genomes. Filtering excluded SNPs with an allele frequency <0.05 and those with >20% missing data points. Finally, 5800 DArTSeq markers were retained and used for GWAS analysis. The allele substitution effects for the significant marker-trait associations were estimated from the mean phenotypic differences between alleles, under the assumption that one genotype has an effect equal to zero. Marker sequences were re-aligned (BLASTn) to the diverse reference sequences using the Ensembl Plants public website (https://plants.ensembl.org/, accessed on 2 February 2022) to verify the positions of the SNPs.
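The filtering thresholds and the allele-effect estimate described above can be sketched as follows. This is a minimal illustration that assumes homozygous lines coded 0/1 with `None` for missing calls and applies the frequency threshold to the rarer allele; the names `passes_filter` and `allele_effect` are hypothetical and do not correspond to the DArT or TASSEL pipelines.

```python
from statistics import mean
from typing import Optional, Sequence

Call = Optional[int]  # 0 or 1 for the two alleles, None for a missing call


def passes_filter(calls: Sequence[Call], min_freq: float = 0.05,
                  max_missing: float = 0.20) -> bool:
    """Return True if a marker survives the frequency and missing-data filters."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return False
    missing_rate = (len(calls) - len(observed)) / len(calls)
    if missing_rate > max_missing:
        return False
    freq = sum(observed) / len(observed)  # frequency of the allele coded 1
    return min(freq, 1 - freq) >= min_freq


def allele_effect(calls: Sequence[Call], phenotypes: Sequence[float]) -> float:
    """Mean phenotype of lines with allele 1 minus that of lines with allele 0."""
    carriers = [y for c, y in zip(calls, phenotypes) if c == 1]
    others = [y for c, y in zip(calls, phenotypes) if c == 0]
    return mean(carriers) - mean(others)
```

With disease scores as the phenotype, a negative effect means that lines carrying the allele coded 1 had lower mean scores, that is, greater resistance.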

Statistical Analysis and Genome-Wide Association Analysis
For the disease data, statistical analyses were performed using the Statistical Analysis System version 9.1 [41]. Analyses of variance (ANOVA) were conducted on the average reactions of the SHW, the DW parents and checks for tan spot. The Best Linear Unbiased Estimates (BLUE) were computed for each of the 443 SHW genotypes.
The BLUE for disease severity were used as input to conduct GWAS using the TASSEL (Trait Analysis by aSSociation, Evolution and Linkage) software ver. 5 [42]. We used the mixed linear model (MLM) [43] to simultaneously include the level of relatedness based on marker data and identity by descent (IBD) computed from the coefficient of parentage, which controls population structure. Additionally, population structure was controlled by fitting the first three principal components (PCs) from the kinship matrix as the fixed variate and the coefficient of parentage (COP) as the random variable. The false-discovery rate (FDR) was used to assess the significance of the p-values (<0.05). The allelic effects of the significant marker-trait associations were estimated as the difference between the mean values of lines with and without the favorable alleles, and were presented as box plots.
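The text does not state which FDR procedure was applied. As a hedged illustration only, the sketch below uses the Benjamini-Hochberg step-up rule, a common choice that is assumed here rather than taken from the study; the function name `benjamini_hochberg` is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Flag the tests declared significant by the Benjamini-Hochberg rule."""
    m = len(p_values)
    # Indices of the tests ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant
```

In a GWAS setting, `p_values` would hold one p-value per tested marker, and the returned flags mark the marker-trait associations retained at the chosen FDR level.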
The GWAS results from the MLM are presented as Manhattan plots, and the corresponding quantile-quantile (QQ) plots compare the quantiles of the empirical distribution of the results obtained in this study with those of the distribution expected theoretically if the null hypothesis is true.
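The comparison made by a QQ plot can be sketched as below: under the null hypothesis p-values are uniformly distributed, so the sorted observed values are paired with uniform quantiles on the −log10 scale. The plotting position rank / (n + 1) and the function name `qq_points` are assumptions of this example, not details given in the text.

```python
import math
from typing import List, Sequence, Tuple


def qq_points(p_values: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (expected, observed) -log10(p) pairs, most significant first."""
    n = len(p_values)
    ordered = sorted(p_values)  # smallest p-value first
    points = []
    for rank, p in enumerate(ordered, start=1):
        expected = -math.log10(rank / (n + 1))  # uniform quantile under the null
        observed = -math.log10(p)
        points.append((expected, observed))
    return points
```

Points lying along the diagonal indicate agreement with the null distribution, while an upward departure at the largest values is the pattern expected for true marker-trait associations.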

Conclusions
Our research identified new sources of resistance to tan spot in CIMMYT's SHW that can be used in wheat breeding via crosses and backcrosses with elite bread wheat lines. A total of 30 significant marker-trait associations were found on chromosomes 1B, 1D, 2A, 2D, 3A, 3D, 4B, 4D, 5A, 6A, 6B, and 7D, of which some SNP markers clustered and likely represent single QTL. Several marker-trait associations found in this study can contribute to the genetic diversity of resistance, specifically those on the D-genome contributed by Ae. tauschii, which were almost all novel, but also several on the A- and B-genomes. Furthermore, our study supports the previous concept of possible inter-locus effects caused by the activation of resistance genes in the DW genomes through interaction with the D-genome of Ae. tauschii after hybridization.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/plants11030433/s1, Figure S1: Principal component analysis of the synthetic wheat panel used in this study; Table S1: Seedling tan spot reaction scores of synthetic hexaploid wheat (SHW) lines and their durum wheat (DW) parents, Table S2

Data Availability Statement:
The original contributions presented in the study are publicly available in Supplementary Tables S1 and S2 and Figure S1.