Genome-Wide Association Study for Multiple Biotic Stress Resistance in Synthetic Hexaploid Wheat

Genetic resistance against biotic stress is a major goal in many wheat breeding programs. However, modern wheat cultivars have a limited genetic variation for disease and pest resistance and there is always a possibility of the evolution of new diseases and pests to overcome previously identified resistance genes. A total of 125 synthetic hexaploid wheats (SHWs; 2n = 6x = 42, AABBDD, Triticum aestivum L.) were characterized for resistance to fungal pathogens that cause wheat rusts (leaf; Puccinia triticina, stem; P. graminis f.sp. tritici, and stripe; P. striiformis f.sp. tritici) and crown rot (Fusarium spp.); cereal cyst nematode (Heterodera spp.); and Hessian fly (Mayetiola destructor). A wide range of genetic variation was observed among SHWs for multiple (two to five) biotic stresses and 17 SHWs that were resistant to more than two stresses. The genomic regions and potential candidate genes conferring resistance to these biotic stresses were identified from a genome-wide association study (GWAS). This GWAS study identified 124 significant marker-trait associations (MTAs) for multiple biotic stresses and 33 of these were found within genes. Furthermore, 16 of the 33 MTAs present within genes had annotations suggesting their potential role in disease resistance. These results will be valuable for pyramiding novel genes/genomic regions conferring resistance to multiple biotic stresses from SHWs into elite bread wheat cultivars and providing further insights on a wide range of stress resistance in wheat.


Introduction
Wheat (Triticum aestivum L.) is one of the most widely grown cereal crops with a production of more than 756 million tonnes in 2017-2018 [1] and it feeds more than one-third of the world's population [2]. Global population is increasing at an alarming rate and is estimated to reach 9.8 billion by 2050 [3]. To meet global food demands, cereal production needs to be increased by 50% by 2030 [4].

Phenotypic Distribution of Disease and Pest
A wide range of genetic variation for multiple biotic stresses was observed in 125 SHWs ( Figure 1 and Supplementary Table S1). As expected, seedling and adult plant resistance screening for susceptible checks showed a moderately susceptible to very susceptible response and resistant checks showed a resistant to very resistant response, indicating that the screening methods were effective (Supplementary Table S1). More than 50% of the lines had a resistant (11) to moderately resistant (64) reaction to Lr at the adult plant stage (Figure 1a). For Lr seedling resistance screening, we identified 109 lines as immune (no symptoms) and 16 lines as very resistant. Of the 11 resistant lines at the adult plant stage, ten lines showed an immune response and one line showed a very resistant response at the seedling stage. These results indicated that 11 lines may possess major Lr resistance genes. These data will provide useful information for pyramiding favorable alleles and multiple disease resistance genes from SHWs into elite bread wheat germplasm.

Phenotypic Distribution of Disease and Pest
A wide range of genetic variation for multiple biotic stresses was observed in 125 SHWs ( Figure  1 and Supplementary Table S1). As expected, seedling and adult plant resistance screening for susceptible checks showed a moderately susceptible to very susceptible response and resistant checks showed a resistant to very resistant response, indicating that the screening methods were effective (Supplementary Table S1). More than 50% of the lines had a resistant (11) to moderately resistant (64) reaction to Lr at the adult plant stage (Figure 1a). For Lr seedling resistance screening, we identified 109 lines as immune (no symptoms) and 16 lines as very resistant. Of the 11 resistant lines at the adult plant stage, ten lines showed an immune response and one line showed a very resistant response at the seedling stage. These results indicated that 11 lines may possess major Lr resistance genes. Approximately 60% of the lines (73 lines) were resistant (17 lines) to moderately resistant (56 lines) to stem rust at the adult plant stage (Figure 1b). Seedling resistance screening against Sr identified 51, 52, 13, seven, and two lines as immune, very resistant, resistant, moderately resistant, Approximately 60% of the lines (73 lines) were resistant (17 lines) to moderately resistant (56 lines) to stem rust at the adult plant stage (Figure 1b). Seedling resistance screening against Sr identified 51, 52, 13, seven, and two lines as immune, very resistant, resistant, moderately resistant, and moderately susceptible, respectively against race TPMK (Supplementary Table S1). These results indicated that most of the lines may have Sr genes Sr6, Sr9a, Sr9b, or Sr30 based on the TMPK race used. Of the 17 adult plant resistant lines, nine were immune, seven were resistant to very resistant, and one line was moderately resistant at the seedling stage, indicating that these lines may possess major Sr resistance genes.
The adult plant stage Yr screening identified 10 resistant lines (Figure 1c). Similarly, seedling resistance screening identified one very resistant, 106 resistant, and 18 moderately resistant lines against the mixture of three Pst races (Pstv-14, Pstv-37, and Pstv-40) (Supplementary Table S1). This indicates that most of these synthetic genotypes may have Yr genes Yr1, Yr5, Yr10, Yr15, Yr24, Yr17, Yr32, YrSP, or YrTye or other unknown genes as they were resistant against three races . Of the 10 adult plant resistant lines, seven lines were resistant, and three lines were moderately resistant at the seedling stage, indicating that these lines may possess major Yr resistance genes.
We have also screened SHW lines for CCN, Cr, and HF resistance. This study identified one resistant and 11 moderately resistant lines for CCN (Figure 1d and Supplementary Table S1), one resistant and 12 moderately resistant lines for Cr (Figure 1e), and 33 resistant and 22 moderately resistant lines for HF ( Figure 1f).

Genome-Wide Association Study
A GWAS identified a total of 124 marker-trait associations (MTAs) for Hessian fly infestation (%), field-based infection type (IT) for Yr, and field disease severity (SEV), IT, and coefficient of infection (COI) for Lr and Sr ( Figure 2 and Supplementary Table S2). The phenotypic variance explained (PVE) by these MTAs ranged from 0.3% to 27.4% (Supplementary Table S2). The MTAs identified in this study were year specific because of the significant influence of genotype × year interaction as can be seen by weak phenotypic correlation between years (r = 0.18; p = 0.06 to 0.42, p = 0.02). Also, we ran GWAS analysis for CCN and Cr, but we were unable to identify significant MTAs in our population; hence, the GWAS results for these traits will not be discussed further. Additionally, we did not run GWAS analysis on seedling rust resistance data because most of the genotypes were resistant and data were highly skewed (Supplementary Table S1).
At the adult plant stage for Lr, a total of 46 MTAs for SEV (12), IT (22), and COI (12) Table S2). To date, over 110 Lr resistance genes have been identified and they were distributed across all chromosomes in wheat except 5A and 6D [33]. Leaf rust resistance genes such as Lr21, Lr22, Lr32, Lr39, and Lr41 were derived from Ae. tauschii [35]. The identification of 15 MTAs on the D genome clearly represent variation from the Ae. tauschii and show the potential of SHW for its utilization in a marker-assisted breeding program after being validated in an independent genetic background. This study identified three co-located MTAs (S2D_13778898, S3A_721353764, S3B_4553494) and a haplotype block (perfect linkage disequilibrium with R 2 = 1) for Lr consisting of two SNPs (S2D_50105731 and S2D_50105753) of size 22 base pairs for COI and SEV on chromosomes 2D, 3A, and 3B (Supplementary Table S2). This was expected as these traits (SEV and COI) had a high correlation (ranging from 0.65 to 0.87, p < 0.001).  Table 2). To date, over 110 Lr resistance genes have been identified and they were distributed across all chromosomes in wheat except 5A and 6D [33]. Leaf rust resistance genes such as Lr21, Lr22, Lr32, Lr39, and Lr41 were derived from Ae. tauschii [35]. The identification of 15 MTAs on the D genome clearly represent variation from the Ae. tauschii and show the potential of SHW for its utilization in a marker-assisted breeding program after being validated in an independent genetic background. This study identified three co-located MTAs (S2D_13778898, S3A_721353764, S3B_4553494) and a haplotype block (perfect linkage disequilibrium with R 2 = 1) for Lr consisting of two SNPs (S2D_50105731 and S2D_50105753) of size 22 base pairs for COI and SEV on chromosomes 2D, 3A, and 3B (Supplementary Table S2). This was expected as these traits (SEV  Table S3). To date, over 86 Sr genes have been identified, which were distributed across all chromosomes in wheat [33]. Several MTAs (17) identified on the D genome showed the potential of SHWs for improving Sr resistance in modern wheat cultivars. Additionally, this study identified seven co-located MTAs (S2D_381953220, S4A_545611013, S5A_208152827, S5A_440734987, S6B_7943502, S6D_456325441, and S7B_7067875454) for COI and SEV on chromosomes 2D, 4A, 5A, 6B, 6D, and 7B) and one co-located MTA (S6D_16745888) for IT and SEV on chromosome 6D. Identification of co-located MTAs was supported by the high correlation among these traits (r > 0.64, p < 0.001).
Similarly, at the adult plant stage for Yr, a total of 11 MTAs for IT were identified on chromosomes 1A (1), 2A (2) Table S3). To date, over 83 Yr resistance genes have been reported, which were distributed across all chromosomes in wheat [33]. Most of the genes are race-specific and are often short-lived due to the frequent change in the pathogen population [32]. The MTAs identified in the current study showed the quantitative nature of the resistance to Yr in SHWs. Previous studies have identified MTAs for Yr on chromosomes 2A, 3B, 6A, and 7B in an association mapping panel of 181 SHWs [32]. The MTAs identified on the D genome (chromosomes 3D and 7D) showed the potential of SHWs for improving Yr resistance in modern wheat cultivars.

Genes Underlying Marker-Trait Associations and Their Functional Annotations
The functional annotation of genes underlying MTAs was available through IWGSC RefSeq v1.0 annotations. A total of 33 of 124 MTAs were found within genes and of these, 16 were present in seven genes whose functional annotations suggested their involvement in disease resistance in wheat based on previous literature (Table 1 and Supplementary Table S3). The identification of underlying genes with annotations related to the trait provides further reliability for the MTAs identified. For instance, four genes (TraesCS4D01G096900: Lr, TraesCS7A01G517300: Lr, TraesCS4B01G007000: Sr, and TraesCS3A01G495400: Lr) were annotated as the leucine-rich repeat protein family, which was previously reported to have a potential role in Yr [38,39] and Sr resistance [40]. Similarly, five genes (TraesCS2A01G076800: Yr, TraesCS7B01G473300: Yr, TraesCS1A01G325300: Sr, TraesCS6B01G394600: Yr, and TraesCS5B01G199700: Sr) were annotated as the F-box family protein, which were reported to have potential role in Lr [41], Sr [42], and Yr [43] resistance in wheat. These two classes of genes are interesting genes for future study in wheat as they were also reported to be involved in abiotic stress tolerance such as drought [27,44,45]. However, further validation studies are needed for the MTAs identified and underlying gene functions prior to their utilization in a marker-assisted selection program. Additional examples where the underlying genes had annotations matching the trait of interest are provided in Table 1.

Multiple Stress Resistant Lines
There were six biotic stresses (three rusts assessed at both seedling and adult plant stages, CCN, Cr, and HF) assessed to identify multiple stress resistant lines. All the SHW lines showed resistance to multiple biotic stresses ranging from two to five biotic stresses (Figure 3 and Supplementary  Table S1). Specifically, 14, 67, 38, and 6 lines showed resistance against two, three, four, and five biotic stresses, respectively (Figure 3). From this study, we can recommend 17 lines (15.3% of the total resistant lines) that were resistant to more than two stresses and had a large number of favorable alleles (alleles that increased its resistance against multiple stresses based on significant markers identified in GWAS) conferring resistance (Supplementary Table S4). These lines can be used as parents in breeding programs with the goal of reducing disease and pest damage. The results indicated the usefulness of these lines for pyramiding multiple genes for durable resistance to multiple diseases and pests.

Multiple Stress Resistant Lines
There were six biotic stresses (three rusts assessed at both seedling and adult plant stages, CCN, Cr, and HF) assessed to identify multiple stress resistant lines. All the SHW lines showed resistance to multiple biotic stresses ranging from two to five biotic stresses (Figure 3 and Supplementary Table  S1). Specifically, 14, 67, 38, and 6 lines showed resistance against two, three, four, and five biotic stresses, respectively (Figure 3). From this study, we can recommend 17 lines (15.3% of the total resistant lines) that were resistant to more than two stresses and had a large number of favorable alleles (alleles that increased its resistance against multiple stresses based on significant markers identified in GWAS) conferring resistance (Supplementary Table S4). These lines can be used as parents in breeding programs with the goal of reducing disease and pest damage. The results indicated the usefulness of these lines for pyramiding multiple genes for durable resistance to multiple diseases and pests.

Genetic Resources
A total of 125 synthetic hexaploid wheat (110 winter synthetics and 15 spring synthetics) lines derived from crosses between seven durum (T. turgidum) parents and 25 different Ae. tauschii accessions were used in this study (Supplementary Table S1). These lines were distributed by the International Winter Wheat Improvement Program (http://www.iwwip.org) and details were provided previously [25,27,50].

Adult Plant Resistance Screening for Leaf, Stem, and Stripe Rusts
For the adult plant rust screening, all the genotypes were tested under field conditions for Lr, Sr, and Yr resistance in an augmented design with replicated checks ("Gerek" and "Karahan") in two rows (distance between rows: 22.5 cm) of a meter-long plot. Leaf rust was tested under natural conditions in Adapazarı, Turkey in the 2016 and 2017 growing seasons. Stem rust was tested against artificial inoculation of race "TTTTF" (avirulent on: Sr24 and Sr31; virulent on: Sr5, Sr6, Sr7b, Sr8a, Sr9a, Sr9b, Sr9d, Sr9e, Sr9g, Sr10, Sr11, Sr17, Sr21, Sr30, Sr36, Sr38, and SrMcN) at Haymana in Turkey in 2016 under natural conditions using natural races in Kastamonu in Turkey in 2016 and 2017, Eskisehir in Turkey, and at Njoro, Kenya in 2017. Similarly, Yr was tested against the artificial inoculation of "Warrior race" (avirulent on: Yr8 and Yr27; virulent on: Yr1, Yr2, Yr3, Yr4, Yr6, Yr7, Yr9, Yr17, Yr25, Yr32, YrSp, YrAvs, and YrAmb) under field conditions at Haymana research station in Turkey in 2016, whereas in 2017, the genotypes were tested under natural conditions using natural races of Yr at Kenya Agricultural and Livestock Research Organization (KALRO) in Njoro, Kenya. Yellow rust disease severity on the flag leaf was recorded three times during the growing season and the final score was taken as a representative score for the analysis. "Gerek" was a susceptible check and "Karahan" was a resistant check for the Lr, Sr, and Yr (Warrior race) used in this study. Rust disease intensities at the adult plant stage were recorded using a modified cobb scale [55] based upon severity (percentage of the plant infected) and response (a type of disease reaction). Field disease severity (SEV) was recorded as a percentage using the following scores: Trace, 5, 10, 20, 40, 60, and 100% infection, whereas a disease response or an infection type (IT) was recorded by dividing the infection type into six groups: 0, resistant (R), moderately resistant (MR), moderately resistant to susceptible (M), moderately susceptible (MS), and susceptible (S) [55]. An average coefficient of infection (COI) was calculated as the multiplication of severity and assigned a constant value based on IT. Constant values were 0.2, 0.4, 0.6, 0.8, and 1.0 for field responses of R, MR, MR-MS (M), MS, and S genotypes, respectively.

Cereal Cyst Nematode Resistance Screening
The cereal cyst nematode resistance screening was carried out at the Soil Borne Diseases Program at International Maize and Wheat Improvement Center (CIMMYT)-Turkey facilities located at the Transitional Zone Agriculture Research Institute (TZARI) Eskisehir, Turkey. The screening was performed in 2016 and 2017 in a randomized complete block design with three replications in the growth chamber as per [14]. Briefly, the population of H. filipjevi was collected from a field in Yozgat, Turkey and freshly hatched second stage juveniles (J2s) were used as inoculum in the screening tests. Three replicates per genotype were inoculated after emergence with 400 freshly hatched J2 by making three holes around the stem base under controlled conditions. Nine weeks after inoculation, cysts from both root and soil extractions were counted under a stereomicroscope. The mean number of cysts was recorded and classed into five groups: Resistant (R), equal or fewer cysts than in a known resistant check; moderately resistant (MR), slightly more cysts than in a resistant check; moderately susceptible (MS), slightly more cysts than in an MR check whereas less than in a susceptible (S) check; S, cyst number similar to S check; and highly susceptible (VS), more cysts than in the S check [14]. Wheat cultivars "Katea", "Sonmez", and "Yelken" were used as resistant checks whereas "Bezostaja" and "Kutluk" were used as susceptible checks for CCN.

Crown Rot Resistance Screening
Crown rot resistance screening was performed in a randomized complete block design with three replicates under greenhouse conditions at the Transitional Zone Agricultural Research Institute, Eskisehir, Turkey in 2016 and 2017. The crown rot resistance screening protocol used in this study was provided in detail in Erginbas-Orakci et al. [56]. In brief, the pathogenic F. culmorum Isolate 18 and Isolate 41 were isolated from a mature wheat crowns near Usak city in the Aegean region and Konya, Central Anatolia Plateau of Turkey, respectively. Three replicates of each genotype were grown in a plastic tube (3 cm diameter × 12.5 cm in length) in the greenhouse for about two months with a day/night photoperiod of 16/8 h at a temperature of 25/15 • C and a relative humidity of 60/80%. Around 0.5 g fungus colonized wheat bran was added to the tubes at the time of seed sowing. Around six weeks after inoculation (Zadoks growth stage 14), plants were scored using a 1-5 scale: 1, Resistant; 2, Moderately Resistant; 3, Moderately Susceptible; 4, Susceptible; and 5, Very Susceptible [56]. Wheat cultivars "Altay","Carisma", "Katea", "Mace", "Spitfire", "Suntop", "Wallup", "249", were used as R-MR checks; "Scout", "Sonmez", and "Sunco" were used as MR-MS checks; "Bezostaya","Ega wylie", "Emu Rock", and "Janz" were used as MS checks; whereas "Dumlupinar", "Kiziltan", "Kutluk", "Puseas", "Seri", and "Suzen97" were used as susceptible checks for Cr.

Hessian Fly Resistance Screening
The Hessian fly resistance screening using a population from Morocco was performed in 2017 in the greenhouse (20 • C and 70% relative humidity) at the International Center for Agricultural Research the Dry Areas (ICARDA) in Settat, Morocco [28]. A genotype that was resistant to this population may have Hessian fly resistance genes H5, H11, H13, H14, H15, H21, H22, H23, H25, or H26 [28]. The detailed protocol for Hessian fly resistance screening used in this study was provided previously [57]. Briefly, 10 plants per genotype were grown in a greenhouse flat (54 × 36 × 8 cm). Then the seedlings at the one leaf stage were placed under a cheesecloth tent along with infested plants containing mature Hessian fly pupae from which adult Hessian flies emerged and infected the seedling plants. Plant reactions to larval feeding were determined after 20 days. Susceptible genotypes were stunted, dark green, and contained live larvae whereas resistant genotypes were not stunted, light green, and contained dead or live larvae. Percentage of resistant plants was calculated as the number of resistant plants divided by a total number of plants evaluated. The percentage score was divided into five groups: 0%, very susceptible (VS); 10-40%, susceptible; 50-70%, moderately susceptible; 80-90% moderately resistant (MR); and 100% as resistant (R) genotype [57]. Wheat cultivar 'Nesma' was used as a susceptible check [57] and resistant plants were checked (under a microscope; 40x) for the presence of dead larvae to confirm the antibiosis effect [36].

Genotyping and SNP Discovery
The details on genotyping and SNP discovery were described previously [25,27]. Briefly, the genotyping was performed using genotyping-by-sequencing (GBS) approach and SNP discovery was made using TASSEL v. 5.2.40 GBS v2 Pipeline [58] with physical alignment to the Chinese Spring genome sequence, RefSeq v1.0 [59]. The SNPs with minor allele frequency less than 5% and missing information greater than 20% were excluded [50]. All genotypes in this study met the filtering criteria of missing sites less than 20% and were retained in this study [25,27].

Population Structure and Genome-Wide Association Study
The population structure analysis of 125 SHWs used in this study was performed using a Bayesian clustering algorithm in STRUCURE v.2.3.4 [60] and principal component analysis in TASSEL [61], and has been described previously [29,50]. A GWAS was conducted using 35,798 GBS-derived SNPs and best linear unbiased estimates (BLUEs) of HF infestation score, CCN score, Cr score, and field-based rust SEV, IT, and COI for each location and year. BLUEs were estimated using PROC MIXED in SAS 9.4 (SAS Institute Inc., Cary, NC) by assuming genotype as a fixed effect and replication/block as a random effect. A multi-locus mixed linear model using FarmCPU (Fixed and random model Circulating Probability Unification) algorithm with the coefficient of ancestries of the first three population structure subgroups as covariates and FarmCPU calculated kinship [62] as a random effect were implemented in the Memory-efficient, Visualization-enhanced, and Parallel-accelerated Tool (MVP R software package-https://github.com/XiaoleiLiuBio/MVP) for GWAS. The marker-trait associations (MTAs) were selected with a genome-wide significance threshold level of p = 9.18 × 10 −5 (−log 10 p = 4.04) considering the deviation of the observed test statistics value from the test statistics values in the quantile-quantile plots [27]. Haplotype block analysis was performed using default parameters settings of Haploview software (https://www.broadinstitute.org/haploview/haploview). The gene underlying MTAs and their annotation was retrieved from the IWGSC RefSeq v1.0 annotation. The potential candidate genes associated with the trait of interest were further examined using published literature.

Conclusions
The present study identified SHWs resistant to diseases and pests that can be used for breeding for resistance to multiple biotic stresses. Additionally, this study identified several lines having resistance to multiple rusts (Lr, Sr, and Yr) at both the seedling and adult plant stage. Resistance genes from these lines can be transferred to bread wheat cultivars using back-crossing.
From GWAS, this study identified 124 genomic regions associated with various diseases that can be used in a marker-assisted breeding program upon validation in an independent population. Furthermore, several genes in those significant genomic regions had gene annotations suggesting their involvement in disease resistance, which provided confidence on the reliability of the identified genomic regions. Furthermore, this study identified several MTAs (42) on the D genome which demonstrates the potential of SHWs to expand the genetic resources of the D genome and for their utilization in wheat breeding programs. Finally, this study also provided information towards further understanding of resistance to multiple biotic stresses in wheat.

Conflicts of Interest:
The authors declare that they have no conflict of interests. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.