Genome-Wide Association Study for Spot Blotch Resistance in Synthetic Hexaploid Wheat

Spot blotch (SB) caused by Bipolaris sorokiniana (Sacc.) Shoem is a destructive fungal disease affecting wheat and many other crops. Synthetic hexaploid wheat (SHW) offers opportunities to explore new resistance genes for SB for introgression into elite bread wheat. The objectives of our study were to evaluate a collection of 441 SHWs for resistance to SB and to identify potential new genomic regions associated with the disease. The panel exhibited high SB resistance, with 250 accessions showing resistance and 161 showing moderate resistance reactions. A genome-wide association study (GWAS) revealed a total of 41 significant marker–trait associations for resistance to SB, being located on chromosomes 1B, 1D, 2A, 2B, 2D, 3A, 3B, 3D, 4A, 4D, 5A, 5D, 6D, 7A, and 7D; yet none of them exhibited a major phenotypic effect. In addition, a partial least squares regression was conducted to validate the marker–trait associations, and 15 markers were found to be most important for SB resistance in the panel. To our knowledge, this is the first GWAS to investigate SB resistance in SHW that identified markers and resistant SHW lines to be utilized in wheat breeding.


Introduction
Wheat (Triticum aestivum L.) is the most widely consumed food grain in the world. Global wheat production must therefore increase to meet the growing demand estimated for the next three decades [1]. It will be paramount to combine climate resilience, yield potential, and disease resistance in single wheat genotypes which could be grown across diverse environments. Known challenges that limit increased production rates are rapid climate change and emergence of new pathogenic variants. Foliar diseases, in particular, have become increasingly relevant for wheat in recent years, leading to significant losses in grain yield and quality [2]. Some of the factors driving foliar diseases are the commercial cultivation of susceptible varieties, the rapid evolution of causal pathogens, climate change, and unfavorable agricultural practices, which often lead to severe disease epidemics. About 21.5% of the global wheat production is lost each year to diseases [2], the majority of the losses attributed to fungal pathogens infecting multiple wheat organs such as root, stem, leaf, spike, and grain.
Spot blotch (SB) is caused by the fungus Bipolaris sorokiniana (Sacc.) Shoem syn. Drechslera sorokiniana (Sacc.) Subrm and Jain (syn. Helminthosporium sativum, teleomorph Cochliobolus sativus) and is considered one of the most destructive fungal diseases in humid and high-temperature regions; it affects not only wheat but also several other small grains worldwide, such as barley, rye, and triticale [3][4][5][6][7][8][9]. The SB pathogen can infect all plant organs, but particularly leaves and grain, thus reducing plant photosynthetic efficiency and grain quality. The pathogen has a wide range of hosts among wild and cultivated Poaceae species [10][11][12]. SB symptoms are characterized by light to dark brown lesions on leaves, oval to elongated in shape [13], that extend and merge very quickly, resulting in tissue death.
The importance of SB in production losses has been widely documented. On average, yield loss of 15-20% due to SB has been reported in several countries under favorable climate conditions, yet the yield losses can reach up to 70% in susceptible varieties [14][15][16]. The growing threat of SB due to rising global temperatures and the accelerated evolution of pathogenic races have recently caught the attention of plant breeders and pathologists and created a sense of urgency for the identification of new sources of SB resistance.
The commercial cultivation of SB-resistant varieties is the most sustainable and cost-effective strategy to manage the losses incurred by SB [17][18][19]. Cultivar development for resistance to SB is slow because resistance is quantitative and only a limited number of genes are known to have a major effect. Four SB resistance genes with major effects have been named to date, i.e., Sb1 through Sb4 [20][21][22][23]. Furthermore, several QTLs with minor effects have been found on almost all wheat chromosomes [24][25][26][27]. Most gene discovery studies undertaken to date have used biparental mapping populations, while a genome-wide association study (GWAS) using historical recombination usually provides a better resolution than biparental mapping. GWAS for resistance to SB found minor QTLs on chromosomes 2D, 3A, 4A, 4B, 5A, and 7B [28]; 1A, 1B, 1D, 4A, 5A, 5B, 6A, 6B, 6D, 7A, and 7B [29]; and 1B, 3B, 7B, and 7D [30]. Recently, Bainsla et al. [31] found 25 marker-trait associations (MTAs) on 13 chromosomes explaining between 2.0 and 17.7% of the phenotypic variance. Tomar et al. [32] reported four new QTLs for resistance to SB in spring wheat on chromosomes 1A, 1D, 2B, and 6D. Most of the studies for resistance to SB concentrated on spring wheat, and only a few focused on winter wheat germplasm.
To identify novel and more effective sources, synthetic hexaploid wheat (SHW) (2n = 6x = 42; AABBDD), derived from a cross between Triticum turgidum L. (2n = 4x = 28; AABB) and Aegilops tauschii syn. Ae. squarrosa (2n = 2x = 14; DD), could be an alternative source of resistance to SB as envisaged from other studies [33,34]. Considerable levels of genetic variation have previously been recorded among SHWs developed by the Wide Crosses Program of the International Maize and Wheat Improvement Center (CIMMYT) for different agronomic traits, disease resistance, and quality [33,35,36,37]. SHW was found to be promising in terms of resistance to SB, and a few SHW lines showed better resistance than the resistant check variety 'Mayoor' [38].
Spot blotch is a major limiting factor for bread wheat production in hot and humid regions, particularly the Indo-Gangetic plains of South Asia. Despite the extensive breeding efforts, effective resistance to SB has not been observed in released cultivars, and the most promising cultivars have been found to be only partially resistant. Numerous studies have indicated that resistance to SB is polygenic, and multiple QTLs have been reported [24,26]. In CIMMYT, four biparental bread wheat populations were recently tested for SB resistance under Mexican environments, where several QTLs with minor effects were identified [24,25]. The same populations were further evaluated in South Asia with similar results, all QTLs presenting minor effects [26,27].
However, to our knowledge, no large-scale systematic screening and genetic study for SB resistance has yet been performed on SHW. Therefore, the objectives of this study were to (1) evaluate a set of 441 primary SHW lines for SB resistance under controlled environmental conditions and (2) apply GWAS to identify potential new genomic regions of resistance that are not yet present in elite bread wheat germplasm.

Plant Material
A total of 441 SHW lines, generated by CIMMYT's Wide Crosses Program by hybridizing 40 durum wheat (DW) parents with 277 Ae. tauschii accessions, were used in this study. The DW parents were involved in 1-54 crosses and the Ae. tauschii accessions were used in 1-7 crosses (Supplementary Table S1). The SHWs were selected from a larger collection of 1524 SHWs for their resistance to diseases such as Fusarium head blight, Septoria tritici blotch, and rusts, and for acceptable agronomic traits such as plant height and days to heading [34].

Phenotypic Evaluations of Spot Blotch
The disease screening was carried out in a greenhouse at CIMMYT, El Batán, Mexico (19°31′ N, 98°50′ W, elevation 2249 m above sea level) during 2018 and 2019. All 441 SHWs, along with the 40 DW parents and four checks including Chirya 3 (resistant), Sonalika and Ciano T79 (susceptible) and Francolin (moderately susceptible) were evaluated for SB resistance at the seedling stage, while the Ae. tauschii accessions could not be screened due to their nature and growth as a wild species. The seeds of SHW lines were vernalized to break down seed dormancy and to obtain an even germination. Experiments were planned in a randomized complete block design with six replicates for the SHW and eight replicates for the DW parents, with four plants per entry grown in plastic containers as experimental units to obtain average values for their subsequent analysis. The size of the containers was 26.5 cm long, 20.5 cm wide, and 5 cm high. The seedlings were grown under controlled conditions with an ambient temperature of 22-25/16-18 °C (day/night) and with a 16 h photoperiod.
For disease expression, the isolate CIMFU 483 of Mexican Bipolaris sorokiniana (BSG40M2), a monosporic strain isolated from wheat collected in Agua Fria, Mexico, was used. This isolate is a ToxA producer, which was confirmed based on inoculation experiments with differential genotypes, infiltration experiments, and PCR with the ToxA1/ToxA2 primers. The isolate was grown in a 30% V8 media [39], and the conidia concentration for inoculation was adjusted to 7500 spores mL⁻¹ using a Neubauer counting chamber. One drop of Tween 20 (a surfactant reagent) was added for every 100 mL of spore suspension.
Seedlings were inoculated at the two-leaf stage, when the second leaf was fully expanded, or two weeks after sowing. The seedlings were inoculated with a conidial suspension of the CIMFU 483 isolate until the leaves were at dew point. This inoculum was sprayed four times every 20-30 min using a hand sprayer. After the leaves dried, the trays were moved to a mist chamber (RH 100%, 22-24 °C) to promote infection. After 48 h, the plants were transferred back to the greenhouse bench. Seedling response was evaluated seven days post inoculation following the 1-5 ordinal lesion rating scale developed by Lamari and Bernier [40], which is based on the lesion type shown on the second leaf. The genotypes were grouped based on the mean score of replicates following 1.0-1.5 = Resistant (R); 1.6-2.5 = Moderately Resistant (MR); 2.6-3.5 = Moderately Susceptible (MS); and 3.6-5.0 = Susceptible (S).
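The grouping rule above can be written as a small lookup. The following Python sketch is only an illustration: the function name `classify_sb` is hypothetical, and rounding the replicate mean to one decimal (so that a value such as 1.55 falls into one of the stated ranges) is an assumption, since the text does not say how means between the ranges were handled.

```python
from decimal import Decimal, ROUND_HALF_UP


def classify_sb(mean_score: float) -> str:
    """Map a mean seedling score on the 1-5 scale to a reaction class."""
    if not 1.0 <= mean_score <= 5.0:
        raise ValueError("mean score must lie on the 1-5 scale")
    # Assumption: round half up to one decimal before applying the ranges.
    score = Decimal(str(mean_score)).quantize(
        Decimal("0.1"), rounding=ROUND_HALF_UP
    )
    if score <= Decimal("1.5"):
        return "R"   # 1.0-1.5, resistant
    if score <= Decimal("2.5"):
        return "MR"  # 1.6-2.5, moderately resistant
    if score <= Decimal("3.5"):
        return "MS"  # 2.6-3.5, moderately susceptible
    return "S"       # 3.6-5.0, susceptible
```

Decimal arithmetic is used here only to avoid binary floating-point surprises at the class boundaries.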

Genotyping
Genomic DNA was extracted from the second leaf (0.25 mg per entry) of 10-day-old seedlings of each line of the SHW using the modified cetyltrimethyl ammonium bromide (CTAB) method described in the CIMMYT laboratory protocols [41]. The high-throughput genotyping method DArTseq™ [42] was applied to all samples in the Genetic Analysis Service for Agriculture (SAGA) in CIMMYT, El Batán, Mexico.
Briefly, DArTseq is a complexity reduction method that includes two enzymes (PstI and HpaII) to create a genome representation of the set of samples. The PstI-RE site-specific adapter is tagged with 96 different barcodes, enabling the multiplexing of a 96-well microtiter plate with equimolar amounts of amplification products to run in an Illumina sequencer Novaseq6000 (Illumina Inc., San Diego, CA, USA). The successfully amplified fragments are sequenced with up to 83 bases, generating approximately 500,000 unique reads per sample. A proprietary analytical pipeline developed by DArT P/L was used to generate allele calls for SNP and presence/absence variation (PAV) markers [42]. A 100K consensus map [43] was used to obtain genetic positions of the SNPs in addition to the alignments to the reference genomes.
From the complete set of 441 SHW lines, 438 were genotyped and used for the genome-wide association study (GWAS). A total of 67,436 markers were scored, of which about half (34,790) could be aligned to reference genomes. Quality control was carried out based on the minimum lack of alleles, resulting in 5800 markers to be used for GWAS. The reference genomes used in this study were the Chinese Spring IWGSC RefSeq v1.0 genome assembly [44] and durum wheat (cv. Svevo) RefSeq Rel. 1.0 [45], along with the reference genome of Ae. tauschii (v.4, 2017) [46].

Statistical Analysis and Genome-Wide Association Study
For the disease data, statistical analyses were performed using the Statistical Analysis System version 9.1 [47]. An analysis of variance (ANOVA) was conducted on the average reactions of the SHW, the DW parents, and SB checks. The Best Linear Unbiased Estimates (BLUEs) were computed for each of the 441 SHW genotypes and later used to conduct GWAS using the TASSEL (Trait Analysis by aSSociation, Evolution and Linkage) software ver. 5.2.73 [48]. The mixed linear model (MLM) of Yu et al. [49] was used to simultaneously account for the level of relatedness based on marker data and identity by descent (IBD) computed from the coefficient of parentage (COP), which controls population structure. Additionally, population structure was controlled by fitting the first five principal components (PCs) from the kinship matrix as fixed covariates and the COP as a random effect. The false-discovery rate (FDR) was used to assess significance (p < 0.05) [49]. The allelic effects of the significant MTAs were estimated as the difference between the mean values of lines with and without the favorable alleles and were presented as box plots.
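The last two steps of this paragraph, FDR control and the allelic effect, can be illustrated with a short Python sketch. It rests on assumptions that are not taken from the study: the text says only that an FDR was used, so the Benjamini-Hochberg procedure is assumed here. The sign convention of the effect (mean without the favorable allele minus mean with it, so that a positive value means a lower disease score with the allele) and both function names are also illustrative.

```python
from statistics import mean


def allelic_effect(scores, has_favorable):
    """Difference in mean SB score between lines without and with the
    favorable allele (assumed sign: positive = allele lowers the score)."""
    with_allele = [s for s, f in zip(scores, has_favorable) if f]
    without_allele = [s for s, f in zip(scores, has_favorable) if not f]
    if not with_allele or not without_allele:
        raise ValueError("both allele classes must be present")
    return mean(without_allele) - mean(with_allele)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under this sketch, a marker would be declared significant when its adjusted p-value falls below 0.05.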

Partial Least Squares Regression
We used the partial least squares (PLS) method to translate the GWAS results into practical applications for breeding. The PLS method has been used extensively to assess the importance of environmental and genotypic covariables in multi-environment plant breeding trials [50][51][52][53].
In the context of this study, PLS relates, in a single estimation procedure, (1) the two-way table of phenotypic measurements of SB of the SHW lines in six replicates in the greenhouse (and the mean across the six replicates) and (2) the set of significant markers found in the current GWAS (41 explanatory variables). PLS regression describes the explanatory variables (markers) as linear combinations of the complete set of measures of SB on SHW lines, with no limit to the number of marker covariables or to the number of SHW lines that can be used.
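To make the idea concrete, the following Python sketch computes only the first PLS component for a single response. It is a deliberate simplification and not the analysis used in the study, which related the full table of six replicate scores to the 41 markers. The function names, the use of the mean SB score as the only response, and the 0/1 coding of markers (1 = resistance allele) are all assumptions.

```python
from math import sqrt


def _center(values):
    """Subtract the mean from every value in a list."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def pls1_first_component(X, y):
    """First PLS1 component for marker matrix X (rows = lines,
    columns = markers) and a single response y (mean SB score).

    Returns (weights, scores): one weight per marker and one score per line.
    """
    n, p = len(X), len(X[0])
    cols = [_center([row[j] for row in X]) for j in range(p)]
    yc = _center(y)
    # Each weight is proportional to the covariance of a marker with y.
    w = [sum(c * v for c, v in zip(cols[j], yc)) for j in range(p)]
    norm = sqrt(sum(v * v for v in w))
    if norm == 0:
        raise ValueError("no marker covaries with the response")
    w = [v / norm for v in w]
    # Line scores are the projection of centered markers on the weights.
    t = [sum(cols[j][i] * w[j] for j in range(p)) for i in range(n)]
    return w, t
```

Markers with large absolute weights contribute most to the first component; a negative weight under this coding means the resistance allele is associated with a lower SB score.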

Resistance to Spot Blotch at the Seedling Stage
The SB development observed during seedling evaluation in the greenhouse was even and consistent. ANOVA showed significant differences among SHWs (p < 0.001). The checks Chirya 3, Sonalika, Ciano T79, and Francolin displayed scores of 1.4, 4.0, 4.0, and 2.8, respectively (Table 1), verifying the identity of the B. sorokiniana isolate used and a successful inoculation.
The SB reaction of the DW parents revealed that 18 (45%) parents had reaction scores of 1.0-1.5 (R) and 14 (35%) had reaction scores of 1.6-2.5 (MR); these developed mostly small, dark to maroon lesions that extended 1-2 mm in length, with chlorotic edges, during the initial infection. Eight entries (20%) had a mean reaction score between 2.6 and 3.6 and were considered moderately susceptible (MS) to susceptible (S); their leaves died or senesced as the light brown to dark brown, oval to elongated blotches extended and merged very quickly (Tables 1 and S1). The SB reaction scores of the DW parents compared to the scores of the SHW indicated that the SB resistance of SHW was likely inherited from both DW and Ae. tauschii parents.

Genome-Wide Association Study Using Different References Genomes
The first two principal components (PCs) based on the DArTSeq markers separated two clear groups of entries of similar sizes and some entries in between, explaining around 34% of the total variability. This population structure was controlled by fitting the first five PCs derived from the correlation matrix as fixed covariates. Additionally, the coefficient of parentage used as a random variable to fit the GWAS mixed linear model (MLM) effectively controlled the remaining population structure after fitting the first five PCs.
From the complete set of 441 SHW lines, 438 were genotyped and used for the genome-wide association study (GWAS). A total of 67,436 markers were scored, of which about half (34,790) could be aligned to reference genomes. Quality control was carried out based on the minimum lack of alleles, resulting in 5800 markers to be used for GWAS.
Out of the DArTSeq markers that could be aligned to the whole genome sequence of cv. Chinese Spring (CS, IWGSC RefSeq v1.0), 20 significant MTAs were identified, as shown in Table S2. Looking at the markers located on the 100K consensus map, 32 significant MTAs were detected, as shown in Table S3 and Figure 3, and were located on several chromosomes, including 1B (7) (Table S3). Therefore, three MTAs showed the same chromosome allocation on the genetic and physical maps, while six MTAs showed different chromosome assignments (yet mainly homologous chromosomes) on both maps.

Genes 2022, 13, x FOR PEER REVIEW
When markers aligned to the DW cultivar Svevo and the Ae. tauschii reference genomes were considered, 10 MTAs were identified on chromosomes 1B (1), 2A (1), 2B (1), 2D (1), 3A (2), 3B (2), 4D (1), and 7A (1) (Table S4 and Figure 4). However, only three markers in Table S4 coincided with those found in Tables S2 and S3. Marker ID 1240012 on chromosome 2B in Svevo was found to be on chromosome 7D when aligned to the physical map of CS and on chromosome 5B in the 100K consensus map. The markers with the highest allele substitution effects ranged from 1.10 (2B), 0.33 (3A), to 0.16 (3A).
Overall, a total of 41 genomic regions identified using the different maps are summarized in Table 2. A re-alignment of the marker sequences to the ABD, AB, and D genomes verified the physical position of several of the significant SNPs and could identify their physical positions across species. However, among all, 11 MTAs could not be assigned positions on the physical map. Furthermore, 23 MTAs were found within annotated high-confidence gene sequences, with 10 of these 23 candidate genes annotated in the CS reference genome, 6 in the Svevo reference genome, and 7 in the Ae. tauschii reference genome (Supplementary Table S5). These significant MTAs were detected on 15 chromosomes with the maximum number of 5 MTAs on chromosome 1B and 1 each on 6D and 7B, and their R² values varied from 0.03 to 0.07. Among the five markers detected on chromosome 1B, the highest R² value of 0.06 was found for marker ID 1145134 that is in proximity with marker ID 5582520, with two other markers (IDs 4261287 and 7335825) distal to them and one (ID 100033209) proximal to them. Three MTAs were found on chromosome 2A, with marker ID 1144884 exhibiting the highest R² value of 0.07. Two MTAs on chromosome 5A (IDs 3570010 and 1046932) were found with low R² values of 0.03 for each one. Allelic effects ranged from 0.01 to 1.11 for the MTAs on 4D (ID 2243087) and 7D (ID 1240012), respectively.

Identified MTA
On chromosome 1B, the reported positions for the five MTAs placed two MTAs (markers 4261287 and 7335825) close together, at 51.3 and 52.6 cM, and two MTAs (markers 5582520 and 1145134) at 96.9-98.0 cM, resulting in three different QTLs identified for SB on chromosome 1B. On chromosome 2D, two MTAs (markers 1122278 and 2243785) were positioned 11.02 Mbp apart but with an R² of 0.08 and a probability of linkage disequilibrium (LD) of 1.23 × 10⁻⁷ forming a third MTA. Additionally, two markers on chromosome 3A separated by only 0.11 Mbp (marker IDs 1019955 and 474554774) showed a linkage disequilibrium R² of 0.8138, with a p-value of 1.21 × 10⁻⁷. The two significant markers on 3D were located 20 cM apart and were thus considered unlinked. On chromosome 4A, markers 1162615 and 100036641 were mapped near each other, at 96.1 and 96.4 cM, respectively, and thus could be considered one single MTA.
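The pairwise LD values quoted above are R² statistics between two markers. As a minimal sketch, and assuming homozygous lines coded 0/1 at each marker with missing calls stored as `None` (a coding chosen for illustration, not taken from the study), R² can be computed as the squared correlation between the two allele vectors. The function name `ld_r2` is hypothetical.

```python
def ld_r2(marker_a, marker_b):
    """Squared correlation (r^2) between two 0/1-coded markers.

    Lines with a missing call (None) at either marker are skipped.
    """
    pairs = [(a, b) for a, b in zip(marker_a, marker_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two lines with both calls")
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("monomorphic marker: r^2 is undefined")
    return cov * cov / (var_a * var_b)
```

Two markers with an r² close to 1 carry nearly the same information and would usually be treated as tagging one locus, whereas a value near 0 indicates that they segregate independently in the panel.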

Frequency of Resistance Alleles within Individual SHWs
The frequency of resistance alleles in the SHWs was examined with the aim of identifying lines with high numbers of resistance alleles to be used for further resistance breeding. A total of 59 SHW lines carried more than 30 of the 41 identified resistance alleles, with an average SB score of 1.3 (Figure 5). Although not shown in this figure, there are 32 SHW lines with >32 resistance alleles and 15 SHW lines with >34 resistance alleles, which could be the top candidates for further evaluation and breeding. SHW lines with fewer resistance alleles (<16 R alleles) showed increased susceptibility, demonstrating the additive nature of the resistance alleles.
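The tally described above amounts to counting, for each line, how many of the significant markers carry the resistance allele and then applying a cutoff. A minimal Python sketch follows; the data layout (a dictionary from line name to a list of 0/1 flags, 1 = resistance allele) and the function names are assumptions made for illustration.

```python
def count_resistance_alleles(genotypes):
    """Return {line: number of resistance alleles} from 0/1 marker flags."""
    return {line: sum(flags) for line, flags in genotypes.items()}


def top_candidates(genotypes, more_than=32):
    """Lines carrying more than `more_than` resistance alleles,
    ordered from the highest to the lowest allele count."""
    counts = count_resistance_alleles(genotypes)
    selected = [line for line, c in counts.items() if c > more_than]
    return sorted(selected, key=lambda line: -counts[line])
```

With 41 markers per line, `top_candidates(genotypes, more_than=32)` would correspond to the ">32 resistance alleles" group, and `more_than=34` to the stricter group.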

Interpretation of Results from Partial Least Squares
The results of the PLS are shown in Figure 6, where the first two PLS factors explained around 26% of the total variability. The biplot highlights 15 molecular markers (green) with a frequency of R alleles greater than 84% and 32 SHW lines (red) having more than 32 resistance alleles. The arrows from the center to the upper-left quadrant show the six phenotype measurements of SB (SB1-6) and their overall mean (Mean SB). The SHW lines are distributed in a linear manner from the lower-right quadrant (more resistant lines) to the upper-left quadrant (more susceptible lines). The 15 markers are located at the center and on the right-hand side of the biplot (green letter-number labels), and the 32 most resistant SHW lines (red numbers) are located towards the lower-right quadrant. From a practical breeding perspective, the 15 markers and the 32 SB-resistant lines could be prioritized in crosses between SHW lines and elite bread wheat lines in breeding and pre-breeding programs.
Figure 6. Biplot of the first two PLS factors. The 15 markers with a frequency of R alleles greater than 84% are identified with green letter-number labels (Table 2). The 32 SHW lines having more than 32 resistance alleles are identified with red numbers. The remaining markers and SHW lines are represented by green and red dots, respectively.

Discussion
Genome-wide association studies have been performed to uncover SNP markers related to SB resistance in bread wheat. One such study [54] examined 528 spring wheat accessions for seedling resistance against SB and identified 11 MTAs. The same panel was analyzed earlier [30], but only four genomic regions were identified because fewer markers were used, emphasizing the importance of high-density marker data. A recent GWAS [55] studied a total of 6736 CIMMYT breeding lines for SB resistance in field experiments conducted over several years (2014-2019); up to 214 MTAs were identified in at least one year, 96 were repeatable in at least two years, and all had minor effects.
To our knowledge, to date no GWAS has been reported on SB resistance in SHW, although several studies reported good resistance of SHW to SB. In earlier studies, Ae. tauschii was used to transfer potential SB-resistant genes through T. turgidum × Ae. tauschii or T. aestivum × Ae. tauschii crosses [35]. Diverse Ae. tauschii accessions were used to make SHW lines, which exhibited promising SB resistance and often performed better than the resistant check Mayoor [38]. A series of SHW was developed and then screened for several biotic and abiotic stresses, and promising entries were either used for commercial cultivars or as pre-breeding materials to develop new genotypes. The authors of [33] reported eight SHW accessions with SB resistance, along with sources of resistance to other diseases.
Our study revealed that the evaluated SHWs displayed considerable resistance to SB, with 38% of the SHW lines showing better resistance than the resistant check Chirya 3. According to the pedigree information, SB resistance of the panel might be based on diverse DW and Ae. tauschii backgrounds and was thus likely contributed by multiple SB resistance genes, which is in agreement with the GWAS results.

Novelties of the Significant Markers Found in the Current Study
Previous genetic studies have identified a range of SB resistance genes/QTL, residing on all wheat chromosomes except 4D and 5D, as summarized recently by [56]. Some of these loci exhibited major effects, such as the nominated Sb genes, yet most of them showed minor effects. The same applies to the current study, where a total of 41 significant markers on 15 chromosomes were found to be associated with SB resistance, and none of them showed any major effects. This again confirmed the polygenic nature of SB resistance described in previous studies [24,26,55]. The significant MTAs were identified on AB genome chromosomes as well as on D genome chromosomes, suggesting that SB resistance in the SHWs was derived from both their DW and Ae. tauschii parents.
MTAs were identified on all seven D genome chromosomes, including chromosomes 4D and 5D, on which no QTL/MTA has been reported so far [56], which supports their novelty. The two MTAs on chromosome 4D were located on the short arm (marker 2243087) and the long arm (marker 3023637); on chromosome 5D, the physically distant markers must represent two different QTLs. MTAs on chromosomes 1B (marker 1145134), 2D (markers 1122278 and 2243785), 3A (markers 1019955 and 2279238), and 6D (marker 1698662) also appear to be novel, since no QTL/MTA has been reported in the vicinity of these markers [56].
However, some MTAs were found within known QTL regions. For example, the two MTAs on chromosome 1BS (markers 4261287 and 7335825) were in close proximity to the MTAs reported by [29]. Likewise, on chromosome 3B, marker 4992362 was closely located to an MTA reported by [31]. Nevertheless, close linkage or coincidence does not necessarily mean that the identified regions represent the same QTL/MTA, especially because our study screened SHW, while those published previously evaluated common wheat. It is noteworthy that some markers did not show any BLAST hit on the three reference genomes, e.g., marker 7335825 on chromosome 1B and marker 7492146 on chromosome 2B. These MTAs represent variants absent in the reference genomes and might be worthy of further investigation.

Candidate Genes for the Identified Marker-Trait Associations
The significant markers identified from the GWAS were further evaluated for their association with disease resistance-related genes. We identified 23 plant defense-related protein families across multiple chromosome regions, of which only 13 have a known protein function. For example, marker 12779374 on chromosome 1D was identified within the gene TRITD1Bv1G224330 (Tables 2 and S5), which is involved in the synthesis of a lectin receptor kinase that has an important function in the general immunity of plants [57]. Similarly, marker 1240012 on 7D was located within the gene TRITD2Bv1G075350, related to U-box domain-containing protein 4, which is associated with the control of grain production [58]. However, it should be noted that these candidate genes might not be the underlying genes for the MTAs, due to the usually large linkage disequilibrium blocks in the wheat genome [59].
Furthermore, marker 1283998 on chromosome 3B marked an SNP within gene TRITD3Bv1G194800, which is described as disease resistance protein RPM1 G, again involved in the general resistance of plants to various diseases [60]. Marker 4992362 on chromosome 3B marked the gene TRITD3Bv1G257410, which is identified as a serpin protein that participates in the regulation of complex proteolytic systems [61], whereas marker 1011260 (on chromosome 3D) falls within the gene TraesCS3D02G407000, a peroxidase protein that has divergent roles in different plant-pathogen systems [62]. Furthermore, marker 100016153, aligned on chromosomes 5A and 5D, was located within the genes TraesCS5A02G146400 and TRITD5Av1G111170, which involve two proteins, Mannan endo-1,4-beta-mannosidase 6 and Mannan endo-1,4-beta-mannosidase-like protein.
Note that marker 4002611 on chromosome 7A fell within the gene TRITD7Av1G003410, a pectin lyase-like superfamily protein, which has an important role in the development and maturation of the plant. This protein also acts on the pectic substances present as structural polysaccharides in the primary cell walls of higher plants [63]. Marker 22765212, on chromosome 7D, was located within gene TraesCS7D02G278500, which is annotated as a ribosomal protein that plays a fundamental role in the growth and development of the plant and also participates in the general defense mechanisms of plants [64].

Application of GWAS for Use in Practical Breeding
Genome-wide association studies (GWAS) are a powerful option for the genetic characterization of quantitative traits and have been widely used to analyze agronomic and disease traits. With the increasing number of diseases affecting cultivated wheat plants, the option of developing resistant SHW lines has been widely used. This is the first GWAS to assess significant MTAs for SB in a diverse collection of 441 SHW lines; 41 significant markers and a range of SHW lines with high SB resistance were identified. In the PLS analysis, a subset of markers and SHW lines was identified that is more suitable for future breeding and pre-breeding activities.
The results of this study showed 15 molecular markers with a frequency of R alleles greater than 84% and 32 SHW lines having more than 32 resistance alleles. The PLS plot shows the specific locations of the 15 markers and the 32 most resistant SHW lines. From a practical breeding perspective, these markers with R alleles and the SB-resistant lines could be used in future breeding crosses.

Conclusions
This is the first GWAS to investigate MTAs for SB resistance in a diverse collection of 441 SHW lines from CIMMYT. The GWAS found a total of 41 significant markers related to SB resistance, distributed across 15 wheat chromosomes, and many of them are novel. We were able to identify highly resistant SHW lines carrying most of the resistance alleles of the significant markers, which can be used in wheat breeding programs.