Genome-Wide Association Study Reveals Novel Genomic Regions for Grain Yield and Yield-Related Traits in Drought-Stressed Synthetic Hexaploid Wheat

Synthetic hexaploid wheat (SHW; 2n = 6x = 42, AABBDD, Triticum aestivum L.) is produced from an interspecific cross between durum wheat (2n = 4x = 28, AABB, T. turgidum L.) and goat grass (2n = 2x = 14, DD, Aegilops tauschii Coss.) and is reported to have significant novel alleles-controlling biotic and abiotic stresses resistance. A genome-wide association study (GWAS) was conducted to unravel these loci [marker–trait associations (MTAs)] using 35,648 genotyping-by-sequencing-derived single nucleotide polymorphisms in 123 SHWs. We identified 90 novel MTAs (45, 11, and 34 on the A, B, and D genomes, respectively) and haplotype blocks associated with grain yield and yield-related traits including root traits under drought stress. The phenotypic variance explained by the MTAs ranged from 1.1% to 32.3%. Most of the MTAs (120 out of 194) identified were found in genes, and of these 45 MTAs were in genes annotated as having a potential role in drought stress. This result provides further evidence for the reliability of MTAs identified. The large number of MTAs (53) identified especially on the D-genome demonstrate the potential of SHWs for elucidating the genetic architecture of complex traits and provide an opportunity for further improvement of wheat under rapidly changing climatic conditions.


Introduction
Drought is one of the most important abiotic stresses that reduce crop productivity and is expected to increase with the change in climate [1]. Erratic rainfall patterns caused by climate change may aggravate drought stress and will have a major impact on agriculture [2,3]. The most prominent example of the impact of drought stress on agriculture was the 2012 drought stress in the United States, where moderate to extreme drought stress occurred across the central agricultural states that resulted in crop harvest failure for corn (Zea mays L.), sorghum (Sorghum bicolor L.), and soybean (Glycine max L.), and the agriculture loss due to drought was estimated to be $30 billion [4]. To cope with the challenges of drought stress, plant breeders have been focusing on improving drought tolerance since several decades [2,3,5]. However, the drought tolerance is a complex phenomenon as most of the traits associated with drought tolerance are polygenic in nature, and understating the genetic architecture of drought tolerance is still underway [3] including in wheat (Triticum sps.) [2]. Wheat is one of the most important staple cereal crops mainly grown under rainfed conditions [3,6] and is expected to suffer from drought stress [3]. Therefore, breeding for drought tolerance and identifying genomic regions and underlying candidate genes associated with drought tolerance are important for wheat improvement.
About 800 quantitative trait loci (QTLs) and marker-trait associations (MTAs) have been reported for drought tolerant traits (agronomic, physiological, root, and yield-related traits) using bi-parental mapping (~691 QTLs) and genome-wide association studies (GWASs;~109 MTAs) in wheat [3]. However, only 68 QTLs are major QTLs that explain more than 19% of phenotypic variation [3]. This study was conducted to identify novel genomic regions associated with grain yield (GY) and yield-related traits using GWAS performed using 35,648 genotyping-by-sequencing (GBS)-derived single nucleotide polymorphisms (SNPs) in 123 SHWs grown under two drought-stressed growing seasons (2016 and 2017) in Konya, Turkey. Subsequently, the underlying genes for the MTAs identified were investigated for their potential role in drought stress using the functional annotations. To the best of our knowledge, this is the first report on GWAS on GY and yield-related traits under drought stress in SHWs. The results from this study will be a valuable resource for the genetic improvement of GY and yield-related traits in drought stress, introgression of desirable genes from SHWs into elite wheat germplasm, genomic selection, and marker-assisted selection in the breeding program.

Weather Conditions
The mean monthly air temperatures were similar at Konya in both growing seasons with 13 • C in 2015-2016 and 12 • C in 2016-2017 compared to the 25-year mean monthly air temperature (11 • C) in Turkey ( Table 1). The total rainfall during 2016-2017 (243 mm) was slightly higher than that during 2015-2016 (222.4 mm) growing season. Total rainfalls during the wheat-growing season (September-July) were 48.9% lower in 2015-2016 and 44.2% lower in 2016-2017 compared to the 25-year mean total rainfall (435.1 mm) in Turkey. Although winter wheat water requirements are higher from mid-March to mid-June (from the spring tillering period to the mid-grain filling period), rainfalls were lower in both years compared to the 25-year mean rainfall. The plants were exposed to drought stress from tillering through grain filling. Hence, the results from the present study can be used to understand the effects and genetics of drought in SHW.

Phenotypic Variation for Yield and Yield-Related Traits
A combined analysis of variance (ANOVA) across years identified significant cross-over genotype × year interaction for all traits except for flag leaf width (FLW) and stem diameter (STMDIA) (Table S4). Therefore, analysis of variance was computed for both years separately and the results indicated that the SHWs showed significant variation for GY and yield-related traits in each year (Table 2). For instance, GY ranged from 200 g·m −2 to 341 g·m −2 with an average yield of 259 g·m −2 in 2016 and from 241 g·m −2 to 392 g·m −2 with an average yield of 290 g·m −2 in 2017 ( Table 2). The large variation among the traits in each year can be attributed to the collection of diverse accessions of SHWs from different countries and different genetic backgrounds [11,14].

Principal Component Analysis and Phenotypic Correlation
Principal component (PC) bi-plot analysis showed the association between GY and yield-related traits based on correlation matrix ( Figure 1). The first two PCs that explained 43.4% (2016) and 44.9% (2017) of variation better explained the relationship between traits in the two-dimensional space. In the PC biplot, we observed two distinct groupings. The first one comprised of GY, HI, BMWT, TKW, GVWT, AWNLN, and RTLN whereas the second one had FLLN, FLW, and FLA ( Figure 1). The traits grouping with GY are the more important traits for improving GY in drought-stressed conditions. The association observed in the PC biplot was supported by the significant correlations of GY with BMWT, HI, TKW, and GVW in both years ( Figure S1). Similar correlations for these traits were observed in the previous studies [18,[27][28][29][30][31][32].

Population Structure and Genome-Wide Association Study
Population structure analysis of 123 SHWs was performed using 35,648 SNPs [filtered for minor allele frequency (MAF) > 0.05 and missing data < 20%] using the Bayesian clustering algorithm implemented in Structure software and the results showed that these lines were divided into three subgroups ( Figure S2 and Table S1). The details of the population structure and genetic diversity of these SHWs have been previously reported in Bhatta et al. [11].
The GWAS identified novel genomic regions for GY and yield-related traits and the MTA explained the high phenotypic variance. The Fixed and random model Circulating Probability Unification algorithm (FarmCPU), with kinship, population structure (Q) or PC, best linear unbiased predictors (BLUPs) for each trait, and 35,648 GBS-derived SNPs was used to identify MTAs. The GBS-derived SNPs were well distributed across each of the chromosome ( Figure S3). We identified 194 MTAs distributed across 21 chromosomes for GY and yield-related traits with phenotypic variance explained (PVE) ranging from 1.1% to 32.3% ( Figure 2 and Figure S4, Table S5). The highest number of MTAs was observed for GY (29), followed by STMDIA (23), FLA (20), and TKW (20) while the lowest MTAs were observed for HI (10) (Figure 2). Of the 194 MTAs, 75 MTAs were detected on the A genome, with 66 MTAs on the B genome, and 53 MTAs on the D-genome. The highest MTAs were present on chromosome 7A (26 MTAs) and the lowest MTAs were present on chromosome 3D (four MTAs) ( Figure 2). Most of the MTAs identified in the present study were year-specific, suggesting the influence of genotype × environment interaction on the phenotype of the traits measured in two years. However, 120 of the 194 significant SNPs were in 83 genes and 45 of these MTAs were present within genes and their annotations suggested their potential role in drought stress. This result further provided confidence that the MTAs identified in the study are likely reliable MTAs (Table S5).
The present study identified four major haplotype blocks (from 19 bp to 433 kb) on chromosome 7A with two to six SNPs associated with GY in 2016 ( Figure 3). First haplotype block consisted of six MTAs within the 433 kb range, second haplotype block consisted of four MTAs within the 81 bp range, third haplotype block consisted of two MTAs within the 19 bp range, and fourth haplotype block consisted of three MTAs within the 314 kb range. The PVE on GY by the first, second, third, and fourth haplotype blocks were 17.2%, 24.6%, 21.9%, and 8.2%, respectively. One MTA (S7A_112977027; 112977027 bp) present in-between the second (537 kb away) and third (837 kb) haplotype blocks was within the gene, TraesCS7A01G158200.1, and PVE on GY was 12.8% (Table S5). This gene was annotated as a member of sentrin-specific protease of Ubiquitin-like Protease 1 (Ulp1) gene family ( Table 3). The Ulp1 is a small ubiquitin related modifier (SUMO)-specific protease that affects several important biological processes in plants including response to abiotic stress [21]. It has been shown to play a role in drought tolerance in Arabidopsis (Arabidopsis thaliana) [36] and rice (Oryza sativa) [22,37]. This makes this MTA interesting and a stronger candidate for future functional validation studies. Table 3. List of significant markers associated with GY and yield-related traits, favorable alleles (underlined), SNP effects, and drought-related putative genes from genome-wide association study of 123 drought stressed synthetic hexaploid wheats grown in 2016 in Konya, Turkey.  Another major haplotype block (18 kb) of three MTAs was observed on chromosome 3A in 2017 ( Figure 3) and the PVE on GY by this haplotype block was 13.1% ( Figure 3). The chromosome 3A is known to be an important chromosome that contains useful QTLs for GY and yield-related traits [17][18][19][20][21][22][30][31][32][33][34][35][36][37][38] and the haplotype block identified will have a significance in the crop improvement program. All three MTAs present in this haplotype block of chromosome 3A were found in the gene, TraesCS3A01G047300 (Table S5), which was annotated as a member of the F-box gene family (Table 4). These three SNPs were indicated as having a moderate impact on the protein as they resulted in a missense mutation and caused an amino acid change. Such changes may alter the function of the protein [39], which makes this F-box gene a strong candidate for future functional characterization studies under drought tolerance in wheat. The F-box proteins are known to regulate many important biological processes, such as embryogenesis, floral development, plant growth and development, biotic and abiotic stresses, hormonal responses, and senescence [39]. Two other MTAs observed on chromosome 3A and 3D were present within genes (F-box family protein: TraesCS3A01G445100 and disease resistance protein RPM1: TraesCS3D01G002700) that have been previously reported to be involved in drought tolerance [39,40] (Table 4).
The GY haplotype blocks and other MTAs identified in the present study have not been mapped to date and four MTAs were in the genes, of which functional annotations suggested that they are likely involved in drought tolerance. This result implied that haplotype blocks observed on chromosome 3A (3 MTAs) and 7A (16 MTAs), and one MTA on chromosome 3D (1) for GY are novel and may potentially be used in a marker-assisted breeding program, focusing on improving drought tolerance in wheat after validating them in different populations and environments.
Six MTAs for HI detected on chromosomes 2A, 3A, 6B, 6D, and 7B were found in genes and two of these genes have annotations suggesting their involvement in drought stress (Table 3 and Table S5). The two genes are WRKY transcription factor (TraesCS3A01G343700) found on chromosome 3A and cytochrome P450 (TraesCS6D01G170900.1) found on chromosome 6D. The role of WRKY transcription factor is well known in abiotic stresses including drought tolerance [42,43]. The cytochrome P450 genes are a large superfamily of enzymes and are involved in many metabolic pathways including drought tolerance in rice [44] and Arabidopsis [45,46]. The multi-trait marker associated with GY and HI was located on chromosome 5B (S5B_598463062) with PVE ranging from 15.9% to 18.7% (Table S5).
A novel haplotype block (38 kb) of three SNPs on chromosome 3A associated with BMWT was identified in 2017 ( Figure 3) with PVE by the haplotype block of 11.7%. This MTA (S3A_25012018) was also associated with GY and PVE ranged from 12.7% (GY) to 14.4% (BMWT).
All three MTAs present in this haplotype block were within genes (Table S5) and one of the genes had annotations suggesting its involvement in drought tolerance was an F-box family protein (TraesCS3A01G047300) ( Table 4) [39]. Excluding MTAs on haplotype block, eight MTAs for BMWT detected on chromosomes 1D, 2B, 6D, and 7B were found in genes (Table S5) and two of the genes had annotations suggesting its involvement in drought stress (Tables 3 and 4). The genes associated with two MTAs are F-box family protein (TraesCS7B01G242600) [39] and protein DETOXIFICATION containing multi-antimicrobial extrusion protein (MatE) (TraesCS1D01G357500) [23].
Eleven MTAs responsible for AWNLN detected on chromosomes 1D, 2A, 4D, 5A, 5B, 6B, and 7A were found in genes (Table S5) and four of these genes had annotations suggesting their involvement in drought stress (Tables 3 and 4). The genes associated with four MTAs involved in drought tolerance are 60S ribosomal protein L18a (chromosome 4D; TraesCS4D01G290700) [55], guanine nucleotide exchange family protein (chromosome 5A; TraesCS5A01G361300) [56], and F-box family protein (chromosome 5B; TraesCS5B01G038700 and chromosome 6B; TraesCS6B01G001000) [39]. It has been reported that the putative 60S ribosomal protein L18a is an upregulated transcript in response to drought stress in ears and silks during the flowering stage in maize [55].
Thirteen MTAs responsible for FLW detected on chromosomes 1A, 1B, 1D, 2D, 4B, 6B, and 6D were found in genes (Table S5) and three of these genes had annotations suggesting their involvement in drought stress (Tables 3 and 4). The genes associated with three MTAs involved in drought stress are citrate-binding protein (chromosome 1A; TraesCS1A01G326700) [62], F-box family protein (chromosome 6B; TraesCS6B01G042800) [39], and mitochondrial transcription termination factor-like (chromosome 6D; TraesCS6D01G040100) [63]. The SNPs S1A_516732460 and S6D_16376439 were annotated as a missense variant and thus may impact the function of the proteins that are annotated as citrate-binding protein and mitochondrial transcription termination factor-like protein, respectively (Table S5).
The multi-trait marker associated with FLW and FLA was located on chromosome 1A (S1A_516732460) with PVE ranging from 8.0% to 9.5% (Table S4). Another multi-trait marker associated with FLLN and FLA was located on chromosome 2A (S2A_29874199) with PVE ranging from 23.1% to 32.3% (Table S4). The multi-trait MTA indicates that the related candidate gene may affect multiple traits.

Root Length
RTLN is one of the most important traits under drought stress. We have measured RTLN 3-4 days after anthesis (Zadoks 60 growth stage) under the drought-stressed field condition using WinRhizo ® (WinRhizo reg. 2009c, Regent Instruments Inc., Quebec City, QC, Canada). This trait is very unique compared to previous studies where they focused on the roots of seedlings [67][68][69][70] rather than direct field-based measurements (labor intensive, time consuming, and expensive). Identification of QTL governing RTLN is very important in wheat, especially for the wheat grown under drought stress. Limited information is available on QTL related to root traits in wheat [67,[69][70][71][72].
Seven out of eight MTAs responsible for RTLN were present on chromosome 6D. Two haplotype blocks (the haplotype block1 with a size of 64 kb and the haplotype block2 with a size of 5kb) were identified from five out of seven MTAs for RTLN on chromosome 6D with PVE ranging from 5.0% to 11.8% (Figure 3). One SNP (S6D_435300571) present in the haplotype block2 was found in the gene (TraesCS6D01G332800) with PVE of 13.0%. The gene associated with this SNP is protein detoxification gene-containing multi-antimicrobial extrusion protein (MatE) ( Table 3) and has been reported to be expressed mainly in the root than shoots under drought stress [73]. For instance, MatE family genes such as HvAACT1 in barley [74] and TaMate in wheat [75], encode proteins that are primarily localized to root epidermis cells [74] and required for external resistance [23]. In the present study, this gene was also significantly associated with BMWT on chromosome 1D. This result implied that that this gene plays an important role for RTLN and BMWT in drought-stressed conditions.

Potential Candidate Gene Annotations Affecting Yield and Yield-related Traits under Drought Stress
This study identified~194 MTAs present on different chromosomes and associated with multiple traits. These 62 MTAs were associated with either the same trait in multiple years (MTA stability in different environments) or multiple traits within the same year or across years (suggesting epistasis) (Table S5). Additionally,~45 of the MTAs were present in genes with annotations relevant to the respective trait under drought stress (Tables 3 and 4). Interestingly, we noticed MTAs associated with the same or related traits were located within genes that had the exact same annotation (Figure 4; and Table S6). For instance, some of the MTAs for GY (2 MTAs), BMWT (2), TKW (1), AWNLN (2), FLLN (1), FLW (1), and STMDIA (1) were located within genes annotated as F-box family protein (Table S6). Similarly, the genes annotated as cytochrome P450 harbored MTAs for HI (1), TKW (1), FLA (2), GVWT (1), and FLLN (2). Additional examples are provided in Figure 4 and Table S6 in Supplementary Materials. This result indicated the likely gene families that are important for GY and yield-related traits under drought stress.

Site Description
A field experiment was conducted during two growing seasons (2016 and 2017) under drought stressed conditions (rainfed) at the research farm located at the Bahri Dagdas International Agricultural Research Institute in Konya, Turkey (37 • 51 15.894" N, 32 • 34 3.936" E; Elevation = 1021 m). This site was characterized by a low precipitation (below 300 mm), low humidity, and slightly alkaline clay loam soil [83].

Plant Materials and Experimental Design
One hundred twenty-three SHWs developed from two introgression programs were used (Table S1). The first group was developed by Kyoto University, Japan from one spring durum ('Langdon') parent crossed with 14 different Ae. tauschii accessions resulting in 14 different lines (Table S2). The remaining 109 lines were the second group of synthetics that were developed by International Maize and Wheat Improvement Center (CIMMYT) from the six durum parents crossed with 11 different Ae. tauschii accessions mainly from the Caspian Sea Basin area. Initially, 13 crosses among six winter durum wheats were involved in the creation of 13 different winter-type synthetics. However, due to the partial sterility, segregation, and continuous selection in the early generation, 109 lines were selected as unique lines because of their differences in phenotype [14] and their kinship values [11]. The synthetic genotypes used in the present study are unique and have been developed recently and tested for their agronomic traits [14], genetic diversity, and population structure [11]. The detailed information of these SHWs were provided in Bhatta et al. [11].
The experimental design was an augmented design (plot size: 1.

Trait Measurements
The GY was obtained by harvesting four middle rows of 1.008 m 2 (i.e., 84 m × 120 m) and reported in g·m −2 . The HI, BMWT, TKW, GVWT, AWNLN, FLLN, FLW, and FLA (0.8 × FLLN × FLW) were measured using previously reported protocols [18,32,59,61]. The STMDIA was measured from five randomly selected plants per plot using a digital Vernier caliper at the second internode from the soil surface at physiological maturity. The RTLN was measured from randomly selected three plants per plot after 3-4 days of flowering (Zadoks 60 growth stage) using WinRhizo software (WinRhizo reg. 2009c, Regent Instruments Inc., Quebec City, QC, Canada).

Phenotypic Data Analysis
A combined ANOVA was performed using the following model: where y ijklm is the GY and yield-related trait; µ is overall mean; Yr i is the effect of ith year; R(Yr) ji is the effect of jth replication within the ith year; B(R(Yr)) kji is the effect of kth incomplete block within jth replication of ith environment; C l is the lth checks; G mkji is the effect of mth genotypes (new variable, where check is coded as 0 and entry is coded is 1 and the genotype is taken as a new variable × entry) within the kth incomplete block of jth replication in the ith year; GXYr mi is the interaction effect of mth genotype and ith year; and e ijklmn is the residual. In the combined ANOVA, year and check were assumed as fixed effects, whereas genotype, genotype × year interaction, replication nested within a year, and incomplete block nested within replications were assumed as random effects. Individual analyses of variance were performed because most of the traits had highly significant genotype × year interaction and therefore will be discussed hereafter. An augmented design was analyzed using the following model for the estimation of BLUPs in the year 2016: where y ijkl is the trait, µ is the overall mean; B i is the effect of ith incomplete block; C j is the jth check; G k(i) (new variable, where check is coded as 0 and entry is coded as 1 and genotype is taken as a new variable × entry) is the effect of kth genotypes within the ith block; e ijkl is the residual. In ANOVA calculated for the 2016 datasets, the check was assumed as a fixed effect, whereas genotype and incomplete block were assumed as random effects. An alpha (α) lattice design with two replications was analyzed using the following model for the estimation of BLUPs in the year 2017: where y ijk is the trait, µ is the overall mean; R i is the effect of ith replication; B(R) ji is the effect of jth block within the ith replication; C k is the kth checks; G lji (new variable, where check is coded as 0 and entry is coded as 1 and the genotype is taken as a new variable × entry) is the effect of kth genotypes within jth incomplete block of ith replication; e ijkl is the residual. In ANOVA calculated for the 2017 datasets, the check was assumed as a fixed effect, whereas genotype, replication, and incomplete block nested within replication were assumed as random effects. All phenotypic data were analyzed using PROC MIXED in SAS 9.4 (SAS Institute Inc., Cary, NC) [84] using the restricted maximum likelihood (REML) approach unless mentioned otherwise.
Broad-sense heritability for each trait in each year was calculated based on entry mean basis using Equations (4-6) for 2016, 2017, and combine0d experiments, respectively: where σ 2 g , σ 2 yr , σ 2 gxyr , and σ 2 e are the variance components for genotype, year, genotype × year, and error, respectively, and n and r are the numbers of years and replications, respectively. Pearson's correlation of GY and yield-related traits was calculated based on BLUPs for each trait in each year using PROC CORR in SAS. The PC biplot analysis (PCA-biplot) was performed based on the correlation matrix to avoid any variation due to the different scales of the measured variables using 'factoextra' package in R software [85].

Genotyping and SNP Discovery
Genomic DNA was extracted from two to three fresh young leaves of 14-day-old seedlings using BioSprint 96 Plant Kits (Qiagen, Hombrechtikon, Switzerland), as described in Bhatta et al. [11]. The GBS libraries were constructed in 96-plex following digestion with two restriction enzymes, PstI and MspI [86] and pooled libraries were sequenced using Illumina, Inc. (San Diego, CA, USA) next-generation sequencing platforms at the Wheat Genetics Resource Center at Kansas State University (Manhattan, KS, USA). SNP calling was performed using TASSEL v. 5.2.40 GBS v2 Pipeline (available online: https://bitbucket.org/tasseladmin/tassel-5-source/wiki/Tassel5GBSv2Pipeline) [87] with physical alignment to the Chinese Spring genome sequence (RefSeq v1.0) made available by the IWGSC [88] using default settings with the one exception that the number of times for a GBS tag to be present and included for SNP calling was changed from the default value of 1 to 5 to increase the stringency in SNP calling. The identified SNPs with MAF less than 5% and missing data more than 20% were removed from the analysis. All lines had missing data less than 20% and none of them were dropped due to missing percentage-filtering criterion. The GBS-derived SNPs are provided in Table S3 in Supplementary Materials.
Many GWASs were previously performed using the mixed linear model (MLM), where the population structure (Q) or PC was set as a fixed effect and kinship (K) as a random effect to control false positives [91,92]. However, the MLM may lead to confounding between population structure, kinship, and quantitative trait nucleotides (QTNs) that results in false negatives due to model overfitting [78]. Recently, the multilocus mixed model (MLMM), which tests multiple markers simultaneously by fitting pseudo QTNs, in addition to testing markers in stepwise MLM, has been proposed, which is advantageous over conventional GLM and MLM testing one marker at a time [78]. One of the examples of recently popular GWAS analysis algorithm that is based on MLMM is FarmCPU [78,79]. The FarmCPU uses a fixed effect model (FEM) and a random effect model (REM) iteratively to remove the confounding between testing markers and kinship that results in false negatives, prevents model overfitting, and control false positives simultaneously [78]. Therefore, GWAS was performed on the adjusted BLUPs for each trait in each year to identify SNPs associated with GY and yield-related traits in SHWs using FarmCPU with population structures (Q 1 and Q 2 ) or first three principal components (PC 1 , PC 2 , and PC 3 ) as covariates by looking at the model fit using Quantile-Quantile (Q-Q) plots and FarmCPU-calculated kinship [78] implemented in MVP R software package (available online: https://github.com/XiaoleiLiuBio/MVP). A uniform suggestive genome-wide significance threshold level of p-value = 9.99 × 10 -5 (−log 10 p = 4.00) was selected for MTAs considering the deviation of the observed test statistics values from the expected test statistics values in the Q-Q plots [28,80] from the two-year results of the present study.

Haplotype Block Analysis
Haplotype blocks with linkage disequilibrium (LD) values (squared correlation coefficient between locus allele frequency; R 2 > 0.2) in adjacent regions (<500 kb) of significant MTAs were visualized and plotted using default parameters (Hardy-Weinberg p-value cut off at 1% and MAF > 0.001) of Haploview software (available online: https://www.broadinstitute.org/haploview/ haploview) [81]. PVE by each haplotype block on the trait of interest was calculated using multiple regression analysis that accounted for the population structure by removing the haplotype allele of less than 5% in SAS using PROC REG.

Putative Candidate Gene Analysis
The genes underlying the MTAs and subsequently their annotations were retrieved using a Perl script and the IWGSC RefSeq v1.0 annotations [88] provided for the Chinese Spring. The underlying genes were further examined for their association with GY and yield-related traits under drought stress using previously published literature. Additionally, the SnpEff program (available online: http://snpeff.sourceforge.net/) was used for SNP annotation and predicting the effects of SNPs on the protein function. The MTAs present within genes or flanked (5 kb) by genes were investigated [93].

Conclusions
The present study showed SHWs have large amounts of genetic variation for GY and yield-related traits. The GWAS in 123 SHWs using 35,648 SNPs identified several novel (90 MTAs: 45 MTAs on the A genome, 11 on the B genome, and 34 on the D-genome) genomic regions or haplotype blocks associated with GY and yield-related traits in drought-stressed conditions. Most of the MTAs (120 MTAs) were present in genes ad several of them (45 MTAs) were annotated with functions related to drought stress. This provided further evidence for the reliability of the MTAs identified. We also identified MTAs on different chromosomes associated with multiple traits but within genes having the same annotation. This resulted in the identification of candidate genes belonging to the same gene family that likely have a major role in affecting GY and yield-related traits under drought stress in SHWs. The large number of MTAs, especially on the D-genome (53 MTAs with 34 MTAs being novel) identified in the present study, demonstrate the potential of SHWs for elucidating the genetic architecture of complex traits and provide an opportunity for further improvement of wheat under rapidly growing drought-stressed environment worldwide.

Conflicts of Interest:
The authors declare no conflicts of interest.