High-Density Linkage Mapping of Agronomic Trait QTLs in Wheat under Water Deficit Condition using Genotyping by Sequencing (GBS)

Improvement of grain yield is the ultimate goal for wheat breeding under water-limited environments. In the present study, a high-density linkage map was developed by using genotyping-by-sequencing (GBS) of a recombinant inbred line (RIL) population derived from the cross between Iranian landrace #49 and cultivar Yecora Rojo. The population was evaluated in three locations in Iran during two years under irrigated and water deficit conditions for the agronomic traits grain yield (GY), plant height (PH), spike number per square meter (SM), 1000 kernel weight (TKW), grain number per spike (GNS), spike length (SL), biomass (BIO) and harvest index (HI). A linkage map was constructed using 5831 SNPs assigned to 21 chromosomes, spanning 3642.14 cM of the hexaploid wheat genome with an average marker density of 0.62 (markers/cM). In total, 85 QTLs were identified on 19 chromosomes (all except 5D and 6D) explaining 6.06–19.25% of the traits phenotypic variance. We could identify 20 novel QTLs explaining 8.87–19.18% of phenotypic variance on chromosomes 1A, 1B, 1D, 2B, 3A, 3B, 6A, 6B and 7A. For 35 out of 85 mapped QTLs functionally annotated genes were identified which could be related to a potential role in drought stress.


Introduction
Drought caused by climate change affects crop production, especially under waterlimited environments. Wheat (Triticum aestivum L.) is one of the fundamental global crops which are mostly cultivated in dry regions. Here water deficit can cause up to 50% yield reduction in wheat if compared to production under irrigation conditions [1]. In this context breeding for drought tolerance is a major concern globally. Components of the grain yield in wheat are complex and multigenic with low heritability which are highly affected by environmental conditions, especially drought stress [2]. Moreover, yield and yield components traits like number of spikes, number of grains per spike, and 1000 kernel weight as well as some morphological and physiological traits have been recognized as relevant traits to drought tolerance [3]. To develop drought tolerant varieties it is essential to understand the genetic basis of these complex traits [4], thus identification of quantitative trait loci (QTL) and genes for important agronomical traits such as grain yield and its components is a key component of crop improvement programs [5]. To date, a large number of QTLs associated with yield component traits and grain yield under water deficit condition [1][2][3][6][7][8][9][10][11][12][13][14] and normal condition [15][16][17][18][19][20][21][22][23][24][25][26][27][28][29] have been identified on all 21 wheat chromosomes.
To precisely map QTLs underlying the traits of interest, a high-density linkage map is a prerequisite. Today, single nucleotide polymorphisms (SNPs) are the marker of choice

Phenotypic Data Analysis
A total of 148 RILs as well as parents were grown at three locations and over two years to test performance under irrigation and water deficit regimes. Eight traits were recorded: GY, PH, SM, TKW, GNS, SL, BIO and HI. The combined mean distribution of the traits over six environments (three locations and two years) under irrigated and water deficit conditions are presented in Figure 1a,b, respectively. The results revealed the normal distribution of the traits in the population. For all the studied traits except HI, #49 had higher mean values compared with Yecora Rojo under both conditions (Table S1 supplementary data). GY was positively correlated with all other traits under irrigated and water deficit conditions; however, the correlation between PH and GY under water deficit conditions was not significant. Strong correlations were observed between GY and HI as well as BIO under both conditions. The correlations between SM and PH, SL and GNS were negative and significant under irrigated conditions, whereas these correlations were not significant under water deficit conditions. BIO showed significant positive correlation with all the traits under both conditions except for HI. Negative correlations of SM with GNS, PH, SL and TKW indicate competition between yield components and traits such as plant height for available resources during plant growth (Table 1).

Genome Wide SNP Discovery by GBS
In total, 481.3 million reads were generated and 98% of them were mapped to the Chinese Spring reference genome. The number of reads per samples varied from 2.0 to 4.6 million reads with an average of 3.0 million reads. We required a read depth of four independent sequence reads for homozygous and heterozygous genotype calls, thus our 4-fold coverage SNP matrix was used in subsequent analysis. A total of 35,405 SNPs with a MAF = 0.05 were detected by the SAMtools pipeline based on wheat RefSeqv1.0 [36]. After removing SNPs with ≥ 20% missing values, a total of 7788 polymorphic SNPs distributing across all 21 wheat chromosomes, based on their predicted physical position, were used for further analyses ( Table 2). The maximum (785) and minimum (27) number of SNPs were mapped on chromosome 2B and 4D, respectively.

Linkage Map Construction
A total of 7788 SNP markers were used for map construction and out of which 5831 SNPs were mapped on 21 linkage groups corresponding to the 21 chromosomes of hexaploid wheat ( Figure 2). The linkage map spanned a total length of 3642.14 centi-Morgan (cM) with an average marker density of 0.62 cM. A total of 2933 (50%) and 2454 (42%) markers were mapped to the B and A genome, respectively, covering 1519.70 and 1566.64 cM of these two genomes. Only 444 out of 5831 SNPs were mapped to the D genome spanning a length of 555.8 cM, indicating a much lower level of polymorphism in D compared to the A and B genomes. The number of markers in the different chromosomes ranged from 570 (2B) to 15 (4D), respectively, and chromosome 3D exhibited the lowest marker density (2.89 cM between two adjacent markers) (

Linkage Map Construction
A total of 7788 SNP markers were used for map construction and out of which 5831 SNPs were mapped on 21 linkage groups corresponding to the 21 chromosomes of hexaploid wheat ( Figure 2). The linkage map spanned a total length of 3642.14 centiMorgan (cM) with an average marker density of 0.62 cM. A total of 2933 (50%) and 2454 (42%) markers were mapped to the B and A genome, respectively, covering 1519.70 and 1566.64 cM of these two genomes. Only 444 out of 5831 SNPs were mapped to the D genome spanning a length of 555.8 cM, indicating a much lower level of polymorphism in D compared to the A and B genomes. The number of markers in the different chromosomes ranged from 570 (2B) to 15 (4D), respectively, and chromosome 3D exhibited the lowest marker density (2.89 cM between two adjacent markers) ( Table 2).

QTL Mapping
A total of 85 QTLs were identified for the studied traits across three locations (Mahabad, Miandoab and Tabriz, Iran) during two years and mean of three locations under irrigated and water deficit conditions ( Table 3). The Logarithm of Odds (LOD) score of the detected QTLs varied from 2.55 to 10.60 with an average 4.5 explaining 6.06 to 19.25% of the trait`s phenotypic variation (mean 10.6%). No QTL was identified on chromosomes

QTL Mapping
A total of 85 QTLs were identified for the studied traits across three locations (Mahabad, Miandoab and Tabriz, Iran) during two years and mean of three locations under irrigated and water deficit conditions ( Table 3). The Logarithm of Odds (LOD) score of the detected QTLs varied from 2.55 to 10.60 with an average 4.5 explaining 6.06 to 19.25% of the trait's phenotypic variation (mean 10.6%). No QTL was identified on chromosomes 5D and 6D. The number of QTL varied from 19 for PH to seven for SM, GNS and BIO. For 38 QTLs, additive effect was positive and for 47 QTLs was negative, indicating inheritance to the offspring of favorable alleles in theses loci from either #49 or Yecora Rojo, respectively. Five QTLs were stably identified across irrigated and water deficit conditions. The chromosomes 4A and 6B possessed QTLs associated with most of the traits (PH, SL, GNS, BIO and GY) and (PH, SM, TKW and GY), respectively. Genetic positions of yield-related trait QTLs are shown in Figure 3a-c.     Qbio-Irr.mia-1D 0  5  10  15  20  25  30  35  40  45  50  55  60  65  70  75  80  85  90  95  100  105  110  115  120  125  130  135  140  145 Figure 3a-c). Among these, six QTLs on chromosomes 2A, 4A and 6B (4) and two QTLs on chromosome 2A were identified under irrigated and water deficit conditions, respectively. The favorable alleles for GY at the loci on chromosomes 4A and 6B were contributed by #49, and those on chromosome 2A were inherited from Yecora Rojo.

Plant Height (PH)
For plant height, 10 QTLs on chromosomes 2B, 3A, 3D, 4B (2), 4D, 6B (2), 7A and 7B with LOD ranging from 3.59 to 10.60 and R 2 from 6.20 to 19% were mapped under irrigated conditions (Table 3, Figure 3a-c). Under water deficit conditions, nine QTLs for PH were identified on chromosomes 4A, 4B, 4D, 7A (5) and 7D with LOD ranging from 3.81 to 6.13 explaining 8.9-14.0% of the trait phenotypic variance. The favorable alleles were contributed by Yecora Rojo at all loci except for QTL on chromosome 6B under water deficit conditions. The two QTLs on chromosome 4B and 4D were consistent under both conditions.

Number of Spikes per Square Meter (SM)
Six QTLs on chromosomes 5A, 2B and 6B under irrigated and one QTL on chromosomes 2B under water deficit conditions were mapped. The LOD scores ranged from 3.68 to 8.01 with R 2 value ranging from 8.70 to 17.6%. The #49 contributed the favorable alleles at all the loci except for QTLs on chromosome 2B (Table 3, Figure 3a-c).

Number of Grains per Spike (NGS)
Seven QTLs were identified for number of grains per spike on chromosomes 1A, 3A, 4A (3) and 7D (2) with LOD score from 2.82 to 7.58 and R 2 from 6.6 to 19.25% and both parents contributed the favorable alleles (Table 3, Figure 3a-c). Among these, three QTLs were found under irrigated and four QTLs under water deficit conditions. The QTL on 4A was stable at two locations and under two conditions.  (Table 3, Figure 3a-c). HI QTLs explained an average of 10.63% phenotypic variation and the LOD score ranged from 2.55 to 6.43. Both parents contributed alleles to affect this trait.

Discussion
Improving grain yield under water deficit conditions is of utmost importance for wheat breeding programs for arid and semi-arid regions of the world. To study drought tolerance related genomic regions, QTL mapping has been conducted using GBS in a RIL population derived from a cross between #49 × Yecora Rojo under irrigated and water deficit conditions. Due to simultaneous discovery and genotyping of a large numbers of SNPs, cost-efficiency and flexibility, GBS has been applied to mapping of QTLs in several plant species including wheat [21,[32][33][34]37,38]. We employed GBS to genotype 148 recombinant inbred lines and two parental lines. After filtering against low-quality markers, 7788 high quality SNPs were used to generate a linkage map.
In the present study, 85 putative QTLs for grain yield and its related traits were identified on 19 out of 21 chromosomes of hexaploid wheat (Table 3, Figure 3a-c). We could identify unique QTLs for studied traits that were likely not detected in the previous studies. This population revealed high-effect QTL under water deficit conditions for spike length in the interval of 59.9-75.5 Mb of 7A with 19.18% R 2 and 7.15 LOD score (Qsl-Wd.mah-7A) and favorable alleles were contributed by Yecora Rojo. SL QTL Qsl-Wd.mia-2B detected on 2B spanning from 26.5 to 33.7 Mb (with 11.64% R 2 ) has not also been reported. For TKW under water deficit conditions, two unique QTLs, Qtkw-Wd.mah-3B and Qtkw-Wd.mia-3B, with 11.59 and 8.60% R 2 , were identified on 3B (20.4-28 Mb). GNS QTL, Qgns-Wd.mia-1A, was mapped on 1A (12-14.8 Mb) and explained the 11.10% phenotypic variation. The genomic regions on chromosome 1B contained BIO QTLs (Qbio1-Wd.mia-1B and Qbio2-Wd.mia-1B) spanning from 9.6 to 16.4 Mb and HI QTLs (Qhi1-Wd.tab-1B and Qhi2-Wd.tab-1B) spanning from 420.5 to 427 Mb. In this study, some unique QTLs related to irrigation conditions were also identified, the high effect of which was Qhi-Irr.mia-6A (580-595 Mb) with a 14.20% R 2 and 6.43 LOD score. Two QTLs on chromosome 6B spanning from 28 to 35 Mb (43 and 49 cM) were co-localized for GY and SM under irrigation conditions that explained an average of 9.28% phenotypic variation, and the LOD score ranged from 3.65 to 4.22. BIO QTL (Qbio-Irr.mia-1D) on chromosome 1D (34-250 Mb) and PH QTL (Qph-Irr.3P-3A) on chromosome 3A (33.4-37.2 Mb) were identified under irrigation conditions and explained 10.78 and 8.30% of the phenotypic variation, respectively. To date, numerous QTL associated with grain yield and yield components have been reported by using various populations. However, different QTL expression and its position may be influenced by factors such as type and size of population, density of the linkage map, analytical tools and the environmental conditions used in different studies [39].
We aimed to predict candidate genes for QTL function on the basis of the structural and functional gene annotation provided with Chinese Spring RefSeqv1.0 [36]. The important putatively drought-related genes underlying the QTL intervals are listed in Table 4. Since the physical intervals of several QTLs overlapped with each other, the same annotated gene is co-located with more than one QTL. Potential candidate genes underlying Qtkw-Wd.tab-7B, Qsl1-Irr.tab-4A and Qph1-Irr.mia-6B were annotated as F-box family protein. F-box proteins play a role in responses to hormones [54,55], light and abiotic stress [56,57]. Other putatively drought-related genes for PH QTLs on chromosomes 4B under irrigated and water deficit conditions belonged to the gene class of protein kinases. Generally, protein kinases are involved in drought response as regulatory proteins. Mao et al. [58] reported that a serine/threonine-protein kinase of wheat increases multi-stress tolerance in Arabidopsis. The gene underlying Qph1-Wd.mia-7A and Qph1-Wd.3P-7A identified under water deficit condition was a 60 kDa chaperonin protein.
The 60 kDa chaperonin, like other heat-shock proteins, are induced by hot temperatures, salinity, cold and water deficit stress as a protective mechanism [59,60]. NBS-LRR disease resistance protein genes were present in chromosomal regions of two QTLs under water deficit condition for grain yield and spike length on chromosomes 2A and 2B, respectively. Improvement of drought tolerance by enhanced expression of NBS-LRR genes was reported by Chini et al. [61]. The function of candidate genes underlying Qtkw-Wd.mah-3B and Qsl-Wd.3P-2A were annotated as AP2-EREBP transcription factor and ABC transporter protein, respectively. Liu and et al. [62] and Song and et al. [63] reported the role of an AP2-EREBP transcription factor in signal transduction pathways in drought and ABA responses. Furthermore, some members of the ABC transporter family proteins mediate the transport of acyl-coenzyme A [64] which contributes to drought tolerance. An NAD(P)H-quinone oxidoreductase was located within marker intervals for GNS QTLs (Qgns2-Irr.tab-4A, Qgns-Wd.3P-4A and Qgns2-Irr.tab-4A) on chromosome 4A. NAD(P)H-quinone oxidoreductase inhibits the production of semiquinones and oxygen radicals by the reduction of quinones to quinols [65]. Cysteine proteinase genes were co-located with grain yield QTLs on chromosome 6B identified in irrigated condition. The role of cysteine proteinases family in wheat under severe drought [66] and salinity, drought, oxidation and cold stress [67] has been established. Another important response to abiotic stress particularly drought is the elimination of reactive oxygen species (ROS) which is performed by different kind of detoxification proteins. In present study, a gene related to detoxification function was found within QTL intervals for PH on chromosome 7A (Qph-Irr.tab-7A). Cytochrome P450 genes were identified for HI QTLs intervals on chromosomes 2A (Qhi2-Irr.mia-2A) and 1B (Qhi2-Irr.tab-1B). Cytochrome P450 genes are members of a large superfamily enzyme which involved in drought stress response [68]. Ubiquitin-conjugating enzyme E2 was annotated on QTLs underlying 1000 kernel weight on chromosome 1B (Qtkw1-Irr.mah-1B and Qtkw-Irr.mia-1B). Ubiquitinylation is a multi-step reaction for protein degradation and plays an important role in light signaling, biotic and abiotic stress responses [69].

Plant Materials
The mapping population consisted of 148 F8 recombinant inbred lines (RILs) derived from a cross between genotype Iran #49 and Yecora Rojo. Iran #49 is a tall late spring landrace collected at Allary, 30 • 56 , 61 • 39 , alt. 530 m, average rainfall = 50 mm, in Bluchestan, southeast Iran with a large root system. Yecora Rojo is spring wheat variety from Mexico which was cultivated in Southern California for more than 40 years, which is carrying two dwarfing genes and is characteristic for its shallow root system [35]. The parental lines were different for a number of phenological, morphological, and agronomic traits including grain yield and plant height. In each station, the 148 RILs and two parental lines were planted in an alpha lattice design with two replications each consisting of ten incomplete blocks with 15 genotypes, both under water deficit and irrigation conditions. Each line was sown in three rows of 2.5 m length with inter-and in-row spacing of 20 and 5 cm, respectively. Irrigation in non-stress conditions was done after 70 mm evaporation from class A Pan corresponded to soil water potential of −0.5 MPa and in water deficit conditions was performed after 130 mm evaporation from class-A Pan corresponded to soil water potential of −1.2 MPa. In the stressed treatment, water deficit stress was induced by stopping irrigation from 50% heading to physiological maturity in order to simulate terminal drought stress. The following phenotypic traits were recorded: The number of spikes per square meter (SM) was recorded at physiological maturity and plant height (PH) was measured in centimeters (cm) from the ground to the tip of the spike from 10 randomly sampled and tagged plants in each plot before harvesting. The spike length (SL) [measured in cm] and the numbers of grains per spike (NGS) were recorded after harvesting from the main tillers of 10 randomly selected plants. Thousand kernel weight (TKW) was determined measured from randomly sampled 1000 seeds after harvest and expressed in g/1000 seed. The grain yield per plot (GY) and biomass were determined as the weight (grams) of the grain from a plot where the plot sizes were 2.5 m rows with 50 plants after eliminating borders. Finally, the harvest index (HI) was calculated as the proportion of the total biomass devoted to grain yield.

GBS Library Construction, Genotyping and SNP Calling
DNA was extracted and purified from the leaf tissue of the RILs and parental lines using Guanidine thiocyanate-based DNA isolation technique. Briefly, six centimeters of two-week-old of leaves were harvested and transferred into 1.1 mL 8-strip mini tubes (supplier, type) together with two 4mm glass beads and homogenized by using a 96-well block holder and a Retsch MM 400 mixer mill (Retsch GmbH, Haan, Germany) for a minimum of 1 min at 30 Hz frequency. 600 µL preheated GTC extraction buffer (1 M Guanidine thiocyanate, 2 M NaCl, 30 mM NaAc pH 6.0) was added to the tubes and incubated at 65 • C for 30 min. After spinning for 30 min at 4000 rpm using a table top centrifuge, 480 µL of the supernatants were transferred to a 96-Well AcroPrep Advance plate (Thermo Fisher Scientific GmbH, Dreieich, Germany) for vacuum filtration. 900 µL wash buffer (50 mM NaCl, 10 mM Tris/HCl pH 8, 1mM EDTA, Ethanol 70%) was added twice and the vacuum was applied with a vacuum manifold after every washing step. The 96-Well AcroPrep Advance plate was placed onto a NUNC 96-well plate (Thermo Fisher Scientific GmbH, Dreieich, Germany) and 100 µL preheated 1x TE light elution buffer (0.1 mM EDTA, 10 mM Tris/HCl pH 8) was pipetted to each well. Extracted DNA was eluted by spinning at 3500 rpm for 10 min. DNA quality and concentration were checked by running electrophoresis of 1% agarose gels (Invitrogen, Carlsbad, CA, USA) in 1X TBE buffer (89 mM Tris, 89 mM Boric acid, 2 mM EDTA pH 8.0) and by using Qubit 2.0 Fluorometer (Invitrogen, Carlsbad, CA, USA) were used.
GBS libraries were constructed for each RIL and parental lines according to a previously described protocol [70]. Libraries comprising 150 individually barcoded samples were sequenced as a pool in three lanes of an Illumina HiSeq2500 (Illumina Inc., San Diego, CA, USA). The length of single-end reads was 100 bp. Each of the RILs and parents were sequenced three and six times, respectively, to specify the minimum required number of reads per sample. Raw data processing followed a previously established reference-based GBS pipeline using SAMtools [71]. First, sequence reads were quality-and adaptor-trimmed and then mapped against Chinese Spring RefseqV1.0 [36]. Variant filtering was done with the following parameters: minimum minor allele frequency 0.05, maximum fraction of missing genotype call 0.9 and minimum read depth 4 for both of homozygous and heterozygous genotype calls. As the maximum expected residual heterozygosity for RIL population is 10%, this parameter was set to 0.1. Identified SNPs were named as "chromosome name_ physical position" i.e., chr4A_614700608. All unknown or heterozygous SNPs in the parents as well as those with >20% missing data were excluded from further analysis.

Linkage Map Construction and QTL Mapping
A total of 7788 SNPs were used for construction of a linkage map based on 148 RILs. Linkage analysis was performed using the regression mapping function in JoinMap 4.1 (Kyazma B.V., Wageningen, The Netherlands) [72] with LOD values in the range from 3 to 8. The Kosambi mapping function was used to convert recombination frequency to genetic distance based on centi-Morgan [73]. The identical markers and unmapped SNPs were removed automatically by JoinMap. The loci with significant segregation distortion (p ≤ 0.01) were also excluded. QTLs were identified by composite interval mapping (CIM) based on model 6 and forward and backward regression to select cofactors implemented in WinQTL Cartographer V2.5 software [74]. The QTL threshold for the studied traits (LOD significance threshold) was estimated by 1000 permutations. The percentage of phenotypic variance explained by each QTL and their additive effect were also evaluated. Each QTL was denominated as "Q" (abbreviation of QTL)+ trait name+ trial name+ chromosome name. For example, Qtkw-Wd.mia-3B reveals QTL for thousand kernel weight under water deficit conditions in Miandoab in which it stands on 3B chromosome. MapChart (version 2.3) software [75] was used for the graphical presentation of linkage maps and QTL positions. The physical position of flanking markers surrounding identified QTLs were extracted and used to search on the IWGSC_RefSeq v1.0 assembly. Gene IDs present in each of physical intervals were obtained from JBrowse. The annotation of the gene was then retrieved from the Functional Annotation_v1 of IWGSC_RefSeqv1.0, which is publicly available at https://wheat-urgi.versailles.inra.fr/Seq-Repository/Annotations (accessed on 21 October 2018).

Conclusions
The results of present study indicate that GBS methods provide an opportunity to develop a high-density linkage map for the RIL population derived from a cross between #49 and Yecora Rojo, and precise identification of QTLs for grain yield and its components. For grain yield and its related traits 20 novel QTLs explaining 8.87-19.18% of phenotypic variance were mapped on chromosomes 1A, 1B, 1D, 2B, 3A, 3B, 6A, 6B and 7A. We could identify two stable QTLs for plant height under irrigated and water deficit conditions, which they were associated with Rht-B1 and Rht-D1 genes on chromosomes 4B and 4D, respectively. Some linkage groups had QTLs associated with several traits showing pleiotropic effects, which might be interesting for gene pyramiding. In addition, annotation of genes underlying the QTL intervals revealed their potential role in drought stress responses. These QTLs and potential candidate genes can be further analyzed and validated for breeding applications.  Table S2: Coincidences of significant QTLs in this study with QTL regions reported previously; Figure S1: Coincidences of significant QTLs in this study with QTL regions reported previously.