Genome-Wide Association Mapping through 90K SNP Array for Quality and Yield Attributes in Bread Wheat against Water-Deficit Conditions

The decrease in water resources is a serious threat to food security worldwide. In this regard, a genome-wide association study (GWAS) was conducted to identify grain yield and quality-related genes/loci under normal and water-deficit conditions. Highly significant differences were observed among genotypes under both conditions for all studied traits. Water-deficit stress reduced grain yield and increased grain protein contents (GPC) and gluten contents (GLC). Population structure analysis divided the 96 genotypes into four sub-populations. Of 72 significant marker-trait associations (MTAs), 28 and 44 were observed under normal and water-deficit stress conditions, respectively. Pleiotropic loci (RAC875_s117925_244, BobWhite_c23828_341 and wsnp_CAP8_c334_304253) for yield and quality traits were identified on chromosomes 5A, 6B and 7B, respectively, under normal conditions. Under water-deficit conditions, the pleiotropic loci (Excalibur_c48047_90, Tdurum_contig100702_265 and BobWhite_c19429_95) for grain yield per plant (GYP), GPC and GLC were identified on chromosomes 3A, 4A and 7B, respectively. The pleiotropic loci (BS00063551_51 and RAC875_c28721_290) for GPC and GLC on chromosomes 1B and 3A, respectively, were found under both conditions. Besides validating previously reported MTAs, some new MTAs were identified for flag leaf area (FLA), thousand grain weight (TGW), GYP, GPC and GLC under normal and water-deficit conditions. Twenty SNPs associated with the traits were mapped in the coding DNA sequence (CDS) of the respective candidate genes. The protein functions of the identified candidate genes were predicted and discussed. Isolation and characterization of the candidate genes in which SNPs were mapped to the CDS will help discover novel genes underpinning water-deficit tolerance in bread wheat.


Introduction
Wheat is considered one of the most important crops worldwide. Ensuring sustainable wheat production to meet the needs of an increasing population is a serious challenge for wheat scientists. GWAS explores the genetic mechanisms of traits and the genes responsible for them. It is a useful technique that gives more accurate results because association panels carry greater genetic diversity and historical recombination of alleles [17].
A plethora of studies has been conducted to detect quantitative trait loci (QTL) for yield and quality traits under normal and water-deficit conditions in different association panels. Modern wheat breeding depends on exploring the genetic and molecular mechanisms of high-temperature and water-deficit tolerance through association and QTL mapping. Nowadays, high-density single nucleotide polymorphism (SNP) markers are used in genome-wide association studies (GWAS) to detect the genomic regions associated with target traits in wheat [19,20]. About 800 quantitative trait loci (QTLs) and marker-trait associations (MTAs) have been reported for water-deficit tolerance traits (quality and yield-related traits) using bi-parental mapping (691 QTLs) and genome-wide association studies (GWASs; 109 MTAs) in wheat. However, only 68 of these are major QTLs that explain more than 19% of the phenotypic variation [21]. QTL analysis for wheat quality has been reported mainly for grain protein traits [22], e.g., grain protein content [23] and gluten contents [24]. Herein, we performed a GWAS to identify loci/genes underpinning the major grain yield and quality traits using the 90K SNP assay and two years of field data on a selected panel of 96 spring wheat genotypes grown under normal and water-deficit conditions. Candidate genes for the significant MTAs were identified, and their protein functions were predicted and discussed.

Germplasm Collection and Experimental Layout
Seeds of the association panel of 96 bread wheat accessions were obtained from the Department of Plant Breeding and Genetics, University of Agriculture, Faisalabad (PBG-UAF). The genotype code, name, pedigree record and origin of each accession are given in Table S1. Of the 96 accessions, 22 were developed at PBG-UAF Pakistan, 24 were introduced from CIMMYT and 50 were historical approved Pakistani varieties. The association panel was grown under normal and water-deficit conditions in a randomized complete block design (RCBD) with three replications during two crop seasons, 2016-17 and 2017-18. In the normal experiment, irrigation was applied at three critical stages, i.e., (1) tillering (35 days after sowing (DAS)), (2) the booting stage (85 DAS) and (3) the milking stage (112 DAS) [14]. Water-deficit stress was applied at the tillering stage by withholding irrigation. Each genotype was sown in a one-meter-long row with three replications, maintaining a plant-to-plant distance of 15 cm and a row-to-row distance of 30 cm. Two seeds of each genotype were dibbled per hole, and one healthy seedling was retained after germination by thinning. One set of genotypes was irrigated at all three critical stages, while the other set of the same genotypes was kept under water-deficit stress by skipping irrigation at the first critical stage (tillering, 35 DAS) [14]. All regular agronomic practices such as fertilizer application, hoeing and weeding were applied equally in both conditions during both seasons to minimize experimental error.

Data Recording and Statistical Analysis
At maturity, data were collected from 10 plants per replication for thousand-grain weight (TGW) and grain yield per plant (GYP) in the normal and water-deficit environments. Flag leaf area (FLA) was measured in cm² using the equation FLA = flag leaf length (cm) × flag leaf width (cm) × 0.74 [25]. After the wheat samples were cleaned of foreign matter, protein and gluten contents in the whole grain were determined using an Omeg Analyzer G device [26]. The device is a computer-controlled, dual-beam near-infrared analyzer for determining protein and gluten contents in whole wheat grain. Its key features are: near-infrared wavelength range: 730 nm-1100 nm; increment: 0.5 nm; variable path length: 8-30 nm; data increment: <1 nm; and analyzing time: 50 s [26,27]. Recorded data for all studied attributes were subjected to analysis of variance (ANOVA) using GenStat® version 17 (VSN International) [28]. A pooled analysis of variance was performed for the studied germplasm. Broad-sense heritability (H²) was calculated for both seasons from the combined average data of normal and water-deficit conditions using the model given by Bhatta and colleagues for wheat [22]. Pearson's correlation coefficients (r) were calculated to assess the association among yield and quality traits in normal and water-deficit conditions using SPSS version 23 [29].
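The flag leaf area equation above can be sketched in a few lines of Python. This is only an illustration: the function name and the example measurements are hypothetical and are not taken from the study; only the 0.74 correction factor comes from the text.

```python
def flag_leaf_area(length_cm: float, width_cm: float, factor: float = 0.74) -> float:
    """Return flag leaf area in cm^2 as length x width x correction factor."""
    if length_cm < 0 or width_cm < 0:
        raise ValueError("leaf dimensions must be non-negative")
    return length_cm * width_cm * factor


# Hypothetical (length, width) measurements in cm for plants of one plot;
# these are illustrative values, not data from the study.
measurements = [(22.0, 1.8), (24.5, 2.0), (20.0, 1.6)]
areas = [flag_leaf_area(length, width) for length, width in measurements]
plot_mean_fla = sum(areas) / len(areas)
```

Averaging per-plant areas, as done here, is one plausible way to obtain a plot-level value; the source does not state how per-plant measurements were aggregated.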

Genotyping of the Studied Germplasm
Three seeds of each genotype were planted. Fresh leaf samples for DNA extraction were collected from 15-day-old seedlings. DNA was extracted following the CIMMYT Molecular Genetics Manual [30]. The DNA samples (50-100 ng/µL per sample) in a 96-well plate format were sent to the CapitalBio® genotyping facility in Beijing for genotyping with the high-density Illumina 90K Infinium SNP array [31]. The genome-wide positions of SNPs, expressed as genetic distance (cM) along the chromosomes, were taken from the 2015 consensus genetic map of wheat [31]. Monomorphic markers, markers with more than 20% missing values, markers with unclear SNP calls, and markers with a minor allele frequency below 5% were excluded from the analysis.
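The marker filtering described above can be sketched as follows. This is a minimal illustration under stated assumptions: each genotype is treated as an inbred line carrying a single allele call per biallelic SNP, the 20% figure is read as the maximum tolerated share of missing calls, and the function name and data layout are hypothetical rather than the authors' pipeline.

```python
from collections import Counter


def keep_marker(calls, max_missing=0.20, min_maf=0.05, missing=None):
    """Return True if a SNP passes the missing-data, polymorphism and MAF filters.

    `calls` holds one allele call per genotype (e.g. "A" or "G");
    `missing` is the value that marks a failed or unclear call.
    Assumes biallelic SNPs scored on inbred lines (an assumption, not from the source).
    """
    n = len(calls)
    observed = [c for c in calls if c is not missing]
    # Drop markers with no data or with too many missing calls.
    if n == 0 or (n - len(observed)) / n > max_missing:
        return False
    counts = Counter(observed)
    # Drop monomorphic markers (only one allele observed).
    if len(counts) < 2:
        return False
    # Minor allele frequency among the observed calls.
    maf = min(counts.values()) / len(observed)
    return maf >= min_maf
```

For a panel of 96 lines, a marker with 6 copies of the rarer allele has a frequency of 6/96 and is kept, whereas one with 3 copies (3/96) is dropped.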

Population Structure and GWAS Analysis
A Bayesian clustering technique was applied to unlinked SNPs to identify groups of genetically similar individuals using the software STRUCTURE v.2.3 [32]. A burn-in of 10⁴ iterations was followed by a simulation run of 10⁶ iterations under the admixture model. The web-based tool Structure Harvester v0.6.93 was used to find the peak value of K and thereby validate the STRUCTURE results, which are based on ad hoc statistics [33]. K values from 1 to 10 were tested with 6 independent runs each to obtain reliable results.
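Choosing K from the rate of change in the log probability of data between successive K values is commonly done with the ΔK statistic of Evanno and colleagues. The sketch below shows that calculation under the assumption that this is the ad hoc statistic used here; the function name and the replicate log-likelihood values are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute delta K from replicate Ln P(D) values per K.

    `lnp_by_k` maps K -> list of Ln P(D) values from independent runs.
    Delta K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), using run means for L,
    and is defined only for K values that have both neighbours.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            sd = stdev(lnp_by_k[k])
            if sd > 0:  # delta K is undefined when replicate runs are identical
                delta[k] = second_diff / sd
    return delta


# Hypothetical Ln P(D) values for three runs per K (illustrative only).
runs = {
    1: [-1001.0, -1000.0, -999.0],
    2: [-901.0, -900.0, -899.0],
    3: [-821.0, -820.0, -819.0],
    4: [-701.0, -700.0, -699.0],
    5: [-696.0, -695.0, -694.0],
    6: [-693.0, -692.0, -691.0],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
```

Because ΔK needs both neighbouring K values, it cannot be evaluated at the smallest or largest K tested, which is one reason to scan a range such as 1 to 10.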
GAPIT (Genome Association and Prediction Integrated Tool) was also applied with the model selection option to test the reliability of the results [34]. GAPIT is an R package that offers maximum likelihood precision and runs in a computationally efficient manner. It implements advanced statistical approaches, including the compressed mixed linear model (CMLM) and CMLM-based genomic prediction and selection. The threshold for significant marker-trait associations (MTAs) was p ≤ 10⁻³, i.e., −log10(p) ≥ 3 [4], after applying the false discovery rate (FDR) < 0.05 correction [35]. A mixed linear model (MLM) was used for the GWAS. To control spurious associations arising from population structure, covariates from either STRUCTURE [32] or principal components (PCs) were included as fixed effects. The relationships among individuals were calculated as a kinship matrix and incorporated into the MLM [36]. Overall, 35,320 of the 81,000 functional iSelect bead chip assays showed polymorphism in the studied genotypes and were located on the published genetic map [31].
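The significance rule described above (a p-value threshold combined with an FDR correction) can be sketched in Python. The example assumes the Benjamini-Hochberg procedure, which is the most common FDR method but is not confirmed by the text; the function names are hypothetical and this is not GAPIT's code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_markers(pvalues, p_threshold=1e-3, fdr=0.05):
    """Indices of markers with p <= p_threshold and adjusted p <= fdr."""
    adjusted = benjamini_hochberg(pvalues)
    return [
        i
        for i, (p, q) in enumerate(zip(pvalues, adjusted))
        if p <= p_threshold and q <= fdr
    ]
```

In practice the input would be the per-marker p-values of one trait in one condition, so the correction is applied separately for each GWAS scan.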

Mapping SNPs and Identification of Candidate Genes
The bread wheat reference genome (IWGSC RefSeq v1.0) and gene annotations in GFF3 format were retrieved from the Ensembl database, release 44 [37]. The SNP marker sequences were aligned to the wheat genome using the blastn program with a stringent E-value of 0.0001. For each SNP, only the best-scoring hit was retained. Each aligned genomic position was annotated as 5′-UTR, 3′-UTR, CDS, intron or intergenic according to the genomic regions provided in the GFF3 file. The intergenic region was defined as the genomic region with no annotated genes. The annotated genes within ±250 kb of the mapped SNP were considered candidate genes, as described previously. The protein functions of the candidate genes were predicted using the UniProt protein database [38].
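The ±250 kb candidate-gene window can be illustrated with a short Python sketch. It assumes that a gene counts as a candidate when any part of it overlaps the window, which the text does not specify; the `Gene` class, the function name and the gene records used to exercise it are hypothetical.

```python
from dataclasses import dataclass

WINDOW_BP = 250_000  # +/-250 kb around the mapped SNP, as stated in the text


@dataclass(frozen=True)
class Gene:
    gene_id: str
    chrom: str
    start: int  # 1-based, inclusive, as in GFF3
    end: int    # 1-based, inclusive


def candidate_genes(snp_chrom, snp_pos, genes, window=WINDOW_BP):
    """Return IDs of genes overlapping [snp_pos - window, snp_pos + window]."""
    lo, hi = snp_pos - window, snp_pos + window
    return [
        g.gene_id
        for g in genes
        if g.chrom == snp_chrom and g.start <= hi and g.end >= lo
    ]
```

Gene coordinates would come from the `gene` features of the GFF3 file, and the SNP position from the best blastn hit retained for each marker.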

Phenotypic Evaluation
Significant genotypic variation (p < 0.01) was observed among genotypes for all measured yield and quality traits (Table 1). Significant G × E (genotype by environment) interactions were observed for all traits under normal and water-deficit conditions (Table 1). The highest heritability was observed for GYP, with H² = 0.95 and H² = 0.92 under normal and water-deficit conditions, respectively (Table 2). Summaries of the average data for quality and yield-related attributes over the two years are shown in Table 2. Mean flag leaf area ranged from 28.50 cm² to 44.83 cm² under normal conditions and from 19.02 cm² to 35.02 cm² under water-deficit conditions. Mean thousand-grain weight (TGW) ranged from 42.3 g to 58.2 g and from 33.7 g to 47.5 g under normal and water-deficit conditions, respectively. Average grain yield per plant ranged from 21.30 g to 38.11 g and from 15.21 g to 32.02 g under well-watered and water-deficit conditions, respectively. Under normal conditions, the minimum values of protein and gluten contents were 13.01% and 22.51%, respectively, while under water-deficit conditions the minimum values were 12.92% and 24.23%, respectively. The maximum values for protein contents were 13.64% (normal) and 15.13% (water-deficit), while gluten contents had maximum mean values of 30.06% (normal) and 31.78% (water-deficit). Table 3 shows the correlations among the studied attributes based on data averaged over the years 2016-17 and 2017-18 under normal and water-deficit conditions. Flag leaf area was positively associated with TGW and GYP under normal and water-deficit stress conditions (Table 3). Thousand-grain weight was strongly correlated with GYP and FLA under both conditions. The quality traits, protein and gluten contents, were significantly and positively associated with each other and negatively associated with all studied yield and yield-related traits, including FLA, TGW and GYP, under both conditions.

Population Structure
A Bayesian approach implemented in the software package STRUCTURE version 2.3.3 was used to assess the genetic structure of the 96 bread wheat accessions. The rate of change in the log probability of data between successive K values peaked at K = 4 (Figure 1), indicating that the genotypes could be divided into four sub-groups. Each color in Figure 2 represents one of these four sub-populations. The first group contained 12 genotypes (G-1 to G-10 and G-27 to G-28). The second group contained 14 genotypes (G-11 to G-22 and G-29 to G-30). The third group included 39 genotypes (G-34 to G-72), and the fourth group consisted of 17 genotypes (G-73 to G-89). The remaining genotypes were admixed: four genotypes (G-23 to G-26) combined genetic material from the first and second groups, three genotypes (G-31 to G-33) shared genetic material with the second and third groups, and seven genotypes (G-90 to G-96) shared genetic material from the third and fourth groups.


Markers-Traits Associations for Yield and Quality Attributes
Marker-trait associations for the studied traits were detected under normal and water-deficit stress conditions. The Manhattan plots in Figure 3A-J show the positions of significant SNPs and their −log10(p) values for quality and yield-related traits under both conditions. The blue horizontal line on each Manhattan plot marks the significance threshold (p ≤ 10⁻³). In total, 72 significant SNPs were associated with the studied traits; of these, 28 and 44 significant MTAs were observed under normal and water-deficit stress conditions, respectively, at the p ≤ 10⁻³ (−log10(p) ≥ 3) threshold using a mixed linear model (MLM) after applying the false discovery rate (FDR) ≤ 0.05 correction.
Figure 2. Population structure of the 96 bread wheat accessions based on Bayesian clustering of 90K SNP markers, detecting four clusters (K = 4). Different colors represent the different clusters; genotypes correspond to Table S1.

Agriculture 2020, 10, x FOR PEER REVIEW

Flag Leaf Area (FLA)
In the GWAS analysis, four markers on chromosomes 7A, 5A and 1A were highly associated with FLA under normal conditions (Figure 3A). The phenotypic variation explained (PVE) by the FLA-associated loci ranged from 17.06% to 21.21%. The marker RAC875_s117925_244 on chromosome 5A at 15.57 cM explained the most variation (21.21%), while the marker Tdurum_contig42590_755 on chromosome 7A at 35.31 cM explained the least (17.06%) (Table 4). Under water-deficit conditions, three markers on chromosomes 1B and 5D were strongly associated with FLA (Figure 3B), with PVE ranging from 14.95% to 17.60% (Table 5). The marker Tdurum_contig9144_222 on chromosome 1B at 171.31 cM had the maximum PVE (17.60%), while the marker wsnp_Ex_c955_1827719, at the same position on the same chromosome, explained the least (14.95%).

Thousand Grain Weight (TGW)
Under normal conditions, TGW was highly associated with six markers: two on chromosome 5A, two on 7B, and one each on 6B and 1A (Figure 3C). These markers explained 15.52% to 18.22% of the variation in TGW (Table 4). The marker BobWhite_c23828_341 on chromosome 6B at 34.94 cM explained the most phenotypic variation (18.22%), while the marker Excalibur_rep_c71254_415 on chromosome 5A at 84.58 cM explained the least (15.52%). The six MTAs for TGW comprised three SNPs on the A genome and three on the B genome. Under water-deficit conditions, seventeen significant SNP markers were strongly linked with TGW: four on chromosome 3B, three on 4B, three on 5B, two on 6B and one each on 1A, 3A, 5A, 2D and 7D (Figure 3D). These markers explained 13.76% to 20.62% of the phenotypic variation in TGW. The marker Excalibur_c53131_187 on chromosome 3A at 86.66 cM explained the most variation (20.62%), while the marker Tdurum_contig62286_271 on 4B at 89.40 cM explained the least (13.76%) (Table 5).

Grain Yield per Plant (GYP)
Under normal conditions, GYP was highly associated with eight SNP markers: three on chromosome 2A, two on 5A, one on 6B and two on 7B (Figure 3E). These SNPs explained 15.95% to 19.04% of the phenotypic variation in GYP. The eight MTAs for GYP comprised five SNPs on the A genome and three on the B genome. The marker BobWhite_c23828_341 on chromosome 6B at 43.94 cM explained the most variation (19.04%), while the marker Kukri_c55051_414 on chromosome 5A at 13.62 cM explained the least (15.95%) (Table 4). Under water-deficit stress, eight significant SNPs were associated with GYP: two on chromosome 1B, two on 4A and one each on 1A, 3A, 7B and 5D (Figure 3F). These SNPs explained 16.06% to 23.88% of the phenotypic variation in GYP and comprised four SNPs on the A genome, three on the B genome and one on the D genome (Table 5). The marker Tdurum_contig100702_265 on chromosome 4A at 138.76 cM explained the most phenotypic variation (23.88%), while the marker IAAV6265 on 5D at 87.06 cM explained the least (16.06%).

Grain Protein Contents (GPC)
A total of five significant SNP markers were observed for GPC under normal conditions: three on chromosome 1B, one on 3B and one on 4D (Figure 3G). The phenotypic variation explained by these SNP markers ranged from 14.52% to 17.52%. The marker wsnp_Ex_rep_c107564_91144523 on chromosome 4D at 70.59 cM explained the most variation (17.52%), while the marker RAC875_rep_c111494_195 on chromosome 1B at 130.90 cM explained the least (14.52%) (Table 4). The five MTAs for GPC comprised four SNPs on the B genome and one on the D genome. Under water-deficit conditions, seven significant SNP markers were highly associated with GPC. Two were located on chromosome 3A and the others on 1A, 5B, 6B, 7B, 1D and 3D (Figure 3H). These SNPs explained 15.59% to 25.47% of the variation in GPC. The MTAs for GPC were reported as three SNPs on the A genome, three on the B genome and one on the D genome. The marker Tdurum_contig100702_265 on chromosome 4A at 138.76 cM explained the most variation (25.47%), while the marker BS00063551_51 on chromosome 1B at 158.59 cM explained the least (15.59%) (Table 5).

Gluten Contents (GLC)
Five markers were highly associated with GLC under normal conditions; they were located on chromosomes 5A, 1B, 4B, 5B, 6B and 2D (Figure 3I). These markers explained 11.80% to 12.25% of the variability in gluten contents. The five MTAs for GLC comprised two SNPs on the A genome, two on the B genome and one on the D genome. The marker Excalibur_c19658_127 on chromosome 3D at 4.56 cM explained the most variation (12.25%), and the marker Excalibur_c10307_254 on chromosome 2A at 25.97 cM explained the least (11.80%) (Table 4). Nine markers were detected for GLC under water-deficit stress: two on chromosome 3A, two on 3D, two on 1B, one on 4A, one on 7B and one on 5D (Figure 3J), explaining 16.75% to 27.55% of the variation in GLC. The marker Tdurum_contig100702_265 on chromosome 4A at 138.76 cM explained the most variation (27.55%), and the marker Tdurum_contig1631_240 on chromosome 1B at 171.31 cM explained the least (16.75%) (Table 5).

Genome-Wide MTAs
The highest number of marker-trait associations (MTAs) under normal conditions was identified for GYP (8), followed by TGW (6), GPC (5), GLC (5) and FLA (4). Under water-deficit stress, the highest number of MTAs was detected for TGW (17), followed by GLC (9), GYP (8), GPC (7) and FLA (3). In the A genome, the markers Excalibur_c10307_254 (chromosome 2A, 25.97 cM) and wsnp_BG263358A_Ta_2_3 (chromosome 1A, 101.19 cM) explained the lowest phenotypic variation, 11.80% and 13.81%, and were significantly associated with GLC and TGW under normal and water-deficit stress conditions, respectively. The maximum phenotypic variation in the A genome, 21.21% and 27.55%, was explained by the markers RAC875_s117925_244 (chromosome 5A, 15.53 cM) and Tdurum_contig100702_265 (chromosome 4A, 138.76 cM), which were associated with FLA and GLC under normal and water-deficit conditions, respectively.
In the B genome, the SNP markers wsnp_CAP8_c334_304253 on chromosome 7B (29.49 cM) and Tdurum_contig62286_271 on chromosome 4B (89.40 cM) were associated with TGW and explained 12.03% and 13.76% of its variation under normal and water-deficit stress conditions, respectively. The markers BobWhite_c23828_341 (chromosome 6B, 43.94 cM) and BobWhite_c19429_95 (chromosome 7B, 133.59 cM) explained the maximum trait variation in the B genome, 19.04% and 20.58%, and were significantly associated with GYP and GLC under normal and water-deficit stress conditions, respectively.
In the D genome, the markers wsnp_Ex_rep_c107564_91144523 on chromosome 4D (70.59 cM) and Kukri_c7658_229 on chromosome 3D (143.01 cM) were associated with GPC and explained the maximum phenotypic variation, 17.52% and 19.71%, under normal and water-deficit stress conditions, respectively. The lowest phenotypic variation in the D genome, 12.25% and 14.56%, was explained by the markers Excalibur_c19658_127 (chromosome 3D, 4.56 cM) and GENE-4937_537 (chromosome 2D, 111.11 cM), which were associated with GLC and TGW under normal and water-deficit stress conditions, respectively.

Pleiotropic Locus
In the current study, multi-trait-loci (pleiotropic effect) were perceived on chromosome 1A

Mapping SNPs and Identification of Candidate Genes
Of the 72 SNPs associated with different attributes in this experiment, 71 were successfully mapped on the bread wheat reference sequence. Eleven candidate genes were predicted for FLA under normal conditions (Table S2) and seven under water-deficit conditions (Table S3). The adjacent genes TraesCS1B02G440200 and TraesCS1B02G480200 on chromosome 1B were predicted as candidate genes for FLA under normal and water-deficit conditions, respectively. A total of 15 and 42 candidate genes were found near SNPs associated with TGW under normal and water-deficit conditions, respectively. For GYP, 14 and 21 candidate genes were identified under normal and water-deficit conditions, respectively. For GPC, six and eight new candidate genes were predicted under normal and water-deficit conditions, respectively (Tables S2 and S3). For GLC, 27 and 28 new candidate genes were predicted under normal and water-deficit conditions, respectively. Twenty SNPs associated with the traits under study were mapped in the coding DNA sequence (CDS) of the respective candidate genes.

Phenotypic Evaluation
The ANOVA showed that the interaction between water treatments and genotypes was highly significant for all studied traits, indicating genotypic variation in the response to water treatments (Table 1). Significant effects of water treatments and genotypes were also observed in previous studies on spring wheat germplasm [11]. Heritability is a statistic used in breeding and genetics that estimates the proportion of variation in a phenotypic trait within a population that is due to genetic variation among individuals. Heritability estimates indicate the extent to which a particular character is transmitted to successive generations.
In the current experiments, high heritability was found for GYP (0.95), followed by TGW (0.92), FLA (0.90) and GLC (0.90) (Table 2). This suggests that these traits are simply inherited, that their heritability is most likely due to additive gene effects, and that selection for them may be effective in early generations. Previous studies also reported high heritability for the complex traits TGW and GYP in thirty diverse wheat genotypes evaluated under water-deficit conditions in an alpha lattice design, with values similar to those of the current study [39]. The quality traits GPC and GLC also had high heritability in the present study, as reported earlier by Yagdi et al. in diverse wheat genotypes [40]. Heritability of grain yield and quality-related attributes in different hexaploid wheats under normal and water-deficit conditions ranged from 0.40 to 0.90 in previous studies [41,42], which is in line with our findings (Table 2). Budak et al. [43] reported broad-sense heritabilities of 67% and 64% for grain yield and protein contents, respectively. Summary statistics of the studied attributes under normal and water-deficit conditions, based on data averaged over the years (Table 2), show the variation among genotypes in the different environments. Flag leaf area plays an important role in photosynthesis, contributes directly to yield, and influences transpiration under water-deficit conditions [7]. The effects of water deficit are particularly severe during the anthesis and grain-filling periods, decreasing the major yield components and ultimately lowering the total yield per plant [9]. Thousand-grain weight (TGW) is a vital yield component and a more-or-less stable character of wheat cultivars.
Water-deficit stress decreased the mean values of FLA, TGW and GYP but increased those of the quality traits, protein and gluten contents. Water shortage may also have a considerable effect on the chemical composition of the grain, including the storage proteins. Our results agree with earlier reports that water-deficit stress reduces wheat grain yield but enhances quality traits such as GPC and GLC [44,45]. Negative effects of water-deficit stress on wheat performance, and genotypic differences in the response to a water deficit, have also been reported earlier [46]. Generally, water deficit is known to reduce the carbohydrate content of the grain (including sucrose and starch) and to increase its protein content; similar results were obtained in the current study. However, these effects depend strongly on the degree and timing of the water deficit and on interactions with other environmental stresses [26,47]. It is established that terminal drought stress during the anthesis and post-anthesis stages in wheat is associated with maximum yield losses [12], which are predominantly caused by a decrease in grain yield per plant [48]. Early induced terminal drought stress at the booting stage directly decreased flag leaf area and ultimately decreased TGW and GYP, which agrees with our observations. Other factors linked to the restraint of physiological and biochemical pathways, i.e., early leaf senescence, reduced leaf water potential, stomatal closure, decreased net photosynthesis, oxidative damage to chloroplasts, a reduced carbon fixation rate and reduced assimilate translocation, also contribute to yield losses under water-deficit stress [49].
The significant association of flag leaf area with yield per plant in wheat has been attributed to its ability to capture radiant energy. In the present study, GPC increased significantly under water-deficit conditions (Table 3), as previously reported by wheat breeders [50]. The increase in grain protein content under water-deficit conditions has been attributed mainly to higher rates of grain nitrogen accumulation and lower rates of carbohydrate accumulation [51]. The negative association of quality and yield attributes suggested that an increase in yield under water-deficit stress might be achieved with an insignificant decrease in protein and gluten contents [44].
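To illustrate how a heritability estimate of the kind discussed in this section can be obtained, the following Python sketch computes an entry-mean broad-sense heritability, H2 = Vg / (Vg + Ve/r), from balanced replicated plot values using one-way ANOVA mean squares. It is a minimal sketch under stated assumptions: the estimator, the function name and the example values are illustrative only and are not the procedure or data of this study, which may have used a different model (for example, one that includes years and water treatments).

```python
from statistics import mean


def broad_sense_heritability(plots):
    """Entry-mean broad-sense heritability for a balanced one-way layout.

    plots: list with one inner list per genotype, each holding r replicate
    values of the same trait. (Hypothetical helper, not from the paper.)
    """
    g = len(plots)
    r = len(plots[0]) if g else 0
    if g < 2 or r < 2 or any(len(p) != r for p in plots):
        raise ValueError("need a balanced layout with >= 2 genotypes and >= 2 replicates")

    grand_mean = mean(v for p in plots for v in p)
    geno_means = [mean(p) for p in plots]

    # Mean squares of a one-way ANOVA with genotype as the only factor.
    ms_genotype = r * sum((m - grand_mean) ** 2 for m in geno_means) / (g - 1)
    ms_error = sum(
        (v - m) ** 2 for p, m in zip(plots, geno_means) for v in p
    ) / (g * (r - 1))

    # Genotypic variance component; negative estimates are truncated at zero.
    var_g = max((ms_genotype - ms_error) / r, 0.0)
    # Phenotypic variance of a genotype mean over r replicates.
    var_p = var_g + ms_error / r
    return var_g / var_p if var_p > 0 else 0.0


# Made-up yield values for three genotypes with two replicates each.
example_plots = [[10, 12], [20, 22], [30, 32]]
h2 = broad_sense_heritability(example_plots)
```

Because the made-up genotype means differ far more than the replicates within a genotype, the estimate is close to 1, which mirrors the interpretation given above for traits with high heritability.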

Population Structure Analysis
Population structure is an important component in association mapping analyses between molecular markers and traits because accounting for it may reduce both type I and type II errors. In this study, structure analysis suggested that the 96 accessions originated from different ancestors. Known origin evidence from the maintainer of the 96 wheat accessions, together with pedigree records, indicated three kinds of populations, but genetically the studied material was classified into four groups. STRUCTURE genotypic analysis thus classified the 96 bread wheat accessions into four sub-groups. These methods have also been applied by wheat scientists in wheat breeding schemes [11,15,52]. In the current experiment, STRUCTURE analysis suggested dissimilarities among the 96 bread wheat accessions, and all groups were genetically diverse. The genetic distance was greatest between groups, indicating genetic similarity within groups and genetic dissimilarity between groups. In particular, the results were usable for comparison with the previously known pedigree records and origins of the wheat genotypes. Evaluation of genetic diversity could help to identify distinct genotypes for advancing and improving future wheat breeding schemes [9,11]. Genotypes with different genetic makeup can be selected for desirable combinations to develop complex and important attributes and obtain maximum yield. Discriminating wheat genotypes on a genetic basis would be useful for effective and early selection of desired genotypes in wheat breeding schemes aimed at developing promising wheat genotypes.

Marker-Trait Associations for Yield and Quality Attributes
Marker-trait association analysis establishes the relationship between phenotypic and genetic variability within a genome, which ultimately detects the loci underpinning the corresponding traits [18]. This diverse panel had not previously been used to study the genetics of quality and yield-related traits by GWAS. In this study, 35,320 high-density polymorphic SNP markers from the 90K Illumina iSelect SNP array [31] were analyzed to detect SNPs linked with quality, yield and yield-related traits. Marker-trait associations for the studied parameters were analyzed under both conditions. This allowed us to identify important genomic regions carrying genes associated with the studied traits. The flag leaf area of the wheat plant is an important character that directly influences yield, because a greater area enables the plant to produce more photosynthates, which are translocated to the seed and increase yield. In line with the current results, earlier research reported MTAs for FLA on chromosomes 1B, 4D, 5A, 6B and 7D using a wheat recombinant inbred line (RIL) population under normal and water-deficit conditions [53]. Wu et al. identified thirteen chromosome regions associated with FLA, explaining 3.33-26.13% of the phenotypic variance in wheat, using integrated high-density SNP genetic linkage maps [54]. Zhao et al. identified significant markers for FLA on chromosome 2A under four environments in a wheat RIL population [55]. In the current study, MTAs for TGW were distributed across 17 chromosomes. A GWAS was undertaken by wheat breeders to identify SNP markers associated with TGW in 123 Pakistani historical wheat cultivars evaluated under rainfed field conditions. These cultivars were genotyped using the high-density Illumina iSelect 90K SNP assay, and MTAs for TGW were reported on chromosomes 3A, 3A, 3B, and 5B [11].
Stable genomic regions influencing TGW have frequently been identified on chromosomes 5A, 3B and 5B in various wheat populations using association mapping analysis [48]. MTAs for TGW have been detected in genome-wide SNP studies of different hexaploid wheat panels on chromosomes 1A, 2D, 3A [10], 2A, 3B, 7B, 7D [4], 2B [56], 4B [8] and 5B [57]; these previous findings are also in agreement with the current study.
In this study, MTAs controlling GYP were found on chromosomes 1A, 3A, 4A, 1B, 4B, 6B, 7B, 5D and 7D under both conditions. MTAs for GYP were previously observed in various wheat panels analyzed through GWAS with high-density SNP assays on chromosomes 1A, 2D, 3A, 7B and 7D [11], 1B [8], 2B, 3A, 3D, 5B, 7A and 7B [10] under different water regimes. The marker locus on 4B associated with GYP under water stress conditions was also previously reported to be associated with this trait in a Pakistani wheat population [15]. Similarly, in genome-wide association mapping, Edae et al. [19] reported MTAs for GYP on chromosomes 4A, 1B, 5B and 2B in a spring wheat association panel under contrasting moisture regimes. Moreover, Lozada et al. [56] reported MTAs for GYP on chromosomes 5A, 1B, 2B and 4B in a diverse panel of 239 wheat (Triticum aestivum) genotypes evaluated across two growing seasons using SNP markers. Wheat breeders [20] identified MTAs for GYP on 3A, 3B and 2D in a set of 287 diverse advanced wheat lines across different environments. Tadesse et al. [17] reported GYP-related MTAs on 1B in 120 elite hexaploid wheat genotypes evaluated under rain-fed and irrigated conditions in a genome-wide study. Pinto et al. [8] detected a GYP-related MTA on 4A that explained 27% of the variation under water-deficit stress using 167 wheat recombinant inbred lines (RILs) under three different water regimes. MTAs responsible for GYP on chromosomes 1A, 4B, 6B and 7D were also reported in bi-parental QTL analyses [58]. Sukumaran et al. [4] identified MTAs on chromosomes 5A, 6A, 2B and 3B in a GWAS on multi-environment data that identified genomic regions associated with wheat yield and yield-related traits. They used a panel of 287 elite spring bread wheat lines genotyped with the 90K Illumina Infinium SNP array, and their results were similar to the current findings [4].
The MTAs for yield per plant and its related attributes identified in the current experiment were specific to the water treatment conditions, suggesting the dynamic nature of the genetics underpinning wheat yield [56]. The significant SNPs associated with yield and yield-related traits in this study can help in designing new strategies to accumulate favorable alleles for the studied traits in future wheat breeding programs.
Suprayogi et al. reported significant MTAs for GPC on chromosomes 5B, 6B, 2B and 7A in diverse wheat germplasm using SSR and SNP markers [59]. In accordance with the current study, the MTAs detected by Tadesse et al. [17] on chromosomes 5B and 3B were significantly linked with protein percentage, explaining 16% and 15% of the total variation, respectively, using Diversity Array Technologies (DArT) markers in 120 elite wheat genotypes under rain-fed and normal conditions. Earlier, two MTAs for GPC on 3B and 5B were also reported from the analysis of recombinant inbred lines (RILs) derived from a cross between a spring wheat and a spring version of a winter wheat, using 257 SSR and 77 SNP markers [60]. MTAs linked with GPC were reported at the same position under two environments on chromosomes 3B and 7B, and in one environment on chromosomes 2A, 4A, 5A, 7A, 1B, 2B, 3B and 5B [61], in a mapping population of 93 RILs derived from the durum wheat cross UC1113 x Kofa. These results are also in line with the current results, which show significant MTAs from the A and B genomes.
MTAs/QTLs for GPC were found on chromosomes 3A, 5A, 4B and 5B in wheat RILs using the Infinium iSelect SNP genotyping assay containing 9000 wheat SNPs developed by Illumina Inc. [62]. In the present study, MTAs for GLC under water-deficit conditions were distributed across six chromosomes, two each on the A, B and D genomes. Our results showed that MTAs for GLC were located on chromosomes 5A, 1B, 4B, 5B, 6B and 2D under normal conditions, and on chromosomes 3A, 4A, 1B, 7B, 3D and 5D under water-deficit conditions. The MTAs in this study provide preliminary information on genetic regions that may be important for GPC and GLC. However, further mapping and validation of these MTAs/QTLs should be carried out before they are applied in marker-assisted breeding. Chromosomes 7A, 1B, 4B and 7B were reported to carry significantly associated MTAs for GPC and GLC in wheat using a bi-parental population under different environmental conditions [63]. The most significant and stable QTL influencing GLC was found on chromosome 1B across different environmental conditions in a mapping population of 93 RILs of wheat genotypes [61]. Six significant and distinct genomic regions for GPC and GLC were located on chromosomes 3A, 4A, 5A, 6A, 4B and 2D in the present study and have also been reported previously. These exhibited pleiotropic effects and showed significant effects on many quality-related attributes with no or minor negative influence on yield-related attributes [6]. Multi-trait loci for yield and yield-related traits were also identified on chromosome 5A [4]. In this experiment, an MTA for TGW was documented on chromosome 7B, which was earlier reported to have noteworthy relations to yield traits [64]. Chromosome 5D was found to be associated with a QTL for GYP in wheat [65], in agreement with the current study.
The results of this study support GWAS as a beneficial tool for locating genes, MTAs/QTLs and candidate genes responsible for the variation in desired quantitative characters [18]. The power to identify linked loci for a desired character via GWAS usually depends on marker density, population size, the attributes of interest, phenotypic performance and the statistical analysis [66]. It is important to note that the phenotypic experiments were conducted under normal and water-deficit conditions; therefore, the MTAs recognized in the current experiment are valuable, since they might be associated with minor genes for adaptation to the targeted conditions. Numerous studies have aimed to localize genes and QTLs affecting different quality and yield attributes in order to expedite marker-assisted selection for water-deficit tolerance in wheat breeding programs [8,67,68].
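The distinction drawn in this section between pleiotropic loci (one marker associated with more than one trait) and condition-specific MTAs can be made concrete with a short Python sketch. The first two marker names and their traits are taken from the abstract of this paper; the third marker, the record layout and the function name are hypothetical and serve only to show a condition-specific case. This is an illustrative sketch, not the analysis pipeline used in the study.

```python
from collections import defaultdict

# (marker, trait, condition) records. The first two markers are the loci
# reported for GPC and GLC under both conditions; the last one is hypothetical.
mta_records = [
    ("BS00063551_51", "GPC", "normal"),
    ("BS00063551_51", "GLC", "normal"),
    ("BS00063551_51", "GPC", "water-deficit"),
    ("BS00063551_51", "GLC", "water-deficit"),
    ("RAC875_c28721_290", "GPC", "normal"),
    ("RAC875_c28721_290", "GLC", "normal"),
    ("RAC875_c28721_290", "GPC", "water-deficit"),
    ("RAC875_c28721_290", "GLC", "water-deficit"),
    ("hypothetical_marker_1", "TGW", "water-deficit"),
]


def classify_mtas(records):
    """Group MTAs by marker and report pleiotropic, shared and specific loci."""
    traits_by_marker = defaultdict(set)
    conditions_by_marker = defaultdict(set)
    for marker, trait, condition in records:
        traits_by_marker[marker].add(trait)
        conditions_by_marker[marker].add(condition)

    # Pleiotropic: associated with two or more traits.
    pleiotropic = sorted(m for m, t in traits_by_marker.items() if len(t) > 1)
    # Shared: detected under more than one water treatment.
    shared = sorted(m for m, c in conditions_by_marker.items() if len(c) > 1)
    # Condition-specific: detected under exactly one water treatment.
    specific = sorted(m for m, c in conditions_by_marker.items() if len(c) == 1)
    return pleiotropic, shared, specific


pleiotropic_loci, shared_loci, specific_loci = classify_mtas(mta_records)
```

Keeping the records as flat (marker, trait, condition) tuples means the same grouping works unchanged if further traits or water treatments are added.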

Mapping SNPs and Identification of Candidate Genes
To validate the SNPs reported previously [31], the sequences of the SNP loci were mapped on the recently published bread wheat reference sequence (IWGSC RefSeq v1.0). Putative candidate genes were identified within ±250 kb of the mapped SNPs. In this study, eleven candidate genes were predicted for FLA under normal conditions (Table S2) and seven under water-deficit conditions (Table S3). The adjacent genes TraesCS1B02G440200 and TraesCS1B02G480200 on chromosome 1B were predicted as candidate genes for FLA under normal and water-deficit conditions, respectively. QTLs for FLA on 1B under different water regimes have been reported previously [53]. For the FLA candidate genes, the predicted proteins and their functions were ureide permease (nitrogen compound transport), Peptidase_A22B_SPP (aspartic-type endopeptidase activity) and reticulon domain-containing protein (transmembrane). Besides their housekeeping role in N recycling, ureides are the major products of N2 fixation in root nodules and are translocated to the shoot [69,70]. The ureide allantoin is gaining attention, as several studies have reported that this metabolite accumulates in many plant species under water-deficit conditions [71,72]. Peptidase_A22B_SPP is an endopeptidase and an integral component of the membrane [73]. The role of the reticulon domain-containing protein is still unknown. For GYP, 14 and 21 candidate genes were identified under normal and water-deficit conditions, respectively. The candidate genes reported in our study differ from those cloned so far, such as TaSnRK2.10-4A, TaTGW6-A1, TaFlo2-A1, TaGS53A, TaGASR7-A1, TaSAP1-A1, TaCwi-A1, TaGW2, TaGS1a and TaGS-D1.
The predicted proteins for GYP and TGW were Laccase/Lignin degradation, MATH domain-containing protein, Importin N-terminal domain-containing protein, NTP_transferase domain-containing protein, cyclic nucleotide-binding domain-containing protein/ion channel activity, and WD_REPEATS_REGION domain-containing protein. Laccase was recently reported to improve GYP by improving plant defense against fungal infections [74]. In Arabidopsis and rice, MATH domain-containing proteins have been reported to be involved in grain yield under normal and abiotic stress conditions [75]. The functions of the Importin N-terminal domain-containing protein and the NTP_transferase domain-containing protein in improving grain yield are still unclear. The roles of cyclic nucleotide-binding domain-containing protein genes in yield-related traits and Pst resistance have been documented [76,77].
For GLC, 27 and 28 new candidate genes were predicted under normal and water-deficit conditions, respectively, in our study. Twenty SNPs associated with the traits under study were mapped in the coding DNA sequence (CDS) of the respective candidate genes (Tables S2 and S3). The new candidate genes identified herein can be cloned and functionally characterized for the respective traits. The predicted proteins for GPC and GLC were UDP-glucose 6-dehydrogenase, Protein detoxification, Thioredoxin domain-containing protein, SGNH_hydro domain-containing protein, Reticulon-like protein, 4-hydroxy-7-methoxy-3-oxo-3,4-dihydro-2H-1,4-benzoxazin-2-yl glucoside beta-D-glucosidase and WD_REPEATS_REGION domain-containing protein (Tables S2 and S3). UDP-glucose 6-dehydrogenase is involved in the biosynthesis of UDP-glucuronic acid (UDP-GlcA), which provides nucleotide sugars for cell-wall polymers, and affects protein content in seeds [78]. The role of protein detoxification in GPC and GLC is well established and has been comprehensively reviewed. Thioredoxin domain-containing proteins are related to seed germination in cereals and are reported to affect GPC and GLC in wheat [79]. Cloning and characterization of the candidate genes in which SNPs were mapped in the CDS will result in discovering novel genes underpinning yield potential and water-deficit tolerance in bread wheat.
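The candidate-gene search described in this section, collecting genes within ±250 kb of a mapped SNP and checking whether the SNP falls inside a CDS, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the gene identifiers, coordinates, class and function names are hypothetical placeholders and are not taken from IWGSC RefSeq v1.0 or from Tables S2 and S3, and the tools actually used in the study are not specified here.

```python
from dataclasses import dataclass

WINDOW_BP = 250_000  # +/-250 kb around each mapped SNP, as stated in the text


@dataclass(frozen=True)
class Gene:
    gene_id: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def genes_near_snp(chrom, pos, genes, window=WINDOW_BP):
    """Return IDs of genes on `chrom` overlapping [pos - window, pos + window]."""
    lo, hi = max(1, pos - window), pos + window
    return [
        g.gene_id
        for g in genes
        if g.chrom == chrom and g.start <= hi and g.end >= lo
    ]


def snp_in_cds(pos, cds_intervals):
    """True if `pos` lies inside any (start, end) CDS interval, inclusive."""
    return any(start <= pos <= end for start, end in cds_intervals)


# Hypothetical annotation records, for illustration only.
annotation = [
    Gene("hypothetical_gene_1", "1B", 100_000, 105_000),
    Gene("hypothetical_gene_2", "1B", 900_000, 905_000),
    Gene("hypothetical_gene_3", "3A", 100_000, 105_000),
]

# A SNP mapped to chromosome 1B at position 300,000 (made-up coordinate).
candidates = genes_near_snp("1B", 300_000, annotation)
```

A gene is counted as a candidate when any part of it overlaps the window, so genes that straddle the window boundary are retained; a stricter rule (for example, requiring the whole gene inside the window) would be an equally reasonable design choice.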

Conclusions
In addition to validating previously reported MTAs, we identified some new candidate genes underpinning the key grain yield and quality traits. GPC increased significantly under water-deficit stress. The negative association between protein and yield attributes suggested that an increase in yield under water-deficit stress might be achieved with a significant decrease in GPC. New pleiotropic loci were detected on chromosomes 5A, 6B and 7B under normal conditions, and on chromosomes 3A, 4A, 1B, 7B and 5D under water-deficit stress conditions. The MTAs on chromosome 7B showed pleiotropic effects for the studied quality and yield-contributing traits under both normal and water-deficit conditions. The newly identified genes for FLA, TGW, GYP, GPC and GLC could be cloned and characterized to further the understanding of the molecular mechanisms underpinning these traits.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2077-0472/10/9/392/s1, Table S1: Genotypes code, name and pedigree of 96 bread wheat genotypes, Table S2: The mapped position of the SNP markers and predicted candidate genes under normal conditions, Table S3: The mapped position of the SNP markers and predicted candidate genes under water-deficit conditions.

Funding:
This research was supported by the China Agriculture Research System (CARS-05-01A-04), which was used for the collection and analysis of the genotypic data through the 90K SNP array and for manuscript processing charges.

Conflicts of Interest:
The authors declare that they have no competing interests.