Identification and Validation of Quantitative Trait Loci for Grain Size in Bread Wheat (Triticum aestivum L.)

Grain width (GW) and grain length (GL) are crucial components affecting grain weight. Dissection of their genetic control is essential for improving yield potential in wheat breeding. Yangmai 12 (YM12) and Yanzhan 1 (YZ1) are two elite cultivars released in the Middle and Lower Yangtze Valleys Wheat Zone (MLYVWZ) and the Yellow-Huai River Valleys Wheat Zone (YRVWZ), respectively. A biparental population derived from the YM12/YZ1 cross was used to map quantitative trait loci (QTL) for GW and GL based on data from four environments over two years. A total of eight QTL were identified on chromosomes 1B, 2D, 3B, 4B, 5A, and 6B. Notably, QGW.yz.2D was co-located with QGL.yz.2D, and QGW.yz.4B was co-located with QGL.yz.4B. QGW.yz.2D and QGL.yz.2D, with the increasing GW/GL allele from YZ1, explained 12.36–18.27% and 13.69–26.53% of the phenotypic variation for GW and GL, respectively. QGW.yz.4B and QGL.yz.4B, with the increasing GW/GL allele from YM12, explained 10.34–11.95% and 10.35–16.04% of the phenotypic variation for GW and GL, respectively. QGL.yz.5A, with the increasing GL allele from YM12, explained 10.04–12.48% of the phenotypic variation for GL. Moreover, the positive alleles of these three QTL regions significantly increased thousand-grain weight, and QGW.yz.4B/QGL.yz.4B and QGL.yz.5A did not show significant negative effects on grain number per spike. QGL.yz.2D, QGW.yz.4B/QGL.yz.4B, and QGL.yz.5A have not been reported previously. These three QTL regions were further validated using Kompetitive Allele-Specific PCR (KASP) markers in 159 wheat cultivars/lines from MLYVWZ and YRVWZ. Combining the positive alleles of the major QTL significantly increased GW and GL. Eleven candidate genes, including genes encoding an ethylene-responsive transcription factor, oleosin, osmotin protein, and thaumatin protein, were identified. The three major QTL and the KASP markers reported here will be helpful in developing new wheat cultivars with high and stable yields.


Introduction
Wheat (Triticum aestivum L.) is one of the world's major staple crops, providing 20% of the total caloric demands of humans [1]. The demand for wheat production has been increasing rapidly due to farmland loss, climate change, and population increase [2,3]. Therefore, it is important to improve grain yield by breeding high-yield wheat cultivars. Wheat grain yield comprises three main components, viz., spike number per unit area, kernel number per spike, and thousand-grain weight (TGW). Among them, TGW, which has relatively high heritability, is determined mainly by grain size, including grain width (GW) and grain length (GL) [4,5]. In the past decade, many studies have been conducted to identify quantitative trait loci (QTL) for wheat grain size [5] (Guan et al. 2020). Of the detected QTL, few have been further validated and fine-mapped [5–11]. Since rice and wheat have a conserved genetic network regulating grain size, a translational genomics approach appears to be an effective method to identify and map candidate genes for grain size in the wheat genome [12]. Most of the candidate genes for grain weight in wheat have been cloned through homology-based cloning approaches, such as TaGW2 [13], TaCKX6-D1 [14], TaCwi-A1 [15], TaGS-D1 [16], 6-SFT-A2 [17], TaTGW6 [18], TaTPP-6AL1 [19], TaGW8 [20], TaBT1 [21], and TaGW7 [22]. Moreover, several genes for grain size in wheat, such as TaGL3.3-5B [23] and TaPGS1 [24], have been characterized. The functions of several orthologous genes associated with grain size and weight have been further confirmed by the RNAi approach, TILLING (targeting induced local lesions in genomes), and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) genome editing technology [21,22,25–27].
However, few QTL/genes for grain size have been used in marker-assisted selection (MAS) in breeding programs because their effects on the corresponding traits are unstable across genetic backgrounds and environments. Therefore, detecting and validating novel QTL for grain size is still critical for yield breeding. SSR and SNP markers are common molecular markers that are based on the polymerase chain reaction (PCR) [28]. Recent progress in wheat genome sequencing and the availability of high-throughput chip-based markers have accelerated QTL analysis and MAS in breeding programs. The Kompetitive Allele-Specific PCR (KASP) assay, which uses a specific commercial master mix without compromising data throughput, has become an excellent breeding tool for high-throughput, cost-effective tracing of functional genes/QTL for agronomic traits, pre-harvest sprouting resistance, and biotic stress resistance in wheat [29,30].
Yangmai 12 (YM12) was a leading cultivar in the Middle and Lower Yangtze Valleys Wheat Zone (MLYVWZ), with a peak planting area of 0.14 Mha in 2006 [31], and has been used extensively in many breeding programs as a parent with good disease resistance and excellent agronomic traits. Yanzhan 1 (YZ1) is a high-yield and good-quality winter wheat variety released in the Yellow-Huai River Valleys Wheat Zone (YRVWZ) [32]. The genetic basis of grain size in YM12 and YZ1 is unclear. Thus, the objectives of this study were to (1) understand the genetic basis of GW and GL in YM12 and YZ1 using a recombinant inbred line (RIL) population; (2) highlight the critical chromosomal regions harboring stable QTL; (3) develop breeder-friendly molecular markers tightly linked to the target QTL regions and evaluate the effects of stable QTL intervals on the related traits in different backgrounds; and (4) identify candidate genes for map-based cloning of the major QTL.

Plant Materials
A biparental population (205 F10 RILs) derived from the cross of YM12 and YZ1 was used for QTL detection. YM12 has a bigger grain size than YZ1. A panel of 159 wheat cultivars/lines, including 64 cultivars and 95 advanced breeding lines, was used for the validation of the major QTL. In the 95 advanced breeding lines, 25 lines were from Yangmai 15/Zhoumai 18 cross, 16 lines were from Yangmai 16/Ningmai 22 cross, and 22 lines were from Yangmai 18/Zhengmai 7698 cross.

Field Trials and Phenotypic Evaluation
Field trials were carried out at Yangzhou Experimental Station (YZ) (altitude 10-20 m, latitude 32.24° N, longitude 119.26° E, annual rainfall 1020 mm) and Sihong Experimental Station (SH) (altitude 35-45 m, latitude 33.46° N, longitude 118.22° E, annual rainfall 910 mm) in Jiangsu Province. The YM12/YZ1 RIL population was planted at both stations in the 2019-2020 and 2020-2021 wheat cropping seasons, giving four environments (2020YZ, 2020SH, 2021YZ, and 2021SH). Field trials were conducted in randomized complete blocks with two replications. Each plot had three 2.5-m rows spaced 0.3 m apart. Fifty seeds of each RIL and parental cultivar were sown in each row. Weed control, fungicide treatment, and other field management followed local standard practices. At maturity, 20 plants of each RIL and parental cultivar with similar growth stages and without disease infection were selected and marked. Grain number per spike (GNS) was measured as the mean grain number of 20 main-stem spikes. The selected plants were harvested and manually threshed for evaluating GW and GL with SC-G software (Wanshen Technology Company, Hangzhou, Zhejiang, China). TGW was calculated from the mean weight of three independent samples of 500 grains. GNS and TGW were used to validate the effect of the target QTL on the corresponding traits. The 159 wheat cultivars/lines were planted in the yield evaluation nurseries at YZ in the 2019-2020 and 2020-2021 cropping seasons, and GW and GL were measured using the same protocol as for the RILs. Yangmai 20 (YM20) and Yanzhan 4110 (YZ4110) were used as the controls for GW, GL, and TGW in the validation populations.
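The TGW calculation described above (three independent 500-grain samples, averaged and scaled to 1000 grains) can be illustrated with a minimal Python sketch. The function name and the sample weights are hypothetical and are not taken from the study.

```python
from statistics import mean


def thousand_grain_weight(sample_weights_g, grains_per_sample=500):
    """Scale the mean weight of fixed-size grain samples to 1000 grains."""
    if not sample_weights_g:
        raise ValueError("at least one sample weight is required")
    return mean(sample_weights_g) * 1000 / grains_per_sample


# Hypothetical weights (g) of three independent 500-grain samples from one line
weights = [21.0, 21.5, 22.0]
tgw = thousand_grain_weight(weights)
print(f"TGW = {tgw:.1f} g")
```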

Statistical Analysis
All analyses were performed in Microsoft Excel 2019 and SPSS software (Chicago, IL, USA). Broad-sense heritability ($H_B^2$) for GW and GL was calculated using the formula $H_B^2 = \sigma_G^2 / (\sigma_G^2 + \sigma_{G \times E}^2 / E + \sigma_e^2 / (E R))$, where $E$ and $R$ represent the number of environments and replicates, respectively; $\sigma_G^2$ represents the genotypic variance; $\sigma_{G \times E}^2$ represents the genotype-by-environment interaction variance; and $\sigma_e^2$ represents the residual variance. Best linear unbiased estimator (BLUE) values and $H_B^2$ were estimated using the ANOVA function in IciMapping v4.1 [33].
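As a worked illustration of the heritability calculation, the sketch below assumes the commonly used entry-mean form, in which the genotypic variance is divided by the genotypic variance plus the genotype-by-environment variance over E plus the residual variance over E times R. The variance components are hypothetical values chosen for easy arithmetic; they are not estimates from this study, and the study itself used IciMapping for this step.

```python
def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Entry-mean broad-sense heritability from variance components (assumed form)."""
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    # Phenotypic variance of a genotype mean over n_env environments and n_rep replicates
    var_p = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    return var_g / var_p


# Hypothetical variance components; four environments and two replicates as in the trial design
h2 = broad_sense_heritability(var_g=0.012, var_ge=0.008, var_e=0.016, n_env=4, n_rep=2)
print(f"H2 = {h2:.2f}")
```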

Genotyping, Linkage Map Construction, and QTL Analysis
A whole-genome genetic map of the YM12/YZ1 population was previously constructed from Wheat 55 K SNP array data (unpublished). The linkage map, including 1468 bin markers, spanned a total length of 3610.67 cM, with 26 linkage groups assigned to 21 chromosomes with a mean distance of 2.62 cM/marker; chromosomes 1A, 6B, 1D, 2D, 3D, and 5D each had two linkage groups. Twelve Kompetitive Allele-Specific PCR (KASP) markers and eight SSR markers associated with known genes, including Vrn-B1, Ppd-B1, Rht-B1, Rht-D1, Rht8, and Qfhb-2DL, were selected for genotyping the parental cultivars and RILs [34–37]. The 660K SNP array containing 660,011 SNPs was used to scan the parents YM12 and YZ1, and the SNPs corresponding to the target interval of the stable and major QTL for grain size on chromosome 2D were converted into KASP markers. The KASP markers were then merged with the other markers to construct a new linkage map of 2D (unpublished). QTL analysis was conducted using the inclusive composite interval mapping algorithm in IciMapping V4.0 software, with the LOD threshold set at 2.5, walking step = 0.001 cM, and PIN = 0.0005 [36–38]. QTL identified for the same trait with overlapping confidence intervals were considered to be the same QTL. Physical positions of linked markers were used to compare the QTL identified in the current study with previously reported QTL (http://202.194.139.32/blast/blast.html) (accessed on 3 December 2021) (http://202.194.139.32/genes/) (accessed on 1 February 2022).
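The rule that QTL for the same trait with overlapping confidence intervals are treated as one QTL can be sketched as follows. This is an illustrative reading of the rule, not the IciMapping implementation: the record layout, the function name, the chained merging of intervals, and all positions and LOD values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QTLHit:
    trait: str
    chrom: str
    env: str
    lod: float
    ci_left: float   # left bound of the confidence interval (cM)
    ci_right: float  # right bound of the confidence interval (cM)


def merge_overlapping(hits, lod_threshold=2.5):
    """Drop hits below the LOD threshold, then merge same-trait, same-chromosome
    hits whose confidence intervals overlap."""
    kept = sorted(
        (h for h in hits if h.lod >= lod_threshold),
        key=lambda h: (h.trait, h.chrom, h.ci_left),
    )
    merged = []
    for h in kept:
        last = merged[-1] if merged else None
        same_region = (
            last is not None
            and last["trait"] == h.trait
            and last["chrom"] == h.chrom
            and h.ci_left <= last["ci_right"]
        )
        if same_region:
            last["ci_right"] = max(last["ci_right"], h.ci_right)
            last["envs"].append(h.env)
        else:
            merged.append({"trait": h.trait, "chrom": h.chrom,
                           "ci_left": h.ci_left, "ci_right": h.ci_right,
                           "envs": [h.env]})
    return merged


# Hypothetical per-environment hits
hits = [
    QTLHit("GL", "2D", "2020YZ", 6.1, 40.0, 45.0),
    QTLHit("GL", "2D", "2021SH", 5.2, 43.5, 48.0),
    QTLHit("GL", "2D", "2020SH", 3.0, 90.0, 95.0),
    QTLHit("GW", "4B", "2021YZ", 2.1, 10.0, 15.0),  # below the LOD threshold
]
qtl = merge_overlapping(hits)
for q in qtl:
    print(q)
```

Counting the environments attached to each merged record is one simple way to flag a QTL as stable across trials.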

Marker Development and QTL Validation in Different Genetic Backgrounds
Genomic DNA was extracted from fresh leaves according to Ma and Sorrells [39]. The flanking markers of the peak position for the major QTL were converted into KASP markers. KASP markers were designed following the protocols described by Xu et al. [36]. KASP assays were performed in 384-well PCR plates in a 5 µL volume with 2.5 µL of KASP 2× Reaction Mix, 0.056 µL of KASP primer mix, and 2.5 µL of genomic DNA at 30 ng/µL. Fluorescence detection of PCR products was performed with PHERAstar (BMG LABTECH, Ortenberg, Germany). KlusterCaller software (LGC Genomics, Beverly, MA, USA) was used to analyze the fluorescence results [38]. The KASP markers were remapped in the RIL population to integrate the maps. To further validate these major QTL in different genetic backgrounds, the developed KASP markers were used to trace the target QTL in a panel of 64 cultivars and 95 advanced breeding lines.
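For bench planning, the per-reaction volumes given above can be scaled to a full plate. The sketch below is only an arithmetic aid: the 10% pipetting overage and the function name are assumptions, not part of the published protocol.

```python
# Per-reaction volumes (µL) as stated in the protocol
PER_REACTION_UL = {"KASP 2x Reaction Mix": 2.5, "KASP primer mix": 0.056}
DNA_VOLUME_UL = 2.5
DNA_CONC_NG_PER_UL = 30.0


def master_mix(n_reactions, overage=0.10):
    """Total volume (µL) of each master-mix component, with an assumed overage."""
    if n_reactions <= 0:
        raise ValueError("n_reactions must be positive")
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


plate = master_mix(384)  # one 384-well plate
dna_per_reaction_ng = DNA_VOLUME_UL * DNA_CONC_NG_PER_UL
print(plate)
print(f"DNA per reaction: {dna_per_reaction_ng:.0f} ng")
```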

Phenotypic Variation and Correlation Analysis
Significant differences (p < 0.01) in GW and GL between YM12 and YZ1 were observed from all environments as well as the combined BLUE dataset (Table 1, Figure 1a). Broad-sense heritabilities ($H_B^2$) for GW and GL were 0.71 and 0.79, respectively (Table 1). The phenotype values of all traits based on the BLUE dataset of all trials for the YM12/YZ1 population displayed a continuous distribution and obvious transgressive segregation, suggesting the existence of polygenic inheritance (Table 1, Figure 1b). The datasets of GW and GL in all environments and BLUE values were employed to assess their correlations (Table S1). GW showed very significant positive correlations with GL from the same trial (p < 0.01) and significant positive correlations with GL from different trials (p < 0.05) (Table S1). BLUE datasets of GW and GL also had very significant positive correlations with each other (p < 0.01) (Table S1).

Effect of Major QTL QGW.yz.2D/QGL.yz.2D, QGW.yz.4B/QGL.yz.4B, and QGL-yz-5A on GNS and TGW in the Mapping Population
AX109059601, AX108819885, and AX110003317, which flank the peak intervals of the major QTL QGW.yz.2D/QGL.yz.2D, QGW.yz.4B/QGL.yz.4B, and QGL-yz-5A, respectively, were used to evaluate the effects of these regions on GNS and TGW with the BLUE dataset from four environments. The YZ1 allele at QGW.yz.2D/QGL.yz.2D had an extremely significant negative effect on GNS (p < 0.001), while it increased TGW by 5.77% (Figure 2). The YM12 allele at QGW.yz.4B/QGL.yz.4B increased TGW by 4.02%, and the YM12 allele at QGL-yz-5A increased TGW by 2.10% (Figures 3 and 4). These two alleles did not have a significant negative effect on GNS (Figures 3 and 4).
Figure legend: In allelic effects (a), A and B indicate the lines with the alleles from YM12 and YZ1, respectively; *** represents significance at p < 0.001; ns represents non-significance. In the genetic maps (b), the names of the markers are listed on the right side of the corresponding linkage group, and their genetic positions and QTL names are shown on the left (cM). The red rectangles at the chromosomes represent QTL regions. Blue blocks represent QTLs for grain width, and green blocks represent QTLs for grain length. In the expression patterns of genes (d), the red arrows represent the genes that were more highly expressed in grain or both in grain and whole endosperm than in root, leaf, stem, and spike. GW, grain width; GL, grain length; GNS, grain number per spike; TGW, thousand-grain weight. YM12, Yangmai 12; YZ1, Yanzhan 1.

Additive Effects of the Major QTL
The major QTL for GW and GL showed additive effects: lines without any of the positive alleles had the lowest GW and GL (p < 0.01) (Figure 5). For GW, lines with one of the positive alleles had 1.18-2.66% higher GW, and lines with both positive alleles had 4.44% higher GW (p < 0.01) (Figure 5a). For GL, lines with a positive allele at only one of the three loci had 3.73%, 3.27%, or 1.56% higher GL; lines with positive alleles at two loci had 6.22-8.56% higher GL; and lines with positive alleles at all three loci had 10.26% higher GL (Figure 5b).
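The comparison underlying these percentages can be sketched in Python: lines are grouped by which loci carry the positive allele, and each group mean is expressed relative to the group carrying none. The allele coding follows the figure legend (A = YM12, B = YZ1) and the positive alleles follow the text; the function name and the GL values are hypothetical, and no significance test is included.

```python
from collections import defaultdict
from statistics import mean

# Positive allele per locus as reported: 2D from YZ1 (B); 4B and 5A from YM12 (A)
POSITIVE = {"2D": "B", "4B": "A", "5A": "A"}


def percent_gain_by_combo(lines, loci):
    """Percent change in the trait mean of each positive-allele combination
    relative to lines carrying no positive allele."""
    groups = defaultdict(list)
    for genotype, value in lines:
        combo = tuple(locus for locus in loci if genotype[locus] == POSITIVE[locus])
        groups[combo].append(value)
    if () not in groups:
        raise ValueError("no line without positive alleles; baseline undefined")
    baseline = mean(groups[()])
    return {combo: 100 * (mean(vals) - baseline) / baseline
            for combo, vals in groups.items()}


# Hypothetical RILs: (genotype at each locus, grain length in mm)
lines = [
    ({"2D": "A", "4B": "B", "5A": "B"}, 6.40),
    ({"2D": "A", "4B": "B", "5A": "B"}, 6.60),
    ({"2D": "B", "4B": "B", "5A": "B"}, 6.76),
    ({"2D": "B", "4B": "A", "5A": "A"}, 7.15),
]
gains = percent_gain_by_combo(lines, ["2D", "4B", "5A"])
for combo, gain in sorted(gains.items(), key=lambda kv: len(kv[0])):
    print(combo or "none", f"{gain:+.2f}%")
```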

Development and Evaluation of Breeder-Friendly KASP Markers for Major QTL and Validation of These QTL in 159 Wheat Cultivars/Lines
Among the QTL detected for GW and GL, QGW-yz-2D/QGL-yz-2D, QGW-yz-4B/QGL-yz-4B, and QGL-yz-5A had larger effects on GW or GL. We converted the SNP markers AX109059601, AX108819885, and AX110003317, which flanked the peak intervals of QGW-yz-2D/QGL-yz-2D, QGW-yz-4B/QGL-yz-4B, and QGL-yz-5A, into the KASP markers KASP_2D, KASP_4B, and KASP_5A (Figure 6a-c and Table S2). A collection of 159 wheat cultivars and lines was evaluated for GW and GL and surveyed using the three KASP markers. The 159 wheat cultivars/lines and their genotypes and phenotypes are listed in Table S3. To compare the additive allelic effects of "QGW-yz-2D + QGW-yz-4B" on GW and "QGL-yz-2D + QGL-yz-4B + QGL-yz-5A" on GL, all cultivars/lines in the panel were divided into four and eight groups, respectively, based on their allele combinations. For GW, the group carrying the YZ1 allele at QGW-yz-2D and the YM12 allele at QGW-yz-4B showed 7.12% higher GW than cultivars/lines without positive alleles. The group carrying the YZ1 allele at QGW-yz-2D and the YZ1 allele at QGW-yz-4B had similar GW to those with the YM12 allele at QGW-yz-2D or the YM12 allele at QGW-yz-4B, increasing GW by 4.64-5.26% (Figure 6d). GL of the group with the three positive alleles was 6.95 mm, 6.60% longer than that of cultivars/lines without positive alleles (p < 0.05) (Figure 6e). Lines with a positive allele for GL at one of QGL-yz-2D, QGL-yz-4B, or QGL-yz-5A had 1.84-2.91% longer GL (Figure 6e). The groups that carried two positive alleles had 3.22-5.37% longer GL than cultivars/lines without positive alleles (Figure 6e).
Figure legend: In allelic effects (a), A and B indicate the lines with the alleles from YM12 and YZ1, respectively; * represents significance at p < 0.05; ns represents non-significance. In the genetic maps (b), the names of the markers are listed on the right side of the corresponding linkage group, and their genetic positions and QTL names are shown on the left (cM). The red rectangles at the chromosomes represent QTL regions. Blue blocks represent QTLs for grain width, and green blocks represent QTLs for grain length. In the expression patterns of genes (d), the red arrows represent the genes that were more highly expressed in grain or both in grain and whole endosperm than in root, leaf, stem, and spike. GW, grain width; GL, grain length; GNS, grain number per spike; TGW, thousand-grain weight. YM12, Yangmai 12; YZ1, Yanzhan 1.
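The four GW groups and eight GL groups used in the validation panel follow from two alleles at two or three loci (2^2 = 4 and 2^3 = 8 combinations). A short sketch of that enumeration is given below; the function name is hypothetical.

```python
from itertools import product


def allele_combinations(loci, alleles=("YM12", "YZ1")):
    """Enumerate every combination of parental alleles across the given loci."""
    return [dict(zip(loci, combo)) for combo in product(alleles, repeat=len(loci))]


gw_groups = allele_combinations(["QGW-yz-2D", "QGW-yz-4B"])
gl_groups = allele_combinations(["QGL-yz-2D", "QGL-yz-4B", "QGL-yz-5A"])
print(len(gw_groups), len(gl_groups))
for group in gw_groups:
    print(group)
```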

Potential Candidate Genes for QGW-yz-2D/QGL-yz-2D, QGW-yz-4B/QGL-yz-4B, and QGL-yz-5A
We attempted to predict potential candidate genes for the three major QTL. In the QGW-yz-2D/QGL-yz-2D interval on the CS genome, there are 59 annotated high-confidence genes (Table S4). Expression pattern analyses showed that 16 genes were expressed in grain and two of them were expressed more highly in grain than in root, leaf, stem, and spike (Figure 2d). Gene annotation analysis indicated that TraesCS2D03G0934600 and TraesCS2D03G0935700 likely encode an ethylene-responsive transcription factor and an oleosin, respectively (Table S4). In the QGW-yz-4B/QGL-yz-4B interval on the CS genome, there are 66 annotated high-confidence genes (Table S4). Thirty-one genes were expressed in grain and only TraesCS4B03G0630100 was expressed more highly in grain than in root, leaf, stem, and spike. However, TraesCS4B03G0630100 has no annotation (Figure 3d, Table S4). In the QGL-yz-5A interval on the CS genome, there are 59 annotated high-confidence genes (Table S4). Twenty-four genes were expressed in grain and eight of them had higher expression levels in grain than in root, leaf, stem, and spike (Figure 4d). Among the eight genes, TraesCS5A03G0044100 likely encodes an osmotin protein, while TraesCS5A03G0043700, TraesCS5A03G0044700, TraesCS5A03G0044900, TraesCS5A03G0045000, TraesCS5A03G0045100, and TraesCS5A03G0045200 probably encode thaumatin proteins. Moreover, TraesCS5A03G0045900 has no annotation (Table S4).

Pyramiding of Major QTL for GW and GL Improvement
Pyramiding the favorable alleles from elite cultivars/lines is an effective method to obtain cultivars with improved yield [41,51]. Utilizing accessions derived from a different ecological area can be an effective way for breeders to broaden the genetic diversity of local breeding materials. Dissection of the effects of the major QTL on the corresponding traits in the mapping population indicated that the YZ1 allele at QGW.yz.2D/QGL.yz.2D and the YM12 alleles at QGW.yz.4B/QGL.yz.4B and QGL-yz-5A had positive effects on TGW as well as on GW or GL. The significant additive effect indicated that pyramiding the major loci with the developed KASP markers is an applicable strategy to optimize grain size in wheat breeding. YZ1 and YM12 are two elite cultivars released in YRVWZ and MLYVWZ, respectively. The KASP markers developed in this study could be used to efficiently pyramid these loci through MAS. In the validation panel, 71 cultivars/lines are suitable for planting in MLYVWZ, and 88 cultivars/lines are suitable for planting in YRVWZ (Table S3). For QGW.yz.2D/QGL.yz.2D, only 29.58% of the cultivars/lines from MLYVWZ have positive alleles, while 81.82% of the cultivars/lines from YRVWZ carry positive alleles. For QGW.yz.4B/QGL.yz.4B, 46.48% of the cultivars/lines from MLYVWZ have positive alleles, while 75% of the cultivars/lines from YRVWZ carry positive alleles. For QGL-yz-5A, 54.93% of the cultivars/lines from MLYVWZ have positive alleles, while 69.32% of the cultivars/lines from YRVWZ carry positive alleles. These results suggest that the three positive alleles have been used more frequently in wheat breeding programs in YRVWZ, which is consistent with the finding that the average GW and GL of the cultivars/lines from YRVWZ are 3.40 mm and 6.86 mm, 1.20% and 2.85% higher than those from MLYVWZ, respectively. Because of its negative effect on GNS, the YZ1 allele at QGW.yz.2D/QGL.yz.2D occurs mainly in cultivars/lines from YRVWZ and has been difficult to use in wheat breeding programs in MLYVWZ. Further fine-mapping and map-based cloning of QGW.yz.2D/QGL.yz.2D would facilitate better use of this QTL in wheat breeding. Only 12.58% of the 159 cultivars/lines harbor all three positive alleles simultaneously, indicating that combining the three favorable alleles has great potential for breeding programs.
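The allele frequencies quoted above can be checked with simple arithmetic. In the sketch below, the panel sizes (71 and 88) and the percentages come from the text, whereas the carrier counts are back-calculated from those percentages and are therefore an assumption; the actual genotypes are in Table S3.

```python
PANEL_SIZE = {"MLYVWZ": 71, "YRVWZ": 88}

# Carrier counts back-calculated from the reported percentages (assumption)
POSITIVE_COUNTS = {
    "QGW.yz.2D/QGL.yz.2D": {"MLYVWZ": 21, "YRVWZ": 72},
    "QGW.yz.4B/QGL.yz.4B": {"MLYVWZ": 33, "YRVWZ": 66},
    "QGL-yz-5A": {"MLYVWZ": 39, "YRVWZ": 61},
}


def frequency_percent(count, total):
    """Share of cultivars/lines carrying the positive allele, in percent."""
    return round(100 * count / total, 2)


frequencies = {
    qtl: {zone: frequency_percent(n, PANEL_SIZE[zone]) for zone, n in by_zone.items()}
    for qtl, by_zone in POSITIVE_COUNTS.items()
}
for qtl, by_zone in frequencies.items():
    print(qtl, by_zone)

# 20 of 159 lines carrying all three positive alleles matches the reported 12.58%
all_three = frequency_percent(20, sum(PANEL_SIZE.values()))
print(all_three)
```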

Potential Candidate Genes for QGW-yz-2D/QGL-yz-2D, QGW-yz-4B/QGL-yz-4B, and QGL-yz-5A
Among the genes in the intervals of the major QTL identified in the current study, a total of eleven genes showed significantly higher expression levels in grain than in root, leaf, stem, and spike, indicating that they are likely associated with grain growth and developmental processes. Gene annotation and ortholog analysis showed that TraesCS2D03G0934600 acts as a transcriptional activator, binding to the GCC-box pathogenesis-related promoter element and is involved in the regulation of gene expression by stress factors and by components of stress signal transduction pathways in Arabidopsis thaliana [52,53]. TraesCS2D03G0935700 might have a structural role to stabilize the lipid body during desiccation of the seed by preventing coalescence of the oil and probably interacting with both lipid and phospholipid moieties of lipid bodies in Rye brome (https://www.uniprot.org/uniprot/Q96543) (accessed on 2 March 2022). TraesCS5A03G0044100 encodes the osmotin protein OSML13 involved in response to stress, a change in state or activity of a cell or an organism as a result of some stressful conditions (https://www.uniprot.org/uniprot/P50701) (accessed on 2 March 2022). TraesCS5A03G0043700 encodes thaumatin protein in Actinidia deliciosa (kiwi). TraesCS5A03G0044700, TraesCS5A03G0044900, TraesCS5A03G0045000, TraesCS5A03G0045100, and TraesCS5A03G0045200 are associated with the thaumatin protein in rice [54]. Os12g0629600 is the ortholog gene of these five candidate genes [51], indicating that these five genes are homologous and have the conserved site. Moreover, TraesCS4B03G0630100 and TraesCS5A03G0045900 do not have annotation in Chinese Spring genome (http://wheat.cau.edu.cn/TGT/ann_db/) (accessed on 5 March 2022). The function of these candidate genes in the growth of grain would be elucidated by cloning and gene editing.

Conclusions
In the current study, four QTL for GW and four QTL for GL were identified in the YM12/YZ1 population. Among them, QGW.yz.2D/QGL.yz.2D, QGW.yz.4B/QGL.yz.4B, and QGL-yz-5A were three major QTL regions. QGL.yz.2D, QGW.yz.4B/QGL.yz.4B, and QGL-yz-5A are likely novel QTL. The additive effects of the positive alleles of the major QTL on the corresponding traits in the validation population were significant. Spatial expression patterns showed that eleven potential candidate genes in the intervals of the major QTL were expressed significantly more highly in grain than in root, leaf, stem, and spike. These results lay a foundation for further fine-mapping and map-based cloning of these major QTL for grain size. In addition, the three breeder-friendly markers KASP_2D, KASP_4B, and KASP_5A for QGW.yz.2D/QGL.yz.2D, QGW.yz.4B/QGL.yz.4B, and QGL-yz-5A, respectively, would be useful for marker-assisted selection in wheat breeding programs.

Author Contributions:
W.H., formal analysis, writing-original draft, and writing-review and editing; S.C., conceptualization, project administration, resources, and supervision; S.L., D.Z., J.J. and W.X., investigation, methodology, and data curation. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement:
The data that support the findings of this study are available in the main body of the paper.

Conflicts of Interest:
The authors declare no conflict of interest.