Simple Summary
Rice grain morphology, especially grain shape, is one of the major determinants of yield potential and is controlled by an interconnected gene regulatory network. It also affects rice processing and quality, which are critical for market competitiveness. To achieve higher and more stable rice yields, it is crucial to identify and utilize favorable genetic resources associated with grain shape, and to understand the underlying mechanisms of these genes. Therefore, investigating the genetic regions responsible for rice grain shape is of great importance for both scientific research and rice breeding. In this study, 231 different rice varieties were grown under two different nitrogen conditions, and their grain shapes were measured at maturity. The grain shape data were then correlated with the varieties’ genomic information to identify regions controlling grain shape, along with potential genes and their variants in those regions. Our findings highlight valuable rice accessions and beneficial gene variants, enabling breeders to more efficiently generate rice lines exhibiting the desired grain shapes.
Abstract
The morphology of rice grains represents one of the most vital agronomic characteristics, significantly impacting both grain productivity and the subsequent milling and nutritional quality of the crop. A comprehensive understanding of the genetic basis and molecular drivers of grain shape is vital for the targeted breeding of high-performance rice lines with consistent yield stability. To pinpoint the genomic regions influencing grain dimensions, we conducted a genome-wide association analysis across a panel of 231 distinct rice accessions, focusing on the discovery of loci associated with length and width. Our analysis revealed four consistent quantitative trait loci (QTLs) distributed across chromosomes 3, 4, and 11. Notably, grain length was associated with qGL3.1, qGL3.2, and qGL11. The first two were co-localized with GS3 and SMG3, respectively, whereas qGL11 likely constitutes a novel locus. One QTL, qGW4, which governs grain width, was found to co-localize with the gene OsOFP14. Haplotype analysis further revealed that the characteristic haplotypes of the candidate genes for qGL3.1, qGL3.2, and qGW4 were enriched in eight germplasm accessions (including Newbonnet, Skybonnet, and Lemont), all of which exhibit a slender-grain phenotype. This finding suggests that the specific combination of these characteristic haplotypes is a common genetic signature of slender-grain rice, serving as a potential gene combination for the targeted improvement of rice grain shape. Our results reveal valuable QTLs and candidate genes and highlight specific germplasm resources that can be readily applied in marker-assisted breeding to improve rice grain shape.
1. Introduction
Grain shape in rice (Oryza sativa) is commonly quantified by grain length, width, and thickness [1]. Variation in these dimensions influences per-plant yield and is also tightly linked to milling performance and the commercial value of rice products [2]. Moreover, grain shape is associated with rice quality: differences in grain size can alter starch accumulation and the distribution of storage proteins, which in turn affects the texture and palatability of cooked rice. Therefore, optimizing grain shape contributes not only to yield improvement but also to quality enhancement and value addition [3]. Genetically, grain shape is a polygenic quantitative trait controlled by multiple loci. To date, numerous grain-shape-related QTLs/genes have been reported [4,5,6], including over 600 QTLs, and nearly 200 loci/genes across all 12 chromosomes have been cloned and functionally validated [7]. Collectively, these advances have deepened our understanding of the genetic regulatory network underlying grain morphology and have provided useful targets and resources for breeding. Continued identification of elite germplasm carrying favorable haplotypes at multiple grain-shape loci, together with mechanistic investigation, should further expand genetic resources for breeding high-yield, high-quality rice.
The genetic control of rice grain shape reflects the coordination of several signaling modules, among which heterotrimeric G proteins constitute a central node. By alternating between guanosine triphosphate (GTP)-bound and guanosine diphosphate (GDP)-bound states through guanosine triphosphatase (GTPase) activity, these proteins transmit signals downstream of membrane-associated pathways and link signaling to metabolic outputs. A well-studied example is GS3, which was first mapped by Fan et al. [8]. GS3 has since become a prominent target for rice breeding. Beyond the role of GS3, Sun and colleagues [9] established that the functional divergence among three rice G protein γ subunits (DEP1, GGC2 and GS3) arises from sequence variations in their C-terminal domains. The C-terminal regions of DEP1 and GGC2 are crucial for G protein signal transduction, given that their biological activity is contingent upon forming complexes with both the β subunit (RGB1) and the α subunit (RGA1) of the G protein. Although GS3 does not directly promote grain enlargement, it restricts longitudinal grain growth by outcompeting DEP1 or GGC2 for binding to the Gβ subunit, which ultimately limits grain size. Furthermore, the involvement of phytohormones, specifically brassinosteroids (BRs) and cytokinins (CKs), underscores the critical integration of hormonal pathways in determining grain morphology.
A further noteworthy gene, SMG3, was uncovered through the analysis of a cross between the japonica accession M494 and the indica line Zhong 9B. Research indicates that SMG3 exerts a negative regulatory effect on the longitudinal and transverse dimensions of rice grains. Subsequent analysis by Li and colleagues [10] further investigated SMG3 and revealed that the protein it encodes is homologous to the Arabidopsis ubiquitin-conjugating enzyme UBC32. This protein interacts with DGS1 (another grain-size regulatory factor) and facilitates the ubiquitination of BR receptors, consequently modulating the BR signaling cascade, which in turn determines the final dimensions of the grain. In addition to the mechanisms described above, other molecular factors contribute to the development of rice grain shape. Overall, the regulation of rice grain development is a multifaceted process involving various signaling pathways that collectively influence final grain shape.
In studies of QTLs related to rice grain morphology, most loci are still mapped to wide chromosome segments, resulting in low resolution. Constrained by technical bottlenecks, previous studies have focused on the analysis of single genes and their regulatory networks, whereas the combined effects of different favorable haplotypes have rarely been explored. It is therefore of both theoretical and practical value to identify additional QTLs regulating rice grain morphology. In addition, breeders can use cloned grain morphology regulatory genes to screen germplasm resources in a targeted manner, mine favorable haplotypes, and analyze how pyramiding major loci affects grain morphology, thereby providing theoretical support and a technical path for the molecular improvement of grain morphology-related traits.
To advance these efforts, a particularly valuable yet underexplored approach is to integrate the discovery of novel loci with the systematic evaluation of haplotype combinations—encompassing both established genes and new candidates—within a single, diverse germplasm panel. Such an integrated strategy can directly bridge genetic discovery with breeding application by pinpointing accessions that harbor optimal combinations of alleles. To address this, we implemented a GWAS approach on a panel of 231 rice varieties representing a broad spectrum of global genetic diversity. Moving beyond mere locus identification, we placed special emphasis on haplotype analysis of candidate regions and, crucially, on evaluating the combined effects (pyramiding) of favorable haplotypes. We anticipated that this integrated approach would not only help uncover novel genetic factors but, more importantly, identify specific germplasm accessions carrying elite haplotype combinations for grain shape, thereby providing practical genetic resources for breeding.
2. Materials and Methods
2.1. Genetic Resources and Experimental Layout
In the present research, we utilized 231 rice germplasm accessions collected worldwide, classified into two subspecies: xian (XI, indica) and geng (GJ, japonica). These accessions originated from 23 countries and regions, mainly in tropical and subtropical zones (Table S1). The field experiment was conducted in Qinglian, Sichuan Province, China, in May 2024. Two nitrogen (N) treatments were applied: low nitrogen (0 kg ha−1, denoted R1) and medium nitrogen (60 kg ha−1, denoted R2), using a compound fertilizer from Jinzhengda (15% each of N, P2O5 and K2O). For every accession, a three-row plot was established, with each row consisting of ten individual plants. Phosphorus fertilizer (P2O5, 60 kg ha−1) was applied as a basal dose during soil preparation before transplanting. The planting spacing was 15.5 cm × 20 cm, and all other field management practices followed standard conditions.
2.2. Phenotypic Evaluation
After maturation, five representative and independent plants within each accession were randomly selected as biological replicates and harvested individually. Following the approach described by Si et al. [11], the harvested grains were naturally air-dried prior to the evaluation of grain shape-related traits. For each individual plant sample, about 50 plump and uniform grains were randomly selected as a measurement sample. The longitudinal and transverse dimensions, specifically grain length (GL) and width (GW), were quantified for each grain using a Wanshen digital seed imaging system (Model SC-G, Wseen Technology Co., Ltd., Hangzhou, China; http://www.wseen.com/, data retrieved on 4 November 2024) (Figure S1). The final phenotypic value for each accession under a specific nitrogen treatment was calculated as the mean of the trait data obtained from the five biological replicates. All operations followed the instrument’s user manual.
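As a minimal sketch of the averaging step described above, the Python snippet below derives one accession-level value from per-plant grain measurements. It assumes that grains are first averaged within each plant and that the plant means are then averaged; the text does not state this order explicitly. The function name `accession_value` and the example numbers are illustrative only.

```python
from statistics import mean


def accession_value(plants):
    """Return the accession-level trait value (mm).

    `plants` is a list with one inner list per biological replicate (plant);
    each inner list holds the per-grain measurements for that plant.
    Assumption: average within each plant first, then across plants.
    """
    if not plants or any(len(grains) == 0 for grains in plants):
        raise ValueError("every plant needs at least one measured grain")
    plant_means = [mean(grains) for grains in plants]
    return mean(plant_means)


# Illustrative (made-up) grain lengths for two plants of one accession.
example = [[8.0, 8.2], [8.4, 8.6]]
print(round(accession_value(example), 2))
```

Averaging within plants first gives each plant equal weight even when the number of measured grains differs slightly between plants.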
2.3. Quantitative Assessment of Experimental Data
2.3.1. Phenotypic Evaluation and Data Modeling
The raw grain length and grain width data for all accessions were processed using SPSS (version 22.0). To characterize the distributional features of grain shape traits, we performed descriptive statistical analyses on each trait using the dedicated descriptive statistics function in SPSS, with key metrics including the minimum, maximum, mean, standard error, coefficient of variation, skewness, and kurtosis calculated for each trait. Correlations among rice grain size-related phenotypes were then assessed in RStudio (v2024.04.2+764).
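The descriptive metrics listed above can be sketched in a few lines of Python. This is an illustration rather than the software's implementation: it uses the common bias-corrected sample estimators of skewness and excess kurtosis, and it is an assumption that these match the statistics package's output exactly. The function name `describe` is hypothetical.

```python
import math
from statistics import mean, stdev


def describe(values):
    """Descriptive statistics for one trait across accessions."""
    n = len(values)
    if n < 4:
        raise ValueError("at least four observations are needed for kurtosis")
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("all values are identical; skewness is undefined")
    z3 = sum(((x - m) / s) ** 3 for x in values)
    z4 = sum(((x - m) / s) ** 4 for x in values)
    # Bias-corrected sample skewness and excess kurtosis (assumed estimators).
    skewness = n / ((n - 1) * (n - 2)) * z3
    kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {
        "min": min(values),
        "max": max(values),
        "mean": m,
        "se": s / math.sqrt(n),       # standard error of the mean
        "cv_percent": 100 * s / m,    # coefficient of variation
        "skewness": skewness,
        "kurtosis": kurtosis,
    }
```

Under these estimators, absolute skewness and kurtosis values below 1 are a common rule of thumb for approximate normality.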
2.3.2. Genetic Structure Analysis
High-density SNP genotypic data for the 231 accessions were extracted from the 3K Rice Genome Project (3K-RGP), a public rice genetic variation database, using its high-density set of approximately 4.8 million single-nucleotide polymorphisms (SNPs) [12]. The raw data were filtered with PLINK (version 1.9) [13] to obtain the final SNP panel: markers with a missing rate > 10% were removed, as were markers that did not meet the allele frequency requirements (minor allele frequency, MAF, > 5%; major allele frequency < 95%). Population structure was inferred via ADMIXTURE (1.3.0) [14], and the derived ancestry component matrix (q) was retained for downstream analyses. Kinship (K) among the 231 accessions was estimated using TASSEL (version 5.0) [15], with all high-quality SNPs used to construct the pairwise relationship matrix. Furthermore, principal component analysis (PCA) was conducted to supplement the assessment of population genetic stratification, and the resulting PCA scores, together with the kinship matrix (K), were incorporated into the subsequent association analysis to control for population structure and relatedness.
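The marker-level quality filter can be illustrated with a short Python sketch. It is not the PLINK implementation; it only reproduces the stated thresholds (missing rate no greater than 10%, MAF above 5%) for a single biallelic SNP. The genotype coding (0, 1, or 2 copies of the alternate allele, `None` for a missing call) and the function name `snp_passes` are assumptions made for the example.

```python
def snp_passes(genotypes, max_missing=0.10, min_maf=0.05):
    """Return True if one biallelic SNP passes the missing-rate and MAF filters.

    `genotypes` holds one entry per accession: 0, 1, or 2 copies of the
    alternate allele, or None for a missing call (assumed coding).
    """
    if not genotypes:
        return False
    called = [g for g in genotypes if g is not None]
    missing_rate = (len(genotypes) - len(called)) / len(genotypes)
    if not called or missing_rate > max_missing:
        return False
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)  # minor allele frequency
    return maf > min_maf


# Illustrative (made-up) genotype vectors for 20 accessions.
print(snp_passes([0] * 18 + [2] * 2))   # common enough, no missing calls
print(snp_passes([0] * 20))             # monomorphic marker
```

For a biallelic marker, requiring MAF > 5% already implies a major allele frequency below 95%, so a single frequency check is sufficient in this sketch.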
2.3.3. Identification of Grain-Shape Loci via GWAS
Using the filtered high-quality SNP dataset, we performed a genome-wide association analysis by implementing a mixed linear model (MLM) in the TASSEL software (5.0) platform. We set a significance threshold of p < 1 × 10−4. If two or more significant SNPs were found within 200 kb of each other, they were considered to represent a single QTL region. For each identified QTL region, we compared the R2 values of all significant SNPs and used the highest R2 as the explained variance (contribution rate) for that QTL. Manhattan and Q–Q plots were produced in R with the qqman package (4.4.1), while linkage disequilibrium (LD) patterns were visualized using LDBlockShow (3.1.1).
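The rule for collapsing significant SNPs into QTL regions can be sketched as follows. The sketch assumes that SNPs are chained: a SNP joins the current region when it lies within 200 kb of the previous significant SNP on the same chromosome. Whether the 200 kb distance was applied in exactly this way is an assumption, and the record layout (`chrom`, `pos`, `p`, `r2`) and function name `group_qtls` are hypothetical.

```python
def group_qtls(snps, p_threshold=1e-4, max_gap=200_000):
    """Merge significant SNPs into QTL regions and report the highest R2.

    `snps` is a list of dicts with keys "chrom", "pos" (bp), "p", and "r2".
    Each region records its SNP count so that callers can decide how to
    treat regions supported by a single SNP.
    """
    significant = sorted(
        (s for s in snps if s["p"] < p_threshold),
        key=lambda s: (s["chrom"], s["pos"]),
    )
    regions = []
    for s in significant:
        last = regions[-1] if regions else None
        if (last is not None and last["chrom"] == s["chrom"]
                and s["pos"] - last["end"] <= max_gap):
            # Extend the current region and keep the largest R2 seen so far.
            last["end"] = s["pos"]
            last["n_snps"] += 1
            last["r2"] = max(last["r2"], s["r2"])
        else:
            regions.append({"chrom": s["chrom"], "start": s["pos"],
                            "end": s["pos"], "n_snps": 1, "r2": s["r2"]})
    return regions
```

Regions with `n_snps >= 2` correspond to the "two or more significant SNPs" criterion; the `r2` field is the contribution rate assigned to the region.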
2.3.4. Identification of Candidate Genes and Haplotype Characterization
Based on gene annotations from the Rice Annotation Project Database (RAP-DB), candidate regions for each QTL were delineated as a 100 kb window flanking the peak SNP. Hypothetical proteins, transposable elements, and pseudogenes were excluded from consideration [16]. Within these intervals, any genes homologous or functionally similar to known grain-shape genes were initially flagged as candidate genes (using the Rice Expression Database for guidance). Subsequently, we retrieved SNP data from the Rice Genomic Variation and Functional Annotation Database (IC4R Varmap) and used PLINK to extract all non-synonymous SNPs within each candidate gene, including variants in the 2 kb upstream promoter region, coding exons, introns, and the 1 kb downstream region. To ensure sufficient polymorphism, SNP loci with a major allele frequency >80% or a missing data rate >5% were discarded. We then conducted a haplotype analysis of the remaining variant sites using Haploview 4.2 [17], defining haplotypes according to LD patterns. After performing multiple comparisons of trait values among the different haplotypes, genes exhibiting significant phenotypic variation among haplotypes were retained as the final set of candidate genes [18].
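To make the haplotype comparison concrete, the Python sketch below groups accessions by their allele combination at the retained variant sites of one gene and summarizes the trait for each major haplotype. It is a simplification: haplotypes are defined here by exact allele strings rather than by LD blocks, no multiple-comparison test is performed, and the minimum group size `min_count` is a hypothetical parameter that does not come from the study.

```python
from collections import defaultdict
from statistics import mean

VALID_ALLELES = {"A", "C", "G", "T"}


def haplotype_means(alleles, trait, min_count=5):
    """Summarize a trait by haplotype for one candidate gene.

    `alleles` maps accession -> tuple of allele calls at the variant sites.
    `trait` maps accession -> phenotypic value (for example grain width, mm).
    Accessions with any missing or ambiguous call are skipped.
    Returns {haplotype string: (number of accessions, mean trait value)}.
    """
    groups = defaultdict(list)
    for accession, calls in alleles.items():
        if accession not in trait:
            continue
        if not all(c in VALID_ALLELES for c in calls):
            continue
        groups["".join(calls)].append(trait[accession])
    return {hap: (len(vals), mean(vals))
            for hap, vals in groups.items() if len(vals) >= min_count}
```

The per-haplotype means returned here would then be the input to a multiple-comparison procedure of the kind described above.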
2.3.5. Pyramiding Analysis of Excellent Haplotypes of Grain Shape Genes in Germplasms
Initially, IBM SPSS Statistics 22 was used to compute the mean and standard deviation of grain length and grain width across the germplasm panel. Subsequently, multiple comparison analyses were conducted to evaluate differences in these traits among distinct haplotype combinations of grain-shape genes, with the aim of identifying accessions exhibiting favorable grain morphology.
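The grouping behind this pyramiding analysis can be sketched in Python as follows. Each accession is assigned to a combination of haplotypes at the three genes, and grain length and width are averaged within each combination. The record layout and the function name `combination_summary` are assumptions made for the example, and the significance testing itself is not reproduced.

```python
from collections import defaultdict
from statistics import mean


def combination_summary(records):
    """Average grain length (GL) and width (GW) per haplotype combination.

    Each record is a dict with the haplotype labels for "GS3", "SMG3",
    and "OsOFP14" plus the measured "GL" and "GW" values (mm).
    """
    groups = defaultdict(list)
    for r in records:
        key = (r["GS3"], r["SMG3"], r["OsOFP14"])
        groups[key].append((r["GL"], r["GW"]))
    summary = {}
    for key, values in groups.items():
        summary[key] = {
            "n": len(values),
            "GL": mean([v[0] for v in values]),
            "GW": mean([v[1] for v in values]),
        }
    return summary
```

Combinations that are absent from the panel simply do not appear in the result, which makes gaps in the available diversity easy to spot.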
3. Results
3.1. Phenotypic Characterization
The 231 rice accessions used in this study were collected from 23 regions (Table 1). Evaluation of grain length and width across the germplasm population indicated that under the R1 environment, the ranges were 6.30–11.81 mm for length and 2.15–3.76 mm for width, and under the R2 environment the ranges were 6.11–11.40 mm and 2.11–3.71 mm, respectively (Table 2). The coefficients of variation for both traits were similar in the two environments, suggesting comparable population distributions and little overall effect of the nitrogen treatment. These results suggest that grain shape in this germplasm population is genetically stable and only slightly influenced by the different nitrogen levels. Under the R1 condition, grain length had kurtosis and skewness values of 0.05 and 0.21, respectively (mean length 8.47 mm), and grain width had kurtosis and skewness values of 0.68 and 0.12 (mean width 2.90 mm). Under R2, the kurtosis and skewness of grain length were 0.12 and 0.21 (mean length 8.44 mm), and those of grain width were 0.55 and 0.20 (mean width 2.89 mm). In both environments, grain length and grain width exhibited absolute skewness and kurtosis values below 1, indicating that these grain shape traits follow an approximately normal distribution (consistent with their quantitative, polygenic control). The trait distributions in this rice germplasm population are therefore suitable for GWAS. The relationships among grain size-related traits in the natural population of 231 rice accessions are shown in Figure 1. Grain length and grain width were strongly and negatively correlated in both the R1 and R2 environments. Taken together, these results support a genome-wide association analysis of grain-shape traits.
Table 1.
Distribution of rice germplasm accessions used in this study.
Table 2.
Statistical Characterization of Rice Grain Shape Traits.
Figure 1.
Correlation analysis of grain size traits in the 231 rice accessions. The color gradient corresponds to the strength and direction of Pearson correlation coefficients (ranging from −1.0 to 1.0; blue indicates positive correlation, red indicates negative correlation), and “***” denotes a statistically significant correlation at the p < 0.001 level. R1 and R2 represent two independent experimental environments; GL represents grain length, and GW represents grain width.
3.2. Population Structure Analysis
Population structure analysis revealed the genetic stratification characteristics of the 231 rice germplasms. Cross-validation error assessment showed that the lowest error value was obtained when the number of ancestral groups (K) was set to 6 (Figure 2), indicating that these rice germplasms can be optimally divided into six distinct ancestral groups. This stratification result is visually presented in the ADMIXTURE cluster plot (Figure 2), where each color represents one ancestral group, and the color proportion in each individual germplasm reflects the genetic component contribution of the corresponding ancestral group.
Figure 2.
Admixture analyses of the population structure of 231 rice germplasms. Cluster analysis results of population genotypes, in which each color represents a group.
3.3. Genome-Wide Association Study on Rice Grain Shape
Genome-wide association studies targeting grain length and grain width were performed using the rice germplasm panel, with the corresponding results shown in Figure 3. Significant SNPs located within 200 kb of one another (two or more) were treated as a single QTL. Based on this criterion, four QTLs associated with grain shape were detected on chromosomes 3, 4, and 11 (Table 3), including three loci influencing grain length and a single locus affecting grain width. Among the identified loci, qGL3.1, qGL3.2, and qGL11 were consistently detected for grain length across both R1 and R2 environments. Notably, qGL11 is a novel grain length QTL on chromosome 11, accounting for 15.07% of phenotypic variation, with a physical interval of 1,113,362–1,113,774 bp. qGL3.1 explains 14.19% of variation and lies at 16,729,325–16,832,795 bp; it co-localizes with the cloned grain-length gene GS3, suggesting that GS3 is the gene underlying qGL3.1 [8]. Likewise, qGL3.2 explains 11.32% of the variation and is located at 16,878,251–16,893,486 bp; it co-localizes with the cloned gene SMG3, indicating that SMG3 is the likely candidate gene for qGL3.2 [6]. Regarding grain width, the QTL qGW4 was detected under both R1 and R2, explaining 13.03% of the variation and located at 20,599,993–20,620,638 bp on chromosome 4. This region co-localizes with OsOFP14, suggesting that OsOFP14 is the candidate gene underlying qGW4 [19].
Figure 3.
Genome-wide association results for rice grain length and grain width. (A) Manhattan and Q–Q plots for grain length under the R1 environment. (B) Manhattan and Q–Q plots for grain length under the R2 environment. (C) Manhattan and Q–Q plots for grain width under the R1 environment. (D) Manhattan and Q–Q plots for grain width under the R2 environment.
Table 3.
Significant SNPs identified by genome-wide association analysis for rice grain length and grain width.
3.4. Prediction of Candidate Genes for Grain Length QTL and Their Haplotype Analysis
Analysis of the qGL11 interval (which contains 34 annotated genes) was conducted after excluding hypothetical proteins, transposons, and pseudogenes. By screening for genes potentially involved in grain shape pathways, we identified two candidate genes associated with grain length in this interval (see Supplementary Table S2). Their annotations are: a putative glycosyltransferase family 8 member (LOC_Os11g03160) and a putative auxin efflux carrier protein (LOC_Os11g02950). Further significance analysis of grain length among the haplotypes of these two genes revealed that LOC_Os11g03160 had two major haplotypes, with no significant difference in grain length between them (Figure S2). In contrast, LOC_Os11g02950 had three major haplotypes, and the grain length of Hap3 was significantly higher than that of the other two haplotypes (Figure 4). Compared with LOC_Os11g03160, the differences in grain length among the haplotypes of LOC_Os11g02950 were more pronounced, suggesting that LOC_Os11g02950 might be a candidate gene for qGL11. However, further validation is required to confirm this conclusion.
Figure 4.
Analysis of the LOC_Os11g02950 haplotype. (A) Schematic representation of structure and haplotypes. Data are expressed as mean ± standard deviation. Values followed by different lowercase letters (a, b) in the figure indicate significant differences at p < 0.05 (tested by Duncan’s multiple range test). (B) The subpopulation composition of LOC_Os11g02950. (C) Geographical distribution of different haplotypes of LOC_Os11g02950.
For qGL3.1, the analysis confirmed that GS3 (Os03g0407400) is the candidate gene, as this locus lies in a strong LD block containing GS3 (Figure S3A). We identified two major haplotypes of GS3 based on the known functional SNP (a C→A mutation causing a 178–amino acid truncation in the GS3 protein [8]; Figure S3B). Grain length differed significantly between the haplotypes: GS3Hap1 had a greater mean length (8.62 mm) than GS3Hap2 (8.33 mm; p < 0.05). GS3Hap1 was carried by 132 accessions (mostly indica, primarily from East Asia), whereas GS3Hap2 was found in 99 accessions (mainly japonica, with a high proportion from East and Southeast Asia) (Figure S3B–D).
For the qGL3.2 locus, a 100 kb flanking region around the significant SNP (Chr3: 16,888,255 bp) was defined as the candidate interval. Linkage disequilibrium (LD) analysis showed that the candidate gene SMG3 (Os03g0410000) lies within a region of high SNP linkage (Figure S4A). By analyzing a combination of three variants within the 5′ untranslated region (UTR) and three within the coding sequence of Os03g0410000, we distinguished two primary haplotypes (Figure S4B). Grain length differed significantly between the two haplotypes, with SMG3Hap1 showing a higher mean grain length (8.47 mm) than SMG3Hap2 (p < 0.05). Considering the haplotype associated with longer grain as favorable, SMG3Hap1 was designated as the advantageous haplotype for SMG3. SMG3Hap1 was predominantly represented by indica varieties (Figure S4C), and accessions carrying this haplotype were mainly distributed across Asian countries (Figure S4D).
Figure 5.
Germplasm analysis of six aggregation types of grain length and grain width gene haplotypes. (A) Composition of subgroups for each aggregation type. (B) Geographical distribution of germplasm in various aggregation types.
3.5. Prediction of Candidate Genes for Grain Width QTL and Their Haplotype Analysis
For qGW4, the candidate interval (Chr4: 20,620,638 bp ± 100 kb) contained OsOFP14 (Os04g0415100) in a high-LD segment (Figure S5A). Based on a trio of single-nucleotide polymorphisms located within the OsOFP14 coding sequence, the population was classified into three predominant haplotypes (illustrated in Figure S5B). One haplotype, OsOFP14Hap3 (defined by the TAA allele combination), was associated with a mean grain width of ~2.70 mm (Figure S5B). This haplotype occurred predominantly in indica accessions (Figure S5C), and the accessions carrying OsOFP14Hap3 were mainly from North America and East Asia (Figure S5D).
3.6. Aggregation Analysis of Different Haplotypes of Grain Shape Genes in Germplasm Resources
Further analysis of genotype–phenotype patterns indicated an epistatic interaction between GS3 and OsOFP14: OsOFP14 influenced grain width only when GS3 was nonfunctional. This observation is consistent with the genetic pattern in which a major-effect gene masks the effect of a minor gene in a multi-locus context. In contrast, SMG3 did not have a significant effect on grain shape in our study. Li et al. [10] noted that the protein product of SMG3 is an E2 ubiquitin-conjugating enzyme associated with the ERAD pathway, and that it requires physical association with DGS1 to modulate BR signaling and determine the longitudinal growth of grains. The lack of a detectable SMG3 effect here may be due to a uniformly functional DGS1 in our population, whereas SMG3’s influence was apparent in other genetic backgrounds, where DGS1 allowed its regulatory potential to be expressed. Among the six major haplotype combinations of the three genes, Types 1, 2, and 3 had significantly greater grain length than Types 4, 5, and 6 (p < 0.05). Type 3 accessions had the greatest grain length, averaging 8.98 mm. Conversely, Type 5 exhibited the minimum average grain length at 7.19 mm, though it was distinguished by the highest mean grain width (3.27 mm), as detailed in Table 4. This pyramiding analysis thus identified a “slender-grain” haplotype combination (Type 3), exemplified by accessions such as Newbonnet, Skybonnet, and Lemont, which are predominantly indica varieties from North America. Conversely, the “short-round” haplotype combination (Type 5), e.g., Kosh, Kongyu 131, and Jia 33, was mainly found in japonica varieties from Asia (Figure 5A,B).
Table 4.
Evaluation of Haplotype Combinations and Their Impact on Grain Dimensions Under R1 Conditions.
4. Discussion
As a primary determinant of harvestable biomass and commercial suitability, grain shape in rice serves as an essential parameter for optimizing yield and industrial processing efficiency [20]. Specifically, grain width is positively correlated with average chalkiness, whereas the length–width ratio is negatively correlated with chalkiness degree; slender rice grains, characterized by transparency and a lack of chalky texture, thus possess substantial commercial value in the global marketplace [21,22]. Elucidating the molecular underpinnings that govern grain architecture is indispensable for developing targeted selection programs intended to optimize rice grain dimensions. Our grasp of the genetic factors influencing rice grain traits has been substantially enhanced by the results of recent high-throughput association studies. Chen et al. [23] performed GWAS on 280 japonica rice accessions and identified 15 grain shape-related QTLs. Wang et al. [24] identified 61 QTLs using 265 natural populations, and their discovery of the monosaccharide transporter gene LOC_Os03g39710 as a grain length regulator lends further support to the possibility that chromosome 3 may harbor key loci for grain shape control. By conducting a genome-wide association analysis on a diverse panel of 231 rice lines, we mapped several major-effect loci governing grain architecture, specifically qGL3.1, qGL3.2, qGL11, and qGW4. These results expand the existing framework of rice grain morphogenesis and offer practical genomic targets for the precision breeding of varieties with optimized physical dimensions.
Identification of qGL11, a novel grain-length QTL, opens up new possibilities for improving grain size in rice. In contrast, qGL3.1 and qGL3.2 co-localize with the known genes GS3 and SMG3, respectively, and our findings validate their roles in a new genetic background, providing additional evidence of their importance in shaping rice grain morphology. Notably, we also detected qGW4, which co-localizes with OsOFP14, a gene known to regulate grain width. These observations imply a degree of mechanistic crosstalk between the genetic networks governing the longitudinal and transverse dimensions of the rice grain.
Despite these advances, several questions remain. For instance, the qGL11 region may harbor additional gene(s) with major effects on grain length, warranting further fine mapping and functional validation. Moreover, interactions between these QTLs and environmental factors (as highlighted by comparisons with other studies) emphasize the complexity of grain shape regulation. This complexity is further reflected in the differential effects of grain shape QTLs across genetic backgrounds, as demonstrated by our Type 3 vs. Type 5 haplotype comparisons (Figure 5).
Although the targeted mutagenesis of pivotal grain morphology determinants, such as GS3 and SMG3, has been effectively achieved through CRISPR/Cas9 technology [25,26], our study highlights the complementary potential of pyramiding multiple QTLs to achieve desirable grain shapes. The identification of accessions with extremely slender grains (Type 3 combination) versus very short, round grains (Type 5 combination) underscores the possibility of marker-assisted selection for specific grain morphologies, an approach that could accelerate breeding programs. However, the effectiveness of this pyramiding strategy will depend on fine-tuning haplotype combinations—notably, some theoretically possible haplotype combinations were not observed in our population, suggesting limitations in available diversity.
In a large-scale association mapping effort, six quantitative trait loci (QTLs) (e.g., qTGW3.1, qTGW9, qGL4/qRLW4) governing grain morphology were localized by Niu and colleagues [27], who leveraged a massive panel of 2453 lines from the 3K-RGP. A comparison between their results and ours revealed no overlapping loci. This suggests that the detection of grain shape QTLs is strongly influenced by environmental conditions and the genetic composition of the population. In other words, the specific plant materials used in a GWAS can directly affect which QTLs are found, and grain shape QTLs may have effects that are expressed in some genetic backgrounds but not in others. For example, through the use of an M494 × Zhong 9B cross mapping population, Li and colleagues [6] localized SMG3 as a multifaceted QTL that influences grain dimensions, weight, and panicle density. Functionally, SMG3 acts by stimulating the growth and division of glume cells, which imposes a negative constraint on grain size while simultaneously enhancing the total number of grains produced. Notably, SMG3 maps to the same interval as qGL3.2 in our study, indicating that SMG3 is likely the causal gene for qGL3.2 (as further supported by our haplotype analysis). According to Zhao et al. [19], the protein OsOFP14 functions as an antagonist to GS9. It integrates into a larger molecular assembly alongside OsOFP8 and OsGSK2, thereby modulating the orientation or frequency of glume cell division to dictate final grain morphology. Together with our evidence (from haplotype analysis) suggesting OsOFP14 is the gene underlying qGW4, this finding reinforces the credibility of the QTLs we identified.
Additionally, our identification of two candidate genes related to seed morphology near the significant Chr11 SNP (position ~1.113 Mb) suggests the presence of a high-priority candidate gene for longitudinal grain development that demands additional empirical scrutiny. This finding reaffirms the conclusion reported by Wang et al. [28], namely that synthesizing data from haplotype block frameworks and functional gene characterizations can effectively streamline the identification of candidate genes, thereby providing a more targeted approach for elucidating the molecular architecture that underlies multifaceted grain characteristics.
Varietal improvement is a key means of incorporating beneficial germplasm traits into elite cultivars. However, conventional breeding is time-consuming, labor-intensive, and not always effective at achieving specific trait outcomes. Yang et al. [29] found that pyramiding known grain shape genes can enable precise improvement of grain characteristics. Similarly, Xia et al. [30] reported an epistatic interaction between GL3.3 and GS3: pyramiding these two genes in one genetic background produced extra-long grains.
In the present research, the combinatorial effects of different haplotypes at qGL3.1 (GS3), qGL3.2 (SMG3), and qGW4 (OsOFP14) were examined. The Type 3 haplotype combination (GS3Hap1 + SMG3Hap1 + OsOFP14Hap3) was associated with reduced grain width but increased grain length, producing a slender grain phenotype. In contrast, the Type 5 combination (GS3Hap2 + SMG3Hap1 + OsOFP14Hap2) was associated with shorter grain length and greater grain width, yielding a short, round grain. From these groups, we identified eight accessions with slender grains and 22 accessions with short, round grains. These genetic resources are valuable for grain shape improvement and could be directly utilized to shorten the breeding cycle via marker-assisted breeding.
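The grouping logic behind such a haplotype combination analysis can be sketched as follows. This is a simplified illustration, not the procedure used in this study: the `summarize_combinations` helper, the accession IDs, and all phenotype values are hypothetical placeholders, and only the haplotype labels of the Type 3 and Type 5 combinations are taken from the text.

```python
from collections import defaultdict
from statistics import mean


def summarize_combinations(records):
    """Group accessions by (GS3, SMG3, OsOFP14) haplotype and average grain traits."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["GS3"], r["SMG3"], r["OsOFP14"])].append(r)

    summary = {}
    for combo, rows in groups.items():
        gl = mean(r["gl_mm"] for r in rows)  # mean grain length
        gw = mean(r["gw_mm"] for r in rows)  # mean grain width
        summary[combo] = {"n": len(rows), "mean_gl": gl, "mean_gw": gw, "lw_ratio": gl / gw}
    return summary


# Hypothetical accessions and phenotype values, for illustration only.
records = [
    {"id": "acc_1", "GS3": "Hap1", "SMG3": "Hap1", "OsOFP14": "Hap3", "gl_mm": 10.0, "gw_mm": 2.0},
    {"id": "acc_2", "GS3": "Hap1", "SMG3": "Hap1", "OsOFP14": "Hap3", "gl_mm": 9.0, "gw_mm": 2.0},
    {"id": "acc_3", "GS3": "Hap2", "SMG3": "Hap1", "OsOFP14": "Hap2", "gl_mm": 7.0, "gw_mm": 3.5},
    {"id": "acc_4", "GS3": "Hap2", "SMG3": "Hap1", "OsOFP14": "Hap2", "gl_mm": 7.0, "gw_mm": 3.5},
]

summary = summarize_combinations(records)
type3 = ("Hap1", "Hap1", "Hap3")  # slender-grain combination described in the text
type5 = ("Hap2", "Hap1", "Hap2")  # short, round-grain combination described in the text
for combo in (type3, type5):
    print(combo, summary[combo])
```

A real analysis would add a statistical comparison between groups and account for population structure; the sketch only shows how accessions are partitioned by multi-locus haplotype before their grain traits are compared.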
However, the effectiveness of pyramiding these QTLs for grain shape improvement depends on fine-tuning haplotype combinations across different genetic backgrounds [31]. The differential effects of these QTLs observed among rice accessions suggest that genetic background plays a significant role in breeding outcomes. These findings underscore the need to investigate how genomic context modulates grain morphology, especially within the framework of large-scale marker-based selection or precision genome engineering via CRISPR.
Beyond the constraint of genetic background, the improvement of rice grain shape is further restricted by developmental and physiological factors. The lemma and palea act as a sink structure that sets the maximum volume available for caryopsis development, and endosperm formation is limited not only by the size of this sink but also by the grain-filling process, including the transport of photosynthetic products and the efficiency of starch accumulation [32]. This dual limitation means that, in breeding programs, pyramiding favorable QTLs does not guarantee the expected gains in yield and quality, because under specific cultivation conditions endosperm filling may not fully match glume shape. Therefore, future breeding strategies should take this limitation into account and combine sink-related traits (glume size) with source-related traits (grain filling capacity).
Additionally, although our study identified key haplotypes associated with grain shape, the exact molecular mechanisms through which these haplotypes influence grain development remain unclear. Providing definitive proof of these candidate genes’ influence on grain architecture will require targeted empirical strategies, such as loss-of-function mutagenesis and gain-of-function assays. Moreover, investigating how these genes interact with environmental factors (e.g., nutrient availability, temperature, water stress) will provide deeper insight into the stability of grain shape traits under field conditions.
Our findings have significant implications for future rice breeding programs. Identifying specific grain shape haplotypes enables more precise selection of parent lines, thereby accelerating the development of rice lines with superior grain attributes. Most existing studies focus on the relationship between single genes and traits; however, grain shape is a complex quantitative trait governed by the cumulative action of many genes distributed across the rice genome. Analyzing multi-gene combinations based on GWAS results is therefore more likely to yield effective combination patterns for marker-assisted breeding [33]. Compared with single-marker analysis, haplotype analysis integrates multiple linked genetic variants, which can more accurately reflect functional differences between alleles and improve the reliability of identifying favorable loci for grain shape. Furthermore, combining these favorable haplotypes with other agronomically important traits (such as disease resistance or drought tolerance) offers opportunities for developing climate-resilient rice varieties. Ultimately, integrating these genetic improvements into breeding pipelines will require a multidisciplinary approach, incorporating genomic selection methods and advanced phenotyping technologies, to ensure that desired traits are effectively realized in new cultivars.
The identification of these grain shape QTLs and their associated haplotypes marks an important step towards improving rice morphology. Beyond elucidating the genomic architecture of grain morphology, this investigation furnishes a practical roadmap for the rapid development of specialized rice cultivars by identifying optimal allelic configurations. Further research into the molecular mechanisms of these QTLs, as well as their interactions with environmental factors, will pave the way for more efficient and sustainable rice breeding strategies.
5. Conclusions
In conclusion, a GWAS of grain length and width in 231 diverse rice accessions identified four grain shape QTLs distributed on chromosomes 3, 4, and 11. Notably, qGL11 appears to be a novel grain-length locus with no previously characterized gene, while qGL3.1, qGL3.2, and qGW4 co-localize with previously characterized genes (GS3, SMG3, and OsOFP14, respectively). Through haplotype combination analysis of both established genetic regulators of grain morphology and new candidate genes, we identified eight accessions with slender grains and 22 accessions with short, round grains. These accessions constitute a valuable germplasm pool for rice breeding, paving the way for the targeted improvement of grain shape in subsequent breeding cycles.
Supplementary Materials
The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/biology15010050/s1, Table S1: The names and geographical origins of 231 rice accessions; Table S2: Candidate genes located within the significant region on chromosome 11; Figure S1: Morphological phenotypic appearance of GL and GW in different plant materials; Figure S2: Analysis of the LOC_Os11g03160 haplotype; Figure S3: Analysis of the GS3 haplotype; Figure S4: Analysis of the SMG3 haplotype; Figure S5: Analysis of the OsOFP14 haplotype.
Author Contributions
Conceptualization, X.L. and W.Y.; Data curation, C.W.; Formal analysis, S.M. and C.W.; Funding acquisition, X.X. and Y.H.; Investigation, S.M., S.L. and X.G.; Methodology, W.Y.; Resources, S.L.; Software, X.Y.; Visualization, X.Y.; Writing—original draft, X.L. and S.W.; Writing—review and editing, Y.H., L.X. and X.X. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Major Science and Technology Projects in Sichuan Province, grant number 2022ZDZX0012; the Key Research and Development Program of Sichuan, grant number 2021YFYZ0016; the Natural Science Foundation of Southwest University of Science and Technology, grant number 22zx7144; the Postgraduate Innovation Fund Project by Southwest University of Science and Technology, grant number 25ycx2079; and the Opening Foundation of State Key Laboratory of Crop Gene Resources and Breeding, grant number CGRB-2025-03.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The datasets supporting the conclusions of this article are included within the article and Supplementary Materials.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| GWAS | Genome-Wide Association Study |
| QTL | Quantitative Trait Locus |
| MLM | Mixed Linear Model |
| SNP | Single Nucleotide Polymorphism |
References
- Gao, Z.Q.; Zhan, X.D.; Liang, Y.S.; Cheng, S.H.; Cao, L.Y. Progress on genetics of rice grain shape trait and its related gene mapping and cloning. Hereditas 2011, 33, 314–321.
- Meng, B.X.; Wang, T.P.; Luo, Y.; Guo, Y.; Xu, D.; Liu, C.H.; Zou, J.; Li, L.; Diao, Y.; Gao, Z.Y.; et al. Identification and Allele Combination Analysis of Rice Grain Shape-Related Genes by Genome-Wide Association Study. Int. J. Mol. Sci. 2022, 23, 1065.
- Ren, D.Y.; Ding, C.Q.; Qian, Q. Molecular bases of rice grain size and quality for optimized productivity. Sci. Bull. 2023, 68, 314–350.
- Wei, M.Y.; Luo, T.P.; Huang, D.H.; Ma, Z.F.; Liu, C.; Qin, Y.Y.; Wu, Z.S.; Zhou, X.L.; Lu, Y.P.; Yan, L.H.; et al. Construction of High-Density Genetic Map and QTL Mapping for Grain Shape in the Rice RIL Population. Plants 2023, 12, 2911.
- Hao, J.P.; Wang, D.K.; Wu, Y.B.; Huang, K.; Duan, P.G.; Li, N.; Xu, R.; Zeng, D.; Dong, G.J.; Zhang, B.L.; et al. The GW2-WG1-OsbZIP47 pathway controls grain size and weight in rice. Mol. Plant 2021, 14, 1266–1280.
- Li, R.S.; Li, Z.; Ye, J.; Yang, Y.Y.; Ye, J.; Xu, S.I.; Liu, J.R.; Yuan, X.P.; Wang, Y.P.; Zhang, M.C.; et al. Identification of SMG3, a QTL Coordinately Controls Grain Size, Grain Number per Panicle, and Grain Weight in Rice. Front. Plant Sci. 2022, 13, 880919.
- Duan, Y.X.; Cui, J.N.; Xu, S.B.; Wang, J.G.; Liu, H.I.; Yang, L.M.; Jia, Y.; Xin, W.; Wu, W.S.; Zheng, H.I.; et al. Whole genome association analysis and candidate gene mining of grain type correlation in crypt japonica rice. Acta Agric. Boreali-Sin. 2023, 38, 19–28.
- Fan, C.C.; Xing, Y.Z.; Mao, H.L.; Lu, T.T.; Han, B.; Xu, C.G.; Li, X.H.; Zhang, Q.F. GS3, a major QTL for grain length and weight and minor QTL for grain width and thickness in rice, encodes a putative transmembrane protein. Theor. Appl. Genet. 2006, 112, 1164–1171.
- Sun, S.Y.; Wang, L.; Mao, H.L.; Shao, L.; Li, X.H.; Xiao, J.H.; Ouyang, Y.D.; Zhang, Q.F. A G-protein pathway determines grain size in rice. Nat. Commun. 2018, 9, 851–862.
- Li, J.; Zhang, B.I.; Duan, P.G.; Yan, L.; Yu, H.Y.; Zhang, L.M.; Li, N.; Zheng, L.Y.; Chai, T.Y.; Xu, R.; et al. An ERAD-related E2-E3 enzyme pair controls grain size and weight through the brassinosteroid signaling pathway in rice. Plant Cell 2022, 35, 1076–1091.
- Si, L.Z.; Chen, J.Y.; Huang, X.H.; Gong, H.; Luo, J.H.; Hou, Q.Q.; Zhou, T.Y.; Lu, T.T.; Zhu, J.J.; Shangguan, Y.Y.; et al. OsSPL13 controls grain size in cultivated rice. Nat. Genet. 2016, 48, 447–456.
- Alexandrov, N.; Tai, S.S.; Wang, W.S.; Mansueto, L.; Palis, K.; Fuentes, R.R.; Ulat, V.J.; Chebotarov, D.; Zhang, G.Y.; Li, Z.K.; et al. SNP-Seek database of SNPs derived from 3000 rice genomes. Nucleic Acids Res. 2014, 43, 1023–1027.
- Purcell, S.; Neale, B.; Todd-Brown, K.; Thomas, L.; Ferreira, M.A.R.; Bender, D.; Maller, J.; Sklar, P.; de Bakker, P.I.W.; Daly, M.J.; et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007, 81, 559–575.
- Alexander, D.H.; Novembre, J.; Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009, 19, 1655–1664.
- Bradbury, P.J.; Zhang, Z.W.; Kroon, D.E.; Casstevens, T.M.; Ramdoss, Y.; Buckler, E.S. TASSEL: Software for association mapping of complex traits in diverse samples. Bioinformatics 2007, 23, 2633–2635.
- Wu, F.X.; Luo, X.; Wang, L.Q.; Wei, Y.D.; Li, J.G.; Xie, H.A.; Zhang, J.F.; Xie, G.S. Genome-Wide Association Study Reveals the QTLs for Seed Storability in World Rice Core Collections. Plants 2021, 10, 812.
- Barrett, J.C.; Fry, B.; Maller, J.; Daly, M.J. Haploview: Analysis and visualization of LD and haplotype maps. Bioinformatics 2004, 21, 263–265.
- He, Y.; Li, L.Y.; Shi, W.B.; Tan, J.H.; Luo, X.X.; Zheng, S.Y.; Chen, W.T.; Li, J.; Zhuang, C.X.; Jiang, D.G. Florigen repression complexes involving rice CENTRORADIALIS2 regulate grain size. Plant Physiol. 2022, 190, 338–353.
- Zhao, D.S.; Li, Q.F.; Zhang, C.Q.; Zhang, C.; Yang, Q.Q.; Pan, L.X.; Ren, X.Y.; Lu, J.; Gu, M.H.; Liu, Q.Q. GS9 acts as a transcriptional activator to regulate rice grain shape and appearance quality. Nat. Commun. 2018, 9, 1240–1254.
- Huang, R.Y.; Jiang, L.R.; Zheng, J.S.; Wang, T.S.; Wang, H.C.; Huang, Y.M.; Hong, Z.L. Genetic bases of rice grain shape: So many genes, so little known. Trends Plant Sci. 2013, 18, 218–226.
- Calingacion, M.; Laborte, A.; Nelson, A.; Resurreccion, A.; Concepcion, J.C.; Daygon, V.D.; Mumm, R.; Reinke, R.; Dipti, S.; Bassinello, P.Z.; et al. Diversity of Global Rice Markets and the Science Required for Consumer-Targeted Rice Breeding. PLoS ONE 2014, 9, e85106.
- Zhang, J.C.; Du, Y.; Xu, P.K.; Zhong, L.Y.; Li, Z.; Zhu, W.R.; Li, Y.C.; Cheng, B.; Chang, X.Y.; Fan, Y.W.; et al. A Natural Major Module Confers the Trade-Off between Phenotypic Mean and Plasticity of Grain Chalkiness in Rice. Adv. Sci. 2025, 12, e06242.
- Chen, H.W.; Zhang, X.; Tian, S.J.; Gao, H.; Sun, J.; Pang, X.; Li, X.W.; Li, Q.Y.; Xie, W.X.; Wang, L.; et al. Genome-wide association study reveals the advantaged genes regulating japonica rice grain shape traits in northern China. PeerJ 2024, 12, e18746.
- Wang, N.S.; Chen, H.G.; Qian, Y.Z.; Liang, Z.J.; Zheng, G.Q.; Xiang, J.; Feng, T.; Li, M.; Zeng, W.; Bao, Y.; et al. Genome-Wide Association Study of Rice Grain Shape and Chalkiness in a Worldwide Collection of Xian Accessions. Plants 2023, 12, 419.
- Li, M.R.; Pan, X.P.; Li, H.Q. Pyramiding of gn1a, gs3, and ipa1 Exhibits Complementary and Additive Effects on Rice Yield. Int. J. Mol. Sci. 2022, 23, 12478.
- Huang, J.; Chen, W.W.; Gao, L.J.; Qing, D.J.; Pan, Y.H.; Zhou, W.Y.; Wu, H.; Li, J.C.; Ma, C.I.; Zhu, C.I.; et al. Rapid improvement of grain appearance in three-line hybrid rice via CRISPR/Cas9 editing of grain size genes. Theor. Appl. Genet. 2024, 137, 173–188.
- Niu, Y.A.; Chen, T.X.; Wang, C.C.; Chen, K.; Shen, C.C.; Chen, H.Z.; Zhu, S.B.; Wu, Z.C.; Zheng, T.Q.; Zhang, F.; et al. Identification and allele mining of new candidate genes underlying rice grain weight and grain shape by genome-wide association study. BMC Genom. 2021, 22, 602.
- Wang, N.S.; Hassan, M.A.; Li, K.; Zhou, K.N.; Gan, Q.; Xia, J.F.; Lin, C.X.; Li, Z.F.; Ni, D.H.; Song, F.S. Grain shape in rice: A Genome-wide association study of the effector genes. BMC Plant Biol. 2025, 25, 1387.
- Yang, T.F.; Gu, H.Y.; Yang, W.; Liu, B.; Liang, S.H.; Zhao, J.I. Artificially Selected Grain Shape Gene Combinations in Guangdong Simiao Varieties of Rice (Oryza sativa L.). Rice 2023, 16, 3–17.
- Xia, D.; Zhou, H.; Liu, R.J.; Dan, W.H.; Li, P.B.; Wu, B.A.; Chen, J.X.; Wang, L.Q.; Gao, G.J.; Zhang, Q.I.; et al. GL3.3, a Novel QTL Encoding a GSK3/SHAGGY-like Kinase, Epistatically Interacts with GS3 to Produce Extra-long Grains in Rice. Mol. Plant 2018, 11, 754–756.
- Li, Z.H.; Wang, S.L.; Zhu, Y.J.; Fan, Y.Y.; Huang, D.R.; Zhu, A.K.; Zhuang, J.Y.; Liang, Y.; Zhang, Z.H. Control of Grain Shape and Size in Rice by Two Functional Alleles of OsPUB3 in Varied Genetic Background. Plants 2022, 11, 2530.
- Ye, Y.; Yuan, X.Y.; Zhao, D.S.; Yang, Q.Q. MicroRNAs Regulate Grain Development in Rice. Agronomy 2025, 15, 2027.
- Hu, Q.F.; Zhao, Z.K.; Ma, L.L.; Xia, H.J.; Ma, Z.Q.; Xu, P.H.; Wang, X.P.; Zhu, R.; Zhao, Y.; Guo, H.F.; et al. Natural variation of GNP2 enhances grain number to benefit rice yield. Nat. Commun. 2025, 16, 8848.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.