Discovering New QTNs and Candidate Genes Associated with Rice-Grain-Related Traits within a Collection of Northeast Core Set and Rice Landraces

Grain-related traits are pivotal in rice cultivation, influencing yield and consumer preference. The complex inheritance of these traits, involving multiple alleles contributing to their expression, poses challenges in breeding. To address these challenges, a multi-locus genome-wide association study (ML-GWAS) utilizing 35,286 high-quality single-nucleotide polymorphisms (SNPs) was conducted. Our study utilized an association panel comprising 483 rice genotypes sourced from a northeast core set and a landraces set collected from various regions in India. Forty quantitative trait nucleotides (QTNs) were identified, associated with four grain-related traits: grain length (GL), grain width (GW), grain aroma (Aro), and length–width ratio (LWR). Notably, 16 QTNs were simultaneously identified using two ML-GWAS methods, distributed across multiple chromosomes. Nearly 258 genes were found near the 16 significant QTNs. Gene annotation study revealed that sixty of these genes exhibited elevated expression levels in specific tissues and were implicated in pathways influencing grain quality. Gene ontology (GO), trait ontology (TO), and enrichment analysis pinpointed 60 candidate genes (CGs) enriched in relevant GO terms. Among them, LOC_Os05g06470, LOC_Os06g06080, LOC_Os08g43470, and LOC_Os03g53110 were confirmed as key contributors to GL, GW, Aro, and LWR. Insights from QTNs and CGs illuminate rice trait regulation and genetic connections, offering potential targets for future studies.


Introduction
Grain-quality-related traits like grain length, grain width, and aroma are pivotal agronomic traits in rice and have evolved through natural selection during domestication. Initially, the improvement in grain size was an unintentional adaptation to rice cultivation, favoring seeds capable of thriving in deeper soil [1]. Interestingly, this trait was subsequently subjected to deliberate selection and breeding due to its direct impact on rice grain quality and overall yield [2,3]. The contemporary landscape of rice varieties now displays a diverse spectrum of grain size characteristics, which predominantly determine their yield. The dimensions of rice grains encompass grain length (GL), grain width (GW), and grain thickness (GT), as outlined in earlier studies [4,5]. These characteristics are intricately linked to grain weight, a significant factor influencing overall yield, in conjunction with the quantity of panicles per plant and the number of grains per panicle. The intricate nature of rice grain traits has undergone thorough investigation, revealing a complex inheritance system regulated by numerous genes that exert gradual and modest influences [6].
Plants 2024, 13, 1707 3 of 33 platforms. Therefore, performing association studies through high-density SNP chips has enabled the selection of potential candidate genes for GWAS. There have been substantial efforts on breeding lines to decipher new QTLs for target traits. However, limited efforts have been made with germplasms to characterize genomic regions associated with target traits. In this study, genome-wide association mapping was conducted with a statistically strong and diverse association panel combining the northeast core set (190 accessions) [53] and a set of rice landraces (293 accessions) [54] to identify significant marker-trait associations for grain-related traits (GL, GW, Aro, and LWR). This study offers crucial insights for the continued exploration of elite genes within the northeast core and landrace sets for utilization in rice breeding. The findings may be useful in further elucidating the genetic basis of rice grain size for improving grain characteristics in rice. The selected genes offer a window into the genetic framework of grain quality characteristics, thereby contributing to genetic improvement.

Trait Correlations and Variance
Four grain-related traits, GL, GW, Aro, and LWR, were investigated in the selected panel of 483 rice accessions. The list of the 483 accessions is included in Table S1. Aroma analysis showed that only a small number of samples were aromatic. The diverse set of rice accessions showed a wide range of values for all the grain-related traits, and their descriptive statistics are presented in Table 1. The mean values of GL, GW, and LWR were 5.6, 2.4, and 2.3, respectively. GL and GW showed negative skewness, whereas LWR and aroma showed positive skewness. Kurtosis was less than 3 for GL, GW, and LWR, indicating a platykurtic distribution of phenotypes in the population, which suggests that the traits are governed by a large number of genes [6]. Correlation analysis was performed to understand the linear relationships between grain traits. The correlation was positive, significant, and strong between GL and GW (0.31) and between GL and LWR (0.42). A negative, strong, and significant correlation was observed between GW and LWR (−0.71), which means they are inversely related. The remaining correlation values were not significant. The results suggest a close relationship between the mentioned traits and indicate their possible contribution to enhancing the genetic improvement of rice (Figure 1).
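The descriptive statistics and correlations described above can be illustrated with a minimal Python sketch. It assumes population (biased) moment estimators and non-excess kurtosis, so that a normal distribution scores 3 and lower values are platykurtic. The software and estimators actually used in the study are not stated here, and the six measurements below are hypothetical, not data from Table 1.

```python
import math

def moments(values):
    """Return (mean, skewness, kurtosis) from population (biased) moments.

    Kurtosis is on the non-excess scale: a normal distribution gives 3,
    and values below 3 indicate a platykurtic distribution.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return mean, m3 / m2 ** 1.5, m4 / m2 ** 2

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical grain measurements for six accessions (illustration only).
grain_length = [5.1, 5.4, 5.6, 5.8, 6.0, 6.3]
grain_width = [2.6, 2.3, 2.5, 2.2, 2.4, 2.1]
lwr = [gl / gw for gl, gw in zip(grain_length, grain_width)]

mean_gl, skew_gl, kurt_gl = moments(grain_length)
r_gw_lwr = pearson_r(grain_width, lwr)
```

Because LWR is computed as GL divided by GW, a negative correlation between GW and LWR is expected whenever GL varies less, in relative terms, than GW.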

Genetic Structure and Linkage Disequilibrium Analysis
Before conducting GWAS, the genetic architecture of the 483-accession rice panel was assessed using principal component analysis (PCA), kinship analysis, a neighbor-joining (NJ) tree, and population structure analysis. The SNP chip DNA marker density across all 12 rice chromosomes is illustrated in Figure 2a. Additionally, the pairwise linkage disequilibrium (LD) between markers, represented by the average r² values of bins plotted against the physical distance between markers, produced a near-flat curve, indicating relatively slow LD decay in the population. The maximum r² value (r_max) obtained was ~0.48, and r_max fell to half its value at a distance of 133.46 kbp (Figure 2b). Therefore, 133.46 kbp was taken as the LD decay distance.
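As a hedged illustration of how a half-decay distance can be read from binned LD values, the sketch below linearly interpolates between the two distance bins that straddle half of the maximum mean r². The function name `half_decay_distance` and the bin values are hypothetical. The study may have fitted a smooth decay curve instead, which would give a slightly different figure.

```python
def half_decay_distance(bins):
    """Estimate the distance at which mean r2 falls to half its maximum.

    bins: list of (distance_kb, mean_r2) sorted by increasing distance.
    Returns the interpolated distance in kb, or None if r2 never drops
    to half of its maximum within the supplied bins.
    """
    r2_max = max(r2 for _, r2 in bins)
    if r2_max <= 0:
        return None
    target = r2_max / 2.0
    start = max(range(len(bins)), key=lambda i: bins[i][1])
    for i in range(start + 1, len(bins)):
        d0, r0 = bins[i - 1]
        d1, r1 = bins[i]
        if r1 <= target:
            # Linear interpolation between the two bins straddling the target.
            return d0 + (r0 - target) * (d1 - d0) / (r0 - r1)
    return None

# Hypothetical binned values; only the maximum (0.48) mirrors the text.
ld_bins = [(0.0, 0.48), (50.0, 0.40), (100.0, 0.30), (150.0, 0.20), (200.0, 0.15)]
half_kb = half_decay_distance(ld_bins)
```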

PCA, Kinship and Relatedness Study
According to principal component analysis, there were three subpopulations in the selected rice panel (Figure 3a). PC1 explained 15.5% of the variance and PC2 explained 12.5% of the variance. The kinship matrix was computed to assess genetic relationships within the population. The coefficient of relatedness spanned from −2.0 to 2.0, and a kinship heat map was generated to illustrate these relationships. Notably, the upper right corner exhibited a relatively close relationship, while the rest of the accessions displayed lower coefficients of relatedness, showing them to be unique and different (Figure 3b). Further, the NJ tree, built from the genetic distances derived from SNP variation among the selected rice accessions, also revealed three distinct clusters. Clusters 1 and 2 comprised 26 accessions, the majority from Arunachal Pradesh, Nagaland, Tripura, Mizoram, Meghalaya, Manipur, and Assam. Cluster 3 comprised the largest number of samples (431), with the majority of the Uttarakhand samples grouped together. The accessions from the rest of the states showed mixing. This observation suggests that the population utilized in our study behaves as a natural population, with only a few instances of close relatedness among certain accessions. Concordantly, STRUCTURE also showed three subpopulations within the 483-accession panel based on the distribution of the 35,286 SNPs across the 12 rice chromosomes. Figure S1 shows the structure bar plot with three subpopulations as revealed by Structure Harvester. Genotypes exhibiting ≥80% likelihood were designated pure, whereas the others were categorized as admixtures. Each of the three subpopulations contained both pure and admixed individuals. Subpopulation 1 had 74 pure and 42 admixed individuals, subpopulation 2 had 43 pure and 21 admixed individuals, and subpopulation 3 had 317 pure and 28 admixed individuals. Overall, the PCA, kinship, and relatedness studies showed the accessions to be highly diverse.
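The pure/admixed rule described above (membership likelihood of at least 80%) can be sketched as follows. The input format, a mapping from accession to three ancestry coefficients, is an assumption about how STRUCTURE output might be tabulated; the accession names and coefficients are hypothetical.

```python
from collections import Counter

def classify(memberships, threshold=0.80):
    """Assign each accession to its most likely subpopulation.

    memberships: dict mapping accession -> list of ancestry coefficients.
    Returns dict mapping accession -> (subpopulation number, status),
    where status is "pure" if the top coefficient is >= threshold.
    """
    assignments = {}
    for accession, q in memberships.items():
        best = max(range(len(q)), key=lambda k: q[k])
        status = "pure" if q[best] >= threshold else "admixed"
        assignments[accession] = (best + 1, status)
    return assignments

# Hypothetical ancestry coefficients for five accessions (K = 3).
q_matrix = {
    "acc_1": [0.95, 0.03, 0.02],
    "acc_2": [0.55, 0.40, 0.05],
    "acc_3": [0.10, 0.85, 0.05],
    "acc_4": [0.05, 0.15, 0.80],
    "acc_5": [0.30, 0.25, 0.45],
}
assignments = classify(q_matrix)
counts = Counter(assignments.values())  # tallies per (subpopulation, status)
```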

GWAS and CGs Mining for Grain-Related Traits
A total of 40 QTNs were detected for four grain-related traits: GL, GW, grain aroma, and LWR. QTNs with LOD scores > 3.0 were considered significant trait-related QTNs. Most of the QTNs were identified with at least two of the five ML-GWAS methods utilized in the study, namely, mrMLM, FASTmrEMMA, FASTmrMLM, pLARmEB, and ISIS EM-BLASSO. Nineteen of the QTNs detected overlapped with previously reported QTL/genes, which supports the consistency of the study. The number of QTNs detected varied among methods. The Manhattan and quantile-quantile (QQ) plots of all four traits presented in Figure 4 indicate that false associations were controlled, suggesting that the SNPs detected by the ML-GWAS methods are likely true associations. A positively skewed QQ plot was observed in the case of aroma, which means that the observed p-values are more extreme than expected under the null hypothesis and could indicate the presence of true associations between the genetic variants and the trait of interest. Eight QTNs were detected to be associated with GL on chromosomes 1, 2, 4, 5, 8, and 12, with LOD scores ranging from 3.04 to 5.72. Nine QTNs were identified for GW, with LOD scores ranging from 3.14 to 5.29. For aroma, 12 QTNs were identified on chromosomes 1, 3, 5, 8, 10, and 11, with LOD scores ranging from 3.04 to 5.88. The largest number of QTNs was detected on chromosome 8. For LWR, 11 QTNs were identified, with LOD scores ranging from 3.0 to 6.6. Of these 40 QTNs, 38 were observed to be in the vicinity of annotated genes, and 16 were identified simultaneously by two or more ML-GWAS methods (Table 2; these 16 QTNs are marked in bold). Further, all 16 genomic loci associated with grain quality traits were searched for their annotation in Rice Assembly version 7.
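The two filters described above (LOD > 3.0, and detection by two or more methods) can be sketched as below. The `lod_to_p` helper assumes the LOD score corresponds to a likelihood-ratio test with one degree of freedom, which is a common convention but is not confirmed by the text here. The SNP identifiers and LOD values are hypothetical; only the method names come from the study.

```python
import math
from collections import defaultdict

def lod_to_p(lod):
    """Approximate p-value for a LOD score, assuming a 1-df likelihood-ratio test."""
    lrt = 2.0 * math.log(10.0) * lod
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(lrt / 2.0))

def stable_qtns(hits, lod_threshold=3.0, min_methods=2):
    """Keep SNPs whose LOD exceeds the threshold in at least min_methods methods.

    hits: iterable of (snp_id, method, lod) tuples.
    Returns dict mapping snp_id -> sorted list of supporting methods.
    """
    methods_by_snp = defaultdict(set)
    for snp, method, lod in hits:
        if lod > lod_threshold:
            methods_by_snp[snp].add(method)
    return {snp: sorted(m) for snp, m in methods_by_snp.items()
            if len(m) >= min_methods}

# Hypothetical detections (illustration only).
hits = [
    ("snp_A", "pLARmEB", 3.82),
    ("snp_A", "ISIS EM-BLASSO", 4.10),
    ("snp_B", "mrMLM", 5.20),
    ("snp_C", "FASTmrMLM", 2.70),
    ("snp_C", "FASTmrEMMA", 3.40),
]
stable = stable_qtns(hits)
p_at_lod3 = lod_to_p(3.0)  # roughly 2e-4 under the 1-df assumption
```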
Probable CGs were searched in the 130-kbp genomic region of each of the 16 commonly annotated QTNs. For these 16 QTNs, 258 genes were detected close to the significant QTNs. A Venn diagram was prepared showing the number of QTNs shared among the various methods (Figure S2).
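A window search of this kind can be sketched as follows. The sketch assumes the 130-kbp region extends on either side of the QTN, which the text does not specify, and it treats a gene as a candidate if any part of it overlaps the window. The gene identifiers and coordinates are hypothetical; only the QTN position (2,804,963 bp on chromosome 6, reported for qGW-6-1) comes from the study.

```python
def genes_near_qtn(qtn_chrom, qtn_pos, genes, window_bp=130_000):
    """Return IDs of genes overlapping the window around a QTN.

    genes: iterable of (gene_id, chrom, start, end), coordinates in bp.
    The window is [qtn_pos - window_bp, qtn_pos + window_bp] (an assumption).
    """
    lo, hi = qtn_pos - window_bp, qtn_pos + window_bp
    return [gene_id for gene_id, chrom, start, end in genes
            if chrom == qtn_chrom and start <= hi and end >= lo]

# Hypothetical gene models (illustration only).
gene_models = [
    ("gene_a", 6, 2_700_000, 2_703_000),   # inside the window
    ("gene_b", 6, 2_930_000, 2_940_000),   # straddles the right edge
    ("gene_c", 6, 3_000_000, 3_002_000),   # too far downstream
    ("gene_d", 5, 2_800_000, 2_801_000),   # different chromosome
]
candidates = genes_near_qtn(6, 2_804_963, gene_models)
```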

Superior Allele Distribution in Northeast Core and Rice Landraces
All 40 QTNs were studied for their superior and inferior alleles in the rice landrace and northeast core sets to identify better-suited genotypes and to see how the alleles are distributed in the two sets. Among the eight QTNs reported for GL, SNPs AX-95915857, AX-95947399, AX-95926370, AX-95925933, and AX-95930845 showed the presence of superior alleles in 80% of the accessions in the NE core set, whereas the superior allele of SNP AX-95936094 was present in 30% of the accessions; therefore, this QTN needs attention for improvement of GL. For GW, SNPs AX-95946823, AX-95952472, AX-95936134 (Figure 5a), AX-95927762, and AX-95930391 showed superior alleles in less than 30% of the accessions in the northeast core, indicating that these SNPs should be paid more attention for improvement of the GW trait. Ten SNPs (AX-95918031, AX-95941053, AX-95921840, AX-95954140, AX-95926338, AX-95930157, AX-95937862, AX-95930471, AX-95960288, and AX-95932719) showed the presence of superior alleles in fewer than 30% of the accessions; hence, more attention needs to be paid to these loci for aroma improvement. Among the LWR QTNs, SNPs AX-95918592, AX-95920263, and AX-95947983 showed low percentages of superior alleles in the northeast core, suggesting that more study is needed on these QTNs for improvement of LWR in the collection.
For the landraces set (Figure 5b), the superior alleles of SNPs AX-95915857, AX-95947399, AX-95926370, AX-95930845, AX-95963880, and AX-95964306 are present in more than 60% of the accessions, which can account for its rice diversity; however, those of AX-95936094 and AX-95925933 are present in less than 30% of the rice landrace collection, which means that these SNPs need more study to improve the GL trait in the collection. For the GW trait, superior alleles of AX-95946823, AX-95955084, AX-95927762, and AX-95933306 were present in high percentages, whereas those of AX-95952472, AX-95936134, AX-95955169, AX-95930391, and AX-95932902 were present in very low percentages in the rice landrace collection, and these loci need attention to improve the GW trait. In the case of aroma, most of the superior alleles (AX-95941053, AX-95918031, AX-95930157, AX-95930471, and AX-95932719) were present in less than 50% of the accessions in the rice landrace collection, which calls for efforts to improve the quality of aromatic rice in this collection. With respect to the LWR trait, the superior alleles of SNPs AX-95948231 and AX-95952612 were present in more than 60% of the accessions, which reflects good diversity in LWR in the rice landrace collection; however, SNPs AX-95918592, AX-95920263, AX-95947983, AX-95924497, and AX-95957972 need more study to improve LWR in the collection. The above distribution suggests that both sets are rich in alleles and provides valuable insights into the allelic distribution.
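The per-set percentages discussed above can be computed with a short sketch like the one below. It assumes one allele call per accession (reasonable for largely homozygous inbred material, but an assumption here) and excludes missing calls from the denominator. The allele calls are hypothetical; the 30% cutoff mirrors the threshold used in the text.

```python
def superior_allele_percent(calls, superior):
    """Percentage of genotyped accessions carrying the superior allele.

    calls: list of allele calls, one per accession; None marks a missing call.
    Returns None if no accession was genotyped.
    """
    typed = [c for c in calls if c is not None]
    if not typed:
        return None
    return 100.0 * sum(1 for c in typed if c == superior) / len(typed)

def flag_low(percent_by_set, cutoff=30.0):
    """Names of sets in which the superior allele is rarer than the cutoff."""
    return sorted(name for name, pct in percent_by_set.items()
                  if pct is not None and pct < cutoff)

# Hypothetical calls at one SNP where "A" is taken as the superior allele.
ne_core_calls = ["A", "A", "G", "G", "G", "G", "G", "G", "G", "G", None]
landrace_calls = ["A"] * 7 + ["G"] * 3

percent_by_set = {
    "northeast_core": superior_allele_percent(ne_core_calls, "A"),
    "landraces": superior_allele_percent(landrace_calls, "A"),
}
needs_attention = flag_low(percent_by_set)
```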

Gene Functional Enrichment Analysis
To delve deeper into the various loci linked to the desired trait, we conducted an enrichment analysis using PlantGSAD. Furthermore, we predicted the potential functions of 60 candidate genes (CGs) and categorized them according to their respective molecular functions, biological processes, and cellular components (Table 3). Additionally, we utilized REVIGO [55] (http://revigo.irb.hr/) to analyze and generate non-repetitive gene ontology (GO) terms, which were then illustrated via a scatterplot based on the frequency and p-value of the GO terms (Figure 6). The bubble size and the value in the scatterplot depict the term significance. In total, 20, 17, and 29 terms were significantly enriched in the biological process, molecular function, and cellular component categories, respectively. The GO term "regulation of biological quality" (GO:0065008, FDR = 1.30 × 10⁻¹¹) was significantly enriched, indicating that the overlapping genes modulate a qualitative or quantitative trait of a biological quality (Table S2). The overlapping genes include LOC_Os08g43470 (qAro-8-4), which was found to be significantly associated with the aroma trait, and LOC_Os05g06460 and LOC_Os05g06430, which neighbor the QTN found for GL (qGL-5-1). CGs were enriched in other biological processes, such as cellular homeostasis, biological regulation, maintenance of location in the cell, positive regulation of gene expression, and cellular metabolic processes, indicating their role in cell development processes affecting grain quality traits. The term GO:0051173 (positive regulation of nitrogen compound metabolic process) overlapped with LOC_Os10g10990, suggesting its role in influencing aroma, in line with a previous study in which grain 2-acetyl-1-pyrroline (2-AP) biosynthesis was enhanced by nitrogen application at the booting stage [56]. Furthermore, molecular function and cellular component terms were annotated and found to be enriched among the 60 CGs. GO:0004148 (dihydrolipoyl dehydrogenase activity; FDR = 1.78 × 10⁻⁹), a multifunctional oxidoreductase, and GO:0003924 (GTPase activity; FDR = 1.42 × 10⁻³), involved in the G-protein signaling pathway that governs cell expansion and proliferation, were also identified. Enriched cellular components include the proteasome core complex, the clathrin coat of the coated pit, the trans-Golgi network transport vesicle, the intracellular organelle, and the cell itself. LOC_Os03g53110 and LOC_Os06g06030 were annotated as constituent parts of a cell (GO:0044464). GO:0005839 (proteasome core complex; FDR = 6.24 × 10⁻¹⁷) has a role in peptide cleavage at the C-terminal of hydrophobic, basic, and acidic residues. This multi-functional enzyme complex plays a role in the ubiquitin-proteasome pathway, regulating cell cycle progression. Hence, these findings suggest an impact of these CGs on rice grain size traits and grain aroma.
For further analysis of candidate genes, we applied the PlantGSAD platform, focusing on the TO category, chromatin states, and pathways (Tables S3-S6), to predict the potential functions of CGs related to agronomic traits. As shown in Figure 7, various development-related TO terms were significantly enriched, suggesting a putative role of CGs in the regulation of grain quality. TO:0000975 (grain width), TO:0000734 (grain length), and TO:0000587 (endosperm quality) were identified as significantly enriched (Figure 7). LOC_Os05g06480 (inorganic H+ pyrophosphatase), LOC_Os06g06050 (F-box/LRR-repeat protein), and LOC_Os06g06090 (mitogen-activated protein kinase 6) were found to be associated with the enriched TO terms. Inorganic H+ pyrophosphatase has been shown to play a role in cell size and seed development [57]. It has been demonstrated earlier that overexpression of F-box proteins reduced the levels of ethylene, promoted hull cell expansion, and increased grain size [58]. Also, mitogen-activated protein kinases (OsMAPK6), mainly located in the nucleus and cytoplasm, are ubiquitously distributed in various organs, predominantly in spikelets and spikelet hulls, and have a consistent role in rice grain size enhancement [31]. Hence, SNPs found to enrich TO terms can form causal variants or associated variants and establish a comprehensive collection of standardized trait vocabularies and descriptions, particularly for uncharacterized traits in crops.
The identification and understanding of plant chromatin states could provide valuable insights into the locations and roles of regulatory regions and genes, particularly in response to developmental cues and environmental stimuli. We studied the chromatin states of the candidate loci using PlantGSAD and the plant chromatin state database [59]. CGs were analyzed for their association with epigenetic markers of gene activation (H3K36me1, H3K36me2, H3K36me3, and H3K4me3) (Figure S3). Co-enrichment of H3K36me2 and H3K36me3 (combining effect) should be correlated with higher transcriptional activity [60]. Histone acetylation was also prominently associated with CGs. Previous studies have demonstrated that histone acetylation increases grain size by positively regulating hull cell proliferation [58].
For pathway analysis, we opted for MapMan to discern the pathways and processes linked to CGs, given its coverage of plant-specific pathways. CGs were identified as overlapping with protein degradation, protein synthesis, and protein targeting secretory pathways (Figure 8). Protein ubiquitination is involved in grain development via a cascade involving ubiquitin-activating (E1), ubiquitin-conjugating (E2), and ubiquitin-ligase (E3) enzymes, regulating proteasomal degradation, protein stability, and localization [61]. Therefore, CGs are found to be associated with metabolic pathways of grain-related traits through protein degradation, synthesis, and targeting secretory pathways.
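For readers unfamiliar with how term enrichment and FDR values of the kind reported in this section are typically obtained, the sketch below shows one common approach: a one-sided hypergeometric test per term followed by Benjamini-Hochberg adjustment. This is an assumption for illustration; the exact test and correction applied by PlantGSAD are not stated here. The background size and overlap counts are hypothetical; only the list size of 60 CGs comes from the study.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric.

    M: genes in the background, n: background genes annotated to the term,
    N: genes in the candidate list, k: candidates annotated to the term.
    """
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical example: 60 candidate genes, 5 of them annotated to a term
# that covers 20 genes in a background of 1000 genes.
p_term = hypergeom_sf(5, 1000, 20, 60)
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```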

Expression Profile of CGs
The RGAP database provides FPKM expression values for different tissues, including leaves at 20 days, post-emergence inflorescence, pre-emergence inflorescence, anther, pistil, seeds 5 days after pollination (DAP), embryos (25 DAP), endosperm (25 DAP), seeds (10 DAP), and shoots. Expression values of all 60 CGs in specific tissues are shown as a heatmap (Figure 9, Table S6). The size and quality of rice grains are shaped by the conjoined growth of maternal and zygotic tissues. Cell expansion acts as an intrinsic force, propelling longitudinal elongation of the caryopsis within the first 6 days after pollination (DAP), while lateral expansion predominantly takes place between

Exploration of Four Potential QTNs Associated with Grain-Related Traits
We selected four QTNs (qGL-5-1, qGW-6-1, qAro-8-4, and qLWR-3-2) anticipated to significantly impact grain size and quality traits, guided by gene ontology and the expression levels of candidate genes (CGs), and proceeded to investigate them further. The candidate region of 3.28 Mb to 3.33 Mb for qGL-5-1 was defined by the LD block (Figure S4), considering a threshold value of r² > 0.2. Three genes in this genomic region were potential CGs governing grain length in rice. qGL-5-1 was found to be associated with LOC_Os05g06470, which is a suppressor of Mek (MAP kinase kinases or mitogen-activated protein kinase kinases) [62]. The products of mitogen-activated protein kinase (MAPK) cascades are determined via phosphorylation of MAPK substrates. Several studies have reported a role for the MAPK signaling pathway in the control of grain size [31,63]. LOC_Os05g06450 (Tubulin/FtsZ domain-containing protein, OsTUG), neighboring LOC_Os05g06470, plays an important role in determining the location of cell division and promoting nuclear separation, and is expected to be essential for microtubule organization [17]. Therefore, LOC_Os05g06470, associated with the suppressor of Mek, may play a role in grain length attributes in rice.
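One simple way to delimit an LD block around a lead SNP with an r² > 0.2 threshold is sketched below. It assumes genotypes coded as allele dosages (0, 1, 2), uses the squared Pearson correlation of dosages as r², and extends the block outward from the lead SNP until the first neighbor that falls to or below the threshold. The tool and block definition actually used in the study are not stated here, and all positions and genotypes below are hypothetical.

```python
def r_squared(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    s11 = sum((a - m1) ** 2 for a in g1)
    s22 = sum((b - m2) ** 2 for b in g2)
    if s11 == 0 or s22 == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    s12 = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    return s12 * s12 / (s11 * s22)

def ld_block(positions, genotypes, lead, threshold=0.2):
    """Return (start, end) positions of the contiguous block around the lead SNP.

    positions: sorted SNP positions; genotypes: one dosage vector per SNP;
    lead: index of the lead SNP. Neighbors are added while their r2 with
    the lead SNP exceeds the threshold.
    """
    left = lead
    while left > 0 and r_squared(genotypes[lead], genotypes[left - 1]) > threshold:
        left -= 1
    right = lead
    while (right < len(positions) - 1
           and r_squared(genotypes[lead], genotypes[right + 1]) > threshold):
        right += 1
    return positions[left], positions[right]

# Hypothetical data: five SNPs typed in six accessions.
snp_positions = [100, 200, 300, 400, 500]
snp_genotypes = [
    [0, 2, 0, 2, 0, 2],  # uncorrelated with the lead SNP
    [0, 0, 1, 1, 2, 2],  # identical to the lead SNP
    [0, 0, 1, 1, 2, 2],  # lead SNP
    [0, 0, 1, 2, 2, 2],  # strongly correlated with the lead SNP
    [0, 2, 0, 2, 0, 2],  # uncorrelated with the lead SNP
]
block = ld_block(snp_positions, snp_genotypes, lead=2)
```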
The QTN for grain width, qGW-6-1, located at 2,804,963 bp on chromosome 6, shows peak association with grain width via the pLARmEB and ISIS EM-BLASSO methods, with an LOD score of 3.82. An LD block of 87.15 kb (2.74 Mb-2.83 Mb) was generated (Figure S5). Five candidate genes were found to be associated with grain width. The CGs for this block were LOC_Os06g06030 (peptidase, T1 family), LOC_Os06g06050 (OsFBL27-F-box domain and LRR containing protein), LOC_Os06g06090 (CGMC_MAPKCMGC_2_ERK.12-CGMC includes CDA, MAPK, GSK3, and CLKC kinases), LOC_Os06g06100 (dihydroneopterin aldolase), and LOC_Os06g06150 (zinc finger, C3HC4-type domain-containing protein). Gene annotation studies have shown their role in grain size development [58,64]. This QTN was also found in the vicinity of GW6 (LOC_Os06g44100), present approximately 23.7 Mb downstream. Therefore, LOC_Os06g06080, which is associated with a serine esterase family protein, may have a role in rice grain size development.
An LD block covering a 103.89 kb (27.3 Mb-37.4 Mb) genomic region was mapped with three CGs for a QTN found to be associated with the aroma trait (Figure S6). qAro-8-4 was found to be associated with LOC_Os08g43470, which encodes an ER lumen protein-retaining receptor responsible for retrieving ER lumen proteins from the Golgi apparatus. In a previous study [65], rice mutants were generated by editing three cytochrome P450 homologs (including LOC_Os08g43440, which neighbors qAro-8-4 in the current study), and these mutants exhibited increased grain size and 2-AP concentration. 2-AP is responsible for grain aroma. It was also demonstrated that the cellular component gene ontology of the rice mutant lines was enriched in membrane components [65]. Since the ER is a membranous cellular component, it may have direct or indirect roles in the synthesis of 2-AP. Another locus neighboring our QTN qAro-8-4, LOC_Os08g43430, is annotated as a CXE carboxylesterase, which belongs to the alpha/beta-hydrolase superfamily and contains 331 amino acids. CXE carboxylesterases are reported to positively regulate the catabolism of volatile esters in pear and peach fruit, enhancing the aroma and taste of the fruit [66]. Therefore, qAro-8-4 was found to be associated with LOC_Os08g43470 and may have a role in the pathways of synthesis of 2-AP, a volatile organic compound responsible for aroma.
The QTN for LWR (qLWR-3-2) was found to be associated with LOC_Os03g53110, which encodes a CorA-like magnesium transporter. Magnesium plays a role in rice grain yield and 1000-grain weight [67]. In wheat, magnesium transporters are important for starch distribution and increase in grain size [68]. All the neighboring genes near qLWR-3-2 have been reported to contribute to grain-size-related traits. Six CGs were identified in the genomic region of 127.25 kb (Figure 10). LOC_Os03g53050 (probable WRKY transcription factor 21) acts as a downstream receptor of the MAPK module, thereby regulating grain size through this module. LOC_Os03g53070, annotated as prenylated rab acceptor, regulates the vesicle trafficking of GTPases. LOC_Os03g53080, a zinc finger C3HC4-type domain-containing protein, belongs to the RING finger protein family reported to be involved in the determination of grain size in Arabidopsis and rice [64]. For instance, in Arabidopsis, DA1, a ubiquitin receptor, negatively regulates organ size and grain size by restricting the period of cell proliferation [69], whereas the RING finger protein BIG BROTHER (BB) is a repressor of plant organ size development. Small changes in BB expression levels substantially alter organ size, whereby mutation in genes like EOD1 (an enhancer of da1-1)/BB (Big Brother) and DA2, which encode RING-type E3 ligases, enhances seed size [69–71]. A decreased grain size1 (dgs1) mutant was also studied by Zhu et al. in rice [64] and showed reduced grain size compared to the wild type. Furthermore, a previously known gene, OsGW2, which encodes a RING-type ligase, also negatively regulates grain size [26].
Other colocalized loci include LOC_Os03g53100 and LOC_Os03g53150. LOC_Os03g53100 (response regulator receiver domain), which is highly expressed specifically in seeds (5 DAP), is a signal transducer involved in the His-to-Asp phosphorylation relay signal transduction system. It activates type A response regulators in response to cytokinin, which affects cell expansion and proliferation by regulating the cell cycle [72]. LOC_Os03g53150 is associated with OsIAA13; it is an auxin-responsive Aux/IAA gene family member containing 237 amino acids and is involved in the auxin signaling pathway, which modulates grain growth, cell expansion, and cell fate via signal transduction, auxin coupling, and auxin catabolism. The auxin-deficient mutant tsg1 (tillering and small-grain 1) resulted in reduced grain size, whereas overexpression of BG1 (Big Grain 1) resulted in larger grains due to the alteration in hull cell proliferation as compared to wild types [31,73]. This suggests that chromosome 3 is a hotspot of grain size regulatory genes, forming a cascade which modulates the seed length-width ratio. Hence, the LD block is shown for QTN qLWR-3-2 (LOC_Os03g53110), which forms the major QTN for LWR.

Discussion
Finding genomic regions linked to quantitative traits before utilizing them in practical breeding to improve specific traits is essential. Enhancing grain size in rice has captured researchers' interest due to its significant influence on yield. Several studies have aimed to map the genomic regions controlling these traits and identify the genes involved [6,74]. Connecting gene-based markers to genomic regions controlling grain size traits holds substantial promise as it can address multiple traits simultaneously [75]. Genome-wide association analysis using diverse germplasm accessions offers numerous advantages over traditional bi-parental mapping for identifying QTLs [76]. In our study, the use of the rice northeast core set and rice landraces collected from different states of India has been an added advantage for diversifying the GWAS panel. The rich genetic diversity present in the rice landraces and northeast core collection establishes it as a significant repository of genetic variability and a potential reservoir of beneficial alleles for rice breeding. Due to limited studies on grain-size- and grain-quality-related traits, there is abundant scope to explore the natural variation that exists in germplasm accessions for grain-quality-related characters. Aroma, on the other hand, comes from volatile organic compounds, one of them being 2-AP, but this is influenced by many factors such as temperature, soil type, etc. [44]. Some studies have revealed a single recessive gene to be responsible for aroma; others have discovered a dominant gene [44,45]. Researchers have used different methods to screen traits of interest. As an alternative to classical markers or DNA/molecular marker methods, GWAS has become very prominent and useful. It is exceedingly challenging to explore and harness exotic genes across large collections; hence, the northeast core and rice landrace collection serves to ease this. In this study, 483 genotypes with high potential constituted the association panel and were evaluated for grain size parameters and aroma. After high-throughput genotyping, significant associations were analyzed.

Phenotypic Variation and Trait Correlation
The outcomes of the analysis of variation, skewness, and kurtosis support the constitution of the association panel aimed at detecting potential QTNs via marker-trait associations for grain-related traits utilizing GWAS (Figure 1, Table 1). In the present study, considerable phenotypic diversity was observed with respect to grain size traits within the population, implying a plentiful presence of allelic variation. A positive and strong correlation was observed between GL and GW, which was consistent with previous studies (Ponce et al., 2020 [9]). Negligible or zero skewness was observed for aroma and the grain size parameters, similar to the results obtained previously [6].

Population Structure, Genetic Relatedness and LD Decay Analysis
The population structure analysis revealed the presence of three subpopulations within the GWAS panel of 483 rice genotypes. This finding was consistent with the observations from PCA and the NJ tree, which also identified three distinct groups. These findings align with the results reported in previous GWAS studies [77–79]. These subpopulations are thought to arise from allelic sharing, attributed to the accumulation of alleles over time due to spontaneous mutations [80]. Finally, the kinship matrix, displayed as a heat map with relatedness values ranging from −2.0 to +2.0, indicates weak relationships among individuals in the association panel. These findings aided in understanding the population structure before embarking on the GWAS to identify potential genomic regions associated with grain-related traits. In GWAS, it is essential to conduct genetic structure analysis prior to association analysis, regardless of the type of population, to give an overview of the association panel. LD decayed with the increase in physical distance between SNP pairs and reached half its maximum at ~130 kb for the GWAS panel. Studies have shown that LD decay distances of 100 kb and over [81] are best suited for association studies. The effectiveness and precision of association mapping rely on the level of LD within the population being studied. A self-pollinating species such as rice, where LD spans around 100 kb and beyond, is exceptionally well suited for GWAS [82]. In the current study, the LD decay distance was calculated as 133.46 kb, which goes beyond 100 kb, suggesting a well-suited GWAS panel.

Dissecting Candidate Genes around the Stable QTNs Identified
Utilizing GWAS for mapping marker-trait associations also enables the discovery of beneficial combinations of alleles concerning trait expression. This aspect is crucial for breeding programs focused on incorporating advantageous alleles in crop enhancement efforts [83]. Association analysis with five multi-locus methods was performed with 483 rice germplasms, and 40 QTNs were identified for four grain-related traits (GL, GW, aroma, and LWR).
Single-locus GWAS models, despite their frequent use, come with drawbacks such as a notably high false-positive discovery rate. Correcting for multiple testing, often with the Bonferroni correction, becomes necessary, yet this approach has been criticized for being so stringent that it dismisses associations that could actually be valid [84]. Conversely, multi-locus GWAS models mitigate the need for multiple-testing correction and provide more accurate estimates of QTN effects [85]. Enhancing grain size in rice has captured researchers' interest due to its significant influence on yield. Several studies have aimed to map the genomic regions controlling these traits and identify the genes involved [6,75]. In the current study, out of the 40 QTNs identified, 16 QTNs, i.e., qGL-4-1, qGL-5-1, qGL-12-1, qGW-2-1, qGW-4-1, qGW-6-1, qGW-12-1, qAro-1-2, qAro-3-1, qAro-3-2, qAro-5-1, qAro-8-1, qAro-8-3, qAro-8-4, qAro-10-1, and qLWR-3-2, were detected with more than one of the ML-GWAS methods. Some of them are discussed as follows.
qGL-5-1 was found to be associated with LOC_Os05g06470, which is annotated as a suppressor of mek (mitogen-activated protein kinase kinase), putative, expressed. The products of MAPK (mitogen-activated protein kinase) cascades are determined via the phosphorylation of MAPK substrates. An earlier study showed that the SMG1 gene, coding for MKK4, influences rice grain size by promoting cell proliferation [63]. Loss of OsMKKK10 function results in small, light grains and semi-dwarf plants with short panicles [28]. This QTN also showed a positive allelic effect on GL (Table 2).
Among the 16 QTNs, qGW-2-1 was confined to LOC_Os02g42600, which is annotated with an RNA-binding motif. La proteins in rice were characterized previously as RNA-binding proteins [86]. The mutant form of these La proteins showed reduced grain length and pollen fertility. qGW-4-1 demonstrated a significant association confined to LOC_Os04g58320, a rice gene that encodes a zinc finger, RING-type protein (putative, expressed). Zinc finger RING proteins are a type of RING finger protein that have a conserved RING domain and mainly function as E3 ubiquitin ligases. In rice, this domain plays a role in various processes, such as regulating grain size [64] and salt tolerance [87]. qGW-12-1, identified by two ML-GWAS methods, corresponds to LOC_Os12g25200, which is a chloride transporter. The chloride channel family negatively regulates salt tolerance in rice [88]. However, a member of the chloride efflux transporter family is involved in mediating grain size as well [89]. A neighboring gene, located 16 kb downstream of LOC_Os12g25200 (qGW-12-1), is LOC_Os12g25210, which is a signal-peptidase complex subunit 1. Signal-peptide peptidase is a multi-transmembrane aspartic proteinase involved in regulated intramembrane proteolysis, which is implicated in fundamental life processes such as immunological response, cell signaling, tissue differentiation, and embryogenesis. It may play a role in rice grain width, which requires further analysis [90].
For aroma, among the 16 QTNs, qAro-8-4 was associated with LOC_Os08g43470, which encodes an ER lumen protein-retaining receptor. The ER lumen protein-retaining receptor manages the retention of endoplasmic reticulum (ER) proteins within the ER lumen. It plays a crucial role in determining the specificity of this retention system and is essential for the smooth movement of vesicles through the Golgi apparatus. The gene ontology analysis of cellular components in rice mutant lines revealed enrichment in membrane components. Given that the ER is a membranous cellular component, it may play direct or indirect roles in the synthesis of 2-AP. The role of membrane components in regulating aroma volatiles is worthy of comprehensive study and confirmation. A neighboring gene within the LD of LOC_Os08g43470 (qAro-8-4) is LOC_Os08g43440, which is related to cytochrome P450; mutations in a P450 homolog have been reported to generate high yield and improved aroma in rice, so this gene can serve as a candidate gene for rice aroma [65]. Another neighboring gene near qAro-8-4 is LOC_Os08g43430, related to CXE carboxylesterase. Carboxylesterase is positively correlated with the catabolism of volatile esters in pear fruit and enhances their aroma quality [66]. Therefore, LOC_Os05g06470, LOC_Os06g06080, LOC_Os08g43470, and LOC_Os03g53110 have been proposed as the CGs for GL, GW, grain aroma, and LWR, respectively.
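The rule of retaining QTNs detected by more than one ML-GWAS method can be illustrated with a short Python sketch. The QTN labels and the method assignments below are placeholders chosen for illustration, not results from this study, and the function name is hypothetical.

```python
from collections import defaultdict


def common_qtns(hits_by_method, min_methods=2):
    """Return QTNs reported by at least `min_methods` different methods.

    hits_by_method: dict mapping a method name to an iterable of QTN labels.
    The result maps each retained QTN to the sorted list of methods that detected it.
    """
    methods_per_qtn = defaultdict(set)
    for method, qtns in hits_by_method.items():
        for qtn in qtns:
            methods_per_qtn[qtn].add(method)
    return {
        qtn: sorted(methods)
        for qtn, methods in methods_per_qtn.items()
        if len(methods) >= min_methods
    }


# Placeholder detections (hypothetical, for illustration only).
toy_hits = {
    "mrMLM": ["qtn_A", "qtn_B"],
    "FASTmrMLM": ["qtn_A"],
    "pLARmEB": ["qtn_C"],
    "ISIS EM-BLASSO": ["qtn_B", "qtn_C", "qtn_D"],
}
print(common_qtns(toy_hits))
```

In this toy input, qtn_D is reported by a single method and is therefore dropped, whereas the other three labels are retained.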

Favorable/Superior Allele Analysis
GWAS is an important tool for detecting favorable alleles for various traits in many plants [91]. The association study identified significant associations and, in turn, superior alleles, which may play a key role in modulating the agronomic traits of rice. In our study, the superior allele of SNP AX-95936094 was present in only 30% of the northeast core set but showed a positive effect on GL. However, the potential molecular functions of the significant markers still need to be investigated. Likewise, SNPs AX-95946823 for GW, AX-95918031 for aroma, and AX-95918592 for LWR showed low percentages of superior alleles. Individual markers may sometimes account for only minor phenotypic variance, but the aggregation of favorable alleles from diverse marker loci into a single recipient parent can yield significantly greater effects, potentially leading to the development of elite cultivars [92]. We anticipate that these superior alleles, if properly exploited, could significantly impact grain-related traits and provide valuable insights for breeding enhanced rice varieties in the future.
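The share of accessions carrying a superior allele can be computed as in the following minimal Python sketch. It assumes one homozygous allele call per inbred accession, with missing calls coded as None; the function name and the toy genotypes are hypothetical and do not reproduce the study's data.

```python
def superior_allele_percentage(genotypes, superior_allele):
    """Percentage of accessions with a non-missing call that carry the superior allele.

    genotypes: list of single-allele calls per accession (e.g. "A", "G"),
               with None for missing data. Assumes homozygous inbred lines.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no non-missing genotype calls")
    carriers = sum(1 for g in called if g == superior_allele)
    return 100.0 * carriers / len(called)


# Hypothetical calls for one SNP across eleven accessions (one missing).
toy_calls = ["A", "G", "G", "A", "G", None, "G", "G", "G", "G", "A"]
print(superior_allele_percentage(toy_calls, "A"))
```

Running the same function separately on the northeast core set and on the landrace set would give the per-subset percentages compared in the text.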

Enrichment Analysis of Identified Candidate Genes
In the current study, GWAS revealed 258 genes for the four traits based on gene expression data and annotations. Of these, 60 genes showed higher expression in specific tissues and were predicted to be involved in pathways affecting grain quality. Gene functional enrichment analysis showed that the GO term "regulation of biological quality" (GO:0065008, FDR = 1.30 × 10⁻¹¹) was significantly enriched, indicating that the overlapping genes modulate a qualitative or quantitative trait of a biological quality (Table S1). The overlapping genes include LOC_Os08g43470 (qAro-8-4), which was found to be significantly associated with aroma. The term GO:0051173, associated with the positive regulation of nitrogen compound metabolic processes, coincided with LOC_Os10g10990, indicating its potential involvement in influencing aroma. This link is supported by prior research demonstrating that increased nitrogen application enhances aroma in rice [56]. Additionally, molecular function and cellular component terms were annotated and found to be enriched in the 60 candidate genes (CGs).
Analyzing trait ontology is a productive approach for exploring the connections between genes and traits. LOC_Os05g06480 (encoding inorganic H+ pyrophosphatase), LOC_Os06g06050 (coding for F-box/LRR-repeat protein), and LOC_Os06g06090 (related to mitogen-activated protein kinase 6) were identified as being linked to the enriched TO terms. All three have been shown to be related to grain size, directly or indirectly. Hence, trait ontology (TO) serves as a valuable tool for the systematic exploration of molecular mechanisms that underlie agronomic traits.
Genome-wide maps of chromatin states have become a powerful representation of genome annotation and regulatory activity. Chromatin functions as a platform for organizing the genome and overseeing gene expression, cell division, differentiation, and more. Epigenetic regulation, including DNA methylation, histone modifications, and variants, plays a pivotal role in governing chromatin structure. The interplay of various epigenetic mechanisms can give insight into the regulatory roles of genes. In our study, CGs were found to be associated with epigenetic marks of gene activation (H3K36me1, H3K36me2, H3K36me3, and H3K4me3). Epigenetic elements, encompassing chromatin histone modifications, DNA alterations, and miRNA regulation, operate independently of changes in DNA sequence. Reports have highlighted the role of epigenetic mechanisms in regulating grain size in rice and other plant species. For instance, RAV6 encodes a B3 DNA-binding domain protein, and heightened expression of this gene correlates with smaller grains, regulated, in part, by methylation [93]. The semidominant mutation Epi-rav6, which mimics RAV6 overexpression, increases leaf inclination and alters grain size by impacting BR (brassinosteroid) homeostasis. BR signaling via the ubiquitin pathway has been shown to play a direct role in grain size [58]. Thus, chromatin states, together with gene ontology and trait ontology, indicate that the CGs contribute, with different degrees of activity, to grain size development and grain aroma. Finally, the presence of a nearby LD block encompassing the robust QTNs qGL-5-1, qGW-6-1, qAro-8-4, and qLWR-3-2, along with the surrounding candidate genes, indicates a relatively stable heritability for their associated traits, potentially unaffected by LD block effects.
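Term enrichment of this kind is commonly assessed with an over-representation test. The Python sketch below uses a one-sided hypergeometric test as a generic illustration; it is an assumption that a test of this family is appropriate here, and the sketch does not claim to reproduce the statistic or the FDR correction applied by PlantGSAD. The function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population: number of background genes.
    annotated:  background genes carrying the term of interest.
    sample:     number of candidate genes tested.
    overlap:    candidate genes carrying the term.
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probabilities of observing `overlap` or more annotated genes.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 annotated, 4 candidates, 2 overlapping.
print(hypergeom_enrichment_p(population=20, annotated=5, sample=4, overlap=2))
```

When many terms are tested, the resulting p-values would still need a multiple-testing adjustment before being compared with an FDR threshold.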

Association Mapping Panel
A diverse set of 483 rice germplasm accessions was utilized in the study: 190 accessions from the northeast core set of India, covering the states of Tripura, Manipur, Nagaland, Assam, Meghalaya, Mizoram, and Arunachal Pradesh, and 293 rice landraces from other states of India, viz. Uttar Pradesh, Jharkhand, Chhattisgarh, Andaman, West Bengal, and Uttarakhand. These accessions constituted the association panel for the GWAS, used to identify significant marker-trait associations for GL, GW, grain aroma, and LWR. The details of the genotypes are presented in Table S1.

Phenotyping and Phenotypic Analysis
Seeds were procured from the gene bank in two sets, with fifty grains in each set, for evaluation of grain quality traits. One set of seeds underwent dehusking and milling in the laboratory using a rice husker and milling machine (model JGMJ 8098, Made-in-China, Nanjing, China) after the paddy had been cleaned and brought to the optimal moisture level. Three grain-related traits, GL, GW, and LWR, were measured using a digital scanner, the Biovis PSM Seed Analyzer (Limburg an der Lahn, Germany). The second set of fifty grains, after dehusking and milling, was used for the evaluation of aroma. The aroma was evaluated using the sensory KOH method [94]. Two fragrant Basmati rice varieties, namely, Pusa-1121, given an aroma score of 3, and Pusa Basmati-1 (PB-1), given an aroma score of 2, along with a non-aromatic rice variety, Pusa-44, with an aroma score of 0, were employed as checks [46]. Each sample was evaluated by seven experts to verify the phenotype. The range, mean value, deviation, and phenotypic coefficient of variation for each trait were computed using R. The interrelationships among quality traits were examined by assessing the linear correlation with the R package psych [95].
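The descriptive statistics and the linear correlation described above were computed in R; the Python sketch below only illustrates the same quantities. It assumes that the coefficient of variation is the sample standard deviation expressed as a percentage of the mean, and the function names and toy measurements are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def describe(values):
    """Range, mean, sample SD, and coefficient of variation (%) of one trait."""
    m = mean(values)
    sd = stdev(values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": m,
        "sd": sd,
        "cv_percent": 100.0 * sd / m,
    }


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equally long lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical grain length and width measurements (mm) for five accessions.
toy_gl = [5.0, 6.0, 7.0, 8.0, 9.0]
toy_gw = [3.0, 2.8, 2.6, 2.4, 2.2]
print(describe(toy_gl))
print(pearson_r(toy_gl, toy_gw))
```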

DNA Isolation and SNP Genotyping
Eight to ten seeds were carefully placed on 30 × 45 cm seed germination paper, with 2 to 3 cm gaps between them. The paper was properly folded and placed in a germination tray with a water level of up to three centimeters. These trays were then kept in a growth chamber at 28 °C and 90% relative humidity. Rice accessions were grown in batches over two weeks, with accessions from each region processed separately. Young leaves were collected after 15 days, and genomic DNA was isolated from the 483 germplasm accessions using the CTAB method [96]. The DNA quality was assessed on a 0.8% agarose gel, and the DNA was quantified using a NanoDrop spectrophotometer (NanoDrop™ 2000/2000c, Thermo Fisher Scientific, Greenville, NC, USA). The 50 K SNP chip, based on single-copy genes and covering all 12 rice chromosomes, was used for genotyping the 483 rice germplasm accessions. The chip provides extensive genome-wide coverage, with an average distance of 0.745 kbp between adjacent SNPs.
SNP identification and array design of the 50 K SNP chip: Gene-based SNPs were identified from public databases (OryzaSNP, Gramene v6, Raleigh, NC, USA) and through in-house sequence alignment with Bioedit v7.2 and ClustalW 2.1, focusing on 35 bases around each SNP. SNP assays were designed and validated in silico using the AxiomGTv1 algorithm of APT. SNPs with p-convert values >0.30 were selected, resulting in a rice Affymetrix chip containing 50,051 high-quality SNPs.
Target probe preparation and 50 K rice SNP array hybridization: Rice genomic DNA was extracted using the CTAB method, quantified with a NanoDrop spectrophotometer, and checked on a 1% agarose gel. For target probe preparation, 20 µL of gDNA at 10 ng/µL was used per sample, following the Affymetrix Axiom® 2.0 Assay Manual (Affymetrix Axiom®, Thermo Fisher Scientific, Singapore). The process included DNA amplification, fragmentation, chip hybridization, single-base extension, and signal amplification, followed by staining and scanning.
SNP allele calling and data analysis: SNP genotypes were called using the Affymetrix Genotyping Console™ v4.1 software. SNPs with low call rates were excluded, retaining those with call rates >95.0% [97].

Phylogenetic Study, Population Structure, Kinship, and Linkage Disequilibrium (LD) Analysis
For the final analysis, a stringent filtering strategy was applied to choose high-quality SNPs for association. Markers were imputed using Beagle v4 [98]. Markers with a minor allele frequency (MAF) < 0.05 and markers with over 10% missing reads were excluded from the analysis. The final count of markers stood at 35,286. The neighbor-joining tree was constructed based on the SNP data using TASSEL v5.2.82 [99] and visualized using the Interactive Tree of Life (iTOL) software v6 [100]. The number of subgroups in the association mapping panel was estimated using both a model-based approach in STRUCTURE 2.3.4 [101] and principal component analysis (PCA). To infer the number of genetic clusters (K), STRUCTURE was run from K = 1 to K = 10 with three iterations each. For each run, 100,000 burn-in steps followed by 100,000 Markov chain Monte Carlo simulations were implemented. The optimum K was determined according to Evanno et al. (2005) [102], as implemented in STRUCTURE HARVESTER. PCA, as incorporated in the package "Adegenet v2.1.10" [103] (genomic association and prediction integrated tool) running under the R environment, was also used to infer population structure. The first two PCs were plotted using ggplot2 in R 3.4.2 to visualize the dispersion of the rice accessions. The LD analysis was performed via pairwise comparisons in the set of 35,286 SNP markers (MAF ≥ 0.05) using the LD function in TASSEL v5.2.82 [99]. The LD decay distance was taken as the physical distance (x-axis, bp) at which r² (y-axis) dropped to half of its maximum value.
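The two marker filters described above (MAF and missing rate) can be sketched in Python as follows. This is a simplified illustration that assumes biallelic SNPs and one homozygous allele call per inbred accession, with missing calls coded as None; the function name and defaults are hypothetical, and the sketch does not model the imputation step or the order in which the study applied the filters.

```python
def passes_filters(calls, maf_min=0.05, max_missing=0.10):
    """Return True if one SNP passes the MAF and missing-rate filters.

    calls: allele calls for one SNP across accessions (e.g. "A", "G"),
           with None for missing data.
    """
    n = len(calls)
    called = [c for c in calls if c is not None]
    if n == 0 or not called:
        return False
    missing_rate = 1.0 - len(called) / n
    if missing_rate > max_missing:
        return False
    counts = {}
    for c in called:
        counts[c] = counts.get(c, 0) + 1
    if len(counts) != 2:
        # Monomorphic or multi-allelic markers are dropped in this sketch.
        return False
    maf = min(counts.values()) / len(called)
    return maf >= maf_min


# Hypothetical markers: a common variant, a rare variant, and a sparsely called one.
print(passes_filters(["A"] * 18 + ["G"] * 2))
print(passes_filters(["A"] * 39 + ["G"]))
print(passes_filters(["A"] * 8 + ["G"] * 8 + [None] * 4))
```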

Association Analysis and Identification of Potential Candidate Genes
A GWAS analysis was conducted on 483 rice accessions utilizing 35,286 high-quality SNPs with the default settings of the mrMLM software v4.0.2 to estimate the significant associations for grain length, grain width, grain length-width ratio, and grain aroma. Five multi-locus models, namely, mrMLM, FASTmrMLM, FASTmrEMMA, pLARmEB, and ISIS EM-BLASSO, were run through the R package (https://cran.r-project.org/web/packages/mrMLM/index.html, accessed on 22 November 2023) to pinpoint candidate QTNs. To mitigate potential false positives due to population structure, the analysis incorporated the first three principal components (PCs) and a kinship matrix as covariates within the framework. Significant QTNs were identified using an LOD score ≥3 as the threshold [104]. The common QTNs detected by any two ML-GWAS models were considered good candidates for rice-grain-related traits. All genes situated within the LD decay distance of the identified QTNs were extracted and subjected to comprehensive gene annotation to find candidate loci for each trait, employing the Rice Genome Annotation Project Database (RGAP, http://rice.uga.edu/) [105].
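To put the LOD ≥ 3 threshold in context, an LOD score can be converted to an approximate p-value and compared with a Bonferroni cut-off for 35,286 markers. The Python sketch below assumes that the likelihood-ratio statistic, 2 ln(10) × LOD, follows a chi-square distribution with one degree of freedom; the function names are hypothetical. Under this assumption, LOD = 3 corresponds to p ≈ 2 × 10⁻⁴, which is far less stringent than 0.05/35,286 ≈ 1.4 × 10⁻⁶.

```python
from math import erfc, log, sqrt


def lod_to_p(lod):
    """Approximate p-value for an LOD score, assuming a 1-d.f. chi-square test.

    The likelihood-ratio statistic is 2 * ln(10) * LOD, and the chi-square(1)
    survival function at x equals erfc(sqrt(x / 2)).
    """
    lrt = 2.0 * log(10.0) * lod
    return erfc(sqrt(lrt / 2.0))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


print(lod_to_p(3.0))
print(bonferroni_threshold(0.05, 35286))
```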

Candidate Gene Analysis and Gene Functional Enrichment Analysis
The Rice Genome Annotation Project Database (RGAP, MSUv7.0, http://rice.uga.edu/) [105] and Information Commons for Rice (IC4R, http://ic4r.org/) were used to search and functionally annotate the putative genes within the ±130 kb genomic region of the common significant QTNs identified via the various ML-GWAS models. Candidate genes were selected based on the following criteria: (a) neighboring genes of significant QTNs lying within the LD decay distance; and (b) genes with a known function in rice, or Arabidopsis orthologs, associated with the traits of interest. Subsequently, gene ontology (GO) enrichment analysis was conducted on the 60 putative candidate genes to gain better insight into their molecular and biological roles, using the Plant Gene Set Annotation Database (PlantGSAD) analysis tool [106] and analyzing the data with REVIGO [55]. For further analysis, trait ontology (TO) categories, pathway gene sets (MapMan gene set), and chromatin states were selected for gene set enrichment analysis using PlantGSAD. TO terms, pathways, and chromatin states with a p-value less than 0.05 were regarded as significantly enriched, indicating that the set is enriched with the genes of a particular pathway or functional category.
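Collecting the genes that fall within the ±130 kb region around a QTN amounts to an interval-overlap query. The Python sketch below assumes gene coordinates are available as (gene ID, chromosome, start, end) tuples with 1-based inclusive positions; the function name, gene IDs, and coordinates are hypothetical placeholders rather than RGAP records.

```python
def genes_near_qtn(qtn_chrom, qtn_pos, genes, window_bp=130_000):
    """Return IDs of genes overlapping [qtn_pos - window_bp, qtn_pos + window_bp].

    genes: iterable of (gene_id, chrom, start, end) tuples, 1-based inclusive.
    """
    lo = max(1, qtn_pos - window_bp)
    hi = qtn_pos + window_bp
    return [
        gene_id
        for gene_id, chrom, start, end in genes
        if chrom == qtn_chrom and start <= hi and end >= lo
    ]


# Placeholder annotation records (hypothetical).
toy_genes = [
    ("gene_1", "chr5", 1_000_000, 1_005_000),
    ("gene_2", "chr5", 1_140_000, 1_145_000),
    ("gene_3", "chr5", 1_300_000, 1_310_000),
    ("gene_4", "chr6", 1_100_000, 1_102_000),
]
print(genes_near_qtn("chr5", 1_100_000, toy_genes))
```

Genes that only partly overlap the window are kept in this sketch; a stricter rule could require the whole gene to lie inside it.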

In Silico Gene Expression Analysis of Candidate Genes
The expression values available at the Rice Genome Annotation Project Database (RGAP, MSUv7.0, http://rice.uga.edu/) and the Rice Expression Database (RED, http://expression.ic4r.org/) were utilized to investigate all the candidate genes in different tissues to further depict the associations between genes and phenotypic traits. Annotated genes were compared with their homologs using the Arabidopsis Information Resource (TAIR) database (https://www.arabidopsis.org/). The R package "heatmap" was used to create a heatmap depicting the FPKM values of the candidate genes. Genes with elevated expression in specific tissues and putative or known functions associated with the desired traits were identified and further investigated. A schematic representation of the methodology followed in our study is shown in Figure 11.

LD Block Analysis
Four candidate genes associated with the traits of interest were further investigated. The LD blocks of the four selected robust loci were generated from the filtered SNPs following the confidence interval method established by Gabriel [107]. LD heatmaps were created using the LDBlockShow tool v1.39. All genes situated within the LD decay range of the identified QTNs were mined and subjected to comprehensive investigation.

Conclusions
In the current study, five ML-GWAS models were employed for grain-related traits on a GWAS panel of 483 rice accessions, 190 from the northeast core set and 293 rice landraces, using 35,286 SNPs. The numbers of QTNs identified were 8, 9, 12, and 11 for GL, GW, aroma, and LWR, respectively. Among the 40 QTNs in total, 16 were detected by two or more ML-GWAS methods simultaneously. We examined all 16 genomic loci linked to grain quality traits in rice genome assembly version 7 and annotated them accordingly. Probable candidate genes (CGs) were sought within the 130 kb genomic region surrounding each of the 16 common QTNs. Across these 16 QTNs, 258 genes were identified in close proximity. Among them, 60 genes exhibited elevated expression levels in specific tissues, as indicated by the available FPKM values from the RGAP and RED databases, and were predicted to play roles in pathways influencing grain quality. Meanwhile, we also studied the superior alleles in the northeast core set and the rice landrace set. Some superior SNP alleles were present in less than 30% of the genotypes, indicating the need to uncover their potential molecular function, which would benefit pyramid breeding. Subsequently, gene annotation, gene ontology, trait ontology, and enrichment analysis showed that the 60 CGs were enriched in GO terms relevant to the studied traits, and they also showed higher expression in seeds at 5 and 10 DAP, suggesting an association between the CGs and the grain size and quality traits. LOC_Os05g06470, LOC_Os06g06080, LOC_Os08g43470, and LOC_Os03g53110 were confirmed as key candidates for the aforementioned traits by expression analysis, GO, and TO, as well as pathway analysis. Choosing elite genotypes with a higher occurrence of desirable alleles linked to grain size traits and aroma could accelerate the pace of rice enhancement, tackling issues concerning food security and sustainable rice cultivation.
Moreover, from a breeding point of view, the forty marker-trait associations (MTAs) exhibiting significant associations with grain-related traits hold promise for gene cloning and, in turn, can be leveraged for marker-assisted selection (MAS) aimed at enhancing grain-related traits. The MTAs uncovered in the current study are pivotal, as they may be linked to minor genes influencing the target traits. Utilizing SNP markers associated with specific loci, favorable alleles can be stacked in newly emerging varieties to enhance their traits. Furthermore, these varieties have the potential to serve as valuable parents in breeding programs aimed at enhancing grain quality parameters through genetic enhancement. GWAS models have also uncovered genetic variants linked to more than one trait, a phenomenon known as pleiotropy. The identification of pleiotropic loci is significant for comprehending the common origins of diseases and complex traits. Thus, candidate genes can help in MAS and precision breeding, whereby traits like GL can be precisely targeted without affecting other traits. Breeders can also leverage this knowledge to pyramid multiple favorable alleles into elite rice varieties, leading to further improvements in grain length and other related traits, such as grain weight and yield. Furthermore, the potential candidate genes identified are crucial targets for future studies aimed at functional characterization. Such research can help bridge the gaps within, and/or construct, a genetic framework for the signaling pathways that regulate grain size and aroma in rice.

Figure 1. Correlation coefficient matrix, scatter plot, and phenotypic frequency distribution among grain-related traits. Each variable's distribution is displayed on the diagonal. The bivariate scatter plots with a trend line are shown below the diagonal. The correlation coefficients, with the level of significance displayed as stars, are shown above the diagonal (*** indicates significance at p < 0.001).

Plants 2024, 13, x FOR PEER REVIEW 2 of 12

Figure 2. SNP marker distribution and LD decay of 483 rice accessions: (a) the distribution of SNPs within a 1 Mb window size across the 12 rice chromosomes; (b) LD decay distance in the whole population. Pairwise LD (r²) values were plotted against the corresponding pairwise physical distance (bp) of SNP markers. The red line indicates the trend line of the non-linear regression against physical distance.


Figure 3. Population stratification and diversity analysis of 483 accessions using 35,286 high-quality SNPs: (a) principal component analysis 3D plot (black: 190 northeast core accessions; red: 293 rice landraces; blue ellipses show three subclusters within the 483-accession panel); (b) heatmap of pairwise kinship matrix; (c) phylogenetic tree based on the neighbor-joining method.

Figure 4. Manhattan plots and quantile-quantile plots for (a) grain length, (b) grain width, (c) aroma, and (d) length-width ratio using five multi-locus models. The horizontal dotted line indicates the threshold LOD score ≥3. The dots above the threshold value represent the significant QTNs on different rice chromosomes; the dots in pink represent QTNs detected by ≥2 models.


Figure 5. (a) Superior and inferior allele distribution in the northeast core; (b) superior and inferior allele distribution in rice landraces.


Figure 6. GO enrichment analysis of CGs by PlantGSAD and REVIGO. The scatter plots illustrate the cluster representatives for (a) biological processes, (b) cellular component, and (c) molecular function, positioned in a two-dimensional space comprising significant GO terms with semantic similarities. Bubble color and size signify the −log10(p) value.


Figure 7. Trait ontology tree depicting the TO terms associated with CGs. Boxes in the diagram represent the TO terms corresponding to a seven-digit ID number preceded by TO, their description, and p-value. Colored nodes indicate the significantly enriched TO terms, and arrows indicate the relationship between consecutive nodes.

Figure 8. Tree depicting the pathways and processes of MapMan associated with CGs. Boxes in the diagram represent the bincode followed by the description of pathways and p-value.


Figure 9. Heatmap showing the normalized FPKM expression values of the 60 CGs for the four grain quality traits.

Figure 11. Schematic representation of the methodology followed in our study.

Table 1. Phenotype variation and distribution pattern of four grain-related traits.

Table 2. The 40 associated QTNs identified for four grain-related traits (the 16 QTNs identified simultaneously by two or more ML-GWAS methods are marked in bold).

Table 3. List of 60 candidate genes with their functional annotation.