Genome-Wide Association Study of Seed Morphology Traits in Senegalese Sorghum Cultivars

Sorghum is considered the fifth most important crop in the world. Despite the potential value of Senegalese germplasm for various traits, such as resistance to fungal diseases, there is limited information on the study of sorghum seed morphology. In this study, 162 Senegalese germplasms were evaluated for seed area size, length, width, length-to-width ratio, perimeter, circularity, the distance between the intersection of length & width (IS) and center of gravity (CG), and seed darkness and brightness by scanning and analyzing morphology-related traits with SmartGrain software at the USDA-ARS Plant Science Research Unit. Correlations between seed morphology-related traits and traits associated with anthracnose and head smut resistance were analyzed. Lastly, genome-wide association studies were performed on phenotypic data collected from over 16,000 seeds and 193,727 publicly available single nucleotide polymorphisms (SNPs). Several significant SNPs were found and mapped to the reference sorghum genome to uncover multiple candidate genes potentially associated with seed morphology. The results indicate clear correlations among seed morphology-related traits and potential associations between seed morphology and the defense response of sorghum. GWAS analysis listed candidate genes associated with seed morphologies that can be used for sorghum breeding in the future.


Introduction
Sorghum [Sorghum bicolor (L.) Moench] is a gluten-free cereal widely consumed throughout Africa [1,2]. It is a climate-resilient and drought-tolerant crop used for animal feed, biofuels, forage, ethanol production, and fodder preservation; it is arguably one of Africa's most versatile food crops [3]. Plant genetics researchers have frequently employed community association panels to investigate the genetic basis of naturally occurring phenotypic variation in several traits [4]. Sorghum germplasm lines from West and Central Africa are cultivated in rainy and high-humidity regions and have been a significant source of critical agronomic traits such as fungal disease resistance [5]. To date, Senegalese germplasms have been extensively tested through genome-wide association studies (GWAS) to identify novel sources of resistance genes against fungal diseases such as anthracnose and head smut [5][6][7], but the germplasms have yet to be widely studied for other agronomically important traits such as seed morphology.
Morphological variations in seed characters, such as differences in seed size and shape, are important traits in plant identification and classification of taxa, and are useful parameters for analyzing plant biodiversity [8,9]. Seed morphology is of agronomic importance, as it reflects genetic, physiological, and ecological components that affect yield, quality, and market price [8]; for instance, acceptance of high lysine-containing sorghums is limited due to many problems associated with their opaque kernel characteristics [10,11]. Breeding targets encompass seed shape and size, as seed size-related traits bear particular importance, and a comprehensive understanding of the genes underlying seed morphology equips breeders with the capacity to develop novel cultivars harboring desired characteristics [12].
Sakamoto et al. evaluated 329 sorghum germplasms from various origins and identified SNPs potentially associated with seed morphology, including SNP loci S01_50413644, S04_59021202, and S05_9112888 based on multi-traits GWAS [13]. Zhang et al. identified 73 quantitative trait loci (QTL) related to grain color and tannin content in Chinese sorghum accessions, and a new recessive allelic variant of Tannin2 was discovered [14]. A GWAS conducted on a diverse set of 635 Ethiopian sorghum accessions found variations in loci harboring seed protein genes involved in seed storage, late embryogenesis, and tannin biosynthesis, all of which are associated with sorghum grain mold resistance [15].
To add to our understanding of sorghum seed morphology, in this study, 162 Senegalese sorghum germplasms along with controls (BTx623, PI609251, and PI659985, which are widely used cultivars in sorghum research) were evaluated for various seed morphologies, including seed area size, length, width, length-to-width ratio (LWR), perimeter, circularity, the distance between the intersection of length and width (IS) and center of gravity (CG), and seed darkness and brightness. In addition to identifying variation for each trait, possible correlations among the characteristics were evaluated across the subset of the Senegalese collection. Furthermore, Ahn et al. previously studied identical germplasms to evaluate resistance to anthracnose at the seedling and 8-leaf stages and for head smut based on distinctive spot appearance rate and average time for spot appearance on the first leaf under water [6,7,16,17]. By taking advantage of the previous studies, the seed morphology-related traits were also analyzed to identify potential correlations with the traits associated with anthracnose and head smut. Finally, the phenotypic data collected from over 16,000 seeds for the traits were combined with 193,727 single nucleotide polymorphisms (SNPs) throughout the genome to perform GWAS regarding single traits and multi-traits. The top candidate SNPs were tracked back to the reference sorghum genome, resulting in the identification of multiple candidate genes potentially associated with seed morphology-related traits.

Seed Morphologies
Based on the two-tailed test, the ANOVA for the 165 accessions, including BTx623, PI609251, and PI659985, showed significant differences, with p < 0.0001 for all evaluated traits (raw data available through Supplementary Data S1). The Shapiro-Wilk test indicated a normal distribution for the distance between IS and CG (p = 0.086), whereas the other traits deviated from normality (p < 0.0001; Figure 1). Brightness and circularity were skewed to the right, and the other traits had a few to multiple outliers (Figure 1). The top five cultivars for each trait are shown in Table 1 (detailed phenotypic data are available through Supplementary Data S1). For example, the area size for PI514293 was 6.67 ± 0.86 mm², while that for PI514404 was 19.26 ± 3.41 mm² (Table 1 and Figure 2). Similarly, the seed colors of PI514471 and PI514419 showed great contrast (Table 1 and Figure 3). Significant phenotypic variations were observed in other traits across the population as well.

Figure 1. [...] (e) length-to-width ratio, (f) circularity (0-1 range), (g) distance between IS and CG (mm), and (h) brightness (0-255 range). Box plots above each histogram indicate the mean value (diamond), percentiles, upper and lower whiskers, and outliers. Median ranges are shown in red.

Correlations among the Seed Morphology-Related Traits
Based on Pearson's correlation analysis, all evaluated seed size-related traits were correlated with each other except for the perimeter-circularity and LWR-circularity pairs (Figure 4 and Table 2). Seed brightness showed no correlation with any of the seven seed size-related traits, indicating that the seed color-related trait is largely independent of seed size-related traits.

A PCA using eight seed morphology-related traits revealed that PC1 and PC2 explain 75.83% of the overall variation (Figure 5). The plot of the partial contribution of variables for eight traits revealed that PC1 comprises area size, perimeter, length, width, and length-to-width ratio (Figure 6). Circularity and distance between IS and CG mainly contribute to PC2. Seed brightness contributes mostly to PC3.
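As an illustration of how a "variance explained" figure such as the 75.83% above is obtained, the following is a minimal sketch, not the software used in the study. It assumes a correlation-based PCA (traits standardized before analysis) and uses power iteration with deflation to extract the leading eigenvalues; the function names and the toy trait values are hypothetical.

```python
import math

def standardize(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def correlation_matrix(traits):
    """Correlation matrix of a list of trait columns (one list per trait)."""
    z = [standardize(t) for t in traits]
    n = len(traits[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]

def top_eigenpair(matrix, iterations=500):
    """Leading eigenvalue and eigenvector of a symmetric matrix by power iteration."""
    p = len(matrix)
    vec = [float(i + 1) for i in range(p)]  # arbitrary non-symmetric start vector
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(iterations):
        nxt = [sum(matrix[i][k] * vec[k] for k in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, vec
        vec = [v / norm for v in nxt]
    value = sum(vec[i] * matrix[i][k] * vec[k] for i in range(p) for k in range(p))
    return value, vec

def explained_variance(traits, n_components=2):
    """Share of total variance captured by each of the first n_components PCs."""
    m = correlation_matrix(traits)
    p = len(m)
    shares = []
    for _ in range(n_components):
        value, vec = top_eigenpair(m)
        shares.append(value / p)  # eigenvalues of a correlation matrix sum to p
        # Deflation: remove the component just found before the next pass.
        m = [[m[i][j] - value * vec[i] * vec[j] for j in range(p)] for i in range(p)]
    return shares

# Hypothetical toy data: two perfectly correlated traits and one unrelated trait.
toy_traits = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [1.0, -2.0, 0.0, 2.0, -1.0],
]
shares = explained_variance(toy_traits, n_components=2)
print([round(s, 4) for s in shares], round(sum(shares), 4))
```

In the toy data the first two components capture all of the variance because one trait is redundant; with real seed measurements the cumulative share of PC1 and PC2 would be read off in the same way.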

Correlations between the Seed Morphology-Related Traits and Anthracnose and Head Smut Resistance Responses in Sorghum
Based on multivariate correlation studies between seed morphology-related traits and sorghum responses to anthracnose and head smut, we identified two major correlations supported by significant p-values. There was a moderately strong negative correlation between head smut spot appearance rate (%) and circularity (Figure 7a). On the other hand, head smut spot appearance rate (%) showed a moderate positive correlation with the distance between IS and CG, which is directly associated with seed morphology and circularity (Figure 7b). Seed area size and length were also negatively correlated with the spot appearance rate, although these correlations did not reach significance (p-values ≈ 0.05, but higher than 0.05). Other seed morphology-related traits did not show correlations with the disease-associated traits.

Population Structure and GWAS Analysis
The population structure analysis revealed two major groups across the accessions tested in this study (Figure 8). The dendrogram displayed nearly identical results to the admixture plot (Figure 9). Overall, the population structure analysis aligned with previous studies indicating that the botanical subrace played a major role in shaping the diversity patterns of the population [7,18,19]. PCA plots and phylogenetic trees were omitted because they were reported in a recent study of this population [7]. Figure 10 and Table 3 show the top SNPs identified from GWAS and their associated genes. LD heatmaps of the regions near the statistically significant SNP loci indicated low LD (Figure 11). Overall, 100 SNP variants passed the Bonferroni threshold before secondary filtering with a t-test. The number of SNPs varied by trait (3 SNPs for area size, 11 for perimeter, 0 for length, 9 for width, 3 for LWR, 0 for distance between IS and CG, 56 for PCs, and 18 for brightness). After filtering with a t-test, multiple SNPs were excluded from the list, leaving the top SNPs that can be used as novel sources for sorghum seed morphology-related traits in breeding programs.
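For readers unfamiliar with the Bonferroni threshold mentioned above, the following sketch shows the arithmetic. The family-wise alpha of 0.05 and the choice of the number of tests are assumptions: the text does not state whether the correction was based on the 132,024 filtered SNPs or the 193,727 extracted SNPs, so both are shown.

```python
import math

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

# SNP counts taken from the text; alpha = 0.05 is an assumption.
for label, n_snps in (("filtered", 132_024), ("extracted", 193_727)):
    cutoff = bonferroni_threshold(n_snps)
    print(f"{label}: p < {cutoff:.3e}  (-log10 p > {-math.log10(cutoff):.2f})")
```

A SNP would be called significant only if its association p-value falls below the printed cutoff, which is why the threshold appears as a horizontal line near 6.4 to 6.6 on a -log10 scale in a typical Manhattan plot.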

Discussion
Seed weight and size are critical yield components, and selecting for large seeds has long been a goal in crop domestication [20,21]. Measures of size and shape in seeds and their correlation are equally essential in current breeding to improve yield or quality [8]. As one of the most important crops worldwide, sorghum seed morphologies and their associations with molecular markers are of potential use not only for breeding but for evaluating the role of specific genes in seed shape and size. This study explored eight important sorghum seed morphology traits in Senegalese sorghum accessions that have yet to be extensively studied regarding seed-related traits. Figure 4 and Table 2 show that many seed size-related traits, including area size, perimeter, length, width, and length-to-width ratio, were correlated. Circularity and distance between IS and CG were also associated with other seed size-related traits. However, based on PCA analysis (Figures 5 and 6), these traits were not closely grouped with other seed traits. Seed brightness, primarily explained by PC3, was not significantly associated with the other seed traits.
In the multivariate correlation studies examining the relationship between seed morphology-related traits and sorghum responses to anthracnose and head smut, we observed two significant correlations. Firstly, we found a moderate negative correlation between the head smut spot appearance rate (%) and circularity. Secondly, the head smut spot appearance rate (%) showed a moderate positive correlation with the distance between IS and CG. Craig and Frederiksen conducted a seedling inoculation for sorghum using peat pellets against Sporisorium reilianum (Kühn) Langdon & Fullerton (syns. Sphacelotheca reiliana (Kühn) G.P. Clinton and Sorosporium reilianum (Kühn) McAlpine) causing sorghum head smut [16,22]. The sorghum seedlings at the 1-leaf stage were inoculated with teliospore cultures. After four days, the seedlings were submerged in water-filled test tubes, and the presence of brown or dark spots on the first leaf blade distinguished susceptible genotypes from resistant ones. Craig and Frederiksen explained the spots caused by the fungal pathogen, but it is unclear if the spots are present due to fungal infection or due to a plant defense response [16,22]. Although the observed correlations may be mere coincidences, Seiwa et al. reported results suggesting that seed size may play a role in conspecific negative distance-dependent seedling mortality and negative density-dependent seedling survival variation (CNDD), and that seed size may promote species coexistence by influencing distance-dependent pathogen attacks, especially those related to leaf diseases in eight tree species [23]. A study conducted in the Peruvian Amazon suggested a positive correlation between tree seed size and susceptibility to pathogen attack [24]. Specifically, the results indicated that larger and shade-tolerant seeds exhibited a higher vulnerability to pathogen attack than smaller seeds relying on light dependence [24]. 
A positive relationship between seed weight and susceptibility to pathogens was also found [24]. Ahn et al. [7] reported top candidate genes associated with the spot appearance when inoculated with S. reilianum at the seedling stage. The top candidate genes (F-box and leucine-rich repeat protein (Sobic.004G202700), ankyrin repeats (Sobic.002G174700) and xyloglucan endotransglucosylase (Sobic.004G273200)) were located near growth and development-related genes such as rhodanese-like domain-containing protein-like (Sobic.004G202600) [25], cellulase/endoglucanase (Sobic.004G202800) [26], protein kinase AFC1 (Sobic.004G202500) [27], serine/threonine protein kinase (Sobic.002G174801) [28], legume lectin domain (Sobic.002G174600) [29], transcription factor jumonji (jmj) family protein/zinc finger (C5HC2 type) family protein (Sobic.004G273100) [30], ubiquitin and ubiquitin-like proteins (Sobic.004G273300) [31], and MYB-like DNA-binding domain (Sobic.004G273000) [32]. Therefore, it is speculated that the correlations between the head smut spot appearance rate and the two morphology-related traits are due to strong genetic linkage.
Zinc finger proteins play essential roles in plant growth, development, and responses to abiotic stresses such as drought, salt, temperature, reactive oxygen species, and harmful metals [33]. As listed in Table 3, zinc finger-associated genes were linked with multiple SNPs potentially associated with seeds (SNP loci for area: S06_12058855, perimeter: S02_36482136 and S07_16244875, width: S06_12058855 and S07_16244901, and PCs: S06_54692844 and S06_54708681). MicroRNAs (miRNAs) play an essential role in regulating plant development by mediating target genes at transcriptional and post-transcriptional levels [34], and DUF3537 (perimeter-associated SNP locus S06_39973848) is one of the predicted target genes of miRNAs in Acacia crassicarpa [35]. DUF3537 is a member of the transmembrane protein family. Transmembrane proteins are recognized to play a role in biological stress response [36]. Examples include pathogen-induced cysteine-rich transmembrane proteins [37], suppressors of NPR1 Constitutive2 proteins [36], polygalacturonase-inhibiting proteins [38][39][40], and ankyrin repeat-containing proteins [41].
Ribonucleotide reductase (perimeter and width: SNP loci S02_36533975 and S02_36673863) is critical for the de novo synthesis of deoxynucleoside triphosphates (dNTPs) in DNA replication and cell cycle progression [42]. Homeodomain leucine-zipper proteins (perimeter: SNP locus S01_72472059) interact genetically to align morphogenesis and environmental responses by modulating phytohormone-signaling networks [43]. SNP locus S02_46916695 (perimeter) is located next to Sobic.001G447400 and associated with TPR and ankyrin repeat. The TPR-containing protein TTL1 in Arabidopsis regulates plant responses to abscisic acid (ABA) in seeds and seedlings [44]. Ankyrin repeat-containing proteins are essential in cell growth, development, and response to hormones and environmental stresses [45], and ankyrin-TPR repeat gene clusters in rice are associated with panicle branching diversity [46]. SNP locus S06_51902835, one of the top SNPs associated with perimeter, was 2571 bp away from a calcium/calmodulin-dependent protein kinase gene. In rice, OsDMI3 (calcium/calmodulin-dependent protein kinase)-mediated phosphorylation of OsMKK1 (MAPK kinase) activates the MAPK cascade and positively regulates abscisic acid responses in seed germination, root growth, and tolerance to both water stress and oxidative stress [47]. Glycosyltransferase, tagged by the seed perimeter-associated SNP locus S02_46926845, is known to have roles in seed coat mucilage composition [48]. Auxin is a plant hormone central to plant growth and development from embryogenesis to senescence, and PB1 domains (perimeter: SNP locus S04_67200818) mediate interactions between auxin response factor ARF5 and repressor IAA17 [49]. The closest gene to SNP locus S01_52374442 (width) was Sobic.001G271500, which contains a leucine-rich repeat. Leucine-rich repeat proteins are critical for growth promotion, seed maturation, stress response, and enhanced seed production [50,51]. 
Leucine-rich repeat proteins such as the polygalacturonase-inhibiting proteins are involved in plant defense [52]. The cytochrome P450 superfamily, tagged by SNP locus S03_5165375 (PCs), plays critical roles in plant growth and development, biotic and abiotic stress responses, and metabolic diversification [53,54]. Because PC1, which was used as the input phenotype with the other PCs included as covariates, predominantly consists of area size, perimeter, length, and width, the multivariate GWAS results mainly reflect traits associated with seed size.
SNP locus S06_58756099 (PCs) tagged farnesyl-pyrophosphate synthetase, which is known to be associated with plant development [55]. The N-terminal domains of Arabidopsis rhamnose synthases RHM1, 2, and 3 have UDP-D-glucose 4,6-dehydratase activity (PCs-related SNP locus S04_62452658), and rhamnose synthases are required for the development of root hairs and cotyledon pavement cells and the synthesis of seed mucilage [56]. SNP loci S06_8316056 and S03_5151221 (PCs) were related to protein transport-related genes. In plants, the Golgi apparatus is central to synthesizing complex cell wall polysaccharides and glycolipids in the plasma membrane and adding oligosaccharides to proteins destined to reach the cell wall, plasma membrane, or storage vacuoles [57]. Rho GTPases (PCs-related SNP locus S04_62432641) modulate plant growth and development. Similarly, PPR proteins (PCs-related SNP locus S04_62438345) also play important roles in seed development, plant growth and development, and stress responses [58]. WD40 repeat genes, tagged by SNP loci S03_51995650 (PCs) and S10_15656073 (brightness), are reported to be associated with anthocyanin accumulation, seed pigmentation, seed germination, seed growth, and biomass [59,60]. DNA helicases, encoded by a gene close to SNP locus S01_53382065 (associated with seed brightness), are molecular motor proteins that have suggested roles in cell division/proliferation during flower development, maintenance of genomic methylation patterns, and the plant cell cycle, as well as in basic cellular activities [61].
The brightness-related SNP locus S01_2657706 is found near amyloid beta precursor protein-binding protein 1 (APPBP1). Numerous amyloids are involved in pathogenesis; plant amyloids are poorly studied but are known to play roles in the autonomous flowering pathway and post-translational modification [62]. Ribonuclease III (brightness-related SNP locus S01_66050539) is responsible for the processing and maturation of RNA precursors into functional rRNA, mRNA, and other small RNAs. However, no role in seed color or development has been reported for these genes. SNP locus S06_37776989 (width) tagged a gene of unknown function (DUF493), and SNP locus S10_53260179 (brightness) was located in a region with no annotated gene nearby. These SNPs may be associated with seed morphology but could also be false positives arising by chance.
Most of the genes identified are involved in plant development and physiological processes with potential aspects of seed morphology. Bonferroni correction is often considered highly conservative [63], and in this study, 100 SNPs passed the Bonferroni threshold. Furthermore, we verified the average score/SNP through a simple t-test and filtered out any SNP that failed to pass p < 0.05. Even with these strict methods to minimize false positives, there are many novel SNPs potentially conferring changes in various seed morphologies. Hence, it is expected that most identified SNPs are genuinely associated with seed morphology-related traits that can be used for sorghum breeding in the future. Further studies should investigate the relationship between seed morphology-related characteristics and molecular markers to better understand seed morphology-related genes in the subset of the Senegalese sorghum collection and other collections within the National Plant Germplasm System of the US.

Seed Phenotypic Evaluation
A total of 162 cultivars from the Senegalese germplasm collection (complete list available in Supplementary Data S1) maintained by the USDA-ARS, Plant Genetic Resources Conservation Unit, Griffin, Georgia, and controls (BTx623, PI609251, and PI659985) were evaluated for seed area size (mm²), length (mm), width (mm), length-to-width ratio (LWR), perimeter (mm), circularity (0-1 range, 0: not circular to 1: complete circle), the distance between the intersection of length and width (IS) and center of gravity (CG), and seed darkness and brightness (0-255 range, 0: complete black to 255: pure white). CG is the point where the seed's mass is concentrated and IS are points where the width and length hit the boundary for the seed parameter [64].
Around 80 to 100 seeds were evaluated for each trait across all the cultivars and controls, except for PI659985, where the available number of seeds was only 50. Seed images were captured with Canon imageRUNNER ADVANCE C7270 (Canon Inc., Tokyo, Japan). With the SmartGrain (version 1.3) high-throughput phenotyping software, the scanned seed images were measured for seed area size, length, width, LWR, perimeter, circularity, and distance between IS and CG [64]. Any errors generated by SmartGrain were manually corrected for each image. Seed darkness and brightness were measured using a multi-point function in ImageJ version 1.54d [65].
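To make the derived shape descriptors concrete, the sketch below computes LWR and circularity from basic measurements. The circularity formula 4π × area / perimeter² is an assumption (it is the common image-analysis definition and matches the 0-1 range described above, with 1 for a complete circle); the exact formula used by SmartGrain should be checked against its documentation. Function names and example values are hypothetical.

```python
import math

def length_to_width_ratio(length_mm, width_mm):
    """LWR: seed length divided by seed width."""
    return length_mm / width_mm

def circularity(area_mm2, perimeter_mm):
    """Assumed definition: 4*pi*area / perimeter**2 (1.0 for a perfect circle)."""
    return 4.0 * math.pi * area_mm2 / perimeter_mm ** 2

# Sanity check with ideal shapes (not seed data):
circle = circularity(math.pi * 2.0 ** 2, 2.0 * math.pi * 2.0)  # circle, radius 2 mm
square = circularity(3.0 * 3.0, 4.0 * 3.0)                     # square, side 3 mm
print(round(circle, 3), round(square, 3), length_to_width_ratio(4.0, 2.0))
```

Under this definition, elongated or irregular outlines score lower than round ones, which is consistent with circularity and LWR carrying partly different information about seed shape.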

Statistical Analysis
For each trait, Tukey's HSD test for all possible cultivar comparisons was performed with JMP Pro 15 (SAS Institute, Cary, NC, USA). One-way ANOVA was performed for each trait separately. Pearson's correlation was calculated for all possible pairs of seed morphology-related traits with JMP Pro 15. For anthracnose and head smut resistance traits [6,7,16,17], both Pearson's (for parametric traits: all traits except anthracnose score-related traits) and Spearman's rank (for non-parametric traits including the 1-5 scale scoring for anthracnose-related traits) correlation tests were performed. The Shapiro-Wilk test was conducted to evaluate normal distribution in each trait. A principal component analysis (PCA) was performed using data from the eight measured traits. An additional PCA was performed using the seven correlated traits, excluding the color trait, for multivariate GWAS analysis.
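The distinction between the two correlation tests named above can be sketched in a few lines. This is an illustrative pure-Python version, not the JMP Pro procedure used in the study; the function names and the example scores are hypothetical. Spearman's coefficient is computed here as Pearson's coefficient on average ranks, which handles the ties that are common in 1-5 ordinal scores.

```python
import math

def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical example: a continuous trait against an ordinal 1-5 score.
circularity_values = [0.70, 0.74, 0.78, 0.81, 0.85, 0.88]
disease_scores = [5, 4, 4, 3, 2, 1]
print(round(pearson(circularity_values, disease_scores), 3),
      round(spearman(circularity_values, disease_scores), 3))
```

Because Spearman's coefficient depends only on rank order, it is the safer choice for ordinal disease scores, while Pearson's coefficient is appropriate for the continuous seed measurements.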

GWAS and Population Genomic Analysis
For GWAS, a total of 193,727 SNPs were extracted from an integrated sorghum SNP dataset based on sorghum reference genome version 3.1.1 and initially genotyped using GBS [18,66-68]. Phenotypic data for each trait were analyzed univariately with a mixed linear model (MLM) in the TASSEL version 5.2.55 [69] association mapping software to identify chromosomal locations associated with each trait. Moreover, a PCA was performed with TASSEL version 5.2.55 for the traits that showed high correlations (all traits except seed color in this study), and a multivariate GWAS [PC GWAS based on PCs for phenotypic data (PC1 = data and other PCs = covariates)] was conducted through MLM. False associations were minimized by removing SNPs with more than 20% unknown alleles and SNPs with minor allele frequency (MAF) below 5%, resulting in 132,024 SNPs [70]. The SNPs that passed the Bonferroni threshold were mapped back to specific chromosome locations in the publicly available sorghum reference genome sequence, version 3.1.1, available at Phytozome 13 (https://phytozome.jgi.doe.gov, accessed on 25 May 2023) [71]. The mean values for Senegalese germplasms with either of the two prevalent bases were determined for each of the prospective genes listed in Table 3. The differences in these mean values were verified to be significant (p < 0.05) using JMP Pro 15 (SAS Institute, Cary, NC, USA). SNPs that did not pass the t-test were excluded from the list of candidate SNPs. PLINK v1.9 [72] was used for VCF file conversions and to randomly select 50,000 SNPs from the genotypic data for analysis of population structure in R v4.1.2 and RStudio v1.4.1717. The packages FactoMineR v2.8 [73] and factoextra v1.0.7 [74] were used to conduct PCA and determine optimal k-means clustering using the average silhouette method, respectively. 
ADMIXTURE v1.3.0 [75] was run with the optimal k-means clustering value to visualize population structure. SNPRelate v1.28.0 and gdsfmt v1.30.lil0 [76] were used to generate a dendrogram of the accessions from the genotype data, which was visualized using ggtree v3.2.1 [73] to further validate population structure, and assign accessions to genetic groups. LD heatmap v1.0-6 [77] was utilized to plot LD of local variants around statistically significant SNPs.
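The SNP quality filter described in this section (more than 20% unknown alleles or MAF below 5% leads to removal) can be sketched as follows. This is an illustration rather than the TASSEL implementation; the genotype encoding (alternate-allele count 0, 1, or 2 per accession, `None` for a missing call), the function name, and the toy SNPs are assumptions.

```python
def passes_qc(calls, max_missing=0.20, min_maf=0.05):
    """Return True if a SNP survives the missing-rate and MAF filters.

    calls: one entry per accession, the alternate-allele count (0, 1, 2)
           or None when the genotype is unknown (assumed encoding).
    """
    called = [g for g in calls if g is not None]
    if not called:
        return False
    missing_rate = (len(calls) - len(called)) / len(calls)
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    # Remove SNPs with MORE than max_missing unknown calls or MAF BELOW min_maf.
    return missing_rate <= max_missing and maf >= min_maf

# Hypothetical SNPs scored on ten accessions.
snps = {
    "snp_common": [0, 0, 0, 0, 0, 0, 0, 0, 0, 2],        # MAF = 0.10
    "snp_monomorphic": [0] * 10,                           # MAF = 0.00
    "snp_sparse": [0, 2, None, None, None, 0, 2, 0, 2, 0],  # 30% missing
}
kept = [name for name, calls in snps.items() if passes_qc(calls)]
print(kept)
```

Applying such a filter before association testing reduces the number of tests, which also affects how stringent a Bonferroni-type threshold becomes.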

Conclusions
In this study, we analyzed seed morphology-related traits in a subset of Senegalese germplasm. Even though low genetic diversity was found, the accessions showed a wide range of morphological traits. Intriguingly, there were potential associations between seed morphology-related traits and head smut spot appearance rate, which may be explained by genetic linkage. Nearly all the candidate genes from GWAS analysis had known roles in plant growth and development. The functions of the identified genes can be validated using techniques such as real-time quantitative reverse transcription PCR (real-time qRT-PCR), RNA sequencing analysis (RNA-Seq), and CRISPR-Cas9-associated gene editing. Although applying gene-editing techniques in monocot crops is challenging, rapid developments in gene editing technology will offer fast and precise functional validations of the candidate genes. In addition, it is essential to survey seed morphology-related traits in other sorghum populations to identify additional candidate genes and genes that overlap in multiple populations.