Variations in Total Protein and Amino Acids in the Sequenced Sorghum Mutant Library

Sorghum (Sorghum bicolor) is the fifth most important cereal crop worldwide; however, its utilization in food products can be limited due to reduced nutritional quality related to amino acid composition and protein digestibility in cooked products. Low essential amino acid levels and digestibility are influenced by the composition of the sorghum seed storage proteins, kafirins. In this study, we report a core collection of 206 sorghum mutant lines with altered seed storage proteins. Wet lab chemistry analysis was conducted to evaluate the total protein content and 23 amino acids, including 19 protein-bound and 4 non-protein amino acids. We identified mutant lines with diverse compositions of essential and non-essential amino acids. The highest total protein content in these lines was almost double that of the wild-type (BTx623). The mutants identified in this study can be used as a genetic resource to improve the sorghum grain quality and determine the molecular mechanisms underlying the biosynthesis of storage protein and starch in sorghum seeds.


Introduction
Sorghum (Sorghum bicolor), belonging to the Poaceae family, is the fifth major cereal crop worldwide, after wheat, rice, corn, and barley, in terms of grain productivity [1]. Sorghum is widely used in animal feed, fodder, and high-value products, such as syrup and bioethanol, making it a promising candidate for multipurpose feedstock [2,3]. Because of its excellent resistance to drought in agro-ecological zones with limited rainfall, where the yield of other cereal crops is insufficient, sorghum is considered a potential alternative to nutrient-enriched food [4]. Sorghum is a major crop in the semiarid and arid areas of the world, particularly in Africa, where it is a staple food for a sizable portion of the population [5]. However, it has lower nutritional value than other crops [6,7]. Therefore, it is important to develop improved sorghum varieties with higher nutritional value.
Sorghum grain can have a wide range of protein content, but on average contains approximately 11% crude protein [8,9]. The most prevalent proteins, prolamins (kafirins in sorghum), are present in the endosperm, with the majority found in spherical protein bodies [10]. Notably, kafirin accounts for approximately 48-70% of the total protein content in whole grains and up to 80% of the protein content in decorticated kernels [2,11]. Starch is the main carbohydrate in sorghum grains. Similar to protein content, total starch can vary widely, but on average the starch content in sorghum is ~70%. Starch consists of two polysaccharides: amylopectin, accounting for 70-80% of the starch, and amylose, constituting the remaining 20-30% [10,12]. Similar to maize and other cereal grains, sorghum grains have low quantities of oil and other lipids, generally 2-4% on a weight basis [13].
such as seed quality. For example, the FAD2-1A gene (Glyma.03G144500) has been shown to influence the concentration of linoleic acid. Similarly, mutations in the RS2 (raffinose synthase; Glyma.03G137900) gene result in higher sucrose levels and lower concentrations of the oligosaccharides raffinose and stachyose [53-55]. Another EMS-based mutant population was used to investigate the mechanism of oil production and generate germplasm for Brassica napus breeding [56]. In maize, the Waxy1 line was created by EMS-based mutagenesis, which resulted in low expression of granule-bound starch synthase I, leading to low amylose but high amylopectin levels in the seed [57,58]. OsNF-YB1 knockout in rice altered grain quality through changes in grain size, amylose, total starch, crude fiber, and lipid content, as well as increased protein content [59]. In recent years, induced mutations have been extensively used for breeding annual oilseed crops [60-63]. The maize Opaque-2 (O2) mutant has been used in breeding to produce new maize varieties with high lysine content and a reduced zein to glutenin ratio [64,65]. Hence, mutant germplasm produced by EMS that has the potential to improve seed quality and other agronomic traits may be employed as a complementary technique in the genetic improvement of cereal crops.
Although several EMS mutant populations have been developed previously, none of the sorghum mutant populations have been screened extensively for seed quality traits [48,49,66-68]. Sorghum has been underutilized in grain quality studies due to the limited availability of genetic resources. To facilitate functional genomics research in sorghum, we previously established a large sorghum mutant population containing more than 6400 M2 pools [48,49]. More than 1.8 million canonical EMS-induced mutations were discovered from the whole-genome sequencing of 256 mutant lines, which covered more than 95% of the sorghum genome. Interestingly, 97.5% of the generated mutations were not found among natural variants. This population has been of great value in characterizing important traits in sorghum, including epicuticular wax, seed size, and inflorescence development [49,69,70]. This mutant population has sufficient mutation density and low cross-fertilization, making it a useful genetic resource for examining the functions of sorghum genes. The underlying mutant resources, coupled with comprehensive grain quality analysis for essential and non-essential amino acids, provide an efficient platform for functional validation of genes related to important seed traits in sorghum.
In this study, we analyzed 23 amino acids in 206 ethyl methanesulfonate (EMS)-based sorghum mutant lines to provide a new germplasm resource for sorghum grain quality improvement. The EMS mutagenesis technique has gained momentum for generating mutant populations due to its high frequency of point mutations [71,72]. Herein, we used a wet chemistry approach to determine the levels of essential and non-essential amino acids in EMS-based mutant and wild-type sorghum lines to compare their nutrient dynamics. Our study may serve as a robust genetic resource for enhancing the genetic potential of sorghum to improve its grain nutrient content and aid in the development of effective sorghum breeding strategies.

Variations in the Seed Amino Acid and Total Protein Levels in the Mutagenized Sorghum Population
To identify any variations in amino acid levels in the sorghum mutant population, we analyzed the M3 generation of 206 mutant lines. This mutant population was generated via EMS treatment of the reference genome line, BTx623, and advanced through single-seed descent. Each mutant line contained, on average, 7660 mutations. The overall mutation density in the population was 11 SNPs/Mb, with a range varying from 0.02 to 22.5 SNPs/Mb [49]. We conducted wet lab chemistry analysis to determine the seed protein composition, including total protein content and the concentration of various amino acids in the mutant population. Wet lab chemistry analysis provides accurate and precise measurements of the content of each amino acid [73,74]. Total protein content and the distribution of 23 amino acids, including 19 protein-bound and 4 non-protein amino acids, are depicted in Figure 1A. Total protein content (10.28-19.01%) in 206 accessions indicated new genetic variability due to induced mutagenesis (Figure 1A). Surprisingly, the total protein content was higher in all mutant lines than in the BTx623 line (10.35%). ARS197 (19.14%), ARS152 (18.45%), and ARS (18.04%) showed increases in total protein content of 84.92, 78.26, and 74.29%, respectively (Figure 1A). In wild-type BTx623, the most abundant amino acid in sorghum seeds was glutamic acid (21.11%), followed by leucine (13.40%), alanine (9.34%), and proline (8.21%) (Figure 1B). The minimum and maximum concentrations of each amino acid in the mutant population are listed in Table 1. Several amino acids, namely hydroxylysine, methionine, taurine, cysteine, tyrosine, tryptophan, hydroxyproline, lanthionine, and ornithine, exhibited > 20% difference between their minimum and maximum concentrations, indicating diversity in the mutagenized population (Table 1).
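The relative increases quoted above follow from a simple percent-change calculation against the wild-type value. The following is a minimal sketch in Python that uses only the rounded percentages given in the text; the function name `percent_increase` is introduced here for illustration, and results may differ in the last digit from the published figures, which were presumably computed from unrounded measurements.

```python
def percent_increase(mutant: float, wild_type: float) -> float:
    """Percent change of a mutant value relative to the wild-type value."""
    return (mutant - wild_type) / wild_type * 100.0


# Total protein (%) reported in the text for wild-type BTx623.
BTX623_PROTEIN = 10.35

# Rounded total protein values (%) quoted in the text for two mutant lines.
reported = {"ARS197": 19.14, "ARS152": 18.45}

for line, protein in reported.items():
    change = percent_increase(protein, BTX623_PROTEIN)
    print(f"{line}: {change:.2f}% increase over BTx623")
```

The same function applies to any trait measured in both a mutant and the wild type, for example a single amino acid concentration in g/100 g of dry seed.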
Using the wet lab chemistry approach, we measured the concentrations of essential (histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine) and non-essential (alanine, arginine, asparagine, aspartic acid, cysteine, glutamic acid, glutamine, glycine, proline, serine, and tyrosine) amino acids in each mutant line. We found a positive correlation between the concentration of each amino acid and the total protein content (Figure S1); that is, the concentration of each amino acid increased with the total protein content in the seeds.
Compared to leucine, methionine, tryptophan, and phenylalanine (R² = 0.594, 0.485, 0.492, and 0.632, respectively), histidine, lysine, threonine, and valine (R² = 0.671, 0.918, 0.746, and 0.720, respectively) exhibited stronger correlations (Figure S2). This result suggests that the mutant lines with a high percentage of total protein had high concentrations of essential amino acids. Compared to the wild-type line, the ARS158, ARS125, and ARS197 mutant lines had double the concentrations of important amino acids. Histidine concentration was 0.41 g/100 g of dry seed in the ARS158 and ARS125 mutant lines compared to 0.22 g/100 g of dry seed in BTx623 (Figure 2A). Similarly, isoleucine, leucine, lysine, and phenylalanine concentrations were doubled in the ARS197 mutant line compared to those in the wild-type line. ARS197 had high concentrations of isoleucine, leucine, lysine, and phenylalanine, indicating positive correlations among these amino acids (Figure 2A-I). Compared to BTx623, the mutant lines had higher concentrations of the essential amino acids histidine, isoleucine, leucine, lysine, and phenylalanine. However, ARS159 showed a lower concentration of methionine (0.16 g/100 g of dry seed) than BTx623 (0.19 g/100 g of dry seed). Concentrations of non-essential amino acids showed a similar trend in the mutant population (Figure 3A-I). Similar to essential amino acids, non-essential amino acids also showed a positive correlation with the total protein content (Figure S2). All mutant lines had higher concentrations of alanine, proline, serine, arginine, and tyrosine than BTx623. However, five mutant lines exhibited lower concentrations of cysteine than BTx623 (Figure 3A). A similar trend was observed for the non-protein amino acids ornithine, taurine, and hydroxyproline (Figure S2).
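For readers who wish to reproduce this type of comparison, the coefficient of determination for a simple linear regression of one amino acid on total protein equals the squared Pearson correlation. The sketch below illustrates the calculation in plain Python. The data values are hypothetical placeholders, not measurements from this study, and the function name `r_squared` is ours; the source does not state which software was used to compute the reported R² values.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x.

    For a single predictor, R^2 equals the squared Pearson correlation.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return sxy ** 2 / (sxx * syy)


# Hypothetical example values (NOT data from this study):
# total protein (g/100 g seed) and one amino acid (g/100 g seed) per line.
total_protein = [10.4, 12.1, 13.8, 15.5, 17.2, 19.1]
amino_acid = [0.25, 0.28, 0.31, 0.36, 0.38, 0.41]

print(f"R^2 = {r_squared(total_protein, amino_acid):.3f}")
```

An R² close to 1 indicates that the amino acid concentration tracks total protein closely across lines, whereas a lower value indicates that factors other than total protein contribute to its variation.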

Variability in the Kernel Structure, Composition, and Starch Content
To further investigate the effects of mutations on the kernel structure, composition, and starch content, seeds from five mutant lines (ARS132, ARS137, ARS239, ARS136, and ARS140) and BTx623 were analyzed using near-infrared (NIR) spectroscopy. These mutant lines were randomly selected for comparison with BTx623. As illustrated in Figure 4, the mutations in these five lines affected the proportion of floury and vitreous endosperm, vitreosity, and the seed protein, total starch, and amylose contents. The proportion of vitreous endosperm was higher than that of floury endosperm in the ARS132, ARS137, ARS136, and ARS140 lines (Figure 4). However, ARS239 showed a pattern similar to that of BTx623. Floury and vitreous endosperm traits are directly related to the starch digestibility rate. These components are unique in their composition, and their relative proportions influence the final chemistry of the grain. In addition, the grain composition is closely intertwined with the physical structure [75]. Hence, the mutant lines generated in this study may be a useful resource for investigating the effects of floury and vitreous endosperms on the starch digestibility rate and other traits of sorghum seeds. The ARS132, ARS137, ARS136, and ARS140 mutant lines had higher vitreosity than the BTx623 and ARS239 lines (Figure 4), indicating that vitreosity is directly correlated with the proportion of floury and vitreous endosperm. We also quantified the total protein content via NIR spectroscopy and found a correlation of 0.857 with wet lab chemistry analysis, indicating the high quality and reliability of the data generated in this study. ARS239 had a higher protein content than the other mutant lines and BTx623, suggesting that the proportion of vitreous endosperm is negatively correlated with the total protein content. Compared with BTx623, no obvious changes were observed in any of the mutant lines. Interestingly, amylose content decreased by 26.673% in ARS239 compared to that in BTx623, suggesting that the floury endosperm has more amylose content than the vitreous endosperm (Figure 3). These results indicate that the mutant lines generated in this study have sufficient variability and can be used as a valuable resource for functional genomics research and sorghum breeding in the future.

Correlation between the Total Protein and Amino Acid Levels in the Mutagenized Sorghum Population
To determine the correlations between each amino acid concentration and the total protein content, the concentration of each amino acid relative to the total protein content (g amino acid/100 g protein) was calculated. Glutamic acid, lanthionine, alanine, leucine, hydroxylysine, and ornithine showed positive correlations with the total protein content (Figure 5), suggesting that their concentrations increase relative to the total protein content. Notably, glutamic acid, alanine, and leucine were the three most abundant amino acids in wild-type seeds. Taurine, hydroxyproline, threonine, serine, proline, glycine, cysteine, methionine, lysine, histidine, arginine, and tryptophan showed negative correlations with the total protein content. For example, the total protein content was increased by 89.34% in the ARS197 line compared to that in BTx623, but the lysine content was lower in the ARS197 line (2.16 g/100 g of protein) than in BTx623 (2.39 g/100 g of protein). de Borja Reis et al. [76] reported similar findings in soybeans. Aspartic acid, phenylalanine, isoleucine, valine, and tyrosine were not significantly related to the total protein concentration in the mutagenized sorghum population (Figure 5). These results indicate that the amino acid to protein ratio directly affects the nutritional value of sorghum seeds and should be specifically considered in the development of high-protein varieties.
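The distinction between the two units matters because an amino acid can increase on a seed basis while decreasing on a protein basis. The following minimal sketch shows the conversion from g/100 g of seed to g/100 g of protein. The input values are hypothetical and chosen only to illustrate the effect; they are not measurements from this study, and the function name `per_100g_protein` is ours.

```python
def per_100g_protein(aa_g_per_100g_seed: float, protein_g_per_100g_seed: float) -> float:
    """Convert an amino acid concentration from a seed basis to a protein basis.

    Returns g amino acid per 100 g protein.
    """
    if protein_g_per_100g_seed <= 0:
        raise ValueError("protein content must be positive")
    return aa_g_per_100g_seed / protein_g_per_100g_seed * 100.0


# Hypothetical values (NOT data from this study), in g/100 g of dry seed.
wild_type = {"protein": 10.0, "lysine": 0.24}
mutant = {"protein": 19.0, "lysine": 0.41}

for name, line in (("wild type", wild_type), ("mutant", mutant)):
    relative = per_100g_protein(line["lysine"], line["protein"])
    print(f"{name}: lysine = {line['lysine']:.2f} g/100 g seed, "
          f"{relative:.2f} g/100 g protein")
```

In this illustration, lysine rises on a seed basis in the mutant but falls on a protein basis because total protein rises proportionally more, which is the pattern described above for the amino acids that correlate negatively with total protein.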

Discussion
Sorghum is the fifth most important cereal crop and is grown worldwide for food and feed. In comparison to wheat, rice, and maize, sorghum may have lower nutritional value, which is directly affected by the protein quality of the grain. Along with lower protein digestibility, sorghum is low in essential amino acids, especially lysine. Therefore, it is necessary to develop new germplasm sources, such as mutant populations, with variability in seed composition, particularly in essential amino acid content. In this study, mutant lines were generated using an EMS-based approach. These mutant lines showed variation in seed amino acid concentrations compared to those in the BTx623 line and may be used in future sorghum breeding programs.
In this study, the mutants exhibited wide variability in the total protein content and 23 amino acids, including 19 protein-bound and 4 non-protein amino acids. Lasztity [77] and De Mesa-Stonestreet et al. [78] revealed that the total protein content in elite sorghum lines ranged from 6 to 18%, with an average of 11%. The lines generated in this study showed 10-19.2% total protein content (Figure 1A), which varied significantly in the mutant lines. Compared to BTx623, all mutant lines showed a higher total protein content (Figure 1A). Similar results have been reported in a soybean EMS-based mutant population [79]. In addition, the essential amino acid content was positively correlated with the total protein content. Several mutant lines with altered amino acid content were also observed. Therefore, the mutant population developed in this study can be used as a resource to enhance the sorghum seed quality by increasing its essential amino acid content.
Our mutant lines can aid in the identification of genes affecting seed quality. We identified promising mutants with increased total protein and essential amino acid contents in this study. We also observed positive correlations among various amino acids in the ARS197 mutant line. Hence, this mutant line can be used to determine the correlations among various amino acids. Mutants showing variability in kernel weight, kernel size, amylose content, vitreosity, and starch content may also exhibit a high yield. Therefore, the mutant lines generated in this study can be used in sorghum breeding programs to improve seed quality, for example by increasing the essential amino acid content and digestibility, and other important traits, such as yield, kernel hardness, and starch content.

Generating a Mutant Population
We generated a mutant population via EMS treatment, as previously described [49]. Briefly, approximately 100 g of BTx623 seeds were treated with 0.1-0.3% EMS (v/v) for 16 h at 50 rpm on a rotary shaker. The seeds were then washed thoroughly for 5 h in tap water at ambient temperature, with the water changed every 30 min. Seeds were air-dried and planted in the field at a density of 120,000 seeds per hectare. To prevent cross-pollination, the panicles of each plant were covered with a 400-weight rainproof paper pollination bag before anthesis. Mature seeds were harvested from each plant and advanced to the M2 generation by growing the resulting seeds one row per head (three individuals in each row), and the panicles were bagged before anthesis. One panicle was advanced to the M3 generation. Ten panicles were bagged for each M3 head row and pooled as M4 seeds, which were distributed to end users upon request. Sample preparation, DNA extraction, variant detection, and the functional prediction and characterization of sorghum EMS-induced single-nucleotide polymorphisms have been previously described [49]. For the current study, a mutant collection of 256 lines was planted at the USDA-ARS Plant Stress and Germplasm Development Research Unit, Lubbock, Texas (latitude 33°35′ N, longitude 101°53′ W, and altitude 958 m) in 2017. The soil type is an Amarillo fine sandy loam (fine-loamy, mixed, superactive thermic Aridic Paleustalfs). Before planting, a mixture of bulk ammonium sulfate and monoammonium phosphate was applied to the field, calculated to achieve levels of 65 kg nitrogen and 27 kg phosphorus per hectare. An augmented design with no replicates was used. Each plot consisted of four rows, each 4.67 m long, with 1.02-m row spacing. Sorghum seeds were planted at 80 per row at a depth of 3 cm using a John Deere MaxEmerge Planter. The irrigated plots received 5 mm of water per day, as needed, from underground drip lines located on 1.02-m centers.
Fifty grams of dry seed from each of 206 lines were provided to the Agricultural Experiment Station Chemical Laboratories of the University of Missouri-Columbia for analysis of total protein and amino acid composition. The remaining 50 lines, which had low seed yield, were not included.

Total Protein and Amino Acid Extraction and Analysis
Total protein and amino acid contents were analyzed by the Agricultural Experiment Station Chemical Laboratories of the University of Missouri-Columbia (https://aescl.missouri.edu/index.html, accessed on 1 January 2023) following the standard methods of the Association of Official Analytical Chemists (AOAC). Briefly, crude seed protein content was calculated from the total nitrogen content using the Kjeldahl method [80]. The complete amino acid profile was analyzed via cation-exchange chromatography coupled with post-column ninhydrin derivatization and quantitation according to the AOAC Official Method 982 [81].
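In the Kjeldahl approach, crude protein is obtained by multiplying the measured nitrogen content by a nitrogen-to-protein conversion factor. The sketch below illustrates this arithmetic. The factor of 6.25 is the commonly used general default and is an assumption here, as the text does not state which factor the laboratory applied; the nitrogen value and the function name `crude_protein_percent` are likewise hypothetical and used only for illustration.

```python
# Assumption: 6.25 is the general-purpose nitrogen-to-protein factor.
# The factor actually used by the laboratory is not stated in the text.
DEFAULT_N_TO_PROTEIN_FACTOR = 6.25


def crude_protein_percent(nitrogen_percent: float,
                          factor: float = DEFAULT_N_TO_PROTEIN_FACTOR) -> float:
    """Estimate crude protein (%) from total Kjeldahl nitrogen (%)."""
    if nitrogen_percent < 0:
        raise ValueError("nitrogen content cannot be negative")
    return nitrogen_percent * factor


# Hypothetical nitrogen measurement (NOT a value from this study).
nitrogen = 1.6  # g N per 100 g of dry seed
print(f"Crude protein: {crude_protein_percent(nitrogen):.2f}%")
```

Because the result scales linearly with the chosen factor, protein values from different studies are directly comparable only when the same conversion factor has been used.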

NIR Analysis of the Five Sorghum Mutant Lines
Grain composition of five selected sorghum mutant lines was determined using NIR, as previously described [82]. The vitreosity of the samples was determined via image analysis of the cut sorghum kernels, as described in [83].

Conclusions
The sorghum mutant population showed significant variability in total protein content, along with variability in 19 protein-bound and 4 non-protein amino acids. Mutants with doubled amounts of total protein and essential amino acids, such as histidine, leucine, and lysine, have great potential for improving sorghum seed quality. In addition, we observed variability in kernel weight, kernel size, amylose content, vitreosity, and starch, which could also be used to improve important traits of sorghum. The mutant lines generated in the current study that exhibited interesting variability in seed composition will be freely available for sorghum breeding programs.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/plants12081662/s1, Figure S1: Relationship between amino acid concentration (g/100 g of seeds) and protein concentration (g/100 g of seeds) in seeds of mutant lines; Figure S2: Number of mutant lines with various concentrations of non-protein amino acids, such as taurine, lanthionine, and ornithine.
Author Contributions: Y.J. and Z.X. conceptualized the study; S.R.B., J.C. and Z.X. conducted grain quality analysis; A.K. and N.A.K. conducted data analysis; all authors contributed to the preparation of the manuscript. All authors have read and agreed to the published version of the manuscript.