Article

Multi-Locus GWAS Mapping and Candidate Gene Analysis of Anticancer Peptide Lunasin in Soybean (Glycine max L. Merr.)

1 Department of Biological and Forensic Sciences, Fayetteville State University, Fayetteville, NC 20348, USA
2 Department of Food Science & Human Nutrition, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
3 Department of Biological Sciences, Old Dominion University, Norfolk, VA 23529, USA
4 Soybean and Nitrogen Fixation Research Unit, The United States Department of Agriculture—Agricultural Research Service, Raleigh, NC 27607, USA
* Authors to whom correspondence should be addressed.
Plants 2025, 14(14), 2169; https://doi.org/10.3390/plants14142169
Submission received: 26 March 2025 / Revised: 26 June 2025 / Accepted: 10 July 2025 / Published: 14 July 2025

Abstract

Soybean (Glycine max) peptide lunasin exhibits significant cancer-preventive, antioxidant, and hypocholesterolemic effects. This study aimed to identify quantitative trait nucleotides (QTNs) associated with lunasin content and to annotate the candidate genes in the soybean genome. The mapping panel of 144 accessions was gathered from the USDA Soybean Germplasm Collection, encompassing diverse geographical origins and genetic backgrounds, and was genotyped using SoySNP50K iSelect Beadchips. The lunasin content in soybean seeds was measured using the enzyme-linked immunosorbent assay (ELISA) method, with lipid-adjusted soybean flour prepared from seeds obtained from the Germplasm Resource Information Network (GRIN) of USDA-ARS in 2003 and from North Carolina in 2021, respectively. QTNs significantly related to lunasin content in soybean seeds were detected on 15 chromosomes, with LOD scores greater than 3.0, explaining various phenotypic variations identified using the R package mrMLM (v4.0). Significant QTNs on chromosomes 3, 13, 16, 18, and 20 were consistently identified across multiple models as being significantly associated with soybean lunasin content, based on assessment data from two years. Twenty-nine candidate genes were found, with 12 identified in seeds from 2003 and 17 from 2021. Our study is an important effort to understand the genetic basis and functional genes for lunasin production in soybean seeds.

1. Introduction

Improving seed composition and quality, including protein, oil, fatty acid, and amino acid content, is an important aim of soybean farmers and breeders. Soybean (Glycine max) consumption has been associated with beneficial effects on human health, such as reducing obesity, cardiovascular disease, immune disorders, and certain types of cancers [1,2]. Diets rich in soybean products are associated with lower colon, breast, and prostate cancer mortalities, suggesting that carcinogenesis prevention may be derived from soybean components [3]. The therapeutic function of soybean has also been recognized in soy bioactive peptides, both from soy protein in seeds and peptides derived through gastrointestinal digestion [4]. Lunasin has shown remarkable cancer-preventive, antioxidant, and hypocholesterolemic effects in animal and in vitro trials [5,6,7,8,9,10,11,12,13]. Since it was isolated from soybean seeds in 1981 during the screening of protease inhibitors in Japan, the peptide lunasin has been considered one of the most promising potential anticancer phytochemicals [9]. Besides its anticancer activity, lunasin plays a vital role in the regulation of cholesterol biosynthesis in the body with its inherent antioxidative and anti-inflammatory effects [14]. Lunasin possesses high tissue affinity, specificity, and efficiency in promoting human health, and has been associated with antihypertensive, antiobesity, and anticancer properties [15]. These various biological functions of lunasin have been clearly demonstrated in both in vitro and in vivo assessments [9,10,16]. Lunasin displays the ability to regulate the cell cycle by inducing apoptosis, decreasing the gene expression of cyclin, inhibiting oncogene expression, and reducing the mutation rate caused by carcinogens [8,16,17].
A study suggested that mice receiving lunasin treatment showed significantly reduced pulmonary colonization after the injection of highly metastatic melanoma cells compared to the control group [18]. Lunasin inhibits the growth of murine LLC cells and murine B16-F0 melanoma cells in vitro and in wild-type C57BL/6 mice [19]. Lunasin can help prevent metastasis and patient relapses in melanoma by reducing the invasive potential of cancer initiation cells (CICs), shown both in vitro and in vivo in a mouse experimental metastasis model, and therefore, lunasin has been considered an exceptional anticancer agent for patients that have developed resistance to current chemotherapies [18].
Overall, tremendous progress has been made in understanding the anticancer bioactive functions of lunasin. However, the genetic basis and inheritance of lunasin have not been fully addressed, and, to the best of our knowledge, QTL mapping and candidate genes associated with lunasin production in soybeans have not been defined genetically. More studies are required to elucidate the genetic mechanisms of lunasin production in soybean seeds and to bridge the gap between advanced clinical therapeutic success and the lack of genetic information on lunasin inheritance. de Mejia et al. [20] quantified lunasin concentrations in 144 selected, diverse soybean accessions (Glycine max) from the USDA Soybean Germplasm Collection (Urbana, Illinois), indicating that the lunasin concentrations within soybean accessions vary greatly. Lunasin in these 144 accessions, which include exotic, ancestral, and modern accessions, was quantified in defatted soybean flour using an enzyme-linked immunosorbent assay (ELISA) method [20]. In that study, lunasin concentrations in the flour of soybean lines varied greatly and significantly, ranging from 0.1 to 1.3 g/100 g in exotic lines, 3.6 to 10.1 g/100 g in ancestral lines, and 3.3 to 9.5 g/100 g in modern lines. Notably, the maximum differences in lunasin content exceeded 100% among soybean lines grown under the same environmental conditions. These findings suggest that genetic background plays a major role in determining lunasin accumulation, even when environmental factors are held constant. The average lunasin concentration in 23 major ancestral lines of U.S. cultivars appears to be similar to that of 16 modern cultivars; the ancestral and exotic accessions show the highest value of lunasin concentration. Interestingly, accessions with high concentrations of isoflavone-enriched products contain low or no lunasin.
Moreover, based on pioneering research, the variation in lunasin concentration is correlated with the accession’s geographical origin but is less likely to be related to soybean maturity [20]. Abiotic stress has also been found to induce lunasin accumulation: in salt-treated soybean seeds, lunasin reached its highest level six hours after imbibition [21]. The phenotypic variation in lunasin content among the seeds of diverse soybean lines has positioned lunasin as an ideal candidate trait for the genetic improvement of lunasin concentrations in soybean plants.
The genome-wide association study (GWAS) is a powerful alternative to the traditional biparental mapping approach for mapping the target QTL of complex traits at sequence resolution by accounting for the population structure and historical recombination events in mapping panels as covariates in the mathematical model [22,23,24,25]. It has been widely used to investigate various traits in plants such as maize [26], soybean [23,27], barley [28,29], Arabidopsis thaliana [30], and sorghum [31]. The GWAS has also been widely used to define quantitative trait loci because it directly detects genetic associations between single-nucleotide polymorphism (SNP) markers and traits in the mapping panels of landraces and advanced breeding populations based on linkage disequilibrium (LD) information [23,32]. The mixed linear model (MLM) is commonly implemented for GWAS analysis; it exploits historical recombination events, a greater number of alleles, and broader genetic variation to dissect the genetic merits of complex traits [33,34,35]. However, single-locus models such as the CMLM [35] and the ECMLM [35] are one-dimensional genome scans, which need corrections for multiple tests. The Bonferroni correction is often applied for multiple testing; however, this stringent correction has deficiencies in the identification of QTLs with small effects, particularly in field experiments on crop genetics [36]. Recently, Wang et al. [26] proposed a multi-locus random-SNP-effect mixed linear model (mrMLM) method without Bonferroni correction. Thereafter, many multi-locus GWAS methods have been presented to detect quantitative trait nucleotides (QTNs), including the ISIS EM-BLASSO [37], pLARmEB [38], FASTmrEMMA [39], FASTmrMLM [40], and pKWmEB [41].
In these models, MLM stands for mixed linear model (Q+K model); FAST stands for factored spectrally transformed function; ISIS stands for iterative-modified sure independence screening; EM stands for expectation–maximization; BLASSO stands for Bayesian least absolute shrinkage and selection operator; EMMAX stands for efficient mixed-model association expedited function; and the KW in pKWmEB and the LAR in pLARmEB refer to the Kruskal–Wallis test and the LARS algorithm, respectively. An R package called mrMLM has been developed (https://cran.r-project.org/web/packages/mrMLM/index.html, 1 August 2024), through which these six multi-locus GWAS methods are integrated into R-based software (v4.0) [42,43]. Because multi-locus GWAS models are closer to the true genetic models in plants and animals than existing single-locus GWAS methods, these methods appear to identify QTNs more robustly and with lower false positive rates (FPRs), especially for small-effect quantitative traits [26,36].
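To make the contrast between a Bonferroni threshold and the LOD criterion concrete, the following minimal sketch converts a LOD score into an approximate p-value and compares it with a Bonferroni cutoff for the 42,080 SNPs reported in the Results. It assumes that the likelihood-ratio statistic, 2·ln(10)·LOD, follows a chi-square distribution with one degree of freedom; that assumption, the function names, and the 0.05 family-wise error rate are illustrative and are not taken from the mrMLM package itself.

```python
import math


def lod_to_p(lod: float) -> float:
    """Approximate p-value for a LOD score.

    Assumption: the likelihood-ratio statistic 2*ln(10)*LOD is
    chi-square distributed with 1 degree of freedom.
    """
    lrt = 2.0 * math.log(10.0) * lod
    # Survival function of chi-square(1): P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(lrt / 2.0))


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


n_snps = 42_080                      # SNP count reported in Section 2.2
p_lod3 = lod_to_p(3.0)               # on the order of 2e-4
p_bonf = bonferroni_threshold(0.05, n_snps)

print(f"p-value at LOD = 3.0: {p_lod3:.2e}")
print(f"Bonferroni cutoff   : {p_bonf:.2e}")
print(f"LOD criterion is {p_lod3 / p_bonf:.0f}x less stringent")
```

Under these assumptions the LOD criterion is more than two orders of magnitude less stringent than the Bonferroni cutoff, which is consistent with the argument that a Bonferroni-corrected single-locus scan can miss small-effect loci.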
It is generally believed that cultivated soybean [Glycine max (L.) Merr.] was derived from its wild progenitor, G. soja Sieb. et Zucc., approximately 0.8 million years ago in Eastern Asia [44]. The landraces of soybean were subsequently selected from the adaptation of localized Glycine max after its domestication [45]. The genetic sources of current elite cultivars in the United States were developed from a small number of landrace accessions [46]. Traditional breeding methods for selection have greatly improved soybean production for economically important traits such as yield, protein, and oil content, but they have not been performed on a large scale to develop cultivars with health-benefiting characteristics. The lunasin gene has been cloned from midmaturation soybean seed [named 2S albumin (Gm2S-1)] using a homolog search-based cDNA approach [47], and the bioactivities of lunasin have been assessed in commercial lines in South Korea for lunasin concentrations [48]. Single-nucleotide polymorphisms (SNPs) are the most promising molecular markers in the genome. Because SNPs are the most abundant type of genetic polymorphism and are evolutionarily stable from generation to generation, SNPs represent the ideal form of molecular marker. With the target SNPs identified by multi-locus GWAS, a high-throughput and cost-effective platform will be used to genotype populations and conduct marker-assisted selection (MAS) for genetic studies and trait improvement. The application of MAS in a trait improvement program will be used to shorten breeding cycles and allow a greater genetic gain over time due to more cycles of selection. The genotypic and phenotypic data for the 144 soybean accessions have been collected, and primary data analysis has been conducted. Based on the results, we hypothesize that genetic variants are the key factor in determining lunasin content in soybean seeds. 
The objective of this study is to disclose the genetic bases of the anticancer bioactive compound lunasin in soybean plants using a multi-locus GWAS with two diverse environments. The information derived from this research should provide valuable information for molecular breeding and marker-assisted selection and contribute to soybean genetic and trait improvement for human health benefits.

2. Results

2.1. Trait Distribution and Broad Sense Heritability of Assessed Traits

The lunasin content in the soybean seeds of the GWAS mapping panel varied significantly across different environments. The frequency distribution of the assessed traits in the GWAS mapping panel was evaluated using the Shapiro–Wilk test for normality (Table 1). Only the lunasin content of undefatted flour in 2021 (Lunasin_Pr21) followed a normal distribution. Negative skewness in the distribution of lunasin content in defatted flour (Lunasin_DF03) and a kurtosis value greater than 3 were observed in the mapping population (Table 1; Figure 1). The trait distribution of Lunasin_Pr21 exhibited significant variation, even after the data were transformed to a 1/10 scale, indicating potential biological or experimental factors influencing the trait’s expression. The coefficient of variation (CV) for each assessed trait showed some consistency, except for Lunasin_DF21. The CV values ranged from 43% to 46% for Lunasin_DF03 and Lunasin_Pr21, whereas a CV of 66.82% was observed for lunasin content in lipid-adjusted flour in 2021 (Lunasin_DF21). The mean lunasin content in soybean seeds from these two environments differed significantly, with some extreme outliers identified (Table 1). Estimates of trait correlations showed that the correlogram of lunasin content and protein across different environments varied remarkably (Figure 2). Among the four phenotypes assessed, the correlation coefficient (r) between the two methods of assessing lunasin content was 0.97 (p < 0.001) in 2003. However, no significant correlation (r = 0.15) was identified between undefatted flour and lipid-adjusted soybean flour for the lunasin concentration in soybean seeds. The broad-sense heritability (H²) of seed lunasin content (mg/g of dry, defatted (Lunasin_DF03) and lipid-adjusted (Lunasin_DF21) samples) across the two years was 27%.
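The descriptive statistics mentioned above can be computed as in the following minimal sketch. It uses toy values rather than the data in Table 1, reports kurtosis on the Pearson scale (where a normal distribution has a value of 3, matching the "greater than 3" wording above), and does not implement the Shapiro–Wilk test. The function name and the toy values are hypothetical.

```python
import math
import statistics


def describe(values):
    """Return CV (%), skewness, and Pearson kurtosis for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    # Central moments (population form, n denominator)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "cv_percent": 100.0 * sd / mean,
        "skewness": m3 / m2 ** 1.5,   # negative = longer left tail
        "kurtosis": m4 / m2 ** 2,     # Pearson scale; normal = 3
    }


# Toy values for illustration only, not measurements from this study
toy = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = describe(toy)
print(stats)
```

A CV near 45% with skewness close to zero would describe a wide but roughly symmetric trait, whereas a negative skewness combined with a kurtosis above 3 indicates a heavy left tail of low-lunasin accessions.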
The Panel–Year interactions played a significant role in the molecular formation of lunasin in soybean seeds in these two markedly different environments, as indicated by our two-way ANOVA assessment, where σ²GE was 244.5. Due to the cost constraints of this student-centered project, independent duplicates with technical replicates were applied in Lunasin_DF21, and we calculated the Sum Sq and Mean Sq to determine σ²G and σ²GE for lunasin H² using the type I sum of squares (the anova(model) function in R).
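One common way to derive variance components and an entry-mean broad-sense heritability from two-way ANOVA mean squares is sketched below. This is an assumption for illustration: the exact estimator used for the reported 27% is not shown in this section, and the mean squares, the number of environments, and the number of replicates in the example are hypothetical values, not results from this study.

```python
def variance_components(ms_g, ms_ge, ms_e, n_env, n_rep):
    """Method-of-moments estimates from a balanced genotype x environment ANOVA.

    Expected mean squares assumed here:
      MS_G  = s2_e + n_rep * s2_ge + n_rep * n_env * s2_g
      MS_GE = s2_e + n_rep * s2_ge
      MS_E  = s2_e
    Negative estimates are truncated to zero.
    """
    s2_e = ms_e
    s2_ge = max((ms_ge - ms_e) / n_rep, 0.0)
    s2_g = max((ms_g - ms_ge) / (n_rep * n_env), 0.0)
    return s2_g, s2_ge, s2_e


def broad_sense_h2(s2_g, s2_ge, s2_e, n_env, n_rep):
    """Entry-mean heritability: genetic variance over phenotypic variance of means."""
    phenotypic = s2_g + s2_ge / n_env + s2_e / (n_rep * n_env)
    return s2_g / phenotypic


# Hypothetical mean squares for illustration only
ms_g, ms_ge, ms_e = 120.0, 80.0, 20.0
n_env, n_rep = 2, 2

s2_g, s2_ge, s2_e = variance_components(ms_g, ms_ge, ms_e, n_env, n_rep)
h2 = broad_sense_h2(s2_g, s2_ge, s2_e, n_env, n_rep)
print(f"s2_G = {s2_g}, s2_GE = {s2_ge}, s2_e = {s2_e}, H2 = {h2:.3f}")
```

With this estimator a large genotype-by-environment mean square relative to the genotype mean square pulls the heritability down, which is the pattern described above for lunasin content.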

2.2. Multi-Locus GWAS QTN Mapping

A set of 42,080 SNPs was used in this study, distributed across all 20 soybean chromosomes. A total of 15 chromosomes were detected containing QTNs significantly related to lunasin content in soybean seeds, with LOD scores greater than 3.0, explaining various phenotypic variations identified using the R package mrMLM (v4.0) (Table 2). For significant QTNs, the SNPs were spread out relatively evenly on the chromosomes, and they were most distributed on chromosome 6 and 16 with four SNPs and least distributed on chromosome 4, 5, and 15 with just one SNP. Sixteen significant QTNs were identified in 2003, and 17 QTNs were identified in 2021, shown in the Manhattan plots (Figure 3). Five significant QTNs on chromosomes 3, 13, 16, 18, and 20 were identified by multiple models as being significantly associated with soybean lunasin content based on two years of assessment data. The models ISIS EM-BLASSO, FASTmrMLM, mrMLM, pLARmEB, and FASTmrEMMA detected 13, 12, 12, 12, and 5 significant QTNs, respectively. A total of five chromosomes contained QTNs identified by more than four different models, all with LOD scores greater than 3.0, explaining varying amounts of phenotypic variation over two years of data (Table 2). On chromosome 7, a total of four significant QTNs were detected by different models, mrMLM (three), FASTmrMLM (one), pLARmEB (one), and ISIS EM-BLASSO (two), explaining 3.5% to 11.4% of the phenotypic variation in 2003, with the highest LOD score of 6.6 using the FASTmrMLM model. On chromosome 6, five significant QTNs were identified by the models FASTmrMLM (two), ISIS EM-BLASSO (two), and pLARmEB (two) in 2021, explaining up to 9.2% of the phenotypic variation, with the highest LOD score of 9.5 using the pLARmEB algorithm. Similarly, four significant QTNs were found on chromosome 16 by models ISIS EM-BLASSO (two), pLARmEB (two), FASTmrMLM (one), and mrMLM (one), explaining up to 6.4% of the variation in lunasin content. 
On chromosome 10, three significant QTNs associated with lunasin content were identified using the models mrMLM (two), pLARmEB (one), and ISIS EM-BLASSO (one), explaining the highest amount of phenotypic variation (30.99%) in 2021 by mrMLM. One significant QTN on chromosome 12 was identified by mrMLM, explaining the second highest amount of phenotypic variation (26.15%) in 2021. On chromosome 2, two significant QTNs associated with lunasin content were identified using the models FASTmrMLM (one), FASTmrEMMA (one), pLARmEB (one), and ISIS EM-BLASSO (one), explaining the highest amount of phenotypic variation (26.1%) in 2003 by FASTmrMLM. A total of two significant QTNs were detected on chromosome 9 by different models, mrMLM, pLARmEB, and ISIS EM-BLASSO, explaining the second highest phenotypic variation in 2003 (16.6%), with the highest LOD score of 6.6 using the FASTmrMLM model. One significant QTN associated with lunasin content was identified on chromosome 15 using the models FASTmrMLM, pLARmEB, and ISIS EM-BLASSO, with the highest LOD score of 10.2 using the FASTmrMLM algorithm, explaining a high amount of phenotypic variation (17.7%) in 2021 by FASTmrMLM. Additional significant QTNs were identified on chromosomes 3, 4, 5, 13, 14, 18, and 20 using the five different models (Table 2). The QTN effects for lunasin were relatively diverse in the two environments, as listed in Table 2, with an effect of −47 for SNP marker ss715607293 anchored on chromosome 10 and an effect of 32 for SNP marker ss715612259 located on chromosome 12. These two SNPs explained a very high percentage of phenotypic variation, at 31% for ss715607293 with an LOD of 5.2 and 16.6% for ss715612259 with an LOD of 4.7, respectively.
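The bookkeeping behind statements such as "identified by multiple models" can be sketched as follows: hits are filtered by the LOD criterion and then grouped by marker to count the distinct models that detected each one. The record layout, marker names, and LOD values below are hypothetical and are not rows from Table 2; the function is not part of the mrMLM package.

```python
from collections import defaultdict

LOD_THRESHOLD = 3.0  # QTNs are reported with LOD scores greater than 3.0


def consensus_qtns(hits, min_models=2):
    """Group (marker, model, LOD) hits and keep markers found by several models.

    Returns {marker: (sorted list of models, highest LOD)}.
    """
    models_by_marker = defaultdict(set)
    best_lod = {}
    for marker, model, lod in hits:
        if lod <= LOD_THRESHOLD:
            continue  # below the reporting criterion
        models_by_marker[marker].add(model)
        best_lod[marker] = max(lod, best_lod.get(marker, lod))
    return {
        marker: (sorted(models), best_lod[marker])
        for marker, models in models_by_marker.items()
        if len(models) >= min_models
    }


# Hypothetical hits for illustration only
toy_hits = [
    ("marker_A", "mrMLM", 5.2),
    ("marker_A", "pLARmEB", 4.1),
    ("marker_A", "ISIS EM-BLASSO", 3.6),
    ("marker_B", "FASTmrMLM", 6.6),
    ("marker_C", "mrMLM", 2.4),        # filtered out by the LOD criterion
    ("marker_C", "FASTmrEMMA", 3.3),
]
consensus = consensus_qtns(toy_hits)
print(consensus)
```

Raising min_models trades sensitivity for robustness: a marker supported by a single model with a high LOD (marker_B in the toy data) is excluded from the consensus set even though it passes the LOD criterion.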

2.3. Candidate Genes for Lunasin Content in Soybean Seeds

The candidate genes underlying significant QTNs for lunasin content in soybean seeds were identified using SoyBase JBrowser (soybase.org) Williams 82 (Wm82) genome assembly 6 (https://www.soybase.org/tools/browsers/gbrowse.html?iframe_pathname_suffix=glyma.Wm82.gnm6, 1 August 2024). A total of 28 genes were found near significant QTNs based on the soybean genome browser, using data from two years of field assessments, with 12 candidate genes identified in seeds from 2003 and 15 from 2021. These candidate genes encode pentatricopeptide repeat (PPR) superfamily proteins (nine), transmembrane amino acid transporter proteins (five), ribosomal proteins (three), ATP binding (two), RNA binding, the nudix hydrolase 1 (NUDT1) cluster, and others (Table 3). These annotated candidate genes are located close to the regions of significant SNPs. In many cases, more than one candidate gene was found near these SNPs for the trait analyzed, suggesting that the multi-locus GWAS platform has strong detection power for deciphering the genomic structure underlying lunasin content in soybean plants. The pentatricopeptide repeat (PPR) superfamily proteins around SNPs ss715607293 and ss715613090 are a diverse group of RNA-binding proteins found primarily in plants. They play crucial roles in the regulation of gene expression from the nucleus to organelles such as chloroplasts and mitochondria, with cellular functions that include RNA processing and editing, RNA stability and protection, translation regulation and RNA maturation, and many other functions in peptide synthesis. These SNPs explained more than 30.9% and 26% of the phenotypic variation in 2021. The ABC transporters (ATP-binding cassette transporters) encoded around the significant SNPs ss715581194 and ss715622529 play an important role in transporting various substances across cellular membranes.
Their key functions include facilitating the movement of a wide range of substrates, including ions, small molecules, peptides, and larger macromolecules, across biological membranes and transporting energy substrates against their concentration gradient, which is critical for maintaining cellular homeostasis and nutrient balance. These significant SNPs explained up to 26% of the phenotypic variation. Moreover, ribosomal proteins (r-proteins) are essential components of ribosomes, the cellular machinery responsible for protein synthesis. Their key functions include contributing to the overall structure and stability of ribosomes and maintaining the integrity of ribosomal architecture. These proteins facilitate cellular function by assisting in the binding of messenger RNA (mRNA) and transfer RNA (tRNA) to the ribosome, and this step ensures the correct assembly of amino acids into peptides following the requirements of the genetic code. These significant SNPs explained various amounts of phenotypic variation within the two years of data (Table 2 and Table 3). NUDT1, also known as nudix hydrolase 1 or MutT Homolog 1 (MTH1), has received significant attention in human cancer research for its enzymatic activity; its role in soybean as a gene cluster with high phenotypic variation is also intriguing. A total of 11 nudix hydrolase 1 genes were identified around a significant SNP (ss715581194) in Glycine max Wm82 genome assembly version 6 (glyma.Wm82.gnm6), and this SNP explained 26.1% of the phenotypic variation. Within these 11 genes, Glyma.02G130702, Glyma.02G131152, Glyma.02G131102, Glyma.02G131052, Glyma.02G131002, Glyma.02G130950, Glyma.02G130902, and Glyma.02G130852 contain 209 nucleotides, while Glyma.02G130820, Glyma.02G130752, and Glyma.02G130702 contain 354, 375, and 375 nucleotides. These genes were located on the minus strand of the soybean genome on chromosome 2.
Candidate genes were annotated using SoyBase JBrowser (soybase.org) Williams 82 (Wm82) genome assembly 6 (https://www.soybase.org/tools/browsers/gbrowse.html?iframe_pathname_suffix=glyma.Wm82.gnm6, 1 August 2024).
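The idea of collecting annotated genes "near" a significant SNP can be expressed as an interval query, sketched below. The flanking window size, the gene records, and the coordinates are hypothetical: the text does not state the window used, and the lookup in this study was performed through the SoyBase genome browser rather than with code like this.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    gene_id: str
    chrom: int
    start: int  # bp, inclusive
    end: int    # bp, inclusive


def genes_near_snp(genes, chrom, pos, window_bp):
    """Return IDs of genes overlapping [pos - window_bp, pos + window_bp]."""
    lo, hi = pos - window_bp, pos + window_bp
    return [
        g.gene_id
        for g in genes
        if g.chrom == chrom and g.start <= hi and g.end >= lo
    ]


# Hypothetical annotation records for illustration only
toy_genes = [
    Gene("gene_1", 2, 1_000, 2_000),
    Gene("gene_2", 2, 60_000, 61_000),
    Gene("gene_3", 10, 1_500, 2_500),
]

# Assumed flanking window of 10 kb around a SNP at position 5,000 on chromosome 2
nearby = genes_near_snp(toy_genes, chrom=2, pos=5_000, window_bp=10_000)
print(nearby)
```

In practice the window is often chosen from the extent of linkage disequilibrium around each marker, so a wider window returns more candidates at the cost of including genes that are less likely to be causal.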

3. Discussion

Lunasin has shown remarkable cancer-preventive, antioxidant, and hypocholesterolemic effects in animal and in vitro trials [5,6,7,8,9,10,11,12]. Overall, tremendous progress has been made in understanding the anticancer bioactive functions of lunasin. However, the genetic basis and inheritance of lunasin have not been fully addressed, and QTL mapping and candidate genes that are associated with lunasin production in soybean have not been defined genetically to the best of our knowledge. More studies are required to elucidate the genetic mechanisms of production for lunasin in soybean seeds to seal the gap between advanced clinical therapeutic success and the lack of genetic information on lunasin inheritance. de Mejia et al. [20] quantified lunasin concentrations in 144 diverse soybean accessions (Glycine max) from the USDA Soybean Germplasm Collection (GRIN). Their findings revealed substantial variability in lunasin concentrations among soybean accessions, ranging from 0.1 to 1.3 g per 100 g of flour. The observed differences in lunasin content exceeded 100%, even when soybean lines were cultivated under identical environmental conditions. Similarly, Jeong et al. [8] identified significant phenotypic diversity in lunasin content in Korean soybean accessions, with concentrations ranging from 4.40 to 70.49 mg of lunasin per gram of protein. This pronounced phenotypic variation in lunasin content among soybean seeds highlights its potential as an ideal candidate trait for genome-wide association studies (GWASs) and genetic improvement efforts aimed at enhancing lunasin concentration in soybean plants. Based on these findings, we hypothesize that genetic variants are the primary determinants of lunasin content in soybean seeds. 
To validate the hypothesis, a total of 251 soybean lines were requested from the GRIN in 2021 and planted in different fields at the Central Crop Research Station of North Carolina State University (Clayton, NC, USA) over the past four years (2021–2024). The lunasin content in seeds harvested in 2021 and 144 lines phenotyped in 2003 were assessed exclusively using the enzyme-linked immunosorbent assay (ELISA) method, primarily due to cost-effectiveness considerations (mainly anti-rabbit polyclonal antibody). The ELISA was selected for its balance of affordability, sensitivity, and accuracy, making it an ideal choice for high-throughput analysis under budgetary constraints. These findings aim to contribute to the understanding of lunasin biosynthesis and its genetic regulation, providing a foundation for future breeding programs to develop soybean varieties with enhanced functional properties.
The multi-locus mrMLM platform aims to select potentially associated markers rather than to identify significant loci or intervals, employing multiple methods to increase the probability of identifying potentially significant loci. The multi-locus genome-wide association study (GWAS) models embedded in the R package mrMLM v4.0.2 have been implemented in numerous applications and have demonstrated greater detection power and improved accuracy in estimating QTN effects compared to previous single-locus GWAS methods [42,49,50]. Rapid multi-locus genome scanning that considers all marker effects while controlling for the genetic background makes this platform an effective choice for GWAS mapping. Multi-locus GWAS algorithms depend on the random-SNP-effect model with two stages: first selecting a reduced number of molecular markers using different algorithms and then deciphering the true associations with multi-locus models [36]. In the present study, several significant QTNs, including numerous QTNs with small effects, were identified as being associated with lunasin content in soybean seeds, providing valuable insights into the genetic factors influencing this trait (Table 2). These significant QTNs should be used to breed favorable alleles into elite soybean lines via marker-assisted selection (MAS). The genomic regions underlying significant QTNs that explain a substantial portion of the phenotypic variation are likely to contain candidate genes responsible for lunasin production in soybean plants. Notably, these loci are located on chromosomes 2, 7, 10, 12, and 15, which harbor the genes associated with the target trait. In addition, the identification of small-effect QTNs linked to lunasin content highlights the complexity and dynamic nature of metabolic networks in soybean plants, suggesting that both major and minor loci contribute to the regulation and biosynthesis of this important compound.
These findings provide a comprehensive framework for understanding the genetic architecture of lunasin production and offer potential targets for future functional studies and breeding programs. Among the thirty-three QTNs detected within two years of data collection in the present study, nine QTNs explained more than 10% of the phenotypic variation.
Lunasin was identified as a 43-amino acid peptide. Its cDNA, an 828 bp transcript (AF005030) in the NCBI (https://www.ncbi.nlm.nih.gov/gene/?term=AF005030, 1 August 2024), was cloned from mid-maturation soybean seed and encodes a 2S albumin (Gm2S-1) according to a previous study [51]. The gene is anchored on soybean chromosome 13 (Glyma.13G154100) and is annotated as a bifunctional inhibitor/lipid-transfer protein/seed storage 2S albumin superfamily protein (soybase.org). However, none of the significant QTNs identified in our study were located near this locus. This suggests that the genetic variations associated with the trait may be regulated by different genomic regions or highly impacted by environmental factors, indicating complex genetic architecture that warrants further investigation. Expression of the lunasin gene in mammalian cells resulted in mitotic arrest, ultimately leading to cell death, as demonstrated by Galvez and de Lumen [5], due to its impact on the ability of the kinetochore complex to attach to centromeres during cell mitosis. However, the genetic basis underlying lunasin production is still unknown. Based on the multi-locus GWAS in our research, a total of 28 candidate genes near or adjacent to the primary significant QTNs for lunasin content in soybean seeds were identified using SoyBase JBrowser (soybase.org) Williams 82 (Wm82) genome assembly 6 (https://www.soybase.org/tools/browsers/gbrowse.html?iframe_pathname_suffix=glyma.Wm82.gnm6, 1 August 2024). From the genetic analysis of two years of field assessments, 12 candidate genes were identified in the seeds from 2003 and 15 from 2021. The SNPs behind these candidate genes explained more than 26% and 30.9% of the phenotypic variation in 2003 and 2021, respectively.
The genes around the significant SNPs have important functions, which include facilitating the movement of a wide range of substrates, molecule and energy transport, serving as ribosomal protein (r-protein) components of ribosomes, and forming part of the cellular machinery responsible for RNA and protein synthesis. One of the significant SNPs, on chromosome 2, explained 26% of the phenotypic variation in the 2003 data and points to a new candidate gene, NUDT1 (nudix hydrolase 1). The NUDT1 gene has been extensively studied for its enzymatic activity and association with cancer in humans. While it has garnered significant attention in cancer biology, its role in soybean is equally intriguing. In soybean, NUDT1 functions as part of a gene cluster characterized by high phenotypic variation, which may have implications for plant development, stress responses, and agricultural productivity. The gene family of nucleoside diphosphate-linked moiety X (Nudix) hydrolases comprises a large group of genes in living organisms; these enzymes degrade nucleoside diphosphate-X (NDP-X) to nucleoside monophosphate (NMP) and phosphate-X (P-X) and regulate plant immunity in Arabidopsis to initiate pathogen response [52,53]. Plant Nudix hydrolases play very important roles in the plant–pathogen interaction, functioning as important players in the defense mechanisms plants employ against pathogens. The enzymes encoded by this gene family hydrolyze a variety of substrates, including nucleotides and secondary metabolites, are involved in regulating oxidative stress and modulating immune responses, and help plants maintain cellular integrity under pathogen attack, making them crucial components of the plant’s immune system [54]. A total of 11 NUDT1 genes were identified around the significant SNP (ss715581194) in Glycine max Wm82 genome assembly version 6 (glyma.Wm82.gnm6, 1 August 2024), and this SNP explained 26.1% of the phenotypic variation.
These 11 genes contain 209 nucleotides, while three additional family members contain 354, 375, and 375 nucleotides; all of these genes are located on the minus strand of chromosome 2 in the soybean genome. The expression of these candidate genes in soybean plants is still under investigation.
Heritability is an essential concept in quantitative genetics and a key element in the improvement of genetic attributes through selection. It estimates the proportion of phenotypic variation attributable to genetic factors, providing insight into the selection response and the potential for trait improvement [55]. In plants, the genotype-by-environment interaction plays a significant role in crop production. Hassani et al. [56] studied the effects of location, year, and genotype on the root yield and white sugar yield of sugar beet, suggesting that two-way and three-way interactions among these factors affect the stability of both yields and impact key traits such as sugar, nitrogen, sodium, and potassium contents among 20 sugar beet genotypes. Using various analytic approaches, their study also identified negative correlations between these key traits and root yield, and found that optimal trait performance and multiple-trait stability were genotype-dependent. The broad-sense heritability of lunasin content in soybean seeds has been estimated at 27%, underscoring the substantial role of genotype–environment interactions in trait expression. This value suggests that lunasin gene expression is strongly influenced by environmental factors, with diverse growing conditions significantly impacting its phenotypic manifestation. High lunasin content is not consistent across environments for all soybean lines; while some accessions exhibit elevated content, others display lower levels, highlighting variability influenced by environmental factors. In the seeds harvested in Clayton, North Carolina, a total of 17 significant QTNs for lunasin content were identified by various models, while 15 significant QTNs for lunasin content were found in the GRIN seeds from 2003. 
Some of these QTNs, such as those on chromosomes 13, 16, and 20, were located in adjacent genomic intervals on the chromosomes; however, none of these QTNs shared the same genomic position, even though they were identified by the same models (Table 2).
The dissection methodology employed in the multi-locus GWAS software [42] differs fundamentally from that of traditional single-locus GWASs. In multi-locus GWASs, the analytical framework accounts for the joint effects of multiple loci and often correlates more directly with the specific nucleotide positions where SNPs are anchored. As a result, although the identified QTNs may not be identical across environments or models, they can still exhibit genomic collinearity and collectively contribute to lunasin production through linked or interacting genetic regions. In the mapping panel, we observed several soybean accessions that consistently expressed either high or low levels of lunasin across the two environments. This stable expression pattern suggests that these accessions may possess genetically regulated mechanisms influencing lunasin biosynthesis, making them valuable candidates for further genetic analysis and breeding programs. It also reinforces the potential functional relevance of the collinear QTNs. While these QTNs may not be universally conserved across models, their co-localization on the genome and consistent phenotypic effects suggest that they can be effectively utilized in marker-assisted selection (MAS). However, these markers need careful handling, possibly by integrating them into multi-locus selection strategies rather than treating them as independent loci. The interaction between genetic and environmental factors complicates breeding efforts aimed at stabilizing lunasin production, as gene expression may fluctuate with ambient conditions such as temperature, rainfall, and soil quality. Studies have demonstrated that genotype–environment interactions can either enhance or suppress the expression of key traits, depending on the environmental context [57]. 
In the case of soybean, these interactions highlight the necessity of multi-environment trials (METs) to evaluate genotypes across varied climatic zones and to identify stable, high-performing genotypes for lunasin content. While specific studies on lunasin content in soybeans are limited, recent research has demonstrated the effectiveness of METs and genomic tools in evaluating and improving complex traits in soybeans. For instance, Zhang et al. [58] identified 22 loci associated with seed weight through a genome-wide association study (GWAS) and estimated the prediction accuracies of genomic selection (GS) and marker-assisted selection (MAS) for this trait. Furthermore, quantitative trait locus (QTL) mapping and transcriptomic analyses have enabled researchers to understand how environmental factors modulate gene expression in soybeans. For example, research has shown that abiotic stressors, such as drought and temperature extremes, significantly affect biosynthesis pathways. Li et al. [59] reported that the physiological indicators superoxide dismutase (SOD) and peroxidase (POD) increased significantly under high-temperature treatment, to varying degrees depending on the accession. Additionally, the plant hormone abscisic acid (ABA) was elevated, while gibberellin (GA) levels decreased by 2.2-fold in the cotyledon and 1.3-fold in the root. These changes in physiological indicators suggest a complex interaction between environmental factors and gene expression in soybeans, especially regarding the biosynthesis of health-promoting peptides such as lunasin. This highlights the intricate relationship between stress responses and the regulation of key biochemical pathways in the plant. 
This knowledge underscores the importance of integrating molecular techniques with traditional breeding approaches to enhance trait predictability and stability under varying environmental conditions. The relatively low heritability of lunasin content in soybean seeds, driven by strong genotype–environment interactions, presents a challenge for breeding programs. However, integrating advanced genomic tools and leveraging METs can help breeders identify and develop genotypes that exhibit both high lunasin content and environmental stability, ultimately supporting the production of functional foods with consistent quality.

4. Materials and Methods

4.1. GWAS Mapping Panel

A total of 251 accessions for multi-locus GWAS analysis, including exotic accessions, major ancestral lines of U.S. cultivars, and recently released U.S. cultivars, were obtained in 2003 and 2021 from the U.S. Department of Agriculture Soybean Germplasm Collection through the Germplasm Resource Information Network (GRIN) of USDA-ARS at the University of Illinois Urbana-Champaign, Urbana, IL, USA. The panel covered all maturity groups and represented diverse genetic origins [20]. These soybean lines were planted, cultivated, and harvested again at the Central Crops Research Station of North Carolina State University for four consecutive years (2021–2024). The station sits on the western edge of Johnston County (35.66974° N and 78.4926° W) near the city of Clayton, NC. The average precipitation is around 140.5 mm, and the mean daily temperature is 24.5 °C from June to September (National Weather Service, weather.gov). All lines of the GWAS panel were planted in a randomized complete block design (RCBD) with three replications in each environment. A 3 m plot with a row spacing of 65 cm was used, and seeds were planted using a manual push planter at a depth of 2 to 4 cm, with 5 seeds per 30 cm. Standard USDA field and crop management regimes for weed control and fertilizer application were applied across all the replicates. Seeds were sown in May or June and harvested in September, October, or November, depending on maturity. 
After harvesting, seeds were stored in brown double-layer paper bags at room temperature inside the seed storage room at Fayetteville State University (FSU) and threshed the following February.

4.2. SNP Genotyping and Phenotyping

Young leaves of soybean plants were collected in 1.5 mL microcentrifuge tubes (Fisher Scientific, Pittsburgh, PA, USA) and stored at −80 °C. DNA was extracted using a DNA extraction kit (Qiagen DNeasy Plant, Germantown, MD, USA) in combination with RNase A treatment (Roche Molecular Biochemicals, Mannheim, Germany). A PicoGreen DNA quantitation kit (Invitrogen, Eugene, OR, USA) and a BioTek microplate reader (Seattle, WA, USA) with the lamp filter F485 and the emission filter F528 were used to quantify DNA. DNA samples selected for the experiment were normalized to no less than 200 ng in a 4 µL volume (at least 50 ng/µL), the quantity and concentration required for chip-based genotyping. Because a total of 19,700 accessions, including the wild G. soja and G. max, had already been genotyped with SoySNP50K high-density SNP arrays (illumina.com) and the data stored for the USDA Soybean Germplasm Collection [60,61], part of the SNP genotypic data for the mapping panel was retrieved directly from SoyBase (http://soybase.org/) [62]. Lines without existing SNP genotype data were sent for genotyping with the same SNP arrays (Molbreeding, Irvine, CA, USA). Genotypes were called with Illumina’s BeadStudio software (v3.2.23; Illumina, San Diego, CA, USA) following the company’s standard protocol. The quality of each SNP was visually inspected using Excel’s sorting function. Non-polymorphic and low-frequency SNPs (minor allele frequency < 5%) were discarded from the dataset manually.
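The SNP quality filter can be illustrated with a short Python sketch. It is not the authors' procedure (they report filtering manually in Excel); it only shows the two criteria stated in this article, a minor allele frequency of at least 5% and, from Section 4.3, a missing data ratio below 0.1. The 0/1/2 allele-count coding, the function names, and the toy SNP names are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Assumed coding: 0, 1, 2 = copies of the alternate allele; None = missing call.
Genotypes = List[Optional[int]]


def missing_rate(calls: Genotypes) -> float:
    """Fraction of accessions without a genotype call at this SNP."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls: Genotypes) -> float:
    """Minor allele frequency computed from the non-missing calls only."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(matrix: Dict[str, Genotypes],
                min_maf: float = 0.05,
                max_missing: float = 0.10) -> List[str]:
    """Keep SNP IDs with MAF >= min_maf and missing rate < max_missing."""
    return [snp for snp, calls in matrix.items()
            if minor_allele_frequency(calls) >= min_maf
            and missing_rate(calls) < max_missing]


# Toy data with hypothetical SNP names, ten accessions each.
toy = {
    "snp_common": [0, 0, 2, 2, 0, 2, 0, 2, 0, 2],        # MAF = 0.5, no missing calls
    "snp_monomorphic": [0] * 10,                          # MAF = 0
    "snp_sparse": [0, 2, None, None, 0, 2, 0, 2, 0, 2],   # 20% missing calls
}
kept = filter_snps(toy)
print(kept)
```

Only `snp_common` passes both criteria in this toy set; the monomorphic SNP fails the MAF threshold and the sparse SNP fails the missing-data threshold.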
The lunasin content in soybean seeds was measured in only 144 accessions using an enzyme-linked immunosorbent assay (ELISA) method [63,64], with defatted flour (Lunasin_DF03) and undefatted flour (Lunasin_Pr03) requested from the Germplasm Resource Information Network (2003) of the USDA-ARS, and lipid-adjusted soybean flour (Lunasin_DF21) and undefatted flour (Lunasin_Pr21) from crops planted in North Carolina (2021), respectively. For soybean defatting, soybean flour was first passed through a fine sieve (0.5 mm) to ensure uniform particle size. The sieved flour was then mixed with hexamethylenetetramine (Millipore Sigma, Burlington, MA, USA; H11300) at a 1:10 ratio (weight:volume). The mixture was continuously stirred at room temperature for 2 h at 60 °C to facilitate thorough interaction. Afterward, it was filtered to remove excess solvent and left to air-dry overnight under a fume hood to ensure complete solvent evaporation and safe handling of the defatted material. This process yielded defatted soybean flour suitable for downstream applications. Lunasin was quantified by ELISA [63] and identified by Western blot [64], and several parameters of the original protocols were modified and tested. The primary lunasin antibody was tested at dilutions of 1:50, 1:100, 1:200, and 1:400, and the G. soja protein extracts at dilutions of 1:100, 1:500, 1:1000, 1:5000, and 1:10,000. We selected a 1:200 dilution for the primary lunasin antibody and a 1:10,000 dilution for the protein extracts, as these were the optimum parameters for the reaction between the antibody and the antigen. Purified lunasin from G. max and commercial synthetic lunasin were both tested for building the standard curve (0–1000 ng/mL); purified lunasin, rather than synthetic lunasin, was used (R² = 0.90–0.95). 
Quantification of lunasin in extracted soluble protein was performed according to the method described by Cavazos et al. [63], with some modifications to optimize the assay. Briefly, 100 μL of protein extract (diluted 1:10,000) was loaded per well and left to incubate overnight at 4 °C. The wells were washed three times with phosphate-buffered saline (PBS, 0.01 M, pH 7.4) before being blocked with 5% bovine serum albumin (BSA, 300 μL per well) for 1 h at 4 °C. The washing process was repeated before incubation with the primary lunasin antibody (rabbit polyclonal) diluted 1:200 in 3% BSA (50 μL per well) overnight at 4 °C. The wells were washed again, followed by incubation with secondary anti-rabbit IgG conjugated to alkaline phosphatase (1:2000 dilution, 50 μL per well). The washing process was repeated and 100 μL of p-nitrophenyl phosphate (pNPP) was added to each well. The plate was read at 410 nm after 20 min of incubation at room temperature, followed by the addition of 100 μL of NaOH (3 N) to each well to stop the reaction. A reading was taken again at 410 nm after 5 min. Purified lunasin was used to build the standard curve (0–1000 ng/mL). Identification of lunasin by Western blot was performed according to the protocol described in Kusumah et al. [64] with some modifications. Briefly, SDS-PAGE was run to separate proteins in the extract (40 μg in 10 μL solution per well) according to molecular weight before transferring them onto a PVDF membrane. The membrane was then blocked with 5% (w/v) non-fat milk for 2.5 h at 4 °C, washed six times with Tris-buffered saline containing 0.1% Tween 20 (TBST), and incubated with primary lunasin antibody (rabbit polyclonal; 1:200 dilution) overnight at 4 °C. The membrane was washed again six times and incubated with secondary anti-rabbit IgG-horseradish peroxidase (1:2000 dilution). 
The lunasin band was visualized by incubation for 10 min with chemiluminescent reagent, and images were taken using ImageQuant 800 and analyzed by Image.
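As an illustration of how an ELISA standard curve of this kind converts absorbance into concentration, the following Python sketch fits a straight line to a set of standards and back-calculates the lunasin concentration of a sample, correcting for the 1:10,000 extract dilution. The linear curve model, the function names, and all absorbance values are assumptions made for illustration; the article states only the standard range (0–1000 ng/mL) and the R² of the curve, not its functional form.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def lunasin_in_extract(absorbance: float, slope: float, intercept: float,
                       dilution_factor: float = 10_000) -> float:
    """Lunasin concentration (ng/mL) in the undiluted extract."""
    in_well = (absorbance - intercept) / slope   # invert the standard curve
    return in_well * dilution_factor             # undo the sample dilution


# Hypothetical standards: concentration (ng/mL) versus absorbance at 410 nm.
standards_ng_ml = [0.0, 250.0, 500.0, 750.0, 1000.0]
absorbance_410 = [0.05, 0.30, 0.55, 0.80, 1.05]

slope, intercept, r_squared = fit_line(standards_ng_ml, absorbance_410)
sample_conc = lunasin_in_extract(0.45, slope, intercept)
print(f"slope={slope:.4f}, intercept={intercept:.3f}, R2={r_squared:.3f}")
print(f"sample: {sample_conc:.0f} ng/mL in the undiluted extract")
```

With these made-up standards, a sample absorbance of 0.45 corresponds to 400 ng/mL in the well, or about 4 mg/mL in the undiluted extract. A four-parameter logistic curve is a common alternative when the response is not linear over the full range.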

4.3. Multi-Locus GWAS QTN Mapping and Statistical Analysis

The SNP data of 144 soybean accessions assessed by the SoySNP50K SNP arrays were prepared. SNPs from the raw genotype data were retained for association analyses if they had a minor allele frequency (MAF) ≥ 5% and a missing data ratio < 0.1. The population structure of the dataset was analyzed using principal component analysis (PCA) and the program fastSTRUCTURE (http://rajanil.github.io/fastStructure/, 1 August 2024). The mrMLM (multi-locus random-SNP-effect mixed linear model) package was downloaded from http://cran.r-project.org/web/packages/mrMLM/index.html (1 August 2024) following the description in Zhang et al. [42]. Six multi-locus GWAS methods, namely mrMLM [26], FASTmrMLM [40], FASTmrEMMA [39], ISIS_EM-BLASSO [37], pLARmEB [38], and pKWmEB [41], were used to conduct a multi-locus GWAS on the 144 diverse accessions with high-quality SNPs. An LOD threshold of ≥ 3 was used to define the final set of significant SNPs. To obtain reliable candidate genes for lunasin content, only QTNs with an LOD score ≥ 3 and r² ≥ 4% were searched against the soybean genome in SoyBase (https://soybase.org). In addition, the sequences 100 kb upstream and downstream of each defined QTN were investigated as an empirically probable interval for that QTN. Significant QTNs and associated chromosomal intervals that were repeatedly detected in both environments and by at least two methods were defined as dependable QTNs or QTN clusters, respectively. R (www.r-project.org) and its integrated development environment RStudio (posit.co) were used in the multi-locus GWAS, with statistical analyses, including those of agronomic traits, two-way ANOVA, and broad-sense heritability, carried out using its native packages. The trait distributions and other parametric characteristics, including means, variances, and standard deviations, are displayed in violin plots produced with R code using the ggplot2 package (https://r-charts.com/distribution/violin-plot-group-ggplot2/, 1 August 2024). 
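The candidate-gene search window and the dependable-QTN rule described in this section can be sketched in Python as follows. This is a simplified illustration, not the mrMLM workflow itself: it treats a QTN as dependable when the same SNP passes the LOD threshold in at least two methods and in both environments overall, and it does not merge neighbouring SNPs into clusters. The `Hit` record, the function names, and all SNP names, positions, and LOD scores are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple


class Hit(NamedTuple):
    """One SNP-trait association reported by one method in one environment."""
    snp: str
    chrom: int
    pos: int      # physical position in bp
    lod: float
    method: str
    env: str


def candidate_interval(pos: int, flank: int = 100_000) -> Tuple[int, int]:
    """Window searched for candidate genes: `flank` bp on each side of the QTN."""
    return max(1, pos - flank), pos + flank


def dependable_qtns(hits: List[Hit], min_lod: float = 3.0,
                    min_methods: int = 2, min_envs: int = 2) -> List[str]:
    """SNPs passing the LOD threshold in >= min_methods methods and >= min_envs environments."""
    methods: Dict[str, Set[str]] = defaultdict(set)
    envs: Dict[str, Set[str]] = defaultdict(set)
    for h in hits:
        if h.lod >= min_lod:
            methods[h.snp].add(h.method)
            envs[h.snp].add(h.env)
    return sorted(s for s in methods
                  if len(methods[s]) >= min_methods and len(envs[s]) >= min_envs)


# Toy hits with hypothetical SNP names, positions, and LOD scores.
hits = [
    Hit("snp_A", 13, 2_500_000, 4.2, "mrMLM", "2003"),
    Hit("snp_A", 13, 2_500_000, 3.6, "pLARmEB", "2021"),
    Hit("snp_B", 16, 40_000, 5.1, "mrMLM", "2021"),        # one method, one environment
    Hit("snp_C", 20, 900_000, 2.4, "FASTmrMLM", "2003"),   # below the LOD threshold
    Hit("snp_C", 20, 900_000, 3.3, "pKWmEB", "2021"),
]
dependable = dependable_qtns(hits)
windows = {h.snp: candidate_interval(h.pos) for h in hits}
print(dependable)
print(windows["snp_A"], windows["snp_B"])
```

In this toy set only `snp_A` qualifies. The window for `snp_B` is clipped at position 1 because the QTN lies within 100 kb of the chromosome start.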
The lunasin content for 2021 was converted to percentages for display in the violin plot. The significance level of the assessed traits was generated using the R package car (type II Wald chi-square tests) (www.r-project.org). Broad-sense (mean-based) heritability was calculated from the two-way ANOVA table using the equation $H^2 = \sigma_G^2 / [\sigma_G^2 + (\sigma_{GE}^2 / e) + (\sigma_e^2 / re)]$, where $\sigma_G^2$ (genotypic variance), $\sigma_{GE}^2$ (genotype–environment interaction variance), and $\sigma_e^2$ (residual error variance) are scaled by $e$ (number of environments) and $r$ (number of replicates) [65]. Boxplots were generated to visualize the phenotype distribution among constructed haplotypes using an R script (r-project.org). Three biological replicates were included, and statistical significance was analyzed by ANOVA and displayed in an ANOVA table.
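The mean-based heritability equation above can be evaluated directly once the variance components are available. The Python sketch below assumes the components have already been estimated (for example, from the mean squares of the two-way ANOVA); the function name and the numeric values are hypothetical and are not the estimates from this study.

```python
def broad_sense_heritability(var_g: float, var_ge: float, var_e: float,
                             n_env: int, n_rep: int) -> float:
    """Mean-based H^2 = var_g / (var_g + var_ge / e + var_e / (r * e))."""
    if n_env < 1 or n_rep < 1:
        raise ValueError("number of environments and replicates must be >= 1")
    denominator = var_g + var_ge / n_env + var_e / (n_rep * n_env)
    if denominator <= 0:
        raise ValueError("phenotypic variance of genotype means must be positive")
    return var_g / denominator


# Hypothetical variance components: two environments, three replicates each.
h2 = broad_sense_heritability(var_g=2.0, var_ge=6.0, var_e=12.0, n_env=2, n_rep=3)
print(f"H2 = {h2:.3f}")
```

Here the denominator is 2 + 6/2 + 12/(3 × 2) = 7, so H² = 2/7 ≈ 0.286. The example shows how a large genotype–environment component lowers heritability even when replication reduces the residual term.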

5. Conclusions

This study provides valuable insights into the genetic basis of the soybean seed content of lunasin, a peptide with notable cancer-preventive, antioxidant, and hypocholesterolemic properties. By analyzing a diverse panel of 144 soybean accessions, we identified 32 significant quantitative trait nucleotides (QTNs) across multiple chromosomes, with QTNs on chromosomes 3, 13, 16, 18, and 20 consistently identified across multiple models. Candidate genes associated with these QTNs were annotated, revealing potential functional pathways involved in lunasin biosynthesis. These findings contribute to a deeper understanding of the genetic factors influencing lunasin production, which could inform future breeding strategies for enhanced health benefits in soybeans.

Author Contributions

Conceptualization, J.Y., E.d.M., R.M. and M.A.K.; methodology, J.Y., R.L., J.K., E.d.M., L.R., S.V., F.L., F.K., N.K., D.J. and S.J.; writing—original draft preparation, J.Y., R.L., E.d.M., R.M. and M.A.K.; writing—review and editing, J.Y., E.d.M., R.M., M.A.K., L.R., S.V., F.K., D.J. and J.Y. All authors have read and agreed to the published version of the manuscript.

Funding

NSF-HBCU-UP-RIA (#2101138).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article.

Acknowledgments

University of Illinois at Urbana-Champaign, and U.S. Department of Agriculture, Agricultural Research Service (USDA-ARS). Ben Fallen and Earl Taliercio from the USDA-ARS contributed their help for this project. More than eight other undergraduate research scholars and high school interns actively participated in this research, contributing to both fieldwork and laboratory activities. Their dedication and hard work were invaluable throughout the study, and their efforts are greatly appreciated. The collaboration with these students not only enhanced the quality of the research but also provided them with hands-on experience in scientific investigation, fostering their growth and skill development. Their involvement was essential to the success of this project, and we are grateful for their contributions to the advancement of our research. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture. The findings and conclusions in this publication are those of the authors and should not be construed to represent any official USDA or U.S. Government determination or policy. The USDA is an equal opportunity provider and employer.

Conflicts of Interest

All authors declare no conflicts of interest.

References

  1. Velasquez, M.T.; Bhathena, S.J. Role of dietary soy protein in obesity. Int. J. Med. Sci. 2007, 4, 72–82.
  2. He, F.J.; Chen, J.Q. Consumption of soybean, soy foods, soy isoflavones and breast cancer incidence: Differences between Chinese women and women in Western countries and possible mechanisms. Food Sci. Hum. Wellness 2013, 2, 146–161.
  3. Messina, J.; Persky, V.; Setchell, D.; Barnes, S. Soy intake and cancer risk: A review of the in vitro and in vivo data. Nutr. Cancer 1994, 21, 113–131.
  4. Chatterjee, C.; Gleddie, S.; Xiao, C.W. Soybean bioactive peptides and their functional properties. Nutrients 2018, 10, 1211.
  5. Galvez, A.F.; de Lumen, B.O. A soybean cDNA encoding a chromatin-binding peptide inhibits mitosis of mammalian cells. Nat. Biotechnol. 1999, 17, 495–500.
  6. Galvez, A.F.; Chen, N.; Macasieb, J.; de Lumen, B.O. Chemopreventive property of a soybean peptide (lunasin) that binds to deacetylated histones and inhibits acetylation. Cancer Res. 2001, 61, 7473–7478.
  7. Lam, Y.; Galvez, A.; de Lumen, B.O. Lunasin suppresses E1A-mediated transformation of mammalian cells but does not inhibit growth of immortalized and established cancer cell lines. Nutr. Cancer 2003, 47, 88–94.
  8. Jeong, H.J.; Jeong, J.B.; Kim, D.S.; de Lumen, B.O. Inhibition of core histone acetylation by the cancer preventive peptide Lunasin. J. Agric. Food Chem. 2007, 55, 632–637.
  9. Liu, J.; Jia, S.H.; Kirberger, M.; Chen, N. Lunasin as a promising health-beneficial peptide. Eur. Rev. Med. Pharmacol. Sci. 2014, 18, 2070–2075.
  10. Wan, X.; Hong, L.; Yong, S.; Zhang, J.; Chen, X.; Chen, N. Lunasin: A promising polypeptide for the prevention and treatment of cancer (Review). Oncol. Lett. 2017, 13, 3997–4001.
  11. Ďúranová, H.; Fialková, V.; Bilčíková, J.; Lukáč, N.; Kňažická, Z. Lunasin and its versatile health-promoting actions. J. Microbiol. Biotechnol. Food Sci. 2019, 8, 1106–1110.
  12. Kusmardi, K.; Wiyarta, E.; Rusdi, N.K.; Maulana, A.M.; Estuningtyas, A.; Sunaryo, H. The potential of lunasin extract for the prevention of breast cancer progression by upregulating E-Cadherin and inhibiting ICAM-1. F1000Research 2021, 10, 902.
  13. Kaufman-Szymczyk, A.; Kaczmarek, W.; Fabianowska-Majewska, K.; Lubecka-Gajewska, K. Lunasin and its epigenetic impact in cancer chemoprevention. Int. J. Mol. Sci. 2023, 24, 9187.
  14. Lule, V.K.; Garg, S.; Pophaly, S.D.; Tomar, S.D. Potential health benefits of Lunasin: A multifaceted Soy-Derived bioactive peptide. J. Food Sci. 2015, 80, R485–R494.
  15. Singh, B.P.; Vij, S.; Hati, S. Functional significance of bioactive peptides derived from soybean. Peptides 2014, 54, 171–179.
  16. Hernandez-Ledesma, B.; Hsieh, C.C.; de Lumen, B.O. Relationship between lunasin’s sequence and its inhibitory activity of histones H3 and H4 acetylation. Mol. Nutr. Food Res. 2011, 55, 989–998.
  17. Hsieh, C.C.; Hernandez-Ledesma, B.; de Lumen, B.O. Soybean peptide lunasin suppresses in vitro and in vivo 7,12-dimethylbenz[a]anthracene-induced tumorigenesis. J. Food Sci. 2010, 75, H311–H316.
  18. Shidal, C.; Al-Rayyan, N.; Yaddanapudi, K.; Davis, K. Lunasin is a novel therapeutic agent for targeting melanoma cancer stem cells. Oncotarget 2016, 7, 84128–84141.
  19. Devapatla, B.; Shidal, C.; Yaddanapudi, K.; Davis, K. Validation of syngeneic mouse models of melanoma and non-small cell lung cancer for investigating the anticancer effects of the soy-derived peptide Lunasin. F1000Research 2017, 5, 2432.
  20. de Mejia, G.E.; Vásconez, M.; de Lumen, B.O.; Nelson, R. Lunasin concentration in different soybean genotypes, commercial soy protein, and isoflavone products. J. Agric. Food Chem. 2004, 52, 5882–5887.
  21. Zhang, W.; Hao, Y.; Teng, C.; Tan, C. Effects of Salt Stimulation on Lunasin Accumulation and Activity during Soybean Germination. Foods 2020, 9, 118.
  22. Kump, K.L.; Bradbury, P.J.; Wisser, R.J.; Buckler, E.S.; Belcher, A.R.; Oropeza-Rosas, M.A. Genome-wide association study of quantitative resistance to southern leaf blight in the maize nested association mapping population. Nat. Genet. 2011, 43, 163–168.
  23. Wen, Z.; Tan, R.; Yuan, J.; Bales, C.; Du, W.; Zhang, S.; Wang, D. Genome-wide association mapping of quantitative resistance to sudden death syndrome in soybean. BMC Genom. 2014, 15, 809.
  24. Yuan, J.; Bizimungu, B.; De Koeyer, D.; Rosyara, U.; Wen, Z.; Lagüe, M. Genome-Wide Association Study of Resistance to Potato Common Scab. Potato Res. 2020, 63, 253–266.
  25. Naik, S.; Sudan, J.; Urwat, U.; Pakhtoon, M.; Bhat, B.; Sharma, V.; Sofi, P.; Shikari, S.; Bhat, B.; Sofi, N.; et al. Genome-wide SNP discovery and genotyping delineates potential QTLs underlying major yield-attributing traits in buckwheat. Plant Genome 2024, 17, e20427.
  26. Wang, S.B.; Feng, J.Y.; Ren, W.L.; Huang, B.; Zhou, L.; Wen, Y.J.; Zhang, J.; Dunwell, J.M.; Xu, S.; Zhang, Y.M. Improving power and accuracy of genome-wide association studies via a multi-locus mixed linear model methodology. Sci. Rep. 2016, 6, 19444.
  27. Lee, S.; Van, K.; Sung, M.; McHale, L.; Nelson, R.L.; LaMantia, J.M.; Mian, M.A.R. Genome-wide association study of seed protein, oil, and amino acid contents in soybean from maturity groups I to IV. Theor. Appl. Genet. 2019, 132, 1639–1659.
  28. Wójcik-Jagła, M.; Fiust, A.; Koscielniak, J.; Rapacz, M. Association mapping of drought tolerance-related traits in barley to complement a traditional biparental QTL mapping study. Theor. Appl. Genet. 2018, 131, 167–181.
  29. Xu, X.; Sharma, R.; Tondelli, A.; Russell, J.; Comadran, J.; Schnaithmann, F.; Pillen, K.; Kilian, B.; Cattivelli, L.; Thomas, W.T.B.; et al. Genome-wide association analysis of grain yield-associated traits in a Pan-European barley cultivar collection. Plant Genome 2018, 11, 170073.
  30. van Rooijen, R.; Kruijer, W.; Boesten, R.; van Eeuwijk, F.A.; Harbinson, J.; Aarts, M.G.M. Natural variation of YELLOW SEEDLING1 affects photosynthetic acclimation of Arabidopsis thaliana. Nat. Commun. 2017, 8, 1421.
  31. Ramalingam, A.; Mohanavel, W.; Kambale, R.; Rajagopalan, V.; Marla, S.; Prasad, P.; Muthurajan, R.; Perumal, R. Pilot-scale genome-wide association mapping in diverse sorghum germplasms identified novel genetic loci linked to major agronomic, root and stomatal traits. Sci. Rep. 2023, 13, 21917.
  32. Nordborg, M.; Tavare, S. Linkage disequilibrium: What history has to tell us. Trends Genet. 2002, 18, 83–90.
  33. Yu, J.; Pressoir, G.; Briggs, W.; Vroh, I.; Bi, M.; Yamasaki, J.F.; Doebley, M.D.; McMullen, B.S.; Gaut, D.; Nielsen, J.B.; et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 2006, 38, 203–208.
  34. Yu, J.; Buckler, E.S. Genetic association mapping and genome organization of maize. Curr. Opin. Biotechnol. 2006, 17, 155–160.
  35. Zhang, Z.W.; Ersoz, E.; Lai, C.Q.; Todhunter, R.J.; Tiwari, H.K.; Gore, M.A. Mixed linear model approach adapted for genome-wide association studies. Nat. Genet. 2010, 42, 355–360.
  36. Zhang, Y.; Jia, Z.; Dunwell, J. Editorial: The Applications of New Multi-Locus GWAS Methodologies in the Genetic Dissection of Complex Traits. Front. Plant Sci. 2019, 10, 100.
  37. Tamba, C.L.; Ni, Y.L.; Zhang, Y.M. Iterative sure independence screening EM-Bayesian LASSO algorithm for multi-locus genome-wide association studies. PLoS Comput. Biol. 2017, 13, e1005357.
  38. Zhang, J.; Feng, J.Y.; Ni, Y.N.; Wen, Y.J.; Niu, Y.; Tamba, C.L.; Yue, C.; Song, Q.; Zhang, Y.-M. pLARmEB: Integration of least angle regression with empirical Bayes for multi-locus genome-wide association studies. Heredity 2017, 118, 517–524.
  39. Wen, Y.J.; Zhang, H.; Ni, Y.N.; Huang, B.; Zhang, J.; Feng, J.Y.; Wang, S.B.; Dunwell, J.M.; Zhang, Y.M.; Wu, R. Methodological implementation of mixed linear models in multi-locus genome-wide association studies. Brief. Bioinform. 2018, 19, 700–712.
  40. Tamba, C.; Zhang, Y.M. A fast mrMLM algorithm for multi-locus genome-wide association studies. bioRxiv 2018.
  41. Ren, W.; Wen, Y.; Dunwell, J.M.; Zhang, Y.M. pKWmEB: Integration of Kruskal-Wallis test with empirical Bayes under polygenic background control for multi-locus genome-wide association study. Heredity 2018, 120, 208–218.
  42. Zhang, W.; Tamba, C.; Wen, Y.; Li, P.; Ren, W.; Ni, Y.; Gao, J.; Zhang, Y. mrMLM v4.0: An R platform for multi-locus genome-wide association studies. bioRxiv 2020.
  43. Zhang, Y.; Jia, Z.; Dunwell, J. Editorial: The applications of new multi-locus GWAS methodologies in the genetic dissection of complex traits, volume II. Front. Plant Sci. 2023, 14, 1340767.
  44. Li, Y.H.; Zhou, G.; Ma, J.; Jiang, W.; Jin, L.G.; Zhang, Z.; Guo, Y.; Zhang, J.; Sui, Y.; Zheng, L.; et al. De novo assembly of soybean wild relatives for pan-genome analysis of diversity and agronomic traits. Nat. Biotechnol. 2014, 32, 1045–1052.
  45. Xu, H.; Abe, J.; Gai, Y.; Shimamoto, Y. Diversity of chloroplast DNA SSRs in wild and cultivated soybeans: Evidence for multiple origins of cultivated soybean. Theor. Appl. Genet. 2002, 105, 645–653.
  46. Gizlice, Z.; Carter, T.E.; Burton, J.W. Genetic base for North American public soybean cultivars released between 1947 and 1988. Crop Sci. 1994, 34, 1143–1151.
  47. Revilleza, M.J.; Galvez, A.F.; Krenz, D.C.; de Lumen, B.O. An 8 kDa methionine-rich protein from soybean (Glycine max) cotyledon: Identification, purification, and N-terminal sequence. J. Agric. Food Chem. 1996, 44, 2930–2935.
  48. Jeong, H.J.; Park, J.H.; Lam, Y.; de Lumen, B.O. Characterization of lunasin isolated from soybean. J. Agric. Food Chem. 2003, 51, 7901–7906.
  49. Sachdeva, S.; Singh, R.; Maurya, A.; Singh, V.; Singh, U. New insights into QTNs and potential candidate genes governing rice yield via a multi-model genome-wide association study. BMC Plant Biol. 2024, 24, 124.
  50. Zeffa, D.; Júnior, L.; de Assis, R.; Delfini, J.; Marcos, A.; Koltun, A.; Baba, V.; Goncalves, L. Multi-locus genome-wide association study for phosphorus use efficiency in a tropical maize germplasm. Front. Plant Sci. 2024, 15, 1366173.
  51. Galvez, A.F.; Revilleza, M.J.R.; de Lumen, B.O. A novel methionine-rich protein from soybean cotyledon: Cloning and characterization of a cDNA (Accession No. AF005030) Plant Gene Register #PGR97-103. Plant Physiol. 1997, 114, 1567.
  52. McLennan, A.G. The Nudix hydrolase superfamily. Cell. Mol. Life Sci. 2006, 63, 123–143.
  53. Ge, X.; Xia, Y. The role of AtNUDT7, a Nudix hydrolase, in the plant defense response. Plant Signal. Behav. 2008, 3, 119–120.
  54. Dong, S.; Wang, Y. Nudix Effectors: A Common Weapon in the Arsenal of Plant Pathogens. PLoS Pathog. 2016, 12, e1005704.
  55. Hallauer, A.R.; Carena, M.J.; Miranda Filho, J.D. Quantitative Genetics in Maize Breeding; Springer Science & Business Media: New York, NY, USA, 2010.
  56. Hassani, M.; Mahmoudi, S.; Saremirad, A.; Taleghani, D. Genotype by environment and genotype by yield*trait interactions in sugar beet: Analyzing yield stability and determining key traits association. Sci. Rep. 2023, 13, 23111.
  57. Falconer, D.S.; Mackay, T.F.C. Introduction to Quantitative Genetics, 4th ed.; Addison Wesley Longman: Harlow, UK, 1996.
  58. Zhang, J.; Song, Q.; Cregan, P.; Jiang, G. Genome-wide association study, genomic prediction and marker-assisted selection for seed weight in soybean (Glycine max). Theor. Appl. Genet. 2016, 129, 117–130.
  59. Li, J.; Wu, M.; Chen, H.; Liao, W.; Yao, S.; Wei, Y.; Wang, H.; Long, Q.; Hu, X.; Wang, W.; et al. An integrated physiological indicator and transcriptomic analysis reveals the response of soybean buds to high-temperature stress. BMC Plant Biol. 2024, 24, 1102.
  60. Song, Q.; Hyten, D.; Jia, G.; Quigley, C.; Fickus, E.; Nelson, R.; Cregan, P. Development and evaluation of SoySNP50K, a high-density genotyping array for soybean. PLoS ONE 2013, 8, e54985.
  61. Song, Q.; Hyten, D.; Jia, G.; Quigley, C.; Fickus, E.; Nelson, R.; Cregan, P. Fingerprinting soybean germplasm and its utility in genomic research. G3 2015, 5, 1999–2006.
  62. Grant, D.; Nelson, R.T.; Cannon, S.B.; Shoemaker, R.C. SoyBase, the USDA-ARS soybean genetics and genomics database. Nucleic Acids Res. 2010, 38, D843–D846. [Google Scholar] [CrossRef]
  63. Cavazos, A.; Morales, E.; Dia, V.P.; De Mejia, E.G. Analysis of lunasin in commercial and pilot plant produced soybean products and an improved method of lunasin purification. J. Food Sci. 2012, 77, C539–C545. [Google Scholar] [CrossRef]
  64. Kusumah, J.; Castañeda-Reyes, E.D.; Bringe, N.A.; de Mejia, E.G. Soybean (Glycine max) INFOGEST Colonic Digests Attenuated Inflammatory Responses Based on Protein Profiles of Different Varieties. Int. J. Mol. Sci. 2023, 24, 12396. [Google Scholar] [CrossRef] [PubMed]
  65. Pilet-Nayel, M.L.; Muehlbauer, F.J.; McGee, R.J.; Kraft, J.M.; Baranger, A.; Coyne, C.J. Quantitative trait loci for partial resistance to Aphanomyces root rot in pea. Theor. Appl. Genet. 2002, 106, 28–39. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Trait distribution and summary parameters for seed lunasin content in the GWAS mapping panel in GRIN (2003) and Clayton, NC (2021), displayed as violin plots. The trait was assessed in defatted and undefatted (protein, mg/g) soybean (Glycine max) samples in 2003 and in lipid-adjusted (defatted) and undefatted (protein) soybean flour in 2021.
Figure 2. Correlation between the lunasin content of defatted and undefatted soybean flour (GRIN, 2003) and of lipid-adjusted and undefatted soybean flour (Clayton, NC, 2021). Asterisks mark the significance of each correlation (* p < 0.05, ** p < 0.005, *** p < 0.005). The distribution of each trait is displayed in histograms, and correlations between two traits are shown in scatter plots.
Figure 3. Manhattan plots showing significant marker–trait associations for lunasin content in soybean seeds from two different environments: (A) GRIN (2003) and (B) Clayton, NC (2021).
Table 1. Agronomic parameters of assessed traits in the GWAS mapping panel in GRIN (2003) and Clayton, NC (2021). Significance level: * p < 0.05, ** p < 0.01, *** p < 0.001.

| Trait | Mean | Range | CV (%) | SE | Skewness | Kurtosis | W Value (p < 0.05) |
|---|---|---|---|---|---|---|---|
| Lunasin_Pr03 | 6.32 | 10.74 | 43.08 | 0.32 | 0.01 | 2.15 | 0.96 * |
| Lunasin_DF03 | 5.79 | 9.88 | 43.03 | 0.29 | −0.08 | 2.14 | 0.96 * |
| Lunasin_Pr21 | 338.09 | 772.87 | 46.16 | 17.23 | 0.56 | 3.25 | 0.97 |
| Lunasin_DF21 | 35.29 | 106.49 | 66.82 | 2.6 | 1.33 | 4.41 | 0.87 *** |
Table 2. QTNs detected for lunasin content in soybean seeds from two diverse environments, GRIN (2003) and Clayton, NC (2021), using multi-locus GWAS models.

| Trait Name | Method | SNP | Chr. | Marker Position | QTN Effect | LOD Score | −log10(P) | r² (%) | MAF | Genotype |
|---|---|---|---|---|---|---|---|---|---|---|
| Lunasin_DF03 | mrMLM | ss715595316 | 6 | 5184562 | 1.2504 | 4.0461 | 4.7999 | 8.3756 | 0.1179 | G |
| Lunasin_DF03 | mrMLM | ss715596616 | 7 | 17043096 | 0.8142 | 5.8761 | 6.7049 | 6.9945 | 0.3302 | C |
| Lunasin_DF03 | mrMLM | ss715633674 | 19 | 3035283 | −8.3977 | 5.5739 | 6.3923 | 0.7161 | 0.0047 | G |
| Lunasin_DF03 | FASTmrMLM | ss715595316 | 6 | 5184562 | 0.8255 | 3.9564 | 4.7058 | 4.9591 | 0.1157 | G |
| Lunasin_DF03 | FASTmrMLM | ss715595189 | 6 | 50535934 | −1.1725 | 3.0827 | 3.7832 | 8.7577 | 0.1065 | A |
| Lunasin_DF03 | FASTmrMLM | ss715596616 | 7 | 17043096 | 0.421 | 3.0624 | 3.7616 | 2.5404 | 0.3241 | C |
| Lunasin_DF03 | FASTmrEMMA | ss715579276 | 1 | 40092915 | 1.2741 | 4.0034 | 4.7551 | 5.5541 | 0.3056 | T |
| Lunasin_DF03 | FASTmrEMMA | ss715586961 | 3 | 793778 | 1.3682 | 3.5775 | 4.3071 | 6.8987 | 0.3704 | A |
| Lunasin_DF03 | FASTmrEMMA | ss715596420 | 7 | 15552442 | −1.1375 | 3.6375 | 4.3703 | 4.0045 | 0.2639 | A |
| Lunasin_DF03 | pLARmEB | ss715580426 | 1 | 54004529 | 0.7868 | 3.1176 | 3.8203 | 3.6523 | 0.088 | G |
| Lunasin_DF03 | pLARmEB | ss715581194 | 2 | 13683259 | −1.5026 | 7.0063 | 7.8711 | 5.4191 | 0.0926 | G |
| Lunasin_DF03 | pLARmEB | ss715590292 | 5 | 1942075 | 0.4423 | 4.5676 | 5.3456 | 3.0459 | 0.412 | A |
| Lunasin_DF03 | pLARmEB | ss715595316 | 6 | 5184562 | 0.8666 | 5.0072 | 5.8036 | 5.4654 | 0.1157 | G |
| Lunasin_DF03 | pLARmEB | ss715596608 | 7 | 16974662 | 0.5261 | 3.3406 | 4.0567 | 3.967 | 0.3241 | G |
| Lunasin_DF03 | pLARmEB | ss715604755 | 9 | 47022602 | −0.6594 | 4.5291 | 5.3054 | 5.7395 | 0.2824 | T |
| Lunasin_DF03 | pLARmEB | ss715616380 | 13 | 42752378 | 0.3751 | 3.275 | 3.9873 | 2.1377 | 0.412 | T |
| Lunasin_DF03 | pLARmEB | ss715628168 | 17 | 7370439 | −0.5191 | 4.9246 | 5.7177 | 3.9145 | 0.3426 | A |
| Lunasin_DF03 | ISIS EM-BLASSO | ss715579317 | 1 | 41141330 | 0.535 | 5.766 | 6.5909 | 4.3876 | 0.3889 | G |
| Lunasin_DF03 | ISIS EM-BLASSO | ss715580042 | 1 | 50725197 | 0.6076 | 5.4564 | 6.2702 | 3.7298 | 0.1898 | A |
| Lunasin_DF03 | ISIS EM-BLASSO | ss715580426 | 1 | 54004529 | 0.8918 | 4.4724 | 5.2462 | 4.6926 | 0.088 | G |
| Lunasin_DF03 | ISIS EM-BLASSO | ss715581194 | 2 | 13683259 | −0.9361 | 3.1766 | 3.8829 | 5.9838 | 0.0926 | G |
| Lunasin_DF03 | ISIS EM-BLASSO | ss715595316 | 6 | 5184562 | 0.606 | 3.6641 | 4.3983 | 2.6723 | 0.1157 | G |
| Lunasin_DF03 | ISIS EM-BLASSO | ss715596608 | 7 | 16974662 | 0.4661 | 4.3235 | 5.0905 | 3.1138 | 0.3241 | G |
| Lunasin_DF03 | ISIS EM-BLASSO | ss715597517 | 7 | 37004443 | −0.6891 | 5.3944 | 6.2058 | 4.0617 | 0.1389 | G |
| Lunasin_DF03 | ISIS EM-BLASSO | ss715599447 | 8 | 13272509 | −0.7145 | 3.3954 | 4.1147 | 1.4537 | 0.0463 | A |
| Lunasin_DF03 | ISIS EM-BLASSO | ss715604755 | 9 | 47022602 | −0.572 | 4.6519 | 5.4335 | 4.3192 | 0.2824 | T |
| Lunasin_DF03 | ISIS EM-BLASSO | ss715628168 | 17 | 7370439 | −0.4974 | 4.6701 | 5.4525 | 3.5938 | 0.3426 | A |
| Lunasin_DF21 | mrMLM | ss715588993 | 4 | 51820047 | −15.7929 | 6.7159 | 7.5719 | 7.4369 | 0.0875 | G |
| Lunasin_DF21 | mrMLM | ss715591002 | 5 | 34359243 | −6.6651 | 4.0003 | 4.7519 | 4.1176 | 0.4 | C |
| Lunasin_DF21 | mrMLM | ss715607293 | 10 | 43465671 | −47.9847 | 5.2081 | 6.0124 | 30.9931 | 0.0187 | C |
| Lunasin_DF21 | mrMLM | ss715608045 | 10 | 49696358 | 10.6285 | 6.4119 | 7.2585 | 6.1079 | 0.1562 | G |
| Lunasin_DF21 | mrMLM | ss715613090 | 12 | 4402508 | −24.2842 | 5.6262 | 6.4462 | 26.1584 | 0.0688 | A |
| Lunasin_DF21 | mrMLM | ss715623919 | 16 | 28465014 | −8.3573 | 5.1897 | 5.9933 | 6.4143 | 0.4125 | A |
| Lunasin_DF21 | mrMLM | ss715629312 | 18 | 1735923 | 6.1196 | 3.319 | 4.0339 | 2.8076 | 0.2562 | A |
| Lunasin_DF21 | mrMLM | ss715638485 | 20 | 44108146 | 7.0304 | 3.5643 | 4.2931 | 2.518 | 0.15 | T |
| Lunasin_DF21 | FASTmrMLM | ss715591002 | 5 | 34359243 | −4.567 | 3.6677 | 4.4021 | 3.6408 | 0.4146 | A |
| Lunasin_DF21 | FASTmrMLM | ss715592971 | 6 | 12980012 | −5.8621 | 3.3236 | 4.0387 | 4.2342 | 0.2073 | A |
| Lunasin_DF21 | FASTmrMLM | ss715594520 | 6 | 45930385 | −5.7135 | 4.6139 | 5.3939 | 5.5865 | 0.3841 | C |
| Lunasin_DF21 | FASTmrMLM | ss715612259 | 12 | 33175894 | 12.4064 | 6.1152 | 6.9521 | 8.6429 | 0.1159 | T |
| Lunasin_DF21 | FASTmrMLM | ss715616757 | 13 | 16561515 | −4.0125 | 3.0799 | 3.7802 | 2.8878 | 0.4939 | A |
| Lunasin_DF21 | FASTmrMLM | ss715618992 | 14 | 44255136 | −8.8116 | 5.3863 | 6.1974 | 7.4492 | 0.1524 | C |
| Lunasin_DF21 | FASTmrMLM | ss715622529 | 15 | 49657138 | −10.779 | 10.211 | 11.1535 | 17.7085 | 0.2988 | C |
| Lunasin_DF21 | FASTmrEMMA | ss715612259 | 12 | 33175894 | 32.5837 | 4.7315 | 5.5165 | 16.6431 | 0.1159 | T |
| Lunasin_DF21 | FASTmrEMMA | ss715622529 | 15 | 49657138 | −3.37 × 10^−5 | 3.6424 | 4.3755 | 4.23 × 10^−11 | 0.2988 | C |
| Lunasin_DF21 | pLARmEB | ss715585714 | 3 | 36302681 | −7.1031 | 7.3819 | 8.2573 | 3.0745 | 0.3415 | G |
| Lunasin_DF21 | pLARmEB | ss715592971 | 6 | 12980012 | −10.0753 | 9.5639 | 10.4929 | 4.7987 | 0.2073 | A |
| Lunasin_DF21 | pLARmEB | ss715594507 | 6 | 45436808 | 5.7038 | 4.8581 | 5.6484 | 1.9438 | 0.3171 | T |
| Lunasin_DF21 | pLARmEB | ss715607342 | 10 | 44167653 | −14.8486 | 7.1929 | 8.063 | 4.1257 | 0.061 | C |
| Lunasin_DF21 | pLARmEB | ss715612259 | 12 | 33175894 | 13.9585 | 8.9721 | 9.8878 | 4.1975 | 0.1159 | T |
| Lunasin_DF21 | pLARmEB | ss715622529 | 15 | 49657138 | −7.622 | 6.2133 | 7.0534 | 3.3971 | 0.2988 | C |
| Lunasin_DF21 | pLARmEB | ss715624628 | 16 | 33733586 | −5.4859 | 3.485 | 4.2094 | 1.5313 | 0.2195 | C |
| Lunasin_DF21 | ISIS EM-BLASSO | ss715592975 | 6 | 12984798 | −8.1046 | 6.5187 | 7.3686 | 9.2736 | 0.2622 | G |
| Lunasin_DF21 | ISIS EM-BLASSO | ss715594520 | 6 | 45930385 | −4.9599 | 3.781 | 4.5214 | 4.2099 | 0.3841 | C |
| Lunasin_DF21 | ISIS EM-BLASSO | ss715607342 | 10 | 44167653 | −14.3003 | 5.7142 | 6.5373 | 9.974 | 0.061 | C |
| Lunasin_DF21 | ISIS EM-BLASSO | ss715612259 | 12 | 33175894 | 9.0802 | 3.6503 | 4.3838 | 4.6298 | 0.1159 | T |
| Lunasin_DF21 | ISIS EM-BLASSO | ss715616757 | 13 | 16561515 | −4.0037 | 3.1842 | 3.891 | 2.8752 | 0.4939 | A |
| Lunasin_DF21 | ISIS EM-BLASSO | ss715622529 | 15 | 49657138 | −8.1201 | 6.2863 | 7.1288 | 10.0496 | 0.2988 | C |
| Lunasin_DF21 | ISIS EM-BLASSO | ss715624628 | 16 | 33733586 | −7.8726 | 5.2596 | 6.0659 | 8.2199 | 0.2195 | C |
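The LOD score and −log10(P) columns of Table 2 appear to be two views of the same likelihood-ratio test. The following Python sketch assumes (this is not stated in the table itself) that the p-value is taken from a chi-square distribution with one degree of freedom applied to LRT = 2 ln(10) × LOD. Under that assumption it closely reproduces the reported −log10(P) values for the three rows checked. The function name `lod_to_neg_log10_p` is illustrative only and is not part of the mrMLM package.

```python
import math


def lod_to_neg_log10_p(lod: float) -> float:
    """Convert a LOD score to -log10(p), assuming a 1-d.f. chi-square LRT."""
    # Likelihood-ratio statistic implied by the LOD score: LRT = 2 * ln(10) * LOD
    lrt = 2.0 * math.log(10.0) * lod
    # Upper-tail probability of a chi-square variable with 1 d.f.:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(lrt / 2.0))
    return -math.log10(p_value)


# (LOD score, reported -log10(P)) pairs copied from three rows of Table 2
rows = [
    (4.0461, 4.7999),    # Lunasin_DF03, mrMLM, ss715595316
    (5.8761, 6.7049),    # Lunasin_DF03, mrMLM, ss715596616
    (10.211, 11.1535),   # Lunasin_DF21, FASTmrMLM, ss715622529
]

for lod, reported in rows:
    computed = lod_to_neg_log10_p(lod)
    print(f"LOD = {lod:<8} computed = {computed:.4f}  reported = {reported}")
```

If the assumption holds, the LOD threshold of 3.0 used in the study corresponds to a per-marker p-value of roughly 2 × 10^−4, which is one way to compare these results with single-locus GWAS thresholds.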
Table 3. Candidate genes or gene clusters around the significant QTNs for lunasin content in soybean seeds.

| Environment | QTN | Chromosome | Genomic Position | Candidate Gene ID | Reference Genome | Functional Annotation |
|---|---|---|---|---|---|---|
| GRIN, 2003 | qL-01 | 2 | 12034409 | Glyma.02G123302 | Wm82.gnm6 | RAB geranylgeranyl transferase alpha subunit 1 |
| GRIN, 2003 | qL-02 | 2 | 13683259 | Glyma.02G131052 | Wm82.gnm6 | Nudix hydrolase 1 (NUDT1) cluster |
| GRIN, 2003 | qL-03 | 3 | 793778 | Glyma.03G008400 | Wm82.gnm6 | Peptide chain release factor |
| GRIN, 2003 | qL-04 | 7 | 17043096 | Glyma.07G144500 | Wm82.gnm6 | mRNA capping enzyme family protein |
| GRIN, 2003 | qL-05 | 7 | 35777062 | Glyma.07G177100 | Wm82.gnm6 | Pentatricopeptide repeat (PPR-like) superfamily protein |
| GRIN, 2003 | qL-06 | 7 | 37004443 | Glyma.07G179400 | Wm82.gnm6 | Embryo defective 1273 protein |
| GRIN, 2003 | qL-07 | 9 | 47022602 | Glyma.09G238700 | Wm82.gnm6 | ZF-HD homeobox protein cluster |
| GRIN, 2003 | qL-08 | 13 | 13819188 | Glyma.13G044000 | Wm82.gnm6 | ATP-binding ABC transporter |
| GRIN, 2003 | qL-09 | 16 | 4380977 | Glyma.16G047300 | Wm82.gnm6 | ATP binding/protein serine/threonine kinase |
| GRIN, 2003 | qL-10 | 16 | 36170992 | Glyma.16G179400 | Wm82.gnm6 | Pentatricopeptide repeat (PPR) superfamily protein |
| GRIN, 2003 | qL-11 | 18 | 5146121 | Glyma.18G059800 | Wm82.gnm6 | Pentatricopeptide repeat-containing protein |
| GRIN, 2003 | qL-12 | 20 | 34499736 | Glyma.20G083300 | Wm82.gnm6 | Ribosomal protein S3 |
| Clayton, NC 2021 | qL-01 | 3 | 36302681 | Glyma.03G007400 | Wm82.gnm6 | Transmembrane protein |
| Clayton, NC 2021 | qL-02 | 4 | 51820047 | Glyma.04G214500 | Wm82.gnm6 | Ribosomal protein L17 family protein |
| Clayton, NC 2021 | qL-03 | 5 | 34359243 | Glyma.05G129200 | Wm82.gnm6 | Transmembrane protein |
| Clayton, NC 2021 | qL-04 | 6 | 12980012 | Glyma.06G156700 | Wm82.gnm6 | Transmembrane amino acid transporter family protein |
| Clayton, NC 2021 | qL-05 | 6 | 45436808–45930385 | Glyma.06G253700 | Wm82.gnm6 | Transmembrane protein 184C-like |
| Clayton, NC 2021 | qL-06 | 10 | 43465671 | Glyma.10G162500 | Wm82.gnm6 | Pentatricopeptide repeat (PPR) superfamily protein |
| Clayton, NC 2021 | qL-07 | 10 | 44167653 | Glyma.10G168600 | Wm82.gnm6 | Pentatricopeptide repeat (PPR) superfamily protein |
| Clayton, NC 2021 | qL-08 | 10 | 49696358 | Glyma.10G230100 | Wm82.gnm6 | Pentatricopeptide repeat (PPR) superfamily protein |
| Clayton, NC 2021 | qL-09 | 12 | 4402508 | Glyma.12G060101 | Wm82.gnm6 | Pentatricopeptide repeat (PPR) superfamily protein |
| Clayton, NC 2021 | qL-10 | 12 | 33175894 | Glyma.12G162300 | Wm82.gnm6 | 30S ribosomal protein S20 |
| Clayton, NC 2021 | qL-11 | 13 | 16561515 | Glyma.13G068200 | Wm82.gnm6 | Peptide transporter 1 |
| Clayton, NC 2021 | qL-12 | 14 | 44255136 | | | |
| Clayton, NC 2021 | qL-13 | 15 | 49657138 | Glyma.15G245700 | Wm82.gnm6 | Pentatricopeptide repeat (PPR) superfamily protein |
| Clayton, NC 2021 | qL-14 | 16 | 28465014 | Glyma.16G116600 | Wm82.gnm6 | Pentatricopeptide repeat (PPR) superfamily protein |
| Clayton, NC 2021 | qL-15 | 16 | 33733586 | Glyma.16G160200 | Wm82.gnm6 | Transmembrane amino acid transporter family protein |
| Clayton, NC 2021 | qL-16 | 18 | 1735923 | Glyma.18G022400 | Wm82.gnm6 | Transmembrane amino acid transporter family protein |
| Clayton, NC 2021 | qL-17 | 20 | 44108146 | Glyma.20G170700 | Wm82.gnm6 | RAN binding protein 1 |
Locklear, R., Kusumah, J., Rashad, L., Lugaro, F., Viera, S., Kipyego, N., Kipkosgei, F., Jerop, D., Jacquet, S., Kassem, M. A., Yuan, J., de Mejia, E., & Mian, R. (2025). Multi-Locus GWAS Mapping and Candidate Gene Analysis of Anticancer Peptide Lunasin in Soybean (Glycine max L. Merr.). Plants, 14(14), 2169. https://doi.org/10.3390/plants14142169