Article

A Genome-Wide Association Study Reveals Region Associated with Seed Protein Content in Cowpea

1
Department of Horticulture, University of Arkansas, Fayetteville, AR 72701, USA
2
Texas A&M AgriLife Research, 11708 Highway 70 South, Vernon, TX 76384, USA
3
Department of Plant and Soil Sciences, Mississippi State University, North Mississippi Research and Extension Center, Verona, MS 38879, USA
4
USDA-ARS, Crop Improvement and Protection Research Unit, Salinas, CA 93905, USA
5
USDA-ARS, Plant Genetic Resources Conservation Unit, 1109 Experiment Street, Griffin, GA 30223, USA
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Plants 2023, 12(14), 2705; https://doi.org/10.3390/plants12142705
Submission received: 19 June 2023 / Revised: 16 July 2023 / Accepted: 17 July 2023 / Published: 20 July 2023
(This article belongs to the Special Issue Molecular Markers and Molecular Breeding in Horticultural Plants)

Abstract

Cowpea (Vigna unguiculata L. Walp., 2n = 2x = 22) is a protein-rich crop that complements staple cereals for humans and serves as fodder for livestock. It is widely grown in Africa and other developing regions as a primary source of dietary protein; therefore, it is necessary to identify protein-related loci to improve cowpea breeding. In the current study, we conducted a genome-wide association study (GWAS) on 161 cowpea accessions (151 USDA germplasm accessions plus 10 Arkansas breeding lines) with a wide range of seed protein contents (21.8–28.9%), using 110,155 high-quality whole-genome single-nucleotide polymorphisms (SNPs) to identify markers associated with protein content, and then performed genomic prediction (GP) for future breeding. A total of seven significant SNP markers were identified using five GWAS models: single-marker regression (SMR), the general linear model (GLM), the mixed linear model (MLM), Fixed and Random Model Circulating Probability Unification (FarmCPU), and Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK). All seven markers are located at the same locus on chromosome 8. This locus was associated with the gene Vigun08g039200, which is annotated as a thioredoxin superfamily protein and plays a critical role in increasing protein content and improving nutritional quality. GP was employed to assess the accuracy of predicting seed protein content in cowpea. It was conducted using cross-prediction with five models, namely ridge regression best linear unbiased prediction (rrBLUP), Bayesian ridge regression (BRR), Bayesian A (BA), Bayesian B (BB), and Bayesian least absolute shrinkage and selection operator (BL), applied to seven random whole-genome marker sets of different densities (10 k, 5 k, 2 k, 1 k, 500, 200, and 7 SNPs) as well as to the significant markers identified through GWAS. The GP accuracies for the seven GWAS-identified SNPs varied between 42.9% and 52.1%, depending on the model used. These findings not only have the potential to expedite the breeding cycle through early prediction of individual performance prior to phenotyping, but also offer practical implications for cowpea breeding programs striving to enhance seed protein content and nutritional quality.

1. Introduction

Cowpea (Vigna unguiculata L. Walp., diploid, 2n = 2x = 22) is a protein-rich crop that complements staple cereals for humans and is fodder for livestock [1,2]. In the semi-arid regions of sub-Saharan Africa, cowpea can still reach an output of 1000 kg/ha, which provides millions of people with cheap, high-quality food protein [3]. The protein content of cowpea is between 20.3 and 32.5% (on a dry-weight basis), higher than that of many other legumes such as chickpea (Cicer arietinum), 13.3–26.8% [4], and lima bean (Phaseolus lunatus), 20.7–23.1% [5]. The high-quality plant-based protein contains all nine essential amino acids required for human health. Adding cowpea to children’s diets can significantly reduce developmental diseases such as malnutrition and infection in developing areas [6]. Its protein can also be used in various food products, including meat alternatives [7], baked goods [8], and protein bars [9], creating economic opportunities for food companies and entrepreneurs [10]. Assessing the protein content of cowpea germplasm can aid plant breeders in identifying and developing cultivars with high seed-protein content within their breeding programs. Ddamulira et al. [11] measured the protein content of 30 cowpea genotypes and found that the average cowpea protein content was between 23.9% and 30.3%. Weng et al. [12] developed a Near-Infrared Reflectance (NIR) rapid method to analyze 240 cowpea genotypes and found the protein contents were 28.8–37.8%. Boukar et al. [13] measured the protein content of 1541 germplasm lines, and the results were 17.5–32.5%, with an average of 25.0% and a standard deviation of 0.4 g.
Breeding efforts to improve cowpea protein content have been ongoing since the last century. B.B. Singh et al. [14] improved 52 cowpea varieties and obtained three varieties, IT89KD-245, IT89KD-288, and IT97K-499-35, with a protein content of 26%. Raina et al. [15] used sodium azide and radiation-induced mutation to improve cowpea protein content. However, developing a new cowpea cultivar through conventional breeding typically takes around 8 to 10 years, or even longer. Therefore, marker-assisted selection (MAS) is a good approach for plant breeders because it can accelerate the development of new and improved varieties with desirable traits [16].
Linkage- and family-based genetic mapping have proven effective in detecting quantitative trait loci (QTLs) that have major or minor phenotypic influences for simple and complex traits in cowpea. Kongjaimun et al. [17] utilized 226 SSR markers from related Vigna species and identified one major and six minor QTLs for the variance in pod length between yard long bean and wild cowpea. Andargie et al. [18] employed SSR markers to identify six QTLs for seed size and four QTLs for pod shattering, with phenotypic variation ranging from 8.9% to 19.1% and 6.4% to 17.2%, respectively. Using RFLP markers, Fatokun et al. [19] established genomic maps for cowpea and detected major QTLs for seed weight. Furthermore, Muchero et al. [20] mapped 12 QTLs associated with seedling drought tolerance and maturity in a cowpea recombinant inbred (RIL) population.
Genome-wide association studies (GWASs) are a powerful tool used in genetics research to identify genetic variations associated with various traits of interest [21]. Several GWASs in cowpea have been conducted using different methods and different populations. Huynh et al. [22] used 51,128 single-nucleotide polymorphisms on 368 diverse cowpea accessions from 51 countries to identify 17 loci related to seed weight, length, and width. Burridge et al. [23] identified 11 significant loci associated with biologically relevant variation in cowpea root architecture with a 189-entry cowpea diversity panel. Paudel et al. [24] performed a GWAS to identify marker-trait associations for flowering time in 292 cowpea accessions using 51,128 SNPs, resulting in the identification of 7 reliable SNPs and candidate genes. Wu et al. [25] combined high-throughput physiological phenotyping of 106 cowpea accessions under progressive drought stress with a GWAS. This allowed the genetic mapping of complex drought-responsive stomatal traits and the identification of a final set of 30 significant SNPs associated with stomatal closure, providing a new methodology for exploring the genetic determinants of water budgeting in crops under stressful conditions. Kpoviessi et al. [26] evaluated 107 cowpea collections from six countries for their responses to Callosobruchus maculatus and identified three quantitative trait nucleotides (QTNs) linked to nearby candidate genes. However, compared with other traits in cowpea or with protein content in other leguminous crops, research on breeding for cowpea protein content is clearly insufficient, since no GWAS on this trait has been reported.
Genomic selection (GS) is a relatively new approach in plant breeding that allows for the selection of desirable traits based on genomic information and has the potential to accelerate the development of new and improved cowpea varieties with improved resistance to pests and diseases, drought tolerance, and other desirable traits. However, very few studies were reported on the applications of GS or GP in cowpea, such as a study to evaluate considerations for genetic architecture in GS models for flowering time, maturity, and seed size and a study to conduct GS for the drought tolerance indices [27]. Additionally, there have been no studies reported on cowpea seed quality so far.
The objectives of this study were to conduct a GWAS to identify SNP markers associated with the seed protein content of cowpea and to estimate the GS accuracy of predicting protein content in a panel of 161 cowpea accessions. To our knowledge, this is the first such report.

2. Results

2.1. Phenotypes

Nitrogen contents were assessed from three replicates at each of two locations in a field experiment laid out as a randomized complete block design. The protein content of each cowpea accession was then estimated (Table S1). The distribution and accumulation density of the protein contents are shown in Figure 1. In total, 24 accessions were under 24.00%, 35 were from 24.00 to 25.00%, 30 were from 25.00 to 26.00%, 44 were from 26.00 to 27.00%, and 28 were above 27.00%. Among the 161 accessions, PI662992 had the highest protein content (28.87%) and PI339587 had the lowest (21.80%). The average protein content was 25.61%, and the standard deviation was 1.49%. The protein content of the 161 accessions followed an approximately normal distribution. The estimated broad-sense heritability (h2) was 53.8%, indicating moderately high heritability of seed protein content in cowpea.

2.2. SNP Profile

For the GWAS and GP in this study, a collection of 110,155 high-quality single-nucleotide polymorphisms (SNPs) was employed, and their distribution across the 11 chromosomes is depicted in Figure 2. The average inter-SNP distance ranged from 3.3 kb to 6.5 kb among chromosomes, with an overall mean of 3.9 kb. The average minor allele frequency (MAF) across the entire genome was 21.6%, while the rates of heterozygosity and missingness were 2.4% and 0.3%, respectively.

2.3. Population Structure Analysis

The population structure of the 161 cowpea accessions was initially inferred from the 110,155 high-quality SNPs using STRUCTURE 2.3.1. The peak of delta K was observed at K = 2, indicating the presence of two sub-populations. At a threshold value of 0.5, 79 of the 161 accessions (49.1%) were assigned to sub-population Q1 and 82 accessions (50.9%) were assigned to Q2 (Table S1). Phylogenetic analysis and a population admixture map of the 161 accessions generated with the GAPIT 3 R package also showed a clustering pattern consistent with that inferred by STRUCTURE at K = 2 (Figure 3A). The two groups were also observed in the PCA dimensions (Figure 3B). The accessions most closely related according to the STRUCTURE analysis were grouped on neighboring branches of the Neighbor-Joining phylogenetic tree (Figure 3C). Therefore, the 161 accessions can be divided into two sub-populations based on both structure and phylogenetic analyses. The kinship matrix, based on 110,155 SNPs for the studied genotypes, indicated that there was no clear clustering among the 161 genotypes (Figure S1).

2.4. GWAS Analysis and Candidate Gene

Association analysis for cowpea protein content was conducted with the 110,155 high-quality SNPs using four models in GAPIT 3 (MLM, GLM, BLINK, and FarmCPU) and three models in TASSEL 5.0 (SMR, GLM, and MLM). The QQ plots (Figure 4A and Figure S2) showed a significant deviation of the observed p-values from the expected distribution, suggesting the presence of SNPs associated with protein content in this population. The results of GAPIT 3 (Figure 4C) and TASSEL 5.0 (Figure S2) were highly consistent, with the SNPs significantly associated with protein content located on chromosome 8. In total, seven SNPs with higher LOD and MAF values were identified: Vu08_3838280, Vu08_3838282, Vu08_3838296, Vu08_3839577, Vu08_3839579, Vu08_3840180, and Vu08_3840193 (Table 1), which were located between the 3839 kb and 3841 kb physical positions of chromosome 8 (Figure 4B). Moreover, haplotype block analysis placed the seven SNPs in the same block, which indicates that they tend to be inherited together rather than being shuffled independently during meiosis. We identified Vu08_3839577 as the most significant of the seven SNPs. Specifically, the FarmCPU and BLINK models of GAPIT 3 produced LOD scores of 10.78 and 6.60, respectively, for this SNP. Moreover, the R2 of Vu08_3839577 was up to 22.44%, with similar values for the other six SNPs.
Given that all significant SNPs were located within a single block (Figure 4B), candidate genes were sought within the block and a flanking region of 30 kb on either side. A total of three candidate genes, Vigun08g039100, Vigun08g039200, and Vigun08g039300, were identified at −25 kb, −6 kb, and +6 kb relative to the haplotype block (Table 2). The candidate genes were annotated as fructan fructosyltransferase (Vigun08g039100), thioredoxin superfamily protein, glutaredoxin subgroup III (Vigun08g039200), and heat shock transcription factor A2 (Vigun08g039300), respectively.

2.5. Genomic Prediction Analysis

In this study, eight SNP sets were used for GP analysis with five models: BA, BB, BL, BRR, and rrBLUP (Figure 5). The average prediction accuracies of the random SNP sets ranged from 18.4% to 53.2% and generally increased with the number of SNPs included, from 7 to 10 k. Averaged across the random SNP sets, accuracies were similar among models, ranging from 39.1% (rrBLUP) to 45.1% (BL). The set of seven trait-associated SNPs had accuracies comparable to those of the larger random SNP sets and performed particularly well with the BL and BRR models (Table S2). Therefore, using trait-associated marker alleles for GP is more efficient for selecting protein content in cowpea breeding.

3. Discussion

3.1. Population and Phenotyping

Cowpea is a significant food crop in tropical and subtropical regions, but research on seed protein content in this crop remains limited compared with other legume species such as soybean [28,29]. In the present investigation, a set of 161 cowpea germplasm accessions obtained from 31 countries was analyzed, manifesting substantial genetic diversity and encompassing 10 distinct seed coat types, as previously documented by Xiong et al. [30,31]. This diverse genetic reservoir offers a promising avenue for a comprehensive exploration of the genetic determinants underlying seed protein content, consequently emphasizing the noteworthy variability inherent within the crop. The scarcity of data on germplasms with high seed-protein content available for cowpea breeding programs makes it crucial to screen cowpea germplasms to identify elite genotype(s) with high protein contents [32]. The protein content of cowpea seeds is an essential index that is closely related to quality, health, nutrition, and market price, regardless of whether the seeds are used for direct human consumption or processed into flour for baked goods or other products. Farmers and consumers prefer high-protein varieties and products; thus, achieving a high protein content has become a crucial goal in cowpea breeding programs and production [33]. According to previous research, cowpea seeds typically contain 25% protein [13]. However, identifying varieties and germplasms with higher protein contents than the average of 25% could be beneficial [34]. Asante et al. conducted a study on 32 cowpea accessions in Ghana to investigate the variation in protein content and found that the seed protein content ranged from 16.4% to 27.3%, with an average of 22.5% [35]. A total of 28 elite USDA cowpea accessions exhibiting a seed-protein content exceeding 28% were identified, surpassing that of conventional commercial cowpea cultivars. 
These accessions could prove valuable for utilization in Marker-Assisted Selection (MAS) breeding to develop novel cultivars with elevated protein content.
Cowpea seed protein content has been found to be highly heritable. Ajeigbe et al. [36] reported a broad-sense heritability of 86% based on nine cowpea varieties. Similarly, Nielsen et al. [37] observed a high broad-sense heritability of 95% for seed protein in 100 cowpea lines based on data from a single location. Tchiagam et al. [38] determined a broad-sense heritability of 74% for seed protein using five divergent lines for cross mating. Emebiri [39] reported broad-sense heritability for protein content ranging from 70% to 78% in two crosses. In contrast, our study found a lower heritability estimate of 53.8% for cowpea seed protein content than these previous reports (70–95%). We attribute this to the two-location trials in our analysis, which add genotype × environment variance to the denominator of the estimate. If heritability had been calculated separately for each location, the estimate would have exceeded 80%.

3.2. The Models of GWASs

GWASs have emerged as a powerful tool for identifying genetic variants associated with complex traits, including protein content in leguminous crops [40]. Due to the complex nature of genetic architecture and environmental factors, it is not uncommon for different statistical models to yield slightly different results. The use of multiple statistical models in GWASs is common practice, as it helps to reduce the risk of false-positive associations and increases the robustness of the results [41,42,43]. When all models in a study produce similar results, this provides additional confidence in the validity of the findings [44]. In this report, the GWAS identified several SNPs associated with protein content using multiple models, and all models yielded similar results. This finding suggests that the association between the genetic variants and the trait is robust and not dependent on the choice of statistical model.

3.3. The SNPs Associated with the Seed Protein Content

Several studies have applied GWASs to investigate the genetic basis of protein content in various leguminous crops. Priyanatha et al. [45] utilized a genomic panel consisting of 200 genotypes to investigate yield, protein, and oil concentrations using the FarmCPU model of GWAS. Hwang et al. [46] performed a GWAS to identify quantitative trait loci (QTL) controlling seed protein and oil concentration in 298 soybean germplasm accessions. Zhang et al. [47] identified three QTLs related to protein content using a GWAS with 211 diverse soybean accessions genotyped with a 355 K SoySNP array. Lee et al. [48] conducted a GWAS using phenotypic data collected from five environments for 621 accessions and 34,014 markers to identify three QTLs for seed protein content. Upadhyaya et al. [49] identified seven genomic loci associated with seed protein content using 16,376 genome-based SNPs in 336 sequenced chickpea accessions. However, there is currently no report on cowpea protein content using a GWAS. In this study, the analysis of cowpea seed protein content in 161 accessions under multi-models is the first report of a GWAS in this field.
In total, seven SNPs were identified that were located in the same haplotype block. This finding can provide valuable insights into genetic architecture [50]. It may suggest that these SNPs are tagging a common causal variant located within this block, or that the block itself contains functional elements that affect the trait [51]. Further investigations are necessary to identify the specific variant(s) responsible for the observed association, but the identification of these SNPs in the same haplotype block provides an important starting point for this process.

3.4. The Candidate Genes

Three candidate genes were identified from the seven significant SNPs and were annotated as fructan fructosyltransferase (FFT, Vigun08g039100), thioredoxin superfamily protein/glutaredoxin subgroup (Vigun08g039200), and heat shock transcription factor A2 (HSFA2, Vigun08g039300).
The Thioredoxin superfamily is an essential group of proteins that regulate seed protein content in plants [52]. Seed protein content is regulated by the balance between protein synthesis and degradation, which is controlled by various factors, including the Thioredoxin superfamily. Studies have highlighted the importance of Thioredoxin h (Trx h) and the Glutaredoxin subgroup (GrxS) in regulating seed protein content [53]. Trx h plays a significant role in regulating the accumulation of storage proteins in seeds by controlling the expression of genes involved in the synthesis and accumulation of these proteins [54]. GrxS, on the other hand, regulates the degradation of storage proteins by controlling the activity of cysteine proteases that break down proteins [55,56]. Other members of the Thioredoxin superfamily, such as Thioredoxin m and Thioredoxin x, also play a role in regulating seed protein content [57]. In summary, the Thioredoxin superfamily has diverse functions in regulating seed protein content, making it essential for seed development and quality.
FFT is an enzyme involved in the biosynthesis of fructans, which are carbohydrate molecules found in plants [58]. HSFA2 is a transcription factor that plays a crucial role in the response of plants to heat stress [59]. There is no direct relationship between FFT/HSFA2 and protein content in plants, but there may be indirect effects. For example, fructans can affect plant growth and development, which in turn can impact the expression of genes involved in protein synthesis and degradation pathways [60]. Fructans may also provide a source of energy for plant metabolism, which could indirectly support protein synthesis [61]. In addition, studies have shown that overexpression of HSFA2 in Arabidopsis leads to an increase in the accumulation of several classes of proteins [62], and overexpression of HSFA2 in rice resulted in an increase in the accumulation of storage proteins in the seeds [63]. Nonetheless, the relationship between FFT/HSFA2 and protein content in plants is complex and requires further investigation.

3.5. The Genomic Prediction

The study identified seven significant SNPs located within a single locus containing genes that are associated with storage proteins. However, prior to the application of these findings in breeding, further verification work is required [64,65]. In recent years, GS has gained popularity in large-scale crop-breeding programs. Previous studies have indicated that GS achieves a more robust prediction of genotypic values when compared with QTLs for traits controlled by numerous genes with small effects. GS is considered to offer a superior and more reliable prediction of outcomes than the traditional QTL approach, as it employs more markers that are distributed throughout the genome and captures more of the genetic variation of a trait [66]. Furthermore, GS can enable predictions of an individual’s performance prior to phenotyping, potentially saving up to 50% of the time and resources required in the breeding process [67].
However, there is currently no existing research investigating the effectiveness of GS or GP for cowpea seed protein content. To address this gap, we conducted GP with five models based on the GWAS-derived SNPs and on seven randomly selected sets containing between 7 and 10,000 SNPs. The accuracy of the GWAS-based SNPs ranged from 42.9% to 52.1%, which is similar to the accuracies reported in previous studies on seed protein content in other plant species such as winter wheat [68] and soybean [69,70], but lower than that reported for flax [71]. Notably, the accuracy of the GP based on the GWAS-derived SNPs was higher than that of randomly selected SNP sets containing 7, 200, and 500 SNPs, and was comparable to that of random SNP sets containing ≥1000 SNPs. These findings suggest that significant SNPs derived from GWAS are important and efficient for use in breeding selection for seed protein content. The candidate SNPs are clustered on chromosome 8 between bases 3,839,000 and 3,841,000. This indicates that there may be a QTL regulating cowpea protein content within this region of about two kilobases.
In summary, the tight clustering of seven SNP markers within a small genomic region, with Vu08_3839577 as the peak marker, indicates a major QTL governing cowpea protein content. By focusing on this marker and exploring its effects and potential use, breeders can streamline selection processes and make significant strides towards developing high-protein cowpea varieties with broader agricultural and nutritional benefits. Further investigations into its biological mechanisms and interactions could enhance our understanding of cowpea genetics and support targeted crop-improvement efforts.

4. Materials and Methods

4.1. Plant Materials and Field Experiment

In this study, 161 cowpea genotypes, including 151 USDA germplasm accessions and 10 Arkansas lines, were assessed. The cowpea plants were grown in a randomized complete block design (RCBD) with three replications, each plot being a single 14-foot-long row with 3-foot row spacing and approximately 4-inch plant spacing, at two locations in Arkansas in 2016: Fayetteville (36°4′ N, 94°9′ W) and Alma (35°29′ N, 94°13′ W). Throughout the growing season, no pesticides, herbicides, or other chemicals were used to manage pests, diseases, or weeds. No regular irrigation was applied before maturity. Pods were bulk harvested when 90% of pods had dried at maturity. The cowpea seeds were subsequently shelled and cleaned [72].

4.2. Seed Protein-Content Assessment

A total of 966 samples were collected from 161 cowpea accessions with three replications at two locations, as described above. To measure seed protein content, the seeds for each sample were carefully selected to be mature, uniform in color and size, and free of damage from insects or machinery. Approximately 20 g of cowpea seeds from each sample was ground using a coffee grinder (Hamilton Beach, MODEL: 80335RV, Glen Allen, VA, USA) for 1 min. Next, 5 g of the ground powder was sieved through a 100# sieve (nominal wire diameter 0.1 mm), and each sample was weighed as 1 g, then transferred to a 0.2 mL microfuge tube for protein determination. The cowpea seed protein content was determined through the analysis of nitrogen percentage by combustion using an Elementar Rapid N III instrument (Elementar, Rhine Main, Germany) at the Agriculture Diagnostic Laboratory, University of Arkansas. Each representative powdered sample was loaded into the instrument’s combustion chamber and analyzed by measuring the released nitrogen gas. Combustion was performed at high temperature in the presence of pure oxygen to release nitrogen, which was subsequently isolated from other combustion products. The nitrogen content was then measured with a thermal conductivity detector, and the percentage of nitrogen in each sample was determined [73]. Finally, the total protein content of each sample was estimated by multiplying the nitrogen content by a conversion factor of 6.25 [74]. The phenotypic data were analyzed with SAS 9.2 (SAS Institute, Cary, NC, USA) software. Broad-sense heritability (h2) was estimated using the following formula:
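$$h^{2} = \frac{\sigma_{G}^{2}}{\sigma_{G}^{2} + \dfrac{\sigma_{GE}^{2}}{e} + \dfrac{\sigma_{\varepsilon}^{2}}{re}}$$
where $\sigma_{G}^{2}$ is the genetic variance; $\sigma_{GE}^{2}$ is the genotype × environment variance; $\sigma_{\varepsilon}^{2}$ is the residual variance; $e$ is the number of environments (locations); and $r$ is the number of replications (blocks).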
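The two calculations in this subsection, the nitrogen-to-protein conversion and the heritability formula, can be sketched as follows. This is a minimal illustration rather than the SAS analysis used in the study; the function names are hypothetical, and the variance components passed in at the end are invented values chosen only to make the arithmetic easy to follow.

```python
def protein_from_nitrogen(nitrogen_pct, factor=6.25):
    """Convert combustion nitrogen (%) to estimated protein (%)."""
    return nitrogen_pct * factor


def broad_sense_heritability(var_g, var_ge, var_resid, n_env, n_rep):
    """h2 = var_g / (var_g + var_ge / e + var_resid / (r * e))."""
    denominator = var_g + var_ge / n_env + var_resid / (n_rep * n_env)
    return var_g / denominator


# Hypothetical inputs (NOT values from the study), with e = 2 locations
# and r = 3 replications as in the field design described above.
example_protein = protein_from_nitrogen(4.0)
example_h2 = broad_sense_heritability(
    var_g=1.0, var_ge=1.0, var_resid=3.0, n_env=2, n_rep=3
)
```

With these invented components the denominator is 1.0 + 0.5 + 0.5, so the genotype × environment and residual terms together halve the estimate. This is the same mechanism by which a multi-location analysis can give a lower heritability than a single-location one.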

4.3. DNA Extraction and Construction of Gene Library and GBS

Genomic DNA was extracted from freeze-dried young cowpea leaves using the CTAB (cetyltrimethylammonium bromide) protocol [75]. DNA concentration was measured using a NanoDrop 2000c spectrophotometer (Thermo Scientific, Wilmington, DE, USA). The DNA library was prepared by digesting the DNA with the ApeKI restriction endonuclease. DNA normalization, library preparation, and GBS (genotyping by sequencing) were conducted on a HiSeq 2000 at the Beijing Genome Institute (BGI), China. The cowpea reference genome was provided by Dr. Timothy J. Close, University of California, Riverside [76]. After filtering the SNP data for minor allele frequency (MAF) > 5% and missing data < 10%, 110,155 high-quality SNPs were obtained.
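The SNP filtering step can be illustrated with a short sketch. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with NaN for missing calls, which is an assumption made for illustration and not a description of the pipeline actually used; the function name and the toy matrix are hypothetical.

```python
import numpy as np


def filter_snps(genotypes, maf_min=0.05, missing_max=0.10):
    """Return a boolean mask of SNP columns with MAF > maf_min and
    missing rate < missing_max.

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts,
    with np.nan marking missing calls (assumed coding).
    """
    geno = np.asarray(genotypes, dtype=float)
    missing_rate = np.isnan(geno).mean(axis=0)
    called = (~np.isnan(geno)).sum(axis=0)
    alt_sum = np.nansum(geno, axis=0)
    # Columns with no calls give 0/0 = NaN, which fails the MAF test below.
    with np.errstate(invalid="ignore", divide="ignore"):
        alt_freq = alt_sum / (2.0 * called)
    maf = np.minimum(alt_freq, 1.0 - alt_freq)
    return (maf > maf_min) & (missing_rate < missing_max)


# Toy data: 20 samples x 3 SNPs (invented for illustration).
toy = np.zeros((20, 3))
toy[10:, 0] = 2          # SNP 0: common variant, no missing calls
toy[0, 1] = 1            # SNP 1: a single heterozygote, so MAF = 1/40
toy[10:, 2] = 2          # SNP 2: common variant ...
toy[:3, 2] = np.nan      # ... but 15% of calls are missing
keep = filter_snps(toy)
```

In the toy matrix only the first SNP passes both thresholds: the second fails on allele frequency and the third on missingness.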

4.4. Population Structure and Genetic Diversity

LEA is an R package designed for conducting population structure and genomic signature analysis of local adaptations. The inference algorithms used by LEA are based on a fast version of the STRUCTURE algorithm [77]. The structure analysis identifies K clusters by determining an optimum ΔK from the SNP data provided. A preliminary analysis was performed in multiple runs by inputting successive values of K from 2 to 20. Once an optimum K was determined, each cowpea accession was assigned to a cluster (Q) based on the probability that the accession belonged to that cluster, with the cut-off probability for assignment set to 0.5. Using the optimum K, a bar plot with ‘Sort by Q’ was generated to visualize the population structure among the 161 accessions. Additionally, phylogenetic trees and principal component analyses (PCAs) of the accessions were generated and drawn using the R package GAPIT 3 (Genomic Association and Prediction Integrated Tool version 3, https://zzlab.net/GAPIT/index.html; https://github.com/jiabowang/GAPIT3, accessed on 1 April 2023) [78]. When drawing the phylogenetic trees and PCA plots, the population structure and cluster information were imported for the combined analysis of genetic diversity. For the sub-tree of each Q (cluster), the shape of the ‘Node/Subtree Marker’ and the ‘Branch Line’ was drawn using the same color scheme as the STRUCTURE analysis.
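The cluster assignment rule can be sketched in a few lines. The membership matrix below is invented, the function name is hypothetical, and treating an accession whose largest membership falls below the cut-off as "admixed" is an assumption for illustration (the text specifies only the 0.5 cut-off, and with K = 2 every accession in the study was assigned to Q1 or Q2).

```python
import numpy as np


def assign_clusters(q_matrix, cutoff=0.5):
    """Assign each accession to the cluster with the largest membership
    probability, provided that probability reaches the cutoff."""
    q = np.asarray(q_matrix, dtype=float)
    best_cluster = q.argmax(axis=1)
    best_prob = q.max(axis=1)
    return [
        f"Q{k + 1}" if p >= cutoff else "admixed"
        for k, p in zip(best_cluster, best_prob)
    ]


# Hypothetical membership probabilities for three accessions at K = 2.
labels_k2 = assign_clusters([[0.90, 0.10], [0.30, 0.70], [0.55, 0.45]])

# With more clusters, no membership may reach the cutoff.
labels_k3 = assign_clusters([[0.40, 0.30, 0.30]])
```

At K = 2 the two memberships sum to one, so the larger one is always at least 0.5 and no accession is left unassigned, which is consistent with the 79 + 82 = 161 split reported in the Results.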

4.5. GWAS and Candidate Gene

The association analysis of the cowpea dataset was conducted using TASSEL 5.0 software [79]. Three different association analysis models were employed: single-marker regression (SMR), the General Linear Model (GLM), and the Mixed Linear Model (MLM) [80]. Additionally, two other models, Fixed and Random Model Circulating Probability Unification (FarmCPU) [81] and Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK) [78], were run in the R package GAPIT 3 [78]. The models were run with the number of principal components set to two, and pseudo QTNs were employed. Haplotype blocks (HAP) were estimated using PLINK 2.0 software [82] within 100 kb, with a minimum minor allele frequency (MAF) threshold of 0.05. Candidate genes were selected as those located within 30 kb on either side of the significant SNPs in each linkage disequilibrium (LD) region [67]. The candidate genes were retrieved from the annotation of the cowpea reference genome Vigna unguiculata v1.2 in the Phytozome database (https://phytozome.jgi.doe.gov).
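The candidate-gene window search can be sketched as a simple interval overlap test. The block boundaries below are taken from the first and last significant SNP names, on the assumption that the names encode physical positions on chromosome 8; the gene identifiers and coordinates are placeholders and are not the annotated positions of the Vigun08g genes.

```python
def genes_near_block(genes, block_start, block_end, flank=30_000):
    """Return IDs of genes overlapping the block extended by `flank` bp
    on each side.

    genes: iterable of (gene_id, start, end) tuples on one chromosome.
    """
    window_start = block_start - flank
    window_end = block_end + flank
    return [
        gene_id
        for gene_id, start, end in genes
        if end >= window_start and start <= window_end
    ]


# Placeholder annotation (hypothetical IDs and coordinates).
toy_genes = [
    ("geneE", 3_700_000, 3_705_000),  # far upstream, outside the window
    ("geneA", 3_810_000, 3_813_000),
    ("geneB", 3_830_000, 3_832_500),
    ("geneC", 3_846_000, 3_848_000),
    ("geneD", 3_900_000, 3_905_000),  # far downstream, outside the window
]

# Assumed block limits: positions implied by Vu08_3838280 and Vu08_3840193.
candidates = genes_near_block(toy_genes, 3_838_280, 3_840_193)
```

Testing for overlap rather than full containment means a gene that only partly enters the 30 kb flank is still reported. Whether the original analysis used overlap or containment is not stated, so this is a design choice of the sketch.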

4.6. Genomic Prediction

GP was conducted using eight genotype datasets: seven randomly selected SNP sets (7, 200, 500, 1 k, 2 k, 5 k, and 10 k SNPs) and a trait-associated marker set (seven SNPs) from the GWAS results. Genomic estimated breeding values (GEBVs) were computed using five statistical models, namely ridge regression best linear unbiased predictor (rrBLUP) [83], Bayes ridge regression (BRR), ‘Bayes A’ (BA), ‘Bayes B’ (BB), and Bayesian least absolute shrinkage and selection operator (BL) [84]. Five-fold cross-validation, with training/testing sets of 80%/20%, was performed for the genomic prediction study. The association panel was randomly divided into five disjoint groups. A total of 100 replications were conducted at each fold. Means and standard errors corresponding to each fold were computed [85].
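
The cross-validation scheme can be sketched as follows, assuming the standard arrangement in which each of the five disjoint groups is held out once as the testing set and accuracy is the Pearson correlation (r) between predicted and observed values. The function names, the toy single-marker predictor `genotype_class_mean`, and the simulated data are hypothetical and stand in for the rrBLUP and Bayesian models used in the study.

```python
import random
from statistics import mean


def five_fold_groups(n_accessions, seed=0):
    """Randomly split accession indices into five disjoint groups."""
    indices = list(range(n_accessions))
    random.Random(seed).shuffle(indices)
    return [indices[k::5] for k in range(5)]


def pearson_r(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cross_validate(fit_predict, genotypes, phenotypes, seed=0):
    """Return one prediction accuracy (r) per held-out group."""
    accuracies = []
    for test_idx in five_fold_groups(len(phenotypes), seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(phenotypes)) if i not in held_out]
        predicted = fit_predict(
            [genotypes[i] for i in train_idx],
            [phenotypes[i] for i in train_idx],
            [genotypes[i] for i in test_idx],
        )
        observed = [phenotypes[i] for i in test_idx]
        accuracies.append(pearson_r(predicted, observed))
    return accuracies


def genotype_class_mean(train_x, train_y, test_x):
    """Toy predictor: mean training phenotype per first-marker genotype."""
    by_class = {}
    for markers, value in zip(train_x, train_y):
        by_class.setdefault(markers[0], []).append(value)
    return [mean(by_class[markers[0]]) for markers in test_x]


# Simulated panel: 161 accessions, one marker that fully determines the trait.
genotypes = [[i % 3] for i in range(161)]
phenotypes = [22.0 + 2.0 * g[0] for g in genotypes]
accuracies = cross_validate(genotype_class_mean, genotypes, phenotypes, seed=1)
print([round(r, 2) for r in accuracies])
```

Repeating the call with different `seed` values would correspond to replicated random partitions, from which a mean and standard error of r could be summarized.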

5. Conclusions

This study utilized GWAS to identify seven significant SNP markers located at a single locus on chromosome 8 associated with seed protein content. Further analysis revealed that the gene Vigun08g039200, annotated as a protein of the thioredoxin superfamily, plays a critical role in improving seed protein content and nutritional quality. To assess the accuracy of predicting seed protein content in cowpea, a GP approach was employed. The GP results showed that the accuracies of predicting seed protein content varied between 42.9% and 52.1%, depending on the model used. The findings suggest that GP is a useful tool for breeders to predict the selection accuracy of complex traits such as seed protein content in cowpea. Moreover, this approach may help expedite the breeding cycle by enabling early prediction of individual performance before phenotyping. Overall, these results provide practical implications for cowpea breeding programs seeking to enhance seed protein content and nutritional quality.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/plants12142705/s1, Figure S1: The kinship plot of 161 accessions. The heatmap of the values in the kinship matrix was created using GAPIT 3; Figure S2: The Manhattan plots for cowpea protein contents using three GWAS models: Mixed Linear Model (MLM), Generalized Linear Model (GLM), and single-marker regression (SMR) by TASSEL; Table S1: Protein contents and structure proportion in K = 2, among 161 cowpea accessions; Table S2: Prediction accuracy (PA) for seed protein content with 8 different SNP number sets from 7 SNPs to 10,000 SNPs with five genomic prediction models: ridge regression best linear unbiased predictor (rrBLUP), Bayes ridge regression (BRR), ‘Bayes A’ (BA), ‘Bayes B’ (BB), and Bayesian least absolute shrinkage and selection operator (BL), based on randomly selected SNP sets and GWAS-based markers.

Author Contributions

Conceptualization, A.S.; methodology, A.S.; software, A.S. and H.X.; validation, A.S. and H.X.; formal analysis, A.S., H.X. and Y.C.; investigation, H.X. and Y.C.; resources, A.S.; writing, H.X. and Y.C.; review and editing, A.S., B.M., S.T., G.B., H.X., I.A., K.C., C.B., T.M.P. and W.R.; visualization, Y.C., H.X., and A.S.; supervision, A.S., H.X. and B.M.; project administration, A.S., H.X. and S.T.; funding acquisition, A.S. and S.T.; cowpea germplasm providing, S.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the USDA Crop Germplasm Committees program for Vigna germplasm evaluation with Agreement Number/FAIN: 58-6046-6-004, Project Number: 6046-21000-011-22S, and Accession No.: 431146, and by the USDA National Institute of Food and Agriculture Hatch project accession numbers 1002423 and 1017337.

Data Availability Statement

The data that support the findings of this study are available in the Supplementary Material.

Acknowledgments

This research was funded by the USDA Crop Germplasm Committees program for Vigna germplasm evaluation. The authors are grateful to the scientists who have contributed to this project, and to the reviewers and editors for their constructive review.

Conflicts of Interest

The authors declare that they have no conflict of interest.

References

  1. Gondwe, T.M.; Alamu, E.O.; Mdziniso, P.; Maziya-Dixon, B. Cowpea (Vigna unguiculata (L.) Walp) for Food Security: An Evaluation of End-User Traits of Improved Varieties in Swaziland. Sci. Rep. 2019, 9, 15991.
  2. Jayathilake, C.; Visvanathan, R.; Deen, A.; Bangamuwage, R.; Jayawardana, B.C.; Nammi, S.; Liyanage, R. Cowpea: An Overview on Its Nutritional Facts and Health Benefits. J. Sci. Food Agric. 2018, 98, 4793–4806.
  3. Ehlers, J.D.; Hall, A.E. Cowpea (Vigna unguiculata L. Walp.). Field Crops Res. 1997, 53, 187–204.
  4. Jadhav, A.A.; Rayate, S.J.; Mhase, L.B.; Thudi, M.; Chitikineni, A.; Harer, P.N.; Jadhav, A.S.; Varshney, R.K.; Kulwal, P.L. Marker-Trait Association Study for Protein Content in Chickpea (Cicer arietinum L.). J. Genet. 2015, 94, 279–286.
  5. Seidu, K.T.; Osundahunsi, O.F.; Olaleye, M.T.; Oluwalana, I.B. Amino Acid Composition, Mineral Contents and Protein Solubility of Some Lima Bean (Phaseolus lunatus L. Walp) Seeds Coat. Food Res. Int. 2015, 73, 130–134.
  6. Mekonnen, T.W.; Gerrano, A.S.; Mbuma, N.W.; Labuschagne, M.T. Breeding of Vegetable Cowpea for Nutrition and Climate Resilience in Sub-Saharan Africa: Progress, Opportunities, and Challenges. Plants 2022, 11, 1583.
  7. Penchalaraju, M.; John Don Bosco, S. Legume Protein Concentrates from Green Gram, Cowpea, and Horse Gram. J. Food Process Preserv. 2022, 46, e16477.
  8. Prinyawiwatkul, W.; McWatters, K.H.; Beuchat, L.R.; Phillips, R.D.; Uebersak, M.A. Cowpea Flour: A Potential Ingredient in Food Products. Crit. Rev. Food Sci. Nutr. 1996, 36, 413–436.
  9. Maleki, G.; Shadordizadeh, T.; Mozafari, M.R.; Attar, F.R.; Hesarinejad, M.A. Physicochemical and Nutritional Characteristics of Nutrition Bar Fortified with Cowpea Protein. J. Food Meas. Charact. 2023, 17, 2010–2015.
  10. Owade, J.O.; Abong’, G.; Okoth, M.; Mwang’ombe, A.W. A Review of the Contribution of Cowpea Leaves to Food and Nutrition Security in East Africa. Food Sci. Nutr. 2020, 8, 36–47.
  11. Ddamulira, G.; Santos, C.A.F.; Obuo, P.; Alanyo, M.; Lwanga, C.K. Grain Yield and Protein Content of Brazilian Cowpea Genotypes under Diverse Ugandan Environments. Am. J. Plant Sci. 2015, 6, 2074–2084.
  12. Weng, Y.; Shi, A.; Ravelombola, W.S.; Yang, W.; Qin, J.; Motes, D.; Moseley, D.O.; Chen, P. A Rapid Method for Measuring Seed Protein Content in Cowpea (Vigna unguiculata (L.) Walp). Am. J. Plant Sci. 2017, 8, 2387–2396.
  13. Boukar, O.; Massawe, F.; Muranaka, S.; Franco, J.; Maziya-Dixon, B.; Singh, B.; Fatokun, C. Evaluation of Cowpea Germplasm Lines for Protein and Mineral Concentrations in Grains. Plant Genet. Resour. 2011, 9, 515–522.
  14. Singh, B.; Ehlers, J.; Sharma, B.; Freire Filho, F. Recent Progress in Cowpea Breeding. In Challenges and Opportunities for Enhancing Sustainable Cowpea Production; Fatokun, C., Tarawali, S., Singh, B., Kormawa, P., Eds.; IITA: Ibadan, Nigeria, 2002; pp. 22–40.
  15. Raina, A.; Laskar, R.A.; Tantray, Y.R.; Khursheed, S.; Wani, M.R.; Khan, S. Characterization of Induced High Yielding Cowpea Mutant Lines Using Physiological, Biochemical and Molecular Markers. Sci. Rep. 2020, 10, 3687.
  16. Horn, L.N.; Shimelis, H. Production Constraints and Breeding Approaches for Cowpea Improvement for Drought Prone Agro-Ecologies in Sub-Saharan Africa. Ann. Agric. Sci. 2020, 65, 83–91.
  17. Kongjaimun, A.; Kaga, A.; Tomooka, N.; Somta, P.; Vaughan, D.A.; Srinives, P. The Genetics of Domestication of Yardlong Bean, Vigna unguiculata (L.) Walp. Ssp. Unguiculata Cv.-Gr. Sesquipedalis. Ann. Bot. 2012, 109, 1185.
  18. Andargie, M.; Pasquet, R.S.; Gowda, B.S.; Muluvi, G.M.; Timko, M.P. Construction of a SSR-Based Genetic Map and Identification of QTL for Domestication Traits Using Recombinant Inbred Lines from a Cross between Wild and Cultivated Cowpea (V. unguiculata (L.) Walp.). Mol. Breed. 2011, 28, 413–420.
  19. Fatokun, C.A.; Menancio-Hautea, D.I.; Danesh, D.; Young, N.D. Evidence for Orthologous Seed Weight Genes in Cowpea and Mung Bean Based on RFLP Mapping. Genetics 1992, 132, 841–846.
  20. Muchero, W.; Ehlers, J.D.; Roberts, P.A. Seedling Stage Drought-Induced Phenotypes and Drought-Responsive Genes in Diverse Cowpea Genotypes. Crop Sci. 2008, 48, 541–552.
  21. Uffelmann, E.; Huang, Q.Q.; Munung, N.S.; de Vries, J.; Okada, Y.; Martin, A.R.; Martin, H.C.; Lappalainen, T.; Posthuma, D. Genome-Wide Association Studies. Nat. Rev. Methods Primers 2021, 1, 59.
  22. Huynh, B.L.; Ehlers, J.D.; Huang, B.E.; Muñoz-Amatriaín, M.; Lonardi, S.; Santos, J.R.P.; Ndeve, A.; Batieno, B.J.; Boukar, O.; Cisse, N.; et al. A Multi-Parent Advanced Generation Inter-Cross (MAGIC) Population for Genetic Analysis and Improvement of Cowpea (Vigna unguiculata L. Walp.). Plant J. 2018, 93, 1129–1142.
  23. Burridge, J.D.; Schneider, H.M.; Huynh, B.-L.; Roberts, P.A.; Bucksch, A.; Lynch, J.P. Genome-Wide Association Mapping and Agronomic Impact of Cowpea Root Architecture. Theor. Appl. Genet. 2017, 130, 419–431.
  24. Paudel, D.; Dareus, R.; Rosenwald, J.; Muñoz-Amatriaín, M.; Rios, E.F. Genome-Wide Association Study Reveals Candidate Genes for Flowering Time in Cowpea (Vigna unguiculata [L.] Walp.). Front. Genet. 2021, 12, 667038.
  25. Wu, X.; Sun, T.; Xu, W.; Sun, Y.; Wang, B.; Wang, Y.; Li, Y.; Wang, J.; Wu, X.; Lu, Z.; et al. Unraveling the Genetic Architecture of Two Complex, Stomata-Related Drought-Responsive Traits by High-Throughput Physiological Phenotyping and GWAS in Cowpea (Vigna unguiculata L. Walp). Front. Genet. 2021, 12, 743758.
  26. Kpoviessi, A.D.; Agbahoungba, S.; Agoyi, E.E.; Nuwamanya, E.; Assogbadjo, A.E.; Chougourou, D.C.; Adoukonou-Sagbadja, H. Primary and Secondary Metabolite Compounds in Cowpea Seeds Resistant to the Cowpea Bruchid [Callosobruchus maculatus (F.)] in Postharvest Storage. J. Stored Prod. Res. 2021, 93, 101858.
  27. Olatoye, M.O.; Hu, Z.; Aikpokpodion, P.O. Epistasis Detection and Modeling for Genomic Selection in Cowpea (Vigna unguiculata L. Walp.). Front. Genet. 2019, 10, 677.
  28. Fernandes Santos, C.A.; Campos da Costa, D.C.; Roberto da Silva, W.; Boiteux, L.S. Genetic Analysis of Total Seed Protein Content in Two Cowpea Crosses. Crop Sci. 2012, 52, 2501–2506.
  29. Ravelombola, W.S.; Shi, A.; Weng, Y.; Motes, D.; Chen, P.; Srivastava, V.; Wingfield, C. Evaluation of Total Seed Protein Content in Eleven Arkansas Cowpea (Vigna unguiculata (L.) Walp.) Lines. Am. J. Plant Sci. 2016, 7, 2288–2296.
  30. Xiong, H.; Shi, A.; Mou, B.; Qin, J.; Motes, D.; Lu, W.; Ma, J.; Weng, Y.; Yang, W.; Wu, D. Genetic Diversity and Population Structure of Cowpea (Vigna unguiculata L. Walp). PLoS ONE 2016, 11, e0160941.
  31. Xiong, H.; Qin, J.; Shi, A.; Mou, B.; Wu, D.; Sun, J.; Shu, X.; Wang, Z.; Lu, W.; Ma, J. Genetic Differentiation and Diversity upon Genotype and Phenotype in Cowpea (Vigna unguiculata L. Walp.). Euphytica 2018, 214, 4.
  32. Boukar, O.; Belko, N.; Chamarthi, S.; Togola, A.; Batieno, J.; Owusu, E.; Haruna, M.; Diallo, S.; Umar, M.L.; Olufajo, O.; et al. Cowpea (Vigna unguiculata): Genetics, Genomics and Breeding. Plant Breed. 2019, 138, 415–424.
  33. Phillips, R.D.; McWatters, K.H.; Chinnan, M.S.; Hung, Y.C.; Beuchat, L.R.; Sefa-Dedeh, S.; Sakyi-Dawson, E.; Ngoddy, P.; Nnanyelugo, D.; Enwere, J.; et al. Utilization of Cowpeas for Human Food. Field Crops Res. 2003, 82, 193–213.
  34. Gerrano, A.S.; Jansen van Rensburg, W.S.; Venter, S.L.; Shargie, N.G.; Amelework, B.A.; Shimelis, H.A.; Labuschagne, M.T. Selection of Cowpea Genotypes Based on Grain Mineral and Total Protein Content. Acta Agric. Scand. B Soil. Plant Sci. 2019, 69, 155–166.
  35. Asante, I.; Adu-Dapaah, H.; Addison, P. Seed Weight and Protein and Tannin Contents of 32 Cowpea Accessions in Ghana. Trop. Sci. 2004, 44, 77–79.
  36. Ajeigbe, H.A.; Ihedioha, D.; Chikoye, D. Variation in Physico-Chemical Properties of Seed of Selected Improved Varieties of Cowpea as It Relates to Industrial Utilization of the Crop. Afr. J. Biotechnol. 2008, 7, 3642–3647.
  37. Nielsen, S.S.; Brandt, W.E.; Singh, B.B. Genetic Variability for Nutritional Composition and Cooking Time of Improved Cowpea Lines. Crop Sci. 1993, 33, 469–472.
  38. Jean Baptiste, N.T.; Joseph, M.B.; Antoine, M.N.; Nicolas, Y.N.; Emmanuel, Y. Genetic Analysis of Seed Proteins Contents in Cowpea (Vigna unguiculata L. Walp.). Afr. J. Biotechnol. 2011, 10, 3077–3086.
  39. Emebiri, L.C. Inheritance of Protein Content in Seeds of Selected Crosses of Cowpea (Vigna unguiculata). J. Sci. Food Agric. 1991, 54, 1–7.
  40. Pandey, M.K.; Roorkiwal, M.; Singh, V.K.; Ramalingam, A.; Kudapa, H.; Thudi, M.; Chitikineni, A.; Rathore, A.; Varshney, R.K. Emerging Genomic Tools for Legume Breeding: Current Status and Future Prospects. Front. Plant Sci. 2016, 7, 455.
  41. Cantor, R.M.; Lange, K.; Sinsheimer, J.S. Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application. Am. J. Hum. Genet. 2010, 86, 6–22.
  42. Korte, A.; Farlow, A. The Advantages and Limitations of Trait Analysis with GWAS: A Review. Plant Methods 2013, 9, 29.
  43. Begum, F.; Ghosh, D.; Tseng, G.C.; Feingold, E. Comprehensive Literature Review and Statistical Considerations for GWAS Meta-Analysis. Nucleic Acids Res. 2012, 40, 3777–3784.
  44. Tibbs Cortes, L.; Zhang, Z.; Yu, J. Status and Prospects of Genome-Wide Association Studies in Plants. Plant Genome 2021, 14, e20077.
  45. Priyanatha, C.; Rajcan, I. Phenotypic Evaluation of Canadian × Chinese Elite Germplasm in a Diversity Panel for Seed Yield and Seed Quality Traits. Can. J. Plant Sci. 2022, 102, 1032–1039.
  46. Hwang, E.Y.; Song, Q.; Jia, G.; Specht, J.E.; Hyten, D.L.; Costa, J.; Cregan, P.B. A Genome-Wide Association Study of Seed Protein and Oil Content in Soybean. BMC Genom. 2014, 15, 1.
  47. Zhang, S.; Hao, D.; Zhang, S.; Zhang, D.; Wang, H.; Du, H.; Kan, G.; Yu, D. Genome-Wide Association Mapping for Protein, Oil and Water-Soluble Protein Contents in Soybean. Mol. Genet. Genom. 2021, 296, 91–102.
  48. Lee, S.; Van, K.; Sung, M.; Nelson, R.; LaMantia, J.; McHale, L.K.; Mian, M.A.R. Genome-Wide Association Study of Seed Protein, Oil and Amino Acid Contents in Soybean from Maturity Groups I to IV. Theor. Appl. Genet. 2019, 132, 1639–1659.
  49. Upadhyaya, H.D.; Bajaj, D.; Narnoliya, L.; Das, S.; Kumar, V.; Gowda, C.L.L.; Sharma, S.; Tyagi, A.K.; Parida, S.K. Genome-Wide Scans for Delineation of Candidate Genes Regulating Seed-Protein Content in Chickpea. Front. Plant Sci. 2016, 7, 302.
  50. Shi, H.; Mancuso, N.; Spendlove, S.; Pasaniuc, B. Local Genetic Correlation Gives Insights into the Shared Genetic Architecture of Complex Traits. Am. J. Hum. Genet. 2017, 101, 737–751.
  51. Schaid, D.J.; Chen, W.; Larson, N.B. From Genome-Wide Associations to Candidate Causal Variants by Statistical Fine-Mapping. Nat. Rev. Genet. 2018, 19, 491–504.
  52. Lu, J.; Holmgren, A. The Thioredoxin Superfamily in Oxidative Protein Folding. Antioxid. Redox Signal 2014, 21, 457–470.
  53. Hägglund, P.; Finnie, C.; Yano, H.; Shahpiri, A.; Buchanan, B.B.; Henriksen, A.; Svensson, B. Seed Thioredoxin h. Biochim. Biophys. Acta (BBA) - Proteins Proteom. 2016, 1864, 974–982.
  54. Gelhaye, E.; Rouhier, N.; Jacquot, J.-P. The Thioredoxin h System of Higher Plants. Plant Physiol. Biochem. 2004, 42, 265–271.
  55. Lindahl, M.; Mata-Cabana, A.; Kieselbach, T. The Disulfide Proteome and Other Reactive Cysteine Proteomes: Analysis and Functional Significance. Antioxid. Redox Signal 2011, 14, 2581–2642.
  56. Colville, L.; Kranner, I. Desiccation Tolerant Plants as Model Systems to Study Redox Regulation of Protein Thiols. Plant Growth Regul. 2010, 62, 241–255.
  57. Lockwood, T.D. Redox Control of Protein Degradation. Antioxid. Redox Signal 2000, 2, 851–878.
  58. Kawakami, A.; Yoshida, M. Fructan:Fructan 1-Fructosyltransferase, a Key Enzyme for Biosynthesis of Graminan Oligomers in Hardened Wheat. Planta 2005, 223, 90–104.
  59. Guo, M.; Liu, J.H.; Ma, X.; Luo, D.X.; Gong, Z.H.; Lu, M.H. The Plant Heat Stress Transcription Factors (HSFS): Structure, Regulation, and Function in Response to Abiotic Stresses. Front. Plant Sci. 2016, 7, 114.
  60. Márquez-López, R.E.; Loyola-Vargas, V.M.; Santiago-García, P.A. Interaction between Fructan Metabolism and Plant Growth Regulators. Planta 2022, 255, 49.
  61. Bolouri-Moghaddam, M.R.; Le Roy, K.; Xiang, L.; Rolland, F.; Van Den Ende, W. Sugar Signalling and Antioxidant Network Connections in Plant Cells. FEBS J. 2010, 277, 2022–2037.
  62. Nishizawa, A.; Yabuta, Y.; Yoshida, E.; Maruta, T.; Yoshimura, K.; Shigeoka, S. Arabidopsis Heat Shock Transcription Factor A2 as a Key Regulator in Response to Several Types of Environmental Stress. Plant J. 2006, 48, 535–547.
  63. Xu, H.; Li, X.; Zhang, H.; Wang, L.; Zhu, Z.; Gao, J.; Li, C.; Zhu, Y. High Temperature Inhibits the Accumulation of Storage Materials by Inducing Alternative Splicing of OsbZIP58 during Filling Stage in Rice. Plant Cell Environ. 2020, 43, 1879–1896.
  64. Jannink, J.-L.; Lorenz, A.J.; Iwata, H. Genomic Selection in Plant Breeding: From Theory to Practice. Brief. Funct. Genom. 2010, 9, 166–177.
  65. Crossa, J.; Pérez-Rodríguez, P.; Cuevas, J.; Montesinos-López, O.; Jarquín, D.; de los Campos, G.; Burgueño, J.; González-Camacho, J.M.; Pérez-Elizalde, S.; Beyene, Y.; et al. Genomic Selection in Plant Breeding: Methods, Models, and Perspectives. Trends Plant Sci. 2017, 22, 961–975.
  66. Goddard, M.E.; Hayes, B.J. Genomic Selection. J. Anim. Breed. Genet. 2007, 124, 323–330.
  67. Ravelombola, W.; Shi, A.; Huynh, B.L. Loci Discovery, Network-Guided Approach, and Genomic Prediction for Drought Tolerance Index in a Multi-Parent Advanced Generation Intercross (MAGIC) Cowpea Population. Hortic. Res. 2021, 8, 24.
  68. Kristensen, P.S.; Jahoor, A.; Andersen, J.R.; Cericola, F.; Orabi, J.; Janss, L.L.; Jensen, J. Genome-Wide Association Studies and Comparison of Models and Cross-Validation Strategies for Genomic Prediction of Quality Traits in Advanced Winter Wheat Breeding Lines. Front. Plant Sci. 2018, 9, 69.
  69. Jarquin, D.; Specht, J.; Lorenz, A. Prospects of Genomic Prediction in the USDA Soybean Germplasm Collection: Historical Data Creates Robust Models for Enhancing Selection of Accessions. G3 Genes Genomes Genet. 2016, 6, 2329–2341.
  70. Stewart-Brown, B.B.; Song, Q.; Vaughn, J.N.; Li, Z. Genomic Selection for Yield and Seed Composition Traits Within an Applied Soybean Breeding Program. G3 Genes Genomes Genet. 2019, 9, 2253–2265.
  71. Lan, S.; Zheng, C.; Hauck, K.; McCausland, M.; Duguid, S.D.; Booker, H.M.; Cloutier, S.; You, F.M. Genomic Prediction Accuracy of Seven Breeding Selection Traits Improved by QTL Identification in Flax. Int. J. Mol. Sci. 2020, 21, 1577.
  72. Weng, Y.; Qin, J.; Eaton, S.; Yang, Y.; Ravelombola, W.S.; Shi, A. Evaluation of Seed Protein Content in USDA Cowpea Germplasm. HortScience 2019, 54, 814–817.
  73. Isaac, R.A.; Johnson, W.C. Determination of Total Nitrogen in Plant Tissue, Using a Block Digestor. J. AOAC Int. 1976, 59, 98–100.
  74. Moore, J.C.; DeVries, J.W.; Lipp, M.; Griffiths, J.C.; Abernethy, D.R. Total Protein Methods and Their Potential Utility to Reduce the Risk of Food Protein Adulteration. Compr. Rev. Food Sci. Food Saf. 2010, 9, 330–357.
  75. Rogers, S.O.; Bendich, A.J. Extraction of DNA from Milligram Amounts of Fresh, Herbarium and Mummified Plant Tissues. Plant Mol. Biol. 1985, 5, 69–76.
  76. Lonardi, S.; Muñoz-Amatriaín, M.; Liang, Q.; Shu, S.; Wanamaker, S.I.; Lo, S.; Tanskanen, J.; Schulman, A.H.; Zhu, T.; Luo, M.C.; et al. The Genome of Cowpea (Vigna unguiculata [L.] Walp.). Plant J. 2019, 98, 767–782.
  77. Frichot, E.; François, O. LEA: An R Package for Landscape and Ecological Association Studies. Methods Ecol. Evol. 2015, 6, 925–929.
  78. Wang, J.; Zhang, Z. GAPIT Version 3: Boosting Power and Accuracy for Genomic Association and Prediction. Genom. Proteom. Bioinform. 2021, 19, 629–640.
  79. Bradbury, P.J.; Zhang, Z.; Kroon, D.E.; Casstevens, T.M.; Ramdoss, Y.; Buckler, E.S. TASSEL: Software for Association Mapping of Complex Traits in Diverse Samples. Bioinformatics 2007, 23, 2633–2635.
  80. Yu, J.; Pressoir, G.; Briggs, W.H.; Bi, I.V.; Yamasaki, M.; Doebley, J.F.; McMullen, M.D.; Gaut, B.S.; Nielsen, D.M.; Holland, J.B.; et al. A Unified Mixed-Model Method for Association Mapping That Accounts for Multiple Levels of Relatedness. Nat. Genet. 2005, 38, 203–208.
  81. Liu, X.; Huang, M.; Fan, B.; Buckler, E.S.; Zhang, Z. Iterative Usage of Fixed and Random Effect Models for Powerful and Efficient Genome-Wide Association Studies. PLoS Genet. 2016, 12, e1005767.
  82. Purcell, S.; Neale, B.; Todd-Brown, K.; Thomas, L.; Ferreira, M.A.R.; Bender, D.; Maller, J.; Sklar, P.; De Bakker, P.I.W.; Daly, M.J.; et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am. J. Hum. Genet. 2007, 81, 559–575.
  83. Endelman, J.B. Ridge Regression and Other Kernels for Genomic Selection with R Package RrBLUP. Plant Genome 2011, 4, 250–255.
  84. Heslot, N.; Yang, H.-P.; Sorrells, M.E.; Jannink, J.-L. Genomic Selection in Plant Breeding: A Comparison of Models. Crop Sci. 2012, 52, 146–160.
  85. Shikha, M.; Kanika, A.; Rao, A.R.; Mallikarjuna, M.G.; Gupta, H.S.; Nepolean, T. Genomic Selection for Drought Tolerance Using Genome-Wide SNPs in Maize. Front. Plant Sci. 2017, 8, 550.
Figure 1. The distribution and accumulation density of the protein contents in 161 cowpea accessions. Blue: low protein contents, yellow: medium protein contents, red: high protein contents.
Figure 2. The distribution of 110,155 SNPs among the 11 chromosomes of cowpea within 1 Mb size.
Figure 3. The structure, principal component, and phylogenetic analysis of 161 cowpea accessions based on 110,155 SNPs. (A) Classification of 161 accessions into two groups (K = 2) using STRUCTURE. The assignment of accessions to different populations is color-coded. The X-axis represents the 161 accessions, and the value on the Y-axis shows the likelihood of every individual belonging to one of the two colored subpopulations, Q1 = red, Q2 = cyan; (B) scatter diagram of PCA for 161 accessions labeled by Q groups with the colors in (A); (C) phylogenetic analysis of the 161 accessions with the corresponding labels as Q group colors in (A).
Figure 4. The QQ plot (A), haplotype block (B), and Manhattan plots (C) for cowpea protein contents using four GWAS models: Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK), Fixed and Random Model Circulating Probability Unification (FarmCPU), the Mixed Linear Model (MLM), and the Generalized Linear Model (GLM) by GAPIT 3.
Figure 5. Genomic prediction (GP) accuracy (r-value) for protein contents using five GP models, ridge regression best linear unbiased predictor (rrBLUP), Bayes ridge regression (BRR), ‘Bayes A’ (BA), ‘Bayes B’ (BB), and Bayesian least absolute shrinkage and selection operator (BL), on GWAS-based markers (7 m = violet) and randomly selected SNP sets (7 r = red, 200 r = tan, 500 r = kelly, 1000 r = green, 2000 r = teal, 5000 r = blue, and 10,000 r = purple).
Table 1. SNP markers associated with seed protein content in cowpea, based on four models, BLINK, FarmCPU, MLM, and GLM, in GAPIT 3 and three models, MLM, GLM, and SMR, in Tassel 5, and t-test.

| SNP | Chr | Position | GAPIT 3 −Log(p): BLINK | GAPIT 3 −Log(p): FarmCPU | GAPIT 3 −Log(p): MLM | GAPIT 3 −Log(p): GLM | Tassel −Log(p): SMR | Tassel −Log(p): GLM | Tassel −Log(p): MLM | t-Test −Log(p) | Tassel Rsq: SMR | Tassel Rsq: GLM | Tassel Rsq: MLM | High Protein Content Allele | Low Protein Content Allele | MAF (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Vu08_3838280 | 8 | 3,838,280 | 0.04 | 0.28 | 3.22 | 7.89 | 10.59 | 9.19 | 3.35 | 13.94 | 26.56 | 21.56 | 10.21 | T | A | 47.52 |
| Vu08_3838282 | 8 | 3,838,282 | 0.04 | 0.28 | 3.22 | 7.89 | 10.59 | 9.19 | 3.35 | 13.94 | 26.56 | 21.56 | 10.21 | T | A | 47.52 |
| Vu08_3838296 | 8 | 3,838,296 | 0.16 | 0.19 | 3.33 | 8.14 | 10.92 | 9.54 | 3.42 | 14.28 | 27.26 | 22.26 | 10.44 | C | G | 46.89 |
| Vu08_3839577 | 8 | 3,839,577 | 10.78 | 6.60 | 3.62 | 8.34 | 10.95 | 9.63 | 3.20 | 14.50 | 27.33 | 22.44 | 9.755 | G | A | 46.58 |
| Vu08_3839579 | 8 | 3,839,579 | 0.00 | 0.00 | 3.62 | 8.34 | 10.95 | 9.63 | 3.20 | 14.50 | 27.33 | 22.44 | 9.755 | T | C | 46.58 |
| Vu08_3840180 | 8 | 3,840,180 | 0.35 | 0.25 | 3.59 | 6.55 | 9.04 | 6.95 | 3.01 | 12.64 | 23.17 | 16.82 | 9.144 | A | G | 44.10 |
| Vu08_3840193 | 8 | 3,840,193 | 0.48 | 0.15 | 3.75 | 7.14 | 9.68 | 7.98 | 3.60 | 13.09 | 24.59 | 19.03 | 11.04 | C | A | 49.69 |
Table 2. Functional annotation of the genes within the 50 kb genomic region harboring the significant SNPs that associate with seed protein content.

| Gene | Function | Chr | Gene Start Pos | Gene End Pos | SNP | SNP Chr | SNP Pos | Distance (bp) from Gene Start | Distance (bp) from Gene End |
|---|---|---|---|---|---|---|---|---|---|
| Vigun08g038900 | Fructan fructosyltransferase | Vu08 | 3,789,846 | 3,793,119 | Vu08_3839577 | 8 | 3,839,577 | −49,731 | −46,458 |
| Vigun08g039000 | Fructan fructosyltransferase | Vu08 | 3,799,649 | 3,802,436 | Vu08_3839577 | 8 | 3,839,577 | −39,928 | −37,141 |
| Vigun08g039100 | Fructan fructosyltransferase | Vu08 | 3,814,161 | 3,817,053 | Vu08_3839577 | 8 | 3,839,577 | −25,416 | −22,524 |
| Vigun08g039200 | Thioredoxin superfamily protein, OsGrx_C15 - Glutaredoxin subgroup III, expressed | Vu08 | 3,832,765 | 3,834,779 | Vu08_3839577 | 8 | 3,839,577 | −6,812 | −4,798 |
| Vigun08g039300 | Heat shock transcription factor A2 | Vu08 | 3,846,346 | 3,848,728 | Vu08_3839577 | 8 | 3,839,577 | 6,769 | 9,151 |
| Vigun08g039400 | Thromboxane-A synthase/Thromboxane synthetase | Vu08 | 3,869,695 | 3,873,523 | Vu08_3839577 | 8 | 3,839,577 | 30,118 | 33,946 |
| Vigun08g039500 | IQ calmodulin-binding motif domain containing protein, expressed | Vu08 | 3,873,436 | 3,876,766 | Vu08_3839577 | 8 | 3,839,577 | 33,859 | 37,189 |
| Vigun08g039600 | LOC_Os06g05730, expressed protein | Vu08 | 3,887,821 | 3,888,663 | Vu08_3839577 | 8 | 3,839,577 | 48,244 | 49,086 |
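
The distances in Table 2 follow from subtracting the SNP position from each gene's start and end coordinates, so negative values mark genes upstream of the SNP. The Python sketch below reproduces that arithmetic with the coordinates from Table 2 and lists the genes falling inside a given window. The function names are hypothetical, and both the 30 kb window from the methods and the 50 kb window from the table caption are shown without assuming which one was applied at each step.

```python
SNP_POS = 3_839_577  # Vu08_3839577, position taken from Table 2

# Gene start/end coordinates on Vu08, taken from Table 2.
GENES = {
    "Vigun08g038900": (3_789_846, 3_793_119),
    "Vigun08g039000": (3_799_649, 3_802_436),
    "Vigun08g039100": (3_814_161, 3_817_053),
    "Vigun08g039200": (3_832_765, 3_834_779),
    "Vigun08g039300": (3_846_346, 3_848_728),
    "Vigun08g039400": (3_869_695, 3_873_523),
    "Vigun08g039500": (3_873_436, 3_876_766),
    "Vigun08g039600": (3_887_821, 3_888_663),
}


def signed_distances(start, end, snp_pos):
    """Signed distance (bp) from the SNP to the gene start and gene end."""
    return start - snp_pos, end - snp_pos


def gap_to_snp(start, end, snp_pos):
    """Unsigned gap (bp) between the SNP and the nearest gene boundary."""
    if start <= snp_pos <= end:
        return 0  # SNP lies inside the gene
    return min(abs(start - snp_pos), abs(end - snp_pos))


def genes_within(genes, snp_pos, window):
    """Names of genes whose nearest boundary is within `window` bp of the SNP."""
    return [name for name, (start, end) in genes.items()
            if gap_to_snp(start, end, snp_pos) <= window]


nearest = min(GENES, key=lambda name: gap_to_snp(*GENES[name], SNP_POS))
print(nearest, signed_distances(*GENES[nearest], SNP_POS))
print(genes_within(GENES, SNP_POS, 30_000))
print(len(genes_within(GENES, SNP_POS, 50_000)))
```
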
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chen, Y.; Xiong, H.; Ravelombola, W.; Bhattarai, G.; Barickman, C.; Alatawi, I.; Phiri, T.M.; Chiwina, K.; Mou, B.; Tallury, S.; et al. A Genome-Wide Association Study Reveals Region Associated with Seed Protein Content in Cowpea. Plants 2023, 12, 2705. https://doi.org/10.3390/plants12142705
