Genetic Diversity among Cowpea ( Vigna unguiculata (L.) Walp.) Landraces Suggests Central Mozambique as an Important Hotspot of Variation

: Cowpea is a multiple-purpose drought-tolerant leguminous pulse crop grown in several dry tropical areas. Its domestication center is thought to be East or West Africa, where a high level of genetic diversity is apparently still found. However, detailed genetic information is lacking in many African countries, limiting the success of breeding programs. In this work, we assessed the genetic variation and gene ﬂow in 59 Vigna unguiculata (cowpea) accessions from 10 landraces spanning across six agro-ecological zones of Mozambique, based on nuclear microsatellite markers. The results revealed the existence of high genetic diversity between the landraces, even in comparison to other world regions. Four genetic groups were found, with no speciﬁc geographic pattern, suggesting the presence of gene ﬂow between landraces. In comparison, the two commercial varieties had lower values of genetic diversity, although still close to the ones found in local landraces. The high genetic diversity found in Mozambique sustains the importance of local genetic resources and farm protection to enhance genetic diversity in modern varieties of cowpea worldwide.


Introduction
Cowpea (Vigna unguiculata (L.) Walp.), also known as black eye pea, is a major annual pulse crop mostly grown in dry tropical areas of Latin America, South Asia, and Africa [1]. It is cultivated mainly for its seeds, which have a high dry matter content of proteins (20-32%) and carbohydrates (50-60%). the aim of this study was to assess the genetic diversity of 59 cowpea accessions from 10 landraces across six agro-ecological zones of Mozambique, using single sequence repeat (SSR) markers.

Plant Material
Fifty-nine cowpea accessions corresponding to 10 landraces were sampled in six agro-ecological zones (AEZs) in the provinces of Manica, Sofala, and Zambezia, where cowpea is grown as an integral component of local cereal-legume cropping systems ( Figure 1): R3 (North and Central Gaza and Western Inhambane), R4 (medium-altitude areas of Central Mozambique), R5 (low-altitude areas of Sofala and Zambezia), R6 (dry areas of Zambezia and Southern Tete), R7 (mid-altitude areas of Zambezia, Nampula, Tete, Niassa, and Cabo Delgado), and R10 (high-altitude areas of Zambezia, Niassa, Angonia-Maravia, and Manica [21]). Additionally, two widely used commercial cultivars (IT16 and IT18) released by the Mozambican Institute of Agricultural Research (IIAM) and bred through a partnership with the International Institute of Tropical Agriculture (IITA) in Nigeria were also used in this study.
Agronomy 2020, 10, x FOR PEER REVIEW 3 of 15 between previously released and discontinued open pollinated varieties [16], not being subjected to selection over a long period of time [17][18][19]. However, knowledge about their variability is usually limited [20]. Therefore, the aim of this study was to assess the genetic diversity of 59 cowpea accessions from 10 landraces across six agro-ecological zones of Mozambique, using single sequence repeat (SSR) markers.

Plant Material
Fifty-nine cowpea accessions corresponding to 10 landraces were sampled in six agro-ecological zones (AEZs) in the provinces of Manica, Sofala, and Zambezia, where cowpea is grown as an integral component of local cereal-legume cropping systems ( Figure 1): R3 (North and Central Gaza and Western Inhambane), R4 (medium-altitude areas of Central Mozambique), R5 (low-altitude areas of Sofala and Zambezia), R6 (dry areas of Zambezia and Southern Tete), R7 (mid-altitude areas of Zambezia, Nampula, Tete, Niassa, and Cabo Delgado), and R10 (high-altitude areas of Zambezia, Niassa, Angonia-Maravia, and Manica [21]). Additionally, two widely used commercial cultivars (IT16 and IT18) released by the Mozambican Institute of Agricultural Research (IIAM) and bred through a partnership with the International Institute of Tropical Agriculture (IITA) in Nigeria were also used in this study.

DNA Extraction and nSSR Amplification
The 61 samples used in this study were genotyped based on nine polymorphic nuclear single sequence repeats (nSSRs) previously developed by [22]: VuUGM05, VuUGM22, VuUGM31, VuUGM33, VuUGM39, VuUGM40, VuUGM68, VuUGM71, and VuUGM74. Based on an initial survey, we selected these nSSR markers since they produced robust, highly polymorphic amplified bands among the entire Agronomy 2020, 10, 1893 4 of 15 collection of cowpea samples. Total genomic DNA was extracted from 50 mg of ground leaves using the InnuSPEED Plant DNA Kit (Analytik Jena Innuscreen GmbH, Germany) according to the manufacturer's protocol. The average yield and purity were assessed spectrophotometrically by OD230, OD260, and OD280 readings (Nanodrop 2000, Thermo Fisher Scientific, Waltham, MA, USA) and visualized by electrophoresis in 1% agarose gels under UV light. Amplifications were performed in 15 µL reactions containing 1.25U TaKaRa Hot startTaq polymerase, 1X Buffer I, 0.15 mM dNTPs, 0.2 µM Primer F-ROX and R, and 100 ng DNA under the following PCR conditions: an initial denaturation at 95 • C for 5 min, followed by 35 cycles of denaturation at 65 • C (20 s),annealing at 56 • C (30 s), and a final extension at 60 • C for 30 min. Allele sizes were determined using GeneMapper 3.2 (Applied Biosystems; UK).

Genetic Diversity and Population Structure
For each nSSR locus and landrace, genetic diversity was assessed by calculating the total number of alleles (N a ), mean expected heterozygosity (H e ), mean observed heterozygosity (H o ), allelic richness (A R ), and inbreeding coefficient (F IS ) using FSTAT 2.9.3.2 [23]). GenAlEx 6 software was used to estimate the mean expected heterozygosity (H e ) and mean observed heterozygosity (H o ) for each population, as well as the number of private alleles [24]. The selfing rate (s) was estimated as s = 2FIS/(1 + FIS) [25]. Hardy Weinberg Exact (HWE) Tests and linkage disequilibrium among SSR markers were determined using the software program GENEPOP 4.7 [26,27] with dememorization numbers at 10,000 and performing 100,000 iterations for all permutation tests. Significant values were corrected for multiple comparisons by Bonferroni correction [28]. An analysis of variance was used to detect significant differences between sites for the measured genetic values. Grids for all significant genetic parameters were generated in R and were based on a grid with a cell size of 30 s (which corresponds to approximately 1 km in the study area), applying a 1.5-degree circular neighborhood diameter. The circular neighborhood was used to re-sample the genetic composition of a single sample to all surrounding grid cells, with a size of 30 s, within a diameter of 1.5 degrees around its location. In this way, the genetic composition of each sample is representative for the area within the defined buffer zone.

Population Structure and Differentiation
The Bayesian program STRUCTURE v.2.3.4 [29] was used to test whether any discrete genetic structure existed among the landraces and AEZs sampled. The analysis was performed assuming clusters from K = 1 to K = 8, with 10 repetitions per K. Models were run assuming ancestral admixture and correlated allele frequencies with 50,000 burn-in steps, followed by run lengths of 300,000 interactions for each K. The optimum K was determined using STRUCTURE HARVESTER [30], which identifies the optimal K based on both the posterior probability of the data for a given K and the ∆K [31]. To correctly assess the membership proportions (q values) for clusters identified in STRUCTURE, the results of the replicates at the best-fit K were post-processed using CLUMPP 1.1.2 [32]. POPULATION 1.2 [33] was used to calculate the Nei's genetic distance [34] among individuals and to construct an unrooted neighbor-joining (NJ) tree with 1000 bootstrap replicates. A principal component analysis (PCoA) was also constructed in GenAlEx6 [24] to detect the genetic relatedness among individuals based on Nei's genetic distance. We estimated genetic differentiation among locations via an analysis of molecular variance (AMOVA) using ARLEQUIN 3.11 [35]. Molecular variance was quantified among and within landraces, considering AEZs and wild cowpea versus cultivars, via an AMOVA using 10,000 permutations at the 0.95 significance level in ARLEQUIN 3.11 [35].

Spatial Analysis and Genetic Diversity Rarefaction
Grids for genetic parameters were generated in DIVA-GIS version 7.5 (www.diva-gis.org) based on a grid with a cell size of 2.5 min (which corresponds to approximately 4.5 km in the study area) and applying a circular neighborhood with a diameter buffer of one degree (corresponding to approximate Agronomy 2020, 10, 1893 5 of 15 111 km). The circular neighborhood was used to illustrate the allelic composition of each sampled site representative for the area within the defined buffer zone. Genetic diversity rarefaction considered the spatial average of several population parameters such as the number of alleles (N A ), observed heterozygosity (H o ), inbreeding coefficient (F IS ), and selfing rate (s).

Genetic Diversity
The total number of alleles varied between 49 in VuUGM74 and 145 in VuUGM40 (Table 1). For each locus, the observed heterozygosity values (H o ) ranged from 0.014 in VuUGM74 to 1 in VuUGM40, and expected heterozygosity (H e ) varied from 0.016 in VuUGM74 to 0.806 in VuUGM33. F IS values varied between −0.008 and 0.857 (for loci VuUGM68 and VuUGM31, respectively; Table 1) across the loci studied. Table 1. Characteristics and genetic diversity statistics of the nuclear single sequence repeat (nSSR) primers used in the genetic study of Vigna unguiculata. For each locus, the total number of alleles (Na), mean expected heterozygosity (He), mean observed heterozygosity (Ho), and fixation index (FIS) obtained from the 61 studied samples are shown. A total of 327 alleles were found among the set of V. unguiculata accessions, varying significantly between sites (p < 0.001; Table 2). The number of alleles varied geographically from 14 in the coastal area of Muchela to 71 in the dry western area of Tambara ( Figure 2). Allelic richness varied between 1.250 in Muchela and 1.751 in Gurué, with no statistical differences being found between areas (p = 0.452; Table 1). However, the number of private alleles varied significantly across areas (p < 0.001; Table 2), with the highest numbers being found in Gurué, Tambara, and Machaze ( Figure 3).    Table 2. The two cultivars are also indicated.
The two cultivars had a low number of alleles (IT-16: 11 and IT-18: 2) and low allelic richness (IT-16: 1.333 and IT-18: 1.222) constrained by the small sampling size. However, although the observed heterozygosity (IT-16: 0.333 and IT-18: 0.222) was higher than expected in both cultivars (IT-16: 0.167 and IT-18: 0.111; p < 0.001 in both cases), it was also lower than that found in most local accessions (Table 2; p < 0.001). From all landraces sampled, only one was not at HWE (GUR) at the 5% level after the sequential Bonferroni correction (Table S1). Pairwise comparisons between loci revealed no significant linkage disequilibrium at p = 5%, suggesting that alleles are assorted independently at different loci.

Genetic Structure of V. unguiculata
The Bayesian clustering program STRUCTURE found the highest LnP(D) and ∆K values for K = 4 ( Figure S1). The results showed a high degree of admixture between landraces without any specific geographic pattern or clustering when considering the different AEZs ( Figure 5). One cluster was predominant and grouped all accessions from North Zambezia and most accessions from Sofala and Central Manica; the second cluster characterized Central and South Zambezia accessions; the third clustered accessions from North Manica as well as Central Sofala; and the fourth cluster was exclusively composed of accessions from South Manica ( Figure 5). The two cultivars were assigned to the first cluster, although with signs of admixture with other clusters.
In accordance with these results, the NJ tree separated all the groups assigned by STRUCTURE, revealing again no general correlation with the geographical distribution of accessions ( Figure 6). All individuals from R3 and R7 were clustered into two different clades, one with 65% and the other with 34% bootstrap support (BS) value ( Figure 6). Most individuals from R6 were clustered in the same group (57% BS), while R4, R5, and R10 were clustered into two different groups. The two cultivars were nested within the landraces, although in two different separated groups.
were nested within the landraces, although in two different separated groups.
The PCoA spatially separated the accessions analyzed into three main groups (Figure 7). In accordance with the NJ tree, the accessions from R3 and R7 were separated from the main group: the first being on the upper left of axis 2, accumulating 21.14% of variance, with the second on the lower left of axis 2. All remaining accessions were clustered in a heterogeneous group also containing the two cultivars.   The PCoA spatially separated the accessions analyzed into three main groups (Figure 7). In accordance with the NJ tree, the accessions from R3 and R7 were separated from the main group: the first being on the upper left of axis 2, accumulating 21.14% of variance, with the second on the lower left of axis 2. All remaining accessions were clustered in a heterogeneous group also containing the two cultivars.

Genetic Differentiation between Landraces
Overall, genetic differentiation was significantly low (AMOVA FST = 0.199, p < 0.001). The analysis performed over the landraces sampled indicated that only 19.92% of the genetic variation could be attributed to among the AEZs ( Table 3). The highest molecular variance was found among genotypes within accessions (47.39%), followed by that found within genotypes (32.69%; p < 0.001; Table 3). Remarkably, a very low molecular variance was found between wild cowpea versus the cultivars (0.12%), most of the variance being found among individuals within samples (65.58%; Table 3).

Discussion
Landraces harbor a gene pool of unexplored alleles that constitute a unique set of genetic resources for breeding to improve productivity, nutritional value, adaptation, and resilience to climate change [36][37][38][39]. Given their evolutionary history and adaptation to local conditions, they usually harbor higher genetic diversity and environmental resilience than modern varieties [40][41][42][43]. However, such richness tends to be lost because most of the current intensive agricultural systems are based on few high-input and high-yielding cultivars [44]. Thus, a comprehensive characterization of landraces towards the development of conservation and breeding strategies is among the main clues that will allow us to face the major agricultural challenges related to population growth and environmental risks.
Despite the ongoing agricultural changes in Africa, according to our data, the nine microsatellites employed in this study were highly polymorphic and revealed the existence of high genetic diversity between landraces of V. unguiculata from Mozambique (Tables 1 and 2). A total of 327 alleles were found among the 59 cowpea accessions, which can be attributed to high genetic heterogeneity (Table 2). Indeed, the genetic diversity values found within the studied Mozambique landraces (Ho: 0.222-0.426; He: 0.451-0.654) were much higher than those reported in other cowpea studies. For instance, high-density single-nucleotide polymorphism (SNP) genotyping using the Cowpea iSelect Consortium Array was used to study the population structure and genetic diversity in a set of 96 worldwide cowpea accessions, and average PIC and He values of 0.25 and 0.31, respectively, were found [45]. Similar results were obtained by [10,12] using 422 cowpea landraces and 768 cowpea genotypes, respectively, collected in 56 countries. A genotyping by sequencing (GBS)study of 298 lines from the mini core collection of 360 landraces maintained at The International Institute of Tropical Agriculture revealed a low number of robust SNP markers with an observed genetic distance of between 0.0096 and 0.462 [11]. The studied accessions of the world cowpea collection maintained at IITA were grouped into three main clusters, agreeing with the groups also reported by [10]. Genetic diversity using SSR markers was also assessed in 105 selected cowpea genotypes from the National Genebank of China at the Institute of Crop Science (ICS), with a low level of genetic diversity found among accessions [46]. Genetic diversity was also reported to be low among cowpea populations collected from Benin Republic [47], Ghana [48], and Sudan [49]. In comparison, in our study, the two commercial cultivars (IT-16 and IT-18) had a very low number of alleles and low heterozygosity values. Cluster analyses (PcoA or NJ tree) showed no clear differentiation between these modern varieties and wild accessions. The pairwise genetic distances reported in other studies have also shown that African landraces are close to wild cowpea samples [11,12]. This suggests that the genetic diversity of these two commercial varieties is still close to that of the wild accessions, although more individuals are needed to accurately determine if genetic erosion is occurring.
Population structure analysis using worldwide cowpea samples usually delineates African landraces into two major gene pools separated by the Congo River basin, the East/South, and West Africa [8][9][10], although nothing has been reported in terms of the cowpea genetic structure within these regions. Our study, focused on Mozambican (East Africa) landraces, found four genetic groups with a high degree of admixture ( Figure 5). No specific geographic pattern or clustering was found considering the different AEZs in either the NJ tree or the PcoA (Figures 6 and 7), which supports the presence of gene flow between these regions. The rate of self-fertilization in V. unguiculata varied across landraces (25-74%; Table 2; Figure 4), supporting the possibility of gene flow between individuals. In fact, two landraces (Lucas and Muchela) exhibited negative F IS values, indicating that these landraces are less related than expected under random mating, which could imply fewer homozygotes and consequent crossbreeding. Nonetheless, most of the remaining landraces had low F IS values (0.1-0.3), which indicates that inbreeding might not be prevalent.
The analysis of genetic differentiation indicated that most of the genetic variation was explained by differences among genotypes within landraces (Table 3), which also supports the hypothesis of gene flow. This low genetic differentiation and the absence of a geographical pattern associated with AEZs might be due to crossbreeding between individuals but also to seed exchange by farmers.
Seed exchange is a common practice between African farmers of neighboring areas [50] and could explain the specific genetic cluster found in the isolated accessions of South Manica that show no admixture with the remaining ones. It is economical unfeasible for seed companies to distribute small amounts of seeds over long rural distances in Africa, and therefore, certified, commercial seeds do not reach the farmers [51]. In addition, certified seeds are generally expensive, and farmers are unwilling to buy them at a cost twice (or more) that of grain [51]. Nonetheless, continuous recycling of seeds results in poor grain yields [50], highlighting the importance of conserving wild accessions and their seed stock.
The high genetic diversity found in Mozambique in comparison to other world regions reinforces the importance of local landraces to widen the genetic base of modern varieties of cowpea. The results of this study underline the hidden genetic diversity in local landraces, which should be conserved as sublines in genebanks to avoid the expected reduction of genetic diversity with successive regeneration of bulk samples. The high levels of genetic differentiation found within landraces (but not among AEZs) could imply the presence of different phenotypes, which might exhibit desired traits for exploitation in future breeding programs. In fact, according to [5], accessions A55 from R3, A80 from R7, and A116 from R10, which were clustered into different genetic groups ( Figure 6), revealed contrasting responses, respectively leading to high sensitivity, mild sensitivity, and high tolerance to drought stress related to the regulation of photosynthesis, C/N metabolism, and antioxidative status [5]. Compared to close relatives and other crops, cowpea is well adapted to semi-arid and arid regions because of its ability to fix nitrogen, but local genotypes still display significantly different levels of tolerance [52,53]. The same plasticity has been observed in response to virus and fungal diseases, with impact in terms of yield losses ranging from 20 to 100% [54]. Such a diverse genetic pool could be used to investigate the genetic basis of tolerance to abiotic and biotic stress.
A priority for in situ, on-farm conservation should be given to the landraces of Gurué, Tambara, and Machaze, which showed a high number of private alleles ( Figure 3) and belong to different genetic groups according to STRUCTURE analysis ( Figure 5). On-farm conservation allows the concomitant evolution and retention of potentially useful genetic variation needed to maintain crop ability to adapt to changes [55]. However, genetic diversity conserved on the farm is complementary to that found in the genebank, and both systems are required for efficient conservation of cowpea. Thus, further to molecular tools, farmers' knowledge should be employed to optimize the sampling of sublines within landraces for ex situ conservation. A core germplasm collection should include most of the cowpea's genetic diversity, for which the results outlined in this study can be used. The results of this work encourage a broad network of on-farm activities that should be enrolled in a socio-economic framework to complement genebank collections. This is also the best way to prevent genetic erosion in the genebank while maintaining and expanding the cultivation of cowpea in a wide range of environmental conditions. Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4395/10/12/1893/s1, Table S1: Results of the Hardy Weinberg exact tests retrieved by GENEPOP for the sample landraces of Vigna unguiculata. p-value (0.05) associated with the null hypothesis of random union of gametes estimated with a Markov chain algorithm. Figure S1: STRUCTURE analysis of Vigna unguiculata. Above: mean log probability of data LnP(D) over 10 runs for each K value as a function of K (error bars represent standard deviation). Below: Evanno's ad hoc statistic; DK as a function of K.