Molecular Genetic Diversity and Population Structure in Ethiopian Chickpea Germplasm Accessions

Chickpea (Cicer arietinum L.) is a cheap source of protein and is rich in minerals for people living in developing countries. To assess the existing molecular genetic diversity and determine the population structure of selected Ethiopian chickpea germplasm accessions (118), a set of 46 simple sequence repeat (SSR) markers equally distributed across the chickpea genome was genotyped. A total of 572 alleles were detected from the 46 SSR markers, and the number of alleles per locus varied from 2 (ICCM0289) to 28 (TA22). The average number of alleles per locus, polymorphism information content, and expected heterozygosity were 12, 0.684, and 0.699, respectively. Phylogenetic analysis grouped the 118 chickpea genotypes from diverse sources into three evolutionary and/or biological groups (improved desi, improved kabuli, and landraces). The population structure analysis revealed six sub-populations among the 118 chickpea genotypes studied. AMOVA revealed that 57%, 29%, and 14% of the total genetic variation was found among individuals, within populations, and among populations, respectively. These insights into the genetic diversity at the molecular level in the Ethiopian germplasm lines can be used for designing conservation strategies, and the diverse germplasm lines identified in this study can be used for trait dissection and trait improvement.


Introduction
Chickpea (Cicer arietinum L.) is a diploid crop plant (2n = 2x = 16) with a haploid genome size of approximately 740 Mb [1]. Southeastern Turkey and adjoining Syria are the primary Vavilovian centers of origin, and Ethiopia is the secondary center of diversity [2,3]. In Ethiopia, chickpea is one of the most economically important legumes produced on an area of 258,486 ha, with a production of 470,000 tons [4][5][6]. Ethiopia is one of the top ten chickpea growing countries across the world and is the leading producer, consumer, and exporter of chickpeas in Africa [7]. In Ethiopia, chickpea is consumed as a green vegetable (eshet), roasted (kollo), boiled (nifro), dry vegetable, 'shimbra asa', shiro wot (sauce), and snacks, which are cheap and healthy diets that are rich in protein, vitamins, and minerals for the poor farmers who cannot afford animal products. Moreover, chickpea generates income for the poor farmers and draws foreign currency to the country, improves food and nutritional security and soil fertility, provides livestock feed, and requires low production costs [6][7][8]. The major chickpea growing zones of Ethiopia are South Gondar, North Gondar, East Gojam, West Gojam, North Shewa, East Shewa, West Shewa, South Wollo, North Wollo, and Tigray [9]. In Ethiopia, there are about 1173 chickpea accessions collected from different agroecologies and geographical origins and stored at the Institute of Biodiversity and Conservation [10]. Although Ethiopia is bestowed with diverse agroecologies and, especially, crop diversity, the productivity of chickpea is about 850 kg/ha [7] due to the exposure of the crop to several biotic and abiotic stresses.
Deeper insights into genetic diversity enable the use of appropriate germplasm lines in breeding programs to develop climate-resilient varieties. Investigating the nature and structure of genetic diversity and relatedness within and among cultivated chickpea and its wild relatives helps to identify new sources of germplasm bearing valuable genes for improving yield and grain quality and for enhancing resistance to various biotic and abiotic stresses. Additionally, studying genetic diversity is important in the management, conservation, and selection of diverse plant materials for intraspecific and interspecific crossing [11]. The genomics revolution of the last two decades has led to the development of several genomic resources, including the genome sequence [1], molecular markers [12,13], and technologies for assessing the genetic diversity of germplasm lines at the genome level [14][15][16].
Earlier efforts were made to understand the genetic structure of chickpea using highly polymorphic simple sequence repeat (SSR) markers, which are required to facilitate chickpea genetic improvement [17]. SSR markers are robust and relatively cheap markers for allele mining, analysis of molecular genetic diversity patterns, genetic relationships, association genetics, genetic mapping and identification of genes, phylogenetic patterns, population genetic structure studies, gene cloning, and marker-assisted selection in chickpea accessions [18][19][20][21][22]. However, most of these studies assessed the genetic diversity among germplasm lines from Southeast Asia [19][20][21][22][23] and Mediterranean regions [24][25][26][27][28]. The genetic diversity among Ethiopian germplasm lines has seldom been studied using SSR markers [18,20], and most genetic diversity studies of chickpea so far have focused on morphological characterization. Ethiopia is the secondary center of diversity, and assessing the molecular genetic diversity and determining the population structure among Ethiopian germplasm lines would help in designing breeding programs as well as germplasm conservation and management strategies. In this study, we report the assessment of the molecular genetic diversity and population structure of 118 chickpea genotypes (115 Ethiopian chickpea landraces, breeding lines, and cultivars and 3 Indian elite varieties) using 46 genome-wide SSR markers.

Plant Materials and DNA Extraction
A total of 118 chickpea genotypes (115 Ethiopian landraces, lines, and varieties and 3 Indian elite varieties) were grown at the Debre Zeit lath house, Ethiopia (Table 1). DNA was extracted from 22-day-old seedlings (leaves from ten plants per genotype were pooled) using a modified CTAB protocol [29] at the Institute of Biotechnology, Addis Ababa University, Ethiopia. At ICRISAT, Patancheru, India, the DNA samples were treated with RNase A (by adding 15 µL of RNase A to each DNA sample) for three hours at 37 °C. The RNase A-treated DNA was further purified following the NucleoSpin 96 Plant II kit purification protocol [30]. The quality check, quantification, and normalization of the DNA samples to 5 ng/µL were carried out on 1% agarose gels, using lambda DNA (50 ng) as a standard, and with a NanoDrop 8000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA).

Polymerase Chain Reaction (PCR), Capillary Gel Electrophoresis, Allele Calling, and Sizing
A total of 46 polymorphic SSR markers (Table 2) distributed equally across the genome were used to genotype the 118 chickpea genotypes. The 46 SSR markers were selected based on quality criteria, genome coverage, and locus-specific information content from published [18,20,22,31] and unpublished data. All 46 SSR markers were genotyped as described earlier [12,13]. PCR products were separated by amplicon size on an automated DNA fragment analyzer and Sanger sequencer (capillary gel electrophoresis, ABI Prism 3730 XL, Applied Biosystems, Foster City, CA, USA) using an internal size standard, GeneScan-500 LIZ. GeneMapper 4.0 software (Applied Biosystems, Foster City, CA, USA) was used for allele calling.

Diversity Analysis
Based on SSR allelic data, the molecular genetic relationships (dendrogram) among the 118 chickpea genotypes were determined using the neighbor-joining weighted pair-group method with arithmetic averages (WPGMA), i.e., clustering of simple matching dissimilarity indices, with the DARwin-6.0 program [32]. PowerMarker version 3.51 [33] was used to determine the major allele frequency, the number of alleles per locus, polymorphic information content (PIC), gene diversity (expected heterozygosity), and observed heterozygosity. STRUCTURE 2.3.4 software [34] was used to estimate the number of natural genetic groups (K) and the distribution of individuals among these groups, and to assign individual genotypes to a specified number of groups K based on the membership coefficients calculated from the genotype data. The genetic structure of the 118 chickpea genotypes was explored using the admixture model implemented in STRUCTURE. A range of population numbers (K = 2 to K = 10) was assessed using a burn-in period of 10,000 steps, followed by 100,000 MCMC (Markov chain Monte Carlo) replicates, with five iterations. The rate of change of the natural (Napierian) logarithm of the probability relative to its standard deviation (∆K) was determined using the Evanno et al. [35] method. The output of this population structure analysis was obtained from the online STRUCTURE HARVESTER program [36], which showed the highest peak at K = 6, indicating the presence of six major distinct populations. The 46 SSR loci were analyzed by population (geographical distribution) of the chickpea genotypes using GenAlEx 6.5 software [37]. Population-specific genetic diversity, allelic frequency, observed, expected, and unbiased expected heterozygosity, Shannon information index, the total number of alleles, average number of alleles and genotype-specific alleles per locus, percentage of polymorphism, gene flow, Nei's pairwise genetic distance between populations [38], and genetic differentiation were computed.
Based on the pairwise genetic distances between genotypes, principal coordinate analysis (PCoA) was also performed in GenAlEx 6.5 to complement the STRUCTURE and dendrogram analyses.
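The per-locus diversity statistics named above follow standard definitions. As a minimal sketch, assuming expected heterozygosity He = 1 - Σp² and the commonly used Botstein-style PIC formula (the exact estimators applied by PowerMarker may include sample-size corrections), they can be computed from the allele calls at a single locus. The function names and the toy allele sizes below are illustrative and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(alleles):
    """Return {allele: frequency} for a flat list of allele calls at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """Gene diversity: He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    ps = list(freqs.values())
    cross_term = sum(2 * (pi ** 2) * (pj ** 2) for pi, pj in combinations(ps, 2))
    return expected_heterozygosity(freqs) - cross_term


# Hypothetical allele sizes (bp) scored at one SSR locus.
toy_calls = [200, 200, 204, 204]
toy_freqs = allele_frequencies(toy_calls)
toy_he = expected_heterozygosity(toy_freqs)
toy_pic = polymorphism_information_content(toy_freqs)
```

With two equally frequent alleles, He is 0.5 and PIC is 0.375; PIC never exceeds He, and the two converge as the number of alleles grows, which is the parallel trend between PIC and He described for the markers in this study.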

Population Structure
The statistical significance of the molecular genetic variance components (analysis of molecular variance, AMOVA) for each hierarchical comparison (among populations, among individuals, and individuals within populations) was tested using 999 permutations. Moreover, F-statistics among individuals and within and among populations were determined using GenAlEx 6.5 [37].

Results
A total of 572 alleles were detected with 46 genome-wide polymorphic SSR markers across 118 chickpea genotypes, of which 113 were considered rare, 435 common, and 34 most frequent alleles. Large numbers of rare alleles were found for TS72, CaM1101, and TA14. The number of alleles per locus varied from 2 (ICCM0289) to 28 (TA22), with an average of 12 alleles per locus (Table 2). A total of 126 genotype-specific alleles were also found (data not shown). Several markers were highly polymorphic, namely TA22, CaM1101, TS72, TA130, ICCM249, TA118, TA96, TA176, CaM1158, CaM0443, TA14, TA37, CaM0436, H1G16, TR29, CaM1402, TA78, TA113, STMS11, TA71, TA122, TA194, and TA103 in their respective order, and might have great potential to discriminate the 118 chickpea genotypes into different genetic groups. Nevertheless, SSR markers such as GAA47, ICCM0289, CaM0610, and H1D24 had low polymorphic power (PIC ≤ 0.1) to discriminate the 118 chickpea genotypes into different genetic groups. The number of alleles per locus was positively associated with the allele size range, PIC, and He, and negatively associated with the major allele frequency (MAF). PIC and He followed parallel trends; both might reflect the discriminating power (genetic diversity) of the SSR markers across the 118 chickpea genotypes (Table 2).

Phylogenetic Relationships and Principal Coordinate Analysis (PCoA)
Molecular genetic relationships among the 118 chickpea genotypes were revealed by phylogenetic relationships (dendrograms) and principal coordinate analysis (PCoA; Figures 1 and 2). The phylogenetic relationships were determined from a pairwise simple matching dissimilarity matrix using hierarchical neighbor-joining weighted pair-group arithmetic average (WPGMA) clustering, which resolved the 118 chickpea genotypes into three clusters: landraces, improved kabuli, and improved desi (Figure 1). In addition, PCoA of the molecular data was performed on a pairwise distance matrix across the 118 chickpea genotypes using GenAlEx 6.5 software. The PCoA confirmed the result of the WPGMA dendrogram: the 118 chickpea genotypes were grouped into three clusters, namely landraces, improved desi, and improved kabuli (Figure 2).
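To make the dissimilarity matrix underlying the dendrogram concrete, the following is a minimal sketch of a simple matching dissimilarity for diploid allelic data, assuming the form d = 1 - (shared alleles) / (ploidy × number of loci compared). This is an illustration under that assumption rather than a reproduction of DARwin's implementation; the genotype layout (one tuple of two allele sizes per locus, `None` for missing data) and the example values are hypothetical.

```python
from collections import Counter


def simple_matching_dissimilarity(genotype_a, genotype_b, ploidy=2):
    """Dissimilarity between two multilocus genotypes.

    Each genotype is a list with one entry per locus: a tuple of allele
    sizes, or None when the locus is missing. Loci missing in either
    genotype are skipped.
    """
    shared = 0
    loci_compared = 0
    for locus_a, locus_b in zip(genotype_a, genotype_b):
        if locus_a is None or locus_b is None:
            continue
        # Multiset intersection counts alleles the two genotypes have in common.
        shared += sum((Counter(locus_a) & Counter(locus_b)).values())
        loci_compared += 1
    if loci_compared == 0:
        raise ValueError("no loci scored in both genotypes")
    return 1.0 - shared / (ploidy * loci_compared)


# Hypothetical two-locus genotypes (allele sizes in bp).
line_1 = [(200, 200), (150, 154)]
line_2 = [(200, 204), (150, 154)]
example_distance = simple_matching_dissimilarity(line_1, line_2)
```

In this toy case three of the four allele copies match, giving a dissimilarity of 0.25. Computing this value for every pair of genotypes yields the square matrix that a clustering or PCoA routine takes as input.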

Population Structure, Genetic Differentiation, and Geographic Distribution
According to the online STRUCTURE HARVESTER output [36] based on the Evanno et al. [35] method, in which the probable number of populations was varied from K = 2 to K = 10, the maximum delta K (∆K) value was clearly obtained at K = 6 (Figure 3a). Therefore, the admixture model of the STRUCTURE 2.3.4 analysis clearly revealed that the 118 chickpea genotypes (115 Ethiopian and 3 Indian genotypes) were categorized into six populations (desi, kabuli, and four distinct groups of landraces; Figure 3b). Populations 1, 4, 5, and 6 comprised landraces collected from North Shewa, SNNP (Southern Nations, Nationalities, and Peoples), West Shewa, South Wollo, and Tigray; East Shewa, Arsi, North Shewa, and South Wollo; North Shewa and West Harerge; and South Gondar, East Gojam, and West Gojam, respectively. Pop2 and Pop3 comprised improved kabuli and improved desi genotypes, respectively (Figure 3b).
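The ∆K criterion used to select K = 6 can be illustrated with a short sketch. It assumes the commonly described form of the Evanno statistic, the absolute second-order difference of the mean log-likelihood across successive K values divided by the standard deviation of the replicate log-likelihoods at K. The log-likelihood values below are invented for illustration and are not the values from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps K to a list of LnP(D) values from replicate runs
    (at least two replicates per K, with non-zero spread).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # the second difference is undefined at the ends of the range
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_difference / stdev(lnp_by_k[k])
    return result


# Hypothetical replicate log-likelihoods for K = 2..5.
toy_runs = {
    2: [-5000.0, -5010.0],
    3: [-4800.0, -4790.0],
    4: [-4700.0, -4704.0],
    5: [-4690.0, -4700.0],
}
toy_delta_k = delta_k(toy_runs)
best_k = max(toy_delta_k, key=toy_delta_k.get)
```

Because the statistic needs both neighbours, a scan over K = 2 to K = 10 yields ∆K values only for K = 3 to K = 9, and the K with the highest peak is taken as the most likely number of populations.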
In this study, a membership coefficient of 60% was used as the threshold for assigning individual genotypes to their own genetic groups. However, the average membership coefficient of individual genotypes per population group was greater than 80%. Based on this threshold, 32, 18, 22, 15, 7, and 13 individual chickpea genotypes belonged to populations 1-6 in their respective order; nevertheless, 11 genotypes possessed a membership coefficient of less than 60% and had admixed ancestry among the six populations.
Figure 3. (a) Based on the Evanno et al. [35] method, the probable number of populations was evaluated from K = 2 to K = 10, and the highest peak was found at K = 6; (b) six sub-populations were identified among the 118 chickpea genotypes using STRUCTURE 2.3.4 software analysis. Six separate bar plots are shown, and each bar plot represents a distinct population.
Nei's pairwise genetic distances among populations varied from 0.15 to 0.52. Pop1 was distant from Pop3 (0.52) and Pop2 (0.46), while Pop1 was closer to Pop6 (0.15) and Pop4 (0.16). Pops 2 and 3 were closer (0.44) to each other than to any of the landrace groups and distant from Pop5 (0.66 and 0.67) and Pop6 (0.55 and 0.58). Among the 118 chickpea genotypes, the most distant from and the closest to all other genotypes were an Indian elite desi variety, ICCV96029 (Pop3), and an Ethiopian landrace, 207654 (Pop1), collected from North Shewa, respectively. Landraces collected from different regions of Ethiopia were closer to each other than to the desi and kabuli lines and varieties taken from Ethiopia and India (ICRISAT; Table 4). Pairwise genetic divergence (FST) among populations varied from 0.04 to 0.12. Pops 1 and 3 had the highest genetic divergence (0.12), and Pops 1 and 4 had the lowest (0.04). The inbreeding coefficient of each population varied from 0.08 (Pop1) to 0.42 (Pop5; Table 4).
The trends of genetic differentiation detected by FST among the six populations were consistent with Nei's pairwise genetic distance. The population structure analysis of this study confirmed the same result as the WPGMA dendrogram and PCoA except for further clustering of landraces into four distinct populations. This is partially due to the geographical proximity of collected landraces; nevertheless, in some cases, landraces collected from geographically distant zones like North Shewa, West Shewa, South Wollo, Tigray, and SNNP were clustered into the same population. Landraces collected from North Shewa and West Harerge were also grouped under the same population. Moreover, even landraces collected from the same geographical zone (for example, in North Shewa) were clustered into 3 distinct groups of populations (Pops 1, 4, and 5).
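The 60% membership rule used above to place genotypes into populations or to flag them as admixed can be sketched as follows. The function name and the membership vectors are hypothetical; only the 0.60 threshold comes from the text.

```python
def assign_population(membership, threshold=0.60):
    """Return the 1-based population index whose membership coefficient is
    largest, or None when that coefficient falls below the threshold
    (the genotype is then treated as admixed)."""
    best_index = max(range(len(membership)), key=lambda i: membership[i])
    if membership[best_index] >= threshold:
        return best_index + 1
    return None


# Hypothetical membership coefficients for K = 6.
clear_landrace = [0.85, 0.05, 0.02, 0.03, 0.03, 0.02]
admixed_line = [0.40, 0.35, 0.10, 0.05, 0.05, 0.05]

clear_assignment = assign_population(clear_landrace)
admixed_assignment = assign_population(admixed_line)
```

Applied to every row of the membership matrix, a rule of this kind produces the per-population counts and the set of admixed genotypes; raising the threshold moves more genotypes into the admixed set.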

Analysis of Molecular Variance (AMOVA)
The genetic diversity among the 118 chickpea genotypes was described by the Shannon information index, genetic diversity parameters, AMOVA, F-statistics, and gene flow. The Shannon information index revealed that 28% of the genetic diversity was found among populations and 72% within populations. Population differentiation based on the AMOVA showed that 14%, 57%, and 29% of the total genetic variation was found among populations, among individuals, and within populations, respectively (Table 5 and Figure 4), and all the genetic variance components were highly significant (p < 0.001). F-statistics (fixation index or inbreeding coefficient) showed that the genetic variation among individuals (FIT, 0.71) and within subpopulations (FIS, 0.67) was higher than that among populations (FST, 0.14). The mean gene flow (Nm) among the six populations was 1.57 (Table 5).
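The reported F-statistics can be related to the AMOVA percentages with a short calculation. This sketch assumes a three-level codominant AMOVA in which the 29% component corresponds to variation within individuals, and it uses the common approximation Nm = (1/FST - 1)/4; both are assumptions made here for illustration, and the function names are hypothetical.

```python
def f_statistics(among_pops, among_individuals, within_individuals):
    """Derive FST, FIS, and FIT from three AMOVA variance components
    (percentages or raw variances; only their ratios matter)."""
    total = among_pops + among_individuals + within_individuals
    fst = among_pops / total
    fis = among_individuals / (among_individuals + within_individuals)
    fit = (among_pops + among_individuals) / total
    return fst, fis, fit


def gene_flow(fst):
    """Approximate number of migrants per generation: Nm = (1/FST - 1) / 4."""
    return (1.0 / fst - 1.0) / 4.0


# Percentages reported in the AMOVA above.
fst, fis, fit = f_statistics(14, 57, 29)
nm = gene_flow(fst)
```

Under these assumptions the percentages give FST = 0.14, FIS of about 0.66, FIT = 0.71, and Nm of about 1.54, close to the reported 0.14, 0.67, 0.71, and 1.57; the small gaps presumably reflect rounding of the published percentages. The values also satisfy the identity (1 - FIT) = (1 - FIS)(1 - FST).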

Allelic Variation and Molecular Genetic Diversity
The number of alleles per locus, the numbers of genotype-specific, rare, common, frequent, and group-specific alleles, the variability of allele sizes, PIC values, and He are allelic variation and genetic diversity parameters that indicate the discriminating power of an SSR marker for measuring phylogenetic relationships, molecular genetic diversity, geographical relationships, and genetic differentiation patterns of chickpea accessions [17][18][19][20][22][23][24][26][27][28]. The number of alleles per locus has been shown to correlate positively with the range of allele sizes, PIC, and genetic diversity [11,20,22,28], which is in agreement with the present study.
In the present study, a total of 572 alleles were detected from 46 SSR markers; the number of alleles per locus varied from 2 to 28 and the allele sizes from 115 to 345 bp, and the average number of alleles per locus, PIC, and genetic diversity were 12, 0.68, and 0.69, respectively. Similar results were reported by [20,24,26,28]. In these studies, 218, 122, 504, and 309 alleles were detected with 22, 9, 48, and 16 SSR markers, respectively. The number of alleles per locus varied from 2 to 26 (mean 9.9), 9 to 20 (mean 13.5), 3 to 22 (mean 10.5), and 8 to 29 (mean 19.3) in their respective order. Their average genetic diversity and PIC ranged from 0.7 to 0.8. The allele sizes were in a similar range (131 to 344 bp) to that reported earlier [24].
In contrast, Refs. [21,22] reported relatively large total numbers of alleles (917 and 1683) from 35 and 48 SSR markers, with higher average numbers of alleles per locus (26.2 and 35) and genetic diversity (0.48-0.96 and 0.87). On the other hand, Refs. [11,18,25,27,39] detected fewer alleles per locus and lower average PIC and genetic diversity values. These authors detected 58, 119, 480, 117, and 111 alleles from 10, 19, 100, 38, and 33 SSR markers in their respective order. According to the reports of these authors, the number of alleles, average alleles per locus, PIC, and genetic diversity varied from 2-13, 3-6.25, and 0.41-0.68, respectively. The number, type, genome-wide distribution, and polymorphic nature of the SSR markers, and the number, biological status, type, geographical distribution, and agroecology of the germplasm accessions might account for these differences.
The presence of genotype-specific alleles is important for the genetic characterization of agronomic and quality traits and for marker-assisted selection (gene tagging) [40]. Using SSR markers, a total of 106 genotype-specific alleles were identified from 307 Iranian chickpea accessions, and 470 unique alleles were reported across a composite collection of 2915 chickpea accessions [22,26]. Moreover, Upadhyaya et al. [22] found a higher percentage of rare alleles (55.5%) than of common (42.8%) and most frequent (1.7%) alleles. In the present study, a comparatively large number of genotype-specific alleles (126) was found. The common alleles (74%) were far more numerous than the rare alleles (19.9%). Upadhyaya et al. [22] confirmed that accessions from East African regions were highly polymorphic and genetically diverse with beneficial traits.
In the present study, 72% and 28% of the total molecular genetic variation was found within and among populations, respectively. In agreement with this study, AMOVA in earlier reports revealed that 73% and 27% [18], 59% and 41% [27], and 62% and 38% [28] of the variation observed in chickpea accessions was found within and among populations, respectively. This might be due to the presence of high genetic variation within landraces collected from the primary and secondary centers of chickpea [24]. The study of genetic diversity and relationships among the chickpea genotypes used in this study might be important for the management, conservation, and selection of diverse plant materials for the intraspecific crossing of distantly related chickpea genotypes (e.g., Ethiopian landraces with the Indian elite varieties ICCV96029 and JG 14) to improve yield and mitigate abiotic stresses such as drought, heat, cold, and salinity in the national chickpea breeding programs. Phenotypic data confirmed that a few of the chickpea accessions proved to be heat resilient under heat stress environments (data not shown). Similarly, Keneni et al. [18] emphasized that crosses between genetically distant parents and those from diverse local sources might produce higher heterosis, better genetic recombination, and segregation in their progenies and result in varieties with a broad genetic base.

Population Structure and Patterns of Population Genetic Divergence
Chickpea accessions might form clusters based on evolutionary (phylogenetic) relationships, market classes, and geographical distributions [20,22,26,27]. In the present study, 118 chickpea genotypes were taken from diverse geographical regions of Ethiopia, Debre Zeit Agricultural Research Center (Ethiopia), and ICRISAT (India). They were grouped into three clusters and six distinct genetic populations based on phylogenetic and structure analyses, respectively. Similarly, phylogenetic analyses clustered 48 [25] and 103 [24] chickpea entries into five and three distinct populations, respectively. Saeed et al. [27] and Sefera et al. [20] reported that, using the UPGMA clustering method, 44 and 48 chickpea accessions were clustered into eight and two populations, respectively. In addition, population structure analyses divided 46 [19], 155 [18], and 94 [17] chickpea genotypes into five, five, and six distinct populations, respectively.
In this study, chickpea landraces collected from a single origin or from two or more geographical origins clustered into a single population, and landraces collected from a single geographical origin could be clustered into more than one cluster. For instance, all landraces collected from SNNP were grouped into a single cluster; however, landraces collected from North Shewa were grouped into three of the four landrace clusters with different proportions of individuals. The number of entries varied from cluster to cluster, and individual chickpea genotypes within a cluster (membership coefficient ≥ 60%) were more closely related to each other than to members of different clusters. Intra- and inter-regional similarities were observed between adjoining geographical origins due to the exchange of germplasm. In particular, there might have been massive seed accumulation and movement from North Shewa (the central part of the country) into different geographical origins of Ethiopia through the market or other mechanisms. This could be because North Shewa adjoins the area of the national chickpea breeding program and might have benefited significantly from it. In agreement with the present study, Choudhary et al. [11] reported that the overall clustering pattern did not strictly follow the grouping of accessions according to their geographic origins. Moreover, Keneni et al. [18] explained that chickpea genotypes taken from different sources might have evolved from different lines of ancestry and/or been derived from independent events of evolutionary forces, namely genetic drift, mutation, migration, selection, and influx/outflux of genes in the form of germplasm exchange, that differentiated chickpea genotypes into different gene pools. Three elite desi varieties taken from ICRISAT (Hyderabad, India) were clustered together with the Ethiopian improved desi genotypes.
This clearly indicated that Ethiopia and India share common breeding lines and varieties and have often exchanged chickpea genetic material. This is in line with the report of Sefera et al. [20], who found correspondence between the grouping of cultivars released in Ethiopia and India.

Conclusions
A total of 572 natural alleles and 126 genotype-specific alleles were detected at 46 SSR loci across 118 chickpea genotypes. The number of alleles per locus varied from 2 (ICCM0289) to 28 (TA22), with an average of 12 alleles per locus. The TA22 and CaM0610 markers had the highest and lowest polymorphic information content, genetic diversity, allele sizes, and total number of alleles per locus, respectively. AMOVA revealed higher genetic diversity among individuals (57%) than among populations (14%). The high genetic diversity within populations, wide range of allele sizes, large numbers of alleles per locus and genotype-specific alleles per population, high polymorphic information content, and high expected heterozygosity have breeding implications for the national chickpea improvement program: they can help to enhance yield and improve specific traits such as tolerance to heat, cold, and drought through the crossing of genetically distant genotypes.

Data Availability Statement:
The datasets used and/or analyzed in this study are available from the corresponding authors and first author.
Acknowledgments: T.G. is sincerely thankful to the Institute of Biotechnology, Addis Ababa University, for financial support. R.K.V. greatly acknowledges the funding support from the Bill and Melinda Gates Foundation, Tropical Legumes III. This work has been undertaken as part of the CGIAR Research Program on Grain Legumes and Dryland Cereals (GLDC). ICRISAT is a member of CGIAR Consortium.

Conflicts of Interest:
All authors declared no conflict of interest.