Determining Genetic Diversity and Population Structure of Common Bean (Phaseolus vulgaris L.) Landraces from Türkiye Using SSR Markers

Assessing genetic diversity among varieties helps to improve desired crop characteristics, including disease resistance, early maturity, high yield, and drought resistance. Molecular markers are among the most effective tools for revealing genetic diversity that can increase breeding efficiency. Simple sequence repeats (SSRs), which are codominant markers, are preferred for determining genetic diversity because they are highly polymorphic, multi-allelic, highly reproducible, and provide good genome coverage. This study aimed to determine the genetic diversity of 40 common bean (Phaseolus vulgaris L.) landraces collected from the Ispir district in the Northeast Anatolia region of Türkiye, together with five commercial varieties, using SSR markers. Twenty-seven SSR markers produced a total of 142 polymorphic bands, with 2 (GATS91 and PVTT001) to 12 (BM153) alleles per marker and an average of 5.26 alleles. Gene diversity per marker ranged from 0.37 (BM053) to 0.87 (BM153). Observed heterozygosity, calculated as the proportion of heterozygous individuals in the population, ranged from 0.00 to 1.00, with an average of 0.30. Expected heterozygosity per SSR locus ranged from 0.37 (BM053) to 0.88 (BM153), with an average of 0.69. Nei’s gene diversity averaged 0.69. The polymorphic information content (PIC) values of the SSR markers varied from 0.33 (BM053) to 0.86 (BM153), with an average of 0.63 per locus. The greatest genetic distance (0.83) was between lines 49, 50, and 53 and the cultivar Karacaşehir-90, while the shortest (0.08) was between lines 6 and 26. In cluster analysis using Nei’s genetic distance, the 45 common bean genotypes were divided into three groups, and very little relationship was found between the grouping of the genotypes and their geographical origin. Genetic structure analysis formed three subgroups that included both local landraces and commercial varieties.
The results confirmed that the rich diversity in Ispir bean landraces could be used as a genetic resource in designing breeding programs and may also contribute to bean breeding programs in Türkiye.


Introduction
Bean (Phaseolus vulgaris L.) is one of the most important cultivated plants of the legume family worldwide in terms of total yield and cultivated area [1]. Beans consumed in different forms (green pods, immature or dried seeds) are a primary source of vegetable region source. In addition, a wide-ranging study has not yet been conducted to measure the genetic diversity of bean germplasm in Türkiye. This research was carried out to reveal the genetic diversity and population structure of landraces from the bean population grown in the Ispir district using SSR markers, to determine the degree of inbreeding among the landraces, and to identify lines suitable for breeding studies. The germplasm information from this study should therefore be useful for bean breeding. In addition, the findings are anticipated to contribute to the development of strategies to protect endangered bean genetic resources in the Erzurum-Ispir region.

Plant Material
The common bean genotypes used in this study were collected from the Ispir Valley in Northeast Anatolia, Türkiye. A total of 45 common bean genotypes, comprising 40 landraces and five nationally registered cultivars, were used for SSR analysis (Figure 1 and Table 1). In addition, some seed characteristics (100-seed weight, seed color, and seed shape) of the bean accessions are presented in Table 1.
Table 1. List of bean accessions with collection-site information, coordinates, and some seed characteristics (Figure 1).

DNA Extraction
Sample plants were grown in a greenhouse of the Atatürk University Field Crops Department. Bulk DNA of 45 individuals per accession was prepared from young leaves of 2-week-old plants in the Laboratory of Molecular Biology and Genetics, Department of Field Crops, Atatürk University, Türkiye. Genomic DNA was extracted as described by Zeinalzadehtabrizi et al. [31]. DNA quality was verified by electrophoresis on a 0.8% agarose gel, and DNA concentrations were determined with a NanoDrop® ND-1000 UV/Vis spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). For SSR analysis, the final DNA concentration was adjusted to 50 µg/mL. Diluted DNA samples were stored at −20 °C until SSR polymerase chain reaction (PCR).
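Adjusting each extract to the 50 µg/mL working concentration is a standard C1·V1 = C2·V2 dilution. The following is a minimal sketch of that arithmetic; the function name, the 400 µg/mL stock reading, and the 100 µL final volume are hypothetical values chosen for illustration and are not taken from the study.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock_volume, diluent_volume) from C1 * V1 = C2 * V2.

    Concentrations must share one unit (e.g. µg/mL) and volumes another (e.g. µL).
    """
    if stock_conc <= 0 or target_conc <= 0 or final_volume <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_conc > stock_conc:
        raise ValueError("stock is more dilute than the target")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


# Hypothetical spectrophotometer reading of 400 µg/mL, diluted to the
# 50 µg/mL working concentration in a 100 µL final volume.
stock_ul, diluent_ul = dilution_volumes(400.0, 50.0, 100.0)
print(f"{stock_ul:.1f} µL extract + {diluent_ul:.1f} µL diluent")
```

The same function can be applied per accession once each concentration has been measured.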

SSR Analysis
Twenty-seven SSR primer pairs were selected from previous studies based on their reliable amplification patterns and high polymorphic information content. There are three markers on the Pv01 chromosome, five on Pv02, five on Pv04, two on Pv06, two on Pv07, three on Pv08, and five on Pv09. There was only one marker each on the Pv01 (BM053 marker) and Pv05 (BM175 marker) chromosomes. None of the markers used in our study were located on the Pv10 and Pv11 chromosomes (Table 2). These primer pairs produced specific and stable DNA profiles in this study. PCR amplifications were performed in a Labcycler. The PCR mixture consisted of 10× buffer, 2 mM MgCl2, 0.25 mM of each dNTP, 2 µM (20 pmol) primer, 0.5 U Taq polymerase, and 50 µg/ng DNA template in a 20 µL reaction mixture. The amplification conditions were as follows: an initial denaturation step of 2 min at 95 °C; 37 cycles of 30 s at 95 °C, 60 s at 47-58 °C, and 60 s at 72 °C; and a final extension step of 5 min at 72 °C. The amplification products were resolved on a 3% agarose gel in 1× SB buffer at 150 V/cm for 120 min, stained with ethidium bromide (0.2 µg/mL), visualized on a UV transilluminator, and photographed with a Nikon Coolpix 5000. Fragment sizes (bp) were determined against a 50-1000 bp DNA ladder (Vivantis product no. NM2421) [32].
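The cycling conditions above can be written down as a small data structure, which makes the programmed run time easy to check. This is only an illustrative sketch: the `Step` class and function name are hypothetical, and the total excludes ramp times between temperatures, which depend on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    seconds: int
    temp_c: str  # kept as text because the annealing temperature is a range


INITIAL = Step("initial denaturation", 120, "95")
CYCLE = (
    Step("denaturation", 30, "95"),
    Step("annealing", 60, "47-58"),  # primer-specific, per the text
    Step("extension", 60, "72"),
)
FINAL = Step("final extension", 300, "72")
N_CYCLES = 37


def hold_time_seconds(initial, cycle, final, n_cycles):
    """Sum of programmed hold times; ramping between steps is not included."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


total = hold_time_seconds(INITIAL, CYCLE, FINAL, N_CYCLES)
print(f"{total} s = {total / 60:.1f} min of programmed hold time")
```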

Molecular Data Analysis
Amplified fragments at each SSR locus were scored as 1 (presence) or 0 (absence), and data matrices were constructed accordingly. Phylogenetic analysis was performed with MEGA software (v. 7.0.14). The dendrogram was constructed using the neighbor-joining method in MEGA with the maximum composite likelihood substitution model and bootstrapping with 1000 replicates [38]. A marker index was calculated for the SSR markers to characterize the capacity of each primer to detect polymorphic loci among the genotypes; it is the sum of the polymorphism information content (PIC) values of the SSR markers produced by a particular primer. The PIC value was calculated using the formula $\mathrm{PIC} = 1 - \sum_{i} p_i^{2}$ [39], where $p_i$ is the frequency of the $i$th allele at the locus. The PIC values provided an estimate of the discriminatory power of a locus by considering the number of alleles per locus and the relative frequencies of these alleles in the population. The genetic diversity within the genotypes was calculated with the PopGen program [40] using Nei's gene diversity index [41] and the Shannon information index [42]. The Structure 2.2 program was used to determine the genetic structure of the genotypes [43]. In many genetic diversity studies with beans, genotypes have been successfully divided into groups using the Structure program [44,45]. The F-statistic (FST) value reflects the difference between subpopulations [46]. Principal coordinate analysis (PCoA) was carried out with the GenAlEx program to better understand the diversity among genotypes. Groups were identified on the two-dimensional diagram formed by the first two coordinates and compared with the cluster analysis. Genetic variation within and between populations was examined with the GenAlEx program [47] using analysis of molecular variance (AMOVA).
FST measures the amount of genetic variance that can be explained by population structure, based on Wright's F-statistics. An FST value of 0 indicates no differentiation between the subpopulations, while a value of 1 indicates complete differentiation [48]. In addition, genetic indices such as the number of loci with private alleles, number of different alleles (Na), number of effective alleles (Ne), Shannon's information index (I), and unbiased expected (uHe) and expected (He) heterozygosity were calculated for each proposed geographic region using the GenAlEx 6.5 software [45].
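The per-locus indices named in this section all derive from the allele frequencies at a locus. The sketch below shows the textbook definitions under stated assumptions: it is not the PopGen or GenAlEx implementation, the function names are hypothetical, the allele calls are invented, and the Shannon index is assumed to use the natural logarithm. Note that the PIC formula as printed, one minus the sum of squared allele frequencies, is the same expression as Nei's gene diversity; the Botstein-style PIC is included only for comparison.

```python
import math
from collections import Counter


def allele_frequencies(alleles):
    """Relative frequency of each distinct allele in a list of allele calls."""
    counts = Counter(alleles)
    n = sum(counts.values())
    return [c / n for c in counts.values()]


def gene_diversity(freqs):
    """1 - sum(p_i^2): Nei's gene diversity, identical to the PIC formula as printed."""
    return 1.0 - sum(p * p for p in freqs)


def pic_botstein(freqs):
    """Botstein-style PIC: 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2 (comparison only)."""
    squares = [p * p for p in freqs]
    s = sum(squares)
    cross = s * s - sum(q * q for q in squares)  # equals sum_{i<j} 2 p_i^2 p_j^2
    return 1.0 - s - cross


def effective_alleles(freqs):
    """Number of effective alleles, Ne = 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)


def shannon_index(freqs):
    """Shannon information index, I = -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Hypothetical allele calls (fragment sizes in bp) at one locus, not real data.
calls = [150] * 6 + [154] * 3 + [160]
freqs = allele_frequencies(calls)
print(f"He={gene_diversity(freqs):.3f} PIC={pic_botstein(freqs):.3f} "
      f"Ne={effective_alleles(freqs):.3f} I={shannon_index(freqs):.3f}")
```

Because Ne is the reciprocal of the sum of squared frequencies, it equals 1 / (1 − gene diversity), so a locus with many evenly distributed alleles scores high on every index at once.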

SSR Marker Information
Twenty-seven SSR markers produced a total of 142 bands, and the number of alleles per locus ranged from 2 (PVTT001) to 12 (BM153), with an average of 5.26. The polymorphism rate for each SSR marker was 100%. The allele frequency varied between 0.20 (BM153) and 0.78 (BM053). Gene diversity was lowest for BM053 (0.37) and highest for BM153 (0.87). Gene diversity is an important parameter for estimating genetic variability between genotypes [41,49]. In a similar study, Dutta et al. [50] obtained a total of 150 alleles using 30 SSR markers in 52 Indian common bean genotypes; the number of alleles ranged from 1 to 19, with an average of 5 alleles per locus. Investigating genetic variation in 60 Brazilian bean genotypes, the authors of [25] obtained 196 polymorphic alleles from 85 SSR markers and reported 2 to 6 alleles per locus, with an average of 2.8. Zhang et al. [51] investigated genetic diversity in 229 Chinese native bean genotypes using 30 SSR markers; the number of alleles varied between 2 and 19, with an average of 5.5 alleles per locus and 116 alleles in total. The PIC value, which indicates the discriminatory power of a marker, averaged 0.63 in our analysis and ranged from 0.33 (BM053) to 0.86 (BM153). A high PIC value is one of the most important indicators that a marker can be used successfully in evaluating genetic variation [50]. Markers with high PIC values, such as BM141 (0.81), BMd1 (0.81), BM153 (0.86), and PVAT001 (0.81), are preferred in bean genetic diversity studies (Table 3). Other studies using SSR markers in beans reported PIC ranges of 0.23-0.87 [51], 0.03-0.70 [25], 0-0.79 [37], 0.38-0.94 [50], 0.40-0.82 [52], and 0.42-0.88 [4]; these values varied widely and are consistent with our results.
The number of effective alleles averaged 3.749 and varied between 1.578 (BM053) and 7.864 (BM153) across markers (Table 4). The observed heterozygosity was 0.30 on average. This relatively low heterozygosity may be due to the autogamous nature of the bean [22,45,53]. However, our value is higher than those reported in some other studies, e.g., from Kyrgyzstan (0.05) [45], India (0.019) [54], and Brazil (0.16) [25]. The expected heterozygosity, with a mean of 0.693, was lowest (0.370) at the BM053 locus and highest (0.882) at the BM153 locus (Table 4). Similar to our findings, Blair et al. [44], using 36 SSR markers in 104 wild bean genotypes, reported a mean expected heterozygosity of 0.66, with the highest value (0.96) obtained from the PVAT001 marker. Zargar et al. [46], using 15 RAPD and 23 SSR markers in 51 Indian bean genotypes, determined expected heterozygosity values of 0.2192, 0.2124, and 0.2821 in the first, second, and third subpopulations, respectively. In this study, the Shannon information index (I) ranged from 0.663 (GATS91) to 2.202 (BM153), with an average of 1.343 (Table 4). The high Shannon information index in our study showed that the SSR markers employed were useful in determining genetic diversity [55]. Gioia et al. [13] reported the Shannon information index to range from 0.19 to 0.74 (mean 0.66) with 58 SSR markers in 192 bean genotypes. In another study using 65 Vigna umbellata genotypes and 28 SSR markers, six geographical groups were formed, and the Shannon information index varied between 0.845 and 1.019 [56]. In contrast, Öztürk et al. [2] investigated genetic diversity in 75 bean genotypes using 27 iPBS markers and found the Shannon information index to range from 0.570 to 0.636 (mean 0.599).
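One way to read the Shannon values reported here is against their upper bound: for a locus with Na alleles, I cannot exceed ln(Na), which is reached only when all alleles are equally frequent. The sketch below computes that bound and the ratio I / ln(Na) for the two extreme markers, assuming the index was computed with the natural logarithm and using the allele counts stated in the text (2 for GATS91, 12 for BM153). The function names are hypothetical.

```python
import math


def max_shannon(n_alleles):
    """Upper bound of the Shannon index for a locus with n_alleles alleles."""
    if n_alleles < 2:
        raise ValueError("at least two alleles are needed")
    return math.log(n_alleles)


def evenness(shannon_i, n_alleles):
    """Observed Shannon index as a fraction of its maximum, ln(n_alleles)."""
    return shannon_i / max_shannon(n_alleles)


# (reported I, number of alleles) for the two extreme markers named in the text.
reported = {"GATS91": (0.663, 2), "BM153": (2.202, 12)}
for marker, (i_value, n_alleles) in reported.items():
    print(marker,
          f"max I = {max_shannon(n_alleles):.3f}",
          f"evenness = {evenness(i_value, n_alleles):.3f}")
```

Both reported values fall below their respective bounds, as they must, and the ratio indicates how evenly the alleles at each locus are distributed.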

Cluster Analysis
Comparative analysis of molecular data makes it possible to determine the proximity or distance between genotypes and to display clusters of genotypes by constructing a phylogenetic tree. For this purpose, cluster analysis of the bean genotypes was performed by the neighbor-joining method with the maximum composite likelihood substitution model, identifying three clusters. Given the high cophenetic correlation coefficient, the dendrogram was assumed to represent the similarity matrix very well. Cluster III consisted of two subgroups: the Aras-98, Elkoca-05, Göynük-98, Yakutiye-98, and Karacaşehir-90 cultivars were included in the first subcluster, along with five Ispir bean lines, and accessions 63, 64, 65, and 69 were included in the second subcluster. Cluster I also consisted of two subgroups, with twenty-three accessions in the first and four in the second. Cluster II contained eight accessions (Figure 2 and Table 5). This clustering showed that there was no significant relationship between geographic origin and genetic similarity, which suggests some level of gene flow between genotypes or a recent introduction from a common source. In a similar study that used 30 SSR markers in 50 bean genotypes, comprising 38 local bean genotypes from the Northeast Anatolia region and 12 registered varieties, the genotypes clustered into two groups [57]. Öztürk et al. [2], who investigated genetic diversity using 27 iPBS markers in 71 bean genotypes and 4 commercial varieties collected from Erzincan, also determined that the genotypes clustered into two groups, each of which was further divided into two subgroups. The authors of [58] studied genetic diversity in beans using 26 iPBS primers and found that the bean accessions were divided into three main clusters.
However, while three subgroups were formed in our study, five subgroups were identified by those researchers.
Knowing the genetic distances between genotypes provides an enormous advantage in selecting suitable parents for bean breeding programs. In this study, the greatest genetic distance (0.83) was determined between the Karacaşehir-90 variety and Ispir bean lines 49, 50, and 53. The shortest genetic distances were observed between lines 6 and 26 (0.08) and between lines 27 and 28 (0.09) (Table S1).
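Selecting parents from a distance matrix amounts to computing all pairwise distances and reading off the extremes. The sketch below assumes Nei's (1972) standard genetic distance computed from per-locus allele frequencies; the text says only "Nei's genetic distance", so the exact variant is an assumption, and the three entries and their frequencies are invented for illustration.

```python
import math
from itertools import combinations


def nei_distance(freqs_x, freqs_y):
    """Nei's (1972) standard genetic distance, D = -ln(Jxy / sqrt(Jx * Jy)).

    freqs_x, freqs_y: one dict per locus mapping allele -> frequency.
    """
    jx = jy = jxy = 0.0
    for locus_x, locus_y in zip(freqs_x, freqs_y):
        jx += sum(p * p for p in locus_x.values())
        jy += sum(p * p for p in locus_y.values())
        jxy += sum(p * locus_y.get(a, 0.0) for a, p in locus_x.items())
    if jxy == 0.0:
        return math.inf  # no shared alleles at any locus
    return -math.log(jxy / math.sqrt(jx * jy))


def extreme_pairs(profiles):
    """Return (closest pair, most distant pair, all pairwise distances)."""
    dist = {(a, b): nei_distance(profiles[a], profiles[b])
            for a, b in combinations(sorted(profiles), 2)}
    return min(dist, key=dist.get), max(dist, key=dist.get), dist


# Hypothetical entries scored at two loci (allele -> frequency), not real data.
profiles = {
    "A": [{"a1": 1.0}, {"b1": 0.5, "b2": 0.5}],
    "B": [{"a1": 0.9, "a2": 0.1}, {"b1": 0.5, "b2": 0.5}],
    "C": [{"a2": 1.0}, {"b2": 1.0}],
}
closest, farthest, dist = extreme_pairs(profiles)
print("closest:", closest, "most distant:", farthest)
```

In a breeding context, the most distant pairs are the usual starting candidates for crosses, while near-zero distances flag possible duplicates in the collection.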

Determination of Genetic Diversity Based on Principal Coordinate Analysis
Principal coordinate analysis (PCoA) is a multivariate technique that makes it possible to find and plot the key patterns in data covering multiple loci and multiple samples. With this technique, the distances between groups on the two-dimensional diagram, which is derived from the similarity or distance matrix between individuals, reflect the actual distances [59]. PCoA is used to provide a spatial representation of the relative genetic distances between populations [60].
In our study, PCoA was performed using Nei's genetic distance. The percentages of genetic diversity explained by the first three principal coordinates were 20.57, 16.96, and 13.33; together, these three coordinates explained 50.85% of the diversity. Although the groups were not completely separated in the two-dimensional diagram of the first two coordinates, the distribution of genotypes on the diagram indicated the presence of genetic diversity (Figure 3). This distribution also shows that genetic diversity is weak both among the commercial varieties and between the Maden Village genotypes and the Elmalı Town Agıldere Village genotypes. Klaedtke et al. [60] reported that the PCoA they applied to 15 bean genotypes using SSR markers grouped the genotypes in a meaningful way and that the first two coordinates explained 77.7% of the total variation. In their study using SSR markers in 349 wild and cultivated bean genotypes from the Andean and Mesoamerican gene pools, Kwak and Gepts [53] found that the results of PCoA and genetic structure analysis were similar and that the first two coordinates explained 66% of the total variation.
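The "percentage explained" figures in a PCoA come from the eigenvalues of the double-centred squared-distance matrix. The following is a minimal pure-Python sketch of that calculation, not the GenAlEx procedure: it uses power iteration with deflation in place of a full eigendecomposition, it divides by the matrix trace (which assumes no negative eigenvalues, as for Euclidean distances), and the four points are invented so that the answer can be checked by hand.

```python
import math


def double_center(dist):
    """Gower-centre a symmetric distance matrix: B = -1/2 * J * D^2 * J."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    return [[a[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def top_eigenpair(b, iterations=500):
    """Dominant eigenvalue and unit eigenvector of symmetric b by power iteration."""
    n = len(b)
    v = [1.0 / (i + 1) for i in range(n)]  # arbitrary non-constant start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * bv[i] for i in range(n)), v


def pcoa_percentages(dist, n_axes=2):
    """Percentage of total variation carried by the first n_axes coordinates."""
    b = double_center(dist)
    n = len(b)
    total = sum(b[i][i] for i in range(n))  # trace = sum of all eigenvalues
    if total <= 0.0:
        raise ValueError("distance matrix carries no variation")
    percentages = []
    for _ in range(n_axes):
        lam, v = top_eigenpair(b)
        percentages.append(100.0 * lam / total)
        # Deflate: remove the axis just found before looking for the next one.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return percentages


# Hypothetical points at the corners of a 2 x 1 rectangle, not real genotypes.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (2.0, 1.0)]
dist = [[math.hypot(px - qx, py - qy) for (qx, qy) in points] for (px, py) in points]
print([round(p, 1) for p in pcoa_percentages(dist)])
```

For real genetic distances, which are often not Euclidean, some eigenvalues can be negative; established software handles that case explicitly, and this sketch does not.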

Molecular Variance Analysis
Analysis of molecular variance (AMOVA) revealed that within-population variance (66%) was higher than between-population variance (34%) (Table 6). This result indicates that there is gene flow between populations [45]. Blair et al. [44], in their study with 36 SSR markers in 104 wild bean genotypes, determined that the variance within populations was 98%. In contrast to our findings, another study using 11 SSR markers in 28 bean genotypes grown in Kyrgyzstan [45] reported that the variance between populations was higher. Rebaa et al. [55], who used 21 genotypes and 8 SSR markers in broad beans, noted an intra-population variance of 83% and an inter-population variance of 17%; their molecular variance analysis using SSR data thus revealed significant intra-population genetic variation. Similar results, in which the within-population variance is higher than the between-population variance, have been reported in other plant species such as apple [61] and lettuce [62]. The summary statistics for the nine populations are listed in Table 7. The He value ranged from 0.038 (Ic) to 0.189 (Ov) (mean 0.115), while the uHe value ranged from 0.051 (Ic) to 0.196 (Ov) (mean 0.131). The I value among the nine populations ranged from 0.055 (Ic) to 0.290 (Ov) (mean 0.173). The percentage of polymorphic loci (PPL) ranged from 9.15% (Ic) to 64.08% (Ov) (mean 33.41%), with the two lowest values in Ic (9.15%) and Mka (15.49%). The Nei genetic (h) values of the nine bean populations are presented in Table 8. Among the nine bean populations from Ispir, the smallest h value was observed for Ev/Mv (0.087) and the greatest for Mka/Ic (0.341).
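The AMOVA partition can be summarised as a single differentiation ratio: the among-population variance divided by the total variance. The sketch below applies that ratio to the percentages quoted in the text for this study and for Rebaa et al. [55]; the function name is hypothetical, and treating the ratio as the AMOVA differentiation statistic is an assumption about how the variance table would be summarised, not a value reported by the authors.

```python
def among_fraction(among, within):
    """Share of the total molecular variance that lies among populations."""
    total = among + within
    if total <= 0:
        raise ValueError("variance components must sum to a positive value")
    return among / total


# Percentages quoted in the text (among, within).
studies = {
    "this study": (34.0, 66.0),
    "Rebaa et al. [55]": (17.0, 83.0),
}
for label, (among, within) in studies.items():
    print(f"{label}: {among_fraction(among, within):.2f}")
```

For this study the ratio is 0.34, the same figure as the mean FST reported in the genetic structure analysis.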

Genetic Structure Analysis
In many genetic diversity studies with beans, genotypes have been successfully separated into groups using the Structure program [44,45]. In this study, the population structure of the 45 bean genotypes was inferred from the SSR data, and three subpopulations were obtained, with little mixing of genotypes, regardless of geographical distribution (Figure 4). Geographical distribution is an important factor in the genetic diversity of species [2,63]. In this study, the proximity of the sample collection sites may be the reason for the mixing of these three populations [46]. The low number of populations in our study (K = 3) is due to the high rate of gene flow between the regions where the samples were taken [2,23]. According to these data, the first subpopulation contains 17 local genotypes, the second contains 9 local genotypes together with the 5 commercial varieties, and the third contains 14 local genotypes (Table 9). Bean genotypes and the geographic distributions of populations are presented in Figure 5. The FST value was 0.34, 0.26, and 0.41 in the first, second, and third subpopulations, respectively, and the mean FST value of 0.34 confirmed the separation of the subpopulations and the diversity in SSR alleles [44] (Table 10). Evaluating genetic diversity in 149 common bean genotypes using 24 SSR markers, Sharma et al. [54] determined that the genotypes were divided into three subpopulations. Zargar et al. [46], who performed genetic structure analysis and UPGMA clustering analysis on 51 Indian bean genotypes using 15 RAPD and 23 SSR markers, stated that three groups were formed in both analyses.
They emphasized that the FST values obtained from the genetic structure analysis (0.4047, 0.3799, and 0.2059 for the first, second, and third subgroups, respectively) are a strong indicator of the effective separation of the subpopulations and of the diversity in the SSR alleles. Khaidizar et al. [57] reported higher genetic polymorphism when they used SSRs to investigate the level of polymorphism in Turkish common bean genotypes, which included most of the genotypes used in the study by Ceylan et al. [19]. Consistent with several previous studies, the cluster analysis resulted in two major clusters, possibly representing the two major gene pools, Andean and Mesoamerican. It was stated that the small-seeded cultivars, which clustered separately from the others in both plastid and nuclear marker analyses, may belong to the Mesoamerican gene pool.
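The FST scale used in this section, from 0 for no differentiation to 1 for complete differentiation, follows from the textbook single-locus definition FST = (HT − HS) / HT, where HS is the mean expected heterozygosity within subpopulations and HT is the expected heterozygosity of the pooled population. The sketch below illustrates that definition only; the Structure program estimates its per-cluster FST values from its own model, so this is not a reimplementation of it, and the function names and frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for one locus."""
    return 1.0 - sum(p * p for p in freqs)


def wright_fst(subpop_freqs, weights=None):
    """Single-locus FST = (HT - HS) / HT.

    subpop_freqs: one allele-frequency list per subpopulation, same allele order.
    weights: relative subpopulation sizes; equal weights if omitted.
    """
    k = len(subpop_freqs)
    if weights is None:
        weights = [1.0 / k] * k
    hs = sum(w * expected_heterozygosity(f) for w, f in zip(weights, subpop_freqs))
    n_alleles = len(subpop_freqs[0])
    pooled = [sum(w * f[a] for w, f in zip(weights, subpop_freqs))
              for a in range(n_alleles)]
    ht = expected_heterozygosity(pooled)
    if ht == 0.0:
        return 0.0  # a single fixed allele everywhere: nothing to differentiate
    return (ht - hs) / ht


# Hypothetical two-allele locus in two equally sized subpopulations.
print(wright_fst([[0.5, 0.5], [0.5, 0.5]]))   # identical frequencies
print(wright_fst([[1.0, 0.0], [0.0, 1.0]]))   # fixed for different alleles
print(round(wright_fst([[0.8, 0.2], [0.3, 0.7]]), 3))
```

The first two calls reproduce the two ends of the scale, and intermediate values arise when allele frequencies differ between subpopulations without reaching fixation.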

Conclusions
Assessment of the genetic variability of germplasm is the first step, termed prebreeding, in the improvement and development of superior cultivars. In the present study, genotypes collected from the Erzurum-Ispir region, located in the Northeastern Anatolia region of Türkiye, were evaluated at the molecular level. Our results showed a high level of genetic diversity within the population. It is important to collect local varieties and determine their genetic diversity in order to protect bean genetic resources and use them in breeding studies. Knowledge of the genetic diversity and population structure of these genotypes may assist in the efficient management of this natural bean germplasm. The results of this research have shown that the SSR marker system can be used successfully in determining genetic diversity among Ispir bean genotypes. These results are anticipated to guide the selection of appropriate markers in genetic diversity studies in beans.