Genetic Diversity in a Core Subset of Wild Barley Germplasm

Wild barley [Hordeum vulgare ssp. spontaneum (C. Koch) Thell.] is a part of the primary gene pool with valuable sources of beneficial genes for barley improvement. This study attempted to develop a core subset of 269 accessions representing 16 countries from the Plant Gene Resources of Canada (PGRC) collection of 3,782 accessions, and to characterize them using barley simple sequence repeat (SSR) markers. Twenty-five informative primer pairs were applied to screen all samples and 359 alleles were detected over seven barley chromosomes. Analyses of the SSR data showed the effectiveness of the stratified sampling applied in capturing country-wise SSR variation. The frequencies of polymorphic alleles ranged from 0.004 to 0.708 and averaged 0.072. More than 24% or 7% SSR variation resided among accessions of 16 countries or two regions, respectively. Accessions from Israel and Jordan were genetically most diverse, while accessions from Lebanon and Greece were most differentiated. Four and five optimal clusters of accessions were obtained using STRUCTURE and BAPS programs and partitioned 16.3% and 20.3% SSR variations, respectively. The five optimal clusters varied in size from 15 to 104 and two clusters had only country-specific accessions. A genetic separation was detected between the accessions east and west of the Zagros Mountains only at the country, not the individual, level. These SSR patterns enhance our understanding of the wild barley gene pool, and are significant for conserving wild barley germplasm and exploring new sources of useful genes for barley improvement.


Introduction
Recent decades have seen an increasing research effort to conserve and explore germplasm of crop wild relatives [1,2]. Crop wild relatives are weedy plants closely related to crops, including their progenitors; they harbour beneficial traits such as pest or disease resistance and high yield [3], and represent the best genetic hope for improving genetically impoverished cultivars for human food production [4,5]. Successful introgression of exotic disease- and pest-resistance genes from wild into cultivated plants of many crops has been well documented (e.g., see [6,7]). With advances in molecular technology and plant breeding, such introgression is expected to play an increasing role in unlocking the genetic potential of wild relatives for crop improvement [8,9]. However, challenges in the conservation and utilization of these weedy plants largely remain, particularly in the collection of extant populations, the characterization of related gene pools and the search for beneficial genes of agricultural importance.
Wild barley [Hordeum vulgare ssp. spontaneum (C. Koch) Thell.] is a weedy, annual, predominantly self-fertilizing diploid (2n = 2x = 14) plant with a two-rowed, brittle rachis form [10]. Its natural range presumably extends from the eastern Aegean islands and (possibly) Egypt over the Middle East through Iran and eastward to Afghanistan, western Pakistan, and Tajikistan [11]. This plant grows in a wide range of habitats, from meadows and mesic steppes to semiarid and arid regions, and in man-made habitats such as abandoned fields and roadsides [12]. Wild barley is fully inter-fertile with cultivated barley (Hordeum vulgare ssp. vulgare L.), is considered the progenitor of cultivated barley, and forms part of the primary gene pool for barley improvement [13,14]. It is known to harbor rich sources of beneficial genes for barley breeding, such as genes associated with resistance to powdery mildew [15], abiotic stress tolerance [16,17], important agricultural traits [18,19] and quality traits [20].
Considerable attention has been paid to collection and conservation of wild barley in the last 60 years [12,13,21]. Around 23,300 accessions of wild barley and other related Hordeum species are currently conserved in seed genebanks worldwide [22]. The Institute for Cereal Crops Improvement, Tel-Aviv University, currently holds the largest wild barley collection available for distribution, with 6,637 accessions, followed by Plant Gene Resources of Canada (PGRC; the Canadian national seed genebank) at Saskatoon with a unique world collection of 3,782 accessions. This ranking does not include the largest wild barley collection of 14,648 accessions, preserved but not yet available for distribution, at the John Innes Centre in the United Kingdom [23]. These accessions resulted from a joint UK-Israeli expedition and were obtained from relatively few collecting sites. To summarize the wild barley genetic diversity, two core subsets of wild barley (one with 70 accessions and another with 144 accessions) were established in the 1990s and formed a part of the International Barley Core Collection (BCC) [24]. Another expanded subset with 318 accessions was also assembled in 2007 as the Wild Barley Diversity Collection (WBDC) to exploit its genetic diversity for important agricultural traits [25]. These subsets are currently maintained at the International Centre for Agricultural Research in Dry Areas (ICARDA), Aleppo, Syria. The larger and expanded subsets have been characterized using isozyme [26] and other molecular markers [27], respectively. However, these subsets have accessions of unknown origin and do not fully represent the species distribution.
Efforts have been made to characterize wild barley germplasm using different genetic markers such as isozyme (e.g., [28]), RAPD (e.g., [29]), AFLP (e.g., [30]), SSR (e.g., [31]) and SNP (e.g., [32]).These characterizations largely focused on the level of polymorphism within and between a number of natural populations, the distribution of diversity and its relationships with ecogeographical factors.It is found that wild barley, although dominantly self-pollinated, has a high level of genetic diversity (e.g., see [28,33]).Larger genetic diversity was observed among than within populations (e.g., [34]).Much of the genetic variation is correlated to adaptive traits such as ecological and geographic parameters (e.g., see [12]).The Near East, particularly Israel and Jordan, is the center of genetic diversity for wild barley (e.g., see [35]).A strong geographic differentiation was observed in several genes [36][37][38].However, some of these studies were limited in sample size (e.g., [36]) and focused only on the populations in the Near East with insufficient geographic sampling (e.g., [35]).Also, the patterns of genetic variability may vary among various genetic markers used (e.g., [39]).
The objectives of this study were to develop a core subset of wild barley from the PGRC wild barley collection and to analyze the genetic diversity and structure in the core subset using 25 informative barley SSR markers.The SSR markers were applied, mainly because they are often codominant, highly reproducible, ubiquitous in eukaryotic genomes, and display high allelic variation, probably due to unequal crossing-over or slipped strand mis-pairing during replication (see [31] and literature cited therein).It is our hope that this study, along with those on cultivated barley [40], will generate the baseline genetic information to facilitate the germplasm evaluation and accessibility for conservation and barley improvement [25,41].

Wild Barley Collection and Core Subset
The PGRC wild barley collection currently maintains 3,782 accessions representing 16 countries from Morocco in the western region to Japan in the eastern region (Table S1). It has been assembled since the 1970s, largely through several collection expeditions in the Near East and further west to Morocco during the 1980s-90s by the Canadian scientists Drs. Sati Jana, Bernard Baum, John Martens and George Fedak (e.g., see [33]), through duplication of part of the National Small Grain Collection, USDA-ARS, United States [21], and through continued minor acquisitions, particularly from China and Ethiopia. In spite of these efforts, the collection has only two accessions from the Tibetan Plateau and completely lacks representation from several other countries such as Libya, Russia, Pakistan, Azerbaijan and Uzbekistan. Based on the existing collection, a core subset of 269 accessions representing 16 countries was developed through random sampling stratified with respect to country of origin, with the number of accessions drawn per country roughly proportional to the natural logarithm of the number of accessions held for that country [42]. The detailed information on the selected accessions, including geographic records for 49 accessions, is given in supplemental materials Table S1. To infer regional structure, the accessions were grouped based on the distinct distribution of wild barley described in [43,44]. The eastern region represents wild barley from the Zagros Mountains and further east, while the western region includes wild barley from the Fertile Crescent and further west.
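The stratified, log-weighted sampling described above can be sketched as follows. This is only an illustration under stated assumptions: it reads the allocation rule as "picks per country proportional to ln(number of accessions held)", and the function names `log_allocation` and `stratified_sample`, the rounding rule, and the handling of single-accession countries are hypothetical choices, not the procedure of [42]. Because of rounding, the allocated sizes may not sum exactly to the requested total.

```python
import math
import random


def log_allocation(counts, total):
    """Split `total` picks across countries in proportion to ln(accessions held).

    `counts` maps country -> number of accessions held (assumed >= 1).
    """
    weights = {c: math.log(n) for c, n in counts.items() if n > 1}
    weight_sum = sum(weights.values())
    alloc = {}
    for country, n in counts.items():
        if n <= 1:
            # ln(1) = 0 would give no picks; keep the single accession (assumption).
            alloc[country] = n
        else:
            share = round(total * weights[country] / weight_sum)
            # Take at least one accession, and never more than the country holds.
            alloc[country] = max(1, min(n, share))
    return alloc


def stratified_sample(accessions_by_country, total, seed=0):
    """Randomly draw the allocated number of accession IDs within each country."""
    rng = random.Random(seed)
    counts = {c: len(ids) for c, ids in accessions_by_country.items()}
    alloc = log_allocation(counts, total)
    return {
        c: sorted(rng.sample(list(ids), alloc[c]))
        for c, ids in accessions_by_country.items()
    }
```

With the log weighting, a country holding ten times as many accessions as another contributes far fewer than ten times as many picks, so small national collections stay represented in the subset.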

DNA Extraction and SSR Analysis
About 10 seeds were randomly selected from each accession and grown in the greenhouse at the Saskatoon Research Centre. Leaf tissue from individual two-week-old greenhouse-grown seedlings was collected and freeze-dried for 24 h. Genomic DNA was extracted from leaf tissue of one individual seedling randomly selected from each accession using a DNeasy Plant Mini kit (Qiagen, Mississauga, ON, Canada), quantified using a Thermo Scientific NanoDrop 8000 spectrophotometer, and adjusted to a final concentration of 15 ng·µL⁻¹ with water.
Twenty-five informative barley SSR primer pairs were selected based on the published marker information and genomic coverage from a consensus map for cultivated barley (Table 1; [45]), and these sequence-specific primers were synthesized by IDT (Coralville, Iowa, U.S.A.). A three-primer PCR method [46] was applied: a unique reverse primer, a unique forward primer with a T7 universal tail (T7 = TAATACGACTCACTATAGGG), and a third fluorescently labelled T7 primer containing one of four tags: FAM, VIC, PET, or NED. Fluorescently labelled primers were synthesized by Applied Biosystems-Life Technologies (Burlington, Ontario, Canada). Primers were re-suspended in water at 50 pmol·µL⁻¹. The PCR reactions were set up as: 1x New England Biolabs (NEB, Pickering, Ontario, Canada) Standard Buffer containing 1.5 mM MgCl₂, 0.1 mM each dNTP (Promega/Fisher Scientific, Nepean, Ontario, Canada), 188 fmol·µL⁻¹ reverse primer, 19 fmol·µL⁻¹ forward primer with T7 universal tail, 63 fmol·µL⁻¹ fluorescently labelled T7 universal primer, 1 U Taq polymerase (NEB), and 30 ng template genomic DNA in a final volume of 20 µL. Samples were denatured at 94 °C for 3 min, selectively amplified with 22 cycles of 94 °C for 30 s, 56 °C for 30 s, and 72 °C for 45 s, and then immediately, in the same tube, fluorescently labelled with 22 additional cycles of 94 °C for 10 s, 47 °C for 20 s, and 72 °C for 60 s, with a final extension at 72 °C for 10 min.
Samples were diluted by combining 1 µL from each of four different reactions labelled with four different fluorescent labels and adjusting to 50 µL with water. Size standard was prepared by adding 24 µL of GeneScan 600 LIZ size standard to 890 µL of Hi-Di formamide (Applied Biosystems-Life Technologies, Burlington, Ontario, Canada). One microliter of diluted sample was added to 9 µL of the size standard-formamide mixture. Samples were analyzed on an Applied Biosystems 3130xl genetic analyzer using POP-7 polymer (Applied Biosystems-Life Technologies, Burlington, ON, Canada) and a 36 cm capillary array. Data were collected using GeneMapper Software Version 4.1 with the bin size set at 2 bp. To minimize technical and scoring errors, six checks were repeated on each of three plates. The resulting data were manually proofread.

Data Analysis
The scored alleles were first assessed for consistency of the checks, and alleles of a primer pair with an error rate of 5% or higher over the three plates were discarded. The SSR data were analyzed for the level of polymorphism with respect to each primer, the accession country of origin, and the proposed accession region by counting the number of alleles, unique alleles, and rare alleles (with a frequency of 0.05 or smaller), generating the summary statistics on the allelic frequencies, and calculating the polymorphic information content (PIC) following the method described in [47]. This was done using a SAS program written in SAS IML [48]. To visualize the variation pattern, the numbers of alleles detected by all primer pairs were plotted against their frequencies of occurrence in all assayed accessions. To assess the impact of the variable number of accessions for each country on the polymorphism observed, a linear regression on the number of accessions (as the independent variable) was made using SAS PROC REG [48] for each of five dependent variables: the number of alleles, the number of unique alleles, the number of rare alleles, the mean allelic frequency, and the group-specific proportion of variation (or Fst) obtained from the analysis of molecular variance (AMOVA; [49]) described below. The differences in allelic richness between accessions of each pair of countries or of the two regions were further assessed following the random permutation procedure described in [50]. The random permutation procedure allows for a significance test of the difference in allelic counts between groups of variable accession numbers.
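The idea behind a permutation test on allelic counts can be sketched as below. This is a generic label-shuffling test, not necessarily the exact procedure of [50]: the names `allele_count` and `permutation_test`, the one-sided comparison, and the data layout (each accession as a tuple with one allele call per locus, `None` for missing) are assumptions made for illustration. Shuffling keeps the two group sizes fixed, so the null distribution already reflects the fact that a larger group tends to show more alleles.

```python
import random


def allele_count(group):
    """Number of distinct (locus, allele) pairs observed in a group of accessions."""
    return len({
        (locus, allele)
        for accession in group
        for locus, allele in enumerate(accession)
        if allele is not None
    })


def permutation_test(group_a, group_b, n_perm=1000, seed=0):
    """One-sided test: does group A carry more alleles than B beyond chance?

    Returns (observed difference in allele counts, permutation p-value).
    """
    rng = random.Random(seed)
    observed = allele_count(group_a) - allele_count(group_b)
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Reassign accessions to the two groups at random, keeping group sizes.
        rng.shuffle(pooled)
        diff = allele_count(pooled[:size_a]) - allele_count(pooled[size_a:])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `permutation_test(country_a, country_b)` on two lists of accession tuples returns the observed difference and an estimated p-value; more permutations give a finer p-value resolution.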
The genetic structure of the assayed accessions was first inferred using prior defined grouping with respect to accession origin: country and region. The inferences were specifically done by estimating genetic distances among groups and clustering the prior defined groups using Arlequin version 3.5 [51]. The Arlequin program provides a partition of total SSR variation into within- and among-group variation components, and a measure of inter-group genetic distances as the proportion of total SSR variation residing between wild barley accessions of any two groups [49]. Significance of the resulting variance components and inter-group genetic distances was tested with 10,100 random permutations. The neighbor-joining clustering of the prior defined groups was made using NTSYS-PC 2.01 [52], based on the AMOVA-based estimates of genetic distances with respect to country of origin. As the structural inference may be biased by variable accession numbers for these countries, an additional AMOVA was also made for countries with more than six accessions.
The genetic structure was also inferred using model-based Bayesian methods.The Bayesian method available in the BAPS software [53] was applied to estimate the number of clusters.Clustering of individual accessions was done using the model for non-linked markers and 20 replicate runs of the algorithm with the upper-bound values (K) for the number of clusters ranging between 2 and 10.For further verification, the other Bayesian method available in the program STRUCTURE version 2.2.3 [54,55] was used to detect population structure and to assign accessions to subpopulations.The STRUCTURE program was run 30 times for each subpopulation (K) value, ranging from 2 to 10, using the admixture model with 10,000 replicates for burn-in and 10,000 replicates during analysis.The final population subgroups were determined based on (1) likelihood plot of these models, (2) the change in the second derivative (∆K) of the relationship between K and the log-likelihood [56], and (3) stability of grouping patterns across 30 runs.For a given K with 30 runs, the run with the highest likelihood value was selected to assign the posterior membership coefficients to each accession.A graphical bar plot was then generated with the posterior membership coefficients.
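Criterion (2) above can be illustrated with a short sketch of the ΔK statistic. It assumes the commonly used form ΔK = |mean L(K+1) − 2·mean L(K) + mean L(K−1)| / sd(L(K)), computed from the log-likelihoods of replicate runs; whether this matches [56] term for term is an assumption, and the function name `delta_k` is hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: deltaK} for each K that has both neighbours K-1 and K+1.

    `lnp_by_k` maps K -> list of log-likelihoods from replicate runs (>= 2 runs).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            # Absolute second-order difference of the mean log-likelihood curve.
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            spread = stdev(lnp_by_k[k])
            if spread > 0:  # skip K values whose replicate runs are identical
                result[k] = second_diff / spread
    return result
```

The K with the largest ΔK marks the sharpest bend in the likelihood curve. By construction the statistic is undefined for the smallest and largest K tested, which is one reason it is usually read together with the likelihood plot and the run-to-run stability of the groupings.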
The optimal genetic structure obtained from the BAPS program was further analyzed with respect to accession country and region to characterize its feature.The genetic divergence among inferred clusters was assessed based on Nei's minimum distance generated from BAPS.The inferred genetic structure was further compared for consistency with the genetic relationships of individual accessions obtained from two commonly applied approaches.A principal component analysis (PCA) of the 269 accessions was performed using NTSYS-PC 2.01 [52] based on the similarity matrix of 359 SSR alleles, and plots of the first three resulting principal components were made to assess the accession associations.A neighbor-joining (NJ) analysis of all 269 accessions was also made using PAUP* [57] based on the original data of 359 SSR alleles and a radiation tree was displayed using MEGA 3.01 [58].The resulting PCA plots and NJ trees were individually labeled for the inferred structures.
Additional AMOVA analyses were also made based on the optimal genetic structure derived from the two Bayesian analyses above.The average dissimilarity of each accession was estimated following the method described in [59].This average dissimilarity measures the overall genetic difference between an accession of interest and the remaining 268 accessions assayed, reflecting its genetic distinctiveness within the core subset.
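A minimal sketch of an average-dissimilarity calculation is given below. It assumes each accession is coded as a 0/1 vector of allele presence or absence and that pairwise dissimilarity is the simple mismatch proportion; the actual measure in [59] may differ, and the function names are hypothetical.

```python
def simple_mismatch(a, b):
    """Proportion of allele positions at which two 0/1 profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_dissimilarity(profiles):
    """Map each accession to its mean dissimilarity against all other accessions.

    `profiles` maps accession ID -> 0/1 allele vector; at least two are needed.
    """
    names = list(profiles)
    result = {}
    for name in names:
        others = [
            simple_mismatch(profiles[name], profiles[other])
            for other in names
            if other != name
        ]
        result[name] = sum(others) / len(others)
    return result
```

An accession whose profile differs from most others receives a high value, which is how the statistic flags genetically distinct accessions within the subset.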

SSR Polymorphism
The assayed SSR primers detected a total of 359 alleles at 25 loci in the 269 wild barley accessions (Table 1).These alleles sampled both transcribed and non-transcribed regions of seven barley chromosomes.The number of loci per chromosome ranged from 2 to 5. The number of alleles detected per locus ranged from 4 to 35, and the number of rare alleles (with a frequency of 0.05 or smaller) per locus ranged from 0 to 25.The average allele frequency for all alleles at a locus ranged from 0.029 to 0.253.Overall, the observed allelic frequencies ranged from 0.004 to 0.708 with an average of 0.072.Specifically, 212 of the 359 alleles were rare alleles; 290 alleles had a frequency of 0.1 or smaller; and 20 alleles had a frequency of 0.3 or larger.The PIC value for each marker ranged from 0.488 to 0.947 and averaged 0.794 (Table 1).
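The per-locus statistics reported above (number of alleles, number of rare alleles, mean allelic frequency, PIC) can be sketched as follows. The sketch assumes one allele call per accession at each locus and the simplified form PIC = 1 − Σ pᵢ², which may differ from the formula in [47]; the function name `allele_summary` is hypothetical.

```python
from collections import Counter


def allele_summary(calls, rare_cutoff=0.05):
    """Summarise one locus from a list of allele calls (None = missing data)."""
    observed = [allele for allele in calls if allele is not None]
    if not observed:
        raise ValueError("no allele calls at this locus")
    n = len(observed)
    freqs = {allele: count / n for allele, count in Counter(observed).items()}
    return {
        "n_alleles": len(freqs),
        # Rare alleles: frequency of 0.05 or smaller, as defined in the text.
        "n_rare": sum(1 for p in freqs.values() if p <= rare_cutoff),
        "mean_freq": sum(freqs.values()) / len(freqs),
        # Simplified PIC (assumption): one minus the sum of squared frequencies.
        "pic": 1.0 - sum(p * p for p in freqs.values()),
    }
```

Under these assumptions the mean allelic frequency at a locus is simply one over its number of alleles, so loci with many alleles necessarily show low mean frequencies.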

Genetic Diversity
The core subset was selected using a stratified sampling strategy from the wild barley collection with respect to country of origin. The number of accessions per country ranged from one to 36 (Table 2). The number of accessions per country was significantly associated with the variation in the estimates of five parameters per country (Table 2): the number of alleles (R² = 0.72, p < 0.0001), the number of unique alleles (R² = 0.63, p < 0.0004), the number of rare alleles (R² = 0.66, p < 0.0002), the mean allelic frequency (R² = 0.57, p < 0.001), and the country-specific Fst value (R² = 0.41, p < 0.01). These associations clearly demonstrated the effectiveness of the stratified sampling in capturing country-wise SSR variability from the PGRC wild barley collection. The detailed diversity estimates for the core subset with respect to country are shown in Table 2.
To compare genetic diversity among countries of variable sample sizes, the allelic differences among wild barley accessions of pairwise countries were estimated and tested using a random permutation procedure. Significant allelic differences were found largely among accessions from countries with more than six accessions (Table 3). The accessions from Lebanon and Morocco had the most significant allelic differences from the others, and the accessions from the countries in the eastern region had the least significant allelic differences. The largest significant allelic differences were found between accessions from Lebanon and Israel or Jordan. The country-specific Fst estimates the country-specific contribution to the overall SSR variation and provides some measure of the within-country variation. When the countries with more than six accessions were considered (see Table 2), Israel and Jordan had accessions with the smallest Fst value (0.231, or the most within-country variation), followed by Syria (0.236), Turkey (0.237), and Iran (0.237). The country with the largest Fst value (or the least within-country variation) was Lebanon (0.274), followed by Morocco (0.257).
Similarly, SSR variation was also estimated with respect to region (Table 2). The numbers of accessions assayed for the western and eastern regions were 216 and 53, respectively. More alleles, including unique and rare alleles, were observed for the accessions from the western region than from the eastern region and such allelic difference was statistically significant (p < 0.0001), based on the random permutation test. However, the mean allelic frequency (0.075) observed for the western region was slightly lower than that (0.123) for the eastern region. Similarly, the accessions from the western region had slightly more SSR variation with the region-specific Fst value of 0.0741 than that from the eastern region (0.0752).
The average dissimilarity (AD) of each accession was estimated. The AD estimates ranged from 0.103 to 0.141 with a mean of 0.114 (Table S1). The accession with the least AD estimate was from Cyprus (CN48879), while the three accessions from Israel (CN80241, CN79043 and CN78947) displayed the largest AD estimates. Overall, more individual accessions from Israel and Jordan displayed distinct genetic backgrounds.
Table 1 notes: … [45]. Genomic (G) and expressed sequence tag-derived (E) marker types are specified; ‡ Na = number of alleles detected; Nr = number of rare alleles (or alleles with a frequency of 0.05 or smaller); Maf = mean allelic frequency; PIC = polymorphic information content. These four parameters were estimated from the 269 wild barley accessions.
Table 2 notes: … [43,44]. The eastern (E) region represents wild barley from the Zagros Mountains and further east, while the western (W) region includes wild barley from the Fertile Crescent and further west; ‡ N = number of accessions. Na = number of alleles. Nu = number of unique alleles. Nr = number of rare alleles (or alleles with a frequency of 0.05 or smaller). Maf = mean allelic frequency. Fst = country-specific Fst calculated from analysis of molecular variance; the value before the slash was obtained based on the 15 countries with more than one accession, while the value after the slash was based on the 12 countries with more than six accessions. NA = not available. The estimates of Na, Nu, Nr, and Maf were obtained with respect to country or region.

Prior Defined Genetic Structure
The dominant genetic structure detected in the core set was at the country level, followed by the proposed region (Table 4, Figure 1). The percent of total SSR variation residing among the accessions of various countries was 24.23, followed by the regions (7.43) (Table 4). Excluding the countries with fewer than seven accessions did not significantly reduce the proportion of the total SSR variation (24.16%) present among the assayed countries. Clustering 268 accessions from 15 countries revealed no distinct separation between accessions representing the two regions, but genetically distinct clusters for Lebanon, Greece and Cyprus (Figure 1A). These patterns of variation were consistent with the PCA-inferred genetic relationships of the 269 accessions (Figure 2A). The eastern accessions were not clearly separated from the western accessions. The accessions from Lebanon and Greece were genetically the most differentiated and separate from the others from the western region. Similarly, the neighbor-joining analysis of individual accessions revealed the same patterns of variation as those shown in the PCA plot (Figure 2B). Several clusters were observed, but they were not distantly diverged. Interestingly, seven accessions from Japan were not closely associated with the accessions from the eastern region, but rather with some accessions from the western region. Two accessions from China (CN28546 collected from Tibet; CN28547 from Daofu, Sichuan) were clustered together and associated with those from the western region. However, when the countries with fewer than seven accessions were not considered, the accessions from four countries in the eastern region (Afghanistan, Japan, Iran and Turkmenistan) were closely grouped together and genetically more related to the accessions from Turkey and Syria (Figure 1B). These results not only confirmed the expected effects of variable sample sizes on the structural inferences at the country or regional level, but also provided some support for the distinction in wild barley accessions between the eastern and western regions [43,44]. Also, the accessions from Lebanon and Israel displayed the most genetic differentiation (Figure 1B). This pattern of variation is consistent with the findings of allelic differences reported above and our current understanding of wild barley genetic diversity mentioned earlier.
Figure 2. … (B) The NJ tree showing eastern and western accessions. Note that the accessions from Lebanon, Greece, Japan, and China are highlighted.

Model-based Genetic Structure
The BAPS analysis revealed five optimal clusters in this core subset with the highest log likelihood of −19,454.3 (Figure 3) and a large partition (20.3%) of the total SSR variation (Table 4). Nine (3.3%) accessions had multiple memberships across the five clusters and largely resided between clusters 2 and 4 (Figure 3A). The clusters ranged in size from 15 to 104. The cluster composition varied with respect to accession country and region. The accessions from Lebanon and Greece contributed to clusters 3 and 1. Cluster 4 consisted mainly of the accessions from Cyprus and Syria. Cluster 2 represents the western accessions and cluster 5 consisted predominantly of the eastern accessions, while both clusters 2 and 5 had accessions mixed from both regions. Clearly, clusters 3 and 1 displayed substantial divergence from the other clusters (Figure 3B). The detailed memberships of each cluster for the assayed accessions are given in Table S1.
Figure 3. … (A) Five optimal clusters and features; (B) Nei's minimum distance.
The STRUCTURE analysis was also applied to infer genetic structure in the core subset, considered K = 2 to 10 clusters, and revealed four optimal clusters (Figure 4A), with the highest log likelihood of −16,724.3 (Figure 4B) and a large partition (16.3%) of the total SSR variation (Table 4). The optimal clusters were supported by the rate of change in the second derivative of the log likelihoods over the various Ks analyzed, giving K = 4 as the optimum (Figure 4C). Clearly, each of the four optimal clusters has a considerable proportion of mixed memberships shared between clusters.

Discussions
The molecular characterization here revealed several interesting patterns of genetic variation in the developed core subset of wild barley germplasm.First, the stratified sampling was effective in capturing country-wise SSR variability.Second, the core subset displayed a large SSR variability with a majority of SSR alleles having a frequency of 0.05 or smaller.Third, the largest SSR variation (24.2%) resided among the accessions of various countries, the accessions from Israel and Jordan were genetically most diverse, and the accessions from Lebanon and Greece were most differentiated.Fourth, the regional accessions explained 7.4% SSR variation, but a genetic separation between the eastern and western accessions was observed only at the country, not individual, level.Fifth, four and five optimal clusters of accessions were obtained using the STRUCTURE and BAPS programs and partitioned 16.3% and 20.3% SSR variations, respectively.These SSR patterns enhance our understanding of the wild barley gene pool and are significant for conserving wild barley germplasm and exploring new sources of useful genes for barley improvement.
The revealed patterns of SSR variation are largely consistent with our current knowledge about genetic diversity of wild barley.For example, the large SSR diversity detected here matched well with those previously reported with SSR markers (e.g., see [60]) and AFLP markers (e.g., [31]).The accessions from Israel and Jordan were genetically the most diverse, aligning well with the suggested centre of genetic diversity in the Near East [35].The association patterns of individual accessions in this core subset (Figure 2) are similar to those revealed with AFLP markers (see Figure 4A of [61]).However, a genetic separation for accessions east and west of the Zagros Mountains was detected only when the accessions were considered with respect to country origin (Figure 1B), but not at the individual level (Figures 2 and 3).This finding is not surprising, given those genetic differentiations previously inferred from large isozyme studies (e.g., [35]), but still provides some support for those from DNA-based re-sequencing studies [38].The discrepancy may reflect the limitation in our genome sampling with only 25 SSR markers, even though the core subset is geographically representative.Re-assessing the genetic differentiation with dense genome sampling through next generation sequencing may help to address the issue.
The genetic structure analyses presented here revealed five and four optimal clusters of wild barley using the BAPS and STRUCTURE programs, respectively. These optimal clusters were far fewer than the ten optimal groups inferred with DArT and SNP markers in the expanded subset (or WBDC) [27]. This discrepancy may represent the genetic differences among core subsets and/or reflect the bias of sampling the barley genome with different molecular markers. Two of the five clusters inferred with the BAPS program had only country-specific accessions (i.e., from Lebanon and Greece), and the eastern accessions largely fell into two clusters together with western accessions. These grouping patterns are significant, as they help to provide the best possible structural description of the PGRC wild barley collection and perhaps, to a great extent, the current global wild barley gene pool. However, it is still worth noting that the PGRC wild barley collection may deviate from the global wild barley gene pool, given the non-random germplasm sampling and insufficient representation of the species distribution. Currently, the ICARDA collection is ecologically and geographically the most diverse and has the most collection sites distributed over 20 countries [21]. Also, it was found here that the largest SSR variation (24.2%) resided among the accessions of various countries, which is larger than the proportions captured with the model-based inferences. This finding helps to strengthen the view in germplasm conservation that country of origin is still the best indicator of the extant genetic structure in wild barley and thus should be adequately considered when sampling wild barley germplasm for conservation.
The variation patterns revealed here, although specific to this core subset, should share some general baseline information useful for the evaluation and establishment of specific core subsets by determining the weighting or representation for each country [62]. For example, the patterns of SSR variation reported here are compatible with those observed in the BCC core subset of wild barley with isozyme markers [26]. The genetic relationships of the wild barley accessions obtained from our PCA analysis (Figure 2A) displayed patterns similar to those obtained with isozymes (Figure 3 of [26]). However, our findings may be more informative, as our core subset is larger and covers wider geographic regions from Morocco to Japan, and our SSR markers sampled wider barley genomic regions than the reported isozymes. Unfortunately, our core subset still lacks representation from parts of the species distribution, as mentioned earlier. Further efforts should be made, not on re-sampling the PGRC collection for other core subsets, but on continued acquisition from these countries to increase the coverage of the species distribution, and from those with genetically distinct germplasm to enhance the diversity coverage. More attention is being paid to the eastern region, as a recent study suggested that the Tibetan Plateau is another center of barley domestication [63]. Undoubtedly, these efforts will help to supplement and/or expand the existing BCC core subsets with better diversity coverage and improved accessibility for barley germplasm research and utilization [40,41].
Our core subset had only 49 accessions with geographical records and may be less suited for monitoring the adaptation to climate change than the expanded subset (WBDC), which has more accessions with geographical information [25]. Also, our characterization applied only 25 SSR markers on seven chromosomes. Thus, the genome sampling is limited. It remains unknown how much effect the small number of applied markers would have on the inference of genetic structure and differentiation. Moreover, this study focused only on the diversity and structural inferences of the existing germplasm collection and did not examine the genetic associations with important barley traits [27,64], as the developed core subset has not yet been fully characterized for relevant biological variability. Efforts will be made to characterize the subset with phenotypic parameters for searching germplasm with particular traits and/or association mapping of useful genes of agricultural importance for barley improvement.
Wild barley, like other crop wild relatives, is being threatened by global warming and displays profound adaptive changes at phenotypic and genotypic levels [5]. These adaptive changes may diminish the genetic resource for crop improvement and damage food production. Our findings provide baseline genetic information about the wild barley gene pool, which is helpful for monitoring and understanding the possible genetic changes in the overall gene pool and also useful for germplasm sampling across the species distribution for conservation. The developed core subset will facilitate germplasm evaluation and accessibility for germplasm utilization (Table S1). This study represents a part of our research effort to conserve and explore crop wild relatives.

Figure 1. Genetic structure of wild barley accessions representing various countries and two regions (E = eastern; W = western) based on the genetic distance estimated from AMOVA. (A) Accessions representing 15 countries (i.e., Tajikistan with only one accession was not considered); (B) Accessions representing 12 countries (i.e., the countries with fewer than seven accessions (Tajikistan, China, Iraq and Ethiopia) were not considered).

Figure 2. Genetic relationships of individual accessions derived from PCA and neighbor-joining (NJ) analysis. (A) The PCA plot showing eastern (circles) and western accessions (all others); (B) The NJ tree showing eastern (filled circles) and western accessions (all others). Note that the accessions from Lebanon, Greece, Japan, and China are highlighted.

Figure 3. Genetic structure of 269 wild barley accessions inferred using the BAPS program and its characteristics. (A) Five optimal clusters with respect to country and region. The cluster composition is shown in number of accessions representing country and region, and the major contribution to each cluster for a country or region is highlighted in bold; (B) Genetic divergence among five optimal clusters.

Figure 4. Genetic structure of 269 wild barley accessions inferred using the STRUCTURE program and sensitivity assessment of inference by STRUCTURE. (A) Genetic structure inferred by STRUCTURE with clusters of K = 3, 4, 5; (B) The log likelihood profiles for models with K = 2 to 10; (C) The rates of change in log likelihood for models with K = 2 to 10, showing the optimal K = 4.
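
The rate-of-change criterion summarized in panel (C) can be illustrated with a short sketch. The code below assumes an Evanno-style statistic, i.e., the absolute second-order difference of the mean log likelihood divided by the standard deviation across replicate runs; the caption does not state the exact formula used, so this is an assumption, and the function name `delta_k` and all log likelihood values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(log_liks):
    """Evanno-style rate of change in log likelihood (an assumed formulation).

    log_liks: dict mapping K to a list of log likelihoods from replicate runs.
    Returns a dict mapping each interior K to its rate-of-change statistic.
    """
    result = {}
    for k in sorted(log_liks):
        # The second-order difference needs both neighbouring K values.
        if k - 1 not in log_liks or k + 1 not in log_liks:
            continue
        second_diff = abs(
            mean(log_liks[k + 1]) - 2 * mean(log_liks[k]) + mean(log_liks[k - 1])
        )
        sd = stdev(log_liks[k])
        result[k] = second_diff / sd if sd > 0 else float("inf")
    return result


# Made-up replicate log likelihoods for K = 2..6 (not values from this study).
hypothetical_runs = {
    2: [-100.0, -102.0],
    3: [-80.0, -82.0],
    4: [-50.0, -52.0],
    5: [-48.0, -50.0],
    6: [-47.0, -49.0],
}
rates = delta_k(hypothetical_runs)
best_k = max(rates, key=rates.get)
print(best_k, rates)
```

The statistic is undefined for the smallest and largest K tested, because each lacks one neighbour, so only interior K values can be selected as optimal under this criterion.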

Table 1. Variation patterns of 25 SSR markers assayed in 269 wild barley accessions.
† Information on markers, type, linkage group and position was obtained from

Table 2. Patterns of SSR variation in 269 wild barley accessions representing 16 countries and two regions.
† The region classification after each country follows those of

Table 4. Results for analysis of molecular variance based on 359 SSR markers in 269 wild barley accessions representing 16 countries, two regions, and optimal clusters inferred from BAPS and STRUCTURE programs.

/source df Sum of squares Variance component Percent of variation p-value †
† The permuted probability that the among-group variance component was larger than zero.