Assessing and Broadening Genetic Diversity of Elymus sibiricus Germplasm for the Improvement of Seed Shattering

Siberian wild rye (Elymus sibiricus L.) is an important native grass in the Qinghai-Tibet Plateau of China. It is difficult to grow for commercial seed production because seed shattering causes yield losses during harvest. Assessing the genetic diversity and relationships among germplasm from its primary distribution area helps to evaluate its potential as a gene pool for improving desired agronomic traits. In this study, 40 EST-SSR primers were used to assess the genetic diversity and population structure of 36 E. sibiricus accessions that vary in seed shattering. A total of 380 bands were generated, with an average of 9.5 bands per primer. The polymorphic information content (PIC) ranged from 0.23 to 0.50. The percentage of polymorphic bands (P) for the species was 87.11%, suggesting a high degree of genetic diversity. Population structure analysis identified four groups, in agreement with the results of principal coordinate analysis (PCoA). The analysis of molecular variance (AMOVA) revealed that the majority of genetic variation occurred within geographical regions (83.40%). Two genotypes from Y1005 and ZhN06 were used to generate seven F1 hybrids. Molecular and morphological diversity analysis of the F1 population revealed rich genetic variation and a high level of seed shattering variation, resulting in significant improvement of the genetic base and desired agronomic traits.


Introduction
Elymus sibiricus L., commonly known as Siberian wild rye, is a perennial, cool-season, self-pollinating, allotetraploid grass with the StStHH genome constitution (2n = 28) [1]. Indigenous to northern Asia, E. sibiricus germplasm is especially rich and diverse in North China, where the species is distributed primarily on the Qinghai-Tibet Plateau and in Inner Mongolia, Sichuan, Xinjiang, and Gansu [2]. E. sibiricus has been widely grown for pasture and hay, owing to its excellent stress tolerance, good forage quality and adaptability to local environments, and it therefore plays an important role in animal husbandry and sustenance in North China [3].
As an economically important species, E. sibiricus is difficult to grow for commercial seed production since seed shattering can cause up to 80% yield losses if harvesting is delayed [4]. The provinces of Qinghai and Sichuan, China, where the majority of E. sibiricus seed (2,400,000 kg) is produced each year, account for over 90% of total seed yield. However, the average seed production of E. sibiricus is only 690 kg·ha−1 (China Grass Internet). To reduce seed shattering and enhance seed production, one of the most important approaches is to explore genetic diversity.

Morphologically and genetically diverse germplasm is a potentially valuable source for the improvement of desired agronomic traits such as seed yield, quality and stress tolerance [5]. To broaden the genetic base of E. sibiricus germplasm, one important strategy is to develop novel breeding lines by crossing genetically and phenotypically diverse germplasm with adapted cultivars. Generally, these resynthesized breeding lines are genetically distinct from inbred lines and cultivars [6], and hybrids from two parents with distant genetic bases might have higher heterosis [7]. Recent research has revealed wide variation in seed shattering among wild E. sibiricus germplasm from the Qinghai-Tibet Plateau [2,8] and suggested that this wild germplasm has great potential for the improvement of seed shattering. Additionally, a previous report showed that genetic distance between germplasm can be a predictor of combining ability [9]. It is, therefore, important to study the genetic diversity of E. sibiricus germplasm with variation in seed shattering from its primary distribution area, both to improve our understanding of breeding materials and to develop more efficient conservation and breeding strategies.
The development of neutral molecular markers has made it fast, reliable and accurate to reveal the genetic diversity of germplasm. Compared with other molecular markers such as inter simple sequence repeat (ISSR), sequence-related amplified polymorphism (SRAP), and start codon targeted (SCoT) markers, EST-SSRs are highly polymorphic, abundant and readily accessible to research laboratories through published primer sequences. Moreover, EST-SSRs have a higher level of transferability across related species than genomic SSRs because EST-SSRs originate from the transcribed regions of genomes and possess conserved sequences among homologous genes [5]. Along with the development of next-generation sequencing, transcriptome sequencing has also become an efficient method to identify large numbers of EST sequences and develop EST-SSR markers [10]. To date, EST-SSRs have been widely used for genetic diversity analysis [11], genetic mapping [12], and DNA fingerprinting [13].
The pattern of genetic variability of the available germplasm substantially affects the choice of breeding materials and the success of plant breeding programs. The objectives of the present study were to (i) compare the genetic diversity and relationships among E. sibiricus accessions from North China; and (ii) broaden the genetic diversity of E. sibiricus by crossing two genetically and morphologically diverse genotypes and assess the genetic variation of the hybrid population.

Seed Shattering Degree of 36 E. sibiricus Accessions
The BTS value among the 36 E. sibiricus accessions varied from 31.86 gf (PI655140) to 92.34 gf (ZhN06), with an average of 53.28 gf. Sixteen accessions had a relatively low seed shattering degree, with BTS above the average value. Seven accessions, including four wild accessions (PI655140, PI595182, HZ02 and XH09) and three cultivars (Hongyuan, Chuancao2 and Tongde), had a relatively high seed shattering degree, with BTS values below 40 gf. The other 13 accessions had a moderate seed shattering degree (Figure 1).

Polymorphism of EST-SSR Markers and Genetic Relationships of 36 E. sibiricus Accessions
Furthermore, we analyzed the genetic diversity and variation of the 36 E. sibiricus accessions with variation in seed shattering degree (Table 1). One hundred EST-SSR primers selected from Elymus, Pseudoroegneria and Leymus EST databases, and 112 novel E. sibiricus EST-SSR markers developed by transcriptome sequencing, were used for primer screening. Finally, 40 EST-SSR primers that amplified clear and stable bands were selected to evaluate the genetic diversity of these 36 accessions (Table 2). The 40 primers generated 380 bands, 331 of which were polymorphic. The percentage of polymorphism (P) was 87.11%. The total bands (T) per primer ranged from 2 (ES-405) to 22 (Elw5616s393), with an average of 9.5 bands per primer. Across the 36 accessions, the polymorphic information content (PIC) values ranged from 0.23 (ES-22 and ES-125) to 0.50 (e.g., Elw2698s152 and Elw2807s159) with an average of 0.44, suggesting a high level of polymorphism.
The population structure of the 36 accessions was investigated under the assumption of Hardy-Weinberg equilibrium using STRUCTURE V2.3.4 software. Based on maximum likelihood and delta K (∆K) values, the optimum number of groups was four (Figure 2). Among them, 18 accessions from Sichuan, Inner Mongolia and Xinjiang and one from Gansu were assigned to group 1 (SC, NM, XJ); four accessions from Qinghai were assigned to group 2 (QH); eight accessions from Gansu and two from Sichuan were assigned to group 3 (GS-I); and three accessions from Gansu were assigned to group 4 (GS-II). Among the 36 accessions, Y1005 and ZhN06 showed the largest genetic distance (0.6752). The genetic structure was not strongly related to geographical origin. For example, SC02 and SC03 from Ruoergai, Sichuan were assigned to group 3, showing a close genetic relationship with accessions from Gansu.
The principal coordinate analysis (PCoA) showed that about 31.52% of the total variation was described by the first three principal coordinates (Figure 3). The majority of accessions from GS and two from SC (SC02 and SC03) shared the same group, three accessions from GS were assigned to one group, and the remaining SC accessions as well as those from NM, XJ and QH were assigned to a mixed group. The results of PCoA were similar to those of the structure analysis, supporting the reliability of the results.
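The choice of four groups above rests on the ∆K criterion of Evanno et al. As a rough illustration of how that statistic can be computed from replicate STRUCTURE runs, the following Python sketch takes the mean L(K) across runs, forms the absolute second difference, and divides by the standard deviation of L(K). The function name `delta_k` and the log-likelihood values are hypothetical and invented for illustration; they are not data from this study, and applying the second difference to run means is one common reading of the formula rather than a confirmed description of the authors' workflow.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both K-1 and K+1 available.

    lnp_by_k maps each K to the list of L(K) values from replicate runs.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined at the ends of the tested range
        # |L(K+1) - 2 L(K) + L(K-1)| computed on the run means
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Hypothetical L(K) values (three runs per K), invented for illustration only.
toy_runs = {
    1: [-1002.0, -1000.0, -998.0],
    2: [-952.0, -950.0, -948.0],
    3: [-802.0, -800.0, -798.0],
    4: [-792.0, -790.0, -788.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)  # K with the largest delta K
print(dk, best_k)
```

Because ∆K needs both neighbours of K, it cannot be evaluated at the smallest or largest K tested, which is why the tested range should extend beyond the expected number of groups.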
Results of POPGENE analysis showed high genetic diversity between geographic regions (Table 3). AMOVA showed a significant (p < 0.001) genetic difference among the five regions. A larger proportion of the variation (83.40%) was apportioned within geographic regions, and 16.60% was apportioned between geographic regions (Table 4). The genetic identity among the five geographic regions ranged from 0.6170 (between NM and QH) to 0.9552 (between NM and XJ) with an average of 0.8218 (Table 5).

Genetic and Phenotypic Variation of Hybrid Population
Two parental genotypes, Y1005-1 (moderate seed shattering degree) and ZhN06-1 (lowest seed shattering degree), were selected to produce seven F1 individuals by hand pollination because they had the highest genetic distance and contrasting seed shattering degrees. The phenotypic variation and genetic diversity of the hybrid population and their parents were studied using 12 phenotypic traits and EST-SSR markers. Table 6 shows the mid-parent heterosis (MPH), higher-parent heterosis (HPH), and coefficient of variation (CV) of the 12 traits for the hybrid population and their parents. The greatest variation was found for 1000-seed weight (CV = 33.11%) and seed shattering (SS) (CV = 32.98%), followed by flag leaf length, flag leaf width, tiller number, leaf length, culm diameter, leaf width, plant height, culm number, awn length and panicle length. Some phenotypic traits of the hybrids showed evidence of significant heterosis, including flag leaf length (MPH = 80.9%, HPH = 80.4%), seed shattering (MPH = 51.1%, HPH = 8.1%), leaf length (MPH = 48.4%, HPH = 32.0%) and flag leaf width (MPH = 44.0%, HPH = 23.5%). In contrast, heterosis for tiller number, plant height, 1000-seed weight and culm diameter was lower than for the other phenotypic traits, and some of these traits showed negative heterosis in the F1 hybrids. The high degree of variation found in morphological traits is in accord with the genetic variability (Table 7). The 40 EST-SSR primers amplified 257 bands (P = 59.92%), with 6.4 bands per primer. PIC ranged from 0.00 to 0.44, with an average of 0.20. The percentage of bands exclusively present in F1 lines (BEPF) (8.44%) was higher than that of bands exclusively present in parents (BEPP) (1.95%). In addition, 20.78% of bands were shared by Y1005-1 and the F1 lines, whereas 26.62% of bands were shared by ZhN06-1 and the F1 lines. These results suggest that ZhN06-1 may have a higher heritability than Y1005-1.
The clustering analysis based on phenotypic traits showed two major groups (Figure 4a). Cluster I contained Y1005-1, ZhN06-1 and F1-1. Cluster II included the other six F1 lines; among them, F1-2, F1-3 and F1-4 clustered together in a major subgroup with F1-5, while F1-6 and F1-7 formed a separate subgroup. The marker-based clustering showed poor correlation with the phenotype-based dendrogram (Figure 4b). Cluster I consisted of Y1005, F1-2, F1-7, F1-6 and ZhN06. The other four F1 lines were grouped into cluster II.

Genetic Diversity of E. sibiricus
Genetic diversity is the foundation of species diversity and a crucial precursor in the study of any species, because its quantity and distribution affect the evolutionary and breeding potential of species or populations [14]. As an important forage grass in North China, E. sibiricus possesses great morphological and genetic variation [15]. However, recent research has shown that global climate warming and excessive grazing threaten the productivity and growth of E. sibiricus, causing losses of genetic diversity [16]. It is, therefore, necessary to evaluate the level and distribution of genetic variability for effective exploitation and utilization of E. sibiricus. Former studies have assessed E. sibiricus accessions and populations of different origins using several molecular markers, including ISSR [17], SRAP [18], SCoT [19] and EST-SSR [2]. Each study found high genetic diversity within accessions or populations. A similarly high level of genetic diversity (87.11%) was found in this study, which might be due to the diverse geographic origins of the materials tested. Among the five geographic regions, GS had higher genetic diversity (86.17%) than the other four regions: SC, XJ, NM and QH. Accessions from QH showed the highest genetic distance when compared with other populations. Previous studies showed that environmental parameters such as latitude, longitude and altitude are highly correlated with the magnitude and distribution of genetic diversity [3]. The wide geographical range of the five E. sibiricus populations studied may have contributed to the differences in genetic diversity. Sample size is also an important factor affecting the measurement of genetic diversity [2]. A positive correlation between sample size and genetic diversity has been reported [19]. In this study, the small sample size from some geographic regions (e.g., four accessions from QH) may have resulted in a lower estimate of genetic diversity.
Typically, self-pollinating species possess relatively less within-population genetic variability than out-crossing species [20]. In this study, 83.40% of the genetic variance was apportioned within geographic regions, similar to values previously reported for E. sibiricus [2,3,18] and other self-pollinating Elymus species. For instance, Stevens et al. [21] found 85.0% within-population variation by analyzing four E. trachycaulus populations using SSR markers. Many factors have been reported to affect the pattern of genetic variability, such as gene mutation, genetic drift, selection, gene flow, reproduction mode and population size [2,22-24]. In this study, genetic divergence may be more related to complex eco-geographical factors within the E. sibiricus distribution area.

Broadening Genetic Diversity for Seed Shattering Improvement
Like most native grasses, E. sibiricus is difficult to grow for commercial seed production, since seed shattering causes large yield losses during harvest. A major limitation of plant improvement programs is the lack of plant materials exhibiting genetic variation for traits of interest [25]. The challenge for plant development is to maintain the genetic diversity within a species while improving the desired traits that enable plant materials to perform well. To broaden the genetic diversity of E. sibiricus for future breeding programs, two parental genotypes (Y1005-1 and ZhN06-1) with genetic differences and contrasting seed shattering habits were selected to produce F1 lines. Our results showed that the seed shattering degree in the hybrid population ranged from 68.2 gf (F1-1) to 143.4 gf (F1-6), with an average of 97.2 gf. Three F1 individuals (F1-4, F1-6 and F1-7) had a lower seed shattering degree than the low seed shattering parent ZhN06-1 (97.2). Thus, these individuals could be used as breeding materials for developing low seed shattering cultivars in the future. In addition to seed shattering, other traits such as flag leaf length and width also showed positive heterosis. Our results confirmed that some morphological traits of E. sibiricus could be improved by means of hybridization. The marker-based clustering showed poor correlation with the phenotype-based dendrogram. The phenotype-based dendrogram, built from limited morphological data, could be affected by environmental factors. In comparison, marker-based clustering is more efficient and allows genetic diversity analysis at any physiological stage or with any tissue, suggesting its potential for analyzing the genetic diversity and relationships of E. sibiricus.
Based on our results, the percentage of polymorphic bands in the hybrid population was 59.92%. Furthermore, 8.44% and 1.95% of polymorphic bands were exclusively present in F1 lines and parents, respectively. These gained and lost bands were interpreted as polyploidization-induced rearrangements within coding regions. Hybridization that brings together genomes of different sizes and compositions in a single nucleus, followed by chromosome doubling, can induce several types of genomic modification and rearrangement in the hybrids [26,27]. These newly rearranged bands might be associated with the effects of heterosis and contribute to the surprisingly low seed shattering in the hybrids. However, whether these novel bands correspond to new genes associated with seed shattering or other important traits is still not clear. In the future, molecular markers combined with sequence data might provide new evidence.

Plant Materials
A total of 36 E. sibiricus accessions were used in the study, comprising wild collections, breeding lines, cultivars, and cultivated types (Table 1). Seeds of these accessions were obtained from the National Plant Germplasm System (NPGS, USA), Lanzhou University, Sichuan Agricultural University and Sichuan Academy of Grassland Science. All accessions were grouped into five geographic regions: SC (Sichuan), NM (Inner Mongolia), XJ (Xinjiang), GS (Gansu) and QH (Qinghai) based on their origin and physical-geographical regionalization. Moreover, eight F1 lines derived from a pair cross between two parental genotypes, Y1005-1 and ZhN06-1, were also used for genetic diversity analysis (Table 2).

DNA Extraction and PCR Amplification
Twenty individuals of each accession were sampled for the extraction of bulked DNA. Leaf tissues were collected from young plants and lyophilized for DNA extraction using a modified cetyltrimethyl ammonium bromide (CTAB) method [28]. DNA concentration and quality were determined using a Nanodrop spectrophotometer (NanoDrop Products, Wilmington, DE, USA) and agarose gel electrophoresis. Finally, the DNA samples were diluted to 25 ng/µL and stored at −20 °C prior to PCR amplification.
A total of 212 EST-SSR primers from different sources were used for genotyping, of which 100 EST-SSR markers were previously developed from Elymus (Elw hereafter), Pseudoroegneria (Ps hereafter) and Leymus (Lt hereafter) EST databases [29-31] and 112 novel E. sibiricus EST-SSR markers were developed by transcriptome sequencing [32]. The DNA samples of five accessions with different geographical origins were used for primer screening. The 40 EST-SSR primers that amplified clear and stable bands of the expected size were then used in the final analysis (Table 3). The PCR amplification and SSR genotyping were carried out as described by Xie et al. [2] and Zhou et al. [32]. Amplified fragments were then separated by 6% denaturing polyacrylamide gel electrophoresis (PAGE). The resulting gel was stained with AgNO3 solution and photographed with a digital camera (D7000, Nikon, Tokyo, Japan).

Phenotypic Traits Measurement
The seeds of the F1 lines and their parents were germinated in plastic boxes with moistened blotter paper at room temperature. After germination, seedlings were grown in a greenhouse under a 25/15 °C day/night temperature regime until they were 8 weeks old. They were then transplanted to field plots at the research farm, Yuzhong, Gansu, China (latitude 35°34′ N, longitude 103°34′ E, elevation 1720 m). Plants were spaced 0.5 m within rows and 1 m between rows. A total of 12 phenotypic traits, including seed shattering (SS), plant height (PH), leaf length (LL), leaf width (LW), flag leaf length (FLL), flag leaf width (FLW), culm diameter (CD), culm number (CN), tiller number (TN), panicle length (PL), awn length (AL) and 1000-seed weight (1000-SW), were measured using the methods described by Zhao et al. [8]. Seed shattering degree of E. sibiricus accessions was determined by measuring pedicel breaking tensile strength (BTS), which is inversely proportional to shattering degree. Thirty randomly chosen spikelets of each plant were examined at 28 days after heading, and their average BTS values were calculated. The heterosis of the hybrids was estimated relative to the mid-parent and higher-parent values using the following formulas: mid-parent heterosis (%) = $(F_1 - MP)/MP \times 100$ and higher-parent heterosis (%) = $(F_1 - HP)/HP \times 100$, where $F_1$ is the mean of the hybrids, $MP$ is the mean of the parents, and $HP$ is the value of the higher parent [33].
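The two heterosis formulas above can be sketched in a few lines of Python. The function names and the trait values below are hypothetical and chosen only so that the arithmetic is easy to follow; they are not measurements from this study.

```python
def mid_parent_heterosis(f1_mean, parent_a, parent_b):
    """MPH (%) = (F1 - MP) / MP * 100, where MP is the mean of the two parents."""
    mp = (parent_a + parent_b) / 2.0
    return (f1_mean - mp) / mp * 100.0


def higher_parent_heterosis(f1_mean, parent_a, parent_b):
    """HPH (%) = (F1 - HP) / HP * 100, where HP is the larger parental value."""
    hp = max(parent_a, parent_b)
    return (f1_mean - hp) / hp * 100.0


# Hypothetical trait values (arbitrary units), for illustration only.
parent_a, parent_b, f1_mean = 40.0, 60.0, 75.0
mph = mid_parent_heterosis(f1_mean, parent_a, parent_b)     # MP = 50
hph = higher_parent_heterosis(f1_mean, parent_a, parent_b)  # HP = 60
print(f"MPH = {mph:.1f}%, HPH = {hph:.1f}%")
```

With these toy numbers the hybrid mean exceeds both the mid-parent value and the higher parent, so both statistics are positive; a hybrid mean falling between MP and HP would give a positive MPH but a negative HPH.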

Data Analysis
The amplified bands were scored as present (1) or absent (0), and only reproducible bands were considered. The resulting presence/absence data matrix was analyzed using POPGENE 32 Version 1.31 [34]. The number of polymorphic bands (NPB), percentage of polymorphic bands (PPB), Shannon information index of diversity (I), Nei's gene diversity (H), observed number of alleles (Na) and polymorphic information content (PIC) were used to evaluate genetic diversity. PIC was calculated for each primer according to the formula $\mathrm{PIC} = 1 - p^2 - q^2$, where $p$ is the frequency of present bands and $q$ is the frequency of absent bands [35]. The analysis of molecular variance (AMOVA) was used to partition the total EST-SSR variation into within-population and among-population components [36]. The input files for POPGENE and AMOVA were prepared with the aid of the DCFA1.1 program written by Zhang and Ge [37]. Population structure of the 36 E. sibiricus accessions was analyzed using STRUCTURE v2.3.4 software with the 'admixture model', a burn-in period of 10,000 iterations and a run of 100,000 replications of Markov Chain Monte Carlo (MCMC) after burn-in [38]. Ten independent runs of STRUCTURE were performed for each number of clusters (K), with K varying from 1 to 8. Mean L(K) and delta K (∆K) were estimated using the method described by Evanno et al. [39]. Maximum likelihood and ∆K values were used to determine the optimum number of groups. A principal coordinate analysis (PCoA) was performed based on Jaccard's genetic similarity matrix using the DCENTER module in NTSYS (version 2.10) [40]. A dendrogram was constructed using GenStat (version 17.1) and FreeTree + TreeView (version 1.6.6 for Windows) software. The phenotypic data were analyzed using SPSS software (SPSS, version 22 for Windows, SPSS Inc., Chicago, IL, USA).
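As a minimal sketch of the band-scoring statistics described above, the following Python code computes PIC = 1 − p² − q² for a single band and the percentage of polymorphic bands from a 0/1 matrix. The function names and the toy matrix are hypothetical; how per-band values were aggregated to a per-primer PIC is not specified here, so the sketch stops at the per-band level.

```python
def band_pic(scores):
    """PIC = 1 - p^2 - q^2 for one band scored 1 (present) / 0 (absent)."""
    p = sum(scores) / len(scores)  # frequency of present bands
    q = 1.0 - p                    # frequency of absent bands
    return 1.0 - p * p - q * q


def percent_polymorphic(matrix):
    """Percentage of bands that are neither present in all rows nor absent in all.

    matrix: list of rows (one per accession), each a list of 0/1 band scores.
    """
    n_rows = len(matrix)
    n_bands = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_bands)]
    polymorphic = sum(1 for col in columns if 0 < sum(col) < n_rows)
    return 100.0 * polymorphic / n_bands


# Hypothetical 4-accession x 3-band matrix, for illustration only.
toy_matrix = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
]
pics = [band_pic([row[j] for row in toy_matrix]) for j in range(3)]
ppb = percent_polymorphic(toy_matrix)
print(pics, round(ppb, 2))
```

For a band scored as present or absent, this PIC reaches its maximum of 0.50 when p = q = 0.5 and falls to 0 for a monomorphic band, which matches the upper bound of the PIC range reported in this study. The same percentage calculation applied to the reported counts (331 polymorphic out of 380 bands) gives 87.11%.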

Conclusions
This study showed a high level of genetic diversity and a clear population structure among 36 E. sibiricus accessions from the species' primary distribution area in China. The finding that most variation existed within geographical regions provides a guideline for the collection and conservation of E. sibiricus germplasm. More genetic variation of the species can be captured by sampling a larger number of plants from specific eco-geographical regions. Meanwhile, cross breeding is an effective way to obtain more genetic and phenotypic variation. F1 lines of E. sibiricus exhibited higher genetic variation in the major agronomic traits. In addition, some F1 lines showed obvious heterosis over their parents, especially in seed shattering performance. These hybrids could be used as important genetic resources for the improvement of E. sibiricus in future breeding programs.