Development of 65 Novel Polymorphic cDNA-SSR Markers in Common Vetch (Vicia sativa subsp. sativa) Using Next Generation Sequencing

Vetch (Vicia sativa L.) is one of the most important annual forage legumes in the world due to its multiple uses (i.e., hay, grain, silage and green manure) and high nutritional value. However, detrimental cyanoalanine toxins in its plant parts, including the seeds, and its vulnerability to harsh winter conditions currently reduce the agronomic value of vetch varieties. Moreover, the scarcity of genomic resources in the public domain, especially molecular markers, has further hampered breeding efforts. Polymorphic simple sequence repeat (SSR) markers from transcript sequences (cDNA-SSR) were developed for Vicia sativa subsp. sativa. We found 3,811 SSR loci from 31,504 individual sequence reads, and 300 primer pairs were designed and synthesized. In total, 65 primer pairs were found to be consistently scorable when 32 accessions were tested. The number of alleles ranged from 2 to 19, the frequency of major alleles per locus was 0.27–0.87, the genotype number was 2–19, the overall polymorphism information content (PIC) values were 0.20–0.86, and the observed and expected heterozygosity values were 0.00–0.41 and 0.264–0.852, respectively. These markers provide a useful tool for assessing genetic diversity and population structure and for positional cloning, thereby facilitating vetch breeding programs.


Introduction
Vicia sativa subsp. sativa, known as the common vetch, is one of the most commonly grown winter cover crops, or green manures, and is also used as pasture, silage, and hay [1,2]. It is cultivated in mixtures with cereal grains, providing cool-weather weed suppression and fall N scavenging. It has been successfully applied in vineyards and orchards [1,2]. Due to its economic and ecological advantages, common vetch is now widespread through many parts of the world, including the Mediterranean basin, west and central Asia, China, eastern Asia, India, and the USA [1][2][3].
The common vetch produces seeds that are quite similar to those of lentils in physical appearance and are highly nutritious [3][4][5][6][7]. However, due to the presence of cyanoalanine toxin in the seeds, which is detrimental to mono-gastric animals, including humans, the common vetch is currently tightly restricted as a feed or food source [3][4][5][6]. Moreover, its vulnerability to severe winter conditions (<−10 °C) further reduces its true agricultural potential [8,9]. Thus, to address these drawbacks, it is imperative to genetically improve this legume species either through conventional breeding or biotechnology approaches. However, a severe lack of genomic resources in the public domain has hampered such efforts.
Next-generation transcriptome sequencing is an excellent solution for enriching relevant genomic resources for non-model crop species such as the common vetch, providing functional annotations as well as genetic marker information [10][11][12]. In particular, cDNA-SSR markers generated by this approach can facilitate marker-assisted selection in vetch improvement programs, because they may be associated with functionally annotated transcribed genes, are cost-effective, and are easily transferable to related species [10][11][12][13]. Recently, we sequenced transcriptomes of common vetch using 454 pyrosequencing technology and found 3,811 SSR loci from 31,504 individual sequence reads. In the present study, we developed and characterized polymorphic cDNA-SSR markers based on these transcriptome sequences to further contribute to breeding and molecular genetic studies of this species.

Results and Discussion
Transcriptome sequencing of V. sativa subsp. sativa on the GS-FLX sequencer yielded about 28 Mb of sequence, and GS De Novo reported 86,532 raw sequencing reads. SSRs are one of the most popular marker systems, consisting of various numbers of tandem-repeat di-, tri-, or tetra-nucleotide DNA motifs [14].
To identify SSR markers, we used the ARGOS program with default settings on the V. sativa subsp. sativa singleton collections. In total, 3,811 potential SSR motifs were identified, with the majority being trinucleotide (76.3%) and dinucleotide (14.6%) repeats. All other SSR types (e.g., tetra-, penta-, and hexa-nucleotide motifs) together accounted for only about 9%. The majority of trinucleotide SSRs had the GGT/GTG/TGG motif, followed by those with the ACC/CCA/CAC motif. In addition, CT/TC, AT/TA, and GA/AG motifs were abundant among the dinucleotide cDNA-SSRs. The relative proportion of SSR motif types was comparable to that of other plant species [15][16][17][18][19][20]. Kaur et al. [18] noted that, in theory, the frequencies of di-, tri-, tetra-, penta-, and hexanucleotide repeats should progressively decrease, based on the relative probability of replication slippage events. However, trinucleotide repeat units were predominant, followed by tetra-, di-, hexa-, and pentanucleotide repeat units.
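Motifs such as GGT, GTG, and TGG are grouped above because they are cyclic rotations of the same repeat unit. The following minimal sketch shows one way such grouping and the per-class percentages could be tallied. It is an illustration only, not the ARGOS implementation; the function names (`canonical_motif`, `motif_class_shares`) and the optional strand-collapsing behaviour are assumptions introduced here.

```python
from collections import Counter

# Translation table for complementing a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rotations(motif):
    """Return every cyclic rotation of a repeat unit (GGT -> GGT, GTG, TGG)."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def canonical_motif(motif, both_strands=False):
    """Pick the alphabetically smallest rotation as the group label.

    With both_strands=True (an assumption, not stated in the text), rotations
    of the reverse complement are pooled into the same group as well.
    """
    motif = motif.upper()
    candidates = rotations(motif)
    if both_strands:
        reverse_complement = motif.translate(COMPLEMENT)[::-1]
        candidates += rotations(reverse_complement)
    return min(candidates)


def motif_class_shares(motifs):
    """Percentage of SSRs per repeat-unit length class (di-, tri-, ...)."""
    names = {2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}
    counts = Counter(names[len(m)] for m in motifs)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}
```

For example, `Counter(canonical_motif(m) for m in motifs)` tallies motif groups for a list of detected repeat units. Under the strand-collapsing option, GGT/GTG/TGG and ACC/CCA/CAC fall into a single group, because ACC is the reverse complement of GGT.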
Among the identified SSR loci, we selected 100 primer pairs on the basis of a shared annealing temperature. Of these, only 65 primer pairs produced a single dominant polymerase chain reaction (PCR) product that was scorable across the 32 accessions (Table 1 and Figure S1). The sequences of the 65 selected polymorphic primer pairs were deposited in GenBank to provide a foundation for community genomic resources for vetch breeding and biotechnology research. The number of alleles (N_A) per locus varied widely among the markers (Table 2) and ranged from 2 to 19, with an average of 6.6 alleles. The frequency of major alleles (M_AF) per locus was 0.27-0.87, with an average of 0.508. In addition, the H_O values were 0.00-0.86, with an average of 0.106, and the H_E values were 0.264-0.852, with an average of 0.670. Lastly, polymorphism information content (PIC) values were 0.20-0.91, with an average of 0.59 (Table 2). Considering the relatively high polymorphism levels, the cDNA-SSR markers developed in the present study will be useful for marker-assisted selection and population genetic studies to improve vetch varieties.
The dendrogram showed that the 32 common vetch accessions fell into five distinct clusters (Figure 1). Dongi et al. [21] reported that, in a cluster analysis of Trigonella foenum-graecum, there was no clear clustering pattern among geographically closer accessions, indicating that the association between genetic similarity and geographical distance was weak. However, a larger number of accessions from each geographical location is needed to confirm the observed pattern.
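The dendrogram was built in PowerMarker from chord distances with UPGMA (see Data Analysis). As a rough, self-contained sketch of that kind of analysis, the code below computes a Cavalli-Sforza and Edwards chord distance and performs average-linkage (UPGMA) clustering. The normalization used, D = (2/(πm)) Σ √(2(1 − Σ √(p·q))) over m loci, is one commonly cited form and is assumed here. PowerMarker's exact implementation is not reproduced, and all function names are hypothetical.

```python
import math
from itertools import combinations


def chord_distance(freqs_a, freqs_b):
    """Chord distance between two accessions.

    freqs_a / freqs_b: one {allele: frequency} dict per locus, in the same
    locus order for both accessions.
    """
    total = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        # Only alleles present in both accessions contribute to the overlap.
        shared = sum(math.sqrt(pa[a] * pb[a]) for a in pa.keys() & pb.keys())
        # max() guards against tiny negative values from rounding.
        total += math.sqrt(2.0 * max(0.0, 1.0 - shared))
    return 2.0 / (math.pi * len(freqs_a)) * total


def pairwise_distances(profiles):
    """profiles: {name: per-locus frequency dicts} -> {frozenset pair: distance}."""
    return {
        frozenset((x, y)): chord_distance(profiles[x], profiles[y])
        for x, y in combinations(profiles, 2)
    }


def upgma(names, dist):
    """Average-linkage clustering; returns the tree as nested tuples."""
    clusters = {name: 1 for name in names}  # cluster label -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        for c in clusters:
            # New distance is the size-weighted mean of the two old distances.
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))
```

A call such as `upgma(list(profiles), pairwise_distances(profiles))` returns the tree topology only; branch lengths and the cut used to define clusters are left out of this sketch.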

Plant Material
Vicia sativa subsp. sativa seeds were obtained from the National Agrobiodiversity Center, Rural Development Administration, Suwon, Korea (Table 3). Seeds were germinated and the seedlings were grown in a glasshouse. The leaves of young seedlings were used to extract the mRNA required to synthesize the cDNA library for 454 sequencing.

Library Preparation
Approximately 1 µg cDNA was used to generate a DNA library to use with the Genome Sequencer GS-FLX Titanium System (Roche, 454 Life Science, Branford, CT, USA). The cDNA fragment ends were polished (blunted), and two short adapters were ligated to both ends according to standard procedures described previously. The adapters, along with the sequencing key, a short sequence of four nucleotides used by the system's software for base calling, provided priming of the sequences for both the amplification and sequencing of the sample library fragments. Following the repair of any nicks in the double-stranded library, the unbound strand of each fragment was released (with 5-Adaptor A). Finally, the quality of this single-stranded template DNA library was assessed using a 2100 BioAnalyzer (Agilent, Waldbronn, Germany). The library was quantified to determine the optimal amount needed as input for emulsion-based clonal amplification.

454 Pyrosequencing
Single effective copies of template species from the DNA library to be sequenced were hybridized to DNA capture beads. The immobilized library was then resuspended in an amplification solution, and the mixture was emulsified, followed by PCR amplification. The DNA-carrying beads were recovered from the emulsion and enriched after amplification. The second strands of the amplified products were melted, leaving the amplified single-stranded DNA library bound to the beads. The sequencing primer was then annealed to the immobilized amplified DNA templates. After amplification, a single DNA-carrying bead was placed into each well of a PicoTiterPlate (PTP) device. Multiple samples were sequenced simultaneously on a single PTP (four-region gasket). The PTP was inserted into the Genome Sequencer FLX Titanium instrument for pyrosequencing [22,23], and sequencing reagent was flowed sequentially over the plate. Information from the PTP wells was captured simultaneously by a camera, and the images were processed in real time by an onboard computer. Multiplex identifiers were used to specifically tag unique samples in a GS FLX Titanium sequencing run; these were recognized by the GS data analysis software after the sequencing run and provided high confidence for assigning individual sequencing reads to the correct sample. Sequence assembly was performed after sequencing using GS De Novo Assembler software (Roche) to produce contigs and singletons. All sequence data were mapped to references using GS Reference Mapper software (Roche).

Discovery of cDNA-SSR Markers
All contigs and singletons from both transcriptomes were used to mine SSR motifs. SSR motifs were identified using the ARGOS pipeline program (version 1.46) at the default settings to survey the molecular markers present in the V. sativa subsp. sativa accessions. Parameters were set to identify perfect di-, tri-, tetra-, penta-, and hexa-nucleotide motifs with a minimum of six repeats. The primer design parameters were set as follows: length range, 18-23 nucleotides with 21 as optimum; PCR product size range, 100-400 bp; optimum annealing temperature, 55 °C; and GC content, 40-60% with 50% as optimum.
For cDNA-SSR marker validation, genomic DNA was extracted from 32 diverse common vetch accessions using a DNeasy® Plant Mini kit (Qiagen, Valencia, CA, USA), according to the manufacturer's instructions. Fresh leaf tissue from each accession was ground thoroughly in liquid nitrogen for each extraction. DNA was resuspended in 100 μL water, diluted to 10 ng/μL, and stored at either −20 °C or −80 °C.
Randomly selected cDNA-SSR primer pairs were validated experimentally. Forward primers were synthesized with an added M13 sequence so that a fluorescent label could be incorporated during PCR amplification [24]. PCR conditions included a hot start at 95 °C for 10 min, followed by 10 touchdown cycles of 94 °C for 30 s, annealing decreasing from 60 °C to 50 °C for 30 s, and 72 °C for 30 s, then 25 cycles of 94 °C for 30 s, 50 °C for 30 s, and 72 °C for 30 s, with a final elongation step at 72 °C for 10 min. PCR products were separated and visualized using the QIAxcel Gel Electrophoresis System (Qiagen).
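The motif search criteria given at the start of this section (perfect di- to hexa-nucleotide repeats with at least six repeat units) can be illustrated with a small regular-expression scan. This is a stand-in sketch under those stated criteria and does not reproduce the ARGOS pipeline, whose internals are not described here. The function name and the rule for skipping repeat units that are themselves built from a shorter unit are assumptions.

```python
import re

MOTIF_LENGTHS = (2, 3, 4, 5, 6)  # di- to hexa-nucleotide repeat units
MIN_REPEATS = 6                  # minimum number of perfect repeats


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs in a sequence.

    Start positions are 0-based. For each unit length, matches are the
    leftmost non-overlapping runs found by the regex engine.
    """
    seq = seq.upper()
    hits = []
    for k in MOTIF_LENGTHS:
        # A k-mer followed by at least (min_repeats - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip units that are repeats of a shorter unit (e.g. AA, CTCT),
            # so that each run is reported only under its shortest unit.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)
```

Applied to a hypothetical read such as `"AAA" + "GGT" * 7 + "CA" + "CT" * 6 + "AAA"`, the function reports a GGT run of seven units and a CT run of six units. A run of only five units is not reported.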

Data Analysis
The amplified SSR loci were scored for 32 accessions. The total number of alleles (N_A), major allele frequency (M_AF; the frequency of the most common allele), observed heterozygosity (H_O; obtained by counting heterozygotes), expected heterozygosity (H_E), number of genotypes (N_G), and polymorphism information content (PIC) were calculated using PowerMarker and GenAlEx (version 6.5) [25].
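The expected heterozygosity formula is as follows:
$$H_E = 1 - \sum_{i=1}^{k} p_i^2 \qquad (1)$$
A closely related diversity measure is the polymorphism information content (PIC) [26]:
$$PIC = 1 - \sum_{i=1}^{k} p_i^2 - \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} 2 p_i^2 p_j^2 \qquad (2)$$
where $p_i$ is the frequency of the $i$-th allele and $k$ is the number of alleles at the locus.
The cluster analysis of the 32 accessions was carried out in PowerMarker by constructing an unrooted phylogenetic tree with the unweighted pair group method with arithmetic mean (UPGMA), based on the "CS chord 1967" distance method [27].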
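The per-locus statistics described in this section can be sketched for diploid genotype calls as follows. The code assumes the standard uncorrected forms H_E = 1 − Σ p_i² and the Botstein-type PIC. PowerMarker and GenAlEx may apply sample-size corrections that are not reproduced here. The function name and the input format (a list of allele-size pairs, with `None` for a missing call) are assumptions made for illustration.

```python
from collections import Counter


def locus_stats(genotypes):
    """Summary statistics for one SSR locus.

    genotypes: list of (allele1, allele2) tuples, one per accession;
    None marks a missing call and is ignored.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    allele_counts = Counter(allele for g in called for allele in g)
    freqs = [count / (2 * n) for count in allele_counts.values()]

    # Expected heterozygosity (gene diversity), uncorrected.
    h_e = 1.0 - sum(p * p for p in freqs)
    # PIC subtracts the 2 * p_i^2 * p_j^2 term for every allele pair i < j.
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return {
        "N_A": len(allele_counts),                         # number of alleles
        "M_AF": max(freqs),                                # major allele frequency
        "H_O": sum(1 for a, b in called if a != b) / n,    # observed heterozygosity
        "H_E": h_e,
        "N_G": len({tuple(sorted(g)) for g in called}),    # distinct genotypes
        "PIC": h_e - cross,
    }
```

For four hypothetical accessions scored as 100/100, 100/104, 104/104, and 100/100, the allele frequencies are 0.625 and 0.375, giving H_E = 0.46875, PIC ≈ 0.359, and H_O = 0.25.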

Conclusions
We developed 65 cDNA-SSR markers, which were used successfully to investigate the genetic diversity among 32 accessions of Vicia sativa subsp. sativa. Considering the relatively high PIC values (0.59 on average), cDNA-SSRs in Vicia sativa subsp. sativa appear to be an informative genetic marker system, which can also be applied to population genetic studies and marker-assisted selection to mine and accumulate useful alleles and so increase the agronomic potential of vetch varieties.