Targeted Next-Generation Sequencing Identification of Mutations in Disease Resistance Gene Analogs (RGAs) in Wild and Cultivated Beets

Resistance gene analogs (RGAs) were searched for bioinformatically in the sugar beet (Beta vulgaris L.) genome as potential candidates for improving resistance against different diseases. In the present study, Ion Torrent sequencing technology was used to identify mutations in 21 RGAs. The DNA samples of ninety-six individuals from six sea beets (Beta vulgaris L. subsp. maritima) and six sugar beet pollinators (eight individuals each) were used for the discovery of single-nucleotide polymorphisms (SNPs). Target amplicons of about 200 bp in length were designed with the Ion AmpliSeq Designer system in order to cover the DNA sequences of the RGAs. The number of SNPs ranged from 0 in four individuals to 278 in the pollinator R740 (which is resistant to rhizomania infection). Among different groups of beets, cytoplasmic male sterile lines had the highest number of SNPs (132) whereas the lowest number of SNPs belonged to O-types (95). The principal coordinates analysis (PCoA) showed that the polymorphisms inside the gene Bv8_184910_pkon (including the CCCTCC sequence) can effectively differentiate wild from cultivated beets, pointing to a possible mutation associated with rhizomania resistance that originated directly in cultivated beets. This is unlike other resistance sources, which were introgressed from wild beets. This gene belongs to the receptor-like kinase (RLK) class of RGAs, and is associated with a hypothetical protein. In conclusion, this first report of using Ion Torrent sequencing technology in beet germplasm suggests that the identified sequence CCCTCC can be used in marker-assisted programs to differentiate wild from domestic beets and to identify other unknown disease resistance genes in beet.


Introduction
Sugar beet is an important crop that supplies around 20% of the sugar consumed worldwide. It is cultivated in over 50 countries [1]. The crop is damaged by biotic and abiotic stresses, and the development of new varieties that are tolerant under adverse conditions is one of the main breeding challenges [2]. Rhizomania and nematode infections are the most widespread diseases, caused, respectively, by Beet Necrotic Yellow Vein Virus [3] and the beet cyst nematode (Heterodera schachtii Schm.) [4]. The only efficient and cost-effective strategy to control these diseases is to introgress resistance genes into commercial sugar beet varieties. Today, the market share of varieties with improved disease resistance is increasing greatly. In particular, the market share of double-resistant (rhizomania and nematodes) varieties is projected to increase by up to 70% by 2021 [1,3].
The introgression of resistance genes from wild relatives to sugar beet cultivars remains fundamentally important. The sea beet (Beta vulgaris L. ssp. maritima (L.) Arcang.), the wild progenitor of the domesticated sugar beet, provides useful sources of resistance to many diseases [5]. In the absence of resistance, the cultivation of sugar beet would be practically impossible in most areas where it is grown at present. Wild relatives of domesticated sugar beet are therefore widely used in breeding programs to improve pest and disease resistance. The first source of a resistance gene for rhizomania (Rz1) was identified in the WB258 sea beet population from the Adriatic coast of Italy [6]. Resistance gene analogs (RGAs) are a large family of genes that share conserved domains and structural features. As such, RGAs can be identified from sequenced genomes using bioinformatics approaches [7]. Although thousands of RGAs have been identified through sequence homology, only a handful have been cloned and fully characterized. The latter offer information about the structure, function and evolution of resistance genes, and have delivered cultivars with novel resistance [8][9][10][11]. Hs1pro-1 is the only resistance gene cloned from sugar beet [12]. This gene confers resistance to infection by the beet cyst nematode.
Single nucleotide polymorphisms (SNPs) can be used to analyze variations inside the sequence of genes with typical RGA features. SNPs present advantages compared to other genetic marker types. SNPs are abundant, stable, co-dominant, amenable to an automated method of detection, accurate, and relatively inexpensive. Sugar beet has an estimated genome size of about 730 Mbp, and a 567 Mbp reference sequence with 26,923 annotated protein-coding genes has recently been published [13]. The recent release of the sugar beet reference genome has facilitated the discovery and mapping of SNPs in association with different traits.
The Sanger sequencing method, introduced in 1977, allowed the identification of the sequence of genes and the frequency of mutations inside genes [14]. However, this method is laborious and time-consuming and requires large amounts of DNA, with a relatively low sensitivity and throughput [15]. Allele-specific PCR-based techniques, TaqMan-based methods, and high-resolution melting (HRM) analysis have revolutionized high-throughput genotyping approaches [16]. The limitations of these techniques are that only small regions of the targeted genes are screened and not all the associated variants can be identified efficiently. Alternatively, the rapid developments in next-generation sequencing (NGS) technologies allow simultaneous sequencing of multiple genes in a single cost-effective assay [17]. NGS is used to identify specific changes in DNA by rapidly and simultaneously sequencing multiple gene targets within multiple samples. Capistrano-Gossman et al. [18] used SNP markers in an open-pollinated wild beet (B. maritima) population to demonstrate the potential of wild germplasms for sugar beet improvement. The results of the same study indicated that access to the DNA sequence of the resistance gene Rz2 opens the path to improving rhizomania resistance. Grimmer et al. [19] highlighted the potential for sugar beet breeders to exploit the diverse wild beet germplasm for the introgression of genes in order to make commercial varieties resistant to viral infections.
Molecular marker technologies greatly facilitate introgression of disease resistance traits into sugar beet breeding programs. The Ion Torrent AmpliSeq Designer is a new technology with the ability to resequence genes and search for associations based on the Disease Research Area database [20]. It offers high quality, specificity and coverage, and supports short (140-175 bp), medium (275 bp) and long (375 bp) PCR amplicons [21]. This technology is currently used in genetic screening, and as a diagnostic method for the identification of mutations inside genes. In the present study, Ion Torrent sequencing technology with the personal genome machine (PGM) was used to detect genetic mutations inside 21 selected RGAs for rhizomania and nematode infections in wild and cultivated beets. This is the first report of using this accurate and efficient technology in sugar beet research. Domestication and continuous selection for desirable traits have made cultivated beets susceptible to many diseases compared with the wild beet germplasms. Comparison of wild and domesticated beets could assist the identification of sources of disease resistance and of mutations under selective pressure. The diagnostic gene-specific SNP markers that have been identified are likely to be useful in screening beet germplasms for resistance genes, which may be used in marker-assisted selection (MAS) breeding programs for sugar beet.

Plant Material
Ninety-six individual samples of beet were chosen in the present study. Samples were derived from 5 beet groups comprising cytoplasmic male sterile lines, hybrid varieties, O-types, pollinators, and wild-type accessions (B. maritima). The seeds from this germplasm were produced in 2012 at the University of Padova, Italy. The seeds were germinated in soil pots with daily watering. Ten days after planting, seedlings were collected for DNA isolation.

DNA Isolation
DNA was isolated with the BioSprint 96 DNA Plant Kit (Qiagen, Hilden, Germany) in a BioSprint 96 workstation (Qiagen) following the manufacturer's instructions. Leaf samples were ground using a Qiagen TissueLyser (Qiagen). Briefly, 20 mg of leaf tissue was placed into 2 mL tubes and 300 µL of RLT buffer (Qiagen) was added to each sample. One stainless steel 5 mm bead was used for each sample. Samples were then homogenized for 10 min at 30 Hz. Samples were centrifuged at 6000 × g for 5 min and the supernatant was loaded into a 96-well plate with 200 µL isopropanol and 20 µL magnetic bead suspension (Qiagen). The beads were transferred consecutively into four other plates, each with a premix, followed by a 4-min binding step and one bead collection step. The first plate was loaded with RPW buffer (guanidine thiocyanate buffer under patent protection). The second and third plates were loaded with 500 µL 96% ethanol. The fourth plate was loaded with 500 µL of 0.02% (v/v) polyoxyethylene sorbitol anhydride monolaurate (Tween 20). DNA was eluted with 200 µL sterile water. After isolation, DNA was assayed for concentration and purity by microfluidic gel electrophoresis with the Agilent 2200 TapeStation system (Agilent Technologies, Santa Clara, CA, USA). The average DNA yield was 50 ng µL⁻¹ with an average 260:280 ratio of 1.85.

Library Preparation and Sequencing
Sequences of the known RGAs were obtained from Hunger et al. [22]. The Ion AmpliSeq Designer system (www.ampliseq.com) was used to generate target amplicons (about 200 bp in length) covering the RGA sequences. The multiplexed amplicons were then used to generate barcoded libraries using the Ion AmpliSeq Library Kit 2.0 and the Ion Xpress barcoded adapters (Life Technologies, Carlsbad, CA, USA) to allow for discrimination between samples within an NGS run. Amplified libraries were quantified by quantitative PCR using the Ion Library Quantification Kit (Life Technologies), following the manufacturer's recommendations. Barcoded libraries were combined to a final concentration of 7 pM, to achieve optimal yield of clonal templated Ion Sphere Particles (ISPs) (Life Technologies), for emulsion PCR (emPCR) and further ISP enrichment following the manufacturer's recommendations. Sequencing was performed on 318 chips run on the Ion Torrent PGM System (Life Technologies) as described by the manufacturer, and the data were analyzed with the Torrent Suite v4.0.2 Software (Life Technologies). The sequences were mapped against the sugar beet reference genome (RefBeet v. 1.2) and checked for known and novel mutations using Torrent Suite Software. The mutation positions identified were used for primer design and SNP AmpliSeq assays.

Data Analysis
Alignment of the sequences against the reference sugar beet genome and base calling were performed using the Torrent Suite software. Variants were identified with the Ion Torrent Variant Caller plugin. SNP or multiple-nucleotide polymorphism (MNP) frequencies were calculated by dividing the number of mutations by the length of each RGA gene in bp. To analyze and visualize the clustering and differentiation of wild and cultivated beets, a principal coordinates analysis (PCoA) was performed on the SNP data. PCoA was based on genetic distances calculated from SNP markers. PCoA was performed using ad hoc scripts and the R package GenABEL, version 2.12.2 [23]. Four samples with no mutations inside RGAs were removed from the PCoA. Friedman's analysis of variance (non-parametric ANOVA version for repeated measures) and the Kendall coefficient of concordance were used to determine differences between classes of beet individuals with respect to RGA genes [24,25]. The non-parametric tests were performed using the STATISTICA v.13 software [27]. The sequence alignment of the gene Bv8_184910_pkon in wild and cultivated plants was carried out using Clustal W with default parameters [26].
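The frequency calculation and the distance-based ordination described above can be illustrated with a minimal sketch. It is not the authors' ad hoc scripts or GenABEL: the genotype coding (0/1/2), the mean-absolute-difference distance, the sample names, and the toy genotypes are all assumptions made for illustration, and only the first principal coordinate is extracted (by power iteration) to keep the example dependency-free.

```python
import math

def snp_frequency(n_variants, gene_length_bp):
    """Variants per base pair, expressed as a percentage of gene length."""
    return 100.0 * n_variants / gene_length_bp

def genotype_distance(a, b):
    """Mean absolute difference between two genotype vectors coded 0/1/2.
    (Assumed distance; the paper does not state which one was used.)"""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def double_center(dist):
    """Gower centering of squared distances: B = -1/2 * J * D^2 * J."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]

def leading_axis(b, iterations=200):
    """Leading eigenpair of a symmetric matrix by power iteration."""
    n = len(b)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigenvalue, v

# Hypothetical genotypes at four SNP positions within one gene
genotypes = {
    "wild_1": [0, 0, 0, 0],
    "wild_2": [0, 0, 0, 1],
    "cult_1": [2, 2, 2, 2],
    "cult_2": [2, 2, 2, 1],
}
names = list(genotypes)
dist = [[genotype_distance(genotypes[p], genotypes[q]) for q in names] for p in names]
eigenvalue, vector = leading_axis(double_center(dist))
# Principal coordinates are eigenvectors scaled by the square root of the eigenvalue
coords = [math.sqrt(max(eigenvalue, 0.0)) * x for x in vector]
for name, coordinate in zip(names, coords):
    print(f"{name}: {coordinate:+.3f}")
```

In this toy data set the two wild samples and the two cultivated samples land on opposite sides of the first axis, which is the kind of separation a PCoA plot is meant to reveal. A full analysis would extract at least two axes and use real genotype calls.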

Results
Ion AmpliSeq was used to identify SNPs within the sequences of twenty-one RGAs in individual DNA samples from wild and cultivated beets. The number of SNPs detected in the 21 RGAs for each of the 96 individual samples is reported in Table 1. The number of SNPs per sample ranged from 0 to 278. The pollinator R740 (Sample ID 33), which was resistant to rhizomania infection, showed the highest number of within-RGA mutations (278 SNPs), whereas samples 61 and 28 (31 SNPs) showed the lowest number among the samples in which SNPs were detected (Table 1). A total of 821 polymorphisms (763 SNPs and 58 MNPs) were identified inside the 21 RGA genes in 95% (91/96) of the beet samples (Table 2). No SNPs were found inside RGAs in four beet samples. Of the 56 polymorphisms found in the gene Bv8_184910_pkon, 55 were SNPs and only 1 was an MNP. Bv3_063740_usyf, located inside the Bvchr3.sca010 scaffold, had the second highest number of identified SNPs and MNPs (127 mutations). The gene Bv3_063740_usyf showed the highest mutation frequency (2.91% for SNPs and 0.28% for MNPs), followed by Bv7_171340_mgxu (1.58% for SNPs and 0.19% for MNPs). Samples 8, 40 and 71 had the highest number of unique mutations (each with three unique SNPs) (Table 3).
The results of Friedman's analysis of variance (ANOVA) (N = 21, df = 4, p-value = 0.024) indicated that the mutations were distributed among the five beet groups and that the groups were significantly different. The value of Kendall's coefficient of concordance (0.13) suggests that the ranking (by number of polymorphisms) of RGA loci is not random with respect to beet groups. Table 3 shows the unique MNPs and indels (insertions/deletions) detected in 21 genes among 96 individuals. Sample number 40 (R4430, carrying the Rz2 gene for resistance to rhizomania infection) had the highest number of unique mutations (three unique SNPs). No mutation was identified in a number of RGA genes: IDs 2, 3, 4, 13, 14, 16, 18, 19, 20 and 21 (Table 4). Bv3_063740_usyf, which lies in the scaffold Bvchr3.sca010 on chromosome 3, had the highest number of SNPs within the wild beet group. The gene Bv3_063740_usyf showed the largest difference in SNP numbers between O-types and pollinators. The coefficient of variation (CV%) of the number of mutations between four types of beet plants varied from 0 to 60.0% (Table 4). The highest CV belonged to the gene Bv7_172060_pnhe. Among the different beet groups, CMS lines (132) had the highest number of SNPs, followed by wild beets (117), whereas the fewest SNPs (95) were found in the O-type class (Table 4). In the PCoA, SNP data were used within each gene (i.e., RGA) to calculate genetic distances between samples: for all 21 RGA genes but one, it was not possible to clearly discriminate between wild and cultivated beets based on SNP genotypes. Only when the polymorphisms inside the Bv8_184910_pkon gene were used (including the CCCTCC sequence) to calculate genetic distances were wild and cultivated beets clearly differentiated. Figure 1 plots the first two dimensions from the PCoA of within-Bv8_184910_pkon SNP-based genetic distances. Sequence alignment of the Bv8_184910_pkon gene in wild and cultivated beets is reported in Figure S1.
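The Friedman statistic and Kendall's coefficient of concordance are linked by the standard relation W = χ² / (N(k − 1)), where N is the number of blocks (here the 21 RGA loci) and k the number of groups (here 5). The sketch below uses that relation to check that the reported W and p-value agree with each other. It assumes the usual chi-square approximation with k − 1 degrees of freedom and ignores tie corrections; the function names are illustrative and this is not the STATISTICA procedure itself.

```python
import math

def friedman_chi2_from_w(w, n_blocks, k_groups):
    """Friedman chi-square implied by Kendall's W: chi2 = W * N * (k - 1)."""
    return w * n_blocks * (k_groups - 1)

def kendalls_w(chi2, n_blocks, k_groups):
    """Kendall's W implied by a Friedman chi-square."""
    return chi2 / (n_blocks * (k_groups - 1))

def chi2_sf_even_df(x, df):
    """Upper-tail probability of the chi-square distribution for even df.
    Closed form: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!"""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Values reported in the text: N = 21 loci, k = 5 groups (df = 4), W = 0.13
chi2 = friedman_chi2_from_w(0.13, 21, 5)
p_value = chi2_sf_even_df(chi2, 4)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")

# W is reported to two decimals, so bracket the p-value over the rounding interval
p_high = chi2_sf_even_df(friedman_chi2_from_w(0.125, 21, 5), 4)
p_low = chi2_sf_even_df(friedman_chi2_from_w(0.135, 21, 5), 4)
print(f"p range for W in [0.125, 0.135]: {p_low:.4f} to {p_high:.4f}")
```

Under these assumptions, W = 0.13 corresponds to a chi-square of about 10.9 and a p-value near 0.027. The reported p-value of 0.024 lies inside the interval obtained when W is allowed to vary within its rounding range, so the two reported figures are compatible.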

Discussion
In the present study, we analyzed RGAs in the sugar beet genome to identify polymorphic sequences that differentiate wild accessions from domesticated beets. Although domestication is associated with the loss of resistance to biotic and abiotic stresses, it can be regarded as the selection of suitable wild accessions according to agriculturally relevant phenotypes [28]. The initial domestication event did sample the full range of natural variation in the wild progenitor populations and the characterization of such diversity at the functional locus level can result in more efficient preservation of plant genetic resources (PGR). More detailed information on the structure of genetic variation at the population level would allow a more efficient preservation of the genetic resources [29,30]. Differences in habitats may bring about different selective forces acting upon allele frequencies in populations. High-throughput DNA sequencing is one of the approaches that have enabled the tracking of mutations that occurred during the domestication of crop plants. Different NGS platforms have been combined with bioinformatics tools to discover SNPs and mutations within genes [31]. Among the several available NGS platforms, the most popular are the Roche 454 Sequencer, the Illumina HiSeq and MiSeq, and the Life Technologies Ion Torrent Proton and personal genome machine (PGM) [32]. The Ion Torrent PGM used for polymorphism analysis herein is a high-throughput sequencer. Sequencing is achieved by sequencing-by-synthesis chemistry that uses a semiconductor-based, high-density array of micro-reaction chambers. The PGM's sequence reads are about 100-200 bp and, thanks to deep coverage, it is possible to detect mutations with low allele frequency.
In this study, we used the Ion AmpliSeq approach to identify variants for 21 RGAs in 96 samples from five beet groups. This technology has been widely used in disease diagnosis in medical sciences [20], but our study is the first report of using Ion AmpliSeq to screen 21 disease RGAs in beet germplasms. The high-quality AmpliSeq protocol was used in this study and the results showed that this technology was able to detect both single- and multi-nucleotide polymorphisms in domesticated and wild beets. The entire sequencing process, from library preparation to variant identification, was completed in two days. This re-sequencing approach will facilitate the rapid and cost-effective screening of RGA genes in beet germplasm. SNP flanking sequences from this study were uniquely mapped on the reference genome (RefBeet v. 1.2) and the AmpliSeq protocol on the Ion Proton platform showed robust sequencing results. Wild and cultivated accessions had 25 unique SNPs mapped onto all of the 21 RGAs. Four genes (Bv5_094270_ijae, Bv8_184910_pkon, Bv7_171340_mgxu, and Bv3_063740_usyf) showed a density rate (variation at a specific position in the DNA sequence) of >1.0%. When the SNP data within the 21 RGAs were used to calculate distances between samples, no clear differentiation between wild and cultivated beets was observed from PCoA, save for one case. Only polymorphisms inside the RGA gene Bv8_184910_pkon could effectively differentiate wild from cultivated beet samples. These results suggest that most sources of resistance to rhizomania and nematodes in cultivated sugar beets have indeed been introgressed from B. maritima strains; hence, they show no relevant genetic differences between the domesticated and wild germplasm. On the other hand, polymorphisms inside the Bv8_184910_pkon gene seem to be associated with a mutation which confers disease resistance.
That mutation seems to have originated directly in cultivated beets in response to selective pressure related to rhizomania and nematode infections in field conditions. The Bv8_184910_pkon gene belongs to the receptor-like kinase (RLK) class of RGAs and its database annotation indicates that it relates to a hypothetical protein. A hypothetical protein is a protein whose existence has been predicted, but for which there is no experimental evidence that it is expressed in vivo. In silico methods can be used to predict hypothetical protein functions. The study of Beseli et al. [33] characterized Cercospora nicotianae hypothetical proteins in cercosporin resistance. Their results indicated that the 71cR gene, encoding a hypothetical protein, was upregulated in C. nicotianae in response to cercosporin toxicity, and that the expression of this gene in the cercosporin-sensitive fungus Neurospora crassa can impart cercosporin resistance. RLKs are pattern recognition receptors (PRRs) that mediate pathogen-/microbe-associated molecular pattern (PAMP/MAMP) triggered immunity (PTI/MTI) to allow the recognition of a broad range of pathogens [34]. PAMP/MAMPs are conserved features of most pathogens, such as chitin, flagella, and lipopolysaccharides. Xa21 in rice encodes an RLK involved in resistance to a bacterial disease caused by Xanthomonas oryzae pv. oryzae (Xoo) [35]. In another study, Hunger et al. [22] found that some of the same RGAs used in the current study are involved in sugar beet's resistance to infection with Cercospora. Genetic differentiation obtained using RGA-specific SNP markers may be useful for defining genetic diversity of a suite of random R genes in beet plants. They could also help to characterize the genetic structure and geographic distribution of the beet. In the present study, we identified a large number of allelic modifications in RGAs that may be related to important adaptive functions in sugar beet.
These modifications appear to be potential targets for subsequent association studies aimed at identifying SNP markers linked to disease resistance in sugar beet. SNP markers can be designed from RGAs around a target disease gene to construct an RGA genetic map for the specific target region. Such mapped, genome-wide RGAs and linked SNP markers are valuable tools to develop high-density R-gene genetic maps, target R-genes, co-localize QTLs, design diagnostic markers of R-genes for fine mapping, clone R-genes, and breed for resistance.
In conclusion, the results of non-parametric tests confirmed that there is untapped variation in the wild materials. Further, they show that the sequence CCCTCC inside the gene Bv8_184910_pkon, which differentiates wild from cultivated beets, can be used for rapid and convenient scanning of a large number of beet germplasms for identification of disease resistance in a time- and cost-effective assay. As most of the resistance genes share limited conserved domains, CCCTCC sequence information can be exploited to identify and clone unknown RGAs in wild beet plants.
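The kind of sequence scan suggested above can be sketched in a few lines. This is an illustrative example only: the function names and the toy sequences are hypothetical, both strands are searched as an assumption, and the text does not specify which state of the CCCTCC site occurs in wild or cultivated material, so the sketch only reports where the motif is found.

```python
def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_motif(seq, motif="CCCTCC"):
    """Return sorted (position, strand) hits of the motif on either strand.
    Positions are 0-based and refer to the forward strand."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # step by 1 to allow overlaps
    return sorted(hits)

# Hypothetical amplicon sequences, not taken from the study
samples = {
    "sample_A": "ATGCCCTCCGA",
    "sample_B": "TTGGAGGGAA",
    "sample_C": "ATGCATGC",
}
for name, sequence in samples.items():
    print(name, find_motif(sequence))
```

In practice the scan would run on aligned reads or consensus sequences of the Bv8_184910_pkon amplicons, and a hit would be interpreted together with the genotype calls at the surrounding SNP positions.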