Assessment of Genetic Diversity in Differently Colored Raspberry Cultivars Using SSR Markers Located in Flavonoid Biosynthesis Genes

Raspberry is a valuable berry crop containing a large amount of antioxidants, which correlates with the color of the berries. We evaluated the genetic diversity of differently colored raspberry cultivars using microsatellite markers developed from structural and regulatory genes of flavonoid biosynthesis. Among nine tested markers, seven were polymorphic. In total, 26 alleles were found at seven loci in 19 red (Rubus idaeus L.) and two black (R. occidentalis L.) raspberry cultivars. The most polymorphic marker was RiMY01, located in an intron of the MYB10 transcription factor gene. Its polymorphism information content (PIC) was 0.82. The RiG001 marker, which previously failed to amplify in blackberry, also failed in black raspberry. The clustering of raspberry cultivars in the UPGMA dendrogram was unrelated to geographical and genetic origin, but correlated with the color of berries. The black raspberry cultivars had higher homozygosity and clustered separately from the other cultivars, while still differing from each other. In addition, some of the raspberry cultivars with yellow-orange berries formed a separate cluster. This suggests that there may be more than one genetic mechanism for the formation of yellow-orange berries. The data obtained can be used in future breeding programs to improve the nutritional qualities of raspberry fruits.


Introduction
The genus Rubus L. (Rosaceae, Rosoideae) is one of the most diverse in the plant kingdom and contains between 600 and 800 species grouped in 12 subgenera, which are widely distributed throughout the world from the lowland tropics to subarctic regions [1]. Among these species, red [...] There are several studies that used random genomic SSR markers to assess genetic diversity in cultivars within [8,28] and between [29] different species. However, we are unaware of studies in which genetic diversity was assessed using markers located in the genes of a specific metabolic pathway, flavonoid biosynthesis in particular. In this study, we developed SSR markers using nucleotide sequences of structural and regulatory genes of flavonoid biosynthesis in Rubus and Fragaria (strawberry) available in the National Center for Biotechnology Information (NCBI) GenBank database to test whether genetic variation associated with these genes correlates with variation in berry color. These markers were genotyped in 19 red raspberry cultivars from different geographic regions (Russia, Poland, Italy, Switzerland, UK, and USA) and two cultivars of black raspberry. If alleles at these loci correlate with the content of biologically active substances, they could subsequently be used to optimize selection for valuable traits associated with color and, indirectly, with the content of flavonoids, by screening genotypes at early stages and thereby accelerating selection.

Plant Materials
Nineteen cultivars of red raspberry (Amira, Anne, Babye Leto II, Beglyanka, Brilliantovaya, Bryanskoe Divo, Gerakl, Glen Ample, Marosejka, Meteor, Oranzhevoe Chudo, Pingvin, Polka, Poranna Rosa, Solnyshko, Sugana, Tarusa, Zheltyj Gigant, and Zolotaya Osen) and two cultivars of black raspberry (Cumberland and Jewel) were chosen for genotyping SSR loci located in the flavonoid biosynthesis genes. These cultivars cover a wide range of fruit color, from yellow to black, and have various geographic and genetic origins, although cultivars of Russian origin from two raspberry breeding centers (Bryansk and Moscow) predominate in the list (Table 1). Raspberry plants used in this study were kindly provided by Dr. I. A. Pozdniakov (OOO Microklon, Pushchino, Russia). Each cultivar was represented by a vegetatively micropropagated line of practically genetically identical plants. Therefore, a single specimen per cultivar was used for DNA isolation and genotyping.

Simple Sequence Repeat (SSR) Marker and Polymerase Chain Reaction (PCR) Primer Development
The WebSat software [30] was used to detect SSR loci in the nucleotide sequences of Rubus and Fragaria × ananassa (the garden strawberry or simply strawberry, a widely grown hybrid species of the genus Fragaria) flavonoid biosynthesis genes available at the NCBI GenBank database (http://www.ncbi.nlm.nih.gov) (Table 2). The Primer3 software (http://primer3.org) was used to design appropriate polymerase chain reaction (PCR) primers based on the sequences flanking the SSR loci. The minimum number of repeats used to select an SSR locus was nine for mononucleotide motifs, five for dinucleotide motifs, three for tri- and tetranucleotide motifs, and two for penta- and hexanucleotide motifs. Primers were designed using the following criteria: primer length of 18-27 bp (optimally 22 bp), GC content of 40%-80%, annealing temperature of 57-68 °C (optimally 60 °C), and expected amplified product size of 100-400 bp. Primers for the RiG001 locus were as in [8]. Primers were synthesized by Syntol Company (Moscow, Russia) and are summarized in Table 2.
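The repeat-number thresholds above can be illustrated with a minimal Python sketch. This is not the WebSat algorithm; it is a simple regular-expression scan written for illustration, and the function names (`find_ssrs`, `is_primitive`) and the example sequence are hypothetical. It only finds perfect repeats, and a repeat that directly abuts another run may be missed.

```python
import re

# Minimum repeat counts per motif length, as stated in the text:
# mono 9, di 5, tri 3, tetra 3, penta 2, hexa 2.
MIN_REPEATS = {1: 9, 2: 5, 3: 3, 4: 3, 5: 2, 6: 2}


def is_primitive(motif):
    """Return True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for size, min_rep in min_repeats.items():
        # One motif of `size` bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. "GAGA", which is just the dinucleotide "GA" twice.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Hypothetical sequence with a (GA)6 and a (CTT)3 repeat.
example = "TGCC" + "GA" * 6 + "TCCA" + "CTT" * 3 + "GG"
print(find_ssrs(example))
```

Filtering out non-primitive motifs keeps a dinucleotide run from being reported a second time as a tetra- or hexanucleotide repeat.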

DNA Isolation, PCR Amplification and Fragment Analysis
A single DNA sample per cultivar was prepared from young expanding leaves of a single plant. Total genomic DNA was extracted using the CTAB method [31]. The quality and quantity of extracted DNA were determined with a NanoDrop 2000 spectrophotometer (ThermoFisher). The final concentration of each DNA sample was adjusted to 50 ng/µL in TE buffer before PCR amplification.
For genotyping, PCR was performed separately for each primer pair using a forward primer labeled with the fluorescent dye 6-FAM and an unlabeled reverse primer (Syntol, Russia). The PCR amplification was performed in a total volume of 20 µL containing 50 ng of genomic DNA, 10 pmol of the labeled forward primer, 10 pmol of the unlabeled reverse primer, and PCR Mixture Screenmix (Eurogen, Russia). After an initial denaturation at 95 °C for 3 min, DNA was amplified for 33 cycles in a gradient thermal cycler (Bio-Rad, Hercules, CA, USA) programmed for a 30 s denaturation step at 95 °C, a 20 s annealing step at the optimal annealing temperature of the primer pair, and a 35 s extension step at 72 °C. A final extension step was done at 72 °C for 5 min.
PCRs that generated clear, stable, and specific DNA fragments within the expected length range (200-400 bp) were considered successful amplifications. If a primer pair failed three times to amplify template DNA that was amplified with other primers, it was scored as a null genotype.
Separation of amplified DNA fragments was performed on an ABI 3130xl Genetic Analyzer using the S450 LIZ size standard (Syntol Company, Moscow, Russia). Peak identification and fragment sizing were done using the GeneMapper v4.0 software (Applied Biosystems, Foster City, CA, USA).

Genetic Data Analysis
Genetic parameters were calculated for 21 raspberry cultivars based on seven polymorphic SSR loci. The allele frequencies, number of alleles, observed (Ho) and expected (He) heterozygosities, and polymorphism information content (PIC) were calculated using the PowerMarker v.3.25 software [32]. This software was also used to estimate Nei's standard genetic distance between each pair of cultivars and to generate a UPGMA dendrogram, which was visualized using the Statistica software (TIBCO Software Inc., Palo Alto, CA, USA).
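For readers who want to see how these per-locus statistics are defined, the following is a minimal Python sketch using the textbook formulas (He = 1 − Σp², and PIC as He minus the sum of 2p²q² over allele pairs). It is an illustration only: PowerMarker may apply different estimators or sample-size corrections, and the function name `locus_stats` and the allele sizes in the example are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Compute allele count, Ho, He and PIC for one diploid SSR locus.

    genotypes: list of (allele_a, allele_b) tuples, one per cultivar;
    use None for a missing (null) genotype, which is skipped.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    counts = Counter(allele for g in called for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]

    # Observed heterozygosity: share of cultivars carrying two different alleles.
    ho = sum(1 for a, b in called if a != b) / n
    # Expected heterozygosity (gene diversity): 1 - sum of squared frequencies.
    he = 1.0 - sum(p * p for p in freqs)
    # PIC: He minus 2 * p_i^2 * p_j^2 summed over all allele pairs.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"alleles": len(counts), "Ho": ho, "He": he, "PIC": pic}


# Hypothetical genotypes (fragment sizes in bp) for four cultivars.
print(locus_stats([(100, 100), (100, 102), (102, 102), (100, 102)]))
```

With two equally frequent alleles, this gives He = 0.5 and PIC = 0.375, the maximum values for a biallelic locus.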

Polymorphism and Genetic Diversity Analysis
Nine SSR markers (six based on Rubus and three on Fragaria nucleotide sequences of the flavonoid biosynthesis genes) were used to estimate genetic diversity in 19 red raspberry (R. idaeus) and two black raspberry (R. occidentalis) cultivars. All PCR primer pairs amplified one or two alleles. In red raspberry, two loci (RiTT01 and FaAR01) were monomorphic, and the other seven were polymorphic. In the black raspberry cultivars, RiG001 was not amplified at all, six loci were monomorphic, and only two were polymorphic (Table 2). In total, 26 alleles were found at the seven polymorphic microsatellite loci. The number of alleles per locus varied from two (FaFS02 and FaFS01) to nine (RiMY01), with an average of 3.7 alleles per locus (Table 3). The RiMY01 locus was the most polymorphic. In general, the SSR loci located in introns were more polymorphic than those in exons. Some alleles were specific to a species or cultivar, such as allele 358 at the RiMY01 locus, found only in black raspberry, and alleles 267 and 269 at the RhUF01 locus, found only in the Meteor (red raspberry) and Jewel (black raspberry) cultivars, respectively. Meteor also carried a unique allele 333 at the RiMY01 locus.
Parameters of genetic variation for the seven polymorphic SSR loci in 21 Rubus cultivars are presented in Table 3. Expected heterozygosity (He) ranged from 0.05 at the FaFS02 locus to 0.84 at the RiMY01 locus, with an average value of 0.36. Observed heterozygosity (Ho) was zero at the RhUF01 locus and otherwise ranged from 0.05 at the FaFS02 locus to 0.57 at the RiMY01 locus, with an average value of 0.29. The observed heterozygosity was lower than expected at four microsatellite loci and on average (Table 3). On average, the expected and observed heterozygosities were higher for the SSRs in introns (0.49 and 0.44, respectively) than for the SSRs in exons (0.20 and 0.08, respectively). The average PIC was 0.332 and varied from 0.05 at the FaFS02 locus to 0.82 at the RiMY01 locus (Table 3).
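As a worked check of how the lowest values relate to each other, assume (hypothetically, since the individual genotypes are not given here) that the observed heterozygosity of 0.05 at FaFS02 reflects exactly one heterozygous cultivar among 21 at a biallelic locus, so that the minor allele has a frequency of 1/42. Under the textbook definitions, which are not necessarily the exact estimators used by PowerMarker:

```latex
\begin{aligned}
H_o &= \tfrac{1}{21} \approx 0.048,\\
H_e &= 1 - \left(\tfrac{41}{42}\right)^{2} - \left(\tfrac{1}{42}\right)^{2}
     = \tfrac{82}{1764} \approx 0.046,\\
\mathrm{PIC} &= H_e - 2\left(\tfrac{41}{42}\right)^{2}\left(\tfrac{1}{42}\right)^{2}
     \approx 0.046 - 0.001 = 0.045 .
\end{aligned}
```

All three values round to 0.05, which is consistent with the Ho and PIC of about 0.05 reported for FaFS02.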

Cluster Analysis
A UPGMA dendrogram was constructed for 21 raspberry cultivars based on seven SSR markers located in the flavonoid biosynthesis genes (Figure 1). The dendrogram clearly separates red and black raspberries. Among the red raspberry cultivars, there is a group of cultivars with yellow-orange berries (Anne, Poranna Rosa, Oranzhevoe Chudo, and Zolotaya Osen), which forms a separate cluster. The same group also includes the Bryanskoe Divo cultivar with light red berries. At the same time, Zheltyj Gigant (yellow berries) and Beglyanka (orange berries) were not included in this group. Separation of cultivars did not follow their genetic origin. The cultivars Beglyanka, Solnyshko, and Meteor, which share the same genetic origin from the Kostinbrodskaya × Novost Kuzmina cross, were completely separated from each other. In addition, Babye Leto II, which also has this cross in its ancestry (Autumn Bliss × (September × (Kostinbrodskaya × Novost Kuzmina))), turned out to differ the most from the other raspberry cultivars. Gerakl and Sugana, both of which have Autumn Bliss as a parent, were also clearly separated. At the same time, close similarities were observed for cultivars from different geographic regions. No genetic differences were found between the Oranzhevoe Chudo (Russia) and Poranna Rosa (Poland) cultivars, or between the Amira (Italy) and Tarusa (Russia) cultivars, although they have different genetic origins. The Brilliantovaya and Pingvin cultivars, both obtained with the use of interspecific hybrids, were also identical.
Figure 1. UPGMA dendrogram of the 21 raspberry cultivars based on seven SSR markers located in the flavonoid biosynthesis genes. See Table 1 for the full cultivar names.
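The UPGMA procedure behind such a dendrogram can be sketched in a few lines of Python. This is a generic average-linkage implementation for illustration, not the PowerMarker code; the function name `upgma`, the labels A to D, and the distance values are hypothetical and are not the Nei's distances from this study.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster labels by UPGMA (average linkage).

    names: list of labels; dist: dict mapping (label_i, label_j) to a distance.
    Returns the tree as nested tuples and the list of (a, b, height) merges.
    """
    d = {}
    for (a, b), value in dist.items():
        d[(a, b)] = d[(b, a)] = value  # store both orders for easy lookup
    sizes = {name: 1 for name in names}  # cluster -> number of leaves
    merges = []
    while len(sizes) > 1:
        active = list(sizes)
        # Join the two closest clusters.
        a, b = min(combinations(active, 2), key=lambda pair: d[pair])
        new = (a, b)
        merges.append((a, b, d[(a, b)] / 2))
        # Distance to the new cluster is the size-weighted mean distance.
        for c in active:
            if c not in (a, b):
                avg = (sizes[a] * d[(a, c)] + sizes[b] * d[(b, c)]) / (
                    sizes[a] + sizes[b]
                )
                d[(new, c)] = d[(c, new)] = avg
        sizes[new] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes)), merges


# Hypothetical pairwise distances between four cultivars.
example = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("B", "C"): 6.0,
    ("A", "D"): 10.0, ("B", "D"): 10.0, ("C", "D"): 8.0,
}
tree, merges = upgma(["A", "B", "C", "D"], example)
print(tree)
```

Two cultivars with identical genotypes at all loci have a distance of zero and are joined first at height zero, which is how identical pairs would appear in such a tree.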

Discussion
SSR markers (microsatellites) are widely used in genetic diversity studies, QTL and genetic mapping, marker-assisted selection (MAS), and cultivar identification, because they are multi-allelic, co-dominant, highly informative, relatively accurate, and easily detected [33]. In Rubus, SSR markers have often been used for mapping [9,13], for fingerprinting germplasm [34], and in studies of genetic diversity and population structure within [28] and among [29] species. However, genetic diversity has not previously been studied in terms of the genes of a specific metabolic pathway that determine valuable breeding traits.
In this study, we report on the evaluation of a number of red and black raspberry cultivars using SSR loci representing known sequences of the flavonoid biosynthesis pathway genes, which are responsible for the synthesis of biologically active substances with high antioxidant activity, namely flavonols and anthocyanins. Among these microsatellite loci, six (RcFH01, FaFS01, FaFS02, RiAS01, FaAR01, and RhUF01) were located in the structural genes of flavonoid biosynthesis (F3H, FLS, ANS, ANR, and UFGT) and two (RiMY01 and RiTT01) in the regulatory genes (MYB10 and TTG1). Flavanone-3-hydroxylase (F3H) is a key enzyme in flavonoid biosynthesis in plants, as it catalyzes formation of 3-hydroxy flavonol, a common precursor of anthocyanins, flavanols, and proanthocyanidins [35]. Particular attention was paid to the flavonol synthase gene, for which two loci were used. Flavonol synthase (FLS) is an important enzyme of the flavonoid pathway that catalyzes the formation of flavonols from dihydroflavonols, and thus may influence anthocyanin levels, as dihydroflavonols are intermediates in the production of both colored anthocyanins and colorless flavonols [36]. Anthocyanidin synthase (ANS) catalyzes the synthesis of anthocyanidin, the first colored compound in the anthocyanin biosynthetic pathway, from which anthocyanidin reductase (ANR) catalyzes the formation of proanthocyanidins (condensed tannins) [37]. The last common step in the production of stable anthocyanins is glycosylation by the enzyme UDP-glucose/flavonoid 3-O-glucosyl transferase (UFGT) [38].
In addition, we used loci located in the sequences of two transcription factor genes (MYB10 and TTG1), whose products belong to the MBW complex, which regulates the expression of the late biosynthetic genes [27]. For comparison, we also used a pair of primers designed for the RiG001 locus using the sequence of the R. idaeus aromatic polyketide synthase (RiPKS3) gene, which was not amplified in blackberry cultivars [8]. The RiPKS3 gene product differs in four amino acid positions from that of the RiPKS1 gene, which encodes a typical chalcone synthase (CHS) catalyzing the first step of flavonoid biosynthesis, and in vitro it produced predominantly p-coumaryltriacetic acid lactone and low levels of chalcone [39]. Within the PCR fragment amplified by the primers for the RiG001 locus, the sequence of the RiPKS3 gene (NCBI GenBank AF292369) differed from the RiPKS1 gene sequence (AF292367) by a 2 bp deletion and a single-nucleotide insertion. Three alleles (349, 350, and 351 bp) were obtained for this locus (Table 2).
In addition to the gene sequences of Rubus plants (R. idaeus, R. coreanus, and a Rubus hybrid), we used gene sequences from Fragaria × ananassa, which is a close relative of Rubus from the same subfamily, Rosoideae. Rubus and Fragaria have the same base chromosome number (x = 7), similar morphology, and similar chloroplast and nuclear DNA phylogenies [13].
From among the three most economically important types of raspberry, we used 19 cultivars of red raspberry with a wide range of berry color from various breeding centers around the world and two cultivars of black raspberry. Both species, red (R. idaeus) and black (R. occidentalis) raspberry, belong to the same subgenus Idaeobatus (raspberries) and are diploids (2n = 2x = 14), while blackberry species vary greatly in ploidy [34].
In our study, the average number of alleles for the seven polymorphic SSR loci in the flavonoid biosynthesis genes was 3.71, the mean Ho and He were 0.286 and 0.360, respectively, and the mean PIC was 0.332. These values were generally lower than previously reported for R. idaeus [8] and R. coreanus [29], but quite comparable with the data for black raspberry cultivars [28]. Perhaps this is because red raspberry cultivars are, for the most part, complex hybrids with a limited gene pool [34], and selection for berry quality has further reduced their diversity. The expected heterozygosity (He) was higher than the observed heterozygosity (Ho) both on average and at most individual loci. These data differ from other studies of Rubus species, in which the two parameters were approximately equal [8,29] or the observed heterozygosity was even higher [28]. However, unlike those studies, in which population samples were used, this study used a collection of different cultivars, which is not a population sample but a mixture of genotypes with different genetic backgrounds and origins. Therefore, an excess of expected over observed heterozygosity is to be expected due to the Wahlund effect.
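The Wahlund effect mentioned above can be illustrated numerically with a short Python sketch. It assumes a biallelic locus and equally sized subgroups that are each in Hardy-Weinberg equilibrium; the function name `wahlund` and the allele frequencies are hypothetical and are not taken from this study.

```python
def wahlund(subgroup_freqs):
    """Compare pooled observed and expected heterozygosity at a biallelic locus.

    subgroup_freqs: frequency of one allele in each equally sized subgroup,
    each subgroup assumed to be in Hardy-Weinberg equilibrium.
    Returns (Ho, He) for the pooled sample.
    """
    k = len(subgroup_freqs)
    # Heterozygotes actually present: the mean of 2pq within subgroups.
    ho = sum(2 * p * (1 - p) for p in subgroup_freqs) / k
    # Heterozygosity expected from the pooled allele frequency.
    p_bar = sum(subgroup_freqs) / k
    he = 2 * p_bar * (1 - p_bar)
    return ho, he


# Two hypothetical subgroups with very different allele frequencies.
ho, he = wahlund([0.9, 0.1])
print(round(ho, 2), round(he, 2))
```

The deficit He − Ho equals twice the variance of the allele frequency among subgroups, so it vanishes only when all subgroups share the same frequency.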
Only the RiMY01 locus was highly polymorphic (PIC = 0.82). This locus had three SSR regions, two of which represent dinucleotide repeats. These data coincide with the results of Castillo et al. [8], in which all three highly informative markers (PIC = 0.78-0.82) represented dinucleotide repeats. In R. coreanus, among five highly polymorphic markers (PIC > 0.7), four represented dinucleotide repeats and one represented trinucleotide repeats [29]. The high variation of the RiMY01 locus can be explained by its location in the first intron of the transcription factor MYB10 gene. SSR markers located in introns were more variable than those located in exons (expected and observed heterozygosities averaged 0.49 and 0.44 vs. 0.20 and 0.08, respectively). Our results are in agreement with those of Garcia-Gomez et al. [40], who showed that SSRs in introns had a higher level of heterozygosity than SSRs in exons in Prunus species (0.65 vs. 0.17, respectively). Similar results were also obtained in maize [41]. Significantly higher variation was also observed for SNPs in noncoding regions compared to coding ones [42]. In general, introns are more variable than exons, as they are under less selection pressure during the evolutionary process [43].
The lengths of most alleles at the RiMY01 locus differ from each other by two-nucleotide steps, which is consistent with the dinucleotide repeats of the SSR motifs in this locus. However, imperfect repeats also often occur in raspberry SSR loci. For instance, Fernandez et al. [34] previously reported alleles differing in length by consecutive one-nucleotide steps in the Rubus57a and Rub5a markers. This single-nucleotide stepwise variation is expected for Rub5a, which is an SSR marker with a mononucleotide motif, but Rubus57a is an SSR marker with a dinucleotide motif. We also observed a few alleles with imperfect repeats, such as the unique allele 267 of the RhUF01 locus in the Meteor cultivar, for which the perfect allele size is 270 following the trinucleotide motif GAG stepwise allelic variation.
The black raspberry cultivars were highly homozygous: six out of eight loci were monomorphic (Table 2). High homozygosity in black raspberry was also found earlier by Lewers and Weber [44], who noted that the level of homozygosity was 80% for black raspberry but only 40% for red raspberry. In another study, 21 SSR loci were unable to distinguish between six of the black raspberry cultivars [28]. However, the black raspberry cultivars Cumberland and Jewel were well discriminated in this study: despite the small number of loci used, these two cultivars were separated by two loci, RcFH01 and RhUF01. In our study, the red raspberry cultivars were easily discriminated from the black raspberry cultivars by the black raspberry specific allele 358 at the RiMY01 locus and by allele 309 at the RiAS01 locus, which occurred almost exclusively in the black raspberry cultivars, the only exception being the red raspberry cultivar Babye Leto II. In addition, the RiG001 locus was not amplified in black raspberry. The same was observed in 48 previously tested blackberry cultivars [8]. Thus, with respect to this locus, black raspberry is closer to blackberry than to red raspberry, although black raspberry and blackberry belong to different subgenera. The lack of amplification of RiG001 and the unique allele 358 at the RiMY01 locus can be used to separate the red raspberry cultivars from the black ones.
Cluster analysis of the SSR markers located in the flavonoid biosynthesis genes showed a clear separation of the black raspberry (R. occidentalis) cultivars with black berries from the red raspberry (R. idaeus) cultivars with berries colored from yellow to dark red (Figure 1). It is also important to note that five cultivars with berries of similar light shades (three with yellow berries, one with orange, and one with light red berries), despite having completely different origins, still clustered together in one sub-group. Perhaps gene-targeted markers [45], such as SSR loci in the flavonoid biosynthesis genes, reflect genetic similarity for traits likely controlled or affected by these genes, such as berry color, better than random genomic SSR markers do.
Castillo et al. [8] found that the primocane fruiting (fall fruiting) raspberry cultivars were grouped into a separate cluster. Fernandez et al. [34] showed that the majority of primocane-fruiting material from various breeding programs, as well as some very early ripening floricane-fruiting genotypes, grouped into one cluster. This shows that cultivars can be grouped according to a particular trait regardless of their origin. At the same time, two cultivars with yellow and orange fruits (Zheltyj Gigant and Beglyanka) fell into another group, together with red-fruited cultivars. Perhaps a clearer separation would require additional, more polymorphic markers, including markers in other flavonoid biosynthesis genes not represented in this study.
Moreover, it is possible that the yellow color of raspberry fruits can arise by two or more mechanisms. For example, primocane fruiting cultivars were also distributed in two different groups [34]. The genetic mechanisms for the formation of yellow color in raspberry fruit have not yet been fully studied. Although assumptions on this topic were made back in the 1930s, it was not until 2016 that an inactive anthocyanidin synthase (ANS) allele was identified in yellow raspberry [46]. A 5 bp insertion in the coding region of the gene creates a premature stop codon, resulting in a truncated amino acid sequence of the defective ANS protein. However, other mechanisms are also possible, such as combinations of recessive and dominant alleles, or transcription factors, which may lead to a huge variety of berry colors in raspberry.
Clustering based on the flavonoid pathway markers also showed a lack of association between cultivars of related origin. This is the opposite of the results obtained in analyses with randomly selected SSR markers evenly distributed across the genome. For example, Fernandez et al. [34] demonstrated that one cluster was almost entirely composed of cultivars from the Scottish raspberry breeding program or cultivars based on their germplasm. From the point of view of MAS, the use of gene-targeted markers to assess genotypes for particular breeding traits is preferable to the use of random SSR markers. Graham et al. [9] suggested in 2004 that Rubus idaeus, owing to its diploid set of chromosomes (2n = 2x = 14) and very small genome (275 Mb), may be used as a model species for the Rosaceae. For many years, this was impeded by the lack of a full-genome Rubus sequence, although the genomes of other Rosaceae species had already been sequenced, such as apple in 2010, strawberry in 2011, and pear and peach in 2013 [47]. However, the situation is changing, as the genomes of R. occidentalis [48] and R. idaeus [49] have recently been published. This will facilitate the development of gene-targeted markers that can advance the breeding of Rubus for important traits, including those related to the nutritional value of the berries.

Conclusions
In this study, we demonstrated that a set of gene-targeted SSR markers representing structural and regulatory genes of flavonoid biosynthesis could potentially allow a more informative and meaningful evaluation of the genetic relationships between different cultivars of red and black raspberries, one that reflects the color of their berries and possibly also their nutritional value. However, the study did not compare this set of gene-targeted markers with an analysis of the same germplasm set using neutral markers. A comparative analysis using a set of neutral SSR markers would be important to support this particular conclusion. The developed primer set can potentially be used for MAS in Rubus breeding programs for improving the nutritional quality of fruits. This first requires confirmation that the SSR alleles identified correlate with differences in the content of flavonoids. Additional studies and further development of these gene-targeted markers are needed to validate this approach.