Assessment of the Genetic Distinctiveness and Uniformity of Pre-Basic Seed Stocks of Italian Ryegrass Varieties

Lolium multiflorum Lam., commonly known as Italian ryegrass, is a forage grass mostly valued for its high palatability and digestibility, along with its high productivity. However, Italian ryegrass has an outbreeding nature and therefore has high genetic heterogeneity within each variety. Consequently, the exclusive use of morphological descriptors in the existing varietal identification and registration process based on the Distinctness, Uniformity, and Stability (DUS) test results in an inadequately precise assessment. The primary objective of this work was to effectively test whether the uniformity observed at the phenological level within each population of Italian ryegrass was confirmed at the genetic level through an SSR marker analysis. In this research, using 12 polymorphic SSR loci, we analyzed 672 samples belonging to 14 different Italian ryegrass commercial varieties to determine the pairwise genetic similarity (GS), verified the distribution of genetic diversity within and among varieties, and investigated the population structure. Although the fourteen commercial varieties did not show elevated genetic differentiation, with only 13% of the total variation attributable to among-cultivar genetic variation, when analyzed as a core, each variety constitutes a genetic cluster on its own, resulting in distinct characteristics from the others, except for two varieties. In this way, by combining a genetic tool with the traditional morphological approach, we were able to limit biases linked to the environmental effect of field trials, assessing the real source of diversity among varieties and concretely answering the key requisites of the Plant Variety Protection (PVP) system.


Introduction
Lolium multiflorum Lam., commonly known as Italian ryegrass, is one of the most important groups of grasses largely used as forage crops due to its high productivity and high nutritional value as livestock feed, especially in terms of fiber palatability and digestibility [1,2]. Over the years, advantages have also been demonstrated from an environmental point of view: L. multiflorum can be used as a soil stabilizer during the winter to provide ground cover against soil erosion and depletion. Moreover, despite its lower persistence and stress tolerance in comparison with Lolium perenne, Italian ryegrass is more productive and can provide faster ground cover due to its timely emergence and seedling vigor.
Both annual and biennial varieties are available on the market, and to prevent any misunderstanding, it is fundamental to clarify that they both belong to this species: the truly annual forms of L. multiflorum Lam. var. westerwoldicum Wittm. (Westerwolds ryegrass) can be distinguished by their complete flowering in the year of sowing from the biennial forms of L. multiflorum Lam. ssp. italicum (A. Br.) Volkarts, which generally produce very few seed heads in the sowing year, complete the cycle the following year after cold and short-day conditions, and remain leafy during the entire season.
Although Italian ryegrass is naturally diploid, tetraploid forms have been developed by chromosome doubling to achieve higher biomass production and better nutritional characteristics. The increase in an organism's cellular ploidy caused by genome replication without mitosis, as occurred in Italian ryegrass using colchicine (i.e., the most efficient anti-mitotic agent and mitosis inhibitor [3]), has been shown to play an important role in physiology and development via cellular, metabolic, and genetic effects [4,5]. Compared to their diploid counterparts, polyploid forage crops display enlarged leaf dimensions and plant height [6,7] but also faster regrowth after grazing and an increasing number of branches and stems [8]. In particular, tetraploid cultivars of Italian ryegrass showed not only faster leaf elongation rates with respect to diploid cultivars, resulting in longer leaves, but also larger shoot dry weights under stress-free conditions [9,10]. Moreover, positive effects have also been reported in enhanced forage quality, since the increased production of secondary metabolites in the induced polyploids together with a higher cell content/cell wall ratio leads to improved succulence and higher forage intake [11,12].
The most common method used to produce Italian ryegrass varieties is based on a recurrent selection strategy, consisting of repeated cycles of intercrossing among selected superior individuals. Almost 700 varieties of Italian ryegrass were reported in the 2022 edition of the OECD (Organization for Economic Co-operation and Development) list of varieties eligible for seed certification [13]. This highlights the economic value of registering a commercial variety and the consequent necessity of protecting breeder rights. The official regulation of Italian ryegrass cultivar identification and registration drafted by the International Union for the Protection of New Varieties of Plants (UPOV) is based on the DUS (Distinctness, Uniformity, and Stability) test, considering all morphological traits, such as length and width of leaves, the intensity of green color, height and width of plants, length of the inflorescence, and the number of spikelets. Ploidy level is the only non-morphological trait considered as a distinction criterion by UPOV. Grow-out tests are performed in both spaced plant and row plot trials, and the visual scoring of traits is frequently performed using numerical scales with different minimum-maximum ranges (e.g., 1-3, 1-5, or 1-9). However, considering the outbreeding nature of Italian ryegrass, and thus the high genetic heterogeneity within each variety, the exclusive use of morphological descriptors to discriminate and compare the increasing number of varieties is becoming difficult and costly. In addition, the morphology and physiology of a plant are the results of strong interactions with environmental factors such as rainfall, temperature, light exposure, and soil composition. Thus, the idea of a unique and fixed guideline based on the evaluation of morpho-physiological traits results in an inaccurate assessment, especially in a worldwide market.
Without a more precise and standardizable characterization, it is impossible to clearly differentiate one variety of Italian ryegrass from another and guarantee real compliance with the DUS requirements in the current Plant Variety Protection (PVP) system.
In this scenario, molecular tools would be useful in implementing the traditional DUS test, as proposed by Gilliland et al. [14], to avoid the plagiarism risk and demonstrate a clear improved value for cultivation and use (VCU) for a new registered variety. Indeed, a DNA marker system could avoid strong environmental dependence, having the potential to investigate many more samples in genotypic assays. In other words, molecular markers such as simple sequence repeats (SSRs) and single-nucleotide polymorphisms (SNPs) represent more robust and objective tools to be integrated into the PVP system for Italian ryegrass. In particular, microsatellite loci consist of short DNA sections of tandemly repeated di-, tri-, and tetranucleotide motifs, and consequent polymorphisms derive from a different number of repetitions. These differences can be detected with conserved PCR primers designed on the nonrepetitive flanking regions. Since they are genetically welldefined and codominant, SSRs have become a powerful genetic tool for marker-assisted breeding (MAS) and the assessment of diversity and uniformity. In addition, SSRs are highly polymorphic markers and are therefore able to discriminate closely related varieties, as often happens for Italian ryegrass. Finally, the SSR assay is easily performed by PCR, a method that does not require specific and expensive laboratory equipment. Despite the recent development of next-generation sequencing (NGS) techniques, particularly restriction site-associated DNA sequencing (RAD-seq) [15] and genotyping-by-sequencing (GBS) technologies, such as genotyping by random amplicon sequencing and direct (GRAS-Di) [16], we want to provide a molecular assay that is less costly and time-consuming and that could be practically applied by small laboratories of seed companies. From the literature, the use of molecular markers in the evaluation of the genetic diversity of L. multiflorum accessions has been reported by few studies using sequence-related amplified polymorphisms (SRAPs) [17], randomly amplified polymorphic DNA (RAPD) [18], and SSRs combined with bulk strategies [19].
In this study, we used 12 polymorphic SSR loci to genotype 14 Italian ryegrass commercial varieties, each composed of 48 individuals (672 samples in total). In the downstream analysis, we determined the genetic similarity (GS), verified the distribution of genetic diversity within and among varieties, and investigated the population structure. The main aim was to concretely test whether the uniformity observed at the phenological level within each population of Italian ryegrass was confirmed at the genetic level through an SSR marker analysis. Thus, in this way, we limited biases linked to the environmental effect of field trials. Thus, we were able to evaluate the real source of diversity among varieties and obtained a concrete answer to the fundamental requisites of the PVP system.

Plant Materials
In this study, 14 commercial varieties of Italian ryegrass, differing in ploidy level (diploid or tetraploid) and vegetative habit (annual or biennial) (Table 1), were provided and bred by the same private company. For each variety (here named from A to P), 80 seeds belonging to the latest generation of a purification process were sown in growing boards with a common potting soil and grown in a greenhouse under short-day conditions and a temperature of 20 • C at the "L. Toniolo" experimental farm of the University of Padova in Legnaro (PD). After three weeks, leaf samples were collected from 48 seedlings of each variety, for a total of 672 samples. Table 1. Plant material information, including variety code, relative ploidy level, and vegetative habit. A  4x  annual  B  2x  annual  C  4x  biennial  D  2x  biennial  E  2x  biennial  F  4x  annual  G  2x  biennial  H  4x  annual  I  4x  annual  L  2x  biennial  M  4x  biennial  N  4x  biennial  O  2x  annual  P 4x annual

DNA Extraction
Genomic DNA (gDNA) was extracted from young freeze-dried leaves finely ground with a TissueLyser II mill (Qiagen, Valencia, CA, USA) using a DNeasy ® 96 Plant Kit (Qiagen) following the manufacturer's protocol. The quantity and quality of the gDNA were checked using a NanoDrop spectrophotometer (Thermo Fisher Scientific, Pittsburgh, PA, USA) in terms of concentration and 260/280 and 260/280 ratios. Extracted gDNA was stored at −20 • C until further PCR amplification.

SSR Marker Genotyping
For genotyping analysis, an initial batch of 28 primer pairs was chosen from Hirata [20] and Guan [21] considering their polymorphism information content (PIC) and their distribution within the seven linkage groups (LG). Preliminary tests on a subset of four randomly chosen samples were performed in singleplex reactions to verify primer efficiency in terms of the presence of polymorphic alleles for each marker. Thus, 12 primer pairs were selected and combined into three different multiplex groups (Table 2) based on similar annealing temperatures, diverse predicted amplicon sizes, and minimum tendencies of dimer formation checked with PerlPrimer v1.1.21. software. Multiplex PCRs were performed according to the three-primer system described by Schuelke [22], with some minor changes. Each locus was amplified by a pair of locus-specific primers, one with an oligonucleotide tail at the 5 end (M13, PAN-1, PAN-2, or PAN-3) and a third primer complementary to the tail and labeled with a fluorescent dye (6-FAM, VIC, NED, or PET), necessary to subsequently discriminate the different loci during chromatogram screening. The multiplex PCR mixtures had a total reaction volume of 20 µL resulting from 1 × Platinum ® Multiplex PCR Master Mix, 10% GC Enhancer (Applied Biosystems, Carlsbad, CA, USA), 0.25 µM of each tailed primer, 0.75 µM of each tailed reverse primer, 0.5 µM of each fluorescent labeled primer (Applied Biosystems), 20-30 ng of gDNA, and sterile water to volume. Touchdown PCRs were completed for all three multiplexes in 96-well plates under the following conditions: initial denaturation of 5 min at 95 • C, followed by 6 cycles at 95 • C for 30 s, extension at 65 • C for 45 s, which decreased by 1 • C with each cycle, and at 72 • C for 45 s, then 34 cycles at 95 • C for 30 s, at 59 • C for 45 s, and at 72 • C for 45 s. The reaction finished with a final extension step of 30 min at 60 • C. Finally, the quality of the amplicons was verified by electrophoresis run on a 2% agarose/1 TAE gel containing 1× SYBR Safe DNA Gel Stain (Life Technologies).
The PCR products were dried at 65 • C and investigated with capillary electrophoresis using an ABI 3730 DNA analyzer, adopting the GeneScan 500 Liz as the molecular weight internal standard. Thus, chromatograms were screened to define amplicon size at each locus using Peak Scanner software 2.0 (Applied Biosystems).

Genetic Similarity Estimates and Genetic Diversity and Relationship Analyses
Two distinct datasets for marker statistics and analyses were constructed, setting different thresholds of missing data: a dataset admitting samples with missing data of two SSR loci at most, and a second dataset composed of samples with missing data of five SSR loci at most. The strictest dataset (576 samples) was uploaded to the NTSYS software package v.2.21c [23] and used to calculate the genetic similarity (GS) among individuals in all possible pairwise comparisons based on Rohlf's simple matching (SM) coefficient. This coefficient was used because it also takes into account negative co-occurrences of alleles between different samples and populations. In addition, the average GS within and among each variety was calculated. Using the same software and Rohlf's GS matrix, a principal coordinates analysis (PCoA) was carried out.
The genetic relationship analysis was performed according to the maximum-likelihood method (ML) implemented in IQ-Tree v1.6.12 software. This approach enabled us to associate phylogenetic and combinatorial optimization techniques into a fast and effective tree search algorithm with the implementation of the ultrafast bootstrap approximation approach [24]. The SSR marker dataset (576 samples) resulting matrix was analyzed as binary data using the GTR2 method (GTR2 + I + G4 + FO), according to the BIC value found with the ModelFinder algorithm available in the IQ-Tree. The GTR model, used to investigate the genetic relationship with SSR data, was also selected according to the scientific research of Huang et al. [25], Vieira et al. [26], and Minin et al. [27]. Statistical support for the ML dendrogram was computed by running 1000 replicates until convergence for ultrafast bootstrap (UFB) (-bb 1000) [28,29] and 1000 rounds of SH-like approximate likelihood ratio tests (SH-aLRT) (-alrt 1000) [30].
On the other hand, the second most comprehensive dataset, including 644 samples, was used both for the SSR statistics and the population structure analysis. GenoDive v3.0 software [31] was used to perform the Analysis of Molecular Variance (AMOVA) through Fstatistics to analyze the molecular variance at different levels of the population structure (i.e., individual, subpopulation, and population levels), with the number of permutations = 999. GenoDive allows for a correction of the unknown dosage of alleles, permitting the statistics to estimate without or with a reduced bias. This correction uses a maximum likelihood method based on random mating within populations, as first proposed by De Silva et al. [32] and then modified by Meirmans et al. [33]. Genetic diversity was also investigated through the following statistics: number of observed alleles (N a ), number of effective alleles (N e ), and observed (H o ) and expected (H e ) heterozygosity within subpopulations assuming Hardy-Weinberg equilibrium according to Nei. These parameters were also calculated for each Italian ryegrass variety.
F-statistics were computed, and the fixation index (F ST ) and gene flow (N m ) were calculated for each marker locus. F ST measures the amount of genetic variance that can be explained by population structure based on Wright's F-statistics [34], while N m = {(1/F ST ) − 1}/4. An F ST value of 0 indicates no differentiation between the subpopulations, while a value of 1 indicates complete differentiation [35]. Inbreeding coefficients (F IT and F IS ) were computed to measure the deficiency (positive values) or excess (negative values) of heterozygotes for each assessed microsatellite. Similarly, inbreeding coefficients were calculated at the multilocus level to estimate the genetic effect of total population subdivision as a proportional reduction in overall heterozygosity due to variation in SSR allele frequencies among different subpopulations. Finally, marker allele frequencies for each locus were also determined.
Considering each variety separately, Wright's inbreeding coefficient G IS was evaluated to investigate the deficiency and excess of heterozygosity within every Italian ryegrass variety. Moreover, a matrix with Nei's gene diversity (G st ) values (analogous to F ST ) was constructed for all possible pairwise comparisons between the fourteen varieties.

Genetic Structure Analyses
In addition, the population structure of the 644 samples was assessed using the clustering algorithm of STRUCTURE v2.2 software. Since no a priori knowledge of the origin of the populations under study was available, the admixture model was chosen. We ran the Markov chain Monte Carlo (MCMC) model with 100,000 iterations and a burn-in of 20,000 samples under the assumption that the allele frequencies in the populations were correlated. Ten iterations were conducted for each value of the number of populations (K), with K ranging from 1 to 20. The obtained results were analyzed using the STRUCTURE HARVESTER web software to calculate the best value of K according to Evanno et al. [36], and then estimates of membership were plotted as a histogram using an Excel spreadsheet.

Descriptive Statistics of SSR Marker Loci
A total of 239 alleles were detected across the 14 varieties for the 12 microsatellite loci with an average number of observed alleles (N a ) of 19.9, ranging from 10 (02_01B) to 48 (02_10B). Moreover, the effective number of alleles (N e ) per locus varied from 1.7 (02_01B) to 5.2 (02_10B), as reported in Table 3. The same parameters calculated for each variety were equal to 7.3 (N a ) and 3.5 (N e ), ranging from 4.4 in variety I to 9.3 in variety N and from 2.5 (variety A) to 4.6 (variety D), respectively (Table 4).  Allele frequencies were calculated per locus within each population, permitting the identification of the most common genotype.
Out of 239 alleles, 108 were considered "rare", since their overall frequency was lower than 1% [37]. Rare alleles were found at each locus, ranging from 4 to 16. Private alleles, those detected only in a specific variety but absent in all the others, were also searched. Fifty-nine private alleles were detected. On average, each variety presented five private alleles (from 1 of A to 20 of O), with the exception of G and I, which did not show any private allele. Most of them, i.e., 52 out of 59, had a frequency lower than 5% but higher than 1% (not considerable as rare alleles).
Descriptive statistics for Nei's genetic diversity (H-statistics) and Wright's inbreeding coefficients (F-statistics and G IS ) for single marker locus and population accessions were  Table 5). Levels of pairwise population differentiation were variable, ranging from 0.019 to 0.236. The lowest G ST value (0.019) was detected between the M and N varieties, which showed that these varieties had the lowest genetic differentiation and differences. Conversely, the highest G ST value (0.236) was presented between the A and L varieties, which had the highest genetic differentiation and differences. Pairwise comparisons among the 12 microsatellites and all 14 varieties revealed a significant deviation from the Hardy-Weinberg equilibrium with a significant multilocus heterozygosity deficiency (Wahlund effect), since at the multilocus level, the inbreeding coefficient G IS showed all positive values, ranging from 0.142 to 0.62 (Table 6).

Genetic Diversity and Clustering Analysis
Genetic variability within and between varieties was investigated primarily by calculating genetic similarity (GS) for all possible pairwise comparisons among the 576 samples (the strictest dataset) using the entire set of marker alleles scored at all genomic loci. In particular, a pairwise genetic similarity matrix was calculated using a simple matching coefficient. Rohlf's genetic similarity ranged from 78% to 99% among all individuals analyzed. When calculated within each variety, GS varied on average from 88.5% (±2.03) within the N variety to 92.6% (±1.99) within the G variety (Table 7). In the pairwise comparisons between varieties, the N and H populations showed the lowest average value (86.48 ± 1.99%), while the I and A populations exhibited the highest value (90.45 ± 2.07%) (Figure 1).   In addition, an analysis of molecular variance (AMOVA) was conducted to better understand the distribution of genetic differentiation among and within varieties of Italian ryegrass. Findings from AMOVA revealed that 87% (resulting from the sum of 60% found within all the individuals and 27% found among the individuals nested in populations) of the total genetic variation was contributed by differences within varieties, which In addition, an analysis of molecular variance (AMOVA) was conducted to better understand the distribution of genetic differentiation among and within varieties of Italian ryegrass. Findings from AMOVA revealed that 87% (resulting from the sum of 60% found within all the individuals and 27% found among the individuals nested in populations) of the total genetic variation was contributed by differences within varieties, which was notably and significantly higher than that among varieties, since only 13% of the total genetic variation was due to differences among varieties.
Clustering analyses were performed to identify PCoA centroids based on the genetic similarity estimates and to draw the maximum likelihood (ML) dendrogram. Generally, the ML dendrogram revealed a slightly structured distribution of Italian ryegrass accessions ( Figure 2). In most cases, individuals from different varieties clustered separately, although two admixed groups were present. Interestingly, the A variety appeared totally distinguishable from all the others, with a statistical UFB support of 79. Also identified were two main monophyletic subclusters, which divided most of the samples of the P-M/N-F-I-L-G varieties from the E-H-D-B-C varieties. Notably, samples from the P and 8 of the D variety formed a distinct cluster that differed from other varieties. Additionally, varieties D, E, and M/N showed slight fragmentation, since they were split into three or two parts. Notably, samples belonging to the O variety did not cluster together, spreading into almost all other clusters, especially into the two admixed groups and with the C and E samples.
Genes 2022, 13, x FOR PEER REVIEW 10 was notably and significantly higher than that among varieties, since only 13% of the genetic variation was due to differences among varieties. Clustering analyses were performed to identify PCoA centroids based on the ge similarity estimates and to draw the maximum likelihood (ML) dendrogram. Gene the ML dendrogram revealed a slightly structured distribution of Italian ryegrass a sions (Figure 2). In most cases, individuals from different varieties clustered separa although two admixed groups were present. Interestingly, the A variety appeared to distinguishable from all the others, with a statistical UFB support of 79. Also ident were two main monophyletic subclusters, which divided most of the samples of th M/N-F-I-L-G varieties from the E-H-D-B-C varieties. Notably, samples from the P a of the D variety formed a distinct cluster that differed from other varieties. Addition varieties D, E, and M/N showed slight fragmentation, since they were split into thr two parts. Notably, samples belonging to the O variety did not cluster together, sprea into almost all other clusters, especially into the two admixed groups and with the C E samples. Principal coordinates analysis graphically represented the spatial distribution o samples (Figure 3). To obtain clearer and more readable graphical representations plotted the annual varieties separate from the biennial data. In the case of the an Principal coordinates analysis graphically represented the spatial distribution of the samples (Figure 3). To obtain clearer and more readable graphical representations, we plotted the annual varieties separate from the biennial data. In the case of the annual varieties, the first two principal components were able to explain 55.2% of the total genetic variation found within the population as a whole, although most of the accessions were closely plotted in the central area of the four main quadrants. In particular, with the first component, explaining 28  For the biennial varieties, even though most of the accessions were nearly plotted in the central part of the four main quadrants, individuals belonging to the L variety were grouped together and well separated from the other samples. The first two principal components were able to explain 60.3% of the total genetic variation found within the population of the biannual varieties. In particular, the first component, which explained 33.9% of the total diversity, discriminated samples of the L and G varieties from the C and D For the biennial varieties, even though most of the accessions were nearly plotted in the central part of the four main quadrants, individuals belonging to the L variety were grouped together and well separated from the other samples. The first two principal components were able to explain 60.3% of the total genetic variation found within the population of the biannual varieties. In particular, the first component, which explained 33.9% of the total diversity, discriminated samples of the L and G varieties from the C and D varieties. On the other hand, with the second component, explaining 26.4% of the total diversity, accessions of the L variety could be distinguished from the G samples and part of the individuals belonging to the M and N varieties. In addition, as shown in the dendrogram, C could not be distinguished from D and partially from E.

Population Structure
Based on the marker alleles at all SSR loci, the genetic structure of the Italian ryegrass core collection (644 samples) was analyzed using STRUCTURE v. 2.2 software [38] and Structure Harvester software [39] to determine the most likely number of ancestral genotypes represented by the core collection. Following the procedure of Evanno et al. [36], a clear maximum for the ∆K value at K = 2 was found (∆K = 117.5), and a second lowest value was found for K = 13 (∆K = 9.8). In particular, for K = 2 (Figure 4a), the population was split into two genetically distinguishable subgroups, representing the major marker allele clusters or the ancestral multilocus haplotypes. Similarly, the result observed for K = 13 (Figure 4b) enabled the discriminative clustering of the analyzed varieties, with one exception for varieties M and N.
varieties. On the other hand, with the second component, explaining 26.4% of the total diversity, accessions of the L variety could be distinguished from the G samples and part of the individuals belonging to the M and N varieties. In addition, as shown in the dendrogram, C could not be distinguished from D and partially from E.

Population Structure
Based on the marker alleles at all SSR loci, the genetic structure of the Italian ryegrass core collection (644 samples) was analyzed using STRUCTURE v. 2.2 software [38] and Structure Harvester software [39] to determine the most likely number of ancestral genotypes represented by the core collection. Following the procedure of Evanno et al. [36], a clear maximum for the ΔK value at K = 2 was found (ΔK = 117.5), and a second lowest value was found for K = 13 (ΔK = 9.8). In particular, for K = 2 (Figure 4a), the population was split into two genetically distinguishable subgroups, representing the major marker allele clusters or the ancestral multilocus haplotypes. Similarly, the result observed for K = 13 (Figure 4b) enabled the discriminative clustering of the analyzed varieties, with one exception for varieties M and N. Starting from the graphical representation of K = 2, each sample is plotted as a vertical histogram divided into K = 2 (and then for K = 13) with colored segments representing the estimated membership in each hypothesized ancestral genotype. The first clustering of the considered genotypes revealed that 518 of the 644 samples showed a strong ancestral association (>90%). Individuals' membership with the first identified cluster was observed to be higher than 80% in almost all of the analyzed accessions. Particularly, populations A, B, C, D, O, and P had 94%, 91%, 98%, 89%, 92%, and 79% of samples with membership to the first cluster higher than the considered threshold, respectively. On average, all of these varieties presented a membership higher than 90% (except for P with 84%) with the Starting from the graphical representation of K = 2, each sample is plotted as a vertical histogram divided into K = 2 (and then for K = 13) with colored segments representing the estimated membership in each hypothesized ancestral genotype. The first clustering of the considered genotypes revealed that 518 of the 644 samples showed a strong ancestral association (>90%). Individuals' membership with the first identified cluster was observed to be higher than 80% in almost all of the analyzed accessions. Particularly, populations A, B, C, D, O, and P had 94%, 91%, 98%, 89%, 92%, and 79% of samples with membership to the first cluster higher than the considered threshold, respectively. On average, all of these varieties presented a membership higher than 90% (except for P with 84%) with the main cluster. On the other hand, populations E, F, G, H, I, and L had 80%, 100%, 100%, 87%, 93%, and 98% of the samples with membership with the second cluster higher than 80%, respectively. Additionally, in this case, the average membership with the second cluster of all these varieties was higher than 90%, with the exception of the E variety with 87%. In contrast, most of the admixed genotypes (<80% membership in a single ancestral genotype) were from the M and N varieties, with 62% and 67% of the samples, respectively. Specifically, these two populations showed complementary average membership to the two hypothesized ancestral genotypes: the M variety had 63% and 37% membership to Cluster 1 and Cluster 2, while the N variety had 40% and 60%, respectively. More generally, other populations had a very low percentage of admixed samples with this clustering, ranging from 24% and 20% for the E and P varieties to the total absence of admixed genotypes in the F and G populations.
The second largest ∆K revealed an additional level of population structure and allowed the clustering of all investigated genotypes into thirteen additional subgroups. Interestingly, the ancestral population size K = 13 corresponded to the number of varieties used in this study, combining the M and N varieties into a unique admixed group. On average, all individuals of the same variety showed a membership higher than 50% to a specific cluster, which was different for each variety (exceptions were observed for varieties M and N). In addition, membership mean values higher than 80% were reached in the B (82.4%-Cluster 12), C (84.6%-Cluster 4), F (84%-Cluster 7), G (83.9%-Cluster 11), I (87.5%-Cluster 1), and L varieties (80.5%-Cluster 13) by the majority of their individuals. The only cluster shared by two different varieties (M and N) was Cluster 3, with an average membership equal to 50.8% and 68.8%, respectively. Moreover, considering those samples with a membership to the major cluster lower than 80% as admixed, the percentage of admixed samples within the same population ranged from 64% (the M variety) to 13% (the I variety). In particular, the highest proportions of admixed samples were shown within the H (58%), D (57%), N (56%), and O varieties (52%). In contrast, the I, C, and F populations had only 13%, 25%, and 25% admixed samples, respectively.

Discussion
Morphological characteristics have been used for descriptive purposes and are traditionally used to distinguish plant varieties. However, these methods are questionable because of the strong effect of the environment on morphological traits. In addition, this approach is inefficient because of the time and cost involved [40]. Moreover, morphological criteria alone are not sufficient to distinguish some different varieties that are morphologically similar. Therefore, the molecular fingerprinting of a plant variety is extremely important for protecting plant breeders' rights (PBR) [41,42]. Although some issues were related to the application of molecular tools (e.g., the lack of information related to the allelic dosage, the requirement of an appropriate number of markers, and the need for specific bioinformatic knowledge), SSR markers are considered more reliable than RAPD and AFLP markers because of their ability to produce high-fidelity profiles thanks to their codominant nature and specificity [43]. As a consequence, several researchers have reported the application of SSR markers for genetic diversity assessment and variety identification in crops [44][45][46], horticultural plants [47][48][49], and forages [50][51][52].
In this study, 14 commercial varieties of Italian ryegrass were tested with 12 SSR loci, analyzing a total of 672 samples to assess their intrapopulation and interpopulation genetic variation and differentiation and the structure of the population as a whole, and in turn to evaluate them in the PVP context. Detecting genetic variability by means of marker allele composition, allele proportion, and multilocus genotypes in different accessions of L. multiflorum Lam. is a crucial step toward a more trusted variety identification and stronger protection for breeder rights.
As a general rule, the outcrossing breeding system of self-incompatible species, along with the annual life cycle, may have a central role in affecting the conservation of genetic diversity in Italian ryegrass. Indeed, for species with outcrossing mating systems, withinvariety variation is usually high because pollen can be widely spread between varieties, resulting in a low degree of differentiation attributable to among-varieties variation [53]. In this study, 87% of the total variation was attributable to within-cultivar genetic variation, while variance among cultivars was only responsible for 13% of the total variation, which is consistent with results from previous studies in cross-pollinated forage plants. For example, Nie et al. [19] reported a within-variety genetic variation value equal to 82% of the total variation, testing six Chinese L. multiflorum Lam. varieties. Moreover, in perennial ryegrass cultivars, the total within-cultivar component of genetic variation was 85% [54]. In tetraploid white clover, the within-cultivar variation explained 84% of the total variation [55].
The simultaneous occurrence of a marked inbreeding rate (F IS = 0.302, on average), a reduced fixation index (F ST = 0.125, on average), and a high gene flow (N m = 2.18, on average) between overall accessions confirm that most allele diversity and genotype variation are found within populations and that genetic differentiation among varieties is low. Such a low genetic differentiation can be attributable to a high gene flow among populations because of pollen dispersal among too close multiplication fields of different varieties. In addition, along the variety selection and commercialization process other possible causes could be related to variability of heading time and consequent preferential pairing, climate/location effect during seed batch multiplication, or any possible admixture events during storage and bagging processes. Interestingly, the heterozygosity deficiency seen in our results, with a consequent robust deviation from Hardy-Weinberg equilibrium, is not only a common observation in commercial varieties due to the human-induced bottleneck and artificial selection, but also attests to the low variability in these accessions.
The STRUCTURE analysis suggested moderate structuration of populations and the existence of admixed ancestry based on two or thirteen distinct gene pools. It should be noted that the subdivision in two genetically distinguishable subgroups does not reflect either the two different ploidy levels or the two vegetative habitus. However, the resulting information from the STRUCTURE analysis partially confirmed the outcomes from the maximum likelihood dendrogram. In particular, the I, L, and G varieties could have the same hypothesized ancestral genotype according to STRUCTURE analysis and formed a monophyletic group in the ML cladogram with significant UFB support (91). Moreover, samples belonging to the M and N varieties appear to be closely related to each other. In fact, when K = 13 in the STRUCTURE analysis, M and N samples were the only ones to share the same ancestral gene pool (with the majority of the samples considered admixed, with a membership to Cluster 3 ranging from 79% to 1%), and in the ML dendrogram, they were always mixed and clustered together. Additionally, in the pairwise F ST comparisons, the M and N varieties showed the lowest value of genetic differentiation (F ST = 0.019), and again, the intra-variety genetic similarity value of the N population was practically the same as that among the M and N populations (88.54% and 88.59%, respectively). In other words, we can assume that the populations with the highest within variability (M and N), according also to STRUCTURE results, can be considered the most closely related. According to all these results, we can therefore say that the M and N varieties are not distinguishable from each other. In contrast, ML clustering analysis clearly suggested that the A accession has a gene pool clearly distinct from the other accessions, hence clustering apart from the core of varieties. Furthermore, in the pairwise comparisons with other varieties, the genetic similarity values of the L population were the lowest, confirming the resulting values in the pairwise F ST comparisons.
In conclusion, if it is true that these fourteen commercial varieties of Italian ryegrass did not show high degrees of genetic differentiation, it is also true that, when analyzed as a core, each variety described a genetic cluster on its own, resulting in distinguishability from the others. The only evident exception was represented by the M and N varieties. Given their biennial tetraploid nature, a common recent origin from a shared ancestor could be assumed, as reported also by the population structure analysis. Interestingly, from what was reported in the varietal sheets, these two cultivars are morphologically similar, showing a similar growth habit and comparable heading time and inflorescence length, although the M variety should have longer stems and bigger leaves than the N variety.
The registration of a new variety in the Italian National Register of Varieties still does not take genetic information into account, and the features used to distinguish the candidate variety are genealogical, phenotypic, and other agronomic traits [56]. Molecular characterization by DNA markers, in combination with standard morphological descriptors, can improve variety identification, especially in forage crops. Distinguishing one variety from another is not only the foundation of the seed production market but also the necessary precondition for the successful management of seed companies. In forage crops, especially grasses where morphological differences are not always easily noticeable and objective, the implementation of efficient genetic tools for variety identification is needed. Thus, in the process of potential registration of new varieties, after DUS testing by morphological characters, performing high-throughput marker analysis on the candidate varieties would be highly useful, especially when their phenotypes are not significantly different from existing varieties. This work represents an example of how molecular genetics can be a viable tool to protect both breeders and customers by reducing commercial fraud [57,58].
Author Contributions: Conceptualization, E.P. and G.B.; formal analysis, E.P. and F.P.; data curation, E.P.; writing-original draft preparation, E.P.; writing-review and editing, E.P. and F.P.; supervision, G.B.; project administration, G.B.; funding acquisition, G.B. All authors have read and agreed to the published version of the manuscript.
Funding: This study was carried out within the research contract signed with the seed company Mediterranea Sementi S.r.l. The corresponding author declares no conflict of interest with the seed company.
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: The data are contained within the article and supplementary materials are available on request from the corresponding author. The raw data are not publicly available due to privacy reasons.