Genetic Diversity and Population Structure of Japanese Plum-Type (Hybrids of P. salicina) Accessions Assessed by SSR Markers

Abstract: Japanese plum (Prunus salicina Lindl.) is widely distributed in temperate zones across the world. Since its introduction to the USA in the late 19th century, this species has been hybridized with up to 15 different diploid Prunus species. This high level of introgression has resulted in a wide range of traits and agronomic behaviors among currently grown cultivars. In this work, 161 Japanese plum-type accessions were genotyped using a set of eight Simple Sequence Repeat (SSR) markers to assess the current genetic diversity and population structure. A total of 104 alleles were detected, with an average of 13 alleles per locus. The overall Polymorphism Information Content (PIC) value of the SSR markers was 0.75, which indicates that these markers are highly polymorphic. The Unweighted Pair Group Method with Arithmetic averages (UPGMA) dendrogram and the seven groups inferred by Discriminant Analysis of Principal Components (DAPC) revealed a strong correlation of the population structure to the parentage background of the accessions, supported by a moderate but highly significant genetic differentiation. The results reported herein provide useful information for breeders and for the preservation of germplasm resources.

Author Contributions: Conceptualization, M.E.G., A.P. and J.R.; Data curation, B.I.G., S.H. and P.I.; Methodology, B.I.G. and P.I.; Writing - original draft, B.I.G., M.E.G., A.P. and J.R.; Writing - review & editing, B.I.G., M.E.G., S.H., P.I., A.P. and J.R.


Introduction
Japanese plum (Prunus salicina Lindl.) belongs to the Prunus genus in the Rosaceae family [1], which includes around 430 species [2]. This crop originated around 300 B.C. in the Yangtze River basin in China, where wild populations can still be found [3,4]. Japanese plum was introduced to Japan from China more than 2000 years ago [5]. In the late 19th century, it was introduced to California (USA) from Japan, hence the name "Japanese plum" [2,6]. This crop is now widely distributed in temperate zones across the world [4].
In California, Luther Burbank started modern Japanese plum breeding by intercrossing P. salicina with Prunus simonii Carr. and other native American diploid plums in order to improve its adaptation to local conditions [7]. A number of cultivars were released from these hybridizations, such as "Beauty", "Burbank", "Duarte", "Eldorado", "Formosa", "Santa Rosa", and "Wickson", some of which are still available and widely grown [2,5,6]. In these hybrids, P. salicina contributed to the improvement of fruit size, flavor, color, and storability; P. simonii contributed firm flesh and strong flavor; and native American species such as Prunus americana Marsh. or Prunus besseyi Bailey contributed disease resistance, tough skin, and aromatic quality [8]. In the southern United States, some of these cultivars were hybridized with the local Prunus angustifolia

DNA Extraction and SSR Analysis
Young leaf samples were collected in spring and preserved in silica gel [56]. The dried leaves were ground on a TissueLyser (Qiagen, Hilden, Germany) prior to DNA extraction. Genomic DNA was extracted following the protocol described by Hormaza [40] and using a Speedtools Plant DNA Extraction Kit (Biotools, Madrid, Spain) according to the manufacturer's instructions [13,57,58]. DNA quantity and quality were assessed using a NanoDrop 1000 microvolume spectrophotometer (ThermoScientific, Delaware, USA), and the DNA was diluted to 10 ng/µL prior to PCR amplification [13].
A total of 13 SSR markers developed in Japanese plum, peach, and sweet cherry were used (Table 2). The DNA fragments were amplified using six sets of multiplex PCR reactions (M01 to M06). Each multiplex reaction was designed by combining the expected molecular size (bp) of the fragments amplified by each SSR primer pair and four fluorescent dyes (PET, 6-FAM, VIC, NED). Multiplex PCRs M01-M04 were performed in a final volume of 12.5 µL, and M05 and M06 in a final volume of 11.5 µL. A Qiagen Multiplex PCR Kit (Qiagen, Hilden, Germany) was used for all reactions according to the manufacturer's instructions, with different concentrations for each SSR marker (Table 2) and 10 ng of genomic DNA. The temperature profile used in M01 to M04 had an initial step of 15 min at 95 °C, 35 cycles of 45 s at 95 °C, 45 s at 57 °C, and 2 min at 72 °C, and a final step of 30 min at 72 °C [31]. M05 and M06 were performed using the same conditions except for the annealing temperature, which was 46 and 62 °C, respectively [35]. All PCR reactions were carried out using a SimpliAmp Thermal Cycler (Applied Biosystems, Foster City, CA, USA). PCR products were separated by capillary electrophoresis using an ABI3730 genetic analyzer (Applied Biosystems, Foster City, CA, USA). The amplified fragments were sized and scored against the GeneScan 500LIZ size standard (Applied Biosystems, Foster City, CA, USA) [57] using "Fragman" v. 1.0.9 [59], an R package [60] for fragment analysis, and the results were revised with the software PeakScanner v. 1.0 (Applied Biosystems, Foster City, CA, USA). The genetic profiles were organized in a table in CSV format for the subsequent analysis.

Genetic Diversity Analysis and Genetic Relationships among Accessions
The analyses of genetic diversity and genetic relationships were performed using R software v. 3.6.0 (R Development Core Team, 2020). For the genetic diversity and population structure analysis, the allele data generated by the SSR markers were converted to an object of the class genind using the "df2genind" function of the "adegenet" package v. 2.1.2 [62].
An R script was developed to detect synonymies and homonymies in the data. Synonymies were identified by comparing the allele data with the "duplicated" function, so that accessions with identical genetic profiles were flagged as synonymies. All accession names were also compared with the "duplicated" function to detect homonymies.
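The duplicate screening described above can be sketched as follows. The authors worked in R with the "duplicated" function; this Python version is only an illustrative equivalent under stated assumptions. The accession names, locus names, and allele sizes in the example are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict


def profile_key(loci):
    """Build an order-independent key for one multilocus SSR profile.

    Alleles are sorted within each locus, so (150, 162) and (162, 150)
    are treated as the same genotype.
    """
    return tuple(sorted((locus, tuple(sorted(alleles)))
                        for locus, alleles in loci.items()))


def find_synonymies(records):
    """Return groups of accession names sharing an identical profile."""
    groups = defaultdict(list)
    for name, loci in records:
        groups[profile_key(loci)].append(name)
    return [names for names in groups.values() if len(names) > 1]


def find_homonymies(records):
    """Return accession names that appear more than once in the table."""
    counts = Counter(name for name, _ in records)
    return [name for name, n in counts.items() if n > 1]


# Hypothetical example table: (accession name, {locus: (allele1, allele2)}).
records = [
    ("Acc1", {"LocusA": (150, 162), "LocusB": (201, 201)}),
    ("Acc2", {"LocusA": (150, 158), "LocusB": (199, 205)}),
    ("Acc3", {"LocusA": (162, 150), "LocusB": (201, 201)}),
    ("Acc2", {"LocusA": (154, 158), "LocusB": (199, 201)}),
]

print(find_synonymies(records))  # different names, identical profiles
print(find_homonymies(records))  # names that occur more than once
```

In this toy table, "Acc1" and "Acc3" would be reported as a synonymy, and the repeated name "Acc2" as a candidate homonymy to be checked by hand.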
The genetic relationships among accessions were determined using an Unweighted Pair Group Method with Arithmetic averages (UPGMA) cluster analysis according to Nei and Li [67]. The "poppr" package v. 2.8.5 was used to generate a UPGMA dendrogram with bootstrap support based on 1000 replicates [68]. The genetic structure was also analyzed by a Discriminant Analysis of Principal Components (DAPC) using the "adegenet" package v. 2.1. [61]. The optimal number of groups (K) in the whole population was inferred using the "find.clusters" function according to the lowest Bayesian Information Criterion (BIC) value. A cross-validation function, "xvalDapc" [61], was used to determine the number of Principal Components (PCs) to be retained. An Analysis of Molecular Variance (AMOVA) was conducted using the "poppr" package v. 2.8.5 to calculate the variance components among the inferred groups and among the accessions [68].
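As an illustration of the kind of pairwise distance that feeds a UPGMA clustering, the sketch below computes a Nei and Li (Dice-type) distance, 1 - 2n_xy/(n_x + n_y), between two multilocus profiles, where n_x and n_y are the numbers of distinct alleles in each accession and n_xy is the number they share. This is an assumption about how such a distance could be computed from SSR data, not a reproduction of the exact computation performed by the "poppr" package; the profiles are hypothetical.

```python
def allele_set(profile):
    """Flatten a multilocus profile into a set of (locus, allele) pairs.

    Homozygous loci contribute a single element, because the set
    records presence or absence of each allele.
    """
    return {(locus, allele)
            for locus, alleles in profile.items()
            for allele in alleles}


def nei_li_distance(profile_x, profile_y):
    """Nei and Li distance: 1 - 2 * shared / (n_x + n_y)."""
    x, y = allele_set(profile_x), allele_set(profile_y)
    shared = len(x & y)
    return 1.0 - 2.0 * shared / (len(x) + len(y))


# Hypothetical profiles: {locus: (allele1, allele2)}.
acc1 = {"LocusA": (150, 162), "LocusB": (201, 201)}
acc2 = {"LocusA": (150, 158), "LocusB": (199, 205 - 4)}

print(round(nei_li_distance(acc1, acc2), 4))
print(nei_li_distance(acc1, acc1))
```

Here the two accessions carry 3 and 4 distinct alleles and share 2, giving a distance of 1 - 4/7. A full matrix of such distances over all accession pairs is the input a UPGMA algorithm would cluster.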

SSR Genotyping
Eight of the 13 SSR primer pairs (62%) showed good amplification and were selected to evaluate the genetic diversity and population structure. The remaining five (CPPCT-029, UDP96-008, UDP98-406, UDP98-409, and UDP98-412) were excluded from the analysis due to null or poor amplification (Table 2). A total of 104 alleles were amplified using the eight SSR primer pairs across 161 Japanese plum-type accessions (155 commercial cultivars and selections, and six reference cultivars). The number of alleles per locus (NA) ranged from nine (CPPCT033) to 16 (Table 3).

Genetic Relationships among Accessions
The UPGMA dendrogram grouped the accessions into two major clusters supported by a strong bootstrap value (100) (Figure 1), allowing the identification of 159 genotypes and two pairs of synonymies ("Red Beaut" and "606", "Fortune" and "Green Sun"). The clustering of the accessions by their SSR profile was consistent with the available parentage information (Supplementary Materials, Table S1), but only a weak correspondence with the breeding program or geographical origin was found. According to the dendrogram, "Black Satin" and the accession "S030" clustered separately, forming the smallest cluster (A). Cluster B was the largest cluster, comprising 152 accessions distributed in seven subclusters. The subcluster B1 comprised nine accessions, some of them derived from the same pedigree, such as "Methley", "Morris", and "AU Amber", and the remaining accessions shared a common and known South African origin, with the exception of "Speckled Egg". The subcluster B2 comprised a set of Californian cultivars, namely "Eldorado" (a cultivar released by Luther Burbank), "Friar", "Angeleno", "Black Diamond", and "Royal Diamond", and 19 other accessions, including "Alpha" (Prunus maritima). The subcluster B3 comprised 20 accessions, most of which were commercial cultivars and early selections from South Africa, such as "Sunkiss", "Honey Sweet", "Honey Down", and "Honey Star". The subcluster B4 comprised 22 accessions, including some commercial cultivars: "African Rose", "Black Beaut", "Crimson Glo", "Earliqueen", "Golden Kiss", and "Souvenir". The subcluster B5 was formed by two reference genotypes ["Abundance" (P. salicina) and "Simon" (P. simonii)] and 15 other accessions, including "Burmosa" and its descendants, "Red Beaut" and "606". The subcluster B6 comprised eight accessions, including the reference genotypes of P. salicina "Kelsey" and "Formosa", in addition to the traditional cultivars "Golden Japan" and "Songold". The subcluster B7 encompassed 49 accessions and the reference genotype "Mariposa" (P. salicina). Finally, the cluster C comprised seven accessions, including two accessions from the USA ("October Red" and "Sweet Treat"), four accessions from South Africa ("African Pride", "Ruby Star", "S018", and "S026"), and the rootstock cultivar "GF81".

Analysis of Genetic Structure
The genetic structure analyzed by DAPC showed K = 7 as the optimal clustering, according to the lowest BIC value. The optimal number of PCs to be retained for the subsequent analysis was 10 (Supplementary Materials, Figure S1). In this scenario, groups 1 to 3 and 5 to 7 (G1-G3 and G5-G7) overlapped, and group 4 (G4) was clearly differentiated from them across the first two linear discriminant functions (LD1 and LD2) (Figure 2). The allele loadings in the dataset allowed determination of the contribution of alleles to the distribution of accessions in the DAPC scatterplot (Supplementary Materials, Figure S2). The DAPC analysis allowed allocation of most of the accessions to their original group according to a membership probability up to 0.9, indicating clear-cut groups. However, some accessions showed lower membership probabilities, ranging from 0.3 to 0.7, which indicates some admixture in the structured population (Figure 1). Group G1 comprised a set of 32 accessions (19.9%), including the P. salicina reference "Mariposa" and other commercial cultivars such as "Black Splendor", "Queen Rosa", and "Queen Ann". Group G2 comprised 23 accessions (14.3%), mostly cultivars from California ("Hiromi Red", "Earliqueen", "Frontier", and "Green Sun", among others) and South Africa ("African Rose", "Ruby Star", and some advanced selections). Group G3 (n = 18, 11.2%) included the two P. salicina reference genotypes "Kelsey" and "Formosa". Group G4 comprised 21 accessions (13.0%), mostly modern cultivars ("Honey Down", "Honey Star", "Honey Sweet", and "Sunkiss") and some advanced selections from South Africa. Group G5 was formed by 18 accessions (11.2%), including "Abundance" (P. salicina) and the traditional cultivars "John W", "Santa Rosa", and "Simka". Group G6 comprised 22 accessions (13.7%), including the reference genotype "Simon" (P. simonii) and some accessions with P. cerasifera in their pedigree ("Methley", "Morris", and the rootstock "GF81"). Finally, group G7 was formed by 27 accessions (16.8%) and encompassed a high diversity of origins, including the traditional cultivars "Angeleno", "Black Diamond", "Eldorado", "Friar", "TC Sun", and "Zanzi Sun".

Genetic Diversity among Groups
Significant variance differences (p < 0.01) were found within the accessions, and the AMOVA showed that 81.8% of the total variance observed in the K = 7 scenario was due to differences within accessions, 14.2% was due to differences among groups, and the remaining 4.0% was due to differences among accessions within groups (Table 4). The statistics of genetic diversity were calculated and summarized per group (K = 7) (Table 5). The number of alleles (NA) per locus ranged from 6.25 (G4) to 7.63 (G2). The total number of alleles varied from 50 (G4) to 61 (G2). The allelic richness (AR) ranged from 6.01 (G4) to 7.25 (G3). The highest number of private alleles (PA), those present in only one group, was 10 (G2), and only one was observed in G5. Among the groups, the lowest Ho was determined in G1 (0.59), and the highest was observed in G3 and G7 (0.69). The lowest He was recorded in G1 (0.63), and the highest in G3 (0.73). All groups had Inbreeding Coefficient values (FIS) close to zero, ranging from −0.01 (G7) to 0.15 (G6), showing no excess of homo- or heterozygotes.
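The relation among Ho, He, and FIS can be illustrated with a minimal single-locus sketch, assuming the textbook definitions: Ho is the proportion of heterozygous individuals, He = 1 - Σp_i² over allele frequencies p_i, and FIS = 1 - Ho/He. Dedicated packages usually apply sample-size corrections and average over loci, so values from this sketch need not match those in Table 5. The genotypes are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (Ho, He, FIS) for one locus.

    `genotypes` is a list of (allele1, allele2) tuples, one per
    diploid individual. He is the uncorrected gene diversity.
    """
    n = len(genotypes)
    # Observed heterozygosity: share of individuals with two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    # Allele frequencies over the 2n gene copies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    he = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    # FIS is undefined for a monomorphic locus (He = 0).
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return ho, he, fis


# Hypothetical genotypes at one SSR locus (allele sizes in bp).
example = [(150, 162), (150, 150), (162, 170), (150, 170)]
ho, he, fis = locus_stats(example)
print(ho, he, round(fis, 3))
```

Applying the same ratio to the group-level values reported for G1 (Ho = 0.59, He = 0.63) gives 1 - 0.59/0.63 ≈ 0.06, which lies inside the reported FIS range; this is simple arithmetic on rounded values, not a value taken from Table 5.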
To validate the genetic differentiation among the seven groups, the FST values based on Nei's genetic distance among groups were determined (Figure 3). The overall FST value of 0.14 suggested a moderate differentiation among groups, and the pairwise values varied from 0.11 (between G2 and G6) to 0.19 (between G4 and G7). Most of the pairs involving G4 exhibited higher FST values than the other pairs. All of these comparisons had non-zero lower and upper 99% confidence intervals (Supplementary Materials, Table S2).

Discussion
The SSR analysis of the genetic relationships and genetic diversity in the germplasm, comprising 155 accessions and six Japanese plum-type reference genotypes, showed correct amplification for eight of the 13 SSR markers used in this study, which were previously developed in Japanese plum [35], peach [26,30,31,61], and sweet cherry [26]. Although extrapolation of the results generated by this approach is complex due to the differences in the number of accessions and SSR markers used in the different studies [69], this approach has proven to be highly useful for cultivar identification because SSRs are multi-allelic, codominant markers and most of them are transferable within Prunus species [31].
A total of 104 alleles were amplified by the set of SSR markers, emphasizing their high degree of polymorphism. Similar results were found in a previous study analyzing 47 accessions of Japanese plum using eight SSR markers (total NA = 104, average NA per locus = 13) [51]. The PIC values for all loci found in this study were higher than 0.5, and therefore they were considered highly informative [70].
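For readers unfamiliar with PIC, the sketch below assumes the commonly used definition of Botstein and co-workers, PIC = 1 - Σp_i² - Σ_{i<j} 2p_i²p_j², where p_i are the allele frequencies at a locus. The paper does not state which formula or software option was used, so this is an assumption, and the frequencies in the example are hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism Information Content for one locus.

    `freqs` is a list of allele frequencies that sum to 1.
    """
    # Probability that two random gene copies carry the same allele.
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical locus with three alleles.
print(pic([0.5, 0.25, 0.25]))
# Two equally frequent alleles stay below the 0.5 threshold.
print(pic([0.5, 0.5]))
```

A locus with only two equally frequent alleles cannot exceed PIC = 0.375, so values above 0.5 at every locus, as reported here, require several alleles at appreciable frequencies.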
The observed heterozygosity found in this study was higher than that determined in previous reports in apricot (Ho = 0.51, 48 accessions, 31 SSR markers [40]), sweet cherry (Ho = 0.49, 76 accessions, 24 SSR markers [47]), and peach (Ho = 0.47, 50 accessions, 26 SSR markers [33]; Ho = 0.45, 28 accessions, 10 SSR markers [30]). This higher heterozygosity can be explained by the high number of accessions used in this study, and also by the high degree of introgression of the analyzed accessions, which are mostly derived from interspecific crosses between P. salicina and up to 15 other Prunus species [12].
Genotypes were considered to be duplicated (synonymies) when they matched at all alleles of the whole set of SSR markers. Two pairs of duplicates were found based on the SSR profile. "Red Beaut" showed 100% similarity with "606", as expected, because "606" is a selection of cv. "Red Beaut" [15]. The other pair of duplicates was "Fortune" and "Green Sun", although both cultivars are well known and have different phenotypic characteristics [9]. Further research with samples of the same cultivars from other collections or with additional SSR markers would be needed to distinguish them.
According to the DAPC clustering, the 161 accessions were distributed in seven groups, in which the main source of the total genetic variation was attributed to variance within accessions. The percentage of variation among groups was low, resulting in high similarity among these groups. Six groups clustered together and were difficult to differentiate, which may be due to the use of the same cultivars as parents in different breeding programs, leading to gene flow across the groups. A predominance of genetic variation within groups rather than among groups has also been found in apricot accessions [41]. In almond, He values higher than Ho values, consistent with the results reported herein, have been attributed to human selection and intense breeding activity [37]. The highest values of PA were found in G2, indicating that the cultivars in this group may have potential for use in breeding to avoid bottleneck effects, and should be conserved in germplasm banks to maintain diversity [77][78][79]. In this study, moderate genetic diversity (He) was found in all groups and FIS values were close to zero, indicating no excess of homo- or heterozygotes. Similarly, FST values indicated a moderate degree of genetic differentiation [80], supporting the genetic structure obtained herein. The distribution of all accessions across the inferred seven groups corresponded with the genetic relationships observed in the UPGMA dendrogram, revealing a high correlation with the parentage background.
Further research is required to determine the optimal number of SSR markers needed for the analysis of genetic diversity and genetic structure. Although the addition of a new marker should not significantly affect the structure inferred by a sufficiently informative set of SSR [81], the optimal number of markers required to consistently infer the genetic structure in this and other fruit tree species remains unknown.

Conclusions
The SSR markers used herein were highly informative and revealed high genetic diversity within accessions. The entire population was structured in seven groups and confirmed the genetic relationships observed in the UPGMA dendrogram. Although a higher number of accessions were analyzed herein, the genetic diversity was similar to that of previous studies [46,51,53-55]. This may be due to the high number of modern cultivars and advanced selections of breeding programs analyzed, which reveal a bottleneck effect caused by the breeding system practices. The establishment of genetic relationships in Japanese plum-type accessions is highly complex due to their interspecific origin, but can be supported by the knowledge of the parentage lines in commercial cultivars. However, the genealogy of some of the ancestors widely used in most breeding programs is not available [5]. The use of chloroplast markers (cpDNA), and the application of next-generation sequencing technologies (NGS) and high-density SNP-based genotyping [19,76], may lead to additional insight into the degree of diversity among Japanese plum hybrids and the reconstruction of the genealogy of each cultivar.
The conservation of the native Prunus germplasm used in early plum breeding may help to maintain and improve the genetic diversity in Japanese plum-type cultivars. Unfortunately, only a few selections of this material are currently available for breeders [8]. The knowledge of the genetic diversity among Japanese plum-type accessions can enable more informed decisions by breeders for the selection of parents, to maintain biodiversity through germplasm conservation, and to find genotype-phenotype association patterns to be applied by producers and genetic research.

Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/agronomy11091748/s1, Figure S1: Clustering and DAPC cross-validation. (a) Inference of the optimal number of clusters in the 161 Japanese plum-type accessions and (b) DAPC cross-validation for the optimal number of Principal Components (PCs) retained for the analysis in the seven predefined groups. Figure S2: Loading plots for the allele contributions to the (a) Linear Discriminant Function 1 (LD1) and (b) Linear Discriminant Function 2 (LD2) of the DAPC when K = 7. Each plot shows the most informative and contributing alleles to the discriminant analysis. Table S1: Genealogical information of the analyzed accessions for which it is available. Table S2: Lower limit (below the diagonal) and upper limit (above the diagonal) of the 99% confidence interval based on 1000 bootstrap replicates.

Funding: This research was funded by Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (RTA2017-00003-00); Agencia Estatal de Investigación (PID2020-115473RR-I00/AEI 436/10.13039/501100011033); Gobierno de Aragón-European Social Fund, European Union (Grupo Consolidado A12_17R), and Junta de Extremadura-Fondo Europeo de Desarrollo Regional (FEDER), Plan Regional de Investigación (IB16181), Grupo de Investigación (AGA001, GR18196). B.I. Guerrero was supported by a fellowship of Consejo Nacional de Ciencia y Tecnología of México (CONACYT, 471839).