The Genetic Diversity and Structure of Tomato Landraces from the Campania Region (Southern Italy) Uncovers a Distinct Population Identity

Abstract: Italy is one of the main producers and processors of tomato and it is considered a secondary center of diversity. In some areas, such as the Campania region (Southern Italy), a range of traditional tomato landraces is still cultivated. The distinction of this heritage germplasm is often based only on folk taxonomy and a more comprehensive definition and understanding of its genetic identity is needed. In this work, we compared a set of 15 local landraces (representative of traditional fruit types) to 15 widely used contemporary varieties, using 14 fluorescent Simple Sequence Repeat (SSR) markers. Each of the accessions possessed a unique molecular profile and overall landraces had a genetic diversity comparable to that of the contemporary varieties. The genetic diversity, multivariate, and population structure analysis separated all the genotypes according to the pre-defined groups, indicating a very reduced admixture and the presence of a differentiated (regional) population of landraces. Our work provides solid evidence for implementing conservation actions and paves the way for the creation of a premium regional brand that goes beyond the individual landrace names of the Campania region known throughout the world.


Introduction
Italy is the leading European nation for tomatoes, considering the cultivated area, the total yield, and the output of the processing industry, which is second in the world only to that of the USA (https://tinyurl.com/y2n6pg5n; accessed 10 March 2021). Moreover, the tomato is a dominant element of Italian gastronomy and a symbol of a culinary identity that was globally spread by Italian immigrants in the 20th century [1].
The tomato was domesticated in the Americas and was introduced to Italy soon after the Columbian exchange [2]. However, its culinary use in Italy started only in the second half of the 17th century when, despite its inability to contribute significantly to the caloric intake of farmers, the gustatory value of tomato as a (cooked) sauce, as a condiment, and in combination with other foods was recognized [3]. In that period, a regional diversification in the use of tomato also began, influenced by the political, cultural, and linguistic division of Italy at the time. The success of tomatoes in Italy was not uniform, and even today there are areas where relatively few tomatoes are consumed and cultivated. In the 18th century, the tomato was widespread in Italian gastronomy, especially in the Kingdom of Naples and Sicily (current Southern Italy), at that time under the Spanish rule of the House of Bourbon [4]. Tomato was consumed by commoners both fresh and as a slow-cooked sauce, also because of the influence of Spanish gastronomy [5]. It is especially in this century that varieties with different fruit shapes started to have different uses, a phenomenon that led to the selection of specific tomato types (e.g., the San Marzano) later employed for industrial canning [3]. After WWII, Italian production reached its peak (in terms of commercial product and dedicated area) during the 1970s, after the widespread diffusion of contemporary varieties [6]. This expansion was accompanied by a strong decline in the cultivation of landraces, which, until recently, have been predominantly grown in amateur family gardens and on small farms for local markets.
In tomato, as well as in other crops, landraces are being recognized as an important genetic material for their yield stability, cultural value, adaptability to low-input and organic farming, resilience to stress, and fruit quality traits [7][8][9][10]. Tomato landraces are of increasing scientific interest also as a source of adaptive traits in the face of climate change and for sustainable fruit quality [11,12]. Moreover, landraces are of commercial importance following the rediscovery of local food systems and the promotion of short food supply chains [13]. These values led to the introduction of EU quality schemes, which promoted the cultivation of some traditional landraces (i.e., San Marzano and Pomodorino Vesuviano, for the Campania region), also because of the possibility of limiting the unauthorized production or marketing of goods using such a name. Regrettably, besides some globally recognized names, most of this germplasm is neglected and at risk of extinction.
In Italy, the loss of tomato landraces has been significant, especially in the regions where tomato was most widespread [14]. Until the 1950s, tomato breeding was predominantly based on mass and pedigree selection using local germplasm, while hybridization was exploited later [6]. Therefore, in germplasm collections, traditional varietal names can be associated with both landraces and old selected lines [15]. Another problem is the presence of so-called contaminant genotypes, deriving from the fortuitous or voluntary introduction into cultivation of morphologically related varieties, as well as from crosses and/or on-farm seed propagation of hybrid cultivars [16][17][18]. Moreover, the intrinsic lack of uniformity and of unequivocal descriptors of traditional landraces creates the opportunity for spurious, uncontrolled seed marketing, especially by on-line and unprofessional retailers [19][20][21]. Landraces are mainly distinguished by their geographical location and folk taxonomy. Often, tomato landraces or landrace groups are recognized by farmers in terms of fruit type [22]. Nonetheless, a clear genetic distinction to support the authenticity of landraces is often not available, increasing the need for molecular discrimination [15].
Southern Italy is considered a secondary center of diversity for tomato [23,24], and in view of the above-mentioned reasons, it is not surprising that the Campania region, and more generally Southern Italy, is a rich reservoir of crop landraces [25]. During the transition to intensive modern agriculture, the seed industry in Southern Italy was in its infancy, and it is reasonable to presume that most of the local germplasm was not involved in formal breeding programs, for instance, in the selection of (pre-)breeding material. Therefore, the pool of tomato landraces that are still locally cultivated today may be genetically distinct from contemporary varieties. In this work, to test this hypothesis, we analyzed and compared a population of local landraces (e.g., collected from farms of the Campania region) with an equivalent set of contemporary cultivars, chosen among the most widely cultivated. Among the different DNA molecular markers available for tomato [26], we used Simple Sequence Repeats (SSRs) because these highly variable multiallelic DNA markers are considered most suitable to reveal recent demographic events [27,28]. Previous works were dedicated to specific (fruit) types, or to single or national collections of tomato landraces [29][30][31][32][33][34]. Our specific aims were: (i) to evaluate the level and the distribution of the molecular diversity among and within the collected landraces and a set of common contemporary varieties, going beyond the evaluation of a single tomato landrace or landrace group; (ii) to quantify the genetic differentiation and highlight locus-specific differences that may contribute to the maintenance of genetic diversity. Our work provides a solid and experimentally validated basis for the uniqueness of tomato landraces of a specific region, with the prospect of strengthening market opportunities and adding value for cultivation.

Plant Material and DNA Isolation
This work was carried out on a selection of 15 tomato (Solanum lycopersicum L.) landraces of the Campania region, mainly collected in the production areas of the "Pomodoro San Marzano dell'Agro Sarnese-Nocerino DOP" and of the "Pomodorino del Piennolo del Vesuvio DOP", two Protected Designation of Origin (PDO) EU label schemes for the tomatoes of the Campania region. Seeds were multiplied before analysis at the Department of Agricultural Sciences, University of Naples Federico II. This population was compared to an equal number of cultivars, from different seed companies, selected among the most widely grown in the Campania and Puglia regions in the 2009-2014 period according to the cultivated area. As a possible outgroup for cladistic classification, we also analyzed two tomato wild relatives, namely S. habrochaites and S. neorickii. The list, code, and main characteristics of the fruit are reported in Supplementary Table S1. Seeds were germinated on moist filter paper in Petri dishes and then plantlets were transferred to polystyrene seed trays in a growth chamber (24 °C). For DNA isolation, the first two true leaves from two different plants per accession were collected and immediately frozen in liquid nitrogen. Total DNA isolation and quantification were performed as previously described, starting from approximately 150 mg of finely ground frozen tissue [35].

SSR Analysis
To genotype the plant material, we used 14 Simple Sequence Repeat (SSR) loci that represent different repeat classes [36,37]. The loci and their main characteristics are presented in Supplementary Table S2. Polymerase Chain Reaction (PCR) amplifications were performed independently on two plants per accession. Each reaction was assembled using 20 ng genomic DNA template, 1.5 mM MgCl2, 100 µM deoxyribonucleotide-triphosphates, 0.2 µM fluorescently labeled forward primer and unlabeled reverse primer (Supplementary Table S2), and 0.5 U GoTaq DNA polymerase (Promega, Milan, Italy), for a final volume of 25 µL. PCR reactions were performed in a Verity Pro thermal cycler (Thermo Fisher, Milan, Italy). The PCR cycle was as previously described, using the annealing temperature indicated in Supplementary Table S2 [38]. For allelic discrimination, amplicons were analyzed by fluorescent capillary electrophoresis (dye-labels are reported in Supplementary Table S2), with the POP6 polymer (Thermo Fisher) in an ABI Prism 3100 Genetic Analyzer (Thermo Fisher). Signal peak height and allele sizes were calculated using the ABI Prism GeneMapper software v. 4.1 (Thermo Fisher, Milan, Italy) calibrated on the GeneScan 500Liz dye size standard (Thermo Fisher). Values were rounded to integers and, if necessary, scaled based on the SSR core size (Supplementary Table S2), minimizing for each locus the average offset of the scaled alleles within the instrumental resolution of the DNA separation (±1 bp) (Supplementary Figure S1).

Data Analysis
For each SSR locus, we calculated the number of different alleles (for the whole collection and for S. lycopersicum), the observed heterozygosity (Ho; number of heterozygotes/N), the number of effective alleles ($N_e = 1/\sum_i p_i^2$), the Polymorphic Information Content ($\mathrm{PIC} = 1 - \sum_i p_i^2$, equivalent to the expected heterozygosity), Shannon's Information Index ($I = -\sum_i p_i \ln p_i$), and Wright's fixation index, where N is the number of individuals, $p_i$ is the frequency of the i-th allele of a locus, and $\sum_i p_i^2$ is the sum of the squared allele frequencies. These calculations were performed with Genalex 6.5 [39]. Evenness of alleles was obtained with poppr [40]. This R library was also employed to calculate pairwise genetic distances between genotypes according to the Prevosti coefficient [41]. Hierarchical clustering based on the unweighted pair-group method with arithmetic averages (UPGMA) algorithm, cophenetic correlation, and tree visualization were performed in R using the factoextra package [42]. The Agglomerative Cluster (AC) value was computed as reported [43]. The genotype accumulation curve was built by boxplotting the number of multilocus genotypes obtained for an increasing number of SSRs (from 1 to plateau, 10 for cultivars and 12 for landraces) for each predefined population. The data distribution for each number of SSRs was obtained by randomly sampling loci (n = 10,000) [40]. As a model-free ordination technique, we used Principal Coordinate Analysis (PCoA), carried out via the covariance matrix with distance data standardization. The Analysis of Molecular Variance (AMOVA) was performed using the pairwise distance matrix with 9999 permutations to test for significance. These calculations were carried out using the Genalex 6.5 software [39]. To validate the population structure, the Fst and Gst between S. lycopersicum populations were calculated per locus and globally and statistically evaluated against a null distribution obtained from 10,000 permutations on alleles, using the MSA 4.05 software [44].
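The per-locus indices defined above can be illustrated with a minimal Python sketch. It is not the Genalex implementation: the function names are hypothetical, genotypes are assumed to be diploid and coded as pairs of allele sizes, and the fixation index is assumed here to be F = 1 − Ho/He, a form the text does not spell out.

```python
import math
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} from a list of diploid genotypes (a, b)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def locus_summary(genotypes):
    """Compute the per-locus diversity indices described in the text."""
    freqs = list(allele_frequencies(genotypes).values())
    sum_p2 = sum(p * p for p in freqs)          # sum of squared frequencies
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    he = 1.0 - sum_p2                           # expected heterozygosity (PIC in the text)
    return {
        "Na": len(freqs),                       # number of different alleles
        "Ho": ho,                               # observed heterozygosity
        "Ne": 1.0 / sum_p2,                     # number of effective alleles
        "PIC": he,
        "I": -sum(p * math.log(p) for p in freqs),  # Shannon's index
        # Assumption: F = 1 - Ho/He (undefined for a monomorphic locus).
        "F": 1.0 - ho / he if he > 0 else float("nan"),
    }

# Hypothetical genotypes at one locus (allele sizes in bp).
example = [(150, 150), (150, 154), (154, 154), (158, 158)]
print(locus_summary(example))
```

With a single heterozygote among four plants, Ho is 0.25 while the expected heterozygosity is about 0.66, so the assumed F is strongly positive, as expected for a mostly selfing crop.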

Genetic Diversity of the Germplasm under Investigation
The analysis of the genetic diversity of the entire set of genotypes (n = 32) was carried out with 14 SSRs. Differences within each accession were not detected. All loci were polymorphic; however, the locus LEcaa001 was polymorphic only between the wild species and S. lycopersicum and was therefore excluded from comparisons between landraces and contemporary cultivars. The main genetic parameters of the tomato collection under investigation are presented in Table 1. We detected 75 different alleles, for an average of 5.35 alleles per locus. Large differences were present in the number of alleles per locus, which ranged from 2 (LELE25 and LEcaa001) to 11 (LEEF1Aa). The latter was the most diverse locus also considering the number of effective alleles, a measure of diversity weighted for allele frequencies. There was no significant difference (p > 0.05; Student's t-test) in allelic richness between the S. lycopersicum subgroups, with an average of 3.0 alleles per locus in the landraces and 2.8 in the contemporary varieties. The observed heterozygosity (Ho) was low (on average, 0.17) but, as expected, strongly affected by the tomato group, because heterozygous genotypes were present at one locus for the open-pollinated landraces (Legaa003) and at 12 SSRs for the contemporary (mostly hybrid) varieties. Nonetheless, two polymorphic loci (LELE25 and LEct001) were fixed in the whole germplasm collection and lacked heterozygotes. A significant correlation between the number of alleles per locus and its observed heterozygosity was present (p = 0.005; Spearman's Rho). Considerable differences among SSRs were also present in the PIC, which ranged from 0.03 (LEcaa001) to 0.81 (LEEF1Aa), the locus with the highest number of alleles. However, considering the evenness of the allele distribution for each SSR, the most diverse loci were LE20592 and LEtat002.
To evaluate the relationship between the genotypes, pairwise genetic distances were used to build a dendrogram (Figure 1). In this analysis, the two wild species served as a possible outgroup. Genetic distances were summarized in a heatmap to simultaneously visualize how the genetic relatedness varies between genotypes and clusters of genotypes (Supplementary Figure S2). The cophenetic correlation was high (Pearson's product-moment correlation: 0.80) and significant (p < 0.001). The Agglomerative Cluster value was 0.65. The dendrogram illustrated that each genotype had a distinct SSR profile and that duplicate accessions were not present in our landrace collection. Moreover, the dendrogram indicated that the genotypes could be clearly separated into biologically meaningful groups at k = 4. The most taxonomically distant wild species had a distinctive position at the highest hierarchical node (k = 2). Within the S. lycopersicum germplasm, the landraces and the contemporary varieties agglomerated in two separate clusters, suggesting the presence of two genetically distinct populations (Figure 1).
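To make the distance underlying the dendrogram concrete, the following Python sketch computes a Prevosti-type distance between two multilocus genotypes using the textbook definition (the sum of absolute allele-frequency differences, divided by twice the number of loci). It is an illustration under stated assumptions, not the poppr code used in this work; the function name and the allele sizes are hypothetical.

```python
from collections import Counter

def prevosti_distance(ind_a, ind_b):
    """Prevosti-type distance between two individuals.

    ind_a, ind_b: lists of per-locus genotypes, e.g. [(150, 154), (201, 201)].
    Within an individual, an allele has frequency 0, 0.5 or 1 (diploid).
    """
    if len(ind_a) != len(ind_b):
        raise ValueError("individuals must be typed at the same loci")
    total = 0.0
    for geno_a, geno_b in zip(ind_a, ind_b):
        count_a, count_b = Counter(geno_a), Counter(geno_b)
        alleles = set(count_a) | set(count_b)
        # Sum of absolute differences in within-individual allele frequency.
        total += sum(
            abs(count_a[x] / len(geno_a) - count_b[x] / len(geno_b))
            for x in alleles
        )
    return total / (2 * len(ind_a))

# Hypothetical two-locus profiles.
a = [(150, 150), (201, 205)]
b = [(150, 154), (201, 205)]
c = [(158, 158), (209, 209)]
print(prevosti_distance(a, b))  # one allele out of four differs
print(prevosti_distance(a, c))  # no allele shared
```

The distance ranges from 0 (identical profiles) to 1 (no shared allele at any locus), which matches the 0 to 1 color scale of the heatmap in Supplementary Figure S2.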
Within the landraces, some small groups of genotypes with a similar fruit shape (Supplementary Table S1) could also be identified, such as the MUR-MAC-MOR subcluster (with "San Marzano"-type fruit and collected in the related PDO area), the ANO-AGN subcluster ("pomodorino" type, both collected in the valley of the Sarno river), and the TLI-TLS subcluster ("tondo" type, but originating from different cultivation areas).
Finally, before testing for a possible genetic structure, we verified whether the number of employed SSRs was suitable to capture the diversity in our germplasm. To this aim, we built genotype accumulation curves, considering the landrace or the contemporary varieties group as defined by our a priori classification. The result indicated that, also taking into account the possible specific polymorphisms within each group, the employed number of SSR loci is above the minimum number needed to fully capture the diversity in our sub-groups (Figure 2). For every number of randomly sampled SSRs, there was no statistically significant difference in the average number of identified multilocus genotypes between landraces and cultivars (p > 0.05, Student's t-test).
Figure 2. Genotype accumulation analysis in the two pre-defined S. lycopersicum populations. In each panel, the boxplots summarize the descriptive statistics relative to the quartiles of the number of multilocus genotypes obtained by randomly sampling loci without replacement (n = 10,000). The number of sampled SSR loci (from 1 to 12) is indicated in the top dark gray bar. Dots represent outliers (i.e., values outside 1.5 times the interquartile range above the upper and below the lower quartile). Contemporary varieties (respectively, landraces) boxplots are in deep salmon (resp., cyan) color.
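The logic of the genotype accumulation analysis can be sketched in a few lines of Python: for each number k of loci, k loci are drawn at random without replacement and the distinct multilocus genotypes are counted. This is an illustrative sketch and not the poppr routine used in this work; the function names, the toy data, and the default number of resamples are assumptions.

```python
import random

def count_mlg(genotypes, loci):
    """Number of distinct multilocus genotypes when only `loci` are scored."""
    return len({tuple(ind[i] for i in loci) for ind in genotypes})

def accumulation_curve(genotypes, n_samples=1000, seed=0):
    """Map each number of loci k to a list of MLG counts from random subsets.

    genotypes: one tuple per individual, holding one genotype per locus.
    Each genotype should be stored with sorted alleles so that (150, 154)
    and (154, 150) are treated as the same genotype.
    """
    rng = random.Random(seed)
    n_loci = len(genotypes[0])
    curve = {}
    for k in range(1, n_loci + 1):
        curve[k] = [
            count_mlg(genotypes, rng.sample(range(n_loci), k))  # without replacement
            for _ in range(n_samples)
        ]
    return curve

# Hypothetical data: 4 individuals typed at 3 loci.
toy = [
    ((1, 1), (2, 2), (3, 3)),
    ((1, 1), (2, 2), (4, 4)),
    ((1, 1), (5, 5), (3, 3)),
    ((6, 6), (2, 2), (3, 3)),
]
curve = accumulation_curve(toy, n_samples=200)
for k, counts in curve.items():
    print(k, min(counts), max(counts))
```

The curve reaches a plateau once adding loci no longer separates further genotypes; the boxplots in Figure 2 summarize the distribution of these counts for each k.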

Genetic Structure Analysis of the S. lycopersicum Genotypes (Landraces and Contemporary Varieties)
To infer a population subdivision between contemporary varieties and landraces, we performed a Principal Coordinate Analysis. To achieve this goal, we excluded the wild species because of their distant relatedness and limited number of samples. The first two components explained 39.7% of the total variance, which implies that the different SSRs were useful in sampling a mostly uncorrelated, locus-specific genetic variation. The scatter plot of PC1 and PC2 values shows that the samples belonging to the two pre-defined populations are well separated along both PC1 and PC2 (Figure 3). Moreover, the samples of each population were similarly spread on the two-dimensional plane, an indication of a comparable level of diversity.
Consequently, we performed an Analysis of Molecular Variance to quantify the level of genetic differentiation between the predefined groups (Table 2). A main contribution to the genetic variance in our germplasm originated from the difference between samples in each subpopulation and not predominantly from all samples, as in the case of a panmictic population. Moreover, the differentiation between landraces and contemporary cultivars was high, with 22% of the molecular variation present between the two sub-populations (Table 2). For these reasons, the AMOVA provided evidence to support a population structure. Finally, to validate the AMOVA results, we calculated two widely used measures of population differentiation, Fst and Gst (according to Nei), per locus and globally [45], using a distribution obtained by allele permutations to assess the statistical significance (Table 3). The data indicated that the amount of genetic differentiation between populations is significant. Interestingly, there were clear differences among loci, with five loci showing statistically significant, high Fst values. In the presence of a genetic structure, this suggests the possible presence of locus-specific adaptive molecular variation.
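As an illustration of how a per-locus differentiation measure and its permutation test work, the sketch below computes Nei's Gst for two populations, Gst = (Ht − Hs)/Ht, and a one-sided p-value obtained by shuffling allele copies between populations. It is a simplified sketch, not the MSA 4.05 procedure: it ignores sample-size corrections, the function names are hypothetical, and the allele data are invented.

```python
import random
from collections import Counter

def expected_het(alleles):
    """Expected heterozygosity, 1 - sum(p_i^2), from a list of allele copies."""
    n = len(alleles)
    return 1.0 - sum((c / n) ** 2 for c in Counter(alleles).values())

def nei_gst(pop_a, pop_b):
    """Nei's Gst at one locus for two populations (uncorrected estimator)."""
    hs = (expected_het(pop_a) + expected_het(pop_b)) / 2   # mean within-population
    count_a, count_b = Counter(pop_a), Counter(pop_b)
    alleles = set(count_a) | set(count_b)
    # Total heterozygosity from the mean allele frequencies of the two populations.
    ht = 1.0 - sum(
        ((count_a[x] / len(pop_a) + count_b[x] / len(pop_b)) / 2) ** 2
        for x in alleles
    )
    return (ht - hs) / ht if ht > 0 else 0.0

def permutation_p(pop_a, pop_b, n_perm=10000, seed=0):
    """Observed Gst and one-sided p-value from permuting alleles between populations."""
    rng = random.Random(seed)
    observed = nei_gst(pop_a, pop_b)
    pooled = list(pop_a) + list(pop_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if nei_gst(pooled[:len(pop_a)], pooled[len(pop_a):]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical allele copies at one locus in two populations.
landraces = [150] * 8 + [154] * 2
cultivars = [150] * 2 + [154] * 6 + [158] * 2
print(permutation_p(landraces, cultivars, n_perm=2000))
```

A locus with very different allele frequencies in the two groups yields a high Gst that is rarely matched after shuffling, which is the reasoning behind the locus-by-locus significance values reported in Table 3.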

Discussion
The tomato is characterized by limited genetic diversity, due to its evolutionary history [23,46]. The breeding sector is dominated by hybrids that have progressively replaced local accessions. Previous molecular characterizations focused on specific landraces, fruit classes or national collections of tomato landraces [29][30][31][32][33][34]. In this work, we tested the hypothesis that a set of regional tomato landraces of Southern Italy represents a genetically differentiated group compared to contemporary varieties because of their origin and possible non-engagement of the breeding sector. The cladistic classification of the genotypes indicated that every accession had a unique profile. The polymorphic SSRs proved suitable for distinguishing closely related tomato landraces [18]. The most informative SSR was LEEF1Aa, probably because of a composite core motif. This locus was also previously described as highly informative in tomato [47]. As expected, wild relatives also had SSR alleles different from those of S. lycopersicum [33]. The total number of alleles was similar to or higher than in other published works [36,37,48,49], indicating a relevant allelic diversity for both landraces and contemporary varieties. It should be added that for landraces, the molecular variation is expected to reflect both human-driven selection and adaptation to local conditions [50]. Moreover, the selected varieties were chosen to represent different tomato market classes, which are associated with different patterns of genetic variation [51]. In another paper based on landraces and contemporary tomato varieties [52], the average number of SSR alleles was also judged higher compared to the tomato literature, even if the loci employed are not the same as in our work. While the presence of a distinctive profile for cultivars is not surprising, synonymous cases and intra-varietal variability have been reported for landraces of tomato and other species [53][54][55].
The explorative hierarchical clustering also indicated biologically coherent and highly balanced clusters of cultivars and landraces at k = 4, although the optimal number of clusters was not statistically evaluated. However, the cophenetic correlation was high (≥ 0.80), a reliable signal of a population structure [56]. The relatively low AC value of the UPGMA tree can be justified considering that, even for highly structured populations, the "chaining" effect, unique genotypes, and small groups of accessions (together with two or more large groups) can strongly lower this index (to less than 0.6) [57]. Finally, also considering the outgroup, the dendrogram analysis implied a similar level of diversity between the two pre-defined populations in terms of pairwise distances, which should be further explored.
To scan for a genetic structure, we performed a PCoA mainly because it does not require assumptions that may not hold true for a selected germplasm collection (e.g., Hardy-Weinberg or linkage equilibrium) [51], and because of the intrinsic issues of model-based testing for k = 2 [58]. The analysis indicated that the S. lycopersicum genotypes are well separated according to their pre-defined population [51]. A complete distinction between landraces and contemporary varieties and a confirmation of the UPGMA clustering have not always been found [52]. It should be noted that the latter work analyzed a nation-wide group of landraces from geographically diverse collections. Landraces of different secondary centers of diversity (i.e., Spain and Italy) were genetically differentiated [59], suggesting that the complete separation observed in our study could also be due to the more restricted geographic area of origin of our germplasm. The level of population differentiation was high [60], and in line with previous analyses of tomato germplasm that underwent conventional breeding [51]. An important implication is that, despite the wide diffusion of contemporary varieties, the admixture between our local landraces and cultivars is very low, considering that we analyzed cultivars that were among the most cultivated in the collection sites. Farmers tend to save seeds from one year to another by choosing the most representative fruits, therefore favoring a maintenance breeding for the fruit morphology (e.g., size and shape), phenological characteristics (e.g., time of flowering and maturity), and quality traits (e.g., color and flavor) they attribute to their accession (Zeven, 2000). We also detected some significant differences among loci in highlighting a genetic differentiation. High Fst values (compared to neutral loci) are typically suggestive of genomic regions that are under divergent selection. Nonetheless, at each locus, Fst values are also influenced by heterozygosity and the mutation rate, two features that are more variable among multiallelic SSRs than in bi-allelic markers. For these reasons, the data motivate further genomic scans to pinpoint the effects of the adaptive and/or breeding processes that led to a differentiation between tomato landraces and contemporary varieties [61].

Conclusions
Our study provides a first comparative assessment of the diversity and population structure in a geographically specific collection of tomato landraces from a secondary center of diversity. The data indicated that the landraces constitute a genetically distinct population from common commercial varieties, in addition to having a historical origin and a locally recognized gastronomic identity. Therefore, our work provides a robust justification for implementing measures for in situ and ex situ conservation actions, as well as for creating premium regional brands that can even go beyond the worldwide known, individual landrace names of the Campania region (e.g., San Marzano, piennolo/Pomodorino Vesuviano). Moreover, multivariate analyses indicated that the landraces are characterized by a good, and to a relevant extent, specific diversity. Finally, our work encourages further multidisciplinary studies to unravel the genetic factors responsible for the adaptation to specific environments and potential quality traits of our landraces.
Supplementary Materials: The following are available online at https://www.mdpi.com/2073-4395/11/3/564/s1, Figure S1: an example of the electropherograms at the locus LE20592 showing the allele peaks. For each genotype (indicated on the left; see Table S1 for the code), the height of the peak refers to the scale on the right-hand side (RFU). The top bar indicates the size reference range provided by the genotyping software, used to calibrate the peak data points to their DNA size according to the internal size standard. The rounded dimension of each fragment (bp) is indicated in the rectangle below. Figure S2: heatmap of the pairwise genetic distance (Prevosti) between genotypes (codes are reported in Table S1). The bar on the right side shows the color scale adopted to represent distances, from deep blue (0) to white (1). Table S1: List of the genotypes under investigation and main features of their fruit shape. Table S2: SSR loci employed in this study and their main features.
Author Contributions: Conceptualization, G.C.; formal analysis, Y.R. and G.C.; investigation, M.C. and G.C.; writing-original draft preparation, G.C.; writing-review and editing, Y.R. and G.C. All authors have read and agreed to the published version of the manuscript.