Grapevine Diversity and Genetic Relationships in Northeast Portugal Old Vineyards

More than 100 grapevine varieties are registered as suitable for wine production in the “Douro” and “Trás-os-Montes” Protected Designations of Origin regions; however, only a few are actually used for winemaking. The identification of varieties cultivated in past times can be an important step towards taking full advantage of the grape biodiversity of these regions. The conservation of vanishing genetic resources supports greater product diversification, and it can be considered strategic in the valorisation of these wine regions. Hence, one goal of the present study was to prospect and characterise, through molecular markers, 310 plants from 11 old vineyards that constitute a broad representation of the grape genetic patrimony of the “Douro” and “Trás-os-Montes” wine regions. In total, 280 samples, grouped into 52 distinct known varieties, were identified through comparison of their genetic profiles generated via amplification of 6 nuclear SSR and 43 informative SNP loci; the remaining 30 samples, accounting for 13 different genotypes, did not match any profile in the consulted databases and were considered new genotypes. This study also aimed at evaluating the population structure among the 65 non-redundant genotypes identified, which were grouped into two ancestral genetic groups. Mean probability of identity values of 0.072 and 0.510 were determined for the 6 SSR and 226 SNP sets, respectively. Minor differences were observed between the frequencies of chlorotypes A and D within the non-redundant genotypes studied. Twenty-seven pedigrees were confirmed and nine new trios were established. Ancestors of eight genotypes remain unknown.


Introduction
The wine denominations "Douro", the oldest demarcated and regulated winemaking region in the world, and "Trás-os-Montes" represent approximately 29% of the Portuguese vineyard area for wine production (22 and 7%, respectively) [1]. These Protected Designations of Origin (PDO), situated in Northeast Portugal, have an ancient and diverse viticulture history and are characterised by mountains with steep slopes and valleys favourable to the existence of distinct microclimates, which led to the need for grapevine adaptation to different conditions [2]. In this sense, traditional viticultural practices and local climates were crucial to the high genetic diversity observed in these wine regions. On the other hand, the referred factors also led to the appearance of a [...]
As a result of changes in worldwide wine consumption and European Union incentives [40], radical transformations in European viticulture have occurred in the last decades, namely vineyard restructuring and conversion to commercially available clones from a reduced number of grape varieties. Consequently, crop vulnerability to several abiotic and biotic stresses has increased, producing a massive negative impact on the rich heritage in grape varieties, which is crucial to an environmentally sustainable viticulture. A possible response to these projected future stresses in vineyards, in a world with highly demanding wine consumers, is to preserve a wider range of grape varieties. This action demands continuous prospection of grape varieties in traditional winemaking locations, for their conservation and putative exploitation, since these locations hold old genetic resources on the edge of extinction. In this sense, the main goals of this study were: (i) the molecular identification, in old traditional vineyards, of a broad representation of the grapevine patrimony of the "Douro" and "Trás-os-Montes" regions, contributing to deepen the knowledge of the Northeast Portugal grapevine gene pool; varietal discrimination was carried out using the set of six microsatellite markers recommended by the OIV [41] and the 48-SNP set developed by Cabezas et al. [29], comparing the SSR and SNP profiles obtained with those of the VIVC database and the ICVV-SNP database, respectively; (ii) the evaluation of genetic diversity and relationships among the grape genotypes detected, through SSR and SNP (240 SNP) markers; (iii) the determination of chlorotypes and their frequencies through amplification of cpSSR and SNP loci; and (iv) the determination of first-order kinship relationships among grape varieties using the 240-SNP set.

Results and Discussion
A total of 310 grapevines were sampled in traditional vineyards throughout the "Douro" and "Trás-os-Montes" PDO regions (Figure 1). These vines (older than 47 years) were located in old vineyards of wine-growing companies or in small vine parcels belonging to local wine producers for self-consumption, such as the Vassal, Aguieiras, and Sendim sampling locations. Samples were characterised using molecular markers, namely Simple Sequence Repeats (SSR) and Single Nucleotide Polymorphisms (SNP), starting with varietal identification and followed by population structure, genetic diversity, and pedigree analyses with the non-redundant genotypes.
All samples were initially analysed with six nuclear microsatellites (nSSR; approved as descriptors by the OIV [41]) and three chloroplast microsatellites (cpSSR) [19]. Microsatellite markers are well suited to varietal identification (fast results, ease of assay, cost-effectiveness, and available databases). However, SSR genotyping is subject to technical variations that require calibration between laboratories [12].
An SNP array was first proposed as an alternative to SSR for varietal identification by Cabezas et al. [29]. These authors reported that this 48-SNP set has a discrimination power similar to that of 14-16 microsatellite markers. Several research groups are currently using the 48-SNP set for that purpose [6,7,29,42]. Thus, the non-redundant grape varieties identified through SSR genotyping, together with all non-identified samples, were also profiled with the 48-SNP array selected by Cabezas et al. [29].
Non-redundant genotypes for the 48-SNP set were then genotyped with 192 additional SNP loci for subsequent analyses of population structure, genetic diversity, and parentage relationships. However, only 226 out of the 240 SNP markers were informative; 14 SNPs were discarded, because data produced by 11 SNPs were missing in at least 61% of the samples, and three SNPs were monomorphic in the genotypes analysed.
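The marker filtering described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory layout (one list of two-letter calls per locus, with `None` for a failed call) and uses 61% as the missing-data cut-off only because that is the figure quoted for the discarded loci; the function name and data layout are illustrative and are not taken from the study.

```python
def filter_snps(genotypes, max_missing=0.61):
    """Split SNP loci into kept and dropped sets.

    genotypes: dict mapping locus name -> list of calls, one per sample.
               A call is a two-letter string such as "AG", or None if missing.
    max_missing: loci whose missing-call fraction reaches this value are
                 dropped (assumed cut-off, mirroring the 61% quoted above).
    """
    kept, dropped = [], {}
    for locus, calls in genotypes.items():
        n_samples = len(calls)
        n_missing = sum(call is None for call in calls)
        if n_samples == 0 or n_missing / n_samples >= max_missing:
            dropped[locus] = "missing data"
            continue
        # Collect every allele observed among the successful calls.
        alleles = {allele for call in calls if call is not None for allele in call}
        if len(alleles) < 2:
            dropped[locus] = "monomorphic"
            continue
        kept.append(locus)
    return kept, dropped


# Hypothetical toy data: three loci scored in four samples.
demo = {
    "snp_ok": ["AG", "AA", "GG", "AG"],
    "snp_sparse": [None, None, None, "AG"],
    "snp_mono": ["CC", "CC", None, "CC"],
}
kept, dropped = filter_snps(demo)
```

In this toy example only `snp_ok` survives: `snp_sparse` is missing in three of four samples and `snp_mono` shows a single allele.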

Genetic Identification Based on nSSR and SNP Markers
Two hundred and eighty samples were identified through comparison of their genetic profiles generated via amplification of six nuclear SSR and 43 informative SNP loci. The SSR and SNP profiles were compared to those stored in the VIVC and ICVV-SNP databases, respectively. The VIVC database includes 5424 genetic profiles [39], and the ICVV-SNP database includes more than 2800 non-redundant genotypes of diverse genetic and geographic origins, genotyped for 48 SNPs. Fifty-two distinct grapevine varieties were detected, of which 37 are described as autochthonous to Portugal (Table 1; Figure 2; Supplementary Tables S1-S4) [3,4,6,17,20]. The SSR and SNP profiles also allowed the identification of 15 [...]
Table 1. Fifty-two genotypes were identified belonging to either the most cultivated varieties in Portugal (which means a representation superior to 1% of total area) or to the minority varieties group. Thirteen new genotypes were also detected.
The remaining 30 samples, accounting for 13 different SSR and SNP genetic profiles, did not match any profile stored in the VIVC and ICVV-SNP databases, respectively, and are being further studied (Table 1; Figure 2; Supplementary Tables S3 and S4). Their discovery reveals once again the richness of the Portuguese gene pool, as already highlighted by Cunha et al. [6]. Most likely, these unique genotypes correspond to minor autochthonous varieties from Portugal, given the lack of profile matching with international databases and the fact that each genotype (even those with more than one sample identified) was found in a single sampling location (Table 1). Maraš et al. [38] used the term "proto-varieties" to designate plants that have only been cultivated by local grape growers, i.e., plants that grew directly from seeds, or that were multiplied through cuttings just once, from the place where the seed germinated to the orchard. Eventually, they could be multiplied and distributed, becoming varieties. This is how most varieties originated in the past, but these traditional techniques are no longer used in Western European regions.
The selection of plants was based on the difficulty of their morphological identification by ampelographers: only 31 out of 310 samples had previously been named, and mainly with local names (Supplementary Table S2). Amplification of microsatellite and SNP loci allowed the confirmation of the varietal identity of six samples ('Roseira', 'Trousseau Noir' syn. 'Bastardo', 'Casculho', 'Tinta Francisca', 'Grand Noir', and 'Rufete'). On the other hand, samples with different names that fully matched at the genotyped nSSR and SNP loci were also found, and they were considered synonyms or misnomers. The 'Verdelheira' sample was identified as the Portuguese variety 'Gouveio'; this variety is known as 'Verdelho' in the Dão wine region [6] and has 'Godello' (or 'Verdello') as a synonym in Galicia [45]. Other synonyms were established, namely for the 'Tinta Amarela Antiga', 'Sousão', and 'Rabigato Francês' samples, whose genetic profiles match those of the 'Trincadeira', 'Vinhão', and 'Grec Rouge' varieties, respectively (Supplementary Table S2).
The genetic profiles of the 'Mourisco', 'Mourisco de Semente', 'Lázaro', and 'Rosada' samples, generated through nSSR and SNP analyses, did not match any sample profile included in the VIVC and ICVV-SNP databases. This led to the conclusion that the first two samples were incorrectly named, but further analysis of the other two [...]
There are 343 authorised varieties for wine production in Portugal [48]. All varieties detected in this work are included in the list of varieties authorised for winemaking, except 'Afus Ali', 'Black Monukka', 'Dodrelyabi', 'Jeronimo', and 'Perlette'. More specifically, 42 of the 47 identified grape varieties authorised for winemaking in Portugal are also approved for red and white wine production under the PDO "Douro" and "Trás-os-Montes" (Table 1).
Another fact to take into account is that seven red varieties ('Cornifesto', 'Marufo', 'Rufete', 'Tinta Barroca', 'Tinta Francisca', 'Tinto Cão', and 'Vinhão') and five white varieties ('Diagalves', 'Gouveio', 'Malvasia Rei', 'Samarrinho', and 'Síria') listed as suitable for the production of PDO wines are described as late-maturing [51,52]. Hence, high-quality PDO wines from late-maturing grape varieties will likely need to be considered under a future warmer climate, to cope with the extreme hot temperatures and precipitation deficits registered worldwide, and especially in Portugal.

Nuclear SSR and SNP Diversity
SNP and SSR profiles were compared to assess the genetic diversity of the 65 non-redundant genotypes, and the results are summarised in Table 2 (and Supplementary Table S5). A total of 57 alleles were obtained at the OIV set of six microsatellite loci, ranging from 7 (VVMD27) to 12 (VVS2) alleles per locus, with an average of 9.5. The level of polymorphism found was comparable with that reported in other V. vinifera germplasm diversity studies based on SSR markers. An analysis of 51 varieties using the six OIV loci described a range of 7 (VVMD27) to 11 (VVS2) alleles, with an average of 8.2 [17]. The nuclear microsatellite diversity study of 57 grape varieties reported by Cunha et al. [53] showed a total of 53 alleles scored across the six loci (13 alleles for the VVS2 locus and 8 for each of the others), with an average of 8.8 alleles per locus. Moreover, an analysis of 39 Portuguese varieties described by Castro et al. [4], also using the OIV set of six loci, showed that the allele number ranged from 6 (VVMD27) to 10 (VVMD5 and VVS2), with a mean value of 8.3 alleles per locus.
Allele frequencies and genetic parameters were also determined for the 226 SNP set (Table 2). The minor allele frequency (MAF) was analysed, since it is a measure of the discriminating ability of the markers: for biallelic markers, the closer the MAF is to 0.5, the better [29]. In the present study, the average MAF among the 226 SNPs was 0.244, with seven SNPs showing a MAF between 0.4 and 0.5 but also 33 SNPs with a MAF below 0.1. The minimum MAF registered was 0.023, at the SNP1045_291, SNP1419_186, SNP217_190, and Vvi_3947 loci, whereas the maximum MAF observed was 0.492, at the SNP1335_204, SNP575_128, SNP663_578, and SNP855_103 loci. The average MAF was slightly lower than that of the V. vinifera ssp. sativa germplasm studied by Emanuelli et al. [7] (MAF = 0.258). For nSSR markers, the number of effective alleles (Ne) ranged from 3.928 (VVMD7) to 6.995 (VVMD5), with an average of 4.962, similar to the mean Ne value (4.658) obtained by Castro et al. [4]. For SNP markers, Ne ranged from 1.016 (SNP625_278) to 1.999 (SNP1327_56), with an average of 1.604, which is consistent with other reports on the genetic diversity of cultivated grape varieties (Table 2; Ne = 1.593 in [20]; Ne = 1.58 in [38]).
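The per-locus parameters discussed above follow from allele frequencies through textbook definitions. The sketch below is a minimal illustration under stated assumptions: genotypes are given as pairs of alleles (hypothetical input format), He is the uncorrected 1 − Σp², and Ne is 1/Σp². It is not the software used in the study, and sample-size corrections applied by dedicated programs are omitted.

```python
from collections import Counter


def allele_frequencies(calls):
    """Allele frequencies from diploid calls such as "AG" or (236, 240)."""
    counts = Counter(allele for call in calls if call is not None for allele in call)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no successful calls at this locus")
    return {allele: n / total for allele, n in counts.items()}


def locus_stats(calls):
    """MAF, observed/expected heterozygosity and effective allele number."""
    freqs = allele_frequencies(calls)
    called = [call for call in calls if call is not None]
    homozygosity = sum(p * p for p in freqs.values())  # sum of squared frequencies
    return {
        # MAF is meaningful for biallelic loci; a monomorphic locus gets 0.
        "maf": min(freqs.values()) if len(freqs) > 1 else 0.0,
        "ho": sum(call[0] != call[1] for call in called) / len(called),
        "he": 1.0 - homozygosity,
        "ne": 1.0 / homozygosity,
    }


# Hypothetical biallelic SNP scored in four samples.
stats = locus_stats(["AG", "AG", "AA", "GG"])
```

For a biallelic locus, Ne computed this way approaches 2 as the MAF approaches 0.5, which is why the SNP values reported above are bounded by 2 while the multiallelic SSR loci reach much higher values.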
Differences between SNPs and SSRs were also observed with respect to heterozygosity. For the SSR loci, the observed heterozygosity (Ho) varied between 0.785 (VVMD7 and VrZAG62) and 0.954 (VVMD5); the lowest expected heterozygosity (He) was detected at the VVMD7 locus (0.745) and the highest at the VVS2 locus (0.820; Table 2). The level of heterozygosity observed in this study (Ho = 0.859 and He = 0.790) was similar to that observed for other sets of grape varieties analysed with SSR markers (Ho = 0.833 and He = 0.767 in [53]; Ho = 0.833 and He = 0.769 in [4]). This high level of heterozygosity is in agreement with the natural breeding system of the species and could be a consequence of both natural and human selection against homozygosity in these plants [10,21,54]. Tests for Hardy-Weinberg equilibrium (HWE) revealed no significant deviations from HWE at the 0.05 significance level for the six SSR loci analysed.
As expected, SSR loci, due to the SSR multiallelic nature and high level of polymorphism, exhibited a significantly higher heterozygosity than biallelic SNP loci. This trend was observed by Emanuelli et al. [7]. On average, SNPs displayed lower observed (Ho = 0.378) and expected (He = 0.351) heterozygosity values than SSRs (Table 2). For 196 SNP loci, no difference (p < 0.05) between Ho and He values was found. However, deviation from the HWE (p < 0.05) was observed for 13.27% of the SNP markers; for 23 SNPs (10.18%), Ho was significantly higher than He, whereas Ho was significantly lower than He in the remaining seven SNPs (3.09%).
The level of polymorphism was assessed by calculating the PIC values for each of the six nSSR and 226 SNP loci. The VVMD7 and VVMD5 markers displayed the minimum (0.736) and maximum (0.844) PIC values, respectively (Table 2 and Table S5). The nSSR loci were highly polymorphic and showed a mean PIC value of 0.773. This is in agreement with findings in other studies on Portuguese native grape genotypes (PIC = 0.738 in [53]; PIC = 0.741 in [4]). PIC values estimated for SNP loci, with an average of 0.280, varied between 0.006 (SNP817_209) and 0.492 (SNP853_312; Table 2). One hundred and sixty-nine SNPs displayed PIC values between 0.2 and 0.5, and the remaining 57 showed PIC values below 0.2. These values indicate that the majority of the SNP loci analysed had a very high discriminating capacity for grape varieties. Similar mean PIC values have also been reported in genetic diversity studies on V. vinifera varieties (PIC = 0.253 in [25]; PIC = 0.315 in [29]; PIC = 0.280 in [42]). PIC values for SNPs were lower than for SSR markers due to the biallelic nature of SNPs, for which a maximum PIC value of 0.50 is usually expected at a given locus [7]. However, this limitation can be overcome either by using a larger set of SNP markers [55] or by considering SNPs as multiallelic molecular markers [25].
The global probability of identity (PI) obtained for the six SSR set (1.0 × 10⁻⁷) was considerably higher than that determined for the 226 SNP set (1.4 × 10⁻⁷⁰; Table 2), with the 226 SNP set having a discriminating power equivalent to that of a set of 61 SSRs. Moreover, even the 43 SNP loci used for the varietal identification (6.8 × 10⁻¹⁶) would give an identification power similar to that of 14 microsatellites.
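To make the comparison of PI values concrete, the following sketch uses the commonly applied expression for unrelated individuals, PI = Σpᵢ⁴ + Σᵢ<ⱼ(2pᵢpⱼ)², and multiplies per-locus values under the assumption of independent loci. Whether the study used exactly this estimator (or a sample-size-corrected variant) is not stated in this section, so the code is an illustration only; the function names are hypothetical.

```python
from itertools import combinations
from math import prod


def locus_pi(freqs):
    """Probability that two unrelated individuals share a genotype at one locus.

    freqs: iterable of allele frequencies summing to 1.
    """
    freqs = list(freqs)
    same_homozygote = sum(p ** 4 for p in freqs)
    same_heterozygote = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return same_homozygote + same_heterozygote


def combined_pi(loci_freqs):
    """Multi-locus PI, assuming the loci are independent."""
    return prod(locus_pi(freqs) for freqs in loci_freqs)


# A perfectly balanced biallelic SNP, and two such SNPs combined.
single = locus_pi([0.5, 0.5])
double = combined_pi([[0.5, 0.5], [0.5, 0.5]])
```

Under this expression a biallelic locus cannot go below PI = 0.375 (reached at equal allele frequencies), whereas a multiallelic SSR locus can be far lower, which is why many more SNPs than SSRs are needed to reach a given global PI.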
Chlorotype diversity found in the present work is shown in Table 3 (and Supplementary Tables S3 and S4). At least two allele variants were detected per cpSSR locus: two different size variants were found at the ccmp3 and ccmp5 loci (106 and 107 bp, and 102 and 103 bp, respectively), and three different size variants at the ccmp10 locus (114, 115, and 116 bp; Table 3). The combination of alleles from these cpSSRs enabled the main grape chlorotypes (chl), designated A, B, C, and D according to Arroyo-García et al. [19], to be distinguished. Chlorotypes were also confirmed through amplification of the SNP_NG_C_001 (C/T), SNP_NG_C_003 (C/T), and SNP_NG_D_003 (A/G) loci. Using the same nomenclature, the CCG, CTG, TTG, and CCA nucleotide combinations were found in the A, B, C, and D chlorotypes, respectively (Supplementary Table S4). Chl A was the most frequent (observed in 50.77% of the grape varieties), but a high percentage of chl D was also detected (46.15%; Table 3). Chl B and chl C were only present in the foreign varieties 'Dodrelyabi' and 'Black Monukka', respectively (Supplementary Tables S3 and S4). Chl A characterises the varieties of the Iberian Peninsula, which is referred to as a secondary centre of domestication of V. vinifera L. ssp. vinifera, whereas chl D is more commonly observed in eastern European grape varieties [13,18,44,56]. In this sense, the existence of 16 varieties and 12 new genotypes presumably autochthonous to Portugal with chl D (24.46 and 18.46%, respectively) could be a consequence of crosses between introduced non-Iberian varieties and Portuguese native germplasm. The chlorotypes of 'Carrega Branco', 'Nevoeira', 'Roseira', and the 13 new genotypes identified were determined for the first time in this work, as no previous references were found in the literature or databases used. Most of these genotypes bear chl D, except 'Nevoeira' and 'NG013' (both chl A; Supplementary Tables S3 and S4).
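The SNP-based chlorotype assignment described above amounts to a lookup on a three-nucleotide combination. The sketch below assumes that the combinations are written in the order SNP_NG_C_001, SNP_NG_C_003, SNP_NG_D_003 (the order in which the loci are listed in the text), and the sample counts used in the example (33 A, 30 D, 1 B, 1 C out of 65) are inferred from the reported percentages rather than quoted from Table 3.

```python
from collections import Counter

# Nucleotide combination -> chlorotype, as reported in the text.
# Assumed locus order: SNP_NG_C_001, SNP_NG_C_003, SNP_NG_D_003.
SNP_TO_CHLOROTYPE = {"CCG": "A", "CTG": "B", "TTG": "C", "CCA": "D"}


def call_chlorotype(c_001, c_003, d_003):
    """Return the chlorotype for one genotype, or 'unknown' if unlisted."""
    return SNP_TO_CHLOROTYPE.get(c_001 + c_003 + d_003, "unknown")


def chlorotype_percentages(chlorotypes):
    """Percentage of genotypes carrying each chlorotype, rounded to 2 decimals."""
    counts = Counter(chlorotypes)
    total = len(chlorotypes)
    return {chl: round(100 * n / total, 2) for chl, n in counts.items()}


# Counts inferred from the percentages reported above (assumption).
observed = ["A"] * 33 + ["D"] * 30 + ["B"] + ["C"]
percentages = chlorotype_percentages(observed)
```

With these inferred counts the function reproduces the 50.77% (chl A) and 46.15% (chl D) figures given above.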

Population Structure Analysis
The genetic stratification in the set of 65 non-redundant grape genotypes was tested through a STRUCTURE analysis, using 226 SNP profiles. The delta K criterion (∆K) [57] suggested K = 2 as the optimal uppermost hierarchical level of structure (Supplementary Figure S1); in this sense, dividing the genotypes into two major genetic groups was the best representation. A bar plot representation of the obtained structure is shown in Figure 3.
Figure 3. Population structure of the non-redundant genotypes identified throughout the "Douro" and "Trás-os-Montes" PDO regions, using 226-SNP profiles. The number of genetic groups (K = 2) was set considering the ∆K criterion [57]. Every non-redundant genotype is shown as a vertical line, with colour segment lengths proportional to the inferred ancestry: genetic groups 1 and 2 are shown in red and green, respectively. Considering a critical ancestry coefficient of q ≥ 0.70, 20 and 18 genotypes were assigned to SNP-group 1 and SNP-group 2, respectively (with 27 admixed genotypes).
A membership coefficient (q-value) threshold of 0.7 for genetic group assignment was considered. Twenty (30.77%) and 18 (27.69%) genotypes were assigned to SNP-group 1 and SNP-group 2, respectively. The percentage of admixed genotypes was 41.54% (27 genotypes; Figure 3; Supplementary Table S6). SNP-group 1 is composed of Iberian varieties, including 'Alfrocheiro' and 'Hebén' that are progenitors of the majority of varieties also established in this genetic group. SNP-group 2 includes genotypes considered autochthonous to Portugal (except 'Chasselas'), including 'Marufo' and 'Touriga Nacional' that are progenitors of the majority of varieties and five new genotypes also found in this ancestry group.
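The assignment rule stated above (q ≥ 0.70, otherwise admixed) can be written as a short function. This is a sketch only: the function name is hypothetical and the input is assumed to be the list of membership coefficients of one genotype, ordered by genetic group.

```python
def assign_group(q_values, threshold=0.70):
    """Assign a genotype to a genetic group from its membership coefficients.

    q_values: membership coefficients, one per genetic group (sum close to 1).
    Returns "SNP-group k" when the largest coefficient reaches the threshold,
    and "admixed" otherwise.
    """
    best = max(range(len(q_values)), key=lambda k: q_values[k])
    if q_values[best] >= threshold:
        return f"SNP-group {best + 1}"
    return "admixed"


# Coefficients quoted in the text for 'NG001' (0.96 in group 1) and
# 'Donzelinho Roxo' (0.70 in group 2), plus a hypothetical admixed genotype.
examples = [assign_group([0.96, 0.04]),
            assign_group([0.30, 0.70]),
            assign_group([0.55, 0.45])]
```

Genotypes whose largest coefficient sits exactly at the threshold, such as 0.70, are still assigned, so their grouping is sensitive to small changes in the estimated ancestry.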
Principal coordinate analysis (PCoA) was performed to infer the distribution of genetic relationships among the structure groups, as revealed by 226 SNP loci (Figure 4A). The first two PCos described 13.09% of the total variation. The ancestral groups (SNP-groups 1 and 2) were discriminated by this analysis, as they were separated along PCo1. Admixed genotypes were generally placed between the genotypes of the two genetic groups (Figure 4A). Some differences were found between the PCoA (Figure 4A) and the structure analysis (Figure 3). 'NG001' and 'NG007', assigned to SNP-group 1 with membership coefficients above 0.7 (0.96 and 0.71, respectively), appeared in the left part of the PCoA plot, which included genotypes from SNP-group 2. Conversely, 'Donzelinho Roxo', 'Mourisco de Semente', and 'Chasselas', assigned to SNP-group 2 with membership coefficients equal to or above 0.7 (0.70, 0.74, and 0.76, respectively), appeared in the right part of the PCoA plot, which included genotypes from SNP-group 1 (Figure 4A). These differences could be explained by the fact that the membership coefficients of all these genotypes, except that of 'NG001', are close to the threshold of 0.7 used for genetic group assignment in the population structure analysis. Nevertheless, these results highlight the risk of overinterpreting particular data points in the PCoA plot, which is based on only two coordinates explaining a limited percentage of the total variation.
An Unweighted Pair Group Method with Arithmetic Mean (UPGMA) distance tree was constructed from genetic distance matrices to investigate the genetic relationships among the 65 non-redundant genotypes (226-SNP data; Figure 4B). Genotypes displayed different levels of similarity, ranging from 83 to 92%. Five clusters (I to V, with 'NG010' as an outlier) were considered (Figure 4B). Clusters I and II included genotypes belonging mainly to structure SNP-group 2, and Clusters IV and V comprised mainly genotypes assigned to structure SNP-group 1. Cluster III was composed of genotypes considered admixed (excluding 'Hebén', which was assigned to SNP-group 1). Hence, genotypes were clustered according to their ancestral group, with the following exceptions: 'Donzelinho Roxo' was grouped in Cluster IV but assigned to structure SNP-group 2, and 'NG001' and 'NG007' were placed in Cluster I but allocated to structure SNP-group 1 (Figure 4B). The clustering results for 'Donzelinho Roxo', 'NG001', and 'NG007' were also supported by the PCoA (Figure 4A). Interestingly, all the new genotypes, along with 'Marufo', were included in Cluster I.
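For readers unfamiliar with UPGMA, the sketch below shows the clustering step on a small, hypothetical distance matrix (distance taken as 1 − similarity). It is a plain-Python illustration of the average-linkage update, not the software or the distance measure used to build Figure 4B, and the variety labels are placeholders.

```python
def upgma(labels, dist):
    """Agglomerative UPGMA clustering.

    labels: list of item names.
    dist:   symmetric matrix (list of lists) of pairwise distances.
    Returns the merge history as (members_a, members_b, distance) tuples.
    """
    clusters = {i: [label] for i, label in enumerate(labels)}
    pair_dist = {frozenset((i, j)): dist[i][j]
                 for i in clusters for j in clusters if i < j}
    merges = []
    next_id = len(labels)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        pair = min(pair_dist, key=pair_dist.get)
        a, b = sorted(pair)
        d_ab = pair_dist.pop(pair)
        members_a, members_b = clusters.pop(a), clusters.pop(b)
        # Distance to the new cluster is the size-weighted mean, which equals
        # the arithmetic mean over all original item pairs.
        for k in clusters:
            d_ak = pair_dist.pop(frozenset((a, k)))
            d_bk = pair_dist.pop(frozenset((b, k)))
            pair_dist[frozenset((next_id, k))] = (
                len(members_a) * d_ak + len(members_b) * d_bk
            ) / (len(members_a) + len(members_b))
        clusters[next_id] = members_a + members_b
        merges.append((members_a, members_b, d_ab))
        next_id += 1
    return merges


# Hypothetical distances (1 - similarity) among three placeholder genotypes.
names = ["variety_1", "variety_2", "variety_3"]
matrix = [[0.00, 0.08, 0.16],
          [0.08, 0.00, 0.14],
          [0.16, 0.14, 0.00]]
history = upgma(names, matrix)
```

Here the two closest genotypes merge first at a distance of 0.08, and the third joins at the mean of its distances to them, (0.16 + 0.14) / 2.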

Pedigree Analysis
The 240 SNP-profiles of 65 non-redundant grape genotypes, including those of the 13 unique genotypes identified in the present work, were merged with those stored in the ICVV-SNP database completing a total of about 2500 genotypes, for a wide search of possible first-order kinship relationships.
Portuguese grape germplasm consists of a very large number of varieties, and in most cases their ancestors remain unknown. Recently, Cunha et al. [20] reported the existence of first-degree relationships among several Portuguese varieties.
All reliable trios (both parents and offspring) and duos (parent-offspring pairs) involving genotypes analysed and the corresponding LOD values and the number of mismatching loci are presented in Table 4 and Table 5. The most relevant parentage relationships are also shown in Figure 5. An Unweighted Pair Group Method with Arithmetic Mean (UPGMA) distance tree was constructed to investigate the genetic relationship among the 65 non-redundant genotypes from genetic distance matrices (226-SNP data; Figure 4B). Genotypes displayed different levels of similarity, ranging from 83 to 92%. Five clusters (I to V; 'NG010' as an outlier) were considered ( Figure 4B). Clusters I and II included genotypes belonging mainly to the structure SNP-group 2, and the Clusters IV and V were mainly genotypes assigned to structure SNP-group 1. Cluster III was composed with genotypes considered admixtures (excluding 'Hebén' that was assigned to SNP-group 1). Hence, genotypes were clustered according to their ancestral group, but the following differences were verified: the 'Donzelinho Roxo' that was grouped in Cluster IV but assigned to structure SNP-group 2; and 'NG001' and 'NG007' situated in Cluster I but allocated to structure SNP-group 1 ( Figure 4B). Clustering results for 'Donzelinho Roxo', 'NG001', and 'NG007' were also supported by the PCoA (Figure 4B). Interestingly, all the new genotypes along with Marufo were included in Cluster I.

Pedigree Analysis
The 240 SNP-profiles of 65 non-redundant grape genotypes, including those of the 13 unique genotypes identified in the present work, were merged with those stored in the ICVV-SNP database completing a total of about 2500 genotypes, for a wide search of possible first-order kinship relationships.
Portuguese grape germplasm consisted of a very large number of varieties, and in most cases their ancestors remain largely unknown. Recently, Cunha et al. [20] reported the existence of first-degree relationships among several Portuguese varieties.
All reliable trios (both parents and offspring) and duos (parent-offspring pairs) involving the genotypes analysed, together with the corresponding LOD values and the number of mismatching loci, are presented in Tables 4 and 5. The most relevant parentage relationships are also shown in Figure 5.
Table 4. List of trios (parents-offspring) identified in this study using 226 SNP data. Genotypes for which a pedigree has been confirmed with molecular markers for the first time are highlighted in bold.
Pedigree results revealed 36 compatible trios, with high LOD values, ranging from 52.20 to 101.40, using a maximum of two mismatching loci as threshold, with the confirmation of 27 already reported trios (see Table 4 for references). The case of 'Marufo' and 'Borraça' as parents of 'Mourisco de Semente' was also considered reliable, despite the detection of four mismatching loci, since this trio was confirmed by Lacombe et al. [58] with 20 SSR markers. Pedigree analysis also allowed the discovery of the probable genetic origins of 9 out of 13 new genotypes (LOD values above 52; Table 4; Figure 5).

Several grape varieties have been previously reported to have an important role in the establishment of local genetic networks, such as 'Hebén', 'Alfrocheiro', and 'Marufo' in the Iberian Peninsula [20,59,60]. In fact, data analysis showed the significant contribution of 'Marufo' and 'Alfrocheiro' in the generation of Portuguese grapevine diversity, being involved as progenitors in 16 and 8 pedigrees, respectively.
Unlike hermaphrodite grape varieties, female progenitors, such as 'Marufo' (chl D) and 'Hebén' (chl A), need to cross-pollinate to produce descendants, a process that increases both genetic diversity and hybrid plant vigour, which could have favoured their selection as seed donors by early farmers to ensure grape production [20].
As previously mentioned, 'Marufo', 'Tinto Cão', and 'Vinhão' are late-maturing varieties. 'Tinto Cão' has been described in the Douro region since the XVIII century [64]. 'Vinhão' originated in North Portugal and, according to the French ampelographer Paul Truel, was introduced into the Douro from the Minho region in 1790 to improve the colour of Douro wines (cited in [65]).
In some cases, the classification of genotypes as admixed (assuming a threshold q-value > 0.7 for group assignment) could be explained by parentage analysis. For example, both 'Touriga Fêmea' and 'NG007' were determined to be admixed genotypes; 'Touriga Fêmea' is a progeny derived from 'Malvasia Fina' and 'Touriga Nacional' (assigned to SNP-group 1 and SNP-group 2, respectively), whereas 'NG007' is a reliable result of a cross between 'Marufo' and 'Camarate Tinto' (allocated to SNP-group 2 and SNP-group 1, respectively; Figure 3; Figure 5; Table 4).
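The group-assignment rule mentioned above can be illustrated with a minimal sketch. It assumes that a genotype is assigned to a group only when its largest membership coefficient exceeds 0.7 and is otherwise treated as admixed; the function name and the example q-values are hypothetical and are not data from this study.

```python
ASSIGNMENT_THRESHOLD = 0.7  # q-value threshold quoted in the text


def assign_group(q_values, threshold=ASSIGNMENT_THRESHOLD):
    """Label a genotype from its list of membership coefficients (q-values).

    Returns "SNP-group N" (1-based) when the largest q-value exceeds the
    threshold, and "admixed" otherwise. Treating a q-value exactly equal to
    the threshold as admixed is an assumption of this sketch.
    """
    best_index = max(range(len(q_values)), key=lambda i: q_values[i])
    if q_values[best_index] > threshold:
        return f"SNP-group {best_index + 1}"
    return "admixed"


# Made-up q-values for K = 2, for illustration only.
print(assign_group([0.85, 0.15]))  # clearly dominated by the first group
print(assign_group([0.45, 0.55]))  # neither group exceeds the threshold
```

A cross between parents from different groups would be expected to yield intermediate q-values, which is consistent with the admixed status discussed for such progenies.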
Although several compatible duos were also identified, a compatible duo does not necessarily imply a parent-offspring relationship: siblings and other closely related varieties are also compatible for most of the molecular markers used. For this reason, only the most consistent duos (with LOD scores above 25.00 and a maximum of one mismatching locus) were considered. Five reliable duos were detected and are summarised in Table 5.
However, no reliable trios or duos were found within the ICVV-SNP database for other genotypes, mainly Iberian ones: 'Tamarez', 'Tinta Francisca', 'NG005', and 'NG013'. Since the ICVV-SNP database includes a high number of Iberian profiles and yet no parentage relationships could be established, the progenitors of these genotypes are most likely extinct or close to extinction. Their extinction, as proposed by Cunha et al. [20], may be due to the appearance in the 19th century of different disease-causing agents and pests (e.g., mildews and grape phylloxera) that massively annihilated cultivated and wild grapevines throughout Europe; alternatively, they may have been minor varieties (or individual plants) lost during the evolution of viticulture due to other causes.

Sampling and DNA Extraction
To analyse the ancient genetic diversity of V. vinifera in "Douro" and "Trás-os-Montes" PDO regions, 310 plants were sampled across 11 different old mixed variety vineyards, all predating the 1970s (Figure 1; Supplementary Table S1). The selection of plants was based on the difficulty of their morphological identification by ampelographers; the identity of 279 grape samples was unknown and had no names, whereas 31 were named mainly with local names (Supplementary Table S2).
All plants were labelled in the vineyards and young leaves were collected in several exploration trips between 2016 and 2019. Samples were kept on ice until storage at −80 °C for DNA isolation and genotyping.
Genomic DNA was extracted according to Doyle & Doyle [66], with some modifications. Total purified DNA was detected by 1.0% (w/v) agarose gel electrophoresis containing Gel-Green™ Nucleic Acid Gel Stain 0.5x (Biotium, Fremont, CA, USA) and stored at −20 °C until use. The final concentration was confirmed using a NanoDrop® ND-1000 UV-Vis spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA).
All PCR products were visualised by electrophoresis on 2.0% agarose gels (w/v) containing Gel-Green™ Nucleic Acid Gel Stain 0.5x (Biotium). Fluorescently labelled cp and nSSR products were separated by capillary electrophoresis using the ABI PRISM® 3130 automated sequencer (Applied Biosystems, Life Technologies, Foster City, CA, USA) and GeneScan™ 500 LIZ® (Applied Biosystems, Life Technologies, Foster City, CA, USA) as the internal lane size standard. Data produced were analysed by Peak Scanner v1.0 software (Applied Biosystems, Foster City, CA, USA). The sizes of the amplicons were scored in base pairs (bp) based on the relative migration of the internal size standard.
Varietal identification was achieved by comparing the nSSR profiles obtained with those reported in the literature and in the VIVC database (5424 profiles) [39].

SNP Markers
DNA samples from non-redundant genetic profiles (previously identified through nSSR profiles/VIVC database), and samples that produced nSSR profiles that did not match with those of the VIVC database were also genotyped for a set of 240 SNP markers previously identified by Lijavetzky et al. [25] and Cabezas et al. [29]. Three ctSNP (SNP_NG_C_001, SNP_NG_C_003 and SNP_NG_D_003) loci were used for chlorotype determination.
The SNP genotyping was carried out as recently described in Cunha et al. [20] and Maraš et al. [38], through the Fluidigm (San Francisco, CA, USA) technology. Genotyping services were provided by the Sequencing and Genotyping Unit of the University of the Basque Country. SNP profiles obtained for the 240 SNPs were pairwise compared with those of the ICVV-SNP database for varietal identification.

Data Analyses
Data obtained with the nSSR loci were scored based on the molecular size (in bp) of alleles. For SNP data, numerical values were assigned to each nucleotide (missing data = 0; A = 1, C = 2, G = 3, T = 4).
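As an illustration of the SNP coding scheme just described, the following minimal sketch converts diploid SNP calls into numerical pairs. The two-letter input format (e.g. "AG"), the function name, and the example profile are assumptions made for illustration; only the nucleotide-to-number mapping comes from the text.

```python
# Numerical coding described in the text: missing = 0, A = 1, C = 2, G = 3, T = 4.
NUCLEOTIDE_CODES = {"A": 1, "C": 2, "G": 3, "T": 4}


def encode_genotype(call):
    """Encode a diploid SNP call such as "AG" as a pair of integers.

    Any character other than A, C, G or T (for example "-", "N" or "?") is
    coded as missing data (0). Calls that are not two characters long are
    treated as entirely missing. Both choices are assumptions of this sketch.
    """
    if len(call) != 2:
        return (0, 0)
    first, second = (NUCLEOTIDE_CODES.get(base.upper(), 0) for base in call)
    return (first, second)


# Hypothetical four-locus profile, for illustration only.
profile = ["AG", "CC", "--", "TT"]
encoded = [encode_genotype(call) for call in profile]
print(encoded)  # [(1, 3), (2, 2), (0, 0), (4, 4)]
```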
Non-redundant grapevine genotypes with genetic profiles for 6 nSSR and 48 SNP loci were used for grape variety identification. Genetic profiles for 240 SNP loci were used for population structure and genetic diversity analyses.

Genetic Diversity Analysis
To compare SSR and SNP results, genetic parameters of polymorphism, such as the average number of different alleles per locus (Na), the average number of effective alleles (Ne), observed heterozygosity (Ho) and gene diversity or expected heterozygosity (He) were calculated through the GenAlEx software (version 6.5) [69]. They were determined from single-locus values. The same software was also used to test for deviation from the Hardy-Weinberg equilibrium (HWE) across all loci for each population.
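The single-locus parameters listed above can be sketched as follows, using the standard textbook definitions (Na as the count of distinct alleles, Ne as the reciprocal of the sum of squared allele frequencies, Ho as the proportion of heterozygotes, and He as one minus that sum). This is only an illustrative sketch: it does not reproduce GenAlEx itself, and the handling of missing data and the example genotypes are assumptions.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Return (Na, Ne, Ho, He) for a single locus.

    `genotypes` is a list of (allele1, allele2) pairs. Pairs containing 0,
    the missing-data code, are skipped (an assumption of this sketch).
    """
    scored = [g for g in genotypes if 0 not in g]
    if not scored:
        raise ValueError("no scored genotypes at this locus")

    allele_counts = Counter(allele for g in scored for allele in g)
    total_alleles = 2 * len(scored)
    frequencies = [count / total_alleles for count in allele_counts.values()]
    sum_sq = sum(p * p for p in frequencies)

    na = len(allele_counts)                                # number of different alleles
    ne = 1.0 / sum_sq                                      # effective number of alleles
    ho = sum(1 for a, b in scored if a != b) / len(scored) # observed heterozygosity
    he = 1.0 - sum_sq                                      # expected heterozygosity
    return na, ne, ho, he


# Hypothetical genotypes at one SNP locus (1 = A, 3 = G, 0 = missing).
example = [(1, 3), (1, 1), (3, 3), (1, 3), (0, 0)]
na, ne, ho, he = locus_diversity(example)
print(na, ne, ho, he)
```

Averaging these single-locus values over all loci gives the mean parameters used to compare marker sets.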

Population Structure Analysis
Structure analyses were performed using the STRUCTURE software (version 2.3.4) [72][73][74][75], using 240-SNP data. The model was run to evaluate the number of inferred genetic population clusters (K) and to assign individuals to their likely population of origin, using no prior information. An initial burn-in of 20,000 steps was used to minimise the effect of the starting configuration, followed by 100,000 Markov Chain Monte Carlo (MCMC) steps, as recommended by Falush et al. [74] and Ghaffari et al. [21], under the admixture model and independent allele frequencies. Ten replicate runs per K value were set up, with K ranging from 1 to 10. To identify the number of K clusters explaining the observed genetic structure, the log-probability of the data (LnP(D)) in the STRUCTURE output as well as the delta K values were obtained using the program STRUCTURE HARVESTER (web version 0.6.94), available online [76], and based on the Evanno et al. [57] method. Samples were assigned probabilistically to genetic groups according to their membership coefficient (q-value).
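The delta K criterion referred to above can be sketched as follows. The sketch assumes the usual formulation in which delta K is the absolute second-order difference of the mean LnP(D) across successive K values, divided by the standard deviation of LnP(D) over the replicate runs at K; the use of the sample standard deviation, the function name, and the LnP(D) values are assumptions and not output from this study or from STRUCTURE HARVESTER.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute delta K from replicate LnP(D) values.

    `lnp_by_k` maps each K to the list of LnP(D) values from its replicate
    runs (at least two runs per K). Delta K is only defined for K values
    with both neighbours (K - 1 and K + 1) present, so the smallest and
    largest K are omitted from the result.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # delta K is undefined when all replicates are identical
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_difference / spread
    return delta


# Made-up LnP(D) values for three replicate runs per K, for illustration only.
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-890.0, -895.0, -885.0],
    4: [-885.0, -880.0, -890.0],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

In this toy input the largest delta K falls at K = 2, where the gain in LnP(D) levels off sharply and the replicate runs agree closely.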
To assess the relationship among the non-redundant genotypes, the pairwise genetic distance matrix was computed based on SNP data, through the 'Genetic Distance' procedure in the GenAlEx software (version 6.5) [69], for subsequent analyses. Principal coordinate analyses (PCoA) were performed using the same software for SNP distance matrices, conducted on individual multilocus genotypes and with covariance standardised. The clustering was inferred using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) [77]. The optimal circular trees obtained for SNP markers were plotted using MEGA X software [78]. A tree was drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phenetic tree.
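The UPGMA clustering step can be illustrated with a small pure-Python sketch that operates on a precomputed pairwise distance matrix. It is not the implementation used by GenAlEx or MEGA X; the function name, the nested-tuple tree representation, and the toy distances are assumptions made for illustration.

```python
def upgma(labels, dist):
    """Cluster items by UPGMA and return the tree as nested tuples.

    `labels` is a list of names and `dist` a symmetric matrix (list of lists)
    of pairwise distances in the same order. Each merge is represented as
    (left_subtree, right_subtree, height), with height equal to half the
    distance between the two merged clusters.
    """
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}  # id -> (subtree, size)
    pair_dist = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        (a, b), d_ab = min(pair_dist.items(), key=lambda item: item[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average, i.e. the mean over all member pairs.
        merged_dist = {}
        for c in clusters:
            d_ac = pair_dist[(min(a, c), max(a, c))]
            d_bc = pair_dist[(min(b, c), max(b, c))]
            merged_dist[c] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        pair_dist = {p: v for p, v in pair_dist.items() if a not in p and b not in p}
        for c, value in merged_dist.items():
            pair_dist[(c, next_id)] = value  # existing ids are always < next_id
        clusters[next_id] = ((tree_a, tree_b, d_ab / 2), size_a + size_b)
        next_id += 1
    tree, _size = next(iter(clusters.values()))
    return tree


# Toy distance matrix for four hypothetical genotypes.
names = ["G1", "G2", "G3", "G4"]
distances = [
    [0, 2, 8, 8],
    [2, 0, 8, 8],
    [8, 8, 0, 4],
    [8, 8, 4, 0],
]
tree = upgma(names, distances)
print(tree)
```

The toy matrix groups G1 with G2 and G3 with G4 before joining the two pairs, and the recorded heights correspond to branch lengths in a tree drawn to scale.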

Pedigree Analysis
Data from the 240 SNP set were also used to identify possible first-order kinship relationships, namely trios (mother-father-offspring) and duos (possible parent-offspring pairs), among the non-redundant grape genotypes in the study, once their SNP profiles had been merged with those of the ICVV-SNP database. The likelihood-based method implemented in the CERVUS software (version 3.0) [79] was used. The likelihood of each detected trio and duo was determined based on the natural logarithm of the overall likelihood ratio, the logarithm-of-odds (LOD) score, and a maximum number of mismatching loci of 1 or 4 SNPs for duos and trios, respectively. Where possible, chlorotypes were used to infer which of the putative parents was the maternal progenitor in each trio [13,19].
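The mismatch criterion for trios can be illustrated with a minimal sketch that counts Mendelian incompatibilities locus by locus. It covers only the mismatch count and does not compute the LOD scores obtained with CERVUS; the function names, the skipping of loci with missing data, and the example profiles are assumptions made for illustration.

```python
MAX_TRIO_MISMATCHES = 4  # threshold for trios quoted in the text


def locus_is_compatible(child, parent1, parent2):
    """True if the child could inherit one allele from each parent."""
    a, b = child
    return (a in parent1 and b in parent2) or (b in parent1 and a in parent2)


def count_trio_mismatches(child, parent1, parent2):
    """Return (mismatching_loci, compared_loci) for a candidate trio.

    Each profile is a list of (allele1, allele2) pairs in the same locus
    order. Loci where any of the three genotypes contains 0 (missing data)
    are not compared, which is an assumption of this sketch.
    """
    mismatches = 0
    compared = 0
    for c, p1, p2 in zip(child, parent1, parent2):
        if 0 in c or 0 in p1 or 0 in p2:
            continue
        compared += 1
        if not locus_is_compatible(c, p1, p2):
            mismatches += 1
    return mismatches, compared


# Hypothetical four-locus profiles (1 = A, 2 = C, 3 = G, 4 = T, 0 = missing).
offspring = [(1, 3), (2, 2), (1, 1), (0, 0)]
mother = [(1, 1), (2, 4), (1, 3), (1, 1)]
father = [(3, 3), (2, 2), (3, 3), (1, 1)]

mismatches, compared = count_trio_mismatches(offspring, mother, father)
print(mismatches, compared, mismatches <= MAX_TRIO_MISMATCHES)
```

In this toy trio the third locus is incompatible, because the offspring is homozygous for an allele that one putative parent does not carry, and the fourth locus is skipped as missing.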

Conclusions
SSR and SNP analyses were very useful in the identification and characterisation of the plants analysed and overcame the ampelography difficulties in grapevine prospections to contribute to the ultimate goal of conserving grape varietal legacy in Northeast Portugal. Further morphological, agronomical, and oenological analyses are being undertaken to complement the molecular data reported in this study.
In the present work, 280 plants in the "Douro" and "Trás-os-Montes" PDO regions were identified and grouped into 52 different grape varieties. Thirteen additional unique genotypes were also detected among the other 30 vines; according to the UPGMA data, these were clustered, along with cv. Marufo (mother of the majority of these new genotypes), in an exclusive independent cluster. Some of these 13 new genotypes could be considered minor neglected Portuguese grape varieties, while others, found only in plants belonging to a single local vine grower still using traditional techniques for grapevine propagation, are probably individual grapevines that emerged as seedlings and may represent the initial steps in the establishment of local grape varieties, as occurred in the past. Altogether, their discovery highlights once more the huge patrimony of the Portuguese grape germplasm.
The aforementioned observations emphasise the constant demand for prospection and identification of old grape material in traditional vineyards, to enlarge the knowledge about the still existing varietal diversity for its conservation, characterisation, and eventual exploitation. The broader the knowledge, the greater the chances to overcome biotic and abiotic stresses affecting the vineyards today.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/plants10122755/s1, Table S1. List of Vitis vinifera L. cultivars (52) and new genotypes (13) identified and corresponding sample codes; Table S2. List of 31 grape samples with local names and SSR and SNP identifications according to the legal cultivar name in Portugal; Table S3. List of the 65 grape genotypes detected at 6 nSSR and 3 cpSSR loci analysed; Table S4. List of the 65 non-redundant grape genotypes detected at 46 polymorphic SNP loci; Table S5. Genetic parameters, allele sizes, and frequencies over 6 microsatellite loci in the 65 non-redundant genotypes analysed in this study; Table S6. Structure results at K = 2 based on 226 SNP markers. Genotypes with membership coefficients (q-values) below the threshold of 0.7 for genetic group assignment were admixed; Figure S1. Delta K plots obtained from STRUCTURE HARVESTER to set the most likely number of genetic groups within the 65 non-redundant grape population identified in the present study, based on 226-SNP data.