Discrimination of Potato (Solanum tuberosum L.) Accessions Collected in Majella National Park (Abruzzo, Italy) Using Mid-Infrared Spectroscopy and Chemometrics Combined with Morphological and Molecular Analysis

Development of local plant genetic resources grown in specific territories requires approaches that are able to discriminate between local and alien germplasm. In this work, three potato (Solanum tuberosum L.) local accessions grown in the area of Majella National Park (Abruzzo, Italy) and five commercial varieties cultivated in the same area were characterized using 22 morphological descriptors and microsatellite (SSR) DNA markers. Analysis of the DNA and of the plant, leaf, flower, and tuber morpho-agronomic traits allowed for a reliable discrimination of the local potato accessions, and provided a clear picture of their genetic relationships with the commercial varieties. Moreover, infrared spectroscopy was used to acquire a fingerprint of the tuber flesh composition. A total of 279 spectra, 70% of which were used in calibration and the remaining 30% for prediction, were processed using partial least squares discriminant analysis. About 97% of the calibration samples and 80% of the prediction samples were correctly classified according to the potato origin. In summary, the combination of the three approaches was useful in the characterization and valorization of local germplasm. In particular, the molecular markers suggest that the potato accession named Montenerodomo, cultivated in Majella National Park, can be considered a local variety and can be registered into the Regional Voluntary GR Register and entered into the foreseen protection scheme, as reported by the Italian regional laws.

In the present work, attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy in the mid-range (4000-400 cm⁻¹) was used to attempt a discrimination between autochthonous potato accessions cultivated in the mountain territory of Majella National Park (Abruzzo, Italy) and commercial varieties usually grown in the same area. Apart from safeguarding natural biodiversity and the wilderness, many actions are currently being implemented in the Majella National Park to valorize the local germplasm [35,36] and promote the sustainable economic development of rural areas. In this context, the plant genetic resources of the Majella National Park, including potatoes, have been recently rediscovered. Valorization of these local agronomical specialties also requires approaches that can discriminate them from commercial products. This work focused on the discrimination of three potato landraces that have been historically cultivated within the territory of the park and five non-local varieties. The latter include four commercial varieties and one old ecotype coming from the nearby Gran Sasso-Laga National Park (Figure 1). All the accessions investigated in this work were grown in the same experimental field located within the territory of Majella National Park. Therefore, the possible variability related to the pedoclimatic features of the cultivation site was removed. Preliminarily, a characterization of potato accessions was performed using selected morpho-agronomic traits. In addition, we performed microsatellite (SSR) DNA analysis to properly identify these accessions in comparison to other commercial potato varieties that are locally grown. Varietal classification of the potatoes based on the ATR-FTIR spectra was conducted using partial least squares discriminant analysis, which is suitable for handling large spectroscopic data matrices in food traceability problems.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 3 of 19

Potato Samples
Eight different potato accessions were investigated: Gamberale (GA), Turchesa (TU), Montenerodomo (MO), Pizzoferrato (PI), Désirée (DE), Agria (AG), Kennebec (KE), and Spunta (SP). The GA, MO, and PI potato accessions are named according to the localities of the Majella National Park where they are traditionally cultivated, whereas TU is a local variety that comes from the nearby Gran Sasso-Laga National Park (Figure 1). DE, AG, KE, and SP are commercial varieties that are also usually grown by the farmers of the Majella National Park. Tubers of the eight potato varieties are displayed in Figure A1 (Appendix A).

Potato Cropping
Potatoes were grown in an experimental field located in Montenerodomo (CH), one of the 39 municipalities included in the Majella National Park territory, at an altitude of about 1000 m asl. The tubers of the commercial varieties were acquired in local markets, whereas tubers of the local accessions were kindly provided by farmers of the Majella National Park. A randomized block design with four replications was adopted. In each plot, ten seed tubers were distributed in two rows, with inter- and intra-row distances of 0.70 and 0.40 m, respectively. The tubers were planted in April 2018 and the cultivation techniques commonly applied by the farmers of the Majella National Park were adopted. The tubers of all accessions were harvested on 10 September 2018.

Morpho-Agronomic Characterization of Potato Cultivars
A total of 22 morpho-agronomic traits of plants, leaves, flowers, and tubers (listed in Table 1) were used to characterize the eight potato accessions. These descriptors were mainly based on those proposed by the International Union for the Protection of New Varieties of Plants (UPOV) [6].

MO, USA). Genotyping was carried out with five microsatellite (SSR) primer pairs chosen based on their previously assayed discrimination power in a larger collection of potato varieties [12,37,38]. All SSRs were recommended at CIP (International Potato Center, www.cipotato.org) based on quality criteria, genome coverage, and locus-specific information content (Table A1). SSR-PCR and capillary electrophoresis were performed as reported by Bontempo et al. [39]. The alleles at each SSR locus of each potato genotype were assigned their molecular size and scored as present (1) or absent (0) using GeneScan Analysis software (version number 3.1, Applied Biosystems, Foster City, CA, USA). A similarity matrix was calculated using the Dice coefficient [40] with the program DendroUPGMA (http://genomes.urv.es/UPGMA/) [41]. Through the unweighted pair group method with arithmetic mean (UPGMA) algorithm, it was possible to construct a tree diagram (dendrogram) to illustrate the genetic clustering of the potato varieties under investigation. The R software version 3.2.1 (R Foundation for Statistical Computing, Vienna, Austria) was employed to build the diagram.
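The scoring and clustering steps described above can be illustrated with a minimal Python sketch. It is not the DendroUPGMA or R workflow used in this work: the allele matrix and the genotype labels (G1 to G4) are hypothetical, and SciPy's average-linkage clustering is assumed here as a stand-in for UPGMA.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform


def dice_similarity(a, b):
    """Dice coefficient 2*shared / (n_a + n_b) for two 0/1 allele vectors."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    shared = np.logical_and(a, b).sum()
    return 2.0 * shared / (a.sum() + b.sum())


# Hypothetical presence (1) / absence (0) scores: rows = genotypes, columns = alleles.
labels = ["G1", "G2", "G3", "G4"]
alleles = np.array([
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 1, 1, 0, 1],
])

n = len(labels)
similarity = np.array(
    [[dice_similarity(alleles[i], alleles[j]) for j in range(n)] for i in range(n)]
)

# Convert similarity to a distance and cluster with average linkage (UPGMA).
distance = 1.0 - similarity
tree = linkage(squareform(distance, checks=False), method="average")
```

Each row of `tree` records one merge (the two joined clusters, the distance at which they join, and the size of the new cluster), which is the information a dendrogram displays.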

ATR-FTIR Measurements
The infrared spectra of the potato samples were recorded on a PerkinElmer Spectrum Two™ (PerkinElmer, Waltham, MA, USA) FTIR spectrometer consisting of a deuterated triglycine sulfate (DTGS) detector and a PerkinElmer Universal Attenuated Total Reflectance (uATR) accessory equipped with a single bounce diamond crystal. Each spectrum was recorded from 4000 cm⁻¹ to 400 cm⁻¹ with a 4 cm⁻¹ instrumental resolution and ten scans were averaged per spectral replicate. The background was collected with the crystal exposed to the air. Before each measurement, the ATR crystal was cleaned with methanol and air dried. ATR-FTIR spectra were collected on different sections of each potato tuber obtained by cutting the tubers in thick slices and by contacting the central part of the slice with the ATR crystal. A consistent force was applied using the pressure monitoring system integrated with the instrument to maximize the spectrum intensity. The spectra were collected from eight to nine tubers of each accession and three to five spectra were recorded from each tuber at different depths. The tubers analyzed using ATR-FTIR were randomly extracted from those with size >60 mm (descriptor N4 in Table A1, Appendix A) collected during morpho-agronomic analysis (ranging between about 30 and 180, depending on the accession) and stored in the dark in a dry and cool room. Acquisition of the spectra of the various accessions was carried out in a random order and was completed in one week in December 2018 to avoid variations caused by differences in aging.

Multivariate Statistical Analysis
Principal component analysis (PCA) and hierarchical agglomerative cluster analysis (HCA) were applied to the morpho-agronomic data. PCA [42] allows for representing multivariate information in a low-dimensionality space defined by a relatively small number of uncorrelated principal components (PCs). PCs are obtained using an orthogonal transformation of the original data in such a way that the first is oriented along the direction of maximum variance and the successive PCs in turn explain the greatest fraction of residual variance under the constraint of mutual orthogonality between the components. Transformation of the original data matrix X is mathematically described by Equation (1):

$$X = T P^{\mathsf{T}} + E \qquad (1)$$

where the columns of matrix P (loadings matrix) define the PC directions, the columns of matrix T (scores matrix) are the coordinates of the samples in the PC space, and the error matrix E collects the residuals associated with the approximation of the original data when fewer PCs than the original number of variables are extracted. Usually, the scores are graphically projected onto the two- or three-dimensional space of the most significant components (score plots), which allows for a straightforward visualization of the trends within the data samples, such as clustering, while retaining most of the original information. The loadings can also be plotted (loading plot) in the compressed PC subspace to visualize the relationships between the original variables and the relative weight of each variable in the selected PCs.
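As an illustration of the decomposition into scores, loadings, and residuals, the following Python sketch computes a three-component PCA through the singular value decomposition. The data matrix is random and purely hypothetical, and autoscaling of the columns is an assumption made for the example rather than a documented step of the in-house Matlab routines.

```python
import numpy as np

rng = np.random.default_rng(0)
X_raw = rng.normal(size=(8, 5))  # hypothetical: 8 samples x 5 descriptors

# Assumed pre-treatment: autoscale each column (zero mean, unit variance).
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)

# The SVD gives the PC directions ordered by decreasing explained variance.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

k = 3                   # number of retained components
T = U[:, :k] * s[:k]    # scores: sample coordinates in the PC space
P = Vt[:k].T            # loadings: PC directions in the variable space
E = X - T @ P.T         # residuals left out by the k-component model

explained = s**2 / np.sum(s**2)  # fraction of variance per component
```

Plotting the columns of `T` against each other gives the score plot, and plotting the rows of `P` gives the loading plot; overlaying the two yields a biplot.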
In HCA [43], single objects are gradually connected to each other in groups according to similarity, which is inversely related to the distance between objects. The final sequence of merges is graphically represented in a dendrogram, with the vertical axis showing the similarity measure at which each successive object joins a group. In this work, the usual Euclidean distance was selected to compute the similarity and the average linkage method was the clustering algorithm.
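A minimal sketch of this clustering scheme, assuming SciPy as a stand-in for the in-house routines and using five hypothetical two-dimensional points in place of the morpho-agronomic data, could read:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Hypothetical samples: two tight pairs and one isolated point.
X = np.array([
    [0.0, 0.0],
    [0.1, 0.0],
    [5.0, 5.0],
    [5.1, 5.0],
    [10.0, 0.0],
])

d = pdist(X, metric="euclidean")   # condensed pairwise Euclidean distances
Z = linkage(d, method="average")   # average-linkage agglomeration

# Cut the tree so that at most three groups remain.
groups = fcluster(Z, t=3, criterion="maxclust")
```

The linkage matrix `Z` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree; the merge heights are distances, so a similarity axis would require rescaling them.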
The classification of potato varieties was attempted using partial least squares discriminant analysis (PLS-DA). PLS-DA [44,45] takes its origin from partial least squares regression, which allows one to link a matrix X (i.e., raw experimental data) with a multi-response matrix Y (PLS-2 regression) and overcome limitations related to an ill-conditioned covariance matrix (as in the case of a greater number of X variables than objects). The regression model is built by iteratively extracting latent variables from the X factors and Y responses (also referred to as X-scores and Y-scores, respectively). The extracted X-scores are used to predict the Y-scores, and indirectly, the model responses. In classification problems, the model response is categorized via the generation of a dummy binary Y matrix in which 1 and 0 indicate the "in-group" and "out-group" samples, respectively. After the regression model has been built using a calibration data set, the calibration or even external samples are classified according to the computed or predicted outputs. However, the PLS-DA responses are continuous and not binary, and therefore a threshold must be defined to assign the objects; the value 0.5 was used in this work. The optimal number of latent variables in the PLS-DA model was determined using leave-one-out cross-validation. Multivariate statistical analyses were run in Matlab (version 2015b, The Mathworks, Natick, MA, USA) using in-house routines.

Morpho-Agronomic Characterization of Potato Accessions
The results of the morpho-agronomic characterization of potato accessions are graphically shown in Figure 2. The average of four replicates for each accession, collected in Table A1 (Appendix A), and the descriptors were simultaneously projected onto the space of the first three principal components (PCs), accounting for 58.8% of the variance. Plots (a) and (b) of Figure 2 display the biplots in the PC1-PC2 and PC1-PC3 planes, respectively.
Figure 2a reveals a neat separation of the TU potatoes and a clustering of the remaining samples into two distinct groups separated along PC1: the first group collected the samples of three commercial varieties (AG, KE, and SP), while the second was formed by the samples of the commercial potato DE together with those belonging to the three accessions grown in Majella National Park (MO, GA, and PI). Within each of these two groups, most of the samples belonging to different accessions overlapped, especially those of the first group. The descriptors with higher loadings on PC1, namely W, 23, N4, 30, 34, and 28 (defined in Table 1), were the morpho-agronomic traits that were more influential in the differentiation of the two clusters and the isolation of the TU accession. On the other hand, PC2 seemed to essentially describe the variability internal to the replicates of each potato accession, which was mainly associated with the N1, N2, and N3 descriptors. The samples of the three commercial potatoes (AG, KE, and SP) were still grouped together along PC3 (Figure 2b), while those belonging to the DE, MO, GA, PI, and TU accessions were instead well separated along this component. The morpho-agronomic traits 26, 14, 19, and 35 were mainly responsible for this differentiation. In summary, the samples of five potato varieties (MO, PI, GA, DE, and TU) clustered in distinct groups. Concerning the three commercial cultivars AG, KE, and SP, variability within the replicates seemed, by contrast, to be greater than the differences among the varieties.

Nevertheless, treatment of the morpho-agronomic data matrix by means of HCA revealed a clustering of potato samples, including AG, KE, and SP, into eight distinct groups (Figure 3), each corresponding to a given variety. Similarities among the various clusters roughly reflected the reciprocal position of the potato classes within the explored PC subspace.
The better separation of AG, KE, and SP varieties in the HCA analysis compared to PCA was not surprising considering that some information on the morpho-agronomic characters may not be retained by the first three components selected in PCA, which explained less than 60% of variance.

DNA Fingerprinting
Five SSR markers were used to evaluate the diversity of the samples at the genetic level. Due to their high mutation rate and extensive genome coverage, these markers have been successfully adopted in various applications, including plant DNA fingerprinting [46]. In addition, the SSR markers assayed in this study belong to the robust and highly informative microsatellite-based genetic identity kit set up by Ghislain et al. [47], and have been proposed as a reference for standardizing potato germplasm analyses across laboratories. In total, 21 alleles were identified, with an average of 4.2 alleles per locus. The number of alleles per marker varied from 2 (locus STP0AC58) to 7 (locus STI 001) (Table A2, Appendix A). To evaluate the strength of the relationship among the analyzed samples, a UPGMA dendrogram was built (Figure 4). The genetic distances between potatoes studied here varied from 0.52 (between AG and TU) to 1.00 (GA and DE, PI and TU), with an average value of 0.74 (Table A3, Appendix A). Overall, the dendrogram allowed for distinguishing several clusters.
In particular, as DNA markers (such as SSR) are not environmentally influenced [48], genotypes were clustered according to their genetic makeup, regardless of the sampling area. The GA local accession was grouped together with the control DE. Similarly, PI and TU fell into the same group due to their high genetic similarity. By contrast, MO and AG were sorted into different clusters, displaying an independent genetic status compared to the other genotypes. Our findings demonstrate that SSR analysis was useful for providing a reliable discrimination of potato accessions collected in Majella National Park and a clear picture of their genetic relationships with the other varieties included in the study.

Characterization of Potatoes Using ATR-FTIR Spectroscopy
Figure 5a displays the ATR-FTIR spectra in the range 4000-940 cm⁻¹ acquired from representative samples of the eight potato accessions, which reflected the typical tuber flesh composition [32,49], mainly consisting of water (77%-80%) and carbohydrates (9%-19%, predominantly starch), followed by minor components, such as proteins (≈2%), fibers (0.4%-0.8%), lipids (0.1%), and organic acids (0.4%-1%). Absorption bands in the 1800-940 cm⁻¹ range are shown in Figure 5c. The spectral region below 940 cm⁻¹, showing a continuous absorption band with no fine structure, was not used in the classification analysis. The broad absorption band at 3750-2800 cm⁻¹ was associated with the O-H stretching vibrations of carbohydrates and water [50]. This band strongly overlapped with the signals ascribed to the symmetric and asymmetric stretching modes of the C-H bond that appeared as a single shoulder in the region near 2930 cm⁻¹. The broad band centered at about 2100 cm⁻¹ was ascribed to the rocking and scissoring vibrations of water molecules not directly bound to starch. The relatively sharp signal at 1800-1500 cm⁻¹ was associated with the vibrations of water molecules adsorbed in the amorphous regions of starch [31]; however, the amide I and amide II peaks of proteins also fall in this spectral range [26]. The shoulder at about 1742 cm⁻¹ was assigned to the C=O stretching of lipids and organic acids [51]. Previous studies revealed that the intensity of the absorption bands at 3750-2800, 2100, and 1800-1500 cm⁻¹ is directly related to the hydration degree of the potato starch [31]. The weak and partially superimposed bands in the spectral range between 1500 and 1200 cm⁻¹ were predominantly due to the deformational modes of the CH/CH₂ groups [50]. The absorption bands between 1150 and 940 cm⁻¹ arose from the coupling of C-O, C-C and C-O-H stretching, and the C-O-H bending of starch. Despite the poor resolution and overlapping of the related signals, which did not allow for an unequivocal attribution, changes in this region were ascribed to the differences in the relative amounts of amorphous and crystalline starch and hydration of the crystalline form [26].

Discrimination of Potato Varieties Using the PLS-DA of ATR-FTIR Spectra
A total of 279 ATR-FTIR spectra were collected by analyzing different slices extracted from 7-9 tubers of each accession (Figure 5a). The ATR technique distorts the relative intensities of the bands and shifts them at lower frequencies, which can crucially affect quantitative analyses or accurate band assignments; however, this was expected to have a negligible impact on the fingerprinting ability of the infrared spectra in the classification of the potato accessions. Therefore, ATR correction on the spectra was not performed. The data matrix was partitioned into calibration and prediction data sets consisting of 194 and 85 samples, respectively, via application of the duplex Kennard-Stone algorithm [52] to ensure a good representativeness of both groups. As a result, each potato category was represented by a variable number of calibration samples ranging from 19 (SP) to 29 (MO), whereas the external samples belonging to a given potato accession ranged from 8 (SP) to 13 (MO). The raw ATR-FTIR spectra were subjected to various pre-processing methods [53], namely standard normal variate (SNV), first- and second-derivative transformation, and their combinations, with the aim of removing spurious variability and/or enhancing the systematic differences within the spectra profiles. In particular, SNV consists of autoscaling on the rows such that every spectrum will have a mean of 0 and a standard deviation of 1 after scaling. The Savitzky-Golay approach with a 15-point window was applied in the first- and second-derivative transformation using second- and third-order polynomial fittings, respectively. Regardless of the pre-treatment mode applied to the ATR-FTIR spectra, PLS-DA was conducted on the autoscaled variables (autoscaling on columns). The influence of the spectra pre-treatment on the PLS-DA predictive performance was evaluated using leave-one-out cross-validation.
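The SNV and Savitzky-Golay pre-treatments can be sketched in Python as below. The spectra are synthetic (one Gaussian band with different gains and offsets, hypothetical values), SciPy's `savgol_filter` is assumed as the derivative routine in place of the in-house Matlab code, and the use of the sample standard deviation (`ddof=1`) in SNV is likewise an assumption.

```python
import numpy as np
from scipy.signal import savgol_filter


def snv(spectra):
    """Standard normal variate: autoscale each row (spectrum) to mean 0, std 1."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


# Synthetic spectra: the same band recorded with different gains and offsets,
# mimicking variable contact between the sample and the ATR crystal.
wavenumbers = np.linspace(4000.0, 940.0, 400)
band = np.exp(-((wavenumbers - 1020.0) / 60.0) ** 2)
spectra = np.vstack([0.5 * band + 0.02, 1.0 * band + 0.05, 1.5 * band + 0.10])

scaled = snv(spectra)

# 15-point Savitzky-Golay derivatives along the wavenumber axis:
# second-order polynomial for the first derivative, third-order for the second.
d1 = savgol_filter(spectra, window_length=15, polyorder=2, deriv=1, axis=1)
d2 = savgol_filter(spectra, window_length=15, polyorder=3, deriv=2, axis=1)
```

Because the three synthetic spectra differ only by a multiplicative gain and an additive offset, SNV maps them onto the same profile, which is the kind of intensity variability the scaling is meant to remove.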
The comparison of the proportion (%) of correctly classified samples for various pre-processing methods, reported in Table 2, revealed that SNV scaling (Figure 5b) provided the best results, with over 97% of classifications being correct. It is worth noting that discrimination based on the raw spectra was noticeably worse (87.1% of correct classifications in cross-validation), despite a relatively wide variability in their intensities. Such differences are probably related to variations in the extent of the contact of the potato flesh with the ATR crystal, which can only be partially controlled through the pressure monitoring system integrated with the instrument. SNV scaling seemed to remove this kind of random variability and to enhance the spectral differences due to the potato accession, especially at lower wavenumbers (Figure 5c).
Table 2. Proportion of correctly classified potato samples (non-error rate, NER%) in leave-one-out cross-validation for different pre-processing methods of the ATR-FTIR spectra.
The proportion of correctly assigned potato samples in the calibration and external prediction is reported in Table 3, whereas Figure 6 graphically displays the calculated and predicted PLS-DA responses for each class. In each insert of Figure 6, the data above or below the line represent the samples accepted or rejected, respectively, by a given class.
Table 3. Proportion (%) of correctly classified potato samples using PLS-DA in the calibration (computed classes) and external prediction (predicted classes).
Inspection of Figure 6 reveals that all the calibration samples belonging to the accessions GA, MO, AG, and KE were correctly classified, while one or two classification errors can be observed for the other potato samples, with the associated proportions of correct classifications ranging between 90.9% and 96.0% (Table 3). Concerning the external potato samples, the number of classification errors ranged between one (AG) and three (GA and TU).
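As a small worked example of how the non-error rate relates to the number of misclassified samples, the following Python sketch computes NER% from counts. The class sizes used here (22 and 25) are hypothetical values chosen only because they fall within the stated calibration range of 19 to 29 samples and reproduce the reported extremes; the actual per-class counts are those underlying Table 3.

```python
def ner_percent(n_correct, n_total):
    """Non-error rate: percentage of samples assigned to their true class."""
    return 100.0 * n_correct / n_total


# Hypothetical class sizes, consistent with the reported range of NER values.
ner_two_errors = round(ner_percent(20, 22), 1)  # 2 errors in 22 samples
ner_one_error = round(ner_percent(24, 25), 1)   # 1 error in 25 samples

print(ner_two_errors)  # 90.9
print(ner_one_error)   # 96.0
```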
Because of the lower number of prediction samples compared to the calibration data, the percentage of correctly predicted classes was slightly worse than that in calibration. The observed values, ranging between 72.7% (TU) and 90.9% (AG), do however indicate a good predictive performance of the PLS-DA model. To further confirm the reliability of the classification model, PLS-DA was used to discriminate between the potato samples after shuffling the classes.
To this end, 30 different random assignments of the 194 calibration samples into eight categories were generated and PLS-DA classification was applied for every repetition. The model predictive performance was evaluated using cross-validation with five cancellation groups. The trend of prediction errors over the 30 repetitions is displayed in Figure A2 (Appendix A). The proportion of samples correctly assigned to individual groups only rarely surpassed 40%, and the overall proportion of correct classifications was less than 15%, much lower than that observed when PLS-DA was applied to the true potato classes. It follows that the ATR-FTIR spectra of the tubers did contain information on the potato accession, and the good prediction results provided by PLS-DA applied to the true classes are unlikely to have occurred by chance.
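The logic of the class-shuffling check can be sketched in Python as follows. This is only an illustration under stated assumptions: a simple nearest-centroid classifier is used as a stand-in for PLS-DA to keep the example short, the data are synthetic, and all function names are hypothetical. With eight classes, a classifier trained on shuffled labels is expected to score close to the chance level of 1/8 = 12.5%.

```python
import numpy as np


def nearest_centroid_cv_accuracy(X, y, n_folds=5, seed=0):
    """Cross-validated accuracy of a nearest-centroid classifier
    (a stand-in for PLS-DA) using n_folds cancellation groups."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    n_correct = 0
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        classes = np.unique(y[train])
        centroids = np.array([X[train & (y == c)].mean(axis=0) for c in classes])
        # squared Euclidean distance of each left-out sample to each centroid
        dist = ((X[test_idx, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        n_correct += np.sum(classes[dist.argmin(axis=1)] == y[test_idx])
    return n_correct / len(y)


def shuffled_label_accuracies(X, y, n_repeats=30, seed=0):
    """Accuracy obtained after randomly reassigning the class labels."""
    rng = np.random.default_rng(seed)
    return np.array([
        nearest_centroid_cv_accuracy(X, rng.permutation(y), seed=i)
        for i in range(n_repeats)
    ])


# Synthetic data (NOT the paper's spectra): 8 classes, 24 samples each, 50 variables.
rng = np.random.default_rng(1)
y = np.repeat(np.arange(8), 24)
class_means = rng.normal(0, 1, (8, 50))
X = class_means[y] + rng.normal(0, 1, (y.size, 50))

true_acc = nearest_centroid_cv_accuracy(X, y)
null_acc = shuffled_label_accuracies(X, y)
print(f"true labels: {true_acc:.2f}, shuffled labels (mean): {null_acc.mean():.2f}")
```

A large gap between the accuracy with true labels and the distribution obtained with shuffled labels indicates that the classification is driven by real class structure rather than by chance.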

The influence of the various regions of the ATR-FTIR spectrum in the discrimination of the potato samples using PLS-DA was quantified using VIP (variable importance in the projection) scores [54]. The variables with VIP indices greater than one are usually assumed to be significant. The results of the VIP analysis are displayed in Figure 7.
The most influential regions of the ATR-FTIR spectrum (VIP > 1) in the potato discrimination that could be unequivocally assigned included the spectral regions around 3370 cm⁻¹ and 2900 cm⁻¹ associated with the O-H and C-H stretching signals, respectively, and the bands centered at 2100 and 1640 cm⁻¹, which were attributed to the vibrational modes of free water molecules and those bound to starch, respectively. The intensity of the infrared spectrum in these regions has been related to the level of hydration of starch [31], which can be considered a character of potato flesh that is mainly influenced by the kind of accession. It is worth noting that, as described in Section 2.2, the eight potato accessions investigated in this study were grown in the same experimental field and the plants were not artificially irrigated. Therefore, the effect of possible differences in watering on the potato flesh composition can be neglected.

Conclusions
The ATR-FTIR spectrum of potato flesh, although dominated by the high moisture and starch content of the tubers, which may mask the contribution of minor constituents, provides useful information on the origin of potato accessions. Chemometric treatment of the ATR-FTIR spectra allowed the potato cultivars to be discriminated with good accuracy. These results confirm the great potential of mid-infrared spectroscopy for tracing foodstuffs. Because of its low cost, ease of use, and minimal sample manipulation, ATR-FTIR can be preferred to more sophisticated instrumental techniques used for the varietal/geographical discrimination of cultivars.
The results obtained in this study are useful for the characterization and valorization of local germplasm. In particular, the molecular markers suggest that the potato accession named Montenerodomo, cultivated in Majella National Park, can be considered a local variety and can be registered into the Regional Voluntary GR Register and entered into the foreseen protection scheme, as reported by the Italian regional laws [48].

Funding: This research was funded by Majella National Park, within the Project "Tipizzazione di specie vegetali endemiche, crop wild relatives e varietà agricole autoctone del Parco della Majella mediante metodi analitici ed approcci statistici multivariati" with the Dipartimento di Scienze Fisiche e Chimiche, Università degli Studi dell'Aquila, and within the Project "Coltiviamo la diversità-Caratterizzazione e conservazione del germoplasma agricolo autoctono del Parco Nazionale della Majella" with the Dipartimento di Scienze Agrarie, Alimentari ed Ambientali, Università degli Studi di Perugia.

Conflicts of Interest: The authors declare no conflict of interest.
Appendix A
Figure A1. Tubers of some of the investigated potato accessions.
Table A3. Dice similarity matrix of the eight potato genotypes based on SSR markers. Two control varieties (AG and DE) are also included.