Combined Untargeted and Targeted Fingerprinting by Comprehensive Two-Dimensional Gas Chromatography to Track Compositional Changes on Hazelnut Primary Metabolome during Roasting

Abstract: This study focuses on the detectable metabolome of high-quality raw hazelnuts (Corylus avellana L.) and on its changes after dry-roasting. Informative fingerprinting was obtained by comprehensive two-dimensional gas chromatography with fast-scanning quadrupole mass spectrometry (GC×GC-qMS) combined with dedicated data processing. In particular, combined untargeted and targeted (UT) fingerprinting, based on pattern recognition by template matching, was applied to chromatograms from raw and roasted samples of Tonda Gentile Trilobata and Anakliuri hazelnuts harvested in Italy and Georgia. Lab-scale roasting was designed to develop a desirable organoleptic profile matching industrial standards. Results, based on 430 peak features, reveal that phenotype expression is markedly correlated to cultivar and pedoclimatic conditions. Discriminant components between cultivars are amino acids (valine, alanine, glycine, and proline); organic acids (citric, aspartic, malic, gluconic, threonic, and 4-aminobutanoic acids); and sugars and polyols (maltose, xylulose, xylitol, turanose, mannitol, scyllo-inositol, and pinitol). Of these, alanine, glycine, and proline have a high informational role as precursors of 2-acetyl- and 2-propionylpyrroline, two key-aroma compounds of roasted hazelnuts. Roasting has a decisive impact on metabolite patterns: it caused a marked decrease (−90%) of alanine, proline, leucine, valine, and aspartic and pyroglutamic acids, and a −50% reduction of saccharose and galactose.


Introduction
Roasting is a key technological step for hazelnut industrial transformation that yields distinctive flavors, color, and crunchy texture. It triggers a complex array of chemical reactions, mainly involving the major constituents: carbohydrates, proteins, free amino acids, and fats [1][2][3][4][5][6]. Carbohydrates are subjected to dehydration, caramelization, and hydrolysis; amino acids and thermo-labile vitamins are degraded; and proteins can decompose or polymerize by cross-linkage between reactive functionalities. Moreover, the high temperatures adopted during roasting impact the integrity of oleosomes (reserve bodies of lipids in raw hazelnuts) and reduce oxidative stability [7]. Lipid oxidation contributes to the formation of reactive carbonyl compounds (in particular, glyoxal and methylglyoxal).
Furthermore, reducing sugars and polyols can react with amino acids and free amino groups forming Amadori and/or Heyns products within the first stages of the Maillard reaction [8]. The evolution of Maillard reaction forms more stable, moderately to highly polar, and volatile compounds, such as carbonyl derivatives (ketones and aldehydes), alcohols, acids, esters, lactones, and sulfur derivatives, together with many heterocycles (furans, pyrazines, pyrroles, thiophenes, pyridines, thiazoles, and oxazoles), aromatic compounds and phenols responsible for major aroma notes of roasted hazelnuts. At the same time, high-molecular-weight products characterized by a brown color (melanoidins) are formed, contributing to the final color of the product.
Texture changes during roasting are due to microstructure modification. Indeed, the roasting leads to thermal degradation of the middle lamella, one element of the cell wall composed of pectic compounds responsible for connecting cells. Consequently, cellular and intercellular spaces become larger, causing an increase in crispness and crunchiness.
Few studies have been devoted to a comprehensive understanding of the impact of dry-roasting on the detectable metabolome of raw hazelnuts. The existing literature covers specific fractions of interest while applying dedicated/targeted methodologies to profile primary metabolites (amino acids, sugars, organic acids, fatty acids, amines, etc.) or specialized metabolites (formerly defined as "plant secondary metabolites") with known bio-activity (e.g., antioxidants), nutritional role, or organoleptic profile (e.g., tastants and astringent compounds). Of these, Ozdemir et al. [9] evaluated changes in total amino acids, thiamine, and riboflavin contents; peroxide value; and free fatty acids profiles in roasted Giresun and Akçakoca hazelnuts. The authors demonstrated a meaningful decrease in all tested parameters, with a marked reduction of riboflavin level (−30%) in hazelnuts from the Akçakoca region of Turkey and of thiamine level (−50%). Within the group of nutritionally relevant amino acids, lysine (Lys) decreased by less than 6% in roasted Giresun hazelnuts but by 31% in Akçakoca hazelnuts. Alasalvar et al. [10] screened eighteen native hazelnut varieties from the Giresun province of Turkey for their sugars, organic acids, condensed tannins, and free phenolic acids profile. Fructose, glucose, sucrose, myo-inositol, raffinose, and stachyose ranged from 1.99 g/100 g to 4.94 g/100 g among the considered samples. Organic acids (oxalic, maleic, citric, malic, lactic, succinic, and acetic) ranged from 0.96 g/100 g to 2.72 g/100 g, while gallic acid was between 0.159 mg/100 g and 0.871 mg/100 g. The authors observed that roasting brought significant losses (p < 0.05) of both condensed tannins (−97%) and gallic acid (−67%), whereas its effect on sugars and organic acids was not noteworthy.
In a study aimed at evaluating the impact of different roasting conditions on phenolic compounds, Schmitzer et al. [11] revealed a marked decrease of all individual phenolics, with the exception of gallic acid while confirming the presence of flavan-3-ols (catechin, epicatechin, 2 procyanidin dimers, and 3 procyanidin trimers), flavonols (quercetin pentoside, quercetin-3-O-rhamnoside, and myricetin-3-O-rhamnoside), hydrobenzoic acids (gallic acid, protocatechuic acid), and phloretin-2′-O-glucoside (dihydrochalcone class). Nevertheless, the authors evidenced that roasting did not impact the total phenolic content (TPC) and antioxidative potential of kernels. These results are in accordance with those obtained by Belviso et al. [12], who monitored the TPC and antioxidant activity along shelf-life. The effects of roasting on proanthocyanidins were studied by Lainas et al. [13], who compared the content of extractable and bound proanthocyanidins of raw and roasted Turkish Tombul hazelnuts. In raw hazelnuts, extractable proanthocyanidins fraction was 81% of the total phenolic fraction with the presence of oligomers (4-9 mers) and polymers (≥10 mers), whereas in roasted hazelnuts, extractable moieties were only monomers to trimers. This variation, likely related to skin loss during roasting, was accompanied by a higher recovery of dimers, trimers, and tetramers after alkaline hydrolysis of roasted hazelnut skins.
Amaral et al. [14] focused on the lipid fraction and revealed just minor changes in fatty acid and triacylglycerol compositions. After intense roasting conditions (165-200 °C for 15 min on average), the authors observed a slight increase of oleic acid, saturated fatty acids, and triacylglycerols containing oleic acid moieties; and a decrease of linoleic acid, phytosterols (maximum 14.4%), and vitamin E homologues (maximum 10.0%). Negligible amounts of trans fatty acids also were detected.
Of the studies focused on the effect of roasting on the development of the characteristic brown color, Fallico et al. [15] evaluated the role of fat fraction, hexanal, and saccharose in color development and hydroxymethylfurfural (HMF) formation. HMF levels were highest in hazelnuts with the deepest browning; the lowest HMF and brown color were correlated to lower fat content (or defatted) samples. Arlorio et al. correlated color and D-amino acids distribution, proposing a complex mathematical model (Back Propagation Neural Network on the Discrete Fourier Transform output of the whole surface color) to predict the occurrence of toxic compounds.
The present study was designed to cover gaps for a more comprehensive understanding of the impact of roasting on the detectable metabolome of hazelnuts and on different hazelnut cultivars and harvest regions. The approach takes inspiration from food metabolomics [16][17][18][19] principles and exploits the information potential of multidimensional analytical (MDA) platforms that combine techniques for physicochemical discrimination/separation (e.g., gas chromatography GC) of analytes with spectroscopic detection (e.g., mass spectrometry MS).
To be truly useful, an "untargeted" approach should detect and monitor as many features as possible by annotation and tracking across multiple samples [35,36]. It contrasts conceptually with purely "targeted" strategies, which define a priori a limited number of known analytes and therefore provide limited information on the total composition of a sample.
Based on preliminary results obtained by exploring the distribution of the detectable metabolome in hazelnuts of different cultivars/origins [37], this study makes a step forward in examining the impact of roasting, as an additional yet fundamental variable/process on the chemical signature of primary metabolites in hazelnut kernels. Quali-quantitative variations on known and unknown features/analytes are tracked using chromatographic fingerprinting based on peak features and template matching algorithms.
In particular, the approach known as combined untargeted and targeted (UT) fingerprinting [23,37,38,39] was applied to raw hazelnuts from the Tonda Gentile Trilobata cultivar (Protected Geographical Indication PGI) harvested in Piedmont (Italy) or Georgia, and from the Anakliuri cultivar, native to Georgia. Raw hazelnuts were submitted to lab-scale dry-roasting in a ventilated oven under time/temperature conditions tuned to develop the desired flavor and crunchy texture. A more intense roasting was also applied to exacerbate thermal stress and capture metabolite variations under even more drastic conditions.

Hazelnut Samples and Roasting Conditions
Commercial grade samples of raw hazelnuts (Corylus avellana L.), with a uniform caliber of 13-14 mm and harvested in 2017, were supplied by Soremartec Italia Srl (Alba, Cuneo, Italy). They were from different geographical areas: (a) cultivar Nocciola Gentile Trilobata (T), harvested in Piedmont (Italy-IT) and Georgia-GE, and (b) cultivar Anakliuri (AN), native to and harvested in Georgia along the Black Sea West coasts.
Two post-harvest drying regimes were used: (a) conventional in-shell drying, after de-husking, for the Anakliuri cultivar, in-field at 35-38 °C (E1) or (b) industrial processing by artificial dryers operating at 18-20 °C (E2). Drying achieved a final kernel humidity of 6%, a condition that keeps the product stable throughout its shelf-life. Before shipping to the laboratory, kernels were stored in (a) controlled conditions keeping the equilibrium relative humidity (ERH) at 65% (Controlled-C) or (b) without any stabilization of relative humidity (Uncontrolled-U). Post-harvest treatments were conducted by the R&D Raw Materials Department of Soremartec Italia Srl (Alba, Cuneo, Italy).
Raw hazelnuts were manually cut in half to check their quality, then put in liquid nitrogen in order to perform ball milling (model MM 400, Retsch, Haan, Germany) at 18 Hz for 10 s. After that, milled hazelnuts were stored in 40 mL glass vials at −18 °C until analysis.
Lab-scale roasting was carried out in a traditional ventilated oven with two different time-temperature protocols:
• 180 °C for 10 min, to obtain optimally roasted hazelnuts in terms of flavor and color (roasted-R). These conditions were optimized to obtain full and balanced development of major key-odorants [6] and roasting markers [2,3].
• 200 °C for 10 min for a higher level of roasting (over-roasted-OR).
After roasting, hazelnuts were left to reach ambient temperature and ball-milled at 14 Hz for 5 s.
Quality control (QC) samples for response normalization were prepared by combining 1.00 g of each sample and carefully mixing to obtain a homogeneous powder. Table 1 summarizes the samples' characteristics and notations used in the text.

Derivatization
Methoximation was performed by adding 20 µL of methoxylamine hydrochloride in pyridine (20 mg/mL) to the dried extracts, followed by incubation for 1 h at 40 °C and 1400 rpm in a shaker. Silylation was performed by adding 70 µL of MSTFA with 1% TMCS, followed by incubation for 1 h at 65 °C in static conditions.
For identity confirmation of primary metabolites, 1.00 mL of primary metabolites standards mixture (listed in Section 2.1) was submitted to the derivatization procedure and analyzed under conditions described in Sections 2.4 and 2.5.

Comprehensive Two-Dimensional Gas Chromatography-Quadrupole Mass Spectrometry Instrument Set-Up and Experimental Conditions
Primary metabolome analyses were carried out on a GC×GC-qMS system, consisting of a GC coupled with a fast-scanning quadrupole Mass Spectrometer (QP2010 Ultra, Shimadzu Corp, Kyoto, Japan) and an AOC-20i/s autosampler (Shimadzu Corp, Kyoto, Japan). Cold split injection (split 1:3) used an OPTIC4 system (GL Sciences, Eindhoven, The Netherlands).
The modulation system was a loop-type cryogenic modulator, ZX2 (Zoex Corporation, Houston, TX, USA), cooled by a closed-cycle refrigerator/heat exchanger.
The MS acquisition (Scan mode, EI 70 eV) parameters were set as follows: mass range 60-550 m/z, data rate 33.3 Hz, and solvent delay 12.1 min. The detector voltage was time-programmed to ensure a sufficient signal intensity for low-abundance metabolites and, at the same time, avoid spectral distortion of major metabolites (for example, C18 fatty acids and sucrose). For this purpose, the following schedule was used: 0 min: 1.
The carrier gas was helium at a constant linear velocity with an initial column head pressure of 214.0 kPa. The oven temperature program was from 75 °C to 160 °C at 2.5 °C/min, then at 3.0 °C/min to 290 °C, and at 10 °C/min to 330 °C (hold 8.67 min). The injection volume was 1 µL. Immediately after injection, the injector temperature was ramped from 90 °C to 280 °C at 60 °C/min.
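As a quick consistency check on the oven program reported above, the total run time can be derived from the ramp rates and the final hold. The following is a minimal sketch; the function and variable names are illustrative assumptions and are not part of the instrument method.

```python
# Oven program from the text: (start °C, end °C, rate °C/min).
RAMPS = [
    (75.0, 160.0, 2.5),
    (160.0, 290.0, 3.0),
    (290.0, 330.0, 10.0),
]
FINAL_HOLD_MIN = 8.67  # final hold at 330 °C, in minutes


def total_run_time(ramps, final_hold_min):
    """Return the total oven program duration in minutes."""
    ramp_time = sum((end - start) / rate for start, end, rate in ramps)
    return ramp_time + final_hold_min


run_time = total_run_time(RAMPS, FINAL_HOLD_MIN)
print(f"Total run time: {run_time:.1f} min")
```

With the values given in the text, the three ramps take 34, about 43.3, and 4 min, so the program lasts about 90 min per injection.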
Each sample was injected once, without replicates. Extraction and derivatization efficiency were verified by monitoring the internal standards' responses (Section 2.3.1). The measurements were organized in four day-wise batches. Injection repeatability and instrumental response stability were checked by QC analyses [18]. Six QC samples were freshly prepared/derivatized and injected at the beginning and the end of the daily batch (two consecutive QCs) as well as after each block of three study samples (single QCs). Within these blocks, study samples were measured in randomized order. At the beginning of each day, after the first QC twin injection, a reagent blank spiked with retention index markers (C7-C30 saturated fatty acid methyl esters, FAMEs) was injected to determine background levels of known reagent artifacts and contaminants and to define retention indices [18].
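The daily injection scheme described above can be sketched as a small sequence builder. This is one possible reading of the text, not the authors' actual sequence table: it assumes nine study samples per day (36 samples over four batches), that the closing twin QC replaces the single QC after the last block, and that randomization is applied to the whole daily sample list. The function name and labels are hypothetical.

```python
import random


def daily_sequence(samples, block_size=3, seed=0):
    """Build one day's injection list (illustrative reading of the text)."""
    rng = random.Random(seed)
    order = list(samples)
    rng.shuffle(order)  # study samples are run in randomized order

    # Opening twin QC, then the reagent blank spiked with FAMEs.
    seq = ["QC", "QC", "blank+FAMEs"]

    blocks = [order[i:i + block_size] for i in range(0, len(order), block_size)]
    for idx, block in enumerate(blocks):
        seq.extend(block)
        is_last = idx == len(blocks) - 1
        # Single QC after each block; a twin QC closes the batch.
        seq.extend(["QC", "QC"] if is_last else ["QC"])
    return seq


day = daily_sequence([f"S{i}" for i in range(1, 10)])
print(day.count("QC"), "QC injections out of", len(day))
```

Under these assumptions each day contains six QC injections, which agrees with the 24 QC runs over four days reported in the data processing section.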

Chromatographic Fingerprinting by Peak Features Alignment across Chromatograms
To comprehensively map the detectable metabolite signatures of hazelnut, data processing was based on a workflow developed to track untargeted and targeted components across multiple chromatograms. The process is termed UT fingerprinting [38] and was validated for complex fractions of volatiles [29,39,40,41], in biological fluids metabolomics [42][43][44], and in food metabolomics [30,37]. It performs untargeted and targeted pattern recognition of individual 2D peaks by template matching algorithms [45].
The UT fingerprinting workflow included four major steps:
• Step 1: Individual chromatograms were imported by the data processing software (GC Image, GC Image LLC, Lincoln, NE, USA), rasterized according to the modulation period (PM), and pre-processed for baseline subtraction and peak detection. The detection threshold was set at 150 S/N, as previously validated [23].
• Step 2: The untargeted feature template was created by a dedicated program of the GC Image suite (i.e., Image Investigator™) by cross-matching peak templates from all analyzed chromatograms (24 QCs + 36 samples). After re-alignment of 2D-peak patterns, peaks that consistently matched across all-but-one chromatograms were annotated as reliable peaks and included in the feature template. Peak matching was constrained by spectral similarity: direct and reverse match factors (DMF and RMF), computed with the NIST (National Institute of Standards and Technology) similarity algorithm [46], had to be ≥ 750 [23].
• Step 3: After feature template generation, the template was pruned by removing solvent peaks, column bleed, and interferents before proceeding with peak targeting. Compound targeting was a supervised process that made putative identifications from an MS library based on spectral similarity [46], using threshold values of DMF ≥ 900 and RMF ≥ 930, and on ¹D retention-index (Iᵀ) coherence (Iᵀ ± 15 units).
At this point, the feature template combined untargeted and targeted features, i.e., a UT template. Untargeted peaks are the reliable peaks defined at Step 2 that remained unidentified under the criteria of Step 3. Targeted peaks are the reliable peaks putatively identified at Step 3.
• Step 4: The UT feature template was then matched to each sample chromatogram, thereby recognizing re-aligned peak features, which were exported for further data elaboration. The output was a data matrix of UT peaks together with ¹D and ²D retention times (¹tR, ²tR), compound names for target analytes, fragmentation spectra, selected ion responses, total ion response, etc.
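The decision rules in Steps 2 and 3 can be summarized in a few lines of code. The sketch below is only an illustration of the stated thresholds and does not reproduce the GC Image implementation; the class, the function names, and the example values are hypothetical.

```python
from dataclasses import dataclass


def is_reliable(n_matched: int, n_chromatograms: int) -> bool:
    """Step 2: a peak is reliable if matched in at least all-but-one runs."""
    return n_matched >= n_chromatograms - 1


@dataclass
class LibraryHit:
    name: str
    dmf: float         # direct match factor
    rmf: float         # reverse match factor
    ri_library: float  # tabulated retention index


def is_targeted(hit: LibraryHit, ri_experimental: float,
                dmf_min: float = 900, rmf_min: float = 930,
                ri_tol: float = 15) -> bool:
    """Step 3: spectral similarity plus retention-index coherence."""
    spectral_ok = hit.dmf >= dmf_min and hit.rmf >= rmf_min
    ri_ok = abs(ri_experimental - hit.ri_library) <= ri_tol
    return spectral_ok and ri_ok


# Hypothetical example: 60 chromatograms (24 QCs + 36 samples).
hit = LibraryHit(name="compound X", dmf=925, rmf=940, ri_library=1100.0)
print(is_reliable(59, 60), is_targeted(hit, ri_experimental=1110.0))
```

A reliable peak that fails `is_targeted` for every library candidate stays in the template as an untargeted feature, which is how the UT template keeps both kinds of peaks.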

Method Performance Parameters
Method performance parameters were evaluated on replicated injections of QC samples. Intermediate precision on retention times and targeted peaks' % response was estimated by calculating the % relative standard deviation (% RSD). Results are reported in Supplementary Table S1. Analytes' % response was calculated on the total ion current (TIC) signal by the processing software on the basis of peak volumes normalized to the total response from all UT peaks, excluding interfering compounds and column bleed [47].
Retention times in both chromatographic dimensions (¹tR and ²tR) were collected from targeted peaks on 24 QC analytical runs across all working days. Results are listed in Supplementary Table S1. Retention-time stability was good, with an average % RSD of 0.05 for ¹tR and 1.56 for ²tR. Response repeatability was 20.6% RSD on average, with a median of 18.4%.
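The two quantities used in this section, % response and % RSD, can be written out explicitly. The sketch below assumes that % response is the peak volume divided by the summed volume of all retained UT peaks, and that % RSD uses the sample standard deviation; the function names and numbers are hypothetical.

```python
from statistics import mean, stdev


def percent_response(volumes):
    """Normalize peak volumes to the total response of all UT peaks."""
    total = sum(volumes.values())
    return {peak: 100.0 * v / total for peak, v in volumes.items()}


def percent_rsd(values):
    """Relative standard deviation in percent (sample SD over mean)."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical peak volumes for one chromatogram.
volumes = {"peak_a": 50.0, "peak_b": 30.0, "peak_c": 20.0}
responses = percent_response(volumes)

# Hypothetical % responses of one target across three QC runs.
qc_rsd = percent_rsd([9.0, 10.0, 11.0])
print(responses, round(qc_rsd, 1))
```

Interfering compounds and column bleed would have to be removed from `volumes` before normalization, as stated in the text.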

Data Acquisition, 2D Data Processing, and Statistical Analysis
GC×GC data were acquired by GCMS Solution version 4.11 (Shimadzu Deutschland GmbH, Duisburg, Germany) and processed by GC Image GC×GC Edition, ver 2.9 (GC Image, LLC, Lincoln, NE, USA). Data elaboration and results visualization were conducted by using XL-Stat (Addinsoft Inc., New York, NY, USA) and Gene-E (Broadinstitute.org).

Mapping Hazelnut Metabolome by Chromatographic Fingerprinting
The term "fingerprinting" has been described for metabolomics [48,49] and refers to analytical processes capable of unraveling compositional differences between samples. Although spectroscopic techniques, e.g., mass spectrometry (MS), nuclear magnetic resonance (NMR), and Fourier transform infrared spectroscopy (FTIR), have been used for years in metabolomic fingerprinting, modern MDA platforms offer further possibilities to exploit the concept of fingerprinting [20].
In fact, 2D peak patterns generated by comprehensive two-dimensional gas chromatography (GC×GC) can be considered as a sample's distinctive fingerprint, and the detected compounds as minutiae features to be annotated and tracked across multiple samples. The term "minutiae" derives from biometric fingerprinting used in forensic science [50], therein indicating ridge endings and bifurcations on fingertips whose relative position is unique in each individual. Just as for automatic biometric fingerprinting, the localization and extraction of analytical (meta)data from 2D peak pattern features of single chromatograms enable effective cross-comparisons with an intrinsic potential of deepening the knowledge on chemical composition and components distribution. The adoption of MS detection adds orthogonal information about analytes' identities, through distinctive spectral signatures, and amounts (relative or absolute).
In this study, chromatographic fingerprinting was applied by comprehensively extracting peak-feature information (i.e., summed data for each component peak with associated metadata) for both untargeted and targeted components. To enable effective cross-comparative analysis, peak-feature pattern matching across multiple chromatograms was guided by mass spectral similarity with DMF ≥ 800 [23,51] and a second-order polynomial retention-time transformation [52,53]. For targeted components, the approach explicitly matches corresponding peak features across chromatograms by the target name. For untargeted components, the software matches unidentified peaks across chromatograms by peak tracking and assigns unique identifiers (#n) based on reliable realignment of the data. Figure 1A shows the pseudocolor image corresponding to a QC sample with targeted peak features highlighted by green circles. Of the approximately 1000 2D peaks detectable, on average, above a response threshold of 50 signal-to-noise ratio (SNR), 80 components were identified by an electron ionization (EI) fragmentation pattern with DMF above 900 and Iᵀ within a tolerance of ±15 units. Where possible, putative identifications were confirmed by authentic standards available in the authors' laboratories. Figure 1B highlights both the untargeted (red circles) and targeted (green circles) peak features. Untargeted peak features were consistently extracted after template-based alignment and cross-matching. The process is detailed in the Materials and Methods (Section 2.6) and supported by pertinent literature for those interested in its application. Untargeted components (red circles) accounted for 334 2D peaks. Table 2 lists targeted (T) components together with their average retention times (¹tR in min, ²tR in s), experimentally determined ¹D Iᵀ values, and tabulated retention indices (NIST database [54]).
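The second-order polynomial retention-time transformation mentioned above can be illustrated with a small least-squares fit. This is a simplified, one-dimensional sketch applied to a single retention dimension; the transformation actually used by the software is not specified here, and the anchor values below are hypothetical.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    s = [sum(xi ** k for xi in x) for k in range(5)]               # sums of x^k
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    # Augmented 3x3 system for the coefficients (a, b, c).
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def apply_quadratic(coeffs, x):
    """Map a template retention time onto the sample chromatogram."""
    a, b, c = coeffs
    return a + b * x + c * x * x


# Hypothetical anchor peaks: template vs. observed retention times (min).
template_rt = [10.0, 20.0, 30.0, 40.0, 50.0]
observed_rt = [10.2, 20.5, 30.9, 41.4, 52.0]
coeffs = fit_quadratic(template_rt, observed_rt)
mapped = [apply_quadratic(coeffs, rt) for rt in template_rt]
print([round(v, 2) for v in mapped])
```

In practice the fit would be driven by peaks already matched with high spectral similarity, and the fitted mapping would then be used to predict where the remaining template peaks should fall in each chromatogram.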
Supplementary Figure S1 illustrates the UT peaks % response distribution across all samples by heat-map and hierarchical clustering based on Pearson correlation. Data were log-normalized before computation. Superscript letters following chemical names indicate previous studies where these compounds were described in relation to the roasting process: a indicates Cialiè Rosso et al. [37], b Ozdemir et al. [9], and c Alasalvar et al. [10].
The primary metabolome coverage was extensive and very informative: it was possible to target 80 compounds, including 15 amino acids/derivatives, 16 sugars (mono- and disaccharides), four sugar acids, seven polyols, and 24 organic acids involved in cell metabolism.
The high level of information encoded in these hazelnut data, explored through a process of combined UT fingerprinting and profiling, allows a deeper understanding of the influence of both phenotype and degree of roasting on primary metabolite patterns, as described next.

Chemical Patterns Characterizing Hazelnut Phenotype
The goal of the first pattern recognition was to evaluate the role of cultivar and pedoclimatic conditions on the metabolome expression (phenotype). In a previous study on primary metabolites signatures of high-quality hazelnuts [37], it was found that, based on this fraction, analyzed samples were independently clustered according to cultivar/origin. In particular, Piedmont (Tonda Gentile Trilobata) and Roman (Tonda Romana) hazelnuts were connoted by the greater relative abundance of several amino acids and sugars, delineating distinctive yet discriminant signatures. On the other hand, Turkish hazelnuts, from the Ordu region, were connoted by a generalized lesser amount of primary metabolites with the exception of Gly and tartaric acid. Moreover, the discrimination between Piedmont and Roman was mainly driven by glucose, galactose, maltose, and fructose, all present in higher amount in Tonda Gentile Trilobata as well as by Trp, Orn, and Tyr. In contrast, Tonda Romana hazelnuts were richer in Leu, Ile, Met, Val, Phe, Pro, and pyroglutamic acid and organic acids (lactic acid, glutaric acid, galacturonic acid, fumaric acid, tartaric acid, and oxalic acid).
In the current study, the metabolites coverage was similar; therefore, the UT data matrix, with 2D peaks % response, was submitted to hierarchical clustering (HC) and supervised exploration by partial least squares discriminant analysis (PLS-DA) to identify distinctive patterns for Tonda Gentile Trilobata vs. Anakliuri hazelnuts. Because several confounding variables were concurrently present (e.g., post-harvest drying E1 vs. E2 and storage U vs. C), the data were pre-filtered by Fisher-ratio score (F) to exclude UT features with a negligible role in discriminating the cultivar influence on the metabolome. Supplementary Figure S2 shows: (SF2A) HC based on Euclidean distances (Z-score normalization of the data) for % response with 56 UT peaks (F ≥ 4), and (SF2B) PLS-DA scores plot on targeted analytes with meaningful variations between Tonda Gentile Trilobata and Anakliuri. Results indicate that cultivar has a predominant role in the distinctive metabolite signatures: Tonda Gentile Trilobata harvested in Italy forms an independent cluster (cluster "a" in SF2A), while geographical origin, although relevant to the discrimination, has a secondary role (cluster "b" in SF2A).
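The Fisher-ratio pre-filter described above can be sketched as follows. The exact definition used by the processing software is not given in the text; this sketch assumes the common one-way ANOVA form (between-class variance over within-class variance), and the feature names and values are hypothetical.

```python
from statistics import mean


def fisher_ratio(groups):
    """One-way ANOVA F: between-class over within-class variance."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n - k)
    return between / within  # raises ZeroDivisionError if within == 0


def select_features(table, threshold=4.0):
    """Keep features whose Fisher ratio meets the threshold (F >= 4 in the text)."""
    return [name for name, groups in table.items()
            if fisher_ratio(groups) >= threshold]


# Hypothetical % responses per cultivar class for two UT features.
table = {
    "feature_1": [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]],  # well separated
    "feature_2": [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]],  # overlapping
}
print(select_features(table))
```

Only features passing the threshold would then be Z-score normalized and submitted to hierarchical clustering, which keeps variables dominated by drying or storage effects from masking the cultivar signal.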
Within the most discriminant components, obtained by ranking the variable importance for the projections (VIPs), there were some amino acids (Val, Ala, Gly, and Pro); organic acids (citric, aspartic, malic, gluconic, threonic, and 4-aminobutanoic acids); and several sugars and polyols (maltose, xylulose, xylitol, turanose, mannitol, scyllo-inositol, and pinitol). Of them, Ala, Gly, and Pro have a high informational role since they may react with sugar degradation products (i.e., deoxyosones) within the Maillard reaction framework to form 2-acetyl- and 2-propionylpyrroline, key-aroma compounds of hazelnuts [2]. Figure 2 shows univariate statistics, by box-plot visualization, for informative targets and their relative distribution in different samples. Pro and Ala, whose meaningful variations were discriminant between the two cultivars, also have coherent trends between the two pedoclimatic regions (Italy vs. Georgia), suggesting a predominant role of the genome on phenotypic expression. Scyllo-inositol, an inositol isomer, was close to the limit of detection for Anakliuri samples, whereas citric acid, among the discriminant analytes, was more abundant in Georgian hazelnuts. Also of note, non-optimal storage conditions, i.e., without relative humidity stabilization (U vs. C samples), resulted in higher amounts of several metabolites. Fructose derivatives, galactose, glucose, mannitol, erythritol, gluconic acid, succinic acid, fumaric acid, Ala, Val, and Leu showed meaningful variations (p < 0.05) between U and C samples. These data suggest that enzymatic activation (of endogenous and exogenous enzymes) most probably occurred, inducing the release of amino acids and sugars from storage deposits and activating cell metabolism. Evidence of kernel viability was confirmed in a previous study aimed at comprehensively mapping hazelnuts volatilome along 12 months of shelf-life [55].
Concerning the impact of drying conditions (E1 and E2), it was not possible to isolate, at this stage (i.e., at time 0 on fresh hazelnuts), a clear impact of drying although it is known to be decisive for long-term storage. In this case, and up to 18 months, low-temperature drying accompanied by low-temperature storage in an inert atmosphere, enables effective inactivation of enzymes and mold/bacteria development while reducing autoxidation of fats [55].
In the current study, the focus was on the total detectable metabolome and not only on aroma precursors and/or on nutrients; therefore, to add a further tracking point on the roasting profile, the process was extended to reach a higher roasting temperature (200 °C-10 min), which is still acceptable for industrial applications. This "over-roasted" stage was similar to that applied in industrial transformation for "granella" (hazelnut grains) used in confectionery toppings.
The results are visualized in Figure 3A by principal component analysis (PCA) scores-plot based on % responses from all 430 UT peak features. The first two principal components (F1 and F2) explained 25% of the total variance; nevertheless, samples appear clearly clustered according to the extent of thermal treatment.
The impact of roasting tracks along F1 (Figure 3A) and, from right to left, metabolite signatures of raw hazelnuts (green indicators) evolve to the over-roasted stage (blue indicators) with a concurrent slight decrease of the group's internal variability, as shown by confidence ellipses (95% confidence level). Among known metabolites tracked along the roasting profile, those with meaningful variations across the three steps (raw, roasted, over-roasted) were reactive amino acids: Ala, Pro, Leu, Val, Asp, and pyroglutamic acid (a derivatization product of glutamic acid, Glu). All amino acids followed a decreasing trend along roasting time, with variations in % response ranging from −83% for Leu and Val to −96% for Glu. Sugars such as saccharose and galactose showed less marked decreases, with −56% and −46%, respectively. Histograms in Figure 4 illustrate the % response variation for a selection of targeted analytes, with error bars corresponding to ± SD for the class.
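The % variations quoted above are relative changes with respect to the raw samples. A minimal sketch, assuming the variation is computed as (processed − raw)/raw × 100 on class-average % responses; the function name and the numbers are hypothetical.

```python
def percent_change(raw, processed):
    """Relative change of a % response with respect to the raw sample."""
    return 100.0 * (processed - raw) / raw


# Hypothetical class-average % responses for one amino acid.
raw_response = 2.0
roasted_response = 0.34
change = percent_change(raw_response, roasted_response)
print(f"{change:.0f}%")
```

A negative value indicates consumption of the metabolite during roasting, as observed for the reactive amino acids and, to a lesser extent, for saccharose and galactose.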
Results are consistent with those of Ozdemir et al. [9], who reported a generalized decrease in absolute concentrations of all amino acids. Since our roasting protocol was by
When the roasting impact is examined in light of aroma precursors distribution (Table 2), the clustering of samples by PCA is even clearer. Figure 3B shows the PCA scores-plot based on % responses from 15 precursors. The intra-class variability for the roasted samples is lower (as shown by confidence ellipses), while the total explained variance is 68%. Aroma precursors (above all fructose, glucose, maltose, saccharose, Ala, Pro, Ile, Leu, Orn, and 5-oxoproline) are all reactive species within the Maillard reaction and sugar degradation pathways and, during roasting, form potent odorants. In particular, carbohydrate dehydration and isomerization cause the formation of α-dicarbonyls (2,3-butanedione and 2,3-pentanedione) and furanones (5-hydroxymethylfurfural and 4-hydroxy-2,5-dimethyl-3(2H)-furanone). Strecker degradation of amino acids in the presence of α-dicarbonyls forms 2- and 3-methylbutanal and phenylacetaldehyde from Ile, Leu, and Phe, respectively [56]. The intermediates of the Strecker reaction, α-amino carbonyl compounds, can dimerize to dihydropyrazines, which can either oxidize into pyrazines or further react with aldehydes to form substituted pyrazines [57]. Reactions between Ala, Arg, Lys, Pro, and Orn and deoxyosones (i.e., sugar degradation products) form 2-acetyl- and 2-propionylpyrroline, which can be further oxidized to pyrrole derivatives [2].
As detailed in Section 3.2, aroma precursors have characteristic signatures in raw hazelnuts (Figure 3A, green indicators, larger intra-class variability) with strong correlations to cultivar and origin, but when lab-scale roasting is applied, they start to react and the resulting distribution patterns in roasted and over-roasted samples are more homogeneous. Roasting is thus a variable that dominates over the phenotype-related variations of key-aroma precursors.
Finally, within roasted samples, the impact of post-harvest drying, i.e., E1 vs. E2 conditions, was analyzed. The variable importance in projection (VIP) selection, conducted on roasted samples (23 runs) and considering all targeted analytes (n = 80), highlighted the presence of glucose (glucose 5TMS), fructose (fructose 5TMS syn- and anti-), 3-α-mannobiose 8TMS, and threonic and malonic acids within the first 10 variables with the highest ranking (high VIP value and low standard deviation, SD). In particular, sugars were present in higher relative amounts for E1 drying, conducted in-field at higher temperatures, while acids were present in higher amounts for low-temperature drying (E2 conditions). These data should be interpreted in the light of previous results on volatiles signatures from raw and roasted hazelnuts (Tonda Romana and Ordu) submitted to post-harvest drying in similar conditions [55]. Low-temperature drying (E2) enabled effective stabilization of kernels along 12 months of storage, and higher relative amounts of key odorants were developed during roasting. In contrast, E1 drying was correlated with an early incidence of autoxidation in raw hazelnuts that, on the one hand, developed higher amounts of linear saturated and unsaturated aldehydes from the decomposition of fatty acid hydroperoxides and, on the other hand, when roasted at shelf-life checkpoints, resulted in weaker flavor profiles with lower amounts of key odorants.

Conclusions
UT fingerprinting with GC×GC-MS data was shown to be an effective strategy for exploring the complex hazelnut metabolome and its variations due to variable inputs (e.g., cultivar, geographical origin, post-harvest treatments, and roasting). MDA platforms enabled consistent annotation and tracking of component features, offering the opportunity to explore available analytical metadata to target known analytes and have access to a deeper knowledge of the chemical code behind complex phenomena [58].
This proof-of-concept study accurately captured, for the first time, the complex metabolome of raw hazelnuts and tracked its changes along roasting. Although many different variables concur to increase the chemical dimensionality of samples, their influence can be examined by applying various statistics. Untargeted features, being tracked together with all metadata, can be disambiguated/identified in ex-post analysis. This study comprehensively tracked 430 peak features with 80 targeted components. However, the information encoded in the features left untargeted can also be mined to extend our knowledge even further in light of new variables related to the sample set.
These results add new insights to the existing knowledge about hazelnut primary metabolites and constituents and provide a clearer picture of the interrelation between hazelnut varieties and their processing technologies, offering useful support for future investigations on breeding programs or harvesting procedures aimed at optimizing the overall compositional quality of hazelnuts.
Supplementary Materials: The following are available online at www.mdpi.com/2076-3417/11/2/525/s1 as non-published materials. Table S1. Method performance on retention times (¹tR and ²tR) and targeted peaks % response intermediate precision estimated as % Relative Standard Deviation (% RSD) on 21 quality control (QC) samples analyzed twice a day over the entire study period, Supplementary Figure S1
Funding: This research was funded by Soremartec Italia Srl, Project: Shelling nuts: a comprehensive investigation of hazelnuts flavor, taste and related health topics.

Data Availability Statement:
The data presented in this study are available in the supplementary material.