Comparison of Regular, Pure Shift, and Fast 2D NMR Experiments for Determination of the Geographical Origin of Walnuts

1H NMR spectroscopy, in combination with chemometric methods, was used to analyze the methanol/acetonitrile (1:1) extract of walnut (Juglans regia L.) with regard to the geographical origin of 128 authentic samples from different countries (France, Germany, China) and harvest years (2016–2019). Owing to the large number of different metabolites in the methanol/acetonitrile extract, the one-dimensional (1D) 1H NOESY (nuclear Overhauser effect spectroscopy) spectra suffer from strongly overlapping signals, which complicates both the identification of specific metabolites and the statistical analysis. Pure shift 1H NMR spectra such as PSYCHE (pure shift yielded by chirp excitation) or two-dimensional ASAP-HSQC (acceleration by sharing adjacent polarization-heteronuclear single quantum correlation) spectra may therefore be a promising basis for multivariate analysis to determine the geographical origin of foods. Different types of NMR spectra (1D 1H NOESY, PSYCHE, and ASAP-HSQC) were acquired for each of the 128 walnut samples, and the results of the statistical analysis were compared. A support vector machine classifier was applied to differentiate samples from Germany/China, France/Germany, and France/China. The models obtained with a repeated nested cross-validation showed accuracies from 58.9% (±1.3%) to 95.9% (±0.8%). The potential of the 1H-13C HSQC as a 2D NMR experiment for metabolomics studies was shown.


Introduction
The field of metabolomics is becoming increasingly important, especially with regard to the discrimination or classification of samples [1]. Food fingerprinting is a metabolomics-based approach that focuses on the recognition of specific patterns that enable the differentiation of several groups [1,2]. For food samples, this differentiation can be based on the geographical origin, varieties, different growing conditions, adulterations, or harvest times [3][4][5][6][7]. The most common methods used for metabolomics studies are nuclear magnetic resonance spectroscopy (NMR) and mass spectrometry (MS), e.g., coupled to liquid chromatography (LC), gas chromatography (GC), or tandem MS [8][9][10]. Other methods include vibrational spectroscopy techniques such as infrared (IR) or Raman spectroscopy [11]. NMR spectroscopy requires minimal sample preparation and is a highly automatable technique, which allows a high throughput of samples [8]. Furthermore, it enables structure elucidation, and it is a highly reproducible, nondestructive, and easily quantifiable method [8][9][10][12]. MS is much more sensitive than NMR spectroscopy, but signals in mass spectra are not directly quantifiable, since the ionization efficiency is not the same for different metabolites [8,9].
[...] the Optimal Control Theory (OCT) can be used, which leads to fewer artifact signals in the resulting spectra [24]. Sharma et al. used 2D ASAP-HSQC spectra for the differentiation of black and green tea and compared the results with regular HSQC and 1H NMR spectra [4]. The studies of Le Guennec et al. included the evaluation of the general applicability of regular 2D HSQC spectra for metabolomics using synthetic samples [25]. Another study, performed by Puig-Castellví et al., compared regular 2D HSQC and 1D 1H NMR spectra with respect to the metabolism of yeast samples [26].
All studies showed that the use of 2D NMR spectra, especially HSQC spectra, seems to be a promising method both for multivariate analysis and for identification of metabolites.
In this study, classification models for the discrimination of the geographical origin (France/Germany/China) were built using the methanol/acetonitrile extract of 128 walnut samples. For each walnut sample, a 1D 1H NOESY spectrum as well as a PSYCHE and a 1H-13C ASAP-HSQC spectrum were acquired. The data were analyzed using multivariate statistics, including principal component analysis (PCA) and classification with a repeated nested cross-validation (CV) to avoid overfitting. Samples from two countries at a time were discriminated with a linear support vector machine (SVM) classifier, and the resulting models reached accuracies from 58.9% (±1.3%) to 95.9% (±0.8%). The score and loading plots of the PCA as well as the classification matrices of all types of NMR spectra were compared with each other, and the feasibility of these types of NMR experiments for multivariate analysis was evaluated.

Results
In a metabolomics-based approach, the definition of the extraction method is as crucial a step as the choice of the experiment for data acquisition. Most studies on walnuts focus on the non-polar extract [27][28][29], and in the field of metabolomics the use of 1D NOESY NMR experiments with water suppression is established [30,31]. In previous studies, we analyzed the 1D NOESY spectra obtained by polar extraction with a mixture of buffer/methanol-d4/chloroform-d (5:4:6) [32]. Since the aim was to obtain a high dispersion in the chemical shift of the NMR signals, the non-polar extract was unsuitable because the fatty acid fingerprint contains only a few characteristic signals. In this study, the mid-polar extract (acetonitrile-d3/methanol-d4) was used to obtain signals from both polar and non-polar metabolites. Furthermore, this extract was chosen because an increased presence of secondary metabolites was expected. The extract exhibits many signals in the aromatic region, particularly below 7 ppm, indicating polyphenols, and further signals in the region of carbohydrates. A representative 1D 1H NOESY spectrum of the acetonitrile-d3/methanol-d4 extract (method A) is shown in Figure 1.
Figure 1. Representative 1D 1H NOESY spectrum of the acetonitrile-d3/methanol-d4 extract (method A). The spectrum contains mainly signals from fatty acids. The region of carbohydrates is dominated by signals from sucrose. The region from 6.00 to 9.00 ppm is shown expanded.

The aliphatic region is dominated by signals from fatty acids. By far the most intense carbohydrate signals originate from sucrose. The signals in the aromatic region show a much lower intensity than those from sucrose or the fatty acids. A stability measurement showed that the extract is sufficiently stable. Only in the aromatic region do some changes in signal intensity and signal shifts occur, which are shown in the supplementary information (Figure S1). The signals of the extract were monitored for one week, and visible changes were mostly detected only after several days at room temperature. Consequently, the extract was sufficiently stable for the analytical purpose, since changes in signals are only critical if they occur within a few hours after extraction. In this study, a total of 128 walnut samples were used for the discrimination of geographical origin.

PCA and Classification Using 1D 1 H NOESY Spectra
Walnut samples were weighed in triplicates to exclude preparation errors. The extraction was carried out according to method A, and spectra of the individual samples were then measured by 1H NMR spectroscopy using the 1D 1H NOESY pulse sequence with water suppression. The spectra obtained were calibrated to OMS as reference and processed by automatic phase and baseline correction. The software Amix was used to visualize the data and to perform a PCA. The analysis of the triplicates revealed no outliers. For the PCA, a single measured spectrum (the first extract of each triplicate) was used rather than an average of the triplicates. A total of 183 buckets were manually defined for creating binary classification models. Additionally, the Kruskal-Wallis test was performed as a non-parametric significance test, revealing 44 buckets with a p-value below the Bonferroni-corrected threshold of 0.00027322, which indicates a high relevance for the separation of the two sample groups. The buckets used and the corresponding p-values are shown in the supporting material (Table S1). Surprisingly, only signals that were associated with fatty acids and carbohydrates showed significant differences. The PCA score and loading plots of DE/CN in Figure 2A, which were plotted using the software Origin, show two partially overlapping clusters. The PCA score and loading plots of the differentiation France/Germany and France/China are shown in the supplementary information (Figures S2 and S3).
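The threshold of 0.00027322 is consistent with a Bonferroni correction of a significance level of 0.05 over the 183 buckets (0.05/183); the level of 0.05 is inferred from these numbers and is not stated explicitly. The following minimal Python sketch, using only the standard library, reproduces this threshold and shows a two-group Kruskal-Wallis test for a single bucket. The function names and the example intensities are hypothetical, and the sketch is not the implementation used in the study.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def average_ranks(values):
    """Rank values from 1..N; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # mean of the 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


def kruskal_wallis_two_groups(group_a, group_b):
    """Return (H, p) for two groups; p uses the chi-square approximation, 1 df.

    Assumes that not all pooled values are identical.
    """
    pooled = list(group_a) + list(group_b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_sum_a = sum(ranks[:len(group_a)])
    rank_sum_b = sum(ranks[len(group_a):])
    h = 12.0 / (n * (n + 1)) * (
        rank_sum_a ** 2 / len(group_a) + rank_sum_b ** 2 / len(group_b)
    ) - 3 * (n + 1)
    # Tie correction: divide H by 1 - sum(t^3 - t) / (N^3 - N) over tie groups.
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    h /= 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    h = max(h, 0.0)  # guard against tiny negative rounding errors
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(h / 2))
    return h, p


# Hypothetical bucket intensities for two sample groups (illustration only).
threshold = bonferroni_threshold(0.05, 183)
h_stat, p_value = kruskal_wallis_two_groups([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
is_significant = p_value < threshold
```

A bucket would count as significant in this sense only if `p_value` falls below `threshold`; with three samples per group, as in the toy data, this cannot happen, which illustrates how strict the corrected threshold is.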
The classification was performed using the software Matlab with the normalized and scaled buckets as input. The Classification Learner App of Matlab includes the linear support vector machine classifier, which was used for all classifications with the same parameters. For validation, a repeated nested cross-validation (CV) was carried out. A nested CV, in which each sample is left out exactly once for subsequent testing, yields an almost unbiased estimate of the true error [33]. Westerhuis et al. examined the validation of double CV as a variant of the nested CV by using permutation tests and showed that double CV leads to the most reliable results compared with other CV variants [34]. By repeating the nested CV multiple times with different splits of the training and test sets, the influence of the particular split composition is reduced further [35]. For the repeated nested CV, the entire sample set was first separated into five parts, and one part was left out to function as the test set. The corresponding training set consists of the other four parts. A five-fold outer CV was carried out by repeating this procedure five times so that each walnut sample appeared once in a test set. The results obtained for the individual splits were combined into one confusion matrix. The whole procedure was performed with five randomly mixed splits, yielding five combined confusion matrices. The result of a classification model is given as the mean in a final confusion matrix, and the accuracy is given in percent with the standard deviation. The classification was carried out for each possible pair of the three sample groups (DE/CN/FR), resulting in three classifiers. The final confusion matrices are presented in Figure 3. The linear SVM classifier led to the best results and was therefore used for all classification models.
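The outer loop of the validation scheme described above can be sketched as follows. This is a minimal Python illustration under several assumptions, not the Matlab workflow of the study: a nearest-centroid classifier stands in for the linear SVM, the inner CV loop of the nested scheme is omitted, the folds are not stratified, and all names and the toy data are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, stdev


def nearest_centroid(train_x, train_y, test_x):
    """Stand-in classifier (NOT the linear SVM used in the study)."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [
        min(centroids, key=lambda lab: squared_distance(x, centroids[lab]))
        for x in test_x
    ]


def repeated_outer_cv(x, y, fit_predict, n_folds=5, n_repeats=5, seed=0):
    """Repeat a k-fold CV with reshuffled splits.

    Returns the mean accuracy, its standard deviation over the repeats,
    and one combined confusion matrix (Counter keyed by (true, predicted))
    per repeat.
    """
    rng = random.Random(seed)
    accuracies, confusions = [], []
    for _ in range(n_repeats):
        indices = list(range(len(x)))
        rng.shuffle(indices)
        # Each sample lands in exactly one of the n_folds test sets.
        folds = [indices[k::n_folds] for k in range(n_folds)]
        confusion = Counter()
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            predictions = fit_predict(
                [x[i] for i in train_idx],
                [y[i] for i in train_idx],
                [x[i] for i in test_idx],
            )
            for i, predicted in zip(test_idx, predictions):
                confusion[(y[i], predicted)] += 1
        correct = sum(n for (true, pred), n in confusion.items() if true == pred)
        accuracies.append(correct / len(x))
        confusions.append(confusion)
    return mean(accuracies), stdev(accuracies), confusions


# Hypothetical, well-separated toy data: one bucket value per sample.
toy_x = [[float(i)] for i in range(10)] + [[100.0 + i] for i in range(10)]
toy_y = ["DE"] * 10 + ["CN"] * 10
mean_acc, sd_acc, per_repeat = repeated_outer_cv(toy_x, toy_y, nearest_centroid)
```

Passing the classifier as `fit_predict` keeps the splitting logic independent of the model, so any parameter tuning (the inner loop of a nested CV) would live inside that function and see only the training part of each split.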
The classification of German and Chinese walnut samples reached an accuracy of 95.9% (±0.8%), indicating a strong and robust model. Nearly all samples from Germany were correctly classified, with an accuracy of 99.6%. The Chinese samples were classified correctly with an accuracy of 82.7%. Misclassifications were mainly attributed to two Chinese samples, which lie close to the cluster of the German samples in the PCA score plot. In the classification model based on the polar extract from previous studies, 5.3% of the fifteen Chinese samples were incorrectly classified [32]. That misclassification was attributed to only one sample, which was also classified wrongly in the model based on the mid-polar extract.
For the differentiation of samples from France and Germany, the initially selected 183 buckets were used as well. The classification of samples from France and Germany resulted in an accuracy of 83.4% (±2.0%). French samples were classified correctly with an accuracy of 91.9%, German samples, in contrast, with 72.2%. Both with this extraction method and with the polar extraction from previous studies [32], a relatively large number of samples from Baden-Württemberg were incorrectly classified, which can be explained by the proximity to France. In the classification based on the mid-polar extract, the proportion of incorrectly assigned samples from Baden-Württemberg was 43%. For the classification with the polar extraction, this proportion was similarly high at approximately 57% [32].
For the discrimination of samples from France and China, a high variability within each sample group was observed. An accuracy of 93.7% (±1.1%) was calculated. Nearly all samples from France were classified correctly, which corresponds to an accuracy of 99.7%. The Chinese samples were classified with an accuracy of 68.0%, which is probably due to the unequal sample size. The accuracy of the model based on the mid-polar extract was higher than the accuracy of 92.6% (±1.2%) obtained with the model based on the polar extract. With both extraction methods, misclassifications of Chinese samples were attributed mainly to the same samples. The 1D 1H NOESY spectra of the mid-polar extract of walnuts can thus be used to discriminate samples from different countries. In comparison with results from previous studies, the accuracies of the models are similar when using the polar or the mid-polar extract [32]. The results of both extraction methods are compared in the supporting material (Table S10).
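The effect of the unequal group sizes on the overall accuracy can be made explicit with a short calculation. The sketch below assumes that the per-class accuracies refer to the 64 French and 15 Chinese samples; under this assumption the size-weighted mean reproduces the reported 93.7%. The unweighted mean (often called balanced accuracy) is not reported in the study and is added here only for contrast.

```python
def overall_accuracy(class_sizes, class_accuracies):
    """Accuracy over all samples: class accuracies weighted by class size."""
    total = sum(class_sizes.values())
    return sum(class_sizes[c] * class_accuracies[c] for c in class_sizes) / total


def balanced_accuracy(class_accuracies):
    """Unweighted mean of the class accuracies (not reported in the study)."""
    return sum(class_accuracies.values()) / len(class_accuracies)


# France/China model based on the 1D 1H NOESY spectra.
sizes = {"FR": 64, "CN": 15}
per_class = {"FR": 0.997, "CN": 0.680}

overall = overall_accuracy(sizes, per_class)
balanced = balanced_accuracy(per_class)
print(f"overall: {overall:.3f}, balanced: {balanced:.3f}")
```

Because the French group is about four times larger, its near-perfect classification dominates the overall figure, while the lower accuracy for the small Chinese group shows up much more strongly in the unweighted mean.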

Comparison with ASAP-HSQC and PSYCHE Spectra
To compare the classification results for the geographical origin of the walnut samples obtained with 1D 1H NOESY, ASAP-HSQC, and PSYCHE spectra, the samples from Germany (49), France (64), and China (15) were additionally weighed in triplicates for the acquisition of ASAP-HSQC and PSYCHE spectra. The samples were extracted using extraction method B, a slightly modified version of the acetonitrile/methanol extraction method A, in order to achieve higher metabolite concentrations, since the signal intensities in PSYCHE and ASAP-HSQC spectra are generally much lower than in standard 1H NMR spectra. Figure 4 shows a homonuclear decoupled PSYCHE spectrum of a representative walnut sample.
The PSYCHE spectrum of the acetonitrile-d3/methanol-d4 extract (method B) in Figure 4 shows mostly good decoupling efficiency, with only singlets for chemically distinct NMR signals. In the aliphatic region, signals from fatty acids were observed. Furthermore, signals from sucrose and the signal from water were detected. The overall signal intensity in the PSYCHE spectrum is quite low. Especially in the aromatic region hardly any signals were detected, and care must be taken not to mistake baseline artefacts, which originate from the special data processing technique, for small signals. Furthermore, the full width at half maximum (FWHM) of the peaks is rather large, resulting in only a slightly better resolution than in regular 1H NMR spectra (compare Figure 1).
For the visualization of the data, a PCA was performed using the software Amix. A total of 30 buckets were manually defined for all signals except the solvents and the reference (OMS) in the PSYCHE spectra (supplementary information S2). The corresponding score and loading plots are shown in Figure 2B. The results of the PCA for the differentiation of German and French as well as French and Chinese samples are shown in the supplementary information (Figures S2 and S3). The classification was performed in the same way as for the 1D 1H NOESY spectra, using a support vector machine algorithm in the Classification Learner App in Matlab. The composition of the training and test sets for the repeated nested CV was the same for all types of NMR experiments. The confusion matrices for all two-class models are shown in Figure 5.
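The bucketing step can be illustrated with a small Python sketch that sums the intensities of a 1D spectrum within manually defined chemical shift ranges. The bucket limits, the toy spectrum, and the normalization to the total bucket intensity are assumptions made for illustration; the text states only that the buckets were normalized and scaled, not how.

```python
def integrate_buckets(ppm, intensity, buckets):
    """Sum the intensities of all points whose shift lies in [low, high)."""
    return [
        sum(y for x, y in zip(ppm, intensity) if low <= x < high)
        for low, high in buckets
    ]


def normalize_to_total(values):
    """Scale bucket integrals so that they sum to 1 (assumed normalization)."""
    total = sum(values)
    return [v / total for v in values]


# Hypothetical toy spectrum (chemical shift in ppm, arbitrary intensities).
ppm_axis = [0.9, 1.0, 1.1, 3.6, 3.7]
intensities = [1, 2, 1, 4, 2]
# Hypothetical bucket limits; solvent and reference regions would be left out.
bucket_limits = [(0.8, 1.2), (3.5, 3.8)]

integrals = integrate_buckets(ppm_axis, intensities, bucket_limits)
normalized = normalize_to_total(integrals)
```

Defining the buckets as explicit ranges mirrors the manual definition described above and makes it straightforward to exclude the solvent and OMS regions by simply not listing them.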
For further enhancement of the spectral resolution with regard to the overlap of signals, the acquisition of 2D NMR spectra seems promising. Owing to the additional 13C frequency dimension in 2D 1H-13C HSQC spectra, signals that overlap in the 1H frequency domain can in part be distinguished more clearly.
Since the total acquisition time of 2D NMR spectra is generally much longer than that of 1D NMR data, techniques for reducing the measuring time need to be employed, especially for high-throughput metabolomics studies. Thus, the ASAP-HSQC sequence developed by Schulze-Sünninghausen et al. was used, with which the relaxation delay between the scans can be reduced without affecting the quality of the spectra [22]. Furthermore, the ASAP-HSQC pulse sequence was used since the applied pulses derived from the Optimal Control Theory seem to lead to artifact-free spectra [24]. However, the obtained ASAP-HSQC spectra are, in fact, quite similar to regular HSQC spectra. To reduce the experiment time even further, the 1H-13C ASAP-HSQC spectra were acquired in combination with nonuniform sampling (NUS). Figure 6 shows a 2D 1H-13C ASAP-HSQC spectrum of a walnut sample extracted with acetonitrile-d3/methanol-d4 (method B).
The ASAP-HSQC spectrum shows the separation of several signals that overlap in both types of 1D NMR spectra. Especially in the aliphatic region, which shows signals of fatty acids, and in the carbohydrate region, signals that occur at the same 1H chemical shift are spread across the additional 13C frequency domain. In the aromatic region of the spectrum, only a few signals can be detected within a total acquisition time of 33 min. In the supplementary information (Figure S5), an ASAP-HSQC spectrum with an increased number of scans (256) is shown, which allows the detection of signals in the aromatic region. However, the total acquisition time of more than four hours for 256 scans is not suitable for high-throughput metabolomics studies, which is why spectra as shown in Figure 6 were acquired for all walnut samples and used for multivariate analysis. PCA and classification were performed as explained above, using the integrated 2D peak intensities as input. The buckets for the PCA were manually defined for all detected signals except the solvent signals and the reference. In total, 40 buckets were defined for further analysis (supplementary information S3). Figure 7 shows all two-class confusion matrices based on the ASAP-HSQC spectra.
In general, the accuracies of the classification using the ASAP-HSQC spectra are in part slightly lower than the accuracies obtained with the PSYCHE or 1H NOESY spectra. The differentiation of German and Chinese walnut samples yields a high accuracy of 86.3% (±0.6%), as does the differentiation of French and Chinese samples with 88.9% (±1.2%). Both confusion matrices in Figure 7A,C show that the accuracy of the classification of the German and French samples (98.4% for DE; 97.8% for FR) is much higher than the accuracy of the classification of the Chinese samples (46.7% for DE/CN; 50.7% for FR/CN). These results seem to be caused by the unequal sample distribution. The classification of French and German samples yields an accuracy of only 58.9% (±1.3%). Owing to the close geographical location of Germany and France, the walnut samples show two strongly overlapping clusters in the PCA (supplementary information, Figure S3), which already indicates a complicated classification. Especially the samples originating from southern Germany (from Baden-Württemberg and partially from Hesse) are misclassified, which can be attributed to the close proximity to France. Table 1 summarizes all calculated accuracies of the different two-class models after repeated nested CV for all applied NMR experiments. The confusion matrices as well as the PCA score plots obtained using three-class models are shown in the supplementary information (Figure S4).
Qualitatively, the score and loading plots of all types of NMR experiments (Figure 2 and supplementary information Figures S2 and S3) show the same trend for the differentiation of each two-class model and lead to comparable results. In all score plots, the German samples (black dots) and the Chinese samples (red dots) form two mostly separated clusters along PC1. The corresponding loading plots likewise show a distribution of the buckets along PC1.
Especially the loading plot of the PSYCHE spectra shows a clear separation of buckets originating from carbohydrate signals and those originating from fatty acid signals. With the PSYCHE spectra, the distribution of these buckets indicates that the concentrations of fatty acids and carbohydrates differ between German and Chinese walnut samples. In general, the Chinese walnut samples seem to have slightly higher concentrations of fatty acids, which are mainly linoleic, oleic, α-linolenic, palmitic, and stearic acid [27], while German samples exhibit rather higher concentrations of carbohydrates. Furthermore, in the loading plots of the PSYCHE and the ASAP-HSQC spectra (Figure 2B,C), a clustering of the buckets originating from aromatic compounds is visible. In the loading plot of the 1D 1H NOESY spectra (Figure 2A), this clear separation of buckets is less evident, since some buckets in the carbohydrate region overlap with most of the buckets in the aliphatic region, which show mostly signals of fatty acids. However, classification using the 1D 1H NOESY spectra yields slightly higher accuracies, sensitivities, and specificities for almost all models presented. These results are particularly evident in the confusion matrices of the three-class models (Figure S4). The PSYCHE and ASAP-HSQC spectra have an inherently lower signal-to-noise ratio, which also reduces the number of buckets that can be used for the classification; because of this loss of information, an increase in misclassifications was expected. Nevertheless, the misclassifications increase especially where they are also high for the 1D NOESY model and are thus in accordance with the trends already observed.
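How opposite bucket trends appear as loadings of opposite sign on PC1, and how the sample groups separate in the scores, can be illustrated with a small PCA sketch. The following Python code is only an illustration: it computes the first principal component by power iteration rather than with the Amix software used in the study, and the two-bucket table (a carbohydrate bucket and a fatty acid bucket) consists of hypothetical toy numbers.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (loadings, scores) of PC1 for a samples x buckets table.

    Uses power iteration on the covariance matrix; the start vector must
    not be orthogonal to PC1, which holds for the toy data below.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    v = [1.0] + [0.0] * (p - 1)
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


# Hypothetical bucket table: columns = (carbohydrate bucket, fatty acid bucket).
# The first three rows mimic one sample group, the last three rows the other.
toy_table = [
    [5.0, 1.0], [5.2, 1.1], [4.8, 0.9],
    [3.0, 2.0], [3.1, 2.2], [2.9, 1.8],
]
loadings, scores = first_principal_component(toy_table)
```

In this toy case the two buckets receive loadings of opposite sign and the two groups have scores of opposite sign on PC1, which is the pattern described above for the carbohydrate and fatty acid buckets.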

The Potential of 2D NMR Experiments for Determination of the Geographical Origin of Foods
Owing to the additional frequency domain of 2D ASAP-HSQC spectra, some signals show less overlap than in 1D NMR spectra, as shown above. In particular, the detection of low-abundance metabolites with signals in strongly overlapping regions is hindered in 1D NMR spectra. An exemplary case is shown in Figure 8.
Figure 8A-E presents the box-whisker plots of buckets in the same region of the 1H frequency domain for the 1D 1H NOESY, PSYCHE, and ASAP-HSQC spectra. The selected buckets are highlighted in the respective NMR spectra below (Figure 8F: PSYCHE (upper part) and 1D 1H NOESY spectrum (lower part); Figure 8G: ASAP-HSQC spectrum). The box-whisker plots of the bucket at 3.61 ppm in the 1D 1H NOESY spectra and at 3.62 ppm in the PSYCHE spectra show mostly the same trend, with higher values for German samples and lower values for Chinese samples. As can be seen from the spectra (Figure 8F), this bucket most likely originates from sucrose. Figure 8C,D show the box-whisker plots of two sucrose buckets in the ASAP-HSQC spectra that occur in the same 1H chemical shift region (signal at 3.66/62.2 ppm (highlighted green) and signal at 3.64/74.5 ppm (highlighted green hatched)). As expected, the box-whisker plots of those signals show the same trend as for the one-dimensional NMR spectra, since all signals most likely originate from the same metabolite (sucrose). Figure 8E shows the box-whisker plot of the bucket at 3.63/61.4 ppm in the ASAP-HSQC spectrum, which does not belong to the signals of sucrose but lies in the same chemical shift region of the 1H domain as the signals shown before. The associated signal is highlighted blue in the ASAP-HSQC spectrum in Figure 8G. The trend of the corresponding box-whisker plot differs strongly from the ones shown before, with lower values for German and higher values for Chinese walnut samples. This signal at 3.63/61.4 ppm can therefore only be made visible using a 2D HSQC NMR spectrum.
In both types of 1D NMR spectra (Figure 8F), the signal of this specific metabolite is strongly overlapped by the highly abundant metabolite sucrose, which impedes the detection of metabolites with signals in the same chemical shift region. In this particular case, the use of 1D NMR experiments thus leads to a loss of information due to the compensation of peak intensities, which can change the ratio of specific buckets. This example emphasizes the benefits of using 2D HSQC spectra for metabolomics studies. Especially low-abundance metabolites that might have an impact on the differentiation of sample groups are less overlapped by highly abundant metabolites. Furthermore, this can lead to improved detection and identification of chemical markers that are relevant for the differentiation of sample groups.

Reagents and Chemicals
Deuterated solvent methanol-d4 (99.8%) was purchased from EURISOTOP (Saint-Aubin Cedex, France). Acetonitrile-d3 (99.5%) was purchased from DEUTERO (Kastellaun, Germany). Octamethylcyclotetrasiloxane (98.0%) was purchased from Alfa Aesar (Ward Hill, Haverhill, MA, USA).

Walnut Samples
In total, 128 walnut samples of different geographical origins, varieties, and harvest years 2016 (1), 2017 (30), 2018 (42), and 2019 (55) were analyzed. The sample distribution, including walnut samples from France (64), Germany (49), and China (15), is shown in the supplementary information (Table S11). Most samples were provided by importers and distributors, and our collaborators declared the authenticity of the samples.

Sample Preparation
Walnut samples were handled in accordance with previous studies [32]. Freshly delivered walnut samples were dried prior to sample preparation. The walnut samples were shock-frozen in liquid nitrogen, and the shell was removed. After another shock-freezing step, 100 g of each sample were ground with the addition of 150 g dry ice using a Grindomix GM 300 knife mill equipped with a stainless-steel grinding container and a full metal knife (RETSCH, Haan, Germany). The ground walnut samples were then freeze-dried for 48 h and stored at −20 °C afterwards.

Extraction
Extraction method A: For the methanol/acetonitrile extract, 302.5 mg (±2.5 mg) lyophilizate was mixed with 500 µL methanol-d4 and 500 µL acetonitrile-d3 (+0.5 mM OMS), and two steel balls (Ø = 2 mm) were added. The mixture was extracted in a ball mill for three minutes at 3.1 m/s and centrifuged at 14,000 rcf (4 °C) for ten minutes. Then, 350 µL of the supernatant was diluted with 350 µL methanol-d4. Finally, 600 µL of the diluted extract were transferred to a 5 mm NMR tube.
Extraction method B: For the acquisition of ASAP-HSQC and PSYCHE NMR spectra, extraction method A was slightly modified in order to achieve higher metabolite concentrations, since the signal intensities in both types of spectra are generally much lower than in standard 1H NMR spectra. From each walnut sample, 302.5 mg (±2.5 mg) lyophilizate was mixed with 500 µL methanol-d4 and 500 µL acetonitrile-d3 (+2.5 mM OMS). After the addition of two steel balls (Ø = 2 mm), the mixture was extracted in a ball mill for three minutes at 3.1 m/s and then centrifuged at 14,000 rcf and 4 °C for ten minutes. Then, 600 µL of the supernatant were mixed with 100 µL of methanol-d4, and 600 µL of this extract solution were transferred into a 5 mm NMR tube.
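The concentration gain of method B over method A can be estimated from the dilution steps alone. The following is a minimal sketch, assuming that the supernatants of both methods have the same metabolite concentration (same lyophilizate mass and extraction solvent volumes) and that volumes are additive; the function and variable names are illustrative and not part of the original protocol.

```python
from fractions import Fraction


def supernatant_fraction(supernatant_ul: int, added_solvent_ul: int) -> Fraction:
    """Fraction of the final solution that consists of the original supernatant."""
    return Fraction(supernatant_ul, supernatant_ul + added_solvent_ul)


# Volumes taken from the two extraction methods described above.
method_a = supernatant_fraction(350, 350)  # 350 µL supernatant + 350 µL methanol-d4
method_b = supernatant_fraction(600, 100)  # 600 µL supernatant + 100 µL methanol-d4

# Ratio of metabolite concentrations in the NMR tube (method B relative to A),
# under the assumption that both supernatants are equally concentrated.
relative_concentration = method_b / method_a

print(f"Method A: {float(method_a):.3f} of the supernatant concentration")
print(f"Method B: {float(method_b):.3f} of the supernatant concentration")
print(f"Method B / method A: {float(relative_concentration):.2f}")
```

Under these assumptions, the sample measured after method B is roughly 1.7 times more concentrated than after method A.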

NMR Acquisition
All NMR spectra were acquired on a Bruker Avance III HD 400 MHz NMR spectrometer using TopSpin 3.6.2 (Bruker Biospin GmbH, Rheinstetten, Germany) equipped with a 5 mm BBO probe and operating at 400.13 MHz and 300 K.
The noesygppr1d pulse sequence was used for the acquisition of water-suppressed 1H NMR spectra, applying the digitization mode baseopt. For every sample, spectra were recorded with 128 scans (NS), 65,536 complex data points, and a spectral width of 8417.5 Hz. The receiver gain (RG) was set to 64 and the transmitter frequency offset to 1778.2 Hz. The total acquisition time was 17 min.
The ASAP-HSQC spectra were acquired with the pulse sequence from Schulze-Sünninghausen et al. [22]. In order to combine the ASAP-HSQC experiment with nonuniform sampling (NUS), the syntax of the mc-command was changed. All spectra were recorded with 32 scans and 128 dummy scans; 1024 data points were recorded in F2 and 512 increments in F1. The spectral width was 11 ppm in F2 and 165 ppm in F1. The relaxation delay was set to 0.3 s. In addition, nonuniform sampling was used so that only 25% of the increments were actually recorded. The mean coupling constant 1JCH for the INEPT transfer was set to 145 Hz, resulting in a delay ∆ of 1.72 ms for the inverse INEPT transfer. A value of 210 Hz was selected for CNST3, resulting in a delay ∆ of 1.19 ms. The mixing time for the DIPSI-2 sequence was 30 ms. A garp4 sequence was used for 13C decoupling during FID recording. The total duration of the experiment was 33 min.
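Both delays quoted above are consistent with the common INEPT relation ∆ = 1/(4J). The text gives only the coupling values and the resulting delays, so the formula is an assumption here, as are the helper names in the following short sketch. It also computes the number of increments actually recorded with 25% nonuniform sampling.

```python
def inept_delay_ms(coupling_hz: float) -> float:
    """Delay 1/(4J) in milliseconds for a coupling constant J given in Hz.

    Assumption: the delays in the text follow the standard 1/(4J) relation.
    """
    return 1000.0 / (4.0 * coupling_hz)


def recorded_increments(total_increments: int, sampling_fraction: float) -> int:
    """Number of F1 increments acquired for a given NUS sampling fraction."""
    return round(total_increments * sampling_fraction)


delay_145 = inept_delay_ms(145.0)  # coupling constant 1JCH = 145 Hz
delay_210 = inept_delay_ms(210.0)  # CNST3 = 210 Hz
nus_points = recorded_increments(512, 0.25)

print(f"J = 145 Hz: delay = {delay_145:.2f} ms")
print(f"J = 210 Hz: delay = {delay_210:.2f} ms")
print(f"Recorded increments: {nus_points} of 512")
```

With these inputs, the sketch reproduces the 1.72 ms and 1.19 ms delays and shows that 128 of the 512 increments are acquired.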
The homonuclear broadband decoupled PSYCHE NMR spectra were acquired with the reset-psyche pulse sequence from Bruker Biospin GmbH, which relies on the psyche pulse sequence from Foroozandeh et al. [14,18,19]. The spectra were recorded with 128 scans, two preceding dummy scans, 16,384 data points in the direct dimension, and 20 data chunks. The spectral width was set to 80 Hz in F1 and 8000 Hz in F2. The relaxation delay was 0.5 s. A flip angle β of 45° was chosen for the CHIRP pulse element. The number of complex points of each FID block κ was set to 64, resulting in a block length of 8 ms. The total duration of the experiment was 71 min.
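The stated block length can be checked from the number of complex points per block and the F2 spectral width. The sketch below assumes that one complex point is sampled every 1/SW seconds; the function name is illustrative.

```python
def block_length_ms(complex_points: int, spectral_width_hz: float) -> float:
    """Duration of a data block, assuming one complex point per 1/SW seconds."""
    return 1000.0 * complex_points / spectral_width_hz


# 64 complex points per FID block at a spectral width of 8000 Hz in F2.
psyche_block_ms = block_length_ms(64, 8000.0)
print(f"Block length: {psyche_block_ms:.1f} ms")  # prints "Block length: 8.0 ms"
```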

NMR Data Processing and Analysis
For the 1D 1H NOESY spectra, the FIDs were weighted by an exponential function with a line broadening factor of 0.3 Hz and Fourier transformed. The spectra obtained were calibrated to the OMS signal (δ = 0.085 ppm) and processed by automatic phase and baseline correction with TopSpin 3.5 (Bruker Biospin, Rheinstetten, Germany).
The ASAP-HSQC spectra that were acquired in combination with NUS were processed using the Compressed Sensing (CS) algorithm with Iterative Soft Thresholding (IST) in TopSpin 4.0.8 (Bruker Biospin GmbH, Rheinstetten, Germany). Zero filling up to 2048 data points was applied in the direct and indirect dimensions. In addition, complex linear forward prediction was performed in the F1 dimension with 32 coefficients. A Hilbert transformation was performed in the F2 dimension to calculate the imaginary part of the FID. Afterwards, a manual phase correction and an automatic baseline correction with a first-degree polynomial function were performed in both dimensions. The spectra were calibrated to the OMS signal (δ(1H) = 0.085 ppm, δ(13C) = 0.883 ppm).
The PSYCHE spectra were processed with the AU program proc_reset in TopSpin 4.0.8 (Bruker Biospin, Rheinstetten, Germany), whereby the number of columns for averaging was set to 1. In the resulting one-dimensional pure shift spectra, zero filling was applied to 4096 data points. In addition, a window multiplication with a QSINE function with a sine bell shift of 2 was performed, and the spectra were calibrated to the OMS signal (δ = 0.085 ppm). Furthermore, a manual phase correction and an automatic baseline correction with a first-degree polynomial function were performed.

Multivariate Analysis
The software Amix (Version 3.9.15, Bruker Biospin, Rheinstetten, Germany) was used for analyzing the spectra. For the 1D 1H NOESY, PSYCHE, and ASAP-HSQC spectra, the buckets were defined manually with variable size, because the signal pattern differs between the spectrum types and because this led to the best PCA results (number of buckets: 183 for 1D 1H NOESY, 30 for PSYCHE, and 40 for ASAP-HSQC spectra). The selected buckets (supplementary information S1-S3) were scaled to total intensity (rows). The PCA was carried out using the data obtained, with the bucket variables (columns) scaled to unit variance and the default confidence level set to 95%. The number of principal components was set to a minimum explained variance of 95%. The Kruskal-Wallis significance analysis of variables, based on a modified procedure of Goodpaster et al., was performed with a confidence level of 95% [36]. The p-value was Bonferroni-corrected, i.e., the significance level of 0.05 was divided by the number of buckets, and buckets were indicated as significant if p < 0.00027322 (0.05/183) for the 1D 1H NOESY spectra, if p < 0.0016667 (0.05/30) for the PSYCHE spectra, or if p < 0.00125 (0.05/40) for the ASAP-HSQC spectra. The buckets with their corresponding p-values for each differentiation are shown in the supplementary information (Tables S1-S9). The PCA score and loading plots were created using the software Origin 2019 (OriginLab, Northampton, MA, USA). For building classification models, the software Matlab R2019a/b (The Mathworks, Inc., Natick, MA, USA) including the Classification Learner App was used. A total of three binary classification models were tested for every type of NMR spectrum. The normalized bucket table was exported from Amix to Matlab. Afterwards, a repeated nested CV was performed [33,35]. First, the sample set (e.g., CN/DE) was randomly separated into five equal parts. In each part, the samples were stratified by origin.
Four of these parts were combined to create a training set, with which a model was obtained by training the linear support vector machine classifier in combination with an internal five-fold CV. In the internal five-fold CV, which is implemented in the software Matlab, the division into training and test sets was performed randomly. The model obtained was validated using the one part that was left out (test set). This process was carried out five times (five-fold outer CV) to assess the model performance [33], so that each part was used as a test set once. The entire procedure was in turn repeated five times with randomly remixed parts. The accuracies of the five models obtained for a classification by the repeated nested CV were averaged and reported with the corresponding deviation. For the classification based on the different types of NMR spectra, the same composition of training and test sets was used in each case. The final result of a model was presented as the mean in a confusion matrix. For classification, the support vector machine classifier with a linear kernel function and the box constraint level set to 1 was used, since this led to the best results regarding the accuracy of the models.
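The partitioning scheme of the repeated outer CV described above can be illustrated as follows. This is a minimal sketch in Python and not the Matlab workflow used in the study: it only generates the stratified training and test splits, while the model training itself (linear SVM with the internal five-fold CV) is left out. All function names, the round-robin stratification, and the random seed are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds, rng):
    """Split sample indices into n_folds parts with similar class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def repeated_outer_splits(labels, n_folds=5, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for a repeated outer CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        folds = stratified_folds(labels, n_folds, rng)  # randomly remixed parts
        for fold_number, test in enumerate(folds):
            train = [i for k, part in enumerate(folds) if k != fold_number for i in part]
            # A classifier would be trained on `train` (with its own internal CV)
            # and evaluated on `test` at this point.
            yield repeat, fold_number, sorted(train), sorted(test)


# Example with the CN/DE sample counts from the text (15 Chinese, 49 German samples).
labels = ["CN"] * 15 + ["DE"] * 49
splits = list(repeated_outer_splits(labels))
print(f"{len(splits)} train/test splits (5 repeats x 5 outer folds)")
```

Using one seed for all spectrum types would give identical training and test sets for the 1D 1H NOESY, PSYCHE, and ASAP-HSQC bucket tables, which mirrors the requirement that the same composition of sets is used for each type of spectrum.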

Conclusions
This study demonstrates the utility of different NMR experiments (1D 1H NOESY, PSYCHE, and 1H-13C ASAP-HSQC) for the classification of walnut samples of different geographical origin (France, China, and Germany) in an untargeted metabolomics approach using the mid-polar acetonitrile/methanol (1:1) extract. Multivariate analysis such as PCA and classification using linear SVM showed that the highest classification accuracies can be obtained by using 1D 1H NOESY spectra. Compared to previous studies, it was shown that the accuracies of the classification models based on two different extraction methods were similar [32]. Similarities were also found in the incorrectly classified samples. Furthermore, the total acquisition time of 1D NOESY spectra is the lowest (~17 min) in comparison to the other types of spectra. High classification accuracies can also be obtained using homonuclear decoupled 1H NMR spectra (PSYCHE), but disadvantages arise from the overall acquisition time of PSYCHE spectra (~71 min) and the complicated processing technique, which can additionally lead to the occurrence of artefacts in the spectra. Additionally, classification accuracies decrease slightly using the PSYCHE or ASAP-HSQC spectra, which can be expected from the loss in signal-to-noise. However, the use of 2D 1H-13C HSQC spectra for subsequent multivariate analysis is advantageous, since a clearer separation of the signals is possible due to the additional 13C frequency domain. This also contributes to a greater distinction of the buckets used for classification, since they do not overlap with signals of other highly abundant metabolites, thus reducing the risk of compensation of bucket intensities, which could lead to a loss of information. Furthermore, the use of 2D 1H-13C HSQC spectra may simplify the identification of the metabolites, since information about the 13C chemical shifts is obtained simultaneously.
In general, reliable and robust models were obtained for the classification of walnut samples from Germany, France, and China. In future studies, data fusion will be conducted to improve the accuracies of the models. Another focus will be the use of 2D 1H-13C HSQC experiments for simple identification of metabolites from walnut.

Supplementary Materials:
The following are available online at https://www.mdpi.com/2218-1989/11/1/39/s1, Tables S1-S9: Buckets with corresponding p-values for each differentiation (DE/CN; FR/DE; FR/CN) using 1D 1H NOESY, PSYCHE and ASAP-HSQC spectra. Table S10: Comparison of the accuracies of classification models based on the mid-polar extract and the polar extraction method from previous studies [32]. Table S11: Information about walnut samples. S1-S3: List of variable sized buckets used for building the classification models. Figure S1: 1H NOESY spectra of the stability measurement. Figures S2 and S3: PCA score and loading plot of the differentiation of walnut samples of two countries each (FR/CN; FR/DE). Figure S4: Confusion matrices and PCA score plots using three-class models. Figure S5: ASAP-HSQC spectrum of a walnut extract acquired with 256 scans.

Data Availability Statement:
The data presented in this study are openly available in MetaboLights at doi:10.1093/nar/gkz1019, reference number MTBLS2341 [37].
Rieser Nuss GmbH & Co. KG for providing authentic nut samples. We thank Edris Riedel, Doreen Teske, Maike Arndt and Torben Segelke for sample acquisition and additionally Johanna Härdter for sample preparation. The authors thank Vera Priegnitz and Claudia Wontorra for their support in sample measurement.