1D “Spikelet” Projections from Heteronuclear 2D NMR Data—Permitting 1D Chemometrics While Preserving 2D Dispersion

Nuclear magnetic resonance (NMR) spectroscopy is a powerful tool for the non-targeted metabolomics of intact biofluids and even living organisms. However, spectral overlap can limit the information that can be obtained from 1D 1H NMR. For example, magnetic susceptibility broadening in living organisms prevents any metabolic information being extracted from solution-state 1D 1H NMR. Conversely, the additional spectral dispersion afforded by 2D 1H-13C NMR allows a wide range of metabolites to be assigned in-vivo in 13C enriched organisms, as well as a greater depth of information for biofluids in general. As such, 2D 1H-13C NMR is becoming more and more popular for routine metabolic screening of very complex samples. Despite this, there are only a very limited number of statistical software packages that can handle 2D NMR datasets for chemometric analysis. In comparison, a wide range of commercial and free tools are available for analysis of 1D NMR datasets. Overtime, it is likely more software solutions will evolve that can handle 2D NMR directly. In the meantime, this application note offers a simple alternative solution that converts 2D 1H-13C Heteronuclear Single Quantum Correlation (HSQC) data into a 1D “spikelet” format that preserves not only the 2D spectral information, but also the 2D dispersion. The approach allows 2D NMR data to be converted into a standard 1D Bruker format that can be read by software packages that can only handle 1D NMR data. This application note uses data from Daphnia magna (water fleas) in-vivo to demonstrate how to generate and interpret the converted 1D spikelet data from 2D datasets, including the code to perform the conversion on Bruker spectrometers.


Introduction
Heteronuclear 2D NMR provides superior spectral dispersion over one-dimensional (1D) datasets. The 1D 1 H spectra have been reported to have a peak capacity of~3000, while that of 2D 1 H-13 C approaches 2,000,000 [1]. This becomes very important for the metabolomics studies of very complex samples that show poor resolution in 1D NMR (due to a combination of magnetic susceptibility distortions and sample complexity). As such, the importance of 2D NMR as a routine tool to extract otherwise inaccessible NMR information is continuing to rise [2][3][4][5][6][7][8].
The 1 H- 13 C HSQC data, one of the 2D NMR techniques, are relatively well resolved and can be used to identify a range of metabolites in 13 C labeled samples or concentrated non-enriched samples. The question now becomes: is it possible to preserve all the information from a 2D 1 H-13 C HSQC spectrum while generating a standard 1D format compatible with chemometrics and statistical packages? In the present application note, we describe the generation of such a format-1D spikelet-and demonstrate its applicability using both principal component analysis (PCA) and quantile plots (a useful visualization tool with the ability to identify changes within 1D datasets as color gradients) as examples.

Results and Discussion
The excellent spectral dispersion of HSQC helps metabolite discrimination, however the 2D format makes it incompatible with some chemometric approaches and software packages. Figure 1 compares the full 1 H- 13 C HSQC (1A), conventional 1 H projection (1B), and 13 C projection (1C) against the information rich complete projection (1D) (termed here as "spikelet projection") for in vivo Daphnia magna. Maryam Tabatabaei Anaraki et al. [20] have discussed the impacts of anoxic stress on Daphnia magna. In the study, it was demonstrated that the separation along PC1 arises from the accumulation of lactic acid, whereas the separation along PC2 arises from slight differences in the lipids between the 3 different populations of Daphnia magna. In the spikelet approach, complete 1 H spectra extracted for each point in f1 ( 13 C) in the processed spectrum are concatenated into a continuous profile, and then the corresponding 13 C chemical shifts are projected onto the axis of the spikelet projection. The result is the overall profile of the spikelet projection and is analogous to the conventional 13 C projection ( Figure 1C), but for each 13 C plane, the complete 1 H spectrum is also preserved (see Figure 1D). Figure 2 demonstrates the concept of 1D spikelet projection with an expansion of the area around 57 ppm. Looking at the dashed line in Figure 2A (57 ppm carbon plane), three corresponding 1 H signals from Betaine/Choline, Arginine/Glutamic acid, and Phenylalanine/Tyrosine are present. Figure 2B shows the 57 ppm slice from the full HSQC, in which all 3 signals are well resolved. Figure 2C shows the exact same slice from the spikelet projection ( Figure 2D). The 1 H information content is identical, with the only difference being the label on the axis ( 13 C vs 1 H). For cross checking assignments, the slice from the spikelet projection indicates where to look in the 2D (in this case 57 ppm carbon), after which proton information can be read from the original 2D data in the normal manner. It is important to reiterate that the advantage of the spikelet projections is that they are amenable to 1D data processing approaches. As such, methods that may only be available for 1D formats can be applied to information extracted from the 2D datasets that otherwise may not be possible. The complete code for converting 2D NMR data into the 1D spikelet format is available in Appendix A. The code is executed as a standard AU program in Bruker's Topspin package.

Principle Component Analysis (PCA) of the Spikelet Projections
Here, we simply demonstrate that the same result is obtained when a chemometrics approach is applied to the 1D spikelet data and 2D NMR data. Therefore, PCA analysis of NMR data of living

Principle Component Analysis (PCA) of the Spikelet Projections
Here, we simply demonstrate that the same result is obtained when a chemometrics approach is applied to the 1D spikelet data and 2D NMR data. Therefore, PCA analysis of NMR data of living

Principle Component Analysis (PCA) of the Spikelet Projections
Here, we simply demonstrate that the same result is obtained when a chemometrics approach is applied to the 1D spikelet data and 2D NMR data. Therefore, PCA analysis of NMR data of living Daphnia magna under decreasing levels of oxygenation is performed. Figure 3 compares PCA analysis using (A) the full 2D datasets, and (B) the spikelet projections, both performed in AMIX. The data from 1 H-13 C HSQC were obtained from 99% enriched 13 C Daphnia magna (water fleas) with twenty organisms in the tube per replicate. HSQC spectra are collected every 1.5 h over a period of 24 h.
Daphnia magna under decreasing levels of oxygenation is performed. Figure 3 compares PCA analysis using (A) the full 2D datasets, and (B) the spikelet projections, both performed in AMIX. The data from 1 H-13 C HSQC were obtained from 99% enriched 13 C Daphnia magna (water fleas) with twenty organisms in the tube per replicate. HSQC spectra are collected every 1.5 h over a period of 24 h. For the control points (green), the organisms are supplied food and oxygenated water throughout the entire experiment, and the organisms are maintained in aerobic conditions (analogous to a conventional "control" dataset). The red dots represent a condition when the flow is stopped, and the organisms undergo anoxic stress that increases over time. Over time, as the anoxic stress increases (the red dots representing the stressed condition), distinct deviation from the control cluster (green dots) appears, due to variation caused by the build-up of lactic acid in the organisms' metabolome (see quantile plot in Figure 4 for a simple way to visualize the lactic acid). With both the 2D data and 1D spikelet projections, the PCA shows near identical profiles ( Figure 3A,B). There are some very slight variations between the PCA plots, likely due to the way Amix internally handles noise in the 1D and 2D during processing. However, the same information is obtained from each PCA plot. This is expected, considering the data being fed into the program are indeed identical and just displayed in a different format.

Quantile Plot from the Spikelet Projections
Another interesting approach to visualize the change or variation across a time series is to use a quantile plot. To our knowledge, producing a quantile plot from the 2D NMR data is not possible with currently available software. Even if it were possible, the envelope describing the variation would be superimposed across a contour profile and would require visualization in 3D from a large number of angles to appreciate. However, by reducing the 2D data to 1D, a simple quantile plot is easy to construct (see Figure 4). Here, blue corresponds to signals that do not vary significantly across the dataset, whereas signals in the yellow, orange, and red vary the most across the data. It is For the control points (green), the organisms are supplied food and oxygenated water throughout the entire experiment, and the organisms are maintained in aerobic conditions (analogous to a conventional "control" dataset). The red dots represent a condition when the flow is stopped, and the organisms undergo anoxic stress that increases over time. Over time, as the anoxic stress increases (the red dots representing the stressed condition), distinct deviation from the control cluster (green dots) appears, due to variation caused by the build-up of lactic acid in the organisms' metabolome (see quantile plot in Figure 4 for a simple way to visualize the lactic acid). With both the 2D data and 1D spikelet projections, the PCA shows near identical profiles ( Figure 3A,B). There are some very slight variations between the PCA plots, likely due to the way Amix internally handles noise in the 1D and 2D during processing. However, the same information is obtained from each PCA plot. This is expected, considering the data being fed into the program are indeed identical and just displayed in a different format.

Quantile Plot from the Spikelet Projections
Another interesting approach to visualize the change or variation across a time series is to use a quantile plot. To our knowledge, producing a quantile plot from the 2D NMR data is not possible with currently available software. Even if it were possible, the envelope describing the variation would be superimposed across a contour profile and would require visualization in 3D from a large number of angles to appreciate. However, by reducing the 2D data to 1D, a simple quantile plot is easy to construct (see Figure 4). Here, blue corresponds to signals that do not vary significantly across the dataset, whereas signals in the yellow, orange, and red vary the most across the data. It is very simple to assign the two main peaks of lactic acid, which are known to be the primary stress response of Daphnia under anoxic conditions. As such, quantile plots hold the potential to quickly evaluate what is changing across a series of data. While not normally applicable to 2D NMR, the 1D spikelet permits the implementation in an easy fashion. This is just a very simple example of how converting a 2D NMR data set into a 1D format may also open up new potential for analyses not directly applicable to the 2D data itself.
Metabolites 2019, 9, x 5 of 11 very simple to assign the two main peaks of lactic acid, which are known to be the primary stress response of Daphnia under anoxic conditions. As such, quantile plots hold the potential to quickly evaluate what is changing across a series of data. While not normally applicable to 2D NMR, the 1D spikelet permits the implementation in an easy fashion. This is just a very simple example of how converting a 2D NMR data set into a 1D format may also open up new potential for analyses not directly applicable to the 2D data itself.

Daphnia and Algae Culturing
Unlabelled Daphnia magna were cultured as previously described [21]. The neonates were isolated and exclusively fed a diet of 13 C-enriched algae Chlamydomonas reinhardtii (Silantes, Munich, Germany) three times a week, during which 50-60% of the water was replaced everyday. The algae Chlamydomonas reinhardtii (not isotopically enriched), cultured as previously described [21], was added to the water reservoir to feed D. magna during the NMR experiments, in order to feed the organisms inside the NMR magnet without any additional signals from the food [6,7].

NMR Spectroscopy
All NMR experiments were performed using a Bruker Avance III HD 500 MHz ( 1 H) NMR spectrometer with a 5-mm three channel 1 H-13 C-15 N TCI Prodigy™ cryoprobe fitted with an actively shielded z-gradient. An external D2O lock bulb (~5 µL) was used for all experiments to keep the lock solvent separate from the organisms. The HSQC experiment was performed via double INEPT transfer using sensitivity improvement, States-TPPI phase cycling, and Garp4 decoupling. The SPR-W5-WATERGATE sequence was incorporated into the pulse-sequence to suppress the water signal.

Daphnia and Algae Culturing
Unlabelled Daphnia magna were cultured as previously described [21]. The neonates were isolated and exclusively fed a diet of 13 C-enriched algae Chlamydomonas reinhardtii (Silantes, Munich, Germany) three times a week, during which 50-60% of the water was replaced everyday. The algae Chlamydomonas reinhardtii (not isotopically enriched), cultured as previously described [21], was added to the water reservoir to feed D. magna during the NMR experiments, in order to feed the organisms inside the NMR magnet without any additional signals from the food [6,7].

NMR Spectroscopy
All NMR experiments were performed using a Bruker Avance III HD 500 MHz ( 1 H) NMR spectrometer with a 5-mm three channel 1 H-13 C-15 N TCI Prodigy™ cryoprobe fitted with an actively shielded z-gradient. An external D 2 O lock bulb (~5 µL) was used for all experiments to keep the lock solvent separate from the organisms. The HSQC experiment was performed via double INEPT transfer using sensitivity improvement, States-TPPI phase cycling, and Garp4 decoupling. The SPR-W5-WATERGATE sequence was incorporated into the pulse-sequence to suppress the water signal [21]. A total of 128 increments were collected, each with 36 scans, 1024 time domain points, and recycle delay of 1 s. The interpulse delay during the INEPT transfer was based on 1 J HC of 145 Hz.
Data were processed with a qsine function shifted by 90 • in both dimensions. Data were zero filled to 2048 points in F2, while F1 was filled to 1024 points along with 32 coefficients of forward linear prediction. Spectra were calibrated against a range of known compounds in the Bruker Biofluid Reference Compound Database (v.2.0.0-2.0.5) prior to matching.

In Vivo NMR Spectroscopy
Fully 13 C labelled D. magna were cultured by feeding them a 13 C labelled diet from birth. Twenty of the 10-day-old Daphnia were placed in the 5 -mm NMR tube for each experiment, and 2D datasets were collected every 90 min over a period of 20 h. Three "control" datasets were obtained under continuous flow and the NMR tube was connected to a flow system, as previously described [13]. The temperature of the flow system and NMR was maintained at 10 • C for the duration of the experiment. The HPLC pump allowed for circulation of fluid containing food and water. Three "anoxic" experiments were conducted under no flow condition, and the organisms experienced anoxic stress that increased over time. An excess of 12 C food was added (Chlamydomonas reinhardtii) to the NMR tube at the beginning of each anoxic experiment to ensure Daphnia will have enough food during the no flow conditions.

Data and Statistical Analysis
Principal component analysis (PCA) was performed on 2D and 1D spikelet NMR spectra using an Analysis of Mixtures (AMIX) statistics package (version 3.9.14, Bruker BioSpin). The 2D NMR spectra were binned using rectangular bins of 0.24 ppm in the 13 C dimension (5 to 150 ppm) and 0.1 ppm in the 1 H dimension (−2 to 12.5 ppm). The 13 C buckets were calculated such that there was one bucket per processed point in F1. This gave rise to~87,600 bins per 2D spectrum. The 1D concatenated spikelet data were binned with 0.1 ppm to give the same number of buckets. The approach using 2D NMR for PCA analysis has been introduced elsewhere [8]. The region 4.70-4.85 ppm along the F2 ( 1 H axis) is excluded due to the residual H 2 O/HOD signals present. The "sum of intensities" was used as the integration mode and the scaling was set to "total intensity". The quantile plot was calculated using AssureNMR (version 2.1, Bruker Biospin, Rheinstetten, Germany), using the 1D spikelet bucket table described above.

Conclusions
This application note describes a simple approach to convert 2D 1 H-13 C HSQC NMR data into a 1D format (spikelet projection) that preserves the full information content and spectral dispersion. In turn, the approach potentially permits the larger array for tools that are available, or more developed for 1D NMR, to be applied, or where the visualization of higher dimensional data is challenging. The approach may have application beyond metabolomics. For example, processing of DOSY-HSQC data would require software to display the data in a 3D cube [22]. Converting the 2D HSQC data to a 1D spikelet format would still permit a conventional plot of chemical shift vs. diffusion to be performed. Similar arguments apply for the fitting HSQC collected with a series of T1 and T2 weighting for relaxation analysis, which is often performed in biomolecular NMR [23]. Conversion of the 2D HSQC format to a 1D format should permit additional analysis of the data when software capabilities required to work directly with the 2D format are not available. Furthermore, the conversion of 2D to 1D format may facilitate easier storage in universal formats that have recently been recommended for databased NMR-based metabolomics data [24]. This is particularly useful for the metabolomics in very complex mixtures or in-vivo data, where the additional dispersion of 2D NMR is required to reduce signal overlap.

Conflicts of Interest:
The authors declare no competing financial interests.