Updates and Original Case Studies Focused on the NMR-Linked Metabolomics Analysis of Human Oral Fluids Part I: Emerging Platforms and Perspectives

1H NMR-based metabolomics analysis of human saliva, other oral fluids, and/or tissue biopsies serves as a valuable technique for the exploration of metabolic processes, and when associated with ‘state-of-the-art’ multivariate (MV) statistical analysis strategies, provides a powerful means of identifying characteristic metabolite patterns, which may serve to differentiate between patients with oral health conditions (e.g., periodontitis, dental caries, and oral cancers) and age-matched healthy controls. This approach may also be employed to explore such discriminatory signatures in the salivary 1H NMR profiles of patients with systemic diseases, and to date, these have included diabetes, Sjögren’s syndrome, cancers, neurological conditions such as Alzheimer’s disease, and viral infections. However, such investigations are complicated by a large number of serious inconsistencies between the studies performed by independent research groups globally; these include differing protocols and routes for saliva sample collection (e.g., stimulated versus unstimulated samples), their timings (particularly the oral activity abstention period involved, which may range from 1 to 12 h or more), and methods for sample transport, storage, and preparation for NMR analysis, not to mention a very wide variety of demographic variables that may influence salivary metabolite concentrations, notably the age, gender, ethnic origin, salivary flow-rate, lifestyles, diets, and smoking status of participant donors, together with their exposure to any other possible confounding environmental factors. In view of the explosive increase in reported salivary metabolomics investigations, in this update, we critically review a wide range of key considerations for the successful performance of such experiments.
These include the nature, composite sources, and biomolecular status of human saliva samples; the merits of these samples as media for the screening of disease biomarkers, notably their facile, unsupervised collection; and the different classes of such metabolomics investigations possible. Also encompassed is an account of the history of NMR-based salivary metabolomics; our recommended regimens for the collection, transport, and storage of saliva samples, along with their preparation for NMR analysis; frequently employed pulse sequences for the NMR analysis of these samples; the supreme resonance assignment benefits offered by homo- and heteronuclear two-dimensional NMR techniques; deliberations regarding salivary biomolecule quantification approaches employed for such studies, including the preprocessing and bucketing of multianalyte salivary NMR spectra, and the normalization, transformation, and scaling of datasets therefrom; salivary phenotype analysis, featuring the segregation of a range of different metabolites into ‘pools’ grouped according to their potential physiological sources; and lastly, future prospects afforded by the applications of LF benchtop NMR spectrometers for direct evaluations of the oral or systemic health status of patients at clinical ‘point-of-contact’ sites, e.g., dental surgeries. This commentary is then concluded with appropriate recommendations for the conduct of future salivary metabolomics studies. Also included are two original case studies featuring investigations of (1) the 1H NMR resonance line-widths of selected biomolecules and their possible dependence on biomacromolecular binding equilibria, and (2) the combined univariate (UV) and MV analysis of saliva specimens collected from a large group of healthy control participants in order to potentially delineate the possible origins of biomolecules therein, particularly host- versus oral microbiome-derived sources. 
In a follow-up publication, Part II of this series, we conduct censorious reviews of reported observations acquired from a diversity of salivary metabolomics investigations performed to evaluate both localized oral and non-oral diseases. Perplexing problems encountered with these again include those arising from sample collection and preparation protocols, along with 1H NMR spectral misassignments.


Introduction to NMR-Based Salivary/Oral Fluid Metabolomics
The multicomponent 1H nuclear magnetic resonance (NMR) or liquid chromatographic-mass spectrometric (LC-MS) analysis of biofluids and/or tissues offers a high level of potential with regard to the investigation of metabolic processes, and when coupled with contemporary and/or newly developed multivariate (MV) data analysis techniques, serves as an extremely powerful means of probing, for example, the biochemical basis of human disease etiology, along with therapeutic benefits or toxicological effects induced by administered drugs or further xenobiotics [1]. Indeed, this bioanalytical approach is generally classified as metabolomics and has been extensively employed in a very wide range of biomedical investigations, diagnostic, prognostic, or otherwise. The statistical analysis of such MV analytical data is generally based on either (1) supervised pattern recognition or (2) unsupervised exploratory data analysis techniques. The first of these classes of analyses commonly incorporates models for MV discriminant function analysis, which is employed in order to test the significance of any key 'biomarker' variables (i.e., metabolite concentrations, spectral 'bucket' intensities, or further measures related to biomolecule levels) with regard to their assignment to differential classification groups (e.g., control vs. disease groups or control (untreated) vs. treatment-receiving disease groups); these techniques are supervised, i.e., they require prior knowledge of the group membership of samples in a 'training' modeling dataset. The second class, however, primarily involves principal component analysis (PCA), a strategy that is unsupervised, and permits the employment of inter-related descriptors to detect dataset classification clusterings; it is also valuable for the preliminary identification and removal of statistical 'outlier' samples.
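The unsupervised screening step described above can be sketched in a few lines. The following is a minimal illustration only, assuming a small synthetic matrix of spectral 'bucket' intensities (rows are samples, columns are buckets); the function names `pc1_scores` and `flag_outliers`, the power-iteration shortcut, and the 3-SD cut-off are all assumptions made for this example and are not taken from the studies reviewed here.

```python
import math
import random
import statistics

def mean_center(X):
    """Subtract each column (bucket) mean from every row (sample)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def pc1_scores(X, n_iter=200, seed=0):
    """Scores on the first principal component, found by power iteration."""
    Xc = mean_center(X)
    n, p = len(Xc), len(Xc[0])
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(p)]  # arbitrary non-zero start vector
    for _ in range(n_iter):
        t = [sum(r[j] * v[j] for j in range(p)) for r in Xc]        # t = Xc v
        w = [sum(t[i] * Xc[i][j] for i in range(n)) for j in range(p)]  # w = Xc^T t
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]                                   # unit loading vector
    return [sum(r[j] * v[j] for j in range(p)) for r in Xc]

def flag_outliers(scores, z_cut=3.0):
    """Indices of samples whose PC1 score lies more than z_cut SDs from the mean."""
    mu, sd = statistics.mean(scores), statistics.stdev(scores)
    return [i for i, s in enumerate(scores) if abs(s - mu) > z_cut * sd]

# Synthetic data (hypothetical): 20 samples x 4 buckets, sample 7 made aberrant.
rng = random.Random(1)
X = [[10 + rng.gauss(0, 0.5) for _ in range(4)] for _ in range(20)]
X[7] = [10.0, 10.0, 25.0, 10.0]
outliers = flag_outliers(pc1_scores(X))
print(outliers)
```

In practice, a dedicated PCA routine with appropriate scaling would replace the power iteration, and more than one component would normally be inspected before any sample is excluded.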
Combinations of multianalyte biofluid 1H NMR analysis with the above MV metabolomics approaches clearly offer major advantages (diagnostic or otherwise), especially in view of the relatively limited amount of molecular information derivable from the direct computer-visual examination of the spectral profiles acquired by trained, expert staff, which is costly, labor-intensive, and time-consuming; such profiles often contain information relating to 100 or more biomolecules (including the individual determination of many metabolite levels).
Intriguingly, saliva represents an ideal biofluid medium for such metabolomics investigations, particularly since, when secreted from the salivary and mucous glands, it contains no invading bacteria and only very low concentrations of metabolic agents, which may represent 'markers' of selected oral diseases. Indeed, in the oral environment, microorganisms located in tooth plaque, gingival crevices, or soft tissues such as gums conduct a range of metabolic functions linked to their growth and prevalence, and hence human saliva contains many excreted catabolites (e.g., acetate, formate, lactate, propionate, n- and iso-butyrates, n- and iso-valerates, etc.) that are unique and dependent on the infiltration, activity, and preponderance of bacterial flora therein. Furthermore, elevated salivary concentrations of markers of inflammatory processes occurring within soft tissue would also be anticipated, for example, ground substance glycosaminoglycans and low-molecular-mass saccharides derived therefrom, plus malodorous amines, etc.
To date, we have presented results concerning the applications of high-field proton (1H) NMR spectroscopy to the detection and quantification of biomolecules present in a variety of complex biological fluids, and some examples from our work in the dental/oral health research area are available in [2][3][4][5]. This technique offers many advantages over alternative [...] Therefore, in this update, we continue in Section 2 by delineating the sources and composition of human saliva samples, with a particular emphasis on whole mouth saliva (WMS) and the relative contributions of the host system and the oral microbiome towards its molecular composition. Section 3 provides an outline of the many advantages offered by the use of saliva as a biofluid for the screening of potential biomarkers, not least because of its ease of collection with only minimal supervision required; possible disadvantages of its use as an analytical matrix are also relayed. Section 4 explores the different types of salivary metabolomics investigations possible, whilst Section 5 provides a brief outline of the history of NMR-based salivary metabolomics and our group's role in these early developments. Section 6 then describes our own recommended protocols for the collection, transport, and storage of WMS specimens, and their preparation for NMR analysis, with details regarding precautions that should be taken against the adverse generation of bioanalytical artefacts during these episodes; we also compare these regimens with those instigated by other researchers.
Section 7 reviews pulse sequences commonly employed for the 1H NMR spectroscopic analysis of WMS supernatants and considers some perspectives for the future application of these techniques; it also presents the first of two case studies, which involves an investigation of salivary metabolite resonance linewidths and the plausibility of their possible dependence on salivary protein concentrations and WMS protein-metabolite equilibria. Section 8 features a review of the bioanalytical assignment advantages offered by two-dimensional NMR strategies, both homo- and heteronuclear. In Section 9, we discuss the quantification of WMS metabolites using NMR techniques for MV metabolomics data analysis, including spectral bucketing, preprocessing, data normalization, and scaling options available, together with dimensionality reductions of large or very large 1H NMR-linked WMS supernatant datasets. Section 10 offers a full outline of salivary phenotype analysis and for the first time reports a second case study involving a large number of WMS samples (n = 5 each for no fewer than 48 healthy control human participants); univariate (UV) considerations of the between- and within-participant components of variance for each of 31 individual 1H NMR-determined metabolites were made, followed by the application of MV models to distinguish between the magnitudes of differential pools of these analytes for each participant. We then applied factor analysis to determine component loadings of metabolites in an attempt to segregate a total of four orthogonally distinct pools of within-component-correlated salivary biomolecules on the basis of their physiological sources (e.g., host- or oral microbiome-derived), and/or metabolic pathway involvements.
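As an illustration of what a between- and within-participant variance decomposition involves for a single metabolite, the sketch below applies the standard balanced one-way random-effects ANOVA estimators. This is an assumption made for illustration: the exact model used in Case Study 2 is described in Section 10, not here, and the function name `variance_components` and the toy data are hypothetical.

```python
def variance_components(groups):
    """Balanced one-way random-effects ANOVA for ONE metabolite.

    groups: one list per participant, each holding the same number of
    replicate concentrations (e.g., 5 samples per participant).
    Returns (between-participant variance, within-participant variance).
    """
    k = len(groups)       # number of participants
    n = len(groups[0])    # replicate samples per participant
    if any(len(g) != n for g in groups):
        raise ValueError("a balanced design is assumed in this sketch")
    means = [sum(g) / n for g in groups]
    grand = sum(means) / k
    ss_between = n * sum((m - grand) ** 2 for m in means)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (k * (n - 1))
    var_within = ms_within
    # A negative estimate is truncated at zero, as is conventional.
    var_between = max(0.0, (ms_between - ms_within) / n)
    return var_between, var_within

# Toy data (hypothetical): 3 participants, 2 replicate samples each.
var_b, var_w = variance_components([[1, 3], [5, 7], [9, 11]])
print(var_b, var_w)
```

For the design quoted above, `groups` would hold 48 lists of 5 values, and the function would be called once for each of the 31 metabolites.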
Section 11 is then primarily concerned with the future applications and prospects offered by the employment of low-field (LF) non-stationary benchtop NMR spectrometers for direct assessments of the oral health status of dental patients at clinical 'point-of-contact' sites, including dental surgeries; the advantages and limitations of this novel approach are discussed in detail. Finally, Section 12 concludes this update and review, and provides relevant recommendations for the future performance of NMR-based metabolomics investigations of human saliva.
In a follow-up publication, Part II of this series critically reviews reported observations acquired from metabolomics studies focused on chemopathological evaluations of both oral and non-oral (systemic) diseases, the former (Section 2) focused on localized oral disorders featuring periodontal, dental caries, and oral cancer conditions, and the latter (Section 3) drawing on salivary metabolomics analysis reports on types 1 and 2 diabetes, cardiovascular diseases, Sjögren's syndrome, and a series of cancers and neurological conditions, along with virus infections. The viral disorders heading includes a consideration of saliva as a possible biofluid medium for the metabolomics testing of SARS-CoV-2 infection and its related pathophysiological effects, together with HIV and acute sore-throat conditions.
Although the authors also regularly employ high-resolution NMR analysis for the 'speciation' of metal ions in human saliva, i.e., determinations of the precise molecular nature of their complexes or chelates in this biofluid [5,13], information that plays an important role in relation to their cellular uptakes and potential toxicities (where relevant), [...] status, for example. According to Gardner et al. [20], WMS citrate, lactate, and urea act as metabolites arising from the salivary host, although it should be noted that salivary lactate levels are often much greater than those of human plasma (reported mean, within-participant WMS supernatant levels of up to 20 mmol./L have been observed [3], although one lactate concentration was actually >60 mmol./L), whereas those of blood plasma are ca. 1.5 mmol./L [21], and hence it appears very unlikely that it is derived exclusively from a host source. Also of importance, we only very rarely detect citrate in the 1H NMR profiles of human WMS supernatants, and, therefore, if it is present therein only at levels below the lower limit of detection (LLOD) value, which is ca. ≤5 µmol./L at an operating frequency of 600 MHz, then its value as a salivary 1H NMR biomarker for host source-derived biomolecules is extremely limited. However, WMS urea, which is readily detectable and quantifiable by 1H NMR analysis, appears to represent a valuable marker for such a host origin. Indeed, consistent with the observations of Gardner et al. [20], we find only very limited correlations between WMS urea concentrations and those of all other metabolites monitored (r values predominantly ≤0.25, as detailed in Section 10), including organic acid anions, amino acids, and amines, along with a range of further biomolecules. Furthermore, Gardner et al. [20] also reported quite strong negative correlations between both the saccharolytic and proteolytic fractions of WMS bacterial counts.
A typical high-field (600 MHz) 1H NMR spectral profile of a WMS supernatant sample is shown in Figure 1, and a corresponding listing of the identities of all salivary metabolite resonance assignments made, along with their respective chemical shift values and coupling patterns, is available in Table 1. For the spectrum shown, there is a grand total of 131 resonances, to which 98 metabolites were assigned (albeit 19 signals tentatively so), with four remaining unassigned. An estimated ≥90% of all signals found, including those affected by superimpositions with other biomolecule resonances, are included in Table 1.

Table 1. Assignments of resonances in the 600 MHz 1H NMR spectra of supernatants of WMS samples (chemical shift (δ) values and resonance coupling patterns are also provided). * Indicates tentative assignment.

With the exception of the tentative ones indicated, assignments for the majority of these resonances were confirmed by (1) detailed considerations of chemical shift values, coupling patterns, and coupling constants; (2) comparisons with established literature and available database values (with allowances being made for the pH dependences of selected resonances, where required); (3) the application of the homonuclear two-dimensional (2D) COSY and TOCSY NMR techniques; and (4) standard additions of solutions containing authentic, pure analytes to WMS supernatant (WMSS) samples, where considered appropriate. All chemical shift values were referenced to TSP (δ = 0.00 ppm), but a further check on these was achieved by use of the WMSS sample -CH3 group resonances of lactate (δ = 1.330 ppm) and alanine (δ = 1.486 ppm), and, hence, these served as secondary endogenous internal chemical shift references.
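The secondary referencing check described above amounts to comparing where the lactate and alanine -CH3 doublets are observed with where they are expected, and shifting the axis by the difference. A minimal sketch follows, assuming hypothetical helper names (`referencing_offset`, `recalibrate`) and made-up observed positions; only the two reference values are taken from the text.

```python
# Reference positions quoted in the text (ppm, relative to TSP at 0.00 ppm).
SECONDARY_REFS = {"lactate_CH3": 1.330, "alanine_CH3": 1.486}

def referencing_offset(observed):
    """Mean (observed - reference) offset in ppm over the supplied signals."""
    diffs = [observed[name] - SECONDARY_REFS[name] for name in observed]
    return sum(diffs) / len(diffs)

def recalibrate(shifts_ppm, offset):
    """Apply the offset to a list of chemical shifts (rounded to 4 d.p.)."""
    return [round(s - offset, 4) for s in shifts_ppm]

# Hypothetical spectrum in which every signal sits 0.008 ppm too far downfield.
observed = {"lactate_CH3": 1.338, "alanine_CH3": 1.494}
offset = referencing_offset(observed)
corrected = recalibrate([3.008, 1.338], offset)
print(offset, corrected)
```

If the two endogenous references were to disagree with each other by more than the digital resolution, that would point to a sample-specific effect (e.g., pH) rather than a simple referencing error, and a uniform shift of this kind would not be appropriate.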
As previously reported [2][3][4], these typical spectral profiles provide a wealth of valuable metabolic information concerning the nature and salivary concentrations of an extensive range of biomolecules, both host- and oral microbiome-derived, including many organic acid anions, amino acids (both aliphatic and aromatic), carbohydrates, malodorous amines such as trimethylamine and putrescine, choline and its derivatives, N-acetyl sugars, and N-acetyl amino acids, etc. Also detectable are a number of broad resonances that presumably arise from salivary proteins, most notably those centered at δ = 6.85 and 7.55 ppm. These signals are therefore likely to be assignable to protein tyrosine, phenylalanine, tryptophan, and/or histidine residues, and Gardner et al. [20] also found these resonances in both WMS supernatant and parotid saliva samples. In view of this observation, it appears very likely that such macromolecules arise from a host and not a microbial source. Therefore, the intensities of these signals, and hence the concentrations of their assigned biomolecule(s), may potentially be valuable for determining host contributions towards the salivary metabolome. Further experiments to explore this are currently in progress in our laboratories.

Although previous 1H NMR-based metabolomics investigations of PS and SM/SL saliva samples have been very restricted by poor spectral resolution, limited numbers of participant donors, and gland inflammation complications for the latter [22,23], Gardner et al. [7] compared the 1H NMR profiles of PS and SM/SL saliva samples collected from the same donors and concluded that there were no major differences between them. However, although understandable because of collection problems not experienced with WMS, the relatively limited sample size involved (n = 10) may be insufficient to draw robust conclusions from this, and therefore the authors recommend that a full 1H NMR-based metabolomics investigation should be performed in order to seek, and possibly confirm and verify, any metabolic differences detectable, if indeed justified.

Particularly noteworthy is that, as noted by Gardner et al. [7], the overall physiological significance of salivary metabolites such as short-chain organic acid anions is predominantly unknown. However, in the human colon, such bacterial catabolites, e.g., n-butyrate, propionate, lactate, and succinate, have been ascribed principal actions in relation to human health. Indeed, these catabolites exert important roles in the maintenance of gut epithelial integrity, especially n-butyrate, which is known to stimulate tight junction expression and synchronize colonic epithelial cell proliferation; these processes occur through the inhibition of histone deacetylase, an upregulation of the expression of apoptosis-supporting genes [24,25], and the ability to exert actions regarding metabolic and immune regulation functions [24,26]. Interestingly, they also play roles in regulating the satiety and appetite of the host [27].
In 2017, Louis and Flint [28] provided an overview of the generation of propionate and n-butyrate by the human colonic microbiota, and they noted that both these microbial catabolites exert or promote a series of favorable health-friendly actions. Such organic acid anions may arise from the bacterial degradation of dietary carbohydrates, and from amino acids liberated from the actions of proteolytic bacteria on proteins. The critical role of cross-feeding of intermediary metabolites such as lactate, succinate, and 1,2-propanediol between different species of gut bacteria was outlined, as was the influence of environmental elements, such as host dietary changes, on the growth and preponderance of propionate- and n-butyrate-generating bacteria; all these metabolites are routinely detectable in the 1H NMR profiles of human saliva [3]. Approaches for the optimization of organic acid anion furnishments to the host do, therefore, require a comprehensive understanding of the metabolism of such species by gut microflora populations.
Moreover, although modern developments and advances in molecular biology have furthered our understanding of the human oral microbiome, to date, knowledge regarding its metabolic actions and activities remains limited [29]. Saccharolytic bacteria, such as Actinomyces, Streptococcus, and Lactobacillus, serve to degrade saccharides and polysaccharides to organic acid anions, which are commonly found in human saliva, e.g., pyruvate, lactate, acetate, and formate, through the Embden-Meyerhof-Parnas glycolytic pathway and its branches, processes that may lead to dental caries. Hence, such organic acid anions all prospectively offer biomarker applications.

Nutrient Availability for the Oral Microbiome
In the oral environment, the availability of exogenous nutrient substrates for the oral microbiome is restricted, largely because of the relatively rapid salivary clearance of ingested carbohydrates through enzymatic and physical actions (usually within a period of several hours, as notable from Figure 2), and therefore it is transient for most humans. In contrast, the colon has a much more retentive, undigested source of such agents, and this accounts for a key difference between them. However, in the absence of ingested carbohydrate sources, microbially derived salivary catabolites are still produced, since, in general, their morning awakening levels remain higher than those monitored at all subsequent diurnal time-points. Moreover, with the exception of selected N-acetyl sugars, we find that 'resting' morning awakening human saliva samples always contain little or no carbohydrate nutrients such as glucose and sucrose and no other dietary carbohydrates potentially available to the oral microbiome [30]. However, in this report, salivary flow-rate was not monitored, and elevated salivary metabolite concentrations may arise from its diminished morning 'awakening' rate when expressed relative to those experienced at later diurnal time-points. Interestingly, as reported by Gardner et al. [31], enhancements in salivary metabolite concentrations observed in awakening participants in the study documented in [30] were not found to be homogeneous, as indeed we might expect if reductions in salivary flow-rate were the sole determinant. These awakening-mediated fold-change increases in mean salivary levels for choline, n-butyrate, and valine were 3.1, 3.6, and 15.7, respectively; those for methyl- and trimethylamine were 3.5 and 6.0, respectively. The early a.m. upregulation in choline, which represents a catabolic precursor of salivary methyl- and trimethylamines [32], is actually very similar to that of the former, and not dissimilar to that of the latter amine, so perhaps they are interrelated. Perhaps there are significant correlations, positive or negative, between these levels in such samples.

From our factor analysis applied for the first time in Section 10 (Case Study 2), we find that, although choline is strongly loaded on putatively host-derived Component 1, it is also significantly loaded on oral microbiome-sourced Component 2, which also strongly loaded trimethylamine (TMA); these components are orthogonal (uncorrelated). Component 3, however, which putatively contains correlated biomolecules arising from a separate bacterial source, loads methylamine (MA) and dimethylamine (DMA), in addition to TMA. Although probably unrelated in view of its generation from enzymes present in many saccharolytic bacteria, acetate, a further product of choline metabolism (along with acetaldehyde and ethanol) [33], was found to load strongly on Component 2, as expected. Nevertheless, a further dietary source of TMA is carnitine [34], and both phosphatidylcholine and betaine also serve as precursors for it.
Salivary proteins also provide a source of nutrients for oral bacteria, and the substantial rises in valine observed in [30] indicated proteolysis. Furthermore, the n-butyrate upregulation observed may be explicable by its generation as a terminal catabolic product from selected amino acids, notably lysine and glutamate [29,35,36]; however, both lactate and succinate are also involved in n-butyrate biosynthesis. Intriguingly, microbial biofilms in saliva have the ability to catabolize proteins therein [37], processes that generate high concentrations of organic acid anions, proline, and 5-aminovalerate, together with the biogenic amine putrescine [38]. As displayed in Figure 1, these three catabolites are all detectable in the 600 MHz 1H NMR profiles of human saliva. Succinate and lactate, which are known to be featured in the generation of n-butyrate, were also found to be metabolically consumed in the baseline-sterilized saliva used in the studies reported in [38].

Ease of Collection and Metabolite Concentration Considerations
As with blood plasma and urine, saliva is a highly complex biofluid containing a multitude of biomolecules that are receptive to 1H NMR analysis, along with further strategies available for this spectroscopic technique, such as homo- and heteronuclear 2D methods. However, like plasma, but unlike urine, saliva has significant levels of proteins present [18], but, as noted above, their total level is only ≤10% or so of that of blood plasma, and we find that this offers little or no interference with the 1H NMR determination of salivary metabolites [2][3][4]; even healthy human urine contains trace levels of proteins, which in the majority of cases do not obstruct the quantitative analysis of biomolecules present in this biofluid. Similarly, all these biofluids also contain a broad 'spectrum' of non-NMR-detectable or -visible biomolecules; the latter analytes are present at levels below the pre-specified LLOD, and hence also the lower limit of quantification (LLOQ), values for these agents.
Although the conventional analysis of blood or blood plasma/serum samples is considered integral to the majority of clinical diagnostic and prognostic monitoring tasks, their collection is hampered by high expense (most especially in health services), invasive physical intrusion, the requirement for a trained healthcare specialist (e.g., phlebotomist or trained nurse) for this purpose, and the possibility of adverse reactions to the collection technique experienced by a very small but nevertheless significant proportion of study participants (on Research Ethics Committee-approved participant information sheets, it is usual to warn participants about these possible adverse side-effects associated with blood sample collection, e.g., experiencing dizziness or feeling faint, etc.). However, the collection of saliva is virtually non-invasive and non-intrusive, and can be achieved by patients/participants themselves, often in their own homes, without any clinical supervision required. This non-invasive collection strategy for the detection and/or monitoring of clinical conditions clearly circumvents the discomfort often associated with blood sample collection.
In addition to the simple, straightforward, non-invasive, and painless method of saliva sample aspiration (albeit subject to a number of important requirements), further major advantages associated with the use of saliva as a sample matrix in NMR-linked metabolomics studies are that (1) samples are more readily shipped and stored; saliva requires much less laboratory handling and processing than whole blood, and it obviously does not clot; (2) its collection is economical, which reduces costs for healthcare providers and/or researchers; and (3) samples are safer to handle; notably, saliva naturally contains agents that suppress HIV infectivity, and therefore its handling and laboratory processing presents risks of only extremely restricted or negligible rates of oral HIV transmission [39].
Notwithstanding, one already established major disadvantage of the use of saliva in a metabolomics and general biomarker analysis context for extra-oral diseases is the knowledge that levels of such monitorable analytes are always lower or substantially lower in this biofluid than they are in whole blood or its plasma/serum isolates [40]. For example, the high-molecular-mass marker IgG has a blood serum concentration of 5-30 g/L but is 1000-fold lower in corresponding saliva specimens [41]. Furthermore, for IgA, serum levels are 2.5-5.0 g/L, whereas they are only 0.25-0.50 g/L for saliva [41], i.e., 10-fold lower.
Reassuring, however, are the known correlations between blood and saliva sample metabolites, although the authors of this paper stress that it is indeed critical to pre-establish non-fluctuating, strong correlations or relationships between biomarker concentrations prior to the use of saliva for such biomarker monitoring studies, most notably for those involving extra-oral diseases. Unfortunately, however, many blood plasma metabolites that may potentially act as disease biomarkers and that are transferable to human saliva are usually present at levels that are perhaps <100 µmol./L in the former biofluid, or even at much lower concentrations. If this is the case, and if these plasma concentrations are, say, 100-fold greater than those of corresponding saliva specimens, then they will probably be neither 1H NMR-detectable nor disease-specific in the latter biofluid, despite the advantages associated with its ease of collection.
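The arithmetic behind this argument can be made explicit. The short sketch below uses the approximate detection limit of 5 µmol./L at 600 MHz quoted elsewhere in this update, together with the illustrative 100-fold plasma-to-saliva ratio used above; the function names are hypothetical and the numbers are for illustration only.

```python
LLOD_UMOL_PER_L = 5.0  # approximate 600 MHz detection limit quoted in this update

def expected_saliva_level(plasma_umol_per_l, plasma_to_saliva_fold):
    """Salivary level implied by a plasma level and a fixed fold difference."""
    return plasma_umol_per_l / plasma_to_saliva_fold

def is_detectable(conc_umol_per_l, llod=LLOD_UMOL_PER_L):
    """True if the concentration reaches the assumed detection limit."""
    return conc_umol_per_l >= llod

def minimum_plasma_level(plasma_to_saliva_fold, llod=LLOD_UMOL_PER_L):
    """Lowest plasma level that would still be detectable in saliva."""
    return llod * plasma_to_saliva_fold

saliva = expected_saliva_level(100.0, 100.0)   # 100 µmol./L in plasma, 100-fold lower in saliva
print(saliva, is_detectable(saliva), minimum_plasma_level(100.0))
```

Under these assumptions, a plasma metabolite at 100 µmol./L would appear in saliva at about 1 µmol./L, below the detection limit, and a plasma level of roughly 500 µmol./L would be needed for the salivary signal to be observed.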
Since some key biomarkers in human saliva that arise, or may arise, from a blood-borne source are typically approximately 100-fold lower in concentration than those in blood plasma, techniques more sensitive than 1H NMR analysis are required for their identification and determination. Reviews of the advantages and possible complications associated with the use of saliva for biomarker monitoring purposes are available in [42][43][44], whereas its employment as a medium for the detection and analysis of drugs is documented in [45,46].

Value of NMR Analysis Techniques for the Metabolomics Analysis of Human Saliva
In comparison with other metabolomics techniques, such as LC-MS/GC-MS approaches, the NMR-based salivary bioanalysis strategy provides the valuable advantages of having only minimal requirements for sample preparation and hence associated rapid throughput rates; it also has the ability to detect and quantify a multitude of predominantly water-soluble metabolites simultaneously [6]. Moreover, the direct 1H NMR analysis of human biofluids, including WMS supernatants, can now be regarded as a high-sensitivity technique, with it now being possible to detect, and in some cases also quantify, concentrations of ≤5 µmol./L, and this without the prior application of any pre-concentration or extraction methodologies. A further advantage is that it is a virtually non-destructive bioanalytical technique, so, after allowing for the addition of usually ca. 10% (v/v) 2H2O as a field frequency lock, the same samples may then be analyzed by alternative, perhaps more sensitive, albeit destructive methodologies, such as those employed in LC-MS metabolomics experiments. Supporting databases have also been established to aid with the identification of biomolecules from their resonance patterns visible in salivary spectra (e.g., [47]).

Types of Salivary Metabolomics Experiments
Overall, two major classes of metabolomics strategies may be considered: targeted and untargeted. The untargeted approach screens large or very large numbers of metabolites simultaneously in order to seek metabolic 'fingerprints', the constituents of which may serve as reliable biomarkers for health disorders of interest. However, these untargeted approaches produce large datasets of NMR resonances, and these may give rise to some inaccuracies and high rates of false positives. Moreover, in virtually all studies of this nature, there are at least some, and often many, signals that researchers have not been able to identify and assign correctly, and this complicates issues further; in such situations, it is recommended that one or more two-dimensional (2D) NMR techniques are employed to facilitate such assignments, for example, the acquisition of homonuclear 1H-1H COSY or TOCSY, or heteronuclear 1H-13C HSQC spectra (Section 8). Hence, these approaches may be problematic for the reliable detection of biomarkers and their validation [48].
Primarily, the validation and cross-validation of biomarkers is usually performed via the random removal of a 'test' (or 'hold-out') set of sampling group observations, usually one-third or so of those available, with the remaining two-thirds employed to develop the model; this process is repeated many times, and the overall classification success is then determined, for example via the Q2 statistic, which measures the extent of correct 'test' set classification, and also via permutation testing in partial least squares-discriminant analysis (PLS-DA). Nevertheless, the only correct and acceptable method for the validation of selected biomarkers for both oral and extra-oral diseases is to administer a known therapy for the condition explored and to monitor the return of their salivary levels to a 'normal', healthy control range of values; such returns may represent either up- or downregulations with respect to the disease group. Indeed, it should also be noted that these technologies are readily applicable to studies focused on probing the hopefully favorable responses arising from the therapeutic actions of administered oral healthcare or other products. Ideally, such changes should be proportional or related to the condition's severity. As an example, patients with clinically established stage 1 PD (otherwise known as gingivitis) may receive an oral healthcare product, e.g., a mouth-rinse formulation containing a known therapeutic agent such as the bactericide chlorhexidine, or a peroxide or chlorite oxidant, according to a strict, carefully designed therapeutic application regimen. In such a case, any salivary biomarkers putatively identified via such primary untargeted metabolomics approaches would be expected to return to their 'normal' range values if the treatment was known to be successful. 
That is the most successful and established method for establishing the validity of such biomarker molecules. However, it is of much importance to note that, despite this major requirement, many human salivary metabolomics investigations certainly do not progress onto this critical final phase of biomarker establishment and validation.
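The repeated hold-out and permutation testing scheme described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in any particular study: a ridge-regularized least-squares model is swapped in for PLS-DA so that only NumPy is needed, Q2 is computed as 1 − PRESS/TSS over the held-out samples, and the function names and synthetic data are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, lam=1.0):
    """Ridge least squares on mean-centred data (a simple stand-in for PLS-DA)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return lambda X_new: (X_new - x_mean) @ w + y_mean


def repeated_holdout_q2(X, y, n_repeats=200, test_fraction=1 / 3, seed=0):
    """Q2 = 1 - PRESS/TSS accumulated over many random one-third hold-out splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(n * test_fraction))
    press = tss = 0.0
    for _ in range(n_repeats):
        idx = rng.permutation(n)
        test, train = idx[:n_test], idx[n_test:]
        predict = fit_ridge(X[train], y[train])
        press += np.sum((y[test] - predict(X[test])) ** 2)
        tss += np.sum((y[test] - y[train].mean()) ** 2)
    return 1.0 - press / tss


def permutation_p_value(X, y, n_permutations=100, seed=1, **kwargs):
    """Compare the observed Q2 with Q2 values obtained after shuffling class labels."""
    rng = np.random.default_rng(seed)
    observed = repeated_holdout_q2(X, y, **kwargs)
    null = [repeated_holdout_q2(X, rng.permutation(y), **kwargs)
            for _ in range(n_permutations)]
    p = (1 + sum(q >= observed for q in null)) / (1 + n_permutations)
    return observed, p


# Synthetic demonstration: 40 'samples', 10 'spectral buckets', 2 informative.
rng = np.random.default_rng(42)
y_demo = np.repeat([0.0, 1.0], 20)           # 0 = control, 1 = disease
X_demo = rng.normal(size=(40, 10))
X_demo[:, :2] += 3.0 * y_demo[:, None]       # class shift in two variables only
q2_demo, p_demo = permutation_p_value(X_demo, y_demo, n_permutations=50, n_repeats=50)
print(f"Q2 = {q2_demo:.2f}, permutation p = {p_demo:.3f}")
```

A high Q2 and a low permutation p-value indicate only that the model generalizes within the dataset; as argued above, they do not replace validation against a known therapeutic response.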
Targeted metabolomics, however, involves the quantification of a series of salivary metabolites that are already known on the basis of their established and characterized 1H NMR profiles, e.g., bacterial organic acid anion catabolites such as acetate, lactate, propionate, n- and iso-butyrates, succinate, formate, and 5-aminovalerate. In this context, it aims to recognize unusual patterns or signatures of metabolites and their concentrations as an aid to diagnosing diseases and prognostically monitoring their severities. For saliva samples, the excessive infiltration and preponderance of pathogenic bacteria may be monitored by the quantification of upregulated levels of such catabolites, which may serve as significant biomarkers of associated oral diseases and possibly also their severities. It may also be employed for screening modifications in metabolic responses to therapeutic interventions and active healthcare product ingredient doses, etc. [49].

Brief Early History of Salivary NMR Analysis and Associated Metabolomics Technology Investigations
Very early NMR studies of human saliva were somewhat limited since they were conducted at lower operating frequencies than those employed currently, and only a small number of assignments were made in view of unsatisfactory presaturation of the intense water signal, along with some significant overlap with a small number of interfering broad protein and other macromolecule resonances. As noted above, our group's publication of 600 MHz spectra acquired on HSS samples in 2003 [3] is often cited as the first fully comprehensive high-field NMR study [7], with assignments available for no fewer than 63 metabolites, and as the first 'true' metabolomics study of this biofluid [6]. In reality, however, we first started acquiring such spectra on these samples at the Royal London Hospital Medical College as early as 1990-1991, on a then quite ageing 400 MHz Bruker NMR facility located at our sister institution, Queen Mary College (QMC), University of London, and occasionally on a more up-to-date 500 MHz Jeol JNM-GSX 500 spectrometer based at Birkbeck College, also at the University of London (University of London Intercollegiate Research Services (ULIRS) facility). However, publications involving assessments of the metabolic profiles of carious dentin biopsies and saliva, and evaluations of the oxidizing abilities of chlorite/chlorine dioxide- and peroxide-containing oral healthcare products towards biomolecules present in intact human salivary supernatants, only arose from this work in 1997-1999 [2,50,51] in view of our group's specializations in a range of other non-dental and non-salivary pathological disciplines. This work was bolstered by the acquisition, and our use, of a new ULIRS 600 MHz Bruker instrument at QMC, which replaced the ageing 400 MHz spectrometer based there in 1991 and was then, to the best of the authors' knowledge, the first of its kind in the UK; assignments on these spectra were also fully comprehensive.

Recommended Protocols for the Collection, Storage, and Preparation of Human Saliva Specimens for NMR Analysis: Precautions against Artefacts, and Artefactual Metabolite Generation or Consumption
The integrity of saliva specimens, and the assurance that salivary metabolite concentrations remain identical to those present at the point of collection, is of critical importance to NMR-based metabolomics investigations. Adequate precautions should therefore be taken to limit artefactual chemical modifications to selected metabolites from this point until spectral acquisition, most especially during sample preparation and subsequent storage episodes.
Therefore, in order to provide optimized support for future salivary 'omics' studies, it is essential that recommended protocols are established and made available in order to maintain analytical validity and permit viable comparisons between distinct studies performed by separate research laboratories. Such protocols aim to address problems encountered with differences in sample collection, sample preparation, storage, and spectral acquisition parameters for NMR analysis. For the 1H NMR [52,53] and LC-MS [54,55] analyses of blood plasma and urine, such protocols are already available; strategies for the preparation of human saliva samples for 1H NMR analysis were reported more recently, in 2020 [31], and these serve as corollaries to those already established for the LC-MS technique [56]. Following an extensive investigation of participant collection regimens, sample preparation techniques, and spectral acquisition criteria, the evidence-based protocol for salivary metabolomics studies proposed in [31] concluded that previously employed protocols for the 1H NMR quantification of salivary biomolecules, and the utilization of unbuffered internal TSP as a quantitative reference standard, were acceptable for most metabolite determinations, although inexplicably not so for methanol and acetate. Moreover, the results acquired were found not to be significantly influenced by the nature and length of centrifugation and freeze-thaw regimens.

Saliva Sample Collection
Containers utilized for the collection of saliva specimens should be previously unused, clean, sterile, free of surface contaminants, of sufficient volume capacity, and with a collection mouth diameter sufficiently wide to permit the drooling of samples by recruited participants without any major complications or unnecessary spillage. Indeed, our group typically use sterile Universal plastic containers of 20 mL total volume. Collection of saliva specimens for metabolomics studies may be conducted using stimulated or unstimulated (passive drooling) approaches. Unstimulated or 'resting' WMS is simply collected via the passive drooling of this fluid into the above recommended sterile containers.
For the collection of stimulated saliva samples, however, participants are usually requested to chew on softened paraffin wax or an alternative solid chewing matrix. The authors of the current study do not recommend this for 1H NMR-based metabolomics investigations, since such materials, commercially available or otherwise, usually release quite large numbers of interfering, 1H NMR-detectable agents into the WMS medium during such chewing episodes; these may include some agents that are also present endogenously in this biofluid, including acetate and formate, along with glycerol and propanediols, etc. Salivary collection swabs may sometimes be employed for WMS collections, but again their ability to markedly contaminate this biofluid with interferants should be considered in detail. An alternative stimulus for the collection of human saliva involves the application of a 2% (w/v) citric acid solution onto the dorsal surface of the tongue, but this chemical approach is not to be recommended since (1) it will be expected to substantially modify the pH value of the oral environment and that of saliva itself, which will serve to alter the metabolic status of this biofluid immediately prior to its collection; (2) application of such a solution-state acidic stimulus may act to dilute salivary metabolite levels; and (3) although not often 1H NMR-detectable in whole human saliva, measurements of salivary citrate concentrations are obviously precluded when using this approach. Clearly, the selection of saliva sample collection type and quality may serve to hamper many metabolomics investigations, and as noted above, without the establishment of a homogeneous system for this process in all interested or collaborating laboratories globally, comparisons of metabolic datasets via meta-analysis and other research strategies are rendered severely limited, if not impossible. 
Furthermore, errors in saliva sample collection techniques may exert significant effects on results acquired and their physiological or clinical interpretations, and therefore every possible measure to circumvent these should be taken.
In our laboratories, we request that participants fast for a minimum of 8 h, but preferably for 12 h, prior to donating saliva specimens. With the assistance of a researcher, potential participants complete a pre-screening questionnaire that records essential demographic information, along with details concerning their medical and dental treatment histories, both recent and long-term. All successful participants are then requested to refrain from all oral activities, including eating, drinking, smoking, tooth-brushing, and oral rinse use, etc., throughout this period, including the short, ca. 5 min period between awakening and sample donation. They are also requested not to consume any alcoholic beverages for at least 24 h, but preferably for 48 h, prior to the sample collection time-point. Despite this precaution, ethanol is often still detectable in the 1H NMR profiles of human saliva collected, even after a 48 h abstention duration; however, ethanol is, of course, an established microbial fermentation catabolite, i.e., it may also arise from the actions of selected bacteria, for example, Streptococcus mutans [57]. Moreover, 1H NMR-detectable methanol can be derived from the passive or direct inhalation of tobacco cigarette smoke, which contains quite high levels of this alcohol [58]. Following collection as directed above, samples are then transferred to the laboratory site on ice. We recommend that for each collection session, each participant should provide an absolute minimum volume of about 5-6 mL in order to permit the performance of replicate 1H NMR analyses, their use for other forms of analysis (e.g., LC-MS analysis, or F− determinations via ion-selective electrode or 19F NMR spectroscopy), and also to ensure that there is sufficient clear human WMS supernatant sample remaining following the removal of cells, bacteria, and debris during the centrifugation step described below.
6.2. Abstention from Oral Activities Prior to Saliva Sample Donation: How Long Is Really Necessary?
Our protocol involving a minimum period of 8 h, but preferably 12 h, appears to be acceptable since no 1H NMR resonances arising from contaminating agents derived from food or other sources are found in the spectral profiles acquired, although it appears that ethanol ingested in the form of alcoholic beverages can persist for 24 h or more. Conversely, the abstention period of only 1.0 h proposed in [31] is clearly unacceptable (Figure 2), and is likely to lead to erroneous representations of the sources and longevities of selected metabolites, notably low-molecular-mass organic acid anions such as citrate, lactate, acetate, and formate, an expansive range of amino acids, carbohydrates, and lipids, along with any long-chain 'free' fatty acids derived from the hydrolysis of dietary acylglycerol species. Figure 2 shows selected regions of the 600 MHz 1H NMR spectra of supernatants of WMS samples collected from a healthy human participant immediately before, and at 0.50, 1.00, 2.00, and 4.00 h after, the consumption of a vegetarian sandwich meal. From these profiles, it is very clear that citrate resonances, which were completely absent from the spectrum prior to eating, are prominent at a time-point of 30 min thereafter and persist for a duration of 4 h or more following the meal. Therefore, any future speculations or debates regarding the sources of salivary citrate may be resolved, especially in studies that have requested that their saliva donors fast or refrain from all oral activities for periods as short as 1.0 h or so. Clearly, from current studies available [7,20], this key, metal ion-complexing metabolite appears to be predominantly, although not exclusively, derived from dietary sources.
Likewise, 1H NMR resonances arising from glucose and sucrose, which again were absent from the salivary spectra prior to eating, were readily detectable in these profiles at post-meal time-points of 30 and 60 min. Since these signals are still visible at the 60 min time-point, we therefore recommend that an absolute minimum fasting period of 2 h should be deployed to ensure that exogenous carbohydrates reach levels at which they are no longer 1H NMR-detectable. For exogenously introduced citrate, a much longer fasting period is required, since its salivary level remains detectable by this technique even at 4 h following meal consumption.
Importantly, there are major post-eating upregulations in the salivary levels of lactate, pyruvate, succinate, and trimethylamine-N-oxide, and these results are fully consistent with the observations made by Gardner et al. in [31], which involved a study of participant responses to a sucrose rinse challenge. Since it appears that these concentrations first rise with time up to the 30-60 min time-points and then decrease slowly up to 4 h thereafter, a restriction on oral activities for more than this period is certainly warranted. Again, a post-meal abstention period of only 1.0, or even 2.0 h, is clearly insufficient.
Additional major post-meal consumption changes to the 1H NMR profiles included clear downregulations in alanine, ethanol, 3-aminoisobutyrate, and propionate, although the decrease in the salivary level of the latter metabolite at the 30 min time-point is compensated by a subsequent rise in its level up until the 240 min time-point. Moreover, a small, apparent triplet centered at δ = 1.832 ppm develops from the 60 min post-eating time-point and increases in intensity thereafter (this signal is tentatively assigned to the β-CH2 multiplets of either glutarate or propane-1,3-diol, and the agent responsible was possibly dietary-derived). Further modifications to the spectral profiles observed comprised rapid (i.e., within 30 min) decreases in 5-aminovalerate followed by rises in its salivary level, and also decreases in those of all detectable methylamines, and taurine. Moreover, decreases in the α-CH multiplet proton resonances of histidine and phenylalanine (δ = 3.16 ppm) were also observed, and this observation is fully consistent with changes found in the aromatic region of the spectra acquired (Figure 2d). Also visible in this low-field region were small decreases in formate at the 30 and 60 min post-meal consumption time-points, followed by an approximate return to its pre-eating salivary concentration at the 120 and 240 min time-points. Also notable is the slow and low-level development of the potentially host-sourced broad aromatic amino acid residue signals (δ = 6.85 and 7.55 ppm) from the 60 to the 240 min post-eating time-points. Similarly, an α-fucose resonance also becomes visible from the 120 min post-meal consumption time-point (Figure 2c).
A systematic review of a total of 39 independent studies focused on salivary metabolomics conducted herein (Table 2), demonstrated that only 34 of these specified a sampling abstention time limit for the collection of WMS samples, i.e., with the exception of water intake, complete refrainment from participating in all oral activities, usually eating, non-water drinking, smoking, use of oral healthcare products, and toothbrushing, etc. This specified parameter ranged from 0 to 12 h for all human investigations considered, with a mean ± SD value of only 1.90 ± 2.19 h. Worryingly, 24 out of the above 34 studies had abstention times of ≤1.5 h, and, therefore, it appears that many, if not all metabolites determined therein may be erroneously represented as those with salivary concentrations unaffected by the influence of external stimuli or exogenous agents, which most obviously include molecules derived from dietary sources. Only two investigations had an abstention period of ≥8 h, including one with a 12 h duration.
Table 2. Systematic review of pre-abstention/fasting time limits for metabolomics experiments performed in a total of 39 independent investigations [20,. Five of these studies failed to specify an abstention time limit for WMS samplings. Abbreviations: MS, mass spectrometry; LC, liquid chromatography; GC, gas chromatography; CE-TOF, capillary electrophoresis time of flight MS analysis; UPLC, ultra-performance liquid chromatography; UPLC-QTOF, ultra-performance liquid chromatography coupled with quadrupole/time of flight MS analysis; UHPLC, ultra-high performance liquid chromatography; UHPLC-HR, ultra-high performance liquid chromatography (high-resolution); LA-REI, laser-assisted rapid evaporative ionization; GC/EI, gas chromatography/electron ionization detection; FUPLC, faster ultra-performance liquid chromatography; CPSI, conductive polymer spray ionization; CE-LC, capillary electrophoresis and liquid chromatography.

On their arrival at the laboratory, collected saliva specimens should be immediately centrifuged at a minimum of 3500 rpm and 4 °C for a period of 15 min, and following sample preparation as outlined below, the clear WMS supernatants arising therefrom are then stored at −80 °C for a maximum duration of 72 h prior to NMR analysis. According to [31], the centrifugation rate and other centrifugal factors exert little or no effect on the quality and validity of the 1H NMR profiles acquired. Aliquoted volumes (0.50 mL) of the resulting WMS supernatant samples are then treated with 60 µL of pH 7.00 phosphate buffer (0.10 mol/L) containing 0.04% (w/v) sodium azide (Na+/N3−), and 50 µL of 2H2O containing 0.05% (w/v) 3-(trimethylsilyl)propionate-2,2,3,3-d4 (TSP), yielding a final TSP concentration of ca. 250 µmol/L. TSP acts as an efficient internal chemical shift reference and quantitative calibration standard; sodium azide serves as a microbicidal preservative in order to protect against the adverse artefactual generation and/or consumption of microbial catabolites during sample transport, storage, and preparation stages; phosphate buffer is used to control sample pH values and hence the maintenance of constant chemical shift values for selected, pH-sensitive metabolite resonances; and 2H2O acts as a field frequency lock. These mixtures are then thoroughly homogenized and transferred to 5 mm diameter NMR tubes ready for NMR analysis. However, the phosphate buffer addition step is not considered essential, since saliva itself has a strong buffering capacity with contributions from a combination of carbonate/bicarbonate, phosphate/hydrogen phosphate, and protein-based systems [97,98]. Moreover, the mean ± SD pH value of healthy human saliva, i.e., that collected from subjects with clinically healthy gingiva, is 7.06 ± 0.04 [17].
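The final concentrations implied by this preparation recipe can be checked with a short dilution calculation. The sketch below is illustrative only: the molar mass of 172.27 g/mol assumes that the TSP is supplied as its sodium salt (sodium 3-(trimethylsilyl)propionate-2,2,3,3-d4), which the text does not state explicitly, and the function names are hypothetical.

```python
# Dilution arithmetic for the NMR sample recipe: 500 uL WMS supernatant
# + 60 uL phosphate buffer (0.10 mol/L) + 50 uL 2H2O with 0.05% (w/v) TSP.

TSP_D4_SODIUM_SALT_G_PER_MOL = 172.27  # assumption: sodium salt of TSP-d4


def percent_wv_to_mg_per_ml(percent_wv: float) -> float:
    """1% (w/v) = 1 g per 100 mL = 10 mg/mL."""
    return percent_wv * 10.0


def final_conc_umol_per_l(stock_percent_wv, stock_volume_ul, total_volume_ul, molar_mass):
    """Final molar concentration (umol/L) of a solute added from a % (w/v) stock."""
    mass_ug = percent_wv_to_mg_per_ml(stock_percent_wv) * stock_volume_ul  # mg/mL x uL = ug
    amount_umol = mass_ug / molar_mass                                     # ug / (g/mol) = umol
    return amount_umol / (total_volume_ul * 1e-6)                          # umol per litre


total_ul = 500.0 + 60.0 + 50.0                      # final volume in the NMR tube
tsp_um = final_conc_umol_per_l(0.05, 50.0, total_ul, TSP_D4_SODIUM_SALT_G_PER_MOL)
phosphate_mm = 0.10 * 1000.0 * 60.0 / total_ul      # mmol/L after dilution
azide_percent = 0.04 * 60.0 / total_ul              # % (w/v) after dilution

print(f"TSP: {tsp_um:.0f} umol/L; phosphate: {phosphate_mm:.2f} mmol/L; "
      f"azide: {azide_percent:.4f}% (w/v)")
```

With these assumptions the calculated TSP concentration is close to 240 µmol/L, in line with the ca. 250 µmol/L quoted above; the added phosphate is diluted to roughly 10 mmol/L.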
In order to achieve an optimal protocol for the post-collection transport (if required) and storage of WMS supernatants, a series of potential interfering and metabolite level-modifying factors should be considered in detail. Firstly, it should be noted that selected biomolecules are susceptible to oxidation on exposure to O2, some substantially so. Thus, free (non-protein-incorporated) salivary thiols (estimated mean concentrations of free cysteine, cysteinylglycine, and glutathione (GSH) are ≤2, ≤0.4, and ≤3 µmol/L, respectively, in human saliva [99]) readily oxidize to their corresponding disulfides, the overall process depicted in Equation (1) involving the generation of thiyl radical (RS•) and superoxide anion (O2•−) intermediates, as does any tissue-derived GSH. However, NMR-based metabolomics researchers need not be too concerned about these processes, since these levels are at or below lower limit of quantification (LLOQ) thresholds, and cysteine resonances appear as a complex ABX coupling pattern that diminishes resonance heights; such low-intensity signals are generally strongly overlapped by those of higher-concentration metabolites in the spectral regions involved in any case. However, the generation of any reactive oxygen species (ROS) from such autoxidation reactions, e.g., O2•−, hydrogen peroxide (H2O2), and ultimately hydroxyl radical (•OH), may cause some oxidative damage to other, albeit ROS-scavenging, biomolecules. 
Moreover, salivary pyruvate and other 2-oxo-carboxylate species such as α-ketoglutarate can be oxidatively decarboxylated by H2O2 directly, with pyruvate being transformed to acetate and CO2 (Equation (2)) [100]. This directly relates to the relatively small but nevertheless significant increases in acetate level that we commonly observe following prolonged storage regimens, as have other researchers when samples are stored at 22 °C for 48 h [60]; however, acetate, along with formate, also appears to be a terminal product arising from the radical-mediated decomposition or photodegradation of a wide range of organic biomolecules [101].
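For reference, the oxidative decarboxylation of pyruvate by hydrogen peroxide described above is conventionally written with the following overall stoichiometry. This is the standard textbook form and is given here on the assumption that it corresponds to the reaction referred to as Equation (2); it is not copied from the original equation.

```latex
\mathrm{CH_3COCO_2^{-}} + \mathrm{H_2O_2} \longrightarrow \mathrm{CH_3CO_2^{-}} + \mathrm{CO_2} + \mathrm{H_2O}
```

One mole of acetate is therefore generated per mole of pyruvate consumed, so a storage-induced fall in pyruvate should be mirrored by an equimolar rise in acetate if this route is the only one operating.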
Secondly, failure to satisfactorily eradicate the salivary microbe population as soon as possible following sample collection, through the addition of microbicidal agents such as azide or fluoride, will lead to significant metabolite changes through the actions of microbial enzymes. To date, bacterial saccharolytic and protease pathways have been proposed to account for losses of salivary galactose, pyruvate, alanine, and choline, and increases in acetate, during storage of samples at 22 °C [60]. Decreases in choline and coupled rises in acetate and ethanol levels may be ascribable to the actions of oral bacteria, e.g., Streptococcus sanguinis [102]. Moreover, the time-dependent diminishing concentrations of lactate, pyruvate, and galactose observed at 22 °C may be associated with glycolytic pathway activities, along with the further involvement of other catabolic actions exerted by oral bacteria, for example consumption of lactate and n-butyrate by lactate dehydrogenase present in Streptococcus, Lactobacillus, and Actinomyces spp. [103] and conversion of lactate to acetate by Veillonella species [104]. Reductions observed in concentration, also at this temperature, appear to involve their use by selected bacterial species as a source of nitrogen [105].
Such artefactual effects exerted by the salivary microbiome during the collection, transport, storage, and preparation of saliva specimens for 1H NMR analysis can, at least in principle, be circumvented by the pre-treatment of samples with bacteriostatic agents such as sodium azide or fluoride. Sodium azide serves as a potent bacteriostatic agent that acts by blocking respiration via its inhibition of cytochrome c oxidase activity in Gram-negative bacteria [106]; this is achieved through its ability to complex between the haem a3 iron ion and the copper ion B (CuB) center located within the O2 reduction site [107,108]. However, azide also inhibits ATP hydrolase activity, although not its biosynthetic actions; similar actions are exerted on F-ATPases present in eubacteria and chloroplasts [109]. Potentially, a series of other bacterial haem and iron-sulfur (Fe-S) proteins, and other metalloenzymes, are inhibitable by azide exerting its complexation properties at their metal ion centers.
Although the anti-caries activities of fluoride anion are largely associated with its influence on the mineral phases of teeth and remineralization, this agent also exerts significant microbicidal effects towards dental plaque bacteria, which yield organic acid catabolites and hence localized acidotic environments that, in turn, give rise to demineralization processes. Indeed, previous investigations have revealed that fluoride can attenuate bacterial metabolism through a series of processes involving differing mechanisms [110]. For example, it may act directly as an enzyme inhibitor, notably on the glycolytic enzyme enolase in a quasi-irreversible fashion. Additionally, the inhibition of haem-based peroxidases via the complexation of fluoride by haem iron ions represents a further direct mechanism of action; Fe(III) ion has a very strong affinity for the hard-donor fluoride anion (stability constant for the mono-fluoro-substituted Fe(III) complex, [FeF]²⁺, = 1.62 × 10⁵ M⁻¹ [111]).
Thirdly, the actions of salivary enzymes may persist during and immediately after collection. As noted above, Duarte et al. [60] recently employed NMR spectroscopic analysis to monitor changes in salivary metabolite levels occurring throughout short-term storage periods at ambient temperature, +4 °C, and −20 °C, and following sample preparation at ambient temperature and +4 °C, thereby simulating conditions experienced within clinical and laboratory settings. Notably, significant inter-participant and inter-sampling day variabilities were observed, and these results concur with those reported by us, both previously [3] and in this report (Section 10). Following sample collection, no modifications were observed by these researchers during storage at −20 °C for a period of at least 28 days, whereas, as expected, ambient temperature storage gave rise to decreases in the concentrations of selected metabolites. Therefore, the above study concluded that following collection, saliva samples may be stored at ambient temperature, or in the refrigerator at +4 °C, for periods of up to 6 h, and at −20 °C for at least 28 days. When prepared for NMR analysis, samples were found to remain stable at 25 °C for up to 8 h, and at +4 °C for up to 48 h. Sodium azide addition averted potential early concentration changes in fucose, proline (6-8 h), and xylose (24 h). However, it should also be noted that a number of Gram-positive bacteria, for example Streptococci and Lactobacilli, have been reported to be somewhat resistant to the bactericidal actions of sodium azide [106,112]; these genera are prevalent in cariogenic and saccharolytic dental biofilms.
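The storage limits summarized above from [60] lend themselves to a simple lookup that could be applied when auditing sample-handling logs. The sketch below is hypothetical: the stage and condition labels and the function name are illustrative choices, and only the time limits themselves are taken from the text.

```python
# Stability windows reported in [60], expressed in hours.
# Keys are (sample stage, storage condition); the labels are illustrative.
STABILITY_LIMITS_H = {
    ("collected", "ambient"): 6,
    ("collected", "+4C"): 6,
    ("collected", "-20C"): 28 * 24,   # 'at least 28 days': longest period tested
    ("prepared", "25C"): 8,
    ("prepared", "+4C"): 48,
}


def within_reported_stability(stage: str, condition: str, hours: float) -> bool:
    """Return True if the storage time lies within the window reported as stable."""
    try:
        limit = STABILITY_LIMITS_H[(stage, condition)]
    except KeyError:
        raise KeyError(f"no reported stability data for {stage!r} at {condition!r}")
    return hours <= limit


print(within_reported_stability("collected", "ambient", 4))    # inside the 6 h window
print(within_reported_stability("prepared", "25C", 12))        # beyond the 8 h window
```

A False result for the −20 °C entry means only that the storage period exceeds what was tested in [60], not that degradation has been demonstrated.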
Of critical importance, in 2008, Esser et al. [113] investigated the stability and longevity of the protein content of human WMS in samples monitored from the time-point of collection, both with and without added inhibitors of bacterial metabolism and proteases. The results acquired showed that protein degradation commenced during the actual sample donation process and took place within 30 min of specimen collection. Although protease inhibitors such as phenylmethylsulphonyl fluoride (PMSF) were found to partially suppress this process, blockers of bacterial metabolism did not. Stable protein deterioration products, which were peptides with molecular masses of 2937, 3370, and 4132 Da, were discovered, and the authors suggested that these may be employed as markers of saliva sample quality. In view of these observations, we should perhaps expect some increases in the salivary concentrations of free amino acids from the point of collection until the analysis of samples; however, with the exception of small increases in salivary proline in samples untreated with sodium azide at ambient temperature or at +4 °C, the study documented in [60] did not find such storage- and laboratory processing-dependent upregulations. Moreover, we still find evidence for the retention of selected salivary protein species in samples following our standard collection, sample preparation, and storage phases, the latter involving prolonged biobank depository episodes at a temperature of −80 °C (Figure 1). Although we may expect species of molecular mass up to 4000 Da to have 1H NMR resonances with slightly or only negligibly higher Δν1/2 values than those of low-molecular-mass metabolites (ca. 1.5-2.5 Hz), they would certainly not be as large as those of the broad aromatic signals noted in Figure 1 (ca. 55 Hz). In any case, the authors of [113] recommended the freezing of saliva samples immediately following collection, for example, in liquid N2, in order to minimize protein deterioration. 
However, the processing of specimens at only +4 °C, together with the use of protease inhibitors, can also facilitate protein retention therein. Nevertheless, exposure of salivary proteins to freeze-thaw episodes may give rise to significant levels of denaturation, and hence appropriate caution should be applied where this phenomenon is observed.
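The contrast drawn above between sharp metabolite resonances and the broad aromatic envelope can be expressed in terms of effective transverse relaxation times. The following sketch assumes an ideal Lorentzian line, for which Δν1/2 = 1/(πT2*); the broad signals in Figure 1 may well be composites of overlapping resonances, so the value calculated for them should be read as a lower-bound illustration only. The function name is hypothetical.

```python
import math


def t2_star_from_linewidth(linewidth_hz: float) -> float:
    """Effective T2* (s) of a Lorentzian line with the given half-height width (Hz)."""
    return 1.0 / (math.pi * linewidth_hz)


for label, width_hz in [
    ("sharp metabolite signal, 1.5 Hz", 1.5),
    ("sharp metabolite signal, 2.5 Hz", 2.5),
    ("broad aromatic envelope, 55 Hz", 55.0),
]:
    t2_ms = 1000.0 * t2_star_from_linewidth(width_hz)
    print(f"{label}: T2* of about {t2_ms:.1f} ms")
```

On this basis, the low-molecular-mass signals correspond to T2* values of roughly 130-210 ms, whereas a 55 Hz linewidth corresponds to only about 6 ms, which is the difference exploited by relaxation time-filtered acquisitions.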
Interestingly, fluoride has the ability to inhibit human salivary amylase activity when present at added levels of ≥50 mmol/L [114], and this process has been speculated to occur via a mechanism involving its complexation of the critical Ca2+ ion required for this enzyme's activity. Indeed, this metal ion may form intramolecular cross-links with the protein. Since a range of enzymes that require Ca2+ as a co-factor for their activities are inhibited by F−, the mechanism featured in this process may involve the F−-mediated transference of Ca2+ [115]. Of further note, the observed inhibition of human salivary amylase by the metal ion complexant azide [116] may involve a similar mechanism involving the perturbation of this critical Ca2+ ion linkage, and therefore it may conceivably block such Ca2+-dependent amylase activity.

NMR Pulse Sequences: Recent Advances and Future Prospects
The great majority of all NMR-based metabolomic investigations hinge on use of the metnoesy 1D NOESY experiment or a related version of this sequence. However, pulse sequences used in metabolomics experiments are always determined by the exact nature and molecular composition of study samples, for example, urine, saliva, or blood plasma, as three major classes for consideration. Since the majority of biofluids are aqueous solution-based (although blood plasma and serum can be considered as special cases of 'mixed solvent systems' in view of their significant lipoprotein-associated lipid and cholesterol contents), a water suppression element in the pulse sequence is an essential prerequisite, and water (and HOD) signal presaturation [117] generally represents the optimal selection for quantitative analysis purposes. Moreover, an augmenting NOESY mixing block is generally also incorporated into NMR analysis protocols in order to improve both the baseline and the suppression of the water signal [118]. In general, this approach is routinely employed for the simultaneous multimolecular analysis of urine and saliva samples, in view of their very low and low protein contents, respectively. However, samples that contain high-molecular-mass agents such as proteins and polysaccharides, etc., are analyzed utilizing NMR acquisitions that involve relaxation time filtering systems to improve the visualization of much sharper, low-molecular-mass biomolecule resonances [119]. Indeed, without the application of such filters, 1D single-pulse 1H NMR spectra of protein-rich blood plasma contain a broad protein resonance envelope that severely overlaps with or obscures many of the much sharper low-molecular-mass signals arising from free amino acids, sugars, and organic acid anions, etc.
The favorable acquisition of NMR spectra is also critically dependent on a range of different parameters, for example, the relaxation delay, spectral width, and number of scans utilized, that exert a powerful effect on the signal-to-noise (STN) ratio and spectral quality. Currently, software packages are mainly restricted to those provided by NMR facility manufacturers (TopSpin (Bruker) and Delta (Jeol, USA)), but this is certainly not the case with the multitude of differing software options usable for additional data-handling stages involved.
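As a rough illustration of how the number of scans influences the STN ratio, the sketch below applies the standard signal-averaging result that, for uncorrelated noise, the STN ratio grows with the square root of the number of scans. The function names and the example scan counts are illustrative assumptions, and are not settings taken from any of the studies discussed here.

```python
import math


def relative_stn(n_scans: int, ref_scans: int = 1) -> float:
    """STN gain relative to ref_scans, assuming uncorrelated noise (sqrt law)."""
    if n_scans <= 0 or ref_scans <= 0:
        raise ValueError("scan counts must be positive")
    return math.sqrt(n_scans / ref_scans)


def scans_for_gain(ref_scans: int, gain: float) -> int:
    """Scans needed to improve the STN ratio by `gain` over ref_scans."""
    if ref_scans <= 0 or gain <= 0:
        raise ValueError("ref_scans and gain must be positive")
    return math.ceil(ref_scans * gain ** 2)


if __name__ == "__main__":
    # Hypothetical scan counts, chosen only to show the trend.
    for ns in (64, 128, 256, 512):
        print(f"NS = {ns:3d}: STN gain vs 64 scans = {relative_stn(ns, 64):.2f}")
    print("Scans needed to double the STN ratio of a 64-scan spectrum:",
          scans_for_gain(64, 2.0))
```

Under this assumption, doubling the STN ratio costs a four-fold increase in the number of scans, and hence in acquisition time, which is one reason why the relaxation delay and scan count are usually chosen together.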
Of course, moving forward from 1D NMR experiments to 2D NMR strategies offers major spectral interpretational advantages, and in metabolomics, these technologies regularly serve as highly valuable facilitators for verifying assignments made in 1D spectral profiles. Indeed, it is recommended that such 2D approaches are conducted before the acquisition of 1D spectra for the bulk dataset, so that researchers are very clear on the identities of resonances present in highly complex, metabolically rich biofluid samples such as human saliva, in which >60 metabolites can be detected, and many of these quantified, at an operating frequency of 600 MHz; yet more biomolecules will, of course, be detectable at operating frequencies of ≥700 MHz in view of greater spectral resolution and enhanced sensitivity. Unfortunately, a small number of salivary metabolomics publications appear to contain a number of biofluid spectral misassignments. However, the seeking, detection and clarification of any superimposing resonances in the 1D profiles also serves as a major bioanalytical bonus, and this permits researchers to seek alternative 1 H NMR signals or even analytical methods for quantification purposes in such metabolomics investigations, most especially if the analyte is a potential key, albeit targeted, biomarker molecule.

Value Offered by the Application of Spin-Echo (CPMG) Pulse Sequences to Human Saliva Sample Analysis
The Carr-Purcell-Meiboom-Gill (CPMG) multi spin-echo pulse sequence serves as a valuable spin-spin (transverse) T2 relaxation time filtering system that has been widely exploited in metabolomics studies of biofluids containing relatively high protein contents such as blood plasma or serum. The applications of this pulse sequence are built on the differential T2 values of low- (e.g., free amino acids and organic acid anions, etc.) and high-molecular-mass (e.g., proteins and polysaccharides, etc.) biomolecules, which are of course shorter or much shorter for the latter class of molecular species (Figure 3). By careful selection of the 2τE (2 × echo time) and n echo parameters, broad biological macromolecule signals with short T2 indices may be significantly attenuated or completely removed from 1H NMR spectral profiles, whereas, in view of their longer relaxation times, sharp resonances arising from low-molecular-mass metabolites remain readily detectable and, dependent on methods adopted or adapted for such purposes, are also quantifiable, assuming that there is no or only very limited superimposition of resonances selected for electronic integration with those of other biomolecules. However, resonances with intermediate T2 values yield intermediate attenuations [120], and this is most commonly observed for the relatively molecularly mobile portions of macromolecules in CPMG spin-echo spectra acquired on blood plasma or serum specimens, for example, those of the triacylglycerol moieties of lipoproteins in general, the phosphatidylcholine head-groups of polar high-density-lipoprotein phospholipids, and the N-acetyl-sugar-containing carbohydrate side-chains of 'acute-phase' glycoproteins, the latter of which are markedly upregulated in many human conditions such as inflammatory diseases and surgical trauma [121].
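The filtering principle described above can be sketched numerically. The example below assumes a simple mono-exponential transverse decay over the total echo time and a Lorentzian line shape, for which the linewidth at half-height equals 1/(πT2). The linewidths used (2 and 55 Hz) echo the approximate values quoted earlier in this section, whereas the echo settings are hypothetical. Because measured linewidths also include field inhomogeneity contributions, the T2 values derived here are apparent ones, and the retained fraction for the sharp signal should be read as a conservative estimate.

```python
import math


def t2_from_linewidth(linewidth_hz: float) -> float:
    """Apparent T2 (s) of a Lorentzian line: linewidth = 1 / (pi * T2)."""
    if linewidth_hz <= 0:
        raise ValueError("linewidth must be positive")
    return 1.0 / (math.pi * linewidth_hz)


def cpmg_retained_fraction(t2_s: float, two_tau_s: float, n_echo: int) -> float:
    """Fraction of signal left after n_echo echoes, each of duration 2*tau."""
    total_echo_time_s = two_tau_s * n_echo
    return math.exp(-total_echo_time_s / t2_s)


if __name__ == "__main__":
    two_tau_s, n_echo = 1.0e-3, 80  # hypothetical settings: 80 ms total echo time
    for label, linewidth_hz in (("sharp metabolite signal", 2.0),
                                ("broad macromolecule signal", 55.0)):
        t2_s = t2_from_linewidth(linewidth_hz)
        kept = cpmg_retained_fraction(t2_s, two_tau_s, n_echo)
        print(f"{label}: apparent T2 = {t2_s * 1e3:.1f} ms, retained = {kept:.2e}")
```

With these assumed settings, the sharp signal retains roughly 60% of its intensity while the broad one is attenuated by about six orders of magnitude, which is the contrast the CPMG filter exploits.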
Although saliva has a much lower total protein content than plasma [18,19], we are still able to observe a small number of relatively minor broad signals arising from macromolecules, for example, those tentatively assigned to protein aromatic amino acids visible at δ = 6.85 and 7.55 ppm in Figure 1. However, predominantly these resonances do not interfere with the electronic integration of those derived from low-molecular-mass aromatic catabolites.
One of the original investigations involved in the development of the CPMG pulse sequence [123] explored the influence of its pulse sequence acquisition criteria on the 1 H NMR detection of low-molecular-mass agents and found that increasing the total echo time span through modifications of the numbers of cycles applied regulates broad protein resonance loss, although signals from low-molecular-mass metabolites are also influenced somewhat, as monitored by overall decreases in their intensities; as we might expect, such small molecule resonance effects were less pronounced than those involving macromolecules. In experiments involving retention of the total echo time at a constant value, elevations of this parameter (τ E ) were found to give rise to distortions in scalar couplings. As expected from facile thermodynamic considerations, the researchers involved confirmed that the higher the added protein concentration, the greater the affinity of such macromolecules for binding low-molecular-mass metabolites. As confirmed by Bell et al. [124], resonances arising from these small molecules have shorter T 2 values than those present in non-protein-containing solutions and hence are more influenced by the CPMG relaxation filter system. Further pioneering studies concerning development of the CPMG pulse sequence, and that of the alternative, albeit earlier, Hahn spin-echo Fourier-transform (SEFT) option, are reported in [125,126], respectively.
As long ago as 1988, Bell et al. [124] reported the detection of a highly significant fraction of blood plasma lactate that was bound to plasma proteins (predominantly but not exclusively albumin present at a level of ca. 0.50 mmol./L) in both single-pulse and Hahn spin-echo 1H NMR spectra acquired on this biofluid. This so-called 'NMR-invisible' lactate may be mobilized from the protein-bound form to the Hahn spin-echo spectrum-detectable 'free' form via the addition of charged reagents such as ammonium chloride (NH4Cl), with a maximal release extent observed at 0.50 mol./L of this reagent added, so presumably negatively charged lactate was tightly electrostatically bound to positively charged protein amino acid side-chains, such as lysine residues, in these plasma proteins, and displaceable by added Cl− anion. In corresponding spin-echo spectra of ultrafiltrates (<10 kDa cut-off) obtained before and after NH4Cl addition to whole plasma, large increases in the lactate-CH3 and -CH resonance intensities of up to three-fold were observed when they were expressed relative to those of the endogenous amino acids alanine or valine. Following NH4Cl addition at a level of 0.50 mol./L, the ∆v1/2 value of the lactate-CH3 signal decreased from 3.5 to 2.2 Hz; that of alanine remained unmodified (2.3 Hz). The authors concluded that total plasma lactate concentrations were only reliably estimated following excess NH4Cl addition. Without this prior sample preparation step, 1H NMR-determined lactate concentrations were approximately only one-third of those measured by an established enzyme assay system.
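The arithmetic behind the 'NMR-invisible' fraction can be made explicit with a minimal sketch. It assumes that the enzymatic assay reports total lactate and that the untreated 1H NMR measurement reports only the 'free' form; the concentrations used are hypothetical values chosen to reproduce the approximate one-third ratio reported in [124], and are not data from that study.

```python
def nmr_visible_fraction(nmr_conc: float, reference_conc: float) -> float:
    """Fraction of the reference (total) concentration detected by NMR."""
    if reference_conc <= 0:
        raise ValueError("reference concentration must be positive")
    return nmr_conc / reference_conc


def bound_fraction(nmr_conc: float, reference_conc: float) -> float:
    """Fraction assumed to be protein-bound and hence 'NMR-invisible'."""
    return 1.0 - nmr_visible_fraction(nmr_conc, reference_conc)


if __name__ == "__main__":
    # Hypothetical concentrations in mmol/L (illustrative only).
    nmr_lactate, enzymatic_lactate = 0.6, 1.8
    print(f"visible: {nmr_visible_fraction(nmr_lactate, enzymatic_lactate):.2f}")
    print(f"bound:   {bound_fraction(nmr_lactate, enzymatic_lactate):.2f}")
```

On these assumptions, roughly two-thirds of the plasma lactate would be missed without prior NH4Cl treatment, which is consistent with the up to three-fold intensity increases described above.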
However, we find that, for both metabolites, these added NH 4 Cl-liberated ∆v 1/2 values found for plasma in Hahn spin-echo spectra were actually rather higher than those measured in single-pulse spectra acquired on WMS supernatant samples, so this indicated that there is little or no 'masking' of fractions of these metabolites (and hence also their 1 H NMR resonances), via their equilibria with protein binding-sites available in this relatively protein-deplete biofluid. The dependencies of these ∆v 1/2 values on biomolecule structure, participant donors, and doublet resonance line position (frequency) are investigated in Case Study 1 described in Section 7.3 below.
In a related study, Aiello et al. [122] explored the NMR detection and quantification of low-molecular-mass species in deuterochloroform (C 2 HCl 3 ) solutions containing synthetic polymers (polyethylene glycol and polystyrene), and experiments were conducted both with and without use of the T 2 CPMG spin-echo pulse sequence filter. The effects exerted by differing acquisitional parameters (echo time, relaxation delay, and the number of cycles), along with sample compositional indices, i.e., molecular mass of polymer and concentrations, on analytical responses, were investigated with a highly recommendable analysis-of-variance (ANOVA)-based design of experiments model, which considered a first-order interaction and squared variable terms, in addition to a range of potential response parameter variance-contributing input factors. This design was employed in order to optimize the reliability of quantitative analysis performed. Mixtures containing variable concentrations of both low-molecular-mass and macromolecular molecules were analyzed both with and without application of the CPMG filter. Results acquired demonstrated that increases in polymer levels attenuate its own resonances in CPMG 1 H NMR spectra anyway, since this gives rise to corresponding increases in sample solution viscosities, and hence corresponding decreases in its resonance's T 2 values. However, the STN ratios of small molecules only experience a minimal reduction when specific acquisitional parameters are selected according to the macromolecule's molecular mass. Moreover, it should be noted that the results acquired were dependent on differences between 'mobile' and 'stiff' polymers, as may be expected, and also between the dynamics of aliphatic and aromatic 1 H nuclei. These phenomena gave rise to differential extents of macromolecule resonance suppression, which, according to the authors, may be employed for making deductions regarding the molecular structure of polymers evaluated. 
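To indicate what a design-of-experiments model with first-order interaction and squared variable terms looks like in practice, the following sketch fits a generic second-order response surface by ordinary least squares. It is not the model used in [122]: the two coded factors (which might stand for echo time and number of cycles), the coefficient values, and the noise-free synthetic responses are all hypothetical, and only the structure of the design matrix is intended to be informative.

```python
from itertools import combinations


def design_row(factors):
    """Intercept, main effects, two-way interactions, then squared terms."""
    row = [1.0]
    row += list(factors)
    row += [a * b for a, b in combinations(factors, 2)]
    row += [f * f for f in factors]
    return row


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_response_surface(runs, responses):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    x = [design_row(r) for r in runs]
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(x, responses)) for i in range(p)]
    return solve_linear(xtx, xty)


if __name__ == "__main__":
    levels = (-1.0, 0.0, 1.0)                        # coded factor levels
    runs = [(a, b) for a in levels for b in levels]  # 3 x 3 full factorial
    true_beta = [10.0, -2.0, -1.5, 0.5, 0.8, 0.3]    # made-up coefficients
    responses = [sum(c * t for c, t in zip(true_beta, design_row(r))) for r in runs]
    print([round(b, 6) for b in fit_response_surface(runs, responses)])
```

In a real study the responses would be measured STN ratios or integrals, and an ANOVA on the fitted terms would then indicate which factors and interactions contribute significantly to the response variance.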
Therefore, in principle, this technique may be employed in metabolomics experiments in order to obtain valuable information on the structural nature of proteins or other biomacromolecules such as polysaccharides, salivary or otherwise, which bind low-molecular-mass metabolites, in addition to their molecular masses.

'State-of-the-Art' Advances beyond the CPMG Pulse Sequence
Generally, T2 filters such as the CPMG pulse sequence are combined with a water/HOD presaturation sequence, and this combination, known as the presat-CPMG pulse sequence, has now been employed in biofluid NMR analysis for a considerable period of time, and remains a stalwart of NMR-based metabolomics techniques. Although this composite pulse sequence has a logical historical basis, it often does not achieve perfect water suppression and requires a high radiofrequency power; consequently, its performance diminishes with increasing magnetic field strength.
However, in view of these problems, a new combinatorial pulse sequence known as 'WASTED' (WAter Suppression with a Transverse relaxation filter that Eliminates Distortions) has been developed [127], and this represents a composite of the CPMG-superior PROJECT sequence with the Robust-5 water presaturation sequence. The PROJECT sequence is particularly valuable, since it has the ability to recover resonances suppressed by the element that removes the intense water signal, notably, but not exclusively, those localized close to its presaturation frequency. Moreover, PROJECT's performance is predominantly independent of magnetic field strength. Hence, WASTED pulse sequences circumvent some of the hurdles associated with the presat-CPMG strategy, and therefore offer benefits regarding the multicomponent analysis of biofluids, for which the simultaneous suppression of the ca. 50 mol./L water signal and a broad 'spectrum' of biomacromolecule signals during spectral acquisition is an essential requirement.
Therefore, the carefully planned application of WASTED pulse sequences, including the sequential order of application of the PROJECT and Robust-5 components, can successfully attenuate water resonance intensities from ca. 50 mol./L down to as little as 0.90 mmol./L [127]. Additionally, the T2 filter of WASTED permits inter-pulse delays (γ) that are an order of magnitude greater than those countenanced by the CPMG sequence, which lowers the power input required for successful operation. In view of these lowered power depositions, pulse sequences such as WASTED provide opportunities for the employment of T2 encoding methods to distinguish between differential metabolite 'pools' or the structures of macromolecules.
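To put the quoted water attenuation into perspective, the short calculation below converts the apparent concentrations given above (ca. 50 mol./L before, and 0.90 mmol./L after, suppression) into a suppression factor. The function name is an illustrative assumption, and the result is only as precise as the rounded concentrations quoted.

```python
import math


def suppression_factor(initial_mol_per_l: float, residual_mol_per_l: float) -> float:
    """Ratio of the unsuppressed to the residual apparent concentration."""
    if initial_mol_per_l <= 0 or residual_mol_per_l <= 0:
        raise ValueError("concentrations must be positive")
    return initial_mol_per_l / residual_mol_per_l


if __name__ == "__main__":
    factor = suppression_factor(50.0, 0.90e-3)  # values quoted in the text
    print(f"suppression factor ≈ {factor:,.0f}-fold "
          f"({math.log10(factor):.2f} orders of magnitude)")
```

On these figures, the residual water signal corresponds to an attenuation of between four and five orders of magnitude, which brings it into the same apparent concentration range as the more abundant salivary metabolites.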
An example of our laboratories' use of the WASTED pulse sequence is available in the pilot study spectra shown in Figure 2.
From our single-pulse metnoesy 1D 1H NMR spectra acquired on WMS supernatants (Figure 4a-c), we found that the mean ± SEM ∆v1/2 value for the lactate-CH3 signal was 1.67 ± 0.03 Hz (range 1.33-2.64 Hz; n = 7 samples, each with 2 × ∆v1/2 values per doublet resonance), although these values appeared to be both sample donor- and metabolite-dependent. Indeed, that for alanine was significantly higher (mean ± SEM 1.98 ± 0.03 Hz, range 1.63-2.65 Hz). In order to explore this further, we elected to perform an analysis-of-variance (ANOVA)-based experimental design model incorporating the 'between-participants (samples)', 'between-metabolites' (i.e., lactate vs. alanine), and 'between-lines-nested-within-resonances' fixed sources of variation, and also a metabolite × participant first-order interaction effect. This univariate (UV) analysis revealed that the 'between-participants' and 'between-metabolites' effects were statistically significant (p < 10^−6 and < 10^−5, respectively), whereas that 'between-lines-nested-within-resonances' was not. However, the above interaction effect was also marginally significant (p = 0.014), and the nature of this can be observed in Figure 4d, which shows that although there were quite strong correlations between the ∆v1/2 indices for these metabolites for Participants 1 and 4-7, this was certainly not the case for Participants 2 and 3, for which the alanine values were approximately 50% greater than those of lactate; therefore, perhaps, for these two samples, there was a small but significant increase in the amount of alanine, but not lactate, bound to particular salivary macromolecules. The corresponding mean ± SEM ∆v1/2 value for the TSP internal standard in the seven spectra acquired was 1.36 ± 0.04 Hz.
Therefore, it appears that any binding interactions of salivary lactate and alanine with proteins in this biofluid are similar to those of TSP for the former for all samples, and possibly for most samples regarding the latter biomolecule. Interestingly, in [124], the ∆v1/2 parameter for alanine in blood plasma was actually significantly lower than that for lactate, prior to release of the latter from protein binding sites via 0.50 mol./L NH4Cl addition, but these values were virtually equivalent thereafter (ca. 2.2 Hz). However, for our saliva samples in the current study, the lactate value was lower than that of alanine, but only significantly so in 2/7 samples (Figure 4). Therefore, it certainly appears that the T2 values of these metabolite protons are not significantly influenced (i.e., decreased) by any salivary protein binding or exchange processes, presumably because of the limited protein concentrations of this biofluid. However, experiments involving the acquisition of spectra prior and subsequent to the addition of macromolecule-releasing NH4Cl to a series of biofluids up to a final level of 0.50 mol./L will be required to confirm this. These NH4Cl addition experiments are currently in progress in our laboratories. The mean ∆v1/2 index for salivary lactate was also similar or very similar to those determined for salivary propionate and n- and iso-butyrates (data not shown).
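For readers wishing to summarize their own linewidth measurements in the same mean ± SEM form used above, a minimal sketch follows. The linewidth values are hypothetical and are not the data of Case Study 1; in addition, pooling both lines of each doublet as independent observations is a simplifying assumption, which the nested ANOVA model described above avoids.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, SEM), where SEM = sample SD / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical linewidths (Hz): 7 samples x 2 doublet lines each.
    linewidths_hz = [1.52, 1.55, 1.61, 1.58, 1.70, 1.73, 1.66,
                     1.64, 1.59, 1.62, 1.80, 1.77, 1.68, 1.71]
    mean, sem = mean_sem(linewidths_hz)
    print(f"{mean:.2f} ± {sem:.2f} Hz "
          f"(range {min(linewidths_hz):.2f}-{max(linewidths_hz):.2f} Hz, "
          f"n = {len(linewidths_hz)})")
```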

Homo-and Heteronuclear Two-Dimensional (2D) NMR Analysis of Human Saliva
In metabolomics studies, such 2D experiments frequently involve homonuclear techniques such as 2D 1H-1H correlation spectroscopy (COSY), 2D total correlation spectroscopy (1H-1H TOCSY), 2D 1H-1H INADEQUATE, or 2D J-resolved approaches. In addition, heteronuclear 2D experiments include 2D 1H-15N heteronuclear single quantum coherence (HSQC) and 1H-13C HSQC technologies, which may often additionally augment the identification of salivary biomolecules in view of their ability to provide multinuclear and generally more specific chemical shift data secured within the second dimension, together with further atomic connectivity details available within the 2D profiles as a whole. Despite this, one major disadvantage of these increasingly employed 2D NMR techniques is that they generally require lengthy periods of time to acquire and fully process, most especially with the heteronuclear strategies applied, i.e., many hours rather than 10-15 min for a typical 1D 1H NMR profile. These lengthy acquisition times are largely a consequence of the relatively poorer sensitivity of 2D NMR analysis when compared on a unit time basis with 1D NMR experiments, the LLOD value being approximately 10-fold poorer. Moreover, spectral interpretation is certainly not facile, and may take considerably longer than the time required for 1D spectral assignments. Furthermore, 2D analysis is often unable to furnish acceptably uniform resonances for the accurate quantification of metabolites. However, once these essential primary, assignment-focused 2D NMR experiments are completed, researchers may then optimize acquisitional, interpretational, and assignment requirements for their corresponding 1D experiments, which may sometimes involve the analysis of many hundreds of biofluids, or even more. The applications of a number of these 2D techniques to the analysis of WMS supernatants are discussed in detail below.
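The 'many hours' quoted above follow directly from the way 2D experiments are acquired: each increment in the indirect dimension requires a full set of transients. The sketch below gives a back-of-envelope estimate that neglects the incremented evolution period, the mixing time, and processing. The 32 transients, 16 dummy scans, 2 s relaxation delay, and 1.0 s acquisition time are the TOCSY settings listed in the Figure 6 caption, whereas the number of increments (256) is a hypothetical value, since it is not stated there.

```python
def experiment_time_2d_s(n_increments: int, transients: int,
                         relaxation_delay_s: float, acquisition_s: float,
                         dummy_scans: int = 0) -> float:
    """Approximate 2D duration: every scan costs one RD plus one acquisition."""
    if n_increments <= 0 or transients <= 0:
        raise ValueError("increments and transients must be positive")
    total_scans = n_increments * transients + dummy_scans
    return total_scans * (relaxation_delay_s + acquisition_s)


if __name__ == "__main__":
    seconds = experiment_time_2d_s(n_increments=256,  # assumed, not reported
                                   transients=32,
                                   relaxation_delay_s=2.0,
                                   acquisition_s=1.0,
                                   dummy_scans=16)
    print(f"estimated duration ≈ {seconds / 3600:.1f} h")
```

Halving the number of increments or transients halves this estimate, at the cost of indirect-dimension resolution or sensitivity, respectively.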
Notably, the development and execution of more rapid and more sensitive 2D NMR strategies, along with a greater emphasis on quantitative aspects of these analyses, are currently major priorities for all those engaged or to be engaged in future metabolomics research experiments.

Homonuclear 1 H-1 H Correlation and Total Correlation Spectroscopies
Two-dimensional (2D) 1H-1H correlation spectroscopy (COSY) is a rapid and convenient confirmatory strategy to employ for the relatively simple confirmation of directly coupled C1-H and C2-H nuclei in a range of salivary metabolites, and most especially if one or both of the resonances involved are located within crowded or overcrowded spectral regions such as the 0.80-1.50, 1.80-2.50, or 2.90-4.20 ppm sectors. For example, the technique is valuable for verifications of the identities of the amino acids glutamate and glutamine in human saliva, together with the bacterial catabolites n- and iso-butyrates. However, the 1H-1H TOCSY spectroscopic technique permits the transfer of magnetization from a single or magnetically equivalent group (2 or 3) of 1H nuclei covalently bonded to one specific carbon atom (C1), to one or more magnetically distinct 1H nuclei within regions of two or more carbon positions further along a molecular chain, and remote from the C1 location (i.e., those at the C3, C4, C5 positions, in addition to the C2 one, etc.), most notably for the long side-chains of selected amino acid residues such as branched-chain ones (valine, isoleucine, etc.), the lengthy 'backbones' of salivary bacterial-derived organic acid anions such as n-butyrate and 5-aminovalerate, or the malodorous amine putrescine. A brief guide to these C1, C2, C3, etc., labels is provided in Figure 5. Hence, for such TOCSY spectra, much additional molecular structural information over that deduced from the facile direct 3-bond coupling of C2 1H proton(s) to the C1-borne 1H nucleus, as observed with the 1H-1H COSY pulse sequence, is derivable. Indeed, our previous studies have reported on applications of the TOCSY technique as a means of aiding preliminary assignments made from the 1D 1H NMR profiles of HSS samples and for overcoming spectral resonance overlap issues therein [3].
From such investigations, it was confirmed that this technique was of much value for the seeking and verification of the identities of complete molecular chains of relevant salivary biomolecules, and the unambiguous assignment of their 1H signals. The advent of quantitative 2D NMR spectroscopy as applied to recent metabolomics studies [128,129] may serve to facilitate the determination of salivary concentrations of metabolites with resonances that are not readily electronically integrable in the 1D profiles acquired in view of spectral signal superimposition or crowding phenomena. Typical 1H-1H TOCSY profiles of WMS supernatant samples are shown in Figure 6, and these display strong connectivities between the 1H signals of a wide range of salivary biomolecules.
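The difference between the correlations expected in COSY and TOCSY spectra can be expressed as a small connectivity sketch. It treats a molecule as an idealized, unbroken linear chain of coupled positions, uses the arbitrary C1-C4 labels adopted for 5-aminovalerate in Figure 5, and assumes complete magnetization transfer along the chain in the TOCSY case; real cross-peak intensities depend on the coupling constants and the mixing time, which are not modeled here.

```python
from itertools import combinations


def cosy_pairs(chain):
    """Correlations between directly adjacent positions only."""
    return [(chain[i], chain[i + 1]) for i in range(len(chain) - 1)]


def tocsy_pairs(chain):
    """Correlations between all positions of one unbroken spin system."""
    return list(combinations(chain, 2))


if __name__ == "__main__":
    aminovalerate = ["C1", "C2", "C3", "C4"]  # arbitrary labels, as in Figure 5
    print("COSY :", cosy_pairs(aminovalerate))
    print("TOCSY:", tocsy_pairs(aminovalerate))
```

For this four-position chain, the idealized COSY spectrum contains three distinct correlations and the TOCSY spectrum six, including the remote C1-C4 connectivity that confirms all four groups belong to a single molecule.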

Figure 5. Molecular structures of valine, isoleucine, and 5-aminovalerate. For the purpose of TOCSY spectral interpretations, C1-Cn labels for each structure are arbitrary, although for this diagram they should be read from left to right. For valine, the 2 × γ-CH3, β-CH, and α-CH protons represent the C1A and C1B, C2 and C3 ones, respectively; for isoleucine, the δ-CH3, γ-CH2, β-CH, β-CH(CH3), and α-CH protons represent the C1, C2, C3, C4A, and C4B ones, respectively; and for 5-aminovalerate, the α-, β-, γ-, and δ-CH2 groups represent the C1, C2, C3, and C4 ones, respectively. 2D 1H-1H COSY spectra reveal connectivities between a specified nucleus and its nearest neighboring ones only, for example between adjacent C1 and C2 nuclei, and between C2 and C1 plus C3 nuclei, in 5-aminovalerate. However, 2D 1H-1H TOCSY spectra show coupled nuclei linkages between up to 5 or more 1H nuclei within a molecular chain, and therefore for 5-aminovalerate correlations between all these nuclei are clearly visible, as shown in Figure 6 below.
Figure 6. (a,b) Partial 600 MHz 1H-1H TOCSY NMR spectra of WMS supernatant samples. Typical spectra are shown. The spectrum shown in (a) is adapted and updated from that available in [3] with permission. This spectrum was acquired on a Bruker AMX-600 spectrometer, and acquisition parameters for it are also provided in [3]. Connected resonance assignments: 1, fatty acids (terminal-CH3 spin system); 2, leucine; 3, fatty acids (bulk acyl chain spin system, including α-CH2 resonance); 4, lysine; 5, 2-hydroxyglutarate; 6, γ-aminobutyrate; 7, ornithine; 8, 5-aminovalerate; 9, isoleucine; 10 […] acetoin. *Tentative assignment. The 1H-1H TOCSY profile shown in (b) was acquired on a Jeol JNM-ECZ600R/S1 600 MHz spectrometer operating at a frequency of 600.17 MHz and a probe temperature of 25 °C (the Watergate_W5_Robust pulse sequence was employed for its acquisition). Acquisition parameters were: 11,254 data points per increment; spectral width 11,281 Hz in x, and 6002 Hz in y dimensions; 32 transients in each case; 16 dummy scans; relaxation delay (RD) 2 s; and x and y acquisition times of 1.0 s and 50 ms, respectively. The mixing time was set at 70 ms.
Hence, the acquisition of such 2D spectra has been found to readily facilitate the assignments of resonances ascribable to free amino acids, for example those of alanine, glutamate, glutamine, isoleucine, leucine, valine, lysine, ornithine, proline, and phenylalanine. Additional biomolecule signals verifiable in such TOCSY spectra have included those arising from γ-amino-n-butyrate and propane-1,2-diol, the latter being an agent found in many oral healthcare products, which is also present in cigarette smoke [58]. 1H-13C correlations for adjacent aliphatic 1H nuclei in fatty acid chains, i.e., CH3(CH2)nCH2CH2CO- for saturated classes, were also found [3].

J-Resolved NMR Analysis
The 1H-1H J-resolved (JRES) technique is ideal for resolving many complex overlapping multiplet signals by a dispersion of chemical shift and coupling constant data into two orthogonal frequency domains, an advantage that readily facilitates spectral assignment. If properly optimized through the acquisition of a sufficient, albeit experimental time-restrictive, number of F2 projections, it may be employed for confirming determinations of the linewidth at half-height (∆v1/2) values of many resonances in complex biofluid samples. As noted above in Section 7, this parameter is, of course, of critical importance for exploring equilibria between the 'free' and 'macromolecule-bound' forms of salivary biomolecules, although, as specified, this consideration is of much less importance for human saliva than it is for protein-rich blood plasma, or inflammatory knee-joint synovial fluid, for example, in view of its much lower total protein content [130]. Indeed, the binding of low-molecular-mass metabolites to protein, either electrostatically, or via H-bonding or van der Waals forces, substantially increases the linewidths of such signals, so much so that they are often viewed and termed as being 'NMR-invisible' in CPMG or Hahn spin-echo spectra acquired on protein-rich biofluids [124]. This parameter, which is determined by 1H nuclei T2 values, is also markedly influenced by differential biofluid viscosities, with greater viscosities leading to higher ∆v1/2 values in view of reduced resonance T2 values from the overall lowered molecular mobilities of analytes.
Of further value, the 1H-1H JRES technique allows the accurate determination of exact chemical shift values of superimposed 1H signals present in crowded regions of the 1D profiles obtained on HSS samples. Indeed, this is readily achievable through visualization of the F2 'skyline' projection, which generates information that has been found to complement that acquired from the corresponding 1H-1H TOCSY profiles. For example, spectroscopic analysis using the 1H-1H JRES approach was found to be highly valuable for the resolution of a large number of WMS supernatant sample resonances present in the crowded 3.00-4.30 ppm sector of 1D spectra, particularly the α-CH function protons of a range of amino acids (leucine, isoleucine, valine, methionine, histidine, phenylalanine, tyrosine, and aspartate), the α-CH2 group of glycine, the -CH2- functions of taurine, tyrosine, phenylalanine, and histidine, the carbohydrate ring protons of sucrose, galactose, glucose, and N-acetyl sugars, the alditol ring protons of inositol, and the N-CH3 and -CH2- groups of creatinine and creatine, along with the -CHOH- lactate proton.
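The F2 'skyline' projection mentioned above is simply the maximum intensity found along the coupling-constant (F1) axis at each chemical shift (F2) point. The sketch below computes it for a tiny, hypothetical intensity matrix in which rows index F1 and columns index F2; it assumes that the spectrum has already been tilted and symmetrized, as is usual in JRES processing, and those steps are not shown.

```python
def skyline_projection(spectrum):
    """Maximum intensity over all F1 rows for each F2 column."""
    if not spectrum or not spectrum[0]:
        raise ValueError("spectrum must be a non-empty 2D list")
    return [max(column) for column in zip(*spectrum)]


if __name__ == "__main__":
    # Hypothetical 3 (F1) x 5 (F2) intensity matrix, arbitrary units.
    jres = [
        [0.0, 4.0, 0.0, 0.0, 1.0],
        [0.0, 0.0, 9.0, 0.0, 0.0],
        [0.0, 4.0, 0.0, 0.0, 1.0],
    ]
    print(skyline_projection(jres))
```

Because the multiplet components of each signal collapse onto a single F2 position after tilting, the resulting projection resembles a broadband proton-decoupled 1H spectrum, which is why it is useful for reading off chemical shifts in crowded regions.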

1H-13C Heteronuclear Multiple Quantum Coherence and Heteronuclear Multiple Bond Coherence Spectroscopies
For 'state-of-the-art' metabolomics analysis of biofluid samples, 13C NMR analysis is rarely adopted for biomolecular quantification purposes because of its poor sensitivity, which predominantly arises from its low natural abundance (1.11%), with special reference to biomolecules that are present at low concentrations (e.g., <200 µmol./L) and that are more readily determined by the 1H NMR technique. Indeed, the application of this technique in such metabolomic profiling experiments is largely restricted to qualitative confirmations of the identities of analytes, including selected biomarkers when present at concentrations above their limits of detection. However, the development of inverse geometry probes and supporting pulse sequences has now largely circumvented this limitation, and 1H-13C heteronuclear multiple quantum coherence (HMQC) and heteronuclear multiple bond coherence (HMBC) spectroscopies, and related analytical strategies, provide a substantial enhancement in sensitivity over that of the conventional 1D spectroscopic technique. These inverse 1H-13C correlational spectroscopies feature the detection of 13C to 1H connectivities (although heteronuclei other than 13C may also be employed). The HMQC technique is selective for direct 1H-13C couplings only, whereas HMBC spectroscopies provide valuable molecular information through longer, 2-4 bond range couplings. Hence, the HMQC and HMBC techniques represent the 1H-13C equivalents of the homonuclear 1H-1H COSY and TOCSY approaches, respectively. Most modern high-resolution NMR facilities implement gradient-selected forms of HMQC and HMBC, which serve to remove unwanted resonance artefacts and hence improve the quality of the 2D spectra acquired.
600 MHz 1H-13C inverse-detected HMQC spectra of HSS samples have been found to contain a series of cross-resonances for many 1H NMR-detectable biomolecules, including organic acid anion bacterial catabolites and amino acids. The spectra acquired confirmed the advantages offered by the 2D HMQC technique, notably its ability to resolve 1H NMR resonances that overlap in the corresponding 1D spectra. Moreover, employment of an 8 mm 13C NMR probe in these experiments facilitated the detection of a range of biomolecules present at only low salivary levels. Access to available databases aided verification of the correspondence between metabolite 1H and 13C NMR assignments, and this process therefore assisted with analyte identification and clarification. Furthermore, 13C assignments and related connected HMQC signals may be confirmed via reference to established chemical shift values for authentic model compounds (e.g., Lindon et al., 1999 [131]).

Preprocessing, Bucketing, Normalization, Transformation, and Scaling of 1H NMR Profiles of HSS Specimens: Bioanalytical Considerations for Metabolite Quantification and Metabolomics Analysis
In general, the intensities of salivary biomolecule resonances are determined by electronic integration of the appropriate spectral regions through application of spectrometer proprietary software (e.g., XWIN-NMR), and exactly the same integral regions are maintained for all spectra included in studies. Concentrations of all possible 1H NMR-detectable metabolites, and, where appropriate, also those of xenobiotics such as drugs ingested, are computed by comparisons of their resonance areas with that of internal TSP, which in our studies is usually added at a final level of ca. 250 µmol./L. Since the protein content of human saliva [130] is substantially lower than that of human blood plasma or serum [19,132], the 'envelopes' generated by any broad macromolecule resonances do not significantly superimpose upon the great majority of sharper low-molecular-mass biomolecule signals. Therefore, any potential obscurement of these sharper resonances is predominantly negligible and has been found not to significantly interfere with their electronic integration and metabolite quantification [3], an observation confirmed in [31].
Of course, the concentrations of salivary biomolecules determined represent only the non-macromolecule-bound fraction of these and therefore are expected to be somewhat lower than their total salivary concentrations. Unfortunately, this certainly appears to be a little known or appreciated fact within the salivary NMR-based metabolomics community. However, for saliva, the extent of such macromolecule binding is expected to be limited and markedly less so than that observed in blood plasma in view of its much lower mean protein concentration [18,19,130,132].
Since it is generally accepted that salivary metabolite concentrations are inversely proportional to salivary flow-rate (SFR), the authors recommend constant sum normalization (CSN) prior to the performance of MV statistical analysis regimens, unless the purpose of the metabolomics study in question is to compare the absolute concentrations of metabolites, as indeed it sometimes is. In this approach, unwanted resonances (e.g., the residual H2O/HOD signal following presaturation, or perhaps those arising from ethanol ingestion, etc.) and spurious resonance regions (noise only, or those below the lower limit of quantification) are first removed. The total spectral profile intensities for each sample are then summed, usually row-wise, and expressed as a 'normalized' sum total value of say 100 or 1000, and the relative intensity of each metabolite resonance is expressed as a proportion of this total. Therefore, proportionate CSN values may be considered for both UV and MV analyses, and this approach is discussed further in Section 9.2 below.
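The row-wise CSN step described above can be sketched as follows. This is a minimal illustration only, assuming a (samples x buckets) NumPy matrix of bucket intensities; the function name, the `exclude` argument, and the demonstration values are hypothetical and are not taken from any particular software package.

```python
import numpy as np

def constant_sum_normalize(X, total=100.0, exclude=()):
    """Row-wise CSN of a (samples x buckets) matrix of bucket intensities.

    `exclude` lists column indices to drop first (e.g. residual water,
    ethanol, or noise-only buckets); these names are illustrative.
    """
    X = np.asarray(X, dtype=float)
    keep = np.ones(X.shape[1], dtype=bool)
    keep[np.asarray(exclude, dtype=int)] = False
    kept = X[:, keep]
    # Each row is rescaled so that its retained buckets sum to `total`.
    return total * kept / kept.sum(axis=1, keepdims=True)

# Two hypothetical samples with the same composition, the second twice as
# dilute; the last column stands for a residual water bucket to be removed.
X_demo = np.array([[10.0, 30.0, 60.0, 500.0],
                   [ 5.0, 15.0, 30.0, 480.0]])
X_csn = constant_sum_normalize(X_demo, total=100.0, exclude=[3])
```

Both rows of `X_csn` are identical after normalization: the twofold dilution is removed, which is the desired behaviour when SFR-driven dilution is a nuisance variable, and equally the reason why a uniform n-fold change in all metabolites cannot be detected after CSN.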
However, the rather simple method of measuring SFR, which usually involves the expectoration of as much saliva as possible from human participants within a 5.0 min period, and then calculation of this parameter as mL/min, is, of course, afflicted by complications, such as how sure can we be that all possible saliva available is transferred to sampling tubes, etc.

Spectral Bucketing
Autoprocessing of the 1H NMR spectra of biofluids such as human saliva with phasing and baseline correction software now represents a standard, albeit still developing, strategy for metabolomics protocols featuring up to several hundreds of bioanalyte predictor variables. Indeed, these analytical tools have been extensively assessed and tested by a wide range of biomedical NMR research laboratories and have, to date, proven to generate highly accurate results for their employment in optimized statistical models for the MV analysis of such datasets. However, a valuable spectral preprocessing strategy for metabolomics studies, including those performed on WMS supernatants, and known as 'intelligent bucketing', has been developed in order to further enhance the benefits of selective resonance bucketing in metabolic profiling technologies. Indeed, intelligent bucketing [133] was developed to effectively achieve 'smartly selected' bucket divisions (i.e., smart bucketing decisions) for the highly complex, multianalyte resonance-populated 1H NMR spectra of biofluids.
Hence, this approach offers major advantages regarding the accuracy of salivary 1H NMR data modeling through the avoidance of inherent complications arising from the application of 'classical' bucketing techniques, i.e., those with fixed bucket widths of say 0.04 or 0.05 ppm. Such complications derive from the high sensitivity of high-resolution 1H NMR analysis to the molecular and physicochemical environments of the biofluids explored, particularly those arising from small changes in pH, temperature, and the ubiquitous occurrence of metal ion-biomolecule complexation reactions such as those involving the ligation or chelation of Ca2+, Mg2+, or other available metal ions by salivary organic acid anions such as succinate, citrate, or lactate [5], or amino acids such as glycine, alanine, or histidine.
Intelligent bucketing involves an algorithm designed to decide exactly where chemical shift bucket (or bin) divisions should be placed. When the fixed bucket approach is selected, the edges of buckets may unfortunately be positioned at the center of selected NMR signals, and therefore the net contributions of these signals may be spread over two, or sometimes more, regions for electronic integration. Although the nature of principal component analysis (PCA) or factor analysis (FA) may self-correct errors of this nature through the combination of two such spectral regions into a single PC, in view of each PC's correlated bucket variables, inaccurate or only limited results may arise when the resonance in question is susceptible to the above pH-, temperature-, metal ion-, ionic strength-, and, where appropriate, viscoelastic-controlled small differences in chemical shift value(s) between samples donated for comparative analytical purposes, as we might expect. In such situations, the contributions of such resonances are asymmetrically divided between two or more bucket integration segments. Furthermore, the remainder of the signal(s) in such buckets may arise from another biomolecule that represents a portion of a separate predictor variable; contributions of this nature will stray from the statistical model, and if the exact chemical shift location of a resonance is highly physicochemical property-dependent, its relative contributions to each of the bucket intensities featured will vary, perhaps significantly so. This phenomenon will disturb the accuracy of the MV analysis conducted, and may therefore confound the interpretation of the results arising from it.
Fortunately, intelligent bucketing has the computational ability to 'decide' on bucketing divisions that are constructed from local spectral minima, and hence this algorithm circumvents the above preprocessing errors. Such 'bucketing' may be performed on a whole set of overlaid 1H NMR profiles, and therefore resonances that shift with small modifications in physicochemical solution properties can be simultaneously reviewed and accommodated, if indeed required [133].
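To illustrate the principle of placing bucket divisions at local spectral minima, a minimal sketch is given below. It is not the algorithm of [133], whose internal details are not reproduced here; it assumes instead a simple rule in which each nominal fixed-width edge is moved to the lowest point of the mean overlaid spectrum within a tolerance window. The function names, the `slack` parameter, the ascending ppm axis, and the synthetic Lorentzian signals are all assumptions made for demonstration purposes.

```python
import numpy as np

def minima_guided_edges(ppm, spectra, width=0.04, slack=0.5):
    """Move each nominal fixed-width edge to the minimum of the mean
    spectrum found within +/- slack*width of its nominal position."""
    ppm = np.asarray(ppm, dtype=float)            # ascending axis assumed
    mean_spec = np.asarray(spectra, dtype=float).mean(axis=0)
    edges = [ppm[0]]
    for e in np.arange(ppm[0] + width, ppm[-1], width):
        window = np.where(np.abs(ppm - e) <= slack * width)[0]
        if window.size == 0:
            continue
        best = ppm[window[np.argmin(mean_spec[window])]]
        if best > edges[-1]:                      # keep edges increasing
            edges.append(best)
    if edges[-1] < ppm[-1]:
        edges.append(ppm[-1])
    return np.array(edges)

def integrate_buckets(ppm, spectra, edges):
    """Sum the intensities falling in each bucket, for every spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    idx = np.searchsorted(ppm, edges)
    idx[-1] = len(ppm)                            # include the last point
    return np.stack([spectra[:, a:b].sum(axis=1)
                     for a, b in zip(idx[:-1], idx[1:])], axis=1)

def lorentzian(ppm, centre, hwhm=0.002):
    return hwhm**2 / ((ppm - centre)**2 + hwhm**2)

# One synthetic signal lying on the fixed edge at 1.08 ppm, shifted by
# 0.002 ppm between two samples (e.g. through a small pH difference).
ppm = np.linspace(1.0, 1.2, 2001)
spectra = np.vstack([lorentzian(ppm, 1.079), lorentzian(ppm, 1.081)])

fixed_edges = np.array([1.0, 1.04, 1.08, 1.12, 1.16, 1.2])
smart_edges = minima_guided_edges(ppm, spectra, width=0.04)
fixed = integrate_buckets(ppm, spectra, fixed_edges)
smart = integrate_buckets(ppm, spectra, smart_edges)
```

With the fixed edges, the 1.04-1.08 ppm bucket receives very different intensities from the two samples although the underlying signal area is the same, whereas the minima-guided edges keep the whole resonance within a single bucket for both samples.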

Data Normalization, Transformation, Scaling, and Dimensionality Reduction
Usually, an essential measure to be taken prior to the performance of MV metabolomics data analysis is that metabolite resonance intensities should be 'normalized' according to a recommended procedure. In general, the first stage involves expression of these intensities relative to those of a pre-added internal standard of known concentration, usually but not exclusively TSP (which also serves as chemical shift reference), which has an intense nine-proton signal at δ = 0.00 ppm. Following allowance for the number of 1H nuclei giving rise to each resonance integrated, this approach offers the benefit of providing salivary concentrations, or at least 'apparent' 1H NMR-detectable contents of metabolites, i.e., those of the non-protein-bound 'free' biomolecules, in view of possible equilibria with, and low levels of binding to, salivary proteins as noted above in Section 7. However, for HSS samples, such effects appear to be negligible for the majority of metabolites. Such a normalization strategy will account for any systematic error variations in NMR instrument performance, and this process may be conducted with either single or multiple internal standards that have been 'spiked' into the WMS supernatant sampling matrix prior to analysis, or through the employment of pre-selected normalization factors [134].
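The internal standard calculation outlined above amounts to a per-proton area ratio multiplied by the known standard concentration. The short sketch below assumes the nine-proton TSP signal and the TSP level of ca. 250 µmol/L quoted earlier in this section; the function name and the example resonance areas are hypothetical.

```python
def concentration_from_tsp(area_met, n_h_met, area_tsp,
                           c_tsp_umol_l=250.0, n_h_tsp=9):
    """Apparent ('free') metabolite concentration in umol/L.

    Resonance areas are divided by the number of contributing 1H nuclei
    before the ratio to the TSP signal is taken.
    """
    return c_tsp_umol_l * (area_met / n_h_met) / (area_tsp / n_h_tsp)

# Hypothetical example: a three-proton methyl singlet whose integrated
# area is twice that of the TSP signal.
conc = concentration_from_tsp(area_met=2.0, n_h_met=3, area_tsp=1.0)
# conc is 1500 umol/L, i.e. six times the TSP concentration.
```

The value obtained represents only the 1H NMR-detectable, non-macromolecule-bound fraction of the metabolite concerned.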
However, a now very commonly employed normalization process is to express bucketed resonance intensities as proportions or percentages of the total intensity of all spectral features considered. This allows for the exclusion of spectral regions that represent noise only, plus those that should be removed because of their potential interfering and confounding nature, for example, the residual H2O/HOD signals in salivary 1H NMR profiles, along with those of any drugs and their metabolites, and additional, perhaps unexpected exogenous agents such as ethanol arising from alcoholic beverage consumption contrary to the instructions specified in the research ethics participant information sheet. This sample row-wise CSN approach may be performed either with or without the above TSP-normalization approach, but it is particularly valuable for NMR-based metabolomics datasets where the addition of internal standard(s) has been neglected, or in situations where it was not possible to add one because of major protein- or other biomacromolecule-binding phenomena, such as those observed in high protein content blood plasma and serum; however, this is clearly not the case for WMS supernatant samples in view of their very low protein content [130]. Nevertheless, CSN may generate complications if quite a large number of 1H NMR-detectable biomolecule concentrations are simultaneously and significantly up- or downregulated as a consequence of the chemopathological processes involved in selected diseases, or as an outcome of the administration or intake of a specific drug or toxin.
In such cases, expression of the resonance intensities of the metabolites involved relative to that of a total spectral profile may be problematical, i.e., if it transpires that their concentrations are all elevated n-fold, or by similar magnitudes, relative to that of the added internal standard, the CSN procedure will not recognize this unless there are at least some resonances that have not been up- or downregulated in response to a disease process or stimulus. For example, in cases of dehydration, or other conditions involving diminishing salivary flow-rates, such as that experienced in Sjögren's syndrome, all or virtually all salivary metabolites will be increased in concentration in this biofluid, and obviously CSN will not readily recognize this outcome. However, as surmised above, it does facilitate distinction between biomolecules that have proportionately increased in concentration because of dehydration, or those that are inversely related to SFR, and 'real' biomarkers responding to the effects induced by a particular disease process. Indeed, in such cases, disease-induced changes in the salivary levels of genuine biomarkers, either up- or downregulations, would be expected to differ significantly from non-etiological modifications in salivary volume. Hence, CSN is sometimes particularly applicable to selected salivary metabolomics analysis experiments, since it at least partially circumvents the requirement to correct salivary metabolite concentration data for variable SFRs, or indeed to explore their relationships with this parameter, assuming that it can be accurately determined.
An additional normalization approach is probabilistic quotient normalization (PQN). PQN transforms 1H NMR profiles arising in metabolomics experiments in accordance with an overall estimate of the most likely 'dilution' effect [37]. This normalization approach usually involves calculation of the quotients of each sample's bucketed intensities relative to those of a mean or median column-bucketed reference spectral profile, followed by division of the sample profile by the median of these quotients. The reference profile may be derived from either all study samples or a sub-set of them, usually those of the healthy control group only; for example, it may represent the overall average spectrum of the control group, and all samples are then normalized to this irrespective of the group to which they belong, i.e., disease or control. It may also be a pre-established spectrum obtained from a reputable reference database, a single study one, or a reference model spectrum of all biomolecules detectable and present at relevant, recommended concentrations in aqueous solution at the known average biofluid pH, ionic strength (I), and viscosity parameters. Selection of the mean or median values of predictor (metabolite) variables in the reference spectrum generally makes little difference, although use of a median spectrum appears to provide the most robust reference profile for experiments involving only small numbers of participants. It is also recommended that CSN is performed before PQN in order to first scale the spectral profiles to the same absolute magnitude. Reportedly, the PQN algorithm is significantly more accurate and robust than both the CSN and vector length normalization strategies [37].
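A minimal sketch of PQN in its widely used quotient-based form is given below: after a preliminary CSN, each sample is divided by the median of its bucket-wise quotients against the reference profile. The function name and the demonstration matrix are hypothetical, the reference is assumed to be strictly positive and on the same scale as the CSN-normalized data, and no claim is made that this reproduces the exact implementation of [37].

```python
import numpy as np

def pqn(X, reference=None, total=100.0):
    """Quotient-based normalization of a (samples x buckets) matrix.

    Returns the normalized matrix and the estimated dilution factors.
    """
    X = np.asarray(X, dtype=float)
    X = total * X / X.sum(axis=1, keepdims=True)      # preliminary CSN
    if reference is None:
        reference = np.median(X, axis=0)              # median spectrum
    quotients = X / np.asarray(reference, dtype=float)
    dilution = np.median(quotients, axis=1, keepdims=True)
    return X / dilution, dilution.ravel()

# Hypothetical control profile (sums to 100) and three samples: the
# control itself, a twofold-diluted copy, and one in which only the
# first bucket is elevated fivefold.
base = np.array([10.0, 20.0, 30.0, 25.0, 15.0])
X_demo = np.vstack([base, 0.5 * base, base * [5.0, 1.0, 1.0, 1.0, 1.0]])
X_pqn, dil = pqn(X_demo, reference=base)
```

In the third sample, CSN alone would artificially depress the four unchanged buckets because the single elevated bucket inflates the row total; the median quotient recovers their original values and leaves the fivefold increase in the first bucket intact.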
It is also highly recommended to use the now well-known autoscaling procedure to appropriately scale data. This involves the mean-centering of each metabolite column predictor variable followed by division by its estimated sample standard deviation, so that the mean and variance (and hence also the standard deviation) of each one become 0 and 1, respectively. This process permits all metabolite determinations to be considered on equivalent scales, irrespective of their pre-autoscaling magnitudes; indeed, the 1H NMR spectra of healthy human saliva contain some biomolecule resonances that, between participants, are reproducibly of a much greater intensity than those of others (for example, propionate has a much higher salivary concentration than those of all methylamine species). Furthermore, in this context, any 'between-metabolite' heteroscedasticity (variance heterogeneity) problems with the complete datasets of all included metabolite variables are overcome. However, this scaling process does not protect against 'between-classification-within-metabolite variable' heteroscedasticities, and homogeneity of such variances remains a critical assumption for all parametric statistical tests for comparing mean values [135]. We therefore further recommend that all MV datasets are transformed using logarithmic or generalized logarithmic (glog), square root or cube root, or even inverse transformations. It should also be noted that proportional data such as those obtained in CSN datasets remain non-normally distributed, and in such cases, it is highly recommended that fractions are transformed using logarithmic, glog, logit (i.e., the log of the odds, which is the log value of a proportion p divided by 1 − p), or arcsine transformations [133].
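The transformations named above may be sketched as follows. Several parameterizations of the glog function are in use; the base-2 form shown here, with a tuning constant `lam`, is an assumption made for illustration, as are the function names. The arcsine transformation is given in its usual arcsine-square-root form for proportions.

```python
import numpy as np

def glog(x, lam=1.0):
    """One common generalized log: log2((x + sqrt(x^2 + lam)) / 2).

    Behaves like log2(x) for large x but remains finite at x = 0.
    """
    x = np.asarray(x, dtype=float)
    return np.log2((x + np.sqrt(x**2 + lam)) / 2.0)

def logit(p):
    """Log of the odds p / (1 - p), for proportions strictly in (0, 1)."""
    p = np.asarray(p, dtype=float)
    return np.log(p / (1.0 - p))

def arcsine_sqrt(p):
    """Arcsine-square-root transformation for proportions in [0, 1]."""
    p = np.asarray(p, dtype=float)
    return np.arcsin(np.sqrt(p))

# CSN percentages should be converted to fractions before logit/arcsine.
fractions = np.array([10.0, 30.0, 60.0]) / 100.0
transformed = logit(fractions)
```

Unlike the ordinary logarithm, this glog form accepts zero-valued buckets, which commonly arise for low-level metabolites in bucketed spectra.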
Moreover, the autoscaling procedure is unable to protect against outlying metabolite, potential predictor variable data-points. Hence, subsequent to scaling to mean zero and unit variance, such outliers remain for the performance of MV analysis thereafter, which leads to complications with the assumption that individual concentration data-points for metabolites are sampled from symmetrical normal distributions of such levels, or those of directly proportional spectroscopic measures. An additional problem with autoscaling is the highly adverse inflation of measurement errors.
Fortunately, Pareto-scaling appears to provide a solution to problems associated with autoscaling; this strategy, which involves mean-centering followed by division by the square root of the metabolite variable sample standard deviation, yields scaled variables that lie somewhere between those with no scaling applied and those that are autoscaled. The resulting variances, although somewhat variable between different metabolite columns, are all very similar and a little greater than unity, unlike the unit variances arising from the autoscaling process. The objectives of this scaling technique are again to reduce the relative weighting of variables with higher metabolite concentrations, but with a partial retention of the structural characteristics of the original dataset. However, one major disadvantage of this form of scaling is that it is highly sensitive to large fold-changes between sampling classifications.
Alternative scaling processes quite commonly used by researchers include range scaling (mean-centering followed by division by the range of the metabolite variable dataset), although, as we might expect, this method is highly sensitive to outlying data-points, and therefore it is essential to remove them prior to its application; VAST scaling, which involves calculation of the product of the autoscaled data-point and the metabolite variable's mean value divided by its standard deviation, an approach targeted at small metabolic variable fluctuations (although it remains inappropriate in cases featuring markedly induced metabolite concentration variations that appear to lack an overall group structure); and level scaling, which focuses on relative variable responses, and which can be particularly valuable for specific biomarker authentication (although there are again complications regarding the deleterious inflation of measurement errors).
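The column-wise scaling methods discussed in this section can be compared directly in a few lines. The sketch below follows the definitions given in the text for autoscaling, Pareto, range, and VAST scaling; the definition used for level scaling (mean-centering followed by division by the column mean) is the commonly quoted one and is an assumption here, since it is not spelled out above. The function name and demonstration matrix are hypothetical.

```python
import numpy as np

def scale(X, method="pareto"):
    """Column-wise scaling of a (samples x metabolite variables) matrix."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)            # sample standard deviation
    centered = X - mean
    if method == "auto":
        return centered / sd
    if method == "pareto":
        return centered / np.sqrt(sd)
    if method == "range":
        return centered / (X.max(axis=0) - X.min(axis=0))
    if method == "vast":
        return (centered / sd) * (mean / sd)
    if method == "level":                 # assumed definition, see lead-in
        return centered / mean
    raise ValueError(f"unknown scaling method: {method}")

# Two hypothetical variables differing 100-fold in magnitude.
X_demo = np.array([[1.0, 100.0],
                   [2.0, 200.0],
                   [3.0, 300.0]])
auto = scale(X_demo, "auto")
pareto = scale(X_demo, "pareto")
```

After autoscaling, both columns have unit variance, whereas after Pareto-scaling the variance of each column equals its original standard deviation (1 and 100 in this example), so that differences in magnitude are reduced but not abolished.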
Finally, sometimes it may be considered desirable to reduce the total number of available metabolite variables from datasets in order to optimize the achievement of uncorrelated metabolite variables. This may be performed through the preliminary selection of variables through evaluations of the UV statistical significance of all those available using parametric t-tests, ANOVA, ANCOVA, or even MV analysis-of-variance (MANOVA) model systems, etc., this in addition to the isolation of linear combinations of such variables as separate PCs in PCA protocols. The bottom line is that most fully established metabolomics researchers prefer to perform a combination of both UV and MV analyses on datasets acquired, largely because selected metabolite variables may be univariately significant but not multivariately so (a common observation that may arise from it acting as a singular, orthogonal biomarker that remains unrelated to other possible ones), or vice-versa, which may occur in situations where, although a metabolite may not prove to serve as a statistically significant biomarker in a UV context, it remains strongly correlated to other biomolecules that together form a strong MV pattern or orthogonal component that is of value for distinguishing between disease and healthy control patient cohorts, for example. Although of much interest, these arguments lie beyond the scope of this paper, and therefore readers are referred to [135,136] for further information.
Furthermore, the use of evolutionary algorithms, for example genetic algorithms/genetic programming, is also a valuable approach. Indeed, such evolutionary algorithms are generally performed in combination with a second MV analysis algorithmic technique (e.g., PLS-DA) that seeks and explores composites of metabolite variables that reveal the highest level of efficacy achieved with the secondary algorithm and that are arbitrated by the principles of evolution and species selection exercises [137].
With regard to the above approaches for the preliminary processing of complex datasets prior to the application of MV statistical analysis techniques, Figure 7 shows the effects of the sequential row-wise normalization, glog-transformation, and column-wise Pareto scaling processes on the means, variances, and kernel densities of a two-classification human WMS supernatant 1H NMR dataset; also shown are variable box-plots for 50 of the most important ISB variables. The WMS samples for this analysis were donated by n = 48 participants with an acute sore-throat condition, and an equivalent number of those without this disorder as healthy controls. Moreover, five daily samples were sequentially collected from each subject over a period of five consecutive days. Full details and examples of corresponding 600 MHz WMS supernatant 1H NMR profiles for this investigation are presented as a case study in Part II of this report.
From these processing experiments, it can be observed that glog-transformation serves to homogenize variable variances somewhat, although the overall kernel density plot revealed a trimodal distribution of variables remaining. Moreover, subsequent application of the Pareto-scaling approach not only further markedly homogenizes previously heteroscedastic variances but also renders the kernel density plot virtually unimodal and symmetrical. Therefore, we thoroughly recommend that all these preprocessing methods are applied prior to the performance of MV statistical analysis, although it should be noted that alternative normalization (e.g., PQN, median-, or range-normalization, etc.), transformation (e.g., cubed root, inverse, or arcsine transformations, the latter being suitable for proportion or percentage data, etc.) or scaling (e.g., autoscaling, etc.) methods may also be employed where considered relevant or where fully justified.
The influence of different row-wise normalization approaches was also explored as part of a preliminary investigation for this study. Table 3 shows the results arising from UV t tests performed on datasets analyzed without and with PQN and with CSN normalization methods applied to the full dataset, both with and without FDR correction, which is obviously recommended for the performance of such parametric UV tests on complex MV datasets such as that here, with a total of 209 separate 1H NMR ISB variables. Clearly, application of the glog transformation and Pareto-scaling served to 'sharpen' the significance test applied via a homogenization of ISB variances, the improved consistencies of these with their selection from normal sampling distributions, and also the assurance of additive responses. Furthermore, application of either the PQN or CSN row-wise normalization methods served to increase the number of statistically significant variables.
Indeed, without the application of any row-wise normalization methods, the variances (scatter) of these 1H NMR ISB variables remained highly variable, i.e., heteroscedastic, and the overall distribution of these values appeared to be bimodal, even with glog-transformation and Pareto-scaling.
Table 3. Numbers of significant 1H NMR ISB variables for UV two-sample t tests performed, both with and without false discovery rate (FDR) correction, on a total of 209 ISB variables from the sore-throat study dataset with different row-wise normalization, transformation, and column-wise scaling methods applied. Row-wise normalization methods used were either probabilistic quotient normalization (PQN) to an overall average healthy control group spectrum, or CSN. Where relevant, the transformation applied was a generalized logarithmic (glog) one, and the scaling strategy involved was Pareto-scaling. Up- and downregulations in ISB variables are those expressed with respect to the participants in the acute sore-throat condition group, i.e., a significant upregulation refers to a significantly higher mean ISB intensity value (and hence assigned metabolite salivary concentration) over that corresponding to the healthy control group, and vice-versa for downregulations.
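Counting significant variables with and without FDR correction, as summarized in Table 3, can be sketched as follows. The particular FDR procedure used for Table 3 is not stated here, and therefore the Benjamini-Hochberg step-up adjustment shown by default is an assumption; the more conservative Benjamini-Yekutieli variant is included as an option. The function name and the four demonstration p-values are hypothetical; in practice, the p-values for all 209 ISB variables could be obtained from a two-sample t test routine such as `scipy.stats.ttest_ind`.

```python
import numpy as np

def fdr_adjust(pvals, method="bh"):
    """FDR-adjusted p-values (q-values) by a step-up procedure.

    method="bh": Benjamini-Hochberg; method="by": Benjamini-Yekutieli.
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    ranks = np.arange(1, m + 1)
    factor = m / ranks
    if method == "by":
        factor = factor * np.sum(1.0 / ranks)   # dependency correction
    elif method != "bh":
        raise ValueError("method must be 'bh' or 'by'")
    order = np.argsort(p)
    adj = p[order] * factor
    # Enforce monotonicity, working down from the largest p-value.
    adj = np.minimum.accumulate(adj[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(adj, 0.0, 1.0)
    return q

p_demo = np.array([0.01, 0.04, 0.03, 0.005])
q_bh = fdr_adjust(p_demo)
q_by = fdr_adjust(p_demo, method="by")
n_sig_raw = int((p_demo < 0.05).sum())
n_sig_bh = int((q_bh < 0.05).sum())
n_sig_by = int((q_by < 0.05).sum())
```

For these four illustrative p-values, all remain significant at the 0.05 level after the Benjamini-Hochberg adjustment, whereas only two survive the Benjamini-Yekutieli adjustment, which shows how the choice of FDR procedure can alter the counts reported in tables of this kind.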

Sample Size Considerations and MV Power Calculations
In this section, a brief summary of power analysis methods available for estimating experimental sample sizes for MV NMR-based metabolomics experiments is provided. Power analysis strategies for UV statistical analysis have been long established, with many possible applications now available [135]. However, methods available for the experimental design of large or very large MV analysis-based models largely comprise quite recent developments [138]. Those now available include Data-driven Sample Size Determination (DSD) [139], which features an algorithm for estimating the minimum sample size (n) required to detect a minimum of one statistically significant 'predictor' variable, or alternatively, a maximum number of these in order to optimize a model system. Indeed, the algorithm applied, known as the statistical recoupling of variables (SRV) algorithm, primarily permits data reduction before analysis and is then centered on metabolically significant analytes, and this is followed up via the computation of inverse cumulative density probabilities (ICDP). Subsequently, kernel density estimate functions or log-normal distribution fittings are applied to all ICDPs, and expanded datasets are then simulated at differing sample sizes. For this purpose, random numbers are drawn within the [0, 1] interval, and input quantiles of the ICDPs are determined. An analysis of variance (ANOVA) approach, coupled with FDR corrections applied via the Benjamini-Yekutieli method, is then utilized to determine statistical significance. This DSD method then generates a table with mean and SD values of the numbers of statistically significant variables for each dataset size considered.
An alternative approach is the MetSizeR model [140], which is based on PCA. This strategy does not require the application of a data reduction step, whereas DSD employs an OPLS-DA model, and therefore SRV values are tested via this route. Notwithstanding, MetSizeR is essentially a targeted approach, since the maximum number of variables that it can accommodate is 375, although fortunately this is usually more than sufficient for many bucketed 1H NMR datasets; the DSD approach, by contrast, accommodates larger datasets. Standard parameters for this method are specified in [140], and an example featured in that report explored a 1H NMR-based rat urinary dataset (n = 36) split into two groups, with samples longitudinally collected throughout a 15-day duration. Application of the above two power analysis methods successfully confirmed that ≤30 samples were required as optimal sample sizes in future datasets acquired on this experimental system. The SSPA package from the Bioconductor R project also serves as a valuable methodology [141,142].
Readers are referred to [135] for an outline summary of data processing, modeling and power calculations using both UV and MV analysis strategies available for the evaluation and interpretation of 'state-of-the-art' metabolomics datasets, 1 H NMR-based or otherwise.

Significance of between-Participant Sources of Metabolite Level Variance in Healthy Humans: Salivary Phenotype Analysis
In 2016, Wallner-Liebmann et al. [30] reported on the abilities of different saliva sample donors to generate characteristic, participant-dependent salivary metabolic signatures. For this purpose, they employed 1H NMR-based metabolomics analysis to investigate saliva specimens collected from a total of 23 healthy volunteers, with ca. 6 samples donated daily from each of these for a 10 consecutive day duration (seven days under what was described as 'real-life' circumstances and three whilst undergoing a standardized diet regimen, together with a physical exercise program at Day 10). Although the authors claim that this was the very first demonstration of nascent individual salivary metabolic phenotypes in humans, the roots of this study can be found in [3], since, in that investigation, our group demonstrated that there was indeed a significant 'between-participant' component of variance (which was tested against a 'within-participant' one) in 9 out of 11 salivary metabolites quantified and monitored. Notwithstanding, in [30], the researchers involved found that this individual salivary phenotype was not quite as strong as that found for urine but was affected less by diet. Hence, a further concern of ours is that if saliva specimens are collected within an insufficient time following meal consumption (from Figure 2, we recommend an absolute minimum of 4 h, but a period of 8 or even 12 h is certainly preferable), the 1H NMR profiles of human saliva are contaminated with large numbers of exogenous resonances from food-borne agents, particularly carbohydrates, organic acid anions, and lipids, etc. Indeed, if samples are collected shortly after meals, including those donated only 1.0 h thereafter, then it is often impossible or virtually impossible to view many of the expected endogenous metabolite signals in such spectra.
Moreover, dietary carbohydrates such as glucose or sucrose immediately trigger the salivary generation of a series of lower molecular mass catabolites, and in the current study we found substantial changes in the levels of endogenous, non-food nutrient-sourced metabolites, notably lactate, pyruvate, and succinate, amongst those of others (Figure 2), which arise from the fermentation of dietary carbohydrates. As expected, oral microbiome responses to fermentable carbohydrate exposure are reflected through modifications of the salivary metabolome, including saccharolytic bacterial-mediated rises in acetate, propionate, and n-butyrate levels, for example. Although known for some time [143], this phenomenon has been monitored via metabolomics techniques in dental plaque [144,145], and sucrose-induced challenges observed therein were associated with those occurring in saliva. Moreover, the salivary biomolecules acetoin, alanine, lactate, pyruvate, and succinate are also upregulated following sucrose exposure [20], and results obtained from the carbohydrate-rich food ingestion experiment reported in Section 6.2 are indeed fully consistent with this observation (Figure 2). Acetoin acts as an external energy store for selected fermentative bacteria, and therefore its determination in saliva may yield some valuable information on the activities of such species. However, this agent is commonly employed as a food flavoring agent and a fragrance, and is also present in a variety of foods, e.g., blackberries, peppers, butter, and cocoa products, along with beer and wine.
The relative contributions of the oral microbiome towards the makeup of the human salivary metabolome and the biomolecular composition of WMS have been a subject of conjecture for many years [146]. However, recently some light has been shed on this subject through comparative evaluations of the biomolecular status of this biofluid with that of sterile parotid gland (SPG) saliva [20,23,147]. Indeed, these studies have revealed that SPG fluid, before entering the oral cavity, lacks a wide range of metabolites present in WMS, and these include bacterially derived organic acid anions, e.g., acetate, propionate, succinate, and n- and iso-butyrates, and methylamines. Bacterial catabolites such as propionate and n- and iso-butyrates are predominantly non-NMR-detectable in human plasma, although these metabolites are sometimes visible to us as low-intensity 1H NMR resonances in spectra of human urine.
The investigation reported by Gardner et al. [20] simultaneously explored matched WMS, parotid saliva (PS), and blood plasma in an attempt to differentiate patterns of metabolites arising from bacterial loads from those arising endogenously. This study found that the bacterial load of WMS was strongly positively correlated with abundant short-chain organic acid anion concentrations therein, along with malodorous amine and some amino acid levels. However, the results acquired also indicated that citrate, lactate, and urea entered WMS through PS and the systemic circulation. Curiously, WMS concentrations of urea were found to be inversely correlated with both saccharolytic and proteolytic sources of bacterial load. Hence, in principle, such NMR-based salivary metabolomics datasets can be employed to determine the bacterial load of WMS samples collected from dental patients. It is also likely that an MV consideration of correlations between organic acid anion (including lactate and citrate), amine, amino acid, and urea concentrations in WMS may serve to provide valuable microbially relevant data, which in turn may lead to reliable diagnostic and perhaps prognostic stratification information for oral diseases and also possibly for dysbiosis-related conditions, for which such metabolomics datasets may complement those acquired on gastrointestinal fluids.

Case Study 2: Salivary Phenotype Analysis, and Identification of Differential Metabolite Pools within WMS Supernatants and Their Sources
Therefore, in this section, we have conducted a case study to (1) further explore between- and within-participant components of variance amongst the salivary 1H NMR metabolic profiles of a large number of healthy control sample donors, and (2) perform MV factor analysis on these data in order to isolate orthogonal (uncorrelated) components that arise from differential sources and/or pathways, with a particular focus on those arising from host and oral microbiome origins.

Univariate Experimental Design Model for Isolating between- and within-Participant Components of Variance for WMS Supernatant Metabolites in a Healthy Control Population
In order to investigate both 'between-participants' and 'between-sampling days-within-participants' contributions to the variances of 1H NMR-determined salivary metabolite concentrations, we set up a model that involved the collection of daily, early morning saliva samples from a total of n = 48 healthy control participants on five separate sampling days each, an analysis model similar to that reported in [3]. This model had an experimental design described by Equation (3), in which $\sigma_i^2$ represents the 'between-participants' random effects variance component, $\sigma_{(i)j}^2$ the 'between-sampling days-within-participants' component (i.e., sampling days were 'nested' within participants), and $\sigma_{ijk}^2$ the fundamental or unexplained (residual) error variance. The basal metabolite level in the absence of these two sources of variation is denoted by $\mu$, and $y_{ijk}$ is the actual level observed for each metabolite.
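A nested random-effects model consistent with these definitions can be sketched as follows. The effect symbols $P_i$ (participant) and $D_{(i)j}$ (sampling day nested within participant), together with the normality and independence assumptions, are illustrative notation and are not a transcription of Equation (3).

```latex
y_{ijk} = \mu + P_i + D_{(i)j} + e_{ijk},
\qquad
P_i \sim N\!\left(0, \sigma_i^2\right),\quad
D_{(i)j} \sim N\!\left(0, \sigma_{(i)j}^2\right),\quad
e_{ijk} \sim N\!\left(0, \sigma_{ijk}^2\right)

% Under mutual independence of the three random terms (an assumption),
% the total variance of an observation decomposes additively:
\operatorname{Var}\!\left(y_{ijk}\right) = \sigma_i^2 + \sigma_{(i)j}^2 + \sigma_{ijk}^2
```

Under this form, a large $\sigma_i^2$ relative to $\sigma_{(i)j}^2$ would correspond to a metabolite whose level is characteristic of the donor rather than of the sampling day.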
Results from this UV ANOVA-based analysis for all identified salivary metabolites are shown in Table 4. Across the 240 1H NMR profiles explored in this investigation, we found that the 'between-participant' source of variation ($\sigma_i^2$) was very highly significant for all 31 metabolites evaluated, as might be expected, whereas that for 'between-sampling days-within-participants' ($\sigma_{(i)j}^2$) was only significant for nine of these, i.e., propionate, acetate, pyruvate, glutamine, succinate, and phenylalanine, together with the three N-acetylated biomolecule species signals detectable (Table 4). Therefore, these data provide valuable evidence for significant 'between-participant' heterogeneities for all salivary metabolites monitored, and this supports reports available on differential human salivary phenotypes. Figure 8 shows plots of the observed salivary levels of n-butyrate, propionate, and urea versus those predicted from the mathematical experimental design model shown in Equation (3). These plots show that, although there were high levels of predictability for the models involving both organic acid anion metabolites, that for urea was poor, and this indicates high values of error variance ($\sigma_{ijk}^2$) for this apparent host-derived metabolite, and hence poor levels of predictability for it from this model (Table 4 shows that $Q^2$ was zero for it). This study also indicates that some metabolites display a significant salivary metabolite variation 'between-days-within-humans', for example n-butyrate, propionate, and acetate, an observation we might expect if these particular metabolites arise from a salivary microbiome source with differing daily levels of microbial infiltration and preponderance.
Table 4. Statistical significance of the 'between-participants' and 'between-sampling days-within-participants' ($\sigma_i^2$ and $\sigma_{(i)j}^2$, respectively) components of variance in a random effects ANOVA model to determine their contributions towards the WMS supernatant concentrations of a total of 31 1H NMR-assigned metabolite variables. All TSP-normalized datasets were glog-transformed and Pareto-scaled prior to analysis. Also shown are the $R^2$ and $Q^2$ values for each of these UV models, along with their $\sigma_i^2/\sigma_{(i)j}^2$ ratios. * Indicates that values were rounded to the nearest integer.
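The glog transformation and Pareto scaling steps named in the Table 4 caption can be illustrated with a minimal Python sketch. The logarithm base, the form of the glog function, and the offset parameter `lam` are assumptions chosen for illustration (one common variant of the generalized logarithm); the text does not state which software or parameter values were used.

```python
import math
import statistics


def glog(x, lam):
    """Generalized log (one common form, assumed here): log2((x + sqrt(x^2 + lam)) / 2).

    For x much larger than sqrt(lam) this approaches log2(x); the offset lam
    keeps the transform finite for zero or near-zero intensities.
    """
    return math.log2((x + math.sqrt(x * x + lam)) / 2.0)


def pareto_scale(column):
    """Mean-center a variable and divide by the square root of its standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)  # sample standard deviation
    if sd == 0:
        return [0.0 for _ in column]  # a constant variable carries no variance
    root_sd = math.sqrt(sd)
    return [(value - mean) / root_sd for value in column]


def preprocess(column, lam):
    """Apply glog followed by Pareto scaling to one metabolite variable."""
    return pareto_scale([glog(value, lam) for value in column])


if __name__ == "__main__":
    # Hypothetical TSP-normalized intensities for a single metabolite.
    intensities = [0.12, 0.45, 0.08, 1.30, 0.60]
    print(preprocess(intensities, lam=1e-4))
```

Pareto scaling is a compromise between no scaling and unit-variance scaling: it reduces the dominance of intense resonances without inflating baseline noise to the same weight as genuine signals.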

Multivariate Metabolomics Analysis of Distinctive Salivary Phenotypes
In view of these observations, we further extended these investigations to include MV metabolomics analysis approaches. Therefore, here for the first time, we also preliminarily report a sizeable MV salivary phenotyping study involving the above dataset, with n = 48 healthy human participants donating WMS supernatant specimens on five sequential sampling days according to our early morning awakening collection regimen (Section 6). Figure 9 shows (a) the orthogonal T score [1] versus T score [1] plot for an OPLS-DA model performed on only n = 6 randomly selected participants, and (b) the 3D Component 3 vs. Component 2 vs. Component 1 scores plot resulting from a PLS-DA strategy, which sought to distinguish between and provide phenotypic information on the study participants. Although there remains much overlap between the 48 groups of five daily samples collected from each participant, there is clearly some evidence available that the WMS sample metabolite patterns of at least some sample donors were reproducibly distinctive from those of others, as displayed in the 2D plot shown in Figure 9a. The PLS-DA model applied to all the healthy control participants also provided at least some evidence for the distinctiveness of participant clusters (Figure 9b), for example, that for the cyan color-coded one in the center of the 3D plot, and that for the dark blue color-coded one with high Component 2 and 3 scores but low Component 1 scores.

Investigation of Distinctive Metabolite Network Pools Putatively Arising from Host or Salivary Microbiome Sources: Factor Analysis of Component Loading Vectors and Patterns
For this very large healthy control-only participant dataset, we also elected to explore the loadings vectors (correlations) of individual metabolite variables on (with) each PC/factor isolated by a factor analysis model. For this purpose, we employed Varimax rotation and Kaiser normalization strategies, and the best model generated was that containing a total of 4 PCs. Table 5 lists these loadings vectors for the total of 31 1H NMR-assigned and determined biomolecule variables, along with ranges of values for their mean salivary and blood plasma concentrations referenced from the Human Metabolome Database (HMDB) [47] and further sources.
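For readers unfamiliar with the rotation step, the following is a minimal pure-Python sketch of Varimax rotation with Kaiser normalization, using Kaiser's classical pairwise planar rotations. It is a generic illustration and not the software used in this study; the function name, tolerance, and the toy loadings matrix are all hypothetical.

```python
import math


def varimax(loadings, normalize=True, tol=1e-10, max_sweeps=100):
    """Rotate a loadings matrix (list of rows) by pairwise varimax rotations."""
    n = len(loadings)
    k = len(loadings[0])
    rotated = [list(row) for row in loadings]
    # Kaiser normalization: scale each variable (row) to unit communality.
    h = [math.sqrt(sum(v * v for v in row)) or 1.0 for row in rotated]
    if normalize:
        rotated = [[v / h[i] for v in row] for i, row in enumerate(rotated)]
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for p in range(k - 1):
            for q in range(p + 1, k):
                u = [r[p] ** 2 - r[q] ** 2 for r in rotated]
                v = [2.0 * r[p] * r[q] for r in rotated]
                a, b = sum(u), sum(v)
                c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                d = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
                # Angle maximizing the varimax criterion for this pair of factors.
                phi = 0.25 * math.atan2(d - 2.0 * a * b / n, c - (a * a - b * b) / n)
                cos_phi, sin_phi = math.cos(phi), math.sin(phi)
                for r in rotated:
                    x, y = r[p], r[q]
                    r[p] = x * cos_phi + y * sin_phi
                    r[q] = -x * sin_phi + y * cos_phi
                largest_angle = max(largest_angle, abs(phi))
        if largest_angle < tol:
            break
    if normalize:
        # Undo the Kaiser normalization so that communalities are restored.
        rotated = [[v * h[i] for v in row] for i, row in enumerate(rotated)]
    return rotated


if __name__ == "__main__":
    # Toy example: two clean metabolite pools whose loadings were mixed by a 30 degree rotation.
    simple = [[0.8, 0.0], [0.7, 0.0], [0.9, 0.0], [0.0, 0.8], [0.0, 0.7], [0.0, 0.9]]
    angle = math.radians(30.0)
    mixed = [[x * math.cos(angle) - y * math.sin(angle),
              x * math.sin(angle) + y * math.cos(angle)] for x, y in simple]
    for row in varimax(mixed):
        print([round(value, 3) for value in row])
```

Because the rotation is orthogonal, each variable's communality (the sum of its squared loadings) is unchanged; only the distribution of that variance across factors is altered so that each variable loads strongly on as few factors as possible.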
Primarily, a comparison of already established mean salivary concentrations with those of plasma demonstrates that the concentrations of n-butyrate, iso-butyrate, propionate, acetate, 5-aminovalerate, pyruvate, succinate, methylamine, and trimethylamine are much or even substantially higher in the former biofluid than they are in the latter, and this is more than sufficiently convincing to confirm that all these metabolites have a predominant oral or oral microbiome source and that any supply from the host system via plasma and parotid saliva can be considered to be minimal. Since lactate and formate have higher salivary levels than those of plasma, it also appears that the presence of larger amounts of these biomolecules in the former biofluid is at least partially ascribable to an oral or oral microbiome origin, although, as noted above, it has been previously suggested that lactate is derived from a host source [20]. However, the factor analysis model applied below indicates that this metabolite may arise from both sources.
The co-loading of metabolite variables on two or even more PCs, i.e., perhaps a strong loading on one PC but a somewhat weaker one on another, is not at all uncommon in metabolomics experiments in view of probable heterogeneities in their generating sources and/or the pathways involved in their production, i.e., for human WMS, either host or microbial ones, or the relative prevalence of bacteria with saccharolytic and/or proteolytic activities within the oral environment. However, from these values, it should be noted that PC1 is mainly strongly loaded with metabolite variables that certainly appear to arise from a host (parotid salivary/blood plasma) source, whereas those robustly loading on PC2 arise exclusively from the salivary microbiome. Indeed, for PC2, selected bacterially derived organic acid anions such as n-caproate, n- and iso-butyrates, propionate, and 5-aminovalerate, etc., are only detectable in the 1H NMR profiles of human saliva and not in those of blood plasma or parotid saliva (although trace levels of these may be detectable therein using more highly sensitive analytical techniques). For example, levels of propionate and n-butyrate in human saliva range from 6-69 and 0-2.9 mmol./L, respectively [3], whereas blood plasma levels of these are very much lower (mean levels in adults from reported studies were found to be only 0.9 and 1.0 µmol./L, respectively) [47]. Although host metabolic pathways also feature lactate, pyruvate, and succinate, they also load significantly upon the microbiome-sourced PC2 since they are also well-known microbial catabolites. Although acetate is certainly 1H NMR-detectable in blood plasma, its concentrations therein in healthy controls are very much lower than those found in WMS supernatants (Table 5), and therefore this metabolite appears to be predominantly derived from the oral microbiome (PC2) in this study, as might be expected.
Since the amino acids alanine and phenylalanine also load strongly on PC2, it appears that such contributions from these species are derived from bacteria with proteolytic activities.
Interestingly, although TMA very strongly loaded on PC2, an observation indicating its microbial source, MA and DMA loaded less and insignificantly so. However, all these amines contributed towards PC3 very significantly, along with the amino acid valine, and therefore it appears that these PC3-loading metabolites all arise from a further, albeit orthogonal, source or pathway, which is not simply explicable. However, choline, which is required for the biosynthesis of TMA, loaded marginally but significantly on both PCs 2 and 3. Both MA and DMA are degradation products of TMA and hence may also originate from a choline source. Upregulated salivary choline possibly arises from the diminished clearance of shed buccal epithelial cells throughout the course of sleep periods [33]. Notably, PC2-loading acetate is also derived from choline metabolism [33], and TMA may also be generated from dietary carnitine [34]; additional precursors for it are phosphatidylcholine and betaine.
Notwithstanding, since urea is only relatively weakly loaded onto the host source PC1, and less and not significantly so on PCs 2, 3, and 4, its source remains a subject of conjecture. Gardner et al. [20] found that salivary urea concentrations strongly negatively correlated with both saccharolytic and proteolytic bacterial loads. In the present dataset, although statistically significant in at least some cases, simple linear Pearson correlations between WMS supernatant urea levels and those of all other metabolites identified therein were all low or very low, i.e., values ranging from only −0.09 to 0.39 (Figure 10 and Supplementary Information Section S1), and this suggests that it may indeed arise from an independent source or pathway; nevertheless, some of its stronger, positive correlations were those found with putatively host-sourced amino acids, including tyrosine.
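The kind of screening described here, in which one metabolite is correlated against all others, can be sketched in a few lines of Python. The metabolite names are taken from the text, but the numerical values in the example table are invented for illustration and do not come from this study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def correlations_with(target, table):
    """Correlate one named variable against every other column of a {name: values} table."""
    reference = table[target]
    return {name: pearson_r(reference, values)
            for name, values in table.items() if name != target}


if __name__ == "__main__":
    # Hypothetical preprocessed levels for four metabolites in six samples.
    table = {
        "urea": [0.2, -0.1, 0.4, -0.3, 0.1, -0.2],
        "tyrosine": [0.3, 0.0, 0.2, -0.4, 0.3, -0.1],
        "propionate": [-0.5, 0.6, 0.1, 0.2, -0.3, 0.4],
        "acetate": [-0.4, 0.5, 0.2, 0.1, -0.2, 0.3],
    }
    for name, r in sorted(correlations_with("urea", table).items()):
        print(f"urea vs {name}: r = {r:.2f}")
```

A metabolite whose coefficients are uniformly small against every other variable, as reported for urea, will not attach strongly to any shared factor or cluster.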
The correlation heatmap shown in Figure 10 also demonstrates a series of strong agglomerative hierarchical clusterings (AHCs) of metabolites generated on the basis of Pearson correlations between them. Results from this form of AHC analysis were found to be consistent with an alternative AHC approach performed below (Figure 11) and also the factor analysis described here; these are described in Section S2 of the Supplementary Information. The major observations arising from this analysis were (1) the clear distinctiveness of clusters arising from host and oral microbiome sources (the latter incorporating all organic acid anion catabolites arising from the actions of saccharolytic bacteria, along with selected amino acids from proteolytic bacterial metabolism) and (2) the separate clustering of urea alone, an observation confirming its independence of all or virtually all other metabolites. Sub-clusterings of the host- and oral microbiome-derived metabolites are also discussed in Section S2.
Previously, Kopstein and Wrong [148] investigated the origin and fate of salivary urea and ammonia in both healthy and uremic humans, and primarily they found that PDS and WMS had mean urea levels that were 86% and only 31%, respectively, of their corresponding plasma concentrations. However, although there was little or no ammonia found in parotid saliva, this concentration ranged from 0.6-26.0 mmol./kg in WMS, and these values positively correlated with plasma urea levels. Intriguingly, these researchers found that in samples of WMS equilibrated at 37 °C, urea was completely lost within a 290 min duration; correspondingly, the bacterial hydrolysis product ammonia increased in level during the first 100 min of this episode. Although this route mainly accounted for urea loss from WMS, additional experiments suggested that at least some of the ammonia liberated or present therein prior to incubation is lost via buccal mucosal absorption or was passively reabsorbed through the oral mucosa in its unionized form.
However, the ammonia-based acid neutralization properties of the arginine deiminase and urease enzyme systems serve to counter adverse acidity at oral sites of microbial activity. Indeed, in order to overcome issues associated with recurring modifications in nutrient availabilities, pH, and oxygen tensions, bacteria have developed some unique mechanisms for sustenance purposes [149]. In view of the continuous acidification of oral biofilms, in dental caries, there are enhancements in the fractions of organisms that have marked abilities to decrease pH values via glycolytic pathways and to resume carbohydrate metabolism at acute pH values [150]. Notably, a range of species can maintain the property of generating ammonia-mediated alkaline conditions from salivary biomolecular substrates, and it appears that this process exerts a significant effect on microbial ecology and persistence, along with pH homeostasis [151]. The generation of plaque ammonia proceeds via urea hydrolysis by the enzyme urease or from arginine through arginine deiminase, systems that are viewed to act protectively against the development of dental caries. The cariogenic species Streptococcus mutans [152] has an ammonia-producing agmatine deiminase complex that degrades agmatine to putrescine, ammonia, and CO2 and may represent a suitable target system for anti-caries therapeutic approaches. Moreover, as we might expect, caries-free individuals have been found to display a higher urease- and arginine deiminase system-induced ammonia generating activity, in both saliva and plaque samples, than those with low caries vigor [153]. Therefore, caries-free individuals appear to have relatively greater levels of ammonia generation in their oral environments.
iso-Butyrate loaded strongly on PC4, and Zarling et al. [154] observed that this organic acid anion originated from the bacterial catabolism of a valine substrate in human stool, and along with iso-valerate, it may indicate fundamental protein maldigestion or malabsorption; although this amino acid was found to predominantly load on PC3 arising from the above factor analysis of our salivary metabolite dataset, it secondarily also loaded on PC4, although with a loading vector of only 0.30 (Table 5). However, this at least indicates some correlation between them.
Also loading on PC4 is a combination of histidine and formate, which would be expected to be associated in view of the former serving as a metabolic precursor of the latter [155]. However, serine, glycine, methionine, choline, and methanol can also be processed by selected endogenous mammalian metabolic systems to produce formate [156]. Formate in the oral environment remains of some significance in view of its crucial involvement in one-carbon metabolic biochemistry in vivo [156], as a potential marker of toxicity from the metabolism of salivary methanol via formaldehyde [58], and its markedly elevated salivary levels (range 0.20-61 mmol./L) [3] over those of other biofluids such as blood plasma.
Taurine most notably loads on PC4 very highly, but it also offers a small but significant contribution towards host source-derived PC1. This β-amino acid is found in many tissues at high (usually several mmol./L) levels [157]; however, its presence in human saliva is not simply explicable, although its salivary concentrations are similar to those of blood plasma (Table 5). Taurine is not detectable in glandular saliva [30] but it is found in GCF at high levels, which are likely to arise from immune cell sources since they are able to accumulate this metabolite at concentrations of up to 50 mmol./L [158].
Table 5. Loading vectors of salivary metabolites on (i.e., linear correlations with) components (PCs) 1-4 for a factor analysis model performed with a total of 31 1H NMR-determined analytes. The TSP-normalized dataset was glog-transformed and Pareto-scaled prior to principal factor analysis, and Varimax rotation with Kaiser normalization was then applied for the purpose of further analysis. The significance threshold of component (PC) loadings vectors was set at 0.35 in view of the large sample size of this investigation. Abbreviations: as Figure 10, with na, not available; n/a, not applicable. * Values obtained from the HMDB [47]; ** only one observation available.
Figure 11. [...] which are colored blue, black, green, and violet. The two larger green and violet clusters are further split into two subclusters, and one for the latter contains all bacterial-derived malodorous methylamine species with the amino acid valine (far right-hand side), whilst the other contains all oral microbiome-derived organic acid anions, specifically n-caproate, n-butyrate, iso-butyrate, propionate, 5-aminovalerate, and acetate, plus acetoin and the amino acids alanine, phenylalanine, and tyrosine. TSP-normalized metabolite levels were glog-transformed and Pareto-scaled prior to AHC analysis, which was conducted using Ward's agglomeration method and Euclidean distance as a dissimilarity measure. Abbreviations: As Figure 10.

An AHC model developed for the investigation of predictor metabolite clusterings or pools in this healthy control WMS supernatant dataset was largely very consistent with that obtained from factor analysis loadings vectors (Figure 11). Indeed, four major clusters were found; from right to left, these comprise (1) oral microbiome-sourced biomolecules, specifically all organic acid anions (n-caproate, n-butyrate, iso-butyrate, propionate, acetate, and 5-aminovalerate), malodorous methylamines (MA, DMA, and TMA), 3-D-hydroxybutyrate, acetoin, and thymine, along with the amino acids valine, alanine, phenylalanine, and tyrosine, which potentially arise from the actions of proteolytic bacteria (phenylalanine and tyrosine are presumably sub-clustered together since the latter arises from oxidation of the former), with all malodorous methylamine species sub-clustered with the amino acid valine away from all other bacterially derived catabolites; (2) host-derived metabolites, i.e., the amino acids glycine, taurine, glutamine, and glutamate, all N-acetylated biomolecule derivatives, plus choline, lactate, pyruvate, and succinate; (3) urea alone, in view of its poor loadings on PCs 1-4 above in factor analysis, and poor correlations with virtually all other salivary biomolecules; and (4) histidine and formate, which are independently associated in view of their common metabolic pathway. These results were virtually fully consistent with the factor analysis performed and reported above.
We then sought to examine the factor analysis component (PC) scores values of participant samples collected from this healthy control group in order to determine which pools of metabolites, i.e., host- (PC1-loading) or oral microbiome-derived (PC2-loading) ones, were predominant for each study participant, specifically, which of these two pools had the highest magnitudes. We also aimed to explore the reproducibility of these prevalences between-sampling days-within-participants. For this purpose, we set WMS supernatant sample PC score threshold cut-off values for PC1 and PC2, specifically so that (1) each dominant PC score value should be positive and greater than a minimum value of 0.50, and (2) the value for the higher scoring PC should be significantly greater than that of the other, a difference that was most often found to be >0.80 or so, but sometimes as much as >2.0. In this context, 64 and 54 of the PC1 (presumably host-derived) and PC2 (presumably microbiome-derived) sample scores vectors satisfied these requirements, respectively. Particularly notable was the observation that, for each of the two PCs involved, 'qualifying' samples were clustered together in rows of 3, 4, or even 5, and these corresponded to samples collected from the same participants. Indeed, 13 participant sets of such samples with high scores were found for each of the two PC classifications, and only four participants out of a total of 48 had 'mixed' PC1-/PC2-rich responses, i.e., the above specified requirements for PC1 and PC2 scores values were found to be heterogeneous for these throughout the 5-day sampling period, e.g., PC1 high/higher than PC2 for 2 days, PC2 high/higher than PC1 for the remaining 3 days, etc.
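The scoring rule described above can be expressed as a short Python sketch. The 0.50 minimum score comes from the text; the fixed 0.80 margin, the three-day minimum for assigning a participant, and all function and label names are assumptions introduced for illustration, since the text reports the margin only as what was typically observed.

```python
def classify_sample(pc1, pc2, min_score=0.50, min_margin=0.80):
    """Label one sample as 'PC1', 'PC2', or None from its two factor scores.

    min_score follows the stated cut-off; min_margin is an assumed fixed margin.
    """
    if pc1 > min_score and pc1 - pc2 >= min_margin:
        return "PC1"
    if pc2 > min_score and pc2 - pc1 >= min_margin:
        return "PC2"
    return None


def classify_participant(daily_scores, min_days=3, **thresholds):
    """Summarize a participant's daily (pc1, pc2) score pairs.

    min_days=3 is an assumption mirroring the 'rows of 3, 4, or even 5' description.
    """
    labels = [classify_sample(pc1, pc2, **thresholds) for pc1, pc2 in daily_scores]
    n_host = labels.count("PC1")
    n_microbiome = labels.count("PC2")
    if n_host and n_microbiome:
        return "mixed"
    if n_host >= min_days:
        return "host-dominant"
    if n_microbiome >= min_days:
        return "microbiome-dominant"
    return "unassigned"


if __name__ == "__main__":
    # Hypothetical five-day (PC1, PC2) scores for three participants.
    participants = {
        "P01": [(1.4, 0.1), (1.1, -0.2), (0.9, 0.0), (1.6, 0.3), (0.4, 0.2)],
        "P02": [(-0.3, 1.2), (0.0, 1.5), (-0.1, 0.9), (0.2, 1.1), (-0.4, 2.0)],
        "P03": [(1.3, 0.1), (1.2, 0.0), (-0.2, 1.0), (0.0, 1.4), (0.1, 1.1)],
    }
    for name, scores in participants.items():
        print(name, classify_participant(scores))
```

Making the thresholds explicit parameters allows the sensitivity of the 13/13/4 split to the chosen cut-offs to be checked directly.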
Figure 12 shows the high-field regions of typical 600 MHz 1H NMR profiles acquired on samples that, from the above factor analysis performed, were assigned to participants with PC1 (host)- and PC2 (oral microbiome)-dominant WMS supernatant profiles, and also one with a 'mixed' contribution from both PCs 1 and 2 (PC-contributory metabolites are color-coded). Therefore, assuming that our suggested origins of PCs 1 and 2 are correct, which, judging from the loading contributions of local host- and microbial-sourced metabolites, is indeed the case, then there appears to be evidence that, for this study, 13/48 participants had a salivary 1H NMR profile that was dominated by endogenous (host blood plasma) sources, whereas another 13/48 of these had those dictated by bacterial catabolites. Moreover, <10% of participants (4/48) donated saliva samples that appeared to display a mixed host- and microbiome-derived distribution of metabolites.

Future Perspectives for Clinical Salivary NMR Bioanalysis and Monitoring with Low-Field Benchtop NMR Spectrometers
Unfortunately, access to high-resolution, high-field NMR spectrometer facilities for routine metabolomics screening programs is restricted for many researchers, since both university and commercial facilities containing, or with access to, such unwieldy instruments with large superconducting magnets are somewhat limited, and the expense involved also hampers such studies. However, recently some major developments with low-field (e.g., 60 MHz), non-stationary NMR spectrometers, which do not require such large cryogenic systems, have demonstrated the applicability of this novel technique to the multicomponent analysis of human urine and its value in the diagnosis and monitoring of type 2 diabetes in human participants [159]. Hence, here we also demonstrate the applications of this analytical strategy to determine the metabolic status of HSS samples, and its future potential for routinely screening for oral health and perhaps physiologically remote extraoral conditions at patient point-of-contact sites such as dental surgeries, health centers, and pharmacies. Although LF benchtop NMR analysis is now frequently employed for determinations of the yields, extents, and rates of synthetic organic/inorganic chemistry reactions [160], along with chemical education applications [161,162], this approach may also provide useful clinical diagnostic and/or prognostic tracking information in human patients when biofluids are non-invasively screened. However, our recently conducted investigations focused on the quantitative LF NMR analysis of human saliva were found to be unfortunately limited to only five or so metabolites with the most prominent 1H NMR resonances in view of metabolite sensitivity issues and the superimposition of signals at this low operating frequency (Figure 13).
These prominent resonances consisted of those of acetate, propionate, glycine, methanol, and formate, although future optimizations of this technique at perhaps higher benchtop spectrometer operating frequencies (say 80, 90, or 100 MHz) may lead to the efficient and reliable analysis of perhaps 12 or more salivary biomolecules.
We found at least reasonable consistencies and acceptable linear correlations between the salivary levels of acetate, propionate, glycine, and methanol estimated at 60 and 400 MHz operating frequencies (r = 0.97 to >0.99), but less so for formate (r = 0.93). Moreover, for propionate estimations, we also revealed some concentration limit restrictions in view of the overlap of its major δ = 1.05 ppm -CH3 function triplet signal (J = 7.67 Hz [14]) with those of a series of more minor level biomolecules. Because of this, we suggest that, for HSS specimens with propionate concentrations ≥1.2 mmol./L, samples are diluted somewhat to bring this level below this limit in order to minimize such resonance overlap complications; otherwise, the 60 MHz facility may significantly overestimate this value. Estimated LLOD and LLOQ values for such LF NMR analyses were approximately 100 and 300 µmol./L, respectively, for major metabolites with prominent -CH3 function singlet resonances, specifically acetate. As expected, these values were somewhat higher for the propionate -CH3 group triplet, and the 2-proton glycine and 1-proton formate singlet signals.
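The dilution recommendation above lends itself to a simple worked calculation. The sketch below takes the 1.2 mmol./L propionate limit from the text; the use of whole-number fold dilutions, the function name, and the default quantification limit of 0.3 mmol./L (the approximate LLOQ quoted for acetate, whereas that for propionate is stated only to be somewhat higher) are assumptions made for illustration.

```python
import math


def plan_dilution(estimate_mmol_l, limit_mmol_l=1.2, lloq_mmol_l=0.3):
    """Return (fold, diluted_level, quantifiable) for a preliminary propionate estimate.

    fold is the smallest whole-number dilution that brings the level strictly
    below the limit; lloq_mmol_l is an assumed placeholder quantification limit.
    """
    if estimate_mmol_l < limit_mmol_l:
        fold = 1  # already below the overlap-prone range, so no dilution is needed
    else:
        fold = math.floor(estimate_mmol_l / limit_mmol_l) + 1
    diluted = estimate_mmol_l / fold
    return fold, diluted, diluted >= lloq_mmol_l


if __name__ == "__main__":
    for estimate in (0.8, 1.2, 3.0, 6.0):
        fold, diluted, ok = plan_dilution(estimate)
        print(f"{estimate:.1f} mmol/L -> {fold}-fold, {diluted:.2f} mmol/L, quantifiable: {ok}")
```

Since the undiluted 60 MHz estimate may itself be inflated by resonance overlap, the diluted sample would need to be re-measured and the result multiplied by the dilution factor, rather than relying on the preliminary value.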
Table 1 (TSP represents the 3-(trimethylsilyl)propionate-2,2,3,3-d4 chemical shift reference and quantitative internal standard). At this low operating frequency, further spin-spin splitting of propionate's -CH3 and -CH2 group resonances (classically t and q, respectively) is clearly apparent in both experimentally acquired and simulated 60 MHz spectra. Bruker TopSpin NMR-SIM software was employed to generate simulated spectra. This NMR-SIM program is based on the solution of the quantum mechanical Liouville equation, with the spin system developing during the radiofrequency pulse and FID acquisition process, which features rotating frame magnetization transfer and the execution of T1 and T2 effects.
Full details of these simulations will be reported elsewhere.
For the readily visible propionate resonances located in the 60 MHz 1H NMR spectra, i.e., those ascribable to its coupled -CH3 and -CH2 groups (t, δ = 1.06 ppm, and q, δ = 2.18 ppm, respectively), a consideration of their J value indicates that they will span 2 × 7.67/60 = 0.26 ppm and 3 × 7.67/60 = 0.38 ppm, respectively, at 60 MHz (Figure 13), but only 0.026 and 0.038 ppm, respectively, in a corresponding 600 MHz spectrum (Figure 1). This illustrates the precautions that should be taken when attempting to quantify low-molecular-mass salivary metabolites or xenobiotics using this technique in view of inherent signal superimposition limitations. However, such problems may be overcome by adopting suitable analytical calibration strategies. Moreover, these complications are not considered to be influential for 1H NMR resonances located within relatively 'spectroscopically clear' regions of the spectra acquired, specifically those of the methanol -CH3, glycine -CH2, and formate -H nuclei.
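The multiplet widths quoted above follow from the first-order rule that a multiplet of n lines with a single coupling constant J spans (n − 1) × J Hz, which is converted to ppm by dividing by the operating frequency in MHz. The short sketch below reproduces this arithmetic for the propionate triplet and quartet; the function name is hypothetical, and first-order behaviour is assumed throughout, even though further splitting is apparent at 60 MHz.

```python
def multiplet_span_ppm(j_hz, n_lines, spectrometer_mhz):
    """Distance in ppm between the outermost lines of a first-order multiplet."""
    return (n_lines - 1) * j_hz / spectrometer_mhz


J_PROPIONATE_HZ = 7.67  # coupling constant quoted in the text

if __name__ == "__main__":
    for freq_mhz in (60.0, 600.0):
        triplet = multiplet_span_ppm(J_PROPIONATE_HZ, 3, freq_mhz)  # -CH3 group
        quartet = multiplet_span_ppm(J_PROPIONATE_HZ, 4, freq_mhz)  # -CH2 group
        print(f"{freq_mhz:.0f} MHz: triplet {triplet:.3f} ppm, quartet {quartet:.3f} ppm")
```

The tenfold difference in span between 60 and 600 MHz is what makes resonance overlap so much more likely at the lower operating frequency.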
Readers should also appreciate that the adoption of these LF NMR technologies for the analysis of WMS supernatant or related samples is still at a very early stage. Prospectively, however, future developments in this area may give rise to the installation or rapid transport of such non-stationary devices to dental surgery and primary healthcare sites, hospitals, and hospital laboratories, and perhaps also to community pharmacies, for the rapid screening of oral health and perhaps other conditions at these point-of-contact sites, most especially if the number of metabolites determinable is increased to >10 and if at least some of these serve as reliable biomarkers for selected conditions. For example, lactate and acetate are useful biomarkers for dental caries [163], and propionate, along with n- and iso-butyrate, are biomarkers for periodontal disease progression [164,165]. However, currently neither of these butyrate isomers is readily 1H NMR-detectable at an operating frequency of only 60 MHz since their salivary concentrations remain below LLOD values.

Concluding Remarks and Recommendations for Future Experimental Investigations
In conclusion, high-resolution NMR analysis serves as an extremely valuable tool for probing the multicomponent metabolic status of human saliva. In this communication, the advantages of this technique for exploring the capacity of salivary 1H NMR-linked metabolomics techniques to (1) determine the nature, levels, and potential source(s) of salivary metabolites and (2) reliably predict the oral and, where possible, extra-oral health status of humans have been considered in much detail. Also considered are the many complications and potential pitfalls associated with this strategy, most especially those regarding the careful planning and organization of saliva sample collections from study participant donors; sample transport, preparation, and storage protocols; and methods available for spectral acquisition, including the pulse sequence selections available. Most notably, the distinction of 'real' endogenous salivary metabolites from those arising exogenously is of crucial importance, as demonstrated here for citrate and carbohydrates, which certainly appear to persist in this biofluid for at least 4 and 2 h, respectively, after their dietary ingestion. Moreover, following meal consumption, we have also demonstrated that selected salivary biomolecules (notably pyruvate, succinate, and lactate) rapidly increase in concentration in response to the carbohydrate challenge provided by food sources of carbohydrates such as sucrose and glucose, which were detectable in the 1H NMR profiles acquired ≥2 h following meal consumption (Figure 2). This response is ascribable to the utilization of carbohydrate species as an energy source by the salivary microbiome and their rapid catabolic conversion to these organic acid anion metabolites.
Planned developments in NMR-based salivary metabolomics have resulted in the establishment of protocols designed to achieve cross-laboratory standardized results; these cover the collection, storage, preparation, and analysis of saliva samples, and may serve to permit inter-laboratory comparisons and to augment preliminary observations in this research area [3,31]. Indeed, the authors of [31] deserve much credit for developing this. Furthermore, the recommended protocol proposed for the 'correct' performance of salivary 1H NMR-based metabolomics experiments in [31] indicated that our current and previously used protocols for the 1H NMR determination of salivary biomolecules, and the utilization of unbuffered internal TSP as a quantitative standard and chemical shift reference, were predominantly acceptable and valid. Moreover, from this report [31], the type, temperature, rate, and duration of the critical cell- and debris-removing centrifugation step, together with freeze-thaw storage cycles, were considered not to significantly affect the quality of results obtained.
However, as noted in Section 6.2, there remain clear problems regarding the establishment of a minimum oral activity abstention period required for the collection of whole mouth and other forms of human saliva. Of particular concern is salivary citrate, the origin of which is of much importance since it is associated with upregulated rheological properties [7,84].
We have also explored the applications of pulse sequences for the 1H NMR analysis of human saliva, including the commonly employed CPMG strategy, and for the first time also present data on the use of the newly developed WASTED sequence for the analysis of WMS supernatants (Figure 2). However, from comparisons of the Δν1/2 values of resonance lines in the metnoesy 1D 1H NMR spectra of a series of WMS supernatant specimens, it was found that, with the exception of n = 2 participants, there were no significant differences between the values for alanine and lactate, although differences between some participants were significant, and this may indicate differential metabolite-binding macromolecule (including protein) levels, and/or variable metabolite-macromolecule binding equilibria between these samples. With the exception of one sample that had a near-equivalent value, overall, these values were significantly lower than those determined on the protein-unbound, 'free' form of lactate in spectra of human blood plasma acquired using the Hahn SEFT pulse sequence [124]. Hence, it appears that, although certainly present, any such macromolecule binding equilibria in human saliva do not significantly alter the quantification of low-molecular-mass biomolecules in this analytical medium, simply because macromolecule availability is insufficient to achieve this. Indeed, we await results from additional experiments conducted to evaluate this further.
Additional information is also provided regarding ways and means for circumventing the misassignment of NMR resonances, most particularly those in commonly analyzed 1H spectral profiles, and these have included the benefits offered by the application of 2D NMR techniques, both homo- and heteronuclear. Such misassignments may commonly arise in situations where researchers do not or are unable to acquire confirmatory 2D NMR profiles, although the duration of expensive spectrometer time required for their recommended acquisition can often present experimental restrictions. Notwithstanding, such 2D NMR techniques are also valuable for overcoming 1H NMR resonance congestion in selected regions of 1D spectral profiles.
Further considerations regarding the applications of this novel technique include optimization of the quantitative determination of salivary biomolecules, and of the preprocessing methods that underpin both the UV and MV statistical analyses of salivary metabolomic datasets, including the correct choice of strategies for the bucketing, normalization, transformation, and scaling of metabolite resonances. Although not the main focus of this report, we have also supplied useful information concerning the techniques available for the MV and computational intelligence analysis (CIA) of such salivary 1H NMR profile data, although these approaches will be considered and scrutinized in more detail in Part II of this two-paper series.
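To make the order of these preprocessing steps concrete, the sketch below applies constant-sum normalization, a logarithmic transformation, and Pareto scaling to a small table of already-bucketed intensities (rows are samples, columns are buckets). It is an illustration under stated assumptions rather than the procedure used in this work: the particular choice of methods, the function names, the log offset, and the example intensities are all hypothetical, and the bucketing step itself is assumed to have been performed beforehand.

```python
import math
from statistics import mean, stdev


def normalize_to_total(row):
    """Constant-sum normalization: each bucket as a fraction of the row total."""
    total = sum(row)
    return [value / total for value in row]


def log_transform(row, offset=1e-6):
    """Base-10 log transformation; the small offset (an assumption) avoids log(0)."""
    return [math.log10(value + offset) for value in row]


def pareto_scale(matrix):
    """Mean-center each column and divide by the square root of its standard deviation."""
    columns = list(zip(*matrix))
    centers = [mean(col) for col in columns]
    spreads = [math.sqrt(stdev(col)) for col in columns]
    return [
        [(v - c) / s if s > 0 else 0.0 for v, c, s in zip(row, centers, spreads)]
        for row in matrix
    ]


# Hypothetical bucket intensities for three samples and three buckets.
raw = [
    [120.0, 30.0, 50.0],
    [200.0, 20.0, 80.0],
    [90.0, 45.0, 15.0],
]

processed = pareto_scale([log_transform(normalize_to_total(row)) for row in raw])

if __name__ == "__main__":
    for row in processed:
        print([round(value, 3) for value in row])
```

Normalization is applied per sample and scaling per variable, so swapping these steps or their order would generally change the resulting MV model.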
We also demonstrate, for the first time, the 'pooling' of groups of salivary metabolites according to their physiological or other sources using a consideration of the component loadings of 'predictor' variables in a factor analysis experimental model involving quite a large number of healthy control human participants (n = 48), with five replicate WMS samples provided by each one. Particularly notable from this case study is the observation that all organic acid anion metabolites known to be predominantly or exclusively derived from bacterial catabolism, i.e., n-caproate, n-butyrate, propionate, acetate, and 5-aminovalerate, load the most strongly on PC2 (loadings of 0.85, 0.91, 0.84, 0.77, and 0.68, respectively), and certainly not significantly on the other PCs isolated. Therefore, the assignment of this component's source to the salivary microbiome certainly appears justified. This is further supported by considerable PC2 loading contributions arising from the bacterial catabolites DMA and TMA, along with selected amino acids such as alanine and phenylalanine, which presumably arise from a proteolytic bacterial source. Metabolites such as glutamate, glutamine, pyruvate, succinate, three N-acetylated species, and, to a lesser extent, urea, all loaded strongly on PC1, which was attributed to a host origin. Nonetheless, it was not at all surprising that quite a high proportion of the metabolites included in this study loaded significantly on more than just one PC. Indeed, metabolites such as the amino acid tyrosine appear to have a dual 'residency' on both the oral microbiome- and host-associated PCs, since amino acids such as this may be derived from both origins, the microbial one involving proteolytic bacteria. Likewise, it appears that both choline and acetoin are generated by both sources, as indeed are lactate, acetate, and succinate.
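The 'pooling' logic described above can be sketched as a simple thresholding of component loadings. In the illustration below, only the PC2 loadings for n-butyrate and propionate are taken from the text; every PC1 value, both tyrosine values, the 0.4 cut-off, and the function name are hypothetical placeholders introduced purely to show how single-source and dual-'residency' metabolites would be separated.

```python
def pool_by_loading(loadings, threshold=0.4):
    """Map each metabolite to the components on which |loading| >= threshold.

    The threshold is an illustrative assumption, not a value from the study.
    """
    pools = {}
    for metabolite, per_component in loadings.items():
        pools[metabolite] = [
            component
            for component, value in per_component.items()
            if abs(value) >= threshold
        ]
    return pools


loadings = {
    # PC2 values from the text; PC1 values are hypothetical placeholders.
    "n-butyrate": {"PC1": 0.05, "PC2": 0.91},
    "propionate": {"PC1": 0.10, "PC2": 0.84},
    # Both values hypothetical, chosen to illustrate dual 'residency'.
    "tyrosine": {"PC1": 0.55, "PC2": 0.50},
}

pools = pool_by_loading(loadings)

if __name__ == "__main__":
    for metabolite, components in pools.items():
        label = "dual source" if len(components) > 1 else "single source"
        print(f"{metabolite}: {components} ({label})")
```

In practice, the significance of a loading would be judged from the factor analysis model itself rather than from a fixed cut-off.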
For the host source, lactate is featured in the glycolytic pathway via reduction of pyruvate; acetate and acetyl-CoA in many cellular metabolic processes operating with energy generation, lipid biosynthesis, and protein acetylation functions; and succinate in the citric acid cycle. Additionally, it appears that a de novo pathway exists for acetate formation from a pyruvate precursor [166], which is, of course, the terminal product of both glycolysis and fatty acid metabolism. However, both these metabolites load significantly on the salivary microbiome-sourced PC (PC2). Moreover, acetate also readily arises from the autoxidation of pyruvate in aqueous solution (Equation (2)). Enzymes present in protein- and amino acid-degrading (proteolytic) bacteria, including Prevotella and Porphyromonas species, readily degrade proteins and amino acids and transform them into organic acid anions, volatile sulfur compounds (VSCs), indole and skatole, and ammonia, for example, and these catabolites are known to play important chemopathological roles in the development of PDs and oral malodor. Although the oral environment may exert some major effects on the metabolic actions of bacteria, localized modifications arising therefrom can, at least in principle, give rise to a more active microbiome with an amplified pathogenicity. Therefore, a full analysis of the oral microbiome and catabolites arising therefrom through metabolomics strategies will serve to enhance our understanding of the mechanisms of oral diseases, together with potential means and/or regimens for their challenge with appropriate therapeutic strategies.
Of the broad 'spectrum' of 1H NMR-detectable short-chain organic acid anions (acetate, n- and iso-butyrates, formate, lactate, propionate, succinate, pyruvate, 5-aminovalerate, and fumarate, amongst others), some may effectively serve as PD markers that reflect the growth, preponderance, and metabolism of micro-organisms. Indeed, for those that are not commonly also observed in or derived from host sources, it is highly conceivable that selected groups and/or patterns of these metabolites represent chemotaxonomic markers of different classes of bacterial infiltration. For example, the anaerobic pathogenic bacterium Porphyromonas gingivalis generates high levels of n-butyrate. Moreover, in addition to the roles of propionate, along with n- and iso-butyrates, in PD [164,165], both acetate and lactate may act as surrogate biomarkers of the susceptibility of patients to dental caries [163]. Indeed, excessive levels of the corresponding acids of these metabolites are viewed as primary end-point biomarkers featured in dental caries etiology. However, formic and pyruvic acids, which may be present at millimolar or near-millimolar concentrations in human saliva [3], are stronger acids than lactic acid (i.e., they have clinically significantly lower pKa values [2]), and hence may also trigger pro-cariogenic actions. N-acetyl sugars such as N-acetylglucosamine and N-acetylneuraminate (sialate) are derived from the actions of the bacterial enzymes hyaluronidase and neuraminidase, respectively, and may also play significant roles in the pathogenesis of oral diseases such as PDs [167,168]. The malodorous amines determined, such as TMA, may represent one or more potentially toxic agents generated by bacteria implicated in the etiology of PDs [169]. Moreover, volatile sulfur compounds (VSCs) also adversely contribute towards PDs [170].
We also conclude that low-field benchtop 1H NMR spectrometers may be successfully employed to reliably detect and quantify salivary metabolites, although, at present, this novel technology is restricted to molecules with the most prominent resonances, particularly those with clear singlets and/or simple first-order coupling patterns (specifically doublets or triplets only at this stage) or those present in spectroscopically clear spectral regions, for quantitative analysis purposes. From our current studies, only five or so of the most concentrated or 1H NMR-visible salivary biomolecules, i.e., acetate, propionate, methanol, glycine, and formate, could be determined in WMS supernatant samples, although it should be noted that these may all potentially serve as key, potentially diagnostic biomarkers for the monitoring of oral diseases. Therefore, this information is of potential value to the detection and prognostic screening of common oral health conditions such as PDs and dental caries, although much further research work will be required to plausibly evaluate these applications. Outlooks for the future of this cost-effective technique, such as the on-the-spot, point-of-care, and near-multicomponent analysis of WMS samples at patient point-of-contact sites, appear to be highly promising. Once non-stationary benchtop NMR facilities at higher operating frequencies become available and validated for such applications, e.g., those operating at 80 or 100 MHz, or even higher field strengths, these devices may have the ability to acceptably screen up to 15 or more salivary biomolecules. Hence, further developments in these novel directions offer much promise.
In principle, when detectable at sufficient salivary concentrations, and when resolved from any potential superimposing endogenous metabolite resonances, such analyses may be directly extended to the detection and quantification of a range of exogenous agents in this biofluid, notably ingested drugs. Additionally, this development in NMR technology may also be transferable to a number of forensic applications, perhaps those featuring the rapid identification of illicit drugs in WMS supernatant samples obtained at crime scenes, again provided that such analytes have at least one prominent resonance that remains free from endogenous salivary interferants, and that samples are collected within acceptable time windows following drug ingestion.
The benefits offered by human saliva and other oral fluids as diagnostic biofluids are that they are relatively easy for participants to self-collect non-invasively without any form of clinical supervision, and are conveniently stored once all recommended laboratory processing treatments are applied. Nonetheless, research investigations performed on saliva for the diagnosis of both oral and systemic diseases remain in their early stages. Moreover, the scientific and clinical progression of these studies is curtailed somewhat by a lack of control, and hence of homogeneity, in sample collection, preparation, and analysis methods. Calls for the establishment of recommended and apparently validated protocols for such investigations, as noted in [31], should therefore be responded to and implemented as soon as it is feasibly possible to do so. The availability of systemic knowledge networks for these technologies, with data available on validated disease biomarkers, will serve to enhance our understanding of correlations between oral and extra-oral health conditions, and this, in turn, may fully encourage the development of precision medicine strategies through the provision of support for pinpointed, innocuous, and perhaps even personally targeted therapies.
Finally, many further investigations, both at high-field and LF, are required to further explore the value of salivary 1H NMR analysis to the provision of potentially valuable salivary 'biomarker' data in a series of patients with clinically proven oral diseases, e.g., dental caries, periodontal diseases, oral malodor, etc. This essential requirement also applies to the application of these techniques to the salivary diagnosis and prognostic monitoring of extra-oral diseases such as a variety of remote cancers and inflammatory conditions, etc. However, since the salivary concentrations of biomarkers for such systemic conditions are often much lower (for example, approximately 100-fold) than corresponding levels found in more localized biofluids such as blood plasma or serum, such applications are limited somewhat in view of the analyte sensitivity level of 1H NMR analysis. Also required are meaningful evaluations of the reliability and reproducibility of multicomponent 1H NMR metabolic datasets acquired on WMS supernatant and other oral fluid specimens such as GCF, since these appear to be lacking in the scientific literature available on salivary metabolomics. These approaches should not only include the analysis of replicate samples from the same participant donors (as conducted in the case study documented in Section 10.1 here) but also evaluations of the stability of biomarkers throughout the post-collection laboratory and NMR analysis processing periods, together with assurances that all possible precautions have been taken to circumvent their artefactual generation or the adverse artefactual introduction of assay interferants. The prognostic stratification of study participants may also be required in some investigations.
Optimization of 'hands-on' assignments of all or virtually all resonances present in the 1H NMR profiles of WMS supernatant samples, and of the means, variances, and ranges of their concentrations, should, at least in principle, permit the performance of automated NMR-linked metabolomics strategies [171,172].