Clinical Metabolomics: The New Metabolic Window for Inborn Errors of Metabolism Investigations in the Post-Genomic Era

Inborn errors of metabolism (IEM) represent a group of about 500 rare genetic diseases with an overall estimated incidence of 1/2500. The diversity of metabolic pathways involved explains the difficulties in establishing their diagnosis. However, early diagnosis is usually mandatory for successful treatment. Given the considerable clinical overlap between some inborn errors, biochemical and molecular tests are crucial in making a diagnosis. Conventional biological diagnosis procedures are based on a time-consuming series of sequential and segmented biochemical tests. The rise of “omic” technologies offers holistic views of the basic molecules that build a biological system at different levels. Metabolomics is the most recent “omic” technology based on biochemical characterization of metabolites and their changes related to genetic and environmental factors. This review addresses the principles underlying metabolomics technologies that allow them to comprehensively assess an individual biochemical profile and their reported applications for IEM investigations in the precision medicine era.


Introduction
The new field of precision medicine is revolutionizing current medical practice and reshaping future medicine. Precision medicine aspires to put the patient as the central driver of healthcare by broadening biological knowledge and acknowledging the great diversity of individuals [1]. It is well established that complex gene-environment interactions shape normal physiological and disease processes at both the individual and population scales. Predicting normal and pathological states in patients requires a dynamic and systematic understanding of these interactions. Systems medicine is a new concept based on holistic approaches for disease diagnosis and monitoring. The basic idea of these approaches is that a complex system is more comprehensively understood if considered as a whole at both the spatial and temporal scales.
Inborn errors of metabolism (IEM) are an appropriate model for systems medicine studies because the biological basis underlying these diseases has been, at least partly, revealed. IEM represent a group of about 500 rare genetic diseases with an overall estimated incidence of 1/2500. Like other diseases, they can be viewed as perturbations of a complex, integrated, and dynamic biological network. The dynamic view refers to the quantitative and qualitative assessment of changes and interactions between the different layers of biological information [7,18-21]. The genetic classifications of disease are now well established, given the modern genomic tools that can provide rich information about large patient cohorts. However, other highly complementary approaches based on proteomic and metabolic information can help researchers to biochemically or physiologically contextualize the underlying genetic information, thus helping to get closer to the phenotype and allowing patient stratification [22]. Thanks to disruptive technological jumps, a revolutionary vision was pioneered by Lee Hood, who coined the term P4 medicine [19], which aims to be predictive, preventive, personalized, and participatory. This new shift defines a healthcare strategy in which each person serves as his or her own control over time [23].
The omics surge presents an amazing opportunity to provide new innovative tools for rapid diagnosis of IEM. Metabolomics approaches are particularly relevant for IEM because their basic pathophysiology is tightly related to metabolism. These diseases present with nonspecific clinical symptoms, and appropriate laboratory tests are crucial in making a diagnosis. However, conventional biological diagnosis procedures are based on a series of sequential and segmented biochemical tests run on various separate analytical platforms. This approach is time-consuming and complex, whereas optimal patient management requires faster biochemical testing to allow early diagnosis and better monitoring of IEM. To address this need for faster screening and diagnosis strategies, metabolic profiling is a promising candidate.
In this review, we describe basic principles underlying metabolic phenotyping and metabolomic approaches that can be used to comprehensively assess an individual biochemical profile and their reported applications in IEM. Data for this review were identified by searches of PubMed and references from relevant articles using the search terms "metabolomics", "metabonomics", "metabolic profiling", "inborn errors of metabolism", and "inherited metabolic diseases".

Metabolites and Metabolome
The idea behind metabolomics goes back to ancient Greece, where doctors used the organoleptic characteristics of urine to link them to different medical conditions. Urine sweetness, for example, was used to detect the high glucose of diabetes [24]. Such organoleptic features are, of course, metabolic in origin. The word metabolome was coined by Oliver et al. in 1998 and defined as the set of metabolites synthesized by an organism [25]. The metabolome refers to the comprehensive complement of all metabolites present in a given biological system, fluid, cell, or tissue [26]. Metabolites can be defined as small organic molecules involved in enzymatic reactions. Thus, metabolomics is one of the "omic" technologies based on biochemical and molecular characterization of the metabolome and of the changes in metabolites related to genetic, environmental, drug, dietary, and other factors.
Metabolomics allows researchers to characterize these interactions and to evaluate the biochemical mechanisms involved in such changes in a systematic fashion. Indeed, metabolites fulfill a key criterion in that they change rapidly in response to physiological changes, and they may generate vital information about the biochemical pathways that are modified in disease and in response to treatment. Hence, metabolic profiling is highly informative, since metabolites act as substrates or products in biochemical metabolic pathways [22,27-29].
Metabolomics has found applications in many disease studies and in complex interacting systems [22]. The possibility of predicting drug effects from baseline metabolic profiles has been demonstrated and gave rise to pharmacometabonomics as a potential effector for patient stratification and personalized medicine [30][31][32][33][34][35]. It is possible that the future of IEM diagnosis may be found in the developing area of metabolomics by doing simultaneous quantitative metabolic profiling of many metabolites in biological fluids.

Biological Samples
For biological information recovery, metabolomics generally uses biofluids, cells, or tissue extracts as primary sources of metabolic fingerprint data. Compared with intact or extracted tissues, urine and serum or plasma are the most commonly studied biofluids in clinical practice, because they are easily obtained and prepared [36][37][38][39]. However, other specialized fluids can be used, including cerebrospinal fluid [40,41], saliva [42][43][44], and even breath [45,46]. Dried blood spot (DBS) samples, and dried spots of other biofluids, have also been investigated [47][48][49][50] and were shown to be an interesting alternative to conventional liquid samples for generating metabolite profiles. Given their practical advantages such as low volume, low cost, and handling convenience, DBS are gaining interest as a sampling support for metabolic profiling in IEM [47,51-53]. Of note, most metabolomics studies, particularly in clinical metabolomics, include data from a single biofluid, most often blood or urine. However, the biochemical signature in a biofluid reflects complex interrelationships between different organs, which adds another layer of complexity to metabolomics data interpretation. Such signatures can only be understood by investigating pathophysiological states from a metabolic-interactions perspective, taking into account local metabolome specificities and their contribution to the systemic metabolome. Different data-driven approaches have been described to handle these issues using multiple-biofluid sampling and metabolomics data modeling [54,55].

Analytical Technologies
The human metabolome is a complex, highly responsive, and dynamic system. Thus, it raises different analytical challenges compared to other omics approaches, which are based on profiling large molecules built from a simple and limited set of subunits: nucleotides for genomics and transcriptomics, and amino acids for proteomics. For the identification and functional analysis of DNA, RNAs, and proteins, what matters is the order of the subunits: it is this order that embodies the observed complexity and carries the biological information. Sequencing technologies rely basically on an incremental detection of these subunits, and researchers must figure out the order of the subunits to decode the carried biological information [56]. However, the same sequencing approach cannot be used to analyze metabolites in complex biofluids, because the analytical challenge is not simply to crack an order code: there is no obvious order.
To retrieve the metabolic information, the metabolome requires a more complex analysis of chemical mixtures that allows components to be individually and selectively differentiated, identified and measured across a wide qualitative and quantitative chemical space.
The diversity of the physicochemical properties of the various metabolites groups adds another layer of complexity to metabolomics studies. This supplemental challenge has been the key driver for the development of various analytical protocols and platforms. Indeed, scientists tackled this analytical challenge even before the term metabolomics was coined. The first scientific article about metabolomics was published by Pauling and colleagues, in which they described a method using gas chromatographic separation with flame ionization detection to analyze the breath [57]. The authors referred to orthomolecular medicine linking the detected biochemical signature to phenotypes.
Since then, huge progress has been made. The mainly used metabolic profiling technologies are nuclear magnetic resonance (NMR) spectroscopy [58][59][60] and mass spectrometry (MS), whether or not combined with a gas-phase or liquid-phase separation method [27,51]. These technologies are suitable for metabolomics studies because they deliver global, unbiased, and comprehensive chemical information from complex mixtures. For information recovery, the multivariate spectroscopic data produced are typically analyzed using chemometric techniques to identify informative metabolic combinations that can be used for either sample classification or global biomarker discovery [51,61,62]. NMR spectroscopy is rapid and nondestructive and has the advantage of being highly reproducible. It is a powerful spectroscopic technology that offers atom-centered information that is crucial for molecular structure elucidation [63]. High-resolution NMR using stronger magnetic fields or two-dimensional NMR allows higher information recovery. The major drawback of NMR is its lack of sensitivity. MS, however, offers complementary molecular information and is by far more sensitive than NMR; hence, it allows higher metabolome coverage. The use of separation methods coupled to MS, such as liquid chromatography [38,39], gas chromatography [64], or capillary electrophoresis [65], adds a molecular separation step before MS detection. This enhances sensitivity and the dynamic range and provides complementary molecular information through the separation dimension.
Recently, approaches using another gas-phase separation, ion mobility spectrometry (IMS) [66], have been gaining interest in metabolomics [67][68][69][70][71][72][73]. Indeed, IMS is a well-established post-ionization separation method based on size, shape, and charge, performed on a millisecond timescale, which is intermediate between chromatography (seconds) and high-resolution MS detection (microseconds). Coupled with high-resolution mass spectrometry and chromatography (LC-IM-MS), IMS provides additional analyte selectivity without significantly compromising the speed of MS-based measurements. The MS dimension affords accurate mass information, while the IMS dimension provides molecular, structural, and conformational information through the determination of the ion collision cross section. Indeed, ion mobility spectrometry adds a separation dimension to hybrid MS instruments, thus allowing a more comprehensive analysis of complex biological mixtures [69,74-77]. Furthermore, accessing retention time, accurate mass, and collision cross section through the combination of LC-IM-MS allows measurement integration, which enhances molecular identification and consequently biomarker discovery [78,79].
Fourier transform mass spectrometry is another group of ultra-high-resolution methods that offer the highest resolving power, resolution, and mass-to-charge ratio (m/z) measurement accuracy and, hence, better metabolome coverage [80]. However, given their high cost, these methods are limited to only a few research groups.
Recently, to increase the throughput of global metabolic profiling analysis, ambient ionization sources were introduced. They are capable of directly sampling complex matrices under ambient conditions. For example, the atmospheric solids analysis probe [81], desorption electrospray ionization (DESI) [82][83][84], and rapid evaporative ionization MS methods [85,86] have been demonstrated to provide real-time, interpretable MS data on biofluids and tissues, in vivo and ex vivo, and will certainly reshape the future of high-throughput real-time metabolome analysis. In many surgeries, it is often difficult to distinguish visually between healthy and diseased tissues, and this requires time-consuming biopsies and immunostaining procedures to be performed by histopathologists during surgery. By eliminating this need for external tissue histotyping, the iKnife, a surgical application of rapid evaporative ionization MS, could open the way to true real-time precision surgery. For more details about the use of ambient MS in clinical diagnosis, refer to a recent and detailed review by Ifa et al. [87]. Table 1 presents a comparison between different analytical strategies used in metabolomics with potential interest for IEM. Given the already existing chemical biomarker infrastructure, the growing adoption of MS in clinical laboratories, its relatively low cost compared to NMR instruments, and the analytical performance of current mass spectrometers, in terms of sensitivity and resolution in particular, MS-based metabolomics is a very promising tool for clinical biochemistry in the near future [88].

Metabolomics Workflows: Targeted vs. Untargeted
Metabolomics analysis is typically described in terms of two complementary analytical approaches: untargeted and targeted. The untargeted approach aims to define the global metabolic profile of the groups under study; multivariate statistical analysis is then undertaken to define the discriminating metabolites (potential biomarkers) between groups. Predictive mathematical models based on multivariate statistical analysis can subsequently be built; such models predict the classification of unknown biological samples (e.g., healthy versus diseased, treated versus untreated). The targeted approach, in contrast, focuses on identifying and quantifying selected metabolites according to their involvement in a metabolic pathway or their specific chemical or biochemical properties.
In general, a metabolomic analysis involves mainly four steps.
Step 1 is a preparatory step on both analytical and conceptual aspects. It is initiated by the biological question to consider and the definition of the study aim. It also defines the most informative biological matrix and the experimental design to implement. In addition, this step defines the appropriate sample preparation according to the considered analytical study.
Step 2 includes analytical and instrumental strategy choices. During this step, data are collected and processed and then statistical analysis is performed.
Step 3 involves the putative annotation, identification, and confirmation of the potential biomarkers generated by the data analysis. Chemical, biochemical, and spectral databases are queried.
Step 4 aims to build a predictive mathematical model based on the identified biomarkers. This model is then validated analytically and clinically. This final step involves the integration of experimental data and their interpretation in the studied biological or clinical context [89]. Figure 1 illustrates the general workflow of translational metabolomics.

Data Analysis, Information Recovery, and the Curse of Dimensionality
A few highly reliable metabolites could, to some extent, be sufficient for diagnostic or monitoring purposes. However, a broader overview using more metabolites is more appropriate to assess, for example, a biochemical pathway. Thus, the choice of the most appropriate data modeling strategy is an important issue and depends on the underlying question to be addressed. In mechanistic studies, the structural data descriptions and the underlying information extracted by the built model are more important than its predictive ability to classify new samples. In diagnostic applications, however, the predictive performance of the model is vital for sample classification. Hence, the definition of the study aims has to be clear, precise, and purpose driven.
The analytical performance improvements associated with metabolomics platforms have led to the generation of complex and high-dimensional datasets. Handling the huge amount of generated data in a smooth, high-throughput fashion is a key issue in transforming those data into clinically actionable knowledge.


Univariate Data Analysis
Metabolomics data analysis can be approached from a univariate perspective using traditional statistical methods that consider only one variable at a time. Univariate methods are common statistical analysis tools whose main advantage is their ease of use and interpretation. To assess the differences between two or more groups, parametric tests such as Student's t-test (for two groups) and ANOVA (for more than two groups) are commonly applied. However, normality assumptions should be verified for consistent conclusions [90]. Otherwise, non-parametric tests such as the Mann-Whitney U test or the Kruskal-Wallis one-way analysis of variance can be used when normality is not assumed. Another important issue is that applying multiple univariate tests in parallel to a high-dimensional dataset raises the multiple testing problem. In metabolomics studies, a large number of features are analyzed simultaneously, so the probability of finding a statistically significant difference by chance (i.e., a false positive) is high. To handle this multiple testing issue, different correction methods can be used. Each method tries to balance avoiding false metabolite associations (i.e., false positives) against discarding true associations (i.e., false negatives). In the Bonferroni correction, the significance level for a hypothesis is divided by the number of hypotheses tested simultaneously [90]; hence, the Bonferroni correction is considered a stringent correction method. Other, less conservative methods are available and are mostly based on controlling the false-discovery rate (FDR). FDR-based methods minimize the expected proportion of false positives among the total number of positives [91]. Most of these methods were matured in gene-expression microarray data analysis, where thousands of genes are tested simultaneously. Similarly, in untargeted metabolomics studies, large sets of metabolites are measured in parallel.
The use of less restrictive approaches such as FDR methods therefore seems more appropriate.
Furthermore, it should be noted that potential confounding factors such as gender, age, or diet may affect the results if not properly addressed. The main limitation of univariate approaches, however, is their inability to handle the correlations and interactions between the different metabolic features. Hence, advanced multivariate approaches are more suitable.
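To make the multiple-testing discussion concrete, the following sketch runs one t-test per metabolite and then applies both the Bonferroni correction and a Benjamini-Hochberg-style FDR procedure. The data are simulated and the feature counts are arbitrary; this is an illustration, not a recommended pipeline.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated intensity matrix: 20 controls vs. 20 patients, 200 metabolite features.
# Features 0-4 carry a true group difference; the rest are pure noise.
n_features = 200
controls = rng.normal(0.0, 1.0, size=(20, n_features))
patients = rng.normal(0.0, 1.0, size=(20, n_features))
patients[:, :5] += 2.0  # shifted "biomarker" features

# One Student's t-test per metabolite (the univariate strategy described above).
_, pvals = stats.ttest_ind(controls, patients, axis=0)

# Bonferroni: divide the significance level by the number of tests.
alpha = 0.05
bonferroni_hits = np.flatnonzero(pvals < alpha / n_features)

# Benjamini-Hochberg FDR: rank p-values and compare each to (rank/m) * alpha.
order = np.argsort(pvals)
ranked = pvals[order]
thresholds = alpha * (np.arange(1, n_features + 1) / n_features)
below = np.flatnonzero(ranked <= thresholds)
fdr_hits = np.sort(order[: below.max() + 1]) if below.size else np.array([], dtype=int)

print("Bonferroni-significant features:", bonferroni_hits)
print("BH/FDR-significant features:    ", fdr_hits)
```

As expected, the FDR procedure is at least as permissive as Bonferroni: every Bonferroni hit is also an FDR hit, while FDR retains more power when many features are tested.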

Multivariate Data Analysis
Translating biological data into knowledge requires addressing biology as an informational science, using tools that allow information to be tracked at large scales. To do so, an entire field was born: bioinformatics [92]. Bioinformatics can be defined as a means of conceptualizing biology in terms of molecules and applying "informatics techniques", borrowed from disciplines such as applied mathematics, computer science, and statistics, to understand and organize the information related to these molecules on a large scale. In short, bioinformatics is a management information system for a biological system [93].
The high dimensionality of metabolic data requires adapted statistical tools to retrieve as much chemical information as possible from the data and translate it into biological knowledge. The major challenge is to reduce the dimensionality by selecting relevant signals from the noisy raw data. To achieve this goal, chemometric tools are widely used. Chemometrics is the science of extracting useful information from chemical systems by data-driven means [94]. It is inherently interdisciplinary, borrowing methods from data-analytic disciplines such as multivariate statistics, applied mathematics, and computer science. Thus, chemometrics is applied to solve both descriptive and predictive problems using biochemical data.
Data analysis methods are mainly divided into two types: unsupervised and supervised. The former are mainly exploratory, whereas the latter are explanatory and predictive. Unsupervised methods are used to analyze the behavior of the observations in the dataset without taking into account any related outcome. Because there is no class labeling or response, the dataset is considered a collection of analogous objects. Unsupervised learning methods track patterns or clustering trends in the data to uncover any spontaneous relationships between the samples; they can also highlight the variables that are responsible for these relationships. Based on effective visualization means, unsupervised learning helps to reveal categories of samples or variables that naturally cluster together based on their underlying similarities. In metabolomics data, it is the metabolic similarity that shapes the clustering. Principal component analysis [95] is a widely used pattern recognition method; it is a projection-based method that reduces the dimensionality of the data by creating components, or latent variables, and it allows a two- or three-dimensional visualization of the data. Clustering methods, in contrast, aim to identify clusters in the dataset using similarity measures; a dendrogram or a heat map can then be drawn to visualize sample similarities. The commonly used clustering methods are k-means clustering [96], hierarchical cluster analysis [97], and self-organizing maps [98]. A correlation matrix can also be used to get an overview of the data. Because the main goal in metabolomics, especially in the clinical context, is to differentiate between groups (healthy versus diseased, treated versus control), a sample can be classified according to its spectral patterns. The metabolic features responsible for the classification can then be identified.
The metabolic features intensities in the dataset matrix can be considered as a multidimensional space of metabolites coordinates. Thus, each spectrum is a point in a multidimensional metabolic hyperspace.
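As a minimal illustration of this unsupervised, projection-based view, the sketch below (simulated data; scikit-learn assumed available) autoscales a toy spectral matrix and projects it onto its first two principal components, where the latent group structure becomes visible.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Toy metabolomics matrix: rows are samples (spectra), columns are metabolite
# features. Two latent groups differ on the first 10 features, mimicking a
# disease signature; no labels are given to the model.
group_a = rng.normal(0.0, 1.0, size=(15, 50))
group_b = rng.normal(0.0, 1.0, size=(15, 50))
group_b[:, :10] += 3.0

X = np.vstack([group_a, group_b])

# Autoscale (unit variance) so high-abundance metabolites do not dominate,
# then project each spectrum onto the first two principal components.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))

# The two groups separate along PC1: each spectrum is one point in score space.
print("PC1 range, group A:", scores[:15, 0].min(), scores[:15, 0].max())
print("PC1 range, group B:", scores[15:, 0].min(), scores[15:, 0].max())
```

Plotting `scores` would show two clouds of points, recovered without any class information, which is exactly the exploratory role of unsupervised methods described above.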
In supervised methods, the multivariate datasets are modeled so that the class labels of separate samples, known as a validation set, can be predicted based on a series of mathematical models derived from the original data, namely the training set. Various supervised methods can be used in metabolomics, including partial least squares (PLS) methods such as PLS-Discriminant Analysis (PLS-DA) [99] and Orthogonal-PLS-DA (OPLS-DA) [100], as well as support vector machines [101]. Methods based on topological data analysis are gaining great interest and seem promising because of their intrinsic flexibility and their exploratory and predictive abilities [102]. It must be noted that the information retrieved from the raw data and the generated outputs are highly dependent on the chosen data analysis strategy. Hence, the aim of the metabolomics study and the data analysis step are mutually dependent.
Of note, multivariate and univariate data analysis pipelines are not mutually exclusive and it is often recommended to use both to maximize the quality of the information extraction from metabolomics data.
For further details on data analysis techniques and tools in metabolomics, refer to recent reviews on this issue [103][104][105].

Pathway and Network Analysis: From Information to Knowledge
The integration of experimental data and computational tools is mandatory to understand complex biological systems. This gave birth to computational biology, which can be divided into two distinct branches: knowledge discovery, or data mining, and simulation-based analysis. The former extracts hidden patterns from huge amounts of experimental data, generating hypotheses. The latter, in contrast, tests hypotheses with in silico experiments, providing predictions to be confirmed by in vitro and in vivo studies [9].
One of the biggest challenges of any metabolomics study is linking the identified metabolites to biology, which is a crucial step to move from biomarkers towards more mechanistic insights. To achieve this purpose, pathway and network analysis approaches aim to capitalize on the information generated by metabolomics studies to get insightful inference [106,107]. Both approaches exploit the interrelationships properties contained in the metabolomic data. Network modeling and pathway-mapping tools help to decipher metabolites interactions roles in a biological disturbance [107].
Metabolic pathways are sets of metabolites that are connected to the same biological process and that are linked, directly or indirectly, by one or multiple enzymatic reactions. Biological databases are therefore seminal enablers, providing rich information on different metabolic pathways (Table 2). Indeed, pathway analysis (PA) uses prior biological knowledge to analyze metabolic patterns from an integrative point of view. Pathway-based methods are currently known as metabolite set enrichment analysis (MSEA) and are methodologically based on the gene set enrichment analysis (GSEA) approach previously developed for pathway analysis of gene-expression data [108,109]. There are mainly three distinct methods to perform MSEA [108,116].
Overrepresentation analysis (ORA): The basic hypothesis in this method is that relevant pathways can be detected if the proportion of differentially expressed metabolites within a given pathway exceeds the proportion of metabolites that could be expected by chance. A hypergeometric test or Fisher's exact test is used to evaluate the statistical significance of the overlap between the differential metabolite list and the pathway. The final result of an ORA method consists of a list of the most relevant pathways, ranked by p-value and/or a multiple-hypothesis-corrected p-value. The main advantage of ORA over non-knowledge-driven (i.e., purely data-driven) analysis is that it gives metabolomic data a biological context. This allows formulating hypotheses that can subsequently be tested experimentally; hence, ORA turns data analysis into a knowledge-generation cycle, characteristic of the systems biology approach. However, PA exhibits some limits. Because of the cut-off chosen for statistical significance, potentially important components may be omitted from the analysis. Furthermore, PA assumes that pathways are independent of each other, contrary to the recognized interaction and overlap between pathways [108]. Other methods have been developed to overcome these limits.
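The hypergeometric test at the heart of ORA can be sketched as follows; all counts are hypothetical and chosen only for illustration.

```python
from scipy.stats import hypergeom

# Over-representation analysis (ORA) for one pathway.
# Hypothetical numbers: of M = 400 measured metabolites, n = 25 map to the
# pathway of interest; the differential list holds N = 40 metabolites,
# k = 10 of which fall in that pathway.
M, n, N, k = 400, 25, 40, 10

# P(X >= k) under the hypergeometric null: the chance of drawing at least k
# pathway members when picking N metabolites at random from the M measured.
p_value = hypergeom.sf(k - 1, M, n, N)
print(f"ORA p-value: {p_value:.2e}")
```

Here only 2.5 pathway members would be expected by chance (40 x 25 / 400), so observing 10 yields a very small p-value; repeating this test over every pathway, then correcting for multiple testing, produces the ranked pathway list described above.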
Quantitative enrichment analysis (QEA): In this approach, the input data are a set of quantified metabolites from multiple samples; thus, absolute concentrations are used. Enriched pathways can be identified using different approaches such as the Wilcoxon-based test [117], globaltest [118], or globalAncova [119]. Enriched pathways include both pathways in which a small set of metabolites changes significantly and pathways in which a large number of metabolites show more modest but coordinated changes [116,120].
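A simplified, illustrative QEA-style test (not the globaltest algorithm itself; data simulated) summarizes each sample over a pathway's metabolites and compares the two phenotype groups with a Wilcoxon rank-sum (Mann-Whitney U) test.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(3)

# Hypothetical pathway membership: 4 of 12 quantified metabolites.
pathway_cols = [0, 1, 2, 3]
controls = rng.normal(5.0, 1.0, size=(20, 12))
patients = rng.normal(5.0, 1.0, size=(20, 12))
patients[:, pathway_cols] += 1.2  # coordinated concentration shift in the pathway

# Summarize each sample by the mean concentration of the pathway's metabolites,
# then test whether that summary differs between the two groups.
score_c = controls[:, pathway_cols].mean(axis=1)
score_p = patients[:, pathway_cols].mean(axis=1)
_, p_value = mannwhitneyu(score_c, score_p, alternative="two-sided")
print(f"pathway enrichment p-value: {p_value:.2e}")
```

Because the four metabolites shift together, the pathway-level summary is highly discriminating even though each individual change is modest, which is precisely the scenario QEA methods are designed to detect.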
Single-sample profiling (SSP): Unlike the previous methods, which are designed for studies involving multiple samples, this method is applied at the level of a single sample. SSP requires a list of metabolite concentrations in a biofluid (e.g., urine, blood, or CSF), tissue, or cell type, together with a database of the normal concentration ranges of the chosen metabolites in the analyzed sample type. SSP then identifies, from the data, the set of metabolites presenting significantly different levels compared to the normal ranges [116,120].
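In its simplest form, SSP reduces to comparing one sample's concentrations against reference ranges; the metabolites, values, and ranges below are hypothetical and illustrative only.

```python
# Single-sample profiling (SSP) sketch: flag metabolites whose measured level
# falls outside its reference range. All numbers are hypothetical.
reference_ranges = {          # (low, high) in arbitrary concentration units
    "phenylalanine": (35.0, 90.0),
    "tyrosine": (35.0, 110.0),
    "glycine": (120.0, 350.0),
}
sample = {"phenylalanine": 820.0, "tyrosine": 28.0, "glycine": 200.0}

out_of_range = {
    name: (value, reference_ranges[name])
    for name, value in sample.items()
    if not (reference_ranges[name][0] <= value <= reference_ranges[name][1])
}
for name, (value, (low, high)) in out_of_range.items():
    print(f"{name}: {value} outside [{low}, {high}]")
```

Real SSP tools score the deviations statistically against population reference distributions rather than applying a hard cut-off, but the principle, per-sample comparison to normal ranges, is the same.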
For better interpretability of pathway analysis outputs, MSEA results can be combined with pathway topological analysis (PTA). PTA assesses the impact of the disturbed metabolites within the pathway. First, the impact of each metabolite is evaluated using network centrality measures such as degree and betweenness, which capture, respectively, the number of connections of a node and the number of shortest paths passing through it, thereby estimating its importance. Subsequently, the overall pathway impact is calculated as the sum of the impact measures of the disturbed metabolites, normalized by the sum of the impact measures of all the metabolites within the considered pathway [121]. Indeed, changes in the most central nodes of a network generate a more significant impact on the system than changes in peripheral or isolated nodes.
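A simplified sketch of the pathway-impact computation, using degree centrality only (the cited method also uses betweenness; the pathway graph and the set of disturbed metabolites are toy examples):

```python
# Toy pathway as an adjacency list; nodes are metabolites.
pathway = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D"],
}
degree = {node: len(neighbors) for node, neighbors in pathway.items()}

# Pathway impact: centrality of the disturbed metabolites normalized
# by the total centrality of all metabolites in the pathway.
disturbed = {"B", "E"}
impact = sum(degree[m] for m in disturbed) / sum(degree.values())
print(f"pathway impact: {impact:.2f}")
```

Note that the hub metabolite "B" contributes far more to the impact score than the peripheral metabolite "E", reflecting the centrality argument above.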
From a topological standpoint, a metabolic network can be considered as an interconnected ensemble of nodes, represented by metabolites, and edges, representing reactions catalyzed by enzymes. Thus, unlike PA, network analysis uses the high degree of correlation in metabolomics data to build metabolic networks that characterize the complex relationships among the measured metabolites. Biological data exhibit a high level of correlation between the different biological components (i.e., DNA, mRNAs, proteins and metabolites). Indeed, a given metabolite may be connected to different metabolic pathways and thus show correlation patterns. In other cases, the observed correlations may be due to other causes, such as global changes (e.g., diurnal variation in time-series studies) or specific changes due to the intrinsic variability of metabolomic data [54,122]. These patterns can provide valuable information about the underlying metabolic network associated with a specific biological process [54,123].
Based on the relationship patterns observed in the experimental data, correlation-based methods allow building metabolic networks in which each metabolite represents a node, and the link between each pair of nodes, called an edge, denotes the level of mathematical correlation between the corresponding metabolites. High correlation coefficients are frequent in metabolomics data owing to the presence of systemic associations [123]. Hence, using classical correlation coefficients leads to overcrowded networks in which direct and indirect associations are not distinguished. To overcome this problem, partial correlation can be used [54,123,124]. In the partial correlation approach, the correlation between two metabolites is conditioned on the correlations with the remaining metabolites. Consequently, partial correlation allows discriminating between direct and indirect metabolite correlations. Alternatively, the link between two metabolites can be scored according to the differences in the ratios of the corresponding metabolites between two sample groups; the resulting network topology then reflects the metabolic differences between the two studied phenotypes. These data-driven strategies have been successfully applied to the reconstruction of metabolic networks from metabolomics data [123,125,126].
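The distinction between marginal and partial correlation can be sketched with simulated data, computing partial correlations from the inverse covariance (precision) matrix via the standard identity r_ij|rest = -P_ij / sqrt(P_ii P_jj) (variables and structure are toy assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Simulated chain X -> Y -> Z: X and Z are only indirectly related.
x = rng.normal(size=n)
y = x + 0.5 * rng.normal(size=n)
z = y + 0.5 * rng.normal(size=n)
data = np.column_stack([x, y, z])

# Partial correlations from the precision matrix:
#   r_ij|rest = -P_ij / sqrt(P_ii * P_jj)
prec = np.linalg.inv(np.cov(data, rowvar=False))
d = np.sqrt(np.diag(prec))
partial = -prec / np.outer(d, d)
np.fill_diagonal(partial, 1.0)

marginal = np.corrcoef(x, z)[0, 1]
print(f"marginal r(x,z)  = {marginal:.2f}")      # strong, but indirect
print(f"partial r(x,z|y) = {partial[0, 2]:.2f}")  # close to zero
```

Conditioning on Y removes the indirect X–Z edge, which is exactly the pruning effect that keeps partial-correlation networks from becoming overcrowded.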
Metabolite identification is a challenging and time-consuming task. Thus, a novel approach, named Mummichog, has been proposed by Li et al. for network analysis. This method predicts biological activity directly from mass-spectrometry-based untargeted metabolomics data without a priori identification of metabolites. The idea behind this strategy is to combine network analysis and metabolite prediction within the same computational framework, significantly reducing the time of the metabolomics workflow. The method was elegantly illustrated by exploring the activation of innate immune cells: it showed that glutathione metabolism is modified by viral infection, driven by constitutive nitric oxide synthases [127].
A wide variety of software tools are available to analyze metabolomic data at the pathway and network levels. Table 3 presents different functional analysis tools for both pathway analysis and visualization.
Contextual interpretation is crucial to fully embrace the potential of metabolomics. Indeed, metabolites carry precious contextual biological information. In a metabolic network, flux is defined as the rate (i.e., quantity per unit time) at which metabolites are converted or transported between different compartments [10]. Thus, the metabolic fluxes, or fluxome, represent a unique and functional readout of the phenotype. The fluxome captures the metabolome in its ultimate functional interactions with the environment and the genome [10,142]. As such, the fluxome integrates information on different cellular processes, and hence it is a unique spatiotemporal phenotypic signature of cells. Thus, one or more metabolic fluxes could be altered in a metabolic disorder, depending on the complexity of the disease [2]. Different strategies are used to translate metabolomics data into fluxomic insights through the modeling of metabolic networks. The network modeling can be achieved using constraints of mass and charge conservation along with stoichiometric and thermodynamic ones [34,[143][144][145]. Based on the stoichiometry of the reactants and products of biochemical reactions, flux balance analysis (FBA) can estimate metabolic fluxes without knowledge of the kinetics of the participating enzymes [10,142]. Recently, Cortassa et al. suggested a new approach, distinct from FBA or metabolic flux analysis, which takes into account kinetic mechanisms and regulatory interactions [146].
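The core FBA computation is a linear program over the stoichiometric matrix; a minimal sketch on a toy two-metabolite chain (the network, bounds, and objective are invented for illustration, not a real model):

```python
import numpy as np
from scipy.optimize import linprog

# Toy network: (uptake) --v1--> A --v2--> B --v3--> (export).
# Rows are metabolites A and B; steady state imposes S @ v = 0.
S = np.array([
    [1, -1,  0],   # A: produced by v1, consumed by v2
    [0,  1, -1],   # B: produced by v2, consumed by v3
])
bounds = [(0, 10)] * 3          # assumed flux capacity limits

# Maximize the export flux v3 (linprog minimizes, hence the sign flip).
res = linprog(c=[0, 0, -1], A_eq=S, b_eq=np.zeros(2), bounds=bounds)
print(res.x)
```

At the optimum, mass balance forces all three fluxes to the uptake limit; genome-scale models solve the same kind of program with thousands of reactions and a biomass objective.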

Potential Integration of Metabolomics in Laboratory Medicine Frameworks
Metabolites embody physiological end-points and regulatory processes directly connected to the fluxome. Hence, the metabolome is very time-sensitive and constantly changing. Therefore, changes in metabolite concentrations are usually more suitable for describing the biochemical state of a biological system. Because the metabolome is the ultimate expression of the genes' influence and the proteins' use of metabolites, it offers a rich and tremendous view of the phenotype. Indeed, metabolites carry precious contextual biological information that can be used to assess pathophysiological states.
Metabolic profiling as a diagnostic tool opens an informative metabolic window into disease, which makes metabolomics an appealing ally in disease diagnosis.
What makes metabolomics a key driver in the post-genomic era is its tight relationship with the phenotype, whether the phenotype is driven by a monogenic or a multifactorial complex condition. Linking metabolic profile modulation with particular genetic variation [126] and/or environmental factors such as the microbiome [147], diet [148], toxicants [56], or therapies [34] offers an exciting opportunity to rationalize diagnostics and translate more comprehensive information into clinically actionable knowledge. The above-cited factors that influence the metabolome, and thereby the phenotype(s), remind us that assessing metabolites, as the chemical supporters of life, is the core of the knowledge building that will shape clinical decisions.
Early biochemists such as Cori, Warburg, Meyerhof, and Krebs made seminal contributions to mapping the most fundamental aspects of metabolic pathways and physiology. Even earlier, urine chemical properties guided early physicians in founding the concept of IEM [4]. Sir Garrod suggested that a biochemical fingerprint within biofluids was a product of human variation and, hence, could be a surrogate for distinct diseases. Garrod argued that the IEM he was able to observe "were merely extreme examples of variations of chemical behavior which are probably everywhere present in minor degrees" [149]. In other words, he believed that there were phenotypes that could be associated with specific biochemicals. However, given the limited technical sensitivity of his time, he was not able to confirm this idea. Recently, his hypothesis was elegantly confirmed with metabolomics approaches and metabolic modeling [126].
With the expected improvements in metabolic profiling scope and data quality, metabolomics is destined to play a major and disruptive role in the near future as an efficient screening and diagnostic tool [150]. There are mainly two ways that metabolomics could be implemented in the clinical context and laboratory medicine: chemometrics or a quantitative approach. In the former, direct statistical analysis is applied to spectral patterns and signal intensity data, and identification of metabolites may be performed in a last step if needed. This method captures metabolic snapshots and builds pattern-recognition models using machine learning techniques to sort samples (subjects) according to their metabolic patterns. This approach is eloquently embodied by the intelligent scalpel (iKnife) introduced by Takats et al. [85], which instantaneously classifies, in vivo and ex vivo, cancerous and noncancerous tissues. This compelling technology aims to help surgeons during cancer surgery [85,86]. In contrast, the quantitative approach targets a set of metabolites and then analyzes the quantitative data directly. It affords absolute quantitation of a set of chosen metabolites (e.g., amino acids, carnitines and acylcarnitines, or organic acids). A multivariate predictive model can be built based on the absolute concentrations of these metabolites to predict clinical status or intervention outcomes. Compared to quantitative metabolomics, the key advantage of chemometric profiling is its capability for automated and unbiased assessment of metabolomics data. However, it requires a large number of spectra and sample uniformity, which are less of a concern in quantitative metabolomics. Nevertheless, the multivariate data analysis strategies underlying the two approaches are quite similar. Figure 2 illustrates the two clinical workflows.
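The pattern-recognition step of the chemometric approach can be sketched with a principal component analysis on simulated spectral intensities (the data and the class effect are synthetic; real workflows add validated preprocessing and supervised, cross-validated models):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic "spectra": 10 controls and 10 patients x 50 signal bins;
# the patient class carries an added pattern on the first 5 bins.
controls = rng.normal(size=(10, 50))
patients = rng.normal(size=(10, 50))
patients[:, :5] += 3.0
X = np.vstack([controls, patients])

# Unsupervised chemometric step: PCA via SVD of the centered matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T          # scores on the first two components

# With a strong class effect, PC1 separates the two groups.
print(scores[:10, 0].mean(), scores[10:, 0].mean())
```

No metabolite identity is needed at this stage, which is exactly what distinguishes the chemometric route from the quantitative one.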

Applications of Metabolomics in Inborn Errors of Metabolism (IEM) Investigations
Since IEM are tightly connected with metabolism, the inherent pathophysiological changes are the main determinants of the metabolome and of the functional understanding of the disease. Hence, owing to its intrinsic multidisciplinary nature, integrating biochemistry, analytical chemistry, advanced statistics and bioinformatics, metabolomics represents a promising tool for achieving improved understanding and better diagnosis of IEM in the post-genomic and precision medicine era. For years, MS has been used in the assessment of inherited metabolic diseases. Several IEM, such as aminoacidopathies, organic acidurias, and fatty acid oxidation disorders, are currently diagnosed using targeted MS-based metabolomics methods [151][152][153][154][155]. Furthermore, MS is now widely implemented in IEM newborn screening national programs worldwide [156]. However, combining these existing tools with data analysis strategies is compelling for better biological information recovery.
To assess different IEM, including aminoacidopathies, organic acidurias, and mitochondrial disorders, Janeckova et al. used targeted analysis combined with multivariate data analysis. Their work showed how chemometric modeling and absolute quantification are eloquently complementary in the assessment of IEM [157]. Dercksen et al. used a similar approach to assess isovaleric aciduria (IVA) based on 86 urine samples: 10 untreated and 10 treated IVA cases, 12 heterozygotes, 22 child controls, and 32 adult controls. The work succeeded in producing a comprehensive profile of metabolites of practical significance in IVA [158]. Osterman et al. described a matrix-assisted laser desorption/ionization MS-based method for acylcarnitine and organic acid analysis on DBS. The method enabled the identification and quantification of metabolites involved in different organic acidurias and beta-oxidation deficiencies [159]. Using targeted metabolomics addressing complex lipids, Fan and colleagues showed that some sphingolipid species were elevated in Niemann-Pick Type C1 subjects. These lipid biomarkers may be used for monitoring the efficacy of specific therapies [160].
Given the increasing potential of metabolomics in IEM, different groups have published work on the usefulness of untargeted metabolomics approaches for IEM characterization, diagnosis, and biomarker discovery. For the characterization of disease biosignatures, respiratory chain deficiencies have been investigated by several research groups to track specific metabolic signatures using metabolomics [161][162][163]. Wikoff et al. used MS-based untargeted metabolomics in plasma to characterize methylmalonic acidemia and propionic aciduria. Propionylcarnitine, a known biomarker, was retrieved using the untargeted strategy, which illustrates the potential of metabolic profiling in biomarker detection. Five additional plasma acylcarnitine metabolites presented significant differences between patients and control individuals. In addition, γ-butyrobetaine was highly increased in a subset of patients. This demonstrates that metabolomics can widen the range of metabolites associated with IEM [164]. Auray-Blais et al. used MS-based untargeted metabolomics for biomarker discovery in Fabry disease, which led to the discovery of seven globotriaosylceramide (Gb3) analogues now suggested as biomarkers for the screening and follow-up of Fabry disease [61,62,165].
Shlomi et al. presented an elegant computational approach for assessing the metabolic profiles of red blood cell enzyme deficiencies. The developed predictive method yielded biomarkers for red blood cell alterations that correlated strongly with disrupted metabolite concentrations. Over 200 metabolites were identified as potential biomarkers of 176 enzyme deficiencies. Furthermore, already-known disease indicators were retrieved by the prediction method and, importantly, potential novel biomarkers were also predicted. This approach proved to dramatically increase biomarker discovery performance [166].
Because the metabolome is highly influenced by nutritional factors, diet monitoring has also been investigated using metabolomics. Phenylketonuria is an interesting example of diet monitoring in IEM. Using metabolomics, Mutze et al. showed that long-term dietary fatty acid restriction influences mitochondrial beta-oxidation intermediates, while no functional influence on unsaturated fatty acid metabolism or platelet aggregation was detected in patients with phenylketonuria [167].
Regarding the use of metabolomics platforms as diagnostic tools, several teams have proposed metabolomics workflows. Using NMR and DESI-MS methods, Pan et al. clearly discriminated six patients with IEM from six controls based on their respective urine metabolic profiles, identifying argininosuccinic aciduria, classic homocystinuria, classic methylmalonic acidemia, maple syrup urine disease, phenylketonuria, and type II tyrosinemia [168]. Later, Denes et al. proposed a high-throughput method based on high-resolution MS using DBS and direct flow injection analysis. Their method was tested on 500 controls and 66 abnormal samples and showed a clear discrimination of the various assessed metabolic diseases [51]. Ilya et al. also proposed another method based on high-resolution MS coupled to liquid chromatography for the assessment of IEM. Their method resolved highly polar as well as hydrophobic analytes under reverse-phase conditions, enabling analysis of a wide range of chemicals in an untargeted fashion. Their work provides a tailored high-resolution MS platform for IEM and covers various metabolites usually quantified by a combination of separate instruments [169]. Miller et al. described a comprehensive global strategy to assess IEM using liquid chromatography and gas chromatography MS-based metabolomics platforms, combining both targeted and untargeted analysis. In total, 120 plasma samples from patients with a confirmed IEM and 70 control samples were assessed. This strategy elegantly allowed comprehensive pathway analysis that provides useful diagnostic information for IEM [170].
Regarding NMR-based platforms, Aygen and colleagues conducted a multi-center clinical study in 14 clinical centers in Turkey. Urine samples from 989 neonates were collected and investigated using NMR spectroscopy in two different laboratories to assess reproducibility. The objectives of their study were twofold: (1) to explore metabolite variations in order to set pathological thresholds for specific metabolites, in comparison with healthy neonates, and develop predictive models; and (2) to build an NMR database from a healthy population of neonates for IEM metabolite identification [171].

Metabolite Identification
Metabolite identification is the main bottleneck limiting the large-scale adoption of metabolomics in both translational and clinical contexts. Although spectral information is increasingly available in the literature and in spectral databases, metabolite identification is still a challenging task [172]. To the best of our knowledge, no currently available software fully and smoothly facilitates the identification process, in particular the integration of NMR and MS data, which is essential for the reliable identification of metabolites. Furthermore, metabolite identification is mandatory for absolute quantitation, especially in MS-based methods requiring isotope-labeled standards. Thus, more efforts are needed to overcome this limitation of metabolomics.

Standardization and Harmonization
Standardization is a vital aspect of the widespread adoption of any new technology. Thus, for clinical metabolomics, harmonization of sample preparation, processing, analysis and reporting using validated and standardized protocols is mandatory [173,174]. This is important because biological samples change over time. The lack of harmonized protocols for sample handling, MS and NMR data generation, and data reporting can lead to poor reproducibility and, thus, to data misinterpretation, particularly in population metabolic profiling; this is a fundamental obstacle to clinical translation. A definition of normal or reference samples is also important for building reference databases. These will rely on signatures derived from the complete characterization of the considered disease (IEM in the scope of this review). Finally, addressing these standardization issues is essential for regulatory compliance, which is a prerequisite for clinical implementation and adoption.

Automation, Data Visualization and Clinical Actionability
Automation at the different stages (instrument, pre-analytic and post-analytic levels) is a very important issue for the large-scale clinical adoption of any diagnostic innovation. Metabolomics workflow automation is a key enabler of high throughput, reproducibility and reliability, which are pillars of modern laboratory medicine practice. Current efforts in this direction are promising, such as the iKnife, which would allow real-time cancer diagnosis [85], and breathomics strategies for lung and respiratory diseases based on breath signatures [72]. Data fusion and the integration of omics with other biological and clinical data is another great challenge for fully unveiling the potential of metabolomics [17,175]. In this regard, combining genomic and metabolic profiling information to enhance clinical diagnostics and to enable patient stratification and the monitoring of interventional pathways along patient journeys is a promising field [22,176]. Clinical actionability will involve advanced mathematical modeling of genomic and metabolic data sets in relation to patient clinical data using machine learning and expert systems. Intuitive tools that visualize the data in clinically accessible formats are needed to support effective clinical decision making. Figure 3 presents the main challenges in clinical metabolomics.

Conclusions
It is common to perform early diagnosis of IEM by assessing specific metabolic biomarkers related to a genetic defect. However, the original paradigm of "one gene-one enzyme-one disease" is no longer viewed as a reality for IEM. The impact of an altered protein on metabolic flux is not easily predictable. Indeed, metabolic pathways are not linear, and metabolites are tightly linked through multiple interactions within a highly organized network [21,177]. Depending on the complexity of the disease, one primary metabolite flux or an entire network of metabolite fluxes might be affected [2,20]. Therefore, a complete contextual, multilayer, network-based functional overview is needed to effectively assess all the actors of a given pathway in a holistic fashion [8]. Systemic approaches are needed to understand IEM complexity and to effectively diagnose and treat them [21]. To achieve such a goal, metabolomics is a key driver of the systems-medicine-based strategy. The great potential of metabolomics integration with other omics data will allow systems biology and clinical data to be linked. This paves the way for a paradigm shift in medical practice from cohort evidence-based medicine to algorithm-based precision medicine. This will in turn enhance clinicians' abilities to be more pre-emptive and, thus, more efficient in handling IEM.
Metabolomics is still in its infancy with regard to the investigation of IEM, and its great potential has yet to be explored worldwide at both the basic and clinical sides. Improving workflows for high-quality data acquisition, processing, and visualization is an important issue for effectively translating the biological information into actionable knowledge under clinically accessible formats for effective healthcare management. However, this innovative global approach also requires a paradigm shift in our practice at different levels. A complete change is needed in our screening and diagnosis strategies. Thus, a disruptive move from a hypothesis-driven approach to a more data-driven and hypothesis-generating approach is crucial to address the challenges of the post-genomic era. The core idea of the paradigm shift in IEM laboratory investigation is presented in Figure 4.
Furthermore, totally new investigative thinking is needed to transform all aspects of the laboratory medicine enterprise, including education, research, and healthcare. Upgrading medical practitioners' skill sets on both the clinical and laboratory sides is needed to smoothly achieve the full potential of systems medicine. These skills integrate biology, computing and data analytics to develop common communication channels for more effective medical interactions. The ongoing high digitization of individual biological and clinical information offers a tremendous and exciting opportunity to fully embrace the promising era of precision medicine.