Data Processing in Metabolomics

A special issue of Metabolites (ISSN 2218-1989). This special issue belongs to the section "Bioinformatics and Data Analysis".

Deadline for manuscript submissions: closed (31 July 2013) | Viewed by 50998

Special Issue Editor

Dr. Jan Baumbach
Institute for Computational Systems Biology, University of Hamburg, D-22607 Hamburg, Germany
Interests: bioinformatics; computational biology; systems medicine; network medicine; metabolomics; multi-omics integration

Special Issue Information

Dear Colleagues,

All living cells, throughout all domains of life, control their metabolism in accordance with internal and external conditions. The increasing sensitivity of modern high-throughput spectrometry and chromatography technologies allows us to measure a large fraction of the metabolome of cells, tissues and organs on different scales. The research area of metabolic profiling has emerged with a tremendous impact on biomedical research. The hope is to unravel the molecular decision processes underlying growth, survival, reproduction and differentiation of cells, tissues, organs and microbial colonies under varying conditions. On a larger scale, we enter the fields of personalized medicine, biomarker discovery & validation as well as therapy optimization. However, the cheaper and faster our technology becomes, the more data are generated that need to be analyzed efficiently with regard to real-world questions, such as diagnostics, prognosis and in silico modeling of cell behavior. This special issue will focus on detailing the emerging problems and novel approaches in metabolomics data analysis.

We will consider research papers with a special focus on computational methods for analyzing data from typical metabolomics technologies, such as de-noising, smoothing, peak detection, metabolic database design & data integration, disease classification, biomarker identification & validation, disease subtyping, data reduction, and metabolic profile alignment. Manuscripts on closely related topics, such as metabolic network simulation, are also welcome.

Dr. Jan Baumbach
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Metabolites is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • metabolic profiles
  • metabolic patterns
  • biomarker detection
  • biomarker validation
  • personalized medicine
  • therapy optimization
  • metabolic modeling
  • metabolic network simulation
  • data integration
  • integrated databases

Published Papers (7 papers)

Research

351 KiB  
Article
Computational Analyses of Spectral Trees from Electrospray Multi-Stage Mass Spectrometry to Aid Metabolite Identification
by Mingshu Cao, Karl Fraser and Susanne Rasmussen
Metabolites 2013, 3(4), 1036-1050; https://doi.org/10.3390/metabo3041036 - 31 Oct 2013
Cited by 79 | Viewed by 7370
Abstract
Mass spectrometry coupled with chromatography has become the major technical platform in metabolomics. Aided by peak detection algorithms, the detected signals are characterized by mass-over-charge ratio (m/z) and retention time. Chemical identities often remain elusive for the majority of the signals. Multi-stage mass spectrometry based on electrospray ionization (ESI) allows collision-induced dissociation (CID) fragmentation of selected precursor ions. These fragment ions can assist in structural inference for metabolites of low molecular weight. Computational investigations of fragmentation spectra have increasingly received attention in metabolomics and various public databases house such data. We have developed an R package “iontree” that can capture, store and analyze MS2 and MS3 mass spectral data from high throughput metabolomics experiments. The package includes functions for ion tree construction, an algorithm (distMS2) for MS2 spectral comparison, and tools for building platform-independent ion tree (MS2/MS3) libraries. We have demonstrated the utilization of the package for the systematic analysis and annotation of fragmentation spectra collected in various metabolomics platforms, including direct infusion mass spectrometry, and liquid chromatography coupled with either low resolution or high resolution mass spectrometry. Assisted by the developed computational tools, we have demonstrated that spectral trees can provide informative evidence complementary to retention time and accurate mass to aid with annotating unknown peaks. These experimental spectral trees, once subjected to a quality control process, can be used for querying public MS2 databases or de novo interpretation. The putatively annotated spectral trees can be readily incorporated into reference libraries for routine identification of metabolites.
(This article belongs to the Special Issue Data Processing in Metabolomics)

595 KiB  
Article
Counting and Correcting Thermodynamically Infeasible Flux Cycles in Genome-Scale Metabolic Networks
by Daniele De Martino, Fabrizio Capuani, Matteo Mori, Andrea De Martino and Enzo Marinari
Metabolites 2013, 3(4), 946-966; https://doi.org/10.3390/metabo3040946 - 14 Oct 2013
Cited by 30 | Viewed by 7010
Abstract
Thermodynamics constrains the flow of matter in a reaction network to occur through routes along which the Gibbs energy decreases, implying that viable steady-state flux patterns should be void of closed reaction cycles. Identifying and removing cycles in large reaction networks can unfortunately be a highly challenging task from a computational viewpoint. We propose here a method that accomplishes it by combining a relaxation algorithm and a Monte Carlo procedure to detect loops, with ad hoc rules (discussed in detail) to eliminate them. As test cases, we tackle (a) the problem of identifying infeasible cycles in the E. coli metabolic network and (b) the problem of correcting thermodynamic infeasibilities in the Flux-Balance-Analysis solutions for 15 human cell-type-specific metabolic networks. Results for (a) are compared with previous analyses of the same issue, while results for (b) are weighed against alternative methods to retrieve thermodynamically viable flux patterns based on minimizing specific global quantities. Our method, on the one hand, outperforms previous techniques and, on the other, corrects loopy solutions to Flux Balance Analysis. As a byproduct, it also turns out to be able to reveal possible inconsistencies in model reconstructions.
(This article belongs to the Special Issue Data Processing in Metabolomics)
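
To illustrate what a "closed reaction cycle" means in the abstract above, here is a minimal sketch on a hypothetical three-reaction toy network (A -> B, B -> C, C -> A). It is not the relaxation and Monte Carlo method of the paper; the matrix `S` and the helper names are assumptions made purely for illustration. A non-zero flux vector over internal reactions alone that balances every metabolite is a loop, which would require the Gibbs energy to decrease all the way around a cycle.

```python
# Toy internal stoichiometric matrix (hypothetical; rows = metabolites,
# columns = reactions R1: A -> B, R2: B -> C, R3: C -> A).
S = [
    [-1, 0, 1],   # A: consumed by R1, produced by R3
    [1, -1, 0],   # B: produced by R1, consumed by R2
    [0, 1, -1],   # C: produced by R2, consumed by R3
]


def net_production(stoich, flux):
    """Net production rate of each metabolite for a given flux vector."""
    return [sum(s * v for s, v in zip(row, flux)) for row in stoich]


def is_closed_cycle(stoich, flux, tol=1e-9):
    """True if `flux` is non-zero and balances every metabolite using
    internal reactions only, i.e. it circulates without any net exchange."""
    has_flux = any(abs(v) > tol for v in flux)
    balanced = all(abs(b) <= tol for b in net_production(stoich, flux))
    return has_flux and balanced


print(is_closed_cycle(S, [1, 1, 1]))  # flux through all three reactions
print(is_closed_cycle(S, [1, 1, 0]))  # open path A -> B -> C
```

The first vector balances all three metabolites and is therefore a loop; the second depletes A and accumulates C, so it is not a steady state of the internal network at all.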
Show Figures

1000 KiB  
Article
A Computational Framework for High-Throughput Isotopic Natural Abundance Correction of Omics-Level Ultra-High Resolution FT-MS Datasets
by William J. Carreer, Robert M. Flight and Hunter N. B. Moseley
Metabolites 2013, 3(4), 853-866; https://doi.org/10.3390/metabo3040853 - 25 Sep 2013
Cited by 46 | Viewed by 7202
Abstract
New metabolomics applications of ultra-high resolution and accuracy mass spectrometry can provide thousands of detectable isotopologues, with the number of potentially detectable isotopologues increasing exponentially with the number of stable isotopes used in newer isotope tracing methods like stable isotope-resolved metabolomics (SIRM) experiments. This huge increase in usable data requires software capable of correcting the large number of isotopologue peaks resulting from SIRM experiments in a timely manner. We describe the design of a new algorithm and software system capable of handling these high volumes of data, while including quality control methods for maintaining data quality. We validate this new algorithm against a previous single isotope correction algorithm in a two-step cross-validation. Next, we demonstrate the algorithm and correct for the effects of natural abundance for both 13C and 15N isotopes on a set of raw isotopologue intensities of UDP-N-acetyl-D-glucosamine derived from a 13C/15N-tracing experiment. Finally, we demonstrate the algorithm on a full omics-level dataset.
(This article belongs to the Special Issue Data Processing in Metabolomics)
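
As a rough illustration of why natural abundance matters in the abstract above, the following sketch computes the isotopologue pattern that natural 13C alone would produce for an unlabeled molecule, using a simple binomial model. This is only the forward effect for a single isotope, not the correction algorithm described in the paper; the function name and the approximate abundance value of 1.07% are assumptions of this example.

```python
from math import comb

# Approximate natural abundance of 13C (assumed value for this sketch).
C13_NATURAL_ABUNDANCE = 0.0107


def natural_isotopologue_fractions(n_carbons, p=C13_NATURAL_ABUNDANCE):
    """Expected fraction of molecules carrying k = 0..n_carbons 13C atoms
    when each carbon is independently 13C with probability p."""
    return [
        comb(n_carbons, k) * p**k * (1 - p) ** (n_carbons - k)
        for k in range(n_carbons + 1)
    ]


# Example: a hypothetical six-carbon metabolite.
fractions = natural_isotopologue_fractions(6)
for k, fraction in enumerate(fractions):
    print(f"M+{k}: {fraction:.6f}")
```

Even without any tracer, a noticeable share of the signal appears at M+1 and above, which is the contribution a correction step has to remove from the observed isotopologue intensities.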

434 KiB  
Article
A Novel Methodology to Estimate Metabolic Flux Distributions in Constraint-Based Models
by Francesco Alessandro Massucci, Francesc Font-Clos, Andrea De Martino and Isaac Pérez Castillo
Metabolites 2013, 3(3), 838-852; https://doi.org/10.3390/metabo3030838 - 20 Sep 2013
Cited by 25 | Viewed by 7023
Abstract
Quite generally, constraint-based metabolic flux analysis describes the space of viable flux configurations for a metabolic network as a high-dimensional polytope defined by the linear constraints that enforce the balancing of production and consumption fluxes for each chemical species in the system. In some cases, the complexity of the solution space can be reduced by performing an additional optimization, while in other cases, knowing the range of variability of fluxes over the polytope provides a sufficient characterization of the allowed configurations. There are cases, however, in which the thorough information encoded in the individual distributions of viable fluxes over the polytope is required. Obtaining such distributions is known to be a highly challenging computational task when the dimensionality of the polytope is sufficiently large, and the problem of developing cost-effective ad hoc algorithms has recently seen a major surge of interest. Here, we propose a method that allows us to perform the required computation heuristically in a time scaling linearly with the number of reactions in the network, overcoming some limitations of similar techniques employed in recent years. As a case study, we apply it to the analysis of the human red blood cell metabolic network, whose solution space can be sampled by different exact techniques, like Hit-and-Run Monte Carlo (scaling roughly like the third power of the system size). Remarkably accurate estimates for the true distributions of viable reaction fluxes are obtained, suggesting that, although further improvements are desirable, our method enhances our ability to analyze the space of allowed configurations for large biochemical reaction networks.
(This article belongs to the Special Issue Data Processing in Metabolomics)

424 KiB  
Article
Peak Detection Method Evaluation for Ion Mobility Spectrometry by Using Machine Learning Approaches
by Anne-Christin Hauschild, Dominik Kopczynski, Marianna D'Addario, Jörg Ingo Baumbach, Sven Rahmann and Jan Baumbach
Metabolites 2013, 3(2), 277-293; https://doi.org/10.3390/metabo3020277 - 16 Apr 2013
Cited by 172 | Viewed by 9243
Abstract
Ion mobility spectrometry with pre-separation by multi-capillary columns (MCC/IMS) has become an established inexpensive, non-invasive bioanalytics technology for detecting volatile organic compounds (VOCs) with various metabolomics applications in medical research. To pave the way for this technology towards daily usage in medical practice, different steps still have to be taken. With respect to modern biomarker research, one of the most important tasks is the automatic classification of patient-specific data sets into different groups, healthy or not, for instance. Although sophisticated machine learning methods exist, an inevitable preprocessing step is reliable and robust peak detection without manual intervention. In this work we evaluate four state-of-the-art approaches for automated IMS-based peak detection: local maxima search, watershed transformation with IPHEx, region-merging with VisualNow, and peak model estimation (PME). We manually generated a gold standard with the aid of a domain expert (manual) and compare the performance of the four peak calling methods with respect to two distinct criteria. We first utilize established machine learning methods and systematically study their classification performance based on the four peak detectors’ results. Second, we investigate the classification variance and robustness regarding perturbation and overfitting. Our main finding is that the power of the classification accuracy is almost equally good for all methods, the manually created gold standard as well as the four automatic peak finding methods. In addition, we note that all tools, manual and automatic, are similarly robust against perturbations. However, the classification performance is more robust against overfitting when using the PME as peak calling preprocessor. In summary, we conclude that all methods, though small differences exist, are largely reliable and enable a wide spectrum of real-world biomedical applications.
(This article belongs to the Special Issue Data Processing in Metabolomics)
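
Of the four approaches named in the abstract above, local maxima search is the simplest to sketch. The snippet below is a generic illustration on a small made-up intensity matrix, not the implementation evaluated in the paper; the function name, the threshold parameter and the toy data are all assumptions of this example.

```python
def local_maxima(matrix, threshold):
    """Return (row, col, value) for every cell that exceeds `threshold`
    and is strictly greater than all of its (up to eight) neighbours."""
    peaks = []
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    for r in range(n_rows):
        for c in range(n_cols):
            value = matrix[r][c]
            if value <= threshold:
                continue  # too weak to count as a peak candidate
            neighbours = [
                matrix[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                for cc in range(max(c - 1, 0), min(c + 2, n_cols))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks


# Made-up intensity matrix (rows and columns stand in for the two
# separation dimensions of a measurement).
spectrum = [
    [0, 1, 0, 0, 0],
    [1, 9, 2, 0, 0],
    [0, 2, 1, 0, 3],
    [0, 0, 0, 3, 7],
]
print(local_maxima(spectrum, threshold=2))  # → [(1, 1, 9), (3, 4, 7)]
```

The threshold suppresses low-intensity noise, and the strict comparison means a flat plateau yields no peak; real data would typically be de-noised and smoothed before a search like this.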

556 KiB  
Article
Knowledge Discovery in Spectral Data by Means of Complex Networks
by Massimiliano Zanin, David Papo, José Luis González Solís, Juan Carlos Martínez Espinosa, Claudio Frausto-Reyes, Pascual Palomares Anda, Ricardo Sevilla-Escoboza, Rider Jaimes-Reategui, Stefano Boccaletti, Ernestina Menasalvas and Pedro Sousa
Metabolites 2013, 3(1), 155-167; https://doi.org/10.3390/metabo3010155 - 11 Mar 2013
Cited by 15 | Viewed by 5945
Abstract
In the last decade, complex networks have widely been applied to the study of many natural and man-made systems, and to the extraction of meaningful information from the interaction structures created by genes and proteins. Nevertheless, less attention has been devoted to metabonomics, due to the lack of a natural network representation of spectral data. Here we define a technique for reconstructing networks from spectral data sets, where nodes represent spectral bins, and pairs of them are connected when their intensities follow a pattern associated with a disease. The structural analysis of the resulting network can then be used to feed standard data-mining algorithms, for instance for the classification of new (unlabeled) subjects. Furthermore, we show how the structure of the network is resilient to the presence of external additive noise, and how it can be used to extract relevant knowledge about the development of the disease.
(This article belongs to the Special Issue Data Processing in Metabolomics)

622 KiB  
Article
Validated and Predictive Processing of Gas Chromatography-Mass Spectrometry Based Metabolomics Data for Large Scale Screening Studies, Diagnostics and Metabolite Pattern Verification
by Elin Thysell, Elin Chorell, Michael B. Svensson, Pär Jonsson and Henrik Antti
Metabolites 2012, 2(4), 796-817; https://doi.org/10.3390/metabo2040796 - 31 Oct 2012
Cited by 186 | Viewed by 6566
Abstract
The suggested approach makes it feasible to screen large metabolomics data, sample sets with retained data quality or to retrieve significant metabolic information from small sample sets that can be verified over multiple studies. Hierarchical multivariate curve resolution (H-MCR), followed by orthogonal partial least squares discriminant analysis (OPLS-DA), was used for processing and classification of gas chromatography/time of flight mass spectrometry (GC/TOFMS) data characterizing human serum samples collected in a study of strenuous physical exercise. The efficiency of predictive H-MCR processing of representative sample subsets, selected by chemometric approaches, for generating high quality data was proven. Extensive model validation by means of cross-validation and external predictions verified the robustness of the extracted metabolite patterns in the data. Comparisons of extracted metabolite patterns between models emphasized the reliability of the methodology in a biological information context. Furthermore, the high predictive power in longitudinal data provided proof for the potential use in clinical diagnosis. Finally, the predictive metabolite pattern was interpreted physiologically, highlighting the biological relevance of the diagnostic pattern.
(This article belongs to the Special Issue Data Processing in Metabolomics)