Machine Learning Applications in Metabolomics Analysis

A special issue of Metabolites (ISSN 2218-1989). This special issue belongs to the section "Bioinformatics and Data Analysis".

Deadline for manuscript submissions: 30 June 2024

Special Issue Editors


Guest Editor
Department of Computer Science, Málaga University, Málaga, Spain
Interests: artificial intelligence; biomedicine; deep learning

Guest Editor
Leicester School of Pharmacy, Faculty of Health and Life Sciences, De Montfort University, Leicester, UK
Interests: chemical pathology; clinical chemistry; NMR-based metabolomics; disease diagnosis and prognostic monitoring; metabolic pathway analysis; bioinorganic chemistry; drug design; development and synthesis; artificial intelligence; machine learning; research ethics

Special Issue Information

Dear Colleagues,

Metabolomics research is gaining popularity because it enables the study of biological problems at the biochemical level and can help us to understand the induction, development and mechanisms of many diseases, complementing information from other ‘omics technologies. Like other high-throughput biological technologies, metabolomics can produce large volumes of data. Machine learning strategies can therefore facilitate its application through the discovery of new biomolecular signatures, which in turn support the diagnosis and prognostic monitoring of diseases, including rare metabolic disorders.

This Special Issue aims to attract publications focused on the application of machine learning techniques to the analysis of multidimensional metabolomics data, including the development of methods, data augmentation procedures, preprocessing techniques, comparisons of different methods, the interpretability of results, and the identification of new signatures.

Prof. Dr. Leonardo Franco
Prof. Dr. Martin Grootveld
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Metabolites is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • metabolites
  • biofluids
  • artificial intelligence
  • metabolomics
  • machine learning

Published Papers (4 papers)


Research

15 pages, 1929 KiB  
Article
Diagnostics of Thyroid Cancer Using Machine Learning and Metabolomics
by Alyssa Kuang, Valentina L. Kouznetsova, Santosh Kesari and Igor F. Tsigelny
Metabolites 2024, 14(1), 11; https://doi.org/10.3390/metabo14010011 - 22 Dec 2023
Abstract
The objective of this research is, with the analysis of existing data of thyroid cancer (TC) metabolites, to develop a machine-learning model that can diagnose TC using metabolite biomarkers. Through data mining, pathway analysis, and machine learning (ML), the model was developed. We identified seven metabolic pathways related to TC: Pyrimidine metabolism, Tyrosine metabolism, Glycine, serine, and threonine metabolism, Pantothenate and CoA biosynthesis, Arginine biosynthesis, Phenylalanine metabolism, and Phenylalanine, tyrosine, and tryptophan biosynthesis. The ML classifications’ accuracies were confirmed through 10-fold cross validation, and the most accurate classification was 87.30%. The metabolic pathways identified in relation to TC and the changes within such pathways can contribute to more pattern recognition for diagnostics of TC patients and assistance with TC screening. With independent testing, the model’s accuracy for other unique TC metabolites was 92.31%. The results also point to a possibility for the development of using ML methods for TC diagnostics and further applications of ML in general cancer-related metabolite analysis.
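The 10-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `k_fold_indices` and `cross_validate`, the shuffling seed, and the `fit_and_score` callback are all hypothetical, introduced only to show how samples are partitioned into ten folds and how the per-fold scores are averaged.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, fit_and_score, k=10, seed=0):
    """Average a score over k folds.

    fit_and_score(train_idx, test_idx) is a hypothetical callback that
    trains on train_idx and returns a score measured on test_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except fold i is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(fit_and_score(train_idx, test_idx))
    return sum(scores) / len(scores)
```

Each sample is used for testing exactly once, so the averaged score reflects performance on data the model did not see during training in that round.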
(This article belongs to the Special Issue Machine Learning Applications in Metabolomics Analysis)

16 pages, 2267 KiB  
Article
Prediction of Clinical Remission with Adalimumab Therapy in Patients with Ulcerative Colitis by Fourier Transform–Infrared Spectroscopy Coupled with Machine Learning Algorithms
by Seok-Young Kim, Seung Yong Shin, Maham Saeed, Ji Eun Ryu, Jung-Seop Kim, Junyoung Ahn, Youngmi Jung, Jung Min Moon, Chang Hwan Choi and Hyung-Kyoon Choi
Metabolites 2024, 14(1), 2; https://doi.org/10.3390/metabo14010002 - 19 Dec 2023
Abstract
We aimed to develop prediction models for clinical remission associated with adalimumab treatment in patients with ulcerative colitis (UC) using Fourier transform–infrared (FT–IR) spectroscopy coupled with machine learning (ML) algorithms. This prospective, observational, multicenter study enrolled 62 UC patients and 30 healthy controls. The patients were treated with adalimumab for 56 weeks, and clinical remission was evaluated using the Mayo score. Baseline fecal samples were collected and analyzed using FT–IR spectroscopy. Various data preprocessing methods were applied, and prediction models were established by 10-fold cross-validation using various ML methods. Orthogonal partial least squares–discriminant analysis (OPLS–DA) showed a clear separation of healthy controls and UC patients, applying area normalization and Pareto scaling. OPLS–DA models predicting short- and long-term remission (8 and 56 weeks) yielded area-under-the-curve values of 0.76 and 0.75, respectively. Logistic regression and a nonlinear support vector machine were selected as the best prediction models for short- and long-term remission, respectively (accuracy of 0.99). In external validation, prediction models for short-term (logistic regression) and long-term (decision tree) remission performed well, with accuracy values of 0.73 and 0.82, respectively. This was the first study to develop prediction models for clinical remission associated with adalimumab treatment in UC patients by fecal analysis using FT–IR spectroscopy coupled with ML algorithms. Logistic regression, nonlinear support vector machines, and decision tree were suggested as the optimal prediction models for remission, and these were noninvasive, simple, inexpensive, and fast analyses that could be applied to personalized treatments.
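The two preprocessing steps named in this abstract, area normalization and Pareto scaling, can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: area normalization is taken here to mean dividing each spectrum by the sum of its intensities, Pareto scaling uses the sample standard deviation (n - 1 denominator), and the function names are hypothetical.

```python
import math


def area_normalize(spectrum):
    """Divide each intensity by the total so the spectrum sums to 1.

    Assumption: 'area' is approximated by the plain sum of intensities.
    """
    total = sum(spectrum)
    return [x / total for x in spectrum]


def pareto_scale(matrix):
    """Mean-center each column and divide by the square root of its SD.

    matrix is a list of rows (samples), each a list of variable values.
    Constant columns are only centered, to avoid division by zero.
    """
    n = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        # Sample standard deviation (assumed; n - 1 in the denominator).
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        denom = math.sqrt(sd) if sd > 0 else 1.0
        scaled_cols.append([(x - mean) / denom for x in col])
    # Transpose back to rows of samples.
    return [list(row) for row in zip(*scaled_cols)]
```

Compared with unit-variance scaling, dividing by the square root of the standard deviation reduces the dominance of high-intensity variables while keeping more of the original data structure.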

24 pages, 3968 KiB  
Article
Benchmark Dataset for Training Machine Learning Models to Predict the Pathway Involvement of Metabolites
by Erik D. Huckvale, Christian D. Powell, Huan Jin and Hunter N. B. Moseley
Metabolites 2023, 13(11), 1120; https://doi.org/10.3390/metabo13111120 - 01 Nov 2023
Abstract
Metabolic pathways are a human-defined grouping of life sustaining biochemical reactions, metabolites being both the reactants and products of these reactions. But many public datasets include identified metabolites whose pathway involvement is unknown, hindering metabolic interpretation. To address these shortcomings, various machine learning models, including those trained on data from the Kyoto Encyclopedia of Genes and Genomes (KEGG), have been developed to predict the pathway involvement of metabolites based on their chemical descriptions; however, these prior models are based on old metabolite KEGG-based datasets, including one benchmark dataset that is invalid due to the presence of over 1500 duplicate entries. Therefore, we have developed a new benchmark dataset derived from the KEGG following optimal standards of scientific computational reproducibility and including all source code needed to update the benchmark dataset as KEGG changes. We have used this new benchmark dataset with our atom coloring methodology to develop and compare the performance of Random Forest, XGBoost, and multilayer perceptron with autoencoder models generated from our new benchmark dataset. Best overall weighted average performance across 1000 unique folds was an F1 score of 0.8180 and a Matthews correlation coefficient of 0.7933, which was provided by XGBoost binary classification models for 11 KEGG-defined pathway categories.
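The two metrics reported in this abstract, the F1 score and the Matthews correlation coefficient (MCC), can both be computed from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the function names are hypothetical, returning 0.0 for an undefined ratio is a convention assumed here, and the weighted averaging across pathway categories described in the abstract is not reproduced.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); 0.0 when the ratio is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal total is zero (assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Unlike F1, the MCC also accounts for true negatives, which makes it a useful companion metric when pathway classes are imbalanced.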

27 pages, 1512 KiB  
Article
Urinary Metabolic Distinction of Niemann–Pick Class 1 Disease through the Use of Subgroup Discovery
by Cristóbal J. Carmona, Manuel German-Morales, David Elizondo, Victor Ruiz-Rodado and Martin Grootveld
Metabolites 2023, 13(10), 1079; https://doi.org/10.3390/metabo13101079 - 13 Oct 2023
Abstract
In this investigation, we outline the applications of a data mining technique known as Subgroup Discovery (SD) to the analysis of a sample size-limited metabolomics-based dataset. The SD technique utilized a supervised learning strategy, which lies midway between classificational and descriptive criteria, in which given the descriptive property of a dataset (i.e., the response target variable of interest), the primary objective was to discover subgroups with behaviours that are distinguishable from those of the complete set (albeit with a differential statistical distribution). These approaches have, for the first time, been successfully employed for the analysis of aromatic metabolite patterns within an NMR-based urinary dataset collected from a small cohort of patients with the lysosomal storage disorder Niemann–Pick class 1 (NPC1) disease (n = 12) and utilized to distinguish these from a larger number of heterozygous (parental) control participants. These subgroup discovery strategies discovered two different NPC1 disease-specific metabolically sequential rules which permitted the reliable identification of NPC1 patients; the first of these involved ‘normal’ (intermediate) urinary concentrations of xanthurenate, 4-aminobenzoate, hippurate and quinaldate, and disease-downregulated levels of nicotinate and trigonelline, whereas the second comprised ‘normal’ 4-aminobenzoate, indoxyl sulphate, hippurate, 3-methylhistidine and quinaldate concentrations, and again downregulated nicotinate and trigonelline levels. Correspondingly, a series of five subgroup rules were generated for the heterozygous carrier control group, and ‘biomarkers’ featured in these included low histidine, 1-methylnicotinamide and 4-aminobenzoate concentrations, together with ‘normal’ levels of hippurate, hypoxanthine, quinolinate and hypoxanthine. 
These significant disease group-specific rules were consistent with imbalances in the combined tryptophan–nicotinamide, tryptophan, kynurenine and tyrosine metabolic pathways, along with dysregulations in those featuring histidine, 3-methylhistidine and 4-hydroxybenzoate. In principle, the novel subgroup discovery approach employed here should also be readily applicable to solving metabolomics-type problems of this nature which feature rare disease classification groupings with only limited patient participant and sample sizes available.
