Double-Weighted Bayesian Model Combination for Metabolomics Data Description and Prediction
Abstract
:1. Introduction
2. Materials and Methods
2.1. Data Collection
2.2. Data Pretreatment
2.3. Ensemble Machine Learning
2.4. Hyperparameter Optimization and Feature Selection
3. Results
3.1. Study 1: NMR-Based Metabolic Profiling for Diagnostic and Prognostic Purposes in Critically Ill Children (Grauslys et al. [12])
3.2. Study 2: Diagnostic Metabolite Biomarkers of Chronic Typhoid Carriage (Näsström et al. [13])
3.3. Study 3: Profiling the Metabolome of Uterine Fluid for Early Detection of Ovarian Cancer (Pan Wang et al. [14])
4. Discussion
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Bruggeman, F.J.; Westerhoff, H.V. The Nature of Systems Biology. Trends Microbiol. 2007, 15, 45–50. [Google Scholar] [CrossRef] [PubMed]
- Nielsen, J. Systems Biology of Metabolism. Annu. Rev. Biochem. 2017, 86, 245–275. [Google Scholar] [CrossRef]
- Kitano, H. Systems Biology: A Brief Overview. Science 2002, 295, 1662–1664. [Google Scholar] [CrossRef] [PubMed]
- Moghadam, A.; Foroozan, E.; Tahmasebi, A.; Taghizadeh, M.S.; Bolhassani, M.; Jafari, M. System Network Analysis of Rosmarinus Officinalis Transcriptome and Metabolome—Key Genes in Biosynthesis of Secondary Metabolites. PLoS ONE 2023, 18, e0282316. [Google Scholar] [CrossRef]
- Johnson, C.H.; Ivanisevic, J.; Siuzdak, G. Metabolomics: Beyond Biomarkers and towards Mechanisms. Nat. Rev. Mol. Cell Biol. 2016, 17, 451–459. [Google Scholar] [CrossRef]
- Troisi, J.; EBSCOhost (Eds.) Metabolomics Perspectives: From Theory to Practical Application; Academic Press: London, UK, 2022; ISBN 978-0-323-85062-9. [Google Scholar]
- Jacob, M.; Lopata, A.L.; Dasouki, M.; Abdel Rahman, A.M. Metabolomics toward Personalized Medicine. Mass Spectrom. Rev. 2019, 38, 221–238. [Google Scholar] [CrossRef]
- Troisi, J.; Richards, S.M.; Scala, G.; Landolfi, A. Chapter 7—Approaches in Untargeted Metabolomics. In Metabolomics Perspectives; Troisi, J., Ed.; Academic Press: Cambridge, MA, USA, 2022; pp. 237–262. ISBN 978-0-323-85062-9. [Google Scholar]
- Szymańska, E.; Saccenti, E.; Smilde, A.K.; Westerhuis, J.A. Double-Check: Validation of Diagnostic Statistics for PLS-DA Models in Metabolomics Studies. Metabolomics 2012, 8, 3–16. [Google Scholar] [CrossRef] [PubMed]
- Gromski, P.S.; Muhamadali, H.; Ellis, D.I.; Xu, Y.; Correa, E.; Turner, M.L.; Goodacre, R. A Tutorial Review: Metabolomics and Partial Least Squares-Discriminant Analysis—A Marriage of Convenience or a Shotgun Wedding. Anal. Chim. Acta 2015, 879, 10–23. [Google Scholar] [CrossRef]
- Habehh, H.; Gohel, S. Machine Learning in Healthcare. Curr. Genom. 2021, 22, 291–300. [Google Scholar] [CrossRef]
- Grauslys, A.; Phelan, M.M.; Broughton, C.; Baines, P.B.; Jennings, R.; Siner, S.; Paulus, S.C.; Carrol, E.D. NMR-Based Metabolic Profiling Provides Diagnostic and Prognostic Information in Critically Ill Children with Suspected Infection. Sci. Rep. 2020, 10, 20198. [Google Scholar] [CrossRef]
- Näsström, E.; Jonsson, P.; Johansson, A.; Dongol, S.; Karkey, A.; Basnyat, B.; Tran Vu Thieu, N.; Trinh Van, T.; Thwaites, G.E.; Antti, H.; et al. Diagnostic Metabolite Biomarkers of Chronic Typhoid Carriage. PLoS Negl. Trop. Dis. 2018, 12, e0006215. [Google Scholar] [CrossRef] [PubMed]
- Wang, P.; Ma, J.; Li, W.; Wang, Q.; Xiao, Y.; Jiang, Y.; Gu, X.; Wu, Y.; Dong, S.; Guo, H.; et al. Profiling the Metabolome of Uterine Fluid for Early Detection of Ovarian Cancer. Cell Rep. Med. 2023, 4, 101061. [Google Scholar] [CrossRef]
- Yi, X.; Xu, Y.; Hu, Q.; Krishnamoorthy, S.; Li, W.; Tang, Z. ASN-SMOTE: A Synthetic Minority Oversampling Method with Adaptive Qualified Synthesizer Selection. Complex. Intell. Syst. 2022, 8, 2247–2272. [Google Scholar] [CrossRef]
- Fluss, R.; Faraggi, D.; Reiser, B. Estimation of the Youden Index and Its Associated Cutoff Point. Biom. J. J. Math. Methods Biosci. 2005, 47, 458–472. [Google Scholar] [CrossRef]
- Kennedy, A.D.; Wittmann, B.M.; Evans, A.M.; Miller, L.A.D.; Toal, D.R.; Lonergan, S.; Elsea, S.H.; Pappan, K.L. Metabolomics in the Clinic: A Review of the Shared and Unique Features of Untargeted Metabolomics for Clinical Research and Clinical Testing. J. Mass Spectrom. 2018, 53, 1143–1154. [Google Scholar] [CrossRef] [PubMed]
- Kaur, S.; Singla, J.; Nkenyereye, L.; Jha, S.; Prashar, D.; Joshi, G.P.; El-Sappagh, S.; Islam, M.S.; Islam, S.M.R. Medical Diagnostic Systems Using Artificial Intelligence (AI) Algorithms: Principles and Perspectives. IEEE Access 2020, 8, 228049–228069. [Google Scholar] [CrossRef]
- Fridman, L.; Ding, L.; Jenik, B.; Reimer, B. Arguing Machines: Human Supervision of Black Box AI Systems That Make Life-Critical Decisions. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 16–17 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1335–1343. [Google Scholar]
- Asakura, T.; Date, Y.; Kikuchi, J. Application of Ensemble Deep Neural Network to Metabolomics Studies. Anal. Chim. Acta 2018, 1037, 230–236. [Google Scholar] [CrossRef]
- Edeh, M.O.; Dalal, S.; Dhaou, I.B.; Agubosim, C.C.; Umoke, C.C.; Richard-Nnabu, N.E.; Dahiya, N. Artificial Intelligence-Based Ensemble Learning Model for Prediction of Hepatitis C Disease. Front. Public. Health 2022, 10, 892371. [Google Scholar] [CrossRef] [PubMed]
- Mahajan, P.; Uddin, S.; Hajati, F.; Moni, M.A. Ensemble Learning for Disease Prediction: A Review. Healthcare 2023, 11, 1808. [Google Scholar] [CrossRef]
- ShahrjooiHaghighi, A.; Frigui, H.; Zhang, X.; Wei, X.; Shi, B.; McClain, C.J. Ensemble Feature Selection for Biomarker Discovery in Mass Spectrometry-Based Metabolomics. In Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, Limassol, Cyprus, 8–12 April 2019; ACM: New York, NY, USA, 2019; pp. 19–24. [Google Scholar]
- Shahrjooihaghighi, A.; Frigui, H.; Zhang, X.; Wei, X.; Shi, B.; Trabelsi, A. An Ensemble Feature Selection Method for Biomarker Discovery. In Proceedings of the 2017 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), Bilbao, Spain, 18–20 December 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 416–421. [Google Scholar]
- Han, S.; Huang, J.; Foppiano, F.; Prehn, C.; Adamski, J.; Suhre, K.; Li, Y.; Matullo, G.; Schliess, F.; Gieger, C.; et al. TIGER: Technical Variation Elimination for Metabolomics Data Using Ensemble Learning Architecture. Brief. Bioinform. 2022, 23, bbab535. [Google Scholar] [CrossRef]
- Netzer, M.; Hanser, F.; Breit, M.; Weinberger, K.M.; Baumgartner, C.; Baumgarten, D. Ensemble Based Approach for Time Series Classification in Metabolomics. Stud. Health Technol. Inf. 2019, 260, 89–96. [Google Scholar]
- Cao, Y.; Geddes, T.A.; Yang, J.Y.H.; Yang, P. Ensemble Deep Learning in Bioinformatics. Nat. Mach. Intell. 2020, 2, 500–508. [Google Scholar] [CrossRef]
- Yang, P.; Yee, H.Y.; Zhou, B.; Zomaya, A.Y. A Review of Ensemble Methods in Bioinformatics. Curr. Bioinform. 2010, 5, 296–308. [Google Scholar] [CrossRef]
- Troisi, J.; Colucci, A.; Cavallo, P.; Richards, S.; Symes, S.; Landolfi, A.; Scala, G.; Maiorino, F.; Califano, A.; Fabiano, M.; et al. A Serum Metabolomic Signature for the Detection and Grading of Bladder Cancer. Appl. Sci. 2021, 11, 2835. [Google Scholar] [CrossRef]
- Troisi, J.; Tafuro, M.; Lombardi, M.; Scala, G.; Richards, S.M.; Symes, S.J.K.; Ascierto, P.A.; Delrio, P.; Tatangelo, F.; Buonerba, C.; et al. A Metabolomics-Based Screening Proposal for Colorectal Cancer. Metabolites 2022, 12, 110. [Google Scholar] [CrossRef]
- Troisi, J.; Raffone, A.; Travaglino, A.; Belli, G.; Belli, C.; Anand, S.; Giugliano, L.; Cavallo, P.; Scala, G.; Symes, S.; et al. Development and Validation of a Serum Metabolomic Signature for Endometrial Cancer Screening in Postmenopausal Women. JAMA Netw. Open 2020, 3, e2018327. [Google Scholar] [CrossRef]
- Troisi, J.; Sarno, L.; Landolfi, A.; Scala, G.; Martinelli, P.; Venturella, R.; Di Cello, A.; Zullo, F.; Guida, M. Metabolomic Signature of Endometrial Cancer. J. Proteome Res. 2018, 17, 804–812. [Google Scholar] [CrossRef]
- Troisi, J.; Mollo, A.; Lombardi, M.; Scala, G.; Richards, S.M.; Symes, S.J.K.; Travaglino, A.; Neola, D.; de Laurentiis, U.; Insabato, L.; et al. The Metabolomic Approach for the Screening of Endometrial Cancer: Validation from a Large Cohort of Women Scheduled for Gynecological Surgery. Biomolecules 2022, 12, 1229. [Google Scholar] [CrossRef]
- Troisi, J.; Cavallo, P.; Richards, S.; Symes, S.; Colucci, A.; Sarno, L.; Landolfi, A.; Scala, G.; Adair, D.; Ciccone, C. Noninvasive Screening for Congenital Heart Defects Using a Serum Metabolomics Approach. Prenat. Diagn. 2021, 41, 743–753. [Google Scholar] [CrossRef]
- Troisi, J.; Landolfi, A.; Sarno, L.; Richards, S.; Symes, S.; Adair, D.; Ciccone, C.; Scala, G.; Martinelli, P.; Guida, M. A Metabolomics-Based Approach for Non-Invasive Screening of Fetal Central Nervous System Anomalies. Metabolomics 2018, 14, 77. [Google Scholar] [CrossRef]
- Troisi, J.; Sarno, L.; Martinelli, P.; Di Carlo, C.; Landolfi, A.; Scala, G.; Rinaldi, M.; D’Alessandro, P.; Ciccone, C.; Guida, M. A Metabolomics-Based Approach for Non-Invasive Diagnosis of Chromosomal Anomalies. Metabolomics 2017, 13, 140. [Google Scholar] [CrossRef]
- Troisi, J.; Lombardi, M.; Scala, G.; Cavallo, P.; Tayler, R.S.; Symes, S.J.K.; Richards, S.M.; Adair, D.C.; Fasano, A.; McCowan, L.M.; et al. A Screening Test Proposal for Congenital Defects Based on Maternal Serum Metabolomics Profile. Am. J. Obstet. Gynecol. 2022, 228, 342.e1–342.e12. [Google Scholar] [CrossRef] [PubMed]
Configuration | Ensemble | NB | GLM | LR | FLM | DL | DT | RF | GBT | SVM | |
---|---|---|---|---|---|---|---|---|---|---|---|
Accuracy CB Model | FO | 92.9% | - | 96.7% | - | 73.3% | 83.3% | 70.0% | 60.0% | 86.7% | - |
No FS | 90.5% | - | 93.3% | 60.0% | 80.0% | 80.0% | - | 63.3% | 60.0% | 80.0% | |
No Opt | 88.1% | - | 83.3% | - | 73.3% | 83.3% | - | - | 76.7% | 73.3% | |
No FS and Opt | 95.2% | - | 93.3% | 60.0% | 70.0% | 80.0% | 63.3% | 90.0% | 73.3% | 90.0% | |
Accuracy CV Model | FO | 88.1% | 73.3% | 86.7% | - | 86.7% | 76.7% | 70.0% | 63.3% | 60.0% | 73.3% |
No FS | 81% | 70% | 73.3% | - | 73.3% | 86.7% | 70.0% | 73.3% | 63.3% | 73.3% | |
No Opt | 85.7% | 73.3% | 73.3% | - | 66.7% | 76.7% | 73.3% | 80.0% | 70.0% | - | |
No FS and Opt | 83.3% | 70% | 73.3% | - | 76.7% | 86.7% | 73.3% | 70.0% | 63.3% | 80.0% | |
Accuracy CB-CV Model | FO | 92.5% | 86.7% | 78.7% | - | 96.7% | 79.3% | 72.7% | 72.7% | 78.7% | 86.0% |
No FS | 90% | 68% | 93.3% | - | 90.0% | 90.0% | 83.3% | 68.7% | 78.7% | 82.7% | |
No Opt | 90% | 86.7% | 86.7% | - | 90.0% | 83.3% | - | - | 79.3% | 79.3% | |
No FS and Opt | 92.5% | 68% | 93.3% | - | 86.7% | 90.0% | 76.7% | 75.3% | 78.7% | 93.3% |
Configuration | Ensemble | NB | GLM | LR | FLM | DL | DT | RF | GBT | SVM | |
---|---|---|---|---|---|---|---|---|---|---|---|
Accuracy ST Model | FO | 100.0% | 90.0% | 96.7% | - | 93.3% | 89.3% | 90.0% | 100.0% | 93.3% | 100.0% |
No FS | 100.0% | 100.0% | 100.0% | 96.7% | 100.0% | 96.7% | 93.3% | 96.7% | 90.0% | 100.0% | |
No Opt | 97.5% | 90.0% | 96.7% | 89.3% | - | 100.0% | 93.3% | 93.3% | 90.0% | - | |
No FS and Opt | 97.6% | 100.0% | 100.0% | 96.7% | 90.0% | 96.7% | 93.3% | 93.3% | 90.0% | - | |
Accuracy PT Model | FO | 92.9% | - | 96.7% | - | 73.3% | 83.3% | 70.0% | 60.0% | 86.7% | - |
No FS | 90.5% | - | 93.3% | 60.0% | 80.0% | 80.0% | - | 63.3% | 60.0% | 80.0% | |
No Opt | 88.1% | - | 83.3% | - | 73.3% | 83.3% | - | - | 76.7% | 73.3% | |
No FS and Opt | 95.2% | - | 93.3% | 60.0% | 70.0% | 80.0% | 63.3% | 90.0% | 73.3% | 90.0% | |
Accuracy ST-PT Model | FO | 88.1% | 73.3% | 86.7% | - | 86.7% | 76.7% | 70.0% | 63.3% | 60.0% | 73.3% |
No FS | 81% | 70% | 73.3% | - | 73.3% | 86.7% | 70.0% | 73.3% | 63.3% | 73.3% | |
No Opt | 85.7% | 73.3% | 73.3% | - | 66.7% | 76.7% | 73.3% | 80.0% | 70.0% | - | |
No FS and Opt | 83.3% | 70% | 73.3% | - | 76.7% | 86.7% | 73.3% | 70.0% | 63.3% | 80.0% | |
Accuracy Salmonella Model | FO | 92.5% | 86.7% | 78.7% | - | 96.7% | 79.3% | 72.7% | 72.7% | 78.7% | 86.0% |
No FS | 90% | 68% | 93.3% | - | 90.0% | 90.0% | 83.3% | 68.7% | 78.7% | 82.7% | |
No Opt | 90% | 86.7% | 86.7% | - | 90.0% | 83.3% | - | - | 79.3% | 79.3% | |
No FS and Opt | 92.5% | 68% | 93.3% | - | 86.7% | 90.0% | 76.7% | 75.3% | 78.7% | 93.3% |
Configuration | Ensemble | NB | GLM | LR | FLM | DL | DT | RF | GBT | SVM | |
---|---|---|---|---|---|---|---|---|---|---|---|
Accuracy OC Model | FO | 95.0% | 93.3% | 96.0% | - | 90.0% | 88.7% | 72.7% | 96.7% | 96.7% | 96.7% |
No FS | 95.0% | 75.3% | 93.3% | 90.0% | 89.3% | 96.7% | 81.3% | 90.0% | 85.3% | 96.7% | |
No Opt | 92.5% | 75.3% | 96.0% | - | 86.0% | 92.7% | 88.7% | 86.0% | 68.7% | 92.7% | |
No FS & Opt | 95.0% | 75.3% | 93.3% | 90.0% | 89.3% | 96.7% | 84.7% | 96.7% | 72.0% | 96.7% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Troisi, J.; Lombardi, M.; Trotta, A.; Abenante, V.; Ingenito, A.; Palmieri, N.; Richards, S.M.; Symes, S.J.K.; Cavallo, P. Double-Weighted Bayesian Model Combination for Metabolomics Data Description and Prediction. Metabolites 2025, 15, 214. https://doi.org/10.3390/metabo15040214
Troisi J, Lombardi M, Trotta A, Abenante V, Ingenito A, Palmieri N, Richards SM, Symes SJK, Cavallo P. Double-Weighted Bayesian Model Combination for Metabolomics Data Description and Prediction. Metabolites. 2025; 15(4):214. https://doi.org/10.3390/metabo15040214
Chicago/Turabian StyleTroisi, Jacopo, Martina Lombardi, Alessio Trotta, Vera Abenante, Andrea Ingenito, Nicole Palmieri, Sean M. Richards, Steven J. K. Symes, and Pierpaolo Cavallo. 2025. "Double-Weighted Bayesian Model Combination for Metabolomics Data Description and Prediction" Metabolites 15, no. 4: 214. https://doi.org/10.3390/metabo15040214
APA StyleTroisi, J., Lombardi, M., Trotta, A., Abenante, V., Ingenito, A., Palmieri, N., Richards, S. M., Symes, S. J. K., & Cavallo, P. (2025). Double-Weighted Bayesian Model Combination for Metabolomics Data Description and Prediction. Metabolites, 15(4), 214. https://doi.org/10.3390/metabo15040214