Integration of Meta-Multi-Omics Data Using Probabilistic Graphs and External Knowledge
Abstract
1. Introduction
2. Materials and Methods
2.1. Strains, Media, and Growth Conditions
2.2. Sample Collection
2.3. Transcriptomics
2.4. Metabolomics
2.5. Proteomics
2.6. Meta-Multi-Omics Network Construction
3. Results
3.1. Transcriptomics
3.2. Metabolomics
3.3. Proteomics
3.4. Combined Multi-Omics Analysis
3.5. Comparison with Other Methods
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Gene ID/Organism | log₂ FC | p_adj | Gene Description |
---|---|---|---|
12225/LK1 | 5.01 | 3.64 × 10⁻² | IS30 family transposase |
02047/LK2 | 3.45 | 6.80 × 10⁻¹¹⁴ | hypothetical protein |
01570/LK2 | 3.39 | 4.03 × 10⁻¹³³ | hypothetical protein |
01848/LK2 | 3.26 | 2.12 × 10⁻²⁴⁵ | Aldo/keto reductase |
02423/LK2 | 3.22 | 2.52 × 10⁻¹⁴⁶ | hypothetical protein |
02046/LK2 | 3.05 | 8.80 × 10⁻⁸² | Resolvase, N terminal domain |
01913/LK2 | 3.04 | 7.70 × 10⁻¹¹² | antitoxin YefM |
01912/LK2 | 3.03 | 4.81 × 10⁻⁶⁷ | toxin YoeB |
02422/LK2 | 3.02 | 5.27 × 10⁻¹⁴⁸ | DNA-damage-inducible protein J |
12115/LK1 | 3.02 | 5.56 × 10⁻¹⁴⁸ | damage-inducible protein J |
03195/LK1 | −5.68 | 2.39 × 10⁻²⁸⁵ | peptide ABC transporter substrate-binding protein |
10965/LK1 | −4.94 | 2.38 × 10⁻⁷² | 2-dehydropantoate 2-reductase |
07015/LK1 | −4.88 | 6.54 × 10⁻²²² | acetylornithine transaminase |
05460/LK1 | −4.85 | 1.88 × 10⁻²⁵⁸ | MFS transporter |
05315/LK1 | −4.85 | 4.06 × 10⁻¹⁶² | ABC transporter ATP-binding protein |
01085/LK1 | −4.83 | 9.68 × 10⁻⁷⁸ | SUF system NifU family Fe-S cluster assembly protein |
07020/LK1 | −4.81 | 1.90 × 10⁻¹⁴⁵ | acetylglutamate kinase |
07025/LK1 | −4.79 | 3.71 × 10⁻¹⁷² | bifunctional ornithine acetyltransferase/N-acetylglutamate synthase |
05320/LK1 | −4.75 | 5.02 × 10⁻¹⁵⁶ | ABC transporter ATP-binding protein |
01060/LK1 | −4.75 | 3.16 × 10⁻¹⁶⁴ | glutamate synthase |
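
The log₂ FC column in the table above is on a logarithmic scale, so a value of 5.01 corresponds to roughly a 32-fold ratio and a value of −5.68 to roughly a 51-fold reduction. The following is a minimal sketch of that conversion, not the authors' pipeline. The three rows are copied from the table; the function names and the cutoffs (`min_abs_log2_fc`, `alpha`) are hypothetical and chosen only for illustration, and the table itself does not say which condition is the numerator of the ratio.

```python
# Illustrative sketch only. Rows are copied from the table above;
# function names and cutoffs are assumptions, not taken from the article.
ROWS = [
    # (gene ID/organism, log2 fold change, adjusted p-value)
    ("12225/LK1", 5.01, 3.64e-02),
    ("01913/LK2", 3.04, 7.70e-112),
    ("03195/LK1", -5.68, 2.39e-285),
]


def fold_ratio(log2_fc: float) -> float:
    """Convert a log2 fold change to a linear ratio (2 ** log2_fc)."""
    return 2.0 ** log2_fc


def describe(log2_fc: float, p_adj: float,
             min_abs_log2_fc: float = 1.0, alpha: float = 0.05) -> str:
    """Label a gene using hypothetical effect-size and significance cutoffs."""
    if p_adj >= alpha or abs(log2_fc) < min_abs_log2_fc:
        return "not called"
    return "higher" if log2_fc > 0 else "lower"


for gene, log2_fc, p_adj in ROWS:
    print(f"{gene}: ratio = {fold_ratio(log2_fc):.4g}, {describe(log2_fc, p_adj)}")
```

A negative log₂ FC yields a ratio below 1, so the size of a reduction is read as the reciprocal of the ratio.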
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Can, H.; Chanumolu, S.K.; Nielsen, B.D.; Alvarez, S.; Naldrett, M.J.; Ünlü, G.; Otu, H.H. Integration of Meta-Multi-Omics Data Using Probabilistic Graphs and External Knowledge. Cells 2023, 12, 1998. https://doi.org/10.3390/cells12151998