Evaluating Directed Acyclic Graphs with DAGMetrics: Insights from Tuber and Soil Microbiome Data
Abstract
1. Introduction
2. Materials and Methods
2.1. DAGs, Bayesian Networks, CPDAGs, and Markov Blankets
2.2. Structure Learning and Evaluation
2.2.1. Descriptive Metrics
- Number of Edges: Measures connectivity and complexity of a DAG.
- Number of Colliders: Counts the colliders in the graph, i.e., patterns in which two directed edges point into the same node (A → C ← B). In the case of CPDAGs, only edges involved in colliders are typically directed, as they represent causal relationships consistent across all equivalent DAGs. This metric serves as an indicator of the graph’s complexity and the presence of potential causal relationships inferred by the structure learning algorithm.
- Number of Root Nodes: Nodes with no incoming edges.
- Number of Leaf Nodes: Nodes with no outgoing edges.
- Number of Isolated Nodes: Nodes with no connections.
- Degree Metrics: Unlike the metrics above, which summarize the full structure, these describe individual nodes. They include the in-degree, out-degree, and total degree of each node, highlighting the importance and connectivity of individual nodes.
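The descriptive metrics listed above can be illustrated with a minimal Python sketch on a toy graph. This is not the DAGMetrics API: the function name `descriptive_metrics`, its arguments, and the toy edge list are hypothetical. It assumes that a collider is counted once per pair of parents sharing a child, with an optional flag to count only unshielded colliders (parents not adjacent to each other), and that root and leaf nodes follow the definitions above literally, so an isolated node is counted as both.

```python
from itertools import combinations


def descriptive_metrics(nodes, edges, unshielded_only=False):
    """Descriptive metrics for a DAG given as (parent, child) pairs.

    `nodes` must contain every node that appears in `edges`.
    Assumption: one collider is counted per pair of parents of a node.
    """
    nodes = list(nodes)
    edges = set(edges)
    parents = {n: set() for n in nodes}
    children = {n: set() for n in nodes}
    for parent, child in edges:
        parents[child].add(parent)
        children[parent].add(child)

    def adjacent(a, b):
        # True if the two nodes are joined by an edge in either direction.
        return (a, b) in edges or (b, a) in edges

    n_colliders = 0
    for node in nodes:
        for a, b in combinations(sorted(parents[node]), 2):
            if unshielded_only and adjacent(a, b):
                continue  # skip shielded colliders when requested
            n_colliders += 1

    degree = {
        n: {
            "in": len(parents[n]),
            "out": len(children[n]),
            "total": len(parents[n]) + len(children[n]),
        }
        for n in nodes
    }
    return {
        "n_edges": len(edges),
        "n_colliders": n_colliders,
        "roots": sorted(n for n in nodes if not parents[n]),
        "leaves": sorted(n for n in nodes if not children[n]),
        "isolated": sorted(
            n for n in nodes if not parents[n] and not children[n]
        ),
        "degree": degree,
    }


# Hypothetical toy DAG: A -> C <- B, C -> D, and an isolated node E.
toy_nodes = ["A", "B", "C", "D", "E"]
toy_edges = [("A", "C"), ("B", "C"), ("C", "D")]
metrics = descriptive_metrics(toy_nodes, toy_edges)
print(metrics["n_edges"], metrics["n_colliders"])
print(metrics["roots"], metrics["leaves"], metrics["isolated"])
print(metrics["degree"]["C"])
```

In this toy graph, node C is the only collider, and E appears among the roots, the leaves, and the isolated nodes because it has neither incoming nor outgoing edges.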
2.2.2. Comparative Metrics
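- True Positive, False Positive, and False Negative Counts: These count the edges that match or differ between two DAGs, one of which is treated as the reference: edges present in both (true positives), edges present only in the compared DAG (false positives), and edges present only in the reference DAG (false negatives) [24].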
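As a rough illustration of these counts, the following Python sketch compares two edge lists with set operations. It is not taken from DAGMetrics: the function name `compare_edges` and the `directed` flag are hypothetical. It assumes that, in the directed case, an edge matches only if its orientation matches, so a reversed edge contributes one false positive and one false negative; with `directed=False` only the skeletons are compared.

```python
def compare_edges(reference_edges, learned_edges, directed=True):
    """Count true positive, false positive, and false negative edges.

    Edges are (parent, child) pairs. Assumption: with directed=True a
    reversed edge counts as a mismatch; with directed=False orientation
    is ignored and only adjacency is compared.
    """
    if directed:
        reference = set(reference_edges)
        learned = set(learned_edges)
    else:
        reference = {frozenset(e) for e in reference_edges}
        learned = {frozenset(e) for e in learned_edges}
    return {
        "tp": len(reference & learned),  # edges present in both graphs
        "fp": len(learned - reference),  # edges only in the learned graph
        "fn": len(reference - learned),  # edges only in the reference graph
    }


# Hypothetical example: one reversed edge (B-C) and one extra edge (A-D).
reference = [("A", "B"), ("B", "C"), ("C", "D")]
learned = [("A", "B"), ("C", "B"), ("C", "D"), ("A", "D")]
print(compare_edges(reference, learned))
print(compare_edges(reference, learned, directed=False))
```

Comparing skeletons rather than oriented edges is often the more forgiving choice when the learned graph is a CPDAG, since some of its edges carry no orientation.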
2.3. DAGMetrics: An R Package for Evaluating and Comparing DAGs
1. Single DAG Visualization: Users can visualize either the full network or the Markov blankets of specified nodes.
2. Side-by-Side Comparison: This feature allows for direct comparison of two DAGs or the Markov blankets of specified nodes. In these visualizations, selected nodes and matching edges between the two DAGs are highlighted in green, making it easier to identify shared structures and differences.
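Both views can be restricted to the Markov blanket of a node, which in a DAG consists of the node's parents, its children, and the other parents of its children. The Python sketch below shows how such a blanket, and the edges shared by two blankets, could be computed. It is an illustration only, not the package's implementation: the function names and the two toy graphs are hypothetical, and it assumes that edges match only when their orientation matches.

```python
def markov_blanket(edges, node):
    """Return the Markov blanket of `node` in a DAG of (parent, child) pairs."""
    edges = set(edges)
    parents = {u for u, v in edges if v == node}
    children = {v for u, v in edges if u == node}
    # Spouses: other parents of the node's children.
    spouses = {u for u, v in edges if v in children and u != node}
    return parents | children | spouses


def blanket_edges(edges, node):
    """Edges of the subgraph induced by `node` and its Markov blanket."""
    keep = markov_blanket(edges, node) | {node}
    return {(u, v) for u, v in set(edges) if u in keep and v in keep}


def shared_blanket_edges(edges_a, edges_b, node):
    """Edges present in both Markov blanket subgraphs of `node`."""
    return blanket_edges(edges_a, node) & blanket_edges(edges_b, node)


# Two hypothetical DAGs over partly overlapping nodes.
dag_a = [("A", "C"), ("B", "C"), ("C", "D"), ("E", "D"), ("D", "F")]
dag_b = [("A", "C"), ("G", "C"), ("C", "D"), ("E", "D")]
print(sorted(markov_blanket(dag_a, "C")))
print(sorted(shared_blanket_edges(dag_a, dag_b, "C")))
```

The shared edges returned by the last call correspond to the kind of matching structure that a side-by-side view would highlight.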
2.4. Data Description and Preprocessing
2.5. Model Selection
- The normalized number of edges.
- The normalized number of colliders.
- The normalized number of isolated nodes.
2.6. Statistical Analysis
- The number of edges.
- The number of colliders.
- The number of nodes.
- The number of isolated nodes.
3. Results
3.1. Model Selection
3.2. Analysis of Markov Blankets
3.3. Discovered Local Structures
- A directed edge from Methylotenera mobilis to Methylotenera versatilis.
- A directed edge from Gemmata massiliana to Telmatocola sphagniphila.
- Devosia psychrophila had an undirected relationship with Devosia submarina.
- Rhodoferax saidenbachensis had an undirected relationship with Terrimonas pekingensis.
- An undirected relationship between Shinella zoogloeoides and Sphingomonas zeicaulis was observed in both the island post-harvest stage and the continent harvest stage (Figure 4b,c).
- A directed relationship from Rhodoligotrophos appendicifer to Bartonella rochalimae was observed in both the island post-harvest stage and the continent post-harvest stage (Figure 4b,d).
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Sadeghi, A.; Gopal, A.; Fesanghary, M. Causal Discovery in Financial Markets: A Framework for Nonstationary Time-Series Data. arXiv 2023, arXiv:2312.17375.
- Addo, P.M.; Manibialoa, C.; McIsaac, F. Exploring nonlinearity on the CO2 emissions, economic production and energy use nexus: A causal discovery approach. Energy Rep. 2021, 7, 6196–6204.
- Ebert-Uphoff, I.; Deng, Y. Causal discovery for climate research using graphical models. J. Clim. 2012, 25, 5648–5665.
- Agrahari, R.; Foroushani, A.; Docking, T.R.; Chang, L.; Duns, G.; Hudoba, M.; Karsan, A.; Zare, H. Applications of Bayesian network models in predicting types of hematological malignancies. Sci. Rep. 2018, 8, 6951.
- Friedman, N.; Linial, M.; Nachman, I.; Pe’er, D. Using Bayesian networks to analyze expression data. In Proceedings of the Fourth Annual International Conference on Computational Molecular Biology, Tokyo, Japan, 8–11 April 2000; pp. 127–135.
- Foroushani, A.; Agrahari, R.; Docking, R.; Chang, L.; Duns, G.; Hudoba, M.; Karsan, A.; Zare, H. Large-scale gene network analysis reveals the significance of extracellular matrix pathway and homeobox genes in acute myeloid leukemia: An introduction to the Pigengene package and its applications. BMC Med. Genom. 2017, 10, 16.
- Zhang, B.; Gaiteri, C.; Bodea, L.G.; Wang, Z.; McElwee, J.; Podtelezhnikov, A.A.; Zhang, C.; Xie, T.; Tran, L.; Dobrin, R.; et al. Integrated systems approach identifies genetic nodes and networks in late-onset Alzheimer’s disease. Cell 2013, 153, 707–720.
- Ganopoulou, M.; Michailidis, M.; Angelis, L.; Ganopoulos, I.; Molassiotis, A.; Xanthopoulou, A.; Moysiadis, T. Could Causal Discovery in Proteogenomics Assist in Understanding Gene–Protein Relations? A Perennial Fruit Tree Case Study Using Sweet Cherry as a Model. Cells 2021, 11, 92.
- Skodra, C.; Michailidis, M.; Moysiadis, T.; Stamatakis, G.; Ganopoulou, M.; Adamakis, I.D.S.; Angelis, L.; Ganopoulos, I.; Tanou, G.; Samiotaki, M.; et al. Disclosing the molecular basis of salinity priming in olive trees using proteogenomic model discovery. Plant Physiol. 2023, 191, 1913–1933.
- Boutsika, A.; Michailidis, M.; Ganopoulou, M.; Dalakouras, A.; Skodra, C.; Xanthopoulou, A.; Stamatakis, G.; Samiotaki, M.; Tanou, G.; Moysiadis, T.; et al. A wide foodomics approach coupled with metagenomics elucidates the environmental signature of potatoes. Iscience 2023, 26, 105917.
- Brouillard, P.; Squires, C.; Wahl, J.; Kording, K.P.; Sachs, K.; Drouin, A.; Sridhar, D. The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications. arXiv 2024, arXiv:2412.01953.
- Spirtes, P.; Glymour, C.; Scheines, R. Causation, Prediction, and Search; MIT Press: Cambridge, MA, USA, 2001.
- Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference; Morgan Kaufmann: Burlington, MA, USA, 1988.
- Bouckaert, R.R. Bayesian Belief Networks: From Construction to Inference. Ph.D. Thesis, Universiteit Utrecht, Faculteit Wiskunde en Informatica, Utrecht, The Netherlands, 1995.
- Tsamardinos, I.; Brown, L.E.; Aliferis, C.F. The max-min hill-climbing Bayesian network structure learning algorithm. Mach. Learn. 2006, 65, 31–78.
- Scutari, M.; Graafland, C.E.; Gutiérrez, J.M. Who learns better Bayesian network structures: Accuracy and speed of structure learning algorithms. Int. J. Approx. Reason. 2019, 115, 235–253.
- Constantinou, A.C.; Liu, Y.; Chobtham, K.; Guo, Z.; Kitson, N.K. Large-scale empirical validation of Bayesian Network structure learning algorithms with noisy data. Int. J. Approx. Reason. 2021, 131, 151–188.
- Zheng, X.; Aragam, B.; Ravikumar, P.K.; Xing, E.P. Dags with no tears: Continuous optimization for structure learning. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 3–8 December 2018; Volume 31.
- Fang, Z.; Zhu, S.; Zhang, J.; Liu, Y.; Chen, Z.; He, Y. On low-rank directed acyclic graphs and causal structure learning. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 4924–4937.
- Ng, I.; Ghassami, A.; Zhang, K. On the role of sparsity and dag constraints for learning linear dags. In Proceedings of the Advances in Neural Information Processing Systems, Virtual, 6–12 December 2020; Volume 33, pp. 17943–17954.
- Huang, B.; Zhang, K.; Zhang, J.; Ramsey, J.; Sanchez-Romero, R.; Glymour, C.; Schölkopf, B. Causal discovery from heterogeneous/nonstationary data. J. Mach. Learn. Res. 2020, 21, 1–53.
- Maeda, T.N.; Shimizu, S. RCD: Repetitive causal discovery of linear non-Gaussian acyclic models with latent confounders. In Proceedings of the International Conference on Artificial Intelligence and Statistics, PMLR, Online, 26–28 August 2020; pp. 735–745.
- Maeda, T.N.; Shimizu, S. Causal additive models with unobserved variables. In Proceedings of the Uncertainty in Artificial Intelligence, PMLR, Online, 27–30 July 2021; pp. 97–106.
- Shimizu, S.; Hoyer, P.O.; Hyvärinen, A.; Kerminen, A.; Jordan, M. A linear non-Gaussian acyclic model for causal discovery. J. Mach. Learn. Res. 2006, 7, 2003–2030.
- Boutsika, A.; Xanthopoulou, A.; Tanou, G.; Zacharatou, M.E.; Vernikos, M.; Nianiou-Obeidat, I.; Ganopoulos, I.; Mellidou, I. A microbiome survey of contrasting potato terroirs using 16S rRNA long-read sequencing. Plant Soil 2024, 505, 431–448.
- Kalisch, M.; Mächler, M.; Colombo, D.; Maathuis, M.H.; Bühlmann, P. Causal inference using graphical models with the R package pcalg. J. Stat. Softw. 2012, 47, 1–26.
- Tsamardinos, I.; Aliferis, C.F.; Statnikov, A.R.; Statnikov, E. Algorithms for large scale Markov blanket discovery. FLAIRS 2003, 2, 376–381. Available online: https://cdn.aaai.org/FLAIRS/2003/Flairs03-073.pdf (accessed on 25 January 2025).
- Scutari, M. Learning Bayesian networks with the bnlearn R package. arXiv 2009, arXiv:0908.3817.
- Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N.S.; Wang, J.T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13, 2498–2504.
- Kalyuzhnaya, M.G.; Beck, D.A.; Vorobev, A.; Smalley, N.; Kunkel, D.D.; Lidstrom, M.E.; Chistoserdova, L. Novel methylotrophic isolates from lake sediment, description of Methylotenera versatilis sp. nov. and emended description of the genus Methylotenera. Int. J. Syst. Evol. Microbiol. 2012, 62, 106–111.
- Mustakhimov, I.; Kalyuzhnaya, M.G.; Lidstrom, M.E.; Chistoserdova, L. Insights into denitrification in Methylotenera mobilis from denitrification pathway and methanol metabolism mutants. J. Bacteriol. 2013, 195, 2207–2211.
- Salcher, M.M.; Schaefle, D.; Kaspar, M.; Neuenschwander, S.M.; Ghai, R. Evolution in action: Habitat transition from sediment to the pelagial leads to genome streamlining in Methylophilaceae. ISME J. 2019, 13, 2764–2777.
- Bulgarelli, D.; Rott, M.; Schlaeppi, K.; Ver Loren van Themaat, E.; Ahmadinejad, N.; Assenza, F.; Rauf, P.; Huettel, B.; Reinhardt, R.; Schmelzer, E.; et al. Revealing structure and assembly cues for Arabidopsis root-inhabiting bacterial microbiota. Nature 2012, 488, 91–95.
- Li, K.; Chen, A.; Sheng, R.; Hou, H.; Zhu, B.; Wei, W.; Zhang, W. Long-term chemical and organic fertilization induces distinct variations of microbial associations but unanimous elevation of soil multifunctionality. Sci. Total Environ. 2024, 931, 172862.
- Jin, D.; Wang, P.; Bai, Z.; Jin, B.; Yu, Z.; Wang, X.; Zhuang, G.; Zhang, H. Terrimonas pekingensis sp. nov., isolated from bulking sludge, and emended descriptions of the genus Terrimonas, Terrimonas ferruginea, Terrimonas lutea and Terrimonas aquatica. Int. J. Syst. Evol. Microbiol. 2013, 63 Pt 5, 1658–1664.
- Kaden, R.; Spröer, C.; Beyer, D.; Krolla-Sidenstein, P. Rhodoferax saidenbachensis sp. nov., a psychrotolerant, very slowly growing bacterium within the family Comamonadaceae, proposal of appropriate taxonomic position of Albidiferax ferrireducens strain T118T in the genus Rhodoferax and emended description of the genus Rhodoferax. Int. J. Syst. Evol. Microbiol. 2014, 64 Pt 4, 1186–1193.
- Junier, P.; Cailleau, G.; Fatton, M.; Udriet, P.; Hashmi, I.; Bregnard, D.; Corona-Ramirez, A.; di Francesco, E.; Kuhn, T.; Mangia, N.; et al. A cohesive Microcoleus strain cluster causes benthic cyanotoxic blooms in rivers worldwide. Water Res. X 2024, 24, 100252.
- Lamprinou, V.; Hernández-Mariné, M.; Canals, T.; Kormas, K.; Economou-Amilli, A.; Pantazidou, A. Morphology and molecular evaluation of Iphinoe spelaeobios gen. nov., sp. nov. and Loriellopsis cavernicola gen. nov., sp. nov., two stigonematalean cyanobacteria from Greek and Spanish caves. Int. J. Syst. Evol. Microbiol. 2011, 61, 2907–2915.
- Qadir, M.; Hussain, A.; Iqbal, A.; Shah, F.; Wu, W.; Cai, H. Microbial utilization to nurture robust agroecosystems for food security. Agronomy 2024, 14, 1891.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Averin, P.; Mellidou, I.; Ganopoulou, M.; Xanthopoulou, A.; Moysiadis, T. Evaluating Directed Acyclic Graphs with DAGMetrics: Insights from Tuber and Soil Microbiome Data. Agronomy 2025, 15, 987. https://doi.org/10.3390/agronomy15040987