Article

A Graph Contrastive Learning Method for Enhancing Genome Recovery in Complex Microbial Communities

1 Department of Computer Science, Yangzhou University, Yangzhou 225100, China
2 State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing 210000, China
* Author to whom correspondence should be addressed.
Entropy 2025, 27(9), 921; https://doi.org/10.3390/e27090921 (registering DOI)
Submission received: 22 June 2025 / Revised: 20 August 2025 / Accepted: 28 August 2025 / Published: 31 August 2025
(This article belongs to the Special Issue Network-Based Machine Learning Approaches in Bioinformatics)

Abstract

Accurate genome binning is essential for resolving microbial community structure and functional potential from metagenomic data. However, existing approaches—primarily reliant on tetranucleotide frequency (TNF) and abundance profiles—often perform sub-optimally in the face of complex community compositions, low-abundance taxa, and long-read sequencing datasets. To address these limitations, we present MBGCCA, a novel metagenomic binning framework that synergistically integrates graph neural networks (GNNs), contrastive learning, and information-theoretic regularization to enhance binning accuracy, robustness, and biological coherence. MBGCCA operates in two stages: (1) multimodal information integration, where TNF and abundance profiles are fused via a deep neural network trained using a multi-view contrastive loss, and (2) self-supervised graph representation learning, which leverages assembly graph topology to refine contig embeddings. The contrastive learning objective follows the InfoMax principle by maximizing mutual information across augmented views and modalities, encouraging the model to extract globally consistent and high-information representations. By aligning perturbed graph views while preserving topological structure, MBGCCA effectively captures both global genomic characteristics and local contig relationships. Comprehensive evaluations using both synthetic and real-world datasets—including wastewater and soil microbiomes—demonstrate that MBGCCA consistently outperforms state-of-the-art binning methods, particularly in challenging scenarios marked by sparse data and high community complexity. These results highlight the value of entropy-aware, topology-preserving learning for advancing metagenomic genome reconstruction.
Keywords: information integration; entropy; mutual information; canonical correlation analysis; genome binning
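
The abstract describes a multi-view contrastive objective that maximizes mutual information between views of the same contig. This excerpt does not give the exact loss used in MBGCCA, so the following is only a minimal sketch of one common choice, a symmetric InfoNCE loss, written in NumPy. The function name `info_nce`, the temperature value, and the random stand-in embeddings are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def info_nce(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE loss between two views of the same n contigs.

    z_a, z_b: arrays of shape (n, d); row i of each is a view of contig i.
    Hypothetical helper for illustration, not the MBGCCA loss itself.
    """
    # L2-normalise so the dot product is a cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = (z_a @ z_b.T) / temperature  # shape (n, n)

    def diag_cross_entropy(scores):
        # Row-wise log-softmax, then take the positives on the diagonal.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_prob = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the a->b and b->a directions.
    return 0.5 * (diag_cross_entropy(logits) + diag_cross_entropy(logits.T))


# Stand-in embeddings (assumed data): 8 contigs, 16 dimensions.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
z_view = z + 0.01 * rng.normal(size=z.shape)  # lightly perturbed second view

aligned = info_nce(z, z_view)
misaligned = info_nce(z, np.roll(z_view, 1, axis=0))  # pair each contig with the wrong view
print(aligned < misaligned)
```

Minimizing a loss of this form pulls the two views of each contig together and pushes views of different contigs apart. For a batch of n contigs, log(n) minus the loss is a standard lower bound on the mutual information between views, which is the usual link between this family of losses and the InfoMax principle mentioned above.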

Share and Cite

MDPI and ACS Style

Wei, G.; Liu, Y. A Graph Contrastive Learning Method for Enhancing Genome Recovery in Complex Microbial Communities. Entropy 2025, 27, 921. https://doi.org/10.3390/e27090921

AMA Style

Wei G, Liu Y. A Graph Contrastive Learning Method for Enhancing Genome Recovery in Complex Microbial Communities. Entropy. 2025; 27(9):921. https://doi.org/10.3390/e27090921

Chicago/Turabian Style

Wei, Guo, and Yan Liu. 2025. "A Graph Contrastive Learning Method for Enhancing Genome Recovery in Complex Microbial Communities" Entropy 27, no. 9: 921. https://doi.org/10.3390/e27090921

APA Style

Wei, G., & Liu, Y. (2025). A Graph Contrastive Learning Method for Enhancing Genome Recovery in Complex Microbial Communities. Entropy, 27(9), 921. https://doi.org/10.3390/e27090921
