Evaluation of Genomic Contamination Detection Tools and Influence of Horizontal Gene Transfer on Their Efficiency through Contamination Simulations at Various Taxonomic Ranks
Abstract
1. Introduction
2. Materials and Methods
2.1. Contamination Simulations (Overview of CRACOT)
2.2. Genomic Contamination Estimation
2.3. Correlation and Violin Plot Creation
3. Results and Discussion
4. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cornet, L.; Lupo, V.; Declerck, S.; Baurain, D. Evaluation of Genomic Contamination Detection Tools and Influence of Horizontal Gene Transfer on Their Efficiency through Contamination Simulations at Various Taxonomic Ranks. Appl. Microbiol. 2024, 4, 124-132. https://doi.org/10.3390/applmicrobiol4010009