Benchmarking State-of-the-Art Approaches for Norovirus Genome Assembly in Metagenome Sample
Abstract
Simple Summary
1. Introduction
2. Methods
3. Results
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
| Sample | Metric | RVS | CS | RS | M | T |
|---|---|---|---|---|---|---|
| SRR8074276 | Longest alignment (nt) | 7538 | 7538 | 7538 | 7548 | 7547 |
| | Genome fraction (%) | 99.83 | 99.83 | 99.83 | 99.96 | 99.94 |
| | Longest alignment IDY (%) | 99.95 | 99.95 | 99.91 | 99.87 | 99.93 |
| SRR9141472 | Longest alignment (nt) | 7569 | 7569 | 5282 | 7560 | 5848 |
| | Genome fraction (%) | 100.0 | 100.0 | 78.80 | 99.88 | 100.0 |
| | Longest alignment IDY (%) | 100.0 | 100.0 | 99.95 | 99.99 | 100.0 |
| SRR9141473 | Longest alignment (nt) | 7536 | 7536 | 4979 | 7487 | 6899 |
| | Genome fraction (%) | 100.0 | 100.0 | 90.55 | 99.35 | 100.0 |
| | Longest alignment IDY (%) | 100.0 | 100.0 | 99.91 | 99.97 | 100.0 |
| SRR9141474 | Longest alignment (nt) | 7541 | 7541 | 6838 | 7516 | 7542 |
| | Genome fraction (%) | 99.99 | 99.99 | 99.99 | 99.65 | 100.0 |
| | Longest alignment IDY (%) | 100.0 | 100.0 | 99.91 | 99.99 | 100.0 |
| SRR9141475 | Longest alignment (nt) | 7482 | 7493 | 7049 | 7440 | 7534 |
| | Genome fraction (%) | 99.28 | 99.43 | 99.08 | 98.72 | 99.97 |
| | Longest alignment IDY (%) | 100.0 | 100.0 | 99.96 | 100.0 | 100.0 |
| SRR9141476 | Longest alignment (nt) | 7533 | 7533 | 5699 | 7526 | 7533 |
| | Genome fraction (%) | 100.0 | 100.0 | 100.0 | 99.90 | 100.0 |
| | Longest alignment IDY (%) | 100.0 | 100.0 | 99.98 | 100.0 | 99.98 |
| SRR9141477 | Longest alignment (nt) | 7554 | 7554 | 5935 | 7467 | 3658 |
| | Genome fraction (%) | 99.95 | 99.95 | 91.20 | 98.79 | 99.84 |
| | Longest alignment IDY (%) | 100.0 | 100.0 | 99.93 | 99.96 | 99.97 |
| SRR9141478 | Longest alignment (nt) | 7540 | 7540 | 6592 | 7540 | 7540 |
| | Genome fraction (%) | 100.0 | 100.0 | 95.90 | 100.0 | 100.0 |
| | Longest alignment IDY (%) | 99.99 | 99.99 | 99.95 | 99.99 | 99.88 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Meleshko, D.; Korobeynikov, A. Benchmarking State-of-the-Art Approaches for Norovirus Genome Assembly in Metagenome Sample. Biology 2023, 12, 1066. https://doi.org/10.3390/biology12081066