Critical Assessment of Short-Read Assemblers for the Metagenomic Identification of Foodborne and Waterborne Pathogens Using Simulated Bacterial Communities
Abstract
1. Introduction
2. Materials and Methods
2.1. Reference Datasets
2.2. Illumina Short-Read Simulation
2.3. Metagenome Assembly
2.4. Assessment of Assembly Quality
2.5. Identifications of Plasmids, Virulence Genes, Salmonella Pathogenicity Island (SPI), ARGs, and Chromosomal Point Mutations
2.6. Serotyping
2.7. Multilocus Sequence Typing (MLST)
2.8. Whole-Genome Phylogenetic Analyses
3. Results
3.1. Assembly Quality
3.2. Plasmids
3.3. ARGs
3.4. Virulence Genes and SPIs
3.5. Salmonella Serotypes
3.6. MLST
3.7. Whole-Genome Phylogeny
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References

Sequencing Depth (Million) | Sequencer | Assembler | Number of Contigs | Length of the Largest Contig (bp) | Total Length (bp) | N50 | L50 | Complete BUSCOs (%) | Fragmented BUSCOs (%) | Missing BUSCOs (%) |
---|---|---|---|---|---|---|---|---|---|---|
1 | HiSeq | ABySS | 2547 | 2816 | 583,551 | 906 | 7 | N.A. a | N.A. | N.A. |
1 | HiSeq | IDBA-UD | 19,390 | 30,957 | 18,981,326 | 2446 | 1327 | 85.5 | 12.9 | 1.6 |
1 | HiSeq | MaSuRCA | 4169 | 4169 | 4,250,085 | 1877 | 601 | 12.1 | 37.9 | 50.0 |
1 | HiSeq | MEGAHIT | 22,873 | 15,323 | 20,016,585 | 1508 | 3034 | 83.8 | 14.5 | 1.7 |
1 | HiSeq | metaSPAdes | 45,708 | 46,199 | 30,661,187 | 3887 | 1130 | 86.3 | 12.1 | 1.6 |
1 | HiSeq | Ray Meta | 91,979 | 2679 | 19,109,345 | 660 | 1493 | 25.0 | 58.9 | 16.1 |
1 | MiSeq | ABySS | 38,624 | 131,924 | 29,026,490 | 8322 | 453 | 96.8 | 3.2 | 0.0 |
1 | MiSeq | IDBA-UD | 38,905 | 605,513 | 48,155,638 | 16,200 | 329 | 100.0 | 0.0 | 0.0 |
1 | MiSeq | MaSuRCA | 6317 | 6317 | 25,421,869 | 41,463 | 68 | 99.2 | 0.8 | 0.0 |
1 | MiSeq | MEGAHIT | 69,714 | 167,091 | 62,018,965 | 2525 | 3241 | 100.0 | 0.0 | 0.0 |
1 | MiSeq | metaSPAdes | 39,536 | 484,133 | 56,431,406 | 7826 | 700 | 100.0 | 0.0 | 0.0 |
1 | MiSeq | Ray Meta | 107,349 | 8316 | 34,485,996 | 1052 | 3854 | 72.6 | 24.2 | 3.2 |
1 | NovaSeq | ABySS | 23,285 | 3465 | 6,799,130 | 613 | 584 | N.A. | N.A. | N.A. |
1 | NovaSeq | IDBA-UD | 19,309 | 102,980 | 24,348,504 | 6950 | 519 | 96.0 | 4.0 | 0.0 |
1 | NovaSeq | MaSuRCA | 4915 | 4915 | 9,411,060 | 5142 | 517 | N.A. | N.A. | N.A. |
1 | NovaSeq | MEGAHIT | 25,109 | 26,002 | 26,423,629 | 2219 | 2325 | 98.4 | 0.8 | 0.8 |
1 | NovaSeq | metaSPAdes | 53,521 | 144,898 | 38,848,690 | 5595 | 745 | 100.0 | 0.0 | 0.0 |
1 | NovaSeq | Ray Meta | 91,241 | 3513 | 23,595,378 | 775 | 2849 | N.A. | N.A. | N.A. |
2.4 | HiSeq | ABySS | 96,154 | 399,012 | 38,151,970 | 20,826 | 119 | 100.0 | 0.0 | 0.0 |
2.4 | HiSeq | IDBA-UD | 20,986 | 982,298 | 36,504,325 | 33,886 | 121 | 100.0 | 0.0 | 0.0 |
2.4 | HiSeq | MaSuRCA | 3792 | 1,157,404 | 22,062,604 | 127,766 | 26 | 100.0 | 0.0 | 0.0 |
2.4 | HiSeq | MEGAHIT | 25,891 | 415,309 | 38,209,837 | 6671 | 623 | 100.0 | 0.0 | 0.0 |
2.4 | HiSeq | metaSPAdes | 136,810 | 18,480 | 39,795,754 | 1305 | 3669 | 100.0 | 0.0 | 0.0 |
2.4 | HiSeq | Ray Meta | 54,095 | 701,901 | 51,025,031 | 26,733 | 200 | 77.4 | 21.0 | 1.6 |
2 | NovaSeq | ABySS | 76,438 | 239,683 | 35,384,363 | 13,564 | 229 | 100.0 | 0.0 | 0.0 |
2 | NovaSeq | IDBA-UD | 22,191 | 866,390 | 38,814,884 | 43,320 | 106 | 100.0 | 0.0 | 0.0 |
2 | NovaSeq | MaSuRCA | 4662 | 1,341,366 | 23,872,171 | 107,965 | 33 | 100.0 | 0.0 | 0.0 |
2 | NovaSeq | MEGAHIT | 28,959 | 333,683 | 41,721,455 | 6365 | 742 | 100.0 | 0.0 | 0.0 |
2 | NovaSeq | metaSPAdes | 129,147 | 14,900 | 40,184,250 | 1286 | 3904 | 100.0 | 0.0 | 0.0 |
2 | NovaSeq | Ray Meta | 61,622 | 788,082 | 55,891,970 | 22,000 | 211 | 79.8 | 16.1 | 4.1 |
1.5 | MiSeq | ABySS | 27,450 | 985,657 | 31,190,322 | 23,037 | 88 | 100.0 | 0.0 | 0.0 |
1.5 | MiSeq | IDBA-UD | 46,339 | 830,885 | 59,634,527 | 26,442 | 222 | 100.0 | 0.0 | 0.0 |
1.5 | MiSeq | MaSuRCA | 7482 | 1,649,401 | 35,047,685 | 142,499 | 35 | 100.0 | 0.0 | 0.0 |
1.5 | MiSeq | MEGAHIT | 72,884 | 396,248 | 72,945,424 | 3241 | 2413 | 100.0 | 0.0 | 0.0 |
1.5 | MiSeq | metaSPAdes | 41,606 | 1,238,639 | 67,167,714 | 11,584 | 489 | 100.0 | 0.0 | 0.0 |
1.5 | MiSeq | Ray Meta | 133,221 | 174,311 | 43,254,966 | 10,098 | 355 | 98.4 | 1.6 | 0.0 |
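
For readers unfamiliar with the N50 and L50 columns, the following is a minimal sketch of the conventional definitions. N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length, and L50 is the number of contigs in that set. The function name `n50_l50` and the example lengths are illustrative assumptions; the tabulated values come from the authors' assessment tools, which may filter short contigs before computing these statistics.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for an iterable of contig lengths in bp.

    Conventional definition (an assumption about how the table was built):
    sort contigs from longest to shortest and accumulate their lengths
    until at least half of the total assembly length is covered.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            # `length` is the N50; `count` is the L50.
            return length, count


if __name__ == "__main__":
    # Hypothetical toy assembly, not data from the study.
    example = [80, 70, 50, 40, 30, 20]
    print(n50_l50(example))
```

A higher N50 together with a lower L50 generally indicates a more contiguous assembly, which is how these two columns can be compared across assemblers.
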

Sequencing Depth (Million) | Sequencer | Assembler | Number of Contigs | Length of the Largest Contig (bp) | Total Length (bp) | N50 | L50 | Complete BUSCOs (%) | Fragmented BUSCOs (%) | Missing BUSCOs (%) |
---|---|---|---|---|---|---|---|---|---|---|
2.4 | HiSeq | ABySS | 94,776 | 567,651 | 38,437,605 | 29,385 | 156 | 100.0 | 0.0 | 0.0 |
2.4 | HiSeq | IDBA-UD | 27,858 | 675,083 | 38,978,604 | 42,320 | 125 | 100.0 | 0.0 | 0.0 |
2.4 | HiSeq | MaSuRCA | 8722 | 1,065,770 | 31,493,699 | 144,020 | 49 | 100.0 | 0.0 | 0.0 |
2.4 | HiSeq | MEGAHIT | 31,551 | 506,253 | 40,500,481 | 8125 | 595 | 100.0 | 0.0 | 0.0 |
2.4 | HiSeq | metaSPAdes | 49,253 | 598,239 | 51,906,012 | 6280 | 480 | 100.0 | 0.0 | 0.0 |
2.4 | HiSeq | Ray Meta | 129,379 | 40,462 | 40,407,420 | 2512 | 2071 | 98.3 | 1.6 | 0.1 |
1.5 | MiSeq | ABySS | 27,664 | 1,156,931 | 31,438,774 | 66,328 | 81 | 100.0 | 0.0 | 0.0 |
1.5 | MiSeq | IDBA-UD | 45,943 | 694,199 | 62,390,072 | 13,321 | 525 | 100.0 | 0.0 | 0.0 |
1.5 | MiSeq | MaSuRCA | 10,851 | 1,084,592 | 40,632,725 | 74,717 | 82 | 100.0 | 0.0 | 0.0 |
1.5 | MiSeq | MEGAHIT | 53,243 | 506,136 | 66,833,415 | 3602 | 2396 | 100.0 | 0.0 | 0.0 |
1.5 | MiSeq | metaSPAdes | 26,622 | 598,615 | 62,354,405 | 10,634 | 818 | 100.0 | 0.0 | 0.0 |
1.5 | MiSeq | Ray Meta | 128,576 | 55,589 | 42,575,322 | 4620 | 1054 | 99.2 | 0.8 | 0.0 |
2 | NovaSeq | ABySS | 75,321 | 294,921 | 35,485,842 | 15,726 | 253 | 100.0 | 0.0 | 0.0 |
2 | NovaSeq | IDBA-UD | 28,508 | 1,067,518 | 41,475,688 | 31,641 | 144 | 100.0 | 0.0 | 0.0 |
2 | NovaSeq | MaSuRCA | 9695 | 826,913 | 32,976,162 | 107,348 | 61 | 100.0 | 0.0 | 0.0 |
2 | NovaSeq | MEGAHIT | 33,868 | 348,830 | 44,148,453 | 5785 | 692 | 100.0 | 0.0 | 0.0 |
2 | NovaSeq | metaSPAdes | 49,929 | 595,582 | 54,925,053 | 6036 | 613 | 100.0 | 0.0 | 0.0 |
2 | NovaSeq | Ray Meta | 118,667 | 24,869 | 40,595,237 | 2461 | 2217 | 96.8 | 3.2 | 0.0 |
4.8 | HiSeq | ABySS | 68,676 | 860,765 | 49,469,839 | 49,695 | 112 | 100.0 | 0.0 | 0.0 |
4.8 | HiSeq | IDBA-UD | 22,240 | 986,994 | 55,895,382 | 21,745 | 345 | 100.0 | 0.0 | 0.0 |
4.8 | HiSeq | MaSuRCA | N.A. a | N.A. | N.A. | N.A. | N.A. | 100.0 | 0.0 | 0.0 |
4.8 | HiSeq | MEGAHIT | 27,805 | 582,024 | 57,905,030 | 7080 | 925 | 100.0 | 0.0 | 0.0 |
4.8 | HiSeq | metaSPAdes | 27,267 | 1,010,681 | 63,473,426 | 30,990 | 354 | 100.0 | 0.0 | 0.0 |
4.8 | HiSeq | Ray Meta | 129,559 | 69,858 | 55,853,404 | 2606 | 1961 | 100.0 | 0.0 | 0.0 |
2 | MiSeq | ABySS | 34,265 | 1,086,290 | 38,581,270 | 99,221 | 62 | 100.0 | 0.0 | 0.0 |
2 | MiSeq | IDBA-UD | 48,934 | 598,213 | 69,472,563 | 27,024 | 385 | 100.0 | 0.0 | 0.0 |
2 | MiSeq | MaSuRCA | 9240 | 1,692,548 | 49,934,547 | 35,328 | 177 | 100.0 | 0.0 | 0.0 |
2 | MiSeq | MEGAHIT | 46,433 | 598,268 | 70,400,115 | 5687 | 1701 | 100.0 | 0.0 | 0.0 |
2 | MiSeq | metaSPAdes | 19,644 | 1,345,803 | 65,375,463 | 25,509 | 452 | 100.0 | 0.0 | 0.0 |
2 | MiSeq | Ray Meta | 138,343 | 93,582 | 49,496,982 | 5951 | 783 | 100.0 | 0.0 | 0.0 |
4 | NovaSeq | ABySS | 81,690 | 859,123 | 53,258,171 | 51,140 | 115 | 100.0 | 0.0 | 0.0 |
4 | NovaSeq | IDBA-UD | 20,992 | 1,080,025 | 57,665,814 | 32,176 | 276 | 100.0 | 0.0 | 0.0 |
4 | NovaSeq | MaSuRCA | 6872 | 741,283 | 56,276,212 | 87,439 | 151 | 100.0 | 0.0 | 0.0 |
4 | NovaSeq | MEGAHIT | 27,425 | 599,130 | 60,289,818 | 8523 | 895 | 100.0 | 0.0 | 0.0 |
4 | NovaSeq | metaSPAdes | 25,754 | 1,080,366 | 64,824,870 | 38,468 | 320 | 100.0 | 0.0 | 0.0 |
4 | NovaSeq | Ray Meta | 118,206 | 64,911 | 56,050,840 | 2454 | 2283 | 99.2 | 0.8 | 0.0 |

Sequencing Depth (Million) | Sequencer | ABySS | IDBA-UD | MaSuRCA | MEGAHIT | metaSPAdes | Ray Meta | Reference |
---|---|---|---|---|---|---|---|---|
1 | HiSeq | N.D. a | IncHI2A IncHI2 IncQ1 | IncHI2 | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2 | IncHI2A IncHI2 IncQ1 |
1 | MiSeq | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 IncQ1 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2 IncQ1 | |
1 | NovaSeq | IncHI2 | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2 IncQ1 | |
2.4 | HiSeq | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | |
2 | NovaSeq | IncHI2A | IncHI2A IncHI2 IncQ1 | IncHI2A IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2A IncHI2 IncQ1 | IncHI2A | |
1.5 | MiSeq | IncHI2 IncHI2A IncQ1 | IncHI2 IncHI2A IncQ1 | IncHI2 IncHI2A IncQ1 | IncHI2 IncHI2A IncQ1 | IncHI2 IncHI2A IncQ1 | IncHI2 IncHI2A IncQ1 | |
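
One way to read the table above is to compare the replicon types reported for each assembly against those listed in the Reference column. The sketch below does this with a set difference. The helper `missing_replicons` and the rule of ignoring tokens that do not start with `Inc` (such as the N.D. marker) are assumptions made for illustration and are not part of the authors' pipeline.

```python
# Replicon types listed in the table's Reference column.
REFERENCE_REPLICONS = frozenset({"IncHI2A", "IncHI2", "IncQ1"})


def missing_replicons(cell, reference=REFERENCE_REPLICONS):
    """Return the reference replicon types absent from one table cell.

    `cell` is the whitespace-separated text of a single assembler cell.
    Tokens that do not start with "Inc" (for example "N.D.") are ignored,
    which is an illustrative assumption about how to parse the cell.
    """
    detected = {token for token in cell.split() if token.startswith("Inc")}
    return sorted(reference - detected)


if __name__ == "__main__":
    # Cell values copied from the table above.
    print(missing_replicons("IncHI2 IncQ1"))
    print(missing_replicons("N.D. a"))
```

Duplicate entries in a cell, such as a replicon type reported more than once, collapse to a single element in the set and so do not affect the comparison.
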

Sequencing Depth (Million) | Sequencer | ABySS | IDBA-UD | MaSuRCA | MEGAHIT | metaSPAdes | Ray Meta | Reference |
---|---|---|---|---|---|---|---|---|
1 | HiSeq | 0 | 66 | 15 | 69 | 79 | 20 | 91 |
1 | MiSeq | 87 | 91 | 90 | 91 | 91 | 55 | |
1 | NovaSeq | 10 | 90 | 37 | 90 | 90 | 39 | |
2.4 | HiSeq | 112 | 122 | 105 | 123 | 126 | 100 | |
2 | NovaSeq | 111 | 124 | 105 | 124 | 126 | 97 | |
1.5 | MiSeq | 90 | 93 | 91 | 91 | 89 | 91 | |

Sequencing Depth (Million) | Sequencer | ABySS | IDBA-UD | MaSuRCA | MEGAHIT | metaSPAdes | Ray Meta | Reference |
---|---|---|---|---|---|---|---|---|
2.4 | HiSeq | 1 | 3 | 1 | 6 | 19 | 0 | 241 |
1.5 | MiSeq | 1 | 65 | 3 | 104 | 94 | 7 | |
2 | NovaSeq | 0 | 6 | 1 | 11 | 25 | 1 | |
4.8 | HiSeq | 8 | 54 | N.A. a | 69 | 93 | 12 | |
2 | MiSeq | 4 | 114 | 15 | 151 | 147 | 10 | |
4 | NovaSeq | 16 | 68 | 19 | 88 | 106 | 12 | |

Sequencing Depth (Million) | Sequencer | ABySS | IDBA-UD | MaSuRCA | MEGAHIT | metaSPAdes | Ray Meta | Reference |
---|---|---|---|---|---|---|---|---|
1 | HiSeq | N.D. a | SPI-1 (7) SPI-2 (7) SPI-3 (2) | SPI-2 (3) | SPI-1 (6) SPI-2 (7) SPI-3 (2) | SPI-1 (8) SPI-2 (7) SPI-3 (2) | SPI-2 (2) | SPI-1 (8) SPI-2 (6) SPI-3 (3) SPI-4 (1) SPI-5 (1) SPI-9 (1) |
1 | MiSeq | C63PI (1) SPI-1 (5) SPI-2 (8) SPI-3 (2) SPI-4 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (2) SPI-4 (1) SPI-5 (1) SPI-9 (1) | C63PI (1) SPI-1 (7) SPI-2 (8) SPI-3 (2) | C63PI (1) SPI-1 (7) SPI-2 (6) SPI-3 (2) SPI-5 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (3) SPI-5 (1) SPI-9 (1) | SPI-1 (3) SPI-2 (5) SPI-3 (1) | |
1 | NovaSeq | N.D. | C63PI (1) SPI-1 (5) SPI-2 (7) SPI-3 (2) | SPI-1 (1) SPI-2 (2) | C63PI (1) SPI-1 (5) SPI-2 (7) SPI-3 (2) | C63PI (1) SPI-1 (7) SPI-2 (5) SPI-3 (3) SPI-5 (1) SPI-9 (1) | SPI-1 (3) SPI-2 (2) SPI-3 (1) | |
2.4 | HiSeq | SPI-1 (8) SPI-2 (6) SPI-3 (3) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (2) SPI-4 (1) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (9) SPI-3 (2) SPI-4 (1) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (2) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (3) SPI-4 (1) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (8) SPI-3 (2) | |
2 | NovaSeq | SPI-1 (8) SPI-2 (8) SPI-3 (2) | SPI-1 (8) SPI-2 (6) SPI-3 (2) SPI-4 (1) SPI-5 (1) SPI-9 (1) | SPI-1 (7) SPI-2 (7) SPI-3 (2) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (2) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (3) SPI-4 (1) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (2) | |
1.5 | MiSeq | SPI-1 (8) SPI-2 (6) SPI-3 (3) SPI-4 (1) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (3) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (3) SPI-4 (1) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (2) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (6) SPI-3 (3) SPI-4 (1) SPI-5 (1) SPI-9 (1) | SPI-1 (8) SPI-2 (4) SPI-3 (3) SPI-5 (1) SPI-9 (1) | |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chen, Z.; Meng, J. Critical Assessment of Short-Read Assemblers for the Metagenomic Identification of Foodborne and Waterborne Pathogens Using Simulated Bacterial Communities. Microorganisms 2022, 10, 2416. https://doi.org/10.3390/microorganisms10122416