ReporType: A Flexible Bioinformatics Tool for Targeted Loci Screening and Typing of Infectious Agents
Abstract
1. Introduction
2. Implementation
2.1. ReporType Architecture and Workflow
2.2. ReporType Installation, Configuration and Execution
2.3. Database Configuration
2.4. Databases, Test Datasets and Benchmarking
2.4.1. Virus
2.4.2. Bacteria
3. Results and Discussion
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. World Health Organization. Global Genomic Surveillance Strategy for Pathogens with Pandemic and Epidemic Potential, 2022–2032; World Health Organization: Geneva, Switzerland, 2022.
2. World Health Organization. WHO Guiding Principles for Pathogen Genome Data Sharing; World Health Organization: Geneva, Switzerland, 2022.
3. Gardy, J.L.; Loman, N.J. Towards a genomics-informed, real-time, global pathogen surveillance system. Nat. Rev. Genet. 2018, 19, 9–20.
4. Hill, V.; Githinji, G.; Vogels, C.B.F.; Bento, A.I.; Chaguza, C.; Carrington, C.V.F.; Grubaugh, N.D. Toward a global virus genomic surveillance network. Cell Host Microbe 2023, 31, 861–873.
5. Chen, Z.; Azman, A.S.; Chen, X.; Zou, J.; Tian, Y.; Sun, R.; Xu, X.; Wu, Y.; Lu, W.; Ge, S.; et al. Global landscape of SARS-CoV-2 genomic surveillance and data sharing. Nat. Genet. 2022, 54, 499–507.
6. Tegally, H.; San, J.E.; Cotten, M.; Moir, M.; Tegomoh, B.; Mboowa, G.; Martin, D.P.; Baxter, C.; Lambisia, A.W.; Diallo, A.; et al. The evolving SARS-CoV-2 epidemic in Africa: Insights from rapidly expanding genomic surveillance. Science 2022, 378, eabq5358.
7. Struelens, M.J.; Brisse, S. From molecular to genomic epidemiology: Transforming surveillance and control of infectious diseases. Eurosurveillance 2013, 18, 20386.
8. Aksamentov, I.; Roemer, C.; Hodcroft, E.; Neher, R. Nextclade: Clade assignment, mutation calling and quality control for viral genomes. J. Open Source Softw. 2021, 6, 3773.
9. O’Toole, Á.; Scher, E.; Underwood, A.; Jackson, B.; Hill, V.; McCrone, J.T.; Colquhoun, R.; Ruis, C.; Abu-Dahab, K.; Taylor, B.; et al. Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool. Virus Evol. 2021, 7, veab064.
10. Chen, C.; Nadeau, S.; Yared, M.; Voinov, P.; Xie, N.; Roemer, C.; Stadler, T. CoV-Spectrum: Analysis of globally shared SARS-CoV-2 data to identify and characterize new variants. Bioinformatics 2022, 38, 1735–1737.
11. Borges, V.; Pinheiro, M.; Pechirra, P.; Guiomar, R.; Gomes, J.P. INSaFLU: An automated open web-based bioinformatics suite “from-reads” for influenza whole-genome-sequencing-based surveillance. Genome Med. 2018, 10, 46.
12. Hadfield, J.; Megill, C.; Bell, S.M.; Huddleston, J.; Potter, B.; Callender, C.; Sagulenko, P.; Bedford, T.; Neher, R.A. Nextstrain: Real-time tracking of pathogen evolution. Bioinformatics 2018, 34, 4121–4123.
13. Vilsker, M.; Moosa, Y.; Nooij, S.; Fonseca, V.; Ghysens, Y.; Dumon, K.; Pauwels, R.; Alcantara, L.C.; Vanden Eynden, E.; Vandamme, A.M.; et al. Genome Detective: An automated system for virus identification from high-throughput sequencing data. Bioinformatics 2019, 35, 871–873.
14. Uelze, L.; Grützke, J.; Borowiak, M.; Hammerl, J.A.; Juraschek, K.; Deneke, C.; Tausch, S.H.; Malorny, B. Typing methods based on whole genome sequencing data. One Health Outlook 2020, 2, 3.
15. Seemann, T. mlst. Available online: https://github.com/tseemann/mlst (accessed on 22 January 2024).
16. Jolley, K.A.; Maiden, M.C. BIGSdb: Scalable analysis of bacterial genome variation at the population level. BMC Bioinform. 2010, 11, 595.
17. Wick, R.R.; Heinz, E.; Holt, K.E.; Wyres, K.L. Kaptive Web: User-Friendly Capsule and Lipopolysaccharide Serotype Prediction for Klebsiella Genomes. J. Clin. Microbiol. 2018, 56, e00197-18.
18. Zhang, S.; den Bakker, H.C.; Li, S.; Chen, J.; Dinsmore, B.A.; Lane, C.; Lauer, A.C.; Fields, P.I.; Deng, X. SeqSero2: Rapid and Improved Salmonella Serotype Determination Using Whole-Genome Sequencing Data. Appl. Environ. Microbiol. 2019, 85, e01746-19.
19. Florensa, A.F.; Kaas, R.S.; Clausen, P.T.L.C.; Aytan-Aktug, D.; Aarestrup, F.M. ResFinder—An open online resource for identification of antimicrobial resistance genes in next-generation sequencing data and prediction of phenotypes from genotypes. Microb. Genom. 2022, 8, 000748.
20. Kleinheinz, K.A.; Joensen, K.G.; Larsen, M.V. Applying the ResFinder and VirulenceFinder web-services for easy identification of acquired antibiotic resistance and E. coli virulence genes in bacteriophage and prophage nucleotide sequences. Bacteriophage 2014, 4, e27943.
21. Seemann, T. ABRicate. Available online: https://github.com/tseemann/abricate (accessed on 20 December 2023).
22. Köster, J.; Rahmann, S. Snakemake—A scalable bioinformatics workflow engine. Bioinformatics 2012, 28, 2520–2522.
23. ABIView. Available online: https://emboss.sourceforge.net/apps/cvs/emboss/apps/abiview.html (accessed on 20 December 2023).
24. Bolger, A.M.; Lohse, M.; Usadel, B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 2014, 30, 2114–2120.
25. Bankevich, A.; Nurk, S.; Antipov, D.; Gurevich, A.A.; Dvorkin, M.; Kulikov, A.S.; Lesin, V.M.; Nikolenko, S.I.; Pham, S.; Prjibelski, A.D.; et al. SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 2012, 19, 455–477.
26. De Coster, W.; D’Hert, S.; Schultz, D.T.; Cruts, M.; Van Broeckhoven, C. NanoPack: Visualizing and processing long-read sequencing data. Bioinformatics 2018, 34, 2666–2669.
27. Vaser, R.; Šikić, M. Time- and memory-efficient genome assembly with Raven. Nat. Comput. Sci. 2021, 1, 332–336.
28. Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R.; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009, 25, 2078–2079.
29. ReporType. Available online: https://github.com/insapathogenomics/reportype (accessed on 27 February 2024).
30. Chapter 7: Manual for the Laboratory-based Surveillance of Measles, Rubella, and Congenital Rubella Syndrome. Available online: https://www.who.int/publications/m/item/chapter-7-manual-for-the-laboratory-based-surveillance-of-measles-rubella-and-congenital-rubella-syndrome (accessed on 27 February 2024).
31. Schoch, C.L.; Ciufo, S.; Domrachev, M.; Hotton, C.L.; Kannan, S.; Khovanskaya, R.; Leipe, D.; Mcveigh, R.; O’Neill, K.; Robbertse, B.; et al. NCBI Taxonomy: A Comprehensive Update on Curation, Resources and Tools—Measles. Database (Oxford), 2020. Available online: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?name=Measles+morbillivirus (accessed on 20 December 2023).
32. Manual for the Laboratory-Based Surveillance of Measles, Rubella, and Congenital Rubella Syndrome. Available online: https://www.who.int/publications/m/item/chapter-1-manual-for-the-laboratory-based-surveillance-of-measles-rubella-and-congenital-rubella-syndrome (accessed on 20 December 2023).
33. NCBI Virus Database—Taxid: 11234. Available online: https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/ (accessed on 13 June 2023).
34. Namuwulya, P.; Bukenya, H.; Tushabe, P.; Tweyongyere, R.; Bwogi, J.; Cotten, M.; Phan, M.V.T. Near-Complete Genome Sequences of Measles Virus Strains from 10 Years of Uganda Country-wide Surveillance. Microbiol. Resour. Announc. 2022, 11, e0060622.
35. Alexander, D.J.; Aldous, E.W.; Fuller, C.M. The long view: A selective review of 40 years of Newcastle disease research. Avian Pathol. 2012, 41, 329–335.
36. Dimitrov, K.M.; Abolnik, C.; Afonso, C.L.; Albina, E.; Bahl, J.; Berg, M.; Briand, F.-X.; Brown, I.H.; Choi, K.-S.; Chvala, I.; et al. Updated unified phylogenetic classification system and revised nomenclature for Newcastle disease virus. Infect. Genet. Evol. 2019, 74, 103917.
37. Sun, J.; Ai, H.; Chen, L.; Li, L.; Shi, Q.; Liu, T.; Zhao, R.; Zhang, C.; Han, Z.; Liu, S. Surveillance of Class I Newcastle Disease Virus at Live Bird Markets in China and Identification of Variants with Increased Virulence and Replication Capacity. J. Virol. 2022, 96, e0024122.
38. Dwivedi, V.D.; Tripathi, I.P.; Tripathi, R.C.; Bharadwaj, S.; Mishra, S.K. Genomics, proteomics and evolution of dengue virus. Brief. Funct. Genom. 2017, 16, 217–227.
39. Mendes, C.I.; Lizarazo, E.; Machado, M.P.; Silva, D.N.; Tami, A.; Ramirez, M.; Couto, N.; Rossen, J.W.A.; Carriço, J.A. DEN-IM: Dengue virus genotyping from amplicon and shotgun metagenomic sequencing. Microb. Genom. 2020, 6, e000328.
40. Rattanaburi, S.; Sawaswong, V.; Nimsamer, P.; Mayuramart, O.; Sivapornnukul, P.; Khamwut, A.; Chanchaem, P.; Kongnomnan, K.; Suntronwong, N.; Poovorawan, Y.; et al. Genome characterization and mutation analysis of human influenza A virus in Thailand. Genom. Inform. 2022, 20, e21.
41. King, J.; Harder, T.; Beer, M.; Pohlmann, A. Rapid multiplex MinION nanopore sequencing workflow for Influenza A viruses. BMC Infect. Dis. 2020, 20, 648.
42. Tagnouokam-Ngoupo, P.A.; Ngoufack, M.N.; Kenmoe, S.; Lissock, S.F.; Amougou-Atsama, M.; Banai, R.; Ngono, L.; Njouom, R. Hepatitis C virus genotyping based on Core and NS5B regions in Cameroonian patients. Virol. J. 2019, 16, 101.
43. Ramos, D.; Pinto, M.; Sousa Coutinho, R.; Silva, C.; Quina, M.; Gomes, J.P.; Pádua, E. Looking at the Molecular Target of NS5A Inhibitors throughout a Population Highly Affected with Hepatitis C Virus. Pathogens 2023, 12, 754.
44. Schoch, C.L.; Ciufo, S.; Domrachev, M.; Hotton, C.L.; Kannan, S.; Khovanskaya, R.; Leipe, D.; Mcveigh, R.; O’Neill, K.; Robbertse, B.; et al. NCBI Taxonomy: A Comprehensive Update on Curation, Resources and Tools—HTLV-1. Database (Oxford), 2020. Available online: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?name=HTLV-1 (accessed on 20 December 2023).
45. Pádua, E.; Rodés, B.; Pérez-Piñar, T.; Silva, A.F.; Jiménez, V.; Ferreira, F.; Toro, C. Molecular characterization of human T cell leukemia virus type 1 subtypes in a group of infected individuals diagnosed in Portugal and Spain. AIDS Res. Hum. Retroviruses 2011, 27, 317–322.
46. Quina, M.; Ramos, D.; Silva, C.; Pádua, E. Diversity of Human T-Lymphotropic Virus Type 1 Cosmopolitan Subtype (HTLV-1a) Circulating in Infected Residents in Portugal. AIDS Res. Hum. Retroviruses 2023.
47. Nunes, A.; Borrego, M.J.; Nunes, B.; Florindo, C.; Gomes, J.P. Evolutionary dynamics of ompA, the gene encoding the Chlamydia trachomatis key antigen. J. Bacteriol. 2009, 191, 7182–7192.
48. Borges, V.; Cordeiro, D.; Salas, A.I.; Lodhia, Z.; Correia, C.; Isidro, J.; Fernandes, C.; Rodrigues, A.M.; Azevedo, J.; Alves, J.; et al. Chlamydia trachomatis: When the virulence-associated genome backbone imports a prevalence-associated major antigen signature. Microb. Genom. 2019, 5, e000313.
49. Harris, S.R.; Clarke, I.N.; Seth-Smith, H.M.; Solomon, A.W.; Cutcliffe, L.T.; Marsh, P.; Skilton, R.J.; Holland, M.J.; Mabey, D.; Peeling, R.W.; et al. Whole-genome analysis of diverse Chlamydia trachomatis strains identifies phylogenetic relationships masked by current clinical typing. Nat. Genet. 2012, 44, 413–419.
50. O’Neill, C.E.; Skilton, R.J.; Forster, J.; Cleary, D.W.; Pearson, S.A.; Lampe, D.J.; Thomson, N.R.; Clarke, I.N. An inducible transposon mutagenesis approach for the intracellular human pathogen Chlamydia trachomatis. Wellcome Open Res. 2021, 6, 312.
51. Seth-Smith, H.M.; Harris, S.R.; Skilton, R.J.; Radebe, F.M.; Golparian, D.; Shipitsyna, E.; Duy, P.T.; Scott, P.; Cutcliffe, L.T.; O’Neill, C.; et al. Whole-genome sequences of Chlamydia trachomatis directly from clinical samples without culture. Genome Res. 2013, 23, 855–866.
52. Hadfield, J.; Harris, S.R.; Seth-Smith, H.M.B.; Parmar, S.; Andersson, P.; Giffard, P.M.; Schachter, J.; Moncada, J.; Ellison, L.; Vaulet, M.L.G.; et al. Comprehensive global genome dynamics of Chlamydia trachomatis show ancient diversification followed by contemporary mixing and recent lineage expansion. Genome Res. 2017, 27, 1220–1229.
53. Underwood, A.P.; Jones, G.; Mentasti, M.; Fry, N.K.; Harrison, T.G. Comparison of the Legionella pneumophila population structure as determined by sequence-based typing and whole genome sequencing. BMC Microbiol. 2013, 13, 302.
54. Moran-Gilad, J.; Prior, K.; Yakunin, E.; Harrison, T.G.; Underwood, A.; Lazarovitch, T.; Valinsky, L.; Luck, C.; Krux, F.; Agmon, V.; et al. Design and application of a core genome multilocus sequence typing scheme for investigation of Legionnaires’ disease incidents. Euro Surveill. 2015, 20, 21186.
55. Gaia, V.; Fry, N.K.; Afshar, B.; Lück, P.C.; Meugnier, H.; Etienne, J.; Peduzzi, R.; Harrison, T.G. Consensus sequence-based scheme for epidemiological typing of clinical and environmental isolates of Legionella pneumophila. J. Clin. Microbiol. 2005, 43, 2047–2052.
56. Seemann, T. Legsta. Available online: https://github.com/tseemann/legsta (accessed on 20 December 2023).
57. Cazalet, C.; Jarraud, S.; Ghavi-Helm, Y.; Kunst, F.; Glaser, P.; Etienne, J.; Buchrieser, C. Multigenome analysis identifies a worldwide distributed epidemic Legionella pneumophila clone that emerged within a highly diverse species. Genome Res. 2008, 18, 431–441.
58. Cao, B.; Yao, F.; Liu, X.; Feng, L.; Wang, L. Development of a DNA microarray method for detection and identification of all 15 distinct O-antigen forms of Legionella pneumophila. Appl. Environ. Microbiol. 2013, 79, 6647–6654.
59. Borges, V.; Nunes, A.; Sampaio, D.A.; Vieira, L.; Machado, J.; Simões, M.J.; Gonçalves, P.; Gomes, J.P. Legionella pneumophila strain associated with the first evidence of person-to-person transmission of Legionnaires’ disease: A unique mosaic genetic backbone. Sci. Rep. 2016, 6, 26261.
60. Kozak-Muiznieks, N.A.; Morrison, S.S.; Mercante, J.W.; Ishaq, M.K.; Johnson, T.; Caravas, J.; Lucas, C.E.; Brown, E.; Raphael, B.H.; Winchell, J.M. Comparative genome analysis reveals a complex population structure of Legionella pneumophila subspecies. Infect. Genet. Evol. 2018, 59, 172–185.
61. Khan, M.A.; Knox, N.; Prashar, A.; Alexander, D.; Abdel-Nour, M.; Duncan, C.; Tang, P.; Amatullah, H.; Dos Santos, C.C.; Tijet, N.; et al. Comparative Genomics Reveal that Host-Innate Immune Responses Influence the Clinical Prevalence of Legionella pneumophila Serogroups. PLoS ONE 2013, 8, e67298.
62. David, S.; Sánchez-Busó, L.; Harris, S.R.; Marttinen, P.; Rusniok, C.; Buchrieser, C.; Harrison, T.G.; Parkhill, J. Dynamics and impact of homologous recombination on the evolution of Legionella pneumophila. PLoS Genet. 2017, 13, e1006855.
Input Format (Sequencing Technology) | Software | Action(s) |
---|---|---|
AB1 (Sanger) | ABIView [23] | Trimming/Conversion to FASTA |
FASTQ or FASTQ.gz (Illumina, single or paired-end) | Trimmomatic [24] | Quality control/Trimming |
FASTQ or FASTQ.gz (Illumina, single or paired-end) | SPAdes [25] | de novo assembly |
FASTQ or FASTQ.gz (ONT) | NanoFilt [26] | Quality control/Trimming |
FASTQ or FASTQ.gz (ONT) | Raven [27] | de novo assembly |
SINGLE or MULTI-FASTA (all) | ABRicate [21] | Locus screening/typing and Reporting |
ReporType tabular report (all) | SAMtools [28] | Extraction of match sequences |
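
The mapping in the table above, from input format to the software applied, can be sketched as a simple lookup. This is only an illustration of the table's content, not ReporType's actual code: the dictionary keys (`"ab1"`, `"illumina"`, `"ont"`, `"fasta"`) and the helper name `steps_for` are hypothetical, and the sketch assumes that pre-processed reads always continue to the screening and extraction steps listed for "all" inputs.

```python
# Illustrative lookup of the steps listed in the table above.
# Keys and function names are hypothetical; they are not ReporType's API.

# Format-specific pre-processing: (software, action) pairs in table order.
PREPROCESSING = {
    "ab1": [("ABIView", "trimming/conversion to FASTA")],
    "illumina": [
        ("Trimmomatic", "quality control/trimming"),
        ("SPAdes", "de novo assembly"),
    ],
    "ont": [
        ("NanoFilt", "quality control/trimming"),
        ("Raven", "de novo assembly"),
    ],
    "fasta": [],  # single or multi-FASTA input needs no pre-processing
}

# Steps the table lists for all inputs (assumed here to follow pre-processing).
COMMON_STEPS = [
    ("ABRicate", "locus screening/typing and reporting"),
    ("SAMtools", "extraction of match sequences"),
]


def steps_for(input_format: str) -> list[tuple[str, str]]:
    """Return the ordered (software, action) pairs for one input format."""
    key = input_format.lower()
    if key not in PREPROCESSING:
        raise ValueError(f"unsupported input format: {input_format!r}")
    return PREPROCESSING[key] + COMMON_STEPS


if __name__ == "__main__":
    for software, action in steps_for("ont"):
        print(f"{software}: {action}")
```

Calling `steps_for("illumina")` would list Trimmomatic, SPAdes, ABRicate and SAMtools in that order, mirroring the rows of the table.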

Configuration (Sequencing Technology) | Tool | Parameters |
---|---|---|
General (all) | ReporType | sample_directory, input_format, database (or ‘fasta_db’ and ‘table_db’ to set up a new database), output_name, output_directory, multi_fasta, threads, prioritize |
General (all) | Snakemake | config, np, configfile, snakefile |
Specific (Sanger) | ABIView | startbase, endbase |
Specific (Illumina) | Trimmomatic | illuminaclip, headcrop, crop, slidingwindow, minlen, leading, trailing, encoding |
Specific (ONT) | NanoFilt | quality, length, maxlength, headcrop, Trailcrop |
Specific (ONT) | Raven | Kmer, polishing |
Specific (all) | ABRicate | minid, mincov |
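
As a rough illustration of how the parameter names in the table above could be used, the following sketch groups them by tool and flags names that the table does not list. It is a hypothetical helper, not part of ReporType: the function names and the example configuration values are assumptions, and the parameter names are copied as spelled in the table.

```python
# Parameter names copied from the configuration table, grouped by tool.
# The helper functions below are hypothetical and not part of ReporType.
PARAMETERS_BY_TOOL = {
    "ReporType": {
        "sample_directory", "input_format", "database", "fasta_db",
        "table_db", "output_name", "output_directory", "multi_fasta",
        "threads", "prioritize",
    },
    "Snakemake": {"config", "np", "configfile", "snakefile"},
    "ABIView": {"startbase", "endbase"},
    "Trimmomatic": {
        "illuminaclip", "headcrop", "crop", "slidingwindow",
        "minlen", "leading", "trailing", "encoding",
    },
    "NanoFilt": {"quality", "length", "maxlength", "headcrop", "Trailcrop"},
    "Raven": {"Kmer", "polishing"},
    "ABRicate": {"minid", "mincov"},
}


def tools_for(parameter: str) -> list[str]:
    """Return the tools (sorted by name) that list the given parameter."""
    return sorted(
        tool for tool, names in PARAMETERS_BY_TOOL.items() if parameter in names
    )


def unknown_parameters(config: dict) -> list[str]:
    """Return the config keys that do not appear anywhere in the table."""
    known = set().union(*PARAMETERS_BY_TOOL.values())
    return sorted(name for name in config if name not in known)


if __name__ == "__main__":
    # Placeholder values for illustration only; not recommended settings.
    example = {"input_format": "fastq", "threads": 4, "minid": 90, "min_id": 90}
    print(unknown_parameters(example))  # ['min_id']
    print(tools_for("headcrop"))        # ['NanoFilt', 'Trimmomatic']
```

A check of this kind would catch a misspelled key such as `min_id` before a run starts; note that `headcrop` appears under both Trimmomatic and NanoFilt, so a real configuration would need to keep the two apart.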
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cruz, H.; Pinheiro, M.; Borges, V. ReporType: A Flexible Bioinformatics Tool for Targeted Loci Screening and Typing of Infectious Agents. Int. J. Mol. Sci. 2024, 25, 3172. https://doi.org/10.3390/ijms25063172