Informative Regions In Viral Genomes

Department of Microbiome Science, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
Max Planck Tandem Group in Computational Biology, Department of Biological Sciences, Universidad de los Andes, Bogotá 111711, Colombia
The Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, Saint Louis, MO 63108, USA
Author to whom correspondence should be addressed.
Academic Editors: Jennifer R. Brum and Simon Roux
Viruses 2021, 13(6), 1164
Received: 21 April 2021 / Revised: 25 May 2021 / Accepted: 27 May 2021 / Published: 18 June 2021
(This article belongs to the Special Issue Viromics)
Viruses, far from being just parasites affecting hosts’ fitness, are major players in any microbial ecosystem. In spite of their broad abundance, viruses, in particular bacteriophages, remain largely unknown, since only about 20% of sequences obtained from viral community DNA surveys could be annotated by comparison with public databases. To shed some light on this genetic dark matter, we expanded the search for orthologous groups as potential markers of viral taxonomy from bacteriophages to include eukaryotic viruses, establishing a set of 31,150 ViPhOGs (Eukaryotic Viruses and Phages Orthologous Groups). To do this, we examined the non-redundant viral diversity stored in public databases, predicted proteins in genomes lacking such information, and used all annotated and predicted proteins to identify potential protein domains. Domains and unannotated regions were clustered into orthologous groups using cogSoft. Finally, we employed a random forest implementation to classify genomes into their taxonomy and found that the presence or absence of ViPhOGs is significantly associated with their taxonomy. Furthermore, we established a set of 1457 ViPhOGs that, given their importance for the classification, could be considered markers or signatures for the different taxonomic groups defined by the ICTV at the order, family, and genus levels.
Keywords: eukaryotic viruses; phages; orthologous groups; random forest; ViPhOGs
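
The core idea in the abstract, that the presence or absence of an orthologous group can carry information about taxonomy, can be illustrated with a small sketch. This is not the authors' pipeline: the study used a random forest, whereas the sketch below swaps in a much simpler per-group information gain score so that it runs with the Python standard library alone. The genome names, family labels, and group IDs are hypothetical toy data, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data: genome -> (family label, orthologous groups present).
# All names, labels, and group IDs are invented for illustration only.
GENOMES = {
    "genome_A": ("Family1", {"OG1", "OG2"}),
    "genome_B": ("Family1", {"OG1", "OG3"}),
    "genome_C": ("Family2", {"OG2", "OG4"}),
    "genome_D": ("Family2", {"OG3", "OG4"}),
}


def presence_matrix(genomes):
    """Build a genome x group matrix of 1 (present) / 0 (absent)."""
    groups = sorted(set().union(*(ogs for _, ogs in genomes.values())))
    names = sorted(genomes)
    matrix = [[1 if g in genomes[n][1] else 0 for g in groups] for n in names]
    labels = [genomes[n][0] for n in names]
    return names, groups, matrix, labels


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on presence/absence."""
    remainder = 0.0
    for value in (0, 1):
        subset = [lab for x, lab in zip(column, labels) if x == value]
        if subset:
            remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


def rank_groups(genomes):
    """Rank groups by how well their presence/absence separates the labels."""
    _, groups, matrix, labels = presence_matrix(genomes)
    scores = {g: information_gain([row[j] for row in matrix], labels)
              for j, g in enumerate(groups)}
    # Highest score first; ties broken alphabetically by group ID.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    for group, score in rank_groups(GENOMES):
        print(f"{group}\t{score:.3f}")
```

In this toy set, OG1 and OG4 each occur in exactly one family, so they rank highest, while OG2 and OG3 are shared across families and score zero. A random forest, as used in the study, additionally captures combinations of groups rather than scoring each one in isolation.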
MDPI and ACS Style

Moreno-Gallego, J.L.; Reyes, A. Informative Regions In Viral Genomes. Viruses 2021, 13, 1164.

AMA Style

Moreno-Gallego JL, Reyes A. Informative Regions In Viral Genomes. Viruses. 2021; 13(6):1164.

Chicago/Turabian Style

Moreno-Gallego, Jaime Leonardo, and Alejandro Reyes. 2021. "Informative Regions In Viral Genomes" Viruses 13, no. 6: 1164.
