Open Access Article
Viruses 2016, 8(2), 53; doi:10.3390/v8020053

Identification of Known and Novel Recurrent Viral Sequences in Data from Multiple Patients and Multiple Cancers

1 Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
2 Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark
3 Department of Autoimmunology and Biomarkers, Statens Serum Institut, DK-2300 Copenhagen S, Denmark
4 NNF Center for Protein Research, University of Copenhagen, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark
These authors contributed equally to this work.
* Author to whom correspondence should be addressed.
Academic Editor: Marcus Thomas Gilbert
Received: 30 October 2015 / Revised: 29 January 2016 / Accepted: 5 February 2016 / Published: 19 February 2016

Abstract

Virus discovery from high-throughput sequencing data often follows a bottom-up approach where taxonomic annotation takes place prior to association with disease. Albeit effective in some cases, the approach fails to detect novel pathogens and remote variants not present in reference databases. We have developed a species-independent pipeline that utilises sequence clustering for the identification of nucleotide sequences that co-occur across multiple sequencing data instances. We applied the workflow to 686 sequencing libraries from 252 cancer samples of different cancer and tissue types, 32 non-template controls, and 24 test samples. Recurrent sequences were statistically associated with biological, methodological or technical features, with the aim of identifying novel pathogens or plausible contaminants that may be associated with a particular kit or method. We provide examples of identified inhabitants of the healthy tissue flora as well as experimental contaminants. Unmapped sequences that co-occur with high statistical significance potentially represent the unknown sequence space where novel pathogens can be identified.
Keywords: sequence clustering; taxonomic characterisation; novel sequence identification; next generation sequencing; cancer causing viruses; oncoviruses; assay contamination
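
The association step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the statistical test or the data layout, so everything below is an assumption made for illustration: a one-sided Fisher's exact test, a toy presence table mapping each sequence cluster to the libraries in which it occurs, and hypothetical names (`cluster_members`, `kit_a_libs`, `associate`). It is not the authors' pipeline.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability of observing a or more libraries that both
    carry the feature and contain the cluster, given the table margins.
    """
    with_feature = a + b        # libraries carrying the feature
    with_cluster = a + c        # libraries containing the cluster
    n = a + b + c + d           # all libraries
    upper = min(with_feature, with_cluster)
    favourable = sum(
        comb(with_feature, k) * comb(n - with_feature, with_cluster - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(n, with_cluster)


def associate(cluster_members, feature_libs, all_libs):
    """Test every cluster for enrichment among libraries with the feature."""
    p_values = {}
    for cluster, libs in cluster_members.items():
        a = len(libs & feature_libs)             # feature, cluster present
        b = len(feature_libs - libs)             # feature, cluster absent
        c = len(libs - feature_libs)             # no feature, cluster present
        d = len(all_libs - feature_libs - libs)  # no feature, cluster absent
        p_values[cluster] = fisher_exact_greater(a, b, c, d)
    return p_values


# Toy data (hypothetical): eight libraries, four prepared with "kit A".
all_libs = {f"lib{i}" for i in range(1, 9)}
kit_a_libs = {"lib1", "lib2", "lib3", "lib4"}
cluster_members = {
    "cluster_1": {"lib1", "lib2", "lib3", "lib4"},  # only in kit A libraries
    "cluster_2": {"lib1", "lib5"},                  # spread across both groups
}

p_values = associate(cluster_members, kit_a_libs, all_libs)
for cluster, p in sorted(p_values.items()):
    print(f"{cluster}: p = {p:.4f}")
```

In this toy table, a cluster confined to the kit A libraries receives a small p-value, which is the pattern one would expect from a kit-associated contaminant, while a cluster spread across both groups does not. When many clusters and features are tested, the p-values would also need a multiple-testing correction.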
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Friis-Nielsen, J.; Kjartansdóttir, K.R.; Mollerup, S.; Asplund, M.; Mourier, T.; Jensen, R.H.; Hansen, T.A.; Rey-Iglesia, A.; Richter, S.R.; Nielsen, I.B.; Alquezar-Planas, D.E.; Olsen, P.V.S.; Vinner, L.; Fridholm, H.; Nielsen, L.P.; Willerslev, E.; Sicheritz-Pontén, T.; Lund, O.; Hansen, A.J.; Izarzugaza, J.M.G.; Brunak, S. Identification of Known and Novel Recurrent Viral Sequences in Data from Multiple Patients and Multiple Cancers. Viruses 2016, 8, 53.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Viruses, EISSN 1999-4915, published by MDPI AG, Basel, Switzerland