Open Access Article
Int. J. Mol. Sci. 2017, 18(10), 2124; doi:10.3390/ijms18102124

A Massively Parallel Sequence Similarity Search for Metagenomic Sequencing Data

1 Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology, 2-12-1 W8-76 Ookayama, Meguro-ku, Tokyo 152-8550, Japan
2 Education Academy of Computational Life Sciences (ACLS), Tokyo Institute of Technology, 4259 J3-141 Nagatsuta-cho, Midori-ku, Yokohama, Kanagawa 226-8503, Japan
3 Department of Computer Science, School of Computing, Tokyo Institute of Technology, 2-12-1 W8-76 Ookayama, Meguro-ku, Tokyo 152-8550, Japan
* Author to whom correspondence should be addressed.
Received: 31 August 2017 / Revised: 3 October 2017 / Accepted: 6 October 2017 / Published: 11 October 2017
(This article belongs to the Special Issue Deciphering the Human Microbiota: Methods and Impact on Human Health)

Abstract

Sequence similarity searches are widely used in the analysis of metagenomic sequencing data. Finding homologous sequences in a reference database enables the estimation of the taxonomic and functional characteristics of each query sequence. Because current metagenomic sequencing data consist of a large number of nucleotide sequences, the time required for sequence similarity searches accounts for a large proportion of the total analysis time. This time-consuming step makes it difficult to perform large-scale analyses. To analyze large-scale metagenomic data, such as those from the human oral microbiome, we developed GHOST-MP (Genome-wide HOmology Search Tool on Massively Parallel system), a parallel sequence similarity search tool for massively parallel computing systems. This tool uses a fast search algorithm based on suffix arrays of the query and database sequences, together with a hierarchical parallel search, to accelerate large-scale sequence similarity searches of metagenomic sequencing data. We evaluated the parallel computing efficiency and the search speed of the tool. GHOST-MP was shown to scale to over 10,000 CPU (Central Processing Unit) cores and achieved over 80-fold acceleration compared with mpiBLAST using the same computational resources. We applied this tool to human oral metagenomic data, and the results indicate that the oral cavity, the oral vestibule, and plaque have different characteristics in terms of functional gene categories.
Keywords: database search; sequence similarity search; metagenomics; human oral microbiome
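
The abstract mentions a search algorithm based on suffix arrays. As a rough illustration of the general idea only, the following Python sketch builds a suffix array over a single database sequence and uses binary search to find every position where a short seed occurs. It is not GHOST-MP's implementation: the function names `build_suffix_array` and `find_seed` are hypothetical, the construction is deliberately naive, and the sketch indexes only the database, whereas the abstract describes suffix arrays of both query and database sequences.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in sorted order.

    Naive construction for illustration; real tools use faster algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_seed(text, suffix_array, seed):
    """Return sorted positions in `text` where `seed` occurs exactly.

    Suffixes are sorted, so all suffixes starting with `seed` form one
    contiguous block in the suffix array; two binary searches find its bounds.
    """
    k = len(seed)

    # Lower bound: first suffix whose k-character prefix is >= seed.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + k] < seed:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose k-character prefix is > seed.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + k] <= seed:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


# Toy protein-like database sequence (made up for this example).
database = "MKVLAAGMKVL"
sa = build_suffix_array(database)
hits = find_seed(database, sa, "MKV")
```

In a seed-and-extend search, hits such as these would then be extended into full alignments and scored; that step is omitted here.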

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Kakuta, M.; Suzuki, S.; Izawa, K.; Ishida, T.; Akiyama, Y. A Massively Parallel Sequence Similarity Search for Metagenomic Sequencing Data. Int. J. Mol. Sci. 2017, 18, 2124.



Int. J. Mol. Sci. EISSN 1422-0067. Published by MDPI AG, Basel, Switzerland.