Open Access Article

PipeMEM: A Framework to Speed Up BWA-MEM in Spark with Low Overhead

Communication & Computer Network Lab of Guangdong, School of Computer Science & Engineering, South China University of Technology, Wushan Road 381, Guangzhou 51000, China
* Author to whom correspondence should be addressed.
Genes 2019, 10(11), 886; https://doi.org/10.3390/genes10110886
Received: 14 September 2019 / Revised: 30 October 2019 / Accepted: 1 November 2019 / Published: 4 November 2019
(This article belongs to the Special Issue Impact of Parallel and High-Performance Computing in Genomics)
(1) Background: DNA sequence alignment is an essential step in genome analysis. BWA-MEM has been a prevalent single-node tool for genome alignment because of its high speed and accuracy. However, genome data are being generated at an exponentially growing rate, and handling such volumes requires a multi-node solution, which remains a challenge. Spark is a ubiquitous big data platform that has been used to help genome alignment meet this challenge. Nonetheless, existing works that use Spark to scale BWA-MEM suffer from high overhead. (2) Methods: In this paper, we presented PipeMEM, a framework that accelerates BWA-MEM with lower overhead by using the pipe operation in Spark. We additionally proposed a pipeline structure and in-memory computation to further accelerate PipeMEM. (3) Results: Our experiments showed that, on paired-end alignment tasks, our framework had low overhead. In a multi-node environment, our framework was, on average, 2.27× faster than BWASpark (an alignment tool in the Genome Analysis Toolkit (GATK)) and 2.33× faster than SparkBWA. (4) Conclusions: PipeMEM could accelerate BWA-MEM in the Spark environment with high performance and low overhead.
Keywords: BWA-MEM; Spark; low overhead
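
The pipe operation mentioned in the Methods streams each partition's records to an external process over standard input and collects that process's standard output lines as the result. The sketch below only emulates that per-partition behaviour with the Python standard library; it does not use Spark and is not the authors' implementation. The names `pipe_partition`, `pipe_partitions`, and `STAND_IN_ALIGNER` are hypothetical, and the stand-in command simply prefixes each line, where a real deployment would presumably invoke an aligner such as BWA-MEM.

```python
import subprocess
import sys

# Hypothetical stand-in for an external aligner: it reads records on stdin
# and writes one output line per record on stdout. This is an assumption for
# illustration; it performs no alignment.
STAND_IN_ALIGNER = [
    sys.executable,
    "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write('aligned:' + line)\n",
]


def pipe_partition(records, command):
    """Send one partition's records to `command` on stdin, one per line,
    and return the lines the command wrote to stdout."""
    payload = "".join(record + "\n" for record in records)
    result = subprocess.run(
        command, input=payload, capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def pipe_partitions(partitions, command):
    """Run the external command once per partition, independently."""
    return [pipe_partition(partition, command) for partition in partitions]


# Two toy partitions of read identifiers (made-up data).
partitions = [["read1", "read2"], ["read3"]]
piped = pipe_partitions(partitions, STAND_IN_ALIGNER)
```

Because the external tool is launched unchanged and communicates only through standard streams, this style of integration needs no modification of the tool itself, which is consistent with the low-overhead goal stated in the abstract.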

MDPI and ACS Style

Zhang, L.; Liu, C.; Dong, S. PipeMEM: A Framework to Speed Up BWA-MEM in Spark with Low Overhead. Genes 2019, 10, 886.

