Article

PipeMEM: A Framework to Speed Up BWA-MEM in Spark with Low Overhead

Communication & Computer Network Lab of Guangdong, School of Computer Science & Engineering, South China University of Technology, Wushan Road 381, Guangzhou 51000, China
* Author to whom correspondence should be addressed.
Genes 2019, 10(11), 886; https://doi.org/10.3390/genes10110886
Received: 14 September 2019 / Revised: 30 October 2019 / Accepted: 1 November 2019 / Published: 4 November 2019
(This article belongs to the Special Issue Impact of Parallel and High-Performance Computing in Genomics)
(1) Background: DNA sequence alignment is an essential step in genome analysis. BWA-MEM has been a prevalent single-node alignment tool because of its high speed and accuracy. However, genome data are being generated at an exponentially growing rate, and handling such volumes requires a multi-node solution, which remains a challenge. Spark is a widely used big data platform that has been exploited to help genome alignment meet this challenge. Nonetheless, existing works that use Spark to optimize BWA-MEM suffer from high overhead. (2) Methods: In this paper, we presented PipeMEM, a framework that accelerates BWA-MEM with lower overhead by using the pipe operation in Spark. We additionally proposed a pipeline structure and in-memory computation to accelerate PipeMEM further. (3) Results: Our experiments showed that, on paired-end alignment tasks, our framework had low overhead. In a multi-node environment, our framework was, on average, 2.27× faster than BWASpark (an alignment tool in the Genome Analysis Toolkit (GATK)) and 2.33× faster than SparkBWA. (4) Conclusions: PipeMEM could accelerate BWA-MEM in the Spark environment with high performance and low overhead.
Keywords: BWA-MEM; Spark; low overhead
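The pipe operation mentioned in the abstract streams the records of each data partition into the standard input of an external program and collects the lines the program writes to standard output. The following is a minimal single-machine sketch of that idea in plain Python, not the PipeMEM implementation. The function names `pipe_partition` and `pipe_all` are hypothetical, and a small uppercase filter stands in for the external aligner so that the example runs without Spark or BWA-MEM installed.

```python
import subprocess
import sys
from typing import Iterable, List, Sequence

# Stand-in for an external tool (assumption: in the paper's setting this
# would be a BWA-MEM process; here it is a filter that uppercases lines).
STAND_IN_COMMAND: List[str] = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line.upper())",
]


def pipe_partition(records: Iterable[str], command: Sequence[str]) -> List[str]:
    """Write one partition's records to the command's stdin, one per line,
    and return the lines the command prints to stdout."""
    payload = "".join(record + "\n" for record in records)
    completed = subprocess.run(
        list(command),
        input=payload,
        capture_output=True,
        text=True,
        check=True,  # raise if the external command exits with an error
    )
    return completed.stdout.splitlines()


def pipe_all(partitions: Iterable[Iterable[str]], command: Sequence[str]) -> List[str]:
    """Apply the command to each partition in turn and concatenate the
    outputs. A cluster framework would run the partitions in parallel on
    different nodes; this sketch runs them sequentially."""
    results: List[str] = []
    for partition in partitions:
        results.extend(pipe_partition(partition, command))
    return results


if __name__ == "__main__":
    reads = [["acgt", "ttga"], ["ggcc"]]  # two toy partitions of "reads"
    for line in pipe_all(reads, STAND_IN_COMMAND):
        print(line)
```

Because the external program is used unmodified and only its input and output streams are managed, this style of integration adds little code around the tool itself, which is consistent with the low-overhead goal stated in the abstract.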
MDPI and ACS Style

Zhang, L.; Liu, C.; Dong, S. PipeMEM: A Framework to Speed Up BWA-MEM in Spark with Low Overhead. Genes 2019, 10, 886. https://doi.org/10.3390/genes10110886

AMA Style

Zhang L, Liu C, Dong S. PipeMEM: A Framework to Speed Up BWA-MEM in Spark with Low Overhead. Genes. 2019; 10(11):886. https://doi.org/10.3390/genes10110886

Chicago/Turabian Style

Zhang, Lingqi, Cheng Liu, and Shoubin Dong. 2019. "PipeMEM: A Framework to Speed Up BWA-MEM in Spark with Low Overhead" Genes 10, no. 11: 886. https://doi.org/10.3390/genes10110886
