Article

JAMPI: Efficient Matrix Multiplication in Spark Using Barrier Execution Mode

Tamas Foldi, Chris von Csefalvay and Nicolas A. Perez
1 Starschema Inc., Arlington, VA 22066, USA
2 Google Inc., Seattle, WA 98103, USA
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2020, 4(4), 32; https://doi.org/10.3390/bdcc4040032
Received: 10 July 2020 / Revised: 12 October 2020 / Accepted: 26 October 2020 / Published: 5 November 2020
The new barrier execution mode in Apache Spark allows distributed deep learning training to be embedded as a Spark stage, simplifying the distributed training workflow. In Spark, a task in a stage does not depend on any other task in the same stage and can therefore be scheduled independently. However, several algorithms require more sophisticated inter-task communication, similar to the MPI paradigm. By combining distributed message passing (using asynchronous network I/O), OpenJDK’s new auto-vectorization, and Spark’s barrier execution mode, we can add non-map/reduce algorithms, such as Cannon’s distributed matrix multiplication, to Spark. We document an efficient distributed matrix multiplication using Cannon’s algorithm, which significantly improves on the performance of the existing MLlib implementation. Used within a barrier task, the algorithm described herein yields up to a 24% performance increase on a 10,000 × 10,000 square matrix with a significantly lower memory footprint. Applications of efficient matrix multiplication include, among others, accelerating the training and inference of deep convolutional neural network-based workloads, and thus such efficient algorithms can play a ground-breaking role in the faster and more efficient execution of even the most complicated machine learning tasks.
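To make the execution model concrete, the following minimal Scala sketch shows the two barrier-mode primitives on which an MPI-style block exchange such as Cannon’s algorithm can be built: a stage-wide rendezvous and peer discovery. This is an illustrative sketch, not the paper’s JAMPI implementation; the 2 × 2 task grid, the local[4] master, and the printed output are assumptions made for the example.

```scala
import org.apache.spark.BarrierTaskContext
import org.apache.spark.sql.SparkSession

object BarrierSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("barrier-sketch")
      .master("local[4]") // assumption: local test run with 4 cores
      .getOrCreate()

    // Four partitions stand in for a 2 x 2 process grid, the layout
    // Cannon's algorithm uses; in barrier mode Spark schedules all
    // four tasks at once or not at all.
    val grid = spark.sparkContext.parallelize(0 until 4, 4)

    val peersSeen = grid.barrier().mapPartitions { _ =>
      val ctx = BarrierTaskContext.get()
      // Every task in the stage blocks here until all have arrived --
      // the precondition for MPI-style block shifting between tasks.
      ctx.barrier()
      // getTaskInfos() lists the peer tasks' addresses, which a
      // message-passing layer can use to exchange matrix blocks.
      val peers = ctx.getTaskInfos().map(_.address)
      Iterator((ctx.partitionId(), peers.length))
    }.collect()

    peersSeen.foreach { case (pid, n) => println(s"task $pid sees $n peers") }
    spark.stop()
  }
}
```

Because ctx.barrier() guarantees that every task in the stage has started, a message-passing layer built on asynchronous network I/O can safely connect to the addresses returned by getTaskInfos() and shift matrix blocks between rounds, which is exactly the communication pattern Cannon’s algorithm requires.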
Keywords: Apache Spark; distributed computing; distributed matrix algebra; deep learning; matrix primitives
MDPI and ACS Style

Foldi, T.; von Csefalvay, C.; Perez, N.A. JAMPI: Efficient Matrix Multiplication in Spark Using Barrier Execution Mode. Big Data Cogn. Comput. 2020, 4, 32. https://doi.org/10.3390/bdcc4040032

AMA Style

Foldi T, von Csefalvay C, Perez NA. JAMPI: Efficient Matrix Multiplication in Spark Using Barrier Execution Mode. Big Data and Cognitive Computing. 2020; 4(4):32. https://doi.org/10.3390/bdcc4040032

Chicago/Turabian Style

Foldi, Tamas, Chris von Csefalvay, and Nicolas A. Perez. 2020. "JAMPI: Efficient Matrix Multiplication in Spark Using Barrier Execution Mode." Big Data and Cognitive Computing 4, no. 4: 32. https://doi.org/10.3390/bdcc4040032

