Article

Performance Analysis of Sparse Matrix-Vector Multiplication (SpMV) on Graphics Processing Units (GPUs)

1 Department of Computer and Information Sciences, Taibah University, Medina 42353, Saudi Arabia
2 Department of Computer Science, Aalto University, 02150 Espoo, Finland
3 Department of Computer Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia
4 High Performance Computing Center, King Abdulaziz University, Jeddah 21589, Saudi Arabia
* Author to whom correspondence should be addressed.
Electronics 2020, 9(10), 1675; https://doi.org/10.3390/electronics9101675
Received: 15 September 2020 / Revised: 2 October 2020 / Accepted: 6 October 2020 / Published: 13 October 2020
(This article belongs to the Section Computer Science & Engineering)
Graphics processing units (GPUs) have delivered remarkable performance for a variety of high performance computing (HPC) applications through massive parallelism. One such application is sparse matrix-vector (SpMV) computation, which is central to many scientific, engineering, and other applications, including machine learning. No single SpMV storage or computation scheme provides consistent and sufficiently high performance for all matrices, because their sparsity patterns vary. An extensive literature review reveals that the performance of SpMV techniques on GPUs has not been studied in sufficient detail. In this paper, we provide a detailed analysis of SpMV performance on GPUs using four notable sparse matrix storage schemes (compressed sparse row (CSR), ELLPACK (ELL), hybrid ELL/COO (HYB), and compressed sparse row 5 (CSR5)), five performance metrics (execution time, giga floating point operations per second (GFLOPS), achieved occupancy, instructions per warp, and warp execution efficiency), five matrix sparsity features (nnz, anpr, nprvariance, maxnpr, and distavg), and 17 sparse matrices from 10 application domains (chemical simulations, computational fluid dynamics (CFD), electromagnetics, linear programming, economics, etc.). Based on the insights gained from this analysis, we propose a technique called the heterogeneous CPU–GPU Hybrid (HCGHYB) scheme. It uses the CPU and the GPU in parallel and outperforms the HYB format with an average speedup of 1.7x. Heterogeneous computing is an important direction for SpMV and other application areas. Moreover, to the best of our knowledge, this is the first work in which SpMV performance on GPUs has been discussed in such depth. We believe that this work on SpMV performance analysis and the heterogeneous scheme will open up many new directions and improvements for the SpMV computing field.
Keywords: sparse matrix-vector multiplication (SpMV); high performance computing (HPC); sparse matrix storage; graphics processing units (GPUs); CSR; ELL; HYB; CSR5; parallelization; heterogeneous computing
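
As a rough illustration of the quantities named in the abstract, the following sequential Python sketch performs SpMV on a matrix in CSR form and derives a few per-row statistics. It is not the authors' GPU implementation. The function names, the toy matrix, and the readings of `anpr` (average non-zeros per row), `nprvariance` (variance of non-zeros per row), and `maxnpr` (maximum non-zeros per row) are assumptions made for this example; `distavg` is omitted because its definition is not given here. The GFLOPS helper uses the common convention of two floating-point operations (one multiply, one add) per stored non-zero.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (sequential sketch)."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Non-zeros of this row occupy positions row_ptr[row] .. row_ptr[row + 1] - 1.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def row_stats(row_ptr):
    """Per-row sparsity statistics; the feature meanings are assumed, see lead-in."""
    npr = [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]
    nnz = row_ptr[-1]
    mean = nnz / len(npr)
    variance = sum((c - mean) ** 2 for c in npr) / len(npr)
    return {"nnz": nnz, "anpr": mean, "nprvariance": variance, "maxnpr": max(npr)}


def spmv_gflops(nnz, seconds):
    """GFLOPS for one SpMV, counting one multiply and one add per non-zero."""
    return 2.0 * nnz / seconds / 1e9


# Hypothetical 4x4 example matrix:
# [[10, 0, 0, 2],
#  [ 0, 3, 0, 0],
#  [ 0, 0, 0, 0],
#  [ 1, 0, 4, 5]]
row_ptr = [0, 2, 3, 3, 6]
col_idx = [0, 3, 1, 0, 2, 3]
values = [10.0, 2.0, 3.0, 1.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0, 4.0]

y = csr_spmv(row_ptr, col_idx, values, x)   # [18.0, 6.0, 0.0, 33.0]
stats = row_stats(row_ptr)                  # nnz=6, anpr=1.5, nprvariance=1.25, maxnpr=3
```

The uneven row lengths in this toy matrix (0 to 3 non-zeros per row) hint at why one storage scheme may not suit every matrix: a format that pads every row to the longest row would store 12 entries here for only 6 non-zeros.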
MDPI and ACS Style

AlAhmadi, S.; Mohammed, T.; Albeshri, A.; Katib, I.; Mehmood, R. Performance Analysis of Sparse Matrix-Vector Multiplication (SpMV) on Graphics Processing Units (GPUs). Electronics 2020, 9, 1675. https://doi.org/10.3390/electronics9101675

AMA Style

AlAhmadi S, Mohammed T, Albeshri A, Katib I, Mehmood R. Performance Analysis of Sparse Matrix-Vector Multiplication (SpMV) on Graphics Processing Units (GPUs). Electronics. 2020; 9(10):1675. https://doi.org/10.3390/electronics9101675

Chicago/Turabian Style

AlAhmadi, Sarah; Mohammed, Thaha; Albeshri, Aiiad; Katib, Iyad; Mehmood, Rashid. 2020. "Performance Analysis of Sparse Matrix-Vector Multiplication (SpMV) on Graphics Processing Units (GPUs)" Electronics 9, no. 10: 1675. https://doi.org/10.3390/electronics9101675
