Article

Dynamic SIMD Parallel Execution on GPU from High-Level Dataflow Synthesis †

Aurelien Bloch, Simone Casale-Brunet and Marco Mattavelli
EPFL SCI-STI-MM, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
* Author to whom correspondence should be addressed.
† This paper is an extended version of our paper published at the 14th IEEE MCSoC 2021.
Academic Editors: Sanghamitra Roy and Andrea Acquaviva
J. Low Power Electron. Appl. 2022, 12(3), 40; https://doi.org/10.3390/jlpea12030040
Received: 10 March 2022 / Revised: 5 April 2022 / Accepted: 11 July 2022 / Published: 17 July 2022
Developing and fine-tuning software programs for heterogeneous hardware such as CPU/GPU processing platforms is a highly complex endeavor that demands considerable time and effort from software engineers and requires evaluating various fundamental components and features of both the design and the platform to maximize the overall performance. The dataflow programming approach has proven to be an appropriate methodology for reaching such a difficult and complex goal, thanks to its intrinsic portability and the possibility of easily decomposing a network of actors onto the different processing units of the heterogeneous hardware. Nonetheless, such a design method might not be sufficient on its own to achieve the desired performance goals, and supporting tools are needed to efficiently explore the design space and optimize the desired performance objectives. This article presents a methodology composed of several stages for enhancing the performance of dataflow software developed in RVC-CAL and for generating low-level implementations to be executed on GPU/CPU heterogeneous hardware platforms. The stages comprise a method for the efficient scheduling of parallel CUDA partitions, an optimization of the performance of the data transmission tasks across computing kernels, and the exploitation of dynamic programming for introducing SIMD-capable execution on graphics processing units. The methodology is validated both quantitatively and qualitatively by means of dataflow application examples running on the platforms according to various mapping configurations.
Keywords: heterogeneous systems; GPU programming; source-to-source compiler; parallel computing; SIMD; RVC-CAL; dynamic dataflow programs
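The paper's toolchain synthesizes CUDA code from RVC-CAL dataflow programs; the snippet below is only a minimal, hypothetical sketch of the general idea described in the abstract: batching many firings of the same dataflow actor into a single kernel launch so that each GPU thread executes one firing in SIMD fashion, with explicit host/device transfers standing in for inter-partition data transmission. All names (actor_fire, fire_actor_simd) and the toy actor itself are illustrative assumptions, not code generated by the tool presented in the article.

```cuda
// Minimal sketch (illustrative only): SIMD-style batched execution of
// dataflow actor firings on a GPU. Thread i performs firing i.
#include <cstdio>
#include <cuda_runtime.h>

// One firing of a hypothetical "scale-and-offset" actor:
// consumes one token, produces one token.
__device__ int actor_fire(int token) {
    return 2 * token + 1;
}

// Batched execution of n_firings firings of the same actor.
__global__ void fire_actor_simd(const int* in_fifo, int* out_fifo, int n_firings) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n_firings) {
        out_fifo[i] = actor_fire(in_fifo[i]);
    }
}

int main() {
    const int n = 1 << 16;                       // number of batched firings
    int *h_in = new int[n], *h_out = new int[n];
    for (int i = 0; i < n; ++i) h_in[i] = i;

    int *d_in, *d_out;
    cudaMalloc((void**)&d_in,  n * sizeof(int));
    cudaMalloc((void**)&d_out, n * sizeof(int));

    // Host-to-device transfer of the input FIFO contents.
    cudaMemcpy(d_in, h_in, n * sizeof(int), cudaMemcpyHostToDevice);

    // One thread per firing.
    int block = 256, grid = (n + block - 1) / block;
    fire_actor_simd<<<grid, block>>>(d_in, d_out, n);

    // Device-to-host transfer of the produced tokens.
    cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    printf("out[42] = %d\n", h_out[42]);         // expected: 85

    cudaFree(d_in); cudaFree(d_out);
    delete[] h_in; delete[] h_out;
    return 0;
}
```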
MDPI and ACS Style

Bloch, A.; Casale-Brunet, S.; Mattavelli, M. Dynamic SIMD Parallel Execution on GPU from High-Level Dataflow Synthesis. J. Low Power Electron. Appl. 2022, 12, 40. https://doi.org/10.3390/jlpea12030040

AMA Style

Bloch A, Casale-Brunet S, Mattavelli M. Dynamic SIMD Parallel Execution on GPU from High-Level Dataflow Synthesis. Journal of Low Power Electronics and Applications. 2022; 12(3):40. https://doi.org/10.3390/jlpea12030040

Chicago/Turabian Style

Bloch, Aurelien, Simone Casale-Brunet, and Marco Mattavelli. 2022. "Dynamic SIMD Parallel Execution on GPU from High-Level Dataflow Synthesis." Journal of Low Power Electronics and Applications 12, no. 3: 40. https://doi.org/10.3390/jlpea12030040
