Energy-Efficient Computing on Parallel Architectures

A special issue of Computation (ISSN 2079-3197).

Deadline for manuscript submissions: closed (6 March 2020)

Special Issue Editors


Guest Editor
INFN Ferrara, I-44122 Ferrara, Italy
Interests: HPC; parallel computing; scientific computing; computational physics; energy efficiency

Guest Editor
Computing Systems Laboratory, National Technical University of Athens, 15780 Zografou, Greece
Interests: HPC; parallel computing; resource management; performance modeling

Guest Editor
Dipartimento di Scienze Chimiche e Farmaceutiche, University of Ferrara, I-44122 Ferrara, Italy
Interests: HPC; parallel computing; scientific computing; computational physics; energy efficiency

Special Issue Information

Dear Colleagues,

The power requirements of large HPC facilities are becoming unsustainable for both technical and economic reasons, and a significant fraction of the total cost of ownership of HPC installations is already driven by the electricity bill.

The idea of charging users for the energy consumed by their compute jobs is spreading, although in the past it was difficult to implement because fine-grained energy accounting systems were lacking at the data center level. These systems have recently started to be adopted, making per-user energy accounting possible. On the other hand, application developers still focus their main efforts on performance, often neglecting energy-efficiency issues.

With this Special Issue, we aim to raise awareness of the importance of energy efficiency in the scientific computing community, gathering computer scientists, computational scientists, and developers from different research fields, such as computational physics, chemistry, biology, and engineering. In particular, we aim to encourage the exchange of experience and knowledge about novel strategies to monitor, profile, and optimize the energy efficiency of applications running on modern parallel computing systems.

Relevant topics include (but are not limited to) the following:

  • Application profiling and analysis of energy requirements aimed at energy-efficiency optimizations;
  • Case studies of parallel application optimization for energy efficiency;
  • Case studies of parallel application performance optimization under a power budget;
  • Power and energy-efficiency assessment of processors and accelerators, including FPGAs;
  • Programming models, tools, languages, and compilers to support energy-aware computing;
  • Low-power parallel architectures (use and design);
  • Energy-proportional systems;
  • Energy-efficient heterogeneous systems and communication architectures.

This Special Issue is open to any relevant contribution, but it is organized jointly with the ECO-PAR mini-symposium, which will be held on 13 September during the ParCo 2019 conference, so we encourage interested authors to also submit an abstract for this event and to present it there.

Dr. Enrico Calore
Dr. Nikela Papadopoulou
Prof. Dr. Sebastiano Fabio Schifano
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Computation is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Energy-efficiency
  • Scientific computing
  • High-performance computing
  • Profiling
  • Parallel computing
  • Distributed computing
  • Clusters
  • Computational
  • Optimization

Published Papers (5 papers)


Research

14 pages, 1364 KiB  
Article
Performance and Energy Assessment of a Lattice Boltzmann Method Based Application on the Skylake Processor
by Ivan Girotto, Sebastiano Fabio Schifano, Enrico Calore, Gianluca Di Staso and Federico Toschi
Computation 2020, 8(2), 44; https://doi.org/10.3390/computation8020044 - 08 May 2020
Cited by 1 | Viewed by 2522
Abstract
This paper presents an analysis of both the computing performance and the energy efficiency of a Lattice Boltzmann Method (LBM) based application, used to simulate three-dimensional multicomponent turbulent systems on massively parallel architectures for high-performance computing. Extending results reported in previous works, the analysis is meant to demonstrate the impact of using optimized data layouts designed for LBM based applications on high-end computer platforms. Particular focus is given to the Intel Skylake processor and to comparing the target architecture with other models of the Intel processor family. We introduce the main motivations of the presented work as well as the relevance of its scientific application. We analyse the measured performances of the implemented data layouts on the Skylake processor while scaling the number of threads per socket. We compare the results obtained on several CPU generations of the Intel processor family and we make an analysis of energy efficiency on the Skylake processor compared with the Intel Xeon Phi processor, finally adding our interpretation of the presented results.
(This article belongs to the Special Issue Energy-Efficient Computing on Parallel Architectures)

20 pages, 9895 KiB  
Article
Accurate Energy and Performance Prediction for Frequency-Scaled GPU Kernels
by Kaijie Fan, Biagio Cosenza and Ben Juurlink
Computation 2020, 8(2), 37; https://doi.org/10.3390/computation8020037 - 27 Apr 2020
Cited by 4 | Viewed by 3007
Abstract
Energy optimization is an increasingly important aspect of today’s high-performance computing applications. In particular, dynamic voltage and frequency scaling (DVFS) has become a widely adopted solution to balance performance and energy consumption, and hardware vendors provide management libraries that allow the programmer to change both memory and core frequencies manually to minimize energy consumption while maximizing performance. This article focuses on modeling the energy consumption and speedup of GPU applications while using different frequency configurations. The task is not straightforward, because of the large set of possible and uniformly distributed configurations and because of the multi-objective nature of the problem, which minimizes energy consumption and maximizes performance. This article proposes a machine learning-based method to predict the best core and memory frequency configurations on GPUs for an input OpenCL kernel. The method is based on two models for speedup and normalized energy predictions over the default frequency configuration. Those are later combined into a multi-objective approach that predicts a Pareto-set of frequency configurations. Results show that our approach is very accurate at predicting extrema and the Pareto set, and finds frequency configurations that dominate the default configuration in either energy or performance.

13 pages, 438 KiB  
Article
Performance and Energy Footprint Assessment of FPGAs and GPUs on HPC Systems Using Astrophysics Application
by David Goz, Georgios Ieronymakis, Vassilis Papaefstathiou, Nikolaos Dimou, Sara Bertocco, Francesco Simula, Antonio Ragagnin, Luca Tornatore, Igor Coretti and Giuliano Taffoni
Computation 2020, 8(2), 34; https://doi.org/10.3390/computation8020034 - 17 Apr 2020
Cited by 7 | Viewed by 2885
Abstract
New challenges in Astronomy and Astrophysics (AA) are urging the need for many exceptionally computationally intensive simulations. “Exascale” (and beyond) computational facilities are mandatory to address the size of theoretical problems and data coming from the new generation of observational facilities in AA. Currently, the High-Performance Computing (HPC) sector is undergoing a profound phase of innovation, in which the primary challenge to the achievement of the “Exascale” is the power consumption. The goal of this work is to give some insights into the performance and energy footprint of contemporary architectures for a real astrophysical application in an HPC context. We use a state-of-the-art N-body application that we re-engineered and optimized to exploit the heterogeneous underlying hardware fully. We quantitatively evaluate the impact of computation on energy consumption when running on four different platforms. Two of them represent the current HPC systems (Intel-based and equipped with NVIDIA GPUs), one is a micro-cluster based on ARM-MPSoC, and one is a “prototype towards Exascale” equipped with ARM-MPSoCs tightly coupled with FPGAs. We investigate the behavior of the different devices where the high-end GPUs excel in terms of time-to-solution while MPSoC-FPGA systems outperform GPUs in power consumption. Our experience reveals that considering FPGAs for computationally intensive applications seems very promising, as their performance is improving to meet the requirements of scientific applications. This work can be a reference for future platform development for astrophysics applications where computationally intensive calculations are required.

17 pages, 772 KiB  
Article
ThunderX2 Performance and Energy-Efficiency for HPC Workloads
by Enrico Calore, Alessandro Gabbana, Sebastiano Fabio Schifano and Raffaele Tripiccione
Computation 2020, 8(1), 20; https://doi.org/10.3390/computation8010020 - 23 Mar 2020
Cited by 10 | Viewed by 4587
Abstract
In recent years, the energy efficiency of HPC systems has become of paramount importance for environmental, technical, and economical reasons. Several projects have investigated the use of different processors and accelerators in the quest to build systems able to achieve high energy efficiency levels for data centers and HPC installations. In this context, the Arm CPU architecture has received a lot of attention given its wide use in low-power and energy-limited applications, but server grade processors have appeared on the market just recently. In this study, we targeted the Marvell ThunderX2, one of the latest Arm-based processors developed to fit the requirements of high performance computing applications. Our interest is mainly focused on the assessment in the context of large HPC installations, and thus we evaluated both computing performance and energy efficiency, using the ERT benchmark and two HPC production ready applications. We finally compared the results with other processors commonly used in large parallel systems and highlighted the characteristics of applications which could benefit from the ThunderX2 architecture, in terms of both computing performance and energy efficiency. Pursuing this aim, we also describe how ERT has been modified and optimized for ThunderX2, and how to monitor power drain while running applications on this processor.

24 pages, 2233 KiB  
Article
GPU Computing with Python: Performance, Energy Efficiency and Usability
by Håvard H. Holm, André R. Brodtkorb and Martin L. Sætra
Computation 2020, 8(1), 4; https://doi.org/10.3390/computation8010004 - 09 Jan 2020
Cited by 15 | Viewed by 7097
Abstract
In this work, we examine the performance, energy efficiency, and usability when using Python for developing high-performance computing codes running on the graphics processing unit (GPU). We investigate the portability of performance and energy efficiency between Compute Unified Device Architecture (CUDA) and Open Compute Language (OpenCL); between GPU generations; and between low-end, mid-range, and high-end GPUs. Our findings showed that the impact of using Python is negligible for our applications, and furthermore, CUDA and OpenCL applications tuned to an equivalent level can in many cases obtain the same computational performance. Our experiments showed that performance in general varies more between different GPUs than between using CUDA and OpenCL. We also show that tuning for performance is a good way of tuning for energy efficiency, but that specific tuning is needed to obtain optimal energy efficiency.