EE has become one of the most sought-after design parameters for current computer systems, especially large-scale infrastructures. Several strategies are presently being explored to enhance the EE of HPC systems, covering architectural design, hardware, and software technologies [36,37,38]. The first study of energy-efficient Ethernet (EEE) in the field of HPC was presented by [39], which assessed its power-saving capacity. In contrast with previous proposals, it provided a thorough study of the effect of the added EEE latency overhead, using several simulated systems driven by traces of real HPC applications. The concept of a “power-down threshold” was proposed as a potential addition to EEE to reduce the on/off changeover overhead. The study found that EEE saves approximately 70 percent of link power by shutting off idle links, but at the expense of performance, leading to an average 15 percent increase in total system power consumption.
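To make the idea concrete, the following is a minimal, hypothetical sketch of a power-down threshold policy; the timing values and helper names are illustrative assumptions, not parameters or code from reference [39].

    # Hypothetical sketch of a "power-down threshold" policy for an EEE link:
    # the link enters its low-power state only after it has been idle for longer
    # than a threshold, so that short idle gaps do not pay the sleep/wake
    # transition overhead. All values are illustrative, not from reference [39].
    POWER_DOWN_THRESHOLD_US = 20.0   # assumed idle time required before sleeping

    def should_power_down(idle_gap_us: float) -> bool:
        """Sleep only when the idle gap is long enough to amortize the
        on/off changeover cost."""
        return idle_gap_us >= POWER_DOWN_THRESHOLD_US

    # Example: classify a toy trace of idle gaps (microseconds) between messages.
    for gap in [2.0, 55.0, 7.5, 120.0, 18.0]:
        state = "sleep" if should_power_down(gap) else "stay active"
        print(f"idle gap {gap:6.1f} us -> {state}")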
The authors of [40,41] focused their study on describing some evolutionary changes in HPC hardware and on how recent hardware trends pose challenges for the development of Exascale computing hardware.
Reference [42] examined energy-management problems, challenges, and potential solutions for the period 2010–2016, concentrating on the energy usage of data centers and HPC systems. The EE issues currently affecting data centers were highlighted, potential threats identified, and several short-term predictions made. Additionally, the study grouped energy-efficient approaches into seven components and considered Exascale as a prospective HPC framework.
Reference [43] described and analyzed several methods of presenting the energy consumption of HPC systems at runtime, together with a method for estimating the energy consumption of fault-tolerance protocols. A strategy for categorizing fault-tolerance protocols into three families (hierarchical, coordinated, and uncoordinated) was advocated, and it was shown how this strategy helps users make correct choices concerning energy-efficient services.
Reference [44] studied the correlation between the EE and resilience of large-scale parallel systems. It was illustrated theoretically and empirically that significant energy savings are possible by combining undervolting with conventional software-level resilience methods on contemporary HPC systems, without the need for hardware redesign. The approach was evaluated experimentally and shown to save up to 12.1 percent of energy relative to the reference runs of eight HPC benchmarks. Furthermore, it can save up to 9.1 percent more energy than a state-of-the-art frequency-directed dynamic voltage and frequency scaling (DVFS) solution, which lowers the operating frequency or supply voltage of hardware devices [31,33]. DVFS is a significant means of reducing a computer system’s power and energy usage because CMOS-based parts (e.g., CPU, GPU, and memory) are the key power consumers in the device.
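As a brief, first-order illustration of why DVFS is effective (a textbook CMOS approximation, not a result from the works cited above), dynamic power and the energy of a fixed workload scale roughly as

\[
  P_{\mathrm{dyn}} \approx \alpha\, C\, V_{\mathrm{dd}}^{2}\, f,
  \qquad
  E \approx \bar{P}\, T,
\]

where \(\alpha\) is the switching activity factor, \(C\) the switched capacitance, \(V_{\mathrm{dd}}\) the supply voltage, \(f\) the clock frequency, \(\bar{P}\) the average power, and \(T\) the execution time. Since the sustainable frequency falls roughly in proportion to the supply voltage, scaling \(V_{\mathrm{dd}}\) and \(f\) down together reduces dynamic power nearly cubically, but the energy actually saved depends on how much \(T\) grows at the lower operating point, which is why DVFS policies must balance power against execution time.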
Reference [45] provided a study of AI-based building-energy forecasting procedures, with a particular concentration on ensemble models. Four major types of AI-based forecasting were examined in terms of concepts and implementations: multiple linear regression, artificial neural networks (ANNs), support vector regression, and ensemble models. The paper also carried out an intensive discussion of the advantages and disadvantages of each type of model.
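As an illustrative sketch only, the ensemble idea can be expressed by combining the three base model families listed above; the scikit-learn estimators, synthetic data, and hyperparameters below are assumptions for illustration and are not drawn from reference [45].

    # Illustrative ensemble forecaster combining the three base model families
    # named above (MLR, ANN, SVR); synthetic data and hyperparameters are
    # placeholders, not taken from reference [45].
    import numpy as np
    from sklearn.ensemble import VotingRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    X = rng.uniform(size=(500, 4))   # e.g., load, temperature, hour, occupancy
    y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + 0.1 * rng.normal(size=500)  # synthetic energy signal
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    ensemble = VotingRegressor([
        ("mlr", LinearRegression()),
        ("ann", MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)),
        ("svr", SVR(kernel="rbf")),
    ])
    ensemble.fit(X_train, y_train)
    print("R^2 on held-out data:", ensemble.score(X_test, y_test))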
Reference [46] stated that different researchers describe AI in various ways. There are two dimensions to the differences in the definitions of AI: one is human centrality and the other is rationality, and most of the definitions in which intelligence is associated with rational action are adopted.
Reference [47] did not consider the use of AI to make HPC systems energy efficient. Instead, the authors proposed HPC AI500, a benchmark suite for evaluating HPC systems that run scientific deep learning (DL) workloads. The workloads in HPC AI500 are drawn from real-world scientific DL applications spanning the most representative scientific fields. A set of metrics was also proposed for the thorough evaluation of HPC AI systems, taking into account accuracy, efficiency, power, and cost.
Reference [48] reported that research in machine learning has concentrated on refining the predictive performance of algorithms, but researchers have recently become more interested in improving EE as well. The paper gives insight into why developing energy-efficient machine learning algorithms is of great importance.
Reference [49] indicated that, in comparison to previous methods, AI empowers HPC systems to go beyond basic rule-based instructions; instead, AI examines the data using a series of ’theories’ and algorithms as instructions.
Reference [50] suggested two initiatives, machine learning classifiers and DVFS settings tuned at runtime, to balance application performance and system power consumption during the runtime of an HPC program, using closed-loop feedback architectures based on the self-aware computing paradigm to observe, decide, and act. The work also presented the state of the art in energy-aware HPC, in particular the identification and grouping of strategies by system and device type, optimization metrics, and energy or power management methods. The system types covered include single computers, clusters, networks, and clouds, while the devices comprise CPUs, GPUs, multiprocessors, and hybrid systems. With respect to modern HPC systems, they also addressed tools, APIs, and environments aimed at predicting and simulating energy and power consumption.
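The observe-decide-act pattern described above can be sketched as follows; the telemetry source, the decision rule standing in for a trained classifier, and the frequency values are hypothetical placeholders, not the implementation of reference [50].

    # Sketch of an observe-decide-act control loop: observe node telemetry,
    # decide a DVFS setting (a trained classifier would normally do this; a toy
    # rule stands in here), and act by applying the chosen frequency. Telemetry,
    # thresholds, and frequencies are hypothetical, not from reference [50].
    import random

    AVAILABLE_FREQS_GHZ = [1.2, 1.8, 2.4, 3.0]

    def observe() -> dict:
        """Stand-in for reading hardware counters (e.g., node power, IPC)."""
        return {"power_w": random.uniform(80, 220), "ipc": random.uniform(0.4, 2.0)}

    def decide(sample: dict) -> float:
        """Toy stand-in for an ML classifier mapping telemetry to a frequency;
        memory-bound phases (low IPC) tolerate a lower clock with little slowdown."""
        if sample["ipc"] < 0.8:
            return AVAILABLE_FREQS_GHZ[0]
        if sample["ipc"] < 1.4:
            return AVAILABLE_FREQS_GHZ[1]
        return AVAILABLE_FREQS_GHZ[-1]

    def act(freq_ghz: float) -> None:
        """Stand-in for writing the cpufreq/DVFS setting."""
        print(f"setting CPU frequency to {freq_ghz:.1f} GHz")

    for _ in range(5):   # one iteration per control interval
        act(decide(observe()))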
Reference [51] gave an overview of recent research advances in energy-efficient computing, identified common characteristics, and classified the approaches. They addressed the causes and issues of high power and energy usage and presented a taxonomy of energy-efficient computer-system design covering the hardware, operating-system, virtualization, and data-center levels.
Reference [52] stated that, for HPC systems of significant size, system-wide power consumption has been described as one of the core constraints going forward, with DRAM main memory accounting for approximately 30–50 percent of a node’s overall power utilization. As an alternative to DRAM, a range of new memory technologies known as non-volatile memory (NVM) is therefore being examined.
Reference [53] examined the trade-off between energy and performance (execution time) for serial and parallel HPC applications on a real, small-scale, power-scalable cluster. From the literature discussed, it can be seen that some works have dealt with deploying AI to enhance the EE of HPC systems. Unfortunately, these works have failed to provide thorough evidence on why AI is needed in HPC systems, and hence present a myopic view. Secondly, there was no linkage between HPC, 5G, and EE in the previous works.
Table 1 presents a summary of reviewed related work.