Towards Energy Efficiency in Data Centers: An Industrial Experience Based on Reuse and Layout Changes

Abstract: Data centers are widely recognized for demanding large amounts of energy. The greater the computational demand, the more resources operate together; consequently, more heat is generated, more cooling power is needed, and more energy is consumed. In this context, this article reports an industrial experience of achieving energy efficiency in a data center through a new layout proposal, the reuse of previously existing resources, and improved air conditioning. The primary measures adopted were cold aisle confinement, an increase in the raised floor's height, and better direction of the cold airflow toward the intake at the servers' front. We reused the three legacy refrigeration machines from the old data center, and no new ones were purchased. In addition to the 346 existing devices, 80 new pieces of equipment (servers and network assets) were added as a load to be cooled. Even with the increase in the amount of equipment, the implementations contributed to energy efficiency compared to the old data center, reducing the temperature by approximately 41% and, consequently, saving energy.


Introduction
The evolution of computational resources has allowed the emergence of a set of applications in the most diverse platforms, such as web, desktop, mobile, and application contexts, such as finance management [1,2], privacy and safety [3], health [4,5], education [6], smart cities [7,8], smart homes [9], and smart things [10].
Things that could not be processed on a personal computer 20 years ago can currently be processed by a microchip available in a smartwatch or an appliance. Given the large set of applications, data sources, and online services, the demand for data processing and storage resources has been increasing, and there is a growing need for large data centers. According to the forecast demonstrated in [11], between the years 2019 and 2025, there will be an annual increase of 2% in the global data center market, indicating (i) the increase in the construction of new hyperscale data centers; (ii) growth in the acquisition of flash technology for critical applications; (iii) the adoption of alternative forms of energy; (iv) the use of 200 and 400 Gb switch ports; and (v) the increase in the use of hyper-converged and converged infrastructure. In [11], the authors point out that more than 100 data centers have been implemented with an energy capacity of more than 15 MW in recent years. According to the study, one of the reasons for the increase is that telecommunications providers are investing in optimizing broadband in several locations. Another indication is that government agencies drive initiatives to develop smart cities and the growth of the digital economy.
The growing volume of data produced by applications has taken us to a level where knowledge management and massive data volumes are no longer merely non-functional requirements of software and hardware architectures; they have come to be treated as functional and essential requirements. This is particularly noteworthy with the emergence of sophisticated paradigms and innovative approaches to data organization, control, and processing, such as publish-subscribe [12,13] and fog computing [14], and frameworks such as Apache Hadoop [15][16][17], Spark [18][19][20], Kafka [21,22], Perforce, and Git [23]. Likewise, several works and architectures seek to optimize the use of computational resources and mitigate energy consumption in data centers, either by changing the local physical structure or via software-level optimizations.
As is widely known in the state of the art and in industry practice, the data center is the sector of an organization responsible for storing, processing, scaling, and managing data traffic from all sectors and applications [24][25][26]. The equipment involved acts directly and indirectly in the processing and storage of data, e.g., servers, switches, routers, and mass storage devices. It involves managing critical, confidential, and essential information for the business, ranging from sensitive customer data to the institution's strategic planning data. Since it is directly aligned with the organization's profitable business, the requirement for providing a service with high reliability, security, responsiveness, and real-time support is increased. Keeping each piece of equipment and its components intact is essential. An untreated failure in an institution's data center can range from a pause in receiving e-mails to a complicated and risky process of recovering corrupted data. In short, this must involve mechanisms and strategies for self-organization, replication, and redundancy, processes recognized for guaranteeing the requirements demanded and for increasing computational and energy consumption.
Likewise, the need for high availability of services and the continuous use of equipment 24 h a day, 7 days a week, also increased the heat generated during data processing and the provision of all the services involved. Consequently, this generated the need for specific and uninterrupted air conditioning systems, which, due to their operation, increased electricity costs even more. As stated by Li et al. [27], the energy cost of a data center is responsible for approximately 50% of its total operating cost. According to Song et al. [28] and Liu et al. [29], cooling systems tend to be responsible for about 40% of all energy consumed by data centers, second only to the electrical consumption needed to power the servers, network assets, and other equipment directly linked to the operation of the systems and services. Improvement projects that reduce the air conditioning costs of a data center are critical success factors for better energy efficiency in environments like this, significantly so if these improvements also extend the time equipment can run without direct electrical power to the data center. It is a consensus in the literature and in data center practice that the factor with the most impact on energy consumption is maintaining the temperature at adequate levels [30][31][32]. The strong dependence between temperature and energy consumption in semiconductors was demonstrated by Varshni [33] in 1967. With this, several researchers in computer science have sought to optimize cooling in different types of hardware and technologies, e.g., processors and motherboards [34][35][36][37][38][39][40].
However, to the best of our knowledge, studies on data center environments have focused their efforts on solutions that deal with the modernization of servers, optimization policies, or the insertion of intelligent central control software. In contrast to previous studies, this article aims to investigate the following questions:

1. Is it possible to achieve better energy efficiency in data centers just by changing the equipment's layout and taking advantage of legacy resources?
2. How can the consumption of physical and energy resources in the data center be kept sustainable despite the growing demand for processing and storage?
3. What know-how has been learned in the face of the challenges of implementing a data center in the context of a tropical climate?
In order to address the research questions above, the following techniques were explored to achieve energy efficiency and decrease the temperature of the servers and, consequently, of the whole environment: (i) cold aisle confinement; (ii) adjusting the layout of the racks; and (iii) downflow air insufflation (cold air circulating under the raised floor) aided by grilles with high airflow directional control. The results demonstrate a significant temperature reduction of 46% inside the data center, from 25 to 14 °C on average, using only the legacy CRAC (Computer Room Air Conditioning) units while maintaining good cooling performance, high availability, and uninterrupted security, even with the addition of 36 new pieces of network equipment (routers and switches) and 44 new servers to the 346 previously existing devices.
Paper Outline: This work is organized as follows: Section 2 discusses the related works; Section 3 presents the data center improvements implemented; Section 4 shows and discusses the results, and finally Section 5 presents the conclusions.

Related Works
This section discusses the research works that implement improvements to save energy in a data center context.
Heller et al. [41], after comparing strategies to find minimum-power subsets, built a software manager named ElasticTree, a system to dynamically adapt the energy consumption of a data center network, which consists of three logic modules: optimizer, routing, and power control. Initially, a testbed evaluation is conducted on a data traffic test basis built with OpenFlow switches [42]. Subsequently, they conducted more accurate tests by monitoring traffic from e-commerce sites. In both tests, the authors assessed the trade-offs between energy efficiency, performance, and robustness. The results demonstrated energy savings of up to 50%.
Han et al. [43] propose SAVE, a Software Defined Networking (SDN)-based solution for energy-aware Virtual Data Center (VDC) embedding. The proposed solution is based on a scheme that finds ideal VDC component mappings and ideal routing paths in various data center environments to reduce energy consumption, achieving savings of 18.75%.
Song et al. [28] investigated whether airflow management and data center location selection can affect energy consumption. In the study, the authors present a rack layout with a vertical cooling airflow and explore two cooling systems: a computer room air conditioning system and an air saver. Based on these two cooling systems, four cities worldwide were selected as data center locations. Furthermore, they apply energy efficiency metrics for cooling the data center, such as energy efficiency, coefficient of performance, and chiller hours. The results show that refrigeration efficiency and operating costs vary significantly with different climatic conditions, energy prices, and refrigeration technologies. The authors conclude that climatic conditions are the main factor affecting the air saver, and the maximum energy saving indicated in the study was 35%.
In [44], Ma et al. investigated the problem of energy-aware virtual data center embedding. They propose an energy consumption model, including virtual machine node and virtual switch node models, to quantitatively measure energy consumption when embedding a virtual data center. In addition, using this model, they implement a heuristic algorithm and another based on particle swarm optimization. The second algorithm proved to be the better solution for virtual data center embedding. The experimental results show energy savings of 11% up to 28%.
Sun et al. [45] explored deep reinforcement learning coupled with SDN to improve the energy efficiency of data center networks while ensuring flow completion time. The proposed solution dynamically collects the traffic distribution of the switches to train the model. The trained model can quickly analyze complicated traffic characteristics using neural networks, produce adaptive actions to schedule flows, and deliberately configure margins for different links. After the action, the flows are consolidated onto a few active links and switches to save energy, while the refined margin setting for active links avoids violating the flow completion time. According to the authors, the simulation results demonstrate energy savings of up to 12.2% compared to the existing approaches in the experiment's data center, such as the ElasticTree proposed by Heller et al. [41]. However, as the authors do not present the actual values, it is impossible to obtain a realistic estimate of the energy-savings gain, which compromises the comparison with other works in the literature; furthermore, the data center and the applications are different.
Saadi et al. [46] proposed a model optimized for energy efficiency to reduce energy consumption and complete more tasks with greater efficiency in virtual machines in a cloud environment. The authors take advantage of the performance/power ratio to define upper limits for overload detection. The results point to an average energy saving of up to 21% compared to two previous approaches pointed out in the study, without compromising the applications' requirements.
MirhoseiniNejad et al. [47] report in their study that considering the thermal effects of server workloads in conjunction with the control parameters of the cooling unit saves more energy than optimizing each one separately. For this, a low-complexity holistic data center model is used that provides control decisions with refined control variables based on the thermal interactions between the IT and refrigeration unit entities. The methodology resulted in savings of 11% by combining cooling and workload management.
Kaffes et al. [48] propose the PACT model to reduce energy consumption in data centers under high demand, combining two mechanisms: Turbo Control and CPU Jailing. In experiments with Google data centers, the authors showed an average of 9% energy savings regardless of workload, in addition to a 4% improvement in performance. Table 1 presents a summary of the related works, emphasizing the approach used and the energy-saving rate. Unlike the previously mentioned works, this work reports the experience of optimizing energy efficiency in a data center through the reuse of legacy equipment and the alteration of the layout.
Table 1. Summary of related work on energy saving in data centers.

Proposed Solution to Improve Cooling Efficiency
With the expansion of research and development activities, the research institution decided on a building change to meet all demands. Thus, in 2018, we started the physical construction of a new data center within the new physical location. The goal was for the data center to accommodate the company's operational growth of 50% and achieve the lowest possible maintenance cost. Thus, there was a need to implement measures to ensure better energy efficiency, reduced energy consumption, and less impact on the environment.
As described by Li et al. [27], there were approximately eight million data centers around the world, which consumed 416.2 terawatt-hours of electricity, equivalent to 2% of the total electricity consumed worldwide. This piece of information was one of the main reasons for making improvements in the cooling system.
Besides considering the reduction of refrigeration costs, it was necessary to consider the expected usage increase and the necessity to support as many pieces of equipment as needed for an extended period. In this way, the new refrigeration system should support complete temperature control regardless of the number of servers, without risks of auto-shutdown or equipment failures due to high temperatures (i.e., zero downtime), that is, without causing costly impacts to the research institution.

Cooling with Insufflation Downwards (DownFlow)
The DownFlow insufflation is characterized by collecting the hot air generated by the data center equipment through the air inlets located at the top of the CRAC. After the cooling treatment, the air is returned to the environment through insufflation into the confined space below the raised floor (the plenum). Due to the pressure created in the plenum, the cold air is pushed upwards through perforated floor plates or ventilation diffusers designed for this function. Therefore, in the environment above the raised floor, cold air is available for the servers' internal cooling fans and for the other equipment installed inside the racks. Once the equipment is cooled, its exhaust systems return a warm air current that rises into the environment and is again captured by the upper inlets of the CRAC, starting a new cooling cycle. The steps of the DownFlow insufflation are shown in Figure 1. We had already used this cooling method in the old data center, using three CRAC units (model Stultz ASD 1072 A, 30 TR) specifically prepared for this air conditioning method. However, due to the old building's structural restrictions, it was only possible to install the raised floor with a height of 40 cm, which was just enough to fit the passage of data and power cables. Since the previous solution already had three CRAC units, we could reduce the costs of adapting the new solution in the new data center and reuse the equipment, leading to savings in training costs and fewer changes in the maintenance process.
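As a rough illustration of the cooling capacity carried over from the old data center, the three reused 30 TR CRAC units can be expressed in kilowatts. The unit count and ratings come from the setup above; the conversion factor of 1 TR ≈ 3.517 kW is a standard refrigeration constant, not a figure from this study:

```python
# Estimate the combined cooling capacity of the reused CRAC units.
# 3 units rated at 30 TR each are from the deployment described above;
# 1 TR (ton of refrigeration) ~= 3.517 kW is the standard conversion.
TR_TO_KW = 3.517

def total_cooling_capacity_kw(units: int, tons_per_unit: float) -> float:
    """Return the combined cooling capacity in kW."""
    return units * tons_per_unit * TR_TO_KW

capacity = total_cooling_capacity_kw(units=3, tons_per_unit=30)
print(f"Total capacity: {capacity:.1f} kW")  # ~316.5 kW across the three units
```

This back-of-the-envelope figure is the thermal budget that both the old and new layouts had to work within, since no new CRAC units were purchased.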
As a factor of improvement and optimization of the system, we decided to adopt a height of 70 cm, a 75% increase in the floor elevation, taking into account the statement by IBM [49] that "the highest raised floor allows for a better balance of air conditioning in space". Based on a study by Beitelmal [50], this height brought us closer to the 76.2-91.4 cm range, which is considered the best option for raised data center floors. Compared to the old setup, the main benefits are more uniform airflow rates through the air diffuser grids, a more uniform temperature at the top of the racks, and higher uniformity of static pressure within the plenum.
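The benefit of raising the plenum can be sketched with a simple continuity argument: for a fixed volumetric flow from the CRACs, the mean air velocity under the floor scales inversely with the plenum height, and lower velocities favor more uniform static pressure. This is a first-order sketch only; the 40 cm and 70 cm heights are from the deployment above, while the flow and width values are arbitrary placeholders:

```python
# First-order sketch: for a fixed volumetric flow Q through a plenum of
# width w and height h, the mean velocity is v = Q / (w * h). A taller
# plenum lowers the mean velocity, helping even out static pressure.
def mean_velocity(q_m3_s: float, width_m: float, height_m: float) -> float:
    """Mean air velocity (m/s) over the plenum cross-section."""
    return q_m3_s / (width_m * height_m)

Q = 10.0  # placeholder volumetric flow, m^3/s (not a measured value)
W = 8.0   # placeholder plenum width, m (not a measured value)

v_old = mean_velocity(Q, W, 0.40)  # old 40 cm raised floor
v_new = mean_velocity(Q, W, 0.70)  # new 70 cm raised floor

print(f"velocity ratio old/new: {v_old / v_new:.2f}")  # 0.70/0.40 = 1.75
```

Note that the ratio depends only on the two heights, so the placeholder flow and width cancel out: the 40 cm plenum forces mean velocities 75% higher than the 70 cm one for the same CRAC output.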

High Flow Air Diffusers
To properly implement the DownFlow cooling system described in the previous section, with sufficient cold air pressure coming from the plenum, it is necessary to use diffuser plates or grilles with high airflow on the raised floor, making it easier for the cold air below the raised floor to be drawn in by the equipment installed in the racks. It is important that the cooled air leaves the plenum only through the diffuser plates or grilles directed at the rows of racks; cold air escaping through other areas of the raised floor that are not part of the servers' cooling path is a waste of resources that implies cooling inefficiency and, consequently, energy inefficiency.
In our old data center, we already used perforated plates with high airflow arranged near the racks' intake, with the rows of racks positioned to alternate cold aisles with hot aisles. This layout concept, where the rows of racks are positioned face to face so that there is a constant alternation between cold and hot aisles, is shown in Figure 2. According to the study by Ni et al. [51], floors with diffusing grids that have adjustable airflow directors are more efficient than those without this feature. Wan et al. [52] show that integrating this type of diffusing grid with other advanced technologies can reduce the cooling cost by 30%. Thus, based on these studies and on the indications from refrigeration equipment suppliers [53,54], we replaced the perforated plates (without directors) with high-airflow diffuser grids with airflow direction control in the cooling project of the new data center. Unlike perforated plates, the grids allow adjusting the direction and angle of the cold air that flows through them, so that equipment installed in both the lower and upper parts of the racks receives better ventilation. Figure 3 compares the direction of the cold airflow next to the racks in the old and new data centers.

Cold Aisle Confinement
According to Lu et al. [55], the cold aisle containment system confines the entire cold aisle of the rack row. Thus, it is possible to guarantee a greater concentration of chilled air at the front of the servers, facilitating the cooling process and turning the rest of the data center into a large open area for the return of hot air, consequently providing a proper separation between the hot and cold airflows. Figure 4 shows an example of cold aisle confinement. Compared to the alternative of containing the hot aisle, and based on the analysis of information from suppliers of climatization equipment for data centers [56], confining the cold aisles showed the following benefits:
• Lower implementation cost: adhering to the goal of keeping the new data center's construction cost as low as possible without compromising the quality of air conditioning;
• Implementation simplicity: installing doors and a roof suffices for the basic confinement of the aisle. The low implementation complexity helped keep the planned schedule for putting the new data center into production;
• Increased operating time without direct power: during an event in which the auxiliary power systems that supply the CRACs (e.g., a generator set) fail to start, or start with delay, the confinement of the cold aisle creates an internal bank of stored cold air that gives the servers significant additional running time before they shut down due to excessive temperature. This attribute makes it possible to ensure that the automated process of shutting down our 346 pieces of old equipment, plus the 80 new ones, would have the necessary time to complete without being compromised by excess heat.
We based our approach on the work of Shrivastava and Ibrahim [57], who demonstrate that a well-sealed containment system of cold aisles offers a better thermal environment for the equipment contained in the data center. Moreover, it can provide a longer dwell time in the event of a cooling failure. In this context, it is also expected that the equipment's capacity to draw in cold air through the flow from the plenum will be amplified as long as there is an adequate sealing of the containment system.

Results
Seven months after the data center's construction, we assessed the solution's efficiency regarding the control of the servers' cooling temperature, since the load to be cooled increased with the entry of more than 80 pieces of equipment. To this end, we compared the average inlet air temperatures recorded by servers installed in our new data center. These averages are an essential parameter for measuring the efficiency of the cooling equipment within a data center, and this efficiency directly impacts energy consumption. Fulton [58] states that the temperature at the servers' intake is one of the most critical points in the operation of a data center; therefore, controlling that temperature is critical to any data center's efficient performance.
We collected temperature data of eight high-scale processing servers, which by the nature of their activities, tend to induce high heat production (all servers are equipped with three NVIDIA Tesla V100 GPUs for data science research). These data are automatically recorded by sensors located on the motherboard of these servers and stored in an internal log. Hence, there is no interference from the operating system or any other software running on the same server.
We chose these high-scale processing servers because they are positioned at the top of the racks, approximately 178 cm away from the cold air outlet in the raised floor. Compared to other equipment located in lower positions of the rack, these servers tend to draw less cold air and, consequently, register higher inlet temperatures than the others.
Once the sampling elements had been defined, the set of inlet temperature data collected from each server was consolidated so as to calculate the individual inlet temperature averages of these eight devices, considering two specific periods:
• April 2018-April 2019: the period before moving to the new data center, from April 2018, when these servers were installed in the old data center, until April 2019, when we shut down these eight servers for the building change;
• May 2019-January 2020: the period after the building change, from May 2019, when we started these eight servers in the new data center, until January 2020, the cutoff point of this study.
Table 2 shows the data extracted from the eight servers, taking into account the two periods mentioned above. The temperature reduction can be seen in the column "Average Temperature Difference (°C)", which results from comparing the average temperatures collected before and after the move to the new data center. For example, GPU Server 06 had an inlet temperature of 28.18 °C in the old environment and 15.26 °C in the new one. For this server, there was an environmental heat suppression of approximately 46%, thus minimizing the possibility of interference from high temperatures that could harm its processing performance or general functioning. Considering the temperature data extracted from the GPU servers during the two periods as statistically valid samples representing the inlet temperature of all other servers of the data center, it is possible to verify an equivalent temperature reduction of approximately 41% for all equipment in the new data center. This implies a reduction in the cost of the current cooling equipment and the possibility of adding more equipment to be refrigerated.
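The per-server reduction reported above can be reproduced directly from the two period averages; a minimal sketch using only the GPU Server 06 figures quoted in the text:

```python
# Percent reduction in average inlet temperature between the two periods.
def percent_reduction(old_c: float, new_c: float) -> float:
    """Relative temperature drop, as a percentage of the old average."""
    return (old_c - new_c) / old_c * 100.0

# GPU Server 06 averages quoted above: 28.18 C (old) vs. 15.26 C (new).
drop = percent_reduction(28.18, 15.26)
print(f"GPU Server 06 reduction: {drop:.1f}%")  # ~45.8%, i.e., roughly 46%
```

Applying the same calculation to the averages of all eight sampled servers (Table 2) is what yields the approximate 41% reduction reported for the data center as a whole.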
The temperature reduction in the new data center was achieved by reusing the same three CRACs (model Stultz ASD 1072 A, 30 TR) from the old data center, despite the addition of 36 more pieces of network equipment (routers and switches) and 44 new servers to the 346 pieces of old equipment (servers, storage devices, network assets, and others). In the old data center, where less equipment was cooled, we used the three CRACs simultaneously and uninterruptedly throughout the week, with no redundancy if one of the units failed, which was a problem. In contrast, after implementing the improvements in the new data center, we have worked with redundancy, rotating the CRACs weekly so that one of the units stays in standby mode while the other two operate simultaneously.
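The weekly rotation scheme can be sketched as a simple round-robin over the three units. The unit names below are illustrative placeholders; the policy of one standby unit and two active units per week is as described above:

```python
# Round-robin standby selection for the three CRAC units: each week one
# unit rests in standby while the other two run, as described above.
CRACS = ["CRAC-1", "CRAC-2", "CRAC-3"]  # placeholder names

def weekly_rotation(week: int) -> tuple[str, list[str]]:
    """Return (standby unit, active units) for a given week number."""
    standby = CRACS[week % len(CRACS)]
    active = [u for u in CRACS if u != standby]
    return standby, active

standby, active = weekly_rotation(week=0)
print(f"standby: {standby}, active: {active}")
# standby: CRAC-1, active: ['CRAC-2', 'CRAC-3']
```

Over any three-week window, each unit rests exactly once, spreading wear evenly while always keeping two units cooling the room.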
In order to evaluate the data center's energy efficiency, we applied the Power Usage Effectiveness (PUE) metric [59]. The PUE is defined in [59] as the total power supplied to the data center divided by the power consumed by IT equipment, as in Equation (1): PUE = Total Facility Power / IT Equipment Power. The target value is 1.0, since the lower the value (the closer to 1), the better the data center's energy efficiency index [52,60]. We measured 473,832 kWh consumed by IT equipment and 532,218 kWh of total electrical energy supplied to the data center. In this way, the PUE obtained was 1.123. Furthermore, we calculated the Data Center infrastructure Efficiency (DCiE), the reciprocal of the PUE, defined in Equation (2): DCiE = IT Equipment Power / Total Facility Power, and obtained approximately 89% efficiency.
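Both figures can be checked directly from the measured energy totals reported above; a minimal sketch:

```python
# Power Usage Effectiveness (PUE) and its reciprocal, DCiE, computed
# from the measured energy totals reported in the text.
def pue(total_kwh: float, it_kwh: float) -> float:
    """Total facility energy divided by IT equipment energy (Equation (1))."""
    return total_kwh / it_kwh

def dcie(total_kwh: float, it_kwh: float) -> float:
    """IT equipment energy as a fraction of total facility energy (Equation (2))."""
    return it_kwh / total_kwh

TOTAL_KWH = 532_218  # total electrical energy supplied to the data center
IT_KWH = 473_832     # energy consumed by IT equipment

print(f"PUE  = {pue(TOTAL_KWH, IT_KWH):.3f}")   # 1.123
print(f"DCiE = {dcie(TOTAL_KWH, IT_KWH):.1%}")  # ~89%
```

Since DCiE is simply 1/PUE, the two metrics carry the same information; reporting both is a convention that makes comparison with vendor and survey figures easier.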
Compared to other companies, our results demonstrate competitive energy efficiency: [60] conducted a comparative study of PUE in the data centers of different companies, e.g., Google (1.12), Facebook (1.08), Microsoft (1.12), eBay (1.45), Yahoo (1.08), HP (1.19), and others. According to [60], we would be among the 2% most energy-efficient data centers. This indicates that, with the reuse of legacy systems, layout changes, and cooling improvements, we achieved results comparable to those of similar large companies.

Conclusions
This paper presented a practical approach to decreasing a data center's temperature in a technology company by reusing equipment and changing the servers' layout. We obtained an average temperature reduction of 41%. The most closely related work [28], which implements layout changes, reached 35% energy efficiency with a software control implementation. The other previous works applied experiments assessing the energy consumption of data centers in specific applications and reached inferior performance compared with [28], which implements a modernization of servers. Heller et al. [41] achieved an efficiency of 50%, but with simulation results. Our approach achieves a result that is 6 percentage points better than Song et al. [28], with a less costly solution in terms of implementation time. Moreover, our approach has other advantages: it reuses cooling and server devices and requires no software changes.
In general, our work presents and discusses the industry's point of view with a practical bias, providing a reference for other companies to implement the proposed solution.
Since (a) we used a layout arrangement of our server racks that privileges the alternation between cold and hot aisles, (b) we used high-airflow diffuser grids on the entire floor of our cold aisles, (c) we confined all the cold aisles to favor a better intake of cold air by the rack-mounted servers, and (d) we reused all the refrigeration units from our old data center, without having to purchase any new units, we highlight the following key contributions of our work:
• Better electricity efficiency in our new data center;
• Better cooling performance for old and new servers, even though the new data center is located in a city with a humid tropical climate, with an average temperature of 27 °C (and peaks of 37 °C).
We noticed that the combination of the implementations opened up two possible courses of action:
1. Since our equipment works smoothly in the temperature range of 20-22 °C, while the inlet temperature currently measured in the environment is around 15-16 °C, we could use this difference to reduce the operation time of the CRACs. This would reduce electricity consumption and, consequently, expenses;
2. We could bring more servers into the new data center, since the refrigerated thermal load has headroom for this if we use 22 °C as the maximum inlet temperature limit in the cold aisles.
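The headroom behind the second option can be quantified from the figures above; a minimal sketch, where the 22 °C limit and the 15-16 °C measured range are taken from the text:

```python
# Thermal headroom in the cold aisles: the gap between the maximum
# acceptable inlet temperature and the currently measured inlet range.
MAX_INLET_C = 22.0               # upper limit for smooth operation (from the text)
MEASURED_RANGE_C = (15.0, 16.0)  # current inlet temperatures (from the text)

def headroom(max_inlet: float, measured: tuple[float, float]) -> tuple[float, float]:
    """Return (worst-case, best-case) headroom in degrees Celsius."""
    low, high = measured
    return max_inlet - high, max_inlet - low

worst, best = headroom(MAX_INLET_C, MEASURED_RANGE_C)
print(f"headroom: {worst:.0f}-{best:.0f} C")  # 6-7 C of margin before the limit
```

This 6-7 °C margin is what makes it feasible to host additional servers without exceeding the cold aisle temperature limit.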
The first option becomes more attractive for more conservative plans that do not foresee growth in the amount of equipment housed within the data center, since its feasibility persists only if the current thermal load to be cooled remains stable, which implies not changing the current number of servers or, at most, only replacing old equipment with new equipment.
The second option has advantages, in the strategic aspect, when considering the feasibility of offering physical space for new servers without compromising the data center's cooling quality. The inclusion of new projects, which require new machines or the expansion of existing projects, is possible without burdening resources related to the data center's cooling system.
As future work, we envision to: