Hotspot-Aware Workload Scheduling and Server Placement for Heterogeneous Cloud Data Centers

Data center servers located in thermal hotspot regions receive inlet air at a temperature higher than the set point and thus generate comparatively high outlet temperatures. Consequently, more energy is consumed to cool down these servers, which would otherwise face reliability hazards. Workload deployment across the servers should be resilient to thermal hotspots to ensure smooth performance. In a heterogeneous data center environment, an equally important concern is placing the servers in a thermal hotspot-aware manner to lower the peak outlet temperatures. Both approaches can be applied proactively with the help of outlet temperature prediction. This paper presents the hotspot adaptive workload deployment algorithm (HAWDA) and the hotspot aware server relocation algorithm (HASRA), both based on thermal-profile-driven outlet temperature prediction. HAWDA deploys workload on servers in a thermally efficient manner, and HASRA optimizes the locations of servers in thermal hotspot regions to lower the peak outlet temperatures. A performance comparison is carried out to analyze the efficacy of HAWDA against the TASA and GRANITE algorithms. Results suggest that, with and without server relocation, HAWDA provides average peak server utilization similar to GRANITE and TASA without placing an additional burden on the cooling mechanism, as HAWDA minimizes the peak outlet temperature.


Introduction
Wide use of internet data centers to provide uninterrupted access to cloud operations has substantially increased energy consumption over the past few years. Studies show that 40% of the total average energy consumed by a data center is utilized for cooling [1,2]. Due to unmanaged placement of servers and uneven server utilization, thermal hotspots are formed [3][4][5]. Thermal hotspots are regions of high air temperature that can occur frequently near the top of the racks due to heat recirculation [3,6] and/or due to shortfalls of the cooling mechanism [5]. A rise in inlet temperature, along with server utilization, correspondingly increases the outlet temperature of the servers [6]. Heat accumulates more quickly inside server enclosures with higher inlet temperatures than in servers with lower inlet temperatures at the same utilization level [7]. At the same time, servers in thermal hotspot regions conduct heat at a higher rate than their neighboring servers [8]. With the occurrence of heat recirculation, this situation can prolong the cooling process, increase cooling energy consumption, and cause reliability issues, and thus should be avoided [3].
The peak outlet temperature of the servers and the resultant cooling load can be lowered by either underutilizing the servers or keeping them idle [9]. Neither of these remedies works well as far as the heat generation is concerned because servers consume up to 60% of the peak power when in an idle state. Empirical studies from existing literature have observed that a typical server in an idle state consumes extensive energy due to its build and specifications [10][11][12]. Even if the power budget of the servers is capped to lower heat dissipation [13], it will result in lower performance due to low utilization of servers. Underutilization is not always the best solution unless the outlet temperature of the servers can be predicted. The servers can thus be adaptively utilized to maintain performance. Otherwise, the underutilization of servers may result in spending even more energy on computing than saved while cooling.
For adaptive utilization of servers generally, and for servers experiencing thermal hotspots particularly, a useful consideration is the heat dissipation behavior of each server. The heat dissipation from each server, in terms of the outlet temperature, can be profiled considering inlet air temperature and CPU utilization. The outlet temperature of the server at various utilization levels can be predicted using thermal profiling [14]. Thermal prediction can be used for energy-efficient proactive workload scheduling.
Most data centers today are heterogeneous, comprising servers of different generations and hardware specifications. This is because servers are added gradually to expand the capacity of the data center, data centers regularly go through maintenance, and existing servers are replaced with new servers. Two heterogeneous servers with different physical builds may have different outlet temperatures when receiving cold air at identical inlet temperatures and running workload at similar utilization levels [14]. Moreover, the physical location of each server also affects its outlet temperature. The temperature of heterogeneous servers residing within different regions of the data center hall can be predicted using thermal prediction models, and, by using this information, the best location for each server can be identified with respect to inlet temperature within different regions of the data center hall.
This paper aims to avoid thermal hotspot creation in the data center by lowering peak outlet temperature without causing an additional burden on the cooling mechanism and affecting the utilization of data center servers. To achieve these objectives, two algorithms are presented: (1) thermal-aware workload scheduling, and (2) thermal-aware server relocation. These algorithms use thermal profile-based thermal prediction [14]. The results demonstrate that by applying these algorithms individually or collectively, the peak outlet temperature of the data center is reduced significantly without any change in the overall utilization of data center servers. This paper is organized into seven sections. Sections 2 and 3 describe the related work and the background concepts of thermal-profiling, respectively. Section 4 describes our methodology and evaluation approach. It also presents a hotspot-aware scheduling algorithm and a hotspot-aware server location optimization algorithm. Experimental setup and results are discussed in Section 5. Section 6 presents a discussion on the results followed by conclusions in Section 7.

Related Work
A workload scheduling technique that prefers the most energy-efficient servers to schedule virtual machines (VMs) can save energy using frequency scaling, as each VM will be using a minimum frequency limit, just enough to avoid violating the service level agreement (SLA) [15]. Each server has to be evaluated for energy efficiency and graded accordingly. However, for heterogeneous servers, when the energy-efficient servers are fully loaded, there is a need to utilize less efficient servers. This situation is prone to thermal hotspots unless the servers are profiled and located in a data center region according to their thermal behavior.
A few studies show the impact of server location and of rises in inlet air temperature on the creation of thermal hotspots. Bo et al. [5] and S. McIntosh et al. [16] attributed data center thermal hotspots to rises in inlet air temperature caused by multiple factors, such as heat recirculation, the mixing of cold and hot air, physical flaws in the cooling mechanism, and central processing unit (CPU) utilization. They used regression analysis and interpolation of temperature data to identify the thermal hotspot regions inside the data center. However, they did not consider thermal profiling of the servers to predict the occurrence of thermal hotspots or to link thermal profiles with location optimization of the servers. Traditional data center thermal monitoring cannot correctly pinpoint a 'thermal hotspot causing' server without complex statistical analysis, because of the sparseness of thermal monitoring sensors inside the data center hall [5,16]. However, thermal-profiling-based outlet temperature evaluation techniques can identify such servers with comparatively higher accuracy and speed. When using thermal sensors for temperature estimation and thermal modeling [17,18], it should be considered that if the server placement is not thermal-aware, some servers may undergo thermal hotspot conditions due to a rise in inlet air temperature. If the servers are located according to inlet temperature sensitivity using thermal-profiling-based outlet temperature estimation, the peak outlet temperature, and hence the chance of thermal hotspot creation, across data center servers can be decreased.
A thermal-aware server provisioning approach for data centers [19,20] should consider that high inlet temperature can cause underutilized servers to attain maximum temperature. Therefore, the average utilization rate of servers can be increased by thermal-aware relocation of the servers. In related approaches, Al-Qawasmeh et al. [21] used the decade-old logic of Moore et al. [13] to allocate power budgets to computing nodes according to thermal and power constraints. Power is saved by allocating optimum power budgets to the computing nodes and using the power profiles of the tasks. The lack of thermal prediction makes this approach less effective when applied in real data centers: a change in inlet temperature can increase the outlet temperature of the servers even in the idle state, as well as during utilization, and thus requires temperature prediction modeling. In addition, the use of numerous processor states and task types makes the energy-efficient mapping of each task to a processor core quite compute-intensive and thus impractical for an average-sized data center with hundreds of virtualized servers hosting multiple VMs.
Regression-based techniques [21] are used to link CPU utilization with heat generation at constant inlet temperature, using the heat imbalance model [7] to estimate heat generation of the related server based on outlet temperature. However, this technique may not work at high inlet temperatures, because the cooling capacity of the air decreases [22] and the server therefore starts to accumulate heat. This lowers the prediction accuracy and hence the thermal efficiency of workload scheduling. The RC model [23] is used by Kumar et al. [24] to evaluate the ambient heat dissipation from the servers, and the data center workload balancing approach of [25] is extended to this end. Power consumption is used to calculate the temperature of the servers, which is then utilized for maintaining thermal balance across all servers and for optimizing the power consumption of the servers and the computer room air conditioning (CRAC) unit. However, Kumar et al. [24] take the CRAC unit's set temperature as the inlet temperature, so heat recirculation and/or rises in inlet temperature cannot be calculated; this makes the calculations unreliable, because the servers will otherwise be receiving inlet air at a temperature higher than the set point. Further, treating the processor chip temperature as equivalent to the server temperature is debatable. Additionally, this approach is not used for location optimization of the servers.
Some green cloud computing approaches [26] aiming for energy-efficient management of cloud infrastructure propose to save energy through server consolidation. However, if a few servers are overloaded with VMs to save power, these servers become prone to thermal hotspot conditions when exposed to increased inlet temperature unless thermal-aware VM scheduling is followed. Furthermore, thermal-aware workload scheduling aimed at distributing and redistributing VMs across servers based on the thermal status of the servers [26] may not achieve its target without thermal profiles. Variations in inlet temperature, power consumption, and CPU load have a combined effect on the outlet temperature; without a thermal profile, it is difficult to determine the thermal state of a server. Scheduling algorithms for back-to-back leasing of VMs utilize the servers at peak level, especially backfilling infrastructure as a service (IaaS) resource scheduling [27]. Such situations may give rise to multiple thermal hotspots due to variations in inlet temperature and/or inefficient cooling.
Workload scheduling approaches that rely upon computational fluid dynamics simulations [6,19,28] can provide a better estimation of cooling power consumption if their respective energy models include the phenomenon of inlet temperature variation and the physical location of the servers. Simulator-based implementations of thermal-aware resource scheduling [26], or inferring the thermal effect of a resource allocation from simulation, are limited because the physical world is quite different from simulation. The effect of power consumption and inlet temperature on outlet temperature can be demonstrated more accurately using actual thermal profiles of physical servers. Thermal-profiling-based techniques such as [22,29] either give inaccurate results or underutilize the servers unless the servers' placement in the data center is thermal-aware. To achieve a high air conditioning thermostat setting [9,30], servers should be placed at optimum locations before the evaluation of the power consumption of the data center. Otherwise, a thermal-aware resource scheduling algorithm will tend to underutilize the servers for fear of thermal hotspots.
Workload scheduling techniques for data centers that use the RC-thermal model of heat exchange [31,32] should consider that workload backfilling might not help reduce the peak outlet temperature if inlet temperature variation is not considered. This is because a slight variation in inlet temperature substantially affects the coefficients of heat recirculation and heat extraction for data center servers [33]. In fact, backfilling-based workload scheduling may lead to thermal hotspot creation and thus may cause reliability issues [3]. Thermal-aware workload scheduling that is based on task-temperature profiles should consider the effect of variation in inlet temperature on physical servers; otherwise, the resulting thermal map can be unexpected [34]. In the presence of a large number of servers, heterogeneity, and heat recirculation, the frequency of thermal hotspots can increase. Server-based thermal profiling is simpler and more generic than task-based thermal profiling in such a diverse scenario. If the thermal profiles of servers are available, the chance of high outlet temperatures as a result of workload scheduling can be predicted proactively, and thus avoided. Moreover, the servers can be relocated to lower the peak outlet temperature based on thermal prediction, as shown in this paper. This paper makes the following contributions:
• A generic approach for thermal hotspot-aware resource management of data centers using thermal profiles of servers is proposed. The proposed approach proactively predicts the outlet temperatures and helps avoid thermal hotspots in data centers.
• The hotspot adaptive workload deployment algorithm (HAWDA) and the hotspot aware server relocation algorithm (HASRA) are developed and evaluated in terms of outlet temperature, power consumption, and server utilization of data center servers.
• A simulation study is implemented with HAWDA and HASRA using Alibaba cloud workload traces. HAWDA and HASRA are compared with the existing thermal-aware scheduling algorithm (TASA) and the greedy-based scheduling algorithm minimizing total energy (GRANITE).

Background
Thermal profiles of servers are created by stress-testing them at various utilization levels using thermal benchmarks, manipulating multiple VMs to imitate real-life computational load [14]. As shown in Table 1, a thermal profile that maintains the thermal and power data at discrete levels of server utilization can be represented in tabular form. The second column represents the server utilization in terms of percentage and usable CPU frequency. Consider, for example, an octa-core processor with a per-core frequency of 2.66 GHz, giving a maximum processing capacity of 21.28 GHz. The hypervisor consumes some CPU cycles in addition to the VMs, so the maximum available capacity of the server is approximately 20.97 GHz. The last column represents the inlet temperature received by the server. The third and fourth columns represent the net increase in outlet temperature ΔT^i and the power consumed at each level of server utilization. Suppose server i receives cold air at temperature T^i_received, which may be greater than the set temperature T_set of the cooling mechanism. A server receiving inlet air at a high temperature will show an equivalent rise in its outlet air temperature, as shown in Equation (1):

T^i_outlet = T^i_outlet(T_set) + Δ^i T, (1)

where Δ^i T = T^i_received − T_set represents the change in inlet temperature of server i and causes an equivalent change in the outlet temperature of the server.
High inlet temperature leads to a correspondingly high outlet temperature and may increase the intensity of recirculated heat [33]. The increased outlet temperature of server i due to Δ^i T has two effects: an extra burden on the cooling mechanism and reliability issues. The maximum inlet temperature serving as the thermal hotspot threshold can be determined [4,5,16,21], and the threshold can be avoided by distributing the workload, thus minimizing the maximum increase in inlet temperature [6,33,35]. This may lead to uniform outlet temperatures and reduced heat recirculation, but at the expense of underutilization of servers [13], and hence reduced performance in terms of effective CPU cycles provided to each task/VM per unit time. The net increase in outlet temperature ΔT^i of server i is the difference between the outlet and inlet temperature, as shown in Equation (2), and can be included in the thermal profile for various levels of server utilization over a range of inlet temperatures T^i_received:

ΔT^i = T^i_outlet − T^i_received. (2)
Given a thermal profile TP^i of server i, the outlet temperature T^i_outlet of server i, with reference to server utilization, can be determined at run time using the inlet temperature and interpolation [14]. For an m-slot (number of rows) thermal profile, the server utilization is quantized into m levels. The intended utilization of the server can be used to predict the possible worst-case outlet temperature. For example, a server i running at utilization level x can be represented as CPU^i_x, where i = 1, 2, ..., n and x = 0, ..., m; x = 0 is the idle state, whereas x = m represents maximum utilization of the server. The worst-case predicted outlet temperature at server utilization CPU^i_y, where y < x and x is the next higher utilization level of the thermal profile immediately covering y (considering that y is the actual reading and not a quantized level present in the thermal profile), is given as Equation (3) [14]:

T^i_predicted outlet = T^i_received + ΔT^i[x] = T^i_outlet + β, (3)

where CPU^i_y ≤ CPU^i_x and β is the prediction error, i.e., the difference between T^i_outlet (at utilization level y) and the predicted temperature T^i_predicted outlet (at utilization level x). The value of β varies directly with the variation in inlet temperature and ranges between 0.2 °C and 0.4 °C.
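As a concrete illustration of the worst-case rule in Equation (3), the sketch below predicts the outlet temperature at an arbitrary utilization y by reading the profile entry at the next higher quantized level x. The profile numbers below are illustrative stand-ins, not values from the paper's Table 1.

```python
import bisect

# Illustrative m-slot thermal profile: utilization level (%) -> net increase
# in outlet temperature (deg C). The shape follows the text; the numbers are
# assumed for demonstration only.
LEVELS = [0, 12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100]
DELTA_T = [6.0, 8.1, 10.4, 12.9, 15.6, 17.8, 19.9, 21.7, 23.4]

def predicted_outlet(t_received: float, utilization_y: float) -> float:
    """Worst-case prediction per Equation (3): use the next profile level
    x >= y, so the true outlet at y is over-, never under-, estimated."""
    x = bisect.bisect_left(LEVELS, utilization_y)  # next covering level
    return t_received + DELTA_T[x]
```

Because the profile entry at the covering level is used, the prediction differs from the true outlet temperature by at most the prediction error β plus one quantization step.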

Proposed Methodology
Thermal-aware workload scheduling is used to lower the maximum increase in inlet temperature and avoid thermal hotspots in data centers. This is achieved at the cost of underutilization of servers, thereby reducing performance in terms of server utilization. This paper proposes a thermal hotspot adaptive workload scheduling algorithm based on thermal prediction and compares it with two algorithms: (1) the thermal-aware scheduling algorithm (TASA) [34], which allocates workloads to the currently coolest server to minimize cooling energy, and (2) the greedy-based scheduling algorithm minimizing total energy (GRANITE) [36], which allocates workloads to the server that yields the least increase in total power consumption after workload placement. Additionally, in some cases, a workload cannot be scheduled on a server even if computing capacity is available, because the thermal requirements are not met. In this case, the server can be relocated to a cooler area (with a lower inlet temperature) within the data center hall. This paper also proposes a server relocation algorithm which, in combination with the proposed thermal hotspot adaptive workload scheduling algorithm, shows better performance in terms of utilization without much effect on the cooling mechanism, and which offers an alternative to server underutilization in thermal hotspot regions [3,4,33,34,36].

Workload Characterization
This paper considers the workload of cloud hosting data centers in the form of VMs for rendering IaaS over physical servers. The workload is composed of batches of multiple heterogeneous VM requests, where a single batch is the unit of workload scheduling. These VM batches are lined up in an IaaS job queue, ready for deployment on the available servers. The maximum CPU usage demand, in terms of CPU cycles, of all VMs in a single batch k is represented as batch^k_GHz. The list of all VM batches to be deployed is represented by BatchList[K].
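The batch abstraction above can be sketched as a minimal data structure. The class name, field names, and demand values are illustrative, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class VMBatch:
    batch_id: int
    ghz_demand: float  # batch^k_GHz: peak CPU-cycle demand of all VMs in batch k

# BatchList[K]: the queue of batches awaiting deployment (illustrative values)
batch_list = [VMBatch(0, 5.32), VMBatch(1, 21.28), VMBatch(2, 10.64)]
batch_list.sort(key=lambda b: b.ghz_demand, reverse=True)  # largest demand first
```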

Evaluation Approach
This paper proactively evaluates the possible distribution of VM batches across the servers in terms of predicted outlet temperatures at the maximum theoretical workload of each batch. The list of all n servers, along with relevant information on the current state of each server, is represented as Server[n]. This information includes the server ID, computing capacity CPU^i_max (GHz), currently available computing capacity CPU^i_available (GHz), inlet temperature at the current time T^i_received, and outlet temperature at the current time T^i_outlet. The thermal profile TP of the server, containing the relevant information described previously, is also stored. The current utilization of the server can be calculated by subtracting CPU^i_available from CPU^i_max. To consider the worst case, the predicted maximum outlet temperature T^i_PMO of server i running at predicted utilization level x is calculated by adding the current inlet temperature T^i_received to the predicted increase in outlet temperature ΔT^i[x] at that utilization level, as given in Equation (4):

T^i_PMO = T^i_received + ΔT^i[x], (4)

where the predicted utilization level x of server i with respect to the current utilization and batch^k_GHz is given in Equation (5):

x = ⌈m · (CPU^i_max − CPU^i_available + batch^k_GHz) / CPU^i_max⌉, (5)

i.e., the utilization after deploying the batch, rounded up to the next quantized profile level.
Because we use 8-core servers for our experiments and a maximum of 8-core VMs for our workload, the thermal profiles used in this paper have nine quantized utilization levels; the level x can therefore be any element of the set {idle, 12.5%, 25%, 37.5%, 50%, 62.5%, 75%, 87.5%, full} for each server. For outlet temperature prediction purposes, interpolation can be used for in-between values. For the current study, we define a term T_max, which is the highest T^i_PMO of all the servers at hand and is the upper limit for making scheduling decisions, because T_max supposedly belongs to a server located inside a thermal hotspot region. The value of T_max can be calculated as

T_max = max_{1 ≤ i ≤ n} T^i_PMO.
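Under the assumption that the nine profile levels are evenly spaced at 12.5% steps, the quantization behind the predicted utilization level and the resulting T_max can be sketched as follows; the function name and the sample T_PMO values are illustrative.

```python
# Nine quantized utilization levels (idle ... full), per the text.
LEVELS = [0, 12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100]

def predicted_level(cpu_max_ghz: float, cpu_available_ghz: float,
                    batch_ghz: float) -> float:
    """Utilization after adding the batch, rounded up to the next quantized
    profile level (worst case)."""
    util = 100.0 * (cpu_max_ghz - cpu_available_ghz + batch_ghz) / cpu_max_ghz
    return min(level for level in LEVELS if level >= util)

# T_max is then simply the largest predicted maximum outlet temperature
# (T_PMO) over the servers at hand, e.g. with illustrative T_PMO values:
t_max = max([39.1, 44.42, 41.0])
```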

Hotspot Adaptive Workload Deployment Algorithm
This section presents the proposed thermal hotspot-resistant adaptive workload deployment algorithm (HAWDA). HAWDA uses the worst-case prediction model, as shown in Equation (4), to predict the chance of a peak outlet temperature and deploys a batch batch^k_GHz on the server that shows the least increase in predicted temperature. For this reason, the proposed approach can work comparatively better in thermal hotspot regions than non-prediction-based workload scheduling. For each server 1 ≤ i ≤ n, HAWDA evaluates three objective functions. The HAWDA algorithm (Algorithm 1) takes the list of servers (along with their thermal profiles) and the list of batches as input parameters. For each batch^k_GHz in BatchList[K], the algorithm iterates through all the servers and finds a suitable server with enough computing capacity such that all three objective functions are satisfied. If the objective functions are satisfied, batch^k_GHz is deployed on the selected server. If no thermal hotspot-resistant deployment is possible (the three objective functions are not satisfied), the algorithm does not deploy the batch despite the availability of CPU capacity on the server. The time complexity of HAWDA is O(nm), where n is the number of servers and m is the total number of cores across the data center servers.
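A minimal sketch of the HAWDA selection loop described above. The paper's three objective functions are not reproduced in this text, so the sketch substitutes the stated criteria: sufficient capacity, the least predicted maximum outlet temperature (T_PMO), and staying below T_max. The profile numbers and dictionary keys are illustrative assumptions.

```python
# Illustrative HAWDA-style best-fit loop (Python 3.8+): for each batch, pick
# the server whose predicted maximum outlet temperature after deployment is
# lowest, and skip the batch when no server stays below t_max.
LEVELS = [0, 12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100]
DELTA_T = {0: 6.0, 12.5: 8.1, 25: 10.4, 37.5: 12.9, 50: 15.6,
           62.5: 17.8, 75: 19.9, 87.5: 21.7, 100: 23.4}  # assumed profile

def t_pmo(server, batch_ghz):
    """Eq. (4)/(5): predicted maximum outlet temperature after the batch."""
    util = 100.0 * (server["cpu_max"] - server["cpu_avail"] + batch_ghz) / server["cpu_max"]
    if util > 100:
        return None                        # not enough capacity
    level = min(l for l in LEVELS if l >= util)
    return server["t_received"] + DELTA_T[level]

def hawda(servers, batches, t_max):
    placements = {}
    for k, batch_ghz in enumerate(batches):
        candidates = [(t, s) for s in servers
                      if (t := t_pmo(s, batch_ghz)) is not None and t < t_max]
        if not candidates:
            continue                       # no hotspot-resistant deployment
        _, best = min(candidates, key=lambda c: c[0])
        best["cpu_avail"] -= batch_ghz     # deploy batch on the best-fit server
        placements[k] = best["id"]
    return placements
```

Note that refusing deployment when every candidate would reach t_max mirrors the paper's behavior of leaving CPU capacity unused rather than risking a hotspot.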

Hotspot Aware Server Relocation Algorithm
As discussed in the previous section, even when CPU capacity is available on a server, a batch is not deployed if the objective functions of the HAWDA algorithm are not met. To increase the utilization of servers and deploy the maximum number of batches (with the possibility of homogeneous and lower outlet temperatures), the servers can be relocated according to their thermal profiles and the regional inlet temperature. This section presents an optimized server relocation algorithm for thermal hotspot-aware server arrangement using a thermal-prediction model that identifies the location for server relocation by using the current inlet temperature and the thermal profiles of the servers under test. The model is named the hotspot aware server relocation algorithm (HASRA) and is presented in Algorithm 2, the core of which reads:

3:   calculate T_max for server i at maximum utilization
4:   if (T_max ≤ T^i_PMO) then
5:     for server j in Server[n] from j = n down to n/2 do
...
13:  end if
14: end for
The objective of the HASRA algorithm is to identify a server i whose T^i_PMO at its current location, at maximum utilization, is likely to approach T_max, and to exchange server i with a server j from the cooler regions of the data center such that the predicted outlet temperatures are lower after relocation. The HASRA algorithm takes the list of servers (along with their thermal profiles) as the input parameter. The HASRA algorithm ensures the homogeneity of the outlet temperature of the relocated servers. The servers are arranged in decreasing order of their inlet temperature, and the value of T_max is calculated over the upper half of the servers.
The algorithm iterates through the first half of the servers (placed in the hotter region) and finds a server i whose T^i_PMO may approach T_max at its current location. Once identified, another server j is searched for in the second half of the servers (placed in a region cooler than that of server i) such that, if the locations of both servers are switched, the three objective functions of the algorithm are fulfilled. In line 8 of the HASRA algorithm, T^j_PMO is calculated with T^i_received and, similarly, T^i_PMO is calculated with T^j_received. In short, HASRA only recommends a server relocation after ensuring that the relocation will bring down the thermal gradient and cause a decline in T_max. The HASRA algorithm provides level 2 assurance of minimizing the peak outlet temperature. However, it is worth noting that HASRA is not applicable to data centers comprising homogeneous servers. Even when server relocation is not possible, HAWDA still provides level 1 assurance of minimizing the chance of reaching T_max, and hence of thermal hotspots. Thus, the two techniques are complementary and together provide maximum utilization of servers even under the high inlet temperatures that cause thermal hotspots. The time complexity of HASRA is O(n), where n is the number of servers in the data center.
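The relocation test described above can be sketched as follows, under the assumption that a swap is accepted when both servers' predicted maximum outlet temperatures stay below the current T_max after exchanging inlet temperatures (i.e., locations). The per-server full-utilization deltas and all names are illustrative.

```python
# Illustrative HASRA-style relocation: walk the hotter half of the servers
# (sorted by decreasing inlet temperature) and swap a server whose predicted
# maximum outlet temperature reaches T_max with a cooler-region server,
# provided both predicted outlets stay below T_max after the exchange.

def pmo(server, t_received):
    """Predicted maximum outlet temperature at full utilization if the
    server received inlet air at t_received (per-server profile delta)."""
    return t_received + server["delta_t_full"]

def hasra(servers):
    servers.sort(key=lambda s: s["t_received"], reverse=True)  # hottest first
    half = len(servers) // 2
    t_max = max(pmo(s, s["t_received"]) for s in servers[:half])
    swaps = []
    for i in range(half):                                # hotter region
        hot = servers[i]
        if pmo(hot, hot["t_received"]) < t_max:
            continue                                     # not a hotspot server
        for j in range(len(servers) - 1, half - 1, -1):  # cooler region
            cool = servers[j]
            # accept only if both predicted outlets drop below the old T_max
            if (pmo(cool, hot["t_received"]) < t_max and
                    pmo(hot, cool["t_received"]) < t_max):
                hot["t_received"], cool["t_received"] = (
                    cool["t_received"], hot["t_received"])
                swaps.append((hot["id"], cool["id"]))
                break
    return swaps
```

Accepting a swap only when both post-exchange predictions fall below the old T_max is what guarantees the decline in T_max stated in the text.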

Experiment Setup
For this study, we chose two heterogeneous server types (A and B). The specifications of these servers are given in Table 2. The workload is composed of sixteen 8-core, thirty-two 4-core, sixty-four 2-core, and one hundred and twenty-eight single-core VMs. Alibaba cloud workload traces [37] are used for this purpose. Because we use 8-core servers for the simulation experiments and a maximum of 8-core VMs for our workload, the thermal profiles used in this paper have nine quantized utilization levels for each server, i.e., idle, 12.5%, 25%, 37.5%, 50%, 62.5%, 75%, 87.5%, and full, as shown in Table 3. We abstract the overall CPU utilization of the server in terms of physical core utilization; e.g., a 4-core VM running at full utilization on an 8-core server represents 50% utilization of the server regardless of which four cores of the server it is mapped to. The experimental setup comprises a data center with a total of 96 servers placed in 8 server racks. Of these 96 servers, 43 are type A servers (SA) and 53 are type B servers (SB). These servers are randomly placed in the data center racks, as shown in Figure 1a. As per the experiment results, the hypervisor running on a server consumes some CPU cycles in addition to the VMs, so the maximum usable processing power of SA and SB servers is approximately 20.8 GHz and 14.1 GHz, respectively. The combined usable CPU capacity of the 96 servers for workload scheduling is 1641.7 GHz. The cooling temperature of the CRAC unit, T_set, is set to 22.9 °C. Figure 1b shows the inlet temperature T^i_received for each server i in the data center. We consider the inlet air temperature of the servers to be higher than T_set to emulate heat recirculation [13]. The top two servers of each rack are assumed to be in the thermal hotspot region. The HAWDA and HASRA algorithms rely upon the predicted outlet temperature T^i_PMO calculated through Equation (4). Consider, for example, the bottom-most server in rack 4.
It is an SA server with an inlet temperature of 23.5 °C. Assuming it is idle and we want to schedule a 4-core VM on this server, the T^i_PMO after scheduling this VM will be 39.1 °C (23.5 + 15.6). This is because a 4-core VM utilizes at most 50% of the 8-core server, and according to the thermal profile of SA (see Table 3), the net increase in outlet temperature at 50% utilization is 15.6 °C. The assumption in this study is that once a batch has been deployed on a server, it will run indefinitely, and T^i_PMO is the worst-case outlet temperature prediction. The T^i_PMO of all servers is calculated, and T_max belongs to the top two servers of rack 4 (44.42 °C).
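As a quick sanity check of the capacity figures quoted above, the combined usable capacity follows directly from the per-type server counts and usable frequencies; the variable names are illustrative.

```python
# Testbed capacity from the text: 43 type-A servers at ~20.8 GHz usable and
# 53 type-B servers at ~14.1 GHz usable.
sa_count, sa_usable_ghz = 43, 20.8
sb_count, sb_usable_ghz = 53, 14.1
total_ghz = sa_count * sa_usable_ghz + sb_count * sb_usable_ghz
# round(total_ghz, 1) gives 1641.7 GHz, matching the combined usable
# CPU capacity quoted in the text.
```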

Workload Scheduling
The workload of all VM batches is deployed on the data center servers using the TASA [34], GRANITE [36], and HAWDA algorithms. Before deployment, the VMs are sorted in decreasing order of their score count. Because TASA and HAWDA are thermal-aware algorithms, the servers are sorted in ascending order of idle-state outlet temperature. TASA deploys the VM batches to the coolest server first. HAWDA deploys VM batches to servers in a best-fit manner (see Algorithm 1): a batch is deployed on the server that would have the least T^i_PMO after the batch is deployed. GRANITE deploys the VM batches to the server yielding the minimum increase in total power consumption. Figure 2 shows the plots of outlet temperature, power consumption, and percentage utilization of the data center servers after deployment of all VM batches for all algorithms. The box plots in Figure 2a,b show the variance in peak outlet temperature and power consumption, respectively. The red dot represents the mean value. The lower and upper whiskers represent the lower and upper 25% of data values, with the endpoints of the whiskers being the minimum and maximum data values, respectively. The boxes represent the middle 50% of data values, and the boundary between the two boxes represents the median. From Figure 2a, it is observed that HAWDA reduces the maximum outlet temperature of the data center servers. This is because it uses a proactive approach of predicting the outlet temperature before deploying workload on the servers. From the thermal profiles of SA and SB, it is observed that SB produces more heat than SA while doing the same task at all utilization levels. HAWDA proactively deploys more workload on the cooler SA servers, hence reducing the overall outlet temperature of the data center. Conversely, TASA deploys workload on the next available coolest server, which can be an SB server (see Figure 2c), leading to higher outlet temperatures.
Additionally, the thermal profiles show a smaller power difference between utilization slots for SB servers than for SA servers; hence GRANITE deploys more workload on SB rather than SA servers, underutilizing the SA servers, as observed in Figure 2c. Because SB servers consume more power and generate more heat, this results in a large variation in outlet temperature and power consumption across servers, as observed in Figure 2a,b. Similarly, as observed in Figure 2b, the overall power consumption is lower with HAWDA than with TASA and GRANITE. Moreover, HAWDA drives more servers to their peak capacity without increasing the outlet temperature or creating thermal hotspots, whereas TASA and GRANITE avoid thermal hotspot creation by underutilizing the servers in the thermal hotspot region, lowering the overall maximum outlet temperature of the servers that cause heat recirculation at the cost of performance.
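HAWDA's best-fit rule described above can be sketched as follows. This is a minimal illustration, not the paper's Algorithm 1: the `Server` fields, the linear per-core thermal rise, and the server names are all assumptions made for the example (the paper uses profiled utilization levels, not a per-core constant).

```python
# Minimal sketch of a HAWDA-style best-fit placement: a batch goes to the
# feasible server whose predicted outlet temperature (T^i_PMO) after
# accepting the batch is lowest. All field values here are illustrative.
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    inlet_c: float
    cores: int
    used: int = 0
    rise_per_core: float = 3.9  # assumed net outlet rise per busy core

    def predicted_outlet(self, extra_cores: int) -> float:
        return self.inlet_c + (self.used + extra_cores) * self.rise_per_core

def hawda_place(servers, vm_cores):
    """Deploy a VM batch on the server minimizing predicted T^i_PMO."""
    feasible = [s for s in servers if s.cores - s.used >= vm_cores]
    if not feasible:
        return None
    best = min(feasible, key=lambda s: s.predicted_outlet(vm_cores))
    best.used += vm_cores
    return best

servers = [Server("SA-1", 23.5, 8), Server("SB-1", 22.9, 8, rise_per_core=4.5)]
print(hawda_place(servers, 4).name)  # SA-1: predicted 39.1 degC vs 40.9 degC
```

Note that the hotter-profiled SB-1 loses despite its cooler inlet, which is exactly how HAWDA ends up favoring SA servers over a "coolest-server-first" rule like TASA's.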

Workload Scheduling with Server Relocation
As discussed previously, hotspot-aware workload scheduling underutilizes servers to avoid creating thermal hotspots. To increase utilization without creating thermal hotspots, a possible solution is the HASRA server relocation algorithm (see Algorithm 2), which identifies the hottest servers and swaps them with cooler servers such that the predicted outlet temperature of the cooler servers at the hot locations remains below that of the hotter servers before relocation. Table 4 shows the updated locations of the servers in the data center after HASRA is applied. The T^i_PMO of all servers is recalculated after relocation, and the T_max value is 43.42 °C, which is 1 °C lower than before relocation. Interestingly, T_max no longer belongs to any server at the top of the racks, where the inlet temperature is high, because HASRA relocates cooler SA servers to those locations. T_max still belongs to the SB servers placed in racks 3 to 6, all of which have an inlet temperature of 23.3 °C.
After relocation with HASRA, the workload of all VM batches is again deployed on the data center servers using the TASA, GRANITE, and HAWDA algorithms, with the VMs sorted in decreasing order of their score count before deployment. Figure 3 shows the outlet temperature, power consumption, and percentage utilization of the data center servers after deployment of all VM batches for each algorithm after applying HASRA. The box plots in Figure 3a,b show the variance in peak outlet temperature and power consumption, respectively, following the same conventions as Figure 2: the red dot is the mean, the whiskers span the lower and upper 25% of the data, and the boxes span the middle 50%, with the boundary between them marking the median.
Comparing Figure 2 with Figure 3, applying HASRA reduces the overall outlet temperature for all scheduling algorithms, because the hotter SB servers are relocated to the cooler regions of the data center. Comparing power consumption after server relocation, the overall power consumption is slightly reduced with TASA and HAWDA: after applying HASRA, SB servers are relocated to the cooler region, increasing their utilization, whereas relocating SA servers to the hotter regions decreases their utilization, as seen in Figures 2c and 3c. HASRA has a negligible effect on GRANITE, as GRANITE is temperature-agnostic.
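The greedy swap idea behind HASRA can be sketched as follows. This is an illustration under simplifying assumptions, not the paper's Algorithm 2: each location is reduced to its inlet temperature, each server to a profiled net temperature rise, and a swap is accepted only if it strictly lowers the current peak prediction (which also guarantees termination).

```python
# Sketch of HASRA-style relocation: repeatedly swap the server with the
# hottest predicted outlet against the one with the coolest, provided the
# swap strictly lowers the current peak predicted outlet temperature.
# Locations and rises are illustrative assumptions.

def hasra_relocate(slots, servers):
    """slots: inlet temperature of each location.
    servers: list of (name, net_rise) tuples aligned with slots.
    Returns the server assignment after greedy hot/cool swaps."""
    servers = list(servers)
    changed = True
    while changed:
        changed = False
        pred = [t + r for t, (_, r) in zip(slots, servers)]  # predicted outlets
        hot = max(range(len(pred)), key=pred.__getitem__)
        cool = min(range(len(pred)), key=pred.__getitem__)
        # accept the swap only if both post-swap predictions stay below
        # the current peak, so the peak strictly decreases each round
        if cool != hot and max(slots[hot] + servers[cool][1],
                               slots[cool] + servers[hot][1]) < pred[hot]:
            servers[hot], servers[cool] = servers[cool], servers[hot]
            changed = True
    return servers

# A hot-profiled server in a hot slot gets swapped with a cool one:
print(hasra_relocate([30.0, 20.0], [("B", 10.0), ("A", 5.0)]))
```

In the example, the peak prediction drops from 40 °C to 35 °C after the swap, mirroring the roughly 1 °C T_max reduction HASRA achieves on the full data center.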

Discussion
In this section, we simulate an Alibaba cloud workload for 25 h using the workload scheduling algorithms considered in this paper and compare them on total computing capacity utilization, peak outlet temperature, and energy consumption. Table 5 shows the maximum combined computing power utilized across all servers, in gross GHz, for each scheduling algorithm under the given workload. The total capacity of the data center is 1641.7 GHz. The HAWDA and HAWDA+HASRA algorithms use the most computing capacity with the lowest peak outlet temperature, because HAWDA schedules more workload on SA servers, which have a higher clock speed than SB servers. HAWDA therefore provisions more computing capacity at lower outlet temperatures.

Peak Outlet Temperature
Ideally there would be no thermal hotspots, but ideal situations do not always exist. Because thermal hotspots (in terms of heat recirculation) may arise from the peak outlet temperature of the servers, lowering T_max is the most desirable way to resist thermal hotspot conditions. Figure 4a,b shows the peak outlet temperature (the maximum outlet temperature across all servers) for each workload scheduling algorithm before and after server relocation. As GRANITE is not thermal-aware (not sensitive to a server's T^i_received), its deployment of VM batches expels more hot air from the servers and increases the chances of high inlet temperatures due to heat recirculation, burdening the cooling mechanism as well. Due to its reactive nature, TASA deploys workload on a cooler server without considering whether that workload will raise the server's outlet temperature, leading to heat recirculation. TASA and GRANITE therefore show similar peak outlet temperatures, higher than HAWDA, which is thermal hotspot aware and sensitive to a server's T^i_received through T^i_PMO. The advantage of the combined approach of server relocation (HASRA) and workload scheduling (HAWDA) is a decrease in peak outlet temperature of more than one degree Celsius.
The time of day can also affect the peak outlet temperature and cooling power consumption inside the data center facility; e.g., during the daytime, solar radiation may propagate heat into the facility, raising the peak outlet temperature and, in turn, the cooling energy. This phenomenon is observed in Figure 4a,b between 630 and 1170 min, where a higher peak outlet temperature is observed.

Heat Flow from Servers
It is worth noting that workload scheduling based on heat-flow calculation may not be as fruitful as the inlet-temperature (T^i_received) based workload scheduling model used in this paper. The amount of heat Q_i flowing through a server i is given by

Q_i = ρ f_i C_p ΔT^i, (10)

where ρ is the density of air (typically 1.19 kg/m³), f_i is the airflow rate inside server i (here 520 CFM, i.e., 0.2454 m³/s), C_p is the specific heat of air (about 1.005 kJ kg⁻¹ K⁻¹), and ΔT^i is the difference between the outlet and inlet temperatures of server i [33]. If ρ f_i C_p is taken as constant, it is evident from Equation (10) that the higher the value of ΔT^i, the higher the heat impact on the cooling system. This is independent of the location of the server, because the power consumption of the server remains the same at any location. Therefore, Equation (10) yields similar heat for two homogeneous or heterogeneous servers at similar utilization levels even when they receive cold air at different temperatures. Hence it is better to use the inlet temperature as the reference for modeling the workload scheduler, as proposed in this paper. Our workload scheduling model has the following limitations:
i. we consider that the heat discharged from a server does not flow back into that server;
ii. the heat generated by the memory, disk, and motherboard is considered negligible; and
iii. we consider that no external factor contributes to heat propagation in the data center facility.
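Plugging the constants quoted above into Equation (10) gives a quick sanity check of the heat scale involved; the 10 K temperature difference below is an arbitrary illustrative value.

```python
# Worked example of Equation (10): Q_i = rho * f_i * C_p * dT_i,
# using the constants quoted in the text.

RHO = 1.19       # air density, kg/m^3
F_I = 0.2454     # server airflow rate, m^3/s (520 CFM)
C_P = 1005.0     # specific heat of air, J/(kg*K)

def heat_flow_watts(delta_t_k: float) -> float:
    """Heat carried out of a server for a given outlet-inlet
    temperature difference (in kelvin)."""
    return RHO * F_I * C_P * delta_t_k

# A 10 K outlet-inlet difference moves about 2.9 kW of heat,
# regardless of where in the room the server sits:
print(round(heat_flow_watts(10.0)))  # 2935
```

Because the result depends only on ΔT^i and not on the absolute inlet temperature, this calculation illustrates the point made above: heat flow alone cannot distinguish a server in a hotspot from an identical one in a cool region.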

Energy Consumption
The amount of cooling energy spent on a server can be calculated from the power consumed by that server and the coefficient of performance (COP) at the corresponding inlet temperature [13], where the COP relates the heat dissipated by server i to the work done to remove it and is calculated using Equation (11). Note that the higher the inlet temperature, the greater the wasted cooling energy, because air received at T_received > T_set must be cooled back down to T_set, which requires additional energy. As a rise in inlet temperature leads to a corresponding rise in outlet temperature, the outlet air of each server with T_received > T_set is an added burden on the CRAC. Hence, the cooling energy wasted for each server, with reference to T_received and T_set and given the computing energy consumption E^i_computing of server i with T_received > T_set, can be calculated as in Equation (12).
Tables 6 and 7 show the statistics of computing energy and cooling energy for the 25 h workload-execution simulation. The value of T_set is 22.9 °C. Column 2 shows the total energy consumed (in kWh) over 25 h; columns 3 to 6 show the minimum, maximum, average, and standard deviation of energy consumed per minute (in watts), respectively. Table 6 shows that there is little difference in energy consumption across the workload scheduling algorithms. For TASA, a minor decrease in energy consumption is observed after server relocation. For HAWDA, however, a minor increase is observed after relocation, because HAWDA schedules more workload on SA servers (which provision more computing capacity due to their higher clock speed), and after relocation these have been moved to the hotter region.
Moreover, for GRANITE there is no change, as GRANITE is not thermal-aware and schedules the VMs on the same machines after server relocation as before. Because GRANITE schedules more VMs on SB servers, which were in hotter regions before relocation, a decrease in cooling energy for GRANITE is observed in Table 7. Hence, both before and after server relocation, HAWDA helps reduce the peak outlet temperature significantly without a notable increase in computation energy or additional load on the cooling mechanism.
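The cooling-energy bookkeeping above can be sketched as follows. This is a hedged illustration only: Equations (11) and (12) are not reproduced in this excerpt, so the quadratic CRAC COP model below is an assumption (a fit widely used in this literature for a chilled-water CRAC unit), and `cooling_energy_wasted` sketches the idea behind Equation (12) rather than reproducing it.

```python
# Sketch of cooling-energy accounting. The quadratic COP model is an
# assumption (a commonly used CRAC fit), not necessarily the paper's
# Equation (11); cooling_energy_wasted likewise only sketches Eq. (12).

T_SET = 22.9  # CRAC supply set point, degC (from the text)

def cop(t_supply_c: float) -> float:
    """Assumed CRAC coefficient of performance at a supply temperature:
    higher supply temperature -> more efficient cooling."""
    return 0.0068 * t_supply_c**2 + 0.0008 * t_supply_c + 0.458

def cooling_energy_wasted(e_computing: float, t_received: float) -> float:
    """Extra cooling energy for a server whose inlet air arrived above
    the set point: the cost of removing its heat at the set-point COP
    minus the cost had the air arrived at T_received."""
    if t_received <= T_SET:
        return 0.0
    return e_computing / cop(T_SET) - e_computing / cop(t_received)
```

Under this model a server drawing 1 kWh of computing energy with 25 °C inlet air wastes a few tens of watt-hours of additional cooling energy relative to one fed at the set point, which matches the qualitative claim that hotspot-region servers burden the CRAC.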

Conclusions and Future Work
This paper shows the importance of spatio-thermal considerations for workload scheduling across data center servers, including servers affected by thermal hotspots. When allocating workload batches, the location and outlet temperature of each server should be considered to resist thermal hotspots, and a useful tool for implementing such an approach is thermal-profile-based outlet temperature prediction. Scheduling approaches such as GRANITE that are not thermal-aware may elevate the utilization of servers that generate more heat and thus produce higher maximum outlet temperatures. A reactive approach like TASA can assign workload to servers that generate more heat but sit in cooler regions, again resulting in higher peak outlet temperatures. A more flexible approach is the hotspot-resistant workload scheduling algorithm HAWDA, proposed in this paper, which can significantly reduce the peak outlet temperature while ensuring performance on servers with high inlet temperatures through adaptive workload allocation. Servers left underutilized because they are prone to thermal hotspot conditions can be relocated according to the hotspot-aware server relocation algorithm HASRA, presented in this paper to complement HAWDA. As shown in this paper, the combined approach provides the same level of average peak server utilization as GRANITE and TASA without causing an additional burden on the cooling mechanism, as the peak outlet temperature of HAWDA is much lower.
For more realistic results, in the future we intend to evaluate the resource allocation algorithms TASA, GRANITE, and HAWDA, with and without HASRA, on a practical computational workload from an already tested calculation (atomic relaxations of a structured lattice) that requires a specific time on a specific computing configuration, and to consider the associated energy consumption for computation and for the cooling system with respect to the outside temperature.