Minimizing Power Consumption of an Experimental HVAC System Based on Parallel Grid Searching

: This paper proposes a parallel grid search algorithm to ﬁnd an optimal operating point for minimizing the power consumption of an experimental heating, ventilating and air conditioning (HVAC) system. First, a multidimensional, nonlinear and non-convex optimization problem subject to constraints is formulated based on a semi-physical model of the experimental HVAC system. Second, the optimization problem is parallelized based on Graphics Processing Units to simultaneously compute optimization loss functions for different solutions in a searching grid, and to ﬁnd the optimal solution as the one having the minimum loss function. The proposed algorithm has an advantage that the optimal solution is known with evidence as to the best one subject to current resolutions of the searching grid. Experimental studies are provided to support the proposed algorithm.


Introduction
Heating, ventilation, and air conditioning (HVAC) systems are usually comprised of the heating, air conditioning, and ventilation systems in residential or commercial buildings, to provide thermal comfort and air quality in indoor spaces [1,2]. The power consumption of HVAC systems is often more than half of total building power consumption [3]; however, many HVAC systems in practice have been running in a full load mode without considering the power consumption efficiency. Thus, it is important to find optimal operating points of HVAC systems in order to avoid unnecessary power consumption. The power consumption optimization is usually composed by two steps: (i) an analytical or numerical model is built to describe the relationship between the input and output variables of the HVAC system; (ii) an optimal operating point of the HVAC system is found to achieve the minimum power consumption subject to some constraints. The subject of this paper is on the second step.
Many optimization methods have been proposed to minimize the HVAC energy consumption in the literature, as summarized here in Table 1. Vakiloroaya et al. [4] reviewed different HVAC energy-saving strategies in the way of combining air conditioning technologies or changing the HVAC configurations. Lu et al. [5,6] presented a modified genetic algorithm (GA) for minimizing power consumption of HVAC systems, based on mathematical models of major components and their heat exchangers. Asiedu et al. [7], Chow et al. [8], Ullah et al. [9] and Kampelis et al. [10] also used the GA for the HVAC power optimization. Fong et al. [11] exploited the simulation tool TRNSYS [12] and TESS Libraries [13] to model the overall HVAC system for a local subway station, and took the robust evolutionary algorithm (REA) to optimize the energy consumption of the HVAC simulation models. Wemhoff et al. [14,15] proposed Lumped-HVAC (L-HVAC) method to build the HVAC model based on the coupled water balance and energy exchange calculations, and the Master Controller (MC) method to minimize energy for a system while meeting conditioning requirements. Zakula et al. [16] took the adaptive grid search technique to map the optimal heat pump performance as a function of the capacity, indoor and outdoor temperatures. Kim et al. [17] developed an energy performance prediction and optimization software to predict and optimize the energy consumption of a chiller system using a machine learning algorithm based on artificial neural network (ANN) models. Kusiak et al. [18] adopted the multi-layer perception (MLP) algorithm to construct energy consumption predictive models of an air handling unit, and the dynamic penalty-based electromagnetism-like algorithm (DPEM) to minimize the overall energy consumption. Wei et al. [19] constructed a quad-objective optimization problem to balance the energy consumption and the indoor air quality, and a modified particle swarm optimization algorithm (PSO) to minimize the total energy consumption of a dynamic overall HVAC systems. Kim et al. [20] developed an integrated meta-model for a lighting, heating, ventilating, and air conditioning system, and the GA algorithm for the minimum energy consumption with the constraints on both thermal and visual comfort. Aframa et al. [21] designed a supervisory model predictive controller (MPC) based on ANN models for minimizing the total cost of operating the HVAC system. Alibabaei et al. [22] discussed the development of three different strategic planning models including Smart Dual Fuel Switching System (SDFSS), Load Shifting (LSH), and LSH-SDFSS, and developed a novel smart dual fuel switching system for minimizing the energy cost of a residential HVAC system. Ghahramani et al. [23] introduced an adaptive hybrid algorithm to learn optimal HVAC settings with no prior historical data. According to the above review of HVAC energy optimization strategies, there are two groups of optimization algorithms: the local optimization algorithm based on the gradient information and the global optimization algorithm based on heuristic searching techniques. The optimization problem for HVAC systems is in the multidimensional, nonlinear and non-convex nature. As a result, both algorithms have some limitations in finding a good solution. The former depends heavily on the given initial solution so that it is easy to yield a local optimum as the solution. The latter usually takes a high computational cost by using random initial solutions in many attempts; even so, it is not unclear whether the obtained optimal solution can be improved further.
This paper proposes a parallel grid search algorithm to find a global solution for the power consumption optimization problem of an experimental HVAC system. First, a multidimensional, nonlinear and non-convex optimization problem subject to constraint is formulated based on a semi-physical model of the experimental HVAC system. Second, the optimization problem is parallelized in order to simultaneously compute the optimization loss functions for different solutions in a searching grid, and to find the optimal solution as the one having the minimum loss function. The proposed algorithm is implemented in the Graphics Processing Unit (GPU), which has tens of thousands of computing cores for parallel computations. The main advantage of the proposed algorithm is that it does not suffer the above-mentioned limitations of the existing optimization algorithms. In particular, the proposed algorithm is exhaustive in terms of computing the loss function values of all feasible points in the searching grid, from which the optimal solution is obtained. Hence, it is confident that the obtained optimal solution is the best one subject to current resolutions of the searching grid.
The rest of the paper is organized as follows. Section 2 describes the problem to be solved. Section 3 presents the HVAC model and the proposed algorithm. Section 4 validates the proposed algorithm via experimental studies. Section 5 makes some concluding remarks.

Problem Description
HVAC systems are usually composed by cooling coils, chillers, cooling towers, pumps, fans and rooms [1]; in terms of transferring energies, there are five loops being involved, namely, the indoor air loop, chilled water loop, refrigerant loop, cooling water loop and outdoor air loop as shown in Figure 1. For an experimental HVAC system in this paper, its input variables are the rotating speed R SA of a cooling coil fan, the rotating speed R CW of a cooling water pump, the rotating speed R CHW of a chilled water pump, the outdoor temperature T out and humidity D out , i.e., x := R SA R CW R CHW T out D out T .
(1)  Here R SA , R CW and R CHW are controllable variables, while T out and D out are uncontrollable variables being determined by environmental conditions. The output variables are the power consumptions of the cooling coil fan, cooling water pump, chilled water pump and chiller, respectively denoted by P SA , P CW , P CHW and P Chiller , as well as the indoor temperature T in , i.e., y := P SA P CW P CHW P chiller T in T . ( The objective is to minimize the total power consumption of the experimental HVAC system subject to some requirements on the comfortable temperature in rooms, by finding the optimal values of controllable variables in x. Thus, the optimization loss function is The last item on the right side of Equation (3) means that there are two cooling coil fans to respectively supply and return air in the HVAC system. Meanwhile, there is a constraint that the indoor temperature T in is maintained in a required range, namely, where T in,lb and T in,ub are the upper and lower bounds of T in , respectively. Besides, it is assumed that the outdoor temperature T out and humidity D out are constant, because T out and D out are approximately unchanged during the time that T in is changed by varying R SA , R CW and R CHW . In summary, the power consumption optimization problem for the HVAC system can be formulated as subject to the constraint in (4).

The Proposed Algorithm
This section proposes a parallel grid search algorithm to find an optimal operating point for minimizing the power consumption of the experimental HVAC system.
First, a mathematical model of the experimental HVAC system is needed. The model is obtained by integrating the sub-models of major components of the experimental HVAC system in three parts: the cooling coil and room, the chiller and the cooling tower. Details in establishing the sub-models are irrelevant for the optimization problem and are reported elsewhere in our earlier work [24]. For the purpose of minimizing power consumption, the output variable vector y in (2) needs to be calculated for given values of the input variable vector x in (1). All major components of the experimental HVAC system are closely coupled via four intermediate variables z The HVAC model is Equation (6) is schematically shown as an Matlab simulink diagram in Figure 2, where f 1 , f 2 and f 3 are represented by three square frames, and f 4 is decomposed as three subparts with the cubic function u 3 and model coefficients d 1 , d 2 and d 3 . The function f 1 is the model for the cooling coil and room, where a = [a 1 , · · · , a 8 ] T is the vector of model coefficients to be determined from experimental data. Since there is no humidifier, the indoor and outdoor humidities are assumed to be the same, i.e., D out = D in . The function f H calculates the air enthalpy being associated with temperature T and humidity D [25], The function f 2 represents the chiller model being described by a two-layer feed-forward network as shown in Figure 3, Here x and y are the input and output of the system respectively;In this sub-model, the input variables are T CWS , R CW and T out , and the output variable is T CWR . Symbol u is the output of the hidden layer; w 1 and w 2 are the weight matrices of hidden and output layers respectively; b 1 and b 2 are their bias vectors; ϕ 1 (·) and ϕ 2 (·) are their activation functions, which are usually given as Here n is the number of input variables of the chiller model. The function f 3 is the cooling tower model, .
Here the function f Hb calculates the enthalpy of saturated air and its wet bulb temperature T [25], The function f K is given by [25] T is the vector of model coefficients. The function f 4 includes the models for the cooling and chilled water pumps as well as the cooling coil fan, T is the vector of model coefficients. In this sub-model, the input variables are R CHW , R CW and R SA , and the output variable is P CHW , P CW and P SA .  Second, the main idea of the proposed algorithm is introduced [26]. In general, a grid search algorithm is an exhaustive searching algorithm, which can easily adapt to nonlinear, non-differentiable or even disconnected optimization problems. It selects a few values of each input variable, and formulates all possible combinations of the input variable values. The loss function of an optimization problem is calculated for all possible combinations to formulate a searching grid. The feasible solutions are found as the points in the searching grid satisfying the required constraints for the optimization problem. The optimal solution is the feasible point having the smallest loss function value. In the experimental HVAC system, the same type of rotating speed sensors are used to obtain the measurements of the three controllable variables R SA , R CW and R CHW ; in addition, the variational ranges of R SA , R CW and R CHW are normalized to be the same. Thus, it is reasonable to choose the same resolution r for the three variables. By separating the variational range with a uniform distance r, the number of different values for R SA is where R SA,u and R SA,l indicate the upper and lower limits of the variational range of R SA . Similarly, the numbers of different values for R CW and R CHW are The number of grid points to be searched is N 1 × N 2 × N 3 . The grid point is denoted by where i, j and k with 1 ≤ i ≤ N 1 , 1 ≤ j ≤ N 2 and 1 ≤ k ≤ N 3 are the grid point indices. With the HVAC model in (6), values of y and z can be calculated for given values of the controllable variables R SA , R CW and R CHW in x (i, j, k) and the constant values of the uncontrollable variables T out and D out , based on the estimated coefficientsâ,ŵ 1 ,ŵ 2 ,b 1 ,b 2 ,ĉ, andd. The power consumption for each grid point is obtained as P (i, j, k). After the power consumption is calculated for all grid points, the feasible grid points are determined as the ones satisfying the constraint in (4), where T in (i, j, k) is the indoor temperature associated with x (i, j, k). The optimal solution is the feasible point with the smallest power consumption, i.e., x (i 0 , j 0 , k 0 ) = arg min i∈I,j∈J,k∈K Third, owing to a fact that the calculations of P (i, j, k) for all grid points are independent of each other, the grid search algorithm is very suitable for parallel computations. Theoretically, if there are enough computing nodes, then these independent calculations can be performed simultaneously in a non-sequential manner, without communication among all the computing nodes. Thus, a parallel algorithm is designed by taking the advantage of a GPU's Single-Instruction Multiple-Data (SIMD) property and a stream processing technique [27]. In the parallel algorithm, the calculation of P (i, j, k) is done by the GPU as a computation device. A Central Processing Unit (CPU) as the host is responsible for initializing the parameters of the HVAC model and managing the final result transmission with the GPU.
The general large-scale numerical computing technology developed on GPU by NVIDIA has shown evident advantages in floating-point operations and memory bandwidth [28]. With the potential of a GPU being exploited, one single computer has an entry-level super-computing power. The Compute Unified Device Architecture (CUDA) platform is a new software framework for the GPU parallel computing. It provides a programming model based on a three-level (Grid-Block-Thread) thread management structure as shown in Figure 4. In this model, the CPU serves as the host, which is responsible for logical transaction processing and serial computing, and the GPU serves as the device (or coprocessor) to execute instructions, which can be highly threaded in a so-call kernel function [28].  For the power consumption optimization of the HVAC system, the outdoor temperature T out and humidity D out are fixed and there are three optimization variables, namely, R SA , R CW and R CHW . It is known that each grid in the CUDA architecture has multiple two-dimensional blocks, and each block has multiple two-dimensional threads. Only three of these dimensions are required here. The grid point indices in (9) can be directly represented as where block.x, thread.x and thread.y are the in-built variables in CUDA for the indices of thread and block, respectively. The algorithm starts with the formulation of the entire searching grid composed by N 1 × N 2 × N 3 points. Then, the global memory of GPU is allocated for the input and output variables, and the inputs of all the searching grids using (9) are transferred from the host memory to the global memory. Since the HVAC model coefficients, the outdoor temperature and humidity are commonly used in all threads, they are initialized in the common memory of GPU, which greatly improves the computing efficiency. Next, grid parameters (blockDim and threadDim in Algorithm 1) are set and the kernel function is launched. In the kernel function, after the computational tasks of all the threads are finished, the results are copied from the global memory back to the host memory. Finally, their power consumption values are compared to obtain the optimal grid point with the smallest power consumption. Compare the values of P total to obtain the optimal solution x (i 0 , j 0 , k 0 ) in (10) Algorithm 1 gives the proposed algorithm in detail. The symbol optimizationKernel stands for a kernel function, blockDim is the number of blocks and threadDim is the number of threads per block. They are determined by the shared memory size configuration and registers per thread. Generally, threadDim is usually a multiple of 64 [28]. It should be noted that their improper selection will lead to lower GPU occupancy. The symbol __shared__ is a keyword to mean that this variable (EveryBlockPtotal) is a shared variable. All the threads in a block can share memories, synchronize and communicate by using shared variables. min(·) is a function to find the minimum of a data set. The symbol __syncthreads() guarantees that every thread in the block has completed instructions before the function min(·) is executed [27].

Experimental Studies
In this section, the HVAC model coefficients are estimated from experimental data, and the proposed algorithm is compared with the two other algorithms.

Model Coefficient Estimation
Real-time experiments are implemented in the experimental HVAC system, whose major components are shown in Figure 5. The experiments are implemented for 15 h, where the manipulated input variables are changed in turns from one constant value to another every 1000 s. It is observed that the HVAC system reaches a steady-state in the last 100 seconds of each 1000-s change. Thus, the data samples in the last 100 s of all 1000-s changes are extracted and merged together. Figures 6 and 7 present the measured input and output variables, respectively. The sample period is 1 s. The HVAC model coefficients in (6)    The measured and estimated outputs are compared in Figure 7. The corresponding coefficients of variation (CVs) are given Table 2 as a qualitative measure of model quality. The CV for a variable X and its estimateX is defined as where µ X−X and σ X−X are the sample mean and standard deviation of the error variable X −X, respectively. It can be seen that the estimated outputs are in line with the measured outputs, and the CVs are quite small.

Performance Comparison of Optimization Algorithms
This subsection compares the proposed algorithm with the interior-reflective Newton algorithm as a typical local optimization algorithm, and the genetic algorithm (GA) as a representative global optimization algorithm.
For the experimental HVAC system, the rotational speed of pumps and fans must be in a moderate range. In particular, the rotational speed of the chilled water pump should be larger than that of the cooling water pump. Thus, the value ranges of input variables R SA , R CW and R CHW are restricted to the interval in Equation (16), The rotational speeds are controlled by voltage signals in the experiment, so that the unit is the duty cycle with 0% and 100% for the minimum and maximum speeds respectively. A preliminary experiment is carried out to determine the feasible ranges of input variables in Equation (16) in order to meet with the required indoor temperature. The outdoor temperature is fixed to 30 • C. For maintaining a desirable internal atmospheric environment, the indoor temperatures is constrained as The hardware configuration for the optimization algorithms is shown in Table 3. Tesla K80 of NVIDIA with a total of 4992 cores is selected as the device to be used in the parallel grid search algorithm. The results of the three algorithms are obtained as Table 4. First, for the interior-reflective Newton algorithm, the initial point must be given and it is generated by a random function within the interval in (16). It is concluded from Table 4 that different initial points lead to entirely different optimization results from the interior-reflective Newton algorithm. Second, the GA algorithm takes the default setup parameters in the MATLAB optimization toolbox, with the population size 100, the tolerance value 0.1, and a randomly-generated initial population. It is observed from Table 4 that the GA algorithm yields similar optimization results where values of P total are quite close in three trials. However, it is hard to know whether a smaller value of P total can be achieved by introducing more trials. Third, for the proposed algorithm, the numbers in (7) and (8) are selected as N 1 = N 2 = 40/0.1 = 400, N 3 = 50/0.1 = 500, because the resolution of the experimental devices for measuring rotational speeds is about 0.1. The optimal result is associated with P total = 2374.10, which is much smaller than the counterparts from the interior-reflective Newton algorithm, and is slightly smaller than the best one. P total = 2374.21 from the GA algorithm. Moreover, the propose algorithm is able to give the distribution of P total in Figure 8 for all points satisfying the indoor temperature constraint in (17). In order to observe the interior of Figure 8a, the 3D cloud image in Figure 8a is unfolded along the z axis (R CHW ), and twenty 2D cloud images are obtained as shown in Figure 9. For the i-th sub- figure  (i = 1,2, ..., 20), the value of R CHW is given by R CHW (i) = 40 × i/20 + 10, while the horizontal and longitudinal coordinate variables are R SA and R CW respectively. The irregular shapes imply the presence of a non-convex optimization problem. The optimal point [33.0, 60. 3, 14.2] in Table 4 is the darkest blue point (red asterisk) in the third column of the first row in Figure 9. It is worthy to note that there are several deep blue points in the cloud chart, such as the two black dots in Figure 9. The values of points nearby are listed in Table 5. Clearly, there are many locally optimal solutions, so that the optimization problem is clearly non-convex and nonlinear, which demonstrates the necessity of applying the proposed algorithm to solve the optimization problem. The proposed algorithm provides the entire solution space containing all values of the loss function corresponding to all operating points in the searching grid; thus, the best operating point is ready to be found. However, the GA algorithm only gives the solution path to the optimal operating point, which is difficult to be trusted without a comparison to other operating points in the entire solution space. Hence, unlike the GA algorithm, it is very confident that the proposed algorithm can find the best operating point for the given resolution of the searching grid.    Finally, the computation time is compared in Table 4. As expected, the interior-reflective Newton algorithm is the quickest, but its local optimal result is not acceptable. The proposed algorithm takes about the half computing time of the GA algorithm. As a comparison, the serial grid search algorithm takes 891,000 s (each grid search spends about 0.0139 s), which is more than 1300 times of the proposed algorithm. Thus, it is very necessary to do the grid search in a parallel manner. In summary, the proposed algorithm is able to guarantee a good optimization solution within an acceptable computing time.

Optimization Result Validation
This subsection provides experiments to validate the optimization results achieving the minimal power consumption.
Real-time experiments are implemented under the same experimental conditions as the ones for estimating HVAC model coefficients. Figure 10 shows the input variables in x of the experimental HVAC system. Multiple operating points are testified for the comparison purpose. The last constant parts of x in Figure 10 are associated with the optimal operating point from the proposed algorithm in Table 4. Measured output variables are compared with estimated counterparts from the HVAC model using model coefficients in (12)- (15). The consistency between the measured and estimated output variables in Figure 11 says that the estimated HVAC model is accurate. As expected, P total achieves the smallest value at the last part in Figure 11f. The actual value of P total is equal to 2509, with a small deviation ((2509-2374.1)/2374.1 = 5.7 %) from the expected value 2374.10 in Table 4. This is due to a fact that the experimental conditions cannot be controlled the same as the assumed ones; in particular, the outdoor temperature T out fluctuates around 30 • . Despite such a small deviation, the experiments clearly support the validity of the proposed algorithm in terms of finding an optimal operating point to achieve the minimal power consumption.

Conclusions
This paper presented a parallel grid search algorithm to find an operating point achieving the minimal power consumption for experimental HVAC systems. A multidimensional, nonlinear and non-convex optimization problem subject to constraints was formulated based on a mathematical model of the experimental HVAC system. By exploiting Graphics Processing Units, the optimization problem was parallelized to simultaneously compute optimization loss functions for different solutions in a searching grid. The proposed algorithm provided the entire solution space containing all values of the loss functions corresponding to all operating points in the searching grid. The optimal solution was obtained as the one having the minimum loss function and satisfying the required constraints. Thus, the main advantage of the proposed algorithm was that the obtained optimal solution was known with evidence to be the best one subjected to current resolutions of the searching grid. The proposed algorithm was validated in experimental studies and compared with two existing algorithms. In particular, the small deviation between the experimental and estimated results demonstrated that the proposed algorithm indeed found the optimal operating point to achieve the minimal power consumption.
As one future work, it is necessary to consider a non-constant outdoor temperature T out . The HVAC model in (6) has considered T out in the components f 1 and f 3 . The computational time of the proposed algorithm is about 11 minutes as given in Table 4. Hence, if T out is changing not too quickly, then it is ready to implement the proposed algorithm for different values of T out . The selection of sampling periods is another future work. A higher sample period reduces the computational time because fewer data samples are involved in optimization. However, it may deteriorate the optimization accuracy, because noises are omnipresent in the collected data samples, and the negative noise effects will be larger by exploiting fewer data samples in estimating the HVAC model parameters and calculating the optimal operating point.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

R CHW
chilled water pump speed R CW cooling water pump speed R SA cooling coil fan speed T CHWS chilled water supply temperature T CHWR chilled water return temperature T CWS cooling water supply temperature T CWR cooling water return temperature T SA supplied air temperature through the cooling coil T in indoor temperature T out outdoor temperature D in indoor humidity D out outdoor humidity P chiller power consumption of the chiller P CHW power consumption of the chilled water pump P CW power consumption of the cooling water pump P SA power consumption of the cooling coil fan