Optimizing the Regulation of Aggregated Thermostatically Controlled Loads by Jointly Considering Consumer Comfort and Tracking Error

Abstract: Thermostatically controlled loads (TCLs) can offer demand-side regulation when properly controlled. In this paper, the aggregate power of TCLs is used to track the automatic generation control (AGC) signal by changing the temperature setpoint. The dynamics of the indoor temperature are described by a Monte Carlo model, and population dissatisfaction is described by the predicted percentage of dissatisfied (PPD). The optimization has two objectives: minimizing population dissatisfaction and minimizing tracking error. We propose an improved active target particle swarm optimization (APSO) algorithm to optimize the model, so that consumer dissatisfaction is kept as small as possible while the aggregate power tracks the AGC signal. The novelty of this paper is to introduce PPD into the model and to establish three models that use PPD as either the objective function or a constraint. Simulation results verify the efficiency of the designed model.


Introduction
In the traditional power grid, the power system adjusts the output of each generator on the power generation side by automatic generation control (AGC) to maintain the frequency offset within the allowable range [1]. In the process of building a smart grid, frequency adjustment service has become an important issue in the research of power market auxiliary services. Demand-side management (DSM), an aspect of the smart grid, is highly significant for auxiliary grid services [2]. Recently, in the research of demand-side regulation, the direct load control scheme based on air conditioning loads has attracted the attention of researchers due to its fast response speed and low cost [3].
Works such as [4,5] established mathematical models based on the physical characteristics of air conditioning loads to describe the continuous evolution of the temperature state and the switching process of the thermostat state. In [6,7], a homogeneous air conditioning load model was described, and its synchronization effect and power oscillation were analyzed. Specifically, the influence of the synchronized operation of air conditioning loads on power system oscillations, and the damage it can cause, were studied. In [8,9], the authors established a state queue model for aggregated air conditioning loads and proposed a dynamic parameter selection method for the state transition matrix. In [10,11], based on the Markov chain method, the authors established a heterogeneous identification model for the aggregated air conditioning load and analyzed the control performance under different state information. Based on the partial differential dissipation model of the air conditioning load, a bilinear state space model was established in [12] to adjust the temperature setting of the aggregated loads and achieve aggregated power regulation. The control of aggregated air conditioning loads mainly involves two methods: on/off control [13-15] and temperature setting adjustment [16-19]. In [20], the author proposed an adaptive grouping control method for air conditioning loads based on parameter similarity. The authors of [21] designed a random sorting method to control the switching action of air conditioning loads. In [22], the author considered consumers' comfort requirements and established a fuzzy control rule to control the air conditioning load while protecting consumers' demands. Based on the existing switch control methods for air conditioning loads, the authors of [23] designed two methods that introduce the demand response function into the scheduling problem for analysis.
In [24,25], an air conditioning load scheduling strategy was proposed based on model reference adaptive control, which can be extended to other load control schemes with similar switching characteristics.
In the above works, the trade-off between population dissatisfaction and power tracking in the power system was not studied. In this paper, three optimization schemes are established to reach a trade-off between population dissatisfaction and power tracking errors by selecting the optimal temperature range for the general public and the grid. The rest of this paper is organized as follows: Section 2 introduces the model of a physically based single thermostatically controlled load (TCL) and a commonly applied thermal comfort model. Section 3 proposes three schemes to reach a trade-off between population dissatisfaction and power tracking errors. Section 4 introduces a temperature setpoint optimization algorithm based on active target particle swarm optimization (APSO), and Section 5 verifies the reliability and rationality of the scheme with MATLAB and EasyFit simulations. Finally, Section 6 summarizes the key contributions of this work.

System Model and Problem Formulation
The demand-side regulation system is composed of TCLs, agents, and a control center, as shown in Figure 1. The TCLs are divided into groups of different sizes to better provide regulation services for the power system. TCLs are commonly used to provide regulation services through indirect load control. In particular, the control center periodically transmits control instructions in the form of temperature setpoints, and the agents turn the TCLs on and off according to the AGC signal.

Monte Carlo Model
This section introduces a Monte Carlo model of the indoor temperature dynamics of thermostatically controlled loads. We assume that the indoor temperature corresponding to load $i$ ($i = 1, 2, \ldots, N$) is $T_i$, and the environmental temperature is $T_\infty$. In addition, $C_i$ (kWh/°C) and $R_i$ (°C/kW) denote the equivalent heat capacity and thermal resistance, respectively. The operating state of a TCL is represented by the variable $s_i$, where $s_i = 1$ refers to the on state and $s_i = 0$ refers to the off state. The cooling power of a TCL is $P_i$ (kW). The load dynamics of each TCL are then simulated by a separate first-order ordinary differential equation, Equation (1). The on/off state variable $s_i(t)$ is governed by the temperature deadband in Equation (2), where $T_i^{\min}$ and $T_i^{\max}$ are the lower and upper limits of the indoor temperature, respectively. Furthermore, $T_i^{\min}$ and $T_i^{\max}$ are related to the temperature setpoint through Equation (3), where $T_i^{sp}$ (written $T_i^{set}$ in the sections that follow) represents the temperature setpoint of a TCL and $\Delta$ is the width of the temperature deadband. The aggregated power consumption of the TCLs in group $l$ ($l = 1, 2, \ldots, M$) is calculated by Equation (4), where $N_l$ denotes the number of TCLs in group $l$ and $\eta_i$ is the performance coefficient of TCL $i$. In the following sections, we assume that the temperature setpoints of the TCLs are changed instantly by broadcasting control commands to all TCL units, as in Equation (5). In (5), the base term is the preferred temperature setpoint of the $i$th consumer, and $\Delta T_l^{sp}$ is the setpoint change command, bounded by $\Delta T_l^{sp,\max}$ for consumer comfort. Therefore, we assume that $\Delta T_l^{sp}$ is always within the desired range.
Equations (1)-(5) represent the entire Monte Carlo model, which is the basic model for studying TCLs [16,20,26-28] and is also the basis for other models. Moreover, the Monte Carlo model is an extensive single-input single-output dynamic model, which makes it well suited to implementing the proposed schemes.
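For reference, one plausible form of Equations (1)-(5) is sketched below. It assumes the standard first-order equivalent-thermal-parameter model with a hysteresis thermostat, written with the symbols defined above. The bar on the preferred setpoint, the left-limit notation $s_i(t^-)$, and the symmetric split of the deadband around the setpoint are assumptions of this sketch and may differ from the authors' typeset equations.

```latex
\begin{align}
C_i \frac{dT_i(t)}{dt} &= \frac{T_\infty - T_i(t)}{R_i} - s_i(t)\, P_i, \tag{1}\\
s_i(t) &=
\begin{cases}
0, & T_i(t) \le T_i^{\min},\\
1, & T_i(t) \ge T_i^{\max},\\
s_i(t^-), & \text{otherwise},
\end{cases} \tag{2}\\
T_i^{\min} &= T_i^{sp} - \frac{\Delta}{2}, \qquad T_i^{\max} = T_i^{sp} + \frac{\Delta}{2}, \tag{3}\\
P_l(t) &= \sum_{i=1}^{N_l} \frac{1}{\eta_i}\, P_i\, s_i(t), \tag{4}\\
T_i^{sp} &= \bar{T}_i^{sp} + \Delta T_l^{sp}, \qquad \left|\Delta T_l^{sp}\right| \le \Delta T_l^{sp,\max}. \tag{5}
\end{align}
```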
The operational characteristics of TCLs over time are shown in Figures 2 and 3. The rising curve indicates the natural temperature rise of the room: an increase in temperature means that the load is switched off. During this period, the system does not consume power, and the room temperature rises naturally due to heat conduction and convection. The descending curve represents the cooling process driven by the air conditioner compressor: a drop in temperature means that the load is switched on, so the compressor consumes power and the room temperature decreases. The change in load switch status is closely related to the change in room temperature. Once the room temperature reaches the upper limit, the TCL is switched on. Once the room temperature reaches the lower limit, the TCL is switched off to allow the room temperature to rise naturally.
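The on/off cycle described above can be illustrated with a short simulation. The following is a minimal sketch, not the authors' MATLAB implementation: it assumes the standard first-order thermal model $C_i\,dT_i/dt = (T_\infty - T_i)/R_i - s_i P_i$ integrated with forward Euler steps, and the function names, the 26 °C setpoint, the 1 °C deadband, the 32 °C ambient temperature, and the coefficient of performance of 2.5 are illustrative assumptions. Only the Gaussian means and variances for R, C, and P and the 4 s step follow the simulation section.

```python
import random


def step_tcl(temp, state, t_set, deadband, r, c, p, t_amb, dt_h):
    """Advance one TCL by dt_h hours and return (new_temp, new_state)."""
    # Assumed first-order model: C dT/dt = (T_amb - T)/R - s*P
    d_temp = ((t_amb - temp) / r - state * p) / c
    temp += d_temp * dt_h
    # Hysteresis thermostat: on at the upper limit, off at the lower limit
    if temp >= t_set + deadband / 2.0:
        state = 1
    elif temp <= t_set - deadband / 2.0:
        state = 0
    return temp, state


def simulate_group(n_loads, n_steps, t_set=26.0, deadband=1.0, t_amb=32.0,
                   dt_h=4.0 / 3600.0, eta=2.5, seed=0):
    """Return the aggregated electrical power (kW) of one group at each step."""
    rng = random.Random(seed)
    std = 0.1 ** 0.5  # a variance of 0.1 corresponds to this standard deviation
    loads = []
    for _ in range(n_loads):
        # The max(0.5, ...) clamp is a guard added here against non-physical draws
        r = max(0.5, rng.gauss(2.0, std))   # thermal resistance, degC/kW
        c = max(0.5, rng.gauss(2.0, std))   # heat capacity, kWh/degC
        p = max(0.5, rng.gauss(5.6, std))   # cooling power, kW
        temp = rng.uniform(t_set - deadband / 2.0, t_set + deadband / 2.0)
        state = rng.randint(0, 1)
        loads.append([temp, state, r, c, p])

    power_trace = []
    for _ in range(n_steps):
        total = 0.0
        for load in loads:
            temp, state, r, c, p = load
            load[0], load[1] = step_tcl(temp, state, t_set, deadband,
                                        r, c, p, t_amb, dt_h)
            total += load[1] * p / eta  # electrical power = cooling power / COP
        power_trace.append(total)
    return power_trace


if __name__ == "__main__":
    trace = simulate_group(n_loads=100, n_steps=900)  # one hour at 4 s steps
    print(min(trace), max(trace))
```

Raising `t_set` for a group shortens the on periods and lowers the aggregated power, which is the lever the control center uses to follow the AGC signal.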

Predicted Percent Dissatisfied Model
The predicted percentage of dissatisfied (PPD) forecasts the share of people who feel uncomfortable in a given thermal environment, as a percentage of the total number of occupants, according to the average thermal sensation scale. The PPD value is a function of the predicted mean vote (PMV), which is based on people's votes on their thermal sensation in the environment. The proportion of dissatisfied people is represented by a curve drawn in percentage terms. People's thermal perception of the environment differs: even under comfortable conditions with PMV = 0, 5 percent of people are still not satisfied.
Thermal comfort plays a crucial role in human comfort. The American Society of Heating, Refrigerating, and Air-Conditioning Engineers (ASHRAE) defines the thermally comfortable environment as being dependent on six variables: air temperature, air velocity, humidity, mean radiant temperature, clothing resistance, and activity level [18]. Fanger derived the thermal comfort equation, Equation (6), on the basis of the human body's heat balance [19]. In (6), $L_l$ represents the human body's heat storage, $M_l$ is the energy metabolism rate of the human body in community $l$, $W$ indicates the mechanical power of the human body, $t_{ai}$ is the ambient temperature around the consumers in community $l$, $t_r$ expresses the indoor average radiant temperature, $f_{cl}$ represents the ratio of clothed surface area to naked body surface area, and $P_a$ is the partial pressure of water vapor, calculated from the air humidity as in Equation (7), where $\varphi$ represents the air humidity. The mean surface temperature of the clothed body, $t_{cl}$, is obtained from Equation (8), where $I_{cl}$ indicates the thermal resistance of clothing and $h_c$ is the convective heat transfer coefficient, which is expressed in Equation (9) in terms of the air velocity $V_a$. Combining (6)-(9) gives the relationship between room temperature and human body heat storage. The predicted mean vote (PMV) is then defined from this heat storage, and consumer comfort is usually expressed in PMV. The relationship between PPD and PMV is shown in Table 1. Finally, the forecasted dissatisfaction proportion of community $l$, $PPD_l$, is defined as a function of the PMV.
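The last two steps of this chain, from heat storage to PMV and from PMV to PPD, can be sketched as follows. The sketch assumes that the paper's definitions match the standard Fanger expressions adopted in ISO 7730, namely PMV = (0.303 exp(-0.036 M) + 0.028) L and PPD = 100 - 95 exp(-0.03353 PMV^4 - 0.2179 PMV^2), with M and L in W/m². The function names are illustrative, and computing L itself from Equations (6)-(9) is not attempted here.

```python
import math


def pmv_from_heat_load(metabolic_rate, heat_load):
    """Predicted mean vote from metabolic rate M and heat load L (both W/m^2).

    Assumes the standard Fanger relation PMV = (0.303*exp(-0.036*M) + 0.028) * L.
    """
    return (0.303 * math.exp(-0.036 * metabolic_rate) + 0.028) * heat_load


def ppd_from_pmv(pmv):
    """Predicted percentage of dissatisfied (%) for a given PMV.

    Assumes the standard relation PPD = 100 - 95*exp(-0.03353*PMV^4 - 0.2179*PMV^2).
    """
    return 100.0 - 95.0 * math.exp(-0.03353 * pmv ** 4 - 0.2179 * pmv ** 2)


if __name__ == "__main__":
    for vote in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(vote, round(ppd_from_pmv(vote), 2))
```

At PMV = 0 the expression gives a PPD of 5%, which matches the floor of dissatisfaction noted in the text, and the curve is symmetric in the sign of the PMV.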

Optimal Solutions
TCLs are used to provide frequency regulation services to the smart grid and are divided into $M$ groups. Two major goals are proposed: (a) minimization of dissatisfaction, and (b) minimization of the error in tracking the AGC signal. Three models are proposed that combine comfort constraints and power constraints in three different ways. The optimized control structure is shown in Figure 4. Within a period $t$, from $t_0$ to $t_1$, the optimal temperature setpoint $T_i^{set}$ is calculated, and the control center generates a command, which is broadcast to the TCLs. The TCLs adjust their temperature settings according to the commands.

Scheme A
The objective of this scheme is to minimize the total PPD subject to the power tracking and temperature constraints.
Here, $\gamma$ is the largest allowable tracking error; keeping the error within $\gamma$ meets the required precision. $M$ is the number of groups of aggregated TCLs, $PPD_l$ is the dissatisfaction of group $l$, $P_l$ is the aggregated power consumption of group $l$, and $P_r$ is the value of the AGC signal.

Scheme B
The objective of this scheme is to minimize the largest PPD subject to the power tracking and temperature constraints.
Here, $\max PPD_l$ is the maximum dissatisfaction among the $M$ groups of consumers.

Scheme C
The objective of this scheme is to minimize the power tracking error subject to the PPD and temperature constraints.
Here, $\alpha$ is an upper limit on the total PPD.
Scheme A makes consumer dissatisfaction the objective function: a smaller degree of dissatisfaction indicates greater comfort for the consumer. Taking the overall perspective, the sum of the dissatisfaction levels of the $M$ groups of TCLs is the goal, so that consumers' overall level of comfort is satisfactory. In this model, the tracking error and the temperature setpoint are the two constraints: the total power must keep up with the reference signal, and the temperature setpoint must stay within a set range.
Scheme B also takes the consumer's point of view, but the objective function becomes the largest PPD among the groups, which is minimized. The constraints are the same as in Scheme A: the power must keep up with the reference signal while the temperature stays within the set interval. Compared with Scheme A, Scheme B focuses more on the satisfaction of a single group of consumers. This effectively avoids large differences in PPD values among the groups, while the total PPD still remains within the acceptable range for the system. However, compared with Scheme A, it has the disadvantage of larger tracking errors and higher power consumption.
Scheme C is proposed from the power system side and uses the tracking error as the objective function to be minimized. The constraints are now the dissatisfaction and the temperature setpoint: the sum of the dissatisfaction levels of the $M$ groups must be less than $\alpha$, and the temperature must stay within the set range. Compared with the former two models, the power consumption of this model is greatly reduced and close to the AGC signal, and the power system is more stable. The power supply is then almost equal to the power consumption, but the disadvantage is that consumers' dissatisfaction increases.
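Read together, the three schemes can be summarized in one compact statement. The formulation below is a sketch inferred from the verbal descriptions and the symbols $\gamma$, $\alpha$, $PPD_l$, $P_l$, and $P_r$. The absolute-value form of the tracking error and the bound symbols $T^{\min}$ and $T^{\max}$ on the setpoint are assumptions, so the exact expressions may differ from the authors' equations.

```latex
\begin{align*}
\textbf{Scheme A:}\quad & \min_{T^{set}} \ \sum_{l=1}^{M} PPD_l
 & \text{s.t.}\quad & \Bigl|\sum_{l=1}^{M} P_l - P_r\Bigr| \le \gamma, \quad T^{\min} \le T_l^{set} \le T^{\max},\\
\textbf{Scheme B:}\quad & \min_{T^{set}} \ \max_{1 \le l \le M} PPD_l
 & \text{s.t.}\quad & \Bigl|\sum_{l=1}^{M} P_l - P_r\Bigr| \le \gamma, \quad T^{\min} \le T_l^{set} \le T^{\max},\\
\textbf{Scheme C:}\quad & \min_{T^{set}} \ \Bigl|\sum_{l=1}^{M} P_l - P_r\Bigr|
 & \text{s.t.}\quad & \sum_{l=1}^{M} PPD_l \le \alpha, \quad T^{\min} \le T_l^{set} \le T^{\max}.
\end{align*}
```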
To evaluate the tracking performance, the root mean square error (RMSE) of the tracking error is used, where $N_s$ is the total number of samples, $P_{target}^{\max}$ and $P_{target}^{\min}$ represent the maximum and minimum limits of the sample signals, and $e_k$ is the tracking error at sample $k$.
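A minimal sketch of this metric is given below. It assumes that the RMSE is normalized by the range of the target signal and reported as a percentage, which is inferred from the symbols listed above and from the percentage values reported in the simulation section; the function name and the exact normalization are assumptions.

```python
import math


def normalized_rmse(errors, p_target_max, p_target_min):
    """Root mean square of the tracking errors e_k, as a percentage of the
    target signal range (the normalization is assumed, not taken from the paper)."""
    n_s = len(errors)
    rms = math.sqrt(sum(e * e for e in errors) / n_s)
    return 100.0 * rms / (p_target_max - p_target_min)


if __name__ == "__main__":
    # Errors of +/-1 kW against a target that spans 0 to 10 kW
    print(normalized_rmse([1.0, -1.0, 1.0, -1.0], 10.0, 0.0))
```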

Temperature Setpoint Optimization Algorithm Based on APSO
Active target particle swarm optimization (APSO) is a stochastic global optimization algorithm based on population evolution [29]. It is simple, easy to implement, and has good global search ability. At the same time, it places no special requirements on the form of the objective function. The algorithm has been widely used in many fields such as function optimization, neural network training, parameter tuning, and industrial system optimization. All temperature updates follow the particle swarm update rules, which involve the $j$th temperature change (the particle velocity), the $j$th temperature $T_j$, the historical optimal value $lbest_j$ of the $j$th temperature, and the global optimal value $gbest$ over all temperatures. Here $c_1$ and $c_2$ are positive constants, $rand_1$ and $rand_2$ are two random numbers from the interval [0, 1], and $\omega$ is the weighting factor. The temperature setpoint optimization algorithm based on APSO is shown in Figure 5. As an evolutionary algorithm, the APSO algorithm maintains the diversity of the population during the evolution process, which is the precondition for the algorithm to converge to the global optimum. However, when a particle finds a current optimal value, other particles quickly move closer to it, so the diversity of the group is lost and the algorithm becomes prone to premature convergence. The whole process is as follows. First, the temperature range, the iteration step size, and the temperature change rate are initialized. Second, the objective function value is evaluated. Third, the optimal temperature of each individual and the optimal temperature of the group are determined. Fourth, the temperature and the temperature change are updated, and the objective function value is calculated to determine whether the update condition is satisfied.
If it is not satisfied, the temperature and its change rate continue to be updated. If it is satisfied, the control center obtains the temperature and its change rate and communicates them to the TCLs to provide adjustment services for the balance of the power system. Finally, if the AGC signal meets the termination condition, the service is terminated; otherwise, the control center continues to provide AGC load tracking optimization services.
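The update loop described above can be sketched with a plain particle swarm optimizer. This is the textbook inertia-weight PSO, not the improved APSO of the paper, and it is written under the assumption that the update rules take the usual form "change = ω·change + c1·rand1·(lbest − T) + c2·rand2·(gbest − T)" followed by "T = T + change". The function name, the parameter values, and the toy quadratic objective are illustrative.

```python
import random


def pso_minimize(objective, lower, upper, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize a scalar objective over [lower, upper] with standard PSO."""
    rng = random.Random(seed)
    # Step 1: initialize temperatures (positions) and change rates (velocities)
    pos = [rng.uniform(lower, upper) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    # Steps 2-3: evaluate the objective and record individual and group bests
    lbest = pos[:]
    lbest_val = [objective(x) for x in pos]
    g = min(range(n_particles), key=lambda j: lbest_val[j])
    gbest, gbest_val = lbest[g], lbest_val[g]
    # Step 4: update change rates and temperatures until the budget is spent
    for _ in range(n_iter):
        for j in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[j] = (w * vel[j]
                      + c1 * r1 * (lbest[j] - pos[j])
                      + c2 * r2 * (gbest - pos[j]))
            # Keep the setpoint inside the allowed temperature range
            pos[j] = min(upper, max(lower, pos[j] + vel[j]))
            val = objective(pos[j])
            if val < lbest_val[j]:
                lbest[j], lbest_val[j] = pos[j], val
                if val < gbest_val:
                    gbest, gbest_val = pos[j], val
    return gbest, gbest_val


if __name__ == "__main__":
    # Toy objective: squared distance from a hypothetical preferred setpoint of 26 degC
    best_t, best_val = pso_minimize(lambda t: (t - 26.0) ** 2, 25.0, 28.0)
    print(best_t, best_val)
```

In the paper's setting, the objective would be the total PPD, the largest PPD, or the tracking error, with the constraints of the chosen scheme handled, for example, by a penalty term; that choice is left open here.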

Simulation Results
The optimization problem was solved using the MATLAB software platform. The running times of the three schemes were 5 min 26 s, 7 min 3 s, and 4 min 16 s, respectively. The tracking performance of the TCLs under different conditions was analyzed at the same time. In this section, the performance under the global temperature constraints is evaluated through a discussion of three different cases, together with the trade-off between consumer dissatisfaction and tracking error. In the three simulation cases, three scheduling indicators were analyzed and compared: the tracking error, which reflects the frequency adjustment performance; the variable room temperature range, which reflects the impact of consumer comfort requirements; and the consumers' dissatisfaction (PPD), which reflects consumers' satisfaction with the power distribution. In each of the three cases, the TCLs were divided into 3 groups ($M$ = 3, with 3000 TCLs in total), with $N_1$ = 800 in Group 1, $N_2$ = 1000 in Group 2, and $N_3$ = 1200 in Group 3. The number of TCLs was determined by referring to the baseline load value in [30].
The TCLs had the parameter values listed in Table 2. Every parameter ($R$, $C$, $P$, $T_{sp0}$, $T_\infty$, $\Delta_{db}$) was normally distributed around its mean value in Table 2. $R$ and $C$ obeyed a Gaussian distribution with a mean of 2 and a variance of 0.1, and $P$ obeyed a Gaussian distribution with a mean of 5.6 and a variance of 0.1. A one-hour AGC signal was taken from the Pennsylvania-New Jersey-Maryland (PJM) Interconnection electricity market, with a sampling time of 4 s, as shown in Figure 6.
As can be seen from Table 3, the total PPD of Scheme A is relatively small, but the values of Group 2 and Group 3 are quite different. The total PPD of Scheme B is relatively large, but the difference between groups is small, indicating that the dissatisfaction is similar across groups. Scheme C is considered from the grid side: to keep the tracking error minimal, the total PPD is larger.
In Scheme A, the total PPD is the objective function, and the constraints are the tracking error and the temperature. The aggregation properties of the air conditioning load provide service for grid frequency regulation. The simulation data show that the total power of the TCLs can efficiently follow the sampling signal. The tracking errors are in a small range, and the RMSE is 2.44%. Using distribution-fitting software, we obtain the probability density function of the PPD; the best-fitting distribution is an error distribution with parameters k = 1.7182, α = 9.9132, β = 8.3863, and γ = 0. The probability density function is shown in Figure 7a.
In Scheme B, the largest PPD of a single group is minimized. Temperature and tracking error are again used as constraints to evaluate the aggregate properties of the air conditioning load in providing grid frequency regulation service. The tracking errors are clearly larger than those of Scheme A, and the RMSE is 2.65%. Using distribution-fitting software, we obtain the probability density function of the PPD; the best-fitting distribution is an error distribution with parameters k = 1.6664, σ = 1.2151, and µ = 7.9175. The probability density function is shown in Figure 7b.
In Scheme C, the tracking error is minimized. The tracking errors are smaller than those in Scheme A and Scheme B, and the RMSE is 1.37%. Using distribution-fitting software, we obtain the probability density function of the PPD; the best-fitting distribution is a Kumaraswamy distribution with parameters α1 = 3.2268, α2 = 4.4536, a = 4.4694, and b = 14.879. The probability density function is shown in Figure 7c.
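To make the fitted Kumaraswamy density for Scheme C concrete, the sketch below evaluates it with the reported parameters. It assumes the usual four-parameter form on the interval [a, b], with z = (x − a)/(b − a) and density α1 α2 z^(α1−1) (1 − z^α1)^(α2−1) / (b − a); the fitting software may use a different but equivalent parameterization, and the function names are illustrative.

```python
def kumaraswamy_pdf(x, alpha1, alpha2, a, b):
    """Four-parameter Kumaraswamy density on [a, b] (assumed parameterization)."""
    if x <= a or x >= b:
        return 0.0
    z = (x - a) / (b - a)
    return (alpha1 * alpha2 * z ** (alpha1 - 1.0)
            * (1.0 - z ** alpha1) ** (alpha2 - 1.0) / (b - a))


def kumaraswamy_cdf(x, alpha1, alpha2, a, b):
    """Cumulative distribution function matching kumaraswamy_pdf."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    z = (x - a) / (b - a)
    return 1.0 - (1.0 - z ** alpha1) ** alpha2


if __name__ == "__main__":
    # Parameters reported for the PPD distribution under Scheme C
    params = (3.2268, 4.4536, 4.4694, 14.879)
    for ppd in (6.0, 8.0, 10.0, 12.0):
        print(ppd, kumaraswamy_pdf(ppd, *params), kumaraswamy_cdf(ppd, *params))
```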
In addition, the setpoint temperature change in the three schemes is shown in Figure 8, from which it is observed that the setpoint changes are within the normal human adaptation range for temperature. Similarly, as shown in Figure 9, the setpoint temperature also follows a continuous distribution. We used the EasyFit software to fit the temperature setpoints of the three schemes; the best-fitting distributions are the Dagum distribution and the Weibull distribution.
For the AGC signal with 900 steps, the numbers of switchings of the 3000 TCLs under the three schemes are 168, 179, and 158, respectively. On average, a TCL is turned on or off every 5-6 steps. If the time interval between two steps is extended to 60 s, a TCL will be switched about 10 times in one hour. In practice, the number of TCLs is larger than 3000. To reduce how often each TCL is switched, the TCLs can be divided into different groups. For example, 60,000 TCLs can be divided into 20 groups of 3000 TCLs each, with only one group used in every two-hour schedule. In this way, the switching can be reduced to once in two hours.
It can be seen from Figure 10 that the temperature variables are within the temperature limit. Similarly, Figure 11 shows the tracking performance of the three schemes, from which it can be seen that the three schemes provide promising load tracking services. Some large errors remain in the graph, however, because strict consumer comfort requirements force a sacrifice in tracking accuracy. Overall, the load tracking service is still acceptable, which shows that the three schemes are sound and reliable. In addition, a comparison with previous results in Table 4 shows that the RMSE is reduced by applying the proposed control strategies.
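The switching figures quoted in this section can be checked with a few lines of arithmetic. The sketch below only re-derives the "every 5-6 steps" statement and the rough hourly switching count from the reported numbers of 900 steps and 168, 179, and 158 switchings. The helper names are illustrative, and treating the number of steps per switching as unchanged when the step is lengthened to 60 s is the same simplification the text makes.

```python
def steps_per_switch(n_steps, n_switches):
    """Average number of signal steps between two switchings of a TCL."""
    return n_steps / n_switches


def switches_per_hour(n_steps, n_switches, step_seconds):
    """Switchings per hour if the steps-per-switch ratio is kept at a new step length."""
    steps_in_one_hour = 3600.0 / step_seconds
    return steps_in_one_hour / steps_per_switch(n_steps, n_switches)


if __name__ == "__main__":
    for scheme, n_switches in (("A", 168), ("B", 179), ("C", 158)):
        print(scheme,
              round(steps_per_switch(900, n_switches), 2),
              round(switches_per_hour(900, n_switches, 60.0), 1))
```

With 60 s steps there are 60 steps in an hour, so one switching every 5 to 6 steps corresponds to roughly 10 to 12 switchings per hour, in line with the order of magnitude stated in the text.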
This work fully considers the inherent constraint relationship among frequency adjustment deviation, service cost, and consumer comfort, and achieves an optimal trade-off between power quality cost and consumer comfort when TCLs are used to provide frequency adjustment. In contrast to existing TCL control methods, this work also considers the comfort of consumers during frequency adjustment and ensures fairness among the consumers participating in the frequency adjustment service. The actual temperature varies between 25 °C and 28 °C, and the corresponding PPD values for each group under the three schemes, as shown in Table 3, show that the scheme has little impact on consumers, indicating that consumers are willing to participate in the demand response program. The RMSE is controlled to be less than 2.5%. Compared with the existing methods in the literature, the method proposed in this paper has higher accuracy, which improves the reliability of the power system.

Conclusions
This paper uses a PPD model to express consumer dissatisfaction. Three schemes are proposed that consider consumer dissatisfaction and the stability of the grid frequency. The first scheme minimizes the total PPD of the population. Because the PPD values differ widely among groups in the first scheme, the second scheme minimizes the largest group PPD in order to achieve PPD fairness across groups. By considering the power constraints, a third scheme that minimizes the tracking error is also proposed. A temperature setpoint optimization algorithm based on APSO is developed to solve the three schemes and achieve an optimal trade-off between consumer comfort and cost. It is shown that the proposed schemes can reduce the deviation and cost of TCLs participating in the grid frequency regulation service, ensure consumer comfort, and jointly optimize frequency regulation deviation, auxiliary service cost, and consumer comfort. This establishes a quantitative indicator system for the grid and consumers to assess the cost of participating in ancillary services. The rationality of the model was verified by simulation analysis. The model in this paper can effectively reflect actual factors such as temperature setting and population dissatisfaction. In the future, it would be interesting to study the influence of renewable generation on the results of this work.

Conflicts of Interest:
The authors declare no conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.