An Integer Non-Cooperative Game Approach for the Transactive Control of Thermal Appliances in Energy Communities

Non-cooperative scheduling games can be used to coordinate residential loads in order to achieve a common goal while accounting for individual consumers' interests, privacy, and autonomy. However, a significant portion of the residential flexibility, namely Thermostatically Controlled Loads (TCLs) such as water and space heating/cooling appliances, has not been fully addressed under this game-theoretic approach: their comfort constraints and integer control were not considered. This paper presents a method for properly including TCLs in this framework and discusses its application in energy communities. Specifically, we propose a general mathematical formulation for considering users' comfort in non-cooperative games. We model the integer nature of the TCLs' control with binary variables and show that optimal or close-to-optimal solutions (less than 1% from the optimum) are reached. Moreover, different total cost functions can be used depending on the market context and the objective of the demand management program. To illustrate and discuss these aspects in practical applications, we use a case study of an energy community in Spain. The results show that the transactive control (TC) solutions are optimal or only 0.80% worse than optimal; different total cost functions lead to different outcomes (load curve smoothing or peak load reduction); consumers' comfort is respected; and the proposed game model leads consumers to cooperate in order to minimize the community's costs.


Introduction
In the context of the transition to sustainability in power systems, transactive control (TC) has emerged as a means of coordinating the multiple agents in power systems (consumers, producers, distribution system operators (DSOs), transmission system operators (TSOs), aggregators, etc.) in order to optimize their operation to reduce the system's costs, provide ancillary services, and enhance energy quality, among other goals [1]. TC methods consider agents' particularities, priorities, interests, and autonomy [2]. The core objective is to reach an optimal allocation of resources (e.g., generation, controllable devices, and loads) in a decentralized form by only allowing actors to exchange minimal information (e.g., total consumption) among themselves [3]. These decentralized and transparent characteristics make TC an attractive solution for load control in the residential sector, in which privacy is a main concern and a large number of consumers exist [4]. Therefore, the main motivation of the approach proposed in this paper is to deploy TC-based solutions to unlock the flexibility offered by residential loads in order to cope with new challenges in power systems.
An effective method of implementing TC is by non-cooperative game theory because it allows modeling agents' preferences, priorities, conflicting interests, and complex interactions in a decentralized manner [5]. When applied to the load control in the residential sector, game theoretic methods capture load scheduling interactions between mid-size to small-size consumers that negotiate their load flexibility until an equilibrium is reached and all consumers are satisfied with the result [6]. Mohsenian-Rad et al. [7], who conducted one of the first studies in this field, proposed a non-cooperative game approach to schedule residential appliances by considering consumers' preferences. In this pioneering study, the authors adopted simple load models with continuous decision variables, constant daily energy to be scheduled, and no operation constraints. Additionally, to divide the market results among participants, this study considered a billing mechanism that allocates the total quadratic cost according to the energy share of each individual consumer. Later on, other studies extended this discussion on billing mechanisms applied to transactive load control by analyzing the fairness of different methodologies [8], studying potential cheating behaviors [9], proposing alternatives to avoid untruthfulness of consumers [10], and analyzing other costs models [11].
At the same time, other works have focused on implementation aspects, for example, by developing a game-theoretic software framework to simulate demand response of electric vehicles and evaluate real-world use cases [12], by proposing new coordination algorithms that require less communication among participants and consider communication losses [13], by analyzing the stability of the Nash Equilibria when non-quadratic payoffs are designed [14], by introducing a market mechanism able to deal with a large population of devices [15], and by considering the role of utilities/aggregators [16][17][18]. These contributions kept using relatively simple models to describe controllable loads, particularly using continuous variables to describe the scheduling problem.
Recently, new forms of load have been introduced as resources of TC in the context of non-cooperative games. For example, the authors of [19] added electric vehicles (EV) to the continuous non-cooperative game model while addressing the uncertainty related to the number of EVs in a system. On a separate path, the authors of [20] replicated the study of reference [7] in a more realistic manner, including a simplified form of heating, ventilation, and air conditioning (HVAC) systems modeled as continuous loads but without explicitly representing their operation constraints. A better representation of consumers' discomfort associated with HVAC loads was added as part of the cost function in reference [21] together with additional energy supply constraints.
None of the studies referred to above explicitly modeled thermostatically controlled loads (TCLs), particularly with respect to their power/temperature comfort characteristics. The lack of attention to TCLs in the context of non-cooperative games is a significant gap because these loads are a main source of flexibility in the residential sector [22]. They are widely explored in other forms of control (for example, centralized optimization, time-varying incentives, and dynamic pricing schemes [23][24][25][26][27][28][29]) and are present in other TC models (for instance, auction theory and game-theoretic models that apply other solving methods such as the Nikaido-Isoda function or mean-field approaches, multi-agent simulation, and fast control strategies) [30][31][32][33][34][35]. Controlling TCLs raises new challenges for the game models because it requires representing consumers' comfort preferences and the loads' operation constraints. The characteristics of thermal control have not been explored in the literature on TC with non-cooperative games discussed in the previous paragraphs, which assumes simple operation models for the controllable loads. Thus, understanding the practical implications of these TCL aspects is key to extending transactive energy control to these loads.
Moreover, the aforementioned literature on non-cooperative games relies on the assumption that control is continuous. As demonstrated by the authors of [36], modeling the control with continuous variables allows some important guarantees in terms of uniqueness/optimality of the Nash Equilibria (NE) when applying game theory to this type of problem. However, thermostatically controlled loads, like many other residential appliances, often have an on/off control activated by a thermostat, which, realistically, implies an integer (binary) representation. This means that, in practical terms, the equilibrium guarantees that hold in a continuous space might not fully apply to TC in the residential sector, particularly in situations where the number of consumers is small and the controllable resources are highly discrete. Even though a few studies have considered this discrete nature of residential control, e.g., in the context of potential games [37] and generalized ordinal potential games [38], introducing more complex types of loads such as TCLs remains unexplored.
In summary, transactive control offers important decentralized characteristics that are suitable for residential demand-side management. Current technological solutions developed around non-cooperative game theory have shown promising results, but they were unable to properly include TCLs, which are the largest source of flexibility among domestic loads. Moreover, two aspects of TCL control are not fully addressed by the current theory of non-cooperative games: (1) the operation model of thermal loads, which includes temperature constraints, and (2) the on/off nature of the decisions, which makes the problem integer and changes the equilibrium conditions. Therefore, it is important to understand the implications of these theoretical gaps for the real-world implementation of TC in the residential sector.
To address these issues, this paper advances the state of the art of day-ahead load scheduling of residential consumers by providing a game-theoretic framework in order to properly include domestic on/off TCLs in transactive control based on non-cooperative games. The specific contributions are the following:

1. We formulate a game with binary variables representing the on/off control of TCLs and explicitly model the consumers' comfort constraints;
2. We discuss the theoretical foundations of this integer TCLs game, and we show that local or global optimal equilibria can be reached;
3. We generalize the paper's results to two types of cost functions often applied to energy communities: quadratic and peak pricing.
All the aforementioned contributions are analyzed theoretically and validated through simulations with real data. We create a realistic case study using information gathered from an LV network in the south of Spain with 201 consumers and evaluate the impacts of the TCLs load model and its integer characteristic in this context. The rest of this paper is organized as follows. In Section 2, we present the system model, including the new framework to incorporate integer thermal loads into the set of controllable appliances and the community cost functions analyzed. We then present the proposed game in Section 3, including the discussion about the integer nature of the loads and the applicability to multiple total cost functions. Finally, we present simulation results in Section 4, and we conclude the work in Section 5.

System Description
We consider a smart residential community of consumers connected to the same substation with an aggregation platform [39]. Smart residential communities [40] are an organization model in which consumers can coordinate their energy utilization and manage their distributed resources in a manner to reduce their electricity bills and costs [41], increase their revenues [42], and use their assets more efficiently [43]. This community architecture is well aligned with the transactive control concept [42], enabling consumers to be more proactive and autonomous [44]. The non-cooperative game proposed in this paper can adequately model the interactions among consumers in an energy community. Our smart community model is shown in Figure 1. We assume that consumers have flexibility from thermostatically controlled loads and are willing to manage/schedule them in a TC approach in order to reduce their bills. Moreover, participation on the demand-side management program is optional. For those interested in joining, their goal is to minimize their payments individually, but the proposed non-cooperative model is community oriented. For that reason, the aim of the proposed load scheduling game is to coordinate the consumers in order to minimize the community's total cost while respecting their preferences and their decision autonomy. Consumers' appliances are controlled locally by their energy consumption controller (ECC), which is part of a home energy management system (HEMS). The HEMSs from different consumers communicate with each other by exchanging only aggregated load information with the neighbors' HEMSs and cost parameters/billing mechanism with the aggregation platform. Notice that individual consumption information is not directly shared among the participants, preserving consumers' privacy. 
Consumers' management systems can be deployed inside their smart meters (or as an additional equipment that communicates with the energy meter), which are connected to the power line and to a local area network (LAN), for information exchange [45]. Moreover, the aggregation platform buys energy for the community in the energy markets (e.g., future contracts and day-ahead auctions) and passes along the values to consumers in the form of a total cost function. This function is defined by the aggregation platform in order to drive consumers' demand to meet the established contracts/bids in the markets. Since this platform is no longer responsible for directly coordinating and controlling the user's devices, an indirect control [46] is performed, and consumers take decisions locally. It is important to notice that the market participation of the energy community (aggregation platform) is out of the scope of this work: the contributions here are related to the scheduling process of TCLs within the community given a total cost function, especially to the solution properties when non-cooperative games are applied.
To model TCLs in this paper, we use physically based load models (PBLM). We consider the power temperature characteristics presented in reference [26] and assume an on/off control of these devices.

Load Modeling
In this section, we present the general framework that includes domestic on/off thermostatically controlled loads (TCLs) in transactive control based on non-cooperative games. Without loss of generality, we assume that each consumer in the community has one TCL to be scheduled the day ahead with the objective of reducing electricity costs. This assumption can be easily relaxed by including an extra index on all variables and parameters. Moreover, we also assume that the remaining load is uncontrollable and deterministic: consumption is forecasted and summed in one inflexible load curve for each consumer ($w_n$). The electrical consumption of the TCLs to be scheduled is subject to consumers' preferences, which are related to minimum and maximum acceptable temperatures (e.g., indoor temperature for air conditioners (ACs) and space heaters, and water temperature for water heaters), and they are described as follows:

$$\theta_{n,t} \geq \theta^{\min}_{n,t} \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T}, \tag{1}$$

$$\theta_{n,t} \leq \theta^{\max}_{n,t} \quad \forall n \in \mathcal{N},\ \forall t \in \mathcal{T}, \tag{2}$$

where $\theta_{n,t}$ is the indoor temperature, $\theta^{\min}_{n,t}$ and $\theta^{\max}_{n,t}$ are consumers' preferences for the temperature, $\mathcal{N}$ is the set of consumers in the energy community, and $\mathcal{T}$ is the set of time slots in the planning horizon. The authors of [26] presented a TCL operation model in which the temperature is described by a linear function of the appliance's power demand and performance as well as other variables, such as ambient temperature, time of use, and consumer habits. Therefore, the TCL temperature in a time slot $t$ is described as follows:

$$\theta_{n,t} = f(x_{n,t}, \theta_{n,t-1}, \Delta), \tag{3}$$

where $x_{n,t}$ is the on/off decision variable for the TCL, $\theta_{n,t-1}$ is the temperature in time slot $t-1$, and $\Delta$ includes other parameters (e.g., TCL performance and power consumption and inside/outside temperature, among others). For instance, we detail the operation model of air conditioners (ACs) in Equation (4).
For all $n \in \mathcal{N}$ and for all $t \in \mathcal{T}$, the internal temperature $\theta_{n,t}$ evolves according to the following:

$$\theta_{n,t} = f(x_{n,t}, \theta_{n,t-1}, TH_n, R_n, \eta_n, E_n, \theta^{et}_t, \delta) = \theta_{n,t-1} - \frac{\delta}{TH_n R_n}\left(\theta_{n,t-1} - \theta^{et}_t + \eta_n R_n E_n x_{n,t}\right), \tag{4}$$

where $\delta$ is the time slot size (in hours), $TH_n$ is the AC thermal capacity, $R_n$ its thermal resistance, $E_n$ its power rate, $\eta_n$ its performance, and $\theta^{et}_t$ is the external temperature. The variables $x_{n,t}$ are binary, representing whether the TCL of consumer $n$ is on or off during time slot $t$.
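As a minimal illustration of the AC dynamics in Equation (4), the following Python sketch simulates the temperature trajectory produced by a given binary on/off schedule. The function names and all numerical parameter values are hypothetical and chosen only for illustration; they are not taken from the case study.

```python
def ac_step(theta_prev, x, theta_ext, delta, TH, R, eta, E):
    """One step of Equation (4): temperature reached in slot t."""
    return theta_prev - delta / (TH * R) * (theta_prev - theta_ext + eta * R * E * x)


def simulate(theta0, schedule, theta_ext, delta, TH, R, eta, E):
    """Temperature trajectory for a binary on/off schedule x_{n,1..T}."""
    temps, theta = [], theta0
    for x, ext in zip(schedule, theta_ext):
        theta = ac_step(theta, x, ext, delta, TH, R, eta, E)
        temps.append(theta)
    return temps


# Hypothetical parameters (assumptions, not from the paper): 30-minute slots,
# TH = 2 kWh/degC, R = 2 degC/kW, eta = 2.5, E = 1.5 kW.
params = dict(delta=0.5, TH=2.0, R=2.0, eta=2.5, E=1.5)
trajectory = simulate(24.0, [0, 1, 1], [30.0, 30.0, 30.0], **params)
print(trajectory)  # the room warms while the AC is off, then cools
```

With the AC off, the indoor temperature drifts toward the external temperature; with the AC on, the term $\eta_n R_n E_n$ pulls it down.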
Consumer $n$'s total load in a specific time slot $t$ is calculated as the sum of his/her base load $w_{n,t}$ and controllable load $E_n x_{n,t}$. The latter is the power consumption of the TCL ($E_n$) times the on/off decision variable ($x_{n,t}$). Both base and controllable loads are defined in terms of power and need to be multiplied by the size of the time slot $\delta$ (in hours) in order to obtain energy consumption. The following equation for each consumer $n \in \mathcal{N}$ defines consumers' day-ahead energy scheduling vectors ($l_n$):

$$l_{n,t} = \delta\,(w_{n,t} + E_n x_{n,t}). \tag{5}$$
In summary, we can define a feasible energy consumption scheduling set $S_n$ for each user $n$, which includes all possible scheduling vectors respecting their preferences:

$$S_n = \left\{ l_n = [l_{n,1}, l_{n,2}, \dots, l_{n,T}] \in \mathbb{R}^T : x_{n,t} \in \{0, 1\} \ \forall t \in \mathcal{T} \right\}. \tag{6}$$
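To make the feasible set concrete, the sketch below enumerates, by brute force, the binary schedules of one AC that keep the temperature of Equation (4) within the comfort bounds at every slot. This is only practical for very short horizons, and the function name, the parameter values, and the choice to return on/off schedules (rather than the load vectors $l_n$) are assumptions made for illustration.

```python
from itertools import product


def feasible_schedules(theta0, theta_ext, theta_min, theta_max,
                       delta, TH, R, eta, E):
    """Return all on/off schedules whose temperatures respect the bounds."""
    feasible = []
    for x in product((0, 1), repeat=len(theta_ext)):
        theta, ok = theta0, True
        for x_t, ext, lo, hi in zip(x, theta_ext, theta_min, theta_max):
            # Temperature update of Equation (4)
            theta = theta - delta / (TH * R) * (theta - ext + eta * R * E * x_t)
            if not lo <= theta <= hi:
                ok = False
                break
        if ok:
            feasible.append(x)
    return feasible


# Hypothetical two-slot example: a tight upper bound leaves a single schedule.
tight = feasible_schedules(24.0, [30.0, 30.0], [22.0, 22.0], [24.5, 24.5],
                           delta=0.5, TH=2.0, R=2.0, eta=2.5, E=1.5)
loose = feasible_schedules(24.0, [30.0, 30.0], [22.0, 22.0], [25.0, 25.0],
                           delta=0.5, TH=2.0, R=2.0, eta=2.5, E=1.5)
print(tight, loose)
```

Relaxing the upper comfort bound enlarges the set, which is the flexibility that the scheduling game exploits.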
We can define the energy consumption vector of the group of participants as

$$L_t = \sum_{n \in \mathcal{N}} l_{n,t} \quad \forall t \in \mathcal{T}.$$
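The load definitions above can be sketched in a few lines of Python. The helper names and the numbers are hypothetical; the code simply evaluates the per-consumer load (base load plus TCL power times the on/off decision, scaled by the slot size) and sums it over consumers.

```python
def consumer_load(w, x, E, delta):
    """Energy per slot for one consumer: delta * (w_t + E * x_t)."""
    return [delta * (w_t + E * x_t) for w_t, x_t in zip(w, x)]


def community_load(loads):
    """Aggregate load L_t: sum of all consumers' loads in each slot."""
    return [sum(slot) for slot in zip(*loads)]


# Hypothetical values: 30-minute slots, a 2 kW appliance for consumer 1.
l1 = consumer_load([1.0, 0.5], [1, 0], E=2.0, delta=0.5)
l2 = [0.5, 0.5]
print(community_load([l1, l2]))
```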

Community Costs
In this paper, the consumer's load (controllable and uncontrollable) is charged according to a total cost function. This function is defined by the aggregation platform and is applied to the community as a whole in order to enforce load control. The game model then drives the scheduling process and defines how this cost is divided among consumers. Two total cost functions are studied in order to demonstrate the possibility of implementing non-cooperative games in different market contexts and to analyze how those functions impact results.

Quadratic Cost Function
First, we consider the quadratic cost function commonly applied in most non-cooperative models for residential load scheduling in the literature [7][8][9][10][11][13][20][37]. In addition to being extensively used in the literature on transactive control, it is strictly convex and has a unique global minimum. The total community cost in each scheduling time slot $t \in \mathcal{T}$ is defined by the following:

$$C_t(L_t) = a_t L_t^2 + b_t L_t + c_t. \tag{7}$$

Thus, the total community cost of the day-ahead operations planning is the sum of the quadratic total cost defined in Equation (7) over all time slots of the planning horizon:

$$C(l_n, l_{-n}) = \sum_{t \in \mathcal{T}} \left( a_t L_t^2 + b_t L_t + c_t \right), \tag{8}$$

where $a_t > 0$, $b_t \geq 0$, and $c_t \geq 0$ are constants that can be time varying, e.g., have different values for different time slots of the day to better represent the power system conditions. In practice, Equation (8) can represent real energy costs associated with thermal generation or power losses as well as specific tariffs contracted with aggregators or retailers. For example, the aggregation platform can contract a two-step tariff with the retailer or the distribution company for the community, which can be approximated by a quadratic function for the scheduling process. The solution to the problem described above that minimizes the total system cost for a group of consumers $\mathcal{N}$ can be calculated as the following mixed-integer quadratic optimization model (MIQP), with the decision variables constrained to the scheduling set defined in Equation (6):

$$\min_{l_n \in S_n \ \forall n \in \mathcal{N}} \ \sum_{t \in \mathcal{T}} \left( a_t L_t^2 + b_t L_t + c_t \right) \quad \text{s.t.} \quad L_t = \sum_{n \in \mathcal{N}} l_{n,t} \quad \forall t \in \mathcal{T}. \tag{9}$$
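For a toy instance, the centralized quadratic problem can be solved by exhaustive enumeration, as sketched below. This brute-force search merely stands in for an MIQP solver and does not scale beyond a handful of consumers and slots. The function names, the explicit lists of feasible load vectors, and the per-slot form $a_t L_t^2 + b_t L_t + c_t$ are assumptions of this sketch.

```python
from itertools import product


def quadratic_cost(L, a, b, c):
    """Total cost: sum over slots of a_t * L_t**2 + b_t * L_t + c_t."""
    return sum(a_t * L_t ** 2 + b_t * L_t + c_t
               for L_t, a_t, b_t, c_t in zip(L, a, b, c))


def centralized_optimum(feasible_sets, cost):
    """Enumerate every joint profile and keep the cheapest one."""
    best_value, best_profile = None, None
    for profile in product(*feasible_sets):
        L = [sum(slot) for slot in zip(*profile)]
        value = cost(L)
        if best_value is None or value < best_value:
            best_value, best_profile = value, profile
    return best_value, best_profile


# Hypothetical example: two consumers, two slots, one unit of energy each.
sets = [[[1, 0], [0, 1]], [[1, 0], [0, 1]]]
cost = lambda L: quadratic_cost(L, a=[1, 1], b=[0, 0], c=[0, 0])
value, profile = centralized_optimum(sets, cost)
print(value, profile)
```

In this example, spreading the two loads over different slots costs 2, whereas stacking them in the same slot costs 4, which illustrates the load-smoothing effect of the quadratic function.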

Peak Pricing Function
In order to study a scenario with high potential for demand management, we also consider a peak pricing model in which the overall energy costs are composed of an energy price and a demand charge. Although the aforementioned quadratic cost function is strictly convex and easy to optimize, a volumetric rate (applied to the energy consumption) together with a peak demand charge (applied to the peak load of the billing period) is a more realistic scenario for a residential energy community [47,48]. For instance, the aggregation platform can contract a peak reduction with the retail market for the community and use this total cost function to drive consumers to jointly reduce the community's power demand. Assuming that the volumetric energy component is time varying, e.g., a time-of-use (TOU) tariff, the total cost of the community can be written as follows:

$$C(l_n, l_{-n}) = \sum_{t \in \mathcal{T}} d_t L_t + e \max_{t \in \mathcal{T}} L_t, \tag{10}$$

where $d_t > 0 \ \forall t \in \mathcal{T}$ are the TOU tariffs for the energy consumption, and $e > 0$ is the peak demand charge for the community. It is important to note that this function is convex, as it results from the sum of two convex functions [49]. The solution that minimizes the total system cost for the community composed of a group of consumers $\mathcal{N}$ can be calculated as the following mixed-integer linear program (MILP), in which $\alpha_t$ are auxiliary variables for computing the peak load.
$$\dots \ \sum_{n \in \mathcal{N}} l_{n,t} \quad \forall t \in \mathcal{T}, \qquad l_n \in S_n \quad \forall n \in \mathcal{N}.$$
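The effect of a peak demand charge can be illustrated with a short Python sketch. It assumes that the peak term is charged on the largest per-slot community load, i.e., a cost of the form $\sum_t d_t L_t + e \max_t L_t$; this is one reading of the description above, and the function name and numbers are hypothetical.

```python
def peak_pricing_cost(L, d, e):
    """Volumetric TOU charge plus a demand charge on the peak slot load.

    Assumption of this sketch: the peak is measured as max_t L_t.
    """
    energy_charge = sum(d_t * L_t for d_t, L_t in zip(d, L))
    demand_charge = e * max(L)
    return energy_charge + demand_charge


# Same total energy, different shapes (hypothetical flat tariff d_t = 1, e = 3).
stacked = peak_pricing_cost([2, 0], d=[1, 1], e=3)
flat = peak_pricing_cost([1, 1], d=[1, 1], e=3)
print(stacked, flat)
```

Both profiles carry the same energy, so the volumetric charge is identical; only the demand charge differs, which is why this function drives peak reduction. In a MILP, the `max` is usually linearized with an auxiliary variable bounded below by every $L_t$, a standard technique that may differ in detail from the formulation used in the paper.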

Thermal Loads Scheduling as a Game
In this section, we present a non-cooperative game designed to coordinate the thermostatically controlled loads (TCLs) when the cost functions described above are applied to the community. The preferences and market parameters follow the definitions in the previous sections. We discuss the assumptions of this model when applied to TCLs, the existence of equilibrium points when integer loads are present, the model advantages, its general applicability, and the market and price formation. We also present the algorithm used to solve the game together with an analysis of its computational complexity and communication overhead.
The proposed non-cooperative TCL scheduling game is defined by a tuple $\Gamma = \langle \mathcal{N}, (S_n)_{n \in \mathcal{N}}, (u_n)_{n \in \mathcal{N}} \rangle$ in which the following is the case: $\mathcal{N} = \{1, 2, \dots, N\}$ is the set of consumers living in the community and participating in the scheduling game (i.e., players); $S_n = \{l_n\}_{n \in \mathcal{N}}$ denotes the action space for consumer $n \in \mathcal{N}$, which is composed of the feasible scheduling vectors $l_n$ that respect users' comfort constraints defined in Equation (6); $u_n : S \mapsto \mathbb{R}$ is the utility function that user $n \in \mathcal{N}$ receives. It is defined as the negative of the consumer's bill, which is a share of the community's total cost (either quadratic or peak pricing). The utility function is shown in Equation (12), in which $C(l_n, l_{-n})$ can be given by Equation (8) or (10). The share $f_n$ depends on the billing mechanism, and we use a version that bills each consumer according to his/her energy consumption during the scheduling horizon (see Equation (13)). This is a popular billing model in the literature on non-cooperative scheduling games [7][8][9][10][11][19][36][37], and we analyze its advantages and limitations when applied to the game with TCLs.
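$$u_n(l_n, l_{-n}) = -f_n \, C(l_n, l_{-n}) \tag{12}$$

$$f_n = \frac{\sum_{t \in \mathcal{T}} l_{n,t}}{\sum_{t \in \mathcal{T}} L_t} \tag{13}$$

In this billing setting, if two participants $n$ and $m$ have total loads satisfying $\sum_{t \in \mathcal{T}} l_{n,t} = \beta \sum_{t \in \mathcal{T}} l_{m,t}$ after the scheduling game is solved, then consumer $n$ will pay $\beta$ times consumer $m$'s bill. Moreover, the sum of all consumers' payments will be equal to the community's total cost: note that $\sum_{n \in \mathcal{N}} \sum_{t \in \mathcal{T}} l_{n,t} = \sum_{t \in \mathcal{T}} L_t$.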
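The proportional billing rule of Equations (12) and (13) can be checked numerically with the short sketch below. The function names and the load values are hypothetical; the code only computes each consumer's energy share and multiplies it by a given total cost.

```python
def billing_shares(loads):
    """Share f_n: consumer's total energy over the community's total energy."""
    totals = [sum(l_n) for l_n in loads]
    community_total = sum(totals)
    return [total / community_total for total in totals]


def bills(loads, total_cost):
    """Each consumer pays f_n times the community's total cost."""
    return [f_n * total_cost for f_n in billing_shares(loads)]


# Hypothetical example: consumer 2 uses twice the energy of consumer 1.
payments = bills([[1, 1], [2, 2]], total_cost=9.0)
print(payments)
```

Consumer 2 pays twice as much as consumer 1, and the payments add up to the total cost, in line with the two properties stated above.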

Model Assumptions
In this paper, we assume that TCLs have a fixed total energy to be scheduled in the billing period, in line with the load models reported in the literature [7][8][9][10][11][13][19][20][21][36][37][38]. This assumption makes the factor $f_n$ constant for each $n \in \mathcal{N}$, and consumers seek to minimize the community's total cost in order to increase their own utility. Therefore, consumers must coordinate their controllable load schedules to optimize the community's consumption; otherwise, they will pay more.
Moreover, we consider that the aggregation platform contracts the community's load in the market and chooses one of the total cost functions defined in Section 2.3 to drive the game. This buying process is out of the scope of this paper, and we assume that the platform's benefits/costs are included in the parameters of the chosen function. We also consider that the risks of imbalances between the result of the day-ahead scheduling game and the real consumption of the community are borne by the aggregation platform (e.g., it estimates the probability of consumers deviating from the equilibrium in real time and adds the imbalance costs to the parameters of the total cost functions).
We also assume that the consumers' uncontrollable load is forecasted by their energy management systems using past consumption data and a forecasting method. Thus, this load is considered deterministic, and its prediction method and errors (including the difference between the forecasted and real consumption) are out of the scope of this paper.

Equilibrium Points of the Integer Game
In games, the players' goal is to choose and play optimal strategies, i.e., those resulting in larger utilities. However, the best strategy for a player depends on the choices of the other players. Therefore, one solution of a game is a Nash Equilibrium (NE): a strategy profile from which no player has an incentive to deviate. In our setting, a strategy profile $(l^*_n, l^*_{-n})$ is an NE if, for all consumers $n \in \mathcal{N}$, the following holds:

$$u_n(l^*_n, l^*_{-n}) \geq u_n(l_n, l^*_{-n}) \quad \forall l_n \in S_n. \tag{14}$$

The billing mechanism studied in this paper guarantees the existence of the Nash Equilibrium for the game with integer thermal loads. With the fixed factor $f_n$ multiplying the community's total cost, consumers seek to minimize this function. If the total cost function has a minimum, it will be a Nash Equilibrium of the game [50]: no consumer would be better off by unilaterally moving from this solution if the others are playing it, as this would increase the total cost and, as a consequence, consumers' bills. This can be seen by substituting the utility of Equation (12) into the NE condition of Equation (14). Due to the integer nature of the variables, the NE can be a local or global minimum [50]. Both total cost functions studied in this paper are convex, meaning they have minimum points that can be local or global optima of the community's scheduling problem and, as a consequence, equilibrium points. In the results section, we show that solutions of the game are optimal or close to optimal.
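The equilibrium condition of Equation (14) can be verified by brute force on a small instance: for each consumer, every alternative feasible schedule is tried while the others' schedules are held fixed. The sketch below assumes that feasible sets are given as explicit lists of load vectors and that the utility is the negative of the proportional bill; the function names and the example data are hypothetical.

```python
def utility(n, profile, cost):
    """u_n = -f_n * C, with f_n the energy share of consumer n."""
    L = [sum(slot) for slot in zip(*profile)]
    share = sum(profile[n]) / sum(L)
    return -share * cost(L)


def is_nash_equilibrium(profile, feasible_sets, cost, tol=1e-12):
    """True if no consumer gains by a unilateral deviation."""
    for n, options in enumerate(feasible_sets):
        current = utility(n, profile, cost)
        for alternative in options:
            deviated = list(profile)
            deviated[n] = alternative
            if utility(n, deviated, cost) > current + tol:
                return False
    return True


# Hypothetical example: two consumers, two slots, cost = sum of L_t squared.
sets = [[[1, 0], [0, 1]], [[1, 0], [0, 1]]]
cost = lambda L: sum(L_t ** 2 for L_t in L)
print(is_nash_equilibrium([[1, 0], [0, 1]], sets, cost))
print(is_nash_equilibrium([[1, 0], [1, 0]], sets, cost))
```

The spread-out profile passes the check, whereas the stacked profile fails because either consumer lowers the common cost, and therefore his/her bill, by moving to the empty slot.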

Advantages of the Game Model
The game model is designed to divide the total cost of the community according to a fixed share, which brings a coordination aspect to the game: All participants intend to minimize this total cost function, and minimum values for the entire community are equilibria. Here, it is important to note that, despite the name, non-cooperative games can be intentionally created to achieve cooperation goals [51], as shown by the design of our utility function.
Moreover, any total cost function with a minimum point (local or global) can be used in this setting, e.g., convex functions such as the quadratic and peak pricing functions studied here. The existence of a Nash Equilibrium is guaranteed for any community total cost function with a minimum value, allowing the implementation of this model in multiple market contexts.
Another interesting advantage is that cheating behavior is discouraged in this model. As consumers' bills equal a fixed share times the total cost function, any behavior that results in a non-equilibrium solution and does not minimize the cost will also harm the cheating participant.
Finally, in this non-cooperative game, the interactions between consumers to reduce their individual energy bills can be mimicked by the iterative and decentralized learning algorithm described in Section 3.5. The idea is to reach an equilibrium point, the Nash Equilibrium (NE) [52], which is a stable solution. This also means that the autonomy of consumers is respected during the process because they decide their schedules individually and locally, according to their constraints and preferences.

On Market and Price Formation
The smart energy communities studied in this paper are inside energy markets. In this section, we provide a discussion on how the interactions between consumers, energy community, aggregation platform, and the current energy markets might happen.
Energy communities are an important aspect of the clean energy transition. By moving citizens to the fore, they increase public acceptance of renewable energy projects, which makes private investments in clean energy more attractive. Moreover, they can provide direct benefits to citizens such as cheaper electricity bills and higher energy efficiency.
Finally, this organization model can help provide flexibility to electricity markets through, for instance, storage systems and demand response.
New regulations have been implemented to provide a legal framework for the constitution and market participation of energy communities. For instance, the European Parliament and Council passed Directive (EU) 2019/944 in 2019, authorizing citizens to organize themselves in energy communities and participate in the energy markets [53]. In this context, the consumers considered in our scope would be able to trade in the energy markets through their energy communities and aggregation platforms.
More specifically, the definitions of the total cost function and its parameters (e.g., $a_t$, $b_t$, and $c_t$ for the quadratic function, and $d_t$ and $e$ for the peak pricing function) are a result of the trading process between the aggregation platform and the energy markets when buying the community's load. The community participants are, therefore, price takers in the definition process of these parameters because they cannot influence wholesale/retail market prices, even when joining forces as a community.
On the other hand, consumers inside the community have the power to influence the final total cost: By choosing their strategies (appliances schedules), the value to be split between them is optimized, as well as their utilities. This inside-community trading process results in consumers who know the impact of their choices in the game results. Therefore, within the boundaries of their community and given a total cost function, consumers are price makers. One should note that the focus of this paper is on the negotiation process among consumers and on this price-making characteristic. This inside-community negotiation process is better described by the algorithm used to solve the game, which is presented in the next section-see Figure 2.

Best Response Algorithm
To find a solution to the proposed game, we use the Best Response Dynamics (BRD), which is a learning method applied to solve games; in other words, it finds equilibrium points. It is a sequential decision model in which consumers take turns and best respond to their opponents' last strategies (TCL schedules in our framework). We use a version in which consumers communicate the group's total load instead of their individual profiles, which results in fewer message exchanges and mitigates the privacy issue [10].
The algorithm works as follows. Assuming that each consumer's home energy management system keeps track of the opponents' total consumption vector ($L^k_{-n}$) and that the total load of all players is initialized with zeros, $L^0 = [0]$, each iteration of the gameplay consists of three steps:

1. Consumers receive the total load vector and calculate the load of opponents $L^k_{-n}$ using their previous best strategies;
2. They update their strategies $l^{k+1}_n$ as a response to $L^k_{-n}$ by solving the local MIQP (15) with $C(q_n, L^k_{-n})$ defined by (8), if the quadratic total cost function is applied, or the local MILP (15) with $C(q_n, L^k_{-n})$ defined by (10), if the peak pricing total cost function is applied:

$$l^{k+1}_n = BR_n(L^k_{-n}) = \arg\max_{q_n \in S_n} u_n(q_n, L^k_{-n}) = \arg\min_{q_n \in S_n} C(q_n, L^k_{-n}); \tag{15}$$

3. Consumers add their local strategy $l^{k+1}_n$ to the opponents' total consumption vector $L^k_{-n}$ and send this new aggregated consumption profile $L^{k+1}$ to the next player.

As BRD converges to a Nash Equilibrium for games in which all consumers have the same concave utility function (the negative of the total cost) [50], the process continues until an equilibrium is reached and consumers can no longer reduce their bills by changing their schedules. As shown in Figure 2, this algorithm is decentralized, and each consumer has the autonomy to schedule his/her TCLs locally. The aggregation platform is responsible for defining the cost function parameters. However, consumers' decisions are taken locally by their HEMSs through the BRD. Moreover, because of the design of the utility function, i.e., the billing proportional to each participant's share, the consumers aim at minimizing the global objective (the total cost). Finally, their shares and bills are calculated by the aggregation platform after receiving consumers' loads.
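The three steps of the Best Response Dynamics can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: feasible sets are explicit lists of load vectors, the local MIQP/MILP is replaced by exhaustive search over that list, a consumer keeps the current schedule unless a strictly cheaper one exists (a tie-breaking rule added here to avoid cycling), and the function and variable names are hypothetical.

```python
def best_response_dynamics(feasible_sets, cost, max_rounds=100):
    """Sequential best responses exchanging only the aggregate load vector."""
    horizon = len(feasible_sets[0][0])
    strategies = [None] * len(feasible_sets)  # no schedule chosen yet
    total = [0.0] * horizon                   # L^0 initialized with zeros
    for rounds in range(1, max_rounds + 1):
        changed = False
        for n, options in enumerate(feasible_sets):
            own = strategies[n] if strategies[n] is not None else [0.0] * horizon
            # Step 1: opponents' load = received total minus own previous load
            others = [L_t - l_t for L_t, l_t in zip(total, own)]

            def local_cost(q):
                return cost([o_t + q_t for o_t, q_t in zip(others, q)])

            # Step 2: best response, i.e., argmin of the community cost
            best = min(options, key=local_cost)
            if strategies[n] is None or local_cost(best) < local_cost(strategies[n]) - 1e-12:
                strategies[n] = list(best)
                changed = True
            # Step 3: pass the updated aggregate load to the next player
            total = [o_t + l_t for o_t, l_t in zip(others, strategies[n])]
        if not changed:  # a full round without changes: equilibrium reached
            return strategies, total, rounds
    return strategies, total, max_rounds


sets = [[[1, 0], [0, 1]], [[1, 0], [0, 1]]]
cost = lambda L: sum(L_t ** 2 for L_t in L)
strategies, total, rounds = best_response_dynamics(sets, cost)
print(strategies, total, rounds)
```

In this two-consumer example, the second consumer moves to the empty slot in the first round, and the second round confirms that no one wants to change, so the dynamics stop with a flat aggregate load.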
One should notice that community participants communicate the group's total load. This design guarantees users' privacy as they do not share individual preferences, strategies, or load profiles: their personal consumption is integrated in the total load vector and no participant knows the opponents' individual values. This aspect is better illustrated in Figure 3, in which \(L\) is the total load vector communicated between consumers, and \(l_n\) is the local consumption vector, which is known by participant \(n\) only. Other methods to enhance data privacy and secrecy can be added to the algorithm, e.g., multi-party computation protocol (MPC) [10] or Institute of Electrical and Electronics Engineers (IEEE) 2030.5 standard [54].

Computational Complexity and Communication Overhead
In this section, the computational complexity and communication overhead of the Best Response Algorithm are analyzed in comparison to the centralized model. A homogeneous scenario is considered: all consumers of the community have one air conditioner with operation described by Equation (4) and the same parameters (i.e., performance, thermal capacity, thermal resistance, and power rate), as well as the same preferences \(\theta^{\min}_{n,t}\) and \(\theta^{\max}_{n,t}\). Moreover, a constant outside temperature \(\theta^{et}_{t}\) during the day-ahead horizon is examined, which is also equal for all community participants. If \(\theta^{\min}_{n,t}\) (\(\theta^{\max}_{n,t}\)) is less than (greater than) or equal to the minimum (maximum) temperature the thermal load can reach, the solution space of one consumer's binary problem has size \(2^{24/\delta}\). In the worst case, one iteration of the Best Response Algorithm would have to explore this entire space to find a solution. Moreover, this must be performed for each consumer \(n \in N\) sequentially. Finally, the BRD will take \(r\) rounds to converge, resulting in a worst-case complexity of \(O(r \times N \times 2^{24/\delta})\). For the centralized model, all consumers' problems are solved simultaneously, and the computational complexity is thus \(O(2^{N \cdot 24/\delta})\). When realistic preferences on the minimum and maximum temperatures are considered, the complexity decreases, as the space of feasible solutions also decreases according to Equation (4). However, this worst-case analysis provides insight into what influences the BRD resolution time: the size of the time slots, the consumers' preferences, and the number of consumers. One should notice that the number of rounds of the BRD is proportional to the number of consumers in the game, but it is also impacted by other factors such as the path to reach the minimum total cost (which is related to the algorithm's playing order) and the complexity of one iteration given the opponents' consumption vector.
Finally, each iteration of the BRD means one exchange of messages; thus, the communication overhead is equal to \(r \times N\). The analytical conclusions on computational complexity and communication overhead are summarized in Table 1.

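Table 1. Computational complexity and communication overhead.

| Method | Computational Complexity | Communication Overhead |
| --- | --- | --- |
| Best Response Dynamics | \(O(r \times N \times 2^{24/\delta})\) | \(r \times N\) |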
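To give a feel for the gap between the two worst-case bounds discussed above, the following sketch evaluates them with exact integer arithmetic. The function names and the sample values of \(r\), \(N\), and \(\delta\) are illustrative assumptions, not results from the case study.

```python
def slots_per_day(delta_hours):
    """Number of time slots in the day-ahead horizon, 24 / delta."""
    return round(24 / delta_hours)


def brd_worst_case(rounds, n_consumers, delta_hours):
    """r x N x 2^(24/delta): schedules explored by the BRD in the worst case."""
    return rounds * n_consumers * 2 ** slots_per_day(delta_hours)


def centralized_worst_case(n_consumers, delta_hours):
    """2^(N * 24/delta): joint schedules explored by the centralized model."""
    return 2 ** (n_consumers * slots_per_day(delta_hours))


def brd_messages(rounds, n_consumers):
    """One message per consumer per round."""
    return rounds * n_consumers


# Illustrative values only: 10 consumers, 5 rounds, hourly slots.
brd = brd_worst_case(rounds=5, n_consumers=10, delta_hours=1.0)
central = centralized_worst_case(n_consumers=10, delta_hours=1.0)
print(f"BRD bound has {len(str(brd))} digits, centralized bound has {len(str(central))} digits")
```

The BRD bound grows linearly in the number of consumers, whereas the centralized bound grows exponentially in it; halving \(\delta\) doubles the exponent in both.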

General Applicability of the Model
The game model can be extended to other appliances. For instance, other types of thermostatically controlled loads (TCLs) behave similarly to the ACs presented, as shown in reference [26], and can be modeled using Equations (1)-(3). Moreover, shiftable appliances, such as washing machines, also have an on/off control, which leads to the same conclusions on equilibrium points presented in Section 3.2. Even devices that can be modeled with continuous variables (e.g., an electric vehicle) could be added to the model without changing its properties. This can be explained by the fact that consumers try to optimize the total cost function in order to minimize their own bills. Therefore, all conclusions related to the existence of Nash Equilibria, convergence of BRD, application of total cost functions with minimum values, and nonexistence of cheating behavior are applicable and do not depend on the appliances controlled and their constraints.
Moreover, uncertainties could be added to the model. For instance, uncertainty on participants' final consumption, especially the non-controllable loads, can be a risk to the total cost consumers or the aggregation platform will pay depending on the demand-side management contracts they agree upon. In Section 3.1, those uncertainties were assumed deterministic, and the risks are taken by the aggregation platform, but they can be modeled as stochastic variables in the scope of our proposition without changing the nature and properties of the non-cooperative game. One could add an expectation over the base load probability function in Equation (12). It is clear that simulation results could be different if uncertainties are considered, but consumers will continue seeking to minimize the total community cost in the uncertain framework, the game will have a Nash Equilibrium for any convex function considered, and the best response algorithm will converge.

Case Study
To simulate the proposed model in a realistic context, verify its applicability, and analyze its advantages and limitations, we use real data collected from a community of consumers in the South of Spain, whose network is shown in Figure 4. Hourly active power consumption from June 2019 is collected and averaged to build a daily consumption curve for each of the 201 consumers of the community. The thermostatically controlled loads (TCLs) considered are air conditioners (ACs), with their operation model described in Equation (4). Their physical parameters (power, resistance, capacity, etc.) are calculated based on random values according to the data in Reference [26], while each household's occupation is estimated from the real data. Only consumers with high enough consumption are assumed to have an AC installed. As a result, 70 consumers out of the 201 have ACs. Afterwards, we simulated the AC load with physical parameters using Equation (4) and calculated consumers' inflexible loads by subtracting the AC load from the data collected. We kept the other 131 consumers without AC in the TCLs management program because they are part of the community, their inflexible load impacts the community's consumption/peak (i.e., the total cost), and we can study how the TC approach affects consumers without flexibility. This initial scenario with both types of consumers is called the base case (BAS) and is shown in Figure 5. The parameters of the cost and utility functions are generated following two different methodologies depending on the type of function: quadratic or peak pricing. For the first, we use a three-step piecewise function to represent the cost function. The parameters are defined based on the following tier prices: 22, 33, and 45 ¢/kWh. The thresholds between tiers are set to 60% and 75% of the group's peak load in the base scenario. We fit a quadratic curve to this piecewise function, resulting in parameters \(a_t = 0.081\) ¢/kWh², \(b_t = 12.605\) ¢/kWh, and \(c_t = 1.701\) $/kWh for all \(t \in T\). For the peak pricing function, the chosen parameters are as follows: 1 $/kW for the peak charge and a two-level TOU tariff for the volumetric rate, with values equal to 0.12 $/kWh between 0 h and 17 h, and 0.20 $/kWh between 17 h and 24 h. Although the process of defining those cost parameters by the aggregation platform is out of the scope of this work, some of the chosen values are based on PG&E tariffs for 2019 [55]. The final ranges of all parameters (generated or calculated) are presented in Table 2.
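As a rough illustration of how the peak pricing parameters above combine, the sketch below evaluates a daily cost made of a peak charge plus a two-level TOU volumetric charge. The exact form of Equation (10) is not reproduced here, so the structure of `peak_pricing_cost` (a single daily peak charge applied to the maximum slot load, with loads expressed as average kW per slot) is an assumption, as are the function names.

```python
def tou_rate(hour):
    """Two-level TOU tariff in $/kWh: 0.12 from 0 h to 17 h, 0.20 from 17 h to 24 h."""
    return 0.12 if hour < 17 else 0.20


def peak_pricing_cost(load_kw, delta_hours=1.0, peak_charge=1.0):
    """Assumed daily cost: peak charge ($/kW) on the daily peak plus TOU energy charge."""
    energy_charge = sum(
        tou_rate(t * delta_hours) * p * delta_hours
        for t, p in enumerate(load_kw)
    )
    return peak_charge * max(load_kw) + energy_charge


# A flat 1 kW profile pays 1 $ for the peak plus 17 * 0.12 + 7 * 0.20 $ for energy.
flat = [1.0] * 24
print(round(peak_pricing_cost(flat), 2))
```

Under this assumed form, moving a load spike from the evening to the early morning leaves the peak charge unchanged but lowers the volumetric part, while spreading the spike over several slots lowers the peak charge itself.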

Solutions of the Non-Cooperative Game with TCL
We apply the Best Response Dynamics (BRD) to solve the scheduling game designed in Section 3, with the data generated according to Section 4.1. The BRD is run twice: once with the quadratic total cost function and once with the peak pricing function. Both results are the Nash Equilibria of the proposed transactive control approach. We also calculate the optimal solutions with respect to problems (9) (centralized with quadratic total cost) and (11) (centralized with peak pricing function) in order to obtain a quality measurement of the NEs attained. Moreover, by comparing the results of the games with the centralized solutions, we also analyze their ability to flatten the community's load curve and how different total cost functions impact the outcomes.
The results are shown in Table 3, including the base scenario total costs calculated with the different functions. The solutions of the proposed TC approach with different total cost functions are very close to the centralized ones in terms of the community's total cost. In the case of the quadratic total cost function, the game solution has the same value as its centralized counterpart. For the peak pricing, the game solution is slightly more expensive ($4.65), a difference that barely affects consumers' payments given that it is shared among 201 consumers. Therefore, the prices-of-anarchy (PoAs) of the game solutions, which measure the amount of damage suffered by consumers due to the absence of a central authority [56], are 1 and 1.008, respectively. Those close to optimal results are explained by two factors: the utility function is designed to make minimizing the community's total cost the consumers' goal, and the integer nature of the variables can result in local optima. Even though the solution can be sub-optimal, in the TC settings, consumers benefit from a decentralized optimization process in which they have local autonomy to define their TCLs schedule, an impossible feature for centralized approaches. In addition to optimizing the community's total cost, the TC approach with different total cost functions also optimizes the group's peak-to-average ratio (PAR). This parameter measures how flat the resulting load curve is [57]. In Figure 6, which depicts the load dispatch of the three scenarios for both total cost functions, it can be seen that the load curve is smoothed and/or flattened with TC approaches almost as much as in the centralized cases. In the peak pricing scenario (right plot in the figure), consumers are motivated to reduce the group's peak to reduce their own bills due to the peak charge in the community's cost function (10), with consumers' utility proportional to this total cost. This explains why the PAR values are smaller for the peak pricing function than for the quadratic function in Table 3. Therefore, the quadratic total cost results in smoothing the load curve, while the peak pricing aims to reduce the peak and flatten the load curve.
Moreover, in the quadratic total cost case (left plot in Figure 6), solutions of the TC and centralized approaches overlap, demonstrating that the decentralized game is able to reach optimal solutions, especially when the total cost function is strictly convex.
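The two quality measures used in this section are simple ratios, sketched below. The function names and the sample numbers are hypothetical and chosen only to illustrate the definitions; they are not the values of Table 3.

```python
def peak_to_average_ratio(load):
    """PAR: maximum slot load divided by the mean slot load (1.0 for a flat curve)."""
    return max(load) / (sum(load) / len(load))


def price_of_anarchy(equilibrium_cost, optimal_cost):
    """PoA: total cost at the Nash Equilibrium over the centralized optimum."""
    return equilibrium_cost / optimal_cost


# Hypothetical load curves: a flat one and one with a single peak.
print(peak_to_average_ratio([2.0, 2.0, 2.0, 2.0]))
print(peak_to_average_ratio([1.0, 1.0, 1.0, 5.0]))
# Hypothetical costs: an equilibrium 0.8% more expensive than the optimum.
print(round(price_of_anarchy(100.8, 100.0), 3))
```

A PoA of 1 means the decentralized equilibrium matches the centralized optimum, and a PAR closer to 1 means a flatter community load curve.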

Convergence Process of the Algorithm
The orange lines in Figure 6 represent the Nash Equilibria, i.e., the solutions attained with the proposed TC approach under the different total cost functions. In Figure 7, we show the convergence path taken by the BRD to reach those solutions. The plots depict the change in consumption per iteration for all consumers with AC. In both plots, the algorithm's first round shows all 70 consumers modifying their schedules, as this round is designed to construct an optimized initial solution: consumers start without load and best respond to the previous players at their turn, and the total community load is known only at the end of this first round, when all consumers have played once (including consumers without AC, who have no flexibility but add their inflexible load schedule to the game); see the flowchart in Figure 2. Moreover, the number of consumers changing their schedules in each round decreases throughout the iterations until no modifications occur and the NE is reached. This happens after 12 rounds in the case of the quadratic total cost function (left plot) and 10 rounds for the peak pricing (right plot).

Scalability and Solution Times
In this section, we provide results on scalability and the time required to reach solutions. In order to analyze the impact of the factors discussed in Section 3.6 on solution time, multiple simulations are performed on a Windows 10 computer with an AMD Ryzen 5 3500U processor running at 2100 MHz and 20 GB of RAM. The algorithm is implemented in Python 3.7 and uses the Gurobi 9.1 solver.
Four aspects are analyzed: the number of consumers in the community, consumers' preferences, the size of the time slot, and the number of rounds to reach the equilibria. For the first, we create five sub-cases with different numbers of consumers, \(N \in \{5, 10, 50, 100, 150\}\), with the \(N = 5\) case being a subset of \(N = 10\), which is a subset of \(N = 50\), and so on. The second aspect relates to the complexity of each consumer's local problem, and we analyze it by constructing six sub-cases with 10 consumers: the first three are built by randomly picking consumers from, respectively, those with the largest total energy consumption, those with the largest AC load as a percentage of their total energy use, and those with the smallest total energy consumption. For the third, we create two cases by modifying the size of the time slot to \(\delta = 0.5\) (30 min) and \(\delta = 1.0\) (1 h) and adjusting all system/market parameters accordingly. The fourth aspect is analyzed within the results of the aforementioned scenarios. One should note that the sub-cases created are variations or parts of the data presented in Section 4.1 in order to have comparable results. Tables 4-6 show the time to reach equilibria. More consumers means more time to solve the algorithm, but this is not the only factor, as can be seen in Table 4 for cases \(N = 100\) with quadratic total cost and \(N = 50\) with peak pricing. The complexity of each consumer's problem also influences resolution time. This complexity can be described by, for instance, the number of appliances each participant needs to schedule (which is one in our test case) or by his/her constraints/preferences. As shown in Table 5, the instance in which consumers have the highest load takes more time to solve for both total cost functions. Finally, the discretization of the day-ahead horizon into smaller time slots (e.g., 15 min or 30 min) results in more variables and, thus, in more time to solve the problem (see Table 6).

Consumers' Comfort and Savings
As we are discussing the impact of adding TCLs to scheduling games, we show one consumer's solution in Figure 8 for the scenarios without and with energy management. The consumer's load without energy management is shown in the upper left plot, with energy management and a quadratic total cost function in the upper right, and with energy management and a peak pricing total cost in the lower plot. The plots include the inflexible load (yellow bars), the AC dispatch (blue bars), the temperature inside the room (theta), and the consumer's preferences. Between time slots 0 and 27 (6 h 45) and between 56 (14 h 00) and 67 (16 h 45), this consumer is not at home or does not need the internal temperature to be between 21 °C and 23 °C (his/her comfort constraints). Therefore, preferences during these time slots are relaxed, i.e., the bounds are set to 0 °C and 100 °C. In the base scenario, the AC is turned on as soon as the internal temperature reaches the consumer's comfort limit. This dispatch without management does not consider the community's load and total cost and only considers the individual consumer's preferences. In the scenarios with the TC energy management, the equilibrium solutions turn on the AC when the consumer is not at home in order to bring forward consumption that would otherwise occur during the community's periods of higher demand. This flexibility is essential for the decentralized model to minimize the total cost: optimizing the solution depends on smoothing consumers' load when a quadratic total cost function defines their bills and on reducing the community's load peak and filling its valley when a peak pricing function specifies their bills.
Moreover, consumers' comfort is respected in the decentralized solutions. The internal temperature is kept inside the defined comfort range during the time slots when the consumer is at home and/or wants the AC switched on. In addition, lower temperatures are reached in the decentralized setting: instead of staying close to the upper limit as in the base scenario, the game solutions reach temperature values closer to the lower limit. For the consumer shown in Figure 8, the average internal temperature during the constrained time slots (28 to 55 and 68 to 95) is 22.7 °C in the base scenario and is lowered to 22.2 °C when the TC with a quadratic total cost function is applied and to 22.5 °C when the TC with peak pricing is used. Therefore, the game is able to lower the average comfort temperatures of air conditioners while also reducing consumers' bills.
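The comfort check described for the consumer in Figure 8 can be sketched as follows. The 15-minute slot length (96 slots per day) is inferred from the slot-to-time correspondences given above (slot 27 at 6 h 45, slot 56 at 14 h 00); the helper names and the sample temperature trajectory are assumptions made for illustration and do not reproduce the thermal model of Equation (4).

```python
SLOTS = 96                      # 15-minute slots in one day (inferred)
RELAXED = ((0, 27), (56, 67))   # inclusive slot ranges with relaxed preferences


def constrained_slots(relaxed=RELAXED):
    """Slots in which the 21-23 degC comfort range applies."""
    relaxed_set = {t for start, end in relaxed for t in range(start, end + 1)}
    return [t for t in range(SLOTS) if t not in relaxed_set]


def comfort_bounds(low=21.0, high=23.0, relaxed=RELAXED):
    """Per-slot temperature bounds; relaxed slots get 0 degC and 100 degC."""
    lower, upper = [0.0] * SLOTS, [100.0] * SLOTS
    for t in constrained_slots(relaxed):
        lower[t], upper[t] = low, high
    return lower, upper


def comfort_respected(theta, lower, upper):
    """True if the temperature stays within its bounds in every slot."""
    return all(lo <= th <= hi for th, lo, hi in zip(theta, lower, upper))


def mean_constrained_temperature(theta, relaxed=RELAXED):
    """Average temperature over the constrained slots only."""
    slots = constrained_slots(relaxed)
    return sum(theta[t] for t in slots) / len(slots)


# Hypothetical trajectory: 27 degC while away, 22.2 degC while at home.
lower, upper = comfort_bounds()
theta = [22.2 if lo == 21.0 else 27.0 for lo in lower]
print(comfort_respected(theta, lower, upper))
```

Relaxing the bounds to 0 °C and 100 °C rather than removing the constraint keeps the same constraint structure in every slot, which is convenient when the check is part of an optimization model.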
We also present consumers' monthly savings provided by the TC approach when compared to the base scenario. To calculate them, we define their energy consumption in all scenarios to determine consumers' billing factor \(f_n\) and share the total cost of the scheduling among them. We compute their payoffs using Equation (12) and multiply the value by 30 days. Thus, bills in the base scenario equal \(f_n^{BAS}\) times the quadratic or peak pricing total cost of this scenario, and bills in the TC scenarios equal \(f_n^{TC_Q}\) times the quadratic total cost of the NE, or \(f_n^{TC_{PP}}\) times the peak pricing total cost of the NE. Then, the savings are calculated as the difference between consumers' utility in the scenario without energy management (BAS) and their utility when they participate in the scheduling game. For the quadratic total cost, consumers' monthly savings vary from $0.00 (very small consumers) to $1.34 (very large consumers), and the community saves around $27.08 with the TC coordination in a month. For the peak pricing total cost, consumers' savings vary from $0.00 to $61.19, and the community saves around $1191.24 with the TC approach in a month. We show the distribution of savings for both types of consumers (with and without AC) in Figure 9. The left plot depicts savings when the quadratic total cost function is applied to the community, and the right one depicts savings when the peak pricing is used. Both plots show savings against the total amount of energy changed in consumers' schedules compared to the scenario without energy management. One can observe that consumers that made more flexibility available to reach the community's goal had more savings, especially in the peak pricing scenario.
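The savings computation described above can be sketched as follows. Equation (12) is not reproduced in this section, so the sketch assumes the billing factor is each consumer's share of the community's daily energy, in line with the "billing proportional to each participant's share" design; the function names and the numbers in the usage lines are hypothetical.

```python
def billing_factors(daily_energy_kwh):
    """Assumed billing factor f_n: consumer n's share of the community's daily energy."""
    total = sum(daily_energy_kwh)
    return [e / total for e in daily_energy_kwh]


def monthly_bills(daily_energy_kwh, daily_total_cost, days=30):
    """Bill of each consumer: f_n times the daily total cost, scaled to a month."""
    return [f * daily_total_cost * days for f in billing_factors(daily_energy_kwh)]


def monthly_savings(base_energy, base_cost, tc_energy, tc_cost, days=30):
    """Savings: bill without energy management (BAS) minus bill in the TC scenario."""
    base = monthly_bills(base_energy, base_cost, days)
    tc = monthly_bills(tc_energy, tc_cost, days)
    return [b - t for b, t in zip(base, tc)]


# Hypothetical two-consumer community whose daily cost drops from 100 to 90.
savings = monthly_savings([10.0, 30.0], 100.0, [10.0, 30.0], 90.0)
print([round(s, 2) for s in savings])
```

Under this assumption the individual savings always add up to the community's monthly saving, and a consumer whose share is unchanged saves in proportion to that share.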

Discussion and Future Challenges
The non-cooperative game presented here is able to properly include thermostatically controlled loads (TCLs) with an on/off control. The results show that the community total cost is minimized with a strictly convex total cost function (quadratic) and reaches a close to optimal solution with a convex one (peak pricing). Moreover, the community's final load is smoothed with the quadratic TC and flattened with the peak pricing TC. These important conclusions can help aggregation platforms to decide whether to use one of these cost functions.
Moreover, consumers' comfort is respected with the TC approach: reducing the community's total cost does not impact consumers' comfort, and their average internal temperature is even lowered with the utilization of the transactive control model if compared to the scenario without energy management.
Another important feature is that all consumers have savings, even those without AC, which can be an incentive for consumers to stay in the community program regardless of the flexibility they have to offer, e.g., some days they may not need to use their ACs.
In addition, this model is able to align consumers' goal with the community's objectives, bringing natural coordination to the game. Therefore, these cooperative aspects (every consumer reduces its own cost while minimizing the community's total cost) can be effective in engaging end-users.
Moreover, the studied billing mechanism has other advantages as it is easy to implement, transparent to consumers, simple to understand, and able to optimize different types of total cost functions. For the last, functions can be chosen depending on the aggregation platform's goal (e.g., reduce the peak load, smooth the load curve, and provide ancillary services), without affecting the NE existence or the algorithm's convergence, as long as the selected function has a minimum.
A future challenge of this methodology is how to consider the energy "payback" nature of TCLs [58,59]. Shifting TCLs in time while maintaining comfort standards may imply an overall energy increase relative to the baseline consumption, depending on consumers' preferences and the total cost function. In the case study, the sum of consumers' total load in the base scenario is 2991.06 kWh. After the BRD solves the quadratic game, this value is equal to the base one. In the case of the peak pricing, there is an increase of 0.94%, resulting in a value of 3019.31 kWh. The additional 28.25 kWh from ACs confirms the energy-variant nature of TCLs and their "payback" characteristic, especially when the peak is highly penalized. In this case, the difference is small, and assuming a fixed consumption to guarantee coordination and convergence of the game is reasonable. However, it is important to further extend the analysis of this energy "payback" characteristic and identify the best scenarios (e.g., strictly convex total cost functions) and the limitations (e.g., non-convex functions with minimum values) of the proposed model.
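The payback figures quoted above can be cross-checked with a short calculation. The symbols \(E^{BAS}\), \(E^{PP}\), and \(\Delta E\) are notation introduced here for the check only.

```latex
\[
\Delta E = E^{PP} - E^{BAS} = 3019.31 - 2991.06 = 28.25\ \text{kWh},
\qquad
\frac{\Delta E}{E^{BAS}} = \frac{28.25}{2991.06} \approx 0.0094 = 0.94\%.
\]
```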

Conclusions
In this paper, we advanced the state of the art of non-cooperative games applied to the day-ahead load scheduling of residential consumers by adding thermostatically controlled loads (TCLs) to the set of appliances considered in the transactive control (TC). We proposed a game model for the residential community to schedule their flexible TCLs (more specifically, air conditioners) with the objective of minimizing consumers' and the community's costs. Consumers' comfort constraints were modeled explicitly, and their TCLs' operation was described with binary variables in order to represent the on/off nature of the control.
We showed the following: (1) despite the integer nature of the control, the game reaches optimal or close to optimal solutions; (2) the billing mechanism used is able to align individuals' goals with the community's goal; (3) consumers' comfort is properly modeled and respected; and (4) different total cost functions can be used depending on the market context and the objective of the demand management program. Moreover, the non-cooperative game was solved in a decentralized fashion, providing consumers with the autonomy to decide their schedules according to their preferences and interests.
To discuss the practical application of non-cooperative games to scheduling TCLs and the aforementioned theoretical conclusions, we created a case study using real data from 201 consumers in Spain. The results showed that the TC solution is equal to the optimal one when a quadratic total cost function is used, and it is 0.80% worse than the optimal total cost when a peak pricing function is applied. Moreover, the quadratic game was able to smooth the community's curve while the peak pricing game reduced its peak-to-average ratio. Both games converged to a Nash Equilibrium in 10 to 12 rounds. Consumers' comfort was respected, and the internal temperature of their homes was even reduced with the TC approach. Finally, all consumers had savings compared to the scenario without energy management.

Acknowledgments:
The authors would like to thank the Brazilian agency CAPES and the Fulbright DDRA Award for their support of this work.

Conflicts of Interest:
The authors declare no conflict of interest.

Nomenclature
\(n, m\): Indexes for consumers
\(t\): Indexes for time slots
\(k\): Indexes for game stages
\(N\): Set of consumers
\(T\): Set of time slots
\(S_n\): Set of strategies of \(n\)
\(x_{n,t}\): Binary scheduling variable: is 1 if consumer \(n\)'s TCL is on at \(t\) and 0 otherwise
\(l_{n,t}, q_{n,t}\): Total consumption of \(n\) at \(t\)
\(L_t\): Community's total load at \(t\)
\(C(L)\): Community's total cost when its load curve is \(L\)
\(u_n(l_n, l_{-n})\): Utility of consumer \(n\)