Online Energy Management and Heterogeneous Task Scheduling for Smart Communities with Residential Cogeneration and Renewable Energy

With the development of renewable energy and communication technologies in recent years, many residents now install renewable energy devices and energy storage systems in their residences. Sharing idle energy with others in a community is therefore a promising prospect. However, it is a great challenge to share residents’ energy within a community so as to minimize the total cost of all residents. In this paper, we study the problem of energy management and task scheduling for a community with renewable energy and residential cogeneration, such as a residential combined heat and power (resCHP) system, with the goal of paying the lowest energy bill. We take both inelastic and elastic load demands into account, which correspond to delay-intolerant and delay-tolerant tasks in the community, respectively. The minimum cost problem of a non-cooperative community is formulated as a stochastic non-convex optimization problem with physical constraints. Our objective is to minimize the time-average cost for each resident in the community, including the cost of electricity from the external grid and the cost of natural gas. Lyapunov optimization theory and a primal-dual gradient method are adopted to tackle this problem; the resulting algorithm needs no future data and has low computational complexity. Furthermore, we design a cooperative renewable energy sharing algorithm based on the State-action-reward-state-action (Sarsa) algorithm, under the condition that each residence in the community can communicate with its neighbors through a central controller. Finally, extensive simulations using practical data are presented to validate the proposed algorithms.


Introduction
With the development of renewable energy technology, many residents utilize renewable energy devices in their residences with energy storage systems. Sharing idle energy with others in a community is therefore a promising prospect. As smart grids develop rapidly, energy management systems have been widely utilized in many fields, e.g., Supervisory Control and Data Acquisition [1]. In particular, the residential combined heat and power (resCHP) system [2], which produces electricity and thermal energy simultaneously, is becoming increasingly popular in smart grids for its relatively low emissions and high efficiency. Such a system is more efficient and economical than previous systems that generate these two kinds of energy separately. The long-term system cost can be reduced further by using large batteries to store electricity. In addition, renewable energy devices such as solar panels and wind turbines are popular in smart grids and have become a typical feature as environmental concerns increase. However, large-scale integration of renewable energy can destabilize the power system. In this condition, batteries and elastic loads can be utilized to decrease the fluctuation caused by renewable energy. Batteries and elastic loads are also more environmentally friendly because they can dispatch power across time [3].
Energy management and optimal scheduling for a smart grid with a resCHP system and hybrid renewable energy sources can reduce carbon dioxide emissions. Li et al. [4] proposed a real-time scheduling policy that applies modified Lyapunov optimization to separate and sequentially determine the joint load scheduling and storage control. Yu et al. [5] proposed a real-time and distributed algorithm that applies the Lyapunov optimization method and the alternating direction method of multipliers to solve a stochastic programming problem with many practical constraints. Liu et al. [6] proposed a parallel distributed optimization algorithm that applies game theory to minimize the cost of microgrids and the utility company. Ou et al. [7] utilized a dynamic switch and time-variant control strategies to design a new voltage controller to convert carbon dioxide into methane and methanol, and then studied control strategies for a hybrid energy microgrid [8], where the controller utilized a radial basis function network-sliding algorithm and a general regression neural network for maximum power point tracking (MPPT) control to control real power quickly and steadily. In practice, more attention should be paid to dynamic operation, MPPT and system stability under faults. Hong et al. [9] proposed an intelligent controller including the radial basis function network and the improved neural network for MPPT control. Bui et al. [10] used an agent communication language to develop a modified contract net protocol for communication among different agents to optimize the operation of microgrids. Ou et al. [11,12] discussed a distributed generator model and a distributed energy resource for both islanded and grid-connected modes. Zhang et al. [13] introduced a power scheduling approach to solve the problem of renewable energy's stochastic availability. Ou et al. [14] applied a smart damping controller for the static synchronous compensator to reduce the power fluctuations and damping in a hybrid energy microgrid. Mane et al. [15] proposed an improved Lyapunov controller operated in cascade mode to control a hybrid energy microgrid of fuel and an ultracapacitor. Bahrami et al. [16] modified the traditional demand response programs for multiple energy carriers to transfer into an energy center. Mohammadi et al. [17] proposed a method that applies game theory to increase the efficiency of the traditional communication network.
With the development of information and communication technology, good two-way communication between smart appliances and control centers is available, which makes it possible to reduce the cost by scheduling tasks with demand response strategies such as a real-time price strategy [18]. Smart appliances can schedule their tasks by using a real-time price strategy to avoid the peak, which reduces the total system cost and the required capacity of generators [19]. Previous researchers designed algorithms that help a practical system decide when to store electricity, utilize it, or sell it to the grid according to the real-time price. Without an appropriate real-time price strategy, needless peaks will arise, which makes it harder to coordinate elastic loads [20]. Other researchers focused on scheduling elastic loads without a real-time price.
In this paper, we first formulate the minimum cost problem of our system as a non-convex optimization problem with physical constraints, which is difficult to handle. Then, this problem is reformulated by relaxing the time-coupling constraints into a time-averaged constraint. Lyapunov optimization theory is adopted to handle the reformulated problem. Furthermore, we design an online algorithm for energy management and task scheduling, which does not depend on statistical information about the system and has low computational complexity. Our system model is shown in Figure 1.
We summarize the contributions of our paper as follows:

• We propose a practical integrated model that extends other models by including a resCHP system, a renewable energy device, an energy storage unit (battery) and a boiler for each residence in the community. Our aim is to minimize the time-average cost of the whole community, including the cost from the external grid and the gas station. In each residence, we study two cases that are not considered in other models: one has delay intolerant (DI) tasks and the other has delay tolerant (DT) tasks.

• We first formulate a cost-optimal problem for one residence in a non-cooperative community in order to reduce the total cost under the constraints of DI and DT tasks. Then, we present an online Task Scheduling Algorithm (TSA) based on the Lyapunov optimization approach, which needs no future data and has low computational complexity. Because it is difficult to separate the variables, we use a standard primal-dual gradient method to find a solution of the problem. To reduce the cost further, we propose an energy sharing policy for cooperative energy sharing in the community. By using this policy, we design a cooperative renewable energy sharing algorithm based on the Sarsa algorithm, on the condition that each residence in the community can communicate with its neighbors through a central controller.

• Extensive simulations using real trace data are presented to validate the proposed algorithms. The results for the TSA algorithm show that a larger maximum battery output and a larger V lead to a higher saved cost. We satisfy the DT load demand before user-defined deadlines, and the saved cost increases as the deadline increases. Compared with the TSA algorithm (the non-cooperative algorithm), the cooperative renewable energy sharing algorithm can reduce the cost of the community by nearly 9% while meeting all the demands of the residents.
The rest of this paper consists of the following five sections. Section 2 reviews related work on resCHP systems and task scheduling in smart grids. Section 3 proposes a system model of a community including resCHP systems, renewable energy devices, energy storage and boilers, and introduces the renewable energy sharing policy and the control target. Section 4 formulates the optimization problem with its constraints for non-cooperative and cooperative communities and gives the details of our designed algorithms. In Section 5, we evaluate the performance of our algorithms by using real data. Finally, some conclusions are drawn in Section 6.

Related Work
With the development of information and communication technology, the traditional power grid has a good chance of being upgraded by dispatching different kinds of energy sources optimally. The resCHP system is becoming increasingly popular in smart grids for its low emissions and high efficiency. Alipour et al. [21] proposed a stochastic programming framework for optimal scheduling that considers uncertainties of demand response. Motevasel et al. [22] proposed an energy management system for optimal scheduling of a microgrid with resCHP to find the optimum operating point of distributed energy resources, batteries and thermal storage devices. Tasdighi et al. [23] transformed the energy management issue into a mixed-integer linear programming problem for a system using resCHP. Ma et al. [24] designed an energy management framework for resCHP consumers with demand response using internal prices. In these papers, it is assumed that the future electricity price and load demand are predictable. These studies solve different problems by using dynamic programming, which suffers from the 'curse of dimensionality' problem [25].
Many previous studies focused on designing task scheduling policies. Goudarzi et al. [26] proposed a task scheduling policy that could minimize consumers' electricity cost by using an inconvenience function. Buttazzo et al. [27] discussed limited preemption models that lie between the two cases of fully preemptive and nonpreemptive scheduling. Du et al. [28] proposed a novel linear sequential optimization enhanced multiloop algorithm to schedule load demand considering the satisfaction of consumers. These works tried to design optimal task scheduling algorithms by considering only the convenience of consumers.
There are many optimization methods for reducing the system cost with energy storage in a smart grid. Zhou et al. [29] designed an algorithm based on queueing models of a battery and a tank in the resCHP system to reduce the average cost by adopting a Lyapunov optimization approach. This approach is also used in other papers to optimize the system cost. Neely et al. [30] reduced the average cost of flexible consumers and met the deadlines of delay tolerant tasks. Guo et al. [31] considered both DI and DT tasks from the perspective of a household when tackling the problem of minimizing the average cost. Urgaonkar et al. [32] developed an online control algorithm for data center electricity management to minimize the long-term average cost. However, their models did not include renewables. Liu et al. [33] designed a centralized algorithm based on adaptive dynamic programming to solve the energy management optimization problem. Gatsis et al. [34] proposed a near-optimal scheduling algorithm considering the Advanced Metering Infrastructure messages between the consumers and the utility company. Logenthiran et al. [35] used a heuristic-based evolutionary algorithm to tackle the problem formulated from a day-ahead load shifting method for demand-side management of a smart grid. Koutitas et al. [36] developed a management strategy for periodic loads to reduce the cost over a period of time. Their model did not consider task scheduling in the resCHP system. In this paper, we consider the optimal scheduling of DT tasks in a comprehensive system that has a resCHP system, a renewable energy device (e.g., solar panel, wind turbine), a battery and a boiler for each residence to reduce the total cost.

System Model and Problem Statement
The overview of our system is shown in Figure 1. There is a set of smart appliances denoted by $i \in \{1, 2, \dots, N_t\}$ in residence $j$, and there are two types of flows, i.e., information flow and energy flow. For simplicity, we assume that one appliance is utilized in a residence in each time slot. We assume that our system works in discrete time slots $t \in \{1, 2, \dots, T\}$; the horizon $T$ covers 72 h if we study the problem over three days. The details of our system are described as follows.

System Architecture
This system includes a set of batteries, renewable energy devices, boilers and resCHP systems. Let $e_j(t)$ and $h_j(t)$ represent the stochastic electricity and heat demand of residence $j$ at time $t$, respectively. Electricity demand $e_j(t)$ can be balanced by the battery $b_j(t)$ or the external grid $p_j(t)$. Heat demand $h_j(t)$, e.g., for bathing, is satisfied by a boiler or the resCHP system.
In time slot $t$, the resCHP system of residence $j$ consumes natural gas $u_j(t)$ and simultaneously exports electricity $\eta_e u_j(t)$ to the battery and heat $\eta_h u_j(t)$ to satisfy heat demand, where $\eta_e$ and $\eta_h$ are the efficiencies of converting natural gas to electricity and heat, respectively. In addition, the battery gets energy $r_j(t)$ from renewable energy and the boiler exports energy $g_j(t)$ to satisfy heat demand. We set the electricity price $C_e(t)$ in the range $(C_{e,\min}, C_{e,\max})$. We assume that the price of natural gas $C_g$ is constant across time slots because it does not change greatly. The electricity price $C_e(t)$ can be obtained from PG&E [37], and the price of natural gas $C_g$ can be found in the RateFinder report for June 2017 [38] from PG&E. To minimize the average long-term cost, our algorithm focuses on the electricity $p_j(t)$ bought from the grid, the natural gas $u_j(t)$ consumed by the resCHP system and the natural gas $g_j(t)$ consumed by the boiler. The symbols used are summarized in Abbreviations.

Renewables Generation
We assume that there are $N$ residents in a community and all residences are provided with solar panels and wind turbines. From [39], the energy generated from a solar panel $r_j^s(t)$ for residence $j$ can be calculated as $r_j^s(t) = \zeta A_j I(t) t_d$, where $\zeta$ represents the efficiency of converting solar energy to electricity, $A_j$ is the solar panel area of residence $j$, $I(t)$ is the illumination intensity and $t_d$ is the duration of time slot $t$. Because the illumination intensity is bounded, there is an upper bound $r^{s,\max}$ for solar energy. Also from [39], the energy generated from a wind turbine $r_j^w(t)$ can be calculated as $r_j^w(t) = \frac{1}{2} \varsigma \rho \, WB_j \, v(t)^3 t_d$, where $\varsigma$ represents the efficiency of converting wind energy to electricity, $WB_j$ is the rotor blade area of the wind turbine, $\rho$ stands for the density of the air, $v(t)$ is the wind speed and $t_d$ is the duration of time slot $t$. Similarly, there is an upper bound $r^{w,\max}$ for wind energy. The energy delivered from the renewable energy generators to the battery in time slot $t$ for residence $j$ is $r_j(t)$. We have $r_j(t) = r_j^s(t) + r_j^w(t)$ and $r_j(t) \le r^{s,\max} + r^{w,\max}$.
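The two generation formulas can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the unit conventions (irradiance in kW/m², wind power in W converted to kWh) and the per-source capping are assumptions made here for the example.

```python
def solar_energy(zeta, area_m2, irradiance_kw_per_m2, slot_hours):
    """Solar energy per slot in kWh: zeta * A_j * I(t) * t_d."""
    return zeta * area_m2 * irradiance_kw_per_m2 * slot_hours


def wind_energy(sigma, rho, blade_area_m2, wind_speed_m_s, slot_hours):
    """Wind energy per slot in kWh: 0.5 * sigma * rho * WB_j * v(t)^3 * t_d."""
    power_w = 0.5 * sigma * rho * blade_area_m2 * wind_speed_m_s ** 3
    return power_w / 1000.0 * slot_hours  # assumed conversion from W to kW


def renewable_energy(r_solar, r_wind, r_solar_max, r_wind_max):
    """r_j(t) = r_s + r_w, with each source capped at its upper bound (assumption)."""
    return min(r_solar, r_solar_max) + min(r_wind, r_wind_max)
```

With the efficiency, air density and blade area reported later in the simulation settings (ς = 30%, ρ = 1.2041 kg/m³, WB_j = 10 m²), a hypothetical wind speed of 5 m/s over a 1-h slot gives about 0.23 kWh.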

Electricity and Heat Demand
The electricity demand of residence $j$ is denoted by $e_j(t)$ and the heat demand by $h_j(t)$. We assume that tasks are continuous and that the electricity consumption rate $\pi_i^t$ of each task is constant. Electricity demand is satisfied by the electricity from the external grid $p_j(t)$ and the electricity from the battery $b_j(t)$ in time slot $t$. Heat demand is satisfied by the resCHP system, $\eta_h u_j(t)$, and the boiler, $g_j(t)$. According to the energy conservation law, we have the following equation: $h_j(t) = \eta_h u_j(t) + \eta_s g_j(t)$. (2)
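As a small worked example of Equation (2), the boiler input needed to close the heat balance can be computed from the heat demand and the resCHP gas input. The function name and the clamping at zero are assumptions of this sketch, and $\eta_s$ is simply treated as the coefficient that appears in Equation (2).

```python
def boiler_gas_needed(heat_demand, chp_gas, eta_h, eta_s):
    """Solve h = eta_h * u + eta_s * g for g, clamped at zero (assumption)."""
    residual = heat_demand - eta_h * chp_gas
    return max(residual, 0.0) / eta_s
```

For instance, with the hypothetical values h = 10, u = 5, η_h = 0.5 and η_s = 0.8, the resCHP system covers 2.5 units of heat and the boiler term must supply the remaining 7.5.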

Delay-Tolerant and Delay-Intolerant Tasks
In this part, we provide a brief introduction to DI and DT tasks. There are some DI tasks in our daily life, such as television or lighting, which need to be satisfied immediately. DT tasks appear with the advent of smart appliances. With the development of smart appliances, people can schedule in advance some tasks such as washing and bathing.
In our model, a residence uses $n_t$ to denote the number of tasks in the set $N_t$ that arrive in time slot $t$. According to the degree of delay tolerance, tasks can be divided into two groups: DT tasks and DI tasks. Each task $i \in N_t$ can be characterized by two parameters: the time needed for the task, $a_i^t$, and the deadline for the task, $d_i^t$. The task should be finished before $t + d_i^t$. If $a_i^t = d_i^t$, the task must be served immediately; therefore, the task is delay-intolerant. Otherwise, the task is delay-tolerant. DI tasks leave no room for scheduling, so we focus on the optimal scheduling of DT tasks to achieve the minimum cost. The postponing time is denoted by $s_i^t$; if the task is delay-intolerant, the postponing time $s_i^t$ is 0. We set $d_{\max} \triangleq \max_{t,i} d_i^t$ as the maximum delay of tasks.
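The classification rule above can be expressed as a small data structure. This is a sketch under stated assumptions: the class and method names are hypothetical, and the maximum postponement $d_i^t - a_i^t$ is inferred from the requirement that the task finish before $t + d_i^t$.

```python
from dataclasses import dataclass


@dataclass
class Task:
    arrival: int   # slot t in which the task arrives
    duration: int  # a_i^t, number of slots the task needs
    deadline: int  # d_i^t, the task must finish before arrival + deadline

    def is_delay_tolerant(self) -> bool:
        # a_i^t == d_i^t means the task must start at once (DI task).
        return self.deadline > self.duration

    def max_postponement(self) -> int:
        # Largest delay s_i^t that still meets the deadline; 0 for DI tasks.
        return self.deadline - self.duration
```

A lighting task with `duration=2, deadline=2` is classified as DI, whereas a washing task with `duration=2, deadline=6` is DT and can be postponed by up to four slots.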

Energy Storage
Here, we do not consider the electricity loss of the battery in the process of charging and discharging. For simplicity, we view the state of the battery as an electricity queue. From Figure 1, we can see that the energy held in the energy storage of residence $j$ in time slot $t$, $B_j(t)$, comes from three sources: the external grid, the renewable energy and the resCHP system. The battery level $B_j(t)$ in time slot $t$ evolves according to Equation (3). In general, $p_j(t)$ has a positive value, so in practice $b_j(t) \le e_j(t)$. The battery is further subject to the constraints in Equations (4) and (5), where $b_j^{\max}$ is the battery capacity of residence $j$. The constraints in Equations (4) and (5) mean that the electricity discharged from or charged into the battery has an upper bound in a practical model.
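To make the queue view of the battery concrete, the following sketch assumes one plausible form of the update, $B_j(t+1) = B_j(t) - b_j(t) + r_j(t) + \eta_e u_j(t)$, with $b_j(t) > 0$ for discharging and $b_j(t) < 0$ for charging from the grid. This form and the symmetric bound $|b_j(t)| \le b_j^{\max}$ are assumptions made for illustration and may differ from the exact Equations (3) to (5).

```python
def battery_step(level, discharge, renewable, chp_gas, eta_e, b_max):
    """One slot of the assumed queue update B(t+1) = B(t) - b(t) + r(t) + eta_e * u(t).

    discharge > 0 means the battery discharges; discharge < 0 means it charges.
    """
    if abs(discharge) > b_max:
        raise ValueError("charge/discharge amount exceeds b_max")
    return level - discharge + renewable + eta_e * chp_gas
```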

Energy Sharing Policy
The electricity demand $e_j(t)$ of residence $j$ can be satisfied according to the following expression: $p_j(t) = e_j(t) - b_j(t) + \sum_{j' \neq j} r_{jj'}(t)$, where residence $j$ can draw energy $r_{jj'}(t)$ from, or share it with, its neighbor $j'$. According to the law of energy conservation, we obtain $\sum_{j} \sum_{j' \neq j} r_{jj'}(t) = 0$.
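The conservation condition can be checked numerically. In this sketch the transfers are stored in a matrix where `transfers[j][k]` is the signed energy exchanged between residences j and k; the antisymmetric convention (what one residence receives, the other gives) is an assumption chosen here so that the double sum vanishes.

```python
def is_conserved(transfers, tol=1e-9):
    """Check that the sum over j and j' != j of r_{jj'}(t) is zero."""
    n = len(transfers)
    total = sum(transfers[j][k] for j in range(n) for k in range(n) if k != j)
    return abs(total) <= tol
```

For three residences, `[[0, 1.5, -0.5], [-1.5, 0, 0], [0.5, 0, 0]]` is conserved, because every unit received by one residence is given by another.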

Control Target
In time slot $t$, our system cost consists of the cost of electricity from the external grid and the cost of natural gas utilized by the resCHP system and the boiler. Electricity demand $e_j(t)$ depends on the task scheduling in time slots $t - d_{\max} + 1$ through $t$ via a function $\psi$, which is not a specific function. We aim to design a scheduling algorithm that achieves the minimum long-term average cost by assigning the amounts of electricity and gas. For simplicity, we do not consider some practical factors, such as the electricity loss in transmission, in order to focus on minimizing the electricity drawn from the grid, which depends on the variables $(s_i^t, r(t), b(t), u(t), h(t))$. We can add these factors into our model in the future.

Problem Formulation
In this section, we first consider a non-cooperative scenario in which residents do not share energy with others. Electricity demand $e(t)$ and heat demand $h(t)$ are supposed to be independent of each other. According to the system model shown in Figure 1, we first assume a convex function of consumers' dissatisfaction $F_i^t(s)$ according to [40], which is a utility function of task $i$ when it is delayed by $s$ time slots. If the delay $s = 0$, the value of the function is zero. As the delay $s$ increases, the function increases, which means that the dissatisfaction of consumers increases. Using the same method as in [41], the utility function can be shown to have a long-term bound $\alpha$, where $s_i^t$ is the delay of task $i$. Generally, a task should be fulfilled before its deadline, which imposes a constraint on $s_i^t$. The optimization problem P1 is subject to constraints (1), (2), (3), (4), (5), (6), (8), (9), where $C_e(t)p(t) + u(t)C_g + \eta_s g(t)C_g$ is the total cost of the system in time slot $t$.
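The per-slot cost expression quoted above, and its time average over a finite horizon, can be written directly. The function names are hypothetical, and the finite-horizon mean is only a stand-in for the long-term average that the problem minimizes.

```python
def slot_cost(price_e, grid_energy, chp_gas, boiler_gas, price_gas, eta_s):
    """C_e(t) p(t) + u(t) C_g + eta_s g(t) C_g, as written in the text."""
    return (price_e * grid_energy
            + chp_gas * price_gas
            + eta_s * boiler_gas * price_gas)


def time_average_cost(costs):
    """Finite-horizon average of per-slot costs (proxy for the long-term average)."""
    return sum(costs) / len(costs)
```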
We know that $\lim_{T \to \infty} \sum_{t=1}^{T} e_j(t) C_e(t)$ is the total cost of power demand. This cost can be expressed in terms of the scheduled tasks, where $\pi_i^t$ is the electricity consumption rate. We rewrite the optimization problem as P2, subject to constraints (3), (4), (5), (6), (8), (9). P2 is a simplified version of P1. To solve P2, we use the Lyapunov optimization method [42].
To ensure that the dissatisfaction of consumers has an upper bound, we create a virtual queue $J(t)$, defined in Equation (14). We can prove that if this virtual queue $J(t)$ meets the restriction $\limsup_{T \to \infty} J(T)/T = 0$, then Equation (15) holds.
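A virtual queue of this kind is commonly updated by adding the dissatisfaction incurred in the slot, subtracting the budget $\alpha$ and clipping at zero. The sketch below assumes the form $J(t+1) = \max[J(t) + \sum_i F_i^t(s_i^t) - \alpha, 0]$, which may differ in detail from Equation (14); the default $F(s) = s^2$ is the dissatisfaction function used in the simulations.

```python
def update_virtual_queue(j_level, delays, alpha, dissatisfaction=lambda s: s ** 2):
    """Assumed update: J(t+1) = max(J(t) + sum_i F(s_i) - alpha, 0)."""
    total = sum(dissatisfaction(s) for s in delays)
    return max(j_level + total - alpha, 0.0)
```

If the queue grows sublinearly, so that $J(T)/T$ tends to zero, the time-average dissatisfaction cannot exceed $\alpha$, which is the role the virtual queue plays in the analysis.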

The Lower Bound of the Minimum Cost
In this part, we prove that the minimum cost of P2 has a lower bound. We compare the minimum cost of P2 with the minimum cost of P3, where P3 keeps only constraints (3), (5), (8), (9).
We can see that P3 is a relaxation of P2, so the minimum cost of P3 is a lower bound of the minimum cost of P2. We use Theorem 4.5 in [43] to prove that the minimum cost of P3 can be calculated according to an optimal randomized stationary policy.
Lemma 1. We can get the minimum cost of P3 from an optimal randomized stationary policy that only depends on the system state. The control variables $(r(t), s_i^t, b(t), \tilde{u}(t))$ are functions of $[n_t, c_{tj}, r(t), e(t), h(t)]$ in each time slot.
Proof. We prove it by using Theorem 4.5 in [43]. Our problem satisfies all sufficient conditions proposed in Theorem 4.5 for the existence of an optimal randomized stationary policy. Equation (18) means that the long-term average dissatisfaction is no greater than $\alpha$. Equations (18) and (19) show that $B_j(t)$ and $J(t)$ have long-term average stability. Finally, the minimum cost of P3 can be calculated by an optimal randomized stationary policy $(r(t), s_i^t, b(t), \tilde{u}(t), h(t))$.

TSA: Task Scheduling Algorithm
The Lyapunov drift is commonly used to study the optimal control of queueing networks. By using the Lyapunov drift, we can stabilize the two queues $B_j(t)$ and $J_j(t)$ for each residence $j$. We add a penalty term weighted by $V$ to the drift of a Lyapunov function $L_j(t)$. One part of our goal is to minimize the drift of $L_j(t)$, which keeps $B_j(t)$ close to the constant $\varepsilon$ and keeps the battery level $B_j(t)$ bounded. We define the constants $n_{\max} = \max_t n_t$, $r_j^{\max} = \max_t r_j(t)$, $a_{\max} = \max_{t,i} a_i^t$ and $F_{\max} = \max_{t,i} F_i^t(d_i^t)$. We set $Z_j(t) = (J_j(t), B_j(t))$. Following the Lyapunov approach, we define the drift as $\Delta = E\{(L_j(t+1) - L_j(t)) \mid Z_j(t)\}$.
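The drift-plus-penalty idea can be illustrated with a quadratic Lyapunov function. The form $L_j(t) = \frac{1}{2}[J_j(t)^2 + (B_j(t) - \varepsilon)^2]$ used below is an assumption: it is the usual choice for this type of analysis and is consistent with $L(0) = 0$ when $B(0) = \varepsilon$, but it is not taken verbatim from the text. The function names are hypothetical.

```python
def lyapunov(j_level, battery_level, epsilon):
    """Assumed quadratic Lyapunov function: 0.5 * (J^2 + (B - epsilon)^2)."""
    return 0.5 * (j_level ** 2 + (battery_level - epsilon) ** 2)


def drift_plus_penalty(z_now, z_next, epsilon, v, cost):
    """One-sample drift L(t+1) - L(t) plus V times the slot cost.

    z_now and z_next are (J, B) pairs, matching Z_j(t) = (J_j(t), B_j(t)).
    """
    drift = lyapunov(*z_next, epsilon) - lyapunov(*z_now, epsilon)
    return drift + v * cost
```

A larger V puts more weight on the cost term and less on keeping the queues near their targets, which is the trade-off the parameter controls.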

Lemma 2. The Lyapunov drift will satisfy
where $D \triangleq \frac{1}{2}(n_{\max}^2 F_{\max}^2 + \alpha^2)$. The Lyapunov drift $\Delta$ can be calculated from $J_j(t)$ and $B_j(t)$. We make some simplifications to minimize the Lyapunov drift $\Delta$.
Proof. We first square Equation (14) of $J(t)$ and bound the result by using $\max[x, 0]^2 \le x^2$. Then, we square Equation (3) of $B_j(t)$ and bound it by using the parameters defined above. Finally, we combine the two bounds to obtain the Lyapunov drift $L(t+1) - L(t)$.
According to the stochastic optimization technique, the Lyapunov drift $\Delta$ should be minimized to make the queue of consumers' dissatisfaction $J_j(t)$ and the queue of the battery level $B_j(t)$ mean rate stable. We also aim to minimize the system cost, so we introduce a parameter $V$ that sets the trade-off between the system cost and the Lyapunov drift $\Delta$. Adding $V$ times the expected system cost in time slot $t$ to both sides of Equation (19) gives Equation (20). The goal of the task scheduling algorithm (TSA) is to minimize the right side of Equation (20) according to Equation (5). During time slot $t$, this minimization leads to problem P4. We can see that P4 is feasible and satisfies Slater's condition. To solve this convex optimization problem, we form its Lagrangian, where $\zeta_t$, $\mu_t^+$ and $\mu_t^-$ are the Lagrange multipliers (dual variables) for constraints (23). As P4 is convex, feasible and satisfies Slater's condition, the Karush-Kuhn-Tucker conditions are necessary and sufficient for optimality [43]. The convex relaxation of P4 is tight if and only if any solution of Equation (22) satisfies $\zeta_t \ge 0$ or $\zeta_t = 0$, $\forall t$. Since P4 is a steady-state optimization problem, we solve it by a standard primal-dual gradient method [44]. We can see that the constraint of Equation (5) is redundant in this situation. The performance of TSA is characterized by the following theorems, which show that the battery level and the cost of the total system are both bounded.
Theorem 1. We set $\varepsilon = b_j^{\max} + V C_{e,\max}$ and $B_j(0) = \varepsilon$; then $B_j(t)$ satisfies the following stability property: $0 \le B_j(t) \le \varepsilon + b_j^{\max} + r_j^{\max}$.
Proof. First, we prove the upper bound of $B_j(t)$ by mathematical induction. The basis: for $t = 0$, we have $B_j(0) = \varepsilon \le \varepsilon + b_j^{\max} + r_j^{\max}$ by the initial setup. The inductive step: we assume that $B_j(t) \le \varepsilon + b_j^{\max} + r_j^{\max}$. Then, we need to prove that $B_j(t+1) \le \varepsilon + b_j^{\max} + r_j^{\max}$. In the next time slot $t + 1$, we have two situations: (i) if $B_j(t) \le \varepsilon$, the increase in stored electricity is largest when $\xi = 1$ and $b(t) = -b_j^{\max}$, and even in this case $B_j(t+1)$ does not exceed $\varepsilon + b_j^{\max} + r_j^{\max}$; (ii) if $B_j(t) > \varepsilon$, we know that $b(t) > 0$ according to Equation (23), that is to say, the battery discharges, and the same bound holds. Overall, we can prove that $B_j(t+1) \le \varepsilon + b_j^{\max} + r_j^{\max}$.
Second, we prove the lower bound of $B_j(t)$, also by mathematical induction. The inductive step: we assume that $B_j(t) \ge 0$. Then, we need to prove that $B_j(t+1) \ge 0$. In the next time slot $t + 1$, we have two situations: (i) if $B_j(t) \le \varepsilon$, we know that $b(t) < 0$ according to Equation (23), that is to say, the battery charges, so the battery level does not decrease and $B_j(t+1) \ge 0$.
Theorem 2. The cost of the system using TSA satisfies the bound in Equation (25).
Proof. The constraint of Equation (4) ensures that $b(t)$ is in the range $[0, b_j^{\max}]$ when the battery discharges. Given the bound on $B_j(t)$, and taking the minimum cost of P3 as a lower bound of the minimum cost of P2, we sum over the time slots from $t = 0$ to $T$ and obtain Equation (26). By setting $B(0) = \varepsilon$, we can see that $L(0) = 0$ in Equation (26). Dividing both sides of Equation (26) by $VT$ yields the claimed bound. Equation (25) shows that, as the parameter $V$ increases, the total cost shows a converging trend.
Remark 1. When the parameter $V$ is large enough, the total cost will approach a value that is positively correlated with the battery capacity.
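The primal-dual gradient method used to solve P4 alternates a gradient descent step on the primal variable with a projected gradient ascent step on the dual variable. Because the exact objective of P4 is not reproduced here, the sketch below applies the same iteration to a hypothetical scalar toy problem, minimizing $(x - 3)^2$ subject to $x \le 2$; the function name, step size and iteration count are illustrative assumptions.

```python
def primal_dual_gradient(grad_f, g, grad_g, x0, step=0.05, iters=2000):
    """Minimize f(x) subject to g(x) <= 0 for scalar x.

    Primal step: x   <- x - step * (f'(x) + lam * g'(x))
    Dual step:   lam <- max(0, lam + step * g(x))
    Both steps use the values from the previous iteration.
    """
    x, lam = x0, 0.0
    for _ in range(iters):
        x_new = x - step * (grad_f(x) + lam * grad_g(x))
        lam = max(0.0, lam + step * g(x))
        x = x_new
    return x, lam


# Toy problem (hypothetical): minimize (x - 3)^2 subject to x - 2 <= 0.
x_opt, lam_opt = primal_dual_gradient(
    grad_f=lambda x: 2.0 * (x - 3.0),
    g=lambda x: x - 2.0,
    grad_g=lambda x: 1.0,
    x0=0.0,
)
```

For this toy problem the iterates approach the constrained optimum $x = 2$ with multiplier $\lambda = 2$, the pair that satisfies the Karush-Kuhn-Tucker conditions.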

Cooperative Renewable Energy Sharing Algorithm
The cooperative scenario builds upon the solution to the non-cooperative scenario, with renewable energy now shared within the community. In general, there are two cases for the renewable energy. (i) When the renewable energy is sufficient, the extra energy $p_j(t) + b_j(t) - e_j(t)$ of resident $j$ can be offered to its neighbor $j'$; let $r_{jj'}(t)$ denote the energy that resident $j$ offers to its neighbor $j'$. (ii) When the renewable energy is insufficient, the missing energy $e_j(t) - p_j(t) - b_j(t)$ of resident $j$ can be drawn from its neighbor. Assuming that resident $j$ can draw energy $r_{jj'}(t)$ from its neighbor $j'$, we have $r_{jj'}(t) = e_j(t) - p_j(t) - b_j(t)$.
According to the law of energy conservation, we obtain $\sum_{j} \sum_{j' \neq j} r_{jj'}(t) = 0$. Let $C_{TSA}$ be the optimal total cost of the situation where residents are non-cooperative, and let $C_{CRESA}$ be the total cost of the situation where the renewable energy of the residents is shared cooperatively in the community. $C_{CRESA}$ is expressed in terms of $\hat{C}_j^{tol}(t)$, the cost for residence $j$ in the cooperative situation. We expect $C_{CRESA} < C_{TSA}$ because we utilize idle energy from residents' neighbors, which can reduce the electricity drawn from the external grid and reduce the total cost. We design the Cooperative Renewable Energy Sharing Algorithm (CRESA) by using the Sarsa algorithm. We define the four elements as follows:
(1) State Space. We define the state space $\Phi$, which consists of the electricity price and the number of residents $N$. We discretize the electricity price into $M$ intervals; therefore, the state space is given by $\Phi = \{1, ..., M\} \times \{1, ..., N\}$, over which we formulate the reward maximization problem.
(2) Action Space. We denote the maximum allowable renewable energy drawn from or shared with neighbors as $R_{\max}$. The action space of the renewable energy consists of three actions: harvesting, holding, and sharing. According to Equation (32) in [45], we can obtain $C_{TSA}$. Following Algorithm 2 in [45], residence $j$ harvests energy from neighbors when $tmp_j - \sum_{t=1}^{T} C_j^{tol}(t) \le 0$ and shares energy with neighbors when $tmp_j - \sum_{t=1}^{T} C_j^{tol}(t) > 0$.
(3) Reward Function. At time slot $t$, after taking an action $a \in A$ at state $\varphi \in \Phi$, the central controller receives a reward that tells it the impact of its action. We should harvest renewable energy from neighbors when the electricity price is high and share with neighbors when the electricity price is low, and the reward function is defined accordingly. We use the Sarsa algorithm (a classic reinforcement learning algorithm) to design an update policy including the state, action and reward function. For each pair of state $\varphi$ and action $a$, we define a Q function $Q(\varphi, a)$, where $(\varphi, a)$ is the state-action pair in time slot $t$ and $(\varphi_i, a_i)$ is the possible state-action pair in the next time slot $t + 1$. The parameter $\beta \in (0, 1]$ is the learning rate, which determines how strongly new information overrides the old estimate, and $\gamma \in (0, 1]$ is the discount factor, which determines the importance of future rewards. Based on the Sarsa algorithm, a solution of the cooperative renewable energy sharing problem can be achieved, on the condition that each residence in the community can communicate with its neighbors through a central controller. The Cooperative Renewable Energy Sharing Algorithm (CRESA) is summarized in Algorithm 1.
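The price discretization and the Sarsa update can be sketched as follows. This is a generic illustration of the textbook Sarsa rule, $Q(\varphi, a) \leftarrow Q(\varphi, a) + \beta[R + \gamma Q(\varphi', a') - Q(\varphi, a)]$, and not a reproduction of Algorithm 1: the function names, the dictionary-based Q table, the equal-width price bins and the reward values are all assumptions of this sketch.

```python
def price_to_state(price, price_min, price_max, m):
    """Map an electricity price to one of M equal-width intervals, indexed 1..M."""
    frac = (price - price_min) / (price_max - price_min)
    return min(int(frac * m), m - 1) + 1


def sarsa_update(q, state, action, reward, next_state, next_action, beta, gamma):
    """Textbook Sarsa: Q(s,a) <- Q(s,a) + beta * (r + gamma * Q(s',a') - Q(s,a))."""
    old = q.get((state, action), 0.0)
    target = reward + gamma * q.get((next_state, next_action), 0.0)
    q[(state, action)] = old + beta * (target - old)
    return q[(state, action)]
```

Unlike Q-learning, the target uses the action actually selected in the next state, so the learned values reflect the policy the central controller follows.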

Numeric Performance
In this section, we evaluate the performance of the TSA algorithm by using practical electricity price data. The parameters for the resCHP systems and the electricity and heat demand in time slot $t$ are described in the next part.

Parameter-Settings for the Dynamic Simulation
We adopt the electricity price and natural gas price from PG&E [37] from 24 June 2017 to 28 June 2017. Figure 2a shows the electricity price $C_e(t)$. We assume that electricity demand and heat demand are stochastic processes. As in [46], we set $\zeta = 20\%$ and $\varsigma = 30\%$ as common statistical values, which represent the efficiency of converting solar and wind energy to electricity, respectively. We use the real data from [47,48] for solar radiation $I(t)$ and wind speed $v(t)$. We set the air density as $\rho = 1.2041$ kg/m³ and the wind blade area as $WB_j = 10$ m². Figure 2b,c shows the dynamic data of solar radiation and wind speed. We implement our proposed algorithm on a PC with 64-bit Windows 7, 12 GB RAM and a 1.80 GHz CPU. The simulation settings are as follows. We consider a system with 150 appliances over 150 h with 1-h time slots, i.e., 150 time slots in total. We assume that the conversion efficiency of renewable energy is 20% and the maximum output is 8 kWh. The battery capacity $b_j^{\max}$ is set to 6 kWh. We assume that the probability of electricity demand arriving is the same. The electricity consumption rate $\pi_i^t$ of these appliances is set to 0.1. We set the function of consumers' dissatisfaction to be $F(x) = x^2$. Our resCHP system uses natural gas, whose price $C_g$ is set to \$5.4/MMBtu. We assume that $\alpha = 30$ and that the parameter $V$ in the Lyapunov drift ranges from 1 to 40.

Performance Evaluation
The cost of DT tasks is compared with that of DI tasks under our algorithm. By setting $\varepsilon = b_j^{\max} + V C_{e,\max}$, the battery level is bounded above by $\varepsilon + b_j^{\max} + r_j^{\max}$. In particular, our algorithm serves DI tasks as soon as they arrive. The cost saved by our algorithm increases with time.
Figure 3 shows the amount of electricity $b(t)$ charged to or discharged from the battery in time slot t over 150 time slots. The figure shows how the battery charges and discharges under the TSA algorithm.
The battery level $B_j(t)$ is subject to a hard constraint, as shown in Figure 4 for a deadline $d_i^t$ of 14 time slots. By setting $\varepsilon = b_j^{\max} + V C_{e,\max}$, $B_j(t)$ is bounded by $\varepsilon + b_j^{\max} + r_j^{\max}$. As Figure 4 shows, the battery level stays below $\varepsilon + b_j^{\max} + r_j^{\max}$ throughout the 150 time slots. Figure 5 plots the cost reduction as a percentage of the total cost for different deadlines with V = 6. From Figure 5, we can see that the longer the deadline of DT tasks, the larger the percentage of cost reduction. In one case, with the deadline set to $d_i^t = 14$, the cost reduction for DT tasks is 12.49% of the total cost. The benefit grows as the tolerable delay of DT tasks becomes longer. Figure 6 compares the cost of DI tasks with that of DT tasks in each time slot under our algorithm for deadline $d_i^t = 6$. As Figure 6 shows, a DT task has a lower cost than a DI task under the same conditions.
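The battery-level bound discussed above can be evaluated numerically. The following sketch plugs in the stated battery capacity of 6 kWh and V = 6. Two values are assumptions made only for illustration: the maximum renewable input `r_max` is taken to be the 8 kWh maximum output from the simulation settings, and the highest electricity price `c_e_max` is a hypothetical 0.25.

```python
def battery_level_bound(b_max, r_max, v, c_e_max):
    """Return (epsilon, upper bound) with epsilon = b_max + V * C_e,max
    and upper bound = epsilon + b_max + r_max, as stated in the text."""
    epsilon = b_max + v * c_e_max
    return epsilon, epsilon + b_max + r_max


def stays_within_bound(levels, upper_bound):
    """Check that every recorded battery level is at or below the upper bound."""
    return all(level <= upper_bound for level in levels)


# b_max and V come from the text; r_max and c_e_max are ASSUMED values.
epsilon, bound = battery_level_bound(b_max=6.0, r_max=8.0, v=6.0, c_e_max=0.25)
print(epsilon, bound)

# Hypothetical battery trajectory, used only to show the check.
print(stays_within_bound([3.0, 9.5, 14.0, 7.2], bound))
```

Since ε grows linearly with V, a larger V loosens the bound, which is consistent with the linear relationship between V and the required battery capacity noted in the evaluation.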
Figure 7 plots the cost reduction as a percentage of the total cost for different ratios DT/(DT + DI) with $d_i^t = 6, 10, 16$ and V = 6. Figure 7 shows that the percentage of cost reduction grows as the deadline $d_i^t$ increases. It also shows that a larger share of DT tasks leads to a higher cost reduction under our algorithm.
Figure 8 plots the percentage of cost reduction against the parameter V for deadline $d_i^t = 6$. From Equation (25), we can see that the percentage of cost reduction increases with the parameter V.
Figure 9 depicts the total cost of the system versus V for DT/(DI + DT) = 0.2, 0.5, 1. A larger DT/(DI + DT) yields a lower total cost, and the total cost decreases only slowly with V once V reaches 30. There is a linear relationship between V and the battery capacity $b_j^{\max}$. The total cost of the system also decreases slowly as the battery capacity $b_j^{\max}$ increases, and it eventually approaches a bound. From Equation (14), we can guarantee that the virtual queue of consumers' satisfaction is bounded and stable. From Figure 4, we can see that the queue of battery level is stable. From Figures 8 and 9, we achieve the optimal system cost when V = 30. This demonstrates that our proposed algorithm minimizes the system cost while stabilizing both the queue of battery level and the queue of consumers' satisfaction.
From Figure 10, we can see that the cumulative total cost in a cooperative community is lower than that in a non-cooperative community. Our CRESA algorithm achieves a 9% cost reduction compared with the TSA algorithm over 200 time slots.

Conclusions
In this paper, we investigate the problem of energy management and task scheduling for a smart grid with a resCHP system and renewable energy, considering unpredictable electricity demands that include two kinds of tasks: DI and DT tasks. To minimize the total cost of the community, we formulate the cost minimization problem as a stochastic non-convex optimization program with physical constraints, which is challenging to solve. We then relax the time-coupling constraint into a time-averaged constraint and tackle the reformulated problem with the TSA algorithm, which is based on the Lyapunov optimization approach. The results for the TSA algorithm show that a larger maximum battery output and a larger V lead to a higher saved cost. We satisfy the DT load demand before user-defined deadlines, and the saved cost increases as the deadline is extended. Our TSA algorithm, which applies the Lyapunov optimization method, is efficient, and we will study more efficient algorithms in future work. We also design a cooperative renewable energy sharing algorithm based on the Sarsa algorithm for the cooperative mode, provided that each residence in the community can communicate with its neighbors through a central controller. Finally, extensive simulations validate the proposed algorithms and show that the CRESA algorithm obtains a lower cost than the non-cooperative algorithm.

Figure 1. The structure of the system model.

t
are positive scalars representing the controller gains. We set the highest electricity price as $C_{e,\max}$ and set $\varepsilon = b_j^{\max} + V C_{e,\max}$. According to Equation (23), we always have $\varepsilon - B_j(t) - C_e(t) > 0$ when $B_j(t) < b_j^{\max}$. The battery then absorbs electricity from the grid and $b(t) = -b_j^{\max}$. The battery discharges when $B_j(t) > b_j^{\max}$

j
TSA − C CRESA /N) by adopting the Nash bargaining method to implement the cooperative plan. We define $\mathrm{tmp}_j = C_j^{\mathrm{TSA}} - \frac{C^{\mathrm{TSA}} - C^{\mathrm{CRESA}}}{N}$.
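Read this way, the definition of tmp_j charges each residence its non-cooperative (TSA) cost minus an equal share of the community-wide saving. The sketch below works through that arithmetic. It assumes that the community TSA cost is the sum of the individual TSA costs, which is not stated explicitly in this section, and all cost values are hypothetical.

```python
def bargained_costs(tsa_costs, cresa_total):
    """Per-residence cost after equal sharing of the cooperative saving.

    tsa_costs   -- list of non-cooperative costs C_j^TSA, one per residence
    cresa_total -- total community cost C^CRESA under cooperation
    Assumes C^TSA is the sum of the individual C_j^TSA values.
    """
    n = len(tsa_costs)
    saving_share = (sum(tsa_costs) - cresa_total) / n
    return [cost - saving_share for cost in tsa_costs]


# Hypothetical costs for a three-residence community.
costs = bargained_costs([10.0, 20.0, 30.0], cresa_total=54.0)
print(costs)
```

With this split, the per-residence costs sum to the cooperative total, and every residence pays less than it would alone whenever cooperation lowers the community cost.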

Figure 2. The dynamic real data: (a) electricity prices in 150 time slots from PG&E; (b) the dynamic solar radiation; (c) the dynamic wind speed.
Actual electricity stored in the battery in time slot t for residence j
$u_j(t)$: Natural gas consumed by the resCHP system for residence j
$g_j(t)$: Natural gas consumed by the boiler for residence j
$C_e(t)$: Electricity price in time slot t
$p(t)$: Electricity drawn from the grid in time slot t
$s_t$: Electricity demand in time slot t
$h_i(t)$: Heat demand in time slot t
$B_i(t)$: Battery level in time slot t
$\pi_i(t)$: Electricity consumption for task of residence i
$F_i^t(s)$: Dissatisfaction function of task i for delay s
$C_g$: Natural gas price
$\eta_e$: Efficiency of converting natural gas to electricity
$\eta_h$: Efficiency of converting natural gas to heat in the resCHP system
$\eta_s$: Efficiency of converting natural gas to heat consumed by the boiler
$\eta_r$: Efficiency of converting renewable energy to electricity