Penalty Based Control Mechanism for Strategic Prosumers in a Distribution Network

Abstract: The distribution side of the traditional power grid is changing as the users (known as prosumers) can inject power into the grid. However, uncontrolled injection of power can destabilize the grid. Thus, the stability of the grid must be maintained. Since the prosumers are self-interested entities, they take actions that maximize their own pay-offs. We formulate the problem as a non-cooperative game in which the magnitude of the voltage must be within an acceptable limit at each node of the power network. Since the power-flow equations must be satisfied at each node, it becomes a coupled constrained game where the constraints are the same across the prosumers. We propose a distributed penalty-based algorithm which converges to an equilibrium. In this mechanism, the prosumers are quoted a price based on the active and reactive power drawn from or injected into the power grid. The algorithm is easy to implement and it converges to an efficient solution which maximizes the sum of the utilities of the prosumers while maintaining the grid's stability.


Introduction
The proliferation of distributed energy resources (e.g., photovoltaic (PV) arrays, solar rooftops, energy storage units) has transformed the notion of traditional consumers of energy. Consumers can now also produce and supply power to the grid. We refer to consumers capable of producing energy as prosumers. Users also own electric vehicles (EVs), which require energy for charging but can also provide energy back to the grid by discharging. However, as the distribution network goes through this rapid transformation, instabilities may arise in the grid. A centralized solution is not implementable in a system of thousands of households and their controllable devices, since the homeowners or prosumers take the decisions that benefit them. Each prosumer thus takes its own decision to maximize its own objective without coordinating with other prosumers. The question is therefore how to control different self-interested entities while maintaining the stability of the grid.
In order to answer the above question, we consider a system where prosumers can sell energy to and buy energy from the grid. For each time period, each prosumer decides how much to buy and how much to sell depending on its own utility function and the selling and buying prices. The power drawn from or injected into the grid must follow Kirchhoff's current and voltage laws. We consider a radial distribution network where each node hosts a subset of the prosumers. At each node, the magnitude of the voltage must be maintained within an acceptable range to maintain the stability of the grid. The voltage magnitude is obtained by solving a set of non-linear active and reactive power flow equations, which are very difficult to solve. Note that the active and reactive powers injected or drawn are decided by the prosumers. Thus, each prosumer needs to decide how much to buy or sell while maintaining the stability of the grid.
Since the non-linear AC power flow equations are non-convex, we consider linearized power flow equations, which are a reasonable approximation and have been considered in the literature [1,2]. However, each prosumer only solves its own optimization problem; hence, we consider a game-theoretic setting to characterize the strategic interactions among multiple prosumers. The prosumers share the same constraint that the voltage magnitude must be within an acceptable range at each node. Hence, this leads to a coupled constrained game [3]. We seek to obtain a generalized Nash equilibrium (GNE), which is the relevant equilibrium concept for a coupled constrained game. The coupled constrained game is difficult to solve as the strategy space of a player depends on the strategies of the other players and on power system parameters such as line impedances. Moreover, a prosumer is unaware of the power network (the reactance, the resistance, and the number of nodes) and is also unaware of the decisions of other prosumers.
We propose a distributed algorithm to obtain a GNE. The GNE is not unique in general; however, the GNE obtained using the distributed algorithm is an efficient solution, i.e., it maximizes the sum of the utilities of the prosumers. In the distributed algorithm, a controller which has knowledge of the power network and its parameters quotes a penalty price to the prosumers if the voltage magnitude constraint is not satisfied at a node. For example, if the voltage magnitude is lower than a specified value, then the prosumers are penalized for selling energy, and vice versa. Note that prosumers may also sell reactive power since the output of the distributed energy resources is converted to AC power using an inverter. Excessive reactive power is also penalized if the voltage magnitude becomes small because supply exceeds demand.
When the voltage magnitude exceeds the upper limit, the penalty for drawing power is increased until the voltage magnitude returns to the acceptable range. We show that such an iterative pricing algorithm converges to a GNE. The convergence is also fast, so the algorithm can readily be implemented in practice. Numerically, we show the utility of our algorithm in maintaining the stability of the grid. The penalty parameters depend on the number of links a node has and the impedance values of those links, since power needs to be rerouted to the neighboring nodes if supply and demand do not match at a given node.
We summarize our contributions in the following:
• We mathematically model strategic prosumers who take their decisions in order to maximize their own objectives without coordinating with other prosumers while maintaining the voltage stability in the grid using a game theoretic setting.
• We propose a practically implementable algorithm which converges to the efficient GNE.
• We show that there exists a distributed algorithm where a controller at each node selects penalty prices for violating a constraint.
The rest of the paper is organized as follows. Section 2 compares our work with the existing literature. Section 3 describes the system parameters and the objective each prosumer tries to optimize. In Section 4, we propose a distributed algorithm which converges to a GNE. Section 5 empirically demonstrates the strength of our algorithm in mitigating voltage constraint violations. We conclude and provide future research directions in Section 6.

Related Literature
Demand response pricing has already been studied [4-9]. Real-time pricing has also been considered [10-16]. However, these papers did not consider that the users can also sell energy back to the grid. Naturally, they did not consider the AC power flow equations while coming up with the prices. Energy exchange among the users in a micro-grid setting has been considered [3,17-20]. Though these papers considered that the users can sell back energy, they did not consider the non-linear AC power flow equations and the power network structure.
Recent papers [1,2,21-23] have considered the control of the power of prosumers in order to maintain voltage stability. These papers assumed a centralized optimization problem with a distributed algorithm for maintaining the stability. However, in a smart grid, each prosumer takes its own decision. Hence, it is difficult to control each prosumer, and a game-theoretic formulation is better suited to study the strategic interaction among prosumers. In a game-theoretic setting, it is not a priori clear whether an equilibrium exists and whether an equilibrium (if it exists) can be implemented in practice. Articles [24-26] considered a game-theoretic setting to study the interaction among multiple prosumers. However, these papers did not consider any distributed algorithm which can implement the equilibrium strategy. We provide a distributed algorithm which can be implemented in practice. The distributed algorithm converges to a generalized NE, and we show that the GNE is efficient.
The article [27] considered a distributed energy management system for a cooperative multi-agent system. In contrast, we consider a non-cooperative game where each prosumer takes its own decision without coordinating with others. We also consider the stability of the grid, unlike [27]. The papers [28,29] proposed a distributed algorithm for controlling agents. However, in the game-theoretic setting each agent takes its own decision. Further, the stability constraint that the voltage magnitude must be within the acceptable limit has not been considered in the above papers. The papers [30,31] consider incentive mechanisms for promoting the sharing of energy among different consumers. However, the stability of the grid is not considered. Further, we formulate the problem in a game-theoretic setting and propose a distributed algorithm which converges to an equilibrium strategy.

System Model and Problem Formulation
In this section, we first describe the prosumers and their decision variables in Section 3.1. Subsequently, in Section 3.2, we describe the power flow equations that apply as the prosumers inject power into or draw power from the grid. In Section 3.3, we describe the utilities of the prosumers. Finally, in Section 3.4, we describe the optimization problem that each prosumer tries to solve.

Prosumer's Decision
Each prosumer $i$ decides the power $b_{i,t}$ it will buy during the time period $[t, t+\delta t)$. The duration $\delta t$ is chosen depending on how fast the prosumers can update their decisions. In the real-time setting, $\delta t$ most often varies between 5 min and 15 min; however, $\delta t$ can easily be adapted to a much finer granularity. The prosumer can also sell $s_{i,t}$ amount of power during $[t, t+\delta t)$, either by discharging its battery or from its renewable energy resources. Note that a prosumer may also be a user who does not have any distributed resources or storage unit.
Some users may have deferrable loads, whose demand only needs to be fulfilled before a certain deadline. For example, an EV needs to be ready before 8 am if the user is going to work. The individual load may, however, vary over time. We denote the set of deferrable appliances of user $i$ as $A_i$. Suppose the load assigned to appliance $j$ of user $i$ is $x_{i,j,t}$ for the time duration $[t, t+\delta t)$. The loads assigned to appliance $j$ up to its deadline $T_j$ must then add up to $X_j$, the amount of energy required for appliance $j$ before its deadline. The power demand $d_{i,t}$ of prosumer $i$ during $[t, t+\delta t)$ must satisfy $d_{i,t,\min} \le d_{i,t} \le d_{i,t,\max}$, where $d_{i,t,\max}$ and $d_{i,t,\min}$ are known beforehand.
Each prosumer has a renewable energy harvesting device which harvests $\bar{E}_{i,t}$ amount of power during $[t, t+\delta t)$ (if a prosumer does not have any renewable energy harvesting device, the renewable energy is 0). Prosumer $i$ may also have a battery with capacity $B_{i,\max}$; if the prosumer does not have a battery, then $B_{i,\max}$ is 0. The state of the battery is $B_{i,t}$. The amount of energy discharged from the battery is $e_{i,t}$, and the amount charged to the battery is $l_{i,t}$; these two quantities determine how the battery state evolves from one period to the next. Note that the renewable energy generation is a random process. Hence, a prosumer only has an estimate of $\bar{E}_{i,t}$ rather than the exact realized value. We assume that $\bar{E}_{i,t}$ is constant during the time interval $[t, t+\delta t)$, which is a reasonable assumption if $\delta t$ is small.
The state of the battery cannot be less than 0. It is also required to take a specific value at the end of the horizon; most often, it is kept the same as at the start of the day. The total power consumption of prosumer $i$ during $[t, t+\delta t)$ is then determined by the power bought and sold, the renewable energy harvested, and the energy charged to and discharged from the battery, where $\eta_d \le 1$ and $\eta_c \le 1$ are the discharging and charging efficiencies of the battery, respectively. Note that only a portion of the energy bought by the prosumer can be used because of the transmission loss.
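The battery bookkeeping described above can be sketched as follows. The exact update equations are not reproduced in this text, so the form used here (the state rises by the charged energy, falls by the discharged energy, and must stay between 0 and the capacity) is an assumption, and the function names `battery_step` and `run_day` are hypothetical. Efficiencies are deliberately left out because the text attaches them to the consumption balance rather than to the battery state.

```python
def battery_step(level, charged, discharged, capacity):
    """Return the next battery level; raise if the move is infeasible.

    Assumed update (not taken from the paper): next = level + charged - discharged,
    with the result required to stay in [0, capacity].
    """
    if charged < 0 or discharged < 0:
        raise ValueError("charged and discharged amounts must be non-negative")
    next_level = level + charged - discharged
    if next_level < 0 or next_level > capacity:
        raise ValueError("battery level would leave [0, capacity]")
    return next_level


def run_day(initial_level, schedule, capacity):
    """Apply a list of (charged, discharged) pairs and return the final level."""
    level = initial_level
    for charged, discharged in schedule:
        level = battery_step(level, charged, discharged, capacity)
    return level


if __name__ == "__main__":
    # Charge during one period, discharge the same amount later:
    # the end-of-horizon level equals the starting level.
    final = run_day(1.0, [(2.0, 0.0), (0.0, 2.0)], capacity=5.0)
    print(final == 1.0)
```

A schedule that returns the battery to its starting level satisfies the end-of-horizon requirement mentioned in the text; any schedule that would overdraw or overfill the battery is rejected.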

Power Flow Constraint
When a prosumer injects or draws power, the power flow equations must be satisfied at every time instant. We characterize both the active and the reactive power flow equations, since DC power is converted to AC power when the battery is discharged.

Inverter
The output of the distributed energy resources is converted to AC power via an inverter. The reactive power provided by the inverter is $q_{i,t}$ and the power capacity of the inverter is $r_i$. Hence, the real power and the reactive power delivered through the inverter must jointly remain within the capacity $r_i$. Recall that $e_{i,t}$ is the real power discharged from the battery or from the distributed energy resources (if there is no battery).
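A common way to express an inverter capacity limit is to bound the apparent power, i.e., to require $e_{i,t}^2 + q_{i,t}^2 \le r_i^2$. The constraint itself is not reproduced in this text, so the sketch below assumes that standard form; the function names are hypothetical.

```python
import math


def inverter_feasible(real_power, reactive_power, capacity):
    """Check the assumed apparent-power limit e^2 + q^2 <= r^2."""
    return real_power ** 2 + reactive_power ** 2 <= capacity ** 2


def reactive_headroom(real_power, capacity):
    """Largest reactive power magnitude allowed for a given real power."""
    if abs(real_power) > capacity:
        raise ValueError("real power already exceeds the inverter capacity")
    return math.sqrt(capacity ** 2 - real_power ** 2)


if __name__ == "__main__":
    print(inverter_feasible(3.0, 4.0, 5.0))
    print(reactive_headroom(3.0, 5.0))
```

Under this form, the more real power a prosumer discharges, the less reactive power its inverter can provide in the same period.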

Power Flow Equations
We now describe the power flow equations which must be satisfied. The power flow network consists of nodes $\mathcal{N}$ and edges $\mathcal{E}$. An edge or link exists between two nodes if the two nodes are connected.
Let us assume that the set $N_i$ of prosumers belongs to node $i$ in the power network. Note that, in general, the cardinality of $N_i$ can be 1, in which case each prosumer's end can be treated as a node. The root of the tree is the feeder with a fixed voltage and power injection, and is denoted by bus 0. We denote the other buses as $n = 1, \ldots, N$. For each link $(i,j) \in \mathcal{E}$, let $z_{i,j} = y_{i,j} + \mathrm{i}\, w_{i,j}$ be the complex impedance. Let $I_{i,j,t}$ be the complex current flowing during time $[t, t+\delta t)$, and let $P_{i,j,t}$ and $Q_{i,j,t}$ be the real and reactive power, respectively, flowing through edge $(i,j)$ during $[t, t+\delta t)$.
Let $p_{j,t}$ and $q_{j,t}$ be the total real load and reactive load of all the prosumers at node $j$, obtained by aggregating over the prosumers in $N_j$. We are now ready to represent the power flow model. For all links $(i,j)$, the power flow equations in (8) must hold, where $v_{i,t} = |V_{i,t}|^2$. The equality constraints in the system of equations in (8) describe the physical power flow model. Because of the non-linearity, the values of the voltage may not be uniquely determined even when the loads $p_{j,t}$ and $q_{j,t}$ are specified: there can be no solution or multiple solutions.
The following voltage constraint must be satisfied at each node: $\underline{v}_i \le v_{i,t} \le \bar{v}_i$, where $\bar{v}_i$ and $\underline{v}_i$ correspond to the maximum and minimum voltage limits, respectively, at node $i$. Though the constraints specified in the system of equations in (8) are non-linear, we consider a linearized version (i.e., $|I_{i,j}| = 0$), in which the voltage at bus $i$ is expressed through the edges in $\mathcal{P}_i$, the set of edges in the path between bus 0 and bus $i$. The linearized model is a good representation of the radial distribution network. It also has another advantage: unlike the non-linear model, for a given $p$ and $q$ the voltage equations can be solved accurately. Though we do not consider any limit on the power flow explicitly, the constraint on the magnitude of the voltage implicitly limits the amount of active and reactive power flow. We can add constraints on the power magnitude without changing the complexity of the problem. For example, a constraint such as $p_{\min} \le p_t \le p_{\max}$ is linear and convex. Thus, the problem remains convex and our proposed algorithm can be adapted to handle such a constraint.
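To make the linearized model concrete, the sketch below uses the widely used LinDistFlow form, $v_i = v_0 - 2 \sum_{(j,k) \in \mathcal{P}_i} (y_{j,k} P_{j,k} + w_{j,k} Q_{j,k})$, where the flow on an edge equals the total load downstream of it. This exact form and its sign convention (loads positive, injections negative) are assumptions, since the linearized equation is not reproduced in this text. The function names, the parent-array encoding of the tree, and the requirement that every bus has a larger index than its parent are all hypothetical choices for illustration.

```python
import math


def lindistflow_voltages(parent, resistance, reactance, p_load, q_load, v0=1.0):
    """Squared voltage magnitudes (per unit) on a radial feeder.

    parent[i] is the upstream bus of bus i (parent[0] is None).
    resistance[i] and reactance[i] describe the edge (parent[i], i).
    Assumes parent[i] < i for every bus i >= 1.
    """
    n = len(parent)
    flow_p = list(p_load)
    flow_q = list(q_load)
    # Accumulate downstream loads so that flow_p[i] is the flow into bus i.
    for i in range(n - 1, 0, -1):
        flow_p[parent[i]] += flow_p[i]
        flow_q[parent[i]] += flow_q[i]
    v = [v0] * n
    for i in range(1, n):
        drop = 2.0 * (resistance[i] * flow_p[i] + reactance[i] * flow_q[i])
        v[i] = v[parent[i]] - drop
    return v


def violating_buses(v_squared, v_min=0.95, v_max=1.05):
    """Indices of buses whose voltage magnitude leaves [v_min, v_max]."""
    return [i for i, v in enumerate(v_squared)
            if not v_min <= math.sqrt(v) <= v_max]


if __name__ == "__main__":
    # Three buses in a line: 0 (feeder) - 1 - 2, all values in per unit.
    parent = [None, 0, 1]
    r = [0.0, 0.01, 0.01]
    x = [0.0, 0.01, 0.01]
    v = lindistflow_voltages(parent, r, x, [0.0, 0.5, 0.5], [0.0, 0.1, 0.1])
    print(v, violating_buses(v))
```

Because the map from loads to squared voltages is linear, the voltage limits become linear constraints on the prosumers' decisions, which is what keeps each prosumer's problem convex.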

Prosumer's Utility
Each prosumer $i$ attains a utility $U_i(\cdot)$ depending on the amount of power consumed, i.e., on the consumption $d_{i,t}$. The utility of a prosumer/user is the economic value obtained by consuming or availing of a service. If a prosumer/user consumes more energy, the utility increases. The utility of a prosumer inherently depends on its willingness to consume. For example, if a prosumer consumes significantly less energy than its desired value, its utility will be smaller. Similarly, if a prosumer wants to set its temperature at a certain value, the utility will be smaller if the temperature is set at a value different from the preferred one.
The prosumer also has to pay $c_t b_{i,t}\, \delta t$ for buying energy, as the price charged by the utility company is in dollars per kWh. Note that a prosumer is charged depending on the energy consumption, not on the power consumption. Similarly, the prosumer receives $g_t s_{i,t}\, \delta t$ for selling energy. We assume that $g_t < c_t$ to avoid trivial solutions.
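As a small worked example of the energy-based payment, the sketch below computes the net amount $c_t b_{i,t}\, \delta t - g_t s_{i,t}\, \delta t$ for one period. The function name and the numerical values are illustrative assumptions, not data from the paper.

```python
def net_payment(buy_price, sell_price, bought_kw, sold_kw, dt_hours):
    """Net amount paid in one period: prices in $/kWh, power in kW, dt in hours."""
    if sell_price >= buy_price:
        raise ValueError("the model assumes the selling price is below the buying price")
    return (buy_price * bought_kw - sell_price * sold_kw) * dt_hours


if __name__ == "__main__":
    # Buying 4 kW for a 5-minute period at 0.30 $/kWh costs 0.30 * 4 * (5/60) dollars.
    print(net_payment(0.30, 0.15, bought_kw=4.0, sold_kw=0.0, dt_hours=5 / 60))
```

The factor `dt_hours` is what turns the power decision into an energy quantity, which is the distinction the text draws between energy-based prices and the power-based penalties introduced later.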
Note that the utility can have a time-correlated component, which can easily be incorporated in the model. Such utility functions can be represented as $U_i\left(\sum_{t=1}^{T} d_{i,t}\right)$. On the other hand, the utility may be separable across time. Without loss of generality, we consider that the utility function is given by $\bar{U}_i\left(\sum_{t=1}^{T} d_{i,t}\right)$.
Assumption 1. We assume that $U_i(\cdot)$ is an increasing and concave function.
The concavity assumption stems from the fact that if a prosumer consumes more energy the rate of change of valuation decreases.

Optimization Problem of Prosumer
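Each prosumer is a selfish entity which only wants to optimize its own payoff. Thus, each prosumer solves problem (12): it maximizes its utility net of the payment for the energy bought and the revenue from the energy sold, subject to its own constraints described above and the voltage constraint. Note that it is a coupled constrained game since the constraint in (10) is common to all the prosumers.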

Strategic Interactions
Each prosumer tries to solve its own optimization problem in (12). Thus, we consider a non-cooperative game-theoretic setting where each prosumer takes its own decision. The constraint in (10) is the same across the prosumers. Thus, the feasible strategy space of a prosumer inherently depends on the actions of others, and the game is a coupled constrained game since the constraints are common to the prosumers. We therefore seek to obtain a generalized Nash equilibrium.
Before defining the generalized Nash equilibrium formally, we introduce some notation. We denote the objective function of each prosumer $i$ as $F_i(\cdot)$. We also represent the decision variable of prosumer $i$ as $S_i$. Suppose $[S_1^*, \ldots, S_i^*, \ldots, S_n^*] \in \mathcal{S}$ is a strategy profile, where $\mathcal{S}$ is the set of feasible strategies.

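Definition 1. The strategy profile $[S_1^*, \ldots, S_n^*] \in \mathcal{S}$ is a generalized Nash equilibrium if, for every prosumer $i$, $F_i(S_i^*, S_{-i}^*) \ge F_i(S_i, S_{-i}^*)$ for every $S_i$ such that $(S_i, S_{-i}^*) \in \mathcal{S}$, where $S_{-i}^*$ denotes the strategies of all prosumers other than $i$.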
Thus, in a generalized Nash equilibrium, a prosumer cannot find another feasible strategy which gives a better utility if the strategies of the other prosumers remain fixed.

Algorithm
In this section, we describe the distributed algorithm DIST-OPT, which converges to the generalized Nash equilibrium.
1. Initialize $\bar{\lambda}_{i,t} = 0$, $\underline{\lambda}_{i,t} = 0$, $\zeta_{i,t} = 0$ for $i = 1, \ldots, N$.
2. Each prosumer in $N_i$ solves its own optimization problem with the current penalty prices added to its objective.
3. Prosumers report their latest strategy $S_i^k$ at the $k$-th iteration. If $\|S_i^k - S_i^{k-1}\| < \epsilon$, the algorithm terminates. Otherwise, it proceeds as follows.
4. The controller at bus $i$ measures the voltage at node $i$.
5. The controller at bus $i$ updates $\bar{\lambda}_{i,t}$ and $\underline{\lambda}_{i,t}$ based on the measured voltage, where $\gamma$ is the step parameter.
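The controller update in step 5 can be sketched as a projected step on each penalty. The exact update rule (which, per the text, also involves the impedances of the neighboring links) is not reproduced here, so the plain form below is an assumption. It follows the mapping described in this paper: the penalty for drawing power grows when the measured voltage is above the upper limit, and the penalty for injecting power grows when it is below the lower limit. The function name `update_penalties` is hypothetical.

```python
def update_penalties(lam_upper, lam_lower, voltage, v_min, v_max, gamma):
    """One assumed projected step for the two active-power penalties at a bus.

    lam_upper: penalty tied to the upper voltage limit (charged on power drawn).
    lam_lower: penalty tied to the lower voltage limit (charged on power injected).
    Both penalties are kept non-negative by the projection max(0, .).
    """
    new_upper = max(0.0, lam_upper + gamma * (voltage - v_max))
    new_lower = max(0.0, lam_lower + gamma * (v_min - voltage))
    return new_upper, new_lower


if __name__ == "__main__":
    # Voltage above the upper limit: only the upper-limit penalty increases.
    print(update_penalties(0.0, 0.0, voltage=1.07, v_min=0.95, v_max=1.05, gamma=10.0))
```

When the voltage is inside the limits, both increments are negative, so existing penalties decay toward zero and penalties that are already zero stay at zero.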
A controller is located at each node. It knows the local information, such as the neighboring nodes and the resistance and reactance values of the links to its neighbors. This is the only information required for the controller to implement the algorithm DIST-OPT. We assume that the controller at a bus communicates with the prosumers at that bus.
Note that prosumer $i$ pays the penalties $\underline{\lambda}_{i,t}$ and $\zeta_{i,t}$ if it injects too much active power $s_{i,t}$ and reactive power $q_{i,t}$, respectively. On the other hand, if it draws too much active power $b_{i,t}$, it pays the penalty $\bar{\lambda}_{i,t}$. One key difference from traditional pricing is that this price is charged based on power, not on energy, unlike the real-time prices for energy consumption.
Note that the penalty parameter is not the same for all the prosumers; rather, it depends on the node to which the prosumer belongs. It also depends on the impedances of the neighboring links. Since prosumers always inject reactive power, reactive power is not penalized when the voltage magnitude is higher than the threshold. However, when the voltage magnitude is smaller than the acceptable value, reactive power is also penalized.
The following theorem states that even though each prosumer takes its own decision, the algorithm converges to the generalized Nash equilibrium.
Theorem 1. The algorithm DIST-OPT converges to a generalized Nash equilibrium.
Outline of Proof: The algorithm DIST-OPT is a primal-dual gradient algorithm. The penalty prices correspond to the dual variables. Thus, the proof follows from the theory of convex optimization [32].
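The primal-dual structure can be illustrated on a deliberately simplified toy problem. The sketch below is not DIST-OPT itself: it assumes quadratic utilities $-a x^2 + b_i x$, a single energy price, and one shared cap on the total power drawn in place of the voltage constraints. Each prosumer best-responds to the current penalty, and a controller raises or lowers the penalty with a projected step. All names and numbers are hypothetical.

```python
def best_response(b, a, price, penalty):
    """Maximize -a*x**2 + b*x - (price + penalty)*x over x >= 0."""
    return max(0.0, (b - price - penalty) / (2.0 * a))


def penalty_iteration(b_values, a, price, cap, gamma=0.5, tol=1e-6, max_iter=1000):
    """Alternate prosumer best responses and a projected penalty update.

    Stops when no prosumer changes its decision by more than tol.
    Returns (draws, penalty, iterations).
    """
    penalty = 0.0
    previous = None
    draws = []
    for iteration in range(max_iter):
        draws = [best_response(b, a, price, penalty) for b in b_values]
        if previous is not None:
            change = max(abs(x - y) for x, y in zip(draws, previous))
            if change < tol:
                return draws, penalty, iteration
        previous = draws
        # Controller step: raise the penalty when the shared cap is exceeded.
        penalty = max(0.0, penalty + gamma * (sum(draws) - cap))
    return draws, penalty, max_iter


if __name__ == "__main__":
    draws, penalty, iterations = penalty_iteration([4.0, 6.0], a=1.0, price=0.0, cap=3.0)
    print(draws, penalty, iterations)
```

In this toy instance the unconstrained choices are 2 and 3, which exceed the cap of 3; the penalty settles near 2 and the draws near 1 and 2, which is also the allocation that maximizes the sum of the two utilities under the cap. The prosumers never see the cap or each other's decisions, only the penalty.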
Since the optimization problem at each prosumer's end is convex, the algorithm can be implemented in practice even when there is a large number of nodes.
Note that there can be multiple generalized NE. For example, suppose that there are two prosumers and one node, the cost of energy is zero, and the minimum energy requirement of each prosumer is zero. One of the prosumers can then draw power up to the point where the voltage magnitude becomes equal to the maximum limit. The other prosumer cannot draw any power, as doing so would make the voltage exceed the maximum limit. Exchanging the roles of the two prosumers gives another equilibrium. Thus, there can be multiple equilibria. We therefore want a generalized NE which is close to an efficient solution (if any), i.e., a strategy close to the optimal solution of the problem where the sum of the prosumers' objectives is maximized subject to the set of constraints. The generalized NE obtained from the algorithm DIST-OPT is indeed an efficient solution: it is an optimal solution of
maximize $\sum_{i=1}^{n} F_i(\cdot)$ subject to $(S_1, \ldots, S_n) \in \mathcal{S}$. (15)
Proof: Note that the optimization problem in (15) is exactly equal to the optimization problem in (12). Since the GNE obtained using the algorithm DIST-OPT is an optimal solution of (12), it is also an optimal solution of (15).
Thus, the generalized NE attained also maximizes the sum of the utilities of the prosumers. Hence, the generalized NE attained by the algorithm DIST-OPT is efficient.
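The non-uniqueness of the generalized NE can be checked numerically on a toy version of the two-prosumer example. The sketch below replaces the voltage limit by a shared cap on the total power drawn and gives each prosumer a utility that simply increases with its own draw; both simplifications, the grid search, and the function names are assumptions made for illustration.

```python
def linear_utility(x):
    """Toy utility: more power drawn is always better."""
    return x


def is_gne(profile, cap, utility, grid_step=0.01):
    """Grid check that no player has a feasible, strictly better deviation.

    The shared constraint is sum(profile) <= cap with non-negative draws.
    """
    if sum(profile) > cap + 1e-9 or min(profile) < 0:
        return False  # the profile itself violates the shared constraint
    for own in profile:
        others = sum(profile) - own
        room = cap - others  # largest draw still feasible for this player
        steps = int(round(room / grid_step))
        for k in range(steps + 1):
            candidate = min(k * grid_step, room)
            if utility(candidate) > utility(own) + 1e-9:
                return False
    return True


if __name__ == "__main__":
    for profile in [(3.0, 0.0), (0.0, 3.0), (1.5, 1.5), (1.0, 1.0)]:
        print(profile, is_gne(profile, cap=3.0, utility=linear_utility))
```

Every profile that uses the whole cap passes the check, while a profile that leaves room under the cap does not, since either prosumer could draw more. This is why an additional selection criterion, such as efficiency, is needed to single out one equilibrium.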
Execution Time: Note that the DIST-OPT algorithm gives an optimal solution of a strictly convex optimization problem. DIST-OPT is a primal-dual gradient algorithm. Thus, the convergence is fast: it is polynomial in $1/\epsilon$ and polynomial in the dimension of the decision variables. Hence, even when the number of nodes increases, the run time only scales polynomially. In our simulations, we also show that the convergence is fast.

Numerical Results
In this section, we demonstrate our proposed architecture on a simulated power grid system. We first describe our set-up. Subsequently, we describe the insights we obtain from the simulated system.

Simulation Set-Up
We run the proposed algorithm at 5-min intervals for 24 h, so there are 288 time intervals. We assume that the renewable energy is harvested according to a truncated Gaussian distribution with mean 5 kWh and variance 2 kWh. The storage capacity is assumed to be uniformly distributed in the interval $[0, 5]$ kWh across the users. The initial battery level is assumed to be 0, i.e., the battery is fully discharged. The prices for the conventional energy are assumed to be governed by a Time-of-Use (ToU) time scale. Thus, the cost of buying energy varies over time. Currently, the selling price to the grid is assumed to be the same as the buying price in the net-metering scheme. However, when renewable energy has a higher penetration, the net-metering scheme is not profitable to the retailers. Thus, we consider that the selling price at time $t$, $g_t$, is a random variable uniformly distributed in $[c_t/2, c_t]$, where $c_t$ is the real-time buying price. Similar to [33], the user's utility for energy $x$ is taken to be of the form $\min\left\{-ax^2 + bx, \frac{b^2}{4a}\right\}$.
The parameter $\frac{b}{2a}$ is the maximum demand of the users. We vary $\frac{b}{2a}$ over time for each user.
Specifically, during peak time (between 9 am and 6 pm), we consider $\frac{b}{2a}$ to be uniformly distributed in $[7.5, 15]$ kWh. During the off-peak hours, we consider it to be uniformly distributed in $[1, 10]$ kWh. The acceptable voltage range is assumed to be $[0.95, 1.05]$ p.u. The resistance and reactance of the edges are assumed to be 0.09 ohm/km. We consider a 14-node distribution system with a radial network. The distribution network setup is identical to the one considered in [1] and is shown in Figure 1. The number of prosumers is 1000, and the prosumers are uniformly distributed over the network.
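The utility used in the simulations can be sketched in code. Read literally, $\min\{-ax^2 + bx, b^2/(4a)\}$ always equals the quadratic, because the quadratic never exceeds its peak value $b^2/(4a)$. The sketch below therefore assumes the saturating reading that matches the description of $b/(2a)$ as the maximum demand: the quadratic up to $b/(2a)$ and the constant $b^2/(4a)$ beyond it. This reading and the function names are assumptions.

```python
def saturating_utility(x, a, b):
    """Quadratic utility that stays at its peak value once x reaches b/(2a)."""
    peak_demand = b / (2.0 * a)
    if x >= peak_demand:
        return b * b / (4.0 * a)
    return -a * x * x + b * x


def maximum_demand(a, b):
    """Consumption level beyond which extra energy adds no utility."""
    return b / (2.0 * a)


if __name__ == "__main__":
    a, b = 1.0, 4.0
    print(maximum_demand(a, b))
    print([saturating_utility(x, a, b) for x in (0.0, 1.0, 2.0, 5.0)])
```

With this reading the utility is non-decreasing and concave, consistent with Assumption 1, and sampling $b/(2a)$ from a range directly sets how much energy a user wants in a given period.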

Note that each prosumer solves its own optimization problem, which we have solved using the CVX toolbox for MATLAB. The algorithm converges fast, mostly within 20-25 iterations. It converges within 3.2 s on average, where the average is taken over 25 runs. We set $\epsilon = 10^{-6}$.

Voltage Variation
We compare our algorithm with the setting where there is no control mechanism (Figures 2 and 3). As is evident from Figures 2 and 3, our algorithm DIST-OPT keeps the voltage magnitude within the limits, in contrast to the setting where there is no control mechanism. Note that during the peak time (off-peak time, respectively), the voltage tends to exceed (go below, respectively) the upper (lower, respectively) limit without the control mechanism.
Figure 4 shows the variation of the total penalty prices needed to implement the DIST-OPT algorithm at a node. Note that during the peak period, the penalty is imposed for drawing power. On the other hand, during the off-peak period, the penalty is imposed for injecting power. Hence, it shows that we need to implement penalties both for drawing and for injecting power. The prices are the highest at nodes 3 and 12 since the resistances between those nodes and their neighbours are higher.

Impact of Storage Units
Figure 5 shows the impact of an increase in the household storage capacity on the prices. Note that the average price for injecting power as well as the average price for drawing power decreases. Intuitively, as the capacity increases, prosumers can optimize more efficiently; thus, the average penalty reduces.
Though the average penalty price reduces overall, prices at individual time periods may increase. Note that the price for power drawn during the off-peak period increases since prosumers now want to store more energy during the off-peak periods. Similarly, the prosumers can also supply excess energy to the other prosumers during the peak period. Thus, the price for injecting power increases during the peak time. It also shows that if storage has a higher penetration level, the prices need to be computed carefully.
Figure 6 depicts the total penalty prices for the reactive power injected into the network. It shows that the prices are positive mostly during the off-peak time. Intuitively, during the off-peak period the demand is lower; thus, the voltage magnitude is small. Thus, if prosumers inject power, it is penalized in order to maintain the voltage magnitude above the acceptable limit. On the other hand, during the peak period the demand is large; thus, the penalty price for injecting power decreases in order to incentivize the prosumers to inject more power. The prices are higher at nodes 3 and 12 since the reactances are higher.
Figure 7 depicts the total amount of energy injected by the prosumers into the grid. Note that the prosumers inject a larger amount of energy into the grid during the peak period. Intuitively, the prices are higher during the peak period; thus, the prosumers tend to inject a larger amount of energy during the peak period in order to achieve a larger profit. Figure 7 also shows that when the storage capacities of the prosumers are larger, more energy is injected into the grid during the peak period. This is because prosumers can also store energy during the off-peak period and give back energy during the peak period. Hence, the energy given back during the off-peak period decreases and the energy given back during the peak period increases.

Conclusions and Future Work
We consider a scenario where the prosumers inject power into or draw power from the grid. The solution of the power flow equations may violate the acceptable voltage magnitudes at various nodes of the power system if the prosumers draw or inject too much power. Thus, we consider a distributed control mechanism to maintain voltage stability. However, the voltage stability constraint depends on the decisions of all the prosumers, and each prosumer takes its own decision without coordinating with others. Thus, developing an optimal distributed control mechanism is inherently challenging. We formulate the problem as a coupled constrained game where each prosumer maximizes its own payoff subject to the common constraint of the non-linear power flow equations. We seek to obtain a generalized Nash equilibrium. We linearize the power flow equations and propose a distributed iterative algorithm which converges to a generalized Nash equilibrium. It induces a penalty price (negative or positive) based on the power if the power flow equations violate the voltage magnitude limits. Our proposed algorithm converges to an efficient solution, i.e., it is an optimal solution of the joint optimization problem of maximizing the sum of the objectives of the prosumers. Our numerical analysis shows that such a mechanism can maintain the stability of the grid and can be implemented in practice.
Our work can be extended in several directions. The characterization of an algorithm for the AC power flow model is left for the future. We expect that the tools which we have developed will provide the basis for the AC power flow model. The uncertainties of the renewable energy have not been considered as we consider a real-time market. However, the energy market operates in a day-ahead setting as well as a real-time setting. Hence, the characterization of an equilibrium in a day-ahead setting is also left for the future.