Mechanism Design for Demand Management in Energy Communities

Wei, Xupeng; Anastasopoulos, Achilleas

doi:10.3390/g12030061

by Xupeng Wei and Achilleas Anastasopoulos^*

Electrical Engineering and Computer Sciences Department, University of Michigan, Ann Arbor, MI 48109-2122, USA

^* Author to whom correspondence should be addressed.

Games 2021, 12(3), 61; https://doi.org/10.3390/g12030061

Submission received: 14 June 2021 / Revised: 23 July 2021 / Accepted: 30 July 2021 / Published: 31 July 2021

(This article belongs to the Special Issue Social and Economic Networks)

Abstract

We consider a demand management problem in an energy community, in which several users obtain energy from an external organization such as an energy company and pay for the energy according to pre-specified prices that consist of a time-dependent price per unit of energy as well as a separate price for peak demand. Since users’ utilities are their private information, which they may not be willing to share, a mediator, known as the planner, is introduced to help optimize the overall satisfaction of the community (total utility minus total payments) by mechanism design. A mechanism consists of a message space, a tax/subsidy function, and an allocation function for each user. Each user reports a message chosen from her own message space, then receives some amount of energy determined by the allocation function, and pays the tax specified by the tax function. A desirable mechanism induces a game whose Nash equilibria (NE) result in an allocation that coincides with the optimal allocation for the community. As a starting point, we design a mechanism for the energy community with desirable properties such as full implementation, strong budget balance and individual rationality for both users and the planner. We then modify this baseline mechanism for communities where message exchanges are allowed only within neighborhoods, and consequently, the tax/subsidy and allocation functions of each user are only determined by the messages from their neighbors. All of the desirable properties of the baseline mechanism are preserved in the distributed mechanism. Finally, we present a learning algorithm for the baseline mechanism, based on projected gradient descent, that is guaranteed to converge to the NE of the induced game.

Keywords: mechanism design; Nash equilibrium; demand management; energy networks; learning in games

1. Introduction

Resource allocation is an essential task in networked systems such as communication networks and energy/power networks [1,2,3]. In such systems, one or more kinds of limited, divisible resources are usually allocated among several agents. When full information regarding agents’ interests is available, solving the optimal resource allocation problem reduces to a standard optimization problem. However, in many interesting scenarios, strategic agents may choose to conceal or misreport their interests in order to obtain more resources. In such cases, appropriate incentives can be designed so that selfish agents are incentivized to truthfully report their private information, thus enabling optimal resource allocation [4].

In existing works related to resource allocation problems, mechanism design [5,6] is frequently used to address the agents’ strategic behavior mentioned above. In the framework of mechanism design, the participants reach an agreement regarding how they exchange messages, how they share the resources, and how much they pay (or get paid). Such agreements are designed to incentivize the agents to provide the information needed to solve the optimization problem.

Contemporary energy systems have witnessed an explosion of emerging techniques, such as smart meters and advanced metering infrastructure (AMI) in the last decade. These smart devices and systems facilitate the transmission of users’ data via communication networks and enable dynamic pricing and demand response programs [7], making it possible to implement mechanism design techniques on energy communities for improved efficiency. In this paper, we develop mechanisms to solve a demand management problem in energy communities. In an energy community, users obtain energy from an energy company and pay for it. The pre-specified prices dictated by the energy company consist of a time-dependent price per unit of energy as well as a separate price for peak demand. Users’ demand is subject to constraints related to equipment capacity and minimum comfort level. Each user possesses a utility as a function of their own demand. Utilities are private information for users. The welfare of the community is the sum of utilities minus energy cost. If users were willing to truthfully report their utilities, one could easily optimize energy allocations to maximize social welfare. However, since users are strategic and might not be willing to report utilities directly, to maximize the welfare, we need to find an appropriate mechanism that incentivizes them to reveal some information about their utilities, so that optimal allocation is reached even in the presence of strategic behaviors. These mechanisms are usually required to possess several interesting properties, among which are full implementation in Nash equilibria (NE), individual rationality, and budget balance [6,8,9]. Moreover, in environments with communication constraints, it is desirable to have “distributed” mechanisms, whereby energy allocation and tax/subsidies for each user can be evaluated using only local messages in the user’s neighborhood. 
Finally, for practical deployment, we want the designed mechanism to have convergence properties guaranteeing that the agents reach an NE by means of a provably convergent learning algorithm.

1.1. Contributions

This paper proposes a method of designing a mechanism for implementing the optimal allocation of the demand management problem in an energy community, where there are strategic users communicating over a pre-specified message exchange network. The main contributions of our work are as follows:

(a): We design a baseline, “centralized” mechanism for an environment with concave utilities and convex constraints. A “centralized” mechanism allows for messages from all users to be communicated to the planner [6,8,9]. To avoid excessive communication cost brought about by direct mechanisms (due to messages being entire utility functions), the mechanisms proposed in this paper are indirect, non-VCG type [10,11,12], with messages being real vectors with finite (and small) dimensionality. Unlike related previous works [13,14,15,16], a simple form of allocation function is adopted, namely, allocation equals demand. The mechanism possesses the properties of full implementation, budget balance, and individual rationality [6,8,9]. Although we develop the mechanism for demand management in energy communities, the underlying ideas can be easily adapted to other problems and more general environments. Specifically, environments with non-monotonic utilities, external fixed unit prices, and the requirement of peak shaving are tractable with the proposed mechanism.
(b): Inspired by the vast literature on distributed non-strategic optimization [17,18,19,20,21], as well as our recent work on distributed mechanism design (DMD) [22,23], we modify the baseline mechanism and design a “distributed” version of it. A distributed mechanism can be deployed in environments with communication constraints, where users’ messages cannot be communicated to the central planner; consequently the allocation and tax/subsidy functions for each user should only depend on messages from direct neighbors. The focus of our methodology is to show how a centralized mechanism can be modified into a decentralized one in a systematic way by means of introducing extra message components that act as proxies of the messages not available to a user due to communication constraints. An added benefit of this systematic design is that the new mechanism preserves all of the desirable properties of the centralized mechanism.
(c): Since mechanism design (centralized or distributed) deals with equilibrium properties, one relevant question is how equilibrium is reached when agents enter the mechanism. Our final contribution in this paper is to provide “learning” algorithms [24,25,26,27,28] that address this question for both cases of the proposed centralized and decentralized mechanisms. The algorithm is based on the projected gradient descent method in optimization theory ([29] Chapter 7). Learning proceeds through price adjustments and demand announcements according to the prices. During this process, users do not need to reveal their entire utility functions. Convergence of the message profile toward one NE is conclusively proven, and since the mechanism is designed to fully implement the optimal allocation in NE, this implies that the allocation corresponding to the limiting message profile is the social welfare maximizing solution.

1.2. Related Literature

The model for demand management in energy communities investigated in this paper originates from network utility maximization (NUM) problems, which is one typical category of resource allocation problems in networks (see [30], Chapter 2, for a detailed approach to models and algorithms for solving NUM problems). There are two distinct research directions that have emanated from the standard centralized formulations of optimization problems.

The first direction addresses the problem of communication constraints when solving an optimization problem in a centralized fashion. Taking into account these communication constraints, several researchers have proposed distributed optimization methods [17,18,19,20,21] whereby an optimization problem is solved by means of message-passing algorithms between neighbors in a communication network. The works have been further refined to account for possible users’ privacy concerns during the optimization processes [31,32,33,34]. Nevertheless, the users are assumed to be non-strategic in this line of works.

The second research direction, namely mechanism design, addresses the presence of strategic agents in optimization problems in a direct way. The past several decades have witnessed applications of this approach in various areas of interest, such as market allocations [14,35,36], spectrum sharing [37,38,39], data security [40,41,42], smart grid [43,44,45], etc. The well-known VCG mechanism [10,11,12] has been utilized extensively in this line of research. In VCG, users have to communicate utilities (i.e., entire functions), which leads to a high cost of information transmission. To ease the burden of communication, Kelly’s mechanism [13] (and extensions to multiple divisible resources [46]) has been proposed as a solution, which uses logarithmic functions as surrogates of utilities. The users need to only report a real number and thus the communication cost reduces dramatically, at the expense of efficiency loss [47] and/or the assumption of price-taking agents. A number of works extend Kelly’s idea to reduce message dimensionality in strategic settings [48,49,50]. Other indirect mechanisms guaranteeing full implementation in environments with allocative constraints have been proposed in [51,52] using penalty functions to incentivize feasibility and in [13,14,15] using proportional allocation or its generalization, radial allocation [16,53]. All aforementioned works on mechanism design can be categorized as “centralized” mechanisms, which means that agents’ messages are broadcasted to a central planner who evaluates allocation and taxation for all users.

The first contribution of this paper (contribution (a)) relates to the existing research detailed above in the following way. Instead of adopting the classic VCG framework as in [10,11,12], the proposed centralized mechanism is indirect with finite message dimensionality. In line with previous works [14,15,16,53], our centralized mechanism guarantees full implementation in contrast to some other indirect mechanisms (e.g., Kelly’s mechanism [13,46,47]). Additionally, compared to the works in [13,14,15,16,53], the form of allocation functions in our centralized mechanism is simpler, while at the same time, it enables the use of the mechanism in more general environments (e.g., environments allowing negative demands, non-monotonic utilities, extra prices, etc.).

The first attempts in designing decentralized mechanisms were reported in [22,23], where mechanisms are designed with the additional property that the allocation and tax functions for each agent depend only on the messages emitted by neighboring agents. As such, allocation and taxation can be evaluated locally.

Similar to the works in [22,23], the centralized mechanism in our work can be modified into a decentralized version (contribution (b)). Unlike the previous two works, however, the distributed mechanism in this paper has a straightforward form for the allocation function, making the proofs much less involved. In addition and partially due to this simplification, these decentralized mechanisms can be applied to a broader range of environments.

Recently, there has also been a line of research designing mechanisms over networks. The works of [54,55] present mechanisms for applications such as prize allocation and peer ranking, where agents are aware of their neighbors’ private characteristics (e.g., suitability, ability, or need). We caution the reader not to confuse the networks in the context of [54,55] with the message exchange networks introduced in our work in the context of contribution (b). Indeed, the networks in the two works listed above reveal how much of the private information of other agents is available to each agent, but they do not put restrictions on allocation/tax functions. On the other hand, the message exchange networks in our work treat utilities as each agent’s strictly private information and put restrictions on the form of allocation/tax functions.

Finally, learning in games is motivated by the fact that NE is, theoretically, a complete information solution concept. However, since users do not know each other’s utilities, they cannot evaluate the designed NE offline. Instead, there is a need for a process (learning), during which the NE is learned by the community. The classic works [24,25,26] adopted fictitious play, while in [27], a connection between supermodularity and convergence of learning dynamics within an adaptive dynamics class was made and was further specialized in [56] to the Lindahl allocation problem. A general class of learning dynamics named adaptive best response was discussed in [57] in connection with contractive games. Learning in monotone games [28,58] was investigated in [59,60,61,62,63,64], with further applications in network optimization [65,66]. Recently, learning NE by utilizing reinforcement learning has been reported in [67,68,69,70,71].

In the last contribution of this paper (contribution (c)), we develop a learning algorithm for the proposed mechanisms. Most of the learning algorithms mentioned above require a strong property of the induced games. For example, the learning algorithm of [27] requires supermodularity and the adaptive best response class of dynamics in [57] only applies to contractive games. Unfortunately, guaranteeing such strong properties limits the applicability of the mechanism. We overcome this difficulty with the proposed learning algorithm, based on projected gradient descent (PGD) on the dual problem, thus guaranteeing convergence to NE in more general settings than the ones described in [27,57].

2. Model and Preliminaries

2.1. Demand Management in Energy Communities

Consider an energy community consisting of N users and a given time horizon T, where T can be viewed as the number of days during one billing period. Each user i in the user set $N$ has their own prediction of their usage across one billing period denoted by $x^{i} = (x_{1}^{i}, \dots, x_{T}^{i})$, where $x_{t}^{i}$ is the predicted usage of user i on the tth time slot of the billing period. Regarding notation, throughout the paper we use superscripts to denote users and constraints, and subscripts to denote time slots. Note that $x_{t}^{i}$ can be a negative number because users in the electrical grid can generate power through renewable technologies (e.g., photovoltaics) and return the surplus to the grid. The users are characterized by their utility functions as

$$v^{i}(x^{i}) = \sum_{t=1}^{T} v_{t}^{i}(x_{t}^{i}), \quad \forall i \in N.$$

The energy community, as a whole, pays for the energy. The unit prices are given separately for every time slot t, denoted by $p_{t}$. These prices are considered given and fixed (e.g., by the local utility company). In addition, the local utility company imposes a unit peak price $p_{0}$ in order to incentivize load balancing and to lessen the burden of peaks in demand. To conclude, the cost of energy to the community is as follows:

$$J(x) = \sum_{t=1}^{T} p_{t} \Big(\sum_{i=1}^{N} x_{t}^{i}\Big) + p_{0} \cdot \max_{1 \leq t \leq T} \sum_{i=1}^{N} x_{t}^{i}, \qquad \text{(1)}$$

where $x$ is a concatenation of demand vectors $x^{1}, \dots, x^{N}$.
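As a minimal sketch of Equation (1), the cost can be evaluated directly from a demand table. The function name `community_cost` and the layout `x[i][t]` (user i, slot t, both indexed from 0) are illustrative assumptions, not notation from the paper.

```python
from typing import Sequence


def community_cost(x: Sequence[Sequence[float]], p: Sequence[float], p0: float) -> float:
    """Energy cost J(x) of Eq. (1); x[i][t] is the demand of user i in slot t."""
    T = len(p)
    # Aggregate demand of the whole community in each time slot.
    total = [sum(user[t] for user in x) for t in range(T)]
    energy = sum(p[t] * total[t] for t in range(T))  # time-dependent unit prices
    peak = p0 * max(total)                           # charge on the peak aggregate demand
    return energy + peak


# Two users, two slots: aggregate demand is (6, 4), so J = 6 + 4 + 1 * 6.
print(community_cost([[3.0, 2.0], [3.0, 2.0]], p=[1.0, 1.0], p0=1.0))  # 16.0
```

Because demands may be negative (users feeding energy back), the peak term uses the maximum of the signed aggregate demand, exactly as written in (1).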

The centralized demand management problem for the energy community can be formulated as

$$\underset{x \in X}{\text{maximize}} \quad \sum_{i=1}^{N} v^{i}(x^{i}) - J(x). \qquad \text{(2)}$$

The feasible set $X$ incorporates possible lower bounds on each user’s demand (e.g., minimal indoor heating or AC) and/or upper bounds due to the capacities of the facilities as well as transmission line capacities.

In order to solve the optimization problem (2) using convex optimization methods, the following assumptions are made.

Assumption 1.

All of the utility functions $v_{t}^{i}(\cdot)$ are twice differentiable and strictly concave.

Assumption 2.

The feasible set $X$ is a polytope formed by several linear inequality constraints, and $X$ is coordinate convex, i.e., if $x \in X$, then setting any of the components of $x$ to 0 does not let it fall outside of set $X$.

By Assumption 2, $X$ can be written as $\{x \mid A x \leq b\}$ for some $A \in \mathbb{R}^{L \times NT}$ and $b \in \mathbb{R}_{+}^{L}$, where L is the number of linear constraints in $X$, and

$$\begin{aligned} A &= [a^{1} \ \dots \ a^{L}]^{T}, \\ a^{l} &= [a_{1}^{1,l} \ \dots \ a_{T}^{1,l} \ \dots \ a_{1}^{N,l} \ \dots \ a_{T}^{N,l}]^{T}, \quad l = 1, \dots, L, \\ b &= [b^{1}, \dots, b^{L}]^{T}. \end{aligned}$$

The coordinate convexity in Assumption 2 is mainly used for the outside option required by individual rationality. Under this assumption, for a feasible allocation $x$, if any user i changes their mind and chooses not to participate in the mechanism, the mechanism yields a feasible allocation with $x^{i} = 0$ fixed.
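The role of coordinate convexity for the outside option can be illustrated with a small feasibility check. This is only a sketch under assumptions: the helper names `is_feasible` and `opt_out`, the stacking order of the demand vector, and both constraint sets are hypothetical examples rather than anything specified in the paper.

```python
def is_feasible(A, b, x, tol=1e-9):
    """True if the stacked demand vector x satisfies A x <= b row by row."""
    return all(sum(a_k * x_k for a_k, x_k in zip(row, x)) <= b_l + tol
               for row, b_l in zip(A, b))


def opt_out(x, i, T):
    """Copy of x with user i's block (x_1^i, ..., x_T^i) set to zero (users indexed from 0)."""
    y = list(x)
    y[i * T:(i + 1) * T] = [0.0] * T
    return y


# Per-user caps plus a joint cap: for nonnegative demands, zeroing a block keeps x feasible.
A_ok, b_ok = [[1, 0], [0, 1], [1, 1]], [4, 4, 6]
# A coupling constraint x^1 <= x^2: zeroing the second user's demand breaks feasibility.
A_bad, b_bad = [[1, -1]], [0]

x = [3.0, 3.0]  # N = 2 users, T = 1 slot
print(is_feasible(A_ok, b_ok, opt_out(x, 0, 1)))    # True
print(is_feasible(A_bad, b_bad, opt_out(x, 1, 1)))  # False
```

The second constraint set shows the kind of polytope that Assumption 2 rules out: the allocation is feasible while everyone participates but not after one user leaves.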

With Assumptions 1 and 2, the energy community faces an optimization problem with a strictly concave and continuous objective function over a nonempty compact convex feasible set. Therefore, from convex optimization theory, the optimal solution for this problem always exists and is unique [29].

Substituting the max function in (1) with a new variable w, the optimization problem in (2) can be equivalently restated as

$$\begin{aligned} \underset{x, w}{\text{maximize}} \quad & \sum_{i=1}^{N} v^{i}(x^{i}) - \sum_{t=1}^{T} p_{t} \sum_{i=1}^{N} x_{t}^{i} - p_{0} w && \text{(3a)} \\ \text{subject to} \quad & A x \leq b, && \text{(3b)} \\ & \sum_{i=1}^{N} x_{t}^{i} \leq w, \quad \forall t \in \{1, \dots, T\}. && \text{(3c)} \end{aligned}$$
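A short way to see the connection between (3a)–(3c) and (2), sketched here under the assumption $p_{0} \geq 0$ (an assumption of this sketch): for any fixed feasible $x$, constraint (3c) only requires $w$ to be at least the peak aggregate demand, and the objective (3a) is nonincreasing in $w$, so the inner maximization over $w$ is attained at the peak.

```latex
\max_{w \,\geq\, \max_{t} \sum_{i} x_{t}^{i}}
  \Big( \sum_{i=1}^{N} v^{i}(x^{i}) - \sum_{t=1}^{T} p_{t} \sum_{i=1}^{N} x_{t}^{i} - p_{0} w \Big)
= \sum_{i=1}^{N} v^{i}(x^{i}) - \sum_{t=1}^{T} p_{t} \sum_{i=1}^{N} x_{t}^{i}
  - p_{0} \max_{1 \leq t \leq T} \sum_{i=1}^{N} x_{t}^{i}
= \sum_{i=1}^{N} v^{i}(x^{i}) - J(x).
```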

The proof of this equivalency can be found in Appendix A. The new optimization problem has a differentiable concave objective function with a convex feasible set, which means that it is still a convex optimization problem, and therefore, the KKT conditions are sufficient and necessary conditions for a solution $(x, \lambda, \mu)$ to be the optimal solution, where $\lambda = [\lambda^{1}, \dots, \lambda^{L}]^{T}$ are the Lagrange multipliers for each linear constraint $(a^{l})^{T} x \leq b^{l}$ in constraint $x \in X$ and $\mu = [\mu_{1}, \dots, \mu_{T}]^{T}$ are the Lagrange multipliers for (3c). The KKT conditions are listed as follows:

Primal Feasibility:

$$x \in X, \qquad \text{(4a)}$$

$$\sum_{i=1}^{N} x_{t}^{i} \leq w. \qquad \text{(4b)}$$

Dual Feasibility:

$$\lambda^{l} \geq 0, \ l = 1, \dots, L; \quad \mu_{t} \geq 0, \ t = 1, \dots, T. \qquad \text{(4c)}$$

Complementary Slackness:

$$\lambda^{l} \big((a^{l})^{T} x - b^{l}\big) = 0, \quad l = 1, \dots, L, \qquad \text{(4d)}$$

$$\mu_{t} \Big(\sum_{i=1}^{N} x_{t}^{i} - w\Big) = 0, \quad t = 1, \dots, T. \qquad \text{(4e)}$$

Stationarity:

$$p_{0} = \sum_{t} \mu_{t}, \qquad \text{(4f)}$$

$$\dot{v}_{t}^{i}(x_{t}^{i}) = p_{t} + \sum_{l} \lambda^{l} a_{t}^{i,l} + \mu_{t}, \quad t = 1, \dots, T, \ i \in N, \qquad \text{(4g)}$$

where $\dot{v}_{t}^{i}(\cdot)$ is the first order derivative of $v_{t}^{i}(\cdot)$.
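To make the conditions concrete, the sketch below evaluates the largest violation of each group (4a)–(4g) for a candidate point. Everything about the instance is an assumption made for illustration: the function name `kkt_residuals`, the quadratic utilities $v_{t}^{i}(z) = c_{t}^{i} z - z^{2}/2$, the box constraints $x_{t}^{i} \leq 10$, and the prices.

```python
def kkt_residuals(x, w, lam, mu, dv, p, p0, A, b):
    """Worst violation of each KKT group in (4a)-(4g).

    x[i][t] is the demand of user i in slot t; A[l] is a row over the stacked
    vector (x^1, ..., x^N); dv(i, t, z) is the derivative of v_t^i at z.
    """
    N, T, L = len(x), len(p), len(b)
    flat = [x[i][t] for i in range(N) for t in range(T)]
    Ax = [sum(a * z for a, z in zip(A[l], flat)) for l in range(L)]
    tot = [sum(x[i][t] for i in range(N)) for t in range(T)]
    primal = max([Ax[l] - b[l] for l in range(L)] + [tot[t] - w for t in range(T)] + [0.0])
    dual = max([-v for v in lam] + [-v for v in mu] + [0.0])
    slack = max([abs(lam[l] * (Ax[l] - b[l])) for l in range(L)]
                + [abs(mu[t] * (tot[t] - w)) for t in range(T)])
    grad = [abs(dv(i, t, x[i][t]) - p[t] - mu[t]
                - sum(lam[l] * A[l][i * T + t] for l in range(L)))
            for i in range(N) for t in range(T)]
    station = max(grad + [abs(p0 - sum(mu))])
    return {"primal": primal, "dual": dual, "slackness": slack, "stationarity": station}


c = [[5.0, 3.0], [5.0, 3.0]]          # hypothetical utilities v_t^i(z) = c*z - z^2/2
dv = lambda i, t, z: c[i][t] - z      # their derivatives
A = [[1.0 if k == r else 0.0 for k in range(4)] for r in range(4)]  # box: x_t^i <= 10
b = [10.0] * 4
x_star = [[3.0, 2.0], [3.0, 2.0]]
res = kkt_residuals(x_star, 6.0, [0.0] * 4, [1.0, 0.0], dv, [1.0, 1.0], 1.0, A, b)
print(res)  # every residual is zero for this candidate
```

In this toy instance the peak occurs in the first slot, so the whole peak price $p_{0}$ is carried by $\mu_{1}$; splitting it as $\mu = (0.5, 0.5)$ instead violates both complementary slackness and stationarity.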

We conclude this section by pointing out once more that our objective is not to solve (3a)–(3c) or (4a)–(4g) in a centralized or decentralized fashion. Such a methodology is well established and falls under the research area of centralized or decentralized (non-strategic) optimization. Furthermore, such a task can be accomplished only under the assumption that users report their utilities (or related quantities, such as derivatives of utilities at specific points) truthfully, i.e., they do not act strategically. Instead, our objective is to design a mechanism (i.e., messages and incentives) so that strategic users are presented with a game, the NE of which is designed so that it corresponds to the optimal solution of (3a)–(3c) or (4a)–(4g).

2.2. Mechanism Design Preliminaries

In an energy community, utilities are users’ private information. Due to privacy and strategic concerns, users might not be willing to report their utilities. As a result, (3a)–(3c) or (4a)–(4g) cannot be solved directly. In order to solve (3a)–(3c) and (4a)–(4g) under the settings stated above, we introduce a planner as an intermediary between the community and the energy company. To incentivize users to provide the information necessary for optimization, the planner signs a contract with the users, which prespecifies the messages needed from users and the rules for determining the allocation and the taxes/subsidies from/to the users. The planner commits to the contract. Informally speaking, the design of such a contract is referred to as mechanism design.

More formally, a mechanism is a collection of message sets and an outcome function [8]. Specifically, in resource allocation problems, a mechanism can be defined as a tuple $(M, \hat{x}(\cdot), \hat{t}(\cdot))$, where $M = M^{1} \times \dots \times M^{N}$ is the space of message profile; $\hat{x} : M \mapsto X$ is an allocation function determining the allocation $x$ according to the received message profile $m \in M$; and $\hat{t} : M \mapsto \mathbb{R}^{N}$ is a tax function that defines the payments (or subsidies) of users based on m (specifically, $\hat{t} = \{\hat{t}^{i}\}_{i \in N}$ with $\hat{t}^{i} : M \mapsto \mathbb{R}$ defining the tax/subsidy function for user i). Once defined, the mechanism induces a game $(N, M, \{u^{i}\}_{i \in N})$. In this game, each user i chooses her message $m^{i}$ from the message space $M^{i}$, with the objective to maximize her payoff $u^{i}(m) = v^{i}(\hat{x}^{i}(m)) - \hat{t}^{i}(m)$. The planner charges taxes and pays for the energy cost to the company, so the planner’s payoff turns out to be $\sum_{i} \hat{t}^{i}(m) - J(\hat{x}(m))$ (the net income of the planner).
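The payoff definitions above can be written as a few lines of code. The sketch below is deliberately generic: `payoffs` is a hypothetical helper, and the toy allocation, tax, utility, and cost functions passed to it are placeholders chosen for illustration, not the mechanism proposed in this paper.

```python
def payoffs(m, allocation, tax, utilities, cost):
    """User payoffs u^i(m) = v^i(x_hat^i(m)) - t_hat^i(m) and the planner's net income."""
    x = allocation(m)   # x[i] is the allocation of user i
    t = tax(m)          # t[i] is the tax of user i (negative values are subsidies)
    users = [utilities[i](x[i]) - t[i] for i in range(len(x))]
    planner = sum(t) - cost(x)
    return users, planner


# Toy single-slot instance (entirely hypothetical): messages are demands,
# every user pays a flat unit price of 2, and there is no peak charge.
v = lambda z: 5.0 * z - z * z / 2.0
users, planner = payoffs(
    m=[3.0, 1.0],
    allocation=lambda m: list(m),
    tax=lambda m: [2.0 * d for d in m],
    utilities=[v, v],
    cost=lambda x: 2.0 * sum(x),
)
print(users, planner)  # [4.5, 2.5] 0.0
```

Here the taxes collected equal the cost paid to the company, so the planner's payoff is zero; this is the situation that strong budget balance asks for at NE.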

For the mechanism-induced game $G$, NE is an appropriate solution concept. At the equilibrium point $m^{*}$, if $\hat{x}(m^{*})$ coincides with the optimal allocation (i.e., the solution of (3a)–(3c)), we say that the mechanism implements the optimal allocation at $m^{*}$. A mechanism has the property of full implementation if all of the NE $m^{*}$ implement the optimal allocation.

There are other desirable properties in a mechanism. Individual rationality is the property that everyone volunteers to participate in the mechanism-induced game instead of quitting. For the planner, this means that the sum of taxes $\sum_{i} \hat{t}^{i}(m^{*})$ collected at NE is no smaller than the cost paid to the energy company $J(\hat{x}(m^{*}))$. In the context of this paper, strong budget balance is the property that the sum of taxes is exactly the same as the cost paid to the energy company, so no additional funds are required by the planner or the community to run the mechanism other than the true energy cost paid to the energy company. In addition, if we use the solution concept of NE, one significant problem is how the users know the NE without full information. Therefore, some learning algorithm is needed to help users learn the NE. If, under a specific class of learning algorithms, the message profile m converges to an NE $m^{*}$, then we say that the mechanism has learning guarantees for this class.

3. The Baseline “Centralized” Mechanism

In this section, we temporarily assume that there are no communication constraints, i.e., all of the message components are accessible for the calculations of the allocation and taxation. The mechanism designed under this assumption is called a “centralized” mechanism. In the next section, we extend this mechanism to an environment with communication constraints.

In the proposed centralized mechanism, we define user i’s message $m^{i}$ as

$$m^{i} = \big(\{y_{t}^{i}\}_{t=1}^{T}, \{q^{i,l}\}_{l \in L}, \{s_{t}^{i}\}_{t=1}^{T}, \{\beta_{t}^{i}\}_{t=1}^{T}\big).$$

Each message component above has an intuitive meaning. Message $y_{t}^{i} \in \mathbb{R}$ can be regarded as the demand for time slot t announced by user i. Message $q^{i,l} \in \mathbb{R}_{+}$ is the additional price that user i expects to pay for constraint l, which corresponds to the Lagrange multiplier $\lambda^{l}$. Message $s_{t}^{i} \in \mathbb{R}_{+}$ is proportional to the peak price that user i expects to pay at time t. Intuitively, setting one $s_{t}^{i}$ greater than $s_{t'}^{i}$ means that user i thinks day t is more likely to be the day with the peak demand rather than $t'$. This component corresponds to the Lagrange multiplier $\mu_{t}$. Message $\beta_{t}^{i} \in \mathbb{R}$ is the prediction of user $(i+1)$’s usage at time t by user i. This message is included for technical reasons that will become clear in the following (for a user index $i \in N$, let $i-1$ and $i+1$ denote modulo N operations).

Denote the message space of user i by $M^{i}$, and the space of the message profile is represented as $M = M^{1} \times \dots \times M^{N}$. The allocation functions and the tax functions are functions defined on $M$. The allocation functions follow the simple definition:

$$\hat{x}_{t}^{i}(m) = y_{t}^{i}, \quad t = 1, \dots, T, \ \forall i \in N, \qquad \text{(5)}$$

i.e., users obtain exactly what they request.

Prior to the definition of the tax functions, we want to find some variable that acts similar to $\mu_{t}$ at NE. Although $s_{t}^{i}$ is designed to be proportional to $\mu_{t}$, it does not guarantee $\sum_{t} s_{t}^{i} = p_{0}$, which is the KKT condition (4f). To solve this problem, we utilize a technique similar to the proportional/radial allocation in [13,14,15,16,53] to shape the suggested peak price vector $s$ into a form that satisfies (4f). For a generic T-dimensional peak price vector $\tilde{s} = (\tilde{s}_{1}, \dots, \tilde{s}_{T})$ and a generic T-dimensional total demand vector $\tilde{y} = (\tilde{y}_{1}, \dots, \tilde{y}_{T})$, define a radial pricing operator $\mathrm{RP}^{i} : \mathbb{R}_{+}^{T} \times \mathbb{R}^{T} \mapsto \mathbb{R}_{+}^{T}$ as

$$\mathrm{RP}^{i}(\tilde{s}, \tilde{y}) = \big(\mathrm{RP}_{1}^{i}(\tilde{s}, \tilde{y}), \dots, \mathrm{RP}_{T}^{i}(\tilde{s}, \tilde{y})\big), \qquad \text{(6a)}$$

where

$$\mathrm{RP}_{t}^{i}(\tilde{s}, \tilde{y}) = \begin{cases} \dfrac{\tilde{s}_{t}}{\sum_{t'} \tilde{s}_{t'}} \, p_{0}, & \text{if } \tilde{s} \neq 0, \\[2ex] \dfrac{p_{0} \cdot \mathbf{1}\{t \in \arg\max_{t'} \tilde{y}_{t'}\}}{\#(\arg\max_{t'} \tilde{y}_{t'})}, & \text{if } \tilde{s} = 0, \end{cases} \qquad \text{(6b)}$$

and $\#(\arg\max_{t'} \tilde{y}_{t'})$ represents the number of elements in $\tilde{y}$ that are equal to the maximum value.

The output of the radial pricing $\mathrm{RP}(\cdot, \cdot)$ is taken as the peak price in the subsequent tax functions. When the given suggested price vector $\tilde{s}$ is a nonzero vector, the unit peak price is allocated to each day in proportion to $\tilde{s}_{t}$. If the suggested price vector $\tilde{s} = 0$, then $p_{0}$ is divided equally among the days with peak demand.
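A minimal sketch of the radial pricing operator in (6a)–(6b) follows. The function name `radial_pricing` and the plain-list representation are assumptions for illustration; ties in the peak are detected with exact equality, which is adequate for this toy setting.

```python
def radial_pricing(s, y, p0):
    """Radial pricing (6b): split the unit peak price p0 across the T slots.

    s: suggested peak prices (nonnegative), y: total demand per slot.
    """
    T = len(s)
    total = sum(s)
    if total > 0:  # s has nonnegative entries, so s != 0 exactly when the sum is positive
        return [p0 * s_t / total for s_t in s]
    # s == 0: share p0 equally among the slots that attain the peak demand.
    peak = max(y)
    argmax = [t for t in range(T) if y[t] == peak]
    return [p0 / len(argmax) if t in argmax else 0.0 for t in range(T)]


print(radial_pricing([1.0, 3.0], [0.0, 0.0], 4.0))            # [1.0, 3.0]
print(radial_pricing([0.0, 0.0, 0.0], [2.0, 5.0, 5.0], 4.0))  # [0.0, 2.0, 2.0]
```

In both branches the returned prices sum to $p_{0}$, which is the property the operator is introduced to enforce, namely condition (4f).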

The tax functions are defined as

$$\hat{t}^{i}(m) = \mathrm{cost}^{i}(m) + \sum_{t=1}^{T} \mathrm{pr}\beta_{t}^{i}(m) + \sum_{l \in L} \mathrm{con}^{i,l}(m) + \sum_{t=1}^{T} \mathrm{con}_{t}^{i}(m), \qquad \text{(7)}$$

where

$$\begin{aligned} \mathrm{cost}^{i}(m) &= \sum_{t=1}^{T} \big(p_{t} + \mathrm{RP}_{t}^{i}(s^{-i}, \zeta^{-i})\big) \hat{x}_{t}^{i}(m) + \sum_{l \in L_{i}} q^{-i,l} a^{i,l} \hat{x}^{i}(m), && \text{(8a)} \\ \mathrm{pr}\beta_{t}^{i}(m) &= (\beta_{t}^{i} - y_{t}^{i+1})^{2}, && \text{(8b)} \\ \mathrm{con}^{i,l}(m) &= (q^{i,l} - q^{-i,l})^{2} + q^{i,l} \Big(b^{l} - \sum_{j \neq i} a^{j,l} y^{j} - a^{i,l} \beta^{i-1}\Big), && \text{(8c)} \\ \mathrm{con}_{t}^{i}(m) &= (s_{t}^{i} - s_{t}^{-i})^{2} + s_{t}^{i} (z^{-i} - \zeta_{t}^{-i}), && \text{(8d)} \end{aligned}$$

and

$$\begin{aligned} s_{t}^{-i} &= \frac{1}{N-1} \sum_{j \neq i} s_{t}^{j}, \quad \forall i, \ \forall t, && \text{(9a)} \\ q^{-i,l} &= \frac{1}{N-1} \sum_{j \neq i} q^{j,l}, \quad \forall i, \ \forall l, && \text{(9b)} \\ \zeta_{t}^{-i} &= \sum_{j \neq i} y_{t}^{j} + \beta_{t}^{i-1}, \quad \forall i, \ \forall t, && \text{(9c)} \\ z^{-i} &= \max_{t} \{\zeta_{t}^{-i}\}, \quad \forall i, && \text{(9d)} \end{aligned}$$

and $a^{i,l}$ is defined as $a^{i,l} = [a_{1}^{i,l}, \dots, a_{T}^{i,l}]$.

The tax function for user i consists of three parts. The first part $\mathrm{cost}^{i}(m)$ is the cost for the demand. According to this part, user i pays a fixed price and the peak price for their demand. Note that the peak price at time t, $\mathrm{RP}_{t}^{i}(s^{-i}, \zeta^{-i})$, is generated by the vector of peak prices from all other agents, $s^{-i}$, and the total demand from all other agents (agent i’s demand at time t is approximated by $\beta_{t}^{i-1}$). As a result, the peak price is not controlled by user i at all. The second part $\mathrm{pr}\beta_{t}^{i}(m)$ ($\mathrm{pr}\beta$ stands for “proxy-$\beta$”) is a penalty term for the imprecision of prediction $\beta^{i}$ that incentivizes $\beta^{i}$ to align with $y^{i+1}$ at NE. The third part consists of two penalty terms $\mathrm{con}^{i,l}(m)$ and $\mathrm{con}_{t}^{i}(m)$ for each constraint $l \in L$ and each peak demand inequality $t \in \{1, \dots, T\}$, respectively. Both of them have a quadratic term that incentivizes consensus of the messages $q^{i,l}$ and $s_{t}^{i}$ among agents, respectively. In addition, they possess a form that looks similar to the complementary slackness conditions (4d) and (4e). This special design helps the suggested prices reach agreement and ensures that primal feasibility and complementary slackness hold at NE, as shown in Lemma 2.
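The tax function (7) with its components (8a)–(8d) and (9a)–(9d) can be sketched as follows. Several things here are assumptions of the sketch rather than statements from the paper: users are indexed from 0, the data layout is `y[i][t]`, `q[i][l]`, `s[i][t]`, `beta[i][t]`, `a[l][i][t]`, the set $L_{i}$ in (8a) is taken to be all constraints, and the numerical instance (two users, two slots, one constraint on first-slot demand) is invented.

```python
def radial_pricing(s, y, p0):
    """Radial pricing (6b): split p0 in proportion to s, or equally over peak slots if s == 0."""
    total = sum(s)
    if total > 0:
        return [p0 * s_t / total for s_t in s]
    peak = max(y)
    argmax = [t for t in range(len(y)) if y[t] == peak]
    return [p0 / len(argmax) if t in argmax else 0.0 for t in range(len(y))]


def tax(i, y, q, s, beta, p, p0, a, b):
    """Tax of user i following Eqs. (7)-(9); requires N >= 2 users."""
    N, T, L = len(y), len(p), len(b)
    prev, nxt = (i - 1) % N, (i + 1) % N        # neighbours modulo N
    others = [j for j in range(N) if j != i]
    dot = lambda u, v: sum(u_k * v_k for u_k, v_k in zip(u, v))

    s_bar = [sum(s[j][t] for j in others) / (N - 1) for t in range(T)]       # (9a)
    q_bar = [sum(q[j][l] for j in others) / (N - 1) for l in range(L)]       # (9b)
    zeta = [sum(y[j][t] for j in others) + beta[prev][t] for t in range(T)]  # (9c)
    z = max(zeta)                                                            # (9d)
    rp = radial_pricing(s_bar, zeta, p0)

    cost = (sum((p[t] + rp[t]) * y[i][t] for t in range(T))
            + sum(q_bar[l] * dot(a[l][i], y[i]) for l in range(L)))          # (8a), all l assumed
    pr_beta = sum((beta[i][t] - y[nxt][t]) ** 2 for t in range(T))           # (8b)
    con_q = sum((q[i][l] - q_bar[l]) ** 2
                + q[i][l] * (b[l] - sum(dot(a[l][j], y[j]) for j in others)
                             - dot(a[l][i], beta[prev]))
                for l in range(L))                                           # (8c)
    con_s = sum((s[i][t] - s_bar[t]) ** 2 + s[i][t] * (z - zeta[t])
                for t in range(T))                                           # (8d)
    return cost + pr_beta + con_q + con_s


# Hypothetical consensus profile: beta^i = y^{i+1}, identical s, zero constraint prices.
y = [[3.0, 2.0], [3.0, 2.0]]
beta = [[3.0, 2.0], [3.0, 2.0]]
s = [[1.0, 0.0], [1.0, 0.0]]
q = [[0.0], [0.0]]
a = [[[1.0, 0.0], [1.0, 0.0]]]   # one constraint: total first-slot demand <= 10
b = [10.0]
taxes = [tax(i, y, q, s, beta, [1.0, 1.0], 1.0, a, b) for i in range(2)]
print(taxes)  # [8.0, 8.0]
```

In this toy profile all penalty terms vanish and each user simply pays the unit prices plus the peak price on their own demand. The two taxes add up to 16, which equals the community cost from (1) for the same demands (aggregate demand (6, 4), unit prices 1, peak price 1).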

The main property we want from this mechanism is full implementation. We expect the allocation scheme under the NE of the mechanism-induced game to coincide with that of the original optimization problem. Full implementation can be shown in two steps. First, we show that, if there is a (pure strategy) NE, it must induce the optimal allocation. Then, we prove the existence of such (pure strategy) NE.

From the form of the tax functions, we can immediately obtain the following lemma.

Lemma 1.

At any NE, for each user i, the demand proxy $\beta_{t}^{i}$ is equal to the demand of their neighbor, i.e., $\beta_{t}^{i} = y_{t}^{i+1}$ for all t.

Proof.

Suppose that m is an NE, where there exists at least one user i whose message

β^{i}

does not agree with the next user’s demand, i.e.,

β^{i} \neq y^{i + 1}

. Say,

β_{t}^{i} \neq y_{t}^{i + 1}

for some t. Then, we can find a profitable deviation

\tilde{m}

that keeps everything other than

β^{i}

the same as m but modifies

β_{t}^{i}

with

{\tilde{β}}_{t}^{i} = y_{t}^{i + 1}

. Compare the payoff value

u_{i}

before and after the deviation:

\begin{matrix} u_{i} (\tilde{m}) - u_{i} (m) = & - {({\tilde{β}}_{t}^{i} - y_{t}^{i + 1})}^{2} + {(β_{t}^{i} - y_{t}^{i + 1})}^{2} \\ = & {(β_{t}^{i} - y_{t}^{i + 1})}^{2} > 0 . \end{matrix}

Thus, if there is some

β^{i} \neq y^{i + 1}

, user i can always construct another announcement

{\tilde{m}}^{i}

, such that user i gets a better payoff. □
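The deviation argument can be checked numerically. The sketch below isolates the proxy penalty, assuming (as the proof does) that the component β^{i} enters user i's payoff only through the quadratic term \sum_{t} (β_{t}^{i} - y_{t}^{i + 1})^{2}. The function names and the numbers are illustrative and do not come from the paper.

```python
def proxy_penalty(beta, y_next):
    """Quadratic penalty sum_t (beta_t - y_next_t)^2 paid by user i."""
    return sum((b - y) ** 2 for b, y in zip(beta, y_next))


def payoff_gain_from_matching(beta, y_next, t):
    """Payoff gain u_i(m~) - u_i(m) when user i resets beta_t to y_next_t.

    Every other message component is held fixed, so only the penalty changes.
    """
    deviated = list(beta)
    deviated[t] = y_next[t]
    return proxy_penalty(beta, y_next) - proxy_penalty(deviated, y_next)


beta = [0.3, 1.0]      # hypothetical proxy announced by user i
y_next = [0.5, 1.0]    # hypothetical demand announced by user i + 1
gain = payoff_gain_from_matching(beta, y_next, t=0)
```

The gain equals the squared mismatch at the corrected time slot, so it is strictly positive whenever the proxy and the neighbor's demand differ, and zero once they agree.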

It can be seen from Lemma 1 that the messages

β

play an important role in the mechanism. They appear in two places in the tax functions: first, in the expression of

ζ_{t}^{- i} = \sum_{j \neq i} y_{t}^{j} + β_{t}^{i - 1}

which is the total demand at time t used in user i’s tax function, and second, in the expression for excess demand

b^{l} - \sum_{j \neq i} a^{j, l} y^{j} - a^{i, l} β^{i - 1}

for the lth constraint. Note that we do not want user i to control these terms with their messages (specifically

y_{t}^{i}

) because they already control their allocation directly and this creates technical difficulties. Indeed, quoting the self-announced demand in the tax function raises the possibility of unexpected strategic moves for user i to obtain extra profit. Using the proxy

β^{i - 1}

instead of

y^{i}

eliminates user i’s control on their own slackness factor, while Lemma 1 guarantees that, at NE, these quantities become equal.

With the introduction of these proxies, we show in the following lemmas that, at NE, all KKT conditions required for the optimal solution are satisfied. First, we prove that primal feasibility (KKT 1) and complementary slackness (KKT 3) are ensured by the design of the penalty terms “pr”s and constraint-related terms “con”s, if we treat

q

and

RP (s, ζ)

as the Lagrange multipliers.

Lemma 2.

At any NE, users’ suggested prices are equal:

\begin{matrix} q^{i, l} & = q^{l}, \forall l \in L_{i} \forall i \in N, \\ s_{t}^{i} & = s_{t}, t = 1, \dots, T, \forall i \in N . \end{matrix}

Furthermore, users’ announced demand profile satisfies

y \in X

, and equal prices, together with the demand profile, satisfy complementary slackness:

\begin{matrix} q (A x - b) & = 0, \\ s_{t} (z - \sum_{i} y_{t}^{i}) & = 0, \forall t = 1, \dots, T, \end{matrix}

which implies

{RP}_{t}^{i} (s, ζ^{- i}) (z - \sum_{i} y_{t}^{i}) = 0, \forall t = 1, \dots, T,

(10)

where z is the peak demand during the billing period.

Proof.

The proof can be found in Appendix B. □

Dual feasibility (KKT 2) holds trivially by definition. We now show that stationarity condition (KKT 4) holds at NE by imposing a first-order condition on the partial derivatives of user i’s utility w.r.t. their message component

y_{t}^{i}

.

Lemma 3.

At NE, stationarity holds, i.e.,

\begin{matrix} {\dot{v}}_{t}^{i} ({\hat{x}}_{t}^{i} (m)) & = p_{t} + {RP}_{t}^{i} (s, ζ^{- i}) + \sum_{l \in L_{i}} q^{l} a_{t}^{i, l}, \end{matrix}

(11)

\begin{matrix} p_{0} & = \sum_{t = 1}^{T} {RP}_{t}^{i} (s, ζ^{- i}) . \end{matrix}

(12)

Proof.

The proof is in Appendix C. □

With Lemmas 1–3, it is straightforward to derive the first part of our result, i.e., efficiency of the allocation at any NE.

Theorem 1.

For the mechanism-induced game

G

, if NE exists, then the NE results in the same allocation as the optimal solution to the centralized problem (3a)–(3c).

Proof.

If

m^{*}

is an NE, from Lemmas 1 and 2, we know that, at NE,

β^{i *} = y^{i + 1}

and that all of the prices

q^{i *}

,

s^{i *}

, and all of the

ζ^{- i *}

are the same among all users

i \in N

. We denote these equal quantities by

y^{*}, q^{*}, s^{*}

and

ζ^{*}

.

Consider the solution

sol = (x, w, λ, μ) = (y^{*}, {max}_{t} \{ζ_{t}^{*}\}, q^{*}, RP (s^{*}, ζ^{*}))

. From Lemma 2, the solution

sol

satisfies (4a)–(4e) (primal feasibility and complementary slackness). From Lemma 3,

sol

satisfies (4f) and (4g) (stationarity). Dual feasibility (4c) holds because of the nonnegativity of

q

and

s

.

Therefore,

sol

satisfies all four KKT conditions, which means that the allocation

\hat{x} (m^{*})

is the optimal allocation. □

The following theorem shows the existence of NE.

Theorem 2.

For the mechanism-induced game

G

, there exists at least one NE.

Proof.

From the theory of convex optimization, we know that the optimal solution of (3a)–(3c) exists. Based on this solution, one can construct a message profile that satisfies all of the properties we present in Lemmas 1–3 and prove that there is no unilateral deviation for all users. The details are presented in Appendix D. □

Full implementation indicates that, if all users are willing to participate in the mechanism, the equilibrium outcome is the optimal allocation. For each user i, the payoff at NE is

\begin{matrix} u_{i} (m^{*}) & = v^{i} ({\hat{x}}^{i} (m^{*})) \\ & - \sum_{t = 1}^{T} \underbrace{(p_{t} + {RP}_{t}^{i} (s, ζ^{- i}) + \sum_{l \in L_{i}} q^{- i, l} a_{t}^{i, l})}_{\text{Aggregated unit price for } {\hat{x}}_{t}^{i}} {\hat{x}}_{t}^{i} (m^{*}) . \end{matrix}

(13)

In other words, the users pay for their own demands via the aggregated unit prices given by the consensus at NE. By counting the planner as a participant of the mechanism with utility

\sum_{i \in N} {\hat{t}}^{i} (m^{*}) - J (x^{*})

, strong budget balance is automatically achieved. However, two questions remain. Are the users willing to follow this mechanism, or would they rather not participate? Will the planner have to pay extra money to implement such a mechanism? The two theorems below answer these questions.

Theorem 3

(Individual Rationality for Users). Assume that agent i obtains

x^{i} = 0

and pays nothing if they choose not to participate in the mechanism. Then, at NE, participating in the mechanism is weakly better than not participating, i.e.,

u_{i} (m^{*}) \geq v^{i} (0) .

Proof.

The main idea for the proof of Theorem 3 is to find a message profile with

m^{- i *}

, in which user i’s payoff is

v^{i} (0)

, and then, we can argue that following NE is not worse since

m^{*}

is a best response to

m^{- i *}

. The details of the proof can be found in Appendix E. □

Theorem 4

(Individual Rationality for the Planner). At NE, the planner does not need to pay extra money for the mechanism:

\sum_{i \in N} {\hat{t}}^{i} (m^{*}) - J (\hat{x} (m^{*})) \geq 0 .

(14)

Moreover, by slight modification of the tax functions defined in (7), the total payment of users and the energy cost achieve a balance at NE:

\sum_{i \in N} {\tilde{t}}^{i} (m^{*}) - J (\hat{x} (m^{*})) = 0 .

(15)

Proof.

The individual rationality of the planner can be verified by substituting

m^{*}

in (14) directly. By redistributing the income of the planner back to the users in a certain way, the total payment of users is exactly

J (\hat{x} (m^{*}))

, and consequently, no money is left after paying the energy company. The details are left to Appendix F. □

4. Distributed Mechanism

In the previous mechanism, allocation functions and tax functions of users depend on the global message profile m. If one wants to compute the tax

{\hat{t}}^{i}

for a certain user i, all messages

m^{j}

for all

j \in N

are needed. Such mechanisms are not desirable for environments with communication constraints, where such a global message exchange is restricted. To tackle this problem, we provide a distributed mechanism, in which the calculation of the allocation and tax of a certain user depends only on the messages from the “available” users and therefore satisfies the communication constraints. In this section, we first introduce communication constraints using a message exchange network model. We then develop a distributed mechanism, which accommodates the communication constraints and preserves the desirable properties of the baseline centralized mechanism.

4.1. Message Exchange Network

In an environment with communication constraints, all of the users are organized in an undirected graph

GR = (N, E)

, where the set of nodes

N

is the set of users and the set of edges

E

indicates the accessibility to the message for each user. If

(i, j) \in E

, user i can access the message of user j, i.e., the message of j is available for user i when computing the allocation and tax of user i, and vice versa. Here, we state a mild requirement for the message exchange network:

Assumption 3.

The graph

GR

is a connected and undirected graph.

In fact, the mechanism we show works for the cases where

GR

is an undirected tree. Although an undirected connected graph is not necessarily a tree, we can always find a spanning tree of such a graph, so it is safe to consider the mechanism under the assumption that the given network has a tree structure. If that is not the case, the mechanism designer can select a spanning tree of the original message exchange network and design the mechanism based only on the tree instead of the whole graph (essentially, some of the connections of the original graph are never used for message exchanges).

The basic idea behind the decentralized modification of the baseline mechanism is intuitively straightforward. Looking at the tax function for user i in the centralized mechanism, we observe that several messages required do not come from i’s immediate neighbors. For this reason, we define new “summary” messages that are quoted by i’s neighbors and represent the missing messages. At the same time, for this to work, we add additional penalty terms that guarantee that the summary messages indeed represent the needed terms at NE.

Notice that, in the previous mechanism, user i is expected to announce a

β_{t}^{i}

equal to the demand of the next user

(i + 1)

. However, here, we might have

(i, i + 1) \notin E

, and owing to the communication constraint, we are not able to compare

β_{t}^{i}

with

y_{t}^{i + 1}

. Instead,

β_{t}^{i}

should be a proxy of the demand of user i’s direct neighbor. This motivates us to define the function

ϕ (i)

, where

ϕ (i) \in N (i)

;

N (i)

is the set of user i’s neighbors (excluding i); and

ϕ (i) = j

denotes that the proxy variable

β

used in the

{con}^{i, l} (m)

terms of user i’s tax function is provided by user j. In other words,

ϕ (i)

is a “helper” for user i who quotes a proxy of their demand whenever needed.

In the next part, we use the summaries of the demands to deal with the distributed issue. For the sake of convenience, we define

n (i, k)

as the nearest user to user k among the neighbors of user i and user i itself.

n (i, k)

is well-defined because of the tree structure. The proof is omitted here. The details can be found in ([72] Chapter 4, Section 7.1).
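The two graph constructions used in this subsection, selecting a spanning tree and evaluating n (i, k) on it, can be sketched as follows. This is a minimal illustration under our own assumptions: the graph is given as an adjacency dictionary, the spanning tree is obtained by breadth-first search (the paper does not prescribe a particular tree), and the names `spanning_tree` and `nearest_toward` are hypothetical.

```python
from collections import deque


def spanning_tree(adjacency, root):
    """Return a BFS spanning tree (as an adjacency dict) of a connected graph."""
    tree = {u: set() for u in adjacency}
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(adjacency[u]):
            if v not in seen:
                seen.add(v)
                tree[u].add(v)
                tree[v].add(u)
                queue.append(v)
    return tree


def nearest_toward(tree, i, k):
    """n(i, k): the member of N(i) ∪ {i} that is closest to k in the tree."""
    if i == k:
        return i
    # Search outward from k; the parent of i in this search is the
    # first hop on the unique tree path from i to k.
    parent = {k: None}
    queue = deque([k])
    while queue:
        u = queue.popleft()
        for v in tree[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent[i]


# Hypothetical network with one cycle (1-2-3) and a pendant user 4.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
tree = spanning_tree(graph, root=1)
```

On this example the edge (2, 3) is dropped, so user 2 reaches user 4 through user 1, which illustrates the remark that some connections of the original graph are never used.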

4.2. The Message Space

In the distributed mechanism, the message

m^{i}

in user i’s message space

M^{i}

is defined as

\begin{matrix} m^{i} = ({\{y_{t}^{i}\}}_{t = 1}^{T}, {\{q^{i, l}\}}_{l \in L}, {\{s_{t}^{i}\}}_{t = 1}^{T}, {\{β_{t}^{i, j} : ϕ (j) = i\}}_{t = 1}^{T}, \\ {\{n^{i, j, l} : j \in N (i)\}}_{l \in L}, {\{ν_{t}^{i, j} : j \in N (i)\}}_{t = 1}^{T}) . \end{matrix}

(16)

Here,

n^{i, j, l}

is a summary for demands of users related to constraint l and connected to user i via j, as depicted in Figure 1. Message

ν_{t}^{i, j}

serves a similar role for the peak demand.

4.3. The Allocation and Tax Functions

The allocation functions

{\hat{x}}_{t}^{i} (m) = y_{t}^{i}

are still straightforward. There are some modifications on tax functions, including adjustments on prices, consensus of new variables, and terms for complementary slackness.

\begin{matrix} {\hat{t}}^{i} (m) & = {cost}^{i} (m) + \sum_{l} (pr n^{i, l} (m) + {con}^{i, l} (m)) \\ & + \sum_{t} (pr β_{t}^{i} (m) + pr ν_{t}^{i} (m) + {con}_{t}^{i} (m)), \end{matrix}

(17)

where

\begin{matrix} {cost}^{i} (m) & = \sum_{t = 1}^{T} (p_{t} + {RP}_{t}^{- i} (s^{- i}, ζ^{- i})) {\hat{x}}_{t}^{i} (m) + \sum_{l \in L_{i}} q^{- i, l} a^{i, l} {\hat{x}}^{i} (m), \end{matrix}

(18a)

\begin{matrix} {con}^{i, l} (m) & = {(q^{i, l} - q^{- i, l})}^{2} + q^{i, l} (b^{l} - \sum_{j \in N (i)} f^{i, j, l} - a^{i, l} β^{ϕ (i), i}), \end{matrix}

(18b)

\begin{matrix} {con}_{t}^{i} (m) & = {(s_{t}^{i} - s_{t}^{- i})}^{2} + s_{t}^{i} (z^{- i} - ζ_{t}^{- i}), \end{matrix}

(18c)

\begin{matrix} pr n^{i, l} (m) & = \sum_{j \in N (i)} {(n^{i, j, l} - f^{i, j, l})}^{2}, \end{matrix}

(18d)

\begin{matrix} pr β_{t}^{i} (m) & = \sum_{j : ϕ (j) = i} {(β_{t}^{i, j} - y_{t}^{j})}^{2}, \end{matrix}

(18e)

\begin{matrix} pr ν_{t}^{i} (m) & = \sum_{j \in N (i)} {(ν_{t}^{i, j} - f_{t}^{i, j})}^{2} \end{matrix}

(18f)

\begin{matrix} f^{i, j, l} & = a^{j, l} y^{j} + \sum_{h \in N (j) \setminus \{i\}} n^{j, h, l}, \end{matrix}

(18g)

\begin{matrix} f_{t}^{i, j} & = y_{t}^{j} + \sum_{h \in N (j) \setminus \{i\}} ν_{t}^{j, h} . \end{matrix}

(18h)

and

\begin{matrix} s_{t}^{- i} & = \frac{1}{| N (i) |} \sum_{j \in N (i)} s_{t}^{j} \forall i \forall t, \end{matrix}

(19a)

\begin{matrix} q^{- i, l} & = \frac{1}{| N (i) |} \sum_{j \in N (i)} q^{j, l} \forall i \forall l, \end{matrix}

(19b)

\begin{matrix} ζ_{t}^{- i} & = \sum_{j \in N (i)} f_{t}^{i, j} + β_{t}^{ϕ (i), i}, \forall i \forall t, \end{matrix}

(19c)

\begin{matrix} z^{- i} & = max_{t} \{ζ_{t}^{- i}\} \forall i . \end{matrix}

(19d)

In order to intuitively see how the decentralized mechanism works, take as an example the term

{con}^{i, l} (m)

in (18b), which is a modified version of (8c), repeated here for convenience

{con}^{i, l} (m) = {(q^{i, l} - q^{- i, l})}^{2} + q^{i, l} (b^{l} - \sum_{j \neq i} a^{j, l} y^{j} - a^{i, l} β^{i - 1})

related to the l-th constraint. Other than the quadratic term, which is identical in both expressions, the difference between the centralized and decentralized versions is in the expressions

\sum_{j \neq i} a^{j, l} y^{j} + a^{i, l} β^{i - 1}

and

\sum_{j \in N (i)} f^{i, j, l} + a^{i, l} β^{ϕ (i), i}

, respectively. The second term in each of these expressions relates to the proxy

β^{i - 1}

, which in the decentralized version is substituted by the proxy

β^{ϕ (i), i}

due to the fact that the proxy for

y^{i}

is not provided by user

i - 1

anymore but is provided by user i’s helper

ϕ (i)

. The first term,

\sum_{j \neq i} a^{j, l} y^{j} = \sum_{j \in N (i)} a^{j, l} y^{j} + \sum_{j \notin N (i) \cup \{i\}} a^{j, l} y^{j}

, which cannot be directly evaluated in the decentralized version (since it depends on messages outside the neighborhood of i) is now evaluated as

\sum_{j \in N (i)} f^{i, j, l} = \sum_{j \in N (i)} a^{j, l} y^{j} + \sum_{j \in N (i)} \sum_{h \in N (j) \setminus \{i\}} n^{j, h, l}

. It should now be clear that the role of the new messages

n^{j, h, l}

quoted by the neighbors

j \in N (i)

of i is to summarize the total demands of other users. Furthermore, the additional quadratic penalty terms enforce this equality at NE. This idea is made precise in the next section.

4.4. Properties

It is clear that this mechanism is distributed, since all of the messages needed for the allocation and tax functions of user i come from neighborhood

N (i)

and the user themselves. Due to the way the messages and taxes are designed, the proposed mechanism satisfies properties similar to those in Lemmas 2 and 3, and, consequently, Theorems 1 and 2. The reason is that the components n and

ν

behave in the same manner as the absent

y^{h}, h \notin N (i)

in user i’s functions at NE, so the proofs of the properties of the previous mechanism carry over. We elaborate on these properties in the following.

Lemma 4.

At any NE, we have the following results regarding the proxy messages:

\begin{matrix} β_{t}^{i, j} = & y_{t}^{j}, & \forall j : ϕ (j) = i, \end{matrix}

(20)

\begin{matrix} n^{i, j, l} = & a^{j, l} y^{j} + \sum_{h \in N (j) \setminus \{i\}} n^{j, h, l}, & \forall i, \forall j \in N (i), \forall l \in L, \end{matrix}

(21)

\begin{matrix} ν_{t}^{i, j} = & y_{t}^{j} + \sum_{h \in N (j) \setminus \{i\}} ν_{t}^{j, h}, & \forall t, \forall i, \forall j \in N (i) . \end{matrix}

(22)

Proof.

β^{i, j}, n^{i, j, l}

, and

ν_{t}^{i, j}

only appear in the quadratic penalty terms of user i’s tax function. Therefore, for any user i, the only choice to minimize the tax is to bid

β^{i, j}, n^{i, j, l}

, and

ν_{t}^{i, j}

by (20)–(22). □

Now, based on the structure of the message exchange network, we have

Lemma 5.

At any NE,

n^{i, j, l}

and

ν_{t}^{i, j}

satisfy

\begin{matrix} n^{i, j, l} = & \sum_{t = 1}^{T} \sum_{h : n (i, h) = j} a_{t}^{h, l} y_{t}^{h}, \forall i \in N, \forall j \in N (i), \forall l \in L, \end{matrix}

(23)

\begin{matrix} ν_{t}^{i, j} = & \sum_{h : n (i, h) = j} y_{t}^{h}, \forall i \in N, \forall j \in N (i), \forall t \in T . \end{matrix}

(24)

Proof.

The proof is presented in Appendix G. □

With Lemma 5, we immediately obtain the following results.

Lemma 6.

At any NE, for every user i, we have

\begin{matrix} \sum_{j \in N (i)} f^{i, j, l} = & \sum_{j \in N \setminus \{i\}} a^{j, l} y^{j}, \forall l \in L_{i}, \end{matrix}

(25)

\begin{matrix} \sum_{j \in N (i)} f_{t}^{i, j} = & \sum_{j \in N \setminus \{i\}} y_{t}^{j}, \forall t \in T . \end{matrix}

(26)

Proof.

At NE, by direct substitution,

\begin{matrix} \sum_{j \in N (i)} f^{i, j, l} & = \sum_{j \in N (i)} (a^{j, l} y^{j} + \sum_{h \in N (j) \setminus \{i\}} n^{j, h, l}) \\ & = \sum_{j \in N (i)} (a^{j, l} y^{j} + \sum_{h \in N (j) \setminus \{i\}} \sum_{k : n (j, k) = h} a^{k, l} y^{k}) \\ & = \sum_{j \in N (i)} (\sum_{h : n (i, h) = j} a^{h, l} y^{h}) = \sum_{j \in N \setminus \{i\}} a^{j, l} y^{j}, \forall l \in L_{i} . \end{matrix}

The third equality holds by the fact that the users in set

\{k \mid h \in N (j) \setminus \{i\}, n (j, k) = h\}

are the ones that are not in the subtree starting from a single branch

(j, i)

with root j, which is exactly

\{k \mid n (i, k) = j\} \setminus \{j\}

.

Equation (26) holds for a similar reason. □
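The recursion (22) and the identity (26) can be illustrated on a small tree. The sketch below treats a single time slot, so each demand is a scalar, and resolves the summaries in synchronous rounds starting from the leaves. The function name `summarize`, the five-user tree, and the demand values are assumptions made for illustration only; the loop terminates because the network is a tree.

```python
def summarize(tree, y):
    """Solve nu[i, j] = y[j] + sum of nu[j, h] over h in N(j) minus {i}.

    A directed summary (i, j) is computed once every summary (j, h) with
    h != i that it depends on is available, mirroring a leaf-to-root sweep.
    Returns the summaries and the number of synchronous rounds used.
    """
    nu = {}
    pending = {(i, j) for i in tree for j in tree[i]}
    rounds = 0
    while pending:
        # Decide readiness from the summaries known before this round.
        ready = {(i, j) for (i, j) in pending
                 if all((j, h) in nu for h in tree[j] if h != i)}
        for (i, j) in ready:
            nu[(i, j)] = y[j] + sum(nu[(j, h)] for h in tree[j] if h != i)
        pending -= ready
        rounds += 1
    return nu, rounds


# Hypothetical tree: 1 - 2 - 4 - 5, with user 3 also attached to user 2.
tree = {1: {2}, 2: {1, 3, 4}, 3: {2}, 4: {2, 5}, 5: {4}}
y = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0, 5: 5.0}
nu, rounds = summarize(tree, y)

# Quantity on the left-hand side of (26) for each user i.
seen_by = {i: sum(nu[(i, j)] for j in tree[i]) for i in tree}
```

For every user i, `seen_by[i]` equals the total demand of all other users, even though only messages from the neighbors of i are read. On this tree the sweep needs three rounds, which matches the number of edges on its longest path.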

Lemma 6 plays a similar role to Lemma 1. With Lemma 6, the properties in Lemmas 2 and 3 can be reproduced in the distributed mechanism. We then obtain the following theorem.

Theorem 5.

For the mechanism-induced game

G

, NE exists. Furthermore, any NE of game

G

induces the optimal allocation.

Proof.

By substituting (25) and (26) into (17), we obtain, at equilibrium, exactly the same form of tax function as in the centralized mechanism, so the results of Lemmas 2 and 3 carry over. We conclude that any NE induces the optimal allocation. The existence of NE can be proved by a construction similar to that of Theorem 2. □

As was true in the baseline centralized mechanism, in the distributed case, the planner may also have concerns about whether the users have an incentive to participate and whether the mechanism requires external sources of funds to maintain the balance. As it turns out, Theorems 3 and 4 still hold here. As a result, the users are better off joining the mechanism, and the market has a balanced budget. The proofs and the construction of the subsidies can be performed in a manner similar to the centralized case and therefore are omitted.

5. Learning Algorithm

5.1. A Learning Algorithm for the Centralized Mechanism

The property of full implementation ensures that social welfare maximization can be reached if all participants reach NE in the mechanism-induced game and no one obtains a unilateral profitable deviation at NE. Nevertheless, it is troublesome for participants to anticipate NE as being the outcome if none of them knows (or can calculate) NE without knowledge of other users’ utilities. To settle this issue, one can design a learning algorithm to help participants learn the NE in an online fashion. In this section, we present such a learning algorithm for the centralized mechanism discussed in Section 3. Instead of using Assumption 1, here, we make a stronger assumption in order to obtain a convergent algorithm.

Assumption 4.

All of the utility functions

v_{t}^{i} (\cdot)

s are proper, twice differentiable concave functions with δ-strong concavity.

Here,

δ

-strong concavity of a function

g (\cdot)

is defined by the

δ

-strong convexity of

- g (\cdot)

. A function

f (\cdot)

is strongly convex with parameter

δ

if

f (y) \geq f (x) + \nabla f {(x)}^{T} (y - x) + \frac{δ}{2} {\| y - x \|}^{2} .
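As an illustration of this definition (an example of ours, not part of the original argument), consider a logarithmic utility g (x) = c \ln (2 + x) with c > 0, restricted to an interval - 1 \leq x \leq \bar{x}. The constant c and the upper bound \bar{x} are assumptions made only for this sketch. Bounding the second derivative gives a strong-concavity parameter directly:

```latex
g''(x) = -\frac{c}{(2 + x)^{2}} \leq -\frac{c}{(2 + \bar{x})^{2}}
\quad \text{for } -1 \leq x \leq \bar{x},
\qquad \text{so } -g \text{ is } \delta\text{-strongly convex with }
\delta = \frac{c}{(2 + \bar{x})^{2}} .
```

The bound degenerates to zero as \bar{x} grows, so for utilities of this type the parameter δ depends on the demand being bounded above.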

The design of the learning algorithm involves three steps. First, we find the relation between NE and the optimal solution of the original optimization problem. This step was performed in the proof of Theorem 1: we see in NE that

y^{*}

coincides with

x^{*}

in the optimal allocation, that

q^{i *}

equals

λ^{*}

, and that the components of

s^{i *}

are proportional to the components of

μ^{*}

. Then, by Slater’s condition, strong duality holds here, so we connect the Lagrange multipliers

λ^{*}, μ^{*}

with the optimal solution of the dual problem. Due to the strong concavity of the utilities and stationarity, given

λ^{*}

and

μ^{*}

, the optimal allocation

x^{*}

can be uniquely determined. Finally, if we can find an algorithm to solve the dual problem, the design is completed.

The first two steps are straightforward. For the third one, we can see that the dual problem is also a convex optimization problem, so projected gradient descent (PGD) is one of the choices for the learning algorithm. The proof of convergence of PGD is not trivial. In the proof developed in [29], the convergence of PGD holds when (a) the objective function is

β

-smooth and (b) the feasible set is closed and convex. In Appendix H, we show that (a) is satisfied by Assumption 4. To check (b), we need to find a feasible set for the dual variables. Since in PGD of the dual problem, the gradient of the dual function turns out to be a combination of functions of the form

{({\dot{v}}_{t}^{i})}^{- 1} (\cdot)

, the feasible set should satisfy two requirements: first, all of the elements are in the domain of the dual function’s gradient in order to make every iteration valid; second,

(λ^{*}, μ^{*})

is in the feasible set so that we do not miss it. With these requirements in mind, we make Assumption 5 and construct a feasible set for the dual problem based on that.

Assumption 5.

For each utility

v_{t}^{i} (\cdot)

, there exist

{\underline{r}}_{t}^{i}, {\bar{r}}_{t}^{i} \in R

satisfying

1. $\forall x \in X$, ${\dot{v}}_{t}^{i} (x_{t}^{i}) \in [{\underline{r}}_{t}^{i}, {\bar{r}}_{t}^{i}]$,
2. $\forall p \in [{\underline{r}}_{t}^{i}, {\bar{r}}_{t}^{i}]$, $\exists x_{t}^{i} \in R \text{ s.t. } {\dot{v}}_{t}^{i} (x_{t}^{i}) = p$.

Before we explain this assumption, we define

$\underline{r} = {[{\underline{r}}_{1}^{1}, \dots, {\underline{r}}_{T}^{1}, \dots, {\underline{r}}_{T}^{N}]}^{T}$, $\bar{r} = {[{\bar{r}}_{1}^{1}, \dots, {\bar{r}}_{T}^{1}, \dots, {\bar{r}}_{T}^{N}]}^{T}$,
$p = {[p_{1} \dots p_{T}]}^{T}$ , $\tilde{p} = 1_{N} \otimes p$ and

$\tilde{A} = (\begin{matrix} A \\ 1_{N}^{T} \otimes I_{T} \end{matrix}), \tilde{λ} = (\begin{matrix} λ \\ μ \end{matrix}), \tilde{p} = 1_{N} \otimes p .$

(27)

where ⊗ represents the Kronecker product of matrices. Then, we define a set of proper prices $P$ as the feasible set for the dual problem:

$P = \{\tilde{λ} = (λ, μ) \geq 0 : \underline{r} \leq {\tilde{A}}^{T} \tilde{λ} + \tilde{p} \leq \bar{r}, 1_{T}^{T} μ = p_{0}\} .$

Observe that, by stationarity, the

((i - 1) T + t)

th entry of

{\tilde{A}}^{T} \tilde{λ} + \tilde{p}

equals

{\dot{v}}_{t}^{i} (x_{t}^{i *})

in the optimal solution. Consequently, Assumption 5 implies two things: first,

{\tilde{λ}}^{*} \in P

; second, all

{\tilde{A}}^{T} \tilde{λ} + \tilde{p}

can be a vector of

{\dot{v}}_{t}^{i}

s on some

x \in R^{N T}

if

\tilde{λ} \in P

. Hence, with Assumption 5, it is safe to narrow down the feasible set of the dual problem to

P

without changing the optimal solution. Furthermore, for all of the price vectors in

P

,

{({\dot{v}}_{t}^{i})}^{- 1} (\cdot)

in PGD can be evaluated. Back to condition (b) stated above, since

P

is closed and convex, PGD is convergent in this case.

Based on all of the assumptions and the PGD method, we propose Algorithm 1 as a learning algorithm for the NE of the centralized mechanism.

The convergence of PGD yields the convergence of the proposed learning algorithm:

Theorem 6.

Choose a step size

α \leq δ^{'} / ∥ A ∥

, where

∥ A ∥

is A’s spectral norm and

δ^{'}

is the parameter of strong concavity of the centralized objective function. As the number of iterations K grows, the distance between the computed price vector

(q (K), s (K))

and the optimal price vector

(q^{*}, s^{*})

is non-increasing. Furthermore,

{lim}_{K \to \infty} m (K) = m^{*}

, where

m^{*}

is the NE.

Proof.

See Appendix H. □

Algorithm 1: Learning algorithm for the centralized mechanism.

5.2. A Learning Algorithm for the Distributed Mechanism

Algorithm 1 cannot be applied in a distributed environment because lines 4 and 5 for the price adjustments require global user messages. Luckily, the price adjustment is the only part requiring messages from other users during an iteration. Based on this fact, the main modification of the algorithm is to figure out a way for users to obtain the aggregated demand of each constraint, which was indeed the same problem we needed to address when we modified the centralized mechanism to a distributed one. Naturally, we can utilize the proxies

ν, n

to help exchange the necessary messages, as we did in the design of the distributed mechanism. If the proxies

ν

and n behave the same way as described in Lemma 5, then we can modify lines 4 and 5 in Algorithm 1 as follows and expect the same output as that of the original algorithm:

\begin{matrix} {\tilde{q}}^{i, l} (k + 1) = q^{i, l} (k) - α (b^{l} - \sum_{j \in N (i)} (a^{j, l} y^{j} (k) + \sum_{h \in N (j) \setminus \{i\}} n^{j, h, l} (k)) - a^{i, l} y^{i} (k)), \forall i \in N, l \in L, \end{matrix}

(28)

{\tilde{s}}_{t}^{i} (k + 1) = s_{t}^{i} (k) + α (\sum_{j \in N (i)} (y_{t}^{j} (k) + \sum_{h \in N (j) \setminus \{i\}} ν_{t}^{j, h} (k)) + y_{t}^{i} (k)), \forall i \in N, t \in T .

(29)

To make sure the proxies behave the same way as described in Lemma 5, some additional message exchanges are required in the beginning of each iteration before price adjustments. We design the maintenance algorithm for

n, ν

in Algorithms 2 and 3.

It is worth noting that a deadlock does not occur during the proxy maintenance and that the proxies after maintenance behave the way we expect in Lemma 5. The reason is that the distributed mechanism uses an acyclic subgraph of the network, so there are no loops. Notice that the proof of Lemma 5 (see Appendix G) uses an iterative process to justify the statement. Indeed, what happens during the maintenance process described here is exactly the same as the process depicted in the proof of Lemma 5. The termination signal is used to notify users when to proceed to the price adjustment step.

With the assistance of proxy maintenance Algorithms 2 and 3, we arrive at an extended version of the original learning algorithm for the distributed mechanism stated as Algorithm 4.

Algorithm 2: Maintenance algorithm for proxy

n^{i, j, l}

.

Algorithm 3: Maintenance algorithm for proxy

ν_{t}^{i, j}

.

With the support of Lemma 5 and Theorem 6, this modified learning algorithm is also guaranteed to converge to NE in the distributed mechanism. Nevertheless, this algorithm is not as fast as the original one. We can observe that one user can do nothing but wait if any of the dependencies for their calculations is absent. As a result, this scheme has a mixed sequential and concurrent operation. To be more precise, the extra time for the modified algorithm is proportional to the length of the longest path of the chosen spanning tree in the network (one can derive this result following the procedures detailed in Lemma 5 by focusing on the steps for the transmission of the message from one side of the network to another side). Thus, the algorithm time complexity is impacted by the network structure.

Algorithm 4: Learning algorithm for user i in the distributed mechanism.

6. A Concrete Example

To give a sense of how the two mechanisms and the learning algorithm work, we provide a simple non-trivial example here. We first present the original centralized problem for the example, and then identify the NE of the centralized mechanism based on the properties we found. For the distributed mechanism, we illustrate how the proxy variables at NE are determined with a simple example of a message exchange network. Lastly, we implement the learning algorithm for the centralized mechanism.

6.1. The Demand Management Optimization Problem

In the energy community, assume that there are three users in the user set

N = \{1, 2, 3\}

and

T = 2

days in a billing period. Suppose user i on day t has the following utility function:

v_{t}^{i} (x_{t}^{i}) = i \cdot t \cdot ln (2 + x_{t}^{i}) .

Set

p_{1} = 0.1

,

p_{2} = 0.2

, and the peak price

p_{0} = 0.05

. We adopt the following centralized problem as a concrete example:

\begin{matrix} \underset{x}{maximize} & \sum_{t = 1}^{2} \sum_{i = 1}^{3} i \cdot t \cdot \ln (2 + x_{t}^{i}) - J (x) \\ subject to & x_{t}^{i} \geq - 1, i = 1, 2, 3, t = 1, 2, \\ & \sum_{t = 1}^{2} (x_{t}^{1} + x_{t}^{2} + x_{t}^{3}) \leq 2, \end{matrix}

where

J (x) = 0.1 \cdot \sum_{t = 1}^{2} t \cdot (\sum_{i = 1}^{3} x_{t}^{i}) + 0.05 \cdot {max}_{t} \{\sum_{i = 1}^{3} x_{t}^{i}\}

.

The exact solution to this problem is

x_{1}^{1 *} = - 1

,

λ^{7 *} = (249 + \sqrt{106201}) / 520

,

μ_{2}^{*} = 0.05

, and

x_{t}^{i *} = i \cdot t / (λ^{7 *} + p_{t} + μ_{t}^{*}) - 2

for

(i, t) \neq (1, 1)

, with

λ^{1 *} = λ^{7 *} + p_{1} - 1 / (x_{1}^{1 *} + 2)

,

λ^{l *} = 0

for

l = 2, \dots, 6

, and

μ_{1}^{*} = 0

. Interested readers can verify this solution using the KKT conditions. For the sake of convenience, we adopt the following approximate results:

(x_{1}^{1 *}, x_{2}^{1 *}, x_{1}^{2 *}, x_{2}^{2 *}, x_{1}^{3 *}, x_{2}^{3 *}) = (- 1.0000, - 0.5246, - 0.3410, 0.9508, 0.4885, 2.4263) .

The lower bound constraint for

x_{1}^{1}

and the upper bound constraint for the sum are active. Thus, according to KKT conditions,

λ^{l *} = 0

for

l = 2, \dots, 6

,

λ^{1 *} = 0.2056

, and

λ^{7 *} = 1.1056

by stationarity. The total demands of Day 1 and Day 2 are

- 0.8525

and

2.8525

, respectively, so Day 2 has the peak demand

w^{*} = 2.8525

, Day 1 charges no peak price (

μ_{1}^{*} = 0

), and Day 2 has an extra unit peak price

μ_{2}^{*} = 0.05

.
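The reported values can be reproduced with a few lines of arithmetic. The sketch below evaluates the closed-form multiplier of the budget constraint, recovers each demand from stationarity for the utility i \cdot t \cdot \ln (2 + x_{t}^{i}), and then checks the active constraints. The variable names are ours, and the active set (lower bound of user 1 on Day 1, the budget constraint, and the Day 2 peak) is taken from the text rather than derived.

```python
import math

p = {1: 0.1, 2: 0.2}   # unit prices p_t
p0 = 0.05              # peak price
mu = {1: 0.0, 2: p0}   # peak multipliers: only Day 2 is the peak day
lam7 = (249 + math.sqrt(106201)) / 520   # multiplier of the budget constraint

x = {}
for i in (1, 2, 3):
    for t in (1, 2):
        # Stationarity: i * t / (2 + x) = lam7 + p_t + mu_t
        x[(i, t)] = i * t / (lam7 + p[t] + mu[t]) - 2
x[(1, 1)] = -1.0       # the lower bound x >= -1 is active for user 1 on Day 1

# Multiplier of the active lower bound, again from stationarity.
lam1 = lam7 + p[1] + mu[1] - 1 / (x[(1, 1)] + 2)

day_total = {t: sum(x[(i, t)] for i in (1, 2, 3)) for t in (1, 2)}
budget_used = sum(x.values())
```

With these numbers, `budget_used` equals the budget of 2 up to rounding, `day_total[2]` exceeds `day_total[1]`, and `lam1` is positive, which is consistent with the active set assumed above.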

6.2. The Centralized Mechanism

For this example, in the centralized mechanism, user i needs to choose their message

m^{i}

with the following components:

m^{i} = (y_{1}^{i}, y_{2}^{i}, {\{q^{i, l}\}}_{l = 1}^{7}, s_{1}^{i}, s_{2}^{i}, β_{1}^{i}, β_{2}^{i}) .

For the sake of brevity, let us take user 1 for example. In this problem setting, user 1 needs to report their demands for two days (

y_{1}^{1}, y_{2}^{1}

); suggest a set of prices for constraints 1–7 (

q^{1, 1}, \dots, q^{1, 7}

), suggest unit peak prices for two days (quantities

s_{1}^{1}

and

s_{2}^{1}

do not necessarily sum up to

p_{0} = 0.05

); and lastly, provide proxies

β_{1}^{1}

and

β_{2}^{1}

for user 2’s demands.

User 1’s tax function is

\begin{matrix} {\hat{t}}^{1} (m) & = \sum_{t = 1}^{2} (p_{t} + {RP}_{t}^{1} (s^{- 1}, ζ^{- 1})) y_{t}^{1} - q^{- 1, 1} y_{1}^{1} - q^{- 1, 2} y_{2}^{1} + q^{- 1, 7} (y_{1}^{1} + y_{2}^{1}) \\ + \sum_{t = 1}^{2} {(β_{t}^{1} - y_{t}^{2})}^{2} + {(q^{1, 1} - q^{- 1, 1})}^{2} + q^{1, 1} (1 + β_{1}^{3}) + {(q^{1, 2} - q^{- 1, 2})}^{2} + q^{1, 2} (1 + β_{2}^{3}) \\ + \sum_{l = 3}^{6} {(q^{1, l} - q^{- 1, l})}^{2} + q^{1, 3} (1 + y_{1}^{2}) + q^{1, 4} (1 + y_{2}^{2}) + q^{1, 5} (1 + y_{1}^{3}) + q^{1, 6} (1 + y_{2}^{3}) \\ + {(q^{1, 7} - q^{- 1, 7})}^{2} + q^{1, 7} (2 - y_{1}^{2} - y_{2}^{2} - y_{1}^{3} - y_{2}^{3} - β_{1}^{3} - β_{2}^{3}) \\ + {(s_{1}^{1} - s_{1}^{- 1})}^{2} + s_{1}^{1} (z^{- 1} - ζ_{1}^{- 1}) + {(s_{2}^{1} - s_{2}^{- 1})}^{2} + s_{2}^{1} (z^{- 1} - ζ_{2}^{- 1}) \end{matrix}

where

\begin{matrix} q^{- 1, l} & = (q^{2, l} + q^{3, l}) / 2, \\ s_{t}^{- 1} & = (s_{t}^{2} + s_{t}^{3}) / 2, \\ ζ_{t}^{- 1} & = β_{t}^{3} + y_{t}^{2} + y_{t}^{3}, \\ z^{- 1} & = max \{ζ_{1}^{- 1}, ζ_{2}^{- 1}\}, \end{matrix}

and, according to definitions (6a) and (6b) of the radial pricing operator, we have

{RP}_{t}^{1} (s^{- 1}, ζ^{- 1}) = \begin{cases} \frac{s_{t}^{- 1}}{s_{1}^{- 1} + s_{2}^{- 1}} p_{0}, & \text{if } s_{1}^{- 1} + s_{2}^{- 1} > 0, \\ p_{0}, & \text{if } s_{1}^{- 1} = s_{2}^{- 1} = 0 \text{ and } ζ_{t}^{- 1} > ζ_{t^{'}}^{- 1} \ (t^{'} \neq t), \\ p_{0} / 2, & \text{if } s_{1}^{- 1} = s_{2}^{- 1} = 0 \text{ and } ζ_{t}^{- 1} = ζ_{t^{'}}^{- 1} \ (t^{'} \neq t), \\ 0, & \text{if } s_{1}^{- 1} = s_{2}^{- 1} = 0 \text{ and } ζ_{t}^{- 1} < ζ_{t^{'}}^{- 1} \ (t^{'} \neq t) . \end{cases}
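The case distinction above translates directly into a small function. The following is a minimal sketch for the two-day example only; the function name and argument layout are illustrative, and it assumes that the suggested peak prices are nonnegative, so that a zero sum means both suggestions are zero.

```python
def radial_pricing(s_others, zeta_others, p0):
    """Split the unit peak price p0 across two days.

    s_others    -- (s_1^{-i}, s_2^{-i}), averaged peak-price suggestions of the other users
    zeta_others -- (zeta_1^{-i}, zeta_2^{-i}), total demands seen by user i on each day
    """
    total = s_others[0] + s_others[1]
    if total > 0:
        # Rescale the suggestions so that the two daily peak prices sum to p0.
        return tuple(s / total * p0 for s in s_others)
    # Both suggestions are zero: charge the day with the larger total demand.
    z1, z2 = zeta_others
    if z1 > z2:
        return (p0, 0.0)
    if z1 < z2:
        return (0.0, p0)
    return (p0 / 2, p0 / 2)
```

For instance, `radial_pricing((0.0, 0.2), (-0.85, 2.85), 0.05)` puts the whole peak price on Day 2, and the same happens through the tie-breaking branches when both suggestions are zero.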

From Theorem 1, we know that, at NE, user 1’s message

m^{1 *}

is such that

y^{1}

corresponds to the optimal solution

x^{1 *}

,

q^{1}

equals the optimal Lagrange multiplier

λ^{*}

,

β^{1}

equals

y^{2}

, and finally

s^{1}

is proportional to the Lagrange multiplier

μ^{*}

.
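With the optimal solution and multipliers reported at the start of this section, the characterization above can be written out for user 1. The scalar c below is an assumption that simply expresses "proportional"; the text does not pin down its value.

```latex
\begin{aligned}
y^{1 *} &= (-1.0000,\ -0.5246), \\
(q^{1,1 *}, \dots, q^{1,7 *}) &= (0.2056,\ 0,\ 0,\ 0,\ 0,\ 0,\ 1.1056), \\
\beta^{1 *} &= (y_{1}^{2 *}, y_{2}^{2 *}) = (-0.3410,\ 0.9508), \\
s^{1 *} &= c \cdot (0,\ 0.05) \quad \text{for some scalar } c \geq 0 .
\end{aligned}
```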

6.3. The Distributed Mechanism

In this subsection, we first describe how the message spaces are modified relative to the centralized mechanism and then show how the newly introduced components n and

ν

work. The specific NE can be determined in a similar way to that of the centralized mechanism and is therefore omitted.

Assume that the energy community has communication constraints with the message exchange network depicted in Figure 2. Apart from the network topology, the

ϕ

-relation that indicates the responsibility of proxy

β

is also an important part of the distributed mechanism. Here, we set

ϕ (1) = 2, ϕ (2) = 1, ϕ (3) = 2

. Then, for proxy variables

β^{ϕ (i), i}, i = 1, 2, 3

,

β^{2, 1}

in user 1’s tax is provided by user 2,

β^{1, 2}

in user 2’s tax is provided by user 1, and

β^{2, 3}

in user 3’s tax is provided by user 2.

For this message exchange network, the message components for each user are

\begin{matrix} m^{1} = & (y_{1}^{1}, y_{2}^{1}, \{q^{1, l}\}_{l = 1}^{7}, \{s_{t}^{1}\}_{t = 1}^{2}, \{β_{t}^{1, 2}\}_{t = 1}^{2}, \{n^{1, 2, l}\}_{l = 1}^{7}, \{ν_{t}^{1, 2}\}_{t = 1}^{2}), \\ m^{2} = & (y_{1}^{2}, y_{2}^{2}, \{q^{2, l}\}_{l = 1}^{7}, \{s_{t}^{2}\}_{t = 1}^{2}, \{β_{t}^{2, 1}\}_{t = 1}^{2}, \{β_{t}^{2, 3}\}_{t = 1}^{2}, \\ & \{n^{2, 1, l}\}_{l = 1}^{7}, \{n^{2, 3, l}\}_{l = 1}^{7}, \{ν_{t}^{2, 1}\}_{t = 1}^{2}, \{ν_{t}^{2, 3}\}_{t = 1}^{2}), \\ m^{3} = & (y_{1}^{3}, y_{2}^{3}, \{q^{3, l}\}_{l = 1}^{7}, \{s_{t}^{3}\}_{t = 1}^{2}, \{n^{3, 2, l}\}_{l = 1}^{7}, \{ν_{t}^{3, 2}\}_{t = 1}^{2}) . \end{matrix}

Therefore, in the distributed mechanism, users are still required to provide their demands

y

, suggested unit prices

q

, and suggested unit peak prices

s

. Unlike in the centralized mechanism, there are no

β

among user 3’s message components, while user 2 needs to provide two

β

s, namely

β^{2, 1}, β^{2, 3}

. In addition, for each constraint l, every user needs to announce variable n to each of their neighbors; for each day t, every user also needs to provide variable

ν

to each of their neighbors.

For the rest of this subsection, we focus on user 3 and consider how the n variables play their roles in the tax evaluation. With this message exchange network, we can write down user 3’s tax function explicitly:

\begin{matrix} {\hat{t}}^{3} (m) & = \sum_{t = 1}^{2} (p_{t} + {RP}_{t}^{3} (s^{- 3}, ζ^{- 3})) y_{t}^{3} - q^{- 3, 5} y_{1}^{3} - q^{- 3, 6} y_{2}^{3} + q^{- 3, 7} (y_{1}^{3} + y_{2}^{3}) \\ & + \sum_{l = 1}^{7} pr n^{3, l} (m) + \sum_{t = 1}^{2} pr ν_{t}^{3} (m) + \sum_{l = 1}^{7} {(q^{3, l} - q^{- 3, l})}^{2} \\ & + q^{3, 1} (1 - n^{2, 1, 1}) + q^{3, 2} (1 - n^{2, 1, 2}) + q^{3, 3} (1 + y_{1}^{2} - n^{2, 1, 3}) + q^{3, 4} (1 + y_{2}^{2} - n^{2, 1, 4}) \\ & + q^{3, 5} (1 + β_{1}^{2, 3} - n^{2, 1, 5}) + q^{3, 6} (1 + β_{2}^{2, 3} - n^{2, 1, 6}) \\ & + q^{3, 7} (2 - β_{1}^{2, 3} - β_{2}^{2, 3} - y_{1}^{2} - y_{2}^{2} - n^{2, 1, 7}) + \sum_{t = 1}^{2} ({(s_{t}^{3} - s_{t}^{- 3})}^{2} + s_{t}^{3} (z^{- 3} - ζ_{t}^{- 3})), \end{matrix}

where

\begin{matrix} pr n^{3, l} (m) & = {(n^{3, 2, l} - n^{2, 1, l})}^{2}, for l = 1, 2, 5, 6, \\ pr n^{3, 3} (m) & = {(n^{3, 2, 3} + y_{1}^{2} - n^{2, 1, 3})}^{2}, \\ pr n^{3, 4} (m) & = {(n^{3, 2, 4} + y_{2}^{2} - n^{2, 1, 4})}^{2}, \\ pr n^{3, 7} (m) & = {(n^{3, 2, 7} - y_{1}^{2} - y_{2}^{2} - n^{2, 1, 7})}^{2}, \\ pr ν_{t}^{3} (m) & = {(ν_{t}^{3, 2} - y_{t}^{2} - ν_{t}^{2, 1})}^{2}, \end{matrix}

and

\begin{matrix} q^{- 3, l} & = q^{2, l}, l = 1, \dots, 7, \\ s_{t}^{- 3} & = s_{t}^{2}, t = 1, 2, \\ ζ_{t}^{- 3} & = β_{t}^{2, 3} + y_{t}^{2} + ν_{t}^{2, 1}, t = 1, 2, \\ z^{- 3} & = max \{ζ_{1}^{- 3}, ζ_{2}^{- 3}\} . \end{matrix}

In user 3’s tax function, there are no

pr β_{t}^{3} (m)

terms because user 3 is not assigned to provide

β

proxies for any other user.

To see how the proxies n work, we focus on the seventh constraint and examine how the corresponding constraint term is evaluated in user 3’s tax function. The remaining n and

ν

variables work in the same way. For user 3, the constraint term is

{con}^{3, 7} (m) = {(q^{3, 7} - q^{- 3, 7})}^{2} + q^{3, 7} \underbrace{(2 - y_{1}^{2} - y_{2}^{2} - β_{1}^{2, 3} - β_{2}^{2, 3} - n^{2, 1, 7})}_{\text{slackness part}} .

In the centralized mechanism, the slackness part turns out to be

2 - \sum_{t = 1}^{2} \sum_{i = 1}^{3} x_{t}^{i *}

at NE. What we want to show is that, with the distributed mechanism, the same outcome can be realized at NE. Similar to the centralized mechanism,

y^{i} = x^{i *}

at NE, so

y_{1}^{2} = x_{1}^{2 *}, y_{2}^{2} = x_{2}^{2 *}

. By Lemma 4,

β_{1}^{2, 3} + β_{2}^{2, 3} = y_{1}^{3} + y_{2}^{3} = x_{1}^{3 *} + x_{2}^{3 *}

, so it remains to show that

n^{2, 1, 7} = x_{1}^{1 *} + x_{2}^{1 *}

.

Let us trace how

n^{2, 1, 7}

is generated at NE. From (21),

n^{2, 1, 7} = a_{1}^{1, 7} y_{1}^{1} + a_{2}^{1, 7} y_{2}^{1} + \sum_{h \in N (1) \setminus \{2\}} n^{1, h, 7} .

Notice that

N (1) = \{2\}

, so

N (1) \setminus \{2\}

is empty. As a result, at NE,

n^{2, 1, 7} = y_{1}^{1} + y_{2}^{1} = x_{1}^{1 *} + x_{2}^{1 *}

.
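The trace above can be replayed numerically on the three-user line network 1 - 2 - 3 implied by the message components. This is only an illustrative sketch: it plugs in the NE values for the announced demands and for the proxy β^{2,3}, and the variable names are made up for readability.

```python
# Demands announced at NE (equal to the optimal allocation reported earlier in Section 6).
y = {1: (-1.0000, -0.5246), 2: (-0.3410, 0.9508), 3: (0.4885, 2.4263)}

# Constraint 7 has coefficient a_t^{i,7} = 1 for every user and day, and bound b^7 = 2.
# n^{2,1,7}: user 2's proxy for the branch through user 1; user 1 is a leaf,
# so nothing lies beyond it and only user 1's own demand enters.
n_2_1_7 = sum(y[1])
# n^{3,2,7}: user 3's proxy for the branch through user 2, i.e. user 2's demand
# plus whatever user 2 reports about the branch behind it.
n_3_2_7 = sum(y[2]) + n_2_1_7

beta_2_3 = y[3]   # proxy for user 3's demand; it equals y^3 at NE
slack = 2 - sum(y[2]) - sum(beta_2_3) - n_2_1_7   # slackness part in user 3's tax

print(round(n_2_1_7, 4), round(n_3_2_7, 4), round(slack, 4))
```

The slackness part evaluates to zero up to rounding, which is the same value it takes in the centralized mechanism because the seventh constraint is active.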

6.4. The Learning Algorithm

Before the algorithm is implemented, one might want to check whether the problem setting satisfies Assumptions 4 and 5.

First, we can check Assumption 5. Suppose for the specific environment that we have

{\underset{̲}{r}}_{t}^{i} = i \cdot t / 9

and

{\bar{r}}_{t}^{i} = i \cdot t

. Then, for the first condition in Assumption 5, since each

x_{t}^{i}

has a lower bound

- 1

, we have

{\dot{v}}_{t}^{i} (x_{t}^{i}) = \frac{i \cdot t}{x_{t}^{i} + 2} \leq \frac{i \cdot t}{- 1 + 2} = {\bar{r}}_{t}^{i} .

Additionally, every

x_{t}^{i}

is upper bounded by 7 because, from the seventh constraint, we have

x_{t}^{i} \leq 2 - \sum_{(i^{'}, t^{'}) \neq (i, t)} x_{t^{'}}^{i^{'}} \leq 2 - 5 \cdot (- 1) = 7,

and thus

{\dot{v}}_{t}^{i} (x_{t}^{i}) = \frac{i \cdot t}{x_{t}^{i} + 2} \geq \frac{i \cdot t}{7 + 2} = i \cdot t / 9 .

For the second condition in Assumption 5, for all

p \in [i \cdot t / 9, i \cdot t]

, we have

{\dot{v}}_{t}^{i} (x_{t}^{i}) = p \Leftrightarrow x_{t}^{i} = \frac{i \cdot t}{p} - 2,

so Assumption 5 is verified.
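The two conditions can also be checked mechanically. The sketch below uses the marginal utility i·t/(x + 2) from the text; the function names and the 11-point price grid are illustrative choices.

```python
def marginal_utility(i, t, x):
    """Derivative of v_t^i at x, as used in the text: i*t / (x + 2)."""
    return i * t / (x + 2)

def inverse_marginal_utility(i, t, p):
    """Demand x at which the marginal utility equals the price p."""
    return i * t / p - 2

for i in (1, 2, 3):
    for t in (1, 2):
        r_low, r_high = i * t / 9, i * t
        # First condition: the marginal utility stays within [r_low, r_high] on [-1, 7]
        # (it is decreasing in x, so checking the two endpoints is enough).
        assert abs(marginal_utility(i, t, -1) - r_high) < 1e-12
        assert abs(marginal_utility(i, t, 7) - r_low) < 1e-12
        # Second condition: every price in [r_low, r_high] is attained by some x in [-1, 7].
        for k in range(11):
            p = r_low + (r_high - r_low) * k / 10
            x = inverse_marginal_utility(i, t, p)
            assert -1 - 1e-9 <= x <= 7 + 1e-9
            assert abs(marginal_utility(i, t, x) - p) < 1e-9
```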

With the

{\underset{̲}{r}}_{t}^{i}

s and

{\bar{r}}_{t}^{i}

s chosen above, a dual feasible set

P

is constructed. Within this price set

P

, Algorithm 1 evaluates the function

{({\dot{v}}_{t}^{i})}^{- 1} (\cdot)

only in the interval

[i \cdot t / 9, i \cdot t]

. Consequently, in running Algorithm 1, we only need to define

v_{t}^{i} (\cdot)

on the interval

[- 1, 7]

.

Regarding Assumption 4, we need to show that

v_{t}^{i} (\cdot)

is strongly concave on

[- 1, 7]

. Since one can verify that the function

- i t ln (2 + x) - a x^{2} / 2

is convex on

[- 1, 7]

for

0 \leq a \leq i t / 81

, every

v_{t}^{i} (\cdot)

is strongly concave, and thus, Assumption 4 holds (this is based on the fact that f is strongly convex with parameter

δ

iff

g (x) = f (x) - \frac{δ}{2} {∥ x ∥}^{2}

is convex).
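The convexity claim follows from a second-derivative computation, sketched here under the assumption that the utility is v_t^i(x) = i t ln(2 + x), which is consistent with the marginal utility used above:

```latex
\begin{aligned}
g(x) &= - i t \ln (2 + x) - \frac{a}{2} x^{2}, \\
g''(x) &= \frac{i t}{(2 + x)^{2}} - a \;\geq\; \frac{i t}{(2 + 7)^{2}} - a = \frac{i t}{81} - a \;\geq\; 0
\quad \text{for } x \in [-1, 7],\ 0 \leq a \leq \frac{i t}{81} .
\end{aligned}
```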

To choose an appropriate step size

α

for the algorithm, we need to investigate further the parameter

δ

In our environment, the negated sum of utility functions

f (x) = - \sum_{t = 1}^{2} \sum_{i = 1}^{3} v_{t}^{i} (x_{t}^{i})

is strongly convex on

{[- 1, 7]}^{6}

with parameter

δ = 18 / 81

because each component

v_{t}^{i}

of f is a strongly concave function with parameter

i \cdot t / 81

and the parameter

δ

is additive:

\sum_{t = 1}^{2} \sum_{i = 1}^{3} i t / 81 = 18 / 81

. By calculation,

∥ \tilde{A} ∥ \approx 3.1623

, so one possible step size is

α = 0.1 < 2 \times δ / ∥ \tilde{A} ∥

. According to Algorithm 1, the updates required are (define

η (i, t) = 2 (i - 1) + t

for convenience):

\begin{matrix} {\tilde{q}}^{i, η (j, t)} (k + 1) = q^{i, η (j, t)} (k) - α (1 + y_{t}^{j} (k)), j = 1, 2, 3, t = 1, 2, \\ {\tilde{q}}^{i, 7} (k + 1) = q^{i, 7} (k) - α (2 - \sum_{j = 1}^{3} \sum_{t = 1}^{2} y_{t}^{j} (k)), \\ {\tilde{s}}_{t}^{i} (k + 1) = s_{t}^{i} (k) + α \sum_{j = 1}^{3} y_{t}^{j} (k), t = 1, 2, \\ (q^{i} (k + 1), s^{i} (k + 1)) = {Proj}_{P} ({\tilde{q}}^{i} (k + 1), {\tilde{s}}^{i} (k + 1)), \\ y_{t}^{i} (k + 1) = \frac{i \cdot t}{p_{t} - q^{i, η (i, t)} (k + 1) + q^{i, 7} (k + 1) + s_{t}^{i} (k + 1)} - 2 . \end{matrix}
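The step-size bound quoted before the update rules can be reproduced numerically. The sketch below is built on an assumption: the definition of Ã, (27), is not part of this section, so the matrix is reconstructed here as the six lower-bound rows, the single sum row, and one per-day total row for each of the two peak constraints. With that layout, a plain power iteration gives a norm that should match the value 3.1623 quoted in the text.

```python
import math

# Assumed layout of A-tilde for this example (its definition lies outside this section).
users, days = (1, 2, 3), (1, 2)
cols = [(i, t) for i in users for t in days]
rows = [[-1.0 if c == d else 0.0 for c in cols] for d in cols]     # -x_t^i <= 1
rows.append([1.0] * len(cols))                                      # sum of all demands <= 2
rows += [[1.0 if c[1] == t else 0.0 for c in cols] for t in days]   # daily totals <= w

def spectral_norm(a, iters=200):
    """Largest singular value of a, via power iteration on a^T a."""
    v = [1.0 + 0.1 * k for k in range(len(a[0]))]
    for _ in range(iters):
        av = [sum(r[j] * v[j] for j in range(len(v))) for r in a]
        w = [sum(a[r][j] * av[r] for r in range(len(a))) for j in range(len(v))]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    av = [sum(r[j] * v[j] for j in range(len(v))) for r in a]
    return math.sqrt(sum(c * c for c in av))

delta = sum(i * t for i in users for t in days) / 81   # 18 / 81, as in the text
norm_a = spectral_norm(rows)
bound = 2 * delta / norm_a                             # right-hand side of the step-size condition
print(round(norm_a, 4), round(bound, 4))
```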

To verify the convergence of the learning algorithm, we run it with the initial price set to

(q (0), s (0)) = {Proj}_{P} (0_{9 \times 1})

. After

K = 100

iterations, we observe the convergence for both the suggested prices

q, s

and the corresponding announced demands

y

. Figure 3 shows the process of convergence and verifies that the convergence rate is exponential, as expected.
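As a sanity check on the update rules, the optimal prices and demands reported earlier in Section 6 should be a fixed point of one iteration. The sketch below applies a single price step at that point. The projection is a simplified stand-in for Proj_P, not the set P constructed in the text: q is clipped at zero and s is projected onto {s ≥ 0, Σ_t s_t = p_0}; the helper names are illustrative.

```python
ALPHA, P0 = 0.1, 0.05
users, days = (1, 2, 3), (1, 2)
y = {(1, 1): -1.0000, (1, 2): -0.5246, (2, 1): -0.3410,
     (2, 2): 0.9508, (3, 1): 0.4885, (3, 2): 2.4263}
q = {l: 0.0 for l in range(1, 8)}
q[1], q[7] = 0.2056, 1.1056
s = {1: 0.0, 2: 0.05}

def eta(i, t):
    return 2 * (i - 1) + t

def project_simplex(v, total):
    """Euclidean projection of the list v onto {w >= 0, sum(w) = total}."""
    u = sorted(v, reverse=True)
    css, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        css += uk
        if uk - (css - total) / k > 0:
            theta = (css - total) / k   # keep the threshold of the last valid k
    return [max(c - theta, 0.0) for c in v]

# Gradient step on the prices, before projection.
q_tilde = {eta(i, t): q[eta(i, t)] - ALPHA * (1 + y[(i, t)]) for i in users for t in days}
q_tilde[7] = q[7] - ALPHA * (2 - sum(y.values()))
s_tilde = [s[t] + ALPHA * sum(y[(i, t)] for i in users) for t in days]

# Simplified stand-in for Proj_P (an assumption, see the lead-in).
q_next = {l: max(v, 0.0) for l, v in q_tilde.items()}
s_next = project_simplex(s_tilde, P0)
```

Prices of active constraints do not move because their slack is zero, while prices of inactive constraints are pushed below zero and clipped back to zero.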

7. Conclusions

Motivated by work on mechanism design for NUM problems, we proposed a new class of (indirect) mechanisms, with application to demand management in energy communities. The proposed mechanisms possess desirable properties including full implementation, individual rationality, and budget balance and can be easily generalized to different environments with peak shaving and convex constraints. We showed how the original “centralized” mechanism can be modified in a systematic way to account for environments with communication constraints. This modification leads to a new type of mechanism that we call “decentralized” mechanisms, which can be thought of as the analog of decentralized optimization (developed for optimization problems with non-strategic agents) for environments with strategic users. Finally, motivated by the need for practical deployment of these mechanisms, we introduced a PGD-based learning algorithm for users to learn the NE of the mechanism-induced game.

Possible future research directions include seeking more efficient learning algorithms for the distributed mechanism as well as co-design of a (distributed) mechanism and characterization of the class of convergent algorithms for this design.

Author Contributions

X.W.—conceptualization, formal analysis, investigation, methodology, software, visualization, original draft, review and editing; A.A.—conceptualization, formal analysis, funding acquisition, investigation, methodology, project administration, resources, supervision, review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received funding from the National Science Foundation.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Equivalence of Centralized Optimization Problem (2) and Original Problem (3a)–(3c)

We first prove sufficiency by showing that the optimal solution of (2) can always be derived from the optimal solution of the newly constructed problem (3a)–(3c). Suppose that the optimal solution of (3a)–(3c) is

(x^{*}, w^{*})

. We claim that

x^{*}

is the optimal solution of the original problem (2). First, the feasibility of

x^{*}

in (2) is ensured by (3b) in the newly constructed problem.

Now, we check the optimality. Suppose that

x^{*}

is not optimal for (2) and that, instead,

x^{'}

is optimal. In the new problem, construct

\tilde{x} = x^{'}, \tilde{w} = max_{1 \leq t \leq T} \sum_{i = 1}^{N} x_{t}^{\prime i},

then it is easy to verify that

(\tilde{x}, \tilde{w})

is feasible for the new optimization. Notice that

\begin{matrix} & \sum_{i = 1}^{N} v^{i} ({\tilde{x}}^{i}) - \sum_{t = 1}^{T} p_{t} \sum_{i = 1}^{N} {\tilde{x}}_{t}^{i} - p_{0} \tilde{w} \\ = & \sum_{i = 1}^{N} v^{i} (x^{\prime i}) - \sum_{t = 1}^{T} p_{t} \sum_{i = 1}^{N} x_{t}^{\prime i} - p_{0} max_{1 \leq t \leq T} \sum_{i = 1}^{N} x_{t}^{\prime i} \\ > & \sum_{i = 1}^{N} v^{i} (x^{* i}) - \sum_{t = 1}^{T} p_{t} \sum_{i = 1}^{N} x_{t}^{* i} - p_{0} max_{1 \leq t \leq T} \sum_{i = 1}^{N} x_{t}^{* i} \\ \geq & \sum_{i = 1}^{N} v^{i} (x^{* i}) - \sum_{t = 1}^{T} p_{t} \sum_{i = 1}^{N} x_{t}^{* i} - p_{0} w^{*} . \end{matrix}

(A1)

The first inequality follows from the optimality of

x^{'}

in the original optimization (2); the second inequality comes from the constraint (3c) in the new optimization. By this inequality chain, we find a

(\tilde{x}, \tilde{w})

with a better objective function value in (3a)–(3c) than

(x^{*}, w^{*})

, which contradicts the assumption that

(x^{*}, w^{*})

is an optimal solution of (3a)–(3c).

Therefore, by contradiction, we have shown that, if

(x^{*}, w^{*})

is an optimal solution of (3a)–(3c),

x^{*}

must be the optimal solution for the original optimization (2).

For the other direction, we need to show that if

x^{'}

is an optimal solution of (2), then we are able to construct an optimal solution of (3a)–(3c) based on

x^{'}

. We construct

\tilde{x} = x^{'}

and

\tilde{w} = max_{1 \leq t \leq T} \sum_{i = 1}^{N} x_{t}^{\prime i}

and argue that this

(\tilde{x}, \tilde{w})

is optimal for (3a)–(3c). Assume that

(x^{*}, w^{*})

is optimal for (3a)–(3c); then, we still obtain the same inequality chain as (A1) (except that the strict inequality in the second line becomes a “greater than or equal” sign), and the equality and inequalities hold for the same reasons as stated above. This shows that

(\tilde{x}, \tilde{w})

attains an objective value at least as large as, and hence equal to, that of the optimal solution of (3a)–(3c), and therefore,

(\tilde{x}, \tilde{w})

constructed from

x^{'}

of the original problem is also optimal for the new problem.
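Both directions rest on the same observation about the auxiliary variable w. For a fixed x, constraint (3c) only bounds w from below by each daily total, and the objective (3a) does not increase in w, so, assuming p_0 ≥ 0, the best choice of w is the peak itself:

```latex
\max_{w} \Big\{ - p_{0} w \;:\; w \geq \sum_{i = 1}^{N} x_{t}^{i},\ t = 1, \dots, T \Big\}
= - p_{0} \max_{1 \leq t \leq T} \sum_{i = 1}^{N} x_{t}^{i}
\qquad (p_{0} \geq 0) .
```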

Appendix B. Proof of Lemma 2

Proof.

At NE

m^{*}

, for the constraint l in

L

, consider the message components

q^{i, l}

for each user i. In user i’s tax function, denote the part related to

q^{i, l}

by

{\hat{t}}_{q}^{i, l}

. We have

\begin{matrix} {\hat{t}}_{q}^{i, l} (m^{i}, m^{- i *}) & = {(q^{i, l} - q^{- i, l *})}^{2} + q^{i, l} (b^{l} - \sum_{j \neq i} a^{j, l} y^{j *} - a^{i, l} β^{i - 1 *}) \\ & = {(q^{i, l} - q^{- i, l *})}^{2} + q^{i, l} \underbrace{(b^{l} - \sum_{j} a^{j, l} y^{j *})}_{\text{denoted by } e^{l} (y^{*})}, \end{matrix}

where the second equality uses

β^{i - 1 *} = y^{i *}

, which holds by Lemma 1.

For any user i, there is no profitable unilateral deviation from

m^{i *}

. Hence, if we fix

m^{- i *}

and all of the message components of

m^{i *}

except

q^{i, l}

, it is a necessary condition that user i cannot find a better response than

q^{i, l *}

.

Consider the best response of

q^{i, l}

in different cases of

e^{l} (y^{*})

.

Case 1.

e^{l} (y^{*}) > 0

, i.e., the constraint l is inactive at NE. Note that

{\hat{t}}_{q}^{i, l}

is a quadratic function of

q^{i, l}

of the following form

{\hat{t}}_{q}^{i, l} = {(q^{i, l})}^{2} - (2 q^{- i, l *} - e^{l} (y^{*})) q^{i, l} + {(q^{- i, l *})}^{2} .

Ignoring the nonnegativity restriction, the minimizer is

q^{- i, l *} - e^{l} (y^{*}) / 2

. Since

q^{i, l} \geq 0

, the best choice for

q^{i, l}

would be

{(q^{- i, l *} - e^{l} (y^{*}) / 2)}^{+}

(here,

{(\cdot)}^{+} = max \{\cdot, 0\}

), which is unique with fixed

m^{- i *}

.

Therefore,

q^{i, l *} = {(q^{- i, l *} - e^{l} (y^{*}) / 2)}^{+} .

Observe that

{(q^{- i, l *} - e^{l} (y^{*}) / 2)}^{+} \leq {(q^{- i, l *})}^{+} = q^{- i, l *}

. Equality holds only if

q^{- i, l *} \leq e^{l} (y^{*}) / 2

and

q^{- i, l *} = 0

. Thus, for all i,

q^{i, l *} \leq q^{- i, l *}

, with equality only if

q^{i, l *} = 0

and

q^{- i, l *} = 0

. In other words, if for one user i we have

q^{i, l *} = q^{- i, l *}

, then all of the

q^{i, l *} = 0

.

Notice that

q^{i, l *} < q^{- i, l *}

implies that

q^{i, l}

is smaller than one of the

q^{j, l}

among users

j \neq i

, which means that

q^{i, l}

is not the largest. Assume that

q^{i, l *} < q^{- i, l *}

for all i; then, no

q^{i, l}

can be the largest among

\{q^{i, l}\}_{i \in N}

, but we also know that

\{q^{i, l}\}_{i \in N}

is a finite set, and therefore, it must have a maximum. This is a contradiction. As a result, there must exist at least one i such that

q^{i, l *} = q^{- i, l *}

, which implies that all of the

q^{i, l *} = 0

.
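The conclusion of Case 1 can be illustrated by letting the users repeatedly apply the best response derived above. This simulation is not part of the proof and its setup is hypothetical: three users, made-up starting prices, and simultaneous updates in which each user reacts to the average of the others.

```python
def best_response(q_others_avg, e):
    """Minimizer over q >= 0 of (q - q_others_avg)**2 + q * e."""
    return max(q_others_avg - e / 2, 0.0)

def iterate(q, e, rounds):
    """Simultaneous best responses; each user reacts to the average of the others."""
    n = len(q)
    for _ in range(rounds):
        total = sum(q)
        q = [best_response((total - qi) / (n - 1), e) for qi in q]
    return q

print(iterate([1.0, 2.0, 3.0], e=0.4, rounds=50))   # inactive constraint, e > 0
print(iterate([1.0, 1.0, 1.0], e=0.0, rounds=50))   # active constraint, prices already agree
```

With a positive slack e, the largest price drops by at least e/2 per round until every price reaches zero, whereas with e = 0 any common price is left unchanged, in line with Case 2.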

Case 2.

e^{l} (y^{*}) = 0

, i.e., the constraint l is active at NE. In this case,

{\hat{t}}_{q}^{i, l} = {(q^{i, l} - q^{- i, l *})}^{2}

. It is clear that every user’s best response is to make their own price align with the average of others.

Notice that, if

q^{i, l *} = q^{- i, l *}

, then

q^{- i, l}

is equal to the average of all

q^{l}

. Consequently,

q^{i, l *} = q^{j, l *}

for all

i, j \in N

.

Case 3.

e^{l} (y^{*}) < 0

, i.e., the constraint l is violated at NE. In this case,

{\hat{t}}_{q}^{i, l} = {(q^{i, l})}^{2} - \underbrace{(2 q^{- i, l *} - e^{l} (y^{*}))}_{> 0} q^{i, l} + {(q^{- i, l *})}^{2},

which leads to the following condition for every user i:

q^{i, l *} = q^{- i, l *} + \underbrace{(- e^{l} (y^{*}) / 2)}_{> 0} > q^{- i, l *} .

In a finite set, a number that is strictly larger than the average of the others cannot be the smallest number in the set. If this condition held for every user i, the set would have no smallest number, which is impossible. Therefore, Case 3 does not occur at NE.
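The same contradiction can be reached by summing the Case 3 condition over all users, using the fact that each price is averaged into the others' terms exactly once (N denotes the number of users):

```latex
\sum_{i \in N} q^{i, l *} = \sum_{i \in N} q^{- i, l *} + N \cdot \frac{- e^{l} (y^{*})}{2},
\qquad
\sum_{i \in N} q^{- i, l *} = \sum_{i \in N} \frac{1}{N - 1} \sum_{j \neq i} q^{j, l *} = \sum_{j \in N} q^{j, l *}
\;\; \Longrightarrow \;\;
0 = N \cdot \frac{- e^{l} (y^{*})}{2} > 0 .
```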

In summary, at NE, we always have

e^{l} (y^{*}) \geq 0

, and

q^{i, l}

s are equal. Moreover,

q^{i, l *} e^{l} (y^{*}) = 0

. This proves the primal feasibility, equal-price, and complementary slackness claims for the prices q in Lemma 2.

Now, for time t, consider the message component

s_{t}^{i}

for each user i. In user i’s tax function, denote the part related to

s_{t}^{i}

by

{\hat{t}}_{s}^{i, t}

. We have

\begin{matrix} {\hat{t}}_{s}^{i, t} (m^{i}, m^{- i *}) & = {(s_{t}^{i} - s_{t}^{- i *})}^{2} + s_{t}^{i} (z^{- i} - \sum_{j \neq i} y_{t}^{j *} - β_{t}^{i - 1 *}) \\ & = {(s_{t}^{i} - s_{t}^{- i *})}^{2} + s_{t}^{i} \underbrace{(z^{*} - \sum_{j} y_{t}^{j *})}_{\text{denoted by } g_{t} (y^{*})}, \end{matrix}

where

z^{*} = {max}_{t} \sum_{j} y_{t}^{j *}

.

Unlike in the previous part of the proof, here we only need to consider two cases of

g_{t} (y^{*})

because, by definition of z,

g_{t} (y^{*})

is always nonnegative. Another thing we can observe is that there exists at least one t such that

g_{t} (y^{*}) = 0

, i.e., at least one time t is the time for the peak demand. Define the set

\tilde{T}

as the time set containing all time t with peak demand.

Check the best response of

s_{t}

separately. For all

t^{'} \notin \tilde{T}

,

g_{t^{'}} (y^{*}) > 0

, so, following steps similar to those shown above, we obtain

s_{t^{'}}^{i *} = 0

for all i. For those

t \in \tilde{T}

, we already have

g_{t} (y^{*}) = 0

. For those t, the best response is

s_{t}^{i *} = s_{t}^{- i *}

, which is true for every user i. As a result, we also have the equal prices and complementary slackness for

s_{t}^{i}

.

Check (10). For

t \in \tilde{T}

,

g_{t} (y^{*}) = 0

, so (10) holds.

For

t \notin \tilde{T}

,

g_{t} (y^{*}) > 0

, so

t \notin arg max_{\tilde{t}} \sum_{j \neq i} y_{\tilde{t}}^{j} + β_{\tilde{t}}^{i - 1} .

Additionally, we know that such a

t \notin \tilde{T}

has

s_{t} = 0

. Therefore, for

t \notin \tilde{T}

, for either branch in the definition of radial pricing

RP

,

{RP}_{t}^{i} (s, ζ^{- i}) = 0

. Hence, (10) holds for

t = 1, \dots, T

. □

Appendix C. Proof of Lemma 3

Proof.

At NE, for user i, the

u^{i} (m^{i}, m^{- i})

can be treated as a function of

m^{i}

, with

m^{- i}

fixed. By the assumption of the existence of NE,

u^{i} (m^{i}, m^{- i})

must have a global maximizer with respect to

m^{i}

. Given

m^{- i}

, all of the auxiliary variables and functions only determined by

m^{- i}

are constants here, and one can check that the other terms in (5) and (7) are differentiable. Necessary conditions for the global maximizer are

\frac{\partial u^{i}}{\partial y_{t}^{i}} = {\dot{v}}_{t}^{i} ({\hat{x}}_{t}^{i} (m)) - (p_{t} + {RP}_{t}^{i} (s, ζ^{- i}) + \sum_{l \in L_{i}} a_{t}^{i, l} q^{l}) = 0 .

Therefore,

{\dot{v}}_{t}^{i} ({\hat{x}}_{t}^{i} (m)) = p_{t} + {RP}_{t}^{i} (s, ζ^{- i}) + \sum_{l \in L_{i}} a_{t}^{i, l} q^{l},

which is exactly Equation (11).

Equation (12) follows directly from the definition of the

RP

operator. □

Appendix D. Proof of Theorem 2

Proof.

By assumption, the centralized problem is a convex optimization problem with a non-empty feasible set, so there must exist an optimal solution

\{x^{*}, w^{*}\}

and corresponding Lagrange multipliers

λ^{l *}, μ_{t}^{*}

which satisfy KKT conditions (4a)–(4g).

Consider the message profile

m^{*}

consisting of

\begin{matrix} y_{t}^{i} & = x_{t}^{i *}, t = 1, \dots, T, \forall i \in N, \\ q^{i, l} & = λ^{l *}, l \in L_{i}, \forall i \in N, \\ s_{t}^{i} & = μ_{t}^{*}, t = 1, \dots, T, \forall i \in N \\ β_{t}^{i} & = x_{t}^{i + 1 *}, t = 1, \dots, T, \forall i \in N . \end{matrix}

If for arbitrary user i, no profitable unilateral deviations exist, i.e., there does not exist an

\tilde{m} = ({\tilde{m}}_{i}, m_{- i}^{*})

such that

u_{i} (\tilde{m}) > u_{i} (m^{*})

, then

m^{*}

is a NE of the game

G

.

We can focus on

u_{i} (m)

of user i to see whether they have a profitable deviation given

m_{- i}^{*}

. For user i, we have

\begin{matrix} q^{- i, l} = λ^{l *}, \forall l \in L \\ s_{t}^{- i} = μ_{t}^{*}, t = 1, \dots, T, \\ {RP}_{t}^{i} (s^{- i}, y^{- i}, β^{i - 1}) = μ_{t}^{*}, t = 1, \dots, T, \\ z^{- i} = w^{*} = max_{t} (\sum_{j \in N} x_{t}^{j *}) . \end{matrix}

Therefore, in the interest of user i, they want to maximize the following

\begin{matrix} u^{i} (m^{i}, m^{- i *}) = & \sum_{t = 1}^{T} \underbrace{(v_{t}^{i} (y_{t}^{i}) - (p_{t} + μ_{t}^{*}) y_{t}^{i} - \sum_{l \in L_{i}} λ^{l *} a_{t}^{i, l} y_{t}^{i})}_{\text{function of } y_{t}^{i}} \\ & - \sum_{l \in L} \underbrace{({(q^{i, l} - λ^{l *})}^{2} + q^{i, l} (b^{l} - \sum_{j} \sum_{t = 1}^{T} a_{t}^{j, l} x_{t}^{j *}))}_{\text{function of } q^{i, l}} \\ & - \sum_{t = 1}^{T} \underbrace{({(s_{t}^{i} - μ_{t}^{*})}^{2} + s_{t}^{i} (w^{*} - \sum_{j} x_{t}^{j *}))}_{\text{function of } s_{t}^{i}} \\ & - \sum_{t = 1}^{T} \underbrace{{(β_{t}^{i} - x_{t}^{i + 1 *})}^{2}}_{\text{function of } β_{t}^{i}} . \end{matrix}

(A2)

The last term of (A2) is the only term related to

β^{i}

, which is a quadratic term. As a strategic agent, user i clearly does not deviate from

β^{i} = x^{i + 1 *}

; otherwise, they pay a penalty.

The second and third terms of (A2) are quite similar: they both consist of a quadratic term and a term for complementary slackness. For the second term, let us consider constraint l. If l is active in the optimal solution, the complementary slackness term equals 0. To avoid extra payment, user i does not move

q^{i, l}

away from the optimal price

λ^{l *}

. If l is inactive, the optimal price

λ^{l *}

is 0. Then, the penalty of constraint l for user i is

{(q^{i, l})}^{2} + q^{i, l} \underbrace{(b^{l} - \sum_{j} \sum_{t = 1}^{T} a_{t}^{j, l} x_{t}^{j *})}_{> 0},

where user i can only select a nonnegative price

q^{i, l}

. There is no better choice than

q^{i, l} = 0 = λ^{l *}

. A similar analysis works for the third term of (A2). As a result, there are no unilateral profitable deviations on

q^{i, l}

for all l and

s_{t}^{i}

for all t.

Now, we denote the terms in the parentheses in the first part of (A2) by

f_{t}^{i} (y_{t}^{i})

. Since the four groups of terms in (A2) depend on disjoint sets of variables,

u^{i} (m^{i}, m^{- i *})

achieves its maximum if and only if every

f_{t}^{i} (y_{t}^{i})

achieves its maximum, and the other three terms attain their minimum. For the first part, due to the strict concavity of

v_{t}^{i} (\cdot)

, the second-order derivative of

f_{t}^{i} (y_{t}^{i})

for each t is negative, which indicates that

f_{t}^{i} (y_{t}^{i})

is strictly concave as well. We can find the maximizer of

f_{t}^{i} (y_{t}^{i})

by the first-order condition:

\frac{d f_{t}^{i}}{d y_{t}^{i}} (y_{t}^{i *}) = {\dot{v}}_{t}^{i} (y_{t}^{i *}) - (p_{t} + μ_{t}^{*}) - \sum_{l \in L_{i}} λ^{l *} a_{t}^{i, l} = 0 .

(A3)

By (4g) in the KKT conditions, we know that the only

y_{t}^{i}

that makes (A3) hold is

y_{t}^{i} = x_{t}^{i *}

for all t. The reason is that, by the strict concavity assumption of utility function

v_{t}^{i} (\cdot)

, the first-order derivative of

v_{t}^{i}

strictly decreases, and therefore, for one aggregated price, there is at most one demand value x that makes

{\dot{v}}_{t}^{i} (x)

equal that price.

Therefore, for any agent i, if others send messages

m^{- i *}

, the only best response of agent i is to announce

m^{i *}

. Under this circumstance, sending messages other than

m^{i *}

does not increase agent i’s payoff

u^{i} (m)

. Consequently,

m^{*}

is an NE of the induced game

G

. □

Appendix E. Proof of Theorem 3

Proof.

For any user i, if they choose to participate with other users, when everyone anticipates the NE, user i’s payoff is of the form (13) if they only consider modifying

y^{i}

and keep the other components unchanged. Thus, user i faces the following optimization problem:

y^{i} = arg max_{y^{i} \in R^{T}} v^{i} (y^{i}) - \sum_{t = 1}^{T} (p_{t} + {RP}_{t}^{i} (s, ζ^{- i})) y_{t}^{i} - \sum_{l \in L_{i}} q^{- i, l} \sum_{t = 1}^{T} a_{t}^{i, l} y_{t}^{i} .

By the definition of NE,

y^{i *}

is one of the best solutions, which yields a payoff

u^{i} (m^{*})

. User i can also choose

{\tilde{y}}^{i} = 0

. Denote the corresponding message by

{\tilde{m}}^{i}

. Then, the payoff value becomes

u^{i} ({\tilde{m}}^{i}, m^{- i *}) = v^{i} (0)

, which coincides with the payoff for not participating. Since

m^{i *}

is the best response to

m^{- i *}

, we have

u^{i} (m^{*}) \geq u^{i} ({\tilde{m}}^{i}, m^{- i *}) = v^{i} (0)

. In other words, if everyone anticipates the NE as the outcome, participating is no worse than not participating. □

Appendix F. Proof of Theorem 4

Proof.

Suppose that the optimal solution for the original problem given by NE is

(x^{*}, λ^{*}, μ^{*})

; then, the tax for user i is

{\hat{t}}^{i} (m^{*}) - J ({\hat{x}}_{t}^{i} (m^{*})) = \sum_{t = 1}^{T} (p_{t} + μ_{t}^{*}) x_{t}^{i *} + \sum_{l \in L_{i}} λ^{l *} \sum_{t = 1}^{T} a_{t}^{i, l} x_{t}^{i *} - J (x^{i *}) .

The total amount of tax is

\begin{matrix} \sum_{i \in N} {\hat{t}}^{i} (m^{*}) - J (x^{i *}) \\ = & \sum_{i \in N} \sum_{t = 1}^{T} (p_{t} + μ_{t}^{*}) x_{t}^{i *} + \sum_{i \in N} \sum_{l \in L_{i}} λ^{l *} \sum_{t = 1}^{T} a_{t}^{i, l} x_{t}^{i *} - J (x^{i *}) \\ = & \sum_{t = 1}^{T} (p_{t} \sum_{i \in N} x_{t}^{i *} + μ_{t}^{*} \sum_{i \in N} x_{t}^{i *}) + \sum_{l \in L} λ^{l *} \sum_{t = 1}^{T} \sum_{i \in N} a_{t}^{i, l} x_{t}^{i *} - J (x^{i *}) \\ = & \sum_{l \in L} λ^{l *} \sum_{t = 1}^{T} \sum_{i \in N} a_{t}^{i, l} x_{t}^{i *} . \end{matrix}

For each constraint l, by the complementary slackness, we have

λ^{l *} (b^{l} - \sum_{t = 1}^{T} \sum_{i \in N} a_{t}^{i, l} x_{t}^{i *}) = 0 .

Therefore,

\sum_{i \in N} {\hat{t}}^{i} (m^{*}) - J (x^{i *}) = \sum_{l \in L} λ^{l *} b^{l} \geq 0,

which shows that, at NE, the planner’s payoff is nonnegative.

Furthermore, to avoid unnecessary payments to the planner, the energy community can adopt the mechanism with the following tax function

{\tilde{t}}^{i} (m)

instead

{\tilde{t}}^{i} (m) = {\hat{t}}^{i} (m) - \sum_{l \in L} q^{- i, l} b^{l} / N .

Note that user i has no control over the additional term because no components of

m^{i}

are in that term, and thus, the additional term does not change NE. Since the prices are equal at NE, the planner actually gives

\sum_{l \in L} λ^{l *} b^{l}

back to the users. Hence,

\sum_{i \in N} {\tilde{t}}^{i} (m^{*}) - J (x^{i *}) = 0 .

As a side comment, the choice of

{\tilde{t}}^{i} (m)

is not unique. Any adjustment works here as long as it does not depend on

m^{i}

for each

t^{i} (\cdot)

and sums up to

\sum_{l \in L} λ^{l *} b^{l}

at NE. □
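For the three-user example of Section 6, the planner's surplus and the rebate can be computed directly from the reported multipliers. This is a small illustrative calculation; the bounds b^l are read off from that example (the six lower-bound constraints have b = 1 and the sum constraint has b = 2), and the variable names are made up.

```python
# Constraint bounds b^l and optimal multipliers for the example of Section 6.
b = [1.0] * 6 + [2.0]
lam = [0.2056, 0.0, 0.0, 0.0, 0.0, 0.0, 1.1056]
n_users = 3

# Planner's payoff at NE under the original tax: sum_l lambda^{l*} b^l.
surplus = sum(l * bl for l, bl in zip(lam, b))
# Rebate to each user under the adjusted tax, using q^{-i,l} = lambda^{l*} at NE.
rebate_per_user = surplus / n_users

print(round(surplus, 4), round(rebate_per_user, 4))
```

The three rebates add up to the surplus, so the adjusted taxes leave the planner with exactly zero at NE.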

Appendix G. Proof of Lemma 5

Proof.

Here, we provide a non-rigorous proof of (24). The proof of (23) is quite similar. For a detailed version of the proof, we refer the interested readers to 7.1, Chapter 4, of [72].

Before presenting the proof of this part, for convenience, we define

n (i, k)

as the user nearest to user k among user i and the neighbors of user i.

n (i, k)

is well-defined because one can show that

n (i, k) = j

provides a partition for all the users.

Equation (24) can be shown by applying (22) iteratively. Recall that the message exchange network is assumed to be an undirected acyclic graph (i.e., a tree). First, consider a user j at a leaf (a node of degree one). Suppose the neighbor of user j is i; then

N (j) = \{i\}

. By (22), we have

ν_{t}^{i, j} = y_{t}^{j}

. Since no k satisfies

n (i, k) = j

other than j themselves, (24) holds for

ν_{t}^{i, j}

, where j is a leaf node.

For more general cases, to compute

ν_{t}^{i, j}

, it is safe to only consider the subgraph

{GR}_{i}

that contains only node i and the nodes k such that

n (i, k) = j

. When applying (22), it is impossible to have a node

l \in {GR}_{i}^{C}

involved because, if such a node appeared when expanding the “

ν

” term for some

j^{'}

, l would be a neighbor of

j^{'}

. We know that there is a route from i to

j^{'}

, say route

i L j^{'}

. Since

l \in {GR}_{i}^{C}

,

n (i, l) \neq j

, there exists a route

L^{'}

that does not involve any node in the branch starting from node j, such that

l L^{'} i

, which results in a loop

l L^{'} i L j^{'} l

, contradicting the assumption that the message exchange network is acyclic.

Then, by using (22) iteratively, we can see that (1) every node in

{GR}_{i}

is visited at least once and gives a corresponding demand “y”; (2) each

y_{t}^{j}

is given only once (except root i, which does not give

y_{t}^{i}

in this procedure); and (3) when it proceeds to the leaf nodes, the iteration terminates because there are no more “

ν

” terms to expand. Hence,

ν_{t}^{i, j} = \sum_{h \in {GR}_{i} \setminus \{i\}} y_{t}^{h}

, and we can easily verify that

{GR}_{i} \setminus \{i\}

is nothing but

\{h : n (i, h) = j\}

. □
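The recursive argument can be checked on a small tree. The sketch below is illustrative only: the five-user tree and the single-day demands are made up, and the recursion is the form of (22) suggested by the leaf case above, namely that ν^{i,j} adds y^j to the ν terms that j holds for its other neighbors. The right-hand side of (24) is computed independently from the definition of n(i, k) via breadth-first distances.

```python
from collections import deque

def nu(i, j, neighbors, y):
    """nu^{i,j}: demand aggregated on j's side of the edge (i, j), by recursion."""
    return y[j] + sum(nu(j, h, neighbors, y) for h in neighbors[j] if h != i)

def distances(src, neighbors):
    """Hop distances from src to every user (breadth-first search)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in neighbors[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def nearest(i, k, neighbors):
    """n(i, k): the user closest to k among i and the neighbors of i."""
    d = distances(k, neighbors)
    return min([i] + neighbors[i], key=lambda u: d[u])

def branch_sum(i, j, neighbors, y):
    """Sum of y^h over all users h with n(i, h) = j."""
    return sum(y[h] for h in y if nearest(i, h, neighbors) == j)

# Hypothetical tree: 1 - 2 - 3, with users 4 and 5 attached to 3.
neighbors = {1: [2], 2: [1, 3], 3: [2, 4, 5], 4: [3], 5: [3]}
y = {1: 0.5, 2: -0.25, 3: 1.0, 4: 2.0, 5: -1.0}

for i in neighbors:
    for j in neighbors[i]:
        assert abs(nu(i, j, neighbors, y) - branch_sum(i, j, neighbors, y)) < 1e-12
```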

Appendix H. Convergence of the Learning Algorithm for Centralized Mechanism

The convergence of the proposed learning algorithm can be shown in the three steps mentioned in Section 5. The first step shows the connection between

m^{*}

and

x^{*}, λ^{*}, μ^{*}

of the optimal solution for the original optimization, which has already been clarified in Section 5. As a result, learning the NE is equivalent to learning the optimal solution of the original optimization problem. For the second step, since the problem is a convex optimization problem with a non-empty feasible set defined by linear inequalities, Slater’s condition is easy to check. Therefore, strong duality holds for this problem, which means that we can obtain the optimal solution of the original problem by solving the dual problem. The last step is to identify the dual problem and to find a convergent algorithm for it. This part of the appendix explains how to pin down the dual function and the dual feasible set, and shows the convergence of the PGD algorithm on this dual problem.

Before we identify the dual function of the original problem, for the sake of convenience, in constraint (3c) of the original problem, move w to the left-hand side and rewrite (3b) and (3c) into one matrix form

\tilde{A} x + \tilde{1} w \leq \tilde{b},

where

\tilde{A}

is defined in (27), and

\tilde{1} = (\begin{matrix} 0_{L} \\ - 1_{T} \end{matrix}), \tilde{b} = (\begin{matrix} b \\ 0_{T} \end{matrix}) .

Suppose that

f (x) = \sum_{i} \sum_{t} (v_{t}^{i} (x_{t}^{i}) - p_{t} x_{t}^{i})

; then, the objective function can be written as

f (x) - p_{0} w

. Observe that, by Assumption 4,

(v_{t}^{i} (x_{t}^{i}) - p_{t} x_{t}^{i})

s are also strongly concave without cross terms. Consequently, by the definition of strong concavity, one can show directly that, as the sum of these strongly concave functions,

f (x)

is strongly concave as well. Let

h (x) = - f (x)

; then,

h (x)

is strongly convex with parameter

δ^{'}

. Denote by

h^{*} (\cdot)

the conjugate function of

h (x)

.

With these notations in mind, the dual function of the original problem is

\begin{matrix} D (\tilde{λ}) = & \sup_{x, w} \{f (x) - p_{0} w - {\tilde{λ}}^{T} (\tilde{A} x + \tilde{1} w - \tilde{b})\} \\ = & b^{T} λ + \sup_{x} \{{(- {\tilde{A}}^{T} \tilde{λ})}^{T} x - h (x)\} + \sup_{w} \{1_{T}^{T} μ w - p_{0} w\} \\ = & b^{T} λ + h^{*} (- {\tilde{A}}^{T} \tilde{λ}) . \end{matrix}

Here, we should be cautious about the domain of

D (\tilde{λ})

. In the second line,

\sup_{w} \{1_{T}^{T} μ w - p_{0} w\}

is finite only when the coefficient of w vanishes, that is, when

1_{T}^{T} μ - p_{0} = 0

, i.e.,

\sum_{t} μ_{t} = p_{0}

. Therefore, we get the following dual problem:

\begin{matrix} \underset{λ, μ}{minimize} & b^{T} λ + h^{*} (- {\tilde{A}}^{T} \tilde{λ}) \end{matrix}

(A4)

\begin{matrix} subject to & \sum_{t} μ_{t} = p_{0}, \end{matrix}

(A5)

\begin{matrix} λ \geq 0, μ \geq 0 . \end{matrix}

(A6)
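To make the dual function concrete, the sketch below evaluates it for a small hypothetical instance. It assumes quadratic utilities v(x) = a x - (c/2) x^2, for which the conjugate has the closed form h*(s) = Σ (s + a - p)^2 / (2c). All numbers, the single constraint row standing in for (3b), and the function names are assumptions made for illustration only. The last line checks weak duality: a dual-feasible price vector gives a value no smaller than that of any primal-feasible allocation.

```python
# Toy instance (all numbers are illustrative assumptions).
N, T = 2, 2                      # users, time slots
a = [[4.0, 6.0], [5.0, 3.0]]     # a[i][t]: linear utility coefficient
c = 2.0                          # curvature, so h is c-strongly convex
p = [1.0, 1.5]                   # unit prices p_t
p0 = 2.0                         # peak price p_0
A = [[1.0, 1.0, 1.0, 1.0]]       # one constraint of type (3b): total energy
b = [4.0]

def price_seen(lam, mu, i, t):
    """Entry (i, t) of A~^T lam~: constraint prices plus the peak price mu_t."""
    k = i * T + t
    return sum(lam[l] * A[l][k] for l in range(len(A))) + mu[t]

def dual_value(lam, mu):
    """D = b^T lam + h*(-A~^T lam~); only valid when sum(mu) == p0."""
    assert abs(sum(mu) - p0) < 1e-12, "the sup over w is +inf otherwise"
    conj = 0.0
    for i in range(N):
        for t in range(T):
            s = -price_seen(lam, mu, i, t)
            conj += (s + a[i][t] - p[t]) ** 2 / (2.0 * c)
    return sum(bl * ll for bl, ll in zip(b, lam)) + conj

def primal_value(x, w):
    """f(x) - p0 * w for demands x[i][t] and peak w."""
    f = sum(a[i][t] * x[i][t] - 0.5 * c * x[i][t] ** 2 - p[t] * x[i][t]
            for i in range(N) for t in range(T))
    return f - p0 * w

x_feasible = [[1.0, 1.5], [1.0, 0.5]]    # total 4.0, slot totals 2.0 and 2.0
w_feasible = 2.0
gap = dual_value([0.3], [1.0, 1.0]) - primal_value(x_feasible, w_feasible)
```

Calling `dual_value` with peak prices that do not sum to `p0` trips the assertion, which is the code counterpart of constraint (A5).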

We have now derived the dual problem for the original optimization. To find its optimal solution, a natural approach is projected gradient descent (PGD). The following theorem ensures the convergence of the PGD algorithm.

Theorem A1.

For a minimization problem on a closed and convex feasible set

X

with objective function

f (x)

, suppose that

X^{*}

is the set of optimal solutions. If f is convex and β-smooth on

X

, by using PGD with step size

α < 2 / β

, there exists

x^{*} \in X^{*}

, such that

lim_{k \to \infty} x (k) = x^{*} .

Proof.

The proof can be found in ([29] Theorem 1, Section 7.2). □
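As an illustration of Theorem A1, here is a minimal sketch of PGD on a hypothetical two-dimensional quadratic over a box. The objective, the box, and the helper names are assumptions and are not part of the mechanism; the only point is the structure "gradient step, then projection" with a step size below 2 / β.

```python
def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n (closed and convex)."""
    return [min(max(xi, lo), hi) for xi in x]

def pgd(grad, project, x0, alpha, iters):
    """Projected gradient descent: x <- Proj(x - alpha * grad(x))."""
    x = project(list(x0))
    for _ in range(iters):
        g = grad(x)
        x = project([xi - alpha * gi for xi, gi in zip(x, g)])
    return x

def grad(x):
    """Gradient of f(x) = 0.5 * (x0 - 3)^2 + 2 * (x1 + 1)^2."""
    return [x[0] - 3.0, 4.0 * (x[1] + 1.0)]

beta = 4.0              # largest Hessian eigenvalue, so f is 4-smooth
alpha = 1.0 / beta      # any step size below 2 / beta is admissible
x_final = pgd(grad, lambda z: project_box(z, 0.0, 2.0), [1.0, 1.0], alpha, 200)
```

The unconstrained minimizer (3, -1) lies outside the box, so the iterates settle on the boundary point (2, 0), which is the constrained minimizer.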

Theorem A1 indicates that, if the dual problem satisfies these conditions, the PGD iterates converge to a dual optimal solution. Although it is not clear whether the dual problem has a unique solution, by strong duality and the uniqueness of the solution to the primal problem, whichever dual optimal solution is reached, the corresponding primal solution can only be the unique optimal one, and thus the outcome is the same.

Now, for the dual problem, we need to check the conditions required by Theorem A1. First, consider the objective function. Every conjugate function is convex, so

h^{*} (\cdot)

is convex and, consequently,

h^{*} (- {\tilde{A}}^{T} \tilde{λ})

is convex in

\tilde{λ}

as the composition of a convex function with a linear map. Adding the linear term of the objective preserves convexity, so the objective function is convex. Since

h (x)

is strongly convex with parameter

δ^{'}

, by the result mentioned in [73], the

δ^{'}

-strong convexity of

h (\cdot)

implies that its conjugate

h^{*} (\cdot)

is

1 / δ^{'}

-smooth. Then, we have

∥ \nabla h^{*} (- {\tilde{A}}^{T} {\tilde{λ}}^{1}) - \nabla h^{*} (- {\tilde{A}}^{T} {\tilde{λ}}^{2}) ∥ \leq (1 / δ^{'}) \cdot ∥ - {\tilde{A}}^{T} ({\tilde{λ}}^{1} - {\tilde{λ}}^{2}) ∥ \leq (∥ \tilde{A} ∥ / δ^{'}) \cdot ∥ {\tilde{λ}}^{1} - {\tilde{λ}}^{2} ∥,

which indicates that the objective function is

β

-smooth with

β = ∥ \tilde{A} ∥ / δ^{'}

. However, we are not sure whether the objective function is well-defined on the whole feasible set. Fortunately, by Assumption 5, we know that the optimal price vector

{\tilde{λ}}^{*}

lies in

P

and we can verify that

P

is a subset of the feasible set generated by (A5) and (A6). Therefore, by solving the following optimization problem, we can obtain the same optimal solution, and Theorem A1 is applicable here.

\underset{\tilde{λ} \in P}{minimize} b^{T} λ + h^{*} (- {\tilde{A}}^{T} \tilde{λ}) .

(A7)

By applying PGD to (A7) with step size

α < 2 / β = 2 δ^{'} / ∥ \tilde{A} ∥

, the update rules are as follows:

\begin{matrix} {\hat{λ}}^{l} (k + 1) = λ^{l} (k) - α (b^{l} + {[- \tilde{A} \nabla h^{*} (- {\tilde{A}}^{T} \tilde{λ} (k))]}_{l}), \end{matrix}

(A8)

\begin{matrix} {\hat{μ}}_{t} (k + 1) = μ_{t} (k) - α {[- \tilde{A} \nabla h^{*} (- {\tilde{A}}^{T} \tilde{λ} (k))]}_{L + t}, \end{matrix}

(A9)

\begin{matrix} (λ (k + 1), μ (k + 1)) = {Proj}_{P} (\hat{λ} (k + 1), \hat{μ} (k + 1)), \end{matrix}

(A10)

where

{[\cdot]}_{j}

denotes the jth entry of its vector argument. To turn these rules into a learning algorithm for the centralized mechanism, in view of the relation between

m^{*}

and the optimal solution of the original problem and the dual, one might want to substitute

λ, μ

with

q^{i}, s^{i}

for each user i. However,

\nabla h^{*}

cannot be evaluated by the users, as they do not know the utilities of the others. Fortunately, users can obtain the values of

\nabla h^{*}

-related terms by cooperation without revealing their entire utility functions. This is done by querying each user for her demand under given prices. A key point of this implementation is to build a connection between

\nabla h^{*}

and the marginal value function

{\dot{v}}_{t}^{i}

for each demand

x_{t}^{i}

.

A useful result relating the subgradients of a function f and of its conjugate

f^{*}

applies here; it is quoted as Theorem A2.

Theorem A2.

If

f^{*} (s)

is the conjugate of

f (x)

, then

x \in \partial f^{*} (s) \Leftrightarrow s \in \partial f (x) .

Proof.

The proof can be found in [73]. □

Since h is closed (because h is proper convex and continuous) and strictly convex by assumption,

h^{*}

is differentiable (see [73]), and therefore, the subdifferential of

h^{*}

at a fixed

p

is a singleton. As a result,

x = \nabla h^{*} (- ρ) \Leftrightarrow ρ = - \nabla h (x) = \nabla f (x) \Leftrightarrow x_{t}^{i} = {({\dot{v}}_{t}^{i})}^{- 1} (p_{t} + ρ_{t}^{i}) .

The last equivalence follows from the fact that

{[\nabla f (x)]}_{(i - 1) T + t} = \frac{d}{d x_{t}^{i}} (v_{t}^{i} (x_{t}^{i}) - p_{t} x_{t}^{i}) = {\dot{v}}_{t}^{i} (x_{t}^{i}) - p_{t} .
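The chain of equivalences says that evaluating the gradient of h* amounts to asking each user for her demand at a given price. A minimal sketch follows, again assuming a quadratic utility v(x) = a x - (c/2) x^2 so that the inverse marginal utility has a closed form; the numbers and function names are hypothetical.

```python
def marginal_utility(x, a, c):
    """v'(x) for the assumed utility v(x) = a*x - c*x**2/2."""
    return a - c * x

def demand(price, a, c):
    """Inverse marginal utility: the x that solves v'(x) = price."""
    return (a - price) / c

def net_payoff(x, a, c, p_t, rho):
    """v(x) - p_t*x - rho*x, the quantity maximized inside h*(-rho)."""
    return a * x - 0.5 * c * x * x - (p_t + rho) * x

a, c, p_t = 6.0, 2.0, 1.5     # hypothetical utility parameters and unit price
rho = 1.3                     # hypothetical entry of A~^T lam~ for this (i, t)
x = demand(p_t + rho, a, c)

# rho = v'(x) - p_t, i.e. rho equals the matching entry of grad f(x).
residual = marginal_utility(x, a, c) - p_t - rho
```

The user only reports the number `x`; the coefficients `a` and `c` of her utility never leave the `demand` call, which is the sense in which the utility function is not revealed.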

Thus, in every iteration of PGD, before using (A8) and (A9), one can first evaluate

x_{t}^{i} (k) = {({\dot{v}}_{t}^{i})}^{- 1} (p_{t} + {[{\tilde{A}}^{T} \tilde{λ} (k)]}_{(i - 1) T + t}),

(A11)

and then, (A8) and (A9) become

\begin{matrix} {\hat{λ}}^{l} (k + 1) = λ^{l} (k) - α (b^{l} - {[\tilde{A} x (k)]}_{l}) = λ^{l} (k) - α (b^{l} - a^{l} x (k)), \end{matrix}

(A12)

\begin{matrix} {\hat{μ}}_{t} (k + 1) = μ_{t} (k) + α {[\tilde{A} x (k)]}_{L + t} = μ_{t} (k) + α \sum_{j} x_{t}^{j} (k) . \end{matrix}

(A13)

Arranging (A10)–(A13) in an appropriate order, we obtain an algorithm with the same convergence property as the original PGD in which, importantly,

\nabla h^{*}

no longer appears. By substituting

λ

and

μ

with

q^{i}

and

s^{i}

(duplicating (A12) and (A13) for each user i) and by substituting

x

with

y

, we obtain Algorithm 1. Consequently, Theorem 6 follows directly from the convergence of PGD indicated in Theorem A1.
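Putting (A10)–(A13) together, the price-learning loop can be sketched as follows. This is an illustrative sketch and not Algorithm 1 itself: it keeps a single copy of the prices instead of per-user duplicates, assumes quadratic utilities v(x) = a x - (c/2) x^2, and takes P to be the whole set defined by (A5) and (A6). The instance, the step size, and all function names are assumptions.

```python
def project_simplex(v, total):
    """Euclidean projection of v onto {u >= 0, sum(u) = total}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumulative += uk
        candidate = (cumulative - total) / k
        if uk - candidate > 0:          # holds for k = 1, ..., rho only
            theta = candidate
    return [max(vi - theta, 0.0) for vi in v]

def demands(lam, mu, a, c, p, A):
    """(A11): each user's demand at the current prices."""
    N, T = len(a), len(p)
    x = []
    for i in range(N):
        for t in range(T):
            k = i * T + t
            price = p[t] + mu[t] + sum(lam[l] * A[l][k] for l in range(len(A)))
            x.append((a[i][t] - price) / c)
    return x

def learn_prices(a, c, p, p0, A, b, alpha, iters):
    N, T, L = len(a), len(p), len(A)
    lam, mu = [0.0] * L, [p0 / T] * T           # a feasible starting point
    for _ in range(iters):
        x = demands(lam, mu, a, c, p, A)
        # (A12): lam^l <- lam^l - alpha * (b^l - a^l x)
        lam_hat = [lam[l] - alpha * (b[l] - sum(A[l][k] * x[k] for k in range(N * T)))
                   for l in range(L)]
        # (A13): mu_t <- mu_t + alpha * sum_j x_t^j
        mu_hat = [mu[t] + alpha * sum(x[i * T + t] for i in range(N))
                  for t in range(T)]
        # (A10): project back onto {lam >= 0} x {mu >= 0, sum(mu) = p0}
        lam = [max(v, 0.0) for v in lam_hat]
        mu = project_simplex(mu_hat, p0)
    return lam, mu, demands(lam, mu, a, c, p, A)

# Hypothetical instance: 2 users, 2 slots, one total-energy constraint.
a = [[4.0, 6.0], [5.0, 3.0]]
p, p0, c = [1.0, 1.5], 2.0, 2.0
A, b = [[1.0, 1.0, 1.0, 1.0]], [4.0]
lam, mu, x = learn_prices(a, c, p, p0, A, b, alpha=0.3, iters=500)
```

For this instance the iterates settle at a constraint price of about 0.25 and peak prices of about (1.25, 0.75), at which both time slots carry the same total demand of 2 and the energy constraint is tight. Only demands are exchanged in the loop, never the utility coefficients.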

References

1. Schmidt, D.A.; Shi, C.; Berry, R.A.; Honig, M.L.; Utschick, W. Distributed resource allocation schemes. IEEE Signal Process. Mag. 2009, 26, 53–63.
2. Nair, A.S.; Hossen, T.; Campion, M.; Selvaraj, D.F.; Goveas, N.; Kaabouch, N.; Ranganathan, P. Multi-agent systems for resource allocation and scheduling in a smart grid. Technol. Econ. Smart Grids Sustain. Energy 2018, 3, 1–15.
3. HamaAli, K.W.; Zeebaree, S.R. Resources allocation for distributed systems: A review. Int. J. Sci. Bus. 2021, 5, 76–88.
4. Menache, I.; Ozdaglar, A. Network games: Theory, models, and dynamics. Synth. Lect. Commun. Netw. 2011, 4, 1–159.
5. Hurwicz, L.; Reiter, S. Designing Economic Mechanisms; Cambridge University Press: Cambridge, UK, 2006.
6. Börgers, T.; Krahmer, D. An Introduction to the Theory of Mechanism Design; Oxford University Press: Oxford, UK, 2015.
7. United States Department of Energy. 2018 Smart Grid System Report; United States Department of Energy: Washington, DC, USA, 2018.
8. Garg, D.; Narahari, Y.; Gujar, S. Foundations of mechanism design: A tutorial part 1-key concepts and classical results. Sadhana 2008, 33, 83.
9. Garg, D.; Narahari, Y.; Gujar, S. Foundations of mechanism design: A tutorial Part 2-Advanced concepts and results. Sadhana 2008, 33, 131.
10. Vickrey, W. Counterspeculation, auctions, and competitive sealed tenders. J. Financ. 1961, 16, 8–37.
11. Clarke, E.H. Multipart pricing of public goods. Public Choice 1971, 11, 17–33.
12. Groves, T. Incentives in teams. Econom. J. Econom. Soc. 1973, 41, 617–631.
13. Kelly, F.P.; Maulloo, A.K.; Tan, D.K. Rate control for communication networks: Shadow prices, proportional fairness and stability. J. Oper. Res. Soc. 1998, 49, 237–252.
14. Yang, S.; Hajek, B. Revenue and stability of a mechanism for efficient allocation of a divisible good. Preprint 2005. Available online: https://www.researchgate.net/publication/238308421_Revenue_and_Stability_of_a_Mechanism_for_Ecient_Allocation_of_a_Divisible_Good (accessed on 14 May 2021).
15. Maheswaran, R.; Başar, T. Efficient signal proportional allocation (ESPA) mechanisms: Decentralized social welfare maximization for divisible resources. IEEE J. Sel. Areas Commun. 2006, 24, 1000–1009.
16. Sinha, A.; Anastasopoulos, A. Mechanism design for resource allocation in networks with intergroup competition and intragroup sharing. IEEE Trans. Control Netw. Syst. 2018, 5, 1098–1109.
17. Rabbat, M.; Nowak, R. Distributed optimization in sensor networks. In Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks, Berkeley, CA, USA, 26–27 April 2004; pp. 20–27.
18. Boyd, S.; Parikh, N.; Chu, E. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers; Now Publishers Inc.: Hanover, MA, USA, 2011.
19. Wei, E.; Ozdaglar, A.; Jadbabaie, A. A distributed Newton method for network utility maximization–I: Algorithm. IEEE Trans. Autom. Control 2013, 58, 2162–2175.
20. Alvarado, A.; Scutari, G.; Pang, J.S. A new decomposition method for multiuser DC-programming and its applications. IEEE Trans. Signal Process. 2014, 62, 2984–2998.
21. Di Lorenzo, P.; Scutari, G. Distributed nonconvex optimization over time-varying networks. In Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 20–25 March 2016; pp. 4124–4128.
22. Sinha, A.; Anastasopoulos, A. Distributed mechanism design with learning guarantees for private and public goods problems. IEEE Trans. Autom. Control 2020, 65, 4106–4121.
23. Heydaribeni, N.; Anastasopoulos, A. Distributed Mechanism Design for Network Resource Allocation Problems. IEEE Trans. Netw. Sci. Eng. 2020, 7, 621–636.
24. Brown, G.W. Iterative solution of games by fictitious play. Act. Anal. Prod. Alloc. 1951, 13, 374–376.
25. Monderer, D.; Shapley, L.S. Fictitious play property for games with identical interests. J. Econ. Theory 1996, 68, 258–265.
26. Hofbauer, J.; Sandholm, W.H. On the global convergence of stochastic fictitious play. Econometrica 2002, 70, 2265–2294.
27. Milgrom, P.; Roberts, J. Rationalizability, learning, and equilibrium in games with strategic complementarities. Econom. J. Econom. Soc. 1990, 58, 1255–1277.
28. Scutari, G.; Palomar, D.P.; Facchinei, F.; Pang, J.S. Monotone games for cognitive radio systems. In Distributed Decision Making and Control; Springer: Berlin/Heidelberg, Germany, 2012; pp. 83–112.
29. Polyak, B. Introduction to Optimization; Optimization Software Inc.: New York, NY, USA, 1987.
30. Srikant, R.; Ying, L. Communication Networks: An Optimization, Control, and Stochastic Networks Perspective; Cambridge University Press: Cambridge, UK, 2013.
31. Huang, Z.; Mitra, S.; Vaidya, N. Differentially private distributed optimization. In Proceedings of the 2015 International Conference on Distributed Computing and Networking, Goa, India, 4–7 January 2015; pp. 1–10.
32. Cortés, J.; Dullerud, G.E.; Han, S.; Le Ny, J.; Mitra, S.; Pappas, G.J. Differential privacy in control and network systems. In Proceedings of the 2016 IEEE 55th Conference on Decision and Control (CDC), Las Vegas, NV, USA, 12–14 December 2016; pp. 4252–4272.
33. Nozari, E.; Tallapragada, P.; Cortés, J. Differentially private distributed convex optimization via objective perturbation. In Proceedings of the American Control Conference (ACC), Boston, MA, USA, 6–8 July 2016; pp. 2061–2066.
34. Han, S.; Topcu, U.; Pappas, G.J. Differentially private distributed constrained optimization. IEEE Trans. Autom. Control 2017, 62, 50–64.
35. Groves, T.; Ledyard, J. Optimal allocation of public goods: A solution to the “free rider” problem. Econom. J. Econom. Soc. 1977, 45, 783–809.
36. Hurwicz, L. Outcome functions yielding Walrasian and Lindahl allocations at Nash equilibrium points. Rev. Econ. Stud. 1979, 46, 217–225.
37. Huang, J.; Berry, R.A.; Honig, M.L. Auction-based spectrum sharing. Mob. Netw. Appl. 2006, 11, 405–418.
38. Wang, B.; Wu, Y.; Ji, Z.; Liu, K.R.; Clancy, T.C. Game theoretical mechanism design methods. IEEE Signal Process. Mag. 2008, 25, 74–84.
39. Wang, S.; Xu, P.; Xu, X.; Tang, S.; Li, X.; Liu, X. TODA: Truthful online double auction for spectrum allocation in wireless networks. In Proceedings of the 2010 IEEE Symposium on New Frontiers in Dynamic Spectrum (DySPAN), Singapore, 6–9 April 2010; pp. 1–10.
40. Ghosh, A.; Roth, A. Selling privacy at auction. In Proceedings of the 12th ACM Conference on Electronic Commerce, San Jose, CA, USA, 5–9 June 2011; pp. 199–208.
41. Khalili, M.M.; Naghizadeh, P.; Liu, M. Designing cyber insurance policies: Mitigating moral hazard through security pre-screening. In Proceedings of the International Conference on Game Theory for Networks, Knoxville, TN, USA, 9 May 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 63–73.
42. Pal, R.; Wang, Y.; Li, J.; Liu, M.; Crowcroft, J.; Li, Y.; Tarkoma, S. Data Trading with Competitive Social Platforms: Outcomes are Mostly Privacy Welfare Damaging. IEEE Trans. Netw. Serv. Manag. 2020.
43. Caron, S.; Kesidis, G. Incentive-based energy consumption scheduling algorithms for the smart grid. In Proceedings of the 2010 First IEEE International Conference on Smart Grid Communications, Gaithersburg, MD, USA, 4–6 October 2010; pp. 391–396.
44. Samadi, P.; Mohsenian-Rad, H.; Schober, R.; Wong, V.W. Advanced demand side management for the future smart grid using mechanism design. IEEE Trans. Smart Grid 2012, 3, 1170–1180.
45. Muthirayan, D.; Kalathil, D.; Poolla, K.; Varaiya, P. Mechanism design for demand response programs. IEEE Trans. Smart Grid 2019, 11, 61–73.
46. Iosifidis, G.; Gao, L.; Huang, J.; Tassiulas, L. An iterative double auction for mobile data offloading. In Proceedings of the 2013 11th International Symposium and Workshops on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt), Tsukuba, Japan, 13–17 May 2013; pp. 154–161.
47. Johari, R.; Tsitsiklis, J.N. Efficiency loss in a network resource allocation game. Math. Oper. Res. 2004, 29, 407–435.
48. Yang, S.; Hajek, B. VCG-Kelly mechanisms for allocation of divisible goods: Adapting VCG mechanisms to one-dimensional signals. IEEE J. Sel. Areas Commun. 2007, 25, 1237–1243.
49. Johari, R.; Tsitsiklis, J.N. Efficiency of scalar-parameterized mechanisms. Oper. Res. 2009, 57, 823–839.
50. Farhadi, F.; Golestani, S.J.; Teneketzis, D. A surrogate optimization-based mechanism for resource allocation and routing in networks with strategic agents. IEEE Trans. Autom. Control 2018, 64, 464–479.
51. Kakhbod, A.; Teneketzis, D. An Efficient Game Form for Unicast Service Provisioning. IEEE Trans. Autom. Control 2012, 57, 392–404.
52. Kakhbod, A.; Teneketzis, D. Correction to “An Efficient Game Form for Unicast Service Provisioning” [Feb 12 392-404]. IEEE Trans. Autom. Control 2015, 60, 584–585.
53. Sinha, A.; Anastasopoulos, A. A General Mechanism Design Methodology for Social Utility Maximization with Linear Constraints. ACM Sigmetrics Perform. Eval. Rev. 2014, 42, 12–15.
54. Baumann, L. Self-Ratings and Peer Review. 2018. Available online: https://www.amse-aixmarseille.fr/sites/default/files/events/JMPLeonieBaumann_0.pdf (accessed on 30 July 2021).
55. Bloch, F.; Olckers, M. Friend-Based Ranking in Practice. Available online: https://arxiv.org/abs/2101.02857 (accessed on 14 May 2021).
56. Chen, Y. A family of supermodular Nash mechanisms implementing Lindahl allocations. Econ. Theory 2002, 19, 773–790.
57. Healy, P.J.; Mathevet, L. Designing stable mechanisms for economic environments. Theor. Econ. 2012, 7, 609–661.
58. Scutari, G.; Facchinei, F.; Pang, J.S.; Palomar, D.P. Real and complex monotone communication games. IEEE Trans. Inf. Theory 2014, 60, 4197–4231.
59. Gharesifard, B.; Cortés, J. Distributed convergence to Nash equilibria in two-network zero-sum games. Automatica 2013, 49, 1683–1692.
60. Ye, M.; Hu, G. Game design and analysis for price-based demand response: An aggregate game approach. IEEE Trans. Cybern. 2016, 47, 720–730.
61. Grammatico, S. Dynamic control of agents playing aggregative games with coupling constraints. IEEE Trans. Autom. Control 2017, 62, 4537–4548.
62. Yi, P.; Pavel, L. Distributed generalized Nash equilibria computation of monotone games via double-layer preconditioned proximal-point algorithms. IEEE Trans. Control Netw. Syst. 2018, 6, 299–311.
63. Paccagnan, D.; Gentile, B.; Parise, F.; Kamgarpour, M.; Lygeros, J. Nash and Wardrop equilibria in aggregative games with coupling constraints. IEEE Trans. Autom. Control 2018, 64, 1373–1388.
64. Parise, F.; Ozdaglar, A. A variational inequality framework for network games: Existence, uniqueness, convergence and sensitivity analysis. Games Econ. Behav. 2019, 114, 47–82.
65. Xiao, Y.; Hou, X.; Hu, J. Distributed solutions of convex-concave games on networks. In Proceedings of the 2019 American Control Conference (ACC), Philadelphia, PA, USA, 10–12 July 2019; pp. 1189–1194.
66. Bimpikis, K.; Ehsani, S.; Ilkılıç, R. Cournot competition in networked markets. Manag. Sci. 2019, 65, 2467–2481.
67. Zhang, K.; Yang, Z.; Başar, T. Policy optimization provably converges to Nash equilibria in zero-sum linear quadratic games. arXiv 2019, arXiv:1906.00729.
68. uz Zaman, M.A.; Zhang, K.; Miehling, E.; Bașar, T. Reinforcement learning in non-stationary discrete-time linear-quadratic mean-field games. In Proceedings of the 2020 59th IEEE Conference on Decision and Control (CDC), Jeju, Korea, 14–18 December 2020; pp. 2278–2284.
69. Roudneshin, M.; Arabneydi, J.; Aghdam, A.G. Reinforcement learning in nonzero-sum Linear Quadratic deep structured games: Global convergence of policy optimization. In Proceedings of the 2020 59th IEEE Conference on Decision and Control (CDC), Jeju, Korea, 14–18 December 2020; pp. 512–517.
70. Shi, Y.; Zhang, B. Multi-agent reinforcement learning in Cournot games. In Proceedings of the 2020 59th IEEE Conference on Decision and Control (CDC), Jeju, Korea, 14–18 December 2020; pp. 3561–3566.
71. Sohet, B.; Hayel, Y.; Beaude, O.; Jeandin, A. Learning Pure Nash Equilibrium in Smart Charging Games. In Proceedings of the 2020 59th IEEE Conference on Decision and Control (CDC), Jeju, Korea, 14–18 December 2020; pp. 3549–3554.
72. Sinha, A. Mechanism Design with Allocative, Informational and Learning Constraints. Ph.D. Thesis, 2017. Available online: http://web.eecs.umich.edu/~anastas/anastas/docs/northwestern_2017.pdf (accessed on 14 May 2021).
73. Zhou, X. On the Fenchel duality between strong convexity and Lipschitz continuous gradient. arXiv 2018, arXiv:1803.06573.

Figure 1. Proxies in the Message Exchange Network: for constraint l, user i announces

n^{i, k, l}

as a summary of demands for the tree on the left of i (starting from k) and

n^{i, j, l}

as a summary of demands for the tree on the right of i (starting from j).

Figure 2. Message exchange network: Users 1 and 2, and users 2 and 3 are neighbors. User 1’s message is invisible to user 3, and vice versa.

Figure 3. The convergence of the learning algorithm.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wei, X.; Anastasopoulos, A. Mechanism Design for Demand Management in Energy Communities. Games 2021, 12, 61. https://doi.org/10.3390/g12030061
