A Class of Distributed Online Aggregative Optimization in Unknown Dynamic Environment

Yang, Chengqian; Wang, Shuang; Zhang, Shuang; Lin, Shiwei; Huang, Bomin

doi:10.3390/math12162460


by Chengqian Yang ¹, Shuang Wang ¹, Shuang Zhang ², Shiwei Lin ² and Bomin Huang ²,*

¹ School of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China

² College of Computer Engineering, Jimei University, Xiamen 361021, China

* Author to whom correspondence should be addressed.

Mathematics 2024, 12(16), 2460; https://doi.org/10.3390/math12162460

Submission received: 9 July 2024 / Revised: 7 August 2024 / Accepted: 7 August 2024 / Published: 8 August 2024

(This article belongs to the Topic Distributed Optimization for Control)


Abstract

This paper considers a class of distributed online aggregative optimization problems over an undirected and connected network. It takes into account an unknown dynamic environment and some aggregation functions, which is different from the problem formulation of the existing approach, making the aggregative optimization problem more challenging. A distributed online optimization algorithm is designed for the considered problem via the mirror descent algorithm and the distributed average tracking method. In particular, the dynamic environment and the gradient are estimated by the averaged tracking methods, and then an online optimization algorithm is designed via a dynamic mirror descent method. It is shown that the dynamic regret is bounded in the order of

O (\sqrt{T})

. Finally, the effectiveness of the designed algorithm is verified by some simulations of cooperative control of a multi-robot system.

Keywords:

online optimization; aggregated terms; distributed algorithm; dynamic environment

MSC:

68W15

1. Introduction

The distributed online optimization problem for multi-agent systems has received considerable attention in the past few decades [1,2]. The objective is to minimize a global time-varying cost function, written as the sum of local convex functions, under the constraint that each agent knows only its own local convex function.

Recently, some researchers have focused on distributed online aggregative optimization, which is a special class of online optimization problems. In distributed online aggregative optimization, the local cost functions include aggregative terms motivated by many real applications. These aggregative terms make the design and analysis of online optimization algorithms more challenging. Several tracking methods have been presented to solve distributed online aggregative optimization problems. For example, a distributed aggregative gradient tracking algorithm was proposed and analyzed in [3], where convergence to the optimal variable was shown to be linear. Different from [3], a set constraint was considered in [4], where an online distributed gradient tracking algorithm was proposed for the cases of exact gradient information and of stochastic/noisy gradients. An upper bound on the dynamic regret was derived, and simulations on the target surrounding problem were provided to verify the effectiveness of the designed algorithm. The authors of [5] considered distributed online aggregative optimization without assuming boundedness of the gradients and the feasible sets. In particular, a projected aggregative tracking algorithm was presented in [5], and the authors showed that the dynamic regret is bounded by a constant term and a term related to time variations.

Dynamic environments are common in real applications of online optimization [6,7,8,9,10,11]. For example, the authors of [6] showed that many online learning problems, including dynamic texture analysis, solar flare detection, sequential compressed sensing of a dynamic scene, traffic surveillance, tracking self-exciting point processes, and network behavior in the Enron email corpus, can benefit from the incorporation of a dynamic environment. In [7], the tracking problem of a time-varying parameter with unknown dynamics was studied as an online optimization problem in a dynamic environment. Distributed tracking and the tracking of dynamic point processes on networks were addressed in [8] and [9], respectively, using mirror descent methods with dynamic environments. The sensor network localization problem was solved in [10] using a distributed online bandit learning algorithm over a multi-agent network with a dynamic environment. Note that the dynamic environments were assumed to be known in [6,7,8,9,10]. In contrast with [6,7,8,9,10], an unknown dynamic environment was considered in [11], where the dynamic regret was shown to be bounded.

Based on the above discussion, distributed online aggregative optimization with a dynamic environment is motivated by many real applications, especially when the dynamic environment is unknown. Therefore, this paper considers a class of distributed online aggregative optimization problems with an unknown dynamic environment. The main contributions are as follows. Aggregative terms and an unknown dynamic environment are considered simultaneously in an online optimization problem. The problem formulation comes from real applications, e.g., the cooperative control problem for a multi-robot system in the simulation section. By comparison, the dynamic environment was not considered in [3,4,5], and the aggregative terms were not considered in [6,7,8,9,10,11]. Therefore, the design and analysis of the optimization algorithm in this paper are more challenging than in [3,4,5,6,7,8,9,10,11]. In particular, averaged tracking methods are used to estimate the dynamic environment and the gradient. Based on these estimates, an online optimization algorithm is designed via a dynamic mirror descent method. It is shown that the dynamic regret is bounded in the order of

O (\sqrt{T})

, and simulations are presented to verify the effectiveness of the designed algorithm.

The remainder of this paper is organized as follows. In Section 2 and Section 3, some preliminaries and the problem formulation are presented. The main results are presented in Section 4. In Section 5, the performance of the designed algorithm is verified by simulations. Section 6 concludes the paper.

2. Preliminaries

2.1. Notations

R^{n}

denotes the n-dimensional Euclidean space.

∥ x ∥

denotes the 2-norm of a vector x.

[T]

with positive integer T denotes the set

\{1, 2, \dots, T\}

.

〈 x, y 〉

denotes the inner product of vectors

x \in R^{n}

and

y \in R^{n}

, i.e.,

〈 x, y 〉 = x^{T} y

where

x^{T}

denotes the transpose of x. For a function

f (x, t) : R^{n} \times R \to R

,

\nabla f (x, t)

denotes the gradient of

f (x, t)

with respect to a vector x.

σ_{2} (W)

denotes the second-largest singular value of matrix W.

2.2. Graph Theory

For a multi-agent system, we use a directed graph

G = (V, E)

to describe the information exchange within it, where

V = \{1, 2, \dots, N\}

is the node set and

E \subset V \times V

is the edge set.

(j, i) \in E

means that agent i can obtain information from agent j. Self-loops are not considered, i.e.,

(i, i) \notin E

.

N_{i} = \{j : (j, i) \in E\}

denotes the in-neighbor set of node i. If for every

(j, i) \in E

, there exists

(i, j) \in E

, then the graph is an undirected graph. A path between nodes i and j is a sequence of edges

(i, V_{1}), (V_{1}, V_{2}), \dots, (V_{k}, j)

in the graph

G

where

V_{l}, l = 1, 2, \dots, k

are some distinct nodes. An undirected graph is connected if there is a path between each pair of nodes.

2.3. Bregman Divergence

The mirror descent algorithm based on Bregman divergence is frequently used and effective in online optimization [8]. The Bregman divergence is defined as follows. Let

R : R^{n} \to R

be a strongly convex function which satisfies

R (x) \geq R (y) + 〈 x - y, \nabla R (y) 〉 + \frac{1}{2} {∥ x - y ∥}^{2}, \forall x, y \in R^{n},

(1)

and define the Bregman divergence

D_{R} (x, y)

as

D_{R} (x, y) = R (x) - R (y) - 〈 x - y, \nabla R (y) 〉 .

(2)
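As an illustration of definition (2), the following minimal sketch evaluates the Bregman divergence numerically. The helper name `bregman` and the choice R(x) = ½∥x∥² are assumptions made here for illustration only; with this choice, condition (1) holds with equality and the divergence reduces to half the squared Euclidean distance.

```python
import numpy as np

def bregman(R, grad_R, x, y):
    """D_R(x, y) = R(x) - R(y) - <x - y, grad R(y)>, as in definition (2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return R(x) - R(y) - np.dot(x - y, grad_R(y))

# Illustrative generating function (an assumption, not fixed by the text):
# R(x) = 0.5 * ||x||^2, whose gradient is x itself.
half_sq_norm = lambda x: 0.5 * np.dot(x, x)
half_sq_norm_grad = lambda x: x

x = np.array([1.0, 2.0])
y = np.array([0.0, -1.0])
d = bregman(half_sq_norm, half_sq_norm_grad, x, y)

# For this R, the divergence equals 0.5 * ||x - y||^2.
print(d, 0.5 * np.dot(x - y, x - y))
```

Other strongly convex choices of R can be passed to the same helper as long as the gradient is supplied alongside.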

3. Problem Formulation

Consider a network system composed of N agents, and suppose there exists a sequence of time-varying convex functions

f_{t} (x) : R^{n} \to R

over the network, each composed of local functions

f_{i, t} (x, v (x)) : R^{n} \times R^{d} \to R

. Consider the following optimization problem:

\begin{matrix} \min_{x \in X} f_{t} (x) & = & \sum_{i = 1}^{N} f_{i, t} (x, v (x)), \end{matrix}

(3)

where

i \in [N]

,

t \in [T]

,

x \in X \subseteq R^{n}

,

ψ_{i} : R^{n} \to R^{d}

, and

v : R^{n} \to R^{d}

is the aggregative variable defined by

v (x) = \frac{1}{N} \sum_{i = 1}^{N} ψ_{i} (x)

. The function

f_{i, t} (x, v (x))

is convex and assigned to agent i. Suppose there exists a sequence of unknown stable non-expansive mappings

A_{t} \in R^{n \times n}

, i.e.,

∥ A_{t} ∥ \leq 1

, such that

\begin{matrix} x_{t + 1}^{*} = A_{t} x_{t}^{*} + ϑ_{t}, \end{matrix}

(4)

where

x_{t}^{*} = {argmin}_{x \in X} f_{t} (x)

and

ϑ_{t}

is an unstructured and unknown noise. Assume that agent i independently observes the mapping

A_{t}

, and let

A_{i, t}

denote the observed value at time t. Moreover, assume that

A_{t}^{*}

is the optimal observed value for all agents and

A_{t}^{*}

is the optimal solution to the following optimization problem:

\begin{matrix} A_{t}^{*} = \underset{{\tilde{A}}_{t}}{arg min} \sum_{i = 1}^{N} {∥ {\tilde{A}}_{t} - A_{i, t} ∥}^{2} . \end{matrix}

(5)

It follows from (4) and (5) that

\begin{matrix} x_{t + 1}^{*} = A_{t}^{*} x_{t}^{*} + θ_{t}, \end{matrix}

(6)

where

θ_{t} = (A_{t} - A_{t}^{*}) x_{t}^{*} + ϑ_{t}

.
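If the matrix norm in (5) is taken to be the Frobenius norm (an assumption made here, since the norm for matrices is not specified in the notation), problem (5) is an unconstrained least-squares problem whose minimizer is the entrywise average of the observations, i.e., A_t^* = (1/N) Σ_i A_{i,t}. The sketch below checks this numerically; the function names and the sample observations are hypothetical.

```python
import numpy as np

def consensus_mapping(observations):
    """Minimizer of sum_i ||A - A_i||_F^2 over A: the entrywise mean of the A_i."""
    return np.mean(np.asarray(observations, dtype=float), axis=0)

def objective(A, observations):
    """Value of the cost in (5), with the Frobenius norm assumed."""
    return sum(np.linalg.norm(A - A_i, "fro") ** 2 for A_i in observations)

rng = np.random.default_rng(0)
A_true = np.array([[1.0, 1e-3], [0.0, 1.0]])
# Four noisy observations of the same mapping (illustrative noise level).
obs = [A_true + 1e-3 * rng.standard_normal() * np.eye(2) for _ in range(4)]
A_star = consensus_mapping(obs)

# Any perturbation of the mean should not decrease the cost.
perturbed = A_star + 1e-3 * rng.standard_normal((2, 2))
print(objective(A_star, obs) <= objective(perturbed, obs))
```

This is also why an average tracking scheme is a natural way for each agent to estimate A_t^* from local observations.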

Distributed Online Aggregative Optimization Problem:

The objective is to generate a sequence

x_{i, t}, i \in [N], t \in [T]

to minimize the following dynamic regret,

\begin{matrix} Reg (T) := \sum_{t = 1}^{T} \sum_{i = 1}^{N} f_{i, t} (x_{i, t}) - \sum_{t = 1}^{T} \sum_{i = 1}^{N} f_{i, t} (x_{t}^{*}), \end{matrix}

(7)

where

x_{t}^{*}

is the optimal solution to (3) which satisfies constraint (6), i.e.,

x_{t}^{*} = {argmin}_{x \in X} f_{t} (x)

while

x_{t}^{*}

satisfies constraint (6).
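The dynamic regret (7) can be evaluated directly once the local costs, the iterates, and the per-round minimizers are available. The following is a minimal sketch under the assumption that each local cost is supplied as a callable of x alone (i.e., with v(x) already substituted); the function name `dynamic_regret` and the toy quadratic costs are hypothetical.

```python
import numpy as np

def dynamic_regret(costs, iterates, minimizers):
    """Evaluate (7).

    costs[t][i]    : callable x -> f_{i,t}(x, v(x))
    iterates[t][i] : decision x_{i,t} of agent i at round t
    minimizers[t]  : comparator x_t^* at round t
    """
    regret = 0.0
    for f_t, x_t, x_star in zip(costs, iterates, minimizers):
        for f_it, x_it in zip(f_t, x_t):
            regret += f_it(x_it) - f_it(x_star)
    return regret

# Toy example: two agents, two rounds, f_{i,t}(x) = ||x - c_t||^2.
targets = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]
costs = [[(lambda x, c=c: float(np.dot(x - c, x - c))) for _ in range(2)]
         for c in targets]
iterates = [[np.array([1.0, 0.0]), np.array([0.0, 1.0])],
            [np.array([1.0, 0.0]), np.array([1.0, 1.0])]]
reg = dynamic_regret(costs, iterates, targets)
print(reg)  # 1 + 1 + 0 + 1 = 3.0
```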

Remark 1.

The online aggregative optimization problems have been studied in [4,5]. However, this paper additionally considers an unknown dynamic environment, i.e., the mapping

A_{t}

, which makes the problem more challenging than those in [4,5].

Let

\nabla ψ_{i} (x_{i})

,

\nabla_{1} f_{i, t} (x_{i}, v)

and

\nabla_{2} f_{i, t} (x_{i}, v)

denote

\nabla_{x_{i}} ψ_{i} (x_{i})

,

\nabla_{x_{i}} f_{i, t} (x_{i}, v)

and

\nabla_{v} f_{i, t} (x_{i}, v)

, respectively. Let

W \in R^{N \times N}

be the adjacency matrix of network

G

, and

η_{t} \in R, t \in [T]

be the global step-size which will be used to design an online optimization algorithm in the next section. Let

\bar{X} \subseteq R^{n}

and

\bar{Y} \subseteq R^{d}

be some convex sets which will be defined in the next section. Some necessary assumptions are as follows.

Assumption 1.

Graph

G

is undirected and connected, and W is doubly stochastic, i.e.,

\begin{matrix} \sum_{i = 1}^{N} W_{i j} = \sum_{j = 1}^{N} W_{i j} = 1, \end{matrix}

and there exists a constant

α \in (0, 1)

such that

W_{i j} \geq α

when

W_{i j} > 0

, and

W_{i i} \geq α

for all

i \in [N]

.
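One common way to obtain a weight matrix satisfying Assumption 1 on an undirected connected graph is the Metropolis rule. This construction is not prescribed by the paper and is shown here only as a sketch; the function name and the 4-node ring graph are illustrative assumptions.

```python
import numpy as np

def metropolis_weights(adjacency):
    """Symmetric, doubly stochastic weights for an undirected graph.

    adjacency: (N, N) symmetric 0/1 array without self-loops.
    """
    adjacency = np.asarray(adjacency, dtype=bool)
    n = adjacency.shape[0]
    degree = adjacency.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i, j]:
                W[i, j] = 1.0 / (1 + max(degree[i], degree[j]))
        # Remaining mass goes on the diagonal, so each row sums to one.
        W[i, i] = 1.0 - W[i].sum()
    return W

# Ring graph on four nodes (illustrative).
ring = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]])
W = metropolis_weights(ring)

# Second-largest singular value, which governs the consensus rate in Lemma 3.
sigma2 = np.linalg.svd(W, compute_uv=False)[1]
alpha = W[W > 0].min()  # a valid lower bound alpha for the positive entries
print(W.sum(axis=0), W.sum(axis=1), sigma2, alpha)
```

For this ring, every positive entry equals 1/3, so α = 1/3 is admissible and σ₂(W) = 1/3.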

Assumption 2.

For

x, m \in \bar{X}

,

y, n \in \bar{Y}

,

i \in [N]

and

t \in [T]

, the functions

f_{i, t} (x)

,

\nabla_{1} f_{i, t} (x, y)

,

\nabla_{2} f_{i, t} (x, y)

,

ψ_{i} (x)

, and

\nabla ψ_{i} (x)

are Lipschitz continuous, and the functions

ψ_{i} (x)

,

\nabla_{1} f_{i, t} (x, y)

and

\nabla_{2} f_{i, t} (x, y)

are bounded, i.e.,

\begin{matrix} ι ∥ x - m ∥ \leq ∥ f_{i, t} (x) - f_{i, t} (m) ∥ \leq L ∥ x - m ∥ \\ ∥ \nabla_{1} f_{i, t} (x, y) - \nabla_{1} f_{i, t} (m, n) ∥ \leq L (∥ x - m ∥ + ∥ y - n ∥) \\ ∥ \nabla_{2} f_{i, t} (x, y) - \nabla_{2} f_{i, t} (m, n) ∥ \leq L (∥ x - m ∥ + ∥ y - n ∥) \\ ∥ ψ_{i} (x) - ψ_{i} (m) ∥ \leq L ∥ x - m ∥ \\ ∥ \nabla ψ_{i} (x) - \nabla ψ_{i} (m) ∥ \leq L ∥ x - m ∥ \\ ∥ ψ_{i} (x) ∥ \leq H, ∥ \nabla_{1} f_{i, t} (x, y) ∥ \leq H, ∥ \nabla_{2} f_{i, t} (x, y) ∥ \leq H, \end{matrix}

where L is the Lipschitz constant,

ι > 0

and

H > 0

.

Assumption 3.

There exists

\begin{matrix} \begin{matrix} | D_{R} (x, z) - D_{R} (y, z) | & \leq K ∥ x - y ∥ \\ D_{R} (x, \sum_{i = 1}^{N} a (i) y_{i}) & \leq \sum_{i = 1}^{N} a (i) D_{R} (x, y_{i}) \\ | \nabla R (x) - \nabla R (y) | & \leq L_{R} ∥ x - y ∥, \end{matrix} \end{matrix}

where

x, y, y_{i}, z \in \bar{X}

,

a (i)

is on the N-dimensional simplex, and K and

L_{R}

are some Lipschitz constants.

Assumption 4.

There exist positive constants

M_{A}

and M, such that

∥ A_{t}^{*} - I_{n} ∥ \leq M_{A} η_{t}

, and

\sum_{i = 1}^{N} {∥ A_{i, t} - A_{i, t - 1} ∥}_{1} \leq M η_{t}

, where

η_{t}

is the global step sequence. Moreover,

D_{R} (A_{t}^{*} x

,

A_{t}^{*} y) \leq D_{R} (x, y)

holds for

x, y \in \bar{X}

and

t \in [T]

.

Remark 2.

Assumptions 1, 3, and 4 come from [6,7,8,11]. They are common in mirror descent algorithms for distributed online optimization. Assumption 2 is also common in online aggregative optimization [4].

4. Main Result

Inspired by [11], consider the following online algorithm:

\begin{matrix} {\hat{A}}_{i, t} & = \sum_{j = 1}^{N} W_{i j} {\hat{A}}_{j, t - 1} + A_{i, t} - A_{i, t - 1} \end{matrix}

(8a)

\begin{matrix} {\hat{x}}_{i, t + 1} & = \underset{x \in X}{argmin} \{η_{t} 〈 x, \nabla_{i, t} 〉 + D_{R} (x, y_{i, t})\} \end{matrix}

(8b)

\begin{matrix} x_{i, t + 1} & = \sum_{j = 1}^{N} W_{i j} {\hat{A}}_{j, t} {\hat{x}}_{i, t + 1} \end{matrix}

(8c)

\begin{matrix} y_{i, t + 1} & = \sum_{j = 1}^{N} W_{i j} x_{j, t + 1} \end{matrix}

(8d)

\begin{matrix} v_{i, t + 1} & = \sum_{j = 1}^{N} W_{i j} v_{j, t} + ψ_{i} (x_{i, t + 1}) - ψ_{i} (x_{i, t}) \end{matrix}

(8e)

\begin{matrix} z_{i, t + 1} & = \sum_{j = 1}^{N} W_{i j} z_{j, t} + \nabla ψ_{i} (x_{i, t + 1}) - \nabla ψ_{i} (x_{i, t}), \end{matrix}

(8f)

where

i \in [N]

,

t \in [T]

,

\nabla_{i, t} = \nabla_{1} f_{i, t} (x_{i, t}, v_{i, t}) + \nabla_{2} f_{i, t} (x_{i, t}, v_{i, t}) z_{i, t}

,

η_{t}

is the global step sequence,

{\hat{x}}_{i, t + 1}

,

y_{i, t + 1}

,

{\hat{A}}_{i, t + 1}

,

v_{i, t + 1}

and

z_{i, t + 1}

are some auxiliary variables, and

x_{i, t + 1}

is the designed sequence. In (8), the initial states are chosen such that

{\hat{A}}_{i, 0} = A_{i, 0} = 0_{n \times n}

,

y_{i, 0} = x_{i, 0} = {\hat{x}}_{i, 0} = 0_{n}

,

v_{i, 0} = ψ_{i} (x_{i, 0})

and

z_{i, 0} = \nabla ψ_{i} (x_{i, 0})

.
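To make the update order in (8) concrete, the following sketch implements one pass through (8a)–(8f) for all agents. It is a simplified illustration under stated assumptions, not the authors' implementation: R(x) = ½∥x∥² is assumed, so that (8b) reduces to a Euclidean projection of y_{i,t} − η_t ∇_{i,t}; X is assumed to be a Euclidean ball; and z_{i,t} is stored as a transposed Jacobian of shape (n, d), so that the product in ∇_{i,t} is computed as `z[i] @ grad2`. The names `step`, `project_ball`, and all toy values are hypothetical.

```python
import numpy as np

def project_ball(p, radius):
    """Euclidean projection onto X = {x : ||x|| <= radius}."""
    norm = np.linalg.norm(p)
    return p if norm <= radius else p * (radius / norm)

def step(state, W, A_obs, A_obs_prev, grad1, grad2, psi, jac_psi_T, eta, radius):
    """One pass through (8a)-(8f) for all agents.

    state: dict with x (N,n), y (N,n), A_hat (N,n,n), v (N,d), z (N,n,d).
    grad1(i, x, v), grad2(i, x, v): partial gradients of f_{i,t}.
    psi(i, x): aggregation map; jac_psi_T(i, x): its transposed Jacobian (n,d).
    """
    x, y, A_hat, v, z = (state[k] for k in ("x", "y", "A_hat", "v", "z"))
    N = x.shape[0]

    # (8a): track the network average of the observed mappings.
    A_hat_new = np.einsum("ij,jkl->ikl", W, A_hat) + A_obs - A_obs_prev

    # (8b): mirror step; with R = 0.5||.||^2 it is a projected gradient step.
    x_hat = np.empty_like(x)
    for i in range(N):
        g = grad1(i, x[i], v[i]) + z[i] @ grad2(i, x[i], v[i])
        x_hat[i] = project_ball(y[i] - eta * g, radius)

    # (8c): propagate through the locally mixed estimate of the mapping.
    A_mix = np.einsum("ij,jkl->ikl", W, A_hat_new)
    x_new = np.einsum("ikl,il->ik", A_mix, x_hat)

    # (8d): consensus averaging of the decisions.
    y_new = W @ x_new

    # (8e), (8f): average tracking of psi_i and of its Jacobian.
    psi_old = np.stack([psi(i, x[i]) for i in range(N)])
    psi_new = np.stack([psi(i, x_new[i]) for i in range(N)])
    v_new = W @ v + psi_new - psi_old
    jac_old = np.stack([jac_psi_T(i, x[i]) for i in range(N)])
    jac_new = np.stack([jac_psi_T(i, x_new[i]) for i in range(N)])
    z_new = np.einsum("ij,jkl->ikl", W, z) + jac_new - jac_old

    return {"x": x_new, "y": y_new, "A_hat": A_hat_new, "v": v_new, "z": z_new}

# Toy run (all values illustrative): 4 agents on a ring, psi_i(x) = x + gamma_i.
N, n = 4, 2
W = (np.eye(N) + np.roll(np.eye(N), 1, axis=1) + np.roll(np.eye(N), -1, axis=1)) / 3.0
gamma = np.array([[np.cos(2 * np.pi * i / N), np.sin(2 * np.pi * i / N)]
                  for i in range(1, N + 1)])
target = np.array([0.5, -0.2])
psi = lambda i, x: x + gamma[i]
jac_psi_T = lambda i, x: np.eye(n)
grad1 = lambda i, x, v: 2 * 0.2 * (psi(i, x) - (target + gamma[i]))
grad2 = lambda i, x, v: 2 * 0.1 * (v - target)

# Initial states as stated in the text: zeros, with v and z initialized at x_{i,0}.
x0 = np.zeros((N, n))
state = {"x": x0, "y": x0.copy(), "A_hat": np.zeros((N, n, n)),
         "v": np.stack([psi(i, x0[i]) for i in range(N)]),
         "z": np.stack([jac_psi_T(i, x0[i]) for i in range(N)])}
A_prev = np.zeros((N, n, n))
A_obs = np.stack([np.eye(n)] * N)  # every agent observes the identity mapping
for _ in range(50):
    state = step(state, W, A_obs, A_prev, grad1, grad2, psi, jac_psi_T,
                 eta=0.05, radius=3.0)
    A_prev = A_obs
```

Because W is doubly stochastic, the tracking updates preserve network sums: the average of the v_{i,t} equals the average of ψ_i(x_{i,t}) at every round, and likewise for the estimates of the mapping and of the Jacobian.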

It follows from (8) that the states in (8) are bounded. Let sets

\bar{X}

and

\bar{Y}

satisfy

X \subseteq \bar{X}

,

x_{i, t} \in \bar{X}

,

y_{i, t} \in \bar{X}

and

v_{i, t} \in \bar{Y}

. Let

P = \max_{i} \{v_{i, 1}, z_{i, 1}\}

,

Q = \sup_{x \in X} ∥ x ∥

,

R^{2} = \sup_{x, y \in X} D_{R} (x, y)

, and

\begin{matrix} ▵ A_{i, t} & = & \sum_{j = 1}^{N} W_{i j} {\hat{A}}_{j, t} - A_{t}^{*}, {\bar{z}}_{t} = \frac{1}{N} \sum_{i = 1}^{N} z_{i, t} \\ {\bar{x}}_{t + 1} & = & \frac{1}{N} \sum_{i = 1}^{N} x_{i, t + 1}, \nabla v (x) = \frac{1}{N} \sum_{i = 1}^{N} \nabla ψ_{i} (x) \\ {\tilde{\nabla}}_{i, t} & = & \nabla_{1} f_{i, t} (x_{i, t}, v (x_{i, t})) + \nabla_{2} f_{i, t} (x_{i, t}, v (x_{i, t})) \nabla v (x_{i, t}) . \end{matrix}

Remark 3.

We have provided a comparative analysis of problem formulation of online optimization in Table 1, where

DAT

denotes distributed average tracking. It follows that the problem considered in this paper is more challenging than those in [3,4,5,6,7,8,9,10,11]. Algorithms (8a)–(8d) come from [11], where they were used to solve a class of distributed online optimization problems with an unknown mapping. However, the aggregative variable

v (x)

is not considered in [11]. This makes the design and analyses of the optimization algorithm more challenging. In particular, it follows from (3) that, for agent i, the gradient of the function

f_{i, t} (x, v (x))

with respect to x is

\nabla_{1} f_{i, t} (x_{i, t}, v (x_{i, t})) + \nabla_{2} f_{i, t} (x_{i, t}, v (x_{i, t})) \nabla v (x_{i, t})

. But

v (x_{i, t})

and

\nabla v (x_{i, t})

are unknown for agent i due to the topology constraint. Therefore, the algorithms (8e) and (8f) are added by using the distributed average tracking method to estimate

v (x_{i, t})

and

\nabla v (x_{i, t})

. Based on (8e) and (8f), the new gradient

\nabla_{i, t} = \nabla_{1} f_{i, t} (x_{i, t}, v_{i, t}) + \nabla_{2} f_{i, t} (x_{i, t}, v_{i, t}) z_{i, t}

is introduced in algorithm (8b). Hence, algorithms (8e) and (8f) are used to achieve aggregative gradient tracking via

DAT

. A rigorous analysis of the dynamic regret of algorithm (8) is presented below.

Some lemmas and the main result are presented as follows, and the proofs are given in Appendices A–G.

Lemma 1.

Consider the mapping (5) and the algorithm (8a), and suppose that Assumptions 1 and 4 hold. Then for all

i \in [N]

and

t \in [T]

, there exists

\begin{matrix} ∥ ▵ A_{i, t} ∥ \leq \frac{8 M η_{t}}{1 - λ}, \end{matrix}

where

λ \in (0, 1)

.

Lemma 2.

Consider algorithm (8f), and suppose that Assumptions 1 and 2 hold. Then, for all

i \in [N]

and

t \in [T]

, there exists,

\begin{matrix} \begin{matrix} ∥ z_{i, t} - {\bar{z}}_{t} ∥ \leq Δ_{1}, ∥ \nabla_{i, t} ∥ \leq Δ_{2}, \end{matrix} \end{matrix}

where

Δ_{1} = N P γ + \frac{2 N L γ β}{1 - β} + 4 L

,

Δ_{2} = H Δ_{1} + H L + H

,

γ = {(1 - \frac{α}{2 N^{2}})}^{- 2}

, and

β = 1 - \frac{α}{2 N^{2}}

.

Lemma 3.

Consider algorithms (8a)–(8d), and suppose that Assumptions 1–4 hold. Then, for all

i \in [N]

and

t \in [T]

, there exists,

\begin{matrix} \begin{matrix} ∥ x_{i, t + 1} - {\bar{x}}_{t + 1} ∥ & \leq Δ_{3} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ}, \end{matrix} \end{matrix}

(9)

where

Δ_{3} = \sqrt{N} [Δ_{2} + M_{A} Q + \frac{8 M Q}{1 - λ}]

.
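For the constant step size η_t = η used later in Theorem 1, the sum in (9) can be bounded by a geometric series. The short derivation below assumes σ_2(W) < 1; it indicates where the factors 1/(1 − σ_2(W)) in the constant Δ_8 come from.

```latex
\sum_{\tau = 0}^{t} \sigma_{2}^{t - \tau}(W)\, \eta
  = \eta \sum_{k = 0}^{t} \sigma_{2}^{k}(W)
  = \eta\, \frac{1 - \sigma_{2}^{t + 1}(W)}{1 - \sigma_{2}(W)}
  \leq \frac{\eta}{1 - \sigma_{2}(W)},
\qquad \text{so} \qquad
\| x_{i, t + 1} - \bar{x}_{t + 1} \| \leq \frac{\Delta_{3}\, \eta}{1 - \sigma_{2}(W)} .
```

In other words, under a constant step size the disagreement between each agent and the network average is of order η uniformly in t.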

Lemma 4.

Consider algorithm (8), and suppose that Assumptions 1–3 hold. Then, for all

i \in [N]

and

t \in [T]

, there exists,

\begin{matrix} \begin{matrix} ∥ x_{i, t + 1} - x_{i, t} ∥ & \leq Δ_{4} η_{t} + 2 Δ_{3} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} \\ ∥ v_{i, t} - v (x_{i, t}) ∥ & \leq N γ P β^{t} + Δ_{5} η_{t - 1} + Δ_{6} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} \\ ∥ z_{i, t} - \nabla v (x_{i, t}) ∥ & \leq N γ P β^{t} + Δ_{5} η_{t - 1} + Δ_{6} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ}, \end{matrix} \end{matrix}

(10)

where

Δ_{4} = Δ_{2} + M_{A} Q + \frac{8 M Q}{1 - λ}

,

Δ_{5} = L Δ_{4} (\frac{N γ β}{1 - β} + 2)

, and

Δ_{6} = 2 L Δ_{3} (\frac{N γ β}{1 - β} + 3)

.

Lemma 5.

Consider algorithm (8), and suppose that Assumptions 1, 3 and 4 hold. Then, for all

i \in [N]

and

t \in [T]

, there exists,

\begin{matrix} \begin{matrix} \sum_{t = 1}^{T} \sum_{i = 1}^{N} (\frac{1}{η_{t}} D_{R} (x_{t}^{*}, y_{i, t}) - \frac{1}{η_{t}} D_{R} (x_{t}^{*}, {\hat{x}}_{i, t + 1})) \\ \leq & \frac{2 N R^{2}}{η_{T + 1}} + \sum_{t = 1}^{T} \frac{N K}{η_{t + 1}} ∥ x_{t + 1}^{*} - A_{t}^{*} x_{t}^{*} ∥ + \sum_{t = 1}^{T} \sum_{i = 1}^{N} \frac{η_{t}}{η_{t + 1}} Δ_{7} ∥ A_{t}^{*} x_{t}^{*} - x_{i, t + 1} ∥, \end{matrix} \end{matrix}

(11)

where

Δ_{7} = \frac{8 L_{R} M Q}{(1 - λ)}

.

Lemma 6.

Consider algorithm (8), and suppose that Assumptions 1–3 hold. Then, for all

i \in [N]

and

t \in [T]

, there exists,

\begin{matrix} \begin{matrix} ∥ {\tilde{\nabla}}_{i, t} - \nabla_{i, t} ∥ \leq (L + L^{2} + H) \{N γ P β^{t} + Δ_{5} η_{t - 1} + Δ_{6} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ}\} . \end{matrix} \end{matrix}

The main result of this paper is as follows.

Theorem 1.

Consider the optimization problem (3), constraint (6), and algorithm (8), and suppose that Assumptions 1–4 hold. Let

η_{t} = η > 0

and

ι > Δ_{7}

. Then there exists

\begin{matrix} \begin{matrix} Reg (T) \leq \frac{L}{ι - Δ_{7}} \{Δ_{8} η T + \frac{2 N R^{2} + N K C_{T}}{η} + Δ_{9}\}, \end{matrix} \end{matrix}

where

C_{T} = \sum_{t = 1}^{T} ∥ θ_{t} ∥

,

Δ_{8} = 2 Q (L + L^{2} + H) \{Δ_{5} + \frac{Δ_{6}}{1 - σ_{2} (W)}\} + \frac{N {(H + H L)}^{2}}{2} + \frac{2 N (H + H L) Δ_{3}}{1 - σ_{2} (W)}

, and

Δ_{9} = 2 N Q Δ_{7} + Δ_{7} C_{T} + 2 Q N^{2} γ P (L + L^{2} + H) \frac{β}{1 - β}

.

Remark 4.

The condition

ι > Δ_{7}

is reasonable since a sufficiently small

Δ_{7}

can be obtained by decreasing

L_{R}

. It follows from the bound on

Reg (T)

in Theorem 1 that a suitable step size is

η_{t} = η_{0} / \sqrt{T}

with

η_{0} > 0

. In such a case, the regret

Reg (T)

is bounded in the order of

O (\sqrt{T})

for any

T > 1

.
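The order stated in Remark 4 can be checked by substituting η = η_0 / \sqrt{T} into the bound of Theorem 1. The following worked step treats C_T as bounded independently of T, which is an assumption made here for the purpose of reading off the order.

```latex
Reg(T) \leq \frac{L}{\iota - \Delta_{7}}
  \Big\{ \Delta_{8}\, \eta_{0} \sqrt{T}
       + \frac{2 N R^{2} + N K C_{T}}{\eta_{0}} \sqrt{T}
       + \Delta_{9} \Big\},
\qquad \eta = \frac{\eta_{0}}{\sqrt{T}} .
```

The first two terms grow as \sqrt{T}, while Δ_9 depends on T only through C_T. If C_T is allowed to grow with T, the same substitution gives a bound of order \sqrt{T} (1 + C_T) instead.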

5. Simulations

Inspired by [5,8], we consider a cooperative control problem for a multi-robot system in a plant

X \subseteq R^{2}

. In particular, a moving target with state

T \in X

is modeled as

T_{t + 1} = A_{t} T_{t} + ϑ_{t}^{\circ},

(12)

where

A_{t} \in R^{2 \times 2}

and

ϑ_{t}^{\circ} \in R^{2}

is the noise. Moreover, there is a multi-robot network composed of N robots aiming to protect the target, while there are N intruders aiming to capture the target. Let

I_{i, t} \in X

denote the state of the ith intruder at time t, which is assigned to robot i. Supposing that the formation of a multi-robot network is defined by a virtual leader–follower formation method, let

x_{t}^{\circ}

denote the leader’s state which is unknown for all robots. In particular, let

x_{i, t} \in X

denote the estimated value of

x_{t}^{\circ}

by robot i at time t, and the state of robot i is given by

ψ_{i} (x_{i, t}) = β_{i} x_{i, t} + γ_{i}

, where

β_{i}

and

γ_{i}

represent an offset of robot i to the virtual leader

x_{i, t}

.

The multi-robot network is used to protect the moving target by choosing an optimal virtual leader such that the average state of all the robots is close to the target. Meanwhile, each robot should stay close to its associated intruder (see Figure 1). Let

v (x_{t}) = \frac{1}{N} \sum_{i = 1}^{N} ψ_{i} (x_{i, t})

denote the averaged state of robots where

x_{t} = col (x_{1, t}, x_{2, t}, \dots, x_{N, t})

. Then, the cooperative control problem can be modeled as the following optimization problem:

\begin{matrix} \min_{x_{i, t} \in X} f_{t} (x_{t}) = \sum_{i = 1}^{N} f_{i, t} (x_{i, t}, v (x_{t})), \end{matrix}

(13)

where

f_{i, t} (x_{i, t}, v (x_{t})) = α_{1} ∥ v (x_{t}) - T_{t} ∥^{2} + α_{2} {∥ ψ_{i} (x_{i, t}) - I_{i, t} ∥}^{2},

α_{1} > 0

, and

α_{2} > 0

. In fact, the objective is to cooperate in seeking an optimal virtual leader, which is the optimal solution to the problem (13).

It is worth pointing out that the robots' estimates of the virtual leader should reach consensus. Therefore, let

x_{t}^{*} \in X

be the consensus solution to problem (13). Then

x_{t}^{*}

should be the solution to the following online optimization problem:

\begin{matrix} x_{t}^{*} = \arg\min_{x \in X} f_{t} (x), \quad f_{t} (x) = \sum_{i = 1}^{N} f_{i, t} (x, v (x)), \end{matrix}

(14)

where

ψ_{i} (x) = β_{i} x + γ_{i}

and

v (x) = \frac{1}{N} \sum_{i = 1}^{N} ψ_{i} (x)

. In addition, it follows from (12) that

x_{t}^{*}

should also have the mapping

A_{t}

, i.e.,

x_{t}^{*}

satisfies the following constraint:

x_{t + 1}^{*} = A_{t} x_{t}^{*} + ϑ_{t},

(15)

where

ϑ_{t} \in R^{2}

is an unknown and unstructured noise. Note that

A_{t}

is unknown for all robots. Hence,

A_{i, t}

denotes the observed mapping of

A_{t}

at time t for robot i. Problem (14) and constraint (15) take the form of problem (3) and constraint (6), so Algorithm 1 can be used to solve problem (14) with constraint (15).

Let

N = 4

,

X = \{x \in R^{2} | ∥ x ∥ \leq 3\}

,

A_{t} = (\begin{matrix} 1 & (1 + cos t) * 10^{- 3} \\ 0 & 1 \end{matrix})

,

ϑ_{t}^{\circ}

is a random vector chosen from a normal distribution with zero mean and standard deviation

[1; 2] * 10^{- 3}

,

I_{i, t} = T_{t} + col (cos (2 i π / N), sin (2 i π / N))

,

β_{i} = 1

,

γ_{i} = col (cos (2 i π / N), sin (2 i π / N))

,

α_{1} = 0.1

,

α_{2} = 0.2

, the unknown noise

ϑ_{t}

satisfies

∥ ϑ_{t} ∥ \leq 0.2

, and

A_{i, t} = A_{t} + μ_{t} I_{2}

where

μ_{t}

is a random number chosen from a normal distribution with zero mean and standard deviation

10^{- 3}

.
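The local cost in (13) and its partial gradients can be written out explicitly for the parameters listed above. The sketch below is an illustration under these parameters, with hypothetical function names and 0-based agent indices; it is not the authors' simulation code. With β_i = 1 and the stated offsets γ_i, one has ψ_i(x) − I_{i,t} = x − T_t and v(x) = x (the γ_i sum to zero for N = 4), so the gradient of f_t vanishes at x = T_t whenever T_t lies in X.

```python
import numpy as np

N = 4
alpha1, alpha2 = 0.1, 0.2
beta = np.ones(N)
# Offsets gamma_i for i = 1..N, stored at indices 0..N-1.
gamma = np.array([[np.cos(2 * i * np.pi / N), np.sin(2 * i * np.pi / N)]
                  for i in range(1, N + 1)])

def A(t):
    """Target mapping A_t from the simulation setup."""
    return np.array([[1.0, (1.0 + np.cos(t)) * 1e-3], [0.0, 1.0]])

def psi(i, x):
    """State of robot i given its estimate x of the virtual leader."""
    return beta[i] * x + gamma[i]

def cost(i, x, v, target, intruder):
    """Local cost f_{i,t}(x, v) from (13)."""
    return alpha1 * np.sum((v - target) ** 2) + alpha2 * np.sum((psi(i, x) - intruder) ** 2)

def grad1(i, x, v, target, intruder):
    """Partial gradient of f_{i,t} with respect to x."""
    return 2.0 * alpha2 * beta[i] * (psi(i, x) - intruder)

def grad2(i, x, v, target, intruder):
    """Partial gradient of f_{i,t} with respect to the aggregate v."""
    return 2.0 * alpha1 * (v - target)

def total_gradient(x, target):
    """Gradient of f_t(x) = sum_i f_{i,t}(x, v(x)) at a common point x."""
    v = np.mean([psi(i, x) for i in range(N)], axis=0)
    jac_v = np.mean(beta)  # dv/dx = mean(beta_i) * I for affine psi_i
    g = np.zeros(2)
    for i in range(N):
        intruder = target + gamma[i]
        g += grad1(i, x, v, target, intruder) + jac_v * grad2(i, x, v, target, intruder)
    return g

target = np.array([0.3, -0.4])  # illustrative target state inside X
g_at_target = total_gradient(target, target)
print(np.linalg.norm(g_at_target))  # numerically zero
```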

Choosing

T = 1000

,

η_{t} = 0.001

and

R (x) = {0.01 ∥ x ∥}^{2}

, and supposing that an arbitrary doubly stochastic W is given, it can be verified that Assumptions 1–4 and the condition

ι > Δ_{7}

hold. The simulations are performed on a computer equipped with an AMD Ryzen 9 5950X 16-core CPU, 64 GB of RAM, and an Nvidia RTX 3080 Ti GPU. Figure 2 shows the trajectory

x_{1, t}

of robot 1 by using Algorithm 1. The dynamic regret of Algorithm 1 is indicated in Figure 3, and the regret

Reg (T)

in Figure 3 is bounded. Figures 2 and 3 are therefore consistent with Theorem 1.

6. Conclusions

This paper introduces an online optimization algorithm for distributed online aggregative optimization problems with dynamic environments. With this algorithm, the dynamic regret is bounded without requiring the dynamic environment to be known. Future research will address distributed online optimization problems over time-varying directed networks.

Author Contributions

Conceptualization, C.Y., S.W. and B.H.; methodology, C.Y. and S.W.; writing—original draft preparation, C.Y. and S.W.; writing—review and editing, S.Z. and S.L.; funding acquisition, B.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Key R&D Program of China under Grant 2022ZD0119601.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Proof of Lemma 1.

The proof follows from Lemma 2.1 in [11]. □

Appendix B

Proof of Lemma 2.

Let

ϵ_{z_{i, t + 1}} = \nabla ψ_{i} (x_{i, t + 1}) - \nabla ψ_{i} (x_{i, t})

. Then the algorithm (8f) can be rewritten as

\begin{matrix} z_{i, t + 1} = \sum_{j = 1}^{N} W_{i j} z_{j, t} + ϵ_{z_{i, t + 1}} . \end{matrix}

According to Lemma 2 in [12],

\begin{matrix} \begin{matrix} ∥ z_{i, t + 1} - {\bar{z}}_{t + 1} ∥ \\ \leq & N γ β^{t} max_{j} ∥ z_{j, 1} ∥ + γ \sum_{l = 1}^{t - 1} β^{t - l} \sum_{j = 1}^{N} ∥ ϵ_{z_{j, l + 1}} ∥ + \frac{1}{N} \sum_{j = 1}^{N} ∥ ϵ_{z_{j, t + 1}} ∥ + ∥ ϵ_{z_{i, t + 1}} ∥ . \end{matrix} \end{matrix}

(A1)

It follows from Assumption 2 that

∥ ϵ_{z_{i, t + 1}} ∥ \leq 2 L

. Then,

\begin{matrix} ∥ z_{i, t + 1} - {\bar{z}}_{t + 1} ∥ \leq Δ_{1} . \end{matrix}

Moreover, it follows from (8f) that

\begin{matrix} \begin{matrix} \frac{1}{N} \sum_{i = 1}^{N} z_{i, t + 1} = & \frac{1}{N} \sum_{i = 1}^{N} \sum_{j = 1}^{N} W_{i j} z_{j, t} + \frac{1}{N} \sum_{i = 1}^{N} \nabla ψ_{i} (x_{i, t + 1}) - \frac{1}{N} \sum_{i = 1}^{N} \nabla ψ_{i} (x_{i, t}) . \end{matrix} \end{matrix}

By using Assumption 1,

\begin{matrix} {\bar{z}}_{t + 1} - \frac{1}{N} \sum_{i = 1}^{N} \nabla ψ_{i} (x_{i, t + 1}) = {\bar{z}}_{t} - \frac{1}{N} \sum_{i = 1}^{N} \nabla ψ_{i} (x_{i, t}), \end{matrix}

It follows from

z_{i, 0} = \nabla ψ_{i} (x_{i, 0})

that

{\bar{z}}_{t} = \frac{1}{N} \sum_{i = 1}^{N} \nabla ψ_{i} (x_{i, t})

. Assumption 2 implies that

∥ {\bar{z}}_{t} ∥ \leq L

holds. Therefore,

\begin{matrix} ∥ \nabla_{i, t} ∥ & \leq & H + H (∥ z_{i, t} - {\bar{z}}_{t} ∥ + ∥ {\bar{z}}_{t} ∥) \\ \leq & H + H Δ_{1} + H L = Δ_{2} . \end{matrix}

□

Appendix C

Proof of Lemma 3.

Let

e_{i, t} = {\hat{x}}_{i, t + 1} - y_{i, t}

and

p_{i, t} = (A_{t}^{*} - I_{n} + ▵ A_{i, t}) {\hat{x}}_{i, t + 1}

. It follows from Lemma 2.2 in [11] that

\begin{matrix} \begin{matrix} ∥ e_{i, t} ∥ \leq Δ_{2} η_{t} \\ ∥ p_{i, t} ∥ \leq (M_{A} Q + 8 M Q \sum_{k = 0}^{t} λ^{t - k}) η_{t} \\ {\bar{x}}_{t + 1} = \frac{1}{N} \sum_{τ = 0}^{t} \sum_{j = 1}^{N} (e_{j, τ} + p_{j, τ}), \end{matrix} \end{matrix}

(A2)

and

\begin{matrix} \begin{matrix} ∥ x_{i, t + 1} - {\bar{x}}_{t + 1} ∥ \leq & \sqrt{N} [∥ \nabla_{i, t} ∥ + M_{A} Q + \frac{8 M Q}{1 - λ}] \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} . \end{matrix} \end{matrix}

(A3)

By using the result

∥ \nabla_{i, t} ∥ \leq Δ_{2}

in Lemma 2, (9) holds. □

Appendix D

Proof of Lemma 4.

Consider the first inequality in (10). It follows from (A2) that

\begin{matrix} \begin{matrix} ∥ {\bar{x}}_{t + 1} - {\bar{x}}_{t} ∥ = & ∥ \frac{1}{N} \sum_{j = 1}^{N} (e_{j, t} + p_{j, t}) ∥ \\ \leq & (Δ_{2} + M_{A} Q + 8 M Q \sum_{k = 0}^{t} λ^{t - k}) η_{t}, \end{matrix} \end{matrix}

(A4)

which implies that

\begin{matrix} \begin{matrix} ∥ x_{i, t + 1} - x_{i, t} ∥ \\ \leq & ∥ x_{i, t + 1} - {\bar{x}}_{t + 1} ∥ + ∥ {\bar{x}}_{t + 1} - {\bar{x}}_{t} ∥ + ∥ {\bar{x}}_{t} - x_{i, t} ∥ \\ \leq & (Δ_{2} + M_{A} Q + 8 M Q \sum_{k = 0}^{t} λ^{t - k}) η_{t} + 2 Δ_{3} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} . \end{matrix} \end{matrix}

(A5)

Thus, the first inequality in (10) holds.

Consider the second inequality in (10). Note that there exists

\begin{matrix} \begin{matrix} ∥ v (x_{i, t}) - v_{i, t} ∥ \\ \leq & ∥ v (x_{i, t}) - \frac{1}{N} \sum_{i = 1}^{N} ψ_{i} (x_{i, t}) ∥ + ∥ \frac{1}{N} \sum_{i = 1}^{N} ψ_{i} (x_{i, t}) - v_{i, t} ∥ \\ \leq & \frac{L}{N} \sum_{j = 1}^{N} ∥ x_{i, t} - x_{j, t} ∥ + ∥ \frac{1}{N} \sum_{i = 1}^{N} ψ_{i} (x_{i, t}) - v_{i, t} ∥, \end{matrix} \end{matrix}

(A6)

where the last inequality holds because the function

ψ_{i}

is Lipschitz (see Assumption 2). By using (9),

\begin{matrix} \begin{matrix} \frac{L}{N} \sum_{j = 1}^{N} ∥ x_{i, t} - x_{j, t} ∥ \\ \leq & \frac{L}{N} \sum_{j = 1}^{N} (∥ x_{i, t} - {\bar{x}}_{t} ∥ + ∥ {\bar{x}}_{t} - x_{j, t} ∥) \\ \leq & 2 L Δ_{3} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} . \end{matrix} \end{matrix}

(A7)

Let

ϵ_{v_{i, t + 1}} = ψ_{i} (x_{i, t + 1}) - ψ_{i} (x_{i, t})

. By using Assumption 2 and (A5),

\begin{matrix} \begin{matrix} ∥ ϵ_{v_{i, t + 1}} ∥ \leq & L ∥ x_{i, t + 1} - x_{i, t} ∥ \\ \leq L (Δ_{2} + M_{A} Q + 8 M Q \sum_{k = 0}^{t} λ^{t - k}) η_{t} + 2 L Δ_{3} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} . \end{matrix} \end{matrix}

It follows from (8e) that

\begin{matrix} v_{i, t + 1} = \sum_{j = 1}^{N} W_{i j} v_{j, t} + ϵ_{v_{i, t + 1}} . \end{matrix}

According to Lemma 2 in [12], it follows that

\begin{matrix} \begin{matrix} ∥ v_{i, t + 1} - \frac{1}{N} \sum_{j = 1}^{N} v_{j, t + 1} ∥ \\ \leq & N γ β^{t} max_{j} ∥ v_{j, 1} ∥ + γ \sum_{l = 1}^{t - 1} β^{t - l} \sum_{j = 1}^{N} ∥ ϵ_{v_{j, l + 1}} ∥ + \frac{1}{N} \sum_{j = 1}^{N} ∥ ϵ_{v_{j, t + 1}} ∥ + ∥ ϵ_{v_{i, t + 1}} ∥ \\ \leq & N γ P β^{t} + (\frac{N γ β}{1 - β} + 2) L (Δ_{2} + M_{A} Q + 8 M Q \sum_{k = 0}^{t} λ^{t - k}) η_{t} \\ + (\frac{N γ β}{1 - β} + 2) 2 L Δ_{3} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} . \end{matrix} \end{matrix}

(A8)

It follows from (8e) and Assumption 1 that

\begin{matrix} \begin{matrix} \frac{1}{N} \sum_{i = 1}^{N} v_{i, t + 1} & = \frac{1}{N} \sum_{i = 1}^{N} \sum_{j = 1}^{N} W_{i j} v_{j, t} + \frac{1}{N} \sum_{i = 1}^{N} ψ_{i} (x_{i, t + 1}) - \frac{1}{N} \sum_{i = 1}^{N} ψ_{i} (x_{i, t}) . \end{matrix} \end{matrix}

According to the initial state

v_{i, 0} = ψ_{i} (x_{i, 0})

, it follows that

\begin{matrix} \begin{matrix} \frac{1}{N} \sum_{j = 1}^{N} v_{j, t} & = \frac{1}{N} \sum_{i = 1}^{N} ψ_{i} (x_{i, t}) . \end{matrix} \end{matrix}

(A9)

It follows from (A8) and (A9) that

\begin{matrix} \begin{matrix} ∥ \frac{1}{N} \sum_{i = 1}^{N} ψ_{i} (x_{i, t}) - v_{i, t} ∥ \\ \leq N γ P β^{t} + (\frac{N γ β}{1 - β} + 2) L (Δ_{2} + M_{A} Q + 8 M Q \sum_{k = 0}^{t - 1} λ^{t - k - 1}) η_{t - 1} \\ + (\frac{N γ β}{1 - β} + 2) 2 L Δ_{3} \sum_{τ = 0}^{t - 1} σ_{2}^{t - 1 - τ} (W) η_{τ} . \end{matrix} \end{matrix}

(A10)

Combining (A6), (A7), and (A10), the second inequality in (10) holds.

For the third inequality in (10), note that

\begin{matrix} \begin{matrix} ∥ z_{i, t} - \nabla v (x_{i, t}) ∥ \\ \leq & ∥ z_{i, t} - \frac{1}{N} \sum_{i = 1}^{N} \nabla ψ_{i} (x_{i, t}) ∥ + ∥ \frac{1}{N} \sum_{i = 1}^{N} \nabla ψ_{i} (x_{i, t}) - \nabla v (x_{i, t}) ∥ \\ \leq & ∥ z_{i, t} - {\bar{z}}_{i, t} ∥ + \frac{L}{N} \sum_{j = 1}^{N} ∥ x_{i, t} - x_{j, t} ∥, \end{matrix} \end{matrix}

(A11)

Similar to the derivation of (A8), we have

\begin{matrix} \begin{matrix} ∥ ϵ_{z_{i, t + 1}} ∥ \leq & L ∥ x_{i, t + 1} - x_{i, t} ∥ \\ \leq L (Δ_{2} + M_{A} Q + 8 M Q \sum_{k = 0}^{t} λ^{t - k}) η_{t} + 2 L Δ_{3} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} . \end{matrix} \end{matrix}

It follows from (A1) that

\begin{matrix} \begin{matrix} ∥ z_{i, t} - {\bar{z}}_{i, t} ∥ \leq N γ P β^{t} & + (\frac{N γ β}{1 - β} + 2) L (Δ_{2} + M_{A} Q + 8 M Q \sum_{k = 0}^{t} λ^{t - k}) η_{t} \\ + (\frac{N γ β}{1 - β} + 2) 2 L Δ_{3} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} . \end{matrix} \end{matrix}

(A12)

Therefore, according to (A5), (A11), and (A12), the third inequality in (10) holds. □

Appendix E

Proof of Lemma 5.

The proof follows from Lemma 2.3 in [11]. □

Appendix F

Proof of Lemma 6.

It follows from Assumption 2 and Lemma 2 that

\begin{matrix} \begin{matrix} ∥ {\tilde{\nabla}}_{i, t} - \nabla_{i, t} ∥ \\ \leq ∥ \nabla_{1} f_{i, t} (x_{i, t}, v (x_{i, t})) - \nabla_{1} f_{i, t} (x_{i, t}, v_{i, t}) ∥ \\ + ∥ \nabla_{2} f_{i, t} (x_{i, t}, v (x_{i, t})) \nabla v (x_{i, t}) - \nabla_{2} f_{i, t} (x_{i, t}, v_{i, t}) z_{i, t} ∥ \\ \leq L ∥ v (x_{i, t}) - v_{i, t} ∥ + ∥ \nabla_{2} f_{i, t} (x_{i, t}, v (x_{i, t})) \nabla v (x_{i, t}) - \nabla_{2} f_{i, t} (x_{i, t}, v_{i, t}) \nabla v (x_{i, t}) ∥ \\ + ∥ \nabla_{2} f_{i, t} (x_{i, t}, v_{i, t}) \nabla v (x_{i, t}) - \nabla_{2} f_{i, t} (x_{i, t}, v_{i, t}) z_{i, t} ∥ . \end{matrix} \end{matrix}

(A13)

According to Assumption 2 and Lemma 4, we have

\begin{matrix} \begin{matrix} ∥ \nabla_{2} f_{i, t} (x_{i, t}, v (x_{i, t})) \nabla v (x_{i, t}) - \nabla_{2} f_{i, t} (x_{i, t}, v_{i, t}) \nabla v (x_{i, t}) ∥ \\ \leq ∥ \nabla_{2} f_{i, t} (x_{i, t}, v (x_{i, t})) - \nabla_{2} f_{i, t} (x_{i, t}, v_{i, t}) ∥ ∥ \nabla v (x_{i, t}) ∥ \\ \leq L ∥ v (x_{i, t}) - v_{i, t} ∥ ∥ \nabla v (x_{i, t}) ∥ \\ \leq L^{2} \{N γ P β^{t} + Δ_{5} η_{t - 1} + Δ_{6} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ}\}, \end{matrix} \end{matrix}

(A14)

and

\begin{matrix} \begin{matrix} ∥ \nabla_{2} f_{i, t} (x_{i, t}, v_{i, t}) \nabla v (x_{i, t}) - \nabla_{2} f_{i, t} (x_{i, t}, v_{i, t}) z_{i, t} ∥ \\ \leq & ∥ \nabla_{2} f_{i, t} (x_{i, t}, v_{i, t}) ∥ ∥ z_{i, t} - \nabla v (x_{i, t}) ∥ \\ \leq & H \{N γ P β^{t} + Δ_{5} η_{t - 1} + Δ_{6} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ}\} . \end{matrix} \end{matrix}

(A15)

The proof is completed by combining (A13)–(A15) and (10). □

Appendix G

Proof of Theorem 1.

Note that the function f_{i, t} is convex. Thus, we have

\begin{matrix} \begin{matrix} f_{i, t} (x_{i, t}) - f_{i, t} (x_{t}^{*}) \\ \leq & 〈 {\tilde{\nabla}}_{i, t}, x_{i, t} - x_{t}^{*} 〉 \\ = & 〈 \nabla_{i, t}, {\hat{x}}_{i, t + 1} - x_{t}^{*} 〉 + 〈 {\tilde{\nabla}}_{i, t}, x_{i, t} - y_{i, t} 〉 + 〈 {\tilde{\nabla}}_{i, t}, y_{i, t} - {\hat{x}}_{i, t + 1} 〉 + 〈 {\tilde{\nabla}}_{i, t} - \nabla_{i, t}, {\hat{x}}_{i, t} - x_{t}^{*} 〉 . \end{matrix} \end{matrix}

(A16)

According to algorithm (8a), Lemma 4.1 in [13], and the fact that D_{R} (x, y) \geq \frac{1}{2} {∥ x - y ∥}^{2} for all x, y \in X, it follows that

\begin{matrix} 〈 \nabla_{i, t}, {\hat{x}}_{i, t + 1} - x_{t}^{*} 〉 \\ \leq \frac{1}{η_{t}} D_{R} (x_{t}^{*}, y_{i, t}) - \frac{1}{η_{t}} D_{R} (x_{t}^{*}, {\hat{x}}_{i, t + 1}) - \frac{1}{η_{t}} D_{R} ({\hat{x}}_{i, t + 1}, y_{i, t}) \\ \leq \frac{1}{η_{t}} D_{R} (x_{t}^{*}, y_{i, t}) - \frac{1}{η_{t}} D_{R} (x_{t}^{*}, {\hat{x}}_{i, t + 1}) - \frac{1}{2 η_{t}} {∥ {\hat{x}}_{i, t + 1} - y_{i, t} ∥}^{2} . \end{matrix}

(A17)
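
The lower bound D_{R} (x, y) \geq \frac{1}{2} {∥ x - y ∥}^{2} used in the last step of (A17) can be illustrated with a concrete distance-generating function. The sketch below assumes R is the negative entropy on the probability simplex, for which D_{R} is the Kullback-Leibler divergence; this choice of R and the helper name `bregman_neg_entropy` are assumptions made for illustration, since this appendix does not fix a specific R.

```python
import numpy as np


def bregman_neg_entropy(x, y):
    """Bregman divergence of R(x) = sum_k x_k log x_k between simplex points.

    For x and y on the probability simplex this equals KL(x || y).
    """
    return float(np.sum(x * np.log(x / y)))


rng = np.random.default_rng(1)
worst_slack = np.inf
for _ in range(1000):
    # Dirichlet samples lie in the interior of the simplex.
    x = rng.dirichlet(2.0 * np.ones(4))
    y = rng.dirichlet(2.0 * np.ones(4))
    slack = bregman_neg_entropy(x, y) - 0.5 * float(np.sum((x - y) ** 2))
    worst_slack = min(worst_slack, slack)

print(f"smallest value of D_R(x, y) - 0.5 * ||x - y||^2: {worst_slack:.3e}")
```

The slack is never negative in this example, in line with Pinsker's inequality and the fact that the 1-norm dominates the Euclidean norm.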

By using Lemmas 2 and 3, Assumption 1, and Algorithm (8d),

\begin{matrix} \begin{matrix} 〈 {\tilde{\nabla}}_{i, t}, x_{i, t} - y_{i, t} 〉 \\ = & 〈 {\tilde{\nabla}}_{i, t}, x_{i, t} - {\bar{x}}_{t} + {\bar{x}}_{t} - y_{i, t} 〉 \\ = & 〈 {\tilde{\nabla}}_{i, t}, x_{i, t} - {\bar{x}}_{t} 〉 + \sum_{j = 1}^{N} W_{i j} 〈 \nabla_{i, t}, {\bar{x}}_{t} - x_{j, t} 〉 \\ \leq & ∥ {\tilde{\nabla}}_{i, t} ∥ (∥ x_{i, t} - {\bar{x}}_{t} ∥ + \sum_{j = 1}^{N} W_{i j} ∥ x_{j, t} - {\bar{x}}_{t} ∥) \\ \leq & 2 (H + H L) Δ_{3} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} . \end{matrix} \end{matrix}

(A18)

It follows from Lemma 2 that

\begin{matrix} 〈 {\tilde{\nabla}}_{i, t}, y_{i, t} - {\hat{x}}_{i, t + 1} 〉 \\ \leq & ∥ {\tilde{\nabla}}_{i, t} ∥ ∥ {\hat{x}}_{i, t + 1} - y_{i, t} ∥ \\ \leq & \frac{1}{2 η_{t}} {∥ {\hat{x}}_{i, t + 1} - y_{i, t} ∥}^{2} + \frac{{(H + H L)}^{2}}{2} η_{t} . \end{matrix}

(A19)
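
The last step of (A19) is an instance of Young's inequality. Writing a = ∥ {\hat{x}}_{i, t + 1} - y_{i, t} ∥ and b = ∥ {\tilde{\nabla}}_{i, t} ∥ (shorthand introduced here only for this remark), expanding a square gives

\begin{matrix} 0 \leq \frac{1}{2} {(\frac{a}{\sqrt{η_{t}}} - \sqrt{η_{t}} b)}^{2} = \frac{a^{2}}{2 η_{t}} + \frac{η_{t} b^{2}}{2} - a b, \end{matrix}

so that a b \leq \frac{1}{2 η_{t}} a^{2} + \frac{η_{t}}{2} b^{2}. Using the bound b \leq H + H L, which is the same gradient bound applied in (A18), yields the right-hand side of (A19). The weight \frac{1}{2 η_{t}} is chosen so that the quadratic term cancels the last term of (A17).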

According to (A16)–(A19), we have that

\begin{matrix} \begin{matrix} f_{i, t} (x_{i, t}) - f_{i, t} (x_{t}^{*}) \\ \leq & 〈 {\tilde{\nabla}}_{i, t} - \nabla_{i, t}, {\hat{x}}_{i, t} - x_{t}^{*} 〉 + \frac{1}{η_{t}} D_{R} (x_{t}^{*}, y_{i, t}) - \frac{1}{η_{t}} D_{R} (x_{t}^{*}, {\hat{x}}_{i, t + 1}) \\ + \frac{{(H + H L)}^{2}}{2} η_{t} + 2 (H + H L) Δ_{3} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} . \end{matrix} \end{matrix}

Then it follows from Lemmas 5 and 6 that

\begin{matrix} \begin{matrix} \sum_{t = 1}^{T} \sum_{i = 1}^{N} (f_{i, t} (x_{i, t}) - f_{i, t} (x_{t}^{*})) \\ \leq & \sum_{t = 1}^{T} \sum_{i = 1}^{N} ∥ {\hat{x}}_{i, t} - x_{t}^{*} ∥ (L + L^{2} + H) \{N γ P β^{t} + Δ_{5} η_{t - 1} + Δ_{6} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ}\} \\ + \frac{2 N R^{2}}{η_{T + 1}} + \sum_{t = 1}^{T} \frac{N K}{η_{t + 1}} ∥ x_{t + 1}^{*} - A_{t}^{*} x_{t}^{*} ∥ + \sum_{t = 1}^{T} \sum_{i = 1}^{N} \frac{η_{t}}{η_{t + 1}} Δ_{7} ∥ A_{t}^{*} x_{t}^{*} - x_{i, t + 1} ∥ \\ + \frac{N {(H + H L)}^{2}}{2} \sum_{t = 1}^{T} η_{t} + 2 N (H + H L) Δ_{3} \sum_{t = 1}^{T} \sum_{τ = 0}^{t} σ_{2}^{t - τ} (W) η_{τ} . \end{matrix} \end{matrix}

(A20)

Note that the step size is fixed, i.e., η_{t} = η. It follows from ∥ {\hat{x}}_{i, t} - x_{t}^{*} ∥ \leq 2 Q, (A20), and Assumption 2 that

\begin{matrix} \begin{matrix} \sum_{t = 1}^{T} \sum_{i = 1}^{N} ι ∥ x_{i, t} - x_{t}^{*} ∥ \\ \leq & Δ_{7} \sum_{t = 1}^{T} \sum_{i = 1}^{N} ∥ x_{i, t} - x_{t}^{*} ∥ + Δ_{8} η T + \frac{N K}{η} \sum_{t = 1}^{T} ∥ θ_{t} ∥ + \frac{2 N R^{2}}{η} \\ + \sum_{i = 1}^{N} Δ_{7} ∥ x_{T + 1}^{*} - x_{i, T + 1} ∥ + Δ_{7} \sum_{t = 1}^{T} ∥ θ_{t} ∥ + 2 Q N γ P (L + L^{2} + H) \sum_{t = 1}^{T} \sum_{i = 1}^{N} β^{t} . \end{matrix} \end{matrix}

(A21)

Then it follows from (A21) that

\begin{matrix} \begin{matrix} \sum_{t = 1}^{T} \sum_{i = 1}^{N} (ι - Δ_{7}) ∥ x_{i, t} - x_{t}^{*} ∥ \leq Δ_{8} η T + \frac{2 N R^{2} + N K C_{T}}{η} + Δ_{9} . \end{matrix} \end{matrix}

(A22)

Therefore, the proof is completed by combining (A22) with

\begin{matrix} \begin{matrix} Reg (T) \leq & \sum_{t = 1}^{T} \sum_{i = 1}^{N} L ∥ x_{i, t} - x_{t}^{*} ∥ . \end{matrix} \end{matrix}

□
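
To see how the O (\sqrt{T}) rate stated in the abstract can arise from (A22), the following worked step may help. It is a sketch under assumptions not restated in this appendix: that ι > Δ_{7}, that the step size is chosen as η = c / \sqrt{T} for some constant c > 0 (c is a symbol introduced here for illustration), and that C_{T} collects the variation terms ∥ θ_{t} ∥ as in the main text. Dividing (A22) by ι - Δ_{7} and inserting the result into the bound on Reg (T) gives

\begin{matrix} Reg (T) \leq \frac{L}{ι - Δ_{7}} (Δ_{8} c \sqrt{T} + \frac{2 N R^{2} + N K C_{T}}{c} \sqrt{T} + Δ_{9}) . \end{matrix}

Under these assumptions the bound grows as \sqrt{T} (1 + C_{T}), which reduces to O (\sqrt{T}) whenever C_{T} remains bounded independently of T.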

References

1. Shi, Y.; Ran, L.; Tang, J.; Wu, X. Distributed optimization algorithm for composite optimization problems with non-smooth function. Mathematics 2022, 10, 3135.
2. Li, X.X.; Xie, L.H.; Li, N. A survey on distributed online optimization and online games. Annu. Rev. Control 2023, 56, 24.
3. Li, X.X.; Xie, L.H.; Hong, Y.G. Distributed aggregative optimization over multi-agent networks. IEEE Trans. Autom. Control 2022, 67, 3165–3171.
4. Li, X.X.; Yi, X.L.; Xie, L.H. Distributed online convex optimization with an aggregative variable. IEEE Trans. Control Netw. Syst. 2022, 9, 438–449.
5. Carnevale, G.; Camisa, A.; Notarstefano, G. Distributed online aggregative optimization for dynamic multirobot coordination. IEEE Trans. Autom. Control 2023, 68, 3736–3743.
6. Hall, E.C.; Willett, R.M. Online convex optimization in dynamic environments. IEEE J. Sel. Top. Signal Process. 2015, 9, 647–662.
7. Mokhtari, A.; Shahrampour, S.; Jadbabaie, A.; Ribeiro, A. Online optimization in dynamic environments: Improved regret rates for strongly convex problems. In Proceedings of the 55th IEEE Conference on Decision and Control (CDC), Las Vegas, NV, USA, 12–14 December 2016; pp. 7195–7201.
8. Shahrampour, S.; Jadbabaie, A. Distributed online optimization in dynamic environments using mirror descent. IEEE Trans. Autom. Control 2018, 63, 714–725.
9. Nazari, P.; Khorram, E.; Tarzanagh, D.A. Adaptive online distributed optimization in dynamic environments. Optim. Method Softw. 2021, 36, 973–997.
10. Li, J.Y.; Li, C.J.; Yu, W.W.; Zhu, X.M.; Yu, X.H. Distributed online bandit learning in dynamic environments over unbalanced digraphs. IEEE Trans. Netw. Sci. Eng. 2021, 8, 3034–3047.
11. Wang, S.; Huang, B.M. Distributed online optimisation in unknown dynamic environment. Int. J. Syst. Sci. 2024, 55, 1167–1176.
12. Lee, S.; Zavlanos, M.M. On the sublinear regret of distributed primal-dual algorithms for online constrained optimization. arXiv 2017, arXiv:1705.11128.
13. Beck, A.; Teboulle, M. Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett. 2003, 31, 167–175.

Figure 1. The concept of the cooperative control problem for a multi-robot system.

Figure 2. Trajectories of x_{t}^{*} and x_{1, t} over T = 1000.

Figure 3. The dynamic regret of Algorithm 1 for problem (14) with constraint (15).

Table 1. Comparison of problem formulation and algorithm of online optimization.

Reference	Aggregative Terms	Dynamic Environment	Algorithm/Technique
[3]	Yes	No	Distributed aggregative gradient tracking
[4]	Yes	No	Online distributed gradient tracking
[5]	Yes	No	Projected aggregative tracking
[6]	No	Known	Dynamic mirror descent
[7]	No	Known	Online gradient descent
[8]	No	Known	Decentralized mirror descent
[9]	No	Known	Adaptive gradient method
[10]	No	Known	Distributed bandit online learning
[11]	No	Unknown	Gradient tracking via DAT
This paper	Yes	Unknown	Aggregative gradient tracking via DAT


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
