Article

Analysis of Known Linear Distributed Average Consensus Algorithms on Cycles and Paths

by Jesús Gutiérrez-Gutiérrez *, Marta Zárraga-Rodríguez and Xabier Insausti
Department of Biomedical Engineering and Sciences, Tecnun, University of Navarra, Manuel Lardizábal 13, 20018 San Sebastián, Spain
* Author to whom correspondence should be addressed.
Sensors 2018, 18(4), 968; https://doi.org/10.3390/s18040968
Submission received: 22 February 2018 / Revised: 20 March 2018 / Accepted: 22 March 2018 / Published: 24 March 2018
(This article belongs to the Collection Smart Communication Protocols and Algorithms for Sensor Networks)

Abstract: In this paper, we compare six known linear distributed average consensus algorithms on a sensor network in terms of convergence time (and, therefore, in terms of the number of transmissions required). The network topologies selected for the comparison are the cycle and the path. Specifically, we compute closed-form expressions for the convergence time of four known deterministic algorithms and closed-form bounds for the convergence time of two known randomized algorithms on cycles and paths. Moreover, we also compute a closed-form expression for the convergence time of the fastest deterministic algorithm considered on grids.

1. Introduction

A distributed averaging (or average consensus) algorithm computes, in every sensor and in a distributed way, the average (arithmetic mean) of the values measured by all the sensors of a sensor network.
The most common distributed averaging algorithms are linear and iterative:
$$x(t+1) = W(t)\,x(t), \quad t \in \{0, 1, 2, \ldots\}, \tag{1}$$
where:
$$x(t) = \begin{pmatrix} x_1(t) \\ \vdots \\ x_n(t) \end{pmatrix} \tag{2}$$
is a real vector, $n$ is the number of sensors of the network, which we label $v_j$ with $j \in \{1, \ldots, n\}$, $x_j(0)$ is the value measured by sensor $v_j$, $x_j(t)$ is the value computed by sensor $v_j$ at time $t \geq 0$, and the weighting matrix $W(t)$ is an $n \times n$ real sparse matrix satisfying that if two sensors $v_j$ and $v_k$ are not connected (i.e., if $v_j$ and $v_k$ cannot interchange information), then $[W(t)]_{j,k} = 0$. From the point of view of communication protocols, there exist efficient ways of implementing synchronous algorithms of the form (1) (see, e.g., [1]). Linear distributed averaging algorithms can be classified as deterministic or randomized depending on the nature of the weighting matrices $W(t)$.
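As an illustration of (1), the following minimal Python sketch runs the iteration for a time-invariant weighting matrix (the function name and the use of NumPy are our own choices, not part of the original formulation):

```python
# Minimal sketch of the linear iteration (1) for a time-invariant weighting
# matrix W (an n x n NumPy array matching the network's sparsity pattern).
import numpy as np

def run_consensus(W, x0, num_iters):
    """Return x(t) after num_iters applications of x(t+1) = W x(t)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iters):
        x = W @ x   # each sensor combines only its own and its neighbors' values
    return x
```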

1.1. Deterministic Linear Distributed Averaging Algorithms

Several well-known deterministic linear distributed averaging algorithms can be found in [2] and [3]. Those algorithms are time-invariant and have symmetric weights, that is, the deterministic weighting matrix $W(t)$ is symmetric and does not depend on $t$ (and consequently, $x(t) = W^t x(0)$).
In [2], the authors search among all the symmetric weighting matrices W the one that makes (1) the fastest possible and show that such a matrix can be obtained by numerically solving a convex optimization problem. This algorithm is called the fastest linear time-invariant (LTI) distributed averaging algorithm for symmetric weights. It should be mentioned that in [4], the authors proposed an in-network algorithm for finding such an optimal weighting matrix.
In [2], the authors also give a slower algorithm: the fastest constant edge weights algorithm. In this other algorithm, they consider a particular structure of symmetric weighting matrices that depends on a single parameter and find the value of that parameter that makes (1) the fastest possible.
In [3], another two algorithms can be found: the maximum-degree weights algorithm and the Metropolis–Hastings algorithm.
For other deterministic linear distributed averaging algorithms, we refer the reader to [5] and the references therein.

1.2. Randomized Linear Distributed Averaging Algorithms

For the randomized case, a well-known linear distributed averaging algorithm was given in [6]. That algorithm is called the pairwise gossip algorithm because only two randomly-selected sensors interchange information at each time instant t.
Another well-known randomized algorithm can be found in [7]. That algorithm is called the broadcast gossip algorithm because a single sensor is randomly selected at each time instant t and broadcasts its value to all its neighboring sensors. The broadcast gossip algorithm is a linear distributed consensus algorithm rather than a linear distributed averaging algorithm: it converges to a random consensus value, which is, in expectation, the average of the values measured by all the sensors of the network. If one uses the directed version of the broadcast gossip algorithm [8] on a symmetric graph, one converges to the true average.
For other randomized linear distributed averaging algorithms, we refer the reader to [9] and the references therein. The linear distributed averaging algorithms reviewed in Section 1.1 and Section 1.2 are the most cited algorithms in the literature on the topic.

1.3. Our Contribution

A key feature of a distributed averaging algorithm is its convergence time, because it allows one to establish the stopping criterion for the iterative algorithm. The convergence time is defined as the number of iterations $t$ required in (1) until the value computed by the sensors, $x(t)$, is sufficiently close (within a threshold $\epsilon$) to the steady state. In the literature, we have not found closed-form expressions for the convergence time of the six linear distributed averaging algorithms mentioned in Section 1.1 and Section 1.2. A mathematical expression is said to be a closed-form expression if it is written in terms of a finite number of elementary functions (i.e., in terms of a finite number of constants, arithmetic operations, roots, exponentials, natural logarithms and trigonometric functions). In the present paper, we compute closed-form expressions for the convergence time of the deterministic algorithms and closed-form upper bounds for the convergence time of the randomized algorithms on two common network topologies: the cycle and the path. Observe that these closed-form formulas give us upper bounds for the convergence time of the considered algorithms (stopping criteria) on any network that contains a cycle or a path with the same number of sensors as a subgraph. Specifically, in this paper, we compute:
  • a closed-form expression for the convergence time of the fastest LTI distributed averaging algorithm for symmetric weights on the considered topologies (see Section 2.1); moreover, we also compute a closed-form expression for the convergence time of this algorithm on a grid;
  • a closed-form expression for the convergence time of the fastest constant edge weights algorithm on the considered topologies (see Section 2.2);
  • a closed-form expression for the convergence time of the maximum-degree weights algorithm on the considered topologies (see Section 2.3);
  • a closed-form expression for the convergence time of the Metropolis–Hastings algorithm on the considered topologies (see Section 2.3);
  • closed-form lower and upper bounds for the convergence time of the pairwise gossip algorithm on the considered topologies (see Section 3.1);
  • closed-form lower and upper bounds for the convergence time of the broadcast gossip algorithm on the considered topologies (see Section 3.2).
From these closed-form formulas, we study the asymptotic behavior of the convergence time of the considered algorithms as the number of sensors of the network grows. The obtained asymptotic and non-asymptotic results allow us to compare the considered algorithms in terms of convergence time and, consequently, in terms of the number of transmissions required (see Section 4 and Section 5). Knowing the number of transmissions required tells us the energy consumption of the distributed technique, which is a key factor in the design of a new wireless sensor network (WSN), where one has to decide the number of nodes and the network topology. It should be mentioned that cycles, paths and grids are frequently considered topologies when designing new WSNs.

2. Convergence Time of Deterministic Linear Distributed Averaging Algorithms

Different definitions of convergence time are used in the literature. We have found three different definitions for the convergence time of a deterministic linear distributed averaging algorithm (see [2,10,11]). In this paper, we consider the definition of ϵ -convergence time given in [11]:
$$\tau\left(\epsilon, \{W(t)\}_{t \geq 0}\right) := \min\left\{ t_0 : \frac{\left\| x(t) - P_n x(0) \right\|_2}{\left\| x(0) - P_n x(0) \right\|_2} \leq \epsilon, \ \forall t \geq t_0, \ \forall x(0) \neq P_n x(0) \right\}, \tag{3}$$
where $\epsilon \in (0,1)$, $\|\cdot\|_2$ is the spectral norm and $P_n := \frac{1}{n} \mathbf{1}_n \mathbf{1}_n^\top$, with $\mathbf{1}_n$ being the $n \times 1$ matrix of ones and $\top$ denoting the transpose. If we replace the spectral norm by the infinity norm in that definition, we obtain the definition of $\epsilon$-convergence time given in [10]. If the deterministic matrix $W(t)$ in (1) does not depend on $t$, we denote the $\epsilon$-convergence time by $\tau(\epsilon, W)$.
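For a symmetric time-invariant weighting matrix, this definition reduces to a spectral-norm computation (this is proven in Theorem A1 of Appendix A). A small illustrative sketch, with names of our own choosing:

```python
# Sketch (ours): ϵ-convergence time of a symmetric time-invariant W via the
# identity τ(ϵ, W) = ⌈ log(1/ϵ) / log(1/‖W − P_n‖₂) ⌉ proven in Theorem A1.
import numpy as np

def eps_convergence_time(W, eps):
    n = W.shape[0]
    P = np.full((n, n), 1.0 / n)      # P_n = (1/n) 1_n 1_n^T
    rho = np.linalg.norm(W - P, 2)    # spectral norm; rho < 1 for these algorithms
    return int(np.ceil(np.log(1.0 / eps) / np.log(1.0 / rho)))
```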

2.1. Convergence Time of the Fastest LTI Distributed Averaging Algorithm for Symmetric Weights

In this section, we give a closed-form expression for the ϵ -convergence time of the fastest LTI distributed averaging algorithm for symmetric weights, and we study its asymptotic behavior as the number of sensors of the network grows. We consider three common network topologies: the cycle, the grid and the path (see Figure 1).

2.1.1. The Cycle

Let:
$$W_n(\gamma) := \begin{pmatrix}
1-2\gamma & \gamma & 0 & \cdots & 0 & 0 & \gamma \\
\gamma & 1-2\gamma & \gamma & \cdots & 0 & 0 & 0 \\
0 & \gamma & 1-2\gamma & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1-2\gamma & \gamma & 0 \\
0 & 0 & 0 & \cdots & \gamma & 1-2\gamma & \gamma \\
\gamma & 0 & 0 & \cdots & 0 & \gamma & 1-2\gamma
\end{pmatrix}. \tag{4}$$
Using (4), Theorem 1 gives the expression of the weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights on a cycle with n sensors.
Theorem 1.
Let $n \in \mathbb{N}$, with $n > 3$. Then, $W_n(\gamma_0)$ is the weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights on a cycle with $n$ sensors, where:
$$\gamma_0 = \frac{1}{2 - \cos\frac{2\pi}{n} - \cos\frac{2\pi(j_0-1)}{n}}, \tag{5}$$
with:
$$j_0 = \begin{cases} \frac{n}{2} + 1 & \text{if } n \text{ is even}, \\ \frac{n+1}{2} & \text{if } n \text{ is odd}. \end{cases} \tag{6}$$
Proof. 
See Appendix B.  ☐
We now give a closed-form expression for the ϵ -convergence time of the fastest LTI distributed averaging algorithm for symmetric weights on a cycle. We also study the asymptotic behavior of this convergence time as the number of sensors of the cycle grows.
We first introduce some notation. Two sequences of numbers $\{a_n\}$ and $\{b_n\}$ are said to be asymptotically equal, written $a_n \sim b_n$, if and only if $\lim_{n \to \infty} \frac{a_n}{b_n} = 1$ (see, e.g., [12] (p. 396)). Let $f, g : \mathbb{N} \to \mathbb{R}$ be two non-negative functions. We write $f(n) = O(g(n))$ (respectively, $f(n) = \Omega(g(n))$) if there exist $K \in (0, \infty)$ and $n_0 \in \mathbb{N}$ such that $f(n) \leq K g(n)$ (respectively, $f(n) \geq K g(n)$) for all $n \geq n_0$. If $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$, then we write $f(n) = \Theta(g(n))$. With this notation, Theorem 2 below yields:
$$\tau(\epsilon, W_n(\gamma_0)) = \Theta(n^2 \log \epsilon^{-1}). \tag{7}$$
Theorem 2.
Consider $\epsilon \in (0,1)$ and $n \in \mathbb{N}$, with $n > 3$. Let $W_n(\gamma_0)$ be as in Theorem 1. Then,
$$\tau(\epsilon, W_n(\gamma_0)) = \begin{cases} \left\lceil \dfrac{\log \epsilon^{-1}}{\log \frac{3 - \cos\frac{2\pi}{n}}{1 + \cos\frac{2\pi}{n}}} \right\rceil & \text{if } n \text{ is even}, \\[14pt] \left\lceil \dfrac{\log \epsilon^{-1}}{\log \frac{2 + \cos\frac{\pi}{n} - \cos\frac{2\pi}{n}}{\cos\frac{\pi}{n} + \cos\frac{2\pi}{n}}} \right\rceil & \text{if } n \text{ is odd}, \end{cases} \tag{8}$$
where $\log$ is the natural logarithm and $\lceil x \rceil$ denotes the smallest integer not less than $x$. Moreover,
$$\tau(\epsilon, W_n(\gamma_0)) \sim \frac{n^2 \log \epsilon^{-1}}{2\pi^2}, \tag{9}$$
and consequently, (7) holds.
Proof. 
See Appendix C.  ☐
Since the number of transmissions per iteration on a cycle with $n$ sensors is $n$ for the fastest LTI distributed averaging algorithm for symmetric weights, the total number of transmissions required for $\tau(\epsilon, W_n(\gamma_0))$ iterations is $T(\epsilon, W_n(\gamma_0)) := n\,\tau(\epsilon, W_n(\gamma_0))$. From Theorem 2, we obtain:
$$T(\epsilon, W_n(\gamma_0)) \sim \frac{n^3 \log \epsilon^{-1}}{2\pi^2}, \tag{10}$$
and hence, $T(\epsilon, W_n(\gamma_0)) = \Theta(n^3 \log \epsilon^{-1})$.
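The following sketch cross-checks Theorems 1 and 2 numerically: it builds $W_n(\gamma_0)$, evaluates the closed form (8), and compares it with the spectral-norm route of Theorem A1 (helper names are ours):

```python
# Numerical cross-check (ours) of Theorems 1 and 2 on a cycle: the closed
# form (8) versus the spectral-norm route of Theorem A1.
import numpy as np

def gamma0(n):
    j0 = n // 2 + 1 if n % 2 == 0 else (n + 1) // 2                        # (6)
    return 1.0 / (2 - np.cos(2*np.pi/n) - np.cos(2*np.pi*(j0 - 1)/n))      # (5)

def W_cycle(n, g):
    W = (1 - 2*g) * np.eye(n)
    for j in range(n):
        W[j, (j + 1) % n] = W[j, (j - 1) % n] = g
    return W

def tau_closed(n, eps):                                                    # (8)
    if n % 2 == 0:
        rho = (1 + np.cos(2*np.pi/n)) / (3 - np.cos(2*np.pi/n))
    else:
        rho = (np.cos(np.pi/n) + np.cos(2*np.pi/n)) / (2 + np.cos(np.pi/n) - np.cos(2*np.pi/n))
    return int(np.ceil(np.log(1/eps) / -np.log(rho)))

n, eps = 10, 1e-3
P = np.full((n, n), 1/n)
rho = np.linalg.norm(W_cycle(n, gamma0(n)) - P, 2)
assert tau_closed(n, eps) == int(np.ceil(np.log(1/eps) / -np.log(rho)))
```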

2.1.2. The Grid

Let:
$$W_n(\alpha) := \begin{pmatrix}
1-\alpha & \alpha & & & \\
\alpha & 1-2\alpha & \alpha & & \\
& \ddots & \ddots & \ddots & \\
& & \alpha & 1-2\alpha & \alpha \\
& & & \alpha & 1-\alpha
\end{pmatrix} \tag{11}$$
be the $n \times n$ matrix for $n \geq 2$, and $W_1(\alpha) := 1$. We define:
$$W_{r,c}(\alpha) := W_r(\alpha) \otimes W_c(\alpha), \tag{12}$$
where ⊗ is the Kronecker product. Using (12), Theorem 3 gives the expression of the weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights on a grid of r rows and c columns.
Theorem 3.
Let $r, c \in \mathbb{N}$, with $rc > 2$. Then, the $rc \times rc$ matrix $W_{r,c}\left(\frac{1}{2}\right)$ is the weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights on a grid of $r$ rows and $c$ columns.
Proof. 
See Appendix D.  ☐
We now give a closed-form expression for the ϵ -convergence time of the fastest LTI distributed averaging algorithm for symmetric weights on a grid of r rows and c columns. We also study the asymptotic behavior of this convergence time as the number of rows of the grid grows.
Theorem 4.
Consider $\epsilon \in (0,1)$ and $r, c \in \mathbb{N}$, with $rc > 2$. Without loss of generality, we assume $r \geq c$. Then,
$$\tau\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) = \left\lceil \frac{\log \epsilon^{-1}}{-\log \cos\frac{\pi}{r}} \right\rceil. \tag{13}$$
Moreover,
$$\tau\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) \sim \frac{2 r^2 \log \epsilon^{-1}}{\pi^2} \tag{14}$$
and consequently,
$$\tau\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) = \Theta(r^2 \log \epsilon^{-1}). \tag{15}$$
Proof. 
From [2] (Theorem 1), Theorem A1 and (A64), we obtain (13). The rest of the proof runs as the proof of Theorem 2.  ☐
Since the number of transmissions per iteration on a grid of $r$ rows and $c$ columns is $rc$ for the fastest LTI distributed averaging algorithm for symmetric weights, the total number of transmissions required for $\tau\left(\epsilon, W_{r,c}\left(\frac{1}{2}\right)\right)$ iterations is:
$$T\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) := r\,c\,\tau\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right). \tag{16}$$
If $r = c = \sqrt{n}$, from Theorem 4, we obtain:
$$T\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) \sim \frac{2 n^2 \log \epsilon^{-1}}{\pi^2}, \tag{17}$$
and hence, $T\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) = \Theta(n^2 \log \epsilon^{-1})$. Observe that from (13), the optimal configuration for a grid with $n$ sensors is obtained when $r = c = \sqrt{n}$.
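A short sketch of the construction (11)-(12) and of the closed form (13), under the same assumption $r \geq c$ (helper names are ours):

```python
# Sketch (ours) of the grid construction (11)-(12) and the closed form (13),
# assuming r >= c as in Theorem 4.
import numpy as np

def W_path(n, a):
    if n == 1:
        return np.array([[1.0]])
    W = (1 - 2*a) * np.eye(n)
    W[0, 0] = W[n-1, n-1] = 1 - a               # corner entries are 1 - α
    for j in range(n - 1):
        W[j, j+1] = W[j+1, j] = a
    return W

r, c, eps = 6, 4, 1e-3
W = np.kron(W_path(r, 0.5), W_path(c, 0.5))     # W_{r,c}(1/2), see (12)
P = np.full((r*c, r*c), 1.0 / (r*c))
tau_spectral = int(np.ceil(np.log(1/eps) / -np.log(np.linalg.norm(W - P, 2))))
tau_closed = int(np.ceil(np.log(1/eps) / -np.log(np.cos(np.pi/r))))        # (13)
assert tau_closed == tau_spectral
```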

2.1.3. The Path

Since the path with $n$ sensors can be seen as a grid of $n$ rows and one column, from Theorem 3, we conclude that $W_n\left(\frac{1}{2}\right)$ is the weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights on a path with $n$ sensors, and from Theorem 4, we conclude that:
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right) = \left\lceil \frac{\log \epsilon^{-1}}{-\log \cos\frac{\pi}{n}} \right\rceil. \tag{18}$$
Moreover,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right) \sim \frac{2 n^2 \log \epsilon^{-1}}{\pi^2} \tag{19}$$
and consequently,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right) = \Theta(n^2 \log \epsilon^{-1}). \tag{20}$$
Finally, from (16), we obtain:
$$T\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right) \sim \frac{2 n^3 \log \epsilon^{-1}}{\pi^2}, \tag{21}$$
and hence, $T\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right) = \Theta(n^3 \log \epsilon^{-1})$.

2.2. Convergence Time of the Fastest Constant Edge Weights Algorithm

In [2], the authors consider the real symmetric weighting matrices $W_n(\rho)$ given by:
$$[W_n(\rho)]_{j,k} := \begin{cases} \rho & \text{if } j \neq k, \text{ and } v_j \text{ and } v_k \text{ are connected}, \\ 1 - d_j \rho & \text{if } j = k, \\ 0 & \text{otherwise}, \end{cases} \tag{22}$$
where $d_j$ denotes the degree of the sensor $v_j$ (i.e., the number of sensors different from $v_j$ connected to $v_j$).
Observe that the weighting matrices of the fastest LTI distributed averaging algorithms for symmetric weights given in Section 2.1 for a cycle and a path, namely $W_n(\gamma_0)$ and $W_n\left(\frac{1}{2}\right)$, can be regarded as $W_n(\rho)$ in (22) taking $\rho = \gamma_0$ and $\rho = \frac{1}{2}$, respectively. Therefore, the closed-form expression for the $\epsilon$-convergence time of the fastest constant edge weights algorithm is given by Theorem 2 on a cycle and by Theorem 4 on a path. That is, on a cycle and on a path, the $\epsilon$-convergence time of the fastest constant edge weights algorithm coincides with that of the fastest LTI distributed averaging algorithm for symmetric weights.

2.3. Convergence Time of the Maximum-Degree Weights Algorithm and of the Metropolis–Hastings Algorithm

For the maximum-degree weights algorithm [3], the weighting matrix considered is the real symmetric matrix $W_n(\rho)$ in (22) with:
$$\rho = \frac{1}{1 + \max_{j \in \{1,\ldots,n\}} d_j}. \tag{23}$$
On the other hand, for the Metropolis–Hastings algorithm [3], the entries of the weighting matrix $W_n$ are given by:
$$[W_n]_{j,k} = \begin{cases} \dfrac{[A]_{j,k}}{1 + \max\{d_j, d_k\}} & \text{if } j \neq k, \\[8pt] 1 - \sum_{h \in \{1,\ldots,n\} \setminus \{j\}} [W_n]_{j,h} & \text{if } j = k, \end{cases} \tag{24}$$
where $A$ is the adjacency matrix of the network, that is, $A$ is the $n \times n$ real symmetric matrix given by:
$$[A]_{j,k} = \begin{cases} 1 & \text{if } j \neq k, \text{ and } v_j \text{ and } v_k \text{ are connected}, \\ 0 & \text{otherwise}. \end{cases} \tag{25}$$
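For illustration, a sketch that builds the Metropolis–Hastings weighting matrix (24) from an adjacency matrix (the function name is ours):

```python
# Building the Metropolis–Hastings weighting matrix (24) from an adjacency
# matrix A as in (25); a sketch of ours, the function name is illustrative.
import numpy as np

def metropolis_hastings(A):
    n = A.shape[0]
    d = A.sum(axis=1)                                  # sensor degrees d_j
    W = np.zeros((n, n))
    for j in range(n):
        for k in range(n):
            if j != k and A[j, k] == 1:
                W[j, k] = 1.0 / (1 + max(d[j], d[k]))  # off-diagonal entries of (24)
    np.fill_diagonal(W, 1.0 - W.sum(axis=1))           # each row sums to one
    return W
```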

2.3.1. The Cycle

Observe that the weighting matrices of the maximum-degree weights algorithm and of the Metropolis–Hastings algorithm for a cycle with $n$ sensors can be regarded as $W_n(\gamma)$ in (4) taking $\gamma = \frac{1}{3}$.
We now give a closed-form expression for the ϵ -convergence time of the maximum-degree weights algorithm and of the Metropolis–Hastings algorithm on a cycle. We also study the asymptotic behavior of this convergence time as the number of sensors of the cycle grows.
Theorem 5.
Consider $\epsilon \in (0,1)$ and $n \in \mathbb{N}$, with $n > 3$. Then:
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \left\lceil \frac{\log \epsilon^{-1}}{-\log \frac{1 + 2\cos\frac{2\pi}{n}}{3}} \right\rceil. \tag{26}$$
Moreover,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) \sim \frac{3 n^2 \log \epsilon^{-1}}{4\pi^2}, \tag{27}$$
and therefore,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \Theta(n^2 \log \epsilon^{-1}). \tag{28}$$
Proof. 
Combining (A29) and (A30), we obtain:
$$\left\| W_n\left(\tfrac{1}{3}\right) - P_n \right\|_2 = \frac{1 + 2\cos\frac{2\pi}{n}}{3}, \tag{29}$$
and applying [2] (Theorem 1) and Theorem A1, (26) holds. The rest of the proof runs as the proof of Theorem 2.  ☐
Since the number of transmissions per iteration on a cycle with $n$ sensors is $n$ for both algorithms, the total number of transmissions required for $\tau\left(\epsilon, W_n\left(\frac{1}{3}\right)\right)$ iterations is $T\left(\epsilon, W_n\left(\frac{1}{3}\right)\right) := n\,\tau\left(\epsilon, W_n\left(\frac{1}{3}\right)\right)$. From Theorem 5, we obtain:
$$T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) \sim \frac{3 n^3 \log \epsilon^{-1}}{4\pi^2}, \tag{30}$$
and thus, $T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \Theta(n^3 \log \epsilon^{-1})$.
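A quick numerical sanity check of (26) and (29) on a cycle (an illustration of ours, not part of the original paper):

```python
# A quick check (ours) of (29) and (26) for the cycle weighting matrix W_n(1/3).
import numpy as np

n, eps = 12, 1e-6
W = (1/3) * np.eye(n)                       # diagonal entries 1 - 2/3 = 1/3
for j in range(n):
    W[j, (j + 1) % n] = W[j, (j - 1) % n] = 1/3
P = np.full((n, n), 1/n)
rho = (1 + 2*np.cos(2*np.pi/n)) / 3         # closed-form spectral norm (29)
assert np.isclose(np.linalg.norm(W - P, 2), rho)
print(int(np.ceil(np.log(1/eps) / -np.log(rho))))   # τ(ϵ, W_n(1/3)) from (26)
```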

2.3.2. The Path

Observe that the weighting matrices of the maximum-degree weights algorithm and of the Metropolis–Hastings algorithm for a path with $n$ sensors can be regarded as $W_n(\alpha)$ in (11) taking $\alpha = \frac{1}{3}$.
We now give a closed-form expression for the ϵ -convergence time of the maximum-degree weights algorithm and of the Metropolis–Hastings algorithm on a path. We also study the asymptotic behavior of this convergence time as the number of sensors of the path grows.
Theorem 6.
Consider $\epsilon \in (0,1)$ and $n \in \mathbb{N}$, with $n > 3$. Then:
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \left\lceil \frac{\log \epsilon^{-1}}{-\log \frac{1 + 2\cos\frac{\pi}{n}}{3}} \right\rceil. \tag{31}$$
Moreover,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) \sim \frac{3 n^2 \log \epsilon^{-1}}{\pi^2}, \tag{32}$$
and therefore,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \Theta(n^2 \log \epsilon^{-1}). \tag{33}$$
Proof. 
Combining (A63) and [4] (Lemma 1), we obtain:
$$\left\| W_n\left(\tfrac{1}{3}\right) - P_n \right\|_2 = \frac{1}{3} + \frac{2}{3}\cos\frac{\pi}{n}, \tag{34}$$
and applying [2] (Theorem 1) and Theorem A1, (31) holds. The rest of the proof runs as the proof of Theorem 2.  ☐
Since the number of transmissions per iteration on a path with $n$ sensors is $n$ for both algorithms, the total number of transmissions required for $\tau\left(\epsilon, W_n\left(\frac{1}{3}\right)\right)$ iterations is $T\left(\epsilon, W_n\left(\frac{1}{3}\right)\right) := n\,\tau\left(\epsilon, W_n\left(\frac{1}{3}\right)\right)$. From Theorem 6, we obtain:
$$T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) \sim \frac{3 n^3 \log \epsilon^{-1}}{\pi^2}, \tag{35}$$
and thus, $T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \Theta(n^3 \log \epsilon^{-1})$.

3. Convergence Time of Randomized Linear Distributed Averaging Algorithms

3.1. Lower and Upper Bounds for the Convergence Time of the Pairwise Gossip Algorithm

In the literature, we have found two different definitions for the convergence time of a randomized linear distributed averaging algorithm (see [6,7]). In this subsection, we consider the definition of ϵ -convergence time for a randomized linear distributed averaging algorithm given in [6]:
$$\tau\left(\epsilon, \{W(t)\}_{t \geq 0}\right) := \sup_{x(0) \neq 0_{n \times 1}} \inf\left\{ t : \Pr\left( \frac{\| x(t) - P_n x(0) \|_2}{\| x(0) \|_2} \geq \epsilon \right) \leq \epsilon \right\}, \tag{36}$$
where $\epsilon \in (0,1)$ and $\Pr$ denotes probability.
We prove in Theorem A1 (Appendix A) that the definitions of ϵ -convergence time in (3) and (36) coincide when applied to deterministic LTI distributed averaging algorithms with symmetric weights (in particular, the four algorithms considered in Section 2). For those algorithms, we also obtain from Theorem A1 that:
$$\tau\left(\tfrac{1}{e}, W\right) = \tau(W), \tag{37}$$
where τ ( W ) denotes the definition of convergence time given in [2].
We recall here that in the pairwise gossip algorithm [6], only two sensors interchange information at each time instant $t$. These two sensors $v_{j_t}$ and $v_{k_t}$ are randomly selected at each time instant $t$, and the weighting matrix $W(t)$, which we denote by $W_P(t)$, is the symmetric matrix given by:
$$[W_P(t)]_{j,k} = \begin{cases} \frac{1}{2} & \text{if } j, k \in \{j_t, k_t\}, \\ 1 & \text{if } j = k \notin \{j_t, k_t\}, \\ 0 & \text{otherwise}, \end{cases} \tag{38}$$
for all $j, k \in \{1, \ldots, n\}$.
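For illustration, one step of the pairwise gossip algorithm can be sketched as follows (the edge list and helper names are our own choices):

```python
# One pairwise gossip step (38), as a sketch of ours: the two sensors of a
# uniformly chosen edge replace their values by their pairwise average.
import numpy as np

rng = np.random.default_rng(0)

def pairwise_gossip_step(x, edges):
    j, k = edges[rng.integers(len(edges))]   # randomly selected edge {v_{j_t}, v_{k_t}}
    x[j] = x[k] = 0.5 * (x[j] + x[k])
    return x

# e.g., on a cycle with n sensors every sensor j is linked to (j+1) mod n
n = 10
edges = [(j, (j + 1) % n) for j in range(n)]
x = rng.standard_normal(n)
for _ in range(1000):
    x = pairwise_gossip_step(x, edges)       # x approaches the all-average vector
```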
In [6], a lower and an upper bound for the ϵ -convergence time of the pairwise gossip algorithm were introduced. We now give a closed-form expression for those bounds on a cycle and on a path, and we study their asymptotic behavior as the number of sensors of the network grows.

3.1.1. The Cycle

Theorem 7.
Consider $\epsilon \in (0,1)$ and $n \in \mathbb{N}$, with $n > 3$. Suppose that $W_P(t)$ is the weighting matrix of the pairwise gossip algorithm given in (38) on a cycle with $n$ sensors, where the edge $\{v_{j_t}, v_{k_t}\}$ is randomly selected at each time instant $t \in \mathbb{N} \cup \{0\}$ with probability $\frac{1}{n}$. Then:
$$\frac{1}{2}\, l_P(\epsilon) \leq \tau\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right) \leq 3\, l_P(\epsilon), \tag{39}$$
with:
$$l_P(\epsilon) = \frac{\log \epsilon^{-1}}{-\log\left( 1 + \frac{\cos\frac{2\pi}{n} - 1}{n} \right)}. \tag{40}$$
Moreover,
$$l_P(\epsilon) \sim \frac{n^3 \log \epsilon^{-1}}{2\pi^2} \tag{41}$$
and:
$$\tau\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right) = \Theta(n^3 \log \epsilon^{-1}) = \Theta\left( \tau\left(\epsilon, W_n\left(\tfrac{1}{2n}\right)\right) \right). \tag{42}$$
Proof. 
The entries of the expectation of $W_P(0)$ are given by:
$$[\mathrm{E}(W_P(0))]_{j,k} = \begin{cases} \frac{1}{2n} & \text{if } j - k \in \{-1, 1\}, \\ \frac{1}{2n} & \text{if } j - k \in \{1-n, n-1\}, \\ \frac{n-1}{n} & \text{if } j = k, \\ 0 & \text{otherwise}, \end{cases} \tag{43}$$
for all $j, k \in \{1, \ldots, n\}$. Thus, $\mathrm{E}(W_P(0)) = W_n\left(\frac{1}{2n}\right)$. Therefore, combining (A29) and [6] (Theorem 3), we obtain (39). The rest of the proof runs as the proof of Theorem 2.  ☐
Since the number of transmissions per iteration on a cycle with $n$ sensors is two for the pairwise gossip algorithm, the total number of transmissions required for $\tau\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right)$ iterations is $T\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right) := 2\,\tau\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right)$. From Theorem 7, we obtain $l_P(\epsilon) \leq T\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right) \leq 6\, l_P(\epsilon)$ and $T\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right) = \Theta(n^3 \log \epsilon^{-1})$.
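A sketch evaluating the closed-form bound (40) of Theorem 7 (an illustration of ours):

```python
# Evaluating the closed-form bound l_P(ϵ) of Theorem 7 on a cycle (a sketch of ours).
import numpy as np

def l_P_cycle(n, eps):
    lam2 = 1 + (np.cos(2*np.pi/n) - 1) / n   # second-largest eigenvalue of E(W_P(0)), see (40)
    return np.log(1/eps) / -np.log(lam2)

n, eps = 10, 1e-3
lp = l_P_cycle(n, eps)
print(f"iterations: {0.5*lp:.0f} .. {3*lp:.0f}")    # bounds (39)
print(f"transmissions: {lp:.0f} .. {6*lp:.0f}")     # two transmissions per iteration
```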

3.1.2. The Path

Theorem 8.
Consider $\epsilon \in (0,1)$ and $n \in \mathbb{N}$, with $n > 3$. Suppose that $W_P(t)$ is the weighting matrix of the pairwise gossip algorithm given in (38) on a path with $n$ sensors, where the edge $\{v_{j_t}, v_{k_t}\}$ is randomly selected at each time instant $t \in \mathbb{N} \cup \{0\}$ with probability $\frac{1}{n-1}$. Then:
$$\frac{1}{2}\, l_P(\epsilon) \leq \tau\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right) \leq 3\, l_P(\epsilon), \tag{44}$$
with:
$$l_P(\epsilon) = \frac{\log \epsilon^{-1}}{-\log\left( 1 + \frac{\cos\frac{\pi}{n} - 1}{n-1} \right)}. \tag{45}$$
Moreover,
$$l_P(\epsilon) \sim \frac{2 n^3 \log \epsilon^{-1}}{\pi^2} \tag{46}$$
and:
$$\tau\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right) = \Theta(n^3 \log \epsilon^{-1}) = \Theta\left( \tau\left(\epsilon, W_n\left(\tfrac{1}{2n-2}\right)\right) \right). \tag{47}$$
Proof. 
The entries of the expectation of $W_P(0)$ are given by:
$$[\mathrm{E}(W_P(0))]_{j,k} = \begin{cases} \frac{1}{2n-2} & \text{if } j - k \in \{-1, 1\}, \\ 1 - \frac{1}{n-1} & \text{if } j = k,\ j \neq 1 \text{ and } j \neq n, \\ 1 - \frac{1}{2n-2} & \text{if } j = k,\ j \in \{1, n\}, \\ 0 & \text{otherwise}, \end{cases} \tag{48}$$
for all $j, k \in \{1, \ldots, n\}$. Thus, $\mathrm{E}(W_P(0)) = W_n\left(\frac{1}{2n-2}\right)$. Therefore, combining (A63) and [6] (Theorem 3), we obtain (44). The rest of the proof runs as the proof of Theorem 2.  ☐
Since the number of transmissions per iteration on a path with $n$ sensors is two for the pairwise gossip algorithm, the total number of transmissions required for $\tau\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right)$ iterations is $T\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right) := 2\,\tau\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right)$. From Theorem 8, we obtain $l_P(\epsilon) \leq T\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right) \leq 6\, l_P(\epsilon)$ and $T\left(\epsilon, \{W_P(t)\}_{t \geq 0}\right) = \Theta(n^3 \log \epsilon^{-1})$.

3.2. Lower and Upper Bounds for the Convergence Time of the Broadcast Gossip Algorithm

We begin this subsection with the definition of ϵ -convergence time for a randomized linear distributed averaging algorithm given in [7] (Equation (42)):
$$\tau\left(\epsilon, \{W(t)\}_{t \geq 0}\right) := \sup_{x(0) \neq P_n x(0)} \inf\left\{ t : \Pr\left( \frac{\| x(t) - P_n x(t) \|_2}{\| x(0) - P_n x(0) \|_2} \geq \epsilon \right) \leq \epsilon \right\}, \tag{49}$$
where $\epsilon \in (0,1)$.
It can be proven that the definitions of $\epsilon$-convergence time in (36) and (49) coincide when applied to algorithms in which the matrix $W(t)$ satisfies $W(t) P_n = P_n W(t) = P_n$ for all $t \in \mathbb{N} \cup \{0\}$ (in particular, the pairwise gossip algorithm and the deterministic LTI distributed averaging algorithms with symmetric weights).
Observe that (49) is actually a definition for the convergence time of linear distributed consensus algorithms, not only of linear distributed averaging algorithms.
We recall here that in the broadcast gossip algorithm, a single sensor broadcasts at each time instant $t$. This sensor $v_{j_t}$ is randomly selected at each time instant $t$ with probability $\frac{1}{n}$, and the weighting matrix $W(t)$ is given by:
$$[W(t)]_{j,k} = \begin{cases} 1 & \text{if } j = k \text{ and } [A]_{j,j_t} = 0, \\ \varphi & \text{if } j = k \text{ and } [A]_{j,j_t} = 1, \\ 1 - \varphi & \text{if } k = j_t \text{ and } [A]_{j,j_t} = 1, \\ 0 & \text{otherwise}, \end{cases} \tag{50}$$
for all $j, k \in \{1, \ldots, n\}$, where $\varphi \in (0,1)$ and $A$ is the adjacency matrix of the network. We denote by $W_B(t)$ the weighting matrix in (50) when $\varphi$ is the optimal parameter $\varphi_0$ (see [7] (Section V)).
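For illustration, one step of the broadcast gossip algorithm (50) can be sketched as follows, together with the optimal parameter (51) for a cycle (helper names are ours):

```python
# One broadcast gossip step (50), as a sketch of ours: the uniformly chosen
# sensor v_{j_t} broadcasts, and each neighbor mixes with weight φ
# (the broadcaster and all non-neighbors keep their values).
import numpy as np

rng = np.random.default_rng(0)

def broadcast_gossip_step(x, A, phi):
    jt = rng.integers(len(x))                # broadcasting sensor, probability 1/n each
    nbrs = np.flatnonzero(A[jt])             # sensors with [A]_{j, j_t} = 1
    x[nbrs] = phi * x[nbrs] + (1 - phi) * x[jt]
    return x

# Optimal parameter φ0 on a cycle with n sensors, from (51)
n = 10
phi0 = 1 - n / (2 * (n + np.cos(2*np.pi/n) - 1))
```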
In [7], a lower and an upper bound for the ϵ -convergence time of the broadcast gossip algorithm were introduced. We now give a closed-form expression for φ 0 and for those bounds on a cycle and on a path. We also study the asymptotic behavior of the bounds as the number of sensors of the network grows.

3.2.1. The Cycle

Theorem 9.
Consider $\epsilon \in (0,1)$ and $n \in \mathbb{N}$, with $n > 3$. Suppose that $W_B(t)$ is the weighting matrix in (50) when the network is a cycle with $n$ sensors and $\varphi$ is the optimal parameter $\varphi_0$. Then:
$$\varphi_0 = 1 - \frac{n}{2\left(n + \cos\frac{2\pi}{n} - 1\right)} \tag{51}$$
and:
$$l_B(\epsilon) \leq \tau\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right) \leq 6\, l_B(\epsilon), \tag{52}$$
with:
$$l_B(\epsilon) = \frac{\log \epsilon^{-1}}{-2\log\left( \dfrac{n + 2\cos\frac{2\pi}{n} - 2}{n + \cos\frac{2\pi}{n} - 1} \right)}. \tag{53}$$
Moreover,
$$l_B(\epsilon) \sim \frac{n^3 \log \epsilon^{-1}}{4\pi^2}, \tag{54}$$
and:
$$\tau\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right) = \Theta(n^3 \log \epsilon^{-1}) = \Theta\left( \tau\left(\epsilon, W_n\left(\tfrac{1-\varphi_0}{n}\right)\right) \right). \tag{55}$$
Proof. 
See Appendix E.  ☐
Since the number of transmissions per iteration on a cycle with $n$ sensors is one for the broadcast gossip algorithm, the total number of transmissions required for $\tau\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right)$ iterations is $T\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right) := \tau\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right)$. From Theorem 9, we obtain $l_B(\epsilon) \leq T\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right) \leq 6\, l_B(\epsilon)$ and $T\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right) = \Theta(n^3 \log \epsilon^{-1})$.

3.2.2. The Path

Theorem 10.
Consider $\epsilon \in (0,1)$ and $n \in \mathbb{N}$, with $n > 3$. Suppose that $W_B(t)$ is the weighting matrix in (50) when the network is a path with $n$ sensors and $\varphi$ is the optimal parameter $\varphi_0$. Then:
$$\varphi_0 = 1 - \frac{n}{2\left(n + \cos\frac{\pi}{n} - 1\right)} \tag{56}$$
and:
$$l_B(\epsilon) \leq \tau\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right) \leq 6\, l_B(\epsilon), \tag{57}$$
with:
$$l_B(\epsilon) = \frac{\log \epsilon^{-1}}{-2\log\left( \dfrac{n + 2\cos\frac{\pi}{n} - 2}{n + \cos\frac{\pi}{n} - 1} \right)}. \tag{58}$$
Moreover,
$$l_B(\epsilon) \sim \frac{n^3 \log \epsilon^{-1}}{\pi^2}, \tag{59}$$
and:
$$\tau\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right) = \Theta(n^3 \log \epsilon^{-1}) = \Theta\left( \tau\left(\epsilon, W_n\left(\tfrac{1-\varphi_0}{n}\right)\right) \right). \tag{60}$$
Proof. 
See Appendix F.  ☐
Since the number of transmissions per iteration on a path with $n$ sensors is one for the broadcast gossip algorithm, the total number of transmissions required for $\tau\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right)$ iterations is $T\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right) := \tau\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right)$. From Theorem 10, we obtain $l_B(\epsilon) \leq T\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right) \leq 6\, l_B(\epsilon)$ and $T\left(\epsilon, \{W_B(t)\}_{t \geq 0}\right) = \Theta(n^3 \log \epsilon^{-1})$.

4. Discussion

As in this paper we have used the same definition of convergence time for both deterministic and randomized linear distributed averaging algorithms (namely, the one in (49)), the results given in Section 2 and Section 3 allow us to compare the considered algorithms on a cycle and on a path in terms of convergence time and, consequently, in terms of the number of transmissions required. In particular, these results show the following:
  • The behavior of the considered deterministic linear distributed averaging algorithms is as good as the behavior of the considered randomized ones in terms of the number of transmissions required on a cycle and on a path with n sensors: Θ ( n 3 log ϵ 1 ) .
  • For a large enough number of sensors and regardless of the considered distributed averaging algorithm, the number of transmissions required on a path is four times larger than the number of transmissions required on a cycle.
Furthermore, regarding the cycle, from (10), (30), (41) and (54), we obtain the following enlightening asymptotic equalities:
$$l_B(\epsilon) \sim \frac{T(\epsilon, W_n(\gamma_0))}{2} \sim \frac{l_P(\epsilon)}{2} \sim \frac{T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right)}{3}, \tag{61}$$
and regarding the path, from (21), (35), (46) and (59), we obtain:
$$l_B(\epsilon) \sim \frac{T\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right)}{2} \sim \frac{l_P(\epsilon)}{2} \sim \frac{T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right)}{3}. \tag{62}$$

5. Numerical Examples

For the numerical examples, we first consider a cycle and a path with five and 10 sensors. For each network topology, we present a figure: Figure 2 for the cycle and Figure 3 for the path. Figure 2 (resp. Figure 3) shows the number of transmissions of the fastest LTI distributed averaging algorithm for symmetric weights, $T(\epsilon, W_n(\gamma_0))$ (resp. $T(\epsilon, W_n(1/2))$), and of the Metropolis–Hastings algorithm, $T(\epsilon, W_n(1/3))$, with $\epsilon \in (10^{-15}, 1)$. The figure also shows the lower bound, $l_P(\epsilon)$, and upper bound, $6 l_P(\epsilon)$, given for the number of transmissions of the pairwise gossip algorithm, and the lower bound, $l_B(\epsilon)$, and upper bound, $6 l_B(\epsilon)$, given for the number of transmissions of the broadcast gossip algorithm. Furthermore, the figure shows the average number of transmissions of the pairwise gossip algorithm, $\hat{T}(\epsilon, \{W_P(t)\}_{t \geq 0})$, and of the broadcast gossip algorithm, $\hat{T}(\epsilon, \{W_B(t)\}_{t \geq 0})$, which we have computed by using Monte Carlo simulations. In those simulations, we have performed 1000 repetitions of the corresponding algorithm for each $\epsilon \in (10^{-15}, 1)$, and we have considered that the values measured by the sensors, $x_j(0)$ with $j \in \{1, \ldots, n\}$, are independent identically distributed random variables with zero mean, unit variance and uniform distribution.
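The following sketch illustrates the described Monte Carlo procedure for the pairwise gossip algorithm on a cycle (our own illustration; the actual simulations in the paper may differ in details):

```python
# Sketch (ours) of the Monte Carlo estimate described above: average number of
# transmissions for the pairwise gossip algorithm on a cycle with n sensors.
import numpy as np

rng = np.random.default_rng(0)

def avg_transmissions_pairwise_cycle(n, eps, reps=1000):
    edges = [(j, (j + 1) % n) for j in range(n)]
    counts = []
    for _ in range(reps):
        # i.i.d. zero-mean, unit-variance uniform measurements
        x = rng.uniform(-np.sqrt(3), np.sqrt(3), n)
        avg = x.mean()
        err0 = np.linalg.norm(x - avg)
        t = 0
        while np.linalg.norm(x - avg) > eps * err0:
            j, k = edges[rng.integers(n)]          # uniformly random edge
            x[j] = x[k] = 0.5 * (x[j] + x[k])
            t += 2                                 # two transmissions per iteration
        counts.append(t)
    return np.mean(counts)

print(avg_transmissions_pairwise_cycle(10, 1e-3))
```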
In this section, we present another two figures: Figure 4 and Figure 5. Unlike in Figure 2 and Figure 3, in Figure 4 and Figure 5, we have fixed $\epsilon$ instead of the number of sensors $n$ of the network. Specifically, we have chosen $\epsilon = 10^{-3}$ and $\epsilon = 10^{-6}$ with $n \in \{5, \ldots, 30\}$.
In the figures, it can be observed that the Metropolis–Hastings algorithm behaves on average better than the pairwise gossip algorithm in terms of the number of transmissions required on the considered networks. It can also be observed that the broadcast gossip algorithm behaves on average approximately the same as the fastest LTI distributed averaging algorithm for symmetric weights in terms of the number of transmissions required on those networks. However, we recall here that the broadcast gossip algorithm converges to a random consensus value rather than to the average consensus value, and it should be executed several times in order to get that average value in every sensor.
The figures also support the asymptotic equalities given in (61) and (62).

6. Conclusions

In this paper, we have studied the convergence time of six known linear distributed averaging algorithms. We have considered both deterministic (the fastest LTI distributed averaging algorithm for symmetric weights, the fastest constant edge weights algorithm, the maximum-degree weights algorithm and the Metropolis–Hastings algorithm) and randomized (the pairwise gossip algorithm and the broadcast gossip algorithm) linear distributed averaging algorithms. In the literature, we have not found closed-form expressions for the convergence time of the considered algorithms. We have computed closed-form expressions for the convergence time of the deterministic algorithms and closed-form upper bounds for the convergence time of the randomized algorithms on two common network topologies: the cycle and the path. Moreover, we have also computed a closed-form expression for the convergence time of the fastest LTI algorithm on a grid. From the computed closed-form formulas, we have studied the asymptotic behavior of the convergence time of the considered algorithms as the number of sensors of the considered networks grows.
Although there exist different definitions of convergence time in the literature, in this paper, we have proven that one of them (namely, the one in (49)) encompasses all the others for the algorithms considered here. As we have used the definition of convergence time in (49) for both deterministic and randomized linear distributed averaging algorithms, the obtained closed-form formulas and asymptotic results allow us to compare the considered algorithms on cycles and paths in terms of convergence time and, consequently, in terms of the number of transmissions required.
We now summarize the most remarkable conclusions:
  • The best algorithm among the considered deterministic distributed averaging algorithms is not worse than the best algorithm among the considered randomized distributed averaging algorithms for cycles and paths.
  • The weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights and the weighting matrix of the fastest constant edge weights algorithm are the same on cycles and on paths.
  • The number of transmissions required on a path with n sensors is asymptotically four times larger than the number of transmissions required on a cycle with the same number of sensors.
  • The number of transmissions required grows as n 3 on cycles and on paths for the six algorithms considered.
  • For the fastest LTI algorithm, the number of transmissions required grows as $n^2$ on a square grid of $n$ sensors (i.e., $r = c = \sqrt{n}$).
A future research direction of this work would be to generalize the analysis presented in the paper to other network topologies. In particular, networks that can be decomposed into cycles and paths could be studied.

Acknowledgments

This work was supported in part by the Spanish Ministry of Economy and Competitiveness through the RACHEL project (TEC2013-47141-C4-2-R), the CARMEN project (TEC2016-75067-C4-3-R) and the COMONSENS network (TEC2015-69648-REDC).

Author Contributions

Jesús Gutiérrez-Gutiérrez conceived the research question. Jesús Gutiérrez-Gutiérrez, Marta Zárraga-Rodríguez and Xabier Insausti proved the main results. Xabier Insausti performed the simulations. Jesús Gutiérrez-Gutiérrez, Marta Zárraga-Rodríguez and Xabier Insausti wrote the paper. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Comparison of Several Definitions of Convergence Time

We begin by giving a property of the spectral norm. Its proof is implicit in the Appendix of [1].
Lemma A1.
Let $B$ and $P$ be two $n \times n$ real symmetric matrices with $BP = P$ (or equivalently, $PB = P$). Suppose that $P$ is idempotent. Then:
  • $B^t P = P$ for all $t \in \mathbb{N}$.
  • $B^t - P = (B - P)^t$ for all $t \in \mathbb{N}$.
  • $\|B^t - P\|_2 = \|B - P\|_2^t$ for all $t \in \mathbb{N}$.
We recall that an $n \times n$ matrix $A$ is idempotent if and only if $A^2 = A$. An example of an idempotent matrix is $P_n$ with $n \in \mathbb{N}$, since $[P_n^2]_{j,k} = \sum_{h=1}^{n} [P_n]_{j,h} [P_n]_{h,k} = \sum_{h=1}^{n} \frac{1}{n} \cdot \frac{1}{n} = \frac{1}{n} = [P_n]_{j,k}$ for all $j, k \in \{1, \ldots, n\}$.
The following result gives an eigenvalue decomposition for the matrix P n for all n N .
Lemma A2.
If $n \in \mathbb{N}$, then $P_n = V_n \operatorname{diag}(1, 0, \ldots, 0) V_n^*$, where $V_n$ is the $n \times n$ Fourier unitary matrix.
Proof. 
From [13] (Lemma 2) or [14] (Lemma 3), we obtain that $V_n \operatorname{diag}(1, 0, \ldots, 0) V_n^*$ is a circulant matrix with:
$$[V_n \operatorname{diag}(1,0,\ldots,0) V_n^*]_{j,1} = \frac{1}{\sqrt{n}} \left[ V_n \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \right]_{j,1} = \frac{1}{\sqrt{n}} \sum_{k=1}^{n} [V_n]_{j,k} \left[ \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \right]_{k,1} = \frac{1}{\sqrt{n}} [V_n]_{j,1} = \frac{1}{n} \tag{A1}$$
for all $j \in \{1, \ldots, n\}$. Therefore, $V_n \operatorname{diag}(1, 0, \ldots, 0) V_n^* = P_n$.  ☐
We finish this subsection with a result regarding the ϵ -convergence time.
Theorem A1.
Let $B$ be an $n \times n$ real symmetric matrix with $B P_n = P_n$ and $B^t \to P_n$. If $\epsilon \in (0,1)$, then:
$$\min\left\{ t_0 \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x - P_n x\|_2} \leq \epsilon,\ \forall t \geq t_0,\ \forall x \neq P_n x \right\} = \max_{x \neq 0_{n \times 1}} \min\left\{ t \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon \right\} \tag{A2}$$
$$= \left\lceil \frac{\log \epsilon^{-1}}{\log \|B - P_n\|_2^{-1}} \right\rceil. \tag{A3}$$
Proof. 
Let $t \in \mathbb{N}$. We first prove that the following two statements are equivalent:
  1. $\frac{\|B^t x - P_n x\|_2}{\|x - P_n x\|_2} \leq \epsilon$ for all $x \neq P_n x$.
  2. $\frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon$ for all $x \neq 0_{n \times 1}$.
1⇒2: Fix $x \neq 0_{n \times 1}$. If $x \neq P_n x$, applying Lemma A2 yields:
$$\frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \frac{\epsilon \|x - P_n x\|_2}{\|x\|_2} = \frac{\epsilon \|(I_n - P_n) x\|_2}{\|x\|_2} \leq \epsilon \|I_n - P_n\|_2 = \epsilon \left\| V_n V_n^* - V_n \operatorname{diag}(1,0,\ldots,0) V_n^* \right\|_2$$
$$= \epsilon \left\| V_n I_n V_n^* - V_n \operatorname{diag}(1,0,\ldots,0) V_n^* \right\|_2 = \epsilon \left\| V_n \operatorname{diag}(0,1,\ldots,1) V_n^* \right\|_2 = \epsilon,$$
where $I_n$ is the $n \times n$ identity matrix. If $x = P_n x$, from Lemma A2 and [2] (Theorem 1), we obtain:
$$\frac{\|B^t x - P_n x\|_2}{\|x\|_2} = \frac{\|B^t P_n x - P_n x\|_2}{\|x\|_2} = \frac{\|P_n x - P_n x\|_2}{\|x\|_2} = 0 < \epsilon.$$
2⇒1: If $x \neq P_n x$, then:
$$\frac{\|B^t x - P_n x\|_2}{\|x - P_n x\|_2} = \frac{\|B^t x - P_n x - P_n x + P_n x\|_2}{\|x - P_n x\|_2} = \frac{\|B^t x - B^t P_n x - P_n x + P_n^2 x\|_2}{\|x - P_n x\|_2}$$
$$= \frac{\|B^t (x - P_n x) - P_n (x - P_n x)\|_2}{\|x - P_n x\|_2} \leq \epsilon.$$
Consequently,
$$\min\left\{ t_0 \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x - P_n x\|_2} \leq \epsilon,\ \forall t \geq t_0,\ \forall x \neq P_n x \right\}$$
$$= \min\left\{ t_0 \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon,\ \forall t \geq t_0,\ \forall x \neq 0_{n \times 1} \right\} \tag{A10}$$
$$= \min\left\{ t_0 \in \mathbb{N} : \max_{x \neq 0_{n \times 1}} \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon,\ \forall t \geq t_0 \right\}$$
$$= \min\left\{ t_0 \in \mathbb{N} : \|B^t - P_n\|_2 \leq \epsilon,\ \forall t \geq t_0 \right\} \tag{A12}$$
$$= \min\left\{ t_0 \in \mathbb{N} : \|B - P_n\|_2^t \leq \epsilon,\ \forall t \geq t_0 \right\} \tag{A13}$$
$$= \min\left\{ t_0 \in \mathbb{N} : \|B - P_n\|_2^{t_0} \leq \epsilon \right\} \tag{A14}$$
$$= \min\left\{ t_0 \in \mathbb{N} : \log\left( \|B - P_n\|_2^{t_0} \right) \leq \log \epsilon \right\}$$
$$= \min\left\{ t_0 \in \mathbb{N} : t_0 \log \|B - P_n\|_2 \leq \log \epsilon \right\} = \min\left\{ t_0 \in \mathbb{N} : t_0 \geq \frac{\log \epsilon}{\log \|B - P_n\|_2} \right\}$$
$$= \min\left\{ t_0 \in \mathbb{N} : t_0 \geq \frac{\log \epsilon^{-1}}{\log \|B - P_n\|_2^{-1}} \right\} = \left\lceil \frac{\log \epsilon^{-1}}{\log \|B - P_n\|_2^{-1}} \right\rceil.$$
To prove (A10), we have used the equivalence 1⇔2. To show (A12) and (A13), we have applied the definition of the spectral norm (see, e.g., [15] (pp. 603, 609)) and Assertion 3 of Lemma A1, respectively. To prove (A14), we have used [2] (Theorem 1) ($\|B - P_n\|_2 < 1$).
As:
$$\min\left\{ t_0 \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x - P_n x\|_2} \leq \epsilon,\ \forall t \geq t_0,\ \forall x \neq P_n x \right\} = \min\left\{ t \in \mathbb{N} : \|B - P_n\|_2^t \leq \epsilon \right\} \tag{A18}$$
$$= \min\left\{ t \in \mathbb{N} : \|B^t - P_n\|_2 \leq \epsilon \right\}, \tag{A19}$$
we only need to show that $T_1 = T_2$ to finish the proof, where:
$$T_1 = \min\left\{ t \in \mathbb{N} : \max_{x \neq 0_{n \times 1}} \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon \right\}$$
and:
$$T_2 = \max_{x \neq 0_{n \times 1}} t_x,$$
with:
$$t_x = \min\left\{ t \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon \right\}.$$
Since:
$$\frac{\|B^{T_1} x - P_n x\|_2}{\|x\|_2} \leq \epsilon \quad \forall x \neq 0_{n \times 1},$$
we have $t_x \leq T_1$ for all $x \neq 0_{n \times 1}$ and, consequently, $T_2 \leq T_1$. If $\left\{ \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \right\}$ is a decreasing sequence for all $x \neq 0_{n \times 1}$, then:
$$\frac{\|B^{T_2} x - P_n x\|_2}{\|x\|_2} \leq \frac{\|B^{t_x} x - P_n x\|_2}{\|x\|_2} \leq \epsilon \quad \forall x \neq 0_{n \times 1},$$
and therefore,
$$\max_{x \neq 0_{n \times 1}} \frac{\|B^{T_2} x - P_n x\|_2}{\|x\|_2} \leq \epsilon$$
and $T_1 \leq T_2$. Thus, if we prove that these sequences are decreasing, the proof is complete. Given $x \neq 0_{n \times 1}$, from Lemma A1 and [2] (Theorem 1), we conclude that:
$$\frac{\|B^{t+1} x - P_n x\|_2}{\|x\|_2} = \frac{\|(B - P_n)^{t+1} x\|_2}{\|x\|_2} \leq \frac{\|B - P_n\|_2 \|(B - P_n)^t x\|_2}{\|x\|_2} \leq \frac{\|(B - P_n)^t x\|_2}{\|x\|_2} = \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \tag{A26}$$
for all $t \in \mathbb{N}$. To prove the two equalities in (A26), we have used Assertion 2 of Lemma A1. To show the first inequality in (A26), we have applied a well-known inequality on the spectral norm (see, e.g., [15] (p. 611)), and to prove the second inequality in (A26), we have used [2] (Theorem 1) ($\|B - P_n\|_2 < 1$).  ☐

Appendix B. Proof of Theorem 1

Let $B(\gamma_1, \ldots, \gamma_n)$ be the $n \times n$ real symmetric matrix given by:
$$\begin{pmatrix}
1 - \gamma_1 - \gamma_n & \gamma_1 & 0 & \cdots & 0 & 0 & \gamma_n \\
\gamma_1 & 1 - \gamma_1 - \gamma_2 & \gamma_2 & \cdots & 0 & 0 & 0 \\
0 & \gamma_2 & 1 - \gamma_2 - \gamma_3 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 - \gamma_{n-3} - \gamma_{n-2} & \gamma_{n-2} & 0 \\
0 & 0 & 0 & \cdots & \gamma_{n-2} & 1 - \gamma_{n-2} - \gamma_{n-1} & \gamma_{n-1} \\
\gamma_n & 0 & 0 & \cdots & 0 & \gamma_{n-1} & 1 - \gamma_{n-1} - \gamma_n
\end{pmatrix}. \tag{A27}$$
Observe that the matrix in (A27) satisfies $B(\gamma_1, \ldots, \gamma_n) P_n = P_n$.
We define the function $f : \mathbb{R}^n \to [0, \infty)$ as $f(\gamma_1, \ldots, \gamma_n) := \|B(\gamma_1, \ldots, \gamma_n) - P_n\|_2$. We next prove that:
$$\|W_n(\gamma_0) - P_n\|_2 \leq \|B(\gamma_1, \ldots, \gamma_n) - P_n\|_2 \quad \forall \gamma_1, \ldots, \gamma_n \in \mathbb{R}.$$
Observe that $W_n(\gamma_0) = B(\gamma_0, \ldots, \gamma_0)$. As $W_n(\gamma)$ is circulant, its eigenvalues are (see, e.g., [16] (Equation (3.7)) or [17] (Equation (5.2))):
$$a_j := 1 + 2\gamma \left( \cos\frac{2\pi(j-1)}{n} - 1 \right), \quad j \in \{1, \ldots, n\}. \tag{A29}$$
Let $V_n = (v_1 | \cdots | v_n)$ be the $n \times n$ Fourier unitary matrix. It is well known (see, e.g., [16] (Equation (3.11)) or [17] (Lemma 5.1)) that $v_j$ is a unit eigenvector of $W_n(\gamma)$ associated with the eigenvalue $a_j$ for all $j \in \{1, \ldots, n\}$. From Lemma A2:
$$\|W_n(\gamma) - P_n\|_2 = \max\{ |a_j| : j \in \{2, \ldots, n\} \}. \tag{A30}$$
Case 1: Assume that $n$ is even. Then,
$$\|W_n(\gamma_0) - P_n\|_2 = a_2 = -a_{\frac{n}{2}+1} = a_n = \frac{1 + \cos\frac{2\pi}{n}}{3 - \cos\frac{2\pi}{n}} \in (0,1). \tag{A31}$$
Therefore, $y_2 = \frac{\sqrt{2}}{2}(v_n + v_2)$ and $y_n = \frac{\sqrt{2}}{2}\sqrt{-1}\,(v_n - v_2)$ are unit eigenvectors of $W_n(\gamma_0)$ associated with $a_2 = a_n$. As:
$$[y_2]_{j,1} = \sqrt{\frac{2}{n}} \cos\frac{2\pi(j-1)}{n},$$
$$[y_n]_{j,1} = \sqrt{\frac{2}{n}} \sin\frac{2\pi(j-1)}{n},$$
$$\left[v_{\frac{n}{2}+1}\right]_{j,1} = \frac{1}{\sqrt{n}} (-1)^{j-1}$$
for all $j \in \{1, \ldots, n\}$, from [4] (Theorem 1), we obtain three subgradients of $f$ at $(\gamma_0, \ldots, \gamma_0) \in \mathbb{R}^n$, namely $g_1$, $g_2$ and $g_3$, given by:
$$[g_1]_{j,1} = -\frac{2}{n} \left( \cos\frac{2\pi(j-1)}{n} - \cos\frac{2\pi j}{n} \right)^2,$$
$$[g_2]_{j,1} = -\frac{2}{n} \left( \sin\frac{2\pi(j-1)}{n} - \sin\frac{2\pi j}{n} \right)^2,$$
$$[g_3]_{j,1} = \frac{4}{n}$$
for all $j \in \{1, \ldots, n\}$. If $\mu = \frac{1}{3 - \cos\frac{2\pi}{n}}$, we have that $\mu g_1 + \mu g_2 + (1 - 2\mu) g_3 = 0_{n \times 1}$, where $0_{n \times 1}$ is the $n \times 1$ zero matrix. The result now follows from [18] (p. 12) and the fact that a convex combination of subgradients of $f$ at $(\gamma_0, \ldots, \gamma_0)$ is also a subgradient of $f$ at $(\gamma_0, \ldots, \gamma_0)$.
Case 2: Assume that $n$ is odd. Then,
$$\|W_n(\gamma_0) - P_n\|_2 = a_2 = -a_{\frac{n+1}{2}} = -a_{\frac{n+3}{2}} = a_n = \frac{\cos\frac{\pi}{n} + \cos\frac{2\pi}{n}}{2 + \cos\frac{\pi}{n} - \cos\frac{2\pi}{n}} \in (0,1). \tag{A38}$$
Therefore, $y_2 = \frac{\sqrt{2}}{2}(v_n + v_2)$ and $y_n = \frac{\sqrt{2}}{2}\sqrt{-1}\,(v_n - v_2)$ are unit eigenvectors of $W_n(\gamma_0)$ associated with $a_2 = a_n$, and $y_{\frac{n+1}{2}} = \frac{\sqrt{2}}{2}\left(v_{\frac{n+1}{2}} + v_{\frac{n+3}{2}}\right)$ and $y_{\frac{n+3}{2}} = \frac{\sqrt{2}}{2}\sqrt{-1}\left(v_{\frac{n+1}{2}} - v_{\frac{n+3}{2}}\right)$ are unit eigenvectors of $W_n(\gamma_0)$ associated with $a_{\frac{n+1}{2}} = a_{\frac{n+3}{2}}$. As:
$$[y_2]_{j,1} = \sqrt{\frac{2}{n}} \cos\frac{2\pi(j-1)}{n},$$
$$[y_n]_{j,1} = \sqrt{\frac{2}{n}} \sin\frac{2\pi(j-1)}{n},$$
$$\left[y_{\frac{n+1}{2}}\right]_{j,1} = \sqrt{\frac{2}{n}} (-1)^{j-1} \cos\frac{\pi(j-1)}{n},$$
$$\left[y_{\frac{n+3}{2}}\right]_{j,1} = \sqrt{\frac{2}{n}} (-1)^{j-1} \sin\frac{\pi(j-1)}{n}$$
for all $j \in \{1, \ldots, n\}$, from [4] (Theorem 1), we obtain four subgradients of $f$ at $(\gamma_0, \ldots, \gamma_0) \in \mathbb{R}^n$, namely $g_1$, $g_2$, $g_3$ and $g_4$, given by:
$$[g_1]_{j,1} = -\frac{2}{n} \left( \cos\frac{2\pi(j-1)}{n} - \cos\frac{2\pi j}{n} \right)^2,$$
$$[g_2]_{j,1} = -\frac{2}{n} \left( \sin\frac{2\pi(j-1)}{n} - \sin\frac{2\pi j}{n} \right)^2,$$
$$[g_3]_{j,1} = \frac{2}{n} \left( \cos\frac{\pi(j-1)}{n} + \cos\frac{\pi j}{n} \right)^2,$$
$$[g_4]_{j,1} = \frac{2}{n} \left( \sin\frac{\pi(j-1)}{n} + \sin\frac{\pi j}{n} \right)^2$$
for all $j \in \{1, \ldots, n\}$. If $\mu = \frac{1}{2} \cdot \frac{1 + \cos\frac{\pi}{n}}{2 - \cos\frac{2\pi}{n} + \cos\frac{\pi}{n}}$, we have that $\mu g_1 + \mu g_2 + \left(\frac{1}{2} - \mu\right) g_3 + \left(\frac{1}{2} - \mu\right) g_4 = 0_{n \times 1}$. The result now follows from [18] (p. 12) and the fact that a convex combination of subgradients of $f$ at $(\gamma_0, \ldots, \gamma_0)$ is also a subgradient of $f$ at $(\gamma_0, \ldots, \gamma_0)$.
Since $\|W_n(\gamma_0) - P_n\|_2 < 1$, applying [2] (Theorem 1) and Theorem A1, Theorem 1 holds.  ☐

Appendix C. Proof of Theorem 2

From [2] (Theorem 1), Theorem A1, (A31) and (A38), we obtain (8).
To finish the proof, we only need to show (9) and (7).
We begin by proving (9). Applying Taylor's theorem (see, e.g., [19] (p. 113)), there exist two bounded functions $f, g : \left(0, \frac{\pi}{2}\right) \to \mathbb{R}$ such that:
$$\log\frac{3 - \cos x}{1 + \cos x} = \frac{x^2}{2} + f(x)\, x^3$$
and:
$$\log\frac{2 + \cos\frac{x}{2} - \cos x}{\cos\frac{x}{2} + \cos x} = \frac{x^2}{2} + g(x)\, x^3$$
for all $x \in \left(0, \frac{\pi}{2}\right)$. Therefore, from (A31) and (A38), we have:
$$\frac{\log \epsilon^{-1}}{\log \|W_n(\gamma_0) - P_n\|_2^{-1}} = \frac{\log \epsilon^{-1}}{\frac{1}{2}\left(\frac{2\pi}{n}\right)^2 + y_n \left(\frac{2\pi}{n}\right)^3},$$
where $\{y_n\}_{n \geq 4}$ is the bounded sequence of real numbers given by:
$$y_n = \begin{cases} f\left(\frac{2\pi}{n}\right) & \text{if } n \text{ is even}, \\ g\left(\frac{2\pi}{n}\right) & \text{if } n \text{ is odd}. \end{cases}$$
Thus,
$$\lim_{n \to \infty} \frac{\log \epsilon^{-1}}{n^2 \log \|W_n(\gamma_0) - P_n\|_2^{-1}} = \lim_{n \to \infty} \frac{\log \epsilon^{-1}}{2\pi^2 + y_n \frac{8\pi^3}{n}} = \frac{\log \epsilon^{-1}}{2\pi^2}.$$
Hence, as $a \leq \lceil a \rceil < a + 1$ for $a \in \mathbb{R}$, applying Theorem A1, we obtain:
$$\frac{\log \epsilon^{-1}}{2\pi^2} = \lim_{n \to \infty} \frac{\log \epsilon^{-1}}{n^2 \log \|W_n(\gamma_0) - P_n\|_2^{-1}} \leq \lim_{n \to \infty} \frac{\tau(\epsilon, W_n(\gamma_0))}{n^2}$$
$$\leq \lim_{n \to \infty} \frac{\log \epsilon^{-1}}{n^2 \log \|W_n(\gamma_0) - P_n\|_2^{-1}} + \lim_{n \to \infty} \frac{1}{n^2} = \frac{\log \epsilon^{-1}}{2\pi^2},$$
and consequently, (9) holds.
Finally, we prove (7). If $\delta \in \left(0, \frac{\log \epsilon^{-1}}{2\pi^2}\right)$, then there exists $n_0 \in \mathbb{N}$, with $n_0 > 3$, such that:
$$\left| \frac{\tau(\epsilon, W_n(\gamma_0))}{n^2} - \frac{\log \epsilon^{-1}}{2\pi^2} \right| < \delta \quad \forall n \geq n_0.$$
Thus, if $n \geq n_0$, then:
$$-\delta < \frac{\tau(\epsilon, W_n(\gamma_0))}{n^2} - \frac{\log \epsilon^{-1}}{2\pi^2} < \delta,$$
or equivalently,
$$\left( \frac{1}{2\pi^2} - \frac{\delta}{\log \epsilon^{-1}} \right) n^2 \log \epsilon^{-1} < \tau(\epsilon, W_n(\gamma_0)) < \left( \frac{1}{2\pi^2} + \frac{\delta}{\log \epsilon^{-1}} \right) n^2 \log \epsilon^{-1}.$$
☐

Appendix D. Proof of Theorem 3

We denote by $\mathcal{W}_{r,c}$ the set of all the $rc \times rc$ real symmetric matrices given by:
$$\mathcal{W}_{r,c} = \left\{ B \in \mathbb{R}^{rc \times rc} : B = B^\top,\ B P_n = P_n,\ [B]_{j,k} = 0 \text{ if } j \neq k \text{ and } [A]_{j,k} = 0 \right\},$$
where $A$ is the adjacency matrix of a grid of $r$ rows and $c$ columns. Consider the bijection $\mathcal{B} : \mathbb{R}^q \to \mathcal{W}_{r,c}$ defined in [4] (Equation (8)), where $q = 4rc - 3c - 3r + 2$ (i.e., $q$ is the number of edges when the network is viewed as an undirected graph).
We define the function $f : \mathbb{R}^q \to [0, \infty)$ as $f(w_1, \ldots, w_q) := \|\mathcal{B}(w_1, \ldots, w_q) - P_n\|_2$. We next prove that:
$$\left\| W_{r,c}\left(\tfrac{1}{2}\right) - P_n \right\|_2 \leq \|\mathcal{B}(w_1, \ldots, w_q) - P_n\|_2 \quad \forall w_1, \ldots, w_q \in \mathbb{R}.$$
Without loss of generality, we can assume that $r \geq c$. We first show that $W_{r,c}\left(\frac{1}{2}\right) \in \mathcal{W}_{r,c}$:
$$\left( W_r\left(\tfrac{1}{2}\right) \otimes W_c\left(\tfrac{1}{2}\right) \right)^\top = W_r\left(\tfrac{1}{2}\right)^\top \otimes W_c\left(\tfrac{1}{2}\right)^\top = W_r\left(\tfrac{1}{2}\right) \otimes W_c\left(\tfrac{1}{2}\right),$$
and:
$$\left( W_r\left(\tfrac{1}{2}\right) \otimes W_c\left(\tfrac{1}{2}\right) \right) P_n = \left( W_r\left(\tfrac{1}{2}\right) \otimes W_c\left(\tfrac{1}{2}\right) \right) (P_r \otimes P_c) = \left( W_r\left(\tfrac{1}{2}\right) P_r \right) \otimes \left( W_c\left(\tfrac{1}{2}\right) P_c \right) = P_r \otimes P_c = P_n.$$
The eigenvalues of $W_n(\alpha)$ are (see, e.g., [20]):
$$a_j := 1 - 2\alpha + 2\alpha \cos\frac{\pi(j-1)}{n}, \quad j \in \{1, \ldots, n\}, \tag{A63}$$
and therefore, the eigenvalues of $W_n\left(\frac{1}{2}\right)$ are given by $a_j(n) := \cos\frac{(j-1)\pi}{n}$ with $j \in \{1, \ldots, n\}$. Their associated orthonormal eigenvectors are given by $[v_1(n)]_{k,1} = \frac{1}{\sqrt{n}}$ and $[v_j(n)]_{k,1} = \sqrt{\frac{2}{n}} \cos\frac{(2k-1)(j-1)\pi}{2n}$ with $j \in \{2, \ldots, n\}$, $k \in \{1, \ldots, n\}$ (see, e.g., [20]). Consequently, the eigenvalues of $W_r\left(\frac{1}{2}\right) \otimes W_c\left(\frac{1}{2}\right)$ are $a_j(r)\, a_k(c)$, and associated orthonormal eigenvectors are $v_j(r) \otimes v_k(c)$ with $j \in \{1, \ldots, r\}$ and $k \in \{1, \ldots, c\}$.
From [4] (Lemma 1),
$$\left\| W_{r,c}\left(\tfrac{1}{2}\right) - P_n \right\|_2 = a_2(r)\, a_1(c) = -a_r(r)\, a_1(c) = \cos\frac{\pi}{r} \in (0,1). \tag{A64}$$
Then, $y_1 = v_2(r) \otimes v_1(c)$ and $y_2 = v_r(r) \otimes v_1(c)$ are unit eigenvectors of $W_{r,c}\left(\frac{1}{2}\right)$ associated with $a_2(r)\, a_1(c)$ and $a_r(r)\, a_1(c)$, respectively, and their entries are given by:
$$[y_1]_{c(j-1)+k,1} = \sqrt{\frac{2}{rc}} \cos\frac{(2j-1)\pi}{2r},$$
$$[y_2]_{c(j-1)+k,1} = (-1)^{j-1} \sqrt{\frac{2}{rc}} \sin\frac{(2j-1)\pi}{2r},$$
for all $j \in \{1, \ldots, r\}$ and $k \in \{1, \ldots, c\}$.
Let $\mathcal{E}$ be the set of edges of the grid. An edge $e = \{j, k\}$ connects the sensors $v_j$ and $v_k$, and we enumerate the edges such that $e_l = \{j_l, k_l\}$ for all $l \in \{1, \ldots, q\}$. We consider that the edges of the grid are sorted as follows: $\mathcal{E}_{\mathrm{H}}$, $\mathcal{E}_{\mathrm{V}}$, $\mathcal{E}_{\mathrm{NW}}$ and $\mathcal{E}_{\mathrm{NE}}$ are the sets of horizontal, vertical, northwest-southeast diagonal and northeast-southwest diagonal edges, respectively. Moreover, if $e_{l_1} = \{j_{l_1}, k_{l_1}\}, e_{l_2} = \{j_{l_2}, k_{l_2}\} \in \mathcal{E}_h$ with $h \in \{\mathrm{H}, \mathrm{V}, \mathrm{NW}, \mathrm{NE}\}$ and $\min\{j_{l_1}, k_{l_1}\} < \min\{j_{l_2}, k_{l_2}\}$, then the edge $e_{l_1}$ precedes the edge $e_{l_2}$ in $\mathcal{E}_h$.
From [4] (Theorem 1), we obtain two subgradients of $f$, $g_1$ and $g_2$, given by:
$$g_1 = \left( g_1^{(\mathrm{H})} \,\middle|\, g_1^{(\mathrm{V})} \,\middle|\, g_1^{(\mathrm{NW})} \,\middle|\, g_1^{(\mathrm{NE})} \right),$$
$$g_2 = \left( g_2^{(\mathrm{H})} \,\middle|\, g_2^{(\mathrm{V})} \,\middle|\, g_2^{(\mathrm{NW})} \,\middle|\, g_2^{(\mathrm{NE})} \right),$$
where $g_1^{(\mathrm{H})} = g_2^{(\mathrm{H})} = 0_{1 \times r(c-1)}$,
$$\left[ g_1^{(\mathrm{V})} \right]_{1, c(j-1)+k} = -\frac{8}{rc} \sin^2\frac{\pi}{2r} \sin^2\frac{j\pi}{r},$$
$$\left[ g_2^{(\mathrm{V})} \right]_{1, c(j-1)+k} = \frac{8}{rc} \cos^2\frac{\pi}{2r} \sin^2\frac{j\pi}{r},$$
for all $j \in \{1, \ldots, r-1\}$, $k \in \{1, \ldots, c\}$,
$$\left[ g_1^{(\mathrm{NW})} \right]_{1, (c-1)(j-1)+k} = -\frac{8}{rc} \sin^2\frac{\pi}{2r} \sin^2\frac{j\pi}{r},$$
$$\left[ g_2^{(\mathrm{NW})} \right]_{1, (c-1)(j-1)+k} = \frac{8}{rc} \cos^2\frac{\pi}{2r} \sin^2\frac{j\pi}{r},$$
for all $j \in \{1, \ldots, r-1\}$, $k \in \{1, \ldots, c-1\}$, and:
$$\left[ g_1^{(\mathrm{NE})} \right]_{1, (c-1)(j-1)+k-1} = -\frac{8}{rc} \sin^2\frac{\pi}{2r} \sin^2\frac{j\pi}{r},$$
$$\left[ g_2^{(\mathrm{NE})} \right]_{1, (c-1)(j-1)+k-1} = \frac{8}{rc} \cos^2\frac{\pi}{2r} \sin^2\frac{j\pi}{r},$$
for all $j \in \{1, \ldots, r-1\}$, $k \in \{2, \ldots, c\}$. If $\mu = \cos^2\frac{\pi}{2r}$, we have that $\mu g_1 + (1 - \mu) g_2 = 0_{(4rc-3c-3r+2) \times 1}$. The result now follows from [18] (p. 12) and the fact that a convex combination of subgradients of $f$ at a certain point is also a subgradient of $f$ at that point.
Since $\left\| W_{r,c}\left(\tfrac{1}{2}\right) - P_n \right\|_2 < 1$, applying [2] (Theorem 1) and Theorem A1, Theorem 3 holds.  ☐

Appendix E. Proof of Theorem 9

We begin by proving (51). The Laplacian matrix of a cycle with $n$ sensors is:
$$L = \operatorname{diag}\left( \operatorname{circ}_n(0, 1, 0, \ldots, 0, 1)\, \mathbf{1}_n \right) - \operatorname{circ}_n(0, 1, 0, \ldots, 0, 1) = \operatorname{diag}(2, 2, \ldots, 2) - \operatorname{circ}_n(0, 1, 0, \ldots, 0, 1) = \operatorname{circ}_n(2, -1, 0, \ldots, 0, -1).$$
From [21] (Equation (3.4a)), the eigenvalues of $L$ are given by $\left\{ 2\left(1 - \cos\frac{2\pi(j-1)}{n}\right) : j \in \{1, \ldots, n\} \right\}$ and, consequently, $\lambda_{n-1}(L) = 2\left(1 - \cos\frac{2\pi}{n}\right)$. From [7] (Corollary 1), we have:
$$\varphi_0 = \frac{n - \lambda_{n-1}(L)}{2n - \lambda_{n-1}(L)} = 1 - \frac{n}{2\left(n + \cos\frac{2\pi}{n} - 1\right)},$$
and therefore, (51) holds. The entries of the expectation of $W_B(0)$ are given by:
$$[\mathrm{E}(W_B(0))]_{j,k} = \begin{cases} \frac{1 - \varphi_0}{n} & \text{if } j - k \in \{-1, 1\}, \\ \frac{1 - \varphi_0}{n} & \text{if } j - k \in \{1-n, n-1\}, \\ \frac{2\varphi_0 + n - 2}{n} & \text{if } j = k, \\ 0 & \text{otherwise}, \end{cases}$$
for all $j, k \in \{1, \ldots, n\}$. Thus, $\mathrm{E}(W_B(0)) = W_n\left(\frac{1 - \varphi_0}{n}\right)$. Therefore, combining (A29) and (A30) yields:
$$\left\| W_n\left(\tfrac{1 - \varphi_0}{n}\right) - P_n \right\|_2 = \frac{n + 2\cos\frac{2\pi}{n} - 2}{n + \cos\frac{2\pi}{n} - 1}. \tag{A79}$$
As:
$$\varphi_0 = \frac{n - \lambda_{n-1}(L)}{2n - \lambda_{n-1}(L)} = 1 - \frac{n}{2n - \lambda_{n-1}(L)},$$
we get:
$$\lambda_{n-1}(L) = n\left( 2 - \frac{1}{1 - \varphi_0} \right) = n\, \frac{1 - 2\varphi_0}{1 - \varphi_0},$$
and consequently,
$$1 - \frac{2\varphi_0 (1 - \varphi_0)}{n} \lambda_{n-1}(L) - \frac{(1 - \varphi_0)^2}{n^2} \lambda_{n-1}(L)^2 = 1 - 2\varphi_0 (1 - 2\varphi_0) - (1 - 2\varphi_0)^2$$
$$= 1 - 2\varphi_0 + 4\varphi_0^2 - 1 + 4\varphi_0 - 4\varphi_0^2 = 2\varphi_0 = 2 - \frac{n}{n + \cos\frac{2\pi}{n} - 1} = \frac{n + 2\cos\frac{2\pi}{n} - 2}{n + \cos\frac{2\pi}{n} - 1}. \tag{A82}$$
Now, applying (A79), (A82) and [7] (Equations (28) and (46)), we obtain (52). The rest of the proof runs as the proof of Theorem 2.

Appendix F. Proof of Theorem 10

We begin by proving (56). The Laplacian matrix of a path with $n$ sensors is given by:
$$[L]_{j,k} = \begin{cases} -1 & \text{if } j - k \in \{-1, 1\}, \\ 2 & \text{if } j = k,\ j \neq 1 \text{ and } j \neq n, \\ 1 & \text{if } j = k,\ j \in \{1, n\}, \\ 0 & \text{otherwise}. \end{cases}$$
From [20], the eigenvalues of $L$ are given by $\left\{ 2\left(1 - \cos\frac{\pi(j-1)}{n}\right) : j \in \{1, \ldots, n\} \right\}$ and, consequently, $\lambda_{n-1}(L) = 2\left(1 - \cos\frac{\pi}{n}\right)$. From [7] (Corollary 1), we have:
$$\varphi_0 = \frac{n - \lambda_{n-1}(L)}{2n - \lambda_{n-1}(L)} = 1 - \frac{n}{2\left(n + \cos\frac{\pi}{n} - 1\right)},$$
and therefore, (56) holds. The entries of the expectation of $W_B(0)$ are given by:
$$[\mathrm{E}(W_B(0))]_{j,k} = \begin{cases} \frac{1 - \varphi_0}{n} & \text{if } j - k \in \{-1, 1\}, \\ \frac{2\varphi_0 + n - 2}{n} & \text{if } j = k,\ j \neq 1 \text{ and } j \neq n, \\ \frac{\varphi_0 + n - 1}{n} & \text{if } j = k,\ j \in \{1, n\}, \\ 0 & \text{otherwise}, \end{cases}$$
for all $j, k \in \{1, \ldots, n\}$. Thus, $\mathrm{E}(W_B(0)) = W_n\left(\frac{1 - \varphi_0}{n}\right)$. Therefore, combining (A63) and [4] (Lemma 1) yields:
$$\left\| W_n\left(\tfrac{1 - \varphi_0}{n}\right) - P_n \right\|_2 = \frac{n + 2\cos\frac{\pi}{n} - 2}{n + \cos\frac{\pi}{n} - 1}. \tag{A87}$$
As:
$$\varphi_0 = \frac{n - \lambda_{n-1}(L)}{2n - \lambda_{n-1}(L)} = 1 - \frac{n}{2n - \lambda_{n-1}(L)},$$
we get:
$$\lambda_{n-1}(L) = n\left( 2 - \frac{1}{1 - \varphi_0} \right) = n\, \frac{1 - 2\varphi_0}{1 - \varphi_0},$$
and consequently,
$$1 - \frac{2\varphi_0 (1 - \varphi_0)}{n} \lambda_{n-1}(L) - \frac{(1 - \varphi_0)^2}{n^2} \lambda_{n-1}(L)^2 = 1 - 2\varphi_0 (1 - 2\varphi_0) - (1 - 2\varphi_0)^2$$
$$= 1 - 2\varphi_0 + 4\varphi_0^2 - 1 + 4\varphi_0 - 4\varphi_0^2 = 2\varphi_0 = 2 - \frac{n}{n + \cos\frac{\pi}{n} - 1} = \frac{n + 2\cos\frac{\pi}{n} - 2}{n + \cos\frac{\pi}{n} - 1}. \tag{A90}$$
Now, applying (A87), (A90) and [7] (Equations (28) and (46)), we obtain (57). The rest of the proof runs as the proof of Theorem 2.

References

  1. Insausti, X.; Camaró, F.; Crespo, P.M.; Beferull-Lozano, B.; Gutiérrez-Gutiérrez, J. Distributed pseudo-gossip algorithm and finite-length computational codes for efficient in-network subspace projection. IEEE J. Sel. Top. Signal Process. 2013, 7, 163–174.
  2. Xiao, L.; Boyd, S. Fast linear iterations for distributed averaging. Syst. Control Lett. 2004, 53, 65–78.
  3. Xiao, L.; Boyd, S.; Kim, S.J. Distributed average consensus with least-mean-square deviation. J. Parallel Distrib. Comput. 2007, 67, 33–46.
  4. Insausti, X.; Gutiérrez-Gutiérrez, J.; Zárraga-Rodríguez, M.; Crespo, P.M. In-network computation of the optimal weighting matrix for distributed consensus on wireless sensor networks. Sensors 2017, 17, 1702.
  5. Olshevsky, A.; Tsitsiklis, J. Convergence speed in distributed consensus and averaging. SIAM Rev. 2011, 53, 747–772.
  6. Boyd, S.; Ghosh, A.; Prabhakar, B.; Shah, D. Randomized gossip algorithms. IEEE Trans. Inf. Theory 2006, 52, 2508–2530.
  7. Aysal, T.C.; Yildiz, M.E.; Sarwate, A.D.; Scaglione, A. Broadcast gossip algorithms for consensus. IEEE Trans. Signal Process. 2009, 57, 2748–2761.
  8. Wu, S.; Rabbat, M.G. Broadcast gossip algorithms for consensus on strongly connected digraphs. IEEE Trans. Signal Process. 2013, 61, 3959–3971.
  9. Dimakis, A.D.G.; Kar, S.; Moura, J.M.F.; Rabbat, M.G.; Scaglione, A. Gossip algorithms for distributed signal processing. Proc. IEEE 2010, 98, 1847–1864.
  10. Olshevsky, A.; Tsitsiklis, J. Convergence speed in distributed consensus and averaging. SIAM J. Control Optim. 2009, 48, 33–55.
  11. Olshevsky, A.; Tsitsiklis, J. A lower bound for distributed averaging algorithms on the line graph. IEEE Trans. Autom. Control 2011, 56, 2694–2698.
  12. Apostol, T.M. Calculus; John Wiley & Sons: Hoboken, NJ, USA, 1967; Volume 1.
  13. Gutiérrez-Gutiérrez, J.; Crespo, P.M. Asymptotically equivalent sequences of matrices and Hermitian block Toeplitz matrices with continuous symbols: Applications to MIMO systems. IEEE Trans. Inf. Theory 2008, 54, 5671–5680.
  14. Gutiérrez-Gutiérrez, J.; Crespo, P.M. Asymptotically equivalent sequences of matrices and multivariate ARMA processes. IEEE Trans. Inf. Theory 2011, 57, 5444–5454.
  15. Bernstein, D.S. Matrix Mathematics; Princeton University Press: Princeton, NJ, USA, 2009.
  16. Gray, R.M. Toeplitz and circulant matrices: A review. Found. Trends Commun. Inf. Theory 2006, 2, 155–239.
  17. Gutiérrez-Gutiérrez, J.; Crespo, P.M. Block Toeplitz matrices: Asymptotic results and applications. Found. Trends Commun. Inf. Theory 2011, 8, 179–257.
  18. Shor, N.Z. Minimization Methods for Non-Differentiable Functions; Springer: Berlin, Germany, 1985.
  19. Apostol, T.M. Mathematical Analysis; Addison-Wesley: Boston, MA, USA, 1974.
  20. Yueh, W.C.; Cheng, S.S. Explicit eigenvalues and inverses of tridiagonal Toeplitz matrices with four perturbed corners. ANZIAM J. 2008, 49, 361–387.
  21. Gray, R.M. On the asymptotic eigenvalue distribution of Toeplitz matrices. IEEE Trans. Inf. Theory 1972, 18, 725–730.
Figure 1. Considered network topologies with 16 sensors.
Figure 2. (a) A cycle with five sensors; (b) a cycle with 10 sensors.
Figure 3. (a) A path with five sensors; (b) a path with 10 sensors.
Figure 4. A cycle: (a) $\epsilon = 10^{-3}$; (b) $\epsilon = 10^{-6}$.
Figure 5. A path: (a) $\epsilon = 10^{-3}$; (b) $\epsilon = 10^{-6}$.
