Sensors 2018, 18(4), 968; doi:10.3390/s18040968

Article
Analysis of Known Linear Distributed Average Consensus Algorithms on Cycles and Paths
Department of Biomedical Engineering and Sciences, Tecnun, University of Navarra, Manuel Lardizábal 13, 20018 San Sebastián, Spain
* Author to whom correspondence should be addressed.
Received: 22 February 2018 / Accepted: 22 March 2018 / Published: 24 March 2018

Abstract

In this paper, we compare six known linear distributed average consensus algorithms on a sensor network in terms of convergence time (and therefore, in terms of the number of transmissions required). The selected network topologies for the analysis (comparison) are the cycle and the path. Specifically, in the present paper, we compute closed-form expressions for the convergence time of four known deterministic algorithms and closed-form bounds for the convergence time of two known randomized algorithms on cycles and paths. Moreover, we also compute a closed-form expression for the convergence time of the fastest deterministic algorithm considered on grids.
Keywords:
average consensus algorithms; distributed computation; sensor networks; convergence time; number of transmissions

1. Introduction

A distributed averaging (or average consensus) algorithm obtains in each sensor the average (arithmetic mean) of the values measured by all the sensors of a sensor network in a distributed way.
The most common distributed averaging algorithms are linear and iterative:
$$x(t+1) = W(t)\,x(t), \qquad t \in \{0, 1, 2, \ldots\}, \tag{1}$$
where:
$$x(t) = \begin{pmatrix} x_1(t) \\ \vdots \\ x_n(t) \end{pmatrix} \tag{2}$$
is a real vector, $n$ is the number of sensors of the network, which we label $v_j$ with $j \in \{1, \ldots, n\}$, $x_j(0)$ is the value measured by the sensor $v_j$, $x_j(t)$ is the value computed by the sensor $v_j$ at time $t \neq 0$ and the weighting matrix $W(t)$ is an $n \times n$ real sparse matrix such that if two sensors $v_j$ and $v_k$ are not connected (i.e., if $v_j$ and $v_k$ cannot interchange information), then $[W(t)]_{j,k} = 0$. From the point of view of communication protocols, there exist efficient ways of implementing synchronous algorithms of the form (1) (see, e.g., [1]). Linear distributed averaging algorithms can be classified as deterministic or randomized depending on the nature of the weighting matrices $W(t)$.
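To make the iteration in (1) concrete, the following minimal Python sketch runs it with a time-invariant weighting matrix. It assumes a ring of five sensors in which every sensor gives weight 1/3 to itself and to each of its two neighbours; the readings in `x0`, the helper names `ring_weights` and `iterate`, and the choice of 200 iterations are illustrative assumptions, not values taken from the paper.

```python
def ring_weights(n, gamma):
    """n x n matrix with 1 - 2*gamma on the diagonal and gamma for the two ring neighbours."""
    W = [[0.0] * n for _ in range(n)]
    for j in range(n):
        W[j][j] = 1.0 - 2.0 * gamma
        W[j][(j + 1) % n] = gamma
        W[j][(j - 1) % n] = gamma
    return W


def iterate(W, x):
    """One step x(t+1) = W x(t)."""
    return [sum(w * xk for w, xk in zip(row, x)) for row in W]


x0 = [3.0, -1.0, 4.0, 1.0, 5.0]   # hypothetical sensor readings
W = ring_weights(len(x0), 1.0 / 3.0)

x = list(x0)
for _ in range(200):               # illustrative number of iterations
    x = iterate(W, x)

average = sum(x0) / len(x0)
print(x, average)
```

Each sensor only combines its own value with those of its neighbours, yet every entry of `x` ends up close to the arithmetic mean of `x0`.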

1.1. Deterministic Linear Distributed Averaging Algorithms

Several well-known deterministic linear distributed averaging algorithms can be found in [2] and [3]. Those algorithms are time-invariant and have symmetric weights; that is, the deterministic weighting matrix $W(t)$ is symmetric and does not depend on $t$ (and consequently, $x(t) = W^t x(0)$).
In [2], the authors search, among all symmetric weighting matrices $W$, for the one that makes (1) as fast as possible, and show that such a matrix can be obtained by numerically solving a convex optimization problem. This algorithm is called the fastest linear time-invariant (LTI) distributed averaging algorithm for symmetric weights. It should be mentioned that in [4], the authors proposed an in-network algorithm for finding such an optimal weighting matrix.
In [2], the authors also give a slower algorithm: the fastest constant edge weights algorithm. In this other algorithm, they consider a particular structure of symmetric weighting matrices that depends on a single parameter and find the value of that parameter that makes (1) the fastest possible.
In [3], another two algorithms can be found: the maximum-degree weights algorithm and the Metropolis–Hastings algorithm.
For other deterministic linear distributed averaging algorithms, we refer the reader to [5] and the references therein.

1.2. Randomized Linear Distributed Averaging Algorithms

For the randomized case, a well-known linear distributed averaging algorithm was given in [6]. That algorithm is called the pairwise gossip algorithm because only two randomly-selected sensors interchange information at each time instant t.
Another well-known randomized algorithm can be found in [7]. That algorithm is called the broadcast gossip algorithm because a single sensor is randomly selected at each time instant t and broadcasts its value to all its neighboring sensors. The broadcast gossip algorithm is a linear distributed consensus algorithm rather than a linear distributed averaging algorithm. However, the broadcast gossip algorithm converges to a random consensus value, which is, in expectation, the average of the values measured by all the sensors of the network. If one uses the directed version of the broadcast gossip algorithm [8] in a symmetric graph, one would converge to the true average.
For other randomized linear distributed averaging algorithms, we refer the reader to [9] and the references therein. The linear distributed averaging algorithms reviewed in Section 1.1 and Section 1.2 are the most cited algorithms in the literature on the topic.

1.3. Our Contribution

A key feature of a distributed averaging algorithm is its convergence time, because it allows one to establish the stopping criterion for the iterative algorithm. The convergence time is defined as the number of iterations $t$ required in (1) until the value computed by the sensors, $x(t)$, is sufficiently close to the steady state (within a threshold $\epsilon$). In the literature, we have not found closed-form expressions for the convergence time of the six linear distributed averaging algorithms mentioned in Section 1.1 and Section 1.2. A mathematical expression is said to be a closed-form expression if it is written in terms of a finite number of elementary functions (i.e., in terms of a finite number of constants, arithmetic operations, roots, exponentials, natural logarithms and trigonometric functions). In the present paper, we compute closed-form expressions for the convergence time of the deterministic algorithms and closed-form upper bounds for the convergence time of the randomized algorithms on two common network topologies: the cycle and the path. Observe that these closed-form formulas give us upper bounds for the convergence time of the considered algorithms (stopping criteria) on any network that contains as a subgraph a cycle or a path with the same number of sensors. Specifically, in this paper, we compute:
  • a closed-form expression for the convergence time of the fastest LTI distributed averaging algorithm for symmetric weights on the considered topologies (see Section 2.1); moreover, we also compute a closed-form expression for the convergence time of this algorithm on a grid;
  • a closed-form expression for the convergence time of the fastest constant edge weights algorithm on the considered topologies (see Section 2.2);
  • a closed-form expression for the convergence time of the maximum-degree weights algorithm on the considered topologies (see Section 2.3);
  • a closed-form expression for the convergence time of the Metropolis–Hastings algorithm on the considered topologies (see Section 2.3);
  • closed-form lower and upper bounds for the convergence time of the pairwise gossip algorithm on the considered topologies (see Section 3.1);
  • closed-form lower and upper bounds for the convergence time of the broadcast gossip algorithm on the considered topologies (see Section 3.2).
From these closed-form formulas, we study the asymptotic behavior of the convergence time of the considered algorithms as the number of sensors of the network grows. The obtained asymptotic and non-asymptotic results allow us to compare the considered algorithms in terms of convergence time and, consequently, in terms of the number of transmissions required, as well (see Section 4 and Section 5). The knowledge of the number of transmissions required lets us know the energy consumption of the distributed technique. The knowledge of the energy consumption is a key factor in the design of a new wireless sensor network (WSN), where one has to decide the number of nodes and the network topology. It should be mentioned that when designing new WSNs, cycles, paths and grids are topologies that are considered frequently.

2. Convergence Time of Deterministic Linear Distributed Averaging Algorithms

Different definitions of convergence time are used in the literature. We have found three different definitions for the convergence time of a deterministic linear distributed averaging algorithm (see [2,10,11]). In this paper, we consider the definition of $\epsilon$-convergence time given in [11]:
$$\tau\big(\epsilon, \{W(t)\}_{t \geq 0}\big) := \min\left\{ t_0 : \frac{\|x(t) - P_n x(0)\|_2}{\|x(0) - P_n x(0)\|_2} \leq \epsilon, \ \ \forall t \geq t_0, \ \ \forall x(0) \neq P_n x(0) \right\}, \tag{3}$$
where $\epsilon \in (0, 1)$, $\|\cdot\|_2$ is the spectral norm and $P_n := \frac{1}{n} 1_n 1_n^{\top}$, with $1_n$ being the $n \times 1$ matrix of ones and $\top$ denoting the transpose. If we replace the spectral norm by the infinity norm in that definition, we obtain the definition of $\epsilon$-convergence time given in [10]. If the deterministic matrix $W(t)$ in (1) does not depend on $t$, we denote the $\epsilon$-convergence time by $\tau(\epsilon, W)$.

2.1. Convergence Time of the Fastest LTI Distributed Averaging Algorithm for Symmetric Weights

In this section, we give a closed-form expression for the $\epsilon$-convergence time of the fastest LTI distributed averaging algorithm for symmetric weights, and we study its asymptotic behavior as the number of sensors of the network grows. We consider three common network topologies: the cycle, the grid and the path (see Figure 1).

2.1.1. The Cycle

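Let:
$$W_n(\gamma) := \begin{pmatrix}
1-2\gamma & \gamma & 0 & \cdots & 0 & 0 & \gamma \\
\gamma & 1-2\gamma & \gamma & \cdots & 0 & 0 & 0 \\
0 & \gamma & 1-2\gamma & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1-2\gamma & \gamma & 0 \\
0 & 0 & 0 & \cdots & \gamma & 1-2\gamma & \gamma \\
\gamma & 0 & 0 & \cdots & 0 & \gamma & 1-2\gamma
\end{pmatrix}. \tag{4}$$
Using (4), Theorem 1 gives the expression of the weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights on a cycle with $n$ sensors.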
Theorem 1.
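Let $n \in \mathbb{N}$, with $n > 3$. Then, $W_n(\gamma_0)$ is the weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights on a cycle with $n$ sensors, where:
$$\gamma_0 = \frac{1}{2 - \cos\frac{2\pi}{n} - \cos\frac{2\pi (j_0 - 1)}{n}}, \tag{5}$$
with:
$$j_0 = \begin{cases} \frac{n}{2} + 1 & \text{if } n \text{ is even}, \\ \frac{n+1}{2} & \text{if } n \text{ is odd}. \end{cases} \tag{6}$$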
Proof. 
See Appendix B.  ☐
We now give a closed-form expression for the $\epsilon$-convergence time of the fastest LTI distributed averaging algorithm for symmetric weights on a cycle. We also study the asymptotic behavior of this convergence time as the number of sensors of the cycle grows.
We first introduce some notation. Two sequences of numbers $\{a_n\}$ and $\{b_n\}$ are said to be asymptotically equal, written $a_n \sim b_n$, if and only if $\lim_{n \to \infty} \frac{a_n}{b_n} = 1$ (see, e.g., [12] (p. 396)).
Let $f, g : \mathbb{N} \to \mathbb{R}$ be two non-negative functions. We write $f(n) = O(g(n))$ (respectively, $f(n) = \Omega(g(n))$) if there exist $K \in (0, \infty)$ and $n_0 \in \mathbb{N}$ such that $f(n) \leq K g(n)$ (respectively, $f(n) \geq K g(n)$) for all $n \geq n_0$. If $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$, then we write $f(n) = \Theta(g(n))$.
Theorem 2.
Consider $\epsilon \in (0, 1)$ and $n \in \mathbb{N}$, with $n > 3$. Let $W_n(\gamma_0)$ be as in Theorem 1. Then,
$$\tau\big(\epsilon, W_n(\gamma_0)\big) = \begin{cases} \left\lceil \dfrac{-\log \epsilon^{-1}}{\log \dfrac{1 + \cos\frac{2\pi}{n}}{3 - \cos\frac{2\pi}{n}}} \right\rceil & \text{if } n \text{ is even}, \\[3ex] \left\lceil \dfrac{-\log \epsilon^{-1}}{\log \dfrac{\cos\frac{\pi}{n} + \cos\frac{2\pi}{n}}{2 + \cos\frac{\pi}{n} - \cos\frac{2\pi}{n}}} \right\rceil & \text{if } n \text{ is odd}, \end{cases} \tag{7}$$
where $\log$ is the natural logarithm and $\lceil x \rceil$ denotes the smallest integer not less than $x$. Moreover,
$$\tau\big(\epsilon, W_n(\gamma_0)\big) \sim \frac{n^2 \log \epsilon^{-1}}{2\pi^2}, \tag{8}$$
and, consequently,
$$\tau\big(\epsilon, W_n(\gamma_0)\big) = \Theta\big(n^2 \log \epsilon^{-1}\big). \tag{9}$$
Proof. 
See Appendix C.  ☐
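The closed-form expressions of Theorems 1 and 2 can be evaluated directly. The sketch below is one way to do so in Python, assuming the minus signs and ceiling brackets of the formulas as stated in Theorem 2; the function names are hypothetical. It also evaluates the asymptotic approximation $n^2 \log \epsilon^{-1} / (2\pi^2)$ so that the two can be compared.

```python
import math


def gamma0_cycle(n):
    """Optimal weight gamma_0 of Theorem 1 for a cycle with n > 3 sensors."""
    j0 = n // 2 + 1 if n % 2 == 0 else (n + 1) // 2
    return 1.0 / (2.0 - math.cos(2 * math.pi / n)
                  - math.cos(2 * math.pi * (j0 - 1) / n))


def rho_fastest_cycle(n):
    """Argument of the logarithm in the denominator of Theorem 2."""
    c2 = math.cos(2 * math.pi / n)
    if n % 2 == 0:
        return (1 + c2) / (3 - c2)
    c1 = math.cos(math.pi / n)
    return (c1 + c2) / (2 + c1 - c2)


def tau_fastest_cycle(eps, n):
    """Closed-form epsilon-convergence time on a cycle (Theorem 2)."""
    return math.ceil(-math.log(1 / eps) / math.log(rho_fastest_cycle(n)))


def tau_fastest_cycle_asymptotic(eps, n):
    """Asymptotic approximation n^2 log(1/eps) / (2 pi^2)."""
    return n ** 2 * math.log(1 / eps) / (2 * math.pi ** 2)


for n in (10, 100, 1000):
    print(n, tau_fastest_cycle(1e-6, n), tau_fastest_cycle_asymptotic(1e-6, n))
```

The quantity returned by `rho_fastest_cycle` coincides with $1 - \gamma_0 (2 - 2\cos(2\pi/n))$, which gives a quick consistency check between the two theorems.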
Since the number of transmissions per iteration on a cycle with $n$ sensors is $n$ for the fastest LTI distributed averaging algorithm for symmetric weights, the total number of transmissions required for $\tau(\epsilon, W_n(\gamma_0))$ iterations is $T(\epsilon, W_n(\gamma_0)) := n\,\tau(\epsilon, W_n(\gamma_0))$. From Theorem 2, we obtain:
$$T\big(\epsilon, W_n(\gamma_0)\big) \sim \frac{n^3 \log \epsilon^{-1}}{2\pi^2}, \tag{10}$$
and hence, $T(\epsilon, W_n(\gamma_0)) = \Theta(n^3 \log \epsilon^{-1})$.

2.1.2. The Grid

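Let:
$$W_n(\alpha) := \begin{pmatrix}
1-\alpha & \alpha & & & \\
\alpha & 1-2\alpha & \alpha & & \\
 & \ddots & \ddots & \ddots & \\
 & & \alpha & 1-2\alpha & \alpha \\
 & & & \alpha & 1-\alpha
\end{pmatrix} \tag{11}$$
be the $n \times n$ matrix for $n \geq 2$, and $W_1(\alpha) := 1$. We define:
$$W_{r,c}(\alpha) := W_r(\alpha) \otimes W_c(\alpha), \tag{12}$$
where $\otimes$ is the Kronecker product. Using (12), Theorem 3 gives the expression of the weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights on a grid of $r$ rows and $c$ columns.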
Theorem 3.
Let $r, c \in \mathbb{N}$, with $rc > 2$. Then, the $rc \times rc$ matrix $W_{r,c}\left(\frac{1}{2}\right)$ is the weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights on a grid of $r$ rows and $c$ columns.
Proof. 
See Appendix D.  ☐
We now give a closed-form expression for the $\epsilon$-convergence time of the fastest LTI distributed averaging algorithm for symmetric weights on a grid of $r$ rows and $c$ columns. We also study the asymptotic behavior of this convergence time as the number of rows of the grid grows.
Theorem 4.
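Consider $\epsilon \in (0, 1)$ and $r, c \in \mathbb{N}$, with $rc > 2$. Without loss of generality, we assume $r \geq c$. Then,
$$\tau\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) = \left\lceil \frac{-\log \epsilon^{-1}}{\log \cos\frac{\pi}{r}} \right\rceil. \tag{13}$$
Moreover,
$$\tau\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) \sim \frac{2 r^2 \log \epsilon^{-1}}{\pi^2} \tag{14}$$
and consequently,
$$\tau\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) = \Theta\big(r^2 \log \epsilon^{-1}\big). \tag{15}$$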
Proof. 
From [2] (Theorem 1), Theorem A1 and (A64), we obtain (13). The rest of the proof runs as the proof of Theorem 2.  ☐
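Since the number of transmissions per iteration on a grid of $r$ rows and $c$ columns is $rc$ for the fastest LTI distributed averaging algorithm for symmetric weights, the total number of transmissions required for $\tau\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right)$ iterations is:
$$T\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) := rc\,\tau\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right). \tag{16}$$
If $r = c = \sqrt{n}$, from Theorem 4, we obtain:
$$T\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) \sim \frac{2 n^2 \log \epsilon^{-1}}{\pi^2}, \tag{17}$$
and hence, $T\left(\epsilon, W_{r,c}\left(\tfrac{1}{2}\right)\right) = \Theta(n^2 \log \epsilon^{-1})$. Observe that from (13), the optimal configuration for a grid with $n$ sensors is obtained when $r = c = \sqrt{n}$.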
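The effect of the grid shape on the number of transmissions can be explored with a short Python sketch of the expressions in Theorem 4. The function names are hypothetical, and the example dimensions (a 10 x 10 grid versus a 100 x 1 path, both with 100 sensors) are chosen only for illustration.

```python
import math


def tau_grid(eps, r, c):
    """Closed-form epsilon-convergence time of Theorem 4 on an r x c grid."""
    longest = max(r, c)  # without loss of generality, r >= c
    return math.ceil(-math.log(1 / eps) / math.log(math.cos(math.pi / longest)))


def transmissions_grid(eps, r, c):
    """Total number of transmissions: r*c transmissions per iteration."""
    return r * c * tau_grid(eps, r, c)


eps = 1e-3
square = transmissions_grid(eps, 10, 10)   # 100 sensors arranged as a square grid
path = transmissions_grid(eps, 100, 1)     # 100 sensors arranged as a path
print(square, path)
```

Because the convergence time depends only on the longer side of the grid, the square arrangement needs far fewer transmissions than the path with the same number of sensors.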

2.1.3. The Path

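Since the path with $n$ sensors can be seen as a grid of $n$ rows and one column, from Theorem 3, we conclude that $W_n\left(\frac{1}{2}\right)$ is the weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights on a path of $n$ sensors, and from Theorem 4, we conclude that:
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right) = \left\lceil \frac{-\log \epsilon^{-1}}{\log \cos\frac{\pi}{n}} \right\rceil. \tag{18}$$
Moreover,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right) \sim \frac{2 n^2 \log \epsilon^{-1}}{\pi^2} \tag{19}$$
and consequently,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right) = \Theta\big(n^2 \log \epsilon^{-1}\big). \tag{20}$$
Finally, from (16), we obtain:
$$T\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right) \sim \frac{2 n^3 \log \epsilon^{-1}}{\pi^2}, \tag{21}$$
and hence, $T\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right) = \Theta(n^3 \log \epsilon^{-1})$.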

2.2. Convergence Time of the Fastest Constant Edge Weights Algorithm

In [2], the authors consider the real symmetric weighting matrices $W_n(\rho)$ given by:
$$[W_n(\rho)]_{j,k} := \begin{cases} \rho & \text{if } j \neq k \text{, and } v_j \text{ and } v_k \text{ are connected}, \\ 1 - d_j \rho & \text{if } j = k, \\ 0 & \text{otherwise}, \end{cases} \tag{22}$$
where $d_j$ denotes the degree of the sensor $v_j$ (i.e., the number of sensors different from $v_j$ connected to $v_j$).
Observe that the weighting matrices of the fastest LTI distributed averaging algorithms for symmetric weights given in Section 2.1 for a cycle and a path, namely $W_n(\gamma_0)$ and $W_n\left(\frac{1}{2}\right)$, can be regarded as $W_n(\rho)$ in (22) taking $\rho = \gamma_0$ and $\rho = \frac{1}{2}$, respectively. Therefore, the closed-form expression for the $\epsilon$-convergence time of the fastest constant edge weights algorithm is given by Theorem 2 on a cycle and by Theorem 4 on a path. That is, the $\epsilon$-convergence time of the fastest constant edge weights algorithm and the $\epsilon$-convergence time of the fastest LTI distributed averaging algorithm for symmetric weights are the same on a cycle and on a path.

2.3. Convergence Time of the Maximum-Degree Weights Algorithm and of the Metropolis–Hastings Algorithm

For the maximum-degree weights algorithm [3], the weighting matrix considered is the real symmetric matrix $W_n(\rho)$ in (22) with:
$$\rho = \frac{1}{1 + \max_{j \in \{1, \ldots, n\}} d_j}. \tag{23}$$
On the other hand, for the Metropolis–Hastings algorithm [3], the entries of the weighting matrix $W_n$ are given by:
$$[W_n]_{j,k} = \begin{cases} \dfrac{[A]_{j,k}}{1 + \max\{d_j, d_k\}} & \text{if } j \neq k, \\[2ex] 1 - \displaystyle\sum_{h \in \{1, \ldots, n\} \setminus \{j\}} [W_n]_{j,h} & \text{if } j = k, \end{cases} \tag{24}$$
where $A$ is the adjacency matrix of the network, that is, $A$ is the $n \times n$ real symmetric matrix given by:
$$[A]_{j,k} = \begin{cases} 1 & \text{if } j \neq k \text{, and } v_j \text{ and } v_k \text{ are connected}, \\ 0 & \text{otherwise}. \end{cases} \tag{25}$$
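The two weighting rules just defined can be built mechanically from an adjacency matrix. The following Python sketch does this for a cycle and a path; the helper names (`cycle_adjacency`, `path_adjacency`, `max_degree_weights`, `metropolis_hastings_weights`) and the network sizes are assumptions made for illustration.

```python
def cycle_adjacency(n):
    """Adjacency matrix of a cycle with n >= 3 sensors."""
    A = [[0] * n for _ in range(n)]
    for j in range(n):
        A[j][(j + 1) % n] = A[(j + 1) % n][j] = 1
    return A


def path_adjacency(n):
    """Adjacency matrix of a path with n >= 2 sensors."""
    A = [[0] * n for _ in range(n)]
    for j in range(n - 1):
        A[j][j + 1] = A[j + 1][j] = 1
    return A


def max_degree_weights(A):
    """Constant edge weight rho = 1 / (1 + max degree); diagonal 1 - d_j * rho."""
    n = len(A)
    d = [sum(row) for row in A]
    rho = 1.0 / (1 + max(d))
    W = [[rho * A[j][k] for k in range(n)] for j in range(n)]
    for j in range(n):
        W[j][j] = 1.0 - d[j] * rho
    return W


def metropolis_hastings_weights(A):
    """Edge weight 1 / (1 + max(d_j, d_k)); diagonal makes each row sum to one."""
    n = len(A)
    d = [sum(row) for row in A]
    W = [[A[j][k] / (1 + max(d[j], d[k])) if j != k else 0.0
          for k in range(n)] for j in range(n)]
    for j in range(n):
        W[j][j] = 1.0 - sum(W[j])
    return W


W_cycle = metropolis_hastings_weights(cycle_adjacency(6))
W_path = metropolis_hastings_weights(path_adjacency(5))
print(W_cycle[0], W_path[0])
```

On both topologies the two rules produce the same matrix, because every edge has at least one endpoint of maximum degree.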

2.3.1. The Cycle

Observe that the weighting matrices of the maximum-degree weights algorithm and the Metropolis–Hastings algorithm for a cycle with $n$ sensors can be regarded as $W_n(\gamma)$ in (4) taking $\gamma = \frac{1}{3}$.
We now give a closed-form expression for the $\epsilon$-convergence time of the maximum-degree weights algorithm and of the Metropolis–Hastings algorithm on a cycle. We also study the asymptotic behavior of this convergence time as the number of sensors of the cycle grows.
Theorem 5.
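Consider $\epsilon \in (0, 1)$ and $n \in \mathbb{N}$, with $n > 3$. Then:
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \left\lceil \frac{-\log \epsilon^{-1}}{\log \frac{1 + 2\cos\frac{2\pi}{n}}{3}} \right\rceil. \tag{26}$$
Moreover,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) \sim \frac{3 n^2 \log \epsilon^{-1}}{4\pi^2}, \tag{27}$$
and therefore,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \Theta\big(n^2 \log \epsilon^{-1}\big). \tag{28}$$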
Proof. 
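Combining (A29) and (A30), we obtain:
$$\left\| W_n\left(\tfrac{1}{3}\right) - P_n \right\|_2 = \frac{1 + 2\cos\frac{2\pi}{n}}{3}, \tag{29}$$
and applying [2] (Theorem 1) and Theorem A1, (26) holds. The rest of the proof runs as the proof of Theorem 2.  ☐
Since the number of transmissions per iteration on a cycle with $n$ sensors is $n$ for both algorithms, the total number of transmissions required for $\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right)$ iterations is $T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) := n\,\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right)$. From Theorem 5, we obtain:
$$T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) \sim \frac{3 n^3 \log \epsilon^{-1}}{4\pi^2}, \tag{30}$$
and thus, $T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \Theta(n^3 \log \epsilon^{-1})$.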
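As a numerical sanity check of Theorem 5, the sketch below compares the closed-form convergence time with a direct simulation of the iteration on a cycle with weights 1/3. It assumes that the slowest-decaying initial vector is the mode $x_j(0) = \cos(2\pi j / n)$, which has zero mean and is scaled by $(1 + 2\cos(2\pi/n))/3$ at every step; the function names and the iteration cap are hypothetical.

```python
import math


def tau_mh_cycle(eps, n):
    """Closed-form epsilon-convergence time of Theorem 5 (weights 1/3 on a cycle)."""
    rho = (1 + 2 * math.cos(2 * math.pi / n)) / 3
    return math.ceil(-math.log(1 / eps) / math.log(rho))


def simulated_time(eps, n, max_iter=100000):
    """First t at which ||x(t)|| / ||x(0)|| <= eps for the slowest zero-mean mode."""
    x = [math.cos(2 * math.pi * j / n) for j in range(n)]  # assumed worst-case x(0)
    norm0 = math.sqrt(sum(v * v for v in x))
    for t in range(max_iter + 1):
        if math.sqrt(sum(v * v for v in x)) / norm0 <= eps:
            return t
        # x(t+1) = W_n(1/3) x(t): average of each sensor and its two ring neighbours
        x = [(x[j - 1] + x[j] + x[(j + 1) % n]) / 3 for j in range(n)]
    raise RuntimeError("no convergence within max_iter iterations")


for n in (5, 8, 12):
    print(n, tau_mh_cycle(1e-3, n), simulated_time(1e-3, n))
```

Since the chosen mode shrinks by the same factor at every step, the first time the ratio drops below $\epsilon$ is also the time after which it stays below $\epsilon$, as the definition of $\epsilon$-convergence time requires.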

2.3.2. The Path

Observe that the weighting matrices of the maximum-degree weights algorithm and of the Metropolis–Hastings algorithm for a path with $n$ sensors can be regarded as $W_n(\alpha)$ in (11) taking $\alpha = \frac{1}{3}$.
We now give a closed-form expression for the $\epsilon$-convergence time of the maximum-degree weights algorithm and of the Metropolis–Hastings algorithm on a path. We also study the asymptotic behavior of this convergence time as the number of sensors of the path grows.
Theorem 6.
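Consider $\epsilon \in (0, 1)$ and $n \in \mathbb{N}$, with $n > 3$. Then:
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \left\lceil \frac{-\log \epsilon^{-1}}{\log \frac{1 + 2\cos\frac{\pi}{n}}{3}} \right\rceil. \tag{31}$$
Moreover,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) \sim \frac{3 n^2 \log \epsilon^{-1}}{\pi^2}, \tag{32}$$
and therefore,
$$\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \Theta\big(n^2 \log \epsilon^{-1}\big). \tag{33}$$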
Proof. 
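Combining (A63) and [4] (Lemma 1), we obtain:
$$\left\| W_n\left(\tfrac{1}{3}\right) - P_n \right\|_2 = \frac{1}{3} + \frac{2}{3}\cos\frac{\pi}{n}, \tag{34}$$
and applying [2] (Theorem 1) and Theorem A1, (31) holds. The rest of the proof runs as the proof of Theorem 2.  ☐
Since the number of transmissions per iteration on a path with $n$ sensors is $n$ for both algorithms, the total number of transmissions required for $\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right)$ iterations is $T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) := n\,\tau\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right)$. From Theorem 6, we obtain:
$$T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) \sim \frac{3 n^3 \log \epsilon^{-1}}{\pi^2}, \tag{35}$$
and thus, $T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right) = \Theta(n^3 \log \epsilon^{-1})$.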

3. Convergence Time of Randomized Linear Distributed Averaging Algorithms

3.1. Lower and Upper Bounds for the Convergence Time of the Pairwise Gossip Algorithm

In the literature, we have found two different definitions for the convergence time of a randomized linear distributed averaging algorithm (see [6,7]). In this subsection, we consider the definition of $\epsilon$-convergence time for a randomized linear distributed averaging algorithm given in [6]:
$$\tau\big(\epsilon, \{W(t)\}_{t \geq 0}\big) := \sup_{x(0) \neq 0_{n \times 1}} \inf\left\{ t : \Pr\left( \frac{\|x(t) - P_n x(0)\|_2}{\|x(0)\|_2} \geq \epsilon \right) \leq \epsilon \right\}, \tag{36}$$
where $\epsilon \in (0, 1)$ and $\Pr$ denotes probability.
We prove in Theorem A1 (Appendix A) that the definitions of $\epsilon$-convergence time in (3) and (36) coincide when applied to deterministic LTI distributed averaging algorithms with symmetric weights (in particular, the four algorithms considered in Section 2). For those algorithms, we also obtain from Theorem A1 that:
$$\tau\left(\tfrac{1}{e}, W\right) = \lceil \tau(W) \rceil, \tag{37}$$
where $\tau(W)$ denotes the definition of convergence time given in [2].
We recall here that in the pairwise gossip algorithm [6], only two sensors interchange information at each time instant $t$. These two sensors $v_{j_t}$ and $v_{k_t}$ are randomly selected at each time instant $t$, and the weighting matrix $W(t)$, which we denote by $W_P(t)$, is the symmetric matrix given by:
$$[W_P(t)]_{j,k} = \begin{cases} \frac{1}{2} & \text{if } j, k \in \{j_t, k_t\}, \\ 1 & \text{if } j = k \notin \{j_t, k_t\}, \\ 0 & \text{otherwise}, \end{cases} \tag{38}$$
for all $j, k \in \{1, \ldots, n\}$.
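Applying the pairwise gossip matrix amounts to replacing the values of the two selected sensors by their mean. The following Python sketch simulates this on a cycle, assuming each of the $n$ edges is selected with probability $1/n$; the function names, the seed, the readings and the number of iterations are illustrative assumptions.

```python
import random


def pairwise_gossip_step(x, j, k):
    """Apply the pairwise gossip matrix: sensors j and k replace their values by their mean."""
    y = list(x)
    y[j] = y[k] = 0.5 * (x[j] + x[k])
    return y


def run_pairwise_gossip_cycle(x0, iterations, seed=0):
    """Run pairwise gossip on a cycle, picking one of the n edges uniformly at each step."""
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)
    for _ in range(iterations):
        j = rng.randrange(n)                  # edge {v_j, v_(j+1)} with probability 1/n
        x = pairwise_gossip_step(x, j, (j + 1) % n)
    return x


x0 = [3.0, -1.0, 4.0, 1.0, 5.0, 9.0]          # hypothetical sensor readings
x = run_pairwise_gossip_cycle(x0, 5000)
print(x, sum(x0) / len(x0))
```

Every step preserves the sum of the values, which is why this algorithm converges to the true average rather than to an arbitrary consensus value.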
In [6], a lower and an upper bound for the $\epsilon$-convergence time of the pairwise gossip algorithm were introduced. We now give a closed-form expression for those bounds on a cycle and on a path, and we study their asymptotic behavior as the number of sensors of the network grows.

3.1.1. The Cycle

Theorem 7.
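Consider $\epsilon \in (0, 1)$ and $n \in \mathbb{N}$, with $n > 3$. Suppose that $W_P(t)$ is the weighting matrix of the pairwise gossip algorithm given in (38) on a cycle with $n$ sensors, where the edge $\{v_{j_t}, v_{k_t}\}$ is randomly selected at each time instant $t \in \mathbb{N} \cup \{0\}$ with probability $\frac{1}{n}$. Then:
$$\frac{1}{2}\, l_P(\epsilon) \leq \tau\big(\epsilon, \{W_P(t)\}_{t \geq 0}\big) \leq 3\, l_P(\epsilon), \tag{39}$$
with:
$$l_P(\epsilon) = \frac{-\log \epsilon^{-1}}{\log\left(1 + \frac{1}{n}\left(\cos\frac{2\pi}{n} - 1\right)\right)}. \tag{40}$$
Moreover,
$$l_P(\epsilon) \sim \frac{n^3 \log \epsilon^{-1}}{2\pi^2} \tag{41}$$
and:
$$\tau\big(\epsilon, \{W_P(t)\}_{t \geq 0}\big) = \Theta\big(n^3 \log \epsilon^{-1}\big) = \Theta\left(\tau\left(\epsilon, W_n\left(\tfrac{1}{2n}\right)\right)\right). \tag{42}$$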
Proof. 
The entries of the expectation of $W_P(0)$ are given by:
$$[E(W_P(0))]_{j,k} = \begin{cases} \frac{1}{2n} & \text{if } j - k \in \{-1, 1\}, \\ \frac{1}{2n} & \text{if } j - k \in \{1 - n, n - 1\}, \\ 1 - \frac{1}{n} & \text{if } j = k, \\ 0 & \text{otherwise}, \end{cases} \tag{43}$$
for all $j, k \in \{1, \ldots, n\}$. Thus, $E(W_P(0)) = W_n\left(\frac{1}{2n}\right)$. Therefore, combining (A29) and [6] (Theorem 3), we obtain (39). The rest of the proof runs as the proof of Theorem 2.  ☐
Since the number of transmissions per iteration on a cycle with $n$ sensors is two for the pairwise gossip algorithm, the total number of transmissions required for $\tau(\epsilon, \{W_P(t)\}_{t \geq 0})$ iterations is $T(\epsilon, \{W_P(t)\}_{t \geq 0}) := 2\,\tau(\epsilon, \{W_P(t)\}_{t \geq 0})$. From Theorem 7, we obtain $l_P(\epsilon) \leq T(\epsilon, \{W_P(t)\}_{t \geq 0}) \leq 6\, l_P(\epsilon)$ and $T(\epsilon, \{W_P(t)\}_{t \geq 0}) = \Theta(n^3 \log \epsilon^{-1})$.

3.1.2. The Path

Theorem 8.
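Consider $\epsilon \in (0, 1)$ and $n \in \mathbb{N}$, with $n > 3$. Suppose that $W_P(t)$ is the weighting matrix of the pairwise gossip algorithm given in (38) on a path with $n$ sensors, where the edge $\{v_{j_t}, v_{k_t}\}$ is randomly selected at each time instant $t \in \mathbb{N} \cup \{0\}$ with probability $\frac{1}{n-1}$. Then:
$$\frac{1}{2}\, l_P(\epsilon) \leq \tau\big(\epsilon, \{W_P(t)\}_{t \geq 0}\big) \leq 3\, l_P(\epsilon), \tag{44}$$
with:
$$l_P(\epsilon) = \frac{-\log \epsilon^{-1}}{\log\left(1 + \frac{1}{n-1}\left(\cos\frac{\pi}{n} - 1\right)\right)}. \tag{45}$$
Moreover,
$$l_P(\epsilon) \sim \frac{2 n^3 \log \epsilon^{-1}}{\pi^2} \tag{46}$$
and:
$$\tau\big(\epsilon, \{W_P(t)\}_{t \geq 0}\big) = \Theta\big(n^3 \log \epsilon^{-1}\big) = \Theta\left(\tau\left(\epsilon, W_n\left(\tfrac{1}{2n-2}\right)\right)\right). \tag{47}$$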
Proof. 
The entries of the expectation of $W_P(0)$ are given by:
$$[E(W_P(0))]_{j,k} = \begin{cases} \frac{1}{2n-2} & \text{if } j - k \in \{-1, 1\}, \\ 1 - \frac{1}{n-1} & \text{if } j = k,\ j \neq 1 \text{ and } j \neq n, \\ 1 - \frac{1}{2n-2} & \text{if } j = k,\ j \in \{1, n\}, \\ 0 & \text{otherwise}, \end{cases} \tag{48}$$
for all $j, k \in \{1, \ldots, n\}$. Thus, $E(W_P(0)) = W_n\left(\frac{1}{2n-2}\right)$. Therefore, combining (A63) and [6] (Theorem 3), we obtain (44). The rest of the proof runs as the proof of Theorem 2.  ☐
Since the number of transmissions per iteration on a path with $n$ sensors is two for the pairwise gossip algorithm, the total number of transmissions required for $\tau(\epsilon, \{W_P(t)\}_{t \geq 0})$ iterations is $T(\epsilon, \{W_P(t)\}_{t \geq 0}) := 2\,\tau(\epsilon, \{W_P(t)\}_{t \geq 0})$. From Theorem 8, we obtain $l_P(\epsilon) \leq T(\epsilon, \{W_P(t)\}_{t \geq 0}) \leq 6\, l_P(\epsilon)$ and $T(\epsilon, \{W_P(t)\}_{t \geq 0}) = \Theta(n^3 \log \epsilon^{-1})$.
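The bounds of Theorems 7 and 8 are straightforward to evaluate. The Python sketch below computes $l_P(\epsilon)$ for both topologies and the resulting intervals for the convergence time and for the number of transmissions; the function names and the string labels `"cycle"` and `"path"` are hypothetical choices.

```python
import math


def l_pairwise(eps, n, topology):
    """l_P(eps) of Theorem 7 (cycle) or Theorem 8 (path)."""
    if topology == "cycle":
        lam = 1 + (math.cos(2 * math.pi / n) - 1) / n
    elif topology == "path":
        lam = 1 + (math.cos(math.pi / n) - 1) / (n - 1)
    else:
        raise ValueError("topology must be 'cycle' or 'path'")
    return -math.log(1 / eps) / math.log(lam)


def pairwise_bounds(eps, n, topology):
    """Lower/upper bounds on the convergence time and on the number of transmissions."""
    l = l_pairwise(eps, n, topology)
    return {"tau": (0.5 * l, 3 * l), "transmissions": (l, 6 * l)}


print(pairwise_bounds(1e-3, 10, "cycle"))
print(pairwise_bounds(1e-3, 10, "path"))
```

For large $n$, the path value of $l_P(\epsilon)$ is close to four times the cycle value, in line with the two asymptotic expressions stated in the theorems.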

3.2. Lower and Upper Bounds for the Convergence Time of the Broadcast Gossip Algorithm

We begin this subsection with the definition of $\epsilon$-convergence time for a randomized linear distributed averaging algorithm given in [7] (Equation (42)):
$$\tau\big(\epsilon, \{W(t)\}_{t \geq 0}\big) := \sup_{x(0) \neq P_n x(0)} \inf\left\{ t : \Pr\left( \frac{\|x(t) - P_n x(t)\|_2}{\|x(0) - P_n x(0)\|_2} \geq \epsilon \right) \leq \epsilon \right\}, \tag{49}$$
where $\epsilon \in (0, 1)$.
It can be proven that the definitions of $\epsilon$-convergence time in (36) and (49) coincide when applied to algorithms in which the matrix $W(t)$ satisfies $W(t) P_n = P_n W(t) = P_n$ for all $t \in \mathbb{N} \cup \{0\}$ (in particular, the pairwise gossip algorithm and deterministic LTI distributed averaging algorithms with symmetric weights).
Observe that (49) is actually a definition for the convergence time of linear distributed consensus algorithms, not only of linear distributed averaging algorithms.
We recall here that in the broadcast gossip algorithm, a single sensor broadcasts at each time instant $t$. This sensor $v_{j_t}$ is randomly selected at each time instant $t$ with probability $\frac{1}{n}$, and the weighting matrix $W(t)$ is given by:
$$[W(t)]_{j,k} = \begin{cases} 1 & \text{if } j = k \text{ and } [A]_{j,j_t} = 0, \\ \varphi & \text{if } j = k \text{ and } [A]_{j,j_t} = 1, \\ 1 - \varphi & \text{if } k = j_t \text{ and } [A]_{j,j_t} = 1, \\ 0 & \text{otherwise}, \end{cases} \tag{50}$$
for all $j, k \in \{1, \ldots, n\}$, where $\varphi \in (0, 1)$ and $A$ is the adjacency matrix of the network. We denote by $W_B(t)$ the weighting matrix in (50) when $\varphi$ is the optimal parameter: $\varphi_0$ (see [7] (Section V)).
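The action of the broadcast gossip matrix can be sketched in Python as follows: the selected sensor keeps its value, each of its neighbours mixes its own value with the broadcast one, and all other sensors are unchanged. The helper names, the mixing parameter 0.5 (not the optimal $\varphi_0$), the seed, the readings and the number of iterations are all illustrative assumptions.

```python
import random


def cycle_neighbours(n):
    """Neighbour lists of a cycle with n >= 3 sensors."""
    return {j: [(j - 1) % n, (j + 1) % n] for j in range(n)}


def broadcast_gossip_step(x, sender, neighbours, phi):
    """Neighbours of the sender move to phi * own value + (1 - phi) * sender value."""
    y = list(x)
    for j in neighbours[sender]:
        y[j] = phi * x[j] + (1 - phi) * x[sender]
    return y


def run_broadcast_gossip(x0, neighbours, phi, iterations, seed=0):
    """Run broadcast gossip, selecting the sender uniformly at each step."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iterations):
        x = broadcast_gossip_step(x, rng.randrange(len(x)), neighbours, phi)
    return x


nb = cycle_neighbours(5)
one_step = broadcast_gossip_step([1.0, 0.0, 0.0, 0.0, 0.0], 0, nb, 0.5)
x0 = [3.0, -1.0, 4.0, 1.0, 5.0]               # hypothetical sensor readings
x = run_broadcast_gossip(x0, nb, 0.5, 3000)
print(one_step, x, sum(x0) / len(x0))
```

The single step shown in `one_step` changes the sum of the values, which illustrates why the sensors agree on a common value that generally differs from the exact average.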
In [7], a lower and an upper bound for the $\epsilon$-convergence time of the broadcast gossip algorithm were introduced. We now give a closed-form expression for $\varphi_0$ and for those bounds on a cycle and on a path. We also study the asymptotic behavior of the bounds as the number of sensors of the network grows.

3.2.1. The Cycle

Theorem 9.
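Consider $\epsilon \in (0, 1)$ and $n \in \mathbb{N}$, with $n > 3$. Suppose that $W_B(t)$ is the weighting matrix in (50) when the network is a cycle with $n$ sensors and $\varphi$ is the optimal parameter: $\varphi_0$. Then:
$$\varphi_0 = 1 - \frac{n}{2\left(n + \cos\frac{2\pi}{n} - 1\right)} \tag{51}$$
and:
$$l_B(\epsilon) \leq \tau\big(\epsilon, \{W_B(t)\}_{t \geq 0}\big) \leq 6\, l_B(\epsilon), \tag{52}$$
with:
$$l_B(\epsilon) = \frac{-\log \epsilon^{-1}}{2 \log \frac{n + 2\cos\frac{2\pi}{n} - 2}{n + \cos\frac{2\pi}{n} - 1}}. \tag{53}$$
Moreover,
$$l_B(\epsilon) \sim \frac{n^3 \log \epsilon^{-1}}{4\pi^2}, \tag{54}$$
and:
$$\tau\big(\epsilon, \{W_B(t)\}_{t \geq 0}\big) = \Theta\big(n^3 \log \epsilon^{-1}\big) = \Theta\left(\tau\left(\epsilon, W_n\left(\tfrac{1 - \varphi_0}{n}\right)\right)\right). \tag{55}$$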
Proof. 
See Appendix E.  ☐
Since the number of transmissions per iteration on a cycle with $n$ sensors is one for the broadcast gossip algorithm, the total number of transmissions required for $\tau(\epsilon, \{W_B(t)\}_{t \geq 0})$ iterations is $T(\epsilon, \{W_B(t)\}_{t \geq 0}) := \tau(\epsilon, \{W_B(t)\}_{t \geq 0})$. From Theorem 9, we obtain $l_B(\epsilon) \leq T(\epsilon, \{W_B(t)\}_{t \geq 0}) \leq 6\, l_B(\epsilon)$ and $T(\epsilon, \{W_B(t)\}_{t \geq 0}) = \Theta(n^3 \log \epsilon^{-1})$.

3.2.2. The Path

Theorem 10.
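Consider $\epsilon \in (0, 1)$ and $n \in \mathbb{N}$, with $n > 3$. Suppose that $W_B(t)$ is the weighting matrix in (50) when the network is a path with $n$ sensors and $\varphi$ is the optimal parameter: $\varphi_0$. Then:
$$\varphi_0 = 1 - \frac{n}{2\left(n + \cos\frac{\pi}{n} - 1\right)} \tag{56}$$
and:
$$l_B(\epsilon) \leq \tau\big(\epsilon, \{W_B(t)\}_{t \geq 0}\big) \leq 6\, l_B(\epsilon), \tag{57}$$
with:
$$l_B(\epsilon) = \frac{-\log \epsilon^{-1}}{2 \log \frac{n + 2\cos\frac{\pi}{n} - 2}{n + \cos\frac{\pi}{n} - 1}}. \tag{58}$$
Moreover,
$$l_B(\epsilon) \sim \frac{n^3 \log \epsilon^{-1}}{\pi^2}, \tag{59}$$
and:
$$\tau\big(\epsilon, \{W_B(t)\}_{t \geq 0}\big) = \Theta\big(n^3 \log \epsilon^{-1}\big) = \Theta\left(\tau\left(\epsilon, W_n\left(\tfrac{1 - \varphi_0}{n}\right)\right)\right). \tag{60}$$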
Proof. 
See Appendix F.  ☐
Since the number of transmissions per iteration on a path with $n$ sensors is one for the broadcast gossip algorithm, the total number of transmissions required for $\tau(\epsilon, \{W_B(t)\}_{t \geq 0})$ iterations is $T(\epsilon, \{W_B(t)\}_{t \geq 0}) := \tau(\epsilon, \{W_B(t)\}_{t \geq 0})$. From Theorem 10, we obtain $l_B(\epsilon) \leq T(\epsilon, \{W_B(t)\}_{t \geq 0}) \leq 6\, l_B(\epsilon)$ and $T(\epsilon, \{W_B(t)\}_{t \geq 0}) = \Theta(n^3 \log \epsilon^{-1})$.
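The optimal parameter $\varphi_0$ and the bound $l_B(\epsilon)$ of Theorems 9 and 10 differ between the two topologies only in the cosine that appears in them. The following Python sketch evaluates both quantities; the function names and the `"cycle"`/`"path"` labels are hypothetical.

```python
import math


def _cosine(n, topology):
    """cos(2*pi/n) for the cycle, cos(pi/n) for the path."""
    if topology == "cycle":
        return math.cos(2 * math.pi / n)
    if topology == "path":
        return math.cos(math.pi / n)
    raise ValueError("topology must be 'cycle' or 'path'")


def phi0_broadcast(n, topology):
    """Optimal mixing parameter phi_0 of Theorem 9 (cycle) or Theorem 10 (path)."""
    c = _cosine(n, topology)
    return 1 - n / (2 * (n + c - 1))


def l_broadcast(eps, n, topology):
    """l_B(eps) of Theorem 9 (cycle) or Theorem 10 (path)."""
    c = _cosine(n, topology)
    ratio = (n + 2 * c - 2) / (n + c - 1)
    return -math.log(1 / eps) / (2 * math.log(ratio))


for topology in ("cycle", "path"):
    l = l_broadcast(1e-3, 10, topology)
    print(topology, phi0_broadcast(10, topology), (l, 6 * l))
```

Since the broadcast gossip algorithm uses one transmission per iteration, the interval `(l, 6 * l)` printed above bounds both the convergence time and the number of transmissions.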

4. Discussion

As in this paper we have used the same definition of convergence time for both deterministic and randomized linear distributed averaging algorithms (namely, the one in (49)), the results given in Section 2 and Section 3 allow us to compare the considered algorithms on a cycle and on a path in terms of convergence time and, consequently, in terms of the number of transmissions required, as well. In particular, these results show the following:
  • The behavior of the considered deterministic linear distributed averaging algorithms is as good as the behavior of the considered randomized ones in terms of the number of transmissions required on a cycle and on a path with $n$ sensors: $\Theta(n^3 \log \epsilon^{-1})$.
  • For a large enough number of sensors and regardless of the considered distributed averaging algorithm, the number of transmissions required on a path is four times larger than the number of transmissions required on a cycle.
Furthermore, regarding the cycle, from (10), (30), (41) and (54), we obtain the following enlightening asymptotic equalities:
$$l_B(\epsilon) \sim \frac{T(\epsilon, W_n(\gamma_0))}{2} \sim \frac{l_P(\epsilon)}{2} \sim \frac{T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right)}{3}, \tag{61}$$
and regarding the path, from (21), (35), (46) and (59), we obtain:
$$l_B(\epsilon) \sim \frac{T\left(\epsilon, W_n\left(\tfrac{1}{2}\right)\right)}{2} \sim \frac{l_P(\epsilon)}{2} \sim \frac{T\left(\epsilon, W_n\left(\tfrac{1}{3}\right)\right)}{3}. \tag{62}$$
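These asymptotic equalities can be checked numerically by evaluating the closed-form expressions of Sections 2 and 3 for a large number of sensors. The Python sketch below does so for an even $n$ (so that the even case of Theorem 2 applies); the function names and the values $n = 2000$ and $\epsilon = 10^{-3}$ are illustrative assumptions.

```python
import math


def _tau(eps, rho):
    """Deterministic convergence time: ceil(-log(1/eps) / log(rho))."""
    return math.ceil(-math.log(1 / eps) / math.log(rho))


def _ell(eps, lam, factor=1):
    """Gossip bound: -log(1/eps) / (factor * log(lam))."""
    return -math.log(1 / eps) / (factor * math.log(lam))


def scaled_counts(eps, n, topology):
    """Return (l_B, T_fastest / 2, l_P / 2, T_MH / 3) for an even n."""
    if topology == "cycle":
        c = math.cos(2 * math.pi / n)
        rho_fast = (1 + c) / (3 - c)      # Theorem 2, even n
        edges = n
    else:                                  # path
        c = math.cos(math.pi / n)
        rho_fast = c                       # weights 1/2 on a path
        edges = n - 1
    rho_mh = (1 + 2 * c) / 3               # Theorems 5 and 6
    l_b = _ell(eps, (n + 2 * c - 2) / (n + c - 1), factor=2)
    l_p = _ell(eps, 1 + (c - 1) / edges)
    t_fast = n * _tau(eps, rho_fast)
    t_mh = n * _tau(eps, rho_mh)
    return l_b, t_fast / 2, l_p / 2, t_mh / 3


cycle = scaled_counts(1e-3, 2000, "cycle")
path = scaled_counts(1e-3, 2000, "path")
print([v / cycle[0] for v in cycle])
print([v / path[0] for v in path])
print(path[0] / cycle[0])
```

All the printed ratios within a topology should be close to one, and the last ratio (path over cycle) should be close to four.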

5. Numerical Examples

For the numerical examples, we first consider a cycle and a path with five and 10 sensors. For each network topology, we present a figure: Figure 2 for the cycle and Figure 3 for the path. Figure 2 (resp. Figure 3) shows the number of transmissions of the fastest LTI distributed averaging algorithm for symmetric weights, $T(\epsilon, W_n(\gamma_0))$ (resp. $T(\epsilon, W_n(1/2))$), and of the Metropolis–Hastings algorithm, $T(\epsilon, W_n(1/3))$, with $\epsilon \in (10^{-15}, 1)$. Each figure also shows the lower bound, $l_P(\epsilon)$, and upper bound, $6\, l_P(\epsilon)$, given for the number of transmissions of the pairwise gossip algorithm, and the lower bound, $l_B(\epsilon)$, and upper bound, $6\, l_B(\epsilon)$, given for the number of transmissions of the broadcast gossip algorithm. Furthermore, each figure shows the average number of transmissions of the pairwise gossip algorithm, $\hat{T}(\epsilon, \{W_P(t)\}_{t \geq 0})$, and of the broadcast gossip algorithm, $\hat{T}(\epsilon, \{W_B(t)\}_{t \geq 0})$, which we have computed by using Monte Carlo simulations. In those simulations, we have performed 1000 repetitions of the corresponding algorithm for each $\epsilon \in (10^{-15}, 1)$, and we have considered that the values measured by the sensors, $x_j(0)$ with $j \in \{1, \ldots, n\}$, are independent identically distributed random variables with unit variance, zero mean and uniform distribution.
In this section, we present another two figures: Figure 4 and Figure 5. Unlike in Figure 2 and Figure 3, in Figure 4 and Figure 5, we have fixed $\epsilon$ instead of the number of sensors $n$ of the network. Specifically, we have chosen $\epsilon = 10^{-3}$ and $\epsilon = 10^{-6}$ with $n \in \{5, \ldots, 30\}$.
In the figures, it can be observed that the Metropolis–Hastings algorithm behaves on average better than the pairwise gossip algorithm in terms of the number of transmissions required on the considered networks. It can also be observed that the broadcast gossip algorithm performs on average approximately as well as the fastest LTI distributed averaging algorithm for symmetric weights in terms of the number of transmissions required on those networks. However, we recall here that the broadcast gossip algorithm converges to a random consensus value instead of to the average consensus value, and it should be executed several times in order to get that average value in every sensor.
The figures also bear evidence of the asymptotic equalities given in (61) and in (62).

6. Conclusions

In this paper, we have studied the convergence time of six known linear distributed averaging algorithms. We have considered both deterministic (the fastest LTI distributed averaging algorithm for symmetric weights, the fastest constant edge weights algorithm, the maximum-degree weights algorithm and the Metropolis–Hastings algorithm) and randomized (the pairwise gossip algorithm and the broadcast gossip algorithm) linear distributed averaging algorithms. In the literature, we have not found closed-form expressions for the convergence time of the considered algorithms. We have computed closed-form expressions for the convergence time of the deterministic algorithms and closed-form upper bounds for the convergence time of the randomized algorithms on two common network topologies: the cycle and the path. Moreover, we have also computed a closed-form expression for the convergence time of the fastest LTI algorithm on a grid. From the computed closed-form formulas, we have studied the asymptotic behavior of the convergence time of the considered algorithms as the number of sensors of the considered networks grows.
Although there exist different definitions of convergence time in the literature, in this paper, we have proven that one of them (namely, the one in (49)) encompasses all the others for the algorithms here considered. As we have used the definition of convergence time in (49) for both deterministic and randomized linear distributed averaging algorithms, the obtained closed-form formulas and asymptotic results allow us to compare the considered algorithms on cycles and paths in terms of convergence time and, consequently, in terms of the number of transmissions required, as well.
We now summarize the most remarkable conclusions:
  • The best algorithm among the considered deterministic distributed averaging algorithms is not worse than the best algorithm among the considered randomized distributed averaging algorithms for cycles and paths.
  • The weighting matrix of the fastest LTI distributed averaging algorithm for symmetric weights and the weighting matrix of the fastest constant edge weights algorithm are the same on cycles and on paths.
  • The number of transmissions required on a path with $n$ sensors is asymptotically four times larger than the number of transmissions required on a cycle with the same number of sensors.
  • The number of transmissions required grows as $n^3$ on cycles and on paths for the six algorithms considered.
  • For the fastest LTI algorithm, the number of transmissions required grows as $n^2$ on a square grid of $n$ sensors (i.e., $r = c = \sqrt{n}$).
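The factor of four between paths and cycles can be checked numerically from the closed forms in the appendices. The following is a minimal sketch, not the authors' code: it assumes that the contraction factor of the fastest LTI algorithm on a path is $\cos(\pi/n)$ (the largest non-unit eigenvalue modulus of the weight-1/2 path matrix quoted in Appendix D), and that both topologies use the same number of transmissions per iteration, so that the ratio of transmissions equals the ratio of convergence times. The function names are hypothetical.

```python
import math


def rho_cycle(n: int) -> float:
    """||W(gamma_0) - P_n||_2 on a cycle: (A31) for even n, (A38) for odd n."""
    c2 = math.cos(2 * math.pi / n)
    if n % 2 == 0:
        return (1 + c2) / (3 - c2)
    c1 = math.cos(math.pi / n)
    return (c2 + c1) / (2 - c2 + c1)


def rho_path(n: int) -> float:
    """Assumed contraction factor cos(pi/n) for a path with all edge weights 1/2."""
    return math.cos(math.pi / n)


def convergence_time(rho: float, eps: float) -> int:
    """Closed form of Theorem A1: ceil(log(1/eps) / (-log(rho)))."""
    return math.ceil(math.log(1.0 / eps) / (-math.log(rho)))


def path_to_cycle_ratio(n: int, eps: float = 1e-3) -> float:
    """Ratio of convergence times (path over cycle) for the same number of sensors."""
    return convergence_time(rho_path(n), eps) / convergence_time(rho_cycle(n), eps)
```

Calling `path_to_cycle_ratio(n)` for increasing `n` should give values that approach 4, in line with the third conclusion above.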
A future research direction of this work would be to generalize the analysis presented in the paper to other network topologies. In particular, networks that can be decomposed into cycles and paths could be studied.

Acknowledgments

This work was supported in part by the Spanish Ministry of Economy and Competitiveness through the RACHEL project (TEC2013-47141-C4-2-R), the CARMEN project (TEC2016-75067-C4-3-R) and the COMONSENS network (TEC2015-69648-REDC).

Author Contributions

Jesús Gutiérrez-Gutiérrez conceived the research question. Jesús Gutiérrez-Gutiérrez, Marta Zárraga-Rodríguez and Xabier Insausti proved the main results. Xabier Insausti performed the simulations. Jesús Gutiérrez-Gutiérrez, Marta Zárraga-Rodríguez and Xabier Insausti wrote the paper. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Comparison of Several Definitions of Convergence Time

We begin by giving a property of the spectral norm. Its proof is implicit in the Appendix of [1].
Lemma A1.
Let $B$ and $P$ be two $n \times n$ real symmetric matrices with $BP = P$ (or equivalently, $PB = P$). Suppose that $P$ is idempotent. Then:
  1. $B^t P = P$ for all $t \in \mathbb{N}$.
  2. $B^t - P = (B - P)^t$ for all $t \in \mathbb{N}$.
  3. $\|B^t - P\|_2 = \|B - P\|_2^t$ for all $t \in \mathbb{N}$.
We recall that an $n \times n$ matrix $A$ is idempotent if and only if $A^2 = A$. An example of an idempotent matrix is $P_n$ with $n \in \mathbb{N}$, since $[P_n^2]_{j,k} = \sum_{h=1}^{n} [P_n]_{j,h} [P_n]_{h,k} = \sum_{h=1}^{n} \frac{1}{n} \frac{1}{n} = \frac{1}{n} = [P_n]_{j,k}$ for all $j, k \in \{1, \ldots, n\}$.
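The three assertions of Lemma A1 are easy to check numerically. The sketch below is only an illustration under stated assumptions: it takes $P = P_n$ (all entries equal to $1/n$) and, as an arbitrary test case, a symmetric circulant matrix $B$ with edge weight 0.3 on a cycle of six sensors; the helper name `check_lemma_a1` is hypothetical.

```python
import numpy as np


def check_lemma_a1(B: np.ndarray, P: np.ndarray, t: int, tol: float = 1e-10):
    """Return three booleans, one per assertion of Lemma A1, for the power t."""
    Bt = np.linalg.matrix_power(B, t)
    D = B - P
    a1 = bool(np.allclose(Bt @ P, P, atol=tol))                              # B^t P = P
    a2 = bool(np.allclose(Bt - P, np.linalg.matrix_power(D, t), atol=tol))   # B^t - P = (B - P)^t
    a3 = bool(np.isclose(np.linalg.norm(Bt - P, 2),
                         np.linalg.norm(D, 2) ** t, atol=tol))               # spectral norms agree
    return a1, a2, a3


n = 6
P = np.full((n, n), 1.0 / n)      # idempotent averaging matrix P_n
B = np.zeros((n, n))              # symmetric, rows sum to one, hence B P = P
for j in range(n):
    B[j, j] = 1 - 2 * 0.3
    B[j, (j + 1) % n] = 0.3
    B[j, (j - 1) % n] = 0.3
```

Evaluating `check_lemma_a1(B, P, t)` for a few values of `t` should return three `True` values each time.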
The following result gives an eigenvalue decomposition for the matrix $P_n$ for all $n \in \mathbb{N}$.
Lemma A2.
If $n \in \mathbb{N}$, then $P_n = V_n \operatorname{diag}(1, 0, \ldots, 0) V_n^*$, where $V_n$ is the $n \times n$ Fourier unitary matrix.
Proof. 
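From [13] (Lemma 2) or [14] (Lemma 3), we obtain that $V_n \operatorname{diag}(1, 0, \ldots, 0) V_n^*$ is a circulant matrix with:
$$\left[ V_n \operatorname{diag}(1, 0, \ldots, 0) V_n^* \right]_{j,1} = \frac{1}{\sqrt{n}} \left[ V_n \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \right]_{j,1} = \frac{1}{\sqrt{n}} \sum_{k=1}^{n} [V_n]_{j,k} \left[ \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \right]_{k,1} = \frac{1}{\sqrt{n}} [V_n]_{j,1} = \frac{1}{n} \tag{A1}$$
for all $j \in \{1, \ldots, n\}$. Therefore, $V_n \operatorname{diag}(1, 0, \ldots, 0) V_n^* = P_n$.  ☐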
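Lemma A2 can also be verified numerically. The sketch below assumes the usual convention $[V_n]_{j,k} = \frac{1}{\sqrt{n}} e^{-2\pi \mathrm{i} (j-1)(k-1)/n}$ for the Fourier unitary matrix (the sign of the exponent is an assumption and does not affect the result); the function names are hypothetical.

```python
import numpy as np


def fourier_unitary(n: int) -> np.ndarray:
    """n x n Fourier unitary matrix with entries exp(-2*pi*i*(j-1)*(k-1)/n) / sqrt(n)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)


def averaging_matrix_from_fourier(n: int) -> np.ndarray:
    """Compute V_n diag(1, 0, ..., 0) V_n^*, which Lemma A2 states equals P_n."""
    V = fourier_unitary(n)
    D = np.zeros((n, n))
    D[0, 0] = 1.0
    return V @ D @ V.conj().T
```

For any `n`, `averaging_matrix_from_fourier(n)` should agree, up to rounding, with the matrix whose entries are all `1/n`.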
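We finish this subsection with a result regarding the $\epsilon$-convergence time.
Theorem A1.
Let $B$ be an $n \times n$ real symmetric matrix with $B \neq P_n$ and $B^t \to P_n$. If $\epsilon \in (0, 1)$, then:
\begin{align}
\min \left\{ t_0 \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x - P_n x\|_2} \leq \epsilon, \ \forall t \geq t_0, \ \forall x \neq P_n x \right\} &= \max_{x \neq 0_{n \times 1}} \min \left\{ t \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon \right\} \tag{A2} \\
&= \left\lceil \frac{\log \epsilon^{-1}}{-\log \|B - P_n\|_2} \right\rceil. \tag{A3}
\end{align}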
Proof. 
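Let $t \in \mathbb{N}$. We first prove that the following statements are equivalent:
  1. $\frac{\|B^t x - P_n x\|_2}{\|x - P_n x\|_2} \leq \epsilon$ for all $x \neq P_n x$.
  2. $\frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon$ for all $x \neq 0_{n \times 1}$.
1⇒2 Fix $x \neq 0_{n \times 1}$. If $x \neq P_n x$, applying Lemma A2 yields:
\begin{align}
\frac{\|B^t x - P_n x\|_2}{\|x\|_2} &\leq \epsilon \frac{\|x - P_n x\|_2}{\|x\|_2} = \epsilon \frac{\|(I_n - P_n) x\|_2}{\|x\|_2} \leq \epsilon \|I_n - P_n\|_2 = \epsilon \left\| V_n V_n^* - V_n \operatorname{diag}(1, 0, \ldots, 0) V_n^* \right\|_2 \tag{A4} \\
&= \epsilon \left\| V_n I_n V_n^* - V_n \operatorname{diag}(1, 0, \ldots, 0) V_n^* \right\|_2 = \epsilon \left\| V_n \operatorname{diag}(0, 1, \ldots, 1) V_n^* \right\|_2 = \epsilon, \tag{A5}
\end{align}
where $I_n$ is the $n \times n$ identity matrix. If $x = P_n x$, from Lemma A2 and [2] (Theorem 1), we obtain:
$$\frac{\|B^t x - P_n x\|_2}{\|x\|_2} = \frac{\|B^t P_n x - P_n x\|_2}{\|x\|_2} = \frac{\|P_n x - P_n x\|_2}{\|x\|_2} = 0 < \epsilon. \tag{A6}$$
2⇒1 If $x \neq P_n x$, then:
\begin{align}
\frac{\|B^t x - P_n x\|_2}{\|x - P_n x\|_2} &= \frac{\|B^t x - P_n x - P_n x + P_n x\|_2}{\|x - P_n x\|_2} = \frac{\|B^t x - B^t P_n x - P_n x + P_n^2 x\|_2}{\|x - P_n x\|_2} \tag{A7} \\
&= \frac{\|B^t (x - P_n x) - P_n (x - P_n x)\|_2}{\|x - P_n x\|_2} \leq \epsilon. \tag{A8}
\end{align}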
Consequently,
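\begin{align}
& \min \left\{ t_0 \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x - P_n x\|_2} \leq \epsilon, \ \forall t \geq t_0, \ \forall x \neq P_n x \right\} \tag{A9} \\
& \quad = \min \left\{ t_0 \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon, \ \forall t \geq t_0, \ \forall x \neq 0_{n \times 1} \right\} \tag{A10} \\
& \quad = \min \left\{ t_0 \in \mathbb{N} : \max_{x \neq 0_{n \times 1}} \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon, \ \forall t \geq t_0 \right\} \tag{A11} \\
& \quad = \min \left\{ t_0 \in \mathbb{N} : \|B^t - P_n\|_2 \leq \epsilon, \ \forall t \geq t_0 \right\} \tag{A12} \\
& \quad = \min \left\{ t_0 \in \mathbb{N} : \|B - P_n\|_2^t \leq \epsilon, \ \forall t \geq t_0 \right\} \tag{A13} \\
& \quad = \min \left\{ t_0 \in \mathbb{N} : \|B - P_n\|_2^{t_0} \leq \epsilon \right\} \tag{A14} \\
& \quad = \min \left\{ t_0 \in \mathbb{N} : \log \left( \|B - P_n\|_2^{t_0} \right) \leq \log \epsilon \right\} \tag{A15} \\
& \quad = \min \left\{ t_0 \in \mathbb{N} : t_0 \log \|B - P_n\|_2 \leq \log \epsilon \right\} = \min \left\{ t_0 \in \mathbb{N} : t_0 \geq \frac{\log \epsilon}{\log \|B - P_n\|_2} \right\} \tag{A16} \\
& \quad = \min \left\{ t_0 \in \mathbb{N} : t_0 \geq \frac{\log \epsilon^{-1}}{-\log \|B - P_n\|_2} \right\} = \left\lceil \frac{\log \epsilon^{-1}}{-\log \|B - P_n\|_2} \right\rceil. \tag{A17}
\end{align}
To prove (A10), we have used the equivalence 1⇔2. To show (A12) and (A13), we have applied the definition of the spectral norm (see, e.g., [15] (pp. 603, 609)) and Assertion 3 of Lemma A1, respectively. To prove (A14), we have used [2] (Theorem 1) ($\|B - P_n\|_2 < 1$).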
As:
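\begin{align}
\min \left\{ t_0 \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x - P_n x\|_2} \leq \epsilon, \ \forall t \geq t_0, \ \forall x \neq P_n x \right\} &= \min \left\{ t \in \mathbb{N} : \|B - P_n\|_2^t \leq \epsilon \right\} \tag{A18} \\
&= \min \left\{ t \in \mathbb{N} : \|B^t - P_n\|_2 \leq \epsilon \right\}, \tag{A19}
\end{align}
we only need to show that $T_1 = T_2$ to finish the proof, where:
$$T_1 = \min \left\{ t \in \mathbb{N} : \max_{x \neq 0_{n \times 1}} \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon \right\} \tag{A20}$$
and:
$$T_2 = \max_{x \neq 0_{n \times 1}} t_x, \tag{A21}$$
with:
$$t_x = \min \left\{ t \in \mathbb{N} : \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \leq \epsilon \right\}. \tag{A22}$$
Since:
$$\frac{\|B^{T_1} x - P_n x\|_2}{\|x\|_2} \leq \epsilon \quad \forall x \neq 0_{n \times 1}, \tag{A23}$$
we have $t_x \leq T_1$ for all $x \neq 0_{n \times 1}$ and, consequently, $T_2 \leq T_1$. If $\frac{\|B^t x - P_n x\|_2}{\|x\|_2}$ is a decreasing sequence in $t$ for all $x \neq 0_{n \times 1}$, then:
$$\frac{\|B^{T_2} x - P_n x\|_2}{\|x\|_2} \leq \frac{\|B^{t_x} x - P_n x\|_2}{\|x\|_2} \leq \epsilon \quad \forall x \neq 0_{n \times 1}, \tag{A24}$$
and therefore,
$$\max_{x \neq 0_{n \times 1}} \frac{\|B^{T_2} x - P_n x\|_2}{\|x\|_2} \leq \epsilon \tag{A25}$$
and $T_1 \leq T_2$. Thus, if we prove that these sequences are decreasing, the proof is complete. Given $x \neq 0_{n \times 1}$, from Lemma A1 and [2] (Theorem 1), we conclude that:
$$\frac{\|B^{t+1} x - P_n x\|_2}{\|x\|_2} = \frac{\|(B - P_n)^{t+1} x\|_2}{\|x\|_2} \leq \frac{\|B - P_n\|_2 \, \|(B - P_n)^t x\|_2}{\|x\|_2} \leq \frac{\|(B - P_n)^t x\|_2}{\|x\|_2} = \frac{\|B^t x - P_n x\|_2}{\|x\|_2} \tag{A26}$$
for all $t \in \mathbb{N}$. To prove the two equalities in (A26), we have used Assertion 2 of Lemma A1. To show the first inequality in (A26), we have applied a well-known inequality on the spectral norm (see, e.g., [15] (p. 611)), and to prove the second inequality in (A26), we have used [2] (Theorem 1) ($\|B - P_n\|_2 < 1$).  ☐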
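As a sanity check of Theorem A1, the closed form in (A3) can be compared with a direct search for the first power $t$ at which $\|B^t - P_n\|_2 \leq \epsilon$. The sketch below is only an illustration: the test matrix (a cycle of eight sensors with constant edge weight 0.3) and all function names are assumptions chosen for the example, not taken from the paper.

```python
import math
import numpy as np


def cycle_weights(n: int, gamma: float) -> np.ndarray:
    """Symmetric circulant weighting matrix of a cycle with constant edge weight gamma."""
    W = np.zeros((n, n))
    for j in range(n):
        W[j, j] = 1 - 2 * gamma
        W[j, (j + 1) % n] = gamma
        W[j, (j - 1) % n] = gamma
    return W


def closed_form_time(B: np.ndarray, eps: float) -> int:
    """Right-hand side of (A3): ceil(log(1/eps) / (-log ||B - P_n||_2))."""
    n = B.shape[0]
    rho = np.linalg.norm(B - np.full((n, n), 1.0 / n), 2)
    return math.ceil(math.log(1.0 / eps) / (-math.log(rho)))


def brute_force_time(B: np.ndarray, eps: float, t_max: int = 10_000) -> int:
    """Smallest t with ||B^t - P_n||_2 <= eps, found by repeated multiplication."""
    n = B.shape[0]
    P = np.full((n, n), 1.0 / n)
    Bt = np.eye(n)
    for t in range(1, t_max + 1):
        Bt = Bt @ B
        if np.linalg.norm(Bt - P, 2) <= eps:
            return t
    raise RuntimeError("no convergence within t_max iterations")
```

Because $\|B^t - P_n\|_2 = \|B - P_n\|_2^t$ is decreasing in $t$ (Lemma A1), the first $t$ found by `brute_force_time` is also the smallest $t_0$ that works for all later $t$, so both functions should return the same value.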

Appendix B. Proof of Theorem 1

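Let $B(\gamma_1, \ldots, \gamma_n)$ be the $n \times n$ real symmetric matrix given by:
$$\begin{pmatrix}
1 - \gamma_1 - \gamma_n & \gamma_1 & 0 & \cdots & 0 & 0 & \gamma_n \\
\gamma_1 & 1 - \gamma_1 - \gamma_2 & \gamma_2 & \cdots & 0 & 0 & 0 \\
0 & \gamma_2 & 1 - \gamma_2 - \gamma_3 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 - \gamma_{n-3} - \gamma_{n-2} & \gamma_{n-2} & 0 \\
0 & 0 & 0 & \cdots & \gamma_{n-2} & 1 - \gamma_{n-2} - \gamma_{n-1} & \gamma_{n-1} \\
\gamma_n & 0 & 0 & \cdots & 0 & \gamma_{n-1} & 1 - \gamma_{n-1} - \gamma_n
\end{pmatrix}. \tag{A27}$$
Observe that the matrix in (A27) satisfies $B(\gamma_1, \ldots, \gamma_n) P_n = P_n$.
We define the function $f : \mathbb{R}^n \to [0, \infty)$ as $f(\gamma_1, \ldots, \gamma_n) := \|B(\gamma_1, \ldots, \gamma_n) - P_n\|_2$. We next prove that:
$$\|W_n(\gamma_0) - P_n\|_2 \leq \|B(\gamma_1, \ldots, \gamma_n) - P_n\|_2 \quad \forall \gamma_1, \ldots, \gamma_n \in \mathbb{R}. \tag{A28}$$
Observe that $W_n(\gamma_0) = B(\gamma_0, \ldots, \gamma_0)$. As $W_n(\gamma)$ is circulant, its eigenvalues are (see, e.g., [16] (Equation (3.7)) or [17] (Equation (5.2))):
$$a_j := 1 + 2\gamma \left( \cos \frac{2\pi (j-1)}{n} - 1 \right), \quad j \in \{1, \ldots, n\}. \tag{A29}$$
Let $V_n = (v_1 | \cdots | v_n)$ be the $n \times n$ Fourier unitary matrix. It is well known (see, e.g., [16] (Equation (3.11)) or [17] (Lemma 5.1)) that $v_j$ is a unit eigenvector of $W(\gamma)$ associated with the eigenvalue $a_j$ for all $j \in \{1, \ldots, n\}$. From Lemma A2:
$$\|W_n(\gamma) - P_n\|_2 = \max \{ |a_j| : j \in \{2, \ldots, n\} \}. \tag{A30}$$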
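Case 1: Assume that $n$ is even. Then,
$$\|W(\gamma_0) - P_n\|_2 = a_2 = -a_{\frac{n}{2}+1} = a_n = \frac{1 + \cos \frac{2\pi}{n}}{3 - \cos \frac{2\pi}{n}} \in (0, 1). \tag{A31}$$
Therefore, $y_2 = \frac{\sqrt{2}}{2} (v_n + v_2)$ and $y_n = \frac{\sqrt{2}}{2\sqrt{-1}} (v_n - v_2)$ are unit eigenvectors of $W(\gamma_0)$ associated with $a_2 = a_n$. As:
$$[y_2]_{j,1} = \sqrt{\frac{2}{n}} \cos \frac{2\pi (j-1)}{n}, \tag{A32}$$
$$[y_n]_{j,1} = \sqrt{\frac{2}{n}} \sin \frac{2\pi (j-1)}{n}, \tag{A33}$$
$$\left[ v_{\frac{n}{2}+1} \right]_{j,1} = \frac{1}{\sqrt{n}} (-1)^{j-1} \tag{A34}$$
for all $j \in \{1, \ldots, n\}$, from [4] (Theorem 1), we obtain three subgradients of $f$ at $(\gamma_0, \ldots, \gamma_0) \in \mathbb{R}^n$, namely $g_1$, $g_2$ and $g_3$, given by:
$$[g_1]_{j,1} = -\frac{2}{n} \left( \cos \frac{2\pi (j-1)}{n} - \cos \frac{2\pi j}{n} \right)^2, \tag{A35}$$
$$[g_2]_{j,1} = -\frac{2}{n} \left( \sin \frac{2\pi (j-1)}{n} - \sin \frac{2\pi j}{n} \right)^2, \tag{A36}$$
$$[g_3]_{j,1} = \frac{4}{n} \tag{A37}$$
for all $j \in \{1, \ldots, n\}$. If $\mu = \frac{1}{3 - \cos \frac{2\pi}{n}}$, we have that $\mu g_1 + \mu g_2 + (1 - 2\mu) g_3 = 0_{n \times 1}$, where $0_{n \times 1}$ is the $n \times 1$ zero matrix. The result now follows from [18] (p. 12) and the fact that a convex combination of subgradients of $f$ at $(\gamma_0, \ldots, \gamma_0)$ is also a subgradient of $f$ at $(\gamma_0, \ldots, \gamma_0)$.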
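Case 2: Assume that $n$ is odd. Then,
$$\|W(\gamma_0) - P_n\|_2 = a_2 = -a_{\frac{n+1}{2}} = -a_{\frac{n+3}{2}} = a_n = \frac{\cos \frac{2\pi}{n} + \cos \frac{\pi}{n}}{2 - \cos \frac{2\pi}{n} + \cos \frac{\pi}{n}} \in (0, 1). \tag{A38}$$
Therefore, $y_2 = \frac{\sqrt{2}}{2} (v_n + v_2)$ and $y_n = \frac{\sqrt{2}}{2\sqrt{-1}} (v_n - v_2)$ are unit eigenvectors of $W(\gamma_0)$ associated with $a_2 = a_n$, and $y_{\frac{n+1}{2}} = \frac{\sqrt{2}}{2} \left( v_{\frac{n+1}{2}} + v_{\frac{n+3}{2}} \right)$ and $y_{\frac{n+3}{2}} = \frac{\sqrt{2}}{2\sqrt{-1}} \left( v_{\frac{n+1}{2}} - v_{\frac{n+3}{2}} \right)$ are unit eigenvectors of $W(\gamma_0)$ associated with $a_{\frac{n+1}{2}} = a_{\frac{n+3}{2}}$. As:
$$[y_2]_{j,1} = \sqrt{\frac{2}{n}} \cos \frac{2\pi (j-1)}{n}, \tag{A39}$$
$$[y_n]_{j,1} = \sqrt{\frac{2}{n}} \sin \frac{2\pi (j-1)}{n}, \tag{A40}$$
$$\left[ y_{\frac{n+1}{2}} \right]_{j,1} = \sqrt{\frac{2}{n}} (-1)^{j-1} \cos \frac{\pi (j-1)}{n}, \tag{A41}$$
$$\left[ y_{\frac{n+3}{2}} \right]_{j,1} = \sqrt{\frac{2}{n}} (-1)^{j-1} \sin \frac{\pi (j-1)}{n} \tag{A42}$$
for all $j \in \{1, \ldots, n\}$, from [4] (Theorem 1), we obtain four subgradients of $f$ at $(\gamma_0, \ldots, \gamma_0) \in \mathbb{R}^n$, namely $g_1$, $g_2$, $g_3$ and $g_4$, given by:
$$[g_1]_{j,1} = -\frac{2}{n} \left( \cos \frac{2\pi (j-1)}{n} - \cos \frac{2\pi j}{n} \right)^2, \tag{A43}$$
$$[g_2]_{j,1} = -\frac{2}{n} \left( \sin \frac{2\pi (j-1)}{n} - \sin \frac{2\pi j}{n} \right)^2, \tag{A44}$$
$$[g_3]_{j,1} = \frac{2}{n} \left( \cos \frac{\pi (j-1)}{n} + \cos \frac{\pi j}{n} \right)^2, \tag{A45}$$
$$[g_4]_{j,1} = \frac{2}{n} \left( \sin \frac{\pi (j-1)}{n} + \sin \frac{\pi j}{n} \right)^2 \tag{A46}$$
for all $j \in \{1, \ldots, n\}$. If $\mu = \frac{1}{2} \cdot \frac{1 + \cos \frac{\pi}{n}}{2 - \cos \frac{2\pi}{n} + \cos \frac{\pi}{n}}$, we have that $\mu g_1 + \mu g_2 + \left( \frac{1}{2} - \mu \right) g_3 + \left( \frac{1}{2} - \mu \right) g_4 = 0_{n \times 1}$. The result now follows from [18] (p. 12) and the fact that a convex combination of subgradients of $f$ at $(\gamma_0, \ldots, \gamma_0)$ is also a subgradient of $f$ at $(\gamma_0, \ldots, \gamma_0)$.
Since $\|W(\gamma_0) - P_n\|_2 < 1$, applying [2] (Theorem 1) and Theorem A1, Theorem 1 holds.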
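The optimality statement (A28) can be probed numerically. The following is a minimal sketch under assumptions: the value of $\gamma_0$ used here, $1/(3 - \cos\frac{2\pi}{n})$ for even $n$ and $1/(2 - \cos\frac{2\pi}{n} + \cos\frac{\pi}{n})$ for odd $n$, is derived by balancing $a_2 = -a_{n/2+1}$ (respectively $a_2 = -a_{(n+1)/2}$) in (A29) and is consistent with (A31) and (A38); random perturbations only give evidence, not a proof. All function names are hypothetical.

```python
import math
import numpy as np


def perturbed_cycle_matrix(gammas) -> np.ndarray:
    """B(gamma_1, ..., gamma_n) of (A27); gamma_j weights the edge between sensors j and j+1 (cyclically)."""
    n = len(gammas)
    B = np.eye(n)
    for j, g in enumerate(gammas):
        k = (j + 1) % n
        B[j, k] += g
        B[k, j] += g
        B[j, j] -= g
        B[k, k] -= g
    return B


def f(gammas) -> float:
    """f(gamma_1, ..., gamma_n) = ||B(gamma_1, ..., gamma_n) - P_n||_2."""
    n = len(gammas)
    return float(np.linalg.norm(perturbed_cycle_matrix(gammas) - np.full((n, n), 1.0 / n), 2))


def gamma0(n: int) -> float:
    """Assumed optimal constant edge weight on a cycle (see the lead-in)."""
    c2 = math.cos(2 * math.pi / n)
    if n % 2 == 0:
        return 1.0 / (3 - c2)
    return 1.0 / (2 - c2 + math.cos(math.pi / n))


def rho_closed_form(n: int) -> float:
    """Right-hand sides of (A31) (n even) and (A38) (n odd)."""
    c2 = math.cos(2 * math.pi / n)
    if n % 2 == 0:
        return (1 + c2) / (3 - c2)
    c1 = math.cos(math.pi / n)
    return (c2 + c1) / (2 - c2 + c1)
```

For a given `n`, `f([gamma0(n)] * n)` should match `rho_closed_form(n)`, and `f` evaluated at randomly perturbed weight vectors should never fall below that value.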

Appendix C. Proof of Theorem 2

From [2] (Theorem 1), Theorem A1, (A31) and (A38), we obtain (8).
To finish the proof, we only need to show (9) and (7).
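We begin by proving (9). Applying Taylor's theorem (see, e.g., [19] (p. 113)), there exist two bounded functions $f, g : \left[ 0, \frac{\pi}{2} \right] \to \mathbb{R}$ such that:
$$-\log \frac{1 + \cos x}{3 - \cos x} = \frac{x^2}{2} + f(x) x^3 \tag{A47}$$
and:
$$-\log \frac{\cos \frac{x}{2} + \cos x}{2 + \cos \frac{x}{2} - \cos x} = \frac{x^2}{2} + g(x) x^3 \tag{A48}$$
for all $x \in \left[ 0, \frac{\pi}{2} \right]$. Therefore, from (A31) and (A38), we have:
$$\frac{\log \epsilon^{-1}}{-\log \|W(\gamma_0) - P_n\|_2} = \frac{\log \epsilon^{-1}}{\frac{\left( \frac{2\pi}{n} \right)^2}{2} + y_n \left( \frac{2\pi}{n} \right)^3}, \tag{A49}$$
where $\{y_n\}_{n \geq 4}$ is the bounded sequence of real numbers given by:
$$y_n = \begin{cases} f\left( \frac{2\pi}{n} \right) & \text{if } n \text{ is even}, \\ g\left( \frac{2\pi}{n} \right) & \text{if } n \text{ is odd}. \end{cases} \tag{A50}$$
Thus,
$$\lim_{n \to \infty} \frac{\log \epsilon^{-1}}{-n^2 \log \|W(\gamma_0) - P_n\|_2} = \lim_{n \to \infty} \frac{\log \epsilon^{-1}}{2\pi^2 + y_n \frac{8\pi^3}{n}} = \frac{\log \epsilon^{-1}}{2\pi^2}. \tag{A51}$$
Hence, as $a \leq \lceil a \rceil < a + 1$, with $a \in \mathbb{R}$, applying Theorem A1, we obtain:
\begin{align}
\frac{\log \epsilon^{-1}}{2\pi^2} &= \lim_{n \to \infty} \frac{\log \epsilon^{-1}}{-n^2 \log \|W(\gamma_0) - P_n\|_2} \leq \lim_{n \to \infty} \frac{\tau(\epsilon, W(\gamma_0))}{n^2} \tag{A52} \\
&\leq \lim_{n \to \infty} \frac{\log \epsilon^{-1}}{-n^2 \log \|W(\gamma_0) - P_n\|_2} + \lim_{n \to \infty} \frac{1}{n^2} = \frac{\log \epsilon^{-1}}{2\pi^2}, \tag{A53}
\end{align}
and consequently, (9) holds.
Finally, we prove (7). If $\delta \in \left( 0, \frac{\log \epsilon^{-1}}{2\pi^2} \right)$, then there exists $n_0 \in \mathbb{N}$, with $n_0 > 3$, such that:
$$\left| \frac{\tau(\epsilon, W(\gamma_0))}{n^2} - \frac{\log \epsilon^{-1}}{2\pi^2} \right| < \delta \quad \forall n \geq n_0. \tag{A54}$$
Thus, if $n \geq n_0$, then:
$$-\delta < \frac{\tau(\epsilon, W(\gamma_0))}{n^2} - \frac{\log \epsilon^{-1}}{2\pi^2} < \delta, \tag{A55}$$
or equivalently,
$$\left( \frac{1}{2\pi^2} - \frac{\delta}{\log \epsilon^{-1}} \right) n^2 \log \epsilon^{-1} < \tau(\epsilon, W(\gamma_0)) < \left( \frac{1}{2\pi^2} + \frac{\delta}{\log \epsilon^{-1}} \right) n^2 \log \epsilon^{-1}. \tag{A56}$$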
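The two-sided bound (A56) can be illustrated numerically. The sketch below assumes that $\tau(\epsilon, W(\gamma_0))$ is given by the closed form of Theorem A1 with the norms in (A31) and (A38); the values of $\epsilon$ and $\delta$ and the function names are arbitrary choices for the example, and the range of $n$ over which the bound is observed to hold is an empirical observation, not the $n_0$ of the proof.

```python
import math


def rho_cycle(n: int) -> float:
    """||W(gamma_0) - P_n||_2 from (A31) (n even) and (A38) (n odd)."""
    c2 = math.cos(2 * math.pi / n)
    if n % 2 == 0:
        return (1 + c2) / (3 - c2)
    c1 = math.cos(math.pi / n)
    return (c2 + c1) / (2 - c2 + c1)


def tau_cycle(n: int, eps: float) -> int:
    """Convergence time on a cycle via Theorem A1."""
    return math.ceil(math.log(1.0 / eps) / (-math.log(rho_cycle(n))))


def within_bounds(n: int, eps: float, delta: float) -> bool:
    """Check the strict inequalities in (A56) for a given n, eps and delta."""
    log_inv_eps = math.log(1.0 / eps)
    centre = 1.0 / (2 * math.pi ** 2)
    lower = (centre - delta / log_inv_eps) * n ** 2 * log_inv_eps
    upper = (centre + delta / log_inv_eps) * n ** 2 * log_inv_eps
    return lower < tau_cycle(n, eps) < upper
```

With, for instance, `eps = 1e-3` and `delta = 0.01`, `within_bounds` should hold for all moderately large `n`, and `tau_cycle(n, eps) / n**2` should approach `math.log(1 / eps) / (2 * math.pi**2)`.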

Appendix D. Proof of Theorem 3

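We denote with $\mathcal{W}_{r,c}$ the set of all the $rc \times rc$ real symmetric matrices such that:
$$\mathcal{W}_{r,c} = \left\{ B \in \mathbb{R}^{rc \times rc} : B = B^\top, \ B P_n = P_n, \ [B]_{j,k} = 0 \text{ if } j \neq k \text{ and } [A]_{j,k} = 0 \right\}, \tag{A57}$$
where $A$ is the adjacency matrix of a grid of $r$ rows and $c$ columns (so that $n = rc$). Consider the bijection $B : \mathbb{R}^q \to \mathcal{W}_{r,c}$ defined in [4] (Equation (8)), where $q = 4rc - 3c - 3r + 2$ (i.e., $q$ is the number of edges when the network is viewed as an undirected graph).
We define the function $f : \mathbb{R}^q \to [0, \infty)$ as $f(w_1, \ldots, w_q) := \|B(w_1, \ldots, w_q) - P_n\|_2$. We next prove that:
$$\left\| W_{r,c}\left( \tfrac{1}{2} \right) - P_n \right\|_2 \leq \|B(w_1, \ldots, w_q) - P_n\|_2 \quad \forall w_1, \ldots, w_q \in \mathbb{R}. \tag{A58}$$
Without loss of generality, we can assume that $r \geq c$. We first show that $W_{r,c}\left( \tfrac{1}{2} \right) \in \mathcal{W}_{r,c}$:
$$\left( W_r\left( \tfrac{1}{2} \right) \otimes W_c\left( \tfrac{1}{2} \right) \right)^\top = W_r\left( \tfrac{1}{2} \right)^\top \otimes W_c\left( \tfrac{1}{2} \right)^\top = W_r\left( \tfrac{1}{2} \right) \otimes W_c\left( \tfrac{1}{2} \right), \tag{A59}$$
and: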
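\begin{align}
\left( W_r\left( \tfrac{1}{2} \right) \otimes W_c\left( \tfrac{1}{2} \right) \right) P_n &= \left( W_r\left( \tfrac{1}{2} \right) \otimes W_c\left( \tfrac{1}{2} \right) \right) \left( P_r \otimes P_c \right) \tag{A60} \\
&= \left( W_r\left( \tfrac{1}{2} \right) P_r \right) \otimes \left( W_c\left( \tfrac{1}{2} \right) P_c \right) \tag{A61} \\
&= P_r \otimes P_c = P_n. \tag{A62}
\end{align}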
The eigenvalues of $W_n(\alpha)$ are (see, e.g., [20]):
$$a_j := 1 - 2\alpha + 2\alpha \cos \frac{\pi (j-1)}{n}, \quad j \in \{1, \ldots, n\}, \tag{A63}$$
and therefore, the eigenvalues of $W_n\left( \tfrac{1}{2} \right)$ are given by $a_j^{(n)} := \cos \frac{(j-1)\pi}{n}$ with $j \in \{1, \ldots, n\}$. Their associated orthonormal eigenvectors are given by $[v_1^{(n)}]_{k,1} = \frac{1}{\sqrt{n}}$ and $[v_j^{(n)}]_{k,1} = \sqrt{\frac{2}{n}} \cos \frac{(2k-1)(j-1)\pi}{2n}$ with $j \in \{2, \ldots, n\}$, $k \in \{1, \ldots, n\}$ (see, e.g., [20]). Consequently, the eigenvalues of $W_r\left( \tfrac{1}{2} \right) \otimes W_c\left( \tfrac{1}{2} \right)$ are $a_j^{(r)} a_k^{(c)}$, and associated orthonormal eigenvectors are $v_j^{(r)} \otimes v_k^{(c)}$ with $j \in \{1, \ldots, r\}$ and $k \in \{1, \ldots, c\}$.
From [4] (Lemma 1),
$$\left\| W_{r,c}\left( \tfrac{1}{2} \right) - P_n \right\|_2 = a_2^{(r)} a_1^{(c)} = -a_r^{(r)} a_1^{(c)} = \cos \frac{\pi}{r} \in (0, 1). \tag{A64}$$
Then, $y_1 = v_2^{(r)} \otimes v_1^{(c)}$ and $y_2 = v_r^{(r)} \otimes v_1^{(c)}$ are unit eigenvectors of $W_{r,c}\left( \tfrac{1}{2} \right)$ associated with $a_2^{(r)} a_1^{(c)}$ and $a_r^{(r)} a_1^{(c)}$, respectively, and their entries are given by:
$$[y_1]_{c(j-1)+k,1} = \sqrt{\frac{2}{rc}} \cos \frac{(2j-1)\pi}{2r}, \tag{A65}$$
$$[y_2]_{c(j-1)+k,1} = (-1)^{j-1} \sqrt{\frac{2}{rc}} \sin \frac{(2j-1)\pi}{2r}, \tag{A66}$$
for all $j \in \{1, \ldots, r\}$ and $k \in \{1, \ldots, c\}$.
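Let $\varepsilon$ be the set of edges of the grid. An edge $e = \{j, k\}$ connects the sensors $v_j$, $v_k$, and we enumerate the edges such that $e_l = \{j_l, k_l\}$ for all $l \in \{1, \ldots, q\}$. We consider that the edges of the grid are sorted as follows: $\varepsilon_{\mathrm{H}}$, $\varepsilon_{\mathrm{V}}$, $\varepsilon_{\mathrm{NW}}$ and $\varepsilon_{\mathrm{NE}}$ are the sets of horizontal, vertical, northwest-southeast diagonal and northeast-southwest diagonal edges, respectively. Moreover, if $e_{l_1} = \{j_{l_1}, k_{l_1}\}$, $e_{l_2} = \{j_{l_2}, k_{l_2}\} \in \varepsilon_h$ with $h \in \{\mathrm{H}, \mathrm{V}, \mathrm{NW}, \mathrm{NE}\}$ and $\min\{j_{l_1}, k_{l_1}\} < \min\{j_{l_2}, k_{l_2}\}$, then the edge $e_{l_1}$ precedes the edge $e_{l_2}$ in $\varepsilon_h$.
From [4] (Theorem 1), we obtain two subgradients of $f$, $g_1$ and $g_2$, given by:
$$g_1 = \left( g_1^{(\mathrm{H})} \,|\, g_1^{(\mathrm{V})} \,|\, g_1^{(\mathrm{NW})} \,|\, g_1^{(\mathrm{NE})} \right)^\top, \tag{A67}$$
$$g_2 = \left( g_2^{(\mathrm{H})} \,|\, g_2^{(\mathrm{V})} \,|\, g_2^{(\mathrm{NW})} \,|\, g_2^{(\mathrm{NE})} \right)^\top, \tag{A68}$$
where $g_1^{(\mathrm{H})} = g_2^{(\mathrm{H})} = 0_{1 \times r(c-1)}$,
$$\left[ g_1^{(\mathrm{V})} \right]_{1, c(j-1)+k} = -\frac{8}{rc} \sin^2 \frac{\pi}{2r} \sin^2 \frac{j\pi}{r}, \tag{A69}$$
$$\left[ g_2^{(\mathrm{V})} \right]_{1, c(j-1)+k} = \frac{8}{rc} \cos^2 \frac{\pi}{2r} \sin^2 \frac{j\pi}{r}, \tag{A70}$$
for all $j \in \{1, \ldots, r-1\}$, $k \in \{1, \ldots, c\}$,
$$\left[ g_1^{(\mathrm{NW})} \right]_{1, (c-1)(j-1)+k} = -\frac{8}{rc} \sin^2 \frac{\pi}{2r} \sin^2 \frac{j\pi}{r}, \tag{A71}$$
$$\left[ g_2^{(\mathrm{NW})} \right]_{1, (c-1)(j-1)+k} = \frac{8}{rc} \cos^2 \frac{\pi}{2r} \sin^2 \frac{j\pi}{r}, \tag{A72}$$
for all $j \in \{1, \ldots, r-1\}$, $k \in \{1, \ldots, c-1\}$, and:
$$\left[ g_1^{(\mathrm{NE})} \right]_{1, (c-1)(j-1)+k-1} = -\frac{8}{rc} \sin^2 \frac{\pi}{2r} \sin^2 \frac{j\pi}{r}, \tag{A73}$$
$$\left[ g_2^{(\mathrm{NE})} \right]_{1, (c-1)(j-1)+k-1} = \frac{8}{rc} \cos^2 \frac{\pi}{2r} \sin^2 \frac{j\pi}{r}, \tag{A74}$$
for all $j \in \{1, \ldots, r-1\}$, $k \in \{2, \ldots, c\}$. If $\mu = \cos^2 \frac{\pi}{2r}$, we have that $\mu g_1 + (1 - \mu) g_2 = 0_{(4rc - 3c - 3r + 2) \times 1}$. The result now follows from [18] (p. 12) and the fact that a convex combination of subgradients of $f$ at a certain point is also a subgradient of $f$ at that point.
Since $\left\| W_{r,c}\left( \tfrac{1}{2} \right) - P_n \right\|_2 < 1$, applying [2] (Theorem 1) and Theorem A1, Theorem 3 holds.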
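The value $\cos\frac{\pi}{r}$ for the grid can be reproduced numerically with a Kronecker product. The sketch below is an illustration under assumptions: it takes $W_n(\alpha)$ to be the symmetric tridiagonal path matrix with off-diagonal entries $\alpha$ and rows summing to one (which has the eigenvalues in (A63)), and $W_{r,c}(\frac{1}{2}) = W_r(\frac{1}{2}) \otimes W_c(\frac{1}{2})$; the function names are hypothetical.

```python
import math
import numpy as np


def path_weights(n: int, alpha: float) -> np.ndarray:
    """Symmetric tridiagonal weighting matrix of a path: alpha on the edges, rows sum to one."""
    W = np.zeros((n, n))
    for j in range(n - 1):
        W[j, j + 1] = alpha
        W[j + 1, j] = alpha
    np.fill_diagonal(W, 1.0 - W.sum(axis=1))
    return W


def grid_weights(r: int, c: int) -> np.ndarray:
    """Kronecker product of two weight-1/2 path matrices (r rows, c columns)."""
    return np.kron(path_weights(r, 0.5), path_weights(c, 0.5))


def grid_contraction(r: int, c: int) -> float:
    """Spectral norm of the grid weighting matrix minus the averaging matrix P_{rc}."""
    n = r * c
    return float(np.linalg.norm(grid_weights(r, c) - np.full((n, n), 1.0 / n), 2))
```

For `r >= c`, `grid_contraction(r, c)` should agree with `math.cos(math.pi / r)` up to rounding, which is the value in (A64).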

Appendix E. Proof of Theorem 9

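We begin by proving (51). The Laplacian matrix of a cycle with $n$ sensors is:
\begin{align}
L &= \operatorname{diag}\left( 1_n^\top \operatorname{circ}_n(0, 1, 0, \ldots, 0, 1) \right) - \operatorname{circ}_n(0, 1, 0, \ldots, 0, 1) \tag{A75} \\
&= \operatorname{diag}(2, 2, \ldots, 2) - \operatorname{circ}_n(0, 1, 0, \ldots, 0, 1) = \operatorname{circ}_n(2, -1, 0, \ldots, 0, -1). \tag{A76}
\end{align}
From [21] (Equation (3.4a)), the eigenvalues of $L$ are given by $\left\{ 2 \left( 1 - \cos \frac{2\pi (j-1)}{n} \right) : j \in \{1, \ldots, n\} \right\}$ and, consequently, $\lambda_{n-1}(L) = 2 \left( 1 - \cos \frac{2\pi}{n} \right)$. From [7] (Corollary 1), we have:
$$\varphi_0 = \frac{n - \lambda_{n-1}(L)}{2n - \lambda_{n-1}(L)} = 1 - \frac{n}{2 \left( n + \cos \frac{2\pi}{n} - 1 \right)}, \tag{A77}$$
and therefore, (51) holds. The entries of the expectation of $W_B(0)$ are given by:
$$[E(W_B(0))]_{j,k} = \begin{cases} \frac{1}{n} (1 - \varphi_0) & \text{if } j - k \in \{-1, 1\}, \\ \frac{1}{n} (1 - \varphi_0) & \text{if } j - k \in \{1 - n, n - 1\}, \\ \frac{1}{n} (2\varphi_0 + n - 2) & \text{if } j = k, \\ 0 & \text{otherwise}, \end{cases} \tag{A78}$$
for all $j, k \in \{1, \ldots, n\}$. Thus, $E(W_B(0)) = W_n\left( \frac{1 - \varphi_0}{n} \right)$. Therefore, combining (A29) and (A30) yields:
$$\left\| W_n\left( \frac{1 - \varphi_0}{n} \right) - P_n \right\|_2 = \frac{n + 2\cos \frac{2\pi}{n} - 2}{n + \cos \frac{2\pi}{n} - 1}. \tag{A79}$$
As:
$$\varphi_0 = \frac{n - \lambda_{n-1}(L)}{2n - \lambda_{n-1}(L)} = 1 - \frac{n}{2n - \lambda_{n-1}(L)}, \tag{A80}$$
we get:
$$\lambda_{n-1}(L) = n \left( 2 - \frac{1}{1 - \varphi_0} \right) = n \, \frac{1 - 2\varphi_0}{1 - \varphi_0}, \tag{A81}$$
and consequently,
\begin{align}
1 - \frac{2\varphi_0 (1 - \varphi_0)}{n} \lambda_{n-1}(L) - \frac{(1 - \varphi_0)^2}{n^2} \lambda_{n-1}(L)^2 &= 1 - 2\varphi_0 (1 - 2\varphi_0) - (1 - 2\varphi_0)^2 \tag{A82} \\
&= 1 - 2\varphi_0 + 4\varphi_0^2 - 1 + 4\varphi_0 - 4\varphi_0^2 = 2\varphi_0 = 2 - \frac{n}{n + \cos \frac{2\pi}{n} - 1} = \frac{n + 2\cos \frac{2\pi}{n} - 2}{n + \cos \frac{2\pi}{n} - 1}. \tag{A83}
\end{align}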
Now, applying (A79), (A82) and [7] (Equations (28) and (46)), we obtain (52). The rest of the proof runs as the proof of Theorem 2.

Appendix F. Proof of Theorem 10

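We begin by proving (56). The Laplacian matrix of a path with $n$ sensors is:
$$[L]_{j,k} = \begin{cases} -1 & \text{if } j - k \in \{-1, 1\}, \\ 2 & \text{if } j = k, \ j \neq 1 \text{ and } j \neq n, \\ 1 & \text{if } j = k, \ j \in \{1, n\}, \\ 0 & \text{otherwise}. \end{cases} \tag{A84}$$
From [20], the eigenvalues of $L$ are given by $\left\{ 2 \left( 1 - \cos \frac{\pi (j-1)}{n} \right) : j \in \{1, \ldots, n\} \right\}$ and, consequently, $\lambda_{n-1}(L) = 2 \left( 1 - \cos \frac{\pi}{n} \right)$. From [7] (Corollary 1), we have:
$$\varphi_0 = \frac{n - \lambda_{n-1}(L)}{2n - \lambda_{n-1}(L)} = 1 - \frac{n}{2 \left( n + \cos \frac{\pi}{n} - 1 \right)}, \tag{A85}$$
and therefore, (56) holds. The entries of the expectation of $W_B(0)$ are given by:
$$[E(W_B(0))]_{j,k} = \begin{cases} \frac{1}{n} (1 - \varphi_0) & \text{if } j - k \in \{-1, 1\}, \\ \frac{1}{n} (2\varphi_0 + n - 2) & \text{if } j = k, \ j \neq 1 \text{ and } j \neq n, \\ \frac{1}{n} (\varphi_0 + n - 1) & \text{if } j = k, \ j \in \{1, n\}, \\ 0 & \text{otherwise}, \end{cases} \tag{A86}$$
for all $j, k \in \{1, \ldots, n\}$. Thus, $E(W_B(0)) = W_n\left( \frac{1 - \varphi_0}{n} \right)$. Therefore, combining (A63) and [4] (Lemma 1) yields:
$$\left\| W_n\left( \frac{1 - \varphi_0}{n} \right) - P_n \right\|_2 = \frac{n + 2\cos \frac{\pi}{n} - 2}{n + \cos \frac{\pi}{n} - 1}. \tag{A87}$$
As:
$$\varphi_0 = \frac{n - \lambda_{n-1}(L)}{2n - \lambda_{n-1}(L)} = 1 - \frac{n}{2n - \lambda_{n-1}(L)}, \tag{A88}$$
we get:
$$\lambda_{n-1}(L) = n \left( 2 - \frac{1}{1 - \varphi_0} \right) = n \, \frac{1 - 2\varphi_0}{1 - \varphi_0}, \tag{A89}$$
and consequently,
\begin{align}
1 - \frac{2\varphi_0 (1 - \varphi_0)}{n} \lambda_{n-1}(L) - \frac{(1 - \varphi_0)^2}{n^2} \lambda_{n-1}(L)^2 &= 1 - 2\varphi_0 (1 - 2\varphi_0) - (1 - 2\varphi_0)^2 \tag{A90} \\
&= 1 - 2\varphi_0 + 4\varphi_0^2 - 1 + 4\varphi_0 - 4\varphi_0^2 = 2\varphi_0 = 2 - \frac{n}{n + \cos \frac{\pi}{n} - 1} = \frac{n + 2\cos \frac{\pi}{n} - 2}{n + \cos \frac{\pi}{n} - 1}. \tag{A91}
\end{align}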
Now, applying (A87), (A90) and [7] (Equations (28) and (46)), we obtain (57). The rest of the proof runs as the proof of Theorem 2.
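The identities used in Appendices E and F can be cross-checked numerically for both topologies. The sketch below is only an illustration under assumptions: it builds the Laplacian directly from the adjacency structure, takes $\lambda_{n-1}(L)$ to be the smallest nonzero Laplacian eigenvalue, and uses the fact that the matrices in (A78) and (A86) can be written as $I_n - \frac{1 - \varphi_0}{n} L$; the function names are hypothetical.

```python
import math
import numpy as np


def laplacian(n: int, topology: str) -> np.ndarray:
    """Graph Laplacian of a path or a cycle with n sensors."""
    A = np.zeros((n, n))
    for j in range(n - 1):
        A[j, j + 1] = 1.0
        A[j + 1, j] = 1.0
    if topology == "cycle":
        A[0, n - 1] = 1.0
        A[n - 1, 0] = 1.0
    return np.diag(A.sum(axis=1)) - A


def broadcast_gossip_check(n: int, topology: str):
    """Return (phi_0, ||E(W_B(0)) - P_n||_2, left-hand side of (A82)/(A90))."""
    L = laplacian(n, topology)
    lam = np.sort(np.linalg.eigvalsh(L))[1]        # smallest nonzero eigenvalue (connected graph)
    phi0 = (n - lam) / (2 * n - lam)               # (A77) / (A85)
    W = np.eye(n) - (1 - phi0) / n * L             # expectation matrix in (A78) / (A86)
    rho = np.linalg.norm(W - np.full((n, n), 1.0 / n), 2)
    x = (1 - phi0) / n * lam
    lhs = 1 - 2 * phi0 * x - x ** 2                # left-hand side of (A82) / (A90)
    return float(phi0), float(rho), float(lhs)
```

For either topology, the last two returned values should both equal `2 * phi0` and match the closed forms (A79) and (A87), with the angle $\frac{2\pi}{n}$ for the cycle and $\frac{\pi}{n}$ for the path.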

References

  1. Insausti, X.; Camaró, F.; Crespo, P.M.; Beferull-Lozano, B.; Gutiérrez-Gutiérrez, J. Distributed pseudo-gossip algorithm and finite-length computational codes for efficient in-network subspace projection. IEEE J. Sel. Top. Signal Process. 2013, 7, 163–174.
  2. Xiao, L.; Boyd, S. Fast linear iterations for distributed averaging. Syst. Control Lett. 2004, 53, 65–78.
  3. Xiao, L.; Boyd, S.; Kim, S.J. Distributed average consensus with least-mean-square deviation. J. Parallel Distrib. Comput. 2007, 67, 33–46.
  4. Insausti, X.; Gutiérrez-Gutiérrez, J.; Zárraga-Rodríguez, M.; Crespo, P.M. In-network computation of the optimal weighting matrix for distributed consensus on wireless sensor networks. Sensors 2017, 17, 1702.
  5. Olshevsky, A.; Tsitsiklis, J. Convergence speed in distributed consensus and averaging. SIAM Rev. 2011, 53, 747–772.
  6. Boyd, S.; Ghosh, A.; Prabhakar, B.; Shah, D. Randomized gossip algorithms. IEEE Trans. Inf. Theory 2006, 52, 2508–2530.
  7. Aysal, T.C.; Yildiz, M.E.; Sarwate, A.D.; Scaglione, A. Broadcast gossip algorithms for consensus. IEEE Trans. Signal Process. 2009, 57, 2748–2761.
  8. Wu, S.; Rabbat, M.G. Broadcast gossip algorithms for consensus on strongly connected digraphs. IEEE Trans. Signal Process. 2013, 61, 3959–3971.
  9. Dimakis, A.D.G.; Kar, S.; Moura, J.M.F.; Rabbat, M.G.; Scaglione, A. Gossip algorithms for distributed signal processing. Proc. IEEE 2010, 98, 1847–1864.
  10. Olshevsky, A.; Tsitsiklis, J. Convergence speed in distributed consensus and averaging. SIAM J. Control Optim. 2009, 48, 33–55.
  11. Olshevsky, A.; Tsitsiklis, J. A lower bound for distributed averaging algorithms on the line graph. IEEE Trans. Autom. Control 2011, 56, 2694–2698.
  12. Apostol, T.M. Calculus; John Wiley & Sons: Hoboken, NJ, USA, 1967; Volume 1.
  13. Gutiérrez-Gutiérrez, J.; Crespo, P.M. Asymptotically equivalent sequences of matrices and Hermitian block Toeplitz matrices with continuous symbols: Applications to MIMO systems. IEEE Trans. Inf. Theory 2008, 54, 5671–5680.
  14. Gutiérrez-Gutiérrez, J.; Crespo, P.M. Asymptotically equivalent sequences of matrices and multivariate ARMA processes. IEEE Trans. Inf. Theory 2011, 57, 5444–5454.
  15. Bernstein, D.S. Matrix Mathematics; Princeton University Press: Princeton, NJ, USA, 2009.
  16. Gray, R.M. Toeplitz and circulant matrices: A review. Found. Trends Commun. Inf. Theory 2006, 2, 155–239.
  17. Gutiérrez-Gutiérrez, J.; Crespo, P.M. Block Toeplitz matrices: Asymptotic results and applications. Found. Trends Commun. Inf. Theory 2011, 8, 179–257.
  18. Shor, N.Z. Minimization Methods for Non-Differentiable Functions; Springer: Berlin, Germany, 1985.
  19. Apostol, T.M. Mathematical Analysis; Addison-Wesley: Boston, MA, USA, 1974.
  20. Yueh, W.C.; Cheng, S.S. Explicit eigenvalues and inverses of tridiagonal Toeplitz matrices with four perturbed corners. ANZIAM J. 2008, 49, 361–387.
  21. Gray, R.M. On the asymptotic eigenvalue distribution of Toeplitz matrices. IEEE Trans. Inf. Theory 1972, 18, 725–730.
Figure 1. Considered network topologies with 16 sensors.
Figure 2. (a) A cycle with five sensors; (b) a cycle with 10 sensors.
Figure 3. (a) A path with five sensors; (b) a path with 10 sensors.
Figure 4. A cycle: (a) $\epsilon = 10^{-3}$; (b) $\epsilon = 10^{-6}$.
Figure 5. A path: (a) $\epsilon = 10^{-3}$; (b) $\epsilon = 10^{-6}$.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Sensors EISSN 1424-8220. Published by MDPI AG, Basel, Switzerland.