The Role of Gossiping in Information Dissemination over a Network of Agents

We consider information dissemination over a network of gossiping agents. In this model, a source keeps the most up-to-date information about a time-varying binary state of the world, and n receiver nodes want to follow the information at the source as accurately as possible. When the information at the source changes, the source first sends updates to a subset of m≤n nodes. Then, the nodes share their local information during the gossiping period, to disseminate the information further. The nodes then estimate the information at the source, using the majority rule at the end of the gossiping period. To analyze the information dissemination, we introduce a new error metric to find the average percentage of nodes that can accurately obtain the most up-to-date information at the source. We characterize the equations necessary to obtain the steady-state distribution for the average error and then analyze the system behavior under both high and low gossip rates. We develop an adaptive policy that the source can use to determine its current transmission capacity m based on its past transmission rates and the accuracy of the information at the nodes. Finally, we implement a clustered gossiping network model, to further improve the information dissemination.


Introduction
Motivated by many applications, such as autonomous vehicular systems, content advertising on social media, and city emergency-warning systems, information dissemination over networks has gained significant attention. For instance, in the case of autonomous vehicular systems or city emergency-warning systems, timely critical information, such as accident alerts or tornado warnings, needs to be disseminated as quickly and as accurately as possible. As another example, companies often want to let their potential customers know about their latest products through advertisements over social media. In both of these examples, there is a single information source from which the most up-to-date information is disseminated to multiple receivers over time.
In this paper, we consider a communication system with a source and n receiver nodes. The source keeps the most recent information about the state of the world, which takes binary values 0 or 1 and changes after exponentially distributed time intervals. Upon each information update, the source wants to let the receiver nodes know about the most recent information. As the source has limited transmission capacity, it cannot send information to more than m ≤ n nodes, and each information transmission at the source takes an exponentially distributed length of time. After sending updates to m nodes, in order to further disseminate information, local information is shared between each pair of receiver nodes, a process we shall refer to as gossiping. The gossiping period continues until the information at the source is updated again. At the end of each gossiping period, each receiver node that did not get the most recent information directly from the source comes up with an estimate based on the majority of the information it received from the other nodes. In order to measure the accuracy of the information dissemination at the end of each update cycle, we consider an error metric that takes value 1 for a receiver node whose estimate differs from the information at the source, and 0 otherwise.

Related Work
In the gossip-network literature, a model where only one node tries to spread its information to the entire network was considered in [1] and named single-piece dissemination. Multi-piece spreading, where all nodes try to spread their individual information to the remaining nodes, was studied in [2]. Moreover, the problem of finding the average of all nodes' initial information in a gossip network was studied under the framework of distributed averaging in [3,4]. The main goal of these works was to analytically characterize either the information spreading time [1,2] or the averaging time [3,4] in the entire network. In all these settings, the information was considered to be static, i.e., it did not change over time.
Reference [5] considered the problem of gossiping dynamic information. As the gossip network may consist of asynchronous agents with no central clock, timestamping is a commonly used technique for maintaining the information flow in the gossip network: the agents keep the generation time of their local status updates [6].
During the gossiping phase, information updates among the agents are determined by whoever has the largest timestamp for a particular piece of information, which indicates the information freshness at the local agents.
In another line of research, to measure information freshness, the age of information was defined as the difference between the current time and the timestamp of the last status update received by the agents. For a more detailed review of the age of information, we refer to [7,8]. Recently, scaling of the age of information was considered in gossip networks [9-11]. In [9], the stochastic hybrid system (SHS) method was used to characterize the version age in arbitrarily connected gossip networks, and scaling of the version age was studied in the symmetric ring and fully connected networks. By using the idea of clustering, scaling of the age of information was further improved in [10]. Then, scaling of the binary freshness metric [12-14], which takes the value 1 when the information is fresh and 0 otherwise, was studied in gossip networks in [11].
In all these aforementioned works, the timestamp of the information plays a critical role in determining information dissemination in gossip networks. As new versions of the information are generated and the timestamp increases, either its size grows without bound, in which case the agents spend most of their capacity exchanging and comparing large timestamp values to determine the freshest information [5], or, in the case of a bounded timestamp, the order of information freshness can be lost when overflow happens [6]. In certain applications, an external adversary may interrupt the information flow and alter the timestamp of the information, such that older versions of the information may be re-branded as fresh information [15]. Recently, the effect of timestomping on age scaling was explored in [16].
Unlike the earlier works on gossip networks, as in [1,2], we considered in this paper a time-varying information source and, instead of tracking the information-spreading time, we studied the average percentage of the nodes that had access to the most recent information at the source before it was updated. Compared to the dynamic information dissemination in [6], in our work we did not use the timestamps of the information. Instead, to maintain the information flow, we used instantaneous signaling from the source to the nodes, to synchronize the nodes. We implemented an information updating mechanism consisting of two phases: in the first phase, only the source could send updates to m nodes; in the second phase, i.e., in the gossiping phase, only the nodes could share their local information. Thus, in the gossiping phase, incorrect information in the network could also spread. Works [9-11] considered the age of information in gossip networks, where each information update at the source was treated as a new update. In our work, on the other hand, we considered a binary dynamic information source. Thus, the content of the information affected our error metric. Furthermore, in [9-11], the nodes updated their information only if they received fresher information. By contrast, in our work, the nodes that did not receive any update directly from the source made decisions based on the majority of the updates that they received from the other nodes. As a result, the error metric and the information updating model that we considered differed from the earlier works in [9-11].
The binary information structure appears in real-world applications such as robotic networks, where a group of robots working on a horizontal line wants to decide whether a neighbor robot is in front of the group or behind it. Here, the binary information structure is sufficient to represent the relative position of the robots in the network. Inspired by this example, by using only the sign of the relative states, reference [17] considered a decentralized online convex optimization problem with time-varying loss functions, and reference [18] solved a distributed discrete-time optimization over multi-agent networks. As another notable example, reference [19] considered a model where the actual opinion of the public evolved as a continuous variable in [0, 1] but the expressed opinions took only discrete binary values {0, 1} that resembled the opinion polls. Motivated by all these examples, in our work, we focused our attention on binary information dissemination as an initial step in analyzing the role of gossiping in information dissemination.
Finally, dissemination of misinformation on social networks has attracted significant interest in recent years. The network immunization problem has been considered, to prevent the diffusion of harmful information that can infect the network [20-22]. More specifically, reference [20] proposed an algorithm that utilizes the community structure for network immunization. Reference [21] proposed a comprehensive solution for the immediate detection and containment of harmful content, aiming to curb its propagation across the network. Reference [22] applied deep neural networks to develop context-aware algorithms that can detect fake news.

Contributions
In this work, we first characterize the equations necessary to obtain the steady-state distribution of the average error (which also appeared in our preliminary work in [23]). Then, we provide analytical results for the high and low gossip rates. When the gossip rate is high, we show that the probability of obtaining correct information converges to a step function: if the majority of the nodes have the correct information, then all the nodes are able to estimate the information correctly with probability 1. In other words, as the gossip rate increases, the information at all nodes becomes mutually available to them, and all the nodes in the network behave like a single node. However, when the gossip rate is low, the gossiping phase can be approximated by either not receiving any updates, in which case the nodes hold on to their prior information, or receiving a single update. Based on this approximation, we characterize analytically the gain obtained through gossiping, and we find an adaptive selection policy for the source, which suggests that the source should send updates to more nodes when the nodes have mostly incorrect information. Then, to further reduce the average error, we implement the idea of clustering, where, instead of sending information to all nodes, the source sends its information only to a smaller number of cluster heads. Then, the cluster heads forward the information to nodes within their clusters. For this network model, we characterize the equations to find the long-term average error at the cluster heads and at the nodes in the clusters. Finally, we provide extensive simulations, to illustrate the effect of gossiping and clustering on information dissemination.

System Model and Problem Formulation
We considered an information updating system consisting of a source and $n$ receiver nodes, as shown in Figure 1. The source kept the most up-to-date information about a state of the world that took binary values of 0 or 1. The information at the source was updated following a Poisson process with rate $\lambda_e$. We defined the time interval between the $j$th and $(j+1)$th information update at the source as the $j$th update cycle and denoted it by $I_j$. We assumed that the source was able to send instantaneous signals to the nodes. After receiving these signals, the nodes knew that information at the source had been updated, but they did not know what information had been realized at the source. Such instantaneous signaling exists in many practical systems. For example, consider a news provider publishing news that either supports or opposes a topic of interest. After the news is published, the news provider can send an instantaneous notification to its subscribers about the occurrence of the news through push notifications on smart devices or headlines in TV broadcasts or on its website. However, after receiving these notifications, individuals still do not know the actual update until they visit the news provider's website or watch the TV broadcast. As another application, consider a city emergency-warning system, or anomaly detection in security applications, where warning signals can occur over time. As warning signals can also be triggered by false alarms, upon receiving such warning signals, individuals do not know whether there is an actual anomaly or not, until further test results confirm the actual status. Thus, motivated by the aforementioned examples, we utilized the synchronization signal, which can indicate the information update at the source but does not provide any information about the source's state realization.
We denoted the information at the source at update cycle $j$ as $x_s(j)$. For a given $x_s(j)$, the information at the source at the $(j+1)$th update cycle was equal to $x_s(j+1) = x_s(j)$ with probability $1-p$ and to $x_s(j+1) = 1 - x_s(j)$ with probability $p$, i.e.,

$$P(x_s(j+1) = x \mid x_s(j)) = \begin{cases} 1-p, & x = x_s(j), \\ p, & x = 1 - x_s(j), \end{cases}$$

for all $j$, where $0 < p < \frac{1}{2}$. As $0 < p < \frac{1}{2}$, the nodes kept their state estimation unchanged whenever a new update cycle started. (Our results are extendable to the setting where $0.5 < p < 1$. In this case, the optimal decision taken by each node should be to invert its belief at the beginning of each update cycle.) The source updated each receiver node according to a Poisson process with rate $\frac{\lambda_s}{n}$. In this system, in addition to the update arrivals from the source, each node can share its local information with the other nodes, a process called gossiping. Specifically, in this work, we considered a fully connected network where each node was connected to every other node with equal update rates. The total update rate of a node was $\lambda$. Thus, in this network, each node updated each of its neighbor nodes following a Poisson process with rate $\frac{\lambda}{n-1}$. We denoted the information at node $i$ at update cycle $I_j$ as $x_i(j)$. The nodes wanted to follow the most up-to-date information prevailing at the source as accurately as possible, based on the updates received from the source as well as from the neighbor nodes during an update cycle.
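The source dynamics described above can be sketched in a few lines of simulation. This is only an illustration under stated assumptions: the function name `simulate_source`, the seed, and the parameter values are hypothetical, and the sketch covers only the source (state flips with probability $p$, cycle lengths exponential with rate $\lambda_e$), not the nodes.

```python
import random

def simulate_source(num_cycles, p, lam_e, seed=0):
    """Sample the binary source state and the duration of each update cycle."""
    rng = random.Random(seed)
    state = rng.randint(0, 1)  # arbitrary initial state (assumption)
    states, durations = [], []
    for _ in range(num_cycles):
        states.append(state)
        # Source updates form a Poisson process, so cycle lengths are Exp(lam_e).
        durations.append(rng.expovariate(lam_e))
        # The next state differs from the current one with probability p.
        if rng.random() < p:
            state = 1 - state
    return states, durations

states, durations = simulate_source(num_cycles=20000, p=0.3, lam_e=2.0)
flips = sum(a != b for a, b in zip(states, states[1:]))
flip_fraction = flips / (len(states) - 1)
mean_duration = sum(durations) / len(durations)
print(round(flip_fraction, 3), round(mean_duration, 3))
```

With a large number of cycles, the empirical flip fraction should be close to $p$ and the mean cycle duration close to $1/\lambda_e$.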
In this paper, we considered an information updating mechanism where at the beginning of each update cycle $I_j$ the source sent its current information to $m$ nodes where $1 \le m \le n$, as shown in Figure 1a. Here, we assumed that the source knew (or was able to sense/monitor) the information prevailing at the nodes and, thus, it sent updates to the nodes that carried different information compared to the source. (This approach was motivated especially by online advertisements, whereby companies such as Amazon and Google are able to monitor whether a potential customer is interested in their target products by the customer's search, view, and click history and, thus, present their advertisements accordingly. They can sense the final opinion of their potential customer by observing the potential customer's behavior, such as buying an advertised product.) During this phase, if the information at the source was updated, then another update cycle started and, thus, the $j$th update cycle could be terminated before sending updates to $m$ nodes. If the source sent updates to $m$ nodes, it sent another instantaneous signal to start the gossiping among the nodes. Then, we entered the gossiping phase in the update cycle $I_j$. During the gossiping phase, illustrated in Figure 1b, the nodes shared their local information with one another. When the information at the source was updated, the gossiping phase ended. At the end of the gossiping period, the nodes that did not get an update directly from the source updated their information based on the majority of the updates they received during the gossiping period. If a node did not get any updates from the source or the other nodes, it kept its local information unchanged. We denoted the information at node $i$ at the end of the gossiping period by $x'_i(j)$. In order to measure the performance of the information dissemination process, we defined the error metric for node $i$ at the update cycle $j$ as

$$\Delta_i(j) = |x'_i(j) - x_s(j)|.$$

Then, the average estimation error over all the nodes equaled $\Delta(j) = \frac{1}{n}\sum_{i=1}^{n} \Delta_i(j)$, and the long-term average estimation error over all the nodes was given by

$$\Delta = \lim_{J \to \infty} \frac{1}{J} \sum_{j=1}^{J} \Delta(j).$$

In the next section, we provide detailed analyses to characterize the long-term average error $\Delta$.
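As a small illustration of the error metric, the sketch below computes the per-cycle error as the fraction of nodes whose end-of-cycle estimate differs from the source, and averages it over a finite list of cycles as an empirical stand-in for the long-term average. The function names and the toy data are hypothetical.

```python
def cycle_error(estimates, source_state):
    """Fraction of nodes whose end-of-cycle estimate differs from the source."""
    return sum(1 for x in estimates if x != source_state) / len(estimates)

def long_term_error(cycles):
    """Empirical average of the per-cycle error over a finite list of cycles.

    Each entry of `cycles` is a pair (estimates, source_state).
    """
    return sum(cycle_error(est, xs) for est, xs in cycles) / len(cycles)

# Toy data: three update cycles with n = 4 nodes each.
cycles = [([1, 1, 0, 1], 1), ([0, 0, 0, 1], 0), ([1, 0, 1, 0], 0)]
print(long_term_error(cycles))
```

In this toy example, the per-cycle errors are 1/4, 1/4, and 1/2, so the empirical average is 1/3.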

The Long-Term Average Error
In this section, we characterize the long-term average error $\Delta$. Let us consider a generic update cycle $I_j$ and, for simplicity of presentation, let us drop the index $j$ from the variables in the rest of the analysis. At the beginning of the update cycle, we denote the number of nodes that have the same information as the source by $N \in \{0, \ldots, n\}$. In the first phase, either the source sends an update to a node after an exponential time with rate $\lambda_s$ or the information at the source is updated after an exponential time with rate $\lambda_e$. Thus, the source sends an update to a node with probability $\frac{\lambda_s}{\lambda_s+\lambda_e}$, or the information at the source is updated and the next update cycle starts with probability $\frac{\lambda_e}{\lambda_s+\lambda_e}$. Therefore, during a typical update cycle $I$ with $N < n - m$, the source sends $K_s$ updates with the following probability mass function (pmf):

$$P(K_s = k) = \begin{cases} \left(\frac{\lambda_s}{\lambda_s+\lambda_e}\right)^{k} \frac{\lambda_e}{\lambda_s+\lambda_e}, & k = 0, \ldots, m-1, \\ \left(\frac{\lambda_s}{\lambda_s+\lambda_e}\right)^{m}, & k = m. \end{cases} \tag{4}$$

Similarly, if $N \ge n - m$, we have

$$P(K_s = k) = \begin{cases} \left(\frac{\lambda_s}{\lambda_s+\lambda_e}\right)^{k} \frac{\lambda_e}{\lambda_s+\lambda_e}, & k = 0, \ldots, n-N-1, \\ \left(\frac{\lambda_s}{\lambda_s+\lambda_e}\right)^{n-N}, & k = n-N. \end{cases} \tag{5}$$

For an update cycle with $N < n - m$, the network enters the gossiping phase with probability $\left(\frac{\lambda_s}{\lambda_s+\lambda_e}\right)^{m}$, which decreases with $m$. In other words, choosing a large $m$ decreases the probability of entering the gossiping phase. On the other hand, choosing a small $m$ results in sending updates to a small number of nodes and, thus, in the gossiping phase, incorrect information can be spread. Therefore, there is an optimal $m$ that achieves the smallest average error $\Delta$.
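The trade-off in $m$ can be made concrete with a short numerical sketch. It assumes that the number of source transmissions in a cycle follows the truncated geometric law implied by the race between the two exponential clocks described above; the function name `source_update_pmf` and the parameter values are hypothetical.

```python
def source_update_pmf(N, n, m, lam_s, lam_e):
    """pmf of the number of source transmissions K_s in one update cycle.

    Assumption: each transmission succeeds before the next source update
    with probability rho = lam_s / (lam_s + lam_e), and the source stops
    after min(m, n - N) transmissions.
    """
    rho = lam_s / (lam_s + lam_e)
    cap = min(m, n - N)
    pmf = [rho**k * (1 - rho) for k in range(cap)]  # cycle ends after k updates
    pmf.append(rho**cap)                            # all planned updates are sent
    return pmf

pmf = source_update_pmf(N=2, n=10, m=3, lam_s=2.0, lam_e=1.0)
print(pmf, sum(pmf))

# Probability of completing all m transmissions (entering the gossiping phase),
# which shrinks as m grows.
entry = [source_update_pmf(0, 10, m, 2.0, 1.0)[-1] for m in range(1, 6)]
print(entry)
```

The last entry of the pmf is the probability that all planned transmissions complete; when $N < n - m$, this is the probability of entering the gossiping phase.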
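If the source sends updates to $m$ nodes before the information at the source is updated, then the gossiping phase starts. During the gossiping phase, either each node receives an update from the other nodes after an exponential time with rate $\lambda$ or the information at the source is updated after an exponential time with rate $\lambda_e$. As in [24], in the gossiping phase, node $i$ receives $K_i$ updates with the following pmf:

$$P(K_i = k_i) = \left(\frac{\lambda}{\lambda+\lambda_e}\right)^{k_i} \frac{\lambda_e}{\lambda+\lambda_e}, \quad k_i = 0, 1, 2, \ldots$$

In other words, $K_i$ has geometric distribution with parameter $\frac{\lambda_e}{\lambda+\lambda_e}$, i.e., $K_i \sim \mathrm{Geo}\left(\frac{\lambda_e}{\lambda+\lambda_e}\right)$. At the beginning of the gossiping phase, there are $N + m$ nodes with the same information as the source and $n - N - m$ nodes with incorrect information. For the nodes with $x_i = x_s$, conditioned on the total number of updates $K_i = k_i$ that they received during the gossiping phase, the distribution of the number of updates that are equal to $x_s$ is given by

$$P(R_i = r \mid K_i = k_i) = \binom{k_i}{r} \left(\frac{N+m-1}{n-1}\right)^{r} \left(\frac{n-N-m}{n-1}\right)^{k_i-r}, \quad r = 0, \ldots, k_i, \tag{7}$$

where $R_i$ is a random variable denoting the number of updates that are equal to $x_s$. In other words, for a node $i$ that has $x_i = x_s$, we have $R_i \mid K_i = k_i \sim \mathrm{Bin}\left(k_i, \frac{N+m-1}{n-1}\right)$, since each update comes from one of the other $n-1$ nodes with equal probability. Similarly, for the nodes $i$ with $x_i \ne x_s$, we have

$$P(R_i = r \mid K_i = k_i) = \binom{k_i}{r} \left(\frac{N+m}{n-1}\right)^{r} \left(\frac{n-N-m-1}{n-1}\right)^{k_i-r}, \quad r = 0, \ldots, k_i. \tag{8}$$

At the end of the gossiping period, based on the majority of the updates, the nodes $i$ that have $x_s$ as their prior information estimate the information at the source as $x'_i = x_s$ with probability $P_{T,1}(N)$, which is given by

$$P_{T,1}(N) = \sum_{k_i=1}^{\infty} P(K_i = k_i) \sum_{r=\lfloor k_i/2 \rfloor + 1}^{k_i} P(R_i = r \mid K_i = k_i) + \frac{1}{2} \sum_{k=1}^{\infty} P(K_i = 2k)\, P(R_i = k \mid K_i = 2k) + P(K_i = 0). \tag{9}$$

We note that the first summation term in (9) corresponds to the case where a node receives strictly more updates equal to $x_s$ than to $1 - x_s$ during the gossiping period. The second summation term in (9) refers to the case where a node receives an equal number of $x_s$ and $1 - x_s$. In this case, a node estimates the information as either $x_s$ or $1 - x_s$ with equal probabilities. If a node does not get any updates during the gossiping phase, it keeps its current information, which is accounted for by the last term in (9). Similarly, for a node $i$ that has prior information $x_i \ne x_s$, we can derive an expression for the probability of updating its information to $x_s$, denoted by $P_{T,2}(N)$, as

$$P_{T,2}(N) = \sum_{k_i=1}^{\infty} P(K_i = k_i) \sum_{r=\lfloor k_i/2 \rfloor + 1}^{k_i} P(R_i = r \mid K_i = k_i) + \frac{1}{2} \sum_{k=1}^{\infty} P(K_i = 2k)\, P(R_i = k \mid K_i = 2k). \tag{10}$$

Note that this expression is identical to that in (9), except that in the summations we use the probabilities given in (8) and that $P(K_i = 0)$ is excluded. In the next theorem, we state the long-term average error.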
Proof. We note that at the end of an update cycle with a gossiping phase, the $m$ nodes that obtain information directly from the source will have $x'_i = x_s$. (In the gossiping phase, these nodes send information to other nodes with rate $\lambda$, but they do not update their information based on the updates received from the other nodes.) There are $N$ nodes that have prior information $x_s$. These nodes will update their information to $x'_i = x_s$ with probability $P_{T,1}(N)$ and to $x'_i = 1 - x_s$ with probability $1 - P_{T,1}(N)$. Thus, the total number of nodes that update their information to $x_s$, denoted by $N'_1$, has the binomial distribution $N'_1 \sim \mathrm{Bin}(N, P_{T,1}(N))$. On the other hand, there are $n - N - m$ nodes that have prior information $1 - x_s$. At the end of the gossiping phase, these nodes will update their information to $x'_i = x_s$ with probability $P_{T,2}(N)$ and to $x'_i = 1 - x_s$ with probability $1 - P_{T,2}(N)$. Thus, the total number of nodes that change their information to $x_s$, denoted by $N'_2$, obeys the binomial distribution $N'_2 \sim \mathrm{Bin}(n - N - m, P_{T,2}(N))$. Therefore, at the end of the gossiping period, the total number of the nodes that have $x_s$ is equal to $m + N'$, where $N' = N'_1 + N'_2$ has the following pmf:

$$P(N' = n') = \sum_{\ell=\ell_{\mathrm{lower}}}^{\ell_{\mathrm{upper}}} \binom{N}{\ell} P_{T,1}(N)^{\ell} \left(1 - P_{T,1}(N)\right)^{N-\ell} \binom{n-N-m}{n'-\ell} P_{T,2}(N)^{n'-\ell} \left(1 - P_{T,2}(N)\right)^{n-N-m-n'+\ell},$$

for $n' = 0, \ldots, n - m$, where $\ell_{\mathrm{lower}} = \max\{0, n' + N + m - n\}$ and $\ell_{\mathrm{upper}} = \min\{N, n'\}$.
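Next, let us define $N''(j)$ to be the number of nodes that have the same information as the source at the end of the update cycle $I_j$, i.e., $x'_i(j) = x_s(j)$. If the update cycle $I_j$ ends before entering the gossiping phase, then either $N(j) < n - m$ and $K_s < m$, or $N(j) \ge n - m$. In these cases, the source sends updates to $k_s$ nodes with probability distributions given in (4) and (5), respectively, so that $N''(j) = N(j) + k_s$. If the source is able to send updates to $m$ nodes, then the gossiping phase starts and, as a result, $N''(j) = m + n'$ nodes will have $x_s(j)$ with probabilities $P(K_s = m) P(N' = n')$, where $n' = 0, \ldots, n - m$. Thus, the probability distribution of $N''$ for a given $N$ is given by

$$P(N'' = n'' \mid N) = \begin{cases} P(K_s = n'' - N)\, \mathbb{1}\{N \le n'' < N + m\} + P(K_s = m)\, P(N' = n'' - m)\, \mathbb{1}\{n'' \ge m\}, & N < n - m, \\ P(K_s = n'' - N)\, \mathbb{1}\{n'' \ge N\}, & N \ge n - m, \end{cases} \tag{13}$$

for $n'' = 0, \ldots, n$. With the pmf of $N''$ as provided in (13), we can fully characterize the transition probabilities of going from $N$ nodes that have $x_s$ at the beginning of an update cycle to $N''$ nodes that have $x_s$ at the end of that update cycle. Now let us consider a Markov chain over the state space $(x_s, N)$, where by abuse of notation we label the first $n + 1$ states $(0, 0), (0, 1), \ldots, (0, n)$ and the remaining $n + 1$ states $(1, 0), (1, 1), \ldots, (1, n)$. As the source keeps its state with probability $1 - p$, in which case the next cycle starts with $N''$ correct nodes, and flips its state with probability $p$, in which case the next cycle starts with $n - N''$ correct nodes, the transition probabilities are

$$P_{(x_s, N), (x'_s, \tilde{N})} = \begin{cases} (1-p)\, P(N'' = \tilde{N} \mid N), & x'_s = x_s, \\ p\, P(N'' = n - \tilde{N} \mid N), & x'_s = 1 - x_s. \end{cases} \tag{14}$$

We note that the stochastic matrix $P$ in (14) is irreducible, as every state $b$ is accessible from any state $a$ in a finite number of update cycles. As $P_{a,a} > 0$ for all $a$ in (14), the Markov chain induced by $P$ is also aperiodic. Thus, the above Markov chain admits a unique stationary distribution given by the solution of $\pi = \pi P$, such that $\sum_{i=0}^{1} \sum_{j=0}^{n} \pi_{ij} = 1$, $\pi_{ij} \ge 0$, $\forall i, j$. Finally, we characterize the long-term average error among all the nodes by (11).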
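Numerically, a stationary distribution of the form $\pi = \pi P$ can be obtained by power iteration, which converges for an irreducible and aperiodic chain such as the one argued above. The following is a minimal sketch on a toy 3-state stochastic matrix; the matrix is a hypothetical example and not the transition matrix of the paper, and the function name is an assumption.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100000):
    """Power iteration for pi = pi P on a row-stochastic matrix P (list of rows)."""
    size = len(P)
    pi = [1.0 / size] * size  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: new[b] = sum_a pi[a] * P[a][b].
        new = [sum(pi[a] * P[a][b] for a in range(size)) for b in range(size)]
        if max(abs(x - y) for x, y in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

# Toy irreducible, aperiodic chain (hypothetical values).
P_toy = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.7],
]
pi_toy = stationary_distribution(P_toy)
print([round(x, 6) for x in pi_toy])
```

For this birth-death example, the balance equations give $\pi = (0.6, 0.3, 0.1)$, which the iteration should reproduce up to the tolerance. For the chain in this section, the same routine would be applied to the $2(n+1) \times 2(n+1)$ matrix built from the transition probabilities.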
In the following section, we proceed to approximate the probabilities $P_{T,1}(N)$ and $P_{T,2}(N)$ provided in this section, to better understand the effect of gossiping when the gossip rate $\lambda$ is low and high compared to the information change rate at the source, $\lambda_e$.
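The probabilities $P_{T,1}(N)$ and $P_{T,2}(N)$ and the pmf of $N'$ can also be evaluated numerically. The sketch below rests on assumptions that should be checked against the model: each gossip update is taken to come from one of the other $n-1$ nodes uniformly at random, ties in the majority rule are broken by a fair coin, and the infinite sums over $K_i$ are truncated at a hypothetical `k_max`. All function names are illustrative.

```python
from math import comb

def binom_pmf(r, trials, prob):
    """P(X = r) for X ~ Bin(trials, prob)."""
    return comb(trials, r) * prob**r * (1 - prob)**(trials - r)

def majority_prob(k, q):
    """P(majority rule picks x_s | k updates, each equal to x_s w.p. q)."""
    total = 0.0
    for r in range(k + 1):
        if 2 * r > k:
            total += binom_pmf(r, k, q)        # strict majority for x_s
        elif 2 * r == k:
            total += 0.5 * binom_pmf(r, k, q)  # tie: fair coin
    return total

def transition_probs(N, n, m, lam, lam_e, k_max=200):
    """Truncated-series estimates of (P_T1, P_T2)."""
    g = lam / (lam + lam_e)                          # gossip update before source change
    q1 = min(1.0, max(0.0, (N + m - 1) / (n - 1)))   # node already holds x_s
    q2 = min(1.0, (N + m) / (n - 1))                 # node holds 1 - x_s
    p_t1 = 1 - g                                     # K_i = 0: correct prior is kept
    p_t2 = 0.0
    for k in range(1, k_max + 1):
        weight = g**k * (1 - g)                      # P(K_i = k)
        p_t1 += weight * majority_prob(k, q1)
        p_t2 += weight * majority_prob(k, q2)
    return p_t1, p_t2

def n_prime_pmf(N, n, m, p_t1, p_t2):
    """pmf of N' = Bin(N, p_t1) + Bin(n - N - m, p_t2) by direct convolution."""
    wrong = n - N - m
    pmf = []
    for n_prime in range(n - m + 1):
        lo, hi = max(0, n_prime - wrong), min(N, n_prime)
        pmf.append(sum(binom_pmf(l, N, p_t1) * binom_pmf(n_prime - l, wrong, p_t2)
                       for l in range(lo, hi + 1)))
    return pmf

p_t1, p_t2 = transition_probs(N=4, n=10, m=3, lam=1.0, lam_e=1.0)
pmf_n_prime = n_prime_pmf(4, 10, 3, p_t1, p_t2)
print(round(p_t1, 4), round(p_t2, 4), round(sum(pmf_n_prime), 6))
```

The truncation error of each series is at most $(\lambda/(\lambda+\lambda_e))^{k_{\max}+1}$, so `k_max` should grow with the gossip rate.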

Analysis for High and Low Gossip Rates
In this section, we develop approximations for $P_{T,1}(N)$ and $P_{T,2}(N)$, which are the probabilities of choosing $x_s$ at the end of a gossiping period when the nodes have the same prior information as the source and when they do not, respectively. First, by assuming sufficiently large $n$ and $N$, we can approximate the conditional pmfs for $R_i$ given in (7) and (8) by the binomial distribution $R_i \mid K_i = k_i \sim \mathrm{Bin}\left(k_i, \frac{N+m}{n}\right)$. Let us denote the corresponding $P_{T,1}(N)$ and $P_{T,2}(N)$ obtained by substituting this binomial approximation into (9) and (10) by $\bar{P}_{T,1}(N)$ and $\bar{P}_{T,2}(N)$, respectively. As $\bar{P}_{T,1}(N) = \bar{P}_{T,2}(N) + P(K_i = 0)$, for the rest of this section we will only approximate $\bar{P}_{T,2}(N)$, and can find the probability $\bar{P}_{T,1}(N)$ accordingly. Next, for sufficiently large values of $k_i$, we can approximate

$$P\left(R_i > \frac{k_i}{2} \,\Big|\, K_i = k_i\right) \approx Q\left(\sqrt{k_i}\, A(N)\right), \tag{15}$$

where $A(N) = \frac{\frac{1}{2} - \frac{N+m}{n}}{\sqrt{\frac{N+m}{n}\left(1 - \frac{N+m}{n}\right)}}$ and $Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-u^2/2}\, du$. We note that (15) is due to the normal approximation of the binomial distribution by using the central limit theorem (CLT).
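The quality of the normal approximation can be checked numerically. The sketch below compares the exact binomial tail $P(R_i > k_i/2)$ with a Q-function approximation, assuming $R_i \sim \mathrm{Bin}(k_i, q)$ with $q = \frac{N+m}{n}$ and taking $A = (\frac{1}{2} - q)/\sqrt{q(1-q)}$, which is the standardization implied by the CLT. The function names and the test values are hypothetical.

```python
from math import comb, erfc, sqrt

def q_function(x):
    """Gaussian tail Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * erfc(x / sqrt(2))

def exact_majority_tail(k, q):
    """P(R > k/2) for R ~ Bin(k, q), computed exactly."""
    return sum(comb(k, r) * q**r * (1 - q)**(k - r)
               for r in range(k // 2 + 1, k + 1))

def clt_majority_tail(k, q):
    """CLT approximation Q(sqrt(k) * A) with A = (1/2 - q) / sqrt(q (1 - q))."""
    a = (0.5 - q) / sqrt(q * (1 - q))
    return q_function(sqrt(k) * a)

k, q = 401, 0.55  # odd k avoids ties
exact = exact_majority_tail(k, q)
approx = clt_majority_tail(k, q)
print(round(exact, 4), round(approx, 4))
```

For moderately large $k_i$, the two values should agree closely; the gap widens for small $k_i$, which is why the approximation targets the high gossip rate regime.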
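In the following proposition, we show that $\bar{P}_{T,2}(N)$ can be approximated by a summation of Q-functions.
Proposition 1. When $\lambda$ is sufficiently large compared to $\lambda_e$, $\bar{P}_{T,2}(N)$ can be approximated by

$$P_{T,\mathrm{app}}(N) = \sum_{k_i=1}^{\infty} \left(\frac{\lambda}{\lambda+\lambda_e}\right)^{k_i} \frac{\lambda_e}{\lambda+\lambda_e}\, Q\left(\sqrt{k_i}\, A(N)\right). \tag{16}$$

Proof. Using the CLT, there exists a sufficiently large $K$, such that the difference between the probabilities $P\left(R_i > \frac{k_i}{2} \mid K_i = k_i\right)$ and $Q\left(\sqrt{k_i}\, A(N)\right)$ is smaller than $\epsilon$ for all $k_i > K$. Moreover, we have $P(K_i \le K) = 1 - \left(\frac{\lambda}{\lambda+\lambda_e}\right)^{K+1} \le \epsilon$ whenever $\lambda \ge \lambda_e \frac{(1-\epsilon)^{1/(K+1)}}{1-(1-\epsilon)^{1/(K+1)}}$. Thus, the difference between $\bar{P}_{T,2}(N)$ and $P_{T,\mathrm{app}}(N)$ can be made smaller than $2\epsilon$ for every $\epsilon > 0$ by choosing sufficiently large $\lambda$.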
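In Proposition 1, we showed that $\bar{P}_{T,2}(N)$ can be approximated by a summation of Q-functions when $\lambda$ is sufficiently large. In the following proposition, we show that as $\lambda \to \infty$ the probability $\bar{P}_{T,2}(N)$ converges to a step function.
Proposition 2. As the gossip rate $\lambda$ grows, we have

$$\lim_{\lambda \to \infty} \bar{P}_{T,2}(N) = \begin{cases} 0, & \frac{N+m}{n} < \frac{1}{2}, \\ \frac{1}{2}, & \frac{N+m}{n} = \frac{1}{2}, \\ 1, & \frac{N+m}{n} > \frac{1}{2}. \end{cases}$$

Proof. First, we consider the case when $\frac{N+m}{n} < \frac{1}{2}$. In this case, $A(N) > 0$ and we note that $Q\left(\sqrt{k_i}\, A(N)\right)$ is a decreasing function of $k_i$. Thus, for any arbitrary $\epsilon_1 > 0$, there exists an $L$, such that $Q\left(\sqrt{k_i}\, A(N)\right) < \epsilon_1$ for all $k_i > L$. By splitting the summation in (16) at $L$ and selecting a sufficiently large $\lambda$ such that $P(K_i \le L) < \epsilon_1$, we obtain $\bar{P}_{T,2}(N) < 2\epsilon_1$. Thus, when $\frac{N+m}{n} < \frac{1}{2}$, we have $\lim_{\lambda \to \infty} \bar{P}_{T,2}(N) = 0$. Next, we consider the case when $\frac{N+m}{n} > \frac{1}{2}$. In this case, $A(N) < 0$ and $Q\left(-\sqrt{k_i}\, A(N)\right) = 1 - Q\left(\sqrt{k_i}\, A(N)\right)$ is a decreasing function of $k_i$. Thus, for any $\epsilon_2 > 0$, there exists a large $L$, such that $Q\left(-\sqrt{k_i}\, A(N)\right) < \epsilon_2$ for all $k_i > L$. Therefore, we can write

$$\bar{P}_{T,2}(N) \approx \sum_{k_i=1}^{\infty} P(K_i = k_i) \left(1 - Q\left(-\sqrt{k_i}\, A(N)\right)\right) = \frac{\lambda}{\lambda+\lambda_e} - \sum_{k_i=1}^{\infty} P(K_i = k_i)\, Q\left(-\sqrt{k_i}\, A(N)\right).$$

Now, similar to the first part of the proof, we can show that $\bar{P}_{T,2}(N) > \frac{\lambda}{\lambda+\lambda_e} - 2\epsilon_2$ by selecting a sufficiently large $\lambda$. Thus, when $\frac{N+m}{n} > \frac{1}{2}$, we have $\lim_{\lambda \to \infty} \bar{P}_{T,2}(N) = 1$. Finally, when $\frac{N+m}{n} = \frac{1}{2}$, we note that the $A(N)$ terms in (16) become 0, which implies $\bar{P}_{T,2}(N) \approx P_{T,\mathrm{app}}(N) = \frac{1}{2} \frac{\lambda}{\lambda+\lambda_e}$, which converges to $\frac{1}{2}$ as $\lambda \to \infty$.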
In Proposition 2, we showed that when the gossip rate $\lambda$ is sufficiently large, the nodes start to have access to information from all other nodes. As a result, all the nodes in the network collectively start to behave like a single node, where at the end of a gossiping period the information is updated based on the majority of the information at all nodes. In other words, if the majority of the nodes have the same information as the source, which happens if $\frac{N+m}{n} > \frac{1}{2}$, all the nodes update their information to $x_s$ and, thus, they will have the same information as the source at the end of the gossiping period. On the other hand, when the majority of the nodes have the incorrect information $1 - x_s$, which happens if $\frac{N+m}{n} < \frac{1}{2}$, then all the nodes will have the incorrect information at the end of the gossiping period. Therefore, when the information at the source changes frequently (i.e., $\lambda_e$ is large) and the source has limited total update rate capacity (i.e., $\lambda_s$ is small), a high gossip rate $\lambda$ can cause incorrect information to disseminate in the network. As a result, gossiping can be harmful in these scenarios. On the other hand, when the source has high transmission rates, at each update cycle, it is enough for the source to send its information to the number of nodes that achieves the majority, i.e., $\frac{N+m}{n} > \frac{1}{2}$. After that, the remaining nodes can obtain the correct information during the gossiping phase. Thus, when the source has enough transmission rate, high gossip rates among the nodes can be utilized by sending the updates to at most half the network.
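The step-function behavior can be observed numerically. The following sketch evaluates a geometric mixture of Q-functions of the kind discussed in this section for a large gossip rate, assuming the weights $P(K_i = k) = (\lambda/(\lambda+\lambda_e))^{k}\, \lambda_e/(\lambda+\lambda_e)$ and $A = (\frac{1}{2} - q)/\sqrt{q(1-q)}$ with $q = \frac{N+m}{n}$. The function names, the tail tolerance, and the rate values are hypothetical.

```python
from math import erfc, sqrt

def q_function(x):
    """Gaussian tail Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * erfc(x / sqrt(2))

def p_t_app(frac_correct, lam, lam_e, tail_tol=1e-12):
    """Geometric mixture of Q-functions, summed until the weights are negligible.

    frac_correct plays the role of (N + m) / n.
    """
    g = lam / (lam + lam_e)
    if frac_correct <= 0.0:
        return 0.0              # no node holds x_s, so no update can equal x_s
    if frac_correct >= 1.0:
        return g                # every update equals x_s: P(K_i >= 1)
    a = (0.5 - frac_correct) / sqrt(frac_correct * (1 - frac_correct))
    total, weight, k = 0.0, g * (1 - g), 1
    while weight > tail_tol:
        total += weight * q_function(sqrt(k) * a)
        weight *= g             # P(K_i = k + 1) = g * P(K_i = k)
        k += 1
    return total

for frac in (0.3, 0.5, 0.7):
    print(frac, round(p_t_app(frac, lam=1000.0, lam_e=1.0), 4))
```

With $\lambda = 1000\lambda_e$, the three values should already be close to 0, 1/2, and 1, respectively.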
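Next, we consider the case in which the gossip rate $\lambda$ is relatively low compared to the rate of information change at the source, $\lambda_e$. When the gossip rate is low, the nodes either do not get any updates, in which case they hold on to their prior information, or they mostly get only one update from the other nodes and, hence, update their information based on the only received update. In the following proposition, we approximate the probability $\bar{P}_{T,2}(N)$ when $\lambda$ is low.
Proposition 3. When $\lambda$ is sufficiently small, the probability $\bar{P}_{T,2}(N)$ can be approximated by

$$P^{\mathrm{low}}_{T,\mathrm{app}}(N) = \frac{\lambda}{\lambda+\lambda_e} \frac{N+m}{n}.$$

Proof. When $\lambda$ is sufficiently low, the nodes either receive no updates or receive a single update packet from the other nodes in the gossiping phase. Thus, the nodes that have the incorrect information $1 - x_s$ as prior information obtain $x_s$ with probability $(1 - P(K_i = 0)) \frac{N+m}{n}$, which is equal to $P^{\mathrm{low}}_{T,\mathrm{app}}(N)$. Next, we consider the difference between $\bar{P}_{T,2}(N)$ and $P^{\mathrm{low}}_{T,\mathrm{app}}(N)$. As the two expressions agree on the term with $k_i = 1$, the difference is given by

$$\left|\bar{P}_{T,2}(N) - P^{\mathrm{low}}_{T,\mathrm{app}}(N)\right| = \left|\sum_{k_i=2}^{\infty} P(K_i = k_i) \left(P(x'_i = x_s \mid K_i = k_i) - \frac{N+m}{n}\right)\right| \le P(K_i \ge 2) = \left(\frac{\lambda}{\lambda+\lambda_e}\right)^{2}. \tag{19}$$

Thus, when the gossip rate $\lambda$ is sufficiently low compared to $\lambda_e$, the upper bound on (19) can be made arbitrarily small, making the approximation $\bar{P}_{T,2}(N) \approx P^{\mathrm{low}}_{T,\mathrm{app}}(N)$ tight.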
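The tightness of the low-rate approximation can be illustrated numerically. The sketch below compares a truncated evaluation of the majority-rule probability under the binomial approximation (each update equal to $x_s$ with probability $q = \frac{N+m}{n}$, ties broken by a fair coin) with the single-update approximation $\frac{\lambda}{\lambda+\lambda_e} q$. The function names, `k_max`, and the parameter values are hypothetical.

```python
from math import comb

def majority_prob(k, q):
    """P(majority rule picks x_s | k updates, each equal to x_s w.p. q)."""
    total = 0.0
    for r in range(k + 1):
        pr = comb(k, r) * q**r * (1 - q)**(k - r)
        if 2 * r > k:
            total += pr
        elif 2 * r == k:
            total += 0.5 * pr   # tie: fair coin
    return total

def p_t2_binomial(q, lam, lam_e, k_max=60):
    """Truncated series for the majority-rule probability of a wrong-prior node."""
    g = lam / (lam + lam_e)
    return sum(g**k * (1 - g) * majority_prob(k, q) for k in range(1, k_max + 1))

def p_t2_low(q, lam, lam_e):
    """Low gossip rate approximation: at most one update is received."""
    return lam / (lam + lam_e) * q

q, lam, lam_e = 0.4, 0.05, 1.0
full = p_t2_binomial(q, lam, lam_e)
low = p_t2_low(q, lam, lam_e)
gap_bound = (lam / (lam + lam_e)) ** 2  # probability of two or more updates
print(round(full, 6), round(low, 6), round(gap_bound, 6))
```

The gap between the two values cannot exceed the probability of receiving two or more updates, which vanishes quadratically as $\lambda/\lambda_e \to 0$.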

Gossip Gain and an Adaptive Policy for Selecting Transmission Capacity
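As a result of gossiping, when $\lambda$ is low, the nodes that have the correct information $x_s$ as prior information keep their information as $x_s$ with probability $P^{\mathrm{low}}_{T,\mathrm{app}}(N) + P(K_i = 0)$, which is given by $\bar{P}_{T,1}(N) \approx \frac{\lambda}{\lambda+\lambda_e} \frac{N+m}{n} + \frac{\lambda_e}{\lambda+\lambda_e}$. Thus, when $\lambda$ is small, the probability $\bar{P}_{T,1}(N)$ can be equivalently approximated by

$$\bar{P}_{T,1}(N) \approx 1 - \frac{\lambda}{\lambda+\lambda_e} \frac{n-N-m}{n}.$$

Therefore, when the gossip rate is low, we have

$$E[N'] = N \bar{P}_{T,1}(N) + (n-N-m) \bar{P}_{T,2}(N) \approx N + \frac{\lambda}{\lambda+\lambda_e} \frac{m(n-N-m)}{n}.$$

Thus, at the end of the gossiping period, there are on average $E[N'] + m$ nodes that have the same information as the source $x_s$. If we consider the system with no gossiping, where only the source can send updates to $m$ nodes, at the end of an update cycle at most $N + m$ nodes have the same information as the source. Thus, compared to the system with no gossiping, the gain (error reduction) obtained as a result of gossiping can be computed as

$$G(N) = \frac{\lambda}{\lambda+\lambda_e} \frac{m(n-N-m)}{n^2} \left(\frac{\lambda_s}{\lambda_s+\lambda_e}\right)^{m}, \tag{21}$$

which is obtained by subtracting $N + m$ from $E[N'] + m$, dividing the result by $n$ due to the definition of $\Delta$, and weighting by the probability that the gossiping phase takes place. Note that the last term $\left(\frac{\lambda_s}{\lambda_s+\lambda_e}\right)^{m}$ in (21) is equal to the probability of entering the gossiping phase.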
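The gain computation can be traced step by step in code. The sketch below assumes the low-rate approximations stated above for the two update probabilities, and assumes that the gain is the expected increase in the number of correct nodes, normalized by $n$ and weighted by the probability $(\lambda_s/(\lambda_s+\lambda_e))^{m}$ of entering the gossiping phase. The closed form used for comparison follows algebraically from these assumptions; the function names and parameter values are hypothetical.

```python
def low_rate_probs(N, n, m, lam, lam_e):
    """Low gossip rate approximations for correct-prior and wrong-prior nodes."""
    g = lam / (lam + lam_e)
    frac = (N + m) / n
    p2 = g * frac            # wrong prior: needs one update, and it must equal x_s
    p1 = p2 + (1 - g)        # correct prior: additionally keeps x_s if no update arrives
    return p1, p2

def gossip_gain(N, n, m, lam, lam_s, lam_e):
    """Expected error reduction from gossiping, computed step by step."""
    p1, p2 = low_rate_probs(N, n, m, lam, lam_e)
    with_gossip = N * p1 + (n - N - m) * p2 + m   # expected correct nodes with gossiping
    without_gossip = N + m                        # correct nodes without gossiping
    enter_gossip = (lam_s / (lam_s + lam_e)) ** m
    return (with_gossip - without_gossip) / n * enter_gossip

def gossip_gain_closed_form(N, n, m, lam, lam_s, lam_e):
    """Closed form implied by the same assumptions."""
    g = lam / (lam + lam_e)
    rho = lam_s / (lam_s + lam_e)
    return g * m * (n - N - m) / n**2 * rho**m

args = dict(N=5, n=20, m=4, lam=0.1, lam_s=5.0, lam_e=1.0)
print(gossip_gain(**args), gossip_gain_closed_form(**args))
```

The two functions should agree up to floating-point error, and the gain vanishes both for $m = 0$ and for $m = n - N$, which is what makes an interior choice of $m$ attractive.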
Let us denote the average error for a system with no gossiping (that is, $\lambda = 0$) by $\Delta_{ng}$. If the gossip rate is low, the overall gain obtained from gossiping, $|\Delta - \Delta_{ng}|$, can be approximated by an expression in (22) that involves a scaling function $B(p)$, in terms of $p$, which represents the effect of gossiping on the steady-state distribution $\pi$.
When the gossip rate among the nodes is low, the gossip gain $G(N)$ in (21) depends on the selection of $m$. Therefore, if the source is allowed to dynamically choose its transmission capacity $m$ in terms of $N$, a natural choice is to adaptively select an $m$ which maximizes the gossiping gain by solving

$$m^*(N) = \arg\max_{m} G(N) = \frac{n-N}{2} - \frac{1}{\ln \rho_s} - \sqrt{\left(\frac{n-N}{2}\right)^{2} + \frac{1}{(\ln \rho_s)^{2}}}, \tag{23}$$

where $\rho_s = \frac{\lambda_s}{\lambda_s+\lambda_e}$. (We note that $\frac{\partial G(N)}{\partial m} = 0$ has two solutions. The other solution is equal to $m^*(N)$ in (23) except that the square-root term has a positive sign. One can show that this root is always larger than $n - N$ and, thus, cannot be a feasible selection for $m$.) In fact, it is easy to see from (23) that the optimal solution $m^*(N)$ always lies in the range $0 \le m^*(N) \le \frac{n-N}{2}$. When the source has infinite transmission capacity, we have $\lim_{\lambda_s \to \infty} m^*(N) = \frac{n-N}{2}$, which suggests that the source should send its information to at most half of the nodes that carry incorrect information. In the other extreme case, when the source's transmission capacity is equal to 0, we have $\lim_{\lambda_s \to 0} m^*(N) = 0$, in which case the source should not send its information to any other nodes. In general, for a given $\lambda_s$, $m^*(N)$ in (23) is a decreasing function of $N$, which means that when $N$ is small, i.e., when most of the nodes have incorrect information, the source should send updates to a higher number of nodes. As $N$ increases, the source should send updates to a smaller number of nodes, as most nodes carry the same information as the source. In the following section, in order to reduce the average error, we implement clustering in the gossip network.
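The adaptive choice of $m$ can be illustrated with a small numerical sketch. It assumes that, in the low gossip rate regime, the gain is proportional to $m(n-N-m)\rho_s^{m}$ with $\rho_s = \frac{\lambda_s}{\lambda_s+\lambda_e}$, and compares the stationary point of this function (treating $m$ as continuous) with a brute-force search over integer $m$. The function names and parameter values are hypothetical.

```python
from math import log, sqrt

def gain_shape(m, N, n, rho):
    """Part of the gain that depends on m (constant factors dropped)."""
    return m * (n - N - m) * rho**m

def m_star(N, n, lam_s, lam_e):
    """Feasible root of d/dm [m (n - N - m) rho^m] = 0, with m treated as continuous."""
    rho = lam_s / (lam_s + lam_e)
    a = n - N
    c = -1.0 / log(rho)          # positive, since 0 < rho < 1
    return a / 2 + c - sqrt(a * a / 4 + c * c)

def best_integer_m(N, n, lam_s, lam_e):
    """Brute-force maximizer over integer m in {0, ..., n - N}."""
    rho = lam_s / (lam_s + lam_e)
    return max(range(0, n - N + 1), key=lambda m: gain_shape(m, N, n, rho))

n, lam_s, lam_e = 30, 5.0, 1.0
for N in (0, 10, 20, 29):
    print(N, round(m_star(N, n, lam_s, lam_e), 3), best_integer_m(N, n, lam_s, lam_e))
```

Because the logarithm of the gain is concave in $m$, the integer maximizer is one of the two integers adjacent to the continuous solution, so in practice the source could round $m^*(N)$ and compare the two neighbors.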

Average Error in Clustered Networks
In this section, we explore the idea of clustering in the gossip network in order to further reduce the average error. As illustrated in Figure 2, the gossip network is partitioned into m_s clusters of equal size n_c; that is, without loss of generality, we assume that n is divisible by m_s, so that n_c = n/m_s. Each cluster has a designated cluster head that is connected to the source directly. In the clustered network, when the information at the source is updated, instead of sending updates to individual nodes directly, the source only sends its information to the cluster heads that carry different information compared to the source. Thus, the source can send its information to a smaller number of cluster heads with higher update rates. Furthermore, as each cluster head behaves like an information source for its cluster, by using cluster heads, we can increase the total update rate going from the source to the individual nodes while decreasing the total number of connections in the gossip network. The downside of clustering is that if the information is updated during transmission from the source to the cluster heads, information may not be disseminated to individual nodes in the gossip network. Thus, we need to choose the number of cluster heads m_s optimally, to minimize the average error.
At the update cycle I(j), let the number of cluster heads that have the same information as the source be N_s(j), where 0 ≤ N_s(j) ≤ m_s. Before the information at the source is updated again, the source sends updates to K_sc(j) cluster heads with the following probability distribution:
If K_sc < m_s − N_s, it means that the information at the source is updated before all the cluster heads obtain the information at the source. In this case, a new update cycle begins and the source starts sending information to the cluster heads again. If all the cluster heads obtain the most current information at the source, then the cluster heads start sending their information to m_c nodes within their corresponding cluster that carry different information compared to the source, with a total update rate of λ. (In this section, we introduce the cluster heads as special nodes of the network, as in [10]. However, using these special nodes may not always be possible, as they may result in additional costs for the system. Considering these factors, we take the total update rate of the cluster heads as λ, which is the same as that of the regular nodes in the clusters; this means that, in the absence of these special nodes, some of the nodes in the gossip network can be used as cluster heads. As all the clusters in the network are identical, we focus on a typical cluster and obtain the average error of a node within that cluster.) In a typical update cycle I, we define N_c as the number of nodes that carry the same information as the source. When N_c < n_c − m_c, a cluster head sends updates to K_c nodes with the following pmf:
If N_c ≥ n_c − m_c, then we have
When N_c < n_c − m_c, if the cluster head sends updates to K_c = m_c nodes, then the gossiping phase starts and all the nodes within the same cluster share their local information with one another. When the information at the source is updated, the gossiping phase ends, and the nodes that did not get information directly from the cluster head update their information based on the majority of the updates that they received during the gossiping phase.
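The race between source transmissions and source updates described above can be pictured as a truncated geometric law. The following Python sketch is an assumption, not the pmf stated in the paper: it supposes that each transmission to a cluster head completes before the next source update with probability λ_s/(λ_s + λ_e), which is the same factor used later for the probability of entering the gossiping phase. The function name `heads_updated_pmf` and the sample parameters are hypothetical.

```python
def heads_updated_pmf(m_s, n_s, lam_s, lam_e):
    """Assumed pmf of the number of cluster heads updated in one cycle.

    m_s: number of cluster heads; n_s: heads already holding the source's
    information; lam_s, lam_e: source transmission and update rates.
    Returns a list whose k-th entry is P(K_sc = k), k = 0..m_s - n_s.
    """
    q = lam_s / (lam_s + lam_e)          # assumed per-transmission success probability
    remaining = m_s - n_s                # heads that still need the update
    # k < remaining: k transmissions succeed, then the source is updated
    pmf = [(q ** k) * (1.0 - q) for k in range(remaining)]
    # k = remaining: every outdated head is reached before the next update
    pmf.append(q ** remaining)
    return pmf


# Hypothetical example: 4 cluster heads, 1 already up to date
pmf = heads_updated_pmf(m_s=4, n_s=1, lam_s=10.0, lam_e=1.0)
```

Under this assumption, the last entry of the list is the probability that the cluster heads get to start forwarding the information into their clusters.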
In the next lemma, for a given state N_s, we provide the expression for the long-term average error at the nodes within the clusters.
Lemma 1. Under the proposed clustered network structure, for a given N_s, the long-term average error at the nodes within the clusters, denoted by ∆_{c|N_s}, is given by where P( is the row vector of the steady-state distribution of the Markov chain formed over the state space (x_s, N_c). The unique stationary distribution is given by the solution of π^{N_s} = π^{N_s} P^{N_s} for a stochastic matrix P^{N_s} ∈ R^{2(n_c+1)×2(n_c+1)}, where the transition probabilities of P^{N_s} can be derived by replacing N′′, N, and n in (14) by N′′_c, N_c, and n_c, respectively.
Proof. For a given N_s, the average error analysis for a clustered gossip network follows similarly from Section 3. During the gossiping phase, node i receives K_i updates with the pmf in (6). Similar to (7) and (8), we can rewrite P(R_i = r | K_i = k_i, x_i = x_s) and P(R_i = r | K_i = k_i, x_i ≠ x_s) by replacing N and m with N_c and m_c, respectively. Then, we can define P_{T,1}(N_c) and P_{T,2}(N_c) as in (9) and (10), respectively. Before starting the gossiping period, N_c nodes have x_s and n_c − N_c − m_c nodes have 1 − x_s as their prior information. At the end of the gossiping period, we have where P(N′′_s = n′′_s | N_s = j) is provided in (33).
Proof. In order to obtain the long-term average error ∆_c = E[∆_{c|N_s}], we need to find the probability distribution of N_s. For that, we note that in the clustered network, the information at the source and the number of cluster heads that have the same information as the source, i.e., (x_s, N_s), also form a Markov chain. During the source's update transmission to the cluster heads, by using (24), we write the probability distribution for the transition to state N′′_s from state N_s as follows: The Markov chain formed by (x_s, N_s) has the states (0, 0), ..., (0, m_s), (1, 0), ..., (1, m_s), which we label from 1 to 2(m_s + 1), correspondingly. Then, the stochastic matrix P^s consists of the entries P^s_{a,b}, which denote the probability of moving from state a to state b and are given by (34). Then, we can arrive at the unique stationary distribution π^s = π^s P^s that satisfies ∑_{i=0}^{1} ∑_{j=0}^{m_s} π^s_{i,j} = 1 and π^s_{i,j} ≥ 0 for all i, j. Finally, by using ∆_c = E[∆_{c|N_s}], we obtain the average error of a node in a cluster in (31).
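The stationary distribution of a finite Markov chain, i.e., a row vector π with π = πP and entries summing to one, can be computed numerically once the transition matrix is known. The following is a minimal Python sketch using power iteration, assuming the chain is irreducible and aperiodic; the 2×2 matrix is a toy example and not the matrix in (34), and the function name `stationary_distribution` is hypothetical.

```python
def stationary_distribution(P, max_iters=10_000, tol=1e-12):
    """Approximate the row vector pi with pi = pi P by power iteration.

    P is a row-stochastic matrix given as a list of lists. Convergence
    is assumed (irreducible, aperiodic chain); no check is performed.
    """
    n = len(P)
    pi = [1.0 / n] * n                       # start from the uniform distribution
    for _ in range(max_iters):
        # one step of the chain: nxt[b] = sum_a pi[a] * P[a][b]
        nxt = [sum(pi[a] * P[a][b] for a in range(n)) for b in range(n)]
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Toy two-state chain (hypothetical numbers)
P_toy = [[0.9, 0.1],
         [0.5, 0.5]]
pi_toy = stationary_distribution(P_toy)
```

For the toy chain, the balance equation 0.1·π_1 = 0.5·π_2 together with π_1 + π_2 = 1 gives π = (5/6, 1/6), which the iteration approaches.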
Similarly, the average error at the cluster heads, ∆_s, can be obtained by using (32) with the stationary distribution π^s and P(N′′_s = n′′_s | N_s) in (33).
In general, the clustered networks can model a system where not all the nodes have access to the source directly. In a way, cluster heads constitute a small group of nodes that have the privilege of accessing the information source directly. These nodes can be considered as paid subscribers to the source, while regular nodes can have free access to the information through these paid subscribers and gossiping. Thus, looking at the average difference between the errors at the cluster heads, ∆_s, and at the regular nodes, ∆_c, tells us how much a regular node can increase its quality of information through subscription. We can also imagine the clustered gossip networks in a way such that if every node is connected (subscribed) to the source directly, the information quality at the individual nodes may decrease due to the limited update capacity of the source. Instead, these nodes may choose some nodes as subscribers and share the cost of the subscription. As a result, through clustering, the nodes can decrease the cost of accessing the information while increasing the overall quality of their information.
In the next section, we provide numerical results to shed light on the effects of gossiping and clustering on information dissemination.

Numerical Results
This section has three subsections. In the first one, we discuss numerical results on the effects of various parameters on information dissemination in gossip networks: the transmission capacity m, the rate of information change λ_e, the information transmission rate at the source λ_s, the gossip rate λ, and the number of nodes n. In the second one, we provide simulation results to corroborate the analytical results in Section 4. In the third one, simulations illustrate the results of Section 5, that is, the effects of clustering on information dissemination.

Simulations for the Effects of Various System Parameters on Information Dissemination
For the numerical results in this subsection, we ran real-time simulations over 200,000 update cycles; the sample average errors are shown with markers in Figures 3 and 4. In the first numerical study, we took p = 0.4, λ_e = 1, λ_s = 10, and n = 60. We found the average error ∆ with respect to m when λ = {0, 10, 20}. Note that λ = 0 corresponded to the case of no gossiping among the nodes. We see in Figure 3a that when m was small, i.e., when the source could send updates to a small number of nodes, the average error ∆ increased with the gossip rate λ. As m was small and the information change probability p = 0.4 was high, incorrect information was disseminated through gossiping in the network. As a result, the system with no gossiping (λ = 0) achieved the lowest average error. When we increased m sufficiently, the nodes started to have access to the same information as the source, and gossiping helped to disseminate the correct information. That is why the systems with gossiping, i.e., λ = 10, 20, achieved lower average error compared to the system with no gossiping. The lowest average error ∆ was achieved when m = 25 for λ = 10, 20 and m = 55 for λ = 0.
Here, we also note that the average error ∆ was lower when λ = 10 compared to λ = 20, which shows that for a given m, there is an optimal gossip rate that achieves the lowest average error. Finally, increasing m further decreased the probability of entering the gossiping phase, and that is why all the curves in Figure 3a overlap when m ≥ 40. In the second numerical study, we considered the same variable selections as in the previous example, except that we took m = {5, 10, 15} and changed λ from 0 to 40. We see in Figure 3b that increasing the gossip rate λ initially helped to reduce the average error ∆. Then, increasing λ further increased ∆ as the incorrect information among the nodes became more available. We see in Figure 3b that the minimum average error was obtained when λ = 1 for m = 5, λ = 3 for m = 10, and λ = 6 or λ = 7 for m = 15. We note that as the source sent updates to more nodes, the optimal gossip rate increased.
In the third numerical study, we considered p = 0.2, λ_e = 1, λ = 5, and n = 60. We increased λ_s from 1 to 400 for m = {5, 10, 15}. We see in Figure 3c that increasing λ_s initially decreased the average error ∆ quickly. However, as ∆ also depended on the other parameters, such as m and the gossip rate λ, increasing λ_s further did not improve the average error ∆, which converged to 0.348 for m = 5, 0.21 for m = 10, and 0.144 for m = 15.
In the fourth numerical study, we considered the effect of the network size n on the information dissemination. For that, first, we took p = 0.2, λ_e = 1, λ = 10, m = 8, and n = {10, 20, ..., 150}, and we increased λ_s = {0.1n, 0.2n, 0.5n} with the network size n. In this case, as the network size increased, the source's transmission rate also increased. However, we kept the total number of nodes that the source could send updates to the same, i.e., m = 8 for all n. In Figure 4a, when λ_s = {0.1n, 0.2n}, we see that the average error ∆ initially decreased with n, as λ_s was initially a primary limiting factor. Increasing n further increased ∆ as m became more important. That is why all three curves overlap when λ_s is sufficiently large. Then, we considered a scenario where we kept λ_s = 4 and only increased m = {0.1n, 0.2n, 0.5n}. In Figure 4b, increasing the maximum number of nodes that the source could send updates to in an update cycle alone did not reduce ∆ as n increased. As we increased n, λ_s became the presiding factor, and all the curves in Figure 4b overlap. Finally, we increased both the source's transmission rate λ_s and capacity m with n, i.e., λ_s = {0.1n, 0.2n, 0.5n} and m = {0.1n, 0.2n, 0.5n}. As a result, in Figure 4c, we observe that we could achieve a constant ∆ by increasing λ_s and m proportionally to n.

Simulations for High and Low Gossiping Rates
In this subsection, we provide numerical results for the analysis developed for high and low gossip rates in Section 4. Here, we also ran real-time simulations over 10,000,000 update cycles. Since m = 20, λ_s = 2, and λ_e = 1, out of 10,000,000 update cycles, the system entered the gossiping phase in approximately 10,000,000 × (λ_s/(λ_s + λ_e))^m ≈ 3000 update cycles.
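The count of roughly 3000 gossiping cycles follows from the probability that the source completes all m transmissions before its information changes. A short Python check of this arithmetic, using the parameter values quoted in this subsection (the variable names are illustrative):

```python
lam_s, lam_e, m = 2.0, 1.0, 20        # source rate, update rate, capacity
total_cycles = 10_000_000             # simulated update cycles

# Probability that m consecutive transmissions all finish before an update
p_gossip = (lam_s / (lam_s + lam_e)) ** m

# Expected number of update cycles that reach the gossiping phase
expected_gossip_cycles = total_cycles * p_gossip
print(f"P(gossip phase) = {p_gossip:.3e}, expected cycles = {expected_gossip_cycles:.0f}")
```

Because (2/3)^20 is on the order of 3 × 10^-4, only about three in every ten thousand update cycles reach the gossiping phase, which is why so many cycles are simulated.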
As P_{T,1}(N) and P_{T,2}(N) were the probabilities that individual nodes were able to obtain the source's information as a result of gossiping, the sample averages of P_{T,1}(N) and P_{T,2}(N) were obtained over the approximately 3000 update cycles in which the system entered the gossiping phase. In the first numerical study, we verified the analytical results in Propositions 1 and 2. For this simulation, we numerically evaluated P_{T,2}(N) when n = 200, m = 20, λ_s = 2, λ_e = 1, and p = 0.2 for λ = {20, 200, 400}. Then, we compared P_{T,2}(N) to P_{T,app}(N). In Figure 5, we observe that when λ was high compared to λ_e, P_{T,2}(N) could be approximated well by P_{T,app}(N), which was given by the summation of Q-functions in (16). Furthermore, due to Proposition 2, as we increased λ from 20 to 400, P_{T,app}(N) and, thus, P_{T,2}(N) converged to a step function; i.e., when N < n/2 − m = 80, P_{T,2}(N) converged to 0, and when N > n/2 − m = 80, P_{T,2}(N) converged to 1, while P_{T,2}(80) = 0.5.
In the remaining numerical studies, we considered the case when the gossip rate λ was low compared to λ_e. In the second simulation, we evaluated P_{T,1}(N) and P_{T,2}(N) with the same parameters except for λ = {0.1, 0.5, 1}. We have shown in Proposition 3 that when λ is low compared to λ_e, P_{T,2}(N) can be approximated by P^{low}_{T,app}(N) in (18). We see in Figure 6b that when λ = 0.1 and λ = 0.5, P_{T,2}(N) closely matched P^{low}_{T,app}(N) in (18). When λ = λ_e = 1, P_{T,2}(N) could still be approximated well by P^{low}_{T,app}(N), but their differences started to be noticeable. Similarly, for the low gossip rate, we see in Figure 6a that the approximation for P_{T,1}(N) given in (20) was close when λ = {0.1, 0.5}. When the gossip rate λ was low, during the gossiping phase, the nodes either did not receive any updates, in which case they held on to their previous beliefs, or only got one update. That is why in Figure 6b, when N was low, P_{T,2}(N), which was the probability of having the correct information as a result of gossiping for a node that had incorrect prior information, was close to 0, and then it increased with N.
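The high-gossip-rate limit described above can be written as a simple threshold rule at N = n/2 − m. The Python sketch below encodes only that limiting step function, as stated in the text for λ → ∞; it does not compute P_{T,2}(N) or the Q-function approximation in (16) for finite λ. The function name `limiting_p_t2` is hypothetical.

```python
def limiting_p_t2(N, n, m):
    """Limiting value of P_{T,2}(N) as the gossip rate grows without bound.

    N: nodes holding the source's information before the cycle,
    n: number of nodes, m: source transmission capacity.
    """
    threshold = n / 2 - m
    if N < threshold:
        return 0.0      # incorrect information dominates the majority vote
    if N > threshold:
        return 1.0      # correct information dominates the majority vote
    return 0.5          # exactly at the threshold


# Parameters of the first numerical study: threshold is 200/2 - 20 = 80
values = [limiting_p_t2(N, n=200, m=20) for N in (79, 80, 81)]
```

With n = 200 and m = 20, the list evaluates the limit just below, at, and just above the threshold N = 80.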
In the third simulation study, when the gossip rate was low, we numerically found the gossip gain in (22), which was the difference between the average error with no gossiping, ∆_ng, and the average error with gossiping, ∆. For this example, we took n = 80, λ = 0.4, λ_s = 10, λ_e = 1, and p = {0.3, 0.5, 0.7}. We plotted |∆ − ∆_ng| with respect to m in Figure 7a. We observed in Figure 7a that for all values of p, the gossip gain initially increased with m as the source sent correct information to a sufficient number of nodes. Then, increasing m further decreased the gossip gain as the probability of entering the gossiping phase decreased in an update cycle. We observe in Figure 7a that the optimum gain was obtained when m = 8 for all p values. We note that the scaling term B(p) in (22) was equal to 1.7, 1.1, and 0.8 for p = 0.3, p = 0.5, and p = 0.7, respectively. We also note that G(N) in (21) decreased N in the next update cycle with probability p and increased N with probability 1 − p. Thus, the term B(p) in (22), which was the amplitude of the gossip gain, decreased with p. Based on G(N) in (21), we can find the optimal m that maximizes the gossip gain G(N) for each N, which is provided as m*(N) in (23). So far, in this work, we have only considered the case where m is kept constant for all update cycles. However, m*(N) in (23) decreases with N, which suggests a policy that selects m adaptively, depending on N. In the next simulation result, we took n = 60, p = 0.2, λ = 10, λ_e = 1, and λ_s = {1, 5, 10}. In Figure 7b, we plotted m*(N) and its rounding to the nearest integer for each λ_s. We see in Figure 7b that the source sent updates to more nodes as λ_s increased.
In the last simulation study, we compared the performances of the proposed adaptive policy and the constant policy for selecting m. We considered n = 60, p = 0.2, λ = {0, 1, 5}, λ_e = 1, and varied λ_s from 1 to 200. We first implemented the adaptive-m transmission policy by using the nearest-integer rounding of m*(N) in (23). We then found the stationary distribution π and calculated the average of the rounded m*, using E[m*] = ∑_{j=0}^{n} (π_{0j} + π_{1j}) m*(j), where m*(j) here denotes the rounded value at N = j; this average is depicted in Figure 8b. In order to make a fair comparison, we took the nearest-integer rounding of E[m*], which is shown with the dashed lines in Figure 8b, and implemented the constant-m transmission policy. We see in Figure 8a that the adaptive-m policy (even without gossiping) achieved a significantly lower average error ∆ compared to the constant-m policy. In Figure 8a, we also observe that as the gossiping took place, especially when nodes had the correct information, the average error ∆ decreased with the gossip rate λ. In the adaptive-m selection policy, we see in Figure 8b that increasing the gossip rate λ not only achieved lower ∆ but also decreased the source's transmission capacity E[m*]. Even though we found this policy for low gossip rates (λ < λ_e), we observed that it was an effective transmission policy even for higher values of λ and could achieve lower ∆ compared to the constant-m policy.
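The average capacity used by the adaptive policy is a weighted sum of the rounded m* values, with weights given by the stationary probability of each value of N summed over both source states. A minimal Python sketch of that computation follows; the stationary probabilities and policy values are made-up toy numbers for n = 2, not results from the paper, and the function name `expected_capacity` is hypothetical.

```python
def expected_capacity(pi0, pi1, m_star):
    """Average capacity E[m*] = sum_j (pi0[j] + pi1[j]) * m_star[j].

    pi0[j], pi1[j]: stationary probabilities of N = j when the source
    state is 0 or 1; m_star[j]: rounded capacity chosen when N = j.
    """
    if not (len(pi0) == len(pi1) == len(m_star)):
        raise ValueError("all inputs must have length n + 1")
    return sum((a + b) * m for a, b, m in zip(pi0, pi1, m_star))


# Toy example with n = 2 (hypothetical values)
pi0 = [0.1, 0.2, 0.2]
pi1 = [0.2, 0.2, 0.1]
m_star = [2, 1, 0]                     # policy sends to fewer nodes as N grows
avg_m = expected_capacity(pi0, pi1, m_star)
constant_m = round(avg_m)              # capacity used by the matched constant policy
```

Rounding the average gives the constant m against which the adaptive policy is compared, so that both policies use a comparable transmission budget.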

Simulations for the Clustered Networks
In this subsection, we provide the results of simulations that illustrate the effects of clustering on information dissemination. In the first numerical study, we chose λ = 10, λ_s = 10, λ_e = 1, p = 0.4, and n = 120. We took m_c = 5 and considered all m_s values that could divide n. In Figure 9, we plotted the long-term average error at the clusters, ∆_c, and at the cluster heads, ∆_s. We see that increasing the number of cluster heads initially helped to reduce ∆_c as the update rates from the cluster heads to the nodes increased. We see in Figure 9 that the minimum ∆_c was achieved when m_s = 15. Increasing m_s further increased ∆_c, as the average error at the cluster heads, ∆_s, became large.
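Since the clusters have equal size, the candidate values of m_s are the divisors of n, each paired with a cluster size n_c = n/m_s. The Python sketch below only enumerates these candidate configurations for n = 120; it does not evaluate the average error, which requires the Markov chain analysis of Section 5. The function name `cluster_configurations` is hypothetical.

```python
def cluster_configurations(n):
    """Return all (m_s, n_c) pairs with m_s * n_c = n."""
    return [(m_s, n // m_s) for m_s in range(1, n + 1) if n % m_s == 0]


configs = cluster_configurations(120)
for m_s, n_c in configs:
    print(f"m_s = {m_s:3d} cluster heads, n_c = {n_c:3d} nodes per cluster")
```

The configuration reported as best in Figure 9, m_s = 15, corresponds to clusters of n_c = 8 nodes.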
In the second numerical study, we compared the performances of the gossiping networks with and without clustering when the source's transmission capacity m had an upper limit m_lim = 12. For this numerical study, we took the same set of variables as in the first numerical study, but we increased n = 12, 24, ..., 96. For each n, we found the optimum m for the network model without clustering and the optimum m_s for the clustered network that minimized the average error at the nodes. We plotted the minimum average error values in Figure 10a and the optimum m and m_s selections in Figure 10b. We see in Figure 10a that the average error with clustering, ∆_c, was smaller than the average error without clustering, ∆, for all values of n, although the source used its maximum capacity m = m_lim for n ≥ 24 in the network model without clustering, as shown in Figure 10b. For the clustered network model, the optimal number of cluster heads mostly increased with n and reached m_lim for n ≥ 84.

Conclusions and Future Directions
In this work, we considered information dissemination over gossip networks consisting of a source that keeps the most up-to-date information about a binary state of the world and n nodes whose common goal is to follow the binary state of the world as accurately as possible. We first characterized the equations necessary to obtain the average error ∆ over all the nodes. Then, we provided analytical results for the high and low gossip rates. As information became available among the nodes in the high gossip rates, all the nodes behaved like a single node. In the low gossip case, we analyzed the gossip gain, which was the error reduction compared to the system with no gossiping, and we obtained m*(N), which maximized the gain. This suggests an adaptive m selection policy using m*(N), where the source sends updates to more nodes if most of them have incorrect prior information. Finally, we implemented a clustered gossiping network model and characterized the average errors at the cluster heads and at the nodes in the clusters.
We would like to note that, in this paper, the information change probability p and the update rate λ_e are taken as given exogenous parameters to the source. As time passes, the source generates updates based on Poisson ticking with rate λ_e, among which information is reverted with rate pλ_e. Let us assume that pλ_e is fixed (while 0 < p < 0.5) and m is constant. As pλ_e is constant, the information change rate over time does not change as we vary λ_e. When λ_e is low (and, thus, p is relatively large), the information at the source is flipped more frequently and the update cycle duration gets longer (as a result, the probability of entering the gossiping phase is higher). As the information is flipped more often, the majority of the nodes may have incorrect information. During the gossiping phase, this may increase the average error, as incorrect information may be disseminated further in the network. On the other hand, when λ_e is high while p is low, the information at the source is updated more frequently, but information does not get mutated much. In this case, the probability of entering the gossiping phase decreases and, thus, the system may not benefit from gossiping. Therefore, for a fixed pλ_e, there should be an optimal p and λ_e selection that minimizes the average error. We leave the optimization problem over λ_e as a future research direction.
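The trade-off for a fixed product pλ_e can be tabulated with a short Python sketch. It pairs each λ_e with the p that keeps pλ_e fixed, discards pairs outside 0 < p < 0.5, and reports the probability of entering the gossiping phase, assumed here to be (λ_s/(λ_s + λ_e))^m as in the simulation subsection. It does not compute the average error, so it illustrates the tension rather than solving the optimization problem. The function name `sweep_fixed_flip_rate` and all numerical values are hypothetical.

```python
def sweep_fixed_flip_rate(flip_rate, lam_e_values, lam_s, m):
    """List (lam_e, p, P(gossip phase)) for pairs with p * lam_e = flip_rate."""
    rows = []
    for lam_e in lam_e_values:
        p = flip_rate / lam_e
        if not 0.0 < p < 0.5:          # keep only the range considered in the text
            continue
        p_gossip = (lam_s / (lam_s + lam_e)) ** m   # assumed entry probability
        rows.append((lam_e, p, p_gossip))
    return rows


rows = sweep_fixed_flip_rate(0.2, [0.25, 0.5, 1.0, 2.0, 4.0], lam_s=10.0, m=5)
for lam_e, p, p_gossip in rows:
    print(f"lam_e = {lam_e:4.2f}, p = {p:4.2f}, P(gossip phase) = {p_gossip:.3f}")
```

In this sweep, a lower λ_e comes with a larger p and a higher chance of gossiping, while a higher λ_e lowers both, which mirrors the two regimes described above.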
As a future direction of research, one could consider the problem where the information at the source can take k > 2 different values based on a known pmf. Furthermore, here we have considered only fully connected networks, and extending these results to arbitrarily connected networks could be another interesting direction. One could consider a setting where the source does not have access to the prior information on the nodes, and has to select nodes randomly. In addition to the real-time simulation results, we would like to test our results with the real-world datasets provided in [25]. Finally, one can consider a setting where, although some nodes have the most accurate information, they maliciously send incorrect information to others during the gossiping phase, thus increasing the average error.

Figure 1. A communication system that consists of a source and fully connected n nodes where (a) only the source sends updates to the nodes, and (b) the nodes share their local information, called the gossiping phase.

Proposition 2. As λ → ∞, the probability P_{T,2}(N) converges to a step function given by lim_{λ→∞} P_{T,2}(N) = 0 if N < n/2 − m, 1/2 if N = n/2 − m, and 1 if N > n/2 − m.

Figure 2. A clustered gossip network that consists of a source, m_s = 2 cluster heads, and n_c = 6 fully connected nodes per cluster.

Figure 7. (a) The gossip gain |∆ − ∆_ng| in (22) with respect to m for p = {0.3, 0.5, 0.7}. (b) A sample evolution of m*(N) in (23) and its rounding to the nearest integer for different values of λ_s.

Figure 8. The comparison between (a) the average error ∆ and (b) the average m for the adaptive m and constant m selection policies.

Figure 9. The long-term average error at the clusters, ∆_c, and at the cluster heads, ∆_s, as we increase the number of clusters m_s.