Simple Majority Consensus in Networks with Unreliable Communication

In this work, we analyze the performance of a simple majority-rule protocol that solves a fundamental coordination problem in distributed systems, binary majority consensus, in the presence of probabilistic message loss. Using probabilistic analysis for a large-scale, fully-connected network of 2n agents, we prove that the Simple Majority Protocol (SMP) reaches consensus in only three communication rounds, with probability approaching 1 as n grows to infinity. Moreover, if the difference between the numbers of agents that hold different opinions grows at a rate of $\sqrt{n}$, then the SMP with only two communication rounds attains consensus on the majority opinion of the network, and if this difference grows faster than $\sqrt{n}$, then the SMP reaches consensus on the majority opinion of the network in a single round, with probability converging to 1 exponentially fast as n → ∞. We also provide some converse results, showing that these requirements are not only sufficient, but also necessary.


Introduction
The digital age drove forth the need for easy and fast access to information. The world wide web has facilitated the existence of many useful multiagent systems from messaging apps to cryptocurrency [1] and distributed data storage (or cloud services) [2,3]. However, the design of multiagent systems inherently requires agents to communicate and coordinate according to a prescribed shared protocol to achieve a common goal. For example, messaging apps must always show messages in the same order to all participants in a conversation, which is challenging when user clocks are not necessarily synchronized [4,5]. Cryptocurrencies employ decentralized data structures to register currency transactions, which require a vast majority of users to agree upon its current state [6]. Distributed data storage services must show consistent views of stored files in the presence of multiple concurrent reading and writing operations [7,8].
In the pursuit of developing such distributed protocols, much of the literature routinely makes two powerful assumptions. The first is that communication links are reliable [9][10][11], i.e., all messages between agents are eventually delivered. The second is that there exists an upper bound on the transmission delay of messages from one agent to another (usually the maximum propagation time of links) [12]. Nonetheless, communication networks are notoriously unreliable [13][14][15]. In fact, actual communication links may suffer from sudden crashes, causing messages in transit to be lost forever. In an effort to ensure reliability, distributed applications are generally built upon a reliable broadcast layer implemented by the Transmission Control Protocol (TCP) [16], one of the main protocols in the internet protocol suite. However, while TCP guarantees eventual delivery of all sent messages, it does not provide any upper bound on delivery time [17] (p. 9). In practice, therefore, these two assumptions do not hold simultaneously.
In this work, we assume no such underlying structure exists and analyze the performance of a simple majority-rule protocol that solves a fundamental coordination problem in distributed systems, binary majority consensus, in the presence of probabilistic message loss. Using probabilistic analysis for a large-scale, fully-connected network of 2n agents, we prove that the Simple Majority Protocol (SMP) converges rapidly to a consensus on the majority opinion of the network with probability approaching 1 as n → ∞, given that the difference between the numbers of agents that hold different opinions grows at least as fast as $\sqrt{n}$. Otherwise, if the difference between the numbers of agents that hold different opinions is relatively close to zero, then the SMP still converges extremely fast to a consensus, but not necessarily on the initial majority opinion of the network.

Importance of Reliable Communication
Reliability of communication is essential to guarantee coordination in almost all cases. The pitfalls and design challenges of coordination when communication is unreliable are best illustrated by the two generals' problem, which was popularized by Jim Gray [18].
Consider two generals who must coordinate a joint attack on an enemy. Both generals must attack simultaneously for the attack to succeed. While the two generals agreed that they will attack, they have not agreed upon a time for the attack. To coordinate, they can send messages to one another by running messengers. However, the messengers can be captured by the enemy and their messages will therefore not reach their destination.
Due to the uncertainty of message delivery, there exists no deterministic joint communication protocol that guarantees a coordinated attack. To see this, assume, for the sake of contradiction, that such a protocol exists. Since a deterministic protocol must solve the problem in a finite number of steps, the protocol prescribes a fixed number of message exchanges between the two generals, after which both must attack together. Some of these messages are successfully delivered and some are lost. Consider the last successfully delivered message in a run of the protocol, after which the recipient is confident enough to attack without the need for any further correspondence. Suppose this message was lost instead; then the recipient will hold off and not attack. However, the sender does not know about this last communication failure. By the protocol definition he must attack anyway, despite his counterpart's reluctance, contradicting the assumption that the protocol was a solution to the problem.

Majority Consensus
The impossibility result of the two generals' problem had far-reaching implications in the field of distributed protocols and databases, including the study of binary consensus [19]. In the binary consensus problem, every agent is initially assigned some binary value, referred to as the agent's initial opinion. The goal of a protocol that solves consensus is to have every agent eventually decide on the same opinion, thus reaching agreement throughout the system. More formally, given any initial assignment of agent opinions, a run of a protocol which solves consensus must exhibit the following three properties:
1. Decision: every agent eventually decides on some opinion;
2. Agreement: if some agent decided on v, no opinion other than v can be decided on by any other agent;
3. Nontriviality: if some agent has decided on v, then v was an opinion initially assigned to some agent.
Consensus is a fundamental problem in distributed systems, as many other coordination problems were shown to be directly reducible to and from consensus. The list includes agreeing on what transactions to commit to a database [20], state machine replication [21], atomic snapshots [22], total ordering of concurrent events [23], and the two generals' problem, implying that no protocol can guarantee all three properties when communication is unreliable [24].
In light of this, it is interesting to consider a variation of the two generals' problem where the probability of a messenger getting captured is p (independently of other messengers) [25,26]. While coordinated attack is still deterministically impossible, it is straightforward to design a protocol that guarantees success with probability at least q, which can be as close as desired to 1. The first general simply sends $\lceil \log_p(1 - q) \rceil$ messengers, then attacks at the specified time without waiting for a reply, and the second general attacks if any messenger from the first general arrives.
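As a small numerical illustration of the messenger count above, the following Python sketch finds the smallest number k of messengers for which at least one arrives with probability at least q, i.e., the smallest k with 1 − p^k ≥ q. The function names are hypothetical and chosen only for this example; the search loop is used instead of a logarithm to avoid floating-point rounding at the boundary.

```python
def success_probability(p, k):
    """Probability that at least one of k independent messengers arrives,
    when each is captured with probability p."""
    return 1.0 - p ** k


def messengers_needed(p, q):
    """Smallest k >= 1 such that success_probability(p, k) >= q.

    Assumes 0 < p < 1 and 0 < q < 1 (illustrative helper, not from the paper).
    """
    k = 1
    while success_probability(p, k) < q:
        k += 1
    return k


if __name__ == "__main__":
    for target in (0.9, 0.99, 0.999):
        k = messengers_needed(0.5, target)
        print(target, k, success_probability(0.5, k))
```

The required number of messengers grows only logarithmically in 1/(1 − q), which is why the success probability can be pushed arbitrarily close to 1 at modest cost.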
In this work, we investigate whether leveraging such an assumption helps to solve binary majority consensus, in which the nontriviality clause stipulates that if a majority of agents initially hold the same opinion, then all agents must decide on this opinion. This variant of consensus is utilized when the agreed upon opinion holds importance beyond facilitating agreement. For example, a distributed system of sensors capable of detecting natural gas could use majority consensus to answer the question "Is the amount of gas in the air greater than 10,000 ppm?" to help detect a gas leak in a gas processing center.
We analyze the performance of the SMP in a complete graph of communication, i.e., where each agent has an active communication channel to every other agent in the system. In SMP, agents communicate in equal-length time intervals called rounds. All messages are sent at the beginning of a communication round, and they either arrive by the end of the round or are considered lost. We assume that all message loss events are statistically independent and identically distributed with some constant probability.
The SMP can be briefly described as follows: in each round, every agent sends its current opinion to all other agents. Then, it waits to receive all messages from other agents proposing their own opinions. If a majority of received messages propose the same opinion, then the agent adopts this opinion for the next round. All ties are reconciled by readopting the agent's own opinion. After a fixed number of rounds r, each agent decides on its currently adopted opinion.
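The protocol description above can be sketched as a short simulation. This is a minimal Python sketch under stated assumptions, not the authors' implementation: each message is lost independently with probability q, ties keep the agent's current opinion, and the agent's own opinion is counted as one always-available vote (our reading of the "+1" term in the expressions of Appendix A). The names `smp_round` and `run_smp` are hypothetical.

```python
import random


def smp_round(state, q, rng):
    """Run one synchronous SMP round and return the new list of opinions.

    state: list of 0/1 opinions, one per agent
    q:     probability that any single message is lost
    rng:   a random.Random instance
    """
    new_state = []
    for i, own in enumerate(state):
        counts = [0, 0]
        counts[own] += 1  # assumption: own opinion counts as one vote
        for j, opinion in enumerate(state):
            # a message from agent j is delivered with probability 1 - q
            if j != i and rng.random() >= q:
                counts[opinion] += 1
        if counts[0] > counts[1]:
            new_state.append(0)
        elif counts[1] > counts[0]:
            new_state.append(1)
        else:
            new_state.append(own)  # tie: readopt own opinion
    return new_state


def run_smp(state, q, rounds, rng):
    """Apply `rounds` SMP rounds; each agent decides on its final opinion."""
    for _ in range(rounds):
        state = smp_round(state, q, rng)
    return state


if __name__ == "__main__":
    rng = random.Random(1)
    n = 100
    start = [0] * n + [1] * n  # symmetric initial state over 2n agents
    final = run_smp(start, q=0.2, rounds=3, rng=rng)
    print("zeros after 3 rounds:", final.count(0), "of", len(final))
```

Running the demo with different seeds gives a feel for how quickly a symmetric state tips to one side when losses are present.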
Similarly to the probabilistic protocol for the two generals' problem discussed above, the SMP does not solve consensus deterministically, but rather provides probabilistic guarantees instead. The Decision and Nontriviality properties of classical consensus are assured, since all agents decide by the end of round r and any opinion that was decided on was proposed by some agent. However, Agreement is not assured, since there always exists a nonzero probability of a run of the protocol in which message losses cause one agent to see only one opinion and another agent to see only the other, thus making them disagree. Likewise, Nontriviality of majority consensus is not guaranteed, since the majority opinion could be hidden from some agent. We will show in this article that the probability of these runs is negligible as the number of agents, 2n, tends to infinity, thus demonstrating that unreliable communication is not an insurmountable obstacle for coordination.
Specifically, we prove that the SMP with r = 3 reaches classical consensus with probability converging to 1 as n tends to infinity. In a system of 2n agents, let $\delta_n$ be the number of agents that are initially assigned the majority opinion minus n. For simplicity, assume the majority opinion is always the same for all n. We show that if $\delta_n$ grows at a rate of $\sqrt{n}$, then the SMP with r = 2 reaches majority consensus with probability approaching 1 as n → ∞. We also show that if $\delta_n$ grows at a rate faster than $\sqrt{n}$, then the SMP with r = 1 reaches majority consensus with probability that converges to 1 exponentially fast.
We also show that these achievability results are, in fact, tight. We prove that if $\delta_n = 0$, then r = 3 communication rounds are necessary, since the probability of reaching consensus with only r = 2 rounds converges to 0 as n → ∞. Similarly, if $\delta_n$ grows only as fast as $\sqrt{n}$, then r = 2 rounds are necessary to reach majority consensus.

Related Work
The problem of binary majority consensus has been extensively researched in many different fields and contexts including autonomous systems [27][28][29][30], distributed systems [31][32][33], and information theory [34][35][36]. Almost always the problem is studied in the context of possible failure of some aspect of the network. In distributed systems, failure most often arises from agents behaving maliciously, failing to follow the protocol, or outright crashing. Consequently, protocols that solve consensus (and majority consensus by extension) are designed to tolerate a certain fraction of the set of agents failing [37,38]. Transmission faults (i.e., message loss, erasure, or addition) can be considered an extension of agent failure, but doing so may lead to false conclusions. For example, in a system of n agents, the entire system may be considered faulty even if only one message from each agent is lost. However, as shown by Santoro and Widmayer [39], the system may tolerate up to n − 1 message losses in a round and still reach consensus. Additionally, assuming a probability distribution on message loss is consistent with how network protocols are analyzed. The most notable example is that TCP throughput was shown to be inversely proportional to the square root of the link's average packet (i.e., message) loss probability [40].
In [27,29,30,34], the authors studied the effects of message loss, random topology, Gaussian noise, and faulty agents on the SMP's convergence rate, i.e., the fraction of initial assignments of agent opinions (out of $2^n$) resulting in successful agreement. Specifically, in [30], computer simulations showed an improvement in the convergence rate of the SMP as the message loss probability increased up to 0.8, after which the rate begins to decrease to zero. In contrast, we are interested in the maximal probability of failure over any initial assignment of agent opinions, since we cannot assume any distribution or frequency on the input to the consensus problem.
Mustafa and Pekeč [28] studied the requirements on the connectivity of the network such that, under assumption of reliable communication, SMP achieves consensus on any initial assignment of agent opinions. Their main result is that the SMP computes the majority consensus successfully only in highly-connected networks. This conclusion led us to analyze the SMP under the assumption of a fully-connected network. However, message loss may actually improve the chances of consensus in graphs with lesser degrees of connectivity, as shown in [30]. We leave the proof of this hypothesis to future work. Additionally, the complete graph assumption is a valid approximation for unstructured overlays in peer to peer networks, e.g., Freenet, Gnutella, and Fast Track [41].
Our work closely resembles the work performed in [35,36]. These articles have shown that in a lossless fully-connected network where agents poll a portion of their neighbors uniformly at random, the SMP converges quickly to majority consensus with probability of error (in the sense that agreement was reached, but not on the majority opinion) that decays exponentially with n. While assuming the existence of infinitely many agents in a system may initially seem ludicrous and impractical, our own computer simulations of the SMP showed that these kinds of results hold true even if the number of agents is on the order of $10^6$, which is already the case in cryptocurrency protocols. We add another assumption of unreliable communication and show that this, essentially, does not change the outcome.
Yet another line of relatively recent work deserves special attention. In [42], a local polling protocol is proposed, and it is proved that it reaches consensus on the initial global majority in general graphs with certain degree properties. An estimate of the number of required steps to reach consensus is provided. In [43], similar results were given for random regular graphs. In both of these papers, it is assumed that a clear bias exists between the two initial opinions, in contrast to our main assumption in the current work, that the initial condition may be completely unbiased. In [44], the binary consensus problem was tackled from a different angle. For a random graph G(n, p) with a connectivity parameter p ∈ (0, 1) and any given ε ∈ (0, 1), this work reveals what the initial difference between the two camps should be, such that the larger camp will eventually win with probability at least as high as 1 − ε. In [45], the binary consensus problem was solved for relatively sparse random graphs but with random initial states, which is slightly different from the assumptions in the current work. A remarkable result was proved in [45], stating that a consensus can be reached in at most four communication rounds.
The remaining part of the paper is organized as follows. In Section 2, we establish notation conventions. In Section 3, we formalize the model, the protocol, and the objectives of this work. In Section 4, we provide and discuss the main results of this work, and in Section 5, we prove them.

Notation Conventions
Throughout the paper, random variables will be denoted by capital letters, realizations will be denoted by the corresponding lower case letters, and their alphabets will be denoted by calligraphic letters. Random vectors and their realizations will be denoted, respectively, by boldface capital and lower case letters. Their alphabets will be superscripted by their dimensions. The binary Kullback-Leibler divergence function between two binary probability distributions with parameters $\alpha, \beta \in [0, 1]$ is defined as:
$$D(\alpha \| \beta) = \alpha \log\frac{\alpha}{\beta} + (1 - \alpha) \log\frac{1 - \alpha}{1 - \beta},$$
where logarithms, here and throughout the sequel, are understood to be taken to the natural base. The cumulative distribution function of a standard normal random variable is defined by:
$$\Phi(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-s^2/2} \, \mathrm{d}s.$$
The probability of an event $\mathcal{E}$ will be denoted by $P\{\mathcal{E}\}$, and the expectation operator with respect to a probability distribution Q will be denoted by $E_Q[\cdot]$, where the subscript will often be omitted. The variance of a random variable X is denoted by Var[X]. The indicator function of an event $\mathcal{A}$ will be denoted by $\mathbb{1}\{\mathcal{A}\}$. The set {1, 2, . . . , n} will often be denoted by [1 : n]. For $x = (x_1, x_2, \ldots, x_n) \in \mathcal{X}^n$ and for any $a \in \mathcal{X}$, let us denote:
$$N(x; a) = \sum_{i=1}^{n} \mathbb{1}\{x_i = a\}.$$
For two non-negative sequences $a_n$ and $b_n$, the sequence $A_n = n + a_n$ is called asymmetric of exact order of $b_n$ if there exists some $\alpha > 0$, such that $\lim_{n \to \infty} a_n / b_n = \alpha$. Moreover, the sequence $A_n = n + a_n$ is called asymmetric of order larger than $b_n$ if $\lim_{n \to \infty} a_n / b_n = \infty$.
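For readers who wish to evaluate the two functions named in this section numerically, the following Python sketch implements the binary Kullback-Leibler divergence (natural logarithm, with the usual convention 0 log 0 = 0) and the standard normal cumulative distribution function via the error function. These are the standard textbook formulas; the function names are hypothetical.

```python
from math import erf, log, sqrt


def binary_kl(alpha, beta):
    """Binary KL divergence between Bernoulli(alpha) and Bernoulli(beta), in nats."""

    def term(x, y):
        if x == 0.0:
            return 0.0  # convention: 0 * log(0 / y) = 0
        if y == 0.0:
            return float("inf")  # positive mass where the reference has none
        return x * log(x / y)

    return term(alpha, beta) + term(1.0 - alpha, 1.0 - beta)


def std_normal_cdf(t):
    """CDF of a standard normal random variable, computed with math.erf."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))


if __name__ == "__main__":
    print(binary_kl(0.3, 0.5), std_normal_cdf(1.0))
```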

Model, Protocol, and Objectives
Assume a set of 2n agents, and denote their assignment of initial opinions by $x_{0,n} \in \{0, 1\}^{2n}$. The vector $x_{0,n}$ is called the initial state. Denote the numbers of zeros and ones in $x_{0,n}$ by $I_0$ and $I_1$, respectively. At each round, each agent transmits its current state to all other agents. If a message sent between any pair of agents arrives, then it is assumed to be delivered correctly. Otherwise, if $x \in \{0, 1\}$ is transmitted between any pair of agents but is lost, then the designated receiver receives the default symbol e. This assumption is only made for the purpose of making the definitions that follow clearer. For a sent message $x \in \{0, 1\}$ and a received message $Y \in \{0, \mathrm{e}, 1\}$, we assume that all message losses are statistically independent and identically distributed according to
$$P\{Y = y \mid x\} = \begin{cases} 1 - q, & y = x \\ q, & y = \mathrm{e} \\ 0, & \text{otherwise,} \end{cases}$$
where $q \in [0, 1]$ is the loss parameter of the network. The binary erasure channel is characterized by a similar conditional distribution, but note that the actual faults in our model are message losses, not to be confused with erasures, which are different kinds of faults. The two extreme cases of a reliable network (i.e., with q = 0) and a completely unreliable network (i.e., with q = 1) are of less interest, for obvious reasons; hence, we assume throughout that q ∈ (0, 1).
At round $\ell \ge 1$, agent $i \in [1 : 2n]$ receives the (random) vector:
$$Y_{\ell,i} = (Y_{\ell,i,1}, Y_{\ell,i,2}, \ldots, Y_{\ell,i,2n}) \in \{0, \mathrm{e}, 1\}^{2n},$$
and for $a \in \{0, 1\}$, it calculates the enumerators:
$$N_{\ell,i}(a) = \sum_{j=1}^{2n} \mathbb{1}\{Y_{\ell,i,j} = a\}.$$
In the SMP, each agent updates (note that we use this terminology even if the value of an agent does not change between two consecutive rounds) its value according to the more common value at hand, i.e., agent i chooses:
$$x_{\ell,i} = \begin{cases} 0, & N_{\ell,i}(0) > N_{\ell,i}(1) \\ 1, & N_{\ell,i}(1) > N_{\ell,i}(0) \\ x_{\ell-1,i}, & N_{\ell,i}(0) = N_{\ell,i}(1). \end{cases}$$
The vector $x_\ell \in \{0, 1\}^{2n}$ is called the state at the end of round $\ell$. A specific SMP defines a priori the number of rounds until termination. Let us denote by SMP(r) the SMP with r rounds of communication until termination. We say that the SMP(r) attains consensus if:
$$x_{r,1} = x_{r,2} = \cdots = x_{r,2n},$$
and denote this event by $C_n$. Similarly, we say that the SMP(r) attains majority consensus if the following holds:
$$x_{r,1} = x_{r,2} = \cdots = x_{r,2n} = \operatorname*{arg\,max}_{a \in \{0,1\}} I_a,$$
and denote this event by $C_n^m$. For a specific initial state $x_{0,n}$, the probability of error in achieving consensus is defined as $P_e(x_{0,n}) = P[C_n^c]$. The maximal error probability with respect to the initial state is defined by:
$$P_{e,\max} = \max_{x_{0,n} \in \{0,1\}^{2n}} P_e(x_{0,n}).$$
The error probability in achieving majority consensus is defined similarly and denoted $P_e^m(x_{0,n})$. Now, the first objective of this work is to prove that the SMP requires only very few rounds of communication to attain consensus, with a maximal error probability that converges to 0 when n → ∞. The second objective is to determine for which initial states it is possible to also achieve majority consensus with a small probability of error.

Main Results
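Our first main result is the following, which is proved in Section 5.1.
Theorem 1. Let $\{x_{0,n}\}_{n \ge 1}$ be a sequence of initial states over 2n agents. Assume that the 2n agents communicate over a network with a loss parameter q ∈ (0, 1). Then:
1. If the number of agents holding the majority opinion in $x_{0,n}$ is asymmetric of order larger than $\sqrt{n}$, then the SMP(1) attains $P[C_n^m] \xrightarrow{n \to \infty} 1$.
2. If the number of agents holding the majority opinion in $x_{0,n}$ is asymmetric of exact order of $\sqrt{n}$, then the SMP(2) attains $P[C_n^m] \xrightarrow{n \to \infty} 1$.
3. For any sequence of initial states, the SMP(3) attains $P[C_n] \xrightarrow{n \to \infty} 1$.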
We now provide a short discussion on the results of Theorem 1. Theorem 1 shows that the SMP requires at most three rounds of communication to attain consensus, in the limit of an infinite number of agents. Consensus on the majority cannot be ensured for all possible initial states, but only for those initial states that have a significant majority to one of the sides. To understand this fact better, consider the following special case. Assume a network with 2n agents, such that $I_0 = n + \log(n)$ and $I_1 = n - \log(n)$. Since this majority in favor of the zeros is so weak, it is most likely that the random losses in the network will completely hide it; we expect that about half of the agents will have $N_{1,i}(0) > N_{1,i}(1)$, thus updating their current opinion to '0', while the other half will update their current opinion to '1'. We conclude that the state at the end of round 1 is probabilistically equivalent to a sequence of 2n fair coin tosses, and hence, with a probability of about one half, the majority at the end of round 1 will be different from the initial majority.
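More quantitatively, let $I_0 = n + a_n$ and $I_1 = n - a_n$, where $\{a_n\}_{n \ge 1}$ is a non-negative, nondecreasing sequence. Moreover, for an agent with an initial opinion '0', let $p_n$ denote the sequence of probabilities of the events that such an agent updates its opinion to '0'. Then, the following trichotomy is seen inside the proof of Theorem 1.
1. If $\lim_{n \to \infty} a_n / \sqrt{n} = 0$, then $p_n$ converges to $\frac{1}{2}$.
2. If $\lim_{n \to \infty} a_n / \sqrt{n} = \alpha$ for some $\alpha \in (0, \infty)$, then $p_n$ converges to a constant strictly between $\frac{1}{2}$ and 1.
3. If $\lim_{n \to \infty} a_n / \sqrt{n} = \infty$, then $p_n$ converges to 1.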
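The dependence of $p_n$ on the margin $a_n$ can be explored numerically for finite n. The following Python sketch computes, exactly from binomial probabilities, the chance that an agent holding '0' keeps '0' after one round. It assumes (our reading of the expressions in Appendix A, not a statement of the formal model) that the agent's own opinion counts as one always-available vote and that ties keep the current opinion. The function names are hypothetical, and the direct use of `math.comb` limits the sketch to moderate n.

```python
from math import comb, isqrt


def binom_pmf(m, p):
    """Probability mass function of Bin(m, p) as a list indexed by k = 0..m."""
    return [comb(m, k) * p**k * (1.0 - p) ** (m - k) for k in range(m + 1)]


def prob_keep_zero(n, a, q):
    """P{an agent holding '0' updates to '0'} with n + a zeros and n - a ones.

    Zeros heard ~ Bin(n + a - 1, 1 - q), ones heard ~ Bin(n - a, 1 - q),
    independent; the agent keeps '0' iff zeros_heard + 1 >= ones_heard.
    """
    d = 1.0 - q  # delivery probability
    zeros = binom_pmf(n + a - 1, d)
    ones = binom_pmf(n - a, d)
    cdf_ones, total = [], 0.0
    for v in ones:
        total += v
        cdf_ones.append(total)
    return sum(pz * cdf_ones[min(k + 1, n - a)] for k, pz in enumerate(zeros))


if __name__ == "__main__":
    q = 0.3
    for n in (25, 100, 400):
        margins = {"0": 0, "sqrt(n)": isqrt(n), "n/4": n // 4}
        print(n, {name: round(prob_keep_zero(n, a, q), 4) for name, a in margins.items()})
```

Comparing the three margins as n grows illustrates the three regimes: no margin, a margin of order $\sqrt{n}$, and a margin of larger order.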
One of the most surprising facts, at least to the authors of this work, is the following. For highly symmetric initial states, although $p_n \xrightarrow{n \to \infty} \frac{1}{2}$ (which is proved in Appendix C), it turns out (see Proposition 3 in Section 5.1) that after a single round of communication, the initial symmetry breaks equiprobably into one of the sides. Moreover, for the symmetric case of $I_0 = I_1 = n$, we prove in Propositions 3 and 4 that with a probability converging to 1, the state at the end of round 1 will be asymmetric of exact order of $\sqrt{n}$. Then, according to the second point in Lemma 1, the state at the end of round two is going to have a significant majority to one of the sides, and thus, according to the third point in Lemma 1, only one more round of communication is required to achieve consensus. If the initial state is already asymmetric of exact order of $\sqrt{n}$, then only two rounds of communication are needed for attaining consensus, and in this case, it is guaranteed (with high probability) that all agents agree on the initial majority opinion.
The phenomenon that the initial symmetry breaks into a sufficient majority after the first round is of key importance, since it makes the convergence of the SMP so rapid. In fact, we also conclude that the faulty communication between the agents even helps in attaining consensus, by breaking the symmetry in some extreme cases. For example, consider the case of $I_0 = I_1 = n$ and a reliable network (i.e., the case of q = 0). Then, ad infinitum, the state at the end of any round will be symmetric. In contrast, when losses occur according to some q ∈ (0, 1), this will not be the case, even if the percentage of losses is extremely small (but fixed for all n).
A significant difference exists between the first point of Theorem 1 and its last two points, which is the following. The first point of Theorem 1 is based on Proposition 1 in Section 5.1, which is mainly proved by using the Chernoff bound. Since the Chernoff bound is a nonasymptotic tool, we acquire a large-deviations result, i.e., for a given sequence $\{a_n\}_{n \ge 1}$ (with the condition $\lim_{n \to \infty} a_n / \sqrt{n} = \infty$), we propose a tight upper bound on $P_e^m(x_{0,n})$, which holds for any finite n (this tightness follows from the fact that a lower bound with a matching exponent can be derived as well). This result is obviously stronger than just $P_e^m(x_{0,n}) \xrightarrow{n \to \infty} 0$. On the other hand, the second and the third points of Theorem 1 are based on Propositions 2 and 3 in Section 5.1, respectively. Since the proofs of these propositions involve central limit theorems, we merely arrive at asymptotic results. As a consequence, we do not know at what rates the probabilities in the second and the third points of Theorem 1 converge to one.
Since the results of the second and the third points of Theorem 1 are merely asymptotic, a few words on finite-n effects are in order. We base the following facts on computer simulations of the SMP. Convergence to consensus in more than three rounds is definitely possible, but only when the initial state is symmetric or almost symmetric. The reason is the fact mentioned above, according to which the state at round 1 is probabilistically equivalent to a sequence of 2n fair coin tosses, and hence, the probability that the state at round 1 is again symmetric behaves asymptotically as $1/\sqrt{n}$ (upper and lower bounds can be derived using Stirling's bounds on n!), which is not negligible at all, even for a relatively large number of agents. For relatively small values of n, we observed several realizations with more than one return to a fully symmetric state. Although quite rare, these events should be taken into consideration in practical implementations.
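The order of magnitude quoted above is easy to check numerically. Under the fair-coin approximation described in the text, the probability of an exactly balanced outcome among 2n tosses is the central binomial probability, and Stirling's formula gives the standard approximation $1/\sqrt{\pi n}$. The Python sketch below compares the two; the function names are hypothetical.

```python
from math import comb, pi, sqrt


def prob_balanced(n):
    """Exact probability that 2n fair coin tosses yield exactly n heads."""
    return comb(2 * n, n) / 4**n  # exact integers, converted to float on division


def stirling_approx(n):
    """Stirling approximation of the central binomial probability."""
    return 1.0 / sqrt(pi * n)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, prob_balanced(n), stirling_approx(n))
```

Because the decay is only polynomial in n, a return to a symmetric state remains a visible effect at moderate system sizes, which is consistent with the simulation observations reported above.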
All the results provided in Theorem 1 are, in fact, achievability results, i.e., they only tell under what conditions consensus can be attained. Hence, it is worth investigating whether consensus may be attained by the SMP with even fewer communication rounds than required in Theorem 1. In the following result, which is the second main result of this work and is proved in Section 5.2, we show that for highly symmetric initial states, three rounds of communication are not only sufficient, but also necessary.
Theorem 2. Let $\{x_{0,n}\}_{n \ge 1}$ be a sequence of symmetric initial states over 2n agents, i.e., $N(x_{0,n}; 0) = N(x_{0,n}; 1) = n$ for all n. Assume that the 2n agents communicate over a network with a loss parameter q ∈ (0, 1). Then, the SMP(2) attains $P[C_n] \xrightarrow{n \to \infty} 0$.
While Theorem 2 provides a converse result with regard to the third point of Theorem 1, a similar converse result can also be established with regard to the second point of Theorem 1. If the initial state is asymmetric of exact order of √ n, then the SMP will likely not attain consensus after only a single round of communication, and furthermore, the probability of reaching consensus will tend to 0 as n → ∞. We omit the proof of this negative result.

Proof of Theorem 1
The first point of Theorem 1 is proved via the following result, which is proved in Appendix A.
Proposition 1. Let $\{A_n\}_{n=1}^{\infty}$ be a sequence such that $\lim_{n \to \infty} A_n / \sqrt{n} = \infty$. For an initial state $x_{0,n} \in \{0, 1\}^{2n}$ with at least $n + A_n$ zeros or at least $n + A_n$ ones and a channel parameter q ∈ [0, 1), the SMP(1) attains majority consensus with an error probability $P_e^m(x_{0,n})$ that converges to zero as n → ∞.
To prove the second point of Theorem 1, we rely on the following result, which is proved in Appendix B.
1. If $x_{0,n} \in \{0, 1\}^{2n}$ has at least $n + \alpha\sqrt{n}$ zeros, then: 2. If $x_{0,n} \in \{0, 1\}^{2n}$ has at least $n + \alpha\sqrt{n}$ ones, then
Then, combining the results of Propositions 1 and 2 using the law of total probability, the second point of Theorem 1 follows immediately.
To prove the third point of Theorem 1, we provide one more result. The following proposition shows that if the initial state is symmetric, then the state at round one will be asymmetric of order at least $\sqrt{n}$. This result is proved in Appendix C.

Proof of Theorem 2
The following proposition, which is proved in Appendix D, shows that if the initial state is symmetric, then the state at round one cannot be asymmetric of order larger than $\sqrt{n}$.
Proposition 4. Let $\{B_n\}_{n=1}^{\infty}$ be a sequence such that $\lim_{n \to \infty} B_n / \sqrt{n} = \infty$. For an initial state $x_{0,n} \in \{0, 1\}^{2n}$ with n zeros and n ones and a channel parameter q ∈ (0, 1), the following holds:
We also have the following result, which is proved in Appendix E.
Proposition 5. Let $\{C_n\}_{n=1}^{\infty}$ be a sequence such that $\lim_{n \to \infty} C_n / n = 0$. Let $x_{0,n} \in \{0, 1\}^{2n}$ be an initial state with $n + C_n$ zeros or $n + C_n$ ones. Let q ∈ (0, 1) be a channel parameter and denote the constant $f_q = 32 / \min\{q, 1 - q\}$. Then, the SMP(1) is characterized by:
We are now in a good position to prove Theorem 2. Let $C(q) = \frac{1}{2} f_q^{-1}$, choose the sequence: and define the sequence of events: According to Proposition 4, we have that: which converges to zero as n → ∞. In addition, it follows from Proposition 5 that:
$$\begin{aligned}
P\{C_n \mid F_n^c\} &\le \exp\left\{-C(q)\, n \log(n) \cdot \exp\left\{-\frac{f_q \cdot C(q)\, n \log(n)}{n - \sqrt{C(q)\, n \log(n)}}\right\}\right\} && (33) \\
&\le \exp\left\{-C(q)\, n \log(n) \cdot \exp\left\{-2 f_q \cdot C(q) \log(n)\right\}\right\} && (34) \\
&= \exp\left\{-C(q)\, n \log(n) \cdot \exp\{-\log(n)\}\right\} && (35) \\
&= \exp\left\{-C(q)\, n \log(n)\, n^{-1}\right\} && (36) \\
&= \exp\{-C(q) \log(n)\}, && (37)
\end{aligned}$$
where (34) holds for all large enough n, since then $n - \sqrt{C(q)\, n \log(n)} \ge n/2$, and (35) uses $2 f_q \cdot C(q) = 1$. Then, consider the following: where (39) is due to the law of total probability and (41) follows from (32) and (38). The proof of Theorem 2 is complete.
Author Contributions: All authors contributed equally to this research work. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Proof of Proposition 1
Due to symmetry, we only analyze the case $I_0 > I_1$. It follows from the union bound that: In the following, let us denote by Ber(p) a Bernoulli random variable with a success probability p and by Bin(n, p) a binomial random variable with n independent experiments, each one with a success probability p. We adopt the following convention: if an event contains at least two binomial random variables, then we assume that they are statistically independent.
Let us denote $q' = 1 - q$. If an agent starts with a '0', then the probability to decide in favor of '1' is upper-bounded by: where the addition of the second 1 in (A3) follows from the need to strictly break the tie to adopt '1' and (A4) is due to the fact that $\mathrm{Bin}(1, q') \le 2$ with probability one. If an agent starts with a '1', then the probability to decide '1' is upper-bounded by:
$$\begin{aligned}
& P\{\mathrm{Bin}(n - A_n - 1, q') + 1 \ge \mathrm{Bin}(n + A_n, q')\} && (A5) \\
&\quad \le P\{\mathrm{Bin}(n - A_n, q') + 1 \ge \mathrm{Bin}(n + A_n, q')\}. && (A6)
\end{aligned}$$
Since (A6) cannot be smaller than (A5), we continue with (A6). From now on, we prove that the probability in (A6), to be denoted by $P_n$, converges to zero as n → ∞. Let: where $I_\ell \sim \mathrm{Ber}(q')$, for all $\ell \in \{1, 2, \ldots, n - A_n\}$, $J_k \sim \mathrm{Ber}(q')$, for all $k \in \{1, 2, \ldots, n + A_n\}$, and all of these binary random variables are independent. Now: where (A10) is due to Markov's inequality. Since (A10) holds for every $\lambda \ge 0$, it follows that: We obtain that: where (A14) is due to the independence of all binary random variables and (A16) follows from the inequality $1 + x \le e^x$. Upon defining: we find that To facilitate expressions, we solve for f (λ) = 1 and find that: Substituting it back into (A17) yields that: Consider the following: where (A27) follows from the inequality $\sqrt{1 - t} \le 1 - t/2$. Continuing from (A2), we arrive at which converges to zero when n → ∞, as long as $\lim_{n \to \infty} A_n / \sqrt{n} = \infty$ and $\lim_{n \to \infty} A_n / n < 1$. For the case of $\lim_{n \to \infty} A_n / n = 1$, consider the following. Let $\{A_n\}_{n=1}^{\infty}$ be any sequence with $\lim_{n \to \infty} A_n / n = 1$ and let $\{\tilde{A}_n\}_{n=1}^{\infty}$ be a sequence with $\lim_{n \to \infty} \tilde{A}_n / n = \alpha$, for $\alpha \in (0, 1)$. Then, for sufficiently large n, $A_n \ge \tilde{A}_n$, and thus, it follows that: which completes the proof of Proposition 1.
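The Chernoff argument of this appendix can be checked numerically for finite n. The Python sketch below compares the exact value of the probability in (A6) with a generic Chernoff bound, obtained by applying Markov's inequality to the moment generating function and minimizing over a grid of λ ≥ 0. This is an illustration under stated assumptions: it uses the exact moment generating function and a grid search rather than the closed-form choice of λ made in the proof, and the function names are hypothetical.

```python
from math import comb, exp, log


def binom_pmf(m, p):
    """Probability mass function of Bin(m, p) as a list indexed by k = 0..m."""
    return [comb(m, k) * p**k * (1.0 - p) ** (m - k) for k in range(m + 1)]


def exact_tail(n, a, q):
    """Exact P{Bin(n - a, 1 - q) + 1 >= Bin(n + a, 1 - q)}, independent binomials."""
    d = 1.0 - q
    px = binom_pmf(n - a, d)
    py = binom_pmf(n + a, d)
    cdf_y, total = [], 0.0
    for v in py:
        total += v
        cdf_y.append(total)
    return sum(v * cdf_y[min(k + 1, n + a)] for k, v in enumerate(px))


def chernoff_bound(n, a, q, steps=2000, lam_max=5.0):
    """min over a lambda grid of E[exp(lambda * (X + 1 - Y))], X, Y as above."""
    d = 1.0 - q
    best = 1.0  # lambda = 0 gives the trivial bound 1
    for s in range(1, steps + 1):
        lam = lam_max * s / steps
        log_mgf = (
            lam
            + (n - a) * log(1.0 - d + d * exp(lam))
            + (n + a) * log(1.0 - d + d * exp(-lam))
        )
        best = min(best, exp(log_mgf))
    return best


if __name__ == "__main__":
    for a in (20, 40, 60):
        print(a, exact_tail(200, a, 0.5), chernoff_bound(200, a, 0.5))
```

Every λ on the grid yields a valid upper bound, so the minimum over the grid is also valid even if it misses the exact optimizer.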

Appendix B. Proof of Proposition 2
Step 1: The Limit of the Probability to Decide '1' If an agent starts with a '1', then the probability to decide in favor of '1' is given by: and if an agent starts with a '0', then the probability to decide in favor of '1' is given by From now on, we prove that the probability in (A32), to be denoted by P n , converges to a value, which is strictly smaller than 1 2 for all sufficiently large n. An identical result also holds for the probability in (A31), the proof of which is very similar and hence omitted.
To conclude that $X_n$ itself converges in distribution to $X \sim \mathcal{N}(0, \sigma^2)$, we only need to prove that $|X_n - \tilde{X}_n|$ converges in distribution to 0. We have that: which proves that $|X_n - \tilde{X}_n|$ converges in $L^2$ to 0, thus also in distribution. It then follows from [47] (Theorem 3.1) that $X_n$ converges in distribution to $X \sim \mathcal{N}(0, \sigma^2)$. Concerning the sequence $Z_n$, consider the following: and furthermore It follows that $Z_n$ converges in $L^2$ to $Z = \alpha\bar{q}$, i.e., a deterministic random variable. Hence, $Z_n$ also converges to $Z = \alpha\bar{q}$ in probability [46] (Lemma 1.3.5). Now, for $\epsilon > 0$ arbitrarily small, consider the following: where (A53) is due to the law of total probability and (A55) follows from the fact that $(X_n, Y_n)$ are independent of $Z_n$. Since $\{I_\ell\}$ and $\{J_s\}$ are all independent, the joint law of the pair $(X_n, Y_n)$ converges to the joint law of $(X, Y)$ and $X, Y$ are independent. Hence, by Portmanteau's theorem [47] (p. 16, Theorem 2.1), and the fact that $Z_n$ converges to $Z = \alpha\bar{q}$ in probability: where and In a similar fashion: $= P\{X_n \geq Y_n + Z_n + \alpha\bar{q} \mid Z_n \leq \alpha\bar{q} + \epsilon\} P\{Z_n \leq \alpha\bar{q} + \epsilon\} + P\{X_n \geq Y_n + Z_n + \alpha\bar{q} \mid Z_n > \alpha\bar{q} - \epsilon\} P\{Z_n > \alpha\bar{q} - \epsilon\}$ (A62) $\geq P\{X_n \geq Y_n + \alpha\bar{q} + \epsilon + \alpha\bar{q} \mid Z_n \leq \alpha\bar{q} + \epsilon\} P\{Z_n \leq \alpha\bar{q} + \epsilon\}$ (A63) and thus lim inf where From the continuity of the Q-function and the fact that $\epsilon > 0$ is arbitrarily small, we conclude that: where and hence $\lim_{n \to \infty} P_n = Q(t_0)$.
Step 2: Many Zeros with High Probability Let $0 < \delta < \tfrac{1}{2} - Q(t_0)$ be given. Let $Q_n^0, Q_n^1$ denote the probabilities of deciding '0', for the two possible initial states. Since $P_n \leq Q(t_0) + \delta < \tfrac{1}{2}$ for all sufficiently large $n$, it follows that $\min\{Q_n^0, Q_n^1\} \geq \Phi(t_0) - \delta$ for all sufficiently large $n$, where $\Phi(t) = 1 - Q(t)$. We now prove that the probability of drawing a relatively small number of zeros tends to 0 as $n \to \infty$. Denote $N_0 = N(X_1; 0)$ and consider the following for $s \geq 0$: where (A73) is due to Markov's inequality. Since (A73) holds for every $s \geq 0$, it follows that: (A74) Note that: where $I_\ell \sim \mathrm{Ber}(Q_n^0)$, for all $\ell \in \{1, 2, \ldots, n + \alpha\sqrt{n}\}$, $J_k \sim \mathrm{Ber}(Q_n^1)$, for all $k \in \{1, 2, \ldots, n - \alpha\sqrt{n}\}$, and all of these binary random variables are independent. We obtain that: where (A78) is due to the independence of all binary random variables and (A80) is true since $\min\{Q_n^0, Q_n^1\} \geq \Phi(t_0) - \delta$ for all sufficiently large $n$ and $e^{-s} - 1 \leq 0$. Substituting (A81) back into (A74) yields that: Upon defining: we find that the solution to $g'(s) = 0$ is given by Substituting it back into (A84) yields that: We upper-bound the expression in (A89) using Pinsker's inequality [48,49]. Recall that the total variation distance between two probability distributions $P$ and $Q$ is defined by: and the Kullback-Leibler divergence is defined by Then, Pinsker's inequality asserts that: Thus, we arrive at: Hence, we conclude that for all $n$ sufficiently large: which converges to 1 as $n \to \infty$. Proposition 2 is now proved.
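Pinsker's inequality, as used in Step 2, can be sanity-checked numerically for Bernoulli laws. The sketch below assumes one common normalization, which may differ by constants from the one used in this appendix: total variation $\mathrm{TV}(P, Q) = \tfrac{1}{2}\sum_x |P(x) - Q(x)|$, divergence in nats, and the inequality in the form $D(P \| Q) \geq 2\,\mathrm{TV}(P, Q)^2$. The function names are illustrative.

```python
from math import log


def kl_bernoulli(a, b):
    """D(Ber(a) || Ber(b)) in nats, with the convention 0 log 0 = 0; requires 0 < b < 1."""
    total = 0.0
    if a > 0.0:
        total += a * log(a / b)
    if a < 1.0:
        total += (1.0 - a) * log((1.0 - a) / (1.0 - b))
    return total


def tv_bernoulli(a, b):
    """Total variation between Ber(a) and Ber(b) under the 1/2-sum normalization (assumed)."""
    return abs(a - b)


def pinsker_gap(a, b):
    """D - 2 * TV^2, which Pinsker's inequality (in the assumed form) says is nonnegative."""
    return kl_bernoulli(a, b) - 2.0 * tv_bernoulli(a, b) ** 2


if __name__ == "__main__":
    for a, b in [(0.5, 0.6), (0.1, 0.9), (0.0, 0.3)]:
        print(a, b, pinsker_gap(a, b))
```

The gap is smallest when the two parameters are close to each other and near 1/2, which is why the quadratic lower bound is a natural tool for controlling exponents of this type.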

Appendix C. Proof of Proposition 3
Denote $N_0 = N(X_1; 0)$. Let $\{p_n\}$ denote the sequence of probabilities of the events that an agent with an initial value '0' updates its value to '0' after a single round of communication.
Step 1: An Upper Bound on the PMF of the Binomial Distribution We start by upper-bounding the probability mass function (PMF) of the binomial random variable $X = \mathrm{Bin}(n, p)$, which is given by: $P\{X = \ell\} = \binom{n}{\ell} p^{\ell} (1 - p)^{n - \ell}$. (A96) To upper-bound the binomial coefficient in (A96), we invoke the following Stirling's bounds: $\sqrt{2\pi n} \cdot n^n \cdot e^{-n} \leq n! \leq e\sqrt{n} \cdot n^n \cdot e^{-n}$, (A97) and obtain the following Substituting (A103) back into (A96) yields: where $D(\alpha \| \beta)$, for $\alpha, \beta \in [0, 1]$, is defined in (1).
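The Stirling bounds quoted in Step 1 can be verified numerically, together with the PMF bound they imply. The sketch below is illustrative only. It assumes that $D(\alpha \| \beta)$ is the binary Kullback-Leibler divergence in nats, and the constant $e/(2\pi)$ is what follows from applying the upper Stirling bound to $n!$ and the lower bound to $\ell!$ and $(n - \ell)!$; the constant displayed in the appendix may be written differently.

```python
from math import comb, e, exp, lgamma, log, pi, sqrt


def stirling_log_bounds(n):
    """Return (lower, upper) bounds on log(n!) from the two-sided Stirling bounds, n >= 1."""
    core = n * log(n) - n
    lower = 0.5 * log(2.0 * pi * n) + core  # log of sqrt(2 pi n) n^n e^{-n}
    upper = 1.0 + 0.5 * log(n) + core       # log of e sqrt(n) n^n e^{-n}
    return lower, upper


def binary_kl(a, b):
    """Binary divergence D(a || b) in nats for 0 < a < 1 and 0 < b < 1 (assumed form of (1))."""
    return a * log(a / b) + (1.0 - a) * log((1.0 - a) / (1.0 - b))


def binom_pmf_point(n, k, p):
    """Exact P{Bin(n, p) = k}."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def pmf_upper_bound(n, k, p):
    """Bound implied by the Stirling bounds, valid for 1 <= k <= n - 1."""
    prefactor = (e / (2.0 * pi)) * sqrt(n / (k * (n - k)))
    return prefactor * exp(-n * binary_kl(k / n, p))


if __name__ == "__main__":
    n, p = 60, 0.3  # hypothetical values for illustration
    for k in (1, 18, 30, 59):
        print(k, binom_pmf_point(n, k, p), pmf_upper_bound(n, k, p))
```

The exponent $-n D(\ell/n \| p)$ carries the exponential decay away from $\ell = np$, while the square-root prefactor is the polynomial correction left over from the Stirling bounds.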
Step 2: The Limit of $\{p_n\}$ is $\tfrac{1}{2}$ First, we show that $\{p_n\}$ is lower-bounded by $\tfrac{1}{2}$. For $\bar{q} = 1 - q$, denote: We have that: where (A110) is true since $\mathrm{Bin}(1, \bar{q}) \leq 1$ with probability one. It follows by symmetry that or, which implies that Next, we upper-bound the sequence $\{p_n\}$. Note that: $\leq P\{\mathrm{Bin}(n, \bar{q}) + 1 \geq \mathrm{Bin}(n, \bar{q})\}$ (A120) As for the last term in (A123), we have that: where (A125) follows from the Cauchy-Schwarz inequality. Substituting (A130) back into (A123) yields that: Now, consider the following: As for the middle term in (A135), it follows from (A107) that: To upper-bound (A136) let $\epsilon_n = 1/\sqrt[4]{n}$, for $n = 1, 2, \ldots$ and define the set of numbers: whose cardinality is given by Denote $M_n = \{1, 2, \ldots, n - 1\} \cap N_n^c$. For any $\ell \in M_n$, it follows from Pinsker's inequality that: We now continue from (A136) and arrive at: where (A141) follows from (A140) and the fact that $D(\alpha \| \beta) \geq 0$ in general. The inequality in (A142) holds for the following reasons. First, the minimizers of $\ell(n - \ell)$ in $M_n$ are 1 or $n - 1$. Second, the minimizer of $\ell(n - \ell)$ in $N_n$ is the endpoint of $N_n$ which is the most distant from 1/2. For simplicity, we assumed without loss of generality that $q \in (1/2, 1)$. The passage to (A143) is due to the fact that $|M_n| \leq n - 1$ as well as (A138), and in (A144), we substituted $\epsilon_n = 1/\sqrt[4]{n}$. Denote the expression in (A144) by $G_n$ and notice that this expression converges to zero as $n \to \infty$. We substitute $G_n$ back into (A135) and then into (A131). Since $\{p_n\}$ is lower-bounded by $\tfrac{1}{2}$, we conclude that: Thus, $\{p_n\}$ converges to $\tfrac{1}{2}$ as long as $q \neq 0, 1$.
Step 3: Asymptotic Behavior of the Number of Zeros We would like to prove that the random variable $|N_0 - n|/\sqrt{n}$ is bounded away from zero with an overwhelmingly high probability at large $n$. Note that: where $I_{n,\ell} \sim \mathrm{Ber}(p_n)$ and $J_{n,\ell} \sim \mathrm{Ber}(1 - p_n)$, for all $\ell \in \{1, 2, \ldots, n\}$, and all of these binary random variables are independent. Let $\epsilon > 0$ and $\delta(\epsilon) > 0$, to be specified later, with the property that $\delta(\epsilon) \to 0$ as $\epsilon \to 0$. Consider the following: To conclude that the two normalized sums inside the probability in (A147) converge in distribution to normal random variables, we invoke the Lindeberg-Feller central limit theorem [46] (p. 116, Theorem 2.4.5). First, we introduce the concept of a "triangular array" of variables. A triangular array of random variables is of the form $\{X_{n,i}\}$, $n \geq 1$, $1 \leq i \leq n$, where for every $n$, the random variables $X_{n,1}, X_{n,2}, \ldots, X_{n,n}$ are independent, have zero mean, and have finite variance. Then, one has the following result.
Theorem A1. (Lindeberg-Feller CLT) Suppose $\{X_{n,i}\}$ is a triangular array such that: $s_n^2 = \sum_{i=1}^{n} E[X_{n,i}^2]$ and $s_n^2 \to s^2 \neq 0$. If the Lindeberg condition holds: for every $\epsilon > 0$, $\lim_{n \to \infty} \sum_{i=1}^{n} E[X_{n,i}^2 \cdot 1\{|X_{n,i}| > \epsilon\}] = 0$, (A150) then $\sum_{i=1}^{n} X_{n,i}$ converges in distribution to $\mathcal{N}(0, s^2)$.
Now, concerning the left-hand-side normalized sum inside the probability in (A147), notice that: which converges to $s^2 = \tfrac{1}{4}$ as $n \to \infty$. In addition, Lindeberg's condition in (A150) is trivially satisfied since all the random variables in our setting are bounded. Thus, it follows by the Lindeberg-Feller CLT that: From exactly the same considerations: and $X, Y$ are independent since $\{I_{n,\ell}\}$ and $\{J_{n,\ell}\}$ are all independent. We continue from (A147) and arrive at: which can obviously be satisfied by a proper choice of $\delta(\epsilon)$. We conclude that for any $\epsilon > 0$, there exists some $M(\epsilon)$, such that for all $n \geq M(\epsilon)$: which completes the proof of Proposition 3.
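The Gaussian limit used in Step 3 can be illustrated with an exact computation in a simplified setting. The sketch below assumes that $p_n$ equals its limiting value $\tfrac{1}{2}$ exactly, so that $N_0$ is distributed as $\mathrm{Bin}(2n, \tfrac{1}{2})$ and $(N_0 - n)/\sqrt{n}$ has limiting variance $\tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2}$. This simplification and the function names are assumptions made for illustration.

```python
from math import comb, erf, floor, sqrt


def prob_near_balance(n, eps):
    """Exact P{|N0 - n| <= eps * sqrt(n)} when N0 ~ Bin(2n, 1/2) (p_n = 1/2 assumed)."""
    m = floor(eps * sqrt(n))
    count = sum(comb(2 * n, k) for k in range(n - m, n + m + 1))
    return count / 2 ** (2 * n)  # exact integer ratio converted to float


def gaussian_limit(eps):
    """P{|G| <= eps} for G ~ N(0, 1/2), which equals erf(eps)."""
    return erf(eps)


if __name__ == "__main__":
    n = 2000  # hypothetical network half-size
    for eps in (0.05, 0.2):
        print(eps, prob_near_balance(n, eps), gaussian_limit(eps))
```

A small window around $n$ captures only a small fraction of the probability mass, so $|N_0 - n|/\sqrt{n}$ exceeds a small threshold with high probability, in line with the conclusion of Proposition 3.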

Appendix D. Proof of Proposition 4
Let us denote $N = N(X_1; 0)$. For any $\mu \geq 0$, it follows from Markov's inequality that: and thus, since (A161) holds for every $\mu \geq 0$, it follows that Note that: where $I_m \sim \mathrm{Ber}(p_n)$ and $J_m \sim \mathrm{Ber}(1 - p_n)$, for all $m \in \{1, 2, \ldots, n\}$, and all of these binary random variables are independent. We obtain that: $= (1 - p_n + p_n e^{\mu})^n \cdot (p_n + (1 - p_n) e^{\mu})^n$ (A167) where (A166) is due to the independence of all binary random variables and (A169) follows from the fact that the expression in (A168) is maximized for $p_n = \tfrac{1}{2}$. Substituting (A170) back into (A162) yields that: Upon defining: we find that the solution to $f'(\mu) = 0$ is given by Substituting it back into (A173) provides that: $= \exp\left\{2n \log\left(1 + \frac{B_n}{n - B_n}\right) - (n + B_n) \log\frac{n + B_n}{n - B_n}\right\}$ (A178) $= \exp\left\{2n \log\frac{n}{n - B_n} - (n + B_n) \log\frac{n + B_n}{n - B_n}\right\}$ (A179) $= \exp\left\{(n - B_n) \log\frac{n}{n - B_n} + (n + B_n) \log\frac{n}{n + B_n}\right\}$ (A180) $= \exp\left\{-n \cdot \left[\left(1 - \frac{B_n}{n}\right) \log\left(1 - \frac{B_n}{n}\right) + \left(1 + \frac{B_n}{n}\right) \log\left(1 + \frac{B_n}{n}\right)\right]\right\}$. (A181) Consider the function: $g(t) = (1 - t)\log(1 - t) + (1 + t)\log(1 + t)$, which is symmetric around $t = 0$. Its first order and second order derivatives are given by $g'(t) = \log\frac{1 + t}{1 - t}$ and $g''(t) = \frac{2}{1 - t^2} \geq 2$. Hence, since $g(0) = g'(0) = 0$, we conclude that $g(t) \geq t^2$, and thus: which completes the proof of Proposition 4.
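The chain of bounds in this proof can be checked numerically in the extremal case $p_n = \tfrac{1}{2}$, where $N$ is distributed as $\mathrm{Bin}(2n, \tfrac{1}{2})$. The sketch below is an illustration under stated assumptions: the bounded event is taken to be $\{N \geq n + B_n\}$, and $g(t) = (1 - t)\log(1 - t) + (1 + t)\log(1 + t)$ is read off from the exponent in (A181). The function names and parameter values are hypothetical.

```python
from math import comb, exp, log


def g(t):
    """g(t) = (1 - t) log(1 - t) + (1 + t) log(1 + t), for |t| < 1."""
    return (1.0 - t) * log(1.0 - t) + (1.0 + t) * log(1.0 + t)


def exact_tail(n, b):
    """Exact P{Bin(2n, 1/2) >= n + b} (p_n = 1/2 assumed)."""
    count = sum(comb(2 * n, k) for k in range(n + b, 2 * n + 1))
    return count / 2 ** (2 * n)


def chernoff_bound(n, b):
    """exp(-n g(b / n)), the optimized exponential bound, for 0 < b < n."""
    return exp(-n * g(b / n))


def simplified_bound(n, b):
    """exp(-b^2 / n), obtained from g(t) >= t^2."""
    return exp(-b * b / n)


if __name__ == "__main__":
    n = 200  # hypothetical value for illustration
    for b in (10, 40, 100):
        print(b, exact_tail(n, b), chernoff_bound(n, b), simplified_bound(n, b))
```

The three quantities are ordered as exact tail, optimized bound, simplified bound, and the last one makes the dependence on $B_n^2/n$ explicit.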

Appendix E. Proof of Proposition 5
Step 1: A Simplification for the Consensus Probability Due to symmetry, we only analyze the case $I_0 > I_1$. It follows that:
Step 2: A Lower Bound on $P\{X_1(i) = 1\}$ If an agent starts with a '0', then the probability to decide in favor of '1' is lower-bounded by: $P\{\mathrm{Bin}(n - C_n, \bar{q}) \geq \mathrm{Bin}(n + C_n - 1, \bar{q}) + 2\} \geq P\{\mathrm{Bin}(n - C_n, \bar{q}) \geq \mathrm{Bin}(n + C_n, \bar{q}) + 2\}$. (A190) If an agent starts with a '1', then the probability to decide in favor of '1' is lower-bounded by: $P\{\mathrm{Bin}(n - C_n - 1, \bar{q}) + 1 \geq \mathrm{Bin}(n + C_n, \bar{q})\} \geq P\{\mathrm{Bin}(n - C_n - 1, \bar{q}) + \mathrm{Bin}(1, \bar{q}) \geq \mathrm{Bin}(n + C_n, \bar{q})\}$ (A191) $= P\{\mathrm{Bin}(n - C_n, \bar{q}) \geq \mathrm{Bin}(n + C_n, \bar{q})\}$. (A192)
Since (A190) cannot be larger than (A192), we continue with (A190). From now on, we lower-bound the probability in (A190), to be denoted by $Q_n$. The probability in (A190) can be written explicitly as: $\sum_{\ell=0}^{n - C_n} \sum_{k=0}^{n + C_n} \binom{n - C_n}{\ell} (1 - q)^{\ell} q^{n - C_n - \ell} \binom{n + C_n}{k} (1 - q)^{k} q^{n + C_n - k} \cdot 1\{\ell \geq k + 2\}$. (A193) We continue by lower-bounding the PMF of the binomial random variable $X = \mathrm{Bin}(n, p)$, which is given by: To lower-bound the binomial coefficient in (A194), we use the Stirling's bounds in (A97) and obtain that: Substituting (A196) back into (A194) yields: where $D(\alpha \| \beta)$, for $\alpha, \beta \in [0, 1]$, is defined in (1). Substituting twice this lower bound into (A193), we arrive at: $\sqrt{\frac{n - C_n}{\ell(n - C_n - \ell)}} \exp\left\{-(n - C_n) D\left(\frac{\ell}{n - C_n} \,\Big\|\, 1 - q\right)\right\} \times \sqrt{\frac{n + C_n}{k(n + C_n - k)}} \exp\left\{-(n + C_n) D\left(\frac{k}{n + C_n} \,\Big\|\, 1 - q\right)\right\}$.