On the Distributed Construction of Stable Networks in Polylogarithmic Parallel Time

We study the class of networks which can be created in polylogarithmic parallel time by network constructors: groups of anonymous agents that interact randomly under a uniform random scheduler and can form connections between each other. Starting from an empty network, the goal is to construct a stable network which belongs to a given family. We prove that the class of trees where each node has at most k children, for any k ≥ 2, can be constructed in O(log n) parallel time with high probability. We show that constructing k-regular networks requires Ω(n) time, but that a minimal relaxation to (l, k)-regular networks, where l = k − 1, can be constructed in polylogarithmic parallel time for any fixed k > 2. We further demonstrate that when the finite-state assumption is relaxed and k is allowed to grow with n, then k = log log n acts as a threshold above which network construction is again polynomial time. We use this to provide a partial characterisation of the class of polylogarithmic time network constructors.


Introduction
Passively dynamic networks are an important type of dynamic networks in which the network dynamics are external to the algorithm and are a property of the environment in which a given system operates. Wireless sensor networks in which individual sensors are carried by autonomous entities, such as animals, or are deployed in a dynamic environment such as the flow of a river are examples of passively dynamic networks. In terms of modelling such systems, the network dynamics are usually assumed to be controlled by an adversary scheduler, who has exclusive control on the interaction or communication sequence among the computational entities.
One line of research has been assuming the scheduler to be fair, in the sense that it cannot forever conceal potentially reachable configurations of the system. This sub-type of passively dynamic networks is known as population protocols and was introduced in the seminal paper of Angluin et al. [AAD+06]. A type of fair scheduler which is typically assumed when the running time of protocols is to be analysed is the uniform random scheduler, which in every discrete step selects equiprobably a pair of entities to interact from all permissible pairs of entities. Traditionally, the population protocols literature had been considering extremely weak entities and the goal was to reveal the computational possibilities and limitations under such a challenging interaction scheme. Recent progress has been highlighting the interesting trade-offs between local space of the entities and the running time of protocols, showing among other things that very fast running times (where fast is here considered to be anything growing as polylog(n), n being the total number of entities in the system) can be achieved for a wide range of basic distributed tasks if the entities are equipped with as few states as polylog(n). Alistarh and Gelashvili [DA15] proposed the first sub-linear leader election protocol, which stabilizes in O(log^3 n) parallel time, assuming O(log^3 n) states at each agent. Gasieniec and Stachowiak [GS18] designed a space optimal (O(log log n) states) leader election protocol, which stabilises in O(log^2 n) parallel time. General characterizations, including upper and lower bounds of the trade-offs between time and space in population protocols, are provided in [DA17]. Doty et al. [DEM+18] show that a state count of O(n^60) enables fast and exact population counting.
Another line has been considering worst-case adversary schedulers, which may even be aware of the protocol and trying to optimise against it. There, the entities are typically assumed to be powerful, like processors of traditional distributed systems, and the only restrictions imposed on the scheduler are instantaneous or temporal connectivity restrictions which essentially do not allow the scheduler to forever block communication between any two parts of the system. This was initiated by O'Dell and Wattenhofer [OW05] for the asynchronous case and then the synchronous case was extensively studied in a series of papers by Kuhn et al. [KLO10]. Michail et al. [MCS12] extended this to the case of possibly disconnected dynamic networks, in which connectivity is only guaranteed in a temporal sense.
The other main type of dynamic networks, with respect to who controls the changes in the network topology, are actively dynamic networks. In such networks, the algorithm is able to either implicitly change the sequence of interactions by controlling the mobility of the entities or explicitly modify the network structure by creating and destroying communication links at will. This is for example the subject of the area of overlay network construction [AAC+05, AS07, AW07, GHS19] and very recently Michail et al. introduced a fully distributed model for computation and reconfiguration in actively dynamic networks [MSS20].
An interesting alternative family of dynamic networks arises when one considers a mixture of the passive network dynamics of the environment and the active dynamics resulting from an algorithm that can partially control the network changes or that can fix network structures that the environment is unable to affect. This is naturally motivated by molecular interactions where, for example, proteins can bind to each other, forming structures and maintaining their stability despite the dynamicity of the solution in which they reside. Michail and Spirakis [MS16] introduced and studied such an abstract model of distributed network construction, called the network constructors model, where the network dynamicity is the same as in population protocols but now the finite-state entities can additionally activate and deactivate pairwise connections upon their interactions. It was shown that very complex global networks can be formed stably despite the dynamicity of the environment. Then Michail [Mic18] studied a geometric variant of network constructors, in which the entities can only form geometrically constrained shapes in 2D or 3D space. Another interesting hybrid dynamic network model is the one by Gmyr et al. [GHSS17], in which the entities have partial control over the connections of an otherwise worst-case passively dynamic network, following the model of Kuhn et al. [KLO10].

Our Approach
We investigate which families of networks can be stably constructed by a distributed computing system in polylogarithmic parallel time. To our knowledge, this is the first attempt made to approach this task.
Our protocols assume the existence of a leader node. A node x is a leader node if in the initial configuration all u ∈ V \ {x}, where V is the set of all nodes, are in state q_0 and x is in state s ≠ q_0.
We first study the k-Children Spanning Tree problem, where the goal is to construct a tree where each node has at most k ≥ 2 children. We show that it is possible to solve this problem for any k in O(log n) time with high probability. We then show that network constructors which create k-regular graphs necessarily take Ω(n) time. However, with a minimal relaxation to (k − 1, k)-regular networks the problem can be solved for any constant k ≥ 2 in polylogarithmic time. We examine this as a special case of the (l, k)-Regular Network problem, where the goal is to construct a spanning network in which every node has at least l < k and at most k connections, where 2 < k < n. We then turn to an experimental analysis of the protocol, which not only provides evidence of the sharp contrast created by the minimal relaxation but also reveals a threshold value for k beyond which the problem reverts to polynomial time. We use this knowledge to propose a first partial characterisation of the set of polylogarithmic time network constructors. We leave providing formal bounds as an open problem, with a potential proof strategy provided in the Appendix.
In Section 2, we formally define the model of network constructors and the network construction problems that are considered in this work. In Section 3, we study the k-children spanning tree problem, first for k = 2, and then for k ≥ 2. In Section 4, we first provide the lower bound for k-regular networks. We then present a protocol for the (l, k)-regular network problem and our experimental analysis culminating in the partial characterisation. In Section 5, we conclude and give further research directions that are opened by our work.

Preliminaries and Definitions
2.1 The model
Definition 1. A Network Constructor (NET) is a distributed protocol defined by a 4-tuple (Q, q_0, Q_out, δ), where Q is a finite set of node-states, q_0 ∈ Q is the initial node-state, Q_out ⊆ Q is the set of output node-states, and δ : Q × Q × {0, 1} → Q × Q × {0, 1} is the transition function.
When we present the transition function of a protocol we only present the effective transitions. Additionally, we agree that the size of a protocol is the number of its states, i.e., |Q|.
The system consists of a population V_I of n distributed processes (called nodes for the rest of this paper). In the generic case, there is an underlying interaction graph G_I = (V_I, E_I) specifying the permissible interactions between the nodes. Interactions in this model are always pairwise. In this work, G_I is a complete undirected interaction graph, i.e., E_I = {uv : u, v ∈ V_I and u ≠ v}, where uv = {u, v}. Initially, all nodes in V_I are in the initial node-state q_0. A central assumption of the model is that edges have binary states. An edge in state 0 is said to be inactive while an edge in state 1 is said to be active. All edges are initially inactive. Execution of the protocol proceeds in discrete steps. In every step, a pair of nodes uv from E_I is selected by an adversary scheduler and these nodes interact and update their states and the state of the edge joining them according to the transition function δ.
A configuration is a mapping C : V_I ∪ E_I → Q ∪ {0, 1} specifying the state of each node and each edge of the interaction graph. Let C and C′ be configurations, and let u, υ be distinct nodes. We say that C goes to C′ via encounter e = uυ if C′ is obtained from C by applying δ to the states of u, υ and e while leaving every other node and edge unchanged; we write C → C′ if C goes to C′ via some encounter. An execution is a finite or infinite sequence of configurations C_0, C_1, C_2, ..., where C_0 is an initial configuration and C_i → C_{i+1}, for all i ≥ 0. A fairness condition is imposed on the adversary to ensure the protocol makes progress. An infinite execution is fair if for every pair of configurations C and C′ such that C → C′, if C occurs infinitely often in the execution then so does C′. In what follows, every execution of a NET is by definition considered to be fair.
We define the output of a configuration C as the graph G(C) = (V, E), where V = {u ∈ V_I : C(u) ∈ Q_out} and E = {uv : u, v ∈ V, u ≠ v, and C(uv) = 1}. In words, the output-graph of a configuration consists of those nodes that are in output states and those edges between them that are active, i.e., the active subgraph induced by the nodes that are in output states. The output of an execution C_0, C_1, ... is said to stabilize (or converge) to a graph G if there exists some step t ≥ 0 such that (abbreviated s.t. in several places) G(C_i) = G for all i ≥ t, i.e., from step t and onwards the output-graph remains unchanged. Every such configuration C_i, for i ≥ t, is called output-stable. The running time (or time to convergence) of an execution is defined as the minimum such t (or ∞ if no such t exists). Throughout the paper, whenever we study the running time of a NET, we assume that interactions are chosen by a uniform random scheduler which, in every step, selects independently and uniformly at random one of the |E_I| = n(n − 1)/2 possible interactions. In this case, the running time becomes a random variable (abbreviated r.v. throughout) X and our goal is to obtain bounds on the expectation E[X] of X. Note that the uniform random scheduler is fair with probability 1.
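The output-graph definition can be illustrated with a minimal Python sketch. The dictionary-based representation of a configuration and the name output_graph are assumptions made for this illustration only and are not taken from the paper.

```python
def output_graph(node_state, edge_state, q_out):
    """Return the output graph (nodes, edges) of a configuration.

    node_state: dict mapping node -> state (hypothetical representation)
    edge_state: dict mapping frozenset({u, v}) -> 0 (inactive) or 1 (active)
    q_out:      set of output node-states
    """
    # Nodes whose current state is an output state.
    nodes = {u for u, q in node_state.items() if q in q_out}
    # Active edges with both endpoints among the output nodes.
    edges = {e for e, s in edge_state.items() if s == 1 and e <= nodes}
    return nodes, edges
```

An active edge is only reported if both of its endpoints are in output states, which matches the "active subgraph induced by the output nodes" reading of the definition.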
In this work time is treated as sequential in our analyses, i.e., a time-step consists of a single interaction selected by the scheduler. Such a sequential estimate can be easily translated to some estimate of parallel time. For example, assuming that Θ(n) interactions occur in parallel in every step, one could obtain an estimation of parallel time by dividing sequential time by n. All results are given in parallel time.
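As a small illustration of the scheduler and of the conversion from sequential to parallel time, the following Python sketch may help. The function names are hypothetical, and the division by n follows the Θ(n)-interactions-per-step assumption stated above.

```python
import random


def random_interaction(n, rng):
    """Select one of the n(n-1)/2 unordered node pairs uniformly at random."""
    # Sampling two distinct nodes without replacement is uniform over
    # ordered pairs, hence also uniform over unordered pairs.
    u, v = rng.sample(range(n), 2)
    return u, v


def parallel_time(sequential_steps, n):
    """Estimate parallel time by dividing sequential time by n."""
    return sequential_steps / n
```

For instance, 1000 sequential interactions in a population of 100 nodes correspond to an estimated parallel time of 10.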
Definition 2. We say that an execution of a NET on n nodes constructs a graph (or network ) G, if its output stabilizes to a graph isomorphic to G.
Definition 3. We say that a protocol P constructs a graph language 𝒢 if every execution of P constructs some graph G ∈ 𝒢 and, for every G ∈ 𝒢, there exists an execution of P which constructs G.

Problem definitions
Here we provide formal definitions for all of the classes of networks considered in this paper.
k-Children Spanning Tree. The goal is to construct a spanning tree where each node has at most k ∈ N children.
(l, k)-Regular Network. The goal is to construct a spanning network in which, for given l, k ∈ N with l < k, the nodes with degree d < l form a clique and all other nodes have a degree of at least l and at most k.
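The degree conditions of the (l, k)-Regular Network definition can be checked mechanically. The Python sketch below is an illustration under assumptions: the adjacency-dictionary representation and the name is_lk_regular are hypothetical, and the check covers only the degree and clique conditions, not connectivity.

```python
from itertools import combinations


def is_lk_regular(adj, l, k):
    """Check the (l, k)-regular degree conditions on an undirected graph.

    adj: dict mapping each node to the set of its neighbours.
    """
    # Nodes whose degree is below l must be pairwise adjacent (a clique).
    low = [u for u in adj if len(adj[u]) < l]
    if any(v not in adj[u] for u, v in combinations(low, 2)):
        return False
    # Every remaining node has degree >= l already; all degrees must be <= k.
    return all(len(adj[u]) <= k for u in adj)
```

A 4-cycle satisfies the conditions for (l, k) = (2, 3), while a path on three nodes does not, because its two endpoints have degree 1 < 2 and are not adjacent.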

Experimental Setup
We performed experiments with the goal of guiding a proof of the running time necessary to solve the (l, k)-regular network problem. We learned that a formal proof would be difficult because the random variables involved depend on the values of other random variables, so we leave this as an open problem. We then experimented with different values of k to see what the effect would be, and discovered a running time threshold in the process. All protocols were implemented in C and compiled with GCC. All tests were repeated at least five times per value of n, and the average number of time steps was taken as the result. To terminate our experiments we designed special stabilisation conditions. Details, including a formal proof of correctness, can be found in the Appendix.

Polylogarithmic-time Protocols for k-Children Spanning Tree
In this section, we study the complexity of the k-Children Spanning Tree problem. We first focus on the special case where k = 2 and give a protocol (Protocol 1). We show that it has a running time of O(log n) parallel time with high probability. Finally, we generalise to all k ≥ 2 by giving a protocol (Protocol 2) and prove that the running time is again O(log n). In the protocol listings, all transitions that do not appear have no effect.

2-Children Spanning
In Protocol 1, the F state corresponds to a node which is not a member of the tree. L_i corresponds to the leader node, which acts as the root of the tree, and O_i to non-leader nodes in the tree, where i represents the number of children of a given node. We assume that for every execution of Protocol 1 on a population P of n nodes, n − 1 nodes initialise to the state F and one node initialises to the state L_0.
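To make the state descriptions concrete, here is a minimal Python simulation sketch. It assumes a rule set inferred from the descriptions above rather than copied from a listing: when a tree node with fewer than k children (state L_i or O_i with i < k) meets a node in state F, the edge between them is activated, the tree node's counter increases by one, and the free node joins the tree with zero children. The function name, the seed parameter and the stopping rule (stop once all nodes have joined) are likewise assumptions for illustration.

```python
import random


def simulate_children_tree(n, k=2, seed=0):
    """Simulate the assumed tree-building rules under a uniform random scheduler.

    Returns (parent, parallel_time), where parent[v] is the tree parent of v
    (None for the leader, node 0) and parallel_time = interactions / n.
    """
    rng = random.Random(seed)
    children = [0] * n          # number of children of each node
    in_tree = [False] * n
    in_tree[0] = True           # node 0 plays the leader (state L_0)
    parent = [None] * n
    joined, steps = 1, 0
    while joined < n:
        u, v = rng.sample(range(n), 2)   # one uniformly random interaction
        steps += 1
        if in_tree[u] == in_tree[v]:
            continue            # both in the tree or both free: no effect
        if in_tree[v]:
            u, v = v, u         # make u the tree node and v the free node
        if children[u] < k:     # u still has an open slot
            children[u] += 1
            parent[v] = u
            in_tree[v] = True
            joined += 1
    return parent, steps / n
```

Calling simulate_children_tree(n, k=2) for increasing n and averaging over several seeds gives a rough empirical picture of how the parallel time grows; with larger k the same function covers the generalised setting.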
Lemma 1. Protocol 1 stably constructs the graph language T = {G | G is a tree and ∀u ∈ P : Δ^+(u) ≤ 2} in O(log n) parallel time, where Δ^+(u) denotes the number of children of the node u.
Proof. A full proof is given in the Appendix.
Lemma 2. For each time step in Protocol 1, the probability of any node in the set of unconnected nodes U connecting to the tree is at least 2|U|/n.
Proof. Assume there are |S| nodes which are connected to the tree. The probability of a node x ∈ U connecting to the tree is |S||U|/n^2. If there are at least n/2 nodes connected to the tree, then |S||U| The case where there are fewer than n/2 nodes connected to the tree is symmetrical, meaning that the same process happens in reverse for 1 ≤ n ≤ n/2. Therefore 2|U|/n is a lower bound on the probability of connecting to the tree.
Proof. By application of Lemmas 1 and 3.

k-Children Spanning Tree
We now consider the problem of constructing the graph language T_k = {G | G is a rooted tree and ∀u ∈ P : Δ^+(u) ≤ k}. Protocol 2 operates in the same way as Protocol 1 but relaxes the finite-state restriction to provide states and rules for all i ≤ k, where k ≥ 2.
Proof. We observe that the number of open slots o is initially k. Moreover, o is non-decreasing: every increase in Δ^+(u) for some u necessarily increases |V(S)|, so each node that joins S fills one slot and contributes k new ones. Since there are always open slots available, every unconnected node is guaranteed to be able to connect to S at some point. Therefore, when S stabilises it will contain all u ∈ P.
Lemma 5. For all executions of Protocol 2 on the population P of n nodes, it stabilizes to some G ∈ T_k where |V(G)| = n.
Proof. We prove this via an induction on the connected component S. For the base case, there is one node in the state L_0. This is trivially a member of T_k as no connections have formed yet. We now assume that there is a connected component of size |S| which is a member of T_k. For a connected component of size |S| + 1, an unconnected node u ∈ V \ S in the state F must connect to S at some node x ∈ S. By Lemma 3, such a node must exist. If the node x has k children it is in the state O_k or L_k, as for all nodes in states O_i and L_j the i and j correspond to the number of children of those nodes. Since there are no defined transitions from these states, no u can connect to x. Therefore S remains a tree and G(S) ∈ T_k.
Lemma 6. For all G ∈ T_k, there is an execution of Protocol 2 which stabilises on G when starting on a population P of size n = |V(G)|.
Proof. We first set the value of k to the maximum number of children of any node in the tree. Let the leader node l in the population P correspond to the root r of G. If r has i children, connect i nodes in the state F to l. For each child c of the leader node, let it correspond to a child d of r. If d has j children, connect j nodes in the state F to c. Continuing this process for all nodes u ∈ G, the result is a spanning tree in which every node corresponds to some u ∈ G.
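The constructive argument can be sketched as a procedure that turns a target tree into a sequence of parent-child interactions in breadth-first order. The Python below is an illustrative sketch only; the dictionary representation of the tree and the names required_k and schedule_for_tree are assumptions.

```python
from collections import deque


def required_k(children):
    """Maximum number of children of any node in the target tree."""
    return max((len(cs) for cs in children.values()), default=0)


def schedule_for_tree(children, root):
    """List the (parent, child) interactions that build the target tree.

    children: dict mapping a node to the list of its children.
    The leader corresponds to the root; nodes are attached level by level.
    """
    order = []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for c in children.get(u, []):
            order.append((u, c))   # a free node is connected to u as its child
            queue.append(c)
    return order
```

Each pair in the returned list stands for one effective interaction between a node already in the tree and a node in state F, so a scheduler that selects exactly these pairs in order produces the target tree.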
Theorem 2. Protocol 2 stably constructs the graph language T_k in O(log n) time w.h.p.
Proof. By application of the Lemmas above. Protocol 2 can only be faster than Protocol 1 as it has more open slots per node.

Time Thresholds for (l, k)-Regular Networks
In this section, we present our solution for the (l, k)-Regular Network problem for l = k − 1, the Cross-edges Tree protocol. We first show that a k-regular network, defined as a network where each node has degree exactly equal to k, cannot be constructed in polylogarithmic time. We then show via experimental analysis that this impossibility result does not hold for the minimal relaxation of (l, k)-Regular Networks when k is a constant and l = k − 1. Finally, we demonstrate that when k exceeds the threshold of log log n, the protocol itself is no longer in the polylogarithmic time class. Note that from now on k refers to the degree of a node, not the number of children.
Theorem 3. Any protocol which constructs a k-regular network where k < n has a running time of Ω(n).
Proof. Consider the population P of size n using a generic k-regular network construction protocol X. The number of connections is limited by k to kn/2, as this is less than the maximum of n(n − 1)/2 for n nodes. The population initially has kn network connection entry points which can be used to make new connections and which decrease by 2 for every connection made. Since kn ≤ n(n − 1), at some point in the execution there must be two nodes with 1 unused entry point each. Using these points and stabilising the protocol means both nodes must be selected by the scheduler in the same interaction, an event with probability 1/n^2. Since an event with probability 1/n^2 is unavoidable, the protocol X must construct a network in at least Ω(n^2) interactions, i.e., Ω(n) parallel time.
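The arithmetic behind the final step can be made explicit with a short Python sketch. It assumes the uniform random scheduler described earlier, under which one specific pair is selected with probability 2/(n(n − 1)) per interaction; the function name is hypothetical.

```python
def expected_wait_for_pair(n):
    """Expected waiting time until one specific pair of nodes interacts.

    Returns (sequential_steps, parallel_time).
    """
    pairs = n * (n - 1) // 2       # number of possible interactions
    sequential = pairs             # mean of a geometric r.v. with p = 1/pairs
    return sequential, sequential / n
```

For n = 100 this gives 4950 expected interactions, or a parallel time of 49.5, which grows linearly in n.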

Protocol 3 Cross-edges Tree
The Cross-edges Tree protocol adds additional rules allowing leaves within a tree to connect to other nodes within the tree as though they are candidates for becoming children.
We now provide the results of simulating the protocol for k = 3. We used the same conditions as in the other running time experiments, executing the protocol 10 times for each population size n, where n = 10 + 6t, where t is the test number from 0 to 199.
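The experimental schedule described here can be sketched as follows. This is an assumed harness in Python for illustration, not the simulator used for the reported results; run_once stands for any hypothetical function that simulates the protocol once for a given n and seed and returns its running time.

```python
from statistics import mean


def population_sizes(tests=200, start=10, step=6):
    """Population sizes n = start + step * t for t = 0, ..., tests - 1."""
    return [start + step * t for t in range(tests)]


def average_running_time(run_once, n, repeats=10):
    """Average the running time of `repeats` independent runs on n nodes."""
    return mean(run_once(n, seed) for seed in range(repeats))
```

With the default parameters the population sizes range from 10 to 1204.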
The running time is difficult to prove formally. This is because random variables are used which represent the number of nodes with a given degree in a given time step. Their values depend on the values of all random variables in the previous time step. We therefore turn our focus to experiments based on measuring the impact of the value of k on the running time of the protocol.
We have measured the running time of our Cross-edges Tree protocol for different network sizes. The results show that a higher value of k has little effect on the running time until k exceeds log log n.
To investigate why the protocol slows down dramatically after this point, we ran experiments where we stored the number of nodes with specific degrees in each time step. We executed the protocol with 200 nodes, and ran 10 iterations. The degrees tracked were 0, 1, k/2, k − 1, and k. The results show that the cause seems to be a large reduction in the number of nodes which are in the k − 1 state as k grows as a fraction of n. They suggest that when the fraction of k − 1 nodes is below some fraction between 1/4 and 1/8 of the total, the protocol slows down and enters the class of protocols with polynomial time.

Conclusions
There are a number of open problems to be addressed. The most important is to develop an exact characterisation of the class of networks which can be constructed in polylogarithmic parallel time. However, there are other, more immediate problems. For example, we have yet to investigate the effect that widening the difference between k and l will have on the protocol. We speculate that this will result in a faster running time in exchange for less uniformity within the resulting spanning network. We have also speculated about the possibilities of using a leaderless version of our Cross-edges Tree protocol. We believe that such a protocol may offer a trade-off between running time and the possibility of forming networks which are spanning, depending on the values of k and l.

C Proof of correctness for 2-Slot Protocol
Let T = {G | G is a tree and ∀u ∈ P : Δ^+(u) ≤ 2}, where Δ^+(u) is defined as the number of children of the node u. The number of open slots o is initially 2, as there is one node in S with no children. Moreover, o is non-decreasing, as every increase in Δ^+(u) for some u necessarily increases |V(S)|. Since there are always open slots available, every unconnected node is guaranteed to be able to connect to S at some point. Therefore, when S stabilises it will contain all u ∈ P.
A node is available if it has at least 1 open slot.
Lemma 10. For all executions of Protocol 1 on the population P of n nodes, it stabilises to some G ∈ T where |V(G)| = n.
Proof. We prove this via an induction on the connected component S. For the base case, there is one node in the state L_0. This is trivially a member of T as no connections have formed yet. We now assume that there is a connected component of size |S| which is a member of T. For a connected component of size |S| + 1, an unconnected node u ∈ V \ S in the state F must connect to S at some node x ∈ S. By Lemma 1, such a node must exist. If the node x has two children it is in the state O_2 or L_2, as for all nodes in states O_i and L_j the i and j correspond to the number of children of those nodes. Since there are no defined transitions from these states, no u can connect to x. Therefore S remains a tree and G(S) ∈ T.
Lemma 11. For all G ∈ T, there is an execution of Protocol 1 which stabilises on G when starting on a population P of size n = |V(G)|.
Proof. We prove this by providing a method to construct any G ∈ T with Protocol 1. Let the leader node l in the population P correspond to the root r of G. If r has i children, connect i nodes in the state F to l. For each child c of the leader node, let it correspond to a child d of r. If d has j children, connect j nodes in the state F to c. Continuing this process for all nodes u ∈ G, the result is a spanning tree in which every node corresponds to some u ∈ G.
Theorem 5. Protocol 1 stably constructs the graph language T .
Proof. By application of the Lemmas above.

D Expected Running Time of 2-Children Spanning Tree
Lemma 12. Let T ∈ T be a tree on n nodes. The number of available nodes is α(T) = ⌊|T|/2⌋ + 1.
Proof. Observe that for T, every second node which connects to T keeps the number of available nodes the same. This is because two new nodes must become children of the same node, and the second new node takes the second slot. For the base case, n = 1 and α = 0 + 1 = 1. For the inductive step n′ = n + 1 we consider two cases: n is even and n is odd. If n is even, then α = n/2 + 1. Then for n′ = n + 1, α = ⌊(n + 1)/2⌋ + 1 = n/2 + 1. This corresponds to the earlier observation that every other node (i.e., when n′ is odd) should not increase α. If n is odd, then α = ⌊n/2⌋ + 1 = (n − 1)/2 + 1. Then for n′ = n + 1, α = (n + 1)/2 + 1 = n′/2 + 1, as expected.
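The count in Lemma 12 can be checked numerically with a small Python sketch. It assumes that the division in α is integer division (as the base case α = 0 + 1 for n = 1 suggests) and that new nodes are attached so that at most one node has exactly one child at any time; the function name is hypothetical.

```python
def available_after_each_join(n_max):
    """Number of available nodes (fewer than 2 children) for n = 1..n_max."""
    children = [0]      # the root alone, with no children
    counts = [1]        # alpha for n = 1
    for _ in range(n_max - 1):
        # Prefer a node that already has exactly one child, so that its
        # second slot is filled before a new node starts receiving children.
        target = next((i for i, c in enumerate(children) if c == 1), None)
        if target is None:
            target = next(i for i, c in enumerate(children) if c == 0)
        children[target] += 1
        children.append(0)  # the new node joins as a leaf
        counts.append(sum(c < 2 for c in children))
    return counts
```

Under these assumptions the returned list agrees with n // 2 + 1 for every n, so the count increases only on every second join.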
Remark: At any point during the execution of A, for the connected component S, G(S) ∈ α(T ).
Let the probabilistic process P be an execution of Protocol 1 with the following scheduling restriction: if at any point during the execution of A two nodes x and y have exactly one child each, disconnect the child of x or y which is a leaf and connect it to the other node. If both are leaves, pick one at random.
Lemma 13. The expected time to convergence of the probabilistic process P is O(log n).
Proof. Let the r.v. X be the number of steps until convergence. A step is successful if any unconnected node joins the connected component S.
An epoch i is the period beginning with the step following the (i − 1)st success and ending with the step at which the i-th success occurs. The r.v. X_i, 1 ≤ i ≤ n − 1, is the number of steps in epoch i.
Proof. Assume there is an execution of A which has a slower running time than P. Such an execution must have a lower number of available nodes at some point than P. If the execution simulates the scheduling restriction of P then it cannot be slower than P. If the execution does not simulate the restriction then at some point two nodes have two leaves and one is not shifted to the other. The number of available nodes is therefore greater by one and the expected running time faster than P. Therefore any execution of A must be at least as fast as P.
Theorem 6. The expected running time of Protocol 1 is upper bounded by the O(log n) running time of P.
Proof. By application of Lemmas 13 and 14.

E k = 3 Formal Proof Strategy
To investigate why the running time of the protocol is polylogarithmic, we modified the simulator to perform a single test and output the degree of each node for every time step. Based on this, we have created the following strategy for proving that the running time of Protocol 3 for k = 3 is polylogarithmic.
The proof is divided into two phases. In phase 1, the number of nodes with degree 0 is large, but after some time it will be small. It can be shown that in the time it takes for this to happen, the number of nodes with degree 2 is at least some fraction of n with high probability, perhaps n/A for some constant A. In phase 2, which begins when the number of degree 0 nodes is small, it can be shown that the number of degree 2 nodes remains at least n/A w.h.p. and that this allows the protocol to stabilise in polylogarithmic time for arbitrarily low numbers of degree 0/1 nodes.