Probability Models of Distributed Proof Generation for zk-SNARK-Based Blockchains

Abstract: The paper is devoted to the investigation of the distributed proof generation process that makes use of recursive zk-SNARKs. Such distributed proof generation, where recursive zk-SNARK proofs are organized in perfect Merkle trees, was first proposed in the Latus consensus protocol for zk-SNARK-based sidechains. We consider two models of such a proof generation process: a simplified one, where all proofs are independent (like one level of the tree), and its natural generalization, where the proofs are organized in a partially ordered set (poset) according to the tree structure. Using discrete Markov chains to model the corresponding proof generation process, we obtain recurrent formulas for the expectation and variance of the number of steps needed to generate a certain number of independent proofs by a given number of provers. We asymptotically represent the expectation as a function of the single variable n/m, where m is the number of provers and n is the number of proofs (leaves of the tree). Using the results obtained, we give numerical recommendations about the number of transactions that should be included in the current block, depending on network parameters such as time slot duration, number of provers, time needed for proof generation, etc.


Introduction
Sidechains (SCs) [1–4], as well as some similar tools such as [5,6], are a convenient and promising instrument in modern blockchains. They may be considered as an adjunct to a blockchain that provides additional features not implemented in the initial blockchain (which is also called the mainchain, not to be confused with a sidechain).
Generally speaking, SCs may use an arbitrary consensus protocol with proven security, taking into account the conditions under which its security was proved [7–10]. In what follows we consider only SCs based on the Latus consensus protocol [11], which is a hybrid PoS protocol based on Ouroboros Praos [12], with an additional binding to a PoW mainchain (MC). Binding to the MC is a necessary requirement for SCs [1–4,11] to guarantee such blockchain properties as liveness and persistence [13]. To ensure the security of transactions in an SC, some information should be regularly sent from the SC to the MC. One MC may have a plethora of SCs, so the volume of information sent should be reduced, but in such a way that does not increase risks for the SC. In the Latus consensus, this information contains a series of recursive zk-SNARK proofs [14,15] that establish decentralized and verifiable cross-chain data transfers.
The internal nodes of the proof tree contain so-called "merge" proofs [4], each of which proves the correctness of the two proofs in its child nodes. Therefore, the proof in the root node proves the correctness of the transitions between the UTXO states corresponding to the whole block.
All zk-SNARK proofs for the proof tree are constructed distributively by provers. Each prover who creates a zk-SNARK proof assigns prices for his proofs within some interval or set defined at the end of the previous epoch. If there is more than one proof for some node of the proof tree, the blockforger chooses the cheapest one. Under these conditions, the mutual activity of the blockforger and the provers should provide efficient and stable functioning of the sidechain.
This work is a revised, corrected, and extended version of the conference thesis [30]. It contains results that describe and explain the functioning of SCs, and first of all the blockforger's and provers' behavior, using probability theory and combinatorial apparatus. We use Markov chains for modeling the distributed proof generation process in zk-SNARK-based blockchains. The main purposes of our research are:
• To estimate the number of steps (or to find its expectation and variance) needed to build a complete set of zk-SNARK proofs for the base assertions corresponding to the transactions that the blockforger includes in the block he creates;
• Using these results, to recommend the maximal number of transactions that the blockforger should include in the block, to guarantee that the corresponding proof tree will be created with high probability during one time slot.
We consider two different models, which correspond to two types of proof construction. The first model describes the simpler case, when all the proofs are built independently (like one level of the proof tree). The second model investigates a more complicated problem, when the proofs are located at nodes on different levels of the proof tree. Such a set of proofs carries a natural partial order, because the proofs from an upper level of the tree may be constructed only when the proofs from the previous levels have been constructed.
The paper is organized as follows. At the beginning of Section 2 we give some preliminary information from combinatorics, probability theory, and the technique of Markov chains, which is necessary for further research. The notion of lumping for Markov chains is a special case of the general idea of factorization for mathematical structures. Unfortunately, only a small part of textbooks pays attention to this concept. Our point of view here is that a problem can be described by several Markov chains with different levels of factorization, depending on how many details we want to know at the moment. In Section 2.3 we illustrate this idea with the example of the coupon collector's problem.
Then, in Section 3, we analyse the number of steps needed to construct a complete set of proofs which are the leaves of the unproved part of the proof tree. In this case they may be generated independently and simultaneously. We give a series of examples of different stochastic models which are helpful in our research. We prove that the two models described in Examples 5 and 6 are stochastically equivalent, although the first one was initially formulated as non-Markovian, while the second one was formulated in terms of a Markov chain. We study the lumped form of this model in Example 7. Using this technique, we obtain recurrent formulas for the expectation and variance of the number of steps, depending on the number of provers m and the number of leaves n, and then asymptotically reduce the expectation to a function h of the single parameter n/m and describe its behavior.
Finally, in Section 4, we study the process of proof creation for the entire perfect binary tree and show that for this construction it is convenient to generalize the previously investigated models to the case of a partially ordered set. Some useful insights emerged from this generalization, such as a more appropriate probability distribution on poset items. We conclude the article with Section 4.6, which contains numerical results regarding the number of transactions that the blockforger should include in the current block. Such a number depends on the network parameters, such as time slot duration, number of active provers, time needed for proof generation, and so on. We present a few tables with these recommended numbers of transactions for different preset probabilities of successful block generation.

Preliminaries
Here we provide the necessary facts about lumping for Markov chains and describe the Markov chain corresponding to the coupon collector problem as the result of two subsequent lumping constructions. This technique and these examples are important for our main models. Notation 1. For the cardinality of a finite set S, we use two notations (depending on convenience): |S| or #S. Notation 2. For a non-negative integer m, by the corresponding boldface letter m we denote (depending on context) the totally ordered poset {1 < 2 < · · · < m} or its underlying set {1, 2, . . . , m}.

Stirling Numbers of the Second Kind
The "twelvefold way" of combinatorics ([31], 1.9) counts the number of mappings (injections, surjections, or all possible) between two finite sets, distinguishing or not distinguishing the elements in each of them. For example, the symmetric groups S_m and S_n act on the set Sur(m, n) of surjections m ↠ n via pre- and post-composition, respectively. The Stirling partition numbers (or Stirling numbers of the second kind) { m n } can be defined as the number of orbits of the S_n-action: this is the number of partitions of m labeled elements into n non-empty unlabelled blocks, or the number of ways to nest m Matryoshka dolls so that one can still see n of them (matryoshkas are linear, ordered by size). The action of S_n on Sur(m, n) is free, so |Sur(m, n)| = n!{ m n }.
On the other hand, given a surjection π : m ↠ n, the elements of the orbit π ◦ S_m are identified with the cosets from S_m / St_π = S_m /(S_{m_1} × · · · × S_{m_n}), where m_i = #π^{−1}(i).
Multiplying both parts of (2) by z^m/m! and taking the sum over m, one can obtain the exponential generating function for n!{ m n }: ∑_{m ≥ n} n!{ m n } z^m/m! = (e^z − 1)^n. Each map f : m → n is factorised as f = (m ↠ Im f ↪ n). Then the total number of functions m → n is n^m = ∑_k ( n k ) k! { m k } = ∑_k { m k } (n)_k, where (n)_k = n(n − 1) · · · (n − k + 1) is the falling factorial. One can consider (4) as an identity between integer polynomials in a free variable n. So we get an alternative definition of the Stirling numbers as the coefficients of the transition matrix between two polynomial bases.
Möbius inversion ([31], 3.7) in the case of the power-set P(m) admits a simpler formulation as the inclusion–exclusion principle ([31], 2.1). It allows, conversely to (4), to express the number of surjections in terms of the numbers of all functions: n!{ m n } = ∑_{r=0}^{n} (−1)^r ( n r ) (n − r)^m. The (forward) difference operator acts on numerical sequences (x_k) as ∆ : x_k ↦ x_{k+1} − x_k. Its powers are expressed by the binomial formula ∆^n x_k = ∑_{r=0}^{n} (−1)^r ( n r ) x_{k+n−r}. This allows us to rewrite the previous formula (5) as n!{ m n } = (∆^n x)_0 for x_k = k^m. Stirling numbers of the second kind appear in [32] as the double sequence A008277, where one can find additional information, references, and links.
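These identities are easy to check mechanically. The following sketch (added for illustration; the helper names are ours) computes the Stirling partition numbers both by the standard recurrence and by the inclusion–exclusion formula (5), and verifies the identities |Sur(m, n)| = n!{ m n } and (4) on small values.

```python
from math import comb, factorial

def stirling2_rec(m, n):
    # Recurrence {m n} = n*{m-1 n} + {m-1 n-1}, with {0 0} = 1
    if m == n:
        return 1
    if n == 0 or n > m:
        return 0
    return n * stirling2_rec(m - 1, n) + stirling2_rec(m - 1, n - 1)

def stirling2_ie(m, n):
    # Inclusion-exclusion (5): n!{m n} = sum_r (-1)^r C(n,r) (n-r)^m
    return sum((-1) ** r * comb(n, r) * (n - r) ** m
               for r in range(n + 1)) // factorial(n)

def surjections(m, n):
    # |Sur(m, n)| = n! {m n}: the S_n-action on surjections is free
    return factorial(n) * stirling2_rec(m, n)

def total_maps(m, n):
    # Identity (4): n^m = sum_k C(n,k) |Sur(m,k)| (factor a map through its image)
    return sum(comb(n, k) * surjections(m, k) for k in range(n + 1))
```

For instance, row m = 5 of A008277 is 1, 15, 25, 10, 1, and the two ways of computing { m n } agree everywhere.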

Factorisation of Markov Chains
In what follows, we assume that Markov chains are discrete-time, time-homogeneous, and with a finite or countable state-space S. The elements of the transition matrix p are written as p_ij = Pr(X_{k+1} = j | X_k = i), i, j ∈ S. Such a position of the indexes corresponds to the right action of p on the row vector of states. This is a right stochastic matrix, i.e., with ∑_{j∈S} p_ij = 1.
Here we consider the notion of lumping for Markov chains; see, for example, [33] (§6.3). The general mathematical idea of transferring a structure from a set to a factor-set also works in the case of Markov chains. Given a surjection π : S ↠ T, consider the corresponding logical matrix v_π (with entries (v_π)_{st} = 1 if π(s) = t and 0 otherwise) and its Moore–Penrose inverse v_π† (see [34]).
In our special case, the logical matrix corresponding to a surjection is a projection and, hence, the Moore–Penrose inverse v_π† is a genuine one-sided inverse: v_π† v_π = 1.
Lemma 1. Let p = (p_ss′)_{s,s′∈S} be a right stochastic matrix over a state-space S. For a surjection π : S ↠ T the following conditions are equivalent:
1. for each t′ ∈ T the sum ∑_{s′∈π^{−1}(t′)} p_ss′ is locally constant in s on π^{−1}(t) for each t ∈ T;
2. v_π v_π† p v_π = p v_π.
Definition 1. Let p = (p_ss′)_{s,s′∈S} be a right stochastic matrix over a state-space S. A surjection π : S ↠ T satisfying the conditions of the previous lemma is called a lumping map (and the corresponding partition S = ⊔_{t∈T} π^{−1}(t) is called lumpable).
Proposition 1. Let p = (p_ss′)_{s,s′∈S} be a stochastic matrix and π : S ↠ T a lumping map.
1. Then one can define a new stochastic matrix p_π over the state-space T with entries (p_π)_{tt′} = ∑_{s′∈π^{−1}(t′)} p_ss′ for an arbitrary s ∈ π^{−1}(t).
2. The lumped k-fold transition matrix can be written as (p^k)_π = v_π† p^k v_π = (p_π)^k.
We believe that the following statement is a kind of "folkloric" result.
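As a quick sanity check of Proposition 1, the following sketch (our illustration, not part of the paper; exact arithmetic via `fractions`) builds a small right stochastic matrix with a lumpable partition and verifies that lumping commutes with taking matrix powers, (p^k)_π = (p_π)^k.

```python
from fractions import Fraction as F

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matpow(p, k):
    r = [[F(i == j) for j in range(len(p))] for i in range(len(p))]
    for _ in range(k):
        r = matmul(r, p)
    return r

# state-space S = {0, 1, 2}; the partition {{0, 1}, {2}} is lumpable:
# the row sums over each block do not depend on the state within a block
p = [[F(1, 4), F(1, 4), F(1, 2)],
     [F(1, 4), F(1, 4), F(1, 2)],
     [F(1, 3), F(1, 3), F(1, 3)]]
blocks = [[0, 1], [2]]

def lump(q):
    # (q_pi)_{tt'} = sum over block t' of the row of a representative of block t
    return [[sum(q[blocks[t][0]][s] for s in blocks[u])
             for u in range(len(blocks))] for t in range(len(blocks))]
```

Here the lumped matrix is p_π = [[1/2, 1/2], [2/3, 1/3]], and lump(p^k) coincides with (p_π)^k for every k.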

Proposition 2.
Suppose that a finite group G acts on the set of states S on the right by the rule S × G ∋ (s, g) ↦ s^g ∈ S, and the stochastic matrix p = (p(s, s′))_{s,s′∈S} is G-invariant, i.e., p(s^g, s′^g) = p(s, s′) for all g ∈ G. Then the canonical projection π : S ↠ S/G onto the set of orbits is a lumping map.
Proof. Denote by St_s := {g ∈ G | s^g = s} the stabilizer subgroup of the state s. For an orbit s′^G := {s′^g | g ∈ G} the sum from Lemma 1 takes the form ∑_{s″∈s′^G} p(s_1, s″) = (1/|St_{s′}|) ∑_{g∈G} p(s_1, s′^g). Then the standard argument shows that the last sum is G-invariant as a function of s_1: ∑_{g∈G} p(s_1^h, s′^g) = ∑_{g∈G} p(s_1, s′^{gh^{−1}}) = ∑_{g′∈G} p(s_1, s′^{g′}).

Coupon Collector Model via Products and Factorizations
The classical coupon collector problem can be described as follows.

Example 1.
There are n distinct coupons in an urn. A collector draws one random coupon per step, with replacement. The subjects of interest are the following random variables:

• The number of distinct coupons selected after m steps;
• The number of steps required to obtain exactly r distinct coupons.
Crossed products of Markov chains (and their generalizations) are described in [35]. We obtain a version of the coupon collector model as the crossed power of a simple deterministic process; the other two versions are the results of its subsequent factorizations. This leads to the classical occupancy distribution described via the Stirling partition numbers. This context is closely related to our further models. The n-th crossed power of the above Markov chain has the set of states Z_{≥0}^n and the transition matrix p_n = (1/n) ∑_{i=1}^{n} 1_{Z_{≥0}} ⊗ · · · ⊗ p ⊗ · · · ⊗ 1_{Z_{≥0}} (with p in the ith factor), where 1_{Z_{≥0}} is the identity matrix on the basis Z_{≥0}. This is a random walk over the n-dimensional hyperoctant Z_{≥0}^n with nonzero transition probabilities p(a, a + e_i) = 1/n, a ∈ Z_{≥0}^n, where e_i = (0, . . . , 0, 1, 0, . . . , 0) with 1 in the ith position. Then the nonzero entries of the m-fold transition matrix are p^m(a, a + h) = (m!/(h_1! · · · h_n!)) n^{−m}, where h_i ≥ 0 and h_1 + · · · + h_n = m.
So each row of this matrix represents a multinomial distribution on vectors h.
The next step is when the collector wants to remember only whether each fixed coupon was drawn, no matter how many times. If the collector is able to keep only one number in memory, we continue the lumping.

Example 4 (Only the number of samples is remembered). The number ξ_m = ξ_0 p^m of distinct coupons selected after m steps has the classical occupancy distribution [37]: Pr(ξ_m = k) = ( n k ) k! { m k } / n^m. The expectation of the number ζ_r^n of steps required to obtain exactly r distinct coupons is described via the harmonic numbers H_n = 1 + 1/2 + · · · + 1/n: E ζ_r^n = n (H_n − H_{n−r}).
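Both formulas admit a short self-check (our addition; exact arithmetic with `fractions`): the occupancy probabilities sum to one, and the expected number of steps to collect r distinct coupons is a sum of geometric waiting times, n(H_n − H_{n−r}).

```python
from fractions import Fraction as F
from math import comb, factorial

def stirling2(m, k):
    if m == k:
        return 1
    if k == 0 or k > m:
        return 0
    return k * stirling2(m - 1, k) + stirling2(m - 1, k - 1)

def occupancy(m, n, k):
    # Pr(xi_m = k): exactly k distinct coupons after m draws from an urn of n
    return F(comb(n, k) * factorial(k) * stirling2(m, k), n ** m)

def expected_steps(n, r):
    # E zeta^n_r = sum of geometric waiting times n/(n-j) = n (H_n - H_{n-r})
    return sum(F(n, n - j) for j in range(r))
```

For n = r = 4 this gives E ζ = 4(1 + 1/2 + 1/3 + 1/4) = 25/3 ≈ 8.33 steps to complete the collection.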

Distributed Generation of Sets of Proofs
This section presents the main results about the distributed generation of sets of separate independent proofs. This is the simplified model, where all the proofs may be generated simultaneously and independently. This model corresponds, in particular, to the case of the generation of proofs which lie on the same level of the proof tree.

Models of Distributed Generation of Sets of Proofs
The following two examples show two possible approaches to the description of this model, which then turn out to be equivalent.
Example 5 (States are subsets). Let provers be special nodes in a peer-to-peer network. They need to construct zk-SNARK proofs for a finite set N of so-called proof-candidates.
We describe this process as a Markov chain whose states are the subsets N′ ⊆ N of proof-candidates not yet proved. The number of provers m > 0 is fixed. On each step, beginning in the state N′, each prover independently and with equal probabilities selects a single proof-candidate from N′ and constructs its proof, so the selection is given by a function g : m → N′ uniformly distributed among all functions m → N′. The resulting state is the difference N″ := N′ ∖ Im g, obtained by removing exactly the proved elements. The nonzero transition probabilities p(N′, N″) equal the fraction of all functions m → N′ which come from surjections onto N′ ∖ N″, i.e., g = (m ↠ N′ ∖ N″ ↪ N′).
An alternative way is to define a probability measure on trajectories.
Example 6 (Non-Markovian model). Let at the beginning each prover i, 1 ≤ i ≤ m, independently select its own so-called priority ordering σ_i ∈ Ord N with equal probability 1/|Ord N| = 1/|N|!. This determines the chain of states, i.e., the subsets together with linear orderings. At the jth step, 1 ≤ j ≤ k, being in the state N_{j−1}, the ith prover selects a proof-candidate according to the function g_j : m → N_{j−1} given by g_j(i) := the σ_i-least element of N_{j−1}. There is the natural projection ρ_{N N′} : Ord(N) → Ord(N′), which removes the elements of N ∖ N′ from an ordering. Then we put Pr_{Ord(N′)}(σ′) := Pr_{Ord(N)}(ρ_{N N′}^{−1}(σ′)) (11).
Proposition 3. The models from Examples 5 and 6 are stochastically equivalent.

Proof.
We give a sketch of the proof. A more general situation is described in Example 12.
The selections of (σ_i)_{1≤i≤m} are uniformly distributed on (Ord N)^m. This implies: 1. the uniform distribution of g_j on the set of functions m → N_{j−1}; and 2. the uniform distribution of the induced orderings ρ_{N N_{j−1}}(σ_i) on Ord(N_{j−1}). The second item follows from the definition (11) and from the fact that the fiber of ρ_{N N′} over each point has the same cardinality |Ord N|/|Ord N′| = |N|!/|N′|!.
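Proposition 3 is easy to probe numerically. The sketch below (our illustration; a Monte Carlo experiment, not part of the paper) simulates both models and compares the empirical mean absorption times, which should agree up to sampling noise.

```python
import random

def run_independent(m, n, rng):
    # Example 5: each prover picks a uniformly random unproved candidate each step
    remaining, steps = set(range(n)), 0
    while remaining:
        pool = sorted(remaining)
        remaining -= {rng.choice(pool) for _ in range(m)}
        steps += 1
    return steps

def run_priority(m, n, rng):
    # Example 6: each prover fixes a random priority ordering up front and
    # always attacks its highest-priority unproved candidate
    orders = [rng.sample(range(n), n) for _ in range(m)]
    remaining, steps = set(range(n)), 0
    while remaining:
        remaining -= {next(x for x in o if x in remaining) for o in orders}
        steps += 1
    return steps

def mean(model, m, n, trials, seed):
    rng = random.Random(seed)
    return sum(model(m, n, rng) for _ in range(trials)) / trials
```

With m = 3 provers and n = 4 proof-candidates, 20,000 trials of each model give means that typically differ by well under 0.05.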
Example 7 (States are numbers). The cardinality function N′ ↦ |N′| is a lumping map for the Markov chain from Example 5. The states of the factorized Markov chain are {0, 1, . . . , |N|}, and the only nonzero elements of the transition matrix are p_{n′n″} = ( n′ n′−n″ ) (n′ − n″)! { m n′−n″ } / n′^m for 0 ≤ n″ < n′. Note that each row of this matrix coincides with the classical occupancy distribution from Example 4.
Let the initial state be ξ_0 ≡ n. The evolution is described via the powers of the transition matrix: Pr(ξ_k = j) = (p^k)_{nj}. The absorbing state is 0. All trajectories are strictly decreasing, and ξ_k ≡ 0 for k ≥ n. The absorption time τ mn is the random variable which measures the exact number of steps m provers need to generate all n proofs, i.e., τ mn = k + 1 iff ξ_{k+1} = 0 and ξ_k ≠ 0. Taking into account the lower triangular form of our transition matrix, we get recurrent and explicit formulas for the probabilities: Pr(τ mn = k) = ∑_{0<n_k<···<n_2<n_1=n} p_{n_1 n_2} · · · p_{n_{k−1} n_k} p_{n_k 0}. Multiplying (13) by k and taking the sum over k, we get a recurrent formula for the first moment. In particular, this allows one to obtain the following formulas for calculating the expectation and variance.

Proposition 4.
Let m > 0. Then τ m0 ≡ 0 and for n > 0:
E τ mn = 1 + ∑_{0≤j<n} p_{nj} E τ mj, (15)
E τ mn² = 1 + ∑_{0≤j<n} p_{nj} (2 E τ mj + E τ mj²), Var τ mn = E τ mn² − (E τ mn)².
In Table 1 at the end of the paper we present the probability distributions of τ mn accurate to 10^{−6} (except for the last column). A cell contains the list of pairs (k; p_k^{mn}) of a value k and the corresponding probability p_k^{mn} (nonzero up to the accuracy). The number of proofs n runs through the powers of 2, which corresponds to the numbers of leaves of perfect binary trees.
We compared the values of E τ mn obtained by infinite-precision calculations according to (15) in Wolfram Mathematica and by 10^5 random tests of the model from Example 6, written in C++. For m, n ∈ {10, 20, 30, 40, 50, 100, 200, 300} the numerical results obtained in these two different ways match to two digits after the decimal point.
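The recurrence (15) is cheap to evaluate exactly. The following sketch (our addition; exact rational arithmetic, so it mirrors the infinite-precision computation rather than the C++ simulation) computes E τ mn from the lumped transition matrix of Example 7.

```python
from fractions import Fraction as F
from math import comb, factorial

def stirling2(m, k):
    if m == k:
        return 1
    if k == 0 or k > m:
        return 0
    return k * stirling2(m - 1, k) + stirling2(m - 1, k - 1)

def trans(m, n1, n2):
    # p_{n1 n2}: m provers choose uniformly among n1 unproved candidates;
    # the image of the choice map has size n1 - n2 (occupancy distribution)
    j = n1 - n2
    return F(comb(n1, j) * factorial(j) * stirling2(m, j), n1 ** m)

def expect_tau(m, n):
    # E tau_{m 0} = 0;  E tau_{m k} = 1 + sum_{j<k} p_{kj} E tau_{m j}
    E = [F(0)] * (n + 1)
    for k in range(1, n + 1):
        E[k] = 1 + sum(trans(m, k, j) * E[j] for j in range(k))
    return E[n]
```

A single prover proves exactly one candidate per step, so E τ 1n = n; two provers finish two candidates in 3/2 steps on average.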

Remark 1.
For a fixed positive integer m we consider two modifications of the coupon collector model from Example 4:

1.
After m, 2m, 3m, . . . steps, all distinct coupons drawn during the last m steps are removed from the urn permanently.

2.
Each time the collector has drawn m new distinct coupons, these m coupons are removed from the urn permanently.
Note that if for the first modification we apply time scaling, i.e., consider the subprocess at the moments 0, m, 2m, . . ., we obtain the proof generation model from Example 7. The second modification is slightly slower than the first, i.e., the expectation of the number of steps to obtain exactly r distinct coupons in the second modification is no less than in the first. These observations show that the expectation of the time τ mn of proof generation from Example 7 can be majorized via the expectation of the time ζ_r^n from the coupon collector model of Example 4 (inequality (16)).

Asymptotics of τ mn
From the general formula (14) for the probabilities Pr(τ mn = k) it seems very difficult to obtain an approximation in explicit form. However, Pr(τ mn = 1) is just the fraction of surjective maps m → n among all such maps: Pr(τ mn = 1) = n!{ m n } / n^m.

Large Number of Provers
First we consider the case of a large number of provers, i.e., m ≫ n. Equivalently, this means Pr(τ mn = 1) ≈ 1 or E τ mn ≈ 1. Note that τ mn = 1 iff on the first step the corresponding map m → n from provers to proof-candidates is surjective.

Proposition 5.
For a fixed number n > 0 of proof-candidates the following asymptotics hold as m → ∞: Pr(τ mn = 1) = 1 − n(1 − 1/n)^m + O(n²(1 − 2/n)^m) and E τ mn = 1 + n(1 − 1/n)^m + O(n²(1 − 2/n)^m). Proof. For each proof-candidate i let A_i be the event that i is not proved on the first step; Pr(A_i) = (1 − 1/n)^m. From the inclusion–exclusion principle, Pr(τ mn > 1) = Pr(∪_i A_i) = ∑_i Pr(A_i) − ∑_{i<j} Pr(A_i ∩ A_j) + · · · = n(1 − 1/n)^m − ( n 2 )(1 − 2/n)^m + · · ·, where the subsequent sums are small with respect to the first. To calculate the expectation E τ mn we can take into account only the values τ = 1, 2; the contribution of the other values is asymptotically small.

Remark 2.
In the blog post [38] it is observed that this upper bound can be derived from the inequality between ordinary and conditional probabilities.
Note that the right-hand side of (17) has the same asymptotics as Pr(τ mn = 1) in (19), so one can consider it as an asymptotic upper bound for Pr(τ mn = 1).
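Numerically the agreement is visible already for moderate sizes. The sketch below (ours; the helper names are hypothetical) compares the exact value Pr(τ mn = 1) = n!{ m n }/n^m with the first-order approximation 1 − n(1 − 1/n)^m; the union bound guarantees that the approximation never exceeds the exact value.

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling2(m, k):
    if m == k:
        return 1
    if k == 0 or k > m:
        return 0
    return k * stirling2(m - 1, k) + stirling2(m - 1, k - 1)

def p_tau_one(m, n):
    # exact probability that m provers cover all n candidates in one step
    return factorial(n) * stirling2(m, n) / n ** m

def p_tau_one_approx(m, n):
    # first-order inclusion-exclusion (union-bound) approximation
    return 1 - n * (1 - 1 / n) ** m
```

For n = 5 and m = 60 both values are already within 10^{-5} of 1, and they differ from each other by far less.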

Asymptotics of the Stirling Numbers and Probabilities Pr(τ mn = 1)
The asymptotics of the Stirling numbers of the second kind have been studied since Laplace (1814). From the long list of publications we consider only the results related to our context.
A usual way is to apply Cauchy's integration formula to the generating function (3): n!{ m n } = (m!/2πi) ∮_C (e^z − 1)^n z^{−m−1} dz = (m!/2πi) ∮_C e^{φ(z)} dz/z, where C is a suitable contour around the origin and φ(z) = n ln(e^z − 1) − m ln(z). The saddle point ρ solves the equation φ′(ρ) = 0, i.e., ρ/(1 − e^{−ρ}) = m/n, or, finally, ρ = m/n + W_0(−(m/n) e^{−m/n}). The Lambert W function, or product logarithm, is the multivalued function inverse to w ↦ w e^w, and W_0 is its principal branch; see [39].
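The saddle-point equation is easy to solve numerically. The sketch below (our illustration, assuming m/n > 1) finds ρ by bisection on ρ/(1 − e^{−ρ}) = m/n and checks it against the closed form ρ = m/n + W_0(−(m/n)e^{−m/n}), with W_0 computed by Newton iteration.

```python
from math import exp

def rho_bisect(c, iters=200):
    # solve r / (1 - exp(-r)) = c for r > 0, assuming c = m/n > 1;
    # the left-hand side tends to 1 as r -> 0 and exceeds c at r = c
    f = lambda r: r / (1 - exp(-r)) - c
    lo, hi = 1e-12, c
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def lambert_w0(x, iters=100):
    # principal branch of w e^w = x via Newton iteration, for x in (-1/e, 0)
    w = 0.0
    for _ in range(iters):
        ew = exp(w)
        w -= (w * ew - x) / (ew * (w + 1))
    return w

def rho_closed(c):
    return c + lambert_w0(-c * exp(-c))
```

For c = m/n = 2 both methods give ρ ≈ 1.594, and substituting back reproduces the saddle-point equation to machine precision.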
The following expression for n!{ m n } coincides with the first term of [40] (5.1), or with [41] (5.9) derived in the context of a local limit theorem, or with [42] (2.9), where σ² = (n/m²)(1 − ρ/(e^ρ − 1)) is the variance of the limiting normal distribution. This approximation is uniform for n/m in each closed subinterval of (0, 1).
Using Stirling's formula for m!, we can obtain the asymptotic probability as a function of the two parameters n/m and m, where α and γ depend only on the ratio n/m. These dependencies are shown in Figures 1 and 2. One can see that when n/m runs from 0 to 1, the functions α(n/m) and γ(n/m) change from 1 to ∞ and from 1 to 1/e, respectively.

Dependence on the Ratio n/m
Next we study the asymptotic behaviour of τ mn depending on m and n, and formulate the related results as conjectures. At the moment we can prove only some transitions, while the others come from infinite-precision calculations. Note that E τ mn for large m and n asymptotically depends only on the ratio n/m, and we study the character of this dependence.
A series of calculations with infinite precision allows one to formulate the following sequence of hypotheses.

Hypothesis 1.
For each fixed m, n ∈ Z_{>0} the sequence Z_{>0} ∋ k ↦ E τ km,kn is increasing and bounded above.

Remark 3.
Recall that Remark 1 states a connection between the coupon collector and proof generation models. Taking into account (16) and the fact that for ζ_r^n from Example 4 the sequence k ↦ E ζ_{kr}^{kn} is increasing and bounded above, we can prove the following: if the sequence Z_{>0} ∋ k ↦ E τ k,k is increasing and bounded above, then for each fixed m, n ∈ Z_{>0} with n > m the sequence Z_{>0} ∋ k ↦ E τ km,kn is increasing and bounded above. The function h(x) is non-decreasing because E τ mn strictly increases in n and strictly decreases in m.
For the case of m = 750 provers, the points (n/750, E τ 750,n) of the graph in Figure 3 approximate the corresponding points of the graph of the limiting function h(x). For small x it looks like a staircase with steps of height 1 starting at the point (0, 1).
Results of calculations are presented as graphs of γ k (x), λ k (x) on Figures 5 and 6 for k = 2 and on Figures 7-9 for k = 3.
Remark 5. The inequalities (26) mean that for m large enough and n/m → x ∈ (ζ_{k−1}, ζ_k), the distribution of τ mn tends to a Bernoulli-type distribution concentrated on the two values k − 1 and k. One can see this in Table 1, where for large numbers of provers m and proof-candidates n the lists of values and probabilities contain at most two items (i.e., for the other values the probabilities are very small). Moreover, the variance of τ mn tends to the variance of this Bernoulli distribution: Var τ mn → Pr(τ mn = k − 1) · Pr(τ mn = k) ≤ 1/4.
Indeed, our numerical calculations allow us to suppose that Var τ mn < 1 if m ≥ 10 and n/m < 10^4.

Distributed Generation of Proof Trees
This section deals with more complicated and, at the same time, more useful for real applications models of proof generation. In the Latus consensus, zk-SNARK proofs form perfect binary trees (proof trees), just as the hashes of transactions form similar trees in the mainchain. The nodes of a tree form a partially ordered set (poset) whose Hasse diagram is the tree itself, so it is natural to formulate part of our results in terms of general posets.

Ordered Sets and Lattices
Basic facts about posets mentioned below can be found in [31,43] (ch. 3).
A poset is a set equipped with a partial order, i.e., a binary relation which is transitive, reflexive, and antisymmetric.
Let P be a poset. A chain in P is a subset with a total induced order. An antichain in P is a subset in which any two distinct elements are incomparable. The height ht(P) of a finite poset P is the maximum cardinality of a chain in P. The width wd(P) of a finite poset P is the maximum cardinality of an antichain in P.
A subset I ⊆ P of a poset P is called a down-set (resp. up-set) if for each x ∈ I and y ∈ P with y ≤ x (resp. y ≥ x) we have y ∈ I. Note that down-sets in P are up-sets in the opposite poset P^op and vice versa.
Denote by O_d(P) (resp. O_u(P)) the lattice of down-sets (resp. up-sets). A subset I ⊆ P is a down-set iff its complement P ∖ I is an up-set. The set of up-sets in P forms a distributive lattice ordered by inclusion. The map O_d(P) → O_u(P), I ↦ P ∖ I, is an anti-isomorphism of lattices.
Denote by Min I (resp. Max I) the set of minimal (resp. maximal) elements of I ⊆ P. Note that Min I and Max I are antichains. For an arbitrary subset X ⊆ P, we denote by X↓ (resp. X↑) the down closure (resp. up closure), i.e., the smallest down-set (resp. up-set) containing X. In the case of a singleton, the down-set {x}↓ is called principal.
In this way, up-sets (resp. down-sets) are in one-to-one correspondence with antichains, via I ↦ Min I (resp. I ↦ Max I).
Note that the above correspondence P → O u (P), O d (P) is a part of Birkhoff's representation theorem, which in modern formulation states the antiequivalence of categories of finite posets and finite distributive lattices.
A direct corollary of Birkhoff's theorem states that the symmetry group Aut P of a finite poset P is naturally isomorphic to the symmetry group Aut O(P) of the corresponding lattice O(P) = O_d(P) or O_u(P). Corollary 1. The canonical map α : Aut P → Aut O(P), α(g) : Q ↦ Q^g = {p^g | p ∈ Q}, g ∈ Aut P, Q ∈ O(P), is a group isomorphism.
For two posets P and Q there exist new posets:
• the product P × Q, where (p, q) ≤ (p′, q′) iff p ≤ p′ in P and q ≤ q′ in Q; the product of distributive lattices is a distributive lattice;
• the co-product P ⊔ Q, which is the disjoint union: the orders restricted to P and Q coincide with the initial ones, and elements from different sets are incomparable;
• the linear sum P + Q, which is the disjoint union where the orders restricted to P and Q coincide with the initial ones and p < q for each p ∈ P, q ∈ Q; the linear sum of distributive lattices is a distributive lattice.
For two posets P and Q there exist natural isomorphisms of lattices O_d(P ⊔ Q) ≅ O_d(P) × O_d(Q) and O_d(P + Q) ≅ O_d(P) + O_d(Q), where in the latter case the top element of one sublattice is glued with the bottom element of the other.

Definition 2.
Let P be a finite poset. A compatible total ordering of P is a monotone bijection onto a finite ordinal, σ : P → {1 < 2 < · · · < |P|}. Denote by Ord(P) the set of all compatible total orderings of P.
For finite posets P and Q there exist natural bijections; in particular, Ord(P + Q) ≅ Ord(P) × Ord(Q). The compatible total orderings Ord(p ⊔ q), p, q ∈ Z_{≥0}, of a coproduct of two chains are in one-to-one correspondence with the shuffle permutations σ ∈ S_{p,q} ⊆ S_{p+q}, i.e., such that σ^{−1}(1) < · · · < σ^{−1}(p) and σ^{−1}(p+1) < · · · < σ^{−1}(p+q). The number of such permutations is given by the binomial coefficient ( p+q p ). Definition 3. For a poset P and a subset Q ⊆ P with the induced order there exists a natural restriction map Ord(P) → Ord(Q), σ ↦ σ|_Q, where the pair of a monotone bijection σ|_Q and a monotone injection ι is uniquely determined from the corresponding commutative diagram. Proposition 6. Let P be a finite poset. Then the sets Ord(Q) for Q ⊆ P, with the natural restriction maps, form a presheaf on the subsets of P ordered by inclusion.

Proof. One can directly check that for a chain of subsets R ⊆ Q ⊆ P the restriction maps compose: (σ|_Q)|_R = σ|_R.
Note that very similar constructions around Birkhoff's duality describe shapes of cells of higher categories in [44].
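The counting rules above can be checked by brute force on small posets. The sketch below (ours; `linear_extensions` is a hypothetical helper) counts compatible total orderings directly and confirms |Ord(p ⊔ q)| = ( p+q p ) and |Ord(P + Q)| = |Ord(P)| · |Ord(Q)|.

```python
from itertools import permutations
from math import comb

def linear_extensions(n, less):
    # count bijections to {0..n-1} monotone w.r.t. the strict relation less(i, j)
    count = 0
    for perm in permutations(range(n)):
        pos = {v: i for i, v in enumerate(perm)}
        if all(pos[i] < pos[j] for i in range(n) for j in range(n) if less(i, j)):
            count += 1
    return count

# coproduct of chains 2 and 3: elements 0 < 1 and 2 < 3 < 4, no cross relations
chains = lambda i, j: (i < j <= 1) or (2 <= i < j)

# linear sum of two 2-element antichains: all of {0, 1} below all of {2, 3}
lin_sum = lambda i, j: i <= 1 and j >= 2
```

The coproduct of a 2-chain and a 3-chain has ( 5 2 ) = 10 compatible orderings (shuffles); the linear sum of two 2-element antichains has 2 · 2 = 4.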

Poset Version of Coupon Collector Model
The coupon collector's process on posets was considered in the PhD thesis [45]. Here we describe generalisations of the Markov chains from Examples 2–4 to the case of a poset N. One of these chains has the transition probabilities p(k, k) = k/(k + 1), p(k, k + 1) = 1/(k + 1).

Definition 4. A rooted binary tree is called perfect if all its interior nodes have two children and all leaves have the same depth (level).
A perfect binary tree is completely determined by the number of its leaves. To produce a perfect binary tree with ℓ levels we need to create 2^ℓ − 1 proofs.
The perfect binary tree M_ℓ with 2^ℓ − 1 nodes, as a poset, consists of the words of length < ℓ in an alphabet of two letters, say {0, 1}, and w ≤ w′ iff w begins with w′. So the empty word corresponds to the greatest element, the root (see Figure 10). Each perfect binary tree M_{ℓ+1} with ℓ + 1 levels, as a poset, is the disjoint sum of two copies of the one-level-smaller tree with the greatest element added: M_{ℓ+1} = (M_ℓ ⊔ M_ℓ) + 1. The last identity together with (28) and (29) implies |O_u(M_{ℓ+1})| = |O_u(M_ℓ)|² + 1. This is the sequence A003095 in [32]: 0, 1, 2, 5, 26, 677, 458330, . . ..
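The recurrence for the number of down-sets (equivalently, up-sets) is easy to confirm by brute force on small trees. The following sketch (ours; Python) enumerates the subsets of M_ℓ that are closed downwards and compares the counts with d_{ℓ+1} = d_ℓ² + 1.

```python
from itertools import combinations, product

def tree_nodes(l):
    # nodes of M_l: binary words of length < l; the leaves have length l - 1
    return [''.join(w) for k in range(l) for w in product('01', repeat=k)]

def count_downsets(l):
    nodes = tree_nodes(l)
    count = 0
    for r in range(len(nodes) + 1):
        for sub in combinations(nodes, r):
            s = set(sub)
            # down-closed: every non-leaf member has both children in the set
            if all(len(w) == l - 1 or (w + '0' in s and w + '1' in s) for w in s):
                count += 1
    return count
```

For ℓ = 1, 2, 3 this gives 2, 5, 26, matching A003095 and the recurrence d_{ℓ+1} = d_ℓ² + 1.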

Proposition 8.
Each compatible total ordering on M_{ℓ+1} given by (32), according to (30), can be obtained as a shuffle of two orderings on M_ℓ. So the number e_ℓ = |Ord(M_ℓ)| of compatible total orderings of a perfect binary tree satisfies the recurrent relation e_{ℓ+1} = ( 2^{ℓ+1}−2  2^ℓ−1 ) e_ℓ² and, hence, admits the explicit formula e_ℓ = (2^ℓ − 1)! / ∏_{j=1}^{ℓ} (2^j − 1)^{2^{ℓ−j}}, which can be interpreted as the number of all permutations of the nodes of the tree multiplied by the probability that a random permutation of the nodes is a compatible order on the tree. This is the sequence A056972 in [32]: 1, 2, 80, 21964800, 74836825861835980800000, . . ..
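Both the recurrence and the explicit (hook-length-style) formula are easy to verify (our sketch; Python): they reproduce the terms of A056972, and for ℓ = 3 a brute-force count over all 7! orderings of M_3 also gives 80.

```python
from math import comb, factorial, prod
from itertools import permutations, product

def ext_rec(l):
    # e_{l+1} = C(2^{l+1}-2, 2^l-1) * e_l^2, with e_1 = 1
    e = 1
    for j in range(1, l):
        e = comb(2 ** (j + 1) - 2, 2 ** j - 1) * e * e
    return e

def ext_explicit(l):
    # (2^l - 1)! divided by the product of subtree sizes over all nodes
    return factorial(2 ** l - 1) // prod(
        (2 ** j - 1) ** (2 ** (l - j)) for j in range(1, l + 1))

def ext_brute(l):
    # count orderings of the nodes in which every node appears after its children
    nodes = [''.join(w) for k in range(l) for w in product('01', repeat=k)]
    cnt = 0
    for perm in permutations(nodes):
        pos = {w: i for i, w in enumerate(perm)}
        if all(len(w) == l - 1 or (pos[w] > pos[w + '0'] and pos[w] > pos[w + '1'])
               for w in nodes):
            cnt += 1
    return cnt
```

The division in `ext_explicit` is always exact, since the quotient counts the compatible orderings.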
Denote by τ_w the transposition from the corresponding copy of S_2, the 'symmetry in w'. It swaps the left and right subtrees at w (i.e., (w0v)^{τ_w} = w1v and (w1v)^{τ_w} = w0v for v ∈ {0, 1}*) and leaves the rest immobile. The symmetry group Aut M_ℓ admits a presentation with all of the above symmetries τ_w as generators and the relations:
• τ_w τ_{w′} = τ_{w′} τ_w whenever w and w′ are incomparable in M_ℓ (in this case τ_w and τ_{w′} live in two different factors of a direct product in (33));
• τ_{wv} τ_w = τ_w τ_{(wv)^{τ_w}} (this is the multiplication rule for the semidirect product in (33)).
The presentation (33) of the elements of Aut M_ℓ means that in each position corresponding to an internal node labeled by a word w one can put either the transposition τ or the neutral element e. So Aut M_ℓ has 2^{2^{ℓ−1}−1} elements τ_W, which are in one-to-one correspondence with the subsets W of internal nodes (where the transpositions τ are located). For any compatible total ordering σ ∈ Ord(M_ℓ) and any τ_W, the composition σ ∘ τ_W is again a compatible total ordering.

Distributed Generation of Posets
First we consider models from Examples 11-13 which are generalizations of models from Examples 5-7. We switch from sets to posets.
In the poset setting, a state is an up-set N′ ⊆ N of not yet proved candidates, each prover selects a candidate from the antichain Min N′ of currently available proof-candidates, and the existence of a surjection m ↠ N′ ∖ N″ implies |N′ ∖ N″| ≤ m. In the case of uniform distributions the non-zero elements of the transition matrix are given by the analogous occupancy formula over Min N′. If N is a discrete poset, then Min N′ = N′, the states are arbitrary subsets, and we obtain the Markov chain from Example 5.
For this Markov chain the empty set ∅ is the absorbing state and all trajectories are strictly decreasing by inclusion. The subject of our interest is the absorption time τ_m^N = τ_m^{N,µ}, a random variable equal to the number of steps it takes m provers to create all the proofs in the up-set N. Note that Pr(τ_m^N ≤ k) = p^k_{N∅}, so Pr(τ_m^N = k) = p^k_{N∅} − p^{k−1}_{N∅}. The random variable τ_m^N takes values in the interval [ht(N), |N|], i.e., p^k_{N∅} = 0 for k < ht(N) and p^k_{N∅} = 1 for k ≥ |N|. So one can express the expectation of the absorption time via the elements of the powers of the transition matrix: E τ_m^N = ∑_{k≥0} (1 − p^k_{N∅}). On the other hand, we have a recurrent formula involving the matrix elements of the top row of p as coefficients. Now we can extend Example 6 and Proposition 3 about the stochastic equivalence of the two models to the case of posets. To do this, we need to pass from a uniform distribution of probabilities to an arbitrary one: the matched distributions Pr_{Ord(N′)}(σ′) of (37) are unique, turning the maps N′ ↦ Pr_{Ord(N′)}(σ′) into the selection probabilities (38). An element of the Cartesian power (Ord N)^m corresponds to the choice of a ranging σ_i ∈ Ord(N) by each prover, 1 ≤ i ≤ m. It completely determines a trajectory of this Markov chain, i.e., a strictly decreasing sequence of up-sets of not yet proven candidates, together with the selection, at each moment 0 ≤ j < k and by each prover 1 ≤ i ≤ m, of the first possible candidate in Min N_j according to the prover's own ranging. Directly from the definition one can see that the conditional probabilities of such selections are given by (38).
Consider the case when N is a discrete poset and, hence, Min N = N and the subsets N′ are arbitrary. If we additionally suppose that the initial distribution Pr_{Ord(N)} is uniform, then for each N′ the matched distributions Pr_{Ord(N′)} and Pr_{Min N} are also uniform, because the numbers of summands in (37) are independent of σ′ ∈ Ord(N′) and of a ∈ Min N, respectively. They are naturally indexed, in the first case, by the |N|!/|N′|! orderings of N whose restriction to N′ is fixed and, in the second case, by the (|N| − 1)! orderings beginning with a. So this covers the case of Example 6 and Proposition 3.
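The absorption time discussed above can also be estimated empirically. Below is a minimal Monte Carlo sketch (the helper names `minimal_elements` and `absorption_time` are ours) of the chain described above, assuming each of the m provers picks uniformly at random among the minimal not-yet-proven elements at every step:

```python
import random

def minimal_elements(remaining, less_than):
    """Elements of `remaining` with no smaller unproven element,
    i.e., the proofs that can be built right now."""
    return [x for x in remaining
            if not any((y, x) in less_than for y in remaining)]

def absorption_time(elements, less_than, m, rng=random):
    """One trajectory of the chain: each tick, every one of the m provers
    picks a uniformly random minimal not-yet-proven element; all distinct
    picks become proven. Returns the number of ticks until the up-set is
    exhausted, i.e., a sample of the absorption time tau^m_N."""
    remaining = set(elements)
    ticks = 0
    while remaining:
        mins = minimal_elements(remaining, less_than)
        remaining -= {rng.choice(mins) for _ in range(m)}
        ticks += 1
    return ticks

# Perfect binary tree with 7 nodes (heap indexing: node i covers 2i, 2i+1).
# A child must be proven before its parent, so the leaves 4..7 are minimal.
covers = {(2 * i + b, i) for i in (1, 2, 3) for b in (0, 1)}
```

With m = 1 the absorption time is deterministically |N| = 7, and with many provers it concentrates at ht(N) = 3, in accordance with the bounds p^n_{N∅} = 0 for n < ht(N) and p^n_{N∅} = 1 for n ≥ |N|.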
It should be emphasized that the construction in this example is less universal than the general case of Example 11. For instance, consider the case N = {a} ⊔ {b < c} and the distribution obtained from (38).

Proof. For each ε > 0 there exists m_0 ∈ ℕ such that, for all m ≥ m_0 and all k ∈ [0 .. ht N), all elements of Min N_{/k} will be proved at the (k + 1)th step with probability > 1 − ε.
For some types of posets N we will obtain an asymptotic of the form (41), where κ_N is the number of those k for which # Min N_{/k} = wd N. This form is suitable when N admits a sufficiently rich symmetry.
Proof. Transitivity of the action of Aut N_{/k} on Min N_{/k} implies that the uniform probability distribution on Min N_{/k} is optimal. Denote the right-hand side of (42) by Φ(N) and set n_k = |Min N_{/k}|. By induction, we can write min_{µ∈M(N_{/k})} E τ^m_{N_{/k}}(µ) as a sum, where N^+_{/(k+1)} is N_{/(k+1)} with one additional element from Min N_{/k} (in all cases we obtain isomorphic posets), and "· · ·" denotes summands which are small with respect to (1 − 1/wd N)^m. Next we remove the small terms from the inclusion–exclusion formula (5) for Stirling numbers. The claim then follows in all three possible cases: wd N_{/(k+1)} = wd N_{/k} with n_k = wd N_{/k}; wd N_{/(k+1)} < wd N_{/k} with n_k = wd N_{/k}; or wd N_{/(k+1)} = wd N_{/k} with n_k < wd N_{/k}.

The perfect binary tree M satisfies the assumptions of Proposition 11; we have ht M = ℓ, wd M = 2^{ℓ−1} and κ_M = 1.

Corollary 2. For the perfect binary tree M and for a large number of provers m:
and the corresponding probability.

Next we consider the case of coproducts of chains. The kth copower N = ⊔_{1≤i≤k} n of a chain n satisfies the assumptions of Proposition 11. We have ht N = κ_N = n and wd N = k.
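The constants just stated for this family (ht N = κ_N = n, wd N = k) can be sanity-checked by simulation, again assuming each prover picks uniformly among the minimal unproven elements; the helper name `chains_absorption_time` is ours:

```python
import random

def chains_absorption_time(n, k, m, rng=random):
    """Ticks until m provers finish the coproduct of k chains of length n.
    The minimal elements are the bottom unproven elements of the still
    unfinished chains; each prover picks one of them uniformly per tick."""
    progress = [0] * k                       # proofs finished in each chain
    ticks = 0
    while any(p < n for p in progress):
        open_chains = [i for i in range(k) if progress[i] < n]
        for i in set(rng.choice(open_chains) for _ in range(m)):
            progress[i] += 1                 # each distinct pick advances a chain
        ticks += 1
    return ticks
```

With m = 1 the absorption time equals |N| = nk; as m grows it concentrates at ht N = n, since each of the n "layers" has full width k (which is why κ_N = n here).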
This explicit expression can be obtained by simplifying (1 − x − y) f (x, y) using the recurrent relation (47) and the boundary conditions (48).
Finally, we extract the coefficients α^k_n. The next step would be to find the asymptotic formula (41) for an arbitrary finite coproduct (45) of finite chains.
For each fixed finite poset N one can consider its nth copowers ⊔_{i∈n} N and then study the dependence of the absorption time τ^m_{⊔_{i∈n} N} on the number of copies n and the number of provers m. If N is a singleton, we obtain the random variable τ^m_n from Example 7. This function has a number of properties that generalize the properties of h(x).

Practical Realization of Proof Trees Generation
For the stable and efficient functioning of the sidechain, it is necessary that the following conditions are met:
1. All transactions that the block forger plans to include in the issued block must be processed within the time slot, i.e., the time allotted for the creation of this block, and the corresponding proof tree must be completely built;
2. The number of these transactions should be the maximum possible for which the probability of constructing the corresponding proof tree is close to 1.
The first condition is necessary in order to minimize or reduce to zero the number of proofs that will be created but not used, i.e., so that the work of the provers is not done in vain. The second condition is necessary to maximize the sidechain throughput.
Therefore, it is necessary to determine, given the network parameters (such as the length of the time slot and the number of active provers), the maximum number of leaves such that the corresponding proof tree is completely built within a time slot with probability at least 1 − ε, for sufficiently small ε > 0.
We assume that the time slot length is fixed throughout the life of the sidechain. We also assume that the time required to form one proof is the same throughout the lifetime of the sidechain for all provers. This time will be called a tick. The integer part of the quotient of the time slot duration by the tick duration equals the number of proofs that each active prover can build in one time slot. Since the lengths of the time slot and the tick are fixed, the number of such proofs per time slot is also fixed. However, the number of provers may vary.
The task is to determine the maximum number of transactions in a block, for given numbers k of ticks in a time slot and m of provers, for which the corresponding proof tree will be built with probability at least 1 − ε.
To solve this problem, we will use the results of Section 3, and also make the following assumptions.
We will assume that provers build all levels of the proof tree sequentially, from leaves to root. First, the probabilities that a level will be completely built in 1, 2, 3, etc., ticks are calculated (for a given number of proofs and provers). Then, using these probabilities, we find the number of levels that will be built with probability 1 − ε in k ticks:

Pr(τ^m_{M_ℓ} ≤ k) ≈ Σ_{k_1+···+k_ℓ ≤ k} Π_{1≤r≤ℓ} Pr(τ^m_{2^{r−1}} = k_r).

If Pr(τ^m_{2^{r−1}} = 1) ≈ 1 for all levels r ≤ ℓ_0, we can reduce the previous formula to

Pr(τ^m_{M_ℓ} ≤ k) ≈ Σ_{k_{ℓ_0+1}+···+k_ℓ ≤ k−ℓ_0} Π_{ℓ_0<r≤ℓ} Pr(τ^m_{2^{r−1}} = k_r).

Table 1, which indicates the probabilities of constructing a given number of proofs by a given number of provers in a given number of ticks, is auxiliary for solving our problem.
Each row in Table 1 corresponds to a certain fixed number of provers. The columns correspond to the levels of the proof tree, starting from the second from the root. For example, the cell with coordinates 512 provers, 32 proofs contains a list of two pairs of numbers: (1; 0.999997), (2; 0.000003). This means that 512 provers will build 32 proofs in exactly 1 tick with probability 0.999997 and in exactly 2 ticks with probability 0.000003. Therefore, the probability of building 32 proofs in no more than 2 ticks is indistinguishable from 1.
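The Table 1 entries quoted here can be reproduced for one tree level, assuming, as in the uniform model above, that each of the m provers picks independently and uniformly among the proofs still missing at each tick. The helper names below (`distinct_pick_dist`, `ticks_distribution`) are ours, and plain floating point is accurate enough at this scale:

```python
from math import comb

def distinct_pick_dist(i, m):
    """Pr(exactly j distinct proofs get built in one tick) when each of
    m provers picks uniformly among i remaining proofs
    (inclusion-exclusion over which j proofs are hit)."""
    dist = {}
    for j in range(1, min(i, m) + 1):
        p = sum((-1) ** t * comb(j, t) * ((j - t) / i) ** m
                for t in range(j + 1))
        dist[j] = comb(i, j) * p
    return dist

def ticks_distribution(n, m, max_ticks):
    """Pr(tau^m_n = k) for k = 1..max_ticks: the number of ticks m provers
    need to build n independent proofs (one level of the tree)."""
    state = {n: 1.0}              # Pr(r proofs still unproven)
    out = []
    for _ in range(max_ticks):
        new, done = {}, 0.0
        for r, p in state.items():
            for j, q in distinct_pick_dist(r, m).items():
                if j == r:
                    done += p * q
                else:
                    new[r - j] = new.get(r - j, 0.0) + p * q
        out.append(done)
        state = new
    return out
```

For 512 provers this reproduces the cells cited in the text: 32 proofs are built in one tick with probability ≈ 0.999997 (and in two ticks with the residual ≈ 0.000003), 64 proofs with probability ≈ 0.980019, and 128 proofs with probability ≈ 0.088899.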
Let us calculate the maximum number of transactions in a block that 512 provers can process with a probability of at least 0.95 in 9 ticks.
The first 5 levels (including the root) will be processed each in 1 tick with a probability almost equal to 1. Therefore, we have at most 4 ticks for building the remaining levels. Note that the eighth level can be built in 1 tick with a very small probability of 0.088899, so this level requires two ticks. The probability of building it in no more than two ticks will be 0.088899 + 0.911101, which is practically equal to 1. That is, if there are 8 levels in the tree, then 2 ticks remain for the 6th and 7th levels, 1 tick for each level. According to the results in Table 1, the probability of building these two levels in 2 ticks is 0.999997 · 0.980019 = 0.980016, which is more than 0.95, therefore, a block with 128 transactions will be released with a probability of at least 0.95, which satisfies our requirements.
Similarly, it can be shown that the probability of a block with 256 transactions being released is significantly less than 0.95. Therefore, if there are 512 active provers, it is recommended to issue a block with 128 transactions.
Based on Table 1, Table 2 was built, which shows the recommended number of transactions in a block for different numbers of provers. All possible values of the number of provers are divided here into intervals, in accordance with the number of transactions in the block. For example, 2176 provers will build a block with 512 transactions with a probability of 0.95001, and 2175 provers with a probability of 0.949825. Therefore, if the number of provers is at least 2176, then the recommended number of transactions in a block is 512, and if the number of provers is from 998 to 2175, then the recommended number of transactions is 256. Here

m ≈ (ln n − ln ε) / (−ln(1 − 1/n)),  n = 2^{ℓ−1},  ε = 1 − Pr(τ^m_{M_ℓ} = ℓ).
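Reading the closing estimate as m ≈ (ln n − ln ε)/(−ln(1 − 1/n)), the required number of provers can be evaluated directly; a small sketch (the function name `provers_needed` is ours):

```python
from math import ceil, log

def provers_needed(n, eps):
    """Approximate number m of provers needed so that all n proofs of one
    tree level are built in a single tick with probability >= 1 - eps,
    from m ~ (ln n - ln eps) / (-ln(1 - 1/n))."""
    return ceil((log(n) - log(eps)) / (-log(1.0 - 1.0 / n)))
```

For instance, for one level of n = 32 proofs and ε = 3·10⁻⁶ this gives about 510 provers, in line with the Table 1 cell in which 512 provers succeed in a single tick with probability 0.999997.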

Conclusions
This paper is part of a series of works concerning sidechains with the Latus consensus and zk-SNARKs. The previous works were [30], which may be considered a restricted precursor of this one, and [46], which studies some game-theoretic aspects occurring when provers set prices for their proofs. All articles in the series are devoted to concrete practical problems, which may be formulated, in general, as the conditions for fully decentralized sidechains based on the Latus consensus protocol. We partially solved these problems by analyzing existing mathematical models and methods and by creating our own specific ones, like probability distributions on partially ordered sets, which are the most suitable for our purposes. A specific characteristic of this work is a number of hypotheses formulated on the basis of a large amount of numerical results obtained using infinite-precision calculations. In our opinion, the task of proving all of them seems rather non-trivial. The numerical results obtained at the end of the article allow one to choose correct values of some parameters to achieve stability and high throughput in sidechains. Further research continuing the series is planned to be devoted to a more general, more efficient, and more complicated approach, in which a series of blocks are built simultaneously, allowing provers to create proofs for several sequential blocks. Note that this approach allows one to increase throughput essentially without losing stability in the sidechain, and it is therefore useful and interesting.