Models for Generation of Proof Forest in zk-SNARK Based Sidechains

Abstract: Sidechains are among the most promising scalability and extended functionality solutions for blockchains. Application of zero knowledge techniques (Latus, Mina) allows for reaching a high level of security and general throughput, though it brings new challenges in keeping decentralization, where significant effort is required for robust computation of zk-proofs. We consider a simultaneous decentralized creation of various zk-proof trees that form proof-tree sequences in sidechains, in a model that combines the behavior of provers, either deterministic (mutually consistent) or stochastic (independent), and types of proof trees. We define the concept of efficiency of such a process, introduce its quantitative measure and recommend parameters for tree creation. In deterministic cases, the sequences of published trees are ultimately periodic and ensure the highest possible efficiency (no collisions in proof creation). In stochastic cases, we obtain a universal measure of prover efficiencies given by an explicit formula in one case or calculated by a simulation model in another case. The optimal number of allowed provers’ positions for a step can be set for various sidechain parameters, such as the number of provers, the number of time steps within one block, etc. Benefits and restrictions for utilization of non-perfect binary proof trees are also explicitly presented.


Introduction
Sidechains (SCs) were first proposed in [1] and quickly became a universal and convenient tool in blockchain technology. While [1] proposed sidechains with a "classical" proof-of-work (PoW) protocol for use in the Bitcoin blockchain, the subsequent article [2] builds secure proof-of-stake (PoS) sidechains, and articles [3,4] create a special type of sidechain consensus, the Latus protocol, which is secure for both PoW and PoS sidechains. SCs may be considered an additional construction bound to a blockchain (called the mainchain (MC)) with the aim of providing additional functionality. SCs should be bound to an MC, as described in the mentioned articles, to provide stronger security guarantees using blockchain properties such as liveness and persistence [5]. Binding means that SCs should also send some information to the MC to guarantee the fairness of transformations in the SC [3].
In Latus, this information contains a series of recursive zk-SNARK-proofs [6,7] to establish decentralized and verifiable cross-chain transfers. Such an approach is similar to the ones proposed in [8] (Mina) and [9] (zkSync Era), but with additional features that allow its secure usage in sidechains. Latus introduces a special dispatching scheme that assigns the generation of proofs randomly to interested parties, who then implement these tasks in parallel and submit generated proofs to the blockchain. An incentive scheme provides a reward for each accepted proof.
There are a variety of purposes to use SCs in different blockchains: for making smartgrid systems scalable and adaptable [10]; for secure data isolation in scalable consortium blockchain architecture based on world-state collaborative storage methods [11]; for secure parallel computation [12]; and for creating an exchange platform that allows the ability to trade tokens issued from different sidechains [13].
In this paper, we assume that SC uses the Latus protocol [4], which is a hybrid PoS based on Ouroboros Praos [16]. However, as SCs may be based on various consensuses, our main results may be applied for other consensus protocols (such as proof of work, proof of stake, etc.) with similar properties, allowing distributed proof generation and recursive SNARKs.
Following the modern trends in blockchain investigations, we use SNARKs to prove the correct functioning of SCs (detailed survey about different types of SNARKs used in blockchain can be found in [17]). These cause some modifications of the Merkle tree, which we call the "proof tree".
Distributed proof generation for block creation is the basic feature of the Latus consensus. In brief, the procedure of block creation is as follows. A block forger (an entity that creates the block) shares a list of transactions to be included into the block. Then, other entities, called provers, construct SNARK-proofs for these transactions and also for each node of the corresponding proof trees built on these transactions.
Each prover, who creates a SNARK-proof, sets the prices for their proofs, according to the pricing policy of the current epoch. In the case of collisions, when there are several proofs for some node of the binary tree, the block forger chooses the cheapest one. Proofs that were included into the block are paid by the block forger from the corresponding transaction fees.
Our previous papers [18][19][20] considered only issues related to a single block creation. We used probabilistic and game-theoretic approaches to answer the questions on the optimal choice of SC parameters (e.g., a recommended number of transactions per block and a recommended value of incentives chosen). The model in which we obtain our results is very close to realistic with only one simplifying assumption: splitting of the block construction process into a fixed number of similar steps. Such assumption can be justified by the fact that the network synchronization time is essentially smaller than the proof generation time.
In this paper, unlike the previous ones, we consider algorithms related to the simultaneous construction of trees by different block forgers. The sequence of such trees forms a linearly ordered forest. When the next block is published, the leftmost tree is included into it, and the remaining ones are regarded as a buffer for further block generation. The base assumptions used in this paper are almost the same as in our previous papers, with the following modification. In practice, a blockchain's sequences of transitions are proved using a base SNARK for certifying single state transitions (in leaves) and a merge SNARK for merging two proofs (in other vertices) ([21], Section 4.1.1). We consider a scheme that replaces each base proof for a transaction by a merged proof for a pair of leaves from a new additional level. Thus, we obtain a mathematically equivalent model where trees are one level higher and a pair of fresh leaves corresponds to each transaction; as a result, we obtain more succinct algorithms.
The paper is organized as follows. The necessary preliminary results on binary trees are presented in Section 2. It is convenient for us to consider the sets of strict binary trees and of non-strict binary trees as specific realizations of the free magma $M = \bigsqcup_{n>0} M_n$ with a single generator. In Section 2.2, we introduce compatible monoid structures on infinite sequences of strict binary trees and of positive integers. The background of these constructions is explained in Appendix A. In Section 2.3, we introduce the concept of a perfect forest, which is necessary for the description of the main algorithm. Section 3 presents the general scheme for the 2 × 2 algorithms corresponding to the provers' behavior, deterministic (mutually consistent) or stochastic (independent), and to the shapes of generated trees, strict binary or perfect binary. The scheme works with a potentially infinite sequence of trees t called a buffer. In the external for-loop, the first element is removed from the buffer and published as the next block. Block generation is divided into a fixed number of steps, each of which calls the OneStep() procedure. In all 2 × 2 cases, each suitable pair of subsequent trees is merged into a single tree. The selection of such suitable pairs, deterministic or stochastic, is the only difference among the 2 × 2 cases (see Table 1). Its result can be encoded as a sequence Λ of trees of height 1. After this, the result of the OneStep() procedure can be described as a function $f : t \mapsto \Lambda \bullet t$ acting on the buffer t via the product defined in Section 2.2. Iterating f over all steps and then removing the first tree from the buffer determines a function F(t) describing the transformation of the buffer during block generation. In the subsection on the algorithm in terms of numbers of leaves, we consider a reduction of the above scheme in which only the number of leaves of each tree is taken into account. This is sufficient to study the throughput parameters of the blockchain. This reduction is possible due to the homomorphism of monoids described in Section 2.2.
In Section 4, we consider the case when behavior is deterministic. We prove that the sequence $(t, Ft, F^2 t, \ldots)$ of buffer states takes a finite number of values and, hence, this sequence and the sequence of published trees are ultimately periodic. In Section 4.1, deterministic generation of strict binary trees is considered. The sequences of buffer states and published blocks are ultimately periodic with a period of length 1. This follows from the fact that the transformation of the buffer after each block publishing is a weak contraction with respect to the Hamming distance. Under a natural assumption, the height of the tree published over the period is the sum of (the integer part of) the binary logarithm of the number of provers and the number of steps. An explicit formula for the fixed point is obtained in terms of the product on infinite sequences of binary trees. In Section 4.2, deterministic generation of perfect binary trees is considered. It is observed that the period of the sequence of published trees consists of perfect trees of two heights, $h_{max} - 1$ and $h_{max}$. The total efficiency is independent of the shapes of generated trees, can be expressed by an explicit formula, and admits natural asymptotics. Explicit results are obtained in the cases of a single prover or of a single step.
In Section 5, we consider the case when behavior is stochastic. In Section 5.1, it is shown that the placement of provers into suitable positions at each stage can be described by the classical occupancy distribution. In the deterministic case, the total number of generated proofs is the product of numbers of published blocks, of steps in a block and of provers. In the stochastic case, we define the total efficiency (respectively the block publishing efficiency) as the total number of generated proofs (respectively the number of proofs embodied in published blocks) divided by the above product. Thus, both efficiencies are numbers in the interval [0, 1]. They are convenient to compare the cases of various shapes and various values of integer parameters. The total efficiency is independent of shapes of generated trees and is expressed by an explicit formula and admits natural asymptotics. In Section 5.2, average values of the block publishing efficiency are calculated using a simulation model. We compare two cases when the shape of trees is binary strict or perfect. The maxima of these values as functions of numbers of positions are investigated. In Section 5.3 for generation of strict binary trees in a stochastic case we investigate an analog of the deterministic Formula (22) for heights of trees. Using the least squares method, we find the best approximation of the expected height of generated trees linear on the number of steps and logarithmic on the number of provers.
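The definition of total efficiency above can be illustrated with the classical occupancy expectation. The sketch below is an illustration only: it assumes that every position selected by at least one prover yields exactly one generated proof per step, and the function names are hypothetical. Whether the resulting expression coincides with the explicit formula derived in Section 5.1 is an assumption, not something stated in this section.

```python
from fractions import Fraction
from itertools import product


def expected_occupied(n_pr: int, n_pos: int) -> Fraction:
    """Expected number of positions chosen by at least one of n_pr provers,
    each choosing uniformly and independently among n_pos positions."""
    return n_pos * (1 - Fraction(n_pos - 1, n_pos) ** n_pr)


def total_efficiency(n_pr: int, n_pos: int) -> Fraction:
    """Expected proofs per step divided by the number of provers (a value in [0, 1])."""
    return expected_occupied(n_pr, n_pos) / n_pr


def brute_force_occupied(n_pr: int, n_pos: int) -> Fraction:
    """Exact expectation by enumerating all n_pos ** n_pr assignments (small cases only)."""
    total = sum(len(set(choice)) for choice in product(range(n_pos), repeat=n_pr))
    return Fraction(total, n_pos ** n_pr)
```

Under these assumptions a single prover is always fully efficient, while two provers sharing two positions collide half of the time.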
In Appendix A, we recall the concept of a non-symmetric operad and construction of the corresponding PRO. As the following step, one can obtain a monoid structure of infinite sequences of operations. The non-symmetric magma operad and the non-symmetric semigroup operad as its factor come from corresponding free objects with a single generator.
The corresponding monoids of infinite sequences of strict binary trees and its factor-monoid given by taking the number of leaves of each tree are considered in Section 2.2.
In Appendix B, we describe the software implementation of our algorithms and explain how they are used in the main text.

Binary Trees and Free Magmas
Trees, as a part of graph theory, are used in mathematics ([22], Chapter 5) and computer science (see the classical volume [23]). In this subsection, we give some necessary definitions related to various types of binary rooted trees and forests, as there is no precise description convenient for our purposes and the terminology may be confusing.
Rooted trees can be defined either (1) recursively, or as (2) partially ordered sets, or as (3) special types of graphs. All of these points of view are useful in their respective contexts.
A tree is defined as a connected acyclic undirected graph. Choosing a vertex in a tree as a root determines the natural direction for all edges (towards the root). However, usually it is not shown explicitly.
A rooted tree is a directed graph with a distinguished vertex called the root such that there exists a unique path from each vertex to the root. Thus, the set of vertices is equipped with a natural partial order: $v \le w$ if there exists a path from v to w. In addition, the rooted tree itself becomes the Hasse diagram for this partial order (being drawn root up). If v → w is an edge in a directed rooted tree, we say that v is a child of w, and w is a parent of v. A vertex that has no child is called a leaf. Let us denote as ht t the height of the tree t, i.e., the maximal length of a path from a leaf to the root. Thus, leaves are minimal elements, and the root is the greatest element in the corresponding partially ordered set.
Each tree can be embedded in the plane. For a rooted tree all such embeddings are classified by linear orderings for the children of each vertex. Thus, if all such orderings are fixed, we say they are an ordered rooted tree (or a plane rooted tree).
Here, we mainly consider strict binary plane rooted trees (instead of the term "strict binary tree", the terms "complete binary tree" and "full binary tree" are also used), where "strict binary" means that every vertex is either a leaf or has exactly two children (left and right). In this case, for each non-leaf vertex, its left child vertex and its right child vertex are roots of maximal binary subtrees called, respectively, a left child tree and a right child tree.
Notation 1. Let us denote as $n_v(t)$ the number of vertices, as $n_i(t)$ the number of internal vertices and as $n_\ell(t)$ the number of leaves in a rooted tree t.
In a strict binary tree t these numbers satisfy the identities
$n_\ell(t) = n_i(t) + 1, \qquad n_v(t) = n_i(t) + n_\ell(t) = 2 n_\ell(t) - 1. \quad (1)$
A magma is a set equipped with a binary operation without any additional properties (Chapter 1 of [23]; [24], 0.2). The free magma M with a single generator * (up to isomorphism) is described recursively as
$M_1 = \{*\}, \qquad M_n = \bigsqcup_{p+q=n} M_p \times M_q \ \text{ for } n > 1.$
Thus, an element of $M_{n+1}$ is a bracketing of a string of n + 1 symbols *, i.e., the insertion of n left parentheses and n right parentheses defining the order of an n-fold application of the binary operation; e.g., the five elements of $M_4$ are presented in Figure 1a (with the outer brackets removed). The cardinality of $M_{n+1}$ is given by the Catalan number $C_n := \frac{1}{n+1}\binom{2n}{n}$. Various kinds of objects that are counted by Catalan numbers are called Catalan families. Stanley's book [25] presents 214 Catalan families, including various types of trees.
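The Catalan count can be checked directly for small n. The following sketch enumerates bracketings using the standard recursive description of the free magma on one generator (an element of $M_n$ with n > 1 is a pair of elements of $M_p$ and $M_q$ with p + q = n); the string encoding and the function names are illustrative choices, not notation from the paper.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def bracketings(n: int) -> tuple:
    """All elements of M_n, written as fully parenthesized strings of n symbols '*'."""
    if n == 1:
        return ("*",)
    out = []
    for p in range(1, n):                      # split n = p + (n - p)
        for left in bracketings(p):
            for right in bracketings(n - p):
                out.append(f"({left}{right})")
    return tuple(out)


def catalan(n: int) -> int:
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)
```

For instance, `bracketings(4)` lists the five elements of $M_4$ (with outer brackets kept).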
Strict binary trees are mentioned in Exercise 2.5 [25]. The generator of the free magma of strict binary trees is the zero height tree • whose single vertex is the root. For two binary trees t 1 , t 2 , the binary tree t 1 t 2 is a binary tree with a new root, whose left child tree is t 1 and the right child tree is t 2 . Strict binary trees with four leaves corresponding to elements of M 4 are presented in Figure 1c.
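A strict binary tree and the magma product can be modelled with nested pairs. The sketch below is one possible encoding (the zero-height tree as `None`, a product as a 2-tuple); the helper names are hypothetical. It also lets one check the relation between the numbers of leaves, internal vertices and all vertices on examples.

```python
LEAF = None  # the zero-height tree: a single vertex that is both root and leaf


def magma_product(t1, t2):
    """New root whose left child tree is t1 and right child tree is t2."""
    return (t1, t2)


def n_leaves(t) -> int:
    return 1 if t is LEAF else n_leaves(t[0]) + n_leaves(t[1])


def n_internal(t) -> int:
    return 0 if t is LEAF else 1 + n_internal(t[0]) + n_internal(t[1])


def n_vertices(t) -> int:
    return n_leaves(t) + n_internal(t)


def height(t) -> int:
    """Maximal length of a path from a leaf to the root."""
    return 0 if t is LEAF else 1 + max(height(t[0]), height(t[1]))
```

With this encoding the number of leaves is additive under the product, which is the property used later when trees are replaced by their leaf counts.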
Non-strict binary trees (often just "binary trees" in the computer science literature) provide yet another example of a Catalan family (Exercise 2.4 in [25]). They can be defined recursively in the same way as strict binary trees, with the only difference that their free magma generator is the empty tree. A non-empty non-strict binary tree can be represented by a plane rooted tree, where each vertex has 0, 1 or 2 children, but with an additional structure: every single child must be labeled "left" or "right". Such trees with three vertices corresponding to elements of $M_4$ are presented in Figure 1b. Note that a non-strict binary tree is obtained from the corresponding strict binary tree by removing all its leaves.

Monoid of Sequences of Strict Binary Trees
Here, we consider the composition $t \bullet t'$ of infinite sequences of strict binary trees given by the gluing of subsequent leaves in t with subsequent roots in $t'$, and the composition $\ell \bullet \ell'$ of infinite sequences of positive integers corresponding to the numbers of leaves in a tree.
First, for a strict binary tree t with n leaves and strict binary trees t 1 , . . . , t n , we define the composition t • (t 1 , . . . , t n ) obtained by gluing together the i-th leaf in t with the root of t i for i = 1, 2, . . . n.
This gives a non-symmetric operad structure on strict binary trees. One can then apply the construction from Appendix A to obtain the monoidal structures described below. The details of Appendix A are not necessary to understand the remainder of the material.
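Let $T = T_{strict}$ be the set of sequences $(t_i)_{i \ge 0}$ of strict binary trees with only a finite number of non-zero height trees, equipped with the binary operation $(t, t') \mapsto t \bullet t'$, where
$(t \bullet t')_i = t_i \bullet \bigl(t'_{s_i}, \ldots, t'_{s_{i+1}-1}\bigr), \qquad s_i = \sum_{k<i} n_\ell(t_k), \quad (2)$
i.e., each $t_i$ is successively applied in the sense of (A1) to the first not yet used $n_\ell(t_i)$ elements of $t'$. Let $L = L_{strict}$ be the set of sequences $(\ell_i)_{i \ge 0}$ of positive integers with only a finite number of elements > 1, equipped with the binary operation $(\ell, \ell') \mapsto \ell \bullet \ell'$, where
$(\ell \bullet \ell')_i = \sum_{j=s_i}^{s_{i+1}-1} \ell'_j, \qquad s_i = \sum_{k<i} \ell_k, \quad (3)$
i.e., successively each $(\ell \bullet \ell')_i$ is the sum of the first not yet used $\ell_i$ elements of $\ell'$.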

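Corollary 1.
The above binary operations turn both T and L into monoids. The units are the constant sequences $(\bullet)_{i \ge 0}$ of zero-height trees and $(1)_{i \ge 0}$, respectively.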
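The product on sequences of leaf counts can be sketched on finite lists. The code below is an illustration under an assumed normal form: a finite Python list stands for an infinite sequence continued by 1s, with trailing 1s stripped, so the unit is the empty list. The function name `compose` is hypothetical.

```python
def compose(l, lp):
    """(l . lp)_i is the sum of the first not-yet-used l_i entries of lp.

    Both arguments are finite lists understood as infinite sequences
    continued by 1s; the result is returned with trailing 1s stripped.
    """
    need = sum(l)                                    # entries of lp consumed by l
    padded = list(lp) + [1] * max(0, need - len(lp))
    out, pos = [], 0
    for li in l:
        out.append(sum(padded[pos:pos + li]))
        pos += li
    out.extend(padded[pos:])     # the implicit tail of 1s in l copies the rest of lp
    while out and out[-1] == 1:
        out.pop()
    return out
```

On examples, the empty list acts as a two-sided unit and the operation is associative, in line with the monoid statement above.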

Perfect Binary Trees and Forests
A rooted tree is called perfect ("complete" by some authors) if all leaves are at the same distance from the root (which equals the height h of the tree).

Notation 2.
For each non-negative integer h, the unique perfect binary tree of height h is denoted as t h .
The perfect binary tree $t_h$ has $2^h$ leaves; moreover, $2^d$ vertices are at distance d from the root (they can be indexed consecutively by $0 \le j < 2^d$, whose binary string representation has length d), and there are $2^{h+1} - 1$ vertices in total. See Figure 2.
All perfect binary trees are built recursively using the magma square: $t_0 = \bullet$, $t_{h+1} = t_h t_h$. Perfect binary trees are the most convenient for storing a maximal number of leaves (which correspond to transactions).
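The recursive construction by magma squares and the vertex counts can be verified on small heights. This is a minimal sketch using nested pairs (the zero-height tree encoded as `None`); the function names are illustrative.

```python
def perfect_tree(h: int):
    """Perfect binary tree of height h, built by h successive magma squares."""
    t = None                      # zero-height tree
    for _ in range(h):
        t = (t, t)                # magma square
    return t


def level_sizes(t) -> list:
    """Number of vertices at each distance d = 0, 1, ... from the root."""
    sizes, level = [], [t]
    while level:
        sizes.append(len(level))
        level = [child for v in level if v is not None for child in v]
    return sizes


def count_leaves(t) -> int:
    return 1 if t is None else count_leaves(t[0]) + count_leaves(t[1])
```

For example, `level_sizes(perfect_tree(3))` gives the level sizes of the perfect tree with eight leaves.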
A forest (respectively plane forest) f is a co-product (disconnected disjoint union) of a family of trees (respectively plane trees) (t i ) i∈I . In both cases, the set of components I is unordered.
Let us recall that a subset I ⊆ P in a poset P is called a down-set (respectively up-set) if, for each x ∈ I and y ∈ P with $y \le x$ (respectively $y \ge x$), we have y ∈ I. Note that a subset I ⊆ P is a down-set if and only if its complement P\I is an up-set.

Lemma 1.
For a finite sequence $f = (t_i)_{0 \le i < p}$ of strict binary trees, the following conditions are equivalent:
1. There exists a bracketing, i.e., an element of the free magma $M_p$, such that, after the corresponding application of p − 1 magma operations, a perfect binary tree t(f) is obtained;
2. $(t_i)_{0 \le i < p}$ is a down-set of some perfect binary tree containing all its leaves, with components ordered from left to right;
3. (a) Each tree $t_i$ in the family is perfect, The total sum $\sum_{j=0}^{p-1} n_\ell(t_j)$ is a power of 2.
Proof. 1 ⇔ 2: For a down-set in the perfect binary tree containing all its leaves, its complement is the non-strict binary tree corresponding to the appropriate bracketing. 1 ⇒ 3: (a) and (c) are obvious; to prove (b), let $t'$ be a subtree of height $h'$ in a perfect tree t of height h such that each leaf in $t'$ is a leaf in t. Then, there exists a subforest in t that consists of $2^{h-h'}$ perfect trees of height $h'$ including $t'$.
3 ⇒ 2: Split the leaves of f into equal left and right halves. If there is a tree $t_i$ in our forest whose leaves are in both halves, then, from 3(c), we conclude that the forest consists of this unique tree. Otherwise, we can represent $f = f_\ell \sqcup f_r$ as a co-product of two forests satisfying condition 3, so we set $t(f) = t(f_\ell)\, t(f_r)$ and repeat the above choice for $f_\ell$ and $f_r$.

Definition 1.
A finite sequence $(t_i)_{0 \le i < p}$ of binary trees satisfying the conditions of the above lemma is called a perfect binary forest. Figure 3 shows an example of a perfect binary forest and the corresponding perfect binary tree $t_4 = (t_2 (t_1 (t_0 t_0))) (t_2 ((t_0 t_0) t_1))$. In what follows, we use (potentially) infinite sequences of proof trees $(t_i)_{i \ge 0}$ with only a finite number of non-zero height trees as a buffer of proofs.
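Whether a left-to-right sequence of perfect trees forms a perfect binary forest can be tested from the heights alone. The sketch below uses an alignment reformulation that is an assumption of this illustration rather than a statement from the lemma: subtrees of a perfect tree cover exactly the aligned dyadic blocks of its leaves, so each tree with $2^h$ leaves must start at a leaf offset divisible by $2^h$, and the total number of leaves must be a power of 2.

```python
def is_power_of_two(n: int) -> bool:
    return n > 0 and n & (n - 1) == 0


def is_perfect_forest(heights) -> bool:
    """heights[i] is the height of the i-th perfect tree, left to right."""
    offset = 0                      # number of leaves to the left of the current tree
    for h in heights:
        size = 1 << h               # a perfect tree of height h has 2**h leaves
        if offset % size != 0:      # the tree would straddle a subtree boundary
            return False
        offset += size
    return is_power_of_two(offset)
```

The example in the definition above, with heights (2, 1, 0, 0, 2, 0, 0, 1), passes this test, while (0, 1, 0) does not, although its leaf total is 4.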

Definition 2.
An infinite sequence of binary trees $(t_i)_{i \ge 0}$ with only a finite number of non-zero height trees is called perfect if some initial fragment $(t_i)_{0 \le i < p}$ of it, containing all non-zero height trees, is perfect.

Proposition 1.
A sequence t ∈ T is perfect if there exists $t' \in T$ such that $t' \bullet t$ is a sequence that consists of a single perfect tree followed by zero-height trees.

Definition 3.
A pair $(t_i, t_{i+1})$ of subsequent trees in a perfect sequence is called perfect if the new sequence, with the pair $(t_i, t_{i+1})$ replaced with the product $t_i t_{i+1}$, remains perfect.
Note that the binary operation between two strict binary trees can be written as $t\, t' = \wedge \bullet (t, t')$, where ∧ is the strict binary tree of height 1. Thus, replacing $(t_i, t_{i+1})$ with the product $t_i t_{i+1}$ is the same as taking the composition $\wedge_i \bullet t$ in the sense of (2). Equivalently, suppose that the vertices of the whole perfect tree are labeled as in Figure 2. A pair $(t_i, t_{i+1})$ of neighboring trees in its down-set is perfect if their roots are labeled by the strings 'w0' and 'w1', respectively, for some $w \in \{0, 1\}^*$.
Obviously, two different perfect pairs cannot intersect. Thus, perfect pairs in a perfect sequence form an infinite sequence.
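Perfect pairs can be located from the leaf counts alone. The following sketch assumes the input is a finite prefix of a perfect sequence, given as the numbers of leaves of its trees; it uses the label criterion in arithmetic form: two neighbors with roots 'w0' and 'w1' have equal size, and the left one starts at a leaf offset divisible by twice that size. The function name is hypothetical.

```python
def perfect_pairs(leaves) -> list:
    """Indices j such that (leaves[j], leaves[j + 1]) is a perfect pair.

    `leaves` is a finite prefix of a perfect sequence, as numbers of leaves.
    """
    pairs, offset = [], 0           # offset = total leaves to the left of tree j
    for j in range(len(leaves) - 1):
        same_size = leaves[j] == leaves[j + 1]
        left_child = offset % (2 * leaves[j]) == 0   # root label ends with '0'
        if same_size and left_child:
            pairs.append(j)
        offset += leaves[j]
    return pairs
```

On a run of zero-height trees the perfect pairs are the aligned ones, (0, 1), (2, 3), and so on, and no two returned pairs share a tree.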

The General Scheme
A special feature of Algorithm 1 considered in this paper is the simultaneous construction of several trees forming a linearly ordered forest, rather than a single tree. Periodically, the leftmost tree is included into the published block, and the remainder is considered as a buffer for further block generation.

Algorithm 1: The general scheme of blocks generation
Input: b_behavior, b_shape, n_bl, n_st, n_pr, n_pos;
for n ← 0 to n_bl − 1 do
    for k ← 0 to n_st − 1 do
        OneStep(b_behavior, b_shape, n_pr, n_pos);
    Block publishing: remove t_0 from the buffer t and append t_0 to the list t©;

Input parameters are:
• Dichotomous b_behavior regulates the behavior of provers, deterministic (mutually consistent) or stochastic (independent);
• Dichotomous b_shape describes the shape of generated trees, only perfect binary trees or arbitrary strict binary trees;
• n_bl is the number of blocks published during the epoch;
• n_st is the number of steps for one block generation;
• n_pr is the number of provers;
• n_pos is the number of positions allocated for proof.
During the work of the algorithm, a potentially infinite sequence of binary trees t = (t i ) i 0 arises that we call a buffer. It describes the state of the system and is used in the further block generation.
The output is the sequence t © = (t © i ) 0 i<n bl of binary trees included into published blocks (one tree per block).
The published list is initialized with the empty list t © ← (). Theoretically, we assume that at the beginning t = (•) i 0 is a potentially infinite sequence of zero height trees. In practice, we initialize t with an empty list and put zero-height trees there as needed.

Remark 1.
In a specific implementation, instead of two lists of binary trees, one can work with a single list and an integer pointer p; trees with indexes i < p are considered as published, and trees with indexes $i \ge p$ form a buffer. Block publishing in these terms is just an increment of the pointer p ← p + 1.
The external for-loop of Algorithm 1 runs over all n bl blocks in the epoch. The body of this loop consists of the internal for-loop that runs over n st steps in the block and the block publishing directives. The body of the internal loop consists of Procedure OneStep(b behavior , b shape , n pr , n pos ).
Procedure OneStep(b_behavior, b_shape, n_pr, n_pos)
switch b_behavior do
    case deterministic do
        switch b_shape do
            case strict do
                for k ← n_pr − 1 to 0 by −1 do
                    in t replace the pair (t_{2k}, t_{2k+1}) with their magma product;
            case perfect do
                for k ← n_pr − 1 to 0 by −1 do
                    in t replace the k-th perfect pair with their magma product;
    case stochastic do
        RandomPositions ← ∅;
        for k ← 0 to n_pr − 1 do
            put to RandomPositions a random integer from 0 to n_pos − 1;
        sort RandomPositions descending;
        switch b_shape do
            case strict do
                foreach k in RandomPositions do
                    in t replace the pair (t_{2k}, t_{2k+1}) with their magma product;
            case perfect do
                foreach k in RandomPositions do
                    in t replace the k-th perfect pair with their magma product;

This procedure presents the general scheme for 2 × 2 different cases corresponding to combinations of the two dichotomous parameters b_behavior and b_shape. It acts on the buffer: some subsequent pairs of trees $(t_k, t_{k+1})$ are replaced with their magma product $t_k t_{k+1}$, and the choice of these pairs depends essentially on the dichotomous parameters b_behavior and b_shape.
In the case "b behavior is deterministic", the number of such pairs coincides with the number of provers n pr ; then, for the case that "b shape is strict", these pairs are first n pr subsequent pairs of the trees in the buffer: {(t 0 , t 1 ), (t 2 , t 3 ), . . . (t 2n pr −2 , t 2n pr −1 )}; for the case that b shape is perfect, we use first n pr perfect pairs in the buffer. In the case "b behavior is stochastic", first n pos subsequent/perfect positions are reserved for the random choice. Each prover independently with equal probability selects one of these positions. For positions selected by at least one prover, the corresponding pair of trees is replaced with its magma product.
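The four cases of one step can be sketched in terms of leaf counts. The code below is an illustration, not the software of Appendix B: the buffer is a finite list of leaf counts implicitly continued by 1s, position k in the strict case is taken to be the pair $(t_{2k}, t_{2k+1})$, position k in the perfect case is the k-th perfect pair, and the names `one_step` and `perfect_pair_indices` as well as the string-valued parameters are assumptions of this sketch. It requires n_pr ≥ 1 and n_pos ≥ 1.

```python
import random


def perfect_pair_indices(buf):
    """Indices j where (buf[j], buf[j+1]) are equal and aligned (a perfect pair)."""
    pairs, offset = [], 0
    for j in range(len(buf) - 1):
        if buf[j] == buf[j + 1] and offset % (2 * buf[j]) == 0:
            pairs.append(j)
        offset += buf[j]
    return pairs


def one_step(buf, behavior, shape, n_pr, n_pos, rng=random):
    """One step on a buffer of leaf counts (finite list, implicit tail of 1s)."""
    if behavior == "deterministic":
        positions = set(range(n_pr))                 # provers take positions 0..n_pr-1
    else:
        # each prover independently picks one of n_pos positions; collisions collapse
        positions = {rng.randrange(n_pos) for _ in range(n_pr)}
    width = max(positions) + 1
    work = list(buf) + [1] * (2 * width + 1)         # enough zero-height trees
    if shape == "strict":
        starts = [2 * k for k in positions]
    else:
        pairs = perfect_pair_indices(work)
        starts = [pairs[k] for k in positions]
    for j in sorted(starts, reverse=True):           # merge right to left
        work[j:j + 2] = [work[j] + work[j + 1]]
    while work and work[-1] == 1:                    # drop the implicit tail
        work.pop()
    return work
```

Merging from right to left keeps the indices of the pairs still to be merged unchanged, which mirrors the descending loops in the procedure.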
For fixed input parameters b_behavior, b_shape, n_st, n_pr, n_pos, the following functions (deterministic or random) together with their domains and codomains are defined. Let us denote
• $\mathbb{T} = \mathbb{T}_{b_{shape}}$ the set of b_shape (i.e., strict/perfect) binary trees;
• $T = T_{b_{shape}}$ the set of sequences $t = (t_i)_{i \ge 0}$ of b_shape binary trees with finitely many non-zero height trees;
• $t^{[1]}$ the sequence obtained from t by the shift, i.e., $t^{[1]}_i = t_{i+1}$.
Then
$f = f_{b_{behavior}, b_{shape}, n_{pr}, n_{pos}} : T \to T, \quad (4)$
$F = F_{b_{behavior}, b_{shape}, n_{st}, n_{pr}, n_{pos}} : T \to T, \quad t \mapsto \bigl(f^{n_{st}}(t)\bigr)^{[1]}, \quad (5)$
$\pi = \pi_{b_{behavior}, b_{shape}, n_{st}, n_{pr}, n_{pos}} : T \to \mathbb{T}, \quad t \mapsto f^{n_{st}}(t)_0. \quad (6)$
Here, the function (4) is determined by Procedure OneStep. It can be described as a product (2) with the sequence Λ = Λ b behavior ,b shape ,n pr ,n pos ∈ T of trees of height 1. The internal for-loop of Algorithm 1 means the n st -th iteration of this function. Then, block publishing determines the map (6), and the state of the buffer after each block publishing is changed according to the map (5). Table 1 shows the whole scheme. These four cases are also implemented in the software described in Appendix B.

The Algorithm in Terms of Numbers of Leaves
Let us recall that a perfect tree is completely determined by its height, but there are $C_n$ strict binary trees with n + 1 leaves. However, the exact shape of trees is not important for the study of the blockchain's throughput parameters. Thus, we can collect information about the number of leaves (and, therefore, by (1), about the number of vertices) and optionally about the heights of trees. Instead of the sequence of trees $t = (t_i)_{i \ge 0}$, one can consider the sequence of integers $\ell = (\ell_i)_{i \ge 0}$, where $\ell_i = n_\ell(t_i)$ is the number of leaves of the corresponding tree. The initial state is described as $(\ell_i = 1)_{i \ge 0}$. The magma operation on strict trees corresponds to the addition of the numbers of leaves: $n_\ell(t_i t_{i+1}) = n_\ell(t_i) + n_\ell(t_{i+1})$.
A pair of numbers $(\ell_j, \ell_{j+1})$ in a perfect sequence $(\ell_i)_{i \ge 0}$ is perfect (i.e., corresponds to a perfect pair of trees) if
$\ell_j = \ell_{j+1}$ and $2\ell_j$ divides $\sum_{i=0}^{j-1} \ell_i. \quad (7)$
If, moreover, the sequence $(\ell_i)_{i \ge 0}$ is non-increasing, the condition (7) takes an equivalent simplified form (8). Let us denote by $L = L_{b_{shape}}$ the set of sequences $(\ell_i)_{i \ge 0}$ of positive integers with only a finite number of elements > 1 that are, in the case that b_shape is perfect, additionally perfect. For a version of Algorithm 1 in terms of numbers of leaves, one considers the functions corresponding to (4)-(6). We keep the same names for the functions; the different (co)domains avoid ambiguity:
$f : L \to L, \quad \ell \mapsto \lambda \bullet \ell, \quad (9)$
$F : L \to L, \quad \ell \mapsto f^{n_{st}}(\ell)$ with its first element removed, $\quad (10)$
$\pi : L \to \mathbb{Z}_{>0}, \quad \ell \mapsto f^{n_{st}}(\ell)_0, \quad (11)$
where in (9), $\lambda = \lambda_{b_{behavior}, b_{shape}, n_{pr}, n_{pos}} \in L$ is a sequence of 1s and 2s.
Elements of the double sequence $\ell^{(n,k)}$ are naturally linearly ordered, lexicographically on the pairs (n, k). In this sequence, The length of the sequence $\ell = (\ell_i)_{i \ge 0} \in L$ is defined as follows:

Deterministic Case
In this section, we consider only the case of Algorithm 1 when b_behavior is deterministic. We consider sequences $\ell^{(n,k)}$ for an arbitrary n, which corresponds to the limiting case $n_{bl} \to \infty$ of an infinite outer for-loop. One can also consider the infinite sequence $\ell^{©}$ of the numbers of leaves of published trees with elements $\ell^{©}_n = \pi(\ell^{(n)})$, $n \ge 0$.

Lemma 2.
In the case that b_behavior is deterministic, the function f : L → L preserves the property of a sequence to be non-increasing.
Proof. In the case b shape is perfect, take into account (8).

Corollary 2.
In the case that b_behavior is deterministic, the sequences $\ell^{(n,k)}$ from (12) are non-increasing for all $n \ge 0$ and $0 \le k \le n_{pr}$.
Let us denote as $L^{\downarrow}$ the set of non-increasing sequences from L. For a non-increasing sequence $\ell \in L^{\downarrow}$ (e.g., for all $\ell^{(n,k)}$ when b_behavior is deterministic), the Formula (14) can be simplified: For $\ell \in L$, the sum $N_i(\ell) = \sum_{i \ge 0} (\ell_i - 1)$ has only finitely many non-zero summands and, according to (1), can be interpreted as the total number of internal vertices. Independently of b_shape, for $\ell \in L$, we have the identities
Lemma 3. In the case "b_behavior is deterministic", for fixed n_st, n_pr, the total numbers of internal vertices (and, hence, the lengths and all elements) of all $\ell^{(n,k)}$ are bounded together, i.e.,:
Proof. In the case "b_shape is strict", $\mathrm{Len}\, \ell^{(n,k)} \le n_{pr}$ and $\mathrm{Len}\, \ell^{(n)} < n_{pr}$. This follows from the observation that $\mathrm{Len}\, \ell \le 2 n_{pr}$ implies $\mathrm{Len}\, f(\ell) \le n_{pr}$. In the case that "b_shape is perfect", the number of non-zero height trees of the same height in the buffer is bounded by n_pr + 1. See (27) for more accurate estimates.
Then, we can estimate from above the total number of internal vertices: In the second case, we use the fact that all  n pr > 1, the function C b shape is monotone increasing and, hence, invertible. Let N i ( (n) ) C b shape (n st n pr + 1). Then, π( (n) ) (n) 0 n st n pr + 1 and, by (16), we have N i (F( )) N i ( ). Thus, sup n,k N i ( (n,k) ) < C b shape (n st n pr + 1) + n st n pr .

Definition 4.
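The sequence $x = (x_i)_{i \in \mathbb{Z}_{\ge -1}}$ is called ultimately periodic if there exist $r \in \mathbb{Z}_{\ge 0}$ and $s \in \mathbb{Z}_{>0}$ such that, if i > r, then $x_i = x_{i+s}$. In this case, we write $x = \bigl(\underbrace{x_0, \ldots, x_r}_{\text{preperiod}}, (\underbrace{x_{r+1}, \ldots, x_{r+s}}_{\text{period}})\bigr)$.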
The minimal preperiod is the minimal r satisfying the above condition. The minimal period is the minimal s for minimal r satisfying the above condition.
Given a function g : X → X and an element $x_0 \in X$, the sequence $x_0, g(x_0), g(g(x_0)), \ldots$ is ultimately periodic whenever X is finite. Lemma 3 implies that the set $\{\ell^{(n)} \mid n \ge 0\}$ is finite.
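The preperiod and the period of an iterated map on a finite set can be found by recording the first visit of each state. The sketch below is a generic illustration (the function name and the return convention are assumptions): it returns the index at which the cycle is first entered together with the minimal period, so the minimal r of the definition above equals the first returned value minus 1. States must be hashable, e.g., buffer states stored as tuples.

```python
def preperiod_and_period(g, x0):
    """Return (mu, s): x_mu is the first element on the cycle, s is the minimal period."""
    seen = {}                # state -> index of its first occurrence
    x, i = x0, 0
    while x not in seen:
        seen[x] = i
        x = g(x)
        i += 1
    mu = seen[x]             # the first repeated state was first seen at index mu
    return mu, i - mu
```

Applied to the buffer transformation F with the initial buffer as $x_0$, this gives the length of the preperiod and of the period of the sequence of buffer states.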

Generation of Strict Binary Trees
In this subsection, we consider the case "b behavior is deterministic" and "b shape is strict". In this case, one-step functions (4) and (9)  .
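Let us consider the Hamming distance for non-increasing sequences $\ell, \ell' \in L$: $d_H(\ell, \ell') = \#\{ i \ge 0 : \ell_i \ne \ell'_i \}$. Lemma 6. In the case that "b_behavior is deterministic and b_shape is strict", the transformation F is a weak contraction with respect to $d_H$ on the set of non-increasing sequences from L: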

Corollary 4. The sequence of buffer states $(\ell^{(n)})_{n \ge 0}$ stabilizes at the fixed point of F. Thus, this sequence and the sequence of published blocks are ultimately periodic with a period of length 1. The tree published in the period has $n_{pr} n_{st} + 1$ leaves.
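The stabilization can be observed numerically. The sketch below simulates the deterministic strict case on leaf counts, assuming that at each step the first n_pr consecutive pairs $(t_0, t_1), (t_2, t_3), \ldots$ are merged and that the buffer is implicitly continued by zero-height trees; the function names are hypothetical and the code is not the implementation from Appendix B.

```python
def step_strict(buf, n_pr):
    """Merge the first n_pr consecutive pairs of the buffer (leaf counts)."""
    work = list(buf) + [1] * max(0, 2 * n_pr - len(buf))   # pad with zero-height trees
    merged = [work[2 * k] + work[2 * k + 1] for k in range(n_pr)]
    return merged + work[2 * n_pr:]


def published_leaves(n_bl, n_st, n_pr):
    """Leaf counts of the trees published in n_bl consecutive blocks."""
    buf, published = [], []
    for _ in range(n_bl):
        for _ in range(n_st):
            buf = step_strict(buf, n_pr)
        published.append(buf[0])    # block publishing: remove the leftmost tree
        buf = buf[1:]
    return published
```

For example, with two provers and one step per block the published leaf counts are 2, 3, 3, ..., so the sequence settles at $n_{pr} n_{st} + 1 = 3$ after a one-element preperiod.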
The following lemma shows that, if n st log 2 (n pr − 1), then the length of pre-period 1. Let us consider a semi-infinite matrix, where rows are given by (21) for n pr 1, and the corresponding matrix, where each tree is replaced by the number of its leaves. Elements (n∧) • log 2 n 0 = t log 2 n are removed after the shift. The matrix is lower triangular in the sense that (n∧) • log 2 n m = • for m n 1.

Proposition 3.
Let us denote by $f_{k,p} = (p \cdot \wedge, (2^k - p) \cdot \bullet)$ the sequence of trees of length $2^k$ for $0 \le p \le 2^k$.

1.
For each m 1, the following sequence can be presented as the concatenation

2.
For $n \ge m \ge 1$,

Proof. For each $m \ge 1$, the sequence (23) is the concatenation, over all positive integers $k$, of sequences consisting of $2^{k-1}(m - 1)$ copies of $t_k$ followed by the compositions $t_k \bullet (p\wedge)$ for $p$ from 1 to $2^k$. (Note that $t_k \bullet (2^k \wedge) = t_{k+1}$.) The tree $t_k$ appears in the positions $2^{k-1}(m + 1) \le n \le 2^k m$, or, equivalently, $\frac{n}{m} \le 2^k \le \frac{2n}{m+1}$. The tree $t_k \bullet (p\wedge)$ appears in the position $2^k m < n = p + 2^k m \le 2^k (m + 1)$, or, equivalently, $\frac{n}{m+1} \le 2^k < \frac{n}{m}$. Correspondingly, the sequence of the numbers of leaves is a sequence of positive integers with $(m - 1) \cdot 2^k$ additional copies of $2^{k+1}$ inserted for each $k \ge 0$.

Generation of Perfect Binary Trees
Let us recall that a perfect tree is completely determined by its height. Thus, in this subsection, we consider the sequences $h = (h_i)_{i \ge 0}$ of heights of perfect trees instead of the perfect trees themselves: $t = (t^{*}_{h_i})_{i \ge 0}$. Buffer states and published trees are characterized by the sequences of heights $h^{(n,k)}$ and $h^{©}$, respectively. According to Lemma 2, at each moment, the sequence of heights of perfect trees in the buffer is non-increasing. By Lemma 3, the heights of trees are bounded by a constant $h_{max}$, which is the same for all buffer states.
Let µ h = µ (n,k) h be the number of perfect trees in the buffer of positive height h, i.e., the buffer at each step has the form where h ×µ i The following Lemma describes these parameters for a fixed number of provers n pr and number of steps n st .

Lemma 8. At every step,
The map f corresponding to one step is given by the formula Here 2{p/2} ∈ {0, 1} is the parity of p. (26)  2 h + 2. The inequality (27) and the formula (28) are proved together. If (27) is satisfied, then the quantity of provers is sufficient to build all possible non-zero height perfect trees and, hence, we obtain (28). Now, suppose that ∑ h µ h 2 n pr ; then, at the next step, by (28),

Proof. The inequality
where the last inequality follows from the fact that the number of non-zero summands in (27) is not greater than n pr .

Remark 2.
The above lemma allows for constructing an optimized implementation of OneStep() procedure according to formula (28), and then to obtain a (pre-)reduced form of h © according to Definition 5. The remainder of the results in this subsection essentially use this implementation.
Let $h^{©} = h^{©}(n_{pr}, n_{st})$ be the sequence of heights of trees in generated blocks. According to Corollary 3, it is ultimately periodic.

Hypothesis 2.
A period of $h^{©}$ consists of perfect trees of heights $h_{max} - 1$ and $h_{max}$ in some order.

Corollary 6. Let $a$ (respectively, $b$) be the number of perfect trees of height $h_{max} - 1$ (respectively, $h_{max}$). Then, the length of the period is
$$a + b = 2^{\lfloor \log_2 n_{st} n_{pr} \rfloor - \nu}$$
for some $\nu \in \{0, 1, \ldots, \nu_2(n_{st} n_{pr} + 1)\}$, where $\nu_2(n_{st} n_{pr} + 1)$ is the multiplicity of 2 in the prime factorization of $n_{st} n_{pr} + 1$, and
$$a = 2^{\lfloor \log_2 n_{st} n_{pr} \rfloor + 1 - \nu} - \frac{n_{st} n_{pr} + 1}{2^{\nu}}, \qquad b = \frac{n_{st} n_{pr} + 1}{2^{\nu}} - 2^{\lfloor \log_2 n_{st} n_{pr} \rfloor - \nu}.$$
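The counts in Corollary 6 are easy to tabulate. The following is a minimal sketch, assuming that the logarithm in the corollary is rounded down (which is what makes $a + b$ a power of two) and that $\nu$ is supplied by the caller; the function name `period_counts` is hypothetical and not taken from the authors' code.

```python
def period_counts(n_st, n_pr, nu=0):
    """Return (a, b) as in Corollary 6 for a given nu.

    Assumption: log2(n_st * n_pr) is rounded down to an integer.
    """
    n = n_st * n_pr
    if n < 1:
        raise ValueError("n_st * n_pr must be positive")
    log_floor = n.bit_length() - 1        # floor(log2(n)) for n >= 1
    top = n + 1
    # nu must divide out of n + 1 and keep the exponents non-negative.
    if not 0 <= nu <= log_floor or top % (1 << nu) != 0:
        raise ValueError("nu is out of range for these parameters")
    a = (1 << (log_floor + 1 - nu)) - (top >> nu)
    b = (top >> nu) - (1 << (log_floor - nu))
    return a, b


print(period_counts(10, 10))   # (27, 37)
```

As a sanity check under the additional assumption $h_{max} = \lfloor \log_2 n_{st} n_{pr} \rfloor + 1$, the internal vertices of the $a + b$ published trees add up to $(a + b)\, n_{st} n_{pr}$, the number of proofs produced over one period: for $n_{st} = n_{pr} = 10$ this gives $27 \cdot 63 + 37 \cdot 127 = 6400 = 64 \cdot 100$.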

Definition 5.
The presentation $h^{©} = h^{©}_1, \ldots, h^{©}_r, (h^{©}_{r+1}, \ldots, h^{©}_{r+s})$ of the sequence (of heights) of published trees with the smallest pre-period and period is called the reduced form of $h^{©}$.
Hypothesis 3. The pre-period of the reduced form of $h^{©}$ is a non-decreasing sequence.

Remark 5.
The above hypothesis does not hold for the pre-reduced form of $h^{©}$, as we can see in (29) and (30).

The Case of a Single Prover n pr = 1
In the case of a single prover, Hypotheses 1 and 2 are obvious; moreover, an explicit description is obtained.
In the case of a single prover $n_{pr} = 1$, the sequence $h^{©} = h^{©}(1, k)$ can be determined recursively together with an auxiliary sequence $g$ that counts the generated proofs in the buffer:
$$g_0 = 0, \qquad h^{©}_n = \lfloor \log_2 (g_n + k + 1) \rfloor = \begin{cases} h, & \text{if } g_n + k < 2^{h+1} - 1, \\ h + 1, & \text{otherwise,} \end{cases}$$

Corollary 7. In the case of a single prover $n_{pr} = 1$, the sequence $h^{©}(1, k)$ is periodic. Let $k, k' \in [2^h - 1, 2^{h+1} - 1]$ and $k + k' = 2^{h+1} + 2^h - 2$. Then, the periods of the reduced forms of $h^{©}(1, k)$ and $h^{©}(1, k')$ are obtained from each other by reversing the order and swapping $h \leftrightarrow h + 1$.
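The single-prover recursion can be explored numerically. The sketch below is an illustration only: the update rule for $g$ is an assumption (it is not visible in the text above), namely that publishing a perfect tree of height $h$ removes its $2^h - 1$ internal proofs from the buffer, so that $g_{n+1} = g_n + k - (2^h - 1)$. The names `one_period` and `mirror` are hypothetical.

```python
def one_period(k):
    """Heights of published trees until the buffer counter g returns to 0.

    ASSUMPTION: after a tree of height h is published, its 2**h - 1 proofs
    leave the buffer, i.e. g_{n+1} = g_n + k - (2**h - 1).
    """
    if k < 1:
        raise ValueError("k must be a positive number of steps")
    heights, g = [], 0                       # g_0 = 0
    while True:
        h = (g + k + 1).bit_length() - 1     # floor(log2(g + k + 1))
        heights.append(h)
        g = g + k - ((1 << h) - 1)
        if g == 0:
            return heights


def mirror(period, h):
    """Reverse the order and swap the heights h and h + 1."""
    return [2 * h + 1 - x for x in reversed(period)]


# k = 4 and k' = 6 lie in [2**2 - 1, 2**3 - 1] and sum to 2**3 + 2**2 - 2.
print(one_period(4))              # [2, 2, 2, 3]
print(one_period(6))              # [2, 3, 3, 3]
print(mirror(one_period(6), 2))   # [2, 2, 2, 3]
```

Under this assumed update rule the two printed periods are related exactly as Corollary 7 states, and the counts of heights 2 and 3 agree with Corollary 6 for $\nu = 0$.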

Stochastic Case
In this section, we consider the result of Algorithm 1 only in the case of "b behavior is stochastic".

Occupancy Distribution: Efficiency
Placement of provers into suitable positions at each stage can be described by the classical occupancy distribution; see [26], the more recent [27], and references therein. Another closely related model is the coupon collector problem.
Thus, we have $n_{pr} \ge 1$ provers placed independently and with equal probability into $n_{pos} \ge 1$ positions. In terms of urn models [26], provers and positions correspond to balls and bins, respectively. Let us denote by $\xi^{n_{pr}}_{n_{pos}}$ the random number of non-empty positions. It takes integer values $i$ from 1 to $\min\{n_{pr}, n_{pos}\}$ with probabilities
$$\Pr\{\xi^{n_{pr}}_{n_{pos}} = i\} = {n_{pr} \brace i} \frac{(n_{pos})_i}{n_{pos}^{\,n_{pr}}}.$$
Here, we use the following notations:
• ${n \brace m}$ is the Stirling number of the second kind, i.e., the number of partitions of an $n$-element set into $m$ non-empty blocks;
• $(n)_k = n(n - 1) \cdots (n - k + 1)$ is the falling factorial.
Mean and variance of the occupancy distribution (e.g., Theorem 2A [27]) are obtained as special cases of the expected values of falling factorials (Theorem 1A [27]):
$$E\big(n_{pos} - \xi^{n_{pr}}_{n_{pos}}\big)_r = (n_{pos})_r E_r, \qquad E_r = \left(1 - \frac{r}{n_{pos}}\right)^{n_{pr}}, \qquad 0 \le r < n_{pos};$$
$$E\,\xi^{n_{pr}}_{n_{pos}} = n_{pos}(1 - E_1), \qquad \mathrm{Var}\,\xi^{n_{pr}}_{n_{pos}} = n_{pos}\big[(n_{pos} - 1)E_2 + E_1 - n_{pos} E_1^2\big].$$
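The occupancy probabilities and the closed forms for mean and variance can be cross-checked with exact rational arithmetic. This is a minimal sketch using the standard form of the classical occupancy distribution; the helper names `stirling2`, `falling` and `occupancy_pmf` are illustrative and do not come from the authors' code.

```python
from fractions import Fraction
from math import comb, factorial


def stirling2(n, m):
    """Stirling number of the second kind via the inclusion-exclusion sum."""
    total = sum((-1) ** j * comb(m, j) * (m - j) ** n for j in range(m + 1))
    return total // factorial(m)


def falling(n, k):
    """Falling factorial n (n - 1) ... (n - k + 1)."""
    out = 1
    for j in range(k):
        out *= n - j
    return out


def occupancy_pmf(n_pr, n_pos):
    """Exact distribution of the number of non-empty positions."""
    return {
        i: Fraction(stirling2(n_pr, i) * falling(n_pos, i), n_pos ** n_pr)
        for i in range(1, min(n_pr, n_pos) + 1)
    }


pmf = occupancy_pmf(5, 5)
mean = sum(i * p for i, p in pmf.items())
print(mean)   # 2101/625
```

For $n_{pr} = n_{pos} = 5$ the mean equals $5\,(1 - (4/5)^5) = 2101/625$, in agreement with $n_{pos}(1 - E_1)$.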
Let N t := Len denote the total number of generated trees, N and N int , respectively, the numbers of leaves and internal vertices in these trees. Similarly let N © t = n bl be the number of published trees, N © and N © int , respectively, the number of leaves and internal vertices in these trees.
Note that, according to (1), Moreover, as a random variable, N int = n bl n st ξ n pr n pos .
Throughput of our system can be naturally measured by the number of processed transactions, i.e., by the numbers of leaves $N$ and $N^{©}$. On the other hand, the useful work of provers is given by the numbers of internal vertices $N_{int}$ and $N^{©}_{int}$. According to (32), if $n_{st} n_{pr} \gg 1$, then $N_{int}/N \approx 1$ and $N^{©}_{int}/N^{©} \approx 1$. Note that in the cases that "b behavior is deterministic", $N_{int} = n_{bl} n_{st} n_{pr}$. We use this to define the "normalized" values:
• proof generation efficiency
$$Ef(n_{pr}, n_{pos}) := \frac{E\,N_{int}}{n_{bl} n_{st} n_{pr}} = \frac{E\,\xi^{n_{pr}}_{n_{pos}}}{n_{pr}} = \frac{n_{pos}}{n_{pr}}\left(1 - \left(1 - \frac{1}{n_{pos}}\right)^{n_{pr}}\right),$$
• proof publishing efficiency
$$Ef^{©}(n_{bl}, n_{st}, n_{pr}, n_{pos}) := \frac{E\,N^{©}_{int}}{n_{bl} n_{st} n_{pr}}.$$
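The closed form for the proof generation efficiency can be compared with a direct simulation of a single step. The sketch below is illustrative: `ef` evaluates the formula above, while `ef_monte_carlo` (a hypothetical helper with an arbitrary trial count and seed) places provers uniformly at random and counts distinct occupied positions. It does not model buffers or publishing, so it says nothing about the publishing efficiency.

```python
import random


def ef(n_pr, n_pos):
    """Closed-form proof generation efficiency Ef(n_pr, n_pos)."""
    return n_pos / n_pr * (1.0 - (1.0 - 1.0 / n_pos) ** n_pr)


def ef_monte_carlo(n_pr, n_pos, trials=20000, seed=0):
    """Estimate Ef by placing n_pr provers uniformly into n_pos positions."""
    rng = random.Random(seed)
    occupied = 0
    for _ in range(trials):
        # Provers landing on the same position produce only one useful proof.
        occupied += len({rng.randrange(n_pos) for _ in range(n_pr)})
    return occupied / (trials * n_pr)


print(round(ef(5, 5), 5))   # 0.67232
```

With a single prover no collision is possible, so `ef(1, n_pos)` equals 1 for every number of positions.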

Simulation Model for Block Publishing Efficiencies
In both cases, "b shape is strict" and "b shape is perfect", $Ef^{©}$ is an increasing function of $n_{bl}$. In the case that "b shape is strict", the lengths of buffers $h^{(n)}$ are not greater than $n_{pos} - 1$. This implies the convergence in the following proposition.

Proposition 5.
In the case that "b shape is strict", $Ef^{©} \to Ef$ as $n_{bl} \to \infty$.

Hypothesis 5.
In the case that "b shape is perfect", $Ef^{©} \to Ef$ as $n_{bl} \to \infty$.
We illustrate the above convergences in Figure 4. As we can see, in the case that "b shape is strict", $Ef^{©}$ tends quickly to $Ef$. On the contrary, in the case that "b shape is perfect", the convergence is very slow. Note that the limit value is $Ef|_{n_{pr}=n_{pos}=5} = 0.67232$, and some values close to the asymptote are $Ef^{©}|_{n_{bl}=1000} = 0.6359$, $Ef^{©}|_{n_{bl}=2000} = 0.6451$, $Ef^{©}|_{n_{bl}=3000} = 0.6497$, $Ef^{©}|_{n_{bl}=5000} = 0.6541$.
In practice, it is important to select a suitable value of $n_{pos}$ when the other parameters are fixed. In Figure 5, the dependencies of $Ef$ and $Ef^{©}$ (for both cases "b shape is strict" and "b shape is perfect") on the number of positions are shown for $n_{bl} = 100$, $n_{st} = 10$, $n_{pr} = 10$. The efficiency values are averaged over 300 random runs. Table 2 (for the case "b shape is perfect") and Table 3 (for the case "b shape is strict") give the arguments of the maxima and the corresponding maximal values of the block publishing efficiency $Ef^{©}$, as functions of the number of positions $n_{pos}$, for $n_{bl} = 200$ and various values of $n_{st}$ and $n_{pr}$.
Table 2. ArgMax and Max of the function $n_{bl} \to Ef^{©}$ for the case "b shape is perfect".

Simulation Model for Heights of Strict Binary Trees
Here, we consider an analog of formula (22) for the average heights $h$ of published strict binary trees in the case when "b behavior is stochastic". We consider a linear approximation
$$h(n_{st}, n_{pr}) \approx \alpha n_{st} + \beta \log_2 n_{pr} + \gamma.$$
Optimal parameters $\alpha, \beta, \gamma$ are obtained by means of the least squares method. For a fixed positive integer $m$, we consider all pairs of integers $(n_{st}, \log_2 n_{pr}) = (i, j)$ satisfying $1 \le j \le i \le m$. Thus, the problem is to minimize the square of the error vector. The critical point $(\alpha, \beta, \gamma)$ is the solution of a system of linear equations; the inverse matrix for this system can be calculated using Wolfram Mathematica. Enthusiasts can verify this using the formulas $\sum_{k=1}^{n} k^{(p)} = \frac{1}{p+1} n^{(p+1)}$ for sums of rising factorial powers $k^{(p)} = k(k + 1) \cdots (k + p - 1)$, or Faulhaber's formulas for the generalized harmonic numbers $H_{n,-p} := \sum_{k=1}^{n} k^{p}$. The results of numerical experiments (coefficients of the linear approximation and the standard deviation) are presented in Table 4. For the case $m = 8$, the graphical presentation of the average values $h(i, 2^j)$ and their linear approximation is given in Figure 6.
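The least squares step can be sketched in a few lines. This is an illustration on synthetic data, not a reproduction of Table 4: the function name `fit_heights` is hypothetical, the sample heights are generated from a known linear law, and the 3 × 3 normal equations are solved exactly by Gauss-Jordan elimination rather than with Wolfram Mathematica.

```python
from fractions import Fraction


def fit_heights(samples):
    """Least-squares fit of h ~ alpha*i + beta*j + gamma.

    `samples` maps (i, j) = (n_st, log2 n_pr) to an average height.
    Solves the normal equations (A^T A) c = A^T h in exact arithmetic.
    """
    rows = [((Fraction(i), Fraction(j), Fraction(1)), Fraction(h))
            for (i, j), h in samples.items()]
    # Augmented matrix [A^T A | A^T h] of the normal equations.
    aug = [[sum(r[p] * r[q] for r, _ in rows) for q in range(3)]
           + [sum(r[p] * h for r, h in rows)] for p in range(3)]
    for col in range(3):
        # Pick a non-zero pivot; this fails if the design is degenerate.
        pivot = next(r for r in range(col, 3) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(3):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return tuple(aug[p][3] for p in range(3))


# Synthetic heights on the grid 1 <= j <= i <= m, generated from a known law.
m = 4
samples = {(i, j): Fraction(1, 2) * i + j + 2
           for i in range(1, m + 1) for j in range(1, i + 1)}
print(fit_heights(samples))   # (Fraction(1, 2), Fraction(1, 1), Fraction(2, 1))
```

On exactly linear data the fit returns the generating coefficients; with simulated average heights the same routine returns the best linear approximation in the least squares sense.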

Discussion
Latus consensus, proposed in [3,4], provides secure and stable sidechain operation, even under the complete distrust assumption in sidechains, and even when all participants in the sidechain consensus are malicious. We assume that the adversary in the sidechain may try to censor the transactions of other participants inside the SC, or from the MC to the SC, or try to steal tokens in the SC, hide the state of the SC, or stop block production in the SC, but these actions must not be successful. For reliable operation under such assumptions, Latus consensus needs uninterrupted zk-SNARK proof generation, which, in turn, demands substantial computational power and cannot be performed by a single regular participant with commodity hardware. Projects Coda [8] or Polygon [28] implicitly involve a semi-trusted participant with powerful computational capabilities (such as cloud computation) to solve this task on time, but this approach may lead to a single point of failure. To eliminate this problem, we propose a new approach for efficient decentralized proof generation. According to our approach, decentralized generation is applied not only to blocks in the whole epoch but even to each block. Thus, proofs are generated with the involvement of a large number of independent participants, which are called provers. This provides a decentralization of the zk-SNARK sidechain operation in a secure and reliable way.
In this article, we characterized this process by the proof generation efficiency $Ef$ and the proof publishing efficiency $Ef^{©}$, which are investigated in Section 5.1.
For the generation of a single block considered in [18], the block forger initially selects the number of levels of a perfect binary tree that will be included in a block. In [18] (Table 1), we find the number of provers needed to build a perfect binary tree with 4, 5, 6, 7, 8, or 9 levels during $n_{st} = 9$ steps with probability 0.95. In this limit case, the efficiency from this article reduces to the ratio of the number of useful proofs, i.e., the number of internal vertices $n_i(t)$ in the built tree, to the total number of proofs $n_{st} n_{pr}$ produced by all provers during the block publishing process:
$$Ef_1 = \frac{n_i(t)}{n_{st} n_{pr}}. \qquad (37)$$
In Table 5, we compare the values of $Ef_1$ from (37) with the efficiencies $Ef^{©}$ obtained by the simulation procedure when b shape is perfect in three cases, $n_{bl} = 10$, $n_{bl} = 100$ and $n_{bl} = n_{pr}$, and when b shape is strict and $n_{bl} = 10$ (denoted $Ef^{©}_{perfect}$ and $Ef^{©}_{strict}$, respectively). The value of $Ef$ calculated from (34) is the limit for both $Ef^{©}_{perfect}$ and $Ef^{©}_{strict}$ as $n_{bl} \to \infty$. One can see that $Ef^{©}/Ef_1 \approx 3$ on average. In the top half of Table 5, the values of $Ef^{©}$ are close to the best possible limit $Ef$. At the bottom, when the number of provers becomes extreme, $Ef^{©}_{perfect}$ still improves on $Ef_1$; however, both values are too small. The convergence $Ef^{©}_{perfect} \to Ef$ is slow, and a very large buffer size is required in this situation. However, $Ef^{©}_{strict}$ remains close to $Ef$.

Conclusions
We investigated characteristics of blocks and of an SC as a whole under various parameters of the SC, such as the number of blocks in an epoch, number of provers, number of available transactions, time of block creation, etc. We obtained theoretical results for a model with additional restrictions on provers' behavior and various experimental results for an extended model without such restrictions. These results are helpful when determining the parameters of a sidechain, such as epoch length, block creation time, proof incentives, etc. We show that in deterministic cases the sequences of published trees are ultimately periodic and ensure the highest possible efficiency of the proof creation process, because in this case there are no collisions in proof creation.
In stochastic cases, we obtain a universal measure of prover efficiencies that takes values in the interval [0,1] and is given by an explicit formula in one case and calculated by simulation models in the other cases.
In terms of the efficiency defined above, the optimal number of allowed prover positions for a step can be proposed for various sidechain parameters, such as the number of provers, the number of steps in a block, and so on. We also considered the use of non-perfect binary trees in a blockchain, and described the benefits and restrictions of such trees. It turns out that we can achieve high efficiency using these very trees.
Our algorithm for strict binary trees gives the following differences (compared with the case of perfect binary trees):
• Block publishing efficiency is higher;
• The average height of the generated perfect binary trees is proportional to $\log n_{st} n_{pr} = \log n_{st} + \log n_{pr}$. The average height of the generated strict binary trees has a similar linear dependence on the logarithm of the number of provers $\log n_{pr}$, but a linear dependence on the number of steps $n_{st}$ (not on $\log n_{st}$). So, the case of strict binary trees is practically interesting only if the number of steps $n_{st}$ is small.
All results from [18-20] and the present paper are related and, put together, give a comprehensive description of SC behavior under our chosen parameters, allowing us to find the optimal sets of parameters that help to optimize functioning, in particular by increasing throughput.
One of the most interesting results in this article shows that, under the condition of quick synchronization among block forgers, the throughput increases up to three times in practical settings and can be increased even more for extreme settings cases.
One topic of future research is the investigation of the efficiency of Verkle tree [29] and Curve tree [30] structures, which can be used for creating proof trees instead of different variants of Merkle trees.

It is convenient to implement the 2 × 2 cases as separate procedures DP(), DS(), SP(), SS(), where the first (respectively, second) letter in the name corresponds to the case when b behavior is D[eterministic] or S[tochastic] (respectively, b shape is P[erfect] or S[trict]). In addition, we have the procedures SPP() and SSP(), which repeat simulations with an additional external loop over the number of positions. The Main() procedure calls one of the above 6 procedures depending on the input parameters, which correspond to command lines ">ProofStream . . . " with the following 4, 5, 6 or 8 parameters.
Two cases when b behavior is deterministic:
• >ProofStream 1 0 n_bl n_st n_pr
The case when b shape is strict is the simplest. The corresponding calculations helped to formulate the results of Section 4.1, all of which were then proved.
• >ProofStream 1 1 n_st n_pr
In the case when b shape is perfect, the output is the complete description of the ultimately periodic sequences of buffer states and heights of published trees for specific values of the number of steps n_st and the number of provers n_pr. Hypotheses 1-4 are based on numerous computations.
Two cases when b behavior is stochastic require an additional parameter: the number of positions n_pos. They use a pseudo-random number generator with seeds obtained from a cryptographic RNG.
• >ProofStream 0 0 n_bl n_st n_pr n_pos
• >ProofStream 0 1 n_bl n_st n_pr n_pos
Graphs in Figures 4-6 were obtained using Wolfram Mathematica. Part of the plotted data, as well as some results in Tables 2-4, were obtained by running a loop over different parameter values. In particular, the loop for (int n_pos = min; n_pos < max; n_pos += increment) {. . . } is called by the command lines:
• >ProofStream 0 0 n_simulations n_bl n_st n_pr min max increment
• >ProofStream 0 1 n_simulations n_bl n_st n_pr min max increment
The initial C code with a brief description and some calculation examples is available at the annanelasa/ProofStream repo on GitHub.