Stochastic Process Leading to Catalan Number Recurrence

Motivated by a simple model of earthquake statistics, a finite random discrete dynamical system is defined in order to obtain Catalan number recurrence by describing the stationary state of the system in the limit of its infinite size. Equations describing the dynamics of the system, represented by partitions of a subset of {1, 2, ..., N}, are derived using basic combinatorics. The existence and uniqueness of a stationary state are shown using Markov chain terminology. A well-defined mean-field type approximation is used to obtain the block size distribution, and the consistency of the approach is verified. It is shown that the resulting recurrence for the block size distribution asymptotically takes the form of Catalan number recurrence for particular dynamics parameters of the system.


Introduction
The main result of this work is the definition and analysis of a discrete random dynamical system leading to a distribution related to Catalan number recurrence. It is part of the program for linking discrete random dynamical systems to integer sequences [1]. The motivation for the program results from previous works [2,3] on probabilistic cellular automata [4] leading to Catalan and Motzkin numbers [5,6]. These form one more interpretation of each of these two integer sequences, as originating from the limiting distributions for discrete dynamical systems.
Integer sequences can arise from dynamical systems in the contexts of closed orbits of a point under the action of map iteration (see, for example, [7,8,9]), and from queueing theory (see [10]). Our approach focuses on the construction of a random dynamical system, which is described by a (stationary state) distribution that is given by the recurrence associated with a specific integer sequence.
The achieved results [2,3] suggest that appropriate dynamical systems can also be defined for other integer sequences to derive the respective recurrences. However, apart from the examples of Catalan and Motzkin numbers, such systems have not been defined so far. The implementation of this program creates an interesting link between integer sequences and dynamical systems.
The dynamical system is inspired by a very simplified view of earthquakes as driven by slow accumulation of energy (possibly generated by respective slow motions in the earth's crust) and its abrupt releases in the form of quakes. In particular, in the case of a toy model of earthquakes in the form of Random Domino Automaton (see [11] and the references therein), it was discovered that the model generates a size distribution of clusters, which, after re-scaling, coincides with Motzkin number recurrence [6]. A similar system was proposed for Catalan numbers [3]. It should be emphasized that the previous work was framed using physical terms; here, we present a more rigorous, analytical approach.
This article defines a new system for which, unlike the system in [3], there are no space correlations. Relevant stationary-state variables are introduced, and it is shown how the equations they are supposed to fulfill result from counting all possibilities for all system states. The mean-field-type assumption is strictly formulated. Small-size terms, which are included in the group size distribution equation, disappear in the infinite size limit N → ∞. Finally, the solution to a special case is briefly discussed.
The model of aggregation and separation of individuals (which can be identified with portions of energy in the context of earthquakes) considered in this article can be illustratively described as follows. Consider the merger dynamics for N units, each of which can be in one of two mutually exclusive states: prone to merger, or prone to separation. Separated individuals are always solitary, unlike the aggregated individuals, which occur in clusters of sizes 1, 2, 3, .... Random changes that occur in discrete time steps can only be as follows: a separating unit can turn into an aggregating unit with a certain constant probability, or a group of (only aggregating) units can either merge with another group, or separate. Both of these possibilities occur with a probability depending on the size of the group.
What is the group size distribution in this process? Can it be calculated for the steady state? The short answer to this last question is yes, but the solution is a bit more complex than it might seem at first glance. In particular, Catalan number recurrence is an approximate solution. The nature of this approximation will be explained in detail in the text below. This article defines the system and analyzes it in a detailed way. Obviously, the distribution depends on the choice of model parameters; that is, two functions and one number.
The solution to the problem for the stationary state is given in the form of a recurrence equation derived with an assumption, which can be considered a kind of mean-field approximation.It has also been shown that the proposed derivation of the equations is internally consistent, i.e., the approximate group size distribution equations are consistent with two exact equations, namely for the proportions of aggregating and separating units and for the number of groups.Moreover, for a particular choice of parameters, the recurrence equation is reduced, in the limit of the infinite size of the system N, to the form of Catalan number recurrence [12].
The plan of the paper is as follows: Section 2 provides the necessary notation and definitions; Section 3 analyzes the system as a Markov process, introduces stationary variables and specifies the mean-field type approximation that is used later in the text; Section 4 contains the derivation of the equations for the stationary state of the process; then, Section 5 is devoted to the special case leading to Catalan number recurrence; finally, Section 6 contains the possible interpretation of the process in physical terms and finishes with comments.

Definition of the System
In this section, we define the system of arbitrary size N and its evolution during discrete time steps. The dynamics of the groups (clusters) is expressed in terms of partitions.

The System
We start from N individuals, which can be in one of two states.

Definition 1 (States of elements). Let [N] denote the set of the first N integers {1, 2, ..., N}, and let Q = {s, a}, where s and a are formal symbols. Then, the map $s : [N] \times \mathbb{N} \to Q$, $(i, t) \mapsto s(i, t)$, defines the state of the element i ∈ [N] at time t ∈ ℕ.
The label s stands for the separating state and the label a stands for the aggregating state; the meaning of these terms is given in Definition 8. Since aggregative individuals occur in clusters, we introduce the notion of partitions for the set of indices of aggregative individuals: a group is formed by those individuals whose indices belong to one block of the partition (at a given time step).

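Definition 2 (Set of indexes). The set of indices of aggregative elements at time t is $A(t) := \{ i \in [N] : s(i, t) = a \}$.
Definition 3 (State of the system). Denote by $\mathcal{A}(t)$ the set of all possible partitions of $A(t)$. The state of the system at the time t, $S(t) \in \mathcal{A}(t)$, is given by
$S(t) := \{ \langle i_{11}, i_{12}, \dots, i_{1k_1} \rangle, \langle i_{21}, i_{22}, \dots, i_{2k_2} \rangle, \dots, \langle i_{l1}, i_{l2}, \dots, i_{lk_l} \rangle \}$. (3)
The block j of the partition S(t) is denoted by $S_j(t) := \langle i_{j1}, i_{j2}, \dots, i_{jk_j} \rangle$.
Definition 4 (Size of a block). The size of a block $S_j(t) = \langle i_{j1}, i_{j2}, \dots, i_{jk_j} \rangle$ is equal to $k_j$.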

Definition 5 (The number of blocks). The number of blocks is denoted by n(t). For the partition S(t) given by Formula (3), the number of blocks is n(t) = l.
The most basic characteristic of the system is the ratio of the number of aggregative individuals to the number of separative ones, or equivalently, the density of aggregative units as given below.
Definition 6 (Density). The density of the system ρ(t) is given by the ratio of the number of aggregative elements, i.e., those with s(i, t) = a, to the number of all elements: $\rho(t) := \frac{1}{N} \, \# \{ i \in [N] : s(i, t) = a \}$.
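The quantities of Definitions 4 to 6 can be read off directly from a partition. The following minimal Python sketch illustrates this; the function name `block_statistics` and the representation of blocks as Python sets are assumptions made for illustration only and are not part of the original text.

```python
from collections import Counter

def block_statistics(blocks, N):
    """Return (density, number_of_blocks, counts_by_size) for one state.

    `blocks` is a list of disjoint sets of indices of aggregative elements
    (hypothetical representation); every index of [N] that is not listed
    is taken to be separative.
    """
    sizes = [len(b) for b in blocks]      # Definition 4: size of each block
    density = sum(sizes) / N              # Definition 6: aggregative / all
    n_blocks = len(blocks)                # Definition 5: n(t)
    n_by_size = Counter(sizes)            # number of blocks of each size m
    return density, n_blocks, n_by_size

# Example: N = 8, aggregative indices {1, 2, 3, 5, 7} split into three blocks.
rho, n, n_m = block_statistics([{1, 2}, {3, 5}, {7}], N=8)
```

In this example the identity ρN = Σ m n_m, which is used later in the text, can be checked by hand: 2·2 + 1·1 = 5 aggregative elements out of 8.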

Evolution Rule
The evolution of the system depends on parameters, i.e., two functions and one number: the probability of merging δ and the probability of separation µ, both depending on the size of the respective cluster, and the probability ν of the transition of an individual from the separative to the aggregative state.

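Definition 7 (Dynamic parameters). The dynamic parameters are functions δ and µ that assign to each block size m the probability δ(m) of merging and the probability µ(m) of separation, with δ(m) + µ(m) ≤ 1.
In the following, we use the short notation $\delta_m \equiv \delta(m)$ and $\mu_m \equiv \mu(m)$. After introducing the above definitions, we can precisely determine the rules for the evolution of the system.
Definition 8 (Evolution rule). The evolution of the system is given by a map f that takes the state of the system at time t to its state at time t + 1, where f is defined by the following rule. For a time t, choose a number from [N], with equal probability for each one, and denote it by j.
(a) If s(j, t) = s, then one of the two following mutually exclusive options (a1) and (a2) happens, with respective probabilities ν and (1 − ν), where ν ∈ (0, 1) is a fixed real number.
(a1) The element j becomes aggregative, s(j, t + 1) := a, and forms a new block of size one.
(a2) The state of the system does not change.
(b) If s(j, t) = a, then j belongs to some block, say, $S_l(t)$ of size $k_l$. Then, one of the three following mutually exclusive options (b1), (b2) and (b3) is chosen, with respective probabilities $\delta_{k_l}$, $\mu_{k_l}$, and $1 - \delta_{k_l} - \mu_{k_l}$.
(b1) The block $S_l(t)$ merges with another block of S(t), chosen with equal probability among the remaining blocks; if S(t) consists of only one block, the states of the elements do not change.
(b2) All elements of the block $S_l(t)$ become separative and the block is removed from the partition:
s(i, t + 1) := s for $i \in \{ i_{l1}, i_{l2}, \dots, i_{lk_l} \}$, and s(i, t + 1) := s(i, t) for $i \in [N] - \{ i_{l1}, i_{l2}, \dots, i_{lk_l} \}$. (14)
(b3) The state of the system does not change.
The process related to point (a1) is called transition, the one related to (b1) is called merging or aggregation, and the process defined in (b2) is called separation.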
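The evolution rule can be sketched as a short simulation. The Python code below is a minimal illustration under stated assumptions, not the author's implementation: indices run from 0 to N − 1 instead of 1 to N, blocks are stored as a list of sets, options (a2) and (b3) are taken to leave the state unchanged, and the names `step` and `simulate` are hypothetical.

```python
import random

def step(state, blocks, nu, delta, mu, rng):
    """Apply one time step of the evolution rule in place.

    state  : list with entries "s" (separative) or "a" (aggregative)
    blocks : list of disjoint sets holding the indices of aggregative elements
    delta, mu : functions of the block size m (merging, separation)
    """
    j = rng.randrange(len(state))            # choose an element uniformly
    if state[j] == "s":
        if rng.random() < nu:                # (a1) transition
            state[j] = "a"
            blocks.append({j})               # new block of size one
        return                               # (a2) assumed: nothing changes
    l = next(i for i, b in enumerate(blocks) if j in b)
    m = len(blocks[l])
    u = rng.random()
    if u < delta(m):                         # (b1) merging
        if len(blocks) > 1:                  # a single block cannot merge
            p = rng.choice([i for i in range(len(blocks)) if i != l])
            blocks[l] |= blocks[p]
            del blocks[p]
    elif u < delta(m) + mu(m):               # (b2) separation
        for i in blocks[l]:
            state[i] = "s"
        del blocks[l]
    # (b3) assumed: otherwise nothing changes

def simulate(N, steps, nu, delta, mu, seed=0):
    """Start from all elements separative and run `steps` time steps."""
    rng = random.Random(seed)
    state, blocks = ["s"] * N, []
    for _ in range(steps):
        step(state, blocks, nu, delta, mu, rng)
    return state, blocks
```

A call such as `simulate(20, 5000, 0.3, lambda m: 0.3 / m, lambda m: 0.2 / m)` returns one sample state; the parameter values are illustrative only. At every step the blocks remain a partition of the set of aggregative indices, which is the invariant behind Definition 3.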

Variables for the Stationary State of the System
One of the fundamental properties of the system is that during the evolution it tends to a stationary state; this fact is explained below using the theory of Markov chains. Consequently, it is possible to characterize the properties of the stationary state. To this aim, we introduce appropriate variables.

Stationary Variables
The process given by Definition 8 is Markovian. Thus, in principle, a wide range of relevant techniques can be used, such as in [13]. However, even for a moderate size of the space of states, this approach turns out to be very ineffective for computational purposes (see [14]). The main problem is to write down a transition matrix for large systems. Nevertheless, general properties of the process can be derived using basic theory (as, for example, in [15]).
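The growth of the state space can be made concrete. Since a state is a partition of a subset of [N], the number of states is the sum over k of C(N, k)·B_k, where B_k is the k-th Bell number; by a standard identity this sum equals B_{N+1}. The sketch below, with the hypothetical helper names `bell_numbers` and `count_states`, computes both sides.

```python
from math import comb

def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] computed with the Bell triangle."""
    row, bells = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]              # a row starts with the previous row's last entry
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells

def count_states(N):
    """Number of partitions of subsets of {1, ..., N}."""
    bells = bell_numbers(N)
    # choose which k elements are aggregative, then partition them
    return sum(comb(N, k) * bells[k] for k in range(N + 1))
```

For N = 3 this gives 15 states, while for N = 10 there are already 678570, which illustrates why writing down the full transition matrix quickly becomes impractical.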
Proposition 1. The space of states of the process given by Definition 8 is irreducible, aperiodic and recurrent.
Proof. Any two states communicate, for the following reason. An arbitrary state can be changed to the state consisting of separative individuals only by an appropriate number of separations (b2). Next, any state can be built from it by an appropriate combination of transitions (a1) and multiple mergings (b1). This proves irreducibility. The system is finite, and the probabilities of all possible transitions are strictly greater than zero; thus the space of states is recurrent. Finally, it follows directly from the definition that the space of states is also aperiodic.
Corollary 1. The stationary distribution for the process given by Definition 8 is unique and equal to the limiting distribution.
Denote by (i) a state of the system S(t), identified as a state in the Markov process, and denote by $\pi^{(i)}$ the value of the limiting distribution π for the state (i). For every state (i), the density $\rho^{(i)}$, the number of blocks $n^{(i)}$ and the numbers of blocks of size m, $n_m^{(i)}$, are well defined. The stationary variables are their averages with respect to π.
Definition 9 (Density). The stationary density ρ is $\rho = \sum_{\text{all } (i)} \pi^{(i)} \rho^{(i)}$. (18)
Definition 10 (Number of groups). The stationary number of blocks n is $n = \sum_{\text{all } (i)} \pi^{(i)} n^{(i)}$. (19)
Definition 11 (Number of groups of size m). The stationary number of blocks of size m is $n_m = \sum_{\text{all } (i)} \pi^{(i)} n_m^{(i)}$ for all m. (20)

Mean-Field Approximation
For derivations presented in Section 4, we use a mean-field-like approximation (see, for example, [16]) in the following well-defined form.
For every state (i) with n (i) > 1

Equations for the Stationary State of the System
For a system in a steady state, it is possible to derive equations balancing the "gains" and "losses" occurring during evolution. Below are the appropriate derivations for the density, the number of blocks and the number of blocks of a specific size.

Density
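Proof. The expected number of elements changing state from aggregative to separative in a single time step, which is possible only through the separation rule (b2) of Definition 8, is for any state (i) given by $\frac{1}{N} \sum_{m \ge 1} \mu_m m^2 n_m^{(i)}$: an element of a block of size m is chosen with probability $m \, n_m^{(i)} / N$, the block separates with probability $\mu_m$, and m elements then change state. Multiplying by the respective probability $\pi^{(i)}$ and summing over all states (i), we arrive at $\sum_{\text{all } (i)} \pi^{(i)} \frac{1}{N} \sum_{m \ge 1} \mu_m m^2 n_m^{(i)} = \frac{1}{N} \sum_{m \ge 1} \mu_m m^2 n_m$, where we use Equation (20).
The expected number of elements changing state from separative to aggregative in a single time step, which is possible only through the transition rule (a1) of Definition 8, is for any state (i) given by $\nu (1 - \rho^{(i)})$. The weighted sum over all states gives $\sum_{\text{all } (i)} \pi^{(i)} \nu (1 - \rho^{(i)}) = \nu (1 - \rho)$, (26) where we use Equation (18) and the property $\sum_{\text{all } (i)} \pi^{(i)} = 1$. In the stationary state, both expected values are equal, which gives Equation (22): $\sum_{m \ge 1} \mu_m m^2 n_m = \nu (1 - \rho) N$.
We emphasize that Equation (22) is exact: no approximations were made in its derivation.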
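The exactness of the density balance can be checked numerically on a very small system. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it tracks only the multiset of block sizes (which suffices because the evolution rule depends on sizes only), it treats options (a2) and (b3) as leaving the state unchanged, it approximates the stationary distribution by repeated application of the transition probabilities, and it tests the balance in the form Σ µ_m m² n_m = ν(1 − ρ)N reconstructed from the rule. All function names are hypothetical.

```python
from collections import defaultdict

def canon(sizes):
    """Canonical form of a state: block sizes sorted in decreasing order."""
    return tuple(sorted(sizes, reverse=True))

def transitions(blocks, N, nu, delta, mu):
    """Return {next_state: probability} for one time step from `blocks`."""
    out = defaultdict(float)
    k = sum(blocks)                               # number of aggregative elements
    if k < N:                                     # (a) a separative element is chosen
        p_sep = (N - k) / N
        out[canon(blocks + (1,))] += p_sep * nu   # (a1) new block of size one
        out[blocks] += p_sep * (1 - nu)           # (a2) nothing changes
    for l, m in enumerate(blocks):                # (b) an element of block l is chosen
        p_blk = m / N
        rest = blocks[:l] + blocks[l + 1:]
        if rest:                                  # (b1) merge with a uniform partner
            for p, mp in enumerate(rest):
                merged = rest[:p] + rest[p + 1:] + (m + mp,)
                out[canon(merged)] += p_blk * delta(m) / len(rest)
        else:                                     # a single block cannot merge
            out[blocks] += p_blk * delta(m)
        out[canon(rest)] += p_blk * mu(m)         # (b2) separation
        out[blocks] += p_blk * (1 - delta(m) - mu(m))  # (b3) nothing changes
    return out

def stationary(N, nu, delta, mu, sweeps=3000):
    """Approximate the stationary distribution by power iteration."""
    states, frontier = {()}, [()]
    while frontier:                               # enumerate reachable states
        s = frontier.pop()
        for t in transitions(s, N, nu, delta, mu):
            if t not in states:
                states.add(t)
                frontier.append(t)
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(sweeps):
        new = defaultdict(float)
        for s, w in pi.items():
            for t, p in transitions(s, N, nu, delta, mu).items():
                new[t] += w * p
        pi = new
    return dict(pi)

def density_balance(pi, N, nu, mu):
    """Return (left-hand side, right-hand side, density) of the balance."""
    lhs = sum(w * sum(mu(m) * m * m for m in s) for s, w in pi.items())
    rho = sum(w * sum(s) for s, w in pi.items()) / N
    return lhs, nu * (1 - rho) * N, rho
```

For example, with N = 4, ν = 0.3, δ_m = 0.3/m and µ_m = 0.2/m (illustrative values), both sides of the balance agree to numerical precision, and the density equals ν/(ν + 0.2) = 0.6.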

The Number of Blocks
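Proof. In a single time step, the number of blocks may increase by one only, due to the transition rule (a1) of Definition 8, and the respective expected value is given by Formula (26).
The number of blocks can also decrease by one only, due to the merging (b1) and separation (b2) rules of Definition 8. For a state (i), an element of a block of size m is chosen with probability $m \, n_m^{(i)} / N$, and the block then merges with probability $\delta_m$ or separates with probability $\mu_m$. After summation over all states (i) weighted by $\pi^{(i)}$, the two contributions become $\frac{1}{N} \sum_{m \ge 1} \delta_m m \, n_m$ and $\frac{1}{N} \sum_{m \ge 1} \mu_m m \, n_m$. Hence, we obtain Equation (27): $\sum_{m \ge 1} (\delta_m + \mu_m) \, m \, n_m = \nu (1 - \rho) N$.
Equation (27) is exact, like Equation (22). They have the same right-hand side; thus one obtains the following relation (30), in which the density is eliminated: $\sum_{m \ge 1} \mu_m m^2 n_m = \sum_{m \ge 1} (\delta_m + \mu_m) \, m \, n_m$.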

The Number of Blocks of Size m
Similar reasoning can be applied to the number of blocks of a given size. It appears that the respective formulas depend on the parity of the size. Thus, we introduce the following notation.
Proof. A decrease in the number of blocks of size m can appear due to the merging (b1) and separation (b2) rules of Definition 8. For (b1), either the chosen block or its partner may have size m ($k_l = m$ or $k_p = m$), and each case contributes to the expected value for any state (i) with $n^{(i)} > 1$. States with $n^{(i)} = 0$ and $n^{(i)} = 1$ do not contribute in this way. The mean-field approximation given by Equation (21), followed by summation over all states (i) weighted by $\pi^{(i)}$, leads to the corresponding term of Formula (32), where we included states with $n^{(i)} = 0$ and $n^{(i)} = 1$, assuming that the respective probabilities $\pi^{(i)}$ are negligible. This assumption is justified for a large system, similarly to the mean-field approximation.
Next, the contribution of separation (b2) is obtained in a similar way. An increase in the number of blocks of size m can appear due to the rule (b1) of Definition 8, with $k_l + k_p = m$. In a way similar to the above (i.e., writing the appropriate contributions for a state (i), applying the mean-field approximation (21) and summing over all states (i)), one obtains separate contributions for m odd and for m even. Hence, we obtain Formula (32).
Since the total number of blocks is the sum of the numbers of blocks of all sizes, i.e., $n = \sum_{i \ge 1} n_i$, we obtain the following fact: Equation (32) implies Equation (27). Proof. The result comes from summing Equation (32) for all m ≥ 1 and rearranging the resulting sums. Moreover, since the density of the system depends on the numbers of blocks of particular sizes, i.e., $\rho N = \sum_{i \ge 1} i \, n_i$, we obtain in a similar way that Equation (32) implies Equation (22) (Corollary 4).
Equations (27) and (22) may be understood as exact formulas for the first and the second moment of the distribution $n_m$ weighted by the functions $(\delta_m + \mu_m)$ and $\mu_m$, respectively. The approximate Equation (32) gives a distribution for which the first and the second moments are exact.

Catalan Recurrence
Having derived balance equations for density, number of blocks and number of blocks of specific sizes, we can use them for various choices of system parameters.In particular, we can choose parameters for which the above equations can be solved and lead to the cluster size distribution given by the scaled recursion determining the Catalan numbers.
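For reference, the standard (unscaled) Catalan number recurrence is C_0 = 1 and C_{m+1} = Σ_{k=0}^{m} C_k C_{m−k}. The short Python sketch below generates the numbers from this recurrence and compares them with the closed form C_m = binom(2m, m)/(m + 1). It shows only the textbook recurrence; the scaling that relates it to the block size distribution is not reproduced here, and the function names are hypothetical.

```python
from math import comb

def catalan_by_recurrence(count):
    """Return the first `count` Catalan numbers (count >= 1) from the recurrence."""
    c = [1]                                   # C_0 = 1
    for m in range(count - 1):
        # C_{m+1} = sum_{k=0}^{m} C_k * C_{m-k}
        c.append(sum(c[k] * c[m - k] for k in range(m + 1)))
    return c

def catalan_closed_form(m):
    """Closed form C_m = binom(2m, m) / (m + 1), exact in integers."""
    return comb(2 * m, m) // (m + 1)
```

The first values are 1, 1, 2, 5, 14, 42, and both functions agree for every index.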

Special Choice of Parameters
To achieve the goal, we specify the parameters $\mu_m$ and $\delta_m$ in the following way: $\delta_m = \frac{\alpha}{m}$, $\mu_m = \frac{\beta}{m}$, (42) with α and β constants, 0 < α, β, (α + β) < 1. As a consequence of (42), the balance equations are significantly simplified.
Note that, for the choice given by formulae (42), the probabilities of merging and separation are the same for each cluster, independently of its size: a cluster of size m is selected with probability m/N, which cancels the factor 1/m in (42).
Corollary 5. For dynamic parameters given by formulae (42), the balance equation for ρ (22) gives the formula for the density $\rho = \frac{\nu}{\nu + \beta}$. (43)
Corollary 7. The average size of a cluster for dynamic parameters in the form (42) is $\frac{\rho N}{n} = \frac{\alpha + \beta}{\beta} = 1 + \frac{\alpha}{\beta}$.
Corollary 8. For dynamic parameters given by formulae (42), the balance Equation (32) for the number of m-clusters $n_m$ takes a simplified form, referred to below as Equation (46). Proof. It follows from direct calculations.

Large System Size Approximation
The next step to obtain Catalan number recurrence is to consider the limit when the size of the system increases to infinity. For this purpose, we first normalize the variables $n_m$ appropriately.
For the rescaled variable $x_m$, Equation (46) reduces, in the limit N → ∞, to the form of Catalan number recurrence.
An appropriate discussion and analysis of the various regimes for such a toy model goes far beyond the scope of this article. Nevertheless, we emphasize that the parameter $\delta_m$, which controls merging (or stress transfer), does not exist in the cellular automata models [3], where this property was determined by the appropriate geometry. As with RDA (see, for example, [14]), the avalanche size distribution may also be considered for the process proposed above. It coincides with the block size distribution only for the parameters described by Formula (42).

Comments
This paper introduces a system for which an approximate equation for the distribution of sizes of blocks in the stationary state takes the form of Catalan number recurrence. For the introduced process, a direct, approximate, but effective way of calculating the cluster size distribution is presented in detail, with an explicit formulation of the mean-field-type approximation. In the context of previous works on cellular automata [2,3], it seems that such a more rigorous approach can be applied to other integer sequences, and relevant processes can be defined.
An interesting research thread is the relationship between the Motzkin number and Catalan number recurrences, resulting from the rules defining the simple dynamics of toy models of earthquakes, with inverse power distributions with various exponents (not only equal to 3/2, but also arbitrary in some range) [11]. Another relation (however, of a different nature) of Catalan numbers to self-organized dynamical systems related to the earthquake model (namely the Otsuka model, being a branching process) is described in [18].
The material presented above indicates that integer sequences and related dynamical systems may be sources of interesting, nontrivial dynamics with possible applications to natural hazards.It is not excluded that this connection may act in the opposite direction, i.e., that simplified natural hazard models, like those mentioned above, could be a kind of inspiration for mathematicians studying dynamical systems and/or integer sequences.

Corollary 4. Equation (32) implies Equation (22). Proof. The result comes from summing Equation (32), multiplied by the respective m, for all m ≥ 1 and rearranging the resulting sums.

Proof. The left-hand side of Equation (22) simplifies to $\beta \sum_{m \ge 1} m \, n_m = \beta \rho N$.
Corollary 6. For dynamic parameters given by formulae (42), the balance equation for n (27) and Equation (43) give the formula for the number of blocks $n = \frac{\beta}{\alpha + \beta} \rho N = \frac{\beta \nu}{(\alpha + \beta)(\nu + \beta)} N$.
Option (b1) of Definition 8. For S(t) consisting of at least two blocks, another block $S_p(t)$, p ≠ l, of the partition S(t) is chosen in such a way that every remaining block (i.e., different from $S_l(t)$) has the same chance, and then
s(i, t + 1) := s(i, t) for all i, (10)
$S(t + 1) := \{ \langle i_{l1}, \dots, i_{lk_l}, i_{p1}, \dots, i_{pk_p} \rangle \} \cup \bigcup_{r \ne l, p} \{ S_r(t) \}$. (11)
For S(t) consisting of only one block, s(i, t + 1) := s(i, t) for all i,