1. Introduction
Catalan numbers are remarkably ubiquitous in mathematics. More than two hundred objects are counted by them, as described in [1]. Moreover, Catalan numbers have the longest entry in the On-Line Encyclopedia of Integer Sequences (OEIS) [2]. They are the best-known representative of a class of numbers called Catalan-like numbers, with which they share various properties [3,4]. All Catalan-like numbers belong to the larger class of integer sequences [2].
In addition to their purely combinatorial origin, certain integer sequences also appear in other areas of mathematics. In particular, they can arise from dynamical systems in the context of closed orbits of a point under the action of map iteration [5,6,7] and from queueing theory [8].
Catalan-like numbers also appear in a very interesting way in ref. [9], where the problem of the decay of initial data in the form of a unit step for Bogoyavlensky lattices [10,11] is studied. It turns out that the problem is solvable, since the dynamics are linearizable, and the solution is expressed in terms of generalized hypergeometric functions that are exponential generating functions for generalized Catalan numbers. This work extends earlier results [12], where the crucial role of the form of this exponential generating function was noticed. The results can then be proved using known identities for Hankel determinants [3,13]. Moreover, ref. [9] also extends the result of [14], where a symmetry reduction consistent with the dynamics is used instead of identities for determinants.
There are also other closely related examples of the appearance of combinatorial objects in mathematical physics: a connection of the Catalan numbers with the Chebyshev polynomials [15,16], the Catalan and Hurwitz numbers in the theory of the dispersionless Toda chain [17,18], and the generalized Catalan numbers and the spectral properties of the products of random matrices [19].
The aim of this work is to define a single stochastic dynamical system evolving in discrete time, similar to the one from [20], but leading not to one but to various recurrences of Catalan-like integer sequences. The approach to generating integer sequences presented here is very direct and focuses on the construction of a discrete stochastic process whose stationary state is characterized by a distribution given by a recurrence related to integer sequences. The motivation for the approach comes from two works [21,22] on probabilistic cellular automata leading to Motzkin [23,24,25] and Catalan numbers [1,26,27], respectively.
This concept was initiated by analyzing a simple earthquake model whose probability parameters are determined by solving an inverse problem for given earthquake statistics [28]. This model can be fitted to real data [29], but it can also be applied to a much wider range of distributions, including those related to integer sequences. In this way, one more interpretation of Motzkin numbers, and later Catalan numbers, was established as recurrences originating from the limiting distributions for discrete dynamical systems.
Those results, as well as common properties of Catalan-like numbers [3,4], suggest that for other integer sequences, an appropriate stochastic process can also be defined in order to derive the respective recurrences.
This paper extends the method to a broader class of recurrences based on a more general definition of a stochastic process. Within this approach, we obtain a “master recurrence”, which after appropriate reductions leads to an infinite number of recurrences. In this way, we obtain various recurrences from a single stochastic process, including those related to the Catalan and Motzkin numbers, as well as others, for example those related to the Schröder numbers [30,31,32] or to the sequence A064641 [33].
This article defines a new process for which the derived stationary state equations are exact, and there is no need to use mean-field-type approximations, as was necessary in earlier works (for details, see Section 3). Relevant stationary-state variables are introduced, and we show how the equations they must satisfy result from counting all possibilities for all system states. The general group size distribution equation simplifies in the limit of infinite system size (N → ∞). Finally, reductions are applied, and a few special cases related to the above-mentioned sequences are calculated in detail.
The model of aggregation and separation of entities considered in this paper can be described figuratively as follows. Each of the N entities can be in one of two mutually exclusive states: the aggregative state or the separative state. Separative individuals are always solitary, unlike aggregative individuals, which occur in clusters of various sizes. Random changes occur in discrete time steps and can only be of the following kinds. In each time step, a chosen separative unit can turn into an aggregative unit with a certain constant probability, or a chosen cluster of aggregative entities can increase by one, decrease by one, dissolve entirely (all its units turn separative), or be activated. Activation is a state of a cluster that enables the merging of clusters. There are two draws in each time step, so if two clusters are activated simultaneously, they can merge. All of these changes for aggregative entities occur with probabilities given by parameters that depend on the size of the cluster. For fixed probability parameters, the process leads to a stationary state. It is possible to calculate explicitly the resulting number of separative entities (and hence the number of aggregative entities) and the total number of clusters. Moreover, the group size distribution in this process is given by a recurrence relation, which is then simplified by appropriate reductions.
The structure of the paper is as follows. Section 2 provides the necessary notation and definitions and sets out the evolution rule. Section 3 contains the derivation of the equations for the stationary state of the process. Section 4 is devoted to the reduction to a finite number of parameters. Section 5 deals with specific cases and examples. Finally, Section 6 contains concluding remarks and comments.
  2. The System and Its Evolution Rule
In this section, we introduce the notion of the system. The system and its dynamics can be expressed more formally in terms of partitions of sets of indexes of some set, as presented in ref. [20]. Here, we formulate all the definitions and evolution rules in a less formal but precise way.
  2.1. The System
The system consists of a set of N entities labeled by integers and represented by their labels. The size N of the system does not change over time. Each entity can be in one of two states—either s or a. The symbol s stands for a separative state, and the symbol a stands for an aggregative state.
Definition 1 (Set of entities)
. Let us denote by  the set of the first N integers, i.e., , and define , where s and a are formal symbols. Then,defines the state of the element —referred to as k-entity—at time . Aggregative entities are grouped into clusters of various sizes: 
. The size of a cluster is simply the number of entities it contains. The number of clusters is denoted by 
. The number of blocks of size 
i is denoted by 
. Then, it follows that
The system is characterized by the ratio of the number of aggregative entities to the number of separative ones, or equivalently, the density of aggregative units, as given below.
Definition 2 (Density). The density of the system  is given by the ratio of the number of aggregative elements, that is, those with , to the number of all elements.
 Separative units do not group. The number of separative units is denoted by 
, and it is given as follows:
Entities can change their state during evolution. They are subjected to various rules described in the next section.
  2.2. Draw Configurations and Their Probabilities
The evolution of the system takes place in discrete time steps. In each time step , two integers , not necessarily different, are drawn with uniform probability. Thus, exactly one of the following five mutually exclusive configurations occurs, each with its respective probability. These five configurations are denoted by (ss), (s), (sa), (a), and (aa).
- (ss)- —Two different individuals in the separative state are chosen, i.e.,  -  and  - . With  - , it happens with the following probability: 
 
- (s)- —One individual in the separative state is chosen, i.e.,  -  and  - , and the respective probability is as follows: 
 
- (sa)- —One individual is in the separative state and one is in the aggregative state, thus in a cluster of some size  i- , i.e.,  -  and  - , or opposite  -  and  - , 
 
- (a)- —Two chosen individuals, or one (if  - ), belong to a cluster of some size  i- , i.e.,  -  and  - , with  -  and  -  in the same partition, and the probability is as follows: 
 
- (aa)- —Two individuals in different clusters are chosen, one in a cluster of size  i-  and the other one in a cluster of size  j-  (it may happen that  - ), i.e.,  -  and  - , with  -  and  -  in different partitions, and the probability is as follows: - 
            where  -  is the Kronecker delta (i.e.,  -  for  -  and  -  for  - ). Note that  - . In particular, 
 
We define 
, 
, and 
. Then, the following identities hold:
        and
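The five draw configurations can be checked by brute force on a small system. The following is a minimal sketch rather than the derivation used in this paper: the function name `draw_configuration_probabilities` and its arguments are hypothetical, and the closed forms quoted in the comments are simply what direct counting gives under the assumption that both integers are drawn independently and uniformly from the N labels.

```python
from fractions import Fraction
from itertools import product


def draw_configuration_probabilities(n_separative, cluster_sizes):
    """Exact probabilities of (ss), (s), (sa), (a), (aa) by enumerating all N*N draws.

    Assumption: both labels are drawn independently and uniformly from the N entities.
    """
    # owner[k] is None for a separative entity, otherwise the index of its cluster.
    owner = [None] * n_separative
    for cluster_id, size in enumerate(cluster_sizes):
        owner += [cluster_id] * size
    n = len(owner)

    counts = {"ss": 0, "s": 0, "sa": 0, "a": 0, "aa": 0}
    for k1, k2 in product(range(n), repeat=2):
        c1, c2 = owner[k1], owner[k2]
        if c1 is None and c2 is None:
            # Same separative entity drawn twice -> (s); two different ones -> (ss).
            counts["s" if k1 == k2 else "ss"] += 1
        elif c1 is None or c2 is None:
            counts["sa"] += 1
        elif c1 == c2:
            # Same cluster, including the case k1 == k2.
            counts["a"] += 1
        else:
            counts["aa"] += 1
    return {name: Fraction(count, n * n) for name, count in counts.items()}


# Direct counting gives, with n_s separative entities and cluster sizes c_j:
#   (ss): n_s (n_s - 1) / N^2      (s): n_s / N^2      (sa): 2 n_s (N - n_s) / N^2
#   (a):  sum_j c_j^2 / N^2        (aa): ((N - n_s)^2 - sum_j c_j^2) / N^2
probabilities = draw_configuration_probabilities(3, [1, 2])
```

Using exact fractions instead of floats makes it easy to confirm that the five probabilities sum to one for any small configuration.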
  2.3. Evolution and Dynamic Parameters
First, we define possible changes of state in individuals assigning relevant probability parameters. Next, the merging conditions of the clusters are defined.
Definition 3 (Transition). If an entity in the separative state is chosen, i.e., if , then the transition takes place with probability ν—the  entity changes its state to a. Otherwise, with probability , it remains in state s. If two integers  and  are chosen and , then the probability of transition is .
 Definition 4 (Separation, -activation, -increase, -decrease). If an entity in the aggregative state is chosen, i.e., if , then it belongs to a cluster of some size i, and one of the following changes can occur:
- Separation: 
                with probability , i.e., all entities belonging to the cluster containing the chosen  entity change their states to s; 
- -activation:  
                with probability , the activation is valid in the current time step only; γ-activated clusters can merge; 
- -increase:  
                with probability , i.e., the cluster containing chosen k-entity increases its size by 1 via absorption of a separative unit with respective probability ; 
- -decrease:  
                with probability , i.e., the cluster containing the chosen k-entity decreases its size by 1 via a change of state of one entity from the cluster to the separative state. 
Otherwise, with probability , nothing changes. If  and  belong to the same cluster, the respective probability parameters are denoted by tilde, that is, , , , and . We also assume the notation convention .
 Definition 5 (Merging of clusters). If there are two γ-activated clusters in the current time step, then they merge with probability  depending on whether after merging the cluster additionally increases/does not change/decreases its size by 1. Otherwise, with probability , nothing happens. Also, if there is only one γ-activated cluster, nothing happens.
 Remark 1. Note that the above merging process can be generalized in a simple way by introducing dependency of parameters σ on the sizes of the merged clusters, i.e., it is straightforward to define  with possible dependencies on sizes i and j of the respective clusters.
 Definition 6 (Evolution rule). In a given time step t, pick up two integers  and . Then, for a given state of the system, depending on which of the configurations—(ss), (s), (sa), (a), (aa)—happens, the changes described in Definitions 3, 4, and 5 take place with their respective probabilities. Then, the same procedure is repeated in the next time step .
 All possible changes in the system during a single time step of the evolution are summarized in 
Table 1. The changes in a single entity according to Definitions 3 and 4 are illustrated in 
Figure 1. The evolution variants of the 
 configuration are presented in 
Figure 2.
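The evolution rule can be sketched as a short simulation. This is a simplified illustration under several assumptions that are not taken from the paper: all probabilities are constants independent of cluster size, a cluster hit by both draws acts only once (in place of the separate tilde parameters), a separative entity drawn twice acts once, and merging never changes the total size. The function name `simulate` and all parameter names are hypothetical.

```python
import random


def simulate(n_entities, steps, p_transition, p_separate, p_activate,
             p_increase, p_decrease, p_merge, seed=0):
    """Return (number of separative entities, sorted cluster sizes) after `steps` steps."""
    assert p_separate + p_activate + p_increase + p_decrease <= 1.0
    rng = random.Random(seed)
    owner = [-1] * n_entities   # -1: separative; otherwise the id of the entity's cluster
    members = {}                # cluster id -> list of entity labels
    next_id = 0

    for _ in range(steps):
        # Two draws per time step; they may coincide.
        draws = {rng.randrange(n_entities), rng.randrange(n_entities)}
        # Record the states before any change so both draws see the same configuration.
        snapshot = [(k, owner[k]) for k in sorted(draws)]
        activated, touched = [], set()
        for k, cid in snapshot:
            if cid == -1:
                # Transition: a separative entity becomes a new cluster of size 1.
                if owner[k] == -1 and rng.random() < p_transition:
                    owner[k] = next_id
                    members[next_id] = [k]
                    next_id += 1
                continue
            if cid in touched:
                continue        # simplification: a cluster acts at most once per step
            touched.add(cid)
            u = rng.random()
            if u < p_separate:
                # Separation: every member of the cluster becomes separative.
                for e in members.pop(cid):
                    owner[e] = -1
            elif u < p_separate + p_activate:
                activated.append(cid)
            elif u < p_separate + p_activate + p_increase:
                # Increase: absorb one separative entity, if any exists.
                free = [e for e in range(n_entities) if owner[e] == -1]
                if free:
                    e = rng.choice(free)
                    owner[e] = cid
                    members[cid].append(e)
            elif u < p_separate + p_activate + p_increase + p_decrease:
                # Decrease: release one member; an emptied cluster disappears.
                e = members[cid].pop(rng.randrange(len(members[cid])))
                owner[e] = -1
                if not members[cid]:
                    del members[cid]
        # Merging: only possible when both draws activated two different clusters.
        if len(activated) == 2 and rng.random() < p_merge:
            keep, gone = activated
            for e in members.pop(gone):
                owner[e] = keep
                members[keep].append(e)

    return owner.count(-1), sorted(len(m) for m in members.values())


separative, sizes = simulate(40, 3000, 0.3, 0.1, 0.3, 0.2, 0.1, 0.5, seed=1)
```

Tallying `sizes` over many steps of a long run gives an empirical group size distribution. The total number of entities is conserved by construction, which is a convenient sanity check for any variant of the rules.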
  3. Equations for Stationary State
The process defined above can also be formulated in terms of Markov chains (compare [34]); in that formulation, the existence of its stationary state follows in a straightforward way. For a more rigorous description of this approach, we refer to the recent article [35] and the references therein.
In this section, we derive equations relating probability parameters to state parameters of the system, which are necessary conditions for the stationary state of the system.
For the stationary state, the variables that describe the state of the system do not depend on time. Moreover, the stationary state does not depend on the values of the probabilities defined above that determine the evolution of the system but only on their ratios. This is because rescaling them (e.g., dividing all of them by 2) only slows the process down (by a factor of two in this example) and has no effect on the stationary state.
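The dependence on ratios only can be illustrated on a toy example. The sketch below uses a generic two-state Markov chain, not the process defined in this paper; the names `stationary_two_state` and `one_step` are hypothetical. Halving both transition probabilities leaves the stationary distribution unchanged, because only the ratio of the two probabilities enters it.

```python
from fractions import Fraction


def stationary_two_state(p_up, p_down):
    """Stationary distribution of a two-state chain with P(0 -> 1) = p_up, P(1 -> 0) = p_down."""
    total = p_up + p_down
    return (p_down / total, p_up / total)


def one_step(distribution, p_up, p_down):
    """Apply one step of the chain to a distribution (pi_0, pi_1)."""
    pi_0, pi_1 = distribution
    return (pi_0 * (1 - p_up) + pi_1 * p_down,
            pi_0 * p_up + pi_1 * (1 - p_down))


fast = stationary_two_state(Fraction(1, 3), Fraction(1, 6))
slow = stationary_two_state(Fraction(1, 6), Fraction(1, 12))  # both probabilities halved
```

Here `fast` and `slow` coincide, and each is left unchanged by `one_step` with its own parameters; the halved chain simply takes twice as long on average to make each move.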
  3.1. Density
The density 
 is given by the number of separative units in the system—see Equation (
4).
Proposition 1. The balance equation for the density reads as follows:  Proof.  The contributions to the expected value of change in density—counted in the number of aggregative individuals—in their respective configurations are as follows:
          
- (ss)—gain by transition , 
- (s)—gain by transition , 
- (sa)- — gain by transition  - , loss by separation  - , gain by  - -increase  - , and loss by  - -decrease  - ; in total, 
 
- (a)- —loss by separation  - , gain by  - -increase  - , and loss by  - -decrease  - ; in total, 
 
- (aa)- —loss by separation  - , gain by growth  - , loss by reduction  - , gain by merging with increase  - , and loss by merging with reduction  - ; in total, 
 
Combining these contributions with the requirement that the expected value of the change is equal to zero leads to Equation (13).    □
   3.2. Number of Clusters
Proposition 2. The balance equation for the number of clusters reads as follows:  Proof.  The contributions to the number of clusters are as follows:
          
- (ss)—gain by transition , 
- (s)—gain by transition , 
- (sa)- —gain by transition  -  and loss by separation  - ; in total, 
 
- (a)—loss by separation , 
- (aa)- —loss by separation  -  and loss by merging  - ; in total, 
 
The sum of these contributions gives Equation (14).    □
   3.3. Number of Clusters of Size 1
Proposition 3. The balance equation for the number of clusters of size 1 reads as follows:  Proof.  The contributions are as follows:
          
The sum of these contributions gives Equation (15).    □
   3.4. Number of Clusters of Size 2 and Larger
In the next proposition, we use the following symbol:
Proposition 4. The balance equation for the number of clusters of size  reads as follows:  Proof.  Contributions:
          
- (ss)—gives no contribution, 
- (s)—also gives no contribution, 
- (sa)- —loss by separation  - , loss by growth and reduction  - , and gain by growth and reduction  - ; in total, 
 
- (a)- —loss by separation  - , loss by growth and reduction  - , and gain by growth and reduction  - ; in total, 
 
- (aa)—loss by separation , loss by growth and by reduction - , gain by growth and by reduction - , loss by merging - , and gain by merging - ; in total, 
The sum of these contributions gives Equation (17).    □
   4. Reductions
Up to this point, the considerations have remained general in order to keep the range of model properties as wide as possible. In this section, we reduce the model through an appropriate choice of process parameters in order to show the connection of the above equations with recurrences for a class of integer sequences.
In the characterization of the process, we defined the parameters with tilde for  and  in the same partition. The choice  removes the term . In a similar way, the number of terms in Equations (13), (14), (15), and (17) is reduced if the following condition is imposed:
Next, in order to simplify the terms 
 and 
, we define the dynamic of the process as follows:
	  The parameters 
, and 
 define the respective probabilities irrespective of the size of the affected cluster. Thus, they characterize the relative chance for a cluster independently of the size.
Hence, with constraints (18) and (19), we arrive at the following reduced system of equations (in the same order as Equations (13), (14), (15), and (17)):
Remark 2. Note that Equations (20) and (21) are linear in  and quadratic in n; thus, the system can be explicitly solved for n and .  The aim is to show the relation of the above recurrence to integer sequences, which are infinite; thus, we consider a limit 
. Hence, we rescale the variables as follows:
	  We define the symbols 
 as follows:
      and introduce 
, and 
 Then, in the limit 
, the above equations reduce to the following: 
If the process is without merging, then 
, and also 
 (equivalently, we can consider no 
-activation, i.e., take 
). We have the following:
      and recurrence is linear. Thus, in the following, we assume 
.
If the process has no transition, i.e., 
, it follows from Equations (
26) and (
27) that there is no other stationary state, except for 
 and 
, thus with density 
, otherwise 
.
In this article, we aim to show that one dynamical system can be the source of many recurrences related to known integer sequences. For this purpose—as will be evident in the following—we do not need to study the defined system in full generality. Therefore, we will limit ourselves to considering the case 
. This restriction significantly simplifies the system. For 
, Equations (
28) and (
29) are in cascading form: the first gives 
, the second gives 
 as a function of 
, the next gives 
 as a function of 
 and 
, and so on.
For 
, to reduce the sizes of clusters during evolution, it is necessary to ensure that separation can happen. Thus, we must assume 
. Note that if the process is without separation, i.e., 
, then 
, i.e., we have a full system with density 
, or
Consequently, the condition  gives , which is not compatible with the condition  and , which is of interest here.
Equations (
26) and (
27) for 
, 
 and 
 are equivalent to the following: 
Proposition 5. If , , , and , then the system of Equations (
32) 
and (
33) 
has a unique positive solution , where  and .  Proof.  In all cases, i.e., depending on whether  is , , or , the existence of the unique positive solution follows directly from the analysis of the graph of the above equations in the plane .
Equation (
32) defines a parabola that intersects the 
m axis at the points 
 and 
. Equation (
33) defines a line intersecting the 
m-axis at point 
 (or does not intersect it when 
) and the 
-axis at point 
. In each case, there is exactly one intersection point for 
 and 
, which determines the solution.
The exact form of the solution can be obtained by solving the quadratic equation (for m) and then the linear equation (for ).    □
   5. Recurrences
In this section, we present a method for determining the dynamic parameters of the system so that a given recurrence is obtained from the system of equations describing the steady state. In particular, we examine whether a given choice of parameters is compatible with an appropriate initial value of the recurrence. We consider in detail, among others, the recurrences associated with the Catalan, Motzkin, and Schröder numbers.
For the sake of simplicity, we will additionally assume 
. For a non-zero value of 
, we simply obtain more examples of recurrences, which can be studied exactly the same way as those presented below. Then, 
 and 
, and the recurrence is of the following form: 
The dependence of the equations on ratios of parameters reflects the fact that scaling these parameters—for example, dividing by 2—will slow down the process but will not influence its stationary state.
In the process, because there is separation and the transition from the separative to the aggregative state for single entities, the number of groups will decrease with size. In contrast, classical integer sequences do not decrease. Hence, to relate these two objects, it is necessary to introduce appropriate scaling (compare [21,22]):
Then, using the following short notation
      the above recurrence is reduced to
Next, we define the following for 
:
	  Thus, the choice of 
, and 
R determines the coefficients of the recurrence.
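The need for scaling noted above can be illustrated numerically. The sketch below assumes a purely geometric factor x^i, which is an illustrative choice and not necessarily the scaling used in this paper; `catalan` and `geometrically_scaled` are hypothetical helper names. Since the ratio of consecutive Catalan numbers is 2(2i+1)/(i+2) < 4, the scaled values are strictly decreasing for any 0 < x ≤ 1/4, while the unscaled sequence never decreases.

```python
from fractions import Fraction
from math import comb


def catalan(i):
    """i-th Catalan number from the closed form binom(2i, i) / (i + 1)."""
    return comb(2 * i, i) // (i + 1)


def geometrically_scaled(i, x):
    """Catalan number multiplied by the geometric factor x**i (assumed scaling)."""
    return catalan(i) * x ** i


raw = [catalan(i) for i in range(10)]
scaled = [geometrically_scaled(i, Fraction(1, 4)) for i in range(10)]
```

With x = 1/4 the list `scaled` decreases monotonically, which is the qualitative behaviour expected of a cluster size distribution.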
  5.1. ,  and  
In this case, we have the following:
		The first two formulas set values of 
 and 
, while the last one gives an additional value for dynamic parameters, namely
To check if the above restriction is compatible with state equations for the system, we introduce symbols 
:
Then, we have (
42), (
32) (modified with (
42)), and (
33) in the following form:
Solving for 
, and 
 gives the following:
        where
From 
, it follows that
Next, to have 
 (or equivalently 
, see Equation (
42)), we must assume the following:
		The second inequality, after using (
49), gives the following condition:
		The last condition is stronger than the condition (
50).
Thus, for given 
, and hence 
Y, we need to choose 
 and 
 respecting the condition (
52), then calculate 
 and 
.
Finally, 
 is given as follows:
Example 1. A064641 sequence [33]. The choice , ,  gives the following recurrence relation: Hence, , and this implies that . The condition (52) is . Thus, for example, one can choose , , , and hence For  and , the above parameters lead to the recurrence of the A064641 sequence [2]; however, it starts with As can be seen from Equation (53), for  and , it follows that .
  5.2.  ,  and  
For 
, we have the following:
Then, Equations (
32) and (
33) imply the following: 
From conditions 
 and 
 it follows that
Since
        we have
		Thus, for a given 
 such that 
, the value of 
 varies from 0 to 
∞ for 
 within the range given by inequality (
59). In particular, 
 for the following:
Example 2. Motzkin numbers [23,24,25]. To obtain the Motzkin numbers, we set  and . Hence, we have the following recurrence relation: which for  and  is equivalent to the standard form: The sample choice  and  leads to the following: and hence to the following condition: The choice  provides . Finally,  and  give the correspondence between the distribution  and Motzkin numbers .
 Example 3. Shifted Catalan numbers [1,4]. We set  and . Hence, we have the following recurrence relation: which coincides directly with the shifted Catalan number recurrence (where ): We set  and , as in the previous example, and consequently, we obtain Equation (65). Then, the choice  provides  for  and .
Then, , and  gives an appropriate correspondence of  to the shifted Catalan numbers.
 Remark 3. For a given m and , the choice of  and , as defined by Equations (57) and (58), and an appropriate change in the value , ensures a smooth transition from the distribution  related to the Motzkin numbers to the distribution related to shifted Catalan numbers, through a resulting change in . Note that the coefficients P and Q also change accordingly.  In this way, by choosing P and Q, one can easily generate further examples of various integer sequences, starting with  and directly related to the  distribution.
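The sequences named in this section can be generated numerically for comparison. The sketch below uses the standard textbook convolution recurrences for the (unshifted) Catalan, Motzkin, and large Schröder numbers; it is not the master recurrence of this paper, and the function `convolution_sequence` and its weights `linear`, `conv_short`, and `conv_full` are hypothetical names that do not correspond to the coefficients P, Q, and R.

```python
def convolution_sequence(length, linear, conv_short, conv_full):
    """First `length` terms of the sequence with a(0) = 1 and

        a(n+1) = linear     * a(n)
               + conv_short * sum_{k=0}^{n-1} a(k) a(n-1-k)
               + conv_full  * sum_{k=0}^{n}   a(k) a(n-k).
    """
    a = [1]
    while len(a) < length:
        n = len(a) - 1
        short = sum(a[k] * a[n - 1 - k] for k in range(n))
        full = sum(a[k] * a[n - k] for k in range(n + 1))
        a.append(linear * a[n] + conv_short * short + conv_full * full)
    return a[:length]


# Catalan:        C(n+1) = sum_{k=0}^{n} C(k) C(n-k)
catalan = convolution_sequence(8, linear=0, conv_short=0, conv_full=1)
# Motzkin:        M(n+1) = M(n) + sum_{k=0}^{n-1} M(k) M(n-1-k)
motzkin = convolution_sequence(8, linear=1, conv_short=1, conv_full=0)
# Large Schröder: S(n+1) = S(n) + sum_{k=0}^{n} S(k) S(n-k)
schroeder = convolution_sequence(6, linear=1, conv_short=0, conv_full=1)
```

Changing only the three weights switches between the sequences, which mirrors in a very simplified way the idea that a single parametrized recurrence covers several Catalan-like families.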
  5.3.  ,  and  
For 
, we have the following:
Equations (
32) and (
33) imply the following:
From 
 and 
, it follows that
Finally, 
 is given as follows:
Example 4. Schröder numbers [30,31,32]. For  and , we have the following recurrence relation: The sample choice  and  leads to the following: and hence Therefore, which shows that for the recurrence of Schröder numbers, it is impossible to obtain  for any value of parameter .
  5.4.  ,  and  
In this case, one has the following:
For 
, as previously, we can obtain the following:
From , it follows that .
And finally,
        which shows restriction for 
.
  6. Conclusions
In this paper, a discrete system evolving according to a relatively simple algorithm is defined. The rules of changes in the state of the system, dependent on parameters defining the probabilities of specific changes, lead to a stationary state that can be described in an exact way (without approximations) through a set of equations dependent on these parameters. From these equations, one can obtain formulas for the basic parameters of the system, such as the density and the number of clusters. These equations, after appropriate reduction, lead to the master recurrence with coefficients. By choosing the appropriate coefficients, we obtain the known recurrences associated with the Catalan, Motzkin, Schröder, and other numbers. For all of these cases, the corresponding probability parameters were determined. The general method presented in this paper can be used to further investigate this and other similar systems.
The system defined in this paper has two advantages over the previous ones. First, to derive the stationary state equations, it is not necessary to use any approximation, including the mean-field approximation, as was the case for the cellular automata systems [21,22]. Second, the advantage over these two previous works and ref. [20] (related exclusively to Catalan numbers) lies in the universality of this new stochastic process.
The main result of the paper is the definition of a stochastic process for which the precisely derived steady-state equations are the source of many different recurrences, including those related to widely known integer sequences. This realization shows that the relations between different Catalan-like recurrences (see, for example, [3,4]) can be interpreted as changes in the values of parameters responsible for just a few subprocesses in a relatively simple dynamical system. As pointed out in Remark 3, a transition between different recurrences (in this case, those related to the Catalan and Motzkin numbers) can be achieved by changing a single parameter.
The stochastic system is defined in this work in a way that aims at generality in order to ensure broad properties of the model. In particular, for the sole purposes of this work, it is not necessary to introduce the parameter . Therefore, in the part dealing with the analysis of the system, we limit ourselves to considering the case , which also significantly simplifies the relevant equations. Investigating the properties of the system for  therefore remains a challenge for future study. The situation is fundamentally different with the condition . If , then one can simply obtain more examples of recurrences in exactly the same way as those presented in the text.
A separate challenge is the question of whether it is possible to modify the evolution rules of the system in such a way that, in every case, or at least in a wider range, the initial value of  is allowed. These issues will be the subject of further research.
The system presented in this article can also be considered as a specific realization of the aggregation–disaggregation process. These processes have applications in many different fields (see, for example, [36,37,38]), including the interesting problem of mathematical biology concerning the formation of causal groups [39] (both as applied to animals and humans) and the explanation of the distributions of the sizes of these groups [40,41,42].
Finally, following the growing interest in the use of machine learning to aid the discovery process in the field of integer sequences [43], we would like to point out that the stochastic process defined in this work can be used to test algorithms designed to discover equations directly from observed data [44]. Knowing which parameters lead to a given recurrence, we can obtain a nontrivial random distribution of cluster sizes from the above-defined evolution rules and use it to test the ability of various algorithms to identify the equations behind such “data”, for example the well-established SINDy (Sparse Identification of Nonlinear Dynamics) algorithm for equation discovery [45]. In particular, the nontrivial smooth transitions between different recurrences indicated in Remark 3 can be a good indicator of the performance of the tested algorithm.