Statistics of Correlations and Fluctuations in a Stochastic Model of Wealth Exchange

In our recently proposed stochastic version of discretized kinetic theory, the exchange of wealth in a society is modelled through a large system of Langevin equations. The deterministic part of the equations is based on non-linear transition probabilities between income classes. The noise terms can be additive, multiplicative or mixed, in each case with either a white or an Ornstein–Uhlenbeck spectrum. The most important measured correlations are those between the Gini inequality index G and the social mobility M, between the total income and G, and between M and the total income. We describe numerical results concerning these correlations, as well as a quantity which gives the average stochastic deviations from the equilibrium solutions as a function of the noise amplitude.


Introduction
Wealth exchange models [1,2] are used in the context of economic theory and econophysics [3,4] to describe in a simplified way the individual economic interactions occurring in a society. In particular, they allow one to predict emerging collective features like the income distribution, the Gini index or the Pareto exponent. Most of these models have equilibrium solutions, but it is also well known that economic systems are never exactly at equilibrium. Hence, fluctuations should also be taken into account in the models. This is in some sense done in economic agent-based models [5,6], which are essentially computer simulations of economic systems based on a population sample. By their very nature they include statistical fluctuations which depend on the size of the sample. In this paper, we shall focus instead on models based on discretized kinetic theory [7,8]. These models are expressed in mathematical form through large systems of non-linear ordinary differential equations which describe transitions of individuals of a society between income classes. The transitions are the consequence of economic interactions which occur with certain probabilities defined by the model and depend on several parameters. Interaction terms can be of degree 2 in the class population densities (direct interactions) or of degree 3 (indirect interactions, used to model redistribution processes). These kinetic models are well established, and have also been tested on networks and for the description of taxation, welfare, tax evasion and tax audits (see e.g., [9,10]).
We have described in [11,12] the complex mathematical procedure needed to introduce stochastic noise into the system, which leads to a consistent set of kinetic Langevin equations. The main difficulty in building up this procedure lies in the presence of dynamical constraints which, unlike the cases typically treated in the literature, involve several equations at once. The stochastic variations of the populations must respect, both in the additive and in the multiplicative case, the condition that the total population is conserved. In analogy with statistical mechanics, we speak of a "canonical" system when the total income µ is free to fluctuate, and of a "micro-canonical" system when the total income is fixed; the latter requires a second constraint on the stochastic variations. It is also possible to consider a "mixed noise", such that at each step of the time evolution of the system the stochastic variation is a linear combination, with random coefficients, of an additive and a multiplicative variation.
In this paper, we present extensive statistical results, almost all of them for a stochastic model with multiplicative noise. They concern quantities which are of major interest for real-world economies and whose present values are a widespread object of concern [13,14]. The most significant among these quantities is certainly the correlation between economic inequality and social mobility, respectively measured in our model by a parameter G which expresses the Gini index and by a suitably defined indicator M. For a range of values of the Gini index compatible with those of industrialized countries, the model gives negative values of this correlation (Section 2). This is in agreement with empirical data [15,16]. In Section 3 we investigate the dependence of the correlation $R_{GM}$ on G. In Section 4 we briefly discuss the case of Ornstein-Uhlenbeck noise. We also give an estimate $\delta_{av}$ of the average deviations of the population from equilibrium as a function of the noise amplitude Γ (Section 5). Finally, Section 6 contains some conclusions and comments on possible future extensions of the work.

Langevin Equation. Deterministic Solutions
We consider a population of individuals divided into a finite number n of classes, each one characterized by its average income $r_j$, with $0 < r_1 \le r_2 \le \dots \le r_n$. Let $x_j(t)$ for $1 \le j \le n$ denote the fraction at time t of individuals in the j-th class. In previous work (see e.g., [8]), assuming different economic behaviors of individuals belonging to different classes, two of us constructed a model for the time evolution of $x(t) = (x_1(t), \dots, x_n(t))$ resulting from a whole set of economic exchanges. The model (which actually also describes taxation and redistribution processes) was formulated as a system of ordinary differential equations, hence "deterministic" in the $x_j$ variables. In contrast, the discretized Langevin equations we are dealing with here take the form of Equation (1), where $dx_i$ represents the variation, in the time interval dt, of the population $x_i$ of the income class i. The total population is normalized to 1, $\sum_{j=1}^{n} x_j(t) = 1$ for all $t \ge 0$. We emphasize that this normalization, imposed at t = 0, holds true for all $t \ge 0$ as a consequence of the specific choice of the elements entering Equation (1). We take the class incomes $r_i$ to grow linearly in i (see the example below), even though other choices are also possible.

The deterministic variation of $dx_i$ (the part proportional to dt) can be easily recognized by setting Γ = 0. It contains the coefficients $C^i_{hk}$, which define the model by fixing the inter-class transition probabilities and account for the above-mentioned behavioral heterogeneity. More precisely, $C^i_{hk}$ is the probability that an individual of income class h will belong to class i after an encounter with an individual of class k. These coefficients are required to satisfy the identity $\sum_{i=1}^{n} C^i_{hk} = 1$ for any $h, k \in \{1, \dots, n\}$ and are taken in this paper as in [8].
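The conservation of the total population can be illustrated with a minimal sketch. It assumes, for illustration only, a gain-loss form for the deterministic part, $\sum_{h,k} C^i_{hk} x_h x_k - x_i \sum_k x_k$, a randomly generated tensor in place of the coefficients of [8], and one simple linear correction that makes multiplicative noise variations sum to zero. None of these choices is claimed to be the exact content of Equation (1) or of the procedure of [11,12]; the function names are hypothetical.

```python
import random

def random_transition_tensor(n, rng):
    """Hypothetical C[i][h][k] >= 0 with sum over i equal to 1 for every (h, k)."""
    C = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for h in range(n):
        for k in range(n):
            w = [rng.random() for _ in range(n)]
            s = sum(w)
            for i in range(n):
                C[i][h][k] = w[i] / s
    return C

def drift(x, C):
    """Assumed gain-loss form: sum_{h,k} C[i][h][k] x_h x_k - x_i * sum_k x_k."""
    n = len(x)
    total = sum(x)
    return [
        sum(C[i][h][k] * x[h] * x[k] for h in range(n) for k in range(n)) - x[i] * total
        for i in range(n)
    ]

def multiplicative_noise(x, gamma, dt, rng):
    """Raw variations gamma * x_j * xi_j * sqrt(dt), corrected so that they sum to zero.

    The correction (subtracting x_j times the total excess) is one possible
    linear transformation, chosen here for simplicity.
    """
    raw = [gamma * xj * rng.gauss(0.0, 1.0) * dt ** 0.5 for xj in x]
    excess = sum(raw)
    total = sum(x)
    return [rj - xj * excess / total for rj, xj in zip(raw, x)]

def langevin_step(x, C, gamma, dt, rng):
    """One Euler step: deterministic drift plus constrained multiplicative noise."""
    f = drift(x, C)
    noise = multiplicative_noise(x, gamma, dt, rng)
    return [xj + fj * dt + nj for xj, fj, nj in zip(x, f, noise)]

rng = random.Random(0)
n = 5
C = random_transition_tensor(n, rng)
x = [1.0 / n] * n
for _ in range(1000):
    x = langevin_step(x, C, gamma=0.001, dt=0.1, rng=rng)
total_population = sum(x)  # stays equal to 1 up to rounding errors
```

In this sketch the drift sums to zero over the classes because of the identity on the coefficients, and the noise sums to zero by construction, so the normalization is preserved at every step. A random tensor does not conserve the total income, unlike the coefficients of [8].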
Note that the transition probability fluxes are proportional to the products $x_h x_k$ of the class population densities because pairwise monetary exchanges are considered here. The noise vector with components $\eta_i$ must also respect the constraints mentioned above. These are implemented through suitable linear transformations applied, at each step in the time evolution, to a vector of stochastic variables with standard Gaussian distributions, see e.g., [12]. The results reported in this paper are all obtained with multiplicative noise, except the plot in Figure 1, which is obtained with mixed noise. We take as initial condition an equilibrium configuration of the deterministic system with a certain total income $\mu = \sum_{i=1}^{n} r_i x_i$. Such an equilibrium can be obtained with high accuracy through a Runge-Kutta integration of the deterministic equations over a very long interval (typically $10^4$ or $10^5$ steps). We recall here that in the deterministic case the equilibrium configuration does not depend on the initial conditions, but only on the total income. We recall next, before proceeding, the definitions of the quantities which are investigated in the paper.
The Gini index, commonly used as a measure of inequality of wealth or income, can range from 0 (complete equality) to 1 (maximal inequality). It can be calculated from the Lorenz curve, which plots on the axis of ordinates the cumulative percentage of the total income of a population earned by the bottom percentage of individuals, represented on the axis of abscissas. In comparison with it, the 45 degree line represents perfect equality of incomes. The Gini index is defined as the ratio between the area enclosed by the Lorenz curve and the 45 degree line and the total area under this line. In our discrete approach, we calculate the area under the Lorenz curve as a sum of trapezia. In the deterministic case, when total income conservation holds true, we calculate the Gini index G at equilibrium. In the stochastic case, when the total income changes in time, there are no equilibrium solutions and we obtain instead a time-series for the Gini index.
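The trapezoidal computation described above can be sketched as follows. The function name `gini_index` and the two test distributions are illustrative choices, not taken from the original work; the class incomes are assumed to be sorted in non-decreasing order, as in the model.

```python
def gini_index(x, r):
    """Gini index from class fractions x (summing to 1) and class incomes r.

    The Lorenz curve is piecewise linear through the points
    (cumulative population, cumulative income share); the area below it
    is summed trapezium by trapezium.
    """
    mu = sum(rj * xj for rj, xj in zip(r, x))  # total (average) income
    area = 0.0
    share_prev = 0.0
    share = 0.0
    for rj, xj in zip(r, x):
        share += rj * xj / mu
        area += xj * (share_prev + share) / 2.0  # trapezium of width x_j
        share_prev = share
    # Area between the 45 degree line and the curve, divided by 1/2.
    return 1.0 - 2.0 * area

incomes = [10.0 * i for i in range(1, 11)]
all_in_class_3 = [0.0, 0.0, 1.0] + [0.0] * 7
g_equal = gini_index(all_in_class_3, incomes)      # everyone has the same income
g_two = gini_index([0.5, 0.5], [10.0, 30.0])       # two equally populated classes
```

When all individuals are in the same class the Lorenz curve coincides with the 45 degree line and the index vanishes; for two equally populated classes with incomes 10 and 30 the area under the Lorenz curve is 0.375, which gives an index of 0.25.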
The social mobility coefficient M we use here is essentially a weighted average, over the classes, of the probability for an individual to be promoted to the upper class in the unit time. It is computed using an expression first introduced in [17], which can be found e.g., in [11,12] and which we do not report here because it would require a longer definition of the symbols entering the $C^i_{hk}$. Empirical evidence shows a clear correlation between these two quantities. Namely, it is found that mobility decreases when inequality rises, thus implying a negative correlation between G and M [15,16,18]. This correlation, nicknamed the "Great Gatsby Curve" [19], is important since it means that the increase of inequality (as presently observed in several countries) tends to be a self-reinforcing phenomenon, unless it is complemented by suitable social policies. It should also be stressed that this correlation holds for societies near equilibrium, while it may be different in phases of strong economic growth [20].
Consider now for example a system with 10 classes (n = 10), class incomes $r_i = 10i$, and the coefficients $C^i_{hk}$ as in [8]. In order to set the income equal to 30 we can initially put all the population into class 3, i.e., set $x_3 = 1$ and all the other $x_i$ equal to zero. To set the income equal to 29, one can assign $x_2 = 0.1$, $x_3 = 0.9$ and all the rest zero, and so on. The asymptotic equilibrium configuration with µ = 30 is then obtained by integrating the deterministic equations from this initial condition. Since the equilibrium configuration depends only on the total income, a one-to-one relation between $G_{eq}$ and µ is defined, which in a reasonable range of $G_{eq}$ is almost linear; one has for instance, in the interval $0.35 \le G_{eq} \le 0.41$, $G_{eq} = -0.1594 + 0.03712\,\mu - 0.0006\,\mu^2$ (see Figure 3; the relation between $M_{eq}$ and µ is also shown, although it is not of immediate interest for this work).
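The initial conditions and the fitted relation quoted above can be checked with a short sketch. The helper names are hypothetical, and the inverse function is simply the algebraic inversion of the quoted quadratic fit, meaningful only in the range where the fit itself holds.

```python
import math

def total_income(x, r):
    """Total income mu = sum_i r_i x_i for class fractions x and class incomes r."""
    return sum(ri * xi for ri, xi in zip(r, x))

n = 10
r = [10.0 * i for i in range(1, n + 1)]  # r_i = 10 i

x_mu30 = [0.0] * n
x_mu30[2] = 1.0                          # whole population in class 3
x_mu29 = [0.0] * n
x_mu29[1], x_mu29[2] = 0.1, 0.9          # classes 2 and 3

def gini_eq_fit(mu):
    """Quadratic fit of G_eq versus mu quoted in the text (0.35 <= G_eq <= 0.41)."""
    return -0.1594 + 0.03712 * mu - 0.0006 * mu ** 2

def mu_from_gini(g_eq):
    """Inverse of the fit on its increasing branch (smaller root of the quadratic)."""
    a, b, c = 0.0006, 0.03712, 0.1594 + g_eq
    return (b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

mu_a = total_income(x_mu30, r)
mu_b = total_income(x_mu29, r)
g_25 = gini_eq_fit(25.0)
mu_back = mu_from_gini(g_25)
```

For µ = 25 the fit gives $G_{eq} \approx 0.394$, inside the stated interval, and the inverse recovers µ = 25, which illustrates how a dependence on µ can be re-expressed as a dependence on $G_{eq}$.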

Stochastic Time-Series
In the discretized Langevin Equation (1) we typically set dt = 0.1 or dt = 1 and let the system evolve in time-series of 5000 steps, repeated for $N_R$ realizations; $N_R$ varies between 50 and 6000, depending on the purpose. (The choices for dt and the number of steps are based on our previous experience with the relaxation time of the deterministic system.) In the following we shall also compare results obtained through time-series of 10,000 integration steps and $N_R/2$ realizations, or time-series of 2500 steps and $2N_R$ realizations; in principle the results should coincide in the ergodic limit and in fact the averages are very close, but we have found that correlation estimators obtained with the time-series of 5000 or 2500 steps tend to display smaller fluctuations.

After each integration step the values of G and M are computed. Both quantities are non-trivial functions of the populations $x_i$. They fluctuate around their equilibrium value according to a Wiener process (see the example in Figure 4), as can be checked by evaluating the Hurst exponent of their time-series, which is very close to 0.5. The time auto-correlation function of G is remarkably linear (Figure 5). The same is true for the auto-correlation of µ and M, though not reported here. (We recall the definition of the Hurst exponent in this context: in a time-series with N points, the expectation value of the ratio between the range ρ(N) of the series and its standard deviation σ(N) is proportional to $N^H$ as $N \to \infty$, where H is the Hurst exponent.)

The histograms of the values of $R_{GM}$ obtained over the realizations are neither Gaussian nor symmetric. If we regard the values of $R_{GM}$ for each realization as random variables themselves, then, assuming they are independent, we may expect that the averages of $R_{GM}$ over a certain number $N_R$ of realizations obey the central limit theorem, and thus have a Gaussian distribution with a standard deviation given by the standard deviation of the single realizations divided by $\sqrt{N_R - 1}$.
In order to check this, we made $N_S = 200$ series of simulations, each one comprising $N_R = 50$ realizations of 5000 steps. We obtained in this case $\sigma_{averages} = 0.0524$ and $\sigma/\sqrt{N_R - 1} = 0.0531$, which are in close agreement. The corresponding histogram is reasonably symmetric and Gaussian.
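The logic of this check can be reproduced in a small sketch. The values drawn here are not correlations from the Langevin model: as a stand-in, each "realization" is an asymmetric random number on [-1, 1] obtained from a rescaled Beta(2, 5) distribution, an arbitrary choice made only to have a non-Gaussian, non-symmetric single-realization histogram.

```python
import random
import statistics

rng = random.Random(1)
N_S, N_R = 200, 50  # number of series, realizations per series

def sample_value(rng):
    """Stand-in for one realization's correlation: asymmetric variable on [-1, 1]."""
    return 2.0 * rng.betavariate(2.0, 5.0) - 1.0

series = [[sample_value(rng) for _ in range(N_R)] for _ in range(N_S)]

# Standard deviation of the N_S averages, each taken over N_R realizations.
means = [statistics.mean(s) for s in series]
sigma_averages = statistics.pstdev(means)

# Standard deviation of the single realizations, divided by sqrt(N_R - 1).
sigma_single = statistics.pstdev([v for s in series for v in s])
predicted = sigma_single / (N_R - 1) ** 0.5

ratio = sigma_averages / predicted
```

With independent draws the two estimates are expected to agree within a few percent, which is the same kind of agreement as reported above for the simulated correlations.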

Dependence of the Correlations on the Total Income and on G
In the previous sections we have discussed the statistical properties of the correlations $R_{GM}$, $R_{G\mu}$ and $R_{M\mu}$ in the case of multiplicative noise. The behavior of the correlations was evaluated for a fixed initial value µ of the total income. As reported in [12], however, the value of the correlations has a clear dependence on µ (all other model parameters being fixed). Since there is a one-to-one relation between µ and $G_{eq}$, we can also say that the correlations depend on $G_{eq}$; this choice of variable is actually better, since $G_{eq}$ has a direct economic meaning, while the value of µ depends on the definition of the income classes and on an arbitrary reference unit. Numerical evaluations of the correlations (see [12]) show that in a suitable range of values of µ and $G_{eq}$ the $R_{GM}$ correlation is approximately linear in µ and remains negative, thus confirming the general validity of the empirical "Great Gatsby" rule mentioned in Section 2.1. On the other hand, the correlation $R_{G\mu}$ is approximately linear in µ but changes sign, showing that the question whether in economics "a rising tide lifts all boats" does not have an absolute answer. Finally, the correlation $R_{M\mu}$ is seen to be always very close to 1, confirming the strong linkage between mobility and total income; note that in a physical analogy the total income can be identified with the total energy and thus with the temperature in the canonical case.
Note that for values of G much smaller than usual, even the $R_{GM}$ correlation can become positive: see Figure 1, obtained with the mixed noise introduced in [21] and extending down to the (unrealistic) value G = 0.25. In order to obtain a meaningful diagram, one needs to generate a large number of distinct initial conditions for the stochastic equations, each one having a different value of µ. For instance, in Figure 1 there are 480 initial conditions, one for each result represented by a red dot. Every result is the average of 80 realizations starting from the same initial condition.

Langevin Equation with Ornstein-Uhlenbeck Noise and Dependence of the Correlations on Γ and τ
It is straightforward to replace the white noise $\eta_i$, used in the Langevin equation until now, with an Ornstein-Uhlenbeck (OU) noise $y_i$ having memory time τ. To this end, an integration step of the discretized OU stochastic equation is added to each integration step of the Langevin equation. The full integration step also includes the normalization of the multiplicative noise. The correlations can now be computed as functions of both Γ and τ. The plots of Figures 8-10 are obtained in this way, where each dot represents the average of 50 realizations. A 3D polynomial fit confirms that the correlations have a very weak dependence on Γ and τ.

Note that the largest values of Γ correspond to a very strong noise. The noise amplitude can be related to economic data by considering that for Γ = 0.001 the corresponding fluctuations of the total income µ in the stochastic realizations are of the order of 0.1%, thus quite realistic for a society near equilibrium. The model is robust with respect to an increase in the noise amplitude. For instance, a tenfold increase of Γ leads to a proportional increase in the fluctuations of µ, while the values of the correlations $R_{GM}$, $R_{G\mu}$ and $R_{M\mu}$ are substantially unchanged. For Γ up to 0.032, and with OU noise with memory time τ = 32, the fluctuations of µ can be of the order of 50% or more.
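A noise with memory time τ can be generated step by step as in the sketch below. It uses the exact one-step update of a stationary OU process with unit variance, which is a standard textbook discretization and is not claimed to coincide with the integration step or the normalization used in this work; the values τ = 8 and dt = 1 are arbitrary illustration parameters.

```python
import math
import random

def ou_series(n_steps, tau, dt, rng):
    """Ornstein-Uhlenbeck series with memory time tau and unit stationary variance.

    Exact one-step update: y <- exp(-dt/tau) * y + sqrt(1 - exp(-2 dt/tau)) * xi,
    with xi a standard Gaussian variable.
    """
    decay = math.exp(-dt / tau)
    kick = math.sqrt(1.0 - decay * decay)
    y = rng.gauss(0.0, 1.0)  # start from the stationary distribution
    out = []
    for _ in range(n_steps):
        y = decay * y + kick * rng.gauss(0.0, 1.0)
        out.append(y)
    return out

def lag1_autocorrelation(series):
    """Sample autocorrelation of a series at a lag of one step."""
    m = sum(series) / len(series)
    num = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    den = sum((a - m) ** 2 for a in series)
    return num / den

rng = random.Random(2)
tau, dt = 8.0, 1.0
y = ou_series(20000, tau, dt, rng)
rho_measured = lag1_autocorrelation(y)
rho_expected = math.exp(-dt / tau)
```

The measured one-step autocorrelation is close to exp(-dt/τ), which is what distinguishes this colored noise from white noise, whose successive values are uncorrelated. In a Langevin step, each component $y_i$ would take the place of the white-noise variable $\eta_i$.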

Fluctuations of the Populations in Dependence on Γ
Looking at the equilibrium populations, denoted by $x_{i,eq}$, for certain model parameters and for a certain µ, it is interesting to measure the average amplitude of the stochastic fluctuations $x_i - x_{i,eq}$ over several realizations as a function of the noise amplitude Γ. A suitable measure appears to be the average deviation $\delta_{av}$. As can be seen in Figure 11, the dependence of $\delta_{av}$ on Γ is to a good approximation linear at least up to Γ = 0.03, provided the averages are made on a large number of realizations.

Figure 11. Average deviation $\delta_{av}$ from deterministic equilibrium in dependence on the noise amplitude Γ, for multiplicative white noise. Each dot is the average of 500 realizations with 5000 steps.
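The linear scaling with Γ can be made plausible with a toy sketch. Everything in it is an assumption made for illustration: the kinetic model is replaced by a linear relaxation of each population towards its equilibrium value, the population constraint is ignored, and the deviation is measured as a root-mean-square distance from equilibrium averaged over time and realizations, which is only a stand-in for the measure $\delta_{av}$ used in the paper.

```python
import random

def average_deviation(gamma, x_eq, n_real=5, n_steps=2000, dt=0.1):
    """Toy average deviation from equilibrium under multiplicative white noise.

    Each population relaxes linearly towards x_eq (hypothetical dynamics) and
    receives a variation gamma * x_j * xi_j * sqrt(dt) at every step.
    """
    total = 0.0
    for seed in range(n_real):
        rng = random.Random(seed)
        x = list(x_eq)
        for _ in range(n_steps):
            x = [
                xj - (xj - xe) * dt + gamma * xj * rng.gauss(0.0, 1.0) * dt ** 0.5
                for xj, xe in zip(x, x_eq)
            ]
            # Root-mean-square deviation over the classes at this step.
            total += (sum((xj - xe) ** 2 for xj, xe in zip(x, x_eq)) / len(x)) ** 0.5
    return total / (n_real * n_steps)

x_eq = [0.5, 0.3, 0.2]  # hypothetical equilibrium populations
delta_small = average_deviation(0.001, x_eq)
delta_double = average_deviation(0.002, x_eq)
ratio = delta_double / delta_small
```

In this toy setting doubling Γ doubles the average deviation almost exactly, because for small noise the deviations obey an equation that is linear in the noise term. Whether the same proportionality holds in the full nonlinear model is what the simulations of Figure 11 test.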

Conclusions
In this work we have investigated the properties of a system of nonlinear Langevin stochastic equations which describe the evolution in time of the income distribution of an idealized society. The individuals of this society interact through money exchanges which are in part deterministic (in the sense that they have fixed transition probabilities) and in part random, being caused by a noise source of the additive, multiplicative or "mixed" kind.
The noise frequency spectrum can be further characterized as white or colored (Ornstein-Uhlenbeck noise).
By analysing the results of a large number of numerical simulations we have determined the statistical distribution of the correlations $R_{GM}$ (Gini inequality index-social mobility), $R_{G\mu}$ (Gini-total income) and $R_{M\mu}$ (mobility-total income), for given average values of the total income. This distribution turns out to be asymmetrical, with a typically negative mean value for $R_{GM}$. On the other hand, the mean value of $R_{M\mu}$ is always positive and close to 1, and that of $R_{G\mu}$ has variable sign, depending on the other parameters of the system.
The dependence of the correlations on the total income µ can be translated into a dependence on G, by taking advantage of the deterministic relation between G and µ at equilibrium. We have computed the $R_{GM}$ correlation also in a range of inequality values G which extends far below the usual range of pre-redistribution G values typical of industrialized countries, namely $G \ge 0.35$. In this extended range the $R_{GM}$ correlation becomes positive; this means that when inequality is very low, an inequality increase correlates with an increase in social mobility.
In the case of a colored noise with amplitude Γ and memory time τ, the $R_{GM}$ correlation can be computed and plotted as a function of those two parameters, in order to highlight possible transitions between the two regimes $R_{GM} > 0$ and $R_{GM} < 0$ in the Γ-τ plane. No such transitions appear to be present, however. The $R_{G\mu}$ and $R_{M\mu}$ correlations are likewise only weakly affected by variations of Γ and τ.
Finally, the average deviation of the system from the deterministic equilibrium turns out to be a linear function of the noise amplitude, with high accuracy, in a wide range of values of the noise amplitude. This shows that the system is stable with respect to the noise and does not exhibit any tendency to run away from equilibrium, even in the presence of strong noise.
In a conceivable extension of the model, the total population may be allowed to vary. This would remove one of the algebraic constraints and simplify the stochastic version. Immigration and emigration phenomena, or demographic changes in the short or long term, could in this way be taken into account. It should be noted, however, that the deterministic part of the present model changes considerably in the case of a non-constant population, because some general properties of the differential equations cease to be true. We will address these issues in future work.