Entropy 2013, 15(6), 1985-1998; doi:10.3390/e15061985

Article
Inequality of Chances as a Symmetry Phase Transition
Institut National de Sciences Appliquées, 20 Avenue des Buttes de Coësmes F-35708 Rennes Cedex 7, France
Received: 7 February 2013; in revised form: 27 April 2013 / Accepted: 15 May 2013 / Published: 23 May 2013

Abstract

We propose a model for Lorenz curves. It provides two-parameter fits to data on incomes, electric consumption, life expectation and rate of survival after cancer. Graphs result from the condition of maximum entropy and from the symmetry of statistical distributions. Differences in populations composing a binary system (poor and rich, young and old, etc.) bring about chance inequality. Symmetrical distributions ensure equality of chances, generate Gini coefficients $G_i \le 1/3$, and imply that nobody gets more than twice the per capita benefit. Graphs generated by different symmetric distributions, but having the same Gini coefficient, intersect an even number of times. The change toward asymmetric distributions follows the pattern set by second-order phase transitions in physics, in particular universality: Lorenz plots reduce to a single universal curve after normalisation and scaling. The order parameter is the difference between cumulated benefit fractions for equal and unequal chances. The model also introduces new parameters: a cohesion range describing the extent of apparent equality in an unequal society, a poor-rich asymmetry parameter, and a new Gini-like indicator that measures unequal-chance inequality and admits a theoretical expression in closed form.
Keywords:
Lorenz plots; inequality of chances; symmetry; phase transition; maximum entropy; Gini coefficient
PACS Codes:
89.65.Gh; 64.60.A-

1. Introduction

For more than a century, Lorenz plots [1] have been used to describe inequalities in economic, social or biological systems. Now, a curve in a plane is usually the expression of an underlying mathematical law. Our goal here is to derive it from a small number of plausible assumptions. Related Gini coefficients [2], $G_i$, provide a simple measure of inequality. We pointed out [3] that values $0 < G_i \le 1/3$ implied the coexistence of inequality in the distribution of incomes with equality of chances for individuals to get any possible income. We also discussed [3,4] the relevance, for the characterisation of inequality, of a notion that originated in thermodynamics and statistical physics: entropy. Analogies between economics and physics have often been reported [5,6,7]. Georgescu-Roegen in particular discussed [8] the role of entropy in production, that is, material economic processes. Theil [9], Eliazar [10], and Eliazar and Sokolov [11] applied statistical formulations of entropy, or information, to an immaterial, but nevertheless objective and fundamental relation between individuals: inequality. Here we invoke entropy together with another crucial concept in many statistical phenomena, symmetry. We thus show that the elementary theory of second-order symmetry phase transitions in physics [12,13] can be tailored to describe Lorenz's inequality plots and corresponding values of $G_i$. Predictive features of the resulting model will be checked against data on incomes, electricity consumption, life expectation and (male) cancer-specific rate of survival. The latter is defined as the difference between cancer incidence and the mortality of its victims, that is, the probability of death by any cause other than cancer. Dollars, kWh, years, ages, etc., will thus be generically referred to as benefit units (BUs) in this paper, while "individuals" may apply to persons, households, economic agents, countries, etc.
In the present probabilistic description, expectation values will be shown as $\langle w \rangle$ and eventually assumed to be equal to their estimators, $\langle w \rangle = W/N$, where $W = \sum_n w_n$ is the total benefit of a particular kind enjoyed by $N$ individuals, and the random variable $w_n$ is the amount of benefit that falls to a randomly chosen individual $n$.
Phase transitions have been reported in connection with interest rate models [14], but they appear mainly in physics, from liquid-gas to binary alloy transformations. Different substances may feature dissimilar data, but suitable normalisation of relevant quantities results in a single universal curve on each side of the transition, describing the properties of all such substances simultaneously. This is the signature of a phase transition and is known as the law of corresponding states for liquid-gas transitions, and universality in the general case. In binary alloys universality is directly related to a change in the symmetry of relevant structures. We show here that it also holds for quite different economic and demographic data, and we obtain a mathematical expression for it. It allows the calculation of Lorenz plots and thereby their possible intersections. Atkinson [15] pointed out that, given two intersecting such curves, additive social welfare functions (SWF) could be found that led to opposite rankings of their inequality measures. We show that symmetry determines whether the number of such intersections, if any, is even or odd.
We discuss the relation between Lorenz graphs and statistical distributions in Section 2. Section 3 deals with the statistical implementation of the model, to be compared in Section 4 to data on incomes [16], electricity consumption [17], life expectation [16] and cancer survival [18]. Section 5 summarises our results and discusses their conceptual and practical consequences. An appendix proves two lemmas on relationships between the symmetry of the statistical distribution and the resulting graphs.

2. Thermodynamics of Inequality

2.1. Assumptions

Once averaged over many individuals and sufficiently long intervals of time (typically, a year) for data to be statistically significant, the individual's share of total benefit defines one out of many possible states. Individuals are said to occupy such states, to an extent measured by occupation numbers, i.e., the number of individuals occupying a given state. One has strict equality of chances if this number is a constant for all available states. A complete set of occupation numbers determines the state of the social system referred to the particular kind of benefit under study. Time intervals will be assumed to be short compared with the characteristic time of relevant medical, economic or social changes. In other words, possible transformations of society are assumed to be quasi-static. In general [19], inequality indicators are expected to be scale and replication invariant, that is, there is no change in the indicator when either all benefits (scaling) or both benefit and number of individuals are multiplied by the same positive factor, while leaving occupation numbers unchanged (replication). The latter does not affect the per capita benefit $\langle w \rangle$, while the ratio $\omega_n = w_n / \langle w \rangle$, with expectation value $\langle \omega \rangle = 1$, is insensitive to scaling. Functions of the vector $\boldsymbol{\omega} = \{\omega_1, \ldots, \omega_n, \ldots, \omega_N\}$ are thus automatically invariant against both transformations, and will be systematically used in the following. The cumulative distribution function (CDF) is $F(\omega) = \Pr(\omega_n \le \omega)$, the probability density function (PDF) is $f(\omega) = dF/d\omega$, and the quantile function $\omega(F)$ is assumed to admit a Taylor expansion in the interval $0 \le F \le 1$. The following assumptions are perhaps more specific to Lorenz functions, $L(F)$:
  • Finiteness. Real-world benefits are finite, with $\omega_m = \inf(\omega) \ge 0$ and $\omega_M = \sup(\omega) < \infty$, and $f(\omega)$ is bounded, continuous, differentiable and single-peaked in an open interval, $\omega \in \, ]\omega_m, \omega_M[ \; \Rightarrow f(\omega) > 0$, while $f(\omega) = 0$ otherwise.
  • Generalised Pareto criterion. Individuals interact, and the resulting relationship is reflexive, symmetric and transitive, defining a class of equivalence. Two mutually exclusive such classes (poor and rich, young and old, healthy and sick, etc.) describe $L(F)$ schematically, provided the boundary $F_p$ between them is realistically defined [20]. If the two classes are equally populated (a conceptually possible case), then $F_p = 1/2$. Otherwise $F_p > 1/2$ satisfies the equation $F_p = 1 - L(F_p)$, a simple generalisation of Pareto's well-known 80/20 rule.
  • Entropy. The most probable state of the whole society with respect to a given type of benefit maximises the entropy functional. Entropy decreases as inequality increases, and goes to zero in the limit of absolute inequality, where a single individual gets the whole benefit and leaves nothing to others.
  • Phase transition. The change from $F_p = 1/2$ to $F_p > 1/2$ marks a second-order phase transition from symmetric to asymmetric distributions. The entropy and $F_p$ are continuous across the transition.
Of course, as will become clear below, supplementary assumptions will be necessary to obtain a workable expression for the entropy.
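The generalised Pareto criterion in the list above fixes the class boundary as the root of $F_p = 1 - L(F_p)$. A minimal sketch of how such a root might be located numerically is given here; the function name `class_boundary` and the cubic test curve are illustrative assumptions, not part of the model.

```python
def class_boundary(lorenz, tol=1e-12):
    """Solve F_p = 1 - L(F_p) on [1/2, 1] by bisection.

    `lorenz` is any callable Lorenz function with L(0) = 0, L(1) = 1 and
    L(F) <= F, so that g(F) = F + L(F) - 1 is non-positive at F = 1/2,
    positive at F = 1 and non-decreasing in between.
    """
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + lorenz(mid) - 1.0 < 0.0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
    return 0.5 * (lo + hi)


def cubic_lorenz(F):
    """Hypothetical convex Lorenz curve, used only as an illustration."""
    return F ** 3


F_p = class_boundary(cubic_lorenz)
print(f"F_p = {F_p:.4f}, L(F_p) = {cubic_lorenz(F_p):.4f}")
```

For perfect equality, $L(F) = F$, the same routine returns $F_p = 1/2$, the equally populated case mentioned in the criterion.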

2.2. Statistical Mechanics and Lorenz Functions

Standard entropy maximisation with no other constraint than the obvious $\int f(\omega)\, d\omega = 1$ results in a single Lagrange multiplier and an optimal uniform, therefore symmetrical density $f_u(\omega) = 1/(\omega_M - \omega_m)$. Consider an additive social welfare function $\int u(\omega) f(\omega)\, d\omega$ and assume that the average $\langle u \rangle$ is imposed as an additional constraint; the final result is:
$$ f(\omega) = \frac{e^{-\beta u(\omega)}}{Z(\beta, \alpha)}, \qquad Z(\beta, \alpha) = \int_{\omega_m}^{\omega_M} e^{-\beta u(\omega)}\, d\omega, \tag{1} $$
where $\alpha$ refers to "external" parameters, due to other constraints. The parameter $\beta$ is zero for uniform densities, and can in principle be obtained in the general case from $\langle u \rangle = -\frac{\partial}{\partial \beta} \ln Z(\beta, \alpha)$, which leads to $Z(\beta, \alpha) = Z_0(\alpha)\, e^{-\beta \langle u \rangle}$. The optimal distribution is symmetrical if the centred function $\tilde{u}(\tilde{\omega}) = u(\tilde{\omega}) - \langle u \rangle$ is even, with $\tilde{\omega} = \omega - 1$. In particular, a normal distribution is obtained if the value of $\langle \tilde{u}^2 \rangle$ becomes an additional constraint. Symmetric PDFs $f(\omega)$ describe equality of chances (for an individual to belong to any of two classes), $F_p = 1/2$, irrespective of inequalities in the distribution of benefits. Maximum equal-chance inequality (ECI) has $R = \omega_m / \omega_M = 0$; perfect equality requires $R = 1$. Equality of chances should not be confused with equality of opportunities [21], dealing with sources of inequality and their possible legitimacy.
Lorenz plots $L(F)$ result from taking as evaluation function in Equation (1) $u(\omega) = \omega$, the maximum benefit enjoyed by the fraction $F(\omega)$ of the population. We write $L(F)$ in an apparently redundant but altogether useful form, as different functions $L$, $L_1$, $L_2$ of arguments $F$, $\omega$, $F^2$, respectively, indicating as many ways of writing the same Riemann-Stieltjes integral:
$$ L(F) = \int_0^F \omega(F)\, dF = L_1(\omega) = \int_{\omega_m}^{\omega(F)} \omega f(\omega)\, d\omega = L_2(F^2) = \int_0^{F^2} \frac{\omega(\sqrt{x})}{2\sqrt{x}}\, dx. \tag{2} $$
The expectation value of $\tilde{\omega}$ is $\langle \tilde{\omega} \rangle = L(1) - 1 \equiv 0$. In terms of the function $\phi(F|R) = L(F) - F^2$, Gini's coefficient [22] is given by:
$$ G_i = 2 \int_0^1 [F - L(F)]\, dF = \frac{1}{3} - 2 \int_0^1 \phi(F|R)\, dF = \frac{1}{3} - 2 \langle \phi \rangle, \tag{3} $$
that is, $G_i$ is the same for any two distributions (called Gini-equivalent here) with the same value of $\langle \phi \rangle$, and therefore with the same area under $\phi$-curves. Strict equality of chances requires a uniform distribution, that is, $\beta = 0$ in Equation (1). Then, using Equation (2) and $L(1) = 1$, one obtains $\omega_m + \omega_M = 2$, a result that applies to all symmetric distributions according to Lemma 1 in the appendix. Now, since nobody survives with less than zero income or, for that matter, with zero life expectancy, a symmetric distribution implies that nobody earns more than twice the average. Incidentally, this furnishes a useful test of consistency when values of $G_i < 1/3$ are found. The resulting Lorenz curve is:
$$ L_u(F|R) = \frac{2R}{1+R}\, F + \frac{1-R}{1+R}\, F^2, \tag{4} $$
and, correspondingly, $\phi_u(F|R) = g(G_i)\, \phi_0(F)$, where $g(G_i) = (1 - 3 G_i)$ and $\phi_0(F) = F(1-F)$ are universal functions pertaining to uniform distributions, a particular case of symmetric distributions. Any of the latter is Gini-equivalent to one of the former with $R = \frac{1 - 3 G_i}{1 + 3 G_i}$ and $\langle \phi_u \rangle = \langle \phi \rangle = g(G_i)/6$. Resulting L-curves scan the whole $0 \le G_i \le 1/3$ region with $1 \ge R \ge 0$. Changing to $G_i > 1/3$ requires the transfer of individuals from one half of the population to the other, so the distribution becomes asymmetric and $L_2(F^2)$ becomes downward-convex.
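The relations between the uniform-density Lorenz curve of Equation (4), the ratio $R$ and the Gini coefficient of Equation (3) can be checked numerically. The following is a minimal sketch; the function names and the midpoint-rule integration are illustrative choices, not taken from the paper.

```python
def lorenz_uniform(F, R):
    """Equation (4): Lorenz curve of a uniform density with R = w_min / w_max."""
    return (2.0 * R * F + (1.0 - R) * F * F) / (1.0 + R)


def gini_numeric(lorenz, n=10000):
    """Equation (3): G_i = 2 * integral of [F - L(F)] over [0, 1], midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        F = (i + 0.5) * h
        total += F - lorenz(F)
    return 2.0 * total * h


def gini_uniform(R):
    """Closed form implied by g(G_i) = 1 - 3 G_i = 2R / (1 + R)."""
    return (1.0 - R) / (3.0 * (1.0 + R))


def ratio_from_gini(gi):
    """Inverse relation R = (1 - 3 G_i) / (1 + 3 G_i), valid for 0 <= G_i <= 1/3."""
    return (1.0 - 3.0 * gi) / (1.0 + 3.0 * gi)


R = 1.0 / 3.0
gi_closed = gini_uniform(R)
gi_integrated = gini_numeric(lambda F: lorenz_uniform(F, R))
print(gi_closed, gi_integrated, ratio_from_gini(gi_closed))
```

With $R = 1/3$ both routes give $G_i = 1/6 \approx 0.17$, and the limits $R = 1$ and $R = 0$ give $G_i = 0$ and $G_i = 1/3$, the two ends of the equal-chance region.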

2.2.1. L-Curves

Figure 1a shows four theoretical curves displaying: (i) perfect equality ($\omega \equiv 1$), (ii) a normal distribution together with (iii) its Gini-equivalent uniform counterpart, the curves (ii) and (iii) being practically indistinguishable in Figure 1a, and (iv) maximum inequality compatible with equal chance, $L_2(F^2) = F^2$. Figure 1a has $F^2$ abscissas, instead of $F$ as usual. This is not accidental. It means that the reference state (conceptually possible but unattained in practice) is no longer perfect equality but maximum ECI. We refer to the resulting graphs as L-curves. Figure 1a also displays empirical data on incomes in the United States, world electricity consumption, life expectation per individual in a class of age, and the rate of survival of males in the USA after contracting cancer. Corresponding distributions are shown in Figure 1b.
Figure 1. (a) Concave and convex L-graphs resulting from equal- and unequal-chance inequality, respectively. ─∙─ Perfect equality, $L(F) = F$. ─── Gaussian and its Gini-equivalent uniform density, both having $G_i = 0.17$. Equal-chance line, $L(F) = F^2$. ▲▲▲ USA incomes. ◦◦◦ World electricity consumption. ▢▢▢ Life expectation. ♦♦♦ Cancer rate of survival. Class boundaries, $F^2 = 1/4$ and $L_2(F^2) = 1 - F^2$. –– Unit-slope tangent. X──X Distance $d$ from equal-chance line. (b) Corresponding statistical distributions. Symbols apply as in (a).
The unequal-chance inequality (UCI) region is defined by $F^2 > L_2(F^2)$. Real-world data lie in this region and form downward-convex L-curves. This is also the case of beer bubble-size distributions [23], which systematically show $G_i \ge 0.33$ and asymmetric PDFs. Of course, besides the common points (0,0) and (1,1), there is no possible intersection of concave and convex L-curves. The transition line, $L = F^2$, implies $\omega = 2F$ and an optimal uniform density $f_u(\omega) = 1/2$, shown in Figure 1b.

2.2.2. Symmetry, Class Boundaries and Discontinuities

The median, the mode and the mean coincide for symmetrical distributions. The natural boundary between classes is then the axis of symmetry, as illustrated in Figure 1b, with $F_p = 1/2$ and $\omega(F_p) = \langle \omega \rangle = 1$. By Lemma 1 in the appendix, it coincides with a maximum of $\phi = L_2(F^2) - F^2$. In fact, the extremums of $\phi$ occur in both regions at points of abscissa $F_0^2$ where $\left. \frac{dL_2}{d(F^2)} \right|_{F_0^2} = 1$, that is, where the tangent to $L_2(F^2)$ is perpendicular to the second diagonal in Figure 1a, of equation $L_2(F_0^2) = 1 - F_0^2$. We therefore extend the definition of class boundary to the locus of all extremums of $\phi$. We call $L_2(F_0^2)$ the ideal UCI class-boundary line, because it closely describes the minima of $\phi$ in this region. Now, ECI extremums of $\phi$ form a vertical line of abscissa $F_0^2 = 1/4$, while $1/2 \le F_0^2 \le 1$ in the UCI region. There is clearly a discontinuity in the values of $F_0^2$, a hint of a possible phase transition. Since $F_p$ must be continuous according to the generalised Pareto principle, we circumvent the difficulty by giving region-dependent definitions: $F_p = F_0 = 1/2$ in the ECI region and $F_p = F_0^2 > 1/2$ in the UCI region.
Inequality thus shows up in at least two conceivable and non-equivalent ways: for ECI, the poor fraction of the population equals the rich fraction, $F_p = 1 - F_p$, but the former as a whole gets less than the latter, $L(F_p) < 1/2 < 1 - L(F_p)$; for UCI, one of the two classes outnumbers the other, $F_p > 1/2 > 1 - F_p$, with $F_p = 1$ for absolute inequality. The generalised Pareto criterion results in $L(F_p) = 1 - F_p$ and, from Lemma 2 in the appendix, the maximum per capita benefit is necessarily greater than twice the average in this case.

2.3. Universality

If a phase transition is indeed at work here, we expect a law of corresponding states to apply, as it does for similar transformations in physics [12,13]. L-curves should result from one another by suitable rescaling of universal functions, at most one on each side of the transition line $L = F^2$. This is shown in Figure 2 for symmetrical distributions in Figure 2a and empirical UCI data in Figure 2b. Gini-equivalent $\phi$-curves from symmetrical distributions necessarily intersect an even number of times, twice in practice, on both sides of the axis of symmetry, as illustrated by Figure 2a. Figure 2b describes UCI and results from a rotation of axes by 45 degrees in Figure 1a, so as to make the $F^2$-axis join the main diagonal, followed by (cosmetic) inversion of the resulting ordinates to make data positive. One obtains $\psi(X)$ graphs looking like the made-up dashed curve in Figure 2b, intersecting the UCI universal curve once. After normalisation and rescaling of the $X$-axis so as to have all $X_M$ coincident at $X_M = 1/2$, empirical data indeed crowd close to a universal curve, as shown in Figure 2b. The rotation results in new coordinates $(X, Y)$, where $Y(X)$ is the universal curve, defined parametrically in the UCI region by:
$$ X = \frac{F^2 + L_2(F^2)}{2}, \qquad Y = \frac{F^2 - L_2(F^2)}{d \sqrt{2}} = \frac{\psi(X)}{\psi_M}, \tag{5} $$
where $\psi_M = d \sqrt{2}$ is the maximum of $\psi(X)$, with $d$ the maximum distance between the $L = F^2$ line and the L-curve.
Figure 2. Symmetry-dependent universal behaviours. (a) ECI concave L-curves: –– Perfect equality, $G_i = 0$. ∙∙∙∙ Uniform distribution, $R = 0.33$, $G_i = 0.17$. – – – Gini-equivalent, Gaussian-like distribution, $G_i = 0.17$. (b) UCI convex L-curves: ▲▲▲ Incomes. ◦◦◦ World electricity consumption. ▢▢▢ Life expectation. ♦♦♦ Rate of survival after cancer. –– Theoretical curve. ∙∙∙∙∙ Absolute inequality. - - - Fictitious data before normalisation and symmetrisation (right-hand ordinates, with $d\sqrt{2} = 0.7$, $X_M = 0.8$).
Data show this maximum to occur at values $X_M \approx 1/2$ in all cases. The theoretical curve, to be obtained in the next section, is also shown for comparison. The L-curve for absolute inequality is just a pair of perpendicular straight lines, $L = 0$ for $0 \le F^2 < 1$, and $0 < L \le 1$ for $F^2 = 1$, with $\sqrt{2}\, d = 1$. It becomes $Y = F^2 = 2X$ for $X \le 1/2$ and $Y = 2(1 - X)$ for $X > 1/2$ in Figure 2b.
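The change of coordinates in Equation (5) is simple to apply to tabulated L-curve points. The sketch below is only an illustration: the function name `rotate_l_curve` and the test curve $L_2 = (F^2)^2$ are assumptions, and the further rescaling of the $X$-axis that brings every $X_M$ to $1/2$ is not reproduced, since its details are not spelled out in the text.

```python
import math


def rotate_l_curve(f2, l2):
    """Map L-curve points (F^2, L_2) to the (X, Y) coordinates of Equation (5).

    Returns the lists X and Y, the maximum distance d from the line L = F^2,
    and X_M, the abscissa at which psi = F^2 - L_2 reaches its maximum.
    """
    psi = [a - b for a, b in zip(f2, l2)]      # order parameter psi = F^2 - L_2
    psi_max = max(psi)
    if psi_max <= 0.0:
        raise ValueError("points do not lie in the unequal-chance region")
    d = psi_max / math.sqrt(2.0)               # psi_M = d * sqrt(2)
    xs = [0.5 * (a + b) for a, b in zip(f2, l2)]
    ys = [p / psi_max for p in psi]            # normalised so that max(Y) = 1
    x_m = xs[psi.index(psi_max)]
    return xs, ys, d, x_m


# Hypothetical convex L-curve, L_2 = (F^2)^2, sampled on a regular grid.
grid = [i / 100 for i in range(101)]
xs, ys, d, x_m = rotate_l_curve(grid, [y * y for y in grid])
print(f"d = {d:.4f}, X_M = {x_m:.3f}, max Y = {max(ys):.1f}")
```

For this toy curve $\psi = F^2 - F^4$ peaks at $F^2 = 1/2$, so $\psi_M = 1/4$ and $X_M = 3/8$ before any rescaling of the $X$-axis.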

3. The Transition to Convexity

3.1. Probabilistic Model

We define the unequal-chance order parameter (OP) of the concave → convex transformation as the difference between equal- and unequal-chance cumulated benefit fractions for the population fraction $F$ in the UCI region, and zero elsewhere. It is just the numerator $\psi(X) = \psi_M Y(X) = F^2 - L_2(F^2)$ in the second Equation (5). Let $S\{\psi\}$ be the entropy functional and $S_0$ the entropy in the $L = F^2$ equal-chance state. We assume that constraints exist that result in an entropy difference $S\{\psi\} - S_0 < 0$ that depends not only on $\psi(X)$, but also on $d\psi/dX$. Indeed, if correlations exist, they should depend on the gradient of the order parameter. The maximum $\psi_M = \psi(X_M) = \sqrt{2}\, d$ for each curve is the difference between the probability of being poor and that of being rich, $\psi_M \equiv F_p - (1 - F_p) = 2 F_0^2 - 1 = \sqrt{2}\, d \le 1$. At absolute inequality $\psi_M = 1$, $F_p = 1$ and $S\{\psi_M\} = 0$; from there on $S\{\psi\}$ increases as $d$ decreases. Now, the asymmetry is the same if the poor outnumber the rich (the case of incomes) or just the opposite, as for demography, where the numerous young are richer in life expectation. The signs of the OP and its derivative should therefore be irrelevant. The Taylor expansion of the entropy density $dS/dX$ then contains only even powers of $\psi(X)$ and $d\psi/dX$. For $\psi \ll 1$, to fourth order in the OP and second order in its derivative we have:
$$ S\{\psi\} = S_0 \left\{ 1 + A \int_0^1 \left[ \psi^2 + \frac{B}{A} \psi^4 + \frac{C}{A} \left( \frac{d\psi}{dX} \right)^2 \right] dX \right\} = S_0 \left\{ 1 - 4 d^2 \int_0^1 \left[ Y^2 - d^2 Y^4 - \xi^2 \left( \frac{dY}{dX} \right)^2 \right] dX \right\}, \tag{6} $$
where $\xi^2 = -C/A$, and $A < 0$ is required to make $S\{\psi \equiv 1\} = 0$, while $\psi(X) \ge 0$ must optimise the integral. Note the great generality of this approach. It applies equally well to any type of entropic form out of the many that have been proposed, whether extensive or not [24]. Equation (6) could also conceivably describe average entropy production, for example. In any case, these undefined features do not prevent the calculation from proceeding.

3.1.1. Entropy Maximisation

At first sight, three parameters, $A$, $B$, $C$, are necessary to describe a single L-curve, but Figure 2b suggests that at most two parameters, $d$ and possibly $X_M$, should suffice. Consider then two cases where we know that $\psi$ is a constant, the equal-chance limit with $\psi \equiv 0$, and absolute inequality with $\psi \equiv 1$ and $S\{1\} = 0$. According to the definition, $S = S_0$ for arbitrary $A$ or $B$ in the entire ECI region, which results in $\psi = 0 \ne \phi$ in it. Maximum entropy requires $\psi (A + 2 B \psi^2) = 0$, so $\psi = \psi_M$ is either zero, as expected for the first case, or $-A/(2B) = 1$ in the second case, i.e., $A = -2$ and $B = 1$. We tentatively apply these values to the whole UCI region. Incidentally, if L-curves rather than straight lines are to be obtained, this is a matter-of-fact argument for the introduction of the derivative term. The condition for an extremum is given by Euler's equation, which reads:
$$ \xi^2 \frac{d^2 Y}{dX^2} + Y (1 - 2 d^2 Y^2) = 0, \tag{7} $$
with boundary conditions $Y(X_M) = 1$, $Y'(X_M) = \left. \frac{dY}{dX} \right|_{X_M} = 0$, supplemented by $Y(0) = Y(1) = 0$. Useful insight is obtained from successive approximate solutions. All boundary conditions are satisfied in the $d \to 0^+$ limit by $Y(X) = \sin(X/\xi_0)$, $\xi_0 = 1/\pi$. When replaced in (6) this solution makes the integral equal to zero as announced, independently of the value of $A$. Next, we linearise Equation (7) by replacing $Y^2$ by its average in the previous approximation, $\langle \sin^2(X/\xi_0) \rangle = 1/2$. Now, if $Y$ is indeed universal it cannot depend on $d$, which imposes $\xi = \xi_0 \sqrt{1 - d^2}$. The parameter $\xi$ is a natural unit of measurement of $X$ and of the rate of change of the OP, insofar as a sizeable change in the latter along an L-curve requires significant changes in the ratio $X/\xi$. Schematically, it defines a cohesion range on $X$ such that increments $\Delta X \ll \xi$ are unimportant, those where $\Delta X \gg \xi$ do matter, and a significant level of equality of chances persists for $\Delta X \le \xi$. The product $\pi \xi$ furnishes a convenient correlation or cohesion indicator: it is a maximum, unity, for minimum UCI ($d = 0^+$), and decreases to zero, as shown below, for absolute inequality ($\sqrt{2}\, d = 1$). Equation (7) admits a first integral:
$$ \xi^2 \left( \frac{dY}{dX} \right)^2 + Y^2 - d^2 Y^4 = 1 - d^2 = \xi^2 \left. \left( \frac{dY}{dX} \right)^2 \right|_{X=0} = \xi^2 \left. \left( \frac{dY}{dX} \right)^2 \right|_{X=1}. \tag{8} $$

3.1.2. Class Asymmetry and Intersections of L-Curves

If $\psi(X)$ is not strictly symmetrical about $X_M = 1/2$, the two derivatives at the end of (8) will be different, which requires two values of $\xi$ for a single value of $d$. We use the notation $\xi_{p|r} = \{\xi_p \,|\, \xi_r\}$ for a simultaneous description of the two classes. The approximate solution found for Equation (7) above suggests putting $Y = \sin \varphi$. When this is replaced in Equation (8), its solution can be expressed in parametric form, in terms of the incomplete, $\Phi(k^2, \varphi)$, and complete, $K(k^2) = \Phi(k^2, \pi/2)$, elliptic integrals of the first kind [25], with $k^2 = \frac{d^2}{1 - d^2}$:
$$ X_{p|r}(k^2, \varphi) = \left\{ \frac{\Phi(k^2, \varphi)}{K(k^2)}\, X_M \;\middle|\; X_M + \frac{\Phi(k^2, \varphi - \frac{\pi}{2})}{K(k^2)}\, (1 - X_M) \right\}, \qquad Y = \sin \varphi, \tag{9} $$
and:
$$ \Phi(k^2, \varphi) = \int_0^{\varphi} \frac{d\theta}{\sqrt{1 - k^2 \sin^2 \theta}}, \qquad \varphi \le \frac{\pi}{2}. \tag{10} $$
Its continuation into the region $\pi \ge \varphi > \pi/2$ is just $\Phi(k^2, \varphi) = K(k^2) + \Phi(k^2, \varphi - \pi/2)$. The cohesion ranges are:
$$ \xi_{p|r}(k^2) = \frac{1}{K(k^2) \sqrt{1 + k^2}}\, \{ X_M \,|\, 1 - X_M \} = 2\, \xi(k^2)\, \{ X_M \,|\, 1 - X_M \}. \tag{11} $$
One recovers the approximate expression for $\xi_0 = \frac{1}{2 K(0)} = \frac{1}{\pi}$ obtained above. Equation (11) shows that the average parameter $\xi(k^2) = \frac{\xi_p + \xi_r}{2} = \frac{1}{2 K(k^2) \sqrt{1 + k^2}}$, describing the interclass cohesion, is a decreasing function of $k^2$ that becomes $\xi(1) = 0$ at absolute inequality, where no cohesion is possible. Class asymmetry is defined by $\Delta = \frac{\xi_p - \xi_r}{\xi_p + \xi_r} = 2 X_M - 1$. If $\Delta \ne 0$, two parameters, like $d$ and $X_M$, $\xi_p$ and $\xi_r$, or $\xi(k^2)$ and $\Delta$, suffice to characterise an L-curve. As a result two convex L-curves, say A and B, having (i) $d_A = d_B$, coincide if $X_{M,A} = X_{M,B}$; they intersect once if $X_{M,A} \ne X_{M,B}$. In both cases $G_{iA} = G_{iB}$. (ii) If $d_A \ne d_B$ intersections are irrelevant because in all cases $G_{iA} \ne G_{iB}$, but there is no intersection if $X_{M,A} = X_{M,B}$. The apparent dependence of $X(k^2, \varphi)$ on $k^2$ is in fact negligible over the range of interest. The universal curve in Figure 2b makes use of the average $k^2 = 0.139$ and provides an acceptable fit to all our data.
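The cohesion ranges of Equation (11) and the asymmetry parameter only require the complete elliptic integral $K(k^2)$. A minimal sketch follows; it evaluates $K$ with the arithmetic-geometric mean, a standard identity rather than anything specified in the paper, and all function names are illustrative assumptions.

```python
import math


def ellipk(m):
    """Complete elliptic integral of the first kind K(m), with m = k^2 < 1.

    Uses the identity K(m) = pi / (2 * AGM(1, sqrt(1 - m))).
    """
    if not 0.0 <= m < 1.0:
        raise ValueError("m = k^2 must satisfy 0 <= m < 1")
    a, b = 1.0, math.sqrt(1.0 - m)
    for _ in range(40):                 # AGM converges quadratically
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


def k2_from_d(d):
    """k^2 = d^2 / (1 - d^2), with d the maximum distance from the line L = F^2."""
    return d * d / (1.0 - d * d)


def mean_cohesion(k2):
    """Average cohesion range xi(k^2) = 1 / (2 K(k^2) sqrt(1 + k^2))."""
    return 1.0 / (2.0 * ellipk(k2) * math.sqrt(1.0 + k2))


def class_cohesion(k2, x_m):
    """Equation (11): the pair (xi_p, xi_r) for a maximum located at X_M."""
    xi = mean_cohesion(k2)
    return 2.0 * xi * x_m, 2.0 * xi * (1.0 - x_m)


def asymmetry(x_m):
    """Class asymmetry Delta = (xi_p - xi_r) / (xi_p + xi_r) = 2 X_M - 1."""
    return 2.0 * x_m - 1.0


xi_p, xi_r = class_cohesion(0.22, 0.55)   # illustrative k^2 and X_M values
print(math.pi * mean_cohesion(0.22), xi_p, xi_r, asymmetry(0.55))
```

The indicator $\pi \xi$ equals one at $k^2 = 0$ and falls to about 0.85 at $k^2 = 0.22$.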

4. Results

4.1. Fitting Empirical Data

Equations (5), once solved for $F$ and $L_2(F^2)$ as functions of $X$ and $Y$, yield, together with Equation (9), theoretical Lorenz curves in the UCI region. The parameters $d$ and $X_M$ are directly read from data (i.e., not adjusted for an overall fit). They provide Lorenz curves as shown in Figure 3, a conventional $L(F)$ plot. The expansion in Equation (6) is in principle valid only for $\psi \ll 1$. A distinctive feature of this model is that it precisely defines its own limit, $k^2 = 0.22$, above which the theoretical initial slope of $L_2(F^2)$ becomes unrealistically negative, as suggested by Figure 3. Universality of L-curves is nevertheless substantiated well beyond this limit, as shown in Figure 2, where cancer data have $k^2 = 0.34$.
Figure 3. Conventional $L(F)$ Lorenz plots. Symbols for data and for their fits by model predictions are shown in the insert. The model should not apply for values $k^2 > 0.22$.

4.2. A New Indicator

The main advantage of the Gini coefficient is its conceptual simplicity, though counterbalanced by possible inaccuracies when obtained from discontinuous data. Equations (5) and (9) provide a function describing L-curves and thereby allow a precise numerical integration of the function $F - L(F)$ in Equation (3). Gini values in Table 1 below have been obtained in this way. Furthermore, a new Gini-like indicator, $G_i^{uc}$, measures unequal-chance inequality in the $(F^2, L_2)$ plane, and is easily obtained in closed form. It is twice the area under the curve $\psi = F^2 - L_2(F^2) = \sqrt{2}\, d \sin \varphi$:
$$ G_i^{uc}(k) = 2 \sqrt{2}\, d \int_0^{\pi} Y(X)\, \frac{dX}{d\theta}\, d\theta = \frac{2}{K(k^2)} \frac{\sqrt{2}}{\sqrt{1 + k^2}} \int_0^{\pi/2} \frac{k \sin \theta}{\sqrt{1 - k^2 \sin^2 \theta}}\, d\theta = \frac{2 \sqrt{2}}{\sqrt{1 + k^2}}\, \frac{\operatorname{argth}(k)}{K(k^2)}, \qquad k^2 \le 0.22. \tag{12} $$
Quantities like $G_i$ and $G_i^{uc}(k)$, resulting from a good fit to the whole L-curve, become meaningless for $k^2 > 0.22$. Others, dependent on the single value $d$ through $k^2 = \frac{d^2}{1 - d^2}$, like $\xi(k^2)$, are still valid beyond this limit. This is shown in Table 1.
Table 1. Characteristic parameters of unequal-chance inequality for different types of data. $F_p$ is the largest fraction of the population, the poorest in the first two cases, the youngest in the other two.
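| Benefit | $k^2$ | $F_p$ | $X_M$ | $d\sqrt{2}$ | $\pi \xi(k^2)$ | $\Delta = \frac{\xi_p - \xi_r}{\xi_p + \xi_r}$ | $G_i$ | $G_i^{uc}(k)$ |
|---|---|---|---|---|---|---|---|---|
| Income | 0.008 | 0.58 | 0.68 | 0.16 | 0.99 | 0.35 | 0.46 | 0.16 |
| Electricity consumption | 0.046 | 0.66 | 0.52 | 0.32 | 0.97 | 0.04 | 0.60 | 0.38 |
| Life expectation | 0.159 | 0.76 | 0.55 | 0.52 | 0.89 | 0.10 | 0.79 | 0.68 |
| Model limits | 0.220 | 0.80 | 0.5 | 0.60 | 0.85 | 0 | 0.85 | 0.78 |
| Survival after cancer | 0.344 | 0.86 | 0.45 | 0.71 | 0.78 | −0.11 | –– | –– |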
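The closed form of Equation (12) can be cross-checked against a direct numerical evaluation of the integral it comes from. The sketch below is illustrative only: function names are assumptions, $K(k^2)$ is computed with the arithmetic-geometric mean, and `math.atanh` plays the role of argth.

```python
import math


def ellipk(m):
    """Complete elliptic integral K(m), m = k^2 < 1, via pi / (2 AGM(1, sqrt(1 - m)))."""
    if not 0.0 <= m < 1.0:
        raise ValueError("m = k^2 must satisfy 0 <= m < 1")
    a, b = 1.0, math.sqrt(1.0 - m)
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


def gini_uc_closed(k2):
    """Equation (12), last form: 2 sqrt(2) argth(k) / (sqrt(1 + k^2) K(k^2))."""
    k = math.sqrt(k2)
    return 2.0 * math.sqrt(2.0) * math.atanh(k) / (math.sqrt(1.0 + k2) * ellipk(k2))


def gini_uc_integral(k2, n=20000):
    """Equation (12), middle form, with the integral done by the midpoint rule."""
    k = math.sqrt(k2)
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        s = math.sin((i + 0.5) * h)
        total += k * s / math.sqrt(1.0 - k2 * s * s)
    prefactor = 2.0 * math.sqrt(2.0) / (ellipk(k2) * math.sqrt(1.0 + k2))
    return prefactor * total * h


for k2 in (0.008, 0.046, 0.159, 0.220):   # k^2 values listed in Table 1
    print(k2, round(gini_uc_closed(k2), 2), round(gini_uc_integral(k2), 2))
```

Both forms agree, and for the four $k^2$ values within the model limit they reproduce the last column of Table 1 to two decimals.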

5. Conclusions

This work provides a model that fits Lorenz curves fairly well, up to $G_i = 0.85$. It is just the social analogue of Ginzburg and Landau's ideas [13] on second-order phase transitions in physics. The symmetry of the statistical distribution plays a crucial role in this development. Symmetrical distributions result in ECI downward-concave $L_2(F^2)$ curves, $G_i < 1/3$ and $w_m + w_M = 2 \langle w \rangle$, i.e., nobody gets more than twice the per capita average, a straightforward but apparently overlooked result. Since equality of chances must be a rather unusual event, values of $G_i$ that seem too low may be profitably checked for consistency against this relation. Asymmetrical distributions display convex L-curves, $G_i > 1/3$, and impose $w_M > 2 \langle w \rangle$. Initial slopes of $L_2(F^2)$ furnish a supplementary criterion, though less convenient for numerical applications. A clear-cut distinction appears to be necessary between equal- and unequal-chance inequalities, related to different regions in the $(F^2, L_2)$ plane. One may expect that critical values, corresponding to $G_i = 1/3$ at the phase transition, will also be found in other indicators of inequality. New parameters appear in the UCI region, like the cohesion range $\xi$, measuring the range of persistent equality in the distribution, the asymmetry parameter $\Delta$, and the Gini-like coefficient $G_i^{uc}$. The latter measures how far away a society is from maximum ECI in just the same way as $G_i$ measures how far away it is from perfect equality.
Quite different phenomena, from income distribution to cancer rate of survival, obey the same statistical laws. The resulting description of inequality implies an apparently oversimplified two-class division of society. A more detailed analysis should provide criteria allowing recognition of existing classes, whatever their number, out of real-life distributions. This amounts to a nontrivial challenge: modelling the probability density function.

Appendix: Two Lemmas on Convexity

We assumed that $\omega(F)$ admits a Taylor expansion in the region $0 \le F \le 1$:
$$ \omega(F) = \omega_m + a_1 F + a_2 F^2 + O(F^3), \qquad a_1 = \left. \frac{d\omega}{dF} \right|_0 = \frac{1}{f(\omega_m^+)}, $$
so one obtains:
$$ \frac{dL_2}{d(F^2)} = \frac{\omega}{2F} = \frac{\omega_m}{2F} + \frac{a_1}{2} + \frac{a_2}{2} F + O(F^2). $$
Let the density f ( ω ) be single-peaked at ω p k , with f p k = f ( ω p k ) . Then the following lemma applies to downward-concave L-curves:
Lemma 1: Let R be a parameter that preserves the symmetry of distributions, while defining a family L ( F | R ) of concave L-curves that spans the whole region F L ( F | R ) F 2 as R changes. (a) Such families are generated by, and only by, symmetric distributions, with w m + w M = 2 w . Maximum ECI admits only a uniform distribution. (b) Functions ϕ ( F | R ) = L ( F | R ) F ² have their maxima at a common abscissa, F p = 1 2 , and are symmetrical about this point. Peaks in the densities, f p k = f ( ω p k ) > 1 2 , and in the functions ϕ ( F | R ) , occur at the same values ω p k = ω p = 1 and F p k = F p = F ( ω p ) . (c) If ω m > 0 , the initial slope of   L 2 ( F ² ) is infinite, if ω m = 0 , f ( 0 + ) f ( 0 ) is necessarily discontinuous.
Proof: (a) The probability density for perfect equality is a Dirac δ-function, $\delta(\tilde{\omega})$, symmetrical about $\omega = 1$. Maximum ECI has $\phi(F|R) \equiv 0$ and from Equation (2), $\frac{dL_2}{d(F^2)} = \frac{\omega}{2F} = 1$, that is, $f_u(\omega) = \frac{dF}{d\omega} = \tfrac{1}{2}$ for $0 < \omega < 2$. This is the uniform, therefore symmetrical distribution shown in Figure 1b. From the definition of $\phi$,
$$\phi(F|R) = \int_0^F \left[\omega(F') - 2F'\right] dF' = \int_{-F_p}^{\tilde{F}} \tilde{\omega}(\tilde{F}')\, d\tilde{F}' + \left(F_p^2 - \tilde{F}^2\right),$$
where $\tilde{F} = F - F_p$ and $\tilde{\omega} = \omega - 1$. The last term is symmetric about $F_p = \tfrac{1}{2}$ and $\tilde{\omega}(\tilde{F})$ is odd. The antiderivative, like $\phi(F|R)$, is therefore even in $\tilde{F}$, i.e., symmetrical about $F = F_p$. Conversely, if $\phi(F|R)$ is symmetric about $F_p$, $\tilde{\omega}(\tilde{F})$ and $\tilde{F}(\tilde{\omega})$ alike are odd, and the derivative of the latter, $f(\omega)$, is an even function of $\tilde{\omega}$, i.e., symmetrical about $\omega_p = 1$. Symmetry entails, for $\omega_1 \neq \omega_2$, $f(\omega_1) = f(\omega_2)$ iff $\omega_1 = \langle\omega\rangle - x$ and $\omega_2 = \langle\omega\rangle + x$, with $x \le 1$. This gives $\omega_1 + \omega_2 = 2$ or $w_m + w_M = 2\langle w \rangle$ as announced. (b) If $f(\omega)$ is symmetrical, $F_{pk} = \tfrac{1}{2} = F_p$ by definition of ECI, and $\omega_p = \omega_{pk} = \langle\omega\rangle$. Maxima of $\phi(F|R)$ occur for $\left.\frac{dL_2}{d(F^2)}\right|_p = \frac{\omega_p}{2F_p} = 1 = \frac{\omega_{pk}}{2F_{pk}}$. Concavity imposes $\frac{d^2L_2}{d(F^2)^2} = \frac{1}{4F^2}\left(\frac{1}{f} - \frac{\omega}{F}\right) < 0$, or $\frac{\omega_p}{2F_p} f_{pk} = f_{pk} > \tfrac{1}{2}$. (c) If $\omega_m > 0$, Equation (6) gives $\frac{dL_2}{d(F^2)} \to \infty$ as $F^2 \to 0$. For $\omega_m = 0$, Equation (5) implies that $f(0^+) > 0$ if $\omega$ is to remain bounded, so $f(0^+) \neq f(0^-) = 0$, which proves the discontinuity.
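Part (b) of the lemma can be checked numerically on a simple case. The sketch below assumes a symmetric triangular density on (0, 2), a hypothetical example chosen here; the function names and grid size are likewise assumptions. It tabulates ϕ(F) = L(F) − F², measures its asymmetry about F = 1/2, locates its maximum, and estimates the peak density from f = dF/dω.

```python
import math

def omega_triangular(F):
    """Quantile function of the symmetric triangular density on (0, 2)."""
    return math.sqrt(2.0 * F) if F <= 0.5 else 2.0 - math.sqrt(2.0 * (1.0 - F))

def phi_table(quantile, n=2000):
    """Tabulate phi(F) = L(F) - F^2 on a grid of n + 1 equally spaced points."""
    L, table = 0.0, [0.0]
    for i in range(n):
        a, b = i / n, (i + 1) / n
        L += 0.5 * (quantile(a) + quantile(b)) * (b - a)  # trapezoidal step
        table.append(L - b * b)
    return table

phi = phi_table(omega_triangular)
n = len(phi) - 1

# largest difference between phi(F) and its mirror image phi(1 - F)
asymmetry = max(abs(phi[i] - phi[n - i]) for i in range(n + 1))

# grid index at which phi is largest; F_p = i_max / n
i_max = max(range(n + 1), key=phi.__getitem__)

# peak density f_pk = dF/d(omega) at F = 1/2, by a central difference
h = 1e-4
slope = (omega_triangular(0.5 + h) - omega_triangular(0.5 - h)) / (2.0 * h)
f_peak = 1.0 / slope

print(f"max asymmetry of phi = {asymmetry:.2e}")
print(f"F_p = {i_max / n:.3f}, phi(F_p) = {phi[i_max]:.4f}, f_pk = {f_peak:.3f}")
```

For this density ϕ can be integrated in closed form, giving ϕ(1/2) = 1/12, which provides a reference value for the tabulated maximum.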
Lemma 2: Under the same conditions on $\omega$ and $f(\omega)$ as in Lemma 1, downward-convex L-curves (a) have $\omega_m = 0$, $f_p < \tfrac{1}{2}$ and finite initial slopes, and (b) result from asymmetric distributions with $\omega_M > 2$.
Proof: (a) Convexity requires the L-curve to satisfy $F^2 > L_2(F^2) \ge 0$ everywhere, so the initial slope is $1 > \left.\frac{dL_2}{d(F^2)}\right|_0 \ge 0$, while from Equation (6) it would be infinite if $\omega_m \neq 0$. It also implies $\frac{d^2L_2}{d(F^2)^2} > 0$, that is, $\frac{\omega_p}{2F_p} f_p = f_p < \tfrac{1}{2}$. (b) Any asymmetric function $f(\omega)$ can be split into odd $f_A(\tilde{\omega}) = \frac{f(1+\tilde{\omega}) - f(1-\tilde{\omega})}{2} = -f_A(-\tilde{\omega})$ and even $f_S(\tilde{\omega}) = \frac{f(1+\tilde{\omega}) + f(1-\tilde{\omega})}{2} = f_S(-\tilde{\omega})$ components, where we now take $\tilde{\omega}$ as independent variable. Recalling that $\psi(F^2) = F^2 - L_2(F^2)$, assume that $\omega_M \le 2$. Then,
$$\psi(1) = \int_0^{\omega_M} (2F - \omega)\, f(\omega)\, d\omega = \int_0^1 (2F - 1)\, dF - \int_{-1}^{1} \tilde{\omega}\, f_S(\tilde{\omega})\, d\tilde{\omega} - \int_{-1}^{1} \tilde{\omega}\, f_A(\tilde{\omega})\, d\tilde{\omega} \neq 0,$$
where we made use of the fact that $f_S + f_A = f(1+\tilde{\omega}) = 0$ for $\tilde{\omega} \ge \tilde{\omega}_M$ to change the upper limits of integration from $\tilde{\omega}_M$ to 1. The first term on the right-hand side is zero. The second term is also zero, because $\tilde{\omega}$ and $f_S$ are of opposite parity in the interval of integration. Then, since $\tilde{\omega}$ and $f_A$ are both odd and not identically zero, the last integral cannot be zero. This is absurd, since it contradicts the definition, which implies $\psi(1) = 0$. Hence $\omega_M > 2$.
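A simple worked case may help to see how the statements of Lemma 2 fit together. The quantile function $\omega(F) = 3F^2$ used below is a hypothetical illustration chosen here, not one of the distributions fitted in this work; the subscript $p$ denotes the point where $\psi$ is maximum.

```latex
\begin{align*}
\omega(F) &= 3F^2, \qquad \langle\omega\rangle = \int_0^1 3F^2\, dF = 1, \qquad \omega_m = 0, \qquad \omega_M = 3 > 2, \\
L_2(F^2) &= \int_0^F 3F'^2\, dF' = F^3 = (F^2)^{3/2}, \qquad \psi(F^2) = F^2 - F^3 \ge 0, \\
\frac{dL_2}{d(F^2)} &= \frac{\omega}{2F} = \frac{3F}{2} \xrightarrow[F \to 0]{} 0, \qquad
\frac{d^2L_2}{d(F^2)^2} = \frac{1}{4F^2}\left(\frac{1}{f} - \frac{\omega}{F}\right) = \frac{3}{4F} > 0, \\
\frac{\omega_p}{2F_p} &= 1 \;\Rightarrow\; F_p = \tfrac{2}{3}, \quad \omega_p = \tfrac{4}{3}, \qquad
f_p = \left.\frac{dF}{d\omega}\right|_p = \frac{1}{6F_p} = \frac{1}{4} < \frac{1}{2}, \\
G_i &= 1 - 2\int_0^1 F^3\, dF = \frac{1}{2} > \frac{1}{3}.
\end{align*}
```

The curve is convex in $F^2$, starts with a finite slope, has $f_p < \tfrac{1}{2}$ and extends beyond twice the mean, in line with parts (a) and (b).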

References

  1. Lorenz, M.O. Methods of measuring concentration of wealth. J. Amer. Statist. Assoc. 1905, 9, 209–219.
  2. Gini, C. Variabilità e mutabilità (1912). In Memorie di Metodologia Statistica; Pizetti, E., Salvemini, T., Eds.; Libreria Eredi Virgilio Veschi: Rome, Italy, 1955.
  3. Rosenblatt, J.; Martinas, K. Inequality indicators and distinguishability in economics. Physica A 2008, 387, 2047–2054.
  4. Rosenblatt, J.; Martinas, K. Probabilistic foundations of economic distributions and inequality indicators. In Income Distribution: Inequalities, Impacts and Incentives; Irving, H.W., Ed.; Nova Science Publishers: New York, NY, USA, 2008; pp. 149–170.
  5. Chakrabarti, B.K.; Chakraborti, A.; Chakravarty, S.R.; Chatterjee, A. Econophysics of Income and Wealth Distributions; Cambridge University Press: Cambridge, UK, 2013.
  6. Souma, W. Physics of Personal Income, 2002. Available online: http://arxiv.org/cond-mat/0202388/ (accessed on 8 January 2013).
  7. Martinás, K. Thermodynamics and sustainability: A new approach by extropy. Per. Pol. Chem. Eng. 1998, 42, 69–83.
  8. Georgescu-Roegen, N. The Entropy Law and the Economic Process; Harvard University Press: Cambridge, MA, USA, 1971.
  9. Theil, H. Economics and Information Theory; North Holland: Amsterdam, The Netherlands, 1967.
  10. Eliazar, I. Randomness, evenness, and Rényi’s index. Physica A 2011, 390, 1982–1990.
  11. Eliazar, I.; Sokolov, I.M. Measuring statistical evenness: A panoramic overview. Physica A 2012, 391, 1323–1353.
  12. Landau, L. Theory of phase transitions (Part 1). Phys. Z. Sowjetunion 1937, 11, 26; (Part 2). Phys. Z. Sowjetunion 1937, 11, 545.
  13. Ginzburg, V.L.; Landau, L.D. On the theory of superconductivity. Zh. Eksp. Teor. Fiz. 1950, 20, 1064–1082.
  14. Pirjol, D. Phase Transition in a log-normal Markov Functional Model. J. Math. Phys. 2010, 52, 013301.
  15. Atkinson, A.B. On the Measurement of Inequality. J. Econ. Theory 1970, 2, 244–263.
  16. U.S. Census Bureau. Current Population Survey, Annual Social and Economic Supplement, 2007. Available online: http://www.census.gov/#/ (accessed on 4 February 2013).
  17. United Nations Development Programme. Available online: http://www.undp.org/ (accessed on 8 January 2013).
  18. New York City Cancer Statistics. Available online: http://www.health.state.ny.us/statistics/cancer/registry/table6/tb6totalnyc.htm/ (accessed on 8 January 2013).
  19. Cowell, F.A. Measurement of inequality. In Handbook of Income Distribution; Atkinson, A.B., Bourguignon, F., Eds.; North Holland: Amsterdam, The Netherlands, 1999.
  20. Cohen, M.H.; Eliazar, I. Econophysical visualization of Adam Smith’s invisible hand. Physica A 2013, 392, 813–823.
  21. Roemer, J. Equality of Opportunity; Harvard University Press: Cambridge, MA, USA, 1998.
  22. Yitzhaki, S.; Schechtman, E. The Gini Methodology; Springer: New York, NY, USA, 2012.
  23. Sauerbrei, S. Lorenz curves, size classification, and dimensions of bubble size distributions. Entropy 2010, 12, 1–13.
  24. Tsallis, C. Introduction to Nonextensive Statistical Mechanics; Springer Science+Business Media: New York, NY, USA, 2009.
  25. Abramowitz, M.; Stegun, I. Handbook of Mathematical Functions; Dover: New York, NY, USA, 1972.
Entropy EISSN 1099-4300 Published by MDPI AG, Basel, Switzerland