The Relation between Evenness and Diversity

Contrary to common belief, decomposition of diversity into independent richness and evenness components is mathematically impossible. However, richness can be decomposed into independent diversity and evenness or inequality components. The evenness or inequality component derived in this way is connected to most of the common measures of evenness and inequality in ecology and economics. This perspective justifies the derivation of measures of relative evenness, which give the amount of evenness relative to the maximum and minimum possible for a given richness. Pielou's [1] evenness measure J is shown to be such a measure.


Introduction
Defining and quantifying evenness is even more difficult than quantifying diversity. Ecologists do not agree on a set of properties to characterize evenness. However, one general point of agreement in the literature is that diversity is a compound quantity made up of richness and evenness components, and that these components should be defined so that they are independent of each other [2-5]. Smith and Wilson [3], in their influential review of evenness measures, made this principle the cornerstone of their characterization of evenness measures. I have myself repeated this idea [6]. Surprisingly, it is not true. Ecologists have it backwards. I prove below that it is impossible to decompose standard diversity measures into independent richness and evenness components. Instead, species richness can be decomposed into independent diversity and evenness components.
Here I pursue the consequences of this mathematical perspective. I derive the evenness and inequality measures that follow from the mathematics of diversity. I then show that because evenness and richness components cannot be independent of each other, it makes sense to also consider measures of "relative evenness", the amount of evenness relative to the minimum and maximum amount possible given the observed richness. Many arguments about the merits of different evenness measures dissolve immediately once relative and absolute evenness are distinguished. I show that Pielou's [1] evenness measure, which has often been rejected because it is not independent of richness, is an excellent measure of relative evenness. This and many other evenness and inequality measures in ecology and economics are related to the independent evenness and inequality measures I derive here. An open question not fully resolved here is how to estimate the evenness of a population from an incomplete sample. A suggestion is given.

Theoretical Background
There is a simple way to partition diversity or compositional complexity measures into independent components. If a standard measure of diversity or compositional complexity consists of two independent components, then the numbers equivalent of the measure must equal the numbers equivalent of the first component times the numbers equivalent of the second component. I derived this theorem [6] in order to partition diversity into independent alpha and beta components, but it applies to decomposition of diversity into any kinds of independent components.
Before applying this theorem, definitions of its terms are needed. Measures of "compositional complexity" include entropies, diversities, and other things. Loosely speaking, these all measure the degree of unpredictability of the species identity of a randomly chosen individual. All are functions only of species relative abundances (frequencies); they do not depend on density. They treat every species as equally distinct (though it would be valuable to incorporate phylogenetic or functional differences in the future). For a fixed number of species, they take their minimum value when all the abundance in the assemblage is concentrated in a single species, and they take their maximum value when all species are equally common. Any small transfer of abundance from a common species to a rarer species increases compositional complexity. The latter condition can be refined in various ways, but for now this is sufficient. This condition and its variants are called the "Principle of Transfers" in economics. For additional discussion see [3,7-9].
Most measures of compositional complexity are based on sums of powers of the species relative abundances, or limits of such sums as the exponent on the relative abundances approaches some special value (like unity). These are the "standard measures of compositional complexity" or "standard complexity measures" [6], and they can all be written as

$$F\left(\sum_{i=1}^{S} p_i^{q}\right)$$

or as the limit of such an expression as $q$ approaches a special value, where $F(x)$ is a monotonically increasing function, and $S$ is the number of species in the assemblage. The parameter $q$ determines the measure's sensitivity to species frequency, and is non-negative [8].
Standard measures of compositional complexity include species richness, Shannon entropy, the exponential of Shannon entropy, the Gini-Simpson index, the inverse Simpson concentration, and many others [6].
To understand the "numbers equivalent" [9], suppose a complexity measure has a value X when applied to some assemblage. Any other assemblage with the same value X is equivalent to the original assemblage, according to the chosen measure. Among these assemblages is one with all species equally common. The number of equally common species in this assemblage (or reference community) is the "numbers equivalent" (the term used by economists) or "effective number of species" (the term most often used by ecologists) of the value X of the original measure. This is a way of converting complexity measures to an easily interpretable linear scale. The conversion can be done algebraically. Species richness is its own numbers equivalent. Shannon entropy is converted by taking its exponential. The Gini-Simpson index is converted by the formula $1/(1 - H_{GS})$. It turns out that all standard complexity measures have the same numbers equivalent [6]:

$$
\begin{aligned}
\text{For } q \neq 1:\quad & {}^{q}D \equiv \left(\sum_{i=1}^{S} p_i^{q}\right)^{1/(1-q)} \\
\text{For } q = 1:\quad & {}^{1}D \equiv \exp\left(-\sum_{i=1}^{S} p_i \ln p_i\right)
\end{aligned}
\qquad (1)
$$

These are the Hill numbers [10], or the exponentials of Rényi entropies of order q. For any given standard complexity measure, the value of q is determined by looking to see what power is applied to the species frequencies in the standard measure. (If the standard measure is some function of Shannon entropy, then q = 1.)
I have argued elsewhere [6,11-14] that the concept of diversity, especially in conservation applications, should be quantified by measures of compositional complexity that obey the "replication principle". This principle says that if we have N equally large, equally diverse assemblages with no species in common, the diversity of the pooled assemblages must equal N times the diversity of a single assemblage. This principle is implicit in most ecological reasoning about diversity, and logical contradictions can result when measures without this property are equated with diversity. The numbers equivalents of all standard complexity measures, Equation 1, obey the replication principle [6]. I therefore call these "true diversities of order q" or just "diversity of order q" (hence the symbol ${}^{q}D$). There are other true diversities (Gregorius, personal communication) but they are not yet used by ecologists.
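As a rough illustration of Equation 1, the following Python sketch computes the diversity of order q from a list of relative abundances and checks the replication principle on a toy case. The function name `hill_number`, the tolerance used to detect q = 1, and the example assemblages are assumptions made for this illustration only.

```python
import math

def hill_number(p, q):
    """Diversity of order q for relative abundances p (assumed to sum to 1)."""
    p = [x for x in p if x > 0]  # absent species contribute nothing
    if abs(q - 1.0) < 1e-12:
        # q = 1 is defined as a limit: the exponential of Shannon entropy
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

# Four equally common species: every order gives an effective number of 4.
even = [0.25] * 4
print([round(hill_number(even, q), 6) for q in (0, 1, 2)])

# Replication principle: pooling two such assemblages with no shared
# species doubles the diversity of every order.
pooled = [0.125] * 8
print([round(hill_number(pooled, q), 6) for q in (0, 1, 2)])
```

For an uneven assemblage the values differ between orders, and they do not increase as q increases.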
For the purposes of this paper, it doesn't matter whether readers agree with this interpretation of numbers equivalents as true diversities. The important thing is that numbers equivalents are easy to partition into independent components; their partitioning is necessarily multiplicative. If the reader's favorite diversity measure is some other standard complexity measure, he or she can convert it to its numbers equivalent, partition it multiplicatively, and convert this partitioning formula back into the partitioning formula for the original measure (see [6] for examples). The results of this paper therefore apply to any standard complexity measure and are independent of decisions about which standard complexity measures should be identified as "diversities".

Diversity Cannot Be Decomposed into Independent Richness and Evenness Components
If diversity of order 1, the exponential of Shannon entropy, could be partitioned into independent richness and evenness components, the partitioning theorem tells us that the decomposition would have to be

$$e^{H} = S \cdot EF_{0,1}$$

where $EF_{0,1}$ stands for an undetermined "evenness factor", and the subscripts mean that diversity of orders 0 and 1 are involved. This can be solved for $EF_{0,1}$:

$$EF_{0,1} = e^{H}/S = {}^{1}D/S.$$

This is in fact a popular evenness measure [5,15].
The theorem says that if independent components exist, these are what they must be. However, it is clear that these components, $S$ and $EF_{0,1}$, cannot be independent, and furthermore $EF_{0,1}$ is less than or equal to unity so it cannot be the numbers equivalent of anything. If these two components were really independent, then the value of one component would put no mathematical constraints on the value of the other (they would form a Cartesian product space). This is the case when we decompose gamma diversity into independent alpha and beta components. Knowing alpha (and only alpha, not gamma) tells us nothing at all about beta, and vice versa. Yet if we try to decompose diversity $e^{H}$ into evenness $EF_{0,1}$ and richness $S$, the value of $S$ does constrain the possible values of $EF_{0,1}$, and vice versa. For example, if $S = 2$, we know that $e^{H}$ must be between 1 (its minimum possible value) and 2 (its maximum possible value when $S = 2$). This implies that $EF_{0,1}$ can only range from 0.5 to 1, else $EF_{0,1} \cdot S$ will be lower than the lowest possible value for total diversity. Similarly, if $S = 20$, we can infer that $EF_{0,1}$ falls into the interval [1/20, 1]. This shows that the range of $EF_{0,1}$ (and more generally, $EF_{0,q}$) is determined by the value of $S$. Since the value of $S$ mathematically constrains the range of values of $EF_{0,1}$, $EF_{0,1}$ is not independent of $S$. Evenness and richness cannot be considered as orthogonal components of diversity. For most species abundance distributions the dependence of evenness on richness is observed to be strong [5].
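The constraint for S = 2 can be seen numerically. The sketch below scans two-species communities and records the values taken by the evenness factor of orders 0 and 1; the helper name `ef01` and the grid of abundances are illustrative choices, not part of the argument itself.

```python
import math

def ef01(p):
    """Evenness factor of orders 0 and 1: exp(Shannon entropy) / richness."""
    H = -sum(x * math.log(x) for x in p if x > 0)
    return math.exp(H) / len(p)

# Scan two-species communities from very uneven (0.001, 0.999)
# to perfectly even (0.5, 0.5).
values = [ef01([k / 1000, 1 - k / 1000]) for k in range(1, 501)]

# Every value stays inside [1/2, 1] because S = 2.
print(min(values), max(values))
```

Repeating the scan with more species lowers the attainable minimum toward 1/S, which is the dependence on richness described above.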

Derivation of Evenness and Inequality Measures from the Partitioning Theorem
If we change our perspective slightly, we can apply the partitioning theorem correctly. Instead of decomposing diversity into richness and something else, suppose we try to decompose richness itself into diversity and some other independent quantity X. If these two components are independent, then our partitioning theorem tells us that their relationship must be

$$S = {}^{q}D \cdot X.$$

Since ${}^{q}D$ is always less than or equal to $S$, X must always be greater than or equal to unity, with equality only if the community is perfectly even (because then and only then is $S = {}^{q}D$). X therefore satisfies the theorem's requirement that the components are both valid numbers equivalents. The partitioning equation can now be solved for X. If the diversity measure ${}^{q}D$ is the diversity of order 1 (the exponential of Shannon entropy) we get

$$S = e^{H} X$$

$$X = S/e^{H} = {}^{0}D/{}^{1}D.$$

X turns out to be the reciprocal of the standard evenness measure mentioned above, $EF_{0,1} = e^{H}/S$. If we use diversity of order 2, the inverse Simpson concentration, X is

$$X = S/{}^{2}D = {}^{0}D/{}^{2}D$$

which is also the reciprocal of a standard evenness measure $EF_{0,2}$ ("$E_{1/D}$" in [3]). X may therefore be considered a measure of inequality ("unevenness"). We will call it the "inequality factor of orders zero and q" and write it as $IF_{0,q}$. The general expression for the inequality factor of orders 0 and q, with q > 0, is

$$IF_{0,q} \equiv S \Big/ \left(\sum_{i=1}^{S} p_i^{q}\right)^{1/(1-q)}.$$

The evenness factor $EF_{0,q}$ of Section 2 turns out to be the reciprocal of $IF_{0,q}$:

$$EF_{0,q} \equiv \left(\sum_{i=1}^{S} p_i^{q}\right)^{1/(1-q)} \Big/ S.$$
We can check that the inequality factor really is independent of the diversity ${}^{q}D$. Suppose we are told only that the value of ${}^{q}D$ is 20. We can infer that $S$ must be greater than or equal to 20 (since ${}^{0}D \geq {}^{q}D$ for all q > 0). Yet this only tells us that $IF_{0,q}$ must be greater than or equal to unity, which we already knew since this is always the case. If we were told instead that ${}^{q}D$ was 40, we would still only be able to say that $IF_{0,q}$ must be greater than or equal to unity. Knowing ${}^{q}D$ gives us no information about $IF_{0,q}$. The same sort of argument applies if we are told the value of $IF_{0,q}$. Knowing that $IF_{0,q}$ equals 5 tells us nothing about the value of ${}^{q}D$. Therefore the diversity of order q (q > 0) and the inequality factor are mathematically independent (they form a Cartesian product space). The partitioning theorem has given us the unique decomposition of species richness into independent diversity and inequality components. The evenness factor $EF_{0,q}$, which is the reciprocal of $IF_{0,q}$, must therefore also be independent of diversity, since monotonic transformations like taking the reciprocal do not affect independence.
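The multiplicative decomposition of richness can be sketched in a few lines of Python. The function names and the example abundances below are hypothetical, and richness is taken to be the diversity of order 0, so species listed with zero abundance are ignored.

```python
import math

def hill_number(p, q):
    """Diversity of order q for relative abundances p."""
    p = [x for x in p if x > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

def inequality_factor(p, q):
    """Inequality factor of orders 0 and q: richness divided by diversity of order q."""
    return hill_number(p, 0) / hill_number(p, q)

def evenness_factor(p, q):
    """Evenness factor of orders 0 and q: the reciprocal of the inequality factor."""
    return 1.0 / inequality_factor(p, q)

p = [0.5, 0.3, 0.2]
for q in (1, 2):
    d = hill_number(p, q)
    x = inequality_factor(p, q)
    # The last column recovers the richness S = 3 as diversity times inequality.
    print(q, round(d, 3), round(x, 3), round(d * x, 3))
```

The inequality factor is never below unity, and it equals unity only when all listed species are equally common.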
This derivation shows that some popular evenness measures are justified theoretically, but it also shows that they have not been correctly interpreted. Ecologists have not really been partitioning diversity into independent richness and evenness components. Ecologists have actually been partitioning richness into independent higher-order diversity and inequality or evenness components. This shows that one of the most often repeated requirements of evenness measures, that they be independent of $S$, needs to be reconsidered. Figure 1 gives some simple species-abundance graphs and their inequality and evenness factors.
Figure 1. Inequality and evenness factors. In Row A, all communities are maximally even, so their inequality factors are all unity. Their evenness factors are also all unity. In Row B, communities all have an inequality factor of 2 and an evenness factor of ½. In Row C, the communities show maximal inequality; their inequality factors equal the number of species. Evenness factors are their reciprocals.
It may seem counterintuitive that two independent abundance-sensitive quantities should combine to produce richness, which is not sensitive to abundance. Abundances affect diversity and inequality in complementary ways; multiplying them eliminates the dependence on abundance. It may also seem confusing that two independent quantities can both be related to a third quantity, $S$. Of course, if any two of the quantities are known, the third can be determined. However, diversity and inequality (or evenness) do not give any information about each other in the absence of information about $S$. It is in this sense that diversity and inequality are independent, or orthogonal, like the orthogonal x- and y-components of a vector. As shown in the previous section, the same cannot be said about richness and evenness, nor about richness and diversity.

$IF_{0,q}$ and $EF_{0,q}$ Satisfy Common Requirements for Evenness and Inequality Measures
$IF_{0,q}$ and $EF_{0,q}$ follow directly from the mathematics of diversity, which makes them easy to interpret and which guarantees their logical consistency. Smith and Wilson [3] presented a list of properties ecologists require in an evenness measure, along with properties that are desirable but not essential. Using numerical examples, they show that $EF_{0,2}$ (their "$E_{1/D}$") satisfies all required properties, and most of the desirable but inessential properties. They did not consider $EF_{0,1}$ or other orders of $EF_{0,q}$, but it is possible to prove that all $EF_{0,q}$ satisfy properties "close to" the four properties required by Smith and Wilson. The proofs will be presented elsewhere [16]. We have to reinterpret one of Smith and Wilson's properties, though. Their most essential property was "independence from S", which they tested by checking if a measure was unchanged when an assemblage was replaced by N copies of itself, each copy with different species. This property is called "replication invariance" in the economics literature. Our $EF_{0,q}$ and $IF_{0,q}$ are indeed replication invariant, but as we have shown, they are not independent of richness. Smith and Wilson wrongly equated replication invariance with independence. Replication invariance is not sufficient to guarantee independence; one must also check the limits of each component to make sure they are not constrained by the other component.
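Replication invariance is easy to check numerically. In the sketch below, the helper `replicate` (a hypothetical name) pools N copies of an assemblage, each copy with its own distinct species, and the evenness factor is recomputed for the pooled assemblage; the abundances are made up for illustration.

```python
import math

def evenness_factor(p, q):
    """Evenness factor of orders 0 and q: diversity of order q divided by richness."""
    S = len(p)
    if abs(q - 1.0) < 1e-12:
        d = math.exp(-sum(x * math.log(x) for x in p))
    else:
        d = sum(x ** q for x in p) ** (1.0 / (1.0 - q))
    return d / S

def replicate(p, n):
    """Pool n equally large copies of an assemblage, each with distinct species."""
    return [x / n for x in p] * n

p = [0.6, 0.3, 0.1]
for n in (1, 2, 5):
    # The evenness factor is unchanged even though richness grows to 3 * n.
    print(n, len(replicate(p, n)), round(evenness_factor(replicate(p, n), 2), 6))
```

The printed evenness is the same in every row, yet its attainable minimum, 1/S, differs between rows, which is why invariance under replication does not by itself establish independence from richness.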
A transfer of a small amount of abundance from a common species to a rare species should increase evenness. This version of the Principle of Transfers, mentioned earlier as one of the defining principles of measures of compositional complexity, is on most lists of essential properties for evenness measures [3,4,15]. Measures with this property preserve the partial ordering induced by the Lorenz curve [4,8,15]. $EF_{0,q}$ satisfies this important property, and $IF_{0,q}$ satisfies the corresponding version for inequality measures.
One property not satisfied by these measures is continuity as $S$ varies. Suppose a community has $S$ species with relative abundances $[p_1 - \varepsilon/S,\ p_2 - \varepsilon/S,\ \ldots,\ p_{S-1} - \varepsilon/S,\ \varepsilon]$, where the S-th species is the rarest and has relative abundance $\varepsilon$. In the limit as $\varepsilon$ approaches zero, the evenness of this community is required to continuously approach the evenness of the community with the $S-1$ remaining species with relative abundances $[p_1, p_2, \ldots, p_{S-1}]$. Routledge [17] notes that this continuity requirement is inconsistent with some of the other common requirements for evenness measures, and it seems unnatural to require it of a concept that depends in an essential way on the discontinuous variable $S$. The lack of continuity does make these measures difficult to estimate; see Section 8.3 for a suggested solution.

Interpretation of Evenness and Inequality Factors
Measures should have meaning. Here are some tools to help understand the meaning of evenness and inequality factors.

Interpretation in Terms of Proportion of Dominant Species
Hill [10] gave an approximate verbal interpretation of ${}^{1}D$ as the "number of common species", and interpreted ${}^{2}D$ as the "number of abundant species". While these interpretations are not rigorous, they do provide insight into the meaning of the evenness and inequality factors. If ${}^{2}D$ is the number of abundant or dominant species, then the evenness factor ${}^{2}D/S$ is, roughly speaking, the proportion of dominant species in the community. If there are some species that are abundant in a community, ${}^{2}D$ only "sees" these, and hardly takes into account the rare species in the community. In the idealized communities of Figure 1, this interpretation is exact. ${}^{2}D$ and ${}^{1}D$ differ in the sharpness of the cutoff between dominant and rare species. ${}^{2}D$ has a fairly sharp cutoff if there are some species that are much more abundant than the rest. ${}^{1}D$ does not have a sharp cutoff and counts average species as well as dominant ones; it is always greater than or equal to ${}^{2}D$, so $EF_{0,1}$ is always greater than or equal to $EF_{0,2}$. The measure $1 - EF_{0,q}$ gives the proportion of rare species in the community, roughly speaking.

Interpretation in Terms of Equivalent Maximally Uneven Communities
The concept of effective number of species, or the "numbers equivalent", was very helpful in interpreting diversity measures in terms of easily visualized reference communities. A similar concept can help interpret evenness and inequality measures. $EF_{0,q}$, $IF_{0,q}$, and all their monotonic transformations can be converted into a common, easily interpretable scale.
Suppose our evenness measure $EF_{0,1}$ has the value 0.143. How uneven is the community? Any two communities with this evenness value are equivalent with respect to the aspect of evenness measured by $EF_{0,1}$. One community whose evenness is easy to visualize is the maximally uneven community of T species, with virtually its entire population concentrated in one species, and all other species represented by vanishingly small populations. The evenness $EF_{0,1}$ of such a community is $e^{H}/S = 1/T$, because $e^{H}$ approaches unity when virtually all the population belongs to one species. If $EF_{0,1}$ of a community is 0.143, then the maximally uneven community with the same value of $EF_{0,1}$ is found from $1/T = 0.143$, so $T = 7.0$. A community with seven species, all vanishingly rare except for one, has an evenness of 0.143. This equivalent community gives an easily visualized idea of the evenness of the original community.
The inequality $IF_{0,1}$ is even easier to interpret using this method. $IF_{0,1}$ (and $IF_{0,q}$ generally) is exactly the number of species in the maximally uneven community equivalent to the community of interest with respect to inequality. Any monotonic transformation of $IF_{0,q}$ will have the same equivalent maximally uneven community.
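The limiting behaviour behind this interpretation can be illustrated with a short Python sketch. It builds a hypothetical seven-species community in which six species each have a small abundance `eps` (an assumed parameter of the example) and shows the evenness factor of orders 0 and 1 approaching 1/T as `eps` shrinks.

```python
import math

def ef01(p):
    """Evenness factor of orders 0 and 1: exp(Shannon entropy) / richness."""
    H = -sum(x * math.log(x) for x in p if x > 0)
    return math.exp(H) / len(p)

T = 7  # number of species in the maximally uneven reference community
for eps in (1e-3, 1e-6, 1e-9):
    # One dominant species; the other T - 1 species each have abundance eps.
    p = [1 - (T - 1) * eps] + [eps] * (T - 1)
    print(eps, round(ef01(p), 4))

print(round(1 / T, 4))  # the limiting value 1/T
```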
This reasoning does not distinguish between inequality and evenness, since evenness is a monotonic transformation of inequality. After all, our evenness and inequality measures are both really measuring the same thing. Figure 1 gives values of $IF_{0,q}$ (which is also the size of the equivalent maximally unequal community) for various idealized communities. For communities such as these, whose species are all either equally dominant or vanishingly rare, the value of q is irrelevant and $IF_{0,q}$ has the same value for any q.

Graphic Interpretation
The measures of inequality and evenness, $IF_{0,q}$ and $EF_{0,q}$, have simple graphical interpretations based on the "diversity profile". A diversity profile is a graph of diversity ${}^{q}D$ versus q, for nonnegative values of q. Diversity profiles are perfectly flat when the community is perfectly even, and the profiles become more steeply decreasing as the community becomes more uneven. The evenness and inequality measures derived here are really measures of how steeply the diversity profile decreases. The inequality factor $IF_{0,1}$ compares ${}^{0}D$ with ${}^{1}D$, so it is just the ratio of the two distances shown in Figure 2a. Similarly the inequality factor $IF_{0,2}$ is the ratio of the two distances shown in Figure 2b.
Figure 2. The y-axis is diversity of order q, and the x-axis is the order q. $IF_{0,1}$ is the ratio of the two distances shown in blue on the left. $IF_{0,2}$ is the ratio of the two distances shown in blue on the left.

Interpretation in Terms of Mean Deviation from Equiprobability
To make the meaning of $IF_{0,1}$ clearer, we can rewrite it in terms of the individual species frequencies. Suppose a community of $S$ species has N individuals. (The final result will turn out to be independent of N.) If the community were perfectly even, every individual would belong to a species that had frequency 1/S. In an uneven community, some individuals will belong to species whose frequencies are higher than 1/S, and some individuals will belong to species whose frequencies are lower than 1/S. The factor $p_i/(1/S)$ measures the proportional deviation from perfect evenness for Species i. For example, if $p_i$ is 0.4 and 1/S is 0.2, the species is twice as abundant (0.4/0.2) as it would have been in a perfectly even community. What is the average, over all the individuals in the community, of this "inequality factor"? The most appropriate average when products are involved is the geometric mean. We take the factors for each individual, multiply them all together, and then take the N-th root to get the (geometric) mean inequality factor for the community. For example, suppose N = 10 and there are 8 individuals of Species A, 1 individual of Species B, and 1 individual of Species C. This is a very uneven community. If it were perfectly even, each species would have frequency 1/S = 1/3. Each individual of Species A has an inequality factor of (8/10)/(1/3) = 2.4. Each individual of Species B has an inequality factor of (1/10)/(1/3) = 0.3. Each individual of Species C has this same inequality factor, 0.3. The product of all of their inequality factors (one factor for each individual) is

$$(2.4)^{8}\,(0.3)^{1}\,(0.3)^{1} = 99.07. \qquad (2)$$

We take the N-th root to get the geometric mean of these factors:

$$(99.07)^{1/10} = 1.58.$$
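The worked example with 8, 1, and 1 individuals can be reproduced with a few lines of Python. The variable names are illustrative; the final comparison uses the fact that the geometric mean of the per-individual factors simplifies algebraically to richness divided by the exponential of Shannon entropy.

```python
import math

counts = {"A": 8, "B": 1, "C": 1}  # individuals per species, as in the example
N = sum(counts.values())
S = len(counts)

# One factor p_i / (1/S) for every individual of species i.
factors = []
for n in counts.values():
    factors.extend([(n / N) / (1 / S)] * n)

product = math.prod(factors)
geometric_mean = product ** (1 / N)

# Inequality factor of orders 0 and 1 computed directly from the entropy.
H = -sum((n / N) * math.log(n / N) for n in counts.values())
if01 = S / math.exp(H)

print(round(product, 2), round(geometric_mean, 4), round(if01, 4))
```

Multiplying every count by the same constant leaves the result unchanged, which is the independence from N noted in the text.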

Motivation
The inequality factor $IF_{0,q}$ has a minimum value of unity for a perfectly even community. It might sometimes be useful to say that a community with no inequality has an inequality of zero. This would agree more closely with common language. Many biologists and economists have therefore preferred inequality measures whose minimum value is zero. Several monotonic transformations of $IF_{0,q}$ have minimum values of zero and maximum values of infinity, and preserve the important mathematical properties of the base measures.

Theil Entropy Inequality Measure
Since inequality is such an important theme in economics, economists have spent much effort on developing a mathematically rigorous theory of inequality. We will borrow some of their results and translate them into ecologists' language and notation. Their most frequently used measures turn out to be equivalent to the ones we have just derived in ecology.
In economics a "household" or a "firm" is like a species, and a household's income or a firm's output is equivalent to the abundance of a species. One small difference between ecologists' and economists' measures is that economists always deal with finite resources, while the populations studied by ecologists are usually so large that they are treated as infinite. The relative abundance of Species i in economics is $N_i/N$ while in ecology it is a probability $p_i$. In this section we will take the viewpoint of economics and assume the population is finite with size N, and each species has abundance $N_i$. This is a mere formality since the end results are independent of N and depend only on the ratios $N_i/N$, which are just the $p_i$ of ecologists.
A more important difference is that economists frequently deal with households or firms that have zero resources or output, while ecologists rarely include species with zero abundance in their analyses. This difference needs to be kept in mind when crossing the boundaries between the two disciplines. It is irrelevant when calculating diversity, because diversity measures are invariant to the addition of species with zero abundance, but it matters very much for evenness and inequality.
Economists generally write inequality measures in terms of the mean resource abundance, N/S. Of course they are usually not referring to abundance of species but to household income or other such things. The most important measure of inequality in economics is the Theil entropy inequality measure TEI [18]:

$$TEI = \ln(S) - H = \ln(S/e^{H}) = \ln IF_{0,1}.$$

The preferred inequality measure in economics is therefore just the logarithm of our inequality factor $IF_{0,1}$.
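The equivalence can be checked numerically. The sketch below assumes the form of the Theil index commonly given in the economics literature, written in terms of the mean abundance N/S, and compares it with the logarithm of the inequality factor of orders 0 and 1; the function names and the abundance vectors are hypothetical.

```python
import math

def theil(abundances):
    """Theil entropy inequality, assuming its usual form in terms of the mean N/S."""
    S = len(abundances)
    mean = sum(abundances) / S
    # Units with zero abundance contribute 0 (the limit of x*ln(x) as x -> 0).
    return sum((n / mean) * math.log(n / mean) for n in abundances if n > 0) / S

def log_inequality_factor(abundances):
    """ln(S) - H, with S counting every listed unit, including those with zero."""
    N = sum(abundances)
    H = -sum((n / N) * math.log(n / N) for n in abundances if n > 0)
    return math.log(len(abundances)) - H

counts = [8, 1, 1]
print(theil(counts), log_inequality_factor(counts))  # the two values agree

# Adding a unit with zero abundance leaves H unchanged but raises inequality.
print(theil(counts + [0]), log_inequality_factor(counts + [0]))
```

The second comparison illustrates why the treatment of zero-abundance units matters for inequality even though it does not affect diversity.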

Logarithmic Transformations of General IF 0, q
Taking the logarithm of any $IF_{0,q}$ will transform it into a measure with a minimum value of zero (for a perfectly even community) and no upper limit (increasingly large when inequality is large). This gives some commonly used measures of evenness and inequality in ecology. In the preceding paragraph we mentioned $\ln(IF_{0,1}) = \ln(S/e^{H}) = \ln(S) - H$. The logarithmic transformation of the evenness factor $EF_{0,1}$ yields another commonly used measure: $\ln(EF_{0,1}) = \ln(e^{H}/S) = H - \ln(S)$. It ranges from zero (for a perfectly even community) to negative infinity (for highly uneven communities). For fixed $S$, this measure cannot be more negative than $-\ln S$. This evenness measure was first proposed in ecology by Buzas and Hayek [19]. It is the negative of $\ln IF_{0,1}$ since $EF_{0,1}$ is the reciprocal of $IF_{0,1}$. Since $EF_{0,1}$ is replication invariant, this transformation is also replication invariant.

Deformed Logarithmic Transformations
Another monotonic transformation is extensively used in economics and physics and can be useful in ecology. This transformation is similar to a logarithmic transformation. The function

$$\ln_q(X) \equiv (X^{1-q} - 1)/(1 - q)$$

is called the "deformed logarithm" or "q-logarithm" in physics [20] and information theory [21]. Its limit as q approaches unity is the natural logarithm of X. When this transformation is applied to our $EF_{0,q}$ (the reciprocal of $IF_{0,q}$) we obtain a measure that is (−q) times the very important generalized entropy inequality index of economics, GEI [22]:

$$\ln_q(EF_{0,q}) = [(EF_{0,q})^{1-q} - 1]/(1 - q) = [-q] \cdot GEI.$$

The limit of GEI as q approaches unity is $\ln IF_{0,1}$ or $-\ln EF_{0,1}$, since $EF_{0,1}$ and $IF_{0,1}$ are reciprocals. This limit is exactly TEI, the Theil entropy inequality measure.
Generalized entropy inequality measures range from zero (for a perfectly even community with no inequality) to positive infinity (increasing indefinitely with increasing inequality). For fixed $S$ they cannot exceed $[S^{q-1} - 1]/[q(q - 1)]$. For q = 1 the maximum possible value for fixed $S$ is $\ln S$.
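The relation between the deformed logarithm of the evenness factor and the generalized entropy index can be verified numerically. The sketch below assumes the standard economics definition of the generalized entropy index, with each unit's share of the mean written as S times its relative abundance; the function names and the abundance vector are illustrative.

```python
import math

def q_log(x, q):
    """Deformed logarithm ln_q(x); reduces to the natural log as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def evenness_factor(p, q):
    """Evenness factor of orders 0 and q for q != 1, with S = number of listed species."""
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q)) / len(p)

def gei(p, q):
    """Generalized entropy index (assumed standard form; q not equal to 0 or 1)."""
    S = len(p)
    return (sum((S * x) ** q for x in p) / S - 1.0) / (q * (q - 1.0))

p = [0.8, 0.1, 0.1]
q = 2
print(q_log(evenness_factor(p, q), q), -q * gei(p, q))  # the two values agree

# Close to q = 1 the index approaches ln(S) - H, the Theil measure.
H = -sum(x * math.log(x) for x in p)
print(gei(p, 1.000001), math.log(len(p)) - H)
```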

Motivation
The rightmost community in Row C of Figure 1 shows more inequality (a higher $IF_{0,q}$) than the communities to its left, since the dominant species in the community form a smaller proportion of the total number of species. If species were households and abundance was wealth, the distribution of wealth among these households is much less equal than the distribution of wealth in the communities to the left of it in Figure 1. Yet all of these communities are maximally unequal, given their number of species. It is impossible for the leftmost community, with just two species, to show as much inequality as the rightmost community with its sixteen species. This is a necessary feature of inequality measures that preserve the Lorenz partial order [4].
This means that when ecologists compare the evenness or inequality of two communities with very different species richness, the richer community may well show a greater inequality than the poorer community even if the poorer community is maximally uneven. For example, suppose we compare a hypothetical north temperate Jack Pine (Pinus banksiana) forest with the tropical rain forest on Barro Colorado Island [23]. Imagine the Jack Pine forest has four species with the following frequencies: {0.98, 0.01, 0.005, 0.005}. Almost all trees are Jack Pines (which form virtually monospecific forests following forest fires). The inequality factor $IF_{0,1}$ for this forest is 3.55, meaning that it has the same inequality as a maximally uneven forest of 3.55 species. The inequality factor $IF_{0,2}$, which puts more emphasis on the dominant species, equals 3.84 species. The evenness factors are interpreted as the proportion of dominant species in the community, and are $EF_{0,1} = 0.28$ and $EF_{0,2} = 0.26$ (about a quarter of the species are dominant, which is right). Since the community has four species, its maximum possible inequality factor is 4.00, and its minimum possible evenness is 0.25. The community is close to its maximum possible inequality and its minimum possible evenness.
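The Jack Pine figures can be reproduced directly from the stated frequencies. The following Python sketch is only a numerical check; the function names are hypothetical, and richness is taken as the number of listed species.

```python
import math

def hill_number(p, q):
    """Diversity of order q for relative abundances p."""
    p = [x for x in p if x > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

def inequality_factor(p, q):
    """Richness divided by diversity of order q."""
    return len(p) / hill_number(p, q)

jack_pine = [0.98, 0.01, 0.005, 0.005]  # hypothetical forest from the text
for q in (1, 2):
    inequality = inequality_factor(jack_pine, q)
    evenness = 1.0 / inequality
    print(q, round(inequality, 2), round(evenness, 2))
```

Both inequality factors fall just below the maximum of 4.00 that is possible for four species.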
The Barro Colorado Island rain forest is far less extreme; no single species makes up more than 15% of the population. Nevertheless the proportion of abundant species to total species is much smaller than in the Jack Pine forest. Only 0.6% of the species are in the top quartile of abundances. The inequality factors for Barro Colorado Island are $IF_{0,1} = 5.72$ and $IF_{0,2} = 14.70$. The evenness factors are $EF_{0,1} = 0.175$ and $EF_{0,2} = 0.07$. This correctly shows that the proportion of dominant species in Barro Colorado Island (0.07 according to $EF_{0,2}$) is actually smaller than the proportion of dominant species in the Jack Pine forest (0.26). In this sense the unevenness and inequality of Barro Colorado Island are greater than the unevenness and inequality of the Jack Pine forest.
However, for many ecological purposes, it is also informative to have measures of inequality and evenness relative to the range of inequality that is possible for a community given its species richness. This kind of measure of relative inequality has a fixed minimum value (preferably zero) when the community is perfectly even, and a fixed maximum value (preferably unity) when virtually all the abundance is concentrated in a single species. Relative evenness would have opposite values: zero when the community is maximally uneven given its number of species, and unity when it is perfectly even. Such measures would show that Barro Colorado Island is far more even than it could be, given its richness, while the Jack Pine forest is close to the maximal unevenness possible for a forest with four species. There are several ways to construct relative inequality and evenness measures.

Linear Transformations of Evenness and Inequality Factors
The simplest way to create relative inequality and evenness indices with these characteristics is to transform the inequality factor $IF_{0,q}$ and evenness factor $EF_{0,q}$ onto the unit interval using the linear transformation $(x - x_{min})/(x_{max} - x_{min})$. Evenness $EF_{0,q}$ has a minimum value of 1/S and a maximum value of 1.0. The transformation $(x - x_{min})/(x_{max} - x_{min})$ of $EF_{0,q}$ onto the unit interval gives a relative evenness index, $RE_{0,q}$:

$$RE_{0,q} \equiv \frac{EF_{0,q} - EF_{0,q}^{min}}{EF_{0,q}^{max} - EF_{0,q}^{min}} = \frac{EF_{0,q} - 1/S}{1 - 1/S} = \frac{S \cdot EF_{0,q} - 1}{S - 1} \qquad (3)$$

This transformation was first applied by Heip [2] for the case q = 1. It is zero when the community is maximally uneven for its number of species, and it is unity when the community is perfectly even. A similar transformation could be applied to inequality, yielding a relative inequality index $RI_{0,q}$:

$$RI_{0,q} \equiv \frac{IF_{0,q} - 1}{S - 1} \qquad (4)$$

However, the evenness factor and inequality factor do not have a linear relationship, so these linear transformations do not preserve the interpretation that evenness and inequality are opposites. For a given community, the relative evenness and the relative inequality ("unevenness") generated by these transformations would sometimes paradoxically both be close to zero. The Barro Colorado Island forest is an example of this; for q = 1 the relative inequality is 0.016 and relative evenness is 0.172, both simultaneously low. We could enforce complementarity of relative evenness and inequality by defining relative evenness as 1 − relative inequality, or vice versa, but it is hard to justify favoring either inequality or evenness.

These measures have another problem. Community A in Figure 3 is maximally uneven; Community C is maximally even. The transition from Community A to Community B involves a transfer of exactly half the community's abundance, and the transition from Community B to Community C also involves the transfer of exactly half the abundance. Since Community B is exactly intermediate between Communities A and C in this sense, its relative inequality and relative evenness should both be equal to 0.5 (that is, midway between 0 and 1). However, as shown in Figure 4, this is not the case when Eqs. 3 or 4 define relative evenness and inequality.
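The failure of Eqs. 3 and 4 for the intermediate community can be illustrated with a short calculation. This is only a sketch under stated assumptions: the communities of Figure 3 are taken in their limiting form, so that the order-1 diversities of Communities A, B and C are 1, 2 and 4 with S = 4 (an assumption read from the "transfer of half the abundance" description, not from the figure data), and the function names are hypothetical.

```python
def relative_evenness(diversity, richness):
    """RE_{0,q} of Eq. 3, using EF_{0,q} = qD / S."""
    evenness_factor = diversity / richness
    return (richness * evenness_factor - 1.0) / (richness - 1.0)

def relative_inequality(diversity, richness):
    """RI_{0,q} of Eq. 4, using IF_{0,q} = S / qD."""
    inequality_factor = richness / diversity
    return (inequality_factor - 1.0) / (richness - 1.0)

S = 4
# Assumed limiting diversities of order 1 for the three communities.
limiting_diversity = {"A": 1.0, "B": 2.0, "C": 4.0}

for name, d in limiting_diversity.items():
    re = relative_evenness(d, S)
    ri = relative_inequality(d, S)
    print(name, round(re, 3), round(ri, 3), round(re + ri, 3))
```

Under these assumptions Community B receives a relative evenness and a relative inequality of 1/3 each, so neither equals 0.5 and the two do not sum to unity.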

Transformations of Logarithms of Evenness and Inequality Factors
The lack of complementarity between relative evenness and relative inequality, and the failure of Eqs. 3 and 4 to yield 0.5 for the intermediate Community B, can be fixed by taking the logarithms of $IF_{0,q}$ and $EF_{0,q}$ before transforming them. Since $IF_{0,q}$ and $EF_{0,q}$ are reciprocals of each other, their logarithms show a simple linear relationship ($\ln EF_{0,q} = -\ln IF_{0,q}$). The linear transformation $(x - x_{min})/(x_{max} - x_{min})$, when applied to these logarithms, preserves this linear relationship, producing relative logarithmic evenness and inequality measures, $RLE_{0,q}$ and $RLI_{0,q}$, that are complements of each other. The natural logarithm of $EF_{0,q}$ ranges from $-\ln S$ to 0, so the linear transformation of $\ln EF_{0,q}$ onto the unit interval is:

$$RLE_{0,q} \equiv \frac{\ln EF_{0,q} + \ln S}{\ln S} = \frac{\ln {}^qD - \ln S + \ln S}{\ln S} = \frac{\ln {}^qD}{\ln S} \qquad (5)$$

When q = 1 this is just J, Pielou's [1] measure of evenness. This relation can be rearranged as ${}^1D = S^J$, where J is this relative logarithmic evenness index. The same exponential relationship between ${}^qD$, S, and $RLE_{0,q}$ holds for all q. Similarly $\ln IF_{0,q}$ ranges from 0 to $\ln S$, so the linear transformation onto the unit interval is:

$$RLI_{0,q} \equiv \frac{\ln IF_{0,q}}{\ln S} = \frac{\ln S - \ln {}^qD}{\ln S} = 1 - RLE_{0,q} \qquad (6)$$

When these relative evenness and inequality measures are plotted for Communities A, B, and C, they are perfectly complementary. Both give 0.5 for Community B (Figure 5), which is exactly intermediate between Communities A and C as measured by transfer of abundance.
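A minimal numerical sketch of Eqs. 5 and 6 follows. As an assumption, the three communities of Figure 3 are again represented by their limiting order-1 diversities (1, 2 and 4 with S = 4); the function names are illustrative only.

```python
import math

def relative_log_evenness(diversity, richness):
    """RLE_{0,q} = ln(qD) / ln(S); equals Pielou's J when q = 1."""
    return math.log(diversity) / math.log(richness)

def relative_log_inequality(diversity, richness):
    """RLI_{0,q} = (ln S - ln qD) / ln S."""
    return (math.log(richness) - math.log(diversity)) / math.log(richness)

S = 4
# Assumed limiting diversities of order 1 for Communities A, B and C.
for name, d in {"A": 1.0, "B": 2.0, "C": 4.0}.items():
    rle = relative_log_evenness(d, S)
    rli = relative_log_inequality(d, S)
    print(name, round(rle, 3), round(rli, 3), round(rle + rli, 3))
```

Here the two measures sum to unity for every community, and both are 0.5 for Community B.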

Slope of Chord of Renyi Spectrum
When a community is replicated m times, each point of its true diversity profile (sensu [6]) increases by a factor of m. This changes its shape, making it steeper. However, if we profiled the logarithm of diversity ${}^qD$ vs q, this replication m times would cause the profile to rise everywhere by the same amount, ln m. The shape of the profile of the logarithm of ${}^qD$ is therefore replication invariant, which makes it a useful tool in diversity analysis. The logarithm of the diversity profile is known in statistics as the Renyi entropy spectrum of the community [24].
The slope of the Renyi spectrum between x = 0 and x = q is Slope = $(\ln {}^qD - \ln S)/q$. This is the logarithm of the evenness factor $EF_{0,q}$, divided by q. This can be converted to a measure of relative inequality by the usual linear transformation $(x - x_{min})/(x_{max} - x_{min})$. The slope for a maximally uneven community of S species is $-(\ln S)/q$, because a maximally uneven community has $\ln {}^qD = \ln 1 = 0$ for q > 0. The slope of the chord therefore could range from 0 (perfectly even community) to $-(\ln S)/q$ (maximally uneven community). It can be transformed onto the unit interval by dividing by $-(\ln S)/q$:

$$\frac{(\ln {}^qD - \ln S)/q}{-(\ln S)/q} = \frac{\ln S - \ln {}^qD}{\ln S} = RLI_{0,q}$$

This yields the same $RLI_{0,q}$ derived in the previous section. The complement of this is a relative measure of evenness:

$$1 - \frac{\ln S - \ln {}^qD}{\ln S} = \frac{\ln {}^qD}{\ln S} = RLE_{0,q}$$

This yields the same $RLE_{0,q}$ derived in the previous section. When q = 1 this is Pielou's evenness measure J again. Her formula and its generalizations to higher-order q have this simple graphical interpretation in terms of the slope of the chord of the Renyi spectrum from x = 0 to x = q.
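The equivalence between the normalized chord slope and the relative logarithmic inequality can be checked numerically. The sketch below assumes the hypothetical Jack Pine frequencies used earlier in the paper; the function names are illustrative.

```python
import math

def hill_number(freqs, q):
    """Diversity of order q for relative frequencies that sum to 1."""
    p = [x for x in freqs if x > 0]
    if q == 1:
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

def normalized_chord_slope(freqs, q):
    """Chord slope of the Renyi spectrum from 0 to q, divided by -(ln S)/q."""
    richness = len(freqs)
    slope = (math.log(hill_number(freqs, q)) - math.log(richness)) / q
    return slope / (-math.log(richness) / q)

def relative_log_inequality(freqs, q):
    """RLI_{0,q} = 1 - ln(qD) / ln(S)."""
    return 1.0 - math.log(hill_number(freqs, q)) / math.log(len(freqs))

jack_pine = [0.98, 0.01, 0.005, 0.005]
for q in (0.5, 1, 2, 3):
    print(q,
          round(normalized_chord_slope(jack_pine, q), 4),
          round(relative_log_inequality(jack_pine, q), 4))
```

The two columns agree for every order q > 0, since the factor 1/q cancels in the normalization.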

Relative Evenness Measures Cannot and Should Not Be Replication Invariant
These derivations shed new light on Pielou's evenness measure. Smith and Wilson [3] and many others discard this measure because it is not replication invariant, but it is impossible and undesirable for a relative measure of inequality or evenness to be replication invariant. Consider the communities in Figure 1. In Row B, the first community is maximally uneven. Its relative evenness is therefore zero by definition. The second community in Row B consists of two replicates of the first community. If relative evenness were replication invariant, this community would also have to have a relative evenness of zero. Yet the four-species community with maximal unevenness is not this community but the one below it in Row C. The evenness of the community in Row C is clearly less than that of the one above it (and this can be proven using the Principle of Transfers). Therefore the relative evenness of the second community in Row B cannot be zero. This shows that relative evenness must not be replication invariant.
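The argument can be made concrete with the limiting values implied by Row B of Figure 1 (inequality factor 2 and evenness factor 1/2 for every community in the row). As an assumption, the first community is taken to have S = 2 and diversity 1, and its two-fold replicate S = 4 and diversity 2; the helper names are hypothetical.

```python
import math

def evenness_factor(diversity, richness):
    """Absolute evenness EF_{0,q} = qD / S."""
    return diversity / richness

def relative_log_evenness(diversity, richness):
    """Relative logarithmic evenness RLE_{0,q} = ln(qD) / ln(S)."""
    return math.log(diversity) / math.log(richness)

# (diversity, richness) for the first Row B community and its replicate.
single = (1.0, 2)
doubled = (2.0, 4)

for label, (d, s) in (("single", single), ("doubled", doubled)):
    print(label, evenness_factor(d, s), round(relative_log_evenness(d, s), 3))
```

The absolute evenness factor is unchanged by replication, while the relative logarithmic evenness rises from 0 to 0.5, as the argument above requires.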
The measures of relative logarithmic evenness $RLE_{0,q}$ (Equation 5) and relative logarithmic inequality $RLI_{0,q}$ (Equation 6) provide more intuitive results than the raw evenness and inequality factors for our Jack Pine and Barro Colorado Island forest example. For the Jack Pine forest, relative logarithmic inequality $RLI_{0,2}$ is 97%, accurately showing that this community is almost maximally uneven for a four-species community. Its relative logarithmic evenness $RLE_{0,2}$ is 3%, correctly showing that evenness is close to the minimum possible for a four-species community. For Barro Colorado Island, relative logarithmic inequality $RLI_{0,2}$ is 47%, far lower than the 97% of the Jack Pine forest. Its relative logarithmic evenness $RLE_{0,2}$ is 53%, a reasonably moderate value, showing that the Barro Colorado Island rain forest is far more even, given its richness, than the Michigan Jack Pine forest.

Statistical Concerns
In economics, the total number of households or companies is known with great precision, so estimates of inequality are also precise. In ecology this is not the case. Species richness is almost impossible to estimate reliably in high-diversity communities. Measures of inequality and evenness depend strongly on S, and this raises serious statistical issues. Suppose a community consists of two species, each with 100,000 individuals, so that it is perfectly even. We add 1 individual each of six different new species. The practical difference between these two communities is very small, and these rare species would not be detected in any normal sampling process. Yet the population evenness, relative evenness, and relative logarithmic evenness change dramatically as these rare species are added (Figure 6). This issue is so severe that the population value of evenness or inequality may seem to be a virtually unknowable quantity unless a complete census is done. Some authors (e.g., [3]) suggest using evenness measures only to characterize an actual sample. However, usually ecologists are interested in characterizing the population, not just a particular sample. The relation between the evenness of a small sample and the evenness of the population is tenuous at best. Scientists have therefore been searching for measures of evenness that are less sensitive to our uncertainty in S. In the following sections I briefly mention some of these measures and suggest another alternative.
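The sensitivity described here is easy to reproduce. The sketch below builds the two-species community and adds singleton species one at a time, computing the evenness factor, relative evenness and relative logarithmic evenness. The choice of order q = 1 is an assumption made for illustration, and the function names are hypothetical.

```python
import math

def shannon_diversity(counts):
    """Diversity of order 1 from raw abundance counts."""
    total = sum(counts)
    return math.exp(-sum((c / total) * math.log(c / total)
                         for c in counts if c > 0))

def evenness_measures(counts):
    """Return (EF_{0,1}, RE_{0,1}, RLE_{0,1}) for a community with S >= 2."""
    richness = len(counts)
    d1 = shannon_diversity(counts)
    ef = d1 / richness
    re = (d1 - 1.0) / (richness - 1.0)       # same as (S*EF - 1)/(S - 1)
    rle = math.log(d1) / math.log(richness)
    return ef, re, rle

for n_rare in range(7):
    community = [100_000, 100_000] + [1] * n_rare
    ef, re, rle = evenness_measures(community)
    print(n_rare, round(ef, 3), round(re, 3), round(rle, 3))
```

With no rare species all three measures equal unity; after six singletons are added the evenness factor falls to about 0.25 and the relative logarithmic evenness to about one third, although the community is practically unchanged.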

Partitioning Higher-Order Diversity Measures
I derived evenness and inequality measures by partitioning species richness S (which is ${}^0D$, of course) into independent diversity and inequality components. This dooms us from the start if we are worried about the statistical reliability of our measures, since the partitioning is based directly on the difficult-to-estimate value of S. Many authors have tried to avoid this problem by partitioning not ${}^0D$ but ${}^1D$, which can be accurately estimated from small samples [25]. ${}^1D$ can be partitioned into any higher-order diversity and an independent "inequality" component. The most logical higher-order diversity to use is ${}^2D$, so that

$${}^1D = {}^2D \times IF_{1,2}$$

This implies that $IF_{1,2} = {}^1D/{}^2D$, and its reciprocal is a measure of "evenness", $EF_{1,2} = {}^2D/{}^1D$. This evenness measure was introduced by Hill [10] as his $E_{1,2}$.
Unfortunately Hill's evenness factors based on higher-order diversities have a fatal flaw. They can equal unity in two contradictory circumstances. If the assemblage is completely even, then ${}^1D = {}^2D$ and the evenness is unity, as it should be. If the assemblage is extremely uneven, so that the diversity profile is very steep around q = 0, then the profile is also nearly horizontal for q ≥ 1. This causes ${}^1D$ and ${}^2D$ to be nearly equal, and the evenness is again close to unity, even though this assemblage is maximally uneven. These measures are therefore non-monotonic with respect to increasing evenness. For example, the highly uneven assemblage whose species frequencies are [0.999, 0.001] has an "evenness" $EF_{1,2}$ equal to 0.9941, close to unity. The more even assemblage (according to the Principle of Transfers) with frequencies [0.9, 0.1] has a lower "evenness": $EF_{1,2}$ = 0.881.
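The two values quoted above can be reproduced directly from the definition $EF_{1,2} = {}^2D/{}^1D$. This is a minimal sketch with illustrative function names.

```python
import math

def diversity_order_1(freqs):
    """Exponential of Shannon entropy."""
    return math.exp(-sum(p * math.log(p) for p in freqs if p > 0))

def diversity_order_2(freqs):
    """Inverse Simpson concentration."""
    return 1.0 / sum(p * p for p in freqs)

def hill_evenness_1_2(freqs):
    """EF_{1,2} = 2D / 1D."""
    return diversity_order_2(freqs) / diversity_order_1(freqs)

very_uneven = [0.999, 0.001]
less_uneven = [0.9, 0.1]
print(round(hill_evenness_1_2(very_uneven), 4))
print(round(hill_evenness_1_2(less_uneven), 3))
```

The more uneven assemblage receives the higher value, which is the non-monotonic behavior described in the text.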
The concept of a relative measure of evenness, discussed above, also applies to this higher-order "evenness". Given some observed value for ${}^1D$, the minimum possible value for the evenness factor ${}^2D/{}^1D$ is $1/{}^1D$, since the minimum possible value of ${}^2D$ is 1. The maximum possible value is unity for the perfectly even community (since then ${}^2D = {}^1D = S$). Using the standard transformation $(x - x_{min})/(x_{max} - x_{min})$, we obtain the higher-order relative evenness $RE_{1,2}$:

$$RE_{1,2} = \frac{{}^2D/{}^1D - 1/{}^1D}{1 - 1/{}^1D} = \frac{{}^2D - 1}{{}^1D - 1}$$

This modification of Hill's $E_{1,2}$ was proposed by Alatalo [26].
Earlier we saw that it was better to form a relative evenness measure out of $\ln EF_{0,1}$ instead of $EF_{0,1}$ itself. We do the same thing here, transforming $\ln EF_{1,2}$ into a relative measure using $(x - x_{min})/(x_{max} - x_{min})$:

$$RLE_{1,2} = \frac{\ln EF_{1,2} - \ln(1/{}^1D)}{\ln 1 - \ln(1/{}^1D)} = \frac{\ln {}^2D - \ln {}^1D + \ln {}^1D}{\ln {}^1D} = \frac{\ln {}^2D}{\ln {}^1D}$$

This has a direct graphical interpretation in terms of the Renyi spectrum, much like the relative logarithmic evenness based on $EF_{0,1}$.
These relative higher-order evennesses $RE_{1,2}$ and $RLE_{1,2}$ are apparently not affected by the problems of the absolute evenness $EF_{1,2}$. The relative evenness $RE_{1,2}$ of [0.999, 0.001] is 0.25, and the relative evenness $RE_{1,2}$ of [0.9, 0.1] is 0.57. The relative logarithmic evenness $RLE_{1,2}$ of [0.999, 0.001] is 0.25, while the $RLE_{1,2}$ of [0.9, 0.1] is 0.61. These measures correctly show that the relative evenness of the second community is greater than the relative evenness of the first community. However, to my knowledge, the relation between these measures and the Lorenz partial ordering is not yet known.
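These four values can be reproduced with a short sketch, assuming the forms $RE_{1,2} = ({}^2D - 1)/({}^1D - 1)$ and $RLE_{1,2} = \ln {}^2D/\ln {}^1D$; the function names are illustrative.

```python
import math

def diversity_order_1(freqs):
    """Exponential of Shannon entropy."""
    return math.exp(-sum(p * math.log(p) for p in freqs if p > 0))

def diversity_order_2(freqs):
    """Inverse Simpson concentration."""
    return 1.0 / sum(p * p for p in freqs)

def relative_evenness_1_2(freqs):
    """RE_{1,2} = (2D - 1) / (1D - 1)."""
    return (diversity_order_2(freqs) - 1.0) / (diversity_order_1(freqs) - 1.0)

def relative_log_evenness_1_2(freqs):
    """RLE_{1,2} = ln(2D) / ln(1D)."""
    return math.log(diversity_order_2(freqs)) / math.log(diversity_order_1(freqs))

for freqs in ([0.999, 0.001], [0.9, 0.1]):
    print(freqs,
          round(relative_evenness_1_2(freqs), 2),
          round(relative_log_evenness_1_2(freqs), 2))
```

Both measures are undefined for a single-species community (where ${}^1D = 1$), so the sketch should only be applied to assemblages with more than one species.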

An Estimable Evenness Measure
One way to improve the estimation of the evenness indices derived in this paper would be to improve the estimation of S. There are many reviews of this subject, and many excellent nonparametric estimators, such as the Chao estimators [27]. These should always be used rather than the observed sample value of S, if the richness of a population is estimated by taking incomplete samples.
However, these nonparametric estimators generally provide only lower bounds for the population value of S [28]. There is no guarantee that there are not some very rare species dispersed through the ecosystem in densities so low that they will never be detected through normal sampling. This makes the true value of S an unknowable quantity. It is difficult even to quantify the uncertainty in a particular estimate of S without making parametric assumptions. On the other hand, if some species are so rare that they are impossible to detect, then they are also so rare that they make little difference to the day-to-day functioning of the ecosystem. Why not forget about them and satisfy ourselves with characterizing the bulk of the population?
One approach, also used in estimating S, is to standardize on a particular sample size N, and estimate the mean evenness of a sample of that size. This would be done by repeatedly rarefying a larger sample down to the standard size. However, sample sizes that are sufficiently large to characterize a low-diversity community will often not be large enough to characterize a high-diversity community.
Furthermore, sampling to a fixed size does not preserve the important theoretical properties of a diversity measure like S. Diversities follow the replication principle, so if we pool two equally large populations with richness S, and with identical species frequencies but no shared species, the pooled population will have richness 2S, twice the richness of either of the original populations. However, the richness of a sample of fixed size taken from the pooled population will not be twice the richness of a sample from one of the original populations. Sampling strategies should preserve, as much as possible, the mathematical properties of the measure being estimated, and sampling at a fixed, standardized sample size does not do this. Instead of sampling at a fixed size, we need an adaptive approach to choosing the sample size.
The concept of "sample coverage" was introduced by Good [29] and Good and Toulmin [30] and underlies many nonparametric estimation techniques [28]. The sample coverage is the proportion of the population belonging to sampled species. For example, suppose the true population frequencies of the species in an ecosystem are {0.5, 0.3, 0.18, 0.02}. Suppose we make a sample of size N and we find Species 1, 2, and 3, but not Species 4. The coverage of our sample is 0.98, because the species in our sample make up 98% of the population. The species we have not sampled will represent individuals that make up about 2% of the population, and these can be ignored.
The sample coverage serves as an adaptive "stopping rule" for choosing sample size [28,31]. The mean relative evenness of a sample that gives, say, 95% coverage is a well-defined number that can be estimated with precision. The possible presence of nearly undetectable ultra-rare species simply has no effect on this number. The number will measure the relative evenness not of the population but of a standardized percentage of the population.
It may seem that in order to estimate this number, we would need to know the complete species list and the true population frequencies of each species, so that we would know when our sample reached 95% coverage. However, Good [29] discovered a simple way to estimate sample coverage without knowing anything about the population. The sample coverage C is approximately equal to $C = 1 - f_1/N$, where $f_1$ is the number of singleton species in the sample (the number of species represented by exactly one individual in the sample). This estimate is most accurate when $f_1$ is large. If we wanted to estimate the richness of a community at 95% coverage, we would keep sampling until C = 0.95, and then measure the evenness of the sample. More accurate would be to make a sample that exceeds 95% coverage, and repeatedly rarefy it down to 95% coverage, averaging the evenness of each rarefied sample.
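Good's estimate needs only the abundance counts of the sample. The following sketch implements $C = 1 - f_1/N$ as stated above; the sample counts are hypothetical and the function name is illustrative.

```python
def good_coverage(counts):
    """Estimate sample coverage as 1 - f1/N.

    counts: number of individuals observed for each sampled species.
    f1 is the number of species observed exactly once; N is the sample size.
    """
    sample_size = sum(counts)
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / sample_size

# Hypothetical sample of 100 individuals containing two singleton species.
sample = [50, 30, 18, 1, 1]
print(round(good_coverage(sample), 3))
```

A stopping rule based on coverage would continue sampling until this estimate reaches the chosen threshold, such as 0.95.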
The richness at fixed coverage, unlike the richness at fixed sample size, will approximately obey the replication principle. Suppose a community has relative abundances [0.4, 0.4, 0.1, 0.05, 0.025, 0.025]. If this is sampled at 95% coverage, on the average the observed richness will be 4 (the four most common species make up 95% of the population). If a replicate community is added to this one, the relative abundances of the new community will be [0.2, 0.2, 0.2, 0.2, 0.05, 0.05, 0.025, 0.025, 0.0125, 0.0125, 0.0125, 0.0125]. Now the eight most common species make up 95% of the population, so the most likely observed richness at 95% coverage will be about eight, double the observed richness of the original community.
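The doubling can be verified with a deterministic simplification: instead of simulating samples, count how many of the most abundant species are needed to reach the target coverage in the population. This is an idealization of the sampling argument, not a sampling simulation, and the function name is hypothetical. Exact fractions are used to avoid floating-point rounding at the 95% threshold.

```python
from fractions import Fraction

def richness_at_coverage(freqs, coverage):
    """Smallest number of most-abundant species whose frequencies
    sum to at least the target coverage."""
    cumulative = Fraction(0)
    for n_species, p in enumerate(sorted(freqs, reverse=True), start=1):
        cumulative += p
        if cumulative >= coverage:
            return n_species
    return len(freqs)

original = [Fraction(s) for s in ("0.4", "0.4", "0.1", "0.05", "0.025", "0.025")]
# Replicate community: each species duplicated at half its frequency.
replicated = [p / 2 for p in original for _ in range(2)]
target = Fraction("0.95")

print(richness_at_coverage(original, target))
print(richness_at_coverage(replicated, target))
```

Under this idealization the richness at 95% coverage is 4 for the original community and 8 for the replicated one.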
Richness estimated in this way will depend very much on the coverage chosen. The best way to facilitate comparison with the results of others is to make a rarefaction curve based on coverage values instead of the usual sample sizes (Anne Chao, pers. comm.). These rarefaction curves for different communities may intersect, just like rarefaction curves based on sample sizes.
The richness and diversity at a given coverage can be used in the formulas for evenness and relative logarithmic evenness. The resulting measures should not be considered as estimates of true population values of the parent measures, but as valid descriptive measures in their own right. They will approximately share the theoretical properties of their parent measures. While the true population values of the parent measures are virtually unknowable without a complete census, their values for a sample with coverage X can be reliably estimated and meaningfully compared across communities, resolving the problem of sensitivity to S inherent in these measures.

Relative versus Absolute Evenness and Inequality
Alatalo [26] gives a thoughtful critique which rejects Pielou's J (my "relative logarithmic evenness" $RLE_{0,q}$) but endorses $EF_{0,1}$. His Table 1 lists values for these measures when applied to a community in which half the species have a large relative abundance X, and the other half have a vanishingly small relative abundance. He gives several such communities, each with different richness. He argues that these communities should all have an evenness of 0.50 independent of their richness, and he notes that $EF_{0,1}$ does give 0.5 for all these communities while J increases sharply with richness. However, his arguments confuse absolute and relative evenness, and the apparent defects of J are precisely what are needed in a measure of relative evenness (the amount of evenness relative to the range of evenness possible for the given richness). J and $EF_{0,1}$ are looking at exactly the same thing from different but equally valid viewpoints.
Consider my version of his Table 1. For different values of richness, this table gives J, $IF_{0,1}$, and $EF_{0,1}$ for the maximally uneven community, the maximally even community, and the intermediate community considered by Alatalo [26]. J correctly gives unity for all completely even communities, regardless of richness. It also correctly gives zero for the maximally uneven communities, independent of richness, as a relative measure must do. Note that $IF_{0,1}$ and $EF_{0,1}$ are not independent of richness for the maximally uneven community, so they are clearly not relative measures of evenness or inequality. They are giving the absolute evenness and inequality. When richness is greater, the maximally unequal community shows more absolute inequality (less absolute evenness) than when richness is low.

Table 1. Relative and absolute evenness and inequality. Relative logarithmic evenness J is always zero when the community is maximally uneven, and is always unity when the community is perfectly even. The intermediate communities of Alatalo [26] are not really intermediate in inequality or evenness when richness is high; they are actually closer to the completely even community. See Figure 7.
Table 1 column groups: Maximally uneven; Intermediate according to Alatalo [26]; Completely even.

Figure 7. Changes in evenness and inequality in equal steps. Each step involves transfer of half the abundance of the community. When richness S is high, Alatalo's [26] intermediate communities are closer to the perfectly even community than to the center. This explains why J seems to vary with richness in his example.
To understand why J changes with richness in the "intermediate" communities of Alatalo, consider Figure 7. This shows assemblages with four species, and assemblages with eight species. Each step illustrated in Figure 7 involves a transfer of half the abundance in the assemblage. Thus each step is the same "size". Starting with the perfectly even community on the far right, the very first step going left (decreasing evenness) always produces Alatalo's "intermediate" community. This is reached in one step from the perfectly even community, regardless of richness. Note, though, that when richness is high, more equally large steps (each transferring half the abundance of the community) are required to get to the completely uneven community. The richer the community, the more additional steps are needed to get to the maximally uneven community. Thus Alatalo's "intermediate" community doesn't stay intermediate as richness rises. It is much closer to the completely even community when richness is high. On the other hand, when richness equals 2, his "intermediate" community is actually the maximally uneven community! J does exactly what it should, and is nicely linear with respect to this kind of transfer of abundance. The variation of J with respect to S has been much criticized but is perfectly logical in a relative measure of the logarithm of evenness. This is even clearer if we look at 1 − J, the relative logarithmic inequality of orders zero and one. In Figure 7 the absolute inequality ranges from 8 to 1 when S = 8, and Alatalo's intermediate community has an inequality of 2, very far from the middle value. The logarithm of inequality for S = 8 ranges from 3 to 2 to 1 to 0 using logarithms to the base 2, and the successive values of 1 − J (which are independent of choice of logarithm base) coincide perfectly with this: 1, 2/3, 1/3, 0.
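The equal spacing of 1 − J along the steps of Figure 7 can be checked in a few lines. The sketch assumes the limiting inequality factors stated in the text for S = 8 (8, 4, 2, 1) and, by the same halving argument, 4, 2, 1 for S = 4; the variable and function names are illustrative.

```python
import math

def one_minus_j(inequality_factor, richness):
    """Relative logarithmic inequality 1 - J = ln(IF_{0,1}) / ln(S)."""
    return math.log(inequality_factor) / math.log(richness)

# Inequality factors after each transfer of half the abundance,
# from maximally uneven (IF = S) to perfectly even (IF = 1).
steps = {8: [8, 4, 2, 1], 4: [4, 2, 1]}

for richness, factors in steps.items():
    values = [round(one_minus_j(f, richness), 3) for f in factors]
    print(richness, values)
```

For S = 8 the values are 1, 2/3, 1/3, 0, and Alatalo's "intermediate" community (inequality factor 2) sits at 1/3 rather than at the midpoint; for S = 4 the same community sits exactly at 0.5.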
The comparison of the evenness of Barro Colorado Island forest and a species-poor Jack Pine forest in Sections 7.1 and 7.5 shows that this perspective is a fruitful one in practice as well as in theory. Gosselin [5] also found that Pielou's J was one of the most well-behaved evenness measures. His main criticism of the measure was that "it lacked an axiomatic background". Now that an axiomatic derivation is provided, J should enter the ecologists' toolbox without reservations.

An Alternative Evenness Concept
An alternative approach to the one used in this paper identifies evenness with a transformation of the variance of the log abundances of the species in a community [3] or the variance of a rarity function [32]. However, contrary to [3], these measures do not obey the Principle of Transfers, and therefore are not consistent with the Lorenz partial ordering [4,15,32]. For fixed richness, standard diversity measures may decrease instead of increase when this kind of "evenness" is increased. These kinds of measures may still be useful and informative, but as noted by Taillie [15], to avoid confusion and contradictions with standard measures of compositional complexity they should be called something other than "evenness". Engen [32] suggests the term "variability".

Conclusion
A suite of meaningful evenness and inequality measures can be derived from species richness and diversity using the partitioning theorem in [6]. These include absolute measures which express the amount of evenness and inequality in a species abundance distribution, and relative measures which express the degree of evenness and inequality given the richness of the community. Both kinds of measures are useful. The framework presented here shows that measures of evenness and inequality proposed by Pielou [1], Hill [10], Heip [2], Theil [18], Alatalo [26], and others in ecology and economics are all related, and examine different but equally valid aspects of a single unified concept of evenness and inequality.

Figure 1. Inequality and evenness factors. In Row A, all communities are maximally even, so their inequality factors are all unity. Their evenness factors are also all unity. In Row B, communities all have an inequality factor of 2 and an evenness factor of ½. In Row C, the communities show maximal inequality; their inequality factors equal the number of species. Evenness factors are their reciprocals.

Figure 2. Inequality factors in relation to the diversity profile. The y-axis is diversity of order q, and the x-axis is the order q. $IF_{0,1}$ is the ratio of the two distances shown in blue on the left. $IF_{0,2}$ is the ratio of the two distances shown in blue on the left.

Figure 3. Intermediate evenness. Community A is a maximally uneven four-species community. Community C is perfectly even. Community B is exactly intermediate in evenness and inequality.

Figure 4. Non-complementarity of relative evenness $RE_{0,1}$ and relative inequality $RI_{0,1}$. $RE_{0,1}$ and $RI_{0,1}$ and their complements for Community A, Community B, and Community C from Figure 3. $RI_{0,1}$ and $RE_{0,1}$ are not complements of each other. Compare Figure 5.

Figure 5. Complementarity of relative logarithmic evenness $RLE_{0,1}$ and relative logarithmic inequality $RLI_{0,1}$. $RLE_{0,1}$ and $RLI_{0,1}$ and their complements for Community A, Community B, and Community C from Figure 3. This shows that $RLI_{0,1}$ and $RLE_{0,1}$ are complements of each other. Compare Figure 4.

Figure 6. Sensitivity of evenness measures to ultra-rare species for a variety of measures. The initial community consists of two species, each with abundance 100,000. From left to right, vanishingly rare species are added, one at a time. Each vanishingly rare species consists of one individual.

$$RLE_{1,2} = \frac{\ln EF_{1,2} - \ln(1/{}^1D)}{\ln 1 - \ln(1/{}^1D)} = \frac{\ln {}^2D - \ln {}^1D + \ln {}^1D}{\ln {}^1D} = \frac{\ln {}^2D}{\ln {}^1D}$$

(…)$^{1/10}$ = 1.58. 1.58 is the single number that could replace all the individual inequality factors in Equation 2 and still give the same final product. It is exactly our $IF_{0,1}$. The reciprocal, 0.63, is exactly our evenness factor $EF_{0,1}$. The inequality factor $IF_{0,1}$ is the geometric mean of the inequality factors of the individuals in the community, where an individual's inequality factor is just the factor by which the individual's species exceeds or undershoots the frequency that each species would have if the community were perfectly even. We could have taken some other kind of average rather than the geometric average. If we had chosen to take the mean or expected value of the individual inequality factors, we would have taken … This is exactly our $IF_{0,2}$. The reciprocal is 0.505, the evenness $EF_{0,2}$.
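The interpretation of the inequality factors as averages over individuals can be illustrated numerically. The community used below (ten individuals in three species with abundances 8, 1 and 1) is an assumption: it is a hypothetical example chosen because it reproduces the values 1.58, 0.63 and 0.505 quoted above, and it is not taken from the source. Variable names are illustrative.

```python
import math

# Hypothetical community: abundances of each species (assumed, see lead-in).
counts = [8, 1, 1]
n_individuals = sum(counts)
richness = len(counts)

# One inequality factor per individual: its species' frequency divided by
# the frequency 1/S that every species would have under perfect evenness.
individual_factors = [richness * c / n_individuals
                      for c in counts for _ in range(c)]

# Geometric mean over individuals corresponds to IF_{0,1}.
geometric_mean = math.exp(
    sum(math.log(f) for f in individual_factors) / n_individuals)

# Arithmetic mean over individuals corresponds to IF_{0,2}.
arithmetic_mean = sum(individual_factors) / n_individuals

print(round(geometric_mean, 2), round(1 / geometric_mean, 2))
print(round(arithmetic_mean, 2), round(1 / arithmetic_mean, 3))
```

For this assumed community the geometric mean is about 1.58 (reciprocal 0.63) and the arithmetic mean is 1.98 (reciprocal 0.505).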