A dynamic fractal process is rich in interconnected scales, with no one scale dominating. It has been known since Weierstrass constructed the first fractal function in 1872 that such functions are continuous everywhere, but nowhere differentiable. Consequently, such functions are not the solutions to traditional equations of motion with integer-order derivatives, and therefore, the phenomena they describe are not simple mechanical processes. Thus, information in fractal phenomena is coupled across multiple scales, as, for example, observed in the architecture of the mammalian lung [114,115,116] and in cities [107]; manifest in the long-range correlations in human gait [117,118] and the extinction of biological species [103]; measured in the human cardiovascular network [119] and in a number of other contexts [13]. The geometric interpretation of fractals is also given in the fractal nutrient model of AR [11]. Thus, we have both a deterministic and a statistical application of fractals to the understanding of multi-scaled phenomena that manifest allometry patterns.

Here, we focus on the statistical nature of allometry and emphasize that allometry is strictly a relation between average quantities. This immediately brings to the forefront one of the major problems in constructing a theoretical understanding of the origins of allometry. Simply put, the measure of size X is a random variable, as is the functionality, Y, and from the Jensen inequality [120], we have for a nonlinear function $Y=F\left(X\right)$:

$\langle F\left(X\right)\rangle \ne F\left(\langle X\rangle \right)$ (7)

where the brackets denote an ensemble average. Consequently, the empirical AR:

$\langle Y\rangle =a{\langle X\rangle}^{b}$ (8)

cannot be realized by directly averaging the schematic AR given by Equation (2) over data, since $F\left(X\right)={X}^{b}$ is not linear for $b\ne 1$. West and West [33] sketch out the beginnings of a theory to explain how the allometry relations could originate from the scaling of the underlying statistical fluctuations.
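The Jensen-inequality obstruction is easy to check numerically. The sketch below uses a synthetic lognormal ensemble for X and a concave exponent b = 0.75; both choices are purely illustrative assumptions, not data from the text:

```python
import random
import math

random.seed(1)
b = 0.75   # a concave allometry-like exponent, b != 1

# Synthetic ensemble for the random size X (a lognormal choice, purely illustrative)
xs = [math.exp(random.gauss(0.0, 1.0)) for _ in range(100_000)]

mean_of_f = sum(x ** b for x in xs) / len(xs)   # <F(X)> = <X**b>
f_of_mean = (sum(xs) / len(xs)) ** b            # F(<X>) = <X>**b

print(mean_of_f, f_of_mean)
# Jensen: for concave F (0 < b < 1), <F(X)> <= F(<X>), with equality only for constant X
assert mean_of_f < f_of_mean
```

The gap between the two averages is exactly why the schematic AR cannot simply be averaged term by term to yield the empirical AR.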

#### 3.1. Subordination

A simple stochastic process is that of Poisson, in which the probability, $P\left(\tau \right)$, of an event occurring per unit time is ${\lambda}_{0}.$ The rate equation:

$\frac{dP\left(\tau \right)}{d\tau}=-{\lambda}_{0}P\left(\tau \right)$ (9)

has an exponential solution for the probability of an event occurring in a time interval $(0,\tau ).$ This is the kind of simplified dynamics Huxley adopted for the description of the differential growth of different parts of an organism. To generalize this description, we introduce the notion of subordination. This notion implies the existence of two ideas of time, as explained by Svenkeson et al. [121]. One is the operational time, τ, which is the internal time of a single individual, with an individual generating the ordinary dynamics of a non-fractional system, such as given by Equation (9). The other idea is chronological time, t: the time as measured by the clock of an external observer. The subordination procedure transforms the deterministic differential equation in operational time to a fractional differential equation in chronological time.

Note that this idea of operational time ought to be familiar. We encountered a version of it in the discussion of empirical physiologic time in which the time interval or frequency experienced by an organism is determined by the total body mass of that organism. A similar distinction is made in psychology, where the subjective time experienced by an individual in the performance of tasks is separate and distinct from the objective time of the clock on the wall [22]. Given this empirical distinction between the reality of the individual and that of the collective, mathematicians recognized the need for a time that was intrinsic to a process, whose dynamics are regular, but that appears quite complicated to an observer measuring the process from outside. Consequently, a procedure was developed to transform intrinsically regular behavior to the experimentally observed complex behavior by relating operational time to chronological time.

Svenkeson et al. [121] point out that in operational time, an individual’s behavior can appear deterministic, but to an experimenter observing the individual, their temporal behavior can appear to erratically grow in time, then abruptly to freeze in different states for extended time intervals. Due to the random nature of the evolution of the individual in chronological time, the subordination process involves an ensemble average over many individuals, each evolving according to its own internal clock, independently of one another. The resulting ensemble average over a large number of individuals results in an average trajectory that is fractal. We apply this reasoning to rate Equation (9).

To facilitate the discussion, we consider the discrete version of Equation (9):

$P\left(n\right)-P\left(n-1\right)=-{\lambda}_{0}\Delta \tau P\left(n-1\right)$ (10)

in the notation $P\left(n\right)=P\left(n\Delta \tau \right)$, where the time has been partitioned into discrete intervals. The solution to this discrete equation:

$P\left(n\right)={\left(1-{\lambda}_{0}\Delta \tau \right)}^{n}$ (11)

is an exponential, in the limit ${\lambda}_{0}\Delta \tau \ll 1$, such that $n\Delta \tau $ becomes continuous time. However, when the simple process is influenced by the environment, the limit of the discrete solution is no longer an exponential. Adopting the subordination interpretation, we define the discrete index, n, as an operational time that is stochastically connected to the chronological time, t, in which the global behavior is observed. We assume that the chronological time lies in the interval $(n-1)\Delta \tau \le t\le n\Delta \tau $, and consequently, the equation for the average dynamics of the probability is given by [122]:

$\langle P\left(t\right)\rangle ={\sum}_{n=0}^{\infty}{\int}_{0}^{t}{G}_{n}\left(t,{t}^{\prime}\right)P\left(n\right)d{t}^{\prime}$ (12)
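As a quick sanity check on the operational-time dynamics, here is a minimal sketch, assuming the Euler-type discretization $P(n)=(1-\lambda_0\Delta\tau)P(n-1)$ of Equation (9) with illustrative parameter values; the discrete solution converges to the exponential as the step shrinks:

```python
import math

lam0 = 2.0   # event rate lambda_0 (illustrative value)
t = 1.5      # fixed chronological time t = n * dtau
for n_steps in (10, 100, 10_000):
    dtau = t / n_steps
    # Euler-type discrete dynamics: P(n) = (1 - lam0*dtau) * P(n-1), P(0) = 1
    p_discrete = (1.0 - lam0 * dtau) ** n_steps
    print(n_steps, p_discrete)

# In the limit lam0*dtau << 1 the discrete solution becomes the exponential
assert abs(p_discrete - math.exp(-lam0 * t)) < 1e-3
```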

The physical meaning of Equation (12) is determined by considering each tick of the internal clock, n, as measured in experimental time, to be an event. Since the observation is made in experimental time, the time intervals between events define a set of independent identically distributed random variables. The integral in Equation (12) is then built up according to renewal theory [123]. After the n-th event, the probability changes from state $P(n-1)$ to $P\left(n\right)$, where it remains until the action of the next event. The sum over n takes into account the possibility that any number of events could have occurred prior to an observation at experimental time t, and ${G}_{n}\left(t,{t}^{\prime}\right)d{t}^{\prime}$ is the probability that the last event occurs in the time interval (${t}^{\prime},{t}^{\prime}+d{t}^{\prime}$).

We assume that the waiting times between consecutive events in Equation (12) are identically distributed independent random variables, so that the kernel is defined:

${G}_{n}\left(t,{t}^{\prime}\right)=\Psi \left(t-{t}^{\prime}\right){\psi}_{n}\left({t}^{\prime}\right)$ (13)

The probability that no event has occurred in a time, t, is given by the survival probability, $\Psi \left(t\right)$. Individual events occur statistically with a waiting-time pdf, $\psi \left(t\right)$, and taking advantage of their renewal nature, the waiting-time pdf for the n-th event in a sequence is connected to the previous event by:

${\psi}_{n}\left(t\right)={\int}_{0}^{t}\psi \left(t-{t}^{\prime}\right){\psi}_{n-1}\left({t}^{\prime}\right)d{t}^{\prime}$ (14)

and ${\psi}_{0}\left(t\right)=\psi \left(t\right).$ The waiting-time pdf is related to the survival probability through:

$\psi \left(t\right)=-\frac{d\Psi \left(t\right)}{dt}$ (15)

Here, we select the probability of no event occurring up to time t to be:

$\Psi \left(t\right)={\left(\frac{T}{T+t}\right)}^{\alpha}$ (16)

and consequently, the waiting-time pdf is renewal and also inverse power law.
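A survival probability of the inverse power law form $\Psi(t)=(T/(T+t))^{\alpha}$ can be sampled by inverse-transform sampling; the constants below are illustrative choices, not values from the text. The empirical survival of the drawn waiting times reproduces the heavy tail:

```python
import random
import math

random.seed(7)
alpha, T = 0.6, 1.0   # illustrative power-law index and time scale

def waiting_time():
    # Inverse-transform sampling from the assumed survival probability
    # Psi(t) = (T / (T + t))**alpha: solve u = Psi(t) for t with u uniform
    u = random.random()
    return T * (u ** (-1.0 / alpha) - 1.0)

times = [waiting_time() for _ in range(200_000)]

# Empirical survival probability at a late time vs. the analytic inverse power law
t_late = 100.0
emp = sum(1 for t in times if t > t_late) / len(times)
ana = (T / (T + t_late)) ** alpha
print(emp, ana)
assert abs(emp - ana) / ana < 0.15
# Note: for 0 < alpha < 1 the mean waiting time diverges
```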

To find an analytical expression for the behavior in experimental time, it is convenient to study the Laplace transform of Equation (12), where $\widehat{f}\left(s\right)$ denotes the Laplace transform of $f\left(t\right)$:

$\langle \widehat{P}\left(s\right)\rangle =\widehat{\Psi}\left(s\right){\sum}_{n=0}^{\infty}{\left[\widehat{\psi}\left(s\right)\right]}^{n}P\left(n\right)$ (17)

and the relation ${\widehat{\psi}}_{n}\left(s\right)={\left[\widehat{\psi}\left(s\right)\right]}^{n}$ was used, due to the correlation structure of Equation (14). With the discrete time solution in operational time, Equation (11), this can be written as:

$\langle \widehat{P}\left(s\right)\rangle =\widehat{\Psi}\left(s\right){\sum}_{n=0}^{\infty}{\left[\left(1-{\lambda}_{0}\Delta \tau \right)\widehat{\psi}\left(s\right)\right]}^{n}$ (18)

Performing the sum and noting the relationship given by Equation (15), we find:

$\langle \widehat{P}\left(s\right)\rangle =\frac{1-\widehat{\psi}\left(s\right)}{s}\frac{1}{1-\left(1-{\lambda}_{0}\Delta \tau \right)\widehat{\psi}\left(s\right)}$ (19)

This can be expressed in the typical form:

$s\langle \widehat{P}\left(s\right)\rangle -1=-{\lambda}_{0}\Delta \tau \widehat{\Phi}\left(s\right)\langle \widehat{P}\left(s\right)\rangle$ (20)

where:

$\widehat{\Phi}\left(s\right)=\frac{s\widehat{\psi}\left(s\right)}{1-\widehat{\psi}\left(s\right)}$ (21)

is the Montroll–Weiss memory kernel [124].

In the asymptotic limit $s\to 0$, Equation (20) becomes:

$s\langle \widehat{P}\left(s\right)\rangle -1=-{\lambda}^{\alpha}{s}^{1-\alpha}\langle \widehat{P}\left(s\right)\rangle$ (22)

with the parameter value ${\lambda}^{\alpha}={\lambda}_{0}\Delta \tau /\left[\Gamma \left(1-\alpha \right){T}^{\alpha}\right]$. Consequently, the subordination process results in ordinary rate Equation (9), through the inverse Laplace transform of Equation (22), being replaced with the fractional rate equation [121,125]:

${D}_{t}^{\alpha}\left[P\left(t\right)\right]=\frac{{t}^{-\alpha}}{\Gamma \left(1-\alpha \right)}-{\lambda}^{\alpha}P\left(t\right)$ (23)

where we introduce a Riemann–Liouville (RL) fractional operator. Here, we define the RL integral:

${D}_{t}^{-\alpha}\left[g\left(t\right)\right]\equiv \frac{1}{\Gamma \left(\alpha \right)}{\int}_{0}^{t}\frac{g\left({t}^{\prime}\right)d{t}^{\prime}}{{\left(t-{t}^{\prime}\right)}^{1-\alpha}}$ (24)

and the RL derivative:

${D}_{t}^{\alpha}\left[g\left(t\right)\right]\equiv {D}_{t}^{n}\left[{D}_{t}^{-\left(n-\alpha \right)}\left[g\left(t\right)\right]\right]$ (25)

with the operator index in the range $n-1\le \alpha \le n$ for integer n. We use the notation ${D}_{t}\left[g\left(t\right)\right]=\frac{dg\left(t\right)}{dt}.$ Note that fractional rate Equation (23) reduces to ordinary rate Equation (9) in the limit $\alpha =1$, since the gamma function diverges at zero argument. Consequently, the solution to Equation (23) must reduce to the exponential in this limit.

The solution to fractional rate Equation (23) was obtained by the mathematician Mittag-Leffler at the turn of the twentieth century:

$P\left(t\right)={E}_{\alpha}\left(-{\left(\lambda t\right)}^{\alpha}\right)$ (26)

in terms of the infinite series that now bears his name:

${E}_{\alpha}\left(z\right)\equiv {\sum}_{n=0}^{\infty}\frac{{z}^{n}}{\Gamma \left(n\alpha +1\right)}$ (27)

The time dependence of the Mittag-Leffler function (MLF) is extremely interesting. At early times, the MLF has the analytic form of the stretched exponential:

${E}_{\alpha}\left(-{\left(\lambda t\right)}^{\alpha}\right)\approx \exp \left[-\frac{{\left(\lambda t\right)}^{\alpha}}{\Gamma \left(\alpha +1\right)}\right]$ (28)

at late times, it has the analytic form of an inverse power law:

${E}_{\alpha}\left(-{\left(\lambda t\right)}^{\alpha}\right)\approx \frac{{\left(\lambda t\right)}^{-\alpha}}{\Gamma \left(1-\alpha \right)}$ (29)

and the MLF smoothly joins these two asymptotic expressions. Consequently, the relatively benign statistics of Poisson at $\alpha =1$ become the intermittent inverse power law statistics at $0<\alpha <1.$ The complexity of the resulting statistics is captured in the power law index, much like the allometry exponent captures the complexity of the fractal structure of allometric phenomena.
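Both asymptotic regimes of the MLF can be verified in the special case α = 1/2, where the MLF has the closed form E_{1/2}(−x) = exp(x²)erfc(x); the stretched exponential and the inverse power law then follow from standard functions alone (the sample points are illustrative):

```python
import math

# Special case alpha = 1/2, where the MLF has the closed form
# E_{1/2}(-x) = exp(x**2) * erfc(x), so both asymptotic regimes can be checked
def mlf_half(x):
    return math.exp(x * x) * math.erfc(x)

# Early times: stretched exponential exp(-x / Gamma(1 + alpha)), Gamma(3/2) = sqrt(pi)/2
x_early = 0.01
stretched = math.exp(-x_early / (math.sqrt(math.pi) / 2.0))

# Late times: inverse power law x**-1 / Gamma(1 - alpha), Gamma(1/2) = sqrt(pi)
x_late = 10.0
power_law = 1.0 / (x_late * math.sqrt(math.pi))

print(mlf_half(x_early), stretched, mlf_half(x_late), power_law)
assert abs(mlf_half(x_early) - stretched) < 1e-3
assert abs(mlf_half(x_late) - power_law) / power_law < 0.02
```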

#### 3.2. Fractional Phase Space Equations

In the previous subsection, the fractional rate equation was given by Equation (23), using subordination to generalize the ordinary time derivative to the RL fractional derivative. Now, it is necessary to further extend the argument to the dynamic variable, where the probability that the dynamic variable, $Z\left(t\right)$, lies in the interval ($z,z+dz$) at time t is the phase space quantity, $P(z,t)dz$. The construction of the fractional partial differential equation for the pdf in both “space” and time and the scaling method for obtaining a solution have been presented elsewhere [33], but for the sake of completeness, we present those details here, as well. The fractional phase space equation (FPSE) with fractional derivatives in both space and time is:

${D}_{t}^{\alpha}\left[P\left(z,t\right)\right]-\frac{{t}^{-\alpha}}{\Gamma \left(1-\alpha \right)}{P}_{0}\left(z\right)={K}_{\beta}{\partial}_{\left|z\right|}^{\beta}\left[P\left(z,t\right)\right]$ (30)

where ${D}_{t}^{\alpha}\left[\cdot \right]$ is the RL fractional derivative in time, ${\partial}_{\left|z\right|}^{\beta}\left[\cdot \right]$ is the Riesz–Feller fractional derivative in one space dimension, ${P}_{0}\left(z\right)$ is the initial value of the pdf, typically taken to be the delta function $\delta \left(z\right)$, and ${K}_{\beta}$ is a generalized diffusion coefficient. Equation (30) is sometimes called the fractional Fokker–Planck equation (FFPE) with zero potential, because it can be generalized by introducing a potential function in complete analogy with the historical Fokker–Planck equation. It is not necessary to review the fractional calculus in order to understand the solution to Equation (30) in terms of its scaling properties.

When $\alpha =1$, Equation (30) reduces to the anomalous diffusion equation [29,31,126]:

$\frac{\partial P\left(z,t\right)}{\partial t}={K}_{\beta}{\partial}_{\left|z\right|}^{\beta}\left[P\left(z,t\right)\right]$ (31)

This equation will be useful subsequently.
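A minimal random-walk sketch of the superdiffusion governed by this equation uses Cauchy-distributed steps (Lévy index β = 1, an illustrative choice): by stability, the width of the walk grows like t^{1/β} = t rather than the diffusive t^{1/2}:

```python
import random
import math

random.seed(3)

def cauchy_step():
    # Standard Cauchy step (Levy index beta = 1) by inverse-transform sampling
    return math.tan(math.pi * (random.random() - 0.5))

n_walkers, n_steps = 20_000, 50
finals = sorted(sum(cauchy_step() for _ in range(n_steps)) for _ in range(n_walkers))

# By stability, a sum of n standard Cauchy variables is Cauchy with scale n,
# so the width grows like t**(1/beta) = t rather than the diffusive t**(1/2)
q1 = finals[n_walkers // 4]
q3 = finals[3 * n_walkers // 4]
scale_est = (q3 - q1) / 2.0   # the interquartile range of a Cauchy of scale g is 2g
print(scale_est)
assert abs(scale_est - n_steps) / n_steps < 0.1
```

The quartile-based width estimate is used because the variance of a Lévy walk with β < 2 diverges, so a sample standard deviation would not converge.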

The Fourier transform of the symmetric Riesz–Feller operator ${\partial}_{\left|z\right|}^{\beta}\left[\cdot \right]$ acting on an analytic function, $f\left(z\right)$, is [29,126]:

$FT\left\{{\partial}_{\left|z\right|}^{\beta}\left[f\left(z\right)\right];k\right\}=-{\left|k\right|}^{\beta}\tilde{f}\left(k\right)$ (32)

where $\tilde{f}\left(k\right)$ is the Fourier transform of $f\left(z\right).$ The Laplace transform of an RL fractional time derivative ${D}_{t}^{\alpha}\left[\cdot \right]$ acting on the analytic function, $g\left(t\right)$, is:

$LT\left\{{D}_{t}^{\alpha}\left[g\left(t\right)\right];u\right\}={u}^{\alpha}\widehat{g}\left(u\right)$ (33)

where $\widehat{g}\left(u\right)$ is the Laplace transform of $g\left(t\right)$. Consequently, the phase space dynamics given by Equation (30) can be expressed as the Fourier–Laplace transform:

${u}^{\alpha}{\widetilde{P}}^{\ast}\left(k,u\right)-{u}^{\alpha -1}{\tilde{P}}_{0}\left(k\right)=-{K}_{\beta}{\left|k\right|}^{\beta}{\widetilde{P}}^{\ast}\left(k,u\right)$ (34)

and the asterisk denotes the double transform. Therefore, the solution when ${P}_{0}\left(z\right)=\delta \left(z\right)$ in Fourier–Laplace space is:

${\widetilde{P}}^{\ast}\left(k,u\right)=\frac{{u}^{\alpha -1}}{{u}^{\alpha}+{K}_{\beta}{\left|k\right|}^{\beta}}$ (35)

The pdf that solves the FPSE is given by the inverse Fourier–Laplace transform of Equation (35). We note that the space-time representations of the solution to the FFPE for various combinations of α and β and potential functions are reviewed by Klafter and Metzler [29], who show how to derive Equation (30) using the continuous time random walk (CTRW) of Montroll and Weiss [124].

The inverse Laplace transform of Equation (35) yields the MLF, just as obtained in Section 3.1:

$\tilde{P}\left(k,t\right)={E}_{\alpha}\left(-{K}_{\beta}{\left|k\right|}^{\beta}{t}^{\alpha}\right)$ (36)

which is the characteristic function for the process. The inverse Fourier transform of the characteristic function yields the probability density:

$P\left(z,t\right)=\frac{1}{2\pi}{\int}_{-\infty}^{\infty}{e}^{-ikz}{E}_{\alpha}\left(-{K}_{\beta}{\left|k\right|}^{\beta}{t}^{\alpha}\right)dk$ (37)

When $\alpha =1$, we know that the MLF reduces to an exponential, in which case, the solution is the characteristic function for the alpha-stable Lévy distribution in space with a Lévy index $0<\beta \le 2$ and a “width” that increases linearly with time:

$\tilde{P}\left(k,t\right)=\exp \left(-{K}_{\beta}{\left|k\right|}^{\beta}t\right)$ (38)

The series representation for the Lévy distribution is given in a number of places [27,31,127,128].

A variety of other solutions to the FFPE have been obtained by Mainardi [129], including the inverse Fourier transform for $\beta =2$, in which case, the solution asymptotically relaxes as the inverse power law ${t}^{-\alpha /2}$.
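The β = 1 case of the Lévy characteristic function can be inverted numerically and checked against its known closed form, the Cauchy density; the values K_β = t = 1 are assumed for illustration:

```python
import math

# Numerically invert the Levy characteristic function exp(-K |k|**beta t) for
# beta = 1, K = t = 1 (illustrative values); the result must be the Cauchy density
def levy_pdf(z, k_max=40.0, dk=1e-3):
    # cosine transform, since the characteristic function is even in k
    n = int(k_max / dk)
    s = 0.0
    for i in range(n):
        k = (i + 0.5) * dk   # midpoint rule
        s += math.cos(k * z) * math.exp(-k)
    return s * dk / math.pi

for z in (0.0, 1.0, 3.0):
    exact = 1.0 / (math.pi * (1.0 + z * z))   # Cauchy density
    print(z, levy_pdf(z), exact)
    assert abs(levy_pdf(z) - exact) < 1e-4
```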

#### 3.2.1. Statistics of Allometry Parameters

As an exemplar of the statistics of the network size, consider the growth of the average TBM across species, in which case, the “space” variable is the total body mass. We assume the FFPE for the pdf to be given by:

$\frac{\partial P\left(m,t\right)}{\partial t}=\lambda \frac{\partial}{\partial m}\left[mP\left(m,t\right)\right]+{K}_{\beta}{\partial}_{\left|m\right|}^{\beta}\left[P\left(m,t\right)\right]$ (39)

where the phase space variable is $m=\langle {M}_{i}\rangle =\langle X\rangle$, and the discrete index for species i is suppressed for notational convenience. The Fourier transform of this equation, with $\tilde{P}(k,t)$ the Fourier transform of $P(m,t)$, yields the equation for the characteristic function:

$\frac{\partial \tilde{P}\left(k,t\right)}{\partial t}=-\lambda k\frac{\partial \tilde{P}\left(k,t\right)}{\partial k}-{K}_{\beta}{\left|k\right|}^{\beta}\tilde{P}\left(k,t\right)$

whose solution is [126]:

$\tilde{P}\left(k,t\right)=\exp \left[-\frac{{K}_{\beta}{\left|k\right|}^{\beta}}{\lambda \beta}\left(1-{e}^{-\lambda \beta t}\right)\right]$ (40)

Equation (40) is the characteristic function for a Lévy distribution with Lévy index $\beta <2.$ Note that at early times, $\lambda \beta t\ll 1$, the inverse Fourier transform of Equation (40) is given by Equation (38).

The asymptotic form of the pdf, obtained from the inverse Fourier transform of Equation (40), is therefore given by the inverse power law, that is, the Pareto distribution for the average TBM:

$P\left(m\right)\propto \frac{1}{{m}^{1+\beta}}$ (41)

West and West [52] fit the power law index in the steady-state TBM pdf to a data set of mammalian species tabulated by Heusner [49] and depict the pdf in Figure 3. They constructed a histogram of the interspecies TBM for the 391 mammalian species from these data by partitioning the mass axis into intervals of 20 gm and counting the number of species within each of the intervals. The vertical axis is the relative number of species as a function of TBM. The figure depicts the fit to the logarithmic histogram data points, indicated by dots, starting at a TBM of 1.1 kg. An inverse power law would be a straight line with a negative slope on this log-log graph. Fitting the power law index from the steady-state TBM pdf to the value $\beta =1.67$ yields the solid curve in the figure, which fits the data extremely well. The curve is quite clearly an inverse power law in the interspecies TBM. This coarse-grained description of the interspecies mass statistics indicates great variability in the TBM, which is indeed the case. Moreover, since $\beta <2$, the variance of the interspecies TBM diverges, or, more properly, increases without bound with increasing TBM.
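The fitting step can be sketched on synthetic data. This is not a re-analysis of the Heusner data; it simply draws Pareto variates with the quoted index β = 1.67 and lower cutoff 1.1, then recovers the index with the maximum-likelihood (Hill) estimator:

```python
import random
import math

random.seed(11)
beta_true, m_min = 1.67, 1.1   # the quoted index and cutoff, used here as inputs

# Synthetic Pareto masses: survival probability (m_min / m)**beta,
# sampled by inverse transform
masses = [m_min * random.random() ** (-1.0 / beta_true) for _ in range(50_000)]

# Hill (maximum-likelihood) estimator of the power-law index
beta_hat = len(masses) / sum(math.log(m / m_min) for m in masses)
print(beta_hat)
assert abs(beta_hat - beta_true) < 0.05
```

The Hill estimator is preferred here over a log-log histogram regression because binning introduces bias in heavy-tailed data.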

**Figure 3.**
The average total body mass (TBM) data for the 391 mammalian species tabulated by Heusner [49] are used to construct a histogram. The mass interval is divided into twenty equally spaced intervals on a logarithmic scale, and the number of species within each interval is counted. The quality of the fit using the inverse power law is measured by the squared correlation coefficient, ${r}^{2}=0.998$ (From West and West [52] with permission).


This inverse power law in the average TBM implies a clustering of the fluctuations in mass, with bursts in the number of species near a given mass interspersed with gaps of various lengths in the mass spectrum. However, on closer inspection of these bursts of speciation, each burst is seen to contain a number of smaller bursts, intermittently spaced with gaps in the number of species. This intermittent bursting is characteristic of inverse power law statistics [15].

#### 3.2.2. Urban Variability

Bettencourt et al. [130], in their study of urban scaling, constructed the metric:

${\eta}_{i}=\mathrm{log}\left[\frac{{Y}_{i}}{a{X}_{i}^{b}}\right]$ (42)

which they called the Scale-Adjusted Metropolitan Indicators (SAMIs). They used ${Y}_{i}$ as the observed value of the measure of innovation, wealth or crime for each city, i, with population ${X}_{i}$. They found that a Laplace distribution provides an excellent fit to the normalized SAMI histogram for the statistical residuals, ${\eta}_{i}$, across different cities. However, they made the assumption that the allometry exponent is approximately universal.

Quite independently and contemporaneously, an analogous measure was devised by West and West [52] for the relative variation in the allometry parameters. They argued that since there are independent fluctuations in X and Y, these result in what Warton et al. [131] call equation error, also known as natural variability, natural variation and intrinsic scatter. Considering that ARs are not predictive, but instead summarize vast amounts of data [132], this natural variability was interpreted as fluctuations in the modeling allometry parameters ($a,b$). Denoting the fitted values of the random parameters as $\overline{a}$ and $\overline{b}$, if the fluctuations are assumed to be contained in the allometry coefficient, West and West [52] define the residual in the allometry coefficient:

${\eta}_{i}=\mathrm{log}\left[\frac{\langle {Y}_{i}\rangle}{\overline{a}{\langle {X}_{i}\rangle}^{\overline{b}}}\right]$ (43)

The numerator and denominator in Equation (43) are measured independently, and in the case they were investigating, $\langle {Y}_{i}\rangle$ is the average BMR and $\langle {X}_{i}\rangle$ is the average TBM. The statistics of the normalized allometry coefficient, ${e}^{{\eta}_{i}}$, were determined by least squares fitting to the data to be given by a Pareto distribution. On the other hand, when the fluctuations are assumed to be contained in the allometry exponent, they define the residual:

${\eta}_{i}=\frac{\mathrm{log}\left[\langle {Y}_{i}\rangle /\overline{a}\right]}{\mathrm{log}\langle {X}_{i}\rangle}$ (44)

In their analysis, the allometry coefficient and exponent were held fixed, so that the parametric fluctuations, fitted by a histogram, gave the best fit to be that of a Laplace distribution centered on the fitted value of the allometry exponent. This latter result is completely consistent with that of Bettencourt et al. [130].

Both research groups reach the conclusion that the Laplace distributions for the statistics of the allometry exponent imply the inverse power law pdf in the size of the network. This overlap of interpretation was reached in spite of the fact that in one case, the data consisted of independent measures of BMR and TBM, which is a convex AR, and the other was on independent measurements of city economic quantities and populations in a given year, which is a concave AR. The convergence of conclusions reached in these two studies suggests the necessity of statistical measures being foundational for understanding allometry in general, as had been argued previously [15].

#### 3.2.3. Scaling Solution

Uchaikin [133] directly inverse transformed Equation (35) for arbitrary α and β, but that level of mathematical detail is not necessary for the present analysis. For our present purposes, the desired insight is provided by directly utilizing the scaling properties of Equation (36), considering the solution in the form of the inverse Fourier transform:

$P\left(z,t\right)=\frac{1}{2\pi}{\int}_{-\infty}^{\infty}{e}^{-ikz}{E}_{\alpha}\left(-{K}_{\beta}{\left|k\right|}^{\beta}{t}^{\alpha}\right)dk$ (45)

The series expansion for the MLF allows one to write for the scaling:

$P\left(Az,Bt\right)=\frac{1}{A}{\sum}_{n=0}^{\infty}\frac{{\left(-{K}_{\beta}{t}^{\alpha}{B}^{\alpha}/{A}^{\beta}\right)}^{n}}{\Gamma \left(n\alpha +1\right)}\frac{{c}_{n}}{{z}^{n\beta +1}}$ (46)

where the second factor in the summation, with constant coefficients ${c}_{n}$, is the result of applying the Tauberian Theorem to the inverse Fourier transform of ${\left|k\right|}^{n\beta}$. A scaling equation emerges when the parameters satisfy the equality $A={B}^{\alpha /\beta}$, resulting in:

$P\left(Az,Bt\right)=\frac{1}{A}P\left(z,t\right)$

If we now select the time parameter to be $B=1/t$, we can write:

$P\left(z,t\right)=\frac{1}{{t}^{\alpha /\beta}}P\left(\frac{z}{{t}^{\alpha /\beta}},1\right)$

Finally, the pdf that solves the FPSE in terms of the similarity variable, $z/{t}^{{\mu}_{z}}$, satisfies the scaling equation:

$P\left(z,t\right)=\frac{1}{{t}^{{\mu}_{z}}}{F}_{z}\left(\frac{z}{{t}^{{\mu}_{z}}}\right),\phantom{\rule{2em}{0ex}}{\mu}_{z}=\frac{\alpha}{\beta}$ (47)

The function ${F}_{z}\left(\cdot \right)$ in Equation (47) is left unspecified, but it is analytic in the similarity variable, $z/{t}^{{\mu}_{z}}$. As mentioned in the Introduction, a standard diffusion process, $Z\left(t\right)$, is the displacement of a diffusing particle from its initial position at time t, and for vanishingly small dissipation, the scaling parameter is ${\mu}_{z}=1/2$ and the functional form of ${F}_{z}\left(\cdot \right)$ is a Gauss distribution. However, for general complex phenomena, there is a broad class of distributions for which the functional form of ${F}_{z}\left(\cdot \right)$ is not Gaussian and the scaling index ${\mu}_{z}\ne 1/2$. For example, the α-stable Lévy process [25,26,126,127] scales in this way, and the Lévy index is in the range $0<\alpha \le 2$, with the equality holding for the Gauss distribution; the scaling index is related to the Lévy index by ${\mu}_{z}=1/\alpha $.
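The scaling structure can be confirmed directly on the characteristic function: E_α(−K_β|k|^β t^α) depends on k and t only through the similarity combination |k|t^{α/β}. The check below again uses the α = 1/2 closed form of the MLF, with β = 1 and K_β = 1 as illustrative values:

```python
import math

# The characteristic function E_alpha(-K |k|**beta t**alpha) depends on k and t
# only through |k| * t**(alpha/beta). Checked here with the alpha = 1/2 closed
# form E_{1/2}(-y) = exp(y**2) erfc(y), taking beta = 1 and K = 1 (illustrative)
alpha, beta, K = 0.5, 1.0, 1.0

def char_fn(k, t):
    y = K * abs(k) ** beta * t ** alpha
    return math.exp(y * y) * math.erfc(y)

# Pairs (k, t) sharing the similarity variable |k| * t**(alpha/beta) = 2 must coincide
pairs = [(1.0, 4.0), (2.0, 1.0), (4.0, 0.25)]
vals = [char_fn(k, t) for k, t in pairs]
print(vals)
assert max(vals) - min(vals) < 1e-12
```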

#### 3.2.4. Allometry Relations

Of course, the stochastic variables of interest here are not necessarily space and time; they are the measures of functionality and the size of the allometry phenomena being investigated. Banavar et al. [134] used scaling theory in which z is interpreted as the TBM, population abundance or both, and t is the area of the region over which the population roams, and they obtain an expression similar in form to Equation (47). A sequence of four additional hypotheses establishes a framework for the analysis of diverse empirical macroecological laws. The fractional calculus approach leading to Equation (47) is less ambitious than the scaling theory of Banavar et al. [134], but has the virtue of being able to systematically study the influence of the environment on the process of interest through the inclusion of an external force, as we did in the case of the TBM in Equation (39).

Let us now replace the space and time discussion of the previous sections with the variables of interest in an allometry context. We identify the function variable $y=Y$ with z, the average measure of size $x=\langle X\rangle$ with t and the exponent, ${\mu}_{z}$, with b. In this way, the scaling pdf can be written in terms of phase space variables as:

$P\left(y,x\right)=\frac{1}{{x}^{b}}{F}_{y}\left(\frac{y}{{x}^{b}}\right)$ (48)

for a generic allometry process. The average functionality of interest is therefore given by:

$\langle Y\rangle ={\int}_{-\infty}^{\infty}yP\left(y,x\right)dy=a{x}^{b}$

in agreement with Equation (8). The allometry coefficient is given as the average similarity variable $q=y/{x}^{b}$:

$a=\langle q\rangle ={\int}_{-\infty}^{\infty}q{F}_{y}\left(q\right)dq$ (49)

therefore, the scaling properties of the pdf solution to the fractional phase space equation entail allometry.
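The chain from scaling pdf to allometry relation can be checked by Monte Carlo: draw q from a fixed scaling function F_y (an exponential pdf here, a purely illustrative choice), set y = q x^b, and the ensemble average obeys ⟨Y⟩ = ⟨q⟩x^b at every size x:

```python
import random

random.seed(2)
b, q_mean = 0.75, 0.7   # illustrative exponent and mean of the scaling variable q

# Draw q from a fixed scaling function F_y (an exponential pdf, purely illustrative),
# set y = q * x**b, and check that <Y> = <q> * x**b at each size x
for x in (10.0, 1000.0):
    ys = [random.expovariate(1.0 / q_mean) * x ** b for _ in range(200_000)]
    y_avg = sum(ys) / len(ys)
    print(x, y_avg, q_mean * x ** b)
    assert abs(y_avg - q_mean * x ** b) / (q_mean * x ** b) < 0.02
```

The allometry coefficient a = ⟨q⟩ is recovered as the average similarity variable, independent of x, which is the content of the scaling argument.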

Here, we noted that the allometry coefficient is determined by Equation (49), the average similarity variable. It probably does not need emphasis, but the scaling variable is precisely the quantity that was defined by the SAMI measure in Equation (42), and its average is here shown to determine the level of the allometry relation. The allometry exponent is a different matter; it is given by the ratio of the scaling index, α, for the fractional derivative in time and the scaling index, β, for the fractional derivative in “space”. Consequently, the ratio denotes a balance between the memory of the underlying process, with $\alpha =1$ indicating no memory, and the nonlocal nature in the phase space of the variate, with $\beta =2$ indicating a homogeneous local process, such as obtained in classical diffusion. However, the allometry relation, being as it is a relationship between averages, only yields their ratio. Untangling their separate contributions to allometry requires a more detailed statistical study.