Extreme Value Laws for Superstatistics

We study the extreme value distribution of stochastic processes modeled by superstatistics. Classical extreme value theory asserts that (under mild asymptotic independence assumptions) only three limit distributions are possible, namely the Gumbel, Fr\'echet and Weibull distributions. On the other hand, superstatistics contains three important universality classes, namely $\chi^2$-superstatistics, inverse $\chi^2$-superstatistics, and lognormal superstatistics, all maximizing different effective entropy measures. We investigate how the three classes of extreme value theory are related to the three classes of superstatistics. We show that for any superstatistical process whose local equilibrium distribution does not live on a finite support, the Weibull distribution cannot occur. Under the above mild asymptotic independence assumptions, we also show that $\chi^2$-superstatistics generally leads to extreme value statistics described by a Fr\'echet distribution, whereas inverse $\chi^2$-superstatistics, as well as lognormal superstatistics, lead to extreme value statistics associated with the Gumbel distribution.


I. INTRODUCTION
Superstatistics [1][2][3][4][5][6][7][8][9][10][11][12] is a powerful technique to model and/or analyze complex systems with two (or more) clearly separated time scales in the dynamics. The basic idea is to consider for the theoretical modeling a superposition of many systems in local equilibrium, each with its own inverse temperature β, and finally perform an average over the fluctuating β which are distributed according to some probability density g(β). Most generally, the parameter β need not be inverse temperature but can be any system parameter that exhibits large-scale fluctuations, such as energy dissipation in a turbulent flow, or volatility in financial markets. Ultimately all expectation values relevant for the complex system under consideration are averaged over this distribution g(β).
In almost all of the above applications one is interested in extreme values of a suitable variable of the complex system under consideration, which is described by a particular class of superstatistics. For example, for superstatistical models of the dynamics of share price changes, which are often well approximated by either $\chi^2$-superstatistics (equivalent to Tsallis statistics [28,29]) or lognormal superstatistics [8], extreme negative values correspond to share price crashes. Or, for sea level fluctuations produced by surges [19], extreme positive values of a surge may lead to overtopping of flood defence systems, thus leading to flooding with all its far-reaching physical, economic, and social consequences. For correct risk estimates of extreme events it is very important to map a given superstatistics onto the relevant class of extreme value statistics. This is the topic of this paper.
Clearly, as outlined above, extreme values within a given model or data set produced by the dynamics of a complex system are of notable practical relevance [30,31]. At the same time there is a well developed mathematical theory for their statistical inference [32][33][34][35]. Recently there has been much activity on the rigorous application of extreme value theory to deterministic dynamical systems [36][37][38][39][40][41][42][43][44] and also to stochastically perturbed ones [45][46][47]. A remarkable feature of the dynamical system approach is that there exist some correlations between events, and hence the extreme value theory used to tackle it must account for this correlation, going beyond a theory that is based only on sequences of statistically independent events. In the superstatistics approach, correlations are also present, because parameter changes take place on long time scales, but the relaxation time of the system is short compared to the time scale of these parameter changes, so that local equilibrium is quickly reached. What is missing so far is a general analysis of which types of generalized statistical mechanics lead to which type of extreme value statistics. Here we deal with this question for general superstatistical models.
Our models are general in the sense that we allow for some mild form of statistical dependence of events; thus, it is not necessary to have independent identically distributed random variables.
Extreme value theory quite generally tells us (under suitable asymptotic independence assumptions) that there are only three possible limit distributions, namely the Gumbel, Fréchet and Weibull distributions. On the other hand, superstatistics contains three important universality classes, namely $\chi^2$-superstatistics, inverse $\chi^2$-superstatistics, and lognormal superstatistics, which are quite typical for many complex systems, meaning that most complex systems with time scale separation fall into one of these three classes. Here we show that for any superstatistical process whose local equilibrium distribution does not live on a finite support the Weibull distribution cannot occur. This leaves us with the Fréchet and Gumbel distributions. Under mild asymptotic independence assumptions we show that $\chi^2$-superstatistics generally leads to extreme values distributed according to a Fréchet distribution, whereas inverse $\chi^2$-superstatistics, as well as lognormal superstatistics, lead to Gumbel distributions.
This paper is organized as follows. In Section II we briefly review the most important results from extreme value theory relevant for our purposes. Section III recalls the general concept of superstatistics, and provides some important results on the asymptotic behaviour of the generalized Boltzmann factors underlying this approach. In Section IV we then combine the two approaches, proving our main results, which establish which type of extreme value distribution is relevant for which type of superstatistics. Our concluding remarks are given in Section V.

II. EXTREME VALUE THEORY FOR STATIONARY PROCESSES
Classical extreme value theory is concerned with the probability distribution of unlikely events.
Given a stationary stochastic process $X_1, X_2, \ldots$, consider the random variable $M_n$ defined as the maximum over the first $n$ observations:
$$M_n = \max\{X_1, \ldots, X_n\}. \qquad (1)$$
In many cases the limit of the random variable $M_n$ may degenerate when $n \to \infty$. Analogously to central limit laws for partial sums, the degeneracy of the limit can be avoided by considering a rescaled sequence $a_n (M_n - b_n)$ for suitable normalising values $a_n > 0$ and $b_n \in \mathbb{R}$. Indeed, extreme value theory studies the existence of normalising values such that
$$P\big(a_n (M_n - b_n) \le x\big) \to G(x) \qquad (2)$$
as $n \to \infty$, with $G(x)$ a non-degenerate probability distribution.
Two cornerstones of Extreme Value Theory are the Fisher-Tippett Theorem [48] and the Gnedenko Theorem [49]. The former asserts that if the limiting distribution $G$ exists, then it must be one of three possible types, whereas the latter gives necessary and sufficient conditions for the convergence to each of the types. A third cornerstone of Extreme Value Theory is the set of Leadbetter conditions [32,50]. These are weak asymptotic independence conditions under which the two previous theorems generalize to stationary stochastic series.
Let us review these results in somewhat more detail.
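In the case where the process $X_i$ is independent and identically distributed (i.i.d.), the Fisher-Tippett Theorem states that if there exist sequences $a_n > 0$ and $b_n \in \mathbb{R}$ such that the limit distribution $G$ is non-degenerate, then it belongs to one of the following types:
Type I: $G(x) = \exp(-e^{-x})$, $-\infty < x < \infty$. This distribution is known as the Gumbel extreme value distribution (e.v.d.).
Type II: $G(x) = 0$ for $x \le 0$ and $G(x) = \exp(-x^{-\alpha})$ for $x > 0$, with $\alpha > 0$. This family of distributions is known as the Fréchet e.v.d.
Type III: $G(x) = \exp(-(-x)^{\alpha})$ for $x \le 0$ and $G(x) = 1$ for $x > 0$, with $\alpha > 0$. This family is known as the Weibull e.v.d.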
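As a small worked illustration of the rescaled limit, the sketch below uses i.i.d. unit-rate exponential variables, for which the law of the maximum is known exactly. The choice of the exponential distribution, the normalisation $a_n = 1$, $b_n = \ln n$, and all function names are assumptions made for this example only; it simply shows the finite-$n$ law approaching the Gumbel form $\exp(-e^{-x})$.

```python
import math


def gumbel_cdf(x: float) -> float:
    """Type I (Gumbel) distribution function G(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


def rescaled_max_cdf(x: float, n: int) -> float:
    """Exact P(M_n - ln n <= x) for n i.i.d. unit-rate exponential variables.

    With a_n = 1 and b_n = ln n, independence gives
    P(M_n <= x + ln n) = F(x + ln n)^n, where F(u) = 1 - exp(-u) for u > 0.
    """
    u = x + math.log(n)
    if u <= 0.0:
        return 0.0
    return (1.0 - math.exp(-u)) ** n


def max_error(n: int, grid=(-1.0, 0.0, 1.0, 2.0)) -> float:
    """Largest gap between the finite-n law and the Gumbel limit on a small grid."""
    return max(abs(rescaled_max_cdf(x, n) - gumbel_cdf(x)) for x in grid)


if __name__ == "__main__":
    for n in (10, 100, 10_000):
        print(n, max_error(n))
```

The printed gap shrinks as $n$ grows, which is the convergence in distribution that the theorem describes for this particular choice of normalising constants.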
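A further extension of this result is the Gnedenko Theorem, which provides a characterization of the convergence in each of these cases. Let $X_1, X_2, \ldots$ be an i.i.d. stochastic process, let $F$ be its distribution function, and let $x_M = \sup\{x \mid F(x) < 1\}$. The following conditions are necessary and sufficient for the convergence to each type of e.v.d.:
Type I: There exists some strictly positive function $h(t)$ such that $\lim_{t \to x_M^-} \frac{1 - F(t + x h(t))}{1 - F(t)} = e^{-x}$ for all real $x$;
Type II: $x_M = +\infty$ and $\lim_{t \to \infty} \frac{1 - F(tx)}{1 - F(t)} = x^{-\alpha}$, with $\alpha > 0$, for each $x > 0$;
Type III: $x_M < \infty$ and $\lim_{t \to 0^+} \frac{1 - F(x_M - tx)}{1 - F(x_M - t)} = x^{\alpha}$, with $\alpha > 0$, for each $x > 0$.
This result implies that the extremal type is completely determined by the tail behaviour of the distribution $F(x)$.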
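The tail criteria can be checked numerically for simple distributions. The sketch below evaluates the Type II ratio for a survival function with a power-law tail and the Type I ratio (with $h \equiv 1$) for a logistic tail. Both survival functions and all names are illustrative assumptions and are not taken from the models studied in this paper.

```python
import math


def power_law_tail(t: float, alpha: float) -> float:
    """Illustrative survival function 1 - F(t) = 1 / (1 + t^alpha), t >= 0."""
    return 1.0 / (1.0 + t ** alpha)


def logistic_tail(t: float) -> float:
    """Illustrative survival function 1 - F(t) = 1 / (1 + exp(t))."""
    return 1.0 / (1.0 + math.exp(t))


def type2_ratio(t: float, x: float, alpha: float) -> float:
    """(1 - F(t x)) / (1 - F(t)); tends to x^(-alpha) for a Frechet-type tail."""
    return power_law_tail(t * x, alpha) / power_law_tail(t, alpha)


def type1_ratio(t: float, x: float) -> float:
    """(1 - F(t + x h(t))) / (1 - F(t)) with h(t) = 1; tends to exp(-x)."""
    h = 1.0
    return logistic_tail(t + x * h) / logistic_tail(t)


if __name__ == "__main__":
    print(type2_ratio(1e3, 2.0, 3.0), 2.0 ** -3.0)
    print(type1_ratio(30.0, 1.5), math.exp(-1.5))
```

For large $t$ the two ratios agree with $x^{-\alpha}$ and $e^{-x}$ respectively, so the first tail belongs to the Fréchet domain and the second to the Gumbel domain.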

B. The Stationary Case
In the case of stationary stochastic processes Leadbetter [32,50] introduced conditions (namely $D(u_n)$ and $D'(u_n)$) on the dependence structure which allow for a reduction to the independent case.
Given a stationary sequence of random variables $X_1, X_2, \ldots$ and a collection of integers $i_1, \ldots, i_p$, let $F_{i_1,\ldots,i_p}$ denote the joint distribution function of the variables $X_{i_1}, \ldots, X_{i_p}$. For brevity, we will write $F_{i_1,\ldots,i_p}(u)$ for $F_{i_1,\ldots,i_p}(u, \ldots, u)$. Given a real sequence $\{u_n\}$, condition $D(u_n)$ is said to hold if for any integers $1 \le i_1 < \cdots < i_p < j_1 < \cdots < j_{p'} \le n$ for which $j_1 - i_p \ge l$, we have
$$\left| F_{i_1,\ldots,i_p,j_1,\ldots,j_{p'}}(u_n) - F_{i_1,\ldots,i_p}(u_n)\, F_{j_1,\ldots,j_{p'}}(u_n) \right| \le \alpha_{n,l}, \qquad (3)$$
where $\alpha_{n,l_n} \to 0$ as $n \to \infty$ for some sequence $l_n = o(n)$.
Let $X_1, X_2, \ldots$ be a stationary sequence and $a_n > 0$ and $b_n$ given constants such that $P(a_n (M_n - b_n) \le x)$ converges to a non-degenerate distribution function $G(x)$. If condition $D(u_n)$ above is satisfied for $u_n = x/a_n + b_n$ for each real $x$, then $G(x)$ has one of the three extreme value forms listed before for the i.i.d. case (see Theorem 3.3.3 in [32]). In other words, condition $D(u_n)$ alone is enough to extend the Fisher-Tippett Theorem to the non-independent case.
To extend the Gnedenko Theorem (which characterizes the convergence to each of the three extreme value types) we need to introduce an additional condition. Given a stationary sequence $X_1, X_2, \ldots$ and a sequence of constants $\{u_n\}$, condition $D'(u_n)$ will be said to hold if
$$\limsup_{n \to \infty}\; n \sum_{j=2}^{[n/k]} P(X_1 > u_n,\, X_j > u_n) \to 0 \quad \text{as } k \to \infty,$$
where $[\,\cdot\,]$ denotes the integer part.
Given a stationary process $X_1, X_2, \ldots$, consider the i.i.d. process $Y_1, Y_2, \ldots$ whose distribution function is the same as that of $X_1$, and whose partial maximum is defined as
$$\hat{M}_n = \max\{Y_1, \ldots, Y_n\}.$$
Suppose that $D(u_n)$ and $D'(u_n)$ are satisfied for the stationary sequence $X_1, X_2, \ldots$ when $u_n = x/a_n + b_n$ for each $x$ ($\{a_n\}$, $\{b_n\}$ being given sequences of constants). Then $P(a_n (M_n - b_n) \le x)$ converges to $G(x)$ for some non-degenerate distribution function $G$ if and only if $P(a_n (\hat{M}_n - b_n) \le x)$ also converges to $G(x)$ (see Theorem 3.5.2 in [32]). In other words, if $D(u_n)$ and $D'(u_n)$ hold, then the stochastic process $X_1, X_2, \ldots$ can be treated as if it were i.i.d.
Condition $D(u_n)$ is a weak form of mixing, which requires the stochastic process to exhibit mild asymptotic independence of the variables, whereas $D'(u_n)$ is a non-clustering condition. These conditions can be replaced by stronger hypotheses, such as $m$-dependence (requiring that $X_i$ and $X_j$ are actually independent if $|i - j| > m$) or strong mixing (a stronger version of (3)). In the case of stationary normal sequences, conditions $D(u_n)$ and $D'(u_n)$ can be replaced by requiring that the correlations between $X_i$ and $X_j$ go to 0 when $|i - j|$ goes to infinity. See [32] for more details.
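To make the role of the non-clustering condition more concrete, the following sketch evaluates the sum $n \sum_{j=2}^{[n/k]} P(X_1 > u_n, X_j > u_n)$ in closed form for two toy sequences, with levels chosen so that $n P(X_1 > u_n) = \tau$. The first is an i.i.d. sequence; the second is the 1-dependent moving maximum $X_i = \max(Z_i, Z_{i+1})$ of i.i.d. $Z_i$, a textbook example of clustering that is introduced here purely for illustration and is not part of the superstatistical models of this paper. All function names are hypothetical.

```python
import math


def dprime_sum_iid(n: int, k: int, tau: float) -> float:
    """n * sum_{j=2}^{[n/k]} P(X_1 > u_n, X_j > u_n) for i.i.d. X_i."""
    q = tau / n                     # P(X_1 > u_n)
    terms = max(n // k - 1, 0)      # indices j = 2, ..., [n/k]
    return n * terms * q * q        # independence: joint probability is q^2


def dprime_sum_moving_max(n: int, k: int, tau: float) -> float:
    """Same sum for X_i = max(Z_i, Z_{i+1}) with i.i.d. Z_i."""
    q = tau / n                     # P(X_1 > u_n)
    p = 1.0 - math.sqrt(1.0 - q)    # P(Z_1 > u_n), since 1 - q = (1 - p)^2
    top = n // k
    if top < 2:
        return 0.0
    # X_1 and X_2 share Z_2: both exceed u_n if Z_2 does, or if Z_1 and Z_3 do.
    joint_adjacent = p + (1.0 - p) * p * p
    joint_far = q * q               # X_1 and X_j are independent for j >= 3
    return n * (joint_adjacent + (top - 2) * joint_far)


if __name__ == "__main__":
    n, tau = 10 ** 6, 1.0
    for k in (10, 100, 1000):
        print(k, dprime_sum_iid(n, k, tau), dprime_sum_moving_max(n, k, tau))
```

In the i.i.d. case the sum behaves like $\tau^2/k$ and vanishes as $k$ grows, whereas for the moving maximum it stays close to $\tau/2$, reflecting the fact that exceedances arrive in pairs.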

A. The Model
Consider a non-equilibrium system that is composed of regions that exhibit spatio-temporal fluctuations of an intensive quantity, for example the inverse temperature β. We consider a non-equilibrium steady state of a macroscopic system, composed of many smaller cells that are temporarily in local equilibrium. Within each cell, β is approximately constant. Each cell is large enough to obey statistical mechanics, but has a different value of the intensive parameter β assigned to it according to a probability density g(β).
Given a distribution $g(\beta)$, we define its associated effective Boltzmann factor as
$$B(E) = \int_0^\infty g(\beta)\, e^{-\beta E}\, d\beta. \qquad (4)$$
The corresponding statistical mechanics based on this generalized Boltzmann factor can be constructed if one introduces more general entropy measures than the usual Shannon entropy and maximizes them subject to suitable constraints, see [10,12,28,51] for examples. If $E$ is the energy of a microstate associated with each cell, $B(E)$ represents the statistics of the statistics ($e^{-\beta E}$) of the cells of the system. The ordinary Boltzmann factor is recovered for $g(\beta) = \delta(\beta - \beta_0)$, where $\delta(\cdot)$ is the Dirac delta function.
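The effective Boltzmann factor is the Laplace transform of $g(\beta)$, so it can be evaluated by simple quadrature. The sketch below does this for a Gamma density, for which the closed form $B(E) = (1 + E/\text{rate})^{-\text{shape}}$ is available as a check. The parameter values, the midpoint rule, and the function names are assumptions of this example.

```python
import math


def gamma_density(beta: float, shape: float, rate: float) -> float:
    """Gamma density with the given shape and rate, evaluated at beta > 0."""
    return rate ** shape * beta ** (shape - 1.0) * math.exp(-rate * beta) / math.gamma(shape)


def effective_boltzmann_factor(g, energy: float, beta_max: float = 60.0, steps: int = 60_000) -> float:
    """Midpoint-rule approximation of B(E) = int_0^inf g(beta) exp(-beta E) d beta.

    The integral is truncated at beta_max, which is adequate when g decays
    quickly (as the Gamma density used below does).
    """
    h = beta_max / steps
    total = 0.0
    for i in range(steps):
        beta = (i + 0.5) * h
        total += g(beta) * math.exp(-beta * energy)
    return total * h


if __name__ == "__main__":
    shape, rate = 2.0, 2.0
    g = lambda beta: gamma_density(beta, shape, rate)
    for energy in (0.0, 3.0, 10.0):
        exact = (1.0 + energy / rate) ** (-shape)
        print(energy, effective_boltzmann_factor(g, energy), exact)
```

At $E = 0$ the result is the normalisation of $g$, and for $E > 0$ the numerical values follow the closed form, which already displays the power-law decay discussed below.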
As a simple paradigmatic example consider a Brownian particle that moves in a changing environment. The particle stays for a while in a certain cell of the system (with a given temperature), then it moves to the next cell (with a different temperature) and so on. In each cell the velocity $v$ obeys the equation $\dot{v} = -\gamma v + \sigma L(t)$, where $L(t)$ is Gaussian white noise. The inverse temperature of each cell is related to the parameters $\gamma$ and $\sigma$ by $\beta \sim \gamma/\sigma^2$. Unlike ordinary Brownian motion, the parameter $\beta$ is not assumed to be constant, but fluctuates according to a probability distribution $g(\beta)$. The stationary probability distribution of $v$ is then obtained by averaging over the fluctuations in $\beta$.
In mathematical terms, consider a stochastic process that over a short time frame is well described by a Gaussian distribution, but whose parameter values fluctuate on a longer time scale. The conditional probability density of $v$ for a given state $\beta$ is given by
$$p(v \mid \beta) = \sqrt{\frac{\beta}{2\pi}}\, \exp\left(-\frac{\beta v^2}{2}\right).$$
As in the example, let $g(\beta)$ be the probability distribution describing the fluctuations of $\beta$; then the long-term stationary probability distribution is obtained by averaging over $\beta$. Therefore its density function $f$ is
$$f(v) = \int_0^\infty g(\beta)\, p(v \mid \beta)\, d\beta. \qquad (5)$$
In terms of the effective Boltzmann factor $B(\cdot)$ given by (4), we have that $f(v) \sim B(v^2/2)$.
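The two-step structure of the model (first draw $\beta$, then draw a Gaussian velocity with variance $1/\beta$) can be sketched as a sampler. Here $\beta$ is taken to be Gamma distributed with illustrative shape and rate parameters; these values, the seed, and the function names are assumptions of this example. The check uses the fact that the variance of $v$ equals the mean of $1/\beta$, which is $\text{rate}/(\text{shape} - 1)$ for a Gamma density with shape larger than 1.

```python
import math
import random


def sample_velocity(rng: random.Random, shape: float, rate: float) -> float:
    """Draw beta from a Gamma density, then v from a Gaussian with variance 1/beta."""
    # random.gammavariate takes (shape, scale), so the scale is 1 / rate.
    beta = rng.gammavariate(shape, 1.0 / rate)
    return rng.gauss(0.0, 1.0 / math.sqrt(beta))


def simulate(n_samples: int, shape: float = 3.0, rate: float = 3.0, seed: int = 1) -> list:
    """Generate n_samples independent velocities from the superstatistical mixture."""
    rng = random.Random(seed)
    return [sample_velocity(rng, shape, rate) for _ in range(n_samples)]


def sample_variance(values) -> float:
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


if __name__ == "__main__":
    velocities = simulate(100_000)
    print(sample_variance(velocities), 3.0 / (3.0 - 1.0))
```

This sampler draws a fresh $\beta$ for every velocity, which corresponds to the marginal distribution only; it does not model the slow temporal evolution of $\beta$ or the resulting correlations between successive observations.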
The importance of the superstatistics concept comes from its generality: one can generalize the above example to any Hamiltonian $H$ determining the energy $E$ of the microstates, and to any distribution $g(\beta)$. Although the superstatistics approach describes a nonequilibrium system with regions of different temperature, methods from equilibrium statistical mechanics can still be formally used, such as generalized maximum entropy principles [10,12].

B. High-Energy Asymptotics
In this paper we deal with the extreme values of a superstatistical model. To this aim, it is crucial to know the tail behaviour of the distribution defining the process. In the case of a superstatistical distribution, the tail is determined by the large-E behaviour of its associated effective Boltzmann factor B(E). We may use the results of [3] about the high-energy asymptotics.
There are three different superstatistical distributions which are commonly found in many applications, namely $g(\beta)$ being given by a $\chi^2$, inverse $\chi^2$ or Lognormal distribution. The $\chi^2$ and inverse $\chi^2$ superstatistics can be seen as representatives of a more general class of probability distributions with a given tail behaviour: power-law tail or exponential tail, respectively. More precisely, the exponential tail for inverse $\chi^2$-superstatistics is an exponential in the square root of the energy.

Power-Law Tail
Assume that the function $g(\beta)$ is such that $g(\beta) \sim \beta^{\gamma}$, $\gamma > 0$, as $\beta \to 0$. In this case the high-energy asymptotic behaviour of the Boltzmann factor is [3]
$$B(E) \sim E^{-\gamma-1}, \quad \text{as } E \to \infty.$$
A typical example which corresponds to the above case is that of $\beta$ being $\chi^2$-distributed. Note that quite generally the $\beta \to 0$ behaviour of $g(\beta)$ determines the $E \to \infty$ behaviour of $B(E)$.
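The power-law decay can be observed numerically by estimating the log-log slope of $B(E)$ between two large energies. The sketch below uses the unnormalised weight $g(\beta) = \beta^{\gamma} e^{-\beta^2}$, chosen only because it has the required small-$\beta$ behaviour without being a Gamma density; the normalisation does not affect the slope. The substitution $\beta = t/E$, the energies, and the function names are assumptions of this example.

```python
import math


def boltzmann_factor_scaled(energy: float, gamma: float, t_max: float = 60.0, steps: int = 60_000) -> float:
    """B(E) for the unnormalised g(beta) = beta^gamma * exp(-beta^2).

    Substituting beta = t / E gives B(E) = (1/E) * int_0^inf g(t/E) exp(-t) dt,
    which keeps the integrand well resolved even for very large E.
    """
    h = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        beta = t / energy
        total += beta ** gamma * math.exp(-beta * beta) * math.exp(-t)
    return total * h / energy


def loglog_slope(gamma: float, e1: float = 1e3, e2: float = 1e4) -> float:
    """Slope of ln B(E) versus ln E between the energies e1 and e2."""
    b1 = boltzmann_factor_scaled(e1, gamma)
    b2 = boltzmann_factor_scaled(e2, gamma)
    return (math.log(b2) - math.log(b1)) / (math.log(e2) - math.log(e1))


if __name__ == "__main__":
    for gamma in (0.5, 1.5, 3.0):
        print(gamma, loglog_slope(gamma), -(gamma + 1.0))
```

The estimated slope is close to $-(\gamma + 1)$ for each value of $\gamma$, in line with the asymptotic statement above.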

Exponential Tail
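Assume $g(\beta)$ is such that $g(\beta) \sim e^{-c/\beta}$, $c > 0$, as $\beta \to 0$. Then, following [3], the effective Boltzmann factor $B(E)$ decays for $E \to \infty$ as an exponential in the square root of the energy. This case is realized if, for example, $g$ is equal to the inverse $\chi^2$ distribution.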
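A quick numerical check of the square-root scaling is to compare $\ln B$ at the energies $E$ and $4E$: if $\ln B(E)$ is dominated by a term proportional to $\sqrt{E}$, the ratio should be close to 2. The sketch below uses the unnormalised weight $g(\beta) = e^{-c/\beta - \beta}$, which is an illustrative assumption and not the inverse $\chi^2$ density itself. The change of variables and the function names are likewise choices made for this example.

```python
import math


def log_boltzmann_factor(energy: float, c: float = 1.0, s_max: float = 10.0, steps: int = 40_000) -> float:
    """ln B(E) for the unnormalised g(beta) = exp(-c/beta - beta).

    With E' = E + 1 and beta = sqrt(c/E') * exp(s), the integrand becomes
    sqrt(c/E') * exp(-2*lam*cosh(s) + s), where lam = sqrt(c*E'). The factor
    exp(-2*lam) is pulled out analytically to avoid floating-point underflow.
    """
    e_prime = energy + 1.0
    lam = math.sqrt(c * e_prime)
    h = 2.0 * s_max / steps
    total = 0.0
    for i in range(steps):
        s = -s_max + (i + 0.5) * h
        total += math.exp(-2.0 * lam * (math.cosh(s) - 1.0) + s)
    return -2.0 * lam + 0.5 * math.log(c / e_prime) + math.log(total * h)


def doubling_ratio(energy: float, c: float = 1.0) -> float:
    """ln B(4E) / ln B(E); approaches 2 when ln B grows like sqrt(E)."""
    return log_boltzmann_factor(4.0 * energy, c) / log_boltzmann_factor(energy, c)


if __name__ == "__main__":
    for energy in (1e2, 1e4, 1e6):
        print(energy, doubling_ratio(energy))
```

The ratio approaches 2 only slowly, because of logarithmic corrections to the leading $\sqrt{E}$ term, so large energies are needed to see the scaling clearly.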

Log-Normal Distribution
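Assume that $g(\beta)$ is equal to the Lognormal distribution whose density function is
$$g(\beta) = \frac{1}{\beta \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln \beta - \mu)^2}{2\sigma^2}\right),$$
where $\mu$ and $\sigma > 0$ are parameters.
For this example the generalized Boltzmann factor takes on the form
$$B(E) = \frac{1}{\sigma \sqrt{2\pi}} \int_0^\infty \frac{1}{\beta} \exp\left(-\frac{(\ln \beta - \mu)^2}{2\sigma^2}\right) e^{-\beta E}\, d\beta.$$
Doing a change of variables $y = \ln \beta$ this transforms into
$$B(E) = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left(-\frac{(y - \mu)^2}{2\sigma^2}\right) \exp\left(-E e^{y}\right) dy.$$
The asymptotic behaviour of $B(E)$ derived in [3] is expressed in terms of the Lambert or product-log function. Here we will need more manageable asymptotics, based on the asymptotics of the characteristic function $\varphi(t)$ of the Lognormal distribution, which is studied in [52]. Relating $B(E)$ to $\varphi$ and using the result of [52], one obtains the asymptotic behaviour of $B(E)$ as $E \to \infty$.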

IV. EXTREME VALUES FOR SUPERSTATISTICAL DISTRIBUTIONS
Consider a stationary stochastic process $X_1, X_2, \ldots$ for which the probability distribution of each random variable is well described by a superstatistical distribution, i.e., whose density function $f(\cdot)$ is given by (5). We are interested in its associated maximum process $M_n$ given by (1) and in the existence of a limiting distribution $G$ as in (2) for suitable constants $a_n > 0$ and $b_n$. Additionally we will assume that $X_1, X_2, \ldots$ satisfies conditions $D(u_n)$ and $D'(u_n)$.
The Gnedenko Theorem implies that the extremal type of the limit $G$ is determined by the tail of the distribution defining the process. When this distribution is a superstatistical distribution, its tail is determined in turn by its associated effective Boltzmann factor. Note that for distributions that do not live on a finite support we have $f(v) > 0$ for any $v \in \mathbb{R}$, and therefore $x_M = \sup\{x \mid F(x) < 1\} = \infty$. Then, as a consequence of the Gnedenko Theorem (see Section II), convergence to a Type III (Weibull) distribution cannot occur, since that type requires $x_M < \infty$.
There are three different distribution functions $g$ for superstatistical models that are commonly encountered in applications: $\chi^2$, inverse $\chi^2$ and Lognormal. The $\chi^2$ and the inverse $\chi^2$ superstatistics can each be understood as particular cases of a more general tail behaviour, namely a power-law tail and an exponential tail, respectively. In this section we show that in the case of a power-law tail the associated maximum process converges to a Type II (Fréchet) distribution, whereas the associated maximum process for the exponential tail and for the superstatistics generated by the Lognormal distribution converges to a Type I (Gumbel) distribution.

A. Power-Law Tail
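Assume that the function $g(\beta)$ is such that $g(\beta) \sim \beta^{\gamma}$, $\gamma > 0$, as $\beta \to 0$. In this case the high-energy asymptotic behaviour of its associated Boltzmann factor is
$$B(E) \sim E^{-\gamma-1}, \quad \text{as } E \to \infty.$$
Then the survival function $\bar{F}(v) = 1 - F(v) = P(X_i > v)$ is given as
$$\bar{F}(v) = \int_v^\infty f(s)\, ds, \qquad f(s) \sim B(s^2/2) \sim s^{-2\gamma-2}.$$
Given two functions $a(x)$ and $b(x)$, define $A(x) = \int_x^\infty a(s)\, ds$ and $B(x) = \int_x^\infty b(s)\, ds$. Then it is straightforward (using L'Hôpital's rule) that $a(x) \sim b(x)$ implies $A(x) \sim B(x)$. Therefore, in the case $g(\beta) \sim \beta^{\gamma}$ we have
$$\bar{F}(v) \sim v^{-2\gamma-1}, \quad \text{as } v \to \infty.$$
Using this asymptotics it is easy to check that
$$\lim_{v \to \infty} \frac{1 - F(vx)}{1 - F(v)} = x^{-(2\gamma+1)}$$
for each $x > 0$. Applying now the Gnedenko Theorem it follows that there exist renormalizing sequences $a_n > 0$ and $b_n$ such that the limiting function $G$ associated with the maximum process $M_n$ exists and is equal to a Fréchet distribution (Type II) with shape parameter $\alpha = 2\gamma + 1$.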
A particular example of this case is realized when $g(\beta)$ is equal to the $\Gamma$-distribution:
$$g(\beta) = \frac{1}{\Gamma(n/2)} \left(\frac{n}{2\beta_0}\right)^{n/2} \beta^{n/2-1} \exp\left(-\frac{n\beta}{2\beta_0}\right),$$
where $\beta_0$ is the average of $\beta$ and $n$ represents the number of degrees of freedom. When $n$ is an integer and $\beta_0 = n/4$ this corresponds to a $\chi^2$ distribution with $n$ degrees of freedom. This distribution behaves as $g(\beta) \sim \beta^{n/2-1}$ around $\beta = 0$, which implies that the limiting function $G$ associated with its maximum process is a Fréchet distribution with shape parameter $\alpha = n - 1$.
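The convergence to a Fréchet law for a power-law tail can be illustrated by a small Monte Carlo experiment. The sketch below does not simulate a superstatistical process; it draws i.i.d. variables with the exact survival function $x^{-\alpha}$ for $x \ge 1$, with $\alpha = 2\gamma + 1$, and uses the normalisation $a_n = n^{-1/\alpha}$, $b_n = 0$. The value of $\gamma$, the block sizes, the seed, and the function names are assumptions of this example.

```python
import math
import random


def frechet_cdf(x: float, alpha: float) -> float:
    """Type II (Frechet) distribution function exp(-x^(-alpha)) for x > 0."""
    return math.exp(-x ** (-alpha)) if x > 0.0 else 0.0


def rescaled_pareto_maximum(rng: random.Random, n: int, alpha: float) -> float:
    """One draw of n^(-1/alpha) * max(X_1, ..., X_n) with P(X_i > x) = x^(-alpha), x >= 1."""
    # Inverse transform: X = U^(-1/alpha) with U uniform on (0, 1], so the
    # largest X corresponds to the smallest U.
    u_min = min(1.0 - rng.random() for _ in range(n))
    return (n * u_min) ** (-1.0 / alpha)


def empirical_cdf_at(x: float, gamma: float, n: int = 200, blocks: int = 2000, seed: int = 7) -> float:
    """Fraction of rescaled block maxima not exceeding x, for alpha = 2*gamma + 1."""
    alpha = 2.0 * gamma + 1.0
    rng = random.Random(seed)
    hits = sum(1 for _ in range(blocks) if rescaled_pareto_maximum(rng, n, alpha) <= x)
    return hits / blocks


if __name__ == "__main__":
    gamma = 1.0
    for x in (0.8, 1.0, 1.5):
        print(x, empirical_cdf_at(x, gamma), frechet_cdf(x, 2.0 * gamma + 1.0))
```

The empirical frequencies agree with the Fréchet distribution function up to sampling noise, which decreases as the number of blocks is increased.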

B. Exponential Tail
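Assume now that $g(\beta)$ is such that $g(\beta) \sim e^{-c/\beta}$, $c > 0$, as $\beta \to 0$. In this case its Boltzmann factor $B(E)$ decays, as $E \to \infty$, as an exponential in the square root of the energy (see Section III). The survival function $\bar{F}(v) = 1 - F(v) = P(X_i > v)$ is then obtained, as $v \to \infty$, by integrating $f(s) \sim B(s^2/2)$ over $s > v$. Using an antiderivative that can be expressed through the error function $\mathrm{erf}(\cdot)$, one obtains the large-$v$ asymptotics of $\bar{F}(v)$.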
If in this case one tries to compute the limit of $(1 - F(xv))/(1 - F(v))$ as $v \to \infty$, one finds that it is equal to the limit of $e^{-(x-1)v}$, which is 0 for $x > 1$ and $\infty$ for $x < 1$. Since this is not of the form $x^{-\alpha}$, the Fréchet type is ruled out.
Using the asymptotics (6) it is easy to check that
$$\lim_{v \to \infty} \frac{1 - F(v + x)}{1 - F(v)} = e^{-x}$$
for all real $x$. Applying the Gnedenko Theorem (with $h(t) \equiv 1$) it follows that there exist renormalizing sequences $a_n > 0$ and $b_n$ such that the limiting function $G$ associated with the maximum process $M_n$ exists and is equal to a Type I (Gumbel) distribution.
A particular example that falls into this class is the case where $g(\beta)$ is given by the inverse $\Gamma$ distribution, where $\beta_0$ is again the average of $\beta$ and $n$ represents the degrees of freedom of the inverse $\chi^2$ distribution. When $n$ is an integer and $\beta_0 = n/4$ this corresponds to an inverse $\chi^2$ distribution with $n$ degrees of freedom.

C. Log-Normal Distribution
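Finally let us assume that $g(\beta)$ is equal to the log-normal distribution
$$g(\beta) = \frac{1}{\beta \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln \beta - \mu)^2}{2\sigma^2}\right),$$
where $\mu$ and $\sigma > 0$ are parameters. The large-$E$ asymptotics of $B(E)$ discussed in Section III then determine the asymptotics of the survival function $\bar{F}(v) = 1 - F(v) = P(X_i > v)$ as $v \to \infty$.
Using the fact that $\int \exp(-e^{\mu} v^2/4)\, v^{-1}\, dv = \tfrac{1}{2}\, \mathrm{Ei}(-\tfrac{1}{4} e^{\mu} v^2) + c$, where $\mathrm{Ei}(\cdot)$ is the exponential integral function and $c$ is a constant, it is not difficult to obtain the corresponding asymptotic expansion of $\bar{F}(v)$ as $v \to \infty$.
Using this asymptotics one can easily verify that
$$\lim_{v \to \infty} \frac{1 - F(v + x h(v))}{1 - F(v)} = e^{-x}$$
for $h(v) = 2e^{-\mu}/v$. Hence the relevant extreme value distribution is again a Type I (Gumbel) distribution.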
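The Type I criterion with a non-constant auxiliary function can be checked numerically in log space. The sketch below rests on two assumptions that are not stated explicitly in the text: that the auxiliary function reads $h(v) = 2e^{-\mu}/v$, and that the tail has the shape $\bar{F}(v) \propto v^{-2} \exp(-e^{\mu} v^2/4)$, which is what the exponential-integral identity quoted above suggests to leading order. The function names and parameter values are hypothetical.

```python
import math


def log_tail(v: float, mu: float) -> float:
    """ln of the assumed tail shape v^(-2) * exp(-e^mu * v^2 / 4), up to a constant."""
    return -2.0 * math.log(v) - math.exp(mu) * v * v / 4.0


def auxiliary_h(v: float, mu: float) -> float:
    """Assumed auxiliary function h(v) = 2 * exp(-mu) / v."""
    return 2.0 * math.exp(-mu) / v


def type1_ratio(v: float, x: float, mu: float) -> float:
    """(1 - F(v + x h(v))) / (1 - F(v)) for the assumed tail, computed via logarithms."""
    shifted = v + x * auxiliary_h(v, mu)
    return math.exp(log_tail(shifted, mu) - log_tail(v, mu))


if __name__ == "__main__":
    mu, x = 0.5, 1.2
    for v in (10.0, 100.0, 1000.0):
        print(v, type1_ratio(v, x, mu), math.exp(-x))
```

Working with logarithms avoids the underflow of $\exp(-e^{\mu} v^2/4)$ at large $v$; the ratio approaches $e^{-x}$ with a correction of order $1/v^2$.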

V. CONCLUSIONS AND OUTLOOK
We have considered stationary stochastic processes whose stationary distribution functions are given in terms of generalized Boltzmann factors, resulting from the fact that the underlying complex system exhibits time scale separation and is well described by a superstatistical dynamics. We have shown that, under mild asymptotic independence assumptions, the maximum process associated with these superstatistical systems converges either to a Type I or a Type II extreme value distribution. More specifically, in the case of $\chi^2$-superstatistics (giving rise to Tsallis statistics) the limiting function is equal to a Type II (Fréchet) distribution. On the other hand, in the case of inverse $\chi^2$ or log-normal superstatistics, the limiting function is equal to a Type I (Gumbel) distribution.
These results are relevant if one considers, for example, historical time series of finite length and tries to extract the statistics of extreme events from the limited data set. Usually not enough extreme events are available to get a reliable result. The method proposed here is to first analyse which type of superstatistics is realized, and then to infer the relevant extreme value distribution from the general result derived in this paper. For example, for temporal changes in sea levels measured at various coastal locations in the UK (surges as produced by meteorological changes), after subtracting the mean sea level and the astronomical tide, a recent analysis [19] shows that the dynamics is best described by $\chi^2$-superstatistics. Our result then implies that the extreme value statistics relevant for these sea level changes is given by a Fréchet distribution. On the other hand, measured accelerations of a tracer particle in fully developed turbulent flows have been shown to be well described by log-normal superstatistics [14]. Our results then imply that extreme acceleration events of this particle follow a Gumbel distribution.