Bridging Extremes: The Invertible Bimodal Gumbel Distribution

This paper introduces a novel three-parameter invertible bimodal Gumbel distribution, addressing the need for a versatile statistical tool capable of simultaneously modeling maximum and minimum extremes in various fields such as hydrology, meteorology, finance, and insurance. Unlike previous bimodal Gumbel distributions available in the literature, our proposed model features a simple closed-form cumulative distribution function, enhancing its computational attractiveness and applicability. This paper elucidates the behavior and advantages of the invertible bimodal Gumbel distribution through detailed mathematical formulations, graphical illustrations, and exploration of distributional characteristics. We illustrate the model by estimating the Value at Risk (VaR) of financial data, considering block maxima and minima simultaneously.


Introduction
Bimodal heavy-tailed distributions are powerful analytical tools for capturing the complex nature of phenomena subject to extreme events in hydrology, meteorology, insurance, reliability, and finance, among other disciplines. Two main features characterize them. Firstly, they exhibit bimodality, meaning they have two distinct peaks or modes, indicating the presence of two prominent regimes within the overall data set. Secondly, they have heavy tails, which means a higher likelihood of extreme values than under light-tailed distributions.
Bimodal heavy-tailed distributions are related to René Thom's catastrophe theory, focusing on systems characterized by sudden, dramatic changes and a propensity for extreme events, e.g., [1,2]. Catastrophe theory deals with situations where small parameter changes can lead to abrupt shifts in the system's state. This concept aligns with bimodal distributions, where a system may switch between two states or regimes. The heavy-tailed aspect of these distributions reflects the likelihood of rare, extreme events, mirroring the focus of catastrophe theory on significant, sudden changes. Both concepts encapsulate the unpredictability and uncertainty inherent in the systems they describe. Catastrophe theory provides a mathematical framework for understanding these dynamics, which can manifest statistically as bimodal, heavy-tailed distributions. This connection is especially relevant in economics and finance, where catastrophic shifts and heavy-tailed distributions are frequently observed. Essentially, the interplay between these concepts helps us to understand and model systems where small inputs or changes can lead to significant, unpredictable, and often extreme outputs or shifts.
Unlike standard bimodal distributions, these heavy-tailed versions give significant weight to extreme events, allowing for a more accurate representation of systems where outliers or "black swans" play a critical role. For example, one can perform inference over tails of financial returns by fitting an appropriate limiting distribution over data that exceeds a fixed threshold, where the dual peaks in such a distribution can indicate states or types of behavior within the system [3]. In addition, a bimodal distribution typically exhibits higher entropy than a unimodal distribution because two distinct modes add complexity and unpredictability to the system. In this way, the concept of entropy dovetails nicely with the inherent complexity of such data, providing a quantitative lens through which to assess it and to strategize accordingly.
Meanwhile, under certain conditions, statistics of extreme events are described by theoretical distributions. For example, the Gumbel, also known as the extreme value or Fisher-Tippett type I distribution, is a limiting distribution for the maximum (or the minimum) of a sufficiently large simple random sample. This result arises from the Fisher-Tippett-Gnedenko theorem, which states that the normalized maximum of a sequence of such random samples converges to one of three types of extreme value distribution: Gumbel, Fréchet, or Weibull. The Gumbel case is suitable for some typical families of population distributions, such as logistic, Gaussian, and gamma.
Nevertheless, practical situations demand more, and therefore we find several generalizations of the Gumbel distribution that make it more flexible, for example, the two-component extreme value distribution or mixture of two Gumbel distributions [4], the exponentiated Gumbel [5], the transmuted extreme value [6], the generalized Gumbel [7], the generalized three-parameter Gumbel [8], the beta-Gumbel [9], the Kumaraswamy-Gumbel [10], and the exponentiated generalized Gumbel [11]. However, some lead to non-identifiable models because of observationally equivalent parameterizations [12]. There are other closely related models such as the exponentiated Gumbel Type-2 [13], the Kumaraswamy generalized exponentiated Gumbel type-2 [14], the bimodal generalized extreme value (GEV) [15], and a bimodal Gumbel distribution applied to environmental data [16]. However, the disadvantage of the latter model is that its cumulative distribution function does not have a simple closed form.
In this work, we put forward an invertible bimodal Gumbel distribution whose cumulative distribution function has a simple closed form, making it more attractive for computational procedures and more flexible in applications (Section 2). Our suggested distribution allows us to model both maximum and minimum simultaneously, while the classical Gumbel distribution describes only one of the extremes (maximum or minimum).
After discussing the maximum likelihood estimation of the parameters from simulated data (Section 3), we illustrate our approach using two financial data sets to estimate the value at risk (VaR) in Section 4. As we are interested in studying the tails of the probability distribution, we use the block maxima technique, among the available tools, to find the appropriate cutoff. Instead of the usual power law distribution [3,17,18], we suggest a bimodal Gumbel distribution as a candidate model to describe the tail behavior of financial returns.

Main Results
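Also known as the type I extreme value distribution, the Gumbel distribution is one of the limit distributions of normalized maximum (or minimum) statistics [19], belonging to the class of the GEV distribution [20]. We denote the Gumbel random variable Y with a location parameter µ ∈ R and a scale parameter σ > 0 as Y ∼ G(·; µ, σ). The forms of its probability density function (PDF) and cumulative distribution function (CDF) are, respectively,

$$g(y; \mu, \sigma) = \frac{1}{\sigma} \exp\left\{ -\frac{y - \mu}{\sigma} - \exp\left[ -\frac{y - \mu}{\sigma} \right] \right\}, \quad y \in \mathbb{R}, \tag{1}$$

and

$$G(y; \mu, \sigma) = \exp\left\{ -\exp\left[ -\frac{y - \mu}{\sigma} \right] \right\}, \quad y \in \mathbb{R}. \tag{2}$$

Let us introduce our suggested generalization of the Gumbel distribution, left open by [15], in the following way. Considering the transformation

$$T(x) = x|x|^{\delta}, \quad x \in \mathbb{R}, \tag{3}$$

after plugging it into (1) and (2), we obtain the invertible bimodal Gumbel distribution X with CDF and PDF given by, respectively,

$$F_{IBG}(x; \mu, \sigma, \delta) = \exp\left\{ -\exp\left[ -\frac{x|x|^{\delta} - \mu}{\sigma} \right] \right\} \tag{4}$$

and

$$f_{IBG}(x; \mu, \sigma, \delta) = \frac{(\delta + 1)|x|^{\delta}}{\sigma} \exp\left\{ -\frac{x|x|^{\delta} - \mu}{\sigma} - \exp\left[ -\frac{x|x|^{\delta} - \mu}{\sigma} \right] \right\}, \tag{5}$$

where δ ≥ 0 and µ ∈ R are shape parameters and σ > 0 is a scale parameter. We shall denote it as X ∼ F_IBG(·; µ, σ, δ) throughout this paper.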
To illustrate the role of its parameters, Figure 1 depicts the effect of the shape parameter δ. When δ = 0, the model (5) reduces to the unimodal Gumbel (1). The density becomes bimodal for δ > 0, and the separation between the modes grows as δ increases. Figure 2 contrasts the PDF shapes with negative and positive values of µ, illustrating its role as a location parameter (left) or shape parameter (right). Finally, Figure 3 shows that σ remains the scale parameter.
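As a minimal sketch of how the closed-form CDF and PDF could be evaluated, the Python snippet below assumes that the model is the Gumbel CDF with location µ and scale σ evaluated at the transformation T(x) = x|x|^δ. The function names `transform`, `ibg_cdf`, and `ibg_pdf` are illustrative only and do not come from any library.

```python
import math

def transform(x, delta):
    """Assumed transformation T(x) = x * |x|**delta (increasing for delta >= 0)."""
    return x * abs(x) ** delta

def ibg_cdf(x, mu, sigma, delta):
    """Gumbel CDF evaluated at the transformed point T(x)."""
    z = (transform(x, delta) - mu) / sigma
    return math.exp(-math.exp(-z))

def ibg_pdf(x, mu, sigma, delta):
    """Derivative of ibg_cdf: T'(x) times the Gumbel density at T(x)."""
    z = (transform(x, delta) - mu) / sigma
    jacobian = (delta + 1.0) * abs(x) ** delta  # T'(x)
    # Not hardened against overflow of exp(-z) for extremely negative z.
    return jacobian / sigma * math.exp(-z - math.exp(-z))
```

Under this assumption, δ = 0 gives back the classical Gumbel functions, while for δ > 0 the factor |x|^δ forces the density to vanish at x = 0, which is what separates the two modes.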

Some Distributional Characteristics
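1. Modes. Straightforwardly from the concept of the modes of X ∼ F_IBG(·; µ, σ, δ), one can find that they are the solutions of the equation obtained by setting the derivative of the PDF to zero,

$$T''(x) - \frac{[T'(x)]^{2}}{\sigma} \left\{ 1 - \exp\left[ -\frac{T(x) - \mu}{\sigma} \right] \right\} = 0, \tag{6}$$

where T(x) = x|x|^δ,

$$T'(x) = (\delta + 1)|x|^{\delta} \tag{7}$$

and

$$T''(x) = \delta(\delta + 1)\, \mathrm{sign}(x)\, |x|^{\delta - 1}, \tag{8}$$

with sign(x) = x/|x| as the sign function.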
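The modes can also be located numerically without solving the mode equation. The sketch below scans the density on a grid and reports strict local maxima; it assumes the PDF has the form T'(x) g(T(x)) with T(x) = x|x|^δ and g the Gumbel density, and the helper name `find_modes` and its grid defaults are hypothetical choices made for illustration.

```python
import math

def ibg_pdf(x, mu, sigma, delta):
    """Assumed IBG density: (delta + 1)|x|^delta / sigma times the Gumbel kernel at T(x)."""
    z = (x * abs(x) ** delta - mu) / sigma
    return (delta + 1.0) * abs(x) ** delta / sigma * math.exp(-z - math.exp(-z))

def find_modes(mu, sigma, delta, lo=-5.0, hi=5.0, steps=10000):
    """Return the grid points that are strict local maxima of the density.

    The default range is only suitable for moderate parameter values;
    a much wider range can overflow math.exp in the far left tail.
    """
    h = (hi - lo) / steps
    xs = [lo + i * h for i in range(steps + 1)]
    fs = [ibg_pdf(x, mu, sigma, delta) for x in xs]
    return [xs[i] for i in range(1, steps) if fs[i - 1] < fs[i] > fs[i + 1]]

# With delta = 0 the scan finds a single mode; with delta = 1 it finds one
# mode on each side of the origin.
print(find_modes(0.0, 1.0, 0.0))
print(find_modes(0.0, 1.0, 1.0))
```

A grid scan only resolves the modes up to the grid spacing, so a root finder applied to the mode equation would be the natural refinement when higher accuracy is needed.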
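2. Moments. We can write down the kth moment of X as

$$E(X^{k}) = \int_{-\infty}^{\infty} x^{k} f_{IBG}(x; \mu, \sigma, \delta)\, dx. \tag{9}$$

By the substitution y = T(x) and taking the inverse function T^{-1}(y) = sign(y)|y|^{1/(δ+1)}, we can express (9) in terms of a unimodal Gumbel Y, as defined in (2), as

$$E(X^{k}) = E\left[ |Y|^{k/(\delta+1)} I_{\{Y \ge 0\}} \right] + (-1)^{k} E\left[ |Y|^{k/(\delta+1)} I_{\{Y < 0\}} \right], \tag{10}$$

where I_A is the indicator function of an event A.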
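The representation of the moments through a unimodal Gumbel variable lends itself to a quick numerical check. The following sketch assumes X = T^{-1}(Y) with T(x) = x|x|^δ and Y Gumbel with location µ and scale σ, so that E(X^k) = E[sign(Y)^k |Y|^{k/(δ+1)}], and approximates the expectation with a plain midpoint rule. The function name `ibg_moment`, the integration limits, and the number of steps are assumptions chosen for illustration.

```python
import math

def gumbel_pdf(y, mu, sigma):
    """Classical Gumbel (maximum) density with location mu and scale sigma."""
    z = (y - mu) / sigma
    return math.exp(-z - math.exp(-z)) / sigma

def ibg_moment(k, mu, sigma, delta, steps=100000):
    """Midpoint-rule approximation of E[X^k] for a positive integer k."""
    # The Gumbel mass outside [mu - 6 sigma, mu + 45 sigma] is negligible.
    lo, hi = mu - 6.0 * sigma, mu + 45.0 * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * h
        sign = 1.0 if y >= 0 else -1.0
        total += (sign ** k) * abs(y) ** (k / (delta + 1.0)) * gumbel_pdf(y, mu, sigma)
    return total * h

# For delta = 0 the first moment should be close to mu + sigma * (Euler's constant).
print(ibg_moment(1, 0.0, 1.0, 0.0))
# A bimodal case with delta = 1.
print(ibg_moment(1, 0.0, 1.0, 1.0), ibg_moment(2, 0.0, 1.0, 1.0))
```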
3. Moment-Generating Type Function. The moment-generating function (MGF) encapsulates information about the distributional moments, being a helpful tool to characterize an IBG random variable X. For our convenience, however, we consider its power transformation X^{δ+1} and derive its MGF, shown in (13), as follows. Starting from its definition (11) and considering the substitution y = e^{−T(x)/σ} together with the inverse transformation T^{−1}, we can rewrite the integral (11) in terms of the upper and lower incomplete Gamma functions,

$$\Gamma(a; x) = \int_{x}^{\infty} u^{a-1} e^{-u}\, du \quad \text{and} \quad \gamma(a; x) = \int_{0}^{x} u^{a-1} e^{-u}\, du,$$

which yields (13). Now, we can retrieve the moments of X^{δ+1} by taking derivatives of the cumulant-generating type function, C_X(t) = ln ϕ_X(t). As usual, we rely on the expansion ln z = (z − 1) − (z − 1)^2/2 + (z − 1)^3/3 − ..., and we get, for example, the first two moments of X^{δ+1} by taking the first two derivatives of C_X. To benchmark our result, if δ = 0, from (13) and (16), we find E(X) = µ + σγ and Var(X) = σ^2 π^2/6, because Γ(a; x) + γ(a; x) = Γ(a), where Ψ(x) = d ln Γ(x)/dx = Γ′(x)/Γ(x), Ψ(1) = −γ, γ is Euler's constant, and Ψ′(1) = π^2/6. Thus, we get the mean and variance of the basic Gumbel model as expected, which confirms the accuracy of (13).
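4. Quantiles. Sampling by the inverse transform is a basic method with which to generate a pseudo-random variate of X, based on the quantile function of F_IBG. While the bimodal Gumbel model introduced previously by [16] does not provide a simple way to perform this method, our suggested model (5) yields a simple expression for the quantile function. Since X is an absolutely continuous random variable, denoting the cumulative probability as the standard uniform random variable F_IBG(x_q; µ, σ, δ) = q ∼ U[0, 1], we obtain the random quantile function as

$$x_{q} = T^{-1}\big(\mu - \sigma \ln(-\ln q)\big) = \mathrm{sign}\big(\mu - \sigma \ln(-\ln q)\big)\, \big|\mu - \sigma \ln(-\ln q)\big|^{1/(\delta+1)}. \tag{17}$$

5. Entropy. The differential entropy of the bimodal Gumbel distribution X = T^{-1}(Y), where Y ∼ G(·; µ, σ) denotes the basic Gumbel distribution, is given by

$$H(X) = -\int_{-\infty}^{\infty} g(T(x))\, T'(x) \ln\left[ g(T(x))\, T'(x) \right] dx, \tag{18}$$

where g is the PDF of Y, as defined in (1), and T(x) = x|x|^δ. By substituting y = T(x) in (18), we obtain

$$H(X) = H(Y) - \ln(\delta + 1) - \frac{\delta}{\delta + 1} E\left( \ln|Y| \right), \tag{19}$$

where H(Y) denotes the differential entropy of Y.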
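The inverse-transform sampler described under Quantiles can be sketched in a few lines of Python. The code assumes the CDF F(x) = exp{−exp[−(T(x) − µ)/σ]} with T(x) = x|x|^δ, so that the quantile is T^{-1} applied to the Gumbel quantile; the names `ibg_quantile`, `ibg_cdf`, and `ibg_sample` are illustrative and the seed is arbitrary.

```python
import math
import random

def ibg_quantile(q, mu, sigma, delta):
    """Invert the assumed CDF for a probability 0 < q < 1."""
    y = mu - sigma * math.log(-math.log(q))  # Gumbel quantile
    return math.copysign(abs(y) ** (1.0 / (delta + 1.0)), y)  # T^{-1}(y)

def ibg_cdf(x, mu, sigma, delta):
    """Assumed CDF, used here only to check the round trip."""
    z = (x * abs(x) ** delta - mu) / sigma
    return math.exp(-math.exp(-z))

def ibg_sample(n, mu, sigma, delta, seed=0):
    """Inverse-transform sampling: push uniform draws through the quantile function."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # u = 0 is skipped because log(0) is undefined
            draws.append(ibg_quantile(u, mu, sigma, delta))
    return draws

sample = ibg_sample(5, mu=0.0, sigma=1.0, delta=1.0)
print(sample)
```

Applying `ibg_cdf` to `ibg_quantile(q, ...)` should return q up to rounding error, which is a convenient sanity check on any implementation of the pair.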

Parameter Estimation
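This section discusses the maximum likelihood (ML) method to estimate the parameter vector Θ = (µ, σ, δ). Let x_1, ..., x_n be realizations of independent copies of a random variable with PDF as defined in (5). The log-likelihood function is

$$\ell(\Theta) = n \ln(\delta + 1) - n \ln \sigma + \delta \sum_{i=1}^{n} \ln|x_{i}| - \sum_{i=1}^{n} \frac{x_{i}|x_{i}|^{\delta} - \mu}{\sigma} - \sum_{i=1}^{n} \exp\left[ -\frac{x_{i}|x_{i}|^{\delta} - \mu}{\sigma} \right]. \tag{20}$$

This log-likelihood function is well-defined across the entire parameter space and is continuous and differentiable with respect to the parameter vector. Additionally, the family of distributions F_IBG is identifiable, meaning that different parameter values lead to distinct probability distributions, which is necessary for the likelihood function to have a unique maximizer. Ahmad et al. (2010) [21] showed the identifiability of the finite mixture of Gumbel distributions; in particular, the family of a Gumbel component, 𝓕_G = {G : G = G(·; µ, σ) as in (2)}, is identifiable. Based on this, we have that the IBG family, 𝓕_IBG = {F_IBG : F_IBG(·; µ, σ, δ) as in (4)}, is identifiable. It must be proven that

$$F_{IBG}(x; \mu_{1}, \sigma_{1}, \delta_{1}) = F_{IBG}(x; \mu_{2}, \sigma_{2}, \delta_{2}) \ \text{for all } x \in \mathbb{R} \implies (\mu_{1}, \sigma_{1}, \delta_{1}) = (\mu_{2}, \sigma_{2}, \delta_{2}). \tag{21}$$

Since 𝓕_G is identifiable, then µ_1 = µ_2 and σ_1 = σ_2. Thus, the equality in (21) is valid if and only if

$$x|x|^{\delta_{1}} = x|x|^{\delta_{2}}, \tag{22}$$

which only happens when δ_1 = δ_2, for any x ∈ R.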
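The ML estimates µ̂, σ̂, and δ̂ are the solution of the system of likelihood equations

$$\frac{\partial \ell(\Theta)}{\partial \mu} = 0, \quad \frac{\partial \ell(\Theta)}{\partial \sigma} = 0, \quad \frac{\partial \ell(\Theta)}{\partial \delta} = 0. \tag{23}$$

After algebraic manipulations, we get the unique closed-form solution for estimating µ,

$$\hat{\mu} = -\hat{\sigma} \ln\left[ \frac{1}{n} \sum_{i=1}^{n} \exp\left( -\frac{x_{i}|x_{i}|^{\hat{\delta}}}{\hat{\sigma}} \right) \right]. \tag{24}$$

However, the estimates σ̂ and δ̂ must be obtained numerically.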
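One way to organize the numerical part is to profile µ out with its closed-form expression and search only over (σ, δ). The sketch below does this under the assumption that the log-likelihood comes from the density T'(x) g(T(x)) with T(x) = x|x|^δ. The zooming grid search is a deliberately crude stand-in for whatever optimizer is preferred, not the procedure used in this paper, and the names `profile_fit` and `fit_ibg`, the search ranges, and the simulated sample are all hypothetical.

```python
import math
import random

def transform(x, delta):
    return x * abs(x) ** delta  # assumed T(x) = x |x|^delta

def profile_fit(data, sigma, delta):
    """Closed-form mu for fixed (sigma, delta) and the log-likelihood it attains."""
    n = len(data)
    t = [transform(x, delta) for x in data]
    a = [-ti / sigma for ti in t]
    m = max(a)  # log-sum-exp shift keeps the exponentials bounded
    log_mean = m + math.log(sum(math.exp(ai - m) for ai in a) / n)
    mu_hat = -sigma * log_mean
    # At mu_hat the sum of exp(-z_i) equals n, which gives the trailing "- n".
    loglik = (n * math.log(delta + 1.0) - n * math.log(sigma)
              + delta * sum(math.log(abs(x)) for x in data)
              - sum(t) / sigma + n * mu_hat / sigma - n)
    return mu_hat, loglik

def fit_ibg(data, rounds=4, points=15):
    """Crude zooming grid search over (sigma, delta); mu is profiled out."""
    s_lo, s_hi, d_lo, d_hi = 0.2, 5.0, 0.0, 4.0
    best = None
    for _ in range(rounds):
        s_step = (s_hi - s_lo) / (points - 1)
        d_step = (d_hi - d_lo) / (points - 1)
        for i in range(points):
            for j in range(points):
                s, d = s_lo + i * s_step, d_lo + j * d_step
                mu, ll = profile_fit(data, s, d)
                if best is None or ll > best[0]:
                    best = (ll, mu, s, d)
        _, _, s, d = best
        s_lo, s_hi = max(s - 2 * s_step, 1e-3), s + 2 * s_step
        d_lo, d_hi = max(d - 2 * d_step, 0.0), d + 2 * d_step
    return best[1], best[2], best[3]

# Simulated data by inverse transform with hypothetical true values (0, 1, 1).
rng = random.Random(1)
data = []
while len(data) < 500:
    u = rng.random()
    if u > 0.0:
        y = 0.0 - 1.0 * math.log(-math.log(u))
        data.append(math.copysign(abs(y) ** 0.5, y))

mu_hat, sigma_hat, delta_hat = fit_ibg(data)
print(mu_hat, sigma_hat, delta_hat)
```

Profiling reduces the search from three dimensions to two, which is the practical benefit of having a closed-form expression for µ̂; a gradient-based or simplex optimizer could replace the grid search without changing `profile_fit`.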
Tables 1-3 report the empirical expected values, bias, MSE, and SE of the ML estimators of the IBG model. Figures 4-6 illustrate the empirical behavior of the MSE versus n. Overall, the MSE decreases as the sample size increases, in line with the consistency of ML estimators expected from statistical inference theory. In this study, we did not face numerical problems in estimating these parameters.
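The summary measures reported in such Monte Carlo tables can be computed from the replicated estimates as in the sketch below. It assumes that SE denotes the sample standard deviation of the estimates across replications, which is a common convention but is not spelled out here; the function name `summarize` and the numbers in the usage example are hypothetical.

```python
import math

def summarize(estimates, true_value):
    """Empirical mean, bias, MSE and SE of a list of replicated estimates."""
    r = len(estimates)
    mean = sum(estimates) / r
    bias = mean - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / r
    # Assumed convention: SE is the sample standard deviation over replications.
    se = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (r - 1))
    return {"mean": mean, "bias": bias, "mse": mse, "se": se}

# Hypothetical estimates of a parameter whose true value is 1.0.
print(summarize([0.9, 1.1, 1.2, 0.8, 1.0], true_value=1.0))
```

With these definitions the MSE decomposes into the squared bias plus (r − 1)/r times the squared SE, so a shrinking MSE reflects a reduction in bias, in variability, or in both.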

Application
We use two financial data sets taken from https://finance.yahoo.com (accessed on 1 December 2021) to illustrate the applicability of the invertible bimodal Gumbel model. The first is the daily stock prices of Petrobras (PETR4), quoted in US dollars, from 1 March 2000 to 10 January 2021, totaling 5,465 observations. The other is the daily exchange rate of the Brazilian real against the US dollar (USD/BRL) from 12 January 2003 to 15 October 2021, totaling 4,223 data points. We aim to get the value-at-risk (VaR) of these data, a common measure of financial risk. It denotes the maximum loss incurred on a portfolio over a specific time horizon with a given confidence level 1 − α [22]. It is expressed in probabilistic terms through F(x), the cumulative distribution function (CDF) of a real random variable X_t observed at time t ∈ {0, 1, 2, ...}, and a small prespecified probability 0 < α < 1.
Particularly, in our study, the time horizon comprises the totality of data in each discrete time series. Moreover, as X_t is an absolutely continuous random variable with an invertible CDF, we can write the VaR as a quantile obtained from F^{−1}, where F^{−1} denotes the inverse function of F, and x_α is the α-quantile of X_t. As usual, X_t means the log return of prices, that is,

$$X_{t} = \ln\left( \frac{P_{t}}{P_{t-1}} \right),$$

where P_t is a price at time t.

Table 4 summarizes descriptive statistics for PETR4 and USD/BRL returns. The return averages are close to zero, and the proximity between the absolute values of the first and third quartiles indicates the possible symmetry of the data, except for the possible extreme values suggested by the maximum (PETR4) and minimum (USD/BRL) statistics. Indeed, Figures 7 and 8 depict extreme returns, some of them due to the COVID-19 event. In the critical period of the pandemic, Petrobras shares plummeted 57% due to low demand for petroleum products (Figure 7). As for the exchange rate USD/BRL, the effect was the opposite: the US dollar appreciated against the Brazilian real because of various political and economic reasons. Furthermore, in these data sets, we observe extreme positive (PETR4) and negative (USD/BRL) values that stand out significantly from the rest of the observations.

Now, we perform the block maxima and minima method to extract extreme values from our data. Let X_1, ..., X_n be a random sample of log returns following X ∼ F_IBG. Based on its realized values {x_t}, t = 1, ..., n, we organize it into T non-overlapping sub-samples of length N, where T means the integer part of n/N, resulting in T data blocks of size N. We choose N to cover natural periods (e.g., a week or month) so that the new sub-sample is IID. Now, we take the maximum and the minimum over each N-history. We define the jth sub-sample maximum and minimum as

$$M_{j} = \max\{ x_{(j-1)N+1}, \ldots, x_{jN} \}$$

and

$$m_{j} = \min\{ x_{(j-1)N+1}, \ldots, x_{jN} \}$$

for j = 1, ..., T.
This results in a new sample of size 2T, consisting of maxima and minima, {M_1, ..., M_T, m_1, ..., m_T}. For our case, N = 15 is a block length providing IID sub-samples based on the Ljung-Box test for serial independence with a significance level of 5%. The left panels of Figures 9 and 10 depict the series of extreme PETR4 and USD/BRL returns extracted from the blocks, while the right ones show the bimodal form of their distributions.
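The block extraction described above can be sketched as follows. The code computes log returns from a price series, splits them into non-overlapping blocks of length N, and pools the block maxima and minima into a single sample of size 2T. The function names `log_returns` and `block_extremes` and the toy price series are illustrative only.

```python
import math

def log_returns(prices):
    """X_t = ln(P_t / P_{t-1}) for consecutive prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def block_extremes(returns, block_len):
    """Maximum and minimum of each non-overlapping block of length block_len."""
    n_blocks = len(returns) // block_len  # integer part of n / N
    maxima, minima = [], []
    for j in range(n_blocks):
        block = returns[j * block_len:(j + 1) * block_len]
        maxima.append(max(block))
        minima.append(min(block))
    return maxima + minima  # pooled sample of size 2T

# Toy price series; trailing returns that do not fill a block are dropped.
prices = [100.0, 101.0, 99.5, 102.0, 103.0, 98.0, 97.5, 100.0]
print(block_extremes(log_returns(prices), block_len=3))
```

Choosing the block length is a separate step; here it would be guided by a serial-independence check such as the Ljung-Box test applied to the extracted sub-samples.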
Thus, we fit these empirical distributions of extreme returns using our suggested invertible bimodal Gumbel model, F_IBG(x; Θ), with parameters µ, σ, and δ. Table 5 shows their maximum likelihood estimates. Figure 11 depicts the fitted model against the corresponding distribution of the extracted extreme returns, indicating that the IBG is suitable for simultaneously describing minimum and maximum extreme returns. Finally, Table 6 presents the estimated VaR for α = 10%, 5%, and 1%. As we are dealing with logarithmic returns, to make these VaR values more understandable, we may consider that the maximum return of the stock is exp(VaR) − 1. Thus, for example, over a 15-day period, we do not expect a return greater than 7.4% for PETR4 and 2.6% for USD/BRL with a confidence of 90%.
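The conversion from a VaR expressed as a log return to a simple return is a one-line computation, sketched below. The function names are illustrative, and the input value 0.0714 is a hypothetical log return chosen only because it maps to roughly 7.4%; it is not taken from Table 6.

```python
import math

def log_to_simple_return(log_return):
    """Convert a log return r into the simple return exp(r) - 1."""
    return math.expm1(log_return)

def simple_to_log_return(simple_return):
    """Inverse conversion: ln(1 + simple return)."""
    return math.log1p(simple_return)

# Hypothetical log-return VaR of 0.0714 expressed as a simple return.
print(log_to_simple_return(0.0714))
```

For small returns the two scales nearly coincide, but the gap widens for the extreme values that VaR is meant to capture, which is why the conversion is worth making explicit.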

Conclusions
This paper introduced and examined the IBG distribution as an extension of the classical Gumbel distribution. We addressed the limitations of the unimodal Gumbel by proposing a model capable of simultaneously representing both maximum and minimum extremes, enhancing its applicability and versatility. The mathematical formulations accompanied by illustrative figures elucidate the characteristics and behavior of the proposed distribution, emphasizing its advantages in terms of computational efficiency and flexibility. We presented its distributional properties, including modes, moments, the moment-generating type function, quantiles, and entropy. In our illustration, we applied the block maxima and minima technique to obtain serially independent data for the VaR estimation through the ML method, offering a novel perspective on modeling extremes through the lens of the invertible bimodal Gumbel distribution.

Table 1. Means, biases, mean squared errors (MSE), and standard errors (SE) of the estimated parameters from 100 Monte Carlo replications of samples with n = 50.

Table 2. Means, biases, mean squared errors (MSE), and standard errors (SE) of the estimated parameters from 100 Monte Carlo replications of samples with n = 100.

Table 3. Means, biases, mean squared errors (MSE), and standard errors (SE) of the estimated parameters from 100 Monte Carlo replications of samples with n = 1000.