Extreme value analysis for mixture models with heavy-tailed impurity

This paper deals with extreme value analysis for triangular arrays that appear when some parameters of a mixture model vary as the number of observations grows. When the mixing parameter is small, it is natural to associate one of the components with "an impurity" (in the case of a regularly varying distribution, a "heavy-tailed impurity"), which "pollutes" the other component. We show that the set of possible limit distributions is much more diverse than in the classical Fisher-Tippett-Gnedenko theorem, and provide numerical examples showing the efficiency of the proposed model for studying the maximal values of stock returns.


Introduction
Consider the mixture model $F(x; \varepsilon, \vec{\theta}) = (1 - \varepsilon) F^{(1)}(x; \vec{\theta}^{(1)}) + \varepsilon F^{(2)}(x; \vec{\theta}^{(2)})$, (1) where $\varepsilon \in (0, 1)$ is a mixture parameter, $F^{(1)}(x; \vec{\theta}^{(1)})$ and $F^{(2)}(x; \vec{\theta}^{(2)})$ are CDFs of two distributions parametrised by the vectors $\vec{\theta}^{(1)}$ and $\vec{\theta}^{(2)}$, respectively, and $\vec{\theta} = (\vec{\theta}^{(1)}, \vec{\theta}^{(2)})$. In this paper we focus on the case when the second component in this mixture corresponds to some heavy-tailed distribution, while the first one can be either light- or heavy-tailed. When $\varepsilon$ is small, the second component can be referred to as the heavy-tailed impurity. Some applications of this approach are described by Grabchak and Molchanov (2015). For instance, in population dynamics, this approach can be used for modelling the migration of species: the distance of migration of most species can be modelled by a light-tailed distribution, but there is a small number of species with "very active" behaviour. It is worth mentioning that the parameters $\varepsilon$ and $\vec{\theta}$ may depend on the number of available observations, denoted below by $n$. For instance, in the aforementioned example from population dynamics, the proportion of "very active" species decays when the total number of species grows. In this context, the distribution of the resulting variable changes with $n$, and this model can be considered as an infinitesimal triangular array, that is, a collection of real random variables $\{X_{nj}, j = 1..k_n\}$, $k_n \to \infty$ as $n \to \infty$, such that $X_{n1}, ..., X_{nk_n}$ are independent for each $n$. The classical limit theorems for this class of models are well known in the literature, see, e.g., the monographs by Petrov (2012) and Meerschaert and Scheffler (2001). For instance, it is known that the class of possible non-degenerate limit laws of the sums $X_{n1} + ... + X_{nk_n} - c_n$, with deterministic $c_n$ and a triangular array $\{X_{nj}, j = 1..k_n\}$ satisfying the assumption of infinite smallness, coincides with the class of infinitely divisible distributions. Surprisingly, there are very few papers dealing with the extreme value analysis for this model.
To the best of our knowledge, there exist no general statements describing the class of non-degenerate limits of $(\max_{j = 1..k_n} X_{nj} - b_n)/a_n$ (2) with deterministic $a_n$, $b_n$. Clearly, the convergence to types theorem is applicable to this situation, and guarantees that the limit law is determined up to a change of location and scale. Nevertheless, unlike in the well-known Fisher-Tippett-Gnedenko theorem, the class of limit distributions in (2) includes not only the Gumbel, Fréchet and Weibull laws. Some conditions guaranteeing the convergence of the triangular array to some limit are given by Freitas and Hüsler (2003), but their result essentially employs the assumption that the limit distribution is twice differentiable, which is violated in the examples of the model (1) provided below. Let us mention here that the other known papers on this topic concentrate on particular examples yielding convergence to the Gumbel law, see Anderson, Coles and Hüsler (1997), Dkengne, Eckert and Naveau (2016).
In the first part of the paper (Section 2) we consider the particular case of (1) when the first component has the Weibull distribution (and is therefore in the maximum domain of attraction, MDA, of the Gumbel law), while the heavy-tailed impurity is modelled by a regularly varying distribution (MDA of the Fréchet law). Note that in the classical setting, when the parameters $\varepsilon$ and $\vec{\theta}$ are fixed, the limit behaviour of the sum is determined by the second component, and therefore the maximum under proper normalisation converges to the Fréchet law. Interestingly enough, even in the case when only one parameter (namely, the mixing parameter $\varepsilon$) varies, the set of possible limit distributions includes the Gumbel and Fréchet distributions and also one discontinuous law. The exact statement is formulated in Theorem 1.
In Section 3 we turn to a more complicated model, which appears when one uses a truncated regularly varying distribution for the second component, and the truncation level grows with $n$. This part of our research is motivated by a discussion concerning the choice between truncated and non-truncated Pareto-type distributions, see Beirlant, Alves and Gomes (2016). The asymptotic behaviour depends on the rate of growth of the truncation level: as we show, the resulting conditions are related to the soft, hard and intermediate truncation regimes introduced by Chakrabarty and Samorodnitsky (2012). Note that in that paper it is shown that the softly truncated regularly varying distribution has heavy tails (understood in the sense of a non-Gaussian limit law for the sum), and therefore the term "heavy-tailed impurity" can also be used for models of this kind.
The main theoretical contribution of our research is formulated as Theorem 2, dealing with the case when both the mixing parameter and the truncation level depend on $n$. It turns out that the set of possible limit laws in (2) includes six different distributions, and, for some sets of parameters, the maximal value diverges under any (including nonlinear) normalisation. Our theoretical findings are illustrated by a simulation study (Section 4).
The choice of the Weibull distribution for the first component in (1) is partially based on the great popularity of this distribution in applications, see, e.g., the overview by Laherrère and Sornette (1998). As we show in Section 5, our model with heavy-tailed impurity is more appropriate for modelling stock returns than a "pure" model. In this context, our paper continues the discussion started in the paper by Malevergne, Pisarenko and Sornette (2005), where it is shown that the tails of the empirical distribution of log-returns decay slower than the tails of the Weibull distribution but faster than a power law.

Weibull-RV mixture
In this section, we focus on a particular case of the model (1), namely where $\vec{\theta} = (\lambda, \tau, \alpha)$, $F_1$ is the distribution function of the Weibull law, and $F_2$ corresponds to a regularly varying distribution on $[x_0, \infty)$, with $x_0 = \inf\{x > 0 : F_2(x) > 0\}$ and a continuous slowly varying function $L(\cdot)$. Let us recall that by definition, $\lim_{x \to \infty} L(tx)/L(x) = 1$, $\forall t > 0$, and the term "slow variation" comes from the property $x^{-\delta} L(x) \to 0$ and $x^{\delta} L(x) \to \infty$ as $x \to \infty$ (6) for every $\delta > 0$. An extensive overview of the properties of slowly varying functions is given in [4] and [18].
As we already mentioned in the introduction, the first component is in the MDA of the Gumbel law, while the second is in the MDA of the Fréchet law. In Appendix A, we show that the mixture distribution function is in the MDA of the Fréchet law, provided that the parameters $\varepsilon$ and $\vec{\theta}$ are fixed.
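The regular variation of the mixture tail for fixed parameters can be checked numerically. The following is a minimal sketch, assuming a Weibull survival function $e^{-\lambda x^{\tau}}$ and an exact Pareto tail $(x/x_0)^{-\alpha}$ (that is, a constant slowly varying function); the function names and parameter values are illustrative and are not taken from the paper. The ratio of the mixture survival function at $tx$ and at $t$ should approach $x^{-\alpha}$ as $t$ grows, which characterises the Fréchet domain of attraction.

```python
import math


def mixture_survival(x, eps, lam, tau, alpha, x0=1.0):
    """Survival function of (1 - eps) * Weibull + eps * Pareto.

    Assumed forms (hypothetical, for illustration only):
    Weibull tail exp(-lam * x**tau), Pareto tail (x / x0)**(-alpha) for x >= x0.
    """
    weibull_tail = math.exp(-lam * x ** tau)
    pareto_tail = (x / x0) ** (-alpha) if x >= x0 else 1.0
    return (1.0 - eps) * weibull_tail + eps * pareto_tail


def tail_ratio(t, x, **params):
    """Ratio of the mixture survival function at t * x and at t."""
    return mixture_survival(t * x, **params) / mixture_survival(t, **params)


params = dict(eps=0.05, lam=1.0, tau=1.0, alpha=3.0)
for t in (10.0, 100.0, 1000.0):
    # The second printed value is the regularly varying limit x**(-alpha) with x = 2.
    print(t, tail_ratio(t, 2.0, **params), 2.0 ** (-params["alpha"]))
```

For moderate $t$ the Weibull component still contributes to the tail, so the ratio is visibly below the limit; for large $t$ the Pareto component dominates.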
In what follows we consider the case when the mixing parameter $\varepsilon = \varepsilon_n$ decays to zero as $n$ grows. It is natural to slightly generalise the model to the form of a row-wise independent triangular array, where $k_n$ is an unbounded increasing sequence, and for any $n$, the r.v.'s $X_{nj}$, $j = 1..k_n$, are independent. The set-up allowing various numbers of elements in different rows is standard both in studying the classical limit laws (see [17]) and in extreme value theory (see [7]).
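The row-wise structure described above can be sketched in code as follows. This is an illustration under stated assumptions rather than the set-up used in the paper: the Weibull component is taken with survival function $e^{-\lambda x^{\tau}}$, the impurity is an exact Pareto law with tail $(x/x_0)^{-\alpha}$, and the choices $k_n = n$ and $\varepsilon_n = \log(n)/n$ are hypothetical.

```python
import numpy as np


def sample_row(rng, k, eps, lam=1.0, tau=1.0, alpha=3.0, x0=1.0):
    """Draw one row of k i.i.d. values from (1 - eps) * Weibull + eps * Pareto."""
    from_impurity = rng.random(k) < eps
    # Weibull with survival exp(-lam * x**tau), by inverse-transform sampling.
    weibull = (rng.exponential(size=k) / lam) ** (1.0 / tau)
    # Pareto with survival (x / x0)**(-alpha), x >= x0; 1 - U lies in (0, 1].
    pareto = x0 * (1.0 - rng.random(k)) ** (-1.0 / alpha)
    return np.where(from_impurity, pareto, weibull)


def simulate_maxima(n, n_rep, k_of_n, eps_of_n, seed=0):
    """Simulate n_rep independent copies of the maximum of the n-th row."""
    rng = np.random.default_rng(seed)
    k, eps = k_of_n(n), eps_of_n(n)
    return np.array([sample_row(rng, k, eps).max() for _ in range(n_rep)])


# Hypothetical rates: k_n = n and eps_n = log(n) / n.
maxima = simulate_maxima(
    n=1000, n_rep=200, k_of_n=lambda n: n, eps_of_n=lambda n: np.log(n) / n
)
print(maxima.mean(), np.median(maxima))
```

Changing `eps_of_n` makes it possible to explore how the rate of decay of the mixing parameter affects the empirical distribution of the row maxima.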
As we show in the next theorem, the asymptotic behaviour of the maximum in this model is determined by the rate of growth of , the rate of decay of and the slowly varying function . Note that the rates of log and are compared in terms of the following three alternative conditions,  In all cases, one can take if (A3) holds, and ( ) →˜for some˜> 0 as → ∞. The normalising sequences can be fixed in the form (10).
Proof The proof is given in Appendix B.
The graphical representation of this result is presented in Figure 1.

Weibull-truncated RV mixture
Now we consider a more complicated model, in which the distribution of the second component in (3) also changes as $n$ grows. Consider the mixture distribution where, as before, $\vec{\theta} = (\lambda, \tau, \alpha)$, $F_1$ is the distribution function of the Weibull law (see (4)), while $\tilde{F}_2$ is the upper-truncated regularly varying distribution, with $F_2$ corresponding to a regularly varying distribution (5). It is interesting to note that the components in this model correspond to different maximum domains of attraction: the maximum for the first component under proper normalisation converges to the Gumbel law, while the maximum for the second converges to the Weibull law, see Appendix C.
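For illustration, sampling from an upper-truncated heavy-tailed law can be sketched as follows. The code assumes the common form of upper truncation at a level $M$, namely the CDF $F_2(x)/F_2(M)$ for $x \le M$, together with an exact Pareto $F_2$; both are assumptions made for this example, and the function name and parameter values are hypothetical.

```python
import numpy as np


def sample_truncated_pareto(rng, size, alpha, M, x0=1.0):
    """Inverse-transform sampling from a Pareto law truncated from above at M.

    Assumed CDF: F2(x) / F2(M) on [x0, M], where F2(x) = 1 - (x / x0)**(-alpha).
    """
    F_M = 1.0 - (M / x0) ** (-alpha)   # mass of the untruncated law below M
    u = rng.random(size)               # uniform on [0, 1)
    return x0 * (1.0 - u * F_M) ** (-1.0 / alpha)


rng = np.random.default_rng(0)
samples = sample_truncated_pareto(rng, size=10_000, alpha=3.0, M=10.0)
print(samples.min(), samples.max())
```

All simulated values lie between $x_0$ and $M$, so for a fixed truncation level the sample maximum is bounded by $M$, in line with the bounded support of the truncated component.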
By analogy with (7), we consider the triangular array in which the row lengths and the truncation levels are unbounded increasing sequences, and the r.v.'s within each row are independent. Note that the classical limit laws for this model (the law of large numbers and limit theorems for the sums) are essentially established in [16].
The next theorem reveals the asymptotic behaviour of the maximal value depending on the rates of the row length, the mixing parameter and the truncation level, and on the properties of the slowly varying function. An important difference from the model considered in Section 2 is that in some cases the limit distribution is degenerate for any (including non-linear) normalising sequence.
It turns out that if tends to any finite constant, then the limit distribution is Gumbel. Our findings in the remaining case → ∞ are presented in Table 1. The asymptotic behaviour of the maximum is determined by the asymptotic properties of the sequences , in terms of (A1)-(A3), and the rate of growth of in terms of the following three alternative conditions:

The conditions (M1)-(M3) are related to the notions of hard and soft truncation. Following [5], we say that a variable is truncated softly, if For a regularly varying distribution of , the condition (14) holds if there exists for any > 0 4 . Analogously, is truncated hard, that is, if there exists ∈ (0, 1/ ) such that / → 0. Our results for the case (M1) (see the first row in Table 1) coincide with the findings from [5]: in the soft truncation regime, truncated power tails behave, in important respects, as if no truncation took place. In fact, in our setup, the results are exactly the same as for the non-truncated distribution considered in Theorem 1.
Our outcomes for (M2) (second row in Table 1) are quite close to another finding from [5], namely, that in the hard truncation regime much of the "heavy tailedness" is lost. Actually, we get that the behaviour is determined by the first component, except in the case (A2) with lim →∞ − ̸ = ∞. Finally, the intermediate case (M3) (third row in Table 1) is divided into various subcases. A comparison with [5] is not possible, because the authors of [5] largely leave this question aside in order to keep the size of their article manageable. In our research, we provide a complete study of this case.
The exact result is formulated below. 4 Here and below we mean by that lim →∞ ( / ) = ∞.
Moreover, in these cases the distribution of is degenerate for any increasing sequence ( ), which is unbounded in and .
Proof The proof is given in Appendix D.

Simulation study
The aim of the current section is to illustrate the dependence of the limit distribution for maxima in the model (13) on the rates of the varying parameters; see Table 2.
As previously, the primary separation is made according to the rates of log and : we fix = −1 (log ) and = −1 (log ) 2 , which imply conditions (A1) and (A2), respectively. Next, the models are divided according to the rate of growth of in the form = (log( + 1)) with = 1/2, 1, 2. Recall that from Theorem 2 it follows that the limit distribution is Gumbel for the pairs (A1)-(M1), (A1)-(M2) and (A2)-(M2) (note that for the last two cases under our choice), and Fréchet for the pair (A2)-(M1). For each case we simulate 1000 samples of length 1000 and find the maximal value of each sample. The goodness-of-fit of the limit distributions of the maximal values suggested by Theorem 2 is tested by the Kolmogorov-Smirnov criterion. Figure 2 depicts the kernel density estimates of the densities of the normalised maxima in each case, superimposed with the limit distributions implied by Theorem 2. It can be seen that for all groups the density estimates are quite close to the theoretical densities, and the Kolmogorov-Smirnov test does not reject the null hypothesis of the corresponding theoretical distributions (the corresponding p-values are given in the same figure).

Starting from the prominent paper by Mandelbrot [13], the heavy-tailedness of the distributions of price changes has been a well-known stylised fact, leading to the frequent choice of power laws for modelling, see, e.g., [6]. However, numerous papers admit that the tails of the distributions used for modelling the returns are heavier than normal, yet lighter than those of a power law. For instance, Laherrère and Sornette [11] demonstrate that daily price variations on the exchange market can be successfully described by the Weibull distribution with parameter smaller than one. Malevergne et al. [12] carefully analyse financial returns on different time scales, ranging from daily to 5- and 1-minute data, and come to the conclusion that the Pareto distribution fits the highest 5% of the data, while the remaining 95% are most efficiently described by the Weibull law. Thus, it is reasonable to expect that the overall distribution of returns should be successfully described by a model which (in some sense) lies in between these two distributions. This idea serves as a motivation for applying the model (3) to the modelling of log-returns.

Fig. 3: First (second) row: the plot of data; p-values for Weibull fit for positive (absolute negative) log returns
In our study we consider hourly logarithmic returns of BMW shares in 2019. Following [12], we analyse positive and negative returns separately. The sample sizes are equal to 1130 and 1062, respectively. The plots for positive and negative log returns are presented in the first plot in Figure 3. In what follows, we assume that the log-returns are jointly independent. This assumption was checked by the chi-squared test resulting in p-values 0.234 and 0.223 for positive and negative returns, respectively.
1. Separation of components. For each = 1...1130, the sample 1 , ..., is divided into 2 parts corresponding to the first and the second components  Figure 4. The first plot in two rows indicates that in both cases declines with , though in case of negative log returns the decrease is not so evident. From the second plot one can see that with = appear to tend to infinity. Therefore, as suggested by Theorems 1 and 2, we examine the asymptotic behaviour of the ratio log /( ) for different values of > 0. For both positive and absolute values of negative log returns we get that for = 0.45 this ratio decreases rapidly.
2. Model selection. Based on the partition obtained in the previous step, the decision between truncated and non-truncated distributions for the second
3. Estimation of parameters. The parameters of the first and second components are estimated by the maximum-likelihood approach. The estimated values are presented in Table 3. Since ˆ/ is equal to 0.483 for positive log returns and 0.484 for absolute values of negative ones, we conclude that the assumption (A2) is fulfilled with = 0.45, and therefore the limit distribution for maxima is the Fréchet distribution, see item (ii) in Theorem 1. It is worth mentioning that for both positive and absolute values of negative log returns we get $\hat{\alpha} > 2$, which is completely coherent with general empirical results for financial returns and addresses the common critique against models with infinite variance, see [6].
4. Validation of the model. Figure 6 depicts the true density of positive (top left) and absolute values of negative (top right) log returns superimposed with densities of 100 simulations from the mixture (3) with the corresponding parameter estimates. The constructed model is also verified by the empirical confidence intervals for the sample quantiles based on 100

Conclusion
This paper contributes to the existing literature in the following respects.
1. We model the heavy-tailed impurity via a mixture of distributions with varying parameters and (following the ideas from [3], [5]) consider the resulting model as a triangular array. The notion of heavy-tailed impurity is not new, but all previously known probabilistic results concentrate only on the classical limit laws, see [10], [16]. In this paper, we establish the limit laws for the maximum in these models.
2. The paper delivers an example of a triangular array such that its row-wise maximum has (under proper normalisation) six different limit distributions, depending on the rates of the varying parameters. To the best of our knowledge, all previous articles on extreme value analysis for triangular arrays deal with convergence to a limit law with twice differentiable cdf ([2], [7], [9]), while some limit distributions in our example are discontinuous.
3. We show the difference between various types of truncation for the regularly varying distributions used for modelling the impurity. Our conditions (M1)-(M2) are close to the soft and hard truncation regimes introduced in [5], leading to similar (but not completely the same) outcomes for our mixture model as for the model considered in [5]. Moreover, unlike previous papers, we study in detail the case of the intermediate truncation regime (M3).
4. For practical purposes we describe a four-step scheme for the application of this model to asset price modelling. This approach can be considered as a possible development of the idea that the distribution of stock returns is in some sense between the exponential and the power law. A comprehensive discussion of this idea can be found in [12].
A Classical EVA for the mixture model
Thus, the limit distribution for maxima is determined by the second component, leading to the Fréchet limit. In fact, choosing that is the Fréchet-type distribution.

B Proof of Theorem 1
For given sequences , , the left-hand side of (8) can be represented as where ( ) = + . Our aim is to find the sequences , guaranteeing that this limit (denoted by ( )) is non-degenerate. We divide the range of possible rates of convergence of , into several essentially different cases.
(i) Let → 0 as → ∞ or = > 0. As − ( ) ( ( )) → 0 as → ∞ by the slow variation of (·) (see (6)), we get and therefore we deal with the extreme value analysis of the Weibull law. Since ¯1( ) = − , > 0, ≥ 1 is a von Mises function, i.e., ¯1( ) can be represented as with ˘ = 1 and ( ) = ( ) −1 1− , > 0, we get that the limit distribution is Gumbel under the choice ( ) = + with
(ii) Let → ∞ as → ∞. This case is divided into several subcases, depending on the relation between and .
1. First, let us consider | |. Then as → ∞. Therefore, as ( ) ∼ ( ) for all fixed ∈ R as → ∞. Clearly, since is not present in the above limit, one can take = 0. As for , we have i.e., It is worth mentioning that depends on the function via the equality (16). Let us recall that (·) is slowly varying and therefore Thus, from (16) we get that for the condition (17) to be fulfilled, it is sufficient that the right-hand side tends to zero as → ∞, i.e., lim →∞ log ( ) /( + ) = 0, or, equivalently, We conclude that the condition (18) yields (17), and in this case the limit distribution is Fréchet.
2. Now, let > 0 and ∈ R be such that | |. In this case as → ∞. Thus, the limit in (15) takes the form Let the norming constants and be chosen in the form (9). Then ( ) is the cdf of the Gumbel law if As previously, we would like to replace (19) with another condition without (·). Once more, we recall that by slow variation of (·) ( ) ∀ > 0.
From this we conclude that and the fact that the right-hand side tends to zero as → ∞ will imply (19). In other words, we obtain the Gumbel limit if
3. The last possible situation is when neither (18) nor (20) is satisfied. Clearly, this is the case only when = · (log ) / for some > 0. Not surprisingly, it turns out that the final answer now depends on the asymptotic behaviour of (·).
a) Let us first consider the case ( ) → ∞ as → ∞. Then one can take = 0 and find as the solution to the equation The limit for the second component in (15) coincides with the corresponding one in item 2(i) (and leads to the cdf of the Fréchet law), while for the first component we get The value of the latter limit is zero for all fixed > 0 since ( ) → ∞ as → ∞, and therefore the limit distribution is Fréchet.
b) Now, let (·) be such that ( ) → ˜ > 0 as → ∞. Then the same choice of norming constants as when ( ) → ∞ as → ∞ leads to the same limits as before. However, the value of (21) now depends on , namely, Thus, in this case the limit distribution is equal to An interesting point is that we get a limit distribution that is not from the extreme value family, having an atom at = −1/ (˜) −1/ .
c) Finally, let (·) be such that ( ) → 0 as → ∞. Then the normalising sequence can be chosen as in item 1, and for the first component we get while for the second one Therefore, in this case the limit distribution is again Gumbel.
C Limit law for the truncated RV distribution
Lemma 1 Let $\tilde{F}_2$ be the upper-truncated regularly varying distribution defined as (12). Then $\tilde{F}_2$ is in the maximum domain of attraction of the Weibull law having the distribution function (4) with = 1.

D Proof of Theorem 2
Step 1. Several simple cases. As in the proof of Theorem 1, we use the notation ( ) = + . We have ) → 0 as → ∞ by slow variation of (·), we get (i) Let → 0 as → ∞. By slow variation of (·), − ( ) ( ( )) → 0 as → ∞, therefore, the whole second component disappears. We deal with maxima of a Weibull random variable and obtain the Gumbel limit under the normalisation ( ) = + with , in the form (17).
(ii) Let = . By a similar argument as in the previous item, Under the same choice of normalising sequences, we get for all fixed ∈ R as → ∞. Therefore, the limit distribution is again Gumbel. From (M1) it follows that for any ∈ (0, − 1 / ] the right-hand side tends to zero as → ∞, and therefore − ( ) → 0 as → ∞. The rest of the proof in this situation follows the same lines as the proof of Theorem 1. Other cases are more complicated, and we divide the further proof into several steps.
The condition ( ) > leads to the inequality > − log( ( ))(1 +¯(1)) (23) for sufficiently large. Finally, as − log( ( )) takes only non-negative values, and the sequences , tend to infinity as → ∞, we conclude that the necessary condition for the existence of non-degenerate limit distribution is log . (24) 1. Now let us consider the case when (A1) holds. We have since > / > . The right-hand side tends to zero as → ∞ and therefore (24) holds. Choosing and as in (9), we get that ( ) > for all ∈ R, and therefore the limit distribution is Gumbel.
2. Now assume that (A2) holds. In this case, (24) can be violated. In fact, with some ∈ (0, 1 / ). From (M2), it follows that the right-hand side is infinite if ≥ , and has an unknown asymptotic behaviour otherwise. The lower bound is given by where for ≥ , the right-hand side tends to zero as → ∞, while otherwise the asymptotic behaviour is again unknown. In this case, we conclude that if is such that (24) holds, the non-degenerate limit distribution exists and is in fact the Gumbel distribution. 3. Finally, in the case (A3), because ∈ (0, 1 / ). Since by (24) there exists ∈ (0, 1 / ) such that the right-hand side tends to zero as → ∞, we get under proper normalisation the Gumbel limit distribution.
1. If (A1) is satisfied, it is possible to obtain the Gumbel limit under the same choice of normalising sequence (9). Indeed, in this case log =˘( ) / log ( ) log , because > / . Therefore, (24) follows from (A1), and we obtain the Gumbel distribution as a limit. 2. If (A2) holds, then the result turns out to depend on the asymptotic behaviour of (·).
Let us recall that − is equal to a constant. a) If ( ) → ∞ as → ∞, we have that

V. Panov and E. Morozova
3. Finally, let us consider the case (A3). As in the previous situations, the limit distribution depends on the asymptotic behaviour of (·).
a) Let ( ) → ∞ as → ∞. Since then − ( ) → ∞ as → ∞, we conclude that the non-degenerate limit exists only if (23) holds. In the considered case, (23) is equivalent to 1/˘1/ < 1 The normalising sequence (9) again leads to the Gumbel limit distribution.
b) If ( ) → ˜ for some ˜ > 0 as → ∞, we have that − ( ) =˜˘− .
- If 1/˘1/ < 1, a linear normalising sequence as in (9) leads to the Gumbel limit, since ( ) > for all ∈ R and large enough; see the previous item. As we see, the limit distribution does not belong to the extreme value family, and has an atom at = −1/ (˜) −1/ .
- If 1/˘1/ = 1, the choice (10) leads to the discrete limit distribution having a unique atom at =˘˜− 1/ with probability mass 1/ .
c) Lastly, let (·) be such that ( ) → 0 as → ∞. Then the Gumbel limit can be obtained under the normalisation (9) since − ( ) → 0 as → ∞, see item (i,c) in Theorem 1. This observation completes the proof.