Robust estimations for the tail index of Weibull-type distribution

Based on suitably left-truncated or censored data, two flexible classes of $M$-estimations of the Weibull tail coefficient are proposed, with two additional parameters bounding the impact of extreme contamination. Asymptotic normality with a $\sqrt{n}$ rate of convergence is obtained. The robustness is discussed via the asymptotic relative efficiency and the influence function, and is further demonstrated by a small-scale simulation study and an empirical study on CRIX.


Introduction
The estimation of tail quantities plays an important role in extreme value statistics. One challenging problem is to select the extreme sample fraction so as to balance the asymptotic variance and bias. Moreover, this requires a large and ideal sample from the underlying distribution. In practical data analysis, however, it is not unusual to encounter outliers or mis-specifications of the underlying model, which may have a considerable impact on the estimation results. A typical remedy is to downweight their influence on the estimation according to various criteria, see e.g., Basu et al. (1998), Beran and Schell (2012), Vandewalle et al. (2004, 2007), Goegebeur et al. (2015) and Liu and Tang (2010).
Given the wide applications of Weibull-type distributions and the few studies on their robust estimation, this paper addresses this issue for their tail quantities. Let $X_1, \ldots, X_n$ be an independent and identically distributed sequence from parent $X \sim F(x)$ satisfying
$$1 - F(x) = \exp\{-x^{\alpha}\ell(x)\}, \tag{1.1}$$
where $\alpha > 0$ is the so-called Weibull tail coefficient (WTC) and $\ell(x)$ is a slowly varying function at infinity, i.e., (cf. Bingham et al. (1987)) $\lim_{t\to\infty} \ell(tx)/\ell(t) = 1$, $\forall x > 0$.
Prominent instances of Weibull-type distributions $F$ are the Gaussian ($\alpha = 2$), gamma, logistic and exponential ($\alpha = 1$) and extended Weibull (any $\alpha > 0$) distributions (cf. Gardes and Girard (2008)). As an important subgroup of light-tailed distributions, Weibull-type distributions are of great use in hydrology, meteorology, environmental and actuarial science, to name but a few (cf. Beirlant and Teugels (1992); Hashorva and Weng (2014); Dȩbicki and Hashorva (2018); Arendarczyk and Dȩbicki (2011)). Meanwhile, the WTC governs the tail behavior of $F$: the larger the WTC is, the faster the tail of $F$ decays. Dedicated estimations of the WTC have thus been proposed, and most of them are based on an asymptotically vanishing sample fraction of high quantiles, for which asymptotic normality is achieved under a certain second-order condition specifying the rate of convergence of $\ell(tx)/\ell(t)$ to 1, see e.g., Girard (2004), Gardes and Girard (2008), Goegebeur et al. (2010), Asimit et al. (2010). However, most data-sets from applied fields are relatively large and exhibit certain deviations from the pre-supposed model. For instance, such deviations arise through the slowly varying function $\ell(\cdot)$ (where $1 - F(x) = \exp\{-x^{\alpha}\ell(x)\}$) in the left part of the distribution. To the best of our knowledge, the investigation of robust Weibull tail estimations when only a small sample is available is new.
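The role of $\alpha$ in $1 - F(x) = \exp\{-x^{\alpha}\ell(x)\}$ can be checked numerically. The following is a minimal sketch, not part of the original analysis: since $\ln(-\ln(1 - F(x))) = \alpha \ln x + \ln \ell(x)$, the slope of this quantity against $\ln x$ between $x$ and $2x$ should approach $\alpha$ for large $x$. The function names and the evaluation points 10 and 20 are illustrative assumptions.

```python
import math


def log_tail_slope(survival, x1, x2):
    """Slope of ln(-ln S(x)) against ln x between x1 and x2.

    For S(x) = exp(-x**alpha * l(x)) this slope equals
    alpha + ln(l(x2) / l(x1)) / ln(x2 / x1), which tends to alpha
    when l is slowly varying and x2 / x1 is held fixed.
    """
    y1 = math.log(-math.log(survival(x1)))
    y2 = math.log(-math.log(survival(x2)))
    return (y2 - y1) / (math.log(x2) - math.log(x1))


def gaussian_survival(x):
    # Standard normal survival function, a Weibull-type tail with alpha = 2.
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def exponential_survival(x):
    # Unit exponential survival function, a Weibull-type tail with alpha = 1.
    return math.exp(-x)


slope_gauss = log_tail_slope(gaussian_survival, 10.0, 20.0)
slope_exp = log_tail_slope(exponential_survival, 10.0, 20.0)
print(f"Gaussian slope:    {slope_gauss:.3f} (tail coefficient 2)")
print(f"Exponential slope: {slope_exp:.3f} (tail coefficient 1)")
```

For the Gaussian case the slope is close to, but not exactly, 2 at these points, which reflects the slow convergence induced by the slowly varying factor.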
Inspired by the theory of robust inference in Huber (1964), we propose two classes of robust estimations of the WTC. Denote, for given $c_0 > 0$,
$$h(t) = (c_0 t - 1)\ln t - 1, \quad t > 0. \tag{1.2}$$
Clearly, $g(x; \alpha) = -\alpha^{-1} h(x^{\alpha})$ is the score function of the Weibull distribution $F_W(x; \alpha) = 1 - \exp\{-c_0 x^{\alpha}\}$, $x > 0$. Please note that $h(t)$, $t > 0$, is not monotone, and thus one cannot directly weaken the effect of outliers by bounding the score function $g(x; \alpha)$. On the other hand, the main interest of risk management lies in the extremely large risks. This motivates us to consider a tailored $h(t)$ according to certain left-truncated/censored Weibull distributions with the same Weibull tail coefficient $\alpha$ under consideration. Namely, we set below $t_0 = \arg\min_{t \ge 1} h(t)$ with $h$ specified by (1.2), whose properties are stated below.
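The relation between $h$ and the Weibull score can be verified numerically. The sketch below assumes the Weibull model $1 - F_W(x; \alpha) = \exp\{-c_0 x^{\alpha}\}$ and compares $-\alpha^{-1}h(x^{\alpha})$ with a finite-difference derivative of the log-density in $\alpha$; it also locates $t_0 = \arg\min_{t \ge 1} h(t)$ using $h'(t) = c_0 \ln t + c_0 - 1/t$. All function names are illustrative.

```python
import math


def h(t, c0=1.0):
    """h(t) = (c0 * t - 1) * ln(t) - 1, as in (1.2)."""
    return (c0 * t - 1.0) * math.log(t) - 1.0


def weibull_logpdf(x, alpha, c0=1.0):
    # Log-density of F_W(x; alpha) = 1 - exp(-c0 * x**alpha) (assumed model).
    return math.log(c0) + math.log(alpha) + (alpha - 1.0) * math.log(x) - c0 * x ** alpha


def score_numeric(x, alpha, c0=1.0, eps=1e-6):
    # Central finite difference of the log-density with respect to alpha.
    up = weibull_logpdf(x, alpha + eps, c0)
    down = weibull_logpdf(x, alpha - eps, c0)
    return (up - down) / (2.0 * eps)


def score_closed(x, alpha, c0=1.0):
    # Closed form g(x; alpha) = -h(x**alpha) / alpha.
    return -h(x ** alpha, c0) / alpha


def h_prime(t, c0=1.0):
    return c0 * math.log(t) + c0 - 1.0 / t


def argmin_h(c0=1.0):
    """t0 = argmin of h over t >= 1 (h is convex, so bisection on h' suffices)."""
    if h_prime(1.0, c0) >= 0.0:
        return 1.0
    lo, hi = 1.0, 2.0
    while h_prime(hi, c0) < 0.0:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if h_prime(mid, c0) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(score_numeric(0.7, 1.3, 2.0), score_closed(0.7, 1.3, 2.0))
print(h(0.01), h(1.0), h(100.0))  # h is large at both ends: not monotone on (0, inf)
print(argmin_h(1.0), argmin_h(0.5))
```

Under this reading, $t_0 = 1$ whenever $c_0 \ge 1$, while for $c_0 < 1$ the minimizer moves to the right of 1.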
Basically, both $\tilde h$ and $h^*$ are modifications of $h$ via its range and domain. Now, we are ready to state our $M$-estimations of the Weibull tail coefficient using the $M$-estimation process based on the alternative samples $\tilde X_i$'s and $X_i^*$'s from $\tilde X := X \mid \{X \ge 1\} \sim \tilde F$ and $X^* := \max(X, x_0) \sim F^*$, respectively. Below, values are huberized via $y \mapsto \min(\max(y, v), u)$, $v < u$, and $\mathcal{F}$ is a set of distributions with support in $(0, \infty)$.
Definition 1.1. Let $F_W(x; \alpha)$ and $\tilde h$, $h^*$ be given by (1.3) and (1.4), respectively. Define the psi-function $\tilde\psi$. Then the functional $\tilde T(F)$ as the solution of the equation
Then the functional $T^*(F)$, $F \in \mathcal{F}$, as the solution of the equation is called the huberized Weibull tail $M$-functional corresponding to $\psi^*$. The corresponding $M$-estimator $T_n^* = T_n^{*(v,u)}(F_n)$, the solution of the equation, is the huberized $M$-estimator of the Weibull tail coefficient $\alpha$.
We remark that (1.5) and (1.6) hold. Figure 1 illustrates the lower huberization by comparing the score function $\tilde\psi_{-1,\infty}(y; \alpha)$ of $\tilde F_W$ (recall Lemma 1.1) with $\tilde\psi_{v,\infty}(y; \alpha)$. We see that the Weibull density contaminated by a Gamma distribution (see (3.1) below for its definition) has almost the same shape as the pre-supposed Weibull one in the right tail, and therefore the lower-huberized psi-function $\tilde\psi_{v,\infty}(y; \alpha)$ can restrict the influence of all observations below $y_0 = (h^{\leftarrow}(v))^{1/\alpha}$ instead of removing them completely. On the other hand, for all $y > y_0$, $\tilde\psi_{v,\infty}(y; \alpha)$ is shifted downwards for the purpose of consistency. One may similarly analyze the $\psi^*$ function.
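The lower huberization described above can be sketched in code. This is an illustrative reconstruction under stated assumptions, not the authors' exact definition: it takes $c_0 = 1$ (so that $h$ is nondecreasing on $[1, \infty)$), uses the left-truncated sample $\{X_i : X_i \ge 1\}$, and assumes the psi-function has the form $\min(\max(h(y^{\alpha}), v), u) - \kappa$, where the shift $\kappa$ is the expectation of the clipped term under the left-truncated Weibull model. Under that model $Y^{\alpha} - 1$ is exponential with rate $c_0$, so $\kappa$ does not depend on $\alpha$. The names `centering` and `m_estimate` are hypothetical.

```python
import math
import random


def h(t, c0=1.0):
    return (c0 * t - 1.0) * math.log(t) - 1.0


def centering(v, u=math.inf, c0=1.0, steps=20000):
    """kappa = E[clip(h(T), v, u)] for T = 1 + Exp(rate c0), by the midpoint rule."""
    upper = 50.0 / c0  # the exponential weight is negligible beyond this point
    ds = upper / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * ds
        clipped = min(max(h(1.0 + s, c0), v), u)
        total += clipped * c0 * math.exp(-c0 * s) * ds
    return total


def m_estimate(sample, v=0.0, u=math.inf, c0=1.0, lo=0.01, hi=20.0):
    """Root in alpha of sum_i [clip(h(Y_i**alpha), v, u) - kappa] over Y_i >= 1.

    Assumed form of the huberized estimating equation; the sum is nondecreasing
    in alpha for Y_i >= 1 and c0 >= 1, so bisection on [lo, hi] is used.
    """
    ys = [x for x in sample if x >= 1.0]  # left truncation at 1
    kappa = centering(v, u, c0)

    def psi_sum(alpha):
        return sum(min(max(h(y ** alpha, c0), v), u) - kappa for y in ys)

    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if psi_sum(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Usage: exact Weibull data with alpha0 = 1.5 and c0 = 1.
rng = random.Random(0)
alpha0 = 1.5
xs = [rng.expovariate(1.0) ** (1.0 / alpha0) for _ in range(4000)]
estimate_huber = m_estimate(xs, v=0.0)   # lower-huberized at v = 0
estimate_mle = m_estimate(xs, v=-1.0)    # v = -1, u = inf: no clipping
print(estimate_huber, estimate_mle, centering(-1.0))
```

With $v = -1$ and $u = \infty$ nothing is clipped and $\kappa$ is numerically zero, so the equation reduces to the score equation of the left-truncated Weibull model, in line with the reduction to maximum likelihood noted in the text.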
The paper principally investigates the asymptotic behavior of the proposed new classes of M -estimations of Weibull tail coefficient. Details are as follows.
In Section 2, we consider Weibull distributions in Theorems 2.1 and 2.2 and establish the asymptotic normality of the $M$-estimations $\tilde T_n$ and $T_n^*$ with a $\sqrt{n}$ rate of convergence, which is considerably faster than that of most classical Weibull tail estimations such as the Hill-type estimation, see Theorem 2 in Girard (2004). More generally, we study the related asymptotic properties in Theorems 2.3 and 2.4 when the underlying risk follows a Weibull-type distribution as specified in (1.1). A bounded asymptotic bias may appear due to deviations from the Weibull distribution.
In Section 3, using the asymptotic relative efficiency (AEFF) and the influence function (IF), we investigate the robustness (Theorem 3.1) and the bias, which are further related to the choices of the flexible parameters $v$ and $u$. These results are useful, especially when regulators in risk management consider the trade-off between robustness and consistency.
In Section 4, a small-scale Monte Carlo simulation study and an empirical study concerning the CRIX proposed by Trimborn and Härdle (2016) are carried out. We see that both $M$-estimations are robust and perform very well even for small samples, in comparison with the classical maximum likelihood and Hill-type estimations of the Weibull tail coefficient. We expect the results to be beneficial to both financial practitioners and theoretical experts in risk management and extreme value statistics.
The rest of the paper is organized as follows. The main results are given in Section 2, followed by a section dedicated to the robustness analysis. Sections 4 and 5 are devoted to a small-scale Monte Carlo simulation study and an empirical study on CRIX. All proofs are postponed to Section 6.

Asymptotic results
Throughout this section, we keep the same notation as in the Introduction and further write $\overset{p}{\to}$ and $\overset{d}{\to}$ for convergence in probability and in distribution, respectively. All limits are taken as $n \to \infty$ unless otherwise stated.
Remark 2.1. The $M$-estimation $\tilde T_n^{(v,u)}$ with $v = -1$, $u = \infty$ reduces to the maximum likelihood estimation of $\alpha$. This fact will be used in Theorem 3.1 for the asymptotic relative efficiency analysis. Additionally, by the law of large numbers, $m = m(n)$ satisfies $m/n \overset{p}{\to} P\{X \ge 1\} = \exp\{-c_0\}$.
Remark 2.2. (i) The difference between $\tilde\psi$ and $\psi^*$ is that $h^*$ is not the score function of $F_W^*$, the distribution of the risk censored at the point $x_0$, where $0 < d_0 \le \alpha_0 \le d_1$ is needed to ensure the monotonicity of $h^*$ and $\psi^*$, see details in (6.5) with $\alpha = \alpha_0$. (ii) The proposed $M$-estimations are principally based on suitably left-truncated and censored data, which are commonly used in survival analysis, see e.g., Kundu et al. (2017). Moreover, both consistency and robustness are obtained, since we bound the psi-functions to weaken the influence of extreme outliers for the exact Weibull models.
In what follows, we consider generally the Weibull-type risks and investigate asymptotic properties of the proposed M -estimations.
Theorem 2.4. Let $X_1, \ldots, X_n$ be a random sample from
Please note that here $t_0$, the unique solution of $\tilde\lambda_F$ and $\lambda_F^*$ specified in Theorems 2.3 and 2.4, might not be equal to $\alpha_0$. In other words, maintaining the robustness of the $M$-estimations comes at the cost of consistency. In the next section, we discuss this balance via the flexible parameters $v$ and $u$.

Robustness
A simple criterion for choosing $v$ and $u$ in the $M$-estimations is the trade-off between the efficiency loss (that one is willing to put up with when the data are generated by a Weibull distribution) and the asymptotic bias (when the underlying distribution deviates from the ideal Weibull distribution). We study below the asymptotic relative efficiency (AEFF) in Theorem 3.1, and then analyze the influence function. Both quantities are functions of the flexible parameters $v$ and $u$, which enable risk regulators to balance robustness and consistency.
As stated in Remark 2.1, the $M$-estimation $\tilde T_n^{(v,u)}$ with $v = -1$, $u = \infty$ reduces to the maximum likelihood estimation of $\alpha$. Therefore, a straightforward application of Theorems 2.1 and 2.2 leads to the following theorem.
Theorem 3.1. Under the assumptions of Theorems 2.1 and 2.2, the asymptotic relative efficiency functions of $\tilde T_n^{(v,u)}$ and $T_n^{*(v,u)}$ (relative to the maximum likelihood estimation) are given by
Here $\tilde\mu$ and $\mu^*$ are given by Theorems 2.1 and 2.2, respectively. For smaller $v$, the asymptotic efficiency loss of $\tilde T_n$ is considerably smaller than that of $T_n^*$, while for larger $v$ both are asymptotically the same.
The influence function approach, also known as the "infinitesimal approach", is generally employed to quantify robustness. Recall that the influence function describes the effect on a functional $T(F)$ of an infinitesimal contamination of $F$ within the $\varepsilon$-contamination neighbourhood $\{F_\varepsilon \mid F_\varepsilon(x) = (1 - \varepsilon)F(x) + \varepsilon G(x)\}$. Here $F_W(x; \alpha_0)$ is given by (1.1) with $c_0 = 1$, $\alpha_0 = 1$.
We have .
In Figure 3, we take $G(x) = \Gamma(x; \lambda, \beta)$ with scale parameter $\lambda = 0.5$ and shape parameter $\beta \in (0, 5)$, which is a Weibull-type distribution with $\alpha = 1$ and density function $g(x; \lambda, \beta)$. We see that the absolute values of the influence functions of both $M$-estimations $\tilde T_n$ and $T_n^*$ are increasing in $\beta$ and decreasing in $v$. In other words, with increasing huberization and light-tailed contamination, the sensitivity to deviations from the Weibull model is reduced.
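The $\varepsilon$-contamination model $(1 - \varepsilon)F_W + \varepsilon G$ with a Gamma contaminant can be simulated directly. The sketch below is illustrative only: it takes $c_0 = 1$, $\alpha_0 = 1$, treats $\lambda = 0.5$ as a scale parameter (as stated in the text) and fixes $\beta = 0.5$; the function name and the choice $\varepsilon = 0.2$ are assumptions.

```python
import random


def sample_contaminated(n, eps, alpha0=1.0, c0=1.0, shape=0.5, scale=0.5, seed=0):
    """Draw n points from (1 - eps) * Weibull(alpha0, c0) + eps * Gamma(shape, scale).

    Returns the draws together with flags marking the contaminated ones.
    """
    rng = random.Random(seed)
    draws, flags = [], []
    for _ in range(n):
        contaminated = rng.random() < eps
        if contaminated:
            # random.gammavariate takes (shape, scale) in this order.
            x = rng.gammavariate(shape, scale)
        else:
            # P(X > x) = exp(-c0 * x**alpha0), i.e. X**alpha0 ~ Exp(rate c0).
            x = rng.expovariate(c0) ** (1.0 / alpha0)
        draws.append(x)
        flags.append(contaminated)
    return draws, flags


draws, flags = sample_contaminated(20000, eps=0.2)
frac_contaminated = sum(flags) / len(flags)
sample_mean = sum(draws) / len(draws)

# Share of contaminated points among the observations kept by truncation at 1.
kept = [f for x, f in zip(draws, flags) if x >= 1.0]
share_above_one = sum(kept) / len(kept)
print(frac_contaminated, sample_mean, share_above_one)
```

In this configuration most of the Gamma mass lies below 1, so the left-truncated sample contains a smaller share of contaminated points than the full sample.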

Here, we alternatively use $k_n = k_{\mathrm{opt}}$ in place of the traditional optimal choice of $k_n$ in Girard (2004). The last column of Table 1 reports $p_{\mathrm{Hill}}$, the relative proportion of values of $k_n$ for which the MSE of the Hill-type estimation $\hat\alpha^{(k_n)}_{\mathrm{Hill}}$ is smaller than that of $\tilde T_n^{(0,\infty)}$; that is, $p_{\mathrm{Hill}}$ describes how often the Hill-type estimation outperforms $\tilde T_n^{(0,\infty)}$. We conclude from Table 1 that
(i) The bias of the proposed $M$-estimations is smaller than that of the Hill-type estimation and the MLE (see columns 2-5 for details).
(ii) The sample variance $s^2$ of our estimations is very close to zero. Note in passing that even with the optimal choice $k_n = k_{\mathrm{opt}}$, the $s^2$ of the Hill-type estimations is still larger than that of the others (see columns 6-9 for details).
(iii) Since the ratios of MSE satisfy $r \le r^* \le \tilde r$, we see that the best-ranked estimation is $\tilde T_n$, which coincides with the analysis of the relative efficiency (see columns 10-12 and Figure 2).
(iv) $\tilde T_n^{(0,\infty)}$ outperforms the Hill-type estimators $\hat\alpha^{(k_n)}_{\mathrm{Hill}}$ for almost all values of $k_n$. For $n = 80$, $p_{\mathrm{Hill}}$ does not exceed 10% in most cases, which means that there is a set $K$ of at most $s = 8$ values of $k_n$ for which the Hill-type estimators would outperform $\tilde T_n^{(0,\infty)}$. A similar argument holds for $n = 100$. Hence, the $M$-estimations perform better even for small samples.

Empirical Study
The CRIX, a market index (benchmark), was designed by Trimborn and Härdle (2016). It enables each interested party to study the performance of the crypto market as a whole or of a single crypto market, and therefore attracts increasing attention from risk managers and regulators. We select the daily CRIX index from 31 July 2014 to 1 January 2018 (available on crix.berlin) and take all $n = 713$ positive log returns of CRIX, multiplied by 15, as the original data sequence $X = (X_i, i = 1, \ldots, n)$. The scaling yields a moderate sample of size $m$ around 35-50 of observations greater than 1 for the $M$-estimation $\tilde T_n$ (recall that scaled risks keep the same tail decay feature).
In Figure 4 we employ the empirical mean excess function from extreme value theory to analyze the tail feature (with $I\{\cdot\}$ denoting the indicator function),
$$e_n(t) = \frac{\sum_{i=1}^{n} (X_i - t)\, I\{X_i > t\}}{\sum_{i=1}^{n} I\{X_i > t\}},$$
where the $X_i$'s are the scaled daily log returns of CRIX. We see that the log mean excess function behaves linearly for large thresholds, indicating the Weibull tail feature of the data-set (cf. Dierckx et al. (2009)). Therefore, we illustrate the robustness of the proposed $M$-estimations $\tilde T_n^{(v,\infty)}$ and $T_n^{*(v,\infty)}$ with $(d_0, d_1) = (0.8848, 0.9898)$ as the 95% confidence interval via MLE, and $v = 0$, using the real data-set $X$, and compare them with the Hill-type estimations $\hat\alpha^{(k_n)}_{\mathrm{Hill}}$ given by (4.1). Specifically, we consider the same contamination distribution $G(x) = \Gamma(x; 0.5, 0.5)$ and contamination levels $\varepsilon = 0.05i$, $i = 0, 1, \ldots, 10$. In addition, the sample fraction $k_n$ involved in the Hill-type estimations is chosen via the bootstrap and maximum likelihood methods as follows.
Here $\hat\alpha_{b\text{-Hill}}$ is the average of the Hill-type estimations based on $m = 100$ bootstrap samples, and $\hat\alpha_{\mathrm{mle}}$ is the maximum likelihood estimation of the shape parameter $\alpha$ of the Weibull distribution (see (1.1) for its definition). Since the Weibull tail coefficient $\alpha$ is unknown, we alternatively use the relative deviation of $\hat\alpha$ from contamination level $\varepsilon$ to $\varepsilon + \delta$, denoted by $D(\hat\alpha)$, to study the relative robustness. The estimations considered are the $M$-estimations and the Hill-type estimations $\hat\alpha^{(1)}_{\mathrm{Hill}}$ and $\hat\alpha^{(2)}_{\mathrm{Hill}}$ with the optimal choices of $k_n$ as in (5.1).
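The mean excess diagnostic used for Figure 4 can be sketched as follows. This is a minimal illustration on simulated Weibull data rather than on the CRIX returns, and it uses the standard empirical mean excess function, i.e. the average of $X_i - t$ over the observations with $X_i > t$. The function name, the thresholds and the choice $\alpha = 2$ are assumptions.

```python
import math
import random


def mean_excess(sample, t):
    """Empirical mean excess: average of (x - t) over observations with x > t."""
    excesses = [x - t for x in sample if x > t]
    if not excesses:
        return float("nan")  # no observation exceeds the threshold
    return sum(excesses) / len(excesses)


# Simulated Weibull sample with P(X > x) = exp(-x**2), i.e. alpha = 2, c0 = 1.
rng = random.Random(1)
sample = [rng.expovariate(1.0) ** 0.5 for _ in range(20000)]

for t in (0.5, 1.0, 1.5):
    e_t = mean_excess(sample, t)
    print(f"t = {t:.1f}: e_n(t) = {e_t:.4f}, log e_n(t) = {math.log(e_t):.4f}")

e_low = mean_excess(sample, 0.5)
e_high = mean_excess(sample, 1.5)
```

For Weibull-type tails with $\alpha > 1$ the mean excess decreases in the threshold, and plotting $\log e_n(t)$ against $\log t$ for large $t$ is one way to inspect the approximate linearity mentioned in the text.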
From Table 2, we draw the following conclusions: (i) As expected, the proposed M -estimations are not sensitive to the contaminations, since the relative deviations of M -estimations are almost zero. Conversely, both Hill-type estimations with optimal choices of sample fraction have obvious deviations from no contamination to small contamination (D( α
Consequently, the consistency of $\tilde T_n$ is obtained.
Next, we show the asymptotic normality of $\tilde T_n$. Set below (recall $\tilde\mu$ given in (6.2))
is finite in a neighbourhood of $\alpha_0$ and continuous at $\alpha = \alpha_0$. It thus follows by Theorem A, p. 251 in Serfling (1980) that $\tilde T_n$ is asymptotically normally distributed.
Hence, the asymptotic variance of $\sqrt{m}(\tilde T_n - \alpha_0)$ is given by
Please note that $m/n \overset{p}{\to} P\{X \ge 1\} = \exp\{-c_0\}$. We complete the proof of Theorem 2.1.
Proof of Theorem 2.2. Arguments similar to those for Theorem 2.1 apply with $\tilde\psi$, $\tilde F_W$ and $\tilde h$ replaced by $\psi^*$, $F^*$ and $h^*$, respectively. First we show the consistency of $T_n^*$. It follows by (1.6) that $\psi^*_{v,u}(y; \alpha)$ is strictly increasing and continuous in $\alpha$. Hence it suffices to show that
has an isolated root $\alpha = \alpha_0$. We have
Next, it follows by a change of variable $t = h^*(y^{\alpha})$ and integration by parts that (6.5)
where in the second equality we use
$s^{\alpha_0/\alpha} \ln s \, \exp\{-c_0 s^{\alpha_0/\alpha}\}\, dh^*(s) > 0$ (6.6)
since $h^*(s)$ is strictly increasing over $(x_0^{d_0}, \infty)$ and
Consequently, the consistency of $T_n^*$ is obtained.
Hence, the asymptotic variance is given by
We complete the proof of Theorem 2.2.
Proof of Theorem 2.3. The result follows by arguments analogous to those in the proof of Theorem 2.1. Since $\tilde\psi_{v,u}(x; \alpha)$ is strictly increasing and continuous in $\alpha$, the assumptions of Theorem 2.3 are sufficient for the consistency and asymptotic normality of $\tilde T_n$. Using further Lemma 7.2.1A and Theorem A in Serfling (1980) (see p. 249 and 251 therein), we complete the proof of Theorem 2.3.
Proof of Theorem 2.4. The result follows by arguments analogous to those in the proof of Theorem 2.2. Since $\psi^*_{v,u}(x; \alpha)$ is strictly increasing and continuous in $\alpha$, the assumptions of Theorem 2.4 are sufficient for the consistency and asymptotic normality of $T_n^*$. Using further Lemma 7.2.1A and Theorem A in Serfling (1980) (see p. 249 and 251 therein), we complete the proof of Theorem 2.4.