A New Family of Consistent and Asymptotically-Normal Estimators for the Extremal Index

Olmo, Jose

doi:10.3390/econometrics3030633


Department of Economics, University of Southampton, Bld 58 (Murray Bld), Highfield Campus, Southampton SO17 1BJ, UK

Econometrics 2015, 3(3), 633-653; https://doi.org/10.3390/econometrics3030633

Submission received: 14 July 2015 / Revised: 3 August 2015 / Accepted: 7 August 2015 / Published: 28 August 2015

(This article belongs to the Special Issue Quantile Methods)


Abstract

The extremal index (θ) is the key parameter for extending extreme value theory results from i.i.d. to stationary sequences. One important property of this parameter is that its inverse determines the degree of clustering in the extremes. This article introduces a novel interpretation of the extremal index as a limiting probability characterized by two Poisson processes, together with a simple family of estimators derived from this new characterization. Unlike most estimators for θ in the literature, this estimator is consistent, asymptotically normal and very stable across partitions of the sample. Further, we show in an extensive simulation study that this estimator outperforms the logs, blocks and runs estimation methods in finite samples. Finally, we apply this new estimator to test for clustering of extremes in monthly time series of unemployment growth and inflation rates and conclude that runs of large unemployment rates are more prolonged than periods of high inflation.

Keywords:

asymptotic theory; clustering of extremes; extremal index; extreme value theory; order statistics

JEL classifications:

C14; C18; C4

1. Introduction

There are different interpretations of the extremal index (θ) in the literature on extreme value theory for weakly dependent processes. This concept, originated in papers by Loynes [1] and O’Brien [2] and developed in detail by Leadbetter [3], reflects the effect of clustering of extreme observations on the limiting distribution of the maximum. Thus, the first author showed that

$$P\{M_{1,n} \leq u_n\} \approx F^{nθ}(u_n), \qquad (1)$$

with $0 \leq θ \leq 1$, $M_{1,n} = \max\{X_1, \dots, X_n\}$, $u_n$ a normalizing sequence and $F$ the unconditional distribution function of a sequence of weakly dependent random variables $\{X_i, i \geq 1\}$; see also [4,5]. In a similar spirit, [6] showed that the presence of clustering affected the limiting distribution of block maxima, i.e.,

$$P\{M_{2,r_n} \leq u_n \mid X_1 > u_n\} \longrightarrow θ, \qquad (2)$$

with $M_{2,r_n} = \max\{X_2, \dots, X_{r_n}\}$, $r_n$ determining a partition of the sample of length $n$, such that $r_n \to \infty$ and $r_n = o(n)$.

Alternatively, Leadbetter [3] showed that for stationary sequences exhibiting short range dependence, the inverse of the extremal index is the limiting mean number of exceedances of $u_n$ in an interval of length $r_n$. This result mathematically reads as follows

$$E\left[\sum_{j=1}^{r_n} I(X_j > u_n) \,\Big|\, \sum_{j=1}^{r_n} I(X_j > u_n) \geq 1\right] \longrightarrow θ^{-1}, \qquad (3)$$

with $I(X > u_n)$ the indicator function. By stationarity, this property is satisfied for any block of $r_n$ consecutive elements defined in the sequence.

Inference about the extremal index parameter has also been extensively studied. The most popular estimators are the logs method, obtained from operating with the asymptotic results of the distribution of the maximum, the runs method derived from operating with (2) and the blocks method obtained from (3). For a careful review of these and other related estimators proposed in the literature, see [5,7,8]. Alternative estimation techniques recently developed are [9,10,11,12]. Finally, for a review of the underlying probability theory, the interested reader can consult [13] or, more recently, [14].

Our first aim in this paper is to build on the results of Leadbetter given in (3) about cluster size and to introduce an alternative characterization of θ as a limiting probability characterized by two Poisson processes. This characterization allows an intuitive and simple estimation procedure. Instead of focusing on the cluster size of extremes in a sequence, our characterizing condition of θ focuses on an extra level $v_n$ satisfying the following property

$$E\left[\sum_{j=1}^{r_n} I(X_j > v_n) \,\Big|\, \sum_{j=1}^{r_n} I(X_j > u_n) \geq 1\right] \longrightarrow 1. \qquad (4)$$

We will see that this condition implies that the ratio of exceedances of $v_n$ and $u_n$ by the sequence of block maxima converges asymptotically to θ.

This characterization of the extremal index as a limiting probability naturally yields a family of estimators that is consistent and converges, after proper standardization, to a normal distribution. This result is new in the extreme value theory literature. In fact, most estimators of θ proposed in the literature are inconsistent as the sample size increases. This is due to the Poisson property inherited from the choice of extreme levels for defining the extremal index estimator. A few exceptions are the estimators of θ proposed in [9,15] and, more recently, [14]. The first author solves this problem by using lower levels that allow one to benefit from increasing sample sizes and also by introducing a standardizing sequence that corrects for increasing cluster sizes. In [15], using this sequence, a variant of the blocks method is proposed that is asymptotically normal. In a similar context, Novak and Weissman [9] also prove the consistency and asymptotic normality of the blocks and runs methods. Robert [14] introduces estimators of the limiting cluster size probabilities, which are constructed through a recursive algorithm. This author derives estimators of the extremal index and studies their asymptotic and finite-sample properties. Other recent articles generalizing the extreme value theory to models with serial dependence are, for example, [16,17,18].

In this paper, the characterization of θ and the subsequent estimator determined by two levels make statistical inference about the θ parameter possible in a simpler manner. Interestingly, the estimation procedure is straightforward and not very sensitive to the practical choice of the block length and threshold $u_n$ for fixed sample sizes. Similar strategies are pursued in [9,15] and, more recently, [14]. We also show in an extensive simulation study that the nominal coverage of our estimator obtained from the asymptotic normal distribution is very good for small sample sizes. Finally, our estimator fares very well in a finite-sample comparison with the logs, blocks and runs methods for a wide class of time series exhibiting clustering of extremes and widely discussed in the literature.

The paper is structured as follows. Section 2 discusses the new characterization of the extremal index. Section 3 introduces the family of estimators of θ and its asymptotic properties. In Section 4, we illustrate these asymptotic results with a simulation experiment for time series exhibiting clustering of extreme values. In Section 5, our alternative estimation method is applied to test for the presence of clustering of extremes in monthly macroeconomic time series of unemployment growth and inflation rates in the United States. Section 6 concludes.

2. Characterization of the Extremal Index

It is well known for sequences of independent and identically distributed (i.i.d.) random variables following an unknown distribution $F$ that, under some regularity conditions, the asymptotic distribution of the sample maximum is non-degenerate. In particular, for some suitable constants $a_n > 0$, $b_n$, we have that

$$P\{a_n^{-1}(M_{1,n} - b_n) \leq x\} \longrightarrow G(x), \qquad (5)$$

where $G$ is the relevant limiting distribution function that must be one of the following types (see [19]),

Type I (Gumbel): $G(x) = e^{-e^{-x}}, \; -\infty < x < \infty$.
Type II (Fréchet): $G(x) = \begin{cases} 0 & x \leq 0, \\ e^{-x^{-1/ξ}} & x > 0, \end{cases}$ with $ξ > 0$.
Type III (Weibull): $G(x) = \begin{cases} e^{-(-x)^{-1/ξ}} & x < 0, \\ 1 & x \geq 0, \end{cases}$ with $ξ < 0$.

Taking logs in both terms of (5) and denoting $u_n(x) = a_n x + b_n$, we observe that

$$n(1 - F(u_n(x))) \longrightarrow τ(x) \quad \text{as } n \to \infty, \qquad (6)$$

with $τ(x)$ a positive real function defined by the exponent of any of the three extreme value distributions introduced above.
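The step from (5) to (6) can be spelled out as follows. This is only a sketch for the i.i.d. case, where $P\{M_{1,n} \leq u_n(x)\} = F^n(u_n(x))$; it assumes $0 < G(x) < 1$ and uses the first-order expansion $\log F = -(1 - F)(1 + o(1))$, valid because $F(u_n(x)) \to 1$.

```latex
\begin{align*}
F^{n}(u_n(x)) \longrightarrow G(x)
  &\iff n \log F(u_n(x)) \longrightarrow \log G(x) \\
  &\iff n\,\bigl(1 - F(u_n(x))\bigr)\,(1 + o(1)) \longrightarrow -\log G(x) =: \tau(x).
\end{align*}
% Examples: \tau(x) = e^{-x} for the Gumbel type,
%           \tau(x) = x^{-1/\xi}, \; x > 0, for the Fr\'echet type.
```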

For large $n$ and $u_n$ sufficiently high, Equation (6) is sufficient to define a family of random variables $Z_{u_n(x)} := \sum_{i=1}^{n} I(X_i > u_n(x))$ indexed by $x$ that converges in distribution to a family of Poisson random variables with mean $τ(x)$; see [20,21]. For the sake of simplicity in the exposition, we will assume hereafter $x$ fixed and will use $u_n$ to denote the threshold sequence $u_n(x)$. Similarly, $Z_{u_n}$ denotes the corresponding random variable.
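A small simulation can make the Poisson approximation concrete. The sketch below is illustrative only: it assumes i.i.d. standard exponential observations (a choice made here for convenience, not a model taken from the article), for which the threshold $u_n = \log(n/τ)$ satisfies $n(1 - F(u_n)) = τ$ exactly. The function name `exceedance_counts` is hypothetical. If the approximation holds, the simulated counts should have mean and variance both close to $τ$.

```python
import math
import random


def exceedance_counts(n, tau, reps, seed=0):
    """Simulate Z_{u_n} = #{i : X_i > u_n} for i.i.d. standard exponential X_i."""
    rng = random.Random(seed)
    # Threshold chosen so that n * (1 - F(u_n)) = tau exactly for F(x) = 1 - exp(-x).
    u_n = math.log(n / tau)
    counts = []
    for _ in range(reps):
        z = sum(1 for _ in range(n) if rng.expovariate(1.0) > u_n)
        counts.append(z)
    return counts


counts = exceedance_counts(n=500, tau=2.0, reps=2000)
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
# For a Poisson limit, mean and variance should both be near tau = 2.
print(f"mean = {mean:.3f}, variance = {var:.3f}")
```

Under serial dependence the raw count $Z_{u_n}$ generally loses this property, which is what motivates working with block maxima.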

These important and well-known results of extreme value theory (EVT) can be extended to study the maximum of a wide class of dependent processes. We concentrate here on stationary sequences where the extent of long-range dependence is restricted by a distributional mixing condition $D(u_n)$ introduced in [3]. This mixing condition is said to hold for a sequence $\{u_n\}$ if for any integers $1 \leq i_1 < \dots < i_p < j_1 < \dots < j_{p'} \leq n$ for which $j_1 - i_p \geq l_n$, we have

$$D(u_n): \quad \left|F_{i_1,\dots,i_p,j_1,\dots,j_{p'}}(u_n) - F_{i_1,\dots,i_p}(u_n)\, F_{j_1,\dots,j_{p'}}(u_n)\right| \leq α_{n,l_n}, \qquad (7)$$

where $α_{n,l_n} \to 0$ as $n \to \infty$ for some $l_n = o(n)$ and where $F_{i_1,\dots,i_p}(u_n)$ denotes $P\{X_{i_1} \leq u_n, \dots, X_{i_p} \leq u_n\}$. This condition entails the asymptotic serial independence of the extreme events, defined as exceedances over the threshold $u_n$.

Under this mixing condition, expressions (1) and (5) guarantee that

$$P\{M_{1,n} \leq u_n\} \longrightarrow G^{θ}(x), \quad \text{as } n \to \infty. \qquad (8)$$

Furthermore, [3] showed that there exist different partitions of the sequence $\{X_i, i \geq 1\}$ of length $n$ defined by $k_n$ blocks of size $r_n$, with $k_n \to \infty$, $k_n = o(n)$, $k_n l_n = o(n)$ with $l_n$ introduced in (7), and $r_n = [n/k_n]$, with $[\cdot]$ the integer part, such that

$$P\{M_{1,n} \leq u_n\} - P^{k_n}\{M_{1,r_n} \leq u_n\} \longrightarrow 0, \quad \text{as } n \to \infty. \qquad (9)$$

This approximation of the asymptotic distribution of the sample maximum under serial dependence and condition (8) imply, after taking logs, that

$$k_n\,(1 - F_{1,\dots,r_n}(u_n)) \longrightarrow θτ. \qquad (10)$$

In this environment, the random variable $Z_{u_n}$ does not consist of independent elements and, in general, no longer converges in distribution to a Poisson random variable. Nonetheless, this random variable can be thinned to eliminate the presence of serial short-range dependence in the extremes. The thinning process consists of dividing the sequence of length $n$ into $k_n$ blocks of size $r_n$ and choosing the block maxima that exceed the level $u_n$. This method allows one to define a new random variable denoted $Z^{*}_{u_n} := \sum_{j=1}^{k_n} I(M_{(j-1)r_n+1,\,jr_n} > u_n)$ whose observations, under $D(u_n)$, are asymptotically serially independent.
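The thinning step can be sketched in a few lines. The helper names `block_maxima` and `exceedance_count` and the toy data are hypothetical and chosen only to show the difference between the raw count $Z_{u_n}$ and the thinned count $Z^{*}_{u_n}$; incomplete trailing blocks are dropped, which is an assumption of this sketch.

```python
def block_maxima(x, r):
    """Maxima of consecutive non-overlapping blocks of length r (incomplete tail dropped)."""
    k = len(x) // r
    return [max(x[j * r:(j + 1) * r]) for j in range(k)]


def exceedance_count(values, u):
    """Number of entries strictly above the level u."""
    return sum(1 for v in values if v > u)


# Toy series with a cluster of two exceedances in the first block.
x = [0.1, 0.9, 0.95, 0.2, 0.3, 0.1, 0.4, 0.2, 0.85, 0.1, 0.2, 0.3]
u = 0.8
z = exceedance_count(x, u)                          # raw exceedances, Z_u
z_star = exceedance_count(block_maxima(x, 4), u)    # blocks whose maximum exceeds u, Z*_u
print(z, z_star)
```

In this toy series the two adjacent exceedances in the first block are counted once after thinning, so the thinned count is smaller than the raw count.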

Theorem 4.1 of [3] uses this thinning to define a point process $N_n^{(u_n)}$ on the interval $(0, 1]$ consisting of the elements of $Z^{*}_{u_n}$ indexed by $j/k_n$, $j = 1, \dots, k_n$, and that converges in distribution to a Poisson process with mean $θτ$, denoted hereafter $N(θτ)$.

We build on these results, in particular condition (3), to define a sequence of extreme levels $v_n$ characterized by the following condition

$$E\left[\sum_{j=1}^{r_n} I(X_j > v_n) \,\Big|\, \sum_{j=1}^{r_n} I(X_j > u_n) \geq 1\right] \longrightarrow 1, \qquad (11)$$

where $v_n$ depends on the choice of $u_n$ and satisfies by construction that $v_n \geq u_n$. Further, given that

$$E\left[\sum_{j=1}^{r_n} I(X_j > v_n) \,\Big|\, \sum_{j=1}^{r_n} I(X_j > u_n) \geq 1\right] = \frac{r_n P\{X_j > v_n\}}{P\left\{\bigcup_{j=1}^{r_n} (X_j > u_n)\right\}}, \qquad (12)$$

and that $v_n$ satisfies (11), it follows by (10) and by multiplying the numerator and denominator of the right-hand side by $k_n$ that

$$n\,(1 - F(v_n)) \longrightarrow θτ, \quad \text{with } 0 < τ < \infty, \text{ as } n \to \infty. \qquad (13)$$
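The argument leading to (13) can be written out step by step. The lines below are a sketch that uses only (10), (11) and (12), the identity $P\{\bigcup_{j=1}^{r_n}(X_j > u_n)\} = 1 - F_{1,\dots,r_n}(u_n)$, and the fact that $k_n r_n / n \to 1$ because $r_n = [n/k_n]$.

```latex
\begin{align*}
\frac{r_n P\{X_j > v_n\}}{P\bigl\{\bigcup_{j=1}^{r_n} (X_j > u_n)\bigr\}}
  &= \frac{k_n r_n \,\bigl(1 - F(v_n)\bigr)}{k_n \,\bigl(1 - F_{1,\dots,r_n}(u_n)\bigr)}
   \longrightarrow 1
  && \text{by (11) and (12)}, \\
k_n \,\bigl(1 - F_{1,\dots,r_n}(u_n)\bigr) &\longrightarrow \theta\tau
  && \text{by (10)}, \\
\Longrightarrow \quad
n\,\bigl(1 - F(v_n)\bigr)
  &= \frac{n}{k_n r_n}\; k_n r_n\,\bigl(1 - F(v_n)\bigr) \longrightarrow \theta\tau
  && \text{since } k_n r_n / n \to 1 .
\end{align*}
```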

This condition implies that

$$P\{M_{1,n} \leq v_n\} \longrightarrow G^{θ^2} \quad \text{as } n \to \infty, \qquad (14)$$

and therefore, for appropriate sequences $k_n$ and $r_n$, this is equivalent to

$$k_n\,(1 - F_{1,\dots,r_n}(v_n)) \longrightarrow θ^2 τ \quad \text{as } n \to \infty. \qquad (15)$$

The sequence $v_n$ defines a further thinning of $Z^{*}_{u_n}$ given by $Z^{*}_{v_n} := \sum_{j=1}^{k_n} I(M_{(j-1)r_n+1,\,jr_n} > v_n)$ and an associated point process $N_n^{(v_n)}$ in $(0, 1]$, indexed by $j/k_n$, that satisfies the following result:

Theorem 1. Let the stationary sequence $\{X_i\}_{i=1}^{n}$ satisfy $D(u_n)$ where $u_n$ satisfies (6). Let $k_n \to \infty$, $k_n = o(n)$ and $k_n l_n = o(n)$ with $l_n$ introduced in (7). Let $\{X_i\}_{i=1}^{n}$ have extremal index θ, with $0 < θ \leq 1$. Then, the point process $N_n^{(v_n)}$, with $v_n$ satisfying (11), converges in distribution to a Poisson process $N'(θ^2 τ)$ on $(0, 1]$.

The proof of this result follows from Theorem 4.1 of [3] and the above conditions. This theorem allows us to introduce an alternative characterization of the extremal index as the ratio of the intensities of the limits of the point processes $N_n^{(v_n)}$ and $N_n^{(u_n)}$.

Corollary 1. Under the assumptions of Theorem 1, the extremal index θ is the ratio of the intensity parameters of the Poisson processes $N'(θ^2 τ)$ and $N(θτ)$, defined as the limits of the processes $N_n^{(v_n)}$ and $N_n^{(u_n)}$, respectively.

Alternatively, we observe that by operating with expressions (10) and (15), the extremal index θ can be characterized as in the following corollary:

Corollary 2. Under the assumptions of Theorem 1, the extremal index θ is characterized as the limit of the probability $P\{M_{1,r_n} > v_n \mid M_{1,r_n} > u_n\}$.

The proof of this result is immediate by observing that

$$P\{M_{1,r_n} > v_n \mid M_{1,r_n} > u_n\} = \frac{1 - F_{1,\dots,r_n}(v_n)}{1 - F_{1,\dots,r_n}(u_n)} \longrightarrow θ, \quad \text{as } k_n \to \infty. \qquad (16)$$

The sequences $v_n$ and $u_n$ are extreme levels that determine two point processes that converge in the limit to the above Poisson processes. In contrast to the standard versions of the logs, blocks and runs methods, the family of estimators of θ derived from the new characterizations above is consistent, and its asymptotic distribution is Gaussian. We explore these properties in the following section.

3. Estimation of the Extremal Index

The extremal index provides a measure of the clustering of the largest observations of a stationary sequence. We review in this section three of the most popular estimators of θ introduced in the literature: the logs method, the blocks method and the runs estimator. The section follows with the definition of a novel consistent and asymptotically normal estimator of θ that stems naturally from our characterization of the extremal index as the limiting probability defined by the sequences $u_n$ and $v_n$.

The logs method builds on the approximation of the asymptotic distribution of $M_{1,n}$ given by $P^{k_n}\{M_{1,r_n} \leq u_n\}$, with $\{k_n, r_n\}$ an appropriate partition of the sample of size $n$. The estimator takes the form

$$\hat{θ}_n^{(1)} = \frac{\log(1 - Z^{*}_{u_n}/k_n)}{r_n \log(1 - Z_{u_n}/n)}, \qquad (17)$$

with $Z_{u_n}/n$ the empirical counterpart of $1 - F(u_n)$ and $Z^{*}_{u_n}/k_n$ of $1 - F_{1,\dots,r_n}(u_n)$.

Alternatively, the concept of extremal index introduced by [3] and given by interpreting $θ^{-1}$ as the limiting mean cluster size of the exceedances yields the blocks method:

$$\hat{θ}_n^{(2)} = \frac{Z^{*}_{u_n}}{Z_{u_n}}. \qquad (18)$$

This estimator can be regarded as an approximation of $\hat{θ}_n^{(1)}$ using the first order expansions of the logarithm for the numerator and denominator. Another popular estimator is the runs estimator; this method naturally follows from the characterization of θ in [6] and is given by

$$\bar{θ}_n = \frac{W_{u_n}}{Z_{u_n}}, \qquad (19)$$

where $W_{u_n} = \sum_{i=1}^{n-r_n} I(X_i > u_n)\,(1 - I(X_{i+1} > u_n)) \cdots (1 - I(X_{i+r_n} > u_n))$.
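The three classical estimators (17), (18) and (19) can be computed directly from their definitions. The following is a minimal sketch, not the author's code: the function name `logs_blocks_runs` is hypothetical, incomplete trailing blocks are ignored when forming block maxima, and the logs estimator is returned as NaN when every block contains an exceedance, the case in which it is not well defined.

```python
import math


def logs_blocks_runs(x, u, r):
    """Return the (logs, blocks, runs) estimates of the extremal index for level u and block size r."""
    n = len(x)
    k = n // r                                   # number of complete blocks
    exceed = [xi > u for xi in x]
    z = sum(exceed)                              # Z_u: raw number of exceedances
    if z == 0:
        raise ValueError("no exceedances of u; choose a lower level")
    # Z*_u: number of blocks containing at least one exceedance.
    z_star = sum(any(exceed[j * r:(j + 1) * r]) for j in range(k))
    # W_u: exceedances followed by r consecutive non-exceedances.
    w = sum(1 for i in range(n - r)
            if exceed[i] and not any(exceed[i + 1:i + r + 1]))
    blocks = z_star / z
    runs = w / z
    if z_star < k and z < n:
        logs = math.log(1 - z_star / k) / (r * math.log(1 - z / n))
    else:
        logs = float("nan")                      # logarithm of zero: estimator undefined
    return logs, blocks, runs


x = [0.1, 0.9, 0.95, 0.2, 0.3, 0.1, 0.4, 0.2, 0.85, 0.1, 0.2, 0.3]
logs, blocks, runs = logs_blocks_runs(x, u=0.8, r=4)
print(logs, blocks, runs)
```

On such a short toy series the three values differ markedly, which is in line with the sensitivity to the block size discussed in the simulation section.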

The first two statistical moments of these estimators are studied in [4]. However, due to the Poisson character of $Z^{*}_{u_n}$ entailed by the choice of extreme levels $u_n$ and the potential strong dependence between adjacent observations in $Z_{u_n}$, the consistency and asymptotic distribution of these estimators are compromised and only achieved after cumbersome standardizations; see [15], for example. In what follows, we introduce a new family of estimators for θ indexed by the sequence $r_n$ obtained from the partition of $n$ into $k_n$ blocks and based on the novel characterization of the extremal index in Corollary 2. Furthermore, under appropriate choices of $k_n$ satisfying the conditions stated in Theorem 1, this family of estimators is consistent and asymptotically normal.

Definition 1. Let $u_n$ and $v_n$ be two sequences satisfying $D(\cdot)$, property (11) and the conditions in Theorem 1. Then, we can define the following family of estimators of θ indexed by $r_n$ as

$$\tilde{θ}_n(r_n) = \frac{Z^{*}_{v_n}}{Z^{*}_{u_n}}. \qquad (20)$$

This estimator can be interpreted as a refinement of the blocks method in which the sequence $v_n$ satisfies the empirical version of condition (11), that is $Z^{*}_{u_n} = Z_{v_n}$, and replaces $u_n$ in (18). This specific choice of the sequence $v_n$ with $v_n > u_n$ implies that our estimator, in contrast to the blocks method, consists of a ratio of asymptotically i.i.d. processes, and as we will see in the result below, it is consistent and asymptotically normal. Alternative refinements of the blocks method are given in [9].

Before introducing the consistency and asymptotic normality of $\tilde{θ}_n(r_n)$, we need the following corollary:

Corollary 3. Let $Z^{*}_{u_n}$ and $Z^{*}_{v_n}$ be quantities defined by sequences $u_n$ and $v_n$ that satisfy $D(\cdot)$, property (6) and a partition determined by the sequence $k_n$ satisfying the conditions in Theorem 1. Then:

$$Z^{*}_{u_n} = k_n\,(1 - F_{1,\dots,r_n}(u_n)) + o_P(k_n), \quad \text{with } k_n \to \infty, \qquad (21)$$

and

$$Z^{*}_{v_n} = k_n\,(1 - F_{1,\dots,r_n}(v_n)) + o_P(k_n), \quad \text{with } k_n \to \infty. \qquad (22)$$

Proof of Corollary 3. For the sake of space, we only show here the proof for the sequence $u_n$. This result can be obtained from applying Chebyshev’s inequality to the quantity $\frac{1}{k_n} \sum_{j=1}^{k_n} I(M_{(j-1)r_n+1,\,jr_n} > u_n)$. More specifically, from this inequality and given that $I(M_{(j-1)r_n+1,\,jr_n} > u_n)$ are Bernoulli random variables with variance $(1 - F_{1,\dots,r_n}(u_n))\, F_{1,\dots,r_n}(u_n)$, we know that

$$P\left\{\left|\frac{1}{k_n} \sum_{j=1}^{k_n} I_j - (1 - F_{1,\dots,r_n}(u_n))\right| > ε\right\} \leq \frac{(1 - F_{1,\dots,r_n}(u_n))\, F_{1,\dots,r_n}(u_n)}{k_n ε^2} + \frac{2 \sum_{i=1}^{k_n} \sum_{j>i}^{k_n} \mathrm{cov}(I_i, I_j)}{k_n^2 ε^2}, \qquad (23)$$

where, for the sake of space in the expression, $I_j := I(M_{(j-1)r_n+1,\,jr_n} > u_n)$.

By definition of the sequence $u_n$ and condition $D(u_n)$, the right-hand side of the preceding expression converges to zero, for $k_n \to \infty$ and for every $ε > 0$. Then

$$Z^{*}_{u_n} := \sum_{j=1}^{k_n} I(M_{(j-1)r_n+1,\,jr_n} > u_n) = k_n\,(1 - F_{1,\dots,r_n}(u_n)) + o_P(k_n). \qquad (24)$$

Further, the Poisson character of the quantity $\sum_{j=1}^{k_n} I(M_{(j-1)r_n+1,\,jr_n} > u_n)$ and (10) imply that

$$Z^{*}_{u_n} = θτ + o_P(k_n), \quad \text{for } k_n \to \infty, \qquad (25)$$

with $0 < τ < \infty$ constant. ☐

With these results in place, we are ready to introduce the following results.

Theorem 2. Let $u_n$ and $v_n$ be sequences satisfying $D(\cdot)$, property (11) and the conditions in Theorem 1. Then

$$\tilde{θ}_n(r_n) = θ + o_P(1), \quad \text{as } k_n \to \infty, \qquad (26)$$

and

$$\sqrt{k_n\,(1 - F_{1,\dots,r_n}(u_n))}\;(\tilde{θ}_n(r_n) - θ) \overset{d}{\longrightarrow} N(0, θ), \quad \text{as } k_n \to \infty, \qquad (27)$$

with $d$ denoting convergence in distribution.

Proof of Theorem 2. Let $u_n$ and $v_n$ be sequences satisfying $D(u_n)$, $D(v_n)$, (11) and the conditions in Theorem 1. Let $\tilde{θ}_n(r_n)$ be the family of estimators of θ introduced above and $θ_n = \frac{1 - F_{1,\dots,r_n}(v_n)}{1 - F_{1,\dots,r_n}(u_n)}$. Now, by Chebyshev’s inequality applied to the quantity $\frac{1}{k_n} \frac{\sum_{j=1}^{k_n} I(M_{(j-1)r_n+1,\,jr_n} > v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}}$, we obtain

$$P\left\{\left|\frac{1}{k_n} \frac{\sum_{j=1}^{k_n} I(M_{(j-1)r_n+1,\,jr_n} > v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}} - \frac{1 - F_{1,\dots,r_n}(v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}}\right| > ε\right\} \qquad (28)$$

$$\leq \frac{(1 - F_{1,\dots,r_n}(v_n))\, F_{1,\dots,r_n}(v_n)}{k_n\,(1 - F_{1,\dots,r_n}(u_n))\, ε^2} + \frac{2 \sum_{i=1}^{k_n} \sum_{j>i}^{k_n} \mathrm{cov}\left(I(M_{(i-1)r_n+1,\,ir_n} > v_n),\, I(M_{(j-1)r_n+1,\,jr_n} > v_n)\right)}{k_n^2 ε^2}, \qquad (29)$$

which converges to zero for every $ε > 0$, by condition $D(v_n)$ and given that $F_{1,\dots,r_n}(v_n) \longrightarrow 1$, as $k_n \to \infty$. Therefore, it holds that

$$\frac{\sum_{j=1}^{k_n} I(M_{(j-1)r_n+1,\,jr_n} > v_n)}{k_n \sqrt{1 - F_{1,\dots,r_n}(u_n)}} = \frac{1 - F_{1,\dots,r_n}(v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}} + o_P(1). \qquad (30)$$

Now, by Corollary 3,

$$\frac{Z^{*}_{u_n}}{k_n} = 1 - F_{1,\dots,r_n}(u_n) + o_P(1). \qquad (31)$$

The ratio of the last two expressions implies that

$$\frac{\tilde{θ}_n(r_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}} = \frac{θ_n}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}} + o_P(1). \qquad (32)$$

Finally, from Corollary 2, it follows that $θ_n = θ + o(1)$ as $k_n \to \infty$. Then

$$(\tilde{θ}_n(r_n) - θ) = o_P\left(\sqrt{1 - F_{1,\dots,r_n}(u_n)}\right). \qquad (33)$$

For notational convenience, we can write the former result as

$$\sqrt{1 - F_{1,\dots,r_n}(u_n)}\;(\tilde{θ}_n(r_n) - θ_n) = o_P(1 - F_{1,\dots,r_n}(u_n)),$$

which implies

$$\sqrt{1 - F_{1,\dots,r_n}(u_n)}\;(\tilde{θ}_n(r_n) - θ_n) = o_P(1),$$

given that $1 - F_{1,\dots,r_n}(u_n) \longrightarrow 0$ as $k_n \to \infty$.

For the second result of the theorem, we first note that expression (30) multiplied by the normalizing rate $\sqrt{k_n}$ can be written as

$$\frac{1}{\sqrt{k_n}} \sum_{j=1}^{k_n} \left(\frac{I(M_{(j-1)r_n+1,\,jr_n} > v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}} - \frac{1 - F_{1,\dots,r_n}(v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}}\right). \qquad (34)$$

Now, by condition $D(u_n)$, the observations

$$\left(\frac{I(M_{(j-1)r_n+1,\,jr_n} > v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}} - \frac{1 - F_{1,\dots,r_n}(v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}}\right)$$

have zero mean and are asymptotically serially independent. Furthermore, the variance of each observation is given by

$$V\left(\frac{I(M_{(j-1)r_n+1,\,jr_n} > v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}}\right) = \frac{(1 - F_{1,\dots,r_n}(v_n))\, F_{1,\dots,r_n}(v_n)}{1 - F_{1,\dots,r_n}(u_n)} = θ_n F_{1,\dots,r_n}(v_n). \qquad (35)$$

The observations are indexed by the block size $r_n$ corresponding to the partition $k_n$. This implies that the standard central limit theorem results cannot be applied. Instead, we have to use the triangular array version of the Lindeberg-Lévy central limit theorem defined by elements $Y_{k_n j} = \frac{I(M_{(j-1)r_n+1,\,jr_n} > v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}}$. The Lindeberg condition in this framework is

$$\lim_{k_n \to \infty} \sum_{j=1}^{k_n} s_{k_n}^{-2}\, E\left[Y_{k_n j}^2\, I(|Y_{k_n j}| \geq ε s_{k_n})\right] = 0, \qquad (36)$$

with ε any positive value and where $s_{k_n}^2 = \sum_{j=1}^{k_n} V(Y_{k_n j}) + 2 \sum_{i=1}^{k_n} \sum_{j>i}^{k_n} \mathrm{cov}(Y_{k_n i}, Y_{k_n j})$, with $V(Y_{k_n j})$ obtained in (35) and $\mathrm{cov}(Y_{k_n i}, Y_{k_n j}) \longrightarrow 0$ as $k_n \to \infty$, by the mixing condition $D(v_n)$. After some basic algebra, it is simple to see that (36) is satisfied, and then, as $k_n \to \infty$,

$$\frac{1}{\sqrt{k_n}} \sum_{j=1}^{k_n} \left(\frac{I(M_{(j-1)r_n+1,\,jr_n} > v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}} - \frac{1 - F_{1,\dots,r_n}(v_n)}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}}\right) \overset{d}{\longrightarrow} N\left(0, \lim_{k_n \to \infty} θ_n F_{1,\dots,r_n}(v_n)\right). \qquad (37)$$

Now, by the consistency result (31) and dividing by $\frac{Z^{*}_{u_n}}{k_n}$, we obtain

$$\sqrt{k_n\,(1 - F_{1,\dots,r_n}(u_n))}\;(\tilde{θ}_n(r_n) - θ_n) \overset{d}{\longrightarrow} N\left(0, \lim_{k_n \to \infty} θ_n F_{1,\dots,r_n}(v_n)\right). \qquad (38)$$

Finally, the choice of $v_n$ implies that $F_{1,\dots,r_n}(v_n) \longrightarrow 1$, and we obtain the desired result. ☐

In fact, the true rate of convergence for the consistency is $\frac{1}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}}$. This is so because

$$\frac{\tilde{θ}_n(r_n) - θ}{\sqrt{1 - F_{1,\dots,r_n}(u_n)}} = o_P(1), \quad \text{as } k_n \to \infty,$$

and $1 - F_{1,\dots,r_n}(u_n) \longrightarrow 0$; see the first part of the proof of Theorem 2 for details. This interesting result implies that

$$\sqrt{1 - F_{1,\dots,r_n}(u_n)}\;(\tilde{θ}_n(r_n) - θ) = o_P(1), \quad \text{as } k_n \to \infty.$$

Hence, from (27), we achieve, as in standard statistical theory, a $\sqrt{k_n}$ rate of convergence for the asymptotic normality.

It remains to decide how to choose in practice the sequence $v_n$ conditional on an appropriate choice of the sequence $u_n$ satisfying $D(u_n)$ and determining $Z^{*}_{u_n}$. A suitable choice for $v_n$ is the sequence that satisfies the empirical counterpart of condition (11):

$$\frac{1}{Z^{*}_{u_n}} \sum_{j=1}^{k_n} \sum_{i=(j-1)r_n+1}^{j r_n} I(X_i > \hat{v}_n) \longrightarrow 1,$$

which yields, naturally, the following estimator of $v_n$:

$$\hat{v}_n = F^{-1}\left(1 - \frac{Z^{*}_{u_n}}{n}\right). \qquad (39)$$

With this sequence, we define the process $Z^{*}_{\hat{v}_n} = \sum_{j=1}^{k_n} I(M_{(j-1)r_n+1,\,jr_n} > \hat{v}_n)$ and obtain the corresponding feasible version of the estimator $\tilde{θ}_n(r_n)$, defined now by

$$\tilde{θ}_n^{f}(r_n) = \frac{Z^{*}_{\hat{v}_n}}{Z^{*}_{u_n}}.$$

Furthermore, by the consistency result obtained in Corollary 3, we can replace the rate $\sqrt{k_n\,(1 - F_{1,\dots,r_n}(u_n))}$ by $\sqrt{Z^{*}_{u_n}}$ and obtain the following statistic:

$$\sqrt{Z^{*}_{u_n}}\;(\tilde{θ}_n^{f}(r_n) - θ).$$

Corollary 4. Let $u_n$ be a sequence satisfying $D(u_n)$ and defining the quantity $Z^{*}_{u_n}$, and let $\hat{v}_n = F^{-1}\left(1 - \frac{Z^{*}_{u_n}}{n}\right)$, with $F(x)$ the distribution function of a sequence $\{X_i\}_{i=1}^{n}$ with extremal index $0 < θ \leq 1$. Furthermore, let $k_n \to \infty$, $k_n = o(n)$ and $k_n l_n = o(n)$ with $l_n$ introduced in (7). Then

$$\tilde{θ}_n^{f}(r_n) = θ + o_P(1), \quad \text{as } k_n \to \infty, \qquad (40)$$

and

$$\sqrt{Z^{*}_{u_n}}\;(\tilde{θ}_n^{f}(r_n) - θ) \overset{d}{\longrightarrow} N(0, θ), \quad \text{as } k_n \to \infty. \qquad (41)$$

Proof of Corollary 4. We will concentrate on the consistency of the estimator. The proof of asymptotic normality, once the consistency result is derived, follows immediately from dividing and multiplying $\sqrt{k_n}$ by $\sqrt{Z^{*}_{u_n}}$ in (41), from the consistency result (31) and from applying Theorem 2.

Let $u_n$ be a sequence satisfying $D(u_n)$ and defining the quantity $Z^{*}_{u_n}$, and let $\hat{v}_n = F^{-1}\left(1 - \frac{Z^{*}_{u_n}}{n}\right)$. We first need to show that $D(\hat{v}_n)$ holds given that $D(u_n)$ does. By Lemma 3.6.2 (iv) in [22], $D(\hat{v}_n)$ will hold if $\hat{v}_n \geq u_n$. Therefore, we need to prove this inequality. A sufficient condition is that $F(u_n) \leq F(\hat{v}_n)$ for any given $n$ sufficiently high. Note that $F(u_n) = 1 - \frac{Z_{u_n}}{n} + o_P(1)$ with $Z_{u_n} = \sum_{i=1}^{n} I(X_i > u_n)$. Then, $F(u_n) \leq F(\hat{v}_n)$ if and only if $Z_{u_n} \geq Z^{*}_{u_n}$. Now, since $Z^{*}_{u_n}$ is a thinning of $Z_{u_n}$, this inequality holds naturally.

For the second part of the consistency proof, we note that by (6) and (10), the mixing condition $D(\hat{v}_n)$ implies that

$$\frac{k_n\,(1 - F_{1,\dots,r_n}(\hat{v}_n))}{n\,(1 - F(\hat{v}_n))} \longrightarrow θ, \quad \text{as } k_n, n \to \infty. \qquad (42)$$

Using this asymptotic relationship, and after some algebra, the estimator $\hat{v}_n$ satisfies

$$\hat{v}_n = F_{1,\dots,r_n}^{-1}\left(1 - \frac{θ\, n\,(1 - F(\hat{v}_n))}{k_n} + o_P\left(\frac{1}{k_n}\right)\right). \qquad (43)$$

Now, replacing (43) into (22) and observing that $n\,(1 - F(\hat{v}_n)) = Z^{*}_{u_n}$, we obtain

$$Z^{*}_{\hat{v}_n} = θ Z^{*}_{u_n} + o_P(1).$$

Finally, by definition of the feasible estimator $\tilde{θ}_n^{f}(r_n)$, it follows that

$$\tilde{θ}_n^{f}(r_n) = θ + o_P(1).$$

☐
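Result (41) suggests an approximate confidence interval for θ. The sketch below is one possible reading rather than a procedure stated in the article: it assumes the limiting variance θ is replaced by the point estimate (a plug-in step), clips the interval to $[0, 1]$, and uses the hypothetical function name `wald_interval` with a default 95% normal critical value.

```python
import math


def wald_interval(theta_hat, z_u, z_crit=1.959964):
    """Approximate interval based on sqrt(Z*_u) * (theta_hat - theta) ~ N(0, theta).

    theta_hat : feasible estimate of the extremal index
    z_u       : Z*_u, the number of block maxima exceeding u_n
    z_crit    : standard normal critical value (default: two-sided 95%)
    """
    # Plug-in standard error: Var(theta_hat) is approximately theta / Z*_u.
    half_width = z_crit * math.sqrt(theta_hat / z_u)
    lower = max(0.0, theta_hat - half_width)
    upper = min(1.0, theta_hat + half_width)     # theta cannot exceed one
    return lower, upper


lower, upper = wald_interval(theta_hat=0.8, z_u=50)
print(f"[{lower:.3f}, {upper:.3f}]")
```

Because the effective sample size is $Z^{*}_{u_n}$ rather than $n$, such intervals can be wide when only a handful of block maxima exceed $u_n$.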

In practice, when the distribution $F$ is unknown, the estimator of $v_n$ is replaced by the empirical version of $F^{-1}\left(1 - \frac{Z^{*}_{u_n}}{n}\right)$, that is, by the extreme order statistic $X_{Z^{*}_{u_n}+1:n}$ from the sequence $X_{1:n} \geq X_{2:n} \geq \dots \geq X_{n:n}$. Furthermore, in the studies in finite samples with $n$ fixed, we will use as a candidate for $u_n$ an extreme order statistic of the sequence of block maxima. In particular, this will be $\hat{u}_n = M_{t+1:k_n}$ with $t > 0$ fixed, implying that $Z^{*}_{\hat{u}_n} = t$ and $\hat{v}_n = X_{t+1:n}$. These results enable us to apply Theorem 2 to make statistical inference about the extremal index. An interesting case is testing for the existence of the clustering of extremes in stationary sequences. It is well known that under limiting serial independence in the extremes, θ takes the value one. Therefore, a valid test is $H_0: θ = 1$ against the one-sided alternative $H_A: θ < 1$. This possibility is explored in the application to macroeconomic time series.
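The order-statistic recipe and the one-sided test can be sketched together. This is an illustrative implementation under stated assumptions, not the author's code: the names `feasible_extremal_index` and `clustering_test` are hypothetical, incomplete trailing blocks are dropped, ties are not handled specially, and the test uses the fact that under $H_0$ the limiting variance in (41) equals one, so that $\sqrt{Z^{*}_{u_n}}\,(\tilde{θ}_n^{f} - 1)$ is compared with the lower tail of a standard normal distribution.

```python
import math


def feasible_extremal_index(x, r, t):
    """Feasible estimator with u_hat = M_{t+1:k} and v_hat = X_{t+1:n} (descending order statistics)."""
    n = len(x)
    k = n // r
    if not 0 < t < k:
        raise ValueError("t must satisfy 0 < t < number of blocks")
    maxima = [max(x[j * r:(j + 1) * r]) for j in range(k)]
    u_hat = sorted(maxima, reverse=True)[t]      # (t+1)-th largest block maximum
    v_hat = sorted(x, reverse=True)[t]           # (t+1)-th largest observation
    z_u = sum(1 for m in maxima if m > u_hat)    # equals t when there are no ties
    z_v = sum(1 for m in maxima if m > v_hat)
    if z_u == 0:
        raise ValueError("ties among block maxima leave no exceedances of u_hat")
    return z_v / z_u, z_u


def clustering_test(theta_hat, z_u):
    """One-sided test of H0: theta = 1 against HA: theta < 1; returns (statistic, p-value)."""
    stat = math.sqrt(z_u) * (theta_hat - 1.0)    # variance equals theta = 1 under H0
    p_value = 0.5 * (1.0 + math.erf(stat / math.sqrt(2.0)))  # standard normal CDF at stat
    return stat, p_value


x = [0.1, 0.9, 0.95, 0.2, 0.3, 0.1, 0.4, 0.2, 0.85, 0.1, 0.2, 0.3]
theta_hat, z_u = feasible_extremal_index(x, r=4, t=2)
stat, p_value = clustering_test(theta_hat, z_u)
print(theta_hat, z_u, stat, p_value)
```

Since the $t + 1$ largest block maxima are themselves observations, $\hat{v}_n \geq \hat{u}_n$ always holds in this construction, so the estimate never exceeds one.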

4. Simulations: Some Examples

This section studies some of the most popular examples of time series exhibiting clustering of extremes. We discuss in detail the following four processes: the autoregressive model of [23], the doubly-stochastic process studied in [4], the moving-maximum process with Fréchet marginals and the maximum of a lagged autoregressive process introduced by [24]. The simulation exercise is divided into two components. First, we compare estimation methods across different partitions, and second, we study the asymptotic coverage of the limiting normal distribution introduced in Theorem 2 for different sample sizes.

The first example is due to [23]. Let

{X_{i}}

be a strictly-stationary first order autoregressive sequence driven by

X_{i} = \frac{1}{r} X_{i - 1} + ε_{i}

, with

r \geq 2

an integer,

ε_{i}

discrete uniform random variables on

{0, 1 / r, \dots, (r - 1) / r}

and independent of

X_{i - 1}

. The random variable

X_{i}

has a uniform distribution on

[0, 1]

, and the extremal index is

θ = \frac{r - 1}{r}

. Figure 1 and Figure 2 report the sequence of estimates of θ for

r = 5

and

r = 2

, respectively. The different panels in these figures report the estimates of θ obtained from the logs method, blocks method, runs method and the estimator

{\tilde{θ}}_{n}^{f}

, for different levels determined by order statistics

{\hat{u}}_{n} = M_{t + 1 : k_{n}}

and

{\hat{v}}_{n} = X_{t + 1 : n}

with

t = 1, 3, 5, 7

, and n = 200.
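For readers who wish to reproduce this design, the following is a minimal simulation sketch of the autoregressive process above, assuming NumPy. The function name `simulate_chernick` and the choice to start the recursion from a uniform draw (the stationary marginal stated above) are our assumptions.

```python
import numpy as np

def simulate_chernick(n, r, rng):
    """Simulate X_i = X_{i-1} / r + eps_i with eps_i uniform on {0, 1/r, ..., (r-1)/r}."""
    eps = rng.integers(0, r, size=n) / r     # discrete uniform innovations
    x = np.empty(n)
    prev = rng.uniform()                     # start from the U[0, 1] marginal
    for i in range(n):
        prev = prev / r + eps[i]
        x[i] = prev
    return x

rng = np.random.default_rng(0)
x = simulate_chernick(200, r=5, rng=rng)
theta = (5 - 1) / 5                          # extremal index for r = 5
```

Since the previous value divided by r is below 1/r and the innovation is at most (r - 1)/r, every simulated value stays in [0, 1).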

A close inspection of the charts shows that the blocks and the runs methods underestimate θ as

r_{n}

increases. This is so because these estimators, by construction, have a decreasing numerator as the block size increases. On the other hand, estimates derived from the logs method are very accurate for the higher threshold sequences employed, but unfortunately, as

u_{n}

decreases, the estimator exhibits problems due to the fact that every single block contains an exceedance (

Z_{{\hat{u}}_{n}}^{*} = k_{n}

). In these cases, this estimator is not well defined. In contrast,

{\tilde{θ}}_{n}^{f}

shows reliable and stable estimates of θ across all threshold levels and partitions employed.

Figure 1. Chernick model with

r = 5

.

θ = 0.8

. Sample mean for different estimators of θ.

m = 100

Monte Carlo simulations.

{\hat{u}}_{n} = M_{t + 1 : k_{n}}

, and

{\hat{v}}_{n} = X_{t + 1 : n}

with

t = 1, 3, 5, 7

; n = 200.

r_{n} \in [1, 20]

. θ plotted with,

{\tilde{θ}}_{n}^{f}

with

- o

,

{\hat{θ}}_{n}^{(1)}

with

- ⋄

,

{\hat{θ}}_{n}^{(2)}

with

- +

, and

{\bar{θ}}_{n}

with the

- *

line.

Figure 2. Chernick model with

r = 2

.

θ = 0.5

.

The second process under investigation is the doubly-stochastic model studied in [4]. Let

{ξ_{i}, i \geq 1}

be i.i.d. with distribution function F; suppose that

Y_{1} = ξ_{1}

and for

i > 1

,

Y_{i} = \begin{cases} Y_{i - 1} & \text{with probability } ψ, \\ ξ_{i} & \text{with probability } 1 - ψ, \end{cases}

the choice being made independently for each i. The doubly-stochastic sequence

{X_{i}, i \geq 1}

is defined by

X_{i} = \begin{cases} Y_{i} & \text{with probability } η, \\ 0 & \text{with probability } 1 - η, \end{cases}

independently of anything else. In this example, the extremal index is

θ = \frac{1 - ψ}{1 - ψ + ψ η}

. Smith and Weissman [4] compare different estimators of θ for

ψ = 0.9

and

η = 0.7

(

θ = 0.137

) and show that the runs method is superior to the rest of the competing estimators. Figure 3 is consistent with their results.

{\tilde{θ}}_{n}^{f}

seems to be, however, a very good competitor of

{\bar{θ}}_{n}

for every single level and outperforms the logs and blocks estimators across all levels. To compare the performance of the runs method against

{\tilde{θ}}_{n}^{f}

, we also estimate the extremal index of this process for

ψ = 0.5

and

η = 0.5

(

θ = 0.667

). Both the runs and blocks methods exhibit the same declining pattern observed before for increasing block sizes (see Figure 4). In this case, however,

{\tilde{θ}}_{n}^{f}

outperforms the rest of the estimators. The logs method is the only competitor whose performance is similar to that of our family of estimators.
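The two parameter settings of the doubly-stochastic model can be checked and simulated with the short sketch below, assuming NumPy. The function names are ours, and the innovation distribution F is taken to be standard exponential purely for illustration, since the text leaves F unspecified.

```python
import numpy as np

def theta_doubly_stochastic(psi, eta):
    """Extremal index (1 - psi) / (1 - psi + psi * eta) of the doubly-stochastic model."""
    return (1 - psi) / (1 - psi + psi * eta)

def simulate_doubly_stochastic(n, psi, eta, rng):
    xi = rng.exponential(size=n)             # assumption: F is standard exponential
    keep = rng.uniform(size=n) < psi         # True: Y_i repeats Y_{i-1}
    y = np.empty(n)
    y[0] = xi[0]
    for i in range(1, n):
        y[i] = y[i - 1] if keep[i] else xi[i]
    observed = rng.uniform(size=n) < eta     # True: X_i = Y_i, otherwise X_i = 0
    return np.where(observed, y, 0.0)

print(round(theta_doubly_stochastic(0.9, 0.7), 3))   # 0.137
print(round(theta_doubly_stochastic(0.5, 0.5), 3))   # 0.667
x = simulate_doubly_stochastic(200, psi=0.9, eta=0.7, rng=np.random.default_rng(0))
```

The two printed values reproduce the extremal indices quoted in the text for the two experiments.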

Figure 3. Doubly-stochastic model with

ψ = 0.9

and

η = 0.7

.

θ = 0.137

. Sample mean for different estimators of θ.

m = 100

Monte Carlo simulations.

{\hat{u}}_{n} = M_{t + 1 : k_{n}}

, and

{\hat{v}}_{n} = X_{t + 1 : n}

, with

t = 1, 3, 5, 7

; n = 200.

r_{n} \in [1, 20]

. θ plotted with,

{\tilde{θ}}_{n}^{f}

with

- o

,

{\hat{θ}}_{n}^{(1)}

with

- ⋄

,

{\hat{θ}}_{n}^{(2)}

with

- +

, and

{\bar{θ}}_{n}

with the

- *

line.

Figure 4. Doubly-stochastic model with

ψ = 0.5

and

η = 0.5

.

θ = 0.667

.

The following example is the moving-maximum process:

y_{i} = \max \{μ X_{i}, (1 - μ) X_{i + 1}\}

, where

{X_{i}, i \geq 1}

is an i.i.d. sequence with the common distribution function standard Fréchet

(ξ = 1)

. The extremal index is

θ = 1 - μ

for

μ < 0.5

and

θ = μ

, otherwise. We consider

μ = 0.5

. Figure 5 shows results in the spirit of those found for the Chernick process with

r = 2

.
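A minimal sketch of the moving-maximum process follows, assuming NumPy and inverse-CDF sampling of the standard Fréchet distribution, F(x) = exp(-1/x). The function names are ours and serve only as an illustration.

```python
import numpy as np

def standard_frechet(size, rng):
    """Draw standard Frechet variates by inverting F(x) = exp(-1 / x)."""
    return -1.0 / np.log(rng.uniform(size=size))

def simulate_moving_maximum(n, mu, rng):
    """y_i = max(mu * X_i, (1 - mu) * X_{i+1}) with i.i.d. standard Frechet X_i."""
    x = standard_frechet(n + 1, rng)         # one extra draw for X_{n+1}
    return np.maximum(mu * x[:-1], (1 - mu) * x[1:])

def theta_moving_maximum(mu):
    """Extremal index: 1 - mu for mu < 0.5 and mu otherwise."""
    return max(mu, 1 - mu)

y = simulate_moving_maximum(200, mu=0.5, rng=np.random.default_rng(0))
```

For μ = 0.5 the helper returns 0.5, the value of θ used in Figure 5.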

This part of the simulation section concludes with the example introduced by L. de Haan:

y_{i} = \max_{k \geq 0} ρ^{k} X_{i - k}

, where

0 < ρ < 1

and

{X_{i}, i \geq 1}

is an i.i.d. sequence with the common distribution function standard Fréchet

(ξ = 1)

. The extremal index is

θ = 1 - ρ

. The results reported in Figure 6 for

ρ = 0.25

are consistent with models exhibiting low, but significant clustering; see, for example, the results of the simulation exercise for the Chernick model and r = 5, where

θ = 0.8

, and the doubly-stochastic process with

ψ = 0.5

and

η = 0.5

, where

θ = 0.667

.
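The maximum over all lags can be simulated without truncating the lag by hand, because the definition implies the recursion y_i = max(X_i, ρ y_{i-1}). The sketch below, assuming NumPy, uses this recursion with a burn-in period to approximate the stationary start; the function name and the burn-in length are our assumptions.

```python
import numpy as np

def simulate_max_autoregressive(n, rho, rng, burn_in=200):
    """Simulate y_i = max_k rho^k X_{i-k} via y_i = max(X_i, rho * y_{i-1})."""
    total = n + burn_in
    x = -1.0 / np.log(rng.uniform(size=total))   # standard Frechet innovations
    y = np.empty(total)
    y[0] = x[0]
    for i in range(1, total):
        y[i] = max(x[i], rho * y[i - 1])
    return y[burn_in:]                           # discard the burn-in period

y = simulate_max_autoregressive(200, rho=0.25, rng=np.random.default_rng(0))
theta = 1 - 0.25                                 # extremal index for rho = 0.25
```

Because ρ^k decays geometrically, the effect of the arbitrary starting value becomes negligible well before the burn-in ends.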

Figure 5. Moving-maximum process with

μ = 0.5

.

θ = 0.5

. Sample mean for different estimators of θ.

m = 100

Monte Carlo simulations.

{\hat{u}}_{n} = M_{t + 1 : k_{n}}

, and

{\hat{v}}_{n} = X_{t + 1 : n}

, with

t = 1, 3, 5, 7

; n = 200.

r_{n} \in [1, 20]

. θ plotted with,

{\tilde{θ}}_{n}^{f}

with

- o

,

{\hat{θ}}_{n}^{(1)}

with

- ⋄

,

{\hat{θ}}_{n}^{(2)}

with

- +

, and

{\bar{θ}}_{n}

with the

- *

line.

Figure 6. Maximum of the lagged autoregressive process with

ρ = 0.25

.

θ = 0.75

.

To compare the blocks and logs methods with our new feasible estimator in more depth, we report in Figure 7 and Figure 8 the estimated mean square error for the doubly-stochastic model. The penalty function is

\mathrm{MSE} ({\tilde{θ}}_{n}^{f} (r_{n})) = \frac{1}{m} \sum_{i = 1}^{m} {({\tilde{θ}}_{n i}^{f} (r_{n}) - θ)}^{2}, \quad \text{with } r_{n} = 1, \dots, 50, \qquad (44)

with

m = 100

simulated Monte Carlo sequences. The results confirm the findings above. Whereas the blocks method performs rather well for the first simulation experiment, it completely fails to report accurate estimates for the second experiment with a higher extremal index.
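The Monte Carlo mean square error in (44) can be sketched generically as follows, assuming NumPy. The helper `monte_carlo_mse` accepts any simulator and estimator; `blocks_ratio` is a textbook blocks-type ratio (blocks containing an exceedance divided by the number of exceedances) included only to make the example runnable, and it may differ in detail from the estimators defined in this paper. Both names are ours.

```python
import numpy as np

def blocks_ratio(x, r_n, u):
    """Blocks containing an exceedance of u divided by the number of exceedances."""
    x = np.asarray(x, dtype=float)
    k_n = len(x) // r_n
    exceed = x[: k_n * r_n].reshape(k_n, r_n) > u
    n_exceed = exceed.sum()
    if n_exceed == 0:
        return np.nan                        # undefined without exceedances
    return exceed.any(axis=1).sum() / n_exceed

def monte_carlo_mse(simulate, estimator, theta, m=100, seed=0):
    """Average squared error of estimator over m simulated sequences."""
    rng = np.random.default_rng(seed)
    estimates = np.array([estimator(simulate(rng)) for _ in range(m)], dtype=float)
    return float(np.mean((estimates - theta) ** 2))

# Usage: i.i.d. uniform data (theta = 1), block size 10, threshold at the 95% sample quantile.
mse = monte_carlo_mse(
    simulate=lambda rng: rng.uniform(size=200),
    estimator=lambda x: blocks_ratio(x, r_n=10, u=np.quantile(x, 0.95)),
    theta=1.0,
)
```

Repeating the call over a grid of block sizes would give one MSE curve per estimator, which is the kind of comparison plotted in Figure 7 and Figure 8.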

Figure 7. Simulated mean square error (MSE) of the estimators of θ for the doubly-stochastic model with

ψ = 0.9

and

η = 0.7

.

m = 100

Monte Carlo simulations.

{\tilde{θ}}_{n}^{f}

with

- o

,

{\hat{θ}}_{n}^{(1)}

with

- ⋄

and

{\hat{θ}}_{n}^{(2)}

with

- +

.

r_{n} \in [1, 50]

. n = 200 in (a), and n = 1000 in (b).

{\hat{u}}_{n} = M_{t + 1 : k_{n}}

and

{\hat{v}}_{n} = X_{t + 1 : n}

with

t = 7

.

Figure 8. Simulated mean square error (MSE) of the estimators of θ for the doubly-stochastic model with

ψ = 0.5

and

η = 0.5

.

m = 100

Monte Carlo simulations.

{\tilde{θ}}_{n}^{f}

with

- o

,

{\hat{θ}}_{n}^{(1)}

with

- ⋄

and

{\hat{θ}}_{n}^{(2)}

with

- +

.

r_{n} \in [1, 50]

. n = 200 in (a), and n = 1000 in (b).

{\hat{u}}_{n} = M_{t + 1 : k_{n}}

and

{\hat{v}}_{n} = X_{t + 1 : n}

with

t = 7

.

This section concludes with a study of the empirical coverage of the confidence intervals implied by the asymptotic normal distribution of our family of estimators. Figure 9 and Figure 10 report the empirical coverage of the two-sided normal confidence intervals for the Chernick and doubly-stochastic models above. The simulations are computed for the order statistics above with

r_{n} \in [1, 50]

and

n = 200, 500, 1000

. The plots illustrate the accuracy of the normal asymptotic distribution in all cases. It is worth noting that the partition of the sample and the extent of clustering play an important role in obtaining the correct nominal coverage for the asymptotic normal approximation. Thus, for large sample sizes, the models exhibiting a low clustering of extremes, Chernick with

r = 5

and the doubly-stochastic model with

ψ = η = 0.5

, only provide accurate approximations as the block size increases.
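The coverage computation itself is simple once the Monte Carlo estimates and their standard errors are available. The sketch below, assuming NumPy and the standard library, takes those two arrays as given; how the standard errors follow from Theorem 2 is not reproduced here, and the function name `empirical_coverage` is ours.

```python
from statistics import NormalDist

import numpy as np

def empirical_coverage(estimates, std_errors, theta, alpha=0.05):
    """Share of two-sided normal intervals estimate +/- z * se that contain theta."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    estimates = np.asarray(estimates, dtype=float)
    half_width = z * np.asarray(std_errors, dtype=float)
    covered = np.abs(estimates - theta) <= half_width
    return float(covered.mean())

# Sanity check with exactly normal estimates: coverage should be close to 1 - alpha.
rng = np.random.default_rng(0)
draws = 0.5 + 0.1 * rng.standard_normal(20000)
coverage = empirical_coverage(draws, np.full(20000, 0.1), theta=0.5)
```

In the simulations reported here, the same calculation would be repeated for each block size and sample size, with α = 0.05.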

Figure 9. Chernick model with

r = 5

in (a) and

r = 2

in (b).

m = 500

Monte Carlo simulations.

+ -

for

n = 200

,

o -

for

n = 500

and

- -

for

n = 1000

.

{\hat{u}}_{n} = M_{t + 1 : k_{n}}

and

{\hat{v}}_{n} = X_{t + 1 : n}

with

t = ⌈ 0.005 k_{n} ⌉

, where

⌈ \cdot ⌉

rounds to the next integer.

r_{n} = [1, 50]

,

k_{n} = n / r_{n}

.

α = 0.05

.

Figure 10. Doubly-stochastic model with

ψ = 0.9

and

η = 0.7

in (a) and

ψ = 0.5

and

η = 0.5

in (b).

m = 500

Monte Carlo simulations.

+ -

for

n = 200

,

o -

for

n = 500

and

- -

for

n = 1000

.

{\hat{u}}_{n} = M_{t + 1 : k_{n}}

and

{\hat{v}}_{n} = X_{t + 1 : n}

with

t = ⌈ 0.005 k_{n} ⌉

, where

⌈ \cdot ⌉

rounds to the next integer.

r_{n} = [1, 50]

,

k_{n} = n / r_{n}

.

α = 0.05

.

5. Application to Macroeconomic Time Series

Macroeconomic series usually exhibit clustering of extreme values that indicates periods of crisis, financial distress, booms and busts. This phenomenon is more acute as the data are sampled at higher frequencies; thus, monthly time series usually exhibit stronger clustering of extremes than quarterly series, and so on. This stylized fact in macroeconomic time series is usually modeled with econometric processes that accommodate volatility clustering; see the ARCH family of volatility processes proposed by [25] or the stochastic volatility processes developed in [26]. These processes are, however, silent about the probability of runs of extremes in one or the other tail, and only after tedious calculations, which most of the time depend on a number of parameters, can one compute the chances of these events. Alternatively, inference about the extremal index is an unexplored option in this field that can lead to very interesting insights about the existence of clustering in the extremes of these sequences and about its persistence.

In this application, we pursue this alternative with monthly data on unemployment growth and inflation rates from the United States spanning the period February 1947 until July 2007 for the first series and February 1947 until June 2008 for the second series. Data are obtained from https://www.economy.com/freelunch. The dynamics of these series are reported in Figure 11 and Figure 12. Dickey-Fuller unit root tests reject the null hypothesis, providing statistical support for the stationarity of both series. As pointed out by a referee, the visual inspection of both processes could also suggest the presence of several structural breaks around periods characterized by drastic changes in monetary policy. In this likely scenario, it would also be reasonable to assume that the unemployment growth process and inflation rates are both stationary around a shifting mean, implying that the analysis of the serial dependence of the extremes of both sequences should be carried out separately for each regime. Nevertheless, for simplicity in the description of the problem, we assume hereafter a constant mean in both cases and, in turn, the stationarity of both processes.

Panel (a) of Figure 13 and of Figure 14 reports the estimates of θ using the four methods discussed in the paper. We can observe from the plots a similar decaying pattern for the four families of estimates. Note, however, that, whereas the sequences of the logs, blocks and runs estimates decay as the block size

r_{n}

increases, the estimator

{\tilde{θ}}_{n}^{f}

proposed in this paper stabilizes after the first partitions around a value of 0.3 for unemployment growth and 0.55 for the inflation rate. The statistical significance of these results is assessed by computing confidence intervals for θ at the

5 %

significance level using the results of Theorem 2 above. These intervals are displayed in panel (b) of Figure 13 and Figure 14.
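Given a point estimate and a standard error, the normal-approximation interval and the one-sided test of θ = 1 against θ < 1 can be sketched as below, using only the Python standard library. The function names are ours, and the standard error of 0.1 in the usage lines is a hypothetical placeholder rather than a value from the paper; only the point estimate 0.3 is taken from the text.

```python
from statistics import NormalDist

def normal_interval(theta_hat, std_error, alpha=0.05):
    """Two-sided normal confidence interval for theta."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return theta_hat - z * std_error, theta_hat + z * std_error

def one_sided_p_value(theta_hat, std_error):
    """p-value for H0: theta = 1 against HA: theta < 1 under the normal approximation."""
    return NormalDist().cdf((theta_hat - 1.0) / std_error)

# Hypothetical standard error of 0.1, used purely for illustration.
lower, upper = normal_interval(0.3, 0.1)
p_value = one_sided_p_value(0.3, 0.1)
```

An interval whose upper end lies below one, or equivalently a small p-value, would indicate clustering of extremes under this approximation.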

The results on the extremal index point towards a strong clustering in the positive extremes of both sequences. A more detailed analysis obtained by inspecting the confidence intervals of each sequence of estimates indicates a stronger clustering for the unemployment series than for inflation.¹ This implies that periods of high unemployment are more persistent than highly inflationary periods. A simple version of the Phillips curve (see [27]) shows that inflation and economic growth are positively correlated or, equivalently, that inflation and unemployment are negatively correlated. These empirical observations would imply that high inflation is followed by economic growth and, hence, falls in unemployment. Our empirical analysis of the extremes of both series adds further insights into this relationship. More specifically, we observe that highly inflationary periods are less persistent than periods with large unemployment growth, suggesting that economic policies focused on producing inflation to boost economic growth can only be successful in the very short term, as inflation quickly returns to normal levels. In contrast, large unemployment growth has lingering effects that are not easily reduced by policies aimed at raising inflation.

Figure 11. Monthly time series of unemployment growth for the United States spanning the period February 1947 to June 2007.

Figure 12. Monthly time series of inflation rates for the United States spanning the period February 1947 to July 2008.

Figure 13. Panel (a) reports different estimates of θ.

r_{n} \in [1, 20]

.

{\tilde{θ}}_{n}^{f}

with

- o

,

{\hat{θ}}_{n}^{(1)}

with

- ⋄

,

{\hat{θ}}_{n}^{(2)}

with

- +

, and

{\bar{θ}}_{n}

with the

- *

line. Panel (b) reports two-sided confidence intervals for θ at

5 %

computed from

{\tilde{θ}}_{n}^{f}

, with

{\hat{u}}_{n} : = M_{t + 1 : k_{n}}

and

{\hat{v}}_{n} : = X_{t + 1 : n}

,

t = 7

.

Figure 14. Panel (a) reports different estimates of θ.

r_{n} \in [1, 20]

.

{\tilde{θ}}_{n}^{f}

with

- o

,

{\hat{θ}}_{n}^{(1)}

with

- ⋄

,

{\hat{θ}}_{n}^{(2)}

with

- +

, and

{\bar{θ}}_{n}

with the

- *

line. Panel (b) reports two-sided confidence intervals for θ at

5 %

computed from

{\tilde{θ}}_{n}^{f}

, with

{\hat{u}}_{n} : = M_{t + 1 : k_{n}}

and

{\hat{v}}_{n} : = X_{t + 1 : n}

,

t = 7

.

6. Conclusions

The existence of serial dependence in the extremes of stationary sequences is summarized in one single parameter: the extremal index. We have introduced in this paper a novel characterization of the extremal index based on the limiting expected value of two point processes defined by the sequence of block maxima determined by appropriate partitions and exceeding different threshold levels. Unlike most estimators for θ in the literature, this characterization yields a family of estimators that is consistent and asymptotically normal. Therefore, in contrast to most estimators of the extremal index, which are downward biased as the block size increases, our family of estimators yields very stable estimates across partitions, adding reliability to the results obtained.

In our application, we observe a significant clustering of extremes in unemployment and inflation rates. A more detailed analysis of the estimates of the extremal index shows that runs of large unemployment growth are more prolonged than runs of high inflation. This indicates a higher persistence of large unemployment rates than of inflationary periods.

Conflicts of Interest

The author declares no conflict of interest.

References

[1] R.M. Loynes. “Extreme Values in Uniformly Mixing Stationary Stochastic Processes.” Ann. Math. Stat. 36 (1965): 993–999.
[2] G.L. O’Brien. “The maximum term of uniformly mixing stationary sequences.” Z. Wahrscheinlichkeitstheorie verw. Gebiete 30 (1974): 57–63.
[3] M.R. Leadbetter. “Extremes and Local Dependence in Stationary Sequences.” Z. Wahrscheinlichkeitstheorie verw. Gebiete 65 (1983): 291–306.
[4] R.L. Smith, and I. Weissman. “Estimating the Extremal index.” J. R. Stat. Soc. Ser. B 56 (1994): 515–528.
[5] P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling Extremal Events for Insurance and Finance. Berlin, Germany: Springer, 1997.
[6] G.L. O’Brien. “Extreme Values for Stationary and Markov Sequences.” Ann. Probab. 15 (1987): 281–291.
[7] S. Coles. An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics; London, UK: Springer-Verlag, 2001.
[8] B. Finkenstädt, and H. Rootzén. “Extreme Values in Finance, Telecommunications and Environment.” In Monographs on Statistics and Applied Probability 99. London, UK: Chapman and Hall, 2003.
[9] S. Novak, and I. Weissman. “On blocks and runs estimators of the extremal index.” J. Stat. Plan. Inference 66 (1998): 281–288.
[10] T. Hsing. “Extremal Index Estimation for a Weakly Dependent Stationary Sequence.” Ann. Stat. 21 (1993): 2043–2071.
[11] C.A. Ferro, and J. Segers. “Inference for clusters of extremes.” J. R. Stat. Soc. Ser. B 65 (2003): 545–556.
[12] F. Laurini, and J.A. Tawn. “New Estimators for the Extremal Index and Other Cluster Characteristics.” Extremes 3 (2003): 189–211.
[13] M.R. Leadbetter, and H. Rootzén. “Extremal theory for stochastic processes.” Ann. Probab. 16 (1988): 431–478.
[14] C.Y. Robert. “Inference for the Limiting Cluster Size Distribution of Extreme Values.” Ann. Stat. 37 (2009): 271–310.
[15] T. Hsing. “Estimating the parameters of rare events.” Stoch. Process. Appl. 37 (1991): 117–139.
[16] H. Drees. “Weighted approximations of tail processes for β-mixing random variables.” Ann. Appl. Probab. 10 (2000): 1274–1301.
[17] H. Drees. “Extreme quantile estimation for dependent data, with applications to finance.” Bernoulli 9 (2003): 617–657.
[18] L. De Haan, C. Mercadier, and C. Zhou. “Adapting extreme value statistics to financial time series: Dealing with bias and serial dependence.” Unpublished manuscript, Erasmus University of Rotterdam, Rotterdam, Netherlands, 2015.
[19] B.V. Gnedenko. “Sur la distribution limite du terme maximum d’une série aléatoire.” Ann. Math. 44 (1943): 423–453.
[20] J.L. Hodges, and L. le Cam. “The Poisson approximation to the Poisson binomial distribution.” Ann. Math. Stat. 31 (1960): 737–740.
[21] E.L. Lehmann. Elements of Large-Sample Theory. New York, NY, USA: Springer-Verlag, 1999.
[22] M.R. Leadbetter, G. Lindgren, and H. Rootzén. Extremes and Related Properties of Random Sequences and Processes. New York, NY, USA: Springer-Verlag, 1983.
[23] M.R. Chernick. “A limit theorem for the maximum of autoregressive processes with uniform marginal distribution.” Ann. Probab. 9 (1981): 145–149.
[24] L. De Haan. “Sample extremes: An elementary introduction.” Stat. Neerlandica 30 (1976): 161–172.
[25] R.F. Engle. “Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation.” Econometrica 50 (1982): 987–1007.
[26] S.J. Taylor. “Modeling stochastic volatility: A review and comparative study.” Math. Financ. 4 (1994): 183–204.
[27] A.W. Phillips. “The Relationship between Unemployment and the Rate of Change of Money Wages in the United Kingdom 1861–1957.” Economica 25 (1958): 283–299.

¹A more formal alternative is to compute confidence intervals for the difference between the extremal indexes of each macroeconomic series.

© 2015 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license ( http://creativecommons.org/licenses/by/4.0/).
