Measures of Dispersion and Serial Dependence in Categorical Time Series

Weiß, Christian H.

doi:10.3390/econometrics7020017

Department of Mathematics and Statistics, Helmut Schmidt University, 22043 Hamburg, Germany

Econometrics 2019, 7(2), 17; https://doi.org/10.3390/econometrics7020017

Submission received: 21 December 2018 / Revised: 1 April 2019 / Accepted: 17 April 2019 / Published: 22 April 2019

(This article belongs to the Special Issue Discrete-Valued Time Series: Modelling, Estimation and Forecasting)

Abstract

The analysis and modeling of categorical time series requires quantifying the extent of dispersion and serial dependence. The dispersion of categorical data is commonly measured by Gini index or entropy, but also the recently proposed extropy measure can be used for this purpose. Regarding signed serial dependence in categorical time series, we consider three types of

κ

-measures. By analyzing bias properties, it is shown that always one of the

κ

-measures is related to one of the above-mentioned dispersion measures. For doing statistical inference based on the sample versions of these dispersion and dependence measures, knowledge on their distribution is required. Therefore, we study the asymptotic distributions and bias corrections of the considered dispersion and dependence measures, and we investigate the finite-sample performance of the resulting asymptotic approximations with simulations. The application of the measures is illustrated with real-data examples from politics, economics and biology.

Keywords:

Cohen’s κ; extropy; nominal variation; signed serial dependence; asymptotic distribution

1. Introduction

In many applications, the available data are not of quantitative nature (e.g., real numbers or counts) but consist of observations from a given finite set of categories. In the present article, we are concerned with data about political goals in Germany, fear states in the stock market, and phrases in a bird’s song. For stochastic modeling, we use a categorical random variable X, i.e. a qualitative random variable taking one of a finite number of categories, e.g.

m + 1

categories with some

m \in N

. If these categories are unordered, X is said to be a nominal random variable, whereas an ordinal random variable requires a natural order of the categories (Agresti 2002). To simplify notations, we always assume the possible outcomes to be arranged in a certain order (either lexicographical or natural order), i.e. we denote the range (state space) as

S = {s_{0}, s_{1}, \dots, s_{m}}

. The stochastic properties of X are determined by the vector of marginal probabilities

p = {(p_{0}, \dots, p_{m})}^{⊤} \in {[0; 1]}^{m + 1}

, where

p_{i} = P (X = s_{i})

(probability mass function, PMF). We abbreviate

s_{k} (p) : = \sum_{j = 0}^{m} p_{j}^{k}

for

k \in N

, where

s_{1} (p) = 1

has to hold. The subscripts “

0, 1, \dots, m

” are used for

S

and p to emphasize that only m of the probabilities can be freely chosen because of the constraint

p_{0} = 1 - p_{1} - \dots - p_{m}

.

Well-established dispersion measures for quantitative data, such as variance or interquartile range, cannot be applied to qualitative data. For a categorical random variable X, one commonly defines dispersion with respect to the uncertainty in predicting the outcome of X (Kvålseth 2011b; Rao 1982; Weiß and Göb 2008). This uncertainty is maximal for a uniform distribution

p_{uni} = {(\frac{1}{m + 1}, \dots, \frac{1}{m + 1})}^{⊤}

on

S

(a reasonable prediction is impossible if all states are equally probable, thus maximal dispersion), whereas it is minimal for a one-point distribution

p_{one}

(i.e., all probability mass concentrates on one category, so a perfect prediction is possible). Obviously, categorical dispersion is just the opposite concept to the concentration of a categorical distribution. To measure the dispersion of the categorical random variable X, the most common approach is to use either the (normalized) Gini index (also index of qualitative variation, IQV) (Kvålseth 1995; Rao 1982) defined as

ν_{G} = \frac{m + 1}{m} (1 - s_{2} (p)),

(1)

or the (normalized) entropy (Blyth 1959; Shannon 1948) given by

ν_{En} = \frac{- 1}{ln (m + 1)} \sum_{i = 0}^{m} p_{i} ln p_{i} with 0 \cdot ln 0 : = 0 .

(2)

Both measures are minimized by a one-point distribution

p_{one}

and maximized by the uniform distribution

p_{uni}

on

S

. While nominal dispersion is always expressed with respect to these extreme cases, it has to be mentioned that there is an alternative scenario of maximal ordinal variation, namely the extreme two-point distribution; however, this is not further considered here.
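As a minimal sketch of Equations (1) and (2), the following Python functions evaluate both normalized measures for a given PMF. The function names are illustrative and not taken from the article; the convention 0 · ln 0 := 0 is implemented by skipping zero probabilities.

```python
import math

def gini_index(p):
    """Normalized Gini index, Equation (1); p is a PMF with m + 1 entries."""
    m = len(p) - 1
    s2 = sum(q ** 2 for q in p)
    return (m + 1) / m * (1.0 - s2)

def entropy(p):
    """Normalized entropy, Equation (2), using the convention 0 * ln 0 := 0."""
    m = len(p) - 1
    total = sum(q * math.log(q) for q in p if q > 0)
    return -total / math.log(m + 1)

p_uni = [0.25, 0.25, 0.25, 0.25]   # uniform distribution, maximal dispersion
p_one = [1.0, 0.0, 0.0, 0.0]       # one-point distribution, minimal dispersion
print(gini_index(p_uni), entropy(p_uni))
print(gini_index(p_one), entropy(p_one))
```

Both functions return values in [0, 1], with the extremes attained at the two boundary distributions.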

If considering a (stationary) categorical process

{(X_{t})}_{Z}

instead of a single random variable, then not only marginal properties are relevant but also information about the serial dependence structure (Weiß 2018). The (signed) autocorrelation function (ACF), as it is commonly applied in case of real-valued processes, cannot be used for categorical data. However, one may use a type of Cohen’s κ instead (Cohen 1960). A

κ

-measure of signed serial dependence in categorical time series is given by (see Weiß 2011, 2013; Weiß and Göb 2008):

κ (h) = \sum_{i = 0}^{m} \frac{p_{i i} (h) - p_{i}^{2}}{1 - s_{2} (p)} \in [\frac{- s_{2} (p)}{1 - s_{2} (p)}; 1] for lags h \in N .

(3)

Equation (3) is based on the lagged bivariate probabilities

p_{i j} (h) = P (X_{t} = i, X_{t - h} = j)

for

i, j = 0, \dots, m

.

κ (h) = 0

for serial independence at lag h, and the strongest degree of positive (negative) dependence is indicated if all

p_{i i} (h) = p_{i}

(

p_{i i} (h) = 0

), i.e., if the event

X_{t - h} = s_{i}

is necessarily followed by

X_{t} = s_{i}

(

X_{t} \neq s_{i}

).
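A sample version of Equation (3) can be sketched as follows. The function name is hypothetical, and dividing the lagged pair counts by n − h is an assumption of this sketch (the article may normalize differently); the marginal probabilities are estimated by relative frequencies.

```python
from collections import Counter

def sample_kappa(x, h):
    """Sample kappa(h) as in Equation (3) for a sequence x of category labels."""
    n = len(x)
    p_hat = {s: c / n for s, c in Counter(x).items()}
    s2 = sum(p ** 2 for p in p_hat.values())
    if s2 == 1.0:
        raise ValueError("kappa(h) is undefined for a one-point distribution")
    # Count lagged pairs (x[t], x[t - h]) that share the same state.
    # Assumption: bivariate frequencies are normalized by n - h.
    same = Counter(x[t] for t in range(h, n) if x[t] == x[t - h])
    num = sum(same.get(s, 0) / (n - h) - p ** 2 for s, p in p_hat.items())
    return num / (1.0 - s2)

x = ["a", "b"] * 10          # strictly alternating series
print(sample_kappa(x, 1))    # each state is always followed by the other one
print(sample_kappa(x, 2))    # each state always recurs after two steps
```

For the alternating series, lag 1 attains the lower bound of the range in Equation (3) and lag 2 attains the upper bound 1.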

Motivated by a mobility index discussed by Shorrocks (1978), a simplified type of

κ

-measure, referred to as the modified κ, was defined by Weiß (2011, 2013):

κ^{*} (h) = \frac{1}{m} \sum_{i = 0}^{m} \frac{p_{i i} (h) - p_{i}^{2}}{p_{i}} \in [- \frac{1}{m}; 1] for lags h \in N .

(4)

Except for the fact that the lower bound of the range differs from the one in Equation (3) (note that this lower bound is free of distributional parameters), we have the same properties as stated before for

κ (h)

. The computation of

κ^{*} (h)

is simplified compared to the one of

κ (h)

and, in particular, its sample version

{\hat{κ}}^{*} (h)

has a simpler asymptotic normal distribution (see Section 5 for details). Unfortunately,

κ^{*} (h)

is not defined as soon as one of the

p_{i}

equals 0, whereas

κ (h)

is well defined for any marginal distribution not being a one-point distribution. This issue may happen quite frequently for the sample version

{\hat{κ}}^{*} (h)

if the given time series is short (a possible circumvention is to replace all summands with

p_{i} = 0

by 0). For this reason,

κ^{*} (h), {\hat{κ}}^{*} (h)

appear to be of limited use for practice as a way of quantifying signed serial dependence. It should be noted that a similar “zero problem” happens with the entropy

ν_{En}

in Equation (2), and, actually, we work out a further relation between

ν_{En}

and

{\hat{κ}}^{*} (h)

below.

In the recent work by Lad et al. (2015), extropy was introduced as a complementary dual to the entropy. Its normalized version is given by

ν_{Ex} = \frac{- 1}{m ln (\frac{m + 1}{m})} \sum_{i = 0}^{m} (1 - p_{i}) ln (1 - p_{i}) .

(5)

Here, the zero problem obviously only happens if one of the

p_{i}

equals 1 (i.e., in the case of a one-point distribution). Similar to the Gini index in Equation (1) and the entropy in Equation (2), the extropy takes its minimal (maximal) value 0 (1) for

p = p_{one}

(

p = p_{uni}

), thus also Equation (5) constitutes a normalized measure of nominal variation. In Section 2, we analyze its properties in comparison to Gini index and entropy. In particular, we focus on the respective sample versions

{\hat{ν}}_{Ex}

,

{\hat{ν}}_{G}

and

{\hat{ν}}_{En}

(see Section 3). To be able to do statistical inference based on

{\hat{ν}}_{Ex}

,

{\hat{ν}}_{G}

and

{\hat{ν}}_{En}

, knowledge about their distribution is required. Up to now, only the asymptotic distribution of

{\hat{ν}}_{G}

and (to some part) of

{\hat{ν}}_{En}

has been derived; in Section 3, comprehensive results for all considered dispersion measures are provided. These asymptotic distributions are then used as approximations to the true sample distributions of

{\hat{ν}}_{Ex}

,

{\hat{ν}}_{G}

and

{\hat{ν}}_{En}

, which is further investigated with simulations and a real application (see Section 4).
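Equation (5) can be evaluated along the following lines. This is a sketch with an illustrative function name; terms with p_i = 1 are skipped, which corresponds to the convention 0 · ln 0 := 0.

```python
import math

def extropy(p):
    """Normalized extropy, Equation (5); p is a PMF with m + 1 entries."""
    m = len(p) - 1
    # Skip p_i = 1, where (1 - p_i) * ln(1 - p_i) is read as 0 * ln 0 := 0.
    total = sum((1 - q) * math.log(1 - q) for q in p if q < 1)
    return -total / (m * math.log((m + 1) / m))

print(extropy([0.25, 0.25, 0.25, 0.25]))  # uniform distribution
print(extropy([1.0, 0.0, 0.0, 0.0]))      # one-point distribution
print(extropy([0.4, 0.3, 0.2, 0.1]))      # intermediate case
```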

The second part of this paper is dedicated to the analysis of serial dependence. As a novel competitor to the measures in Equations (3) and (4), a new type of modified

κ

is proposed, namely

κ^{⋆} (h) = \sum_{i = 0}^{m} \frac{p_{i i} (h) - p_{i}^{2}}{1 - p_{i}} \in [\sum_{i = 0}^{m} \frac{- p_{i}^{2}}{1 - p_{i}}; 1] for lags h \in N .

(6)

Again, this constitutes a measure of signed serial dependence, which shares the before-mentioned (in)dependence properties with

κ (h), κ^{*} (h)

. However, in contrast to

κ^{*} (h)

, the newly proposed

κ^{⋆} (h)

does not have a division-by-zero problem: except for the case of a one-point distribution,

κ^{⋆} (h)

is well defined. Note that, in Section 3.2, it turns out that

κ^{⋆} (h)

is related to

ν_{Ex}

in the same way as

κ (h)

is related to

ν_{G}

and

κ^{*} (h)

to

ν_{En}

. In Section 5, we analyze the sample version of

κ^{⋆} (h)

in comparison to those of

κ (h), κ^{*} (h)

, and we derive its asymptotic distribution under the null hypothesis of serial independence. This allows us to test for significant dependence in categorical time series. The performance of this

{\hat{κ}}^{⋆}

-test, in comparison to those based on

\hat{κ}, {\hat{κ}}^{*}

, is analyzed in Section 6, where also two further real applications are presented. Finally, we conclude in Section 7.
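The different behavior of Equations (4) and (6) in the presence of a zero probability can be illustrated with a short sketch. The function names are hypothetical; both functions take the marginal probabilities p_i and the diagonal bivariate probabilities p_ii(h) as plain lists.

```python
def kappa_mod(p, p_diag):
    """Modified kappa of Equation (4); fails if some p_i equals 0."""
    m = len(p) - 1
    return sum((q - pi ** 2) / pi for pi, q in zip(p, p_diag)) / m

def kappa_new(p, p_diag):
    """New modified kappa of Equation (6); fails only if some p_i equals 1."""
    return sum((q - pi ** 2) / (1 - pi) for pi, q in zip(p, p_diag))

p = [0.5, 0.5, 0.0]        # third category has probability zero
p_diag = [0.3, 0.3, 0.0]   # p_ii(h) for some fixed lag h
print(kappa_new(p, p_diag))
try:
    kappa_mod(p, p_diag)
except ZeroDivisionError:
    print("kappa_mod is undefined because one p_i equals 0")
```

Under independence (p_ii(h) = p_i^2) both functions return 0, and under perfect positive dependence (p_ii(h) = p_i) both return 1, provided all p_i lie strictly between 0 and 1.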

2. Extropy, Entropy and Gini Index

As extropy, entropy and Gini index all serve for the same task, it is interesting to know their relations and differences. An important practical issue is the “

0 ln 0

”-problem, as mentioned above, which never occurs for the Gini index, only occurs in the case of a (deterministic) one-point distribution for the extropy, and occurs for the entropy as soon as one

p_{i} = 0

. Lad et al. (2015) further compared the non-normalized versions of extropy and entropy, and they showed that the first is never larger than the latter. Actually, using the inequality

ln (1 + x) > x / (1 + x / 2)

for

x > 0

from Love (1980), it follows that

- \sum_{i = 0}^{m} p_{i} ln p_{i} \geq - \sum_{i = 0}^{m} (1 - p_{i}) ln (1 - p_{i}) \geq 1 - s_{2} (p),

(7)

(see Appendix B.1 for further details).
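The three non-normalized quantities compared in Equation (7) are easy to evaluate numerically, so their ordering can be inspected for specific PMFs. The following sketch (with an illustrative function name) returns them as a tuple.

```python
import math

def raw_measures(p):
    """Return (extropy, entropy, 1 - s_2(p)), all without normalization."""
    extropy = -sum((1 - q) * math.log(1 - q) for q in p if q < 1)
    entropy = -sum(q * math.log(q) for q in p if q > 0)
    gini = 1 - sum(q ** 2 for q in p)
    return extropy, entropy, gini

for p in ([0.5, 0.25, 0.25], [0.7, 0.1, 0.1, 0.1], [0.4, 0.3, 0.2, 0.1]):
    print(p, [round(v, 4) for v in raw_measures(p)])
```

In the binary case (m = 1), extropy and entropy coincide, since the two sums then contain the same terms.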

Things change, however, if considering the normalized versions

ν_{Ex}

,

ν_{En}

and

ν_{G}

. For illustration, assume an underlying Lambda distribution

L_{m} (λ)

with

λ \in (0; 1)

defined by the probability vector

p_{m; λ} = {(1 - λ + \frac{λ}{m + 1}, \frac{λ}{m + 1}, \dots, \frac{λ}{m + 1})}^{⊤}

(Kvålseth 2011a). Note that

λ \to 0

leads to a one-point distribution, whereas

λ \to 1

leads to the uniform distribution; actually,

L_{m} (λ)

can be understood as a mixture of these boundary cases. For

L_{m} (λ)

, the Gini index satisfies

ν_{G} = λ (2 - λ)

for all

m \in N

(see Kvålseth (2011a)). In addition, the extropy

ν_{Ex}

has rather stable values for varying m (see Figure 1a), whereas the entropy values in Figure 1b change greatly. This complicates the interpretation of the actual level of normalized entropy.
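The identity ν_G = λ(2 − λ) for the Lambda distribution can be checked numerically. In this sketch, `lambda_distribution` and `gini_index` are illustrative helper names, not functions from the article.

```python
def lambda_distribution(m, lam):
    """PMF of the Lambda distribution L_m(lam) on m + 1 categories."""
    rest = lam / (m + 1)
    return [1 - lam + rest] + [rest] * m

def gini_index(p):
    """Normalized Gini index, Equation (1)."""
    m = len(p) - 1
    return (m + 1) / m * (1 - sum(q ** 2 for q in p))

lam = 0.5
for m in (1, 3, 10, 50):
    # The Gini index should not depend on m for fixed lam.
    print(m, gini_index(lambda_distribution(m, lam)), lam * (2 - lam))
```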

Finally, the example

m = 10

plotted in Figure 1c shows that, in contrast to Equation (7), there is no fixed order between the normalized entropy

ν_{En}

and Gini index

ν_{G}

. In this and many further numerical experiments, however, it could be observed that the inequalities

ν_{Ex} \geq ν_{En}

and

ν_{Ex} \geq ν_{G}

hold. These inequalities are formulated as a general conjecture here.

From now on, we turn towards the sample versions

{\hat{ν}}_{Ex}, {\hat{ν}}_{En}, {\hat{ν}}_{G}

of

ν_{Ex}, ν_{En}, ν_{G}

. These are obtained by replacing the probabilities

p_{i}, p

by the respective estimates

{\hat{p}}_{i}, \hat{p}

, which are computed as relative frequencies from the given sample data

x_{1}, \dots, x_{n}

. As detailed in Section 3,

x_{1}, \dots, x_{n}

are assumed as time series data, but we also consider the case of independent and identically distributed (i.i.d.) data.

3. Distribution of Sample Dispersion Measures

To be able to derive the asymptotic distribution of statistics computed from

X_{1}, \dots, X_{n}

, Weiß (2013) assumed that the nominal process is

ϕ

-mixing with exponentially decreasing weights such that the CLT on p. 200 in Billingsley (1999) is applicable. This condition is not only satisfied in the i.i.d.-case, but also for, among others, the so-called NDARMA models as introduced by Jacobs and Lewis (1983) (see Appendix A.1 for details). Then, Weiß (2013) derived the asymptotic distribution of

\sqrt{n} (\hat{p} - p)

, which is the normal distribution

N (0, Σ)

with

Σ = {(σ_{i j})}_{i, j = 0, \dots, m}

given by

σ_{i j} = p_{j} (δ_{i, j} - p_{i}) + \sum_{h = 1}^{\infty} (p_{i j} (h) + p_{j i} (h) - 2 p_{i} p_{j}) .

(8)

Using this result, the asymptotic properties of

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

can be derived, as shown in Appendix B.2. The following subsections present and compare these properties in detail.

3.1. Asymptotic Normality

As shown in Appendix B.2, provided that

p \neq p_{one}, p_{uni}

, all variation measures

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

are asymptotically normally distributed. More precisely,

\sqrt{n} ({\hat{ν}}_{G} - ν_{G})

is asymptotically normally distributed with variance

\begin{matrix} σ_{G}^{2} = & 4 {(\frac{m + 1}{m})}^{2} \sum_{i, j = 0}^{m} p_{i} p_{j} σ_{i j} \\ \overset{Equation (8)}{=} & 4 {(\frac{m + 1}{m})}^{2} (s_{3} (p) - s_{2}^{2} (p)) (1 + 2 \sum_{h = 1}^{\infty} \underbrace{\sum_{i, j = 0}^{m} \frac{(p_{i j} (h) - p_{i} p_{j}) p_{i} p_{j}}{s_{3} (p) - s_{2}^{2} (p)}}_{= : ϑ (h)}), \end{matrix}

(9)

a result already known from Weiß (2013). Here,

ϑ (h)

might be understood as a measure of serial dependence, in analogy to the measures in Equations (3), (4) and (6). In particular,

ϑ (h) = 0

in the i.i.d.-case, and

ϑ (h) = κ (h)

for NDARMA processes (Appendix A.1).

Analogously,

\sqrt{n} ({\hat{ν}}_{En} - ν_{En})

is asymptotically normally distributed with variance

\begin{matrix} σ_{En}^{2} = & \sum_{i, j = 0}^{m} \frac{1 + ln p_{i}}{ln (m + 1)} \frac{1 + ln p_{j}}{ln (m + 1)} σ_{i j} \\ \overset{Equation (8)}{=} & \frac{1}{{(ln (m + 1))}^{2}} (\sum_{i = 0}^{m} p_{i} {(ln p_{i})}^{2} - {(\sum_{i = 0}^{m} p_{i} ln p_{i})}^{2}) \\ & \cdot (1 + 2 \sum_{h = 1}^{\infty} \underbrace{\frac{\sum_{i, j = 0}^{m} (1 + ln p_{i}) (1 + ln p_{j}) (p_{i j} (h) - p_{i} p_{j})}{\sum_{i = 0}^{m} p_{i} {(ln p_{i})}^{2} - {(\sum_{i = 0}^{m} p_{i} ln p_{i})}^{2}}}_{= : ϑ^{*} (h)}) . \end{matrix}

(10)

In the i.i.d.-case, where the last factor becomes 1 (cf. Appendix A.1), this result was given by Blyth (1959), whereas the general expression in Equation (10) can be found in the work of Weiß (2013).

A novel result follows for the extropy, where

\sqrt{n} ({\hat{ν}}_{Ex} - ν_{Ex})

is asymptotically normally distributed with variance

\begin{matrix} σ_{Ex}^{2} = & \sum_{i, j = 0}^{m} \frac{1 + ln (1 - p_{i})}{m ln (\frac{m + 1}{m})} \frac{1 + ln (1 - p_{j})}{m ln (\frac{m + 1}{m})} σ_{i j} \\ \overset{Equation (8)}{=} & \frac{1}{{(m ln (\frac{m + 1}{m}))}^{2}} (\sum_{i = 0}^{m} p_{i} {(ln (1 - p_{i}))}^{2} - {(\sum_{i = 0}^{m} p_{i} ln (1 - p_{i}))}^{2}) \\ & \cdot (1 + 2 \sum_{h = 1}^{\infty} \underbrace{\frac{\sum_{i, j = 0}^{m} (1 + ln (1 - p_{i})) (1 + ln (1 - p_{j})) (p_{i j} (h) - p_{i} p_{j})}{\sum_{i = 0}^{m} p_{i} {(ln (1 - p_{i}))}^{2} - {(\sum_{i = 0}^{m} p_{i} ln (1 - p_{i}))}^{2}}}_{= : ϑ^{⋆} (h)}) . \end{matrix}

(11)

Again, the last factor becomes 1 in the i.i.d.-case as

ϑ^{⋆} (h) = 0

, and

ϑ^{⋆} (h) = κ (h)

for NDARMA processes (Appendix A.1).

In Equations (9)–(11), the notations

ϑ (h)

,

ϑ^{*} (h)

and

ϑ^{⋆} (h)

have been introduced (see the respective expressions covered by the curly bracket) to highlight the similar structure of the asymptotic variances, and to locate the effect of serial dependence. Actually, one might use

ϑ (h)

,

ϑ^{*} (h)

and

ϑ^{⋆} (h)

as measures of serial dependence in categorical time series, although their definition is probably too complex for practical use. In Section 3.2, when analyzing the bias of

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

, analogous relations to the

κ

-measures defined in Section 1 are established.
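In the i.i.d.-case, the last factors in Equations (9)–(11) equal 1, so the asymptotic variances depend on p only. The sketch below evaluates them and builds a plug-in normal confidence interval. Function names are illustrative, all p_i are assumed to lie strictly between 0 and 1, and 1.96 is used as the approximate 97.5% standard normal quantile.

```python
import math

def iid_asymptotic_variances(p):
    """Asymptotic variances of sqrt(n) * (estimate - truth) for i.i.d. data."""
    m = len(p) - 1
    s2 = sum(q ** 2 for q in p)
    s3 = sum(q ** 3 for q in p)
    var_g = 4 * ((m + 1) / m) ** 2 * (s3 - s2 ** 2)            # Equation (9)
    e1 = sum(q * math.log(q) for q in p)
    e2 = sum(q * math.log(q) ** 2 for q in p)
    var_en = (e2 - e1 ** 2) / math.log(m + 1) ** 2              # Equation (10)
    x1 = sum(q * math.log(1 - q) for q in p)
    x2 = sum(q * math.log(1 - q) ** 2 for q in p)
    var_ex = (x2 - x1 ** 2) / (m * math.log((m + 1) / m)) ** 2  # Equation (11)
    return var_g, var_en, var_ex

def normal_ci(estimate, variance, n, z=1.96):
    """Two-sided approximate 95% CI based on the normal approximation."""
    half = z * math.sqrt(variance / n)
    return estimate - half, estimate + half

print(iid_asymptotic_variances([0.4, 0.3, 0.2, 0.1]))
print(iid_asymptotic_variances([0.25] * 4))  # all variances degenerate to 0
```

For the uniform distribution, all three variances vanish, which illustrates why this boundary case is excluded from the normal approximation.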

3.2. Asymptotic Bias

In Appendix B.2, we express the variation measures

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

as centered quadratic polynomials (at least approximately), and subsequently derive a bias formula. For the sample Gini index, it follows that

E [{\hat{ν}}_{G}] \approx ν_{G} - \frac{1}{n} \frac{m + 1}{m} \sum_{i = 0}^{m} σ_{i i} \overset{Equation (8)}{=} ν_{G} (1 - \frac{1}{n} (1 + 2 \sum_{h = 1}^{\infty} κ (h))) .

(12)

This formula was also derived by Weiß (2013), and it leads to the exact corrective factor

(1 - \frac{1}{n})

in the i.i.d.-case. For

{\hat{ν}}_{En}, {\hat{ν}}_{Ex}

, such bias formulae do not exist yet. However, from our derivations in Appendix B.2, we newly obtain that

\begin{matrix} E [{\hat{ν}}_{En}] \approx & ν_{En} - \frac{1}{2 n} \sum_{i = 0}^{m} \frac{p_{i}^{- 1}}{ln (m + 1)} σ_{i i} \overset{Equation (8)}{=} ν_{En} - \frac{1}{2 n} \frac{m}{ln (m + 1)} (1 + 2 \sum_{h = 1}^{\infty} κ^{*} (h)) . \end{matrix}

(13)

In the i.i.d.-case, the last factor reduces to 1, and

κ^{*} (h) = κ (h)

for NDARMA processes (Appendix A.1). Comparing Equations (12) and (13), we see that the effect of serial dependence on the bias is always expressed in terms of a

κ

-measure, using the ordinary

κ

(Equation (3)) for the Gini index, and the modified

κ

(Equation (4)) for the entropy. Concerning the extropy, it turns out that the newly proposed

κ

-measure from Equation (6) takes this role:

\begin{matrix} E [{\hat{ν}}_{Ex}] \approx & ν_{Ex} - \frac{1}{2 n} \sum_{i = 0}^{m} \frac{{(1 - p_{i})}^{- 1}}{m ln (\frac{m + 1}{m})} σ_{i i} \overset{Equation (8)}{=} ν_{Ex} - \frac{1}{2 n} \frac{1}{m ln (\frac{m + 1}{m})} (1 + 2 \sum_{h = 1}^{\infty} κ^{⋆} (h)) . \end{matrix}

(14)

In the i.i.d.-case, the last factor again reduces to 1, and also

κ^{⋆} (h) = κ (h)

holds for NDARMA processes (Appendix A.1). Altogether, Equations (12)–(14) show a common structure regarding the effect of serial dependence. Furthermore, the computed bias corrections imply the relations

ν_{G} \leftrightarrow κ

,

ν_{En} \leftrightarrow κ^{*}

and

ν_{Ex} \leftrightarrow κ^{⋆}

. The sample versions of

κ, κ^{*}, κ^{⋆}

are analyzed later in Section 5.
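For i.i.d. data, the sums over the κ-measures in Equations (12)–(14) vanish, which suggests simple bias corrections. The following is only a sketch under that i.i.d. assumption; the function name is hypothetical, and for serially dependent data the factors 1 + 2 Σ κ(h) etc. would have to be estimated as well.

```python
import math

def bias_corrected_iid(nu_g, nu_en, nu_ex, n, m):
    """Bias-corrected dispersion estimates for i.i.d. data of length n."""
    g = nu_g / (1 - 1 / n)                                  # from Equation (12)
    en = nu_en + m / (2 * n * math.log(m + 1))              # from Equation (13)
    ex = nu_ex + 1 / (2 * n * m * math.log((m + 1) / m))    # from Equation (14)
    return g, en, ex

print(bias_corrected_iid(0.90, 0.90, 0.90, n=100, m=3))
```

All three corrections shift the estimates upwards, in line with the negative bias indicated by the approximations.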

3.3. Asymptotic Properties for Uniform Distribution

The asymptotic normality established in Section 3.1 certainly does not apply to the deterministic case

p = p_{one}

, but we also have to exclude the boundary case of a uniform distribution

p_{uni}

. As shown in Appendix B.2, the asymptotic distribution of

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

in the uniform case is not a normal distribution but a quadratic-form one. All three statistics can be related to the Pearson’s

χ^{2}

-statistic:

\left. \begin{matrix} n m (1 - {\hat{ν}}_{G}) = \\ 2 n ln (m + 1) \cdot (1 - {\hat{ν}}_{En}) \approx \\ 2 n m^{2} ln (\frac{m + 1}{m}) \cdot (1 - {\hat{ν}}_{Ex}) \approx \end{matrix} \right\} n \sum_{j = 0}^{m} \frac{{({\hat{p}}_{j} - \frac{1}{m + 1})}^{2}}{\frac{1}{m + 1}} .

(15)

The actual asymptotic distribution can now be derived by applying Theorem 3.1 in Tan (1977) to the asymptotic result in Equation (8), which requires computing the eigenvalues of

Σ

. In special cases, however, one is not faced with a general quadratic-form distribution but with a

χ_{m}^{2}

-distribution; this happens for NDARMA processes and certainly in the i.i.d.-case (see Weiß (2013)). Then, defining

c = 1 + 2 \sum_{h = 1}^{\infty} κ (h)

, it holds that Equation (15) asymptotically follows c times a

χ_{m}^{2}

-distribution (with

c = 1

in the i.i.d.-case).
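The relations in Equation (15) can be checked on a small example. The sketch below (illustrative function name, hypothetical counts) computes Pearson's statistic together with the three rescaled dispersion statistics from a vector of category counts.

```python
import math

def uniformity_statistics(counts):
    """Pearson's chi-square statistic and the three statistics of Equation (15)."""
    n = sum(counts)
    m = len(counts) - 1
    p_hat = [c / n for c in counts]
    nu_g = (m + 1) / m * (1 - sum(q ** 2 for q in p_hat))
    nu_en = -sum(q * math.log(q) for q in p_hat if q > 0) / math.log(m + 1)
    nu_ex = -sum((1 - q) * math.log(1 - q) for q in p_hat if q < 1) / (
        m * math.log((m + 1) / m))
    u = 1 / (m + 1)
    return {
        "pearson": n * sum((q - u) ** 2 / u for q in p_hat),
        "gini": n * m * (1 - nu_g),
        "entropy": 2 * n * math.log(m + 1) * (1 - nu_en),
        "extropy": 2 * n * m ** 2 * math.log((m + 1) / m) * (1 - nu_ex),
    }

print(uniformity_statistics([30, 25, 25, 20]))
```

The Gini-based statistic reproduces Pearson's statistic exactly, whereas the entropy- and extropy-based statistics agree with it only approximately. For serially dependent data, the statistics would additionally have to be compared with c times the χ²-quantile.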

4. Simulations and Applications

Section 4.1 presents some simulation results regarding the quality of the asymptotic approximations for

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

as derived in Section 3. Section 4.2 then applies these measures within a longitudinal study about the most important goals in politics in Germany.

4.1. Finite-Sample Performance of Dispersion Measures

In applications, the normal and

χ^{2}

-distributions derived in Section 3 are used as approximations to the true distribution of

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

. Therefore, the finite-sample performance of these approximations had to be analyzed, which was done by simulation (with 10,000 replications per scenario). In the tables provided by Appendix C, the simulated means (Table A1) and standard deviations (Table A2) for

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

are reported and compared to the respective asymptotic approximations. Then, a common application scenario was considered, the computation of two-sided 95% confidence intervals (CIs) for

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

. Since the true parameter values are not known in practice, one has to plug estimated parameters into the formulae for mean and variance given in Section 3. The simulated coverage rates reported in Table A3 refer to such plug-in CIs. For these simulations, we used either an i.i.d. data-generating process (DGP) or an NDARMA DGP (see Appendix A.1): a DMA(1) process with

φ_{1} = 0.25

or a DAR(1) process with

ϕ_{1} = 0.40

. These were combined with the marginal distributions (

m = 3

) summarized in Table 1:

p_{1}

and

p_{2}

were used before by Weiß (2011, 2013),

p_{3}

by Kvålseth (2011b), and

p_{4}

to

p_{6}

are Lambda distributions

L_{m} (λ)

with

λ \in {0.25, 0.50, 0.75}

(see Kvålseth 2011a).
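To make the simulation setup more concrete, a DAR(1) path can be generated as sketched below. The model definition is given in Appendix A.1 and is not reproduced here; this sketch assumes the usual DAR(1) recursion, in which X_t repeats X_{t−1} with probability ϕ_1 and is otherwise drawn afresh from the marginal distribution p. The function name and the seed handling are illustrative.

```python
import random

def simulate_dar1(p, phi, n, seed=0):
    """Simulate n observations of an (assumed) DAR(1) process with marginal p."""
    rng = random.Random(seed)
    states = list(range(len(p)))
    x = [rng.choices(states, weights=p)[0]]          # initial draw from p
    for _ in range(n - 1):
        if rng.random() < phi:
            x.append(x[-1])                          # repeat previous state
        else:
            x.append(rng.choices(states, weights=p)[0])  # fresh draw from p
    return x

p5 = [0.625, 0.125, 0.125, 0.125]   # Lambda distribution L_3(0.5)
path = simulate_dar1(p5, phi=0.40, n=250, seed=1)
print(path[:20])
```

Setting `phi=0` yields i.i.d. data, so the same function covers the independent benchmark scenario.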

Finally, we used the results in Section 3.3 to test the null hypothesis (level 5%) of the uniform distribution

L_{3} (1)

(

ν_{G} = ν_{En} = ν_{Ex} = 1

) based on

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

. The considered alternatives were

L_{m} (λ)

with

λ \in {0.98, 0.96, 0.94, 0.92, 0.90}

, and the DGPs were as presented above. The simulated rejection rates are summarized in Table A4. Note that the required factor

c = 1 + 2 \sum_{h = 1}^{\infty} κ (h)

equals

c = (1 + ϕ_{1}) / (1 - ϕ_{1})

with

ϕ_{1} = κ (1)

in the DAR(1) case, and

c = 1 + 2 κ (1) = 1 + 2 φ_{1} (1 - φ_{1})

in the DMA(1) case, and was thus easy to estimate from the given time series data.
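The two closed-form expressions for the factor c can be evaluated directly for the parameter values used in the simulations. The function names are illustrative.

```python
def c_dar1(phi1):
    """Factor c = (1 + phi1) / (1 - phi1) for a DAR(1) process, 0 <= phi1 < 1."""
    return (1 + phi1) / (1 - phi1)

def c_dma1(varphi1):
    """Factor c = 1 + 2 * varphi1 * (1 - varphi1) for a DMA(1) process."""
    return 1 + 2 * varphi1 * (1 - varphi1)

print(c_dar1(0.40))   # DAR(1) scenario of the simulation study
print(c_dma1(0.25))   # DMA(1) scenario of the simulation study
print(c_dar1(0.0))    # reduces to 1 without serial dependence
```

In practice, the unknown dependence parameter would be replaced by an estimate such as the sample value of κ(1) in the DAR(1) case.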

Let us now investigate the simulation results. Comparing the simulated mean values in Table A1 with the true dispersion values in Table 1, we observed a considerable negative bias for small sample sizes, which became even larger with increasing serial dependence. Fortunately, this bias was explained very well by the asymptotic bias correction in all of the considered scenarios. With some limitations, this conclusion also applies to the standard deviations reported in Table A2; however, for sample size

n = 100

and increasing serial dependence, the discrepancy between asymptotic and simulated values increased. As a result, the coverage rates in Table A3 performed rather poorly for sample size 100; thus, reliable CIs generally required a sample of size at least 250. It should also be noted that Gini index and extropy performed very similarly and often slightly worse than the entropy. Finally, the rejection rates in Table A4 concerning the tests for uniformity showed similar sizes (columns “

1.00

”; slightly above 0.05 for

n = 100

) but slightly different power values: best for extropy and worst for entropy.

4.2. Application: Goals in Politics

The monitoring of public mood and political attitudes over time is important for decision makers as well as for social scientists. Since 1980, the German General Social Survey (“ALLBUS”) has been carried out by the “GESIS—Leibniz Institute for the Social Sciences” every second year (exception: there was an additional survey in 1991 after the German reunification). Up to and including 1990, the survey was done only in Western Germany, but it has covered all of Germany from 1991 onward. In what follows, we consider the cumulative report for 1980–2016 in GESIS—Leibniz Institute for the Social Sciences (2018), and there the question “If you had to choose between these different goals, which one would seem to you personally to be the most important?”. The four possible (nominal) answers are

$s_{0}$ : “To maintain law and order in this country”;
$s_{1}$ : “To give citizens more influence on government decisions”;
$s_{2}$ : “To fight rising prices”; and
$s_{3}$ : “To protect the right of freedom of speech”.

The sample sizes of this longitudinal study varied between 2795 and 3754, and are thus sufficiently large for the asymptotics derived in Section 3. Looking only at the mode as a summary measure (location), there is not much change over time: from 1980 to 2000 and again in 2016, the majority of respondents considered

s_{0}

(law and order) as the most important goal in politics, whereas

s_{1}

(influence) was judged most important between 2002 and 2014.

Much more fluctuation is visible when looking at the dispersion measures in Figure 2a. Although the absolute values of the measures differ, the general shapes of the graphs are quite similar. Usually, the dispersion measures take rather large values (≥0.90 in most of the years), which shows that each of the possible goals is considered as being most important by a large part of the population. On the other hand, the different goals never have the same popularity. Even in 2008, where all measures give a value very close to 1, the corresponding uniformity tests lead to a clear rejection of the null of a uniform distribution (p-values approximately 0 throughout).

Let us now analyze the development of the importance of the political goals in some more detail, by looking at the extropy for illustration. Figure 2b shows the approximate 95%-CIs in time (a bias correction does not have a visible effect because of the large sample sizes). There are phases where successive CIs overlap, and these are interrupted by breaks in the dispersion behavior. Such breaks happen, e.g., in 1984 (possibly related to a change of government in Germany in 1982/83), in 2002 (perhaps related to 9/11), or in 2008 and 2010 (Lehman bankruptcy and economic crisis). These changes in dispersion go along with reallocations of probability masses, as can be seen from Figure 2c. From the frequency curves of

s_{0}

and

s_{1}

, we can see the above-mentioned change in mode, where the switch back to

s_{0}

(law and order) might be caused by the refugee crisis. In addition, the curve for

s_{2}

(fight rising prices) helps for explanation, as it shows that

s_{2}

is important for the respondents in the beginning of the 1980s (where Germany suffered from very high inflation) and in 2008 (economic crisis), but not otherwise. Thus, altogether, the dispersion measures together with their approximate CIs give a very good summary of the distributional changes over time.

5. Measures of Signed Serial Dependence

After having discussed the analysis of marginal properties of a categorical time series, we now turn to the analysis of serial dependencies. In Section 1, two known measures of signed serial dependence, Cohen’s

κ (h)

in Equation (3) and a modification of it,

κ^{*} (h)

in Equation (4), were briefly surveyed, and, in Section 3.2, we recognized a connection to

ν_{G}

and

ν_{En}

, respectively. Motivated by a zero problem with

κ^{*} (h)

, a new type of modified

κ

is proposed in Equation (6), the measure

κ^{⋆} (h)

, and this turns out to be related to

ν_{Ex}

.

If replacing the (bivariate) probabilities in Equations (3), (4) and (6) by the respective (bivariate) relative frequencies computed from

x_{1}, \dots, x_{n}

, we end up with sample versions of these dependence measures. Knowledge of their asymptotic distribution is particularly relevant for the i.i.d.-case, because this allows us to test for significant serial dependence in the given time series. As shown by Weiß (2011, 2013),

\hat{κ} (h)

then has an asymptotic normal distribution, and it holds approximately that

E [\hat{κ} (h)] \approx - \frac{1}{n}, V [\hat{κ} (h)] \approx \frac{1}{n} (1 - \frac{1 + 2 s_{3} (p) - 3 s_{2} (p)}{{(1 - s_{2} (p))}^{2}}) .

(16)

The sample version of

κ^{*} (h)

has a simpler asymptotic normal distribution with (Weiß 2011, 2013)

E [{\hat{κ}}^{*} (h)] \approx - \frac{1}{n}, V [{\hat{κ}}^{*} (h)] \approx \frac{1}{m n},

(17)

but it suffers from the before-mentioned zero problem, especially for short time series.

Thus, it remains to derive the asymptotics of the novel

{\hat{κ}}^{⋆} (h)

under the null of an i.i.d. sample

X_{1}, \dots, X_{n}

. The starting point is an extension of the limiting result in Equation (8). Under appropriate mixing assumptions (see Section 3), Weiß (2013) derived the joint asymptotic distribution of all univariate and equal-bivariate relative frequencies, i.e. of all

\sqrt{n} ({\hat{p}}_{i} - p_{i})

and

\sqrt{n} ({\hat{p}}_{j j} (h) - p_{j j} (h))

, which is the

2 (m + 1)

-dimensional normal distribution

N (0, Σ^{(h)})

. The covariance matrix

Σ^{(h)}

consists of four blocks with entries

\begin{matrix} σ_{i, j} = & p_{j} (δ_{i, j} - p_{i}) + \sum_{k = 1}^{\infty} (p_{i j} (k) + p_{j i} (k) - 2 p_{i} p_{j}), \\ σ_{i, m + 1 + j}^{(h)} = & 2 (δ_{i, j} - p_{i}) p_{j j} (h) + \sum_{k = 1}^{\infty} (p_{i j j} (k, h) - p_{i} p_{j j} (h)) \\ + \sum_{k = h + 1}^{\infty} (p_{j j i} (h, k - h) - p_{i} p_{j j} (h)) \\ + \sum_{k = 1}^{h - 1} (p_{j i j} (k, h - k) - p_{i} p_{j j} (h)) = σ_{m + 1 + j, i}^{(h)}, \\ σ_{m + 1 + i, m + 1 + j}^{(h)} = & (δ_{i, j} - p_{j j} (h)) p_{i i} (h) + 2 (δ_{i, j} p_{i j j} (h, h) - p_{i i} (h) p_{j j} (h)) \\ + \sum_{k = 1}^{h - 1} (p_{j i j i} (k, h - k, k) + p_{i j i j} (k, h - k, k) - 2 p_{i i} (h) p_{j j} (h)) \\ + \sum_{k = h + 1}^{\infty} (p_{j j i i} (h, k - h, h) + p_{i i j j} (h, k - h, h) - 2 p_{i i} (h) p_{j j} (h)), \end{matrix}

(18)

where always

i, j = 0, \dots, m

, and where

\begin{matrix} p_{a b c} (k, l) = & P (X_{t} = a, X_{t - k} = b, X_{t - k - l} = c), \\ p_{a b c d} (k, l, m) = & P (X_{t} = a, X_{t - k} = b, X_{t - k - l} = c, X_{t - k - l - m} = d) . \end{matrix}

This rather complex general result simplifies greatly in special cases such as an NDARMA DGP (Weiß 2013) and, in particular, for an i.i.d. DGP:

\begin{matrix} σ_{i, j} = & p_{j} (δ_{i, j} - p_{i}), \\ σ_{i, m + 1 + j} = & 2 (δ_{i, j} - p_{i}) p_{j}^{2} = σ_{m + 1 + j, i}, \\ σ_{m + 1 + i, m + 1 + j} = & δ_{i, j} p_{i}^{2} (1 + 2 p_{i}) - 3 p_{i}^{2} p_{j}^{2} . \end{matrix}

(19)

Now, the asymptotic properties of

{\hat{κ}}^{⋆} (h)

can be derived, as done in Appendix B.3.

\sqrt{n} ({\hat{κ}}^{⋆} (h) - κ^{⋆} (h))

is asymptotically normally distributed, and mean and variance can be approximated by plugging Equation (18) into

\begin{matrix} E [{\hat{κ}}^{⋆} (h)] \approx & κ^{⋆} (h) + \frac{1}{n} \sum_{j = 0}^{m} \frac{(1 - p_{j}) σ_{j, m + 1 + j}^{(h)} - (1 - p_{j j} (h)) σ_{j j}}{{(1 - p_{j})}^{3}}, \\ V [{\hat{κ}}^{⋆} (h)] \approx & \frac{1}{n} \Big( \sum_{i, j = 0}^{m} \frac{p_{i i} (h) - 2 p_{i} + p_{i}^{2}}{{(1 - p_{i})}^{2}} \frac{p_{j j} (h) - 2 p_{j} + p_{j}^{2}}{{(1 - p_{j})}^{2}} σ_{i, j} \\ & + \sum_{i, j = 0}^{m} \frac{σ_{m + 1 + i, m + 1 + j}^{(h)}}{(1 - p_{i}) (1 - p_{j})} + 2 \sum_{i, j = 0}^{m} \frac{p_{i i} (h) - 2 p_{i} + p_{i}^{2}}{{(1 - p_{i})}^{2}} \frac{σ_{i, m + 1 + j}^{(h)}}{1 - p_{j}} \Big) . \end{matrix}

(20)

In the i.i.d.-case, we simply have

E [{\hat{κ}}^{⋆} (h)] \approx - \frac{1}{n}, V [{\hat{κ}}^{⋆} (h)] \approx \frac{1}{n} (s_{2} (p) - \sum_{i = 0}^{m} {(\frac{p_{i}^{2}}{1 - p_{i}})}^{2} + {(\sum_{i = 0}^{m} \frac{p_{i}^{2}}{1 - p_{i}})}^{2}) .

(21)

Comparing Equations (16), (17) and (21), we see that all three measures have the same asymptotic bias

- 1 / n

, but their asymptotic variances generally differ. An exception is the case of a uniform distribution, where the asymptotic variances coincide as well (see Appendix B.3).
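To make Equation (21) concrete, the following Python sketch evaluates the bracketed variance term for a given marginal distribution and uses it for a two-sided test of serial independence. It is a minimal illustration rather than the author's implementation: the function names are hypothetical, the lagged pair frequencies are assumed to be averaged over the n − h available pairs, and the unknown marginal probabilities are replaced by their sample frequencies (a plug-in assumption).

```python
import math
from statistics import NormalDist


def kappa_star_hat(x, h, states):
    """Sample kappa-star at lag h: sum_j (p_jj(h) - p_j^2) / (1 - p_j).

    Assumes that no single state makes up the whole series (p_j < 1).
    """
    n = len(x)
    total = 0.0
    for s in states:
        p_j = sum(1 for v in x if v == s) / n
        # Assumed convention: average over the n - h available lagged pairs.
        p_jj = sum(1 for t in range(h, n) if x[t] == s and x[t - h] == s) / (n - h)
        total += (p_jj - p_j ** 2) / (1.0 - p_j)
    return total


def iid_variance_factor(p):
    """Bracketed term of Equation (21); dividing it by n approximates the variance."""
    s2 = sum(q ** 2 for q in p)
    r = [q ** 2 / (1.0 - q) for q in p]
    return s2 - sum(v ** 2 for v in r) + sum(r) ** 2


def independence_test(x, h, states, level=0.05):
    """Two-sided test of serial independence at lag h (normal approximation)."""
    n = len(x)
    # Plug-in assumption: sample frequencies replace the true marginal PMF.
    p_hat = [sum(1 for v in x if v == s) / n for s in states]
    sd = math.sqrt(iid_variance_factor(p_hat) / n)
    # Center at the approximate mean -1/n under the i.i.d. null.
    z = (kappa_star_hat(x, h, states) + 1.0 / n) / sd
    crit = NormalDist().inv_cdf(1.0 - level / 2.0)
    return z, abs(z) > crit
```

For a uniform marginal distribution on m + 1 states, `iid_variance_factor` returns 1/m, in line with the remark on the uniform case.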

6. Simulations and Applications

Section 6.1 presents some simulation results, where the quality of the asymptotic approximations for

\hat{κ} (h), {\hat{κ}}^{*} (h), {\hat{κ}}^{⋆} (h)

according to Section 5 is investigated, as well as the power when testing against different types of serial dependence. Two real-data examples are discussed in Section 6.2 and Section 6.3: first an ordinal time series with rather strong positive dependence, then a nominal time series exhibiting negative dependencies.

6.1. Finite-Sample Performance of Serial Dependence Measures

In analogy to Section 4.1, we compared the finite-sample performance of the normal approximations in Equations (16), (17) and (21) via simulations. As power scenarios, we included not only NDARMA models but also the NegMarkov model described in Appendix A.2. These models were combined with the marginal distributions

p_{1}

to

p_{6}

in Table 1 plus

p_{7} = p_{uni}

; for the NegMarkov model, it was not possible to have the marginals

p_{2}

and

p_{4}

with very low dispersion. The full simulation results are available from the author upon request, but excerpts thereof are shown in the sequel to illustrate the main findings. As before, all tables are collected in Appendix C.

First, we discuss the distributional properties of

\hat{κ} (h), {\hat{κ}}^{*} (h), {\hat{κ}}^{⋆} (h)

under the null of an i.i.d. DGP. The common mean approximation

- 1 / n

worked very well without exceptions. The quality of the standard deviations’ approximations is investigated in Table A5. Generally, the actual marginal dispersion was of great influence. For large dispersion (e.g.,

p_{6}, p_{7}

in Table A5), we had a very good agreement between asymptotic and simulated standard deviation, where deviations were typically not larger than

\pm 0.001

. For low dispersion (e.g.,

p_{5}

in Table A5), we found some discrepancy for

n = 100

and increasing h: the asymptotic approximation resulted in lower values for

\hat{κ} (h), {\hat{κ}}^{⋆} (h)

(discrepancy up to 0.004), and in larger values for

{\hat{κ}}^{*} (h)

(discrepancy up to 0.002). Consequently, if testing for serial independence, we expect the size for

{\hat{κ}}^{*} (h)

to be smaller than the nominal

5 %

-level, and to be larger for

\hat{κ} (h), {\hat{κ}}^{⋆} (h)

. This was roughly confirmed by the results in Table A6 (and by further simulations), with the smallest size values for

{\hat{κ}}^{*} (h)

(might be smaller by up to 0.01) and the largest for

{\hat{κ}}^{⋆} (h)

. The sizes of

\hat{κ} (h)

tended to be smaller than 0.05 for

h = 1

(discrepancies

\leq 0.003

), while those of

{\hat{κ}}^{⋆} (h)

were always rather close to 5%.

A more complex picture was observed with regard to the power of

\hat{κ} (h), {\hat{κ}}^{*} (h), {\hat{κ}}^{⋆} (h)

(see Table A6). For positive dependence (DMA(1) and DAR(1)),

\hat{κ} (h), {\hat{κ}}^{⋆} (h)

performed best if being close to a marginal uniform distribution, whereas

{\hat{κ}}^{*} (h)

had superior power for lower marginal dispersion levels (and

\hat{κ} (h)

performed second-best). For negative dependence (NegMarkov), in contrast,

{\hat{κ}}^{⋆} (h)

was the optimal choice, and

{\hat{κ}}^{*} (h)

might have a rather poor power, especially for low dispersion. Thus, while

\hat{κ} (h), {\hat{κ}}^{⋆} (h)

both showed a more-or-less balanced performance with respect to positive and negative dependence, we had a sharp contrast for

{\hat{κ}}^{*} (h)

.

6.2. Application: Fear Index

The Volatility Index (VIX) serves as a benchmark for U.S. stock market volatility, and increasing VIX values are interpreted as indications of greater fear in the market (Hancock 2012). Hancock (2012) distinguished between the

m + 1 = 13

ordinal fear states given in Table 2. From the historical closing rates of the VIX offered by the website https://finance.yahoo.com/, a time series of daily fear states was computed for the

n = 4287

trading days in the period 1990–2006 (before the beginning of the financial crisis). The obtained time series is plotted in Figure 3.

As shown in the plots in the top panel of Figure 3, the states

s_{9}

–

s_{12}

are never observed during 1990–2006, thus we have zero frequencies affecting the computation of

{\hat{ν}}_{En}

and

{\hat{κ}}^{*} (h)

. The marginal distribution itself deviates visibly from a uniform distribution, thus it is reasonable that the dispersion measures

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

are clearly below 1. In fact, the PMF concentrates mainly on the low to moderate fear states; high anxiety (or more) occurred on only a few of the trading days. Even more important is the development of these fear states over time. While negative serial dependence would indicate a permanent fluctuation between the states, positive dependence would imply some kind of inertia regarding the respective states. The serial dependence structure was analyzed as shown in the bottom panel of Figure 3, where the critical values for level 5% (dashed lines) were computed according to the asymptotics in Section 5. All measures indicated significantly positive dependence; thus, the U.S. stock market has a tendency to stay in a state once attained. However,

{\hat{κ}}^{*} (h)

(Figure 3, center) produced notably smaller values than

\hat{κ} (h), {\hat{κ}}^{⋆} (h)

(Figure 3, left and right). Considering that the time series plot with its long runs of single states implies a rather strong positive dependence, the values produced by

{\hat{κ}}^{*} (h)

did not appear very plausible. Thus,

\hat{κ} (h), {\hat{κ}}^{⋆} (h)

, which resulted in very similar values, appeared to be easier to interpret in the given example. Note that the discrepancy between

{\hat{κ}}^{*} (h)

and

\hat{κ} (h), {\hat{κ}}^{⋆} (h)

went along with the discrepancy between

{\hat{ν}}_{En}

and

{\hat{ν}}_{G}, {\hat{ν}}_{Ex}

.

6.3. Application: Wood Pewee

Animal behavior studies are an integral part of research in biology and psychology. The time series example to be studied in the present section dates back to one of the pioneers of ethology, to Wallace Craig. The data were originally presented by Craig (1943) and further analyzed (among others) in Chapter 6 of the work by Weiß (2018). They constitute a nominal time series of length

n = 1327

, where the three states

s_{0}, s_{1}, s_{2}

express the different phrases in the morning twilight song of the Wood Pewee (“pee-ah-wee”, “pee-oh” and “ah-di-dee”, respectively). Since the range of a nominal time series lacks a natural order, a time series plot is not possible. Thus, Figure 4 shows a rate evolution graph as a substitute, where the cumulative frequencies of the individual states are plotted against time t. From the roughly linear increase, we infer a stable behavior of the time series (Weiß 2018).
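The construction of a rate evolution graph can be sketched in a few lines of Python. The function below only computes the cumulative counts per state, which would then be plotted against t; the function name and the toy sequence are hypothetical and are not the Wood Pewee data.

```python
def rate_evolution(x, states):
    """Return, for each state, its cumulative count after t = 1, ..., n."""
    counts = {s: 0 for s in states}
    paths = {s: [] for s in states}
    for value in x:
        counts[value] += 1
        # Record the running count of every state at the current time point.
        for s in states:
            paths[s].append(counts[s])
    return paths


# Toy example with three hypothetical phrases.
toy = ["a", "b", "a", "c", "a", "b"]
paths = rate_evolution(toy, ["a", "b", "c"])
```

For a stable series, each path grows roughly linearly in t, with a slope close to the marginal probability of the respective state.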

The dispersion measures

{\hat{ν}}_{G}

,

{\hat{ν}}_{En}

and

{\hat{ν}}_{Ex}

all led to values between 0.90 and 0.95, indicating that all three phrases are frequently used (but not equally often) within the morning twilight song of the Wood Pewee. This is confirmed by the PMF plot in Figure 4, where we found some preference for the phrase

s_{0}

. The serial dependence plots in the bottom panel of Figure 4 reveal a quite complex serial dependence structure, with both positive and negative dependence values and with a periodic pattern. Positive values occur for even lags h (with particularly large values for multiples of 4), and negative values for odd lags. The positive values indicate a tendency to repeat a phrase, and such repetitions seem to be particularly likely after every fourth phrase. Negative values, in contrast, indicate a change of phrase; e.g., the same phrase is rarely presented twice in a row. While

\hat{κ} (h), {\hat{κ}}^{⋆} (h)

gave a very clear (and similar) picture of the rhythmic structure, it was again

{\hat{κ}}^{*} (h)

that caused some implausible values, e.g., the non-significant value at lag 2. Thus, both data examples indicate that

{\hat{κ}}^{*} (h)

should be used with caution in practice (also because of the zero problem). A decision between

\hat{κ} (h)

and

{\hat{κ}}^{⋆} (h)

is more difficult;

\hat{κ} (h)

is well established and slightly advantageous for uncovering positive dependencies, whereas

{\hat{κ}}^{⋆} (h)

is computationally simpler and shows a very good performance regarding negative dependencies.

7. Conclusions

This work discusses approaches for measuring dispersion and serial dependence in categorical time series. Asymptotic properties of the novel extropy measure for categorical dispersion are derived and compared to those of Gini index and entropy. Simulations showed that all three measures performed quite well, with slightly better coverage rates for the entropy but computational advantages for Gini index and extropy. The extropy was most reliable if testing the null hypothesis of a uniform distribution. The application and interpretation of these measures was illustrated with a longitudinal study about the most important political goals in Germany.

The analysis of the asymptotic bias of Gini index, entropy and extropy uncovered a relation between these three measures and three types of

κ

-measures for signed serial dependence. While two of these measures, namely

κ (h)

and

κ^{*} (h)

, have already been discussed in the literature, the “

κ

-counterpart” to the extropy turned out to be a new type of modified Cohen’s

κ

, denoted by

κ^{⋆} (h)

. The asymptotics of

{\hat{κ}}^{⋆} (h)

were investigated and utilized for testing for serial dependence. A simulation study as well as two real-data examples (time series of fear states and song of the Wood Pewee) showed that

{\hat{κ}}^{*} (h)

has several drawbacks, while both

\hat{κ} (h)

and

{\hat{κ}}^{⋆} (h)

work very well in practice. The advantages of

{\hat{κ}}^{⋆} (h)

are computational simplicity and a superior performance regarding negative dependencies.

Funding

This research received no external funding.

Acknowledgments

The author thanks the editors and the three referees for their useful comments on an earlier draft of this article.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Some Models for Categorical Processes

This appendix provides a brief summary of those models for categorical processes that are used for simulations and illustrative computations in this article. More background on these and further models for categorical processes can be found in the book by Weiß (2018).

Appendix A.1. NDARMA Models for Categorical Processes

The NDARMA model (“new” discrete autoregressive moving-average model) was proposed by Jacobs and Lewis (1983), and its definition might be given as follows (Weiß and Göb 2008):

Let

{(X_{t})}_{Z}

and

{(ϵ_{t})}_{Z}

be categorical processes with state space

S

, where

{(ϵ_{t})}_{Z}

is i.i.d. with marginal distribution p, and where

ϵ_{t}

is independent of

{(X_{s})}_{s < t}

. Let

(α_{t, 1}, \dots, α_{t, p}, β_{t, 0}, \dots, β_{t, q}) \sim MULT (1; ϕ_{1}, \dots, ϕ_{p}, φ_{0}, \dots, φ_{q})

be i.i.d. multinomial random vectors, which are independent of

{(ϵ_{t})}_{Z}

and of

{(X_{s})}_{s < t}

. Then,

{(X_{t})}_{Z}

is said to be an NDARMA(p, q) process (and the cases

q = 0

and

p = 0

are referred to as a DAR(p) process and DMA(q) process, respectively) if it follows the recursion

X_{t} = α_{t, 1} \cdot X_{t - 1} + \dots + α_{t, p} \cdot X_{t - p} + β_{t, 0} \cdot ϵ_{t} + \dots + β_{t, q} \cdot ϵ_{t - q} .

(A1)

(Here, if the state space

S

is not numerically coded, we assume

0 \cdot s = 0

,

1 \cdot s = s

and

s + 0 = s

for each

s \in S

.)

NDARMA processes have several attractive properties, e.g.

X_{t}

and

ϵ_{t}

have the same stationary marginal distribution:

P (X_{t} = s_{i}) = p_{i} = P (ϵ_{t} = s_{i})

for all

s_{i} \in S

. Their serial dependence structure is characterized by a set of Yule–Walker-type equations for the serial dependence measure Cohen’s

κ

from Equation (3) (Weiß and Göb 2008):

κ (h) = \sum_{j = 1}^{p} ϕ_{j} κ (| h - j |) + \sum_{i = 0}^{q - h} φ_{i + h} r (i) for h \geq 1,

(A2)

where the

r (i)

satisfy

r (i) = \sum_{j = max {0, i - p}}^{i - 1} ϕ_{i - j} \cdot r (j) + φ_{i} 1 (0 \leq i \leq q)

. It should be noted that NDARMA processes satisfy

κ (h) \geq 0

, i.e. they can only handle positive serial dependence. Another important property is that the conditional probabilities at lag h, which determine the bivariate distributions, are given by

p_{i | j} (h) = p_{i} + κ (h) (δ_{i, j} - p_{i})

. This implies that

p_{i j} (h) - p_{i} p_{j} = κ (h) (δ_{i, j} - p_{i}) p_{j}

and, as a consequence, that all of the serial dependence measures mentioned in this work coincide for NDARMA processes:

ϑ (h) = ϑ^{*} (h) = ϑ^{⋆} (h) = κ (h) = κ^{*} (h) = κ^{⋆} (h)

.

Finally, Weiß (2013) showed that an NDARMA process is

ϕ

-mixing with exponentially decreasing weights such that the CLT on p. 200 in Billingsley (1999) is applicable.
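The recursion in Equation (A1) can be simulated directly, because the multinomial vector selects exactly one source for each observation. The sketch below is an illustration under stated assumptions: the function name and arguments are hypothetical, the process is started with i.i.d. draws from the marginal distribution, and a burn-in period is discarded to approximate stationarity.

```python
import random


def simulate_ndarma(n, states, p_marg, phi, varphi, burn_in=200, seed=None):
    """Simulate n observations from the NDARMA(p, q) recursion (A1).

    phi    = (phi_1, ..., phi_p): selection probabilities of X_{t-1}, ..., X_{t-p}
    varphi = (varphi_0, ..., varphi_q): selection probabilities of eps_t, ..., eps_{t-q}
    """
    rng = random.Random(seed)
    p, q = len(phi), len(varphi) - 1
    weights = list(phi) + list(varphi)
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("phi and varphi together must sum to 1")
    # Assumed start-up: i.i.d. draws from the marginal, then a burn-in period.
    xs = rng.choices(states, weights=p_marg, k=p)
    eps = rng.choices(states, weights=p_marg, k=q)
    for _ in range(burn_in + n):
        eps.append(rng.choices(states, weights=p_marg)[0])
        # Draw the index of the single non-zero multinomial component.
        k = rng.choices(range(p + q + 1), weights=weights)[0]
        if k < p:
            xs.append(xs[-(k + 1)])        # copy X_{t-(k+1)}
        else:
            xs.append(eps[-(k - p + 1)])   # copy eps_{t-(k-p)}
    return xs[len(xs) - n:]


# DAR(1) with phi_1 = 0.40 and DMA(1) with varphi_1 = 0.25 on three states.
dar1 = simulate_ndarma(500, ["s0", "s1", "s2"], [0.5, 0.3, 0.2],
                       phi=(0.40,), varphi=(0.60,), seed=1)
dma1 = simulate_ndarma(500, ["s0", "s1", "s2"], [0.5, 0.3, 0.2],
                       phi=(), varphi=(0.75, 0.25), seed=1)
```

For the DAR(1) case, Equation (A2) reduces to κ(h) = ϕ_1 κ(h − 1) with κ(0) = 1, so that κ(h) = ϕ_1^h.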

Appendix A.2. Markov Chains for Categorical Processes

A discrete-valued Markov process

{(X_{t})}_{Z}

is characterized by a “memory of length

p \in N

”, in the sense that

P (X_{t} = x_{t} | X_{t - 1} = x_{t - 1}, \dots) = P (X_{t} = x_{t} | X_{t - 1} = x_{t - 1}, \dots, X_{t - p} = x_{t - p})

has to hold for all

x_{i} \in S

. In the case

p = 1

,

{(X_{t})}_{Z}

is commonly called a Markov chain. If the transition probabilities

P (X_{t} = i | X_{t - 1} = j)

of the Markov chain do not change with time t, i.e., if

P (X_{t} = i | X_{t - 1} = j) = p_{i | j}

for all

t \in N

, it is said to be homogeneous (analogously for higher-order Markov processes). An example of a parsimoniously parametrized homogeneous Markov chain (Markov process) is the DAR(1) process (DAR(p) process) according to Appendix A.1, which always exhibits positive serial dependence.

A parsimoniously parametrized Markov model with negative serial dependence was proposed by Weiß (2011), the “Negative Markov model” (NegMarkov model). For a given probability vector

π \in {(0; 1)}^{m + 1}

and some

α \in (0; 1]

, its transition probabilities are defined by

p_{i | j} = \{\begin{matrix} α π_{j} & if i = j, \\ β_{j} π_{i} & if i \neq j, \end{matrix} where β_{j} = \frac{1 - α π_{j}}{1 - π_{j}} \geq 1 .

The resulting ergodic Markov chain has the stationary marginal distribution

p_{j} = \frac{π_{j} / β_{j}}{\sum_{i = 0}^{m} π_{i} / β_{i}} for j = 0, \dots, m .

As an example, if

π = p_{uni}

, then the

β_{j}

become

1 + (1 - α) / m

such that p is also a uniform distribution. However, the conditional distribution given

X_{t - 1} = j

is not uniform:

p_{i | j} = \{\begin{matrix} \frac{α}{m + 1} & if i = j, \\ (1 + \frac{1 - α}{m}) \frac{1}{m + 1} & if i \neq j . \end{matrix}
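The NegMarkov transition probabilities and the stated stationary distribution can be checked numerically. The following Python sketch builds the matrix with entries p_{i|j} from π and α as defined above; the function names are hypothetical.

```python
def negmarkov_beta(pi, alpha):
    """beta_j = (1 - alpha * pi_j) / (1 - pi_j)."""
    return [(1.0 - alpha * q) / (1.0 - q) for q in pi]


def negmarkov_transition(pi, alpha):
    """Return P with P[i][j] = p_{i|j} = P(X_t = i | X_{t-1} = j)."""
    beta = negmarkov_beta(pi, alpha)
    k = len(pi)
    return [[alpha * pi[j] if i == j else beta[j] * pi[i] for j in range(k)]
            for i in range(k)]


def negmarkov_stationary(pi, alpha):
    """Stationary marginal p_j proportional to pi_j / beta_j."""
    beta = negmarkov_beta(pi, alpha)
    w = [q / b for q, b in zip(pi, beta)]
    total = sum(w)
    return [v / total for v in w]


pi, alpha = [0.5, 0.3, 0.2], 0.4
P = negmarkov_transition(pi, alpha)
p = negmarkov_stationary(pi, alpha)
# Each column of P is a conditional distribution, and P applied to p returns p.
col_sums = [sum(P[i][j] for i in range(3)) for j in range(3)]
Pp = [sum(P[i][j] * p[j] for j in range(3)) for i in range(3)]
```

Since α ≤ 1 lowers the probability of staying in the current state relative to π, consecutive repetitions become less likely than under independence, which is the source of the negative serial dependence.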

Appendix B. Proofs

Appendix B.1. Derivation of the inequality in Equation (7)

The first inequality in Equation (7) was shown by Lad et al. (2015). Using the inequality

ln (1 + x) > x / (1 + x / 2)

for

x > 0

from Love (1980), it follows for

p_{i} \in (0; 1)

that

- ln (1 - p_{i}) = ln (1 + \frac{p_{i}}{1 - p_{i}}) > \frac{2 p_{i}}{2 - p_{i}}, - ln p_{i} = ln (1 + \frac{1 - p_{i}}{p_{i}}) > 2 \frac{1 - p_{i}}{1 + p_{i}} .

Consequently, we have

- \sum_{i = 0}^{m} (1 - p_{i}) ln (1 - p_{i}) \geq \sum_{i = 0}^{m} p_{i} (1 - p_{i}) \frac{2}{2 - p_{i}} \geq 1 - s_{2} (p),

as well as

- \sum_{i = 0}^{m} p_{i} ln p_{i} \geq \sum_{i = 0}^{m} p_{i} (1 - p_{i}) \frac{2}{1 + p_{i}} \geq 1 - s_{2} (p) .
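As a quick numerical sanity check of the two inequality chains above, the following Python sketch evaluates all terms for a given PMF with strictly positive entries. The function name is hypothetical, and the check illustrates the bounds without replacing the proof.

```python
import math


def inequality_chains(p):
    """Return the terms of both chains for a PMF p with all p_i in (0, 1)."""
    lower = 1.0 - sum(q ** 2 for q in p)                      # 1 - s_2(p)
    ex_sum = -sum((1 - q) * math.log(1 - q) for q in p)       # extropy-type sum
    ex_mid = sum(q * (1 - q) * 2 / (2 - q) for q in p)
    en_sum = -sum(q * math.log(q) for q in p)                 # entropy-type sum
    en_mid = sum(q * (1 - q) * 2 / (1 + q) for q in p)
    return (ex_sum, ex_mid, lower), (en_sum, en_mid, lower)


ex_chain, en_chain = inequality_chains([0.6, 0.3, 0.1])
```

Both returned tuples are ordered from the largest to the smallest term of the respective chain.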

Appendix B.2. Derivations for Sample Dispersion Measures

For studying the asymptotic properties of the sample Gini index according to Equation (1), it is important to know that

{\hat{ν}}_{G}

can be exactly rewritten as the centered quadratic polynomial

{\hat{ν}}_{G} = ν_{G} - 2 \frac{m + 1}{m} \sum_{i = 0}^{m} p_{i} ({\hat{p}}_{i} - p_{i}) - \frac{m + 1}{m} \sum_{i = 0}^{m} {({\hat{p}}_{i} - p_{i})}^{2} .

(A3)

Since

E [\hat{p}] = p

holds exactly, this representation immediately implies an exact way of bias computation,

n (E [{\hat{ν}}_{G}] - ν_{G}) = - \frac{m + 1}{m} \sum_{i = 0}^{m} V [\sqrt{n} ({\hat{p}}_{i} - p_{i})]

, where

V [\sqrt{n} ({\hat{p}}_{i} - p_{i})]

is approximately given by

σ_{i i}

according to Equation (8). Furthermore, provided that

p \neq p_{uni}

, the resulting linear approximation together with the asymptotic normality of

\sqrt{n} (\hat{p} - p)

can be used to derive the asymptotic result (Equation (9)) (Delta method). Here,

p = p_{uni}

has to be excluded, because then the linear term in Equation (A3) vanishes. Hence, in the boundary case

p = p_{uni}

, we end up with an asymptotic quadratic-form distribution instead of a normal one. Actually, it is easily seen that

n m (1 - {\hat{ν}}_{G})

then coincides with Pearson’s

χ^{2}

-statistic with respect to

p_{uni}

; see Section 4 in Weiß (2013) for the asymptotics.
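Both the exact representation in Equation (A3) and the identity with Pearson’s statistic in the uniform case are easy to verify numerically. The Python sketch below assumes the Gini index in the normalized form ν_G = (m + 1)/m · (1 − Σ p_i²), which is the form implied by Equation (A3); the function names and the example counts are hypothetical.

```python
def gini(p):
    """Normalized Gini index (m + 1) / m * (1 - sum p_i^2), assumed form of Equation (1)."""
    m = len(p) - 1
    return (m + 1) / m * (1.0 - sum(q ** 2 for q in p))


def gini_via_a3(p_hat, p):
    """Right-hand side of Equation (A3): true value plus linear and quadratic terms."""
    m = len(p) - 1
    c = (m + 1) / m
    linear = -2.0 * c * sum(q * (qh - q) for qh, q in zip(p_hat, p))
    quadratic = -c * sum((qh - q) ** 2 for qh, q in zip(p_hat, p))
    return gini(p) + linear + quadratic


def pearson_uniform(counts):
    """Pearson's chi-square statistic with respect to the uniform distribution."""
    n, k = sum(counts), len(counts)
    expected = n / k
    return sum((c - expected) ** 2 / expected for c in counts)


counts = [12, 7, 11]                      # hypothetical counts, n = 30, m = 2
n, m = sum(counts), len(counts) - 1
p_hat = [c / n for c in counts]
lhs = n * m * (1.0 - gini(p_hat))         # n m (1 - sample Gini index)
rhs = pearson_uniform(counts)
```

The negative quadratic term in `gini_via_a3` is what makes the sample Gini index biased downwards.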

For the entropy in Equation (2), an exact polynomial representation such as in Equation (A3) does not exist, thus we have to use a Taylor approximation instead:

{\hat{ν}}_{En} \approx ν_{En} - \sum_{j = 0}^{m} \frac{1 + ln p_{j}}{ln (m + 1)} ({\hat{p}}_{j} - p_{j}) - \frac{1}{2} \sum_{j = 0}^{m} \frac{p_{j}^{- 1}}{ln (m + 1)} {({\hat{p}}_{j} - p_{j})}^{2} .

(A4)

From here, one can proceed as before. Thus, an approximate bias formula follows from

n (E [{\hat{ν}}_{En}] - ν_{En}) \approx - \frac{1}{2} \sum_{j = 0}^{m} \frac{p_{j}^{- 1}}{ln (m + 1)} V [\sqrt{n} ({\hat{p}}_{j} - p_{j})] \approx - \frac{1}{2} \sum_{j = 0}^{m} \frac{p_{j}^{- 1}}{ln (m + 1)} σ_{j j} .

For

p \neq p_{uni}

, we can use the linear approximation implied by Equation (A4) to conclude the asymptotic normality of

\sqrt{n} ({\hat{ν}}_{En} - ν_{En})

with variance

σ_{En}^{2} = \sum_{i, j = 0}^{m} (\frac{1 + ln p_{i}}{ln (m + 1)}) (\frac{1 + ln p_{j}}{ln (m + 1)}) σ_{i j},

see also the results of Blyth (1959) and Weiß (2013). Note that, in the i.i.d.-case, where

σ_{i j} = p_{j} (δ_{i, j} - p_{i})

, one computes

\begin{matrix} \sum_{i, j = 0}^{m} (1 + ln p_{i}) (1 + ln p_{j}) p_{j} (δ_{i, j} - p_{i}) \\ = \sum_{i = 0}^{m} {(1 + ln p_{i})}^{2} p_{i} - {(\sum_{i = 0}^{m} (1 + ln p_{i}) p_{i})}^{2} \\ = 1 + 2 \sum_{i = 0}^{m} p_{i} ln p_{i} + \sum_{i = 0}^{m} p_{i} {(ln p_{i})}^{2} - {(1 + \sum_{i = 0}^{m} p_{i} ln p_{i})}^{2} \\ = \sum_{i = 0}^{m} p_{i} {(ln p_{i})}^{2} - {(\sum_{i = 0}^{m} p_{i} ln p_{i})}^{2} . \end{matrix}

Finally, in the boundary case

p = p_{uni}

, again the linear term vanishes such that

2 n ln (m + 1) \cdot (1 - {\hat{ν}}_{En}) \approx n \sum_{j = 0}^{m} \frac{{({\hat{p}}_{j} - \frac{1}{m + 1})}^{2}}{\frac{1}{m + 1}}

equals Pearson’s

χ^{2}

-statistic.

We now carry out analogous derivations for the extropy in Equation (5). Starting with the Taylor approximation

{\hat{ν}}_{Ex} \approx ν_{Ex} + \sum_{j = 0}^{m} \frac{1 + ln (1 - p_{j})}{m ln (\frac{m + 1}{m})} ({\hat{p}}_{j} - p_{j}) - \frac{1}{2} \sum_{j = 0}^{m} \frac{{(1 - p_{j})}^{- 1}}{m ln (\frac{m + 1}{m})} {({\hat{p}}_{j} - p_{j})}^{2},

(A5)

it follows that

n (E [{\hat{ν}}_{Ex}] - ν_{Ex}) \approx - \frac{1}{2} \sum_{j = 0}^{m} \frac{{(1 - p_{j})}^{- 1}}{m ln (\frac{m + 1}{m})} σ_{j j} .

For

p \neq p_{uni}

, we can use the linear approximation implied by Equation (A5) to conclude the asymptotic normality of

\sqrt{n} ({\hat{ν}}_{Ex} - ν_{Ex})

with variance

σ_{Ex}^{2} = \sum_{i, j = 0}^{m} \frac{1 + ln (1 - p_{i})}{m ln (\frac{m + 1}{m})} \frac{1 + ln (1 - p_{j})}{m ln (\frac{m + 1}{m})} σ_{i j} .

Note that, in the i.i.d.-case, where

σ_{i j} = p_{j} (δ_{i, j} - p_{i})

, one computes

\begin{matrix} \sum_{i, j = 0}^{m} (1 + ln (1 - p_{i})) (1 + ln (1 - p_{j})) p_{j} (δ_{i, j} - p_{i}) \\ = \sum_{i = 0}^{m} p_{i} {(ln (1 - p_{i}))}^{2} - {(\sum_{i = 0}^{m} p_{i} ln (1 - p_{i}))}^{2} \end{matrix}

as before. Finally, in the boundary case

p = p_{uni}

, again the linear term vanishes such that

2 n m^{2} ln (\frac{m + 1}{m}) \cdot (1 - {\hat{ν}}_{Ex}) \approx n \sum_{j = 0}^{m} \frac{{({\hat{p}}_{j} - \frac{1}{m + 1})}^{2}}{\frac{1}{m + 1}}

equals Pearson’s

χ^{2}

-statistic.
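The i.i.d. variance expressions derived above for entropy and extropy translate into short functions. The Python sketch below assumes the normalized forms ν_En = −Σ p_i ln p_i / ln(m + 1) and ν_Ex = −Σ (1 − p_i) ln(1 − p_i) / (m ln((m + 1)/m)), which are the forms implied by the Taylor coefficients in Equations (A4) and (A5); the function names are hypothetical.

```python
import math


def entropy(p):
    """Normalized entropy; terms with p_i = 0 contribute 0."""
    m = len(p) - 1
    return -sum(q * math.log(q) for q in p if q > 0) / math.log(m + 1)


def extropy(p):
    """Normalized extropy; terms with p_i = 1 contribute 0."""
    m = len(p) - 1
    norm = m * math.log((m + 1) / m)
    return -sum((1 - q) * math.log(1 - q) for q in p if q < 1) / norm


def iid_sd_entropy(p, n):
    """Approximate standard deviation of the sample entropy for an i.i.d. DGP."""
    m = len(p) - 1
    a = sum(q * math.log(q) ** 2 for q in p)
    b = sum(q * math.log(q) for q in p)
    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, a - b * b) / n) / math.log(m + 1)


def iid_sd_extropy(p, n):
    """Approximate standard deviation of the sample extropy for an i.i.d. DGP."""
    m = len(p) - 1
    a = sum(q * math.log(1 - q) ** 2 for q in p)
    b = sum(q * math.log(1 - q) for q in p)
    return math.sqrt(max(0.0, a - b * b) / n) / (m * math.log((m + 1) / m))


p_example = [0.7, 0.2, 0.1]               # hypothetical PMF with m = 2
values = (entropy(p_example), extropy(p_example))
sds = (iid_sd_entropy(p_example, 100), iid_sd_extropy(p_example, 100))
```

For the uniform distribution, both standard deviations evaluate to (numerically) zero, which reflects the vanishing linear term and explains why the normal approximation is replaced by a quadratic-form distribution in that boundary case.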

Appendix B.3. Derivations for Measures of Signed Serial Dependence

We partition

x \in {(0; 1)}^{2 (m + 1)}

as

x = {(x_{0}, \dots, x_{m}, x_{m + 1 + 0}, \dots, x_{m + 1 + m})}^{⊤}

, and we define

f (x) = \sum_{j = 0}^{m} \frac{x_{m + 1 + j} - x_{j}^{2}}{1 - x_{j}} .

Then,

\frac{\partial}{\partial x_{j}} f (x) = \frac{x_{m + 1 + j} - 2 x_{j} + x_{j}^{2}}{{(1 - x_{j})}^{2}}, \frac{\partial}{\partial x_{m + 1 + j}} f (x) = \frac{1}{1 - x_{j}},

and

\frac{\partial^{2}}{\partial x_{j}^{2}} f (x) = \frac{- 2 (1 - x_{m + 1 + j})}{{(1 - x_{j})}^{3}}, \frac{\partial^{2}}{\partial x_{j} \partial x_{m + 1 + j}} f (x) = \frac{1}{{(1 - x_{j})}^{2}},

all other second-order derivatives equal 0. Thus, a second-order Taylor approximation of

{\hat{κ}}^{⋆} (h) = f (\dots, {\hat{p}}_{j}, \dots, {\hat{p}}_{j j} (h), \dots)

is given by

\begin{matrix} {\hat{κ}}^{⋆} (h) & \approx & κ^{⋆} (h) + \sum_{j = 0}^{m} \frac{p_{j j} (h) - 2 p_{j} + p_{j}^{2}}{{(1 - p_{j})}^{2}} ({\hat{p}}_{j} - p_{j}) + \sum_{j = 0}^{m} \frac{1}{1 - p_{j}} ({\hat{p}}_{j j} (h) - p_{j j} (h)) \\ - \sum_{j = 0}^{m} \frac{1 - p_{j j} (h)}{{(1 - p_{j})}^{3}} {({\hat{p}}_{j} - p_{j})}^{2} + \sum_{j = 0}^{m} \frac{1}{{(1 - p_{j})}^{2}} ({\hat{p}}_{j} - p_{j}) ({\hat{p}}_{j j} (h) - p_{j j} (h)) . \end{matrix}

Hence, using Equation (18), it follows that

\begin{matrix} n (E [{\hat{κ}}^{⋆} (h)] - κ^{⋆} (h)) & \approx & \sum_{j = 0}^{m} \frac{1}{{(1 - p_{j})}^{2}} σ_{j, m + 1 + j}^{(h)} - \sum_{j = 0}^{m} \frac{1 - p_{j j} (h)}{{(1 - p_{j})}^{3}} σ_{j j} \\ = & \sum_{j = 0}^{m} \frac{(1 - p_{j}) σ_{j, m + 1 + j}^{(h)} - (1 - p_{j j} (h)) σ_{j j}}{{(1 - p_{j})}^{3}} . \end{matrix}

Furthermore, the Delta method implies that

\sqrt{n} ({\hat{κ}}^{⋆} (h) - κ^{⋆} (h)) \sim N (0, σ^{2})

with

\begin{matrix} σ^{2} & = & \sum_{i, j = 0}^{m} \frac{p_{i i} (h) - 2 p_{i} + p_{i}^{2}}{{(1 - p_{i})}^{2}} \frac{p_{j j} (h) - 2 p_{j} + p_{j}^{2}}{{(1 - p_{j})}^{2}} σ_{i, j} \\ + \sum_{i, j = 0}^{m} \frac{σ_{m + 1 + i, m + 1 + j}^{(h)}}{(1 - p_{i}) (1 - p_{j})} + 2 \sum_{i, j = 0}^{m} \frac{p_{i i} (h) - 2 p_{i} + p_{i}^{2}}{{(1 - p_{i})}^{2}} \frac{σ_{i, m + 1 + j}^{(h)}}{1 - p_{j}} . \end{matrix}

Note that, under the null of an i.i.d. DGP, we have the simplifications

\frac{p_{j j} (h) - 2 p_{j} + p_{j}^{2}}{{(1 - p_{j})}^{2}} = \frac{- 2 p_{j}}{1 - p_{j}}, \frac{1 - p_{j j} (h)}{{(1 - p_{j})}^{3}} = \frac{1 + p_{j}}{{(1 - p_{j})}^{2}} .

Thus, the second-order Taylor approximation of

{\hat{κ}}^{⋆} (h)

then simplifies to

\begin{matrix} {\hat{κ}}^{⋆} (h) & \approx & κ^{⋆} (h) + \sum_{j = 0}^{m} \frac{- 2 p_{j}}{1 - p_{j}} ({\hat{p}}_{j} - p_{j}) + \sum_{j = 0}^{m} \frac{1}{1 - p_{j}} ({\hat{p}}_{j j} (h) - p_{j}^{2}) \\ - \sum_{j = 0}^{m} \frac{1 + p_{j}}{{(1 - p_{j})}^{2}} {({\hat{p}}_{j} - p_{j})}^{2} + \sum_{j = 0}^{m} \frac{1}{{(1 - p_{j})}^{2}} ({\hat{p}}_{j} - p_{j}) ({\hat{p}}_{j j} (h) - p_{j}^{2}) . \end{matrix}

Furthermore,

σ_{i, j} = p_{j} (δ_{i, j} - p_{i}), σ_{i, m + 1 + j} = 2 p_{j}^{2} (δ_{i, j} - p_{i}), σ_{m + 1 + i, m + 1 + j} = δ_{i, j} p_{i}^{2} (1 + 2 p_{i}) - 3 p_{i}^{2} p_{j}^{2},

according to Equation (19). Thus, for an i.i.d. DGP, it follows that

\begin{matrix} n (E [{\hat{κ}}^{⋆} (h)] - κ^{⋆} (h)) = n E [{\hat{κ}}^{⋆} (h)] & \approx & \sum_{j = 0}^{m} \frac{σ_{j, m + 1 + j} - (1 + p_{j}) σ_{j j}}{{(1 - p_{j})}^{2}} \\ = & \sum_{j = 0}^{m} \frac{2 p_{j}^{2} (1 - p_{j}) - (1 + p_{j}) p_{j} (1 - p_{j})}{{(1 - p_{j})}^{2}} \\ = & \sum_{j = 0}^{m} \frac{- p_{j} (1 - p_{j}) (1 + p_{j} - 2 p_{j})}{{(1 - p_{j})}^{2}} = - 1 . \end{matrix}

Furthermore,

\begin{matrix} σ^{2} & = & \sum_{i, j = 0}^{m} \frac{- 2 p_{i}}{1 - p_{i}} \frac{- 2 p_{j}}{1 - p_{j}} p_{j} (δ_{i, j} - p_{i}) + 2 \sum_{i, j = 0}^{m} \frac{- 2 p_{i}}{1 - p_{i}} \frac{1}{1 - p_{j}} 2 p_{j}^{2} (δ_{i, j} - p_{i}) \\ + \sum_{i, j = 0}^{m} \frac{1}{1 - p_{i}} \frac{1}{1 - p_{j}} (δ_{i, j} p_{i}^{2} (1 + 2 p_{i}) - 3 p_{i}^{2} p_{j}^{2}) \\ = & 4 \sum_{i = 0}^{m} \frac{p_{i}^{3}}{{(1 - p_{i})}^{2}} - 8 \sum_{i = 0}^{m} \frac{p_{i}^{3}}{{(1 - p_{i})}^{2}} + \sum_{i = 0}^{m} \frac{p_{i}^{2} (1 + 2 p_{i})}{{(1 - p_{i})}^{2}} \\ - 4 \sum_{i, j = 0}^{m} \frac{p_{i}^{2}}{1 - p_{i}} \frac{p_{j}^{2}}{1 - p_{j}} + 8 \sum_{i, j = 0}^{m} \frac{p_{i}^{2}}{1 - p_{i}} \frac{p_{j}^{2}}{1 - p_{j}} - 3 \sum_{i, j = 0}^{m} \frac{p_{i}^{2}}{1 - p_{i}} \frac{p_{j}^{2}}{1 - p_{j}} \\ = & \sum_{i = 0}^{m} \frac{p_{i}^{2} (1 - 2 p_{i})}{{(1 - p_{i})}^{2}} + (\sum_{i = 0}^{m} \frac{p_{i}^{2}}{1 - p_{i}})^{2} . \end{matrix}

This leads to Equation (21).
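The first-order partial derivatives of f used in this derivation can be cross-checked against central finite differences. The Python sketch below is such a check under stated assumptions: the function names and the test point are hypothetical, and the vector x is laid out as in the partition given at the start of this appendix.

```python
def f(x, m):
    """f(x) = sum_j (x_{m+1+j} - x_j^2) / (1 - x_j)."""
    return sum((x[m + 1 + j] - x[j] ** 2) / (1.0 - x[j]) for j in range(m + 1))


def grad_f(x, m):
    """Analytic first-order partial derivatives of f as stated above."""
    g = [0.0] * (2 * (m + 1))
    for j in range(m + 1):
        g[j] = (x[m + 1 + j] - 2.0 * x[j] + x[j] ** 2) / (1.0 - x[j]) ** 2
        g[m + 1 + j] = 1.0 / (1.0 - x[j])
    return g


def numeric_grad(x, m, eps=1e-6):
    """Central finite-difference approximation of the gradient."""
    g = []
    for k in range(len(x)):
        up = list(x)
        down = list(x)
        up[k] += eps
        down[k] -= eps
        g.append((f(up, m) - f(down, m)) / (2.0 * eps))
    return g


m = 2
x0 = [0.5, 0.3, 0.2, 0.30, 0.12, 0.05]    # hypothetical (p_j, p_jj(h)) values
analytic = grad_f(x0, m)
numeric = numeric_grad(x0, m)

# Under independence, p_jj(h) = p_j^2 and the first m + 1 entries reduce to -2 p_j / (1 - p_j).
p = [0.5, 0.3, 0.2]
x_iid = p + [q ** 2 for q in p]
iid_grad = grad_f(x_iid, m)
```

At the independence point `x_iid`, the function value f is zero, which corresponds to κ⋆(h) = 0.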

In the special case of a uniform distribution, i.e. where all

p_{i} = \frac{1}{m + 1}

, the asymptotic variances according to Equations (16), (17) and (21) coincide. Indeed, in Equation (16) we then have

1 - \frac{1 + 2 s_{3} (p) - 3 s_{2} (p)}{{(1 - s_{2} (p))}^{2}} = 1 - \frac{1 + \frac{2}{{(m + 1)}^{2}} - \frac{3}{m + 1}}{{(1 - \frac{1}{m + 1})}^{2}} = 1 - \frac{{(m + 1)}^{2} + 2 - 3 (m + 1)}{m^{2}} = 1 - \frac{m^{2} - m}{m^{2}} = \frac{1}{m},

which corresponds to Equation (17). However, the same expression also follows in Equation (21):

s_{2} (p) - \sum_{i = 0}^{m} {(\frac{p_{i}^{2}}{1 - p_{i}})}^{2} + {(\sum_{i = 0}^{m} \frac{p_{i}^{2}}{1 - p_{i}})}^{2} = \frac{1}{m + 1} - \frac{1}{m^{2} (m + 1)} + \frac{1}{m^{2}} = \frac{1}{m} .

Appendix C. Tables

Table A1. Asymptotic vs. simulated mean (M$_{\cdot}$-a vs. M$_{\cdot}$-s) of ${\hat{ν}}_{G}$, ${\hat{ν}}_{En}$, ${\hat{ν}}_{Ex}$ for DGPs i.i.d., DMA(1) with $φ_{1} = 0.25$, and DAR(1) with $ϕ_{1} = 0.40$.

	DGP	i.i.d.		DMA(1), 0.25		DAR(1), 0.40		i.i.d.		DMA(1), 0.25		DAR(1), 0.40		i.i.d.		DMA(1), 0.25		DAR(1), 0.40
PMF	$n$	M $_{G}$ -a	M $_{G}$ -s	M $_{G}$ -a	M $_{G}$ -s	M $_{G}$ -a	M $_{G}$ -s	M $_{En}$ -a	M $_{En}$ -s	M $_{En}$ -a	M $_{En}$ -s	M $_{En}$ -a	M $_{En}$ -s	M $_{Ex}$ -a	M $_{Ex}$ -s	M $_{Ex}$ -a	M $_{Ex}$ -s	M $_{Ex}$ -a	M $_{Ex}$ -s
$p_{1}$	100	0.970	0.970	0.967	0.967	0.957	0.957	0.969	0.968	0.965	0.964	0.954	0.954	0.982	0.982	0.980	0.980	0.975	0.975
	250	0.976	0.976	0.975	0.975	0.971	0.971	0.975	0.975	0.973	0.973	0.969	0.969	0.986	0.986	0.985	0.985	0.983	0.983
	500	0.978	0.978	0.977	0.977	0.975	0.975	0.977	0.977	0.976	0.976	0.974	0.974	0.987	0.987	0.987	0.987	0.985	0.985
	1000	0.979	0.979	0.979	0.979	0.978	0.978	0.978	0.978	0.978	0.978	0.977	0.977	0.988	0.988	0.987	0.987	0.987	0.987
$p_{2}$	100	0.627	0.627	0.625	0.625	0.619	0.619	0.649	0.649	0.645	0.644	0.634	0.634	0.739	0.739	0.737	0.737	0.731	0.731
	250	0.631	0.631	0.630	0.630	0.627	0.627	0.655	0.655	0.654	0.654	0.649	0.649	0.743	0.743	0.742	0.742	0.739	0.739
	500	0.632	0.632	0.632	0.631	0.630	0.630	0.657	0.657	0.657	0.656	0.654	0.654	0.744	0.744	0.743	0.743	0.742	0.742
	1000	0.633	0.633	0.632	0.633	0.632	0.632	0.658	0.658	0.658	0.658	0.657	0.657	0.744	0.744	0.744	0.744	0.744	0.743
$p_{3}$	100	0.759	0.759	0.756	0.756	0.749	0.749	0.756	0.755	0.752	0.751	0.741	0.741	0.842	0.842	0.840	0.840	0.835	0.835
	250	0.764	0.764	0.762	0.762	0.760	0.760	0.762	0.762	0.761	0.761	0.757	0.757	0.846	0.846	0.845	0.845	0.843	0.843
	500	0.765	0.765	0.765	0.765	0.763	0.763	0.764	0.764	0.764	0.764	0.762	0.762	0.847	0.847	0.846	0.847	0.845	0.845
	1000	0.766	0.766	0.766	0.766	0.765	0.765	0.766	0.765	0.765	0.765	0.764	0.764	0.847	0.847	0.847	0.847	0.847	0.847
$p_{4}$	100	0.433	0.433	0.431	0.431	0.427	0.428	0.486	0.486	0.482	0.481	0.471	0.471	0.568	0.568	0.566	0.566	0.560	0.561
	250	0.436	0.436	0.435	0.435	0.433	0.434	0.492	0.492	0.491	0.491	0.487	0.487	0.572	0.572	0.571	0.571	0.569	0.569
	500	0.437	0.437	0.436	0.436	0.435	0.436	0.495	0.495	0.494	0.494	0.492	0.492	0.573	0.573	0.572	0.572	0.571	0.571
	1000	0.437	0.437	0.437	0.437	0.436	0.436	0.496	0.496	0.495	0.495	0.494	0.494	0.573	0.573	0.573	0.573	0.573	0.573
$p_{5}$	100	0.743	0.743	0.740	0.740	0.733	0.733	0.764	0.764	0.760	0.759	0.749	0.749	0.827	0.827	0.824	0.824	0.819	0.819
	250	0.747	0.747	0.746	0.746	0.743	0.743	0.770	0.770	0.768	0.769	0.764	0.765	0.830	0.830	0.829	0.829	0.827	0.827
	500	0.749	0.749	0.748	0.748	0.747	0.747	0.772	0.772	0.771	0.772	0.769	0.769	0.831	0.831	0.831	0.831	0.830	0.830
	1000	0.749	0.749	0.749	0.749	0.748	0.748	0.773	0.773	0.773	0.773	0.772	0.772	0.832	0.832	0.832	0.832	0.831	0.831
$p_{6}$	100	0.928	0.928	0.925	0.925	0.916	0.916	0.929	0.929	0.925	0.925	0.915	0.915	0.956	0.956	0.953	0.954	0.948	0.948
	250	0.934	0.934	0.932	0.932	0.929	0.929	0.936	0.936	0.934	0.934	0.930	0.930	0.959	0.959	0.958	0.958	0.956	0.956
	500	0.936	0.936	0.935	0.935	0.933	0.933	0.938	0.938	0.937	0.937	0.935	0.935	0.960	0.960	0.960	0.960	0.959	0.959
	1000	0.937	0.937	0.936	0.936	0.935	0.935	0.939	0.939	0.939	0.939	0.938	0.938	0.961	0.961	0.961	0.961	0.960	0.960

Table A2. Asymptotic vs. simulated standard deviation (S$_{\cdot}$-a vs. S$_{\cdot}$-s) of ${\hat{ν}}_{G}$, ${\hat{ν}}_{En}$, ${\hat{ν}}_{Ex}$ for DGPs i.i.d., DMA(1) with $φ_{1} = 0.25$, and DAR(1) with $ϕ_{1} = 0.40$.

	DGP	i.i.d.		DMA(1), 0.25		DAR(1), 0.40		i.i.d.		DMA(1), 0.25		DAR(1), 0.40		i.i.d.		DMA(1), 0.25		DAR(1), 0.40
PMF	$n$	S $_{G}$ -a	S $_{G}$ -s	S $_{G}$ -a	S $_{G}$ -s	S $_{G}$ -a	S $_{G}$ -s	S $_{En}$ -a	S $_{En}$ -s	S $_{En}$ -a	S $_{En}$ -s	S $_{En}$ -a	S $_{En}$ -s	S $_{Ex}$ -a	S $_{Ex}$ -s	S $_{Ex}$ -a	S $_{Ex}$ -s	S $_{Ex}$ -a	S $_{Ex}$ -s
$p_{1}$	100	0.017	0.019	0.020	0.023	0.027	0.032	0.017	0.020	0.020	0.024	0.027	0.033	0.011	0.012	0.012	0.014	0.016	0.020
	250	0.011	0.011	0.013	0.014	0.017	0.018	0.011	0.012	0.013	0.014	0.017	0.019	0.007	0.007	0.008	0.008	0.010	0.011
	500	0.008	0.008	0.009	0.009	0.012	0.012	0.008	0.008	0.009	0.009	0.012	0.013	0.005	0.005	0.006	0.006	0.007	0.008
	1000	0.006	0.006	0.006	0.007	0.008	0.009	0.006	0.006	0.006	0.007	0.008	0.009	0.003	0.003	0.004	0.004	0.005	0.005
$p_{2}$	100	0.071	0.071	0.084	0.083	0.109	0.106	0.063	0.064	0.074	0.075	0.097	0.096	0.057	0.058	0.067	0.068	0.088	0.088
	250	0.045	0.045	0.053	0.053	0.069	0.068	0.040	0.040	0.047	0.047	0.061	0.061	0.036	0.036	0.043	0.043	0.055	0.055
	500	0.032	0.032	0.037	0.037	0.049	0.049	0.028	0.029	0.033	0.033	0.043	0.043	0.026	0.026	0.030	0.030	0.039	0.039
	1000	0.023	0.023	0.027	0.027	0.035	0.035	0.020	0.020	0.024	0.024	0.031	0.031	0.018	0.018	0.021	0.021	0.028	0.028
$p_{3}$	100	0.058	0.058	0.068	0.068	0.088	0.086	0.053	0.054	0.062	0.063	0.081	0.081	0.042	0.043	0.049	0.050	0.064	0.065
	250	0.037	0.037	0.043	0.043	0.056	0.055	0.033	0.034	0.039	0.039	0.051	0.051	0.027	0.027	0.031	0.031	0.041	0.041
	500	0.026	0.026	0.030	0.030	0.039	0.039	0.024	0.024	0.028	0.028	0.036	0.036	0.019	0.019	0.022	0.022	0.029	0.029
	1000	0.018	0.018	0.021	0.021	0.028	0.028	0.017	0.017	0.020	0.020	0.025	0.025	0.013	0.013	0.016	0.016	0.020	0.020
$p_{4}$	100	0.078	0.077	0.092	0.090	0.119	0.115	0.072	0.073	0.085	0.085	0.110	0.109	0.073	0.073	0.085	0.086	0.111	0.110
	250	0.049	0.049	0.058	0.058	0.075	0.075	0.046	0.046	0.054	0.054	0.070	0.070	0.046	0.046	0.054	0.054	0.070	0.071
	500	0.035	0.035	0.041	0.041	0.053	0.053	0.032	0.032	0.038	0.038	0.049	0.049	0.033	0.033	0.038	0.038	0.050	0.050
	1000	0.025	0.025	0.029	0.029	0.038	0.038	0.023	0.023	0.027	0.027	0.035	0.035	0.023	0.023	0.027	0.027	0.035	0.035
$p_{5}$	100	0.065	0.064	0.076	0.075	0.099	0.096	0.056	0.057	0.066	0.067	0.086	0.086	0.048	0.048	0.056	0.056	0.073	0.073
	250	0.041	0.041	0.048	0.048	0.062	0.062	0.036	0.036	0.042	0.042	0.054	0.054	0.030	0.030	0.035	0.035	0.046	0.046
	500	0.029	0.029	0.034	0.034	0.044	0.044	0.025	0.025	0.029	0.029	0.038	0.038	0.021	0.021	0.025	0.025	0.032	0.033
	1000	0.020	0.020	0.024	0.024	0.031	0.031	0.018	0.018	0.021	0.021	0.027	0.027	0.015	0.015	0.018	0.018	0.023	0.023
$p_{6}$	100	0.033	0.034	0.039	0.040	0.051	0.052	0.030	0.032	0.036	0.037	0.046	0.050	0.021	0.022	0.025	0.026	0.032	0.034
	250	0.021	0.021	0.025	0.025	0.032	0.032	0.019	0.019	0.022	0.023	0.029	0.030	0.013	0.013	0.016	0.016	0.020	0.021
	500	0.015	0.015	0.017	0.017	0.023	0.023	0.014	0.014	0.016	0.016	0.021	0.021	0.009	0.010	0.011	0.011	0.014	0.015
	1000	0.010	0.011	0.012	0.012	0.016	0.016	0.010	0.010	0.011	0.011	0.015	0.015	0.007	0.007	0.008	0.008	0.010	0.010

Table A3. Simulated coverage rate for 95% CIs of ${\hat{ν}}_{G}$, ${\hat{ν}}_{En}$, ${\hat{ν}}_{Ex}$ for DGPs i.i.d., DMA(1) with $φ_{1} = 0.25$, and DAR(1) with $ϕ_{1} = 0.40$.

		Coverage ${\hat{ν}}_{G}$ , DGP:			Coverage ${\hat{ν}}_{En}$ , DGP:			Coverage ${\hat{ν}}_{Ex}$ , DGP:
PMF	$n$	i.i.d.	DMA(1), 0.25	DAR(1), 0.40	i.i.d.	DMA(1), 0.25	DAR(1), 0.40	i.i.d.	DMA(1), 0.25	DAR(1), 0.40
$p_{1}$	100	0.895	0.893	0.899	0.904	0.901	0.905	0.892	0.890	0.897
	250	0.914	0.909	0.898	0.920	0.915	0.905	0.912	0.907	0.896
	500	0.931	0.922	0.912	0.935	0.927	0.918	0.930	0.921	0.910
	1000	0.938	0.935	0.926	0.940	0.937	0.930	0.938	0.934	0.925
$p_{2}$	100	0.936	0.926	0.908	0.937	0.929	0.915	0.931	0.928	0.911
	250	0.944	0.941	0.933	0.945	0.942	0.937	0.944	0.941	0.935
	500	0.947	0.946	0.941	0.947	0.946	0.943	0.947	0.946	0.941
	1000	0.949	0.947	0.944	0.949	0.948	0.946	0.949	0.947	0.945
$p_{3}$	100	0.931	0.920	0.904	0.936	0.926	0.918	0.930	0.919	0.901
	250	0.941	0.938	0.930	0.943	0.943	0.937	0.940	0.937	0.929
	500	0.945	0.945	0.940	0.946	0.946	0.943	0.944	0.944	0.939
	1000	0.948	0.947	0.945	0.948	0.948	0.947	0.948	0.947	0.945
$p_{4}$	100	0.941	0.925	0.904	0.938	0.925	0.905	0.938	0.931	0.913
	250	0.947	0.939	0.929	0.943	0.940	0.931	0.948	0.942	0.934
	500	0.949	0.946	0.941	0.948	0.946	0.942	0.950	0.947	0.943
	1000	0.952	0.945	0.946	0.949	0.946	0.947	0.949	0.946	0.947
$p_{5}$	100	0.933	0.922	0.904	0.938	0.928	0.915	0.939	0.923	0.906
	250	0.945	0.939	0.931	0.947	0.942	0.935	0.944	0.940	0.931
	500	0.946	0.945	0.941	0.947	0.946	0.943	0.947	0.945	0.941
	1000	0.948	0.947	0.945	0.949	0.948	0.946	0.949	0.947	0.945
$p_{6}$	100	0.909	0.894	0.884	0.920	0.908	0.901	0.906	0.890	0.877
	250	0.933	0.923	0.908	0.936	0.929	0.919	0.931	0.921	0.905
	500	0.938	0.935	0.927	0.940	0.938	0.933	0.937	0.934	0.925
	1000	0.943	0.941	0.939	0.945	0.943	0.942	0.943	0.941	0.938

Table A4. Rejection rate when testing the null hypothesis of a uniform distribution ($m = 3$) at the 5% level based on ${\hat{ν}}_{G}$, ${\hat{ν}}_{En}$, ${\hat{ν}}_{Ex}$. DGPs: i.i.d., DMA(1) with $φ_{1} = 0.25$, and DAR(1) with $ϕ_{1} = 0.40$, each with marginal distribution $L_{m} (λ)$.

	$λ$	1.00	0.98	0.96	0.94	0.92	0.90	1.00	0.98	0.96	0.94	0.92	0.90	1.00	0.98	0.96	0.94	0.92	0.90
DGP	$n$	Rejection rate ${\hat{ν}}_{G}$						Rejection rate ${\hat{ν}}_{En}$						Rejection rate ${\hat{ν}}_{Ex}$
i.i.d.	100	0.049	0.056	0.081	0.122	0.183	0.271	0.050	0.057	0.082	0.120	0.178	0.262	0.050	0.058	0.084	0.127	0.190	0.281
	250	0.052	0.068	0.132	0.250	0.421	0.605	0.051	0.068	0.129	0.243	0.407	0.590	0.051	0.068	0.132	0.252	0.423	0.608
	500	0.050	0.088	0.223	0.465	0.723	0.899	0.050	0.087	0.219	0.455	0.712	0.893	0.050	0.088	0.225	0.468	0.727	0.901
	1000	0.051	0.132	0.420	0.782	0.961	0.997	0.051	0.132	0.414	0.775	0.959	0.997	0.051	0.133	0.423	0.785	0.962	0.997
DMA(1), 0.25	100	0.054	0.060	0.076	0.109	0.150	0.213	0.057	0.063	0.079	0.109	0.150	0.210	0.055	0.062	0.078	0.112	0.155	0.219
	250	0.052	0.065	0.108	0.194	0.317	0.467	0.052	0.067	0.108	0.190	0.309	0.455	0.052	0.066	0.110	0.197	0.322	0.474
	500	0.051	0.077	0.172	0.350	0.575	0.778	0.052	0.077	0.169	0.343	0.564	0.769	0.051	0.078	0.174	0.354	0.579	0.782
	1000	0.049	0.106	0.315	0.634	0.879	0.978	0.050	0.106	0.310	0.625	0.873	0.976	0.050	0.107	0.317	0.638	0.881	0.978
DAR(1), 0.40	100	0.056	0.059	0.067	0.088	0.112	0.146	0.059	0.062	0.071	0.091	0.115	0.147	0.059	0.062	0.071	0.092	0.118	0.153
	250	0.053	0.060	0.085	0.132	0.199	0.294	0.054	0.062	0.085	0.131	0.195	0.286	0.054	0.062	0.087	0.135	0.205	0.301
	500	0.050	0.066	0.120	0.218	0.366	0.535	0.051	0.067	0.119	0.213	0.356	0.522	0.051	0.067	0.121	0.221	0.371	0.542
	1000	0.050	0.083	0.199	0.404	0.647	0.846	0.050	0.083	0.195	0.396	0.636	0.838	0.050	0.084	0.200	0.408	0.651	0.850

Table A5. Asymptotic vs. simulated standard deviation (S$_{\cdot}$-a vs. S$_{\cdot}$-s) of $\hat{κ} (h)$, ${\hat{κ}}^{*} (h)$, ${\hat{κ}}^{⋆} (h)$ for i.i.d. DGPs with $m = 3$.

		S $_{\hat{κ}}$ -a	S $_{\hat{κ}}$ -s, $h =$			S $_{{\hat{κ}}^{*}}$ -a	S $_{{\hat{κ}}^{*}}$ -s, $h =$			S $_{{\hat{κ}}^{⋆}}$ -a	S $_{{\hat{κ}}^{⋆}}$ -s, $h =$
PMF	$n$		1	2	3		1	2	3		1	2	3
$p_{5}$	100	0.064	0.064	0.065	0.066	0.058	0.056	0.057	0.057	0.074	0.075	0.077	0.078
	250	0.040	0.041	0.041	0.041	0.037	0.036	0.036	0.036	0.047	0.047	0.048	0.048
	500	0.029	0.029	0.029	0.029	0.026	0.026	0.026	0.026	0.033	0.033	0.033	0.034
	1000	0.020	0.020	0.020	0.020	0.018	0.018	0.018	0.018	0.023	0.023	0.024	0.024
$p_{6}$	100	0.060	0.060	0.061	0.061	0.058	0.057	0.057	0.058	0.063	0.064	0.064	0.065
	250	0.038	0.038	0.038	0.038	0.037	0.036	0.036	0.037	0.040	0.040	0.040	0.040
	500	0.027	0.027	0.027	0.027	0.026	0.026	0.026	0.026	0.028	0.028	0.028	0.028
	1000	0.019	0.019	0.019	0.019	0.018	0.018	0.018	0.018	0.020	0.020	0.020	0.020
$p_{7}$	100	0.058	0.058	0.059	0.059	0.058	0.057	0.058	0.058	0.058	0.058	0.059	0.059
	250	0.037	0.036	0.037	0.037	0.037	0.036	0.037	0.036	0.037	0.037	0.037	0.037
	500	0.026	0.026	0.026	0.026	0.026	0.026	0.026	0.026	0.026	0.026	0.026	0.026
	1000	0.018	0.018	0.018	0.018	0.018	0.018	0.018	0.018	0.018	0.018	0.018	0.018

Table A6. Rejection rate (RR) when testing the null hypothesis of i.i.d. data at the 5% level based on $\hat{κ} (1)$, ${\hat{κ}}^{*} (1)$, ${\hat{κ}}^{⋆} (1)$. DGPs: i.i.d., DMA(1) with $φ_{1} = 0.15$, DAR(1) with $ϕ_{1} = 0.15$, and NegMarkov with $α = 0.75$.

	DGP	i.i.d., RR for			DMA(1), 0.15, RR for			DAR(1), 0.15, RR for			NMark, 0.75, RR for
PMF	$n$	$\hat{κ} (1)$	${\hat{κ}}^{*} (1)$	${\hat{κ}}^{⋆} (1)$	$\hat{κ} (1)$	${\hat{κ}}^{*} (1)$	${\hat{κ}}^{⋆} (1)$	$\hat{κ} (1)$	${\hat{κ}}^{*} (1)$	${\hat{κ}}^{⋆} (1)$	$\hat{κ} (1)$	${\hat{κ}}^{*} (1)$	${\hat{κ}}^{⋆} (1)$
$p_{5}$	100	0.047	0.042	0.051	0.484	0.537	0.400	0.594	0.634	0.507	0.478	0.235	0.521
	250	0.049	0.048	0.049	0.851	0.896	0.756	0.925	0.947	0.861	0.886	0.654	0.908
	500	0.049	0.049	0.050	0.988	0.994	0.961	0.997	0.999	0.989	0.995	0.936	0.997
	1000	0.049	0.048	0.049	1.000	1.000	0.999	1.000	1.000	1.000	1.000	0.999	1.000
$p_{6}$	100	0.048	0.047	0.049	0.540	0.559	0.509	0.662	0.673	0.631	0.338	0.268	0.353
	250	0.050	0.049	0.049	0.903	0.916	0.878	0.960	0.966	0.947	0.725	0.636	0.735
	500	0.050	0.049	0.051	0.996	0.997	0.992	0.999	1.000	0.999	0.958	0.920	0.961
	1000	0.049	0.050	0.050	1.000	1.000	1.000	1.000	1.000	1.000	1.000	0.998	1.000
$p_{7}$	100	0.048	0.047	0.048	0.577	0.569	0.573	0.699	0.688	0.697	0.275	0.269	0.276
	250	0.047	0.048	0.048	0.925	0.924	0.924	0.973	0.972	0.973	0.640	0.636	0.641
	500	0.049	0.049	0.049	0.998	0.998	0.998	1.000	1.000	1.000	0.918	0.916	0.918
	1000	0.050	0.050	0.050	1.000	1.000	1.000	1.000	1.000	1.000	0.998	0.998	0.998

References

Agresti, Alan. 2002. Categorical Data Analysis, 2nd ed. Hoboken: John Wiley & Sons, Inc.
Billingsley, Patrick. 1999. Convergence of Probability Measures, 2nd ed. New York: John Wiley & Sons, Inc.
Blyth, Colin R. 1959. Note on estimating information. Annals of Mathematical Statistics 30: 71–79.
Cohen, Jacob. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20: 37–46.
Craig, Wallace. 1943. The Song of the Wood Pewee, Myiochanes virens Linnaeus: A Study of Bird Music. New York State Museum Bulletin No. 334. Albany: University of the State of New York.
GESIS—Leibniz Institute for the Social Sciences. 2018. ALLBUS 1980–2016 (German General Social Survey). ZA4586 Data File (Version 1.0.0). Cologne: GESIS Data Archive. (In German)
Hancock, Gwendolyn D’Anne. 2012. VIX and VIX futures pricing algorithms: Cultivating understanding. Modern Economy 3: 284–94.
Jacobs, Patricia A., and Peter A. W. Lewis. 1983. Stationary discrete autoregressive-moving average time series generated by mixtures. Journal of Time Series Analysis 4: 19–36.
Kvålseth, Tarald O. 1995. Coefficients of variation for nominal and ordinal categorical data. Perceptual and Motor Skills 80: 843–47.
Kvålseth, Tarald O. 2011a. The lambda distribution and its applications to categorical summary measures. Advances and Applications in Statistics 24: 83–106.
Kvålseth, Tarald O. 2011b. Variation for categorical variables. In International Encyclopedia of Statistical Science. Edited by Miodrag Lovric. Berlin: Springer, pp. 1642–45.
Lad, Frank, Giuseppe Sanfilippo, and Gianna Agro. 2015. Extropy: Complementary dual of entropy. Statistical Science 30: 40–58.
Love, Eric Russell. 1980. Some logarithm inequalities. Mathematical Gazette 64: 55–57.
Rao, C. Radhakrishna. 1982. Diversity and dissimilarity coefficients: A unified approach. Theoretical Population Biology 21: 24–43.
Shannon, Claude Elwood. 1948. A mathematical theory of communication. Bell System Technical Journal 27: 379–423, 623–56.
Shorrocks, Anthony F. 1978. The measurement of mobility. Econometrica 46: 1013–24.
Tan, Wai-Yuan. 1977. On the distribution of quadratic forms in normal random variables. Canadian Journal of Statistics 5: 241–50.
Weiß, Christian H. 2011. Empirical measures of signed serial dependence in categorical time series. Journal of Statistical Computation and Simulation 81: 411–29.
Weiß, Christian H. 2013. Serial dependence of NDARMA processes. Computational Statistics and Data Analysis 68: 213–38.
Weiß, Christian H. 2018. An Introduction to Discrete-Valued Time Series. Chichester: John Wiley & Sons, Inc.
Weiß, Christian H., and Rainer Göb. 2008. Measuring serial dependence in categorical time series. AStA Advances in Statistical Analysis 92: 71–89.

1. The “zero problem” for $κ^{*} (h)$ described after Equation (4) happened mainly for $n = 100$ and for distributions with low dispersion such as $p_{2}$ to $p_{4}$, in about 0.5% of the i.i.d. simulation runs. It increased with positive dependence, to about 2% for the DAR(1) simulation runs. This problem was circumvented by replacing all affected summands by 0.
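
The workaround described in this note can be sketched in Python. The sketch rests on several assumptions that are not stated here: that the sample version of $κ^{*} (h)$ is an average over categories of summands of the form $({\hat{p}}_{jj} (h) - {\hat{p}}_{j}^{2}) / ({\hat{p}}_{j} (1 - {\hat{p}}_{j}))$, that the bivariate frequencies are computed from the $n - h$ available pairs, and that the “zero problem” refers to a vanishing denominator. The function name `kappa_star` and the integer coding of the categories are illustrative only.

```python
def kappa_star(x, h, k):
    """Sketch of a modified-kappa-type estimate at lag h.

    x : sequence of category codes in {0, ..., k-1}
    h : lag, with 0 < h < len(x)
    k : number of categories (assumption: the average runs over all k)
    """
    n = len(x)
    if not 0 < h < n:
        raise ValueError("lag h must satisfy 0 < h < len(x)")
    pairs = n - h  # assumption: bivariate frequencies use the n - h lagged pairs
    total = 0.0
    for j in range(k):
        # Relative frequency of category j
        p_j = sum(1 for v in x if v == j) / n
        # Relative frequency of observing j at both time t - h and time t
        p_jj = sum(1 for t in range(h, n) if x[t] == j and x[t - h] == j) / pairs
        denom = p_j * (1.0 - p_j)
        if denom == 0.0:
            # "Zero problem": the summand is undefined, so it is replaced by 0
            continue
        total += (p_jj - p_j ** 2) / denom
    return total / k


# A strictly alternating binary series is perfectly negatively dependent at lag 1
series = [0, 1, 0, 1, 0, 1]
print(kappa_star(series, h=1, k=2))
# With a third, never observed category, its summand is simply dropped
print(kappa_star(series, h=1, k=3))
```

Skipping a summand is equivalent to setting it to 0, so the average is still taken over all categories. This pulls the estimate towards zero when categories are unobserved.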

Figure 1. Normalized dispersion measures for Lambda distribution $L_{m} (λ)$ against $λ$: $ν_{Ex}, ν_{G}$ in (a), $ν_{En}$ in (b), comparison for $m = 10$ in (c).
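
To experiment with curves like those in panel (a), the following Python sketch builds a Lambda-type PMF and evaluates the normalized Gini index. It assumes the one-point-inflated uniform form, in which one category has probability $1 - λ + λ/k$ and the remaining $k - 1$ categories have probability $λ/k$ each. Here $k$ is the number of categories, and the function names are illustrative. For $k = 4$ and $λ = 0.25, 0.5, 0.75$, this form reproduces the PMFs $p_{4}$, $p_{5}$, $p_{6}$ of Table 1.

```python
def lambda_pmf(k, lam):
    """PMF with one inflated category and k - 1 equally likely others (assumed form)."""
    if k < 2 or not 0.0 <= lam <= 1.0:
        raise ValueError("need k >= 2 and 0 <= lam <= 1")
    base = lam / k
    return [1.0 - lam + base] + [base] * (k - 1)


def gini_index(p):
    """Normalized Gini index k/(k-1) * (1 - sum of squared probabilities)."""
    k = len(p)
    return k / (k - 1) * (1.0 - sum(q * q for q in p))


# Under the assumed form, the normalized Gini index equals lam * (2 - lam)
# regardless of k; the loop prints both values side by side for comparison.
for lam in (0.25, 0.5, 0.75, 1.0):
    print(lam, gini_index(lambda_pmf(4, lam)), lam * (2.0 - lam))
```

The case $λ = 1$ gives the uniform distribution and $λ = 0$ a one-point distribution, so $λ$ moves the PMF between minimal and maximal dispersion.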

Figure 2. ALLBUS data from Section 4.2: (a) dispersion measures; (b) extropy with 95% CIs; and (c) relative frequencies; plotted against year of survey.

Figure 3. Fear states time series plot and PMF (top); and plots of $\hat{κ} (h), {\hat{κ}}^{*} (h), {\hat{κ}}^{⋆} (h)$ (bottom).

Figure 4. Wood Pewee time series: rate evolution graph and PMF plot (top); and plots of $\hat{κ} (h), {\hat{κ}}^{*} (h), {\hat{κ}}^{⋆} (h)$ (bottom).

Table 1. Marginal distributions considered in Section 4.1 together with the corresponding dispersion values.

PMF	$ν_{G}$	$ν_{En}$	$ν_{Ex}$
$p_{1} = {(0.2, 0.2, 0.25, 0.35)}^{⊤}$	0.980	0.979	0.988
$p_{2} = {(0.05, 0.1, 0.15, 0.7)}^{⊤}$	0.633	0.660	0.745
$p_{3} = {(0.2, 0.15, 0.05, 0.6)}^{⊤}$	0.767	0.767	0.848
$p_{4} = {(0.8125, 0.0625, 0.0625, 0.0625)}^{⊤}$	0.438	0.497	0.574
$p_{5} = {(0.625, 0.125, 0.125, 0.125)}^{⊤}$	0.750	0.774	0.832
$p_{6} = {(0.4375, 0.1875, 0.1875, 0.1875)}^{⊤}$	0.938	0.940	0.961
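
The dispersion values in Table 1 can be checked numerically. The Python sketch below assumes the usual normalized forms of the Gini index, entropy and extropy for a PMF over $k$ categories ($k = 4$ for all rows of Table 1), each scaled to the range $[0, 1]$. The names `dispersion_measures` and `TABLE_1_PMFS` are illustrative and do not come from the article.

```python
import math


def dispersion_measures(p):
    """Return (nu_G, nu_En, nu_Ex) for a PMF p over k = len(p) categories."""
    k = len(p)
    if k < 2 or min(p) < 0 or abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("p must be a PMF over at least two categories")
    # Normalized Gini index: k/(k-1) * (1 - sum p_i^2)
    nu_g = k / (k - 1) * (1.0 - sum(q * q for q in p))
    # Normalized entropy: -sum p_i ln p_i / ln k, with 0 * ln 0 taken as 0
    nu_en = -sum(q * math.log(q) for q in p if q > 0) / math.log(k)
    # Normalized extropy: -sum (1 - p_i) ln(1 - p_i) / ((k-1) ln(k/(k-1)))
    nu_ex = -sum((1 - q) * math.log(1 - q) for q in p if q < 1) / (
        (k - 1) * math.log(k / (k - 1))
    )
    return nu_g, nu_en, nu_ex


# PMFs as listed in Table 1
TABLE_1_PMFS = {
    "p1": [0.2, 0.2, 0.25, 0.35],
    "p2": [0.05, 0.1, 0.15, 0.7],
    "p3": [0.2, 0.15, 0.05, 0.6],
    "p4": [0.8125, 0.0625, 0.0625, 0.0625],
    "p5": [0.625, 0.125, 0.125, 0.125],
    "p6": [0.4375, 0.1875, 0.1875, 0.1875],
}

for name, pmf in TABLE_1_PMFS.items():
    print(name, ["%.4f" % v for v in dispersion_measures(pmf)])
```

All three measures equal 1 for a uniform PMF and 0 for a one-point PMF, which makes the columns of Table 1 directly comparable.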

Table 2. Definition of fear states for Volatility Index (VIX) according to Hancock (2012).

State and Explanation		VIX	State and Explanation		VIX
$s_{0}$	extreme complacency	$[0; 10)$	$s_{7}$	extremely high anx.	$[40; 45)$
$s_{1}$	very low anx. = high compl.	$[10; 15)$	$s_{8}$	near panic	$[45; 50)$
$s_{2}$	low anx. = moderate compl.	$[15; 20)$	$s_{9}$	moderate panic	$[50; 55)$
$s_{3}$	moderate anx. = low compl.	$[20; 25)$	$s_{10}$	panic	$[55; 60)$
$s_{4}$	moderately high anxiety	$[25; 30)$	$s_{11}$	intense panic	$[60; 65)$
$s_{5}$	high anxiety	$[30; 35)$	$s_{12}$	extreme panic	$[65; 100]$
$s_{6}$	very high anxiety	$[35; 40)$
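
The mapping in Table 2 from a VIX value to a fear state is a lookup over half-open intervals. A minimal Python sketch is given below; the function name `fear_state` and the returned integer index $i$ of state $s_{i}$ are illustrative choices.

```python
from bisect import bisect_right

# Lower interval bounds of the states s_1, ..., s_12 as listed in Table 2;
# s_0 covers [0; 10) and s_12 covers the closed interval [65; 100].
LOWER_BOUNDS = [10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65]


def fear_state(vix):
    """Return the index i of the fear state s_i that contains the VIX value."""
    if not 0 <= vix <= 100:
        raise ValueError("VIX value outside the range [0; 100] covered by Table 2")
    # bisect_right counts the lower bounds that are <= vix, so a value that
    # sits exactly on a bound is assigned to the interval starting there.
    return bisect_right(LOWER_BOUNDS, vix)


print([fear_state(v) for v in (9.5, 10, 22.5, 64.9, 65, 100)])
```

Applying `fear_state` to each element of a daily VIX series would give a categorical series of state indices.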

© 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

