
# Measures of Dispersion and Serial Dependence in Categorical Time Series

Department of Mathematics and Statistics, Helmut Schmidt University, 22043 Hamburg, Germany
Econometrics 2019, 7(2), 17; https://doi.org/10.3390/econometrics7020017
Received: 21 December 2018 / Revised: 1 April 2019 / Accepted: 17 April 2019 / Published: 22 April 2019
(This article belongs to the Special Issue Discrete-Valued Time Series: Modelling, Estimation and Forecasting)

## Abstract

The analysis and modeling of categorical time series requires quantifying the extent of dispersion and serial dependence. The dispersion of categorical data is commonly measured by the Gini index or the entropy, but the recently proposed extropy measure can also be used for this purpose. Regarding signed serial dependence in categorical time series, we consider three types of $\kappa$-measures. By analyzing bias properties, it is shown that each of the $\kappa$-measures is related to one of the above-mentioned dispersion measures. For statistical inference based on the sample versions of these dispersion and dependence measures, knowledge of their distribution is required. Therefore, we study the asymptotic distributions and bias corrections of the considered dispersion and dependence measures, and we investigate the finite-sample performance of the resulting asymptotic approximations with simulations. The application of the measures is illustrated with real-data examples from politics, economics and biology.

## 1. Introduction

In many applications, the available data are not of a quantitative nature (e.g., real numbers or counts) but consist of observations from a given finite set of categories. In the present article, we are concerned with data about political goals in Germany, fear states in the stock market, and phrases in a bird's song. For stochastic modeling, we use a categorical random variable X, i.e., a qualitative random variable taking one of a finite number of categories, say $m+1$ categories with some $m \in \mathbb{N}$. If these categories are unordered, X is said to be a nominal random variable, whereas an ordinal random variable requires a natural order of the categories (Agresti 2002). To simplify notations, we always assume the possible outcomes to be arranged in a certain order (either lexicographical or natural order), i.e., we denote the range (state space) as $S = \{s_0, s_1, \ldots, s_m\}$. The stochastic properties of X are determined by the vector of marginal probabilities $p = (p_0, \ldots, p_m)^\top \in [0;1]^{m+1}$, where $p_i = P(X = s_i)$ (probability mass function, PMF). We abbreviate $s_k(p) := \sum_{j=0}^{m} p_j^k$ for $k \in \mathbb{N}$, where $s_1(p) = 1$ has to hold. The subscripts "$0, 1, \ldots, m$" are used for $S$ and p to emphasize that only m of the probabilities can be freely chosen because of the constraint $p_0 = 1 - p_1 - \ldots - p_m$.
Well-established dispersion measures for quantitative data, such as the variance or the interquartile range, cannot be applied to qualitative data. For a categorical random variable X, one commonly defines dispersion with respect to the uncertainty in predicting the outcome of X (Kvålseth 2011b; Rao 1982; Weiß and Göb 2008). This uncertainty is maximal for a uniform distribution $p_{\text{uni}} = \big(\frac{1}{m+1}, \ldots, \frac{1}{m+1}\big)^\top$ on $S$ (a reasonable prediction is impossible if all states are equally probable, thus maximal dispersion), whereas it is minimal for a one-point distribution $p_{\text{one}}$ (i.e., all probability mass concentrates on one category, so a perfect prediction is possible). Obviously, categorical dispersion is just the opposite concept to the concentration of a categorical distribution. To measure the dispersion of the categorical random variable X, the most common approach is to use either the (normalized) Gini index (also index of qualitative variation, IQV) (Kvålseth 1995; Rao 1982) defined as
$\nu_G = \frac{m+1}{m}\,\big(1 - s_2(p)\big),$
or the (normalized) entropy (Blyth 1959; Shannon 1948) given by
$\nu_{En} = -\frac{1}{\ln(m+1)} \sum_{i=0}^{m} p_i \ln p_i \quad \text{with } 0 \cdot \ln 0 := 0.$
Both measures are minimized by a one-point distribution $p_{\text{one}}$ and maximized by the uniform distribution $p_{\text{uni}}$ on $S$. While nominal dispersion is always expressed with respect to these extreme cases, it has to be mentioned that there is an alternative scenario of maximal ordinal variation, namely the extreme two-point distribution; however, this is not further considered here.
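As a quick illustration (not part of the original article), the Gini index in Equation (1) and the entropy in Equation (2) can be computed from a PMF vector as follows; the function names and the NumPy-based implementation are our own:

```python
import numpy as np

def gini_index(p):
    """Normalized Gini index (IQV), Equation (1): (m+1)/m * (1 - s_2(p))."""
    p = np.asarray(p, dtype=float)
    m = len(p) - 1
    return (m + 1) / m * (1.0 - np.sum(p ** 2))

def entropy(p):
    """Normalized entropy, Equation (2), using the convention 0 * ln(0) := 0."""
    p = np.asarray(p, dtype=float)
    m = len(p) - 1
    pz = p[p > 0]  # drop zero probabilities (0 * ln 0 := 0)
    return -np.sum(pz * np.log(pz)) / np.log(m + 1)

# both measures equal 1 for the uniform and 0 for a one-point distribution
print(gini_index([0.25, 0.25, 0.25, 0.25]), entropy([0.25, 0.25, 0.25, 0.25]))
print(gini_index([1.0, 0.0, 0.0, 0.0]), entropy([1.0, 0.0, 0.0, 0.0]))
```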
If considering a (stationary) categorical process $(X_t)_{\mathbb{Z}}$ instead of a single random variable, then not only marginal properties are relevant but also information about the serial dependence structure (Weiß 2018). The (signed) autocorrelation function (ACF), as commonly applied to real-valued processes, cannot be used for categorical data. However, one may use a type of Cohen's $\kappa$ instead (Cohen 1960). A $\kappa$-measure of signed serial dependence in categorical time series is given by (see Weiß 2011, 2013; Weiß and Göb 2008):
$\kappa(h) = \frac{\sum_{i=0}^{m} \big( p_{ii}(h) - p_i^2 \big)}{1 - s_2(p)} \;\in\; \Big[ -\frac{s_2(p)}{1 - s_2(p)};\; 1 \Big] \quad \text{for lags } h \in \mathbb{N}.$
Equation (3) is based on the lagged bivariate probabilities $p_{ij}(h) = P(X_t = s_i,\, X_{t-h} = s_j)$ for $i,j = 0,\ldots,m$. $\kappa(h) = 0$ in the case of serial independence at lag h, and the strongest degree of positive (negative) dependence is indicated if all $p_{ii}(h) = p_i$ ($p_{ii}(h) = 0$), i.e., if the event $X_{t-h} = s_i$ is necessarily followed by $X_t = s_i$ ($X_t \neq s_i$).
Motivated by a mobility index discussed by Shorrocks (1978), a simplified type of $κ$-measure, referred to as the modified κ, was defined by Weiß (2011, 2013):
$\kappa^*(h) = \frac{1}{m} \sum_{i=0}^{m} \frac{p_{ii}(h) - p_i^2}{p_i} \;\in\; \Big[ -\frac{1}{m};\; 1 \Big] \quad \text{for lags } h \in \mathbb{N}.$
Apart from the fact that the lower bound of the range differs from the one in Equation (3) (note that this lower bound is free of distributional parameters), we have the same properties as stated before for $\kappa(h)$. The computation of $\kappa^*(h)$ is simpler than that of $\kappa(h)$ and, in particular, its sample version $\hat\kappa^*(h)$ has a simpler asymptotic normal distribution; see Section 5 for details. Unfortunately, $\kappa^*(h)$ is not defined if even one of the $p_i$ equals 0, whereas $\kappa(h)$ is well defined for any marginal distribution other than a one-point distribution. This issue may happen quite frequently for the sample version $\hat\kappa^*(h)$ if the given time series is short (a possible circumvention is to replace all summands with $\hat p_i = 0$ by 0). For this reason, $\kappa^*(h), \hat\kappa^*(h)$ appear to be of limited practical use for quantifying signed serial dependence. It should be noted that a similar "zero problem" happens with the entropy $\nu_{En}$ in Equation (2), and, actually, we work out a further relation between $\nu_{En}$ and $\hat\kappa^*(h)$ below.
In recent work by Lad et al. (2015), the extropy was introduced as a complementary dual to the entropy. Its normalized version is given by
$\nu_{Ex} = -\frac{1}{m \ln\big(\frac{m+1}{m}\big)} \sum_{i=0}^{m} (1 - p_i) \ln(1 - p_i).$
Here, the zero problem obviously only happens if one of the $p_i$ equals 1 (i.e., in the case of a one-point distribution). Like the Gini index in Equation (1) and the entropy in Equation (2), the extropy takes its minimal (maximal) value 0 (1) for $p = p_{\text{one}}$ ($p = p_{\text{uni}}$), so Equation (5) also constitutes a normalized measure of nominal variation. In Section 2, we analyze its properties in comparison to the Gini index and the entropy. In particular, we focus on the respective sample versions $\hat\nu_{Ex}$, $\hat\nu_G$ and $\hat\nu_{En}$ (see Section 3). To be able to do statistical inference based on $\hat\nu_{Ex}$, $\hat\nu_G$ and $\hat\nu_{En}$, knowledge about their distribution is required. Up to now, only the asymptotic distribution of $\hat\nu_G$ and (in part) of $\hat\nu_{En}$ has been derived; in Section 3, comprehensive results for all considered dispersion measures are provided. These asymptotic distributions are then used as approximations to the true sample distributions of $\hat\nu_{Ex}$, $\hat\nu_G$ and $\hat\nu_{En}$, which is further investigated with simulations and a real application (see Section 4).
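A corresponding sketch for the extropy in Equation (5); the function name and implementation are again our own, not from the article:

```python
import numpy as np

def extropy(p):
    """Normalized extropy, Equation (5); a zero summand occurs only
    for a one-point distribution, where again 0 * ln(0) := 0 is used."""
    p = np.asarray(p, dtype=float)
    m = len(p) - 1
    q = (1.0 - p)[p < 1]  # drop q = 0 terms (one-point case)
    return -np.sum(q * np.log(q)) / (m * np.log((m + 1) / m))

print(extropy([0.25, 0.25, 0.25, 0.25]))  # uniform distribution (nu_Ex = 1)
print(extropy([1.0, 0.0, 0.0, 0.0]))      # one-point distribution (nu_Ex = 0)
```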
The second part of this paper is dedicated to the analysis of serial dependence. As a novel competitor to the measures in Equations (3) and (4), a new type of modified $κ$ is proposed, namely
$\kappa^\star(h) = \sum_{i=0}^{m} \frac{p_{ii}(h) - p_i^2}{1 - p_i} \;\in\; \Big[ -\sum_{i=0}^{m} \frac{p_i^2}{1 - p_i};\; 1 \Big] \quad \text{for lags } h \in \mathbb{N}.$
Again, this constitutes a measure of signed serial dependence, which shares the aforementioned (in)dependence properties with $\kappa(h), \kappa^*(h)$. However, in contrast to $\kappa^*(h)$, the newly proposed $\kappa^\star(h)$ does not have a division-by-zero problem: except for the case of a one-point distribution, $\kappa^\star(h)$ is well defined. In Section 3.2, it turns out that $\kappa^\star(h)$ is related to $\nu_{Ex}$, just as $\kappa(h)$ is related to $\nu_G$ and $\kappa^*(h)$ to $\nu_{En}$. In Section 5, we analyze the sample version of $\kappa^\star(h)$ in comparison to those of $\kappa(h), \kappa^*(h)$, and we derive its asymptotic distribution under the null hypothesis of serial independence. This allows us to test for significant dependence in categorical time series. The performance of this $\hat\kappa^\star$-test, in comparison to tests based on $\hat\kappa, \hat\kappa^*$, is analyzed in Section 6, where two further real applications are also presented. Finally, we conclude in Section 7.
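To make the three $\kappa$-measures concrete, the following illustrative sketch (our own code, assuming states coded as $0,\ldots,m$) computes the sample versions $\hat\kappa(h)$, $\hat\kappa^*(h)$ and $\hat\kappa^\star(h)$ from a time series, replacing summands with $\hat p_i = 0$ by 0 in $\hat\kappa^*(h)$ as suggested above:

```python
import numpy as np

def kappa_measures(x, h, m):
    """Sample versions of kappa(h), kappa*(h) and kappa_star(h)
    (Equations (3), (4), (6)) for a series x with states coded 0..m."""
    x = np.asarray(x)
    n = len(x)
    p = np.bincount(x, minlength=m + 1) / n
    # lagged "equal-equal" bivariate frequencies p_ii(h)
    p_ii = np.array([np.mean((x[h:] == i) & (x[:-h] == i)) for i in range(m + 1)])
    num = p_ii - p ** 2
    kappa = np.sum(num) / (1.0 - np.sum(p ** 2))
    # kappa*: summands with p_i = 0 are replaced by 0 (see the text)
    kappa_mod = np.sum(np.where(p > 0, num / np.where(p > 0, p, 1.0), 0.0)) / m
    kappa_new = np.sum(num / (1.0 - p))  # well defined unless one-point marginal
    return kappa, kappa_mod, kappa_new

rng = np.random.default_rng(1)
x = rng.integers(0, 4, size=1000)  # i.i.d. data, so all measures should be near 0
print(kappa_measures(x, h=1, m=3))
```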

## 2. Extropy, Entropy and Gini Index

As the extropy, entropy and Gini index all serve the same task, it is interesting to know their relations and differences. An important practical issue is the "$0 \ln 0$"-problem mentioned above, which never occurs for the Gini index, occurs only in the case of a (deterministic) one-point distribution for the extropy, and occurs for the entropy as soon as a single $p_i = 0$. Lad et al. (2015) further compared the non-normalized versions of extropy and entropy, and showed that the former is never smaller than the latter. Actually, using the inequality $\ln(1+x) > x/(1+x/2)$ for $x > 0$ from Love (1980), it follows that
$-\sum_{i=0}^{m} (1 - p_i) \ln(1 - p_i) \;\ge\; -\sum_{i=0}^{m} p_i \ln p_i \;\ge\; 1 - s_2(p),$
(see Appendix B.1 for further details).
Things change, however, when considering the normalized versions $\nu_{Ex}$, $\nu_{En}$ and $\nu_G$. For illustration, assume an underlying Lambda distribution $L_m(\lambda)$ with $\lambda \in (0;1)$, defined by the probability vector $p_{m;\lambda} = \big(1-\lambda+\frac{\lambda}{m+1},\, \frac{\lambda}{m+1},\, \ldots,\, \frac{\lambda}{m+1}\big)^\top$ (Kvålseth 2011a). Note that $\lambda \to 0$ leads to a one-point distribution, whereas $\lambda \to 1$ leads to the uniform distribution; actually, $L_m(\lambda)$ can be understood as a mixture of these boundary cases. For $L_m(\lambda)$, the Gini index satisfies $\nu_G = \lambda(2-\lambda)$ for all $m \in \mathbb{N}$ (see Kvålseth 2011a). In addition, the extropy $\nu_{Ex}$ takes rather stable values for varying m (see Figure 1a), whereas the entropy values in Figure 1b change greatly. This complicates the interpretation of the actual level of the normalized entropy.
Finally, the example $m = 10$ plotted in Figure 1c shows that, in contrast to Equation (7), there is no fixed order between the normalized entropy $\nu_{En}$ and the Gini index $\nu_G$. In this and many further numerical experiments, however, it could be observed that the inequalities $\nu_{Ex} \ge \nu_{En}$ and $\nu_{Ex} \ge \nu_G$ hold. These inequalities are formulated as a general conjecture here.
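The closed form $\nu_G = \lambda(2-\lambda)$ and the conjectured ordering can be checked numerically; the following sketch (our own code, with our own function names) does so for a few combinations of m and $\lambda$:

```python
import numpy as np

def lambda_dist(m, lam):
    """PMF vector of the Lambda distribution L_m(lambda) from the text."""
    p = np.full(m + 1, lam / (m + 1))
    p[0] += 1.0 - lam
    return p

def nu_G(p):
    m = len(p) - 1
    return (m + 1) / m * (1.0 - np.sum(p ** 2))

def nu_En(p):
    m = len(p) - 1
    pz = p[p > 0]
    return -np.sum(pz * np.log(pz)) / np.log(m + 1)

def nu_Ex(p):
    m = len(p) - 1
    q = (1.0 - p)[p < 1]
    return -np.sum(q * np.log(q)) / (m * np.log((m + 1) / m))

for m in (2, 5, 10):
    for lam in (0.25, 0.5, 0.75):
        p = lambda_dist(m, lam)
        assert abs(nu_G(p) - lam * (2 - lam)) < 1e-12  # closed form for nu_G
        assert nu_Ex(p) >= nu_En(p) - 1e-12            # conjectured ordering
        assert nu_Ex(p) >= nu_G(p) - 1e-12
print("all checks passed")
```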
From now on, we turn to the sample versions $\hat\nu_{Ex}, \hat\nu_{En}, \hat\nu_G$ of $\nu_{Ex}, \nu_{En}, \nu_G$. These are obtained by replacing the probabilities $p_i, p$ by the respective estimates $\hat p_i, \hat p$, which are computed as relative frequencies from the given sample data $x_1, \ldots, x_n$. As detailed in Section 3, $x_1, \ldots, x_n$ are assumed to be time series data, but we also consider the case of independent and identically distributed (i.i.d.) data.

## 3. Distribution of Sample Dispersion Measures

To be able to derive the asymptotic distribution of statistics computed from $X_1, \ldots, X_n$, Weiß (2013) assumed that the nominal process is $\phi$-mixing with exponentially decreasing weights such that the CLT on p. 200 in Billingsley (1999) is applicable. This condition is satisfied not only in the i.i.d.-case, but also, among others, for the so-called NDARMA models introduced by Jacobs and Lewis (1983) (see Appendix A.1 for details). Then, Weiß (2013) derived the asymptotic distribution of $\sqrt{n}\,(\hat p - p)$, which is the normal distribution $N(0, \Sigma)$ with $\Sigma = (\sigma_{ij})_{i,j=0,\ldots,m}$ given by
$\sigma_{ij} = p_j (\delta_{i,j} - p_i) + \sum_{h=1}^{\infty} \big( p_{ij}(h) + p_{ji}(h) - 2 p_i p_j \big).$
Using this result, the asymptotic properties of $\hat\nu_G$, $\hat\nu_{En}$ and $\hat\nu_{Ex}$ can be derived, as shown in Appendix B.2. The following subsections present and compare these properties in detail.

#### 3.1. Asymptotic Normality

As shown in Appendix B.2, provided that $p \neq p_{\text{one}}, p_{\text{uni}}$, all variation measures $\hat\nu_G$, $\hat\nu_{En}$ and $\hat\nu_{Ex}$ are asymptotically normally distributed. More precisely, $\sqrt{n}\,(\hat\nu_G - \nu_G)$ is asymptotically normally distributed with variance
$\sigma_G^2 = 4 \Big( \frac{m+1}{m} \Big)^2 \sum_{i,j=0}^{m} p_i\, p_j\, \sigma_{ij} \overset{(8)}{=} 4 \Big( \frac{m+1}{m} \Big)^2 \big( s_3(p) - s_2^2(p) \big) \Bigg( 1 + 2 \sum_{h=1}^{\infty} \underbrace{\frac{\sum_{i,j=0}^{m} \big( p_{ij}(h) - p_i p_j \big)\, p_i p_j}{s_3(p) - s_2^2(p)}}_{=:\,\vartheta(h)} \Bigg),$
a result already known from Weiß (2013). Here, $\vartheta(h)$ might be understood as a measure of serial dependence, in analogy to the measures in Equations (3), (4) and (6). In particular, $\vartheta(h) = 0$ in the i.i.d.-case, and $\vartheta(h) = \kappa(h)$ for NDARMA processes (Appendix A.1).
Analogously, $\sqrt{n}\,(\hat\nu_{En} - \nu_{En})$ is asymptotically normally distributed with variance
$\sigma_{En}^2 = \sum_{i,j=0}^{m} \frac{1 + \ln p_i}{\ln(m+1)} \cdot \frac{1 + \ln p_j}{\ln(m+1)}\, \sigma_{ij} \overset{(8)}{=} \frac{1}{\big( \ln(m+1) \big)^2} \Big( \sum_{i=0}^{m} p_i (\ln p_i)^2 - \big( \sum_{i=0}^{m} p_i \ln p_i \big)^2 \Big) \cdot \Bigg( 1 + 2 \sum_{h=1}^{\infty} \underbrace{\frac{\sum_{i,j=0}^{m} (1 + \ln p_i)(1 + \ln p_j) \big( p_{ij}(h) - p_i p_j \big)}{\sum_{i=0}^{m} p_i (\ln p_i)^2 - \big( \sum_{i=0}^{m} p_i \ln p_i \big)^2}}_{=:\,\vartheta^*(h)} \Bigg).$
In the i.i.d.-case, where the last factor becomes 1 (cf. Appendix A.1), this result was given by Blyth (1959), whereas the general expression in Equation (10) can be found in the work of Weiß (2013).
A novel result follows for the extropy, where $\sqrt{n}\,(\hat\nu_{Ex} - \nu_{Ex})$ is asymptotically normally distributed with variance
$\sigma_{Ex}^2 = \sum_{i,j=0}^{m} \frac{1 + \ln(1-p_i)}{m \ln\big(\frac{m+1}{m}\big)} \cdot \frac{1 + \ln(1-p_j)}{m \ln\big(\frac{m+1}{m}\big)}\, \sigma_{ij} \overset{(8)}{=} \frac{1}{\big( m \ln\big(\frac{m+1}{m}\big) \big)^2} \Big( \sum_{i=0}^{m} p_i \big( \ln(1-p_i) \big)^2 - \big( \sum_{i=0}^{m} p_i \ln(1-p_i) \big)^2 \Big) \cdot \Bigg( 1 + 2 \sum_{h=1}^{\infty} \underbrace{\frac{\sum_{i,j=0}^{m} \big(1 + \ln(1-p_i)\big) \big(1 + \ln(1-p_j)\big) \big( p_{ij}(h) - p_i p_j \big)}{\sum_{i=0}^{m} p_i \big( \ln(1-p_i) \big)^2 - \big( \sum_{i=0}^{m} p_i \ln(1-p_i) \big)^2}}_{=:\,\vartheta^\star(h)} \Bigg).$
Again, the last factor becomes 1 in the i.i.d.-case as $\vartheta^\star(h) = 0$, and $\vartheta^\star(h) = \kappa(h)$ for NDARMA processes (Appendix A.1).
In Equations (9)–(11), the notations $\vartheta(h)$, $\vartheta^*(h)$ and $\vartheta^\star(h)$ have been introduced (see the respective expressions covered by the curly bracket) to highlight the similar structure of the asymptotic variances, and to isolate the effect of serial dependence. Actually, one might use $\vartheta(h)$, $\vartheta^*(h)$ and $\vartheta^\star(h)$ as measures of serial dependence in categorical time series, although their definition is probably too complex for practical use. In Section 3.2, when analyzing the bias of $\hat\nu_G$, $\hat\nu_{En}$ and $\hat\nu_{Ex}$, analogous relations to the $\kappa$-measures defined in Section 1 are established.

#### 3.2. Asymptotic Bias

In Appendix B.2, we express the variation measures $\hat\nu_G$, $\hat\nu_{En}$ and $\hat\nu_{Ex}$ as centered quadratic polynomials (at least approximately), and subsequently derive bias formulae. For the sample Gini index, it follows that
$E[\hat\nu_G] \approx \nu_G - \frac{1}{n}\,\frac{m+1}{m} \sum_{i=0}^{m} \sigma_{ii} \overset{(8)}{=} \nu_G \Big( 1 - \frac{1}{n} \big( 1 + 2 \sum_{h=1}^{\infty} \kappa(h) \big) \Big).$
This formula was also derived by Weiß (2013); in the i.i.d.-case, it leads to the exact corrective factor $\big(1 - \frac{1}{n}\big)$. For $\hat\nu_{En}, \hat\nu_{Ex}$, no such bias formulae existed so far. However, from our derivations in Appendix B.2, we newly obtain that
$E[\hat\nu_{En}] \approx \nu_{En} - \frac{1}{2n}\,\frac{1}{\ln(m+1)} \sum_{i=0}^{m} p_i^{-1}\, \sigma_{ii} \overset{(8)}{=} \nu_{En} - \frac{1}{2n}\,\frac{m}{\ln(m+1)} \Big( 1 + 2 \sum_{h=1}^{\infty} \kappa^*(h) \Big).$
In the i.i.d.-case, the last factor reduces to 1, and $\kappa^*(h) = \kappa(h)$ for NDARMA processes (Appendix A.1). Comparing Equations (12) and (13), we see that the effect of serial dependence on the bias is always expressed in terms of a $\kappa$-measure, using the ordinary $\kappa$ (Equation (3)) for the Gini index, and the modified $\kappa$ (Equation (4)) for the entropy. Concerning the extropy, it turns out that the newly proposed $\kappa$-measure from Equation (6) takes this role:
$E[\hat\nu_{Ex}] \approx \nu_{Ex} - \frac{1}{2n}\,\frac{1}{m \ln\big(\frac{m+1}{m}\big)} \sum_{i=0}^{m} (1 - p_i)^{-1}\, \sigma_{ii} \overset{(8)}{=} \nu_{Ex} - \frac{1}{2n}\,\frac{1}{m \ln\big(\frac{m+1}{m}\big)} \Big( 1 + 2 \sum_{h=1}^{\infty} \kappa^\star(h) \Big).$
In the i.i.d.-case, the last factor again reduces to 1, and also $\kappa^\star(h) = \kappa(h)$ holds for NDARMA processes (Appendix A.1). Altogether, Equations (12)–(14) show a common structure regarding the effect of serial dependence. Furthermore, the computed bias corrections imply the relations $\nu_G \leftrightarrow \kappa$, $\nu_{En} \leftrightarrow \kappa^*$ and $\nu_{Ex} \leftrightarrow \kappa^\star$. The sample versions of $\kappa, \kappa^*, \kappa^\star$ are analyzed in Section 5.
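In the i.i.d.-case, Equation (12) gives $E[\hat\nu_G] = \nu_G \big(1 - \frac{1}{n}\big)$ exactly, so multiplying $\hat\nu_G$ by $n/(n-1)$ removes the bias. A small Monte Carlo sketch of this effect (our own code, with an arbitrary example PMF):

```python
import numpy as np

def gini_hat(x, m):
    p_hat = np.bincount(x, minlength=m + 1) / len(x)
    return (m + 1) / m * (1.0 - np.sum(p_hat ** 2))

rng = np.random.default_rng(42)
m, n = 3, 50
p = np.array([0.4, 0.3, 0.2, 0.1])              # an arbitrary example PMF
nu_true = (m + 1) / m * (1.0 - np.sum(p ** 2))  # = 0.9333...
est = [gini_hat(rng.choice(m + 1, size=n, p=p), m) for _ in range(10000)]
print(np.mean(est))                 # noticeably below nu_true (negative bias)
print(np.mean(est) * n / (n - 1))   # bias-corrected, close to nu_true
```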

#### 3.3. Asymptotic Properties for Uniform Distribution

The asymptotic normality established in Section 3.1 certainly does not apply to the deterministic case $p = p_{\text{one}}$, but we also have to exclude the boundary case of a uniform distribution $p_{\text{uni}}$. As shown in Appendix B.2, the asymptotic distribution of $\hat\nu_G$, $\hat\nu_{En}$ and $\hat\nu_{Ex}$ in the uniform case is not a normal distribution but a quadratic-form distribution. All three statistics can be related to Pearson's $\chi^2$-statistic:
$n\,m\,(1 - \hat\nu_G) \;\approx\; 2n \ln(m+1) \cdot (1 - \hat\nu_{En}) \;\approx\; 2n\,m^2 \ln\big(\tfrac{m+1}{m}\big) \cdot (1 - \hat\nu_{Ex}) \;\approx\; n \sum_{j=0}^{m} \frac{\big( \hat p_j - \frac{1}{m+1} \big)^2}{\frac{1}{m+1}}.$
The actual asymptotic distribution can now be derived by applying Theorem 3.1 in Tan (1977) to the asymptotic result in Equation (8), which requires computing the eigenvalues of $\Sigma$. In special cases, however, one is not faced with a general quadratic-form distribution but with a $\chi_m^2$-distribution; this happens for NDARMA processes and certainly in the i.i.d.-case (see Weiß 2013). Then, defining $c = 1 + 2 \sum_{h=1}^{\infty} \kappa(h)$, Equation (15) asymptotically follows c times a $\chi_m^2$-distribution (with $c = 1$ in the i.i.d.-case).
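A minimal sketch of the resulting uniformity test based on $nm(1 - \hat\nu_G)$ for i.i.d. data (our own code; the 95% quantile of the $\chi_3^2$-distribution is hard-coded for $m = 3$ to avoid a scipy dependency):

```python
import numpy as np

def uniformity_test_gini(x, m, crit=7.8147):
    """Reject H0 (uniform marginal) if n*m*(1 - nu_G_hat) exceeds the
    chi^2_m quantile; crit is the 95% quantile of chi^2_3, i.e., for m = 3."""
    n = len(x)
    p_hat = np.bincount(x, minlength=m + 1) / n
    nu_G_hat = (m + 1) / m * (1.0 - np.sum(p_hat ** 2))
    stat = n * m * (1.0 - nu_G_hat)
    return stat, stat > crit

rng = np.random.default_rng(7)
x_uni = rng.integers(0, 4, size=500)                      # H0 true
x_alt = rng.choice(4, size=500, p=[0.4, 0.2, 0.2, 0.2])   # H0 false
print(uniformity_test_gini(x_uni, m=3))
print(uniformity_test_gini(x_alt, m=3))
```

For serially dependent NDARMA data, the statistic would additionally be divided by the estimated factor c before comparing it with the $\chi_m^2$ quantile.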

## 4. Simulations and Applications

Section 4.1 presents some simulation results regarding the quality of the asymptotic approximations for $\hat\nu_G$, $\hat\nu_{En}$ and $\hat\nu_{Ex}$ derived in Section 3. Section 4.2 then applies these measures within a longitudinal study about the most important goals in politics in Germany.

#### 4.1. Finite-Sample Performance of Dispersion Measures

In applications, the normal and $\chi^2$-distributions derived in Section 3 are used as approximations to the true distribution of $\hat\nu_G$, $\hat\nu_{En}$ and $\hat\nu_{Ex}$. Therefore, the finite-sample performance of these approximations had to be analyzed, which was done by simulation (with 10,000 replications per scenario). In the tables provided in Appendix C, the simulated means (Table A1) and standard deviations (Table A2) for $\hat\nu_G$, $\hat\nu_{En}$ and $\hat\nu_{Ex}$ are reported and compared to the respective asymptotic approximations. Then, a common application scenario was considered, the computation of two-sided 95% confidence intervals (CIs) for $\hat\nu_G$, $\hat\nu_{En}$ and $\hat\nu_{Ex}$. Since the true parameter values are not known in practice, one has to plug in estimated parameters into the formulae for mean and variance given in Section 3. The simulated coverage rates reported in Table A3 refer to such plug-in CIs. For these simulations, we either used an i.i.d. data-generating process (DGP) or an NDARMA DGP (see Appendix A.1): a DMA(1) process with $\varphi_1 = 0.25$ or a DAR(1) process with $\phi_1 = 0.40$. These were combined with the marginal distributions ($m = 3$) summarized in Table 1: $p_1$ and $p_2$ were used before by Weiß (2011, 2013), $p_3$ by Kvålseth (2011b), and $p_4$ to $p_6$ are Lambda distributions $L_m(\lambda)$ with $\lambda \in \{0.25, 0.50, 0.75\}$ (see Kvålseth 2011a).
Finally, we used the results in Section 3.3 to test the null hypothesis (level 5%) of the uniform distribution $L_3(1)$ ($\nu_G = \nu_{En} = \nu_{Ex} = 1$) based on $\hat\nu_G$, $\hat\nu_{En}$ and $\hat\nu_{Ex}$. The considered alternatives were $L_m(\lambda)$ with $\lambda \in \{0.98, 0.96, 0.94, 0.92, 0.90\}$, and the DGPs were as presented above. The simulated rejection rates are summarized in Table A4. Note that the required factor $c = 1 + 2 \sum_{h=1}^{\infty} \kappa(h)$ equals $c = (1 + \phi_1)/(1 - \phi_1)$ with $\phi_1 = \kappa(1)$ in the DAR(1) case, and $c = 1 + 2\kappa(1) = 1 + 2\varphi_1(1 - \varphi_1)$ in the DMA(1) case, and was thus easy to estimate from the given time series data.
Let us now investigate the simulation results. Comparing the simulated mean values in Table A1 with the true dispersion values in Table 1, we observed a considerable negative bias for small sample sizes, which became even larger with increasing serial dependence. Fortunately, this bias is explained very well by the asymptotic bias correction in all considered scenarios. With some limitations, this conclusion also applies to the standard deviations reported in Table A2; however, for sample size $n = 100$ and increasing serial dependence, the discrepancy between asymptotic and simulated values increased. As a result, the coverage rates in Table A3 were rather poor for sample size 100; reliable CIs thus generally required a sample size of at least 250. It should also be noted that the Gini index and extropy performed very similarly, and often slightly worse than the entropy. Finally, the rejection rates in Table A4 concerning the tests for uniformity showed similar sizes (columns "$1.00$"; slightly above 0.05 for $n = 100$) but somewhat different power values: best for the extropy and worst for the entropy.

#### 4.2. Application: Goals in Politics

The monitoring of public mood and political attitudes over time is important for decision makers as well as for social scientists. Since 1980, the German General Social Survey ("ALLBUS") has been carried out by the "GESIS—Leibniz Institute for the Social Sciences" every second year (exception: an additional survey in 1991 after the German reunification). In the years up to and including 1990, the survey was conducted only in Western Germany, but in all of Germany from 1991 onwards. In what follows, we consider the cumulative report for 1980–2016 in GESIS—Leibniz Institute for the Social Sciences (2018), and there the question "If you had to choose between these different goals, which one would seem to you personally to be the most important?". The four possible (nominal) answers are
• $s_0$: "To maintain law and order in this country";
• $s_1$: "To give citizens more influence on government decisions";
• $s_2$: "To fight rising prices"; and
• $s_3$: "To protect the right of freedom of speech".
The sample sizes of this longitudinal study varied between 2795 and 3754, and are thus sufficiently large for the asymptotics derived in Section 3. Looking just at the mode as a summary measure (location), there is not much change over time: from 1980 to 2000 and again in 2016, the majority of respondents considered $s_0$ (law and order) to be the most important goal in politics, whereas $s_1$ (influence) was judged most important between 2002 and 2014.
Much more fluctuation is visible when looking at the dispersion measures in Figure 2a. Although the absolute values of the measures differ, the general shapes of the graphs are quite similar. Usually, the dispersion measures take rather large values (≥0.90 in most years), which shows that each of the possible goals is considered most important by a large part of the population. On the other hand, the different goals never have exactly the same popularity. Even in 2008, when all measures give a value very close to 1, the corresponding uniformity tests lead to a clear rejection of the null of a uniform distribution (p-values approximately 0 throughout).
Let us now analyze the development of the importance of the political goals in some more detail, by looking at the extropy for illustration. Figure 2b shows the approximate 95%-CIs over time (a bias correction does not have a visible effect because of the large sample sizes). There are phases where successive CIs overlap, interrupted by breaks in the dispersion behavior. Such breaks happen, e.g., in 1984 (possibly related to the change of government in Germany in 1982/83), in 2002 (perhaps related to 9/11), and in 2008 and 2010 (Lehman bankruptcy and economic crisis). These changes in dispersion go along with reallocations of probability mass, as can be seen from Figure 2c. From the frequency curves of $s_0$ and $s_1$, we can see the above-mentioned change in mode, where the switch back to $s_0$ (law and order) might be caused by the refugee crisis. In addition, the curve for $s_2$ (fight rising prices) aids the explanation, as it shows that $s_2$ was important for respondents at the beginning of the 1980s (when Germany suffered from very high inflation) and in 2008 (economic crisis), but not otherwise. Altogether, the dispersion measures together with their approximate CIs give a very good summary of the distributional changes over time.

## 5. Measures of Signed Serial Dependence

After having discussed the analysis of the marginal properties of a categorical time series, we now turn to the analysis of serial dependence. In Section 1, two known measures of signed serial dependence, Cohen's $\kappa(h)$ in Equation (3) and its modification $\kappa^*(h)$ in Equation (4), were briefly surveyed, and in Section 3.2, we established a connection to $\nu_G$ and $\nu_{En}$, respectively. Motivated by the zero problem of $\kappa^*(h)$, a new type of modified $\kappa$ was proposed in Equation (6), the measure $\kappa^\star(h)$, which turned out to be related to $\nu_{Ex}$.
Replacing the (bivariate) probabilities in Equations (3), (4) and (6) by the respective (bivariate) relative frequencies computed from $x_1, \ldots, x_n$, we end up with sample versions of these dependence measures. Knowledge of their asymptotic distribution is particularly relevant for the i.i.d.-case, because it allows us to test for significant serial dependence in the given time series. As shown by Weiß (2011, 2013), $\hat\kappa(h)$ then has an asymptotic normal distribution, and it holds approximately that
$E\big[\hat\kappa(h)\big] \approx -\frac{1}{n}, \qquad V\big[\hat\kappa(h)\big] \approx \frac{1}{n} \left( 1 - \frac{1 + 2 s_3(p) - 3 s_2(p)}{\big( 1 - s_2(p) \big)^2} \right).$
The sample version of $\kappa^*(h)$ has a simpler asymptotic normal distribution (Weiß 2011, 2013), with
$E\big[\hat\kappa^*(h)\big] \approx -\frac{1}{n}, \qquad V\big[\hat\kappa^*(h)\big] \approx \frac{1}{m\,n},$
but it suffers from the aforementioned zero problem, especially for short time series.
Thus, it remains to derive the asymptotics of the novel $\hat\kappa^\star(h)$ under the null of an i.i.d. sample $X_1, \ldots, X_n$. The starting point is an extension of the limiting result in Equation (8). Under appropriate mixing assumptions (see Section 3), Weiß (2013) derived the joint asymptotic distribution of all univariate and equal-bivariate relative frequencies, i.e., of all $\sqrt{n}\,(\hat p_i - p_i)$ and $\sqrt{n}\,\big(\hat p_{jj}(h) - p_{jj}(h)\big)$, which is the $2(m+1)$-dimensional normal distribution $N\big(0, \Sigma(h)\big)$. The covariance matrix $\Sigma(h)$ consists of four blocks with entries
$\begin{array}{lcl} \sigma_{i,j} &=& p_j (\delta_{i,j} - p_i) + \sum_{k=1}^{\infty} \big( p_{ij}(k) + p_{ji}(k) - 2 p_i p_j \big), \\[1ex] \sigma_{i,m+1+j}(h) &=& 2 (\delta_{i,j} - p_i)\, p_{jj}(h) + \sum_{k=1}^{\infty} \big( p_{ijj}(k,h) - p_i\, p_{jj}(h) \big) + \sum_{k=h+1}^{\infty} \big( p_{jji}(h,k-h) - p_i\, p_{jj}(h) \big) \\ && {} + \sum_{k=1}^{h-1} \big( p_{jij}(k,h-k) - p_i\, p_{jj}(h) \big) \;=\; \sigma_{m+1+j,i}(h), \\[1ex] \sigma_{m+1+i,m+1+j}(h) &=& \big( \delta_{i,j} - p_{jj}(h) \big)\, p_{ii}(h) + 2 \big( \delta_{i,j}\, p_{ijj}(h,h) - p_{ii}(h)\, p_{jj}(h) \big) \\ && {} + \sum_{k=1}^{h-1} \big( p_{jiji}(k,h-k,k) + p_{ijij}(k,h-k,k) - 2 p_{ii}(h)\, p_{jj}(h) \big) \\ && {} + \sum_{k=h+1}^{\infty} \big( p_{jjii}(h,k-h,h) + p_{iijj}(h,k-h,h) - 2 p_{ii}(h)\, p_{jj}(h) \big), \end{array}$
where always $i , j = 0 , … , m$, and where
$p_{abc}(k,l) = P(X_t = s_a,\, X_{t-k} = s_b,\, X_{t-k-l} = s_c), \qquad p_{abcd}(k,l,m) = P(X_t = s_a,\, X_{t-k} = s_b,\, X_{t-k-l} = s_c,\, X_{t-k-l-m} = s_d).$
This rather complex general result simplifies greatly in special cases such as an NDARMA DGP (Weiß 2013) and, in particular, for an i.i.d. DGP:
$\sigma_{i,j} = p_j (\delta_{i,j} - p_i), \qquad \sigma_{i,m+1+j} = 2 (\delta_{i,j} - p_i)\, p_j^2 = \sigma_{m+1+j,i}, \qquad \sigma_{m+1+i,m+1+j} = \delta_{i,j}\, p_i^2 (1 + 2 p_i) - 3 p_i^2 p_j^2.$
Now, the asymptotic properties of $\hat\kappa^\star(h)$ can be derived, as done in Appendix B.3. $\sqrt{n}\,\big(\hat\kappa^\star(h) - \kappa^\star(h)\big)$ is asymptotically normally distributed, and mean and variance can be approximated by plugging Equation (18) into
$E\big[\hat\kappa^\star(h)\big] \approx \kappa^\star(h) + \frac{1}{n} \sum_{j=0}^{m} \frac{(1 - p_j)\, \sigma_{j,m+1+j}(h) - \big( 1 - p_{jj}(h) \big)\, \sigma_{jj}}{(1 - p_j)^3},$
$V\big[\hat\kappa^\star(h)\big] \approx \frac{1}{n} \Bigg( \sum_{i,j=0}^{m} \frac{p_{ii}(h) - 2 p_i + p_i^2}{(1 - p_i)^2} \cdot \frac{p_{jj}(h) - 2 p_j + p_j^2}{(1 - p_j)^2}\, \sigma_{i,j} + \sum_{i,j=0}^{m} \frac{\sigma_{m+1+i,m+1+j}(h)}{(1 - p_i)(1 - p_j)} + 2 \sum_{i,j=0}^{m} \frac{p_{ii}(h) - 2 p_i + p_i^2}{(1 - p_i)^2} \cdot \frac{\sigma_{i,m+1+j}(h)}{1 - p_j} \Bigg).$
In the i.i.d.-case, we simply have
$E\big[\hat\kappa^\star(h)\big] \approx -\frac{1}{n}, \qquad V\big[\hat\kappa^\star(h)\big] \approx \frac{1}{n} \Bigg( s_2(p) - \sum_{i=0}^{m} \Big( \frac{p_i^2}{1 - p_i} \Big)^2 + \Big( \sum_{i=0}^{m} \frac{p_i^2}{1 - p_i} \Big)^2 \Bigg).$
Comparing Equations (16), (17) and (21), we see that all three measures have the same asymptotic bias $-1/n$, but their asymptotic variances generally differ. An exception is the case of a uniform distribution, where also the asymptotic variances coincide (see Appendix B.3).
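The i.i.d. asymptotics in Equation (21) can be turned into a test for serial independence; the sketch below (our own code, with a simple DAR(1)-like generator for illustration) applies such a $\hat\kappa^\star$-based test:

```python
import numpy as np

def kappa_star_test(x, h, m, z_crit=1.96):
    """Two-sided 5% test of serial independence at lag h, based on the new
    kappa_star(h) and its i.i.d. asymptotics (mean -1/n, variance as in Eq. (21))."""
    x = np.asarray(x)
    n = len(x)
    p = np.bincount(x, minlength=m + 1) / n
    p_ii = np.array([np.mean((x[h:] == i) & (x[:-h] == i)) for i in range(m + 1)])
    kappa_new = np.sum((p_ii - p ** 2) / (1.0 - p))
    r = p ** 2 / (1.0 - p)
    var = (np.sum(p ** 2) - np.sum(r ** 2) + np.sum(r) ** 2) / n
    z = (kappa_new + 1.0 / n) / np.sqrt(var)
    return kappa_new, abs(z) > z_crit

rng = np.random.default_rng(3)
x_iid = rng.integers(0, 4, size=800)
# DAR(1)-like generator: repeat the previous state with probability 0.4
x_dep = [int(rng.integers(0, 4))]
for _ in range(799):
    x_dep.append(x_dep[-1] if rng.random() < 0.4 else int(rng.integers(0, 4)))
x_dep = np.array(x_dep)
print(kappa_star_test(x_iid, h=1, m=3))  # should usually not reject
print(kappa_star_test(x_dep, h=1, m=3))  # strong positive dependence: reject
```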

## 6. Simulations and Applications

Section 6.1 presents some simulation results, investigating the quality of the asymptotic approximations for $\hat\kappa(h), \hat\kappa^*(h), \hat\kappa^\star(h)$ according to Section 5, as well as the power when testing against different types of serial dependence. Two real-data examples are discussed in Section 6.2 and Section 6.3: first an ordinal time series with rather strong positive dependence, then a nominal time series exhibiting negative dependence.

#### 6.1. Finite-Sample Performance of Serial Dependence Measures

In analogy to Section 4.1, we compared the finite-sample performance of the normal approximations in Equations (16), (17) and (21) via simulations. For the power scenarios, we not only included NDARMA models but also the NegMarkov model described in Appendix A.2. These models were combined with the marginal distributions $p_1$ to $p_6$ in Table 1 plus $p_7 = p_{\text{uni}}$; for the NegMarkov model, it was not possible to have the marginals $p_2$ and $p_4$ with very low dispersion. The full simulation results are available from the author upon request; excerpts are shown in the sequel to illustrate the main findings. As before, all tables are collected in Appendix C.
First, we discuss the distributional properties of $\hat\kappa(h), \hat\kappa^*(h), \hat\kappa^\star(h)$ under the null of an i.i.d. DGP. The common mean approximation $-1/n$ worked very well without exception. The quality of the approximations of the standard deviations is investigated in Table A5. Generally, the actual marginal dispersion had a great influence. For large dispersion (e.g., $p_6, p_7$ in Table A5), we had very good agreement between asymptotic and simulated standard deviations, with deviations typically not larger than $\pm 0.001$. For low dispersion (e.g., $p_5$ in Table A5), we found some discrepancy for $n = 100$ and increasing h: the asymptotic approximation resulted in lower values for $\hat\kappa(h), \hat\kappa^\star(h)$ (discrepancy up to 0.004), and in larger values for $\hat\kappa^*(h)$ (discrepancy up to 0.002). Consequently, when testing for serial independence, we expect the size for $\hat\kappa^*(h)$ to fall below the nominal 5%-level, and the sizes for $\hat\kappa(h), \hat\kappa^\star(h)$ to exceed it. This was roughly confirmed by the results in Table A6 (and by further simulations), with the smallest size values for $\hat\kappa^*(h)$ (smaller by up to 0.01) and the largest for $\hat\kappa^\star(h)$. The sizes of $\hat\kappa(h)$ tended to be smaller than 0.05 for $h = 1$ (discrepancies $\le 0.003$), while those of $\hat\kappa^\star(h)$ were always rather close to 5%.
A more complex picture was observed with regard to the power of $κ ^ ( h ) , κ ^ * ( h ) , κ ^ ⋆ ( h )$ (see Table A6). For positive dependence (DMA(1) and DAR(1)), $κ ^ ( h ) , κ ^ ⋆ ( h )$ performed best if being close to a marginal uniform distribution, whereas $κ ^ * ( h )$ had superior power for lower marginal dispersion levels (and $κ ^ ( h )$ performed second-best). For negative dependence (NegMarkov), in contrast, $κ ^ ⋆ ( h )$ was the optimal choice, and $κ ^ * ( h )$ might have a rather poor power, especially for low dispersion. Thus, while $κ ^ ( h ) , κ ^ ⋆ ( h )$ both showed a more-or-less balanced performance with respect to positive and negative dependence, we had a sharp contrast for $κ ^ * ( h )$.

#### 6.2. Application: Fear Index

The Volatility Index (VIX) serves as a benchmark for U.S. stock market volatility, and increasing VIX values are interpreted as indications of greater fear in the market (Hancock 2012). Hancock (2012) distinguished between the $m + 1 = 13$ ordinal fear states given in Table 2. From the historical closing rates of the VIX offered by the website https://finance.yahoo.com/, a time series of daily fear states was computed for the $n = 4287$ trading days in the period 1990–2006 (before the beginning of the financial crisis). The obtained time series is plotted in Figure 3.
As shown in the plots in the top panel of Figure 3, the states $s 9$–$s 12$ were never observed during 1990–2006, so zero frequencies affect the computation of $ν ^ En$ and $κ ^ * ( h )$. The marginal distribution itself deviates visibly from a uniform distribution, thus it is reasonable that the dispersion measures $ν ^ G$, $ν ^ En$ and $ν ^ Ex$ are clearly below 1. Actually, the PMF concentrates mainly on the low to moderate fear states; high anxiety (or more) occurred on only a few of the trading days. Even more important is the development of these fear states over time. While negative serial dependence would indicate a permanent fluctuation between the states, positive dependence would imply some kind of inertia regarding the respective states. The serial dependence structure was analyzed as shown in the bottom panel of Figure 3, where the critical values for level 5% (dashed lines) were computed according to the asymptotics in Section 5. All measures indicated significantly positive dependence, thus the U.S. stock market has a tendency to stay in a state once attained. However, $κ ^ * ( h )$ (Figure 3, center) produced notably smaller values than $κ ^ ( h ) , κ ^ ⋆ ( h )$ (Figure 3, left and right). Considering that the time series plot with its long runs of single states suggests rather strong positive dependence, the values produced by $κ ^ * ( h )$ did not appear very plausible. Thus, $κ ^ ( h ) , κ ^ ⋆ ( h )$, which resulted in very similar values, appeared to be better interpretable in the given example. Note that the discrepancy between $κ ^ * ( h )$ and $κ ^ ( h ) , κ ^ ⋆ ( h )$ went along with the discrepancy between $ν ^ En$ and $ν ^ G , ν ^ Ex$.
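For a series of fear states like this one, the dispersion measures can be computed directly from the estimated PMF. The following sketch (a minimal Python illustration, not code from the article; it assumes the standard normalized forms of Gini index, entropy and extropy, which equal 0 for a one-point distribution and 1 for the uniform distribution, consistent with the boundary results in Appendix B.2) shows the computation:

```python
import numpy as np

def dispersion_measures(p):
    """Normalized dispersion of a categorical PMF p over m+1 states:
    Gini index, entropy and extropy. All three equal 0 for a one-point
    distribution and 1 for the uniform distribution."""
    p = np.asarray(p, dtype=float)
    m = len(p) - 1
    nu_G = (m + 1) / m * (1.0 - np.sum(p**2))
    # convention 0*ln(0) = 0: restrict sums to the relevant entries
    pos = p[p > 0]
    nu_En = -np.sum(pos * np.log(pos)) / np.log(m + 1)
    q = p[p < 1]
    nu_Ex = -np.sum((1 - q) * np.log(1 - q)) / (m * np.log((m + 1) / m))
    return nu_G, nu_En, nu_Ex
```

Applied to the estimated PMF of the fear states, these functions reproduce values clearly below 1, in line with the non-uniform marginal distribution discussed above.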

#### 6.3. Application: Wood Pewee

Animal behavior studies are an integral part of research in biology and psychology. The time series example studied in the present section dates back to Wallace Craig, one of the pioneers of ethology. The data were originally presented by Craig (1943) and further analyzed (among others) in Chapter 6 of the work by Weiß (2018). They constitute a nominal time series of length $n = 1327$, where the three states $s 0 , s 1 , s 2$ express the different phrases in the morning twilight song of the Wood Pewee (“pee-ah-wee”, “pee-oh” and “ah-di-dee”, respectively). Since the range of a nominal time series lacks a natural order, a time series plot is not possible. Thus, Figure 4 shows a rate evolution graph as a substitute, where the cumulative frequencies of the individual states are plotted against time t. From the roughly linear increase, we infer a stable behavior of the time series (Weiß 2018).
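A rate evolution graph only requires the cumulative frequencies of each state; these can be computed as in the following sketch (an illustrative helper of ours, not from the article):

```python
import numpy as np

def rate_evolution(x, states=None):
    """Cumulative absolute frequencies N_j(t) = #{s <= t : X_s = s_j},
    one row per state: the curves shown in a rate evolution graph.
    A roughly linear increase indicates a stable marginal distribution."""
    x = np.asarray(x)
    if states is None:
        states = np.unique(x)
    return np.stack([np.cumsum(x == s) for s in states])
```

Plotting each row of the returned array against t yields the graph in Figure 4.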
The dispersion measures $ν ^ G$, $ν ^ En$ and $ν ^ Ex$ all led to values between 0.90 and 0.95, indicating that all three phrases are used frequently (but not equally often) within the morning twilight song of the Wood Pewee. This is confirmed by the PMF plot in Figure 4, where we found some preference for the phrase $s 0$. The serial dependence plots in the bottom panel of Figure 4 reveal a quite complex serial dependence structure, with both positive and negative dependence values and with a periodic pattern. Positive values occur at even lags h (with particularly large values at multiples of 4), and negative values at odd lags. The positive values indicate a tendency to repeat a phrase, and such repetitions seem to be particularly likely after every fourth phrase. Negative values, in contrast, indicate a change of phrase; e.g., it will rarely happen that the same phrase is presented twice in a row. While $κ ^ ( h ) , κ ^ ⋆ ( h )$ gave a very clear (and similar) picture of the rhythmic structure, it was again $κ ^ * ( h )$ that produced some implausible values, e.g., the non-significant value at lag 2. Thus, both data examples indicate that $κ ^ * ( h )$ should be used with caution in practice (also because of the zero problem). A decision between $κ ^ ( h )$ and $κ ^ ⋆ ( h )$ is more difficult: $κ ^ ( h )$ is well established and slightly advantageous for uncovering positive dependencies, whereas $κ ^ ⋆ ( h )$ is computationally simpler and shows a very good performance regarding negative dependencies.
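The sample versions of Cohen's $κ ( h )$ and of the modified measure $κ ^ ⋆ ( h )$ can be estimated from an integer-coded series as in the following sketch (ours, not from the article; it assumes the forms $κ ( h ) = ( ∑ j p j j ( h ) - s 2 ( p ) ) / ( 1 - s 2 ( p ) )$ and $κ ⋆ ( h ) = ∑ j ( p j j ( h ) - p j 2 ) / ( 1 - p j )$, the latter matching the function $f$ in Appendix B.3; the entropy-based $κ ^ * ( h )$ is omitted because its exact form is not restated in this excerpt):

```python
import numpy as np

def kappa_measures(x, h):
    """Empirical Cohen's kappa(h) and the modified (extropy-related)
    kappa-star(h) for an integer-coded categorical time series x, lag h >= 1."""
    x = np.asarray(x)
    n = len(x)
    states = np.unique(x)
    p = np.array([np.mean(x == s) for s in states])
    # estimated lagged agreement probabilities p_jj(h) = P(X_t = s_j, X_{t-h} = s_j)
    pjj = np.array([np.mean((x[h:] == s) & (x[:n - h] == s)) for s in states])
    s2 = np.sum(p ** 2)
    cohen = (np.sum(pjj) - s2) / (1.0 - s2)
    modified = np.sum((pjj - p ** 2) / (1.0 - p))
    return cohen, modified
```

For a perfectly periodic series such as a repeated three-phrase pattern, both measures equal 1 at lag 3 and are negative at lag 1, mirroring the alternation of signs seen in Figure 4.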

## 7. Conclusions

This work discusses approaches for measuring dispersion and serial dependence in categorical time series. Asymptotic properties of the novel extropy measure for categorical dispersion are derived and compared to those of the Gini index and entropy. Simulations showed that all three measures performed quite well, with slightly better coverage rates for the entropy but computational advantages for the Gini index and extropy. The extropy was most reliable when testing the null hypothesis of a uniform distribution. The application and interpretation of these measures was illustrated with a longitudinal study about the most important political goals in Germany.
The analysis of the asymptotic bias of Gini index, entropy and extropy uncovered a relation between these three measures and three types of $κ$-measures for signed serial dependence. While two of these measures, namely $κ ( h )$ and $κ * ( h )$, have already been discussed in the literature, the “$κ$-counterpart” to the extropy turned out to be a new type of modified Cohen’s $κ$, denoted by $κ ⋆ ( h )$. The asymptotics of $κ ^ ⋆ ( h )$ were investigated and utilized for testing for serial dependence. A simulation study as well as two real-data examples (time series of fear states and the song of the Wood Pewee) showed that $κ ^ * ( h )$ has several drawbacks, while both $κ ^ ( h )$ and $κ ^ ⋆ ( h )$ work very well in practice. The advantages of $κ ^ ⋆ ( h )$ are computational simplicity and a superior performance regarding negative dependencies.

## Funding

This research received no external funding.

## Acknowledgments

The author thanks the editors and the three referees for their useful comments on an earlier draft of this article.

## Conflicts of Interest

The author declares no conflict of interest.

## Appendix A. Some Models for Categorical Processes

This appendix provides a brief summary of those models for categorical processes that are used for simulations and illustrative computations in this article. More background on these and further models for categorical processes can be found in the book by Weiß (2018).

#### Appendix A.1. NDARMA Models for Categorical Processes

The NDARMA model (“new” discrete autoregressive moving-average model) was proposed by Jacobs and Lewis (1983), and its definition might be given as follows (Weiß and Göb 2008):
Let $( X t ) Z$ and $( ϵ t ) Z$ be categorical processes with state space $S$, where $( ϵ t ) Z$ is i.i.d. with marginal distribution p, and where $ϵ t$ is independent of $( X s ) s < t$. Let
$( α t , 1 , … , α t , p , β t , 0 , … , β t , q ) ∼ MULT ( 1 ; ϕ 1 , … , ϕ p , φ 0 , … , φ q )$
be i.i.d. multinomial random vectors, which are independent of $( ϵ t ) Z$ and of $( X s ) s < t$. Then, $( X t ) Z$ is said to be an NDARMA(p, q) process (and the cases $q = 0$ and $p = 0$ are referred to as a DAR(p) process and DMA(q) process, respectively) if it follows the recursion
$X t = α t , 1 · X t - 1 + … + α t , p · X t - p + β t , 0 · ϵ t + … + β t , q · ϵ t - q .$
(Here, if the state space $S$ is not numerically coded, we assume $0 · s = 0$, $1 · s = s$ and $s + 0 = s$ for each $s ∈ S$.)
NDARMA processes have several attractive properties, e.g., $X t$ and $ϵ t$ share the same stationary marginal distribution: $P ( X t = s i ) = p i = P ( ϵ t = s i )$ for $i = 0 , … , m$. Their serial dependence structure is characterized by a set of Yule–Walker-type equations for the serial dependence measure Cohen’s $κ$ from Equation (3) (Weiß and Göb 2008):
$\kappa(h) \;=\; \sum_{j=1}^{p} \phi_j\, \kappa(|h-j|) \;+\; \sum_{i=0}^{q-h} \varphi_{i+h}\, r(i) \qquad \text{for } h \geq 1 ,$
where the $r ( i )$ satisfy $r(i) = \sum_{j=\max\{0,\, i-p\}}^{i-1} \phi_{i-j}\, r(j) + \varphi_i\, \mathbb{1}(0 \leq i \leq q)$. It should be noted that NDARMA processes satisfy $κ ( h ) ≥ 0$, i.e., they can only handle positive serial dependence. Another important property is that the bivariate distributions at lag h are given by $p_{i|j}(h) = p_i + \kappa(h)\,(\delta_{i,j} - p_i)$. This implies that $p_{ij}(h) - p_i\, p_j = \kappa(h)\,(\delta_{i,j} - p_i)\, p_j$ and, as a consequence, that all of the serial dependence measures mentioned in this work coincide for NDARMA processes: $ϑ ( h ) = ϑ * ( h ) = ϑ ⋆ ( h ) = κ ( h ) = κ * ( h ) = κ ⋆ ( h )$.
Finally, Weiß (2013) showed that an NDARMA process is $ϕ$-mixing with exponentially decreasing weights such that the CLT on p. 200 in Billingsley (1999) is applicable.
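For illustration, consider the DAR(1) special case of the above recursion ($p = 1$, $q = 0$): $X t$ equals $X t - 1$ with probability $ϕ 1$ and equals the fresh innovation $ϵ t$ otherwise, so that the Yule–Walker equations give $κ ( h ) = ϕ 1 h$. A simulation sketch (ours, not code from the article) might look as follows:

```python
import numpy as np

def simulate_dar1(n, p, phi1, rng=None):
    """Simulate a DAR(1) process: X_t = X_{t-1} with probability phi1,
    and X_t = eps_t with eps_t i.i.d. ~ p otherwise."""
    rng = np.random.default_rng(rng)
    p = np.asarray(p, dtype=float)
    x = np.empty(n, dtype=int)
    x[0] = rng.choice(len(p), p=p)
    copy = rng.random(n) < phi1          # copy decision alpha_{t,1}
    eps = rng.choice(len(p), size=n, p=p)  # i.i.d. innovations
    for t in range(1, n):
        x[t] = x[t - 1] if copy[t] else eps[t]
    return x
```

Since $X t$ inherits the marginal distribution of $ϵ t$, the simulated series has marginal p and Cohen's $κ ( h ) = ϕ 1 h$, which makes DAR(1) a convenient power scenario for positive dependence.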

#### Appendix A.2. Markov Chains for Categorical Processes

A discrete-valued Markov process $( X t ) Z$ is characterized by a “memory of length $p ∈ N$”, in the sense that
$P ( X t = x t | X t - 1 = x t - 1 , … ) = P ( X t = x t | X t - 1 = x t - 1 , … , X t - p = x t - p )$
has to hold for all $x i ∈ S$. In the case $p = 1$, $( X t ) Z$ is commonly called a Markov chain. If the transition probabilities $P ( X t = i | X t - 1 = j )$ of the Markov chain do not change with time t, i.e., if $P ( X t = i | X t - 1 = j ) = p i | j$ for all $t ∈ N$, it is said to be homogeneous (analogously for higher-order Markov processes). An example of a parsimoniously parametrized homogeneous Markov chain (Markov process) is the DAR(1) process (DAR(p) process) according to Appendix A.1, which always exhibits positive serial dependence.
A parsimoniously parametrized Markov model with negative serial dependence was proposed by Weiß (2011), the “Negative Markov model” (NegMarkov model). For a given probability vector $π ∈ ( 0 ; 1 ) m + 1$ and some $α ∈ ( 0 ; 1 ]$, its transition probabilities are defined by
$p_{i|j} \;=\; \begin{cases} \alpha\,\pi_j & \text{if } i = j, \\ \beta_j\,\pi_i & \text{if } i \neq j, \end{cases} \qquad \text{where } \beta_j \;=\; \frac{1 - \alpha\,\pi_j}{1 - \pi_j} \;\geq\; 1 .$
The resulting ergodic Markov chain has the stationary marginal distribution
$p_j \;=\; \frac{\pi_j / \beta_j}{\sum_{i=0}^{m} \pi_i / \beta_i} \qquad \text{for } j = 0, \ldots, m .$
As an example, if $π = p uni$, then the $β j$ become $1 + ( 1 - α ) / m$ such that p is also a uniform distribution. However, the conditional distribution given $X t - 1 = j$ is not uniform:
$p_{i|j} \;=\; \begin{cases} \dfrac{\alpha}{m+1} & \text{if } i = j, \\[1ex] \Bigl( 1 + \dfrac{1-\alpha}{m} \Bigr) \dfrac{1}{m+1} & \text{if } i \neq j . \end{cases}$
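The NegMarkov construction is easy to set up numerically; the following sketch (an illustrative implementation of ours, not code from the article) builds the transition matrix and the stationary distribution and can be used to check the stationarity claim:

```python
import numpy as np

def negmarkov_transition(pi, alpha):
    """Transition matrix P[i, j] = P(X_t = s_i | X_{t-1} = s_j) of the
    NegMarkov model: alpha*pi_j on the diagonal, beta_j*pi_i off it."""
    pi = np.asarray(pi, dtype=float)
    beta = (1.0 - alpha * pi) / (1.0 - pi)
    P = np.outer(pi, beta)           # off-diagonal entries beta_j * pi_i
    np.fill_diagonal(P, alpha * pi)  # diagonal entries alpha * pi_j
    return P

def negmarkov_stationary(pi, alpha):
    """Stationary marginal p_j proportional to pi_j / beta_j."""
    pi = np.asarray(pi, dtype=float)
    beta = (1.0 - alpha * pi) / (1.0 - pi)
    w = pi / beta
    return w / w.sum()
```

One verifies numerically that each column of the transition matrix sums to 1 and that the stated marginal is indeed stationary, i.e., $P p = p$.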

## Appendix B. Proofs

#### Appendix B.1. Derivation of the inequality in Equation (7)

The first inequality in Equation (7) was shown by Lad et al. (2015). Using the inequality $\ln(1+x) > x/(1+x/2)$ for $x > 0$ from Love (1980), it follows for $p i ∈ ( 0 ; 1 )$ that
$-\ln(1-p_i) \;=\; \ln\Bigl(1 + \frac{p_i}{1-p_i}\Bigr) \;>\; \frac{2\,p_i}{2-p_i}, \qquad -\ln p_i \;=\; \ln\Bigl(1 + \frac{1-p_i}{p_i}\Bigr) \;>\; \frac{2\,(1-p_i)}{1+p_i} .$
Consequently, we have
$-\sum_{i=0}^{m} (1-p_i)\,\ln(1-p_i) \;\geq\; \sum_{i=0}^{m} p_i\,(1-p_i)\,\frac{2}{2-p_i} \;\geq\; 1 - s_2(p) ,$
as well as
$-\sum_{i=0}^{m} p_i\,\ln p_i \;\geq\; \sum_{i=0}^{m} p_i\,(1-p_i)\,\frac{2}{1+p_i} \;\geq\; 1 - s_2(p) .$

#### Appendix B.2. Derivations for Sample Dispersion Measures

For studying the asymptotic properties of the sample Gini index according to Equation (1), it is important to know that $ν ^ G$ can be rewritten exactly as the centered quadratic polynomial
$\hat{\nu}_G \;=\; \nu_G \;-\; 2\,\frac{m+1}{m} \sum_{i=0}^{m} p_i\,(\hat{p}_i - p_i) \;-\; \frac{m+1}{m} \sum_{i=0}^{m} (\hat{p}_i - p_i)^2 . \tag{A3}$
Since $E [ p ^ ] = p$ holds exactly, this representation immediately implies an exact way of bias computation, $n\,\bigl(E[\hat{\nu}_G] - \nu_G\bigr) = -\frac{m+1}{m} \sum_{i=0}^{m} V\bigl(\sqrt{n}\,(\hat{p}_i - p_i)\bigr)$, where $V\bigl(\sqrt{n}\,(\hat{p}_i - p_i)\bigr)$ is approximately given by $σ i i$ according to Equation (8). Furthermore, provided that $p ≠ p uni$, the resulting linear approximation together with the asymptotic normality of $\sqrt{n}\,(\hat{p} - p)$ can be used to derive the asymptotic result in Equation (9) (Delta method). Here, $p = p uni$ has to be excluded, because then the linear term in Equation (A3) vanishes. Hence, in the boundary case $p = p uni$, we end up with an asymptotic quadratic-form distribution instead of a normal one. Actually, it is easily seen that $n\,m\,(1 - \hat{\nu}_G)$ then coincides with Pearson’s $χ 2$-statistic with respect to $p uni$; see Section 4 in Weiß (2013) for the asymptotics.
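For i.i.d. data, where $V ( p ^ i ) = p i ( 1 - p i ) / n$ holds exactly, the bias formula above collapses to the closed form $E [ ν ^ G ] = ( 1 - 1 / n ) ν G$. This identity (a direct consequence of the quadratic representation; the enumeration helper below is ours, not from the article) can be checked by exact enumeration in the binary case $m = 1$:

```python
from fractions import Fraction
from math import comb

def exact_mean_gini_binary(n, p0):
    """Exact E[nu_hat_G] for i.i.d. binary data (m = 1), obtained by
    enumerating the Binomial(n, p0) distribution of the count of s_0."""
    p0 = Fraction(p0)
    mean = Fraction(0)
    for k in range(n + 1):
        prob = Fraction(comb(n, k)) * p0**k * (1 - p0)**(n - k)
        ph0 = Fraction(k, n)  # estimated probability of state s_0
        mean += prob * 2 * (1 - ph0**2 - (1 - ph0)**2)
    return mean
```

With exact rational arithmetic, the enumeration reproduces $( 1 - 1 / n ) ν G$ without any rounding error, confirming the exactness of the bias computation in the i.i.d. case.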
For the entropy in Equation (2), an exact polynomial representation such as Equation (A3) does not exist, so we use a Taylor approximation instead:
$\hat{\nu}_{En} \;\approx\; \nu_{En} \;-\; \sum_{j=0}^{m} \frac{1 + \ln p_j}{\ln(m+1)}\,(\hat{p}_j - p_j) \;-\; \frac{1}{2} \sum_{j=0}^{m} \frac{p_j^{-1}}{\ln(m+1)}\,(\hat{p}_j - p_j)^2 . \tag{A4}$
However, then one can proceed as before. Thus, an approximate bias formula follows from
$n\,\bigl(E[\hat{\nu}_{En}] - \nu_{En}\bigr) \;\approx\; -\frac{1}{2} \sum_{j=0}^{m} \frac{p_j^{-1}}{\ln(m+1)}\, V\bigl(\sqrt{n}\,(\hat{p}_j - p_j)\bigr) \;\approx\; -\frac{1}{2} \sum_{j=0}^{m} \frac{p_j^{-1}}{\ln(m+1)}\, \sigma_{jj} .$
For $p ≠ p uni$, we can use the linear approximation implied by Equation (A4) to conclude the asymptotic normality of $\sqrt{n}\,(\hat{\nu}_{En} - \nu_{En})$ with variance
$\sigma_{En}^2 \;=\; \sum_{i,j=0}^{m} \frac{1 + \ln p_i}{\ln(m+1)} \cdot \frac{1 + \ln p_j}{\ln(m+1)}\; \sigma_{ij} ;$
also see the results in (Blyth 1959; Weiß 2013). Note that, in the i.i.d. case, where $σ i j = p j ( δ i , j - p i )$, one computes
$\begin{aligned} \sum_{i,j=0}^{m} (1 + \ln p_i)(1 + \ln p_j)\, p_j\,(\delta_{i,j} - p_i) \;&=\; \sum_{i=0}^{m} (1 + \ln p_i)^2\, p_i \;-\; \Bigl( \sum_{i=0}^{m} (1 + \ln p_i)\, p_i \Bigr)^{2} \\ &=\; 1 + 2 \sum_{i=0}^{m} p_i \ln p_i + \sum_{i=0}^{m} p_i\,(\ln p_i)^2 \;-\; \Bigl( 1 + \sum_{i=0}^{m} p_i \ln p_i \Bigr)^{2} \\ &=\; \sum_{i=0}^{m} p_i\,(\ln p_i)^2 \;-\; \Bigl( \sum_{i=0}^{m} p_i \ln p_i \Bigr)^{2} . \end{aligned}$
Finally, in the boundary case $p = p uni$, the linear term again vanishes such that
$2\,n\,\ln(m+1) \cdot (1 - \hat{\nu}_{En}) \;\approx\; n \sum_{j=0}^{m} \frac{\bigl(\hat{p}_j - \frac{1}{m+1}\bigr)^2}{\frac{1}{m+1}}$
equals Pearson’s $χ 2$-statistic.
Finally, we carry out analogous derivations for the extropy in Equation (5). Starting with the Taylor approximation
$\hat{\nu}_{Ex} \;\approx\; \nu_{Ex} \;+\; \sum_{j=0}^{m} \frac{1 + \ln(1-p_j)}{m\,\ln\bigl(\frac{m+1}{m}\bigr)}\,(\hat{p}_j - p_j) \;-\; \frac{1}{2} \sum_{j=0}^{m} \frac{(1-p_j)^{-1}}{m\,\ln\bigl(\frac{m+1}{m}\bigr)}\,(\hat{p}_j - p_j)^2 , \tag{A5}$
it follows that
$n\,\bigl(E[\hat{\nu}_{Ex}] - \nu_{Ex}\bigr) \;\approx\; -\frac{1}{2} \sum_{j=0}^{m} \frac{(1-p_j)^{-1}}{m\,\ln\bigl(\frac{m+1}{m}\bigr)}\, \sigma_{jj} .$
For $p ≠ p uni$, we can use the linear approximation implied by Equation (A5) to conclude the asymptotic normality of $\sqrt{n}\,(\hat{\nu}_{Ex} - \nu_{Ex})$ with variance
$\sigma_{Ex}^2 \;=\; \sum_{i,j=0}^{m} \frac{1 + \ln(1-p_i)}{m\,\ln\bigl(\frac{m+1}{m}\bigr)} \cdot \frac{1 + \ln(1-p_j)}{m\,\ln\bigl(\frac{m+1}{m}\bigr)}\; \sigma_{ij} .$
Note that, in the i.i.d. case, where $σ i j = p j ( δ i , j - p i )$, one computes
$\sum_{i,j=0}^{m} \bigl(1 + \ln(1-p_i)\bigr)\bigl(1 + \ln(1-p_j)\bigr)\, p_j\,(\delta_{i,j} - p_i) \;=\; \sum_{i=0}^{m} p_i\,\bigl(\ln(1-p_i)\bigr)^2 \;-\; \Bigl( \sum_{i=0}^{m} p_i \ln(1-p_i) \Bigr)^{2}$
as before. Finally, in the boundary case $p = p uni$, the linear term again vanishes such that
$2\,n\,m^2\,\ln\Bigl(\frac{m+1}{m}\Bigr) \cdot (1 - \hat{\nu}_{Ex}) \;\approx\; n \sum_{j=0}^{m} \frac{\bigl(\hat{p}_j - \frac{1}{m+1}\bigr)^2}{\frac{1}{m+1}}$
equals Pearson’s $χ 2$-statistic.

#### Appendix B.3. Derivations for Measures of Signed Serial Dependence

We partition $x ∈ ( 0 ; 1 ) 2 ( m + 1 )$ as $x = ( x 0 , … , x m , x m + 1 + 0 , … , x m + 1 + m ) ⊤$, and we define
$f(x) \;=\; \sum_{j=0}^{m} \frac{x_{m+1+j} - x_j^2}{1 - x_j} .$
Then,
$\frac{\partial}{\partial x_j} f(x) \;=\; \frac{x_{m+1+j} - 2 x_j + x_j^2}{(1 - x_j)^2}, \qquad \frac{\partial}{\partial x_{m+1+j}} f(x) \;=\; \frac{1}{1 - x_j},$
and
$\frac{\partial^2}{\partial x_j^2} f(x) \;=\; -\,\frac{2\,\bigl(1 - x_{m+1+j}\bigr)}{(1 - x_j)^3}, \qquad \frac{\partial^2}{\partial x_j\, \partial x_{m+1+j}} f(x) \;=\; \frac{1}{(1 - x_j)^2};$
all other second-order derivatives equal 0. Thus, a second-order Taylor approximation of $\hat{\kappa}^{\star}(h) = f\bigl(\ldots, \hat{p}_j, \ldots, \hat{p}_{jj}(h), \ldots\bigr)$ is given by
$\begin{aligned} \hat{\kappa}^{\star}(h) \;\approx\; \kappa^{\star}(h) \;&+\; \sum_{j=0}^{m} \frac{p_{jj}(h) - 2 p_j + p_j^2}{(1 - p_j)^2}\,\bigl(\hat{p}_j - p_j\bigr) \;+\; \sum_{j=0}^{m} \frac{1}{1 - p_j}\,\bigl(\hat{p}_{jj}(h) - p_{jj}(h)\bigr) \\ &-\; \sum_{j=0}^{m} \frac{1 - p_{jj}(h)}{(1 - p_j)^3}\,\bigl(\hat{p}_j - p_j\bigr)^2 \;+\; \sum_{j=0}^{m} \frac{1}{(1 - p_j)^2}\,\bigl(\hat{p}_j - p_j\bigr)\bigl(\hat{p}_{jj}(h) - p_{jj}(h)\bigr) . \end{aligned}$
Hence, using Equation (18), it follows that
$n\,\bigl(E[\hat{\kappa}^{\star}(h)] - \kappa^{\star}(h)\bigr) \;\approx\; \sum_{j=0}^{m} \frac{\sigma_{j,\, m+1+j}(h)}{(1 - p_j)^2} \;-\; \sum_{j=0}^{m} \frac{1 - p_{jj}(h)}{(1 - p_j)^3}\, \sigma_{jj} \;=\; \sum_{j=0}^{m} \frac{(1 - p_j)\, \sigma_{j,\, m+1+j}(h) - \bigl(1 - p_{jj}(h)\bigr)\, \sigma_{jj}}{(1 - p_j)^3} .$
Furthermore, the Delta method implies that $\sqrt{n}\,\bigl(\hat{\kappa}^{\star}(h) - \kappa^{\star}(h)\bigr)$ is asymptotically $\text{N}(0, \sigma^2)$-distributed with
$\begin{aligned} \sigma^2 \;=\;& \sum_{i,j=0}^{m} \frac{p_{ii}(h) - 2 p_i + p_i^2}{(1 - p_i)^2} \cdot \frac{p_{jj}(h) - 2 p_j + p_j^2}{(1 - p_j)^2}\; \sigma_{i,j} \;+\; \sum_{i,j=0}^{m} \frac{\sigma_{m+1+i,\, m+1+j}(h)}{(1 - p_i)(1 - p_j)} \\ &+\; 2 \sum_{i,j=0}^{m} \frac{p_{ii}(h) - 2 p_i + p_i^2}{(1 - p_i)^2} \cdot \frac{\sigma_{i,\, m+1+j}(h)}{1 - p_j} . \end{aligned}$
Note that, under the null of an i.i.d. DGP, we have the simplifications
$\frac{p_{jj}(h) - 2 p_j + p_j^2}{(1 - p_j)^2} \;=\; \frac{-2\,p_j}{1 - p_j}, \qquad \frac{1 - p_{jj}(h)}{(1 - p_j)^3} \;=\; \frac{1 + p_j}{(1 - p_j)^2} .$
Thus, the second-order Taylor approximation of $\hat{\kappa}^{\star}(h)$ then simplifies to
$\begin{aligned} \hat{\kappa}^{\star}(h) \;\approx\; \kappa^{\star}(h) \;&+\; \sum_{j=0}^{m} \frac{-2\,p_j}{1 - p_j}\,\bigl(\hat{p}_j - p_j\bigr) \;+\; \sum_{j=0}^{m} \frac{1}{1 - p_j}\,\bigl(\hat{p}_{jj}(h) - p_j^2\bigr) \\ &-\; \sum_{j=0}^{m} \frac{1 + p_j}{(1 - p_j)^2}\,\bigl(\hat{p}_j - p_j\bigr)^2 \;+\; \sum_{j=0}^{m} \frac{1}{(1 - p_j)^2}\,\bigl(\hat{p}_j - p_j\bigr)\bigl(\hat{p}_{jj}(h) - p_j^2\bigr) . \end{aligned}$
Furthermore,
$\sigma_{i,j} \;=\; p_j\,(\delta_{i,j} - p_i), \qquad \sigma_{i,\, m+1+j} \;=\; 2\,p_j^2\,(\delta_{i,j} - p_i), \qquad \sigma_{m+1+i,\, m+1+j} \;=\; \delta_{i,j}\, p_i^2\,(1 + 2 p_i) - 3\, p_i^2\, p_j^2 ,$
according to Equation (19). Thus, for an i.i.d. DGP, it follows that
$\begin{aligned} n\,\bigl(E[\hat{\kappa}^{\star}(h)] - \kappa^{\star}(h)\bigr) \;=\; n\,E[\hat{\kappa}^{\star}(h)] \;&\approx\; \sum_{j=0}^{m} \frac{\sigma_{j,\, m+1+j} - (1 + p_j)\,\sigma_{jj}}{(1 - p_j)^2} \;=\; \sum_{j=0}^{m} \frac{2\,p_j^2\,(1 - p_j) - (1 + p_j)\, p_j\,(1 - p_j)}{(1 - p_j)^2} \\ &=\; \sum_{j=0}^{m} \frac{-\,p_j\,(1 - p_j)\,(1 + p_j - 2 p_j)}{(1 - p_j)^2} \;=\; -1 . \end{aligned}$
Furthermore,
$\begin{aligned} \sigma^2 \;=\;& \sum_{i,j=0}^{m} \frac{-2\,p_i}{1 - p_i} \cdot \frac{-2\,p_j}{1 - p_j}\; p_j\,(\delta_{i,j} - p_i) \;+\; 2 \sum_{i,j=0}^{m} \frac{-2\,p_i}{1 - p_i} \cdot \frac{1}{1 - p_j}\; 2\,p_j^2\,(\delta_{i,j} - p_i) \\ &+\; \sum_{i,j=0}^{m} \frac{1}{1 - p_i} \cdot \frac{1}{1 - p_j}\; \Bigl( \delta_{i,j}\, p_i^2\,(1 + 2 p_i) - 3\, p_i^2\, p_j^2 \Bigr) \\ =\;& 4 \sum_{i=0}^{m} \frac{p_i^3}{(1 - p_i)^2} \;-\; 8 \sum_{i=0}^{m} \frac{p_i^3}{(1 - p_i)^2} \;+\; \sum_{i=0}^{m} \frac{p_i^2\,(1 + 2 p_i)}{(1 - p_i)^2} \\ &-\; 4 \sum_{i,j=0}^{m} \frac{p_i^2}{1 - p_i} \cdot \frac{p_j^2}{1 - p_j} \;+\; 8 \sum_{i,j=0}^{m} \frac{p_i^2}{1 - p_i} \cdot \frac{p_j^2}{1 - p_j} \;-\; 3 \sum_{i,j=0}^{m} \frac{p_i^2}{1 - p_i} \cdot \frac{p_j^2}{1 - p_j} \\ =\;& \sum_{i=0}^{m} \frac{p_i^2\,(1 - 2 p_i)}{(1 - p_i)^2} \;+\; \Bigl( \sum_{i=0}^{m} \frac{p_i^2}{1 - p_i} \Bigr)^{2} . \end{aligned}$
In the special case of a uniform distribution, i.e., where all $p_i = \frac{1}{m+1}$, the asymptotic variances according to Equations (16), (17) and (21) coincide. In Equation (16), we then have
$1 - \frac{1 + 2\,s_3(p) - 3\,s_2(p)}{\bigl(1 - s_2(p)\bigr)^2} \;=\; 1 - \frac{1 + \frac{2}{(m+1)^2} - \frac{3}{m+1}}{\bigl(1 - \frac{1}{m+1}\bigr)^2} \;=\; 1 - \frac{(m+1)^2 + 2 - 3\,(m+1)}{m^2} \;=\; 1 - \frac{m^2 - m}{m^2} \;=\; \frac{1}{m} ,$
which corresponds to Equation (17). However, the same expression also follows for Equation (21):
$s_2(p) \;-\; \sum_{i=0}^{m} \Bigl( \frac{p_i^2}{1 - p_i} \Bigr)^{2} \;+\; \Bigl( \sum_{i=0}^{m} \frac{p_i^2}{1 - p_i} \Bigr)^{2} \;=\; \frac{1}{m+1} - \frac{1}{m^2\,(m+1)} + \frac{1}{m^2} \;=\; \frac{1}{m} .$
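These algebraic reductions are easy to verify numerically; the following sketch (ours, not from the article) evaluates both variance expressions with exact rational arithmetic at the uniform distribution:

```python
from fractions import Fraction

def variances_at_uniform(m):
    """Evaluate the two asymptotic variance expressions (the forms from
    Equations (16) and (21) above) at the uniform PMF p_i = 1/(m+1);
    both reduce to 1/m."""
    p = [Fraction(1, m + 1)] * (m + 1)
    s2 = sum(q**2 for q in p)
    s3 = sum(q**3 for q in p)
    v16 = 1 - (1 + 2 * s3 - 3 * s2) / (1 - s2) ** 2
    v21 = (s2
           - sum((q**2 / (1 - q)) ** 2 for q in p)
           + sum(q**2 / (1 - q) for q in p) ** 2)
    return v16, v21
```

Exact arithmetic via `Fraction` rules out any floating-point ambiguity in the comparison with $1 / m$.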

## Appendix C. Tables

Table A1. Asymptotic vs. simulated mean of $ν ^ G$, $ν ^ En$, $ν ^ Ex$ (each cell: asymptotic / simulated) for DGPs i.i.d., DMA(1) with $φ 1 = 0.25$, and DAR(1) with $ϕ 1 = 0.40$.

| PMF | $n$ | $ν ^ G$: i.i.d. | $ν ^ G$: DMA(1) | $ν ^ G$: DAR(1) | $ν ^ En$: i.i.d. | $ν ^ En$: DMA(1) | $ν ^ En$: DAR(1) | $ν ^ Ex$: i.i.d. | $ν ^ Ex$: DMA(1) | $ν ^ Ex$: DAR(1) |
|---|---|---|---|---|---|---|---|---|---|---|
| $p 1$ | 100 | 0.970 / 0.970 | 0.967 / 0.967 | 0.957 / 0.957 | 0.969 / 0.968 | 0.965 / 0.964 | 0.954 / 0.954 | 0.982 / 0.982 | 0.980 / 0.980 | 0.975 / 0.975 |
| | 250 | 0.976 / 0.976 | 0.975 / 0.975 | 0.971 / 0.971 | 0.975 / 0.975 | 0.973 / 0.973 | 0.969 / 0.969 | 0.986 / 0.986 | 0.985 / 0.985 | 0.983 / 0.983 |
| | 500 | 0.978 / 0.978 | 0.977 / 0.977 | 0.975 / 0.975 | 0.977 / 0.977 | 0.976 / 0.976 | 0.974 / 0.974 | 0.987 / 0.987 | 0.987 / 0.987 | 0.985 / 0.985 |
| | 1000 | 0.979 / 0.979 | 0.979 / 0.979 | 0.978 / 0.978 | 0.978 / 0.978 | 0.978 / 0.978 | 0.977 / 0.977 | 0.988 / 0.988 | 0.987 / 0.987 | 0.987 / 0.987 |
| $p 2$ | 100 | 0.627 / 0.627 | 0.625 / 0.625 | 0.619 / 0.619 | 0.649 / 0.649 | 0.645 / 0.644 | 0.634 / 0.634 | 0.739 / 0.739 | 0.737 / 0.737 | 0.731 / 0.731 |
| | 250 | 0.631 / 0.631 | 0.630 / 0.630 | 0.627 / 0.627 | 0.655 / 0.655 | 0.654 / 0.654 | 0.649 / 0.649 | 0.743 / 0.743 | 0.742 / 0.742 | 0.739 / 0.739 |
| | 500 | 0.632 / 0.632 | 0.632 / 0.631 | 0.630 / 0.630 | 0.657 / 0.657 | 0.657 / 0.656 | 0.654 / 0.654 | 0.744 / 0.744 | 0.743 / 0.743 | 0.742 / 0.742 |
| | 1000 | 0.633 / 0.633 | 0.632 / 0.633 | 0.632 / 0.632 | 0.658 / 0.658 | 0.658 / 0.658 | 0.657 / 0.657 | 0.744 / 0.744 | 0.744 / 0.744 | 0.744 / 0.743 |
| $p 3$ | 100 | 0.759 / 0.759 | 0.756 / 0.756 | 0.749 / 0.749 | 0.756 / 0.755 | 0.752 / 0.751 | 0.741 / 0.741 | 0.842 / 0.842 | 0.840 / 0.840 | 0.835 / 0.835 |
| | 250 | 0.764 / 0.764 | 0.762 / 0.762 | 0.760 / 0.760 | 0.762 / 0.762 | 0.761 / 0.761 | 0.757 / 0.757 | 0.846 / 0.846 | 0.845 / 0.845 | 0.843 / 0.843 |
| | 500 | 0.765 / 0.765 | 0.765 / 0.765 | 0.763 / 0.763 | 0.764 / 0.764 | 0.764 / 0.764 | 0.762 / 0.762 | 0.847 / 0.847 | 0.846 / 0.847 | 0.845 / 0.845 |
| | 1000 | 0.766 / 0.766 | 0.766 / 0.766 | 0.765 / 0.765 | 0.766 / 0.765 | 0.765 / 0.765 | 0.764 / 0.764 | 0.847 / 0.847 | 0.847 / 0.847 | 0.847 / 0.847 |
| $p 4$ | 100 | 0.433 / 0.433 | 0.431 / 0.431 | 0.427 / 0.428 | 0.486 / 0.486 | 0.482 / 0.481 | 0.471 / 0.471 | 0.568 / 0.568 | 0.566 / 0.566 | 0.560 / 0.561 |
| | 250 | 0.436 / 0.436 | 0.435 / 0.435 | 0.433 / 0.434 | 0.492 / 0.492 | 0.491 / 0.491 | 0.487 / 0.487 | 0.572 / 0.572 | 0.571 / 0.571 | 0.569 / 0.569 |
| | 500 | 0.437 / 0.437 | 0.436 / 0.436 | 0.435 / 0.436 | 0.495 / 0.495 | 0.494 / 0.494 | 0.492 / 0.492 | 0.573 / 0.573 | 0.572 / 0.572 | 0.571 / 0.571 |
| | 1000 | 0.437 / 0.437 | 0.437 / 0.437 | 0.436 / 0.436 | 0.496 / 0.496 | 0.495 / 0.495 | 0.494 / 0.494 | 0.573 / 0.573 | 0.573 / 0.573 | 0.573 / 0.573 |
| $p 5$ | 100 | 0.743 / 0.743 | 0.740 / 0.740 | 0.733 / 0.733 | 0.764 / 0.764 | 0.760 / 0.759 | 0.749 / 0.749 | 0.827 / 0.827 | 0.824 / 0.824 | 0.819 / 0.819 |
| | 250 | 0.747 / 0.747 | 0.746 / 0.746 | 0.743 / 0.743 | 0.770 / 0.770 | 0.768 / 0.769 | 0.764 / 0.765 | 0.830 / 0.830 | 0.829 / 0.829 | 0.827 / 0.827 |
| | 500 | 0.749 / 0.749 | 0.748 / 0.748 | 0.747 / 0.747 | 0.772 / 0.772 | 0.771 / 0.772 | 0.769 / 0.769 | 0.831 / 0.831 | 0.831 / 0.831 | 0.830 / 0.830 |
| | 1000 | 0.749 / 0.749 | 0.749 / 0.749 | 0.748 / 0.748 | 0.773 / 0.773 | 0.773 / 0.773 | 0.772 / 0.772 | 0.832 / 0.832 | 0.832 / 0.832 | 0.831 / 0.831 |
| $p 6$ | 100 | 0.928 / 0.928 | 0.925 / 0.925 | 0.916 / 0.916 | 0.929 / 0.929 | 0.925 / 0.925 | 0.915 / 0.915 | 0.956 / 0.956 | 0.953 / 0.954 | 0.948 / 0.948 |
| | 250 | 0.934 / 0.934 | 0.932 / 0.932 | 0.929 / 0.929 | 0.936 / 0.936 | 0.934 / 0.934 | 0.930 / 0.930 | 0.959 / 0.959 | 0.958 / 0.958 | 0.956 / 0.956 |
| | 500 | 0.936 / 0.936 | 0.935 / 0.935 | 0.933 / 0.933 | 0.938 / 0.938 | 0.937 / 0.937 | 0.935 / 0.935 | 0.960 / 0.960 | 0.960 / 0.960 | 0.959 / 0.959 |
| | 1000 | 0.937 / 0.937 | 0.936 / 0.936 | 0.935 / 0.935 | 0.939 / 0.939 | 0.939 / 0.939 | 0.938 / 0.938 | 0.961 / 0.961 | 0.961 / 0.961 | 0.960 / 0.960 |
Table A2. Asymptotic vs. simulated standard deviation of $ν ^ G$, $ν ^ En$, $ν ^ Ex$ (each cell: asymptotic / simulated) for DGPs i.i.d., DMA(1) with $φ 1 = 0.25$, and DAR(1) with $ϕ 1 = 0.40$.

| PMF | $n$ | $ν ^ G$: i.i.d. | $ν ^ G$: DMA(1) | $ν ^ G$: DAR(1) | $ν ^ En$: i.i.d. | $ν ^ En$: DMA(1) | $ν ^ En$: DAR(1) | $ν ^ Ex$: i.i.d. | $ν ^ Ex$: DMA(1) | $ν ^ Ex$: DAR(1) |
|---|---|---|---|---|---|---|---|---|---|---|
| $p 1$ | 100 | 0.017 / 0.019 | 0.020 / 0.023 | 0.027 / 0.032 | 0.017 / 0.020 | 0.020 / 0.024 | 0.027 / 0.033 | 0.011 / 0.012 | 0.012 / 0.014 | 0.016 / 0.020 |
| | 250 | 0.011 / 0.011 | 0.013 / 0.014 | 0.017 / 0.018 | 0.011 / 0.012 | 0.013 / 0.014 | 0.017 / 0.019 | 0.007 / 0.007 | 0.008 / 0.008 | 0.010 / 0.011 |
| | 500 | 0.008 / 0.008 | 0.009 / 0.009 | 0.012 / 0.012 | 0.008 / 0.008 | 0.009 / 0.009 | 0.012 / 0.013 | 0.005 / 0.005 | 0.006 / 0.006 | 0.007 / 0.008 |
| | 1000 | 0.006 / 0.006 | 0.006 / 0.007 | 0.008 / 0.009 | 0.006 / 0.006 | 0.006 / 0.007 | 0.008 / 0.009 | 0.003 / 0.003 | 0.004 / 0.004 | 0.005 / 0.005 |
| $p 2$ | 100 | 0.071 / 0.071 | 0.084 / 0.083 | 0.109 / 0.106 | 0.063 / 0.064 | 0.074 / 0.075 | 0.097 / 0.096 | 0.057 / 0.058 | 0.067 / 0.068 | 0.088 / 0.088 |
| | 250 | 0.045 / 0.045 | 0.053 / 0.053 | 0.069 / 0.068 | 0.040 / 0.040 | 0.047 / 0.047 | 0.061 / 0.061 | 0.036 / 0.036 | 0.043 / 0.043 | 0.055 / 0.055 |
| | 500 | 0.032 / 0.032 | 0.037 / 0.037 | 0.049 / 0.049 | 0.028 / 0.029 | 0.033 / 0.033 | 0.043 / 0.043 | 0.026 / 0.026 | 0.030 / 0.030 | 0.039 / 0.039 |
| | 1000 | 0.023 / 0.023 | 0.027 / 0.027 | 0.035 / 0.035 | 0.020 / 0.020 | 0.024 / 0.024 | 0.031 / 0.031 | 0.018 / 0.018 | 0.021 / 0.021 | 0.028 / 0.028 |
| $p 3$ | 100 | 0.058 / 0.058 | 0.068 / 0.068 | 0.088 / 0.086 | 0.053 / 0.054 | 0.062 / 0.063 | 0.081 / 0.081 | 0.042 / 0.043 | 0.049 / 0.050 | 0.064 / 0.065 |
| | 250 | 0.037 / 0.037 | 0.043 / 0.043 | 0.056 / 0.055 | 0.033 / 0.034 | 0.039 / 0.039 | 0.051 / 0.051 | 0.027 / 0.027 | 0.031 / 0.031 | 0.041 / 0.041 |
| | 500 | 0.026 / 0.026 | 0.030 / 0.030 | 0.039 / 0.039 | 0.024 / 0.024 | 0.028 / 0.028 | 0.036 / 0.036 | 0.019 / 0.019 | 0.022 / 0.022 | 0.029 / 0.029 |
| | 1000 | 0.018 / 0.018 | 0.021 / 0.021 | 0.028 / 0.028 | 0.017 / 0.017 | 0.020 / 0.020 | 0.025 / 0.025 | 0.013 / 0.013 | 0.016 / 0.016 | 0.020 / 0.020 |
| $p 4$ | 100 | 0.078 / 0.077 | 0.092 / 0.090 | 0.119 / 0.115 | 0.072 / 0.073 | 0.085 / 0.085 | 0.110 / 0.109 | 0.073 / 0.073 | 0.085 / 0.086 | 0.111 / 0.110 |
| | 250 | 0.049 / 0.049 | 0.058 / 0.058 | 0.075 / 0.075 | 0.046 / 0.046 | 0.054 / 0.054 | 0.070 / 0.070 | 0.046 / 0.046 | 0.054 / 0.054 | 0.070 / 0.071 |
| | 500 | 0.035 / 0.035 | 0.041 / 0.041 | 0.053 / 0.053 | 0.032 / 0.032 | 0.038 / 0.038 | 0.049 / 0.049 | 0.033 / 0.033 | 0.038 / 0.038 | 0.050 / 0.050 |
| | 1000 | 0.025 / 0.025 | 0.029 / 0.029 | 0.038 / 0.038 | 0.023 / 0.023 | 0.027 / 0.027 | 0.035 / 0.035 | 0.023 / 0.023 | 0.027 / 0.027 | 0.035 / 0.035 |
| $p 5$ | 100 | 0.065 / 0.064 | 0.076 / 0.075 | 0.099 / 0.096 | 0.056 / 0.057 | 0.066 / 0.067 | 0.086 / 0.086 | 0.048 / 0.048 | 0.056 / 0.056 | 0.073 / 0.073 |
| | 250 | 0.041 / 0.041 | 0.048 / 0.048 | 0.062 / 0.062 | 0.036 / 0.036 | 0.042 / 0.042 | 0.054 / 0.054 | 0.030 / 0.030 | 0.035 / 0.035 | 0.046 / 0.046 |
| | 500 | 0.029 / 0.029 | 0.034 / 0.034 | 0.044 / 0.044 | 0.025 / 0.025 | 0.029 / 0.029 | 0.038 / 0.038 | 0.021 / 0.021 | 0.025 / 0.025 | 0.032 / 0.033 |
| | 1000 | 0.020 / 0.020 | 0.024 / 0.024 | 0.031 / 0.031 | 0.018 / 0.018 | 0.021 / 0.021 | 0.027 / 0.027 | 0.015 / 0.015 | 0.018 / 0.018 | 0.023 / 0.023 |
| $p 6$ | 100 | 0.033 / 0.034 | 0.039 / 0.040 | 0.051 / 0.052 | 0.030 / 0.032 | 0.036 / 0.037 | 0.046 / 0.050 | 0.021 / 0.022 | 0.025 / 0.026 | 0.032 / 0.034 |
| | 250 | 0.021 / 0.021 | 0.025 / 0.025 | 0.032 / 0.032 | 0.019 / 0.019 | 0.022 / 0.023 | 0.029 / 0.030 | 0.013 / 0.013 | 0.016 / 0.016 | 0.020 / 0.021 |
| | 500 | 0.015 / 0.015 | 0.017 / 0.017 | 0.023 / 0.023 | 0.014 / 0.014 | 0.016 / 0.016 | 0.021 / 0.021 | 0.009 / 0.010 | 0.011 / 0.011 | 0.014 / 0.015 |
| | 1000 | 0.010 / 0.011 | 0.012 / 0.012 | 0.016 / 0.016 | 0.010 / 0.010 | 0.011 / 0.011 | 0.015 / 0.015 | 0.007 / 0.007 | 0.008 / 0.008 | 0.010 / 0.010 |
Table A3. Simulated coverage rate for 95% CIs of $ν ^ G$, $ν ^ En$, $ν ^ Ex$ for DGPs i.i.d., DMA(1) with $φ 1 = 0.25$, and DAR(1) with $ϕ 1 = 0.40$.

| PMF | $n$ | $ν ^ G$: i.i.d. | $ν ^ G$: DMA(1) | $ν ^ G$: DAR(1) | $ν ^ En$: i.i.d. | $ν ^ En$: DMA(1) | $ν ^ En$: DAR(1) | $ν ^ Ex$: i.i.d. | $ν ^ Ex$: DMA(1) | $ν ^ Ex$: DAR(1) |
|---|---|---|---|---|---|---|---|---|---|---|
| $p 1$ | 100 | 0.895 | 0.893 | 0.899 | 0.904 | 0.901 | 0.905 | 0.892 | 0.890 | 0.897 |
| | 250 | 0.914 | 0.909 | 0.898 | 0.920 | 0.915 | 0.905 | 0.912 | 0.907 | 0.896 |
| | 500 | 0.931 | 0.922 | 0.912 | 0.935 | 0.927 | 0.918 | 0.930 | 0.921 | 0.910 |
| | 1000 | 0.938 | 0.935 | 0.926 | 0.940 | 0.937 | 0.930 | 0.938 | 0.934 | 0.925 |
| $p 2$ | 100 | 0.936 | 0.926 | 0.908 | 0.937 | 0.929 | 0.915 | 0.931 | 0.928 | 0.911 |
| | 250 | 0.944 | 0.941 | 0.933 | 0.945 | 0.942 | 0.937 | 0.944 | 0.941 | 0.935 |
| | 500 | 0.947 | 0.946 | 0.941 | 0.947 | 0.946 | 0.943 | 0.947 | 0.946 | 0.941 |
| | 1000 | 0.949 | 0.947 | 0.944 | 0.949 | 0.948 | 0.946 | 0.949 | 0.947 | 0.945 |
| $p 3$ | 100 | 0.931 | 0.920 | 0.904 | 0.936 | 0.926 | 0.918 | 0.930 | 0.919 | 0.901 |
| | 250 | 0.941 | 0.938 | 0.930 | 0.943 | 0.943 | 0.937 | 0.940 | 0.937 | 0.929 |
| | 500 | 0.945 | 0.945 | 0.940 | 0.946 | 0.946 | 0.943 | 0.944 | 0.944 | 0.939 |
| | 1000 | 0.948 | 0.947 | 0.945 | 0.948 | 0.948 | 0.947 | 0.948 | 0.947 | 0.945 |
| $p 4$ | 100 | 0.941 | 0.925 | 0.904 | 0.938 | 0.925 | 0.905 | 0.938 | 0.931 | 0.913 |
| | 250 | 0.947 | 0.939 | 0.929 | 0.943 | 0.940 | 0.931 | 0.948 | 0.942 | 0.934 |
| | 500 | 0.949 | 0.946 | 0.941 | 0.948 | 0.946 | 0.942 | 0.950 | 0.947 | 0.943 |
| | 1000 | 0.952 | 0.945 | 0.946 | 0.949 | 0.946 | 0.947 | 0.949 | 0.946 | 0.947 |
| $p 5$ | 100 | 0.933 | 0.922 | 0.904 | 0.938 | 0.928 | 0.915 | 0.939 | 0.923 | 0.906 |
| | 250 | 0.945 | 0.939 | 0.931 | 0.947 | 0.942 | 0.935 | 0.944 | 0.940 | 0.931 |
| | 500 | 0.946 | 0.945 | 0.941 | 0.947 | 0.946 | 0.943 | 0.947 | 0.945 | 0.941 |
| | 1000 | 0.948 | 0.947 | 0.945 | 0.949 | 0.948 | 0.946 | 0.949 | 0.947 | 0.945 |
| $p 6$ | 100 | 0.909 | 0.894 | 0.884 | 0.920 | 0.908 | 0.901 | 0.906 | 0.890 | 0.877 |
| | 250 | 0.933 | 0.923 | 0.908 | 0.936 | 0.929 | 0.919 | 0.931 | 0.921 | 0.905 |
| | 500 | 0.938 | 0.935 | 0.927 | 0.940 | 0.938 | 0.933 | 0.937 | 0.934 | 0.925 |
| | 1000 | 0.943 | 0.941 | 0.939 | 0.945 | 0.943 | 0.942 | 0.943 | 0.941 | 0.938 |
Table A4. Rejection rate if testing the null hypothesis of a uniform distribution ($m = 3$) on the 5% level based on $ν ^ G$, $ν ^ En$, $ν ^ Ex$ (each cell: $ν ^ G$ / $ν ^ En$ / $ν ^ Ex$). DGPs i.i.d., DMA(1) with $φ 1 = 0.25$, and DAR(1) with $ϕ 1 = 0.40$, with marginal distribution $L m ( λ )$.

| DGP | $n$ | $λ = 1.00$ | 0.98 | 0.96 | 0.94 | 0.92 | 0.90 |
|---|---|---|---|---|---|---|---|
| i.i.d. | 100 | 0.049 / 0.050 / 0.050 | 0.056 / 0.057 / 0.058 | 0.081 / 0.082 / 0.084 | 0.122 / 0.120 / 0.127 | 0.183 / 0.178 / 0.190 | 0.271 / 0.262 / 0.281 |
| | 250 | 0.052 / 0.051 / 0.051 | 0.068 / 0.068 / 0.068 | 0.132 / 0.129 / 0.132 | 0.250 / 0.243 / 0.252 | 0.421 / 0.407 / 0.423 | 0.605 / 0.590 / 0.608 |
| | 500 | 0.050 / 0.050 / 0.050 | 0.088 / 0.087 / 0.088 | 0.223 / 0.219 / 0.225 | 0.465 / 0.455 / 0.468 | 0.723 / 0.712 / 0.727 | 0.899 / 0.893 / 0.901 |
| | 1000 | 0.051 / 0.051 / 0.051 | 0.132 / 0.132 / 0.133 | 0.420 / 0.414 / 0.423 | 0.782 / 0.775 / 0.785 | 0.961 / 0.959 / 0.962 | 0.997 / 0.997 / 0.997 |
| DMA(1), 0.25 | 100 | 0.054 / 0.057 / 0.055 | 0.060 / 0.063 / 0.062 | 0.076 / 0.079 / 0.078 | 0.109 / 0.109 / 0.112 | 0.150 / 0.150 / 0.155 | 0.213 / 0.210 / 0.219 |
| | 250 | 0.052 / 0.052 / 0.052 | 0.065 / 0.067 / 0.066 | 0.108 / 0.108 / 0.110 | 0.194 / 0.190 / 0.197 | 0.317 / 0.309 / 0.322 | 0.467 / 0.455 / 0.474 |
| | 500 | 0.051 / 0.052 / 0.051 | 0.077 / 0.077 / 0.078 | 0.172 / 0.169 / 0.174 | 0.350 / 0.343 / 0.354 | 0.575 / 0.564 / 0.579 | 0.778 / 0.769 / 0.782 |
| | 1000 | 0.049 / 0.050 / 0.050 | 0.106 / 0.106 / 0.107 | 0.315 / 0.310 / 0.317 | 0.634 / 0.625 / 0.638 | 0.879 / 0.873 / 0.881 | 0.978 / 0.976 / 0.978 |
| DAR(1), 0.40 | 100 | 0.056 / 0.059 / 0.059 | 0.059 / 0.062 / 0.062 | 0.067 / 0.071 / 0.071 | 0.088 / 0.091 / 0.092 | 0.112 / 0.115 / 0.118 | 0.146 / 0.147 / 0.153 |
| | 250 | 0.053 / 0.054 / 0.054 | 0.060 / 0.062 / 0.062 | 0.085 / 0.085 / 0.087 | 0.132 / 0.131 / 0.135 | 0.199 / 0.195 / 0.205 | 0.294 / 0.286 / 0.301 |
| | 500 | 0.050 / 0.051 / 0.051 | 0.066 / 0.067 / 0.067 | 0.120 / 0.119 / 0.121 | 0.218 / 0.213 / 0.221 | 0.366 / 0.356 / 0.371 | 0.535 / 0.522 / 0.542 |
| | 1000 | 0.050 / 0.050 / 0.050 | 0.083 / 0.083 / 0.084 | 0.199 / 0.195 / 0.200 | 0.404 / 0.396 / 0.408 | 0.647 / 0.636 / 0.651 | 0.846 / 0.838 / 0.850 |
Table A5. Asymptotic (“a”) vs. simulated (“s”, for $h = 1 / 2 / 3$) standard deviation of $κ ^ ( h )$, $κ ^ * ( h )$, $κ ^ ⋆ ( h )$ for i.i.d. DGPs with $m = 3$.

| PMF | $n$ | $κ ^ ( h )$: a | $κ ^ ( h )$: s | $κ ^ * ( h )$: a | $κ ^ * ( h )$: s | $κ ^ ⋆ ( h )$: a | $κ ^ ⋆ ( h )$: s |
|---|---|---|---|---|---|---|---|
| $p 5$ | 100 | 0.064 | 0.064 / 0.065 / 0.066 | 0.058 | 0.056 / 0.057 / 0.057 | 0.074 | 0.075 / 0.077 / 0.078 |
| | 250 | 0.040 | 0.041 / 0.041 / 0.041 | 0.037 | 0.036 / 0.036 / 0.036 | 0.047 | 0.047 / 0.048 / 0.048 |
| | 500 | 0.029 | 0.029 / 0.029 / 0.029 | 0.026 | 0.026 / 0.026 / 0.026 | 0.033 | 0.033 / 0.033 / 0.034 |
| | 1000 | 0.020 | 0.020 / 0.020 / 0.020 | 0.018 | 0.018 / 0.018 / 0.018 | 0.023 | 0.023 / 0.024 / 0.024 |
| $p 6$ | 100 | 0.060 | 0.060 / 0.061 / 0.061 | 0.058 | 0.057 / 0.057 / 0.058 | 0.063 | 0.064 / 0.064 / 0.065 |
| | 250 | 0.038 | 0.038 / 0.038 / 0.038 | 0.037 | 0.036 / 0.036 / 0.037 | 0.040 | 0.040 / 0.040 / 0.040 |
| | 500 | 0.027 | 0.027 / 0.027 / 0.027 | 0.026 | 0.026 / 0.026 / 0.026 | 0.028 | 0.028 / 0.028 / 0.028 |
| | 1000 | 0.019 | 0.019 / 0.019 / 0.019 | 0.018 | 0.018 / 0.018 / 0.018 | 0.020 | 0.020 / 0.020 / 0.020 |
| $p 7$ | 100 | 0.058 | 0.058 / 0.059 / 0.059 | 0.058 | 0.057 / 0.058 / 0.058 | 0.058 | 0.058 / 0.059 / 0.059 |
| | 250 | 0.037 | 0.036 / 0.037 / 0.037 | 0.037 | 0.036 / 0.037 / 0.036 | 0.037 | 0.037 / 0.037 / 0.037 |
| | 500 | 0.026 | 0.026 / 0.026 / 0.026 | 0.026 | 0.026 / 0.026 / 0.026 | 0.026 | 0.026 / 0.026 / 0.026 |
| | 1000 | 0.018 | 0.018 / 0.018 / 0.018 | 0.018 | 0.018 / 0.018 / 0.018 | 0.018 | 0.018 / 0.018 / 0.018 |
Table A6. Rejection rate (RR) if testing the null hypothesis of i.i.d. data on the 5% level based on $κ ^ ( 1 )$, $κ ^ * ( 1 )$, $κ ^ ⋆ ( 1 )$ (each cell: $κ ^ ( 1 )$ / $κ ^ * ( 1 )$ / $κ ^ ⋆ ( 1 )$). DGPs i.i.d., DMA(1) with $φ 1 = 0.15$, DAR(1) with $ϕ 1 = 0.15$, and NegMarkov with $α = 0.75$.

| PMF | $n$ | i.i.d. | DMA(1), 0.15 | DAR(1), 0.15 | NegMarkov, 0.75 |
|---|---|---|---|---|---|
| $p 5$ | 100 | 0.047 / 0.042 / 0.051 | 0.484 / 0.537 / 0.400 | 0.594 / 0.634 / 0.507 | 0.478 / 0.235 / 0.521 |
| | 250 | 0.049 / 0.048 / 0.049 | 0.851 / 0.896 / 0.756 | 0.925 / 0.947 / 0.861 | 0.886 / 0.654 / 0.908 |
| | 500 | 0.049 / 0.049 / 0.050 | 0.988 / 0.994 / 0.961 | 0.997 / 0.999 / 0.989 | 0.995 / 0.936 / 0.997 |
| | 1000 | 0.049 / 0.048 / 0.049 | 1.000 / 1.000 / 0.999 | 1.000 / 1.000 / 1.000 | 1.000 / 0.999 / 1.000 |
| $p 6$ | 100 | 0.048 / 0.047 / 0.049 | 0.540 / 0.559 / 0.509 | 0.662 / 0.673 / 0.631 | 0.338 / 0.268 / 0.353 |
| | 250 | 0.050 / 0.049 / 0.049 | 0.903 / 0.916 / 0.878 | 0.960 / 0.966 / 0.947 | 0.725 / 0.636 / 0.735 |
| | 500 | 0.050 / 0.049 / 0.051 | 0.996 / 0.997 / 0.992 | 0.999 / 1.000 / 0.999 | 0.958 / 0.920 / 0.961 |
| | 1000 | 0.049 / 0.050 / 0.050 | 1.000 / 1.000 / 1.000 | 1.000 / 1.000 / 1.000 | 1.000 / 0.998 / 1.000 |
| $p 7$ | 100 | 0.048 / 0.047 / 0.048 | 0.577 / 0.569 / 0.573 | 0.699 / 0.688 / 0.697 | 0.275 / 0.269 / 0.276 |
| | 250 | 0.047 / 0.048 / 0.048 | 0.925 / 0.924 / 0.924 | 0.973 / 0.972 / 0.973 | 0.640 / 0.636 / 0.641 |
| | 500 | 0.049 / 0.049 / 0.049 | 0.998 / 0.998 / 0.998 | 1.000 / 1.000 / 1.000 | 0.918 / 0.916 / 0.918 |
| | 1000 | 0.050 / 0.050 / 0.050 | 1.000 / 1.000 / 1.000 | 1.000 / 1.000 / 1.000 | 0.998 / 0.998 / 0.998 |

## References

1. Agresti, Alan. 2002. Categorical Data Analysis, 2nd ed. Hoboken: John Wiley & Sons, Inc.
2. Billingsley, Patrick. 1999. Convergence of Probability Measures, 2nd ed. New York: John Wiley & Sons, Inc.
3. Blyth, Colin R. 1959. Note on estimating information. Annals of Mathematical Statistics 30: 71–79.
4. Cohen, Jacob. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20: 37–46.
5. Craig, Wallace. 1943. The Song of the Wood Pewee, Myiochanes virens Linnaeus: A Study of Bird Music. New York State Museum Bulletin No. 334. Albany: University of the State of New York.
6. GESIS—Leibniz Institute for the Social Sciences. 2018. ALLBUS 1980–2016 (German General Social Survey). ZA4586 Data File (Version 1.0.0). Cologne: GESIS Data Archive. (In German)
7. Hancock, Gwendolyn D’Anne. 2012. VIX and VIX futures pricing algorithms: Cultivating understanding. Modern Economy 3: 284–94.
8. Jacobs, Patricia A., and Peter A. W. Lewis. 1983. Stationary discrete autoregressive-moving average time series generated by mixtures. Journal of Time Series Analysis 4: 19–36.
9. Kvålseth, Tarald O. 1995. Coefficients of variation for nominal and ordinal categorical data. Perceptual and Motor Skills 80: 843–47.
10. Kvålseth, Tarald O. 2011a. The lambda distribution and its applications to categorical summary measures. Advances and Applications in Statistics 24: 83–106.
11. Kvålseth, Tarald O. 2011b. Variation for categorical variables. In International Encyclopedia of Statistical Science. Edited by Miodrag Lovric. Berlin: Springer, pp. 1642–45.
12. Lad, Frank, Giuseppe Sanfilippo, and Gianna Agro. 2015. Extropy: Complementary dual of entropy. Statistical Science 30: 40–58.
13. Love, Eric Russell. 1980. Some logarithm inequalities. Mathematical Gazette 64: 55–57.
14. Rao, C. Radhakrishna. 1982. Diversity and dissimilarity coefficients: A unified approach. Theoretical Population Biology 21: 24–43.
15. Shannon, Claude Elwood. 1948. A mathematical theory of communication. Bell System Technical Journal 27: 379–423, 623–56.
16. Shorrocks, Anthony F. 1978. The measurement of mobility. Econometrica 46: 1013–24.
17. Tan, Wai-Yuan. 1977. On the distribution of quadratic forms in normal random variables. Canadian Journal of Statistics 5: 241–50.
18. Weiß, Christian H. 2011. Empirical measures of signed serial dependence in categorical time series. Journal of Statistical Computation and Simulation 81: 411–29.
19. Weiß, Christian H. 2013. Serial dependence of NDARMA processes. Computational Statistics and Data Analysis 68: 213–38.
20. Weiß, Christian H. 2018. An Introduction to Discrete-Valued Time Series. Chichester: John Wiley & Sons, Inc.
21. Weiß, Christian H., and Rainer Göb. 2008. Measuring serial dependence in categorical time series. AStA Advances in Statistical Analysis 92: 71–89.
1. The “zero problem” for $κ * ( h )$ described after Equation (4) occurred mainly for $n = 100$ and for distributions with low dispersion such as $p 2$ to $p 4$, in about 0.5% of the i.i.d. simulation runs. It became more frequent with positive dependence, reaching about 2% of the DAR(1) simulation runs. The problem was circumvented by replacing all affected summands by 0.
Figure 1. Normalized dispersion measures for Lambda distribution $L m ( λ )$ against $λ$: $ν Ex , ν G$ in (a), $ν En$ in (b), comparison for $m = 10$ in (c).
Figure 2. ALLBUS data from Section 4.2: (a) dispersion measures; (b) extropy with 95% CIs; and (c) relative frequencies; plotted against year of survey.
Figure 3. Fear states time series plot and PMF (top); and plots of $κ ^ ( h ) , κ ^ * ( h ) , κ ^ ⋆ ( h )$ (bottom).
Figure 4. Wood Pewee time series: rate evolution graph and PMF plot (top); and plots of $κ ^ ( h ) , κ ^ * ( h ) , κ ^ ⋆ ( h )$ (bottom).
Table 1. Marginal distributions considered in Section 4.1 together with the corresponding dispersion values.
| PMF | $ν G$ | $ν En$ | $ν Ex$ |
|---|---|---|---|
| $p 1 = ( 0.2 , 0.2 , 0.25 , 0.35 ) ⊤$ | 0.980 | 0.979 | 0.988 |
| $p 2 = ( 0.05 , 0.1 , 0.15 , 0.7 ) ⊤$ | 0.633 | 0.660 | 0.745 |
| $p 3 = ( 0.2 , 0.15 , 0.05 , 0.6 ) ⊤$ | 0.767 | 0.767 | 0.848 |
| $p 4 = ( 0.8125 , 0.0625 , 0.0625 , 0.0625 ) ⊤$ | 0.438 | 0.497 | 0.574 |
| $p 5 = ( 0.625 , 0.125 , 0.125 , 0.125 ) ⊤$ | 0.750 | 0.774 | 0.832 |
| $p 6 = ( 0.4375 , 0.1875 , 0.1875 , 0.1875 ) ⊤$ | 0.938 | 0.940 | 0.961 |
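The values in Table 1 can be reproduced with a minimal sketch of the three normalized dispersion measures, assuming the usual normalizations under which the uniform distribution over $m + 1$ categories attains the value 1 (Gini index scaled by $( m + 1 ) / m$, entropy by $ln ( m + 1 )$, extropy by $m ln ( ( m + 1 ) / m )$); the function names are illustrative, not from the article:

```python
import math

def nu_gini(p):
    # Normalized Gini index: (m+1)/m * (1 - sum_j p_j^2)
    m = len(p) - 1
    return (m + 1) / m * (1 - sum(pj ** 2 for pj in p))

def nu_entropy(p):
    # Normalized entropy: -sum_j p_j ln(p_j) / ln(m+1), with 0*ln(0) := 0
    m = len(p) - 1
    return -sum(pj * math.log(pj) for pj in p if pj > 0) / math.log(m + 1)

def nu_extropy(p):
    # Normalized extropy: -sum_j (1-p_j) ln(1-p_j) / (m ln((m+1)/m))
    m = len(p) - 1
    num = -sum((1 - pj) * math.log(1 - pj) for pj in p if pj < 1)
    return num / (m * math.log((m + 1) / m))

p1 = (0.2, 0.2, 0.25, 0.35)
print(round(nu_gini(p1), 3), round(nu_entropy(p1), 3), round(nu_extropy(p1), 3))
# 0.98 0.979 0.988, i.e. the row of p1 in Table 1
```

Applying the same functions to $p 2 , … , p 6$ reproduces the remaining rows of the table.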
Table 2. Definition of fear states for Volatility Index (VIX) according to Hancock (2012).
| State | Explanation | VIX |
|---|---|---|
| $s 0$ | extreme complacency | $[ 0 ; 10 )$ |
| $s 1$ | very low anxiety = high complacency | $[ 10 ; 15 )$ |
| $s 2$ | low anxiety = moderate complacency | $[ 15 ; 20 )$ |
| $s 3$ | moderate anxiety = low complacency | $[ 20 ; 25 )$ |
| $s 4$ | moderately high anxiety | $[ 25 ; 30 )$ |
| $s 5$ | high anxiety | $[ 30 ; 35 )$ |
| $s 6$ | very high anxiety | $[ 35 ; 40 )$ |
| $s 7$ | extremely high anxiety | $[ 40 ; 45 )$ |
| $s 8$ | near panic | $[ 45 ; 50 )$ |
| $s 9$ | moderate panic | $[ 50 ; 55 )$ |
| $s 10$ | panic | $[ 55 ; 60 )$ |
| $s 11$ | intense panic | $[ 60 ; 65 )$ |
| $s 12$ | extreme panic | $[ 65 ; 100 ]$ |
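The discretization of VIX values into the fear states of Hancock (2012) amounts to a lookup against the interval boundaries above; a minimal illustrative helper (the function name and implementation are this sketch's, not from the article):

```python
import bisect

# Upper boundaries of the half-open bins for s_0, ..., s_11;
# any value from 65 up to 100 falls into the last state s_12.
_EDGES = [10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65]

def fear_state(vix):
    """Map a VIX value in [0; 100] to its fear-state index 0..12."""
    if not 0 <= vix <= 100:
        raise ValueError("VIX value outside [0; 100]")
    # bisect_right puts a value equal to an edge into the bin to its right,
    # matching the half-open intervals [lower; upper) of Table 2
    return bisect.bisect_right(_EDGES, vix)
```

For example, `fear_state(12)` yields 1 (very low anxiety) and `fear_state(67)` yields 12 (extreme panic), so a numerical VIX series becomes the ordinal fear-states series analyzed in Section 4.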
