# Entropy-Based Tests for Complex Dependence in Economic and Financial Time Series with the `R` Package `tseriesEntropy`


## Abstract

This paper presents the `R` package `tseriesEntropy`, dedicated to testing for serial/cross dependence and nonlinear serial dependence in time series, based on the entropy metric ${S}_{\rho}$. The package implements tests for both continuous and categorical data. The nonparametric tests, based on ${S}_{\rho}$, rely on minimal assumptions and have also been shown to be powerful for small sample sizes. The measure can be used as a nonlinear auto/cross-dependence function, either as an exploratory tool or as a diagnostic measure, if computed on the residuals from a fitted model. Different null hypotheses of either independence or linear dependence can be tested by means of resampling methods, backed up by a sound theoretical background. We showcase our methods on a panel of commodity price time series. The results hint at the presence of a complex dependence in the conditional mean, together with conditional heteroskedasticity, and indicate the need for an appropriate nonlinear specification.

## 1. Introduction

This paper presents the `R` package `tseriesEntropy` [12], which implements and extends these results and provides user-friendly routines, together with plotting and summary capabilities, so that the measure can be used as a drop-in replacement for the overused correlograms. Most tests can be applied to both continuous and categorical data. We describe the theoretical background underlying inference and testing with the entropy-based metric ${S}_{\rho}$. Then, we focus on describing in detail all the routines of the `tseriesEntropy` package by means of examples and code snippets that can be used to reproduce exactly some of the results of the paper. Different null hypotheses of either independence or linear dependence can be tested, and the tests can be used both as exploratory tools and as diagnostic measures, if computed on the residuals from a fitted model. We also illustrate the practical usage of the package on a panel of time series of commodities.

Besides presenting `tseriesEntropy`, together with providing a sketch of the underlying theoretical background, we provide a selective review of the software libraries dedicated to the theme of testing for serial/cross dependence in time series. We also mention some of the packages that are not explicitly dedicated to time series, but which implement recent notable theoretical results, especially in the multivariate case. The review is by no means exhaustive: we have selected those packages that appear to rely upon a sound theoretical background, with available mathematical results on the validity of the associated inferences. The package `np` [13] contains bootstrap tests for serial and pairwise independence, based on the metric entropy ${S}_{\rho}$. These are implemented in the functions `npsdeptest` and `npdeptest`, respectively. The measure is the same we use in `tseriesEntropy`, which implements a similar test in its function `Srho.test.ts` and encompasses both tests. The package `NTS` [14] contains some tests for threshold nonlinearity and lack of fit. The package `testcorr` [15] contains functions that implement robust tests based on auto/cross-correlation functions and tests for serial independence. Weighted portmanteau tests for goodness-of-fit and serial correlation, based on the trace of the square of the autocorrelation matrix, are implemented in the package `WeightedPortTest` [16]. There, a gamma-based approximation is used to derive the asymptotic null distribution of the test statistics. The package `SDD` [17] implements bootstrap tests for serial independence, based on generalized divergence functionals that include, as a special case, the Hellinger distance. The authors use a nonparametric kernel density estimator for the densities, based upon Gaussian kernels; the divergence measures are then approximated by summation over a finite grid of values. The null distribution is obtained through permutation. The core function `ADF` also implements a serial independence test based on grouping values in a contingency table and then using Pearson's Chi-squared statistic. The package `dCovTS` [18] includes tests for pairwise/multivariate dependence in time series, based on the distance covariance/correlation function. The null distribution is approximated through either the iid or the wild bootstrap scheme. A portmanteau diagnostic test for vector autoregressive moving average (VARMA) models, based on the determinant of the standardized multivariate residual autocorrelations, is implemented in the `portes` package [19]. The package `tsextreme` [20] characterises the extreme dependence structure of time series through Bayesian methods. The package `extremogram` [21] implements permutation tests for serial and cross independence based on the extremogram. The package `copula` [22] contains tests of serial and multivariate independence, based on the empirical copula process. Finally, the package `tseries` [23], probably the first `R` package dedicated to time series to have appeared on CRAN, implements two neural network tests for nonlinearity in the mean, either in a single series or in a bivariate (regression) framework. We mention them even though they are not directly based upon the idea of measuring serial/cross dependence. Both tests are asymptotic.

Besides the `R` packages specifically dedicated to time series, a number of packages propose tests for independence/goodness-of-fit through diverse approaches. The packages `wdm` [24] and `testforDEP` [25] implement several measures of dependence and the associated tests for bivariate/multivariate independence. The package `LIStest` [26] implements a test for bivariate independence for continuous data, based on the longest increasing subsequence. The package `USP` [27] implements various independence tests for discrete, continuous, and infinite-dimensional data; these are permutation tests based on U-statistics. The package `IndepTest` [28] provides implementations of the weighted Kozachenko–Leonenko entropy estimator and permutation tests of independence based on it. The package `dHSIC` [29] contains an implementation of the d-variable Hilbert–Schmidt multivariate independence criterion and several hypothesis tests based on it. A similar test is also implemented in the package `EDMeasure` [30], together with several other tests based upon measures of mutual dependence and conditional mean dependence. Multivariate independence tests, based on the notion of distance multivariance, are implemented in the package `multivariance` [31]. The package `steadyICA` [32] implements a similar set of tests, relying instead on the notion of distance covariance. A test for conditional univariate/multivariate independence, based on the generalized covariance measure, is implemented in the package `GeneralisedCovarianceMeasure` [33].

The paper is organized as follows. Section 2 presents the measure ${S}_{\rho}$ and the routines that implement it; the S4 class `Srho` is also introduced and briefly illustrated. In Section 3 we introduce the routines to test for serial/cross independence with ${S}_{\rho}$. As in Section 2, there are separate routines for continuous and discrete/categorical time series, and we also describe the S4 class `Srho.test`, designed to work with all the tests based upon ${S}_{\rho}$. Section 4 describes the theoretical background and the routines dedicated to testing for nonlinear serial dependence in time series. In particular, Section 4.1 illustrates the routines that implement the test whose null hypothesis is that of a linear Gaussian random process; the null distribution is based on surrogate data and Simulated Annealing. The test whose null hypothesis is that of a generic linear process (not necessarily Gaussian) is described in Section 4.2; in this case, the null distribution is derived by means of a smoothed sieve bootstrap scheme. Finally, in Section 5 we present an application of the tests to a panel of four monthly commodity price time series.

## 2. The Measure ${\mathbf{S}}_{\rho}$ for Serial and Cross Dependence
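Recall from [10] that, for a pair of continuous random variables X and Y with joint density ${f}_{XY}$ and marginal densities ${f}_{X}$ and ${f}_{Y}$, the measure is one half of the squared Hellinger distance between the joint density and the product of the marginals:

$$S_{\rho}=\frac{1}{2}\int \int {\left({f}_{XY}^{1/2}(x,y)-{f}_{X}^{1/2}(x)\,{f}_{Y}^{1/2}(y)\right)}^{2}\,dx\,dy,$$

so that ${S}_{\rho}\in [0,1]$ and ${S}_{\rho}=0$ if and only if X and Y are independent.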

**Proposition 1.** *Let $({X}_{t},{Y}_{t+k})\sim N(0,1,{\rho}_{k})$ be a standard Normal random vector with correlation coefficient ${\rho}_{k}$. Then ${S}_{\rho}$ is a function of ${\rho}_{k}$ alone.*

The package `tseriesEntropy` implements the measure for both continuous and categorical data.

#### 2.1. Continuous State Space Time Series

The function `Srho.ts` implements the nonparametric estimator of Equation (6). Its arguments `x` and `y` are numeric vectors/time series. If `y` is not missing, then the function computes the entropy measure ${S}_{k}$ between ${X}_{t}$ and ${Y}_{t+k}$, where the lag k ranges from `-lag.max` to `lag.max`. As a simple illustration, we generate a time series `x` of 50 observations from an AR(1) process and induce a nonlinear dependence at lag 1 in the series `y`.
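The snippet below is a minimal sketch of such an example, assuming the package is installed; the AR coefficient and the quadratic link at lag 1 are illustrative assumptions, not the exact code of the paper:

```r
# Illustrative sketch: x from an AR(1), y nonlinearly dependent on x at lag 1.
set.seed(13)
x <- as.numeric(arima.sim(n = 50, model = list(ar = 0.8)))  # AR(1), phi = 0.8 (assumed)
y <- c(0, x[1:49]^2) + rnorm(50, sd = 0.1)                  # y_t = x_{t-1}^2 + noise (assumed)
if (requireNamespace("tseriesEntropy", quietly = TRUE)) {   # run only if the package is available
  est <- tseriesEntropy::Srho.ts(x, y, lag.max = 3, plot = FALSE)  # S_k for k = -3, ..., 3
}
```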

The plot can be suppressed by setting `plot = FALSE`. The choice of the kernel functions in the nonparametric estimator of Equation (6) has a limited impact; the kernel is taken to be Gaussian for the univariate densities and the product of two Gaussians for the bivariate density. As for bandwidth selection, `tseriesEntropy` implements several options, some of which rely on the package `ks` [36]. They are controlled through the option `bw` and are presented in Table 1. If `bw` is set to `reference` or `mlcv`, then the bandwidth matrix for estimating the bivariate density is diagonal, which implies a spherical Gaussian kernel. The methods that rely on the package `ks`, namely `lscv`, `scv`, and `pi`, can use either a diagonal or an unstructured bandwidth matrix through the option `bdiag`. If `bdiag = TRUE` (the default), then a diagonal matrix is used. This option was introduced in version 0.7-0.

Numerical integration is performed through the adaptive routine `hcubature` of the package `cubature` [41]. The maximum tolerance `tol` is passed to `hcubature` and usually there is no need to change its default value. The option `method = "summation"` selects an alternative estimator based on summation. As also remarked in [10], the estimator based upon adaptive integration is generally preferable. If `y` is missing, then `Srho.ts` computes the serial version of the measure. This is shown in the next example, where we compute ${S}_{k}$, with the likelihood cross-validation bandwidth selector, on a realization from an MA(1) process:

#### 2.2. Categorical/Discrete State Space Time Series

For discrete/categorical data, the measure is implemented in the function `Srho`. The main difference with respect to `Srho.ts` lies in the option `nor`, introduced to deal with the effects of the margins. While the measure, based on the distance between densities, is free from the effects of the marginal probability distributions, this is not the case with discrete/categorical data, so that the maximum attainable value of the measure is not 1 but depends upon the marginal probabilities. As pointed out below, this has no practical consequences if ${S}_{k}$ is used in hypothesis testing. However, if the actual value of the measure matters, as is the case when, for instance, one compares the level of dependence of different series, then the option `nor = TRUE` normalizes the measure against its maximum theoretically attainable level, so that the actual range is the interval $[0,1]$, as it should be. This effect is illustrated in the following example, where we generate 1000 random variates from a discrete uniform distribution on the first 5 integers and correlate the sequence with itself, so that we should observe perfect dependence at lag 0.

With `nor = TRUE`, the normalized measure correctly reaches 1 at lag zero (right panel).
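A sketch of this example, assuming the argument names described in the text (`nor`, `plot`) and that the package is installed:

```r
# Illustrative sketch: discrete uniform sequence correlated with itself,
# so that the normalized measure should equal 1 at lag 0.
set.seed(42)
x <- sample(1:5, size = 1000, replace = TRUE)
if (requireNamespace("tseriesEntropy", quietly = TRUE)) {  # run only if available
  S.raw <- tseriesEntropy::Srho(x, x, lag.max = 2, nor = FALSE, plot = FALSE)
  S.nor <- tseriesEntropy::Srho(x, x, lag.max = 2, nor = TRUE,  plot = FALSE)
}
```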

The option `stationary = TRUE` assumes stationarity, so that the marginal probabilities are estimated on the whole sample; this leads to more efficient estimators, even if the effect is negligible in large samples.

#### 2.3. The S4 Classes `Srho-class` and `Srho.ts-class`

The S4 class `Srho-class` and its extension `Srho.ts-class` are designed to store and manage the results coming from `Srho` and `Srho.ts`, respectively. They are equipped with `show` and `plot` methods. The `plot` method allows fine tuning and customization, as shown in Figure 5.

## 3. Tests for Serial and Cross Dependence

The package `tseriesEntropy` offers specialized functions for testing for serial and cross dependence. The entropy measure ${S}_{k}$ has been shown to provide powerful tests that overcome many of the issues of the auto- and cross-correlation functions [10]. They can be used both as exploratory tools to investigate the dependence structure of time series and as diagnostic tools to assess the presence of residual dependence in a fitted model. Being based upon a nonparametric estimator, the measure is model-free and able to detect departures from independence in any possible direction. Given two time series of size n, realizations of the stationary random processes ${X}_{t}$ and ${Y}_{t}$, we test the null hypothesis that ${X}_{t}$ and ${Y}_{t+k}$ are independent, for each `k` ranging in `[-lag.max, lag.max]`. The distribution of the test statistic ${S}_{k}$ under ${H}_{0}$ is obtained by resampling/permutation.

#### 3.1. Tests for Continuous Time Series

The tests are implemented in the function `Srho.test.ts`. Besides the arguments of `Srho.ts`, the user needs to specify the number `B` of bootstrap resamples used to build the distribution of the test statistic under the null hypothesis. As before, if `y` is missing, the routine tests for serial dependence in ${X}_{t}$. We illustrate this in the following example, where we compute ${S}_{k}$ on `w` and `x`, realizations of a Gaussian white noise and of an AR(1) process, respectively. For convenience, we use the parallel version of the routine, `Srho.test.ts.p`, with `B = 40`. In practice, the choice of `B` depends upon the experimenter; in general, 100 replications are enough to get a rough idea of the result, and the number can be increased for a finer assessment of the significance level.
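A hedged sketch of the example (the white-noise variance and the AR coefficient are illustrative assumptions; the package must be installed):

```r
# Illustrative sketch: serial independence tests on white noise w and an AR(1) x.
set.seed(11)
w <- rnorm(100)                                              # Gaussian white noise
x <- as.numeric(arima.sim(n = 100, model = list(ar = 0.8)))  # AR(1), phi = 0.8 (assumed)
if (requireNamespace("tseriesEntropy", quietly = TRUE)) {    # run only if available
  tw <- tseriesEntropy::Srho.test.ts.p(w, lag.max = 5, B = 40, plot = FALSE)  # no rejection expected
  tx <- tseriesEntropy::Srho.test.ts.p(x, lag.max = 5, B = 40, plot = FALSE)  # rejection expected
}
```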

The output plot is similar to that of `Srho.ts`, but rejection bands at the levels specified by `quant` are added; by default, these are 95% (green dashed line) and 99% (blue dashed line). No lag of ${S}_{k}$ exceeds the rejection bands for the white noise `w` (left panel). As for the AR(1) series `x`, the test statistic correctly points to the presence of dependence at the 5 lags. In its serial version, the null distribution of ${S}_{k}$ is obtained by random permutation, as also put forward in [10].

`Srho.test.ts.p` is the parallel version of the routine and relies on the `parallel` package. The user only needs to specify the number of workers through the option `nwork`; by default, all the available cores are used.

The resampling scheme used in the cross version of the test is controlled through the option `ci.type`. First, we generate two independent realizations of the same AR(1) process and test for cross dependence between `x` and `y`. Since the two time series are independent realizations, the test should not reject the null hypothesis at any lag, but the presence of strong serial dependence in the time series affects the results. This happens with a simple random permutation of the series (`ci.type = "perm"`), which destroys the serial correlation of the two series and biases the result of the test, since the variance of the null distribution of the test statistic depends upon the autocorrelation of the series. One way to overcome this problem would be to prewhiten the series before applying the cross-entropy test (see, e.g., [42,43]). Such an approach has its limits, in that it is tailored for testing with the cross-correlation function, and its use with general measures of dependence is questionable. The solution adopted in `Srho.test.ts` is to resample the series by means of a moving block bootstrap that preserves the serial dependence structure of the series [44]. This is selected by setting `ci.type = "mbb"`, which is the default when `y` is not missing. The block length is equal to `lag.max`. The right panel correctly shows that no lag of the cross-entropy ${S}_{k}$ exceeds the rejection bands at 99%.

#### 3.2. Tests for Discrete/Categorical Time Series

For discrete/categorical data, the test is implemented in the function `Srho.test`. Its arguments parallel those of `Srho.test.ts`; moreover, for the cross-entropy version, only the permutation option is available; the moving block bootstrap will be implemented in a future version of the package. In the next example, we generate `y`, a binary sequence correlated at lag 2, by thresholding over the origin a realization `x` of a MA process:
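A sketch of this construction (the MA coefficients are illustrative assumptions; the package must be installed):

```r
# Illustrative sketch: binary series correlated at lag 2, obtained by
# thresholding a MA(2) realization at the origin (MA coefficients assumed).
set.seed(101)
x <- as.numeric(arima.sim(n = 200, model = list(ma = c(0, 0.9))))
y <- as.integer(x > 0)                 # 0/1 sequence, dependent at lag 2
if (requireNamespace("tseriesEntropy", quietly = TRUE)) {  # run only if available
  out <- tseriesEntropy::Srho.test(y, lag.max = 5, B = 40, plot = FALSE)
}
```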

**Figure 8.** Serial entropy ${S}_{k}$ for $k=1,\dots ,5$ (black, solid line), computed on a correlated binary time series obtained by discretizing a MA(2) process. The rejection bands at 95% (green dashed line) and 99% (blue dashed line) correspond to the null hypothesis of serial independence.

#### 3.3. The S4 Class `Srho.test-class`

The S4 class `Srho.test-class` extends `Srho-class` to deal with the results coming from all the tests implemented in the package. The `show` method prints both the significant lags at the levels set by `quant` and the p-values of the test at each lag. As shown in the previous examples, the `plot` method adds the rejection bands of the tests, computed on the null distribution at the levels specified in `quant`.

## 4. Tests for Nonlinear Serial Dependence

The routines of `tseriesEntropy` implement the tests for nonlinear serial dependence introduced in [11], where the formal definition of linear processes is discussed along the lines of [45]. In particular, the null hypothesis assumes that the data generating process $\left\{{X}_{t}\right\}$ follows a zero-mean AR$(\infty )$ as follows:

#### 4.1. Tests for Linear Gaussian Dependence with Surrogate Data

Surrogate time series are generated through the function `surrogate.SA`. Its arguments include the series `x`, the number of lags of the autocorrelation to be matched by the surrogates (`nlag`), and the number of surrogates `nsurr`. The remaining parameters pertain to the Simulated Annealing algorithm and are further discussed below in Section 4.1.1. We illustrate the use of the routine in the following example: first, we generate the original time series `x` from an AR(1) process; then, we generate 2 surrogates and plot all the series, see Figure 9. The autocorrelations of the surrogates match that of `x` (up to the tolerance `eps.SA`); here, `eps.SA = 0.05`.
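A minimal sketch of this example, with an assumed AR coefficient (the package must be installed):

```r
# Illustrative sketch: two surrogates of an AR(1) series that match its
# autocorrelation up to nlag lags, with target tolerance eps.SA = 0.05.
set.seed(2)
x <- as.numeric(arima.sim(n = 100, model = list(ar = 0.8)))
if (requireNamespace("tseriesEntropy", quietly = TRUE)) {  # run only if available
  surr <- tseriesEntropy::surrogate.SA(x, nlag = 5, nsurr = 2, eps.SA = 0.05)
}
```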

The functions `Trho.test.SA`, and its parallel version `Trho.test.SA.p`, implement the test statistic ${T}_{k}$ together with the surrogate data approach based on Simulated Annealing. The null hypothesis tested is that of Equation (20). The arguments are the same as those of `Srho.ts` or `surrogate.SA`, so they are not discussed again. Note that, even if the syntax allows for a bivariate version of the test, this has not been implemented yet: it requires extending the theory put forward in [11] and is the subject of future investigations. We illustrate the typical usage of the routine in the following example, where we generate two realizations of a linear MA(1) process: `x1` has Gaussian innovations, whereas `x2` has Student's t innovations with 3 degrees of freedom. In both cases, in order to expedite the computations, the number of surrogates was set to 40 and the target criterion to 0.1.

The test correctly does not reject the null hypothesis of linear Gaussian dependence for `x1` (left panel), while it rejects it for `x2` (right panel). Note that, as suggested in [11], the `mlcv` bandwidth selector was used, as it gives the best performance in conjunction with ${\widehat{T}}_{k}$. The Simulated Annealing algorithm for generating surrogate time series is discussed in some detail in the next section.
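A hedged sketch of the data generation and the calls (the MA coefficient is an illustrative assumption, and the surrogate-related options are left at their defaults rather than the values used in the paper; the package must be installed):

```r
# Illustrative sketch: MA(1) series with Gaussian vs Student's t(3) innovations;
# the test for linear Gaussian dependence should reject only for x2.
set.seed(31)
x1 <- as.numeric(arima.sim(n = 100, model = list(ma = 0.8)))
x2 <- as.numeric(arima.sim(n = 100, model = list(ma = 0.8),
                           rand.gen = function(n) rt(n, df = 3)))
if (requireNamespace("tseriesEntropy", quietly = TRUE)) {  # run only if available
  t1 <- tseriesEntropy::Trho.test.SA.p(x1, lag.max = 5, bw = "mlcv", plot = FALSE)
  t2 <- tseriesEntropy::Trho.test.SA.p(x2, lag.max = 5, bw = "mlcv", plot = FALSE)
}
```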

#### 4.1.1. Generating Surrogate Time Series with Simulated Annealing

Let $\mathrm{x}=({x}_{1},\dots ,{x}_{n})$ denote the observed series of length `n` and let ${\widehat{\rho}}_{k}$, $k\in \mathbb{N}$, be its sample autocorrelation function. Denote by ${\mathrm{x}}^{*}$ the candidate surrogate, with autocorrelation function ${\widehat{\rho}}_{k}^{*}$. The cost function $C(\mathrm{x},{\mathrm{x}}^{*})$ measures the discrepancy between ${\widehat{\rho}}_{k}$ and ${\widehat{\rho}}_{k}^{*}$ over the first `nlag` lags. The algorithm starts with the temperature `Te =` T and with ${\mathrm{x}}^{*}$ set to a random permutation of the original series $\mathrm{x}$. For each temperature value T:

- swap two observations of ${\mathrm{x}}^{*}$ and obtain the series ${\mathrm{x}}^{*\left(s\right)}$;
- compute $\mathsf{\Delta}C=C(\mathrm{x},{\mathrm{x}}^{*\left(s\right)})-C(\mathrm{x},{\mathrm{x}}^{*})$;
- if $\mathsf{\Delta}C<0$, accept the swap, that is, set ${\mathrm{x}}^{*}={\mathrm{x}}^{*\left(s\right)}$; if $\mathsf{\Delta}C\geqslant 0$, accept the swap with probability $p=\exp (-\mathsf{\Delta}C/T)$;
- repeat steps (1)–(3) until either the number of accepted swaps reaches `nsuccmax` × `n` or the number of trials reaches `nmax` × `n`;
- lower the temperature T, for instance by setting $T=\alpha T$, where $\alpha$ `= RT` $<1$;
- repeat the whole procedure until the cost function falls below the specified threshold `eps.SA`.
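The loop above can be sketched in a few lines of base `R`. This is a simplified illustration, not the `Fortran` implementation of the package: it uses the mean absolute discrepancy of the first `nlag` autocorrelations as cost and a single bounded stopping rule.

```r
# Simplified simulated-annealing surrogate generator (illustration only).
surrogate_sa <- function(x, nlag = 5, Te = 0.001, RT = 0.9,
                         eps = 0.05, max_sweeps = 50) {
  n    <- length(x)
  rho  <- acf(x, nlag, plot = FALSE)$acf[-1]       # target autocorrelations
  cost <- function(xs) mean(abs(acf(xs, nlag, plot = FALSE)$acf[-1] - rho))
  xs <- sample(x)                                  # start: random permutation
  C  <- cost(xs)
  for (sweep in seq_len(max_sweeps)) {
    for (i in seq_len(n)) {                        # n trial swaps per temperature
      idx <- sample.int(n, 2)
      xt  <- xs; xt[idx] <- xt[rev(idx)]           # swap two observations
      dC  <- cost(xt) - C
      if (dC < 0 || runif(1) < exp(-dC / Te)) {    # Metropolis acceptance rule
        xs <- xt; C <- C + dC
      }
    }
    if (C < eps) break                             # cost below threshold: stop
    Te <- RT * Te                                  # lower the temperature
  }
  xs
}
set.seed(1)
x <- as.numeric(arima.sim(n = 100, model = list(ar = 0.7)))
s <- surrogate_sa(x, nlag = 3)                     # a permutation of x, ACF-matched
```

By construction, a surrogate generated this way has exactly the same marginal distribution as `x` (it is a permutation of the data), while its autocorrelations are driven towards those of `x`.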

| Parameter | Value | Description |
| --- | --- | --- |
| `Te` | 0.001 | initial temperature |
| `RT` | 0.9 | reduction factor for `Te` |
| `eps.SA` | 0.05 | threshold |
| `nsuccmax` | 30 | `Te` is decreased after `nsuccmax` × n successes |
| `nmax` | 300 | `Te` is decreased after `nmax` × n trials |
| `che` | 1 × ${10}^{5}$ | after `che` × 2n global iterations the algorithm starts again |

A sensible choice of `eps.SA` should depend upon the sample size `n`, and one can try increasing it to speed up the computations.

#### 4.2. Tests for General Nonlinear Serial Dependence

- Given a time series $({x}_{1},\dots ,{x}_{n})$, fit an AR$\left(p\right)$ model upon it:$${X}_{t}=\sum _{i=1}^{p}{\varphi}_{i}{X}_{t-i}+{\epsilon}_{t};$$
- derive the centered residuals ${\widehat{\epsilon}}_{t}^{c}$:$${\widehat{\epsilon}}_{t}^{c}={\widehat{\epsilon}}_{t}-{n}^{-1}\sum _{t=1}^{n}{\widehat{\epsilon}}_{t},\quad \mathrm{where}\quad {\widehat{\epsilon}}_{t}={x}_{t}-\sum _{i=1}^{p}{\widehat{\varphi}}_{i}\left(p\right){x}_{t-i},\quad t=p+1,\dots ,n;$$
- compute the kernel density estimate of ${\widehat{\epsilon}}_{t}^{c}$:$${\widehat{f}}_{{\epsilon}_{t}}\left(\epsilon \right)=\frac{1}{n-p}\sum _{t=p+1}^{n}\frac{1}{h}K\left(\frac{\epsilon -{\widehat{\epsilon}}_{t}^{c}}{h}\right);$$
- draw the bootstrap innovations ${\widehat{\epsilon}}_{t}^{*}$ from the kernel density estimate:$${\widehat{\epsilon}}_{t}^{*}\sim \mathrm{i}.\mathrm{i}.\mathrm{d}.\phantom{\rule{4.pt}{0ex}}{\widehat{f}}_{{\epsilon}_{t}}\left(x\right)dx;$$
- obtain the bootstrapped time series ${x}_{1}^{*},\dots ,{x}_{n}^{*}$ according to$${x}_{t}^{*}=\sum _{i=1}^{p}{\widehat{\varphi}}_{i}\left(p\right){x}_{t-i}^{*}+{\widehat{\epsilon}}_{t}^{*},\quad t=-Q,\dots ,n,$$where the first Q observations serve as burn-in and are discarded;
- repeat steps (4)–(5) B times.
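The steps above can be sketched in base `R`. This is a simplified stand-in for `surrogate.ARs`, under illustrative choices (Gaussian kernel jittering for the smoothed draw, rule-of-thumb bandwidth, Yule–Walker fitting):

```r
# Simplified smoothed sieve bootstrap (illustration of the steps above).
sieve_boot <- function(x, Q = 50) {
  n   <- length(x)
  fit <- ar(x, order.max = 10, method = "yw")       # step 1: AR(p) selected by AIC
  phi <- if (fit$order > 0) fit$ar else 0           # guard against an AR(0) fit
  res <- as.numeric(na.omit(fit$resid))
  res <- res - mean(res)                            # step 2: centered residuals
  h   <- bw.nrd0(res)                               # step 3: kernel bandwidth
  eps <- sample(res, n + Q, replace = TRUE) +       # step 4: draws from the KDE =
         rnorm(n + Q, sd = h)                       #   resampled residuals + jitter
  xs  <- filter(eps, phi, method = "recursive")     # step 5: rebuild the series
  as.numeric(tail(xs, n))                           # discard the burn-in of length Q
}
set.seed(3)
x  <- as.numeric(arima.sim(n = 200, model = list(ar = 0.6)))
xb <- sieve_boot(x)                                 # one bootstrap replicate
```

Repeating the last call B times yields the bootstrap null distribution of the statistic.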

The scheme is implemented in the function `surrogate.ARs`, which uses `stats::ar` to fit the AR model; the arguments `order.max` and `fit.methods` are passed to it. The default options ensure that the best AR$\left(p\right)$ model, with p ranging from 1 to `order.max`, is selected by means of the AIC, and that `order.max` depends upon the length of the series. The following example illustrates the usage of the routine; the results are presented in Figure 11.

**Figure 11.** Time series from an AR(1) process, together with two surrogate/resampled time series generated through the smoothed sieve bootstrap scheme.

The package also provides the function `surrogate.AR`, which implements the standard sieve bootstrap [51].

The functions `Srho.test.AR` and `Srho.test.AR.p` implement the test statistic ${\widehat{S}}_{k}$ together with the sieve bootstrap scheme. The null hypothesis tested is that of Equation (21).

The option `smoothed` selects either the smoothed sieve scheme of `surrogate.ARs` or the standard sieve of `surrogate.AR`; the remaining options have already been discussed above. According to the results of [11], this is the most powerful and flexible test, and its use is recommended in conjunction with the `reference` bandwidth selection criterion. We show its use on a time series from a nonlinear moving average process with nonlinear dependence at lag k.

The process is generated through the `nlma` function: we generate a series `x` of 50 observations from `nlma` and show the inability of the autocorrelation-based analysis to detect the nonlinear dependence, see Figure 12. Then, we compute the bootstrap test of nonlinear serial dependence based on ${S}_{k}$ and present the results in Figure 13.
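A hedged sketch of such an experiment; the specific nonlinear MA(1) form below is an illustrative assumption, not the definition of the paper's `nlma` function (the package must be installed):

```r
# Illustrative sketch: a nonlinear MA(1)-type series and the sieve bootstrap
# test for nonlinear serial dependence with the reference bandwidth.
set.seed(212)
n <- 50
e <- rnorm(n + 1)
x <- e[-1] + 0.8 * e[-(n + 1)]^2        # x_t = e_t + 0.8 e_{t-1}^2 (assumed form)
if (requireNamespace("tseriesEntropy", quietly = TRUE)) {  # run only if available
  out <- tseriesEntropy::Srho.test.AR.p(x, lag.max = 5, B = 40,
                                        bw = "reference", plot = FALSE)
}
```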

**Figure 13.**Serial entropy ${S}_{k}$ for $k=1,\dots ,5$ (black, solid line), computed on a realization of a nonlinear MA(1) process. The rejection bands at 95% (green dashed line) and 99% (blue dashed line) correspond to the null hypothesis of a general linear process.

#### 4.3. Discussion

In this section, we clarify the difference between the test for serial independence of `Srho.test.ts` and the test for nonlinear serial dependence of `Srho.test.AR`, and address why the two cannot be used interchangeably. In the literature, several diagnostic tests to verify the independence of the residuals of a fitted model have been proposed. These can be based upon dependence measures akin to ${S}_{k}$, or rely on ad hoc measures, such as generalized correlations. Contrary to some of the claims, finding structure in the residuals of a linear model does not necessarily call for a nonlinear specification, but can simply point to a misspecified (linear) model. Hence, a proper test for nonlinear serial dependence should explicitly enforce the null hypothesis of linearity on the test statistic, and this is the approach adopted here. We illustrate the matter in the following example, where the time series `x` comes from an ARMA(1,1) process. We fit a MA(1) model to the series and test for independence of both `x` and the residuals `res` with the permutation test of `Srho.test.ts`.
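This misspecification exercise might be coded as follows (the ARMA coefficients are illustrative assumptions; the package must be installed):

```r
# Illustrative sketch: ARMA(1,1) series, deliberately misspecified MA(1) fit,
# and permutation tests on both the series and its residuals.
set.seed(71)
x   <- arima.sim(n = 200, model = list(ar = 0.7, ma = 0.5))
fit <- arima(x, order = c(0, 0, 1))        # MA(1) fit: misspecified on purpose
res <- as.numeric(residuals(fit))
if (requireNamespace("tseriesEntropy", quietly = TRUE)) {  # run only if available
  t.x   <- tseriesEntropy::Srho.test.ts.p(as.numeric(x), lag.max = 5,
                                          B = 40, plot = FALSE)
  t.res <- tseriesEntropy::Srho.test.ts.p(res, lag.max = 5,
                                          B = 40, plot = FALSE)
}
```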

The test rejects the null hypothesis of serial independence both for the series `x` (left panel) and for the residuals `res` of the fitted MA(1) model (right panel). In the latter case, rejection implies lack of fit, but not nonlinearity. This is further confirmed by applying the test for nonlinear serial dependence of `Srho.test.AR`; the results are plotted in Figure 15.

As for the computational burden, the test `Trho.test.SA`, based upon Simulated Annealing and paired with the `mlcv` bandwidth selector, has a computational complexity of $O\left({n}^{2}\right)$ (n being the length of the series), whereas the test `Srho.test.AR`, paired with the `reference` criterion, has a complexity of $O\left(n\right)$, which makes it generally preferable, also in view of its superior performance. In order to (partly) overcome the burden, besides parallelizing the functions, the key code portions for estimating ${\widehat{S}}_{k}$, and for the resampling schemes, are coded in `Fortran`. In any case, we recommend running the tests with a small initial number of resamples, especially if the series is long.

**Figure 14.** Serial entropy ${S}_{k}$ for $k=1,\dots ,5$ (black, solid line), computed on a realization of a linear ARMA(1,1) process (left panel) and on the residuals of a fitted MA(1) model upon the series (right panel). The rejection bands at 95% (green dashed line) and 99% (blue dashed line) correspond to the null hypothesis of serial independence.

**Figure 15.** Serial entropy ${S}_{k}$ for $k=1,\dots ,5$ (black, solid line), computed on a realization of a linear ARMA(1,1) process (left panel) and on the residuals of a fitted MA(1) model upon the series (right panel). The rejection bands at 95% (green dashed line) and 99% (blue dashed line) correspond to the null hypothesis of a general linear process.

## 5. Detecting Complex Dependence in Commodity Prices

We test each series for nonlinear serial dependence by means of `Srho.test.AR`. In all the tests, the number of bootstrap replications was set to 1000 and we used the `reference` bandwidth, as suggested in [11].

## Author Contributions

## Funding

## Data Availability Statement

The `R` package `tseriesEntropy` is available on CRAN at https://cran.r-project.org/web/packages/tseriesEntropy/ (accessed on 27 December 2022).

## Conflicts of Interest

## Abbreviations

| Abbreviation | Meaning |
| --- | --- |
| AR | Autoregressive |
| MA | Moving Average |
| ARMA | Autoregressive Moving Average |
| TAR | Threshold Autoregressive |
| TARMA | Threshold Autoregressive Moving Average |
| AIC | Akaike Information Criterion |

## References

- Tong, H.; Lim, K. Threshold autoregression, limit cycles and cyclical data. J. R. Stat. Soc. Ser. B (Methodol.) **1980**, 42, 245–292.
- Tong, H. Nonlinear Time Series. A Dynamical System Approach; Oxford University Press: Oxford, UK, 1990; p. xvi+564.
- Fan, J.; Yao, Q. Nonlinear Time Series. Nonparametric and Parametric Methods; Springer: New York, NY, USA, 2003; p. xx+551.
- Enders, W. Applied Econometric Time Series, 4th ed.; Wiley Series in Probability and Statistics; Wiley: Hoboken, NJ, USA, 2014.
- De Gooijer, J. Elements of Nonlinear Time Series Analysis and Forecasting; Springer Series in Statistics; Springer International Publishing: Cham, Switzerland, 2017.
- Tsay, R.; Chen, R. Nonlinear Time Series Analysis; Wiley Series in Probability and Statistics; Wiley: Hoboken, NJ, USA, 2018.
- Giannerini, S.; Goracci, G.; Rahbek, A. The validity of bootstrap testing in the threshold framework. J. Econom. **2023**, in press.
- Goracci, G.; Giannerini, S.; Chan, K.S.; Tong, H. Testing for threshold effects in the TARMA framework. Stat. Sin. **2023**, 33, 3.
- Chan, K.S.; Giannerini, S.; Goracci, G.; Tong, H. Testing for threshold regulation in presence of measurement error. Stat. Sin. **2023**, 34, 3.
- Granger, C.W.J.; Maasoumi, E.; Racine, J. A Dependence Metric for Possibly Nonlinear Processes. J. Time Ser. Anal. **2004**, 25, 649–669.
- Giannerini, S.; Maasoumi, E.; Dagum, E.B. Entropy Testing for Nonlinear Serial Dependence in Time Series. Biometrika **2015**, 102, 661–675.
- Giannerini, S. `tseriesEntropy`: Entropy Based Analysis and Tests for Time Series. R Package Version 0.7-0. 2021. Available online: https://CRAN.R-project.org/package=tseriesEntropy (accessed on 27 December 2022).
- Hayfield, T.; Racine, J.S. Nonparametric Econometrics: The `np` Package. J. Stat. Softw. **2008**, 27, 1–32.
- Tsay, R.; Chen, R.; Liu, X. `NTS`: Nonlinear Time Series Analysis. R Package Version 1.1.2. 2020. Available online: https://CRAN.R-project.org/package=NTS (accessed on 27 December 2022).
- Dalla, V.; Giraitis, L.; Phillips, P.C.B. `testcorr`: Testing Zero Correlation. R Package Version 0.1.2. 2020. Available online: https://CRAN.R-project.org/package=testcorr (accessed on 27 December 2022).
- Fisher, T.J.; Gallagher, C.M. `WeightedPortTest`: Weighted Portmanteau Tests for Time Series Goodness-of-Fit. R Package Version 1.0. 2012. Available online: https://CRAN.R-project.org/package=WeightedPortTest (accessed on 27 December 2022).
- Bagnato, L.; De Capitani, L.; Mazza, A.; Punzo, A. `SDD`: An R Package for Serial Dependence Diagrams. J. Stat. Softw. **2015**, 64, 1–19.
- Pitsillou, M.; Fokianos, K. `dCovTS`: Distance Covariance and Correlation for Time Series Analysis. R Package Version 1.1. 2016. Available online: https://CRAN.R-project.org/package=dCovTS (accessed on 27 December 2022).
- Mahdi, E.; McLeod, A.I. Improved Multivariate Portmanteau Diagnostic Test. J. Time Ser. Anal. **2012**, 33, 211–222.
- Lugrin, T.; Davison, A.C.; Tawn, J.A. Bayesian Uncertainty Management in Temporal Dependence of Extremes. Extremes **2016**, 19, 491–515.
- Frolova, N.; Cribben, I. `extremogram`: Estimation of Extreme Value Dependence for Time Series Data. R Package Version 1.0.2. 2016. Available online: https://CRAN.R-project.org/package=extremogram (accessed on 27 December 2022).
- Hofert, M.; Kojadinovic, I.; Maechler, M.; Yan, J. `copula`: Multivariate Dependence with Copulas. R Package Version 1.0-1. 2020. Available online: https://CRAN.R-project.org/package=copula (accessed on 27 December 2022).
- Trapletti, A.; Hornik, K. `tseries`: Time Series Analysis and Computational Finance. R Package Version 0.10-48. 2020. Available online: https://CRAN.R-project.org/package=tseries (accessed on 27 December 2022).
- Nagler, T. `wdm`: Weighted Dependence Measures. R Package Version 0.2.2. 2020. Available online: https://CRAN.R-project.org/package=wdm (accessed on 27 December 2022).
- Miecznikowski, J.C.; Hsu, E.S.; Chen, Y.; Vexler, A. `testforDEP`: Dependence Tests for Two Variables. R Package Version 0.2.0. 2017. Available online: https://CRAN.R-project.org/package=testforDEP (accessed on 27 December 2022).
- Garcia, J.E.; Gonzalez-Lopez, V.A. `LIStest`: Tests of Independence Based on the Longest Increasing Subsequence. R Package Version 2.1. 2014. Available online: https://CRAN.R-project.org/package=LIStest (accessed on 27 December 2022).
- Berrett, T.B.; Kontoyiannis, I.; Samworth, R.J. `USP`: U-Statistic Permutation Tests of Independence for All Data Types. R Package Version 0.1.1. 2020. Available online: https://CRAN.R-project.org/package=USP (accessed on 27 December 2022).
- Berrett, T.B.; Grose, D.J.; Samworth, R.J. `IndepTest`: Nonparametric Independence Tests Based on Entropy Estimation. R Package Version 0.2.0. 2018. Available online: https://CRAN.R-project.org/package=IndepTest (accessed on 27 December 2022).
- Pfister, N.; Peters, J. `dHSIC`: Independence Testing via Hilbert Schmidt Independence Criterion. R Package Version 2.1. 2019. Available online: https://CRAN.R-project.org/package=dHSIC (accessed on 27 December 2022).
- Jin, Z.; Yao, S.; Matteson, D.S.; Shao, X. `EDMeasure`: Energy-Based Dependence Measures. R Package Version 1.2. 2018. Available online: https://search.r-project.org/CRAN/refmans/EDMeasure/html/EDMeasure-package.html (accessed on 27 December 2022).
- Böttcher, B. `multivariance`: Measuring Multivariate Dependence Using Distance Multivariance. R Package Version 2.3.0. 2020. Available online: https://CRAN.R-project.org/package=multivariance (accessed on 27 December 2022).
- Risk, B.B.; James, N.A.; Matteson, D.S. `steadyICA`: ICA and Tests of Independence via Multivariate Distance Covariance. R Package Version 1.0. 2015. Available online: https://CRAN.R-project.org/package=steadyICA (accessed on 27 December 2022).
- Peters, J.; Shah, R.D. `GeneralisedCovarianceMeasure`: Test for Conditional Independence Based on the Generalized Covariance Measure (GCM). R Package Version 0.1.0. 2019. Available online: https://CRAN.R-project.org/package=GeneralisedCovarianceMeasure (accessed on 27 December 2022).
- Maasoumi, E. A Compendium to Information Theory in Economics and Econometrics. Econom. Rev. **1993**, 12, 137–181.
- Geenens, G.; de Micheaux, P.L. The Hellinger Correlation. J. Am. Stat. Assoc. **2022**, 117, 639–653.
- Duong, T. `ks`: Kernel Smoothing. R Package Version 1.12.0. 2021. Available online: https://CRAN.R-project.org/package=ks (accessed on 27 December 2022).
- Silverman, B.W. Density Estimation for Statistics and Data Analysis; Chapman and Hall: London, UK, 1986.
- Bowman, A.W. An Alternative Method of Cross-Validation for the Smoothing of Density Estimates. Biometrika
**1984**, 71, 353–360. [Google Scholar] [CrossRef] - Duong, T.; Hazelton, M.L. Cross-Validation Bandwidth Matrices for Multivariate Kernel Density Estimation. Scand. J. Stat.
**2005**, 32, 485–506. [Google Scholar] [CrossRef] - Duong, T.; Hazelton, M. Plug-in Bandwidth Matrices for Bivariate Kernel Density Estimation. J. Nonparametric Stat.
**2003**, 15, 17–30. [Google Scholar] [CrossRef] - Narasimhan, B.; Johnson, S.G.; Hahn, T.; Bouvier, A.; Kiêu, K.
`cubature:`Adaptive Multivariate Integration over Hypercubes. R Package Version 2.0.4.1. 2020. Available online: https://CRAN.R-project.org/package=cubature (accessed on 27 December 2022). - Brockwell, P.J.; Davis, R.A. Time Series: Theory and Methods; Springer: New York, NY, USA, 1991. [Google Scholar]
- Li, W. Diagnostic Checks in Time Series, 1st ed.; Chapman and Hall/CRC: Boca Raton, FL, USA, 2003. [Google Scholar]
- Bühlmann, P. Bootstraps for Time Series. Stat. Sci.
**2002**, 17, 52–72. [Google Scholar] [CrossRef] - Bickel, P.J.; Bühlmann, P. Closure of Linear Processes. J. Theor. Probab.
**1997**, 10, 445–479. [Google Scholar] [CrossRef] - Theiler, J.; Eubank, S.; Longtin, A.; Galdrikian, B.; Farmer, J. Testing for Nonlinearity in Time Series: The Method of Surrogate Data. Physica D
**1992**, 58, 77–94. [Google Scholar] [CrossRef] - Chan, K.S. On the Validity of the Method of Surrogate Data. Fields Inst. Commun.
**1997**, 11, 77–97. [Google Scholar] - Schreiber, T.; Schmitz, A. Surrogate Time Series. Physica D
**2000**, 142, 346–382. [Google Scholar] [CrossRef] - Vidal, R.V.V. Applied Simulated Annealing; Lecture Notes in Economics and Mathematical Systems; Springer: Berlin, Germany, 1993; Volume 396. [Google Scholar]
- Bickel, P.J.; Bühlmann, P. A New Mixing Notion and Functional Central Limit Theorems for a Sieve Bootstrap in Time Series. Bernoulli
**1999**, 5, 413–446. [Google Scholar] [CrossRef] - Bühlmann, P. Sieve Bootstrap for Time Series. Bernoulli
**1997**, 3, 123–148. [Google Scholar] [CrossRef] - Ravazzolo, F.; Rothman, P. Oil and U.S. GDP: A Real-Time Out-of-Sample Examination. J. Money Credit Bank.
**2013**, 45, 449–463. [Google Scholar] [CrossRef] - Hannan, E.; Rissanen, J. Recursive Estimation of Mixed Autoregressive-Moving Average Order. Biometrika
**1982**, 69, 81–94. [Google Scholar] [CrossRef] - Choi, B. ARMA Model Identification; Springer: New York, NY, USA, 1992. [Google Scholar] [CrossRef]
- Goracci, G.; Ferrari, D.; Giannerini, S.; Ravazzolo, F. Robust Estimation for Threshold Autoregressive Moving-Average Models. arXiv
**2022**, arXiv:2211.08205. [Google Scholar] - Dimitriou, D.; Kenourgios, D.; Simos, T. Are there any other safe haven assets? Evidence for “exotic” and alternative assets. Int. Rev. Econ. Financ.
**2020**, 69, 614–628. [Google Scholar] [CrossRef] - Ergemen, Y.E.; Haldrup, N.; Rodríguez-Caballero, C.V. Common long-range dependence in a panel of hourly Nord Pool electricity prices and loads. Energy Econ.
**2016**, 60, 79–96. [Google Scholar] [CrossRef]

**Figure 1.**Relation between ${S}_{\rho}$ and the correlation coefficient $\rho $ under the bivariate Gaussian setting.

**Figure 2.**Cross entropy ${S}_{k}$ between ${X}_{t}$ and ${Y}_{t+k}$ ($k=-3,\dots ,3$) from Equation (9).

**Figure 4.** Cross entropy ${S}_{k}$ for categorical data, in the presence of perfect dependence at lag 1. The left panel shows the unnormalized measure (blue, solid line), where the maximum attainable level is indicated in red. The right panel shows the normalized version of the measure (black, solid line).

**Figure 5.**The same as Figure 4 (right) but with plot customizations.

**Figure 6.** Serial entropy ${S}_{k}$ for $k=1,\dots ,5$ (black, solid line), computed on realizations from a white noise process (left panel) and from an AR(1) process (right panel). The rejection bands at 95% (green dashed line) and 99% (blue dashed line) correspond to the null hypothesis of serial independence.
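Rejection bands like those of Figure 6 can be obtained with the package routine `Srho.test.ts`. The following minimal sketch assumes the interface documented on CRAN (arguments `lag.max`, `B`, `ci.type`, `plot`); the number of replications `B` is kept small purely for illustration.

```r
## Minimal sketch (interface assumed from the CRAN documentation):
## test of serial independence based on S_k, as in Figure 6.
library(tseriesEntropy)

set.seed(42)
x <- as.ts(rnorm(100))                 # white noise realization
res <- Srho.test.ts(x, lag.max = 5,    # S_k for k = 1,...,5
                    B = 20,            # small B, for illustration only
                    ci.type = "perm",  # bands under serial independence
                    plot = FALSE)
res                                    # S_k together with the bands
```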

**Figure 7.** Cross entropy ${S}_{k}$ for $k=-5,\dots ,5$ (black, solid line), computed between two independent realizations of an AR(1) process. The rejection bands at 95% and 99% are indicated as green and blue dashed lines, respectively. In the left panel they are computed by permutation (`ci.type='perm'`), whereas in the right panel they are computed through a moving block bootstrap (`ci.type='mbb'`).
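The cross-dependence version of the test can be sketched as follows, again assuming the CRAN interface of `Srho.test.ts` and using a small `B` for illustration; both choices of `ci.type` are shown, as in Figure 7.

```r
## Minimal sketch (interface assumed): cross entropy S_k between two
## independent AR(1) realizations, with permutation ('perm') and
## moving block bootstrap ('mbb') rejection bands.
library(tseriesEntropy)

set.seed(7)
x <- arima.sim(n = 100, model = list(ar = 0.5))
y <- arima.sim(n = 100, model = list(ar = 0.5))  # independent of x

res.perm <- Srho.test.ts(x, y, lag.max = 3, B = 20,
                         ci.type = "perm", plot = FALSE)
res.mbb  <- Srho.test.ts(x, y, lag.max = 3, B = 20,
                         ci.type = "mbb",  plot = FALSE)
```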

**Figure 9.** Time series from an AR(1) process, together with two surrogate time series generated through Simulated Annealing.
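Surrogates like those of Figure 9 are produced by the routine `surrogate.SA`; a hedged sketch follows, where the argument names `nlag` and `nsurr` are as assumed from the package documentation.

```r
## Minimal sketch (argument names assumed): surrogate time series
## generated through Simulated Annealing, as in Figure 9.
library(tseriesEntropy)

set.seed(11)
x <- arima.sim(n = 120, model = list(ar = 0.8))
s <- surrogate.SA(x, nlag = 10, nsurr = 2)  # two SA surrogates of x
str(s)  # inspect the returned object for the surrogate series
```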

**Figure 10.**Test statistic ${T}_{k}$ for $k=1,\dots ,6$ (black, solid line), computed on a linear Gaussian MA(1) process (left panel) and on a linear MA(1) process driven by Student’s t innovations (right panel). The rejection bands at 95% (green dashed line) and 99% (blue dashed line) correspond to the null hypothesis of a linear Gaussian process.

**Figure 16.** Time plots of the four monthly commodities series, January 1994–November 2022. Left column: raw commodities ${y}_{t}$. Right column: log-returns ${x}_{t}=\nabla \log \left({y}_{t}\right)$.

**Figure 17.**Sample correlograms for the four commodities up to one year. The confidence bands at level $99\%$ under the white noise hypothesis are indicated as blue dashed horizontal lines.

**Figure 18.**Sample correlograms for the squared series of the four commodities up to one year. The confidence bands at level $99\%$ under the white noise hypothesis are indicated as blue dashed horizontal lines.

**Figure 19.**Serial entropy ${S}_{k}$ for lag $k=1,\dots ,13$ (black, solid line), computed on the commodity price series. The rejection bands at 95% (green dashed line) and 99% (blue dashed line) correspond to the null hypothesis of a general linear process.
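The bands of Figure 19 refer to the entropy test against the null hypothesis of a general linear process, where the null model is resampled through an AR sieve bootstrap. A minimal sketch with the routine `Srho.test.AR`, whose interface is assumed from the CRAN documentation and with a small `B` for illustration:

```r
## Minimal sketch (interface assumed): entropy test for nonlinearity
## against the null of a general linear process (AR sieve bootstrap).
library(tseriesEntropy)

set.seed(3)
x <- arima.sim(n = 150, model = list(ar = 0.6))  # linear under H0
res <- Srho.test.AR(x, lag.max = 3, B = 20, plot = FALSE)
```

Under the null, as here, the estimated ${S}_{k}$ should stay within the rejection bands.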

**Figure 20.** Serial entropy ${S}_{k}$ for lag $k=1,\dots ,13$ (black, solid line), computed on the residuals of an ARMA$(1,1)$–GARCH$(1,1)$ model fitted to the series of commodity prices. The rejection bands at 95% (green dashed line) and 99% (blue dashed line) correspond to the null hypothesis of a general linear process.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Giannerini, S.; Goracci, G.
Entropy-Based Tests for Complex Dependence in Economic and Financial Time Series with the `R` Package `tseriesEntropy`. *Mathematics* **2023**, *11*, 757.
https://doi.org/10.3390/math11030757
