Two Tests for Dependence (of Unknown Form) between Time Series

This paper proposes two new nonparametric tests for independence between time series. Both tests are based on symbolic analysis, specifically on symbolic correlation integral, in order to be robust to potential unknown nonlinearities. The first test is developed for a scenario in which each considered time series is independent and therefore the interest is to ascertain if two internally independent time series share a relationship of an unknown form. This is especially relevant as the test is nuisance parameter free, as proved in the paper. The second proposed statistic tests for independence among variables, allowing these time series to exhibit within-dependence. Monte Carlo experiments are conducted to show the empirical properties of the tests.


Introduction
In Science, either Natural or Social, a primary focus of attention has been the evaluation of whether two (or more) variables measured over time are related. If there is a relation, then further research into the nature and strength of the relation could be worthwhile in order to make new discoveries and/or gain predictive accuracy. Accordingly, testing for dependence between two univariate time series has been widespread, as we will show below. However, the vast majority of literature on the topic has relied on correlation as a main tool for developing other sophisticated statistical devices (tests). On the other hand, any correlation-based test is designed to detect linear relationships between time series, which limits the scope and utility of the test for dealing with (unknown) potential forms of dependence between the series beyond linear ones. The relatively few attempts to deal with nonlinear relationships have relied on nonparametric methods that generally require large data sets. In this age of massive data, this is not a limitation and, therefore, new statistical tests with fewer assumptions and with a wider range to detect potential relationships can be devised. The goal of this paper is to develop statistical tools to test dependence among time series robust to the form of the potential dependence that is established among the variables.
The classical Pearson cross-correlation coefficient assesses a linear relationship between two time series. However, it is well-known that it is not reliable when the time series under study are autocorrelated. In order to solve this caveat, scholars have developed several alternative statistical tests. Most of the work done is parametric and is based on the residuals of estimated models. The starting point was Haugh [1], who proposed a test for non-correlation between two jointly Gaussian covariance stationary time series, say {X t } and {Y t }, by first prewhitening X t and Y t and then basing the test on the residual cross-correlation function. However, Haugh's innovative test was rapidly criticized because (of problems) of its low power against popular alternative hypotheses (see [2] and [3] for the earliest documented critiques). Since then, Haugh's technique has been extended in several ways and the common denominator has been to improve the power of the test, (see: [4][5][6][7][8][9][10][11] for details). Many of the results are summarized by [12]. Basically, all these procedures follow the main skeleton given by Haugh, namely, first to obtain white noise residuals by performing univariate autoregression, and then to check the sample cross-correlation of the residuals.
There have also been several attempts to avoid parametric approaches: see [13][14][15][16]. The last of these proposed a nonparametric test of independence which was designed to avoid the autoregressive moving average (ARMA) pre-specification and to avoid kernel selection methods. Pre-specification and kernel selection were potential limitations in Haugh's [1] and Hong's tests [8]. In this vein, this paper proposes two new nonparametric tests for independence between time series. The first test is developed for a scenario similar to that in Pearson's original test, namely, for situations in which each considered time series is independent (within-independence or serially independent). Therefore, the interest is to investigate if two (or more) internally independent time series can be interrelated (either in a linear way, as in the case of correlation; or, more generally, in a nonlinear way). As mentioned earlier, there are many instances in several scientific domains where each time series is serially-dependent (auto-dependent). The second test proposed tests for independence between variables by allowing these time series to exhibit within-dependence. We use the concept of the symbolic correlation integral, as introduced in [17], to construct tests robust to nonlinearities.
Symbolic correlation integral is directly linked and influenced by the popular and well-known correlation integral definition, which can be understood as a measure of dependence. In this regard, dependence is not restricted to linear correlation (as in the case of Haugh-based tests) and therefore it will be robust to other complex forms of dependence that might occur between variables. Symbolic analysis is a field that has attracted the interest of many scholars and practitioners from several scientific disciplines (see [17]). By relying on these kinds of nonparametric concepts and techniques, we avoid imposing restrictive parametric assumptions, such as linearity and normality and, therefore, more generally applicable tests can be constructed. There might be other possibilities like entropy based statistics, as introduced in [18] that could avoid some of these restrictions.
The rest of the paper is structured as follows: in Section 2, we present the notation that will be used throughout the paper and the definition of symbolic correlation that serves as a common thread in the rest of the paper. Section 3 is devoted to presenting and defining the concept of joint symbolic correlation integral and its corresponding estimator. In Section 4, two tests for independence between series and their corresponding asymptotic treatment are described. In Section 5, empirical size and power are analyzed to better understand the finite sample behavior of the tests under several scenarios. We make a multi-level comparison with other tests available in the literature, and we provide some guidelines about how to fix the parameters of the new tests.

Definitions and Notation
Let {x t } t∈I be a real-valued time series from a strictly stationary stochastic process of real random variables, where I is a set of time indexes. For a positive integer m ≥ 2, we denote by S m the symmetric group of order m!, that is, the group formed by all the permutations of length m. Let π = (i 1 , i 2 , . . . , i m ) ∈ S m . We will call an element π in the symmetric group S m a symbol. We consider that the time series is embedded in an m-dimensional space as follows: x t = (x t , x t+1 , ..., x t+(m−1) ).
When the set of indexes I is finite and has cardinality T, {x t } n t=1 is a vectorial time series of length n = T − m + 1. Each x t is called m−history and the positive integer m is usually known as the embedding dimension.
We say that x t is of π-type if and only if π = (i 1 , i 2 , . . . , i m ) is the unique symbol in the group S m satisfying the two following conditions: Condition (b) guarantees the uniqueness of the symbol π. This is justified if the values of x t have a continuous distribution, so equal values are very uncommon, with a theoretical probability of occurrence of 0. In the case of a discrete distribution, then condition (b) guarantees that the symbolization map is well defined.
Next, we define the symbolization map Notice that the symbolization map s transforms the vectorial time series of m-histories into a sequence of symbols. Moreover, each element of R m is mapped to a symbol of S m providing a partition of R m of size m!, called symbolic partition of R m . Given two time instants, t and s, and two ordinal patterns π, δ ∈ S m , we define p πδ ts as the probability that, in time t, x t is of π-type and, in time s, x s is of δ-type. Thus, we can construct the following Within this context, we define the indicator function which always takes the value 1 when the ordinal patterns of the m-histories x t and x s are the same. The indicator variable, I(s x (x t ), s x (x s )), is a Bernoulli random variable with probability of success Morover, if {x t } is i.i.d. and |t − s| ≥ m, then µ x ts = 1 m! (as shown in [17]).
Based on these concepts, Caballero et al. [17] defined the symbolic correlation integral of a time series {x t } t∈I for an embedding dimension m ≥ 2 as and they proved that under the null that {x t } t∈I is i.i.d., the statistic is asymptotically distributed as an N( 1 m! , σ m ), where the standard deviation σ m does not depend on the sample time series.
An interesting relation betweenŜC m with a well-known index can be obtained. If N(π) denotes the absolute frequency of π ∈ S m in the symbolic time series (of length n), then symbolic correlation integral can be rewritten as: which is closely related to normalized Gini index or Index of Qualitative Variation (see [19] for details), which is given by

Joint Symbolic Correlation Integral
There is a natural extension of the symbolic correlation integral to a multivariate scenario. Similar to Symbolic Correlation Integral, Joint Symbolic Correlation Integral will measure the probability that all univariate time series forming the multivariate time series have the same ordinal pattern at different time periods (where the patterns could potentially vary along the time series). As we are therefore interested in analyzing a k-dimensional time series, first, we will define the Joint Symbolic Correlation Integral (JSC) of a set X ⊂ R m 1 × · · · × R m k , and then we will focus on multivariate time series.
Definition 1 (Joint Symbolic Correlation Integral). Let X j ⊂ R m j with j = 1, 2, ..., k be distributed according to invariant measures ν 1 , ν 2 , . . . , ν k , respectively, and ν = (ν 1 , . . . , ν k ) a invariant measure on X = X 1 × X 2 × · · · × X k . The joint symbolic correlation integral of the set X = X 1 × X 2 × · · · × X k is defined as Notice that the symbolization map for the multivariate time series is component-wise, and we allow for each component of the multivariate time series to have a different embedding dimension.
In the case in which the invariant measures are independent, that is, where m = (m 1 , m 2 , . . . , m k ).
Notice that, only under the null of independence among the time series, Based on this definition, we also have a natural extension of JSC for multivariate time series. To this end, given a multivariate time series {w t = (x 1t , x 2t , . . . , x kt )} t∈I , each of the time series {x jt } is embedded in R m j , as in the previous section, to construct the embedded multivariate time series w t = (x 1t , x 2t , . . . , x kt ). Then, the symbol space for {w t } t∈I is defined as Γ k = ∏ k j=1 S m j , the Cartesian product of symmetric groups S m j with j = 1, 2, ..., k, and the symbolization map is defined as s w = (s x 1 , s x 2 , . . . , s x k ). Thus, the symbolization of the multivariate time series is component-wise.
Next, we are interested in computing the estimator, JSC m , of the joint symbolic correlation integral for {w t = (x 1t , x 2t , . . . , x kt )} T t=1 . To this end, we define the indicator function and then where n = min{n 1 , n 2 , . . . , n k } with n j = T − m j + 1 for j = 1, 2, . . . , k. In addition, the indicator function defined by Equation (6) is a Bernoulli random variable with probability of success µ ts . When the time series that form the vectorial time series {w t } are independent between themselves, we have that We should highlight that, when every x jt T t=1 is a stationary time series, from [20], the estimator of joint symbolic correlation integral is asymptotically unbiased:

Testing Independence between Time Series with JSC
In this section, we construct a test for independence between time series based on JSC. We will distinguish between two null hypotheses. In the first, we will test independence between time series when they are i.i.d. In this case, we will provide the asymptotic distribution of the test statistic and will show the conditions under which this test is not affected by the intermediate step of estimating the parameters of a given model. Such tests are called nuisance-parameter-free tests. In the second, we will relax the null hypothesis to allow for the time series under consideration not to be i.i.d. In this second case, we will not give the asymptotic distribution but use bootstrapping to test for significance. Proofs can be found in Appendix A. We are going to consider a vectorial time series, {w t = (x 1t , x 2t , . . . , x kt )} t∈I , where each {x jt } is i.i.d. and they are independent between themselves.
Since the time series {x jt } are i.i.d., from [17], we have that SC m j = 1 m j ! for j = 1, 2, ..., k. Moreover, since the time series are independent between themselves, from Equations (8) and (9), we have that As the time series {x jt } are i.i.d. for all j = 1, 2, ..., k, using Equation (8), it follows that where m = max{m 1 , m 2 , ..., m k } and then Under these conditions, the following result provides the asymptotic distribution of JSC m . Theorem 1. Let {x jt } t ∈ I with j = 1, 2, ..., k be k i.i.d. times series that are independent between themselves. Given embedding dimensions m 1 , m 2 , ..., m k ≥ 2 and m = max{m 1 , m 2 , ..., m k }, it follows that In many cases, the researcher wants to apply statistics, like statistic Equation (11), to estimated errors (residuals) of a model fitted to the raw data. The asymptotic distribution of Equation (11) is derived under the assumption that true errors are considered. Even in the case that true errors are iid, the estimated errors might exhibit some form of dependence, and therefore the asymptotic distribution might be affected by the estimation process. We now wonder if the statistic Equation (11) can be safely applied after the estimation of some intermediate parameters.

Definition 2.
Let S( θ) be a statistic that depends upon some consistently estimated parameter, θ. Assume that, at a true value θ, Theorem 2 (Nuisance-free parameter property). Assume that the data-generating process for each time series {x jt } t ∈ I is given by where I j t−1 is a finite set of regressors, θ j is a vector of parameters, u jt are i.i.d., and G j is a continuous function defined on a compact set for j = 1, 2, ..., k. Then, given embedding dimensions m 1 , m 2 , ..., m k ≤ 2 and m = max{m 1 , m 2 , ..., m k }, the statistic JSC m applied to the residuals {(u 1t , u 2t , . . . , u kt )} t is nuisance-parameter free.

H 0 : Time Series Independent between Themselves
When the time series under consideration, {x jt } t ∈ I , are not i.i.d., we consider the following statistic:δ Notice that, when the time series are independent between themselves,δ(m) = 0 andδ(m) = 0 otherwise. Thus, to test for significance, we use a bootstrap test based on the block-bootstrap proposed by Politis and White [21] and corrected in Patton et al. [22]. In order to be under the null of independence among the time series, we resample each time series independently rather than jointly.
Let us illustrate the procedure. Given the original time-series {x jt } T t=1 , we computeδ(m) and evaluate the same for each of the B − 1 bootstrap realizations of the k time-series, namelyδ (b) (m). Next, we compute the upper and lower p-values as where 1 is the Heaviside function. Based on these values, the null hypothesis of independence among time series is rejected at a significance level α if upper − p < α 2 or lower − p < α 2 .

Empirical Size and Power
We present some evidence of the performance of the proposed tests when they are applied to systems that might (or might not) exhibit within-dependence (i.e., autodependent processes, whose easiest form is autocorrelation) and/or dependence between processes (between-dependence or cross-dependence). We refer to the statistics depicted in Equations (11) and (13) by Test1 and Test2. For simplicity, the statistics have been computed assuming that the correct time-lag has been obtained. To this end, we have considered the following systems of relations: We have selected these systems (with these parameters, sample sizes and relationships) because they have been studied in previous works, so comparability is easy. In particular, system S1 specifies two within-independent processes that are independent between themselves, while S2 specifies two between-independent Gaussian autoregressive AR(1) processes. System S3 represents two between-independent processes that exhibit a nonlinear form of within-dependence. On the other hand, systems S4, S5, S6, and S7 are between-dependent processes. S4 is a linear within-dependent model that entails ideal conditions (normality and linearity) for the application of parametric statistics. System S5 has a stable bivariate nonlinear autoregressive within-dependence. S6 model shows within-dependence in one of the variables, but not in the other. System S7 shows first a nontrivial nonlinear relation between the processes and, second, it considers two different forms of within-dependence. Finally, system S8 is very interesting as it is a nonlinear deterministic process formed from the chaotic logistic equation.
Each system was simulated 2000 times and the following tables collect the proportion of rejection the null hypothesis considered at the 5% nominal level. The tests were performed for m = 3. The selection of parameter m is open to the practioner, and we will deal with this regard later on this section. Table 1 shows the power and size results of Test1 for the eight models S1-S8 for the null X t and Y t being i.i.d. and independent between themselves. We observe first that, when the stochastic system is generated under the null (i.e., two independent processes that are both auto-independent), the test rejects according at the fixed 5% level. In this regard, the new test Test1 behaves as expected under the null, regardless of the sample size. The results for systems S2 and S3 suggest that Test1 test easily detects a departure from one of the conditions of the null (namely, within-dependence), even when they are between-independent processes. The results for S4, S5 and S7 indicate that, when the departure from the null is larger (in this case both null conditions are violated), then Test1 behaves powerfully, even in the case of nonlinear dependencies, as in systems S5 and S7. As regards system S6, notice that, despite the fact that process Y is i.i.d., process X depends on Y. Even in the case that one process has no within-independence but there is between dependence, the test exhibits power to detect these departures. Finally, the results for system S8 are especially interesting because the processes involved are purely deterministic (there is no random term) and the dependence between them is evident. Despite this peculiar dynamic structure, it is noticeable that the Pearson cross-correlation test has a rejection rate of the null of 5%; in other words, Pearson's test behaves at the nominal level of the statistic, which implies that it systematically suggests not to reject the null of independence between processes when there is an obvious deterministic dependence. In contrast, Test1 test rejects the null with great power. The explanation for this performance is that Pearson's test is limited to detecting linear relationships between variables, while Test1 considers any form of potential relationship.
On the other hand, it can be also concluded from the above comments on simulations that Test1 is not capable of distinguishing between the forms of dependences (between and/or within). In other words, if the final user obtains a rejection of the null, she does not know the reason for the rejection. Given the simplicity of the test and its power with complex dependence forms, it is advisable to use Test1 as a first step. However, to complete the process, it is necessary either to use Test2 or to apply some sort pre-whitening process. We now explore the first solution (Test2) and, later in the paper, we consider the behavior of the test in the case of pre-whitening in a multivariate scenario. Table 2 shows the power and size results of Test2 for the eight models S1-S8 for the null X t and Y t being independent just between themselves.  The empirical behavior of other available tests on the same systems and sample sizes are reported in the Appendix. Based on the results for these models, we make the following remarks: (i) The output for systems S1, S2 and S3 hints that Test2 can correctly deal with models that exhibit several forms of within-dependence, and this internal dependence does not contaminate the ability of Test2 to indicate, at a nominal level of 0.05 that both processes are independent. In this regard, it is noteworthy to observe that, a Haugh-type test could not have been used with system S3 because these tests report confident results only for systems of linear and Gaussian dependence, as is the case of S2.
(ii) When within-dependence is linear, as in S4, the power of the test is impressive regardless of the sample size. This empirical fact implies that Test2 can be used to detect simple cases of linear relationships between variables. As reported in [14], Haugh-based tests also have extremely good power for this type of linear dependence. Accordingly, the final user of the tests could safely use the nonparametric Test2 test or parametric Haugh-based tests.
(iii) However, when there is within-dependence of nonlinear nature, as in S5, S6, S7 and S8, it is well known that Haugh based tests are unable to detect dependence, regardless of the sample size. As can be observed from this simulation, Test2 detects dependence when the sample is large enough.
From (i)-(iii), it can be concluded that Test2 can be used to effectively detect dependence between variables with fewer restrictions than other available tests in the literature, and, from a practical point of view, the larger the sample size, the more reliable the results are.
As mentioned earlier, an interesting advantage of both tests is that they can be used in a multivariate setting. We now consider two new sets of multivariate systems. The first set is formed of S9 and S10 systems. Each system is a three-variable stochastic linear system that is used in this paper to show that Test1 can be satisfactorily used for pre-whitened data as proved in the previous section: These systems were generated and estimated by Ordinary Least Squares (OLS) to obtain the linear structure of each variable and residuals were then tested with Test1. The results are in Table 3.  After removing the linear structure, independent estimated errors are obtained. Provided that errors are simulated independently, it is expected that Test1 will not reject the null at the nominal level, as shown in the table above.
The second set of systems is conducted to study the behavior of Test2 in a multivariate system of complex relationships. To this end, we have considered three systems: S11 considers the case where two variables, X and Y, do not have between-dependence (cross-dependence) but are both driven by a common variable Z. One can think of Z as an environmental variable that determines (explains) X and Y, and therefore both are related by this external variable. Here, we expect Test2 to reject the null. Thus, S12 is a nonlinear and more complex model than S11, but, in essence, is similar. This system has two non-interacting variables, X and Y, that share common environmental forcing. Finally, S13 considers the case of three-variables where one of them has no dynamic structure, and the other two only have one-side dependence. We expect Test2 to be able to detect this dependence. The results are given in Table 4. The results suggest that Test2 is able to clearly detect departures from the null in a multivariate context. Even in the case of hidden common variables, the test unveils the indirect relationship, despite its linear or nonlinear nature and despite the sample size. It is also concluded that Test2 needs more observations (larger sample sizes) to detect dependences when considering scenarios like S13, where of six potential relationships between variables, only one exists (from Y to X).

Comparison with Other Tests
As we indicated in the introductory section, the technical literature on this topic has produced several statistics that test for independence between time series. This subsection aims to compare among the most relevant tests.
A comparison among tests can be conducted at several levels. We compare at the level of: the assumptions required to derive and implement the test, the parameters that the final user has to fix to conduct the test, and the empirical power of the test.
According to the literature, the improvements have occurred around some criteria that to some extent are related to the required assumptions for deriving the statistic and implementing the test(s). On this regard, scholars have mainly focused on the following criteria: (i) stationarity (or not) of the system generating process, (ii) linearity (or not) of the system, and (iii) robustness (or not) to the presence of outliers.
On the other hand, all available statistics require the final user of these statistical tests to make certain decisions on some aspects that will necessarily affect the final result of the test. Provided the test is used in the residuals of the model, one of the most important decisions is the fact that a correct model needs to be estimated. Obviously, pre-estimation (or not) of an autoregressive model before using the test is a critical decision. Another important choice for using the test is due to the fact that some of the tests relied on the use of kernels. Throughout the literature, there has been some controversy regarding how to choose the kernel and to what extent empirical behavior of the test changes because of the selected kernel. Along with the kernel, a selection is also required for truncation parameters. Finally, all the tests have to choose the number of observations in the lag vector, which is equivalent to parameter m (embedding) of our tests.
These observations lead us to complete the previous list with the following items: Kernel selection, (vi) embedding selection. Table 5 allows us to compare the tests considered in terms of the robustness to processes that might be nonstationary and nonlinear, and to the presence of outliers (criteria (i)-(iii)). The table also facilitates comparisons in line with the choices that the user has to make before using the test(s), (criteria (iv)-(vi)). According to the previous table, the tests presented in this work have a greater range of applicability. The data that can be analyzed can be compatible with an ample number of models. In other words, other tests are less generally applicable. From a practical point of view, the new tests facilitate user work, since she has to select a smaller number of parameters. This is especially relevant since we alleviate the burden of modeling a (correct) autoregressive process. Any of our techniques only require selecting parameter m, which is a necessary parameter in all the available tests.
As explained in the introductory section, mainly all the available tests are derived from a seminal Haugh's test, which is best known along with Hong's test, which is the test with better behavior in terms of power. To complete the comparison, we now compare the results in terms of power. To do so, we consider these two well-known tests, namely Haugh and Hong tests, and compare it with Test2, which is the most general one. To make a fair comparison, it is only conducted on models to which both tests can be applied.
We firstly describe these competitive tests and then show their results on the corresponding systems.
The Haugh's (1976) [1] procedure considers the following portmanteau statistic given by where rûv(j) = ∑ n t=j+1û tvt−j / ∑ n t=1û 2 t ∑ n t=1v 2 t 1/2 are the residual cross-correlations for 0 ≤ j ≤ n − 1, rûv(j) = rûv(−j) for 1 − n ≤ j < 0, andû t ,v t , t = 1, ..., n are the two residual series of length n, obtained by fitting univariate models to each of the series. The constant M ≤ n − 1 is a fixed integrer and must be chosen a priori. The asymptotic distribution of S M is chi-square under the null hypothesis of independence and the hypothesis is rejected for large values of the test statistic. Hong (1996) [8] generalizes Haugh's statistic. In fact, Hong's test is a weighted sum of residual cross-correlations of the form The weighting depends on a kernel function k and a smoothing parameter d (both have to be selected a priori). Under the null hypothesis, the test statistic Q n is asymptotically N(0, 1) and it rejects the null for large values of Q n .
The empirical power results on systems are collected in Table 6. Results point to several observations: (a) all tests have maximum power for the simplest system (S4), so all are highly competitive, and therefore the following comments will only apply to systems S5, S6 and S7. (b) None of the tests compared is competitive for (small) sample sizes: 200 and 500. (c) Haugh and Hong tests do not improve power by increasing the sample size; however, Test2 not only improves, but also reaches levels close to full power.

Selecting Parameter m
As mentioned earlier, all tests involve the selection of a parameter, m that comform the basic units of analysis. This parameter is an integer which stands for the fix length of the vectors that are formed to be introduced in the tests. These vectors are generally the first m consecutive observations from a time series (raw data or residuals). We have referred throughout this paper to this parameter as embedding dimension or m-history. This terminology is very frequent in the field related to entropy and nonlinear chaotic dynamical systems.
None of the cited tests provide advice for how to select this parameter. In this section, we reflect on how to select the parameter m and we analyze the empirical effect of increasing m.
An obvious observation is that, if the system generating process is constructed by two (or more) cross dependent equations where dependence is in lags larger than m, then no test will capture such a dependence and will consider that they are independent time series. One natural solution is to construct the same m-histories with some fixed time-delay, namely.x t,τ = (x t , x t+τ , x t+2τ , ..., x t+(m−1)τ ).
This problem and also this solution is rarely found in Economics and Finance; however, it is more frequent in Physics and subfields related with nonlinear and chaotic behavior. Indeed, the modeling and prediction of chaotic time series require proper reconstruction of the state space from the available data in order to successfully estimate invariant properties of the embedded attractor. Thus, one must choose appropriate delay time τ and embedding dimension m for phase space reconstruction. For the aim of the presented tests, there is no need to go beyond what other tests do for analyzing dependences.
Provided that the new tests rely on symbolic measures, it is worth considering in this regard that previous research on these measures (see [24]) has provided rules for selecting m. Each m induces a number of symbols. The number of symbols has to be large enough to capture departures from the null. For m = 2, only (2!) 2 = 4 symbols are being evaluated and are expected to be too few to detect departures from the null(s). For the next integer, m = 3, 36 symbols are evaluated, and in the case of increasing to an embedding dimension of 4, 576 symbols will then be used.
According to the authors, gains in terms of power can be obtained by increasing m according to the following rule: given a data set of T observations, the embedding dimension will be the largest m that satisfies 5 × number of symbols < T with m = 2, 3, 4, .... In our case, the rule is 5(m!) 2 < T. The intuition beyond this rule is clear; on the one hand, the larger the number of symbols, the larger the sensitivity for detecting departures from the null. On the other hand, m! grows very fast and statistical devices require enough samples to behave normally. Note that the larger the m, the finer the search for dependences, but at the cost of increasing sample size required to satisfy the rule.
Finally, the study is completed by empirically analyzing the effect (in terms of power) of a change in the embedding parameter. We consider the basic models (S4-S8) and we evaluate for m = 2, 3, 4. It makes no sense to use m = 5 because, in this case, 14,400 symbols will be used while (at most) only 3000 observations are available. Results are collected in Table 7. From these results, we firstly observe that, for a few number of symbols (i.e., for m = 2), the test has no power (as expected), except for the deterministic system and the linear one. In addition, secondly, the power of the test tends to increase with m. For this reason, we recommend the potential users of the test to adhere to the automatic rule for choosing parameter m.

Conclusions
In this work, we have extended the concept symbolic correlation integral to the multivariate domain with the intention of studying, in a non-parametric way, the dependency relationships that might occur between observed variables. Taking as our starting point the multivariate correlation integral, we have developed two statistical tests that can be used to detect dependence between series, even in the case when the relations between them are complex. Each new test is characterized by its corresponding null hypothesis. In both cases, these tests improve on the existing one in terms of power and usability, which are mostly designed to capture linear relationships. As a consequence of the nonparametric character of both tests, each test can be used with guarantees and practically without restrictions as long as the series are stationary and the sample size is not excessively small. This means that both tests are appropriate for massive data analysis. However, for small data sets (say, below 200 observations), users should be aware that nonlinearity is highly difficult to detect and our advice is to rely on other statistics if linear relations can provide a plausible link among series.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Proofs
This section is devoted to the proof of Theorem 1. To this end, we need to prove the following proposition where J ts = JS ts − µ ts , which is a random variable with zero mean by Equations (6) and (8).

Proof.
Recall that, for π, δ ∈ S m j , (p πδ tt ) πδ = PM x j (|t − t |) and (p πδ ss ) πδ = PM x j (|s − s |) for every j = 1, 2, ..., k. Fix t and s such that s ≥ t + 3m − 2. Then, it follows that 1. If h 1 ≥ m or h 2 ≥ m, then by Lemma A1, we have that δ(h 1 , h 2 ) = 0. 2. If h 1 < m or h 2 < m, we have the following cases: • If t > t and s > s • If t = t and s = s The remaining cases t < t, s < s , t < t, s < s, t = t , and s = s are symmetric to the previous ones. Since the probabilities used in Equations (A1)-(A3) depend only on h 1 and h 2 , the proof follows.
Proof of Proposition A1. We consider From Lemma A1, the variance Equation (A4) can be rewritten as 2 n(n − 1) Since N(t) and N(s) are finite sets, the first summand in Equation (A5) converges to zero as n approaches infinity. The second summand in Equation (A5) can be written as where we denote by N(t) the complementary set of N(t).
In the first summand in Equation (A6), since t ∈ N(t) and s ∈ N(t), we have that, for a fixed t, the number of time indexes s that make s ∈ N(s) ∩ N(t ) possible is finite, and therefore this summand converges to zero as n goes to infinity.
Notice that, for fixed t and s with s ≥ t + 2m, we have that E[J ts J t+h 1 s+h 2 ] = E[J t s J t +h 1 s +h 2 ] always and s ∈ N(t ). Furthermore, from Lemma A1, it follows that Then, from Equation (A7), we have that Since the number of summands in ∑ n−1 t=1 ∑ n s=t+2m is 1 2 (−2 − 4m(−1 + n) + n + n 2 ), then lim n→∞ n(n − 1) 2 as desired.
For all j = 1, 2, . . . , k, the residual function is given by The m j ! symbols provide a partition of R m j in m j ! subsets {A j π 1 , A j π 2 , . . . , A j π m j ! }, such that the probability P(u jt (θ j ) ∈ A j π i ∩ A j π l ) = 0 for all i = l if the values of u jt come from a continuous distribution because then equal values are very uncommon, with a theoretical probability of occurrence of 0. Therefore, for a given m j history u jt (θ j ), we may assume without loss of generality that it belongs to a ball of radius ε j > 0 satisfying B(u jt (θ j ), ε j ) ⊂ A j π for some π ∈ S m j . For each ε j and given that G j is a continuous map in a compact set (and hence uniformly continuous) for all j = 1, 2, . . . , k, there exists a δ j > 0 independent of I j t−1 , θ j and θ j such that, if (I j t−1 , θ j ) − (I j t−1 , θ j ) < δ j , then by Equation (A18) |u jt (θ j ) − u jt ( θ j )| < j and hence u jt ( θ j ) ∈ A j π . Thus, it follows that I(s x j (u jt (θ j )), s x j (u js (θ j ))) = I(s x j (u jt ( θ j )), s x j (u js ( θ j ))) for almost all t, s, and hence JS ts (θ 1 , θ 2 , . . . , θ k ) = JS ts ( θ 1 , θ 2 , . . . , θ k ) and therefore lim n→∞ P n(n − 1) 2 JSC m (θ 1 , θ 2 , . . . , θ k ) − JSC m ( θ 1 , θ 1 , . . . , θ k ) > 0 = 0, which finishes the proof of the theorem.
Remark A1. Notice that we have used only the fact that the map G j with j = 1, 2, ..., k, are uniformly continuous. Therefore, we could state only this condition in Theorem 2 and avoid the compact domain of every G j .