Statistical Tests of Symbolic Dynamics †

Basic Null Hypotheses. Abstract: A novel general method for constructing nonparametric hypothesis tests based on the field of symbolic analysis is introduced in this paper. Several existing tests based on symbolic entropy that have been used for testing central hypotheses in several branches of science (particularly in economics and statistics) are particular cases of this general approach. This family of symbolic tests uses few assumptions, which increases the general applicability of any symbolic-based test. Additionally, as a theoretical application of this method, we construct and put forward four new statistics to test for the null hypothesis of spatiotemporal independence. There are very few tests in the specialized literature in this regard. The new tests were evaluated by means of several Monte Carlo experiments. The results highlight the outstanding performance of the proposed tests.


Introduction
The construction and design of powerful statistical tests are crucial elements for both theoretical and applied scientists. The utility of a test generally depends on its degree of applicability, which is usually related to the assumptions contained in the design of the test, and the restrictions of the scientific field in which the test will be used. Nowadays, the utility of statistical tests also depends on efficiency: reducing the need for computational resources and increasing speed, both of which are vital for real-time monitoring and control applications. Taking applicability and efficiency into account, in this paper we propose a new general, flexible statistical methodology to design and test central hypotheses, and we establish an asymptotic distribution theory for a wide range of tests by using the new proposed approach.
The new framework is based on symbolic analysis, which is a field of increasing interest for several scientific disciplines (see [1]). Symbolic analysis studies dynamical systems on the basis of the sequences of symbols obtained from a suitable partition of the state space (generally selected by the user). In other words, the idea behind the symbolic approach is to split the phase space into a finite number of regions, and then each region is labeled with a symbol. From this point of view, the symbolic approach is a coarse-grained description of dynamics. Like other coarse-grained methods, which are usually used to provide some description of the data generating process, symbolic analysis focuses on some essential features of the generating dynamics which are frequently of interest to the researcher, for example, (in)dependence, cycles and nonlinear structure. In general terms, it can be said that symbolic analysis allows for designing tests that only focus on the relevant information required for the problem at hand.
This approach is not new in science. In the particular case of time series analysis, the symbolic approach implies transforming raw time series into a sequence of symbols.
Although seemingly counter-intuitive, symbolic analysis is rooted in information theory and also in dynamics theory. For example, properties of symbols or codes are central to the theory of communication [2]. Not in vain, there is a well-established mathematical discipline, namely, symbolic dynamics, that studies the behavior of dynamical systems. The name "symbolic dynamics" was first coined by [3], although the discipline started in 1898 with the pioneering work by Hadamard, who developed a symbolic description of sequences of geodesic flows. Interestingly, ref. [4] highlighted the power of the symbolic approach by showing that a complete description of the behavior of a dynamical system can be captured in terms of symbols. Notice that this property is crucial for the understanding of this paper, since important characteristics of a random variable can also be captured by studying the symbols derived from it.
The symbolic approach has been useful in many areas of scientific research. In the experimentalist realm, relevant contributions have been made in several fields: astrophysics; biology and medicine; chemistry, mechanical systems and fluid flow; artificial intelligence, control and communication; and data mining, classification and rule discovery (see [5][6][7] for an overview). In the non-experimentalist realm, symbolic analysis has also found interesting uses. In economics and finance, data are transformed and analyzed in terms of particular symbols [8]. Two examples are recession indicators utilized to study and to determine the business cycle, and the indicators used to characterize the stock market bull and bear market periods. In geography, works like that of [9] show how qualitative variables (symbolic analysis) can be used to map descriptions. In spatial econometrics, economic spatial dependence has recently been studied by transforming data into symbols [10,11]. Other interesting applications are [12,13].
Despite all these interesting applications and the scientifically founded roots of the symbolic approach, there is no systematic body of statistical tools for conducting inference based on symbolic sequences. There are some notable exceptions: [14][15][16][17][18][19][20][21]. A common factor to all of these statistical approaches is that they are centered on ordinal patterns, which is one type of symbol. In this paper we present a novel, systematic and general framework for any potential symbol in order to test for a wide range of potential null hypotheses that include, as particular cases, most of the previously indicated multidisciplinary situations, namely, ordinal patterns. We also provide a general asymptotic distribution theory for symbolic analysis. Particularly, this paper shows how, by means of symbols, it is possible to design nonparametric tests for a wide class of null hypotheses with special attention to limitations (restrictions) that typically appear in economics and finance. Therefore, this paper also aims to provide the theoretical basis for hypothesis testing by means of symbols.
An appealing advantage to symbolic analysis is that it requires very few assumptions about the data generating process in order to conduct statistical inference. This advantage is promising as the tools based on this method will share the model-free property, which avoids making unnecessary assumptions and provides more general results. Most of the econometric and statistical tests typically used in some of the mentioned disciplines cannot deal with potential nonlinear forms of dependence. By construction, nonlinear structures are not a limitation for symbolic analysis.
The capability of this approach is clearly illustrated by the scope of what we label "the symbolic main theorem" (SMT). Given a null hypothesis H 0 , for example, the null of serial independence, the SMT will give us four nonparametric asymptotic tests for that null, which are distribution free. The transformation of data into symbols is done by means of a symbolization map. Some of its properties are also studied in this paper. These symbolic-based tests have to deal with ordinary statistical problems that usually appear in economics and finance, such as data scarcity and suboptimal empirical power of the test. Given the flexibility of the symbols, we provide theoretical results and strategies to overcome such difficulties.
A clear example of the power of the new tool is illustrated by the spatio-temporal data modeling issues occupying a prominent role in spatial econometrics, geography and regional science, about which we can find a vast amount of literature ( [3], and references therein). We constructed several symbolic-based tests by using the SMT. These tests also constitute an added-value of the paper, because there are currently very few available tests designed to deal with spatiotemporal dependence. The problem becomes more difficult if potential nonlinear dependence is considered. A notable exception is [9] who has treated nonlinearity in a spatial framework.
Finally, the results of this paper might be of interest to fields of research where information theory plays a relevant role. Particularly, nonparametric entropy measures and tests for serial dependence have drawn the attention of econometricians (see [22] and references therein). The clearest link between our results and information theory is through the concept of symbolic entropy. In the context of time series analysis, permutation entropy, which is a type of symbolic entropy, uses the probabilities of length-m ordinal patterns in the definition of Shannon entropy. An ordinal pattern is a particular type of symbolization map. Given the characteristics of this map, the SMT allows us to obtain an asymptotic distribution theory for a permutation entropy-based test. Providing the statistical foundation for permutation entropy is especially relevant because: (a) there are very few asymptotic distribution theories available for entropy, in general; and (b) permutation entropy is currently used in computer science due to its relation to "incompressibility", and is also useful in the study of dynamical systems because of its connection to complexity.
From another point of view, some well-established nonparametric tests can be understood as particular types of symbolic analysis. For example, the nonparametric runs test for randomness by Wald-Wolfowitz (see [23]); joint-counting procedures for spatial association [24]; and in general, categorical data techniques [25] are simple examples that use the very general procedures of translating information into symbols. In this regard, symbolic analysis can be understood as a method related to this literature.
The paper is organized as follows: In Section 2, we provide the main notation and relevant concepts that will be used in the paper. Among them we highlight: symbolization maps, standard or non-standard maps and decomposable maps. Due to the generality of the method, we require the potential tests to be adaptable to different contexts, that is, to be able to deal with a wide range of null hypotheses. To this end we introduce the notion of perfect and non-perfect sets of subindexes in Section 3. This allows us to give general theoretical results to tackle practical situations that might otherwise be intractable because of the problem and/or the type of hypothesis. Therefore we distinguish between two main classes of theoretical situations that lead us to different statistical solutions. In Section 4 we show how to construct symbolic-based tests via likelihood ratio statistics and via asymptotic normality. Section 5 considers the theoretical case in which the null hypothesis cannot be treated under perfect situations, and hence other results are applicable. Section 6 puts forward the main theorem of this paper. Under the general conditions of this theorem, we introduce four tests for serial independence, four tests for spatial independence and four new tests for spatiotemporal independence, in Section 7. These tests are based on different symbolization maps, according to those given in Sections 2 and 3. Finally, in Section 8, we outline a Monte Carlo simulation experiment to show the capabilities of the spatiotemporal test for independence under linear and nonlinear settings. The paper ends with some conclusions.

Notation and Definitions
As indicated in the previous section, we give some definitions and introduce the basic notation that will be used throughout the rest of the paper.
Let {X i } i∈I be a stationary real-valued process, where I is a set of indexes. Let Γ = {η 1 , η 2 , . . . , η n } be a set of n > 1 elements that we label as symbols. Now assume that there exists a map f : {X i } i∈I → Γ for some subset of indexes I ⊆ I. We will say that i ∈ I is of η-type if and only if f (X i ) = η. We will call the map f a symbolization map for {X i } i∈I .
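Notice that it is possible to expand the definition of a symbolization map to the $k$-dimensional case by introducing the concept of decomposable maps: if $f_j : \{X_{ij}\}_{i \in I} \to \Gamma_j$, $j = 1, 2, \ldots, k$, are $k$ symbolization maps, then the product

$$F = f_1 \times f_2 \times \cdots \times f_k : \{(X_{i1}, X_{i2}, \ldots, X_{ik})\}_{i \in I} \to \Gamma_1 \times \Gamma_2 \times \cdots \times \Gamma_k$$

is a symbolization map for the $k$-dimensional variable $\{(X_{i1}, X_{i2}, \ldots, X_{ik})\}_{i \in I}$. We will call $F$ a $k$-decomposable symbolization map.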
Given a symbolization map f : {X i } i∈I → Γ and a symbol η ∈ Γ, we denote by p η the probability of occurrence of symbol η. Symbolization maps can be classified according to their behavior under the null hypothesis. If the symbolization map f is such that under a given null hypothesis (H 0 ) all the symbols have the same probability of occurring, we will say that f is a standard symbolization map. Otherwise, we will refer to f as a non-standard symbolization map.
The symbolic entropy of a process $\{X_i\}_{i \in I}$ is defined as the Shannon entropy of the $n$ distinct symbols as follows:

$$h(\Gamma) = -\sum_{\eta \in \Gamma} p_\eta \ln(p_\eta), \qquad (1)$$

with the convention $0 \times \ln 0 = \lim_{x \to 0^+} x \ln x = 0$. Symbolic entropy, $h(\Gamma)$, can be understood as the information in terms of symbols $\eta \in \Gamma$ of the process $\{X_i\}_{i \in I}$. Notice that $0 \le h(\Gamma) \le \ln(n)$. Notice also that the lower bound is attained when only one symbol occurs, and the upper bound when all $n$ possible symbols appear with the same probability.
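As an illustration of definition (1), the following minimal sketch computes a plug-in version of the symbolic entropy, assuming the unknown probabilities are replaced by the relative frequencies of the observed symbols. The function name `symbolic_entropy` and the toy symbol sequences are hypothetical and are not part of the original method.

```python
import math
from collections import Counter


def symbolic_entropy(symbols):
    """Plug-in Shannon entropy (natural log) of a sequence of symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # Symbols that never occur are absent from `counts`, which matches
    # the convention 0 * ln 0 = 0.
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Four equally frequent symbols attain the upper bound ln(4).
h_uniform = symbolic_entropy(["a", "b", "c", "d"] * 5)
# A single repeated symbol attains the lower bound 0.
h_constant = symbolic_entropy(["a"] * 20)
```

The two toy sequences reproduce the bounds stated above: the entropy equals ln(n) when the n symbols are equally frequent and 0 when only one symbol occurs.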
For each index $i \in I$ we define an indicator random variable $Z_{\eta i}$ as follows:

$$Z_{\eta i} = \begin{cases} 1 & \text{if } f(X_i) = \eta, \\ 0 & \text{otherwise,} \end{cases}$$

that is, we have that $Z_{\eta i} = 1$ if and only if $i$ is of $\eta$-type, and $Z_{\eta i} = 0$ otherwise. Then $Z_{\eta i}$ is a Bernoulli variable with probability of "success" $p_\eta$, where "success" means that $i$ is of $\eta$-type. It is straightforward to see that

$$\sum_{\eta \in \Gamma} p_\eta = 1.$$

Our interest is in knowing how many indexes $i$ are of $\eta$-type for every symbol $\eta \in \Gamma$. In order to answer the question, we construct the following counting variable:

$$Y_\eta = \sum_{i \in I} Z_{\eta i}.$$

The variable $Y_\eta$ can take the values $\{0, 1, 2, \ldots, R\}$, where $|I| = R$. To complete the notation, we will denote by

$$n_\eta = |\{i \in I : f(X_i) = \eta\}|$$

the cardinality of the subset of symbolized indexes $I$ formed by all the elements of $\eta$-type.
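Then, under the conditions above, one could easily compute the relative frequency of a symbol $\eta \in \Gamma$ by:

$$\hat{p}_\eta = \frac{n_\eta}{R},$$

where $n_\eta$ is the number of symbolized indexes of $\eta$-type and $R$ is the total number of symbolized indexes.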

On the Independence of Variables {Z η i } i∈I
If the subset of symbolized indexes $I$ is chosen such that the $Z_{\eta i}$ variables are independent for all $i \in I$, then $Y_\eta$ is a binomial random variable,

$$Y_\eta \sim B(R, p_\eta).$$

Moreover, if the subset of indexes $I$ is such that the variables $Z_{\eta_s}$ and $Z_{\sigma_t}$ are independent for all symbols $\eta_s, \sigma_t \in \Gamma$, $s \neq t$, where $s, t \in \{1, 2, \ldots, n\}$, then the joint probability function of the $n$ variables $(Y_{\eta_1}, Y_{\eta_2}, \ldots, Y_{\eta_n})$ is:

$$P(Y_{\eta_1} = a_1, Y_{\eta_2} = a_2, \ldots, Y_{\eta_n} = a_n) = \frac{(a_1 + a_2 + \cdots + a_n)!}{a_1! \, a_2! \cdots a_n!} \, p_{\eta_1}^{a_1} p_{\eta_2}^{a_2} \cdots p_{\eta_n}^{a_n},$$

where $a_1 + a_2 + \cdots + a_n = R$. Consequently, the joint distribution of the $n$ variables $(Y_{\eta_1}, Y_{\eta_2}, \ldots, Y_{\eta_n})$ is a multinomial distribution. In this case we will call the set of symbolized indexes $I$ a perfect subset.
It is possible to develop theoretical results that contemplate situations in which the researcher will benefit from constructing symbolization maps for which the set $I$ is not perfect. Given our previous definition of a perfect set, the subset of indexes $I$ can be non-perfect in the following cases:
• Case (a): $Z_{\eta i}$ and $Z_{\eta j}$ are not independent for all $i \neq j$ for some $\eta \in \Gamma$, and hence $Y_\eta$ is not a binomial random variable.
• Case (b): $Z_{\eta_s}$ and $Z_{\sigma_t}$ are not independent for all $s \neq t$ for some $\sigma_t, \eta_s \in \Gamma$, and hence $(Y_{\eta_1}, Y_{\eta_2}, \ldots, Y_{\eta_n})$ does not follow a multinomial distribution.
Theoretical results for perfect and non-perfect subsets are the topics of the following two sections.

Constructing Tests with Perfect Subsets of I
In this section we establish a general framework for testing a null hypothesis H 0 . This is done by focusing on the symbols' distribution when the subset I of symbolized indexes is perfect. Under this condition, we now show how to construct hypothesis tests via likelihood ratio statistics and via asymptotic normality.

Procedure
In general, our procedure consists of proceeding systematically as follows:
(step 1) Fix the null hypothesis H 0 to be tested.
(step 2) Define the set of symbols Γ and the symbolization map f .
(step 4) Finally design and compute a desired test statistic.
Steps 1 to 2 will be developed later in the paper. To accomplish the aim with steps 3 and 4, we first show how to construct a likelihood ratio test.
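Recall that when the set is perfect, $(Y_{\eta_1}, \ldots, Y_{\eta_n})$ follows a multinomial distribution and its likelihood function is

$$L(p_{\eta_1}, \ldots, p_{\eta_n}) = \frac{R!}{n_{\eta_1}! \, n_{\eta_2}! \cdots n_{\eta_n}!} \, p_{\eta_1}^{n_{\eta_1}} p_{\eta_2}^{n_{\eta_2}} \cdots p_{\eta_n}^{n_{\eta_n}},$$

and the likelihood ratio test statistic is

$$\lambda(Y) = \frac{\prod_{i=1}^{n} \left(p^{(0)}_{\eta_i}\right)^{n_{\eta_i}}}{\prod_{i=1}^{n} \hat{p}_{\eta_i}^{\, n_{\eta_i}}},$$

where $p^{(0)}_{\eta_i}$ denotes the probability of symbol $\eta_i$ under $H_0$ and $\hat{p}_{\eta_i}$ is the maximum likelihood estimator of $p_{\eta_i}$ for all $i = 1, 2, \ldots, n$. In this case, as shown in Appendix A, the maximum likelihood estimators are

$$\hat{p}_{\eta_i} = \frac{n_{\eta_i}}{R}, \quad i = 1, 2, \ldots, n,$$

and, asymptotically,

$$-2 \ln \lambda(Y) \sim \chi^2_k, \qquad (2)$$

where the $k$ degrees of freedom will depend on the set of symbols.
It is also proved that, if standard symbolization maps are considered, then

$$G(\Gamma) = -2 \ln \lambda(Y) = 2R[\ln(n) - h(\Gamma)] \sim \chi^2_k, \qquad (3)$$

which is an affine transformation of the symbolic entropy (1).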
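The relation between the likelihood ratio and the symbolic entropy for standard maps can be checked numerically. The sketch below assumes a standard symbolization map with n equiprobable symbols under the null and uses relative frequencies as estimates; the function names and the toy sample are hypothetical. It computes G = 2R[ln(n) − h] from the plug-in entropy and, separately, −2 ln λ directly from the multinomial likelihoods, so that the two can be compared.

```python
import math
from collections import Counter


def g_statistic(symbols, n_symbols):
    """G = 2R[ln(n) - h], with h the plug-in symbolic entropy."""
    counts = Counter(symbols)
    R = sum(counts.values())
    h = -sum((c / R) * math.log(c / R) for c in counts.values())
    return 2.0 * R * (math.log(n_symbols) - h)


def minus_two_log_lambda(symbols, n_symbols):
    """-2 ln(lambda) for H0: every symbol has probability 1/n."""
    counts = Counter(symbols)
    R = sum(counts.values())
    expected = R / n_symbols  # expected count of each symbol under H0
    return 2.0 * sum(c * math.log(c / expected) for c in counts.values())


sample = ["a"] * 10 + ["b"] * 6 + ["c"] * 4
g = g_statistic(sample, n_symbols=3)
lr = minus_two_log_lambda(sample, n_symbols=3)

# A perfectly balanced sample gives a statistic of (numerically) zero.
g_balanced = g_statistic(["a", "b", "c"] * 7, n_symbols=3)
```

The value would then be compared with a chi-square critical value; with no free parameters under the null and n − 1 under the alternative, the rule given for k suggests k = n − 1, which is stated here as an assumption of the sketch.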
As we emphasized in the Introduction, nonparametric results for entropy-based measures are relevant for econometrics and for other fields of research. Given that the distribution of the affine transformation in (3) holds for standard symbolization maps under broad conditions, this result will be of general applicability. In particular, as we will show in other sections, permutation entropy, as introduced in [26], is a particular type of symbolic entropy that has drawn the attention of several scholars, both for theoretical reasons, basically because for stationary and ergodic processes it coincides with Shannon entropy, and for its applied interest in nonlinear and complex systems or processes (see [1]). In this regard, (2) and (3) can be viewed as an initial step towards statistical inference of ordinal pattern distributions.
An alternative to likelihood ratio tests for symbolic maps can be considered by modifying step 4. Given that under a perfect subset $I$ the indicator variables $Z_{\eta i}$ are independent, the random variable

$$\frac{Y_\eta - R p_\eta}{\sqrt{R p_\eta (1 - p_\eta)}}$$

has a limiting normal distribution with zero mean and unit variance for every symbol $\eta \in \Gamma$. Moreover, the vector formed by these standardized variables over the $n$ symbols is asymptotically distributed as a multivariate normal distribution $MN(0, I)$. In this case, the asymptotic distribution holds for standard or nonstandard symbolization maps at the cost of estimating $p_{\eta_i}$, which can be consistently estimated by $\hat{p}_{\eta_i}$.
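The normal approximation for a single symbol can be sketched as follows. This is a minimal illustration that assumes the usual binomial standardization of the count of each symbol and that the null probabilities are supplied by the user; the function name `standardized_counts`, the symbol labels and the sample are hypothetical.

```python
import math
from collections import Counter


def standardized_counts(symbols, probs):
    """Standardize each symbol count by its binomial mean and standard deviation.

    `probs` maps every symbol to its probability under the null hypothesis.
    """
    counts = Counter(symbols)
    R = len(symbols)
    return {
        eta: (counts[eta] - R * p) / math.sqrt(R * p * (1.0 - p))
        for eta, p in probs.items()
    }


# 100 symbolized indexes and two symbols that are equiprobable under the null.
sample = ["up"] * 60 + ["down"] * 40
z = standardized_counts(sample, {"up": 0.5, "down": 0.5})
```

Each standardized count can then be compared with standard normal quantiles. For a nonstandard map the same function applies with unequal entries in `probs`.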

Constructing Tests with Non Perfect Subsets of I
In this section we establish the counterpart of the general framework presented above when the subset I of symbolized indexes is non-perfect. Accordingly, we now show to what extent and under which situations the previous likelihood and asymptotically normal tests (developed under perfect situations) can be adapted to deal with non-perfect subsets.
Non-perfect sets of indexes I might be very useful for test design, especially for situations or scientific domains characterized by a relatively small sample size compared with the number of symbols. In macroeconomics, although not necessarily in finance, data scarcity is a usual restriction. Ideally, one would be able to design perfect sets of indexes to carry out symbolic-based hypothesis testing. This section therefore provides symbolic-based methods for constructing hypothesis tests for situations in which an ideal design is not possible because of the nature of the problem, because of limited computational capabilities or for any other reason.

Binomial Approximation
Let us consider that the variables $Z_{\eta i}$ and $Z_{\eta j}$ are not independent for all $i \neq j$ for some $\eta \in \Gamma$. In this case, $Y_\eta$ is not a binomial random variable. The interesting question is how far $Y_\eta$ is from $B(R, p_\eta)$. We are interested in studying under what assumptions the variable $Y_\eta$ can be approximated by a binomial random variable:

$$Y_\eta \approx B(R, p_\eta).$$

In fact, it is possible to compute a bound for this binomial approximation. Denote by $L(Y_\eta)$ the distribution of $Y_\eta$; we are interested in the bound of the binomial approximation of the distribution of $Y_\eta$ measured in terms of total variation distance. The total variation distance $d_{TV}$ between two probability measures $P$ and $Q$ is defined by

$$d_{TV}(P, Q) = \sup_{A} |P(A) - Q(A)|,$$

where the supremum is taken over all measurable sets of the real line. Following Theorem 1.1 in [27] and after a few calculations, a bound can be given as follows: For each i, j ∈ I let Z ηj , Z ηi and J ηij be defined in the same probability space where W i counts the number of indexes that are of η-type and p = p η and q = 1 − p η . On the other Therefore, in order to get a bound for the binomial approximation, we have to get bound On the other hand, we have that Therefore, we have shown that the sum of dependent indicators can be approximated by a binomial random variable when the following two conditions are satisfied: (1) dependencies among the indicators are weak and (2) the probabilities of the indicators occurring under the null hypothesis are small. Notice that point (1) can be guaranteed by selecting the subset of symbolized indexes I such that |B i | is small enough.
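To make the total variation distance concrete, the following sketch evaluates it for distributions on a finite set, where the supremum over sets equals half the sum of the absolute differences of the probability mass functions. The helper names and the toy example are hypothetical: it compares the count of two perfectly dependent indicators (both equal to the same fair coin) with the binomial distribution B(2, 0.5) that would hold under independence.

```python
import math


def binomial_pmf(R, p):
    """Probability mass function of B(R, p) as a dict {k: P(Y = k)}."""
    return {k: math.comb(R, k) * p**k * (1 - p) ** (R - k) for k in range(R + 1)}


def total_variation(P, Q):
    """Total variation distance between two pmfs given as dicts."""
    support = set(P) | set(Q)
    return 0.5 * sum(abs(P.get(x, 0.0) - Q.get(x, 0.0)) for x in support)


independent = binomial_pmf(2, 0.5)       # {0: 0.25, 1: 0.5, 2: 0.25}
fully_dependent = {0: 0.5, 2: 0.5}       # both indicators always coincide
d = total_variation(fully_dependent, independent)
```

In this extreme case the distance is large, which illustrates why the binomial approximation requires weak dependence among the indicators.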

Normal Approximation
Additionally, when the indicator variables Z ηi are not independent for all i ∈ I , the following central limit theorem for dependent indicators ensures the convergence to a normal distribution: Theorem 1 (Theorem 7.7.5 (Anderson, 1971)). Let Z 1 , Z 2 , . . . be a stationary stochastic process such that for every integer n and integers t 1 , t 2 , . . . , t n with t 1 < · · · < t n , Z t 1 , . . . , Z t n are distributed independently of Z 1 , . . . , has a limiting normal distribution with mean 0 and variance Theorem 1 states that, if the dependencies are weak (for instance, |B i | is small enough can be computed as follows: We now consider the case where Z η s and Z σ t are not independent for all s = t for some σ t , η s ∈ Γ, and hence (Y η 1 , Y η 2 , . . . , Y η n ) is not a multinomial distribution (i.e., previous case (b)). To this end, denote by B . It is possible to show that (X η 1 , X η 2 , . . . , X η n ) is a multivariate normal distribution. In fact, it is equivalent to proving that any linear combination of the X η s is normal. In order to do so, notice that each variable X η is a sum of indicator variables, and therefore, any linear combination of these variables is again a sum of indicator variables. Consider an arbitrary linear combination as follows: Notice that the variable αX η can be written as is an indicator variable for all i ∈ I . Therefore, it follows that the variable M is a sum of dependent indicator variables. Again, by Theorem 1 we get that M follows a normal distribution whenever the dependencies among the indicator variables are weak (for instance, when the cardinality of the set is small enough for all i ∈ I and all symbol η, η s , η t ∈ Γ).
Then we get that (X η 1 , X η 2 , . . . , X η n ) has a limiting multivariate normal distribution and we can compute the variance and covariance matrix for all η s , η t ∈ Γ as follows:

The Symbolic Main Theorem
Previous partial results can be collected in the following main theorem.
Theorem 2 (Symbolic Main Theorem).
• If the set of symbolized indexes $I$ is perfect:
1. $G(\Gamma) = 2R[\ln(n) - h(\Gamma)]$ is asymptotically $\chi^2_k$ distributed, where $k$ is the difference between the number of parameters to be estimated under the alternative hypothesis $H_1$ and the number of parameters to be estimated under the null $H_0$.
• If the set of symbolized indexes $I$ is not perfect:
1. If the sets $B_i$ and $B_i^{(\eta_s, \eta_t)}$ have good properties in the sense that for every symbol $\eta \in \Gamma$ the variables $Y_\eta$ have a good approximation to a binomial distribution and $(Y_{\eta_1}, Y_{\eta_2}, \ldots, Y_{\eta_n})$ has a good approximation to a multinomial distribution, then $G(\Gamma) = 2R[\ln(n) - h(\Gamma)]$ is asymptotically $\chi^2_k$ distributed, where $k$ is the difference between the number of parameters to be estimated under the alternative hypothesis $H_1$ and the number of parameters to be estimated under the null $H_0$.
Notice that in the case of nonstandard symbolization maps, a theorem analogous to Theorem 2 will hold. More concretely, for point 1 of the SMT, a likelihood ratio test is also available, although not in the closed form presented here (see for example [28]). Point 2 of the SMT holds regardless of whether the symbolization map is standard.
The usefulness of the theorem will become evident in the next section, where it will be applied to specific null hypotheses. Naturally, it is possible to consider an alternative bootstrap-based test for symbolization maps instead of asymptotic ones. We do not follow this route in this paper; it remains a subfield of interest for further research, since it might partially avoid having to take care of non-perfect index sets.

Different Symbolizations for Different Nulls Related with Independence
According to the general symbolic theorem, in this section we show how it is possible to test interesting null hypotheses by using symbolic analysis. Specifically, we focus on testing for different nulls of independence, as it is a well-known field of research and because recently published articles can be generally understood and extended under this new theoretical framework. Given that each null hypothesis (step 1) will require a particular symbolization map, in this section we present different symbolization procedures (step 2) to test for serial dependence, spatial dependence, and spatiotemporal dependence, respectively. Then we present the results of step 3 and step 4 depending on the statistical technique the researcher wants to use according to Theorem 2, i.e., likelihood ratio statistics and/or asymptotically normal statistics. Given a null hypothesis, the behavior of the tests obtained from this approach will strongly depend on the expertise of the researcher in constructing the symbolization map. We emphasize that both the power analysis of the class of tests and power comparisons with alternative nonparametric tests were already given in previous work [10,16]; therefore, we do not replicate them here.
As we have indicated, the crucial component of the symbolic procedure is to choose a symbolic mapping which ensures that the distribution of the symbols can detect deviations from the null. The null hypotheses considered in this section are related to the important topic of "statistical independence". This is a very well-studied topic in time series analysis and therefore there is a generous number of available tests. By contrast, spatial independence is less well known and it is non-trivial to test for it. As we will show, a different symbolization map is needed for detecting spatial patterns. Similar comments can be made for spatio-temporal independence. Needless to say, there are other hypotheses of interest in econometrics, and the researcher will have to design suitable symbolic maps for testing them. For example, in [29], the authors dealt with the opposite problem: how to test for a pure deterministic chaotic process. In these and other cases, the power of the tests will centrally depend on the ability of the researcher to design the symbolization map for the desired null hypothesis.

Serial Independence Tests
In the case of time series, refs. [15,16] used the following symbolization procedure to test for serial dependence: Let {X t } t∈I be a real-valued time series (in this case the subindex t refers to time) for which we are interested in testing the null of serial independence (step 1). In order to complete step 2, we denote by Γ 1 = S m the symmetric group of order m!, that is, the group formed by all the permutations of length m (for a positive integer m ≥ 2). Let π i = (i 1 , i 2 , . . . , i m ) ∈ S m . The positive integer m is usually known as the embedding dimension.
An ordinal pattern is defined by a symbol $\pi_i = (i_1, i_2, \ldots, i_m) \in S_m$ at a given time $t \in I$. The time series can be embedded in an $m$-dimensional space:

$$X_m(t) = (X_{t+1}, X_{t+2}, \ldots, X_{t+m}) \quad \text{for } t \in I.$$

It is said that $t$ is of $\pi_i$-type if and only if $\pi_i = (i_1, i_2, \ldots, i_m)$ is the unique symbol in the group $S_m$ satisfying the two following conditions:
(a) $X_{t+i_1} \le X_{t+i_2} \le \cdots \le X_{t+i_m}$, and
(b) $i_{s-1} < i_s$ if $X_{t+i_{s-1}} = X_{t+i_s}$.
Notice that condition (b) guarantees uniqueness of the symbol $\pi_i$. This is justified if the values of $X_t$ have a continuous distribution so that equal values are very uncommon, with a theoretical probability of occurrence of 0.
In this case, the symbolization map is defined as $f_1 : \{X_t\}_{t \in I} \to S_m$ given by

$$f_1(X_t) = (i_1, i_2, \ldots, i_m), \qquad (6)$$

where $(i_1, i_2, \ldots, i_m) \in S_m$ is such that $t$ is of $(i_1, i_2, \ldots, i_m)$-type. Now the design of the symbolization map (step 2) is completed. Moreover, under the null of independence the distribution of the symbols is uniform and therefore the map $f_1$ is a standard symbolization map. Additionally, the set of symbolized indexes is $I = \{1, 2, \ldots, T - m + 1\}$, which is not perfect.
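The ordinal-pattern symbolization can be sketched in a few lines. The code below is a minimal illustration rather than the authors' implementation: it assumes that ties are broken in favor of the earlier position, which a stable sort provides, and the function names and the toy series are hypothetical. Positions inside each window are numbered 1 to m, following the embedding above.

```python
def ordinal_pattern(window):
    """Return (i_1, ..., i_m): window positions (1-based) sorted by value.

    Python's sort is stable, so equal values keep their original order,
    that is, the smaller position comes first when there is a tie.
    """
    m = len(window)
    return tuple(sorted(range(1, m + 1), key=lambda i: window[i - 1]))


def symbolize(series, m):
    """Ordinal pattern of every length-m window of consecutive observations."""
    return [ordinal_pattern(series[t:t + m]) for t in range(len(series) - m + 1)]


series = [4, 7, 9, 10, 6, 11, 3]
patterns = symbolize(series, m=3)   # T - m + 1 = 5 overlapping windows
tie_pattern = ordinal_pattern([5, 5, 1])
```

Consecutive windows share m − 1 observations, which is the reason why the full set of symbolized indexes is not perfect.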
Notice that in order to have a perfect set, and therefore ensure the independence of the indicator variables $Z_{\pi t}$, it is enough to consider a set of symbolized indexes whose embedding windows do not overlap. Accordingly, using this symbolization map, the next corollary straightforwardly follows from Theorem 2:
Corollary 1. Let $f_1 : \{X_t\}_{t \in I} \to \Gamma_1$ be the symbolization map defined in (6) with $|I| = R$. Denote by $h(\Gamma_1)$ the permutation entropy defined in (1). If the time series $\{X_t\}_{t \in I}$ is independent, then:
• If the set $I$ is perfect, then:
1. The affine transformation of the permutation entropy $G(\Gamma_1) = 2R[\ln(m!) - h(\Gamma_1)]$ is asymptotically $\chi^2_{m!-1}$ distributed.
2. The vector of standardized symbol counts over the $m!$ ordinal patterns is asymptotically multivariate normal $MN(0, I)$.
• If the set $I$ is not perfect:
1. Since the sets $B_{\pi_i}$ have cardinality of at most $2m$, we can get a good approximation to the following result via [21]: the affine transformation of the permutation entropy $G(\Gamma_1) = 2R[\ln(m!) - h(\Gamma_1)]$ is asymptotically $\chi^2_{m!-1}$ distributed.
2. The vector of standardized symbol counts over the $m!$ ordinal patterns is asymptotically multivariate normal $MN(0, \Sigma)$, where the variance and covariance matrix can be estimated using (4) and (5).
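One simple way to obtain independent indicators under the null of serial independence is to symbolize only windows that share no observation. The sketch below is an assumption made for illustration, not necessarily the index set used by the authors: it keeps every m-th starting position (0-based), and the function name `non_overlapping_starts` is hypothetical.

```python
def non_overlapping_starts(T, m):
    """0-based start positions of length-m windows that share no observation."""
    return list(range(0, T - m + 1, m))


T, m = 10, 3
starts = non_overlapping_starts(T, m)
# Index sets covered by each retained window.
windows = [set(range(s, s + m)) for s in starts]
```

The price of a perfect set is a smaller number R of symbolized indexes, roughly T/m instead of T − m + 1, which is the trade-off behind the non-perfect results above.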
These results for permutation entropy relate to a relatively recent line of research based on order patterns for analyzing time series. Ordinal patterns can be used per se for descriptive purposes, like autocorrelation, with the added advantage that they require no assumptions such as Gaussianity or linearity. On the contrary, only mild stationarity conditions are required of the underlying process. The above corollary is a further step in the development of statistical inference for ordinal time series. Naturally, it is possible to obtain other kinds of statistical results by adding more assumptions to the generating process. In fact, notable results can be found in [4] if Gaussianity and ergodicity are assumed. In this regard, our asymptotic results for order patterns keep assumptions to a minimum. Additionally, by maintaining general applicability at minimum cost (in terms of assumptions) for serial independence tests, some bootstrap-based statistics for ordinal patterns have been put forward in [29].
An interesting property of the symbolization procedure presented in this section is that it can also be used for discrete distributions. To do so, it is necessary to consider a non-standard version of the map. Under such circumstances, the likelihood ratio (2) can be directly used once the behavior of p η i is known under the null of serial independence.

Spatial Independence Tests
In the case of spatial processes, ref. [10] gave a symbolization procedure to test for spatial independence as follows: Let {X s } s∈S be a real-valued spatial process, where S is a set of coordinates. Given a location s 0 , we will denote by (ρ 0 i , θ 0 i ) the polar coordinates of location s i taking as origin s 0 .
Let m ∈ N with m ≥ 2. Consider now that the spatial process {X s } s∈S is embedded in a different m-dimensional space as follows: X m (s 0 ) = (X s 0 , X s 1 , . . . , X s m−1 ) for s 0 ∈ S where s 1 , s 2 , . . . , s m−1 are the m − 1 nearest neighbors to s 0 , which are ordered from lesser to higher Euclidean distance with respect to location s 0 . Notice that in the case of two or more locations being equidistant to s 0 , we will choose them in an anticlockwise manner. In formal terms, s 1 , s 2 , . . . , s m−1 are the m − 1 nearest neighbors to s 0 satisfying the following two conditions: . Notice that conditions (a) and (b) ensure the uniqueness of X m (s) for all s ∈ S. The proposed standard symbolization map f is defined as follows: denote by Me the median of the spatial process {X s } s∈S and let Now, define the indicator function Then, the standard symbolization map is defined as: f 2 (X s ) = (I ss 1 , I ss 2 , . . . , I ss m−1 ), (9) where Γ 2 stands for the set of symbols defined by f 2 .
Notice that under the null of spatial independence, the distribution of the symbols is uniform and therefore the map f 2 is a standard symbolization map.
Moreover, in this case I = S is not a perfect symbolized set. To construct a perfect symbolized set I , one can proceed as follows. Take a location s 0 ∈ S at random. Let N s be the set of nearest neighbors to s. Now select the next element of I by taking s 1 ∈ S such that N s 1 ∩ N s 0 = ∅. Then construct the set I recursively by taking s k ∈ S \ {s 0 , s 1 , . . . , s k−1 } satisfying N s i ∩ N s j = ∅ for all i ≠ j with i, j = 1, 2, . . . , k.
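The recursive construction of a perfect symbolized set can be sketched as a greedy selection. The following is only an illustrative sketch, not the procedure used in the experiments: the function names are hypothetical, neighbors are found by Euclidean distance, and each neighborhood is assumed to include the location itself, which is slightly stricter than the disjointness condition stated above.

```python
import numpy as np

def nearest_neighbors(coords, k):
    """Indices of the k nearest neighbours of every location (self excluded)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a location is not its own neighbour
    return np.argsort(d, axis=1)[:, :k]

def perfect_index_set(coords, k, rng):
    """Greedy selection of locations whose neighbourhoods are pairwise disjoint.

    Assumption: the block attached to s is {s} together with its k nearest
    neighbours; locations are visited in random order, as in the text.
    """
    nbrs = nearest_neighbors(coords, k)
    used = set()        # every location already covered by a selected block
    selected = []
    for s in rng.permutation(len(coords)):
        block = {int(s), *map(int, nbrs[s])}
        if used.isdisjoint(block):
            selected.append(int(s))
            used |= block
    return selected, nbrs

rng = np.random.default_rng(0)
coords = rng.normal(size=(50, 2))        # hypothetical irregular locations
selected, nbrs = perfect_index_set(coords, k=3, rng=rng)
```

Because the selection is greedy, the resulting set depends on the random visiting order and is not guaranteed to be the largest possible perfect set.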
As is evident, the method is flexible enough to allow researchers to select their own set and map of symbols for a given null. For example, if under the previous symbolization procedure the power (or size) of the test is not satisfactory, one can always consider other possible symbolization procedures for the same null and for the same spatial process $\{X_s\}_{s\in S}$. Let $\Gamma_3 = \{1, 2, \dots, k\} \times \{1, 2, \dots, k\}$. Again, let $N_s$ be the set of nearest neighbors to $s$ and let $n_s$ be its cardinality. Denote by $X^N_s = \frac{1}{n_s}\sum_{s' \in N_s} X_{s'}$ the average of the process over the neighbors of $s$. Denote by $q_i$ and $q^N_i$ the $i$-th quantile of the variables $X$ and $X^N$ respectively, for $i \in \{1, 2, \dots, k-1\}$. We will denote by $q_0 = \min_{s\in S} X_s$ (resp. $q^N_0 = \min_{s\in S} X^N_s$) and $q_k = \max_{s\in S} X_s$ (resp. $q^N_k = \max_{s\in S} X^N_s$). Then we define the symbolization map $f_3(X_s) = (i, j)$ (10) if and only if $X_s \in [q_{i-1}, q_i]$ and $X^N_s \in [q^N_{j-1}, q^N_j]$. Again, under the null of independence the distribution of the symbols is uniform and therefore the map $f_3 : \{X_s\}_{s\in I} \to \{1, 2, \dots, k\} \times \{1, 2, \dots, k\}$ is a standard symbolization map.
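A minimal sketch of this quantile-based symbolization is given below. It rests on two assumptions that the text does not fix: the i-th quantile is taken to be the empirical i/k quantile, and a value lying exactly on a boundary is assigned to the lower bin. The function names and the neighbor array are hypothetical.

```python
import numpy as np

def quantile_bin(values, k):
    """Map each value to a bin 1..k delimited by the empirical i/k quantiles."""
    inner = np.quantile(values, np.arange(1, k) / k)   # q_1, ..., q_{k-1}
    # side="left" places a value equal to q_i in bin i (the lower bin)
    return np.searchsorted(inner, values, side="left") + 1

def f3_symbols(x, nbrs, k):
    """Return an (n, 2) array whose row s is the symbol (i, j) of location s.

    x    : observations X_s, one per location
    nbrs : integer array of shape (n, n_s) with the neighbours of each location
    """
    x = np.asarray(x, dtype=float)
    x_nbr = x[nbrs].mean(axis=1)        # neighbourhood average X^N_s
    return np.column_stack([quantile_bin(x, k), quantile_bin(x_nbr, k)])

rng = np.random.default_rng(1)
x = rng.normal(size=200)
nbrs = rng.integers(0, 200, size=(200, 3))   # placeholder neighbour lists
symbols = f3_symbols(x, nbrs, k=3)
```

By construction each marginal bin receives roughly the same number of observations; whether the k × k joint symbols are also close to uniform is what carries information about spatial dependence.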
Again, the same set of recursively constructed symbolized indexes S ensures the independence of the indicator variables Z ηs . Accordingly, using this symbolization map, the next corollary straightforwardly follows from Theorem 2: 3 be the symbolization maps defined in (8) and (10) with |S | = R. Denote by h(Γ i ) the symbolic entropy defined in (1). If the spatial process {X s } s∈S is independent, it follows that: • If the set S is perfect then: 1.
The affine transformation of the symbolic entropy Then ( , . . . , ) is a multivariate normal distribution MN(0, I).
• If the set S is not perfect, then: 1.
Since the sets B σ i have small cardinality, we can get a good approximation to the following result of [17]. The affine transformation of the symbolic entropy Then ( , . . . , ) is a multivariate normal distribution MN(0, Σ) where the variance and covariance matrix can be estimated using (4) and (5).
In Section 2 we indicated that there is a class of non-standard symbolization maps. Consider a situation in which reducing the number of possible symbols under study benefits the behavior and properties of the test. In this and other potential situations, non-standard maps might be useful. As an example, we now construct a non-standard symbolization map to test for independence in the spatial context. The following symbolization is an example of the more general procedure that we give in Appendix A.3.
Consider again the set Γ 2 of symbols defined in (9) for a fixed embedding dimension m. Now we will denote by a the remainder of the division of a by m − 1.
Now define the following equivalence relation ∼: if and only if there exists an integer k such that I s s i = I ss i+k for all i ∈ {1, 2, . . . , m − 1}. We then take as the set of symbols Γ 4 = Γ 2 / ∼, the set of classes in Γ 2 modulo the equivalence relation ∼.
Notice that, in general, in this case not all the symbols in Γ 4 have the same probability of occurring, and therefore the symbolization map f 4 : {X s } s∈S → Γ 4 is non-standard.
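To see why the symbols of Γ 4 are not equiprobable, the following sketch enumerates the classes for a small case. It assumes that the equivalence relation identifies two binary indicator vectors when one is a cyclic shift of the other (indices taken modulo m − 1), and that all vectors in Γ 2 are equally likely under the null; the function names are hypothetical.

```python
from collections import Counter
from itertools import product

def rotation_class(symbol):
    """Canonical representative: the lexicographically smallest cyclic rotation."""
    symbol = tuple(symbol)
    return min(symbol[i:] + symbol[:i] for i in range(len(symbol)))

def class_probabilities(length):
    """Probability of each class when all 2**length binary vectors are equally likely."""
    counts = Counter(rotation_class(s) for s in product((0, 1), repeat=length))
    total = 2 ** length
    return {cls: c / total for cls, c in counts.items()}

# m = 4 gives indicator vectors of length m - 1 = 3
probs = class_probabilities(3)
# probs == {(0, 0, 0): 0.125, (0, 0, 1): 0.375, (0, 1, 1): 0.375, (1, 1, 1): 0.125}
```

In this small case the eight symbols of Γ 2 collapse into four classes with probabilities 1/8, 3/8, 3/8 and 1/8, so the induced map is non-standard.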

Spatiotemporal Independence Tests
The issues related to spatiotemporal data modeling occupy a prominent role in current econometrics, where we can find recent literature devoted to this topic (see [9,30]). Spatiotemporal dependence introduces considerable difficulties with respect to modeling, computation and statistical theory. If independence can be taken for granted, and likewise the common assumption of cross-sectional independence, then computations and the application of inference rules simplify significantly. It therefore seems reasonable to test first for spatiotemporal independence and, if the evidence for independence is strong, to proceed with the well-known methods. Unfortunately, tests for spatiotemporal independence are scarce. The aim of this section is twofold: to contribute to this rather scarce literature, and to highlight the usefulness of the novel general method presented in this paper. To this end we consider the relevant null of spatiotemporal independence. Of particular interest for our tests is that dependence is not taken as synonymous with correlation, and therefore nonlinearities do not restrict our test.
Consider the process $\{X_{ts}\}_{t\in I, s\in S}$. As in the previous cases, one can define several standard and non-standard symbolization maps. For simplicity, we adapt the previous symbolizations to the spatiotemporal case as follows. For a fixed location $s_0 \in S$, define $\{X_{t(s_0)}\}$ as the time series $\{X_{1s_0}, X_{2s_0}, \dots, X_{ps_0}, \dots\}$. Similarly, for a fixed period $t_0 \in I$, define $\{X_{(t_0)s}\}$ as the spatial process $\{X_{t_0s_1}, X_{t_0s_2}, \dots, X_{t_0s_p}, \dots\}$. Let $m_t, m_s \in \mathbb{N}$ with $m_t, m_s \ge 2$ be the time and space embedding dimensions, respectively. Under this setting we define the decomposable symbolization maps $F_{1i} : \{X_{ts}\}_{t\in I, s\in S} \to S_m \times \Gamma_i$ for $i = 2, 3, 4$ by $F_{1i}(X_{ts}) = \big(f_1(X_{t(s)}), f_i(X_{(t)s})\big)$ (11), where $f_1 : \{X_{t(s)}\} \to S_m$ and $f_i : \{X_{(t)s}\} \to \Gamma_i$ for $i = 2, 3, 4$ are defined as above.
Notice that, when testing for spatiotemporal independence, the symbolization map $F_{1i}$ is standard for $i = 2, 3$, while for $i = 4$ it is non-standard.
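A decomposable map simply pairs a temporal symbol with a spatial symbol. The sketch below illustrates the idea under stated assumptions: the temporal component is taken to be the ordinal pattern of a window of length m t, encoded as the permutation that sorts the window, and the spatial component is passed in as an arbitrary function. The helper `above_median` is a deliberately simple placeholder and not one of the maps defined in the paper.

```python
import numpy as np

def ordinal_pattern(window):
    """Temporal symbol: the permutation that sorts the window (an element of S_m)."""
    return tuple(int(i) for i in np.argsort(window, kind="stable"))

def decomposable_symbol(x, t, s, m_t, spatial_symbol):
    """Pair the temporal symbol of the series at s with the spatial symbol at period t.

    x              : array of shape (T, R), rows are periods, columns are locations
    spatial_symbol : callable (cross_section, s) -> symbol; hypothetical stand-in
                     for any of the spatial maps
    """
    time_part = ordinal_pattern(x[t:t + m_t, s])
    space_part = spatial_symbol(x[t, :], s)
    return time_part, space_part

def above_median(cross_section, s):
    """Placeholder spatial symbol: 1 if X_ts exceeds the cross-sectional median."""
    return int(cross_section[s] > np.median(cross_section))

x = np.array([[1.0, 5.0],
              [3.0, 2.0],
              [2.0, 9.0]])
symbol = decomposable_symbol(x, t=0, s=0, m_t=3, spatial_symbol=above_median)
# symbol == ((0, 2, 1), 0)
```

Keeping the spatial component as a parameter mirrors the text: the same temporal map can be combined with a standard or a non-standard spatial map without changing the rest of the procedure.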
It is also possible to define an extension of the symbolization map f 2 in a spatiotemporal context. Indeed, consider the following map: defined by g(X ts ) = ( f (X (t)s ), f (X (t+1)s ), . . . , f (X (t+m t −1)s )) where f (X (t+i)s ) = (I ts,(t+i)s 1 , I ts,(t+i)s 2 , . . . , I ts,(t+i)s m t −1 )) for all i = 0, 1, . . . , m t − 1 and the indicator function I ts,(t+i)s j is defined as in (7). Accordingly, using this symbolization map, the next corollary straightforwardly follows from Theorem 2: Γ 2 be the standard symbolization maps defined in (11) with i = 2, 3 and in (12) Γ 2 ) the symbolic entropy defined in (1). If the spatiotemporal process {X ts } t∈I,s∈S is independent, then: • If indexes sets are perfect then: 1.

2.
Then ( , . . . , ) is a multivariate normal distribution MN(0, I) where n is the cardinality of the set of symbols.
• If indexes sets are not perfect then 1.

2.
Then ( , . . . , ) is a multivariate normal distribution MN(0, Σ) where the variance and covariance matrix can be estimated using (4) and (5) and n is the cardinality of the set of symbols.

Empirical Behavior of the Tests for Spatiotemporal Independence
In this section we evaluate the empirical behavior of the STG test with different configurations for the subset I . The first aim of this section is to show the flexibility of Corollary 3 to cope with different scenarios. The second is to evaluate the empirical behavior of the new test. The third is to evaluate how the selection of I affects the empirical size and the power of the test.
To those ends we designed a Monte Carlo experiment as follows. Firstly, we consider the problem of testing for independence on regular lattices of several orders: R = 64 (8 × 8) with T = 150, and R = 100 (10 × 10), for which we consider two possible temporal scenarios depending on data availability, T = 200 and T = 800. We also simulated richer regular lattices of order R = 400 (20 × 20), although on this occasion we only considered T = 200. The symbolization map follows from (12) with m s = 4 and m t = 2. The test under study was generated from Corollary 3 under Expression (13). Therefore, we used a perfect subset of indexes. This subset was constructed recursively, as indicated in Section 7: N s i t j ∩ N s r t k = ∅, where N s i t j is the set formed by s i , the three nearest neighbors of s i in t j and the four spatial locations in the next time period. The power of the test is evaluated with the following DGPs: where ε t ∼ N(0, 1), which was also used for evaluating the empirical size of the test. Parameters α and γ control the intensity of temporal and spatial dependence, respectively, and λ was fixed at five in all simulations. The weighting matrix W was specified as a binary matrix using a contiguity criterion with rook-type movements. Table 1 collects the empirical size and power of the STG test for 1000 repetitions. The size is controlled and the test is powerful. Even for a low intensity of dependence (α = γ = 0.1) the test remains highly powerful; we have to set α = 1/40, γ = 1/25 (or below) for the test to lose power. This occurred regardless of the DGP under consideration.
Regular spatiotemporal configurations are interesting because (1) time series posit a natural order for observations, (2) lattice data provide the simplest extension of time series and (3) some scientific methods are compatible with this spatiotemporal configuration. However, irregular patterns occur frequently with spatial data.
In geographical settings, data are liable to be recorded across heterogeneously-sized administrative regions, while economic distances do not correspond to regular spacing. Therefore, it is also useful to adapt the STG symbolic test to irregular spatiotemporal settings. In terms of our general methodology (see Corollary 3) this problem is tractable by considering the symbolization map F 1i , i = 2, where we control the dependence among the indicators by controlling, on average, the cardinality of the sets B i . In particular, we select the set of indexes I such that |B i | ≤ (m s + m t − 1)/2; i.e., the average cardinality of the sets B i is at most half of the number of spatiotemporal neighbors.
Therefore, to complete the experiment (in the case of nonperfect lattices) we evaluate the STG version for irregular lattices, where the coordinates of each spatial location are drawn from an N(0, 1) distribution. We have considered the three nearest neighbors for irregular lattices. Afterwards, the resulting matrix was row-standardised in the usual way. Table 2 collects the size and power for models constructed from DGP1 and DGP2. The introduction of irregular lattices has led us to introduce non-perfect indexes, and accordingly the size of the test slightly increased, although the levels seem acceptable, particularly for large samples. Power is comparable to that obtained with perfect indexes, and therefore the same comments apply (similar results are obtained when using the multivariate normal approximation).
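The two spatial configurations used in the experiments can be sketched as follows: a binary rook-contiguity matrix for a regular lattice, and a three-nearest-neighbor matrix for locations with N(0, 1) coordinates, followed by row-standardisation. This is only an illustrative sketch; the function names are hypothetical and no claim is made that it reproduces the exact code behind Tables 1 and 2.

```python
import numpy as np

def rook_weights(n_rows, n_cols):
    """Binary rook-contiguity matrix for a regular n_rows x n_cols lattice."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            # rook moves: up, down, left, right
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i, rr * n_cols + cc] = 1.0
    return w

def knn_weights(coords, k):
    """Binary matrix linking every location to its k nearest neighbours."""
    n = len(coords)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nbrs = np.argsort(d, axis=1)[:, :k]
    w = np.zeros((n, n))
    w[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return w

def row_standardise(w):
    """Divide each row by its sum; rows without neighbours are left as zeros."""
    sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, sums, out=np.zeros_like(w), where=sums > 0)

w_regular = rook_weights(8, 8)                     # R = 64 lattice
rng = np.random.default_rng(2)
coords = rng.normal(size=(64, 2))                  # irregular locations
w_irregular = row_standardise(knn_weights(coords, k=3))
```

Note that the rook matrix is symmetric, whereas the nearest-neighbor matrix generally is not, since being among the three nearest neighbors of a location is not a reciprocal relation.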

Comparison with Other Spatiotemporal Test for Independence
We now confront our test with an unfavorable scenario characterized by a small amount of available data on irregular lattices, in both linear and nonlinear setups. To this end we consider the following sample sizes: (36 × 10), (64 × 10), (100 × 10), (100 × 30) and (200 × 10). According to our theoretical discussion, given data scarcity and the irregular spatial configuration, we use the non-perfect subset of indexes. Additionally, we consider the symbolization map based on equivalence relations, F 1,2 = ( f 1 , f 2 ), as described in Appendix A.3 for non-standard maps.
To complete the empirical study, we compare our test with another nonparametric spatiotemporal test [31] which is described in Appendix A.4 and we refer to it as STBP.
Notice, however, that the STBP test requires one to correctly specify the weighting matrix, W; this is not a requirement for the symbolic test.
In terms of empirical size, both tests behave similarly well for linear processes (Table 3). By contrast, for the nonlinear processes (Table 4), the size of the STBP test is poor, while the symbolic-based test performs as expected. In terms of empirical power, the STBP test outperforms the STG test in the case of the linear process, especially for low intensity levels of dependence. However, under a nonlinear spatiotemporal configuration, the STG test clearly presents a better balance between size and power and outperforms the STBP test in all cases.

Conclusions
Central null hypotheses in experimental and non-experimental branches of science can be easily tested by means of symbolized information. This paper provides the analytical tools to construct nonparametric hypothesis tests based on symbols. These tools are able to cope with different null hypotheses and with distinct scenarios in which some realistic limitations might be imposed on test designs.
A shared characteristic of all these symbolic test families is that few assumptions are needed to obtain asymptotic results, which gives the method wide applicability. In particular, in this paper we have shown that two well-known symbolic-based tests are particular cases of the main symbolic theorem (Theorem 2), which is stated in this paper for the first time. Furthermore, a set of new symbolic-based tests for spatiotemporal independence is put forward by using the main results of this paper collected under the main symbolic theorem (Theorem 2). Monte Carlo simulations provide evidence of the high power of the test under study. There are currently circumstances in which speed, robustness to noise or computational cost are paramount, and these favor fruitful applications of symbolic analysis.
Several lines of further research are worth pursuing. We now indicate some on which we and other scholars are currently working: (i) One of the appealing properties of symbolic-based testing is that it requires few assumptions. In this paper we have assumed stationarity; however, it would be interesting to study whether it is possible to be less restrictive. (ii) In the context of time series analysis, most available techniques require the existence of second moments; however, by using certain symbolizations, it might be possible to waive this requirement. This would allow time series researchers to consider a wider variety of model classes. (iii) One of the main contributions of the paper is that it suggests that researchers can design a symbolization procedure (map) to test null hypotheses. It would be interesting to study what types of null hypotheses are more suitable for analysis using symbolic maps.

Appendix A.3
This appendix gives a procedure to construct non-standard symbolization maps. Let $A = \{A_1, A_2, \dots, A_d\}$ be a family of nonempty subsets of the set of symbols $\Gamma$. Assume that $A$ is a partition of $\Gamma$, that is, $\Gamma = \bigcup_{i=1}^{d} A_i$ and $A_i \cap A_j = \emptyset$ for all $i \neq j$. Now we define the relation $\sim$ in $\Gamma$ by $\eta \sim \sigma$ if and only if $\eta, \sigma \in A_i$ for some $i = 1, 2, \dots, d$.
Obviously the relation $\sim$ is an equivalence relation, and therefore we can consider the quotient set $\tilde{\Gamma} = \Gamma/\sim$, formed by all the equivalence classes, as a new set of symbols. Denote by $\tilde{\sigma}$ the equivalence class of symbol $\sigma$. Therefore, there exists a natural projection $\pi : \Gamma \to \tilde{\Gamma}$ defined by $\pi(\sigma) = \tilde{\sigma}$. Moreover, any symbolization map with set of symbols $\Gamma$, namely $f : \{X_i\}_{i\in I} \to \Gamma$, can be extended to a symbolization map with set of symbols $\tilde{\Gamma}$ by considering the composition $\tilde{f} = \pi \circ f : \{X_i\}_{i\in I} \to \tilde{\Gamma}$. Notice that the cardinality of the set $\tilde{\Gamma}$ is $d$, which is always smaller than or equal to the cardinality of the former set $\Gamma$.
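The construction amounts to composing an existing symbolization map with the projection onto the blocks of a partition. A minimal sketch, with hypothetical function names and a toy partition chosen only for illustration (binary triples grouped by their number of ones), could read:

```python
from itertools import product

def projection_from_partition(partition):
    """Build the projection: symbol -> label of the block A_i that contains it."""
    pi = {}
    for label, block in enumerate(partition, start=1):
        for symbol in block:
            if symbol in pi:
                raise ValueError(f"{symbol!r} appears in more than one block")
            pi[symbol] = label
    return pi

def coarsen(f, pi):
    """Return the composed map x -> pi(f(x)), i.e. the coarsened symbolization."""
    return lambda x: pi[f(x)]

# Toy example: Gamma is the set of binary triples, partitioned by number of ones
gamma = list(product((0, 1), repeat=3))
partition = [[s for s in gamma if sum(s) == ones] for ones in range(4)]
pi = projection_from_partition(partition)

f = lambda x: tuple(int(v > 0) for v in x)   # hypothetical map onto Gamma
f_tilde = coarsen(f, pi)
label = f_tilde((0.4, -1.2, 2.0))            # f gives (1, 0, 1), which lies in block 3
```

Here the eight original symbols are reduced to d = 4 classes, which illustrates the remark that the quotient set can never be larger than the original set of symbols.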

Appendix A.4. A Generalization to Spatiotemporal Data of the Brett and Pinkse Test
The test of [32] is a nonparametric test of spatial dependence based on the fact that two variables are independent if their joint characteristic function factorizes into the product of the marginal characteristic functions. We adapt this test for use with spatiotemporal data.
Let {y ts } t∈I,s∈S be a spatiotemporal realization of a process. The y ts 's can have continuous, discrete or mixed distributions, and the distribution functions are generally unknown. Under the null hypothesis, the spatiotemporal process is stationary and independent in space and time.
The design of the test is as follows. Let $g$ be any practitioner-chosen density function with infinite support, and denote by $h(x) = \int e^{iux} g(u)\,du$ the Fourier transform of $g$. Given a location $s$, $N_s$ refers to the set of neighbors of coordinate $s$. Fix a positive integer $m$ and define $N^m_{ts} = \{t's' \mid t' = t, t+1, \dots, t+m-1;\ s' \in N_s\}$ as the set of nearby observations to location $s$ in period $t$, and $n_{ts} = |N^m_{ts}|$ as the number of such observations. Let $y^N_{ts} = \frac{1}{n_{ts}} \sum_{rk \in N^m_{ts}} y_{rk}$ denote the sample average of the observations proximate to $y_{ts}$. The null hypothesis of the STBP test is $H_0$: $y_{ts}$ and $y^N_{ts}$ are independent for all $t \in I$ and $s \in S$.
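The two ingredients defined so far can be sketched in a few lines. The sketch assumes that g is the standard normal density, whose Fourier transform is exp(−x²/2), and that periods for which the window t, . . . , t + m − 1 runs past the end of the sample are left undefined; both choices, as well as the function names, are assumptions made for illustration.

```python
import numpy as np

def h_gaussian(x):
    """Fourier transform of the standard normal density: exp(-x^2 / 2)."""
    return np.exp(-0.5 * np.asarray(x, dtype=float) ** 2)

def neighbour_average(y, nbrs, m):
    """Average of y over periods t..t+m-1 and the spatial neighbours of s.

    y    : array of shape (T, R)
    nbrs : nbrs[s] lists the spatial neighbours of location s (s itself excluded)
    Entries for the last m - 1 periods are NaN because the window is incomplete.
    """
    T, R = y.shape
    y_n = np.full((T, R), np.nan)
    for t in range(T - m + 1):
        window = y[t:t + m]                  # periods t, ..., t + m - 1
        for s in range(R):
            y_n[t, s] = window[:, nbrs[s]].mean()
    return y_n

y = np.arange(12, dtype=float).reshape(4, 3)     # toy panel: T = 4, R = 3
nbrs = [[1, 2], [0, 2], [0, 1]]
y_n = neighbour_average(y, nbrs, m=2)
# y_n[0, 0] averages y[0, 1], y[0, 2], y[1, 1], y[1, 2] = (1 + 2 + 4 + 5) / 4 = 3.0
```

A symmetric density such as the normal yields a real-valued h, which keeps all subsequent quantities real.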
Define h (t 1 s 1 ,t 2 s 2 ) = h(y t 1 s 1 − y t 2 s 2 ) and h NN (t 1 s 1 ,t 2 s 2 ) = h(y N t 1 s 1 − y N t 2 s 2 ). Introduce η n1 , η n2 and η n3 defined by where n = RT is the number of observations. Let Under the null of the test, the extension of the Brett and Pinkse statistic for a spatiotemporal context is the following: STBP = nη n 2ν n which is asymptotically χ 2 1 distributed. The following two sufficient conditions are required for the STBP test to be consistent: (1) spatiotemporal dependence of a fixed order, and (2) the sequence has to be strongly mixing. Strong mixing is a weak dependence condition, while fixed-order dependence restricts the dependence to relationships between proximate observations. When dependence is of this kind, the null hypothesis will be asymptotically rejected; however, the behavior of the test is undetermined when the dependence involves observations that are not geographically or temporally proximate. With spatial data, this means that a specification of the so-called spatial weighting matrix is needed and that this specification must be correct (see Lopez et al., 2011, for more details).