Ordinal Pattern Dependence in the Context of Long-Range Dependence

Ordinal pattern dependence is a multivariate dependence measure based on the co-movement of two time series. In close connection to ordinal time series analysis, only the ordinal information is taken into account, which yields robust results on the dependence between the two processes. This article deals with ordinal pattern dependence for long-range dependent time series, including mixed cases of short- and long-range dependence. We investigate the limit distributions of estimators of ordinal pattern dependence and point out the differences that arise when the underlying time series have different dependence structures. Depending on these assumptions, central and non-central limit theorems are proven. The limit distributions in the non-central case belong to the class of multivariate Rosenblatt processes. Finally, a simulation study is provided to illustrate our theoretical findings.


Introduction
The origin of the concept of ordinal patterns is in the theory of dynamical systems. The idea is to consider the order of the values within a data vector instead of the full metrical information. The ordinal information is encoded as a permutation (cf. Section 3). Already in the first papers on the subject, the authors considered entropy concepts related to this ordinal structure (cf. [1]). There is an interesting relationship between these concepts and the well-known Kolmogorov-Sinai entropy (cf. [2,3]). Additionally, an ordinal version of the Feigenbaum diagram has been dealt with, e.g., in [4]. In [5], ordinal patterns were used in order to estimate the Hurst parameter in long-range dependent time series. Furthermore, in [6], a test for independence between time series was proposed (cf. also [7]). Hence, the concept made its way into the area of statistics. Instead of long patterns (or even letting the pattern length tend to infinity), rather short patterns have been considered in this new framework. Furthermore, ordinal patterns have been used in the context of ARMA processes [8] and change-point detection within one time series [9]. In [10], ordinal patterns were used for the first time in order to analyze the dependence between two time series. Limit theorems for this new concept were proven in a short-range dependent framework in [11]. Ordinal pattern dependence is a promising tool, which has already been used in financial, biological and hydrological data sets; see in this context also [12] for an analysis of the co-movement of time series focusing on symbols. In particular, in the context of hydrology, the data sets are known to be long-range dependent. Therefore, it is important to also have limit theorems available in this framework. We close this gap in the present article.
All of the results presented in this article have been established in the Ph.D. thesis of I. Nüßgen written under the supervision of A. Schnurr. The article is structured as follows: in the subsequent section, we provide the reader with the mathematical framework. The focus is on (multivariate) long-range dependence. In Section 3, we recall the concept of ordinal pattern dependence and prove our main results. We present a simulation study in Section 4 and close the paper by a short outlook in Section 5.

Mathematical Framework
We consider a stationary d-dimensional Gaussian time series (Y_j)_{j∈Z} (for d ∈ N), with E(Y_j^(p)) = 0 and Var(Y_j^(p)) = 1 for all j ∈ Z and p = 1, . . . , d. Furthermore, we require the cross-correlation function to fulfill |r^(p,q)(k)| < 1 for p, q = 1, . . . , d and k ≥ 1, where the component-wise cross-correlation functions are given by r^(p,q)(k) = E(Y_j^(p) Y_{j+k}^(q)) for each p, q = 1, . . . , d and k ∈ Z. For each random vector Y_j, we denote the covariance matrix by Σ_d, since it is independent of j due to stationarity. Therefore, we have Σ_d = (r^(p,q)(0))_{p,q=1,...,d}.
We specify the dependence structure of (Y_j)_{j∈Z} and turn to long-range dependence: we assume that the cross-correlation function satisfies, for each p, q = 1, . . . , d,

r^(p,q)(k) = L_{p,q}(k) k^{d_p + d_q − 1},

with L_{p,q}(k) → L_{p,q} (k → ∞) for finite constants L_{p,q} ∈ [0, ∞) with L_{p,p} ≠ 0, where the matrix L = (L_{p,q})_{p,q=1,...,d} has full rank, is symmetric and positive definite. Furthermore, the parameters d_p, d_q ∈ (0, 1/2) are called long-range dependence parameters. Therefore, (Y_j)_{j∈Z} is multivariate long-range dependent in the sense of [13], Definition 2.1.
The processes we want to consider have a particular structure: for h ∈ N and fixed j ∈ Z, the vector Y_{j,h} stacks the h consecutive values Y^(p)_j, . . . , Y^(p)_{j+h−1} of each component p = 1, . . . , d. The following relation holds between the extended process (Y_{j,h})_{j∈Z} and the primarily regarded process (Y_j)_{j∈Z}: for all k = 1, . . . , dh and j ∈ Z, we have Y^(k)_{j,h} = Y^(⌊(k−1)/h⌋+1)_{j+((k−1) mod h)}, where ⌊x⌋ = max{k ∈ Z : k ≤ x} denotes the floor function. Note that the process (Y_{j,h})_{j∈Z} is still a centered Gaussian process, since all finite-dimensional marginals of (Y_j)_{j∈Z} follow a normal distribution. Stationarity is also preserved, since for all p, q = 1, . . . , dh, p ≤ q, and k ∈ Z, the cross-correlation function r^(p,q,h)(k) of the process (Y_{j,h})_{j∈Z} does not depend on j. Hence, we arrive at the covariance matrix Σ_{d,h} of Y_{j,h}, which consists of the blocks Σ^(p,q) = (r^(p,q)(k − i))_{1≤i,k≤h}, p, q = 1, . . . , d.
Therefore, it finally remains to show that, based on the assumptions on (Y_j)_{j∈Z}, the extended process is still long-range dependent.
Hence, we have to consider the cross-correlations again: for p, q ∈ {1, . . . , dh}, the cross-correlation r^(p,q,h)(k) behaves asymptotically like r^(p*,q*)(k + m*), where p* := ⌊(p−1)/h⌋ + 1 and q* := ⌊(q−1)/h⌋ + 1 are elements of {1, . . . , d} and m* := (q − p) mod h lies in {0, . . . , h − 1}. Let us remark that a_k ∼ b_k :⇔ lim_{k→∞} a_k / b_k = 1. Therefore, we are still dealing with a multivariate long-range dependent Gaussian process. We see in the proofs of the following limit theorems that the crucial parameters determining the asymptotic distribution are the long-range dependence parameters d_p, p = 1, . . . , d, of the original process (Y_j)_{j∈Z}; therefore, we omit the detailed description of the parameters d_{p*} herein.
It is important to remark that the extended process (Y_{j,h})_{j∈Z} is also long-range dependent in the sense of [14], p. 2259. Here, L(k) can be chosen as any constant L_{p,q} that is not equal to zero; for simplicity, we assume without loss of generality that L_{1,1} ≠ 0 and therefore set L(k) = L_{1,1}, since the condition in [14] only requires convergence to a finite constant b_{p*,q*}. Hence, we may apply the results in [14] in the subsequent results. We define the following set, which is needed in the proofs of the theorems of this section.
We define P* := {p ∈ {1, . . . , d} : d_p = d*}, where d* := max{d_1, . . . , d_d}, and denote the corresponding long-range dependence parameter for each p ∈ P* by d_p = d*. We briefly recall the concept of Hermite polynomials, as they play a crucial role in determining the limit distribution of functionals of multivariate Gaussian processes. Definition 1 (Hermite polynomial, [15], Definition 3.1). The j-th Hermite polynomial H_j(x), j = 0, 1, . . ., is defined as H_j(x) = (−1)^j e^{x²/2} (d^j/dx^j) e^{−x²/2}. Their multivariate extension is given by the subsequent definition.
Analogously to the univariate case, the family of multivariate Hermite polynomials H_{k_1,...,k_d}(x_1, . . . , x_d) = ∏_{p=1}^d H_{k_p}(x_p), k_1, . . . , k_d ∈ N_0, forms an orthogonal basis of L²(R^d, ϕ_{I_d}). The parameter ϕ_{I_d} denotes the density of the d-dimensional standard normal distribution, which factorizes into the product of the univariate densities ϕ in the formula above.
We denote the Hermite coefficients of f by the corresponding inner products in L²(R^{d}, ϕ_{I_d}). The Hermite rank m(f, I_d) of f with respect to the distribution N(0, I_d) is defined as the largest integer m such that all Hermite coefficients of order lower than m vanish. Having these preparatory results in mind, we derive the multivariate Hermite expansion of f. We focus on the limit theorems for functionals with Hermite rank 2. First, we introduce the matrix-valued Rosenblatt process. This plays a crucial role in the asymptotics of functionals with Hermite rank 2 applied to multivariate long-range dependent Gaussian processes. We begin with the definition of a multivariate Hermitian-Gaussian random measure B̃(dλ) = (B̃^(1)(dλ), . . . , B̃^(d)(dλ))^t with independent entries, where B̃^(p)(dλ) is a univariate Hermitian-Gaussian random measure as defined in [16], Definition B.1.3. The multivariate Hermitian-Gaussian random measure B̃(dλ) satisfies E(B̃(dλ) B̃(dλ)*) = I_d dλ, where B̃(dλ)* = (B̃^(1)(dλ), . . . , B̃^(d)(dλ)) denotes the Hermitian transpose of B̃(dλ). Thus, following [14], Theorem 6, we can state the spectral representation of the matrix-valued Rosenblatt process Z_{2,H}(t), t ∈ [0, 1], where each entry of the matrix is given by a double Wiener-Itô integral. The double prime in the integral over R² excludes the diagonals |λ_i| = λ_j, i ≠ j, from the integration. For details on multiple Wiener-Itô integrals, see [17].
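As a concrete illustration of Definition 1, the probabilists' Hermite polynomials can be evaluated with the classical three-term recurrence H_{j+1}(x) = x H_j(x) − j H_{j−1}(x), and the multivariate version is the product of univariate ones. A minimal sketch (the function names are ours, not from the paper):

```python
def hermite(j, x):
    # probabilists' Hermite polynomial H_j(x) via the recurrence
    # H_0 = 1, H_1 = x, H_{k+1}(x) = x * H_k(x) - k * H_{k-1}(x)
    if j == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, j):
        h_prev, h = h, x * h - k * h_prev
    return h


def hermite_multi(ks, xs):
    # multivariate Hermite polynomial H_{k_1,...,k_d}(x) = prod_p H_{k_p}(x_p)
    out = 1.0
    for k, x in zip(ks, xs):
        out *= hermite(k, x)
    return out
```

For instance, `hermite(2, x)` evaluates x² − 1, the polynomial behind the Hermite-rank-2 functionals studied below.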
The following results were taken from [18], Section 3.2. The corresponding proofs are deferred to Appendix A.
Theorem 1. Let (Y_j)_{j∈Z} be a stationary Gaussian process as defined in (1) that fulfills (2) for d_p ∈ (1/4, 1/2), p = 1, . . . , d. For h ∈ N, we fix Y_{j,h} and Σ_{d,h} as described in (6). Let f : R^{dh} → R be a function with Hermite rank 2 such that the set of discontinuity points D_f is a null set with respect to the dh-dimensional Lebesgue measure. Furthermore, we assume that f fulfills E(f²(Y_{j,h})) < ∞. Then: where: The matrix K(d*) is a normalizing constant, as can be seen in [18], Corollary 3.6. Moreover, B_L(dλ) is a multivariate Hermitian-Gaussian random measure with E(B_L(dλ) B_L(dλ)*) = L dλ and L as defined in (2). Furthermore, C_2 := 1/(2d*(4d* − 1)) is a normalizing constant, and α̃^(p,q)_{i,k} := α_{i+(p−1)h, k+(q−1)h} for each p, q ∈ P* and i, k = 1, . . . , h, where C denotes the matrix of second order Hermite coefficients. It is possible to soften the assumptions in Theorem 1 to allow for mixed cases of short- and long-range dependence.
Corollary 1. Instead of demanding, in the assumptions of Theorem 1, that (2) holds for (Y_j)_{j∈Z} with d_p ∈ (1/4, 1/2) for all p = 1, . . . , d, we may use the following condition. We assume that (2) holds with L_{p,q}(k) as given there, but we no longer assume d_p ∈ (1/4, 1/2) for all p = 1, . . . , d; instead, we soften the assumption to d* ∈ (1/4, 1/2) and, for d_p ≠ d*, p = 1, . . . , d, we allow for d_p ∈ (−∞, 0) ∪ (0, 1/4). Then, the statement of Theorem 1 remains valid.
However, with a mild technical assumption on the covariances of the one-dimensional marginal Gaussian processes that is often fulfilled in applications, there is another way of normalizing the partial sum on the right-hand side in Theorem 1, this time explicitly for the case #P * = 2 and h ∈ N, such that the limit can be expressed in terms of two standard Rosenblatt random variables. This yields the possibility of further studying the dependence structure between these two random variables. In the following theorem, we assume #P * = d = 2 for the reader's convenience.

Ordinal Pattern Dependence
Ordinal pattern dependence is a multivariate dependence measure that compares the co-movement of two time series based on their ordinal information. First introduced in [10] to analyze financial time series, a mathematical framework including structural breaks and limit theorems for functionals of absolutely regular processes was developed in [11]. In [19], the authors used the so-called symbolic correlation integral in order to detect dependence between the components of a multivariate time series. Their considerations, focusing on testing independence between two time series, are also based on ordinal patterns. They provide limit theorems in the i.i.d. case and otherwise use bootstrap methods. In contrast, in the mathematical model of the present article, we focus on asymptotic distributions of an estimator of ordinal pattern dependence with a bivariate Gaussian time series in the background, while allowing for several dependence structures to arise. As will turn out in the following, this yields central as well as non-central limit theorems.
We start with the definition of an ordinal pattern and the basic mathematical framework that we need to build up the ordinal model.
Let S_h denote the set of permutations of {0, . . . , h}, h ∈ N_0, which we express as (h + 1)-dimensional tuples, assuring that each tuple contains each of the numbers above exactly once. In mathematical terms, this yields S_h := {π = (π_0, . . . , π_h) ∈ N_0^{h+1} : 0 ≤ π_i ≤ h, and π_i ≠ π_k whenever i ≠ k, i, k = 0, . . . , h}, as can be seen in [11], Section 2.1.
The number of permutations in S_h is given by #S_h = (h + 1)!. In order to get a better intuitive understanding of the concept of ordinal patterns, we take a closer look at the following example before turning to the formal definition. Example 1. Figure 1 provides an illustrative understanding of the extraction of an ordinal pattern from a data set. The data points of interest are colored in red, and we consider a pattern of length h = 3, which means that we have to take n = 4 data points into consideration. We fix the points in time t_0, t_1, t_2 and t_3 and extract the data points from the time series. Then, we search for the point in time which exhibits the largest value in the resulting data and write down the corresponding time index; in this example, it is t = t_1. We order the data points by writing the time position of the largest value as the first entry, the time position of the second largest as the second entry, etc. Hence, the values are ordered from largest to smallest and the ordinal pattern (1, 0, 3, 2) ∈ S_3 is obtained for the considered data points. Formally, the aforementioned procedure can be defined as follows, as can be seen in [11], Section 2.1.

Definition 3.
As the ordinal pattern of a vector x = (x_0, . . . , x_h) ∈ R^{h+1}, we define the unique permutation π = (π_0, . . . , π_h) ∈ S_h such that x_{π_0} ≥ x_{π_1} ≥ . . . ≥ x_{π_h}, and π_{i−1} < π_i if x_{π_{i−1}} = x_{π_i}, i = 1, . . . , h. The last condition assures the uniqueness of π if there are ties in the data sets. In particular, this condition is necessary if real-world data are to be considered.
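A minimal sketch of this extraction rule in code (tie-breaking by the earlier time index is our reading of the uniqueness condition):

```python
def ordinal_pattern(x):
    # time indices sorted by decreasing value; ties are broken by the
    # earlier time index, which yields a unique permutation
    return tuple(sorted(range(len(x)), key=lambda i: (-x[i], i)))
```

Applied to data as in Example 1, e.g. x = (3, 4, 1, 2), this returns (1, 0, 3, 2).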
In Figure 2, all ordinal patterns of length h = 2 are shown. As already mentioned in the introduction, from the practical point of view, a highly desirable property of ordinal patterns is that they are not affected by monotonically increasing transformations, as can be seen in [5], p. 1783. Mathematically, this means that if f : R → R is strictly increasing, then Π(f(x_0), . . . , f(x_h)) = Π(x_0, . . . , x_h). In particular, this includes linear transformations f(x) = ax + b, with a ∈ R_+ and b ∈ R.
Following [11], Section 1, the minimal requirement on the data sets we use for ordinal analysis in the time series context, i.e., for ordinal pattern probabilities as well as for ordinal pattern dependence later on, is ordinal pattern stationarity (of order h). This property implies that the probability of observing a certain ordinal pattern of length h remains the same when shifting the moving window of length h through the entire time series and does not depend on the specific points in time. In the course of this work, the time series in which the ordinal patterns occur always either have stationary increments or are even stationary themselves. Note that both properties imply ordinal pattern stationarity. The reason why stationary increments are a sufficient condition is given in the following explanation.
One fundamental property of ordinal patterns is that they are uniquely determined by the increments of the considered time series. As one can imagine from Example 1, knowledge of the increments between the data points is sufficient to obtain the corresponding ordinal pattern. In mathematical terms, we can define another mapping Π̃, which assigns the corresponding ordinal pattern to each vector of increments, as can be seen in [5], p. 1783.

Definition 4.
We define for y = (y_1, . . . , y_h) ∈ R^h the mapping Π̃ : R^h → S_h, Π̃(y_1, . . . , y_h) := Π(0, y_1, y_1 + y_2, . . . , y_1 + . . . + y_h), such that for y_i = x_i − x_{i−1}, i = 1, . . . , h, we obtain Π̃(y_1, . . . , y_h) = Π(x_0, x_1, . . . , x_h). We define the two mappings S and T, following [5], p. 1784. An illustrative understanding of these mappings is given as follows: the mapping S(π), which is the spatial reversion of the pattern π, is the reflection of π on a horizontal line, while T(π), the time reversal of π, is its reflection on a vertical line, as one can observe in Figure 3. Based on the spatial reversion, we define a possibility to divide S_h into two disjoint sets.
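The increment mapping Π̃ and the two reflections can be sketched as follows. For patterns of pairwise distinct values, the spatial reversal corresponds to reversing the tuple and the time reversal to replacing each entry π_i by h − π_i; both identifications and the helper names are our assumptions:

```python
from itertools import accumulate


def ordinal_pattern(x):
    # descending order of values; ties broken by the earlier time index
    return tuple(sorted(range(len(x)), key=lambda i: (-x[i], i)))


def pattern_from_increments(y):
    # Π̃(y_1, ..., y_h) = Π(0, y_1, y_1 + y_2, ..., y_1 + ... + y_h)
    return ordinal_pattern((0,) + tuple(accumulate(y)))


def spatial_reversal(pi):
    # S(pi): reflection on a horizontal line (the pattern of -x)
    return tuple(reversed(pi))


def time_reversal(pi):
    # T(pi): reflection on a vertical line (the pattern of the reversed series)
    h = len(pi) - 1
    return tuple(h - p for p in pi)
```

The increments (1, −3, 1) of the data in Example 1 indeed reproduce the pattern (1, 0, 3, 2).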

Definition 5.
We define S * h as a subset of S h with the property that for each π ∈ S h , either π or S(π) are contained in the set, but not both of them.
Note that this definition does not determine S*_h uniquely.

Example 2.
We consider the case h = 2 again and want to divide S_2 into a possible choice of S*_2 and the corresponding spatial reversals. We choose S*_2 = {(2, 1, 0), (2, 0, 1), (1, 2, 0)}, and therefore, S_2 \ S*_2 = {(0, 1, 2), (1, 0, 2), (0, 2, 1)}. The only condition that has to be satisfied is that if one permutation is chosen for S*_2, then its spatial reverse must not be an element of this set.
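A possible choice of S*_h can be generated mechanically: walk through S_h and keep a pattern whenever its spatial reversal has not been kept yet (identifying S(π) with the reversed tuple, which holds for patterns of pairwise distinct values). A sketch:

```python
from itertools import permutations


def choose_s_star(h):
    # keep exactly one pattern from each pair {pi, S(pi)}
    chosen = set()
    for pi in permutations(range(h + 1)):
        if tuple(reversed(pi)) not in chosen:
            chosen.add(pi)
    return chosen
```

For h = 2, this yields one valid choice of S*_2 with three elements; the choice in Example 2 is another, equally admissible one.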
We stick to the formal definition of ordinal pattern dependence, as it is proposed in [11], Section 2.1. The considered moving window consists of h + 1 data points and, hence, h increments. We define:

p := P(Π(X^(1)_0, . . . , X^(1)_h) = Π(X^(2)_0, . . . , X^(2)_h))

and:

q := Σ_{π ∈ S_h} P(Π(X^(1)_0, . . . , X^(1)_h) = π) · P(Π(X^(2)_0, . . . , X^(2)_h) = π).

Then, we define ordinal pattern dependence OPD as

OPD := (p − q) / (1 − q).

The parameter q represents the hypothetical case of independence between the two time series: in this case, p and q would attain equal values and, therefore, OPD would equal zero. Regarding the other extreme, the case in which both processes coincide or one is a strictly monotone increasing transform of the other one, we obtain the value 1. However, in the following, we assume p ∈ (0, 1) and q ∈ (0, 1).
Note that the definition of ordinal pattern dependence in (17) only measures positive dependence. This is no restriction in practice, because negative dependence can be investigated in an analogous way by considering OPD_{X^(1),−X^(2)}. If one is interested in both types of dependence simultaneously, the authors of [11] propose to consider OPD_{X^(1),X^(2)} and OPD_{X^(1),−X^(2)} jointly. To keep the notation simple, we focus on OPD as it is defined in (17).
We compare whether the ordinal patterns in (X^(1)_j)_{j∈Z} coincide with the ones in (X^(2)_j)_{j∈Z}. Recall that it is an essential property of ordinal patterns that they are uniquely determined by the increment process. Therefore, we have to consider the increment processes of both time series. Hence, we can also express p and q (and consequently OPD) as probabilities that only depend on the increments of the considered vectors of the time series. Recall the definition of (Y_{j,h})_{j∈Z} for d = 2, such that Y_{j,h} ∼ N(0, Σ_{2,h}) with Σ_{2,h} as given in (6).
In the course of this article, we focus on the estimation of p. For a detailed investigation of the limit theorems for estimators of OPD, we refer to [18]. We define the estimator p̂_n of p, the probability of coincident patterns in both time series in a moving window of fixed length, as the relative frequency of coincident patterns among the windows of length h + 1. Figure 4 illustrates the estimation of ordinal pattern dependence.
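A minimal sketch of p̂_n as the relative frequency of coincident patterns in a moving window (the tie-breaking convention inside `ordinal_pattern` is our assumption):

```python
def ordinal_pattern(x):
    # descending order of values; ties broken by the earlier time index
    return tuple(sorted(range(len(x)), key=lambda i: (-x[i], i)))


def estimate_p(x1, x2, h):
    # relative frequency of windows of length h + 1 in which both
    # series exhibit the same ordinal pattern
    n = min(len(x1), len(x2))
    hits = sum(
        ordinal_pattern(x1[j:j + h + 1]) == ordinal_pattern(x2[j:j + h + 1])
        for j in range(n - h)
    )
    return hits / (n - h)
```

If one series is a strictly increasing transform of the other, the estimate equals 1; for two series moving in strictly opposite directions, it equals 0.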
Having emphasized the crucial importance of the increments, we define the following condition on the increment process (Y_j)_{j∈Z}: let (Y_j)_{j∈Z} be a bivariate, stationary Gaussian process. (L): We assume that (Y_j)_{j∈Z} fulfills (2) with d* ∈ (1/4, 1/2). We allow for min{d_1, d_2} to be in the range (−∞, 0) ∪ (0, 1/4).
We begin with the investigation of the asymptotics of p̂_n. First, we calculate the Hermite rank of the functional underlying p̂_n, since the Hermite rank determines for which range of d* the transformed process is still long-range dependent. Depending on this range, different limit theorems may hold.
Proof. Following [20], Lemma 5.4, it is sufficient to show the following two properties: Note that the conclusion is not trivial, because m(f, Σ_{2,h}) ≠ m(f, I_{2h}) in general, as can be seen in [15], Lemma 3.7. Lemma 5.4 in [20] can be applied due to the following reasoning. Ordinal patterns are not affected by scaling; therefore, the technical condition that Σ^{−1}_{2,h} − I_{2h} is positive semi-definite is fulfilled in our case. We can scale the standard deviation of the random vector Y_{j,h} by any positive real number σ > 0, since for all j ∈ Z we have: To show property (i), we need to consider a multivariate random vector with covariance matrix Σ_{2,h}. We fix i = 1, . . . , 2h. We divide the set S_h into disjoint sets, namely into S*_h, as defined in Definition 5, and the complementary set S_h \ S*_h. Note that: holds. This implies: for π ∈ S_h. Hence, we arrive at: In order to prove (ii), we consider U_{1,h} = (U^(1)_1, . . . , U^(1)_h, U^(2)_1, . . . , U^(2)_h)^t to be a random vector with independent N(0, 1) distributed entries. For i = 1, . . . , h and k = h + 1, . . . , 2h such that k − h = i, we obtain: This was shown in the proof of Lemma 3.4 in [20]. All in all, we derive m(f, Σ_{2,h}) = 2 and have hence proven the lemma.
The case m(f, Σ_{2,h}) = 2 exhibits the property that the standard range of the long-range dependence parameter d* ∈ (0, 1/2) has to be divided into two different sets. If d* ∈ (1/4, 1/2), the transformed process (f(Y_{j,h}))_{j∈Z} is still long-range dependent, as can be seen in [16], Table 5.1. If d* ∈ (0, 1/4), the transformed process is short-range dependent, which means by definition that the autocorrelations of the transformed process are summable, as can be seen in [13], Remark 2.3. Therefore, two different asymptotic distributions have to be considered for the estimator p̂_n of coincident patterns.
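The threshold d* = 1/4 can be traced back to a standard heuristic: for a functional of Hermite rank m, the autocovariances of the transformed process decay (up to slowly varying factors) like the m-th power of the original correlations, so summability fails exactly on one side of the threshold. In LaTeX form (the boundary case itself is excluded in the dichotomy above):

```latex
\gamma_f(k)\;\sim\;C\, r(k)^{m}\;\sim\;C\,k^{m(2d^{*}-1)},
\qquad
\sum_{k}\lvert\gamma_f(k)\rvert=\infty
\;\Longleftrightarrow\;
m\,(2d^{*}-1)\,\ge\,-1
\;\Longleftrightarrow\;
d^{*}\,\ge\,\tfrac12-\tfrac{1}{2m}
\;\overset{m=2}{=}\;\tfrac14 .
```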

Limit Theorem for the Estimator of p in Case of Long-Range Dependence
First, we restrict ourselves to the case that at least one of the two parameters d_1 and d_2 is in (1/4, 1/2). This assures d* ∈ (1/4, 1/2). We explicitly include mixed cases where the process corresponding to min{d_1, d_2} is allowed to be long-range as well as short-range dependent.
Note that this setting includes the pure long-range dependence case, in which d_p ∈ (1/4, 1/2) for p = 1, 2, or even d_1 = d_2 = d*. In general, however, the assumptions are weaker: we only require d_p ∈ (1/4, 1/2) for either p = 1 or p = 2, and the other parameter is also allowed to be in (−∞, 0) or (0, 1/4). We can, therefore, apply the results of Corollary 1 and obtain the following asymptotic distribution for p̂_n.

Theorem 3. Under the assumption in (L), we obtain: with Z^(p,q)_{2,d*+1/2}(1) as given in Theorem 1 for p, q ∈ P* and C_2 := 1/(2d*(4d* − 1)) being a normalizing constant. We have α̃^(p,q)_{i,k} := α_{i+(p−1)h, k+(q−1)h} for each p, q ∈ P* and i, k = 1, . . . , h, and (α_{i,k})_{1≤i,k≤2h} = Σ^{−1}_{2,h} C Σ^{−1}_{2,h}, where C denotes the matrix of second order Hermite coefficients.
Proof. The proof of this theorem is an immediate application of Corollary 1 and Lemma 1. Note that p̂_n is square integrable with respect to the distribution of Y_{j,h} and that the set of its discontinuity points is a null set with respect to the 2h-dimensional Lebesgue measure. This is shown in [18], Equation (4.5).
We turn to an example that deals with the asymptotic variance of the estimator of p in Theorem 3 in the case h = 1.

Example 3.
We focus on the case h = 1 and consider the underlying process (Y_{j,1})_{j∈Z} = ((Y^(1)_j, Y^(2)_j)^t)_{j∈Z}. It is possible to determine the asymptotic variance depending on the correlation r^(1,2)(0) between these two increment variables. We start with the calculation of the second order Hermite coefficients in the case π = (1, 0). This corresponds to the event {Y^(1)_j ≥ 0} ∩ {Y^(2)_j ≥ 0}.
Plugging the second order Hermite coefficients and the entries of the inverse of the covariance matrix depending on r (1,2) (0) into the formulas, we arrive at: Therefore, in the case h = 1, we obtain the following factors in the limit variance in Theorem 3:
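The probability entering these second order Hermite coefficients is a bivariate orthant probability, for which Sheppard's classical closed form P(Y^(1) ≥ 0, Y^(2) ≥ 0) = 1/4 + arcsin(r)/(2π) is available; a minimal sketch:

```python
import math


def orthant_prob_2d(r):
    # P(Y1 >= 0, Y2 >= 0) for a centered bivariate Gaussian vector with
    # unit variances and correlation r (Sheppard's formula)
    return 0.25 + math.asin(r) / (2 * math.pi)
```

For r = 0 this gives 1/4, as it must for independent components, and for r → 1 it approaches 1/2.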

Remark 2.
It is not possible to analytically determine the limit variance for h = 2, as this includes orthant probabilities of a four-dimensional Gaussian distribution. Following [22], no closed formulas are available for these probabilities. However, there are fast algorithms at hand that calculate the limit variance efficiently. It is possible to take advantage of the symmetry properties of the multivariate Gaussian distribution to keep the computational cost of these algorithms low. For details, see [18], Section 4.3.1.
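Absent closed formulas in dimension four, such orthant probabilities can at least be approximated numerically; a plain Monte Carlo sketch (not one of the optimized algorithms alluded to above):

```python
import numpy as np


def orthant_prob_mc(cov, n=200_000, seed=0):
    # Monte Carlo estimate of P(Y_1 >= 0, ..., Y_d >= 0) for Y ~ N(0, cov)
    rng = np.random.default_rng(seed)
    y = rng.multivariate_normal(np.zeros(len(cov)), np.asarray(cov), size=n)
    return float(np.mean((y >= 0).all(axis=1)))
```

For the identity covariance in dimension four, the estimate is close to 1/16; symmetry arguments, as mentioned above, can reduce the number of probabilities that actually need to be computed.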

Limit Theorem for the Estimator of p in Case of Short-Range Dependence
In this section, we focus on the case d* ∈ (−∞, 0) ∪ (0, 1/4). If d* ∈ (0, 1/4), we are still dealing with a long-range dependent multivariate Gaussian process (Y_{j,h})_{j∈Z}.
However, the transformed process p̂_n − p is no longer long-range dependent, since we are considering a function with Hermite rank 2, see also [16], Table 5.1. Otherwise, if d* ∈ (−∞, 0), the process (Y_{j,h})_{j∈Z} itself is already short-range dependent, since the cross-correlations are summable. Therefore, we obtain the following central limit theorem by applying Theorem 4 in [14].

Theorem 5.
Under the assumptions in (S), we obtain: We close this section with a brief retrospect of the results obtained. We established limit theorems for the estimator of p, the probability of coincident patterns in both time series and hence the most important parameter in the context of ordinal pattern dependence. The long-range dependent case as well as the mixed case of short- and long-range dependence was considered. Finally, we provided a central limit theorem for a multivariate Gaussian time series whose transformed process is short-range dependent. In the subsequent section, we provide a simulation study that illustrates our theoretical findings. In doing so, we shed light on the Rosenblatt distribution and the distribution of the sum of Rosenblatt distributed random variables.

Simulation Study
We begin with the generation of a bivariate long-range dependent fractional Gaussian noise. First, we simulate two independent fractional Gaussian noise processes (U^(1)_j)_{j=1,...,n} and (U^(2)_j)_{j=1,...,n} using the R-package "longmemo", for a fixed parameter H ∈ (1/2, 1) in both time series. For the reader's convenience, we reparametrize the long-range dependence parameter d by H = d + 1/2, as is common when dealing with fractional Gaussian noise and fractional Brownian motion. We refer to H as the Hurst parameter, tracing back to the work of [23]. For H = 0.7 and H = 0.8, we generate n = 10^6 samples; for H = 0.9, we choose n = 2 · 10^6. We denote the correlation function of univariate fractional Gaussian noise by r^(1,1) and define, for ψ, φ ∈ R, Y^(1)_j := U^(1)_j and Y^(2)_j := ψ U^(1)_j + φ U^(2)_j. Note that this yields the following properties for the cross-correlations of the two processes for k ≥ 0: r^(1,2)(k) = ψ r^(1,1)(k) and r^(2,2)(k) = (ψ² + φ²) r^(1,1)(k). We use ψ = 0.6 and φ = 0.8 to obtain unit variance in the second process. Note that we chose the same Hurst parameter in both processes to get a better simulation result.
The simulated paths of the processes are shown on the right-hand side in Figure 5. The long-range dependent behavior of the increment processes is very illustrative in these paths: roughly speaking, they become smoother the larger the Hurst parameter gets. We turn to the simulation results for the asymptotic distribution of the estimator p̂_n. The first limit theorem is given in Theorem 3 for H = 0.8 and H = 0.9; in the case of H = 0.7, a different limit theorem holds, see Theorem 5. The simulation results for the asymptotic distribution of the estimator p̂_n of p are shown in Figure 6 for pattern length h = 2. The asymptotic normality in the case H = 0.7 can be clearly observed.
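A minimal sketch of the cross-correlation construction above, with white noise standing in for fractional Gaussian noise (i.e. H = 1/2; the actual study draws genuine fGn via the R-package "longmemo"):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
psi, phi = 0.6, 0.8  # psi^2 + phi^2 = 1 gives unit variance in the second process

# two independent unit-variance noise sources (white-noise stand-in for fGn)
u1 = rng.standard_normal(n)
u2 = rng.standard_normal(n)

# cross-correlated pair: corr(Y1, Y2) = psi
y1 = u1
y2 = psi * u1 + phi * u2
```

Empirically, corr(y1, y2) ≈ 0.6 and Var(y2) ≈ 1, matching the choice ψ = 0.6, φ = 0.8.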
We turn to the interpretation of the simulation results for the distribution of p̂_n − p for H = 0.8 and H = 0.9, normalized as the weighted sum of the sample (cross-)correlations: we observe in the Q-Q plot for H = 0.8 that the samples in the upper and lower tail deviate from the reference line. For H = 0.9, a similar behavior in the Q-Q plot is observed. We want to verify the result in Theorem 4, namely that a different weighting makes it possible to express the limit distribution of p̂_n − p as the distribution of the sum of two independent standard Rosenblatt random variables. The simulated convergence result is provided in Figure 7, in which the standard Rosenblatt distribution can be observed.

Conclusions and Outlook
We considered limit theorems in the context of the estimation of ordinal pattern dependence in the long-range dependence setting. Pure long-range dependence as well as mixed cases of short- and long-range dependence were considered, alongside the transformed short-range dependent case. Therefore, we complemented the asymptotic results in [11]. Hence, we made ordinal pattern dependence applicable to long-range dependent data sets as they arise in the context of neurology, as can be seen in [24], or artificial intelligence, as can be seen in [25]. As these kinds of data have already been investigated using ordinal patterns, as can be seen, for example, in [26], this emphasizes the large practical impact of the ordinal approach in analyzing the dependence structure of multivariate time series. This yields various research opportunities in these fields in the future.
Our results rely on the assumption of Gaussianity of the considered multivariate time series. If we focus on comparing the coincident ordinal patterns in a stationary long-range dependent bivariate time series, we highly benefit from the property that ordinal patterns are not affected by monotone transformations: it is possible to transform the data set to the Gaussian framework without losing the necessary ordinal information. In applications, this property is highly desirable. If we consider the more general setting, that is, stationary increments, the mathematical theory in the background becomes considerably more complex, which leads to the limitations of our results. A crucial argument used in the proofs of the results in Section 2 is the Reduction Theorem, originally proven in Theorem 4.1 in [27] in the univariate case and extended to the multivariate setting in Theorem 6 in [14]. For further details, we refer the reader to Appendix A. However, this result only holds in the Gaussian case. Limit theorems for the sample cross-correlation process of multivariate linear long-range dependent processes with Hermite rank 2 have recently been proven in Theorem 4 in [28]. This is possibly an interesting starting point for adapting the proofs in Appendix A to this larger class of processes without requiring Gaussianity.
Considering the property of having a discrete bivariate time series in the background, an interesting extension is given in time continuous processes and the associated techniques of discretization to still regard the ordinal perspective. To think even further beyond our scope, a generalization to categorical data is conceivable and yields an interesting open research opportunity.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author, Ines Nüßgen, upon reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Technical Appendix
All proofs in this appendix were taken from [18], Chapter 3.

Appendix A.1. Preliminary Results
Before turning to limit theorems, we introduce a possibility to decompose the d-dimensional Gaussian process (Y_j)_{j∈Z} using the Cholesky decomposition, as can be seen in [29]. Based on the definition of the multivariate normal distribution, as can be seen in [30], Definition 1.6.1, we find an upper triangular matrix Ã such that ÃÃ^t = Σ_d. Then, it holds that Y_j = Ã U*_j, where (U*_j)_{j∈Z} is a d-dimensional Gaussian process such that each U*_j has independent and identically N(0, 1) distributed entries. We want to assure that (U*_j)_{j∈Z} preserves the long-range dependent structure of (Y_j)_{j∈Z}. Since we know from (2) that the cross-correlations of (Y_j)_{j∈Z} obey the power law stated there, the process (U*_j)_{j∈Z} has to fulfill the analogous relation with L = Ã L_{U*} Ã^t. Then, it holds for all n ∈ N that: Note that the assumption in (A2) is only well-defined because we assumed |r^(p,q)(k)| < 1 for k ≥ 1 and p, q = 1, . . . , d in (1). This becomes clear in the following considerations. In the proofs of the theorems in this chapter, we do not only need a decomposition of Y_j, but also of Y_{j,h}. As Y_{j,h} is still a multivariate Gaussian vector, its covariance matrix Σ_{d,h} is positive definite. Hence, it is possible to find an upper triangular matrix A such that AA^t = Σ_{d,h}. The random vector U_{j,h} consists of (d · h) independent and standard normally distributed random variables. We notice the different structure of U_{j,h} compared to Y_{j,h}: for consecutive j, the entries of U_{j,h} are all different, while consecutive vectors Y_{j,h} share identical entries. The special structure of (Y_{j,h})_{j∈Z}, namely that of consisting of h consecutive entries of each marginal process Y^(p)_j, p = 1, . . . , d, alongside the dependence between two random vectors in the process (Y_{j,h}), has to be reflected in the covariance matrix of (U_{j,h}, j = 1, . . . , n). Hence, we need to check whether such a vector (U_{j,h}, j = 1, . . . , n) exists, i.e., if there is a positive semi-definite matrix that fulfills these conditions.
We define A as a block diagonal matrix with A as main-diagonal blocks and all off-diagonal blocks equal to the (dh × dh) zero matrix.
We denote the covariance matrix of (Y_{j,h}, j = 1, . . . , n)^t by Σ_{Y,n} and define the matrix Σ_{U,n} := A^{-1} Σ_{Y,n} (A^{-1})^t, with the block diagonal matrix A defined above. We know that Σ_{Y,n} is positive semi-definite for all n ∈ N because (Y_j)_{j∈Z} is a Gaussian process. Mathematically described, this means that x^t Σ_{Y,n} x ≥ 0 for all x = (x_1, . . . , x_{nhd})^t ∈ R^{nhd}. We conclude that x^t Σ_{U,n} x = ((A^{-1})^t x)^t Σ_{Y,n} ((A^{-1})^t x) ≥ 0. Therefore, Σ_{U,n} is a positive semi-definite matrix for all n ∈ N, and the random vector (U_{j,h}, j = 1, . . . , n) exists and (A4) holds. Note that we do not have any further information on the dependence structure within the process (U_{j,h}); in general, this process exhibits neither long-range dependence, nor independence, nor stationarity. We continue with two preparatory results that are also necessary for proving Theorem 2.1. Lemma A1. Let (Y_j)_{j∈Z} be a d-dimensional Gaussian process as defined in (1) that fulfills (2) with d_1 = . . . = d_d = d*, such that: Let C_2 be a normalization constant: and let B_Y be an upper triangular matrix, such that: Furthermore, for l ∈ N we have: Then, for h ∈ N, it holds that: where the limiting random vector Z_{2,d*+1/2}(1) has the spectral domain representation: where: and B̃(dλ) = (B̃^{(1)}(dλ), . . . , B̃^{(d)}(dλ))^t is a multivariate Hermitian-Gaussian random measure as defined in (12).
Proof. First, we can use (A1), such that (U*_j)_{j∈Z} is a multivariate Gaussian process with U*_j ∼ N(0, I_d), which is still long-range dependent, as can be seen in (A2). It is possible to decompose the sample cross-covariance matrix Γ̂_{Y,n}(l) − Γ_Y(l) with respect to Y_j at lag l, given by: where we define the sample cross-covariance matrix Γ̂_{U*,n}(l) − Γ_{U*}(l) with respect to U*_j at lag l by: Each entry of Γ̂_{U*,n}(l) − Γ_{U*}(l), p, q = 1, . . . , d, is given by r̂^{(p,q)}(l). Following [31], proof of Lemma 7.4, the limit distribution of: is equal to the limit distribution of: We recall the assumption that d* = d_p for all p = 1, . . . , d. We follow [14], Theorem 6, and use the Cramér-Wold device: Let a_{1,1}, a_{1,2}, . . . , a_{d,d} ∈ R. We are interested in the asymptotic behavior of: We consider the function: and may apply Theorem 6 in [14]. Using the Hermite decomposition of f as given in (11), we observe that f, and therefore the coefficients a_{p,q}, p, q = 1, . . . , d, only affect the Hermite coefficients. Indeed, using [15], Lemma 3.5, the Hermite coefficients reduce to a_{p,q} for each summand on the right-hand side in (A7). Hence, we can state: where Z^{(p,q)}_{2,d*+1/2}(1) has the spectral domain representation: where: and B̃(dλ) = (B̃^{(1)}(dλ), . . . , B̃^{(d)}(dλ))^t is an appropriate multivariate Hermitian-Gaussian random measure. Thus, we have proved convergence in distribution of the sample cross-correlation matrix: We take a closer look at the covariance matrix of vec(Γ̂_{U*,n}(0) − Γ_{U*}(0)). Following [28], Lemma 5.7, we observe: with L_{U*} as defined in (A3), where ⊗ denotes the Kronecker product. Furthermore, K_d denotes the commutation matrix that transforms vec(A) into vec(A^t) for A ∈ R^{d×d}. Further details can be found in [32].
Hence, the covariance matrix of the vector of the sample cross-covariances is fully specified by the knowledge of L_{U*}, as it arises in the context of long-range dependence in (A3).
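To make the role of the commutation matrix concrete, the following is a small numerical sketch (a standard construction, not taken from the paper): K_d is the permutation matrix satisfying K_d vec(A) = vec(A^t), where vec stacks the columns of A.

```python
import numpy as np

# Illustrative construction of the commutation matrix K_d:
# K_d @ vec(A) == vec(A.T), where vec stacks the COLUMNS of A.

def commutation_matrix(d):
    K = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            # column-major: vec(A)[j*d + i] = A[i, j], vec(A.T)[i*d + j] = A[i, j]
            K[i * d + j, j * d + i] = 1.0
    return K

d = 3
K = commutation_matrix(d)
A = np.arange(1.0, d * d + 1).reshape(d, d)   # a toy matrix
vec = lambda M: M.flatten(order="F")          # column-stacking vec operator
```

Since transposing twice returns A, the commutation matrix is its own inverse, which is why it can be moved freely across Kronecker-product covariance formulas of the kind cited above.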
We obtain a relation between L and L_{U*}, since both: and: hold, and we obtain: We study the covariance matrix of vec(Γ̂_{Y,n}(0) − Γ_Y(0)): Let B_{U*} be an upper triangular matrix, such that: We know that such a matrix exists because L_{U*} is positive definite. Analogously, we define B_Y: Then, it holds that: We arrive at: where Z_{2,d*+1/2}(1) has the spectral domain representation: where: is a multivariate Hermitian-Gaussian random measure as defined in (12). Note that the standardization on the left-hand side is appropriate, since the covariance matrix of vec(Z_{2,d*+1/2}(1)) is given by (A12), by denoting: We observe, following [28], (27): Neither the case |λ_1| = |λ_2| nor |λ_3| = |λ_4| has to be incorporated, as the diagonals are excluded in the integration in (A12).
Appendix A.2. Proof of Theorem 2.1
Proof. Without loss of generality, we assume E[f(Y_{j,h})] = 0. Following the argumentation in [20], Theorem 5.9, we first remark that Y_{j,h} equals AU_{j,h} in distribution, with U_{j,h} and A as described in (A4) and (A5). We now want to study the asymptotic behavior of the partial sums of f*(U_{j,h}) := f(AU_{j,h}). The function f* has Hermite rank 2, as can be seen in [15], Lemma 3.7; hence, we know by [14], Theorem 6, that these partial sums are dominated by the second order terms in the Hermite expansion of f*:
\[
\sum_{j=1}^{n} \sum_{\substack{l_1, \ldots, l_{dh} \geq 0 \\ l_1 + \cdots + l_{dh} = 2}} \frac{\mathrm{E}\big[f^*(U_{j,h})\, H_{l_1,\ldots,l_{dh}}(U_{j,h})\big]}{l_1! \cdots l_{dh}!}\, H_{l_1,\ldots,l_{dh}}(U_{j,h}) + o_P\big(n^{2d^*}\big).
\]
This follows from the multivariate extension of the Reduction Theorem as proven in [14]. We obtain: Note that: since the entries of U_{j,h} are independent for fixed j and identically N(0, 1) distributed, the subtrahend in (A14) equals the expected value of the minuend.
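For concreteness, the multivariate Hermite polynomials H_{l_1,...,l_{dh}} appearing in this expansion are products of univariate probabilists' Hermite polynomials in the independent standard normal coordinates. The helper functions below are an illustrative sketch, not part of the paper:

```python
import numpy as np
from math import prod

# Sketch: multivariate Hermite polynomial H_{l_1,...,l_m}(u) as a product of
# univariate probabilists' Hermite polynomials He_{l_i}(u_i).

def hermite_prob(k, x):
    """Probabilists' Hermite polynomial He_k via the recursion
    He_0 = 1, He_1 = x, He_{k+1} = x He_k - k He_{k-1}."""
    h0, h1 = np.ones_like(x), x
    if k == 0:
        return h0
    for j in range(1, k):
        h0, h1 = h1, x * h1 - j * h0
    return h1

def hermite_multi(l, u):
    """H_{l_1,...,l_m}(u) = prod_i He_{l_i}(u_i) for a multi-index l."""
    return prod(hermite_prob(li, ui) for li, ui in zip(l, u))

# The second order multi-indices l with l_1 + ... + l_m = 2 drive the limit,
# e.g. He_2(x) = x**2 - 1 and He_1(x) * He_1(y) = x * y.
```

The second order terms thus consist exactly of the polynomials u_i² − 1 and u_i u_k (i ≠ k), which is what links the expansion to sample (cross-)covariances in the sequel.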
since we are considering a stationary process. We obtain: Hence, we can state the following: where we define A := (α_{ik})_{1≤i,k≤dh} := Σ^{-1}_{d,h} C Σ^{-1}_{d,h}, with C := E[Y_{j,h} f(Y_{j,h}) Y^t_{j,h}] as the matrix of second order Hermite coefficients, now, in contrast to B, with respect to the originally considered process (Y_{j,h})_{j∈Z}.
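As an illustration of the matrices C and A (with a toy functional f and a toy covariance Σ that are assumptions of this sketch, not the paper's objects), one can estimate C = E[Y f(Y) Y^t] by Monte Carlo:

```python
import numpy as np

# Monte Carlo sketch of C = E[ Y f(Y) Y^T ] and A = Sigma^{-1} C Sigma^{-1}
# for a TOY centered functional f(y) = y_1 * y_2 - rho and toy covariance Sigma.
# For this f, Isserlis' theorem gives C_pq = Sigma_1p Sigma_2q + Sigma_1q Sigma_2p,
# i.e. C = [[2*rho, 1 + rho**2], [1 + rho**2, 2*rho]].

rng = np.random.default_rng(3)
rho = 0.4
Sigma = np.array([[1.0, rho], [rho, 1.0]])
L = np.linalg.cholesky(Sigma)
Y = (L @ rng.standard_normal((2, 200_000))).T     # rows ~ N(0, Sigma)

f_vals = Y[:, 0] * Y[:, 1] - rho                  # centered: E[f(Y)] = 0
C_hat = (Y * f_vals[:, None]).T @ Y / len(Y)      # estimates E[Y f(Y) Y^T]
Sigma_inv = np.linalg.inv(Sigma)
A_hat = Sigma_inv @ C_hat @ Sigma_inv             # estimates the matrix A above
```

The estimate C_hat is symmetric by construction and matches the closed form up to Monte Carlo error, which illustrates why A inherits the symmetry used later for its diagonalization.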
Remembering the special structure of the entries α_{ik}, i, k = 1, . . . , dh, we can see that: where we divide: We can now split the considered sum in (A17) in such a way that we are afterwards able to express it in terms of sample cross-covariances. In order to do so, we define the sample cross-covariance at lag l by Γ̂^{(p,q)}_n(l) := (1/n) Σ_{j=1}^{n−l} Y^{(p)}_j Y^{(q)}_{j+l}, for p, q = 1, . . . , d.
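A direct numerical transcription of this estimator for a centered d-dimensional series might look as follows; the function name and the i.i.d. toy data are ours, for illustration only:

```python
import numpy as np

# Sketch of the sample cross-covariance estimator Gamma_hat_n(l) for a centered
# stationary series: the (p, q) entry is (1/n) * sum_j Y_j^(p) * Y_{j+l}^(q).

def sample_cross_cov(Y, l):
    """Y: (n, d) array whose columns are the centered marginal processes."""
    n, _ = Y.shape
    return (Y[: n - l].T @ Y[l:]) / n

rng = np.random.default_rng(2)
Y = rng.standard_normal((10_000, 2))     # i.i.d. N(0, I_2) toy data
G0 = sample_cross_cov(Y, 0)              # close to the identity matrix
G1 = sample_cross_cov(Y, 1)              # close to the zero matrix
```

For the i.i.d. toy data the estimator is close to I_2 at lag 0 and close to zero at lag 1; for a long-range dependent input the lagged entries would instead decay slowly, which is exactly the regime studied above.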
Note that in the case h = 1, the sum in (A17) can directly be expressed in terms of Γ̂^{(p,q)}_n(0).
The case h = 2 has to be regarded separately, too, and we obtain: Note that for each of the remainder terms labeled above, the following holds for d* ∈ (1/4, 1/2): We use this property later on when dealing with the asymptotics of the term in (A17). Finally, we consider the term in (A17) for h ≥ 3 and arrive at: Again, for each of the remainder terms labeled above, it holds for d* ∈ (1/4, 1/2) that: since each of them describes a sum with a finite number (independent of n) of summands.
Therefore, we may express the terms labeled above as o_P(n^{2d*}).
With these calculations, we are able to re-express the partial sum whose asymptotics we are interested in, in terms of the sample cross-covariances of the original long-range dependent process (Y_j)_{j∈Z}.
Finally, we know that any real-valued symmetric matrix A can be decomposed via diagonalization as A = VDV^t, with an orthogonal matrix V and a diagonal matrix D whose entries are the eigenvalues of A; for details, see [34], p. 327.
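This diagonalization step corresponds to a standard eigendecomposition; a minimal numerical sketch (with a toy symmetric matrix):

```python
import numpy as np

# Minimal sketch of the diagonalization A = V D V^T of a real symmetric matrix,
# with V orthogonal and D carrying the eigenvalues of A.

rng = np.random.default_rng(4)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                       # a toy real symmetric matrix

eigvals, V = np.linalg.eigh(A)          # eigh exploits the symmetry of A
D = np.diag(eigvals)
```

Using `eigh` (rather than the general `eig`) guarantees real eigenvalues and an orthogonal eigenvector matrix, matching the decomposition invoked in the proof.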