Article

Fractal Conditional Correlation Dimension Infers Complex Causal Networks

by Özge Canlı Usta 1,2,3,*,† and Erik M. Bollt 1,2,*,†
1
Department of Electrical and Computer Engineering, Clarkson University, 8 Clarkson Ave., Potsdam, NY 13699, USA
2
Clarkson Center for Complex Systems Science, Clarkson University, 8 Clarkson Ave., Potsdam, NY 13699, USA
3
Department of Electrical and Electronics Engineering, Dokuz Eylül University, Izmir 35390, Turkey
*	Authors to whom correspondence should be addressed.
†	These authors contributed equally to this work.
Entropy 2024, 26(12), 1030; https://doi.org/10.3390/e26121030
Submission received: 4 October 2024 / Revised: 25 November 2024 / Accepted: 27 November 2024 / Published: 28 November 2024
(This article belongs to the Section Information Theory, Probability and Statistics)

Abstract: Causal inference has become popular in physical and engineering applications. While the problem poses immense challenges, it provides a way to model complex networks from observed time series. In this paper, we present the optimal conditional correlation dimensional geometric information flow principle (oGeoC), which can reveal direct and indirect causal relations in a network through geometric interpretations. We introduce two algorithms that utilize the oGeoC principle to discover direct links and then remove indirect links. The algorithms are evaluated using coupled logistic networks. The results indicate that, when the number of observations is sufficient, the proposed algorithms are highly accurate in identifying direct causal links and have a low false positive rate.

1. Introduction

Causal inference has attracted attention in various scientific fields, from engineering [1] to climate science [2,3] and from neuroscience [4] to ecological systems [5]. The problem is to reconstruct causal relations from the observed time series of a complex network. However, the underlying dynamics of the network are often unknown, and the observations can be limited. Hence, modeling such networks and inferring causal relationships among their subsystems can be quite challenging.
We have written that [6,7] “a basic question when defining the concept of information flow is to contrast versions of reality for a dynamical system. Either a subcomponent is closed or alternatively, there is an outside influence due to another component”. Clive Granger’s Nobel prize [8]-winning work leading to Granger causality (see also Wiener [9]) formulates causal inference as a concept of quality of forecasts. That is, we ask: does system X provide sufficient information regarding forecasts of future states of system X, or are forecasts improved with observations from system Y? We declare that X is not closed, as it is receiving influence (or information) from system Y, when data from Y improve forecasts of X; this is called Wiener–Granger causality (WGC). In Granger’s original test for causality (GC) between two time series, a time series Y has a causal influence on a second time series X if the future of X includes information from past terms of Y [10], as decided by forecasting X in two different ways with linear models, with and without the information from Y. GC deals with the identification of causality in stochastic and linear systems, and its extensions have been introduced to tackle the problem of detecting causation between multivariate time series and in nonlinear models [11,12,13]. Other variations on the concept of WGC exist based on other notions of forecasting.
Cross-mapping (CM) techniques, which predict one system based only on past observations of the other system, are also employed in causal inference problems [14,15]. Rulkov et al. studied the connections of two unidirectionally coupled systems and the detection of synchronization using a CM-based technique in [14]. The same authors also focused on the connections of unidirectionally coupled systems and applied the mutual nonlinear prediction method to neuroscience [15]. Several methods have been proposed to infer causal relationships and synchronization using CM techniques [16,17,18,19]. Following this line of CM-based research, Sugihara et al. proposed convergent cross-mapping (CCM), which utilizes a state–space reconstruction technique [5]. CCM can identify causality in weakly coupled networks and find causal links in complex ecosystems. Although CCM requires a large amount of data and fails in the case of strong coupling or synchrony, CCM and its variants have been widely used in recent years [20,21,22].
On the other hand, information-theoretic approaches, including transfer entropy [23,24,25,26], are applied to the causal inference problem in many settings because they are model-free. Of particular interest to us here is the more nuanced concept of direct information flow, which asks whether X causes Y conditioned on intermediaries Z: that is, X may flow through Z to influence Y while exerting no direct influence on Y. In a particular study, Sun et al. suggested the optimal causation entropy principle (oCSE), an algorithm that reveals causal relations in a complex network by using causation entropy [25] to learn direct and indirect influences. The principle is based on the idea that the causal parents of a node in the network form the minimal set of nodes that maximizes causation entropy. The oCSE principle allows us to differentiate the causal parents of a node from its indirect influences by using discovery and removal algorithms.
The idea of understanding connections between geometric information flow and causal inference has been investigated in recent decades [7,27,28,29,30]. An index has been proposed in [27] where it was called the dynamic complexity coherence measure. The index is the ratio of the sum of the correlation dimensions of individual subsystems to the correlation dimension of the coupled dynamical system. If the two systems are independent, the sum of the correlation dimensions of individual subsystems equals the correlation dimension of the concatenated dynamical system. However, if two systems are coupled, the sum of the correlation dimensions of individual subsystems is greater than the correlation dimension of the concatenated dynamical system. The index can determine the degree of synchronization and the presence of coupling [27]. The authors in [28] have also shown that the correlation dimension can reveal the presence and the direction of coupling. The synchronization of coupled systems can be determined by using this method.
Krakovská has used the correlation dimension to detect causality [30]. The study investigates the causal relevance between or within two systems for the different coupling strengths using the correlation dimension in the reconstructed state space. It has been emphasized that the correlation dimension in causal analysis can be a promising method between and within systems. Furthermore, the study states that correlation-based methods provide some advantages in finding causal relations in dynamical systems when we have sufficiently long observations and the states are observable.
Surasinghe and Bollt suggested the correlation dimension geometric information flow measure to quantify causal inference between two related systems in the geometric sense [7]. They proposed a new measure, the geometric information flow GeoC_{·→·}, based on the conditional correlation dimension, which enables the identification of causal relations between two related systems in geometric terms. They found that GeoC_{·→·} provides geometrically interpretable results when detecting causality in synthetic and real examples.
Conversely, Cummins et al. established a theoretical model that builds on Takens’ theorem [31] for recovering dynamic interactions between weakly or moderately coupled dynamical systems. The authors examined the limitations of state–space reconstruction methods. The manifold of the systems is reconstructed from a single coordinate observation function. Then, the approach seeks to identify which reconstructions exhibit mutual driving, one-directional driving, or complete independence. The approach fails to recover self-loops, and it cannot differentiate between mutual and unidirectional dynamical driving in connected components [32]. Although the study in [30] claims that the correlation dimension reveals causal relations in the reconstructed space, it can fail in some cases. If two uncoupled systems (X, Y) are driven by a hidden common driver Z, then X and Z cannot be distinguished, which falsely implies a directional link from X to Y when X and Z are synchronized.
Although the correlation dimension and measures based on it have been explored for detecting causality and synchronization, they have not been studied extensively to reveal connections in the network. The existing methods in the literature are particularly interested in detecting synchronization. In this paper, we focus on detecting causality in the networks, assuming the networks are not synchronized.
Additionally, the previously discussed CM or state–space methods reconstruct the phase space from a single observation of a node. In contrast, we observe all states of the subsystems in the network. Whether we analyze the correlation dimension in the reconstructed state space or observe all states, the correlation dimension alone may be insufficient to determine the direct and indirect influences in the network. We therefore utilize the conditional correlation dimension-based measure of [7] to infer direct and indirect influences in this paper.
The main goal of this paper is to quantify the causal inference between the subsystems in a network in the geometric sense. Unlike previous studies, this paper extends the analysis of causal inference problems using only geometric interpretation to detect causal links in networks. Expanding upon the fractal geometric concepts of the consequences of information flow in [7], we develop the optimal conditional correlation dimensional geometric information flow principle (oGeoC), which resembles the oCSE principle previously proposed by Sun et al. [25]. We present two algorithms to detect causal links and remove indirect links using the correlation dimension geometric information flow GeoC. The performance of the oGeoC algorithm is investigated on coupled logistic networks.

2. Problem Statement

2.1. Preliminaries and Basic Definitions

In this section, we present the notation and basic definitions. A graph G(V, E) is defined by the set of vertices (nodes), V = {v_1, v_2, …, v_N}, and the set of edges (links), E ⊆ V × V. If ∀(v_i, v_j) ∈ E, (v_j, v_i) ∈ E, the graph is undirected; otherwise, it is a directed graph. The set N_i = {v_j ∈ V | (v_i, v_j) ∈ E} is denoted as the set of parents of the i-th node. In short, we denote v_i simply as i. A graph can also be represented by its adjacency matrix A, whose elements are a_ij = 1 if there is an edge from j to i, and a_ij = 0 otherwise.
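As a concrete illustration of this adjacency convention, the following Python sketch builds the adjacency matrix of a hypothetical three-node chain (0-indexed in code; the example graph is ours, not from the paper) and reads off the parent sets:

```python
import numpy as np

# Hypothetical 3-node directed chain 0 -> 1 -> 2.
# Convention from the text: a_ij = 1 iff there is an edge from node j to node i.
A = np.zeros((3, 3), dtype=int)
A[1, 0] = 1  # edge from node 0 into node 1
A[2, 1] = 1  # edge from node 1 into node 2

# The parent set N_i of node i is the set of indices j with A[i, j] == 1.
parents = {i: set(np.flatnonzero(A[i])) for i in range(3)}
```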
Consider a discrete-time dynamical system in R^d expressed as

x_{n+1} = f(x_n)   (1)

where x_n ∈ R^d is the state variable at time step n and f(·): R^d → R^d is the local dynamics. We also consider a discrete-time dynamical network consisting of N identical components

x_{n+1}^{(i)} = f(x_n^{(i)}) + σ Σ_{j=1, j≠i}^{N} a_{ij} κ g(x_n^{(i)}, x_n^{(j)}),   i = 1, 2, …, N.   (2)

Here, x_n^{(i)} ∈ R^d is the state variable of node i at time step n, f(·): R^d → R^d is the local dynamics, σ is the coupling strength, a_{ij} represents the coupling from node j to node i (collected in matrix form as A ∈ R^{N×N}), κ ∈ R^{d×d} is the inner coupling matrix, and g(x_n^{(i)}, x_n^{(j)}): R^d × R^d → R^d is the coupling function. To simplify the notation, we define the next step of x_n^{(i)} as

x′^{(i)} = x_{n+1}^{(i)},   (3)

where the ′ denotes the next time step as an alternative to explicitly indexing time. Let {x_n^{(i)}}_{n=1}^{T} and {x_n^{(j)}}_{n=1}^{T} represent sets of measurements from a network in (2). Assume that the manifolds of observations of (x_n^{(i)}, x_n^{(j)}, x′^{(i)}_n) ⊂ (X^{(i)} × X^{(j)} × X′^{(i)}) and (x_n^{(i)}, x′^{(i)}_n) ⊂ (X^{(i)} × X′^{(i)}) are denoted M and Ω_1, respectively. How these manifolds lie in the space provides crucial information about whether x′^{(i)}_n depends only on x_n^{(i)} or on (x_n^{(i)}, x_n^{(j)}). Thus, the dimensions of the manifolds of the subsystems can be decisive in determining causal inference between systems [7]. First, the conditional correlation dimensional geometric information flow is defined as follows:
Definition 1 (Conditional Correlation Dimensional Geometric Information Flow [7]). Assume that M and Ω_1 are bounded Borel sets. Let M be the manifold of the data set taken at time steps 1 to T + 1 for node i, (X_1^{(i)}, X_2^{(i)}, …, X_T^{(i)}, X_{T+1}^{(i)}), and let Ω_1 be the set X^{(i)} = (X_1^{(i)}, X_2^{(i)}, …, X_T^{(i)}) taken at time steps 1 to T for node i. The geometric information flow Geo(·|·) is defined as

Geo(X′^{(i)} | X^{(i)}) = D_2(M) − D_2(Ω_1)   (4)

where D_2(·) is the correlation dimension of the given dataset [33]. Then, the authors in [7] defined the correlation dimensional geometric information flow between two systems as follows:
Definition 2 (Correlation Dimensional Geometric Information Flow [7]). Consider X^{(i)} = (X_1^{(i)}, X_2^{(i)}, …, X_T^{(i)}) and X^{(j)} = (X_1^{(j)}, X_2^{(j)}, …, X_T^{(j)}) as time series measured at time steps 1 to T for nodes i and j, respectively. The correlation dimensional geometric information flow from j to i is measured by using the conditional correlation dimension in (4) and is given by

GeoC_{j→i} := Geo(X′^{(i)} | X^{(i)}) − Geo(X′^{(i)} | X^{(i)}, X^{(j)}).   (5)

It is clear that if j influences i, then GeoC_{j→i} > 0; if j does not influence i, then GeoC_{j→i} = 0.
GeoC is based on quantifying the information flow between variables and how manifolds are mapped [7]. The study also investigates information flow between two systems even when the observation set is not a manifold but a fractal; it relies on how the fractal dimension changes under the transformations [34].
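As a rough numerical illustration, the sketch below estimates GeoC_{j→i} from scalar time series by combining four correlation-dimension estimates, following Definitions 1 and 2. The helper `d2` is a crude Grassberger–Procaccia slope over a fixed radius grid (the radii are our choice, not from the paper), so the values are only indicative:

```python
import numpy as np

def d2(points, radii=tuple(np.linspace(0.05, 0.6, 20))):
    """Crude correlation-dimension estimate: least-squares slope of
    ln C(eps) versus ln eps over a fixed radius grid."""
    pts = np.atleast_2d(points)
    if pts.shape[0] < pts.shape[1]:
        pts = pts.T                      # ensure shape (T, dim)
    dist = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))
    iu = np.triu_indices(len(pts), k=1)  # distinct pairs only
    radii = np.asarray(radii)
    logC = np.log([max((dist[iu] < r).mean(), 1e-300) for r in radii])
    return np.polyfit(np.log(radii), logC, 1)[0]

def geoc_j_to_i(x_i, x_j):
    """GeoC_{j->i} = Geo(X'_i | X_i) - Geo(X'_i | X_i, X_j), per Eqs. (4)-(5)."""
    xi, xi_next, xj = x_i[:-1], x_i[1:], x_j[:-1]
    geo_given_i = d2(np.column_stack([xi, xi_next])) - d2(xi[:, None])
    geo_given_ij = (d2(np.column_stack([xi, xj, xi_next]))
                    - d2(np.column_stack([xi, xj])))
    return geo_given_i - geo_given_ij
```

In a quick check on two logistic maps, GeoC_{j→i} should come out larger when j drives i than when the maps are uncoupled, though a careful implementation would restrict each slope fit to the linear scaling region.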

2.2. Geometric Causation of Information Flow in Networks

We aim to extend the previous concept of correlation dimension geometric information flow to networks. The idea is to make an analogy between the oCSE principle [25] and GeoC [7]. Hence, extending the proposed geometric measure GeoC_{j→i} leads us to a solution of this problem.
Definition 3 (Optimal Conditional Correlation Dimensional Geometric Information Flow (oGeoC)). Let I, J, and K be subsets of nodes in a network. The correlation dimensional geometric information flow from J to I conditioned on K is defined as

GeoC_{J→I|K} := Geo(X′^{(I)} | X^{(K)}) − Geo(X′^{(I)} | X^{(J)}, X^{(K)})   (6)

where X′^{(I)} is the observation at time T + 1 for subset I, and X^{(J)} = (X_1^{(J)}, X_2^{(J)}, …, X_T^{(J)}) and X^{(K)} = (X_1^{(K)}, X_2^{(K)}, …, X_T^{(K)}) are time series measured at time steps 1 to T for node sets J and K, respectively. It is clear that if X^{(K)} = X^{(I)}, then (6) becomes

GeoC_{J→I|K} = GeoC_{J→I|I} = GeoC_{J→I}   (7)

as denoted in (5). Moreover, if X^{(K)} = ∅, GeoC_{·→·|·} simplifies to

GeoC_{J→I|∅} = Geo(X′^{(I)}) − Geo(X′^{(I)} | X^{(J)}).   (8)

Consider the case where J ⊆ K. In this case, Geo(X′^{(I)} | X^{(J)}, X^{(K)}) reduces to Geo(X′^{(I)} | X^{(K)}), which implies that GeoC_{J→I|K} = 0. Furthermore, if J ⊆ N_I and J ⊄ K, then GeoC_{J→I|K} > 0.
Using Definition 3 and its properties, we can quantify the information flow between variables geometrically. Moreover, we can identify direct and indirect influences by using algorithms similar to those designed earlier in [25]. “FORWARD GEOC” in Algorithm 1 computes GeoC_{j→i|K} for each node i ∈ V and finds the maximum of GeoC_{j→i|K} over j. The algorithm thereby discovers one causal parent of i in each iteration and updates the set K iteratively until GeoC_{j→i|K} reaches zero (ε_forward). In “BACKWARD GEOC”, the candidate set of causal parents of i is used, and the algorithm calculates GeoC_{j→i|K∖{j}} over the set K. If a GeoC value is zero (ε_backward), the corresponding candidate causal parent of i is removed from K. “BACKWARD GEOC” also returns the estimated adjacency matrix Â representing the graph.
ε_forward and ε_backward can be determined by setting a threshold or by performing a statistical significance test [35]. We utilized a shuffle test to determine the zero (ε_forward) in Algorithm 1; the shuffle test procedure is presented in Appendix B. We set a threshold for ε_backward in the backward algorithm. Note that Algorithm 1 requires the computation of the correlation dimension; its estimation is explained in Section Estimation of Correlation Dimension.
Algorithm 1 oGeoC Algorithm
1: procedure FORWARD GEOC({x_n^{(i)}}, i = 1, 2, …, N and n = 1, 2, …, T; V: the vertex set; ε_forward: threshold for zero)
2:   Initialize: N ← ∅
3:   for i ∈ V do
4:     Initialize: K ← ∅, index_max ← ∅
5:     do { K ← K ∪ {index_max} }
6:       for j ∈ V ∖ K do
7:         geoc_j[j] = GeoC_{j→i|K}
8:       end for
9:       maxgeoc = max_j {geoc_j}
10:      index_max = arg max {geoc_j}
11:    while maxgeoc > ε_forward
12:    N[i] ← K
13:  end for
14:  return I, N
15: end procedure
16: procedure BACKWARD GEOC({x_n^{(i)}}, i = 1, 2, …, N and n = 1, 2, …, T; set of nodes I ⊆ V; set of nodes N ⊆ V: the candidate set of causal parents of I; ε_backward: threshold for zero)
17:  Initialize: Â = 0_{N×N} with elements [â_ij]
18:  for i ∈ I do
19:    K ← N[i]
20:    for j ∈ K do
21:      if GeoC_{j→i|K∖{j}} > ε_backward then
22:        â_ij = 1
23:      end if
24:    end for
25:  end for
26:  return Â: the estimated adjacency matrix
27: end procedure
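The two procedures can be sketched in Python as follows; here `geoc_fn(j, i, K)` is a placeholder for whatever estimator of GeoC_{j→i|K} is used (for instance, one built on correlation-dimension estimates), and the thresholds correspond to ε_forward and ε_backward:

```python
import numpy as np

def forward_geoc(geoc_fn, nodes, eps_forward):
    """FORWARD GEOC sketch: greedily grow the candidate parent set K of
    each node i while the best conditional GeoC stays above threshold."""
    nodes = list(nodes)
    parents = {}
    for i in nodes:
        K = set()
        while True:
            candidates = [j for j in nodes if j not in K]
            if not candidates:
                break
            scores = {j: geoc_fn(j, i, K) for j in candidates}
            j_best = max(scores, key=scores.get)
            if scores[j_best] <= eps_forward:
                break
            K.add(j_best)
        parents[i] = K
    return parents

def backward_geoc(geoc_fn, parents, eps_backward, n_nodes):
    """BACKWARD GEOC sketch: prune each candidate parent j whose
    GeoC_{j->i|K\\{j}} falls below threshold; return adjacency A_hat."""
    A_hat = np.zeros((n_nodes, n_nodes), dtype=int)
    for i, K in parents.items():
        for j in K:
            if geoc_fn(j, i, K - {j}) > eps_backward:
                A_hat[i, j] = 1
    return A_hat
```

With an idealized `geoc_fn` that is positive exactly for true parents not yet conditioned on, the forward pass recovers the parent sets and the backward pass returns the true adjacency matrix.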

Estimation of Correlation Dimension

The correlation dimension D_2(·) is used to estimate GeoC_{J→I|K}. Hence, the correlation dimension and its implementation details are discussed in this section.
Consider the probability of a trajectory being within a ball B_ϵ(x) of radius ϵ around x, defined as p_ϵ(x) = ∫_{B_ϵ(x)} dμ(y). Then, the generalized correlation integral becomes [33]

C_q(ϵ) = ∫_x p_ϵ(x)^{q−1} dμ(x).   (9)

The correlation integral in Equation (9) can be rewritten using a Heaviside step function as follows:

C_q(ϵ) = ∫_x [∫_y Θ(ϵ − ||x − y||) dμ(y)]^{q−1} dμ(x).   (10)

Here, Θ is the Heaviside step function, defined as Θ(x) = 1 if x > 0 and Θ(x) = 0 otherwise.
Grassberger and Procaccia discussed the correlation integral for the case q = 2 [36,37]. From a finite set of observations x_i, the estimate of the correlation integral is given by

Ĉ(ϵ) = (1 / [T(T−1)]^{q−1}) Σ_{i=1}^{T} [Σ_{j≠i} Θ(ϵ − ||x_i − x_j||)]^{q−1}   (11)

where T is the number of samples. The modified version of the correlation sum for q = 2 is described in [38]:

Ĉ(ϵ) = (2 / T(T−1)) Σ_{i=1}^{T} Σ_{j=i+1}^{T} Θ(ϵ − ||x_i − x_j||).   (12)

The summation terms count the pairs (x_i, x_j) for which ||x_i − x_j|| < ϵ. It is expected that Ĉ(ϵ) scales as a power law, Ĉ(ϵ) ∝ ϵ^{D_2}, as T → ∞ and ϵ → 0. The correlation dimension D_2 is defined as

d(T, ϵ) = ∂ ln C(ϵ, T) / ∂ ln ϵ,   D_2 = lim_{ϵ→0} lim_{T→∞} d(T, ϵ).   (13)

A well-known technique for estimating D_2 involves obtaining the slope of the ln C(ϵ, T) versus ln ϵ curve over its linear regions for sufficiently large T [39]. First, ln C(ϵ, T) is plotted against ln ϵ for increasing ϵ until ln C(ϵ, T) no longer changes with increasing ln ϵ. Then, the slope of ln C(ϵ, T) versus ln ϵ in the linear region is determined using a numerical estimation method, typically least squares. Clearly, the estimate of D_2 depends on the number of samples T, the minimum radius ϵ_min, the maximum radius ϵ_max, and the number of radius steps #rs between ϵ_min and ϵ_max. The details are given in Appendix A.
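A minimal Python sketch of this procedure, combining the correlation sum of Eq. (12) with a least-squares slope over a radius grid, might look as follows. The default radii mirror those reported in Appendix A; a careful implementation would restrict the fit to the linear scaling region rather than the full grid:

```python
import numpy as np

def correlation_sum(x, eps):
    """Eq. (12): fraction of distinct point pairs closer than eps (q = 2)."""
    pts = np.atleast_2d(x)
    if pts.shape[0] < pts.shape[1]:
        pts = pts.T                      # ensure shape (T, dim)
    dist = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))
    iu = np.triu_indices(len(pts), k=1)  # pairs with j > i
    return (dist[iu] < eps).mean()

def estimate_d2(x, eps_min=0.0562, eps_max=0.630, n_steps=50):
    """Least-squares slope of ln C(eps) versus ln eps over the radius grid."""
    radii = np.linspace(eps_min, eps_max, n_steps)
    logC = np.array([np.log(max(correlation_sum(x, r), 1e-300)) for r in radii])
    return np.polyfit(np.log(radii), logC, 1)[0]
```

For a chaotic logistic-map series on [0, 1], the estimate comes out of order one, as expected for a one-dimensional invariant set, though the coarse radius grid biases it somewhat.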

3. Results

In this section, we present some examples to demonstrate the performance of the proposed method.
Example 1. 
In the first example, we choose the logistic map as the dynamical system in Equation (2), with state equation

x_{n+1} = f(x_n) = a x_n (1 − x_n),   x_0 ∈ [0, 1].   (14)

Here, for a = 4, the system is chaotic. The number of nodes in the network (2) is N = 7, and we consider the directed and bidirectionally coupled networks shown in Figure 1a,b. The coupling strength is σ = 0.1, and the inner coupling matrix is κ = 1. We choose the coupling function as g(x_n^{(i)}, x_n^{(j)}) = f(x_n^{(j)}) − f(x_n^{(i)}). The number of permutations is selected as N_p = 100 to determine the zero ε_forward, and the significance threshold is θ = 0.01. The performance of the proposed algorithm is reported in terms of the true positive rate (TPR), the false positive rate (FPR), and the receiver operating characteristic (ROC) curve [40].
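For reproducibility, the network in Equation (2) with the choices above (logistic f with a = 4, κ = 1, g(x_i, x_j) = f(x_j) − f(x_i)) can be simulated with the following sketch; the initial-condition range and seed are our choices, not from the paper:

```python
import numpy as np

def simulate_logistic_network(A, T, sigma=0.1, a=4.0, seed=0):
    """Iterate Eq. (2) with f(x) = a x (1 - x), kappa = 1 and
    coupling g(x_i, x_j) = f(x_j) - f(x_i)."""
    rng = np.random.default_rng(seed)
    N = A.shape[0]
    x = rng.uniform(0.1, 0.9, size=N)    # random initial states
    traj = np.empty((T, N))
    for n in range(T):
        traj[n] = x
        fx = a * x * (1.0 - x)
        # sum_j a_ij (f(x_j) - f(x_i)) = A @ fx - (row degree) * fx
        x = fx + sigma * (A @ fx - A.sum(axis=1) * fx)
    return traj
```

With this diffusive coupling form and σ = 0.1, each updated state is a convex combination of values of f in [0, 1] (for node in-degree up to 1/σ), so the trajectories stay in the unit interval.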
Figure 1c,d show the TPRs and FPRs with respect to various sample sizes (T) for the networks in Figure 1a,b, respectively. As the number of samples increases, the TPR reaches one and the FPR becomes zero, as depicted in Figure 1c; we find all links correctly for the network in Figure 1a using the oGeoC algorithm. For the network in Figure 1b, the TPR is almost one when T is increased (the algorithm misses only one link as a false negative), and the FPR drops when T > 9000, as shown in Figure 1d.
Example 2. 
In this example, we investigate the case of randomly coupled networks. We use the Erdős–Rényi (ER) model to generate random graphs [41], creating random couplings with probability p = 0.1 for the networks. The number of nodes is N = 20. We use the same logistic-map dynamics and network parameters as in Example 1, but with 10 independent trials.
We show only one of the realizations in Figure 2a. We illustrate TPRs and FPRs for different sample sizes in Figure 2b. The points represent the average values of TPRs and FPRs over 10 trials. The maximum point of the error bar indicates the highest value of TPRs and FPRs, while the minimum point represents their lowest values. Additionally, the ROC curve is demonstrated in Figure 2c with T = 500 and T = 1000 for the network in Figure 2a. The number of permutations is chosen as N p = 100 . The significance threshold θ varies from 0.01 to 0.99 in Figure 2c to extract the ROC curve.
The TPR approaches one, and the interval between the maximum and minimum values of the error bars shrinks as T is increased, as shown in Figure 2b. The FPR is slightly reduced but not equal to zero for T = 10,000. In summary, the algorithm succeeds in finding the true links in the networks, though it detects a small number of links as false positives.
The ROC curves in Figure 2c indicate that the performance of the proposed algorithms improves as the number of samples increases, as expected. However, when θ is increased, the performance of the algorithms decreases even if N_p is large.

4. Discussion

In this study, we investigated the causal inference of networks using only geometric interpretations. We utilized the conditional correlation dimensional geometric information flow measure, based on the correlation dimension, to accomplish this. We proposed the oGeoC principle, which allows us to find the causal and noncausal parents of a node, thereby identifying direct and indirect links in a geometric sense. We tested the proposed method on coupled logistic maps. Our findings revealed that we could find the links when the number of samples was large enough. False positives decreased when the observations were sufficient and the significance level θ was selected appropriately.
It is also important to note that the number of observations is a vital parameter when estimating GeoC. Although GeoC detects causal relationships between systems, a large number of observations is required to estimate D_2 accurately [42].
We obtained D_2 by finding the slope of the ln C(ϵ, T) versus ln ϵ curve over its linear regions. In this technique, the selection of the minimum radius ϵ_min and the maximum radius ϵ_max plays a key role. If the dimensionality of the system is high, finding a linear region in the curve can be problematic even with a large number of observations. Hence, for large networks, it is important to increase the number of observations and to select ϵ_min and ϵ_max by considering the linear region of the ln C(ϵ, T) versus ln ϵ curve.
Furthermore, we plotted ROC curves according to the significance level θ with the number of shuffles fixed. If the number of shuffles is large enough, the oGeoC algorithm can determine the causal links while removing the noncausal links. When the significance level θ is increased, the false positive rate reaches one even if the number of shuffles is large, as the ROC curve depicts.
As a direction for future work, it would be interesting to apply the oGeoC principle to real data. In real data analysis, it may not be possible to observe all the states of the network nodes. In this case, it is necessary to reconstruct the phase space from a single observation of a node. The embedding parameters (e.g., the embedding delay and the embedding dimension) are significant factors when reconstructing the state space using Takens’ embedding theorem [39]. When the data are sufficiently long and noise-free, there exists a diffeomorphism between the reconstructed state space and the original space, which ensures that invariants of the system, such as D_2, are preserved in the reconstructed phase space. However, real observations are generally too short or too noisy. Therefore, the choice of the embedding parameters is essential for an accurate estimation of D_2.
Determining D_2 can also become challenging in noisy conditions. The studies in [33,43] have shown that we may not find a significant scaling region in the correlation sum, nor a linear interval for the correlation dimension, when the dataset has a 2% noise level. Therefore, we may not obtain a reliable estimate of GeoC in the presence of noise.
In addition, it is known that D_2 can be estimated with several techniques [44,45,46]. The oGeoC principle with different D_2 estimation techniques or with noisy data can be investigated in further work.
Although existing studies and our proposed method deal with detecting causal inference in deterministic systems, the question of which method to use in stochastic systems, especially in weak and moderate coupling, remains unresolved. Another open question is whether different dynamic properties of interacting systems might bias the estimation of the causal direction. In a recent study, the ability of a state–space correspondence method to identify causal direction in nonlinear bivariate stochastic processes was investigated to solve these problems [47].
In our case, the oGeoC principle relies on the estimation of the correlation dimension. When the noise variance of nonlinear bivariate stochastic processes increases, the estimates of the correlation dimensions of the systems become less reliable. As a result, it becomes difficult to identify the causal and noncausal parents of a node, and the performance of the proposed algorithms is reduced accordingly. A potential solution to these challenges is to examine the estimation of the correlation dimension using the newly proposed techniques in [44,45,46] for nonlinear bivariate stochastic processes.
To conclude, we showed that the oGeoC principle can detect the causal parents of a node in a network when the observations are long enough and entirely noise-free. It will be interesting to apply the oGeoC principle to real data where the causal relations are unknown, including time series of air quality, temperature, and humidity. Future studies should also explore applications involving stochastic interactions, such as determining causality in physiological control mechanisms, brain activity interactions, and coupled ocean–atmosphere chaotic systems. We aim to test this principle in a more detailed study and explore these causal relationships.

Author Contributions

Conceptualization, Ö.C.U. and E.M.B.; methodology, Ö.C.U. and E.M.B.; software, Ö.C.U. and E.M.B.; validation, Ö.C.U. and E.M.B.; formal analysis, Ö.C.U. and E.M.B.; investigation, Ö.C.U. and E.M.B.; resources, E.M.B.; data curation, Ö.C.U. and E.M.B.; writing—original draft preparation, Ö.C.U. and E.M.B.; writing—review and editing, Ö.C.U. and E.M.B.; visualization, Ö.C.U. and E.M.B.; supervision, E.M.B.; project administration, E.M.B.; funding acquisition, E.M.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors would like to acknowledge support from the NIH-CRCNS, DARPA RSDN, the ARO, the AFOSR and the ONR.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Estimation of Correlation Dimension

Given a time series {x_n}_{n=1}^{T}, our goal is to estimate D_2. Although D_2 is invariant, the correlation sum in Equation (12) is not invariant for a given ϵ [33]. Therefore, the correlation sum in Equation (12) is first calculated for several radii. Then, the curve of ln Ĉ(ϵ) versus ln(ϵ) is plotted, and the slope of the curve is computed over a linear region. In this paper, we use a linear regression model to determine the slope. The pseudo-code is given in Algorithm A1.
In our simulations, we choose the minimum radius as ϵ_min = 0.0562, the maximum radius as ϵ_max = 0.630, and the number of radius steps as #rs = 50 for the coupled logistic networks. We utilize a k-d tree to count the pairs inside the ball of each ϵ and use multiprocessing tools when estimating D_2.
Algorithm A1 D_2 Estimation
1: procedure D2 ESTIMATION(x, ϵ_min, ϵ_max, #rs)
2:   Initialize: step_ϵ = (ϵ_max − ϵ_min)/#rs,
3:   slope_array ← []
4:   radius_array ← []
5:   for r = ϵ_min to ϵ_max step step_ϵ do
6:     Calculate Ĉ(r) using x
7:     slope_array ← slope_array + [ln Ĉ(r)]
8:     radius_array ← radius_array + [ln(r)]
9:   end for
10:  (D_2, residuals) ← Linear_Regression(radius_array, slope_array)
11:  return D_2
12: end procedure

Appendix B. Shuffle Test to Determine the Zero

Two thresholds, ε_forward and ε_backward, determine the zero in the GeoC algorithm. We perform a statistical significance test to check whether GeoC is greater than zero. The idea is to obtain an empirical cumulative distribution from the shuffled GeoC values and use it to determine the significance level.
To achieve this, we generate a random permutation array that shuffles the time indices and obtain a surrogate x̃_n^{(j)} by permuting x_n^{(j)} accordingly. Then, GeoC_{j→i|K} is calculated with the surrogate. The procedure is repeated N_p times, and the shuffled GeoC values are sorted in ascending order. The significance threshold θ and N_p set a predefined index that determines the significance level: ε is the predefined_index-th element of the ascending list of shuffled GeoC values. The pseudo-code is given in Algorithm A2.
The determination of ε depends significantly on the number of shuffles N_p and on θ. As N_p increases and θ decreases, we can determine the zero more reliably.
Algorithm A2 Shuffle Test

1:  procedure ShuffleTest(x_n^(i), x_n^(j), x_n^(K) for n = 1, 2, …, T; N_p: number of shuffles; θ: significance threshold)
2:      Initialize: array_index ← int(N_p × (1 − θ))
3:      shuffle_array ← [ ]
4:      for trial = 1 to N_p do
5:          Generate a random permutation of the time indices n
6:          Shuffle x_n^(j) according to the permutation to obtain the shuffled time series x̃_n^(j)
7:          Calculate GeoC_{j→i|K} using x̃_n^(j)
8:          shuffle_array ← shuffle_array + [GeoC_{j→i|K}]
9:      end for
10:     ascending_Geo_array ← sort(shuffle_array)
11:     ε ← ascending_Geo_array[array_index]
12:     return ε
13: end procedure
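The shuffle test above is a standard permutation test. A minimal Python sketch follows, assuming a caller-supplied estimator `geoc_fn` for GeoC_{j→i|K}; the function name and signature are ours, not from the paper.

```python
import numpy as np

def shuffle_threshold(geoc_fn, x_i, x_j, x_cond, n_perm=100, theta=0.01, seed=0):
    """Permutation-test threshold epsilon (sketch of Algorithm A2).

    geoc_fn(x_i, x_j, x_cond) is assumed to return GeoC_{j->i|K}.
    Shuffling x_j in time destroys any causal link from j to i, so the
    (1 - theta) quantile of the shuffled GeoC values serves as the
    significance threshold.
    """
    rng = np.random.default_rng(seed)
    shuffled = []
    for _ in range(n_perm):
        perm = rng.permutation(len(x_j))   # random permutation of time indices
        shuffled.append(geoc_fn(x_i, x_j[perm], x_cond))
    shuffled.sort()
    index = int(n_perm * (1 - theta))
    return shuffled[min(index, n_perm - 1)]
```

With N_p = 100 and θ = 0.01, as in the simulations, array_index = 99, i.e., the threshold is the largest of the shuffled values.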

Appendix C. Illustration of D 2 Estimations for the Networks

To demonstrate the D_2 estimations, we walk through the proposed forward algorithm for the networks in Figure 1a and Figure 2a, respectively.
First, we start with the network in Figure 1a. Let us randomly choose the third node (i = 3). In the first iteration, the forward algorithm calculates GeoC_{j→3|∅} for all j and returns the maximum value and the index at which it is attained. In our case, the maximum is GeoC_{2→3|∅} = 0.163 for the third node. When Equation (8) is written for i = 3 and j = 2, it becomes
GeoC_{2→3|∅} = Geo(X^(3)) − Geo(X^(3) | X^(2)) = D_2(M_{X^(3)}) − D_2(M_{(X^(3), X^(2))}) + D_2(M_{X^(2)}).
To estimate GeoC_{2→3|∅}, we need the correlation dimensions D_2(M_{X^(3)}), D_2(M_{(X^(3), X^(2))}), and D_2(M_{X^(2)}) for the datasets X^(3), (X^(3), X^(2)), and X^(2), respectively. We use Algorithm A1 to estimate D_2(·). We plot the curves of ln Ĉ(ε)/ln(ε) for these datasets in Figure A1. The blue circles show the natural logarithm of the correlation sum versus the natural logarithm of the radius, and the red line is the least-squares fit through them; its slope gives the estimated correlation dimension.
In the algorithm, we determine whether the maximum value of GeoC is statistically significant. We found that GeoC_{2→3|∅} > 0; hence, the index of the maximum GeoC_{j→3|∅} becomes K = index_max = 2.
In the second iteration, we need to compute GeoC_{j→3|2} for all j except K = 2. Our results reveal that the maximum value is GeoC_{7→3|2} = 0.001, attained at j = 7. For j = 7, i = 3, and K = 2, Equation (6) is expressed as
GeoC_{7→3|2} = Geo(X^(3) | X^(2)) − Geo(X^(3) | X^(2), X^(7)) = D_2(M_{(X^(3), X^(2))}) − D_2(M_{X^(2)}) − D_2(M_{(X^(3), X^(2), X^(7))}) + D_2(M_{(X^(2), X^(7))}).
Again, we illustrate the curves of ln Ĉ(ε)/ln(ε) and the corresponding estimated correlation dimensions (D_2(M_{(X^(3), X^(2))}), D_2(M_{X^(2)}), D_2(M_{(X^(3), X^(2), X^(7))}), and D_2(M_{(X^(2), X^(7))})) in Figure A2.
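Note that once the four correlation dimensions are estimated, the conditional GeoC is just their signed combination. A hypothetical helper (our naming, not the authors'):

```python
def geoc_from_d2(d2_iK, d2_K, d2_iKj, d2_Kj):
    """GeoC_{j->i|K} from four D2 estimates, following Equation (6):
    Geo(X_i | X_K) - Geo(X_i | X_K, X_j).
    With D2 of the empty conditioning set taken as 0, this reduces to the
    unconditional Equation (8)."""
    return (d2_iK - d2_K) - (d2_iKj - d2_Kj)
```

For example, with d2_iK = 1.8, d2_K = 1.0, d2_iKj = 1.9, and d2_Kj = 1.2, the conditional GeoC is 0.1: adding X_j lowers the conditional geometric complexity of X_i, suggesting a direct link.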
Figure A1. The curve of ln C ^ ( ϵ ) / ln ( ϵ ) for a given dataset of (a) X ( 3 ) , (b) ( X ( 3 ) , X ( 2 ) ) , (c) X ( 2 ) for the network in Figure 1a. The estimated correlation dimension is shown in the legend. The number of observations is T = 10 , 000 .
Figure A2. The curve of ln C ^ ( ϵ ) / ln ( ϵ ) for a given dataset of (a) ( X ( 3 ) , X ( 2 ) ) , (b) X ( 2 ) , (c) ( X ( 3 ) , X ( 2 ) , X ( 7 ) ) , (d) ( X ( 2 ) , X ( 7 ) ) for the network in Figure 1a. The estimated correlation dimension is shown in the legend. The number of observations is T = 10 , 000 .
In the second iteration, the maximum GeoC is not statistically significant (GeoC_{7→3|2} is not significantly greater than zero). Thus, the algorithm stops at the second iteration, and we determine the causal parent of the third node as K = {2} for the network in Figure 1a, as expected.
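The forward stage illustrated above can be summarized as a greedy loop. The sketch below is ours; `geoc` and `threshold` are assumed to be caller-supplied callables (e.g., a D_2-based estimator and the shuffle test of Appendix B):

```python
def forward_geoc(i, nodes, geoc, threshold):
    """Greedy forward stage (sketch): grow the candidate parent set K of
    node i by repeatedly adding the node j with the largest conditional
    GeoC_{j->i|K}, stopping once the maximum is no longer significant.

    geoc(j, i, K) returns the conditional GeoC; threshold(j, i, K) returns
    the shuffle-test significance level epsilon for that comparison.
    """
    K = []
    candidates = [j for j in nodes if j != i]
    while candidates:
        best = max(candidates, key=lambda j: geoc(j, i, tuple(K)))
        if geoc(best, i, tuple(K)) <= threshold(best, i, tuple(K)):
            break  # maximum GeoC not statistically significant: stop
        K.append(best)
        candidates.remove(best)
    return K
```

In the walkthrough above, for i = 3 the loop adds node 2 in the first iteration and stops in the second, returning K = [2].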
Second, we continue with the network in Figure 2a. We randomly choose the seventh node (i = 7). In the first iteration, the algorithm finds the maximum GeoC as GeoC_{2→7|∅}, which is statistically significant (GeoC_{2→7|∅} > 0). When estimating GeoC_{2→7|∅}, the algorithm computes the correlation dimensions of the datasets X^(7), (X^(7), X^(2)), and X^(2) using the same technique as in Algorithm A1. The curves of ln Ĉ(ε)/ln(ε) and the corresponding estimated D_2(·) are shown in Figure A3.
Figure A3. The curve of ln C ^ ( ϵ ) / ln ( ϵ ) for a given dataset of (a) X ( 7 ) , (b) ( X ( 7 ) , X ( 2 ) ) , (c) X ( 2 ) for the network in Figure 2a. The estimated correlation dimension is shown in the legend. The number of observations is T = 10 , 000 .
In the second iteration, the maximum GeoC is attained at GeoC_{20→7|2}. We show the curves of ln Ĉ(ε)/ln(ε) and the D_2(·) estimates for the datasets (X^(7), X^(2)), X^(2), (X^(7), X^(2), X^(20)), and (X^(2), X^(20)) in Figure A4. The algorithm determines GeoC_{20→7|2} > 0 and updates K = {2, 20} as the candidate set of causal parents.
In the third iteration, GeoC_{10→7|(2,20)} takes the maximum value, which requires the correlation dimensions of the datasets (X^(7), X^(2), X^(20)), (X^(2), X^(20)), (X^(7), X^(2), X^(20), X^(10)), and (X^(2), X^(20), X^(10)). The curves of ln Ĉ(ε)/ln(ε) of these datasets and their estimated D_2(·) are shown in Figure A5. In this case, the algorithm decides that GeoC_{10→7|(2,20)} is not significantly greater than zero and returns the causal parents of the seventh node as K = {2, 20}.
Figure A4. The curve of ln C ^ ( ϵ ) / ln ( ϵ ) for a given dataset of (a) ( X ( 7 ) , X ( 2 ) ) , (b) X ( 2 ) , (c) ( X ( 7 ) , X ( 2 ) , X ( 20 ) ) , (d) ( X ( 2 ) , X ( 20 ) ) for the network in Figure 2a. The estimated correlation dimension is shown in the legend. The number of observations is T = 10 , 000 .
Figure A5. The curve of ln Ĉ(ε)/ln(ε) for a given dataset of (a) (X^(7), X^(2), X^(20)), (b) (X^(2), X^(20)), (c) (X^(7), X^(2), X^(20), X^(10)), (d) (X^(2), X^(20), X^(10)) for the network in Figure 2a. The estimated correlation dimension is shown in the legend. The number of observations is T = 10,000.

References

1. Sudu Ambegedara, A.; Sun, J.; Janoyan, K.; Bollt, E. Information-theoretical noninvasive damage detection in bridge structures. arXiv 2016, arXiv:1612.09340.
2. Runge, J.; Bathiany, S.; Bollt, E.; Camps-Valls, G.; Coumou, D.; Deyle, E.; Glymour, C.; Kretschmer, M.; Mahecha, M.D.; Muñoz-Marí, J.; et al. Inferring causation from time series in Earth system sciences. Nat. Commun. 2019, 10, 2553.
3. Runge, J.; Gerhardus, A.; Varando, G.; Eyring, V.; Camps-Valls, G. Causal inference for time series. Nat. Rev. Earth Environ. 2023, 4, 487–505.
4. Seth, A.K.; Chorley, P.; Barnett, L.C. Granger causality analysis of fMRI BOLD signals is invariant to hemodynamic convolution but not downsampling. Neuroimage 2013, 65, 540–555.
5. Sugihara, G.; May, R.; Ye, H.; Hsieh, C.H.; Deyle, E.; Fogarty, M.; Munch, S. Detecting causality in complex ecosystems. Science 2012, 338, 496–500.
6. Bollt, E.M. Open or closed? Information flow decided by transfer operators and forecastability quality metric. arXiv 2018, arXiv:1804.03687.
7. Surasinghe, S.; Bollt, E.M. On geometry of information flow for causal inference. Entropy 2020, 22, 396.
8. Hendry, D.F. The Nobel Memorial Prize for Clive W.J. Granger. Scand. J. Econ. 2004, 106, 187–213.
9. Wiener, N. The theory of prediction. In Modern Mathematics for Engineers; McGraw Hill: New York, NY, USA, 1956.
10. Granger, C.W. Investigating causal relations by econometric models and cross-spectral methods. Econometrica 1969, 37, 424–438.
11. Marinazzo, D.; Pellicoro, M.; Stramaglia, S. Kernel method for nonlinear Granger causality. Phys. Rev. Lett. 2008, 100, 144103.
12. Barrett, A.B.; Barnett, L.; Seth, A.K. Multivariate Granger causality and generalized variance. Phys. Rev. E 2010, 81, 041907.
13. Marinazzo, D.; Liao, W.; Chen, H.; Stramaglia, S. Nonlinear connectivity by Granger causality. Neuroimage 2011, 58, 330–338.
14. Rulkov, N.F.; Sushchik, M.M.; Tsimring, L.S.; Abarbanel, H.D. Generalized synchronization of chaos in directionally coupled chaotic systems. Phys. Rev. E 1995, 51, 980.
15. Schiff, S.J.; So, P.; Chang, T.; Burke, R.E.; Sauer, T. Detecting dynamical interdependence and generalized synchrony through mutual prediction in a neural ensemble. Phys. Rev. E 1996, 54, 6708.
16. Arnhold, J.; Grassberger, P.; Lehnertz, K.; Elger, C.E. A robust method for detecting interdependences: Application to intracranially recorded EEG. Phys. D Nonlinear Phenom. 1999, 134, 419–430.
17. Quiroga, R.Q.; Kraskov, A.; Kreuz, T.; Grassberger, P. Performance of different synchronization measures in real data: A case study on electroencephalographic signals. Phys. Rev. E 2002, 65, 041903.
18. Andrzejak, R.G.; Kraskov, A.; Stögbauer, H.; Mormann, F.; Kreuz, T. Bivariate surrogate techniques: Necessity, strengths, and caveats. Phys. Rev. E 2003, 68, 066202.
19. Chicharro, D.; Andrzejak, R.G. Reliable detection of directional couplings using rank statistics. Phys. Rev. E 2009, 80, 026217.
20. Ye, H.; Deyle, E.R.; Gilarranz, L.J.; Sugihara, G. Distinguishing time-delayed causal interactions using convergent cross mapping. Sci. Rep. 2015, 5, 14750.
21. Mønster, D.; Fusaroli, R.; Tylén, K.; Roepstorff, A.; Sherson, J.F. Causal inference from noisy time-series data—Testing the Convergent Cross-Mapping algorithm in the presence of noise and external influence. Future Gener. Comput. Syst. 2017, 73, 52–62.
22. Breston, L.; Leonardis, E.J.; Quinn, L.K.; Tolston, M.; Wiles, J.; Chiba, A.A. Convergent cross sorting for estimating dynamic coupling. Sci. Rep. 2021, 11, 20374.
23. Schreiber, T. Measuring information transfer. Phys. Rev. Lett. 2000, 85, 461.
24. Sun, J.; Bollt, E.M. Causation entropy identifies indirect influences, dominance of neighbors and anticipatory couplings. Phys. D Nonlinear Phenom. 2014, 267, 49–57.
25. Sun, J.; Taylor, D.; Bollt, E.M. Causal network inference by optimal causation entropy. SIAM J. Appl. Dyn. Syst. 2015, 14, 73–106.
26. Lord, W.M.; Sun, J.; Ouellette, N.T.; Bollt, E.M. Inference of causal information flow in collective animal behavior. IEEE Trans. Mol. Biol. Multi-Scale Commun. 2016, 2, 107–116.
27. Janjarasjitt, S.; Loparo, K. An approach for characterizing coupling in dynamical systems. Phys. D Nonlinear Phenom. 2008, 237, 2482–2486.
28. Krakovská, A.; Budáčová, H. Interdependence measure based on correlation dimension. In Proceedings of the 9th International Conference on Measurement, Bratislava, Slovakia, 27–30 May 2013; pp. 31–34.
29. Krakovská, A.; Jakubík, J.; Budáčová, H.; Holecyová, M. Causality studied in reconstructed state space. Examples of uni-directionally connected chaotic systems. arXiv 2015, arXiv:1511.00505.
30. Krakovská, A. Correlation dimension detects causal links in coupled dynamical systems. Entropy 2019, 21, 818.
31. Takens, F. Detecting strange attractors in turbulence. In Dynamical Systems and Turbulence; Springer: New York, NY, USA, 1981; pp. 366–381.
32. Cummins, B.; Gedeon, T.; Spendlove, K. On the efficacy of state space reconstruction methods in determining causality. SIAM J. Appl. Dyn. Syst. 2015, 14, 335–381.
33. Kantz, H.; Schreiber, T. Nonlinear Time Series Analysis; Cambridge University Press: Cambridge, UK, 2003.
34. Sauer, T.; Yorke, J.A.; Casdagli, M. Embedology. J. Stat. Phys. 1991, 65, 579–616.
35. Good, P. Permutation, Parametric, and Bootstrap Tests of Hypotheses; Springer: New York, NY, USA, 2005.
36. Grassberger, P.; Procaccia, I. Characterization of strange attractors. Phys. Rev. Lett. 1983, 50, 346.
37. Grassberger, P.; Procaccia, I. Measuring the strangeness of strange attractors. Phys. D Nonlinear Phenom. 1983, 9, 189–208.
38. Theiler, J. Efficient algorithm for estimating the correlation dimension from a set of discrete points. Phys. Rev. A 1987, 36, 4456.
39. Abarbanel, H. Analysis of Observed Chaotic Data; Springer Science & Business Media: New York, NY, USA, 2012.
40. Sammut, C.; Webb, G.I. Encyclopedia of Machine Learning; Springer Science & Business Media: New York, NY, USA, 2011.
41. Erdős, P.; Rényi, A. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 1960, 5, 17–60.
42. Eckmann, J.P.; Ruelle, D. Fundamental limitations for estimating dimensions and Lyapunov exponents in dynamical systems. Phys. D Nonlinear Phenom. 1992, 56, 185–187.
43. Schreiber, T. Determination of the noise level of chaotic time series. Phys. Rev. E 1993, 48, R13.
44. Ji, C.; Zhu, H.; Jiang, W. A novel method to identify the scaling region for chaotic time series correlation dimension calculation. Chin. Sci. Bull. 2011, 56, 925–932.
45. Krakovská, A.; Chvosteková, M. Simple correlation dimension estimator and its use to detect causality. Chaos Solitons Fractals 2023, 175, 113975.
46. Makarov, V.A.; Muñoz, R.; Herreras, O.; Makarova, J. Correlation dimension of high-dimensional and high-definition experimental time series. Chaos Interdiscip. J. Nonlinear Sci. 2023, 33, 123114.
47. Porta, A.; de Abreu, R.M.; Bari, V.; Gelpi, F.; De Maria, B.; Catai, A.M.; Cairo, B. On the validity of the state space correspondence strategy based on k-nearest neighbor cross-predictability in assessing directionality in stochastic systems: Application to cardiorespiratory coupling estimation. Chaos Interdiscip. J. Nonlinear Sci. 2024, 34, 053115.
Figure 1. The performance of the proposed algorithm for the networks in (a,b). (a) A network with directed coupling consisting of N = 7 nodes and 8 links. (b) A network with bidirectional coupling consisting of N = 7 nodes and 12 links. In both networks, each circle represents a logistic map. (c) TPRs and FPRs plotted against various sample sizes T for the network in (a). (d) TPRs and FPRs with respect to T for the network in (b). In both simulations, the number of permutations is N_p = 100, and the significance threshold is θ = 0.01.
Figure 2. The performance of the proposed algorithm for networks randomly coupled according to the Erdős–Rényi (ER) model with probability p = 0.1. The simulations are repeated 10 times for different networks. (a) An illustration of one of the networks. (b) Error bar points show the mean of TPRs and FPRs with respect to different sample sizes; the maximum and minimum points of the error bars indicate the highest and lowest values of TPRs and FPRs. The number of permutations is again N_p = 100 with θ = 0.01. (c) ROC curves for T = 500 and T = 1000. Here, N_p = 100, but θ is varied from 0.01 to 0.99 to plot the ROC curve.