Abstract
Rooted in dynamical systems theory, convergent cross mapping (CCM) has recently attracted increased attention due to its capability to detect linear and nonlinear causal coupling in both random and deterministic settings. One limitation of CCM is that it uses both past and future values to predict the current value, which is inconsistent with the widely accepted definition of causality, where it is assumed that the future values of one process cannot influence the past of another. To overcome this obstacle, in our previous research, we introduced the concept of causalized convergent cross mapping (cCCM), in which future values are no longer used to predict the current value. In this paper, we focus on the implementation of cCCM in causality analysis. More specifically, we demonstrate the effectiveness of cCCM in identifying both linear and nonlinear causal coupling in various settings through a large number of examples, including Gaussian random variables with additive noise, sinusoidal waveforms, autoregressive models, stochastic processes with a dominant spectral component embedded in noise, deterministic chaotic maps, and systems with memory, as well as experimental fMRI data. In particular, we analyze the impact of shadow manifold construction on the performance of cCCM and provide detailed guidelines on how to configure the key parameters of cCCM in different applications. Overall, our analysis indicates that cCCM is a promising and easy-to-implement tool for causality analysis in a wide spectrum of applications.
1. Introduction
Causality analysis aims to find the relationship between causes and effects by exploring the directional influence of one variable on another, and it has been a central topic in science, economics, climate research, and many other fields [,,,,,,,,]. Compared with correlation, which reflects the mutual dependence between two variables, causality analysis may provide additional information, since two time series with low correlation may still have strong unidirectional or bidirectional causal coupling between them. Some representative examples can be found in [].
The first practical framework for causality analysis was Granger causality (GC), proposed by Granger in 1969 []. GC is a statistical approach that relies on a multi-step linear prediction model and aims to determine whether the values of one time series are useful in predicting the future values of the other. As a well-known technique, the validity and computational simplicity of GC have been widely recognized [,,,]. At the same time, it has also been noticed that when there exist instantaneous and/or strong nonlinear interactions between two regions, GC analysis may lead to invalid results [,]. Moreover, GC may not be able to detect causation in deterministic settings [,].
In 1990, directed information (DI)—the first causality detection tool based on information theory—was proposed by Massey [] when studying discrete memoryless communication channels with feedback. DI measures the directed information flowing from one sequence to the other. As an information-theoretic framework, a major advantage of DI is that it is a universal method that does not rely on any model assumptions of the signals and is not limited by linearity or separability [,]. In refs. [,], the performance of DI in causality analysis was demonstrated using both simulated data and experimental fMRI data. It was found that DI is capable of detecting both linear and nonlinear causal relationships. However, it was also noticed that the direct evaluation of DI relies heavily on probability estimation and tends to be sensitive to data length as well as the step size used in the quantization process [].
In 2012, convergent cross mapping (CCM), a new causality model based on state-space reconstruction, was proposed by Sugihara et al. [], and it was demonstrated that CCM could serve as an effective tool in addressing non-separable systems and identifying weakly coupled variables under deterministic settings, which may not be covered by GC. Since then, CCM has attracted considerable attention from the research community in many different fields [,,,,,,,,].
Recall that causality aims to determine whether the current and past values of one time series are useful in predicting the future values of another in addition to its own past values. In CCM, however, both the past and future values are utilized to reconstruct the current value []. As a result, the causality defined by CCM is inconsistent with the original, widely accepted definition of causality where the key assumption is that the future values of one process cannot influence the past of the other.
Motivated by this observation, in [], we introduced the concept of causalized convergent cross mapping (cCCM). More specifically, if only the current and historical values of X and the past values of Y are used to predict the current value of Y, and vice versa, then CCM is converted to causalized CCM. We further proved the approximate equivalence of DI and cCCM under stationary ergodic Gaussian random processes [].
This study is a continuation of our previous research [] and focuses on the implementation of cCCM in causality detection. More specifically, in this study, we aimed to further investigate the effectiveness of cCCM in identifying both linear and nonlinear causal coupling in various settings through a large number of examples, including Gaussian random variables with additive noise, sinusoidal waveforms, autoregressive models, stochastic processes with a dominant spectral component embedded in noise, deterministic chaotic maps, and systems with memory, as well as experimental functional magnetic resonance imaging (fMRI) data. In particular, we analyze the impact of shadow manifold construction on the performance of cCCM and provide detailed guidelines on how to configure the key parameters of cCCM (especially the shadow manifold dimension and time lag) in different applications. Moreover, we examine the noise effect in cCCM and show that, in general, reliable causality detection can be achieved when the signal-to-noise ratio (SNR) is larger than 15 dB. Overall, our analysis indicates that cCCM is a promising and easy-to-implement tool for causality analysis in a wide spectrum of applications.
The rest of the paper is organized as follows. In Section 2, we briefly revisit the original CCM, the causalized CCM (cCCM), the conditional equivalence between cCCM and DI, and the extension of bivariate cCCM to multivariate cCCM. In Section 3, we present the major results of the study, where we discuss the impact of noise on the performance of cCCM, evaluate the effectiveness of cCCM in causality analysis through numerous examples, provide detailed guidelines on the configuration of cCCM, and compare the performances of bivariate and multivariate cCCM and GC through both simulation examples and experimental fMRI data. Finally, we present the conclusions drawn from this research and provide related discussions in Section 4.
2. A Revisit of Causalized Convergent Cross Mapping
In this section, we first briefly revisit convergent cross mapping (CCM) [] and introduce the concept of causalized CCM (cCCM) []. We then present the conditional equivalence of cCCM and the directed information framework [] and introduce the extension of bivariate cCCM to multivariate cCCM.
Convergent cross mapping (CCM). The CCM algorithm is based on state-space reconstruction. Consider two dynamically coupled variables X and Y that share the same attractor manifold . Let and be the time series consisting of samples of X and Y, respectively. The CCM causality analysis framework can be summarized as follows:
- Step 1: Construct the shadow manifolds with respect to and .
- Step 2: For each vector , find its nearest neighbors and denote the time indices (from closest to farthest) of the nearest neighbors of by .
- Step 3: If the two signals X and Y are dynamically coupled, then the nearest neighbors of the current point on the shadow manifold of X would be mapped to points near the corresponding point on the shadow manifold of Y. The estimate of the current value of Y based on the shadow manifold of X, also referred to as the cross mapping from X to Y, is defined as a weighted average of the values of Y at the neighbor time indices, where the weights decrease with d, the Euclidean distance between the corresponding points on the shadow manifold of X. Please note that each neighbor time index is a function of t. The cross mapping from Y to X can be defined in a similar way. As n increases, the cross-mapped estimates are expected to converge to the true values of Y and X, respectively.
- Step 4: The cross-mapping correlations are defined as the Pearson correlations between the original time series and their cross-mapped estimates.
- Step 5: If the cross-mapping correlation in the X → Y direction is larger and converges faster than that in the Y → X direction, then we say that the causal effect of X on Y is stronger than that in the reverse direction. (A minimal implementation sketch of Steps 1–4 is given after this list.)
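To make Steps 1–4 concrete, the following is a minimal MATLAB sketch of the cross mapping from X to Y. It assumes the standard simplex-projection weighting of Sugihara et al., with E + 1 nearest neighbors and exponentially decaying weights; the function name, variable names, and the guard against zero distances are our own illustrative choices rather than the paper's exact notation.

```matlab
% Minimal cross-mapping sketch (X -> Y direction): reconstruct Y from the
% shadow manifold of X and return the Pearson correlation between Y and its
% cross-mapped estimate. Assumes X and Y are real vectors of equal length.
function rho = ccm_xmap(X, Y, E, tau)
    X = X(:);  Y = Y(:);
    n  = length(X);
    tt = (1 + (E-1)*tau) : n;                  % times with a complete delay vector
    L  = numel(tt);
    MX = zeros(L, E);                          % shadow manifold of X
    for k = 1:L
        MX(k, :) = X(tt(k) - (0:E-1)*tau);     % delay-embedding vector at time tt(k)
    end
    Yhat = zeros(L, 1);
    for k = 1:L
        d = sqrt(sum((MX - MX(k, :)).^2, 2));  % Euclidean distances on the manifold
        d(k) = inf;                            % exclude the point itself
        [ds, idx] = sort(d);                   % nearest neighbors first
        u  = exp(-ds(1:E+1) / max(ds(1), eps));% weights decay with distance
        w  = u / sum(u);
        yv = Y(tt(idx(1:E+1)));                % Y values at the neighbor times
        Yhat(k) = sum(w .* yv(:));             % weighted average (cross mapping)
    end
    c   = corrcoef(Y(tt), Yhat);
    rho = c(1, 2);                             % cross-mapping correlation
end
```

The cross mapping from Y to X is obtained by swapping the roles of X and Y in the call.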
Geometric illustration of convergent cross mapping. Here, we provide the geometric illustration of convergent cross mapping from the shadow manifold to the shadow manifold under both strong and weak causal coupling.
Figure 1a corresponds to the situation when there is a strong causal relationship from X to Y, and Figure 1b illustrates the case when there is only weak causation. For illustration purposes, the dimension of the shadow manifold was chosen to be , the neighborhood of the current point on the shadow manifold of X is represented using the simplex consisting of its three nearest neighbors, and the neighborhood of the corresponding point on the shadow manifold of Y is represented in the same way.
Figure 1.
Geometric illustration of the cross mapping from to . (a) When strong causation exists from X to Y, the nearest neighbors of are mapped to the nearest neighbors of . (b) When there is only weak causation from X to Y, the nearest neighbors of are no longer mapped to the nearest neighbors of .
Causalized convergent cross mapping (cCCM). Note that in CCM, both the past and future values are used in data reconstruction, which is inconsistent with the original definition of causality where it is assumed that the future values of one process cannot impact the past of another. For this reason, we propose the concept of causalized convergent cross mapping (cCCM).
More specifically, in CCM, if we limit the search for the nearest neighbors to the current and past time indices, i.e., we only use the current and previous values of X and the past values of Y to predict the current value of Y, and operate in the same way for the other direction, then we obtain causalized CCM. That is, Step 2 in cCCM now becomes the following:
- Step 2 for cCCM: For each vector , find its nearest neighbors in with an index smaller than t and denote the time indices (from closest to farthest) of the nearest neighbors of by . Note that for , we now have .
Then, we follow Steps 3–5 above, and denote the corresponding causalized cross mapping correlation, or the cCCM causation, as .
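A minimal sketch of the corresponding change in the neighbor search, continuing the ccm_xmap sketch above: candidate neighbors of the point at time tt(k) are restricted to strictly earlier times, so the current value is estimated from past information only. The guard on the number of available past points is our own implementation choice.

```matlab
% Causalized neighbor search (cCCM): inside the loop of the ccm_xmap sketch,
% only points at earlier times are allowed as nearest neighbors.
d = sqrt(sum((MX - MX(k, :)).^2, 2));
d(k:end) = inf;                   % rows k, k+1, ... correspond to times >= tt(k)
if k > E + 1                      % need at least E+1 strictly earlier points
    [ds, idx] = sort(d);
    u  = exp(-ds(1:E+1) / max(ds(1), eps));
    w  = u / sum(u);
    yv = Y(tt(idx(1:E+1)));
    Yhat(k) = sum(w .* yv(:));
end
```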
Conditional equivalence between cCCM and directed information. As an information-theoretic causality model, directed information (DI) measures the information flow from one time series to the other. DI plays a central role in causality analysis for two reasons. First, it is a universal method that does not impose any modeling constraints on the sequences to be evaluated [,]. Second, DI serves as a pivot that links existing causality models, including GC [,], transfer entropy (TE) [,,], and dynamic causal modeling (DCM) [,], through the conditional equivalence between them.
In [], we proved the conditional equivalence between cCCM and DI under Gaussian variables and used DI as a bridge to connect cCCM to other representative tools of causality analysis. More specifically, we showed that if (i) X and Y are dynamically coupled, zero-mean Gaussian random variables whose joint distribution is bivariate Gaussian, and (ii) and are stationary ergodic random processes, then, when n is sufficiently large,
where denotes the average DI from X to Y, measured in bits per sample. The conditional equivalence of DI and cCCM under Gaussian random variables was demonstrated in [] using experimental fMRI data.
This result also connects cCCM to other representative causality analysis frameworks in the family—GC, TE (Transfer Entropy, 2000 []), and DCM (Dynamic causal modelling, 2003 [])—through the conditional equivalence between them under Gaussian random variables [,].
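As a reference point for the Gaussian setting above, recall the standard identity that links the Pearson correlation of two jointly Gaussian variables to their mutual information (a general textbook fact stated in our own notation, not the exact equivalence statement of the cited work):

$$
I(U;V) = -\tfrac{1}{2}\log_2\!\left(1-\rho_{UV}^{2}\right) \ \text{bits}.
$$

Identities of this form are what allow a correlation-type quantity, such as the cCCM cross-mapping correlation, to be translated into an information-theoretic quantity in the Gaussian case.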
It is worth pointing out that the simulation-based analysis in [] suggested that cCCM is often more robust in causality detection than DI. This is mainly because the DI calculation is based on probability estimation, which is sensitive to the step size used in the quantization process []. cCCM, on the other hand, gets around this obstacle through geometric cross mapping between the corresponding shadow manifolds, at the cost of a higher computational complexity. More specifically, cCCM relies on a K-nearest neighbor search and has a computational complexity of in the sequence length n, but the computational complexity of DI is only .
Extension of bivariate cCCM to multivariate cCCM. Bivariate cCCM can be extended to multivariate conditional cCCM [] based on a multivariate KNN search, which takes an approach similar to that of the multivariate KNN predictability approaches [,,,].
Let denote the set of dynamically coupled random variables that share the same attractor manifold. As shown in [], the multivariate conditional cCCM from with respect to (i.e., all the remaining random variables in ) is defined using the causality ratio as
where denotes the estimation error vector based on , and is the estimation error vector based on the whole . The definition can be adjusted by modifying to reflect the conditional cCCM with respect to either an individual random variable or a group of random variables.
3. Results
3.1. The Impact of Estimation Error in cCCM
Note that CCM and cCCM are based on data reconstruction, and the reconstructed data converge to the true data as the data length goes to infinity when there exists causal coupling between the random variables under consideration. Here, we consider the impact of estimation error in cCCM.
As an example, we consider . Note that
where , and
When there is estimation error, we can model the estimate as
where denotes the estimation error, which is independent of Y. Assuming that the estimation error is zero-mean with variance , it can be shown that (please refer to the Supplementary file of [])
where denotes the variance of Y. This result implies that the cCCM value decreases as the estimation error power increases.
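The direction of this effect can be verified by a short calculation under the stated assumptions (zero-mean estimation error W, independent of Y, with variance $\sigma_W^2$); the notation here is ours:

$$
\rho_{Y,\hat{Y}} = \frac{\operatorname{cov}(Y, Y+W)}{\sqrt{\operatorname{var}(Y)\operatorname{var}(Y+W)}}
= \frac{\sigma_Y^2}{\sigma_Y\sqrt{\sigma_Y^2+\sigma_W^2}}
= \frac{1}{\sqrt{1+\sigma_W^2/\sigma_Y^2}},
$$

which decreases monotonically as the error power $\sigma_W^2$ grows relative to the signal power $\sigma_Y^2$.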
In the following, using the noise-free case as the benchmark, we examine the noise effect on cCCM through simulation examples, including a Gaussian random variable and its signed and squared versions (Examples 1 and 2, respectively), as well as sinusoidal waveforms (Example 3). As shown in Table 1, when we increase the SNR from 0 dB to 20 dB, the cCCM value of the noisy signal gradually converges to the noise-free result. More specifically, our results suggest that reliable causality detection can be achieved when the SNR is larger than 15 dB.
Table 1.
Impact of estimation error ( and are AWGN noises generated independently of X and Y, respectively).
The performance of cCCM is not only affected by noise but is also closely related to the selection of E and . For Examples 1 and 2 in Table 1, we chose and . For Example 3, we used and . Here, a larger is used mainly because X and Y are significantly over-sampled in Example 3. More discussion on the choice of the shadow manifold parameters can be found in Section 3.2.
3.2. Causality Detection Using cCCM and the Choice of Shadow Manifold Parameters
In this section, we illustrate the performance of cCCM (together with CCM) in causality detection through simulation examples, including autoregressive models, stochastic processes with a dominant spectral component embedded in noise, deterministic chaotic maps, and systems with memory. As will be seen, these examples show that CCM and cCCM are sensitive to changes in coupling strength. It can be observed that CCM tends to produce larger causation values than cCCM; this is expected, since CCM uses both the past and future values of X to predict the current value of Y (and vice versa), while cCCM only uses the current and past values of X to predict the current value of Y (and vice versa).
We will also discuss the choice of the key parameters—the dimension of the shadow manifold E and the time lag —in the cCCM algorithm and the impact of these parameters on the detection of causal relationships. According to Takens’ theorem [] and Whitney’s embedding theorem [,], the “magic number” is , and often less [], where d is the dimension of the attractor shared by X and Y. The other parameter, the time lag , is generally chosen as . When the signal is over-sampled, a larger value can also be used.
It should be noted that for an accurate assessment of the causation, the sampling rate should always be chosen to be larger than the Nyquist rate. Otherwise, the causal relationship identified by cCCM may be invalid since the under-sampled sequences cannot capture the total information in the original signals.
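As a practical starting point, the MATLAB sketch below sweeps candidate values of E and the time lag and reports the cross-mapping correlation for each pair, assuming the ccm_xmap sketch from Section 2; the toy coupled pair and the candidate grids are illustrative assumptions, not one of the numbered examples.

```matlab
% Illustrative sweep over the shadow manifold dimension E and time lag tau,
% assuming the ccm_xmap sketch defined earlier.
n = 2000;
X = zeros(n, 1);  Y = zeros(n, 1);
for t = 2:n
    X(t) = 0.6*X(t-1) + randn;               % driving process (toy example)
    Y(t) = 0.5*Y(t-1) + 0.4*X(t-1) + randn;  % driven process (toy example)
end
for E = 2:4
    for tau = [1 2 4]
        rho = ccm_xmap(X, Y, E, tau);        % cross mapping from X to Y
        fprintf('E = %d, tau = %d, rho = %.3f\n', E, tau, rho);
    end
end
```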
3.2.1. Examples on Autoregressive Models
Example 4:
Let X and Y be random processes given by
where and are independent, and . We chose and . Then, the cCCM and CCM values between these two time series are
The convergence of CCM and cCCM with respect to the data length is shown in Figure 2.
Figure 2.
Performance of cCCM and CCM versus the data length for Examples 4 and 5.
Example 5:
Let X and Y be random processes given by
where and are independent, and . Then, the cCCM and CCM values between these two time series are
The convergence of CCM and cCCM with respect to the data length is shown in Figure 2.
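As an illustration of how convergence curves of this kind can be generated, the sketch below evaluates the cross-mapping correlation over increasing data lengths, assuming the ccm_xmap sketch above; the bidirectionally coupled AR pair is an illustrative stand-in, not the exact models of Examples 4 and 5.

```matlab
% Convergence check of the cross-mapping correlation versus data length.
n = 4000;
X = zeros(n, 1);  Y = zeros(n, 1);
for t = 2:n
    X(t) = 0.5*X(t-1) + 0.2*Y(t-1) + randn;  % illustrative coupled AR pair
    Y(t) = 0.7*Y(t-1) + 0.4*X(t-1) + randn;
end
lens = 500:500:n;
rho  = arrayfun(@(L) ccm_xmap(X(1:L), Y(1:L), 2, 1), lens);
plot(lens, rho, '-o');
xlabel('data length');  ylabel('cross-mapping correlation');
```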
3.2.2. Examples on Stochastic Processes with a Dominant Spectral Component
Example 6:
Let X and Y be two stochastic processes given by
where are independent AWGN noise with SNR dB, and (here, 0.005 is the step size). Then, the cCCM and CCM values between these two time series are
The convergence of CCM and cCCM with respect to the data length is shown in Figure 3.
Figure 3.
Performance of cCCM and CCM versus the data length for Examples 6 and 7.
Example 7:
Let X and Y be two stochastic processes given by
where and are independent AWGN noise with SNR dB, and . Then, the cCCM and CCM values between these two time series are
The convergence of CCM and cCCM with respect to the data length is shown in Figure 3.
We selected in Examples 6 and 7 to reduce the impact of noise; please refer to Section 3.2.5 for more details.
3.2.3. Examples on Deterministic Chaotic Maps
Example 8:
Let X and Y be two stochastic processes given by
where , and . Then, the cCCM and CCM values between these two time series are
The convergence of CCM and cCCM with respect to the data length is shown in Figure 4.
Figure 4.
Performance of cCCM and CCM versus the data length for Examples 8 and 9.
Example 9:
Let X and Y be two stochastic processes given by
where , and . Then, the cCCM and CCM values between these two time series are
The convergence of CCM and cCCM with respect to the data length is shown in Figure 4.
3.2.4. Examples on Systems with Memory
In this subsection, we examine the causal relationship in systems with memory (Examples 10–14) using CCM and cCCM under different choices of E and .
Example 10:
Consider a system with memory
Here, (i) the MATLAB command “randn(1024,1)” returns a 1024-by-1 matrix of normally distributed random numbers; (ii) and while . The results corresponding to different E or values are displayed in Table 2.
Table 2.
Results for Example 10.
We can see that the causation from cannot be fully captured when and
Example 11:
Table 3.
Results for Example 11.
We can see that the causation from cannot be fully captured when and
Example 12:
Consider a system whose dominant delays differ from those of Example 11:
where and while . Then, for different E or values, the results are displayed in Table 4.
Table 4.
The results of Example 12.
We can see that the causation from cannot be fully captured when and
Example 13:
Table 5.
The results of Example 13.
From this example, we can see the following: (i) when , we have and the causation corresponding to item cannot be captured; (ii) when and , we have and the causation corresponding to items and cannot be captured; and (iii) when and , we have , and the causation corresponding to all the items can be captured.
Now, if we consider the time-delayed causality, in which remains the same and , then this is equivalent to considering the causality from to . In this case, as shown in Table 6, even when and , we have and the causation corresponding to all the items can be captured.
Table 6.
The results of time-delayed causality analysis in Example 13.
Example 14:
Table 7.
The results of Example 14.
In this example, both and can only capture the causation corresponding to , and and can capture the overall causation accurately.
Now, if we consider the time-delayed causality, in which remains the same and , then this is equivalent to considering the causality from to . In this case, as shown in Table 8, and work even better than and since leads to a manifold with a lower dimension and, hence, a higher nearest neighbor density.
Table 8.
The results of time-delayed causality analysis in Example 14.
From Examples 10–14, it can be seen that in systems with memory, the selection of the shadow manifold dimension E and the time lag largely relies on the positions of the dominant delays in the channel impulse response.
It can be seen that in systems with memory, for the accurate evaluation of CCM and cCCM causality, the following conditions need to be satisfied:
- (a)
- , where denotes the largest dominant delay.
- (b)
- For each t, the vector used to construct the shadow manifold should contain all the samples corresponding to the dominant delays.
It is also observed that even if the conditions above are not satisfied, time-delayed cCCM from to might still capture the causation accurately, provided that the instantaneous information exchange between and is not significant. More specifically, if we consider a linear time-invariant (LTI) system , where denotes the channel impulse response, then when is negligibly small, we say that there is no significant instantaneous information exchange between and .
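As a concrete illustration of conditions (a) and (b), suppose, hypothetically, that the dominant delays of the channel impulse response are at lags 0 and 4 samples. The delay vector used to construct the shadow manifold covers the lags $\{0, \tau, 2\tau, \ldots, (E-1)\tau\}$, where $\tau$ denotes the time lag; so E = 3 with $\tau$ = 2 covers {0, 2, 4} and captures the delayed term, whereas E = 3 with $\tau$ = 3 covers {0, 3, 6} and misses it. In the latter case, E or $\tau$ has to be adjusted so that the lag of every dominant delay appears in the embedding.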
In the following two examples, we compare the performance between cCCM and Granger causality (GC) for systems with memory.
Example 15:
Consider a system with memory:
where , and while . Here, we assume that is independent of X. We then compare the performances of GC and cCCM under different noise powers, and the results are shown in Table 9.
Table 9.
Results for Example 15.
As can be seen, as long as the signal-to-noise ratio (SNR) is not too small, cCCM can capture the strong bidirectional causality between and , but GC cannot. This is mainly because cCCM takes the instantaneous information exchange between and into consideration, but GC does not. That is, when there exists instantaneous information exchange between and , GC may fail to capture the causal coupling between and .
It is also observed that the value decreases as the noise power increases, which is consistent with our analysis in Section 3.1. When and SNR dB, both cCCM and GC can no longer deliver valid results due to the strong noise effect.
Recall that the most commonly used method in Granger causality analysis [,,] is to compare the following two prediction errors and :
The Granger causality is then defined as the log-likelihood ratio
where , , and stands for the determinant of the covariance matrix.
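In the standard Geweke-style formulation, stated here in our own notation for completeness, the restricted model predicts the present of Y from its own past only, the full model also includes the past of X, and

$$
\mathrm{GC}_{X \to Y} = \ln\frac{\det\!\big(\operatorname{cov}(\epsilon_{Y \mid Y^-})\big)}{\det\!\big(\operatorname{cov}(\epsilon_{Y \mid Y^-, X^-})\big)},
$$

where $\epsilon_{Y \mid Y^-}$ and $\epsilon_{Y \mid Y^-, X^-}$ denote the residuals of the restricted and full regressions, respectively. A larger ratio indicates that adding the past of X reduces the prediction error of Y.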
Our results in Table 9 and the definition of GC suggest that the small fluctuations in the GC values as the noise variance increases from to 4 are more likely to reflect the impact of the noise rather than the detection of the causality.
Example 16:
Consider
where while , and is an independently generated Gaussian noise. Then, the Granger causality between X and Y is
and the causality detected by cCCM is
In this example, there is no instantaneous information exchange between and , and both GC and cCCM detect the strong unidirectional causality from X to Y and deliver consistent results.
3.2.5. Additional Examples on the Selection of the Dimension of the Shadow Manifold E and Time Lag
In this subsection, we illustrate the impact of E and on the performance of cCCM through some additional simulation examples, including a single-tone time series embedded in noise (Example 17) and a Gaussian stochastic process (Example 18). As in Example 3, it was found that a large value may help enhance the performance of cCCM under noise. However, it is also noticed that if E is too large, cCCM may no longer deliver valid results, as the excessively high dimension of the shadow manifold significantly reduces the density of the nearest neighbors, leading to inaccurate state-space reconstruction and causality evaluation.
Example 17:
This is a revisit of Example 3, with additional discussion on the selection of E and and on different sampling instants. Consider the following noisy single-tone time series:
where , and and are independent AWGN noises with SNRs of 0, 5, 10, 15, and 20 dB, or equal to 0 for all t in the noise-free case. By changing the values of E and , we are able to observe different noise effects. The simulation results for and are shown in Table 10 below.
Table 10.
Performance of cCCM under additive white Gaussian noise with different E and values.
As can be seen, as we increase the length of the data span , the noise effect is reduced. In particular, compared with and , when we choose and , much better noise immunity is achieved, since is sufficiently long.
Note that increasing leads to the downsampling of the time series, and increasing E expands the dimension of the shadow manifolds. An over-increase in E or might downgrade the performance of cCCM. From Example 18, it can be seen that if E is much larger than , cCCM may deliver inaccurate results.
Example 18:
Consider
In this example, there is a strong unidirectional causality from X to Y, but very weak causation in the inverse direction. Choose . From Figure 5, we can see that as E increases, the cCCM value keeps on decreasing and reduces to when , which no longer reflects the strong unidirectional causality from X to Y.
Figure 5.
cCCM results for Example 18 (, ): an excessively large E may downgrade the performance of cCCM; here, .
3.2.6. Examples of the Impact of Sampling Frequency on cCCM
In this subsection, we show that for the accurate assessment of causation, signals under consideration should be sampled with a sampling frequency higher than the Nyquist rate.
Example 19:
Consider
where and . From Figure 6, it can be seen that if the sampling frequency is higher than the Nyquist rate, then strong bidirectional causal coupling can be detected between X and Y. On the other hand, if the sampling frequency is lower than the Nyquist rate, then the resulting cCCM value is no longer valid.
Figure 6.
Impact of sampling frequency on cCCM convergence speed: an illustration using sinusoidal waveforms with different frequencies. (a) = 75 Hz; (b) = 100 Hz; (c) = 200 Hz. Here, denotes the frequency of the corresponding sinusoidal waveform. The sampling time sequence was chosen as ; that is, the sampling frequency was Hz. As can be seen, cCCM works well when the sampling rate is above the Nyquist rate, as shown in (a), but may or may not deliver meaningful results when the sampling frequency is below or equal to the Nyquist rate, as shown in (b,c).
3.2.7. Examples on Data Repetition in Causality Analysis
The following example illustrates that even if X and Y are two independent signals that are not causally coupled, a causal pattern can be introduced into the concatenated time series through data repetition.
Example 20:
Let X = randn(1000, 1) and Y = randn(1000, 1) be two independent, normally distributed time series. We have
That is, X and Y are not causally coupled. Consider
Then, we have
As can be seen, data concatenation results in strong causality that does not exist in the original X and Y.
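A minimal sketch of this effect, assuming the ccm_xmap sketch above and assuming that the repetition takes the form of concatenating each series with a copy of itself (the exact construction of Example 20 is not reproduced here):

```matlab
% Data repetition can induce spurious cross-mapping skill: every point in the
% second copy has an exact duplicate in the past, so its value is reconstructed
% almost perfectly from the shadow manifold.
X  = randn(1000, 1);   Y  = randn(1000, 1);   % independent, no causal coupling
Xc = [X; X];           Yc = [Y; Y];           % concatenate each series with itself
rho_orig   = ccm_xmap(X,  Y,  3, 1)           % expected to be close to zero
rho_concat = ccm_xmap(Xc, Yc, 3, 1)           % expected to be spuriously large
```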
3.2.8. An Example of Multivariate Conditional cCCM
Example 21:
Let randn(1024,1), randn(1024,1), and randn(1024,1) be three independent, normally distributed random variables. Consider
Then, the bivariate cCCM between X and Y is
which gives the false impression that there exists strong bidirectional causality between X and Y. On the other hand, the multivariate cCCM between X and Y conditioned on Z is
which accurately reflects the independence between X and Y. From this example, it can be seen that conditional cCCM can help inspect the dependence among the random variables under consideration and may deliver more accurate results in the causality evaluation.
3.3. Application of cCCM for Brain Causality Analysis Using Experimental fMRI Data
In this study, we applied both bivariate and multivariate cCCM to the causality analysis of the brain network using experimental fMRI data and compared the results with those of GC [,].
We considered an fMRI dataset in which fourteen right-handed healthy college students (7 males and 7 females, 23.4 ± 4.2 years of age) from Michigan State University volunteered to participate in a task-driven fMRI-based study. For each subject, fMRI datasets were collected under a visual stimulation condition with a scene–object fMRI paradigm, where each volume of images was acquired 192 times (8 min) while the subject was presented with 12 blocks of visual stimulation after an initial 10 s resting period. In a predefined randomized order, the scenery pictures were presented in six blocks, and the object pictures were presented in another six blocks. All pictures were unique. In each block, ten pictures were presented continuously for 25 s (2.5 s for each picture), followed by a 15 s baseline condition (a white screen with a black fixation cross at the center). The subject needed to press their right index finger once when the screen switched from the baseline to the picture condition. More details on fMRI data acquisition and preprocessing can be found in [].
Region of interest (ROI) selection: we selected 10 ROIs, including the left primary visual cortex (LV1), left parahippocampal place area (LPPA), left sensory motor cortex (LSMC), left parahippocampal white matter (LPWM), left retrosplenial cortex (LRSC), right primary visual cortex (RV1), right parahippocampal place area (RPPA), right sensory motor cortex (RSMC), right frontal white matter (RFWM), and right retrosplenial cortex (RRSC).
3.3.1. Results for Bivariate and Multivariate cCCM
Note that the total length of the fMRI BOLD time series under the visual stimulation condition was 192 samples, with a sampling period of 2.5 s. In the literature, it has been reported that increasing the sampling rate of the fMRI signal can improve the robustness of the causality analysis []. Here, we first interpolated the fMRI sequence by a factor of 2 using the spline interpolation command in MATLAB and then conducted causality analysis for all the possible unidirectional regional pairs using both bivariate and multivariate cCCM.
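A minimal sketch of the interpolation step for a single ROI time series is given below; the variable names and the use of interp1 with the 'spline' option are our own illustrative choices (MATLAB's spline function could be used equivalently).

```matlab
% Upsample one ROI's BOLD time series by a factor of 2 using spline
% interpolation. 'bold' is a placeholder for a real 192-sample ROI time series.
TR      = 2.5;                          % sampling period in seconds
t       = (0:191)' * TR;                % original acquisition times
t_up    = (0:0.5:191)' * TR;            % 2x upsampled time grid
bold    = randn(192, 1);                % placeholder data
bold_up = interp1(t, bold, t_up, 'spline');
```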
The causality analysis results based on bivariate cCCM (averaged over all 14 subjects) are shown in Figure 7. Our results suggest the presence of unidirectional causality from LV1 → LSMC, RV1 → LSMC, LV1 → LPWM, LV1 → RFWM, and LPPA → LPWM.
Figure 7.
FMRI-based causality analysis using bivariate cCCM. Unidirectional causality was detected in the brain network under a visual stimulation condition with a scene–object fMRI paradigm.
The results corresponding to multivariate conditional cCCM with respect to individual brain regions are shown in Figure 8. As can be seen, RV1 has the most significant impact on the conditional causality from LV1 → LSMC, , and . This implies that RV1 has the highest inter-region dependence with LV1. For the same reason, LV1 has the most significant impact on the conditional causality from RV1 → LSMC. That is, multivariate conditional cCCM with respect to individual regions can detect unidirectional causality and also reflect the impact of interdependence between the ROIs on the conditional causality.
Figure 8.
FMRI-based causality analysis using multivariate conditional cCCM with respect to individual regions. (a) , (b) , (c) , (d) . The results indicate that multivariate cCCM (with respect to individual regions) can detect unidirectional causality and also reflect the impact of interdependence between the ROIs on the conditional causality. More specifically, it can be seen that due to the dependence between the brain regions, the multivariate conditional cCCM values are much smaller than the bivariate cCCM values. In particular, RV1 has the most significant impact on the conditional causality from LV1 → LSMC, , and . This implies that RV1 has the highest dependence with LV1. For the same reason, LV1 has the most significant impact on the conditional causality from RV1 → LSMC.
3.3.2. Results for Bivariate and Multivariate Granger Causality (GC)
For comparison purposes, we analyzed the brain network causality using both bivariate and multivariate GC [] with the same fMRI dataset.
From Figure 7 and Figure 9, it can be seen that bivariate GC delivers results similar to those of cCCM, except for the causal coupling from . More specifically, cCCM shows that there exists unidirectional causality from , while GC shows that there exists bidirectional causality between and but no significant unidirectional causality. In ref. [], DI-based causality analysis also verified the presence of unidirectional causality from for the same dataset, which is consistent with the results of cCCM. These results suggest that for this fMRI dataset, cCCM tends to deliver a more accurate causality evaluation than GC.
Figure 9.
FMRI-based causality analysis using bivariate GC. The results of GC are largely consistent with those of bivariate cCCM except from . This may be because (i) cCCM takes the instantaneous information exchange between and into consideration, but GC does not; and (ii) cCCM can capture both linear and nonlinear causal coupling, while GC may have difficulty in detecting nonlinear causality. That is, when there exists instantaneous information exchange and/or a nonlinear causal relationship between and , GC may fail to capture the underlying causal coupling accurately.
A natural question arises: how should we explain the difference between cCCM and GC in the causality analysis here? After all, based on the central limit theorem, fMRI signals can be modeled as Gaussian random variables, for which cCCM and GC are conditionally equivalent. The underlying argument is that the equivalence between cCCM and GC under Gaussian random variables is subject to two conditions: (i) both and follow the linear auto-regression model; and (ii) there is no significant instantaneous information exchange between and . More specifically, cCCM takes the instantaneous information exchange between and into consideration, but GC does not. For this reason, when there exists instantaneous information exchange between and , GC may fail to capture the causal coupling between and , but cCCM succeeds. This is demonstrated through simulations in Example 15. In addition, cCCM can capture both linear and nonlinear causal coupling, but GC may have difficulty in detecting nonlinear causality. For these reasons, cCCM might be a more robust causality analysis tool than GC.
In the multivariate case, the theoretical relationship between cCCM and GC is not yet clear. Comparing Figure 8 and Figure 10, it can be seen that the results of multivariate cCCM and GC are largely consistent for and . However, they deliver very different results for the conditional causality from and with respect to other individual regions. In particular, for these two region pairs, the results of multivariate cCCM with respect to other individual regions are consistent with their bivariate counterparts and also reflect the impact of inter-region dependence on the conditional causality. However, the corresponding results of multivariate GC with respect to other individual regions vary significantly with the region under consideration, and 50% or more of them are no longer consistent with those of the bivariate GC.
Figure 10.
FMRI-based causality analysis using multivariate conditional GC with respect to individual regions. (a) , (b) , (c) , (d) . It can be seen that the results of multivariate cCCM and GC are largely consistent for (a,d). However, for (b,c), multivariate cCCM and GC deliver very different results. In particular, for the conditional causality from and , the results of multivariate cCCM with respect to other individual regions are consistent with their bivariate counterparts and also reflect the impact of inter-region dependence on the conditional causality. However, the corresponding results of multivariate GC with respect to other individual regions vary significantly with the region under consideration, and 50% or more are no longer consistent with those of the bivariate GC.
Further work is needed on the theoretical relationship between conditional GC and multivariate cCCM, as well as on the relationship between DI and the recent minimum entropy framework [] in both bivariate and multivariate scenarios.
4. Conclusions and Discussion
In this paper, we revisited the definition of the original CCM, identified the gap between CCM and the traditional definition of causality, presented causalized CCM (cCCM), and discussed the conditional equivalence of cCCM and directed information as well as the extension of bivariate cCCM to multivariate cCCM. We then evaluated the effectiveness of cCCM in the detection of causality through a large number of examples, including Gaussian random variables with additive noise, sinusoidal waveforms, autoregressive models, stochastic processes with a dominant spectral component embedded in noise, deterministic chaotic maps, and systems with memory, as well as experimental fMRI data. We also examined the impact of noise on the performance of cCCM, and our results suggest that, in general, reliable results can be achieved when the SNR is larger than 15 dB. In particular, we provided detailed discussions on the choice of the shadow manifold dimension E and the time lag and the impact of these parameters on the detection of causal relationships using cCCM. Finally, we applied both bivariate and multivariate cCCM to the causality analysis of the brain network using experimental fMRI data and compared the results with those of GC.
Based on the conditional equivalence of cCCM and DI [], we can see that cCCM provides an alternative way to evaluate the directed information transfer between stationary ergodic Gaussian random processes. Compared with DI, which relies heavily on probability estimation and tends to be sensitive to data length and quantization step size, cCCM gets around this problem through geometric cross mapping between the manifolds involved.
However, the advantage of cross-mapping-based causality detection techniques comes at a price. The major limitation of CCM and cCCM is that they are based on the K-nearest neighbor (KNN) search algorithm and hence have a high computational complexity of in the data length n. The convergence speeds of CCM and cCCM also vary with the signals under consideration and need to be taken into account in causality analysis, especially in dynamic systems where the causal relationships are time-variant. It is worth pointing out that when combined with the sliding window approach [,], cCCM can be used to evaluate time-varying causality in dynamic networks such as brain networks [].
Overall, both our theoretical [] and numerical analyses demonstrate that cCCM is a promising and easy-to-implement tool for causality detection in a wide spectrum of applications. In this paper, we showed that appropriate choices of E, , and the sampling frequency are critical for cCCM-based causality analysis, and we provided detailed guidelines on the configuration of cCCM. We hope that this paper can serve as a helpful reference for the implementation of cCCM for causality detection in different applications.
Author Contributions
Conceptualization, T.L. and R.Z.; methodology, T.L. and J.R.; software, B.S. and J.D.; validation, B.S. and J.D.; formal analysis, T.L. and J.R.; investigation, B.S., J.D. and T.L.; resources, B.S., J.D., N.S., D.C.Z. and R.Z.; data curation, B.S., J.D., N.S., D.C.Z., R.Z. and T.L.; writing—original draft preparation, B.S., T.L. and J.D.; writing—review and editing, B.S., J.D., N.S., D.C.Z., J.R., R.Z. and T.L.; visualization, T.L., B.S. and J.D.; supervision, T.L.; project administration, T.L.; funding acquisition, T.L., J.R. and R.Z. All authors have read and agreed to the published version of the manuscript.
Funding
This study was supported in part by the National Science Foundation (NSF) under awards 2032709 and 1919154, and the National Institutes of Health (NIH) under awards R01AG49749, P30AG024824, and P30AG072931.
Institutional Review Board Statement
This study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board (or Ethics Committee) of Michigan State University (STUDY 00004848, Date approved: 7 July 2022; STUDY LEGACY06-537, Date Approved: 12 May 2009) and the University of Texas Southwestern Medical Center (STUDY 052016-076, Date Approved: 30 June 2017).
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
The fMRI datasets presented in this study are available to qualified investigators according to the NIH data sharing policy upon reasonable request. All the other data supporting the findings of this study are available within the article. The relevant MATLAB code can be found at https://github.com/BAWC-Evan-Sun/CCM-Implement.git (accessed on 20 June 2024).
Acknowledgments
We would like to thank the High-Performance Computing Cluster (HPCC) of Michigan State University for providing cyberinfrastructure to support our computational data analysis.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| CCM | Convergent cross mapping; |
| cCCM | Causalized convergent cross mapping; |
| GC | Granger causality; |
| DI | Directed information; |
| fMRI | Functional magnetic resonance imaging; |
| SNR | Signal-to-noise ratio; |
| DMN | Default mode network; |
| BOLD | Blood-oxygen-level-dependent; |
| KNN | K-nearest neighbor; |
| TE | Transfer entropy; |
| DCM | Dynamic causal modeling; |
| AWGN | Additive white Gaussian noise; |
| MSE | Mean square error; |
| CR | Causality ratio; |
| ROI | Region of interest; |
| LV1 | Left primary visual cortex; |
| LPPA | Left parahippocampal place area; |
| LSMC | Left sensory motor cortex; |
| LPWM | Left parahippocampal white matter; |
| LRSC | Left retrosplenial cortex; |
| RV1 | Right primary visual cortex; |
| RPPA | Right parahippocampal place area; |
| RSMC | Right sensory motor cortex; |
| RFWM | Right frontal white matter; |
| RRSC | Right retrosplenial cortex. |
References
- Hua, J.C.; Jin Kim, E.; He, F. Information Geometry Theoretic Measures for Characterizing Neural Information Processing from Simulated EEG Signals. Entropy 2024, 26, 213.
- Ma, Y.; Qian, J.; Gu, Q.; Yi, W.; Yan, W.; Yuan, J.; Wang, J. Network Analysis of Depression Using Magnetoencephalogram Based on Polynomial Kernel Granger Causality. Entropy 2023, 25, 1330.
- Wang, Q.; Yao, W.; Bai, D.; Yi, W.; Yan, W.; Wang, J. Schizophrenia MEG Network Analysis Based on Kernel Granger Causality. Entropy 2023, 25, 1006.
- Stokes, P.A.; Purdon, P.L. A study of problems encountered in Granger causality analysis from a neuroscience perspective. Proc. Natl. Acad. Sci. USA 2017, 114, E7063–E7072.
- Zhang, X.; Mormino, E.C.; Sun, N.; Sperling, R.A.; Sabuncu, M.R.; Yeo, B.T.T.; Weiner, M.W.; Aisen, P.; Weiner, M.; Aisen, P.; et al. Bayesian model reveals latent atrophy factors with dissociable cognitive trajectories in Alzheimer’s disease. Proc. Natl. Acad. Sci. USA 2016, 113, E6535–E6544.
- Hillebrandt, H.; Friston, K.J.; Blakemore, S.J. Effective connectivity during animacy perception—Dynamic causal modelling of Human Connectome Project data. Sci. Rep. 2014, 4, 6240.
- Marinescu, I.E.; Lawlor, P.N.; Kording, K.P. Quasi-experimental causality in neuroscience and behavioural research. Nat. Hum. Behav. 2018, 2, 891–898.
- Deshpande, G.; Santhanam, P.; Hu, X. Instantaneous and causal connectivity in resting state brain networks derived from functional MRI data. NeuroImage 2011, 54, 1043–1052.
- Deng, J.; Sun, B.; Scheel, N.; Renli, A.B.; Zhu, D.C.; Zhu, D.; Ren, J.; Li, T.; Zhang, R. Causalized convergent cross-mapping and its approximate equivalence with directed information in causality analysis. PNAS Nexus 2023, 3, 422.
- Granger, C.W.J. Investigating Causal Relations by Econometric Models and Cross-spectral Methods. Econometrica 1969, 37, 424.
- Granger, C.W.J.; Newbold, P. Forecasting Economic Time Series; Elsevier: Amsterdam, The Netherlands, 1977; p. 225.
- Barnett, L.; Seth, A.K. The MVGC multivariate Granger causality toolbox: A new approach to Granger-causal inference. J. Neurosci. Methods 2014, 223, 50–68.
- Mannino, M.; Bressler, S.L. Foundational perspectives on causality in large-scale brain networks. Phys. Life Rev. 2015, 15, 107–123.
- Seth, A.K.; Chorley, P.; Barnett, L.C. Granger causality analysis of fMRI BOLD signals is invariant to hemodynamic convolution but not downsampling. NeuroImage 2013, 65, 540–555.
- David, O.; Guillemain, I.; Saillet, S.; Reyt, S.; Deransart, C.; Segebarth, C.; Depaulis, A. Identifying neural drivers with functional MRI: An electrophysiological validation. PLoS Biol. 2008, 6, 2683–2697.
- Sugihara, G.; May, R.; Ye, H.; hao Hsieh, C.; Deyle, E.; Fogarty, M.; Munch, S. Detecting Causality in Complex Ecosystems. Science 2012, 338, 496–500.
- Massey, J. Causality, feedback, and directed information. In Proceedings of the International Symposium on Information Theory and Its Applications, Waikiki, HI, USA, 11 November 1990; pp. 303–305.
- Wang, Z.; Alahmadi, A.; Zhu, D.C.; Li, T. Causality Analysis of fMRI Data Based on the Directed Information Theory Framework. IEEE Trans. Biomed. Eng. 2016, 63, 1002–1015.
- Amblard, P.O.; Michel, O.J.J. On directed information theory and Granger causality graphs. J. Comput. Neurosci. 2011, 30, 7–16.
- Tsonis, A.A.; Deyle, E.R.; May, R.M.; Sugihara, G.; Swanson, K.; Verbeten, J.D.; Wang, G. Dynamical evidence for causality between galactic cosmic rays and interannual variation in global temperature. Proc. Natl. Acad. Sci. USA 2015, 112, 3253–3256.
- Deyle, E.R.; Maher, M.C.; Hernandez, R.D.; Basu, S.; Sugihara, G. Global environmental drivers of influenza. Proc. Natl. Acad. Sci. USA 2016, 113, 13081–13086.
- Liu, O.R.; Gaines, S.D. Environmental context dependency in species interactions. Proc. Natl. Acad. Sci. USA 2022, 119.
- Chang, C.W.; Miki, T.; Ye, H.; Souissi, S.; Adrian, R.; Anneville, O.; Agasild, H.; Ban, S.; Be’eri-Shlevin, Y.; Chiang, Y.R.; et al. Causal networks of phytoplankton diversity and biomass are modulated by environmental context. Nat. Commun. 2022, 13, 1140.
- Chen, D.; Sun, X.; Cheke, R.A. Inferring a Causal Relationship between Environmental Factors and Respiratory Infections Using Convergent Cross-Mapping. Entropy 2023, 25, 807.
- Wang, J.Y.; Kuo, T.C.; Hsieh, C.H. Causal effects of population dynamics and environmental changes on spatial variability of marine fishes. Nat. Commun. 2020, 11, 2635.
- McCracken, J.M.; Weigel, R.S. Convergent cross-mapping and pairwise asymmetric inference. Phys. Rev. E 2014, 90, 062903.
- Breston, L.; Leonardis, E.J.; Quinn, L.K.; Tolston, M.; Wiles, J.; Chiba, A.A. Convergent cross sorting for estimating dynamic coupling. Sci. Rep. 2021, 11, 20374.
- Wismüller, A.; Abidin, A.Z.; D’Souza, A.M.; Wang, X.; Hobbs, S.K.; Leistritz, L.; Nagarajan, M.B. Nonlinear functional connectivity network recovery in the human brain with mutual connectivity analysis (MCA): Convergent cross-mapping and non-metric clustering. Proc. SPIE Int. Soc. Opt. Eng. 2015, 3, 94170M.
- Permuter, H.H.; Kim, Y.H.; Weissman, T. Interpretations of Directed Information in Portfolio Theory, Data Compression, and Hypothesis Testing. IEEE Trans. Inf. Theory 2011, 57, 3248–3259.
- Soltani, N.; Goldsmith, A. Inferring neural connectivity via measured delay in directed information estimates. In Proceedings of the 2013 IEEE International Symposium on Information Theory, Istanbul, Turkey, 7–12 July 2013; pp. 2503–2507.
- Schreiber, T. Measuring information transfer. Phys. Rev. Lett. 2000, 85, 461–464.
- Barnett, L.; Barrett, A.B.; Seth, A.K. Granger Causality and Transfer Entropy Are Equivalent for Gaussian Variables. Phys. Rev. Lett. 2009, 103, 238701.
- Friston, K.; Harrison, L.; Penny, W. Dynamic causal modelling. NeuroImage 2003, 19, 1273–1302.
- Wang, Z.; Liang, Y.; Zhu, D.C.; Li, T. The Relationship of Discrete DCM and Directed Information in fMRI-Based Causality Analysis. IEEE Trans. Mol. Biol. Multi-Scale Commun. 2018, 4, 3–13.
- Ridderinkhof, K.R.; Ullsperger, M.; Crone, E.A.; Nieuwenhuis, S. The Role of the Medial Frontal Cortex in Cognitive Control. Science 2004, 306, 443–447.
- Porta, A.; Faes, L.; Bari, V.; Marchi, A.; Bassani, T.; Nollo, G.; Perseguini, N.M.; Milan, J.; Minatel, V.; Borghi-Silva, A.; et al. Effect of Age on Complexity and Causality of the Cardiovascular Control: Comparison between Model-Based and Model-Free Approaches. PLoS ONE 2014, 9, e89463.
- Porta, A.; Faes, L. Wiener–Granger Causality in Network Physiology With Applications to Cardiovascular Control and Neuroscience. Proc. IEEE 2016, 104, 282–309.
- Porta, A.; Bari, V.; Gelpi, F.; Cairo, B.; Maria, B.D.; Tonon, D.; Rossato, G.; Faes, L. On the Different Abilities of Cross-Sample Entropy and K-Nearest-Neighbor Cross-Unpredictability in Assessing Dynamic Cardiorespiratory and Cerebrovascular Interactions. Entropy 2023, 25, 599.
- Abarbanel, H.D.I.; Carroll, T.A.; Pecora, L.M.; Sidorowich, J.J.; Tsimring, L.S. Predicting physical variables in time-delay embedding. Phys. Rev. E 1994, 49, 1840–1853.
- Takens, F. Detecting strange attractors in turbulence. In Lecture Notes in Mathematics; Springer: Berlin/Heidelberg, Germany, 1981; pp. 366–381.
- Packard, N.H.; Crutchfield, J.P.; Farmer, J.D.; Shaw, R.S. Geometry from a Time Series. Phys. Rev. Lett. 1980, 45, 712–716.
- Whitney, H.; Eells, J.; Toledo, D. Collected Papers of Hassler Whitney (Contemporary Mathematicians); Birkhäuser: Basel, Switzerland, 1992.
- Geweke, J.F. Measures of Conditional Linear Dependence and Feedback Between Time Series. J. Am. Stat. Assoc. 1984, 79, 907.
- Zhu, D.C.; Majumdar, S. Integration of resting-state FMRI and diffusion-weighted MRI connectivity analyses of the human brain: Limitations and improvement. J. Neuroimag. Off. J. Am. Soc. Neuroimag. 2014, 24, 176–186.
- Lin, F.H.; Ahveninen, J.; Raij, T.; Witzel, T.; Chu, Y.H.; Jääskeläinen, I.P.; Tsai, K.W.K.; Kuo, W.J.; Belliveau, J.W. Increasing fMRI Sampling Rate Improves Granger Causality Estimates. PLoS ONE 2014, 9, e100319.
- Ning, L. An information-theoretic framework for conditional causality analysis of brain networks. Netw. Neurosci. 2024, 3, 1–38.
- Allen, E.A.; Damaraju, E.; Plis, S.M.; Erhardt, E.B.; Eichele, T.; Calhoun, V.D. Tracking Whole-Brain Connectivity Dynamics in the Resting State. Cereb. Cortex 2014, 24, 663–676.
- Schumacher, J.; Peraza, L.R.; Firbank, M.; Thomas, A.J.; Kaiser, M.; Gallagher, P.; O’Brien, J.T.; Blamire, A.M.; Taylor, J.P. Dynamic functional connectivity changes in dementia with Lewy bodies and Alzheimer’s disease. Neuroimag. Clin. 2019, 22, 101812.
- Deng, J.; Sun, B.; Kavcic, V.; Liu, M.; Giordani, B.; Li, T. Novel methodology for detection and prediction of mild cognitive impairment using resting-state EEG. Alzheimer’s Dement. 2024, 20, 411.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).