Spectral Ranking of Causal Influence in Complex Systems

Zalmijn, Errol; Heskes, Tom; Claassen, Tom

doi:10.3390/e23030369


by Errol Zalmijn ^1,2,*, Tom Heskes ^1 and Tom Claassen ^1

^1 Institute for Computing and Information Sciences, Radboud University, 6525 EC Nijmegen, The Netherlands

^2 ASML Research Department, 5504 DT Veldhoven, The Netherlands

^* Author to whom correspondence should be addressed.

Entropy 2021, 23(3), 369; https://doi.org/10.3390/e23030369

Submission received: 21 January 2021 / Revised: 9 March 2021 / Accepted: 16 March 2021 / Published: 20 March 2021


Abstract

Similar to natural complex systems, such as the Earth’s climate or a living cell, semiconductor lithography systems are characterized by nonlinear dynamics across more than a dozen orders of magnitude in space and time. Thousands of sensors measure relevant process variables at appropriate sampling rates, to provide time series as primary sources for system diagnostics. However, high-dimensionality, non-linearity and non-stationarity of the data are major challenges to efficiently, yet accurately, diagnose rare or new system issues by merely using model-based approaches. To reliably narrow down the causal search space, we validate a ranking algorithm that applies transfer entropy for bivariate interaction analysis of a system’s multivariate time series to obtain a weighted directed graph, and graph eigenvector centrality to identify the system’s most important sources of original information or causal influence. The results suggest that this approach robustly identifies the true drivers or causes of a complex system’s deviant behavior, even when its reconstructed information transfer network includes redundant edges.

Keywords: complex systems; time series; transfer entropy; eigenvector centrality; original information; node importance; coupled Lorenz systems

1. Introduction

Semiconductor lithography systems are extremely complicated electromechanical systems, capable of sub-nanometer positioning and sub-milliKelvin temperature control, while generating extreme ultraviolet light from laser-pulsed tin plasma. It is notoriously difficult to fully understand the complex interactions among thousands of observed variables affecting the output of such systems. Model-based approaches alone are inadequate to effectively diagnose rare or new system issues, as they inherently do not model abnormal behavior. To efficiently, yet reliably, reduce the search space of potential causes, we consider both local and global causal influences of a complex system’s components, as measured by Schreiber’s transfer entropy [1] and Bonacich’s eigenvector centrality [2], respectively. Transfer entropy quantifies predictive information transfer as a potential signature of causality between two stationary time series, providing direction, strength and delay of linear or non-linear interactions. For time series with a Gaussian distribution, transfer entropy reduces to Granger causality [3]. However, being a bivariate measure, transfer entropy naturally disregards the multivariate nature of interactions in complex systems. Therefore, multivariate approaches use conditional transfer entropy [4] to separate true cause–effect relations from mere correlations, i.e., direct from so-called transitive indirect or semi-metric, and thus redundant relations, by iteratively conditioning out (subsets of) all other time series. Unfortunately, such an information decomposition is infeasible in (real time) diagnosis or prognosis of high-dimensional technological complex systems, due to its exponential scaling of computational costs with time-series dimension. 
Instead, we use a ranking algorithm that relies on standard transfer entropy in an exhaustive, but computationally feasible, bivariate interaction analysis of a complex system’s multivariate time series, resulting in an information transfer network that is likely to contain redundant edges. However, Benzi et al. [5] show that eigenvector centrality represents the upper limit of Katz centrality [6], where the global influence of nodes is essentially determined by the longest possible paths in the network, which implies its insensitivity to shortest paths, including transitive indirect or semi-metric, and thus redundant, edges. In addition, empirical studies by Kalavri et al. [7] revealed that PageRank (or eigenvector centrality) yields similar rankings when computed with, or without, a graph’s (first-order) semi-metric edges. For the diagnostic purposes envisioned, it therefore suffices to apply standard transfer entropy for approximate network inference of a complex system’s true causal structure and then use graph eigenvector centrality for computationally efficient, yet consistently accurate, ranking of probable causes or key drivers of the system under disturbance.
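The limiting relation between Katz and eigenvector centrality can be made explicit with a short derivation. The sketch below uses the standard series definition of Katz centrality; the attenuation parameter $\alpha$, the vector $k(\alpha)$ and the left Perron vector $u_{1}$ are symbols introduced here for illustration only, and W is assumed to be non-negative and irreducible.

```latex
% Katz centrality: walks of length m leaving node i are weighted by \alpha^m.
k(\alpha) \;=\; \sum_{m=0}^{\infty} \alpha^{m} W^{m} \mathbf{1}
         \;=\; (I - \alpha W)^{-1} \mathbf{1},
\qquad 0 < \alpha < \frac{1}{\rho(W)} .

% Perron pair of W: W c_1 = \lambda_1 c_1, \; u_1^{T} W = \lambda_1 u_1^{T}, \; u_1^{T} c_1 = 1,
% with \lambda_1 = \rho(W) a simple eigenvalue. The resolvent then has a simple pole at
% \alpha = 1/\lambda_1:
(I - \alpha W)^{-1} \;=\; \frac{c_1 u_1^{T}}{1 - \alpha \lambda_1} \;+\; R(\alpha),
\qquad R(\alpha) \text{ bounded as } \alpha \to 1/\lambda_1 .

% Rescaling removes the pole and leaves only the Perron direction:
\lim_{\alpha \to 1/\lambda_1^{-}} (1 - \alpha \lambda_1)\, k(\alpha)
  \;=\; (u_1^{T} \mathbf{1})\, c_1 .
```

Since $u_{1}^{T} \mathbf{1} > 0$ is a constant, the Katz ranking coincides with the eigenvector centrality ranking in this limit, where arbitrarily long walks dominate the sum and individual short (possibly redundant) edges carry little weight.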

This study relates most closely to the work of Streicher et al. [8], who implemented this approach into a two-step ranking algorithm for (chemical) plant-wide fault detection and diagnosis.

In this paper, we introduce the measures of transfer entropy and eigenvector centrality and describe the ranking algorithm as used. In rolling window analysis of a simulated time series representing two bidirectionally coupled Lorenz systems, we compare its causal network inference results to those of a constraint-based algorithm for multivariate causal analysis. The rolling analysis approach also allows the ranking algorithm to estimate the global influence of each state variable as exerted on the coupled Lorenz systems over time. Finally, we assess the ranking algorithm in diagnosing a real-world industrial system issue, using a higher-dimensional time series.

2. Applied Algorithms and Methods

2.1. Transfer Entropy

Transfer entropy ($TE$) is an information-theoretic implementation of Wiener’s notion of causality applied to time series [9], whereby the cause precedes, and contains unique information about, the effect. Consider two stationary ergodic Markov processes X and Y (or $X^{(i)}$ and $X^{(j)}$ as below) and their corresponding time series $\{x_{1}, x_{2}, \ldots, x_{M}\}$ and $\{y_{1}, y_{2}, \ldots, y_{M}\}$ of M samples. Transfer entropy quantifies the reduction in uncertainty about future states of a target process Y when past states of a source process X are observed in addition to past states of Y itself. As an asymmetric measure based on transition probabilities, transfer entropy naturally incorporates directional and dynamic information, which may imply causation between X and Y:

$$TE_{X \to Y}^{(k, l)}(t, \tau) = \sum_{y_{t},\, y_{t-1}^{(l)},\, x_{t-\tau}^{(k)}} p\left(y_{t}, y_{t-1}^{(l)}, x_{t-\tau}^{(k)}\right) \log_{b} \frac{p\left(y_{t} \mid y_{t-1}^{(l)}, x_{t-\tau}^{(k)}\right)}{p\left(y_{t} \mid y_{t-1}^{(l)}\right)} \qquad (1)$$

where $x_{t}$ and $y_{t}$ represent states of X and Y at time t, while $TE_{X \to Y}^{(k, l)}(t, \tau)$ indicates maximized information transfer from X to Y, computed across a range $D = \{0, 1, 2, \ldots, \tau_{max}\}$ of embedding delays $\tau$, such that $\max_{\tau \in D} \{TE_{X \to Y}^{(k, l)}(t, \tau)\} = TE_{X \to Y}^{(k, l)}(t, \arg\max_{\tau \in D} \{TE_{X \to Y}^{(k, l)}(t, \tau)\})$. The embedding dimensions k and l denote the number of past states in X and Y used to condition the probabilities of transition to the next state of Y (or X), represented by $x_{t-\tau}^{(k)} = \{x_{t-\tau-k+1}, x_{t-\tau-k+2}, \ldots, x_{t-\tau}\}$ and $y_{t-1}^{(l)} = \{y_{t-1-l+1}, y_{t-1-l+2}, \ldots, y_{t-1}\}$. The logarithm base $b = 2$ defines the informational unit of transfer entropy in bits, i.e., the reduction in average code length required to optimally encode the target variable (effect), given past states of the source variable (cause) and target variable. Herein, we keep the embedding dimensions at the commonly used $k = l = 1$, mostly for computational reasons. For every pair $(X^{(i)}, X^{(j)})$ from a multivariate time series $\{X^{(1)}, X^{(2)}, \ldots, X^{(N)}\}$, we apply Equation (1) to estimate information transfer $TE$ and thereby determine the candidate source time series $X_{(d)}$ and target time series $X_{(r)}$. We then generate $1/\alpha - 1 = 19$ surrogates (with $\alpha = 0.05$), $\{X_{(d)}^{\prime (1)}, X_{(d)}^{\prime (2)}, \ldots, X_{(d)}^{\prime (19)}\}$, which share their amplitude distribution and power spectrum with the original time series $X_{(d)}$, using the iterative amplitude adjusted Fourier transform (iAAFT) proposed by Schreiber et al. [10]. Following this study, we estimate information transfer $TE^{\prime}$ from each surrogate to the (original) target time series $X_{(r)}$ and obtain $\{TE_{1}^{\prime}, \ldots, TE_{19}^{\prime}\}$. If $TE > \max \{TE_{1}^{\prime}, \ldots, TE_{19}^{\prime}\}$, information transfer $TE$ is considered significant, and non-significant otherwise. The resultant information transfer network, given by a directed weighted graph $G = (V, E)$, comprises a set $V = \{v_{i}\}_{i=1}^{N}$ of N nodes and a set $E = \{e_{ij} = (v_{i}, v_{j})\}$ of edges. Each edge $e_{ij}$ connects a source node $v_{i}$ to a target node $v_{j}$ with strength $w_{ij}$, i.e., information transfer $TE$.
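The estimation and significance steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FaultMap implementation: it uses a plug-in (histogram) estimator with equal-width bins rather than a nearest-neighbour estimator, fixes $k = l = 1$, and the function names, the bin count and the iteration count of the iAAFT loop are hypothetical choices made for the example.

```python
import numpy as np

def transfer_entropy(x, y, tau=1, bins=4):
    """Plug-in estimate of TE_{X->Y} in bits for k = l = 1 (Equation (1))."""
    # Discretize both series into equal-width bins (an assumption of this sketch).
    xd = np.digitize(x, np.histogram_bin_edges(x, bins)[1:-1])
    yd = np.digitize(y, np.histogram_bin_edges(y, bins)[1:-1])
    s = max(tau, 1)                                   # first t with valid y_{t-1}, x_{t-tau}
    yt, yp, xp = yd[s:], yd[s - 1:-1], xd[s - tau:len(xd) - tau]
    joint = np.zeros((bins, bins, bins))
    np.add.at(joint, (yt, yp, xp), 1.0)               # counts of (y_t, y_{t-1}, x_{t-tau})
    p = joint / joint.sum()
    p_yp_xp = p.sum(axis=0, keepdims=True)            # p(y_{t-1}, x_{t-tau})
    p_yt_yp = p.sum(axis=2, keepdims=True)            # p(y_t, y_{t-1})
    p_yp = p.sum(axis=(0, 2), keepdims=True)          # p(y_{t-1})
    num, den = p * p_yp, p_yt_yp * p_yp_xp            # ratio of the two conditionals
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(num[nz] / den[nz])))

def max_transfer_entropy(x, y, tau_max=5, bins=4):
    """Maximize TE over the delay range D = {0, ..., tau_max}."""
    tes = [transfer_entropy(x, y, tau, bins) for tau in range(tau_max + 1)]
    best = int(np.argmax(tes))
    return tes[best], best

def iaaft(x, rng, n_iter=50):
    """Surrogate sharing the amplitude distribution and (approximately) the spectrum of x."""
    amp = np.abs(np.fft.rfft(x))                      # target Fourier amplitudes
    sorted_x = np.sort(x)                             # target amplitude distribution
    s = rng.permutation(x)
    for _ in range(n_iter):
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(amp * np.exp(1j * phases), n=len(x))   # impose the spectrum
        s = sorted_x[np.argsort(np.argsort(s))]                 # impose the amplitudes
    return s

def is_significant(x, y, alpha=0.05, tau_max=5, bins=4, seed=0):
    """Rank-order test: TE must exceed the TE of all 1/alpha - 1 surrogates."""
    rng = np.random.default_rng(seed)
    te, tau = max_transfer_entropy(x, y, tau_max, bins)
    n_surr = int(round(1 / alpha)) - 1                # 19 surrogates for alpha = 0.05
    te_surr = [max_transfer_entropy(iaaft(x, rng), y, tau_max, bins)[0]
               for _ in range(n_surr)]
    return te > max(te_surr), te, tau
```

A call such as `is_significant(x, y)` returns the test outcome together with the maximized transfer entropy and the delay at which it was attained. Histogram estimators are biased for short series, so the bin count would need tuning for real data.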

2.2. Eigenvector Centrality

To diagnose performance issues or even failures within technological complex systems, we wish to locate where perturbations enter the system and propagate, causing downstream effects throughout the system. Therefore, we consider an information transfer network wherein the (main) sources of original information can be identified by measurement and ranking of each node’s global network influence, termed centrality or importance. Centrality, the basic principle of Google’s search engine [11], has proven useful to measure and rank a page’s relevance based on its inbound links. Here, we use out-degree eigenvector centrality, which defines the centrality $c(v_{i})$ of a node $v_{i} \in V = \{v_{1}, \ldots, v_{N}\}$ as proportional to the summed centralities of its outbound neighbors:

$$c(v_{i}) = \lambda^{-1} \sum_{j=1}^{N} W_{ij}\, c(v_{j}), \quad \text{or} \quad W c = \lambda c \qquad (2)$$

where $\lambda^{-1}$ is a proportionality factor and c is the eigenvector of centralities associated with eigenvalue $\lambda$ of adjacency matrix W, whose entries $w_{ij}$ denote information transfer from node $v_{i}$ to node $v_{j}$ in graph G. If W is non-negative and irreducible, the Perron–Frobenius theorem [12,13] ensures there is a unique (up to scaling) vector $c_{1}$ of N centralities $c_{1}(v_{i}) > 0, \forall v_{i}$, associated with the largest positive eigenvalue $\lambda_{1} = \rho(W)$, the spectral radius of W, satisfying Equation (2). Usually $c_{1}$ is normalized, such that each entry indicates the centrality or importance of a node $v_{i}$ in graph G on a relative scale from 0 to 1. Alternatively, matrix W is normalized to a transition matrix P, whose entries $p_{ij}$ denote probabilities of transition from node $v_{i}$ to node $v_{j}$ in a random walk on graph G (a Markov chain), with $\sum_{j=1}^{N} p_{ij} = 1$, such that:

$$p_{ij} = \begin{cases} w_{ij} / \sum_{j=1}^{N} w_{ij}, & \text{if } \sum_{j=1}^{N} w_{ij} \neq 0 \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$

Following Google’s PageRank approach [14], we modify matrix P by introducing a damping factor $\gamma \in (0, 1)$, with teleportation probability $(1 - \gamma)$, and an all-ones matrix J to obtain an $N \times N$ irreducible, positive matrix $P_{\gamma}^{\prime}$:

$$P_{\gamma}^{\prime} = \gamma P + \frac{1}{N} (1 - \gamma) J \qquad (4)$$

where a random walker at node $v_{i}$ follows an edge with probability $\gamma$ or jumps to any node in the N-node network with probability $(1 - \gamma)$. Herein, we use PageRank’s typical value of $\gamma = 0.85$. Considering the Markov chain associated with matrix $P_{\gamma}^{\prime}$, the Perron–Frobenius theorem ensures a unique stationary probability distribution that matches the eigenvector $\vec{\pi}_{1}$ of $P_{\gamma}^{\prime}$ associated with eigenvalue $\lambda_{1}(P_{\gamma}^{\prime}) = 1$, such that $\vec{\pi}_{1} = P_{\gamma}^{\prime} \vec{\pi}_{1}$. Eigenvector $\vec{\pi}_{1}$ is usually computed via power iteration [15] in k steps: $\vec{\pi}_{1} = \lim_{k \to \infty} \vec{\pi}^{(k)} = \lim_{k \to \infty} (P_{\gamma}^{\prime})^{k} \vec{\pi}^{(0)}$, where $\vec{\pi}^{(0)}$ denotes an initial distribution.
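Equations (3) and (4) and the power iteration can be sketched as follows. Two conventions in this example are assumptions rather than statements about FaultMap: the matrix is normalized over reversed edges (target to source), so that nodes emitting information rank high in line with Equation (2), and the iteration multiplies by the transpose of the row-normalized matrix, because a row-stochastic matrix always has the all-ones vector as a right eigenvector for eigenvalue 1. The name `rank_sources` and its defaults are hypothetical.

```python
import numpy as np

def rank_sources(W, gamma=0.85, eps=1e-10, max_iter=1000):
    """Damped power iteration; W[i, j] is the information transfer from node i to node j."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    A = W.T                                    # reversed edges (assumption of this sketch)
    row_sums = A.sum(axis=1, keepdims=True)
    # Equation (3): row-normalize, leaving all-zero rows at zero.
    P = np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums != 0)
    # Equation (4): mix with uniform teleportation.
    P_gamma = gamma * P + (1.0 - gamma) / n * np.ones((n, n))
    pi = np.full(n, 1.0 / n)                   # uniform initial distribution
    for _ in range(max_iter):
        new = P_gamma.T @ pi
        new /= new.sum()                       # renormalize (zero rows leak probability mass)
        if np.abs(new - pi).sum() < eps:
            return new
        pi = new
    return pi

# Chain 0 -> 1 -> 2: node 0 is the original source and should rank first.
W = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
print(np.argsort(-rank_sources(W)))
```

Because $P_{\gamma}^{\prime}$ is strictly positive, the iteration converges from any positive starting vector; the renormalization step only compensates for rows that Equation (3) leaves at zero.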

We use the FaultMap algorithm (https://github.com/SimonStreicher/FaultMap, accessed on 20 May 2015) from Streicher [8], which, to our knowledge, is the only open-source implementation of the method as described above and summarized in the pseudo-code of Algorithm 1, as shown below.

Algorithm 1 FaultMap.

Information Transfer Network Inference

Input: N-dimensional time series $\{X^{(1)}, X^{(2)}, \ldots, X^{(N)}\}$ of M samples from system X, represented by $x^{(i)} = \{x_{1}^{(i)}, x_{2}^{(i)}, \ldots, x_{M}^{(i)}\} \in \mathbb{R}^{M}$; statistical significance threshold $\delta$ in, e.g., a rank-order test using iAAFT surrogates; embedding delay range $\tau_{max}$; embedding history lengths k and l for time series i and time series $j (\neq i)$
Output: adjacency matrix $W \in \mathbb{R}^{N \times N}$, where entry $(i, j)$ represents information transfer from node i to node j
for $i \leftarrow 1$ to N do
    for $j \leftarrow 1$ to N, $j \neq i$ do
        for $\tau \leftarrow 0$ to $\tau_{max}$ do
            compute $TE_{i \to j}(k, l, \tau)$ by Equation (1)
        end for
        $TE_{i \to j} \leftarrow \max_{0 \leq \tau \leq \tau_{max}} TE_{i \to j}(k, l, \tau)$
        if $TE_{i \to j} > \delta > 0$ then
            $W_{ij} \leftarrow TE_{i \to j}$
        else
            $W_{ij} \leftarrow 0$
        end if
    end for
end for

Spectral Centrality Ranking

Input: matrix $P_{\gamma}^{\prime} = \gamma P + \frac{1}{N} (1 - \gamma) J$, where $\gamma \in (0, 1]$; ranking distance $\epsilon$
Output: node centrality score vector $\pi_{1}$
initialize $\pi^{(0)}$ with probabilities $(1/N, 1/N, \ldots, 1/N)$ and set $k \leftarrow 0$
repeat
    $\pi^{(k+1)} \leftarrow P_{\gamma}^{\prime} \pi^{(k)}$; $k \leftarrow k + 1$
until $| \pi^{(k)} - \pi^{(k-1)} | \leq \epsilon$
return $\pi_{1} \leftarrow \pi^{(k)}$, the eigenvector of $P_{\gamma}^{\prime}$ associated with eigenvalue $\lambda_{1}(P_{\gamma}^{\prime}) = 1$, such that $\pi_{1} = P_{\gamma}^{\prime} \pi_{1}$
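The first stage of Algorithm 1 is a plain scan over all ordered pairs and delays, which the following Python sketch makes concrete. To keep the example short and self-contained, the pairwise measure is passed in as a function, and the demonstration uses a lagged absolute correlation as a hypothetical stand-in for Equation (1); `infer_network` and `lagged_dependence` are illustrative names, not part of FaultMap.

```python
import numpy as np

def lagged_dependence(x, y, tau):
    """Hypothetical stand-in for Equation (1): |correlation| of x_{t-tau} and y_t."""
    n = len(x)
    return abs(np.corrcoef(x[:n - tau], y[tau:])[0, 1])

def infer_network(series, measure, tau_max, delta):
    """Scan all ordered pairs (i, j), i != j, and all delays 0..tau_max."""
    n = len(series)
    W = np.zeros((n, n))                       # W[i, j]: strength of edge i -> j
    delays = np.zeros((n, n), dtype=int)       # delay at which the maximum was found
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            scores = [measure(series[i], series[j], tau) for tau in range(tau_max + 1)]
            best = int(np.argmax(scores))
            if scores[best] > delta:           # keep only edges above the threshold
                W[i, j] = scores[best]
                delays[i, j] = best
    return W, delays

# Toy data: y follows x with a lag of 2 samples, z is unrelated noise.
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
y = np.roll(x, 2) + 0.1 * rng.standard_normal(2000)
z = rng.standard_normal(2000)
W, delays = infer_network([x, y, z], lagged_dependence, tau_max=4, delta=0.5)
```

The scan evaluates the measure $N (N - 1) (\tau_{max} + 1)$ times, which is quadratic in the number of time series and is what keeps the bivariate approach tractable for high-dimensional systems.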

2.3. Validation

In what follows, we assess the method’s accuracy in causal inference and node ranking using simulated time series, followed by root cause analysis from real-world diagnostic data. Firstly, we consider a system $S_{(L_{1} \rightleftarrows L_{2})}$ of two bidirectionally coupled Lorenz systems $L_{1}$ and $L_{2}$, investigated by Wibral et al. [16] and given by:

$$\begin{aligned}
\dot{X}_{1}(t) &= 10.0\,(Y_{1}(t) - X_{1}(t)) && (5a) \\
\dot{Y}_{1}(t) &= X_{1}(t)\,(25.0 - Z_{1}(t)) - Y_{1}(t) + 0.1\,Y_{2}^{2}(t - 3) && (5b) \\
\dot{Z}_{1}(t) &= X_{1}(t)\,Y_{1}(t) - 2.67\,Z_{1}(t) && (5c) \\
\dot{X}_{2}(t) &= 10.0\,(Y_{2}(t) - X_{2}(t)) && (5d) \\
\dot{Y}_{2}(t) &= X_{2}(t)\,(28.0 - Z_{2}(t)) - Y_{2}(t) + 0.05\,Y_{1}^{2}(t - 5) && (5e) \\
\dot{Z}_{2}(t) &= X_{2}(t)\,Y_{2}(t) - 2.67\,Z_{2}(t) && (5f)
\end{aligned}$$

The bidirectional coupling $Y_{1} \rightleftarrows Y_{2}$ is governed by time-delayed quadratic terms. We used Pydelay [17] to generate a multivariate time series $\{X_{1}, X_{2}, Y_{1}, Y_{2}, Z_{1}, Z_{2}\}$ of 150K samples by numerically integrating Equations (5a)–(5f) with step size $dt = 0.01$ and initial conditions $X_{1}(0) = X_{2}(0) = 1.0$, $Y_{1}(0) = Y_{2}(0) = 0.97$ and $Z_{1}(0) = Z_{2}(0) = 0.99$. To assess FaultMap’s accuracy in causal network reconstruction against other model-free approaches, we use PCMCI (v4.0) from Runge [18] as a distinct, constraint-based, multivariate alternative. FaultMap estimates edge-weight $w_{ij}$ as $\Delta TE(\tau) = TE_{X^{(i)} \to X^{(j)}}(\tau) - TE_{X^{(j)} \to X^{(i)}}(\tau)$ using the Kraskov–Stögbauer–Grassberger estimator for transfer entropy in the Java Information Dynamics Toolkit from Lizier [19]. PCMCI performs condition selection at an optimized significance level $\alpha$ using Akaike’s information criterion, followed by a conditional independence test where we use the linear ParCorr test (at $\alpha = 0.05$) instead of the computationally expensive nonlinear CMI test. Following the recommended sample size of Bauer [20], we use a rolling analysis window $W_{s}$ of 2K samples to define 50 adjacent time series slices for causal network reconstruction of the coupled Lorenz systems by both algorithms and node importance ranking by FaultMap only.
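For readers without Pydelay, the simulation and windowing can be approximated with NumPy alone. The sketch below is not the integrator used in this study: it applies a fixed-step fourth-order Runge–Kutta scheme in which the delayed terms are held constant over each step, assumes a constant history equal to the initial conditions for $t < 0$, and stores the columns in the order $X_{1}, Y_{1}, Z_{1}, X_{2}, Y_{2}, Z_{2}$. Function names are hypothetical.

```python
import numpy as np

def simulate_coupled_lorenz(n_steps, dt=0.01, delay_12=5.0, delay_21=3.0):
    """Integrate Equations (5a)-(5f); columns are X1, Y1, Z1, X2, Y2, Z2."""
    d12 = int(round(delay_12 / dt))            # delay of Y1 acting on L2, in samples
    d21 = int(round(delay_21 / dt))            # delay of Y2 acting on L1, in samples
    out = np.empty((n_steps, 6))
    out[0] = [1.0, 0.97, 0.99, 1.0, 0.97, 0.99]

    def rhs(s, y1_del, y2_del):
        x1, y1, z1, x2, y2, z2 = s
        return np.array([
            10.0 * (y1 - x1),
            x1 * (25.0 - z1) - y1 + 0.1 * y2_del ** 2,
            x1 * y1 - 2.67 * z1,
            10.0 * (y2 - x2),
            x2 * (28.0 - z2) - y2 + 0.05 * y1_del ** 2,
            x2 * y2 - 2.67 * z2,
        ])

    for n in range(n_steps - 1):
        # Delayed states; before the delay has elapsed, use the initial condition.
        y2_del = out[max(n - d21, 0), 4]
        y1_del = out[max(n - d12, 0), 1]
        s = out[n]
        k1 = rhs(s, y1_del, y2_del)
        k2 = rhs(s + 0.5 * dt * k1, y1_del, y2_del)
        k3 = rhs(s + 0.5 * dt * k2, y1_del, y2_del)
        k4 = rhs(s + dt * k3, y1_del, y2_del)
        out[n + 1] = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return out

def adjacent_windows(data, window):
    """Split the series into non-overlapping slices of `window` samples each."""
    n_windows = len(data) // window
    return [data[i * window:(i + 1) * window] for i in range(n_windows)]
```

Each slice returned by `adjacent_windows(data, 2000)` would then be analyzed independently. How the 50 slices reported here are positioned within the 150K samples is not specified in the text, so the helper simply splits whatever array it is given.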

3. Results and Discussion

3.1. Coupled Lorenz Systems

Figure 1 depicts the rolling window analysis approach, the butterfly-shaped attractors of the coupled Lorenz systems in phase space $(x, y, z)$ and cause–effect detection count heatmaps for both algorithms. Figure 2 exemplifies FaultMap’s rolling window analysis results, of which Figure 3 specifically reports information transfer (delay) via coupling $Y_{1} \rightleftarrows Y_{2}$ and global influence of X-, Y- and Z-variables on the coupled Lorenz systems. The heatmaps in Figure 1b show the total count of each cause–effect relation detected by FaultMap or PCMCI over 50 adjacent time series slices. Firstly, PCMCI’s linear ParCorr test detected the nonlinear coupling $Y_{1} \rightleftarrows Y_{2}$ remarkably well, with a rate of 85% vs. 89% by FaultMap, and achieved a notably higher detection rate (95%) of (direct) causal links throughout the coupled Lorenz systems than FaultMap (83%). ParCorr’s sensitivity to nonlinear interactions has been previously discussed by Krich et al. [21]. PCMCI’s higher detection rate of direct links compared to FaultMap is mainly due to its 100% detection rate of self-influence at each system state variable, whereas FaultMap only reached a 100% detection rate of self-influence at the coupled system state variables $Y_{1}$ and $Y_{2}$. We could not find an obvious explanation for the algorithms’ differing detection rates of self-influence (see the heatmaps’ diagonal), but we may argue that self-influence, as a subset of the previously defined edge set E, or formally $\{e_{ii} = (v_{i}, v_{i})\} \subseteq E$, is irrelevant in the context of Equation (2). Lastly, note that, contrary to what we would expect from its multivariate causal network reconstruction designed to distinguish direct from indirect effects, PCMCI detected 23% more transitive indirect links than FaultMap.

Due to the transitivity of bivariate information transfer, most (if not all) networks inferred by FaultMap include direct and indirect connections, all of which passed strict significance tests, such as $Y_{1} \to Y_{2} \to Z_{2}$ and $Y_{1} \dashrightarrow Z_{2}$ shown in Figure 2. The coupled Lorenz systems are easily discernible as two subnetworks $(X_{1}, Y_{1}, Z_{1})$ and $(X_{2}, Y_{2}, Z_{2})$ that exhibit distinct levels of information transfer with similar time delays in the range of milliseconds. At a rate of at least 98%, FaultMap detected the same edges $X \to Y$, $Y \to X$, $X \to Z$ and $Y \to Z$ indicating (normalized) net influence within both Lorenz subsystems, as Gencaga et al. [22] found for a single Lorenz system. PCMCI shows similar detection results for these edges (see heatmaps). It is important to note that the rolling window approach took PCMCI 495 min of runtime on a 24-core HPC node with 192 GB for causal analysis vs. 58 min by FaultMap for a two-step causal analysis and node ranking. PCMCI’s considerably higher computational costs compared to FaultMap limit its applicability for the diagnostic purposes considered above.

In Figure 3a,b, we focus on reconstruction of time delays in coupling $Y_{1} \rightleftarrows Y_{2}$, given the modeled delays in Equations (5b) and (5e). Figure 3a reveals the dynamic nature of bidirectional information transfer in terms of strength and delay. The median difference (≈0.05) of information transfer distributions in Figure 3b suggests that Lorenz system $L_{2}$ predominantly drives $L_{1}$, which is reasonable to expect since the coupling strength ($0.1$) in Equation (5b) is twice the coupling strength ($0.05$) in Equation (5e). Given coupling delays of 3 and 5 s, the distribution of reconstructed interaction delays seems realistic but may be impacted by lag synchronisation of the Lorenz systems, as suggested by Coufal et al. [23]. Figure 3c shows highly dynamic global influence of particularly X- and Y-variables, while the Y-variables remain the driving force within their respective Lorenz subsystem at all times. Grey-colored bars highlight time windows in which $L_{1}$ is identified as driving, and $L_{2}$ as driven subsystem. Since the coupling strengths are constant, $Y_{1} \to Y_{2}$ is likely to dominate $Y_{2} \to Y_{1}$ in strength when the $Y_{2}$ state reaches vanishingly low values relative to the $Y_{1}$ state. The median difference across all node importance distributions in Figure 3d reflects the aforementioned drive–response relations in, and between, the interacting Lorenz systems. It might explain FaultMap’s 100% detection rate of Y-variable self-loops vs. lower rates of all other self-loops. The outcome of both Y-variables as Lorenz system key driver complies with the Lorenz model of Rayleigh–Bénard convection [24], where temperature difference drives convective heat transfer in addition to conduction at Rayleigh number $Ra \geq 25$ (see Equations (5b) and (5e)). To our knowledge, this is the first rolling window analysis to date, capturing the dynamics of time-varying information transfer or global influence (importance) of state variables within interacting Lorenz systems. Our findings may enable automated identification of monitoring observables for performance (anomaly) diagnostics or predictive maintenance within technological complex systems. The ability to capture time-varying importance of a complex system’s state variables is also relevant in time-series analysis of natural complex systems, including the Earth’s climate.

3.2. Technological Complex Systems

Nanolithography systems are among the most complex technological systems today, capable of sub-nanometer positioning and sub-milliKelvin temperature control, even as system modules accelerate at up to 15 Gs. Such systems are particularly challenging for model-based diagnosis of rare or new issues, due to nonlinear interactions across multiple time and spatial scales. To assess FaultMap’s potential in diagnosis of issues within such systems, we investigate temperature, flow and pressure instability within an ASML subsystem. To this end, we use a multivariate time series of 315 binary samples from 366 parameters related to the problem. As shown in Figure 4, FaultMap identified parameter $P_{0}$ as a primary source of original information, i.e., the most probable cause leading to event $P_{27}$, through a network of collateral effects $\{P_{1}, \ldots, P_{26}\}$. The indicated root cause is confirmed to be correct by a series of automatically logged system events as well as service actions. Interestingly, the event log messages $P_{16}$ and $P_{23}$ follow the network’s direction of time, i.e., from cause to effect (causal inference), while the logged service actions related to $P_{1}$ and $P_{23}$ follow the reversed time order, i.e., from effect to cause (diagnostic inference). Hence, the last logged service action $P_{1}$ appears as a direct effect of root cause $P_{0}$ in the network. This observation is promising with regard to the automation of reliable data-driven diagnostics for technological complex systems.

4. Conclusions

To fully understand a complex system’s dynamical behavior, it is essential to identify its main sources of causal influence affecting downstream elements throughout the system. We empirically show that spectral centrality analysis of its causal network, as approximated by standard transfer entropy, allows one to accurately and consistently identify the most important node(s) of original information representing the most probable cause(s) or driver(s) of disturbance in the system. The ranking algorithm we use compares favorably against the alternative algorithm for multivariate information transfer estimation in causal analysis of two nonlinearly coupled Lorenz systems. In addition, it proves to be accurate, consistent and efficient in identifying the alternately driving and driven Lorenz subsystems, as well as the driving force within either subsystem, over time. Finally, the ranking algorithm correctly traces back the original disturbance within a high-dimensional technological complex system from sampled time series of several hundred parameters. Considering the high dimensionality of observations across multiple time and spatial scales from such systems, we conclude that the inherent robustness of spectral centrality to semi-metricity of directed networks makes it a viable option for reliable and scalable diagnostics. Additionally, spectral centrality ranking allows for feature selection and is particularly useful in identifying long-term effects.

Regardless of the computational costs, state-of-the-art multivariate causal inference methods may be the better choice to account and control for unobserved variables or capture synergistic interactions. However, comprehensive comparison of our method with these approaches is beyond the scope of this study and therefore recommended for future research.

We thank David Sigtermans for our many inspiring discussions and Leonardo Barbini for his valuable feedback on the causal analysis results of the Lorenz systems. Finally, we are grateful to Simon Streicher for his implementation and helpful suggestions on using it.

Author Contributions

Conceptualization, E.Z.; software, E.Z.; validation, E.Z.; formal analysis, E.Z.; investigation, E.Z.; resources, E.Z.; data curation, E.Z.; writing (original draft preparation), E.Z.; writing (review and editing), T.H. and T.C.; visualization, E.Z.; supervision, T.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The source code for simulation of time series representing coupled Lorenz systems is available on https://osf.io (accessed on 19 March 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Schreiber, T. Measuring Information Transfer. Phys. Rev. Lett. 2000, 85, 461–464.
2. Bonacich, P. Factoring and weighting approaches to status scores and clique identification. J. Math. Sociol. 1972, 2, 113–120.
3. Barnett, L.; Barrett, A.B.; Seth, A.K. Granger Causality and Transfer Entropy Are Equivalent for Gaussian Variables. Phys. Rev. Lett. 2009, 103, 238701.
4. Lizier, J.; Rubinov, M. Multivariate construction of effective computational networks from observational data, Preprint 25/2012. In Max Planck Institute for Mathematics in the Sciences; 2012; Available online: http://www.mis.mpg.de/preprints/2012/preprint2012_25.pdf (accessed on 14 April 2014).
5. Benzi, M.; Klymko, C. On the Limiting Behavior of Parameter-Dependent Network Centrality Measures. SIAM J. Matrix Anal. Appl. 2015, 36, 686–706.
6. Katz, L. A new status index derived from sociometric analysis. Psychometrika 1953, 18, 39–43.
7. Kalavri, V.; Simas, T.; Logothetis, D. The shortest path is not always a straight line: Leveraging semimetricity in graph analysis. Proc. VLDB Endow. 2016, 9, 672–683.
8. Streicher, S.; Sandrock, C. Plant-wide fault and disturbance screening using combined transfer entropy and eigenvector centrality analysis. arXiv 2019, arXiv:1904.04035.
9. Wiener, N. The theory of prediction. In Modern Mathematics for the Engineer; Beckenbach, E.F., Ed.; McGraw-Hill: New York, NY, USA, 1956.
10. Schreiber, T.; Schmitz, A. Surrogate time series. Phys. D Nonlinear Phenom. 2000, 142, 346–382.
11. Page, L.; Brin, S.; Motwani, R.; Winograd, T. The PageRank Citation Ranking: Bringing Order to the Web; Technical Report; Stanford Digital Libraries: Stanford, CA, USA, 1998; pp. 1–17.
12. Frobenius, G. Ueber Matrizen aus nicht negativen Elementen. Sitzungsberichte Königlich Preuss. Akad. Wiss. 1912, 26, 456–477.
13. Perron, O. Zur theorie der matrices. Math. Ann. 1907, 2, 248–263.
14. Wills, R.S. Google’s PageRank: The Math Behind the Search Engine. Math. Intell. 2006, 28, 6–11.
15. von Mises, R.; Pollaczek-Geiringer, H. Praktische Verfahren der Gleichungsauflösung. ZAMM Z. Angew. Math. Mech. 1929, 9, 152–164.
16. Wibral, M.; Wollstadt, P.; Meyer, U.; Pampu, N.; Priesemann, V.; Vicente, R. Revisiting Wiener’s principle of causality–interaction-delay reconstruction using transfer entropy and multivariate analysis on delay-weighted graphs. In Proceedings of the 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, San Diego, CA, USA, 28 August–1 September 2012; pp. 3676–3679.
17. Flunkert, V. Pydelay: A Simulation Package, Delay-Coupled Complex Systems and Applications to Lasers; Springer: Berlin/Heidelberg, Germany, 2011.
18. Runge, J. Causal Network Reconstruction from Time Series: From Theoretical Assumptions to Practical Estimation. Chaos Interdiscip. J. Nonlinear Sci. 2018, 28, 075310.
19. Lizier, J.T. JIDT: An Information-Theoretic Toolkit for Studying the Dynamics of Complex Systems. Front. Robot. AI 2014, 1, 1–11.
20. Bauer, M. Data-Driven Methods for Process Analysis. Ph.D. Thesis, University of London, London, UK, 2005.
21. Krich, C.; Runge, J.; Miralles, D.G.; Migliavacca, M.; Perez-Priego, O.; El-Madany, T.; Carrara, A.; Mahecha, M.D. Estimating causal networks in biosphere–atmosphere interaction with the PCMCI approach. Biogeosciences 2020, 17, 1033–1061.
22. Gencaga, D.; Rossow, W.; Knuth, K. A Recipe for the Estimation of Information Flow in a Dynamical System. Entropy 2015, 17, 438–470.
23. Coufal, D.; Jakubík, J.; Jajcay, N.; Hlinka, J.; Krakovská, A.; Paluš, M. Detection of coupling delay: A problem not yet solved. Chaos 2017, 27, 083109.
24. Rayleigh, L. On convecting currents in a horizontal layer of fluid when the higher temperature is on the under side. Philos. Mag. 1916, 32, 529–546.

Figure 1. (a) Time series of system $S_{(L_{1} \rightleftarrows L_{2})}$ composed of bidirectionally delay-coupled Lorenz systems $L_{1}$ and $L_{2}$, generated from Equations (5a)–(5f). (b) Heatmaps of detection count per cause–effect relation, for FaultMap (left) and PCMCI (right). Direct cause–effect relations are denoted by (*).


Figure 2. Information transfer network of two bidirectionally delay-coupled Lorenz systems. Edges indicate direct (⟶) or transitive indirect (⤏) information transfer. Edge annotations denote information transfer delay (sec). Node importance indicates a node’s global network influence. Edge-weight represents level of information transfer ($w_{ij}$).


Figure 3. Dynamic information transfer via bidirectional delay-coupling $Y_{1} \rightleftarrows Y_{2}$ of Lorenz systems $L_{1}$ and $L_{2}$, and dynamic importance of the Lorenz system state variables. (a) Dynamic information transfer (delay) via bidirectional delay-coupling $Y_{1} \rightleftarrows Y_{2}$ between Lorenz systems $L_{1}$ and $L_{2}$. (b) Distribution of information transfer (delay) in 3a. (c) Dynamic importance of state variables in delay-coupled Lorenz systems $L_{1}$ and $L_{2}$. (d) Distributions of Lorenz system state variable importance in 3c.


Figure 4. Top-ranked node $P_{0}$ (root cause) transfers original information towards event $P_{27}$ via a network of collateral effects $\{P_{1}, \ldots, P_{26}\}$ within an ASML subsystem. (Figure 2 legend applies here.)


Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
