# What Caused What? A Quantitative Account of Actual Causation Using Dynamical Causal Networks

## Abstract

## 1. Introduction

## 2. Theory

#### 2.1. Dynamical Causal Networks

#### 2.2. Occurrences and Transitions

#### 2.3. Cause and Effect Repertoires

#### 2.4. Actual Causes and Actual Effects

**Realization.** A transition ${v}_{t-1}\prec {v}_{t}$ must be consistent with the transition probability function of a dynamical causal network ${G}_{u}$; that is, ${p}_{u}\left({v}_{t}\mid {v}_{t-1}\right)>0$.

**Composition.** Occurrences and their actual causes and effects can be uni- or multi-variate. For a complete causal account of the transition ${v}_{t-1}\prec {v}_{t}$, all causal links between occurrences ${x}_{t-1}\subseteq {v}_{t-1}$ and ${y}_{t}\subseteq {v}_{t}$ should be considered. For this reason, we evaluate every subset ${x}_{t-1}\subseteq {v}_{t-1}$ as an occurrence that may have actual effects and every subset ${y}_{t}\subseteq {v}_{t}$ as an occurrence that may have actual causes (see Figure 4). For a particular occurrence ${x}_{t-1}$, all subsets ${y}_{t}\subseteq {v}_{t}$ are considered as candidate effects (see Figure 5A). For a particular occurrence ${y}_{t}$, all subsets ${x}_{t-1}\subseteq {v}_{t-1}$ are considered as candidate causes (see Figure 5B). In what follows, we refer to occurrences consisting of a single variable as “first-order” occurrences and to multi-variate occurrences as “high-order” occurrences, and, likewise, to “first-order” and “high-order” causes and effects.

**Information.** An occurrence must provide information about its actual cause or effect. This means that it should increase the probability of its actual cause or effect compared to the probability obtained when the occurrence is left unspecified. To evaluate this, we compare the probability of a candidate effect ${y}_{t}$ in the effect repertoire of the occurrence ${x}_{t-1}$ (Equation (3)) to its corresponding probability in the unconstrained repertoire (Equation (6)). In line with information-theoretical principles, we define the effect information ${\rho}_{e}$ of the occurrence ${x}_{t-1}$ about a subsequent occurrence ${y}_{t}$ (the candidate effect) as:

$${\rho}_{e}({x}_{t-1},{y}_{t})={log}_{2}\left(\frac{\pi ({y}_{t}\mid {x}_{t-1})}{\pi ({y}_{t})}\right).$$
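The comparison described above can be illustrated with a small numerical sketch. It is not the authors' implementation: it assumes a single binary output node driven by a deterministic gate, and it assumes that inputs outside the occurrence are averaged over with equal weight. For a single output node the product probability $\pi$ then coincides with an ordinary conditional probability. The function name `effect_information` and its arguments are hypothetical.

```python
from itertools import product
from math import log2


def effect_information(gate, n_inputs, fixed, y):
    """Sketch of rho_e = log2( p(y | occurrence) / p(y) ) for one output node.

    gate     : function mapping a tuple of binary inputs to 0 or 1
    n_inputs : number of inputs of the gate
    fixed    : dict {input index: state} describing the occurrence x_{t-1}
    y        : state of the output in the candidate effect
    Inputs that are not in `fixed` are averaged over uniformly (assumption).
    """
    def prob_y(constraint):
        # All input states compatible with the constraint, equally weighted.
        states = [s for s in product((0, 1), repeat=n_inputs)
                  if all(s[i] == v for i, v in constraint.items())]
        return sum(gate(s) == y for s in states) / len(states)

    constrained = prob_y(fixed)   # probability of y given the occurrence
    unconstrained = prob_y({})    # probability of y with nothing specified
    if constrained == 0:
        return float("-inf")      # the occurrence rules the candidate effect out
    return log2(constrained / unconstrained)


def or_gate(s):
    return int(s[0] or s[1])


# Occurrence {A=1} and candidate effect {C=1} for an OR-gate C with inputs A, B.
rho = effect_information(or_gate, 2, {0: 1}, 1)
print(round(rho, 3))  # 0.415
```

A positive value means the occurrence raises the probability of the candidate effect; zero or negative values mean it provides no information in the sense required here.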

**Integration.** A high-order occurrence must specify more information about its actual cause or effect than its parts when they are considered independently. This means that the high-order occurrence must increase the probability of its actual cause or effect beyond the value specified by its parts.

**Exclusion.** An occurrence should have at most one actual cause and one actual effect (which, however, can be multi-variate; that is, a high-order occurrence). In other words, only one occurrence ${y}_{t}\subseteq {v}_{t}$ can be the actual effect of an occurrence ${x}_{t-1}$, and only one occurrence ${x}_{t-1}\subseteq {v}_{t-1}$ can be the actual cause of an occurrence ${y}_{t}$.

| Principle | Requirement |
| --- | --- |
| Realization | There is a dynamical causal network ${G}_{u}$ and a transition ${v}_{t-1}\prec {v}_{t}$, such that ${p}_{u}\left({v}_{t}\mid {v}_{t-1}\right)>0$. |
| Composition | All ${x}_{t-1}\subseteq {v}_{t-1}$ may have actual effects and be actual causes, and all ${y}_{t}\subseteq {v}_{t}$ may have actual causes and be actual effects. |
| Information | Occurrences must increase the probability of their causes or effects ($\rho ({x}_{t-1},{y}_{t})>0$). |
| Integration | Moreover, they must do so above and beyond their parts ($\alpha ({x}_{t-1},{y}_{t})>0$). |
| Exclusion | An occurrence has only one actual cause (or effect), and it is the occurrence that maximizes ${\alpha}_{c}$ (or ${\alpha}_{e}$). |

**Definition 1.**

1. The integrated cause information of ${y}_{t}$ over ${x}_{t-1}$ is maximal: $${\alpha}_{c}({x}_{t-1},{y}_{t})={\alpha}^{max}\left({y}_{t}\right);\quad \text{and}$$
2. No subset of ${x}_{t-1}$ satisfies condition (1): $${\alpha}_{c}({x}_{t-1}^{\prime},{y}_{t})={\alpha}^{max}\left({y}_{t}\right)\Rightarrow {x}_{t-1}^{\prime}\not\subset {x}_{t-1}.$$

With ${x}^{*}\left({y}_{t}\right)$ denoting the set of occurrences ${x}_{t-1}$ that satisfy both conditions, there are three possible outcomes:

1. If ${x}^{*}\left({y}_{t}\right)=\left\{{x}_{t-1}\right\}$, then ${x}_{t-1}$ is the actual cause of ${y}_{t}$;
2. if $|{x}^{*}\left({y}_{t}\right)|>1$, then the actual cause of ${y}_{t}$ is indeterminate; and
3. if ${x}^{*}\left({y}_{t}\right)=\{\varnothing\}$, then ${y}_{t}$ has no actual cause.
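The selection rule in Definition 1 can be sketched as a short routine. This is an illustrative sketch only: it assumes the ${\alpha}_{c}$ values of all candidate causes have already been computed, it represents candidates as sets of variable names, and it returns an empty list where the text writes $\{\varnothing\}$. The function name `actual_cause` and the tolerance parameter are hypothetical.

```python
def actual_cause(alpha_c, tol=1e-9):
    """Apply the two conditions of Definition 1 to a table of alpha_c values.

    alpha_c : dict mapping each candidate cause (a frozenset of variable
              names) to its integrated cause information in bits.
    Returns the candidates that are maximal and minimal, as a list of
    frozensets. An empty list stands for "no actual cause".
    """
    if not alpha_c:
        return []
    alpha_max = max(alpha_c.values())
    if alpha_max <= tol:
        # No candidate is irreducible, so there is no actual cause.
        return []
    # Condition 1: the candidate reaches the maximal value.
    maximal = [x for x, a in alpha_c.items() if abs(a - alpha_max) <= tol]
    # Condition 2: no proper subset of the candidate also reaches the maximum.
    return [x for x in maximal if not any(other < x for other in maximal)]


# Values quoted for the OR-gate example in Section 3.1 (0.415 bits each).
or_example = {
    frozenset("A"): 0.415,
    frozenset("B"): 0.415,
    frozenset("AB"): 0.415,
}
x_star = actual_cause(or_example)
print(sorted("".join(sorted(x)) for x in x_star))  # ['A', 'B']
```

Two candidates survive, so the actual cause is indeterminate, and the superset $\{AB=11\}$ is dropped by the minimality condition. The same routine applies to Definition 2 if the table holds ${\alpha}_{e}$ values of candidate effects.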

**Definition 2.**

1. The integrated effect information of ${x}_{t-1}$ over ${y}_{t}$ is maximal: $${\alpha}_{e}({x}_{t-1},{y}_{t})={\alpha}^{max}\left({x}_{t-1}\right);\quad \text{and}$$
2. No subset of ${y}_{t}$ satisfies condition (1): $${\alpha}_{e}({x}_{t-1},{y}_{t}^{\prime})={\alpha}^{max}\left({x}_{t-1}\right)\Rightarrow {y}_{t}^{\prime}\not\subset {y}_{t}.$$

With ${y}^{*}\left({x}_{t-1}\right)$ denoting the set of occurrences ${y}_{t}$ that satisfy both conditions, there are three possible outcomes:

1. If ${y}^{*}\left({x}_{t-1}\right)=\left\{{y}_{t}\right\}$, then ${y}_{t}$ is the actual effect of ${x}_{t-1}$;
2. if $|{y}^{*}\left({x}_{t-1}\right)|>1$, then the actual effect of ${x}_{t-1}$ is indeterminate; and
3. if ${y}^{*}\left({x}_{t-1}\right)=\{\varnothing\}$, then ${x}_{t-1}$ has no actual effect.

**Definition 3.**

**Definition 4.**

## 3. Results

#### 3.1. Same Transition, Different Mechanism: Disjunction, Conjunction, Bi-Conditional, and Prevention

**Disjunction:** The first example (Figure 7A, OR-gate) is a case of symmetric over-determination ([7], Chapter 10): each input to C would have been sufficient for $\{C=1\}$, yet both $\{A=1\}$ and $\{B=1\}$ occurred at $t-1$. In this case, each of the inputs to C has an actual effect, $\{A=1\}\to \{C=1\}$ and $\{B=1\}\to \{C=1\}$, as they raise the probability of $\{C=1\}$ when compared to its unconstrained probability. The high-order occurrence $\{AB=11\}$, however, is reducible (with ${\alpha}_{e}=0$). While both $\{A=1\}$ and $\{B=1\}$ have actual effects, by the causal exclusion principle, the occurrence $\{C=1\}$ can only have one actual cause. As both $\{A=1\}\leftarrow \{C=1\}$ and $\{B=1\}\leftarrow \{C=1\}$ have ${\alpha}_{c}={\alpha}_{c}^{\mathrm{max}}=0.415$ bits, the actual cause of $\{C=1\}$ is either $\{A=1\}$ or $\{B=1\}$, by Definition 1; which of the two inputs it is remains undetermined, since they are perfectly symmetric in this example. Note that $\{AB=11\}\leftarrow \{C=1\}$ also has ${\alpha}_{c}=0.415$ bits, but $\{AB=11\}$ is excluded from being a cause by the minimality condition.

**Conjunction:** In the second example (Figure 7B, AND-gate), both $\{A=1\}$ and $\{B=1\}$ are necessary for $\{D=1\}$. In this case, each input alone has an actual effect, $\{A=1\}\to \{D=1\}$ and $\{B=1\}\to \{D=1\}$ (with higher strength than in the disjunctive case); here, the second-order occurrence of both inputs together also has an actual effect, $\{AB=11\}\to \{D=1\}$. Thus, there is a composition of actual effects. Again, the occurrence $\{D=1\}$ can only have one actual cause; here, it is the second-order cause $\{AB=11\}$, the only occurrence that satisfies the conditions in Definition 1, with ${\alpha}_{c}={\alpha}_{c}^{\mathrm{max}}=2.0$ bits.

**Bi-conditional:** The significance of high-order occurrences is further emphasized by the third example (Figure 7C), where E is a “logical bi-conditional” (an XNOR) of its two inputs. In this case, the individual occurrences $\{A=1\}$ and $\{B=1\}$ by themselves make no difference in bringing about $\{E=1\}$; their effect information is zero. For this reason, they cannot have actual effects and cannot be actual causes. Only the second-order occurrence $\{AB=11\}$ specifies $\{E=1\}$, which is its actual effect, $\{AB=11\}\to \{E=1\}$. Likewise, $\{E=1\}$ only specifies the second-order occurrence $\{AB=11\}$, which is its actual cause, $\{AB=11\}\leftarrow \{E=1\}$, but not its parts taken separately. Note that the causal strength in this example is lower than in the case of the AND-gate, since, everything else being equal, $\{D=1\}$ is, mechanistically, a less likely output than $\{E=1\}$.

**Prevention:** In the final example (Figure 7D), all input states but $\{AB=10\}$ lead to $\{F=1\}$. Here, $\{B=1\}\to \{F=1\}$ and $\{B=1\}\leftarrow \{F=1\}$, whereas $\{A=1\}$ does not have an actual effect and is not an actual cause. For this reason, the transition ${v}_{t-1}\prec {v}_{t}$ is reducible ($\mathcal{A}({v}_{t-1}\prec {v}_{t})=0$, see Appendix A), since A could be partitioned away without loss. This example can be seen as a case of prevention: $\{B=1\}$ causes $\{F=1\}$, which prevents any effect of $\{A=1\}$. In a popular narrative accompanying this example, $\{A=1\}$ is an assassin putting poison in the King’s tea, while a bodyguard administers an antidote $\{B=1\}$, and the King survives $\{F=1\}$ [12]. The bodyguard thus “prevents” the King’s death. (However, the causal model is also equivalent to an OR-gate, as can be seen by switching the state labels of A from ‘0’ to ‘1’ and vice versa. The discussed transition would then correspond to the case of one input to the OR-gate being ‘1’ and the other ‘0’. As the OR-gate switches on (‘1’) in this case, the ‘0’ input has no effect and is not a cause.) Note that the causal account is state-dependent: for a different transition, A may have an actual effect or contribute to an actual cause; if the bodyguard does not administer the antidote ($\{B=0\}$), whether the King survives depends on the assassin (the state of A).
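The cause-side values quoted in these four examples can be reproduced with a brief sketch. It rests on two assumptions that are not spelled out in this section: the unconstrained cause repertoire is uniform over input states, and the cause repertoire of a single output is obtained by Bayes' rule from that uniform prior. Because each occurrence here is a single output node, ${\alpha}_{c}$ equals the cause information ${\rho}_{c}$ (see the caption of Figure 5). All function and variable names are hypothetical.

```python
from itertools import combinations, product
from math import log2

# Input-output functions of the four mechanisms in Figure 7 (inputs A, B).
GATES = {
    "OR (C)": lambda a, b: int(a or b),
    "AND (D)": lambda a, b: int(a and b),
    "XNOR (E)": lambda a, b: int(a == b),
    "Prevention (F)": lambda a, b: int(not (a == 1 and b == 0)),
}
NAMES = ("A", "B")
STATE = (1, 1)                                 # input state at t-1
ALL_INPUTS = list(product((0, 1), repeat=2))   # uniform prior over inputs


def cause_information(gate, subset, y=1):
    """log2( pi(x | y) / pi(x) ) for the inputs in `subset`, fixed to STATE."""
    consistent = [s for s in ALL_INPUTS if gate(*s) == y]
    matching = [s for s in consistent if all(s[i] == STATE[i] for i in subset)]
    posterior = len(matching) / len(consistent)   # Bayes with a uniform prior
    prior = 0.5 ** len(subset)                    # unconstrained probability
    return log2(posterior / prior) if posterior > 0 else float("-inf")


def cause_table(gate):
    """Cause information (rounded, in bits) of every candidate cause."""
    table = {}
    for size in (1, 2):
        for subset in combinations(range(2), size):
            label = "".join(NAMES[i] for i in subset)
            table[label] = round(cause_information(gate, subset), 3)
    return table


results = {name: cause_table(gate) for name, gate in GATES.items()}
for name, table in results.items():
    print(name, table)
```

Under these assumptions the sketch gives 0.415 bits for all three candidates of the OR-gate, 2.0 bits for $\{AB=11\}$ in the AND-gate, zero for the single inputs of the XNOR, and a negative value for $\{A=1\}$ in the prevention case, in line with the accounts given above.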

#### 3.2. Linear Threshold Units

**Theorem 1.**

1. The actual cause of $\{{Y}_{t}=1\}$ is an occurrence $\{{X}_{t-1}={x}_{t-1}\}$ with $|{x}_{t-1}|=k$ and $min\left({x}_{t-1}\right)=1$; and
2. if $min\left({x}_{t-1}\right)=1$ and $|{x}_{t-1}|\le k$, then the actual effect of $\{{X}_{t-1}={x}_{t-1}\}$ is $\{{Y}_{t}=1\}$; otherwise, $\{{X}_{t-1}={x}_{t-1}\}$ has no actual effect (it is reducible).
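Part 1 of the theorem can be checked by brute force on a small unit, such as the four-input majority gate with $k=3$ shown in Figure 8. The sketch below is a numerical illustration, not a proof, and it makes several assumptions: the unit outputs 1 exactly when at least $k$ inputs are 1, the unconstrained cause repertoire is uniform, the cause repertoire follows from Bayes' rule, and ${\alpha}_{c}={\rho}_{c}$ because the occurrence is a single node. The function names are hypothetical.

```python
from itertools import combinations, product
from math import log2


def ltu_cause_information(n, k, state):
    """Map each subset of input indices to log2( pi(x | Y=1) / pi(x) )."""
    firing = [s for s in product((0, 1), repeat=n) if sum(s) >= k]
    table = {}
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            # Firing input states that agree with `state` on the subset.
            matching = sum(all(s[i] == state[i] for i in subset) for s in firing)
            posterior = matching / len(firing)
            prior = 0.5 ** size
            table[subset] = log2(posterior / prior) if matching else float("-inf")
    return table


def maximal_minimal(table, tol=1e-9):
    """Return the maximal value and the minimal subsets that attain it."""
    best = max(table.values())
    top = [set(x) for x, a in table.items() if abs(a - best) <= tol]
    return best, [x for x in top if not any(other < x for other in top)]


# All four inputs ON: every surviving candidate has exactly k = 3 inputs.
best_1111, causes_1111 = maximal_minimal(ltu_cause_information(4, 3, (1, 1, 1, 1)))
print(round(best_1111, 3), causes_1111)

# Three inputs ON, one OFF: a single candidate survives.
best_1110, causes_1110 = maximal_minimal(ltu_cause_information(4, 3, (1, 1, 1, 0)))
print(round(best_1110, 3), causes_1110)
```

In both runs the surviving candidates consist of exactly $k$ inputs that are all ON, in line with part 1. With all four inputs ON there are four such candidates, so by Definition 1 the actual cause would be indeterminate among them.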

**Proof.**

#### 3.3. Distinct Background Conditions

#### 3.4. Disjunction of Conjunctions

**Theorem 2.**

1. If ${y}_{t}=1$,
   - (a) the actual cause of $\{{Y}_{t}=1\}$ is an occurrence $\{{X}_{t-1}={x}_{t-1}\}$, where ${x}_{t-1}={\left\{{x}_{i,j,t-1}\right\}}_{i=1}^{{n}_{j}}\subseteq {v}_{t-1}$, such that $min\left({x}_{t-1}\right)=1$; and
   - (b) the actual effect of $\{{X}_{t-1}={x}_{t-1}\}$ is $\{{Y}_{t}=1\}$ if $min\left({x}_{t-1}\right)=1$ and $|{x}_{t-1}|={c}_{j}={n}_{j}$; otherwise, ${x}_{t-1}$ is reducible.
2. If ${y}_{t}=0$,
   - (a) the actual cause of $\{{Y}_{t}=0\}$ is an occurrence ${x}_{t-1}\subseteq {v}_{t-1}$ such that $max\left({x}_{t-1}\right)=0$ and ${c}_{j}=1\ \forall \ j$; and
   - (b) if $max\left({x}_{t-1}\right)=0$ and ${c}_{j}\le 1\ \forall \ j$, then the actual effect of $\{{X}_{t-1}={x}_{t-1}\}$ is $\{{Y}_{t}=0\}$; otherwise, ${x}_{t-1}$ is reducible.

**Proof.**

#### 3.5. Complicated Voting

#### 3.6. Non-Binary Variables

#### 3.7. Noise and Probabilistic Variables

#### 3.8. Simple Classifier

## 4. Discussion

#### 4.1. Testing All Possible Counterfactuals with Equal Probability

#### 4.2. Distinguishing Actual Effects and Actual Causes

#### 4.3. Composition

#### 4.4. Integration

#### 4.5. Exclusion

#### 4.6. Intended Scope and Limitations

#### 4.7. Accountability and Causal Responsibility

## 5. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A. Irreducibility of the Causal Account

- Identify irrelevant variables within a causal account that do not contribute to any causal link (Figure A1A);
- evaluate how entangled the sets of causes and effects are within a transition ${v}_{t-1}\prec {v}_{t}$ (Figure A1B); and
- compare $\mathcal{A}$ values between (sub-)transitions, in order to identify clusters of variables whose causes and effects are highly entangled, or only minimally connected (Figure A1C).

**Figure A1.** Reducible and irreducible causal accounts. (**A**) “Prevention” example (see Figure 7D, main text). Here, $\mathcal{A}=0$ bits, as $\{A=1\}$ does not contribute to any causal links. (**B**) Irreducible transition (see Figure 6, main text). A partition of the transition along the MIP destroys the second-order causal link, leading to $\mathcal{A}=0.17$ bits. (**C**) In larger systems, $\mathcal{A}$ can be used to identify (sub-)transitions with highly entangled causes and effects. While the causes and effects in the full transition are only weakly entangled, with $\mathcal{A}=0.03$ bits, the top and bottom (sub-)transitions are irreducible, with $\mathcal{A}=0.83$ bits.

## Appendix B. Supplementary Proof 1

**Lemma A1.**

**Proof.**

**Lemma A2.**

**Proof.**

**Lemma A3.**

**Proof.**

**Lemma A4.**

**Proof.**

**Theorem A1.**

1. The actual cause of $\{{Y}_{t}=1\}$ is an occurrence $\{{X}_{t-1}={x}_{t-1}\}$ with $|{x}_{t-1}|=k$ and $min\left({x}_{t-1}\right)=1$. Furthermore, the causal strength of the link is $${\alpha}_{c}^{max}\left({y}_{t}\right)=k-{log}_{2}\left(\sum _{j=0}^{k}{q}_{k,j}\right)>0;\quad \text{and}$$
2. if $min\left({x}_{t-1}\right)=1$ and $|{x}_{t-1}|\le k$, then the actual effect of $\{{X}_{t-1}={x}_{t-1}\}$ is $\{{Y}_{t}=1\}$ with causal strength $${\alpha}_{e}({x}_{t-1},{y}_{t})={log}_{2}\left(\frac{{q}_{c,c}}{{q}_{c-1,c-1}}\right)>0,$$ where $c=|{x}_{t-1}|$; otherwise, $\{{X}_{t-1}={x}_{t-1}\}$ is reducible.

**Proof.**

**Part 1:** Consider an occurrence $\{{X}_{t-1}={x}_{t-1}\}$, such that $|{x}_{t-1}|=c\le n$ and $\sum _{x\in {x}_{t-1}}x=j$. Then, the probability of ${x}_{t-1}$ in the cause repertoire of ${y}_{t}$ is

**Part 2:** Again, consider occurrences $\{{X}_{t-1}={x}_{t-1}\}$ with $|{x}_{t-1}|=c$ and $\sum _{x\in {x}_{t-1}}x=j$. The probability of ${y}_{t}$ in the effect repertoire of such an occurrence is

## Appendix C. Supplementary Proof 2

**Lemma A5.**

**Proof.**

**Lemma A6.**

**Proof.**

**Lemma A7.**

**Proof.**

**Lemma A8.**

**Proof.**

**Lemma A9.**

**Proof.**

**Theorem A2.**

1. If ${y}_{t}=1$,
   - (a) the actual cause of $\{{Y}_{t}=1\}$ is an occurrence $\{{X}_{t-1}={x}_{t-1}\}$, where ${x}_{t-1}={\left\{{x}_{i,j,t-1}\right\}}_{i=1}^{{n}_{j}}\subseteq {v}_{t-1}$, such that $min\left({x}_{t-1}\right)=1$; and
   - (b) the actual effect of $\{{X}_{t-1}={x}_{t-1}\}$ is $\{{Y}_{t}=1\}$, if $min\left({x}_{t-1}\right)=1$ and $|{x}_{t-1}|={c}_{j}={n}_{j}$; otherwise, ${x}_{t-1}$ is reducible.
2. If ${y}_{t}=0$,
   - (a) the actual cause of $\{{Y}_{t}=0\}$ is an occurrence ${x}_{t-1}\subseteq {v}_{t-1}$, such that $max\left({x}_{t-1}\right)=0$ and ${c}_{j}=1\ \forall \ j$; and
   - (b) if $max\left({x}_{t-1}\right)=0$ and ${c}_{j}\le 1\ \forall \ j$, then the actual effect of $\{{X}_{t-1}={x}_{t-1}\}$ is $\{{Y}_{t}=0\}$; otherwise, ${x}_{t-1}$ is reducible.

**Proof.**

**Part 1a:** The actual cause of $\{{Y}_{t}=1\}$. For an occurrence $\{{X}_{t-1}={x}_{t-1}\}$, the probability of ${x}_{t-1}$ in the cause repertoire of ${y}_{t}$ is

**Part 1b:** The actual effect of ${x}_{t-1}$ when ${y}_{t}=1$. Again, consider occurrences $\{{X}_{t-1}={x}_{t-1}\}$ with ${c}_{j}$ elements from each of the k conjunctions. The effect repertoire of a DOC with k conjunctions over such occurrences is

**Part 2a:** The actual cause of $\{{Y}_{t}=0\}$. For an occurrence $\{{X}_{t-1}={x}_{t-1}\}$, the cause repertoire of ${y}_{t}$ is

**Part 2b:** The actual effect of ${x}_{t-1}$ when ${y}_{t}=0$. Again, consider occurrences $\{{X}_{t-1}={x}_{t-1}\}$ with ${c}_{j}$ elements from each of the k conjunctions. The probability of ${y}_{t}$ in the effect repertoire of ${x}_{t-1}$ is

## References

1. Illari, M.; Phyllis, F.R.; Williamson, J. (Eds.) Causality in the Sciences; Oxford University Press: Oxford, UK, 2011; p. 952.
2. Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; Van Den Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M.; et al. Mastering the game of Go with deep neural networks and tree search. Nature **2016**, 529, 484.
3. Metz, C. How Google’s AI Viewed the Move No Human Could Understand. WIRED, 14 March 2016.
4. Sporns, O.; Tononi, G.; Edelman, G. Connectivity and complexity: the relationship between neuroanatomy and brain dynamics. Neural Netw. **2000**, 13, 909–922.
5. Wolff, S.B.; Ölveczky, B.P. The promise and perils of causal circuit manipulations. Curr. Opin. Neurobiol. **2018**, 49, 84–94.
6. Lewis, D. Philosophical Papers, Volume II; Oxford University Press: Oxford, UK, 1986.
7. Pearl, J. Causality: Models, Reasoning And Inference; Cambridge University Press: Cambridge, UK, 2000; Volume 29.
8. Woodward, J. Making Things Happen. A theory of Causal Explanation; Oxford University Press: Oxford, MI, USA, 2003.
9. Hitchcock, C. Prevention, Preemption, and the Principle of Sufficient Reason. Philos. Rev. **2007**, 116, 495–532.
10. Paul, L.A.; Hall, E.J. Causation: A User’s Guide; Oxford University Press: Oxford, UK, 2013.
11. Weslake, B. A Partial Theory of Actual Causation. Br. J. Philos. Sci. **2015**. Available online: https://philpapers.org/rec/WESAPT (accessed on 10 February 2019).
12. Halpern, J.Y. Actual Causality; MIT Press: Cambridge, MA, USA, 2016.
13. Good, I.J.I. A Causal Calculus I. Br. J. Philos. Sci. **1961**, 11, 305–318.
14. Suppes, P. A Probabilistic Theory of Causality; Number 4; North Holland Publishing Company: Amsterdam, The Netherlands, 1970.
15. Spirtes, P.; Glymour, C.; Scheines, R. Causation, Predictions, and Search; Springer: New York, NY, USA, 1993.
16. Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference; Morgan Kaufmann Series in Representation And Reasoning; Morgan Kaufmann Publishers: Burlington, MA, USA, 1988.
17. Wright, R.W. Causation in tort law. Calif. Law Rev. **1985**, 73, 1735.
18. Tononi, G.; Sporns, O.; Edelman, G.M. Measures of degeneracy and redundancy in biological networks. Proc. Natl. Acad. Sci. USA **1999**, 96, 3257–3262.
19. Hitchcock, C. The Intransitivity of Causation Revealed in Equations and Graphs. J. Philos. **2001**, 98, 273.
20. Halpern, J.Y.J.; Pearl, J. Causes and explanations: A structural-model approach. Part I: Causes. Br. J. Philos. Sci. **2005**, 56, 843–887.
21. Halpern, J.Y. A Modification of the Halpern-Pearl Definition of Causality. arXiv **2015**, arXiv:1505.00162.
22. Lewis, D. Counterfactuals; Harvard University Press: Cambridge, MA, USA, 1973.
23. Woodward, J. Counterfactuals and causal explanation. Int. Stud. Philos. Sci. **2004**, 18, 41–72.
24. Beckers, S.; Vennekens, J. A principled approach to defining actual causation. Synthese **2018**, 195, 835–862.
25. Oizumi, M.; Albantakis, L.; Tononi, G. From the Phenomenology to the Mechanisms of Consciousness: Integrated Information Theory 3.0. PLoS Comput. Biol. **2014**, 10, e1003588.
26. Albantakis, L.; Tononi, G. The Intrinsic Cause-Effect Power of Discrete Dynamical Systems—From Elementary Cellular Automata to Adapting Animats. Entropy **2015**, 17, 5472–5502.
27. Tononi, G. Integrated information theory. Scholarpedia **2015**, 10, 4164.
28. Tononi, G.; Boly, M.; Massimini, M.; Koch, C. Integrated information theory: From consciousness to its physical substrate. Nat. Rev. Neurosci. **2016**, 17, 450–461.
29. Chajewska, U.; Halpern, J. Defining Explanation in Probabilistic Systems. In Uncertainty in Artificial Intelligence 13; Geiger, D., Shenoy, P., Eds.; Morgan Kaufmann: San Francisco, CA, USA, 1997; pp. 62–71.
30. Yablo, S. De Facto Dependence. J. Philos. **2002**, 99, 130–148.
31. Hall, N. Structural equations and causation. Philos. Stud. **2007**, 132, 109–136.
32. Ay, N.; Polani, D. Information Flows in Causal Networks. Adv. Complex Syst. **2008**, 11, 17–41.
33. Korb, K.B.; Nyberg, E.P.; Hope, L. A new causal power theory. In Causality in the Sciences; Oxford University Press: Oxford, UK, 2011.
34. Janzing, D.; Balduzzi, D.; Grosse-Wentrup, M.; Schölkopf, B. Quantifying causal influences. Ann. Stat. **2013**, 41, 2324–2358.
35. Biehl, M.; Ikegami, T.; Polani, D. Towards information based spatiotemporal patterns as a foundation for agent representation in dynamical systems. In Proceedings of the Artificial Life Conference, Cancún, Mexico, 4 July–8 August 2016.
36. Pearl, J. An Introduction to Causal Inference. Int. J. Biostat. **2010**, 6, 7.
37. Hoel, E.P.; Albantakis, L.; Marshall, W.; Tononi, G. Can the macro beat the micro? Integrated information across spatiotemporal scales. Neurosci. Conscious. **2016**, 2016, niw012.
38. Rubenstein, P.K.; Weichwald, S.; Bongers, S.; Mooij, J.M.; Janzing, D.; Grosse-Wentrup, M.; Schölkopf, B. Causal Consistency of Structural Equation Models. arXiv **2017**, arXiv:1707.00819.
39. Marshall, W.; Albantakis, L.; Tononi, G. Black-boxing and cause-effect power. PLOS Comput. Biol. **2018**, 14, e1006114.
40. Schaffer, J. Causes as Probability Raisers of Processes. J. Philos. **2001**, 98, 75.
41. Marshall, W.; Gomez-Ramirez, J.; Tononi, G. Integrated Information and State Differentiation. Front. Psychol. **2016**, 7, 926.
42. Balduzzi, D.; Tononi, G. Integrated information in discrete dynamical systems: Motivation and theoretical framework. PLoS Comput. Biol. **2008**, 4, e1000091.
43. Fano, R.M. Transmission of Information: A Statistical Theory of Communications; MIT Press: Cambridge, MA, USA, 1961.
44. Mayner, W.G.; Marshall, W.; Albantakis, L.; Findlay, G.; Marchman, R.; Tononi, G. PyPhi: A toolbox for integrated information theory. PLoS Comput. Biol. **2018**, 14, e1006343.
45. Halpern, J.; Pearl, J. Causes and explanations: A structural-model approach. Part I: Causes. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI 2001), Seattle, WA, USA, 2–5 August 2001; pp. 194–202.
46. McDermott, M. Causation: Influence versus Sufficiency. J. Philos. **2002**, 99, 84.
47. Hopkins, M.; Pearl, J. Clarifying the Usage of Structural Models for Commonsense Causal Reasoning. In Proceedings of the AAAI Spring Symposium on Logical Formalizations of Commonsense Reasoning; Number January; AAAI Press: Menlo Park, CA, USA, 2003; pp. 83–89.
48. Livengood, J. Actual Causation and Simple Voting Scenarios. Noûs **2013**, 47, 316–345.
49. Twardy, C.R.; Korb, K.B. Actual Causation by Probabilistic Active Paths. Philos. Sci. **2011**, 78, 900–913.
50. Fenton-Glynn, L. A proposed probabilistic extension of the Halpern and Pearl definition of ‘actual cause’. Br. J. Philos. Sci. **2017**, 68, 1061–1124.
51. Beckers, S.; Vennekens, J. A general framework for defining and extending actual causation using CP-logic. Int. J. Approx. Reason. **2016**, 77, 105–126.
52. Glennan, S. Singular and General Causal Relations: A Mechanist Perspective. In Causality in the Sciences; Oxford University Press: Oxford, UK, 2011; p. 789.
53. Eells, E.; Sober, E. Probabilistic causality and the question of transitivity. Philos. Sci. **1983**, 50, 35–57.
54. Pearl, J. The Structural Theory of Causations. In Causality in the Sciences; Number July; Oxford University Press: Oxford, UK, 2009.
55. Shimony, S.E. Explanation, irrelevance and statistical independence. In Proceedings of the Ninth National Conference on Artificial Intelligence-Volume 1; AAAI Press: Menlo Park, CA, USA, 1991; pp. 482–487.
56. Mitchell, M. Computation in Cellular Automata: A Selected Review. In Non-Standard Computation; Gramß, T., Bornholdt, S., Groß, M., Mitchell, M., Pellizzari, T., Eds.; Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, Germany, 1998; pp. 95–140.
57. Woodward, J. Causation in biology: Stability, specificity, and the choice of levels of explanation. Biol. Philos. **2010**, 25, 287–318.
58. Datta, A.; Garg, D.; Kaynar, D.; Sharma, D. Tracing Actual Causes; Technical Report; 2016. Available online: https://apps.dtic.mil/dtic/tr/fulltext/u2/1025704.pdf (accessed on 10 February 2019).
59. Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing properties of neural networks. arXiv **2013**, arXiv:1312.6199.
60. Datta, A.; Garg, D.; Kaynar, D.; Sharma, D.; Sinha, A. Program Actions as Actual Causes: A Building Block for Accountability. In Proceedings of the 2015 IEEE 28th Computer Security Foundations Symposium, Verona, Italy, 13–17 July 2015; pp. 261–275.
61. Economist. For Artificial Intelligence to Thrive, It Must Explain Itself. 2018. Available online: https://www.economist.com/science-andtechnology2018/02/15/for-artificial-intelligence-to-thrive-it-must-explain-itself (accessed on 10 February 2019).
62. Knight, W. The dark art at the heart of AI. MIT Technol. Rev. **2017**, 120, 55–63.
63. Damasio, A.R.; Damasio, H. Neurobiology of Decision-Making; Research and Perspectives in Neurosciences; Springer: Berlin/Heidelberg, Germany, 2012.
64. Haggard, P. Human volition: Towards a neuroscience of will. Nat. Rev. Neurosci. **2008**, 9, 934–946.
65. Tononi, G. On the Irreducibility of Consciousness and Its Relevance to Free Will; Springer: New York, NY, USA, 2013; pp. 147–176.
66. Marshall, W.; Kim, H.; Walker, S.I.; Tononi, G.; Albantakis, L. How causal analysis can reveal autonomy in models of biological systems. Philos. Trans. R. Soc. A **2017**, 375, 20160358.

**Figure 1.** Realization: Dynamical causal network and transition. (**A**) A discrete dynamical system constituted of two interacting elements: an OR- and an AND-logic gate, which are updated synchronously at every time step, according to their input-output functions. Arrows denote connections between the elements. (**B**) The same system can be represented as a dynamical causal network over consecutive time steps. (**C**) The system described by its entire set of transition probabilities. As this particular system is deterministic, all transitions have a probability of either $p=0$ or $p=1$. (**D**) A realization of a system transient over two time steps, consistent with the system’s transition probabilities: $\{{(\mathrm{OR},\mathrm{AND})}_{t-1}=10\}\prec \{{(\mathrm{OR},\mathrm{AND})}_{t}=10\}$.

**Figure 2.** Assessing cause and effect repertoires. (**A**) Example effect repertoires, indicating how the occurrence $\{{\mathrm{OR}}_{t-1}=1\}$ constrains the future states of ${\mathrm{OR}}_{t}$ (left) and ${(\mathrm{OR},\mathrm{AND})}_{t}$ (right) in the causal network shown in Figure 1. (**B**) Example cause repertoires, indicating how the occurrences $\{{\mathrm{OR}}_{t}=1\}$ (left) and $\{{(\mathrm{OR},\mathrm{AND})}_{t}=10\}$ (right) constrain the past states of ${\mathrm{OR}}_{t-1}$. Throughout the manuscript, filled circles denote occurrences, while open circles denote candidate causes and effects. Green shading is used for t, blue for $t-1$. Nodes that are not included in the occurrence or candidate cause/effect are causally marginalized.

**Figure 3.** Partitioning the repertoire $\pi ({Y}_{t}\mid {x}_{t-1})$. (**A**) The set of all possible partitions of an occurrence, $\mathsf{\Psi}({x}_{t-1},{Y}_{t})$, includes all partitions of ${x}_{t-1}$ into $2\le m\le |{x}_{t-1}|$ parts, according to Equation (7), as well as the special case $\psi =\left\{({x}_{t-1},\varnothing)\right\}$. Considering this special case a potential partition has the added benefit of allowing us to treat singleton occurrences and multi-variate occurrences in a common framework. (**B**) Except for the special case when the occurrence is completely cut from the nodes it constrains, we generally do not consider cases with $m=1$ as partitions of the occurrence. The partition must eliminate the possibility of joint constraints of ${x}_{t-1}$ onto ${Y}_{t}$. The set of all partitions $\mathsf{\Psi}({X}_{t-1},{y}_{t})$ of a cause repertoire $\pi ({X}_{t-1}\mid {y}_{t})$ includes all partitions of ${y}_{t}$ into $2\le m\le |{y}_{t}|$ parts, according to Equation (9), and, again, the special case of $\psi =\left\{(\varnothing,{y}_{t})\right\}$ for $m=1$.

**Figure 4.** Considering the power set of occurrences. All subsets ${x}_{t-1}\subseteq {v}_{t-1}$ and ${y}_{t}\subseteq {v}_{t}$ are considered as occurrences which may have an actual effect or an actual cause.

**Figure 5.** Assessing the cause and effect information, their irreducibility (integration), and the maximum cause/effect (exclusion). (**A**,**B**) Example effect and cause information. The state that actually occurred is selected from the effect or cause repertoire (green is used for effects, blue for causes). Its probability is compared to the probability of the same state when unconstrained (overlaid distributions without fill). All repertoires are based on product probabilities, $\pi $ (Equations (3) and (4)), that discount correlations due to common inputs when variables are causally marginalized. For example, $\pi \left(\{{(\mathrm{OR},\mathrm{AND})}_{t}=01\}\right)>0$ in (A, right panel), although $p\left(\{{(\mathrm{OR},\mathrm{AND})}_{t}=01\}\right)=0$. (**C**,**D**) Integrated effect and cause information. The probability of the actual state in the effect or cause repertoire is compared against its probability in the partitioned effect or cause repertoire (overlaid distributions without fill). Of all second-order occurrences shown, only $\{{(\mathrm{OR},\mathrm{AND})}_{t}=10\}$ irreducibly constrains $\{{(\mathrm{OR},\mathrm{AND})}_{t-1}=10\}$. For first-order occurrences, ${\alpha}_{c/e}={\rho}_{c/e}$ (see text). Maximum values are highlighted in bold. If, as in panel (**B**), a superset of a candidate cause or effect specifies the same maximum value, it is excluded by a minimality condition.

**Figure 6.** Causal Account. There are two first-order occurrences with actual effects and actual causes. In addition, the second-order occurrence $\{{(\mathrm{OR},\mathrm{AND})}_{t}=10\}$ has an actual cause $\{{(\mathrm{OR},\mathrm{AND})}_{t-1}=10\}$.

**Figure 7.** Four dynamically identical transitions can have different causal accounts. Shown are the transitions (top) and their respective causal accounts (bottom).

**Figure 8.** A linear threshold unit with four inputs and threshold $k=3$ (Majority gate). (**A**) All inputs are considered relevant variables. (**B**) The case $D=0$ is taken as a fixed background condition (indicated by the red pin).

**Figure 9.** Disjunction of two conjunctions $(A\wedge B)\vee C$. (**A**) All inputs to D are considered relevant variables. (**B**) $B=0$ is taken as a fixed background condition.

**Figure 12.** Probabilistic variables. While the transition shown in (**A**) does have a deterministic equivalent, the transition shown in (**B**) would be impossible in the deterministic case.

**Figure 13.** Simple classifier. D is a “dot-detector”, S is a “segment-detector”, and L is a “line-detector” (see text).

**Figure 14.** Causal conditioning and marginalizing. (**A**) Variables outside the transition of interest are treated as fixed background conditions (indicated by the red pins). The transition probabilities $p\left({v}_{t}\mid {v}_{t-1}\right)$ are conditioned on the state of these background variables. (**B**) When evaluating the strength of a causal link within the transition, the remaining variables in ${G}_{u}$ outside the causal link are causally marginalized; that is, they are replaced by an average across all their possible states. With B marginalized, the state of A by itself does not determine and is not determined by the occurrence $\{\mathrm{XOR}=1\}$.

**Figure 15.** Composition: High-order occurrences. (**A**) Double Bi-conditional: Transition and causal account. (**B**) Cause repertoires corresponding to the two first-order and one second-order occurrences with actual causes (see text).

**Figure 16.** Integration: Irreducible versus reducible occurrences. (**A**) Transition and causal account of Figure 6. (**B**) The second-order occurrence $\{(\mathrm{OR},\mathrm{AND})=10\}$ with actual cause $\{AB=10\}$ is irreducible under the MIP. (**C**) Reducible transition with equivalent first-order causal links, but missing the second-order causal link present in (**A**). (**D**) The constraints specified by the second-order occurrence $\{(\mathrm{OR},\mathrm{AND})=10\}$ here are the same as, and thus reducible to, those under the MIP.

**Figure 17.** Exclusion: Any occurrence can, at most, have one actual cause or effect. (**A**) Out of the three candidate causes $\{A=1\}$, $\{B=1\}$, and $\{AB=11\}$, the actual cause of $\{\mathrm{AND}=1\}$ is the high-order occurrence $\{AB=11\}$, with ${\alpha}_{c}={\alpha}_{c}^{\mathrm{max}}=2.0$ bits. (**B**) Out of the three candidate effects, $\{\mathrm{AND}=1\}$, $\{\mathrm{XOR}=1\}$, and $\{(\mathrm{AND},\mathrm{XOR})=11\}$, the actual effect of $\{A=1\}$ is the first-order occurrence $\{\mathrm{AND}=1\}$, with ${\alpha}_{e}={\alpha}_{e}^{\mathrm{max}}=1.0$ bit; $\{(\mathrm{AND},\mathrm{XOR})=11\}$ is excluded by the minimality condition (Definition 2).