# Testability of Instrumental Variables in Linear Non-Gaussian Acyclic Causal Models


## Abstract


## 1. Introduction

The main contributions of this paper are as follows:

1. We propose a necessary condition for detecting variables that cannot serve as (conditional) IVs, based on the so-called generalized independent noise (GIN) condition [24]; we call it the instrumental variable generalized independent noise (IV-GIN) condition. We characterize the graphical implications of the IV-GIN condition in linear non-Gaussian acyclic causal models.
2. We further show whether and how the graphical criteria of an instrumental variable can be checked by exploiting IV-GIN conditions.
3. We develop a method to select the set of candidate IVs for the target causal influence $X\to Y$ from observational data by IV-GIN conditions.
4. We demonstrate the efficacy of our algorithm on both synthetic and real-world data.

## 2. Related Work

#### 2.1. Instrumental Variable Models

#### 2.2. Causal Graphical Models

## 3. Preliminaries

#### 3.1. Notation and Graph Terminology

A **path** is a sequence of nodes $\{{V}_{1},\dots ,{V}_{r}\}$ such that ${V}_{i}$ and ${V}_{i+1}$ are adjacent in G, where $1\le i<r$. Furthermore, if the edge between ${V}_{i}$ and ${V}_{i+1}$ has its arrow pointing to ${V}_{i+1}$ for $i=1,2,\dots ,r-1$, we say that the path is **directed** from ${V}_{1}$ to ${V}_{r}$. A **collider** on a path $\{{V}_{1},\dots ,{V}_{p}\}$ is a node ${V}_{i}$, $1<i<p$, such that ${V}_{i-1}$ and ${V}_{i+1}$ are parents of ${V}_{i}$. We say a path is **active** if it can be traced without traversing a collider. A **trek** between ${V}_{i}$ and ${V}_{j}$ is a path that does not contain any colliders in G. The sets of all parents and children of ${V}_{i}$ are denoted by $\mathbf{Pa}({V}_{i})$ and $\mathbf{Ch}({V}_{i})$, respectively. For a set $\mathbf{O}$, $|\mathbf{O}|$ denotes the number of elements of $\mathbf{O}$. Other commonly used concepts in graphical models, such as d-separation, can be found in [4,7].

#### 3.2. Instrumental Variable Model

**Definition 1.** Given a causal graph G containing the edge $X\to Y$, a variable Z is a valid IV conditioning on a set $\mathbf{W}$ relative to $X\to Y$ if:

1. $\mathbf{W}$ contains only nondescendants of Y in G;
2. $\mathbf{W}$ d-separates Z from Y in the graph obtained by removing the edge $X\to Y$ from G;
3. $\mathbf{W}$ does not d-separate Z from X in G.
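Definition 1 is purely graphical, so it can be checked directly whenever the ground-truth graph is known (which is exactly what purely observational methods lack). The following is a minimal sketch, not from the paper: it implements d-separation with a hand-rolled Bayes-ball-style reachability test and checks the definition on the two subgraphs of Section 4.1; all function names are ours.

```python
from collections import deque

def build_maps(edges):
    """Adjacency maps for a DAG given as a list of (parent, child) edges."""
    parents, children = {}, {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
        parents.setdefault(b, set()).add(a)
    return parents, children

def descendants(children, v):
    seen, stack = set(), [v]
    while stack:
        for c in children.get(stack.pop(), ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def ancestors(parents, nodes):
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(edges, x, y, given):
    """True iff `given` blocks every active trail between x and y
    (x and y assumed not in `given`).  State (v, d): d == 'in' means we
    arrived at v along an edge pointing into v, 'out' along an edge
    pointing out of v; the start node is unconstrained."""
    parents, children = build_maps(edges)
    Z = set(given)
    anc_z = ancestors(parents, Z)
    visited, frontier = set(), deque([(x, "out")])
    while frontier:
        v, d = frontier.popleft()
        if (v, d) in visited:
            continue
        visited.add((v, d))
        if v == y:
            return False  # an active trail reaches y
        if d == "out" and v not in Z:
            for p in parents.get(v, ()):
                frontier.append((p, "out"))
            for c in children.get(v, ()):
                frontier.append((c, "in"))
        elif d == "in":
            if v not in Z:       # non-collider: pass through to children
                for c in children.get(v, ()):
                    frontier.append((c, "in"))
            if v in anc_z:       # collider: open when v is in/above `given`
                for p in parents.get(v, ()):
                    frontier.append((p, "out"))
    return True

def is_valid_conditional_iv(edges, Z, X, Y, W=()):
    """Definition 1, checked against a known ground-truth graph."""
    parents, children = build_maps(edges)
    W = set(W)
    if W & descendants(children, Y):           # condition 1
        return False
    cut = [e for e in edges if e != (X, Y)]    # remove the edge X -> Y
    if not d_separated(cut, Z, Y, W):          # condition 2
        return False
    return not d_separated(edges, Z, X, W)     # condition 3

# Subgraphs (a) and (b) from Section 4.1 (U1 is latent but known here).
g_a = [("U1", "X"), ("U1", "Y"), ("Z", "X"), ("X", "Y")]
g_b = g_a + [("U1", "Z")]
print(is_valid_conditional_iv(g_a, "Z", "X", "Y"))  # True
print(is_valid_conditional_iv(g_b, "Z", "X", "Y"))  # False
```

In subgraph (b) the check fails because of the unblockable path $Z\leftarrow U_1\to Y$; conditioning on the latent $U_1$ would restore validity, but $U_1$ is unobserved, which is precisely why a data-driven test is needed.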

**Definition 2.**

#### 3.3. Problem Setup

## 4. Necessary Condition for Instrumental Variable

#### 4.1. A Motivating Example

- Subgraph (a): ${U}_{1}={\epsilon}_{{U}_{1}}$, $Z={\epsilon}_{Z}$, $X=2Z+0.5{U}_{1}+{\epsilon}_{X}$, and $Y=1X+2{U}_{1}+{\epsilon}_{Y}$;
- Subgraph (b): ${U}_{1}={\epsilon}_{{U}_{1}}$, $Z=1{U}_{1}+{\epsilon}_{Z}$, $X=2Z+0.5{U}_{1}+{\epsilon}_{X}$, and $Y=1X+2{U}_{1}+{\epsilon}_{Y}$.

- Gaussian Case: All noise terms in subgraphs (a) and (b) are generated from the standard Gaussian distribution.
- Uniform Case: All noise terms in subgraphs (a) and (b) are generated from the uniform distribution over the interval $[0,1]$.
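The two cases can be contrasted with a small simulation (our own illustration, not the paper's code). It forms the surrogate variable $E = Y-\frac{\sigma_{YZ}}{\sigma_{XZ}}X$ and probes independence between Z and E with a crude squared-correlation statistic; Z and E are *uncorrelated* by construction in both subgraphs, so in the Gaussian case both would look alike, while under uniform (non-Gaussian) noise the invalid IV of subgraph (b) shows clear dependence.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
# Independent uniform noise terms (the "Uniform Case" above).
eU, eZ, eX, eY = rng.uniform(0.0, 1.0, size=(4, n))

def surrogate(Z, X, Y):
    # E = Y - (sigma_YZ / sigma_XZ) X; uncorrelated with Z by construction.
    r = np.cov(Y, Z)[0, 1] / np.cov(X, Z)[0, 1]
    return Y - r * X

def sq_corr(a, b):
    # Correlation of squared centered variables: ~0 under independence,
    # generically nonzero for dependent non-Gaussian pairs.
    return np.corrcoef((a - a.mean()) ** 2, (b - b.mean()) ** 2)[0, 1]

U1 = eU
# Subgraph (a): Z is a valid IV.
Za = eZ
Xa = 2 * Za + 0.5 * U1 + eX
Ya = 1 * Xa + 2 * U1 + eY
# Subgraph (b): Z is confounded with Y through U1, hence invalid.
Zb = 1 * U1 + eZ
Xb = 2 * Zb + 0.5 * U1 + eX
Yb = 1 * Xb + 2 * U1 + eY

c_valid = sq_corr(Za, surrogate(Za, Xa, Ya))
c_invalid = sq_corr(Zb, surrogate(Zb, Xb, Yb))
print(f"valid IV:   |sq_corr| = {abs(c_valid):.3f}")    # near 0
print(f"invalid IV: |sq_corr| = {abs(c_invalid):.3f}")  # clearly nonzero
```

The squared-correlation statistic is only a cheap stand-in for the kernel-based independence tests [50] used in practice, but it suffices to separate the two subgraphs here.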

#### 4.2. IV-GIN Condition for Instrumental Variable

**Definition 3.**

**Theorem 1 (Darmois–Skitovitch Theorem).** Let ${X}_{1}={\sum }_{i=1}^{n}{a}_{i}{e}_{i}$ and ${X}_{2}={\sum }_{i=1}^{n}{b}_{i}{e}_{i}$, where ${e}_{1},\dots ,{e}_{n}$ are independent random variables and ${a}_{i},{b}_{i}$ are constant coefficients. If ${X}_{1}$ and ${X}_{2}$ are statistically independent, then every ${e}_{i}$ for which ${a}_{i}{b}_{i}\neq 0$ is Gaussian.

**Theorem 2 (Necessary Condition for IV).** Let G be a linear non-Gaussian acyclic causal model. Let the treatment X, the outcome Y, Z, and $\mathbf{W}$ be correlated random variables in G. Assume that faithfulness holds. If Z is a valid IV conditioning on $\mathbf{W}$ relative to $X\to Y$ in G, then $(\{Z,\mathbf{W}\},\{X,Y,\mathbf{W}\})$ follows the GIN condition.

**Example 1.**

#### 4.3. Graphical Implications of the IV-GIN Condition in Linear Non-Gaussian Causal Models

**Theorem 3.**

1. There exists a node $C\in \mathbf{V}$, $C\notin \mathbf{W}$, such that for every trek π between a node ${V}_{p}\in \{X,Y,\mathbf{W}\}$ and a node ${V}_{q}\in \{Z,\mathbf{W}\}$, (a) π goes through at least one node in $\{C,\mathbf{W}\}$, denoted by ${V}_{k}$, and (b) ${V}_{k}$ has its arrow pointing to ${V}_{p}$ in π (in other words, ${V}_{k}$ is causally earlier, according to the causal order, than ${V}_{p}$ on π).
2. There is at least one directed path between any one node in $\{C,\mathbf{W}\}$ and any one node in $\{X,Y\}$.
3. No proper subset $\tilde{\mathbf{W}}$ of $\mathbf{W}$ satisfies conditions 1 and 2.

**Example 2.**

## 5. Testability of Instrument Criteria Validity in Terms of IV-GIN Conditions

#### 5.1. Condition 1 of Instrument Criteria

**Proposition 1.**

**Example 3.**

#### 5.2. Condition 2 of Instrument Criteria

- 2a. There is no active nondirected path between Z and Y that does not include X;
- 2b. There is no active directed path from Z to Y that does not include X.

#### 5.2.1. Subcondition 2a

**Proposition 2.**

**Example 4.**

**Example 5.**

#### 5.2.2. Subcondition 2b

## 6. Algorithm for Selecting the Candidate IVs

**Algorithm 1:** IV-GIN

```
Input:  treatment X, outcome Y, and the set of observed variables O.
Output: set of candidate IVs C and its corresponding conditioning sets Conset.
 1: Initialize the set of candidate IVs C = ∅, the conditioning sets
    Conset = ∅, the conditioning-set length ConsetLen = 0, and Tag = O;
 2: while ConsetLen < |Tag| do
 3:   for each variable Z_i ∈ Tag do
 4:     repeat
 5:       select a subset W from O \ {Z_i} such that |W| = ConsetLen;
 6:       if ({Z_i, W}, {X, Y, W}) follows the GIN condition then
 7:         add Z_i to C, and delete Z_i from Tag;
 8:         set Conset(Z_i) = W;
 9:         break the repeat loop of line 4;
10:       end if
11:     until all subsets of length ConsetLen in O \ {Z_i} have been selected;
12:   end for
13:   ConsetLen = ConsetLen + 1;
14: end while
15: return C and Conset
```
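A compact sketch of this search, under assumptions of ours: the GIN condition is tested by taking ω in the left null space of the cross-covariance matrix (via SVD) and proxying independence with a squared-correlation statistic at an arbitrary threshold of 0.05; a serious implementation would use kernel independence tests [50]. Variable names are ours, not the paper's.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

def gin_holds(data, Zset, Yset, thresh=0.05):
    """Test whether (Zset, Yset) follows the GIN condition: some omega with
    omega^T E[Y Z^T] = 0 yields E = omega^T Y independent of every Z."""
    Zc = np.stack([data[v] for v in Zset])
    Yc = np.stack([data[v] for v in Yset])
    Zc = Zc - Zc.mean(axis=1, keepdims=True)
    Yc = Yc - Yc.mean(axis=1, keepdims=True)
    C = Yc @ Zc.T / Zc.shape[1]          # cross-covariance, |Yset| x |Zset|
    U, _, _ = np.linalg.svd(C)
    E = U[:, -1] @ Yc                    # omega = smallest left singular vector
    E2 = (E - E.mean()) ** 2
    # Crude independence proxy: correlation of squared centered variables.
    return all(abs(np.corrcoef(z ** 2, E2)[0, 1]) <= thresh for z in Zc)

def select_candidate_ivs(data, X, Y, O):
    """Algorithm 1 (IV-GIN): find candidate IVs and minimal conditioning sets."""
    C, Conset, tag, L = [], {}, list(O), 0
    while L < len(tag):
        for Zi in list(tag):
            for W in itertools.combinations([v for v in O if v != Zi], L):
                if gin_holds(data, [Zi, *W], [X, Y, *W]):
                    C.append(Zi)
                    Conset[Zi] = W
                    tag.remove(Zi)
                    break
        L += 1
    return C, Conset

# Demo data: Z2 is a valid IV, Z1 is confounded with Y through latent U1.
n = 300_000
eU, e1, e2, eX, eY = rng.exponential(1.0, size=(5, n))
U1 = eU
Z1 = U1 + e1
Z2 = e2
X = 2 * Z2 + 1 * Z1 + 0.5 * U1 + eX
Y = 1 * X + 2 * U1 + eY
data = {"Z1": Z1, "Z2": Z2, "X": X, "Y": Y}

C, Conset = select_candidate_ivs(data, "X", "Y", ["Z1", "Z2"])
print(C, Conset)
```

On this seeded draw, Z2 is selected with an empty conditioning set and Z1 is rejected, matching the ground truth.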

**Theorem 4.**

## 7. Experiments on Synthetic Data

**Comparisons:** We compare with two state-of-the-art methods: the sisVIVE algorithm [20], which requires more than half of the candidate variables to be valid IVs, and the IV-TETRAD algorithm [21], which requires two or more variables to be valid IVs. (Here, we adopt the two functions TestTetrad and TestResiduals to select IVs in the IV-TETRAD algorithm.) The source code of sisVIVE and IV-TETRAD is available from https://mirrors.sjtug.sjtu.edu.cn/cran/web/packages/sisVIVE/index.html (accessed on 20 January 2022) and http://www.homepages.ucl.ac.uk/~ucgtrbd/code/iv_discovery/ (accessed on 20 January 2022), respectively.

**Scenarios:** We designed three scenarios, as shown in Figure 9, where X is the treatment, Y is the outcome, the variables ${U}_{i}$ ($i=1,2$) are unobserved, and ${Z}_{j}$ ($j=1,\dots ,4$) are potential IVs. For scenarios ${S}_{1}$ and ${S}_{2}$, nodes ${Z}_{2}$ and ${Z}_{3}$ are both valid IVs conditioning on an empty set relative to $X\to Y$, and node ${Z}_{1}$ is an invalid IV due to the path ${Z}_{1}\leftarrow {U}_{1}\to Y$. The key difference between scenarios ${S}_{1}$ and ${S}_{2}$ is that there is an active nondirected path between ${Z}_{3}$ and X in ${S}_{2}$ but not in ${S}_{1}$. For scenario ${S}_{3}$, ${Z}_{1}$ is a valid IV conditioning on ${Z}_{3}$ relative to $X\to Y$, ${Z}_{2}$ is a valid IV conditioning on an empty set relative to $X\to Y$, ${Z}_{3}$ is an invalid IV due to the paths ${Z}_{3}\to Y$ and ${Z}_{3}\leftarrow {U}_{1}\to Y$, and ${Z}_{4}$ is an invalid IV due to the path $X\to {Z}_{4}\leftarrow Y$.

**Metrics:**To evaluate the accuracy of the selected IVs, we used the following two metrics:

- Correct-selecting rate: The number of correctly selected valid IVs divided by the total number of valid IVs in the ground-truth graph.
- Selection commission: The number of falsely detected IVs divided by the total number of selected IVs in the output $\mathbf{C}$ of the current algorithm.
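Both metrics can be computed directly from the selected and ground-truth IV sets (helper names are ours, not the paper's):

```python
def correct_selecting_rate(selected, valid):
    """Correctly selected valid IVs / number of valid IVs in the ground truth."""
    return len(set(selected) & set(valid)) / len(set(valid))

def selection_commission(selected, valid):
    """Falsely selected IVs / number of selected IVs (0.0 if nothing selected)."""
    if not selected:
        return 0.0
    return len(set(selected) - set(valid)) / len(set(selected))

print(correct_selecting_rate(["Z2"], ["Z2", "Z3"]))  # 0.5
print(selection_commission(["Z1", "Z2"], ["Z2"]))    # 0.5
```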

**Experimental setup:** We generated data from a linear non-Gaussian acyclic causal model according to the three scenarios above. In detail, each causal strength ${b}_{ij}$ was drawn uniformly from $[-2,-0.5]\cup [0.5,2]$, and the non-Gaussian noise terms were generated as the second power of exponentially distributed variables. We conducted experiments with the following tasks:

- T1. Sensitivity to sample size. We considered different sample sizes $N=1k,3k,5k$, where k = 1000.
- T2. Sensitivity to the strength of unmeasured confounding between X and Y. The coefficients between $\{X,Y\}$ and ${U}_{1}$ are set such that ${b}_{X{U}_{1}}={b}_{Y{U}_{1}}=\lambda $, at two levels, $0.125$ and $0.25$, as in [21]. The sample size N is 5000.
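The coefficient and noise sampling described above can be sketched as follows (the seed and helper names are ours):

```python
import numpy as np

rng = np.random.default_rng(7)

def causal_strength(size):
    # Uniform over [-2, -0.5] ∪ [0.5, 2]: a random sign times a magnitude.
    return rng.choice([-1.0, 1.0], size) * rng.uniform(0.5, 2.0, size)

def noise(n):
    # Second power of an exponential variate: strongly skewed, non-Gaussian.
    return rng.exponential(1.0, n) ** 2

b = causal_strength(1000)
e = noise(1000)
```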

**Results on Task T1:** The experimental results are reported in Table 1. From the table, we can see that our proposed IV-GIN outperforms the other methods on both evaluation metrics in all three scenarios and at all sample sizes, indicating that the testability of our IV-GIN condition is wider than that of the other algorithms in linear non-Gaussian causal models. The IV-TETRAD algorithm does not perform well, especially in scenarios ${S}_{2}$ and ${S}_{3}$, indicating that it fails when there is an active nondirected path between a valid IV and the treatment X (scenario ${S}_{2}$) or when only a single valid IV is present (scenario ${S}_{3}$). We further noticed that the sisVIVE algorithm does not perform well in scenario ${S}_{3}$; this is because fewer than half of the variables are valid IVs conditioning on the same set in that scenario.

**Results on Task T2:** The experimental results are reported in Table 2. It is worth noting that stronger confounding makes it more difficult to select valid IVs. From the table, IV-GIN performs better than the other methods across confounding strengths in almost all scenarios, indicating that our IV-GIN condition is more effective than the competing algorithms. Although the correct-selecting rate of sisVIVE is higher than that of IV-GIN in scenario ${S}_{1}$ when $\lambda =0.25$, the selection commission of IV-GIN is lower than that of sisVIVE (lower is better for selection commission).

## 8. Application to Vitamin D Data

## 9. Discussion

## 10. Conclusions and Further Work

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A. Proofs

**Theorem**

**A1.**

**Proof.**

#### Appendix A.1. Proof of Theorem 3

**Proof.**

#### Appendix A.2. Proof of Theorem 2

**Proof.**

#### Appendix A.3. Proof of Proposition 1

**Proof.**

#### Appendix A.4. Proof of Proposition 2

**Proof.**

#### Appendix A.5. Proof of Theorem 4

**Proof.**

## References

1. Wright, P.G. Tariff on Animal and Vegetable Oils; Macmillan Company: New York, NY, USA, 1928.
2. Goldberger, A.S. Structural equation methods in the social sciences. Econom. J. Econom. Soc. **1972**, 40, 979–1001.
3. Bowden, R.J.; Turkington, D.A. Instrumental Variables; Number 8; Cambridge University Press: Cambridge, UK, 1990.
4. Pearl, J. Causality: Models, Reasoning, and Inference, 2nd ed.; Cambridge University Press: New York, NY, USA, 2009.
5. Imbens, G.W. Instrumental Variables: An Econometrician’s Perspective. Stat. Sci. **2014**, 29, 323–358.
6. Imbens, G.W.; Rubin, D.B. Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction; Cambridge University Press: Cambridge, UK, 2015.
7. Spirtes, P.; Glymour, C.; Scheines, R. Causation, Prediction, and Search; MIT Press: Cambridge, MA, USA, 2000.
8. Hernán, M.A.; Robins, J.M. Instruments for causal inference: An epidemiologist’s dream? Epidemiology **2006**, 17, 360–372.
9. Baiocchi, M.; Cheng, J.; Small, D.S. Instrumental variable methods for causal inference. Stat. Med. **2014**, 33, 2297–2340.
10. Bound, J.; Jaeger, D.A.; Baker, R.M. Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. J. Am. Stat. Assoc. **1995**, 90, 443–450.
11. Pearl, J. On the testability of causal models with latent and instrumental variables. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1995; pp. 435–443.
12. Manski, C.F. Partial Identification of Probability Distributions; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2003.
13. Palmer, T.M.; Ramsahai, R.R.; Didelez, V.; Sheehan, N.A. Nonparametric bounds for the causal effect in a binary instrumental-variable model. Stata J. **2011**, 11, 345–367.
14. Kitagawa, T. A test for instrument validity. Econometrica **2015**, 83, 2043–2063.
15. Wang, L.; Robins, J.M.; Richardson, T.S. On falsification of the binary instrumental variable model. Biometrika **2017**, 104, 229–236.
16. Kédagni, D.; Mourifié, I. Generalized instrumental inequalities: Testing the instrumental variable independence assumption. Biometrika **2020**, 107, 661–675.
17. Gunsilius, F.F. Nontestability of instrument validity under continuous treatments. Biometrika **2021**, 108, 989–995.
18. Kuroki, M.; Cai, Z. Instrumental variable tests for Directed Acyclic Graph Models. In Proceedings of the International Workshop on Artificial Intelligence and Statistics, Bridgetown, Barbados, 6–8 January 2005; pp. 190–197.
19. Spearman, C. Pearson’s contribution to the theory of two factors. Br. J. Psychol. **1928**, 19, 95–101.
20. Kang, H.; Zhang, A.; Cai, T.T.; Small, D.S. Instrumental variables estimation with some invalid instruments and its application to Mendelian randomization. J. Am. Stat. Assoc. **2016**, 111, 132–144.
21. Silva, R.; Shimizu, S. Learning instrumental variables with structural and non-gaussianity assumptions. J. Mach. Learn. Res. **2017**, 18, 1–49.
22. Sullivant, S.; Talaska, K.; Draisma, J. Trek separation for Gaussian graphical models. Ann. Stat. **2010**, 38, 1665–1685.
23. Spirtes, P. Calculation of Entailed Rank Constraints in Partially Non-linear and Cyclic Models. In Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence; AUAI Press: Arlington, VA, USA, 2013; pp. 606–615.
24. Xie, F.; Cai, R.; Huang, B.; Glymour, C.; Hao, Z.; Zhang, K. Generalized Independent Noise Condition for Estimating Latent Variable Causal Graphs. In Proceedings of the Advances in Neural Information Processing Systems, Virtual, 6–12 December 2020; pp. 14891–14902.
25. Choi, M.J.; Tan, V.Y.; Anandkumar, A.; Willsky, A.S. Learning latent tree graphical models. J. Mach. Learn. Res. **2011**, 12, 1771–1812.
26. Chandrasekaran, V.; Parrilo, P.A.; Willsky, A.S. Latent variable graphical model selection via convex optimization. In Proceedings of the 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 29 September–1 October 2010; pp. 1935–1967.
27. Meng, Z.; Eriksson, B.; Hero, A. Learning latent variable Gaussian graphical models. In Proceedings of the International Conference on Machine Learning, Beijing, China, 21–26 June 2014; pp. 1269–1277.
28. Zorzi, M.; Sepulchre, R. AR identification of latent-variable graphical models. IEEE Trans. Autom. Control **2015**, 61, 2327–2340.
29. Wu, C.; Zhao, H.; Fang, H.; Deng, M. Graphical model selection with latent variables. Electron. J. Stat. **2017**, 11, 3485–3521.
30. Kumar, S.; Ying, J.; de Miranda Cardoso, J.V.; Palomar, D.P. A Unified Framework for Structured Graph Learning via Spectral Constraints. J. Mach. Learn. Res. **2020**, 21, 1–60.
31. Ciccone, V.; Ferrante, A.; Zorzi, M. Learning latent variable dynamic graphical models by confidence sets selection. IEEE Trans. Autom. Control **2020**, 65, 5130–5143.
32. Alpago, D.; Zorzi, M.; Ferrante, A. A scalable strategy for the identification of latent-variable graphical models. IEEE Trans. Autom. Control **2021**.
33. Bertsimas, D.; Cory-Wright, R.; Johnson, N.A. Sparse Plus Low Rank Matrix Decomposition: A Discrete Optimization Approach. arXiv **2021**, arXiv:2109.12701.
34. Spirtes, P.; Meek, C.; Richardson, T. Causal inference in the presence of latent variables and selection bias. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence; Morgan Kaufmann Publishers Inc.: Burlington, MA, USA, 1995; pp. 499–506.
35. Colombo, D.; Maathuis, M.H.; Kalisch, M.; Richardson, T.S. Learning high-dimensional directed acyclic graphs with latent and selection variables. Ann. Stat. **2012**, 40, 294–321.
36. Kitson, N.K.; Constantinou, A.C.; Guo, Z.; Liu, Y.; Chobtham, K. A survey of Bayesian Network structure learning. arXiv **2021**, arXiv:2109.11415.
37. Hoyer, P.O.; Shimizu, S.; Kerminen, A.J.; Palviainen, M. Estimation of causal effects using linear non-Gaussian causal models with hidden variables. Int. J. Approx. Reason. **2008**, 49, 362–378.
38. Entner, D.; Hoyer, P.O. Discovering unconfounded causal relationships using linear non-gaussian models. In JSAI International Symposium on Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2010; pp. 181–195.
39. Tashiro, T.; Shimizu, S.; Hyvärinen, A.; Washio, T. ParceLiNGAM: A causal ordering method robust against latent confounders. Neural Comput. **2014**, 26, 57–83.
40. Salehkaleybar, S.; Ghassami, A.; Kiyavash, N.; Zhang, K. Learning Linear Non-Gaussian Causal Models in the Presence of Latent Variables. J. Mach. Learn. Res. **2020**, 21, 1–24.
41. Ciccone, V.; Ferrante, A.; Zorzi, M. Robust identification of “sparse plus low-rank” graphical models: An optimization approach. In Proceedings of the 2018 IEEE Conference on Decision and Control (CDC), Miami, FL, USA, 17–19 December 2018; pp. 2241–2246.
42. Alpago, D.; Zorzi, M.; Ferrante, A. Identification of sparse reciprocal graphical models. IEEE Control Syst. Lett. **2018**, 2, 659–664.
43. Frot, B.; Nandy, P.; Maathuis, M.H. Robust causal structure learning with some hidden variables. J. R. Stat. Soc. Ser. B (Stat. Methodol.) **2019**, 81, 459–487.
44. Agrawal, R.; Squires, C.; Prasad, N.; Uhler, C. The DeCAMFounder: Non-Linear Causal Discovery in the Presence of Hidden Variables. arXiv **2021**, arXiv:2102.07921.
45. Brito, C.; Pearl, J. Generalized instrumental variables. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2002; pp. 85–93.
46. Bollen, K.A. Structural Equations with Latent Variables; John Wiley & Sons: Hoboken, NJ, USA, 1989.
47. Shimizu, S.; Hoyer, P.O.; Hyvärinen, A.; Kerminen, A. A linear non-Gaussian acyclic model for causal discovery. J. Mach. Learn. Res. **2006**, 7, 2003–2030.
48. Kagan, A.M.; Rao, C.R.; Linnik, Y.V. Characterization Problems in Mathematical Statistics; John Wiley: New York, NY, USA, 1973.
49. Fisher, R.A. Statistical Methods for Research Workers; Springer: Berlin/Heidelberg, Germany, 1950.
50. Zhang, Q.; Filippi, S.; Gretton, A.; Sejdinovic, D. Large-scale kernel methods for independence testing. Stat. Comput. **2018**, 28, 113–130.
51. Skaaby, T.; Husemoen, L.L.N.; Martinussen, T.; Thyssen, J.P.; Melgaard, M.; Thuesen, B.H.; Pisinger, C.; Jørgensen, T.; Johansen, J.D.; Menné, T.; et al. Vitamin D status, filaggrin genotype, and cardiovascular risk factors: A Mendelian randomization approach. PLoS ONE **2013**, 8, e57647.
52. Martinussen, T.; Nørbo Sørensen, D.; Vansteelandt, S. Instrumental variables estimation under a structural Cox model. Biostatistics **2019**, 20, 65–79.
53. Silva, R.; Shimizu, S. Learning Instrumental Variables with Non-Gaussianity Assumptions: Theoretical Limitations and Practical Algorithms. arXiv **2015**, arXiv:1511.02722.
54. Hyvärinen, A.; Karhunen, J.; Oja, E. Independent Component Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2004; Volume 46.
55. Hoyer, P.O.; Janzing, D.; Mooij, J.M.; Peters, J.; Schölkopf, B. Nonlinear causal discovery with additive noise models. In Advances in Neural Information Processing Systems; Curran Associates Inc.: Red Hook, NY, USA, 2009; pp. 689–696.
56. Zhang, K.; Hyvärinen, A. On the identifiability of the post-nonlinear causal model. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence; AUAI Press: Arlington, VA, USA, 2009; pp. 647–655.
57. Peters, J.; Mooij, J.M.; Janzing, D.; Schölkopf, B. Causal Discovery with Continuous Additive Noise Models. J. Mach. Learn. Res. **2014**, 15, 2009–2053.

**Figure 1.**A simple instrumental variable example where X is treatment, Y is outcome, and Z is an IV relative to $X\to Y$.

**Figure 2.**A typical instrumental variable model where X is treatment, Y is outcome, and Z is an IV conditioning on $\{{W}_{1},{W}_{2}\}$ relative to $X\to Y$.

**Figure 3.** (**a**) Z is a valid IV for the relation $X\to Y$; (**b**) Z is an invalid IV for the relation $X\to Y$.

**Figure 4.** Illustration of the fact that non-Gaussianity leads to dependence between an invalid IV Z and the surrogate variable $Y-\frac{{\sigma}_{YZ}}{{\sigma}_{XZ}}X$. (**a**) Scatter plot of a valid IV Z against the surrogate variable. (**b**) Scatter plot of an invalid IV Z against the surrogate variable.

**Figure 5.**Causal graph where Z is a valid IV conditioning on ${W}_{1}$ relative to $X\to Y$ but an invalid IV conditioning on ${W}_{2}$ relative to $X\to Y$.

**Figure 6.**Causal graph where Z is an invalid IV conditioning on ${W}_{1}$ relative to $X\to Y$ due to the nondirected path $Z\leftarrow {U}_{2}\to Y$.

**Figure 7.** Causal graph where Z is an invalid IV conditioning on an empty set relative to $X\to Y$ but $(\{Z\},\{Y,X\})$ follows the GIN condition.

**Figure 8.**Causal graph where Z is an invalid IV conditioning on an empty set relative to $X\to Y$ due to the directed path $Z\to Y$.

**Table 1.**Performance of IV-GIN, sisVIVE, and IV-TETRAD on selecting valid IVs with different sample sizes.

(CSR = Correct-Selecting Rate, higher is better ↑; SC = Selection Commission, lower is better ↓.)

| Scenario | N | CSR: IV-GIN (Ours) | CSR: sisVIVE | CSR: IV-TETRAD | SC: IV-GIN (Ours) | SC: sisVIVE | SC: IV-TETRAD |
|---|---|---|---|---|---|---|---|
| ${S}_{1}$ | 1k | 0.92 | 0.76 | 0.84 | 0.12 | 0.0 | 0.16 |
| ${S}_{1}$ | 3k | 0.95 | 0.81 | 0.96 | 0.03 | 0.0 | 0.04 |
| ${S}_{1}$ | 5k | 0.97 | 0.85 | 0.96 | 0.0 | 0.0 | 0.04 |
| ${S}_{2}$ | 1k | 0.9 | 0.92 | 0.03 | 0.03 | 0.08 | 0.0 |
| ${S}_{2}$ | 3k | 0.95 | 0.93 | 0.02 | 0.0 | 0.02 | 0.0 |
| ${S}_{2}$ | 5k | 1.0 | 0.94 | 0.0 | 0.0 | 0.0 | 0.0 |
| ${S}_{3}$ | 1k | 0.75 | 0.29 | 0.05 | 0.1 | 0.59 | 0.1 |
| ${S}_{3}$ | 3k | 0.86 | 0.2 | 0.02 | 0.05 | 0.7 | 0.05 |
| ${S}_{3}$ | 5k | 0.93 | 0.24 | 0.02 | 0.02 | 0.63 | 0.0 |

**Table 2.** Performance of IV-GIN, sisVIVE, and IV-TETRAD on selecting valid IVs under different strengths of unmeasured confounding between treatment and outcome.

(CSR = Correct-Selecting Rate, higher is better ↑; SC = Selection Commission, lower is better ↓.)

| Scenario | $\lambda$ | CSR: IV-GIN (Ours) | CSR: sisVIVE | CSR: IV-TETRAD | SC: IV-GIN (Ours) | SC: sisVIVE | SC: IV-TETRAD |
|---|---|---|---|---|---|---|---|
| ${S}_{1}$ | 0.125 | 0.96 | 0.83 | 0.92 | 0.06 | 0.01 | 0.08 |
| ${S}_{1}$ | 0.25 | 0.85 | 0.72 | 0.86 | 0.01 | 0.0 | 0.01 |
| ${S}_{2}$ | 0.125 | 0.98 | 0.93 | 0.02 | 0.04 | 0.06 | 0.0 |
| ${S}_{2}$ | 0.25 | 0.92 | 0.91 | 0.0 | 0.08 | 0.1 | 0.0 |
| ${S}_{3}$ | 0.125 | 0.89 | 0.22 | 0.05 | 0.03 | 0.58 | 0.02 |
| ${S}_{3}$ | 0.25 | 0.85 | 0.2 | 0.03 | 0.07 | 0.61 | 0.0 |

| Metrics | Scenario ${S}_{1}$ | Scenario ${S}_{2}$ | Scenario ${S}_{3}$ |
|---|---|---|---|
| Correct-selecting rate ↑ | 0.1 | 0.1 | 0.09 |
| Selection commission ↓ | 0.0 | 0.12 | 0.3 |

**Table 4.** Performance of IV-GIN on selecting valid IVs with sample size 5k where the locations of nodes X and Y are swapped.

| Metrics | Scenario ${S}_{1}$ | Scenario ${S}_{2}$ | Scenario ${S}_{3}$ |
|---|---|---|---|
| Correct-selecting rate ↑ | 0.96 | 1.0 | 0.92 |
| Selection commission ↓ | 0.01 | 0.0 | 0.04 |

**Table 5.**Summary of the testability results using the IV-GIN conditions presented in our paper and IV-TETRAD conditions presented in [21].

| Method | Condition 1 | Subcondition 2a | Subcondition 2b |
|---|---|---|---|
| IV-GIN (ours) | Fully | Partially | None |
| IV-TETRAD | None | Fully | None |


© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Xie, F.; He, Y.; Geng, Z.; Chen, Z.; Hou, R.; Zhang, K.
Testability of Instrumental Variables in Linear Non-Gaussian Acyclic Causal Models. *Entropy* **2022**, *24*, 512.
https://doi.org/10.3390/e24040512
