Optimization of Active Learning Strategies for Causal Network Structure
Abstract
1. Introduction
- We first carry out an a priori analysis of the proposed model, drawing on networks from a real Bayesian network library and on the theory and results of existing algorithms. We find that Bayesian networks and causal networks share a large number of structural properties, and we summarize the structural characteristics of the corresponding essential graphs from the networks provided by the library. This motivates carrying out the orientation of the causal graph within each chain component obtained by decomposing the essential graph, so that no experiment has to be repeated on the entire causal graph, which reduces the experimental cost.
- Traditional algorithms search for the optimal intervention node by iterating over the Markov equivalence class or over the complete causal graph structure; we do not follow this approach. Instead, we search for the optimal intervention node directly on the simpler chain components. We also introduce the graph-theoretic notion of the center of a tree, extend it to general undirected graphs, and define a generalized center for the intervention experiments: for each node of an undirected graph, its eccentricity is the length of the longest path from that node to any other node that repeats no edge, and the center of the graph is the node with the smallest eccentricity. If two nodes attain the minimum, then, following our a priori results, the one with more neighbors is chosen as the center of the whole undirected graph; this center is also the optimal intervention node found by the algorithm. On this basis, we propose a new active learning algorithm for causal network structures; a minimal sketch of the center computation follows this list.
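To make the definition concrete, the following is a minimal sketch of the generalized-center computation, assuming a chain component represented as a networkx.Graph (the library and the function names are illustrative choices, not taken from the paper). The exhaustive search for the longest edge-repetition-free path is exponential in the worst case, which is tolerable here only because chain components are typically small.

```python
# Minimal sketch of the generalized center defined above (illustrative names;
# the use of networkx is an assumption made for this example).
import networkx as nx

def eccentricity(graph, start):
    """Length (in edges) of the longest edge-repetition-free path from `start`,
    found by exhaustive DFS."""
    best = 0
    def dfs(v, used_edges, length):
        nonlocal best
        best = max(best, length)
        for w in graph.neighbors(v):
            edge = frozenset((v, w))
            if edge not in used_edges:
                dfs(w, used_edges | {edge}, length + 1)
    dfs(start, frozenset(), 0)
    return best

def generalized_center(component):
    """Node with the smallest eccentricity; ties are broken in favour of the
    node with more neighbours, as described above."""
    ecc = {v: eccentricity(component, v) for v in component.nodes}
    return min(component.nodes, key=lambda v: (ecc[v], -component.degree(v)))

# Example: in the tree 1 - 2 - 3 with extra edges 3 - 4 and 3 - 5, nodes 2 and 3
# both have eccentricity 2; the tie goes to node 3, which has three neighbours.
T = nx.Graph([(1, 2), (2, 3), (3, 4), (3, 5)])
print(generalized_center(T))   # 3
```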
2. Causal DAGs and Intervention
2.1. Causal Calculus
2.2. Active Learning and Intervention Calculus
3. Optimization Design of Stage Intervention Based on Central Point
3.1. Decomposition of Essential Graph
3.2. A Priori Analysis of Model
3.2.1. Priors Based on Real Bayesian Networks
3.2.2. A Priori Analysis Based on the Results of Existing Algorithms
- The essential graph of a causal network structure is a chain graph. When its directed edges are removed, the resulting connected chain components contain a large proportion of tree structures. The intervention experiments can therefore be carried out within each chain component, which simplifies the experimental process and reduces its cost (see the sketch after this list).
- From the conclusions of existing algorithms, it can be inferred that the optimal intervention target selected during structure learning is usually a special node of the causal graph: one that lies in a central position and has comparatively many neighbor nodes. We therefore propose an active learning algorithm for causal network structures based on the central node.
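As an illustration of this decomposition, and of how tree-like the resulting chain components are, here is a minimal sketch assuming the essential graph is supplied as separate directed and undirected edge lists and that networkx is available; the representation and function names are assumptions made for the example.

```python
# Minimal sketch of the chain-component decomposition discussed above
# (illustrative representation; not the paper's own implementation).
import networkx as nx

def chain_components(directed_edges, undirected_edges):
    """Drop the directed edges and return the connected components
    (chain components) of the remaining undirected skeleton."""
    skeleton = nx.Graph()
    skeleton.add_edges_from(undirected_edges)
    # Vertices touched only by directed edges become trivial singleton components.
    skeleton.add_nodes_from(v for e in directed_edges for v in e)
    return [skeleton.subgraph(c).copy() for c in nx.connected_components(skeleton)]

def is_tree(component):
    """A connected undirected graph is a tree iff it has |V| - 1 edges."""
    return component.number_of_edges() == component.number_of_nodes() - 1

# Example: a v-structure 1 -> 2 <- 3 plus an undirected path 3 - 4 - 5.
comps = chain_components(directed_edges=[(1, 2), (3, 2)],
                         undirected_edges=[(3, 4), (4, 5)])
print([(sorted(c.nodes), is_tree(c)) for c in comps])
# e.g. [([3, 4, 5], True), ([1], True), ([2], True)]
```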
3.2.3. Active Learning Algorithm Design
Algorithm 1: Optimal intervention design algorithm based on the central point
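Only the caption of Algorithm 1 is reproduced here, so the following is a rough, illustrative sketch of how the single-vertex intervention loop described in Sections 3.1 and 3.2 might be organised; it is not the paper's pseudocode. The intervention is simulated by an oracle that reads the true orientations of the edges incident to the chosen target from a known DAG, and the orientation-propagation step (e.g. Meek's rules) that a full implementation would apply after each intervention is omitted for brevity.

```python
# Illustrative sketch of the intervention loop (not the paper's Algorithm 1).
# Helper functions mirror the sketches given earlier; names and the use of
# networkx are assumptions made for this example.
import networkx as nx

def eccentricity(graph, start):
    """Length of the longest edge-repetition-free path from `start` (exhaustive DFS)."""
    best = 0
    def dfs(v, used, length):
        nonlocal best
        best = max(best, length)
        for w in graph.neighbors(v):
            edge = frozenset((v, w))
            if edge not in used:
                dfs(w, used | {edge}, length + 1)
    dfs(start, frozenset(), 0)
    return best

def generalized_center(component):
    """Minimum-eccentricity node; ties go to the node with more neighbours."""
    ecc = {v: eccentricity(component, v) for v in component.nodes}
    return min(component.nodes, key=lambda v: (ecc[v], -component.degree(v)))

def intervene(true_dag, undirected, target):
    """Oracle single-vertex intervention: orient every undirected edge at `target`."""
    oriented = []
    for v in list(undirected.neighbors(target)):
        undirected.remove_edge(target, v)
        oriented.append((target, v) if true_dag.has_edge(target, v) else (v, target))
    return oriented

def active_learn(true_dag, undirected_edges):
    """Intervene on the generalized center of every nontrivial chain component
    until no undirected edges remain; return the oriented edges and the targets."""
    undirected = nx.Graph(undirected_edges)
    oriented = nx.DiGraph()
    targets = []
    while undirected.number_of_edges() > 0:
        components = [undirected.subgraph(c).copy()
                      for c in nx.connected_components(undirected) if len(c) > 1]
        for comp in components:
            target = generalized_center(comp)
            targets.append(target)
            oriented.add_edges_from(intervene(true_dag, undirected, target))
    return oriented, targets

# Example: the undirected path 1 - 2 - 3 - 4 - 5 whose true orientation is a chain.
true_dag = nx.DiGraph([(1, 2), (2, 3), (3, 4), (4, 5)])
dag, targets = active_learn(true_dag, [(1, 2), (2, 3), (3, 4), (4, 5)])
print(targets, sorted(dag.edges))   # e.g. [3, 1, 4] [(1, 2), (2, 3), (3, 4), (4, 5)]
```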
4. Experimental Evaluation
4.1. Results
4.1.1. Experimental Results of Optimal Intervention Design Algorithm
4.1.2. Comparative Experimental Results
5. Conclusions
6. Follow-Up Work and Prospects
- The scalability of the optimized design proposed in this paper depends only on the size of the largest connected component obtained after decomposing the causal graph, not on the size of the whole DAG. Moreover, the experiments assume that there are no latent variables. Although the algorithm can orient the undirected edges and output a DAG using single-vertex interventions, whether the learning method generalizes to broader settings still needs to be verified.
- Real-world causal relationships are very complex, and real data sets usually contain latent variables that interfere with causal analysis. In future work, the algorithm can be applied to real data, and the model's constraints and optimization objectives can be adapted to the specific problem under study, so as to make the algorithm more robust.
- The algorithm proposed in this paper performs single-vertex interventions for the active learning of causal network structures. The optimal intervention experiment can, however, also be designed around multi-vertex interventions, with the goal of finding an intervention node set of a given size that satisfies the optimization conditions and recovers the causal structure. A natural next step is therefore to explore the relationship between structural properties of the causal graph and intervention node sets, and to derive a multi-vertex intervention algorithm based on the same graph-structural ideas.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
| | n | α | 5 Nodes | 8 Nodes | 10 Nodes | 20 Nodes | 30 Nodes |
|---|---|---|---|---|---|---|---|
| PCSHD | n = 500 | 0.01 | 2.406 | 5.510 | 3.968 | 15.355 | 11.700 |
| | | 0.05 | 2.099 | 5.209 | 3.939 | 15.973 | 11.880 |
| | | 0.10 | 1.938 | 5.162 | 4.217 | 16.284 | 17.583 |
| | | 0.50 | 1.696 | 6.127 | 8.239 | 26.685 | 52.663 |
| | n = 1000 | 0.01 | 1.922 | 4.923 | 3.432 | 14.347 | 9.722 |
| | | 0.05 | 1.703 | 4.785 | 3.538 | 14.166 | 15.084 |
| | | 0.10 | 1.628 | 4.750 | 3.912 | 15.887 | 16.736 |
| | | 0.50 | 1.486 | 6.082 | 8.289 | 25.346 | 51.884 |
| | n = 5000 | 0.01 | 1.259 | 4.160 | 2.988 | 14.000 | 8.996 |
| | | 0.05 | 1.120 | 4.207 | 3.373 | 17.325 | 16.237 |
| | | 0.10 | 1.119 | 4.369 | 3.918 | 28.245 | 17.664 |
| | | 0.50 | 1.246 | 5.933 | 8.784 | 29.289 | 53.241 |
| AlgSHD | n = 500 | 0.01 | 4.828 | 10.089 | 8.130 | 25.533 | 23.024 |
| | | 0.05 | 4.658 | 9.729 | 8.325 | 28.467 | 27.620 |
| | | 0.10 | 4.616 | 9.823 | 8.660 | 27.275 | 29.458 |
| | | 0.50 | 4.437 | 10.984 | 12.815 | 39.042 | 69.924 |
| | n = 1000 | 0.01 | 4.557 | 9.585 | 7.804 | 25.282 | 21.765 |
| | | 0.05 | 4.396 | 9.390 | 7.904 | 26.497 | 30.048 |
| | | 0.10 | 4.365 | 9.375 | 8.289 | 27.128 | 28.906 |
| | | 0.50 | 4.172 | 10.921 | 12.798 | 29.363 | 69.000 |
| | n = 5000 | 0.01 | 3.911 | 8.658 | 7.210 | 25.448 | 20.872 |
| | | 0.05 | 3.655 | 8.681 | 7.624 | 30.263 | 29.395 |
| | | 0.10 | 3.621 | 8.705 | 8.127 | 28.245 | 30.024 |
| | | 0.50 | 3.245 | 10.352 | 13.136 | 43.884 | 71.283 |
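For reference, the SHD (structural Hamming distance) values above count, for every pair of vertices, whether the estimated graph and the true graph disagree about the edge between them (missing, extra, or reversed). The following is a minimal sketch of one common convention for this metric, assuming both graphs are given as networkx.DiGraph objects; the representation and the choice to count a reversal as a single error are assumptions, since the paper's exact convention is not reproduced here.

```python
# Minimal sketch of the structural Hamming distance (SHD) reported in the
# table above, under the assumed "reversal counts as one error" convention.
from itertools import combinations
import networkx as nx

def shd(estimated, truth):
    errors = 0
    nodes = set(estimated.nodes) | set(truth.nodes)
    for u, v in combinations(nodes, 2):
        est = (estimated.has_edge(u, v), estimated.has_edge(v, u))
        tru = (truth.has_edge(u, v), truth.has_edge(v, u))
        if est != tru:          # missing, extra, or reversed edge
            errors += 1
    return errors

# Example: one reversed edge and one missing edge give SHD = 2.
g_true = nx.DiGraph([(1, 2), (2, 3), (3, 4)])
g_est = nx.DiGraph([(2, 1), (2, 3)])
print(shd(g_est, g_true))   # 2
```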
| | 5 Nodes | 8 Nodes | 10 Nodes | 20 Nodes | 30 Nodes |
|---|---|---|---|---|---|
| n = 500 | 2.17 | 3.29 | 4.09 | 8.38 | 12.25 |
| n = 1000 | 2.18 | 3.28 | 4.11 | 8.42 | 12.30 |
| n = 5000 | 2.32 | 3.30 | 4.03 | 8.29 | 12.14 |
| Ratio | 43.9% | 41.5% | 40.8% | 41.8% | 40.8% |
| Metric | Algorithm | 5 Vertices | 5_2 | 10 Vertices | 20 Vertices (GES) | 30 Vertices (GES) |
|---|---|---|---|---|---|---|
| ANI | OurAlg | 1.888 | 1.640 | 4.181 | 1.840 | 2.720 |
| | MaxEntropy | 2.469 | 2.460 | 4.594 | 1.982 | ** |
| | MaxMin | 2.442 | 2.478 | 4.622 | 1.954 | ** |
| | Random | 2.380 | 2.233 | 5.190 | 2.266 | 3.550 |
| | OptUN | 1.893 | ** | 2.024 | 2.186 | 3.548 |
| SHD | OurAlg | 1.685 | 1.533 | 2.165 | 8.507 | 2.880 |
| | MaxEntropy | 1.189 | 1.4253 | 2.216 | 12.209 | ** |
| | MaxMin | 1.723 | 1.492 | 2.292 | 13.359 | ** |
| | Random | 2.292 | 1.784 | 2.973 | 9.231 | 6.23 |
| | OptUN | 1.415 | ** | 2.196 | 11.995 | 3.182 |
| Time (s) | OurAlg | 59.091 | 69.012 | 266.734 | 86.089 | 110.332 |
| | MaxEntropy | 298.251 | 269.195 | 2372.411 | 413.122 | ** |
| | MaxMin | 387.834 | 244.731 | 2505.332 | 403.664 | ** |
| | Random | 70.524 | 64.451 | 288.810 | 64.733 | 90.826 |
| | OptUN | 55.400 | ** | 67.568 | 210.602 | 128.93 |