Explainable Generative AI: A Two-Stage Review of Existing Techniques and Future Research Directions
Abstract
1. Introduction
- It provides the first two-stage synthesis of explainability research in GenAI, integrating insights from both review and empirical studies;
- It introduces an analytical taxonomy that organizes existing explainability approaches according to shared methodological principles and application contexts;
- It critically synthesizes strengths, limitations, and recurring tensions observed across the literature; and
- It articulates future research directions aimed at addressing conceptual fragmentation, evaluation challenges, and human-centered transparency requirements.
2. Preliminaries of Explainable Generative AI
2.1. Generative Artificial Intelligence
2.2. Explainable Artificial Intelligence (XAI)
2.3. The Need for Explainability and Transparency in GenAI
2.4. A Formalized Model for Explainability in GenAI
3. Research Methodology
3.1. Data Sources and Search Strategy
3.2. Selection Criteria
3.3. Data Extraction and Analysis
3.4. Bias and Certainty Assessments
4. Results and Analysis
4.1. First Stage: Review of Reviews
4.1.1. Temporal Distribution in Review Studies
4.1.2. Domain Distribution in Review Studies
4.1.3. Key Findings
4.1.4. Cross-Study Comparisons in Review Studies
4.2. Second Stage: Empirical Review of Primary Studies
4.2.1. Temporal and Geographic Distribution
4.2.2. Domain Distribution
4.2.3. Distribution Across GenAI Model Families
4.2.4. Explainability Techniques
4.2.5. Evaluation Methods
4.2.6. Cross-Study Comparisons
5. Open Challenges in Explainable GenAI
5.1. Lack of Generalizable and GenAI-Specific Frameworks
5.2. Scalability and Computational Feasibility
5.3. Evaluation Metrics and Benchmarking
5.4. Explainability Challenges in Multimodal Generative Models
5.5. Balancing Performance and Interpretability
5.6. Ethical, Regulatory, and User-Centered Alignment
5.7. Synthesis and Outlook
6. Practical Recommendations and Future Research Directions
6.1. Integrating Explainability into Model Training Pipelines
- Actionable research questions:
- How can training-time constraints (e.g., attention supervision, latent regularization) improve attribution fidelity without degrading generative quality? (See the sketch below.)
- What forms of internal representations are most interpretable for downstream applications?
- Near-term milestone (1–2 years): Benchmarks for training-time interpretability across major GenAI architectures.
- Long-term milestone (5+ years): Widely adopted training frameworks where interpretability is a built-in objective.
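To make the first research question above concrete, the following is a minimal sketch of one possible training-time interpretability constraint: an attention-entropy regularizer added to a standard token-level loss. The model interface, the regularization weight, and the toy tensors are illustrative assumptions, not a technique drawn from the reviewed studies.

```python
import torch
import torch.nn.functional as F

def attention_entropy_penalty(attn: torch.Tensor) -> torch.Tensor:
    """Mean entropy of attention distributions over the key axis; lower
    entropy means sharper, more attributable attention.
    attn shape: (batch, heads, query, key)."""
    eps = 1e-9
    entropy = -(attn * (attn + eps).log()).sum(dim=-1)
    return entropy.mean()

def interpretable_lm_loss(logits, targets, attentions, lam=0.1):
    """Token-level cross-entropy plus an interpretability regularizer.
    `attentions` is a list of per-layer attention tensors, which many
    transformer implementations can expose during the forward pass."""
    task = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1))
    reg = torch.stack([attention_entropy_penalty(a) for a in attentions]).mean()
    return task + lam * reg

# Toy check with random tensors standing in for real model outputs.
B, H, T, V = 2, 4, 8, 100
logits = torch.randn(B, T, V)
targets = torch.randint(0, V, (B, T))
attentions = [torch.softmax(torch.randn(B, H, T, T), dim=-1) for _ in range(3)]
print(interpretable_lm_loss(logits, targets, attentions))
```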
6.2. Provenance Tracking and Traceability for Generative Outputs
- Actionable research questions:
- How can token-level provenance be captured without compromising privacy or model efficiency? (See the sketch below.)
- What are the minimal metadata standards required for regulatory compliance?
- Near-term milestone: Prototype provenance metadata standards for text-to-image and multimodal systems.
- Long-term milestone: Harmonized provenance protocols integrated into GenAI APIs.
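As one hedged illustration of the provenance questions above, the sketch below records minimal generation metadata (model version, hashed prompt, sampling parameters, output hash) so that an output can later be traced without storing raw prompt text. The field names and hashing choices are assumptions, not an established metadata standard.

```python
import hashlib
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class ProvenanceRecord:
    model_id: str          # model name and version used for generation
    prompt_sha256: str     # hash of the prompt, so raw text is never stored
    sampling_params: dict  # temperature, top_p, seed, etc.
    output_sha256: str     # hash binding the record to the exact output
    timestamp: float       # generation time (Unix epoch)

def make_record(model_id: str, prompt: str, output: str, params: dict) -> str:
    """Build a JSON provenance record for one generated output."""
    record = ProvenanceRecord(
        model_id=model_id,
        prompt_sha256=hashlib.sha256(prompt.encode()).hexdigest(),
        sampling_params=params,
        output_sha256=hashlib.sha256(output.encode()).hexdigest(),
        timestamp=time.time(),
    )
    return json.dumps(asdict(record))

# Example: attach a record to one completion (model name is hypothetical).
print(make_record("demo-llm-1.0", "Explain diffusion models.",
                  "Diffusion models iteratively denoise...", {"temperature": 0.7}))
```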
6.3. Human-Centered Explanation Interfaces and Interaction Design
- Actionable research questions:
- What interaction patterns (e.g., layered explanations, contrastive queries) best support non-experts? (See the sketch below.)
- How can explanation interfaces adapt dynamically to user goals and expertise?
- Near-term milestone: Usability-tested multimodal explanation interfaces.
- Long-term milestone: Standardized interaction guidelines for explainable GenAI tools.
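The sketch below illustrates the idea of layered explanations raised above: the same attribution scores are rendered at three levels of detail depending on user expertise. The three layers and the expertise labels are illustrative assumptions, not validated interaction guidelines.

```python
def layered_explanation(attributions: dict[str, float], expertise: str) -> str:
    """Return progressively more detailed explanations of the same output."""
    top = sorted(attributions.items(), key=lambda kv: -abs(kv[1]))
    if expertise == "novice":                      # one-sentence summary
        return f"The output was driven mainly by '{top[0][0]}'."
    if expertise == "intermediate":                # top-3 ranked factors
        ranked = ", ".join(f"{k} ({v:+.2f})" for k, v in top[:3])
        return f"Main contributing factors: {ranked}."
    return "Full attribution vector: " + str(dict(top))  # expert view

# The same attributions yield different explanation depths per user.
scores = {"prompt length": 0.42, "style token": -0.31, "seed": 0.05}
print(layered_explanation(scores, "intermediate"))
```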
6.4. Human-Centered Explainability and Stakeholder Needs
- Actionable research questions:
- What level of explanation granularity is optimal for different stakeholders (developers, regulators, end-users)? (See the rubric sketch below.)
- How can explanations be evaluated for cognitive alignment and decision usefulness?
- Near-term milestone: Stakeholder-specific explanation taxonomies and evaluation rubrics.
- Long-term milestone: Regulatory-ready explanation templates tailored to high-risk domains.
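As a toy illustration of stakeholder-specific evaluation rubrics, the sketch below weights the same explanation-quality ratings differently for developers, regulators, and end users. The stakeholder groups, criteria, and weights are assumptions for illustration only, not a validated taxonomy.

```python
# Hypothetical stakeholder priorities over three explanation criteria.
RUBRIC = {
    "developer": {"fidelity": 0.6, "completeness": 0.3, "simplicity": 0.1},
    "regulator": {"fidelity": 0.3, "completeness": 0.5, "simplicity": 0.2},
    "end_user":  {"fidelity": 0.2, "completeness": 0.2, "simplicity": 0.6},
}

def rubric_score(stakeholder: str, ratings: dict[str, float]) -> float:
    """Weight criterion ratings (each in [0, 1]) by stakeholder priorities."""
    weights = RUBRIC[stakeholder]
    return sum(weights[c] * ratings[c] for c in weights)

# The same explanation scores differently for different audiences.
ratings = {"fidelity": 0.9, "completeness": 0.7, "simplicity": 0.3}
for stakeholder in RUBRIC:
    print(stakeholder, round(rubric_score(stakeholder, ratings), 2))
```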
6.5. Domain-Adaptive and Standardized Evaluation Protocols
- Actionable research questions:
- How can fidelity metrics be combined with human-centered metrics to provide holistic evaluation? (See the sketch below.)
- Can domain-specific benchmarks (e.g., medical, legal, educational) be generalized across sectors?
- Near-term milestone: Release of public benchmark datasets for GenAI explainability.
- Long-term milestone: International standards for GenAI explanation quality (ISO/IEC-style).
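The sketch below shows one way the first question above could be operationalized: a deletion-style fidelity metric is combined with a normalized human rating into a single holistic score. The equal weighting and the rating scale are assumptions, not a proposed standard.

```python
from typing import Callable, Sequence

def deletion_fidelity(predict: Callable[[Sequence[float]], float],
                      x: Sequence[float], ranking: Sequence[int]) -> float:
    """Zero out features in attribution order; faithful explanations cause
    large prediction drops, mapped here to [0, 1]."""
    base = predict(x)
    xs = list(x)
    drops = []
    for i in ranking:
        xs[i] = 0.0
        drops.append(base - predict(xs))
    return max(0.0, min(1.0, sum(drops) / (len(drops) * abs(base) + 1e-9)))

def holistic_score(fidelity: float, human_rating_1to5: float,
                   w_fid: float = 0.5) -> float:
    """Combine fidelity with a 1-5 human rating normalized to [0, 1]."""
    human = (human_rating_1to5 - 1) / 4
    return w_fid * fidelity + (1 - w_fid) * human

# Toy linear "model" and an attribution ranking over its three inputs.
model = lambda x: 0.6 * x[0] + 0.3 * x[1] + 0.1 * x[2]
fid = deletion_fidelity(model, [1.0, 1.0, 1.0], ranking=[0, 1, 2])
print(round(holistic_score(fid, human_rating_1to5=4.0), 3))
```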
6.6. Hybrid Neuro-Symbolic and Generative Approaches
- Actionable research questions:
- How can causal abstraction layers be integrated into generative pipelines? (See the sketch below.)
- What types of symbolic representations improve transparency without limiting generative flexibility?
- Near-term milestone: Experimental hybrid architectures demonstrated across multiple modalities.
- Long-term milestone: Scalable hybrid systems widely deployed in industry.
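As a minimal illustration of a symbolic abstraction layer around a generative pipeline, the sketch below wraps a toy generator in a generate-then-verify loop whose symbolic rules carry human-readable justifications. The generator stub and the rules are assumptions for illustration; real neuro-symbolic systems integrate constraints far more deeply.

```python
import random

def toy_generator(prompt: str, seed: int) -> dict:
    """Stand-in for a real generative model; returns a structured output."""
    random.seed(seed)
    return {"dosage_mg": random.choice([5, 50, 500]), "route": "oral"}

# Symbolic domain rules: each check is paired with a human-readable reason,
# which is what makes the verification layer transparent.
RULES = {
    "dosage_in_range": (lambda out: 0 < out["dosage_mg"] <= 100,
                        "dosage must be in (0, 100] mg"),
    "route_known": (lambda out: out["route"] in {"oral", "intravenous"},
                    "route must be a recognized administration route"),
}

def generate_with_verification(prompt: str, max_tries: int = 5):
    """Regenerate until all symbolic rules pass, or report the violations."""
    violated = ["no attempt made"]
    for seed in range(max_tries):
        out = toy_generator(prompt, seed)
        violated = [reason for check, reason in RULES.values() if not check(out)]
        if not violated:
            return out, f"accepted: all {len(RULES)} symbolic rules satisfied"
    return None, f"rejected after {max_tries} tries; last violations: {violated}"

print(generate_with_verification("suggest a dosage"))
```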
6.7. A Roadmap for Explainable GenAI
- Immediate Priorities (0–2 years)
- Training-time interpretability constraints.
- Provenance and traceability mechanisms.
- User-tested explanation interfaces.
- Multimodal explainability benchmarks.
- Medium-Term Priorities (3–5 years)
- Stakeholder-aligned explanation standards.
- Domain-specific evaluation frameworks.
- Hybrid neuro-symbolic–generative prototypes.
- Long-Term Priorities (5+ years)
- Model-agnostic explainability frameworks.
- Regulatory-aligned, auditable GenAI systems.
- Fully integrated human-centered design pipelines.
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AAEs | Adversarial Autoencoders |
| ALE | Accumulated Local Effects |
| AI | Artificial Intelligence |
| CAM | Class Activation Mapping |
| CDGM | Causal Deep Generative Models |
| CNNs | Convolutional Neural Networks |
| CTGAN | Conditional Tabular Generative Adversarial Network |
| CVAEs | Conditional Variational Autoencoders |
| DL | Deep Learning |
| DMs | Diffusion Models |
| DT | Decision Tree |
| EU AI Act | European Union Artificial Intelligence Act |
| GANs | Generative Adversarial Networks |
| GenAI | Generative Artificial Intelligence |
| GPT | Generative Pre-trained Transformer |
| Grad-CAM | Gradient-Weighted Class Activation Mapping |
| HCI | Human–Computer Interaction |
| ICU | Intensive Care Unit |
| IEEE | Institute of Electrical and Electronics Engineers |
| IG | Integrated Gradients |
| IML | Interpretable Machine Learning |
| IoT | Internet of Things |
| k-NN | k-Nearest Neighbor |
| LIME | Local Interpretable Model-agnostic Explanations |
| LLMs | Large Language Models |
| LR | Logistic Regression |
| LRP | Layer-wise Relevance Propagation |
| ML | Machine Learning |
| NLP | Natural Language Processing |
| NN | Neural Network |
| OSF | Open Science Framework |
| PDPs | Partial Dependence Plots |
| PRISMA | Preferred Reporting Items for Systematic Reviews and Meta-Analyses |
| RAG | Retrieval-Augmented Generation |
| RISE | Randomized Input Sampling for Explanation |
| RLHF | Reinforcement Learning from Human Feedback |
| SHAP | SHapley Additive exPlanations |
| SLR | Systematic Literature Review |
| TRMs | Transformer-based Models |
| VAEs | Variational Autoencoders |
| XAI | eXplainable Artificial Intelligence |
References
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
- Kingma, D.P.; Welling, M. An Introduction to Variational Autoencoders. Found. Trends Mach. Learn. 2019, 12, 307–392. [Google Scholar] [CrossRef]
- Sander, M.E.; Giryes, R.; Suzuki, T.; Blondel, M.; Peyré, G. How do transformers perform in-context autoregressive learning? In Proceedings of the 41st International Conference on Machine Learning (ICML 2024), JMLR.org, Vienna, Austria, 21–27 July 2024; Available online: https://proceedings.mlr.press/v235/sander24a.html (accessed on 12 January 2026).
- Yang, L.; Zhang, Z.; Song, Y.; Hong, S.; Xu, R.; Zhao, Y.; Zhang, W.; Cui, B.; Yang, M.H. Diffusion Models: A Comprehensive Survey of Methods and Applications. ACM Comput. Surv. 2023, 56, 60. [Google Scholar] [CrossRef]
- Feuerriegel, S.; Hartmann, J.; Janiesch, C.; Zschech, P. Generative AI. Bus. Inf. Syst. Eng. 2023, 66, 111–126. [Google Scholar] [CrossRef]
- Dwivedi, R.; Dave, D.; Naik, H.; Singhal, S.; Omer, R.; Patel, P.; Qian, B.; Wen, Z.; Shah, T.; Morgan, G.; et al. Explainable AI (XAI): Core Ideas, Techniques, and Solutions. ACM Comput. Surv. 2023, 55, 194. [Google Scholar] [CrossRef]
- Linardatos, P.; Papastefanopoulos, V.; Kotsiantis, S. Explainable AI: A Review of Machine Learning Interpretability Methods. Entropy 2021, 23, 18. [Google Scholar] [CrossRef] [PubMed]
- Schneider, J. Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda. Artif. Intell. Rev. 2024, 57, 289. [Google Scholar] [CrossRef]
- Zhao, H.; Chen, H.; Yang, F.; Liu, N.; Deng, H.; Cai, H.; Wang, S.; Yin, D.; Du, M. Explainability for Large Language Models: A Survey. ACM Trans. Intell. Syst. Technol. 2024, 15, 20. [Google Scholar] [CrossRef]
- Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef]
- Sengar, S.S.; Hasan, A.B.; Kumar, S.; Carroll, F. Generative artificial intelligence: A systematic review and applications. Multimed. Tools Appl. 2024, 84, 23661–23700. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.U.; Polosukhin, I. Attention is All you Need. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: New York, NY, USA, 2017; Volume 30. [Google Scholar]
- Chen, M.; Mei, S.; Fan, J.; Wang, M. Opportunities and challenges of diffusion models for generative AI. Natl. Sci. Rev. 2024, 11, nwae348. [Google Scholar] [CrossRef]
- Higgins, I.; Matthey, L.; Pal, A.; Burgess, C.; Glorot, X.; Botvinick, M.; Mohamed, S.; Lerchner, A. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. In Proceedings of the International Conference on Learning Representations, Toulon, France, 24–26 April 2017. [Google Scholar]
- Decardi-Nelson, B.; Alshehri, A.S.; Ajagekar, A.; You, F. Generative AI and process systems engineering: The next frontier. Comput. Chem. Eng. 2024, 187, 108723. [Google Scholar] [CrossRef]
- Zhang, S.; Han, T.; Bhalla, U.; Lakkaraju, H. Unifying AI Attribution: A New Frontier in Understanding Complex Systems; Insight Article; D^3 Institute—Digital Data Design Institute at Harvard: Boston, MA, USA, 2025. [Google Scholar]
- Anthony, Q.; Michalowicz, B.; Hatef, J.; Xu, L.; Abduljabbar, M.; Shafi, A.; Subramoni, H.; Panda, D.K.D. Understanding and Characterizing Communication Characteristics for Distributed Transformer Models. IEEE Micro 2025, 45, 8–17. [Google Scholar] [CrossRef]
- Belcic, I.; Stryker, C. RAG vs. Fine-Tuning vs. Prompt Engineering. IBM Think 2025. Available online: https://www.ibm.com/think/topics/rag-vs-fine-tuning-vs-prompt-engineering (accessed on 12 January 2026).
- Liu, P.; Yuan, W.; Fu, J.; Jiang, Z.; Hayashi, H.; Neubig, G. Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Comput. Surv. 2023, 55, 195. [Google Scholar] [CrossRef]
- Lent, M.; Fisher, W.; Mancuso, M. An Explainable Artificial Intelligence System for Small-Unit Tactical Behavior. In Proceedings of the 16th Innovative Applications of Artificial Intelligence Conference, San Jose, CA, USA, 27–29 July 2004; AAAI Press: Washington, DC, USA, 2004; pp. 900–907. [Google Scholar]
- Barredo Arrieta, A.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; Garcia, S.; Gil-Lopez, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 2020, 58, 82–115. [Google Scholar] [CrossRef]
- Ahmed, N.A.; Alpkocak, A. A quantitative evaluation of explainable AI methods using the depth of decision tree. Turk. J. Electr. Eng. Comput. Sci. 2022, 30, 2054–2072. [Google Scholar] [CrossRef]
- Machamer, P.; Darden, L.; Craver, C.F. Thinking about Mechanisms. Philos. Sci. 2000, 67, 1–25. [Google Scholar] [CrossRef]
- Lombrozo, T. The structure and function of explanations. Trends Cogn. Sci. 2006, 10, 464–470. [Google Scholar] [CrossRef]
- Miller, T. Explanation in artificial intelligence: Insights from the social sciences. Artif. Intell. 2019, 267, 1–38. [Google Scholar] [CrossRef]
- Larsson, S.; Heintz, F. Transparency in artificial intelligence. Internet Policy Rev. 2020, 9, 1469. [Google Scholar] [CrossRef]
- Ensuring Transparency in Generative AI Systems. 2025. Available online: https://palospublishing.com/ensuring-transparency-in-generative-ai-systems/ (accessed on 6 December 2025).
- Braun, V.; Clarke, V. Using thematic analysis in psychology. Qual. Res. Psychol. 2006, 3, 77–101. [Google Scholar] [CrossRef]
- Da’u, A.; Salim, N. Recommendation system based on deep learning methods: A systematic review and new directions. Artif. Intell. Rev. 2020, 53, 2709–2748. [Google Scholar] [CrossRef]
- Kitchenham, B.; Charters, S. Guidelines for Performing Systematic Literature Reviews in Software Engineering; EBSE Technical Report EBSE-2007-01; Software Engineering Group, School of Computer Science and Mathematics, Keele University: Keele, UK, 2007. [Google Scholar]
- Saarela, M.; Kärkkäinen, T. Can we automate expert-based journal rankings? Analysis of the Finnish publication indicator. J. Inf. 2020, 14, 101008. [Google Scholar] [CrossRef]
- Bushey, J. AI-Generated Images as an Emergent Record Format. In Proceedings of the 2023 IEEE International Conference on Big Data (BigData), Sorrento, Italy, 15–18 December 2023; pp. 2020–2031. [Google Scholar]
- Hanif, A.; Beheshti, A.; Benatallah, B.; Zhang, X.; Habiba; Foo, E.; Shabani, N.; Shahabikargar, M. A Comprehensive Survey of Explainable Artificial Intelligence (XAI) Methods: Exploring Transparency and Interpretability. In Proceedings of the Web Information Systems Engineering, Victoria, Australia, 25–27 October 2023; Zhang, F., Wang, H., Barhamgi, M., Chen, L., Zhou, R., Eds.; Springer Nature: Singapore, 2023; pp. 915–925. [Google Scholar]
- Zarghami, S.; Kouchaki, H.; Yang, L.; Martinez, P. Explainable Artificial Intelligence in Generative Design for Construction. In Proceedings of the 2024 European Conference on Computing in Construction, Crete, Greece, 14–17 July 2024. [Google Scholar]
- Longo, L.; Brcic, M.; Cabitza, F.; Choi, J.; Confalonieri, R.; Ser, J.D.; Guidotti, R.; Hayashi, Y.; Herrera, F.; Holzinger, A.; et al. Explainable Artificial Intelligence (XAI) 2.0: A manifesto of open challenges and interdisciplinary research directions. Inf. Fusion 2024, 106, 102301. [Google Scholar] [CrossRef]
- Mudabbiruddin, M.; Mosavi, A.; Imre, F. From Deep Learning to ChatGPT for Materials Design. In Proceedings of the 2024 IEEE 11th International Conference on Computational Cybernetics and Cyber-Medical Systems (ICCC), Hanoi, Vietnam, 4–6 April 2024; pp. 1–8. [Google Scholar]
- Pan, S.; Luo, L.; Wang, Y.; Chen, C.; Wang, J.; Wu, X. Unifying Large Language Models and Knowledge Graphs: A Roadmap. IEEE Trans. Knowl. Data Eng. 2024, 36, 3580–3599. [Google Scholar] [CrossRef]
- Jain, R.; Jain, A. Generative AI in Writing Research Papers: A New Type of Algorithmic Bias and Uncertainty in Scholarly Work. In Artificial Intelligence and Soft Computing; Rutkowski, L., Ed.; Springer: Berlin/Heidelberg, Germany, 2024; pp. 656–669. [Google Scholar]
- Zeiser, T.; Ehret, D.; Lutz, T.; Saar, J. Explainable AI in Manufacturing. In Proceedings of the 2024 IEEE International Conference on Engineering, Technology, and Innovation (ICE/ITMC), Funchal, Portugal, 24–28 June 2024; pp. 1–8. [Google Scholar]
- Qu, T.; Yang, Z. Overview of Artificial Intelligence Applications in Educational Research; ISAIE ’24. In Proceedings of the 2024 International Symposium on Artificial Intelligence for Education, New York, NY, USA, 6–8 September 2024; pp. 101–108. [Google Scholar]
- Bui, L.V. Advancing patent law with generative AI: Human-in-the-loop systems for AI-assisted drafting, prior art search, and multimodal IP protection. World Pat. Inf. 2025, 80, 102341. [Google Scholar] [CrossRef]
- Ye, X.; Yigitcanlar, T.; Goodchild, M.; Huang, X.; Li, W.; Shaw, S.L.; Fu, Y.; Gong, W.; Newman, G. Artificial intelligence in urban science: Why does it matter? Ann. GIS 2025, 31, 181–189. [Google Scholar] [CrossRef]
- Demuth, S.; Paris, J.; Faddeenkov, I.; De Sèze, J.; Gourraud, P.A. Clinical applications of deep learning in neuroinflammatory diseases: A scoping review. Rev. Neurol. 2025, 181, 135–155. [Google Scholar] [CrossRef] [PubMed]
- López Joya, S.; Diaz-Garcia, J.; Ruiz, M.; Martin-Bautista, M. Dissecting a social bot powered by generative AI: Anatomy, new trends and challenges. Soc. Netw. Anal. Min. 2025, 15, 7. [Google Scholar] [CrossRef]
- Abbas, K. Management accounting and artificial intelligence: A comprehensive literature review and recommendations for future research. Br. Account. Rev. 2025, 57, 101551. [Google Scholar] [CrossRef]
- Non, L.R.; Marra, A.R.; Ince, D. Rise of the Machines—Artificial Intelligence in Healthcare Epidemiology. Curr. Infect. Dis. Rep. 2025, 27, 4. [Google Scholar] [CrossRef]
- Westphal, A.; Mrowka, R. Special issue European Journal of Physiology: Artificial intelligence in the field of physiology and medicine. Pflügers Arch. Eur. J. Physiol. 2025, 477, 509–512. [Google Scholar] [CrossRef]
- Mikołajewska, E.; Mikołajewski, D.; Mikołajczyk, T.; Paczkowski, T. Generative AI in AI-Based Digital Twins for Fault Diagnosis for Predictive Maintenance in Industry 4.0/5.0. Appl. Sci. 2025, 15, 3166. [Google Scholar] [CrossRef]
- Sun, J.; Liao, V.; Muller, M.; Agarwal, M.; Houde, S.; Talamadupula, K.; Weisz, J. Investigating Explainability of Generative AI for Code through Scenario-based Design. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, New Orleans, LA, USA, 29 April–5 May 2022; pp. 212–228. [Google Scholar] [CrossRef]
- El-Zanfaly, D.; Huang, Y.; Dong, Y. Sand-in-the-loop: Investigating Embodied Co-Creation for Shared Understandings of Generative AI. In Proceedings of the 2023 ACM Designing Interactive Systems Conference (DIS), Pittsburgh, PA, USA, 10–14 July 2023; pp. 256–260. [Google Scholar] [CrossRef]
- Ezzahed, Z.; Chevrot, A.; Hurter, C.; Olive, X. Bringing Explainability to Autoencoding Neural Networks Encoding Aircraft Trajectories. In Proceedings of the 13th SESAR Innovation Days 2023, SIDS 2023, Seville, Spain, 27–30 November 2023. [Google Scholar]
- Wang, Y.; Shen, S.; Lim, B.Y. RePrompt: Automatic Prompt Editing to Refine AI-Generative Art Towards Precise Expressions. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, CHI ’23, New York, NY, USA, 23–28 April 2023. [Google Scholar] [CrossRef]
- Jeong, S.; Li, M.; Berger, M.; Liu, S. Concept Lens: Visually Analyzing the Consistency of Semantic Manipulation in GANs. In Proceedings of the 2023 IEEE Visualization and Visual Analytics (VIS), Melbourne, Australia, 22–27 October 2023; pp. 221–225. [Google Scholar] [CrossRef]
- Hasko, R.; Hasko, O.; Kutucu, H. Teaching Assistant Robots in Various Fields: Natural Sciences, Medicine and Specific Non-Deterministic Conditions. In Proceedings of the 6th International Conference on Informatics and Data-Driven Medicine (IDDM 2023), Bratislava, Slovakia, 17–19 November 2023; Volume 3609, pp. 303–309. [Google Scholar]
- Minutti, C.; Escalante-Ramírez, B.; Olveres, J. PumaMedNet-CXR: An Explainable Generative Artificial Intelligence for the Analysis and Classification of Chest X-Ray Images. Comput. Sist. 2023, 27, 909–920. [Google Scholar] [CrossRef]
- Esposito, M.; Palagiano, F.; Lenarduzzi, V.; Taibi, D. Beyond Words: On Large Language Models Actionability in Mission-Critical Risk Analysis; ESEM ’24. In Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, Barcelona, Spain, 24–25 October 2024; pp. 517–527. [Google Scholar] [CrossRef]
- Moruzzi, S.; Ferrari, F.; Riscica, F. Biases, Epistemic Filters, and Explainable Artificial Intelligence. In Proceedings of the Workshops at the Third International Conference on Hybrid Human-Artificial Intelligence (HHAI 2024), Malmö, Sweden, 10–14 June 2024; CEUR Workshop Proceedings. Volume 3825, pp. 33–36. [Google Scholar]
- Pozzi, M.; Noei, S.; Robbi, E.; Cima, L.; Moroni, M.; Munari, E.; Torresani, E.; Jurman, G. Generating and evaluating synthetic data in digital pathology through diffusion models. Sci. Rep. 2024, 14, 28435. [Google Scholar] [CrossRef] [PubMed]
- Pontorno, O.; Guarnera, L.; Battiato, S. On the Exploitation of DCT-Traces in the Generative-AI Domain. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 27–30 October 2024; pp. 3806–3812. [Google Scholar] [CrossRef]
- Riello, P.; Quille, K.; Jaiswal, R.; Sansone, C. Reimagining Student Success Prediction: Applying LLMs in Educational AI with XAI; HCAIep ’24. In Proceedings of the 2024 Conference on Human Centred Artificial Intelligence—Education and Practice, New York, NY, USA, 2–3 December 2024; pp. 34–40. [Google Scholar] [CrossRef]
- Sachan, S.; Dezem, V.; Fickett, D. Blockchain for Ethical and Transparent Generative AI Utilization by Banking and Finance Lawyers. In Proceedings of the Explainable Artificial Intelligence, Valletta, Malta, 17–19 July 2024; Longo, L., Lapuschkin, S., Seifert, C., Eds.; Springer: Cham, Switzerland, 2024; pp. 319–333. [Google Scholar] [CrossRef]
- Sachan, S.; Liang, X.; Liu, X. Blockchain-based auditing of legal decisions supported by explainable AI and generative AI tools. Eng. Appl. Artif. Intell. 2024, 129, 107666. [Google Scholar] [CrossRef]
- Bird, J.; Lotfi, A. CIFAKE: Image Classification and Explainable Identification of AI-Generated Synthetic Images. IEEE Access 2024, 12, 15642–15650. [Google Scholar] [CrossRef]
- Burgess, M. Deceptive AI dehumanizes: The ethics of misattributed intelligence in the design of Generative AI interfaces. In Proceedings of the 2024 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), Liverpool, UK, 2–6 September 2024; pp. 96–108. [Google Scholar] [CrossRef]
- Ince, V.; Bader-El-Den, M.; Sari, O. Enhanced dataset synthesis using CTGAN for metagenomic dataset. In Proceedings of the 2024 IEEE 12th International Conference on Intelligent Systems, IS 2024, Varna, Bulgaria, 29–31 August 2024; Sgurev, V., Jotsov, V., Piuri, V., Doukovska, L., Yoshinov, R., Eds.; IEEE: New York City, NY, USA, 2024; pp. 1–6. [Google Scholar] [CrossRef]
- Bryan-Kinns, N.; Zhang, B.; Zhao, S.; Banar, B. Exploring Variational Auto-encoder Architectures, Configurations, and Datasets for Generative Music Explainable AI. Mach. Intell. Res. 2024, 21, 29–45. [Google Scholar] [CrossRef]
- Abu-Rasheed, H.; Abdulsalam, M.H.; Weber, C.; Fathi, M. Supporting Student Decisions on Learning Recommendations: An LLM-Based Chatbot with Knowledge Graph Contextualization for Conversational Explainability and Mentoring. In Proceedings of the Joint Proceedings of LAK 2024 Workshops Co-Located with 14th International Conference on Learning Analytics and Knowledge (LAK 2024), Kyoto, Japan, 18–22 March 2024; CEUR-WS.org, CEUR Workshop Proceedings. Volume 3667, pp. 230–239. [Google Scholar]
- Herdt, R.; Maass, P. Visualize and Paint GAN Activations. In Proceedings of the 2024 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), London, UK, 22–25 September 2024; pp. 1–6. [Google Scholar] [CrossRef]
- Balmer, V.; Kuhn, S.; Bischof, R.; Salamanca, L.; Kaufmann, W.; Perez-Cruz, F.; Kraus, M. Design Space Exploration and Explanation via Conditional Variational Autoencoders in Meta-Model-Based Conceptual Design of Pedestrian Bridges. Autom. Constr. 2024, 163, 105411. [Google Scholar] [CrossRef]
- Vilone, G.; Sovrano, F.; Lognoul, M. On the Explainability of Financial Robo-Advice Systems. In Explainable Artificial Intelligence for Finance; Springer: Berlin/Heidelberg, Germany, 2024; pp. 219–242. [Google Scholar] [CrossRef]
- Durango, I.; Gallud, J.A.; Penichet, V. The data dance: Choreographing seamless partnerships between humans, data, and GenAI. Int. J. Data Sci. Anal. 2024, 20, 3613–3640. [Google Scholar] [CrossRef]
- Kim, P.W. A Framework to Overcome the Dark Side of Generative Artificial Intelligence (GAI) Like ChatGPT in Social Media and Education. IEEE Trans. Comput. Soc. Syst. 2024, 11, 5266–5274. [Google Scholar] [CrossRef]
- Lee, D.; Lee, J.; Shin, D. GPT Prompt Engineering for a Large Language Model-Based Process Improvement Generation System. Korean J. Chem. Eng. 2024, 41, 3263–3286. [Google Scholar] [CrossRef]
- Heo, S.; Byun, J.; Ifaei, P.; Ko, J.; Ha, B.; Hwangbo, S.; Yoo, C. Towards mega-scale decarbonized industrial park (Mega-DIP): Generative AI-driven techno-economic and environmental assessment of renewable and sustainable energy utilization in petrochemical industry. Renew. Sustain. Energy Rev. 2024, 189, 113933. [Google Scholar] [CrossRef]
- Jang, S.; Lee, H.; Kim, Y.; Lee, D.; Shin, J.; Nam, J. When, what, and how should generative artificial intelligence explain to users? Telemat. Inform. 2024, 93, 102175. [Google Scholar] [CrossRef]
- Demirbaga, U. Advancing anomaly detection in cloud environments with cutting-edge generative AI for expert systems. Expert Syst. 2024, 42, e13722. [Google Scholar] [CrossRef]
- Biswal, S. SCOUT: Surveillance and Cyber harassment Observation of Unseen Threats. In Proceedings of the 2024 International Conference on Artificial Intelligence, Metaverse and Cybersecurity (ICAMAC), Dubai, United Arab Emirates, 25–26 October 2024; pp. 1–6. [Google Scholar] [CrossRef]
- Kim, S.S.Y. Establishing Appropriate Trust in AI through Transparency and Explainability. In Proceedings of the CHI Extended Abstracts, Honolulu, HI, USA, 11–16 May 2024; pp. 433:1–433:6. [Google Scholar] [CrossRef]
- Ehsan, U.; Riedl, M. Explainable AI Reloaded: Challenging the XAI Status Quo in the Era of Large Language Models; HttF ’24. In Proceedings of the Halfway to the Future Symposium, Santa Cruz, CA, USA, 21–23 October 2024. [Google Scholar] [CrossRef]
- Chaccour, C.; Karapantelakis, A.; Murphy, T.; Dohler, M. Telecom’s Artificial General Intelligence (AGI) Vision: Beyond the GenAI Frontier. IEEE Netw. 2024, 38, 21–28. [Google Scholar] [CrossRef]
- Pendyala, V.S.; Chintalapati, A. Using Multimodal Foundation Models for Detecting Fake Images on the Internet with Explanations. Future Internet 2024, 16, 432. [Google Scholar] [CrossRef]
- Taylor-Melanson, W.; Sadeghi, Z.; Matwin, S. Causal generative explainers using counterfactual inference: A case study on the Morpho-MNIST dataset. Pattern Anal. Appl. 2024, 27, 89. [Google Scholar] [CrossRef]
- Hu, Y.; Giacaman, N.; Donald, C. Enhancing Trust in Generative AI: Investigating Explainability of LLMs to Analyse Confusion in MOOC Discussions. In Proceedings of the Joint Proceedings of LAK 2024 Workshops, Kyoto, Japan, 18–22 March 2024. [Google Scholar]
- Toth, G.; Albrecht, R.; Pruski, C. Explainable AI, LLM, and digitized archival cultural heritage: A case study of the Grand Ducal Archive of the Medici. AI Soc. 2025, 40, 4561–4573. [Google Scholar] [CrossRef]
- Di Lodovico, C.; Torrielli, F.; Di Caro, L.; Rapp, A. How Do People Develop Folk Theories of Generative AI Text-to-Image Models? A Qualitative Study on How People Strive to Explain and Make Sense of GenAI. Int. J. Hum. Comput. Interact. 2025, 42, 14846–14870. [Google Scholar] [CrossRef]
- Leimeister, J.M.; Reinhard, P.; Li, M.; Fina, M. Fact or Fiction? Exploring Explanations to Identify Factual Confabulations in RAG-Based LLM Systems. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, ACM, Yokohama, Japan, 26 April–1 May 2025. [Google Scholar] [CrossRef]
- Jeck, J.; Leiser, F.; Hüsges, A.; Sunyaev, A. TELL-ME: Toward Personalized Explanations of Large Language Models; CHI EA ’25. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, New York, NY, USA, 11–16 May 2025. [Google Scholar] [CrossRef]
- Basaran, O.T.; Dressler, F. XAInomaly: Explainable and interpretable Deep Contractive Autoencoder for O-RAN traffic anomaly detection. Comput. Netw. 2025, 261, 111145. [Google Scholar] [CrossRef]
- Ahmed, M.U.; Begum, S.; Barua, S.; Masud, A.N.; Di Flumeri, G.; Navarin, N. Enhancing Explainability, Robustness, and Autonomy: A Comprehensive Approach in Trustworthy AI. In Proceedings of the 2025 IEEE Symposium on Trustworthy, Explainable and Responsible Computational Intelligence (CITREx), Trondheim, Norway, 17–20 March 2025; pp. 1–7. [Google Scholar] [CrossRef]
- Bhattacharya, A.; Stumpf, S.; De Croon, R.; Verbert, K. Explanatory Debiasing: Involving Domain Experts in the Data Generation Process to Mitigate Representation Bias in AI Systems. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, CHI ’25, New York, NY, USA, 26 April–1 May 2025. [Google Scholar] [CrossRef]
- Katsuragi, M.; Tanaka, K. Comparing AI-Generated and Human-Crafted T-Shirt Layouts through an XAI Lens: Key Design Elements and Implications for Co-Creative Tools. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, CHI EA ’25, New York, NY, USA, 11–16 May 2025. [Google Scholar] [CrossRef]
- Yoshioka, T.; Morikura, Y.; Izumi, T.; Wada, T. Sustainable data-driven framework and policy recommendations for enhancing sports promotion using generative and explainable Artificial Intelligence. J. Phys. Educ. Sport 2025, 25, 638–645. [Google Scholar] [CrossRef]
- Rathakrishnan, M.; Gayan, S.; Edirisinghe, S.; Inaltekin, H. A Multi-Model Framework for Synthesizing High-Fidelity Network Intrusion Data Using Generative AI. In Proceedings of the 2025 5th International Conference on Advanced Research in Computing (ICARC), Belihuloya, Sri Lanka, 19–20 February 2025; pp. 1–6. [Google Scholar] [CrossRef]
- Future of Life Institute. The Act Texts—EU Artificial Intelligence Act. 2024. Available online: https://artificialintelligenceact.eu/the-act/ (accessed on 28 October 2025).
- Salih, A.M.; Raisi-Estabragh, Z.; Galazzo, I.B.; Radeva, P.; Petersen, S.E.; Lekadir, K.; Menegaz, G. A Perspective on Explainable Artificial Intelligence Methods: SHAP and LIME. Adv. Intell. Syst. 2025, 7, 2400304. [Google Scholar] [CrossRef]
- Slack, D.; Hilgard, S.; Jia, E.; Singh, S.; Lakkaraju, H. Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods; AIES ’20. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, 7–12 February 2020; pp. 180–186. [Google Scholar] [CrossRef]
- Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-Resolution Image Synthesis with Latent Diffusion Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 10674–10685. [Google Scholar] [CrossRef]
- Elhage, N.; Nanda, N.; Olsson, C.; Henighan, T.; Joseph, N.; Mann, B.; Askell, A.; Bai, Y.; Chen, A.; Conerly, T.; et al. A Mathematical Framework for Transformer Circuits. 2021. Available online: https://transformer-circuits.pub/2021/framework/index.html (accessed on 6 December 2025).
- Rauker, T.; Ho, A.; Casper, S.; Hadfield-Menell, D. Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks. In Proceedings of the IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), Raleigh, NC, USA, 8–10 February 2023; pp. 464–483. [Google Scholar] [CrossRef]
- Kumarage, P.; Saarela, M. Explainability in Generative AI: An Umbrella Review of Current Techniques, Limitations, and Future Directions. In Proceedings of the Late-Breaking Work at the 2025 International Conference on Explainable AI (XAI 2025), Istanbul, Turkey, 9–11 July 2025. [Google Scholar]
- Dhariwal, P.; Nichol, A. Diffusion Models Beat GANs on Image Synthesis. In Proceedings of the Advances in Neural Information Processing Systems, Online, 6–14 December 2021; Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W., Eds.; Curran Associates, Inc.: New York, NY, USA, 2021; Volume 34, pp. 8780–8794. [Google Scholar]
- Saharia, C.; Chan, W.; Saxena, S.; Li, L.; Whang, J.; Denton, E.L.; Ghasemipour, K.; Gontijo Lopes, R.; Karagol Ayan, B.; Salimans, T.; et al. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding. In Proceedings of the Advances in Neural Information Processing Systems, New Orleans, LA, USA, 28 November–9 December 2022; Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A., Eds.; Curran Associates, Inc.: New York, NY, USA, 2022; Volume 35, pp. 36479–36494. [Google Scholar]
- Luccioni, A.S.; Akiki, C.; Mitchell, M.; Jernite, Y. Stable bias: Evaluating societal representations in diffusion models; NIPS ’23. In Proceedings of the 37th International Conference on Neural Information Processing Systems, Red Hook, NY, USA, 10–16 December 2023. [Google Scholar]
- Hertz, A.; Mokady, R.; Tenenbaum, J.; Aberman, K.; Pritch, Y.; Cohen-Or, D. Prompt-to-Prompt Image Editing with Cross-Attention Control. In Proceedings of the 11th International Conference on Learning Representations (ICLR), La Jolla, CA, USA, 1–5 May 2023. [Google Scholar]
- Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning, Online, 18–24 July 2021; Meila, M., Zhang, T., Eds.; Volume 139, pp. 8748–8763. [Google Scholar]
- Monteiro, W.R.; Reynoso-Meza, G. A Review of the Convergence Between Explainable Artificial Intelligence and Multi-Objective Optimization. TechRxiv 2022. [Google Scholar] [CrossRef]
- DeGrave, A.J.; Cai, Z.R.; Janizek, J.D.; Daneshjou, R.; Lee, S.I. Auditing the inference processes of medical-image classifiers by leveraging generative AI and the expertise of physicians. Nat. Biomed. Eng. 2023, 9, 294–306. [Google Scholar] [CrossRef]
| Key Aspect | Implications for GenAI |
|---|---|
| Trust and Transparency | Increases user adoption and confidence in AI-generated content. |
| Regulatory Compliance | AI developers and deploying organizations must align GenAI models with global AI governance standards to ensure lawful deployment. |
| Bias and Fairness Assessment | Fairer AI-generated results minimize discrimination and improve AI’s societal impact. |
| Mitigation of Misinformation | Helps minimize AI-generated misinformation and propaganda and improves trust in AI-powered content creation. |
| Ethical and Legal Considerations | Reduces legal risks, safeguards user rights, and ensures AI-generated content does not violate regulations or norms. |
| User Control and Interpretability | Enhances transparency in AI interactions by allowing users to refine and oversee model outputs more effectively. |
| Security and Robustness | Strengthens AI security by detecting malicious manipulations and reducing risks of AI misuse or cyber threats. |
| Accountability and Auditing | Ensures AI-generated outputs remain accountable and verifiable, promoting the ethical development of AI. |
| Criterion | Included | Excluded |
|---|---|---|
| a. Type of Publication | Peer-reviewed journals and conference papers | Editorial materials, books, book chapters, short surveys, notes, conference reviews, articles, theses, other gray literature |
| b. Study Topic | GenAI, Explainability | Common AI, explainability for common AI |
| c. Recentness | Publications from 2020 onward | Publications before 2020 |
| d. Language | English | Other languages |
| e. Type of Study | Different types of literature reviews and empirical studies | Only abstracts and proposals |
| f. Study Content | Explainability of GenAI | Explainability of common AI only; GenAI without an explainability focus |
| Review Article | Main Findings |
|---|---|
| Bushey [32] | GenAI explainability is underdeveloped but critical, especially as AI-generated images become part of legal records and medical decisions. |
| Hanif et al. [33] | XAI can bridge the human–AI understanding gap in high-stakes domains. |
| Zarghami et al. [34] | Emphasizes explainability in construction; proposes a taxonomy and hybrid methods. |
| Longo et al. [35] | Current XAI methods do not scale to GenAI due to model complexity; promising but untested techniques exist. |
| Schneider [8] | GenAI explainability is an urgent but underdeveloped area; trust, interactivity, verifiability, evaluation, and cost need attention. |
| Mudabbiruddin et al. [36] | Traditional XAI techniques (e.g., SHAP, PatternNet) contribute to GenAI output explainability but remain insufficient for full transparency and trust. |
| Pan et al. [37] | Combining LLMs and knowledge graphs could improve GenAI explainability. |
| Jain and Jain [38] | XAI techniques like Reinforcement Learning from Human Feedback (RLHF) and post hoc methods fail to adequately address GenAI explainability. |
| Zeiser et al. [39] | XAI for GenAI in manufacturing is essential but still lacking. |
| Qu and Yang [40] | ChatGPT offers educational potential but interpretability and bias remain issues. |
| Bui [41] | Explainability is crucial for integrating GenAI into patent law; uses traditional XAI techniques (e.g., SHAP, LIME), but gaps persist. |
| Ye et al. [42] | GenAI has potential in planning and participation, but its opacity poses risks. |
| Demuth et al. [43] | Clinical GenAI uses post hoc techniques like saliency maps and layer-wise relevance propagation, which help visualize input regions but lack intrinsic interpretability. |
| López Joya et al. [44] | SHAP aids GenAI-powered bot detection insights, but GenAI-specific XAI remains immature. |
| Abbas [45] | Current literature lacks empirical studies and practical integration of techniques for GenAI explainability. |
| Non et al. [46] | GenAI explainability in healthcare epidemiology is viewed as an ethical and conceptual necessity. |
| Westphal and Mrowka [47] | GenAI explainability is crucial for clinical integration; adaptation of traditional XAI to GenAI is still evolving. |
| Mikołajewska et al. [48] | Lack of explainability limits GenAI adoption in industrial settings such as digital twins. |
| Empirical Article | Country | Main Findings |
|---|---|---|
| Sun et al. [49] | USA | Identifies users’ explainability needs for GenAI for code and proposes four human-centered explainability features to address them. |
| El-Zanfaly et al. [50] | USA | Demonstrates that explainability of GenAI can be achieved through intuitive, embodied interaction rather than technical explanations. |
| Ezzahed et al. [51] | Switzerland | Introduces visual latent space analysis to improve the explainability of VAEs in air traffic management. |
| Wang et al. [52] | Singapore | Shows how XAI techniques like SHAP and PDPs can be used with proxy models to make GenAI prompt editing transparent and interpretable. |
| Jeong et al. [53] | USA | Introduces Concept Lens, a tool that explains image-based GenAI by visualizing consistent and inconsistent semantic edits in GANs. |
| Hasko et al. [54] | Ukraine | Demonstrates how integrating XAI with GenAI models like GPT-3.5 in robotic assistants enhances transparency and user trust. |
| Minutti et al. [55] | Mexico | Introduces a β-VAE-based GenAI model that enhances explainability through interpretable latent space and bias control. |
| Esposito et al. [56] | Italy | Proposes actionability as a practical, domain-specific proxy for explainability in GenAI systems used for mission-critical risk analysis. |
| Moruzzi et al. [57] | Italy | Introduces epistemic filters as a conceptual framework to improve GenAI explainability by accounting for both model and user biases in human-AI interactions. |
| Pozzi et al. [58] | Italy | Applies explainability via Concept Relevance Propagation to assess the fidelity of features learned from GenAI in digital pathology. |
| Pontorno et al. [59] | Italy | Uses LIME to identify detectable traces left by GenAI models to improve explainability in deepfake detection. |
| Riello et al. [60] | Italy | Demonstrates that LLM attention scores can offer interpretable, feature-level explanations for educational predictions. |
| Sachan et al. [61] | UK | Introduces a framework that links GenAI outputs to explainable decisions using Evidential Reasoning and ensures their transparency through blockchain auditing. |
| Sachan et al. [62] | UK | Ensures explainability and accountability in GenAI outputs by grounding them in XAI-generated legal decisions and tracking their use through immutable records. |
| Bird and Lotfi [63] | UK | Demonstrates how Grad-CAM can reveal subtle visual cues in AI-generated images in GenAI image detection systems. |
| Burgess [64] | UK | Shows that demystification improves GenAI explainability by reducing user misattribution of intelligence. |
| Ince et al. [65] | UK | Shows that using SHAP with CTGAN makes GenAI data augmentation more transparent and interpretable. |
| Bryan-Kinns et al. [66] | UK | Demonstrates that structuring VAE latent spaces with musical attributes makes GenAI music more interpretable and controllable. |
| Abu-Rasheed et al. [67] | Germany | Proposes a hybrid system that improves GenAI explainability in education using GPT-4, knowledge graphs, and expert-guided conversational support. |
| Herdt and Maass [68] | Germany | Presents a method to visualize and control GAN outputs using activation vectors for improved interpretability and structure-level generation. |
| Balmer et al. [69] | Switzerland | Shows that GenAI can be made explainable by combining CVAEs with decision trees and sensitivity analysis for transparent design exploration. |
| Vilone et al. [70] | Switzerland | Proposes a legal compliance framework revealing that current GenAI lacks sufficient explainability for financial advice under EU regulations. |
| Durango et al. [71] | Spain | Introduces the DYNAMIC framework that enhances GenAI explainability through adaptive, interpretable, and user-driven system design. |
| Kim [72] | South Korea | Proposes conceptual frameworks (DIKW hierarchy, Human-GenAI collaboration models, and ZPD) to guide the responsible use of text-generating AI by fostering explainability through XAI literacy. |
| Lee et al. [73] | South Korea | Develops a GPT-based multi-agent system that generates structured, explainable outputs for chemical process improvements. |
| Heo et al. [74] | South Korea | Shows how Deep SHAP explains GenAI (AAE) energy forecasts by revealing key climate and feature influences. |
| Jang et al. [75] | South Korea | Provides a user-centered framework identifying when, what, and how explanations should be delivered in GenAI chatbots. |
| Demirbaga [76] | Turkey | Introduces CloudGEN, a GAN-powered anomaly detection framework that combines generative modeling with SHAP-based explainability in cloud systems. |
| Biswal [77] | India | Demonstrates how LIME and SHAP can be effectively applied to a fine-tuned GenAI model to provide explanations for detecting subtle cyber harassment. |
| Kim [78] | USA | Shows that uncertainty expressions can improve trust calibration in LLMs and proposes a framework for explainability in GenAI. |
| Ehsan and Riedl [79] | USA | Reframes GenAI explainability by proposing human-centered approaches that emphasize actionable understanding over algorithmic transparency. |
| Chaccour et al. [80] | USA | Highlights the potential of integrating causal AI, XAI, uncertainty, and neuro-symbolic AI to enhance GenAI explainability in telecom networks. |
| Pendyala and Chintalapati [81] | USA | Shows that LIME and removal-based methods can explain how foundation models detect GenAI-generated fake images. |
| Taylor-Melanson et al. [82] | Canada | Introduces a set of counterfactual explanation methods using causal GenAI models that enhance the interpretability of image classifiers. |
| Hu et al. [83] | New Zealand | Shows that integrated gradients can enhance the transparency of LLM predictions in GenAI for education. |
| Toth et al. [84] | Italy | Demonstrates how ChatGPT-4 and explainable AI can be combined to make archival metadata more transparent and semantically accessible. |
| Di Lodovico et al. [85] | Italy | Shows that users form diverse, evolving folk theories to explain GenAI outputs, revealing gaps in current GenAI explainability tools. |
| Leimeister et al. [86] | Germany | Demonstrates that tailored explanations for GenAI, such as factual and analogical, can improve users’ ability to detect confabulations in GenAI outputs. |
| Jeck et al. [87] | Germany | Introduces TELL-ME, a prototype that provides personalized explanations for GenAI outputs based on user expertise. |
| Basaran and Dressler [88] | Germany | Introduces fastSHAP-C, a real-time method to explain generative autoencoder decisions in O-RAN anomaly detection. |
| Ahmed et al. [89] | Sweden | Proposes ExplainAgent, a modular tool that unifies existing XAI methods to improve GenAI transparency through user-tailored explanations. |
| Bhattacharya et al. [90] | Belgium | Shows how explainability tools can support domain experts in guiding and validating GenAI-generated data to reduce bias. |
| Katsuragi and Tanaka [91] | Japan | Demonstrates that textual explanations from a GenAI model can enhance designer trust and support human-AI collaboration in creative layout tasks. |
| Yoshioka et al. [92] | Japan | Shows how XAI can make policy insights from GenAI-generated data more transparent. |
| Rathakrishnan et al. [93] | Sri Lanka | Applies LIME to make CTGAN-generated intrusion data interpretable and enables analysts to understand feature contributions without exposing real data. |
| Category | Specific Techniques |
|---|---|
| Pre-existing Techniques | Local Interpretable Model-Agnostic Explanations (LIME) [59,77,81,92,93] |
| | SHapley Additive exPlanations (SHAP) [65,76,77,92] |
| | Gradient-weighted Class Activation Mapping (Grad-CAM) [63] |
| | Integrated gradients [83] |
| | Heatmaps [85] |
| | Concept Relevance Propagation (CRP) [58] |
| | Retrieval-Augmented Generation (RAG) and fine-tuning [56] |
| | Removal-based explanations [81] |
| | Machine learning classifiers [59] |
| | Attention-based explanations [60] |
| | Decision trees and sensitivity analysis [69] |
| | Deep SHAP, latent variable analysis, multiple linear regression, and spline methods [74] |
| | Feature attribution, counterfactual explanations, causal reasoning, fairness and bias assessments, self-evaluation metrics, chain-of-thought reasoning [70] |
| | Causal AI, common XAI, and neuro-symbolic AI [80] |
| | Natural language textual justifications [91] |
| | AI documentation (fact sheets and model cards) [49] |
| Modified Pre-existing Techniques | Actionability as a proxy [56] |
| | fastSHAP-C [88] |
| | Uncertainty expression technique [78] |
| | Sand playground interface [50] |
| | Mystification and demystification [64] |
| | Rationale generation and Seamful XAI [79] |
| | MeasureVAE and AdversarialVAE [66] |
| | Uncertainty indicators and attention visualization [49] |
| | Revised DIKW hierarchy and Zone of Proximal Development (ZPD) [72] |
| | Factual, analogical, probabilistic, and chain-of-thought explanations [86] |
| | Pixel-based, attribute-based, and counterfactual explanations [82] |
| | DYNAMIC framework: combination of a GenAI lens, interpretable neural modules, XAI techniques (LIME, SHAP, counterfactual explanations, concept activation vectors), and real-time visualizations with D3.js [71] |
| | Activation vector visualization [68] |
| | Combination of LIME, SHAP, transformer-based attention mechanisms, context-aware and multimodal explanations, and user-centric interactive explanation systems [89] |
| | Combination of integrated gradients, remove-and-retrain (ROAR), LLM embedding-based semantic clustering, LLM-based hierarchical node labeling, t-SNE dimensionality reduction, FrameNet-based semantic role labeling via LLM, and location–event extraction using WordNet semantics with LLM [84] |
| | Combination of evidential reasoning, blockchain-based auditing, and anonymized AI prompting [61] |
| | Combination of feature ablation, counterfactual explanations, and self-explainability [87] |
| | Combination of data-centric explanations, model impact analysis, local what-if analysis, and transparency measures [90] |
| | Combination of I-MAKER, C-MAKER, blockchain-based auditing, anonymization of AI prompts, and explainable legal reasoning [62] |
| | Combination of latent response matrix, divergence plots, mean curvature plots, intervention-based analysis, and composite visualization [51] |
| | Combination of the GIPHT system, DSFILES (Detailed Simplified Flowsheet Input Line Entry System), literature-based validation, and prompt engineering techniques [73] |
| | Combination of β-VAE, latent space manipulation, weighted masking, and comparative evaluation [55] |
| | Combination of SHAP, Partial Dependence Plots (PDPs), and a proxy model (LightGBM) [52] |
| | Combination of knowledge graph-based contextualization, re-prompting and intent classification, expert-defined constraints and rules, and a human mentor fallback mechanism [67] |
| | Combination of XAI, federated learning, and human–robot interaction and collaboration [54] |
| | Combination of timing of explanations, explanation arrangement, accuracy of presentation, and global and local explanations [75] |
| Novel Techniques | Social transparency [49,79] |
| | Concept Lens [53] |
| | Epistemic filters [57] |
| | LLMs’ self-explanations [78] |
| | Human-GenAI collaboration models [72] |
| Analytical Dimension | Categories | Representative Studies from This Review |
|---|---|---|
| Mechanism of Explanation | Feature attribution | SHAP in CTGAN [65], LIME for cyber harassment [77], Grad-CAM for image detection [63] |
| | Latent space interpretability | β-VAE for bias control [55], VAE music generation [66], latent response matrices [51] |
| | Surrogate modeling | SHAP + LightGBM proxy [77], decision trees + CVAE [69] |
| | Causal and counterfactual reasoning | Causal GenAI images [82], feature ablation + counterfactuals [87] |
| | Semantic concept modeling | Concept Lens in GANs [53], CRP in pathology [58] |
| | Natural-language rationalization | LLM self-explanations [78], natural language design explanations [91] |
| | Social and interaction-based transparency | Social transparency [49,79], embodied sandbox explanations [50] |
| Timing of Explanation | Ante hoc | β-VAE latent control [55], epistemic filters [57], Concept Lens [53] |
| | Post hoc | LIME [77,93], SHAP [65,76], Grad-CAM [63], attention visualization [60] |
| | Hybrid (ante + post hoc) | DYNAMIC framework [71], ExplainAgent [89], uncertainty-aware GenAI [78] |
| Scope of Explanation | Local | LIME for intrusion data [93], Grad-CAM images [63], counterfactual classifiers [82] |
| | Global | SHAP global importance [76], sensitivity analysis [69] |
| | Mixed (local + global) | SHAP + PDPs [77], DYNAMIC framework [71], ExplainAgent [89] |
| Target Audience | Developers | SHAP proxy models [52], fastSHAP-C [88], attention analysis [60] |
| | End users | Chatbot explanation timing [75], uncertainty expression [78], co-creation sandbox [50] |
| | Domain experts | Digital pathology [58], air-traffic VAEs [51], chemical process GPT agents [73] |
| | Learners and educators | MOOCs with integrated gradients [83], TEL mentoring system [67] |
| | Policymakers and regulators | Blockchain-audited GenAI [61,62], legal explainability in robo-advice [70] |
| Methodological Nature | Quantitative | SHAP, PDPs [52,65,76] |
| | Qualitative | Social transparency [49], epistemic filters [57] |
| | Mixed | ExplainAgent [89], DYNAMIC framework [71] |