Understanding AI Agents—A Data-Driven Literature Review
Abstract
1. Introduction
2. AI Agents
3. Methodology
- Bubble color denotes cluster membership, delineating thematic groupings and supporting comparison across clusters.
- Bubble size encodes each manuscript’s similarity to the concept “AI agents” under our similarity measure; larger bubbles indicate a stronger conceptual association.
- Edges show the 100 highest pairwise similarity scores among the publications, highlighting the strongest inter-publication relationships while keeping the plot readable.
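The three encodings listed above can be illustrated with a minimal sketch. The similarity measure is not specified here, so cosine similarity over feature vectors is used purely as an assumption; the function names (`cosine`, `bubble_plot_data`), the toy vectors, and the cluster labels are hypothetical and are not taken from the study.

```python
import heapq
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bubble_plot_data(vectors, clusters, concept, top_k=100):
    """Derive the three visual encodings from hypothetical document vectors.

    vectors  -- dict: publication id -> feature vector
    clusters -- dict: publication id -> cluster label (drives bubble color)
    concept  -- vector standing in for the concept "AI agents"
    """
    # Bubble size: similarity of each publication to the concept vector.
    sizes = {pid: cosine(vec, concept) for pid, vec in vectors.items()}
    # Edges: keep only the top_k highest pairwise similarities.
    pairs = ((cosine(vectors[a], vectors[b]), a, b)
             for a, b in combinations(sorted(vectors), 2))
    edges = heapq.nlargest(top_k, pairs)
    nodes = {pid: {"color": clusters[pid], "size": sizes[pid]} for pid in vectors}
    return nodes, edges


# Toy input; every value below is made up for illustration.
vectors = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0]}
clusters = {"A": "architecture", "B": "architecture", "C": "safety"}
nodes, edges = bubble_plot_data(vectors, clusters, concept=[1.0, 0.0], top_k=2)
```

Restricting the edge set to a fixed `top_k` keeps the number of drawn edges constant regardless of corpus size, which is one way to obtain the readability the third bullet describes.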
4. Architecture & Frameworks
4.1. The Evolution of Agentic Architectures
4.2. Cognitive Foundations of Single-Agent Systems
4.3. Memory, Reflection and Long-Horizon Autonomy
4.4. Reasoning and Planning Mechanisms
4.5. Grounding and External Tool Use
4.6. Framework Ecosystems for Agent Deployment
4.7. Transition to Multi-Agent Architectures
5. Multi-Agent Systems
5.1. From Single-Agent Architectures to Agentic AI
5.2. Conceptual Foundations of Multi-Agent Systems
5.3. Architectural Paradigms and Coordination Structures
5.4. Core Components of Multi-Agent Architectures
5.5. Collective Reasoning and Planning
5.6. Tool Use and Shared Data Environments
5.7. Communication Protocols and Interoperability
5.8. Reliability and Governance Considerations in Multi-Agent Systems
5.9. Architectural Trade-Offs: Resource Complexity and Performance
6. Applications
6.1. The Expanding Landscape of AI Agent Applications
6.2. Applications in Financial Services
6.3. Applications in Scientific Research and Discovery
6.4. Applications in Software Engineering and Web Automation
6.5. Applications in Education and Learning
6.6. Applications in Business and Enterprise Automation
6.7. Applications in Robotics and the Internet of Things
6.8. Applications in Security and Cybersecurity
6.9. Applications in Gaming and Simulation
6.10. Specialized and Emerging Applications
7. Safety
7.1. Safety Foundations and Research Scope
7.2. Definition and Scope of Safety
7.3. Categories of Threats and Vulnerabilities
7.4. Intra-Execution Threats in Single-Agent Systems
7.4.1. Perception and Input Manipulation
7.4.2. Reasoning, Planning and Cognitive Failures
7.5. External Interaction Threats in Multi-Agent and Ecosystem Settings
7.5.1. Agent-to-Agent Threats
7.5.2. Agent-to-Memory Threats
7.5.3. Agent-to-Environment Threats
7.6. Mitigation Strategies and Technical Safeguards
7.6.1. Architectural and Technical Defenses
7.6.2. Privacy-Preserving Techniques
7.7. Recurring Concerns and Open Technical Problems
8. Ethics, Accountability & Governance
8.1. Governance Principles and Normative Scope
8.2. The Normative Core: Ethics and Value Alignment
8.3. Trust, Anthropomorphism and Human Mental Models
8.4. The Accountability Gap: Responsibility and Liability
8.5. Governance Mechanisms and Oversight
8.6. Interoperability and Ecosystem Governance
8.7. Convergence, Trade-Offs and Open Governance Challenges
9. Challenges in Agentic AI
9.1. Reliability and Sequential Reasoning
9.2. The Evaluation Paradox
9.3. Architectural Fragmentation and Interoperability
9.4. Resource Efficiency and Latency
9.5. Human-Agent Coordination
9.6. Emergent and Uncategorized Challenges
9.7. Open Research Directions
10. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Meaning |
|---|---|
| AI | Artificial Intelligence |
| API | Application Programming Interface |
| DB | Database |
| GSM8K | Grade School Math 8K |
| GUI | Graphical User Interface |
| JEPA | Joint-Embedding Predictive Architecture |
| LLM | Large Language Model |
| MAS | Multi-Agent Systems |
| MCP | Model Context Protocol |
| MMLU | Massive Multitask Language Understanding |
| Pass@k | Probability of at least one correct solution out of k samples |
| RAG | Retrieval-Augmented Generation |
| SMT | Satisfiability Modulo Theories |
| SR | Success Rate |
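The two evaluation metrics in the list above, Pass@k and SR, can be made concrete with a short sketch. It assumes the widely used combinatorial estimator of Pass@k, in which n samples are drawn per task and c of them are correct; whether the reviewed studies compute the metric this way is not stated here. The function names are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate Pass@k from n samples of which c are correct.

    Returns 1 - C(n - c, k) / C(n, k): one minus the probability that a
    random size-k subset of the n samples contains no correct solution.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate is then exactly 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def success_rate(outcomes) -> float:
    """Success Rate (SR): fraction of tasks completed successfully."""
    outcomes = list(outcomes)
    return sum(bool(o) for o in outcomes) / len(outcomes)
```

For example, with n = 10 samples and c = 3 correct, `pass_at_k(10, 3, 1)` evaluates to about 0.3, the plain per-sample success probability, and the estimate rises toward 1 as k grows.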



| Research Stream | Mean Relevance (μ) | Standard Deviation (σ) |
|---|---|---|
| Architecture & Frameworks | 70.16 | 35.85 |
| Multi-Agent Systems | 37.78 | 42.12 |
| Applications | 70.42 | 34.67 |
| Safety | 33.65 | 39.11 |
| Ethics, Accountability & Governance | 36.93 | 41.68 |
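As a reading aid for the table above, the sketch below shows how a per-stream mean and standard deviation could be computed from per-publication relevance scores. The scores are made up, the 0 to 100 scale is an assumption, and the population standard deviation is used only because the estimator behind the table is not stated; `summarize` is a hypothetical helper.

```python
from statistics import fmean, pstdev


def summarize(scores_by_stream):
    """Return {stream: (mean, standard deviation)} rounded to two decimals."""
    return {
        stream: (round(fmean(scores), 2), round(pstdev(scores), 2))
        for stream, scores in scores_by_stream.items()
    }


# Made-up relevance scores on an assumed 0-100 scale; NOT the study's data.
toy_scores = {
    "Safety": [0.0, 0.0, 100.0, 100.0],
    "Applications": [60.0, 80.0, 70.0, 70.0],
}
summary = summarize(toy_scores)
```

In the toy data, the stream whose scores sit at the two extremes has a standard deviation as large as its mean, while the stream with tightly grouped scores does not. This is one possible pattern behind rows where σ exceeds μ, not a claim about the underlying data.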
| Feature | Single AI Agent | Multi-Agent Architecture |
|---|---|---|
| Composition | Single LLM augmented with tools and memory | Ensemble of specialized agents using multiple LLMs or models |
| Task complexity | Focused on a single, bounded task | Supports complex, multi-step workflows requiring coordination |
| Reasoning process | Internal iterative loops such as ReAct | Distributed and collaborative reasoning with recursive task allocation |
| Execution | Sequential autonomous steps within a defined scope | Parallelized execution coordinated among specialized agents |
| Architecture | Complexity | Latency | Memory | Metrics | Benchmarks |
|---|---|---|---|---|---|
| Single-Agent | Low | Low (real-time) | Context-heavy | Pass@k, Success Rate (SR) | MMLU, GSM8K |
| Hierarchical MAS | Moderate (parallel) | Variable (orchestration) | Distributed/low load | Task completion, efficiency | SWE-bench |
| Decentralized MAS | High | High (negotiation) | Large system footprint | Consensus rate, cost | Alympics, ChatEval |
| Agentic RAG | Moderate (two-stage) | Moderate (DB lookup) | Efficient (external vector DB) | Precision, faithfulness | HotpotQA, RGB |
| Embodied/Web | Variable (high-frequency) | Critical (<100 ms) | Hardware-dependent | Path length, GUI success | VisualWebArena |
| Application Domain | Representative Systems/Approaches | Agent Type | Key Tasks | Performance Metrics | Observed Performance Trends |
|---|---|---|---|---|---|
| 6.2 Financial Services | FinRobot, FinVision | Multi-Agent | Trading, risk analysis, fraud detection | Accuracy, false-positive rate, response time | Improved risk accuracy, reduced false positives, faster decision-making |
| 6.3 Scientific Research | AutoGen, CrewAI, BioDiscoveryAgent, Biomni | Multi-Agent | Literature review, experiment design, data analysis | Hit rate, precision/recall, productivity gain | Higher experimental success rates, strong gains in productivity and scalability |
| 6.4 Software Engineering | SWE-Agent, SWE-bench | Single & Multi-Agent | Code generation, debugging, maintenance | Task success rate, benchmark score | Multi-agent setups outperform single agents on complex coding tasks |
| 6.4 Web Automation | WebAgents, OpenAI Operator, VisualWebArena | Single & Multi-Agent | Web navigation, task automation | Task completion rate, robustness, generalization | Strong performance in structured tasks, limited robustness in open environments |
| 6.5 Education | Intelligent Tutoring Systems, PitchQuest | Multi-Agent | Tutoring, simulation, feedback | Learning outcome, engagement, accuracy | Improved engagement and personalized learning outcomes |
| 6.6 Business & Enterprise | Siri, Alexa, Replika | Single & Multi-Agent | Customer service, workflow automation | Response time, user satisfaction, task efficiency | Increased efficiency and reduced workload, strong UX improvements |
| 6.7 Robotics & IoT | Multi-robot systems, embodied agents | Multi-Agent | Physical interaction, control, coordination | Task success rate, control accuracy | Effective in structured environments, challenges in unstructured settings |
| 6.8 Cybersecurity | Multi-agent defense systems | Multi-Agent | Threat detection, response coordination | Detection rate, false positives, response latency | Faster detection and improved coordination, but new attack surfaces |
| 6.9 Gaming & Simulation | Alympics, Pixelor | Single & Multi-Agent | Strategy, gameplay, simulation | Win rate, human-level performance | Achieves human-level or superhuman performance in controlled settings |
| 6.10 Emerging Applications | FilmAgent, OptiMuse, Pairit | Multi-Agent | Creative tasks, co-creation | Output quality, user preference, collaboration efficiency | Human–AI teams outperform AI-only in creativity and final output selection |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Stübinger, J.; Metz, F. Understanding AI Agents—A Data-Driven Literature Review. Mathematics 2026, 14, 1478. https://doi.org/10.3390/math14091478
