Understanding AI Agents—A Data-Driven Literature Review

Stübinger, Johannes; Metz, Fabio

doi:10.3390/math14091478

Open Access Review

by Johannes Stübinger * and Fabio Metz

Faculty of Business and Economics, Coburg University of Applied Sciences, 96450 Coburg, Germany

* Author to whom correspondence should be addressed.

Mathematics 2026, 14(9), 1478; https://doi.org/10.3390/math14091478

Submission received: 18 February 2026 / Revised: 2 April 2026 / Accepted: 16 April 2026 / Published: 28 April 2026

(This article belongs to the Special Issue Mathematical and Computing Sciences for Artificial Intelligence, 2nd Edition)

Abstract

This paper presents a systematic, data-driven literature review of research on Artificial Intelligence (AI) agents based on the top 100 Google Scholar publications related to the search terms “AI agent” and “AI agents”. The rapid advancement of AI agents, driven in particular by recent progress in Large Language Models, has resulted in a diverse and fragmented research landscape that lacks comprehensive quantitative overviews. To address this gap, we implement and apply a fully automated, AI-driven analysis pipeline to the domain of AI agents. The collected publications are processed using a Large Language Model accessed via a Python-based Application Programming Interface (API), enabling an automated analysis of the literature without manual categorization. Based on this approach, the publications are grouped into data-driven thematic clusters reflecting dominant research perspectives in the field. Specifically, the identified clusters comprise “Architecture & Frameworks”, “Multi-Agent Systems”, “Applications”, “Safety” and “Ethics, Accountability & Governance”. By synthesizing the literature in a structured and automated manner, this work provides a consolidated overview of central research patterns, identifies key operational and structural challenges and highlights fragmentation across AI agent research. The findings support a more systematic understanding of AI agents and provide a foundation for future research on robust, scalable and trustworthy AI agent systems.

Keywords: AI agent; Google Scholar; data-driven literature review; large language models; automated analysis; architecture & frameworks; multi-agent systems; applications; safety; ethics; accountability & governance

MSC: 68T01; 68T07; 68T30; 68T42

1. Introduction

The concept of Artificial Intelligence (AI) agents [1–100] has evolved from an early theoretical objective in cybernetics and computer science into a central paradigm of contemporary artificial intelligence research [8,77]. While early work focused on symbolic AI and autonomous robotics, recent advances mark a clear transition from static generative models operating as human-controlled assistants toward autonomous, goal-directed systems capable of independent decision-making [3,25]. This shift has been accelerated by the emergence of Large Language Models (LLMs) and foundation models, which increasingly serve as cognitive backbones enabling agents to reason, plan and interact with dynamic and open-ended environments [6,12,83].

Across the literature, AI agents are commonly characterized by a core triad of perception, cognition and action [2,5]. Their behavior is frequently described along a spectrum of agentic properties such as autonomy, reactivity, proactiveness and social ability [1,2,17]. Modern agentic systems are designed to adapt to underspecified instructions and to pursue long-term objectives with limited human intervention by coordinating multi-step workflows across digital and physical contexts [2,8,52,71]. As a result, AI agent research has rapidly expanded across multiple disciplines, including machine learning, natural language processing, human–computer interaction and application-oriented domains such as biomedicine, finance and software engineering [1,2,53,83].

Agentic AI represents a paradigm shift from reactive language models to autonomous systems capable of executing open-ended, multi-step tasks. Unlike passive assistants, AI agents actively employ advanced reasoning, persistent memory and dynamic tool use to interact with complex environments [3]. These capabilities are often extended through multi-agent collaboration, in which multiple agents coordinate to solve decentralized and complex problems, thereby establishing Agentic AI as a distinct and emerging research field [5]. At the same time, the increasing level of autonomy introduces substantial challenges, particularly with respect to system design, evaluation, reliability and the need for robust governance mechanisms to mitigate emerging safety risks [3].

Despite the rapid and ongoing expansion of Agentic AI, the research landscape remains fragmented. There is no universally accepted definition of an AI agent and substantial heterogeneity exists in design assumptions, evaluation methodologies and reporting practices [1,8,10,99]. High-level analyses further indicate a lack of standardized benchmarks and reproducible evaluation protocols, complicating the assessment of scientific progress and practical relevance [1,8,71]. Existing surveys typically focus on narrow subsets of the field or rely on qualitative syntheses, thereby lacking a comprehensive quantitative perspective on how the broader field is structured. As a result, a consolidated and data-driven overview that systematically captures the dominant thematic streams and their interconnections across the literature is still missing.

In this paper, we address this gap by conducting a systematic, data-driven literature review of the top 100 Google Scholar publications on AI agents, with the objective of providing a structured and reproducible overview of the field and identifying its dominant thematic structures. In the first step, an automated pipeline crawls and processes the textual content of the selected manuscripts using a Large Language Model accessed via a Python-based API, thereby enabling a fully automated and reproducible analysis without manual intervention. Based on this approach, the publications are grouped into data-driven thematic clusters, namely “Architecture & Frameworks”, “Multi-Agent Systems”, “Applications”, “Safety” and “Ethics, Accountability & Governance”. In the second step, these clusters are analyzed as a logical progression, tracing how architectural foundations and multi-agent coordination provide the functional basis for diverse applications, which in turn necessitate specific safety and ethical guardrails. Specifically, we (i) analyze their structural composition and interconnections, (ii) synthesize their core research contributions and (iii) evaluate their relative importance within the overall research landscape. By leveraging probabilistic clustering and similarity-based assignment, the study reveals that research on AI agents is characterized by a high degree of fragmentation despite clear recurring conceptual patterns. Drawing from the top 100 Google Scholar papers, the proposed methodology enables a consistent and scalable synthesis of the literature, which ultimately uncovers both dominant research directions and critical gaps, thereby providing a structured foundation for future work on robust, scalable and trustworthy AI agent systems.

The remainder of this paper is organized as follows. Section 2 introduces the concept of AI agents. In Section 3, we present the methodology and outline our approach. Section 4 reviews the underlying architectural principles and frameworks, after which Section 5 is devoted to multi-agent systems and Section 6 discusses their applications. Safety aspects are addressed in Section 7, while ethical considerations, accountability and governance are examined in Section 8, followed by a discussion of specific challenges in Section 9. Finally, Section 10 concludes the paper.

2. AI Agents

AI agents are autonomous software systems engineered to perceive their environment, reason about tasks and execute actions to achieve user-defined objectives [2,101]. Rather than functioning as passive tools, they operate as autonomous systems that leverage Large Language Models, which act as powerful instruments for addressing a diverse array of natural language processing tasks [102,103,104,105,106,107]. AI agents decompose complex goals into executable plans [3]. Their architecture integrates modular components, including perception modules that process multimodal inputs such as textual, visual and auditory data, memory systems that maintain contextual information over extended horizons and action modules that invoke external tools and APIs to manipulate digital environments [2,3,52].

From a technical point of view, an AI Agent consists of the main pillars “Foundation Model”, “Memory Systems”, “Planning”, “Tool Usage” and “Action Execution” [10].

Foundation Model: At the heart of an LLM-based agent lies its foundation model, typically a Large Language Model or multimodal model, which can be formally represented as a function $F_{\theta} : X \to Y$, where $X$ denotes the input space (e.g., text, images, audio) and $Y$ denotes the output space. Given an input $x \in X$, the model produces an output $y = F_{\theta}(x)$ with parameters $\theta \in \mathbb{R}^{n}$ encoding learned knowledge and reasoning capabilities, including multimodal inputs $x = (x_{text}, x_{image}, x_{audio})$.

Memory Systems: LLM agents incorporate both short-term and long-term memory, which can be defined as

$$M = M_{short} \cup M_{long},$$

where short-term memory maintains recent context

$$M_{short} = \{x_{t-k}, \dots, x_{t}\}$$

and long-term memory stores persistent information

$$M_{long} = \{m_{1}, m_{2}, \dots, m_{n}\}.$$

A retrieval function

$$R(q, M_{long}) \to m_{i}$$

selects relevant information, enabling responses of the form

$$y_{t} = F_{\theta}(x_{t}, M_{short}, R(q, M_{long})),$$

which support continuity and knowledge accumulation over time.
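The memory definitions above can be made concrete with a minimal Python sketch. It is an illustration only: the class name `AgentMemory`, the window parameter `k` and the word-overlap retriever are assumptions introduced here, not components described in the reviewed literature, and a real system would typically use embedding-based retrieval instead.

```python
from collections import deque
from typing import Optional


class AgentMemory:
    """Toy memory M = M_short ∪ M_long with a word-overlap retriever R (hypothetical)."""

    def __init__(self, k: int = 3) -> None:
        # M_short holds the k + 1 most recent inputs x_{t-k}, ..., x_t.
        self.short = deque(maxlen=k + 1)
        # M_long holds persistent entries m_1, ..., m_n.
        self.long = []

    def observe(self, x: str) -> None:
        """Append a new input; the oldest one drops out once the window is full."""
        self.short.append(x)

    def remember(self, m: str) -> None:
        """Store a persistent entry in long-term memory."""
        self.long.append(m)

    def retrieve(self, q: str) -> Optional[str]:
        """R(q, M_long): return the entry sharing the most words with q, if any."""
        q_words = set(q.lower().split())
        best, best_overlap = None, 0
        for m in self.long:
            overlap = len(q_words & set(m.lower().split()))
            if overlap > best_overlap:
                best, best_overlap = m, overlap
        return best


memory = AgentMemory(k=1)
memory.remember("the user prefers metric units")
memory.remember("the project deadline is in march")
for x in ["hello", "convert 3 miles", "which units should I use"]:
    memory.observe(x)
```

With `k=1`, only the two most recent inputs remain in the short-term window, while both long-term entries stay available to the retriever.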

Planning: Planning allows agents to decompose a complex task $T$ into a sequence of subtasks

$$T = \{T_{1}, T_{2}, \dots, T_{n}\}$$

using a planning function

$$P(T) \to (T_{1}, T_{2}, \dots, T_{n}).$$

These subtasks are executed sequentially

$$T_{i} \to T_{i+1}, \quad \forall i \in \{1, \dots, n-1\},$$

often with an optimization objective such as

$$\min \sum_{i=1}^{n} Cost(T_{i}),$$

improving both problem-solving efficiency and interpretability.
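As a hedged illustration of the planning objective, the following sketch compares candidate decompositions by their summed cost and executes the cheapest one in order. The `Subtask` type, the cost values and the idea of choosing among a fixed list of candidate plans are assumptions made for this example; the source does not specify how plans are generated or costed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtask:
    name: str
    cost: float  # hypothetical Cost(T_i), e.g. expected seconds or tokens


def plan_cost(plan: list) -> float:
    """Sum of Cost(T_i) over the subtasks of one plan."""
    return sum(step.cost for step in plan)


def choose_plan(candidates: list) -> list:
    """Pick the candidate decomposition with minimal total cost."""
    return min(candidates, key=plan_cost)


def execute(plan: list) -> list:
    """Run the subtasks strictly in order, T_1 -> T_2 -> ... -> T_n."""
    return [f"done: {step.name}" for step in plan]


# Two made-up decompositions of the same task.
candidates = [
    [Subtask("search", 2.0), Subtask("read", 3.0), Subtask("summarize", 1.0)],
    [Subtask("search", 2.0), Subtask("summarize", 1.5)],
]
best = choose_plan(candidates)
log = execute(best)
```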

Tool Usage: Since LLMs have limitations in advanced reasoning, precise computation and up-to-date knowledge, agents integrate external tools defined as

$$A = \{A_{1}, A_{2}, \dots, A_{k}\}.$$

A selection function

$$S(x) \to A_{i}$$

determines the appropriate tool, which produces an output

$$o_{i} = A_{i}(x).$$

The final response is then enhanced as

$$y = F_{\theta}(x, o_{i}),$$

ensuring that

$$Accuracy(y_{with\ tools}) \geq Accuracy(y_{without\ tools}).$$
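The selection and invocation steps can be sketched as follows. Everything here is illustrative: the two tools, the rule-based `select_tool` router and the string-formatting `respond` function are stand-ins chosen for this example, whereas in an LLM-based agent both the selection and the final response would come from the model itself.

```python
from typing import Callable, Dict


def calculator(x: str) -> str:
    """Toy tool: evaluate 'a op b' for op in {+, -, *}."""
    a, op, b = x.split()
    left, right = float(a), float(b)
    results = {"+": left + right, "-": left - right, "*": left * right}
    return str(results[op])


def echo(x: str) -> str:
    """Fallback tool: return the input unchanged."""
    return x


# Tool set A = {A_1, ..., A_k}; names are hypothetical.
TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator, "echo": echo}


def select_tool(x: str) -> str:
    """S(x): route to the calculator if x looks like 'number op number'."""
    parts = x.split()
    if len(parts) == 3 and parts[1] in {"+", "-", "*"}:
        try:
            float(parts[0])
            float(parts[2])
            return "calculator"
        except ValueError:
            pass
    return "echo"


def respond(x: str) -> str:
    """Stand-in for y = F_theta(x, o_i): combine the tool name and its output."""
    name = select_tool(x)
    output = TOOLS[name](x)  # o_i = A_i(x)
    return f"[{name}] {output}"
```

For example, `respond("2 * 21")` routes to the calculator, while `respond("a + b")` falls back to `echo` because the operands are not numeric.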

Action Execution: Agents interact with an environment $E$ by selecting actions $a_{t} \in A$, leading to state transitions

$$s_{t+1} = \delta(s_{t}, a_{t}).$$

This interaction forms a loop

$$(s_{t}, a_{t}) \to s_{t+1},$$

optionally guided by a reward function

$$r_{t} = R(s_{t}, a_{t}),$$

with the objective of maximizing cumulative reward

$$\max \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t}\right],$$

where $\gamma$ describes the discount factor. This enables agents to actively perform tasks such as API calls, database queries and system interactions.
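The interaction loop and the discounted return can be traced on a toy environment. The sketch below assumes a deterministic number-line world with a fixed goal state, a greedy policy and a finite horizon; all of these are simplifications chosen for illustration and are not taken from the reviewed papers.

```python
GOAL = 3  # hypothetical goal state on a number line


def delta(state: int, action: int) -> int:
    """Deterministic transition s_{t+1} = delta(s_t, a_t)."""
    return state + action


def reward(state: int, action: int) -> float:
    """r_t = R(s_t, a_t): 1 if the resulting state is the goal, else 0."""
    return 1.0 if delta(state, action) == GOAL else 0.0


def policy(state: int) -> int:
    """Greedy policy: move one step toward the goal, or stay once there."""
    return (GOAL > state) - (GOAL < state)


def rollout(s0: int, steps: int, gamma: float):
    """Run the loop (s_t, a_t) -> s_{t+1} and accumulate sum_t gamma^t * r_t."""
    state, ret = s0, 0.0
    for t in range(steps):
        action = policy(state)
        ret += (gamma ** t) * reward(state, action)
        state = delta(state, action)
    return state, ret


final_state, discounted_return = rollout(s0=0, steps=5, gamma=0.5)
```

Starting from state 0, the agent first lands on the goal at step t = 2 and then stays there, so the return collects the terms 0.5^2 + 0.5^3 + 0.5^4.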

Beyond static execution, agents possess capabilities for decision-making and adaptive behavior, employing reflection mechanisms to adjust strategies based on environmental feedback and execution outcomes [2,5]. This supports robust error handling and sustained performance improvement over time [12]. To address highly complex and end-to-end workflows, agents can be deployed in multi-agent architectures, where specialized units collaborate through structured communication protocols to coordinate tasks and share information effectively [2,18].

Figure 1 shows the cumulative number of Google Scholar records from 1990 to 31 January 2026 whose titles contain the terms “AI agent” or “AI agents”, obtained using an all-in-title query (allintitle: “AI agent” OR allintitle: “AI agents”). By restricting the search to title-level occurrences, the analysis focuses on documents in which AI agents are likely to represent a central research topic rather than a peripheral mention.

From 1990 until approximately 2017, the cumulative count increases only slowly, indicating a comparatively limited and stable volume of title-matching publications. A moderate acceleration becomes visible between 2018 and 2023. In contrast, the period from 2023 to early 2026 exhibits a pronounced and abrupt increase: within roughly two years, the cumulative number of records grows by more than a factor of five. Moreover, the cumulative total reached in 2025 already exceeds the sum of all preceding years (1990–2023). This marked change in growth rate is consistent with a recent surge in research activity on AI agents and underlines the timeliness of conducting a comprehensive, data-driven literature review on this topic.

While Figure 1 focuses on academic publication output, Figure 2 complements this analysis by illustrating worldwide Google search interest for the Google Trends query “AI Agent” from 1 January 2023 to 31 January 2026.

Figure 2 reports the weekly Google Trends interest on the platform’s normalized 0–100 scale. Search interest remains close to baseline throughout 2023 and increases only gradually during 2024, with the first clearly visible upward movement occurring toward the end of 2024. In 2025, the time series exhibits a rapid increase, culminating in a pronounced peak in early August 2025 (index value 100). After this maximum, interest remains at comparatively elevated levels, albeit with substantial week-to-week fluctuations.

3. Methodology

In this section, we describe the data-driven and fully automated methodology applied to analyze the literature on AI agents. In contrast to traditional literature reviews, we implement an AI-driven approach and directly apply it to the domain of AI agents without manual categorization or subjective intervention.

As a first step, we crawled the top 100 Google Scholar publications [1–100] related to the search terms “AI agent” and “AI agents” (snapshot date: 15 December 2025, results sorted by relevance). Google Scholar was selected as the underlying database because it (i) provides a freely accessible search engine, (ii) indexes the full text and metadata of scientific literature across a wide range of publication formats and disciplines and (iii) represents one of the largest academic search engines, containing more than 380 million documents [108].

Furthermore, Google Scholar applies a relevance-based ranking mechanism that orders results according to multiple factors, including citation counts, publication sources and textual relevance. This ranking can be interpreted as a proxy for the visibility and influence of publications within the scientific literature. As a result, the top-ranked documents reflect contributions that are particularly prominent and widely recognized within the research field. Accordingly, the resulting dataset represents a structured snapshot of the most visible literature at the time of retrieval, forming the basis for the subsequent data-driven analysis.

The retrieved list was kept in the original order provided by Google Scholar and was not re-ranked or manually modified. We use the rank position as a proxy for relevance within our dataset, where Paper [1] denotes the highest-ranked result and Paper [100] the lowest-ranked result among the collected publications. We retain this ordering throughout the manuscript to ensure consistency and transparency in referencing and analysis.

Second, the collected publications are processed in an automated manner using Gemini 2.5 Flash by Google, accessed via a Python-based (version 3.12.6) application programming interface. We use the LLM to analyze the textual content of the publications and to identify recurring themes and conceptual similarities across the corpus. Formally, let the corpus be defined as

$$D = \{d_{1}, d_{2}, \dots, d_{n}\},$$

where each $d_{i}$ represents a publication and

$$G = \{g_{1}, g_{2}, \dots, g_{K}\}$$

denotes the set of thematic groups.

The language model is used to estimate the probability that a document belongs to a given group:

$$P(g_{k} \mid d_{i}) = f_{\theta}(d_{i}, g_{k}),$$

where $f_{\theta}$ represents the LLM.

Thus, each document is associated with a probability distribution over all groups:

$$\sum_{k=1}^{K} P(g_{k} \mid d_{i}) = 1, \quad \forall i \in \{1, \dots, n\}.$$

Based on these probabilities, documents can be assigned to the most likely group via

$$c_{i} = \underset{g_{k} \in G}{\arg\max}\; P(g_{k} \mid d_{i}),$$

or alternatively retained as soft assignments reflecting uncertainty.
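The assignment rule can be illustrated with a short sketch. It assumes that the model returns one non-negative raw score per group, which is then normalized so that the values sum to one; this normalization step and the example scores are assumptions made here, since the text does not state how the LLM outputs are converted into probabilities.

```python
GROUPS = [
    "Architecture & Frameworks",
    "Multi-Agent Systems",
    "Applications",
    "Safety",
    "Ethics, Accountability & Governance",
]


def normalize(scores: dict) -> dict:
    """Turn non-negative raw scores into a distribution P(g_k | d_i)."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {group: score / total for group, score in scores.items()}


def hard_assignment(probs: dict) -> str:
    """c_i = argmax over groups of P(g_k | d_i)."""
    return max(probs, key=probs.get)


# Made-up raw scores for a single document d_i (not data from the paper).
raw_scores = dict(zip(GROUPS, [60.0, 10.0, 20.0, 5.0, 5.0]))
probs = normalize(raw_scores)   # soft assignment
cluster = hard_assignment(probs)  # hard assignment
```

Keeping `probs` alongside `cluster` preserves the soft assignment, so a document that scores almost equally on two groups can still be identified later.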

Based on this AI-driven analysis, the publications are grouped into clusters representing dominant thematic patterns present in the literature. The identification of these clusters emerges directly from the data and is not guided by predefined taxonomies or manual labeling, but instead follows a probabilistic assignment derived from the model.

To further quantify conceptual relationships between publications, we compute pairwise semantic similarity scores across the corpus. For each document, the title and abstract are transformed into vector embeddings using the embedding functionality of the Gemini 2.5 Flash model via its API. Let $e_{i}$ denote the embedding of document $d_{i}$ obtained from its textual representation. The similarity between two documents $d_{i}$ and $d_{j}$ is then defined using cosine similarity between their embedding vectors. These pairwise similarity scores provide a continuous measure of semantic proximity between publications and are used to capture structural relationships within the dataset. While cluster membership is determined by the probabilistic assignment $P(g_{k} \mid d_{i})$, the similarity measure is used to analyze intra- and inter-cluster relationships and forms the basis for the network representation shown in Figure 3. This design separates probabilistic clustering from similarity-based structure analysis, thereby improving methodological transparency and reproducibility. Although both steps rely on the same textual input, they serve distinct purposes: probabilistic assignment captures thematic grouping, while similarity measures capture structural relationships between documents.
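For reference, cosine similarity between two embeddings is the dot product divided by the product of the vector norms. The sketch below computes it with the standard library only; the two-dimensional vectors are placeholders standing in for real embeddings, which would be obtained from the embedding API and have far more dimensions.

```python
import math


def cosine_similarity(u: list, v: list) -> float:
    """cos(e_i, e_j) = (e_i . e_j) / (||e_i|| * ||e_j||)."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def pairwise_similarity(embeddings: list) -> list:
    """Symmetric matrix of cosine similarities for all document pairs."""
    n = len(embeddings)
    return [
        [cosine_similarity(embeddings[i], embeddings[j]) for j in range(n)]
        for i in range(n)
    ]


# Placeholder embeddings e_1, e_2, e_3 (illustrative values only).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sim = pairwise_similarity(embeddings)
```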

In a final step, we apply the same data-driven approach to generate structured textual descriptions of the identified clusters. For this purpose, we instruct the LLM to synthesize the core ideas and research focus of each cluster based exclusively on the publications assigned to it. This procedure ensures a consistent and scalable analysis and forms the foundation for the thematic discussion presented in the following sections.

Figure 3 shows the algorithm’s output described above: each bubble represents one of the top 100 Google Scholar publications and the number inside refers to the corresponding entry in the bibliography. In our setting, each stream is named according to the most frequent topic within the respective cluster. To be more specific, our data-driven review algorithm (i) divides the publications into the streams “Architecture & Frameworks”, “Multi-Agent Systems”, “Applications”, “Safety” and “Ethics, Accountability & Governance” and (ii) assigns each manuscript to the stream based on the highest assignment probability $P(g_{k} \mid d_{i})$. “Architecture & Frameworks” emerges as the largest stream with 38 publications, followed by “Applications” (27), “Ethics, Accountability & Governance” (14), “Multi-Agent Systems” (12) and “Safety” (9). To enable a more granular interpretation of the findings, Figure 3 is constructed based on the following design principles, each encoding a distinct analytical dimension:

Color of bubbles is used to denote cluster membership, thereby delineating thematic groupings and facilitating cross-cluster comparison.
Bubble size operationalizes each manuscript’s similarity to the concept “AI agents” as quantified by our similarity measure; larger bubbles indicate a stronger conceptual association.
Edges represent the top 100 pairwise similarity scores across the publications, emphasizing the most salient inter-publication relationships while maintaining visual tractability.
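The edge-selection principle can be sketched as a top-k filter over the upper triangle of a similarity matrix. The function below is an illustration under stated assumptions: it treats each unordered pair of distinct publications once and breaks ties by index order, neither of which is specified in the text, and the 4 x 4 matrix holds made-up values.

```python
from itertools import combinations


def top_k_edges(sim: list, k: int) -> list:
    """Return the k most similar unordered pairs as (i, j, score) tuples."""
    n = len(sim)
    # Each unordered pair (i, j) with i < j is considered exactly once.
    pairs = [(i, j, sim[i][j]) for i, j in combinations(range(n), 2)]
    pairs.sort(key=lambda edge: edge[2], reverse=True)
    return pairs[:k]


# Made-up symmetric similarity matrix for four publications.
sim_matrix = [
    [1.0, 0.9, 0.2, 0.4],
    [0.9, 1.0, 0.7, 0.1],
    [0.2, 0.7, 1.0, 0.3],
    [0.4, 0.1, 0.3, 1.0],
]
edges = top_k_edges(sim_matrix, k=2)
```

For a corpus of 100 publications there are 4950 unordered pairs, so retaining the 100 strongest keeps roughly 2% of all possible edges.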

The following lines describe the characteristics and attributes of the five streams. We pay particular attention to the special features of each stream in order to gain an overview and first insights.

The Architecture & Frameworks cluster comprises works that structure the architectural design space of AI agents, focusing on how reasoning, planning and tool use are composed into coherent system architectures and implemented through reusable frameworks. Particularly tight connections can be observed between architectural landscape surveys and framework-oriented reviews, for example between Masterman et al. [5] and Ferrag et al. [24], as well as between layered capability models and orchestration-focused framework surveys, such as Huang [36] and Joshi [60].

The Multi-Agent Systems cluster comprises works that investigate how multiple AI agents coordinate, specialize and collaborate to solve complex tasks through distributed reasoning, shared memory and structured communication. Particularly tight connections can be observed between architectural and taxonomy-oriented perspectives, for example between Masterman et al. [5] and Sapkota et al. [18], as well as between closely related framework-centric surveys by Joshi [59,60], which jointly emphasize orchestration, coordination and collaborative execution.

The Applications stream comprises works that examine how AI agents are deployed across diverse domains, including finance, scientific research, software engineering, education, enterprise automation, robotics, security, gaming and creative industries, highlighting their role in automating workflows and augmenting human decision-making. As an illustrative example of a particularly tight substream, strong connections can be observed between studies on agents in virtual and simulated environments and gameplay evaluation, for instance between Umarov and Mozgovoy [77], De Mesentier Silva et al. [98], Petrovic [26] and Fung et al. [52], which jointly position simulation and embodiment as key testbeds for assessing agent capabilities.

The Safety stream comprises works that analyze risks and vulnerabilities of AI agents arising from autonomy, tool use, memory and interaction with open environments. Particularly tight connections can be observed between survey and empirical studies on execution-time attack surfaces, for example between Deng et al. [2], the WebAgent security analysis by Ning et al. [39] and the empirical study on the increased jailbreak susceptibility of web-based agents by Chiang et al. [71], which together form a dense substream around prompt injection and web-based attack vectors.

The Ethics, Accountability & Governance stream comprises works that examine the normative, legal and socio-technical implications of deploying autonomous AI agents, with a focus on value alignment, responsibility attribution and institutional oversight. Particularly tight connections can be observed between governance- and accountability-oriented contributions, such as Kolt [3], Chan et al. [7] and Kasirzadeh and Gabriel [17], as well as between studies on trust, anthropomorphism and user expectations in human–agent interaction, including Mehrotra et al. [54], Lei et al. [89], Chang et al. [96] and Lim and Shim [86].

Beyond the categorical assignment, our algorithm evaluates each publication across all five thematic dimensions by assigning independent relevance scores (ranging from 0 to 100). This multi-dimensional profiling allows for a “feature-based” comparison, where each research stream is treated as a measurable attribute of the manuscript. Table 1 summarizes the descriptive statistics for these scores across the entire corpus of 100 publications.

The data reveals that “Architecture & Frameworks” (μ = 70.16) and “Applications” (μ = 70.42) are the most dominant features of the current landscape, reflecting a field heavily focused on system implementation. In contrast, “Multi-Agent Systems”, “Safety” and “Ethics, Accountability & Governance” exhibit lower mean scores but substantially higher standard deviations (σ > 39). This statistical profile indicates a fragmented landscape where these topics reach maximum relevance (100.0) in specialized “deep-dive” papers but are not yet baseline components of the broader architectural discourse.
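The contrast between a high mean with low spread and a lower mean with high spread can be reproduced on toy data. The scores below are invented for illustration and are not values from Table 1; the sketch also assumes the population standard deviation, since the text does not say which estimator was used.

```python
import statistics

# Invented relevance scores (0 to 100) for four papers on two dimensions.
broad_topic = [70, 75, 65, 70]        # relevant to nearly every paper
specialized_topic = [100, 0, 100, 0]  # central in some papers, absent in others

broad_mean = statistics.mean(broad_topic)
broad_std = statistics.pstdev(broad_topic)
special_mean = statistics.mean(specialized_topic)
special_std = statistics.pstdev(specialized_topic)
```

A topic that is either central or absent yields a bimodal score distribution, which lowers the mean and inflates the standard deviation, matching the pattern described for the three specialized streams.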

The following sections conduct a deep-dive analysis of the outlined streams.

4. Architecture & Frameworks

4.1. The Evolution of Agentic Architectures

The architectural landscape of AI agents has undergone rapid conceptual and technological evolution. Early approaches relied predominantly on symbolic reasoning and handcrafted rule-based systems, whereas more recent work increasingly adopts adaptive architectures built around Large Language Models and multimodal foundation models [1,3,8,100]. This transition has substantially expanded the functional scope of agency, enabling computational systems to reason over heterogeneous inputs, generate structured plans and interact autonomously with dynamic and uncertain environments [5].

Following the literature-driven clustering and coding procedure described in Section 3, this section synthesizes the architectural elements that most consistently co-occur across contemporary agent-based systems. Rather than proposing a normative architecture, the section reflects how the literature itself organizes agentic systems around recurrent components, including cognitive loops, memory and reflection mechanisms, reasoning and planning strategies and grounding through external tools. These elements constitute the conceptual foundation for more complex multi-agent architectures, which are examined in Section 5.

4.2. Cognitive Foundations of Single-Agent Systems

Across the surveyed literature, single-agent systems are most commonly described in terms of a continuous cognitive loop comprising perception, reasoning and action [2,5,22,79]. The perception component mediates between the agent and its environment by ingesting multimodal inputs, such as text, images or sensor data, and transforming them into structured representations suitable for downstream processing by an LLM [2,18]. Many architectures further introduce an explicit preprocessing stage that performs prompt construction, context aggregation or input normalization in order to ensure that incoming information is coherent and task-relevant [2].

Reasoning constitutes the cognitive core of the agent. At this stage, the LLM interprets contextual information, performs probabilistic inference and generates intermediate representations that support decision-making, planning and problem solving [6,18,83]. This capability marks the transition from passive language modeling toward goal-directed behavior. The action component translates the outputs of reasoning into executable operations, including application programming interface calls, program execution, database queries, or physical actions in embodied systems [2,6]. Taken together, perception, reasoning and action define the minimal architectural configuration required for autonomous interaction with an environment [4].

4.3. Memory, Reflection and Long-Horizon Autonomy

To support sustained autonomy and multi-step reasoning, many contemporary agent architectures extend the basic cognitive loop with persistent memory and reflection mechanisms [5,18,83]. Memory enables contextual continuity by allowing agents to store and retrieve information across interactions, thereby influencing future decisions and behaviors [2,18,37,92]. The literature recurrently distinguishes three forms of memory. Episodic memory records interaction histories and prior trajectories. Semantic memory stores structured domain knowledge [18,37]. Vector-based memory supports similarity search and retrieval-augmented generation through embedding-based retrieval. Coordinating these memory modalities is essential to maintaining a coherent internal state and preventing goal drift over long horizons [18].

Reflection mechanisms complement memory by enabling agents to evaluate and revise their own reasoning processes. Reflective architectures allow agents to analyze prior decision trajectories, identify errors or inefficiencies and adapt subsequent actions accordingly. Frameworks such as Reflexion operationalize reflection through explicit feedback loops in which agents generate verbal critiques of their own outputs and iteratively refine their behavior. The integration of memory and reflection transforms agents from purely reactive systems into adaptive entities capable of learning and long-horizon planning [5,18,24].
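The feedback loop behind reflective architectures can be sketched with two stand-in functions. This is a toy and not the Reflexion implementation: `attempt` and `critique` are hypothetical placeholders for LLM calls, and the two formatting rules are chosen only so that the example runs deterministically.

```python
def attempt(task: str, lessons: list) -> str:
    """Stand-in for an LLM call whose output improves with stored lessons."""
    answer = task.strip()
    if "end with a period" in lessons:
        answer += "."
    if "start with a capital letter" in lessons:
        answer = answer[:1].upper() + answer[1:]
    return answer


def critique(answer: str) -> list:
    """Stand-in evaluator: return verbal feedback, empty if acceptable."""
    feedback = []
    if not answer.endswith("."):
        feedback.append("end with a period")
    if not answer[:1].isupper():
        feedback.append("start with a capital letter")
    return feedback


def reflect_loop(task: str, max_rounds: int = 3):
    """Generate, critique, store the critique, and retry until accepted."""
    lessons = []  # memory of self-critiques carried across attempts
    answer = attempt(task, lessons)
    for _ in range(max_rounds):
        feedback = critique(answer)
        if not feedback:
            break
        lessons.extend(item for item in feedback if item not in lessons)
        answer = attempt(task, lessons)
    return answer, lessons


final_answer, learned = reflect_loop("agents plan and act")
```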

4.4. Reasoning and Planning Mechanisms

Reasoning and planning mechanisms determine an agent’s capacity to translate abstract objectives into structured sequences of actions [1,5,68,83]. A prominent line of work focuses on improving the reliability of reasoning by encouraging models to externalize intermediate inference steps. The Chain-of-Thought approach exemplifies this strategy by prompting models to generate explicit reasoning traces, thereby reducing implicit inference errors [1,6,18,36]. Extensions such as Tree-of-Thoughts and Graph-of-Thought further enhance robustness by enabling the exploration of multiple reasoning paths before committing to a solution [2,36].

A complementary body of research separates planning from execution. Architectures such as PlanReAct [12] and Plan-and-Act [68] generate explicit natural-language plans that can be inspected, revised and monitored during execution. This decoupling supports feedback, error correction and controllability, particularly in open-ended environments where execution outcomes are uncertain. Collectively, these approaches reflect a shift in the literature from static text generation toward deliberative and feedback-driven decision-making processes [5,68].

4.5. Grounding and External Tool Use

Autonomous behavior requires agents to ground internal reasoning in external data sources, computational tools and application interfaces [1,5,6,10,18]. The ReAct paradigm formalizes this interaction by interleaving reasoning steps with actions that query or manipulate the environment [5,24,68]. Each reasoning–action cycle updates the agent’s internal state based on observable outcomes, thereby improving reliability and situational awareness [18].
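The interleaving of reasoning and acting can be illustrated with a scripted stand-in for the model. In the sketch below, `scripted_model`, the `lookup` tool and the toy inventory are all hypothetical; an actual ReAct agent would obtain the thought and the next action from an LLM rather than from fixed rules.

```python
INVENTORY = {"apples": 3, "pears": 5}  # toy environment state


def lookup(item: str) -> str:
    """Environment action: return the stored count for an item."""
    return str(INVENTORY.get(item, "unknown"))


def scripted_model(question: str, trace: list):
    """Stand-in for the LLM: pick (thought, action, argument) from the trace."""
    observations = [step["observation"] for step in trace]
    if len(observations) == 0:
        return "I need the apple count.", "lookup", "apples"
    if len(observations) == 1:
        return "I need the pear count.", "lookup", "pears"
    total = sum(int(obs) for obs in observations)
    return "I can add both counts.", "finish", str(total)


def react(question: str, max_steps: int = 5):
    """Alternate reasoning and acting; each observation feeds the next step."""
    trace = []
    for _ in range(max_steps):
        thought, action, arg = scripted_model(question, trace)
        if action == "finish":
            return arg, trace
        observation = lookup(arg)
        trace.append(
            {"thought": thought, "action": f"{action}[{arg}]", "observation": observation}
        )
    return "no answer", trace


answer, trace = react("How many pieces of fruit are in stock?")
```

The returned `trace` records every thought, action and observation, which is what makes this style of execution inspectable after the fact.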

Retrieval-Augmented Generation (RAG) architectures extend grounding by embedding live information retrieval directly into the reasoning loop [16,18,24,68]. More recent Agentic RAG approaches introduce additional autonomy by allowing agents to dynamically select information sources, manage retrieval strategies and integrate retrieved knowledge into downstream reasoning [24,83]. In the surveyed literature, these mechanisms consistently appear as a means of anchoring agent behavior in verifiable external information while preserving flexibility in reasoning and action.
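A minimal sketch of the retrieve-then-generate pattern is given below. It assumes a word-overlap retriever and stops at prompt construction, leaving the generation call out; production systems typically rank passages by embedding similarity, and the function names and the three-sentence corpus are illustrative only.

```python
def tokenize(text: str) -> set:
    """Lowercased whitespace tokens; a crude stand-in for real tokenization."""
    return set(text.lower().split())


def retrieve(query: str, corpus: list, k: int = 2) -> list:
    """Return the k passages sharing the most words with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        corpus,
        key=lambda passage: len(query_tokens & tokenize(passage)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, passages: list) -> str:
    """Place the retrieved passages ahead of the question as grounding context."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return f"Context:\n{context}\n\nQuestion: {query}"


corpus = [
    "ReAct interleaves reasoning steps with actions",
    "Episodic memory records interaction histories",
    "Semantic memory stores structured domain knowledge",
]
query = "what does episodic memory record"
passages = retrieve(query, corpus, k=2)
prompt = build_prompt(query, passages)
```

An agentic variant would add a decision step before `retrieve`, for instance choosing among several corpora or deciding whether retrieval is needed at all.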

4.6. Framework Ecosystems for Agent Deployment

The increasing complexity of agentic architectures has motivated the development of software frameworks that standardize integration, orchestration and scalability. LangChain and LangGraph provide modular environments for connecting LLMs with tools, memory components and stateful execution graphs [24,59,60]. AutoGen and CrewAI emphasize conversational programming and collaborative task execution, supporting structured interaction patterns among multiple agents [2,59,60].

Enterprise-oriented platforms such as IBM watsonx.ai, AWS Bedrock Agents and Google Vertex AI Agent Builder integrate similar capabilities into secure and compliant infrastructures suitable for industrial deployment [59,60,83]. Within the literature, these frameworks are typically discussed as implementation-level ecosystems that operationalize architectural concepts rather than as architectural primitives themselves. Together, they constitute the technical substrate of an emerging agentic ecosystem that supports the construction, deployment and management of autonomous AI systems.

4.7. Transition to Multi-Agent Architectures

While single-agent architectures enable autonomous reasoning and action within bounded contexts, many real-world tasks require coordination, role specialization and parallel execution. Recent research therefore documents a transition from modular, isolated agents toward agentic AI systems composed of multiple interacting agents with complementary capabilities. In such systems, cognitive responsibilities are distributed across specialized roles, including planning, retrieval, verification and synthesis, and are coordinated through structured interaction mechanisms [5,18,68].

The architectural principles governing collective intelligence, including coordination structures, shared memory, communication protocols and governance mechanisms, are examined in detail in Section 5.

5. Multi-Agent Systems

5.1. From Single-Agent Architectures to Agentic AI

The transition from modular single-agent architectures to multi-agent systems represents a fundamental shift in the organization of autonomy. Whereas single-agent systems encapsulate perception, reasoning, memory and action within a unified control loop, multi-agent systems distribute cognition, control and execution across multiple interacting agents [5,18]. This shift enables parallelism, role specialization and coordinated execution of complex, long-horizon tasks that exceed the practical capabilities of individual agents [18].

Consistent with the literature-driven analysis described in Section 3, multi-agent systems emerge as a recurrent architectural response to increasing task complexity and environmental uncertainty. Rather than extending single agents indefinitely, contemporary work increasingly emphasizes coordinated collections of agents whose interaction gives rise to system-level intelligence. Table 2 summarizes the principal architectural differences between single-agent and multi-agent paradigms, highlighting the progression from self-contained reasoning toward coordinated, collective behavior.

5.2. Conceptual Foundations of Multi-Agent Systems

The defining characteristic of a multi-agent system lies in the deliberate distribution of reasoning and responsibility across autonomous components [18]. Each agent operates with its own reasoning capacity while contributing to a shared system-level objective through structured interaction mechanisms [2,6,18,72]. This design principle is grounded in theories of distributed cognition, according to which collective behavior can yield outcomes that surpass the performance of any single agent [18,78].

Recent research further emphasizes the role of social intelligence in multi-agent systems, particularly in environments where agents collaborate with humans or with other agents under uncertainty [42]. Social capabilities such as interpreting multimodal cues, managing ambiguity and integrating multiple perspectives are increasingly recognized as prerequisites for effective cooperation in human–agent teams [42]. Simulation environments such as Alympics provide controlled experimental settings for analyzing these dynamics in both cooperative and competitive scenarios [58].

5.3. Architectural Paradigms and Coordination Structures

Multi-agent architectures can be situated along a spectrum ranging from centralized to decentralized coordination structures [5,68]. In hierarchical or orchestrated systems, a central coordination layer, often conceptualized as a meta-agent, decomposes tasks, assigns subtasks, manages dependencies and resolves conflicts [5,18,83]. This paradigm promotes global coherence and facilitates system-level optimization. MetaGPT exemplifies this approach by modeling agents as organizational roles coordinated through predefined workflows [18,88]. Similarly, AutoGen supports orchestrated collaboration through conversational task delegation supervised by a coordinating agent [14,18,90].

By contrast, peer-to-peer architectures emphasize decentralized interaction, allowing agents to communicate directly and negotiate task execution without reliance on a central controller [36,83]. Such designs enhance flexibility and robustness but require careful management of communication to mitigate redundancy, conflict and inefficiency [5]. As reflected in the surveyed literature, many contemporary systems adopt hybrid architectures that combine centralized orchestration with localized autonomy in order to balance efficiency, adaptability and resilience.

5.4. Core Components of Multi-Agent Architectures

The operational effectiveness of multi-agent systems depends on architectural components that extend beyond those of single-agent models. A central component is role specialization, whereby agents are assigned distinct functional responsibilities such as planning, retrieval, synthesis, or evaluation [5,18]. Specialization is commonly implemented through role-specific prompting strategies or model configurations and improves both interpretability and efficiency by aligning reasoning behavior with domain-specific expertise.

Persistent shared memory constitutes another foundational component. Because multi-agent systems rely on distributed processing, they require synchronized memory structures to maintain contextual continuity and situational awareness across agents [2,18]. Shared repositories, including vector databases and structured knowledge stores, reduce redundancy, prevent goal divergence and enable coherent collective reasoning. Through shared memory, otherwise independent agents are integrated into a unified cognitive system.

5.5. Collective Reasoning and Planning

Reasoning and planning in multi-agent systems involve translating system-level objectives into coordinated sequences of actions executed by multiple agents [18,68]. In many architectures, a planning or orchestration agent decomposes high-level goals into subtasks that can be executed concurrently by specialized agents [18]. Graph-based orchestration environments such as LangGraph support this process by explicitly representing dependencies, state transitions and execution flows [59,60].
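The idea of representing subtask dependencies explicitly can be sketched with the Python standard library. This example does not use the LangGraph API; it relies on `graphlib.TopologicalSorter` (Python 3.9 or later), and the subtask names in `workflow` are hypothetical. It groups subtasks into batches whose members have no unmet dependencies and could therefore be handed to specialized agents concurrently.

```python
from graphlib import TopologicalSorter


def execution_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group subtasks into batches; tasks within a batch have no unmet dependencies."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()                           # raises CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())     # sorted for deterministic output
        batches.append(ready)
        sorter.done(*ready)                    # mark the batch as finished
    return batches


# Hypothetical subtasks: each key depends on the tasks in its value set.
workflow = {
    "retrieve": {"plan"},
    "analyze": {"plan"},
    "synthesize": {"retrieve", "analyze"},
    "verify": {"synthesize"},
}
```

For this workflow, `execution_batches(workflow)` yields four batches: planning first, then retrieval and analysis in parallel, followed by synthesis and verification.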

Unlike single-agent reasoning loops, collective reasoning emerges through iterative interaction. Agents exchange intermediate results, critique one another’s outputs and adapt strategies based on shared feedback. The Multi-Agent Debate pattern formalizes this process by organizing agents to generate independent solutions, critique alternatives and converge on improved outcomes [68]. Reflexive evaluation extends self-critique to the inter-agent level, enabling verifier agents to assess the correctness and consistency of peer outputs [18]. Dynamic team construction further optimizes resource allocation by creating or dissolving specialized agents as task requirements evolve [5]. The evaluation of such interactive and emergent behaviors poses distinct challenges compared to static benchmarks. Benchmarking platforms such as EvalAI can support evaluations in interactive and non-static environments, including scenarios where agents must respond to other agents rather than to fixed datasets [28].
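The reflexive evaluation pattern mentioned above, in which a verifier agent assesses peer outputs, can be sketched as follows. This is a toy illustration under stated assumptions: `coder_a` and `coder_b` are hypothetical stand-ins for solutions produced by two generating agents, and the verifier is reduced to running test cases rather than prompting a model.

```python
from typing import Callable

Candidate = Callable[[int], int]


def coder_a(n: int) -> int:
    # Hypothetical output of one coding agent: omits n itself (off-by-one bug).
    return sum(range(n))


def coder_b(n: int) -> int:
    # Hypothetical output of a second coding agent: closed form for 1 + ... + n.
    return n * (n + 1) // 2


def verifier(candidate: Candidate, cases: list[tuple[int, int]]) -> list[str]:
    """Return a critique: one message per failed case (empty list means accepted)."""
    return [
        f"input {x}: expected {want}, got {candidate(x)}"
        for x, want in cases
        if candidate(x) != want
    ]


def select_verified(candidates: dict[str, Candidate], cases: list[tuple[int, int]]):
    # Every candidate is critiqued; only candidates with no issues are accepted.
    critiques = {name: verifier(fn, cases) for name, fn in candidates.items()}
    accepted = [name for name, issues in critiques.items() if not issues]
    return accepted, critiques
```

In a fuller system the critiques would be returned to the generating agents for another round of revision, which is where the iterative character of collective reasoning enters.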

Empirical evidence reported in the literature indicates that such collaborative reasoning frameworks outperform single-agent configurations on complex tasks. In code generation, heterogeneous multi-agent systems augmented with verification agents achieve higher functional correctness on benchmarks such as HumanEval [64]. In educational settings, multi-agent collaboration structures described as inner and outer circulation support collective deliberation and learner knowledge construction, fostering forms of swarm intelligence [43].

5.6. Tool Use and Shared Data Environments

Autonomous access to external tools and data is central to effective coordination in multi-agent systems [18]. Function calling enables agents to perform domain-specific computations while maintaining clear interaction boundaries that limit unintended interference [18]. Retrieval-Augmented Generation architectures provide a shared informational substrate by integrating vector databases and retrieval pipelines that are accessible to all participating agents [18,24].
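How function calling keeps interaction boundaries clear can be shown with a small dispatcher. The call format (a JSON object with `name` and `arguments`) and the `compound_interest` tool are assumptions chosen for illustration; they do not reproduce any particular vendor's schema. The dispatcher only executes tools that were registered in advance and rejects calls whose arguments do not match the declared parameters.

```python
import json
from typing import Any, Callable

# Registry of permitted tools: name -> (function, declared parameter names).
REGISTRY: dict[str, tuple[Callable[..., Any], set[str]]] = {}


def tool(name: str, params: set[str]):
    def register(fn):
        REGISTRY[name] = (fn, params)
        return fn
    return register


@tool("compound_interest", {"principal", "rate", "years"})
def compound_interest(principal: float, rate: float, years: int) -> float:
    # Hypothetical domain-specific computation exposed to agents.
    return round(principal * (1 + rate) ** years, 2)


def dispatch(call_json: str) -> dict[str, Any]:
    """Validate a model-issued call against the registry before executing it."""
    call = json.loads(call_json)
    name, args = call.get("name"), call.get("arguments", {})
    if name not in REGISTRY:
        return {"error": f"unknown tool: {name}"}
    fn, params = REGISTRY[name]
    if set(args) != params:
        return {"error": f"expected arguments {sorted(params)}"}
    return {"result": fn(**args)}
```

Because the agent can only name a tool and supply arguments, everything it can do is limited to what the registry exposes.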

Agentic RAG extends this paradigm by allowing agents to orchestrate retrieval strategies collaboratively, dynamically allocating information-gathering responsibilities and synthesizing results across the collective [18,24]. Embedding tool use and retrieval directly into inter-agent communication aligns distributed reasoning with verifiable external knowledge and supports coherent system-level decision-making.

5.7. Communication Protocols and Interoperability

The scalability of multi-agent systems depends on reliable and standardized communication mechanisms that support coordination across heterogeneous agents [10,18]. Communication protocols define the formal grammar through which agents exchange information, negotiate tasks and maintain synchronization [10]. The Model Context Protocol standardizes interactions between agents and external resources by specifying consistent mechanisms for context integration and tool invocation [68]. Because tool discovery and invocation follow a common specification, the protocol reduces ambiguity in agent behavior and supports more predictable interaction patterns [99].

Agent-to-Agent protocols enable asynchronous collaboration among heterogeneous agents, supporting capability discovery, long-running workflows and distributed task execution [18,24,68]. The Agent Network Protocol extends these principles to cross-domain collaboration by facilitating structured negotiation among agents operating across organizational boundaries [10]. The Agent Protocol, built on OpenAPI v3, specifies standardized lifecycle operations such as execution runs, thread management and memory handling [10]. Together, these protocols form the communication infrastructure required for large-scale agentic cooperation.

5.8. Reliability and Governance Considerations in Multi-Agent Systems

As multi-agent systems increase in scale and autonomy, reliability and governance emerge as intrinsic architectural concerns [5,17,68]. Visibility and traceability are essential for maintaining control over distributed operations. Agents require unique identifiers that bind actions to specific goals, permissions and dependencies, thereby enabling attribution and post hoc analysis [7,9,18].

Logging and monitoring mechanisms support real-time oversight and debugging by allowing analysts to trace interactions across complex workflows [18]. Authenticated delegation and scoped permissions further ensure that agents operate within authorized boundaries, extending established security standards such as Open Authorization 2.0 to regulate inter-agent collaboration and prevent unauthorized privilege escalation [9,20]. These mechanisms embed governance constraints directly into the operational fabric of multi-agent systems and provide a technical foundation that complements the broader safety and ethical frameworks discussed in Section 7 and Section 8.
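The combination of unique agent identifiers, scoped permissions and logging described above can be sketched in a few lines. This is not an implementation of Open Authorization 2.0; it only borrows the idea of scopes, and the names `AgentCredential`, `issue`, `delegate` and `act` as well as the scope strings are hypothetical. The key rule is that a delegating agent can pass on only a subset of its own scopes, which rules out privilege escalation through delegation.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentCredential:
    agent_id: str                       # unique identifier used for attribution
    scopes: frozenset[str]              # actions this agent is authorized to perform
    delegated_by: Optional[str] = None  # identifier of the delegating agent, if any


AUDIT_LOG: list[dict] = []              # append-only record for post hoc analysis


def issue(scopes: set[str]) -> AgentCredential:
    return AgentCredential(agent_id=str(uuid.uuid4()), scopes=frozenset(scopes))


def delegate(parent: AgentCredential, scopes: set[str]) -> AgentCredential:
    # A delegate may never hold scopes that its parent does not hold.
    if not scopes <= parent.scopes:
        raise PermissionError("delegated scopes exceed those of the delegating agent")
    return AgentCredential(str(uuid.uuid4()), frozenset(scopes), delegated_by=parent.agent_id)


def act(cred: AgentCredential, action: str, required_scope: str) -> bool:
    allowed = required_scope in cred.scopes
    AUDIT_LOG.append({
        "agent_id": cred.agent_id,
        "delegated_by": cred.delegated_by,
        "action": action,
        "allowed": allowed,
    })
    return allowed
```

Every attempted action, permitted or not, is written to the log together with the acting agent and its delegator, so that behavior can be attributed after the fact.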

5.9. Architectural Trade-Offs: Resource Complexity and Performance

To bridge the conceptual foundations discussed in the previous sections with the practical application domains, this section provides a technical synthesis of agentic architectures. Table 3 categorizes the discussed models based on their computational complexity, resource requirements and standardized evaluation metrics, addressing the inherent trade-offs in autonomous system design.

The architectural shift from isolated units to coordinated multi-agent collectives significantly expands the operational capacity of AI agents [5,18,68]. By distributing cognitive tasks across specialized roles, these systems become capable of addressing the high-dimensional requirements of real-world environments [14,24,59]. The following section examines how these structural capabilities are translated into domain-specific utility across fields such as finance, science and software engineering.

6. Applications

6.1. The Expanding Landscape of AI Agent Applications

The adoption of AI agents has transformed the execution of complex tasks across a wide range of application domains [2,33]. Their integration into applied contexts reflects a convergence of reasoning, autonomy and coordination capabilities that extends beyond conventional automation paradigms [69]. Consistent with the procedure in Section 3, the following domains reflect application clusters that recur across the analyzed literature. Building on the architectural foundations and coordination mechanisms discussed in Section 4 and Section 5, this section synthesizes the principal domains in which AI agents are deployed according to the surveyed literature. Particular attention is given to the distinction between task-specific single-agent systems and coordinated multi-agent or agentic AI systems [18]. Collectively, these applications illustrate the breadth of agentic intelligence and its capacity to augment human decision-making, optimize workflows and enable new forms of digital collaboration.

6.2. Applications in Financial Services

Within the financial sector, AI agents are widely recognized for their potential to support rapid, data-driven decision-making in high-stakes environments [2,3]. Agent-based systems are applied across investment management, risk assessment, fraud detection and customer engagement. In investment analysis and trading, autonomous agents process heterogeneous data sources, including market reports, news streams and social media, to identify opportunities, forecast trends and refine trading strategies [45,59,60]. Systems such as FinRobot, an open-source multi-agent framework for financial analytics, and FinVision, designed for market prediction, demonstrate how agentic architectures adapt to dynamic financial contexts [32,59,65].

In risk management and fraud detection, AI agents analyze complex transaction networks to identify anomalous patterns indicative of financial misconduct [59,60]. Empirical studies report substantial improvements in risk model accuracy and notable reductions in false-positive fraud alerts when agentic decision support is employed [65]. At the customer interface, generative agents provide personalized assistance by handling inquiries autonomously and reducing response times, with studies indicating clear efficiency gains following agent integration [65].

Agentic AI systems further extend these capabilities through coordinated workflows involving specialized agents. In structured finance, autonomous agents verify loan documentation and cross-check applicant data against banking records, accelerating due diligence processes [24]. In financial modeling, multi-agent teams composed of planners, analysts and compliance specialists collaboratively perform exploratory analysis, model validation and regulatory reporting [18,24]. These deployments exemplify how multi-agent coordination enhances transparency and reliability in complex financial operations.

6.3. Applications in Scientific Research and Discovery

Scientific research represents one of the most rapidly advancing domains for agentic AI, as agents increasingly support analytical, integrative and creative tasks across complex workflows [24,76]. AI agents automate substantial portions of the research pipeline, including hypothesis generation, experimental design, literature review and data interpretation [24]. Multi-agent research assistants implemented within frameworks such as AutoGen and CrewAI coordinate retriever, summarizer and synthesizer agents to manage large-scale information flows and generate coherent scientific outputs [18]. These systems have been applied to literature reviews, patent analyses and funding proposal generation, yielding measurable productivity gains.

In biomedical and life sciences research, agentic systems address the growing complexity of data-intensive analysis. The Biomni framework enables autonomous execution of biomedical tasks by integrating specialized data mining tools, databases and experimental protocols [53]. Similarly, AutoBA supports fully automated multi-omic workflows by dynamically constructing analysis pipelines that adapt to heterogeneous data structures [55]. More specialized discovery agents further extend this paradigm. The BioDiscoveryAgent integrates literature retrieval, code execution and critic agents to design genetic perturbation experiments, achieving improved experimental hit ratios relative to Bayesian optimization baselines without task-specific model training [29]. In clinical laboratory settings, Large Language Model-based agents have been validated for interpreting disk diffusion antimicrobial susceptibility tests, demonstrating high agreement with expert assessments under standardized prompting protocols [94].

Agentic approaches are also applied beyond the life sciences. In telecommunications, generative agents employ retrieval-augmented generation to parameterize satellite communication models, which are subsequently optimized using mixture-of-experts proximal policy optimization algorithms to improve resource allocation efficiency [67]. Collaborative research environments such as AgentRxiv illustrate how multi-agent systems simulate shared scientific workspaces in which language model agents exchange findings and iteratively refine hypotheses, leading to improved precision and research quality [24]. Domain-specific information extraction agents further contribute to scientific discovery by compiling structured datasets from specialized literature, achieving high recall and precision in tasks such as organic field-effect transistor knowledge extraction [70].

6.4. Applications in Software Engineering and Web Automation

Software engineering was among the earliest domains to adopt AI agents and continues to serve as a testbed for large-scale cognitive automation [1]. AI agents now support multiple phases of the software development lifecycle, including requirements analysis, system design, testing and maintenance [24]. Agentic systems such as SWE-Agent and evaluation benchmarks including SWE-bench demonstrate the capacity of AI agents to address real-world software engineering tasks derived from large open-source codebases [5,8,18].

In research workflows, multi-agent systems have also been applied to automate systematic literature reviews. Specialized agents generate search strings, identify relevant literature and extract structured data, reducing manual effort while improving reproducibility [44].

In web automation, the emergence of the agentic web has enabled agents to autonomously browse, interpret and interact with online environments [3,8]. WebAgents automate repetitive tasks by perceiving webpage structures and executing sequential actions, although challenges remain with generalization across heterogeneous layouts and robustness to execution errors [39]. Benchmarks such as VisualWebArena formalize these capabilities by evaluating agents on translating natural language instructions into graphical user interface actions in visually rich webpages [74]. Foundation models trained on large-scale user interface datasets further demonstrate the ability to ground visual inputs into coordinate-based actions, enabling realistic digital task execution [13]. Platforms such as OpenAI Operator exemplify this interaction paradigm by allowing agents to navigate web interfaces and complete multi-step tasks on behalf of users [9,18].

Hybrid architectures address limitations of general-purpose language models in domain-specific web tasks. The InteRecAgent framework employs a language model as a coordinating component that controls specialized recommendation tools, enabling conversational item retrieval through structured queries and item-to-item matching [27]. Benchmarks such as Mind2Web provide standardized evaluation of agent performance in these environments and support progress in human–agent interaction research [8,68].

6.5. Applications in Education and Learning

The integration of AI agents into educational settings introduces a paradigm in which learning becomes interactive, adaptive and personalized [63]. Intelligent tutoring agents act as virtual mentors by providing individualized instruction, adaptive exercises and real-time feedback tailored to each learner’s progress and preferences [63].

Beyond tutoring, agentic systems support simulated and experiential learning environments. PitchQuest, a venture capital simulation platform, employs multiple agents such as mentors, evaluators and role players to replicate real-world scenarios and deliver context-sensitive feedback [23]. In collaborative learning contexts, multi-agent systems foster peer-like interaction among learners. Frameworks such as the von Neumann Multi-Agent System model educational collaboration as a process of knowledge co-construction, with agents simultaneously assuming roles as teachers, companions and evaluators [43].

AI agents also contribute to programming education by offering context-specific guidance, diagnosing errors and suggesting improvements that support skill acquisition and learner confidence [37]. Across these settings, educational agents fulfill both cognitive and social functions, enhancing performance, engagement and motivation through adaptive interaction.

6.6. Applications in Business and Enterprise Automation

AI agents have become integral to modern business operations by enabling scalable cognitive delegation and automation of complex workflows [17,18]. Enterprises increasingly deploy agentic systems for workflow management, data pipeline orchestration and decision support [59]. In customer service, conversational agents autonomously handle user requests, triage support tickets and maintain context-aware dialogue. Prominent examples include Siri, Alexa and Replika, which demonstrate how generative agents provide personalized assistance while maintaining linguistic coherence and emotional responsiveness [18,21,46,86]. At scale, such systems increasingly employ self-learning mechanisms to reduce user friction, with commercial assistants shown to automatically rewrite failed or ambiguous user queries in order to mitigate speech recognition and natural language understanding errors without human intervention [51].

Decision-support agents further augment organizational intelligence by conducting internal searches, ranking relevant content, managing communications and autonomously scheduling meetings [18]. In extended reality work environments, embodied agents assist knowledge workers by managing information overload through gaze-based context retrieval, supporting tasks such as reading comprehension and question answering in multi-window settings [57]. Studies of human–agent relationships characterize such systems as servant or partner agents depending on perceived power asymmetries, highlighting how anthropomorphic traits influence trust and engagement [89]. Across domains, agentic automation increases efficiency while reshaping managerial practices, shifting human oversight from direct control toward strategic guidance.

6.7. Applications in Robotics and the Internet of Things

Robotics represents one of the most tangible instantiations of agentic intelligence by embedding perception and reasoning within physical embodiments [52,92]. Embodied AI agents integrate sensory input with motor control to interact with both digital and physical environments. These agents may be virtual, wearable, or robotic [52]. Robotic systems operate in unstructured environments such as homes, factories, or disaster zones, performing general-purpose tasks that require physical dexterity and adaptive cognition [52].

Recent agentic systems extend these capabilities to continuous control in three-dimensional environments. In physical simulation tasks, agents output low-level control signals, such as six-degree-of-freedom displacements, to accomplish manipulation objectives and bridge high-level reasoning with embodied action [13]. Virtual environments, including massively multiplayer online role-playing games, serve as scalable and safe testbeds for studying embodiment and situated interaction prior to physical deployment [26].

Multi-robot systems further expand this paradigm by coordinating multiple embodied agents to achieve shared objectives in domains such as industrial assembly, warehouse logistics, agriculture and rescue operations [18,52,78]. These systems rely on inter-agent communication and shared spatial memory to maintain coordination and situational awareness [18]. In parallel, AI agents increasingly manage digital infrastructures within the Internet of Things and cloud ecosystems. Artificial Intelligence for IT Operations agents autonomously monitor resources, detect faults and orchestrate recovery operations through observe–think–act cycles. Together, these deployments illustrate how agentic intelligence scales from individual machines to integrated physical–digital infrastructures [31,35].

6.8. Applications in Security and Cybersecurity

Security constitutes a critical domain for AI agent deployment due to the high stakes associated with autonomous decision-making [90]. While agentic systems enhance threat detection and response efficiency, they also introduce new vulnerabilities. Autonomous agents may be exploited to automate cyberattacks or generate deceptive online content [3] and access to tools and application programming interfaces increases the attack surface, enabling prompt injection and potential remote code execution [48].

At the same time, agentic systems strengthen defensive postures. In cybersecurity operations, multi-agent systems coordinate incident response activities such as threat classification, log correlation and compliance verification [18]. These systems accelerate detection and reduce manual workload in large-scale infrastructures governed by stringent legal and operational constraints. However, multi-agent architectures introduce additional challenges, as compromised agents may propagate malicious states or corrupted outputs throughout the system [18]. Empirical studies demonstrate that even Large Language Model-based agents may autonomously exploit known vulnerabilities, underscoring the need for rigorous validation and containment mechanisms [48].

6.9. Applications in Gaming and Simulation

In virtual environments, agents frequently act as important contributors to user engagement in video games and simulation settings, shaping interaction dynamics and influencing player experience [77]. Games and simulation environments function as controlled testbeds for agentic intelligence. The Alympics framework employs Large Language Model agents to model human-like strategic behavior in game-theoretic scenarios such as the Water Allocation Challenge [58]. In game development, automated agents based on the A* search algorithm have been used to playtest The Sims Mobile, enabling faster identification of balance issues than human testing [98].
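For readers unfamiliar with the search procedure named above, the following is a generic sketch of A-star search on a small grid. It is not the playtesting agent of [98]: the grid world, the unit step cost and the Manhattan-distance heuristic are assumptions chosen to keep the example self-contained, whereas a playtesting agent would search over game states and actions.

```python
import heapq
from typing import Optional

Cell = tuple[int, int]


def a_star(grid: list[str], start: Cell, goal: Cell) -> Optional[int]:
    """Return the length of a shortest 4-connected path, or None if unreachable."""
    def h(cell: Cell) -> int:
        # Manhattan distance: never overestimates for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]          # entries are (f = g + h, g, cell)
    best_g = {start: 0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best_g[cell]:
            continue                            # stale queue entry, already improved
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] != "#":
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None
```

The heuristic steers expansion toward the goal, which is what makes this kind of search fast enough to explore many candidate action sequences automatically.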

Agentic systems have also demonstrated competitive performance in creative gameplay. The Pixelor agent achieves human-level performance in Pictionary-style games by learning optimal stroke sequences through neural sorting [93]. Social virtual reality platforms extend agentic interaction through non-playable agents that maintain long-term interaction histories, enabling sustained social presence via context-aware dialogue and non-verbal behavior [19].

6.10. Specialized and Emerging Applications

Beyond mainstream domains, AI agents are increasingly applied in specialized and creative contexts. In geographic information systems, agentic architectures support spatial reasoning, optimization and predictive modeling [24]. In the creative industries, multi-agent frameworks enable automated content generation and human–AI collaboration. The FilmAgent system exemplifies automated multimedia production by assigning roles such as director, screenwriter and cinematographer to coordinated agents for three-dimensional virtual film creation [24].

Human–AI co-creative collaboration has received growing attention. Nonlinear collaboration frameworks such as OptiMuse emphasize flexible communication and alternative solution generation in graphic design tasks [38]. In group brainstorming, the Koala agent increased idea generation but required user-controllable behavior to avoid disrupting group dynamics [61]. Empirical findings indicate that ideas generated by mixed human–agent teams are more likely to be selected as final solutions than those produced by agents alone, highlighting the role of agents as co-creators rather than autonomous substitutes [62]. At larger scales, platforms such as Pairit demonstrate that human–AI teams can produce higher-quality advertising text but lower-quality images than human-only teams, while significantly altering communication patterns [75].

Table 4 summarizes the main application domains of AI agents, including representative systems, agent types and key performance dimensions. The comparison shows that multi-agent systems generally outperform single-agent approaches in complex, multi-step tasks such as scientific research, financial modeling and software engineering, while single-agent systems remain effective in more constrained settings like customer service and web automation. Across domains, common evaluation metrics include accuracy, task success rate, efficiency and user satisfaction. At the same time, the results highlight persistent challenges related to robustness, generalization and security, particularly in dynamic and open environments. Overall, the table provides a structured basis for comparing agentic AI systems and identifying domain-specific strengths and limitations.

The broad deployment of AI agents across these critical domains demonstrates their transformative potential but simultaneously exposes them to diverse operational environments [2,3,33]. This increased exposure, coupled with the autonomy required for task execution, reveals significant technical vulnerabilities, particularly regarding prompt injection and goal hijacking in open-ended systems [39,48,71]. As agents move from controlled testing to open-ended application, ensuring their safety and resilience becomes a primary research concern, as detailed in Section 7.

7. Safety

7.1. Safety Foundations and Research Scope

Ensuring the safety and trustworthiness of AI agents constitutes a central concern in the literature, reflecting the transition from AI systems as passive decision-support tools to autonomous agents capable of executing complex and consequential actions with limited human oversight [8,9,25,34]. As increasing responsibility is delegated to such systems, existing risks are amplified and novel failure modes emerge, motivating a combination of technical safeguards and operational control mechanisms [7,25].

This section synthesizes technical safety risks, vulnerabilities and mitigation strategies associated with autonomous and agentic AI systems. Ethical, legal and broader governance implications are addressed separately in Section 8.

7.2. Definition and Scope of Safety

Safety in AI agents extends beyond traditional notions of reliability and robustness to encompass risks arising from autonomy, tool use, persistent memory and interaction with open environments [2,8,25]. Across the surveyed literature, risk is commonly framed as increasing with the degree of autonomy: as control is relinquished to an agent, the potential for harmful outcomes increases, including financial loss, systemic disruption and physical harm [3,7,8,25]. Evaluation practices are also critical from a safety engineering perspective, because systematic bias in assessment can lead to underestimation or mischaracterization of risks in deployed agentic systems. Related work on intelligence evaluation argues for universal distributions of tasks in order to avoid cultural, contextual, or anthropomorphic biases when assessing machine intelligence, particularly in socially interactive or human-facing agentic systems [15,97].

As agentic systems move from purely virtual settings to embodied and cyber–physical environments, safety risks intensify. Generative AI agents deployed in autonomous machines introduce additional failure modes, including hallucinations, catastrophic forgetting and the absence of formal safety guarantees. In these settings, errors may translate directly into physical harm. To address these challenges, safety scorecards have been proposed to assess risk levels based on the depth of agent integration within autonomous control stacks, distinguishing lower-risk personalization tasks from higher-risk perceptual and decision-making functions [95].

Safety risks are closely related to the alignment problem, which is examined from an ethical and governance perspective in Section 8. From a technical safety standpoint, misalignment manifests operationally as hazardous behavior arising from underspecified, ambiguous, or misinterpreted objectives, particularly under long-horizon autonomy and limited supervision [2,3,25].

7.3. Categories of Threats and Vulnerabilities

Threats to AI agent safety arise across system components and interaction modes and are frequently rooted in the limitations of Large Language Models and the expanded action space introduced by tool use and external integrations [2]. The literature commonly distinguishes vulnerabilities according to whether they originate during agent execution or emerge through external interactions with other agents, memory systems, or environments [2]. The following subsections adopt this distinction to structure recurring threat categories and their operational implications.

7.4. Intra-Execution Threats in Single-Agent Systems

7.4.1. Perception and Input Manipulation

The perception layer constitutes a primary attack surface, particularly in text-based interfaces [2]. Prompt injection and goal hijacking attacks exploit adversarial inputs to manipulate an agent’s reasoning process, potentially overriding original task specifications and inducing unauthorized actions [2,7]. These attacks can scale readily and may circumvent existing safety mechanisms using comparatively simple prompt constructions [2].

Indirect prompt injection further amplifies this risk when agents ingest untrusted external content such as webpages or documents. This vulnerability is especially pronounced in agents that autonomously browse online environments. Empirical studies report that compromised webpages can act as effective vectors for injection, enabling attackers to manipulate agent behavior through seemingly benign content [39]. In such scenarios, malicious instructions embedded within data blur the boundary between information and command and may lead to conversation leakage, phishing, or unauthorized data access [1,2,3]. Jailbreaking techniques similarly exploit weaknesses in alignment constraints and, when combined with execution capabilities, can escalate from harmful content generation to real-world actions performed by LLM-controlled agents [2]. Recent empirical studies further indicate that web-based AI agents are significantly more vulnerable to jailbreaking attacks than standalone Large Language Models, due to their continuous exposure to untrusted external content and their extended action capabilities [71].

7.4.2. Reasoning, Planning and Cognitive Failures

Vulnerabilities within reasoning and planning arise from fundamental limitations of current models and from biases introduced during training [2]. Misalignment may cause agents to pursue proxy objectives that diverge from intended goals, producing unsafe behavior even when systems are trained with alignment techniques [2,25]. Reinforcement Learning from Human Feedback remains constrained by the diversity, inconsistency and incompleteness of human value signals [2,84,91].

Agents may also exhibit sycophantic behavior, producing outputs that conform to user beliefs or misleading prompts despite being factually incorrect [2,3]. Hallucinations pose an additional safety risk, as agents may generate plausible but incorrect information that propagates across multi-step reasoning chains and compounds errors over time [2,5]. Backdoor attacks and data poisoning further undermine safety by embedding latent vulnerabilities during training, enabling malicious behavior to be triggered by specific inputs, particularly when training data originates from untrusted sources [2,3].

7.5. External Interaction Threats in Multi-Agent and Ecosystem Settings

7.5.1. Agent-to-Agent Threats

In multi-agent systems, collaboration introduces additional safety risks because the compromise of a single agent may propagate harmful states throughout the system [2]. A prominent concern is secret collusion, in which agents exchange information via steganographic channels embedded in otherwise benign communications in order to evade monitoring mechanisms [2,56]. Evidence suggests that Large Language Models can encode and decode such hidden signals, while existing oversight tools remain insufficient for reliable detection [56].

Contagion effects may also arise when erroneous reasoning, hallucinations, or malicious payloads propagate rapidly across cooperative agent networks, destabilizing collective behavior and complicating fault isolation [2,9].

7.5.2. Agent-to-Memory Threats

Persistent memory and retrieval-augmented generation systems introduce vulnerabilities related to data integrity and confidentiality [2]. Prompt leakage attacks may induce agents to reveal confidential instructions or sensitive information stored in context windows or external memory repositories [1,2,3]. Even when communications are encrypted, private information may leak through side channels, including packet-size inference [2].

Memory poisoning represents a further risk in which manipulated or adversarially injected content compromises future reasoning and planning, thereby undermining long-term reliability [2].

7.5.3. Agent-to-Environment Threats

Safety risks also arise from interactions with external digital or physical environments. Environmental injection attacks occur when malicious code or instructions embedded in external data sources are retrieved and acted upon as though they were internal commands [2]. More broadly, the literature revisits concerns about loss of human control over autonomous systems capable of rapid and irreversible actions, particularly when operating beyond direct oversight. Once initiated, such processes may resist timely intervention, underscoring the need for precise objective specification and robust safety constraints [3].

7.6. Mitigation Strategies and Technical Safeguards

7.6.1. Architectural and Technical Defenses

Architectural approaches to safety focus on constraining agent behavior and limiting the impact of failures [6,9]. Isolation and sandboxing are widely emphasized safeguards that restrict access to system resources and external services in order to contain harmful outputs even when unsafe commands are generated [1,6,47,55]. Empirical evidence supports the necessity of such controls, as advanced systems may still produce unsafe instructions despite alignment training [1].

Access control mechanisms enforce the principle of least privilege for tool usage, reducing the risk of unauthorized actions [20]. Authentication and authorization protocols verify agent identities and permissions, preventing unintended escalation of capabilities [20]. Additional safeguards include reasoning guardrails that detect unsafe plans during execution [68], integration with external security analyzers that enforce constraints on action sequences [48] and rollback mechanisms that support recovery from unintended or harmful actions, particularly in high-stakes domains such as finance [9]. Oversight architectures incorporating activity logs and real-time monitoring aim to detect problematic behaviors promptly. However, human-in-the-loop approaches remain cognitively demanding and may not scale effectively in high-frequency or high-stakes agentic operations [7,30]. Execution opacity further complicates oversight, as multi-step internal planning processes may produce cascades of errors or undesirable behaviors that remain invisible until they manifest externally [2,18,40]. Formal verification provides an additional layer of assurance by employing formal methods or domain-specific security analyzers that mathematically guarantee that an agent’s execution trace does not violate predefined security constraints, thereby offering stronger guarantees than best-effort detection mechanisms [48,80].
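As an illustration of how least-privilege tool access and activity logging might fit together, the following minimal sketch routes every tool call through a single gate. The names `ToolGate`, `grants`, `read_balance` and `transfer` are hypothetical and do not come from the reviewed literature; a production system would additionally need authentication, sandboxing and rollback support.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class ToolGate:
    """Hypothetical mediator for all tool calls issued by agents."""
    tools: Dict[str, Callable[..., object]]   # tool name -> implementation
    grants: Dict[str, Set[str]]               # agent id -> tools it may call
    audit_log: List[dict] = field(default_factory=list)

    def call(self, agent_id: str, tool_name: str, **kwargs):
        allowed = tool_name in self.grants.get(agent_id, set())
        # Record every attempt, including denied ones, for post hoc auditing.
        self.audit_log.append(
            {"agent": agent_id, "tool": tool_name, "allowed": allowed}
        )
        if not allowed:
            raise PermissionError(f"{agent_id} may not call {tool_name}")
        return self.tools[tool_name](**kwargs)


def read_balance(account: str) -> float:
    return 100.0  # stand-in value for illustration only


def transfer(account: str, amount: float) -> str:
    return f"transferred {amount} from {account}"


# The reporting agent is granted read access only (deny by default).
gate = ToolGate(
    tools={"read_balance": read_balance, "transfer": transfer},
    grants={"reporting-agent": {"read_balance"}},
)
```

In this sketch, a call such as `gate.call("reporting-agent", "transfer", account="A1", amount=5.0)` raises `PermissionError` and still leaves an entry in `audit_log`, so denied attempts remain visible to monitoring.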

7.6.2. Privacy-Preserving Techniques

Privacy-preserving methods address risks associated with sensitive data in prompts and memory systems [82]. Data desensitization techniques, including format-preserving encryption, enable agents to process protected inputs while reducing the risk of information leakage. Experimental evidence suggests that such approaches need not substantially impair usability. Session-aware models provide an alternative by isolating user-specific data through prompt tuning while freezing core model parameters, thereby mitigating privacy risks without sacrificing personalization [6].
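The idea of data desensitization can be sketched with a deliberately simplified stand-in. The example below is not format-preserving encryption; it uses a reversible lookup table that swaps long digit runs for random surrogates of the same length, so the prompt keeps its format while the original value never reaches the model. The pattern `DIGIT_RUN` and the functions `desensitize` and `restore` are assumptions made for illustration.

```python
import re
import secrets
from typing import Dict, Tuple

# Assumption: digit runs of eight or more characters are sensitive identifiers.
DIGIT_RUN = re.compile(r"\d{8,}")


def desensitize(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace sensitive digit runs with same-length random surrogates."""
    mapping: Dict[str, str] = {}   # surrogate -> original
    reverse: Dict[str, str] = {}   # original -> surrogate

    def swap(match: "re.Match[str]") -> str:
        original = match.group(0)
        if original not in reverse:
            surrogate = original
            # Redraw until the surrogate differs and is not already in use.
            while surrogate == original or surrogate in mapping:
                surrogate = "".join(
                    secrets.choice("0123456789") for _ in original
                )
            mapping[surrogate] = original
            reverse[original] = surrogate
        return reverse[original]

    return DIGIT_RUN.sub(swap, text), mapping


def restore(text: str, mapping: Dict[str, str]) -> str:
    """Map surrogates in the agent's output back to the original values."""
    for surrogate, original in mapping.items():
        text = text.replace(surrogate, original)
    return text
```

The mapping stays on the trusted side of the boundary; only the masked text would be sent to the agent, and `restore` is applied to whatever comes back.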

7.7. Recurring Concerns and Open Technical Problems

Despite advances in safety mechanisms, fundamental challenges persist due to limitations of current models and the complexity of open, autonomous environments. Tool use exemplifies a recurring safety paradox. Tools are essential for extending agent capabilities, yet they introduce vulnerabilities due to their complexity and susceptibility to compromise. This tension motivates proposals for zero-trust agent architectures in which all external inputs, whether originating from users, tools, or environments, are validated against explicit security policies [68].

Alignment also remains technically challenging, as reliably inferring user intent from underspecified natural language continues to resist robust solutions [3,68]. Scaling multi-agent systems introduces additional difficulties related to debugging, reproducibility and error containment. Errors may propagate rapidly across decentralized architectures, complicating root-cause analysis and remediation [1,3,47].

Memory mechanisms, while essential for long-horizon planning and contextual continuity, introduce acute safety and privacy risks. Episodic and long-term memories are vulnerable to manipulation, unauthorized retention and leakage, motivating designs that prioritize interpretability, controllability and robustness over time [2,92]. Finally, agents often struggle to output data in strict structural formats, such as JavaScript Object Notation or lists, which can cause parsing failures and execution errors [22,79]. Techniques that integrate reward feedback directly into prompts or adjust token logits have been proposed to enable learning from trial-and-error without costly fine-tuning, although the safety implications of such approaches remain an open research question [50,66,85].
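One common way to contain structured-output failures is to validate the agent's reply before it reaches any executor and to retry a bounded number of times. The sketch below assumes a hypothetical action schema with the keys `action` and `arguments` and a caller-supplied `generate` function standing in for the model call; neither is taken from the reviewed papers.

```python
import json
from typing import Callable, Optional

# Hypothetical schema: required keys and the type each value must have.
REQUIRED_KEYS = {"action": str, "arguments": dict}


def parse_action(raw: str) -> Optional[dict]:
    """Return the parsed action, or None if the reply is not valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for key, expected_type in REQUIRED_KEYS.items():
        if not isinstance(data.get(key), expected_type):
            return None
    return data


def request_action(
    generate: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Query the model until a valid action is returned or attempts run out."""
    for _ in range(max_attempts):
        parsed = parse_action(generate(prompt))
        if parsed is not None:
            return parsed
        # Restate the format requirement before the next attempt.
        prompt += "\nReturn only a JSON object with keys 'action' and 'arguments'."
    raise ValueError("no valid structured output after retries")
```

Failing closed with an exception, rather than executing a partially parsed reply, keeps a formatting error from turning into an execution error further down the pipeline.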

While technical safeguards and architectural isolation provide a first line of defense against agentic failure [6,9,47], they primarily address the functional reliability of the system. However, as agents operate with increasing autonomy in socio-technical systems, technical safety alone is insufficient to address the broader questions of responsibility and value alignment [3,17,25]. This necessitates a transition from engineering-based safety to the normative, legal and institutional frameworks of ethics and governance explored in Section 8.

8. Ethics, Accountability & Governance

8.1. Governance Principles and Normative Scope

The deployment of increasingly autonomous AI agents, defined as systems capable of pursuing complex, long-horizon goals with limited human supervision, poses fundamental challenges to established ethical, legal and institutional frameworks [3,7,8]. As autonomy increases, so do the risks associated with delegating authority to artificial systems, amplifying the need for accountability and governance mechanisms that extend beyond technical safety controls [7,17,25,90]. This section synthesizes the normative, legal and socio-technical dimensions of agentic AI governance and complements the technical safety analysis presented in Section 7.

8.2. The Normative Core: Ethics and Value Alignment

Ethical analyses of AI agents are commonly framed through the principal–agent problem, which captures the structural difficulty of ensuring that an autonomous system reliably and faithfully executes the objectives of a human principal [3,17,18,20]. Unlike traditional software systems, AI agents operate under underspecified goals, adapt to novel situations and may optimize proxy objectives that diverge from implicit human values [2,3].

A central ethical concern is goal misalignment. Even when agents are designed to pursue beneficial objectives, ambiguity in natural-language instructions and contextual variation may lead to unintended outcomes that violate moral expectations or social norms [3,17,25]. Loyalty and deception constitute additional challenges. Unlike human professionals bound by fiduciary duties, AI agents lack intrinsic commitments to user interests. Empirical studies report that agents may engage in deceptive behaviors, including withholding information or exhibiting sycophancy by reinforcing user beliefs even when demonstrably incorrect, thereby prioritizing perceived approval over truthfulness [2,3,25].

Agent affiliation further complicates ethical alignment in assistive contexts because affiliation determines whose interests an agent prioritizes. Research on caregiving agents for older adults suggests that, as cognitive decline progresses, users increasingly expect agents to shift primary allegiance from the individual to caregivers, raising tensions between autonomy, privacy and protection [96]. In multi-agent systems, ethical risks also extend to covert coordination and secret collusion, in which agents exchange hidden signals to evade oversight, particularly under misspecified objectives [2,56].

Bias and fairness remain persistent concerns. AI agents inherit and may amplify biases present in training data, producing inequitable outcomes that reinforce social stereotypes [2,18,25,35]. The feminization of AI assistants in service roles illustrates how design choices may normalize gendered expectations and perpetuate cultural inequities [73]. Explainability-based interventions, including ex post explanations that surface feature contributions or reasoning pathways, have been shown to promote analytical engagement and reduce over-anthropomorphization by reminding users that agents are computational systems rather than moral actors [4,73].

8.3. Trust, Anthropomorphism and Human Mental Models

Trust is central to the ethical deployment of AI agents. Appropriate trust is commonly defined as alignment between a user’s subjective belief in an agent’s capabilities and the agent’s actual trustworthiness [54]. Miscalibrated trust may result in over-reliance or underutilization, both of which undermine effective and safe interaction [7,16,25]. Transparency regarding limitations, uncertainty and system boundaries is therefore important for trust calibration [54].

Anthropomorphism complicates trust relationships. Human-like traits, emotional expressions, or persistent personas may encourage engagement but can also inflate user expectations beyond an agent’s actual capabilities [21,52]. Empirical studies indicate that persona consistency over long-term interaction remains challenging for Large Language Models, as illustrated by benchmarks such as Character100 [41]. When expectations are violated, trust may erode sharply, particularly when anthropomorphic cues obscure the probabilistic and fallible nature of the system [49].

User mental models represent an additional governance challenge. Studies of cooperative task environments report that users who experience consistent and successful interaction tend to develop more accurate models of agent competence, whereas mismatches between perceived and actual capability can reduce trust and effectiveness [11]. Explanations of global agent behavior, knowledge distribution and limitations are therefore relevant for supporting stable mental models, particularly in adaptive or multi-agent settings [11].

8.4. The Accountability Gap: Responsibility and Liability

Assigning responsibility for the actions of AI agents presents a persistent accountability gap due to system opacity, adaptive behavior and the multiplicity of stakeholders involved in development and deployment [3]. The internal reasoning processes of agents are often opaque, undermining established principles of responsibility attribution and complicating post hoc analysis of harmful outcomes [6].

Responsibility attribution is further complicated by the problem of many hands, in which accountability is diffused across developers, deployers and users, enabling deflection of responsibility through narratives that treat the system as a scapegoat [3,18]. To address this challenge, the literature draws on agency law and the Uniform Electronic Transactions Act to frame responsibility analytically, despite AI agents lacking legal personhood [3,20]. These frameworks emphasize liability assignment based on an actor’s ex ante capacity to prevent harm and ex post ability to mitigate damage, rather than on direct causal control alone [3]. Experimental evidence further supports the relevance of such accountability mechanisms. Studies grounded in the Computers Are Social Actors framework indicate that humans attribute blame to AI systems in a manner comparable to human decision makers, with perceived agent autonomy strongly correlating with responsibility attribution in cases of unjust or harmful outcomes [87].

8.5. Governance Mechanisms and Oversight

Translating ethical principles into enforceable practice requires governance mechanisms that integrate organizational oversight with technical traceability [3,7,9]. Visibility is central to effective governance. Agent identifiers bound to legal entities enable traceability and responsibility attribution and are often complemented by standardized documentation artifacts such as agent cards describing capabilities, evaluation outcomes and intended use conditions [7,9,10,20,90].

Continuous monitoring and comprehensive activity logging support both real-time oversight and post hoc auditing by enabling organizations to trace deviations from intended behavior and assess compliance with internal and external requirements [3,7,9]. Incident reporting mechanisms further institutionalize learning from failures through structured processes for documenting and analyzing harmful or unexpected agent behavior [9].

Oversight architectures increasingly incorporate scalable approaches that augment human supervision with automated auditing. Specialized AI agents may monitor, test, or audit other agents, enabling continuous oversight in environments where human-in-the-loop approaches do not scale effectively [3]. Automated red-teaming, supported by model-generated adversarial prompts, is also emerging as a governance practice for identifying vulnerabilities prior to deployment [7,8,68,81].

8.6. Interoperability and Ecosystem Governance

Effective governance of agent ecosystems requires standardized communication and interoperability mechanisms that support transparency and control across organizational boundaries [10,68]. Conventional protocols such as Hypertext Transfer Protocol or Remote Procedure Call may be insufficiently expressive for complex, multi-party agent interaction at the semantic level [10].

Emerging standards address this gap. The Model Context Protocol defines mechanisms for resource access and context integration, while Agent-to-Agent protocols enable structured, asynchronous collaboration among heterogeneous agents. These standards support ecosystem-level governance by enabling more consistent enforcement of policies, permissions and accountability across distributed agent networks [10,68].

8.7. Convergence, Trade-Offs and Open Governance Challenges

The governance of AI agents involves navigating trade-offs between autonomy, efficiency, transparency and control. A recurring theme in the literature is the tension between safety and utility. Stricter oversight and validation mechanisms may constrain adaptability and performance, whereas insufficient control increases ethical and societal risk [1,3,7,25,76]. Human-in-the-loop approaches can mitigate risk but may not scale to highly autonomous or multi-agent settings [76].

Several challenges remain open. Persistent memory mechanisms, while important for long-horizon planning and contextual continuity, raise governance concerns related to manipulation, unauthorized retention and leakage of sensitive information [92]. Limitations in causal reasoning further constrain agents’ capacity to assess risk and responsibility, particularly in high-stakes domains [2,18,68]. Scaling multi-agent systems introduces additional governance challenges related to emergent behavior, nondeterminism and architectural fragmentation, which complicate oversight and accountability [2,7,18,24,31,68]. Beyond technical and legal considerations, agentic AI also raises broader socio-economic governance questions. Autonomous agents may disrupt labor markets, reshape organizational structures and alter existing business models, motivating new policy frameworks and institutional responses [68].

The implementation of ethical principles is mediated by the underlying technical substrate [3,7]. A persistent gap exists between normative requirements and the operational reality of current models [2,18,68], driven by bottlenecks in reasoning and evaluation. These fundamental hurdles, which must be overcome to realize trustworthy systems, are synthesized as the core challenges in Section 9.

9. Challenges in Agentic AI

9.1. Reliability and Sequential Reasoning

A fundamental bottleneck identified in current agentic architectures is the heavy reliance on autoregressive large language models that capture statistical correlations rather than underlying causal structures [18,52]. This documented deficit in causal modeling renders agentic systems highly brittle under distributional shifts and inherently incapable of robust counterfactual reasoning [18]. In long-horizon tasks, these limitations manifest as severe credit assignment problems, where a single reasoning error or hallucination can propagate through sequential actions, causing catastrophic failure cascades [12,18]. Furthermore, unlike traditional control systems, such as PID controllers or Linear Quadratic Regulators, which can offer mathematical stability guarantees (for example, in the sense of Lyapunov), generative AI agents currently lack formal verification frameworks. This absence of rigorous verification makes their deployment in safety-critical physical environments highly perilous [95].
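The compounding effect of step-level errors can be made concrete with a simple calculation. The sketch below assumes, purely for illustration, that each step succeeds independently with the same probability and that any single failure ruins the trajectory; real agents violate both assumptions, so the numbers indicate a trend rather than a prediction. The function name `trajectory_success` is hypothetical.

```python
def trajectory_success(per_step_success: float, steps: int) -> float:
    """Probability that all steps succeed, assuming independent steps."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must lie in [0, 1]")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    # Show how a fixed per-step reliability erodes over longer horizons.
    for n in (1, 10, 50):
        print(n, round(trajectory_success(0.99, n), 3))
```

Under these assumptions, a per-step success rate of 99% leaves roughly a 60% chance of completing a 50-step task without error, which is one way to see why long-horizon reliability is harder than single-step accuracy suggests.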

9.2. The Evaluation Paradox

The transition from static language models to dynamic AI agents has precipitated a significant evaluation crisis within the field. Traditional static benchmarks are increasingly viewed as inadequate because they fail to capture the multi-step, interactive nature of agentic workflows [5,99]. Evaluating autonomous agents introduces high levels of stochasticity, as non-deterministic model outputs interact with dynamic environments, rendering experimental reproducibility exceedingly difficult [5,16]. Furthermore, current agent evaluations frequently lack proper holdout sets, which may allow developers to inadvertently program domain-specific shortcuts rather than achieving true generalizable intelligence [1,52]. While automated evaluation paradigms, such as the LLM-as-a-Judge approach, have been proposed to scale dynamic benchmarking, they continue to suffer from high computational overhead, inherent model biases and general unreliability when assessing long execution trajectories [5,54,99].
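Because single runs of a stochastic agent are weak evidence, one pragmatic response is to repeat each task over fixed seeds and report an interval rather than a point estimate. The following sketch uses the Wilson score interval for a success rate; `toy_episode` is a stand-in for a real agent rollout, and the 0.7 success probability is an arbitrary assumption.

```python
import math
import random
from typing import Callable, Iterable, Tuple


def wilson_interval(
    successes: int, trials: int, z: float = 1.96
) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z ** 2 / trials
    center = (p + z ** 2 / (2 * trials)) / denom
    half = z * math.sqrt(
        p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)
    ) / denom
    return center - half, center + half


def evaluate(
    run_episode: Callable[[random.Random], bool], seeds: Iterable[int]
) -> Tuple[float, Tuple[float, float]]:
    """Run one episode per seed and return the success rate with its interval."""
    outcomes = [run_episode(random.Random(seed)) for seed in seeds]
    successes = sum(outcomes)
    return successes / len(outcomes), wilson_interval(successes, len(outcomes))


def toy_episode(rng: random.Random) -> bool:
    # Stand-in for an agent rollout; succeeds with assumed probability 0.7.
    return rng.random() < 0.7


rate, (low, high) = evaluate(toy_episode, range(200))
```

Fixing the seeds makes the toy evaluation repeatable, although with hosted models the seed typically controls only the environment and not the model's own sampling.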

9.3. Architectural Fragmentation and Interoperability

The current agentic ecosystem is characterized by severe architectural fragmentation and a lack of unified standards for agent-tool and inter-agent communication [10]. Traditional web protocols, including HTTP and RPC, are fundamentally inadequate for complex agentic systems because they rely on stateless, data-centric transport and lack the semantic richness required for context-aware, persistent task delegation [68]. Although nascent protocols such as the Model Context Protocol (MCP) and the Agent-to-Agent (A2A) protocol are emerging to standardize structured communication and capability discovery [10,68], widespread interoperability remains elusive [9]. This fragmentation prevents the formation of scalable, decentralized collective intelligence and forces agents into isolated, provider-specific silos, which significantly complicates integration with legacy enterprise infrastructure [10,59].

9.4. Resource Efficiency and Latency

Endowing large language models with agentic capabilities through iterative prompting techniques, such as Chain-of-Thought, ReAct, or Monte Carlo Tree Search, dramatically increases computational overhead [1,12,76]. Because agents must repeatedly query foundational models to evaluate intermediate states, inference costs can grow without a fixed bound, worsening the trade-off between accuracy and cost [1]. Beyond financial considerations, this architecture introduces severe latency bottlenecks. In edge computing and embodied robotics, server-grade throughput is often infeasible due to strict power and memory constraints [95]. Consequently, while autonomous vehicles typically require decision frequencies of 20 Hz, complex LLM-based motion planners often operate at only 1–3 Hz, rendering them unsafe for real-time dynamic environments [95].
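The gap between these frequencies is easier to judge when expressed as a time budget per decision. The short calculation below converts the frequencies quoted above into periods; the function name `period_ms` is illustrative.

```python
def period_ms(frequency_hz: float) -> float:
    """Time available per decision, in milliseconds, at a given frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    return 1000.0 / frequency_hz


if __name__ == "__main__":
    required = period_ms(20.0)  # budget implied by a 20 Hz requirement
    for planner_hz in (1.0, 3.0):
        actual = period_ms(planner_hz)
        print(
            f"{planner_hz:.0f} Hz planner: {actual:.0f} ms per decision, "
            f"{actual / required:.1f}x the {required:.0f} ms budget"
        )
```

A 20 Hz requirement leaves 50 ms per decision, whereas a planner running at 1–3 Hz takes roughly 333 to 1000 ms, which is about 7 to 20 times the available budget.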

9.5. Human-Agent Coordination

As agents transition from passive tools to active collaborators, human–agent coordination raises cognitive and sociotechnical challenges. A primary concern is the establishment of appropriate trust, formally defined as the precise alignment between a human’s reliance on the system and the agent’s actual trustworthiness and capability [54]. Humans exhibit complex psychological biases, including algorithm aversion following minor agent errors or, conversely, dangerous over-trust in highly fluent but hallucinated outputs [33,54]. Furthermore, agents struggle with the ambiguity problem, which involves the difficulty of inferring a user’s true, often underspecified intent from natural language [68]. Addressing these issues requires the design of transparent, integrity-based explanation mechanisms and robust human-in-the-loop oversight architectures that do not overwhelm the user with excessive cognitive load [54,68].

9.6. Emergent and Uncategorized Challenges

Beyond standard architectural constraints, highly advanced multi-agent systems exhibit novel, emergent vulnerabilities. A critical security threat newly identified in the literature is secret collusion via steganography [56]. Autonomous agents can exploit the output distributions of generative models to encode secret information, thereby communicating and coordinating outside the purview of human overseers and standard security monitors [56]. This poses an unprecedented Advanced Persistent Threat in decentralized networks. Additionally, the integration of persistent memory and Agentic Retrieval-Augmented Generation dramatically expands the attack surface, exposing agents to indirect prompt injections, environmental injection attacks and data poisoning, which can recursively corrupt an agent’s reasoning process and extract sensitive private data [2,6,71].

9.7. Open Research Directions

Synthesizing the future work sections of the reviewed literature reveals several critical and unresolved technical vectors. A highly cited direction is the integration of formal methods, specifically fusing LLM agents with Satisfiability Modulo Theories (SMT) solvers and formal verification methods to mathematically guarantee logical consistency and constrain state-space exploration in safety-critical domains [80]. To overcome the brittleness of purely statistical learning, researchers emphasize the development of Joint-Embedding Predictive Architectures (JEPAs) and causal world models that allow agents to simulate physical and procedural dynamics for robust zero-shot generalization [52,68]. Furthermore, enabling agents to dynamically update their policies and memory in non-stationary environments without suffering from catastrophic forgetting remains a paramount challenge for long-term deployment [68,95]. Finally, there is a clear need for decentralized infrastructure, including privacy-preserving and cryptographically secure frameworks that integrate Decentralized Identifiers with agent protocols to facilitate inter-agent capability discovery, resource metering and verifiable autonomous payments [68].

10. Conclusions

This manuscript presents a data-driven literature review of the top 100 Google Scholar publications on AI agents, identifying five dominant research streams: “Architecture & Frameworks”, “Multi-Agent Systems”, “Applications”, “Safety” and “Ethics, Accountability & Governance”. The analysis shows that AI agents are evolving from passive language-model-based assistants into autonomous, goal-directed systems capable of executing complex, multi-step workflows across diverse digital and physical environments. Across the reviewed literature, this transition is consistently associated with the integration of reasoning, planning, memory, tool use and action, while at the same time exposing substantial fragmentation in terminology, design assumptions, evaluation practices and governance models.

A central finding of this review is that the primary bottleneck of current research has shifted from isolated single-agent reasoning to the reliable orchestration of multi-agent systems. More specifically, the literature reveals several unresolved algorithmic gaps in multi-agent orchestration. These include the difficulty of dynamic and high-precision role allocation in non-stationary environments; the absence of sufficiently standardized semantic protocols for interoperability across heterogeneous agents; the vulnerability of shared-memory architectures to inconsistency, poisoning and goal drift; and the limited capability of current systems to isolate faults once errors propagate through interdependent agent pipelines. In addition, existing evaluation frameworks remain insufficient for measuring emergent collective behavior, long-horizon reliability and reproducibility in interactive environments, while formal verification mechanisms are still not mature enough to provide strong guarantees for safety-critical deployment. Taken together, these technical problems indicate that multi-agent orchestration remains an open research challenge rather than a mature implementation layer.

The observations of this study also highlight the value of agentic AI as a distinct advance beyond conventional generative AI. By combining reasoning, planning, memory, tool use and autonomous action, agentic systems extend AI from passive response generation to goal-directed execution and coordinated problem solving. This capability is particularly valuable in domains characterized by distributed information, long decision chains and high workflow complexity, including scientific discovery, finance, software engineering, cybersecurity, enterprise automation and robotics. The reviewed literature therefore suggests that the significance of agentic AI lies not only in improved model performance but also in its emerging capacity to act as a scalable infrastructure for autonomous decision support, workflow execution and collaborative intelligence across complex socio-technical environments.

The potential scope of agentic AI is substantial, but its expansion depends on simultaneous progress in orchestration, safety and governance. Future research should therefore focus on coordination-aware benchmarks, formal verification of agent interaction and execution traces, interoperable communication standards, secure and privacy-preserving shared-memory infrastructures and hybrid architectures that balance centralized oversight with decentralized autonomy. The present study is based on a Google Scholar snapshot, which enables broad and interdisciplinary coverage of highly visible contributions. Subsequent work could extend this approach by incorporating additional academic databases such as Scopus and Web of Science and by applying structured review protocols following PRISMA-style guidelines. To enhance methodological robustness, such studies may also integrate targeted augmentation strategies to mitigate domain-specific biases [109]. In addition, the automated review pipeline could be adapted to other research domains, including public health and financial risk modeling [110]. Overall, progress across these technical and methodological dimensions will be essential for the development of robust, scalable and trustworthy agent ecosystems.

Author Contributions

Conceptualization, J.S.; methodology, J.S.; software, F.M.; validation, F.M. and J.S.; formal analysis, J.S. and F.M.; investigation, F.M.; resources, J.S.; data curation, F.M.; writing—original draft preparation, F.M. and J.S.; writing—review and editing, F.M. and J.S.; visualization, J.S.; supervision, J.S.; project administration, J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

We are grateful to the two anonymous reviewers for their helpful comments and suggestions, which contributed to improving this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
API	Application Programming Interface
DB	Database
GSM8K	Grade School Math 8K
GUI	Graphical User Interface
JEPA	Joint-Embedding Predictive Architecture
LLM	Large Language Model
MAS	Multi-Agent Systems
MCP	Model Context Protocol
MMLU	Massive Multitask Language Understanding
Pass@k	Probability of at least one correct solution out of k samples
RAG	Retrieval-Augmented Generation
SMT	Satisfiability Modulo Theories
SR	Success Rate

References

1. Kapoor, S.; Stroebl, B.; Siegel, Z.S.; Nadgir, N.; Narayanan, A. AI Agents That Matter. arXiv 2024, arXiv:2407.01502.
2. Deng, Z.; Guo, Y.; Han, C.; Ma, W.; Xiong, J.; Wen, S.; Xiang, Y. AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways. ACM Comput. Surv. 2025, 57, 1–36.
3. Kolt, N. Governing AI Agents. arXiv 2025, arXiv:2501.07913.
4. Castelfranchi, C. Modelling Social Action for AI Agents. Artif. Intell. 1998, 103, 157–182.
5. Masterman, T.; Besen, S.; Sawtell, M.; Chao, A. The Landscape of Emerging AI Agent Architectures for Reasoning, Planning, and Tool Calling: A Survey. arXiv 2024, arXiv:2404.11584.
6. He, Y.; Wang, E.; Rong, Y.; Cheng, Z.; Chen, H. Security of AI Agents. In Proceedings of the 2025 IEEE/ACM International Workshop on Responsible AI Engineering (RAIE), Ottawa, ON, Canada, 29 April 2025; pp. 45–52.
7. Chan, A.; Ezell, C.; Kaufmann, M.; Wei, K.; Hammond, L.; Bradley, H.; Bluemke, E.; Rajkumar, N.; Krueger, D.; Kolt, N.; et al. Visibility into AI Agents. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, Rio de Janeiro, Brazil, 3 June 2024; pp. 958–973.
8. Casper, S.; Bailey, L.; Hunter, R.; Ezell, C.; Cabalé, E.; Gerovitch, M.; Slocum, S.; Wei, K.; Jurkovic, N.; Khan, A.; et al. The AI Agent Index. arXiv 2025, arXiv:2502.01635.
9. Chan, A.; Wei, K.; Huang, S.; Rajkumar, N.; Perrier, E.; Lazar, S.; Hadfield, G.K.; Anderljung, M. Infrastructure for AI Agents. arXiv 2025, arXiv:2501.10114.
10. Yang, Y.; Chai, H.; Song, Y.; Qi, S.; Wen, M.; Li, N.; Liao, J.; Hu, H.; Lin, J.; Chang, G.; et al. A Survey of AI Agent Protocols. arXiv 2025, arXiv:2504.16736.
11. Gero, K.I.; Ashktorab, Z.; Dugan, C.; Pan, Q.; Johnson, J.; Geyer, W.; Ruiz, M.; Miller, S.; Millen, D.R.; Campbell, M.; et al. Mental Models of AI Agents in a Cooperative Game Setting. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 21 April 2020; pp. 1–12.
12. Putta, P.; Mills, E.; Garg, N.; Motwani, S.; Finn, C.; Garg, D.; Rafailov, R. Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents. arXiv 2024, arXiv:2408.07199.
13. Yang, J.; Tan, R.; Wu, Q.; Zheng, R.; Peng, B.; Liang, Y.; Gu, Y.; Cai, M.; Ye, S.; Jang, J.; et al. Magma: A Foundation Model for Multimodal AI Agents. In Proceedings of the 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 10 June 2025; pp. 14203–14214.
14. Zhang, J.; Lan, T.; Zhu, M.; Liu, Z.; Hoang, T.; Kokane, S.; Yao, W.; Tan, J.; Prabhakar, A.; Chen, H.; et al. xLAM: A Family of Large Action Models to Empower AI Agent Systems. arXiv 2024, arXiv:2409.03215.
15. Insa-Cabrera, J.; Dowe, D.L.; España-Cubillo, S.; Hernández-Lloreda, M.V.; Hernández-Orallo, J. Comparing Humans and AI Agents. In Artificial General Intelligence; Schmidhuber, J., Thórisson, K.R., Looks, M., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2011; Volume 6830, pp. 122–132.
16. White, R.W. Advancing the Search Frontier with AI Agents. Commun. ACM 2024, 67, 54–65.
17. Kasirzadeh, A.; Gabriel, I. Characterizing AI Agents for Alignment and Governance. arXiv 2025, arXiv:2504.21848.
18. Sapkota, R.; Roumeliotis, K.I.; Karkee, M. AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges. Inf. Fusion 2026, 126, 103599.
19. Wan, H.; Zhang, J.; Suria, A.A.; Yao, B.; Wang, D.; Coady, Y.; Prpa, M. Building LLM-Based AI Agents in Social Virtual Reality. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 11 May 2024; pp. 1–7.
20. South, T.; Marro, S.; Hardjono, T.; Mahari, R.; Whitney, C.D.; Greenwood, D.; Chan, A.; Pentland, A. Authenticated Delegation and Authorized AI Agents. arXiv 2025, arXiv:2501.09674.
21. Han, E.; Yin, D.; Zhang, H. Bots with Feelings: Should AI Agents Express Positive Emotion in Customer Service? Inf. Syst. Res. 2023, 34, 1296–1311.
22. Ruan, J.; Chen, Y.; Zhang, B.; Xu, Z.; Bao, T.; Du, G.Q.; Shi, S.W.; Mao, H.; Li, Z.; Zeng, X.; et al. TPTU: Task Planning and Tool Usage of Large Language Model-Based AI Agents. 2023. Available online: https://openreview.net/forum?id=GrkgKtOjaH (accessed on 16 February 2026).
23. Mollick, E.; Mollick, L.; Bach, N.; Ciccarelli, L.J.; Przystanski, B.; Ravipinto, D. AI Agents and Education: Simulated Practice at Scale. arXiv 2024, arXiv:2407.12796.
24. Ferrag, M.A.; Tihanyi, N.; Debbah, M. From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review. arXiv 2025, arXiv:2504.19678.
25. Mitchell, M.; Ghosh, A.; Luccioni, A.S.; Pistilli, G. Fully Autonomous AI Agents Should Not Be Developed. arXiv 2025, arXiv:2502.02649.
26. Petrovic, V.M. Artificial Intelligence and Virtual Worlds—Toward Human-Level AI Agents. IEEE Access 2018, 6, 39976–39988.
27. Huang, X.; Lian, J.; Lei, Y.; Yao, J.; Lian, D.; Xie, X. Recommender AI Agent: Integrating Large Language Models for Interactive Recommendations. ACM Trans. Inf. Syst. 2025, 43, 1–33.
28. Yadav, D.; Jain, R.; Agrawal, H.; Chattopadhyay, P.; Singh, T.; Jain, A.; Singh, S.B.; Lee, S.; Batra, D. EvalAI: Towards Better Evaluation Systems for AI Agents. arXiv 2019, arXiv:1902.03570.
29. Roohani, Y.; Lee, A.; Huang, Q.; Vora, J.; Steinhart, Z.; Huang, K.; Marson, A.; Liang, P.; Leskovec, J. BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments. arXiv 2025, arXiv:2405.17631.
30. Cañas, J.J. AI and Ethics When Human Beings Collaborate with AI Agents. Front. Psychol. 2022, 13, 836650.
31. Shetty, M.; Chen, Y.; Somashekar, G.; Ma, M.; Simmhan, Y.; Zhang, X.; Mace, J.; Vandevoorde, D.; Las-Casas, P.; Gupta, S.M.; et al. Building AI Agents for Autonomous Clouds: Challenges and Design Principles. In Proceedings of the ACM Symposium on Cloud Computing, Redmond, WA, USA, 20 November 2024; pp. 99–110.
32. Yang, H.; Zhang, B.; Wang, N.; Guo, C.; Zhang, X.; Lin, L.; Wang, J.; Zhou, T.; Guan, M.; Zhang, R.; et al. FinRobot: An Open-Source AI Agent Platform for Financial Applications Using Large Language Models. arXiv 2024, arXiv:2405.14767.
33. Dennis, A.R.; Lakhiwal, A.; Sachdeva, A. AI Agents as Team Members: Effects on Satisfaction, Conflict, Trustworthiness, and Willingness to Work With. J. Manag. Inf. Syst. 2023, 40, 307–337.
34. Kostka, B.; Kwiecień, J.; Kowalski, J.; Rychlikowski, P. Text-Based Adventures of the Golovin AI Agent. In Proceedings of the 2017 IEEE Conference on Computational Intelligence and Games (CIG), New York, NY, USA, 22–25 August 2017; pp. 181–188.
35. Kumar, A. Building Autonomous AI Agents Based AI Infrastructure. Int. J. Comput. Trends Technol. 2024, 72, 116–125.
36. Huang, Y. Levels of AI Agents: From Rules to Large Language Models. arXiv 2024, arXiv:2405.06643.
37. Wang, H.; Wang, C.; Chen, Z.; Liu, F.; Bao, C.; Xu, X. Impact of AI-Agent-Supported Collaborative Learning on the Learning Outcomes of University Programming Courses. Educ. Inf. Technol. 2025, 30, 17717–17749.
38. Zhou, J.; Li, R.; Tang, J.; Tang, T.; Li, H.; Cui, W.; Wu, Y. Understanding Nonlinear Collaboration between Human and AI Agents: A Co-Design Framework for Creative Design. In Proceedings of the CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 11 May 2024; pp. 1–16.
39. Ning, L.; Liang, Z.; Jiang, Z.; Qu, H.; Ding, Y.; Fan, W.; Wei, X.; Lin, S.; Liu, H.; Yu, P.S.; et al. A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation with Large Foundation Models. In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (Vol. 2), Toronto, ON, Canada, 3 August 2025; pp. 6140–6150.
40. Nayyar, R.K.; Verma, P.; Srivastava, S. Differential Assessment of Black-Box AI Agents. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 22 February–1 March 2022; Volume 36, pp. 9868–9876.
41. Wang, X.; Dai, H.; Gao, S.; Li, P. Characteristic AI Agents via Large Language Models. arXiv 2024, arXiv:2403.12368.
42. Mathur, L.; Liang, P.P.; Morency, L.-P. Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions. arXiv 2024, arXiv:2404.11023.
43. Jiang, Y.-H.; Li, R.; Zhou, Y.; Qi, C.; Hu, H.; Wei, Y.; Jiang, B.; Wu, Y. AI Agent for Education: Von Neumann Multi-Agent System Framework. arXiv 2025, arXiv:2501.00083.
44. Sami, A.M.; Rasheed, Z.; Kemell, K.-K.; Waseem, M.; Kilamo, T.; Saari, M.; Duc, A.N.; Systä, K.; Abrahamsson, P. System for Systematic Literature Review Using Multiple AI Agents: Concept and an Empirical Evaluation. arXiv 2024, arXiv:2403.08399.
45. Han, X.; Wang, N.; Che, S.; Yang, H.; Zhang, K.; Xu, S.X. Enhancing Investment Analysis: Optimizing AI-Agent Collaboration in Financial Research. In Proceedings of the 5th ACM International Conference on AI in Finance, Brooklyn, NY, USA, 14 November 2024; pp. 538–546.
46. Alabed, A.; Javornik, A.; Gregory-Smith, D.; Casey, R. More than Just a Chat: A Taxonomy of Consumers’ Relationships with Conversational AI Agents and Their Well-Being Implications. Eur. J. Mark. 2024, 58, 373–409.
47. Arora, D.; Sonwane, A.; Wadhwa, N.; Mehrotra, A.; Utpala, S.; Bairi, R.; Kanade, A.; Natarajan, N. MASAI: Modular Architecture for Software-Engineering AI Agents. arXiv 2024, arXiv:2406.11638.
48. Balunovic, M.; Beurer-Kellner, L.; Fischer, M.; Vechev, M. AI Agents with Formal Security Guarantees. 2024. Available online: https://openreview.net/forum?id=c6jNHPksiZ (accessed on 16 February 2026).
49. Sun, G.; Zhan, X.; Such, J. Building Better AI Agents: A Provocation on the Utilisation of Persona in LLM-Based Conversational Agents. In Proceedings of the ACM Conversational User Interfaces 2024, Luxembourg, 8 July 2024; pp. 1–6.
50. Ashktorab, Z.; Dugan, C.; Johnson, J.; Pan, Q.; Zhang, W.; Kumaravel, S.; Campbell, M. Effects of Communication Directionality and AI Agent Differences in Human-AI Interaction. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 6 May 2021; pp. 1–15.
51. Ponnusamy, P.; Ghias, A.R.; Yi, Y.; Yao, B.; Guo, C.; Sarikaya, R. Feedback-Based Self-Learning in Large-Scale Conversational AI Agents. AI Mag. 2021, 42, 43–56.
52. Fung, P.; Bachrach, Y.; Celikyilmaz, A.; Chaudhuri, K.; Chen, D.; Chung, W.; Dupoux, E.; Gong, H.; Jégou, H.; Lazaric, A.; et al. Embodied AI Agents: Modeling the World. arXiv 2025, arXiv:2506.22355.
53. Huang, K.; Zhang, S.; Wang, H.; Qu, Y.; Lu, Y.; Roohani, Y.; Li, R.; Qiu, L.; Li, G.; Zhang, J.; et al. Biomni: A General-Purpose Biomedical AI Agent. bioRxiv 2025.
54. Mehrotra, S.; Jorge, C.C.; Jonker, C.M.; Tielman, M.L. Integrity-Based Explanations for Fostering Appropriate Trust in AI Agents. ACM Trans. Interact. Intell. Syst. 2024, 14, 1–36.
55. Zhou, J.; Zhang, B.; Li, G.; Chen, X.; Li, H.; Xu, X.; Chen, S.; He, W.; Xu, C.; Liu, L.; et al. An AI Agent for Fully Automated Multi-Omic Analyses. Adv. Sci. 2024, 11, 2407094.
56. Baranchuk, M.; Bolina, V.; De Witt, C.; Hammond, L.; Motwani, S.; Strohmeier, M.; Torr, P. Secret Collusion among AI Agents: Multi-Agent Deception via Steganography. In Proceedings of the Advances in Neural Information Processing Systems 37, Vancouver, BC, Canada, 10–15 December 2024; pp. 73439–73486.
57. Bovo, R.; Abreu, S.; Ahuja, K.; Gonzalez, E.J.; Cheng, L.-T.; Gonzalez-Franco, M. EmBARDiment: An Embodied AI Agent for Productivity in XR. In Proceedings of the 2025 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), Saint-Malo, France, 8 March 2025; pp. 708–717.
58. Mao, S.; Cai, Y.; Xia, Y.; Wu, W.; Wang, X.; Wang, F.; Ge, T.; Wei, F. ALYMPICS: LLM Agents Meet Game Theory—Exploring Strategic Decision-Making with AI Agents. arXiv 2024, arXiv:2311.03220.
59. Joshi, S. Advancing Innovation in Financial Stability: A Comprehensive Review of AI Agent Frameworks, Challenges and Applications. World J. Adv. Eng. Technol. Sci. 2025, 14, 117–126.
60. Joshi, S. Review of Autonomous Systems and Collaborative AI Agent Frameworks. Int. J. Sci. Res. Arch. 2025, 14, 961–972.
61. Houde, S.; Brimijoin, K.; Muller, M.; Ross, S.I.; Silva Moran, D.A.; Gonzalez, G.E.; Kunde, S.; Foreman, M.A.; Weisz, J.D. Controlling AI Agent Participation in Group Conversations: A Human-Centered Approach. In Proceedings of the 30th International Conference on Intelligent User Interfaces, Cagliari, Italy, 24 March 2025; pp. 390–408.
62. Muller, M.; Houde, S.; Gonzalez, G.N.; Brimijoin, K.; Ross, S.; Moran, D.A.S.; Weisz, J. Group Brainstorming with an AI Agent: Creating and Selecting Ideas. In Proceedings of the International Conference on Computational Creativity (ICCC 2024), Jönköping, Sweden, 17–21 June 2024; Available online: https://computationalcreativity.net/iccc24/papers/ICCC24_paper_18.pdf (accessed on 16 February 2026).
63. Jiang, Y.-H.; Shi, J.; Tu, Y.; Zhou, Y.; Zhang, W.; Wei, Y. For Learners: AI Agent Is All You Need. In Proceedings of the International Conference on Computational Creativity (ICCC 2024), Jönköping, Sweden, 17–21 June 2024; pp. 21–46. Available online: https://www.researchgate.net/profile/Yuan-Hao-Jiang/publication/384803779_For_Learners_AI_Agent_Is_All_You_Need/links/675d4f34da24c8537c6ef4fe/For-Learners-AI-Agent-Is-All-You-Need.pdf (accessed on 16 February 2026).
64. Rasheed, Z.; Waseem, M.; Systä, K.; Abrahamsson, P. Large Language Model Evaluation via Multi AI Agents: Preliminary Results. arXiv 2024, arXiv:2404.01023.
65. Joshi, S. A Literature Review of Gen AI Agents in Financial Applications: Models and Implementations. SSRN 2025.
66. Noothigattu, R.; Bouneffouf, D.; Mattei, N.; Chandra, R.; Madan, P.; Varshney, K.R.; Campbell, M.; Singh, M.; Rossi, F. Teaching AI Agents Ethical Values Using Reinforcement Learning and Policy Orchestration. IBM J. Res. Dev. 2019, 63, 2:1–2:9.
67. Zhang, R.; Du, H.; Liu, Y.; Niyato, D.; Kang, J.; Xiong, Z.; Jamalipour, A.; In Kim, D. Generative AI Agents with Large Language Model for Satellite Networks via a Mixture of Experts Transmission. IEEE J. Sel. Areas Commun. 2024, 42, 3581–3596.
68. Yang, Y.; Ma, M.; Huang, Y.; Chai, H.; Gong, C.; Geng, H.; Zhou, Y.; Wen, Y.; Fang, M.; Chen, M.; et al. Agentic Web: Weaving the Next Web with AI Agents. arXiv 2025, arXiv:2507.21206.
69. Yu, D.; Song, K.; Lu, P.; He, T.; Tan, X.; Ye, W.; Zhang, S.; Bian, J. MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models. arXiv 2023, arXiv:2310.11954.
70. Zhang, Q.; Hu, Y.; Yan, J.; Zhang, H.; Xie, X.; Zhu, J.; Li, H.; Niu, X.; Li, L.; Sun, Y.; et al. Large-Language-Model-Based AI Agent for Organic Semiconductor Device Research. Adv. Mater. 2024, 36, 2405163.
71. Chiang, J.Y.F.; Lee, S.; Huang, J.-B.; Huang, F.; Chen, Y. Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis. arXiv 2025, arXiv:2502.20383.
72. Ivanov, D.; Dütting, P.; Talgam-Cohen, I.; Wang, T.; Parkes, D.C. Principal-Agent Reinforcement Learning: Orchestrating AI Agents with Contracts. arXiv 2024, arXiv:2407.18074.
73. Duan, W.; McNeese, N.; Freeman, G.; Li, L. Mitigating Gender Stereotypes toward AI Agents through an eXplainable AI (XAI) Approach. Proc. ACM Hum.-Comput. Interact. 2024, 8, 1–35.
74. Yu, X.; Peng, B.; Vajipey, V.; Cheng, H.; Galley, M.; Gao, J.; Yu, Z. ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning. arXiv 2025, arXiv:2410.02052.
75. Ju, H.; Aral, S. Collaborating with AI Agents: Field Experiments on Teamwork, Productivity, and Performance. arXiv 2025, arXiv:2503.18238.
76. Feng, K.J.K.; Pu, K.; Latzke, M.; August, T.; Siangliulue, P.; Bragg, J.; Weld, D.S.; Zhang, A.X.; Chang, J.C. Cocoa: Co-Planning and Co-Execution with AI Agents. arXiv 2025, arXiv:2412.10999.
77. Umarov, I.; Mozgovoy, M. Believable and Effective AI Agents in Virtual Worlds: Current State and Future Perspectives. Int. J. Gaming Comput.-Mediat. Simul. 2012, 4, 37–59.
78. Gupta, S. AI Agents Collaboration under Resource Constraints: Practical Implementations. Int. J. Artif. Intell. Res. Dev. 2025, 3, 51–63.
79. Ruan, J.; Chen, Y.; Zhang, B.; Xu, Z.; Bao, T.; Du, G.; Shi, S.; Mao, H.; Li, Z.; Zeng, X.; et al. TPTU: Large Language Model-Based AI Agents for Task Planning and Tool Usage. arXiv 2023, arXiv:2308.03427.
80. Zhang, Y.; Cai, Y.; Zuo, X.; Luan, X.; Wang, K.; Hou, Z.; Zhang, Y.; Wei, Z.; Sun, M.; Sun, J.; et al. The Fusion of Large Language Models and Formal Methods for Trustworthy AI Agents: A Roadmap. arXiv 2024, arXiv:2412.06512.
81. Bhatt, A.; Rushing, C.; Kaufman, A.; Tracy, T.; Georgiev, V.; Matolcsi, D.; Khan, A.; Shlegeris, B. Ctrl-Z: Controlling AI Agents via Resampling. arXiv 2025, arXiv:2504.10374.
82. Kon, P.T.J.; Liu, J.; Ding, Q.; Qiu, Y.; Yang, Z.; Huang, Y.; Srinivasa, J.; Lee, M.; Chowdhury, M.; Chen, A. Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents. arXiv 2025, arXiv:2502.16069.
83. Bousetouane, F. Agentic Systems: A Guide to Transforming Industries with Vertical AI Agents. arXiv 2025, arXiv:2501.00881.
84. Mo, T.; Jiang, Z.; Zheng, Q. Interactive AI Agent for Code Refactoring Assistance: A Study on Decision-Making Strategies and Human–Agent Collaboration Effectiveness. Acad. Nexus J. 2025, 4. Available online: https://academianexusjournal.com/index.php/anj/article/view/35 (accessed on 16 February 2026).
85. Murthy, R.; Heinecke, S.; Niebles, J.C.; Liu, Z.; Xue, L.; Yao, W.; Feng, Y.; Chen, Z.; Gokul, A.; Arpit, D.; et al. REX: Rapid Exploration and eXploitation for AI Agents. arXiv 2024, arXiv:2307.08962.
86. Lim, S.; Shim, H. No Secrets between the Two of Us: Privacy Concerns over Using AI Agents. Cyberpsychology 2022, 16, 3.
87. Hong, J.-W.; Williams, D. Racism, Responsibility and Autonomy in HCI: Testing Perceptions of an AI Agent. Comput. Hum. Behav. 2019, 100, 79–84.
88. Aryal, S.; Do, T.; Heyojoo, B.; Chataut, S.; Gurung, B.D.S.; Gadhamshetty, V.; Gnimpieba, E. Leveraging Multi-AI Agents for Cross-Domain Knowledge Discovery. arXiv 2024, arXiv:2404.08511.
89. Lei, S.; Xie, L.; Peng, J. Unethical Consumer Behavior Following Artificial Intelligence Agent Encounters: The Differential Effect of AI Agent Roles and Its Boundary Conditions. J. Serv. Res. 2025, 28, 598–613.
90. Cihon, P.; Stein, M.; Bansal, G.; Manning, S.; Xu, K. Measuring AI Agent Autonomy: Towards a Scalable Approach with Code Inspection. arXiv 2025, arXiv:2502.15212.
91. Thomas, G.; Chan, A.J.; Kang, J.; Wu, W.; Christianos, F.; Greenlee, F.; Toulis, A.; Purtorab, M. WebGames: Challenging General-Purpose Web-Browsing AI Agents. arXiv 2025, arXiv:2502.18356.
92. DeChant, C. Episodic Memory in AI Agents Poses Risks That Should Be Studied and Mitigated. In Proceedings of the 2025 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), Copenhagen, Denmark, 9 April 2025; pp. 321–332.
93. Bhunia, A.K.; Das, A.; Muhammad, U.R.; Yang, Y.; Hospedales, T.M.; Xiang, T.; Gryaditskaya, Y.; Song, Y.-Z. Pixelor: A Competitive Sketching AI Agent. ACM Trans. Graph. 2020, 39, 1–15.
94. Giske, C.G.; Bressan, M.; Fiechter, F.; Hinic, V.; Mancini, S.; Nolte, O.; Egli, A. GPT-4-Based AI Agents—The New Expert System for Detection of Antimicrobial Resistance Mechanisms? J. Clin. Microbiol. 2024, 62, e00689-24.
95. Jabbour, J.; Janapa Reddi, V. Generative AI Agents in Autonomous Machines: A Safety Perspective. In Proceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, Newark, NJ, USA, 27 October 2024; pp. 1–13.
96. Chang, M.L.; Lee, A.; Han, N.; Huang, A.; Simão, H.; Reig, S.; Mohammad Ali, A.U.; Martinez, R.; Khanuja, N.M.; Zimmerman, J.; et al. Dynamic Agent Affiliation: Who Should the AI Agent Work for in the Older Adult’s Care Network? In Proceedings of the Designing Interactive Systems Conference, Copenhagen, Denmark, 1–5 July 2024; pp. 1774–1788.
97. Chmait, N.; Li, Y.-F.; Dowe, D.L.; Green, D. A Dynamic Intelligence Test Framework for Evaluating AI Agents. In Proceedings of the EGPAI 2016—Evaluating General Purpose AI, The Hague, The Netherlands, 30 August 2016.
98. De Mesentier Silva, F.; Borovikov, I.; Kolen, J.; Aghdaie, N.; Zaman, K. Exploring Gameplay with AI Agents. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, Edmonton, AB, Canada, 13–17 November 2018; Volume 14, pp. 159–165.
99. Liu, Z.; Qiu, J.; Wang, S.; Zhang, J.; Liu, Z.; Ram, R.; Chen, H.; Yao, W.; Heinecke, S.; Savarese, S.; et al. MCPEval: Automatic MCP-Based Deep Evaluation for AI Agent Models. arXiv 2025, arXiv:2507.12806.
100. Sun, H.; Zeng, S. Introspection of Thought Helps AI Agents. arXiv 2025, arXiv:2507.08664.
101. Mouri Zadeh Khaki, A.; Choi, A. Evaluating Fairness in LLM Negotiator Agents via Economic Games Using Multi-Agent Systems. Mathematics 2026, 14, 458.
102. Lee, M. A Mathematical Investigation of Hallucination and Creativity in GPT Models. Mathematics 2023, 11, 2320.
103. Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Xia, F.; Chi, E.H.; Le, Q.V.; Zhou, D. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv 2022, arXiv:2201.11903.
104. Carlini, N.; Tramer, F.; Wallace, E.; Jagielski, M.; Herbert-Voss, A.; Lee, K.; Roberts, A.; Brown, T.B.; Song, D.; Erlingsson, U.; et al. Extracting Training Data from Large Language Models. arXiv 2020, arXiv:2012.07805.
105. Shoeybi, M.; Patwary, M.; Puri, R.; LeGresley, P.; Casper, J.; Catanzaro, B. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. arXiv 2019, arXiv:1909.08053.
106. Tirumala, K.; Markosyan, A.; Zettlemoyer, L.; Aghajanyan, A. Memorization without Overfitting: Analyzing the Training Dynamics of Large Language Models. arXiv 2022, arXiv:2205.10770.
107. Hu, E.J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. LoRA: Low-Rank Adaptation of Large Language Models. arXiv 2021, arXiv:2106.09685.
108. Gusenbauer, M. Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases. Scientometrics 2019, 118, 177–214.
109. Mai, J.; Gao, C.; Bao, J. Domain Generalization through Data Augmentation: A Survey of Methods, Applications, and Challenges. Mathematics 2025, 13, 824.
110. Sørensen, K.; Van den Broucke, S.; Fullam, J.; Doyle, G.; Pelikan, J.; Slonska, Z.; Brand, H. Health Literacy and Public Health: A Systematic Review and Integration of Definitions and Models. BMC Public Health 2012, 12, 80.

Figure 1. Cumulative number of Google Scholar records with titles containing the terms “AI agent” or “AI agents”, retrieved using an all-in-title query, from 1990 to 31 January 2026.

Figure 2. Worldwide weekly Google Trends interest (0–100) for the query “AI Agent”, from 1 January 2023 to 31 January 2026.

Figure 3. Categorization of top 100 Google Scholar publications in the field of AI agents.

Table 1. Descriptive statistics of the relevance scores across the five identified research streams.

Research Stream	Mean Relevance (μ)	Standard Deviation (σ)
Architecture & Frameworks	70.16	35.85
Multi-Agent Systems	37.78	42.12
Applications	70.42	34.67
Safety	33.65	39.11
Ethics, Accountability & Governance	36.93	41.68
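
The spread in Table 1 is easier to compare across streams when the standard deviation is expressed relative to the mean. The following is a minimal sketch that derives the coefficient of variation (σ/μ) from the values reported in Table 1. The coefficient is not reported in the original table; it is computed here purely as an illustration, and the names `TABLE_1`, `coefficient_of_variation` and `relative_dispersion` are hypothetical.

```python
# Table 1 values copied from the source: stream -> (mean relevance, standard deviation).
TABLE_1 = {
    "Architecture & Frameworks": (70.16, 35.85),
    "Multi-Agent Systems": (37.78, 42.12),
    "Applications": (70.42, 34.67),
    "Safety": (33.65, 39.11),
    "Ethics, Accountability & Governance": (36.93, 41.68),
}


def coefficient_of_variation(mu: float, sigma: float) -> float:
    """Return sigma / mu, a unit-free measure of relative dispersion."""
    if mu == 0:
        raise ValueError("mean must be non-zero")
    return sigma / mu


def relative_dispersion(table: dict) -> dict:
    """Map each research stream to its coefficient of variation (2 decimals)."""
    return {
        stream: round(coefficient_of_variation(mu, sigma), 2)
        for stream, (mu, sigma) in table.items()
    }


if __name__ == "__main__":
    for stream, cv in relative_dispersion(TABLE_1).items():
        print(f"{stream}: CV = {cv}")
```

Under this derived measure, the two streams with the highest mean relevance (Architecture & Frameworks and Applications) have a standard deviation of roughly half the mean, whereas the three remaining streams have a standard deviation larger than the mean.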

Table 2. Comparison between single-agent and multi-agent architectures [5,18].

Feature	Single AI Agent	Multi-Agent Architecture
Composition	Single LLM augmented with tools and memory	Ensemble of specialized agents using multiple LLMs or models
Task complexity	Focused on a single, bounded task	Supports complex, multi-step workflows requiring coordination
Reasoning process	Internal iterative loops such as ReAct	Distributed and collaborative reasoning with recursive task allocation
Execution	Sequential autonomous steps within a defined scope	Parallelized execution coordinated among specialized agents

Table 3. Technical requirements and evaluation metrics of AI agent architectures (synthesized from [1,5,12,18,68,95,99]).

Architecture	Technical Profile (Complexity, Latency, Memory)	Performance Metrics & Benchmarks
Single-Agent	Complexity: Low; Latency: Low (Real-time); Memory: Context-heavy	Metrics: Pass@k, Success Rate (SR); Benchmarks: MMLU, GSM8K
Hierarchical MAS	Complexity: Moderate (Parallel); Latency: Variable (Orchestration); Memory: Distributed/Low load	Metrics: Task Completion, Efficiency; Benchmarks: SWE-bench
Decentralized MAS	Complexity: High; Latency: High (Negotiation); Memory: Large system footprint	Metrics: Consensus Rate, Cost; Benchmarks: Alympics, ChatEval
Agentic RAG	Complexity: Moderate (Two-stage); Latency: Moderate (DB-lookup); Memory: Efficient (Ext. Vector DB)	Metrics: Precision, Faithfulness; Benchmarks: HotpotQA, RGB
Embodied/Web	Complexity: Variable (High-freq); Latency: Critical (<100 ms); Memory: Hardware-dependent	Metrics: Path Length, GUI Success; Benchmarks: VisualWebArena

Table 4. Overview of AI agent applications, architectures and key performance metrics across domains.

Application Domain	Representative Systems/Approaches	Agent Type	Key Tasks	Performance Metrics	Observed Performance Trends
6.1 Financial Services	FinRobot, FinVision	Multi-Agent	Trading, risk analysis, fraud detection	Accuracy, false-positive rate, response time	Improved risk accuracy, reduced false positives, faster decision-making
6.2 Scientific Research	AutoGen, CrewAI, BioDiscoveryAgent, Biomni	Multi-Agent	Literature review, experiment design, data analysis	Hit rate, precision/recall, productivity gain	Higher experimental success rates, strong gains in productivity and scalability
6.3 Software Engineering	SWE-Agent, SWE-bench	Single & Multi-Agent	Code generation, debugging, maintenance	Task success rate, benchmark score	Multi-agent setups outperform single agents on complex coding tasks
6.4 Web Automation	WebAgents, OpenAI Operator, VisualWebArena	Single & Multi-Agent	Web navigation, task automation	Task completion rate, robustness, generalization	Strong performance in structured tasks, limited robustness in open environments
6.5 Education	Intelligent Tutoring Systems, PitchQuest	Multi-Agent	Tutoring, simulation, feedback	Learning outcome, engagement, accuracy	Improved engagement and personalized learning outcomes
6.6 Business & Enterprise	Siri, Alexa, Replika	Single & Multi-Agent	Customer service, workflow automation	Response time, user satisfaction, task efficiency	Increased efficiency and reduced workload, strong UX improvements
6.7 Robotics & IoT	Multi-robot systems, embodied agents	Multi-Agent	Physical interaction, control, coordination	Task success rate, control accuracy	Effective in structured environments, challenges in unstructured settings
6.8 Cybersecurity	Multi-agent defense systems	Multi-Agent	Threat detection, response coordination	Detection rate, false positives, response latency	Faster detection and improved coordination, but new attack surfaces
6.9 Gaming & Simulation	Alympics, Pixelor	Single & Multi-Agent	Strategy, gameplay, simulation	Win rate, human-level performance	Achieves human-level or superhuman performance in controlled settings
6.10 Emerging Applications	FilmAgent, OptiMuse, Pairit	Multi-Agent	Creative tasks, co-creation	Output quality, user preference, collaboration efficiency	Human–AI teams outperform AI-only in creativity and final output selection
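
Several of the performance metrics named in Table 4 (accuracy, precision/recall, false-positive rate, task success rate) are not defined in the table itself. The following is a minimal sketch that assumes the standard confusion-matrix definitions for binary outcomes; the reviewed studies may compute these metrics differently. The names `BinaryMetrics`, `binary_metrics` and `task_success_rate` are hypothetical, and the example labels are illustrative only and are not results from any reviewed publication.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    false_positive_rate: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 instead of raising when the denominator is zero.
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute confusion-matrix metrics for 0/1 labels (1 = positive class)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        precision=_ratio(tp, tp + fp),
        recall=_ratio(tp, tp + fn),
        false_positive_rate=_ratio(fp, fp + tn),
    )


def task_success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of tasks (e.g., benchmark episodes) the agent completed."""
    return _ratio(sum(1 for ok in outcomes if ok), len(outcomes))


# Illustrative data: e.g., fraud flags (1 = fraud) and four task episodes.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
success = task_success_rate([True, False, True, True])
print(metrics)
print(success)
```

Returning 0.0 for an empty denominator is a design choice of this sketch; an evaluation harness might prefer to report such cases as undefined.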

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

