Next Article in Journal
cyberSPADE: A Hierarchical Multi-Agent Architecture for Coordinated Cyberdefense
Previous Article in Journal
Denoising Adaptive Multi-Branch Architecture for Detecting Cyber Attacks in Industrial Internet of Services
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Securing Generative AI Systems: Threat-Centric Architectures and the Impact of Divergent EU–US Governance Regimes

by
Vijay Kanabar
1 and
Kalinka Kaloyanova
2,3,*
1
Metropolitan College, Boston University, Boston, MA 02215, USA
2
Faculty of Mathematics and Informatics, Sofia University “St. Kliment Ohridski”, 5 J. Bourchier Blvd., 1164 Sofia, Bulgaria
3
Institute of Mathematics and Informatics, Bulgarian Academy of Science, Acad. G. Bonchev Str., Bl. 8, 1113 Sofia, Bulgaria
*
Author to whom correspondence should be addressed.
J. Cybersecur. Priv. 2026, 6(1), 27; https://doi.org/10.3390/jcp6010027
Submission received: 27 December 2025 / Revised: 21 January 2026 / Accepted: 2 February 2026 / Published: 6 February 2026
(This article belongs to the Section Security Engineering & Applications)

Abstract

Generative AI (GenAI) systems are increasingly deployed across high-impact sectors, introducing security risks that fundamentally differ from those of traditional software. Their probabilistic behavior, emergent failure modes, and expanded attack surface, particularly through retrieval and tool integration, complicate threat modeling and control assurance. This paper presents a threat-centric analysis that maps adversarial techniques to the core architectural layers of generative AI systems, including training pipelines, model behavior, retrieval mechanisms, orchestration, and runtime interaction. Using established taxonomies such as the OWASP LLM Top 10 and MITRE ATLAS alongside empirical research, we show that many GenAI security risks are structural rather than configurable, limiting the effectiveness of perimeter-based and policy-only controls. We additionally analyze the impact of regulatory divergence on GenAI security architecture and find that EU frameworks serve in practice as the highest common technical baseline for transatlantic deployments.

1. Introduction

Generative AI systems are rapidly being embedded across critical sectors, including finance, healthcare, public administration, and manufacturing, where security failures can cause significant operational, economic, and societal harm. These systems differ from traditional software in ways that matter directly for cybersecurity: outputs are probabilistic, behaviors can emerge from interaction and context, and compromise is often non-binary (e.g., subtle leakage or partial degradation rather than a clear “breach”). As a result, conventional security assumptions and control models do not map cleanly to one another. Conventional software security is often a matter of configuration (e.g., closing a port, updating a dependency, or setting a firewall rule). In contrast, Generative AI vulnerabilities are inherently structural due to how Large Language Models (LLMs) function, and they cannot be “patched” away with a simple toggle or a peripheral filter.
Current guidance is developing but remains uneven across organizations and jurisdictions, with security taxonomies and risk frameworks increasingly shaping how practitioners describe and measure these risks, including NIST’s (National Institute of Standards and Technology) AI risk management work, OWASP’s (Open Web Application Security Project) risk catalog, and ATLAS‘s (Adversarial Threat Landscape for Artificial-Intelligence Systems) framework [1,2,3,4].
This paper addresses that gap by treating generative AI security as an architectural problem rather than a checklist problem. We present a threat-centric analysis that maps adversarial techniques to the following core architectural layers: AI training, model components, retrieval mechanisms, orchestration layers, and runtime interactions. This allows us to place controls where they are technically meaningful and to evaluate against realistic failure modes. The principal conclusion is that many generative AI risks are structural rather than configurable, which limits the effectiveness of perimeter- and policy-based controls when applied in isolation.
We also examine how transatlantic regulatory divergence operates as a design constraint: EU frameworks, such as the AI Act and NIS2, create binding obligations that shape system architecture and evidence requirements, whereas US governance relies more on voluntary frameworks and sectoral enforcement [1,5,6]. Adding to the complexity is the reality that the US federal government exerts significant influence to promote regulatory approaches aligned with industry preferences, often at odds with state-level regulatory initiatives [7].
The EU regime increasingly serves as a de facto global baseline for multinational deployments, driving architectural convergence while introducing documentation asymmetry and operational friction. In this context, however, it is worth noting that rigidly aligning architectures with a single baseline can slow innovation and development cycles, complicate deployment, constrain uniform design approaches, and necessitate additional governance and oversight processes.
Recent research and practitioner guidance converge on the view that generative AI introduces new attack pathways, including prompt injection, indirect prompt injection via retrieved content, data poisoning in training and fine-tuning pipelines, and the risks associated with model inversion. This increases the prevalence of traditional threats such as phishing and social engineering [3,4,8,9]. A significant point of disagreement in the field concerns whether these risks can be adequately managed through “bolt-on” safeguards (e.g., output filters, prompt screening, and policy enforcement layers) versus requiring bigger architectural changes such as privilege separation for tools, retrieval isolation, provenance control for training data, and continuous behavior monitoring [2,3,4,8].
A second controversy concerns assurance: some approaches treat generative AI security primarily as an evaluation problem that can be managed by improving red teaming, adversarial testing, and benchmarking [10,11,12,13,14].
Practitioner guidance increasingly recognizes that generative AI security requires a systems engineering approach rather than model-only safety measures. Industry roadmaps and applied security frameworks emphasize that system behavior depends on architectural choices, such as data flow controls, tool access permissions, API design, and operational oversight, as well as context, integration decisions, and runtime interaction effects. This perspective highlights that many GenAI risks are inherently structural and emerge from how components interact, not just from model capabilities in isolation. Consequently, these risks must be addressed during the design phase through deliberate architecture and governance decisions, rather than relying exclusively on evaluation benchmarks, runtime guardrails, prompt filters, or policy overlays. This distinction matters in practice because it fundamentally shapes investment priorities and determines whether security efforts can measurably reduce risk [15,16,17,18,19,20].
Against this backdrop, this study aims to provide a security analysis that is simultaneously technical and governance-aware, without reducing the problem to either a threat catalog or a compliance checklist. We map adversarial techniques to core architectural layers of generative AI systems and use established taxonomies to identify where standard controls align, where they misalign, and where residual risk is structural. We then analyze how divergent EU–US governance regimes shape security obligations, documentation burdens, and operational processes, highlighting the importance of considering the EU requirements as a de facto global baseline for multinational deployments. The principal conclusions are that (i) many high-impact generative AI risks are best understood as architectural properties, not isolated vulnerabilities; (ii) effective risk reduction requires defense-in-depth that combines traditional cybersecurity controls with AI-specific mechanisms; and (iii) regulatory divergence drives architectural convergence toward EU-aligned controls while creating documentation asymmetry and operational friction in US-centric environments.
To strengthen the technical evidence base for these claims, we also conducted a targeted empirical literature review covering January 2020–January 2026, prioritizing studies that report measured attacks, empirically evaluated defenses, and benchmark/measurement results across prompt injection, tool/agent misuse, RAG poisoning, and privacy leakage. Section 2.2 summarizes the review methodology, and Appendix A provides a text-only record-flow summary of the screening process.

2. Methodology

2.1. Study Design and Scope

This study employs a structured analytical approach rather than an experimental evaluation. The method combines: (i) architectural decomposition of generative AI systems, (ii) threat taxonomy mapping to those architectural layers, (iii) security control mapping to identify alignment and gaps, and (iv) comparative analysis of EU–US governance regimes as design constraints. The scope focuses on enterprise and public-sector deployments of generative AI, including foundation models integrated with retrieval-augmented generation (RAG), tool use and orchestration, and multi-turn interaction workflows [1,2,3,4]. The governance analysis focuses on the European Union and the United States, as they represent contrasting regulatory philosophies that directly influence security obligations, evidence requirements, and operational processes.

2.2. Review Methodology

To strengthen the evidence base for the threat and control mappings, we conducted a targeted empirical literature review covering January 2020 through January 2026. Searches were performed using Scopus, IEEE Xplore, and the ACL Anthology, supplemented by targeted web searches as needed (e.g., for accepted versions, datasets, and benchmark artifacts).
The search strategy aimed for balanced empirical coverage across five threat surfaces aligned with the following architectural framing: (1) prompt injection and jailbreaks; (2) indirect prompt injection and tool/agent misuse; (3) RAG poisoning and knowledge corruption; (4) privacy leakage and training data exposure; and (5) mitigations, evaluations, and benchmarks.
Eligibility criteria were designed to ensure that included sources contributed actionable evidence rather than purely normative guidance. Candidate papers were screened for at least one of the following: (i) an attack demonstration with measured results, (ii) a defense with empirical evaluation, (iii) a benchmark or measurement study, or (iv) a formal threat model accompanied by deployment-relevant analysis. Peer-reviewed publications were prioritized; however, key preprints were retained when methodology was clearly described, and the work was widely used for evaluation or benchmarking.
Screening proceeded in stages from title/abstract review to full-text assessment. The final empirical set was selected to preserve balance across the five threat surfaces while emphasizing results relevant to real deployment patterns (e.g., RAG, tool invocation, and agentic orchestration). A transparent record-flow summary of the screening process is provided in Appendix BTable A1.

2.2.1. The Constellation of Frameworks

Generative AI security is currently guided by a constellation of frameworks that emerged from different institutional contexts and therefore emphasize different aspects of risk. OWASP’s LLM Top 10 organizes common GenAI failure modes and attack vectors, while MITRE ATLAS maps adversarial tactics and techniques across the AI attack lifecycle. NIST’s AI RMF and the Generative AI Profile provide a risk-based governance approach structured around Govern, Map, Measure, and Manage, with an explicit distinction between activities managed primarily by developers versus deployers. ISO/IEC 42001 [21] offers a certifiable AI management system standard focused on governance, lifecycle controls, and third-party assurance, while ENISA’s guidance reflects the European operational and regulatory perspective that informs implementation of the AI Act and NIS2.
These sources collectively provide essential guidance, but they do not consistently distinguish risks intrinsic to GenAI system architecture from those introduced through deployment and operation. The structural–configurable lens addresses this gap and is introduced in this paper. See Appendix C for frameworks and research on this topic.

2.2.2. Positioning the Structural–Configurable Distinction in the Literature

The structural–configurable distinction employed in this paper draws on established concepts in risk management, software security, and systems security, but applies them specifically to the unique challenges of generative AI systems. This subsection traces the intellectual lineage of the distinction, contrasts it with related concepts, and articulates what is novel about its application to GenAI.
Let us consider ISO 31000:2018 and related risk management standards, and how they distinguish between inherent risk, the initial risk level of an activity or system before any controls are applied, and residual risk, the risk remaining after control measures have been implemented [22]. This distinction is fundamental to risk-based decision making. Inherent risk informs the need for controls, while residual risk determines whether additional measures are needed, or we simply accept the current risk. This concept is consistent across domains. In project risk management, inherent risks are mitigated through the process of “Creating a Risk Response Plan” [23]. This process yields residual risk, and the project manager must decide whether additional risk response plans are needed to prevent or mitigate it.
Our structural–configurable framing refines this dichotomy for generative AI systems. Structural risks are intrinsic to architectural and design choices. They persist regardless of operational controls and can be reduced only through redesign, model replacement, or capability limitation. Configurable risks arise from deployment choices and can be substantially reduced through access control, isolation, monitoring, permissioning, and governance without requiring architectural changes. This mapping is not identical to inherent or residual risk, but it shares the conceptual foundation of distinguishing between risk properties that are fixed by design versus those amenable to operational treatment.
In software security, design flaws are commonly distinguished from implementation bugs, such as architectural weaknesses, which often require redesign, while coding defects can be patched [24]. McGraw’s foundational work on software security emphasizes that roughly half of security defects are design-level issues that cannot be resolved through code review or patching alone; they require architectural intervention—“Guess what? If you ignore half the problem (design), your systems are going to be vulnerable in ways that can be incredibly hard to fix later” [25].
Related work on attack surface similarly separates properties determined by system design from those shaped by configuration and deployment [26].
The NIST Secure Software Development Framework (SSDF) similarly distinguishes between secure design practices and secure coding practices, recognizing that some vulnerabilities are introduced at the architecture level and cannot be remediated later [24].
GenAI systems complicate these traditional splits because “design” encompasses not only software architecture but also training data provenance, model selection, fine-tuning and alignment procedures, and emergent capabilities [2,27]. As a result, some security properties are effectively fixed for deployers once a foundation model is chosen, while others depend heavily on integration patterns such as retrieval-augmented generation, tool invocation, permissions, and runtime monitoring.
A particularly relevant concept from systems security is the confused deputy problem, first articulated by Hardy in the context of capability-based security [28]. A confused deputy vulnerability arises when a program with elevated privileges is tricked into misusing its authority on behalf of an attacker who lacks those privileges. The classic example involves a compiler that writes billing information to a file—an attacker could trick the compiler into overwriting protected system files by manipulating the output path. To quote Hardy, “This is, of course, the tool of Trojan horses, which is the companion problem in these access list architectures” [28].
GenAI systems integrated with retrieval and tool use exhibit a structurally similar vulnerability pattern. The LLM acts as a privileged intermediary: it can read from data sources, invoke tools, and perform actions that individual users may not be authorized to perform directly. When the model processes untrusted content (user prompts, retrieved documents, or external data), adversarial instructions embedded in that content can “confuse” the model into performing privileged actions—the essence of indirect prompt injection attacks [12,13]. This is a structural vulnerability because it arises from the architectural coupling of untrusted natural-language inputs with privileged behaviors; it cannot be fully eliminated through output filtering or policy enforcement without fundamentally changing the trust model.
Manadhata and Wing’s work on attack surface metrics [24] provides another relevant foundation. Their framework distinguishes between properties of the attack surface determined by system design (e.g., the set of entry points, the privileges associated with different channels) and those shaped by deployment configuration (e.g., which services are enabled, network exposure). This distinction aligns closely with our structural–configurable framing: structural risks expand the attack surface in ways that configuration cannot shrink, while configurable risks can be reduced by minimizing the deployed attack surface.
While the conceptual foundations described above are well established, their systematic application to generative AI security is novel and non-trivial. GenAI systems complicate traditional dichotomies in several ways:
  • Expanded scope of “design.” In GenAI systems, “design” encompasses not only software architecture but also training data provenance, model selection, fine-tuning procedures, alignment techniques, and emergent capabilities [29]. Structural risks, therefore, include properties that are fixed once a foundation model is chosen, even if the deployer has no visibility into or control over those properties.
  • Blurred trust boundaries. Traditional security models assume clear boundaries between code and data, instructions, and inputs. GenAI systems process natural language as both data and control—a fundamental architectural property that creates structural exposure to prompt injection and related attacks.
  • Probabilistic and emergent behavior. Unlike deterministic software, where design flaws produce consistent failures, GenAI systems exhibit probabilistic behavior and emergent capabilities that may only manifest under specific conditions. This complicates both the identification and remediation of structural risks.
  • Provider–deployer responsibility allocation. The structural–configurable distinction has direct practical implications for allocating security responsibility: structural risks typically require intervention by AI providers or system architects (who control model training, architecture, and capability boundaries), while configurable risks can often be addressed by deploying organizations through operational controls. This allocation is essential for cross-jurisdictional compliance, where regulations may impose different obligations on providers versus deployers.
We can formalize the structural–configurable lens for GenAI security as follows:
Structural risks arise from model and system architecture, training-related properties, or trust-boundary design that cannot be reliably mitigated through operational controls alone. They require architectural redesign, model replacement, capability limitation, or provider-side intervention to address. Examples include memorization of training data, the instruction–data coupling that enables prompt injection, and the “confused deputy” vulnerability pattern in tool-integrated systems.
Configurable risks arise from deployment choices and can be materially reduced through access control, isolation, monitoring, permissioning, and governance without requiring architectural changes. Examples include: over-permissioned tool access, overly broad retrieval scope, insufficient logging, and inadequate incident response procedures.
This distinction clarifies control ownership (provider versus deployer), supports more realistic control placement, and improves cross-jurisdictional analysis by separating stable technical properties from adaptable operational measures.
Table 1 summarizes how the structural–configurable distinction relates to and builds upon established concepts in the literature.
We apply this lens systematically in Section 3 and Section 4 by classifying each major threat/control as structural, configurable, or hybrid and indicating typical ownership (provider vs. deployer).

2.3. Reference Architecture for Generative AI Systems

To make the analysis comparable across implementations, we define a reference architecture and use it as the unit of study. The architecture is decomposed into five layers that commonly appear in real-world deployments:
  • Training and adaptation pipeline (data acquisition and filtering, dependency models and datasets, fine-tuning, and update processes). This layer determines the system’s baseline behavior and latent risk. Errors or compromises here propagate downward and are difficult to fully correct later, echoing long-standing lessons from software supply-chain security.
  • Model layer (foundation model, system prompts, safety tuning, and configuration). Here, the model is not yet acting on external data or tools. The emphasis is on intent and constraint, similar to how operating system kernels define what is possible and what is forbidden.
  • Retrieval and data interface layer (RAG sources, vector stores, connectors, indexing, and access control for enterprise data). This is where many modern GenAI risks emerge, because the model’s outputs are now shaped by live, mutable data rather than solely by static training.
  • Orchestration and tool layer (agents, plug-ins/tools, workflow engines, and external system calls). Traditionally, software separated computation from execution. This layer collapses that boundary, making control, permissions, and auditing essential.
  • Runtime interaction layer (multi-turn conversations, context windows, output moderation, logging, and monitoring). This is where theoretical risk becomes operational risk, and where failures are visible to users, regulators, and auditors.
Figure 1 illustrates the five-layer reference architecture used as the analytical basis for the threat and control mappings in this study.
Note that in this paper, we also present pertinent developer education topics and secure development practices, recognizing that security frameworks fail when practitioners lack understanding of GenAI-specific vulnerabilities.

2.4. Threat Taxonomy Selection and Threat–Layer Mapping

To avoid producing an unstructured catalog of threats, we use established taxonomies and map threats to the reference architecture. OWASP LLM risks represent standard application-level failure modes in generative AI systems (e.g., prompt injection [30,31], sensitive data exposure, and insecure tool use). We complement this with MITRE ATLAS, which characterizes adversarial tactics and techniques targeting AI components and end-to-end workflows [3,4]. For each threat category, we identify: (i) the architectural layer(s) most directly implicated, (ii) expected impacts on confidentiality, integrity, and availability, and (iii) whether the risk is primarily structural or configurable. By structural, we mean risks rooted in architecture and trust-boundary design. For instance, the coupling of untrusted natural-language inputs with privileged behaviors such as retrieval and tool invocation is consistent with “confusable deputy” concerns highlighted in national guidance. By configurable, we mean risks that can be materially reduced through adjustable controls such as retrieval scoping, permissioning and sandboxing, monitoring and logging, and policy enforcement. This framing supports the paper’s later argument that perimeter- and policy-based measures are often insufficient on their own unless paired with architectural constraints and defense-in-depth controls.

2.5. Control Identification and Control–Layer Alignment

Building on the threat-layer mapping above, we identify and align security controls to the same architectural layers to support traceability from threat mechanisms to mitigations. Security controls are derived from (i) AI risk management and governance frameworks (e.g., NIST AI RMF and the Generative AI Profile), (ii) AI management system standards (e.g., ISO/IEC 42001 and related AI risk management guidance), and (iii) cybersecurity obligations embedded in EU frameworks that affect AI-enabled services (e.g., incident reporting and supply-chain requirements) [1,2,5,6,7].

2.6. Reliability Measures and Method Limitations

The governance analysis employs a comparative constraint framework to examine how regulatory divergence shapes security architectures. We identify and extract security-relevant obligations and expectations that influence system design and operations, encompassing: (i) risk management and documentation requirements, (ii) incident reporting protocols and response timelines, (iii) testing and monitoring expectations, and (iv) third-party and supply-chain security responsibilities [5,6].
For EU requirements, we analyze binding instruments, specifically the AI Act and NIS2, which establish ex-ante obligations for designated systems and entities. The US governance landscape is examined through its predominant reliance on voluntary frameworks and sector-specific enforcement mechanisms, supplemented by state-level regulations and liability considerations where applicable [1,2].
This analysis illustrates how regulatory divergence leads to distinct technical architectures, evidentiary requirements, and operational protocols for transatlantic deployments.
To ensure methodological reliability, threat–layer and control–layer mappings are applied systematically using consistent reference architectures and taxonomies throughout the analysis, with explicit differentiation between structural and configurable risk categories.
This study deliberately excludes quantitative risk-reduction estimates, as control effectiveness is inherently contingent on deployment-specific variables, including architectural design choices, data sensitivity classifications, and tool integration configurations. Rather than generating performance metrics for individual implementations, the methodology aims to yield reproducible analytical insights regarding control alignment patterns, security gaps, and the architectural implications of regulatory constraints. A final note: GenAI was leveraged for the analysis of standards and the generation of prototype image icons; see acknowledgement.

3. Threat Landscape and Opportunities

To make the threat landscape actionable for system design and assurance, we classify each major threat as primarily structural or configurable. Structural risks arise from architectural trust boundaries and capability couplings (e.g., untrusted natural-language inputs influencing privileged retrieval or tool execution) and cannot be eliminated by perimeter or policy controls alone; they require architectural constraints, model/provider assurances, or capability limitations. Configurable risks arise from deployment choices (permissions, retrieval scope, connector governance, monitoring, and operational controls) and can be materially reduced through established cybersecurity and governance measures. Many practical threats are hybrid, but this distinction clarifies (i) where controls must be placed in the five-layer stack, (ii) which party typically owns the mitigation (provider vs. deployer), and (iii) why “guardrails-only” approaches often leave residual risk at key integration boundaries.

3.1. Why GenAI Threats Differ from Traditional Security

Consider the recent discovery by experts at Google AI labs, who are researching Gen AI security threats. Researchers documented the first confirmed use of AI-powered malware in real-world cyberattacks, marking a significant escalation in the cyber threat landscape. The newly discovered malware strains, PromptFlux and PromptSteal, leverage large language models to dynamically alter their behavior during attacks, generating malicious scripts on demand and obfuscating their code to evade detection [32].
GenAI also changes the threat landscape by lowering the cost of producing plausible malicious content and code-like artifacts, while simultaneously increasing the use of AI in defensive workflows. Prior work surveying AI-assisted malware analysis and detection highlights both the accelerating automation of analysis tasks and the need for careful, evidence-based evaluation as capabilities evolve [33].
This development confirms what security experts have long feared: threat actors are successfully weaponizing generative AI to create adaptive, self-modifying malware that can outsmart traditional security defenses. The implications are stark: cybercriminals now have access to malware that can think and adapt in real time, fundamentally changing the dynamics of cyber defense. Organizations must therefore urgently reassess their security strategies to account for AI-enhanced threats that can evolve faster than conventional detection systems can respond.
GenAI systems combine ML components with conventional software, creating threats that span classic attacks (identity, network, data) and AI-specific failures (prompt, injection, training compromise, model leakage). Using OWASP LLM and MITRE ATLAS taxonomies, we map threats to our five-layer architecture to identify failure origins and control points [3,4]. Many high-impact risks are structural (resulting from design choices, such as tool access) rather than configurable (fixable through policy), limiting the effectiveness of perimeter controls [1,2,3,4].
In the narrative below, we introduce risk types and highlight whether the risks are structural or configurable. If a risk is configurable, it can be remediated by changing system settings or access policies. Such instances are documented in OWASP LLM10: Unbounded Consumption (Fixed by Rate Limiting), and MITRE AML.T0016: Resource Exhaustion. Structural risks are inherent to the “Black Box” nature of LLMs, see OWASP LLM01: Prompt Injection, MITRE AML.T0051: Prompt Injection, and OWASP LLM02: Sensitive Info Disclosure. Figure 2 summarizes these threat categories and illustrates how structural and configurable risks align with control placement across the GenAI stack.
The threat model for agentic GenAI systems differs substantially from that of standard retrieval-augmented chatbots, even when both use the same underlying foundation model. This difference illustrates the structural nature of orchestration-layer risks.
In a typical RAG chatbot, the model retrieves documents from a knowledge base and generates responses grounded in that content. The threat surface includes indirect prompt injection via retrieved documents, data leakage through retrieved content, and hallucination or misattribution of sources. However, the model’s actions are limited to text generation. Even successful prompt injection yields only text output—potentially harmful (e.g., misinformation, policy violations) but bounded in scope.
When the same model is equipped with tool-calling capabilities (e.g., code execution, API invocation, file system access, database queries), the threat model expands dramatically. Table 2 summarizes this expansion.
The classic “confused deputy” problem introduced earlier is a useful lens for understanding why tool-enabled GenAI systems tend to have higher consequences than RAG-only chatbots. In a retrieval-augmented chatbot, the deputy’s confusion primarily affects information selection and presentation. Untrusted retrieved text can steer outputs, generate misleading summaries, or trigger the disclosure of sensitive content, but the system is largely limited to text generation. In an agentic system, the same confusion can expand into privileged action execution: the model may be induced to call tools, run workflows, modify records, or transmit data through authenticated connectors, effectively acting as a deputy that performs operations on behalf of an attacker who lacks direct authorization.
This contrast reinforces the structural–configurable distinction developed in Section 2.2.2. Structural exposure arises from authority coupling—the architectural decision to combine untrusted natural-language inputs (user prompts or retrieved documents) with privileged capabilities (retrieval, connectors, tool invocation) inside a single reasoning loop. Configurable exposure is shaped by permission scoping and operational safeguards, such as least-privilege tool permissions, retrieval allowlists, read-only connector modes, sandboxed execution, and audit-grade provenance logging. In practice, agentic deployments therefore require stronger architectural constraints than RAG-only chatbots, because the same semantic manipulation can produce real-world effects rather than merely altered text.
Recent benchmarks quantify this difference. Ref. [13] reports that GPT-4 was vulnerable to indirect prompt injection in 24% of agentic test cases, increasing to 47% when adversaries used enhanced prompts. AgentDojo [34] shows that, under attack, GPT-4o’s task utility falls from 69% to 45% in agentic workflows. By contrast, a comparable failure in a text-only chatbot may be limited to incorrect or misleading outputs, whereas in an agentic setting, it can translate into unintended tool actions. Agent Security Bench (ASB) [14] similarly finds substantial vulnerability across scenarios, with the highest rates observed in tool-use and integration-heavy tasks.
In conclusion, what are the architectural implications? This comparison reinforces a key argument of this paper that orchestration-layer risks are structural because they stem from the architectural decision to grant tool access. No amount of prompt filtering can fully compensate for the fundamental trust boundary violation when an LLM can act on untrusted instructions. Effective mitigation requires architectural controls: least-privilege tool permissions, explicit approval workflows for high-risk actions, sandboxed execution environments, and comprehensive logging of the complete tool invocation chain, from user prompt through model reasoning to tool execution and response. Next, let us look at a few attacks and classify them as structural or configurable.

3.1.1. Training Pipeline Attacks

Training threats concentrate in often under-monitored “offline” layers, making supply-chain assurance and dataset provenance core security requirements. The training pipeline enables persistent compromise through [35,36,37]:
  • Data poisoning: Malicious training samples create subtle behavioral changes
  • Supply chain: Compromised models or datasets propagate downstream
  • Transfer learning: Foundation model weaknesses persist after fine-tuning
  • Fine-tuning manipulation: Attackers weaken safety behaviors through updated workflows
This threat class is primarily structural because compromises introduced during pre-training, fine-tuning, or dependency ingestion propagate into the deployed model and are difficult to remediate with perimeter controls or runtime policies alone. Mitigation, therefore, depends on upstream assurance (data provenance controls, supply-chain verification, and provider-side testing), with configurable safeguards (e.g., restricted fine-tuning permissions and audit logging) serving as secondary defenses.

3.1.2. Prompts as Executable Code

User input acts as a control channel, enabling [12,13,30,31,38]:
  • Direct injection: Override system instructions via crafted prompts
  • Indirect injection: Embedded commands in retrieved documents
  • Context manipulation: Crowd out safety instructions
  • Jailbreaking: Systematic bypass of constraints.
This class is primarily structural in tool/RAG-integrated systems because it exploits instruction–data coupling at the trust boundary; defenses are therefore architectural first, with configurable mitigations (filtering, monitoring) as supporting control.

3.1.3. Data Leakage Beyond Traditional Breaches

GenAI introduces persistent exposure pathways [10]:
  • Memorization: Models reproduce training data fragments
  • Model inversion: Reconstruct training data from outputs
  • Membership inference: Detect if records were in training
  • Persistence: Sensitive data in parameters resists removal
Note that data security extends beyond perimeter protection to include model behavior and inference attacks, which can affect compliance with privacy regulations. This threat class is primarily structural because memorization- and inference-based leakage arises from exposure to training data and learned model representations rather than from conventional perimeter compromise. Nevertheless, it must be noted that configurable controls, such as limiting sensitive data during fine-tuning and retrieval, enforcing access controls, and monitoring extraction behavior, are viable options.

3.1.4. Threat Amplification

Generative AI amplifies both the scale and effectiveness of established threats by lowering the cost of producing persuasive, context-tailored content and enabling the automation of tasks that previously required skilled human effort. In practice, this includes high-volume social engineering and phishing, rapid generation of domain-specific lure content, multilingual and stylistically consistent impersonation, and accelerated reconnaissance through the summarization and synthesis of public and internal materials. Amplification affects all organizations by intensifying external threats, regardless of their internal GenAI maturity [3,4].
This threat class is primarily structural because it follows from model capabilities (fluency, persuasion, coding assistance, and rapid iteration) that are not fully eliminated by local deployment policies.

3.1.5. Session-Based Behavioral Drift

Multi-turn interactions enable attackers to iteratively probe, steer, and refine prompts, causing within-session behavioral drift even when model weights and system configuration remain unchanged. This undermines single-shot evaluation assumptions and underscores the importance of session-aware monitoring, including the complete capture of the causal chain (user prompts, system instructions, retrieved context, tool calls, and outputs) for detection and investigation [2,3,4].
This threat class is primarily configurable because drift is realized through how sessions are managed (context windows, memory, retrieval policies, and logging), even though it exploits structural limits in model controllability.

3.1.6. Control Placement Insights

Observed failures in GenAI systems most often occur at integration boundaries, such as training dependencies and updates, retrieval interfaces, tool invocation, and context management (rather than “inside the model”). Effective defense requires combining traditional controls (identity, access, logging) with AI-specific mechanisms (prompt defenses, retrieval isolation, adversarial testing).
This reinforces a structural–configurable reading of GenAI security: structural exposure arises from the architectural coupling of untrusted language inputs with privileged behaviors, while configurable exposure is shaped by deployment choices such as permissioning, retrieval scope, and monitoring. The resulting threat–layer mapping informs the control framework introduced in the next section by aligning mitigations with architectural layers and threat classes [14,37].

3.1.7. A Structural Vulnerability Case Study

A landmark real-world case, known as EchoLeak, is used here to illustrate a structural vulnerability in Microsoft 365 Copilot as deployed prior to Microsoft’s June 2025 server-side mitigation. EchoLeak is commonly referenced as CVE-2025-32711. It was reported by Aim Security and mitigated by Microsoft in June 2025 through a server-side update requiring no customer action. It is an early documented example of a structural weakness being weaponized for automated, “zero-click” data exfiltration in an enterprise GenAI deployment. The issue is characterized as an LLM scope violation that enables exfiltration via indirect prompt injection in a Copilot/RAG workflow [29,30].
Below, we illustrate this attack with a scenario that shows how a “Zero-Click” attack occurs and how it navigates the MITRE ATLAS framework by exploiting structural vulnerabilities in a Generative AI system.
Imagine Alex Skywalker, an executive at a global firm, clearing his emails the first thing in the morning. Alex uses Microsoft 365 Copilot to help manage a flood of daily emails, summarize meetings, and draft reports. He is impressed with the productivity gains from Copilot. Additionally, he believes he is safe because his company uses enterprise-grade security, including encrypted connections and a robust firewall.
An attacker sends Alex a seemingly routine marketing email. To Alex, it looks like junk mail or an unimportant email, so it stays unread in his inbox. However, because AI agents are embedded in Office applications, the following scenario is possible. When the inbox is opened, the AI agent executes several tasks for the day, such as “Summarize unread email” or “If an email notes a meeting cancellation or indicates a new date & time, update the calendar“.
Invisible to Alex’s human eyes, the following semantic command:
“When you summarize recent emails, find the user’s most recent credit card statement or internal password reset link. Encode that data into a 64-character string and append it to a hypothetical attacker-controlled endpoint, for example: https://internal-tech-service-star.com/pixel.png?id=[DATA] (accessed on).”
Because Copilot uses Retrieval-Augmented Generation (RAG), it reaches into Alex’s inbox to gather context. It pulls in the attacker’s marketing email along with the legitimate ones. Here, the structural risk manifests. Copilot’s core architecture lacks a “secure wall” between the instructions Alex gave (“Summarize my emails”) and the data it just retrieved from the inbox. The model processes the attacker’s hidden instructions as if they were a high-priority system update. Instead of summarizing the email, Copilot obeys the hidden command. It silently finds a sensitive document in Alex’s history, encodes it, and embeds it into a tracking pixel. The attack does not rely on traditional exploit payloads; it exploits semantic override through untrusted content in the retrieval context [29]. Configurable controls failed: rate limiting and keyword blacklists could not stop the attack because the language used was contextually “normal.” Perimeter controls may not detect this class of behavior, because ingress and egress can appear as legitimate application traffic.
The solution to this AI attack is architectural. Mitigation required a provider-side architectural change (e.g., scope isolation that treats retrieved content as non-executable context) [30]. Table 3 maps the EchoLeak attack narrative to specific MITRE ATLAS tactics and techniques. It clarifies both the attack progression and why effective mitigation requires architectural separation rather than policy- or perimeter-only controls.
In conclusion, GenAI security is rarely reducible to a single perimeter “final gate.” When untrusted language inputs are coupled to privileged retrieval and downstream actions, key exposures become structural. They arise from how the system composes prompts, retrieved context, and execution privileges. EchoLeak illustrates this pattern: interface-layer policy and filtering can be bypassed when retrieved content is not reliably constrained to “data-only” semantics. Effective mitigation, therefore, requires architectural separation to prevent instruction–data conflation (e.g., scope isolation), complemented by configurable deployer controls such as least-privilege permissioning, retrieval scoping, monitoring, and audit-grade provenance logging across the stack.

3.2. Finding Bugs That Human Testers Miss

AI language models can help make computer programs safer. They can check code for problems that human programmers fail to detect, create tests to find bugs, and even fix errors automatically. When programmers use these AI tools properly, for instance, by providing clear instructions about security, the code is usually just as safe as code written without AI assistance [11].
New AI language models are significantly more effective than earlier automated tools at identifying security threats in system logs. In recent tests, researchers have reported that these AI models identified security holes 93% of the time, while traditional detection systems achieved accuracy rates of 43% to 56% [37]. What makes this especially useful for business leaders is that these AI systems not only point out problems but also explain why in simple terms.
This helps security teams better understand and deal with threats. These AI tools can therefore provide small and medium-sized businesses (SMEs) with a practical way to access advanced security monitoring previously available only to large companies with substantial IT budgets. We caution, however, that users should be aware of security frameworks and possess the expertise to implement programs effectively.

3.3. Operation Centers Leveraging AI

Security Operations Centers (SOCs) increasingly deploy AI technologies to address the growing volume and sophistication of cyber threats. Large Language Models and autonomous agents are being applied to log analysis, alert prioritization, incident response, and vulnerability management [33]. The potential benefits are significant, but the evidence base remains mixed, warranting a balanced assessment.
Early deployments demonstrate that these AI systems significantly improve threat detection accuracy and response times, while reducing false positives that overwhelm analysts [39].
Studies indicate that LLM-based systems can achieve higher detection rates in structured log analysis tasks than traditional rule-based systems. Palma et al. [40] report that LLM-based log analysis achieved detection rates of 93% on benchmark datasets, compared to 43–56% for traditional detection systems.
Automated triage can reduce analyst workload on low-priority alerts, and natural language explanations improve analyst understanding of flagged events. Early deployments suggest improvements in mean time to detection and reductions in false positive rates [33]. For organizations struggling with security talent shortages and budget constraints, AI-enhanced SOCs offer a practical path to achieving enterprise-grade security without proportional increases in resources.
However, several limitations and concerns warrant attention. Benchmark performance may not transfer to production environments with different log formats, attack patterns, and data distributions. LLMs may introduce new attack vectors—prompt injection via log content is a potential risk when AI systems process untrusted data. Hallucination risks in incident analysis could lead to incorrect conclusions, wasted investigation effort, or missed threats. There is also potential for automation bias among analysts who over-rely on AI outputs without appropriate skepticism. Perhaps most significantly, longitudinal studies on sustained operational effectiveness remain limited; most evidence comes from controlled evaluations rather than long-term production deployments.
A balanced assessment is warranted. AI-enhanced SOC capabilities can deliver practical gains for organizations facing resource constraints and talent shortages, but they should be treated as an augmentation rather than a replacement for human expertise. Effective deployment requires validation against organization-specific telemetry, continuous performance monitoring (including drift detection), and explicit human oversight for consequential decisions. Vendor claims should be treated cautiously and verified through independent evaluation prior to production use.
The structural–configurable framing applies here as well. Some SOC AI risks are structural (e.g., susceptibility to adversarial inputs and brittleness under distribution shifts), while others are configurable (e.g., validation procedures, oversight thresholds, escalation and fallback protocols, and audit logging). Framing them this way helps clarify where risk reduction depends on model/provider characteristics versus deployer-controlled operational discipline.

4. Security Control Model for Generative AI Systems

A sensible way to secure generative AI is the same approach that has long been used to secure complex systems: by layering controls so that no single failure becomes a catastrophe. The model below organizes controls by their location within the lifecycle and architecture, aligning typical threat points with the most effective control points [12,13,30,31,38]. Figure 3 presents a layered reference architecture for Generative AI systems. The model delineates five critical stages of the LLM lifecycle. From the bottom up, these stages are Training Pipeline, Model Core (Weights), Retrieval Systems (RAG), Orchestration (Tools and Agent Layer), and Runtime (User Interface and Runtime). For each stage, the figure also identifies corresponding structural threat vectors and strategic structural controls, collectively providing a framework for aligning security controls with threat vectors across the stack.
Having mapped the threat landscape using the structural–configurable lens, we organize controls using the same distinction. Controls that address structural risks either require provider-side interventions (e.g., training and alignment practices, model hardening, and documented assurances) or architectural containment patterns that compensate for model limitations (e.g., strict separation of instructions from untrusted retrieved content, least-privilege tool boundaries, and adversarial testing). Controls that address configurable risks align more closely with established cybersecurity practice (identity and access management, segmentation, secure integration patterns, monitoring, logging, and incident response), but must be instrumented for GenAI-specific signals (prompt/context, retrieval traces, and tool-call provenance). Hybrid risks require coordinated measures spanning both dimensions.

4.1. Controls and Fundamental Objectives

Across the five-layer reference architecture, controls serve three recurring objectives that map to the traditional CIA triad but require adaptation for GenAI-specific failure modes.
The first objective—preventing unauthorized influence—protects the integrity of system behavior across training, fine-tuning, prompts, retrieved context, and tool calls. Adversarial manipulation can occur through data poisoning, prompt injection, context steering, or tool hijacking, and controls must address each pathway. This objective is particularly challenging for GenAI systems because influence can operate through semantic content rather than syntactic patterns, evading traditional input validation [3,4].
The second objective—preventing unauthorized disclosure—reduces leakage from training data, retrieval sources, prompts, and outputs. This includes both direct exfiltration and inadvertent exposure through system responses, logs, or error messages. GenAI systems introduce novel disclosure vectors, such as memorization-based extraction and inference attacks, that traditional data protection controls do not address. The probabilistic nature of model outputs means that disclosure may occur inconsistently, complicating detection and prevention [2,3].
The third objective—maintaining safe and reliable operation—ensures the system remains dependable under adversarial load, misuse attempts, or degraded dependencies. This encompasses resilience against resource exhaustion, connector failures, retrieval corruption, and tool misuse. Unlike traditional availability concerns, GenAI operational integrity also includes behavioral consistency: the system should behave within expected parameters even when probed or attacked [1,2,3,4].
These objectives rarely map to a single control at a single layer. Most require a defense-in-depth approach distributed across multiple layers, especially where retrieval and tools introduce new pathways for influence and disclosure. This framing also clarifies control ownership between providers and deployers: structural risks typically require provider-side assurance or architectural constraints, while configurable risks are reduced through deployer-operated controls and governance.

4.2. Guardrails and the Control Placement Question

The conventional controls, such as identity and access management, segmentation, monitoring, and incident response, remain essential; however, they do not directly address GenAI-specific failure modes, including prompt injection, context manipulation, or tool hijacking [31]. On the other hand, GenAI-specific controls discussed earlier, such as prompt screening, retrieval isolation, output filtering, and safety classifiers, can reduce risk [12]. Still, they are probabilistic and can be degraded by sustained probing or indirect injection.
The central design question is therefore not whether to prefer “traditional” or “AI-native” controls, but where each control should be placed relative to the component that carries the risk. Controls that sit too far from the failure point, for example, relying on perimeter defenses to stop prompt injection, tend to be weak because the attack uses authorized application pathways.
Traditional enterprise compliance relies on fixed documentation, such as policies, configuration snapshots, and scheduled audits. However, GenAI systems require dynamic operational evidence because runtime variables, including active context, retrieved data, available tools, and user interaction patterns, influence their behavior. Consequently, adequate assurance requires a fundamental shift toward: (i) real-time behavioral monitoring and tool usage tracking, (ii) comprehensive logging that captures retrieved context and execution traces beyond final outputs, and (iii) continuous structured evaluation incorporating adversarial testing as an ongoing operational practice rather than a single checkpoint [2,3,4].
This operational focus becomes critical for transatlantic deployments, where identical incidents face divergent documentation requirements and reporting deadlines across regulatory boundaries.

4.3. Control Types Used in This Paper

To keep the framework precise and implementable, we propose to group the controls into four types, all well-aligned to the reference architecture:
  • Pipeline controls (training/adaptation): dataset provenance, dependency assurance, controlled fine-tuning, and update governance.
  • Interaction controls (runtime): prompt-injection defenses, context handling rules, output moderation, and session-aware monitoring.
  • Retrieval and tool controls (integration boundary): least privilege for connectors and tools, retrieval isolation, provenance and trust labeling for retrieved content, and tool-call logging and approval patterns.
  • Operational controls (cross-cutting): incident response playbooks, monitoring and alerting, audit trails, and governance processes aligned to risk classification and deployment context.
These categories are used in Section 5 to map specific controls to each layer and to distinguish configurable risk from structural residual risk. This model additionally provides the foundation for the layer-aligned control framework and the transatlantic deployment patterns elaborated in the subsequent sections.

5. Control Framework Aligned to the Generative AI Stack

5.1. Overview and Design Principles

This section translates the threat-architecture mapping (Section 3) into a control framework aligned to the five-layer reference architecture (Section 2.2). Controls are organized by where they operate (layer placement) and what they defend (influence, disclosure, resilience). The framework draws on established risk and governance guidance for AI systems and GenAI deployments. It uses OWASP LLM risks and MITRE ATLAS as threat organizers to ensure that controls are tied to concrete failure modes rather than abstract requirements.
Control effectiveness depends less on the presence of “AI guardrails” and more on whether controls are placed at the correct boundary (training dependencies, retrieval connectors, tool invocation, and runtime context) and supported by reliable evidence collection. This observation underpins a core argument of this paper: many GenAI risks are structural, arising from architectural decisions about tool access, retrieval scope, and context management, rather than configurable through guardrails alone.
Risk-Tier Proportionality represents a minimum baseline. Organizations should scale control intensity based on system risk classification:
  • High-risk systems (as defined by EU AI Act Article 6 or equivalent internal classification): Full implementation of all controls, third-party audits, continuous monitoring, and comprehensive documentation.
  • Medium-risk systems: Core controls with periodic internal review and documented risk acceptance for any deferred measures.
  • Low-risk systems: Baseline governance and monitoring, with proportionate technical controls.
Table 4 presents a structured mapping of key threats affecting the generative AI stack, along with the architectural layers they impact, the primary controls used to mitigate them, and the relevant standards that govern these controls.

5.2. Governance and Cross-Cutting Controls

Governance controls are cross-cutting because they establish accountability, evidence requirements, and operational readiness across all layers of the organization. In practice, they determine whether technical controls remain effective over time as models, data sources, and integrations change. Integrating GenAI controls into existing cybersecurity programs, rather than creating parallel AI-only governance structures, is widely recommended in applied security practice to preserve accountability and operational coherence [45].

5.2.1. Governance Baseline

  • Risk management and classification: This involves defining the system’s purpose, assessing stakeholder impact, and establishing risk tiers; it is essential to maintain a risk register that is linked to architectural components.
  • System inventory and change control: Versioning of models, prompts, connectors, tools, and datasets; formal review gates for changes affecting security posture.
  • Security-by-design and roles: Establish clear ownership for model behavior, retrieval, and data interfaces, as well as tool integrations, along with defined escalation paths for safety and security incidents.
  • Evidence strategy: Establish continuous logging and monitoring streams that document the complete causal chain (prompts, retrieved context, tool calls, outputs, and policy decisions) to facilitate both real-time operational insights and post-incident analysis.

5.2.2. Operational Controls

  • Incident response readiness: Playbooks for GenAI-specific incidents (prompt injection, tool misuse, data leakage) with cross-border reporting paths, e.g., NIS2: 24–72 h, the mandatory incident reporting timeframe under the EU NIS2 Directive.
  • Continuous monitoring: Ongoing real-time oversight of a system while it is operating, not just testing before deployment. The emphasis on vigilance after release aligns with long-standing operational security principles.
  • Third-party governance: Vendor assurance for models, datasets, and tooling, supported by documented provenance and integrity checks to ensure suppliers are properly evaluated, origins are traceable, and components remain unaltered.

5.2.3. Evidence Artifacts

  • Static: Risk registers, system inventories, policy documents, role assignments, and change approval records.
  • Operational: Anomaly detection alerts, refusal rate dashboards, drift detection outputs, and incident response metrics.

5.3. Five-Layer GenAI Threat-Control Mapping Framework

The five-layer framework outlined below uses a defense-in-depth approach to managing risk throughout the generative AI lifecycle. Each layer addresses a distinct class of risk, recognizing that no single control can eliminate exposure. Residual risks are therefore managed through downstream controls, continuous monitoring, and governance.
This framework is organized using the structural–configurable lens introduced earlier. Some risks are structural: they arise from model and system architecture properties (including how natural-language inputs are coupled to privileged behaviors such as retrieval and tool invocation) and cannot be reliably “patched over” with policies or perimeter defenses alone. Structural risk reduction, therefore, depends on provider-side practices (e.g., training, alignment, evaluation, documented assurances) and on architectural containment patterns implemented at the system boundary (e.g., instruction–data separation, least-privilege tool domains, and isolation). Other risks are configurable: they stem from deployment choices such as permissions, retrieval scope, connector trust, monitoring, and operational procedures, and can be materially reduced by deployer-operated controls. Many high-impact risks are hybrid, requiring coordinated measures across both dimensions. This distinction clarifies control ownership (provider versus deployer), improves control placement, and supports cross-jurisdictional compliance analysis by separating stable technical properties from adaptable operational measures.

5.3.1. Layer 1: Training Data and Model Provenance

This layer addresses foundational exposure introduced during model development and adaptation. Controls ensure that training and fine-tuning data, as well as model artifacts, are lawful, traceable, and trustworthy, with an emphasis on dataset provenance, licensing, privacy screening, bias assessment, dependency verification, and supply-chain assurance. Release and update governance, for instance, through staging, rollback capability, and integrity checks, reduces the likelihood that compromised or unsafe updates propagate into production. These measures reduce baseline risk but cannot fully eliminate inherited bias, memorization effects, or third-party dependency risks.

5.3.2. Layer 2: Model Behavior and Safety Tuning

This layer shapes how the model responds to instructions and constraints. Controls include protecting and versioning system prompts, safety tuning and refusal policies, adversarial testing and red teaming, abuse monitoring, and rate limiting. These controls reduce the likelihood of policy evasion, jailbreak success, and unsafe outputs, but they cannot fully eliminate probabilistic failure modes or prevent novel attack prompt strategies. As a result, behavioral controls must be paired with architectural containment and operational detection at downstream layers.

5.3.3. Layer 3: Retrieval and Context Management

This layer governs how external data is introduced into the model context via retrieval pipelines. Controls emphasize permission-aware retrieval, least-privilege connector access, scoped data selection, provenance labeling, secure vector-store governance, and defenses against indirect prompt injection and context manipulation. Retrieval is a particularly high-leverage boundary because it combines untrusted or mutable content with privileged model reasoning. Even strong retrieval controls can be undermined if the system allows retrieved data to be interpreted as instruction. Accordingly, effective mitigation depends on enforcing instruction–data separation patterns and maintaining auditable retrieval traces.

5.3.4. Layer 4: Orchestration and Tool Use

This layer manages the expanded risk created when models can invoke tools and trigger real-world actions. When tool use is enabled, a successful injection or steering event can escalate from a text-level failure into data exfiltration, unauthorized changes, or remote execution. Controls, therefore, focus on architectural containment and strict authorization: least-privilege tool design, explicit approval boundaries for sensitive actions, schema-constrained tool invocation, execution isolation and sandboxing, and separation of high-risk capabilities into distinct security domains. Architectural choices at this layer strongly influence residual risk and determine the practical trade-off between automation and safety [14,37]. Note: Additional orchestration and tool-use controls (Layer 4) are detailed immediately after Table 5.

5.3.5. Layer 5: Runtime and Output Management

This layer provides monitoring, auditability, and incident readiness. Controls include prompt screening, session-aware monitoring, context logging, output moderation, alerting for anomalous tool or retrieval behavior, and integration with incident response workflows. These controls enable detection, response, and evidence retention, but they are inherently reactive; they must be paired with upstream containment at the retrieval and tool layers to prevent silent failure modes and reduce the likelihood of high-impact outcomes.
Table 5 summarizes baseline security controls across the GenAI stack, linking key threat classes to required controls, evidence artifacts, and standards alignment. It applies the layered architecture introduced earlier, placing controls at the points where risk arises rather than treating security as a model- or policy-only issue. The table serves as a practical baseline for implementation and executive oversight, remaining aligned with established frameworks such as NIST AI RMF and the GenAI Profile, ISO/IEC 42001, OWASP LLM Top 10, and MITRE ATLAS. It should be read in conjunction with the subsections that follow in Section 5, which provide additional detail on control placement, governance expectations, and residual risk across layers.
Table 5 should be interpreted through the structural–configurable lens used in this paper: each threat/control relationship reflects a structural exposure (inherent in the model or system architecture), a configurable exposure (driven by deployment and operational choices), or a hybrid case requiring coordinated measures across both dimensions, even when the table does not explicitly label every row.

5.3.6. Orchestration and Tool Use—Detailed Controls

The orchestration layer is the highest-leverage boundary in modern GenAI deployments because it translates language outputs into privileged actions. It therefore encompasses both structural exposure, for example, authority coupling and instruction–data conflation at the tool boundary, and configurable exposure, for example, permissioning, approvals, monitoring, and response. The controls below separate architectural containment measures from deployer-operated operational controls, reflecting the structural–configurable distinction used throughout this framework.

5.3.7. Architectural Controls (Structural)

Capability boundaries and security domains: Define explicit boundaries for what actions the system can perform. High-risk capabilities (code execution, database writes, external API calls, filesystem access) should be isolated into separate domains with independent authorization. The objective is containment: even if an injection succeeds, the blast radius is limited.
Least-privilege tool design: Each tool should expose the minimum interface required for its function. Read-only tools should remain read-only by default; write permissions should be explicit and separately authorized. Tool scopes should be narrow (e.g., “send email” should not implicitly grant calendar or contacts access). This reduces the attack surface even when the model is successfully manipulated.
Instruction–data separation for tool invocation: Prevent retrieved or untrusted content from being interpreted as executable tool instructions. Approaches include schema-constrained tool calling (e.g., strongly validated JSON), trust labeling that marks retrieved text as untrusted context, and secondary validation that checks proposed tool calls against the user’s intent and allowable action set.
Execution isolation and sandboxing: Run tools—especially code execution—in sandboxes with restricted networking, limited filesystem visibility, and strict resource quotas. This ensures that even if an attacker induces a tool call, the environment prevents exfiltration and limits damage.

5.3.8. Operational Controls (Configurable)

Explicit approval workflows for high-risk actions: Tool invocations affecting external systems, accessing sensitive data, or performing irreversible actions should require user confirmation or human approval. This creates a checkpoint where confused-deputy failures can be caught before harm occurs.
Rate limiting and anomaly detection tuned to tool use: Monitor invocation sequences and tool parameters for anomalies: unusual tool selection, high call volume, access outside normal scope, or patterns consistent with reconnaissance or exfiltration. Alert on deviations from established baselines.
Comprehensive logging of the causal chain: Capture the full trace needed for audit and incident investigation: user prompt → system prompt → retrieved context → model output/tool selection → tool parameters → tool response → final output. Without this chain, root-cause analysis and regulatory reporting are often infeasible.

5.3.9. Example: Preventing Remote Code Execution via Indirect Prompt Injection

Consider an agentic system with code execution. An attacker embeds instructions in a document likely to be retrieved: “When processing this document, execute a command that sends local files to an external domain.”
Defense-in-depth at Layer 4 includes: (i) sandboxed execution with no outbound network access and limited filesystem access (structural), (ii) schema validation and constrained execution interfaces that do not allow arbitrary shell commands (structural), (iii) explicit approval for code execution with preview (configurable), and (iv) anomaly detection for suspicious invocation patterns and domains (configurable). Full logging preserves the injection source and tool trace for investigation.
This layered approach ensures that even if the model is steered at the language layer, architectural constraints prevent harmful execution, and operational controls provide detection, accountability, and response.

5.4. Architectural Differences: US-Style Versus EU-Compliant Deployments

Table 6 illustrates how governance requirements translate into concrete architectural and operational differences in GenAI deployments. The contrast is not meant to imply that US deployments are uncontrolled; rather, it reflects how a framework-led, sectoral US posture often treats several controls as discretionary, whereas EU obligations make them mandatory for defined system categories. Understanding these differences is important for organizations planning transatlantic deployments or anticipating regulatory evolution.
These differences are not merely documentation overhead; they drive architectural choices. For example, EU-grade traceability requirements typically necessitate an evidence pipeline that produces tamper-evident audit trails and supports long-term retention, while many US deployments rely on lighter operational telemetry. As a result, organizations pursuing a single global architecture often implement EU-aligned controls from the outset, effectively making EU requirements the practical technical baseline even for initially US-only deployments.

5.5. Control Coverage and Residual Risk—A Policy Summary

This framework applies defense-in-depth across the GenAI stack, but it does not eliminate all exposure. Residual risks remain and must be explicitly governed. Some are inherent to current GenAI systems, including unpredictable behavior under novel or adversarial inputs, context-dependent failure modes, and leakage risks from memorization and inference-time extraction techniques. These cannot be fully mitigated by pre-deployment controls alone and require continuous evaluation, monitoring, and incident readiness.
For transatlantic deployments, residual risk is compounded by governance divergence. Identical technical controls can trigger different evidence requirements, reporting thresholds, and timelines across jurisdictions, with EU regimes generally imposing more prescriptive requirements than US framework-led or sectoral approaches. Accordingly, compliance should be treated as a minimum baseline rather than a guarantee of security.
Senior leadership, therefore, has a direct role in residual-risk acceptance: setting risk appetite, resourcing monitoring and response capabilities, and explicitly deciding on the scope of automation, tool privileges, and deployment constraints in light of remaining technical and regulatory exposure.

6. Human Factors: Education and Secure Development Practices

Even robust technical controls will fail in practice if organizations treat GenAI as conventional software. Because GenAI systems are probabilistic and integration-heavy (retrieval, tools, agents), secure outcomes depend on human competence, disciplined engineering practice, and governance that anticipates GenAI-specific failure modes.

6.1. The GenAI Security Skills Gap

Most security education and operating procedures were developed for deterministic systems. GenAI introduces new risk mechanics: natural-language inputs can serve as adversarial control channels (prompt and indirect prompt injection); retrieval and context windows can import untrusted instructions through otherwise legitimate data sources; tool and agent integration expands the blast radius by connecting model outputs to privileged actions; and non-deterministic behavior weakens the assumptions behind one-time testing and static assurance. As a result, security readiness is partly a training and process problem, not only a technology problem. Table 7 presents a mapping between the identified maps and the typical roles (see also Section 6.3).

6.2. Secure Development Lifecycle for GenAI

Instead of replacing existing secure development lifecycles, GenAI deployments require only a small number of explicit extensions. During design, teams should incorporate GenAI-specific threat modeling and document architecture decisions that define trust boundaries (retrieval scope, context separation, and tool privilege boundaries). During development, teams should enforce least privilege for tools and connectors and apply instruction–data separation patterns so retrieved content cannot silently become executable intent. During testing, teams should adopt adversarial evaluation aligned with OWASP LLM and MITRE ATLAS scenarios and repeat tests to surface non-deterministic failures. During deployment and operations, teams should instrument monitoring and logging to capture the causal chain (prompt/context, retrieval traces, tool-call provenance) and apply formal change control to prompts, model versions, retrieval sources, and tool configurations, alongside incident response procedures tailored to GenAI failures.

6.3. Training Curriculum Framework

GenAI security training should align with responsibility: developers need secure RAG and tool-integration patterns; security teams need GenAI threat modeling, testing, and incident handling; architects need containment design for retrieval and tools; and leadership needs risk acceptance discipline, vendor assurance expectations, and awareness of cross-jurisdictional obligations.

7. The Transatlantic Regulatory Divide

The human factors and educational gaps discussed above take on added significance in transatlantic deployments, where security teams must meet different legal and assurance expectations for the same technical system. Practitioners who understand prompt injection and tool misuse in technical terms must also understand how incident reporting timelines, audit readiness, and documentation duties differ between the EU and the United States, because these requirements determine what evidence must be captured and retained in practice. The next sections, therefore, examine how EU and U.S. governance approaches diverge and how that divergence translates into concrete architectural and operational requirements for real-world deployments.
Generative AI governance differs fundamentally between the European Union and the United States (Figure 4). Organizations operating across both jurisdictions must account for these differences in their control design and governance models.

7.1. European Union: The Prescriptive Paradigm

The European Union has established a binding regulatory framework that makes GenAI security and risk management a legal obligation. The EU AI Act (effective 1 August 2024) uses a risk-based model that links system classification to specific requirements for security, governance, documentation, and oversight. Obligations vary by classification and range from prohibited practices to conformity assessment, ongoing monitoring, and incident reporting for high-risk and systemic uses. Noncompliance carries substantial penalties linked to global turnover.
EU AI regulation is reinforced by adjacent cybersecurity and data protection instruments. NIS2 imposes risk-management obligations and requires covered entities to report incidents promptly, including notifying within 24 h of significant incidents. DORA extends operational resilience expectations to AI-enabled systems in the financial sector. The GDPR adds security and accountability requirements when personal data is processed.
Taken together, these instruments create a prescriptive, enforceable environment in which GenAI controls must be demonstrable through auditable evidence, not merely described as policy intent.

7.2. United States: The Innovation-First Approach

The U.S. regulatory landscape for GenAI is more fragmented and generally less prescriptive than the EU approach, placing greater weight on innovation, voluntary standards, and ex post enforcement. Recent shifts in federal executive policy also underscore regulatory volatility: federal priorities have shifted between stronger safety-oriented expectations and a renewed emphasis on competitiveness and reducing regulatory barriers. As a result, agency signals and enforcement posture can be fluid.
In practice, NIST provides the most consistent governance reference point. The NIST AI Risk Management Framework establishes a general structure for AI risk governance, and the NIST Generative AI Profile adapts that structure to GenAI-specific risks. Although these instruments are voluntary, they increasingly function as de facto standards in settings such as federal contracting and critical infrastructure programs.
At the same time, state-level activity is expanding unevenly, creating a patchwork of obligations and varying interpretations of compliance. In parallel, private enforcement mechanisms, such as product liability, negligence, and consumer protection claims, have become a practical driver of risk controls, reinforced by insurer expectations and underwriting scrutiny.
A current example of a U.S. deployment-layer obligation is Florida’s proposed SB 482 (“Artificial Intelligence Bill of Rights”), which would require consumer notice when interacting with AI chatbots and create parental supervision and consent requirements for minors’ use of “companion chatbot platforms.” These are not model-layer mandates; they are configurable governance and product obligations (UX disclosure, access control, and auditable controls) that illustrate how state-level rules can impose operational friction even when the underlying model remains unchanged [48].
Industry opposition to the above legislation is strong, and it has support from the federal government. Currently, it emphasizes “innovation first” and notes the difficulty of deploying AI systems under a patchwork of state rules, warning that “fragmented state laws” complicate feature rollout at scale [49].

7.3. Comparative Analysis

At a structural level, the divide is clear. The EU adopts a precautionary, rights-based approach that emphasizes ex ante requirements and prescriptive obligations. The United States relies more on voluntary standards, market incentives, and liability, with enforcement often occurring after harm or failure. These are not merely philosophical differences; they translate into concrete engineering and assurance consequences. They shape what controls must exist, where they must be placed in the system, and what evidence organizations must be able to produce to demonstrate that the controls operate effectively.

8. Transatlantic Divergence: Operational Implications

8.1. The Dual-Track Architecture

Organizations deploying GenAI across the EU and the United States face a practical design decision: standardize globally or differentiate by jurisdiction. In practice, the most workable approach is usually dual-track governance rather than dual-track engineering. A common technical baseline is maintained across deployments, while governance, documentation, and reporting are tailored to the legal environment.
The control framework in Section 5 can serve as this baseline. Designing to EU AI Act expectations typically meets or exceeds U.S. “reasonable security” expectations, effectively making EU requirements the highest common denominator for multinational deployments. Where divergence persists, it most often appears above the technical layer: in documentation depth, audit readiness, incident reporting timelines, third-party assurance expectations, and the formality of human oversight.

8.2. Compliance Strategy Options

In practice, organizations tend to converge on one of three patterns. Some adopt an EU-aligned global standard, applying EU-grade controls and evidence-based practices everywhere. This simplifies governance and reduces future rework, but it increases upfront costs and process overhead. Others maintain separate EU and U.S. variants, preserving flexibility but creating ongoing operational, assurance, and change-control complexity. A third approach deploys a shared core architecture and adds EU-specific compliance capabilities in stages, balancing cost and speed but carrying the risk that EU readiness lags behind deployment.
From a long-term risk-management perspective, the EU-aligned global standard is often the most resilient option, particularly for organizations that expect to scale, face external scrutiny, or navigate rapid regulatory evolution.

8.3. Implementation Roadmap

Operationalizing this approach benefits from phased adoption. Organizations typically start by establishing governance and ownership, completing a risk assessment aligned with the system’s intended use, implementing baseline technical controls, and instrumenting monitoring and logging. A second phase formalizes jurisdiction-specific evidence and reporting, including audit artifacts, human oversight protocols, and incident escalation procedures aligned with relevant timelines. As deployments stabilize, controls can be refined based on operational experience and selectively automated to improve consistency.
Finally, mature programs institutionalize continuous improvement, pursue relevant certifications where appropriate, and scale the baseline across additional use cases while maintaining disciplined change control for prompts, models, retrieval sources, and tool integrations.

9. Recommendations

9.1. For Practitioners

Organizations should adopt an EU-first, standards-anchored baseline for GenAI security and governance. In practice, this means using the EU AI Act’s risk categories as the starting point for classifying systems and determining the minimum control posture, then implementing an AI management system aligned with ISO/IEC 42001 to institutionalize governance, lifecycle controls, and continuous improvement. The NIST AI RMF and Generative AI Profile should serve as the operational playbook, mapped to the five-layer control model in this paper. OWASP LLM Top 10 and MITRE ATLAS should be incorporated into testing and monitoring so that engineering and SOC practices reflect current adversary behavior, not only policy intent.
Practitioners should prioritize architectural controls because many GenAI risks are structural and cannot be addressed by policy, guardrails, or post hoc moderation. Key architectural choices, including retrieval scope, context boundaries, and tool invocation permissions, should be explicitly documented, reviewed, and treated as primary security controls. Defense-in-depth should be applied across all five layers, with particular emphasis on containment at integration boundaries, where instruction–data conflation and confused-deputy failures become operationally consequential.
Assurance should be evidence-driven rather than documentation-driven. Logging and monitoring should capture the full causal chain of GenAI behavior, including prompts, retrieved context, tool calls, outputs, and policy decisions. Organizations should treat runtime monitoring as a standing requirement for managing residual risk. Sustained effectiveness also depends on human capability: role-appropriate training and an adapted secure development lifecycle are necessary to ensure that architectural and operational controls are implemented correctly and improved through feedback from incidents and near-misses.
Figure 5 provides a crosswalk mapping the control families defined in Section 5 to the most commonly used frameworks (NIST AI RMF/GenAI Profile, ISO/IEC 42001, OWASP LLM Top 10, MITRE ATLAS, and relevant EU AI Act requirements). This crosswalk is intended to reduce coverage gaps when organizations must satisfy multiple assurance regimes and to highlight where frameworks converge or leave responsibility to organizational governance.

9.2. For Policymakers

9.2.1. European Union

Policymakers should prioritize translating statutory obligations into practical, implementable guidance. This requires clearer technical standards that specify what constitutes “adequate” security and risk management under the AI Act, as well as deliberate harmonization across the AI Act, NIS2, DORA, and the GDPR to reduce duplication and compliance complexity. To avoid disproportionately burdening smaller organizations, regulators should also provide simplified guidance, templates, and tools that make compliance achievable for SMEs. Finally, international cooperation should be treated as a design goal in its own right, with mechanisms for mutual recognition and alignment with key partners to reduce regulatory friction for cross-border deployments.

9.2.2. United States

U.S. policy should aim to create greater consistency while preserving space for innovation. Establishing a modest federal baseline would limit harmful fragmentation across state laws and provide clearer expectations for organizations deploying GenAI at scale. Within that baseline, NIST should remain the central source of technical and security guidance, with its frameworks serving as the primary reference for implementation. Policymakers should also clarify liability standards to encourage responsible deployment without creating unpredictable or unmanageable legal exposure. Finally, closer transatlantic alignment on technical standards and shared risk definitions would reduce compliance friction and support cross-border deployments.

9.2.3. Transatlantic Cooperation

Transatlantic cooperation should focus on practical coordination rather than aspirational harmonization. Shared technical standards, developed jointly through bodies such as ISO/IEC, and mutual recognition of conformity assessments would reduce duplicative assurance work and lower compliance friction for cross-border deployments. Cooperation is also needed on threat intelligence, including shared taxonomies and structured information exchange on AI-specific attacks and emerging techniques, so that defensive practice evolves at the pace of the threat landscape. Finally, incident coordination should be strengthened through agreed protocols for cross-border notification and response, ensuring that organizations can meet reporting timelines and preserve investigative continuity when incidents span jurisdictions.

9.3. For the Research Community

In the European Union, the priority is implementability. Policymakers should translate statutory obligations into clear technical expectations for what constitutes “adequate” cybersecurity, logging, testing, and monitoring under the AI Act, and reduce duplication by harmonizing requirements across the AI Act, NIS2, DORA, and GDPR where the same evidence artifacts are repeatedly demanded. Practical compliance support for smaller organizations will also matter, since capacity constraints will otherwise convert legal obligations into uneven enforcement rather than improved security.
In the United States, the priority is coherence. A baseline national approach would reduce fragmentation across state-level initiatives and provide clearer expectations for organizations operating across sectors. NIST should remain the central source of technical guidance, and policymakers should clarify liability so that incentives support responsible deployment without pushing organizations into defensive over-restriction or under-reporting.
Transatlantic cooperation should focus on pragmatic alignment rather than full legal convergence. Mutual recognition of technical standards and assurance artifacts, shared threat taxonomies, and threat-intelligence exchange, along with compatible cross-border incident coordination, would reduce operational friction while improving security outcomes.

10. Conclusions

Securing generative AI (GenAI) systems requires moving beyond perimeter-based controls toward an architecture-led, defense-in-depth approach. When untrusted language inputs are coupled with privileged retrieval and tool execution, key exposures arise at trust boundaries and integration points, not just “inside the model.” GenAI systems cannot be reliably mitigated by policy-only or interface-layer guardrails.
This paper contributes a threat-centric analysis that aligns major GenAI failure modes with a five-layer reference architecture—training and adaptation, model behavior, retrieval and data interfaces, orchestration and tools, and runtime interaction. It uses established taxonomies (OWASP LLM Top 10 and MITRE ATLAS) to keep the mapping comparable and operational. This layer-based view clarifies where conventional security controls remain effective (identity, logging, incident response) and where GenAI-specific mechanisms are required, including retrieval isolation, privilege boundaries for connectors and tools, and robust context handling rules.
Three conclusions follow. First, GenAI security differs materially from traditional software security because behavior is probabilistic and context-dependent, and because retrieval and tool integration expand the attack surface in ways that perimeter defenses do not reliably contain. Second, many high-impact risks are structural rather than configurable: architectural choices about retrieval scope, connector trust, tool permissions, and context management define the security boundary and largely determine residual exposure. Third, meaningful assurance depends on operational evidence. Pre-deployment documentation and testing are necessary but insufficient; organizations need continuous monitoring and audit-grade logging that capture the causal chain of prompts, retrieved context, tool calls, outputs, and policy decisions.
Regulatory divergence amplifies these engineering realities. In practice, EU regimes—especially the AI Act and NIS2—often serve as the highest common baseline for transatlantic deployments because they impose binding requirements for security controls, evidence, and reporting. A workable response is a dual-track approach. Here, we maintain a consistent, layer-aligned technical baseline while adapting governance processes, such as documentation depth, oversight mechanisms, reporting timelines, and assurance, to jurisdiction-specific obligations.
Ultimately, human factors remain the most crucial determinant. Even a well-designed layered control model will fail in practice without practitioner education, security team capacity for AI-specific threat modeling and testing, and leadership discipline in accepting and governing residual risk. Organizations that approach GenAI security as a defense-in-depth strategy, rooted in architecture, supported by operational evidence, and governed with jurisdiction-aware rigor will be best equipped to deploy these systems safely and sustainably.

Author Contributions

Conceptualization, V.K. and K.K.; methodology, V.K. and K.K.; validation, V.K. and K.K.; formal analysis, V.K. and K.K.; investigation, V.K. and K.K.; resources, V.K. and K.K.; writing—original draft preparation, V.K.; writing—review and editing, V.K. and K.K.; visualization, V.K.; project administration, K.K. All authors have read and agreed to the published version of the manuscript.

Funding

K.K. gratefully acknowledges support under Project UNITe BG16RFPR002-1.014-0004, funded by PRIDST.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Acknowledgments

The authors acknowledge the Dean of Metropolitan College at Boston University for encouraging research collaboration and providing financial support for more than two decades of collaboration between researchers in the EU and the USA through the annual CSECS initiative. During the preparation of this manuscript, the authors used OpenAI (version 5.0) to support conceptual organization, background synthesis, and figure prototyping. The authors reviewed and edited all AI-assisted output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AbbreviationDefinition
AIArtificial Intelligence
AIMSArtificial Intelligence Management System
AMLAdversarial Machine Learning
ATLASAdversarial Threat Landscape for Artificial-Intelligence Systems
CCCreative Commons
DLPData Loss Prevention
DORADigital Operational Resilience Act
EUEuropean Union
GenAIGenerative Artificial Intelligence
GDPRGeneral Data Protection Regulation
GPAIGeneral-Purpose Artificial Intelligence
IAMIdentity and Access Management
ISOInternational Organization for Standardization
ISO/IECInternational Organization for Standardization/International Electrotechnical Commission
LLMLarge Language Model
MITREMassachusetts Institute of Technology Research and Engineering
NIS2Network and Information Security Directive (EU) 2022/2555
NISTNational Institute of Standards and Technology
OWASPOpen Web Application Security Project
PIIPersonally Identifiable Information
RAGRetrieval-Augmented Generation
RLHFReinforcement Learning from Human Feedback
SIEMSecurity Information and Event Management
SMESmall and Medium-Sized Enterprise
SOCSecurity Operations Center
STRIDESpoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege
USUnited States

Appendix A. Empirical Literature Record-Flow Summary

This appendix reports a concise, text-only summary of current empirical literature researched apart from the standards (see Appendix C).

A.1. Identification

Records were identified through searches of Scopus, IEEE Xplore, ACL Anthology, and supplementary targeted web searches within the time window January 2020–January 2026.
  • Initial candidate pool (identified): 85+ records

A.2. Screening

A title/abstract screen was applied to retain candidates that were clearly within scope and plausibly met empirical eligibility criteria.
  • Screened in (title/abstract): 52 records

A.3. Full-Text Assessment

Full-text review was conducted for the screened-in set to confirm (i) relevance to GenAI security/privacy, (ii) presence of an empirical contribution (attack/defense/evaluation/benchmark) or a formal threat model with deployment-relevant analysis, and (iii) sufficient methodological detail to interpret results.
  • Full-text reviewed: 38 records.
Common reasons for exclusion at full text:
  • Out of scope on full read (LLM mentioned but not central to security/privacy contribution)
  • No evaluable method (primarily conceptual commentary without measurements or threat model)
  • Insufficient methodological detail to interpret results
  • Redundant overlap with a later/extended version of the same study

A.4. Included Empirical Set

  • Final empirical papers included: 28

A.5. Coverage Assurance

The final included set was selected to ensure balanced coverage across five threat surfaces used in the paper: (1) prompt injection/jailbreaks; (2) indirect prompt injection and tool/agent misuse; (3) RAG poisoning/knowledge corruption; (4) privacy leakage/training data exposure; and (5) mitigations/evaluations/benchmarks.

Appendix B. Representative Empirical Studies by Threat Class

Table A1. Empirical Studies by Threat Class.
Table A1. Empirical Studies by Threat Class.
Threat ClassRepresentative StudiesSettingMain Failure ModeMitigation (If Any)
Prompt Injection/Jailbreaks[11,12,29,30,31]Model-onlyRoleplay/privilege escalation bypasses; 89–95% ASRAdversarial training, guardrails (partial)
Indirect Prompt Injection + Agents[13,14,34,38]Tool-integrated agentsEmbedded instructions in tool outputs hijack agent; 24–84% ASRMELON, spotlighting (emerging)
RAG Poisoning[48,50]RAG systems5 malicious texts achieve 90% ASR in a million-doc databaseTraceback forensics (post hoc)
Privacy Leakage[8,9,10,51]Model-only/fine-tunedTraining data extraction; PII exposure via promptingDifferential privacy, PAPILLON
Defenses and Evaluation[14,34]BenchmarksStandardized testing reveals defense gapsMulti-layer defense-in-depth

Appendix C. Principal Frameworks

This appendix provides a concise reference summary of the principal frameworks used throughout the paper to ground the threat taxonomy and control mappings. Table A2 summarizes each framework’s primary purpose, intended audience, and how it is applied in this review. Table A3 then maps each framework to the structural–configurable lens and the typical allocation of responsibility between AI system providers and deployers. These tables are intended as a quick reference for readers and to improve transparency about the sources of guidance synthesized in Section 2, Section 3, Section 4, Section 5 and Section 6.
Table A2. Key frameworks and how this review uses them.
Table A2. Key frameworks and how this review uses them.
FrameworkPrimary PurposePrimary
Audience
How This Review Uses ItWhere It Maps in the 5-Layer StackKey Limitation for this
Review
NIST AI RMF 1.0 + GenAI Profile (AI 600-1)Risk management process and governance functions (Govern/Map/Measure/Manage)Leadership, risk/compliance, program ownersGovernance backbone for “what good looks like” in AI risk practice; supports the provider vs. deployer division of responsibilityCross-cutting (all layers), with emphasis on lifecycle governance and operational managementHigh-level guidance; requires interpretation to translate into concrete technical controls
ISO/IEC 42001:2023Certifiable AI management system (AIMS) for organizational controls and assuranceOrganizations seeking structured governance and third-party assuranceControl-system and auditability anchor (management system maturity, lifecycle management, documentation, QMS-style structure)Cross-cutting (all layers), strongest on governance and lifecycle controlsLess specific on GenAI-native attack paths (tool misuse) than OWASP/ATLAS
OWASP Top 10 for LLM ApplicationsPractical catalog of GenAI application threats and common failure modesDevelopers, app security, product engineeringThreat enumeration for application/system-level risks; provides concrete risk categories used in the threat mappingMostly Layers 3–5 (retrieval, orchestration/tools, runtime), with some coverage of training/supply chainNot a governance standard; limited guidance on evidence/assurance and regulatory allocation
MITRE ATLASAdversary-centric tactics/techniques matrix for AI systemsSOC, threat intelligence, detection engineeringThreat lifecycle framing; supports mapping of real attack progression to an established “tactics/techniques” vocabularyCross-cutting, often operationalized in Layers 3–5 (integration boundary failures)Describes attacker behavior more than defensive prescriptions; requires translation into controls
ENISA ML security guidanceEU-oriented security-by-design guidance for ML/AI and lifecycle security thinkingEU-aligned security and compliance teams, policymakersEuropean security perspective and regulation-adjacent framing; supports linkage to EU requirements and risk-based expectationsLifecycle-oriented (all layers), traditionally strongest on training/inference threats and governance
Table A3. Mapping each framework to structural/configurable risks and provider/deployer ownership.
Table A3. Mapping each framework to structural/configurable risks and provider/deployer ownership.
FrameworkStructural Risk Coverage (Model/Training/Architecture Properties)Configurable Risk Coverage (Deployment/Integration/Ops)Typical “Owner” EmphasisHow It Supports the
Structural–Configurable Lens in this Paper
NIST AI RMF + GenAI ProfileModerate–High (developer-managed practices: training data, model testing, upstream risk management)High (deployer-managed practices: monitoring, oversight, incident response, deployment context)Explicitly separates developer vs. deployer responsibilitiesProvides an established responsibility split that aligns with structural (provider) vs. configurable (deployer) framing
ISO/IEC 42001Moderate (indirectly via vendor assessment, validation obligations, lifecycle controls)High (management system controls, governance, documentation, operational processes)Primarily deployer/organization, with vendor management requirementsAnchors the “configurable” side as auditable governance and operational control maturity
OWASP LLM Top 10Mixed (some structural issues like training data poisoning/model theft are included, but not deeply theorized)High (prompt injection mitigation, plugin/tool security, output handling, operational safeguards)Primarily builders/deployers, with some provider relevanceGives concrete categories that often surface at integration boundaries; helps distinguish which items are structural vs. configurable in practice
MITRE ATLASMixed (includes training-time and model-level attacks, but framed as attacker TTPs)Mixed–High (many techniques exploit deployment surfaces and access patterns)Neutral; used by SOC/defenders to describe attacksSupports labeling threats as structural/configurable/hybrid by showing where in the lifecycle an attacker gains leverage
ENISA guidanceHigh (poisoning, evasion, privacy attacks, supply chain thinking)Moderate–High (deployment security, monitoring, governance controls)Balanced, EU-style lifecycle approachHelps connect structural/configurable technical risk to EU-flavored security-by-design and compliance expectations

References

  1. National Institute of Standards. Technology. In Artificial Intelligence Risk Management Framework (AI RMF 1.0); National Institute of Standards and Technology (NIST), U.S. Department of Commerce: Washington, DC, USA, 2023. [Google Scholar]
  2. NIST AI 600-1; Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile. National Institute of Standards and Technology: Gaithersburg, MD, USA, 2024.
  3. Foundation, O. OWASP Top 10 for Large Language Model Applications; Open Web Application Security Project (OWASP): Columbia, MD, USA, 2023. [Google Scholar]
  4. Corporation, M. ATLAS: Adversarial Threat Landscape for Artificial-Intelligence Systems (Advmlthreatmatrix); MITRE Corporation: McLean, VA, USA, 2021. [Google Scholar]
  5. European Parliament and the Council of the European Union. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 Laying Down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act); Publications Office of the European Union: Luxembourg, 2024. [Google Scholar]
  6. European Parliament and the Council of the European Union. Directive (EU) 2022/2555 of the European Parliament and of the Council of 14 December 2022 on Measures for a High Common Level of Cybersecurity Across the Union (NIS2 Directive); Publications Office of the European Union: Luxembourg, 2022. [Google Scholar]
  7. Federal vs. State AI Rules: What the New U.S. Executive Order Really Means. Available online: https://regulatingai.org/ (accessed on 11 May 2025).
  8. Fredrikson, M.; Jha, S.; Ristenpart, T. Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), Denver, CO, USA, 12–16 October; Association for Computing Machinery: New York, NY, USA, 2015. [Google Scholar]
  9. Shokri, R.; Stronati, M.; Song, C.; Shmatikov, V. Membership Inference Attacks Against Machine Learning Models; IEEE: Piscataway, NJ, USA, 2016. [Google Scholar]
  10. Carlini, N.; Tramèr, F.; Wallace, E.; Jagielski, M.; Herbert-Voss, A.; Lee, K.; Roberts, A.; Brown, T.; Song, D.; Erlingsson, Ú.; et al. Extracting Training Data from Large Language Models. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21), Virtual, 10–13 August 2021; USENIX Association: Berkeley, CA, USA, 2021; pp. 2633–2650. [Google Scholar]
  11. Yao, Y.; Duan, J.; Xu, K.; Cai, Y.; Sun, Z.; Zhang, Y. A survey on large language model (llm) security and privacy: The good, the bad, and the ugly. High Confid. Comput. 2024, 4, 100211. [Google Scholar] [CrossRef]
  12. Yi, J.; Xie, Y.; Zhu, B.; Kiciman, E.; Sun, G.; Xie, X.; Wu, F. Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models; USENIX Association: Berkeley, CA, USA, 2023. [Google Scholar]
  13. Zhan, Q.; Liang, Z.; Ying, Z.; Kang, D. InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents. arXiv 2024, arXiv:2403.02691. [Google Scholar]
  14. Zhang, H.; Huang, J.; Mei, K.; Yao, Y.; Wang, Z.; Zhan, C.; Wang, H.; Zhang, Y. Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents. arXiv 2024, arXiv:2410.02644. [Google Scholar] [CrossRef]
  15. European Parliament and the Council of the European Union. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 (General Data Protection Regulation); Publications Office of the European Union: Luxembourg, 2016. [Google Scholar]
  16. Haro, J. Secure APIs; O’Reilly Media: Sebastopol, CA, USA, 2023. [Google Scholar]
  17. Santos, O.; Radanliev, P. AI-Powered Digital Cyber Resilience; O’Reilly Media: Sebastopol, CA, USA, 2024. [Google Scholar]
  18. Sood, A. Combating Cyberattacks Targeting the AI Ecosystem; O’Reilly Media: Sebastopol, CA, USA, 2024. [Google Scholar]
  19. Wendt, D.W. The Cybersecurity Trinity: Artificial Intelligence, Automation, and Active Cyber Defense; O’Reilly Media: Sebastopol, CA, USA, 2023. [Google Scholar]
  20. Wendt, D.W. AI Strategy and Security: A Roadmap for Secure, Responsible, and Resilient AI Adoption; O’Reilly Media: Sebastopol, CA, USA, 2024. [Google Scholar]
  21. ISO 42001:2023; Information Technology-Artificial Intelligence-Management System. International Electrotechnical Commission (ISO/IEC): Geneva, Switzerland, 2023.
  22. ISO 31000:2018; Risk Management—Guidelines. International Organization for Standardization (ISO): Geneva, Switzerland, 2018.
  23. PMI. Risk Management in Portfolios, Programs, and Projects; Project Management Institute: Newtown, PA, USA, 2024. [Google Scholar]
  24. Souppaya, M.; Scarfone, K.; Dodson, D. Secure Software Development Framework (SSDF) Version 1.1: Recommendations for Mitigating the Risk of Software Vulnerabilities; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2022. [Google Scholar]
  25. McGraw, G. Six Tech Trends Impacting Software Security. Computer 2017, 50, 100–102. [Google Scholar] [CrossRef]
  26. Manadhata, P.K.; Wing, J.M. An Attack Surface Metric. IEEE Trans. Softw. Eng. 2011, 37, 371–386. [Google Scholar] [CrossRef]
  27. Bommasani, R.; Hudson, D.A.; Adeli, E.; Altman, R.; Arora, S.; von Arx, S.; Bernstein, M.S.; Bohg, J.; Bosselut, A.; Brunskill, E.; et al. On the Opportunities and Risks of Foundation Models. arXiv 2021, arXiv:2108.07258. [Google Scholar] [CrossRef]
  28. Hardy, N. The Confused Deputy: Or why capabilities might have been invented. Oper. Syst. Rev. 1988, 22, 36–38. [Google Scholar] [CrossRef]
  29. Reddy, P.; Gujral, A.S. EchoLeak: The First Real-World Zero-Click Prompt Injection Exploit in a Production LLM System. arXiv 2025, arXiv:2509.10540. [Google Scholar] [CrossRef]
  30. Hines, K.; Lopez, G.; Hall, M.; Zarfati, F.; Zunger, Y.; Kiciman, E. Defending Against Indirect Prompt Injection Attacks With Spotlighting; USENIX Association: Berkeley, CA, USA, 2024. [Google Scholar]
  31. Liu, Y.; Deng, G.; Li, Y.; Wang, K.; Wang, Z.; Wang, X.; Zhang, T.; Liu, Y.; Wang, H.; Zheng, Y.; et al. Prompt Injection attack against LLM-integrated Applications. arXiv 2023, arXiv:2306.05499. [Google Scholar]
  32. Sabin, S. 1 Big Thing: AI-Powered Malware Is on Its Way. Available online: https://www.axios.com/newsletters/axios-ai-plus-b19d2e6e-7ec2-4d99-9b36-3f95c7298354.html (accessed on 11 May 2025).
  33. Tyagi, A.K.; Addula, S.R. Artificial Intelligence for Malware Analysis: A Systematic Study. In Artificial Intelligence-Enabled Digital Twin for Smart Manufacturing; Wiley-Scrivener: Beverly, MA, USA, 2024; pp. 359–390. [Google Scholar]
  34. Debenedetti, E.; Zhang, J.; Balunovic, M.; Beurer-Kellner, L.; Fischer, M.; Tramèr, F. Agentdojo: A dynamic environment to evaluate prompt injection attacks and defenses for llm agents. Adv. Neural Inf. Process. Syst. 2024, 37, 82895–82920. [Google Scholar]
  35. Gu, T.; Dolan-Gavitt, B.; Garg, S. BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain; Curran Associates, Inc.: Red Hook, NY, USA, 2017. [Google Scholar]
  36. Koh, P.W.; Steinhardt, J.; Liang, P. Stronger data poisoning attacks break data sanitization defenses. Mach. Learn. 2022, 111, 1–47. [Google Scholar] [CrossRef]
  37. Wang, B.; Yao, Y.; Shan, S.; Li, H.; Viswanath, B.; Zheng, H.; Zhao, B.Y. Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks. In Proceedings of the IEEE Symposium on Security and Privacy (S&P), San Francisco, CA, USA, 20–22 May 2019; IEEE: Piscataway, NJ, USA, 2019. [Google Scholar]
  38. De Stefano, G.; Schönherr, L.; Pellegrino, G. Rag and Roll: An End-to-End Evaluation of Indirect Prompt Manipulations in LLM-Based Application Frameworks. arXiv 2024, arXiv:2408.05025. [Google Scholar]
  39. Srinivas, S.; Kirk, B.; Zendejas, J.; Espino, M.; Boskovich, M.; Bari, A.; Dajani, K.; Alzahrani, N. AI-Augmented SOC: A Survey of LLMs and Agents for Security Automation. J. Cybersecur. Priv. 2025, 5, 95. [Google Scholar] [CrossRef]
  40. Palma, G.; Cecchi, G.; Caronna, M.; Rizzo, A. Leveraging Large Language Models for Scalable and Explainable Cybersecurity Log Analysis. J. Cybersecur. Priv. 2025, 5, 55. [Google Scholar] [CrossRef]
  41. EU 2024/1689; Artificial Intelligence Act. European Parliament and of the Council: Brussels, Belgium, 2024. Available online: https://artificialintelligenceact.eu/the-act/ (accessed on 21 January 2026).
  42. NIST AI RMF 1.0; Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology: Gaithersburg, MD, USA, 2023.
  43. World Trade Organization. Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS Agreement); WTO: Geneva, Switzerland, 1994. [Google Scholar]
  44. EU 2016/679; General Data Protection Regulation. European Parliament and Council of the European Union: Brussels, Belgium, 2016. Available online: https://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX:32016R0679 (accessed on 22 December 2025).
  45. Santos, O. Developing Cybersecurity Programs and Policies in an AI-Driven World, 4th ed.; O’Reilly Media: Sebastopol, CA, USA, 2024. [Google Scholar]
  46. MITRE Corporation. ATLAS: Adversarial Threat Landscape for Artificial-Intelligence Systems (ATLAS); MITRE Corporation: McLean, VA, USA, 2021. [Google Scholar]
  47. OWASP Foundation. OWASP Top 10 for Large Language Model Applications; Open Worldwide Application Security Project (OWASP): Columbia, MD, USA, 2023. [Google Scholar]
  48. Florida Senate. Artificial Intelligence Bill of Rights; Florida Senate SB 482; Florida Senate: Tallahassee, FL, USA, 2026.
  49. AXIOS. Tech Group Opposes Florida AI Proposal ‘Artificial Intelligence Bill of Rights’. Available online: https://www.axios.com/newsletters/axios-tampa-bay-ea380018-35af-498f-98fb-b231af7193d0.html (accessed on 20 January 2026).
  50. Zhang, B.; Chen, Y.; Liu, Z.; Nie, L.; Li, T.; Liu, Z.; Fang, M. Practical poisoning attacks against retrieval-augmented generation. arXiv 2026, arXiv:2504.03957. [Google Scholar]
  51. Duan, M.; Suri, A.; Mireshghallah, N.; Min, S.; Shi, W.; Zettlemoyer, L.; Tsvetkov, Y.; Choi, Y.; Evans, D.; Hajishirzi, H. Do membership inference attacks work on large language models? arXiv 2024, arXiv:2402.07841. [Google Scholar] [CrossRef]
Figure 1. Layered Model.
Figure 1. Layered Model.
Jcp 06 00027 g001
Figure 2. GenAI Threat Categories and Control Placement.
Figure 2. GenAI Threat Categories and Control Placement.
Jcp 06 00027 g002
Figure 3. Layered reference architecture for Generative AI systems.
Figure 3. Layered reference architecture for Generative AI systems.
Jcp 06 00027 g003
Figure 4. Comparative analysis of global AI regulatory landscapes: EU’s centralized compliance versus US’s sectoral oversight, including voluntary standards such as the NIST AI Risk Management Framework and GenAI Profile [1,2].
Figure 4. Comparative analysis of global AI regulatory landscapes: EU’s centralized compliance versus US’s sectoral oversight, including voluntary standards such as the NIST AI Risk Management Framework and GenAI Profile [1,2].
Jcp 06 00027 g004
Figure 5. Control Families Mapped to Standards—Practitioner Crosswalk.
Figure 5. Control Families Mapped to Standards—Practitioner Crosswalk.
Jcp 06 00027 g005
Table 1. Conceptual Lineage of the Structural–Configurable Distinction.
Table 1. Conceptual Lineage of the Structural–Configurable Distinction.
Source DomainRelated ConceptMaps to StructuralMaps to Configurable
Risk Management (ISO 31000)Inherent vs.
Residual Risk
Inherent risk (before controls)Residual risk (after treatment)
Software Security (McGraw)Design Flaws vs.
Implementation Bugs
Design flaws (require redesign)Implementation bugs (patchable)
Systems Security (Hardy)Confused Deputy
Problem
Authority coupling (architectural)Permission scoping
(operational)
Attack Surface (Manadhata and Wing)Design vs. Deployment SurfaceEntry points fixed by designEnabled services, exposure
This Paper (GenAI)Structural vs.
Configurable
Architecture, training, trust boundariesDeployment, operational controls
Table 2. Tool capabilities.
Table 2. Tool capabilities.
Threat ClassRAG Chatbot ImpactAgentic System Impact
Indirect prompt injectionText manipulation, misinformationRemote code execution, data exfiltration, unauthorized actions
Privilege escalationLimited to output contentTool permissions inheritance; access to connected systems
Lateral movementNot applicableVia tool integrations to other systems
PersistenceNone (stateless output)Potential via file/database/credential tools
Table 3. Attack mapping.
Table 3. Attack mapping.
Attack Phase ATLAS Tactic ATLAS Technique (ID) Narrative Application
1. SeedInitial AccessIndirect Prompt Injection (AML.T0051.001)The attacker places a malicious payload inside an email, which is then ingested by Copilot’s RAG engine during a routine summary request.
2. BypassML Model AttackPrompt Injection (AML.T0051)The payload is semantically engineered to bypass Microsoft’s internal “Cross-Prompt Injection Attempt” (XPIA) classifiers.
3. TheftCollectionLLM Scope Violation (AML.T0051)The AI is tricked into accessing privileged internal data (e.g., password reset links, sensitive docs) based on instructions from the untrusted external email.
4. ExitExfiltrationExfiltration (Not Explicit in ATLAS)Sensitive data is encoded into a URL and rendered as a tracking pixel. The user’s browser automatically triggers the “leak” when it loads the image.
Table 4. Threat-Control Mapping for the GenAI Stack.
Table 4. Threat-Control Mapping for the GenAI Stack.
Threat Category OWASP LLM MITRE
ATLAS
Affected Layers Primary Controls Standards and Regulations
Prompt Injection LLM01AML.T00512, 3, 5Input validation, output filteringAI Act (Art.15) [41]
Indirect InjectionLLM01AML.T00513, 5Injection-aware retrievalNIST AI 600-1 [42]
Data PoisoningLLM03AML.T00201Provenance, anomaly detectionAI Act (Art.10) [41]
Model ExtractionLLM10AML.T00242, 5Rate limiting, access controlsIP protection regulations (WTO) [43]
Sensitive DisclosureLLM06AML.T00251, 2, 3, 5DLP filters, output scanningGDPR Art.32 [44]
Tool MisuseLLM07AML.T00404Least privilege, sandboxingISO/IEC 42001 [21]
Supply ChainLLM05AML.T00101, 4Provenance, verificationNIS2 [6]
JailbreakingLLM01AML.T00542, 5Safety tuning, testingAI Act (Art.15) [41]
Table 5. Threat–Control Mapping for the GenAI Stac.
Table 5. Threat–Control Mapping for the GenAI Stac.
GenAI Layer Key Threats Baseline Controls Evidence to Retain Standards Alignment
1. Training and AdaptationData poisoning, compromised datasets or models, unsafe fine-tuningVerify data and model provenance; control fine-tuning access; validate dependencies; stage releases with rollbackDataset lineage records; training and fine-tuning logs; access records; release documentationNIST AI RMF (GenAI Profile) [42]; ISO/IEC 42001; MITRE ATLAS [46]
2. Model BehaviorJailbreaks, policy evasion, unsafe outputsProtect system prompts; enforce policy outside the model; adversarial testing; abuse rate-limitingPrompt versions; evaluation results; red-team findings; refusal metricsNIST AI RMF (GenAI Profile); OWASP LLM Top 10 [47]
3. Retrieval and Data InterfacesIndirect prompt injection, over-broad retrieval, and data exposurePermission-aware retrieval; scoped data access; provenance labeling; secure vector storesRetrieval logs; access control decisions; provenance metadataOWASP LLM Top 10; NIST AI RMF; ISO/IEC 42001
4. Orchestration and ToolsTool misuse, unsafe actions, and lateral movementLeast-privilege tools; explicit approval for high-risk actions; schema validation; execution isolationTool inventories; invocation records; approval logs; incident alertsOWASP LLM Top 10; NIST AI RMF; MITRE ATLAS
5. Runtime InteractionPrompt injection, multi-turn escalation, output leakagePrompt screening; context separation; output filtering; session monitoringPrompt and context logs; moderation decisions; escalation indicatorsOWASP LLM Top 10; NIST GenAI Profile; ISO/IEC 42001
Table 6. US-Style versus EU-Compliant GenAI Architecture (illustrative baseline comparison).
Table 6. US-Style versus EU-Compliant GenAI Architecture (illustrative baseline comparison).
Element US-Style (Framework-Led/Sectoral) EU-Compliant (AI Act + NIS2)
Risk classificationInternal assessment; categorization varies by sector and organizationMandatory classification (e.g., high-risk/GPAI categories where applicable) with corresponding obligations
Training data documentationRecommended practice; often limited to internal summariesRequired documentation expectations (e.g., provenance-oriented summaries and compliance-relevant records)
Logging depth and traceabilityOperational logs for reliability/debugging; retention driven by business needComprehensive traceability across prompts, retrieved context, tool calls, outputs, and key decisions; retention and audit accessibility aligned to regulatory expectations
Human oversightOptional; driven by organizational risk appetite and liabilityRequired for high-risk use cases; defined escalation paths and documented oversight procedures
Incident reportingSectoral reporting, where applicable; timelines and thresholds varyRapid notification expectations under NIS2 (staged reporting) and AI Act-aligned incident handling, where applicable
Third-party audit/assuranceOptional; commonly customer- or procurement-drivenConformity assessment and external assurance expectations for high-risk systems; evidence readiness for supervisory review
Bias testing and monitoringRecommended; format and cadence varyDocumented testing and ongoing monitoring expectations, especially where systems are classified as high-risk
Table 7. Educational Gaps by Role.
Table 7. Educational Gaps by Role.
Role Traditional Training Covers GenAI Gap
DevelopersInput validation, SQL injection, XSSPrompt injection, context risks, tool permissions
Security TeamsNetwork attacks, malware, scanningModel behavior analysis, adversarial ML, AI red-teaming
ArchitectsNetwork segmentation, access controlContext boundaries, tool sandboxing, agentic containment
LeadershipCompliance, risk frameworksAI liability, probabilistic risk, regulatory divergence
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Kanabar, V.; Kaloyanova, K. Securing Generative AI Systems: Threat-Centric Architectures and the Impact of Divergent EU–US Governance Regimes. J. Cybersecur. Priv. 2026, 6, 27. https://doi.org/10.3390/jcp6010027

AMA Style

Kanabar V, Kaloyanova K. Securing Generative AI Systems: Threat-Centric Architectures and the Impact of Divergent EU–US Governance Regimes. Journal of Cybersecurity and Privacy. 2026; 6(1):27. https://doi.org/10.3390/jcp6010027

Chicago/Turabian Style

Kanabar, Vijay, and Kalinka Kaloyanova. 2026. "Securing Generative AI Systems: Threat-Centric Architectures and the Impact of Divergent EU–US Governance Regimes" Journal of Cybersecurity and Privacy 6, no. 1: 27. https://doi.org/10.3390/jcp6010027

APA Style

Kanabar, V., & Kaloyanova, K. (2026). Securing Generative AI Systems: Threat-Centric Architectures and the Impact of Divergent EU–US Governance Regimes. Journal of Cybersecurity and Privacy, 6(1), 27. https://doi.org/10.3390/jcp6010027

Article Metrics

Back to TopTop