Conceptualising RAG-Driven Agentic AI with Multi-Layer MCP for Seismic Structural Systems

Ávila, Carlos Fabián; Rivera Tapia, Edgar David

doi:10.3390/buildings16051018

Open Access Communication


by Carlos Fabián Ávila * and Edgar David Rivera Tapia

Mecánica Computacional e Inteligencia Artificial Aplicada (MCIAA), Ingeniería Civil, Facultad de Ciencias, Ingeniería y Construcción, Universidad UTE, Quito 170527, Ecuador

* Author to whom correspondence should be addressed.

Buildings 2026, 16(5), 1018; https://doi.org/10.3390/buildings16051018

Submission received: 4 December 2025 / Revised: 2 February 2026 / Accepted: 24 February 2026 / Published: 5 March 2026

(This article belongs to the Special Issue Automation and Intelligence in the Construction Industry)


Abstract

The integration of Generative AI into civil engineering is currently constrained by the risk of non-compliant outputs and an inherent lack of physics-based knowledge. To address these limitations, this paper presents a conceptual framework for the integration of Agentic Artificial Intelligence (AI) into the complete lifecycle of seismic-resistant structural engineering. The proposal employs a modular software architecture built on the Model Context Protocol (MCP), enabling distributed collaboration among specialised AI agents. We operationalise this architecture across six critical stages, where specific agents govern distinct phases: (1) Seismic Hazard and (2) Structural Modelling agents quantify demands through deterministic tool execution; the (3) Design agent optimises element sizing under the strict governance of Retrieval-Augmented Generation (RAG) for code compliance; (4) Construction Quality Control and (5) Structural Health Monitoring (SHM) agents validate as-built geometry and service-life performance; and an overarching (6) Ethical Audit agent supervises the ecosystem to ensure safety and algorithmic transparency. By decoupling probabilistic design iteration from immutable numerical execution, this framework ensures that generative outputs are traceable, transparent, and professionally accountable, offering a verified pathway for the deployment of AI systems in structural engineering.

Keywords:

agentic AI in structural engineering; model context protocol; agents; retrieval-augmented generation

1. Introduction

The Architecture, Engineering, and Construction sector currently faces a dichotomy: the urgent need for integrated project delivery versus the reality of deep technological isolation [1]. This struggle is rooted in the widespread use of Black-Box analysis tools, such as CAD, FEM, and BIM. As exemplified by recent multidisciplinary assessments of historic structures [2], these platforms are essential for resolving complex geometries and non-linear behaviours, yet they operate within fixed procedural boundaries. While this fragmentation is problematic, it is sustained by the industry’s historical imperative for deterministic methods [3], which provide the verifiable outcomes necessary for safety, regulatory compliance, and liability management [4]. These factors converge to establish a foundational definition of our current toolkit: deterministic engineering software is inherently characterised by its reliance on predefined, immutable logic.

Unfortunately, this rigid characterisation creates operational friction. Because these systems force information to flow in a linear, sequential manner [5], they lack the capacity for autonomous adaptation; consequently, any deviation or need for complex data interpretation necessitates significant, often manual, expert intervention [4,6]. This limitation is particularly critical in seismic design involving Soil–Structure Interaction. As highlighted in assessments of historical infrastructure [7], accurate vulnerability analysis requires simulating sequential ‘construction stages’ (such as excavation, backfilling, and water loading) to determine initial stress states. Currently, interpreting the massive, heterogeneous datasets generated by such multi-stage non-linear analyses remains a disjointed, manual process susceptible to oversight. Reliance on customised integration for specialised tools creates a brittle architecture, where complex logic demands high maintenance whenever systems change. This systemic fragility necessitates a fundamental evolution in traditional methodology. Therefore, the essential strategy requires orchestrating a cohesive, automated ecosystem that facilitates intelligent reasoning across core engineering platforms while strictly maintaining a ‘Human-in-the-Loop’ to guarantee process fidelity. This approach aims to remove the bottlenecks that currently delay civil engineering projects, establishing a robust foundation for advanced computational workflows.

To operationalise this cohesive link and transcend the limitations of linear workflows, the discipline must look beyond conventional tools. Consequently, the application of computational intelligence within Civil Engineering requires a rigorous distinction between conventional Machine Learning (ML) methodologies and the emerging paradigm of Agentic Artificial Intelligence (AI) [8,9]. This distinction characterises the autonomous agent not merely as a passive predictive model, but as a dynamic system interacting within its digital environment, capable of sensing conditions and executing goal-driven actions to influence specific outcomes. When this goal-oriented definition acts in concert with the advanced reasoning of foundational generative AI, it will enable these systems to operate with genuine autonomy, executing complex, multi-step tasks with proactive flexibility rather than merely responding to user prompts [8].

However, the deployment of the proposed Agentic AI necessitates a “Human-in-the-loop” paradigm to bridge computational autonomy with professional accountability. By synthesising the capabilities for self-direction and reflection inherent in Large Language Models (LLMs) [10], Agentic AI assumes the role of “Central Coordinator” within Multi-Agent Systems (MAS) frameworks [11]; yet, this agentic coordination is most effective when anchored by human oversight to ensure technical validity and ethical alignment. This symbiotic structure enables autonomous agents to augment intricate workflows and mitigate risk [12], while preserving the engineer’s role as the ultimate arbiter. Although this operational independence introduces the acute challenge of establishing clear safety criteria for agent failures [13], the integration of Human-in-the-loop protocols positions Agentic AI as a fundamental shift: moving the field from systems focused solely on prediction towards those capable of supervised, goal-directed management [13].

While Large Language Models offer exceptional capabilities in natural language inference, their inherent reliance on static training data, characterised by fixed cut-off dates, often results in ‘hallucinations’ when processing fragmented or detailed queries. These constraints render unaugmented models insufficient for high-stakes engineering applications [14]. To address this fundamental limitation, the field has adopted Retrieval-Augmented Generation (RAG), defined as an integrative framework that couples the generative LLM with an efficient information retrieval system [14,15]. This architecture is specifically designed to augment the LLM’s internal knowledge base with external, current, and domain-specific data, enabling effective performance where pre-trained data is typically incomplete [14]. By integrating authoritative documents from domain-specific repositories (such as seismic design codes [16,17,18], historical ground motion records [19], and experimental hysteresis data [20]) before generating a response, the system mitigates the critical risk of hallucinations. This synthesis of data augmentation and error mitigation underpins the current architectural transition toward Agentic RAG systems; this shift reflects the reality that earthquake engineering workflows, such as performance-based assessments and non-linear analysis, require autonomous agents capable of complex planning and environmental interaction, extending far beyond simple text generation [10]. However, given the catastrophic consequences of structural failure in seismically active regions, this autonomy must be governed by a ‘Human-in-the-loop’ protocol. In this paradigm, the structural engineer serves as the definitive validation node, anchoring the agent’s probabilistic reasoning to deterministic seismic provisions and ethical liability. Consequently, the most significant contribution of RAG in this context is its mechanism for factual grounding: by compelling the LLM to derive its outputs directly from specific regulatory clauses and validated spectral data, the framework ties technical judgements to verifiable sources more tightly than unaugmented generative models can.

Despite the remarkable advances in AI, the core challenge of LLMs in engineering remains a lack of inherent physics-based knowledge, leading to what is termed “Recursive Hallucination” and “Structural Distortion” [21]. Essentially, LLMs can generate plausible-looking, yet structurally unsound, designs. To address this gap, a fundamental methodological reorganisation was proposed in late 2024 with the introduction of the Model Context Protocol (MCP), an open standard designed to harmonise the interface between LLMs and external, domain-specific data sources [22]. This protocol functions as a universal “socket”—analogous to a USB-C port for AI—enabling autonomous agents to dynamically discover and execute workflows, read resources, and utilise tools across disparate computational systems [23].

It is important to understand that the literature defines MCP not as a proprietary application, but as a communication protocol that standardises the interaction between an “AI Host” and the external world. This architecture is structurally underpinned by a client-host-server model, facilitating a standardised information exchange via JSON-RPC 2.0 messages [24]. This tripartite structure is fundamentally critical, as it effectively decouples the abstract reasoning capabilities of the AI from the domain-specific, immutable logic of engineering tools.
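To make the message exchange concrete, the following is a minimal sketch of how a tool invocation could be framed as a JSON-RPC 2.0 request and response. The `tools/call` method name follows the MCP specification; the argument names (`model_id`, `damping_ratio`) and the simplified result payload are illustrative assumptions, not part of the framework described here.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request using MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def build_result(request_id, payload):
    """Build the matching success response; the payload shape is simplified here."""
    return {"jsonrpc": "2.0", "id": request_id, "result": payload}


# Hypothetical arguments: model_id and damping_ratio are illustrative names only.
request = build_tool_call(
    1, "run_response_spectrum", {"model_id": "frame-01", "damping_ratio": 0.05}
)
wire_message = json.dumps(request)   # what the client would send
decoded = json.loads(wire_message)   # what the server would receive
response = build_result(decoded["id"], {"status": "ok"})
```

The request and response share the same `id`, which is how a client matches an answer to the call that produced it.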

A recent study, supported by a dedicated dataset for validation, successfully demonstrated that MCP is a feasible and robust method of connection between an LLM and an external engineering API, specifically OPENSEESPY, enabling complex structural analysis via a structured CIDI prompt [25,26]. Therefore, MCP provides a foundational structure for Civil Engineering applications, moving beyond mere generative capacity to verifiable, physics-informed execution.

While the individual utility of these technologies is recognised, the current state-of-the-art [27] remains constrained by a ‘linear pipeline’ architecture. In this prevailing paradigm, AI functions primarily as a ‘Copilot’ or task-specific assistant, where the Model Context Protocol is utilised largely as a connectivity interface to facilitate conversation. Crucially, these existing frameworks rely heavily on post hoc ‘Human-in-the-Loop’ validation to mitigate inevitable hallucinations. However, it is argued that relying on human oversight to police stochastic errors is insufficient for the strict liability requirements of seismic infrastructure.

To transcend the limitations of current linear pipelines and address the critical need for rigorous validation in seismic design, this study introduces a unified orchestration framework that represents a distinct architectural shift. To the best of our knowledge, this is the first work to jointly conceptualise Agentic AI as an executive orchestrator, Retrieval-Augmented Generation as a regulatory grounding mechanism, and the Model Context Protocol as a deterministic substrate within a single end-to-end architecture. Crucially, MCP is reframed not merely as a connectivity layer, but as a formal Independent Design Verification Boundary between probabilistic reasoning and deterministic computation, thereby preventing numerical hallucination by architectural design. Similarly, rather than employing RAG solely for conversational enhancement, this framework encapsulates it as a “constitutional” governance layer, restricting agentic decisions to contextual justifications sourced exclusively from immutable codes and standards.

Beyond these structural innovations, the proposed architecture distinguishes itself through a lifecycle-wide, event-driven backbone that unifies seismic hazard assessment, structural analysis, Quality Assurance (QA) and Structural Health Monitoring into a closed-loop system. This continuity allows for the explicit positioning of the engineer as an “Executive Reviewer.” In this paradigm, human oversight is operationalised not as post hoc error correction, but as a formal approval gate embedded within the architecture. This approach aligns agentic autonomy with professional accountability, ensuring that the engineer remains the ultimate executive authority in the design loop.

2. Materials and Methods

The system integrates seven layers organised according to the Model Context Protocol (Figure 1). At the top, the User & Interface Layer captures engineering intent and delivers results and explanations. The Agentic AI Layer interprets user goals (given in natural language, simplifying the user–machine interactions) and orchestrates multi-step workflows across domain agents. The Agents Layer, comprising specialised MCP clients (Hazard, Structural, Design, Quality Assurance, SHM, and Audit & Ethics Agents), translates high-level plans into structured tool calls and interprets computational responses. Communication with deterministic numerical engines is mediated by the MCP Gateway Layer, which ensures schema consistency, authentication, and routing, and by the Event Bus Layer, which broadcasts asynchronous events enabling reactive and concurrent operations. The MCP Server Layer contains the deterministic computational engines responsible for hazard analysis, structural simulation, code-compliant design, construction quality verification, structural health monitoring, and audit functions. At the bottom, the External Data Input Layer ingests as-built geometry, sensor streams, and traceability logs, supporting continuous life-cycle updates to models and decisions.

2.1. The Model Context Protocol Paradigm

The proposed framework adopts the Model Context Protocol to address the fundamental incompatibility between stochastic Large Language Model reasoning and the rigorous determinism required in seismic engineering. By enforcing a standardised client–server paradigm, the architecture strictly segregates cognitive planning from numerical execution, ensuring that the system functions as a reliable engineering tool rather than an uncontrolled generative model.

2.1.1. Client–Server Dichotomy

The architecture is defined by a strict functional separation between two layers:

The Client (Reasoning Layer): The “Agents Layer” operates as the MCP client, comprising domain-specific agents (e.g., Hazard, Structural, Design) driven by LLM reasoning (Figure 1). These agents function as the system’s orchestrators and are the sole initiators of requests. They utilise high-level cognition to decompose complex objectives and determine when to execute specific tasks—such as deciding to run a spectral analysis—without performing the calculations themselves.
The Server (Computational Layer): The “MCP Server Layer” consists of stateless, deterministic engines that expose validated engineering tools (e.g., run_response_spectrum, check_aci318). These servers lack agency and reasoning capabilities; they strictly execute algorithms upon request and return structured, machine-interpretable outputs.
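The separation between the two layers can be sketched as follows. This is an illustrative stand-in rather than the authors' implementation: `check_drift`, its parameters, and the numeric values in the usage are hypothetical and are not taken from any design code. The point is only that the client builds a request while the server performs the comparison.

```python
from typing import Callable, Dict


# --- Server side: stateless, deterministic tools registered by name ---
def check_drift(drift_ratio: float, limit: float) -> dict:
    """Hypothetical deterministic check of a drift ratio against a supplied limit."""
    return {"drift_ratio": drift_ratio, "limit": limit, "passes": drift_ratio <= limit}


TOOLS: Dict[str, Callable[..., dict]] = {"check_drift": check_drift}


def server_execute(tool_name: str, arguments: dict) -> dict:
    """Run a registered tool; the server never decides which tool to run."""
    if tool_name not in TOOLS:
        return {"error": f"unknown tool: {tool_name}"}
    return {"tool": tool_name, "output": TOOLS[tool_name](**arguments)}


# --- Client side: chooses the tool and forwards values, performs no arithmetic ---
def agent_request(drift_ratio: float, limit: float) -> dict:
    """Stand-in for an LLM-driven agent that only assembles a structured request."""
    return {
        "tool_name": "check_drift",
        "arguments": {"drift_ratio": drift_ratio, "limit": limit},
    }


# Illustrative values only.
result = server_execute(**agent_request(0.012, 0.02))
```

Because the pass/fail flag is produced inside `check_drift`, the agent can report the outcome but cannot alter it.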

2.1.2. Safety and Determinism

This separation provides a robust mechanism for preventing “hallucinations” in safety-critical workflows. By routing all computations through the MCP Gateway to deterministic servers, the framework ensures that every structural demand, capacity check, or safety decision originates exclusively from validated physics-based algorithms. Consequently, the agents reason about the engineering process using textual logic, but the numerical ground truth is never generated by the LLM, thereby guaranteeing traceability and reproducibility.

2.2. Multi-Layer Architectural Topology

The proposed framework (Figure 1) establishes a hierarchical topology comprising seven distinct strata, rigorously organised under the Model Context Protocol to decouple stochastic LLM reasoning from deterministic engineering verification.

At the summit, the User & Interface Layer (Layer 1: Human-in-the-Loop Oversight) defines the regulatory boundary, capturing high-level engineering intent whilst serving as the definitive approval gate for design alternatives and explainable outputs. Directly beneath, the Agentic AI Layer (Layer 2: Executive Planning & Orchestration) acts as the system’s cognitive core. This layer decomposes complex, multi-stage workflows—spanning hazard assessment to structural health monitoring—managing cross-domain dependencies, based on AI reasoning, without executing direct computations.

Operational logic is delegated to the Agents Layer (Layer 3: Domain-Specific MCP Clients). Comprising specialised agents for Hazard, Structural Analysis, Design, Quality Assurance (QA), SHM, and Audit, this layer translates the orchestrator’s natural language plans into structured MCP tool schemas. To ensure protocol integrity, the MCP Gateway (Layer 4: Secure Middleware & Routing) mediates all traffic between clients and servers. It enforces strict authentication and schema validation, guaranteeing that stochastic agent requests adhere to the rigid input requirements of the numerical engines.

Simultaneously, the Event Bus (Layer 5: Asynchronous Reactive Backbone) employs a publish-subscribe model to broadcast system-wide triggers—such as hazard.cms.ready or qa.deviation.alert—enabling real-time cross-domain reactivity. The computational foundation resides in the MCP Server Layer (Layer 6: Deterministic Computational Engines). This suite executes validated physics-based algorithms (e.g., PSHA, FEM, Code Checks) and strictly excludes hallucination. Crucially, this layer integrates a Knowledge/RAG Server, which retrieves contextual regulatory data (e.g., ACI 318-25 clauses) to support decision-making without altering numerical results.

Finally, the External Data Input Layer (Layer 7: Lifecycle Data Ingestion) anchors the digital twin in physical reality by feeding as-built BIM models, sensor streams, and construction logs into the upstream servers. To operationalise this architecture, Table 1 details the specific Human-in-the-Loop activities, structural metrics, and RAG-retrieved regulatory provisions that govern each agent defined in Layer 3.

2.3. Integration of Controlled Retrieval-Augmented Generation

The framework integrates RAG technology by encapsulating it within a dedicated Knowledge MCP Server, strictly adhering to the client–server architecture defined by the Model Context Protocol. Rather than functioning as an unchecked generative layer, this server is implemented as a deterministic endpoint that exposes specific tools—such as search_codes, search_qa_guidelines, and search_projects—which agents invoke via the MCP Gateway.

Technically, this integration relies on the MCP Gateway to enforce schema validation on all retrieval requests, ensuring that queries for regulatory clauses (e.g., ACI 318, ASCE 7) or historical project data are structured and secure. Upon invocation, the Knowledge Server queries a vectorised knowledge corpus and returns structured JSON containing ranked passages and source metadata, explicitly avoiding free-form conversational outputs.
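The structured return format can be illustrated with a small sketch of a `search_codes` tool. Two assumptions are worth stating plainly: the corpus entries are placeholders (the text is not quoted from any standard), and ranking uses simple keyword overlap as a stand-in for the vector search described above.

```python
# Placeholder corpus: identifiers and text are NOT taken from any real standard.
CORPUS = [
    {"source": "DOC-A", "clause": "1.1", "text": "placeholder provision about storey drift limits"},
    {"source": "DOC-A", "clause": "2.3", "text": "placeholder provision about concrete cover"},
    {"source": "DOC-B", "clause": "4.2", "text": "placeholder provision about drift and damping"},
]


def search_codes(query, top_k=2):
    """Return ranked passages with source metadata as a plain dictionary.

    Keyword overlap is used here instead of embedding similarity, purely to
    keep the sketch dependency-free.
    """
    terms = set(query.lower().split())
    scored = []
    for entry in CORPUS:
        overlap = len(terms & set(entry["text"].lower().split()))
        if overlap:
            scored.append((overlap, entry))
    # Sort by score only; the sort is stable, so ties keep corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return {
        "query": query,
        "passages": [
            {
                "rank": position + 1,
                "score": score,
                "source": entry["source"],
                "clause": entry["clause"],
                "text": entry["text"],
            }
            for position, (score, entry) in enumerate(scored[:top_k])
        ],
    }


hits = search_codes("storey drift limits")
```

Every passage carries its `source` and `clause`, so an agent that cites a provision can only cite one that was actually retrieved.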

This architecture establishes a rigorous Contextual vs. Computational Separation. The RAG-driven Knowledge Server is solely responsible for providing textual justification and interpretive context, while physics-based computations—such as Finite Element Method (FEM) and Probabilistic Seismic Hazard Analysis (PSHA)—are executed by isolated, stateless MCP servers. This separation prevents LLM hallucinations from corrupting numerical workflows while ensuring that every engineering decision is traceable to a specific, immutable document within the vector store.

2.4. System Dynamics and Lifecycle Integration

The proposed framework (Figure 1) orchestrates system dynamics through a hybrid execution model that harmonises deterministic control with reactive agility, ensuring robust performance across the engineering lifecycle.

2.4.1. Synchronous Execution

The core operational mechanism relies on a standard synchronous request–response cycle managed by the MCP Gateway. In this phase, domain agents (MCP clients) initiate tool calls—such as struct-server.run_response_spectrum or design-server.check_aci318—which the Gateway validates for schema consistency before routing to the appropriate deterministic MCP server. This strict mediation ensures that all safety-critical calculations remain reproducible, authorised, and strictly isolated from potential reasoning errors inherent in the LLM layer.
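A minimal sketch of the validate-then-route step might look as follows. The schema contents (`model_id`, `damping_ratio`) and the `servers` mapping are hypothetical, and a production gateway would more likely rely on JSON Schema than on the hand-written type checks used here.

```python
# Hypothetical schema registry: field names and types are illustrative.
SCHEMAS = {
    "struct-server.run_response_spectrum": {"model_id": str, "damping_ratio": float},
}


def validate_call(tool, arguments):
    """Return a list of validation errors; an empty list means the call is well-formed."""
    schema = SCHEMAS.get(tool)
    if schema is None:
        return [f"unregistered tool: {tool}"]
    errors = []
    for field, expected in schema.items():
        if field not in arguments:
            errors.append(f"missing field: {field}")
        elif not isinstance(arguments[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    for field in arguments:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors


def route(tool, arguments, servers):
    """Reject malformed calls before they reach any server; otherwise forward them."""
    errors = validate_call(tool, arguments)
    if errors:
        return {"accepted": False, "errors": errors}
    server_name = tool.split(".", 1)[0]  # "struct-server.run_..." -> "struct-server"
    return {"accepted": True, "result": servers[server_name](tool, arguments)}
```

For example, a call that passes `damping_ratio` as a string is returned to the agent with an error list and never reaches the structural server.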

2.4.2. Asynchronous Reactivity

To accommodate dynamic inputs, the architecture employs an Event Bus Layer that establishes event-driven loops distinct from the linear request cycle. This mechanism broadcasts asynchronous state changes, such as qa.deviation.alert during construction or shm.alert.damage during operation, allowing the Agentic AI to react concurrently to emerging hazards without blocking ongoing computations.
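The publish-subscribe pattern behind this layer can be sketched in a few lines. This version dispatches handlers in-process and synchronously, which is a simplification: an actual deployment would presumably use a message broker or an asynchronous runtime. The topic names come from the text; the payload fields are hypothetical.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process publish-subscribe dispatcher (synchronous for simplicity)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callable to be invoked for every event on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        """Deliver the payload to each subscriber; return how many were notified."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)


bus = EventBus()
reanalysis_queue = []

# Hypothetical reaction: queue a re-analysis when a damage alert arrives.
bus.subscribe("shm.alert.damage", lambda event: reanalysis_queue.append(event["sensor_id"]))
notified = bus.publish("shm.alert.damage", {"sensor_id": "S-07"})
```

Publishers do not need to know which agents are listening, which is what lets new reactions be added without modifying the component that raises the alert.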

2.4.3. Lifecycle Continuity

The convergence of deterministic tool execution and reactive, event-driven monitoring unifies the project lifecycle through a shared digital thread. By ingesting as-built geometry, sensor streams, and traceability logs via the External Data Input Layer, the system links initial design assumptions with real-world Quality Assurance verification and long-term Structural Health Monitoring. This integration transforms static engineering models into a continuous, closed-loop ecosystem capable of dynamically updating assessments in response to physical degradation or seismic events.

2.5. Systemic Mitigation via Architectural Determinism

To address the inherent stochasticity of Large Language Models in structural engineering, this framework relies on architectural determinism rather than probabilistic tuning. While quantitative benchmarks (e.g., hallucination rates) are subject to prompt sensitivity and remain a focus for future work, the current system enforces mitigation through a Qualitative Failure-Mode Analysis and a Formal Epistemic Firewall.

2.5.1. The Epistemic Firewall (Verification Boundary)

The primary safeguard is the imposition of a functional boundary, an epistemic firewall, that decouples reasoning from calculation. In unconstrained LLM workflows, numerical values and design recommendations are produced by the model’s unverified, approximate text generation, creating a high risk of fabricated spectra or incorrect drift limits.

In contrast, the MCP-governed workflow ensures that the agentic layer is architecturally incapable of fabricating engineering results. This relies on a formal verification argument:

Numerical Integrity: All quantitative outputs (e.g., shear capacity, spectral acceleration) originate exclusively from the deterministic MCP Server Layer (Layer 6).
Regulatory Fidelity: All code references are retrieved verbatim via the RAG-enabled Knowledge Server.

Consequently, “hallucinations” are confined to the explanatory text generated by the agent–LLM interaction within a RAG-controlled architecture. They cannot contaminate the engineering ground truth, as the LLM functions solely as a request router and results synthesiser, never as a calculator.
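One way such a boundary could be checked mechanically is a provenance test on the agent's explanatory text: every number it states should match a value returned by a deterministic server. The sketch below is an assumption about how this might be done, not a mechanism described in the framework. It handles only unsigned decimal literals and would also flag legitimate numerals such as clause or standard numbers, so those would need separate handling. All values are illustrative.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")  # unsigned integers and decimals only


def collect_numbers(value):
    """Recursively gather every numeric leaf of a tool output as a float."""
    if isinstance(value, bool):  # bool is a subclass of int; exclude it explicitly
        return set()
    if isinstance(value, (int, float)):
        return {float(value)}
    if isinstance(value, dict):
        return set().union(*(collect_numbers(v) for v in value.values()))
    if isinstance(value, (list, tuple)):
        return set().union(*(collect_numbers(v) for v in value))
    return set()


def unverified_numbers(explanation, tool_outputs):
    """Return numbers stated in the text that no tool output contains."""
    allowed = collect_numbers(tool_outputs)
    stated = {float(match) for match in NUMBER.findall(explanation)}
    return sorted(stated - allowed)


# Illustrative tool outputs; field names and values are hypothetical.
outputs = [{"tool": "run_response_spectrum", "output": {"max_drift": 0.0125, "base_shear_kN": 842.0}}]
grounded = "The peak drift is 0.0125 and the base shear is 842 kN."
fabricated = "The peak drift is 0.0125 and the base shear is 900 kN."
```

An empty result means every stated figure traces back to a server output; a non-empty result identifies the figures that would need to be withheld or reviewed.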

2.5.2. Automated Quality Assurance (Verification) Versus Professional Oversight

Beyond the epistemic firewall, the architecture enforces a rigorous distinction between the Automated Quality Assurance Layer and Human-in-the-Loop (HITL) Oversight. The validation layer operates as an automated rule-checking mechanism. Its primary function is to systematically mitigate the specific risks of “Recursive Hallucination” and arithmetic deficiency by evaluating internal consistency and workflow conformance independently of human intervention.

To operationalise this, the MCP Gateway (Layer 4) executes schema validation on every tool call. This enforces the Client–Server dichotomy, strictly decoupling stochastic agentic reasoning from immutable server logic. Simultaneously, the Audit & Ethics Agents (Layer 3) provide continuous oversight by logging tool inputs/outputs. As validated in recent studies using OPENSEESPY [25,26], these machine-enforceable safeguards ensure the process remains mathematically pure before results reach the user. Consequently, HITL oversight is reserved as a post-synthesis, normative layer. This allows the engineer to focus on professional accountability and design approval, rather than debugging the stochastic errors inherent in generative models.
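The logging of tool inputs and outputs could be made tamper-evident by chaining record hashes, as sketched below. Hash chaining is an assumption introduced here for illustration; the text specifies only that inputs and outputs are logged. Field names and values are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first record


def _digest(body):
    """Hash a record body deterministically (sorted keys give a stable encoding)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(log, tool, inputs, outputs):
    """Append a tool call to the log, linking it to the previous record's hash."""
    previous_hash = log[-1]["hash"] if log else GENESIS
    body = {"tool": tool, "inputs": inputs, "outputs": outputs, "previous_hash": previous_hash}
    log.append({**body, "hash": _digest(body)})
    return log[-1]


def verify_chain(log):
    """Recompute every hash and link; any edited record breaks the chain."""
    previous_hash = GENESIS
    for record in log:
        body = {key: value for key, value in record.items() if key != "hash"}
        if record["previous_hash"] != previous_hash or record["hash"] != _digest(body):
            return False
        previous_hash = record["hash"]
    return True


audit_log = []
append_record(audit_log, "hazard-server.run_psha", {"site_id": "site-01"}, {"status": "ok"})
append_record(audit_log, "design-server.check_aci318", {"element": "B-3"}, {"passes": True})
```

If any logged output is altered after the fact, `verify_chain` returns `False`, which gives the reviewing engineer a simple integrity check on the decision trail.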

2.5.3. Positioning of Quantitative Metrics

While the proposed framework establishes the conceptual architecture for containment, explicit quantitative validation—such as factual consistency scores and LLM-as-a-Judge benchmarks—remains highly context-dependent. Consequently, the definition and application of such metrics fall beyond the scope of the present architectural formulation and are intentionally deferred to future work.

3. Results and Discussion

3.1. The Epistemological Shift: From Linear Prediction to Autonomous Orchestration

Historically, the Civil Engineering sector’s reliance on proprietary tools has established a “linear, sequential” workflow that fundamentally lacks the capacity for autonomous adaptation [28]. Recent comprehensive reviews [27] confirm that current AI integrations remain constrained by this paradigm, typically functioning as linear pipelines or “Copilots” that rely heavily on post hoc human validation to mitigate stochastic errors. While this approach reduces risk, it fails to fundamentally resolve the reliability gap required for autonomous seismic infrastructure. Consequently, current integration efforts frequently result in a “brittle architecture” where point-to-point connections require significant manual maintenance. This operational friction demonstrates that the solution to industry fragmentation is not merely better software interoperability, but rather “cognitive orchestration” [29]: a move from reactive error correction to architectural prevention.

Achieving this orchestration requires a fundamental distinction between conventional Machine Learning and Agentic AI [30]. Unlike predictive ML, Agentic AI is defined as systems that are contextually aware and autonomously goal-directed [31]. By elevating the LLM to the role of a “Central Coordinator,” this framework provides the “self-direction and reflection” necessary to decompose complex tasks without human micro-management. The convergence of this cognitive layer with traditional deterministic logic indicates that the discipline is transitioning from systems focused solely on prediction toward those capable of autonomous, goal-directed management. This capability is particularly vital for optimising Soil–Structure Interaction [7], where accurate analysis requires rigorous sequencing of ‘construction stages’ (e.g., excavation, backfilling). By autonomously orchestrating these multi-step dependencies on deterministic servers, the framework ensures algorithmic precision in defining initial stress states, effectively automating complex data interpretation that is otherwise prone to manual oversight.

3.2. Validating the Epistemic Firewall and Cognitive Safety

The implementation of the Client–Server dichotomy [32] successfully operationalises the “Epistemic Firewall” defined in Section 2.5, effectively eliminating the risk of Recursive Hallucination in numerical workflows. By restricting the Agentic AI to textual reasoning whilst isolating physics-based computations in deterministic servers, the framework ensures that the AI is physically incapable of “guessing” a structural response. As demonstrated in recent validation studies using OPENSEESPY [25,26], this architecture confirms that whilst the process is orchestrated by stochastic agents, the product remains mathematically pure and reproducible.

Critically, this architectural separation also aligns with the cognitive attention thresholds defined by Nielsen [33]. By automating the verification and validation (V&V) process within the middleware, the system maintains cycle times (6–12 s) that allow the engineer to stay focused on the dialogue without the cognitive burden of manual syntax checking. Consequently, the rigorous application of MCP transforms Generative AI from a “Black Box” into a verifiable, physics-informed structure suitable for high-stakes engineering.

3.3. RAG as a “Constitutional” Governance Layer

Rather than functioning merely as a conversational enhancement, the Knowledge MCP Server is positioned as a “Regulatory Governor.” This reclassification is critical; Civil Engineering mandates rigorous factual accuracy and cannot tolerate the epistemic risks associated with unsupported assertions or probabilistic errors.

This architecture encapsulates RAG within a deterministic Model Context Protocol (MCP) environment, enforcing a strict logical dichotomy between Contextual Justification, sourced directly from immutable regulatory standards such as ACI 318 and NEC-15, and Computational Truth, derived from physics-based Finite Element Method servers. By compelling the agent to anchor all decisions in these external, verifiable contexts, the approach mitigates the risk of hallucination inherent in unaugmented models. Consequently, the framework is designed to provide “traceable reasoning,” wherein every engineering decision, from preliminary design assumptions to final compliance checks, is linked to a verifiable document. This ensures that the system provides not just answers, but valid arguments where the premises logically support the conclusion, thereby satisfying the industry’s stringent requirements for liability management and professional accountability.

3.4. Achieving Lifecycle Continuity: A Closed-Loop Event-Driven Ecosystem

The persistent lag between physical assets and their digital models in structural health monitoring fundamentally constrains traditional civil engineering practice. Current practices often render models static and unresponsive to real-time degradation, acting merely as archival snapshots rather than dynamic operational tools. Recent empirical studies confirm that traditional “information integration” fails to address the “surge of criticalities” inherent in complex infrastructure, necessitating a shift toward intelligent, low-latency processing [34]. To realise true Lifecycle Continuity, we must transition to an Event-Driven Engineering Ecosystem. This argument rests on the necessity of implementing an asynchronous Event Bus Layer. Unlike synchronous systems that idle awaiting manual input, this architecture operates autonomously using distributed, stateless microservices, addressing the latency problem through streamlined, non-blocking workflow execution [35].

In critical scenarios such as Structural Health Monitoring, the system detects specific telemetry events (exemplified by shm.alert.damage) and independently triggers computational re-analysis. This capacity for self-directed assessment supports the claim that the architecture bridges the validation gap required for autonomous civil infrastructure, moving beyond passive sensing to “AIoT-enabled decentralised fault diagnosis” [36]. Because the framework addresses the latency problem through asynchronous processing, and because it enables immediate, automated reaction to seismic or degradation events, it transcends traditional static limitations. It is therefore inferred that the proposed framework transforms the digital model into a dynamic, “closed-loop ecosystem” capable of “perception-fusion-prediction” cycles. The inference holds because the capacity for real-time adaptation is a necessary condition for modern structural resilience, a condition which this event-driven architecture is designed to satisfy.

3.5. Illustrative Application: End-to-End Workflow Integration Logic

To demonstrate the practical implementation of this architecture, we operationalised a holistic seismic assessment workflow (Figure 2) orchestrated by six specialised agents, each governing a distinct phase of the structural lifecycle. The process begins with the Hazard Agent and Structural Agent, which quantify seismic demands through deterministic tool calls (e.g., hazard-server.run_psha). These outputs drive the Design Agent, which sizes elements and details reinforcement. Crucially, this triad is governed by a RAG-driven Knowledge Server that retrieves regulatory provisions (e.g., NEC-15, ACI 318) verbatim, ensuring that the generated design is bounded by verified code mandates rather than probabilistic inference.

The workflow extends beyond design into execution and monitoring. The Construction QA Agent validates the physical “as-built” geometry against the digital specifications, while the Structural Health Monitoring (SHM) Agent continuously ingests sensor data to assess performance during the service life. Overarching this entire ecosystem is the Audit & Ethics Agent. This supervisor agent provides the necessary verification and validation layer, auditing the decision logs of the other agents to ensure safety compliance and algorithmic transparency before any output is flagged for human review. This multi-agent structure ensures that the automation covers the full spectrum from conceptual design to long-term structural integrity.
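The separation between orchestration and deterministic computation in this workflow can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: only the tool name hazard-server.run_psha appears in the text, while the other tool names, the function bodies, the clause identifier, and every numerical value are placeholders that are not taken from NEC-15, ACI 318, or any real analysis.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ToolResult:
    tool: str      # deterministic server that produced the value
    value: float   # numerical output, never generated by the agent itself


# Hypothetical stand-ins for deterministic MCP servers. The formulas and
# numbers are placeholders, not real hazard or structural calculations.
def run_psha(site_id: str) -> float:
    return 0.40  # placeholder spectral acceleration in g


def run_analysis(sa_g: float) -> float:
    return 0.03 * sa_g  # placeholder inter-storey drift ratio


def retrieve_clause(topic: str) -> dict:
    # Placeholder knowledge base; the limit is NOT quoted from any code.
    clauses = {
        "drift_limit": {
            "id": "EXAMPLE-1",
            "text": "Drift shall not exceed 0.02.",
            "limit": 0.02,
        }
    }
    return clauses[topic]


TOOLS: Dict[str, Callable] = {
    "hazard-server.run_psha": run_psha,
    "structural-server.run_analysis": run_analysis,
    "knowledge-server.retrieve_clause": retrieve_clause,
}


def call_tool(name: str, *args):
    """The agentic layer only routes calls; it never computes numbers."""
    return TOOLS[name](*args)


def assess(site_id: str) -> dict:
    sa = ToolResult("hazard-server.run_psha",
                    call_tool("hazard-server.run_psha", site_id))
    drift = ToolResult("structural-server.run_analysis",
                       call_tool("structural-server.run_analysis", sa.value))
    clause = call_tool("knowledge-server.retrieve_clause", "drift_limit")
    return {
        "drift": drift,
        "clause_id": clause["id"],
        "clause_text": clause["text"],  # returned verbatim, not paraphrased
        "compliant": drift.value <= clause["limit"],
        "status": "pending_human_review",
    }


report = assess("site-001")
print(report["compliant"], report["clause_id"], report["status"])
```

Each numerical value carries the name of the server that produced it, and the compliance check compares against a retrieved limit rather than a value proposed by the language model.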

3.6. Ethical Implications: The Engineer as “Executive Reviewer”

The integration of autonomous orchestration requires conceptualising Agentic AI as an advanced instrument rather than a professional replacement [37]. Our framework operationalises this by positioning the engineer not as a “Manual Calculator,” but as an “Executive Reviewer.” This shift is enforced through three architectural safeguards.

First, Liability Allocation is strictly defined by system roles. Deterministic MCP Servers bear responsibility for numerical outputs (e.g., drifts, force demands), whilst the Agentic AI is restricted to workflow orchestration. If an incorrect drift value occurs, accountability traces to the specific analysis engine, not the agentic layer. This explicitly prevents the AI from generating numerical thresholds, ensuring the human engineer retains legal responsibility for final design approval.

Second, Accountability via Approval Logging is mandatory at all Human-in-the-Loop gates. Approving a seismic design check generates an immutable record linking the workflow identifier, numerical model version, and retrieved regulatory clauses to the engineer’s identity. This creates an auditable decision trail comparable to signed calculation packages, enhancing transparency while mirroring professional documentation standards.
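One way to approximate such a record is a hash-chained, append-only log, sketched below with the Python standard library. The ApprovalLog class, its field names, and the sample identifiers are assumptions made for illustration; the chain makes later modification detectable (tamper-evident) but does not by itself make the record immutable, and a production system would also add timestamps and digital signatures.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ApprovalLog:
    """Append-only, hash-chained log (tamper-evident, not tamper-proof)."""
    entries: List[dict] = field(default_factory=list)

    @staticmethod
    def _digest(body: dict) -> str:
        # Canonical JSON so identical content always hashes identically.
        encoded = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def approve(self, workflow_id: str, model_version: str,
                clause_ids: List[str], engineer_id: str) -> dict:
        body = {
            "workflow_id": workflow_id,
            "model_version": model_version,
            "clause_ids": list(clause_ids),
            "engineer_id": engineer_id,
            # Link to the previous entry so edits break the chain.
            "prev_hash": self.entries[-1]["hash"] if self.entries else "",
        }
        entry = {**body, "hash": self._digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        prev = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev or self._digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


log = ApprovalLog()
log.approve("wf-001", "model-v3", ["EXAMPLE-1"], "engineer-42")
log.approve("wf-002", "model-v4", ["EXAMPLE-2"], "engineer-42")
print(log.verify())
```

Changing any field of an earlier entry, such as the engineer identifier, causes verify() to return False for the whole chain.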

Finally, the system ensures Ethical Oversight in Ambiguous Design Choices. Autonomous agents cannot resolve normative trade-offs, such as balancing constructability against seismic robustness or choosing between code-minimum and conservative designs. When multiple code-compliant alternatives exist, the system presents implications but defers selection to the engineer. This ensures ethical judgement remains a human responsibility, reinforcing the engineer’s role as an active executive.
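The deferral rule can be expressed as a small decision function. The sketch below is an assumption about how such a gate might look; the Alternative and Decision types, the resolve function, and the sample options are hypothetical. When several compliant alternatives exist, the function returns them with their implications and marks the choice as requiring the engineer, rather than ranking them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Alternative:
    name: str
    compliant: bool      # result of a deterministic compliance check
    implications: str    # plain-language summary shown to the engineer


@dataclass(frozen=True)
class Decision:
    selected: Optional[Alternative]
    requires_engineer: bool
    options: List[Alternative]


def resolve(alternatives: List[Alternative]) -> Decision:
    compliant = [a for a in alternatives if a.compliant]
    if len(compliant) == 1:
        # No trade-off to arbitrate; the usual approval gate still applies.
        return Decision(compliant[0], False, compliant)
    # Zero or several compliant options: never pick automatically.
    return Decision(None, True, compliant)


decision = resolve([
    Alternative("code-minimum", True, "lower cost, smaller safety margin"),
    Alternative("conservative", True, "higher cost, larger safety margin"),
    Alternative("undersized", False, "fails the drift check"),
])
print(decision.requires_engineer, [a.name for a in decision.options])
```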

3.7. Limitations and Implementation Challenges

While the proposed framework offers a unifying abstraction for integrating Artificial Intelligence into structural engineering workflows, it is presented as a reference architecture to guide future implementation rather than a solution with immediate industrial readiness. Practical adoption faces three primary constraints. First, integration and interoperability remain significant hurdles; the utilisation of the MCP requires standardised APIs and rigorous schema definitions, which are frequently absent in legacy engineering software (e.g., traditional FEM solvers). Consequently, connecting these heterogeneous tools currently necessitates the development of custom middleware or wrappers.
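The kind of wrapper mentioned above can be sketched as follows, assuming a legacy solver that only accepts and returns free-form text. The legacy_solver function, the schema, and the tool name legacy-fem.compute_drift are hypothetical stand-ins; the only engineering content is the standard definition of inter-storey drift ratio as relative displacement divided by storey height.

```python
from typing import Any, Dict

# Hypothetical input schema for the wrapped tool: field name -> type.
DRIFT_REQUEST_SCHEMA = {
    "storey_height_m": float,
    "top_disp_m": float,
    "bottom_disp_m": float,
}


def legacy_solver(card: str) -> str:
    """Stand-in for a legacy program with a text-only interface."""
    height, top, bottom = (float(x) for x in card.split())
    return f"DRIFT {(top - bottom) / height:.6f}"


def validate(payload: Dict[str, Any], schema: Dict[str, type]) -> None:
    missing = schema.keys() - payload.keys()
    extra = payload.keys() - schema.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for key, expected in schema.items():
        # Strict typing: integers are rejected where floats are expected.
        if not isinstance(payload[key], expected):
            raise TypeError(f"{key} must be {expected.__name__}")
    if payload["storey_height_m"] <= 0:
        raise ValueError("storey_height_m must be positive")


def compute_drift(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Schema-checked entry point that hides the legacy text format."""
    validate(payload, DRIFT_REQUEST_SCHEMA)
    card = (f"{payload['storey_height_m']} "
            f"{payload['top_disp_m']} {payload['bottom_disp_m']}")
    raw = legacy_solver(card)
    _, value = raw.split()
    return {
        "tool": "legacy-fem.compute_drift",
        "drift_ratio": float(value),
        "raw_output": raw,  # kept for traceability
    }


result = compute_drift(
    {"storey_height_m": 3.0, "top_disp_m": 0.045, "bottom_disp_m": 0.015}
)
print(result["drift_ratio"])
```

Rejecting malformed requests before they reach the solver keeps the interface contract explicit, which is the role the schema definitions play in the interoperability argument.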

Second, the architecture introduces computational overhead and system complexity distinct from traditional linear workflows. Although the underlying numerical mechanics remain deterministic, the agentic orchestration and RAG validation layers impose additional latency and resource demands that require careful optimisation for large-scale applications. Finally, regulatory scalability poses a challenge due to jurisdictional diversity. Accurately grounding the system across conflicting national codes and evolving standards requires the continuous curation of semantic alignments within the knowledge base to ensure robust governance.

3.8. Future Research Venues

To transition the proposed framework from a conceptual proposal to a verifiable industrial reality, future research must rigorously stress-test the autonomy and verifiability of its constituent layers through targeted pilot cases. Specifically, the Agentic AI’s capability to autonomously decompose vague goals for complex structures must be validated to ensure operation without human micro-management.

A vital future research direction involves applying this architecture to reinforced concrete infrastructure to validate its autonomous diagnostic capabilities. In this practical scenario, the Event Bus would trigger an immediate structural re-analysis upon detecting seismic telemetry that exceeds design thresholds. By autonomously updating the Structural Agent’s model with observed stiffness degradation, the system would provide the Executive Reviewer with rapid, physics-based data for critical re-occupancy decisions. Furthermore, the ‘Contextual Grounding’ of the Knowledge Server must be confirmed by subjecting the RAG architecture to conflicting international codes, thereby testing its ability to provide traceable reasoning across distinct regulatory environments. Executing these specific validation scenarios constitutes the essential condition for establishing the framework’s industrial viability.
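As a worked illustration of how observed stiffness degradation might be estimated from monitoring data, consider the simplest possible idealisation: a single-degree-of-freedom system whose mass is unchanged by the earthquake. The symbols below are assumptions introduced here and do not appear in the framework: f_0 and f_d are the natural frequencies before and after the event, k_0 and k_d the corresponding stiffnesses, and m the mass.

```latex
f_0 = \frac{1}{2\pi}\sqrt{\frac{k_0}{m}}, \qquad
f_d = \frac{1}{2\pi}\sqrt{\frac{k_d}{m}}
\quad\Longrightarrow\quad
\frac{k_d}{k_0} = \left(\frac{f_d}{f_0}\right)^{2}, \qquad
\Delta_k = 1 - \left(\frac{f_d}{f_0}\right)^{2}
```

Under this idealisation, a measured 10% drop in natural frequency (f_d/f_0 = 0.90) gives k_d/k_0 = 0.81, that is, a 19% loss of stiffness. Real buildings are multi-degree-of-freedom systems, so this relation is only a first-order indicator of the quantity that would be passed to the deterministic analysis server.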

4. Conclusions

The integration of the Model Context Protocol constitutes the critical enabler for the proposed agentic AI architecture in structural seismic engineering. Given that seismic design mandates absolute determinism, traceability, and strict regulatory compliance—standards that traditional LLMs cannot independently guarantee—MCP provides the essential secure, standardised client–server foundation. By isolating domain-specific computations, such as probabilistic seismic hazard analysis and nonlinear dynamic assessments, within validated MCP servers, the framework strictly segregates reasoning from calculation. This architectural separation effectively precludes LLM hallucination from compromising numerical workflows, ensuring that all safety decisions remain grounded in deterministic algorithms. Simultaneously, MCP empowers the agentic layer to operate safely within an event-driven, closed-loop ecosystem, where the MCP Gateway and Event Bus manage asynchronous updates ranging from hazard readiness to structural health monitoring. Consequently, MCP transcends the role of an auxiliary component to function as the foundational mechanism that transforms agentic AI from a mere conversational interface into a trustworthy, auditable, and lifecycle-aware computational system indispensable for safety-critical infrastructure.

Author Contributions

Conceptualization, C.F.Á.; Formal analysis, C.F.Á. and E.D.R.T.; Investigation, C.F.Á. and E.D.R.T.; Methodology, C.F.Á.; Supervision, C.F.Á.; Writing—original draft, E.D.R.T.; Writing—review & editing, C.F.Á. and E.D.R.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available in the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

ACI	American Concrete Institute
AI	Artificial Intelligence
AIoT	Artificial Intelligence of Things
API	Application Programming Interface
ASCE	American Society of Civil Engineers
BIM	Building Information Modeling
CAD	Computer-Aided Design
FEM	Finite Element Method
JSON	JavaScript Object Notation
LiDAR	Light Detection and Ranging
LLM	Large Language Model
MAS	Multi-Agent Systems
MCP	Model Context Protocol
ML	Machine Learning
NLP	Natural Language Processing
PSHA	Probabilistic Seismic Hazard Analysis
QA	Quality Assurance
RAG	Retrieval-Augmented Generation
SHM	Structural Health Monitoring

References

1. Ikudayisi, A.E.; Chan, A.P.C.; Darko, A.; Adedeji, Y.M.D. Integrated practices in the Architecture, Engineering, and Construction industry: Current scope and pathway towards Industry 5.0. J. Build. Eng. 2023, 73, 106788.
2. Zucca, M.; Reccia, E.; Vecchi, E.; Pintus, V.; Dessì, A.; Cazzani, A. An Evaluation of the Structural Behaviour of Historic Buildings Under Seismic Action: A Multidisciplinary Approach Using Two Case Studies. Appl. Sci. 2024, 14, 10274.
3. Onatayo, D.; Onososen, A.; Oyediran, A.O.; Oyediran, H.; Arowoiya, V.; Onatayo, E. Generative AI Applications in Architecture, Engineering, and Construction: Trends, Implications for Practice, Education & Imperatives for Upskilling—A Review. Architecture 2024, 4, 877–902.
4. Khodabakhshian, A.; Puolitaival, T.; Kestle, L. Deterministic and Probabilistic Risk Management Approaches in Construction Projects: A Systematic Literature Review and Comparative Analysis. Buildings 2023, 13, 1312.
5. Gomes, A.M.; Azevedo, G.; Sampaio, A.Z.; Lite, A.S. BIM in Structural Project: Interoperability Analyses and Data Management. Appl. Sci. 2022, 12, 8814.
6. Team, T.D. AI Agents and Deterministic Workflows: A Spectrum, Not a Binary Choice|Deepset Blog. 2025. Available online: https://www.deepset.ai/blog/ai-agents-and-deterministic-workflows-a-spectrum (accessed on 20 November 2025).
7. Zucca, M.; Crespi, P.G.; Longarini, N. Seismic vulnerability assessment of an Italian historical masonry dry dock. Case Stud. Struct. Eng. 2017, 7, 1–23.
8. Ren, Y.; Liu, Y.; Ji, T.; Xu, X. AI Agents and Agentic AI-Navigating a Plethora of Concepts for Future Manufacturing. arXiv 2025.
9. Acharya, D.B.; Kuppan, K.; Divya, B. Agentic AI: Autonomous Intelligence for Complex Goals—A Comprehensive Survey. IEEE Access 2025, 13, 18912–18936.
10. Ahmad, H.M. Multi-Agent Retrieval-Augmented System for Domain-Specific Knowledge in Structural Engineering. Master’s Thesis, Aalto University, Espoo, Finland, 2025.
11. Rawat, A.; Witt, E.; Lill, I. A Conceptual Framework for Llm-Based Multi-Agent Systems in Construction Management. In Proceedings of the CIB W78 Conference on IT in Construction, Porto, Portugal, 14–17 July 2025.
12. Das, K.; Khursheed, S.; Paul, V.K. The impact of BIM on project time and cost: Insights from case studies. Discov. Mater. 2025, 5, 25.
13. International Telecommunication Union. AI Standards for Global Impact: From Governance to Action, Geneva. 2025. Available online: https://www.itu.int/epublications/es/publication/ai-standards-for-global-impact-from-governance-to-action (accessed on 20 November 2025).
14. Junttu, J. Enhancing Information Accessibility in Infrastructure Construction. Master’s Thesis, Tampere University, Tampere, Finland, 2025.
15. Barnett, S.; Kurniawan, S.; Thudumu, S.; Brannelly, Z.; Abdelrazek, M. Seven Failure Points When Engineering a Retrieval Augmented Generation System. In Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering—Software Engineering for AI, Lisbon, Portugal, 14–15 April 2024; ACM: New York, NY, USA, 2024; pp. 194–199.
16. ACI Committee 318; Building Requirements for Structural Concrete and Commentary. American Concrete Institute: Farmington Hills, MI, USA, 2019; Volume 1.
17. ASCE/SEI 7-22; Minimum Design Loads and Associated Criteria for Buildings and Other Structures. American Society of Civil Engineers: Reston, VA, USA, 2022.
18. CAMICON; MIDUVI. Norma Ecuatoriana de la Construcción—NEC: NEC-SE-MP—Mamposteria Estructural, Quito. 2014. Available online: https://www.habitatyvivienda.gob.ec/wp-content/uploads/2023/03/10.-NEC-SE-MP-Mamposteria-Estructural.pdf (accessed on 28 August 2025).
19. Rios, D.; Altamirano, M.; Ilbay, D.; Tlapanco, J.; Rivera-Tapia, D.; Avila, C. Beyond Prescriptive Codes: A Validated Linear–Static Methodology for Seismic Design of Soft-Storey RC Structures. Buildings 2025, 16, 60.
20. Orakcal, K.; Massone, L.M.; Ulugtekin, D. A Hysteretic Constitutive Model for Reinforced Concrete Panel Elements. Int. J. Concr. Struct. Mater. 2019, 13, 51.
21. Xu, B. Hallucination is Inevitable for LLMs with the Open World Assumption. arXiv 2025, arXiv:2510.05116.
22. Anthropic. Introducing the Model Context Protocol. 2024. Available online: https://www.anthropic.com/news/model-context-protocol (accessed on 17 June 2025).
23. Protocol, M.C. What Is the Model Context Protocol (MCP)? 2025. Available online: https://modelcontextprotocol.io/docs/getting-started/intro (accessed on 24 November 2025).
24. Cloud, G. What Is Model Context Protocol (MCP)? 2025. Available online: https://cloud.google.com/discover/what-is-model-context-protocol (accessed on 24 November 2025).
25. Avila, C.; Ilbay, D.; Rivera, D. Human–AI Teaming in Structural Analysis: A Model Context Protocol Approach for Explainable and Accurate Generative AI. Buildings 2025, 15, 3190.
26. Avila, C.; Ilbay, D.; Tapia, P.; Rivera, D. Toward Responsible AI in High-Stakes Domains: A Dataset for Building Static Analysis with LLMs in Structural Engineering. Data 2025, 10, 169.
27. Elsisi, A. Large Language Models Application in Civil and Structural Engineering: Review. SSRN 2025.
28. Sacks, R.; Eastman, C.; Lee, G.; Teicholz, P. BIM Handbook; Wiley: Hoboken, NJ, USA, 2018.
29. Wang, L.; Ma, C.; Feng, X.; Zhang, Z.; Yang, H.; Zhang, J.; Chen, Z.; Tang, J.; Chen, X.; Lin, Y.; et al. A survey on large language model based autonomous agents. Front. Comput. Sci. 2024, 18, 186345.
30. Xi, Z.; Chen, W.; Guo, X.; He, W.; Ding, Y.; Hong, B.; Zhang, M.; Wang, J.; Jin, S.; Zhou, E.; et al. The rise and potential of large language model based agents: A survey. Sci. China Inf. Sci. 2025, 68, 121101.
31. Franklin, S.; Graesser, A. Is It an agent, or just a program?: A taxonomy for autonomous agents. In Intelligent Agents III Agent Theories, Architectures, and Languages; Müller, J., Ed.; Springer: Berlin, Germany, 1996; pp. 21–35.
32. Ray, P.P. A Survey on Model Context Protocol: Architecture, State-of-the-art, Challenges and Future Directions. TechRxiv 2025.
33. Nielsen, J. Usability Engineering|Enhanced Reader, 1st ed.; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1994.
34. Wang, K.; Zhou, X.; Guan, J. The construction of an integrated cloud network digital intelligence platform for rail transit based on artificial intelligence. Sci. Rep. 2025, 16, 393.
35. Bertozzi, N.; Geraci, A.; Bergamasco, L.; Ferrera, E.; Pristeri, E.; Pastrone, C. A Distributed Event-Orchestrated Digital Twin Architecture for Optimizing Energy-Intensive Industries. In Proceedings of the 10th International Conference on Internet of Things, Big Data and Security, Porto, Portugal, 6–8 April 2025; pp. 337–344.
36. Geck, C.C.; Al-Zuriqat, T.; Elmoursi, M.; Dragos, K.; Smarsly, K. AIoT-enabled decentralized sensor fault diagnosis for structural health monitoring. In Proceedings of the 11th European Workshop on Structural Health Monitoring, EWSHM 2024, Potsdam, Germany, 10–13 June 2024.
37. Shneiderman, B. Human-Centered AI: A New Synthesis. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2021; Volume 12932, pp. 3–8.

Figure 1. Multi-layer agentic AI architecture illustrating the hierarchical flow from the user and interface layer (Layer 1: human-in-the-loop oversight) through the agentic AI and domain-specific agent layers (Layers 2–3), the MCP gateway and event bus (Layers 4–5) for secure orchestration and asynchronous communication, to the MCP server layer (Layer 6) and external data input layer (Layer 7) enabling deterministic computation and lifecycle data ingestion.

Figure 2. Layered Agentic AI architecture applied to a building structure workflow. The diagram illustrates the vertical integration from the User Interface (Layer 1) through the Agentic Orchestration (Layers 2–3) down to the deterministic MCP Servers (Layer 6). The feedback loop is closed via the Event Bus (Layer 5), which ingests physical data from the timber structure (Layer 7) to trigger automated Quality Assurance protocols.

Table 1. Operational mapping of domain-specific agents, delineating Human-in-the-Loop (HITL) oversight activities, governing structural metrics, and RAG-retrieved regulatory provisions.

Agent	Human-in-the-Loop (HITL) Activities	Key Structural Metrics	Regulatory Provisions Retrieved via RAG (Generic)
Agentic AI (Executive Layer)	• Define performance objectives • Approve workflow scope • Confirm modelling depth	• Target performance level (LS, CP, IO) • Analysis strategy	• Performance objectives • Design philosophy clauses (capacity design, hierarchy of resistance)
Hazard Agent	• Validate site location and soil class • Select hazard level (DBE/MCE) • Approve design spectrum	• PGA • Spectral acceleration Sa(T) • Site-modified response spectrum	• Seismic zoning maps • Soil classification criteria • Spectrum construction rules
Structural Agent	• Approve modelling assumptions • Review governing response quantities	• Inter-storey drift ratios • Storey shears • Modal participation factors	• Analysis method applicability • Drift computation definitions • Modelling assumptions (effective stiffness, damping)
Design Agent	• Accept governing load combinations • Approve compliance results	• Demand–capacity ratios • Drift compliance • Detailing indicators	• Strength equations • Load combinations • Drift limits • Ductility and detailing provisions
Construction QA Agent	• Review detected deviations • Decide acceptance or remediation	• Geometric tolerances • Reinforcement placement errors • Material conformity	• Construction tolerance limits • Acceptance criteria • Inspection and testing requirements
SHM Agent	• Interpret damage indicators • Decide re-occupancy or intervention	• Frequency shifts • Stiffness degradation • Residual drift	• Post-earthquake assessment guidelines • Damage state definitions • Re-occupancy criteria
Audit & Ethics Agent	• Formal approval and sign-off • Acceptance of assumptions	• Approval logs • Version traceability	• Professional responsibility clauses • Documentation and liability requirements

