Toward Responsible AI in High-Stakes Domains: A Dataset for Building Static Analysis with LLMs in Structural Engineering
Abstract
1. Summary
2. Data Description
2.1. Prompt Description of the Study Case
2.1.1. Context
- Purpose: Provide the professional and technical framework within which all subsequent instructions must be interpreted.
2.1.2. Instructions
- Analyse a three-dimensional reinforced concrete frame.
- Verify code compliance for inter-storey drift.
- Apply structural optimisation if necessary.
- Purpose: Define the general workflow of the analysis.
2.1.3. Details
- Material properties: modulus of elasticity.
- Geometry: spans in X and Y directions; storey heights.
- Cross-sections: beam and column dimensions.
- Cracking factors: beams (0.7) and columns (0.8).
- Loads: dead load (4.9 kN/m²) and live load (1.9 kN/m²).
- Coefficients: load factors, base shear coefficient, torsion factor, drift amplification, and maximum allowable drift.
- Purpose: These values serve as tabular input data to be read directly by the solver. Each line corresponds to one parameter category, and its interpretation is straightforward (e.g., geometric dimensions in metres and load intensities in kN/m²).
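The parameter categories listed above can be held in a single typed record before being passed to a solver. The following is a minimal sketch, not the authors' implementation: the class name `FrameInputs` and all field names are hypothetical, while the numerical values are the Case A values stated in the study-case prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameInputs:
    """Input record for one study case (hypothetical field names)."""
    elastic_modulus_kn_m2: float
    spans_x_m: tuple[float, ...]
    spans_y_m: tuple[float, ...]
    storey_heights_m: tuple[float, ...]
    beam_section_m: tuple[float, float]     # width, depth
    column_section_m: tuple[float, float]   # width, depth
    beam_cracking_factor: float = 0.7
    column_cracking_factor: float = 0.8
    dead_load_kn_m2: float = 4.9
    live_load_kn_m2: float = 1.9
    dead_load_coefficient: float = 1.0
    live_load_coefficient: float = 0.15
    base_shear_coefficient: float = 0.1488
    accidental_torsion: float = 0.05
    drift_amplification: float = 6.0
    max_allowable_drift: float = 0.02

    @property
    def n_storeys(self) -> int:
        # One storey per listed storey height.
        return len(self.storey_heights_m)

    @property
    def total_height_m(self) -> float:
        return sum(self.storey_heights_m)


# Case A, using the values given in the study-case prompt.
case_a = FrameInputs(
    elastic_modulus_kn_m2=21_458_890.83,
    spans_x_m=(4.0, 4.0),
    spans_y_m=(4.0, 4.0),
    storey_heights_m=(3.0, 3.0),
    beam_section_m=(0.25, 0.30),
    column_section_m=(0.30, 0.30),
)
print(case_a.n_storeys, case_a.total_height_m)
```

Keeping the shared coefficients as defaults means that only geometry and sections need to be supplied per case.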
2.1.4. Tasks
- Iterate through all inelastic drift values.
- Compare against maximum allowable drift (0.02).
- If one or more values exceed the limit, report the storey number, direction, and drift value.
- Only if all drifts are ≤0.02 may compliance be confirmed.
- Present results in tabular format with numerical precision.
- If non-compliance occurs, generate 10 alternatives by modifying materials and section dimensions.
- Re-evaluate drifts for each configuration.
- Compare alternatives in tabular form.
- Highlight compliant and efficient options.
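The strict validation rule in the tasks above (report every exceedance, and confirm compliance only if no drift exceeds the limit) can be sketched as follows. The function and type names are hypothetical; the X-direction drifts in the first example are the Case A GPT+MCP values from the storey drift table, and the second input is invented purely to show a failing case.

```python
from typing import NamedTuple

MAX_ALLOWABLE_DRIFT = 0.02  # limit stated in the prompt


class DriftViolation(NamedTuple):
    storey: int
    direction: str
    drift: float


def find_drift_violations(drifts, limit=MAX_ALLOWABLE_DRIFT):
    """Return every (storey, direction, drift) whose drift exceeds the limit.

    `drifts` maps a direction ("X" or "Y") to a list of inelastic drifts,
    ordered from storey 1 upwards.
    """
    violations = []
    for direction, values in drifts.items():
        for storey, drift in enumerate(values, start=1):
            if drift > limit:
                violations.append(DriftViolation(storey, direction, drift))
    return violations


def is_compliant(drifts, limit=MAX_ALLOWABLE_DRIFT):
    """Compliance holds only if no drift value exceeds the limit."""
    return not find_drift_violations(drifts, limit)


# Case A, GPT+MCP, X direction (storey 1, storey 2).
case_a_drifts = {"X": [0.012, 0.013]}
print(is_compliant(case_a_drifts))

# Invented input with a single exceedance, for illustration only.
failing = {"X": [0.012, 0.025], "Y": [0.019, 0.020]}
for v in find_drift_violations(failing):
    print(f"storey {v.storey}, direction {v.direction}, drift {v.drift}")
```

A drift exactly equal to 0.02 is treated as compliant, matching the "≤0.02" wording of the task list.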
2.1.5. Intent
- Generate an automated technical report.
- Include detailed structural analysis, code validation, and optimisation proposals when required.
- Use technical language with clear tables.
- Ensure suitability for professional and academic environments.
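The five blocks described in Sections 2.1.1 to 2.1.5 can be assembled into one prompt string in a fixed order. The sketch below is illustrative only: `build_prompt` is a hypothetical helper, and the example strings are abbreviated from the study-case prompt rather than reproduced in full.

```python
def build_prompt(context, instructions, details, tasks, intent):
    """Join the five prompt blocks in the order used by the study-case prompt.

    `details` and `tasks` are lists of sentences; tasks are numbered
    automatically, starting at 1.
    """
    sections = [
        ("Context", [context]),
        ("Instructions", [instructions]),
        ("Details", list(details)),
        ("Tasks", [f"{i}. {task}" for i, task in enumerate(tasks, start=1)]),
        ("Intent", [intent]),
    ]
    lines = []
    for title, body in sections:
        lines.append(f"{title}:")
        lines.extend(body)
    return "\n".join(lines)


example = build_prompt(
    context="You are an expert in structural analysis using OpenSeesPy.",
    instructions="Analyse a three-dimensional reinforced concrete frame.",
    details=[
        "Dead load is 4.9 kN/m2 and live load is 1.9 kN/m2.",
        "The maximum allowable drift is 0.02.",
    ],
    tasks=[
        "Compute storey drifts per level and direction.",
        "Compare each drift against the allowable maximum.",
    ],
    intent="Generate an automated technical report.",
)
print(example)
```

Building the prompt from separate blocks keeps the numerical details in one place, so a different study case only changes the `details` list.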
2.2. Dataset Significance
- Storey Drift: Storey drift values are reported for each storey in both the X and Y directions (Table 3). These tables allow readers to observe the vertical distribution of drift across cases and computational methods. Storey drift quantifies the relative displacement between consecutive levels and is a key parameter for NEC-15 compliance, which establishes a 2% upper limit. Reading guide: Table 3 displays raw drift values per storey and direction, enabling direct verification of inter-storey deformation patterns.
- Maximum Displacement: The maximum storey displacements in the X and Y directions are summarised numerically in Table 4. This dataset provides insight into global deformation profiles, which are essential for evaluating the likelihood of structural interaction with neighbouring buildings. Reading guide: Table 4 reports the displacement values in metres for each computational method.
- Base Shear: Table 5 presents the base shear (kN) values for the studied cases. These results quantify the seismic demand transmitted to the foundation and reflect the combined influence of structural weight and stiffness. Reading guide: Table 5 provides total base shear per case and method, facilitating cross-model comparison.
- Building Period: The fundamental period of vibration for each case is shown in Table 6. As an indirect measure of global stiffness, the building period is critical for understanding overall structural dynamics: shorter periods generally reflect stiffer behaviour and reduced displacements. Reading guide: Table 6 presents the fundamental period of vibration for Study Case A across the different modelling scenarios (GPT, GPT+MCP, OpenSees, and ETABS). The comparison highlights the consistency of the solver-based methods relative to ETABS and the divergence of the GPT-only outputs.
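Storey drift, described above as the relative displacement between consecutive levels, can be computed from floor displacements as sketched below. This is an assumption-laden illustration, not the dataset's generating code: the function name is hypothetical, the base is assumed fixed, and the inelastic drift is taken as the elastic drift ratio multiplied by the drift amplification factor of 6.0 named in the prompt. The floor displacements are invented values chosen for the example.

```python
def inelastic_drifts(floor_displacements_m, storey_heights_m, amplification=6.0):
    """Amplified inter-storey drift ratios, ordered from storey 1 upwards.

    drift_i = amplification * (d_i - d_{i-1}) / h_i, with d_0 = 0 at the base.
    """
    if len(floor_displacements_m) != len(storey_heights_m):
        raise ValueError("one displacement is required per storey height")
    drifts = []
    previous = 0.0  # assumed fixed base
    for displacement, height in zip(floor_displacements_m, storey_heights_m):
        drifts.append(amplification * (displacement - previous) / height)
        previous = displacement
    return drifts


# Invented elastic floor displacements (m) for a two-storey frame
# with 3.0 m storey heights.
drifts = inelastic_drifts([0.0060, 0.0125], [3.0, 3.0])
print([round(d, 3) for d in drifts])  # [0.012, 0.013]
```

Each resulting ratio can then be compared directly with the 0.02 limit.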
2.3. Relative Error as a Measure of Accuracy
- Benchmarking performance: Relative error provides a direct comparison of novel AI-assisted methods (GPT and GPT+MCP) against a widely validated standard (ETABS).
- Cross-parameter evaluation: Since relative error is unitless, it permits consistent assessment across storey drift (%), displacements (m), base shear (kN), and period (s).
- Reproducibility and extension: Researchers can employ the provided error tables to reproduce the evaluation, extend the analysis to new structural typologies, or integrate the metric into broader model validation frameworks.
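As a worked illustration of the metric, the sketch below assumes the usual definition of relative error, |value − reference| / |reference| × 100, with ETABS as the reference. The function name is hypothetical, and the inputs are the rounded Case A maximum displacement values from the results tables, so the output may differ slightly from the published error values, which were presumably computed from unrounded data.

```python
def relative_error_pct(value, reference):
    """Relative error of `value` with respect to `reference`, in percent."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return abs(value - reference) / abs(reference) * 100.0


# Case A maximum displacement (m), rounded values as tabulated.
etabs = 0.01282
methods = {"GPT": 0.047, "GPT+MCP": 0.01264, "OpenSees": 0.01264}

errors = {name: relative_error_pct(v, etabs) for name, v in methods.items()}
for name, err in errors.items():
    print(f"{name}: {err:.2f} %")
```

Because the result is unitless, the same function applies unchanged to drift, base shear, and period.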
2.4. Dataset Statistics
2.5. Potential for Reuse and Future Research
3. Methods
3.1. Architecture Workflow
3.2. Data Collection and Processing
3.3. Validation and Curation
3.4. Data Quality and Noise Control
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AI | Artificial Intelligence |
| LLM | Large Language Model |
| GPT | Generative Pre-trained Transformer |
| MCP | Model Context Protocol |
| BIM | Building Information Modelling |
| HVAC | Heating, Ventilation, and Air Conditioning |
| NEC-15 | Norma Ecuatoriana de la Construcción 2015 |
| ASCE 7-22 | American Society of Civil Engineers Standard 7-2022 |
| ETABS | Extended Three-dimensional Analysis of Building Systems |
| OpenSees | Open System for Earthquake Engineering Simulation |
| OpenSeesPy | Python interface to OpenSees |
| JSON | JavaScript Object Notation |
| API | Application Programming Interface |
| FastAPI | Fast Application Programming Interface (Python framework) |
| CIDI | Context–Instruction–Details–Intent |

| (a) |
| prompt = ( "Context:" "You are an expert in structural analysis using natural language and numerical simulation with OpenSeesPy." "The implemented system is capable of interpreting technical prompts and generating automated structural simulations" "based on international seismic-resistant design standards." "Instructions:" "Analyse a three-dimensional reinforced concrete frame, verify code compliance for inter-storey drift," "and apply structural optimisation if necessary." "Details:" "The modulus of elasticity for concrete is 21,458,890.83 kN/m2." "The structural system has spans of 4.0 and 4.0 m in the X direction, and spans of 4.0 and 4.0 m in the Y direction." "The structure has 2 stories, with storey heights of: 3.0 and 3.0 m respectively." "Beams have a cross-sectional dimension of 0.25 × 0.30 m, and columns are 0.30 × 0.30 m." "Cracking factors are 0.7 for beams and 0.8 for columns." "Dead load is 4.9 kN/m2 and live load is 1.9 kN/m2." "The weight coefficients are: 1.0 for dead load, 0.15 for live load, 0.1488 for base shear coefficient," "1.0 for vertical distribution of base shear, 0.05 for accidental torsion," "and a drift amplification factor of 6.0 is applied to estimate inelastic drift. The maximum allowable drift is 0.02." "Tasks:" "1. Perform linear static seismic analysis using the equivalent lateral force method with OpenSeesPy." "2. Compute maximum displacements and storey drifts per level and direction (X and Y)." "3. Perform strict numerical validation:" " - Iterate through all obtained inelastic drift values." " - For each value, compare it against the allowable maximum (0.02)." " - If *at least one value* exceeds 0.02, *you must not state that all values are compliant*." " - Report precisely: storey number, direction (X or Y), and the drift value that exceeds the limit." " - Only if *all drifts* are ≤0.02, the code compliance can be confirmed." " - Present results in tabular format and be rigorous with numerical precision." "4. Also determine floor-by-floor shear forces and vibration modes." "5. Structural optimisation:" " - If any drift exceeds the limit, propose a structural optimisation based on displacements," " storey drifts, shear forces, and vibration modes." " - Generate 10 alternatives by modifying material properties and section dimensions." " - Evaluate drift for each alternative and present the comparison in tabular format." " - Highlight the configurations that meet code requirements and provide better structural efficiency." "Intent:" "Generate an automated technical report, including detailed structural analysis, code validation," "and optimisation in case of non-compliance. The output must be expressed in technical language and clear tables," "suitable for professional and academic environments." ) |
| (b) |

| Category | Parameter | Case A | Case B | Case C | Case D |
|---|---|---|---|---|---|
| Geometry | No. of Stories | 2 | 3 | 5 | 5 |
| Geometry | Storey Heights (m) | 3.0–3.0 | 3.0–3.0–3.0 | 3.5–2.5–2.5–2.5–2.5 | 3.0–2.5–2.5–2.5–2.5 |
| Geometry | Spans in X (m) | 4.0–4.0 | 4.0–5.0–4.0 | 4.0–4.0–4.0 | 4.0–5.0–6.0 |
| Geometry | Spans in Y (m) | 4.0–4.0 | 3.5–4.5 | 4.0–4.0–4.0 | 3.5–4.5 |
| Geometry | Geometry Type | Symmetric | Asymmetric | Symmetric | Asymmetric |
| Sections | Beam Cross-Section (m) | 0.25 × 0.30 | 0.30 × 0.30 | 0.30 × 0.40 | 0.30 × 0.40 |
| Sections | Column Cross-Section (m) | 0.30 × 0.30 | 0.35 × 0.35 | 0.40 × 0.40 | 0.40 × 0.45 |
| Material | Cracking Factor (Beams) | 0.7 | 0.7 | 0.7 | 0.7 |
| Material | Cracking Factor (Columns) | 0.8 | 0.8 | 0.8 | 0.8 |
| Material | Concrete Young’s Modulus (kN/m²) | 21,458,890.83 | 21,458,890.83 | 21,458,890.83 | 21,458,890.83 |
| Loads | Dead Load (kN/m²) | 4.9 | 4.9 | 4.9 | 4.9 |
| Loads | Live Load (kN/m²) | 1.9 | 1.9 | 1.9 | 1.9 |
| Loads | Dead Load Coefficient | 1 | 1 | 1 | 1 |
| Loads | Live Load Coefficient | 0.15 | 0.15 | 0.15 | 0.15 |
| Seismic Parameters | Base Shear Coefficient | 0.1488 | 0.1488 | 0.1488 | 0.1488 |
| Seismic Parameters | Vertical Distribution Coefficient | 1 | 1 | 1 | 1 |
| Seismic Parameters | Accidental Torsion Coefficient | 0.05 | 0.05 | 0.05 | 0.05 |
| Seismic Parameters | Drift Amplification Factor | 6 | 6 | 6 | 6 |
| Seismic Parameters | Maximum Allowable Drift | 0.02 | 0.02 | 0.02 | 0.02 |

Table 3. Inter-storey drift, X direction.

| Case | Storey | GPT | GPT+MCP | OpenSees | ETABS |
|---|---|---|---|---|---|
| A | 2 | 0.009 | 0.013 | 0.013 | 0.013 |
| A | 1 | 0.007 | 0.012 | 0.012 | 0.012 |

Table 4. Maximum displacement (m).

| Case | GPT | GPT+MCP | OpenSees | ETABS |
|---|---|---|---|---|
| A | 0.047 | 0.01264 | 0.01264 | 0.01282 |

Table 5. Base shear (kN).

| Case | GPT | GPT+MCP | OpenSees | ETABS |
|---|---|---|---|---|
| A | 10.09 | 13.969 | 13.969 | 13.97 |

Table 6. Fundamental period of vibration.

| Case | GPT | GPT+MCP | OpenSees | ETABS |
|---|---|---|---|---|
| A | 0.38 | 0.485 | 0.485 | 0.489 |

Relative error with respect to ETABS, by direction.

| Case | GPT (X) | GPT (Y) | GPT+MCP (X) | GPT+MCP (Y) | OpenSees (X) | OpenSees (Y) |
|---|---|---|---|---|---|---|
| A | 266.529 | 235.335 | 1.427 | 1.427 | 1.427 | 1.427 |

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Avila, C.; Ilbay, D.; Tapia, P.; Rivera, D. Toward Responsible AI in High-Stakes Domains: A Dataset for Building Static Analysis with LLMs in Structural Engineering. Data 2025, 10, 169. https://doi.org/10.3390/data10110169


