Modeling Dynamic Risk Perception Using Large Language Model (LLM) Agents
Abstract
1. Introduction
- i. How does risk perception change with each additional precursor?
- ii. How do precursor types and their sequential order influence the magnitude of this escalation?
- iii. Can large language models (LLMs) replicate human-like reasoning in assessing such risk dynamics?
2. Methodology
2.1. Step 1: Data Collection
- i. Group (A) with 10 cases for GPT agent design and verification. This number was sufficient to iteratively refine prompts and probability mappings, to capture variation across representative precursor-risk scenarios, and to follow accepted practice for validating a model before extensive testing.
- ii. Group (B) with 90 cases for the research experiment. This larger dataset enabled a systematic assessment of the GPT agent’s scalability, consistency, robustness, and capacity to mimic human-like reasoning across a range of precursor types, sequences, and escalation patterns.
2.2. Step 2: Precursor Extractor (Agent 1)
- i. Technical anomalies (e.g., alarms, sensor faults, pressure deviations);
- ii. Environmental triggers (e.g., poor visibility, abnormal weather conditions);
- iii. Human errors (e.g., procedural violations, miscommunication);
- iv. Organizational factors (e.g., staffing shortages, deferred maintenance).
2.3. Step 3: Subjective Probability Estimator (Agent 2)
- i. Contextual Initialization (Background Encoding)
- ii. Initial Precursor Processing (Single-Precursor Inference)
- iii. Sequential Precursor Integration (Bayesian Updating)
- iv. Cumulative Risk Estimation (Dynamic Bayesian Modeling)
- Inputs:
- Outputs:
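The sequential Bayesian updating step named above can be illustrated with a minimal sketch. Agent 2 is an LLM prompted to reason in a Bayesian-like way, so the paper does not give an explicit update formula; the odds-form rule below, the function names `bayes_update` and `sequential_update`, and the likelihood ratios in the usage note are all assumptions made for illustration.

```python
from typing import Iterable, List


def bayes_update(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability after one precursor, using the odds form of Bayes' rule.

    likelihood_ratio is P(precursor | accident) / P(precursor | no accident);
    it is a hypothetical input here, not a quantity reported in the paper.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def sequential_update(prior: float, likelihood_ratios: Iterable[float]) -> List[float]:
    """Apply bayes_update once per precursor and return the probability trajectory."""
    trajectory: List[float] = []
    probability = prior
    for ratio in likelihood_ratios:
        # The posterior from one precursor becomes the prior for the next.
        probability = bayes_update(probability, ratio)
        trajectory.append(probability)
    return trajectory
```

For example, `sequential_update(0.2, [4.0, 3.0])` moves the probability from 0.2 to roughly 0.5 and then 0.75. Any likelihood ratio above 1 raises the estimate, which is one simple way a monotonic escalation pattern could arise.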
2.4. Step 4: Statistical Analysis
- Case ID
- Precursor index
- Precursor description
- Time of precursor
- Subjective probability
- Precursor category
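The six fields listed above could be held in one record per precursor. The following is a minimal sketch, assuming Python dataclasses; the class and attribute names are hypothetical, and the time of the precursor is kept as free text because the source does not specify a timestamp format.

```python
from dataclasses import dataclass
from enum import Enum


class PrecursorCategory(Enum):
    """The four precursor categories used by Agent 1."""
    TECHNICAL = "Technical anomaly"
    ENVIRONMENTAL = "Environmental trigger"
    HUMAN = "Human error"
    ORGANIZATIONAL = "Organizational factor"


@dataclass(frozen=True)
class PrecursorRecord:
    """One row of the analysis table (field names are illustrative)."""
    case_id: str
    precursor_index: int           # 1 for the first precursor in a case
    description: str
    time_of_precursor: str         # free text; format is an assumption
    subjective_probability: float  # Agent 2 estimate in [0, 1]
    category: PrecursorCategory

    def __post_init__(self) -> None:
        # Reject values that cannot occur in a valid record.
        if self.precursor_index < 1:
            raise ValueError("precursor_index starts at 1")
        if not 0.0 <= self.subjective_probability <= 1.0:
            raise ValueError("subjective_probability must be in [0, 1]")
```

Validating on construction keeps malformed agent output from reaching the statistical analysis unnoticed.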
3. Results
3.1. GPT Agent Output Overview
3.2. Results of Statistical Analysis
3.2.1. Number of Precursors
3.2.2. Value of Subjective Probability
3.2.3. Monotonicity and Escalation Trends
3.2.4. Risk Escalation Stages
- i. Early stage (Precursors 1–2): Represents a lower initial risk, indicating that early warnings are often perceived as less critical.
- ii. Middle stage (Precursors 3–4): Represents a critical threshold where probabilities markedly escalate, reflecting accumulated perception of risk. Median probability jumps significantly from Precursor 3 (0.42) to Precursor 4 (0.77), suggesting a critical shift point.
- iii. Late stage (Precursors 5–9): Probabilities approach certainty, emphasizing urgency and severity and likely involving compounding human or organizational factors. Precursors 5–9 show consistently high medians (>0.60), indicating emergency-level risk awareness.
3.2.5. Influence of Precursor Types
3.2.6. ANOVA and Mixed-Effects Analysis
4. Discussion
5. Conclusions
- i. Monotonic escalation: Subjective accident probability increases consistently with precursor accumulation, exhibiting an average escalation rate of 8.0 ± 0.9% per precursor (p < 0.05).
- ii. Critical tipping point: The fourth precursor marks a consistent perceptual transition from moderate concern to high-risk awareness, indicating a critical stage in dynamic risk perception.
- iii. Typology sensitivity: Organizational and human-factor precursors exert the strongest effects on perceived risk, while technical and environmental factors contribute more modest increments.
- iv. Human-like reasoning: The dual GPT-agent framework effectively replicates human probabilistic reasoning, with close correspondence to expert judgments (mean deviation ± 0.08).
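An escalation rate per precursor, as reported in the first conclusion, can be estimated in several ways, and the conclusions do not say which one was used. One plausible reading is a least-squares slope of probability against precursor index, computed per case and then averaged across cases. The sketch below assumes that reading; the function name and the sample values in the usage note are hypothetical.

```python
from statistics import mean
from typing import Sequence


def escalation_slope(probabilities: Sequence[float]) -> float:
    """Least-squares slope of probability versus precursor index (1, 2, 3, ...).

    The result is the average change in subjective probability per
    additional precursor within a single case.
    """
    n = len(probabilities)
    if n < 2:
        raise ValueError("at least two precursors are needed to fit a slope")
    indices = list(range(1, n + 1))
    x_bar = mean(indices)
    y_bar = mean(probabilities)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(indices, probabilities))
    denominator = sum((x - x_bar) ** 2 for x in indices)
    return numerator / denominator
```

For a made-up case with probabilities `[0.1, 0.2, 0.3, 0.4]`, the slope is about 0.1, that is, ten percentage points per precursor. Averaging such slopes over all cases gives one candidate summary of the escalation rate.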
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Agent Design and Configuration Details
- Platform: ChatGPT 4o
- Development method: Instruction-based conversational design (no coding required)
- Core system prompt:
- i. You are an accident precursor extractor.
- ii. Identify and summarize all observable conditions or events that occurred within seven days before the incident.
- iii. Present outputs in the format: Precursor [number] − [description + time marker].
- iv. Classify each precursor as technical, human, organizational, or environmental.
- Constraints: 7-day window, factual summarization only, exclusion of speculative or causal inference.
- Validation: 10 CSB cases manually labeled and cross-checked; achieved precision = 0.88, recall = 0.84.
- Platform: ChatGPT 4o
- Development method: Sequential conversational reasoning (background + precursor inputs).
- Core system prompt:
- i. You are a safety analyst estimating the subjective probability of a major accident given sequential precursors.
- ii. For each new precursor, update the probability (0–1.00) and provide a brief rationale. Assume Bayesian-like reasoning that integrates prior context and newly observed events.
- Output format: {Probability Value (0–1.00); Narrative Rationale}.
- Verification: Compared against expert-assigned baselines across 10 cases (mean absolute deviation ± 0.08).
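Because the agents were built conversationally without code, any post-processing of their replies is left open. As a minimal sketch, the following assumes Agent 2 returns text literally shaped like `{0.42; rationale}` (the exact punctuation of real replies is not shown in the appendix) and computes the mean absolute deviation against expert baselines. Both function names are hypothetical.

```python
import re
from typing import Sequence, Tuple

# Assumed reply shape: "{<probability>; <rationale>}".
_ESTIMATE = re.compile(r"\{\s*([01](?:\.\d+)?)\s*;\s*(.*?)\s*\}", re.DOTALL)


def parse_estimate(reply: str) -> Tuple[float, str]:
    """Extract (probability, rationale) from one Agent 2 reply."""
    match = _ESTIMATE.search(reply)
    if match is None:
        raise ValueError("reply does not contain '{probability; rationale}'")
    probability = float(match.group(1))
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return probability, match.group(2)


def mean_absolute_deviation(agent: Sequence[float], expert: Sequence[float]) -> float:
    """Average absolute gap between agent estimates and expert baselines."""
    if len(agent) != len(expert) or not agent:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - e) for a, e in zip(agent, expert)) / len(agent)
```

With made-up values, `mean_absolute_deviation([0.5, 0.7], [0.4, 0.8])` is about 0.1; the verification figure in the appendix would correspond to the same kind of comparison over the 10 Group (A) cases.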











| Statistic | Value |
|---|---|
| Count of cases | 90 |
| Count of precursors | 479 |
| Mean precursors per case | 5.32 |
| Standard deviation (per case) | 1.85 |
| Minimum per case | 4 |
| Median per case | 5 |
| Maximum per case | 15 |
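The per-case statistics in this table can be reproduced from a flat list of case IDs, one entry per extracted precursor. The sketch below uses only the Python standard library; the function name is hypothetical, and the use of the sample standard deviation is an assumption, since the table does not say whether the sample or population form was reported.

```python
from collections import Counter
from statistics import mean, median, stdev
from typing import Dict, Iterable


def precursor_count_summary(case_ids: Iterable[str]) -> Dict[str, float]:
    """Summarize how many precursors each case contributed."""
    counts = list(Counter(case_ids).values())
    if len(counts) < 2:
        raise ValueError("at least two cases are needed for a standard deviation")
    return {
        "cases": len(counts),
        "precursors": sum(counts),
        "mean": mean(counts),
        "sd": stdev(counts),  # sample standard deviation (assumption)
        "min": min(counts),
        "median": median(counts),
        "max": max(counts),
    }
```

As a consistency check on the reported figures, 479 precursors over 90 cases gives 479 / 90 ≈ 5.32, matching the mean in the table.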
| Statistic | Minimum | Maximum | Mean | Standard Deviation | p-Value |
|---|---|---|---|---|---|
| Spearman’s ρ | 0.75 | 1.00 | 0.89 | 0.07 | <0.05 |
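Spearman's ρ for a single case measures how closely the subjective probabilities follow the precursor order. A minimal sketch with no third-party dependencies is given below: it ranks both sequences (averaging tied ranks) and takes the Pearson correlation of the ranks. The function names and the sample trajectory in the usage note are illustrative, not taken from the dataset.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5
```

A strictly increasing trajectory gives ρ = 1.00, the maximum in the table. A trajectory with one small dip, such as `[0.08, 0.19, 0.42, 0.40, 0.79]` against indices 1 to 5, gives ρ ≈ 0.9.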
| Stage | Precursor | Median | Range |
|---|---|---|---|
| Early stage | Precursor 1 | 0.08 | [0.02, 0.25] |
| Early stage | Precursor 2 | 0.19 | [0.05, 0.45] |
| Middle stage | Precursor 3 | 0.42 | [0.06, 0.70] |
| Middle stage | Precursor 4 | 0.77 | [0.12, 0.90] |
| Late stage | Precursor 5 | 0.79 | [0.22, 0.90] |
| Late stage | Precursor 6 | 0.85 | [0.30, 0.95] |
| Late stage | Precursor 7 | 0.90 | [0.50, 0.95] |
| Late stage | Precursor 8 | 0.89 | [0.80, 0.95] |
| Late stage | Precursor 9 | 0.92 | [0.82, 0.95] |
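The stage boundaries in this table translate directly into a lookup on the precursor index. The sketch below is illustrative: the function name is hypothetical, and assigning indices above 9 to the late stage is an assumption, since the table stops at Precursor 9 while some cases contain more precursors.

```python
def stage_for(precursor_index: int) -> str:
    """Map a 1-based precursor index to its risk escalation stage."""
    if precursor_index < 1:
        raise ValueError("precursor_index starts at 1")
    if precursor_index <= 2:
        return "Early stage"
    if precursor_index <= 4:
        return "Middle stage"
    # Indices 5-9 are the late stage in the table; treating anything
    # beyond 9 as late as well is an assumption of this sketch.
    return "Late stage"
```

Grouping records by `stage_for(index)` before taking medians would reproduce a stage-level view of the same data.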
| Precursor Category | Count | Median Probability |
|---|---|---|
| Organizational factor | 182 | 0.56 |
| Technical anomaly | 147 | 0.41 |
| Human error | 131 | 0.47 |
| Environmental trigger | 19 | 0.35 |
| Total | 479 | - |
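A count and median per category, as in this table, can be obtained by grouping (category, probability) pairs. The following is a minimal standard-library sketch; the function name and the sample pairs in the usage note are made up for illustration.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple


def median_by_category(
    records: Iterable[Tuple[str, float]],
) -> Dict[str, Tuple[int, float]]:
    """Return {category: (count, median probability)} for the given records."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for category, probability in records:
        grouped[category].append(probability)
    return {
        category: (len(values), median(values))
        for category, values in grouped.items()
    }
```

For instance, the pairs `("Human error", 0.4)`, `("Human error", 0.6)`, and `("Technical anomaly", 0.3)` yield a count of 2 with median 0.5 for human error and a count of 1 with median 0.3 for technical anomaly. In the table itself, the four category counts (182 + 147 + 131 + 19) sum to the 479 precursors reported in total.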
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wen, H.; Parsaee, M.; Sajid, Z. Modeling Dynamic Risk Perception Using Large Language Model (LLM) Agents. AI 2025, 6, 296. https://doi.org/10.3390/ai6110296

