Improving Construction Site Safety with Large Language Models: A Performance Analysis
Abstract
1. Introduction
2. Related Work on GPT Models in the Construction Sector
3. Materials and Methods
3.1. Images Collection
- (a) Google images: this set consists of 51 Creative Commons-licensed images sourced from Google, covering a wide range of work activities on construction sites (e.g., working at heights, operating heavy machinery, and handling hazardous materials). An example is provided in Figure 3.
- (b) Images from real construction sites: this set consists of 51 images taken at different building construction sites in Sicily, capturing moments of daily work and workers’ routine activities. An example is provided in Figure 4.
3.2. Hazard Level Assessment: Experts vs. GPT-4o
- (a) Immediate or potential hazards, such as inadequate equipment, unsafe surfaces, or hazardous materials left uncontrolled.
- (b) Correct use of Personal Protective Equipment (PPE), ensuring that items such as helmets, gloves, protective goggles, and safety clothing were appropriately used.
- (c) Compliance with safety regulations, verifying that all applicable laws and regulations related to the Italian construction sector and workplace safety were being followed.
3.3. Metrics Definition
- (a) False Positive (FP): also known as type I error, it occurs when a test incorrectly rejects a null hypothesis that is true, i.e., the test finds evidence of something that does not exist. Here, the model reports a hazard that the experts did not identify.
- (b) False Negative (FN): also known as type II error, it occurs when a test fails to detect a condition or phenomenon that is present. Type II errors are more concerning in safety contexts, as they represent situations where hazards are not detected by the model.
- (c) True Positive (TP): the model correctly predicts the positive class, namely it identifies the presence of hazards.
- (d) True Negative (TN): the model correctly predicts the negative class, namely it identifies the absence of hazards.
- (a) Mean Absolute Error (MAE): it quantifies the average deviation of the model’s predictions from the actual observed values. For n samples, if yᵢ is the actual value and ŷᵢ is the predicted one, MAE is calculated as follows (Equation (1)).
  $$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \quad (1)$$
- (b) Accuracy (A): it represents the proportion of correct predictions (i.e., TP and TN) out of the total number of predictions made, namely it indicates the percentage of cases where the model classifies a situation in line with the expert judgments (Equation (2)).
  $$A = \frac{TP + TN}{TP + TN + FP + FN} \quad (2)$$
- (c) Sensitivity (S): it measures the ability of a statistical test or classification model to correctly identify positive cases. In other words, it measures the proportion of TP identified out of the total number of TP and FN (Equation (3)).
  $$S = \frac{TP}{TP + FN} \quad (3)$$
- (d) Precision (P): it is a performance metric used to measure the accuracy of positive predictions in a classification problem. It quantifies the proportion of correctly predicted positive instances out of all instances classified as positive by the model (Equation (4)).
  $$P = \frac{TP}{TP + FP} \quad (4)$$
- (e) F1 Score (F1): it is a performance metric that combines both Precision (P) and Sensitivity (S) into a single measure, providing a balanced evaluation of the model’s ability to correctly identify positive instances while limiting false positives. F1 score is defined as the harmonic mean of P and S, which makes it particularly suitable in cases where the class distribution is imbalanced or when both false positives and false negatives carry significant consequences. It is computed as follows (Equation (5)):
  $$F1 = \frac{2 \cdot P \cdot S}{P + S} \quad (5)$$
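The metric definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Confusion` class and `mean_absolute_error` function are names introduced here, the counts are the aggregate values reported in the confusion matrix for the 102 images, and the three score pairs used for the MAE call are taken from the first three rows of the first Appendix A table.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts (illustrative helper, not from the paper)."""
    tp: int
    fn: int
    fp: int
    tn: int

    def accuracy(self) -> float:
        # Share of all predictions that agree with the experts.
        return (self.tp + self.tn) / (self.tp + self.tn + self.fp + self.fn)

    def sensitivity(self) -> float:
        # Share of expert-identified hazards that the model also flags.
        return self.tp / (self.tp + self.fn)

    def precision(self) -> float:
        # Share of model-flagged hazards that the experts confirm.
        return self.tp / (self.tp + self.fp)

    def f1(self) -> float:
        # Harmonic mean of precision and sensitivity.
        p, s = self.precision(), self.sensitivity()
        return 2 * p * s / (p + s)


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute deviation between expert means and model scores."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)


# Aggregate counts reported for both image sets combined.
both = Confusion(tp=64, fn=28, fp=4, tn=6)
print(round(both.accuracy(), 5))     # 0.68627
print(round(both.sensitivity(), 3))  # 0.696
print(round(both.precision(), 3))    # 0.941
print(round(both.f1(), 3))           # 0.8

# MAE over three (expert mean, model score) pairs from Appendix A.
sample_mae = mean_absolute_error([0.8125, 0.5625, 0.625], [0.3, 0.3, 0.2])
print(round(sample_mae, 4))          # 0.4
```

The aggregate values agree with the "Both" row of the reported results, which suggests the definitions above are the ones applied to the combined dataset.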
4. Results
- MAE: GPT-4o produced more accurate predictions on real images than on Google images. The aggregated MAE is 0.20968, i.e., an average absolute deviation of approximately 0.21 on the 0–1 hazard scale (about 21%) between the system’s predictions and the evaluators’ assessments.
- Sensitivity (~69.6%): the model identified roughly 69.6% of the situations that the experts judged hazardous and missed about 30.4% (false negatives).
- Accuracy (~68.6%): roughly 68.6% of the model’s predictions matched expert consensus, while about 31.4% were misclassified as either FP or FN. Specifically, Table 2 shows that the model produced two FP for each of the real and Google image datasets, i.e., cases in which it reported hazards that the experts did not identify. As for FN, the model recorded 12 and 16 cases for real and Google images, respectively. This suggests that the model is generally less effective at detecting hazards in images sourced from Google.
- Precision (~94.1%): when the model predicts a hazardous condition, the experts agree in 94.1% of cases.
- According to expert opinions, approximately 90% of the 102 images depict hazardous situations (i.e., TP + FN), suggesting a potential bias toward medium-high risk cases. This may reflect a tendency among individuals to overestimate risks, or conversely, that the abundance of safety protocols and regulations makes full compliance challenging. However, this strong imbalance in the dataset, where hazardous cases largely outnumber safe ones, requires careful consideration when evaluating the model’s performance. Traditional accuracy metrics can be misleading in such contexts, as a classifier might achieve high accuracy simply by predicting the majority class more frequently. In this regard, the F1 score becomes a more informative indicator, as it balances precision and recall (i.e., sensitivity), summarizing how well the model identifies the positive (hazardous) class while accounting for both false positives and false negatives. The F1 score obtained for the Google images dataset is 0.871428, while it is 0.820513 for the real-image dataset. Both values indicate relatively strong performance, but the drop in the real-image score suggests that the classifier struggles more when confronted with images closer to real-world conditions, where variability and noise are higher. Overall, these results imply that while the model generalizes reasonably well, the imbalance of the dataset may inflate the perceived performance.
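The caveat about class imbalance can be made concrete with a small sketch. It compares the reported aggregate counts with a hypothetical baseline that labels every image as hazardous; the baseline and the helper names are introduced here for illustration only, and specificity is an additional metric that the section itself does not use.

```python
def accuracy(tp: int, fn: int, fp: int, tn: int) -> float:
    return (tp + tn) / (tp + fn + fp + tn)


def f1_score(tp: int, fn: int, fp: int) -> float:
    # Equivalent to the harmonic mean of precision and sensitivity.
    return 2 * tp / (2 * tp + fp + fn)


def specificity(fp: int, tn: int) -> float:
    # Share of expert-rated safe images that are also predicted safe.
    return tn / (tn + fp)


# Reported aggregate counts for the 102 images.
TP, FN, FP, TN = 64, 28, 4, 6
hazardous = TP + FN  # 92 images rated hazardous by the experts
safe = FP + TN       # 10 images rated non-hazardous

# Hypothetical baseline: always predict "hazardous".
base_tp, base_fn, base_fp, base_tn = hazardous, 0, safe, 0

model_scores = (
    accuracy(TP, FN, FP, TN),
    f1_score(TP, FN, FP),
    specificity(FP, TN),
)
baseline_scores = (
    accuracy(base_tp, base_fn, base_fp, base_tn),
    f1_score(base_tp, base_fn, base_fp),
    specificity(base_fp, base_tn),
)

for name, scores in (("model", model_scores), ("always-hazardous", baseline_scores)):
    acc, f1, spec = scores
    print(f"{name:>16}: accuracy={acc:.3f}  F1={f1:.3f}  specificity={spec:.3f}")
```

Under these counts the trivial baseline reaches an accuracy of about 0.902 and a positive-class F1 of about 0.948, yet its specificity is 0. This is one way to see why headline scores on a dataset with roughly 90% hazardous images should be read with caution.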
4.1. Sensitivity Analysis of the Decision Threshold Under Class Imbalance
4.2. Analysis of Images with High Discrepancy
5. Discussion
5.1. Use of GPT-4o in Workplace Safety: Opportunities and Challenges
- Accessibility and availability: multimodal GPT-4o can be available 24/7, providing instant access to hazard recognition and risk assessment guidance and information, which can be crucial in time-sensitive situations.
- Consistency: when configured with an appropriate, fixed set of parameters, GPT-4o can provide consistent evaluations based on predefined criteria, reducing the variability that might occur with human judgment. The system also provides a detailed description alongside the identified hazard level, which allows its errors to be analyzed. Once critical points have been identified, its use could be limited to situations where it demonstrates high reliability.
- Scalability: it can handle multiple queries via API, making it scalable for large construction projects.
5.2. Ethical Considerations
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| LLM | Large Language Model |
| AI | Artificial Intelligence |
| HRRP | Hazard Recognition and Risk Perception |
| JHA | Job Hazard Analysis |
| OHS | Occupational Health and Safety |
| VR | Virtual Reality |
| AR | Augmented Reality |
| BIM | Building Information Modeling |
| CHPtD | Construction Hazard Prevention through Design |
| AEC | Architecture, Engineering, and Construction |
| PPE | Personal Protective Equipment |
| FP | False Positive |
| FN | False Negative |
| TP | True Positive |
| TN | True Negative |
| MAE | Mean Absolute Error |
Appendix A
| File ID | Expert 1 | Expert 2 | Expert 3 | Expert 4 | Mean Value of Experts’ Judgement | AI Model |
|---|---|---|---|---|---|---|
| 1 | H | VH | H | H | 0.8125 | 0.3 |
| 2 | H | H | M | L | 0.5625 | 0.3 |
| 3 | M | H | M | H | 0.625 | 0.2 |
| 4 | H | VH | H | M | 0.75 | 0.3 |
| 5 | VH | VH | H | H | 0.875 | 0.7 |
| 6 | H | VH | H | H | 0.8125 | 0.7 |
| 7 | VH | VH | H | H | 0.875 | 0.4 |
| 8 | H | H | M | H | 0.6875 | 0.4 |
| 9 | H | H | H | H | 0.75 | 0.4 |
| 10 | H | VH | H | VH | 0.875 | 0.3 |
| 11 | M | H | H | VH | 0.75 | 0.6 |
| 12 | M | M | M | M | 0.5 | 0.6 |
| 13 | M | H | M | H | 0.625 | 0.8 |
| 14 | L | L | M | L | 0.3125 | 0.4 |
| 15 | H | H | H | VH | 0.8125 | 0.3 |
| 16 | H | H | H | H | 0.75 | 0.6 |
| 17 | L | L | L | M | 0.3125 | 0.3 |
| 18 | H | H | H | VH | 0.8125 | 0.7 |
| 19 | H | H | H | H | 0.75 | 0.6 |
| 20 | VH | VH | H | H | 0.875 | 0.6 |
| 21 | VH | M | H | H | 0.75 | 0.8 |
| 22 | L | L | L | L | 0.25 | 0.4 |
| 23 | VH | H | H | H | 0.8125 | 0.3 |
| 24 | H | M | M | H | 0.625 | 0.3 |
| 25 | M | M | H | H | 0.625 | 0.3 |
| 26 | VH | VH | VH | VH | 1 | 0.8 |
| 27 | H | H | H | H | 0.75 | 0.3 |
| 28 | H | H | H | H | 0.75 | 0.3 |
| 29 | M | H | H | H | 0.6875 | 0.4 |
| 30 | M | H | M | H | 0.625 | 0.5 |
| 31 | VH | VH | VH | VH | 1 | 0.9 |
| 32 | VH | VH | VH | H | 0.9375 | 0.8 |
| 33 | VH | VH | VH | VH | 1 | 0.8 |
| 34 | VH | H | H | VH | 0.875 | 0.7 |
| 35 | L | L | H | M | 0.4375 | 0.3 |
| 36 | M | M | H | H | 0.625 | 0.6 |
| 37 | H | M | H | M | 0.625 | 0.7 |
| 38 | VH | H | VH | VH | 0.9375 | 0.8 |
| 39 | H | H | VH | H | 0.8125 | 0.4 |
| 40 | L | M | H | M | 0.5 | 0.4 |
| 41 | M | H | H | H | 0.6875 | 0.8 |
| 42 | VH | VH | VH | VH | 1 | 0.9 |
| 43 | VH | H | H | H | 0.8125 | 0.8 |
| 44 | M | H | H | H | 0.6875 | 0.3 |
| 45 | M | M | L | M | 0.4375 | 0.2 |
| 46 | M | H | H | H | 0.6875 | 0.3 |
| 47 | H | M | M | M | 0.5625 | 0.4 |
| 48 | M | M | M | H | 0.5625 | 0.8 |
| 49 | H | H | M | VH | 0.75 | 0.6 |
| 50 | M | M | M | H | 0.5625 | 0.7 |
| 51 | M | H | H | H | 0.6875 | 0.3 |
| File ID | Expert 1 | Expert 2 | Expert 3 | Expert 4 | Mean Value of Experts’ Judgement | AI Model |
|---|---|---|---|---|---|---|
| 1 | L | L | L | M | 0.3125 | 0.3 |
| 2 | M | M | M | H | 0.5625 | 0.6 |
| 3 | VH | VH | H | VH | 0.9375 | 0.8 |
| 4 | H | H | H | H | 0.75 | 0.7 |
| 5 | L | L | L | M | 0.3125 | 0.2 |
| 6 | M | M | H | H | 0.625 | 0.4 |
| 7 | H | M | H | H | 0.6875 | 0.3 |
| 8 | L | L | L | M | 0.3125 | 0.3 |
| 9 | H | M | M | H | 0.625 | 0.3 |
| 10 | M | L | M | M | 0.4375 | 0.3 |
| 11 | L | M | L | M | 0.375 | 0.3 |
| 12 | H | M | H | H | 0.6875 | 0.6 |
| 13 | H | H | H | H | 0.75 | 0.6 |
| 14 | L | L | L | L | 0.25 | 0.6 |
| 15 | L | L | L | M | 0.3125 | 0.8 |
| 16 | L | M | M | L | 0.375 | 0.3 |
| 17 | H | M | H | M | 0.625 | 0.4 |
| 18 | H | L | H | H | 0.625 | 0.4 |
| 19 | H | M | H | H | 0.6875 | 0.7 |
| 20 | H | H | H | H | 0.75 | 0.8 |
| 21 | H | H | H | H | 0.75 | 0.7 |
| 22 | H | H | H | H | 0.75 | 0.7 |
| 23 | H | H | H | H | 0.75 | 0.4 |
| 24 | H | H | H | H | 0.75 | 0.7 |
| 25 | H | M | H | H | 0.6875 | 0.9 |
| 26 | M | M | M | H | 0.5625 | 0.8 |
| 27 | VH | M | H | H | 0.75 | 0.7 |
| 28 | H | M | M | H | 0.625 | 0.3 |
| 29 | H | H | VH | VH | 0.875 | 0.3 |
| 30 | M | M | L | M | 0.4375 | 0.6 |
| 31 | M | M | M | H | 0.5625 | 0.3 |
| 32 | M | M | M | H | 0.5625 | 0.6 |
| 33 | H | M | M | H | 0.625 | 0.3 |
| 34 | H | M | H | H | 0.6875 | 0.4 |
| 35 | H | H | H | H | 0.75 | 0.7 |
| 36 | H | M | M | H | 0.625 | 0.3 |
| 37 | H | H | M | VH | 0.75 | 0.4 |
| 38 | M | H | M | VH | 0.6875 | 0.7 |
| 39 | H | H | M | H | 0.6875 | 0.7 |
| 40 | H | M | M | H | 0.625 | 0.5 |
| 41 | H | H | H | H | 0.75 | 0.4 |
| 42 | H | M | M | VH | 0.6875 | 0.3 |
| 43 | H | M | M | H | 0.625 | 0.7 |
| 44 | M | H | M | H | 0.625 | 0.6 |
| 45 | M | H | M | M | 0.5625 | 0.3 |
| 46 | L | M | H | M | 0.5 | 0.6 |
| 47 | H | M | H | H | 0.6875 | 0.3 |
| 48 | M | M | M | H | 0.5625 | 0.7 |
| 49 | H | M | L | M | 0.5 | 0.6 |
| 50 | H | M | M | H | 0.625 | 0.8 |
| 51 | H | M | M | H | 0.625 | 0.3 |







| Semantic Hazard Level | Quantitative Value |
|---|---|
| Low (L) | 0.25 |
| Medium (M) | 0.5 |
| High (H) | 0.75 |
| Very High (VH) | 1 |
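As a minimal sketch of how the semantic scale above turns four expert labels into the mean value listed in Appendix A, the following Python snippet applies the mapping and averages the result. The names `HAZARD_SCALE` and `expert_mean` are introduced here for illustration; the example labels come from rows 1 and 14 of the first Appendix A table.

```python
from typing import Sequence

# Semantic hazard level -> quantitative value, as given in the table above.
HAZARD_SCALE = {"L": 0.25, "M": 0.5, "H": 0.75, "VH": 1.0}


def expert_mean(labels: Sequence[str]) -> float:
    """Average the numeric values of the experts' semantic judgements."""
    if not labels:
        raise ValueError("at least one expert label is required")
    return sum(HAZARD_SCALE[label] for label in labels) / len(labels)


print(expert_mean(["H", "VH", "H", "H"]))  # 0.8125
print(expert_mean(["L", "L", "M", "L"]))   # 0.3125
```

Both outputs match the "Mean Value of Experts’ Judgement" column for the corresponding rows.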
| Images | FP | FN |
|---|---|---|
| Google | 2 | 16 |
| Real | 2 | 12 |
| | AI Predicts Hazardous Conditions | AI Predicts Non-Hazardous Conditions |
|---|---|---|
| Experts predict hazardous conditions | 64 (TP) | 28 (FN) |
| Experts predict non-hazardous conditions | 4 (FP) | 6 (TN) |
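A confusion matrix like the one above can be obtained by binarizing each expert mean and each model score and then counting the four outcomes. The sketch below shows the mechanics only. The cut-off `THRESHOLD = 0.375` is an assumption made for this illustration, since the decision threshold is not stated in this part of the text. `classify_pair` is a hypothetical helper, and the four pairs are rows 5, 1, 14, and 17 of the first Appendix A table.

```python
from collections import Counter

# Assumed cut-off: scores strictly above it count as "hazardous".
# This value is an illustration, not a figure taken from the text.
THRESHOLD = 0.375


def classify_pair(expert_mean: float, model_score: float, threshold: float) -> str:
    """Return TP, FN, FP, or TN for one (expert, model) pair."""
    expert_positive = expert_mean > threshold
    model_positive = model_score > threshold
    if expert_positive and model_positive:
        return "TP"
    if expert_positive:
        return "FN"
    if model_positive:
        return "FP"
    return "TN"


# (expert mean, model score) pairs taken from Appendix A.
pairs = [(0.875, 0.7), (0.8125, 0.3), (0.3125, 0.4), (0.3125, 0.3)]
outcomes = [classify_pair(e, m, THRESHOLD) for e, m in pairs]
counts = Counter(outcomes)

print(outcomes)      # ['TP', 'FN', 'FP', 'TN']
print(dict(counts))
```

Moving the threshold changes which pairs fall into each cell, so any reported TP, FN, FP, and TN counts should be read together with the cut-off that produced them.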
| Images | MAE | A | S | P | F1 |
|---|---|---|---|---|---|
| Google | 0.23603 | 0.64706 | 0.66667 | 0.941176 | 0.871428 |
| Real | 0.18333 | 0.72549 | 0.72727 | 0.941176 | 0.820513 |
| Both | 0.20968 | 0.68627 | 0.696 | 0.941 | 0.8 |
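Because F1 is defined as the harmonic mean of P and S, each row of the table above can be cross-checked from its own P and S values. The following sketch does only that arithmetic. The `rows` dictionary and `f1_from` helper are names introduced here, and the sketch makes no claim about which figure is intended where a recomputed value differs from the tabulated one.

```python
def f1_from(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# (P, S, tabulated F1) for each row of the table above.
rows = {
    "Google": (0.941176, 0.66667, 0.871428),
    "Real": (0.941176, 0.72727, 0.820513),
    "Both": (0.941, 0.696, 0.8),
}

recomputed = {name: f1_from(p, s) for name, (p, s, _) in rows.items()}

for name, (_, _, tabulated) in rows.items():
    diff = recomputed[name] - tabulated
    print(f"{name:>6}: recomputed={recomputed[name]:.4f}  tabulated={tabulated}  diff={diff:+.4f}")
```

With these inputs the Real and Both rows reproduce to within rounding, whereas the Google row recomputes to roughly 0.780 rather than 0.871428. That difference may be worth checking against the original data before drawing conclusions from a comparison of the two F1 scores.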
| Team of Experts | GPT-4o |
|---|---|
| Expert 1: Workers are properly equipped with all PPE but are operating within the suspended load area, exposing them to the risk of overturning. Expert 2: Operators are within the crane’s operating area, near a suspended load. Expert 3: Presence of workers within the load handling area of a crane, wearing appropriate PPE. Expert 4: Workers are engaged in the placement of the precast concrete wall, within the crane’s operating area. They are potentially exposed to crushing hazards due to hook and chain failure and possible slipping of the wooden block placed at the base of the precast structure. | Presence of PPE: workers are wearing safety helmets, high-visibility vests, and gloves, which significantly reduces the risk of injuries. Site conditions: the area appears well-organized and clean, with no obvious obstacles or clutter that could cause accidents. Safety systems: anchors and supports for lifting the panels are visible, indicating that safe lifting practices are being followed. Safety distance: workers maintain an appropriate distance from the most hazardous areas, such as the lifting point of the precast panels. |
| Team of Experts | GPT-4o |
|---|---|
| Expert 1: Workers are not wearing helmets and are positioned within the operating area of the concrete pump. Expert 2: Operators are near the concrete pump discharge, without appropriate PPE. Expert 3: Work is being carried out within the operational area of mechanical arms, with workers lacking proper PPE. Expert 4: The two workers are not wearing essential PPE, such as helmets and gloves. They are positioned within the operating area of a concrete pump, posing a potential risk of crushing due to the possible failure of the pump arm. | Workers are wearing some PPE, such as high-visibility vests, which enhance their visibility. The construction site appears to be somewhat organized, with fencing and designated work areas. However, there are several concerns: it is unclear whether the workers are wearing protective helmets, safety glasses, or gloves; the work area seems to be exposed to other ongoing construction activities that may introduce additional hazards; the workers’ proximity to the concrete pump increases the risk of accidental contact or crushing. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
La Fata, C.M.; Barone, G.; Cammarata, M. Improving Construction Site Safety with Large Language Models: A Performance Analysis. Information 2026, 17, 210. https://doi.org/10.3390/info17020210

