This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

Analysis of Fall-from-Height Accidents in Construction Based on Text Mining Technology and Improved Apriori Algorithm

1 School of Civil Engineering and Architecture, Wuhan University of Technology, Wuhan 430070, China
2 Haikou Rural Revitalization Investment and Development Co., Ltd., Haikou 570000, China
* Author to whom correspondence should be addressed.
Buildings 2026, 16(3), 596; https://doi.org/10.3390/buildings16030596 (registering DOI)
Submission received: 21 May 2025 / Revised: 20 December 2025 / Accepted: 28 January 2026 / Published: 1 February 2026
(This article belongs to the Section Construction Management, and Computers & Digitization)

Abstract

In recent years, fall-from-height accidents have occurred frequently in construction, posing severe risks to worker safety and impeding both the sustainable development of construction enterprises and social stability. Because such accidents are complex and multifactorial, traditional safety risk assessment methods are limited in their ability to uncover the underlying causes. To address this issue, this study develops a novel analytical framework that integrates text mining with an improved Apriori algorithm. A standardized text preprocessing pipeline is established, comprising data collection, construction of a domain-specific lexicon, and synonym-based term unification. Key features are extracted using the TF-IDF method, and thematic patterns are identified through LDA topic modeling. To overcome the contextual insensitivity of conventional association rule mining, the Apriori algorithm is enhanced with time-based constraints, enabling the discovery of accident causation patterns that differ between daytime and nighttime. Using 1064 accident reports from 22 provinces in China, the framework extracted 40 high-frequency accident-causing features and generated a richer set of meaningful association rules than the standard algorithm. The results indicate that insufficient safety protection, inadequate worker training, and management deficiencies are the predominant causes of fall-from-height accidents. Building on these insights, the study proposes targeted preventive measures. The findings contribute a methodological framework for accident analysis and offer practical guidance for improving safety management in the construction industry.
Keywords: fall-from-height accidents; text mining; TF-IDF; Apriori algorithm; construction safety
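
The abstract does not say how the time-based constraints enter the Apriori algorithm, so the following is only a minimal sketch of one possible reading: split the accident records into daytime and nighttime groups and run a standard Apriori pass on each group separately. The function names (`apriori`, `rules`, `mine_by_period`), the feature labels, the toy records, and the thresholds are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (plain Apriori)."""
    n = len(transactions)
    if n == 0:
        return {}
    # Level-1 candidates: every distinct item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep the candidates whose relative support meets the threshold.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join step: unions of frequent k-itemsets that have size k + 1.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Derive (antecedent, consequent, support, confidence) tuples."""
    out = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Subsets of a frequent itemset are themselves frequent.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    out.append(
                        (antecedent, itemset - antecedent, support, confidence)
                    )
    return out


def mine_by_period(records, min_support, min_confidence):
    """Assumed time constraint: mine each period's records separately."""
    partitions = {}
    for period, items in records:
        partitions.setdefault(period, []).append(frozenset(items))
    return {
        period: rules(apriori(transactions, min_support), min_confidence)
        for period, transactions in partitions.items()
    }


# Hypothetical toy records: (period, accident-causing features).
records = [
    ("day", ["no_guardrail", "no_training"]),
    ("day", ["no_guardrail", "no_training"]),
    ("day", ["no_guardrail"]),
    ("night", ["poor_lighting", "no_harness"]),
    ("night", ["poor_lighting", "no_harness"]),
    ("night", ["no_guardrail"]),
]
result = mine_by_period(records, min_support=0.5, min_confidence=0.9)
for period, found in result.items():
    for antecedent, consequent, support, confidence in found:
        print(period, sorted(antecedent), "->", sorted(consequent))
```

On this toy data, mining all six records together at the same thresholds yields no rules at all, because no feature pair reaches 50% support in the pooled set. Partitioning by period is one way a time constraint could surface patterns that pooled mining misses; the paper's actual mechanism may differ.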

Share and Cite

Chicago/Turabian Style

Sun, Rongjian, and Junwu Wang. 2026. "Analysis of Fall-from-Height Accidents in Construction Based on Text Mining Technology and Improved Apriori Algorithm" Buildings 16, no. 3: 596. https://doi.org/10.3390/buildings16030596
