Article

An Integrated Framework for Automated Identification of Workers’ Safety Violation Based on Knowledge Graph

1 Engineering Digital Technology R&D Center, Engineering Design & Research Institute of China Communications Construction Third Highway Engineering Co., Ltd., Beijing 100010, China
2 Sino-Australia Joint Research Center in Building Information Modeling and Smart Construction, Shenzhen University, Shenzhen 518060, China
3 School of Safety Science Center, Tsinghua University, Beijing 100084, China
4 Sensor and Equipment Research Center, Hefei Institute for Public Safety Research, Tsinghua University, Hefei 230601, China
5 Safety Culture Education Research and Development Center, Hefei Institute for Public Safety Research, Tsinghua University, Hefei 230601, China
6 School of Management, Zhengzhou University, Zhengzhou 450001, China
7 School of Civil Engineering, Central South University, Changsha 410083, China
8 Shanghai Research Institute of Building Sciences Group Co., Ltd., Shanghai 200032, China
* Author to whom correspondence should be addressed.
Buildings 2026, 16(5), 1037; https://doi.org/10.3390/buildings16051037
Submission received: 28 December 2025 / Revised: 13 February 2026 / Accepted: 25 February 2026 / Published: 6 March 2026

Abstract

Automatic identification of worker safety violations can substantially strengthen construction-site safety management by enabling continuous, real-time monitoring. Although recent advances have made automated detection feasible, many existing systems still suffer from poor adaptability and limited extensibility. To address these limitations, this study proposes an integrated, knowledge graph-based framework for automatic identification of workers’ safety violations. The framework comprises two principal components: (1) a knowledge graph construction module that encodes domain knowledge (safety regulations, task–hazard relationships, and contextual constraints) into a machine-readable graph structure and (2) a graph-enabled violation identification module that maps structured scene descriptions of worker and environmental states to the knowledge graph and performs semantic inference to detect violations. In this study, these structured scene descriptions are manually specified and simulated as subject–predicate–object triplets; integration with raw sensing data is left for future work. For validation, we construct a knowledge graph containing 1200 safety rules and evaluate the violation identification module on 500 annotated examples representing realistic worker scenarios. Using this curated knowledge graph and structured inputs, the proposed approach achieves an identification accuracy of 97.6% for unsafe worker behaviors. Experimental analysis shows that the knowledge graph representation substantially improves the system’s expandability and interpretability compared with traditional hard-coded rules, facilitating easier incorporation of new rules and multimodal sensing inputs. The results indicate that knowledge graph-driven reasoning offers a practical, scalable pathway for robust, context-aware safety violation detection in varied construction environments.

1. Introduction

The construction industry remains one of the most hazardous sectors worldwide, with persistently poor safety performance. For example, fatalities in the United States increased from 933 in 2014 to 1066 in 2019 [1], and in China, there were 734 construction accidents resulting in 840 fatalities in 2018 [2]. A substantial body of evidence attributes a large proportion of construction accidents and related fatalities to workers’ unsafe behaviors that violate established safety rules and procedures [3,4,5]. Detecting such violations promptly and correcting unsafe behaviors are therefore central tasks in construction safety management.
Traditionally, safety inspections on construction sites are conducted manually by safety officers through periodic walk-rounds and checklists. These conventional inspections are labor-intensive, time-consuming, and inherently constrained by limited human resources, which often prevents timely detection and remediation of all safety violations on large or distributed sites [6,7,8]. Motivated by the proliferation of sensing technologies and advances in artificial intelligence (AI) [9], automated safety inspection systems have emerged to provide continuous, scalable monitoring of worker activities and site conditions. These systems exploit video analytics, wearable sensors, and other Internet of Things (IoT) devices to identify hazards and non-compliant behaviors in near real time.
Despite these technological advances, most existing automated approaches still rely on narrowly defined, hand-crafted rules that are hard-coded into detection pipelines. Such rule-based implementations typically target a limited set of predefined violations (e.g., hardhat use, fall-protection compliance) and lack the representational flexibility to accommodate new rules, contextual nuances, or heterogeneous sensing modalities. Consequently, scalability, maintainability, and interpretability remain major challenges: adding new safety regulations or adapting to different project contexts often requires laborious manual re-engineering of detection logic. Moreover, many state-of-the-art data-driven methods focus on pattern recognition without providing an explicit, machine-readable representation of domain knowledge that supports transparent inference and explanation.
To address these limitations, this research proposes an integrated, knowledge graph-based framework for automatic identification of workers’ safety violations. The core idea is to represent construction safety knowledge—regulatory requirements, task–hazard relationships, contextual constraints, and sensing semantics—as a structured, machine-interpretable graph. Such a representation enables semantic reasoning, supports multimodal data fusion, and facilitates incremental extension of the rule base without invasive code changes. Concretely, the proposed framework comprises two primary components: (1) a knowledge graph generation module that extracts and encodes construction safety knowledge from textual and domain sources using Natural Language Processing (NLP) techniques, producing a domain-specific knowledge graph, and (2) a graph-enabled safety violation identification module that maps structured scene descriptions of workers and environments to graph entities and applies matching and inference to detect violations in context. In a full deployment, these structured descriptions can be generated from sensing systems such as video-based detectors and wearable sensors; in this study, however, we focus on validating the reasoning mechanism using manually constructed and simulated triplets derived from safety regulations and examination questions.
The remainder of this paper is organized as follows. Section 2 reviews related work on applying artificial intelligence and knowledge graphs in construction safety management. Section 3 describes the overall structure of the proposed framework. Section 4 presents a case study verifying the feasibility of the proposed framework. Section 5 discusses the experimental results, and Section 6 concludes the paper and outlines future research directions.

2. Literature Review

2.1. Existing Automated Inspection Methods in Construction Safety

Automated inspection in construction safety can enable continuous, 24-hour identification of safety violations while adapting to diverse application contexts and integrating multiple data sources. Current research on automated construction safety inspection falls into three categories: personal protective equipment (PPE)-based, hazardous location-based, and behavior-based identification.
The hardhat is one of the most important pieces of personal protective equipment (PPE) for workers’ safety. Zhang et al. [10] proposed an enhanced BiFPN-based deep learning technique for detecting workers and hardhats. Mneymneh et al. [11] used a motion detection algorithm and an object identification instrument for intelligent hardhat monitoring. Fang et al. [12] employed the Faster Region-based Convolutional Neural Network (Faster R-CNN) to recognize distant workers who were not wearing hardhats. Other PPE items are also involved in workers’ safety-related research. Fang et al. [13] further distinguished whether workers wore safety harnesses while working at height based on two primary models: a Faster R-CNN model that detects workers and a Convolutional Neural Network (CNN) model that identifies safety harnesses. In addition, Fang et al. [14] presented an effective deep learning-based technique to detect whether hardhats, harnesses, and anchorages are used when working at height. Chen et al. [1] proposed a novel solution to distinguish the proper use of PPE, including hardhats, dust masks, safety glasses, and safety belts: they detect workers’ skeleton joint points and objects and then compute object–individual associations to identify improper PPE use.
Regarding whether workers are exposed to hazardous environments, Fang et al. [15] identified whether workers were on or crossing structural supports based on a CNN. Konstantinou et al. [16] suggested a vision-based method for tracking workers with similar appearances amid rapid changes in complicated environments characterized by congestion, background clutter, and occlusions. Khan et al. [17] recognized the location and spatial interaction between workers and scaffolds and proposed an object correlation detection (OCD) algorithm to identify workers’ hazardous behavior. Fang et al. [18] verified, via face matching, whether workers were operating within the scope of their certification. Yan et al. [19] proposed a method to predict struck-by accidents through 3D spatial relationships. Nath et al. [20] identified whether workers were wearing hardhats and vests at all times through three distinct You-Only-Look-Once (YOLO)-based deep learning methods. Xiong et al. [21] detected the proper use of hardhats and vests by identifying whether they were worn on the correct body parts.
Different researchers have offered approaches for determining whether workers’ activities are safe. To reduce accidents and injuries, Yan et al. [22] developed an ergonomic posture identification approach for capturing injury-prone postures. Seo et al. [23] identified workers’ harmful and tiring postures using a computer vision-based ergonomic evaluation system to prevent work-related musculoskeletal problems. Xiong et al. [24] transformed workers’ behavior information into a knowledge graph representation and analyzed the hazard level using code-specific criteria. Combining knowledge graph and computer vision techniques, Fang et al. [25] identified safety violations against defined requirements: their system gathered input from the visual scene and converted it into a knowledge graph representation to determine the safety status.
Although several methods have been proposed for identifying workers’ safety violations, these studies cover only specific safety rules. Most methods use predefined rules to determine whether a worker’s behavior is unsafe. Furthermore, each rule suits only one predefined situation and is not transferable to other environments. Their knowledge exploitation lacks versatility, with safety rules tailored to specific applications rather than the broader and more varied construction environment.

2.2. Knowledge Graph Representation of Construction Safety Knowledge and Its Application

As a representation approach that evolved from the semantic web, the knowledge graph is generally acknowledged as one of the most promising techniques for storing and managing knowledge. It permits sophisticated queries across many data sources to handle construction information on site [26], enabling its use in complex and changing construction contexts. In addition, a knowledge graph may save time and expense by offering a comprehensive knowledge representation methodology [27]. With a knowledge graph-based technique, users may flexibly configure their mechanisms to achieve the desired outcomes. Furthermore, graph-based knowledge representation promotes the discovery of hidden information. The knowledge graph format is therefore more versatile and cost-effective to manage.
A variety of knowledge graph representations have been presented. Ding et al. [28] suggested an event logic-based graph whose nodes are events and whose edges express sequential, causal, and is-a relations. Li et al. [29] presented an AND/OR graph-based knowledge point organization paradigm to represent difficult-to-describe selective knowledge. Yu et al. [30] developed a tax graph to describe the calculation logic of individual tax themes, with millions of linked calculation models representing the calculation statements; the model incorporates input/output data nodes as well as computation function nodes. Although several knowledge graph architectures have been presented, most apply only to a particular application area and cannot be transferred to the construction safety domain. Jiang et al. [31,32] presented a distinctive knowledge graph structure that captures information using a condition–fact structure: the fact part describes an event, and the condition part illustrates the circumstances under which the event occurs. This representation structure is appropriate for portraying construction safety knowledge. However, its taxonomy of entities and connections is very simplistic, so artificial intelligence cannot use the model to learn more detailed data. Moreover, the logical execution order between components is not specified, making it challenging to express the logical connections between them.
To better store, evaluate, and utilize construction-related information, researchers have examined construction-relevant knowledge structures. Hjelseth et al. [33,34] devised the RASE methodology to characterize a rule’s structure and partition its content. The RASE method outlines the standard structure and divides the information into four sections: Requirement, Applicability, Selection, and Exception. It has been used in numerous areas to analyze construction codes [33,34,35,36]. Researchers have also adopted other approaches to represent knowledge structures in distinct construction-related domains. Solihin et al. [37] used the conceptual graph to represent BIM-based rules used in compliance checking, which makes it possible to capture the contained knowledge and helps all participants in the implementation effort understand it. Häußler et al. [38] proposed a novel compliance checking method for the BIM-based technical design of railway construction structures through the Business Process Model and Notation (BPMN) and the Decision Model and Notation (DMN). Sydora et al. [39] presented a BIM-based rule language for describing interior design rules in a machine-readable format, together with a corresponding evaluation method, treating design rules as constraints for generating design alternatives. Xu et al. [40] presented a method to integrate heterogeneous data for compliance checking of underground utilities. Yurchyshyna et al. [41] proposed an ontology-based method for compliance checking of construction projects against building codes. Recent research on construction inspection reports has explored text-based defect detection and graph neural network-driven text classification, such as studies on fire-door defect datasets and GNN-based classification models for pre-completion inspections.
These works demonstrate the potential of combining structured representations and graph learning for construction quality and safety management and motivate our use of graph structures for safety rule representation and reasoning [42,43]. Closely related to this study is our prior work [44], which proposed a condition-based knowledge graph structure for representing construction safety rules and demonstrated an initial automatic extraction model achieving an F1-score of approximately 67% on a knowledge graph containing 1200 rules. That conference paper primarily focused on the design of condition-based knowledge representation and the feasibility of automatic rule extraction, without developing a full multi-layer rule structure or a dedicated violation identification module. The present work extends [44] in several key aspects: (1) we introduce a multi-layer Rule Knowledge Graph (RKG) with explicit concept, connection, logic, and statement layers, including a logic layer with AND/OR/NOT nodes; (2) we augment the rule graph with an Association Knowledge Graph (AKG) for background and synonym knowledge; and (3) we design and validate a complete safety violation identification module that performs logic layer-based reasoning on scene descriptions, achieving 97.6% accuracy on 500 annotated scenarios.
Although several methods have been proposed by these scholars, they focus on other construction-relevant fields, such as BIM-based compliance checking and structural design, rather than workers’ safety violations. The abovementioned knowledge representations are not machine-interpretable for safety violation identification. In addition, current knowledge representations in the construction field cannot adequately capture construction safety knowledge, as they lack specific details such as content structure and execution logic.

3. Methodology

To apply construction safety-related knowledge to automatically detect workers’ safety violations, this research proposes an integrated identification framework based on a construction safety-related knowledge graph. The framework consists of two main parts: (1) a knowledge graph generation component that encodes domain knowledge (safety regulations, task–hazard relationships, and contextual constraints) into a machine-readable graph structure and (2) a safety violation identification module that maps sensed worker/environmental data to the knowledge graph and performs semantic inference to detect rule violations in a context-aware manner. The overall architecture is shown in Figure 1.

3.1. Terminology and Hierarchy of the Knowledge Graph

To ensure clarity and consistency regarding the graph structures used in this study, the key terminologies are defined hierarchically as follows:
Construction Safety-related Knowledge Graph (CSKG): The overarching knowledge graph framework proposed in this study. It serves as the unified semantic base and is composed of two interlinked sub-graphs: the RKG and the AKG.
Rule Knowledge Graph (RKG): A sub-graph of the CSKG that explicitly captures rules (extracted from regulations and standards) in a multi-layered triplet-and-logic representation. It preserves both declarative facts and execution logic to support compliance checking.
Association Knowledge Graph (AKG): A sub-graph of the CSKG that encodes public/background knowledge and semantic relations (e.g., synonyms, subclass relations). It is used for entity linking and expansion to enhance the system’s interpretability.

3.2. Automated Knowledge Extraction and Graph Population from Safety Texts

To enable the automated understanding of safety rules, this study constructs the CSKG based on the hierarchical definitions established in Section 3.1. As previously defined, the CSKG structure incorporates both the RKG for logic representation and the AKG for semantic expansion. Figure 2 illustrates the overall schema of the proposed CSKG.

3.2.1. Rule Knowledge Graph: Schema and Semantics

This study improves the condition–fact RKG representation used for encoding construction safety regulations. To address the issues of an overly simplistic entity taxonomy and unclear logical execution order, this study adds a logic layer to the graph structure that captures the rules’ logical execution in detail and the relationships between triplets. In addition, subject, object, and relation entity types are added to the RKG to assist the subsequent identification process. The proposed structure of the RKG, shown in Figure 2, is organized into four layers: the Concept layer, Connection layer, Logic layer, and Statement layer. Following the condition–fact structure, each rule is also divided into two parts: the requirement part, which states the requirement workers must follow, and the condition part, which specifies the circumstances under which the requirement applies.
In this research, each rule is represented as a set of triplets, with different layers of the RKG providing complementary information. Each RKG layer demonstrates partial information about the triplet. Construction regulations typically define requirements on PPE usage, permissible workspaces, and safe behaviors under various tasks and environmental conditions [7,22]. Thus, the mentioned elements above serve as the basis for establishing the RKG. The RKG is constructed based on the structure of knowledge graph provided by Jiang et al. [31,32]. It is composed of a sequence of rules retrieved from safety regulations, with each rule defined by a collection of triplets, shown in Equations (1) and (2):
$t_{u_1} = (n_1{:}a_1,\ n_2,\ n_3{:}a_3)$ (1)
$t_{u_2} = (n_1{:}a_1,\ r,\ n_3{:}a_3)$ (2)
where $n_1, n_3 \in C$, and $C$ is the set of concept nodes in the triplet; $n_2 \in O$, and $O$ is the set of connection nodes in the triplet; and $a_1, a_3 \in A$, and $A$ is the set of attributes. The subscript ‘1’ denotes the subject and ‘3’ the object, and $r$ is the relation between $n_1$ and $n_3$. Triplet $t_{u_1}$ uses a connection node to represent the relation, while $t_{u_2}$ applies the relation ‘Con_Belongto’ to express the connection. Triplet $t_{u_1}$ represents an event, such as a worker standing at a height, moving rebar, or wearing a hardhat; $t_{u_2}$ denotes the affiliation between entities in a specific rule. Furthermore, in an RKG, the attribute and affiliation of an entity are valid only within the rules to which they belong.
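For illustration only, the triplet forms $t_{u_1}$ and $t_{u_2}$ above might be encoded as lightweight Python data structures; all class and field names here are assumptions for this sketch, not part of the original framework:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ConceptNode:
    """A concept node n with an optional attribute a (written n:a in Eq. (1))."""
    name: str
    attribute: Optional[str] = None

@dataclass(frozen=True)
class Triplet:
    """Subject and object concept nodes joined either by a connection
    node (tu1-style, Eq. (1)) or by a relation such as 'Con_Belongto'
    (tu2-style, Eq. (2))."""
    subject: ConceptNode
    link: str                  # connection node name or relation label
    obj: ConceptNode
    via_relation: bool = False # True for tu2-style affiliation triplets

# tu1-style event triplet: a worker wearing a hardhat
t1 = Triplet(ConceptNode("Worker"), "Wear", ConceptNode("Hardhat"))
# tu2-style affiliation triplet: a guardrail belongs to a scaffold
t2 = Triplet(ConceptNode("Guardrail"), "Con_Belongto",
             ConceptNode("Scaffold"), via_relation=True)
```

The frozen dataclasses make triplets hashable, so they can later be collected into sets when scenes are described.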
The concept layer is composed of concept nodes, which express the subject and object entities in triplets. The concept nodes exhibit five sorts of entities: Person, Work, Object, Location, and Environment, using five distinct types:
  • Person represents the roles and professions on job sites, such as Worker, Rebar Worker, or Masonry Worker;
  • Work depicts the behavior and activity of the people and machines, such as Climb;
  • Object represents the things that typically exist on sites, and this sort of entity has its own characteristics to further specify its situations within the belonging rules, such as Helmet;
  • Location indicates a geographic area belonging to machinery, object, people, or region, such as High-altitude;
  • Environment specifies the weather and time periods, such as wind, and it employs three preset attributes to express degrees and levels: Unit, Value, and Property. These three attributes delimit the range of meteorological circumstances: Unit gives the weather measurement unit, such as m/s; Value defines the numerical value; and Property specifies the comparison applied to the value, such as ‘more than or equal to’. In addition, Object and Location nodes have ‘Con_Belongto’ relations that clarify attribution relationships.
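As a sketch of how an Environment node’s Unit/Value/Property attributes could be checked against an observed weather reading, the following is an illustrative assumption (the dictionary encoding, function name, and threshold are not from the paper):

```python
import operator

# Hypothetical mapping from a Property phrase to a comparison operator.
PROPERTY_OPS = {
    "more than": operator.gt,
    "more than or equal to": operator.ge,
    "less than": operator.lt,
    "less than or equal to": operator.le,
}

def environment_matches(node: dict, observed_value: float, observed_unit: str) -> bool:
    """Check whether an observed reading satisfies an Environment
    node's Unit/Value/Property constraint."""
    if observed_unit != node["Unit"]:
        return False  # units must agree before comparing values
    op = PROPERTY_OPS[node["Property"]]
    return op(observed_value, node["Value"])

# Illustrative strong-wind condition: wind speed >= 10.8 m/s
wind_rule = {"Unit": "m/s", "Value": 10.8, "Property": "more than or equal to"}
print(environment_matches(wind_rule, 12.0, "m/s"))  # True: condition holds
```

A production system would also need unit conversion rather than the strict unit-equality check used here.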
The connection layer includes connection nodes and directed links that are connected to concept nodes; the link direction determines whether a concept node is a subject or object in triplets. Three forms of connection nodes are implemented in three unique domains: Operate, Position, and Predicate. Typically, Operate is applied to the PPE domain of a worker, exhibiting the interaction between the worker and PPE such as wear and hang. Position regulates the location and weather relationship, such as in (meaning worker in a kind of weather), up, down, or around some objects or locations. ‘Predicate’ describes the working conditions of workers and objects, as well as how they utilize and operate other objects, like use, carry, or do. These nodes support the representation of domain-specific actions and contextual relationships relevant to safety rule interpretation.
The logic layer describes the logic execution within the requirement and condition parts and the logical order between triplets. Its structure is similar to an event-tree expansion, where higher-level nodes branch into more detailed logical elements; a bigger branch may extend into multiple smaller branches, with most terminals being connection nodes. The logic layer can thus precisely represent the association between triplets and the order of logic execution. There are two types of nodes in the logic layer: logic nodes and part nodes. Logic nodes are indispensable for depicting logic execution: in safety rules, certain requirements and circumstances are subject to specific constraints, and logic nodes make these constraints explicit. AND, OR, and NOT are the three distinct types of logic nodes. AND indicates that all child nodes must be followed; OR indicates that at least one of the child nodes must be followed; NOT indicates that none of the child nodes should be followed. The outcome of logic nodes is reported to the part nodes. Part nodes indicate which triplets correspond to the requirement and condition parts: REQ means that all triplets connected to the node belong to the requirement part, and CON means that all connected triplets belong to the condition part. In Figure 2, the triplet [Person, Hang, Safety Belt] is in the requirement part since it is linked to the REQ node, and this triplet must be obeyed because it is also connected to an AND node.
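As an illustrative sketch (not the authors’ implementation), the AND/OR/NOT evaluation described above can be expressed as a small recursive function; the node encoding and all names are assumptions:

```python
def evaluate_logic(node, triplet_truth):
    """Recursively evaluate a logic-layer subtree.

    node: either a triplet id (a leaf) or a dict such as
          {"op": "AND" | "OR" | "NOT", "children": [...]}.
    triplet_truth: maps a triplet id to whether it holds in the scene.
    """
    if not isinstance(node, dict):                  # leaf: a triplet
        return triplet_truth[node]
    results = [evaluate_logic(c, triplet_truth) for c in node["children"]]
    if node["op"] == "AND":
        return all(results)      # every child must be followed
    if node["op"] == "OR":
        return any(results)      # at least one child must be followed
    if node["op"] == "NOT":
        return not any(results)  # no child may be followed
    raise ValueError(f"unknown logic node {node['op']}")

# Hypothetical requirement: wear hardhat AND hang safety belt
req = {"op": "AND", "children": ["t_hardhat", "t_belt"]}
truth = {"t_hardhat": True, "t_belt": False}
print(evaluate_logic(req, truth))  # False: not all requirements satisfied
```

The part nodes (REQ/CON) would sit above such subtrees, gating whether the requirement is checked when the condition subtree evaluates to true.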
The statement layer contains the root-level representation of each rule. A statement node connects to its associated REQ and CON nodes, allowing the complete rule to be stored as a structured and machine-interpretable unit. This configuration supports rule-level retrieval, reasoning, and subsequent automated violation identification.

3.2.2. Association Knowledge Graph

In addition to rule-based information, background knowledge related to construction safety is required to support reliable detection of safety violations. To supply this complementary information, an AKG is developed to link and enrich the entities defined in the RKG. The AKG facilitates entity normalization, semantic expansion, and cross-entity association during the violation identification process. Entities in the AKG are derived from the concept layer of the RKG and are supplemented with manually curated entries when necessary.
Unlike the RKG, where relations may be represented as connection nodes, the AKG expresses relations directly as edges in the triplet form shown in Equation (2). The AKG contains two types of links: ‘similar’ and ‘subclassof’. The former indicates that two entities have similar meanings, and the latter indicates that the subject entity is a subclass of the object entity. These links establish lightweight semantic associations among nodes. For example, the AKG records the relation [Safety line, similar, Safety belt] to support synonym recognition and entity unification during reasoning.
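A minimal sketch of AKG-style ‘similar’ and ‘subclassof’ lookups used for entity unification might look as follows; the class and method names are assumptions for illustration:

```python
from collections import defaultdict

class AssociationKG:
    """Minimal AKG sketch holding 'similar' and 'subclassof' edges."""

    def __init__(self):
        self.similar = defaultdict(set)
        self.superclasses = defaultdict(set)

    def add_similar(self, a: str, b: str) -> None:
        self.similar[a].add(b)
        self.similar[b].add(a)          # similarity is symmetric

    def add_subclassof(self, sub: str, sup: str) -> None:
        self.superclasses[sub].add(sup)

    def expand(self, entity: str) -> set:
        """Return the entity plus its synonyms and superclass concepts,
        for entity unification during reasoning."""
        return {entity} | self.similar[entity] | self.superclasses[entity]

akg = AssociationKG()
akg.add_similar("Safety line", "Safety belt")   # the example relation above
akg.add_subclassof("Rebar Worker", "Worker")
print(akg.expand("Safety line"))                 # {'Safety line', 'Safety belt'}
```

Superclass expansion lets a rule written for ‘Worker’ also fire for a detected ‘Rebar Worker’.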
  • Generate Rule Knowledge Graph
The proposed framework employs the automated extraction model developed by Wei [42] to obtain rule-related information from safety texts and convert it into the multi-layer representation of the RKG and AKG. This extraction process must accommodate the characteristics of construction safety documents, which often contain fragmented statements and frequently omit explicit worker-related subjects. Manually supplementing such missing information is time-consuming and inconsistent, necessitating an extraction representation that can operate effectively under incomplete textual inputs.
Furthermore, triplets in the RKG consist of a subject entity, a connection entity, and an object entity. Directly fixing connection entities as predefined relations reduces flexibility and increases the structural complexity of rules. At the same time, RKG triplets require representation of both intra-triplet attributes and inter-triplet logical relationships. These requirements motivate the introduction of a specialized tuple-based intermediate representation that supports the RKG’s multi-layer structure.
To address these challenges, each triplet is decomposed into two typed partial triplets (collectively referred to as tuples): the subject tuple, which represents the subject entity and its associated connection entity, and the object tuple, which represents the object entity and its associated connection entity. Each relation label encodes the triplet type (whether the triplet is a requirement or condition portion) and the kinds of the two entities (subject or object). For example, a triplet [Worker, Wear, Helmet] belonging to a requirement part can be split into two tuples: [Worker, Person-Req_AND-Operate, Wear] and [Wear, Operate-Req_AND-Object, Helmet]. The relation label ‘Operate-Req_AND-Object’ indicates that the subject type is Operate, the object type is Object, and the relation between the two entities is Req_AND, where ‘Req’ denotes that this tuple belongs to the requirement part and ‘AND’ denotes the tuple’s execution logic. The relation label can thus convey a variety of information.
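The tuple decomposition above can be sketched as follows; the function and its default type arguments are illustrative assumptions, not the paper’s extraction model:

```python
def split_triplet(subject, connection, obj, part="Req", logic="AND",
                  subj_type="Person", conn_type="Operate", obj_type="Object"):
    """Decompose a triplet into a subject tuple and an object tuple,
    each carrying a composite relation label of the form
    '<left type>-<part>_<logic>-<right type>'."""
    subj_tuple = (subject, f"{subj_type}-{part}_{logic}-{conn_type}", connection)
    obj_tuple = (connection, f"{conn_type}-{part}_{logic}-{obj_type}", obj)
    return subj_tuple, obj_tuple

# The worked example from the text: [Worker, Wear, Helmet] in a requirement part
s, o = split_triplet("Worker", "Wear", "Helmet")
print(s)  # ('Worker', 'Person-Req_AND-Operate', 'Wear')
print(o)  # ('Wear', 'Operate-Req_AND-Object', 'Helmet')
```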
In addition, certain relation labels differ from those listed above. For example, ‘Object-Con_Belongto-Object’ represents that both the subject and object entities are of type Object and that the subject entity belongs to the object entity. Some triplets do not need a connection entity; for these, the relation label takes a form such as ‘Object-Con_AND-Work’, and a connection entity is added to the triplet during the establishment of the RKG. Similarly, attributes are represented by relation labels such as ‘Object-Con_AND-Attribute’, and the connection entity sequence label ‘Predicate-Connect_Dis-Predicate’ specifies that each triplet belonging to a connection entity must be evaluated independently.
After completing the triplet extraction phase, the next step is establishing the RKG from the extracted partial triplets. First, the tuple sets are entered for expansion. For example, a tuple ‘Object-Con_AND-Labor’, signifying the condition that the object is doing some work, is separated into two tuples: ‘Object-Con_AND-Predicate’, describing the first half of the triplet, and ‘Predicate-Con_AND-Work’, describing the second half; the Predicate-typed entities in the two resulting tuples are identical, each containing the entity ‘conduct’. Tuples describing attributes (e.g., ‘Environment-Con_OR-Property’ or ‘Object-Con_AND-Attribute’) do not produce standalone triplets; instead, their attribute entities are attached directly to the corresponding concept nodes. Subsequently, pairs of tuples with identical linked entities and relations are joined into new triplets. However, some safety rules lack a subject description, so no subject tuple is extracted in the previous step; in this case, a new subject tuple is generated based on the missing-element pattern to ensure a complete triplet. Second, the matching logic layer is formed, following the relations in the triplets. Logic nodes (AND, OR, NOT) and part nodes (REQ, CON) are assigned according to the rule’s extracted logical operators and relation labels. The matching structure enables a clear representation of execution order and conditional dependencies among triplets.
Finally, each processed text produces a statement node connected to its associated logic nodes. This node specifies the relationship between the requirement and condition sections and serves as the root for the rule instance in the knowledge graph.
2.
Generate Association Knowledge Graph
Typically, the AKG is established from two data sources: a public knowledge graph and manually contributed information. This research examines each concept node co-existing in the public knowledge graph and the RKG to search for similar entities and subclass or superclass entities; the retrieved relations are then added to the AKG.

3.3. The Safety Violation Identification Module

To detect workers’ safety violations using the proposed CSKG, this study developed an identification module that leverages the knowledge graph to determine whether a worker has committed a safety violation. The module receives a set of triplet-form worker descriptions generated by perception components (such as PPE classifiers, object detectors, action recognizers, and localization systems) and outputs a binary or graded safety status (e.g., “Safe”/“Not Safe”). The module is divided into two main components: scene description expansion and rule evaluation. Figure 3 depicts the overall structure of the identification module.

3.3.1. Scene Description

The safety violation identification module determines the worker’s safety status from the worker’s scene description, which is a set of triplets:
S = {(s_i, r_i, o_i)}_{i=1}^{N}
where s_i is the subject, r_i the relation, and o_i the object of the i-th triplet. These triplets form the initial scene description. Because raw perception outputs are often sparse or incomplete, scene descriptions require semantic expansion before they are matched against safety rules.

3.3.2. Scene Description Expansion

A single observed triplet typically activates only those rules directly associated with its entities. To increase coverage, the scene description is expanded to incorporate semantically related entities and inferred relations. The output of this section is an extended scene description, which contains both the input scene description and supplementary description.
This component performs the extension in three stages, as shown in Figure 3: (1) linking entities to concept nodes in the AKG/RKG, (2) retrieving related entities via the AKG, and (3) materializing additional triplets.
(1)
Entity linking. Each entity string produced by the sensors is searched in the RKG/AKG. If no exact match is found, the most similar concept under the Jaro–Winkler string similarity metric is used instead. The input of this stage is the worker description, constructed as a set of triplets T = {t_1, t_2, …, t_m}, where m is the number of transformed triplets and each triplet has the form t_i = (n_1^i, n_2^i, n_3^i), i ∈ {1, 2, …, m}, with n_1^i the subject entity, n_2^i the relation, and n_3^i the object entity. If an entity is missing from the CSKG, it is replaced by the entity with the greatest resemblance. The resemblance between two entities is measured by the Jaro–Winkler distance, as shown in Equations (4) and (5).
d_w = d_j + L × P × (1 − d_j)
d_j = (1/3) × (m/|s_1| + m/|s_2| + (m − t)/m)
where d_j is the Jaro similarity, L is the length of the common prefix (at most 4), P is the prefix scaling factor (typically 0.1), m is the number of matching characters, t is half the number of transpositions, and |s_1| and |s_2| are the lengths of the two strings.
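The linking step and Equations (4) and (5) can be sketched in Python. This is a minimal illustration assuming the standard Jaro–Winkler formulation (prefix factor P = 0.1, prefix length capped at 4); the `link_entity` helper and its 0.8 threshold are hypothetical additions for demonstration, not part of the published module.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity d_j as in Equation (5)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    match1, match2 = [False] * len1, [False] * len2
    matches = 0
    for i, c in enumerate(s1):  # count matching characters within the window
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not match2[j] and s2[j] == c:
                match1[i] = match2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    t, k = 0, 0  # t counts out-of-order matched pairs; halved below
    for i in range(len1):
        if match1[i]:
            while not match2[k]:
                k += 1
            if s1[i] != s2[k]:
                t += 1
            k += 1
    t //= 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3

def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro-Winkler similarity d_w as in Equation (4)."""
    dj = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):  # common prefix, capped at 4 chars
        if a != b:
            break
        prefix += 1
    return dj + prefix * p * (1 - dj)

def link_entity(mention: str, vocabulary, threshold: float = 0.8):
    """Return the most similar concept, or None below the threshold."""
    best, score = None, 0.0
    for concept in vocabulary:
        s = jaro_winkler(mention, concept)
        if s > score:
            best, score = concept, s
    return best if score >= threshold else None
```

For instance, a misspelled sensor output such as “helmat” links to the concept “helmet” rather than to unrelated PPE concepts.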
(2)
Expansion via the AKG. For each linked concept, retrieve ‘similar’ and ‘subclassof’ neighbors up to depth d. For each retrieved neighbor, synthesize corresponding triplets by preserving the original relation type when semantically compatible. This allows the system to infer unseen but valid scene descriptions (e.g., retrieving “protective equipment” from “helmet”).
(3)
Materializing additional triplets. After expansion, the retrieved entities replace the original ones to form new triplets, which are added to the triplet set, yielding an extended scene triplet set S+. Candidate rules are then identified for logical evaluation: a rule R in the RKG participates in evaluation if there exist t_s ∈ S+ and t_r ∈ R that share the same subject or object concept. Attribute-related entities (e.g., wind ≥ 10 m/s) are attached to their concept nodes rather than materialized as full triplets. When the perception text lacks a subject description, a common case in construction safety documents, a plausible subject tuple is synthesized according to the rule patterns.
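The expansion and candidate-rule stages can be illustrated with a small sketch, assuming the AKG is available as an adjacency map of ‘similar’/‘subclassof’ neighbors; the dictionary-based graph and rule encoding below are illustrative simplifications, not the Neo4j implementation used in the study.

```python
from collections import deque

def expand_scene(triplets, akg, depth=1):
    """Add triplets whose subject/object is replaced by AKG neighbors
    ('similar' / 'subclassof'), up to `depth` hops."""
    def neighbors(concept):
        seen, frontier = {concept}, deque([(concept, 0)])
        while frontier:  # breadth-first walk over the association graph
            node, d = frontier.popleft()
            if d == depth:
                continue
            for nb in akg.get(node, []):
                if nb not in seen:
                    seen.add(nb)
                    frontier.append((nb, d + 1))
        return seen

    expanded = set(triplets)
    for s, r, o in triplets:
        for s2 in neighbors(s):
            for o2 in neighbors(o):
                expanded.add((s2, r, o2))  # preserve the original relation
    return expanded

def candidate_rules(expanded, rules):
    """A rule participates if one of its triplets shares a subject or
    object concept with some expanded scene triplet."""
    concepts = {s for s, _, _ in expanded} | {o for _, _, o in expanded}
    return [rule for rule in rules
            if any(s in concepts or o in concepts
                   for s, _, o in rule["triplets"])]
```

With `akg = {"helmet": ["protective equipment"]}`, the observed triplet (worker, wear, helmet) also activates rules phrased in terms of protective equipment.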
The second component, rule evaluation, judges the safety of the behavior in the extended scene description against each relevant rule, determining whether the scene description conforms to the rule (i.e., whether the conduct is safe). For each matched triplet pair (t_s, t_r), the judgment method depends on the connection type:
  • Operate (interaction/PPE): Check whether the relevant PPE object is present in the scene description and whether the interaction predicate (e.g., “wear”, “hang”) is satisfied. This may be implemented based on classifiers from vision or sensor modules; in this study, it is emulated using symbolic triplets.
  • Position (spatial relations): Determine whether the spatial relationship (e.g., “at high-altitude”, “in hazardous area”) holds, based on positional attributes of the worker and locations.
  • Predicate (action/state): Judge whether the worker or object is performing (or not performing) a specified activity, based on action predicates (e.g., “carry”, “climb”).
Rule evaluation proceeds in three stages: searching for related rules, evaluating each rule, and analyzing the outcome. In the first stage, the model compares the triplets in the RKG with those in the scene description. If a triplet in an RKG rule and a triplet in the scene description share the same subject or object entity, that rule participates in the evaluation. This may be written as:
∃ (n_1^i, n_2, n_3^i) ∈ T_i, T_i ∈ RKG
where RKG denotes the rule knowledge graph, T_i is the i-th rule in the RKG, and n_2 is any connection entity. In the second stage, the model applies each applicable rule to assess the worker’s behavior and obtains the analysis results. Specifically, the model first determines which triplet in the rule corresponds to the triplet in the scene description, i.e., n_1^i = n_1 and n_3^i = n_3. Then the type of the connection entity in the rule determines the judging methodology, since the Operate, Position, and Predicate types have distinct judgment procedures.
For each worker, the per-rule results are aggregated. A worker is labeled Not Safe if at least one applicable rule is judged violated with confidence above a threshold T. Optionally, a risk score can be computed by weighting rule violations by rule criticality and detection confidence, enabling graded alerts (low/medium/high). Every decision is accompanied by the violated rule IDs and the failed triplet(s) for interpretability.
All perceptual outputs carry confidence scores. We propagate these confidences to triplet and rule judgments and use fuzzy aggregation to reduce brittleness. Two thresholds are tunable: the entity-linking threshold T_link and the violation threshold T_vio. We recommend setting both via cross-validation on annotated examples.
  • Operate (interaction/PPE): Check whether the object is detected and whether the predicate is satisfied. Detection is based on classifiers (e.g., PPE detection from images); semantics such as “wear” may require that the “helmet” bounding box overlaps the head region beyond a threshold. Each triplet returns a Boolean and a confidence score.
  • Position (spatial relations): Evaluate using spatial inference (e.g., GPS coordinates, relative positions, bounding-box topologies). For example, in(High-altitude) is true if the worker’s altitude exceeds a threshold or if the worker is inside a designated high-altitude zone polygon.
  • Predicate (action/state): Evaluate temporal/action predicates using activity recognition (short action windows). For example, carry(rebar) is detected from action classifier outputs aggregated over a time window.
Finally, the final safety status is determined after aggregating the evaluation outcomes from all relevant rules. The worker is marked Unsafe if any mandatory requirement is violated under the applicable conditions defined by the rule set.
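Under the simulated symbolic inputs used in this study, the per-type checks reduce to set membership, and the aggregation logic above can be sketched as follows. The Con/Req encoding of a rule is an assumed simplification of the paper’s logic layer, shown here only to make the any-violation aggregation concrete.

```python
def holds(triplet, scene):
    # With simulated symbolic inputs, Operate / Position / Predicate
    # checks all reduce to membership in the (expanded) scene set.
    return triplet in scene

def evaluate_rule(rule, scene):
    """Return 'Safe', 'Unsafe', or 'NotApplicable' for one rule.
    rule = {'id': ..., 'con': [...], 'req': [...], 'req_logic': 'AND'|'NOT'}"""
    if not all(holds(t, scene) for t in rule["con"]):
        return "NotApplicable"  # rule conditions not met
    if rule.get("req_logic", "AND") == "NOT":
        satisfied = not any(holds(t, scene) for t in rule["req"])
    else:
        satisfied = all(holds(t, scene) for t in rule["req"])
    return "Safe" if satisfied else "Unsafe"

def judge_worker(rules, scene):
    """Worker is Not Safe if at least one applicable rule is violated;
    violated rule IDs are returned for interpretability."""
    verdicts = {r["id"]: evaluate_rule(r, scene) for r in rules}
    status = "Not Safe" if "Unsafe" in verdicts.values() else "Safe"
    violated = [rid for rid, v in verdicts.items() if v == "Unsafe"]
    return status, violated
```

Applied to the example rule of Figure 5 ([Person, at, high altitude] as condition, [Person, hang, safety belt] as requirement), removing the safety-belt triplet flips the verdict from Safe to Not Safe.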

3.3.3. Implementation Scope and Validation Strategy

The confidence propagation, fuzzy aggregation, and threshold tuning mechanisms described in Section 3.3.2 constitute the theoretical framework designed for a fully automated, sensor-driven deployment. However, to rigorously validate the logical correctness and semantic reasoning capability of the proposed CSKG independent of perceptual noise, the experimental verification in Section 4 utilizes manually verified and simulated triplets as input. This approach decouples the reasoning module from potential upstream detection errors, ensuring that the validation results accurately reflect the performance of the rule logic itself. Full integration with real-time sensor data is reserved for subsequent phases of large-scale deployment.

4. Case Study

A case study was conducted to demonstrate CSKG generation and verify its applicability. First, the CSKG was generated from safety-related documents (i.e., regulations) and public safety knowledge. Second, an automated extraction model complementing the automated generation module was trained and evaluated. Finally, a validation test was performed to evaluate the identification module’s effectiveness. Figure 4 illustrates the workflow, and the three phases are described below.

4.1. Generate Knowledge Graph

4.1.1. Collect Safety Regulations

A total of 97 safety-related documents, covering Chinese National Standards (GB), Construction Industry Standards (JGJ), and safety manuals, were collected. These documents were selected for their relevance to construction safety and workers’ safety violations while retaining a degree of generality. A subset of the selected documents is shown in Table 1.

4.1.2. Preprocessing

This investigation begins by defining the scope of documents to be extracted, ultimately selecting rules on the three key safety issues outlined in Section 3.2 (Which kinds of PPE should workers wear? Which workplaces and conditions are workers permitted to stay in? Which processes or activities are considered safe for workers?). To represent the content clearly, this research breaks relatively complicated rules into basic components based on content and punctuation, and missing subjects or objects are added based on the chapter headings. Finally, a corpus of 1236 candidate rule statements is consolidated for use in the subsequent phase. After filtering out duplicated, overly generic, or out-of-scope statements, 1200 rules are retained and used to construct the final RKG and the annotated corpus for model training.

4.1.3. Annotate Rules

To train the automated extraction model, this research annotates rules to build the training corpus. Rule annotation captures vital information from unstructured text; the captured data are used both for training the automated extraction model and for establishing the RKG. The annotation was performed with the Brat Rapid Annotation Tool (BRAT), a web-based tool for annotating existing text documents [43]. BRAT’s output, designed for systematic annotation, has a specified structure that software can parse and interpret. BRAT supports two kinds of annotations: text span annotations, which identify entities and their categories, and relation annotations, which assert relationships between entities. The annotated corpus yields a structured representation that is directly interpretable by the automated extraction framework and supports consistent graph generation; the final annotated dataset formed the ground-truth corpus for training, validation, and testing.

4.1.4. Construct the Rule Knowledge Graph

To guarantee the high quality and confidence of the knowledge graph used in the identification experiments, this work employs the manually annotated data obtained in Section 4.1.3 to create the RKG, rather than directly using the noisy output of the automated extraction model. Specifically, the final RKG is built from the filtered set of 1200 rules selected from the 1236 preprocessed candidate statements. The RKG was implemented in Neo4j, a widely adopted graph database in construction safety and knowledge graph research [44]. After annotation, the annotated entities, including their names, types, and ID numbers, are output along with their linked relations and deposited into Neo4j. Figure 5 shows an example rule decomposed into triplets: [Person, at, high altitude] (Con_AND) and [Person, Hang, safety belt] (Req_AND), indicating that a worker operating at high altitude must wear a safety belt. Such structured logic allows the RKG to represent both declarative knowledge and execution logic.

4.1.5. Construct the Association Knowledge Graph

To enhance semantic coverage and enable concept expansion, an AKG was developed and integrated with the RKG. The public knowledge graph OwnThink, containing more than 25 million entities and billions of relations, was used as the external knowledge base [45]. Through the relations ‘Also Known As’ and ‘Another Name’, this research queries each Concept entity that exists in both the RKG and OwnThink to locate related entities. In addition, ‘subclassof’ and ‘similar’ links between person-type entities are identified and added to the AKG to improve searching. The retrieved relations and entities are entered into Neo4j. These expansions significantly improve entity linking accuracy and allow the system to reason over synonyms and hierarchical categories (e.g., helmet → PPE → protective equipment).

4.2. Train Automated Extraction Model and Performance

The automated data extraction model must be trained to extract data with greater precision. After annotation, the annotated data are converted to JSON files and serve as the training corpus for the extraction model. The extraction model utilizes the Bidirectional Encoder Representations from Transformers (BERT)-based deep learning model presented by Cui et al. [46]. The training, validation, and test corpora account for 70%, 15%, and 15% of the total corpus, respectively.
In the validation of the safety violation identification module, we treated “having a safety violation” as the positive class. Accordingly, TP (true positive) denotes the number of workers correctly identified as having safety violations; FP (false positive) is the number of workers incorrectly identified as having safety violations; FN (false negative) is the number of workers with safety violations that are incorrectly identified as safe; and TN (true negative) is the number of workers correctly identified as engaging in safe behavior. The study evaluated the performance of the framework by the accuracy, precision, recall, and F1-score metrics below:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision_Safety = TP / (TP + FP)
Recall_Safety = TP / (TP + FN)
F1_Safety = (2 × Precision_Safety × Recall_Safety) / (Precision_Safety + Recall_Safety)
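The metrics above follow directly from the confusion counts; a minimal helper, with illustrative counts in the test rather than the study’s actual results:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts,
    treating 'having a safety violation' as the positive class."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```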
The performance of the automated extraction model is shown in Table 2, where ‘Type’ refers to the tuple type and ‘Precision’ and ‘Recall’ are computed within each type. The overall precision, recall, and F1 values are 78.07%, 58.92%, and 67.15%, respectively. To support the multi-layer rule representation, tuple categories were grouped into subject tuples, object tuples, and others. A subject tuple consists of the subject entity, the connection entity, and the link indicating the triplet’s type and logic execution between the two nodes; an object tuple consists of the connection entity, the object entity, and the links between them. The remaining category covers tuples describing relationships between entities and entity attributes. The results indicate that the model is accurate enough to complement manual rule extraction and to support large-scale rule generation in the future.
For subject tuples, precision, recall, and F1-score are 95.24%, 60.06%, and 74.07%, respectively; for object tuples, they are 76.42%, 59.56%, and 66.93%. The subject tuples thus outperform the object tuples. In addition, this research splits the tuple types into four functional categories: Operate, Position, Predicate, and Attribute. Operate tuples represent how the worker wears PPE; Position tuples describe the circumstance in which the worker stays; Predicate tuples represent the activities of workers and objects; and Attribute tuples represent situations involving an object’s attributes or the relation between two entities. Precision, recall, and F1-score are 79.66%, 78.99%, and 79.32% for Operate; 85.31%, 60.70%, and 70.93% for Position; 71.11%, 50.96%, and 59.37% for Predicate; and 86.05%, 55.22%, and 67.27% for Attribute. These findings indicate that Predicate and Attribute perform the worst, Position performs well, and Operate performs best.

4.3. Validation Test

To verify the validity and accuracy of the identification module, the authors developed a validation set, applied the safety violation identification module to identify workers’ safety violations, and analyzed its performance by comparing the identification results with manual judgments.

4.3.1. Construct Validation Sets

The validation set draws on two sources: rules in the knowledge graph and examination questions for safety supervisors. These sources were selected for two reasons: on one hand, both are accurate, having been prepared by specialists; on the other hand, using two sources ensures objectivity and demonstrates the framework’s applicability in various contexts.
The authors construct each validation example from rules in the knowledge graph or statements from examination questions, together with a source-dependent manual judgment. A validation example comprises a series of triplets taken or adapted from the sources. Subsequently, the authors enlarge the validation set by adding, removing, and modifying triplets, manually deciding whether each new example is safe based on the referenced triplet set. Each validation example is built from a single rule or a single examination question, and its label (safe or not) is determined from that rule or the examination answer. For example, the triplet set ([worker, at, height], [worker, wear, safety belt]) can be constructed from the safety rule ‘a worker staying at height should wear a safety belt’; under this rule, the worker in this scenario is safe. If the triplet [worker, wear, safety belt] is then deleted, the worker becomes unsafe under the same rule.
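The example-generation procedure can be sketched as follows, assuming a rule is encoded as lists of condition (Con) and requirement (Req) triplets; this encoding is an illustrative simplification of the paper’s rule structure.

```python
def base_example(rule):
    """A compliant example: all condition and requirement triplets present."""
    return set(rule["con"]) | set(rule["req"]), "Safe"

def deletion_variants(rule):
    """Removing any required triplet yields a new, Unsafe-labeled example."""
    triplets, _ = base_example(rule)
    variants = []
    for req in rule["req"]:
        variants.append((triplets - {req}, "Not Safe"))
    return variants
```

For the height/safety-belt rule above, `base_example` yields the safe scenario and `deletion_variants` yields the unsafe one obtained by dropping the safety-belt triplet.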
For comparison, the research divides the validation set into three types: the rule validation set contains only examples derived from rules; the question validation set contains examples derived from examination questions; and the overall validation set contains all of the examples above.

4.3.2. Validation and Results

Cypher, the query language of the Neo4j system, was used to retrieve and store data in the graph database [47]. The validation process was automated with a Python 3 program.
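A retrieval query of the kind issued through Cypher might look like the following sketch. The node labels and relationship types here are assumptions for illustration, since the exact Neo4j schema is not published in this paper.

```python
def rule_lookup_query(concept):
    """Build a parameterized Cypher query fetching rules whose triplets
    mention a given concept. Labels (Statement, Logic, Triplet, Concept)
    and relationship types are hypothetical placeholders."""
    query = (
        "MATCH (s:Statement)-[:HAS_LOGIC]->(:Logic)"
        "-[:HAS_TRIPLET]->(t:Triplet)-[:SUBJECT|OBJECT]->(c:Concept) "
        "WHERE c.name = $concept "
        "RETURN s, collect(t) AS triplets"
    )
    return query, {"concept": concept}
```

Such a query would be executed via the official Neo4j Python driver, with `$concept` bound to the linked entity from the scene description.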
Each safety behavior category has its own identification norms and criteria. For the Operate aspect, identification relies on whether the worker’s PPE use violates the regulation; if a PPE item is not included in the scene description, the worker is assumed to be properly outfitted. For hazardous location exposure, the identification module flags location-related behavior as hazardous when required spatial relationships are not detected or when inappropriate location operations are performed. For the safe procedure aspect, the primary task is to identify whether a worker has participated in a hazardous or safe activity: when a worker performs a hazardous activity or omits an indispensable safe activity, the program judges that a safety violation has occurred.
Figure 6 depicts the identification results of the validation test. The label ‘True’ indicates that the worker’s behavior in the example is safe, while ‘False’ indicates that it is hazardous; ‘Actual label’ denotes the manual answer, and ‘Predict label’ denotes the identification result. Of the 500 instances in the validation sets, 300 were created from the knowledge graph and 200 from examination questions. The validation test results are displayed in Table 3 and Figure 7.
The total accuracy on the validation sets is 97.60%, while the accuracy on the sets derived from rules and from examination questions is 98.67% and 96.00%, respectively. For identifying safe behavior, precision is 98.56%, 96.46%, and 97.82% on the rule, question, and overall validation sets, respectively; recall is 99.51%, 96.46%, and 98.43%; and F1-score is 99.03%, 96.46%, and 98.13%. For identifying safety violations, precision is 98.91%, 95.40%, and 97.21%; recall is 96.81%, 95.40%, and 96.13%; and F1-score is 97.85%, 95.40%, and 96.67% on the same three sets. Overall, the rule validation set performs best, followed by the overall validation set and then the examination-question set.

5. Discussion

The results of this study provide evidence that both the automated extraction model and the safety violation identification module are feasible components of a knowledge graph-driven framework for automated compliance checking. At the same time, several observations and limitations merit discussion. For the automated extraction model, tuple types with extremely low sample sizes (fewer than 30 instances) and F1-scores of zero were excluded from the performance summary due to their limited statistical significance.
Tuple categories with relatively fixed text structures have a significant impact on extraction accuracy. For example, the tuple categories “Position-Con_OR-Environment”, “Environment-Con_OR-Unit”, “Environment-Con_OR-Value”, and “Environment-Con_OR-Property” often follow the pattern “in some weather” and exhibit highly regular linguistic expressions. Their F1-scores all exceed 85%, indicating excellent performance. Similarly, the tuple categories “Person-Req_AND-Operate”, “Operate-Req_AND-Object”, and “Operate-Req_NOT-Object” are related to workers’ PPE-wearing conditions and typically follow strict patterns such as “should/should not be equipped with …”. Among these, “Operate-Req_NOT-Object” has far fewer instances than the other two types, which adversely affects its performance and reduces accuracy by nearly 70% relative to the best-performing tuples. In addition, “Person-Req_NOT-Position” and “Position-Con_AND-Location” each correspond to well-defined patterns: “Person-Req_NOT-Position” is associated with statements such as “person do not at …”, and “Position-Con_AND-Location” often appears as “person at height”; their F1-scores reach 88.89% and 71.79%, respectively. The tuple type “Object-Con_AND-Work” is related to statements such as “something is doing the work” and achieves an F1-score of 70.59%. By contrast, the types “Predicate-Req_NOT-Work”, “Object-Con_Belongto-Object”, and “Object-Con_AND-Attribute” exhibit more diverse and less regular text structures, leading to the poorest performance among all categories. These findings suggest that extraction performance could be enhanced by expanding the annotated corpus, particularly for sparsely represented tuple types, and by integrating post-processing mechanisms that leverage semantic constraints embedded in the RKG and AKG. Such post-processing could correct extraction errors through consistency checks based on domain knowledge, thereby reducing noise and improving the stability of downstream reasoning.
The validation results of the identification module further confirm the applicability of the proposed framework. Across all validation sets, the module achieved accuracy above 90%, demonstrating that the knowledge graph provides sufficient semantic richness and logical structure to support automated safety assessment. Nevertheless, the accuracy on examination question-based examples was marginally lower than that on rule-based examples. This difference can be attributed to two main factors. First, rule-based examples are fully aligned with the knowledge graph, which contains all relevant concepts and relations; thus, the module rarely encounters missing information. In contrast, examination question-based examples occasionally contain entities that are absent from or ambiguously represented in the graph, leading to misinterpretation during entity linking. Second, entity linking currently relies primarily on string similarity rather than semantic similarity, making the module more susceptible to mismatches such as substituting closely spelled but conceptually distant terms.
Although the performance of the identification module is satisfactory, errors in the validation sets prevent the accuracy from reaching 100%. Several causes for mismatches can be identified. First, some scene descriptions encoded in the knowledge graph are incomplete, making it difficult for the identification module to detect safety violations reliably. Second, each validation example is designed according to a primary rule, and the manual label is derived using that rule; however, multiple rules may apply to a given scenario in the knowledge graph. As a consequence, the example may be assigned one label when evaluated under a particular rule but an inconsistent label when evaluated under other applicable rules. Third, the identification module relies on text similarity to match entities and assess behavior safety, but it does not fully account for semantic similarity across entities.
It is important to emphasize that, in the present case study, the high identification accuracy (97.6%) is obtained using a manually curated RKG rather than the automatically extracted one. This experimental design decouples the evaluation of the extraction model from the evaluation of the identification module, allowing us to separately assess the feasibility of the multi-layer RKG structure and the logic layer-based reasoning. In a fully automated deployment where the RKG is populated directly by the extraction model, the end-to-end performance would inevitably be lower than the reported 97.6% due to extraction errors, especially for complex predicate and attribute tuples with relatively low recall. Future work will explicitly quantify this degradation by constructing partially automated RKGs with controlled levels of noise and measuring how extraction errors propagate to violation detection outcomes.
Moreover, the current validation assumes that the input to the identification module is a structured set of triplets describing worker states and environmental conditions. These triplets are manually constructed or simulated from safety regulations and examination questions and therefore do not yet reflect the noise and incompleteness present in raw sensing data. Integrating real perception pipelines (e.g., vision-based PPE detectors, worker tracking, and action recognition) will introduce additional uncertainty at the input level, which may further lower end-to-end accuracy. This limitation will be addressed in future experiments that couple the proposed knowledge graph-based reasoning with practical sensing systems on actual construction sites.
Overall, this study suggests that while the proposed framework is effective and practical for identifying safety violations, its accuracy and robustness can be further improved. Promising directions include expanding the annotated dataset, incorporating semantic embedding-based entity linking, strengthening post-extraction correction strategies, and increasing rule and entity coverage within the knowledge graph. Integrating additional multimodal data sources (e.g., 3D spatial information, sensor trajectories) may also help refine scene descriptions and reduce ambiguity. These improvements will enable the framework to better handle linguistic variability, complex rule interactions, and real-world construction scenarios, thereby enhancing its capacity for scalable and interpretable safety monitoring.

6. Conclusions

This study presented an integrated framework for constructing a construction safety-related knowledge graph and automatically identifying workers’ safety violations. The framework consists of two key components: (1) a knowledge graph generation module that combines rule-based information with public background knowledge and (2) a safety violation identification module that evaluates scene descriptions against encoded safety rules. A case study was conducted to assess feasibility, in which the automated extraction model achieved an F1-score of 67%, and the identification module reached 97% accuracy in determining workers’ safety status, demonstrating the practicality of the proposed approach.
The framework offers several advantages. First, the generation module enables the semi-automated construction of safety knowledge graphs, substantially reducing manual effort. Second, the encoded rule structures allow the system to retrieve all relevant regulations based on scene descriptions, improving the completeness of safety assessments. Third, the graph-based representation provides flexible scalability, enabling users to adjust rule coverage without modifying the underlying algorithms. Finally, the multi-layered knowledge graph representation enhances interpretability and supports transparent identification of safety violations.
Despite these benefits, several limitations remain. The automated extraction model requires further performance improvement, particularly for tuple types with diverse linguistic expressions. The current framework also assumes that scene descriptions are provided as structured text; thus, it cannot yet process raw construction site video into textual triplets, limiting end-to-end automation. In addition, entity matching in the identification module relies primarily on character-level similarity, which does not fully capture semantic relationships and may introduce matching errors.
In summary, the current study validates the reasoning capability of the multi-layer knowledge graph under idealized conditions where the rule base is manually curated and the scene descriptions perfectly structured. While the automated extraction model and sensing modules are conceptually integrated into the framework, their outputs are not yet fully coupled with the identification module in an end-to-end fashion. As a result, the reported 97.6% accuracy should be interpreted as an upper bound under clean symbolic inputs. Future work will focus on closing this gap by improving extraction performance, integrating real-world sensing data, and systematically evaluating the framework under realistic noise and incompleteness.
Future research will focus on three directions. First, the extraction model will be strengthened by expanding the annotated dataset and optimizing the underlying model architecture. Second, the proposed framework will be integrated with real-world sensing systems (such as camera-based object detection and wearable sensors) to validate its performance in operational construction environments. Third, semantic matching techniques based on machine learning or contextual embeddings will be incorporated to improve entity linking and enhance the robustness of rule evaluation.

Author Contributions

Conceptualization, methodology, software, investigation, writing—original draft, Y.Z. (Yifan Zhu); data curation, formal analysis, validation, Y.O.; software, visualization, data curation, R.P.; validation, investigation, Z.S.; resources, formal analysis, Y.Z. (Yang Zhou); writing—review and editing, methodology, R.M.; validation, writing—review and editing, B.C.; supervision, project administration, funding acquisition, writing—review and editing, W.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Open Research Fund of Hubei Key Laboratory of Hydropower Engineering Construction and Management, grant number 2024KSD17. The APC was funded by the Open Research Fund of Hubei Key Laboratory of Hydropower Engineering Construction and Management.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the inclusion of proprietary company and industry information that is subject to confidentiality obligations and non-disclosure agreements with the participating organizations.

Conflicts of Interest

Author Yifan Zhu was employed by Engineering Digital Technology R&D Center, Engineering Design & Research Institute of China Communications Construction Third Highway Engineering Co., Ltd. Author Wen Wang was employed by Shanghai Research Institute of Building Sciences Group Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Chen, S.; Demachi, K. Towards on-site hazards identification of improper use of personal protective equipment using deep learning-based geometric relationships and hierarchical scene graph. Autom. Constr. 2021, 125, 103619. [Google Scholar] [CrossRef]
  2. Ministry of Housing and Urban-Rural Development (MOHURD). Notice of the General Office of the Ministry of Housing and Urban-Rural Development on the Situation of Production Safety Accidents in Housing and Municipal Engineering in 2019; China Real Estate Association: Beijing, China, 2020. (In Chinese) [Google Scholar]
  3. Zhang, W.; Zhu, S.; Zhang, X.; Zhao, T. Identification of critical causes of construction accidents in China using a model based on system thinking and case analysis. Saf. Sci. 2020, 121, 606–618. [Google Scholar] [CrossRef]
  4. Andersen, L.P.S.; Grytnes, R. Different ways of perceiving risk and safety on construction sites and implications for safety cooperation. Constr. Manag. Econ. 2021, 39, 419–431. [Google Scholar] [CrossRef]
  5. Cheng, B.; He, X.; Huang, J.; Li, H.; Wu, S.; Chen, H. Toward early warning of unsafe behavior of excavator operators under time pressure: Experimental evidence and EEG-based detection via RCF-IncepLite model. Accid. Anal. Prev. 2026, 229, 108424. [Google Scholar] [CrossRef]
  6. Asadzadeh, A.; Arashpour, M.; Li, H.; Ngo, T.; Bab-Hadiashar, A.; Rashidi, A. Sensor-based safety management. Autom. Constr. 2020, 113, 103128. [Google Scholar] [CrossRef]
  7. Tang, S.; Roberts, D.; Golparvar-Fard, M. Human-object interaction recognition for automatic construction site safety inspection. Autom. Constr. 2020, 120, 103356. [Google Scholar] [CrossRef]
  8. Chen, H.; Luo, X.; Zheng, Z.; Ke, J. A proactive workers’ safety risk evaluation framework based on position and posture data fusion. Autom. Constr. 2019, 98, 275–288. [Google Scholar] [CrossRef]
  9. Fu, H.; Hao, Y.; Wu, Z.; Xu, H.; Zuo, J. Human-centered collaborative design in green buildings: A comprehensive review of neurotechnology integration. Renew. Sustain. Energy Rev. 2026, 231, 116772. [Google Scholar] [CrossRef]
  10. Zhang, C.; Tian, Z.; Song, J.; Zheng, Y.; Xu, B. Construction worker hardhat-wearing detection based on an improved BiFPN. In 2020 25th International Conference on Pattern Recognition (ICPR); IEEE: Milan, Italy, 2021; pp. 8600–8607. [Google Scholar] [CrossRef]
  11. Mneymneh, B.E.; Abbas, M.; Khoury, H. Vision-Based Framework for Intelligent Monitoring of Hardhat Wearing on Construction Sites. J. Comput. Civ. Eng. 2019, 33, 04018066. [Google Scholar] [CrossRef]
  12. Fang, Q.; Li, H.; Luo, X.; Ding, L.; Luo, H.; Rose, T.M.; An, W. Detecting non-hardhat-use by a deep learning method from far-field surveillance videos. Autom. Constr. 2018, 85, 1–9. [Google Scholar] [CrossRef]
  13. Fang, W.; Ding, L.; Luo, H.; Love, P.E.D. Falls from heights: A computer vision-based approach for safety harness detection. Autom. Constr. 2018, 91, 53–61. [Google Scholar] [CrossRef]
  14. Fang, Q.; Li, H.; Luo, X.; Ding, L.; Luo, H.; Li, C. Computer vision aided inspection on falling prevention measures for steeplejacks in an aerial environment. Autom. Constr. 2018, 93, 148–164. [Google Scholar] [CrossRef]
  15. Fang, W.; Zhong, B.; Zhao, N.; Love, P.E.D.; Luo, H.; Xue, J.; Xu, S. A deep learning-based approach for mitigating falls from height with computer vision: Convolutional neural network. Adv. Eng. Inform. 2019, 39, 170–177. [Google Scholar] [CrossRef]
  16. Konstantinou, E.; Lasenby, J.; Brilakis, I. Adaptive computer vision-based 2D tracking of workers in complex environments. Autom. Constr. 2019, 103, 168–184. [Google Scholar] [CrossRef]
  17. Khan, N.; Saleem, M.R.; Lee, D.; Park, M.W.; Park, C. Utilizing safety rule correlation for mobile scaffolds monitoring leveraging deep convolution neural networks. Comput. Ind. 2021, 129, 103448. [Google Scholar] [CrossRef]
  18. Fang, Q.; Li, H.; Luo, X.; Ding, L.; Rose, T.M.; An, W.; Yu, Y. A deep learning-based method for detecting non-certified work on construction sites. Adv. Eng. Inform. 2018, 35, 56–68. [Google Scholar] [CrossRef]
  19. Yan, X.; Zhang, H.; Li, H. Computer vision-based recognition of 3D relationship between construction entities for monitoring struck-by accidents. Comput.-Aided Civ. Infrastruct. Eng. 2020, 35, 1023–1038. [Google Scholar] [CrossRef]
  20. Nath, N.D.; Behzadan, A.H.; Paal, S.G. Deep learning for site safety: Real-time detection of personal protective equipment. Autom. Constr. 2020, 112, 103085. [Google Scholar] [CrossRef]
  21. Xiong, R.; Tang, P. Pose guided anchoring for detecting proper use of personal protective equipment. Autom. Constr. 2021, 130, 103828. [Google Scholar] [CrossRef]
  22. Yan, X.; Li, H.; Wang, C.; Seo, J.O.; Zhang, H.; Wang, H. Development of ergonomic posture recognition technique based on 2D ordinary camera for construction hazard prevention through view-invariant features in 2D skeleton motion. Adv. Eng. Inform. 2017, 34, 152–163. [Google Scholar] [CrossRef]
  23. Seo, J.; Lee, S. Automated postural ergonomic risk assessment using vision-based posture classification. Autom. Constr. 2021, 128, 103725. [Google Scholar] [CrossRef]
  24. Xiong, R.; Song, Y.; Li, H.; Wang, Y. Onsite video mining for construction hazards identification with visual relationships. Adv. Eng. Inform. 2019, 42, 100966. [Google Scholar] [CrossRef]
  25. Fang, W.; Ma, L.; Love, P.E.D.; Luo, H.; Ding, L.; Zhou, A. Knowledge graph for identifying hazards on construction sites: Integrating computer vision with ontology. Autom. Constr. 2020, 119, 103310. [Google Scholar] [CrossRef]
  26. Pauwels, P.; Zhang, S.; Lee, Y.C. Semantic web technologies in AEC industry: A literature overview. Autom. Constr. 2017, 73, 145–165. [Google Scholar] [CrossRef]
  27. Li, X.; Lyu, M.; Wang, Z.; Chen, C.H.; Zheng, P. Exploiting knowledge graphs in industrial products and services: A survey of key aspects, challenges, and future perspectives. Comput. Ind. 2021, 129, 103449. [Google Scholar] [CrossRef]
  28. Ding, X.; Li, Z.; Liu, T.; Liao, K. ELG: An Event Logic Graph. arXiv 2019. [Google Scholar] [CrossRef]
  29. Li, S.; Li, X.; Wang, L. Knowledge points organization model based on AND/OR graph in ICAI. In 2010 6th International Conference on Natural Computation, ICNC; IEEE: Yantai, China, 2010; Volume 4, pp. 2121–2124. [Google Scholar] [CrossRef]
  30. Yu, J.; McCluskey, K. Tax Knowledge Graph for a Smarter and More Personalized TurboTax. arXiv 2020. [Google Scholar] [CrossRef]
  31. Jiang, T.; Zeng, Q.; Zhao, T.; Qin, B.; Liu, T.; Chawla, N.V.; Jiang, M. Biomedical Knowledge Graphs Construction from Conditional Statements. IEEE/ACM Trans. Comput. Biol. Bioinform. 2021, 18, 823–835. [Google Scholar] [CrossRef]
  32. Jiang, T.; Zhao, T.; Qin, B.; Liu, T.; Chawla, N.V.; Jiang, M. The role of “condition”: A novel scientific knowledge graph representation and construction model. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; Association for Computing Machinery: New York, NY, USA, 2019; pp. 1634–1642. [Google Scholar] [CrossRef]
  33. Hjelseth, E.; Nisbet, N. Capturing normative constraints by use of the semantic mark-up RASE methodology. In Proceedings of the 28th International Conference of CIB W78, Sophia Antipolis, France, 26–28 October 2011; pp. 26–28. Available online: https://itc.scix.net/pdfs/w78-2011-Paper-45.pdf (accessed on 24 February 2026).
  34. Hjelseth, E.; Nisbet, N. Exploring semantic based model checking. In Proceedings of the CIB W78 Conference, Cairo, Egypt, 16–19 November 2010; Volume 27, pp. 16–18. [Google Scholar]
  35. İlal, S.M.; Günaydın, H.M. Computer representation of building codes for automated compliance checking. Autom. Constr. 2017, 82, 43–58. [Google Scholar] [CrossRef]
  36. Beach, T.H.; Rezgui, Y.; Li, H.; Kasim, T. A rule-based semantic approach for automated regulatory compliance in the construction sector. Expert Syst. Appl. 2015, 42, 5219–5231. [Google Scholar] [CrossRef]
  37. Solihin, W.; Eastman, C. A Knowledge Representation Approach to Capturing BIM Based Rule Checking Requirements Using Conceptual Graph. In Proceedings of the CIB W78 Conference 2015, Cairo, Egypt, 16–19 November 2015; Volume 21, pp. 370–402. [Google Scholar]
  38. Häußler, M.; Esser, S.; Borrmann, A. Code compliance checking of railway designs by integrating BIM, BPMN and DMN. Autom. Constr. 2021, 121, 103427. [Google Scholar] [CrossRef]
  39. Sydora, C.; Stroulia, E. Rule-based compliance checking and generative design for building interiors using BIM. Autom. Constr. 2020, 120, 103368. [Google Scholar] [CrossRef]
  40. Xu, X.; Cai, H. Semantic approach to compliance checking of underground utilities. Autom. Constr. 2020, 109, 103006. [Google Scholar] [CrossRef]
  41. Yurchyshyna, A.; Zarli, A. An ontology-based approach for formalisation and semantic organisation of conformance requirements in construction. Autom. Constr. 2009, 18, 1084–1098. [Google Scholar] [CrossRef]
  42. Wei, Z.; Su, J.; Wang, Y.; Tian, Y.; Chang, Y. A Novel Cascade Binary Tagging Framework for Relational Triple Extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 1476–1488. [Google Scholar] [CrossRef]
  43. Wang, S. Graph neural network–driven text classification for fire-door defect inspection in pre-completion construction. Sci. Rep. 2025, 15, 44382. [Google Scholar] [CrossRef]
  44. Zhu, Y.; Luo, X. A Knowledge Graph for Automated Construction Workers’ Safety Violation Identification. In ISARC. Proceedings of the International Symposium on Automation and Robotics in Construction; IAARC Publications: Oulu, Finland, 2022; Volume 39, pp. 312–319. [Google Scholar]
  45. Ownthink. Available online: https://www.ownthink.com/docs/kg/ (accessed on 7 June 2023).
  46. Cui, Y.; Che, W.; Liu, T.; Qin, B.; Yang, Z. Pre-training with whole word masking for Chinese BERT. IEEE/ACM Trans. Audio Speech Lang. Process. 2021, 29, 3504–3514. [Google Scholar] [CrossRef]
  47. Cypher Query Language. Available online: https://neo4j.com/developer/cypher/ (accessed on 6 June 2023).
Figure 1. The overall structure of the integrated framework for a construction safety-related knowledge graph.
Figure 2. The overall structure of the knowledge graph.
Figure 3. The structure of the safety violation identification module. Note: The ellipses (…) denote the presence of additional, structurally identical rule nodes, triplets, or evaluation outcomes that are omitted for brevity and visual clarity. They do not imply hidden logical steps.
Figure 4. The process of the case study.
Figure 5. An example rule in Neo4j.
Figure 6. The identification result in each type of validation set: (a) Validation set of rules. (b) Validation set of questions. (c) Overall validation sets.
Figure 7. Comparison of test results in three different validation sets.
Table 1. Selected regulations (excerpt).

| Name | Code |
| --- | --- |
| Technical code for safety of working at height of building construction | JGJ 80-2016 |
| Standard for construction safety inspection | JGJ 59-2011 |
| Technical code for safety of building demolition engineering | JGJ 147-2016 |
| Technical specification for safety operation of constructional machinery | JGJ 33-2012 |
| Technical code for construction safety of deep building foundation excavations | JGJ 311-2013 |
| Unified code for technique for constructional safety | GB 50870-2013 |
| Code for safety of power supply and consumption for construction site | GB 50194-2014 |
| Code for construction of steel structures | GB 50755-2012 |
| Technical code for roof engineering | GB 50345-2012 |
| Technical code for slope roof engineering | GB 50693-2011 |
Table 2. Performance of the automated extraction model on selected tuple types.

| Functional Category | Tuple Type (Key Identifier) | Semantic Structure/Example | Precision (%) | Recall (%) | F1 (%) |
| --- | --- | --- | --- | --- | --- |
| Operate | Operate-Req_NOT-Object | Prohibition: "Should not wear [Object]" | 72.73 | 66.67 | 69.57 |
| | Person-Req_NOT-Operate | Prohibition: "Person must not operate" | 100.00 | 77.78 | 87.50 |
| | Operate-Req_AND-Object | Requirement: "Must wear [Object]" | 79.00 | 85.87 | 82.29 |
| Position | Position-Con_AND-Object | Condition: "When close to [Object]" | 75.00 | 40.00 | 52.17 |
| | Person-Req_NOT-Position | Restriction: "Person not allowed at [Loc]" | 92.31 | 85.71 | 88.89 |
| | Position-Req_NOT-Location | Restriction: "Cannot stay in [Location]" | 81.82 | 47.37 | 60.00 |
| | Position-Con_AND-Location | Condition: "Person at [Height/Place]" | 87.50 | 60.87 | 71.79 |
| | Position-Con_OR-Environment | Condition: "In [Weather] condition" | 96.30 | 96.30 | 96.30 |
| | Position-Req_NOT-Object | Restriction: "Keep away from [Object]" | 76.32 | 55.77 | 64.44 |
| Attribute | Object-Con_Belongto-Object | Relation: Ownership or Component | 50.00 | 10.00 | 16.67 |
| | Environment-Con_OR-Property | Attribute: Weather property (e.g., >level 6) | 100.00 | 76.92 | 86.96 |
| | Environment-Con_OR-Unit | Attribute: Measurement unit | 92.31 | 92.31 | 92.31 |
| | Environment-Con_OR-Value | Attribute: Numeric value | 92.31 | 92.31 | 92.31 |
| | Object-Con_AND-Attribute | Attribute: Specific object feature | 50.00 | 18.18 | 26.67 |
| Predicate | Predicate-Req_NOT-Work | Prohibition: "Stop [Work]" | 40.00 | 12.50 | 19.05 |
| | Predicate-Con_AND-Work | Condition: "While doing [Work]" | 63.64 | 36.84 | 46.67 |
| | Object-Con_AND-Work | Relation: Object involved in [Work] | 78.26 | 64.29 | 70.59 |
| | Predicate-Con_OR-Object | Condition: "Involves [Object]" | 65.52 | 65.52 | 65.52 |
| | Predicate-Con_AND-Object | Condition: "With [Object]" | 71.74 | 52.38 | 60.55 |
| | Predicate-Req_NOT-Object | Prohibition: "Do not use [Object]" | 72.48 | 54.48 | 62.20 |
| Overall | All types | — | 78.07 | 58.92 | 67.15 |
Table 3. Results of the validation test.

| Validation Set | Accuracy | Precision (Safe) | Precision (Unsafe) | Recall (Safe) | Recall (Unsafe) | F1 (Safe) | F1 (Unsafe) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Rule | 98.67% | 98.56% | 98.91% | 99.51% | 96.81% | 99.03% | 97.85% |
| Question | 96.00% | 96.46% | 95.40% | 96.46% | 95.40% | 96.46% | 95.40% |
| All | 97.60% | 97.82% | 97.21% | 98.43% | 96.13% | 98.13% | 96.67% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
