A Novel Method for Named Entity Recognition in Long-Text Safety Accident Reports of Prefabricated Construction

Luo, Qianmai; Zhang, Guozong; Sun, Yuan

doi:10.3390/buildings15173063


by Qianmai Luo ¹, Guozong Zhang ¹,* and Yuan Sun ²

¹ School of Urban Economics and Management, Beijing University of Civil Engineering and Architecture, Beijing 102616, China

² Value Engineering Society of China, Beijing 100191, China

* Author to whom correspondence should be addressed.

Buildings 2025, 15(17), 3063; https://doi.org/10.3390/buildings15173063

Submission received: 18 July 2025 / Revised: 19 August 2025 / Accepted: 23 August 2025 / Published: 27 August 2025

(This article belongs to the Special Issue Large-Scale AI Models Across the Construction Lifecycle)


Abstract

Prefabricated construction represents an advanced approach to sustainable development, and safety issues in prefabricated construction projects have drawn widespread attention. Safety accident case reports contain a wealth of safety knowledge, and extracting and learning from such historical reports can significantly enhance safety management capabilities. However, these texts are often semantically complex and lengthy, posing challenges for traditional Information Extraction (IE) methods. This study focuses on the challenge of Named Entity Recognition (NER) in long texts under complex engineering contexts and proposes a novel model that integrates Modern Bidirectional Encoder Representations from Transformers (ModernBERT), Bidirectional Long Short-Term Memory (BiLSTM), and Conditional Random Field (CRF). A comparative analysis with current mainstream methods is conducted. The results show that the proposed model achieves an F1 score of 0.6234, outperforming mainstream baseline methods. Notably, it attains F1 scores of 0.95 and 0.92 for the critical entity categories “Consequence” and “Type,” respectively. The model maintains stable performance even under semantic noise interference, demonstrating strong robustness in processing unstructured and highly heterogeneous engineering texts. Compared with existing long-text NER models, the proposed method exhibits superior semantic parsing ability in engineering contexts. This study enhances information extraction methods and provides solid technical support for constructing safety knowledge graphs in prefabricated construction, thereby advancing the level of intelligence in the construction industry.

Keywords: prefabricated construction; safety risk identification; named entity recognition; information extraction; pre-trained model

1. Introduction

Modular construction is a sustainable mode of production and has become a significant development direction in the construction industry [1,2,3]. Driven by technological advancements, the demand for sustainability, and the pursuit of higher efficiency, the global prefabricated construction market is rapidly expanding. In North America, the share of prefabricated buildings in new construction projects increased from 2.14% in 2015 to 6.64% in 2023 [4]. In China, this proportion rose from 2.7% in 2015 to 20.5% in 2020, with plans to further increase it to 40% by 2030 [3]. Despite its numerous advantages, safety issues at construction sites remain prominent in prefabricated construction. New construction processes, such as the unloading, hoisting, and installation of large components in prefabricated projects, have introduced new safety hazards [5]. According to statistics from one hundred prefabricated construction accident cases, falls from height account for 35%, crushing/collision incidents for 27%, and falling objects for 23% [6]. In addition, the frequency of accidents is relatively high. Data from the U.S. Bureau of Labor Statistics in 2018 show that in the construction of prefabricated wood and steel structures, the accident rates are 6.5 and 4.4 per 100 full-time workers, respectively. In contrast, the accident rate in traditional construction is only 2.7 per 100 full-time workers [7]. Ensuring construction safety in prefabricated buildings is a vital part of the United Nations Sustainable Development Goals (SDGs), including Good Health and Well-being (SDG 3), Sustainable Cities and Communities (SDG 11), and Responsible Consumption and Production (SDG 12) [8]. Therefore, emphasizing safety in modular construction is both a critical and urgent imperative.

Safety accident investigation reports provide detailed records of the causes and specific contexts of accidents, offering abundant case information for safety management. Effectively utilizing such textual information is of great significance for enhancing safety management [9]. With the ongoing digital transformation of the construction industry, a large amount of data related to prefabricated construction safety—such as investigation reports and construction logs—has been accumulated. However, these data typically exist in unstructured or semi-structured forms [10] and have not been fully integrated or leveraged, resulting in fragmented risk information that is difficult to effectively support safety risk management. Against this backdrop, Natural Language Processing (NLP) technologies have gradually been introduced into construction safety management with the goal of mining valuable safety information from massive amounts of unstructured or semi-structured accident texts [11,12]. Among these, Named Entity Recognition (NER) is a critical component of NLP that automatically identifies core safety entities such as accident types, causal factors, and construction activities from texts [13]. Meanwhile, NER serves as a bridge in many downstream applications, including question–answering systems, knowledge graph construction, and chatbots. In recent years, deep learning-based NER methods have achieved significant progress, especially pre-trained language models represented by Bidirectional Encoder Representations from Transformers (BERT), which have greatly improved recognition accuracy through their context modeling capabilities. Jeon et al. (2022) [14] proposed the KoBERT architecture, which is capable of recognizing complex entities in noisy texts with an F1 score of 91%, demonstrating its effectiveness in complex semantic environments. 
However, the standard BERT architecture is limited by an input length restriction of 512 tokens, making it difficult to directly process the long texts commonly found in accident reports [15]. Currently, most NER work is conducted on short texts, which severely limits the effective utilization of safety information. Although segmenting long texts can address this issue, it causes the loss of contextual information and may disrupt entity integrity [16]. To overcome these challenges, this study introduces Modern Bidirectional Encoder Representations from Transformers (ModernBERT), a pre-trained language model optimized for long texts, and integrates it with a Bidirectional Long Short-Term Memory (BiLSTM)–Conditional Random Field (CRF) structure. This model framework can handle context lengths of up to 8192 tokens, significantly extending the application scope of information extraction.

The remainder of this paper is organized as follows. Section 2 reviews the related literature; Section 3 introduces the integrated model framework and its methodology; Section 4 presents the experimental results; Section 5 analyzes the model’s stability and discusses the theoretical and practical implications of this study; and Section 6 concludes the research.

2. Literature Review

The construction industry is a high-risk sector [17]. The number of safety accidents in construction has consistently remained at a high level, resulting in significant casualties, economic losses, and disruptions to productivity [18,19]. Because accidents at construction sites are frequent and severe, safety information records and reports are crucial for accident prevention and decision-making. However, most of this information exists in unstructured textual form and requires NLP techniques for effective extraction and utilization [20]. NLP can transform unstructured documents in the construction domain into structured information, facilitating efficient management by safety personnel. Among NLP tasks, text classification, topic clustering, and NER are the most representative and widely applied branches in construction safety information extraction (IE) [21]. However, the first two lack sufficient depth in semantic understanding and cannot accurately pinpoint specific entities within the text. The core of NER lies in precisely identifying safety entities with specific meanings (such as accident types, causal factors, construction activities, personnel roles, and safety measures) by analyzing semantic information in the text, thus providing foundational semantic support for subsequent knowledge applications [22].

NER technology has evolved through several stages, from rule- and dictionary-based methods to machine learning, deep learning, and finally pre-trained language models [23]. In the early stage, rule- and dictionary-based approaches were primarily used, relying on regular expressions and manually crafted rules to match specific entities [24]. Although effective within specific domains, these methods suffer from limited generalizability. Subsequently, machine learning techniques such as Hidden Markov Models (HMMs), Maximum Entropy Markov Models (MEMMs), Support Vector Machines (SVMs), and CRF were widely applied [25]. However, these methods depend heavily on manual feature engineering and struggle to capture long-distance dependencies. In the initial phase of deep learning, models such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and BiLSTM automatically extracted features, significantly improving recognition performance [26]. In recent years, pre-trained language models have rapidly advanced NER tasks, especially with the introduction of models like BERT, which further enhance entity recognition accuracy. Related studies are summarized in Table 1.

Currently, BERT and its integrated models still face significant limitations when processing long texts, primarily constrained by a maximum sequence length of 512 tokens, which hinders their ability to effectively handle long documents directly. To address the challenge of long-text processing, researchers have proposed various technical solutions. Chunking is the most common method [16], where long texts are segmented into fixed-length parts and processed stepwise using a sliding window; however, this approach leads to loss of contextual information. Longformer [35] and BigBird [36] are improved Transformer-based models designed to overcome computational bottlenecks in processing long texts. Longformer employs a combination of local sliding window attention and global attention mechanisms to achieve linear computational complexity. Building upon Longformer, BigBird introduces a random sparse attention mechanism. Both models extend text processing length to 4096 tokens and have been successfully applied in NER in medical texts [37] and long-text entity recognition in the construction domain [38]. Although Longformer and BigBird offer innovative solutions for long-text processing, their respective design characteristics bring certain drawbacks, such as sensitivity to hyperparameters, instability, and computational resource constraints, even though these can be mitigated through distributed matrix multiplication [39]. In contrast, ModernBERT demonstrates outstanding performance across various tasks due to its efficient processing capabilities and flexible architecture [40].

Although significant progress has been made in NER, there remain many theoretical and practical deficiencies in safety information extraction within the construction field, especially in the domain of prefabricated construction. First, with the advancement of construction industrialization and the gradual implementation of digital transformation, safety incidents in prefabricated construction have become increasingly complex and diverse. As a result, traditional qualitative analysis methods can no longer meet the practical demands for efficiently handling such situations. Moreover, in the prefabricated construction sector, efforts to mine and apply unstructured or semi-structured textual data remain insufficient. Second, mainstream deep learning models are primarily trained on short texts; therefore, their performance is significantly limited when processing longer texts. This limitation severely restricts the extraction and utilization of safety information.

3. Methodology

This study employs the ModernBERT-BiLSTM-CRF model for training and recognizing entities. The overall model framework is illustrated in Figure 1. The input text sequence undergoes preprocessing and is transformed into word embedding vectors as model input. The input sequence is first encoded by the ModernBERT model. After processing through the ModernBERT layer, each word embedding integrates contextual information, and the resulting semantic representations are passed to the subsequent BiLSTM layer. The BiLSTM captures both forward and backward sequence information, enabling an understanding of each word’s meaning within its context. Finally, a CRF decodes the feature sequence output by the BiLSTM to generate the final label sequence.

3.1. ModernBERT

ModernBERT [40] integrates a range of cutting-edge optimization techniques and introduces systematic improvements over the traditional BERT model in terms of architecture and computational efficiency. The model architecture is illustrated in Figure 2.

ModernBERT replaces absolute positional embeddings with rotary positional embeddings (RoPE), removes unnecessary bias terms to simplify the architecture, and adds a normalization layer after the embedding layer. For normalization, it adopts pre-normalization blocks in combination with standard layer normalization. The commonly used GeLU activation function is replaced with GeGLU. A key technical innovation of ModernBERT is the Alternate Attention Mechanism (AAM), which fundamentally differs from traditional global attention schemes. This mechanism applies full global attention only once every three layers, while the remaining layers adopt a sliding window strategy in which each token attends only to its 128 nearest neighbors (local attention). Because the computational cost of attention grows sharply with the number of tokens, this design enables ModernBERT to handle long-sequence inputs more efficiently than existing models. The attention computation is as follows:

\[ \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_{k}}}\right) V \tag{1} \]

\[ \mathrm{AAM}(Q, K, V) = \mathrm{GlobalAndLocalAttention}(Q, K, V) \tag{2} \]

where $Q$ denotes the query matrix, $K$ the key matrix, and $V$ the value matrix; $Q K^{T}$ represents the dot product of $Q$ and $K$; and $d_{k}$ is the dimensionality of the key vectors. Softmax denotes the normalization of similarity scores to obtain attention weights.
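The alternation between global and local attention described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not ModernBERT's implementation: the function names are hypothetical, global layers are assumed to be those with index 0, 3, 6, and so on, and the 128-token window is assumed to be split evenly to the left and right of each token.

```python
import math

def is_global_layer(layer_idx, global_every=3):
    # Assumption: layers 0, 3, 6, ... use full global attention.
    return layer_idx % global_every == 0

def attention_mask(seq_len, layer_idx, window=128):
    # mask[i][j] is True when token i is allowed to attend to token j.
    if is_global_layer(layer_idx):
        return [[True] * seq_len for _ in range(seq_len)]
    half = window // 2  # assumed: window centred on the token
    return [[abs(i - j) <= half for j in range(seq_len)] for i in range(seq_len)]

def masked_attention(Q, K, V, mask):
    # Scaled dot-product attention (Equation (1)) restricted by a mask.
    d_k = len(K[0])
    output = []
    for i, q in enumerate(Q):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) if mask[i][j] else float("-inf")
            for j, k in enumerate(K)
        ]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        output.append([sum(w * v[c] for w, v in zip(weights, V)) for c in range(len(V[0]))])
    return output
```

With a local mask, the number of allowed pairs grows linearly with the sequence length, whereas a global mask grows quadratically, which is the source of the efficiency gain.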

ModernBERT significantly improves computational efficiency through its innovative unpadding and sequence packing techniques. Traditional approaches rely on padding tokens to standardize sequence lengths, resulting in considerable redundant computation. The unpadding technique in ModernBERT completely removes padding tokens and restructures the valid content into smaller batches. Combined with Flash Attention, this optimization eliminates the overhead of repeated padding and unpadding operations in conventional methods, leading to a performance gain of 10–20%. Furthermore, ModernBERT adopts a sequence packing strategy, using a greedy algorithm to dynamically combine multiple short sequences into compact batches that approach the model’s maximum input length, thereby maximizing GPU parallelism. This approach not only reduces padding waste but also enhances training throughput. Along with Flash Attention’s optimization of RoPE, ModernBERT requires only a single unpadding operation, with no need for subsequent reconstruction, making both pretraining and inference more efficient. These techniques collectively optimize resource utilization and significantly accelerate the model while maintaining its performance.
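The greedy sequence packing idea can be illustrated with a first-fit-decreasing heuristic. The source does not specify the exact algorithm ModernBERT uses, so this is only a sketch under that assumption; `pack_sequences` and its parameters are hypothetical names.

```python
def pack_sequences(lengths, max_len):
    """Greedily group sequence indices so each group's total length fits in max_len."""
    bins = []  # each bin holds the indices of the sequences packed together
    free = []  # remaining capacity of each bin
    # Place the longest sequences first (first-fit-decreasing heuristic).
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    for idx in order:
        n = lengths[idx]
        if n > max_len:
            raise ValueError(f"sequence {idx} is longer than max_len")
        for b in range(len(bins)):
            if free[b] >= n:
                bins[b].append(idx)
                free[b] -= n
                break
        else:
            # No existing bin has room, so open a new one.
            bins.append([idx])
            free.append(max_len - n)
    return bins
```

For example, packing sequences of lengths 5000, 3000, 4000, 1000, and 192 into a capacity of 8192 tokens needs two batches instead of five padded ones.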

3.2. BiLSTM

BiLSTM belongs to a class of RNN architectures capable of capturing both forward and backward dependencies within a text sequence [41]. In NER tasks, determining the boundaries and categories of entities often relies on contextual information from both preceding and succeeding words. Unidirectional models struggle to capture such symmetrical semantic relationships. In contrast, BiLSTM can more accurately represent the semantic state of words within their context. It achieves this by incorporating two independent LSTM units, one processing the input sequence from left to right and the other from right to left, thereby capturing the full contextual semantics. The model architecture is illustrated in Figure 3.

In this study, BiLSTM is employed to further process the word vectors output by ModernBERT. The computational process of the BiLSTM model is described as follows:

\[ i_{t} = \sigma(W_{i} [x_{t}; h_{t-1}] + b_{i}) \tag{3} \]

\[ f_{t} = \sigma(W_{f} [x_{t}; h_{t-1}] + b_{f}) \tag{4} \]

\[ o_{t} = \sigma(W_{o} [x_{t}; h_{t-1}] + b_{o}) \tag{5} \]

\[ \tilde{c}_{t} = \tanh(W_{c} [x_{t}; h_{t-1}] + b_{c}) \tag{6} \]

\[ c_{t} = f_{t} \cdot c_{t-1} + i_{t} \cdot \tilde{c}_{t} \tag{7} \]

\[ h_{t} = o_{t} \cdot \tanh(c_{t}) \tag{8} \]

where $\sigma$ denotes the sigmoid activation function; $\cdot$ represents element-wise multiplication; $x_{t}$ is the input feature vector at time step $t$; $h_{t}$ is the current hidden state vector and $c_{t}$ the current cell state; $[x_{t}; h_{t-1}]$ is the concatenation of the input and the previous hidden state; $W_{i}$, $W_{f}$, $W_{o}$, and $W_{c}$ are the weight matrices for the input gate, forget gate, output gate, and cell candidate, respectively; $b_{i}$, $b_{f}$, $b_{o}$, and $b_{c}$ are the corresponding bias vectors; and $\tanh$ denotes the hyperbolic tangent function.

The final output is the concatenation of the forward and backward hidden states:

\[ \tilde{h}_{t} = [\overrightarrow{h}_{t}; \overleftarrow{h}_{t}] \tag{9} \]
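The gate equations and the final concatenation can be traced step by step with a small pure-Python sketch. It is illustrative only: the function names and the `params` dictionary layout are assumptions, the initial hidden and cell states are assumed to be zero, and a practical model would use an optimized library layer instead.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, b, v):
    # Computes W v + b for a list-of-rows matrix W.
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    z = x_t + h_prev  # list concatenation, i.e. [x_t; h_{t-1}]
    i = [sigmoid(a) for a in affine(params["W_i"], params["b_i"], z)]          # Eq. (3)
    f = [sigmoid(a) for a in affine(params["W_f"], params["b_f"], z)]          # Eq. (4)
    o = [sigmoid(a) for a in affine(params["W_o"], params["b_o"], z)]          # Eq. (5)
    c_tilde = [math.tanh(a) for a in affine(params["W_c"], params["b_c"], z)]  # Eq. (6)
    # Element-wise products, Eq. (7) and Eq. (8).
    c = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c

def run_lstm(xs, params, hidden):
    h, c = [0.0] * hidden, [0.0] * hidden  # assumed zero initial state
    outputs = []
    for x in xs:
        h, c = lstm_step(x, h, c, params)
        outputs.append(h)
    return outputs

def bilstm(xs, fwd_params, bwd_params, hidden):
    forward = run_lstm(xs, fwd_params, hidden)
    # Run the second LSTM right to left, then restore the original order.
    backward = run_lstm(xs[::-1], bwd_params, hidden)[::-1]
    return [hf + hb for hf, hb in zip(forward, backward)]  # Eq. (9)
```

Each output vector has twice the hidden size, because the forward and backward states are concatenated rather than summed.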

3.3. CRF

In NER tasks, BiLSTM excels at capturing long-distance textual dependencies but cannot effectively model the dependencies between adjacent labels [42]. CRFs are discriminative probabilistic graphical models [43] whose core advantage lies in considering the transition scores between labels. By decoding via a globally optimal path, the CRF layer helps prevent the model from producing invalid label sequences. The model architecture is illustrated in Figure 4.

The output $\tilde{h}_{t}$ of the BiLSTM is projected through a hidden layer into an emission score matrix $P$, whose dimensions match the number of labels, capturing local semantic alignment between “words and labels.” The CRF layer introduces a transition score matrix $A$, which models dependencies between labels. For example, the entity label “B-Safety Harness” is more likely to be followed by “I-Safety Harness” than by “B-Helmet.” The higher the score, the more semantically reasonable the transition. By combining emission and transition scores, the CRF computes the total score of a label sequence $y$ as follows:

\[ S(X, y) = \sum_{i=0}^{T} A_{y_{i}, y_{i+1}} + \sum_{i=1}^{T} P_{i, y_{i}} \tag{10} \]

where $X$ is the input sequence, $y$ is the label sequence, $T$ is the sequence length, $A$ is the transition score matrix, and $P$ is the emission score matrix; $y_{0}$ and $y_{T+1}$ denote the start and end symbols.

To convert the score into a probability and guide model training, the CRF normalizes over all possible label sequences $y'$ using the softmax function, yielding the conditional probability:

\[ p(y \mid X) = \frac{e^{S(X, y)}}{\sum_{y' \in \gamma} e^{S(X, y')}} \tag{11} \]

where $\gamma$ denotes the set of all possible label sequences.

During training, the model parameters are optimized by maximizing the log-likelihood of the correct label sequence:

\[ \ln p(y \mid X) = S(X, y) - \ln \sum_{y' \in \gamma} e^{S(X, y')} \tag{12} \]
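The sequence score and its log-likelihood can be checked numerically on a toy example. The sketch below is an assumption-laden illustration: the start and end transitions are stored in separate `start` and `end` vectors (one common way to represent the first and last terms of the transition sum), the log-partition term is computed with the forward algorithm, and all function names are hypothetical.

```python
import math
from itertools import product

def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def sequence_score(P, A, start, end, y):
    # Equation (10): transition scores (including start/end) plus emission scores.
    score = start[y[0]] + end[y[-1]]
    score += sum(A[a][b] for a, b in zip(y, y[1:]))
    score += sum(P[t][label] for t, label in enumerate(y))
    return score

def log_partition(P, A, start, end):
    # Forward algorithm: log of the denominator in Equation (11).
    n = len(start)
    alpha = [start[k] + P[0][k] for k in range(n)]
    for t in range(1, len(P)):
        alpha = [logsumexp([alpha[j] + A[j][k] for j in range(n)]) + P[t][k] for k in range(n)]
    return logsumexp([alpha[k] + end[k] for k in range(n)])

def brute_force_log_partition(P, A, start, end):
    # Enumerates every label sequence; only feasible for tiny examples.
    n = len(start)
    return logsumexp([sequence_score(P, A, start, end, list(y))
                      for y in product(range(n), repeat=len(P))])

def log_likelihood(P, A, start, end, y):
    # Equation (12).
    return sequence_score(P, A, start, end, y) - log_partition(P, A, start, end)
```

The forward algorithm costs time proportional to the sequence length times the squared number of labels, while brute-force enumeration grows exponentially with the sequence length; the two agree on small inputs.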

During inference, the model selects the label sequence with the highest total score among all possible sequences. This problem is solved via Viterbi decoding, which consists of three steps: (i) Initialization: Compute initial scores for each label at the first position, incorporating the start symbol. (ii) Recursion: For each subsequent position and each possible label, compute accumulated scores by considering all previous labels and record the optimal predecessor. (iii) Termination and Backtracking: Determine the final label using the end symbol, then backtrack using the recorded predecessors to recover the optimal sequence $y^{*}$:

\[ y^{*} = \underset{y' \in \gamma}{\arg\max}\; S(X, y') \tag{13} \]
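The three Viterbi steps can be sketched as follows. As before, this is a minimal illustration rather than the authors' code: `start` and `end` are assumed vectors holding the scores of the start and end transitions, and the brute-force helper exists only to cross-check the result on small inputs.

```python
from itertools import product

def sequence_score(P, A, start, end, y):
    score = start[y[0]] + end[y[-1]]
    score += sum(A[a][b] for a, b in zip(y, y[1:]))
    score += sum(P[t][label] for t, label in enumerate(y))
    return score

def viterbi(P, A, start, end):
    n = len(start)
    # (i) Initialization: start transition plus first emission.
    score = [start[k] + P[0][k] for k in range(n)]
    backpointers = []
    # (ii) Recursion: keep the best predecessor for every label at every position.
    for t in range(1, len(P)):
        new_score, pointers = [], []
        for k in range(n):
            best_j = max(range(n), key=lambda j: score[j] + A[j][k])
            new_score.append(score[best_j] + A[best_j][k] + P[t][k])
            pointers.append(best_j)
        score = new_score
        backpointers.append(pointers)
    # (iii) Termination and backtracking.
    last = max(range(n), key=lambda k: score[k] + end[k])
    best_score = score[last] + end[last]
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best_score

def brute_force_best_score(P, A, start, end):
    # Exhaustive search over all label sequences, for verification only.
    n = len(start)
    return max(sequence_score(P, A, start, end, list(y))
               for y in product(range(n), repeat=len(P)))
```

A large negative entry in `A`, for instance for the transition from "O" to an "I-" label, is how invalid BIO sequences are effectively excluded from the decoded path.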

3.4. Evaluation Metrics for NER

Precision (P), Recall (R), and F1-score are used as evaluation metrics to measure the model’s performance. Their calculation formulas are as follows:

\[ \mathrm{Precision} = \frac{TP}{TP + FP} \tag{14} \]

\[ \mathrm{Recall} = \frac{TP}{TP + FN} \tag{15} \]

\[ F1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{16} \]

where True Positive (TP) refers to the number of correctly identified entities; False Positive (FP) refers to the number of non-entities incorrectly identified as entities; False Negative (FN) refers to the number of true entities that were not identified.
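A short sketch of how these metrics could be computed from BIO tag sequences is given below. It assumes exact-match, entity-level scoring (a prediction counts as a true positive only when both its boundaries and its type match a gold entity); the source does not state whether scoring is exact-match or token-level, and the function names are hypothetical.

```python
def extract_entities(tags):
    """Collect (start, end, type) spans from a BIO tag sequence; end is exclusive."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans

def precision_recall_f1(gold_tags, pred_tags):
    gold, pred = extract_entities(gold_tags), extract_entities(pred_tags)
    tp = len(gold & pred)   # predicted entities that exactly match a gold entity
    fp = len(pred - gold)   # predicted entities with no exact gold match
    fn = len(gold - pred)   # gold entities that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Under this scheme, a prediction with the right type but a boundary shifted by one character counts as both a false positive and a false negative, which is one reason long, loosely bounded entities tend to score lower.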

4. Experiments and Results

4.1. Data Collection and Processing

This study collected 1235 construction accident investigation reports related to prefabricated steel structure buildings. Each report was carefully reviewed, resulting in the removal of 597 reports unrelated to prefabricated steel structure buildings and 109 reports with incomplete content (requiring the report to include a complete overview of the accident, accident process, causes, and prevention measures). Ultimately, 529 valid prefabricated steel structure building accident investigation reports were obtained.

Text preprocessing of accident investigation reports is a crucial step in building a high-quality corpus, as illustrated in Figure 5. First, the text format is standardized. Next, the raw texts are cleaned by removing irrelevant information, redundant spaces, and other noise. Then, stop words that do not contribute to semantic analysis are deleted. Subsequently, data quality checks are conducted to ensure the texts are complete, consistent, and accurate. Finally, the corpus is constructed by saving the processed and normalized texts, facilitating subsequent tasks such as information extraction and model training. Statistical analysis of the collected texts shows that the longest document contains 6450 characters, and the average length is 1517 characters—both exceeding the text length typically handled by traditional pre-trained models.

4.2. Entity Type Definition and Annotation

Based on the content of accident investigation reports and literature review, a set of core terms covering the entire accident process was compiled, comprising six categories of entities: accident time, construction activity, accident cause, accident type, accident consequence, and preventive measure, as shown in Table 2.

After defining the named entity types, the training data annotation process began. This study utilized the open-source tool Yedda [50] and applied the BIO sequence labeling method for entity annotation. The BIO method assigns a label to each character to indicate its position within a named entity: “B” denotes the beginning of an entity, “I” denotes characters inside the entity, and “O” denotes characters outside any entity. This approach clearly marks the boundaries and types of entities within the text, as illustrated in Figure 6. To ensure the reliability of the dataset, this study implemented rigorous quality control measures during the annotation process. All annotations were independently carried out by three PhD students with backgrounds in construction safety management and NLP. In the initial phase, we conducted discussions and performed pilot annotations to standardize the annotation guidelines. Subsequently, a subset of samples was randomly selected for cross-checking to ensure consistency and accuracy. Following a detailed annotation process, a total of 12,772 entities were labeled, with the distribution of annotated entities shown in Figure 7.
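The character-level BIO scheme can be illustrated with a small helper that converts annotated character spans into tags. The helper name, the span format (start offset, exclusive end offset, label), and the English example sentence are assumptions for illustration; they do not reproduce the Yedda export format.

```python
def bio_tags(text, spans):
    """Assign one BIO tag per character from (start, end, label) spans; end is exclusive."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining characters of the entity
    return tags

# Hypothetical example: the accident type "fell" in a short sentence.
example = bio_tags("worker fell", [(7, 11, "Type")])
```

Here the first seven characters are tagged "O", the "f" is tagged "B-Type", and the remaining three characters are tagged "I-Type".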

4.3. Experimental Setup

In this study, all experiments were conducted on a server equipped with an NVIDIA GeForce RTX 4090 GPU with 24 GB of memory, running the Linux operating system. The code was developed using PyCharm Community Edition 2022, with Python 3.9 as the programming language and PyTorch 2.5.1 as the deep learning framework. During model training, the dataset was split into a training set and a test set at a ratio of 8:2, and five-fold cross-validation was employed to calculate the final results. In the training configuration, the batch size was set to 4, the learning rate to 1 × 10⁻⁴, and the optimizer used was AdamW. The model was trained for 100 epochs, using the cross-entropy loss function.
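One way to reconcile the 8:2 split with five-fold cross-validation is to let each of the five folds serve once as the 20% test set. The sketch below shows that reading; it is an assumption about the protocol, not a description of the authors' code, and `k_fold_indices` and the seed are hypothetical.

```python
import random

def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)         # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]    # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test

# With the 529 valid reports, each test fold holds 105 or 106 documents.
splits = list(k_fold_indices(529, k=5, seed=0))
```

Splitting at the report level, rather than at the sentence or chunk level, keeps text from the same accident out of both the training and test sets.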

4.4. NER Results

(1) NER Results on Short and Long Texts and Model Comparison

Comparative experiments were conducted on both short-text and long-text corpora to evaluate the performance of different pre-trained models in NER for safety accident reports. The evaluation metrics used were Precision, Recall, and F1-score. The results are presented in Table 3 and Table 4.

Table 3 presents the NER performance of different pre-trained models in short-text scenarios. These short texts were obtained by segmenting the long texts, with lengths kept within 512 characters. The experimental results show that the BERT-BiLSTM-CRF model outperforms other models in terms of Precision, Recall, and F1-score, achieving values of 0.7012, 0.6825, and 0.6916, respectively. This indicates that combining BiLSTM and CRF mechanisms with the BERT model enables more effective capturing of entity boundaries and types in short-text NER tasks, thereby improving recognition accuracy. In comparison, the pure BERT model, the ModernBERT-BiLSTM model, and the ModernBERT-BiLSTM-CRF model exhibit slightly lower performance than the BERT-BiLSTM-CRF model.

Table 4 lists the NER results of various models on long-text tasks. The experiments demonstrate that the ModernBERT-BiLSTM-CRF model achieves the best performance in long-text scenarios. This result fully illustrates that ModernBERT, owing to its improved architectural design, can better capture long-distance dependencies, thereby enhancing entity recognition performance in long texts. In contrast, the BERT-BiLSTM-CRF model, the ModernBERT-BiLSTM model, and the pure BERT model all perform worse than the ModernBERT-BiLSTM-CRF model on long texts. This is primarily due to the contextual window limitations inherent in BERT when processing long sequences.

The ablation results in Table 3 and Table 4 validate the effectiveness of the CRF layer: incorporating it significantly improves model performance in both short-text and long-text scenarios. This confirms the utility of structured label transition constraints in NER tasks. Within the 512-token limit, BERT’s full attention mechanism captures the dependencies among all tokens, including the close correlations between local and global contexts; that is, in short texts it models all token relationships without omission. In contrast, ModernBERT adopts a sparse attention mechanism designed for long texts, which may overlook important information. When the text length exceeds 512 characters, the advantage of this long-range sparse attention mechanism becomes prominent, as it can focus on positional and contextual information in long texts, thereby enhancing the contextual modeling capability. Therefore, for short texts, BERT’s global attention mechanism combined with absolute positional encoding achieves dense interactions within the 512-token limit, making it suitable for local semantic modeling. ModernBERT, which employs strategies such as sparsification and chunking to overcome length limitations while retaining global context, is better suited for tasks involving long-distance dependencies.

To verify the statistical significance of the performance difference between the ModernBERT-BiLSTM-CRF and BERT-BiLSTM-CRF models in long-text NER tasks, a paired-sample t-test was conducted on their F1 scores using the same test set. The results show that ModernBERT-BiLSTM-CRF consistently outperformed the BERT-BiLSTM-CRF model across all five experimental runs, with an average F1 score improvement of 3.6%. The paired t-test revealed a highly significant difference (t = 4.5200, p = 0.0054 < 0.05), and the effect size (Cohen’s d = 2.02) indicates a substantial practical difference. These quantitative findings strongly support the effectiveness of the ModernBERT architecture improvement and demonstrate its statistically significant performance advantage in long-text processing tasks.
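The paired-sample test statistic and effect size can be reproduced with the standard library. The per-run F1 values below are hypothetical placeholders, since the individual run scores are not reported; Cohen's d is assumed to be the paired-samples form (mean difference divided by the standard deviation of the differences).

```python
import math
from statistics import mean, stdev

def paired_t_and_cohens_d(scores_a, scores_b):
    """Return (t statistic, Cohen's d, degrees of freedom) for paired samples."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    sd = stdev(diffs)                       # sample standard deviation of differences
    t = mean(diffs) / (sd / math.sqrt(n))   # paired t statistic
    d = mean(diffs) / sd                    # paired-samples Cohen's d
    return t, d, n - 1

# Hypothetical F1 scores over five runs (not the paper's data).
modernbert_f1 = [0.63, 0.62, 0.64, 0.61, 0.62]
bert_f1 = [0.60, 0.58, 0.59, 0.59, 0.58]
t_stat, cohens_d, dof = paired_t_and_cohens_d(modernbert_f1, bert_f1)
```

Under this definition the two quantities are linked by d = t / √n, and the reported values are consistent with it for five runs: 4.52 / √5 ≈ 2.02.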

(2) NER Results by Entity Category

In long texts, this study investigates the model’s recognition capability and differences across various entity categories within prefabricated construction safety accident reports. Using the experimental model ModernBERT-BiLSTM-CRF, a quantitative comparative evaluation was conducted for six entity categories, with the results presented in Table 5.

The results show marked differences in recognition performance across entity categories. The model performs best on the “Consequence” and “Type” entities, with F1 scores of 0.95 and 0.92, respectively. It identifies accident consequences such as casualties and economic losses, and accident types such as falls from height and object strikes, accurately and consistently. This is attributed to the relatively standardized expression, clear semantic boundaries, short length, and uniformity of these entities in the corpus. The “Time” entity achieves an F1 score of 0.78, with high precision (0.98) but relatively low recall (0.65): most predictions are correct, but some entities are missed. This arises because time expressions vary widely, and the model struggles with ambiguous boundaries or semantically unclear temporal segments.

In contrast, recognition of the “Cause,” “Measure,” and “Activity” entities is relatively poor, with F1 scores of 0.43, 0.41, and 0.34, respectively, indicating limited capability on these semantically complex entities. The “Activity” entity is the weakest, mainly because it refers to specific construction tasks that are expressed in diverse and inconsistent ways. For example, “steel column hoisting,” “component installation,” and “lifting operations” can all refer to similar construction activities. In some cases, even generic expressions like “operation” or “during the operation” are used, making it difficult for the model to establish stable recognition patterns. Additionally, some construction activities do not appear as explicit verb structures or proper nouns but are instead embedded within complex semantic structures. The “Cause” entity is also difficult to identify because accident causes are diverse, and without a large sample the model lacks sufficient contextual understanding when extracting direct or indirect causes. Finally, the “Measure” entity consists of recommendations and improvement-related expressions that are long and flexibly worded, making it challenging for the model to capture boundaries and core semantics accurately.
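The relationship between the precision, recall, and F1 values in Table 5 can be checked with a short calculation. The sketch below assumes the standard harmonic-mean definition of F1; the function and variable names are illustrative and not taken from the authors' code. Because the table reports precision and recall rounded to two decimals, the recomputed F1 can differ from the reported value by up to about 0.02 (for example, for “Cause” and “Measure”).

```python
# Precision, recall, and reported F1 per entity category, copied from Table 5.
TABLE_5 = {
    "Time": (0.98, 0.65, 0.78),
    "Activity": (0.49, 0.26, 0.34),
    "Cause": (0.58, 0.33, 0.43),
    "Type": (0.92, 0.92, 0.92),
    "Consequence": (0.93, 0.97, 0.95),
    "Measure": (0.68, 0.31, 0.41),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def recomputed_f1():
    """Recompute F1 for every category from the rounded precision and recall."""
    return {name: f1_score(p, r) for name, (p, r, _) in TABLE_5.items()}


if __name__ == "__main__":
    for name, (p, r, reported) in TABLE_5.items():
        print(f"{name:<12} P={p:.2f} R={r:.2f} "
              f"F1 recomputed={f1_score(p, r):.3f} reported={reported:.2f}")
```

The calculation makes the pattern in the text concrete: for “Time,” “Cause,” “Measure,” and “Activity,” recall is well below precision, so the F1 score is pulled down mainly by missed entities rather than by wrong predictions.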

5. Discussion

5.1. Model Performance Analysis

(1) Noise Robustness Experiment

To further verify the robustness of the model in complex real-world application scenarios, this study conducted interference experiments by introducing semantic noise into the long-text corpus, including word order disruption, insertion of irrelevant words, and random character deletions [14]. The noise perturbation ratio was set to 5% at the character level. The model’s NER performance was then compared before and after the introduction of noise. Without noise, the model’s Precision, Recall, and F1 scores were 0.6855, 0.5828, and 0.6234, respectively. After introducing noise, these metrics changed to 0.6691, 0.5857, and 0.6186, respectively. Precision decreased by 0.0164, indicating that the model misidentified somewhat more entities in noisy text. Recall increased slightly, by 0.0029, possibly because the noisy text introduced certain cues related to entity boundaries. The overall F1 score declined by 0.0048, since the small gain in recall was insufficient to offset the loss in precision. These results indicate that the model has a certain degree of resistance to semantic noise in long-text NER tasks, as its performance did not fluctuate drastically.
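The three perturbation types named above can be illustrated with a minimal sketch. It is not the authors' procedure: the function `add_noise`, the filler list `FILLERS`, the uniform choice between edit types, and the use of adjacent-character swaps as a stand-in for word order disruption are all assumptions made for illustration. The sketch works on characters, so it applies equally to Chinese text, and it ignores entity annotations, whose offsets a real experiment would need to realign after each edit.

```python
import random

# Hypothetical filler tokens; the paper does not list the irrelevant words it inserted.
FILLERS = ["and", "the", "item"]


def add_noise(text, ratio=0.05, seed=0):
    """Return a copy of `text` with about `ratio * len(text)` random edits.

    Each edit is one of: swapping two adjacent characters (order disruption),
    inserting a filler token (irrelevant-word insertion), or deleting one
    element (random deletion).
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    rng = random.Random(seed)          # seeded so the perturbation is reproducible
    chars = list(text)
    n_edits = int(len(chars) * ratio)  # edit budget at the character level
    for _ in range(n_edits):
        if len(chars) < 2:
            break
        op = rng.choice(("swap", "insert", "delete"))
        i = rng.randrange(len(chars) - 1)
        if op == "swap":
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
        elif op == "insert":
            chars.insert(i, rng.choice(FILLERS))
        else:
            del chars[i]
    return "".join(chars)


if __name__ == "__main__":
    report = ("During component hoisting, a precast wall panel fell and struck "
              "a worker, causing one death and direct economic losses.")
    print(add_noise(report, ratio=0.05, seed=42))
```

Fixing the seed keeps the noisy corpus identical across runs, so that any change in Precision, Recall, or F1 can be attributed to the noise rather than to a different random draw.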

(2) Comparative Analysis of Existing Long-Text Processing Models

In recent years, numerous improved pre-trained models have been proposed for long-text NER tasks. Among them, models such as Longformer and BigBird have received widespread attention, extending the text processing length to 4096 characters. These models demonstrate high recognition accuracy in domains where entity boundaries are clear and annotation standards are strict, particularly in well-structured corpora like medical texts. To evaluate the practical performance of various models across different domains, this paper summarizes and analyzes current research achievements in long-text NER, as shown in Table 6.

In the medical domain, studies by Li et al. (2023) [37] and Schäfer et al. (2023) [51] show that Longformer and BigBird models perform well in extracting medical concepts and drug names, with F1 scores of roughly 0.89 to 0.94 (Table 6), mainly benefiting from clear domain terminology and standardized data. In contrast, Gao et al. (2024) [38] report weaker performance for the Longformer-CASREL model on engineering consulting texts (F1 of 0.6899), primarily due to the diverse expression forms and high language ambiguity of technical texts, which resemble the characteristics of the corpus used in this study. The corpus used in this study not only exceeds the maximum character length manageable by Longformer and BigBird models but also varies considerably in writing style, relies on implicit semantic expressions, and contains many strongly context-dependent and domain-specific terms. Despite these challenges, the proposed model still achieves an F1 score of 0.6234. Although this is lower than the results in structured domains, it indicates practical applicability to complex, unstructured, and semi-structured long texts. Note that this comparative analysis presents the representative performance of existing mainstream pre-trained models on long-text tasks within their respective domains, illustrating their capabilities and application scenarios. It is not a direct performance comparison with the proposed model on the same task.

5.2. Knowledge Contributions

The knowledge contributions of this study in the field of prefabricated building construction safety are evident and primarily manifested in the following two aspects.

1. Previous research has been limited by the constraints of traditional Transformer architectures and has focused mainly on information extraction from short texts, with insufficient exploration of long-text processing. To address this gap, this paper proposes a long-text NER method designed for the prefabricated construction safety domain. An analysis of the model’s performance across varying text lengths validates its effectiveness in long-text NER. In addition, contrastive experiments with semantic noise evaluate the model’s robustness to the non-standard text inputs found in practical applications, demonstrating the practical potential of the method. This approach improves the efficiency of information extraction from prefabricated construction safety accident texts, opens a new technical pathway for semantic parsing of long texts, and extends the use of NLP for semantic understanding and knowledge construction in the construction engineering domain.

2. The prefabricated building construction process involves a large volume of unstructured or semi-structured long-text data in various forms, such as accident reports, construction logs, and supervision records. The NER method proposed in this study enables automatic extraction of critical safety information from massive amounts of text, thereby achieving structured knowledge management and improving safety management performance. Moreover, it facilitates the transition of enterprises toward data-driven construction safety management models, enhancing the synergistic development of building industrialization and intelligent construction in the industry.

5.3. Limitations and Future Work

Although the long-text NER method proposed in this study has achieved satisfactory performance on prefabricated building construction safety accident reports, several limitations remain that require further improvement and optimization in future research.

1. The training and testing corpora used in this study are primarily derived from publicly available Chinese reports of prefabricated building construction safety accidents. The overall dataset size is relatively limited, and the distribution of entity types is imbalanced, which may adversely affect the model’s ability to recognize long-tail entity categories and its generalization performance [52].

2. The current dataset is limited to Chinese texts, and the feasibility of transferring the model to other languages, such as English, has not been explored.

3. This study focuses solely on the NER task and does not extend to constructing semantic relationships between entities. This restricts the model’s potential applications in higher-level tasks such as knowledge graph construction and safety inference analysis.

Future research will focus on addressing the above issues. To tackle the challenges of limited sample size and cross-lingual adaptation, a progressive knowledge distillation framework will be employed [53]. Additionally, future studies will integrate an entity relation extraction module to advance the construction of knowledge graphs in the field of prefabricated construction safety.

6. Conclusions

This study focuses on construction safety accidents in prefabricated buildings and proposes a new method for NER to address the challenges associated with long text. Systematic experiments and comparative analyses were conducted on actual safety incident investigation reports, leading to the following conclusions.

1. This work integrates the advantages of several models to construct a novel NER architecture based on ModernBERT-BiLSTM-CRF. On long texts, the ModernBERT-BiLSTM-CRF model outperforms the traditional BERT and BERT-BiLSTM-CRF models in Precision, Recall, and F1, achieving values of 0.6855, 0.5828, and 0.6234, respectively. Notably, it demonstrates excellent recognition performance for the key entity categories “Consequence” and “Type,” with F1 scores reaching 0.95 and 0.92, respectively.

2. In long-text tests involving semantic noise interference, the model’s performance exhibits only minor fluctuations, with the F1 score moving from 0.6234 to 0.6186. This indicates robustness and adaptability when handling real-world texts characterized by irregular expressions, inconsistent formatting, and semantic ambiguity. Such robustness provides reliable support for safety information extraction and knowledge graph construction in complex contexts.

3. Although models like Longformer and BigBird perform well in medical tasks, they show limited adaptability when confronted with engineering texts featuring complex terminology, semantic ambiguity, and greater text length. The proposed method demonstrates higher semantic parsing capability for longer and more heterogeneous texts typical of engineering contexts, thereby offering greater practical value for engineering applications.

Author Contributions

Methodology, writing—original draft preparation, and visualization, Q.L. Conceptualization, validation, writing—review and editing, supervision, project administration, and resources, G.Z. Formal analysis, investigation, and data curation, Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Li, Y.; Gao, Y.; Meng, X.; Liu, X.; Feng, Y. Assessing the air pollution abatement effect of prefabricated buildings in China. Environ. Res. 2023, 239, 117290.
2. Miracco, G.; Nicoletti, F.; Ferraro, V.; Muzzupappa, M.; Mattanò, V.M.; Alberti, F. Achieving nZEB goal through prefabricated buildings: Case study in Italy. Energy Build. 2025, 329, 115301.
3. Zhao, W.; Hao, J.L.; Gong, G.; Ma, W.; Zuo, J.; Di Sarno, L. Decarbonizing prefabricated building waste: Scenario simulation of policies in China. J. Clean. Prod. 2024, 458, 142529.
4. Modular Building Institute MBI. 2024 Annual Modular Construction Reports. 2024. Available online: https://www.modular.org/industry-analysis/ (accessed on 11 May 2025).
5. Song, Y.; Wang, J.; Liu, D.; Guo, F. Study of Occupational Safety Risks in Prefabricated Building Hoisting Construction Based on HFACS-PH and SEM. Int. J. Environ. Res. Public Health 2022, 19, 1550.
6. Jeong, G.; Kim, H.; Lee, H.S.; Park, M.; Hyun, H. Analysis of safety risk factors of modular construction to identify accident trends. J. Asian Archit. Build. Eng. 2022, 21, 1040–1052.
7. Sadeghi, H.; Cheung, C.M.; Yunusa-Kaltungo, A.; Manu, P. A systematic review of occupational safety and health in modular integrated construction. Saf. Sci. 2025, 189, 106897.
8. Wen, B.; Musa, S.N.; Onn, C.C.; Ramesh, S.; Liang, L.; Wang, W.; Ma, K. The role and contribution of green buildings on sustainable development goals. Build. Environ. 2020, 185, 107091.
9. Pan, X.; Zhong, B.; Wang, Y.; Shen, L. Identification of accident-injury type and bodypart factors from construction accident reports: A graph-based deep learning framework. Adv. Eng. Inform. 2022, 54, 101752.
10. Baker, H.; Hallowell, M.R.; Tixier, A.J.P. Automatically learning construction injury precursors from text. Autom. Constr. 2020, 118, 103145.
11. Cheng, M.Y.; Kusoemo, D.; Gosno, R.A. Text mining-based construction site accident classification using hybrid supervised machine learning. Autom. Constr. 2020, 118, 103265.
12. Shao, L.; Guo, S.; Dong, Y.; Niu, H.; Zhang, P. Cause analysis of construction collapse accidents using association rule mining. Eng. Constr. Archit. Manag. 2023, 30, 4120–4142.
13. Zhou, Z.; Wei, L.; Luan, H. Deep learning for named entity recognition in extracting critical information from struck-by accidents in construction. Autom. Constr. 2025, 173, 106106.
14. Jeon, K.; Lee, G.; Yang, S.; Jeong, H.D. Named entity recognition of building construction defect information from text with linguistic noise. Autom. Constr. 2022, 143, 104543.
15. Zhong, Y.; Goodfellow, S.D. Domain-specific language models pre-trained on construction management systems corpora. Autom. Constr. 2024, 160, 105316.
16. Lv, X.; Liu, Z.; Zhao, Y.; Xu, G.; You, X. HBert: A Long Text Processing Method Based on BERT and Hierarchical Attention Mechanisms. Int. J. Semant. Web Inf. Syst. 2023, 19, 1–14.
17. Golizadeh, H.; Hon, C.K.H.; Drogemuller, R.; Reza Hosseini, M. Digital engineering potential in addressing causes of construction accidents. Autom. Constr. 2018, 95, 284–295.
18. Gurmu, A.T. Hybrid Model for Assessing the Influence of Safety Management Practices on Labor Productivity in Multistory Building Projects. J. Constr. Eng. Manag. 2021, 147, 04021139.
19. Khalid, U.; Sagoo, A.; Benachir, M. Safety Management System (SMS) framework development—Mitigating the critical safety factors affecting Health and Safety performance in construction projects. Saf. Sci. 2021, 143, 105402.
20. Baek, S.; Jung, W.; Han, S.H. A critical review of text-based research in construction: Data source, analysis method, and implications. Autom. Constr. 2021, 132, 103915.
21. Ricketts, J.; Barry, D.; Guo, W.; Pelham, J. A Scoping Literature Review of Natural Language Processing Application to Safety Occurrence Reports. Safety 2023, 9, 22.
22. Xu, N.; Ma, L.; Wang, L.; Deng, Y.; Ni, G. Extracting Domain Knowledge Elements of Construction Safety Management: Rule-Based Approach Using Chinese Natural Language Processing. J. Manag. Eng. 2021, 37, 04021001.
23. Yang, J.; Zhang, T.; Tsai, C.Y.; Lu, Y.; Yao, L. Evolution and emerging trends of named entity recognition: Bibliometric analysis from 2000 to 2023. Heliyon 2024, 10, e30053.
24. Zhang, F.; Fleyeh, H.; Wang, X.; Lu, M. Construction site accident analysis using text mining and natural language processing techniques. Autom. Constr. 2019, 99, 238–248.
25. Goyal, A.; Gupta, V.; Kumar, M. Recent Named Entity Recognition and Classification techniques: A systematic review. Comput. Sci. Rev. 2018, 29, 21–43.
26. Li, J.; Sun, A.; Han, J.; Li, C. A Survey on Deep Learning for Named Entity Recognition. IEEE Trans. Knowl. Data Eng. 2022, 34, 50–70.
27. Moon, S.; Chi, S.; Im, S.B. Automated detection of contractual risk clauses from construction specifications using bidirectional encoder representations from transformers (BERT). Autom. Constr. 2022, 142, 104465.
28. Liu, J.; Luo, H.; Fang, W.; Love, P.E.D. A contrastive learning framework for safety information extraction in construction. Adv. Eng. Inform. 2023, 58, 102194.
29. Wu, W.; Wen, C.; Yuan, Q.; Chen, Q.; Cao, Y. Construction and application of knowledge graph for construction accidents based on deep learning. Eng. Constr. Archit. Manag. 2023, 32, 1097–1121.
30. Cao, K.; Chen, S.; Yang, C.; Li, Z.; Luo, L.; Ren, Z. Revealing the coupled evolution process of construction risks in mega hydropower engineering through textual semantics. Adv. Eng. Inform. 2024, 62, 102713.
31. Shishehgarkhaneh, M.B.; Moehler, R.C.; Fang, Y.; Hijazi, A.A.; Aboutorab, H. Transformer-Based Named Entity Recognition in Construction Supply Chain Risk Management in Australia. IEEE Access 2024, 12, 41829–41851.
32. Jing, F.; Zhang, M.; Li, J.; Xu, G.; Wang, J. A Novel Named Entity Recognition Algorithm for Hot Strip Rolling Based on BERT-Imseq2seq-CRF Model. Appl. Sci. 2022, 12, 11418.
33. Wang, H.; Xu, S.; Cui, D.; Xu, H.; Luo, H. Information Integration of Regulation Texts and Tables for Automated Construction Safety Knowledge Mapping. J. Constr. Eng. Manag. 2024, 150, 04024034.
34. Shuai, B. A rationale-augmented NLP framework to identify unilateral contractual change risk for construction projects. Comput. Ind. 2023, 149, 103940.
35. Beltagy, I.; Peters, M.E.; Cohan, A. Longformer: The Long-Document Transformer. arXiv 2020, arXiv:2004.05150.
36. Zaheer, M.; Guruganesh, G.; Dubey, A.; Ainslie, J.; Alberti, C.; Ontanon, S.; Pham, P.; Ravula, A.; Wang, Q.; Yang, L.; et al. Big Bird: Transformers for Longer Sequences. arXiv 2021, arXiv:2007.14062.
37. Li, Y.; Wehbe, R.M.; Ahmad, F.S.; Wang, H.; Luo, Y. A comparative study of pretrained language models for long clinical text. J. Am. Med. Inform. Assoc. 2023, 30, 340–347.
38. Gao, B.; Hu, Y.; Gu, J.; Han, X. Integrating deep learning and multi-attention for joint extraction of entities and relationships in engineering consulting texts. Autom. Constr. 2024, 168, 105739.
39. Morteza, A.; Chou, R.A. Distributed Matrix Multiplication: Download Rate, Randomness and Privacy Trade-Offs. In Proceedings of the 2024 60th Annual Allerton Conference on Communication, Control, and Computing, Urbana, IL, USA, 24–27 September 2024; pp. 1–7.
40. Warner, B.; Chaffin, A.; Clavié, B.; Weller, O.; Hallström, O.; Taghadouini, S.; Gallagher, A.; Biswas, R.; Ladhak, F.; Aarsen, T.; et al. Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference. arXiv 2024, arXiv:2412.13663.
41. Graves, A.; Fernández, S.; Schmidhuber, J. Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition. In Artificial Neural Networks: Formal Models and Their Applications—ICANN 2005; Duch, W., Kacprzyk, J., Oja, E., Zadrozny, S., Eds.; Springer: Berlin/Heidelberg, Germany, 2005; pp. 799–804.
42. Li, W.; Du, Y.; Li, X.; Chen, X.; Xie, C.; Li, H.; Li, X. UD_BBC: Named entity recognition in social network combined BERT-BiLSTM-CRF with active learning. Eng. Appl. Artif. Intell. 2022, 116, 105460.
43. Lafferty, J.D.; McCallum, A.; Pereira, F.C.N. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the Eighteenth International Conference on Machine Learning, San Francisco, CA, USA, 28 June–1 July 2001; Morgan Kaufmann Publishers Inc.: Burlington, MA, USA, 2001; pp. 282–289.
44. Chen, Q.; Long, D.; Yang, C.; Xu, H. Knowledge Graph Improved Dynamic Risk Analysis Method for Behavior-Based Safety Management on a Construction Site. J. Manag. Eng. 2023, 39, 04023023.
45. Huo, X.; Yin, Y.; Jiao, L.; Zhang, Y. A data-driven and knowledge graph-based analysis of the risk hazard coupling mechanism in subway construction accidents. Reliab. Eng. Syst. Saf. 2024, 250, 110254.
46. Gao, S.; Ren, G.; Li, H. Knowledge Management in Construction Health and Safety Based on Ontology Modeling. Appl. Sci. 2022, 12, 8574.
47. Guo, B.H.W.; Goh, Y.M. Ontology for design of active fall protection systems. Autom. Constr. 2017, 82, 138–153.
48. Xing, X.; Zhong, B.; Luo, H.; Li, H.; Wu, H. Ontology for safety risk identification in metro construction. Comput. Ind. 2019, 109, 14–30.
49. Xu, N.; Liang, Y.; Guo, C.; Meng, B.; Zhou, X.; Hu, Y.; Zhang, B. Entity recognition in the field of coal mine construction safety based on a pre-training language model. Eng. Constr. Archit. Manag. 2023, 32, 2590–2613.
50. Yang, J.; Zhang, Y.; Li, L.; Li, X. YEDDA: A Lightweight Collaborative Text Span Annotation Tool. In Proceedings of ACL 2018, System Demonstrations; Liu, F., Solorio, T., Eds.; Association for Computational Linguistics: Melbourne, Australia, 2018; pp. 31–36.
51. Schäfer, H.; Idrissi-Yaghir, A.; Bewersdorff, J.; Frihat, S.; Friedrich, C.M.; Zesch, T. Medication event extraction in clinical notes: Contribution of the WisPerMed team to the n2c2 2022 challenge. J. Biomed. Inform. 2023, 143, 104400.
52. Nemoto, S.; Kitada, S.; Iyatomi, H. Majority or Minority: Data Imbalance Learning Method for Named Entity Recognition. IEEE Access 2025, 13, 9902–9909.
53. Li, Z.; Hu, C.; Zhang, R.; Chen, J.; Guo, X. Zero-Shot Cross-Lingual Named Entity Recognition via Progressive Multi-Teacher Distillation. IEEE/ACM Trans. Audio Speech Lang. Process. 2024, 32, 4617–4630.

Figure 1. Overall framework of the model.

Figure 2. Architecture of the ModernBERT model.

Figure 3. Architecture of the BiLSTM model.

Figure 4. Architecture of the CRF model.

Figure 5. Text preprocessing.

Figure 6. Schematic diagram of text annotation.

Figure 7. Distribution of named entities.

Table 1. BERT-Based NER studies in construction safety.

Related Work	Methodology	F1 Score	Research Focus
[27]	BERT	93.4%	Identification of Contractual Risk Clauses in Construction Specifications
[28]	CL-CasRel	66.9%	NER in Safety Documents
[29]	BERT-BiLSTM-CRF	88.26%	Identification of Construction Safety Knowledge Entities
[30]	BERT-GPLinker	91.90%	Identification of Risk Factors in Large-Scale Hydropower Construction Projects
[13]	BERT-LSTM	91%	Extraction of Key Information from Construction Accidents
[31]	RoBERTa	85.80%	NER of Risks in the Construction Supply Chain
[32]	BERT-Imseq2seq-CRF	91.47%	NER in Hot-Rolled Strip Rolling Process Texts
[14]	KoBERT	91.0%	Information Extraction from Construction Defect Reports
[15]	RoBERTa	95.6%	Extraction of Compliance Information in Construction Projects
[33]	BERT-BiLSTM-CRF	85.65%	Extraction of Safety Specification Information in Construction
[34]	BERT-base-uncased	87%	Identification of Risk Elements in Unilateral Contract Changes of Construction Projects

Table 2. Categories of entities.

Entity Category	Label	Description	Source
Accident Time	Time	The specific time point or time period when the accident occurred.	[44,45]
Construction Activity	Activity	The specific construction task being carried out at the time of the accident.	[44,46,47,48]
Accident Cause	Cause	The direct and indirect causes of the accident, such as unsafe worker behaviors, mechanical failures, and managerial deficiencies.	[13,29,44,45,46,47,48,49]
Accident Type	Type	The nature of the accident, including falls from height, struck-by-object incidents, collapses, etc.	[44,45,47]
Accident Consequence	Consequence	The impact of the accident, including casualties, injuries, and economic losses.	[13,29,44,45,46,48]
Preventive Measure	Measure	Recommended preventive actions or improvement strategies to avoid similar accidents.	[46,47,48]

Table 3. NER performance of different pretrained models on short texts.

Model	Precision	Recall	F1
BERT	0.6712	0.6753	0.6732
BERT-BiLSTM-CRF	0.7012	0.6825	0.6916
ModernBERT-BiLSTM	0.7381	0.6225	0.6656
ModernBERT-BiLSTM-CRF	0.6825	0.6643	0.6731

Table 4. NER performance of different pretrained models on long texts.

Model	Precision	Recall	F1
BERT	0.5912	0.5743	0.5827
BERT-BiLSTM-CRF	0.6023	0.5824	0.5922
ModernBERT-BiLSTM	0.7106	0.5756	0.6143
ModernBERT-BiLSTM-CRF	0.6855	0.5828	0.6234
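The F1 columns of Tables 3 and 4 can be compared to see how much each model degrades when moving from short to long texts. The sketch below is plain arithmetic on the reported values; the helper names `f1_drop` and `gain_over` are illustrative and not part of the study.

```python
# F1 scores copied from Table 3 (short texts) and Table 4 (long texts).
F1_SHORT = {
    "BERT": 0.6732,
    "BERT-BiLSTM-CRF": 0.6916,
    "ModernBERT-BiLSTM": 0.6656,
    "ModernBERT-BiLSTM-CRF": 0.6731,
}
F1_LONG = {
    "BERT": 0.5827,
    "BERT-BiLSTM-CRF": 0.5922,
    "ModernBERT-BiLSTM": 0.6143,
    "ModernBERT-BiLSTM-CRF": 0.6234,
}


def f1_drop(model):
    """Absolute F1 lost by `model` when moving from short to long texts."""
    return round(F1_SHORT[model] - F1_LONG[model], 4)


def gain_over(baseline, model="ModernBERT-BiLSTM-CRF"):
    """Absolute long-text F1 gain of `model` over `baseline`."""
    return round(F1_LONG[model] - F1_LONG[baseline], 4)


if __name__ == "__main__":
    for name in F1_SHORT:
        print(f"{name:<24} short-to-long F1 drop: {f1_drop(name):.4f}")
    for name in ("BERT", "BERT-BiLSTM-CRF", "ModernBERT-BiLSTM"):
        print(f"Long-text F1 gain over {name}: {gain_over(name):.4f}")
```

On these figures, the two BERT-based models lose roughly 0.09 to 0.10 F1 on long texts, while the two ModernBERT-based models lose about 0.05, and ModernBERT-BiLSTM-CRF leads BERT and BERT-BiLSTM-CRF on long texts by 0.0407 and 0.0312, respectively.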

Table 5. NER performance by entity category.

Entity Category	Precision	Recall	F1
Time	0.98	0.65	0.78
Activity	0.49	0.26	0.34
Cause	0.58	0.33	0.43
Type	0.92	0.92	0.92
Consequence	0.93	0.97	0.95
Measure	0.68	0.31	0.41

Table 6. Comparison of existing NER performance on long texts.

No.	Related Work	Model	Data Scale	F1 Score	Text Length
1	[37]	Clinical-Longformer	1304 electronic medical records	Average 0.9055	4096 characters
1	[37]	Clinical-BigBird	1304 electronic medical records	Average 0.8945	4096 characters
2	[38]	Longformer-CASREL	7 engineering consulting standards and specification documents	0.6899	4096 characters
3	[51]	DeBERTa v3-Longformer	1017 medical texts	0.940	4096 characters
4	This study	ModernBERT-BiLSTM-CRF	529 construction accident investigation reports	0.6234	8192 characters

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
