Fault Knowledge Graph Construction Method for Hydraulic Turbine Speed Control System Based on BERTWWM-BiLSTM-MHA-CRF Model

Liu, Sheng; Zhang, Kefei; Zhang, Tianbao; Wang, Zhong; Ai, Xun

doi:10.3390/app152312377


by Sheng Liu ^1,2, Kefei Zhang ^1,2,*, Tianbao Zhang ^1,2, Zhong Wang ^1,2 and Xun Ai ^1,2

¹ School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan 430068, China

² Hubei Engineering Research Center for Safety Monitoring of New Energy and Power Grid Equipment, Hubei University of Technology, Wuhan 430068, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(23), 12377; https://doi.org/10.3390/app152312377

Submission received: 23 October 2025 / Revised: 13 November 2025 / Accepted: 18 November 2025 / Published: 21 November 2025


Abstract

As a crucial component of the power industry, the hydraulic turbine speed control system plays a vital role in the safe and stable operation of hydropower stations. Intelligent operation and maintenance of this system is an important means of ensuring the safety, stability, and economy of the unit. Over the years, hydropower plants have accumulated extensive fault text data related to the hydraulic turbine speed control system, which has yet to be effectively mined and utilized. To address this issue, this paper proposes a novel method that uses the BERTWWM-BiLSTM-MHA-CRF model to construct a fault knowledge graph of the hydraulic turbine speed control system. First, the knowledge graph schema is designed, the recording characteristics of the fault text are analyzed, and the unstructured text is cleaned and labeled. Subsequently, an entity extraction model based on the BERTWWM-BiLSTM-MHA-CRF framework is developed to support the intelligent extraction of entities and relationships. Finally, the triples, consisting of entities and relationships, are stored in the Neo4j graph database to complete the construction and visualization of the fault knowledge graph, and an application process for auxiliary decision-making is proposed. The data processing methodology outlined in this paper, based on the graph schema design, effectively produces high-quality datasets. Furthermore, compared with the traditional model and mainstream large language models, the BERTWWM-BiLSTM-MHA-CRF model demonstrates superior entity extraction performance. Validation on fault instances shows that the knowledge graph provides effective support for fault diagnosis in the hydraulic turbine speed control system.

Keywords:

knowledge graph; hydraulic turbine speed control system; entity extraction; schema design

1. Introduction

The hydraulic turbine speed control system is a critical component of the hydropower unit and directly influences the stability and efficiency of power generation. By monitoring and regulating the rotational speed of the hydraulic turbine in real time, the speed control system keeps the generator synchronized with the power grid and avoids frequency instability caused by load fluctuations or emergencies. Any fault in the system can directly impact the operational efficiency and safety of the hydropower unit. However, the substantial experience and data accumulated by operation and maintenance personnel while resolving faults in the hydraulic turbine speed control system often remain underutilized. Therefore, the objective of this research is to use intelligent technologies to integrate and exploit these data, so as to help on-site operation and maintenance personnel handle faults more efficiently. This in turn is expected to improve the operation and maintenance level of hydropower stations, maximize the operating time of equipment related to the turbine speed control system, extend the service life of the units, increase electricity generation efficiency, and ultimately bring better economic and social benefits to hydropower stations.

The Knowledge Graph (KG) [1] is a next-generation intelligent semantic search engine technology proposed by Google [2]. A knowledge graph is fundamentally a structured semantic knowledge base that describes concepts and their relationships in the physical world in symbolic form [3]. In recent years, knowledge graphs have been widely used in various fields and have been studied extensively. Shu Jiawei et al. [4] applied knowledge graphs in the field of electrical equipment and proposed an improved sequence labeling joint extraction method for knowledge extraction in power graphs, transforming entity relationships into graph structures. However, their approach has a long computation time and requires lightweight processing. Yang He et al. [5] applied knowledge graphs in the fishing industry and proposed a dual-attention mechanism-based entity relation extraction method for fishing standards. By employing sentence structure classification labeling and a dual-attention mechanism, they effectively addressed the overlapping relationship problem present in fishing standard texts. Hao Zhigang et al. [6] applied knowledge graphs in the field of food safety inspection. They improved Chinese dependency syntax trees and graph neural network technologies to effectively solve the issue of weakened inter-word dependencies caused by long sentences in Chinese food inspection announcement texts. Deng Na et al. [7] applied knowledge graphs in the field of traditional Chinese medicine patents, addressing the complexities of entity overlap and relationships in patent texts by integrating semantic features and a multi-layer cross-attention mechanism. Zhang Yu et al. [8] applied knowledge graphs in the apple cultivation sector, proposing a reinforcement learning-based entity relation joint extraction model. This model tackles the scarcity of annotated data in low-resource scenarios typical of apple cultivation by introducing pseudo-label generation and gradient simulation mechanisms, significantly enhancing the model’s generalization and extraction performance. At the same time, a series of collaborative frameworks have emerged that build on the natural language processing capabilities of LLMs to enhance the interaction between general-domain knowledge graphs and large models [9]. The GraphRAG framework proposed by Edge et al. [10] improves summarization capabilities but performs poorly in detail-oriented question answering in vertical domains. Gutiérrez et al. [11] proposed the HippoRAG2 framework, which simulates human hippocampal mechanisms; however, its open information extraction method is limited in its application to vertical domains. Wei Yijin et al. [12] constructed a knowledge graph of potato varieties based on the GraphRAG framework and the Qwen2-70B-Instruct model.

In applications of knowledge graphs in vertical domains, Yu Tong et al. [13] used knowledge graph technology to integrate terminology, documents, databases, and other knowledge in the field of traditional Chinese medicine, constructed a knowledge graph of traditional Chinese medicine, and developed a knowledge base system for traditional Chinese medicine and health care that supports functions such as visual display of knowledge, rapid retrieval, and recommendation. Chen Qiang [14] introduced knowledge graphs into the tobacco production and processing process and constructed a knowledge graph system for tobacco re-baking in order to represent and manage the knowledge of tobacco re-baking in an orderly manner. Hao-Chen Ding et al. [15] addressed the multi-source heterogeneity of oil tea knowledge data by constructing an oil tea knowledge graph, and designed and developed an oil tea knowledge graph application system. Nie Tongpan et al. [16] applied knowledge graph technology to aircraft power system fault diagnosis; they constructed the ontology of the knowledge graph using credible expert knowledge in the field as the data source, and the resulting system performed well in intelligent search, intelligent question answering, and intelligent recommendation. Han Xudong et al. [17] developed an intelligent auxiliary diagnosis system for operation and maintenance that integrates a knowledge graph built on high-confidence expert knowledge and exploits the advantages of artificial intelligence and big data. Xu Qi [18] designed and developed a CNC fault diagnosis system based on a knowledge graph, completed an initial fault diagnosis knowledge graph, and designed a knowledge graph-based diagnosis method. In the field of fault diagnosis for turbine speed control systems, Wang Shuqing et al. [19] combined the complete ensemble empirical mode decomposition with adaptive noise algorithm with a variational mode decomposition algorithm to analyze the vibration signals of turbines, and used a spatial relational reasoning unit deep learning model to identify and diagnose faults accurately. However, this method demonstrates limited generalization ability and cannot guarantee the accuracy of diagnostic results in practical applications. Sun Shaonan et al. [20] constructed a Bayesian network model based on historical fault data from turbines for precise fault diagnosis. However, research on knowledge graph construction for hydraulic turbine speed control system faults is still at an initial stage, and a complete knowledge graph construction method and application framework for this field is lacking.

To address these challenges, this paper proposes a method for constructing a fault knowledge graph of the hydraulic turbine speed control system based on the BERTWWM-BiLSTM-MHA-CRF model. First, the schema is designed; subsequently, the text related to the hydraulic turbine speed control system is labeled and processed. Next, entities and relations are extracted, and finally, the extracted triple data are stored in the Neo4j database to enable knowledge graph visualization, completing the construction of the fault knowledge graph for the hydraulic turbine speed control system. A process is then proposed for using the constructed knowledge graph to assist in decision-making. By combining the knowledge graph with observed anomalies, it is possible to infer the components involved, identify abnormal causes, and recommend corrective measures. This process provides valuable references for on-site operation and maintenance personnel, thereby enhancing the operation and maintenance capabilities of hydropower stations.

The innovation of this paper concerns fault diagnosis for the turbine speed control system. Building on the BERT-BiLSTM-CRF model, we introduce a Whole Word Masking (WWM) mechanism and incorporate a Multi-Head Attention (MHA) mechanism to propose the BERTWWM-BiLSTM-MHA-CRF entity recognition model. Additionally, we combine this with a relation extraction model to create a high-quality knowledge graph for turbine speed control system faults. The main contributions of this paper are as follows:

(1): We construct the turbine governor system fault knowledge graph by integrating structured, semi-structured, and unstructured data, using the BERTWWM-BiLSTM-MHA-CRF model for entity extraction.
(2): We propose an application process of fault auxiliary decision-making for the turbine governor system, which provides a reference for on-site operation and maintenance personnel and helps enhance the operation and maintenance level of hydropower stations.

The remainder of this paper is organized as follows: Section 2 reviews the construction methods of knowledge graphs. Sections 3, 4 and 5 detail the construction process of the turbine governor system fault knowledge graph. Section 6 provides an in-depth analysis and discussion of all evaluation results. Section 7 concludes the paper.

2. Knowledge Graph Construction Architecture

The construction of the fault knowledge graph for the hydraulic turbine speed control system is primarily divided into three components: entity extraction, relationship extraction, and knowledge storage and graph display. Initially, we collect relevant information regarding the speed control system to obtain empirical data, analyze the fault data to derive concepts related to the fault domain, and complete knowledge modeling of this domain, including attribute–entity–relationship modeling. Subsequently, we use Label Studio to annotate the data and transform the labeled data into BIO format. For entity extraction, a method based on BERT-WWM-BiLSTM-MHA-CRF is designed to construct a fault entity recognition model for the hydraulic turbine speed control system, enabling the effective extraction of key fault entity information from the text. Additionally, building on the characteristics of the acquired fault entity information and on expert discussions, a BERT-based fault relationship extraction method is proposed to construct the fault text relationship set for the hydraulic turbine speed control system. Finally, for knowledge storage and display, the acquired fault entity and relationship information is organized into “entity–relationship–entity” triples, which are stored and mapped using the Neo4j graph database. Neo4j is a graph database management system that stores data as graphs and offers robust querying and analysis functionalities. In this way, a knowledge graph of hydraulic turbine speed control system faults is constructed through entity extraction, relationship extraction, and knowledge storage and graph display. In addition, dynamically updating the fault knowledge graph and generating auxiliary decisions from it can enhance staff understanding and analysis of system fault information, thereby improving the efficiency and accuracy of fault handling. The construction process of the fault knowledge graph of the hydraulic turbine speed control system is illustrated in Figure 1.
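As a rough illustration of the storage step described above, the following Python sketch turns one “entity–relationship–entity” triple into a parameterized Cypher `MERGE` statement. The node labels, the relationship-type names in `REL_TYPES`, and the helper functions are hypothetical choices made for this example, not names taken from the paper; executing the statement would additionally require a Neo4j driver session, which is omitted here.

```python
import re

# Hypothetical mapping from relation names used in the text to Neo4j relationship types.
REL_TYPES = {"Emerged": "EMERGED", "Due To": "DUE_TO", "Corresponds": "CORRESPONDS"}


def label_of(entity_type):
    """Turn an entity type such as 'abnormal phenomenon' into a label like 'AbnormalPhenomenon'."""
    words = re.split(r"[^A-Za-z0-9]+", entity_type)
    return "".join(word.capitalize() for word in words if word)


def triple_to_cypher(head, head_type, relation, tail, tail_type):
    """Return a parameterized Cypher statement and its parameters for one triple."""
    rel = REL_TYPES[relation]
    query = (
        f"MERGE (h:{label_of(head_type)} {{name: $head}}) "
        f"MERGE (t:{label_of(tail_type)} {{name: $tail}}) "
        f"MERGE (h)-[:{rel}]->(t)"
    )
    return query, {"head": head, "tail": tail}


query, params = triple_to_cypher(
    "touchscreen", "component", "Emerged", "freeze", "abnormal phenomenon"
)
print(query)
print(params)
```

Using `MERGE` rather than `CREATE` keeps repeated triples from producing duplicate nodes or edges, which matters when the same component appears in many fault records.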

3. Data Acquisition and Processing

3.1. Data Sources

Through regular inspections, information on the operating conditions of the hydraulic turbine and the speed control system is gathered, including equipment operating and surface conditions, operating parameters, sounds, and temperatures. From the resulting inspection reports, more than 490,000 words of fault text for the hydraulic turbine speed control system have been compiled. The specific format of the fault text is presented in Table 1.

3.2. Schema Design

Whether the source data are structured or unstructured, it is essential to first design the knowledge graph schema. This schema describes the types of entities, their attributes, and the relationships within the knowledge graph, serving as a critical reference for querying and analyzing the graph, akin to the table structure in a relational database. The schema is typically expressed as “entity–relationship–entity” and “entity–attribute–attribute value” triples. The construction of the knowledge graph requires a clear structure that accurately articulates the semantic connections among entities, relations, and attributes, thereby enhancing the practical application of the constructed knowledge graph. The completed schema, based on the fault information of the hydraulic turbine speed control system, is illustrated in Figure 2.

The fault knowledge schema for the turbine speed control system uses multiple triples to describe in detail the relationships among equipment, components, and fault states within the system. First, the equipment and its constituent components are linked by “containment” relationships, indicating that the equipment comprises multiple components. Next, abnormal phenomena may have “emerged” in components, indicating the potential failure manifestations that the components might exhibit during operation. Additionally, abnormal phenomena are generated “due to” specific abnormal causes, which further elucidates the root cause of the failure. Different work items “contain” process requirements, facilitating systematic management of the speed control process and its associated tasks. The identified abnormal causes may “belong” to a specific fault type, which is crucial for the classification and subsequent treatment of faults. Finally, the identified abnormal phenomena “correspond” to specific treatment measures that ensure the stable operation of the turbine unit. This series of triples establishes a structured knowledge base for fault diagnosis, management decisions, and optimized maintenance strategies, thereby forming a comprehensive mechanism for fault identification and treatment.
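The schema triples described above can be written down as a small set of allowed (head type, relation, tail type) combinations. The sketch below is only an illustration: the lowercase type and relation strings are paraphrased from the text, and the `conforms` and `filter_triples` helpers are hypothetical names introduced here to show how extracted triples might be checked against the schema before storage.

```python
# Allowed (head entity type, relation, tail entity type) combinations,
# paraphrased from the schema description; the exact strings are assumptions.
SCHEMA = {
    ("equipment", "contains", "component"),
    ("component", "emerged", "abnormal phenomenon"),
    ("abnormal phenomenon", "due to", "abnormal cause"),
    ("work item", "contains", "process requirement"),
    ("abnormal cause", "belongs to", "fault type"),
    ("abnormal phenomenon", "corresponds", "treatment measure"),
}


def conforms(head_type, relation, tail_type):
    """Check whether a typed triple is permitted by the schema."""
    return (head_type, relation, tail_type) in SCHEMA


def filter_triples(typed_triples):
    """Keep only instance triples whose entity types and relation match the schema.

    Each item is ((head, head_type), relation, (tail, tail_type)).
    """
    return [
        (head, relation, tail)
        for (head, head_type), relation, (tail, tail_type) in typed_triples
        if conforms(head_type, relation, tail_type)
    ]


candidates = [
    (("touchscreen", "component"), "emerged", ("freeze", "abnormal phenomenon")),
    (("touchscreen", "component"), "due to", ("freeze", "abnormal phenomenon")),
]
valid = filter_triples(candidates)
print(valid)
```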

3.3. Labeling Methods

After obtaining the preprocessed fault text data, this study primarily employs Label Studio for manual data annotation, defining four categories of annotated entities and three categories of annotated relationships. The BIO annotation method is a widely used sequence labeling technique, often applied in tasks like Named Entity Recognition (NER) within the field of Natural Language Processing (NLP). “BIO” is an acronym for “Beginning,” “Inside,” and “Outside.” In the BIO annotation method, each word is assigned a label that indicates its position within a particular entity. Specifically, the labels are divided into the following three types:

(1): “B-”: indicates the beginning of an entity. For example, in “the touch screen suddenly jammed”, “touch” is labeled “B-COM”;
(2): “I-”: indicates the inside of an entity. In the same sentence, “screen” is labeled “I-COM”;
(3): “O”: indicates that the word does not belong to any entity. In the same sentence, “suddenly” is labeled “O”. Examples of labeling styles are shown in Table 2.
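The three label types above can be illustrated with a short Python sketch that converts annotated entity spans into BIO tags. The function name `spans_to_bio` and the span format (start index, exclusive end index, entity type) are assumptions made for this example; only the component span is annotated here for brevity, and the paper itself labels Chinese text rather than English words.

```python
def spans_to_bio(tokens, spans):
    """Convert entity spans into one BIO tag per token.

    spans: list of (start, end, entity_type) with an exclusive end index.
    """
    tags = ["O"] * len(tokens)
    for start, end, entity_type in spans:
        tags[start] = f"B-{entity_type}"      # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{entity_type}"      # remaining tokens of the entity
    return tags


tokens = ["the", "touch", "screen", "suddenly", "jammed"]
spans = [(1, 3, "COM")]  # "touch screen" annotated as a component
tags = spans_to_bio(tokens, spans)
for token, tag in zip(tokens, tags):
    print(token, tag)
```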

Based on the four types of entities—components, abnormal phenomena, abnormal causes, and treatment measures—the relationships between the entities are defined as four categories: Emerged, Due To, Corresponds, and No Relationship (NA), with specific descriptions provided in Table 3. The relationship between components and abnormal phenomena is represented by Emerged; for example, the relationship between the touchscreen and a freeze can be expressed as <touchscreen, Emerged, freeze>. The constructed experimental corpus of fault relationships for the hydraulic turbine speed control system is shown in Table 3.

4. Named Entity Recognition Methods

The fault entity recognition model for the hydraulic turbine speed control system, based on BERT-WWM-BiLSTM-MHA-CRF, is composed of three main parts, namely BERT-WWM, BiLSTM-MHA, and the Conditional Random Field (CRF), which together comprise four layers: the BERT-WWM layer, the BiLSTM layer, the MHA layer, and the CRF layer, as illustrated in Figure 3. First, the input text is processed by the BERT-WWM layer to generate semantic features related to the turbine speed control system faults. Next, the bidirectional encoding capability of BiLSTM extracts deeper semantic content by combining contextual information and semantically related features of key characters, while the Multi-Head Attention (MHA) layer captures dependencies across varying ranges within the sequences. Finally, the CRF layer decodes the weighted predicted values, and the Viterbi algorithm computes the globally optimal label sequence to recognize the entity class of each character in the fault text.

4.1. The BERT-WWM Model

BERT-WWM, also known as whole word masking BERT, is the first processing stage of the model; its main task is to extract primary semantic representations from the input text. As shown in Figure 4, given the input character sequence $X = \{x_1, x_2, \dots, x_n\}$, each character $x_i$ is mapped into a high-dimensional space in the BERT-WWM model to form the embedding vector $v_i = F(x_i)$, where $F$ is the embedding function. Thus, the character sequence $X$ is transformed into the vector sequence $V = \{v_1, v_2, \dots, v_n\}$. BERT-WWM feeds this vector sequence into a series of Transformer encoders, which use a self-attention mechanism to effectively capture the dependencies between the positions in the sequence.

In Figure 4, $E_i$ is the input of the model; $N$ is the number of input words; $Trm$ is the Transformer processing unit; and $T_i$ is the output of the model.

One of the model’s features is that it is pre-trained on contextual information, which enhances language comprehension. The pre-training process of the BERT model comprises two tasks: Masked Language Model (MLM) and Next Sentence Prediction (NSP). In the MLM task, some words in the fault text of the turbine speed control system are randomly masked (15% by default), after which the Transformer model predicts the masked words. This approach helps the BERT model incorporate contextual information, thereby improving its understanding of the hydraulic turbine speed control system fault text. In the entity recognition task for this fault text, the BERT model must consider not only the contextual information among words but also the relationships between sentences, which is the purpose of the NSP task. This task is a binary classification problem: two fault sentences, sentence a and sentence b, are selected from the fault text of the hydraulic turbine speed control system, where 50% of the time sentence b is the sentence that actually follows sentence a, and the other 50% of the time it is a randomly chosen fault sentence. By extracting the semantic features of these two sentences, the BERT model predicts whether they are consecutive sentences.

During the pre-training phase, the original BERT model is trained with the Masked Language Model (MLM) task, in which a portion of individual tokens in the input is randomly selected for masking. This approach does not take into account the intrinsic relationships among the characters that make up a word. This is particularly evident in Chinese, where a combination of characters often conveys richer semantic information than each character alone. For instance, the Chinese term for “touch screen” is split into three separate characters. During pre-training, only one or some of these characters may be replaced by [MASK]. Consequently, the relationship between the masked character and the complete term is not adequately learned, resulting in an incomplete understanding of the overall semantics of “touch screen.” This limitation may degrade model performance in tasks requiring deep semantic understanding, such as identifying fault entities in the hydraulic turbine speed control system. With the Whole Word Masking (WWM) pre-training approach, the model masks “touch screen” as a single unit, enabling more accurate understanding and recognition of its meaning as a concrete electrical term. The whole word masking strategy is illustrated in Table 4.
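The difference between character-level masking and whole word masking can be sketched in a few lines of Python. This is a simplified illustration, not the pre-training code of BERT-WWM: it assumes the sentence has already been segmented into words, masks each unit independently with probability `rate`, and omits the random-replacement and keep-unchanged variants used in actual BERT pre-training. The Chinese example words are an assumed rendering of “touch screen suddenly jammed”.

```python
import random


def char_level_mask(words, rate, rng):
    """Character-level masking: every character is masked independently."""
    out = []
    for word in words:
        for ch in word:
            out.append("[MASK]" if rng.random() < rate else ch)
    return out


def whole_word_mask(words, rate, rng):
    """Whole word masking: if a word is selected, all of its characters are masked."""
    out = []
    for word in words:
        if rng.random() < rate:
            out.extend(["[MASK]"] * len(word))
        else:
            out.extend(list(word))
    return out


# Assumed segmentation of an example sentence: "touch screen" / "suddenly" / "jammed".
words = ["触摸屏", "突然", "卡死"]
rng = random.Random(0)
print(char_level_mask(words, 0.15, rng))
print(whole_word_mask(words, 0.15, rng))
```

With character-level masking, the three characters of the first word can be masked separately; with whole word masking they are always masked or kept together.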

4.2. BiLSTM-MHA Modeling

4.2.1. Bi-LSTM Based Faulty Text Feature Coding

LSTM (long short-term memory) is an optimized version of the RNN model, designed to retain both long-term and short-term information. LSTM introduces a gating mechanism and a memory unit within the RNN framework. The gating mechanism not only allows the model to forget existing information but also effectively filters and learns the information entering the memory unit.

The presence of memory cells allows LSTM to retain information longer than the short-term memory of RNNs, effectively alleviating issues such as vanishing and exploding gradients. The LSTM cell comprises four components: the input gate, the forget gate, the candidate cell state, and the output gate. The structure of the LSTM cell is illustrated in Figure 5.

In Figure 5, $x_t$ is the input at moment $t$. The LSTM cell is computed as shown in Equations (1)–(6):

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f) \tag{1}$$

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i) \tag{2}$$

$$C_t' = \tanh(W_c x_t + U_c h_{t-1} + b_c) \tag{3}$$

$$C_t = f_t \odot C_{t-1} + i_t \odot C_t' \tag{4}$$

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o) \tag{5}$$

$$h_t = o_t \odot \tanh(C_t) \tag{6}$$

where $x_t$ is the input at the current moment; $f_t$ is the forget gate; $i_t$ is the input gate; $C_t'$ is the candidate cell state; $C_t$ is the cell state; $o_t$ is the output gate; $W_f, U_f, W_i, U_i, W_c, U_c, W_o, U_o$ are the weights of the different neural units under each gating mechanism; $b_f, b_i, b_c, b_o$ are the biases of the different units; $h_{t-1}$ is the hidden-layer state at the previous moment; $\odot$ denotes element-wise multiplication; and $\sigma$ and $\tanh$ are the neuron activation functions.
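To make the gate computations concrete, the following NumPy sketch implements a single LSTM step and runs it over a short random sequence. It assumes the standard recurrence, in which each $W$ matrix multiplies the current input, each $U$ matrix multiplies the previous hidden state, and the cell state is carried over from the previous step; the dimensions and random parameters are arbitrary choices for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step following Equations (1)-(6); p holds the weights and biases."""
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["b_f"])     # forget gate
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["b_i"])     # input gate
    c_cand = np.tanh(p["W_c"] @ x_t + p["U_c"] @ h_prev + p["b_c"])  # candidate cell state
    c_t = f_t * c_prev + i_t * c_cand                                # new cell state
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["b_o"])     # output gate
    h_t = o_t * np.tanh(c_t)                                         # new hidden state
    return h_t, c_t


rng = np.random.default_rng(0)
d_in, d_hid = 4, 3  # arbitrary illustrative sizes
params = {}
for gate in "fico":
    params[f"W_{gate}"] = rng.normal(size=(d_hid, d_in))
    params[f"U_{gate}"] = rng.normal(size=(d_hid, d_hid))
    params[f"b_{gate}"] = np.zeros(d_hid)

h, c = np.zeros(d_hid), np.zeros(d_hid)
for x_t in rng.normal(size=(5, d_in)):  # a sequence of five input vectors
    h, c = lstm_step(x_t, h, c, params)
print(h.shape, c.shape)
```

A bidirectional layer would run a second copy of this loop over the reversed sequence and fuse the two hidden states at each position.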

Bi-LSTM uses a bidirectional LSTM mechanism, in which the forward LSTM layer extracts information from time t and previous time steps, while the backward LSTM layer captures information from time t and subsequent time steps in the input sequence. The output features from the two LSTM layers can be fused through methods such as addition, averaging, or concatenation. Bi-LSTM retains the advantages of LSTM while addressing the inability of a unidirectional LSTM to use subsequent context, effectively capturing feature information from both preceding and following elements in the sequence. The structure of the Bi-LSTM model is illustrated in Figure 6.

4.2.2. Multi-Headed Self-Attention Mechanism MHA

The MHA is introduced to calculate the dependency weights among characters, capture dependencies of various ranges within the sequence, extract the semantically related feature information of key characters, and improve named entity recognition in the fault domain. In the MHA module, different parameter matrices are first used to apply $h$ linear transformations to $Q$, $K$, and $V$, and the results are fed into the scaled dot-product attention module, whose computational flow is shown in Figure 7. The parallel heads do not share parameters, so each head can extract distinct character information in a different subspace. After the $h$ linear transformations, the scaled dot-product attention results are computed and concatenated, and another linear mapping is applied to obtain the final output $st = (st_1, st_2, \dots, st_m)$, as shown in Equations (7) and (8):

$$\mathrm{head}_i = \mathrm{Attention}(Q W_i^Q, K W_i^K, V W_i^V) \tag{7}$$

$$st = \mathrm{MultiHead}(Q, K, V) = \mathrm{concat}(\mathrm{head}_1, \mathrm{head}_2, \dots, \mathrm{head}_h) W^O \tag{8}$$

where $W_i^Q \in \mathbb{R}^{d_k \times d_k / h}$, $W_i^K \in \mathbb{R}^{d_k \times d_k / h}$, $W_i^V \in \mathbb{R}^{d_k \times d_k / h}$, and $W^O \in \mathbb{R}^{d_k \times d_k}$ are the parameter matrices used for the linear transformations, and $\mathrm{concat}(\cdot)$ denotes the concatenation (splicing) operation.

Finally, for the features derived from the BERT-WWM layer output, the MHA output vector $st$ and the BiLSTM output vector $h$ are combined and fed into the $\tanh$ activation function to obtain the output $z_t$ of the BiLSTM-MHA intermediate layer, as shown in Equation (9). This output is used as the input of the CRF layer.

$$z_t = \tanh(h \oplus st) \tag{9}$$
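The attention computation and the fusion in Equation (9) can be sketched with NumPy as follows. This is a minimal illustration under several assumptions: the attention is self-attention over the BiLSTM outputs (so $Q$, $K$, and $V$ all come from the same matrix `H`), the output projection is square so that the concatenated heads keep width $d_k$, and $\oplus$ is read as concatenation. The sizes and random weights are arbitrary.

```python
import numpy as np


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V for one head."""
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d))
    return weights @ V


def multi_head_attention(H, W_q, W_k, W_v, W_o):
    """H: (m, d_k) features; W_q, W_k, W_v: lists of h matrices of shape (d_k, d_k/h)."""
    heads = [
        scaled_dot_product_attention(H @ wq, H @ wk, H @ wv)
        for wq, wk, wv in zip(W_q, W_k, W_v)
    ]
    return np.concatenate(heads, axis=-1) @ W_o  # concatenate heads, then project


rng = np.random.default_rng(0)
m, d_k, n_heads = 6, 8, 2           # sequence length, feature width, number of heads
H = rng.normal(size=(m, d_k))       # stand-in for the BiLSTM output vectors
W_q = [rng.normal(size=(d_k, d_k // n_heads)) for _ in range(n_heads)]
W_k = [rng.normal(size=(d_k, d_k // n_heads)) for _ in range(n_heads)]
W_v = [rng.normal(size=(d_k, d_k // n_heads)) for _ in range(n_heads)]
W_o = rng.normal(size=(d_k, d_k))

st = multi_head_attention(H, W_q, W_k, W_v, W_o)
z = np.tanh(np.concatenate([H, st], axis=-1))  # Equation (9), with ⊕ read as concatenation
print(st.shape, z.shape)
```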

4.3. CRF to Obtain the Global Optimal Sequence

Because BiLSTM cannot fully consider the dependencies between fault entity labels, a CRF is used in this section to model the relationships between adjacent labels, so as to obtain the globally optimal sequence and improve the accuracy of entity recognition. Specifically, given $X = (x_1, x_2, \dots, x_n)$ as the input sequence, the score $S(X, y)$ of a predicted output sequence $y$ is calculated as shown in Equation (10).

$$S(X, y) = \sum_{i=0}^{n} A_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i} \tag{10}$$

where $A_{y_i, y_{i+1}}$ is the element of the transfer matrix that scores the transition from label $y_i$ to label $y_{i+1}$, and $P_{i, y_i}$ is the score of label $y_i$ for the $i$th character in the fault text.

The exponentiated score of a labeling is normalized by the sum of the exponentiated scores of all possible labelings to obtain the probability $P(y \mid X)$ of generating a label sequence given the input sequence, as shown in Equation (11).

$$P(y \mid X) = \frac{e^{S(X, y)}}{\sum_{\tilde{y} \in Y_X} e^{S(X, \tilde{y})}} \tag{11}$$

where $X$ is the input sequence, $y$ is the label sequence, and $Y_X$ is the set of all possible label sequences for $X$.

During training, the loss function is computed by log maximum likelihood estimation, which maximizes the log-probability of the output sequence that conforms to the correct labeled sequence, as shown in Equation (12).

$$\log P(y^* \mid X) = S(X, y^*) - \log\left(\sum_{\tilde{y} \in Y_X} e^{S(X, \tilde{y})}\right) \tag{12}$$

where $y^*$ is the output sequence.

Finally, the highest-scoring sequence is output according to Equation (13) as the final labeling result for the fault entities of the speed control system, thus completing the construction of the fault entity recognition model for the speed control system equipment.

$$y^* = \underset{\tilde{y} \in Y_X}{\arg\max}\; S(X, \tilde{y}) \tag{13}$$
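The scoring, normalization, and decoding steps in Equations (10)–(13) can be sketched with NumPy as below. This is a minimal illustration rather than the paper's implementation: the transitions from the start position and into the end position (the $i = 0$ and $i = n$ terms of the transition sum) are represented by separate `start` and `end` vectors, which is an assumption about how the boundary labels are handled, and the scores are random numbers rather than model outputs.

```python
import numpy as np


def sequence_score(P, A, start, end, y):
    """S(X, y): transition scores plus emission scores for one label sequence."""
    score = start[y[0]] + end[y[-1]]
    score += sum(A[y[i], y[i + 1]] for i in range(len(y) - 1))
    score += sum(P[i, y[i]] for i in range(len(y)))
    return float(score)


def log_partition(P, A, start, end):
    """Log of the denominator in Equation (11), computed with the forward algorithm."""
    alpha = start + P[0]
    for i in range(1, len(P)):
        alpha = np.logaddexp.reduce(alpha[:, None] + A, axis=0) + P[i]
    return float(np.logaddexp.reduce(alpha + end))


def viterbi(P, A, start, end):
    """Highest-scoring label sequence (Equation (13)) and its score."""
    n, k = P.shape
    delta = start + P[0]
    back = np.zeros((n, k), dtype=int)
    for i in range(1, n):
        cand = delta[:, None] + A          # cand[prev, cur]
        back[i] = cand.argmax(axis=0)      # best previous label for each current label
        delta = cand.max(axis=0) + P[i]
    delta = delta + end
    path = [int(delta.argmax())]
    for i in range(n - 1, 0, -1):          # follow the back-pointers
        path.append(int(back[i, path[-1]]))
    return path[::-1], float(delta.max())


tags = ["O", "B-COM", "I-COM"]
rng = np.random.default_rng(0)
P = rng.normal(size=(5, len(tags)))        # emission scores for five characters
A = rng.normal(size=(len(tags), len(tags)))
start, end = rng.normal(size=len(tags)), rng.normal(size=len(tags))

best_path, best_score = viterbi(P, A, start, end)
log_prob = best_score - log_partition(P, A, start, end)  # Equation (12) for the best path
print([tags[t] for t in best_path], log_prob)
```

The forward algorithm and Viterbi decoding share the same recursion; the former sums over previous labels in log space while the latter takes the maximum.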

5. Hydraulic Turbine Governor Fault Knowledge Graph Construction

The logical architecture of a knowledge graph consists of a schema layer and a data layer. There are three commonly used construction methods: top-down, bottom-up, and a combination of the two. This paper adopts the top-down method to construct the knowledge graph for hydraulic turbine speed control system faults. First, based on the characteristics of the fault text and the designed schema, entity extraction is performed using the BERT-WWM-BiLSTM-MHA-CRF named entity recognition model. Additionally, BERT-based methods are employed to achieve relationship and attribute extraction. Knowledge fusion is accomplished by calculating the cosine similarity between the entities to complete the construction of the entity layer. Then, a Neo4j graph database is used to store the knowledge and visualize the knowledge graph of hydraulic turbine speed control system faults. Finally, a knowledge update method is proposed for the real-time updating of the knowledge graph.

5.1. Entity Layer Construction

The entity layer represents the instantiation of the schema and consists of the actual fault text data from the hydraulic turbine speed control system. For the original structured data within the fault text, its normalized characteristics allow for direct instantiation according to the designed schema. For unstructured data, entity and relationship extraction, along with knowledge fusion, should be performed first before instantiation according to the schema.

5.1.1. Entity Extraction

For the entity extraction task, this paper employs the BERT-WWM-BiLSTM-MHA-CRF model built in Section 4 to extract four categories of key entity information from the unstructured text of the data source: components, abnormal phenomena, abnormal causes, and treatment measures. To facilitate the subsequent application of the knowledge graph, similar or related anomalies identified during extraction should be merged.
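Merging similar mentions by cosine similarity can be illustrated with the following Python sketch. The text does not specify how entities are vectorized, so this example assumes character bigram counts as a simple stand-in for entity embeddings; the threshold of 0.8 and the helper names are likewise hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Character n-gram counts of a mention, ignoring case and spaces (stand-in embedding)."""
    text = text.lower().replace(" ", "")
    return Counter(text[i:i + n] for i in range(max(len(text) - n + 1, 1)))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def fuse(mentions, threshold=0.8):
    """Greedily group mentions whose similarity to a cluster's first mention reaches the threshold."""
    clusters = []  # list of (canonical mention, member mentions)
    for mention in mentions:
        vec = char_ngrams(mention)
        for canonical, members in clusters:
            if cosine(vec, char_ngrams(canonical)) >= threshold:
                members.append(mention)
                break
        else:
            clusters.append((mention, [mention]))
    return clusters


clusters = fuse(["touch screen", "touchscreen", "oil pump motor"])
for canonical, members in clusters:
    print(canonical, members)
```

In practice the count vectors would likely be replaced by learned embeddings, with the threshold tuned on a sample of manually merged entities.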

5.1.2. Relationship Extraction

BERT is a pre-trained bidirectional language model that uses the Transformer architecture as its feature encoder. The Transformer is a deep network based on a self-attention mechanism, capable of capturing long-range features while also enabling efficient parallel computation. The R-BERT model is used to handle the Chinese relation extraction task by leveraging the pre-trained BERT language model. It incorporates information from the target entities by adding identifiers before and after each entity to indicate their locations, in accordance with BERT’s input requirements. The input sentence and the entity pair information are then combined into a single input sequence. BERT outputs the final hidden state vector for the [CLS] identifier and the hidden state vectors for the two target entities. These three vectors undergo linear or non-linear transformations before being classified by the softmax layer. The model structure is illustrated in Figure 8.

Given a sentence s containing entities e_{1} and e_{2}, suppose its final hidden state output by BERT is H, entity e_{1} has hidden vectors H_{i} to H_{j}, and entity e_{2} has hidden vectors H_{k} to H_{m}. The vectors of each entity are averaged, and an activation function and a fully connected layer are then applied, so that e_{1} and e_{2} are converted to H_{1}^{'} and H_{2}^{'}, as shown in Equations (14) and (15):

H_{1}^{'} = W_{1} [\tanh (\frac{1}{j - i + 1} \sum_{t = i}^{j} H_{t})] + b_{1}

(14)

H_{2}^{'} = W_{2} [\tanh (\frac{1}{m - k + 1} \sum_{t = k}^{m} H_{t})] + b_{2}

(15)

where W₁ and W₂ share parameters and b₁ and b₂ share parameters, i.e., W₁ = W₂ and b₁ = b₂. The # and $ are separators representing two entities, respectively. For the final hidden state vector represented by the first token ([CLS]), an activation function and a fully connected layer are also added, as shown in Equation (16):

H_{0}^{'} = W_{0} (\tanh (H_{0})) + b_{0}

(16)

where W_{0} \in R^{d \times d}, W_{1} \in R^{d \times d}, W_{2} \in R^{d \times d}, and d denotes the hidden_size of BERT.

H_{0}^{'}, H_{1}^{'}, and H_{2}^{'} are concatenated and passed through a fully connected layer and a softmax layer, as shown in Equation (17):

h^{''} = W_{3} [concat (H_{0}^{'}, H_{1}^{'}, H_{2}^{'})] + b_{3}, \quad p = softmax (h^{''})

(17)

where W_{3} \in R^{L \times 3 d}, L is the number of relations, p is the probability output, and b₀, b₁, b₂, b₃ are the bias vectors.
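The computation in Equations (14) to (17) can be sketched without any deep learning framework. The following is a minimal, dependency-free illustration, not the authors' implementation: the function names (`rbert_head`, `mean_vector`, `matvec`) are hypothetical, the hidden states H are assumed to be given as a list of d-dimensional vectors with [CLS] at position 0, and the weights would in practice be learned rather than passed in by hand.

```python
import math


def matvec(W, x, b):
    """Return W x + b, where W is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]


def mean_vector(H, start, end):
    """Average the hidden vectors H[start..end], inclusive on both ends."""
    span = H[start:end + 1]
    return [sum(col) / len(span) for col in zip(*span)]


def tanh_vec(x):
    return [math.tanh(v) for v in x]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def rbert_head(H, e1_span, e2_span, W0, b0, W1, b1, W3, b3):
    """Relation probabilities from BERT hidden states (Eqs. 14-17).

    H        : list of hidden vectors, H[0] is the [CLS] vector
    e1_span  : (i, j) token positions of entity 1, inclusive
    e2_span  : (k, m) token positions of entity 2, inclusive
    W1, b1   : shared by both entities, since W1 = W2 and b1 = b2
    """
    i, j = e1_span
    k, m = e2_span
    h0 = matvec(W0, tanh_vec(H[0]), b0)                    # Eq. (16)
    h1 = matvec(W1, tanh_vec(mean_vector(H, i, j)), b1)    # Eq. (14)
    h2 = matvec(W1, tanh_vec(mean_vector(H, k, m)), b1)    # Eq. (15)
    logits = matvec(W3, h0 + h1 + h2, b3)                  # Eq. (17), list concat
    return softmax(logits)
```

The returned list has one probability per relation category and sums to 1; the predicted relation is the index of the largest entry.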

A softmax classifier is used to predict the labels: based on the label score vector, the softmax layer computes the probability p_{t}^{i} (θ) that token t has label i, as shown in Equations (18) and (19):

p_{t}^{i} (θ) = \frac{\exp (y_{t}^{i})}{\sum_{j = 1}^{N_{t}} \exp (y_{t}^{j})}

(18)

y_{t} = W_{y} T_{t} + b_{y}

(19)

where N_{t} is the total number of tags, y_{t} is the model’s score for token t on all tags, W_{y} \in R^{N_{t} \times d} is the parameter matrix, and b_{y} \in R^{N_{t}} is the bias term.

5.2. Knowledge Fusion and Storage

Given that the accuracy of named entity recognition rarely reaches 100%, and considering that multiple representations of the same entity exist in the fault text, entity unification is necessary. The entity alignment task can be achieved by matching the text similarity of the entities obtained from knowledge extraction. Semantically similar entities have similar vector representations, and the cosine similarity algorithm measures the similarity between two vectors by calculating the cosine of the angle between them [21,22].

Therefore, this paper employs the cosine similarity algorithm for matching and fusion of entities. First, a dictionary is constructed based on the entity names from the hydraulic turbine speed control system fault knowledge graph. Then, the cosine similarity algorithm is applied to match each extracted entity against the entities in the dictionary. Let B_{1} and B_{2} represent the entity word vectors in the specialized lexicon and the output word vectors from the BERT-WWM model for the extracted entities, respectively. The cosine similarity of the two entities is computed as shown in Equation (20). The closer the cosine similarity is to 1, the more similar the two entities are; conversely, a cosine similarity closer to 0 indicates lower similarity. The dictionary entity with the highest similarity \cos θ to the extracted entity is selected for knowledge fusion. For instance, the entities “relay oscillation” and “relay jerk” are merged into a single entity, “relay oscillation,” after entity alignment.

\cos θ = \frac{B_{1} \cdot B_{2}}{| B_{1} | \times | B_{2} |}

(20)
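Equation (20) and the selection of the most similar dictionary entry can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the two-dimensional vectors in the usage note stand in for the BERT-WWM word vectors, which are not reproduced here.

```python
import math


def cosine_similarity(b1, b2):
    """Cosine of the angle between two equal-length vectors, Eq. (20)."""
    dot = sum(x * y for x, y in zip(b1, b2))
    norm1 = math.sqrt(sum(x * x for x in b1))
    norm2 = math.sqrt(sum(y * y for y in b2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm1 * norm2)


def best_match(entity_vector, lexicon):
    """Return (name, similarity) of the lexicon entry closest to the entity.

    lexicon maps a dictionary entity name to its word vector.
    """
    scored = (
        (name, cosine_similarity(entity_vector, vector))
        for name, vector in lexicon.items()
    )
    return max(scored, key=lambda pair: pair[1])
```

For example, with a toy lexicon `{"relay oscillation": [1.0, 0.0], "power sensor": [0.0, 1.0]}`, calling `best_match([0.9, 0.1], lexicon)` selects "relay oscillation" because its vector points in nearly the same direction as the query.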

Neo4j is a graph database designed for processing graph-structured data, offering superior data processing capabilities compared to traditional relational databases [23]. With the graph database model and the Cypher query language, relationships between entities are easy to represent and query, enabling complex data manipulation and analysis. Additionally, because the database stores data as a graph, the relationships between entities are shown explicitly, which makes the data more intuitive to inspect and analyze. The basic storage structure is shown in Figure 9. This paper also employs the Neo4j graph database to incrementally update the knowledge graph of hydraulic turbine speed control system faults. The steps are as follows:

(1): To address the newly appeared entity types in the fault text of the hydraulic turbine speed control system, the schema is updated after experts summarize the relationships between the newly added entities and the original entities, including any subordinate relationships;
(2): Knowledge extraction and fusion of the added entities are performed according to the schema to complete the update of the entity layer;
(3): The py2neo development framework is used to link to the Neo4j graph database, allowing for the addition and modification of the knowledge module based on the original knowledge graph to facilitate the update of the knowledge graph.
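Step (3) above can be illustrated by the Cypher statement that an incremental update would send to Neo4j. The sketch below only builds the statement text; the helper name `build_merge_statement` and the node labels in the usage note are hypothetical, and the `name` property is an assumed schema choice. With py2neo, such a statement would typically be executed through `Graph.run`, passing `head` and `tail` as query parameters.

```python
def build_merge_statement(head_label, rel_type, tail_label):
    """Build an idempotent Cypher upsert for one (head, relation, tail) triple.

    MERGE creates a node or relationship only if it does not exist yet, so
    re-running the statement for a known triple leaves the graph unchanged.
    Labels and relationship types cannot be Cypher parameters, so they are
    validated and interpolated; entity names stay as $head and $tail.
    """
    for identifier in (head_label, rel_type, tail_label):
        if not identifier.isidentifier():
            raise ValueError(f"unsafe Cypher identifier: {identifier!r}")
    return (
        f"MERGE (h:{head_label} {{name: $head}}) "
        f"MERGE (t:{tail_label} {{name: $tail}}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
```

For instance, `build_merge_statement("Component", "Emerged", "AbnormalPhenomenon")` yields a statement that links a component node to an abnormal phenomenon node through the "Emerged" relationship defined in Table 3.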

6. Example Analysis

The experiments were conducted under the Windows 11 operating system. The training framework was PyTorch 1.11.0 and the programming language was Python 3.11. The hardware configuration was as follows: the CPU was a 12th Gen Intel® Core™ i7-12700H @ 2.30 GHz (Intel Corporation, Santa Clara, CA, USA); the GPU was an NVIDIA GeForce GTX 1660 Ti (NVIDIA Corporation, Santa Clara, CA, USA); and the memory capacity was 16 GB. The training parameters of the models are outlined in Table 5.

Three evaluation metrics, precision rate (P), recall rate (R) and F1 value, are used in the experiments to assess the performance of the named entity recognition model.

The precision rate is the proportion of instances that are actually positive classes out of all instances recognized as positive classes by the model, as in Equation (21):

P = \frac{T_{P}}{T_{P} + F_{P}} \times 100 %

(21)

Recall is the proportion of instances that are correctly recognized as positive classes by the model out of all instances that are actually positive classes, as in Equation (22):

R = \frac{T_{P}}{T_{P} + F_{N}} \times 100 %

(22)

The F1 value is the harmonic mean of precision and recall, which tries to find a balance between the two. The F1 value is between 0 and 1, where 1 means perfect precision and recall and 0 means that at least one of the precision or recall is zero, which is calculated as in Equation (23):

F_{1} = \frac{2 P \times R}{P + R} \times 100 %

(23)

where T_{P}, F_{P} and F_{N} denote the number of entities correctly recognized by the model, the number of entities incorrectly recognized by the model and the number of entities not recognized by the model, respectively.
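As a worked illustration of Equations (21) to (23), the sketch below computes the three metrics from entity counts. The function names are hypothetical and the counts in the usage note are made up for illustration. Because P and R are already expressed in percent, the harmonic mean is also in percent.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (P, R, F1) in percent from entity counts.

    tp: entities correctly recognized
    fp: entities incorrectly recognized
    fn: entities not recognized
    """
    precision = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    recall = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from_percent(precision, recall)


def f1_from_percent(precision, recall):
    """Harmonic mean of precision and recall, both given in percent."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With 8 correct, 2 incorrect and 2 missed entities, `precision_recall_f1(8, 2, 2)` returns `(80.0, 80.0, 80.0)`. The same helper reproduces the reported values: the precision of 78.64% and recall of 88.39% given for the relation "Emerged" in Table 9 yield an F1 score of 83.23% after rounding.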

6.1. Knowledge Extraction Experiment Results and Analysis

Due to the current lack of publicly available datasets in the field of hydraulic turbine speed control system fault diagnosis, as well as the overall scarcity of diagnostic data for hydraulic turbine speed control systems in general, this study labels 14,065 records of fault text to construct a hydraulic turbine speed control system fault dataset. The ratio of the training set to the test set is 8:2, and comparative experiments are conducted to validate the effectiveness of the proposed method for entity extraction and relationship extraction in the domain of hydraulic turbine speed control system faults.

6.1.1. Comparison and Analysis of Entity Extraction Results with Traditional Entity Recognition Models

The proposed model is compared with the benchmark algorithm BERT-BiLSTM-CRF, with BERT-CRF and BiLSTM-CRF, and with the other variants listed in Table 6 on the different entity types. The evaluation metrics are precision P, recall R, and F1 score. The F1 scores of each model on the different entities are shown in Table 6 and Figure 10, while the training curves of the various models for the entity “components” are displayed in Figure 11.

As shown in Table 6 and Figure 10, the BERT-WWM-BiLSTM-MHA-CRF model achieves the highest F1 scores among the compared models on all four entity types (components, abnormal phenomena, abnormal causes, and treatment measures): 98.44%, 96.61%, 98.09%, and 95.99%, respectively. For the entity “abnormal causes,” the model achieves a precision of 97.82%, a recall of 98.37%, and an F1 score of 98.09%, indicating that it captures most of the correct entities (high recall) while keeping its predictions accurate (high precision). For “abnormal phenomena” and “treatment measures,” the model achieves precisions of 96.69% and 95.52%, recalls of 96.54% and 96.46%, and F1 scores of 96.61% and 95.99%, respectively; these values are slightly lower than those for “components” and “abnormal causes.” The highest F1 score is obtained for the entity “components,” with a precision of 97.64%, a recall of 99.25%, and an F1 score of 98.44%. A plausible explanation is that the textual descriptions of the components of the hydraulic turbine unit speed control system have a unified format and structure and employ standardized terminology, which gives the descriptions of different components a degree of similarity and commonality.

The model training curves in Figure 11 indicate that BERT-CRF and BiLSTM-CRF struggle to learn complex features. In contrast, the BERT-WWM-BiLSTM-MHA-CRF model learns more slowly in the early stages but achieves a higher F1 score upon convergence. Pre-training with whole-word masking helps the model capture complex contextual information, while the multi-head self-attention mechanism enables it to identify key information within intricate sequences.

6.1.2. Comparison and Analysis of Entity Extraction Results with Mainstream Large Language Models

To better evaluate the entity extraction performance of the BERT-WWM-BiLSTM-MHA-CRF model on hydraulic turbine speed control system fault texts, this paper uses the self-constructed fault dataset with precision (P), recall (R), and F1 score as evaluation metrics. Four models are selected for a comparative analysis of entity extraction results: the proposed BERT-WWM-BiLSTM-MHA-CRF deep learning model; the state-of-the-art (SOTA) large language model ChatGPT-4.0 released by OpenAI; the Qwen-2.5 large language model developed by Alibaba, optimized for Chinese tasks; and DS-Base, the native DeepSeek-R1-Distill-Qwen32B large language model used without a prompt strategy or domain fine-tuning. The model configurations and training details are presented in Table 7, while Table 8 and Figure 12 compare the F1 scores of each model on the different entities.

According to Table 8 and Figure 12, the BERT-WWM-BiLSTM-MHA-CRF model outperforms the DS-Base, ChatGPT-4.0, and Qwen-2.5 large language models on all four categories of entities: components, abnormal phenomena, abnormal causes, and treatment measures. For the entity “abnormal causes,” its F1 score is 3.86 percentage points higher than that of Qwen-2.5, the best-performing large language model (98.09% versus 94.23%); the margin is smallest for “treatment measures” (95.99% versus 95.23%). This improvement can be attributed to the model’s architecture, which is specifically designed for entity recognition tasks, enabling it to focus on the entity features within the input data and generate more targeted embeddings and predictions. Compared with the three large language models, the proposed model therefore adapts better to the specialized domain addressed in this study.

6.1.3. Relationship Extraction Results

Using the self-constructed hydraulic turbine speed control system fault relationship dataset, we obtained the experimental results presented in Table 9 and Figure 13. The R-BERT model used in this paper performs well in relationship extraction for hydraulic turbine speed control system faults. The F1 scores for the relationships “Emerged” and “Corresponds” reach 83.23% and 84.68%, respectively. As Table 9 shows, “Emerged” has a high recall (88.39%) but a lower precision (78.64%), whereas “Corresponds” has a higher precision (87.92%) than recall (81.67%). The highest F1 score is obtained for the relation “Due to,” which reaches 88.21% with balanced precision (87.16%) and recall (89.28%), indicating that the model is particularly accurate and comprehensive in recognizing this relationship.

6.2. Knowledge Fusion Experimental Results

To validate the effectiveness of the knowledge fusion method, this study extracted 50 entities with irregular naming from the fault text of the turbine speed control system. Additionally, 50 entities with incomplete names were manually created. The 100 irregular entities were then subjected to a matching analysis with the entities found in a specialized dictionary for the hydropower industry. Some of the matching results for these entities are shown in Table 10.

Using the entity extraction model, word vectors for the irregular entities and the candidate entities to be matched from the dictionary were obtained. Cosine similarity was then calculated based on Equation (20) to identify the entity with the highest similarity. Experimental statistics reveal that out of the 100 irregular entities, 94 entities were correctly matched, indicating a significant degree of entity alignment and facilitating the knowledge fusion of the knowledge graph for turbine speed control system faults.

6.3. Fault Knowledge Graph Construction Results for the Hydraulic Turbine Speed Control System

In this paper, the construction of the knowledge graph is conducted using a top-down approach. To illustrate the constructed knowledge graph architecture more clearly, the proportional valve failure is used as an example of schema instantiation, with the Neo4j graph database utilized for visual display, as shown in Figure 14.

Figure 14 includes ten types of entities: equipment, components, abnormal phenomena, abnormal causes, treatment measures, fault types, work items, process requirements, working principles, and key links. Among them, four types of entities (components, abnormal phenomena, abnormal causes, and treatment measures) are constructed from unstructured text through knowledge extraction and knowledge fusion. The relationships between these entities are defined using attribute values, thereby strengthening the semantic rules among them. The remaining six types of entities are derived from structured data and are instantiated in the knowledge graph directly.

The structure of part of the fault knowledge graph for the hydraulic turbine speed control system is illustrated in Figure 15. In the constructed knowledge graph, different entity types are represented by distinct colors, making them easier to distinguish, and the relationships between entities are indicated by connecting lines. The completed fault knowledge graph clearly shows the complex relationships among different fault entities.

Each circular node in Figure 15 represents a specific entity, and these entities are interconnected according to defined relationship rules, forming the fault knowledge graph of the hydraulic turbine unit speed control system. The color of the entities corresponds to their categories within the schema layer, including a total of ten categories: equipment, components, abnormal phenomena, abnormal causes, treatment measures, fault types, work items, process requirements, working principles, and key links.

To verify the correctness and rationality of the constructed knowledge graph, this paper uses the components entity “inverter” and the abnormal phenomena entity “unstable output voltage” as search conditions, querying the knowledge graph with a Cypher statement for related entities. The Cypher statement retrieves the abnormal causes entity and the treatment measures entity associated with the specified entities. The query results are as follows: the cause of the abnormality is “input voltage fluctuation,” and the solution is “check and stabilize the input voltage.” These results are highly credible and significant, providing support for fault diagnosis and auxiliary decision-making in the hydraulic turbine speed control system.
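The query described above follows the relationship chain defined in Table 3 (component, abnormal phenomenon, abnormal cause, treatment measure). The sketch below reproduces that traversal on a small in-memory triple list so that the logic can be checked without a database. The node labels in the Cypher string, the relationship spelling `Due_To`, and the helper names are assumptions for illustration, not the statement used in the paper.

```python
# Triples (head, relation, tail) for the example in the text.
TRIPLES = [
    ("inverter", "Emerged", "unstable output voltage"),
    ("unstable output voltage", "Due_To", "input voltage fluctuation"),
    ("input voltage fluctuation", "Corresponds",
     "check and stabilize the input voltage"),
]

# Assumed Cypher equivalent of diagnose(); labels are illustrative only.
CYPHER = (
    "MATCH (c:Component {name: $component})-[:Emerged]->"
    "(p:AbnormalPhenomenon {name: $phenomenon})-[:Due_To]->"
    "(cause:AbnormalCause)-[:Corresponds]->(m:TreatmentMeasure) "
    "RETURN cause.name AS cause, m.name AS measure"
)


def tails(triples, head, relation):
    """All tail entities linked to `head` by `relation`."""
    return [t for h, r, t in triples if h == head and r == relation]


def diagnose(triples, component, phenomenon):
    """Return (cause, measure) pairs for a component and its phenomenon."""
    if phenomenon not in tails(triples, component, "Emerged"):
        return []  # the phenomenon is not recorded for this component
    results = []
    for cause in tails(triples, phenomenon, "Due_To"):
        for measure in tails(triples, cause, "Corresponds"):
            results.append((cause, measure))
    return results
```

Calling `diagnose(TRIPLES, "inverter", "unstable output voltage")` returns the single pair ("input voltage fluctuation", "check and stabilize the input voltage"), matching the query result reported in the text.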

6.4. Application Framework for Fault Knowledge Mapping of Hydraulic Turbine Speed Control System

This paper introduces a comprehensive knowledge graph for faults in hydro turbine governor systems built with the BERT-WWM-BiLSTM-MHA-CRF model, which covers ten critical dimensions:

(1): Equipment, encompassing various types of hydro turbines (such as Kaplan, Francis, and axial-flow turbines) and their corresponding governor systems, which helps maintenance personnel quickly understand the functions and characteristics of different equipment;
(2): Components, covering key elements such as generators, governors, booster pumps, and sensors, which facilitates effective fault identification and analysis;
(3): Anomalies, covering phenomena such as speed fluctuations, abnormal vibrations, and elevated noise levels, which assists in the rapid identification of potential issues;
(4): Root Causes, which explore the fundamental origins of faults, including mechanical failures, electrical malfunctions, and operational errors;
(5): Mitigation Measures, which offer maintenance personnel actionable methods and guidance for responding to faults, including preventive measures and protocols;
(6): Fault Types, which categorize the various faults to enhance identification efficiency;
(7): Maintenance Tasks, emphasizing the importance of routine inspections and periodic maintenance;
(8): Technical Specifications, which standardize equipment installation and operational protocols to ensure safety and reliability;
(9): Operational Principles, which provide detailed explanations of the fundamental mechanisms governing hydro turbines and their governor systems, thereby facilitating fault analysis;
(10): Critical Nodes, which highlight essential focal points during fault diagnosis and response, such as monitoring key sensor statuses and conducting real-time data analysis, thereby ensuring prompt communication and efficient handling of fault information.

By integrating these elements, the knowledge graph offers comprehensive informational support to maintenance personnel, not only enhancing fault identification and handling efficiency but also providing effective guidance for equipment management and maintenance in practical applications, ultimately improving overall system reliability and operational efficiency. The model’s performance in extracting entities and relations from actual inspection reports is illustrated in Table 11.

The fault conditions of the hydraulic turbine speed control system are complex, and fault handling relies heavily on the professional knowledge and experience of on-site operation, maintenance, and repair personnel, which leads to low fault-handling efficiency and insufficient accuracy. In this paper, a knowledge graph is constructed to derive diagnostic results and assist decision-making based on fault phenomena, providing references for operations and maintenance (O&M) personnel. The application process is illustrated in Figure 16.

When professionals identify defects in the hydraulic turbine unit speed control system through operational inspections or monitoring signals, the BERT-WWM-BiLSTM-MHA-CRF model is used for entity identification to extract defective devices and phenomenon entities. Cosine similarity is then used to compute the matching degree between the extracted entities and the existing entities in the knowledge graph, with thresholds for entity linking set according to expert experience. Entities exceeding the threshold are matched with cases in the knowledge graph using the Cypher language, which helps infer possible defect levels, affected parts, and root causes, and handling strategies are then provided to support operation and maintenance. Entities below the threshold are analyzed and processed by experts, who update the fault knowledge graph of the hydraulic turbine speed control system according to the established knowledge updating procedure.
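The threshold-based routing in this application process can be sketched as follows. The function name and the threshold value of 0.9 are hypothetical (the paper states only that thresholds are set from expert experience), and treating a similarity exactly equal to the threshold as linked is an assumption of this sketch. The similarity scores in the usage note are taken from Table 10.

```python
def route_entities(scored_entities, threshold):
    """Split extracted entities into linked ones and ones needing expert review.

    scored_entities: iterable of (extracted_name, best_graph_match, similarity)
    threshold: minimum cosine similarity required for automatic linking
    Returns (linked, review): linked is a list of (extracted, match) pairs that
    can be used to query the knowledge graph; review is a list of extracted
    names to be handed to experts for analysis and a possible graph update.
    """
    linked, review = [], []
    for extracted, match, similarity in scored_entities:
        if similarity >= threshold:
            linked.append((extracted, match))
        else:
            review.append(extracted)
    return linked, review
```

With the scores `("sensor", "power sensor", 0.964)` and `("cpu module", "CPU module", 0.875)` and an assumed threshold of 0.9, the first entity is linked to the graph and the second is passed to expert review.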

7. Conclusions

A domain knowledge graph construction method based on the BERT-WWM-BiLSTM-MHA-CRF model is proposed for hydraulic turbine speed control system faults, along with the establishment of its application framework.

(1): The fault text of the hydraulic turbine speed control system is characterized, and unstructured data are collected and annotated based on the analysis results, which improves the quality and utilization of the fault text training dataset.
(2): The BERT-WWM-BiLSTM-MHA-CRF model is constructed, offering improved entity extraction capabilities and accurately recognizing entity information in the fault text of the hydraulic turbine speed control system compared to the BERT-BiLSTM-CRF model and mainstream large language models. Additionally, a relationship extraction method based on BERT is proposed, which works together with the entity recognition model to complete the knowledge extraction task for the fault text.
(3): A method for updating the fault knowledge graph of the hydraulic turbine speed control system and an application process for using the knowledge graph in assisted decision-making are proposed. This method can infer the faulty parts, causes, and treatment measures from the fault content, providing references for field operation and maintenance personnel and improving the operation and maintenance level of the hydraulic turbine speed control system.

Author Contributions

Conceptualization, S.L.; methodology, S.L. and K.Z.; validation, K.Z.; investigation, Z.W.; data curation, T.Z.; writing—original draft preparation, X.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the PhD Research Start-Up Foundation of Hubei University of Technology (No. BSDQ2020023).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Singhal, A. Introducing the knowledge graph: Things, not strings. Official Google Blog, 2 May 2012; 5.
2. Zhang, D.; Liu, Z.; Jia, W.; Liu, H.; Tan, J. An overview of the current research status and application prospect of knowledge graph in intelligent manufacturing. J. Mech. Eng. 2021, 57, 90–113.
3. Liu, Q.; Li, Y.; Duan, H.; Liu, Y.; Qin, Z. A review of knowledge graph construction techniques. Comput. Res. Dev. 2016, 53, 582–600.
4. Shu, J.; Yang, T.; Geng, Y.; Yu, J. A Joint Extraction Method for Overlapping Entity Relationships in Power Knowledge Graph Construction. High Volt. Technol. 2024, 50, 4912–4922.
5. Yang, H.; Yu, H.; Sun, Z.; Liu, J.; Yang, H.; Zhang, S.; Sun, H.; Jiang, X.; Yu, Y. Entity-Relationship Extraction for Fisheries Standards Based on a Dual Attention Mechanism. Trans. Chin. Soc. Agric. Eng. 2021, 37, 204–212.
6. Hao, Z.; Liu, C.; Qin, L. Joint Entity and Relationship Extraction from Food Inspection Announcements Based on Chinese Character Adjacency Graphs. Trans. Chin. Soc. Agric. Eng. 2023, 39, 283–292.
7. Deng, N.; Yu, Z.; Dan, W.; Chen, X.; Liu, S. A Joint Entity-Relationship Extraction Model for Traditional Chinese Medicine Patent Texts Integrating Semantic Features and Multi-layer Cross-attention Mechanism. Data Anal. Knowl. Discov. 2025, 9, 141–153.
8. Zhang, Y.; Li, S. A Joint Entity-Relationship Extraction Model for Apple Cultivation Domain in Low-resource Scenarios. Trans. Chin. Soc. Agric. Eng. 2024, 40, 188–195.
9. Ibrahim, N.; Aboulela, S.; Ibrahim, A.; Kashef, R. A survey on augmenting knowledge graphs (KGs) with large language models (LLMs): Models, evaluation metrics, benchmarks, and challenges. Discov. Artif. Intell. 2024, 4, 76–103.
10. Edge, D.; Trinh, H.; Cheng, N.; Bradley, J.; Chao, A.; Mody, A.; Truitt, S.; Metropolitansky, D.; Ness, R.O.; Larson, J. From local to global: A graph rag approach to query-focused summarization. arXiv 2024.
11. Gutiérrez, B.J.; Shu, Y.; Qi, W.; Zhou, S.; Su, Y. From RAG to memory: Non-parametric continual learning for large language models. In Proceedings of the Forty-Second International Conference on Machine Learning, Vancouver, BC, Canada, 13–19 July 2025; pp. 1–19.
12. Wei, Y.; Ren, Y.; Zhao, H.; Fan, J.; Fang, W.; Yan, S. Construction of a knowledge graph for selection and breeding research of new potato varieties in China based on GraphRAG. J. Plant Genet. Resour. 2025, 26, 1229–1241.
13. Yu, T.; Li, J.; Yu, Q.; Tian, Y.; Sun, X.; Xu, L.; Zhang, Z. Construction and application of knowledge mapping for Chinese medicine and health care. China Digit. Med. 2017, 12, 64–66.
14. Chen, Q. Construction and Application of Knowledge Graph for Tobacco Rebaking. Master’s Thesis, University of Electronic Science and Technology, Chengdu, China, 2022.
15. Ding, H.-C.; Wang, Z.-M. Ontology-based construction and application of Chinese knowledge map for oil tea. World For. Res. 2020, 33, 50–55.
16. Nie, T.; Zeng, J.; Cheng, Y.; Ma, L. Knowledge graph construction technique and application for aircraft power system fault diagnosis. J. Aeronaut. 2022, 43, 46–62.
17. Han, X.; Wang, Z. Research on intelligent auxiliary diagnosis system for thermal power units. Therm. Power Gener. 2021, 50, 8–14.
18. Xu, Q. CNC Fault Diagnosis System Based on Knowledge Graph. Master’s Thesis, Beijing University of Posts and Telecommunications, Beijing, China, 2020.
19. Wang, S.; Sheng, S.; Wang, Y.; Zhai, Y.; Chen, K. Hydraulic Turbine Fault Diagnosis Based on ICEEMDAN-VMD and a Simple Recurrent Unit Combined with Attention Mechanism. Chin. Rural. Water Conserv. Hydropower 2024, 11, 166–173+178. Available online: https://link.cnki.net/urlid/42.1419.tv.20240514.1034.012 (accessed on 15 May 2024).
20. Sun, S.; Li, B.; Nie, X. Research on Hydraulic Turbine Fault Diagnosis Based on Bayesian Networks. J. Hydraul. Power Energy Sci. 2023, 41, 190–194.
21. Zhao, X.; Jia, Y.; Li, A.; Chang, C. A review of research on multi-source knowledge fusion technology. J. Yunnan Univ. (Nat. Sci. Ed.) 2020, 42, 459–473.
22. Alokaili, A.; Menai, M.E.B. SVM ensembles for named entity disambiguation. Computing 2020, 102, 1051.
23. Rong, F.; Qu, Y.; Zhang, Y.; Tong, X.; Hu, J. Analyzing the ideas and characteristics of Hu Jingqing’s treatment of dementia by constructing a knowledge graph based on Neo4j. World Sci. Technol.-Mod. Tradit. Chin. Med. 2023, 25, 826–834.

Figure 1. Flowchart for constructing the fault knowledge map of hydraulic turbine unit speed control system.

Figure 2. Schema of knowledge mapping of faults in the speed control system.

Figure 3. BERTWWM-BiLSTM-MHA-CRF model.

Figure 4. Structure of the BERT model.

Figure 5. LSTM model structure.

Figure 6. Structure of the BiLSTM model.

Figure 7. Calculation flow of the scaled dot-product attention mechanism.

Figure 8. Relational extraction model based on BERT.

Figure 9. Schematic diagram of Neo4j basic storage structure.

Figure 10. Comparison of entity F₁ values for traditional entity recognition models.

Figure 11. Model training curve.

Figure 12. Comparison of entity F1 scores with mainstream large language models.

Figure 13. Values of each metric for different models on the self-constructed dataset.

Figure 14. Graph Schema instantiation.

Figure 15. Knowledge mapping of some turbine unit speed control system failures.

Figure 16. Knowledge graph based auxiliary decision making for hydraulic turbine unit speed control system.

Table 1. Example of fault text.

Fault Text

1. The connecting rod of the main connector sensor is detached or the circuit is faulty, the governor regulation mode is cut from the power mode to the open mode, and the report is “guide vane following fault”, check the connecting rod on the spot or check the signal circuit.
2. Closed-loop dead zone and gain coefficient is not set properly, the unit grid-connected operation, the receiver occurs small frequent jerking, appropriate to reduce the closed-loop gain coefficient, increase the closed-loop starting and stopping dead zone, through the test to find the appropriate PID parameters, can be appropriately reduce the speed of regulation.
3. Failure of the unit frequency measurement circuit and the control program does not accurately determine the Frequent fluctuations in the unit frequency during grid-connected operation lead to fluctuations in the unit load. Observe the unit frequency signal and unwire the frequency measuring circuit for abnormal fluctuations.
4. The main proportional valve or guide valve is stuck, and the unit is overspeed during start-up, check the historical data, analyze the action of the main proportional valve guide spool, find out the point of stuckness, contact the host division to replace it, and check the oil quality.

Table 2. Sample labeling.

English Translation	Fault Text	Entity Labeling
touchscreen	触	B-COM
	摸	I-COM
	屏	I-COM
suddenly	突	O
suddenly	然	O
to become unresponsive	卡	B-ABP
to become unresponsive	死	I-ABP
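The character-level labels in Table 2 follow a B/I/O scheme, where a B- tag opens an entity, I- tags continue it, and O marks characters outside any entity. A minimal sketch of how such a tag sequence can be decoded back into entity spans is given below; the function name `decode_bio` is hypothetical, and the handling of malformed sequences (an I- tag without a matching B- tag is discarded) is a choice made for this sketch.

```python
def decode_bio(chars, tags):
    """Group characters into (text, type) entities from B-/I-/O tags."""
    entities = []
    current_text, current_type = "", None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new entity starts; close the previous one if still open.
            if current_type is not None:
                entities.append((current_text, current_type))
            current_text, current_type = ch, tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the currently open entity.
            current_text += ch
        else:
            # "O", or an I- tag that does not continue the open entity.
            if current_type is not None:
                entities.append((current_text, current_type))
            current_text, current_type = "", None
    if current_type is not None:
        entities.append((current_text, current_type))
    return entities
```

Applied to the sample in Table 2, the characters of "触摸屏突然卡死" with tags B-COM, I-COM, I-COM, O, O, B-ABP, I-ABP decode to the component entity "触摸屏" (touchscreen) and the abnormal phenomenon entity "卡死" (to become unresponsive).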

Table 3. Relationship category definitions.

Type of Header Entity	Tail Entity Type	Relationship Category	Define
components	abnormal phenomenon	Emerged	Emerged
abnormal phenomenon	abnormal causes	Due To	Due To
abnormal causes	Treatment measures	Corresponds	Corresponds
-	-	irrelevant	NA

Table 4. Comparison of different masking strategies.

Masking Method	Example Text
original text	the touchscreen of the control cabinet is frozen
mask of a single character	the control [MASK] touch [MASK] frozen
full-word mask	[MASK] [MASK] [MASK] touchscreen [MASK] [MASK]

Table 5. Training parameters of the model.

Parameter Type	Detailed Configuration
Epoch	20
Batch_size	32
Learning_rate	0.00003
LSTM_units	128
weight_decay	0.01
Max_length	512

Table 6. The results of traditional entity recognition models.

Model	Components F1 Score/%	Abnormal Phenomena F1 Score/%	Abnormal Causes F1 Score/%	Treatment Measures F1 Score/%
BiLSTM-CRF	91.96	91.10	91.21	90.11
BERT-CRF	93.95	91.75	91.69	91.42
BERT-BiLSTM-CRF	96.82	92.92	95.71	92.23
BERT-BiLSTM-MHA-CRF	97.83	95.51	97.21	94.80
BERT-WWM-BiLSTM-CRF	97.93	95.64	96.71	93.90
BERT-WWM-BiLSTM-MHA-CRF	98.44	96.61	98.09	95.99

Table 7. Model Configurations and Training Details.

Model	Parameter Scale	Input Method	Invocation/Training Method
BERT-WWM-BiLSTM-MHA-CRF	150 M	Text encoded by BERT-WWM as embedding input	Local full-parameter training, trained for 20 epochs
ChatGPT-4.0	Officially undisclosed	Structured message list input (JSON format)	API call, tailored prompt specifications by task type
Qwen-2.5	32 B	Plain text prompt input or dialogue history list input	API call, using the instruction-optimized version, tailored prompt specifications by task type
DS-Base	32 B	Raw text input	No prompt strategy or domain fine-tuning

Table 8. The results of mainstream large language models.

Model	Components F1 Score/%	Abnormal Phenomena F1 Score/%	Abnormal Causes F1 Score/%	Treatment Measures F1 Score/%
DS-Base	85.62	83.24	83.92	82.81
ChatGPT-4.0	90.26	88.64	89.62	91.08
Qwen-2.5	93.68	92.89	94.23	95.23
BERT-WWM-BiLSTM-MHA-CRF	98.44	96.61	98.09	95.99

Table 9. Comparison of evaluation metrics across relationship categories.

Relationship Category	Precision/%	Recall/%	F1 Score/%
Emerged	78.64	88.39	83.23
Due_To	87.16	89.28	88.21
Corresponds	87.92	81.67	84.68
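
The F1 scores in Table 9 follow from the listed precision and recall through the standard harmonic-mean definition, F1 = 2PR / (P + R). The short check below recomputes them; the function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %) per relationship category, copied from Table 9.
table9 = {
    "Emerged": (78.64, 88.39),
    "Due_To": (87.16, 89.28),
    "Corresponds": (87.92, 81.67),
}

for name, (p, r) in table9.items():
    print(f"{name}: {f1_score(p, r):.2f}")
```

The recomputed values round to 83.23, 88.21, and 84.68, which agrees with the F1 column of Table 9.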

Table 10. Partial entity matching results.

Irregular Entities	Word Vectors of Irregular Entities	Matched Entities	Word Vectors of Matched Entities	Cosine Similarity	True/False
sensor	(0.269, 0.015, −0.007, …, 0.046)	power sensor	(−0.186, −0.012, −0.047, …, 0.106)	0.964	true
power supply device	(0.703, −0.018, −0.007, …, 0.017)	power transfer switch	(0.431, 0.013, −0.007, …, 0.059)	0.978	true
cpu module	(0.764, −0.087, −0.019, …, 0.052)	CPU module	(0.333, 0.029, 0.049, …, −0.046)	0.875	false
the operation light is off	(0.342, −0.015, −0.001, …, 0.033)	the operation light is not illuminated	(0.679, −0.067, −0.058, …, 0.126)	0.926	true
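
The matching decision in Table 10 rests on the cosine similarity between two word vectors. The sketch below shows that computation on toy vectors. The decision threshold of 0.9 is an assumption made only for this example (it is consistent with the rows shown, but it is not stated in the table), and the function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention used here for zero vectors
    return dot / (norm_u * norm_v)


def is_match(u: Sequence[float], v: Sequence[float], threshold: float = 0.9) -> bool:
    """Treat two entities as the same when similarity reaches the (assumed) threshold."""
    return cosine_similarity(u, v) >= threshold


# Toy 2-dimensional vectors, not the embeddings from Table 10.
print(is_match([1.0, 0.1], [1.0, 0.2]))  # nearly parallel vectors
print(is_match([1.0, 0.0], [1.0, 1.0]))  # 45 degrees apart
```

Because cosine similarity depends only on direction, two vectors with quite different components can still score highly, so the chosen threshold largely determines which irregular entities are merged.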

Table 11. Sample entity and relation extraction results of the model on actual inspection reports.

Sample Inspection Report	Entity	Relation
In the 31F unit’s shutdown state, the active power mechanical meters on the governor’s electrical cabinet (PG1) and the hydraulic system control cabinet both showed −100 MW. Please check and address this issue. The inspection work for the active power mechanical meter anomaly of the 31F governor has been completed. It was found that the abnormal reading of the active power meter for the 31F unit was due to a mismatch with the zero point in the PLC program. The zero point has now been corrected. After this adjustment, the power meter correctly reflects the simulated power of the unit, displaying normal values. The work area has been cleared, and personnel have evacuated.	{“Equipment”: [(“Governor”, 3, 5)], “Components”: [(“Mechanical Meter”, 10, 12)], “Anomalous Phenomena”: [(“Mismatch with Zero Point”, 149, 153)], “Mitigation Measures”: [(“Correct Zero Point”, 157, 160)]}	[(“Governor”, “Mechanical Meter”, “includes”), (“Mismatch with Zero Point”, “Indication Meter Anomaly”, “caused by”), (“Mismatch with Zero Point”, “Correct Zero Point”, “action taken”)]
The cushion fastening bolts for the pressure oil pipe of the 14F governor relay (located at the turbine instrument cabinet) have loosened, causing the cushion to separate from the pipeline and resulting in failure of the cushion. Please check and address this issue.	{“Equipment”: [(“Governor”, 3, 6)], “Components”: [(“Relay Pressure Oil Pipe”, 7, 15)], “Anomalous Phenomena”: [(“Loose Cushion Fastening Bolts”, 16, 23)], “Fault Types”: [(“Cushion Separation from Pipeline, Cushion Failure”, 74, 85)], “Anomalous Causes”: [(“Loose or Detached Fasteners”, 160, 168)]}	[(“Governor”, “Relay Pressure Oil Pipe”, “includes”), (“Loose Cushion Fastening Bolts”, “Cushion Separation from Pipeline, Cushion Failure”, “caused by”)]
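
Triples of the form (head, tail, relation), as in the last column of Table 11, can be turned into Cypher `MERGE` statements before being written to Neo4j. The following sketch only builds the query text and its parameters; it is not the authors' storage code. The node label `Entity`, the property `name`, and both function names are assumptions made for illustration.

```python
import re
from typing import Dict, Tuple


def to_rel_type(label: str) -> str:
    """Convert a free-text relation label into an upper-case Cypher relationship type."""
    rel = re.sub(r"[^A-Za-z0-9]+", "_", label.strip()).strip("_").upper()
    if not rel:
        raise ValueError(f"cannot derive a relationship type from {label!r}")
    return rel


def triple_to_cypher(head: str, tail: str, relation: str) -> Tuple[str, Dict[str, str]]:
    """Build a parameterized MERGE query for one (head, tail, relation) triple.

    Entity names are passed as parameters. The relationship type cannot be a
    parameter in Cypher, so it is sanitized and interpolated into the text.
    """
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{to_rel_type(relation)}]->(t)"
    )
    return query, {"head": head, "tail": tail}


# First triple of the first sample in Table 11.
query, params = triple_to_cypher("Governor", "Mechanical Meter", "includes")
print(query)
print(params)
```

Using `MERGE` rather than `CREATE` means that repeated mentions of the same entity, such as the governor in both sample reports, map to a single node. The query and parameters could then be passed to a Neo4j driver session.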

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Liu, S.; Zhang, K.; Zhang, T.; Wang, Z.; Ai, X. Fault Knowledge Graph Construction Method for Hydraulic Turbine Speed Control System Based on BERTWWM-BiLSTM-MHA-CRF Model. Appl. Sci. 2025, 15, 12377. https://doi.org/10.3390/app152312377

