Construction and Application of Knowledge Graph for Power Grid New Equipment Start-Up

Tang, Wei; Zhang, Yue; Mao, Xun; Jia, Hetong; Lv, Kai; Shan, Lianfei; Qiao, Yongtian; Jiang, Tao

doi:10.3390/en18205471

by Wei Tang ¹, Yue Zhang ²,³,*, Xun Mao ¹, Hetong Jia ²,³, Kai Lv ¹, Lianfei Shan ²,³, Yongtian Qiao ²,³ and Tao Jiang ²,³

¹ State Grid Anhui Electric Power Research Institute, Hefei 230601, China
² NARI Group Corporation Co., Ltd., (State Grid Electric Power Research Institute Co., Ltd.), Nanjing 211106, China
³ Beijing Kedong Electric Power Control System Co., Ltd., Beijing 100192, China
* Author to whom correspondence should be addressed.

Energies 2025, 18(20), 5471; https://doi.org/10.3390/en18205471

Submission received: 6 July 2025 / Revised: 1 September 2025 / Accepted: 4 September 2025 / Published: 17 October 2025

(This article belongs to the Special Issue Digital Modeling, Operation and Control of Sustainable Energy Systems)

Abstract

To address the lack of effective risk-identification methods during the commissioning of new power grid equipment, we propose a knowledge graph construction approach for both scheme generation and risk identification. First, a gated attention mechanism fuses textual semantics with knowledge embeddings to enhance feature representation. Then, by introducing a global memory matrix with a decay-factor update mechanism, long-range dependencies across paragraphs are captured, yielding a domain-knowledge-augmentation universal information-extraction framework (DKA-UIE). Using the DKA-UIE, we learn high-dimensional mappings of commissioning-scheme entities and their labels, linking them according to equipment topology and risk-identification logic to build a commissioning knowledge graph for new equipment. Finally, we present an application that utilizes this knowledge graph for the automated generation of commissioning plans and risk identification. Experimental results show that our model achieves an average precision of 99.19%, recall of 99.47%, and an F₁-score of 99.33%, outperforming existing methods. The resulting knowledge graph effectively supports both commissioning-plan generation and risk identification for new grid equipment.

Keywords:

new equipment start-up; memory matrix; knowledge augmentation; UIE framework; knowledge graph; risk identification

1. Introduction

Start-up plans for new equipment are currently compiled largely by hand. Manual compilation is not only inefficient [1,2,3] but also prone to causing “misoperation” in the power grid, which increases the risk of putting new equipment into operation. As the new power system develops, the amount of newly added equipment in each region grows every year, and commissioning each new piece of equipment requires both a primary power transmission plan and a secondary protection plan. However, because data systems are heterogeneous across regions and compilation habits differ, it is difficult to establish a unified, standardized compilation system for new equipment start-up schemes, which limits both compilation efficiency and the level of standardization. Constructing a knowledge-driven model suited to new equipment start-up scenarios therefore has significant academic and practical value.

Knowledge graphs take the “head entity–relation–tail entity” triple as their basic unit, construct semantic relation networks through ordered connections, and offer powerful relation representation and reasoning capabilities [4,5,6]. Because they represent the topological structure of the power grid accurately, knowledge graphs are gradually becoming a core technology in the power grid field. Generating high-quality triples is central to knowledge graph construction, and for semi-structured or unstructured data this depends heavily on effective knowledge extraction methods. At present, the mainstream knowledge extraction methods include rule templates, syntactic analysis, and deep learning [7,8,9]. References [10,11] parse power system entities using regular expressions; however, this approach places strict requirements on the format of the input text, generalizes poorly, and has a high labor cost. Reference [12] proposes a BiLSTM-CRF extraction method based on multi-model classification that transforms unstructured historical start-up scheme data into structured data. Reference [13] integrates multi-dimensional feature information and proposes an entity extraction method for new equipment start-up schemes based on MIL-BILSTM + Attention, which improves the accuracy of entity recognition. Reference [14] proposes a fault-plan entity-recognition method based on a general information extraction framework, achieving high-precision plan entity extraction.
These methods have achieved good results in parsing power grid documents, but two main problems remain. First, in the text vectorization stage, existing models mostly rely on the semantic encoding of word representations and fail to integrate the “equipment-relationship-equipment” triple associations of the power grid topology, so the representations of new equipment entities in the high-dimensional semantic space lack structural constraints. Accuracy drops significantly for out-of-vocabulary or variant-named entities in particular. Second, traditional information extraction usually adopts a static attention mechanism, which struggles to model context dependencies across sentences and paragraphs. This limits the ability to capture long-range semantic associations in complex texts and thereby reduces the accuracy of entity alignment and logical relationship extraction.

The UIE framework [15,16,17] demonstrates excellent performance in information extraction and possesses strong semantic modeling and generalization capabilities. In response to the above problems, this paper proposes a domain-knowledge-augmentation universal information-extraction framework (DKA-UIE), and its core innovation points include the following:

(1): A power grid topology knowledge vector library is constructed based on TransR embedding, and a knowledge-guided dynamic feature fusion mechanism is designed to fuse semantic and structural features. Specifically, related knowledge is recalled by Euclidean distance, and the semantic features are fused with the recalled triple knowledge vectors through gated attention, which strengthens the semantic representation of new equipment entities and makes their recognition more robust to out-of-vocabulary words and variant descriptions.
(2): A hierarchical memory-augmented attention architecture is proposed. A learnable global memory matrix and a dynamic update mechanism controlled by decay factors provide persistent modeling and selective retention of context information across sentences and paragraphs in long texts, enhancing the model’s ability to integrate information in complex semantic scenarios.

The proposed method provides a solid foundation for semantic extraction in constructing a knowledge graph for starting new equipment and effectively supports subsequent risk identification.

2. Knowledge Graph Construction of New Equipment Start-Up in Power Grid

2.1. Knowledge Graph Construction Scheme of New Equipment Start-Up in Power Grid

In the process of constructing the knowledge graph for the start-up of new power grid equipment, this paper designs a six-level ontology architecture, covering core areas such as power generation, transmission, transformation, distribution, users, and protection. This architecture uses equipment as the basic topological nodes and builds the initial knowledge graph framework based on the structured data of the entire network, enabling formal modeling of key equipment in the power grid and their physical connections. For the unstructured text data widely existing in the start-up scenarios of new equipment (such as start-up plans, switching operation opinions, and protection setting calculations), a domain-knowledge-augmentation universal information-extraction framework is introduced to automatically extract the equipment entities and their semantically associated triples and integrate them into the existing topological structure for the dynamic update and expansion of the knowledge graph. The constructed knowledge graph not only reflects the physical topological characteristics of the power grid but also integrates the operation logic of the equipment and the associated information of potential risks, providing structured support for the subsequent generation of start-up plans and risk identification. The knowledge graph construction scheme for new equipment start-up is shown in Figure 1.

(1): The data layer includes both structured power grid models and unstructured text data. The power grid model primarily covers key elements such as parameters, topological connections, and dispatching mechanisms. Unstructured text includes documents such as start-up plans, switching operation opinions, and protection setting calculations.
(2): The data preprocessing layer cleans and normalizes multi-source heterogeneous data, removing redundant, missing, or conflicting information to improve data quality. Meanwhile, the data are labeled according to the unified ontology specification to construct a standardized training and validation sample library.
(3): The knowledge extraction layer is the core component of knowledge graph construction. For structured data, it extracts standard entities and relationships programmatically according to the predefined ontology architecture. For unstructured text data, it applies the DKA-UIE framework to recognize equipment entities and extract semantic relationships, addressing the challenges posed by out-of-vocabulary words and name variants and improving extraction accuracy and generalization ability.
(4): The knowledge reasoning layer, after completing the initial knowledge extraction, performs entity disambiguation and semantic alignment to ensure data consistency from different sources at the semantic level. By combining the preset-rule reasoning mechanism and graph topology analysis, it models risk propagation paths and identifies potential hidden dangers, enhancing the intelligent reasoning capability of the topology.
(5): The knowledge application layer, built on the completed knowledge graph, supports the intelligent formulation of new equipment start-up plans and automatic verification of risk factors, enhancing the scientific accuracy and efficiency of power grid operation decisions and promoting the evolution of power grid operations toward intelligence.

2.2. UIE Knowledge Extraction Algorithm

The UIE model takes the original text and structural schema instructions (SSI) as input and uses the Transformer for encoding and decoding to extract structured information. The overall architecture of the UIE is shown in Figure 2.

In the UIE model, after the structural schema instruction $s$ and the input text $x$ are encoded by the Transformer, the text feature vector is obtained as follows:

$$H = \mathrm{Encoder}(s \oplus x) \tag{1}$$

where $\mathrm{Encoder}$ represents the structure of the Transformer encoder; $H \in \mathbb{R}^{n \times d}$ is the feature representation matrix of the input text; $s$ stands for the SSI; $x$ is the original input text; $\oplus$ indicates the concatenation operation; $n$ is the total length of the text sequence; and $d$ is the feature dimension.

In the UIE decoding stage, the structured output can be obtained by the following formula:

$$y_{i}, h_{i} = \mathrm{Decoder}([H; h_{1}, h_{2}, \dots, h_{i-1}]) \tag{2}$$

where $\mathrm{Decoder}$ represents the Transformer decoder; $h_{i}$ represents the latent state at the $i$-th moment; and $y_{i}$ represents the output at the $i$-th moment.

3. New Equipment Start-Up Knowledge Extraction Based on the DKA-UIE Framework

3.1. New Equipment Start-Up Knowledge Extraction

First, unstructured texts such as power grid start-up plans and switching operation opinions are cleaned and standardized to construct a uniformly labeled sample library. Then, the power grid topology knowledge vectors constructed with TransR are introduced into the model: text semantics and structural knowledge are fused through the gated attention mechanism, and cross-paragraph dependencies are modeled by the hierarchical memory-augmented attention architecture. Finally, the trained and optimized model automatically extracts equipment entities and relationships, providing support for the construction of the knowledge graph. The overall process is shown in Figure 3.

3.2. New Equipment Start-Up Knowledge Labeling

The start-up of new equipment is an important step before grid equipment is formally connected to the power system. Its main purpose is to verify that the phase sequence of the primary equipment and the secondary protection setting values are correct. The start-up principle scheme and the start-up protection scheme involve a large number of entities, which fall mainly into 12 types, such as plants and stations, dispatching mechanisms, and busbars. Annotating and classifying these 12 types of key entities makes it possible to identify the operating equipment, relationships, and equipment statuses during start-up, providing a structured basis for dynamic knowledge graph updates, start-up plan generation, and risk identification. The entity labels are shown in Table 1.

3.3. DKA-UIE Framework

To enhance the information extraction capability in the start-up scenario of new power grid equipment, the DKA-UIE framework introduces external knowledge and hierarchical memory augmentation mechanisms based on UIE. Firstly, the TransR [18,19] model is used to pre-train the power grid topology knowledge base, and the knowledge recall of related triples is achieved through the Euclidean distance. Then, the gated attention fusion mechanism is adopted to dynamically fuse the text-semantic features and external knowledge vectors, thereby enhancing the semantic expression ability of the new equipment entity. Furthermore, the DKA-UIE framework decoder also introduces the global memory matrix [20,21] and attenuation-factor update mechanism to persistently store the context information across sentences and paragraphs, achieving long-range dependency modeling. The overall architecture of the DKA-UIE is shown in Figure 4.

Based on the TransR embedding method, power grid equipment and its relationships are mapped to low-dimensional vectors to form a domain knowledge vector library:

$$K = \{k_{i} \mid i \in I\} \tag{3}$$

where $K$ represents the domain knowledge vector library; $k_{i} \in \mathbb{R}^{d}$ represents the TransR embedding vector of entities or relations; and $I$ is the index set of entities or relations in the knowledge base.

For the input text, after TransR embedding, the Top-k related domain knowledge is recalled from the knowledge base based on the Euclidean distance:

$$H_{k} = \mathrm{SelectTop}(k_{x}, K) \tag{4}$$

where $k_{x}$ is the TransR embedding vector of the input text; $H_{k} \in \mathbb{R}^{k \times d}$ represents the recalled knowledge, and $k$ is determined by the length of the sentence; and $\mathrm{SelectTop}$ is the distance metric recall method [22].
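The recall step in Equation (4) can be sketched as a nearest-neighbor search. This is a minimal illustration, assuming the knowledge library is a small in-memory dictionary; the function names, the entry names, and the two-dimensional vectors are hypothetical, and the recall method of reference [22] may differ.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def select_top(k_x, library, k):
    """Return the k library entries (name, vector) closest to the query k_x."""
    ranked = sorted(library.items(), key=lambda item: euclidean(k_x, item[1]))
    return ranked[:k]

# Hypothetical embeddings; real TransR vectors would be learned, not hand-written.
library = {
    "busbar_I": [0.0, 1.0],
    "breaker_201": [1.0, 1.0],
    "line_A": [5.0, 5.0],
}
recalled = select_top([0.9, 1.1], library, k=2)
names = [name for name, _ in recalled]
```

Sorting the whole library is adequate for a sketch; a large vector library would normally use an approximate nearest-neighbor index instead.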

To enhance the model’s perception ability of external knowledge, the gated attention mechanism is adopted to dynamically fuse the text features output by the UIE encoder with the recalled knowledge vectors:

$$g = \sigma(W_{g} [H + H_{k}] + b_{g}) \tag{5}$$

$$\tilde{H} = g \odot H + (1 - g) \odot H_{k} \tag{6}$$

where $H \in \mathbb{R}^{n \times d}$ is the text representation of the UIE encoder; $[H + H_{k}] \in \mathbb{R}^{n \times 2d}$ broadcasts the knowledge vector to the semantic vector of each time step; $W_{g} \in \mathbb{R}^{d \times 2d}$ is the trainable parameter matrix of the gate value; $b_{g} \in \mathbb{R}^{d}$ is the bias term; $g \in \mathbb{R}^{n \times d}$ is the gating value, controlling the fusion ratio of the original hidden state and external knowledge; $\sigma(\cdot)$ is the sigmoid activation function, mapping the input to the interval (0, 1); $\odot$ is element-by-element multiplication; and $\tilde{H} \in \mathbb{R}^{n \times d}$ is the final semantic representation after integrating external knowledge.
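The gated fusion of Equations (5) and (6) can be sketched in plain Python. The sketch reads $[H + H_{k}]$ as a row-wise concatenation, which is an interpretation based on the stated dimension $n \times 2d$, and it assumes the recalled knowledge has already been aligned to one vector per time step. All function and variable names are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_fusion(H, H_k, W_g, b_g):
    """Fuse text rows H with knowledge rows H_k through a learned gate.

    H, H_k: n rows of length d; W_g: d rows of length 2d; b_g: length d.
    """
    fused = []
    for h_row, k_row in zip(H, H_k):
        joint = h_row + k_row  # list concatenation, length 2d (assumed reading)
        gate = [
            sigmoid(sum(w * x for w, x in zip(w_row, joint)) + b)
            for w_row, b in zip(W_g, b_g)
        ]
        # Equation (6): gate weights the text feature, (1 - gate) the knowledge.
        fused.append([g * h + (1.0 - g) * k for g, h, k in zip(gate, h_row, k_row)])
    return fused

H = [[1.0, 2.0], [3.0, 4.0]]
H_k = [[0.0, 0.0], [1.0, 1.0]]
W_g = [[0.0] * 4, [0.0] * 4]   # zero weights make every gate value 0.5
b_g = [0.0, 0.0]
fused = gated_fusion(H, H_k, W_g, b_g)
```

With zero weights and bias the gate is 0.5 everywhere, so the result is the plain average of the two inputs; a strongly positive bias pushes the gate toward 1 and the output toward the text features.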

To enhance the modeling ability of the model for long-term dependent features, in this paper, a global memory matrix is introduced into the decoder, and a weighted moving average strategy with attenuation factors is adopted for dynamic update. This mechanism enables different encoding layers to retain historical context interaction information across time steps, enhancing the representation ability of the attention mechanism for long-term dependent features. The dynamic memory decay factors are as follows:

$$\bar{h} = \frac{1}{n} \sum_{i=1}^{n} h_{i} \tag{7}$$

$$\gamma = \sigma(W_{\gamma} \bar{h} + b_{\gamma}) \tag{8}$$

where $\bar{h}$ is the average latent vector of the text input; $\gamma$ is the dynamic learning parameter of the memory decay factor, controlling the weight between historical memory and current features; $W_{\gamma} \in \mathbb{R}^{d}$ is the trainable parameter matrix; and $b_{\gamma}$ is the bias term.

When the covariance matrix that fuses text and structured information is adopted to represent the current hierarchical input information, the memory matrix that fuses historical information can be expressed as follows:

$$M_{s} = \gamma M_{s-1} + (1 - \gamma) \tilde{H} \tilde{H}^{T} \tag{9}$$

where $M_{s} \in \mathbb{R}^{d \times d}$ represents the global memory matrix updated at time $s$; and $M_{s-1} \in \mathbb{R}^{d \times d}$ represents the memory matrix at time $s - 1$.
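Equations (7) to (9) amount to an exponentially weighted moving average whose weight is predicted from the mean hidden state. The sketch below follows the formulas literally under two assumptions: $\gamma$ is a scalar, and the example shapes are chosen so that $\tilde{H} \tilde{H}^{T}$ and the memory matrix have the same size. Function names are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_vector(rows):
    """Equation (7): column-wise mean of the hidden states."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def decay_factor(rows, w_gamma, b_gamma):
    """Equation (8): scalar decay factor from the mean hidden state."""
    h_bar = mean_vector(rows)
    return sigmoid(sum(w * h for w, h in zip(w_gamma, h_bar)) + b_gamma)

def gram(rows):
    """Compute H H^T for a matrix given as a list of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]

def update_memory(M_prev, H_tilde, gamma):
    """Equation (9): blend the previous memory with the current Gram matrix."""
    G = gram(H_tilde)
    return [
        [gamma * m + (1.0 - gamma) * g for m, g in zip(m_row, g_row)]
        for m_row, g_row in zip(M_prev, G)
    ]

H_tilde = [[1.0, 0.0], [0.0, 2.0]]
gamma = decay_factor(H_tilde, w_gamma=[0.0, 0.0], b_gamma=0.0)  # sigmoid(0)
M_prev = [[0.0, 0.0], [0.0, 0.0]]
M_s = update_memory(M_prev, H_tilde, gamma)
```

A value of $\gamma$ near 1 keeps most of the historical memory, while a value near 0 overwrites it with the current paragraph's statistics.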

The memory matrix is then introduced into the attention mechanism [23,24,25] to achieve effective capture and reuse of the context:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T} + M_{s}}{\sqrt{h}}\right) V \tag{10}$$

where $Q, K, V \in \mathbb{R}^{d \times h}$ are the query, key, and value vectors, respectively, and $\mathrm{softmax}(\cdot)$ is used to normalize the attention weights.
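Equation (10) adds the memory matrix as a bias to the attention scores before scaling and normalization. The following single-head sketch shows that effect on tiny hand-written matrices; it assumes the memory matrix has the same shape as $Q K^{T}$, and all names and values are illustrative.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def memory_attention(Q, K, V, M):
    """Scaled dot-product attention with an additive memory bias M."""
    scale = math.sqrt(len(Q[0]))  # sqrt(h), the width of the query rows
    outputs, all_weights = [], []
    for q, m_row in zip(Q, M):
        scores = [
            (sum(a * b for a, b in zip(q, k)) + bias) / scale
            for k, bias in zip(K, m_row)
        ]
        weights = softmax(scores)
        all_weights.append(weights)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return outputs, all_weights

Q = [[0.0, 0.0], [0.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [3.0, 2.0]]
M_zero = [[0.0, 0.0], [0.0, 0.0]]
M_diag = [[4.0, 0.0], [0.0, 4.0]]

uniform_out, uniform_w = memory_attention(Q, K, V, M_zero)
biased_out, biased_w = memory_attention(Q, K, V, M_diag)
```

With a zero memory and zero queries the weights are uniform and each output row is the mean of the value rows; the diagonal memory shifts each position's attention toward the entry that the memory marks as related.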

To enhance the model’s representation ability of the structural information of the knowledge graph, a loss function based on contrastive learning is introduced in the training stage. This function achieves precise alignment of the semantic structure of entities and relations by minimizing the embedding distance of positive sample triples and maximizing the embedding distance of negative sample triples simultaneously:

$$\begin{array}{l} L_{KG} = \sum_{(h, r, t) \in \Omega} \max(0, \delta + d^{+} - d^{-}) \\ d^{+} = \|E_{h} + E_{r} - E_{t}\|_{2} \\ d^{-} = \|E_{h} + E_{r} - E_{t^{-}}\|_{2} \end{array} \tag{11}$$

where $L_{KG}$ is the triple structure loss; $\Omega$ represents the set of positive triples in the knowledge graph. Each positive triple $(h, r, t)$ is composed of a head entity $h$, a relation $r$, and a tail entity $t$, representing a real factual relation. Negative samples are generated by randomly replacing the tail entities of real triples. $d^{+}$ represents the positive sample embedding distance; $d^{-}$ represents the negative sample embedding distance. $E_{h}, E_{r}, E_{t}, E_{t^{-}}$ are the embedding vectors of the head entity, relation, tail entity, and negative sample tail entity, respectively. $\delta$ is the margin hyperparameter, which controls the minimum separation between positive and negative samples. The hinge term is written so that the loss falls as $d^{+}$ shrinks and $d^{-}$ grows, in line with the stated objective.
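The stated goal of Equation (11), a small distance for positive triples and a large one for negatives, corresponds to the standard margin-based ranking form $\max(0, \delta + d^{+} - d^{-})$. The sketch below uses that form with tail-replaced negatives, as the text describes; the embeddings are hand-written stand-ins, and every name is illustrative.

```python
import math

def translation_distance(head, relation, tail):
    """L2 norm of (head + relation - tail)."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def margin_loss(samples, margin):
    """Sum of hinge terms over (head, relation, tail, negative_tail) samples."""
    total = 0.0
    for head, relation, tail, negative_tail in samples:
        d_pos = translation_distance(head, relation, tail)
        d_neg = translation_distance(head, relation, negative_tail)
        # Zero once the negative is at least `margin` farther than the positive.
        total += max(0.0, margin + d_pos - d_neg)
    return total

head, relation, tail = [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]
well_separated = [(head, relation, tail, [4.0, 4.0])]   # d_pos = 0, d_neg = 5
indistinguishable = [(head, relation, tail, [1.0, 0.0])]  # d_pos = d_neg = 0

loss_easy = margin_loss(well_separated, margin=0.1)
loss_hard = margin_loss(indistinguishable, margin=0.1)
```

A negative that is already far from the positive contributes nothing, so training effort concentrates on negatives the model cannot yet tell apart.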

To improve the stability of the model’s memory of historical features, in this paper, L2 regularization constraints are introduced between the memory matrices of adjacent moments to suppress fluctuations during the training process:

$$L_{MAA} = \|M_{s} - M_{s-1}\|_{2} \tag{12}$$

where $L_{MAA}$ represents the memory augmentation loss.

To achieve end-to-end training in knowledge augmentation and memory modeling, the UIE basic loss, triplet structure loss, and memory augmentation loss are further combined to form a multi-task loss function:

$$L = L_{UIE} + \lambda_{1} L_{KG} + \lambda_{2} L_{MAA} \tag{13}$$

where $L_{UIE}$ is the loss function of the UIE basic model; $\lambda_{1}$ and $\lambda_{2}$ are the adjustable hyperparameters of the triple contrast loss and the memory augmentation loss, respectively.
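The memory stability term and the weighted sum of Equation (13) can be sketched as follows. The matrix norm is taken here as the entry-wise (Frobenius) L2 norm, which is an assumption since the text does not say which matrix norm is meant; the default weights of 0.5 are the values reported in Section 4.1, and the loss values in the example are made up.

```python
import math

def memory_stability_loss(M_s, M_prev):
    """Entry-wise L2 norm of the change between adjacent memory matrices."""
    return math.sqrt(
        sum(
            (a - b) ** 2
            for row_s, row_p in zip(M_s, M_prev)
            for a, b in zip(row_s, row_p)
        )
    )

def total_loss(l_uie, l_kg, l_maa, lambda_1=0.5, lambda_2=0.5):
    """Multi-task objective: base UIE loss plus two weighted auxiliary terms."""
    return l_uie + lambda_1 * l_kg + lambda_2 * l_maa

M_prev = [[0.0, 0.0], [0.0, 0.0]]
M_s = [[3.0, 0.0], [0.0, 4.0]]
l_maa = memory_stability_loss(M_s, M_prev)        # sqrt(9 + 16)
combined = total_loss(l_uie=1.0, l_kg=0.2, l_maa=l_maa)
```

Because the auxiliary terms are simply added, their weights trade off extraction accuracy against structural alignment and memory smoothness.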

The above-mentioned multitask losses are jointly optimized through the gradient descent method. Ultimately, this training strategy can effectively enhance the accuracy and robustness of the model in the knowledge extraction task of the start-up plan for new power grid equipment.

4. Case Study Analysis

4.1. Test Data

The test data come from a control center and include more than 20 documents, such as primary power transmission plans and secondary protection plans. Each paragraph in the documents is regarded as one piece of data, giving 2474 pieces of data containing 8929 entities. The dataset is divided in a ratio of 6:2:2: the training set contains 1484 pieces of data with 5125 entities, the validation set contains 495 pieces of data with 1854 entities, and the test set contains 495 pieces of data with 1950 entities. The training environment uses the deep learning frameworks PyTorch 1.13.0 and Paddle 2.4.2. The operating environment consists of CentOS 7.9, an Intel CPU running at 2.50 GHz, 128 GB of system RAM, and an NVIDIA GPU with 16 GB of memory. The training hyperparameters are as follows: 100 training epochs, a learning rate of 1 × 10⁻⁵, a batch size of 16, a positive/negative sample margin (δ) of 0.1, a triple contrastive loss weight (λ₁) of 0.5, a memory augmentation loss weight (λ₂) of 0.5, and the Adam optimizer.
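The reported split sizes can be checked with a few lines of arithmetic. The sketch assumes the training and validation counts are rounded to the nearest integer and the test set takes the remainder; the paper does not state the rounding rule, so that part is a guess that happens to reproduce the reported numbers.

```python
def split_counts(total, ratios=(6, 2, 2)):
    """Split `total` items by integer ratios; the last part takes the remainder."""
    denominator = sum(ratios)
    train = round(total * ratios[0] / denominator)
    validation = round(total * ratios[1] / denominator)
    test = total - train - validation
    return train, validation, test

sizes = split_counts(2474)
entity_total = 5125 + 1854 + 1950  # entities in train, validation, and test sets
```

Both totals agree with the figures in the text: the three subsets sum to 2474 pieces of data and the entity counts sum to 8929.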

4.2. Evaluation Index

To accurately evaluate entity recognition on the new equipment start-up plans, precision (P), recall (R), and F₁-score are adopted as evaluation indicators:

$$P = \frac{TP}{TP + FP} \times 100\% \tag{14}$$

$$R = \frac{TP}{TP + FN} \times 100\% \tag{15}$$

$$F_{1} = \frac{2 P R}{P + R} \times 100\% \tag{16}$$

where $TP$ represents the number of entities correctly predicted as positive; $FP$ represents the number of entities incorrectly predicted as positive; and $FN$ represents the number of positive entities incorrectly predicted as negative.
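The three metrics are straightforward to compute from the counts. The sketch below works with fractions rather than percentages, uses made-up counts for the first example, and then checks that the averages reported in the abstract are mutually consistent under the harmonic-mean definition of F₁.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) as fractions in [0, 1]."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = f1_from_pr(precision, recall)
    return precision, recall, f1

def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)

# Reported averages, in percent: P = 99.19, R = 99.47.
reported_f1 = round(f1_from_pr(99.19, 99.47), 2)
```

The harmonic mean of the reported precision and recall rounds to 99.33, matching the reported F₁-score.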

4.3. Effect Analysis

The DKA-UIE knowledge extraction model trained on new equipment start-up schemes was verified on the 495 test samples. The recognition results for the 12 entity types in the new equipment start-up schemes are shown in Table 2.

On the test set, the DKA-UIE model performed well in the entity extraction task. As Table 2 shows, the F₁-scores of tags such as vol and dcc reach 100%, and the F₁-scores of more than 10 tag categories reach 99%. Overall, the average precision of entity extraction is 99.19%, the recall is 99.47%, and the F₁-score is 99.33%. These results show that the model has strong information extraction ability in the domain of new equipment start-up schemes, providing support for start-up scheme generation and risk identification.

4.4. Model Comparison

On the same dataset, BiLSTM-CRF [26], BERT-CRF [27], UIE, and the DKA-UIE model proposed in this paper were compared on entity recognition for the new equipment start-up scheme. The accuracy curves of the training process are shown in Figure 5.

Figure 5 shows that the accuracy of all four models increases gradually with the number of training epochs and stabilizes in the middle and later stages. The DKA-UIE model remains the most accurate throughout: it converges rapidly at the beginning of training and eventually approaches 100%, indicating that it is the most effective on the entity extraction task for the new equipment start-up scheme. The UIE model is second, approaching 94.5% after stabilization, and the BERT-CRF model is close to 93%; both fall below the DKA-UIE. The BiLSTM-CRF model has the lowest accuracy, eventually stabilizing at around 90%. This analysis indicates that the DKA-UIE model has a clear advantage when handling concatenated entities and complex relationships between entities. For the test sample set, the average precision, recall, and F₁-score of each model are shown in Table 3.

The F₁-scores for the different category labels were also analyzed to compare the models across tasks; the results are shown in Figure 6. The heat map shows that the models (DKA-UIE, UIE, BERT-CRF, BiLSTM-CRF) follow different performance trends on each label. The DKA-UIE model performs stably in most tasks, with F₁-scores generally close to 1. The UIE also achieves F₁-scores close to 1 in multiple tasks, but its F₁-score on the “disconnector” task is only 0.65. On the “disconnector” and “operation time” tasks, the F₁-scores of BERT-CRF are 0.72 and 0.8, respectively, indicating that general-purpose models still have limitations on vertical-domain tasks. The overall performance of BiLSTM-CRF is relatively weak, with generally low F₁-scores, making it difficult to support engineering applications. Overall, the DKA-UIE model performs best in most tasks: by integrating the structured information of new equipment start-up into the memory-augmented attention mechanism, it effectively improves the extraction of complex entities in the new equipment start-up scheme. In sum, integrating domain knowledge with memory-augmented attention gives the DKA-UIE model strong information extraction ability in this power system application.

To evaluate the practical feasibility of the DKA-UIE model, this study conducts a comparative analysis of computational efficiency across models during both the training and inference phases. Under identical hardware conditions and using the same dataset, the total time required to complete 100 training epochs and the average time for each model to perform inference on a single test sample are recorded. The results are presented in Table 4.

As shown in the table, the BiLSTM-CRF model, with its simple architecture, achieves the highest computational efficiency. The BERT-CRF and UIE models, based on the Transformer architecture, incur significantly higher training and inference times. The proposed DKA-UIE model introduces additional computational overhead due to the incorporation of external knowledge and a memory-augmented mechanism, resulting in increased training and inference times compared to the UIE baseline model. Nevertheless, its average inference time per sample remains under 50 milliseconds, which is sufficient to meet the response requirements in offline planning scenarios.

5. New Equipment Start-Up Scheme Generation and Risk Identification Based on Knowledge Graph

The development of start-up plans for new equipment involves various application scenarios, including new transmission lines, new substations, merger unit retrofit measures, power supply restoration, and replacement of protection devices. Among these, the start-up plan for new substations is particularly complex due to the coordinated operation and logical sequencing involving busbars, circuit breakers, disconnectors, transmission lines, transformers, and associated protection devices. This study analyzes a sample of 74 practical start-up energization and protection schemes for 220 kV substations from a provincial power company in recent years. These cases, derived from the province’s grid construction and renovation projects, cover various types such as new construction, retrofit, and technical countermeasures, demonstrating strong engineering representativeness and practical relevance. Through systematic review and inductive analysis, 21 typical topological structures were identified. The classification is primarily based on key characteristics such as substation busbar configurations (e.g., double busbar, three-section busbar, four-section busbar) and inter-station electrical connections (e.g., single-circuit line, double-circuit line). Start-up schemes of different topological types have certain commonalities in logic, but there are significant differences in specific implementation details. To address this complexity, this paper proposes a modular generation method based on knowledge graphs, extracting the core business logic of the start-up scheme into universal modular components as the basic framework for scheme generation. By introducing the dynamic configuration item mechanism, precise scheme generation for different topological structures and different application scenarios is achieved. 
In the process of generating the start-up scheme, the system first calculates the similarity between the topological structure of the graph’s isomorphic network and the standard topological structure in the graph, and automatically selects the best-matching dynamic configuration item. Then, using the equipment entities stored in the knowledge graph and their associated relationships, the system automatically generates the start-up principle plan and the protection plan for the new equipment.
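The template-selection step can be illustrated with a deliberately simple stand-in. The text does not spell out the similarity computation, so the sketch swaps in a plain Jaccard similarity over sets of typed connections; this is an assumption for illustration and not the graph-based model the paper refers to. The template names and connection types are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def best_template(candidate, templates):
    """Return (name, score) of the standard topology most similar to the candidate."""
    scored = ((name, jaccard(candidate, edges)) for name, edges in templates.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical standard topologies, each a set of (type, type) connections.
templates = {
    "double_busbar_single_line": {
        ("busbar", "breaker"), ("breaker", "line"), ("busbar", "bus_coupler"),
    },
    "three_section_busbar_double_line": {
        ("busbar", "breaker"), ("breaker", "line"),
        ("busbar", "section_breaker"), ("line", "line"),
    },
}
candidate = {
    ("busbar", "breaker"), ("breaker", "line"),
    ("busbar", "bus_coupler"), ("breaker", "transformer"),
}
name, score = best_template(candidate, templates)
```

The selected name would then key into the dynamic configuration items. A set-overlap measure ignores how the connections are arranged, which is one reason a learned graph similarity is the more natural choice in practice.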

After the new equipment start-up plan is compiled, if the start-up plan is manually adjusted according to the actual power grid operation mode and power flow, the model will automatically verify three types of risks: name risk, impact risk, and loop closure risk. In the power-on start-up step, the equipment involved in each operation step is extracted using the aforementioned DKA-UIE model, and their states are recorded. Combined with the entities in the knowledge graph and their context information, the normalization verification of the equipment names is completed. Based on the voltage levels of the lines, busbars, and transformers, as well as the switching times of the hot standby to operation state, combined with the reasoning model embedded in the knowledge graph, the impact risks in the operation steps are identified. Based on the topological state information in the operation steps aggregated by the message-passing network, the risk of loop closure in the operation process is identified through the comparative analysis of the equipment states in the time series. Through the embedded computing and dynamic reasoning of the above-mentioned knowledge graph, the intelligent generation of the start-up plan and dynamic risk identification can be achieved.
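The loop closure check can be illustrated with a much simpler technique than the message-passing network described above: a union-find structure that tracks which nodes are already electrically connected as switches close. This is a swapped-in illustration, not the paper's method. It handles closing operations only, and the node names are hypothetical.

```python
class DisjointSet:
    """Union-find over arbitrary hashable node names."""

    def __init__(self):
        self.parent = {}

    def find(self, node):
        self.parent.setdefault(node, node)
        while self.parent[node] != node:
            # Path halving keeps the trees shallow.
            self.parent[node] = self.parent[self.parent[node]]
            node = self.parent[node]
        return node

    def union(self, a, b):
        """Join two nodes; return False if they were already connected."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return False
        self.parent[root_a] = root_b
        return True

def loop_closing_steps(steps):
    """Return the 1-based indices of closing steps that complete a loop."""
    connected = DisjointSet()
    flagged = []
    for index, (node_a, node_b) in enumerate(steps, start=1):
        if not connected.union(node_a, node_b):
            flagged.append(index)
    return flagged

# Hypothetical sequence of closing operations between pairs of nodes.
steps = [
    ("station_A", "busbar_I"),
    ("busbar_I", "line_1"),
    ("line_1", "station_B"),
    ("station_B", "station_A"),  # both ends are already connected
]
flagged = loop_closing_steps(steps)
```

Supporting opening operations would require rebuilding the connectivity or a dynamic-connectivity structure, since union-find cannot undo a merge.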

6. Conclusions

This paper presents a knowledge graph construction method for generating start-up schemes and identifying risks in new power grid equipment. Through analysis and verification using calculation examples, the following conclusions are drawn.

(1) The proposed DKA-UIE integrates a TransR-based power grid topology knowledge base. A gated attention mechanism dynamically fuses semantic features with structured knowledge, and a hierarchical memory augmentation architecture models long-range cross-paragraph dependencies. Together, these significantly enhance the model’s context awareness and logical reasoning capabilities and overcome the limitations of traditional methods in context modeling.
(2) Verified on new equipment start-up data from a regional power grid, the DKA-UIE framework reached an F₁-score of 99.33% on the entity recognition task, 4.84 percentage points higher than the UIE baseline model. Notably, it maintained high recall on complex entities such as “disconnector” and “AC line”, demonstrating the model’s adaptability to power grid scenarios.
(3) The new equipment start-up knowledge graph constructed with the DKA-UIE enables unified modeling of equipment entities and risk rules, supports the modular generation of start-up schemes and risk identification, and provides reliable support for intelligent power grid dispatching.
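To make the two mechanisms named in conclusion (1) concrete, the following minimal sketch shows one common form of gated fusion and of a decay-factor memory update. The exact formulation used in the DKA-UIE is given in Section 3.3 and may differ. The element-wise gate weights, the vector-valued memory (the paper uses a global memory matrix), and the names `gated_fusion`, `update_memory`, and `decay` are all simplifying assumptions.

```python
import math


def sigmoid(x):
    """Logistic function used to squash the gate into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h, e, w_h, w_e, bias):
    """Fuse a text feature vector h with a knowledge embedding e.

    Assumed form (element-wise, for brevity):
        g     = sigmoid(w_h * h + w_e * e + bias)
        fused = g * h + (1 - g) * e
    """
    fused = []
    for hi, ei, wh, we, b in zip(h, e, w_h, w_e, bias):
        g = sigmoid(wh * hi + we * ei + b)
        fused.append(g * hi + (1.0 - g) * ei)
    return fused


def update_memory(memory, paragraph_vec, decay=0.9):
    """Assumed decay update: M_t = decay * M_{t-1} + (1 - decay) * p_t."""
    return [decay * m + (1.0 - decay) * p
            for m, p in zip(memory, paragraph_vec)]


# With zero weights and bias the gate is 0.5, so the result is the mean.
print(gated_fusion([1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]))

# Carry a running summary across three paragraph vectors.
memory = [0.0, 0.0]
for paragraph in ([1.0, 2.0], [1.0, 2.0], [0.0, 0.0]):
    memory = update_memory(memory, paragraph)
print(memory)
```

The gate lets the model lean on the text feature where it is informative and on the knowledge embedding elsewhere, while the decay factor controls how quickly information from earlier paragraphs fades from the memory.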

The current study focuses on the construction and application of a knowledge graph for new equipment start-up principles and protection schemes. To address the diversity of start-up scenarios, future work will target typical applications such as line retrofitting and substation expansion, aiming to develop scenario-specific methods for knowledge graph modeling and updating to enhance the system’s generalization capability. Meanwhile, we plan to explore collaboration mechanisms with different dispatching centers to promote the coordinated use of multi-source, heterogeneous power system documents under the premise of ensuring data security and regulatory compliance. To improve model deployment efficiency, lightweight techniques (e.g., quantization, pruning) and hardware acceleration (e.g., TPU, FPGA) will be investigated. Furthermore, continual learning will be introduced to enable dynamic knowledge graph updates, enhancing the system’s adaptability to grid expansion and operational changes.

Author Contributions

Conceptualization, W.T.; Methodology, Y.Z. and L.S.; Resources, X.M. and K.L.; Data curation, Y.Q. and T.J.; Writing—original draft, H.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the science and technology project of SGCC, No. 5108-202320056A-1-1-ZN.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Authors Wei Tang, Xun Mao and Kai Lv were employed by the State Grid Anhui Electric Power Research Institute. Authors Yue Zhang, Hetong Jia, Lianfei Shan, Yongtian Qiao and Tao Jiang were employed by the company NARI Group Corporation Co., Ltd., (State Grid Electric Power Research Institute Co., Ltd.). Authors Yue Zhang, Hetong Jia, Lianfei Shan, Yongtian Qiao and Tao Jiang were employed by the Beijing Kedong Electric Power Control System Co., Ltd. All the authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. Construction plan for the knowledge graph of new equipment start-up.

Figure 2. The UIE model architecture.

Figure 3. Knowledge extraction approach for new equipment start-up.

Figure 4. The UIE knowledge augmentation structure.

Figure 5. Accuracy curve of the training process.

Figure 6. F₁-score comparison heatmap for each model.

Table 1. List of entity tags for new equipment start-up.

Serial Number	Entity Category	Label Category
1	Substation	substation
2	Dispatching agency	dcc
3	AC line	acline
4	Busbar	busbar
5	Breaker	breaker
6	Disconnector	dis
7	Transformer	xfmr
8	Voltage level	vol
9	Operation time	time
10	Fixed value sheet	sheet
11	Protective current	protect_current
12	Protection time	protect_time

Table 2. Label recognition results of each category.

Entity Category	Precision/%	Recall/%	F₁/%
substation	99.76	99.52	99.64
dcc	100.00	100.00	100.00
acline	99.22	99.22	99.22
busbar	99.38	98.77	99.08
breaker	99.75	99.75	99.75
dis	94.42	94.42	94.42
xfmr	100.00	100.00	100.00
vol	99.66	100.00	99.83
time	100.00	100.00	100.00
sheet	100.00	100.00	100.00
protect_current	100.00	100.00	100.00
protect_time	100.00	100.00	100.00

Table 3. Performance comparison of different models.

Model	Precision/%	Recall/%	F₁/%
DKA-UIE	99.19	99.48	99.33
UIE	95.20	93.80	94.49
BERT-CRF	93.15	92.22	92.67
BiLSTM-CRF	90.04	88.18	89.08
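As a quick consistency check on Table 3, the sketch below recomputes F₁ as the harmonic mean of the reported precision and recall and derives the gain over the UIE baseline. It assumes the standard definition of F₁. The paper does not state in this section whether the reported values are averaged per class or computed from unrounded figures, so small differences are possible.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) in percent, copied from Table 3.
results = {
    "DKA-UIE": (99.19, 99.48),
    "UIE": (95.20, 93.80),
    "BERT-CRF": (93.15, 92.22),
    "BiLSTM-CRF": (90.04, 88.18),
}

# F1 values in percent as reported in Table 3.
reported_f1 = {
    "DKA-UIE": 99.33,
    "UIE": 94.49,
    "BERT-CRF": 92.67,
    "BiLSTM-CRF": 89.08,
}

for model, (p, r) in results.items():
    print(f"{model}: recomputed {f1_score(p, r):.2f}, "
          f"reported {reported_f1[model]:.2f}")

gain = reported_f1["DKA-UIE"] - reported_f1["UIE"]
print(f"F1 gain over UIE: {gain:.2f} percentage points")
```

The recomputed values agree with the reported ones to within a few hundredths of a percentage point, and the gain matches the 4.84 percentage points quoted in the conclusions.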

Table 4. Computational efficiency comparison of different models.

Model	Total Training Time/min	Average Inference Time/(ms/sample)
DKA-UIE	72.6	42.7
UIE	64.5	35.2
BERT-CRF	52.4	21.6
BiLSTM-CRF	36.3	7.3
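The cost of the added knowledge and memory modules can be read off Table 4 as a relative overhead against the UIE baseline. The short sketch below does that arithmetic. The helper name `relative_overhead` is hypothetical, and the figures are copied from the table.

```python
def relative_overhead(value, baseline):
    """Percentage by which value exceeds baseline."""
    return (value - baseline) / baseline * 100.0


# (total training time in min, average inference time in ms per sample)
timings = {
    "DKA-UIE": (72.6, 42.7),
    "UIE": (64.5, 35.2),
    "BERT-CRF": (52.4, 21.6),
    "BiLSTM-CRF": (36.3, 7.3),
}

train_base, infer_base = timings["UIE"]
train_dka, infer_dka = timings["DKA-UIE"]

train_overhead = relative_overhead(train_dka, train_base)
infer_overhead = relative_overhead(infer_dka, infer_base)

print(f"Training overhead vs. UIE: {train_overhead:.1f}%")
print(f"Inference overhead vs. UIE: {infer_overhead:.1f}%")
```

On these figures, the DKA-UIE takes roughly 13% longer to train and roughly 21% longer per inference sample than the UIE, which is the price paid for the F₁ gain shown in Table 3.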

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
