Type-Constrained Structural–Semantic Fusion with Dynamic Relation Priors for Industrial Knowledge Graph Link Prediction and Its Application in Fault Diagnosis

Luo, Yonghao; Hu, Jianpeng; Zhang, Guozheng; Lv, Jingru

doi:10.3390/electronics15112413


¹ School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China

² Jingwei Textile Machinery Company Limited, Beijing Economic-Technological Development Area, Beijing 100176, China

* Author to whom correspondence should be addressed.

Electronics 2026, 15(11), 2413; https://doi.org/10.3390/electronics15112413

Submission received: 21 April 2026 / Revised: 18 May 2026 / Accepted: 28 May 2026 / Published: 2 June 2026

(This article belongs to the Section Artificial Intelligence)

Abstract

Knowledge graph link prediction is a fundamental task for improving the completeness and reasoning capability of knowledge graphs. In industrial knowledge graph scenarios, missing relations may limit knowledge completion, relational reasoning, and downstream industrial applications. Fault diagnosis is a representative application scenario, where missing relations among fault phenomena, alarm information, fault locations, and fault causes may further affect fault analysis, maintenance decision-making, and industrial knowledge services. Industrial knowledge graphs usually suffer from sparse local structures, imbalanced relation distributions, explicit entity-type boundaries, and highly confusing candidate entities with similar structural or semantic contexts. These characteristics make it difficult for conventional embedding-based or graph neural network-based methods to achieve reliable candidate ranking by relying only on structural propagation or semantic matching. To address these challenges, this study proposes a type-constrained structural–semantic fusion framework with dynamic relation priors for industrial knowledge graph link prediction, and further investigates its application to fault diagnosis. The proposed framework extends a relation-centered graph neural reasoning backbone by generating dynamic relation priors through query-conditioned relation-level graph propagation over a predefined relation graph, thereby enhancing query-specific structural reasoning. It further introduces a semantic projection module to align textual representations of entities and relations with structural representations at the candidate-ranking stage. In addition, relation-category and hierarchy-aware signals are used to modulate relation representations during propagation, while entity-type constraints are incorporated into final scoring and type-constrained hard negative construction. 
In this way, structural evidence, textual semantic information, and entity-type validity constraints are jointly used for candidate ranking rather than being treated as isolated signals. Experiments are conducted on two public benchmark datasets, WN18RR and FB15k-237, and two industrial knowledge graph datasets in Chinese and English. The Chinese industrial knowledge graph is constructed from fault diagnosis knowledge and is used as a representative application dataset, while the English industrial knowledge graph is used to further evaluate the adaptability of the proposed framework in a related industrial production scenario. The proposed method achieves MRR scores of 0.599 and 0.446 on WN18RR and FB15k-237, respectively, and obtains MRR scores of 0.8532 and 0.7994 on the Chinese and English industrial knowledge graphs. The results demonstrate that the proposed framework improves both general link prediction performance and industrial-domain adaptability, especially in scenarios involving sparse structures, type-constrained candidate validity, and semantically confusing entities, and shows practical potential for fault diagnosis applications.

Keywords: knowledge graph completion; link prediction; industrial knowledge graph; fault diagnosis; graph neural networks; structural–semantic fusion; type constraints

1. Introduction

Knowledge graphs, as a structured paradigm for knowledge organization and representation, have demonstrated substantial value in applications such as knowledge services, intelligent manufacturing, equipment maintenance, and decision support. By organizing entities, attributes, and relations into graph structures, knowledge graphs can effectively represent complex semantic associations in the real world and provide a unified knowledge foundation for knowledge retrieval, relational reasoning, and downstream intelligent applications [1]. In industrial scenarios, knowledge graphs can integrate heterogeneous knowledge from multiple sources, including equipment, process flows, operating states, alarm information, and fault experience, thereby providing important support for industrial knowledge management, equipment maintenance, production optimization, and decision support [2]. Fault diagnosis is one representative application scenario of industrial knowledge graphs, where relations among fault phenomena, alarm information, fault locations, and fault causes provide essential evidence for maintenance reasoning and decision support.

However, in real industrial environments, the construction and maintenance of knowledge graphs are often costly and constrained by data completeness, knowledge extraction quality, and dynamic scenario changes. As a result, industrial knowledge graphs usually contain a large number of missing facts. Such incompleteness may appear in different industrial applications, such as missing links among equipment, processes, operating states, faults, and maintenance actions. In fault diagnosis applications, it may further appear as missing relations between a fault phenomenon and its corresponding alarm information, fault location, or fault cause. The continuous emergence of new equipment, new processes, and new fault patterns further aggravates the problem of incomplete entity relations, which in turn affects subsequent reasoning and application performance. In this context, knowledge graph completion becomes a key technique for improving the completeness and usability of industrial knowledge graphs, among which link prediction is the most fundamental and commonly adopted task for inferring latent semantic relations between entities [3,4].

In recent years, a large number of approaches have been proposed for link prediction from the perspectives of embedding-based modeling, path- or rule-based reasoning, graph neural propagation, and semantic enhancement [3]. Embedding-based methods are computationally efficient and relatively simple, but they still struggle to capture complex relational patterns and multi-hop dependencies [5,6,7]. Path- and rule-based methods offer stronger interpretability, yet they often suffer from high path search costs and strong dependence on graph completeness [8,9,10]. Graph neural network methods can better capture local topological patterns and higher-order neighborhood information, but they remain limited in long-tail relations, locally sparse settings, and fine-grained semantic discrimination [11,12,13]. Semantic enhancement methods alleviate the sparsity issue to some extent by introducing entity descriptions, relation texts, and pretrained language models; however, without effective integration of graph structure and domain constraints, they remain insufficient for complex industrial knowledge graph link prediction [14,15,16].

Although recent structural–semantic methods have introduced textual information into knowledge graph link prediction, many of them mainly use semantics as initialization features, independent text-based matching signals, or score-level auxiliary signals. Such designs can improve semantic representation but may still be insufficient for industrial knowledge graphs, where candidate validity is often governed by relation-specific type boundaries and domain rules. For example, in a fault diagnosis application, a fault phenomenon may be structurally close to several candidate entities, but only candidates with valid diagnostic roles, such as fault causes, fault locations, or alarm information, should be considered appropriate under a specific relation. In broader industrial scenarios, candidate entities may be structurally close and semantically similar but play different roles under different relations. Therefore, relying only on structural propagation or semantic matching may still lead to incorrect rankings. This reveals a more specific research gap: existing methods have not sufficiently investigated how query-adaptive relation representations, candidate-ranking-level semantic evidence, relation-category-aware propagation, type-aware scoring, and type-constrained hard negative sampling can be jointly considered for industrial knowledge graph link prediction.

Compared with general-purpose knowledge graphs, industrial knowledge graphs exhibit several distinctive characteristics. First, relation categories in industrial graphs are often well defined, but their distributions are highly imbalanced, with a small number of high-frequency relations dominating the graph while many others show long-tail characteristics. Second, industrial knowledge graphs are often much sparser, and many entities appear only in limited contexts, making it difficult for conventional structural propagation models to learn stable representations. Third, industrial relations are typically associated with stronger type constraints and business rules, and the valid boundaries between entity types are often explicit; consequently, relying only on structural proximity or semantic similarity may lead to incorrect rankings. Fourth, industrial entities and relations are highly dependent on domain-specific terminology and semantics, so candidates that are structurally similar may still differ substantially in semantic roles. These characteristics are particularly evident in fault diagnosis knowledge graphs, where fault phenomena, alarm information, fault locations, and fault causes are connected through highly constrained diagnostic relations. Therefore, effective link prediction for industrial knowledge graphs requires not only strong structural propagation and multi-hop reasoning capability, but also effective semantic modeling and explicit type constraints to improve reliability and scenario adaptability [2].

Motivated by the above observations, this study proposes a type-constrained structural–semantic fusion framework with dynamic relation priors for industrial knowledge graph link prediction and further investigates its application to fault diagnosis. The core idea of the proposed framework is to enhance relation-centered structural reasoning with query-adaptive relation priors generated through relation-level graph propagation over a predefined relation graph, complement candidate ranking with textual semantic evidence, and incorporate entity-type validity constraints into final scoring and hard negative construction. The proposed model first enhances the context adaptability of relation representations through relation-graph-based dynamic relation prior generation and enriches entity and relation representations through a semantic fusion module. It then introduces a hierarchy-aware relation propagation mechanism into relation-centered multi-hop propagation to better model relation-category semantics and industrial relation patterns. Finally, by combining type-constrained negative sampling with a constraint-aware joint training objective, the model improves candidate discrimination in highly confusing scenarios. In other words, relation-category and hierarchy-aware signals modulate relation representations during propagation, the semantic module is integrated mainly at the candidate-ranking level, and entity-type constraints are mainly used in final scoring and type-constrained hard negative construction. This design provides a practical balance between semantic discrimination, structural reasoning, type validity modeling, and computational efficiency.

Figure 1 provides an overview of the proposed framework and shows how dynamic relation priors, semantic evidence, hierarchy-aware relation propagation, and type-constrained scoring are organized for industrial knowledge graph link prediction. The detailed workflow of each component is further described in Section 3.

To evaluate the effectiveness of the proposed method, experiments are conducted on two public benchmark datasets, WN18RR and FB15k-237, as well as two industrial knowledge graph datasets in Chinese and English. The public datasets are used to validate the general link prediction capability of the model, while the industrial datasets are used to assess its adaptability and robustness in industrial scenarios. Specifically, the Chinese industrial knowledge graph is constructed from fault diagnosis knowledge and is used as a representative application dataset for evaluating the practical value of the proposed framework in fault diagnosis. The English industrial knowledge graph is derived from a related industrial production-line scenario and is used to further evaluate the adaptability of the proposed framework beyond a single application dataset. In addition, relation-type analysis, ablation experiments, and hyperparameter analysis are performed to systematically investigate the sources of performance gains.

The main contributions of this study are summarized as follows. First, we propose a type-constrained structural–semantic fusion framework with dynamic relation priors for industrial knowledge graph link prediction and evaluate its applicability in fault diagnosis scenarios. The framework is designed to address the joint challenges of sparse local structures, imbalanced relation distributions, explicit type boundaries, and semantically confusing candidates commonly observed in industrial knowledge graphs. Second, we generate dynamic relation priors through query-conditioned relation-level graph propagation over a predefined relation graph, so that relation representations can be adapted to the current query context before multi-hop structural reasoning. Meanwhile, relation-category and hierarchy-aware information is introduced to modulate relation representations during propagation and improve the discrimination of industrial relation patterns. Third, we design a candidate-ranking and training strategy that combines structural scores, semantic matching scores, and entity-type-constrained scores at the final scoring stage. A type-constrained hard negative sampling strategy is further used to construct more informative negative samples and sharpen the decision boundary between highly confusing candidates. Fourth, extensive experiments are conducted on two public benchmark datasets and two industrial knowledge graph datasets. The results, together with relation-type analysis, ablation studies, and hyperparameter analysis, demonstrate the effectiveness and robustness of the proposed framework in both general link prediction scenarios and industrial knowledge graph scenarios, while also showing its application potential in fault diagnosis.

The remainder of this paper is organized as follows. Section 2 reviews related studies on knowledge graph link prediction and industrial knowledge graphs. Section 3 presents the proposed framework, including relation-graph-based dynamic relation prior generation, the semantic fusion module, type-constrained and hierarchy-aware relation propagation, type-constrained hard negative sampling, and the training strategy. Section 4 reports the experimental settings, results, and discussion. Section 5 describes the application workflow of the proposed framework using fault diagnosis as a representative industrial scenario. Section 6 concludes the paper and outlines future work.

2. Related Work

Knowledge graph link prediction has attracted extensive attention, and existing methods can be broadly categorized into four groups [3]. The first group comprises embedding-based methods. These methods map entities and relations into continuous vector spaces and assess the plausibility of triples through carefully designed scoring functions. Early translational models, such as TransE, interpret relations as translations in vector space and are characterized by conceptual simplicity and training efficiency [5]. Subsequently, TransH and TransR enhanced relation modeling from the perspectives of relation-specific hyperplanes and relation-specific projection spaces [17,18]. In bilinear modeling, DistMult and ComplEx improve entity–relation matching through multiplicative interactions [19,20]. ConvE employs convolutional neural networks to capture local interaction patterns between entities and relations [21]. RotatE models relations as rotations in complex space and can naturally capture symmetric, antisymmetric, and inverse relations [6]. TuckER provides a unified tensor factorization framework for modeling higher-order interactions between entities and relations [7]. In addition, HittER further exploits higher-order topological information to improve representation quality [22]. Although embedding-based methods are efficient and scalable, their scoring functions are usually defined over learned entity and relation embeddings and therefore provide limited explicit modeling of multi-hop structural evidence, textual semantics, and industrial type constraints.

The second group includes path- and rule-based methods. These approaches emphasize explicit reasoning through relational paths or logical rules and thus offer relatively strong interpretability. The Path Ranking Algorithm (PRA) extracts candidate path features through path-constrained random walks and applies a linear classifier for prediction [8]. NeuralLP proposes a differentiable logical rule learning framework that integrates rule learning into end-to-end training [9]. DRUM further strengthens compositional reasoning in multi-relation and multi-hop scenarios [10]. Although these methods can explicitly present intermediate reasoning paths or rule evidence, they often require the enumeration or sampling of a large number of candidate paths. As a result, they are susceptible to path explosion, noisy paths, and spurious paths in large-scale knowledge graphs, which limits both efficiency and robustness. Moreover, manually or automatically derived rules may not fully cover the heterogeneous type boundaries and relation-specific validity constraints commonly observed in industrial knowledge graphs.

The third group consists of graph neural network-based methods. These methods aggregate multi-relational structural information through message passing and have been widely applied to knowledge graph link prediction in recent years. R-GCN introduces relation-specific transformations into entity neighborhood aggregation, enabling the model to better distinguish different relation types [11]. CompGCN further enhances entity–relation interaction modeling through compositional operators such as addition, subtraction, and circular correlation, thereby improving graph representation learning [12]. On this basis, researchers have gradually shifted from traditional entity-centered modeling to relation-centered reasoning. NBFNet proposes a relation-centered reasoning paradigm by neuralizing the Bellman–Ford algorithm and directly performing relation-guided multi-hop propagation conditioned on the query [13]. These methods generally show stronger ability in modeling complex relation dependencies, multi-hop reasoning, and inductive generalization; however, there remains room for improvement in semantic utilization and relation channel discrimination during propagation. In particular, relation-centered graph reasoning provides a strong structural backbone, but it usually relies mainly on graph topology and learned relation embeddings. When applied to industrial knowledge graphs with sparse local neighborhoods and semantically similar candidates, structural propagation alone may not provide sufficient evidence for fine-grained candidate ranking.

The fourth group includes pretrained language model-based methods. With the rapid development of large-scale pretrained language models, textual semantics has played an increasingly important role in knowledge graph link prediction. KG-BERT linearizes triples into natural language sequences and reformulates link prediction as a sequence classification task, thereby exploiting the semantic representation capability of pretrained language models [14]. StAR further incorporates structural features on top of pretrained language models to enhance the interaction between semantic and structural information [15]. Subsequent studies have systematically investigated the applicability and evaluation protocols of pretrained language models for knowledge graph link prediction and proposed more suitable modeling strategies. KG-S2S formulates knowledge graph link prediction in a sequence-to-sequence generation framework, improving adaptability to diverse graph structures [23]. SimKGC introduces contrastive learning to align entity descriptions semantically, leading to a more efficient and scalable semantic matching mechanism [16]. Overall, these methods exhibit clear advantages in semantic representation, but they often lack explicit multi-hop structural reasoning capability and typically incur relatively high computational costs during training and inference. More importantly, most text-enhanced methods focus on semantic matching between textual descriptions, while the explicit use of relation-specific type rules and industrial validity constraints is still limited.

To further clarify the limitations of existing studies, Table 1 summarizes different categories of knowledge graph link prediction methods from the perspectives of their main strengths and limitations in industrial scenarios. The comparison shows that existing methods have complementary strengths but still face different constraints when applied to industrial knowledge graphs. Therefore, industrial knowledge graph link prediction requires a more balanced framework that can jointly consider relation-centered structural reasoning, semantic candidate discrimination, and entity-type validity constraints.

In summary, existing studies have improved knowledge graph link prediction from the perspectives of embedding modeling, path/rule reasoning, graph structural propagation, and pretrained semantic enhancement. Nevertheless, limitations persist in structural reasoning for complex relational patterns, fine-grained semantic discrimination among candidate entities, and the explicit use of rule constraints [3]. These limitations are even more pronounced in industrial knowledge graph scenarios, where relation distributions are imbalanced, local structures are sparse, type boundaries are explicit, and domain semantics are highly specialized [2]. It should be noted that some existing studies have already explored structural–semantic fusion or rule-aware reasoning from different perspectives. However, the joint consideration of query-adaptive relation representations, candidate-ranking-level semantic evidence, hierarchy-aware relation propagation, type-aware scoring, and type-constrained hard negative sampling remains insufficiently explored for industrial knowledge graph link prediction. To address this issue, this study proposes a type-constrained structural–semantic fusion framework with dynamic relation priors for industrial knowledge graph link prediction and further investigates its application to fault diagnosis. The proposed framework integrates relation-graph-based dynamic relation prior generation, semantic candidate scoring, hierarchy-aware relation propagation, type-aware final scoring, and type-constrained negative sampling, aiming to improve link prediction in both general and industrial-domain scenarios while demonstrating its application potential in fault diagnosis.

3. Method

3.1. Problem Definition and Overall Framework

The task of knowledge graph link prediction aims to infer missing entities from observed factual triples. Let a knowledge graph be denoted as

$$G = (E, R, T), \tag{1}$$

where $E$ denotes the set of entities, $R$ denotes the set of relations, and $T \subseteq E \times R \times E$ denotes the set of observed factual triples. For any triple $(h, r, t) \in T$, $h \in E$ and $t \in E$ represent the head and tail entities, respectively, and $r \in R$ denotes the semantic relation between them. For tail entity prediction, the model is required to rank candidate entities for a query $(h, r, ?)$; head entity prediction can be handled analogously.
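As a minimal illustration of the ranking formulation above, the following Python sketch scores every candidate tail for a query $(h, r, ?)$ and sorts the candidates. The entity and relation names, the toy scoring function, and the use of inverse triples to reduce head prediction to tail prediction are all assumptions made for illustration; the last is a common convention in the literature and is not stated in this section.

```python
from typing import Callable, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def rank_tail_candidates(
    head: str,
    relation: str,
    entities: Iterable[str],
    score: Callable[[str, str, str], float],
) -> List[Tuple[str, float]]:
    """Score every candidate tail for the query (head, relation, ?) and sort descending."""
    scored = [(t, score(head, relation, t)) for t in entities]
    # Higher score first; ties are broken by entity name so the order is deterministic.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def inverse_triples(triples: Set[Triple]) -> Set[Triple]:
    """Assumed convention: add (t, r^-1, h) so that head prediction becomes tail prediction."""
    return {(t, r + "^-1", h) for h, r, t in triples}


# Toy graph with hypothetical fault-diagnosis entities and relations.
T = {
    ("yarn_break", "has_cause", "worn_guide"),
    ("yarn_break", "raises_alarm", "alarm_17"),
}
E = sorted({e for h, _, t in T for e in (h, t)})


def toy_score(h: str, r: str, t: str) -> float:
    """Placeholder scorer: 1.0 for observed triples, 0.0 otherwise."""
    return 1.0 if (h, r, t) in T else 0.0


ranking = rank_tail_candidates("yarn_break", "has_cause", E, toy_score)
```

In this toy setting the observed tail is ranked first; in the proposed framework the placeholder scorer would be replaced by the learned score $s(h, r, t)$.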

Compared with general-purpose knowledge graphs, industrial knowledge graphs pose more challenging modeling requirements. On the one hand, industrial relations exhibit stronger scenario dependence and category-specific characteristics, meaning that the same relation may shift its semantic focus across different equipment, processes, or fault chains. On the other hand, industrial knowledge graphs often involve clearer entity type boundaries, and many relations impose explicit validity constraints on the types of their head and tail entities. In addition, industrial graphs typically suffer from local structural sparsity, imbalanced relation distributions, and highly confusing candidates. These challenges are common in industrial knowledge graph link prediction and are particularly evident in applications such as fault diagnosis, where entities with different diagnostic roles may be structurally close or semantically similar. Therefore, methods relying solely on structural propagation or semantic matching are often insufficient for producing stable and reliable predictions in industrial scenarios.

To address these challenges, this study proposes a type-constrained structural–semantic fusion framework with dynamic relation priors for industrial knowledge graph link prediction. The proposed framework is built upon a relation-centered neural reasoning backbone inspired by NBFNet, rather than replacing it with an entirely new message-passing paradigm, and its applicability is further investigated in fault diagnosis scenarios. The main extensions lie in relation-graph-based dynamic relation prior generation, hierarchy-aware relation propagation, candidate-ranking-level semantic scoring, type-aware final scoring, and type-constrained hard negative construction. The model consists of four tightly coupled components. First, dynamic relation prior representations are constructed by performing query-conditioned relation-level graph propagation over a predefined relation graph to improve the scenario adaptability of relation representations. Second, a semantic fusion module is introduced to compensate for the limitations of purely structural methods in fine-grained semantic discrimination by leveraging textual information of entities and relations. Third, a type-constrained and hierarchy-aware relation propagation mechanism is incorporated into the relation-centered multi-hop propagation backbone to better model relation-category semantics and query-relevant propagation paths. Fourth, type-constrained negative sampling and a joint training objective are combined to improve discrimination among highly confusing industrial candidates.

More specifically, given a query $(h, r, ?)$, the proposed framework follows a step-by-step workflow. First, the query relation and its contextual information are used to generate a dynamic relation prior through relation-level graph propagation over the predefined relation graph, which provides query-adaptive relation information for subsequent structural reasoning. Second, relation-centered multi-hop propagation is performed over the knowledge graph, where relation-category and hierarchy-aware signals are used to modulate relation representations during propagation. Entity-type constraints are mainly incorporated at the candidate scoring and hard negative sampling stages. Third, textual descriptions of entities and relations are encoded and projected into a semantic latent space, and a semantic matching score is computed for each candidate entity. Fourth, structural, semantic, and type-constrained scores are combined at the candidate-ranking stage to obtain the final score. During training, type-constrained hard negative sampling further selects informative negative candidates that are type-valid but difficult to distinguish under the current model.
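The fourth step of this workflow can be sketched as follows. This is only an illustrative sketch: the weighted-sum form, the weights `w_sem` and `w_type`, the fixed penalty for type-invalid candidates, and all entity and type names are assumptions, since the exact fusion function is not specified in this part of the paper.

```python
from typing import Dict, Set


def fuse_scores(
    structural: Dict[str, float],
    semantic: Dict[str, float],
    candidate_type: Dict[str, str],
    valid_tail_types: Set[str],
    w_sem: float = 0.5,   # hypothetical weight of the semantic matching score
    w_type: float = 1.0,  # hypothetical weight of the type-constraint term
) -> Dict[str, float]:
    """Combine structural, semantic, and type-constraint scores per candidate."""
    fused = {}
    for cand, s_struct in structural.items():
        s_sem = semantic.get(cand, 0.0)
        # Assumed type term: 0 for type-valid candidates, a penalty of -1 otherwise.
        s_type = 0.0 if candidate_type.get(cand) in valid_tail_types else -1.0
        fused[cand] = s_struct + w_sem * s_sem + w_type * s_type
    return fused


# Hypothetical candidates for the query (yarn_break, has_cause, ?).
structural = {"worn_guide": 0.6, "spindle_unit": 0.7, "alarm_17": 0.2}
semantic = {"worn_guide": 0.8, "spindle_unit": 0.5, "alarm_17": 0.1}
candidate_type = {"worn_guide": "cause", "spindle_unit": "location", "alarm_17": "alarm"}

fused = fuse_scores(structural, semantic, candidate_type, valid_tail_types={"cause"})
best = max(fused, key=fused.get)
```

Here `spindle_unit` has the highest structural score but is demoted because its type is not valid for the query relation, which is the kind of correction the type-constrained scoring stage is intended to provide.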

Figure 1 illustrates the overall framework of the proposed model. Specifically, the model first generates dynamic relation prior representations through query-conditioned propagation over the predefined relation graph, while also obtaining semantic representations of entities and relations using a pretrained text encoder. Then, query-specific structural reasoning is performed in a relation-centered multi-hop propagation backbone with a hierarchy-aware relation propagation mechanism. During candidate ranking, structural information, semantic information, and entity-type-constrained scoring information are further integrated to produce the final score. Finally, the model is optimized end-to-end with a type-constrained hard negative sampling strategy.
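The type-constrained hard negative sampling idea mentioned above can be illustrated with a short sketch. It assumes that a set of valid tail types is known for the query relation and that "hard" means highly scored by the current model; the function name, the parameter `k`, and the toy scores are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Set


def hard_negatives(
    head: str,
    relation: str,
    true_tails: Set[str],
    entities: Iterable[str],
    entity_type: Dict[str, str],
    valid_tail_types: Set[str],
    score: Callable[[str, str, str], float],
    k: int = 2,
) -> List[str]:
    """Return the k type-valid, non-true candidates that the current model scores highest."""
    pool = [
        e for e in entities
        if e not in true_tails and entity_type.get(e) in valid_tail_types
    ]
    # Highest-scoring (most confusing) negatives first; ties broken by name.
    pool.sort(key=lambda e: (-score(head, relation, e), e))
    return pool[:k]


# Hypothetical entity types and current model scores for (yarn_break, has_cause, ?).
entity_type = {
    "worn_guide": "cause", "loose_belt": "cause", "low_tension": "cause",
    "alarm_17": "alarm", "spindle_unit": "location",
}
model_scores = {
    "worn_guide": 0.9, "loose_belt": 0.7, "low_tension": 0.2,
    "alarm_17": 0.8, "spindle_unit": 0.6,
}

negatives = hard_negatives(
    "yarn_break", "has_cause", {"worn_guide"}, entity_type, entity_type,
    {"cause"}, lambda h, r, t: model_scores[t], k=2,
)
```

Although `alarm_17` receives a high score, it is excluded because its type is invalid for the relation, so the sampled negatives stay within the set of type-valid candidates.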

For convenience, the final score of a candidate triple $(h, r, t)$ is denoted as

$$s(h, r, t). \tag{2}$$

3.2. Dynamic Relation Prior Generation

In conventional knowledge graph link prediction models, relations are usually represented as globally shared static vectors. Although such representations can capture overall statistical semantics, they are often inadequate for modeling higher-order relation dependencies and the semantic shifts of the same relation under different query contexts. This issue is particularly prominent in industrial knowledge graphs, where relation semantics often vary with equipment objects, process stages, and fault contexts.

To improve the scenario adaptability of relation representations, a dynamic relation prior generation mechanism is introduced before propagation begins. Rather than directly generating relation priors with an independent multilayer perceptron, this study constructs dynamic relation priors through query-conditioned propagation over a predefined relation graph. Specifically, a relation graph is constructed according to relation co-occurrence patterns, relation-category associations, and domain-specific relation dependencies observed in the training graph. Let this relation graph be denoted as $G_{R} = (R, A_{R})$, where $R$ is the relation set and $A_{R}$ denotes relation-level edges. Let the base representation of relation $r$ be $e_{r}$, and let the sample-specific contextual feature induced by the current query $(h, r, ?)$ be denoted as $x_{h, r}$. The contextual feature $x_{h, r}$ is constructed from the current query information and relation-context statistics, such as the representation of the query head entity, the base representation of the query relation, and relation co-occurrence features derived from the training graph.
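One possible way to derive relation-level edges from co-occurrence is sketched below. It covers only the co-occurrence criterion and assumes that two relations co-occur when they are incident to the same entity; the threshold `min_count`, the symmetric edge construction, and the toy triples are hypothetical, and the relation-category and domain-dependency criteria are not modeled.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


def build_relation_graph(triples: Set[Triple], min_count: int = 1) -> Set[Tuple[str, str]]:
    """Return directed relation-level edges (r', r) for relations sharing >= min_count entities."""
    relations_at_entity: Dict[str, Set[str]] = defaultdict(set)
    for h, r, t in triples:
        relations_at_entity[h].add(r)
        relations_at_entity[t].add(r)

    # Count, for each unordered relation pair, how many entities they share.
    shared: Dict[Tuple[str, str], int] = defaultdict(int)
    for rels in relations_at_entity.values():
        for a, b in combinations(sorted(rels), 2):
            shared[(a, b)] += 1

    edges: Set[Tuple[str, str]] = set()
    for (a, b), count in shared.items():
        if count >= min_count:
            # Co-occurrence is symmetric, so both directions are added.
            edges.add((a, b))
            edges.add((b, a))
    return edges


# Hypothetical training triples.
train = {
    ("yarn_break", "has_cause", "worn_guide"),
    ("yarn_break", "raises_alarm", "alarm_17"),
    ("worn_guide", "located_at", "spindle_unit"),
}
A_R = build_relation_graph(train)
```

In this example `has_cause` is linked to both `raises_alarm` and `located_at`, whereas `raises_alarm` and `located_at` share no entity and therefore remain unconnected.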

The dynamic relation prior is then formulated as

$$e_{r}^{\mathrm{prior}} = f_{\mathrm{RGNN}}(e_{r}, x_{h, r}; G_{R}), \tag{3}$$

where $f_{\mathrm{RGNN}}(\cdot)$ denotes the query-conditioned relation-level graph propagation function over the predefined relation graph $G_{R}$. This formulation indicates that the generated relation prior depends not only on the base relation representation and the current query condition, but also on the structural dependencies encoded in the relation graph.

More specifically, the initial state of relation $r$ can be written as

$$\bar{e}_{r}^{(0)} = \eta(e_{r}, x_{h, r}), \tag{4}$$

where $\eta(\cdot)$ denotes a query-conditioned relation initialization function. Relation-level propagation is then performed over $G_{R}$:

$$\bar{e}_{r}^{(l + 1)} = \mathrm{RGNN}^{(l)}\left(\bar{e}_{r}^{(l)}, \{\bar{e}_{r'}^{(l)} \mid (r', r) \in A_{R}\}\right), \tag{5}$$

where $\bar{e}_{r}^{(l)}$ denotes the relation representation at the $l$-th relation-level propagation layer, and $\mathrm{RGNN}^{(l)}(\cdot)$ denotes the relation-level message-passing operation. After $L_{R}$ relation-level propagation layers, the final dynamic relation prior is obtained as

$$e_{r}^{\mathrm{prior}} = \bar{e}_{r}^{(L_{R})}. \tag{6}$$

This design aligns relation representations with the current sample context before propagation starts, thereby alleviating the rigidity of static relation vectors. Meanwhile, it also provides a more robust initialization basis for subsequent hierarchy-aware relation propagation, enabling propagation to proceed on relation representations that are better adapted to the current query context. Therefore, the dynamic relation prior is not used as an independent scoring branch; instead, it serves as a query-adaptive relation representation generated from the relation graph and guides the following relation-centered structural propagation.
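As an illustration of Eqs. (4)–(6), the following minimal Python sketch propagates query-conditioned relation states over a small relation graph. The concrete choices are assumptions made only for this example and are not specified in the text: $\eta(\cdot)$ is taken as an element-wise sum, every relation node is initialized with the same query context, the RGNN layer averages the in-neighbour states with the node's own state and applies tanh, and the relation names are hypothetical.

```python
import math

def init_state(e_r, x_q):
    # Eq. (4) stand-in: eta(e_r, x_{h,r}) as an element-wise sum (assumption).
    return [a + b for a, b in zip(e_r, x_q)]

def rgnn_layer(states, edges):
    # Eq. (5) stand-in: average the mean in-neighbour state with the own state, then tanh.
    # edges holds pairs (r_prime, r), i.e. a relation-level edge from r_prime to r.
    new_states = {}
    for r, own in states.items():
        neigh = [states[src] for (src, dst) in edges if dst == r]
        if neigh:
            agg = [sum(col) / len(neigh) for col in zip(*neigh)]
        else:
            agg = [0.0] * len(own)
        new_states[r] = [math.tanh(0.5 * (o + a)) for o, a in zip(own, agg)]
    return new_states

def dynamic_relation_prior(base, x_q, edges, num_layers):
    # Eq. (6): the prior is the relation state after num_layers propagation steps.
    states = {r: init_state(e, x_q) for r, e in base.items()}
    for _ in range(num_layers):
        states = rgnn_layer(states, edges)
    return states

# Hypothetical toy relation graph with 2-dimensional base representations.
base = {"causes": [1.0, 0.0], "located_at": [0.0, 1.0], "triggers": [0.5, 0.5]}
edges = [("triggers", "causes"), ("located_at", "causes")]
x_q = [0.1, -0.1]  # query-specific contextual feature x_{h,r}
prior = dynamic_relation_prior(base, x_q, edges, num_layers=2)
```

Changing `x_q` changes the resulting prior for the same relation, which is the query-adaptive behaviour the mechanism is intended to provide.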

3.3. Semantic Fusion Module

Relying solely on graph structural information is inadequate to distinguish entities that are topologically similar but semantically different, especially under complex relation mappings. To compensate for the limitations of structural representations, a semantic fusion module is introduced to construct semantic representations from textual descriptions of entities and relations and combine them with structural information. It should be noted that the semantic module in this study is mainly integrated at the candidate-ranking level rather than being injected into every message-passing layer. This design provides textual semantic evidence for candidate discrimination while avoiding excessive propagation overhead.

Specifically, a pretrained text encoder is employed to encode entity names, relation names, and their associated descriptive information. The resulting high-dimensional text embeddings are then projected into a latent space aligned with structural representations. Let the textual descriptions of entity $e$ and relation $r$ be denoted as $\mathrm{desc}(e)$ and $\mathrm{desc}(r)$, respectively. Their high-dimensional text vectors can be written as

$$u_{e} = \mathrm{Enc}(\mathrm{desc}(e)), \tag{7}$$

$$u_{r} = \mathrm{Enc}(\mathrm{desc}(r)), \tag{8}$$

where $\mathrm{Enc}(\cdot)$ denotes the pretrained text encoder. The text encoder is used to provide semantic representations of entity and relation descriptions. These representations can be precomputed before model training, which reduces repeated encoding cost during training and inference.

The text vectors are then mapped into a unified latent space through linear projection to obtain semantic representations of entities and relations:

$$z_{e}^{\mathrm{sem}} = W_{e} u_{e} + b_{e}, \tag{9}$$

$$z_{r}^{\mathrm{sem}} = W_{r} u_{r} + b_{r}, \tag{10}$$

where $W_{e}$ and $W_{r}$ are learnable projection matrices, and $b_{e}$ and $b_{r}$ are bias terms.

For the tail prediction task $(h, r, ?)$, a semantic query vector is constructed from the semantic representations of the head entity and the relation:

$$q_{h,r}^{\mathrm{sem}} = \psi(z_{h}^{\mathrm{sem}}, z_{r}^{\mathrm{sem}}), \tag{11}$$

where $\psi(\cdot)$ denotes the semantic query composition function. In this study, $\psi(\cdot)$ can be implemented by concatenation followed by a linear projection,

$$q_{h,r}^{\mathrm{sem}} = W_{q} [z_{h}^{\mathrm{sem}}; z_{r}^{\mathrm{sem}}] + b_{q}, \tag{12}$$

where $W_{q}$ and $b_{q}$ are learnable parameters. The semantic score is then computed according to the matching degree between the semantic query vector and the semantic representation of the candidate tail entity:

$$s_{\mathrm{sem}}(h, r, t) = g_{\mathrm{sem}}(q_{h,r}^{\mathrm{sem}}, z_{t}^{\mathrm{sem}}), \tag{13}$$

where $g_{\mathrm{sem}}(\cdot)$ denotes the semantic scoring function. A simple and efficient instantiation is cosine similarity or a dot-product matching function. For example, the cosine form can be written as

$$s_{\mathrm{sem}}(h, r, t) = \frac{(q_{h,r}^{\mathrm{sem}})^{\top} z_{t}^{\mathrm{sem}}}{\lVert q_{h,r}^{\mathrm{sem}} \rVert_{2} \, \lVert z_{t}^{\mathrm{sem}} \rVert_{2}}. \tag{14}$$

Through this module, the model can exploit textual semantic information from entities and relations to perform finer-grained discrimination among candidates that are structurally similar but semantically different, thereby improving semantic discrimination during candidate ranking. Thus, the semantic module provides a candidate-ranking-level complement to the structural reasoning backbone, rather than replacing relation-centered structural propagation.
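The semantic branch in Eqs. (9)–(14) can be sketched in a few lines of plain Python. The example assumes precomputed text vectors and uses hand-picked toy projection matrices and hypothetical candidate names; in the actual model the projections are learned and the text vectors come from the pretrained encoder.

```python
import math

def linear(W, b, u):
    # Affine map W u + b with W given as a list of rows.
    return [sum(w * x for w, x in zip(row, u)) + bi for row, bi in zip(W, b)]

def cosine(a, b):
    # Eq. (14): cosine similarity, returning 0.0 for a zero-norm vector.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)

def semantic_scores(u_h, u_r, cand_u, W_e, b_e, W_r, b_r, W_q, b_q):
    z_h = linear(W_e, b_e, u_h)        # Eq. (9) for the head entity
    z_r = linear(W_r, b_r, u_r)        # Eq. (10) for the relation
    q = linear(W_q, b_q, z_h + z_r)    # Eq. (12): list "+" is concatenation here
    return {name: cosine(q, linear(W_e, b_e, u)) for name, u in cand_u.items()}

# Toy 2-dimensional setup: identity projections, and W_q adds z_h and z_r.
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [0.0, 0.0]
W_q = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
candidates = {"cand_a": [2.0, 2.0], "cand_b": [1.0, -1.0]}  # hypothetical names
scores = semantic_scores([1.0, 0.0], [0.0, 1.0], candidates,
                         identity, zeros, identity, zeros, W_q, zeros)
```

Because the candidate text vectors do not depend on the query, their projections can be cached once per training step, which is consistent with the candidate-ranking-level integration described above.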

3.4. Type-Constrained and Hierarchy-Aware Relation Propagation

After obtaining dynamic relation priors and semantic representations, the model performs relation-centered multi-hop propagation over the knowledge graph to extract structural evidence relevant to the current query. Unlike traditional entity-centered aggregation, relation-centered propagation emphasizes iterative expansion of the query relation along multi-hop paths, which is more suitable for modeling relation dependencies and path reasoning in link prediction. Following the relation-centered reasoning idea, the propagation process is conditioned on the query relation and aims to estimate the plausibility of candidate entities through multi-hop structural evidence.

However, in industrial knowledge graphs, propagation based solely on local adjacency is still insufficient. Industrial relations usually have explicit category semantics and upper-level relation types. For example, different relations may belong to broader industrial categories such as fault-causality, monitoring association, structural composition, or spatial deployment. These relation-category priors provide useful constraints for relation representation learning, especially when local graph structures are sparse or relation instances are imbalanced. Therefore, instead of assigning edge-level softmax attention weights according to entity type consistency, this study introduces a hierarchy-aware relation attention mechanism to modulate relation representations before they participate in relation-centered propagation.

For each relation $r$, let $c(r)$ denote its upper-level relation category, and let $e_{c(r)}$ denote the corresponding category representation. Given the relation state $h_{r}^{(l)}$ at layer $l$ and the query-specific contextual feature $x_{h,r}$, the hierarchy-aware relation attention coefficient is computed as

$$a_{h,r}^{(l)} = \sigma\left(W_{a} [h_{r}^{(l)}; e_{c(r)}; x_{h,r}] + b_{a}\right), \tag{15}$$

where $W_{a}$ and $b_{a}$ are learnable parameters, $[\cdot; \cdot]$ denotes vector concatenation, and $\sigma(\cdot)$ is the sigmoid activation function. The attention coefficient controls the relative contribution of the current relation state and its upper-level relation-category representation.

The hierarchy-modulated relation representation is then obtained as

$$\tilde{h}_{r}^{(l)} = a_{h,r}^{(l)} \odot h_{r}^{(l)} + (1 - a_{h,r}^{(l)}) \odot e_{c(r)}, \tag{16}$$

where $\odot$ denotes element-wise multiplication. When $a_{h,r}^{(l)}$ is large, the model relies more on the relation state learned from the current propagation layer; when $a_{h,r}^{(l)}$ is small, the model incorporates more upper-level category information. Therefore, this mechanism can be regarded as a gated attention mechanism between relation-specific semantics and relation-category semantics.

To avoid excessive dependence on category-level priors, a hierarchy modulation coefficient $\gamma$ is further introduced:

$$\tilde{h}_{r}^{(l)} = (1 - \gamma) h_{r}^{(l)} + \gamma \left[a_{h,r}^{(l)} \odot h_{r}^{(l)} + (1 - a_{h,r}^{(l)}) \odot e_{c(r)}\right], \tag{17}$$

where $\gamma \in [0, 1]$ controls the strength of hierarchy-aware modulation. Setting $\gamma = 1$ recovers Eq. (16), whereas $\gamma = 0$ leaves the relation state $h_{r}^{(l)}$ unchanged. A smaller $\gamma$ makes the model closer to the original relation-centered propagation backbone, whereas a larger $\gamma$ strengthens the influence of relation-category priors.
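The gating in Eqs. (15)–(17) can be checked numerically with the short Python sketch below. It is an illustrative reading of the equations rather than the authors' implementation; the weight matrix and vectors are toy values chosen so that the result can be verified by hand.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hierarchy_gate(h_r, e_c, x_q, W_a, b_a):
    # Eq. (15): a = sigmoid(W_a [h_r; e_c; x_q] + b_a), one gate value per dimension.
    concat = h_r + e_c + x_q
    return [sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
            for row, b in zip(W_a, b_a)]

def modulate(h_r, e_c, gate, gamma):
    # Eq. (17): blend the original state with the gated mixture of Eq. (16).
    out = []
    for h, c, a in zip(h_r, e_c, gate):
        gated = a * h + (1.0 - a) * c          # Eq. (16)
        out.append((1.0 - gamma) * h + gamma * gated)
    return out

# Toy values: zero weights give a gate of 0.5 in every dimension.
h_r = [1.0, 0.0]   # relation state at the current layer
e_c = [0.0, 1.0]   # category representation e_{c(r)}
x_q = [0.3, 0.7]   # query context (ignored here because W_a is zero)
W_a = [[0.0] * 6, [0.0] * 6]
b_a = [0.0, 0.0]

gate = hierarchy_gate(h_r, e_c, x_q, W_a, b_a)
full = modulate(h_r, e_c, gate, gamma=1.0)   # midpoint of h_r and e_c
none = modulate(h_r, e_c, gate, gamma=0.0)   # identical to h_r
```

With a gate of 0.5, `gamma=1.0` yields the midpoint of the relation state and the category vector, while `gamma=0.0` returns the relation state untouched.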

After relation-category-aware modulation, the resulting relation representation $\tilde{h}_{r}^{(l)}$ is used in the relation-centered message-passing process. For an edge associated with relation $r_{ij}$, the message propagated from node $j$ to node $i$ at layer $l$ can be written as

$$m_{ij}^{(l)} = \rho\left(h_{j}^{(l)}, \tilde{h}_{r_{ij}}^{(l)}\right), \tag{18}$$

where $h_{j}^{(l)}$ denotes the representation of node $j$ at layer $l$, $\tilde{h}_{r_{ij}}^{(l)}$ denotes the hierarchy-modulated relation representation, and $\rho(\cdot)$ denotes the relation-specific message function. In implementation, $\rho(\cdot)$ can be instantiated by operations such as TransE-style addition, DistMult-style multiplication, or RotatE-style complex interaction.

The neighborhood messages are then aggregated by a permutation-invariant aggregation function:

$$u_{i}^{(l)} = \mathrm{AGG}\left(\{m_{ij}^{(l)} \mid j \in N(i)\}\right), \tag{19}$$

where $N(i)$ denotes the neighborhood of node $i$, and $\mathrm{AGG}(\cdot)$ can be instantiated as the sum, mean, max, or PNA aggregation. In the PNA setting, multiple statistics, including mean, maximum, minimum, and standard deviation, are combined with degree-based scaling to enhance the expressive ability of neighborhood aggregation.

The representation of node $i$ at the next layer is updated as

$$h_{i}^{(l+1)} = \sigma\left(W^{(l)} [h_{i}^{(l)}; u_{i}^{(l)}]\right), \tag{20}$$

where $W^{(l)}$ is a learnable transformation matrix and $\sigma(\cdot)$ is a nonlinear activation function.
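One propagation layer following Eqs. (18)–(20) can be sketched as below. The specific instantiations are assumptions for illustration: a DistMult-style element-wise product for $\rho(\cdot)$, sum aggregation for $\mathrm{AGG}(\cdot)$, and ReLU as the nonlinearity. Node names, the relation name, and the weights are hypothetical.

```python
def message(h_j, rel):
    # Eq. (18) with a DistMult-style element-wise product (assumption).
    return [a * b for a, b in zip(h_j, rel)]

def aggregate(msgs, dim):
    # Eq. (19) with sum aggregation, which is permutation-invariant.
    agg = [0.0] * dim
    for m in msgs:
        for k, v in enumerate(m):
            agg[k] += v
    return agg

def update(h_i, u_i, W):
    # Eq. (20): ReLU(W [h_i; u_i]) with W given as a list of rows.
    concat = h_i + u_i
    return [max(0.0, sum(w * v for w, v in zip(row, concat))) for row in W]

def propagate(node_states, rel_states, edges, W):
    # edges holds triples (i, rel, j): a message flows from node j to node i.
    dim = len(next(iter(node_states.values())))
    new_states = {}
    for i, h_i in node_states.items():
        msgs = [message(node_states[j], rel_states[r])
                for (dst, r, j) in edges if dst == i]
        new_states[i] = update(h_i, aggregate(msgs, dim), W)
    return new_states

nodes = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [0.5, 0.5]}
rels = {"r": [1.0, 0.5]}                      # hierarchy-modulated relation state
edges = [("a", "r", "b"), ("a", "r", "c")]
W = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]  # toy weights: h_i + u_i
out = propagate(nodes, rels, edges, W)
```

Swapping `message` for a TransE-style addition or `aggregate` for a mean changes only the corresponding function, which mirrors how the text leaves $\rho(\cdot)$ and $\mathrm{AGG}(\cdot)$ open.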

It should be noted that the attention mechanism described above operates on relation representations and relation-category priors, rather than assigning softmax attention weights to each propagation edge according to entity type consistency. Entity type constraints are incorporated mainly through the type-aware scoring branch and type-constrained hard negative construction. In this way, relation-category information helps guide structural propagation, while entity type information further constrains candidate ranking and training.

Based on the above propagation process, the structural score can be further obtained as

$$s_{\mathrm{str}}(h, r, t) = g_{\mathrm{str}}(z_{h}^{\mathrm{str}}, z_{r}^{\mathrm{str}}, z_{t}^{\mathrm{str}}), \tag{21}$$

where $z_{h}^{\mathrm{str}}$, $z_{r}^{\mathrm{str}}$, and $z_{t}^{\mathrm{str}}$ denote the structural representations of the head entity, relation, and tail entity, respectively, and $g_{\mathrm{str}}(\cdot)$ is the structural scoring function. This process enables the model to perform relation-centered multi-hop structural reasoning under the guidance of dynamic relation priors and relation-category-aware attention, thereby producing more reliable structural evidence for candidate ranking in industrial knowledge graphs.

3.5. Type-Constrained Negative Sampling

After structural propagation and semantic modeling, the model is able to rank candidate triples from the perspectives of structural dependency, latent semantic relevance, and industrial rule consistency. However, if conventional random negative sampling is still used during training, the advantages of the model in industrial knowledge graph scenarios cannot be fully exploited. Randomly replacing the head or tail entity often produces a large number of trivial negatives, such as entities with clearly incorrect types or entities that are clearly unrelated to the current relation. Such negatives provide limited learning value and fail to reflect the industrial scenario in which candidates are highly confusing but entity-type boundaries are explicit.

To address this issue, a type-constrained negative sampling strategy is adopted by explicitly considering the type boundaries associated with each relation when constructing negative samples. Let the ground-truth triple be $(h, r, t)$. For each relation $r$, define its valid head and tail type sets as

$$C(r) = (T_{h}^{r}, T_{t}^{r}), \tag{22}$$

where $T_{h}^{r}$ and $T_{t}^{r}$ denote the valid head-type set and valid tail-type set for relation $r$, respectively.

For tail replacement, the candidate hard negative set can be defined as

$$N_{\mathrm{hard}}^{K}(h, r, t) = \{t' \mid t' \in E, \ \mathrm{type}(t') \in T_{t}^{r}, \ t' \neq t\}, \tag{23}$$

where $E$ is the entity set and the right-hand side collects the candidates satisfying the tail-type constraint, from which the $K$ hard negatives denoted by $N_{\mathrm{hard}}^{K}(h, r, t)$ are further selected. Similarly, for head prediction, corresponding hard negatives can be constructed from candidates satisfying $T_{h}^{r}$.

Rather than directly using all type-valid candidates as negatives, the proposed method further filters these candidates according to the current model scores, including structural scores, semantic scores, and type-constrained scores, so that the selected negatives are not only type-valid but also highly confusing under the current model. Specifically, the negative construction process contains two stages. In the first stage, candidates that violate the relation-specific type constraints are filtered out, and only type-valid candidates are retained. In the second stage, the retained candidates are ranked according to the current model score, and the top-K high-scoring false candidates are selected as hard negatives. As a result, the resulting negatives are closer to the positive samples in terms of type, structure, and semantics, forcing the model to learn a more refined decision boundary during training.
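The two-stage construction described above can be sketched as follows for tail replacement. The entity names, the relation name, and the score table are hypothetical, `score_fn` stands in for the current model score, and the optional `known_true` argument (used to skip other observed true tails) is an assumption added for this example.

```python
def hard_negatives(head, relation, true_tail, entity_types, valid_tail_types,
                   score_fn, k, known_true=frozenset()):
    # Stage 1: keep only candidates whose type is valid for the relation's tail slot.
    pool = [e for e, etype in entity_types.items()
            if etype in valid_tail_types[relation]
            and e != true_tail
            and e not in known_true]
    # Stage 2: rank the type-valid candidates by the current model score, keep top-k.
    pool.sort(key=lambda e: score_fn(head, relation, e), reverse=True)
    return pool[:k]

# Hypothetical CNC-style toy data using the entity types named in the paper.
entity_types = {
    "spindle_overheat": "phenomenon",
    "bearing_wear": "cause",
    "coolant_leak": "cause",
    "loose_cable": "cause",
    "encoder_fault": "cause",
    "alarm_401": "error code",
}
valid_tail_types = {"caused_by": {"cause"}}
toy_scores = {"bearing_wear": 0.9, "coolant_leak": 0.7, "loose_cable": 0.2,
              "encoder_fault": 0.5, "alarm_401": 0.95, "spindle_overheat": 0.1}

def score_fn(h, r, t):
    # Stand-in for s_final(h, r, t) under the current model parameters.
    return toy_scores[t]

negs = hard_negatives("spindle_overheat", "caused_by", "bearing_wear",
                      entity_types, valid_tail_types, score_fn, k=2)
```

In this toy case `alarm_401` has the highest score but is removed in the first stage because its type is invalid for the tail slot, so the selected negatives are the two highest-scoring type-valid causes.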

3.6. Constraint-Aware Joint Training Objective

Based on the above structural propagation, semantic fusion, and type-constrained negative sampling, a unified constraint-aware training objective is further designed. For a candidate triple $(h, r, t)$, in addition to the structural score $s_{\mathrm{str}}(h, r, t)$ and the semantic score $s_{\mathrm{sem}}(h, r, t)$, a type-constrained score is introduced as

$$s_{\mathrm{type}}(h, r, t) = \sigma\left(w_{r}^{\top} [e_{\mathrm{type}}(h); e_{\mathrm{type}}(t)] + b_{r}\right), \tag{24}$$

where $e_{\mathrm{type}}(h)$ and $e_{\mathrm{type}}(t)$ denote the type vectors of the head and tail entities, respectively, and $w_{r}$ and $b_{r}$ are relation-specific learnable parameters.

The final scoring function is then defined as

$$s_{\mathrm{final}}(h, r, t) = s_{\mathrm{str}}(h, r, t) + \alpha \, s_{\mathrm{sem}}(h, r, t) + \beta \, s_{\mathrm{type}}(h, r, t), \tag{25}$$

where $\alpha$ and $\beta$ denote the weights of the semantic branch and the type-constrained branch, respectively. This formulation unifies structural information, semantic information, and industrial rule constraints in a single framework, so that final candidate ranking is determined by multiple sources of evidence rather than a single signal. Here, the structural score provides multi-hop graph evidence, the semantic score provides textual matching evidence at the candidate-ranking level, and the type-constrained score reflects relation-specific type validity.

Let the final score of the ground-truth triple $(h, r, t)$ be $s_{\mathrm{final}}(h, r, t)$, and let the corresponding negative triple be denoted as $(h', r, t')$. The model is trained using a ranking-based loss:

$$L_{\mathrm{rank}} = \sum_{(h, r, t) \in T} \ \sum_{(h', r, t') \in N_{\mathrm{hard}}} \max\left(0, \ \mu - s_{\mathrm{final}}(h, r, t) + s_{\mathrm{final}}(h', r, t')\right), \tag{26}$$

where $\mu$ denotes the margin hyperparameter, and $N_{\mathrm{hard}}$ denotes the hard negative set generated by the type-constrained negative sampling strategy. The symbol $\mu$ is used here to avoid confusion with the hierarchy modulation coefficient $\gamma$ in the propagation module. The objective is to enforce a sufficiently large margin between the scores of positive and negative triples, thereby enabling the model to learn a more stable decision boundary.
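A small numerical sketch of Eqs. (24)–(26) is given below. The branch scores, weights, and margin are arbitrary toy values rather than settings from the experiments, and the loss is shown for a single positive triple with its hard negatives.

```python
import math

def type_score(type_vec_h, type_vec_t, w_r, b_r):
    # Eq. (24): sigmoid(w_r^T [e_type(h); e_type(t)] + b_r).
    z = sum(w * v for w, v in zip(w_r, type_vec_h + type_vec_t)) + b_r
    return 1.0 / (1.0 + math.exp(-z))

def final_score(s_str, s_sem, s_type, alpha, beta):
    # Eq. (25): weighted sum of the three branch scores.
    return s_str + alpha * s_sem + beta * s_type

def margin_ranking_loss(pos_score, neg_scores, margin):
    # Eq. (26) restricted to one positive triple: hinge on each hard negative.
    return sum(max(0.0, margin - pos_score + n) for n in neg_scores)

# Toy values (assumptions for illustration only).
pos = final_score(s_str=2.0, s_sem=0.5, s_type=1.0, alpha=0.5, beta=0.25)
neg_scores = [1.0, 2.0, 3.0]
loss = margin_ranking_loss(pos, neg_scores, margin=1.0)

# With zero weights and bias the type score is 0.5 for any type pair.
neutral = type_score([1.0, 0.0], [0.0, 1.0], [0.0, 0.0, 0.0, 0.0], 0.0)
```

Here the positive score is 2.5, so the negative scored 1.0 already satisfies the margin and contributes nothing, while the negatives scored 2.0 and 3.0 contribute 0.5 and 1.5, respectively.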

During inference, for a given query $(h, r, ?)$ or $(?, r, t)$, the model first performs relation-aware propagation to obtain query-specific structural representations. It then computes structural, semantic, and type-constrained scores, combines them through the final scoring function, and ranks all candidate entities in descending order of the final score. In this way, the proposed framework forms a complete link prediction model for industrial knowledge graphs based on structural and semantic fusion. In contrast to the training stage, no negative sampling is required during inference; all candidate entities are scored and ranked according to $s_{\mathrm{final}}(h, r, t)$.

3.7. Computational Complexity and Scalability Analysis

This subsection analyzes the additional computational cost introduced by the proposed modules. Let $n = |E|$ denote the number of entities, $m = |T|$ denote the number of observed triples, $d$ denote the hidden dimension, $L$ denote the number of entity-graph propagation layers, $L_{R}$ denote the number of relation-graph propagation layers, $|A_{R}|$ denote the number of relation-level edges in the predefined relation graph, and $K$ denote the number of selected hard negatives for each positive triple.

For relation-centered structural propagation over the knowledge graph, the dominant cost comes from message passing over graph edges. The complexity is approximately $O(L m d)$, which is comparable to common multi-hop graph neural reasoning models when the same number of propagation layers and hidden dimensions are used.

For dynamic relation prior generation, the module operates on the relation graph $G_{R} = (R, A_{R})$ rather than directly over the full entity graph. Therefore, for a relation-level GNN with $L_{R}$ layers and hidden dimension $d$, the propagation cost is approximately $O(L_{R} |A_{R}| d)$, excluding small linear transformations. Since the number of relation nodes and relation-level edges is usually much smaller than the number of entity nodes and entity-level triples, the additional cost of relation-level prior generation is limited compared with the main entity-graph propagation backbone.

For the semantic module, the most expensive operation is text encoding. However, entity and relation textual embeddings can be precomputed before model training because entity names, relation names, and descriptions are fixed for a given dataset. During training and inference, the semantic module mainly involves linear projection and vector matching, whose cost is approximately $O(n d)$ for scoring all candidate entities in a query. Therefore, the semantic branch introduces limited additional online cost compared with repeatedly encoding text during each training step.

For type-constrained hard negative sampling, the first-stage type filtering reduces the candidate space by excluding type-invalid entities. The second-stage hard negative selection requires scoring retained candidates and selecting the top-K confusing negatives. Although this introduces additional training overhead compared with random negative sampling, it provides more informative negatives and improves training effectiveness in industrial graphs where many random negatives are trivial. During inference, the negative sampling module is not used, so it does not increase inference complexity.

Overall, the proposed framework introduces moderate additional cost through relation-graph-based dynamic relation prior generation, semantic candidate scoring, and type-constrained hard negative sampling. These costs are controlled by performing relation-level propagation on a relatively small relation graph, precomputing textual embeddings, using lightweight projection layers, and restricting hard negative selection to type-valid candidates. Therefore, the framework remains scalable for medium-scale industrial knowledge graphs. For very large-scale or rapidly changing industrial knowledge graphs, more efficient candidate retrieval and incremental updating strategies will be further investigated in future work.

4. Experiments

4.1. Datasets and Experimental Settings

The experiments are conducted on four datasets to evaluate the proposed model from complementary perspectives, including general link prediction capability, industrial-domain adaptability, and practical applicability in fault diagnosis. Specifically, WN18RR [21] and FB15k-237 [24] are used as public benchmark datasets to evaluate the general link prediction capability of the proposed model. WN18RR is derived from WordNet and exhibits a clear semantic hierarchy, making it suitable for evaluating the ability to model hierarchical relations and multi-hop reasoning patterns. FB15k-237 is derived from Freebase and removes inverse-relation leakage, making it more appropriate for evaluating generalization under complex relational settings. Beyond these general-purpose benchmarks, two industrial knowledge graph datasets are further used to assess the adaptability and robustness of the proposed model in domain-specific industrial scenarios. Among them, the Chinese industrial knowledge graph is constructed from Computer Numerical Control (CNC) fault diagnosis knowledge and is used to examine the practical applicability of the proposed model in a representative fault diagnosis scenario, while the English industrial knowledge graph is derived from a related Industry 4.0 production-line scenario and is used to evaluate cross-scenario adaptability in a broader industrial context.

The Chinese industrial knowledge graph is constructed from the book Practical Computer Numerical Control (CNC) Machine Tool Fault Diagnosis and Maintenance: 500 Cases [25]. Specifically, the Chinese dataset is built by extracting fault-related knowledge from the book and then performing entity standardization, relation normalization, triple construction, noise cleaning, and duplicate removal, so as to obtain an industrial link prediction dataset for a representative CNC fault diagnosis application scenario. This dataset is used to evaluate the practical applicability of the proposed industrial knowledge graph link prediction model in fault diagnosis, since its triples mainly describe diagnostic relations among fault phenomena, alarm information, fault locations, fault causes, and related operations. During entity standardization, synonymous expressions, variant names, and inconsistent naming formats were merged according to their semantic roles in fault knowledge. Relation normalization was conducted by mapping extracted relations into a predefined set of industrial fault-diagnosis relations. Triples with ambiguous entities, duplicated records, or insufficient relation evidence were removed during data cleaning. The final triples were split into training, validation, and test sets under a fixed random seed, while keeping the relation distribution approximately consistent across different splits.

In contrast, the English industrial knowledge graph is mainly derived from a publicly available Industry 4.0 production-line dataset released on Zenodo [26] and later described in detail in the corresponding journal article [27]. More specifically, the English industrial dataset used in this study is not the full original dataset, but a reconstructed subgraph extracted from that benchmark dataset for the link prediction task. Although this dataset is not a fault-diagnosis dataset in a narrow sense, it represents a related industrial production-line scenario and is used as an additional industrial dataset to evaluate whether the proposed model remains effective beyond the representative fault diagnosis application dataset. This design allows the resulting dataset to retain realistic industrial semantics while remaining suitable for controlled experimental evaluation. For the English industrial dataset, relations and entities that were too sparse, duplicate-like, or mainly descriptive rather than relational were removed, and the remaining triples were reorganized into a link-prediction-oriented format.

For the Chinese industrial knowledge graph, entities are categorized into five types according to their semantic roles in fault knowledge, namely, error code, operation, phenomenon, fault location, and cause. Based on these categories, valid head- and tail-type sets are further constructed for each relation, which are used for type-constrained scoring and type-constrained hard negative sampling. These entity types correspond to key diagnostic elements in CNC fault diagnosis and provide explicit type boundaries for judging whether a candidate entity is valid under a specific diagnostic relation. Although these type definitions are derived from the fault diagnosis application, they also reflect a common characteristic of industrial knowledge graphs, namely, that many relations impose explicit validity constraints on the semantic roles of head and tail entities. The statistics of the four datasets are summarized in Table 2.

Beyond the basic statistics, the four datasets differ substantially in their domains, semantic sources, and structural characteristics. WN18RR is a lexical knowledge graph derived from WordNet, and its textual semantics mainly come from WordNet glosses and relation labels. It contains clear hierarchical semantic relations and is therefore suitable for evaluating multi-hop reasoning and semantic hierarchy modeling. FB15k-237 is a general-domain knowledge graph derived from Freebase, where inverse-relation leakage has been removed. Compared with WN18RR, it contains more diverse relation patterns and is used to evaluate generalization under complex relational settings.

The Chinese industrial knowledge graph is constructed from CNC fault diagnosis and maintenance knowledge and is used as a representative application dataset in this study. Its entities and relations mainly describe diagnostic elements such as fault phenomena, alarm information, fault locations, fault causes, and operations. Therefore, it exhibits sparse local structures, concentrated diagnostic relations, explicit entity-type boundaries, and highly confusing candidates under certain diagnostic relations. These characteristics make it suitable for evaluating the practical value of the proposed industrial knowledge graph link prediction model in fault diagnosis applications.

The English industrial knowledge graph is derived from an Industry 4.0 production-line dataset and is used as an additional industrial dataset for cross-scenario validation. Although it is not a fault-diagnosis dataset in a narrow sense, it contains realistic industrial entities and relations from a production-line scenario, with a larger candidate space than the Chinese industrial knowledge graph. This dataset is therefore used to examine whether the proposed model remains effective in a related industrial scenario beyond the representative fault diagnosis application dataset. For both industrial datasets, semantic inputs are mainly constructed from entity names, relation names, and available domain descriptions.

All experiments were implemented using Python 3.8, PyTorch 1.10.0 (Meta AI, Menlo Park, CA, USA), CUDA 11.3 (NVIDIA Corporation, Santa Clara, CA, USA), and Transformers 4.18.0 (Hugging Face, New York, NY, USA). The experiments were conducted on a workstation equipped with an NVIDIA RTX A6000 GPU with 48 GB memory (NVIDIA Corporation, Santa Clara, CA, USA).

During preprocessing, the following steps are performed. First, inverse relations are constructed for each relation to enhance the modeling of bidirectional relation patterns. Second, pretrained text encoders are used to obtain semantic vectors of entities and relations. For the public datasets, textual inputs are mainly derived from WordNet glosses or Freebase labels, whereas for the industrial datasets, entity names, relation names, and domain descriptions are used. Third, a relation co-occurrence graph is constructed to learn relation priors, which provides structural support for subsequent dynamic relation prior generation. Finally, a type-constrained hard negative sampling strategy is employed during training to improve discrimination among highly confusing candidates.
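The first and third preprocessing steps can be sketched as follows. The text does not define co-occurrence precisely, so this example assumes that two relations co-occur when they are incident to the same entity in the training triples; the `_inv` suffix, the `min_count` threshold, and the toy triples are likewise hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def add_inverse_triples(triples):
    # For every (h, r, t), add (t, r_inv, h) so that both directions can be traversed.
    return triples + [(t, r + "_inv", h) for h, r, t in triples]

def relation_cooccurrence(triples, min_count=1):
    # Count, for each unordered relation pair, the entities that both relations touch.
    rels_at = defaultdict(set)
    for h, r, t in triples:
        rels_at[h].add(r)
        rels_at[t].add(r)
    counts = defaultdict(int)
    for rels in rels_at.values():
        for a, b in combinations(sorted(rels), 2):
            counts[(a, b)] += 1
    # Keep only pairs seen at least min_count times as relation-level edges.
    return {pair: c for pair, c in counts.items() if c >= min_count}

# Hypothetical training triples only; validation and test triples are not used.
train = [
    ("e1", "has_alarm", "a1"),
    ("e1", "caused_by", "c1"),
    ("e2", "has_alarm", "a1"),
]
augmented = add_inverse_triples(train)
cooc = relation_cooccurrence(train)
```

Restricting both functions to the training triples matches the leakage-avoidance protocol used for relation-prior construction in this study.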

To avoid potential information leakage, all trainable structural statistics, including relation co-occurrence statistics, dynamic relation prior inputs, type constraints used during training, and hard negative candidate pools, were constructed only from the training triples. Validation and test triples were not used for model training, relation-prior construction, or hard negative construction. For semantic initialization, only entity names, relation names, and externally available textual descriptions were used, and no validation or test triple labels were introduced during text embedding construction. Validation triples were used only for model selection, and test triples were used only for final evaluation. In the filtered evaluation protocol, known true triples from the training, validation, and test sets were removed from the corrupted candidate list following the standard link prediction setting, so as to avoid false-negative interference during ranking.

All experiments are evaluated under the filtered setting described above for both head and tail prediction. The evaluation metrics are the commonly used Mean Reciprocal Rank (MRR), Hits@1, Hits@3, and Hits@10. For the public datasets, preprocessing mainly introduces inverse relations, text-based semantic initialization, and relation co-occurrence priors to strengthen general link prediction performance. For the industrial datasets, entity type constraints, relation prior enhancement, and hard negative construction are further incorporated to improve the adaptability of the model in industrial knowledge graphs. For the Chinese industrial knowledge graph, these constraints directly reflect diagnostic role boundaries among fault phenomena, alarm information, fault locations, fault causes, and operations, thereby supporting the evaluation of the proposed model in a representative fault diagnosis application. For the English industrial knowledge graph, the same modeling strategy is applied to examine whether the proposed model can generalize to a related production-line knowledge graph. All baseline methods are evaluated under the same filtered setting, and the reported results are based on the same training, validation, and test splits for fair comparison.
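For reference, filtered ranking and the reported metrics can be computed as in the minimal sketch below. The tie-handling rule (only strictly higher scores count against the target) and the toy scores are assumptions for this example; evaluation code in practice may break ties differently.

```python
def filtered_rank(scores, true_entity, known_true):
    # Rank of the true entity after removing other known true answers from the list.
    target = scores[true_entity]
    higher = sum(1 for e, s in scores.items()
                 if e != true_entity and e not in known_true and s > target)
    return 1 + higher

def mrr_and_hits(ranks, ks=(1, 3, 10)):
    # Mean Reciprocal Rank and Hits@k over a list of ranks.
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits = {k: sum(r <= k for r in ranks) / len(ranks) for k in ks}
    return mrr, hits

# Toy query: "a" is another known true answer, "c" is the answer being evaluated.
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}
raw = filtered_rank(scores, "c", known_true=set())       # "a" and "b" rank above "c"
filt = filtered_rank(scores, "c", known_true={"a"})      # only "b" ranks above "c"
mrr, hits = mrr_and_hits([1, 2, 4])
```

The example shows why filtering matters: the same prediction is ranked third in the raw setting but second once the other known true answer is removed.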

The main hyperparameter settings used in the experiments are summarized in Table 3. The semantic weight $\alpha$ and the type-constrained weight $\beta$ are further analyzed in the hyperparameter analysis. If not otherwise specified, the same basic training configuration is used across datasets, while dataset-specific parameters are selected according to validation performance.

4.2. Results on Public Benchmark Datasets

To verify the effectiveness of the proposed structural-and-semantic fusion model on general knowledge graph link prediction tasks, comparative experiments are first conducted on the two public benchmark datasets WN18RR and FB15k-237. The results are reported in Table 4.

As shown in Table 4, the proposed method achieves the best results on both public datasets. On WN18RR, the proposed method obtains an MRR of 0.599 and Hits@1, Hits@3, and Hits@10 of 0.538, 0.631, and 0.724, respectively. On FB15k-237, the corresponding scores are 0.446, 0.354, 0.498, and 0.650. Compared with representative baselines, the proposed method shows clear and stable advantages in overall ranking quality and top-ranked candidate accuracy.

On WN18RR, the proposed method improves the MRR of NBFNet from 0.551 to 0.599, indicating that dynamic relation priors and the semantic fusion module can provide more effective complementary information in scenarios involving hierarchical relations and multi-hop semantic dependencies, thereby improving overall candidate ranking quality. Meanwhile, the Hits@10 score of 0.724 indicates stronger coverage of high-ranking candidates. On FB15k-237, the proposed method also outperforms strong baselines such as NBFNet, N-BERT, HittER, and N-Former, demonstrating good generalization ability in more diverse and structurally complex public benchmark scenarios.

Among all baselines, NBFNet is the most directly comparable structure-based reasoning model because it also emphasizes relation-centered multi-hop propagation. Compared with NBFNet, the proposed method improves MRR by 0.048 on WN18RR and by 0.031 on FB15k-237. These improvements suggest that dynamic relation priors and candidate-ranking-level semantic evidence can provide useful complementary signals beyond structural propagation. The improvements in Hits@1 also indicate that the correct entity is more frequently ranked at the top, which is important for practical knowledge graph completion systems where only a few top-ranked candidates are typically inspected.

Overall, the public benchmark results validate the effectiveness of the proposed structural-and-semantic fusion framework for general knowledge graph link prediction. Although this is not the primary focus of the study, the results show that the proposed method is not only effective in industrial scenarios but also consistently beneficial on general datasets.

4.3. Results on Industrial Datasets

After validating the general effectiveness of the model on public benchmarks, experiments are further conducted on the Chinese industrial knowledge graph and the English industrial knowledge graph to evaluate the adaptability and robustness of the proposed method in industrial scenarios. The results are shown in Table 5.

As shown in Table 5, the proposed method achieves the best performance on both the Chinese and English industrial knowledge graphs. On the Chinese industrial knowledge graph, the proposed method obtains an MRR of 0.8532 and Hits@1, Hits@3, and Hits@10 of 0.8235, 0.8784, and 0.9144, respectively. On the English industrial knowledge graph, the corresponding scores are 0.7994, 0.7908, 0.8042, and 0.8216. These results indicate that the proposed model performs strongly not only on the Chinese industrial graph, where semantic relations are more concentrated, but also on the larger and more challenging English industrial graph with a broader candidate space.

On the Chinese industrial knowledge graph, the proposed method improves the MRR of NBFNet from 0.8220 to 0.8532, indicating that relation-graph-based dynamic relation priors, hierarchy-aware relation propagation, candidate-ranking-level semantic evidence, and type-aware constraints can more effectively exploit relation contexts, textual semantics, and type boundaries in industrial knowledge, thereby strengthening the modeling of query-relevant reasoning evidence. Meanwhile, the improvement in Hits@10 to 0.9144 further demonstrates strong ranking performance under high-recall candidate scenarios. On the English industrial knowledge graph, the proposed method also outperforms strong baselines such as NBFNet and DistMult, indicating that it remains robust and effective in larger-scale industrial scenarios with more complex candidate spaces.

The industrial results are practically meaningful because industrial link prediction is often used to provide a short list of candidate entities for downstream fault analysis, maintenance knowledge retrieval, or decision support. On the Chinese industrial knowledge graph, the Hits@1 improvement from 0.7919 to 0.8235 over NBFNet indicates that the correct entity is more frequently ranked first, which can reduce manual verification cost in fault-related candidate recommendation. On the English industrial knowledge graph, the MRR improvement from 0.7784 to 0.7994 shows that the proposed framework remains effective when the candidate space becomes larger. These gains suggest that the combination of relation-adaptive structural reasoning, semantic candidate discrimination, and type-aware constraints is especially helpful for industrial graphs with sparse local structures and explicit relation-type boundaries.

Overall, the industrial dataset results show that the performance gains of the proposed method do not arise from a single module, but from the joint contribution of relation-adaptive structural reasoning, semantic candidate discrimination, and entity-type validity modeling. In industrial knowledge graphs, where type boundaries are explicit yet candidates are highly confusing, relying only on structural patterns or semantic information is insufficient to achieve stable and reliable ranking. By combining relation-category-aware propagation with type-aware scoring and type-constrained hard negative training, the proposed method forms a more complete discrimination mechanism for candidate ranking.
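The idea of type-constrained negative construction can be illustrated with a small Python sketch. It corrupts the tail of a positive triple using only entities whose type is valid for the relation, so that the negatives are type-valid and therefore harder to reject than random ones. The data structures (`entity_type`, `relation_tail_types`), the function name, and the uniform sampling are assumptions made for illustration; the actual construction procedure of the proposed method may differ, for example by additionally selecting negatives according to model scores.

```python
import random
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def type_constrained_negatives(
    triple: Triple,
    entity_type: Dict[str, str],
    relation_tail_types: Dict[str, Set[str]],
    known_triples: Set[Triple],
    num_negatives: int,
    rng: random.Random,
) -> List[Triple]:
    """Corrupt the tail of `triple` with type-valid entities only.

    Candidates must (1) carry a type allowed for the relation's tail slot,
    (2) differ from the true tail, and (3) not form another known triple,
    so that no true fact is used as a negative sample.
    """
    head, relation, tail = triple
    allowed_types = relation_tail_types[relation]
    candidates = sorted(
        entity
        for entity, etype in entity_type.items()
        if etype in allowed_types
        and entity != tail
        and (head, relation, entity) not in known_triples
    )
    rng.shuffle(candidates)  # uniform choice among type-valid candidates
    return [(head, relation, entity) for entity in candidates[:num_negatives]]


if __name__ == "__main__":
    # Hypothetical toy graph; entity names are placeholders.
    entity_type = {"p1": "phenomenon", "c1": "cause", "c2": "cause",
                   "c3": "cause", "loc1": "location"}
    relation_tail_types = {"has_cause": {"cause"}}
    known = {("p1", "has_cause", "c1"), ("p1", "has_cause", "c2")}
    print(type_constrained_negatives(("p1", "has_cause", "c1"), entity_type,
                                     relation_tail_types, known, 5, random.Random(0)))
```

In this toy case the location entity is never sampled, because it violates the tail-type constraint of the relation, and the second known cause is excluded because it is a true answer.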

4.4. Relation-Type Analysis

To further analyze model behavior under different relation mapping patterns, this subsection reports the performance on four standard relation categories in knowledge graph link prediction, namely, 1-to-1, 1-to-N, N-to-1, and N-to-N. These categories are used to describe the mapping complexity between head entities and tail entities under a specific relation. A 1-to-1 relation usually indicates that one head entity corresponds to one tail entity, and the candidate competition is relatively weak. A 1-to-N relation indicates that one head entity may correspond to multiple tail entities, while an N-to-1 relation indicates that multiple head entities may correspond to the same tail entity. These two categories usually involve stronger candidate competition because several entities may be plausible under the same relation. An N-to-N relation indicates that multiple head entities and multiple tail entities are connected under the same relation, reflecting a more complex many-to-many mapping pattern. Therefore, relation-type analysis is useful for examining whether a model can maintain stable ranking performance under different mapping complexities, especially when multiple candidate entities are structurally or semantically similar.

Following the standard setting in knowledge graph link prediction, the relations in the Chinese industrial knowledge graph are categorized into the above four types. The MRR values of the proposed method and NBFNet on these relation categories are reported in Table 6.
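The categorization step can be sketched as follows. The sketch assumes the convention widely used since the TransE evaluation: for each relation, the average number of tails per head (tph) and heads per tail (hpt) are computed, and a side is treated as "many" when the average is at least 1.5. The threshold value and the function name are assumptions here, since the exact rule applied to the Chinese industrial knowledge graph is not restated in this section.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]


def categorize_relations(triples: Iterable[Triple], threshold: float = 1.5) -> Dict[str, str]:
    """Assign each relation to 1-to-1, 1-to-N, N-to-1, or N-to-N."""
    tails_of = defaultdict(set)  # (relation, head) -> distinct tails
    heads_of = defaultdict(set)  # (relation, tail) -> distinct heads
    for head, relation, tail in triples:
        tails_of[(relation, head)].add(tail)
        heads_of[(relation, tail)].add(head)

    tail_counts = defaultdict(list)  # relation -> number of tails for each head
    for (relation, _head), tails in tails_of.items():
        tail_counts[relation].append(len(tails))
    head_counts = defaultdict(list)  # relation -> number of heads for each tail
    for (relation, _tail), heads in heads_of.items():
        head_counts[relation].append(len(heads))

    categories = {}
    for relation in tail_counts:
        tph = sum(tail_counts[relation]) / len(tail_counts[relation])
        hpt = sum(head_counts[relation]) / len(head_counts[relation])
        many_tails = tph >= threshold
        many_heads = hpt >= threshold
        if many_heads and many_tails:
            categories[relation] = "N-to-N"
        elif many_tails:
            categories[relation] = "1-to-N"
        elif many_heads:
            categories[relation] = "N-to-1"
        else:
            categories[relation] = "1-to-1"
    return categories


if __name__ == "__main__":
    toy = [("a", "r2", "x"), ("a", "r2", "y"), ("a", "r2", "z")]
    print(categorize_relations(toy))
```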

As shown in Table 6, the proposed method outperforms NBFNet across all four relation categories, although the magnitude of improvement varies considerably. For both models, the MRR values of 1-to-1 and N-to-N relations are substantially higher than those of 1-to-N and N-to-1 relations, indicating that both models perform well when relation mappings are relatively deterministic or structurally well connected. In contrast, one-to-many and many-to-one relations involve stronger candidate competition and are therefore more difficult to rank accurately.

More specifically, the proposed method achieves its largest improvements on 1-to-N and N-to-1 relations. For 1-to-N relations, the MRR increases from 0.7214 to 0.7869 (+0.0655), while for N-to-1 relations, it increases from 0.6938 to 0.7615 (+0.0677). These results suggest that under complex relation patterns with multiple highly competitive candidates, dynamic relation priors, the semantic fusion module, and the type-constrained hard negative strategy can better enhance fine-grained candidate discrimination. In industrial knowledge graphs, many candidates are highly similar in local structure or semantics, and a single structural propagation signal is often insufficient for stable discrimination. By jointly exploiting structural information, semantic information, and type constraints, the proposed method is able to build clearer decision boundaries among confusing candidates.

The larger gains on 1-to-N and N-to-1 relations further support the motivation of the proposed framework. In these relation categories, multiple candidate entities may satisfy similar structural patterns, making the ranking task more dependent on query-specific relation information, textual semantic evidence, and type validity. Therefore, the observed improvements indicate that the proposed mechanisms are particularly useful when candidate competition is strong.

For the relatively regular or densely connected 1-to-1 and N-to-N relations, the proposed method also maintains strong performance. The MRR of 1-to-1 relations increases from 0.9346 to 0.9438, and that of N-to-N relations increases from 0.9017 to 0.9162. Although these gains are smaller than those on 1-to-N and N-to-1 relations, they still indicate that the proposed model is effective not only for complex relation patterns but also for relatively regular ones. Overall, the results in Table 6 further verify the adaptability of the proposed model to different relation mapping patterns in industrial knowledge graphs, especially under strong candidate competition.
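To make the comparison across relation categories concrete, the short Python sketch below recomputes the absolute and relative MRR gains over NBFNet from the values quoted in the text for Table 6. Only the MRR pairs come from the paper; the variable and function names are illustrative.

```python
from typing import Dict, Tuple

# (NBFNet MRR, proposed-method MRR) on the Chinese industrial KG, as quoted in the text.
TABLE6_MRR: Dict[str, Tuple[float, float]] = {
    "1-to-1": (0.9346, 0.9438),
    "1-to-N": (0.7214, 0.7869),
    "N-to-1": (0.6938, 0.7615),
    "N-to-N": (0.9017, 0.9162),
}


def mrr_gains(table: Dict[str, Tuple[float, float]]) -> Dict[str, Tuple[float, float]]:
    """Return (absolute gain, relative gain in percent) for each relation category."""
    gains = {}
    for category, (baseline, ours) in table.items():
        absolute = ours - baseline
        relative_pct = 100.0 * absolute / baseline
        gains[category] = (round(absolute, 4), round(relative_pct, 2))
    return gains


if __name__ == "__main__":
    for category, (absolute, relative_pct) in mrr_gains(TABLE6_MRR).items():
        print(f"{category}: +{absolute:.4f} MRR ({relative_pct:.2f}% relative)")
```

The absolute gains are 0.0092 (1-to-1), 0.0655 (1-to-N), 0.0677 (N-to-1), and 0.0145 (N-to-N), so the two categories with the strongest candidate competition improve several times more than the other two.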

4.5. Ablation Study

Since the proposed method is mainly designed for industrial knowledge graph scenarios, ablation experiments are conducted only on the Chinese industrial knowledge graph. By removing key components one at a time, the contributions of dynamic relation priors, hierarchy-aware relation propagation, the semantic fusion module, the type-constrained scoring branch, and industrial hard negatives can be analyzed more directly. The results are shown in Table 7.

As shown in Table 7, removing any component leads to a performance decline in terms of MRR, Hits@1, Hits@3, and Hits@10, indicating that all the introduced mechanisms contribute positively and consistently to industrial knowledge graph link prediction. Among them, removing the dynamic relation prior causes the most substantial degradation, showing that query-specific relation initialization is particularly important in industrial scenarios. The same relation may exhibit different semantic focuses across equipment states and fault chains, and without dynamic relation priors, the model becomes less capable of capturing such sample-specific variations.

Removing the hierarchy-aware relation propagation mechanism also leads to a noticeable performance drop, indicating that relation-category and hierarchy-aware modulation improve relation representation learning and structural reasoning in industrial graphs. Removing the semantic branch causes consistent degradation in MRR and Hits@K, which suggests that semantic information is not merely auxiliary but is critical for discriminating highly confusing candidates. Here, the role of the semantic branch should be understood as a candidate-ranking-level complement to structural reasoning: for industrial candidates that are structurally similar but semantically different, it provides textual matching evidence and thus more fine-grained discriminative signals. Removing the type-constrained branch also weakens top-ranked candidate performance, showing that type-constrained scoring plays an important role in suppressing rule-invalid candidates. Finally, removing industrial hard negatives also results in clear performance degradation, indicating that high-quality hard negatives can effectively sharpen the decision boundary against difficult false candidates.

Overall, the results in Table 7 show that the performance improvements of the proposed model cannot be attributed to any single component alone, but rather to the joint effect of dynamic relation priors, hierarchy-aware relation propagation, semantic fusion, type-aware scoring, and hard negative training under industrial scenarios.

4.6. Hyperparameter Analysis

To further analyze the effect of key hyperparameters on performance, sensitivity analysis is conducted on the semantic branch weight $α$ and the type-constrained branch weight $β$. These two hyperparameters control the contributions of semantic information and type-validity scoring evidence, respectively, in the final scoring function. Since the proposed method is designed primarily for industrial knowledge graph scenarios, the analysis is performed only on the Chinese industrial knowledge graph. Specifically, one parameter is fixed while the other is varied: $β$ is fixed to 0.3 when analyzing $α$, and $α$ is fixed to 0.3 when analyzing $β$. The results are shown in Table 8.

As shown in Table 8, both $α$ and $β$ exhibit a clear moderate-optimum pattern, meaning that excessively small or large values both lead to performance degradation. For $α$, when it is set to 0.2, the MRR is 0.8476, indicating that the contribution of the semantic branch is relatively insufficient and the model still relies more heavily on structural propagation and type-validity evidence, making it difficult to distinguish candidates that are structurally similar but semantically different. When $α = 0.3$, the model achieves the best MRR of 0.8532, indicating that semantic information, structural evidence, and type-validity evidence are well balanced. When $α$ is further increased to 0.4, the MRR decreases to 0.8491, suggesting that overly strong semantic weighting may cause the model to depend too much on semantic similarity and weaken the influence of structural paths and rule constraints.

For $β$, the variation reflects the role boundary of the type-constrained branch in industrial scenarios. When $β = 0.2$, the MRR is 0.8458, indicating that the influence of type constraints on the final score is too weak and the model cannot fully exploit explicit type boundaries in the industrial knowledge graph to suppress incorrect candidates. When $β = 0.3$, the model reaches the best MRR of 0.8532, showing that the type-constrained branch forms a good complement to the structural and semantic branches. When $β$ increases to 0.4, the MRR drops to 0.8484, suggesting that overly strong rule constraints may compress fine-grained differences among multiple valid candidates and thus impair final ranking performance.

Overall, the results in Table 8 indicate that the effectiveness of the proposed model relies on an appropriate balance among structural propagation, semantic discrimination, and type constraints, rather than the excessive strengthening of any single branch. Both the semantic branch and the type-constrained branch need to operate within an appropriate range so that they remain well balanced with structural evidence and jointly yield the best link prediction performance on industrial knowledge graphs.
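The balancing effect discussed above can be illustrated with a toy Python sketch. It assumes, purely for illustration, that the final score is an additive combination of a structural score, a semantic score weighted by $α$, and a type-validity score weighted by $β$; the exact scoring function is defined in the method section and may differ. All candidate scores below are invented toy values, not experimental results.

```python
from typing import List, Sequence


def fuse_scores(
    structural: Sequence[float],
    semantic: Sequence[float],
    type_validity: Sequence[float],
    alpha: float = 0.3,
    beta: float = 0.3,
) -> List[float]:
    """Assumed additive fusion: s_final = s_struct + alpha * s_sem + beta * s_type."""
    return [s + alpha * m + beta * t
            for s, m, t in zip(structural, semantic, type_validity)]


def top_candidate(scores: Sequence[float]) -> int:
    """Index of the highest-scoring candidate."""
    return max(range(len(scores)), key=lambda i: scores[i])


if __name__ == "__main__":
    # Three hypothetical candidates: 0 = correct, 1 = semantically similar
    # distractor, 2 = structurally plausible but type-invalid distractor.
    structural = [2.0, 1.5, 2.1]
    semantic = [0.2, 0.9, 0.3]
    type_validity = [1.0, 1.0, 0.0]

    for alpha, beta in [(0.3, 0.3), (1.0, 0.3), (0.3, 0.0)]:
        fused = fuse_scores(structural, semantic, type_validity, alpha, beta)
        print(f"alpha={alpha}, beta={beta}: top candidate = {top_candidate(fused)}")
```

In this toy setting, the balanced weights rank the correct candidate first, an overly large $α$ promotes the semantically similar distractor, and removing the type term lets the type-invalid candidate win, which mirrors the qualitative argument made for Table 8.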

5. Application in Fault Diagnosis

To further illustrate the practical use of the proposed industrial knowledge graph link prediction model, this section presents its application workflow in a representative fault diagnosis scenario. In practical industrial maintenance, fault-related knowledge is usually distributed across maintenance records, equipment manuals, alarm descriptions, historical fault cases, and expert experience. Such knowledge is often fragmented and expressed in inconsistent terminology, which makes it difficult for engineers to quickly locate potential fault causes. Therefore, the proposed model is used as a link prediction component to recommend candidate fault causes based on the constructed industrial fault diagnosis knowledge graph.

Figure 2 shows the application workflow of the proposed model in fault diagnosis. Given a user-input fault description, the system first standardizes the input information and links it to entities in the knowledge graph. Then, a query triple $(h, r, ?)$ is constructed, where $h$ denotes the matched fault phenomenon or alarm-related entity, and $r$ denotes the diagnostic relation, such as the relation from a fault phenomenon to a fault cause. The proposed link prediction model then computes the plausibility scores of candidate tail entities and returns the top-ranked candidate causes. The results are displayed together with explanatory information, and maintenance engineers further verify the candidates according to on-site inspection results and domain experience. Confirmed new fault relations can be inserted into the knowledge graph, thereby forming a closed-loop process of fault information input, model-assisted diagnosis, expert verification, and knowledge graph updating.

The key difference between this workflow and a conventional keyword-based retrieval process is that the proposed model does not only search for existing records with similar surface forms. Instead, it uses the structural, semantic, and type-aware evidence learned from the industrial knowledge graph to infer potential missing relations between a given fault phenomenon and candidate causes. In this way, the model can recommend candidate causes even when the exact fault case has not been explicitly recorded in the knowledge graph.

As an illustrative example, suppose that a maintenance engineer reports the following fault information: “Alarm information: ALM401; fault phenomenon: abnormal movement of the X-axis”. After standardization and entity linking, the input is matched to the corresponding alarm and fault phenomenon entities in the knowledge graph. The system then constructs a query triple such as

$$(\text{abnormal movement of the X-axis},\ \text{has\_cause},\ ?) \tag{27}$$

where the target relation has_cause is used to retrieve possible fault cause entities. The proposed model computes the final score $s_{final}(h, r, t)$ for each candidate cause by combining structural reasoning, semantic matching, and entity-type validity. For illustration, a possible top-ranked output is shown in Table 9.

The maintenance engineer can then compare these candidate causes with the actual equipment status, alarm history, and on-site inspection results. If the first-ranked candidate is verified as the actual cause, the corresponding diagnostic relation can be confirmed and stored. If the correct cause is not included in the top-ranked candidates, the newly verified cause and its related information can also be added to the knowledge graph as a new case. Therefore, the model output is not used as an automatic final decision, but as decision-support information for narrowing the search space and improving troubleshooting efficiency.
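The workflow described in this section (entity linking, query construction, candidate ranking, and expert-confirmed graph updating) can be sketched in Python as follows. Every name in the sketch is hypothetical: `link_entity`, `recommend_causes`, and `confirm_relation` are illustrative helpers, the alias index stands in for the standardization step, and the scoring callable is a stand-in for the trained model's final score. The candidate names and scores are placeholders and do not correspond to Table 9.

```python
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
ScoreFn = Callable[[str, str, str], float]


def link_entity(text: str, alias_index: Dict[str, str]) -> str:
    """Map a free-text description to a KG entity via a normalized alias lookup.

    Raises KeyError when no alias matches; a real system would fall back to
    fuzzy or embedding-based matching instead.
    """
    key = " ".join(text.lower().split())
    return alias_index[key]


def recommend_causes(head: str, relation: str, candidates: List[str],
                     score_fn: ScoreFn, top_k: int = 3) -> List[Tuple[str, float]]:
    """Score every candidate tail for the query (head, relation, ?) and keep the top-k."""
    scored = [(c, score_fn(head, relation, c)) for c in candidates]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def confirm_relation(kg: Set[Triple], head: str, relation: str, tail: str) -> bool:
    """Insert an expert-verified triple; return True if it was not yet in the KG."""
    triple = (head, relation, tail)
    is_new = triple not in kg
    kg.add(triple)
    return is_new


if __name__ == "__main__":
    alias_index = {"abnormal movement of the x-axis": "abnormal movement of the X-axis"}
    toy_scores = {"cause_a": 0.2, "cause_b": 0.9, "cause_c": 0.5}  # placeholder scores

    head = link_entity("Abnormal movement of the X-axis", alias_index)
    ranked = recommend_causes(head, "has_cause", list(toy_scores),
                              lambda h, r, t: toy_scores[t], top_k=2)
    print(ranked)

    kg: Set[Triple] = set()
    # The engineer verifies the first-ranked candidate on site, then stores it.
    confirm_relation(kg, head, "has_cause", ranked[0][0])
```

Keeping the confirmation step as a separate function reflects the point made above: the ranked list is decision-support information, and the knowledge graph is updated only after expert verification.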

This application example shows that the proposed model can support fault diagnosis in three aspects. First, it narrows the search space by ranking type-valid candidate causes instead of returning unstructured text fragments. Second, it improves knowledge reuse by exploiting both graph structure and textual semantic information from historical maintenance knowledge. Third, it preserves the role of expert verification, since the model output is used as decision-support information rather than an automatic judgment. Therefore, the proposed model can be integrated into an industrial fault diagnosis workflow to support candidate cause recommendation, maintenance knowledge retrieval, and continuous knowledge graph updating.

6. Conclusions

This study proposes a type-constrained structural–semantic fusion framework with dynamic relation priors for industrial knowledge graph link prediction and further investigates its application to fault diagnosis. To address relation imbalance, local structural sparsity, explicit type boundaries, and highly confusing candidates in industrial knowledge graphs, the proposed model integrates relation-graph-based dynamic relation prior generation, hierarchy-aware relation propagation, candidate-ranking-level semantic fusion, type-aware scoring, and type-constrained negative sampling into a unified framework. In this way, the model improves relation representation adaptability, semantic candidate discrimination, and the utilization of industrial rule constraints.

Extensive experiments on two public benchmark datasets, WN18RR and FB15k-237, as well as two industrial knowledge graph datasets demonstrate the effectiveness of the proposed method. The Chinese industrial knowledge graph is constructed from CNC fault diagnosis knowledge and is used as a representative application dataset, while the English industrial knowledge graph is used to further evaluate the adaptability of the proposed framework in a related industrial production-line scenario. The results show that the model not only achieves strong performance on general link prediction benchmarks, but also obtains the best results on both the Chinese and English industrial knowledge graphs. Further analyses on relation types, ablation settings, and hyperparameters confirm that the performance gains come from the coordinated contribution of dynamic relation priors, hierarchy-aware relation propagation, semantic fusion, type-aware scoring, and hard negative training. These results suggest that jointly considering structural evidence, textual semantic evidence, and relation-specific type validity is useful for improving candidate ranking in industrial knowledge graphs, especially when local structures are sparse and candidate entities are highly confusing.

From a practical perspective, the proposed method can provide more reliable candidate entity ranking for industrial knowledge graph completion. In the representative fault diagnosis application, this capability is potentially useful for downstream industrial knowledge services such as fault analysis, maintenance knowledge retrieval, and decision support, where reducing the number of incorrectly ranked top candidates can lower manual verification cost. Nevertheless, the conclusions should be interpreted with several limitations in mind. First, the type-constrained components rely on relatively reliable entity type information, which may be incomplete or noisy in automatically constructed industrial knowledge graphs. Second, the semantic fusion module depends on the quality of entity and relation descriptions; when textual descriptions are sparse, ambiguous, or inconsistent, the semantic matching signal may become less reliable. Third, although textual embeddings can be precomputed, the introduction of dynamic relation priors, semantic scoring, and type-constrained hard negative sampling still brings additional computational overhead compared with simpler embedding-based methods. Finally, the current industrial experiments are conducted on a CNC fault diagnosis knowledge graph and a production-line knowledge graph, and the generalizability of the proposed framework to other industrial domains, such as supply chains, industrial IoT, and process optimization, requires further validation.

Overall, the proposed method provides a feasible and effective solution for industrial knowledge graph link prediction by jointly modeling structural information, semantic information, and rule constraints. The application analysis further indicates its potential value in fault diagnosis scenarios. In future work, automatic type refinement, more efficient semantic retrieval and scoring mechanisms, incremental updating strategies for dynamic industrial knowledge graphs, and validation on broader industrial domains will be explored to further improve generalization and practical applicability.

Author Contributions

Conceptualization, Y.L. and J.H.; methodology, Y.L. and J.H.; validation, Y.L.; formal analysis, Y.L.; investigation, Y.L. and G.Z.; resources, G.Z.; writing—original draft preparation, Y.L.; writing—review and editing, Y.L., J.H., G.Z. and J.L.; supervision, J.H.; project administration, J.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The public benchmark datasets WN18RR and FB15k-237 used in this study are publicly available from their original sources. The English industrial knowledge graph dataset used in this study was reconstructed from a publicly available Industry 4.0 production-line knowledge graph dataset released on Zenodo and its related published benchmark. The Chinese industrial knowledge graph was constructed from fault-diagnosis cases extracted from a copyrighted book; therefore, the processed Chinese dataset cannot be made fully public due to copyright and usage restrictions, but it is available from the corresponding author upon reasonable request for research purposes and subject to applicable restrictions.

Acknowledgments

The authors would like to thank the editors and reviewers for their valuable comments and suggestions.

Conflicts of Interest

Author Guozheng Zhang was employed by the company Jingwei Textile Machinery Company Limited. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

KG	Knowledge Graph
KGC	Knowledge Graph Completion
MRR	Mean Reciprocal Rank
GNN	Graph Neural Network
PRA	Path Ranking Algorithm
PLM	Pre-trained Language Model

References

1. Hogan, A.; Blomqvist, E.; Cochez, M.; d’Amato, C.; de Melo, G.; Gutierrez, C.; Gayo, J.E.L.; Kirrane, S.; Neumaier, S.; Polleres, A.; et al. Knowledge Graphs. ACM Comput. Surv. 2021, 54, 71.
2. Buchgeher, G.; Gabauer, D.; Martinez-Gil, J.; Ehrlinger, L. Knowledge Graphs in Manufacturing and Production: A Systematic Literature Review. IEEE Access 2021, 9, 55537–55554.
3. Zamini, M.; Reza, H.; Rabiei, M. A Review of Knowledge Graph Completion. Information 2022, 13, 396.
4. Shi, B.; Weninger, T. Open-World Knowledge Graph Completion. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32, pp. 1957–1964.
5. Bordes, A.; Usunier, N.; García-Durán, A.; Weston, J.; Yakhnenko, O. Translating Embeddings for Modeling Multi-relational Data. In Proceedings of the Advances in Neural Information Processing Systems 26 (NeurIPS 2013), Lake Tahoe, NV, USA, 5–8 December 2013; pp. 2787–2795.
6. Sun, Z.; Deng, Z.; Nie, J.; Tang, J. RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019.
7. Balazevic, I.; Allen, C.; Hospedales, T. TuckER: Tensor Factorization for Knowledge Graph Completion. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, Hong Kong, China, 3–7 November 2019; pp. 5185–5194.
8. Lao, N.; Mitchell, T.; Cohen, W.W. Random Walk Inference and Learning in A Large Scale Knowledge Base. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Edinburgh, UK, 27–31 July 2011; pp. 529–539.
9. Yang, F.; Yang, Z.; Cohen, W.W. Differentiable Learning of Logical Rules for Knowledge Base Reasoning. In Proceedings of the Advances in Neural Information Processing Systems 30 (NeurIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 2319–2328.
10. Sadeghian, A.; Armandpour, M.; Ding, P.; Wang, D.Z. DRUM: End-To-End Differentiable Rule Mining On Knowledge Graphs. In Proceedings of the Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; pp. 15321–15331.
11. Schlichtkrull, M.; Kipf, T.N.; Bloem, P.; van den Berg, R.; Titov, I.; Welling, M. Modeling Relational Data with Graph Convolutional Networks. In Proceedings of the Semantic Web, Crete, Greece, 3–7 June 2018; pp. 593–607.
12. Vashishth, S.; Sanyal, S.; Nitin, V.; Talukdar, P. Composition-based Multi-Relational Graph Convolutional Networks. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 30 April 2020.
13. Zhu, Z.; Zhang, Z.; Xhonneux, L.P.; Tang, J. Neural Bellman-Ford Networks: A General Graph Neural Network Framework for Link Prediction. In Proceedings of the Advances in Neural Information Processing Systems 34 (NeurIPS 2021), Online, 6–14 December 2021; pp. 29476–29490.
14. Yao, L.; Mao, C.; Luo, Y. KG-BERT: BERT for Knowledge Graph Completion. arXiv 2019, arXiv:1909.03193.
15. Wang, B.; Shen, T.; Long, G.; Zhou, T.; Wang, Y.; Chang, Y. Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 1737–1748.
16. Wang, L.; Zhao, W.; Wei, Z.; Liu, J. SimKGC: Simple Contrastive Knowledge Graph Completion with Pre-trained Language Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; Volume 1: Long Papers. pp. 4281–4294.
17. Wang, Z.; Zhang, J.; Feng, J.; Chen, Z. Knowledge Graph Embedding by Translating on Hyperplanes. In Proceedings of the AAAI Conference on Artificial Intelligence, Québec City, QC, Canada, 27–31 July 2014; Volume 28, pp. 1112–1119.
18. Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; Zhu, X. Learning Entity and Relation Embeddings for Knowledge Graph Completion. In Proceedings of the AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015; Volume 29, pp. 2181–2187.
19. Yang, B.; Yih, W.; He, X.; Gao, J.; Deng, L. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. arXiv 2015, arXiv:1412.6575.
20. Trouillon, T.; Welbl, J.; Riedel, S.; Gaussier, É.; Bouchard, G. Complex Embeddings for Simple Link Prediction. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; pp. 2071–2080.
21. Dettmers, T.; Minervini, P.; Stenetorp, P.; Riedel, S. Convolutional 2D Knowledge Graph Embeddings. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32, pp. 1811–1818.
22. Chen, S.; Liu, X.; Gao, J.; Jiao, J.; Zhang, R.; Ji, Y. HittER: Hierarchical Transformers for Knowledge Graph Embeddings. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 10395–10407.
23. Chen, C.; Wang, Y.; Li, B.; Lam, K.Y. Knowledge Is Flat: A Seq2Seq Generative Framework for Various Knowledge Graph Completion. In Proceedings of the 29th International Conference on Computational Linguistics, Gyeongju, Republic of Korea, 12–17 October 2022; pp. 4005–4017.
24. Toutanova, K.; Chen, D. Observed versus Latent Features for Knowledge Base and Text Inference. In Proceedings of the 3rd Workshop on Continuous Vector Space Models and Their Compositionality, Beijing, China, 26–31 July 2015; pp. 57–66.
25. Zhou, B. Practical CNC Machine Tool Fault Diagnosis and Maintenance: 500 Cases; China Knowledge Publishing House: Beijing, China, 2006.
26. Yahya, M. A Benchmark Dataset with Knowledge Graph Generation for Industry 4.0 Production Lines. Zenodo Dataset. 2023. Available online: https://zenodo.org/records/7779522 (accessed on 27 May 2026).
27. Yahya, M.; Ali, A.; Mehmood, Q.; Yang, L.; Breslin, J.G.; Ali, M.I. A Benchmark Dataset with Knowledge Graph Generation for Industry 4.0 Production Lines. Semant. Web 2024, 15, 461–479.
28. Liu, Y.; Su, Y.; Wang, Y.; Zhang, Y.; Liu, B.; Sun, X. Knowledge Graph Embedding via Co-distillation Learning. In Proceedings of the 31st ACM International Conference on Information and Knowledge Management, Atlanta, GA, USA, 17–21 October 2022; pp. 4284–4288.
29. Chen, C.; Wang, Y.; Sun, A.; Li, B.; Lam, K.Y. Dipping PLMs Sauce: Bridging Structure and Text for Effective Knowledge Graph Completion via Conditional Soft Prompting. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, Toronto, ON, Canada, 9–14 July 2023; pp. 11489–11504.

Figure 1. Overall framework of the proposed link prediction model, where “?” indicates the missing entity to be predicted.

Figure 2. Application workflow of the proposed model in fault diagnosis.

Table 1. Comparison of different categories of knowledge graph link prediction methods.

Method Category	Main Strength	Main Limitation in Industrial Scenarios
Embedding-based methods	Efficient triple-level representation learning.	Limited explicit modeling of multi-hop structural evidence, textual semantics, and type constraints.
Path-based methods	Interpretable path-based reasoning.	Sensitive to path search cost, noisy paths, and incomplete graph structures.
GNN-based methods	Strong structural propagation and relation-aware reasoning.	May suffer from sparse neighborhoods and insufficient semantic discrimination.
PLM/text-enhanced methods	Strong semantic representation and text-based matching.	Often lack explicit structural reasoning and relation-specific type constraints.

Table 2. Statistics of the datasets.

Dataset	Entities	Relations	Train	Valid	Test
WN18RR	40,943	11	86,835	3034	3134
FB15k-237	14,541	237	272,115	17,535	20,466
Chinese industrial KG	1915	5	4140	442	445
English industrial KG	10,464	3	37,166	4512	4522

Table 3. Main hyperparameter settings.

Hyperparameter	Value
Embedding dimension	64
Optimizer	Adam
Learning rate	$1 \times 10^{- 4}$
Batch size	64
Training epochs	50
Semantic weight $α$	0.3
Type-constrained weight $β$	0.3
Evaluation protocol	Filtered setting
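
Table 3 names the filtered setting as the evaluation protocol, and the result tables report MRR and Hits@k. The following is a minimal sketch of how these metrics are commonly computed under the filtered setting, not the authors' evaluation code. The function names `filtered_rank` and `mrr_and_hits`, the toy scores, and the optimistic tie handling (only strictly higher scores count against the target) are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple


def filtered_rank(scores: Dict[str, float], target: str, known_true: Set[str]) -> int:
    """Rank of `target` after filtering out other known-correct candidates.

    `scores` maps each candidate entity to a model score (higher is better).
    `known_true` holds every entity known to be a correct answer for this
    query in the train/valid/test triples (hypothetical input for this sketch).
    """
    target_score = scores[target]
    # Count candidates that beat the target, skipping other correct answers.
    # Ties are resolved optimistically (assumption): equal scores do not count.
    higher = sum(
        1
        for entity, score in scores.items()
        if entity != target and entity not in known_true and score > target_score
    )
    return higher + 1


def mrr_and_hits(ranks: List[int], ks: Iterable[int] = (1, 3, 10)) -> Tuple[float, Dict[int, float]]:
    """Mean reciprocal rank and Hits@k over a list of 1-based ranks."""
    n = len(ranks)
    mrr = sum(1.0 / r for r in ranks) / n
    hits = {k: sum(1 for r in ranks if r <= k) / n for k in ks}
    return mrr, hits


if __name__ == "__main__":
    # Toy query: "a" and "c" are both correct answers; we evaluate "c".
    toy_scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}
    print(filtered_rank(toy_scores, "c", known_true={"a", "c"}))  # filtered rank
    print(filtered_rank(toy_scores, "c", known_true=set()))       # raw rank
    print(mrr_and_hits([1, 2, 4]))
```

In the toy query, filtering removes the other correct answer "a" from the candidate list, so the target is penalized only by "b".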

Table 4. Link prediction performance comparison on WN18RR and FB15k-237.

Method	WN18RR MRR	WN18RR Hits@1	WN18RR Hits@3	WN18RR Hits@10	FB15k-237 MRR	FB15k-237 Hits@1	FB15k-237 Hits@3	FB15k-237 Hits@10
TransE [5]	0.243	0.043	0.441	0.532	0.279	0.198	0.376	0.441
DistMult [19]	0.444	0.412	0.470	0.504	0.281	0.199	0.301	0.446
ComplEx [20]	0.449	0.409	0.469	0.530	0.278	0.194	0.297	0.450
RotatE [6]	0.476	0.428	0.492	0.571	0.333	0.241	0.375	0.533
TuckER [7]	0.470	0.443	0.482	0.526	0.358	0.266	0.394	0.544
CompGCN [12]	0.479	0.443	0.494	0.546	0.355	0.264	0.390	0.535
HittER [22]	0.503	0.462	0.516	0.584	0.373	0.279	0.409	0.558
N-Former [28]	0.486	0.443	0.501	0.578	0.372	0.277	0.412	0.556
Path Ranking [8]	0.324	0.276	0.360	0.406	0.174	0.119	0.186	0.285
NeuralLP [9]	0.435	0.371	0.434	0.566	0.240	0.186	0.362	–
DRUM [10]	0.486	0.425	0.513	0.586	0.343	0.255	0.378	0.516
R-GCN [11]	0.402	0.345	0.437	0.494	0.273	0.182	0.303	0.456
NBFNet [13]	0.551	0.497	0.573	0.666	0.415	0.321	0.454	0.599
KG-BERT [14]	0.216	0.041	0.302	0.524	–	–	–	–
StAR [15]	0.401	0.243	0.491	0.709	0.296	0.205	0.322	0.482
KG-S2S [23]	0.574	0.531	0.595	0.661	0.336	0.257	0.373	0.498
N-BERT [28]	0.583	0.529	0.607	0.686	0.381	0.287	0.420	0.562
CSProm-KG [29]	0.575	0.522	0.596	0.678	0.358	0.269	0.393	0.538
Ours	0.599	0.538	0.631	0.724	0.446	0.354	0.498	0.650

Note: Bold values indicate the best performance.

Table 5. Comparison of completion results of different models on the Chinese and English industrial knowledge graphs.

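Method	Chinese Industrial KG MRR	Chinese Industrial KG Hits@1	Chinese Industrial KG Hits@3	Chinese Industrial KG Hits@10	English Industrial KG MRR	English Industrial KG Hits@1	English Industrial KG Hits@3	English Industrial KG Hits@10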
TransE	0.1898	0.0079	0.3180	0.4787	0.2010	0.0000	0.3739	0.4979
DistMult	0.6531	0.4888	0.8079	0.8539	0.7767	0.7736	0.7744	0.7872
RESCAL	0.0197	0.0034	0.0034	0.0528	0.0133	0.0008	0.0020	0.0109
RotatE	0.5983	0.4629	0.7112	0.7742	0.6850	0.6708	0.6946	0.7121
PairRE	0.5722	0.3809	0.7461	0.8022	0.4572	0.3941	0.4894	0.5684
SimplE	0.0143	0.0011	0.0045	0.0225	0.0481	0.0140	0.0446	0.1272
ConvE	0.2182	0.1225	0.2517	0.4371	0.0395	0.0199	0.0477	0.0656
TuckER	0.4117	0.3056	0.4966	0.5573	0.1131	0.0448	0.1189	0.2946
NBFNet	0.8220	0.7919	0.8382	0.8778	0.7784	0.7714	0.7773	0.7952
Ours	0.8532	0.8235	0.8784	0.9144	0.7994	0.7952	0.7986	0.8121

Note: Bold values indicate the best performance.

Table 6. MRR on different relation categories in the Chinese industrial knowledge graph.

Relation Type	NBFNet	Ours
1-to-1	0.9346	0.9438
1-to-N	0.7214	0.7869
N-to-1	0.6938	0.7615
N-to-N	0.9017	0.9162
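
The four relation categories in Table 6 are conventionally derived from the average number of tails per head and heads per tail for each relation. The following sketch applies that common convention with a threshold of 1.5. The threshold, the function name `categorize_relations`, and the toy triples are assumptions for illustration; this part of the paper does not state the exact rule used.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def categorize_relations(triples: Iterable[Triple], threshold: float = 1.5) -> Dict[str, str]:
    """Label each relation as 1-to-1, 1-to-N, N-to-1, or N-to-N."""
    tails_of = defaultdict(set)  # (relation, head) -> distinct tails
    heads_of = defaultdict(set)  # (relation, tail) -> distinct heads
    for head, relation, tail in triples:
        tails_of[(relation, head)].add(tail)
        heads_of[(relation, tail)].add(head)

    tails_per_head = defaultdict(list)
    heads_per_tail = defaultdict(list)
    for (relation, _), tails in tails_of.items():
        tails_per_head[relation].append(len(tails))
    for (relation, _), heads in heads_of.items():
        heads_per_tail[relation].append(len(heads))

    categories = {}
    for relation in tails_per_head:
        tph = sum(tails_per_head[relation]) / len(tails_per_head[relation])
        hpt = sum(heads_per_tail[relation]) / len(heads_per_tail[relation])
        # Left side describes heads per tail, right side tails per head.
        head_side = "N" if hpt >= threshold else "1"
        tail_side = "N" if tph >= threshold else "1"
        categories[relation] = f"{head_side}-to-{tail_side}"
    return categories


if __name__ == "__main__":
    # Hypothetical toy triples, not taken from the industrial datasets.
    toy = [
        ("machine_1", "has_part", "part_1"),
        ("machine_1", "has_part", "part_2"),
        ("machine_1", "has_part", "part_3"),
        ("fault_1", "caused_by", "cause_1"),
        ("fault_2", "caused_by", "cause_1"),
        ("alarm_1", "same_as", "alarm_code_1"),
    ]
    print(categorize_relations(toy))
```

Under this convention, a relation in which each head links to several tails while each tail has a single head is labeled 1-to-N, and the reverse pattern is labeled N-to-1.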

Table 7. Ablation results on the Chinese industrial knowledge graph.

Model Variant	MRR	Hits@1	Hits@3	Hits@10
Full model	0.8532	0.8235	0.8784	0.9144
w/o dynamic relation prior	0.8368	0.8049	0.8615	0.8998
w/o hierarchy-aware relation propagation	0.8456	0.8154	0.8703	0.9071
w/o semantic fusion module	0.8397	0.8085	0.8642	0.9016
w/o type-constrained branch	0.8421	0.8108	0.8667	0.9042
w/o industrial hard negatives	0.8434	0.8126	0.8681	0.9050

Note: Bold values indicate the best performance.
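
To make the ablation comparison in Table 7 easier to read, the following sketch computes the MRR drop of each variant relative to the full model and sorts the components by that drop. The MRR values are copied from Table 7. The names `FULL_MODEL_MRR`, `ABLATION_MRR`, and `mrr_drops` are hypothetical.

```python
from typing import Dict

# MRR values copied from Table 7 (Chinese industrial knowledge graph).
FULL_MODEL_MRR = 0.8532
ABLATION_MRR = {
    "dynamic relation prior": 0.8368,
    "hierarchy-aware relation propagation": 0.8456,
    "semantic fusion module": 0.8397,
    "type-constrained branch": 0.8421,
    "industrial hard negatives": 0.8434,
}


def mrr_drops(full: float, ablated: Dict[str, float]) -> Dict[str, float]:
    """MRR drop per removed component, sorted from largest to smallest."""
    drops = {name: round(full - value, 4) for name, value in ablated.items()}
    return dict(sorted(drops.items(), key=lambda item: item[1], reverse=True))


DROPS = mrr_drops(FULL_MODEL_MRR, ABLATION_MRR)

if __name__ == "__main__":
    for name, drop in DROPS.items():
        print(f"w/o {name}: MRR drop = {drop:.4f}")
```

With the Table 7 values, removing the dynamic relation prior gives the largest MRR drop (0.0164) and removing hierarchy-aware relation propagation gives the smallest (0.0076).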

Table 8. Effects of the key hyperparameters $α$ and $β$ on model performance on the Chinese industrial knowledge graph.

Parameter Type	Value	MRR
$α$	0.2	0.8476
$α$	0.3	0.8532
$α$	0.4	0.8491
$β$	0.2	0.8458
$β$	0.3	0.8532
$β$	0.4	0.8484

Note: Bold values indicate the best performance.

Table 9. Illustrative example of candidate fault cause recommendation.

Rank	Candidate Fault Cause	Model Score
1	Servo drive fault	0.892
2	Encoder abnormality	0.743
3	Ball screw wear	0.561

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Luo, Y.; Hu, J.; Zhang, G.; Lv, J. Type-Constrained Structural–Semantic Fusion with Dynamic Relation Priors for Industrial Knowledge Graph Link Prediction and Its Application in Fault Diagnosis. Electronics 2026, 15, 2413. https://doi.org/10.3390/electronics15112413

