Symmetry- and Asymmetry-Aware Dual-Path Retrieval and In-Context Learning-Based LLM for Equipment Relation Extraction

Tang, Mingfei; Zhang, Liang; Yu, Zhipeng; Shi, Xiaolong; Liu, Xiulei

doi:10.3390/sym17101647

Open AccessArticle

Symmetry- and Asymmetry-Aware Dual-Path Retrieval and In-Context Learning-Based LLM for Equipment Relation Extraction

by

Mingfei Tang

,

Liang Zhang

^*,

Zhipeng Yu

,

Xiaolong Shi

and

Xiulei Liu

College of Computer Science, Beijing Information Science & Technology University (BISTU), Beijing 102200, China

^*

Author to whom correspondence should be addressed.

Symmetry 2025, 17(10), 1647; https://doi.org/10.3390/sym17101647

Submission received: 27 August 2025 / Revised: 16 September 2025 / Accepted: 25 September 2025 / Published: 4 October 2025

(This article belongs to the Special Issue Symmetry and Its Applications in Computer Vision)

Download

Browse Figures

Versions Notes

Abstract

Relation extraction in the equipment domain often exhibits asymmetric patterns, where entities participate in multiple overlapping relations that break the expected structural symmetry of semantic associations. Such asymmetry increases task complexity and reduces extraction accuracy in conventional approaches. To address this issue, we propose a symmetry- and asymmetry-aware dual-path retrieval and in-context learning-based large language model. Specifically, the BGE-M3 embedding model is fine-tuned for domain-specific adaptation, and a multi-level retrieval database is constructed to capture both global semantic symmetry at the sentence level and local asymmetric interactions at the relation level. A dual-path retrieval strategy, combined with Reciprocal Rank Fusion, integrates these complementary perspectives, while task-specific prompt templates further enhance extraction accuracy. Experimental results demonstrate that our method not only mitigates the challenges posed by overlapping and asymmetric relations but also leverages the latent symmetry of semantic structures to improve performance. Experimental results show that our approach effectively mitigates challenges from overlapping and asymmetric relations while exploiting latent semantic symmetry, achieving an F1-score of 88.53%, a 1.86% improvement over the strongest baseline (GPT-RE).

Keywords:

relation extraction; large language models; natural language processing

1. Introduction

Symmetry and asymmetry are fundamental concepts in natural and information sciences, offering a perspective to understand structural balance and imbalance. In the context of relation extraction, an ideal semantic network would exhibit structural symmetry, where each entity pair corresponds to a unique and stable relation [1]. However, real-world texts in the equipment domain often deviate from this ideal; entities frequently participate in multiple overlapping relations, leading to asymmetric patterns that disrupt the expected balance of semantic associations. This inherent asymmetry increases task complexity and challenges conventional extraction methods [2]. Developing models that can leverage latent semantic symmetry while addressing asymmetric relation patterns is therefore of great significance.

Relation extraction is one of the core tasks in natural language processing. Its goal is to identify semantic associations between entities in unstructured text and construct structured triples (head entity, relation, tail entity) [1]. It is an important task in information extraction and has been widely applied to knowledge graph construction [3], question answering systems [4], and so on. Existing models [5,6] applied to relation extraction have achieved promising results in many benchmark evaluations.

According to the methods employed, research on relation extraction mainly focuses on knowledge engineering-based methods, statistical machine learning-based methods, deep learning-based methods, and large language model-based methods [7].

1.1. Knowledge Engineering-Based Relation Extraction

After the concept of relation extraction was first proposed at MUC-7 [8] and the ACE conference established standardized evaluation criteria for relation extraction, the related technologies began to develop. Early researchers generally employed knowledge engineering-based relation extraction methods, where experts manually crafted rules based on domain knowledge, enumerated all possible matching patterns, and used pattern-matching techniques to extract relations from text. This manual rule specification approach requires rule authors to have deep domain expertise and to exhaustively cover numerous rules to ensure adequate extraction coverage. It is highly labor- and resource-intensive, and the crafted rules are typically specific to the current domain, lacking adaptability to new domains and exhibiting poor transferability.

1.2. Statistical Machine Learning-Based Relation Extraction

Later, machine learning methods were widely applied to relation extraction. Among them, sequence-labeling methods such as Conditional Random Fields (CRF) [9] attracted considerable attention due to their high accuracy. Statistical machine learning-based relation extraction methods rely on statistical language models and utilize manually constructed features from training corpora to learn models for extracting relations. According to their dependence on labeled data, these methods can be divided into supervised learning, semi-supervised learning, and unsupervised learning approaches [10].

Supervised relation extraction treats the task as a classification problem. It trains models using labeled training samples, maps input text to corresponding outputs, and determines the presence of relations to extract them. Compared to the strong dependence of supervised learning on manually annotated training data, semi-supervised relation extraction methods start with a small set of relation instances as initial seeds. They then iteratively learn from a large amount of unlabeled unstructured data via pattern learning methods, continuously expanding and refining the set of relation instances, discovering new potential triples from the text to complete relation extraction. However, semi-supervised learning often introduces noise during iteration [7], causing semantic drift, where the relation templates gradually deviate from their originally defined meaning [11]. Moreover, the matched templates in the extraction process still cannot comprehensively cover all relations, resulting in very low recall rates.

Both supervised and semi-supervised learning methods require manual involvement in defining relation types. However, large-scale text corpora often contain a vast number of relation types that cannot be easily enumerated manually, resulting in low recall rates for relation extraction. To address this issue, Hasegawa et al. [12] first proposed an unsupervised relation extraction method. This method extracts relations between entities from large corpora using a bottom-up approach and then classifies the extracted relations through clustering to complete the extraction process. However, unsupervised learning methods typically assume that the relation types are predetermined, mapping extracted relation instances to these types through automatic recognition. Consequently, this approach also tends to have low recall when extracting low-frequency relations.

1.3. Deep Learning-Based Relation Extraction

With the rise of deep learning methods, especially those based on neural networks, relation extraction has seen improvements in both accuracy and efficiency [7,13]. The core idea of these methods is to use deep learning models to automatically extract semantic features from text, mapping entities and their contextual information into a low-dimensional vector space. By capturing the semantic associations and structured patterns of entity pairs within this vector space, these methods achieve accurate prediction of relation types.

Dai et al. [14] proposed a joint extraction model based on query term positions, which simultaneously identifies entities and their relations through a labeling scheme and a position-aware attention mechanism, which excels particularly at handling overlapping relations and long-distance dependencies. Wei et al. [15] introduced a cascaded binary tagging framework that models relations as a mapping function from subject to object, naturally addressing the problem of overlapping triples. Wang et al. [16] proposed a single-stage joint-extraction model that formulates entity and relation extraction as a tagging alignment problem, employing an innovative handshake tagging scheme to effectively recognize overlapping relations that share one or more entities. Hong et al. [13] developed an end-to-end entity and relation extraction model based on Graph Convolutional Networks (GCN), incorporating a relation-aware attention mechanism, constructing an entity span graph, and enhancing the GCN to utilize both node and edge information simultaneously. This approach effectively captures global dependencies, improves the accuracy of entity and relation recognition, reduces error propagation, and enhances the model’s generalization capability. Fu et al. [17] further advanced GCN-based end-to-end extraction models by improving the extraction of latent lexical features in text. They employed relation-weighted GCNs combining linear and dependency structures to build comprehensive word graphs that capture implicit features, thereby enhancing the ability to extract overlapping relations. Zeng et al. [18] proposed a generative method combining reinforcement learning and copying mechanisms. By optimizing the triple generation process with reinforcement learning strategies and directly copying entities from the input text via the copying mechanism, this method improves extraction accuracy and efficiency. Sui et al. [2] presented an innovative solution that views joint entity and relation extraction as a set prediction problem rather than a traditional sequence generation task. Their Set Prediction Network (SPN) uses non-autoregressive parallel decoding to output the entire predicted set of triples at once, effectively avoiding error accumulation caused by sequence dependency in conventional methods.

Compared to statistical machine learning methods, deep learning-based approaches can more effectively capture complex semantics and contextual dependencies in text. However, due to their sensitivity to data distribution, these methods often exhibit certain limitations when handling overlapping entities and diverse relation types.

1.4. Large Language Model-Based Relation Extraction

In recent years, the rapid development of large language models (LLMs) has greatly advanced the progress of entity relation extraction. Through large-scale pre-training and fine-tuning, LLMs can more comprehensively understand and capture semantic information in text [19], significantly improving the accuracy and generalization ability of relation extraction [20,21]. This enhanced capability has substantially expanded the application potential of LLMs in relation extraction tasks, not only boosting overall system performance but also offering new ideas and methods for processing massive amounts of unstructured data.

The RAG-based prompting method demonstrates strong performance when the relationships between entities can be directly inferred from sentence tokens; however, it faces difficulties if the LLMs lack exposure to the relevant relation types [22]. General-purpose large language models, such as Mistral [23], Llama2 [24], and Flan-T5 [25], similarly exhibit limitations in relation extraction tasks due to their insufficient knowledge of domain-specific relations [20,22,26].

Building on these challenges, Paolini et al. [27] reformulated structured prediction tasks, including relation extraction (RE), as a sequence-to-sequence problem. Wan et al. further employed a prompt-based approach to call GPT-3 for relation extraction, while more recently, Xue et al. [28] proposed the AutoRE model, which decomposes document-level relation extraction into three subtasks—relation extraction, entity head identification, and entity tail identification—and enhanced performance by fine-tuning LLMs with quantized low-rank adapters (QLoRA). Similarly, Wei et al. [29] introduced ChatIE, which transforms the complex relation extraction process into a multi-turn question-answering task and integrates intermediate outputs into the final structured results. To address the poor performance of large language models (LLMs) on relation extraction tasks caused by the relatively low proportion of such tasks in instruction-tuning datasets, Zhang et al. [20] proposed a framework that aligns relation extraction tasks with question-answering tasks to enhance LLM performance. Wan et al. [21] introduced task-aware representations and improved the low correlation between entities and relations, as well as the unexplainable input–label mapping issue, by incorporating demonstration examples with enhanced reasoning logic. Dong et al. [30] and Brown et al. [19] leveraged in-context learning (ICL) to provide additional instructions to large language models within prompts, thereby enhancing the diversity of training examples and improving the models’ ability to recognize complex instance relationships. To tackle the challenges posed by the large number of predefined relation types and the uncontrollability of LLMs, Li et al. [31] proposed integrating LLMs with a natural language inference module to generate relation triples, thereby enriching document-level relation datasets.

Compared with traditional deep learning methods, large language models demonstrate a stronger ability to capture semantic information and cross-sentence dependencies in text. Nevertheless, these existing LLM-based approaches still face limitations in the equipment domain, where relation descriptions are highly specialized, overlapping, and asymmetric. Such challenges make it difficult for general-purpose LLMs to achieve satisfactory accuracy. To address these issues, this paper proposes a symmetry- and asymmetry-aware dual-path retrieval and in-context learning-based method, specifically designed for equipment relation extraction.

1.5. Paper Contributions and Organization

To address the challenges of overlapping and asymmetric relations in the equipment domain, this paper makes the following key contributions:

Novel Retrieval Strategy:We propose a symmetry- and asymmetry-aware dual-path retrieval mechanism that integrates both sentence-level and relation-level semantic similarities to retrieve highly relevant demonstration examples for in-context learning.
Domain-Specific Adaptation: We introduce a domain adaptation layer on top of the pre-trained BGE-M3 embedding model, enhancing its capability to capture the nuanced semantics and terminology of the equipment domain.
Reasoning-Enhanced In-Context Learning: We design a structured reasoning template to guide LLMs in generating detailed inference logic for each demonstration example, thereby providing richer contextual cues for relation extraction.
Comprehensive Evaluation: We conduct extensive experiments on a self-constructed equipment relation dataset, demonstrating the superiority of our method over strong baselines. We further provide in-depth analyses, including ablation studies, the impact of key hyperparameters, and performance on rare relations and complex overlapping scenarios.

The remainder of this paper is structured as follows. Section 2 reviews related work on relation extraction. Section 3 details the architecture of our proposed method. Section 4 presents the experimental setup, results, and discussions. Section 5 discusses the limitations of our work and future research directions. Finally, Section 6 concludes the paper.

2. Related Work

The objective of relation extraction is to identify semantic relations between entities from unstructured text, thereby transforming scattered entity information into structured knowledge with explicit connections [1]. In the equipment domain, relation extraction techniques can assist in mining and organizing complex knowledge networks, such as the affiliation of weapon systems, the associations between manufacturers and their products, and the cooperative capabilities among different types of equipment. The extracted relational knowledge not only provides valuable decision-making support for domain experts but also serves as a foundation for applications such as knowledge graph construction [4], intelligent question-answering systems [5], and equipment research, development, and maintenance. However, textual data in the equipment domain is often highly complex. In particular, when describing interactions involving multiple entities, overlapping relation phenomena frequently occur—where a single entity may simultaneously participate in multiple different relations, or multiple relations may share the same entity pair [2,3].

As shown in sentence (a) of Figure 1, the text “The SR-71 Blackbird reconnaissance aircraft was developed by Lockheed Corporation of the United States in 1958, and its power comes from two high-performance supersonic Pratt & Whitney J58 turbojet engines.” contains multiple entities and their relations. Among them, there is a “Manufacturer” relation between “SR-71 Blackbird reconnaissance aircraft” and “Lockheed Corporation of the United States”, forming the triplet (SR-71 Blackbird reconnaissance aircraft, Manufacturer, Lockheed Corporation of the United States). In addition, “SR-71 Blackbird reconnaissance aircraft” is also linked with “Pratt Whitney J58 turbojet engine” by an “Equipment” relation, which belongs to the Single Entity Overlap (SEO) phenomenon. Furthermore, within “Lockheed Corporation of the United States”, “United States” can also be regarded as an independent entity, and it has a “Production Country or Region” relation with “SR-71 Blackbird reconnaissance aircraft”. This kind of nested entity makes the overlapping relations even more complex. As shown in sentence (b) of Figure 1, “M1 Abrams main battle tank” and “United States” simultaneously hold both “Production Country or Region” and “Usage Country or Region” relations, forming an Entity Pair Overlap (EPO). It can be seen that in the equipment domain, the high-density and overlapping nature of relation descriptions significantly increase the difficulty of relation extraction tasks.

2.1. Relation Extraction with Large Language Models

Existing general relation extraction methods, including traditional deep learning models [13,15,16,17], struggle to fully capture the overlapping relation characteristics in the equipment domain, resulting in incomplete extraction of head entities [3].

Recently, large language models (LLMs) have revolutionized many NLP tasks, including relation extraction. The prevailing approaches can be categorized into fine-tuning and in-context learning (ICL). Fine-tuning-based methods, such as AutoRE [28], decompose document-level RE into subtasks and fine-tune LLMs with quantized adapters. ChatIE [29] transforms the RE process into a multi-turn question-answering task. While effective, these methods require substantial computational resources for domain adaptation.

In-context learning (ICL), which provides task demonstrations within the prompt, has emerged as a powerful alternative to avoid extensive fine-tuning [19,30]. For instance, GPT-RE [21] incorporates demonstration examples with enhanced reasoning logic to improve the model’s reasoning capability. QA4RE [20] aligns relation extraction tasks with question answering tasks to enhance LLM performance. However, the performance of ICL is highly sensitive to the selection of demonstration examples. Retrieval-Augmented Generation (RAG) techniques [22] aim to address this by retrieving semantically similar examples from a database to provide better context.

2.2. Our Position

To address the issues of overlapping relations and the limitations of existing ICL methods in the equipment domain, we propose a dual-path retrieval and in-context learning-based LLM method. Unlike the general retrieval strategy in [21,22] or the task alignment in [20], our work introduces a symmetry- and asymmetry-aware dual-path retrieval mechanism. This strategy retrieves examples from both sentence-level and relation-level perspectives, specifically designed to handle the complex overlapping patterns (SEO and EPO) prevalent in equipment texts. Furthermore, we incorporate a domain adaptation layer to tailor the embedding model for equipment terminology, enhancing the retrieval of domain-specific examples. Compared with traditional deep learning-based solutions [13,17] and existing LLM-based methods [20,21], the proposed approach is designed to achieve higher accuracy and better generalization in this specialized, low-resource domain.

3. Method

The overall framework of the proposed symmetry- and asymmetry-aware dual-path retrieval and in-context learning-based LLM for equipment relation extraction is shown in Figure 2, and it mainly consists of four components:

Embedding Model Fine-Tuning Module: Based on the pre-trained BGE-M3 embedding model [32], this module performs domain-specific fine-tuning to enhance its semantic representation capability in the equipment domain, thereby providing high-quality semantic vector support for subsequent retrieval tasks.
Retrieval Example Library Construction Module: Based on the fine-tuned BGE-M3 embedding model, this module builds a retrieval example library containing example text, an example sentence-level embedding vector, example relation-level embedding vector, and detailed reasoning logic generated through relation reasoning templates.
Relevance Computation Module: This module adopts a dual-path retrieval strategy, recalling relevant examples from both the sentence-level embedding and the relation-level embedding perspectives. The retrieval results are then fused and re-ranked using the Reciprocal Rank Fusion (RRF) algorithm, thereby improving both the coverage and accuracy of retrieval.
Prompt Construction Module: This module builds task-specific prompt templates that combine structured inputs with the retrieved examples, enabling the large language model to better perform relation extraction in a contextual learning setting.

3.1. Embedding Model Fine-Tuning Module

BGE-M3 is a multilingual embedding model widely utilized in text retrieval and semantic representation tasks. Its primary strength lies in capturing semantic information from text and assessing semantic relationships between texts via vector distances or similarity measures. In this chapter, the embedding model serves as the fundamental component for implementing the dual-path retrieval, mapping both the input text and the example texts in the retrieval library into semantic vectors, and enabling the rapid identification of the most relevant demonstration examples through similarity computation.

However, general-purpose embedding models are typically trained on large-scale generic corpora, resulting in a semantic space better suited to representing textual features in general domains. Their performance in specialized domains, such as the equipment domain, is therefore constrained. The equipment domain is characterized by numerous technical terms, complex technical descriptions, and distinctive semantic relationships. For instance, in equipment relation extraction tasks, input texts may contain domain-specific terms such as “radar system” and “fire-control module” whose semantic nuances general-purpose embedding models may inadequately capture.

To enhance the semantic representation capability of BGE-M3 in the equipment domain, this section performs domain-specific fine-tuning by incorporating a dedicated equipment-domain corpus and targeted optimization objectives. The fine-tuned model better adapts to the semantic requirements of the equipment domain, providing higher-quality semantic vectors for the subsequent retrieval example library and dual-path retrieval.

This section employs contrastive learning to fine-tune the BGE-M3 embedding model. The core idea of contrastive learning is to optimize the semantic space by minimizing the distance between positive sample pairs while maximizing the distance between negative sample pairs. Specifically, the InfoNCE (Noise-Contrastive Estimation) loss function is used as the objective:

L o s s = - l o g \frac{e x p \frac{s i m (x_{i}, x_{i}^{+})}{τ}}{\sum_{i = 1}^{N} e x p \frac{s i m (x_{i}, x^{+})}{τ}}

(1)

where

x_{i}

is the query sample,

x^{+}

is the positive sample related to

x_{i}

, and

x^{-}

denotes negative samples not related to

x_{i}

.

τ

is the temperature hyperparameter used to scale the similarity in contrastive learning, commonly applied to adjust the scale of similarities to prevent gradient vanishing or explosion, effectively controlling the smoothness of the softmax output distribution. The function

s i m (\cdot)

represents cosine similarity, defined as

s i m (x, y) = \frac{x \cdot y}{| | x | | | | y | |}

(2)

In addition, to further enhance the model’s adaptability to the equipment domain, this section introduces a domain-specific adaptation layer on top of BGE-M3, with the detailed architecture shown in Figure 3. This adaptation layer consists of a lightweight fully connected network, structured as follows:

h = R e L U (W_{e} \cdot e + b 1)

(3)

o = W_{2} \cdot h + b_{2}

(4)

where e denotes the input embedding vector;

W_{1}

,

W_{2}

, and

b_{1}

,

b_{2}

represent the weight matrices and bias terms; h denotes the hidden layer output; and o denotes the final domain-adapted output. Assuming that the embedding vector e generated by the embedding model belongs to a general semantic space

S_{g}

, and that the semantic space of the equipment domain

S_{d}

is a subspace of

S_{g}

, the goal of the domain adaptation layer is to apply a linear transformation to map e into

S_{d}

, thereby achieving semantic representations specific to the equipment domain:

e_{d} = f (e; θ)

(5)

where

f (\cdot)

denotes the mapping function of the domain adaptation layer and

θ

represents the learnable parameters. In this manner, the model is able to preserve general semantic information while simultaneously capturing domain-specific features of the equipment domain.

Finally, to avoid compromising the general semantic knowledge of the embedding model, the lower-layer Transformer parameters of BGE-M3 were frozen during fine-tuning, and only the parameters of the domain adaptation layer were updated. This strategy reduces computational overhead while ensuring that the model effectively learns domain-specific knowledge for the equipment domain.

3.2. Retrieval Example Library Construction Module

In the overall framework of the proposed symmetry- and asymmetry-aware dual-path retrieval and in-context learning-based LLM for equipment relation extraction, constructing a high-quality retrieval example repository is one of the key components of the entire system. In practical applications, once an input sentence is embedded into a vector, a semantically rich example repository is required to retrieve highly relevant examples, thereby providing contextual support and reasoning references for the subsequent relation extraction task. This retrieval-based approach effectively compensates for the limitations of large language models in domain-specific knowledge. By incorporating the semantic information and reasoning logic contained in relevant examples, the model can better comprehend complex relations within the input sentence. Only when highly relevant examples are retrieved can the LLM perform relation extraction with higher precision.In the equipment domain, relations in text are often highly specialized and implicit. Relying solely on the input sentence may not fully capture its semantic features. Through retrieved relevant examples, the model gains additional contextual information and domain knowledge, which not only supplements potentially missing background details in the input but also provides explicit reasoning pathways and reference relation patterns. This significantly enhances the model’s ability to recognize complex relations and improves extraction accuracy.

The construction of the retrieval example repository primarily consists of four components: example text, example sentence-level embedding vector, example relation-level embedding vector, and example reasoning logic. The example text is sourced from a subset of the dataset designated as the retrieval example set, which is carefully curated to comprehensively cover typical scenarios and relation types within the equipment domain. The sentence-level and relation-level embeddings are generated using the domain-specific fine-tuned BGE-M3 embedding model. This domain-oriented fine-tuning significantly enhances the model’s semantic representation capability in the equipment domain, ensuring that it can capture more fine-grained semantic features.

In addition, to further enhance the practicality of the retrieval example repository, this section designs a dedicated relation reasoning template to guide the large language model in generating detailed reasoning logic for the retrieved examples. This reasoning logic serves as an important supplementary resource for the subsequent construction of the prompt templates. The generation process of the example reasoning logic is illustrated in Figure 4.

In this module, given an example

e x a m p l e_{i} = (t e x t, e n t i t y_{1}, e n t i t y_{2}, r e l a t i o n)

, where

t e x t

is the original text,

e n t i t y_{1}

and

e n t i t y_{2}

are the entities, and

r e l a t i o n

is the relation label between them, a query prompt is first generated based on the example.

Q u e r y_{i} = i n t h e s e n t e n c e^{'} c o n t e x t^{'}

(6)

w h i c h c u e s i n d i c a t e t h a t t h e r e l a t i o n b e t w e e n^{'} e n t i t y_{1}^{'} a n d^{'} e n t i t y_{2}^{'} i s^{'} r e l a t i o n^{'}

(7)

The purpose of this query prompt is to guide GPT-4 to understand the relation extraction task and to require it to output cues related to the relation

r e l a t i o n

based on contextual information. When this query prompt is fed into the GPT-4 model, the model generates the reasoning logic

I n f e r e n c e_{i}

, typically manifested as sentences or phrases pertinent to the relation, which help the model infer the actual relation between the two entities.

Subsequently, this query prompt is fed into the GPT-4 model, and the reasoning evidence generated by the model is denoted as

I n f e r e n c e_{i}

. These reasoning cues, combined with the original example, are used to construct an enhanced example.

E n h a n c e d E x a m p l e_{i} = (t e x t, e n t i t y_{1}, e n t i t y_{2}, r e l a t i o n, i n f e r e n c e_{i})

(8)

In this way, the original example

e x a m p l e_{i}

is transformed into an enhanced example,

E n h a n c e d E x a m p l e_{i}

, containing reasoning cues, while also providing the model with richer contextual information to facilitate a deeper understanding of the implicit connections between the relations.

3.3. Relevance Computation Module

When applying the in-context learning paradigm of large language models to the task of relation extraction in the equipment domain, a high degree of relevance between demonstration examples and the current input text can assist the model in rapidly identifying the potential connections between them, thereby enabling more accurate prediction of the implicit relations within the input text. In this section, building on the fine-tuned BGE-M3 embedding model from the previous section, a dual-path retrieval strategy is implemented to significantly improve both the coverage and accuracy of retrieval, while the RRF algorithm is employed to rank the retrieval results through weighted aggregation of rankings.

3.3.1. Sentence-Level Embedding Retrieval

Sentence-level embedding retrieval uses the overall semantic vector of the input text to perform retrieval, allowing it to capture global semantic information. This approach is particularly suitable when the input text is highly similar in overall semantics to the example texts in the retrieval repository. For an input sentence s, its embedding vector

v_{S}

is generated using the fine-tuned BGE-M3 model. Subsequently,

v_{s}

is compared with the sentence-level embeddings of all examples in the retrieval repository using cosine similarity:

s i m_{s e n t e n c e} (v_{s}, v_{s_{j}}) = \frac{v_{s} \cdot v_{s_{i}}}{| | v_{s} | | | | v_{s_{j}} | |},

(9)

In this formulation,

v_{s}

represents the embedding vector of the input sentence,

v_{s_{i}}

corresponds to the embedding vector of the

i - t h

sentence within the database, and

s i m_{s e n t e n c e}

denotes the sentence-level semantic similarity. By evaluating these similarity scores, the k most semantically relevant examples can be retrieved from the repository to support subsequent processing.

3.3.2. Relation-Level Embedding Retrieval

Relation-level embedding retrieval focuses on the subject and object entities and their relations within a sentence, aiming to enhance retrieval accuracy through more fine-grained semantic representations. Compared with sentence-level embedding retrieval, relation-level retrieval is better suited to capture local semantic information, which is particularly important in tasks within the equipment domain that involve complex entity relations. Furthermore, relation-level embedding retrieval effectively compensates for the limitations of sentence-level retrieval in handling local semantic matches, thereby improving overall retrieval performance.

Firstly, a BERT-based relation extraction fine-tuning method [33] is employed to extract the subject and object entities from the input sentence. For instance, in the sentence “M60 is the first main battle tank of the United States, serving in the U.S.”, the output tagging is “[CLS][SUB] M60 [/SUB] is the first main battle tank of [OBJ] the United States [/OBJ][SEP]”. The entities enclosed by the [SUB] and [OBJ] tags, namely “M60” and “the United States”, correspond to the subject and object entities, respectively.

Subsequently, the fine-tuned BGE-M3 model is employed to generate the embedding vectors of the subject entity and the object entity, denoted as

v_{s u b}

and

v_{o b j}

, respectively. These two embedding vectors are then concatenated along their dimensions to form the following relation-level semantic vector:

v_{r} = v_{s u b} \oplus v_{o b j}

(10)

In addition, in order to enable relation-level embedding retrieval, each example in the retrieval corpus must be pre-computed with its corresponding relation-level semantic vector. The procedure is identical to the aforementioned steps, namely extracting the subject–object entity pair from each example and generating its relation-level semantic representation. Finally, given the relation-level semantic vector

v_{r}

of the input sentence, the similarity score with the relation-level semantic vector

v_{r_{i}}

of each example in the retrieval corpus is computed as follows:

s i m_{r e l a t i o n} (v_{r}, v_{r_{j}}) = \frac{v_{r} \cdot v_{r_{i}}}{| | v_{r} | | | | v_{r_{i}} | |}

(11)

In this formulation,

v_{r}

denotes the relation-level semantic vector of the input text, while

v_{r_{i}}

, represents the relation-level semantic vector of the i-th example in the retrieval corpus.

s i m_{r e l a t i o n}

indicates the relation-level similarity. By calculating the similarity scores, it becomes possible to retrieve the top-k examples from the retrieval corpus that are most semantically relevant to the subject–object pair in the input sentence at a fine-grained level.

3.3.3. RRF Ranking

After obtaining the retrieval results from both sentence-level embedding retrieval and relation-level embedding retrieval, the RRF algorithm is employed to enhance the accuracy and comprehensiveness of example retrieval. By integrating results from different retrieval strategies, the RRF algorithm effectively fuses dual-path retrieval outputs through rank-based weighting, thereby leveraging the strengths of each retrieval approach. This fusion mechanism not only improves the precision of the final ranking but also enhances its robustness.

The core idea of RRF lies in aggregating the ranking positions of candidate examples from each retrieval path and assigning them weights accordingly, where examples ranked higher receive greater weights. The calculation formula is defined as

R R F s c o r e (d) = \sum_{n = 1}^{N} \frac{1}{k + r (d)}

(12)

In this context, d denotes a candidate example in the retrieval library,

r (d)

represents the rank position of example d in the n-th retrieval path, N refers to the total number of retrieval paths, and k is a constant greater than 1, introduced to mitigate the influence of abnormally high rankings from a single system. By applying the RRF algorithm, the retrieval results from both paths can be jointly considered, enabling high-ranked candidate examples from different retrieval lists to receive greater weights in the final ranking. This mechanism ensures that the fused ranking effectively integrates informative signals from both retrieval strategies.

3.4. Prompt Construction Module

In this chapter, a prompt is constructed for each input text and subsequently fed into the GPT-4 model to perform the relation extraction task. Each prompt consists of the following components:

Instructions: A brief description of the relation extraction task and the predefined set of relation categories R. The model is explicitly required to output a relation belonging to the predefined categories. If no appropriate relation is identified, the model should output NULL.

ICL Demonstrations: The k-shot example set retrieved by the relevance computation module is leveraged, and each example is combined with its corresponding chain-of-thought-guided inference logic to form a new example set, denoted as D.

Input Text: A segment of text on which the model is required to perform the equipment relation extraction task.

Within the in-context learning framework of large language models, the ordering of examples has a significant impact on model performance. Placing the most relevant examples at the beginning of the input sequence can effectively enhance the model’s reasoning and generalization capabilities. This arrangement allows the model to capture key information at an early stage, thereby improving its understanding of the task requirements and facilitating the generation of accurate outputs. Accordingly, based on the ranked retrieval examples from the previous module, the final prompt template is constructed as shown in Table 1.

After constructing the prompt based on the prompt template, the prompt is fed into the LLM to obtain the final relation extraction result.

Accordingly, the specific procedure of the symmetry- and asymmetry-aware dual-path retrieval and in-context learning-based LLM for equipment relation extraction is illustrated in Table 2.

4. Results

This section presents the experimental setup and outcomes to evaluate the proposed model’s effectiveness. We first describe the self-annotated equipment relation dataset (Section 4.1) and experimental configuration (Section 4.2). Subsequently, a comprehensive analysis is conducted (Section 4.3), encompassing comparative experiments against state-of-the-art baselines, an investigation into the impact of demonstration quantity, performance under class imbalance, effectiveness on overlapping relations, and ablation studies.

4.1. Experimental Data

In this chapter, experiments are conducted on a self-annotated dataset for relation extraction in the equipment domain to evaluate the effectiveness of the proposed model. During the dataset construction process, the relation types and their corresponding head and tail entity types are first defined. Based on an analysis of the collected equipment-domain data, seven relation types are identified for the relation extraction task in this chapter, as summarized in Table 3.

After defining the relation types and their corresponding subject-object entity types, the collected knowledge data from Wikipedia were annotated. Based on the overlapping patterns of triplets within a single sentence in the dataset, the data were categorized into three overlap types: Normal, SEO, and EPO. The Normal type indicates that each entity pair corresponds to only one relation, with no overlap. The SEO (Single Entity Overlap) type indicates that a single entity forms different relations with multiple other entities within the sentence. The EPO (Entity Pair Overlap) type indicates that multiple relations exist between the same entity pair within a sentence. Specific examples of the annotated samples are presented in Table 4.

In this chapter, a total of 8457 pieces of equipment knowledge relation extraction data were annotated for subsequent experiments. The 8457 data entries were divided into a test set and a retrieval example set at a ratio of 8:2. The specific data distribution is shown in Table 5.

The specific distribution of various overlapping relation types in the test set and the retrieval example set is shown in Table 6.

4.2. Experimental Config

The experimental environment and the optimal hyperparameter settings for the large language model are presented in Table 7 and Table 8.

The hyperparameters presented in Table 8 were determined through manual tuning based on a small validation set split from the training data. We did not employ an automated hyperparameter optimization (HPO) framework, as the limited number and intuitive nature of these parameters (e.g., Temperature, Top_p) made manual exploration feasible and efficient. The chosen values were found to yield stable and superior performance across multiple initial runs.

4.3. Experimental Results and Analysis

Evaluation Metrics:The model performance is evaluated using Precision (P), Recall (R), and F1-score (F1), which are standard metrics for relation extraction tasks. These metrics are calculated based on the counts of True Positives (TP), False Positives (FP), and False Negatives (FN):

True Positive (TP): A relation instance that is both correctly predicted and belongs to the correct relation type.
False Positive (FP): A relation instance that is predicted but is either incorrect or does not actually exist in the text.
False Negative (FN): A relation instance that exists in the ground truth but is not predicted by the model.

The formulas for the metrics are defined as follows:

P r e c i s i o n (P) = \frac{T P}{T P + F P}

(13)

R e c a l l (R) = \frac{T P}{T P + F N}

(14)

F 1 - s c o r e (F 1) = 2 \times \frac{P \times R}{P + R}

(15)

Precision measures the accuracy of the positive predictions, Recall measures the ability to find all positive instances, and the F1-score provides a harmonic mean of Precision and Recall, offering a single metric to evaluate overall performance.

4.3.1. Comparative Experiments

In order to evaluate the effectiveness of the model proposed in this chapter for addressing relation extraction tasks in the equipment domain, experiments were conducted on the equipment-domain relation dataset constructed herein. For traditional deep learning models, Relation-Aware [13], CasRel [17], TPLinker [18], and SPN [2], which currently achieve strong performance in relation extraction tasks, were selected as comparative models. Additionally, three LLM-based relation extraction models, QA4RE [20], AutoRE [28], and GPT-RE [21], were chosen as baselines to assess the instance extraction performance of the proposed method. Table 9 presents the relation extraction results of different models on the dataset constructed in this chapter.

As shown in Table 9, the Relation-Aware model exhibits the weakest performance. This is primarily because its explicit modeling of interactions between entities heavily depends on high-quality input features or annotated data, making it susceptible to errors in entity annotation or contextual noise, which leads to performance degradation. In contrast, the CasRel model achieves improvements of 7.12%, 9.68%, and 8.44% in Precision, Recall, and F1-score, respectively, over Relation-Aware, indicating that the stacked pointer network facilitates accurate identification of overlapping relations. TPLinker further increases the F1-score by 4.8% compared to CasRel, as its global matrix labeling approach prevents error propagation and more efficiently handles overlapping relations. The SPN model attains a higher F1-score than seq2seq-based methods, demonstrating that reformulating joint entity relation extraction as a set prediction problem effectively enhances the extraction of overlapping relations.

Among the LLM-based baseline models, GPT-RE performs the best. Its advantage lies in large-scale pretraining, which enables better capture of contextual semantics and implicit relations. Moreover, GPT-RE incorporates label-guided reasoning logic, explicitly explaining the mapping between input and labels in examples to enhance the model’s reasoning capabilities. However, GPT-RE still exhibits limitations in specialized equipment-domain scenarios, as its pretrained semantic space is more suited to general text features and cannot precisely model domain-specific terminology, complex technical descriptions, and unique semantic relationships in the equipment domain.

Compared with the best-performing baseline model, GPT-RE, the proposed model in this chapter achieves improvements of 2.48%, 2.13%, and 2.32% in Precision, Recall, and F1-score, respectively. These results indicate that by fine-tuning the embedding model on equipment-domain data, the proposed model can more accurately capture domain-specific semantic information, thereby generating higher-quality semantic vectors to support subsequent retrieval tasks. Furthermore, the dual-embedding retrieval strategy enables the model to simultaneously capture sentence-level semantic correlations and relation-level feature mappings. By applying the RRF algorithm to dynamically weight and fuse the multi-dimensional retrieval results, high-quality demonstration examples are ultimately provided for prompt construction. Therefore, compared with other mainstream models, the proposed model demonstrates superior performance in equipment-domain relation extraction tasks.

4.3.2. Impact of the Number of Retrieved Examples (k) Experiments

As shown in Figure 5 and Table 10, the model performance steadily improves as k increases from 1 to 5, achieving optimal results at k = 5 (Precision = 88.12%, Recall = 88.95%, F1 = 88.53%). This initial performance gain can be attributed to the increased contextual diversity and relational coverage provided by a larger set of demonstrations. With more examples, the model is exposed to a broader spectrum of semantic patterns, entity interactions, and reasoning contexts, which enhances its ability to generalize to varying expressions of the same relation and reduces its sensitivity to spurious cues in individual examples. However, when k exceeds 5, additional retrieved demonstrations begin to introduce noise, redundancy, and potentially less relevant instances, which dilute the focus of the in-context learning process. As a result, both Precision and Recall gradually decline, and the overall F1-score decreases to 87.27% at k = 10. This confirms that k = 5 offers the optimal trade-off between informational richness and noise suppression for our equipment-domain relation extraction task.

4.3.3. Macro-F1 and Micro-F1 Experiments

Micro-F1 and Macro-F1 provide complementary perspectives on model performance under class imbalance. As shown in Table 11, the model achieves strong performance on the top three frequent relations (e.g., Equipment, Manufacturer, and Producing Country/Region), with an average F1 of 90.26%. In contrast, the bottom three rare relations yield an average F1 of only 77.50%. Among them, Previous Model and Subsequent Model are particularly challenging; their instances are scarce in the dataset and their semantic patterns are diverse or implicit (e.g., succession often described indirectly), which makes them difficult for the model to generalize. Consequently, the overall Micro-F1 reaches 88.53%, while the Macro-F1 drops to 84.19%. This discrepancy indicates that frequent relations dominate the micro-level performance, whereas the macro-level average reveals the model’s sensitivity to rare relations, especially those involving evolutionary links between equipment models.

4.3.4. Overlapping Relation-Type Experiments

Our proposed model is specifically designed to address complex challenges such as nested entities, implicit relations, and overlapping relations (including Single Entity Overlap (SEO) and Entity Pair Overlap (EPO)). While it demonstrates significant overall improvement over existing methods, experimental analysis reveals that Entity Pair Overlap (EPO) remains the most challenging scenario, representing the model’s key limitation.

As shown in Table 12, the model achieves optimal performance on normal sentences (F1: 90.12%), performs well on SEO sentences (F1: 88.60%), but exhibits comparatively lower effectiveness on EPO sentences (F1: 86.87%). The primary reasons for its suboptimal adaptation to EPO are as follows:

Ineffective Relation-Level Retrieval for EPO: The relation-level retrieval path relies on similarity matching of (Entity_A, Entity_B) pairs. However, in EPO cases, the same entity pair participates in multiple relations (e.g., both “Producing Country” and “Operating Country”). Retrieved examples for (Entity_A, Entity_B) may only reflect one of these relations, failing to provide comprehensive contextual support for predicting all valid relations simultaneously. This leads to incomplete or biased reasoning during in-context learning.
Semantic Interference in Multi-Relation Learning: Unlike SEO cases, where multiple relations involve distinct entity pairs, EPO requires disambiguating semantically close relations for the same pair. The model’s reasoning process is occasionally confounded by this inherent ambiguity, especially when the supporting context is sparse or asymmetric.
Limitations in Reasoning Logic Generalization: Although the reasoning logic generated by GPT-4 enhances performance, it sometimes fails to capture all valid relations for EPO cases within a single inference chain. This reflects a broader limitation in holistic multi-relation reasoning.

In conclusion, while our model significantly advances the state of the art in handling complex relation extraction tasks, it remains least adapted to Entity Pair Overlap (EPO) due to limitations in retrieval granularity, subtle cue sensitivity, and multi-relation reasoning. Improving performance on EPO represents a critical direction for future work, potentially through enhanced retrieval mechanisms capable of capturing multi-relational semantics and more sophisticated reasoning logic generation.

4.3.5. Ablation Experiments

To further validate the effectiveness of the retrieval strategy design, ablation experiments were conducted on different example retrieval methods, investigating the impact of random selection, sentence-level embedding retrieval, relation-level embedding retrieval, and dual-path retrieval on the model’s performance.

Table 13 presents the model performance on the self-constructed equipment domain test set:

The random selection (Random) strategy achieves an F1-score of only 73.53%, with its performance disadvantage particularly evident in complex relational scenarios within the equipment domain. Since randomly selected examples are unlikely to match the actual input text, the model may perform reasoning based on incorrect context, leading to inaccurate outputs.
The removal of the domain adaptation layer resulted in a 2.63% drop in the model’s F1-score, indicating that this module contributes substantially to recognition performance. The performance degradation primarily stems from the embedding vectors’ failure to adequately capture domain-specific semantic structures and term distributions, such as subtle distinctions in asymmetric relations and composite entities. By mapping the general semantic space into a domain-specific subspace, this layer enhances the discriminativity of entity and relation representations, thereby effectively supporting subsequent retrieval and reasoning processes. It serves as a key factor in achieving the current high level of accuracy.
Sentence-level embedding retrieval (Sentence) increases the F1-score to 80.76%. This strategy dynamically selects the most relevant examples by computing semantic similarity between the input text and the examples, enhancing the model’s adaptability to diverse inputs. However, relation extraction tasks focus on entity pairs, which may not fully align with the semantics of the entire sentence.
Relation-level embedding retrieval (Relation) extracts semantic features of entity pairs, allowing the model to precisely focus on the interactions and specific relational semantics between instances while avoiding interference from the overall sentence semantics. This approach achieves an F1-score of 85.99%.
The proposed dual-path retrieval strategy (Sentence + Relation) achieves an F1-score of 88.53%, improving by 1.73% over the best single-path retrieval. This strategy captures global semantic information through sentence-level embedding retrieval to understand context, while simultaneously focusing on local interactions and specific relational semantics of entity pairs via relation-level embedding retrieval. By combining global and local features, it effectively enhances retrieval relevance as well as the accuracy and robustness of relation extraction. These results demonstrate that incorporating a dual-path retrieval strategy further improves model performance.

5. Discussion

Although the proposed method has demonstrated significant effectiveness in sentence-level equipment relation extraction, we acknowledge several limitations regarding its scalability and generalizability, which point to valuable directions for future research.

Firstly, our current work primarily focuses on intra-sentence relation extraction. However, many relational facts in real-world scenarios are spread across multiple sentences or even an entire document (document-level RE). How our dual-path retrieval mechanism can effectively retrieve relevant examples from longer texts, and how the LLM handles long-range dependencies across sentences, remain open questions. Extending the current framework to the document level is a natural and critical next step.

Secondly, the generalizability of the method requires further validation. Since the training data are derived from the equipment domain, the model is tailored to military-related texts and may not generalize well to open-domain scenarios. To our knowledge, there are currently no publicly available datasets that fully match the domain-specific nature of our equipment data. While some open-access datasets for general relation extraction exist, they cannot fully replicate our experiments. Cross-lingual and cross-domain adaptability remains a key challenge for practical deployment. For instance, while the model performs well on Chinese equipment texts, its effectiveness in English, French, or other languages is uncertain. Likewise, a model trained on the aviation equipment domain may not transfer seamlessly to shipbuilding or electronic warfare domains. Rigorous evaluation with multilingual and multi-domain datasets will therefore be essential, and this constitutes a key part of our planned future work.

Finally, while the dual-path retrieval strategy is effective overall, a qualitative analysis of its failure cases would be highly valuable. For example, when dealing with Entity Pair Overlap (EPO), the relation-level retrieval might recall biased examples that only contain one of the multiple valid relations, thereby misleading the model to ignore others. A deep dive into these characteristic failure cases will provide crucial insights for designing more robust retrieval fusion algorithms and more precise reasoning logic templates.

In addition to addressing limitations, the proposed method has potential practical applications beyond the military domain. For example, it could support knowledge graph construction, relation extraction, and information organization in civilian and industrial sectors. By leveraging the dual-path retrieval mechanism and LLM-based reasoning, the approach could facilitate structured knowledge extraction from domain-specific corpora, offering broader utility and impact in diverse real-world scenarios.

6. Conclusions

To address the issue of overlapping relations in equipment domain data, this work proposes a symmetry- and asymmetry-aware dual-path retrieval and in-context learning-based LLM for equipment relation extraction. The method first performs domain-specific fine-tuning on the BGE-M3 embedding model to enhance its semantic representation capabilities within the equipment domain. Next, the fine-tuned embedding model is used to generate both sentence-level and relation-level embeddings for the examples, and a relation reasoning template is designed to guide the LLM in producing detailed inference logic for each example, thereby constructing a retrieval example repository containing example texts, sentence-level embeddings, relation-level embeddings, and inference logic. Dual-path retrieval is then performed on this repository using sentence-level and relation-level embeddings, and the RRF algorithm is employed to fuse and rank the retrieval results to obtain high-quality demonstration examples. Finally, the top k ranked examples are used to construct prompt templates, enabling the LLM to leverage its in-context learning capability to perform relation extraction more accurately. Experiments on an equipment domain dataset show that this method achieves an F1-score improvement of approximately 1.51% over the best baseline model considered in this work, demonstrating the effectiveness of the proposed approach for equipment relation extraction.

Author Contributions

Conceptualization, M.T., L.Z. and X.L.; methodology, M.T. and L.Z.; software, M.T. and L.Z.; validation, M.T., L.Z. and X.S.; formal analysis, M.T., Z.Y. and X.S.; investigation, Z.Y.; resources, M.T.; data curation, M.T., Z.Y. and X.S.; writing—original draft preparation, M.T.; writing—review and editing, M.T. and L.Z.; visualization, M.T.; supervision, X.L.; project administration, M.T. and Z.Y.; funding acquisition, L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Key R&D Program of China under Grant 2022YFF0604502.

Data Availability Statement

The data used in this study were compiled and recorded by the authors. Due to the involvement of sensitive military information, these data cannot be made publicly available.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Diaz-Garcia, J.A.; Lopez, J.A.D. A survey on cutting-edge relation extraction techniques based on language models. Artif. Intell. Rev. 2025, 58, 287. [Google Scholar] [CrossRef]
Sui, D.; Zeng, X.; Chen, Y.; Liu, K.; Zhao, J. Joint entity and relation extraction with set prediction networks. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 12784–12795. [Google Scholar] [CrossRef] [PubMed]
Zhong, L.; Wu, J.; Li, Q.; Peng, H.; Wu, X. A Comprehensive Survey on Automatic Knowledge Graph Construction. ACM Comput. Surv. 2023, 56. [Google Scholar] [CrossRef]
Srihari, R.; Li, W. A question answering system supported by information extraction. In Proceedings of the Sixth Conference on Applied Natural Language Processing, ANLC ’00, Seattle, WA, USA, 29 April–4 May 2000; pp. 166–172. [Google Scholar] [CrossRef]
Chen, X.; Zhang, N.; Xie, X.; Deng, S.; Yao, Y.; Tan, C.; Huang, F.; Si, L.; Chen, H. KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction. In Proceedings of the ACM Web Conference 2022, WWW ’22, Lyon, France, 25–29 April 2022; Association for Computing Machinery: New York, NY, USA, 2022; pp. 2778–2788. [Google Scholar] [CrossRef]
Zhou, W.; Chen, M. An Improved Baseline for Sentence-level Relation Extraction. In Volume 2: Short Papers, Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, Online, 20–23 November 2022; He, Y., Ji, H., Li, S., Liu, Y., Chang, C.H., Eds.; Association for Computational Linguistics: Kerrville, TX, USA, 2022; pp. 161–168. [Google Scholar] [CrossRef]
Aydar, M.; Bozal, O.; Ozbay, F. Neural relation extraction: A survey. arXiv 2020, arXiv:2007.04247. [Google Scholar] [CrossRef]
Chinchor, N.; Marsh, E. Muc-7 information extraction task definition. In Proceedings of the Seventh Message Understanding Conference (MUC-7), Fairfax, VA, USA, 29 April–1 May 1998; pp. 359–367. [Google Scholar]
Lafferty, J.; McCallum, A.; Pereira, F. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the ICML, Williamstown, MA, USA, 28 June–1 July 2001; Volume 1, pp. 282–289. [Google Scholar]
Pawar, S.; Palshikar, G.K.; Bhattacharyya, P. Relation extraction: A survey. arXiv 2017, arXiv:1712.05191. [Google Scholar] [CrossRef]
Agichtein, E.; Gravano, L. Snowball: Extracting relations from large plain-text collections. In Proceedings of the Fifth ACM Conference on Digital Libraries, San Antonio, TX, USA, 2–7 June 2000; pp. 85–94. [Google Scholar]
Hasegawa, T.; Sekine, S.; Grishman, R. Discovering Relations among Named Entities from Large Corpora. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04), Barcelona, Spain, 21–26 July 2004; pp. 415–422. [Google Scholar] [CrossRef]
Hong, Y.; Liu, Y.; Yang, S.; Zhang, K.; Wen, A.; Hu, J. Improving graph convolutional networks based on relation-aware attention for end-to-end relation extraction. IEEE Access 2020, 8, 51315–51323. [Google Scholar] [CrossRef]
Dai, D.; Xiao, X.; Lyu, Y.; Dou, S.; She, Q.; Wang, H. Joint extraction of entities and overlapping relations using position-attentive sequence labeling. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 29–31 January 2019; Volume 33, pp. 6300–6308. [Google Scholar]
Wei, Z.; Su, J.; Wang, Y.; Tian, Y.; Chang, Y. A Novel Cascade Binary Tagging Framework for Relational Triple Extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 1476–1488. [Google Scholar]
Wang, Y.; Yu, B.; Zhang, Y.; Liu, T.; Zhu, H.; Sun, L. TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking. In Proceedings of the 28th International Conference on Computational Linguistics, Online, 8–13 December 2020; pp. 1572–1582. [Google Scholar]
Fu, T.J.; Li, P.H.; Ma, W.Y. Graphrel: Modeling text as relational graphs for joint entity and relation extraction. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 1409–1418. [Google Scholar]
Zeng, X.; He, S.; Zeng, D.; Liu, K.; Liu, S.; Zhao, J. Learning the extraction order of multiple relational facts in a sentence with reinforcement learning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 367–377. [Google Scholar]
Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems 33; Conference on Neural Information Processing Systems: Vancouver, BC, Canada, 2020; pp. 1877–1901. [Google Scholar]
Zhang, K.; Gutiérrez, B.J.; Su, Y. Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors. In Findings of the Association for Computational Linguistics: ACL 2023; Association for Computational Linguistics: Kerrville, TX, USA, 2023; pp. 794–812. [Google Scholar]
Wan, Z.; Cheng, F.; Mao, Z.; Liu, Q.; Song, H.; Li, J.; Kurohashi, S. GPT-RE: In-context Learning for Relation Extraction using Large Language Models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing; Bouamor, H., Pino, J., Bali, K., Eds.; Association for Computational Linguistics: Kerrville, TX, USA, 2023; pp. 3534–3547. [Google Scholar] [CrossRef]
Efeoglu, S.; Paschke, A. Retrieval-augmented generation-based relation extraction. arXiv 2024, arXiv:2404.13397. [Google Scholar]
Chaplot, D.S. Albert q. jiang, alexandre sablayrolles, arthur mensch, chris bamford, devendra singh chaplot, diego de las casas, florian bressand, gianna lengyel, guillaume lample, lucile saulnier, lélio renard lavaud, marie-anne lachaux, pierre stock, teven le scao, thibaut lavril, thomas wang, timothée lacroix, william el sayed. arXiv 2023, arXiv:2310.06825. [Google Scholar]
Touvron, H.; Martin, L.; Stone, K.; Albert, P.; Almahairi, A.; Babaei, Y.; Bashlykov, N.; Batra, S.; Bhargava, P.; Bhosale, S.; et al. Llama 2: Open foundation and fine-tuned chat models. arXiv 2023, arXiv:2307.09288. [Google Scholar] [CrossRef]
Chung, H.W.; Hou, L.; Longpre, S.; Zoph, B.; Tay, Y.; Fedus, W.; Li, Y.; Wang, X.; Dehghani, M.; Brahma, S.; et al. Scaling instruction-finetuned language models. J. Mach. Learn. Res. 2024, 25, 1–53. [Google Scholar]
Wang, C.; Liu, X.; Chen, Z.; Hong, H.; Tang, J.; Song, D. DEEPSTRUCT: Pretraining of Language Models for Structure Prediction. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022, Dublin, Ireland, 22–27 May 2022; Association for Computational Linguistics (ACL): Kerrville, TX, USA, 2022; pp. 803–823. [Google Scholar]
Paolini, G.; Athiwaratkun, B.; Krone, J.; Ma, J.; Achille, A.; Anubhai, R.; Santos, C.N.D.; Xiang, B.; Soatto, S. Structured prediction as translation between augmented natural languages. In Proceedings of the International Conference on Learning Representations, ICLR, Virtual, 3–7 May 2021; pp. 1–26. [Google Scholar]
Xue, L.; Zhang, D.; Dong, Y.; Tang, J. AutoRE: Document-Level Relation Extraction with Large Language Models. In Volume 3: System Demonstrations, Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, Bangkok, Thailand, 11–16 August 2024; Cao, Y., Feng, Y., Xiong, D., Eds.; Association for Computational Linguistics: Kerrville, TX, USA, 2024; pp. 211–220. [Google Scholar] [CrossRef]
Wei, X.; Cui, X.; Cheng, N.; Wang, X.; Zhang, X.; Huang, S.; Xie, P.; Xu, J.; Chen, Y.; Zhang, M.; et al. Chatie: Zero-shot information extraction via chatting with chatgpt. arXiv 2023, arXiv:2302.10205. [Google Scholar]
Dong, Q.; Li, L.; Dai, D.; Zheng, C.; Ma, J.; Li, R.; Xia, H.; Xu, J.; Wu, Z.; Liu, T.; et al. A survey on in-context learning. arXiv 2022, arXiv:2301.00234. [Google Scholar]
Li, J.; Jia, Z.; Zheng, Z. Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–10 December 2023; pp. 5495–5505. [Google Scholar]
Chen, J.; Xiao, S.; Zhang, P.; Luo, K.; Lian, D.; Liu, Z. Bge m3-embedding: Multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation. arXiv 2024, arXiv:2402.03216. [Google Scholar]
Baldini Soares, L.; FitzGerald, N.; Ling, J.; Kwiatkowski, T. Matching the Blanks: Distributional Similarity for Relation Learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics; Korhonen, A., Traum, D., Màrquez, L., Eds.; Association for Computational Linguistics: Kerrville, TX, USA, 2019; pp. 2895–2905. [Google Scholar] [CrossRef]

Figure 1. Example of overlapping relation extraction data in the equipment domain. (a) Single Entity Overlap; (b) Entity Pair Overlap.

Figure 2. Overall framework of the proposed symmetry- and asymmetry-aware dual-path retrieval and in-context learning-based LLM for equipment relation extraction.

Figure 3. Architecture of the domain adaptation layer module.

Figure 4. Guiding GPT-4 to generate reasoning logic through relation reasoning templates.

Figure 5. Performance trends under different numbers of retrieved examples (k).

Table 1. Relation extraction prompt template.

Prompt	Prompt Template
Prompt	Instruction: Instructions
	In-Context Learning Example: ICL Demonstrations
	D₁, with relevance ranking to the input text: r₁
	D₂, with relevance ranking to the input text: r₂
	...
	D_n, with relevance ranking to the input text: r_n
	Input: Input Text
	Output:

Table 2. Relation extraction algorithm.

Prompt
Input: Input Text t
Output: Relation Extraction Results A
1. Initialization of the Retrieval Example Set
2. $G_{s e n t e n c e} \leftarrow R e t r i e v e (R e t r i e v a l E x a m p l e R e p o s i t o r y, v_{s e n t e n c e})$
3. $G_{r e l a t i o n} \leftarrow R e t r i e v e (R e t r i e v a l E x a m p l e R e p o s i t o r y, v_{r e l a t i o n})$
4. $G \leftarrow R R F (G_{s e n t e n c e}, G_{r e l a t i o n})$
5. for g in G do
T ← T⋃(g)
End
6. prompt ← ConstructPrompt(T, t)
7. A ← LLM(prompt)
8. return A

Table 3. Relation types in the equipment domain.

Relation Types	Subject Entity Types	Object Entity Types
Manufacturer	Plane	Manufacturing Plant
	Tank
	Missile
	Firearms
	Equipment
Equip	Plane	Equipment
	Tank
	Missile
Operating Forces	Plane	Military Unit
	Tank
	Missile
	Firearms
Producing Country or Region	Plane	Country or Region
	Tank
	Missile
	Firearms
Operating Country or Region	Plane	Country or Region
	Tank
	Missile
	Firearms
Previous Model	Plane	Plane
	Tank	Tank
	Missile	Missile
	Firearms	Firearms
	Equipment	Equipment
Subsequent Model	Plane	Plane
	Tank	Tank
	Missile	Missile
	Firearms	Firearms
	Equipment	Equipment

Table 4. Examples of relation extraction data annotation.

Overlap Type	Data Text	Annotated Example
Normal	The main gun of the VT-4 main battle tank is a 125 mm smoothbore cannon.	(VT-4 main battle tank, Equip, 125 mm smoothbore cannon)
SEO	The aircraft carrier CVN-68 was built by Newport News Shipbuilding in the United States, with construction commencing in 1968, and it is powered by a nuclear propulsion system.	(CVN-68, Manufacturer, Newport News Shipbuilding)
		(CVN-68, Producing Country or Region, the United States)
		(CVN-68, Equip, nuclear propulsion system)
EPO	The M1 Abrams main battle tank is a type of main battle tank developed and equipped by the United States in the 1970s.	(The M1 Abrams main battle tank, Producing Country or Region, the United States)
EPO		(The M1 Abrams main battle tank, Operating Country or Region, the United States)

Table 5. Table of Dataset Division for Relation Extraction.

Dataset	Number of Sentences
Test Set	6766
Retrieval Example Set	1691
Total	8457

Table 6. Table of overlapping relation type distribution.

Overlap Type	Test Set	Retrieval Example Set
Normal	3705	855
EPO	1486	327
SEO	1575	509

Table 7. Experimental environment.

Experimental Environment	Detailed Configuration
Operating System	CentOS 7.9
Python	3.8
Pytorch	1.13
RAM	32 GB
GPU	2 × RTX3090, 24 GB

Table 8. Large model hyperparameters.

Parameter Name	Parameter Value
Temperature	0.8
Top_p	0.7
Presence_penalty	1.0
Response_format	“json_object”

Table 9. Experimental results of each model on the equipment-domain dataset.

Technology	Model	Precision	Recall	F1
Deep Learning-Based	Relation-Aware	73.52	68.63	71.00
	CasRel	80.64	78.31	79.44
	TPLinker	83.51	85.02	84.24
	SPN	84.62	85.73	85.16
Pre-Trained Large Language Model-Based	QA4RE	81.53	83.24	82.36
	AutoRE	85.01	85.89	85.45
	GPT-RE	85.64	86.82	86.21
	Ours	88.12	88.95	88.53

Table 10. Performance comparison under different numbers of retrieved examples (k).

Retrieved Examples (k)	Precision	Recall	F1
1	84.20	84.70	84.45
2	85.90	86.50	86.20
3	86.90	87.60	87.25
4	87.65	88.40	88.02
5	88.12	88.95	88.53
6	87.95	88.60	88.27
7	87.70	88.30	88.00
8	87.45	88.05	87.75
9	87.20	87.80	87.50
10	87.00	87.55	87.27

Table 11. Macro-F1 and Micro-F1 results with comparison of frequent and rare relations.

Category	Precision (%)	Recall (%)	F1 (%)
Top Three Relations	90.63	89.90	90.26
Bottom Three Relations	77.93	77.07	77.50
Macro-Average	84.60	83.79	84.19
Micro-Average	88.12	88.95	88.53

Table 12. Detailed performance breakdown of the model by overlap type.

Category	Precision (%)	Recall (%)	F1 (%)
Normal	89.85	90.40	90.12
SEO	88.30	88.90	88.60
EPO	86.20	87.55	86.87

Table 13. Ablation study.

Retrieval Method	Precision	Recall	F1
Random	74.34	72.72	73.52
w/o Domain Adapt	85.21	86.58	85.89
Sentence	81.35	80.18	80.76
Relation	85.73	86.25	85.99
Sentence + Relation	88.12	88.95	88.53

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Tang, M.; Zhang, L.; Yu, Z.; Shi, X.; Liu, X. Symmetry- and Asymmetry-Aware Dual-Path Retrieval and In-Context Learning-Based LLM for Equipment Relation Extraction. Symmetry 2025, 17, 1647. https://doi.org/10.3390/sym17101647

AMA Style

Tang M, Zhang L, Yu Z, Shi X, Liu X. Symmetry- and Asymmetry-Aware Dual-Path Retrieval and In-Context Learning-Based LLM for Equipment Relation Extraction. Symmetry. 2025; 17(10):1647. https://doi.org/10.3390/sym17101647

Chicago/Turabian Style

Tang, Mingfei, Liang Zhang, Zhipeng Yu, Xiaolong Shi, and Xiulei Liu. 2025. "Symmetry- and Asymmetry-Aware Dual-Path Retrieval and In-Context Learning-Based LLM for Equipment Relation Extraction" Symmetry 17, no. 10: 1647. https://doi.org/10.3390/sym17101647

APA Style

Tang, M., Zhang, L., Yu, Z., Shi, X., & Liu, X. (2025). Symmetry- and Asymmetry-Aware Dual-Path Retrieval and In-Context Learning-Based LLM for Equipment Relation Extraction. Symmetry, 17(10), 1647. https://doi.org/10.3390/sym17101647

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Symmetry- and Asymmetry-Aware Dual-Path Retrieval and In-Context Learning-Based LLM for Equipment Relation Extraction

Abstract

1. Introduction

1.1. Knowledge Engineering-Based Relation Extraction

1.2. Statistical Machine Learning-Based Relation Extraction

1.3. Deep Learning-Based Relation Extraction

1.4. Large Language Model-Based Relation Extraction

1.5. Paper Contributions and Organization

2. Related Work

2.1. Relation Extraction with Large Language Models

2.2. Our Position

3. Method

3.1. Embedding Model Fine-Tuning Module

3.2. Retrieval Example Library Construction Module

3.3. Relevance Computation Module

3.3.1. Sentence-Level Embedding Retrieval

3.3.2. Relation-Level Embedding Retrieval

3.3.3. RRF Ranking

3.4. Prompt Construction Module

4. Results

4.1. Experimental Data

4.2. Experimental Config

4.3. Experimental Results and Analysis

4.3.1. Comparative Experiments

4.3.2. Impact of the Number of Retrieved Examples (k) Experiments

4.3.3. Macro-F1 and Micro-F1 Experiments

4.3.4. Overlapping Relation-Type Experiments

4.3.5. Ablation Experiments

5. Discussion

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI