3.1. Embedding Model Fine-Tuning Module
BGE-M3 is a multilingual embedding model widely utilized in text retrieval and semantic representation tasks. Its primary strength lies in capturing semantic information from text and assessing semantic relationships between texts via vector distances or similarity measures. In this chapter, the embedding model serves as the fundamental component for implementing the dual-path retrieval, mapping both the input text and the example texts in the retrieval library into semantic vectors, and enabling the rapid identification of the most relevant demonstration examples through similarity computation.
However, general-purpose embedding models are typically trained on large-scale generic corpora, resulting in a semantic space better suited to representing textual features in general domains. Their performance in specialized domains, such as the equipment domain, is therefore constrained. The equipment domain is characterized by numerous technical terms, complex technical descriptions, and distinctive semantic relationships. For instance, in equipment relation extraction tasks, input texts may contain domain-specific terms such as “radar system” and “fire-control module” whose semantic nuances general-purpose embedding models may inadequately capture.
To enhance the semantic representation capability of BGE-M3 in the equipment domain, this section performs domain-specific fine-tuning by incorporating a dedicated equipment-domain corpus and targeted optimization objectives. The fine-tuned model better adapts to the semantic requirements of the equipment domain, providing higher-quality semantic vectors for the subsequent retrieval example library and dual-path retrieval.
This section employs contrastive learning to fine-tune the BGE-M3 embedding model. The core idea of contrastive learning is to optimize the semantic space by minimizing the distance between positive sample pairs while maximizing the distance between negative sample pairs. Specifically, the InfoNCE (information noise-contrastive estimation) loss function is used as the objective:

$$\mathcal{L}_{\mathrm{InfoNCE}} = -\log \frac{\exp\!\left(\mathrm{sim}(q, p^{+})/\tau\right)}{\exp\!\left(\mathrm{sim}(q, p^{+})/\tau\right) + \sum_{i=1}^{M} \exp\!\left(\mathrm{sim}(q, p_{i}^{-})/\tau\right)}$$

where $q$ is the query sample, $p^{+}$ is the positive sample related to $q$, and $p_{i}^{-}$ ($i = 1, \dots, M$) denotes negative samples not related to $q$. The temperature hyperparameter $\tau$ scales the similarities in contrastive learning, preventing gradient vanishing or explosion and effectively controlling the smoothness of the softmax output distribution. The function $\mathrm{sim}(\cdot, \cdot)$ represents cosine similarity, defined as

$$\mathrm{sim}(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}$$
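The loss above can be illustrated numerically. The following is a minimal NumPy sketch; the vectors, the single negative sample, and the temperature value are illustrative assumptions, not the paper's actual training configuration:

```python
import numpy as np

def cosine_sim(u, v):
    # Cosine similarity: (u . v) / (||u|| ||v||)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def info_nce(q, pos, negs, tau=0.05):
    # InfoNCE: -log( exp(sim(q,p+)/tau) / (exp(sim(q,p+)/tau) + sum_i exp(sim(q,p_i-)/tau)) )
    logits = np.array([cosine_sim(q, pos) / tau] +
                      [cosine_sim(q, n) / tau for n in negs])
    # Negative log-softmax of the positive logit (index 0).
    return float(-(logits[0] - np.log(np.sum(np.exp(logits)))))

q = np.array([1.0, 0.0])
pos = np.array([0.9, 0.1])       # close to q -> high similarity -> small loss
negs = [np.array([0.0, 1.0])]    # orthogonal to q -> low similarity
loss = info_nce(q, pos, negs)
```

Minimizing this loss pulls the query toward its positive and pushes it away from the negatives, which is exactly the semantic-space shaping described above.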
In addition, to further enhance the model's adaptability to the equipment domain, this section introduces a domain-specific adaptation layer on top of BGE-M3, with the detailed architecture shown in Figure 3. This adaptation layer consists of a lightweight fully connected network, structured as follows:

$$h = \sigma(W_{1} e + b_{1}), \qquad o = W_{2} h + b_{2}$$

where $e$ denotes the input embedding vector; $W_{1}$, $W_{2}$ and $b_{1}$, $b_{2}$ represent the weight matrices and bias terms; $\sigma(\cdot)$ is the activation function; $h$ denotes the hidden-layer output; and $o$ denotes the final domain-adapted output. Assuming that the embedding vector $e$ generated by the embedding model belongs to a general semantic space $\mathcal{S}_{g}$, and that the semantic space of the equipment domain $\mathcal{S}_{d}$ is a subspace of $\mathcal{S}_{g}$, the goal of the domain adaptation layer is to learn a transformation that maps $e$ into $\mathcal{S}_{d}$, thereby achieving semantic representations specific to the equipment domain:

$$o = f(e; \theta), \qquad o \in \mathcal{S}_{d}$$

where $f(\cdot; \theta)$ denotes the mapping function of the domain adaptation layer and $\theta$ represents its learnable parameters. In this manner, the model is able to preserve general semantic information while simultaneously capturing domain-specific features of the equipment domain.
Finally, to avoid compromising the general semantic knowledge of the embedding model, the lower-layer Transformer parameters of BGE-M3 were frozen during fine-tuning, and only the parameters of the domain adaptation layer were updated. This strategy reduces computational overhead while ensuring that the model effectively learns domain-specific knowledge for the equipment domain.
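As a concrete illustration of this design, the sketch below implements the two-layer adaptation network in NumPy. The dimensions and the ReLU activation are illustrative assumptions (BGE-M3's actual output dimension is much larger), and a fixed random vector stands in for the frozen backbone's embedding:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid, d_out = 8, 4, 8  # toy dimensions, not the real BGE-M3 embedding size

# Trainable adaptation-layer parameters: the only weights updated during fine-tuning.
W1, b1 = rng.normal(size=(d_hid, d_in)), np.zeros(d_hid)
W2, b2 = rng.normal(size=(d_out, d_hid)), np.zeros(d_out)

def adapt(e):
    # h = sigma(W1 e + b1); o = W2 h + b2, with sigma taken as ReLU (assumption).
    h = np.maximum(0.0, W1 @ e + b1)
    return W2 @ h + b2

e = rng.normal(size=d_in)  # stand-in for a frozen BGE-M3 sentence embedding
o = adapt(e)
```

Because only `W1`, `b1`, `W2`, `b2` receive gradient updates, the general semantic knowledge encoded in the frozen backbone remains untouched.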
3.2. Retrieval Example Library Construction Module
In the overall framework of the proposed symmetry- and asymmetry-aware dual-path retrieval and in-context learning-based LLM for equipment relation extraction, constructing a high-quality retrieval example repository is one of the key components of the entire system. In practical applications, once an input sentence is embedded into a vector, a semantically rich example repository is required to retrieve highly relevant examples, thereby providing contextual support and reasoning references for the subsequent relation extraction task. This retrieval-based approach effectively compensates for the limitations of large language models in domain-specific knowledge. By incorporating the semantic information and reasoning logic contained in relevant examples, the model can better comprehend complex relations within the input sentence. Only when highly relevant examples are retrieved can the LLM perform relation extraction with higher precision.

In the equipment domain, relations in text are often highly specialized and implicit, and relying solely on the input sentence may not fully capture its semantic features. Through retrieved relevant examples, the model gains additional contextual information and domain knowledge, which not only supplements potentially missing background details in the input but also provides explicit reasoning pathways and reference relation patterns. This significantly enhances the model's ability to recognize complex relations and improves extraction accuracy.
The construction of the retrieval example repository primarily consists of four components: example text, example sentence-level embedding vector, example relation-level embedding vector, and example reasoning logic. The example text is sourced from a subset of the dataset designated as the retrieval example set, which is carefully curated to comprehensively cover typical scenarios and relation types within the equipment domain. The sentence-level and relation-level embeddings are generated using the domain-specific fine-tuned BGE-M3 embedding model. This domain-oriented fine-tuning significantly enhances the model’s semantic representation capability in the equipment domain, ensuring that it can capture more fine-grained semantic features.
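A repository entry can be pictured as a simple record holding these four components. In the sketch below, the field names are illustrative choices, and `embed` is a deterministic toy stand-in for the fine-tuned BGE-M3 encoder:

```python
import numpy as np

def embed(text):
    # Hypothetical stand-in for the fine-tuned BGE-M3 encoder: a unit toy vector.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=8)
    return v / np.linalg.norm(v)

def build_entry(text, subj, obj, relation, reasoning):
    # One repository record: example text, sentence-level vector,
    # relation-level vector (concatenated entity embeddings), relation label,
    # and the reasoning logic generated for the example.
    return {
        "text": text,
        "sent_vec": embed(text),
        "rel_vec": np.concatenate([embed(subj), embed(obj)]),
        "relation": relation,
        "reasoning": reasoning,
    }

entry = build_entry(
    "M60 is the first main battle tank of the United States.",
    "M60", "the United States", "country_of_origin",
    "The phrase 'main battle tank of' links the equipment to its country.",
)
```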
In addition, to further enhance the practicality of the retrieval example repository, this section designs a dedicated relation reasoning template to guide the large language model in generating detailed reasoning logic for the retrieved examples. This reasoning logic serves as an important supplementary resource for the subsequent construction of the prompt templates. The generation process of the example reasoning logic is illustrated in
Figure 4.
In this module, given an example $x = (s, e_{1}, e_{2}, r)$, where $s$ is the original text, $e_{1}$ and $e_{2}$ are the entities, and $r$ is the relation label between them, a query prompt is first generated based on the example. The purpose of this query prompt is to guide GPT-4 to understand the relation extraction task and to require it to output cues related to the relation based on contextual information. The query prompt is then fed into the GPT-4 model, and the reasoning evidence it generates is denoted as $c$, typically manifested as sentences or phrases pertinent to the relation, which help the model infer the actual relation between the two entities. These reasoning cues, combined with the original example, are used to construct an enhanced example $x' = (s, e_{1}, e_{2}, r, c)$. In this way, the original example is transformed into an enhanced example containing reasoning cues, while also providing the model with richer contextual information to facilitate a deeper understanding of the implicit relations.
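The augmentation step can be sketched as follows. Here `query_llm` is a hypothetical stand-in for the GPT-4 API call, and the prompt wording is illustrative rather than the paper's actual template from Figure 4:

```python
def make_query_prompt(s, e1, e2, r):
    # Illustrative wording; the real template is defined separately in the paper.
    return (f"Text: {s}\nEntities: {e1}, {e2}\nRelation: {r}\n"
            "Explain which contextual cues indicate this relation.")

def query_llm(prompt):
    # Hypothetical stand-in for GPT-4; a real system would call the model API here.
    return "The phrase 'main battle tank of' signals a country-of-origin relation."

def enhance(example):
    # x = (s, e1, e2, r)  ->  x' = (s, e1, e2, r, c)
    s, e1, e2, r = example
    c = query_llm(make_query_prompt(s, e1, e2, r))
    return (s, e1, e2, r, c)

x = ("M60 is the first main battle tank of the United States.",
     "M60", "the United States", "country_of_origin")
x_enh = enhance(x)
```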
3.3. Relevance Computation Module
When applying the in-context learning paradigm of large language models to the task of relation extraction in the equipment domain, a high degree of relevance between demonstration examples and the current input text can assist the model in rapidly identifying the potential connections between them, thereby enabling more accurate prediction of the implicit relations within the input text. In this section, building on the fine-tuned BGE-M3 embedding model from the previous section, a dual-path retrieval strategy is implemented to significantly improve both the coverage and accuracy of retrieval, while the RRF (reciprocal rank fusion) algorithm is employed to rank the retrieval results through weighted aggregation of rankings.
3.3.1. Sentence-Level Embedding Retrieval
Sentence-level embedding retrieval uses the overall semantic vector of the input text to perform retrieval, allowing it to capture global semantic information. This approach is particularly suitable when the input text is highly similar in overall semantics to the example texts in the retrieval repository. For an input sentence $s$, its embedding vector $v_{s}$ is generated using the fine-tuned BGE-M3 model. Subsequently, $v_{s}$ is compared with the sentence-level embeddings of all examples in the retrieval repository using cosine similarity:

$$\mathrm{sim}_{\mathrm{sent}}(v_{s}, v_{i}) = \frac{v_{s} \cdot v_{i}}{\lVert v_{s} \rVert \, \lVert v_{i} \rVert}$$

In this formulation, $v_{s}$ represents the embedding vector of the input sentence, $v_{i}$ corresponds to the embedding vector of the $i$-th example sentence within the database, and $\mathrm{sim}_{\mathrm{sent}}$ denotes the sentence-level semantic similarity. By evaluating these similarity scores, the $k$ most semantically relevant examples can be retrieved from the repository to support subsequent processing.
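This retrieval step amounts to a cosine-similarity top-k search, sketched below with toy vectors in place of BGE-M3 embeddings:

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def retrieve_topk(v_s, repo_vecs, k=2):
    # Rank repository sentence vectors by similarity to the input vector
    # and return the k best (index, score) pairs.
    scores = [(i, cosine(v_s, v)) for i, v in enumerate(repo_vecs)]
    scores.sort(key=lambda t: t[1], reverse=True)
    return scores[:k]

v_s = np.array([1.0, 0.0, 0.0])         # input-sentence embedding (toy)
repo = [np.array([0.9, 0.1, 0.0]),      # very similar example
        np.array([0.0, 1.0, 0.0]),      # orthogonal example
        np.array([0.7, 0.7, 0.0])]      # partially similar example
top = retrieve_topk(v_s, repo, k=2)
```

A production system would typically back this with an approximate nearest-neighbor index rather than a linear scan, but the ranking criterion is the same.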
3.3.2. Relation-Level Embedding Retrieval
Relation-level embedding retrieval focuses on the subject and object entities and their relations within a sentence, aiming to enhance retrieval accuracy through more fine-grained semantic representations. Compared with sentence-level embedding retrieval, relation-level retrieval is better suited to capture local semantic information, which is particularly important in tasks within the equipment domain that involve complex entity relations. Furthermore, relation-level embedding retrieval effectively compensates for the limitations of sentence-level retrieval in handling local semantic matches, thereby improving overall retrieval performance.
First, a BERT-based relation extraction fine-tuning method [33] is employed to extract the subject and object entities from the input sentence. For instance, in the sentence “M60 is the first main battle tank of the United States, serving in the U.S.”, the output tagging is “[CLS][SUB] M60 [/SUB] is the first main battle tank of [OBJ] the United States [/OBJ][SEP]”. The entities enclosed by the [SUB] and [OBJ] tags, namely “M60” and “the United States”, correspond to the subject and object entities, respectively.
Subsequently, the fine-tuned BGE-M3 model is employed to generate the embedding vectors of the subject entity and the object entity, denoted as $v_{\mathrm{sub}}$ and $v_{\mathrm{obj}}$, respectively. These two embedding vectors are then concatenated along their dimensions to form the following relation-level semantic vector:

$$v_{r} = [\, v_{\mathrm{sub}} \,;\, v_{\mathrm{obj}} \,]$$

In addition, in order to enable relation-level embedding retrieval, each example in the retrieval corpus must be pre-computed with its corresponding relation-level semantic vector. The procedure is identical to the steps above, namely extracting the subject–object entity pair from each example and generating its relation-level semantic representation. Finally, given the relation-level semantic vector $v_{r}$ of the input sentence, the similarity score with the relation-level semantic vector $v_{r}^{(i)}$ of each example in the retrieval corpus is computed as follows:

$$\mathrm{sim}_{\mathrm{rel}}(v_{r}, v_{r}^{(i)}) = \frac{v_{r} \cdot v_{r}^{(i)}}{\lVert v_{r} \rVert \, \lVert v_{r}^{(i)} \rVert}$$

In this formulation, $v_{r}$ denotes the relation-level semantic vector of the input text, $v_{r}^{(i)}$ represents the relation-level semantic vector of the $i$-th example in the retrieval corpus, and $\mathrm{sim}_{\mathrm{rel}}$ indicates the relation-level similarity. By calculating the similarity scores, it becomes possible to retrieve the top-$k$ examples from the retrieval corpus that are most semantically relevant to the subject–object pair in the input sentence at a fine-grained level.
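The concatenation and scoring steps can be sketched directly; the two-dimensional entity vectors below are toy stand-ins for BGE-M3 outputs:

```python
import numpy as np

def relation_vector(v_sub, v_obj):
    # v_r = [v_sub ; v_obj]: concatenate subject and object entity embeddings.
    return np.concatenate([v_sub, v_obj])

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Relation-level vector for the input sentence (toy entity embeddings).
v_r = relation_vector(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
# Pre-computed relation-level vector of a repository example with the same entity pair.
v_r_example = relation_vector(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
score = cosine(v_r, v_r_example)
```

Because the subject and object occupy fixed halves of the concatenated vector, the similarity is sensitive to which entity plays which role, which is what makes this path suitable for asymmetric relations.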
3.3.3. RRF Ranking
After obtaining the retrieval results from both sentence-level embedding retrieval and relation-level embedding retrieval, the RRF algorithm is employed to enhance the accuracy and comprehensiveness of example retrieval. By integrating results from different retrieval strategies, the RRF algorithm effectively fuses dual-path retrieval outputs through rank-based weighting, thereby leveraging the strengths of each retrieval approach. This fusion mechanism not only improves the precision of the final ranking but also enhances its robustness.
The core idea of RRF lies in aggregating the ranking positions of candidate examples from each retrieval path and assigning them weights accordingly, where examples ranked higher receive greater weights. The calculation formula is defined as

$$\mathrm{RRF}(d) = \sum_{n=1}^{N} \frac{1}{k + \mathrm{rank}_{n}(d)}$$

In this context, $d$ denotes a candidate example in the retrieval library, $\mathrm{rank}_{n}(d)$ represents the rank position of example $d$ in the $n$-th retrieval path, $N$ refers to the total number of retrieval paths ($N = 2$ here), and $k$ is a constant greater than 1, introduced to mitigate the influence of abnormally high rankings from a single system. By applying the RRF algorithm, the retrieval results from both paths can be jointly considered, enabling high-ranked candidate examples from different retrieval lists to receive greater weights in the final ranking. This mechanism ensures that the fused ranking effectively integrates informative signals from both retrieval strategies.
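The fusion formula translates into a few lines of code. The constant `k = 60` below is the value conventionally used for RRF, adopted here as an assumption since the paper does not state its setting:

```python
def rrf_fuse(rankings, k=60):
    # rankings: one ranked list of example ids per retrieval path (rank 1 = best).
    # RRF(d) = sum_n 1 / (k + rank_n(d)); an example absent from a path
    # simply contributes nothing for that path.
    scores = {}
    for ranked in rankings:
        for pos, d in enumerate(ranked, start=1):
            scores[d] = scores.get(d, 0.0) + 1.0 / (k + pos)
    return sorted(scores, key=scores.get, reverse=True)

sent_path = ["ex1", "ex2", "ex3"]   # sentence-level ranking
rel_path  = ["ex2", "ex3", "ex4"]   # relation-level ranking
fused = rrf_fuse([sent_path, rel_path])
```

Note how `ex2`, ranked near the top in both paths, overtakes `ex1`, which only one path retrieved, which is exactly the "jointly considered" behavior described above.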
3.4. Prompt Construction Module
In this chapter, a prompt is constructed for each input text and subsequently fed into the GPT-4 model to perform the relation extraction task. Each prompt consists of the following components:
Instructions: A brief description of the relation extraction task and the predefined set of relation categories R. The model is explicitly required to output a relation belonging to the predefined categories. If no appropriate relation is identified, the model should output NULL.
ICL Demonstrations: The k-shot example set retrieved by the relevance computation module is leveraged, and each example is combined with its corresponding chain-of-thought-guided inference logic to form a new example set, denoted as D.
Input Text: A segment of text on which the model is required to perform the equipment relation extraction task.
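Assembling the three components can be sketched as follows; the wording and layout are illustrative and do not reproduce the exact template of Table 1:

```python
def build_prompt(relations, demos, input_text):
    # relations: the predefined category set R.
    # demos: ranked (text, subj, obj, relation, reasoning) tuples, most relevant first.
    # input_text: the sentence on which to perform relation extraction.
    lines = ["Task: extract the relation between the marked entities.",
             "Allowed relations: " + ", ".join(relations) +
             ". Output NULL if none applies.",
             ""]
    for text, e1, e2, rel, reasoning in demos:
        lines += [f"Text: {text}", f"Entities: {e1}, {e2}",
                  f"Reasoning: {reasoning}", f"Relation: {rel}", ""]
    lines += [f"Text: {input_text}", "Relation:"]
    return "\n".join(lines)

demo = ("M60 is the first main battle tank of the United States.",
        "M60", "the United States", "country_of_origin",
        "'main battle tank of' links the tank to its country.")
prompt = build_prompt(["country_of_origin", "part_of"], [demo],
                      "The radar is part of the fire-control module.")
```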
Within the in-context learning framework of large language models, the ordering of examples has a significant impact on model performance. Placing the most relevant examples at the beginning of the input sequence can effectively enhance the model’s reasoning and generalization capabilities. This arrangement allows the model to capture key information at an early stage, thereby improving its understanding of the task requirements and facilitating the generation of accurate outputs. Accordingly, based on the ranked retrieval examples from the previous module, the final prompt template is constructed as shown in
Table 1.
After constructing the prompt based on the prompt template, the prompt is fed into the LLM to obtain the final relation extraction result.
Accordingly, the specific procedure of the symmetry- and asymmetry-aware dual-path retrieval and in-context learning-based LLM for equipment relation extraction is illustrated in
Table 2.