Article

HIEA: Hierarchical Inference for Entity Alignment with Collaboration of Instruction-Tuned Large Language Models and Small Models

School of Information and Artificial Intelligence, Yangzhou University, Yangzhou 225127, China
*
Author to whom correspondence should be addressed.
Electronics 2026, 15(2), 421; https://doi.org/10.3390/electronics15020421
Submission received: 27 December 2025 / Revised: 14 January 2026 / Accepted: 15 January 2026 / Published: 18 January 2026
(This article belongs to the Special Issue AI-Powered Natural Language Processing Applications)

Abstract

Entity alignment (EA) facilitates knowledge fusion by matching semantically identical entities in distinct knowledge graphs (KGs). Existing embedding-based methods rely solely on intrinsic KG facts and often struggle with long-tail entities due to insufficient information. Recently, large language models (LLMs), empowered by rich background knowledge and strong reasoning abilities, have shown promise for EA. However, most current LLM-enhanced approaches follow the in-context learning paradigm, requiring multi-round interactions with carefully designed prompts to perform additional auxiliary operations, which leads to substantial computational overhead. Moreover, they fail to fully exploit the complementary strengths of embedding-based small models and LLMs. To address these limitations, we propose HIEA, a novel hierarchical inference framework for entity alignment. By instruction-tuning a generative LLM with a unified and concise prompt and a knowledge adapter, HIEA produces alignment results with a single LLM invocation. Meanwhile, embedding-based small models not only generate candidate entities but also support the LLM through data augmentation and certainty-aware source entity classification, fostering deeper collaboration between small models and LLMs. Extensive experiments on both standard and highly heterogeneous benchmarks demonstrate that HIEA consistently outperforms existing embedding-based and LLM-enhanced methods, achieving absolute Hits@1 improvements of up to 5.6%, while significantly reducing inference cost.

1. Introduction

Knowledge graphs (KGs) represent factual knowledge with structured triples and have been widely applied in research areas, such as question answering [1], recommender systems [2], and intelligent diagnosis [3]. Entity alignment (EA) can automatically identify equivalent entities that refer to the same real-world object across different KGs, thereby enabling KG fusion and expanding overall knowledge coverage. As a result, EA has received increasing attention from the research community.
Current EA methods mainly rely on knowledge representation learning [4], including translation model-based [5,6] or graph neural network (GNN)-based approaches [7,8]. These methods encode symbolic KGs into continuous vector spaces and infer aligned entities by evaluating similarities between entity embeddings. However, they depend heavily on the inherent structural information in KGs, and thus often perform poorly when dealing with long-tail entities with sparse topological connections. Beyond pure structural features, some studies incorporate additional information—such as entity attributes [9], images [10], and temporal signals [11]—to build more comprehensive entity representations and improve alignment accuracy. Nevertheless, these methods still face challenges in multi-feature fusion and cross-modal inconsistency. Moreover, they solely utilize information already contained within the KG and overlook the rich, dynamically evolving open-world knowledge associated with entities. This hinders their ability to capture the complete semantics of entities and relations. As illustrated in Figure 1, for the entity “Chen Ning Yang”, traditional EA approaches (collectively referred to as small model (SM)-based methods) primarily focus on encoding neighboring entities and their connecting relations. As a result, they may have limited access to richer contextual information beyond the local KG structure, such as Yang’s life experiences, scientific career, and broader academic influence. In contrast, large language models (LLMs), pretrained on massive corpora from scientific literature and the internet, can provide complementary semantic knowledge that extends beyond the explicit facts in typical KGs. This capability motivates the integration of LLMs into the EA process.
Recently, several LLM-enhanced EA methods have been proposed. Approaches such as LLMEA [12], Seg-Align [13], and ChatEA [14] leverage the reasoning capability of LLMs to identify the most suitable match from candidate entities generated by embedding-based small models. LLM4EA [15] and HLMEA [16] use data annotated by LLMs to conduct entity alignment in unsupervised settings. EasyEA [17] is an entirely LLM-driven framework that leverages the summarization capability of LLMs to extract relevant semantic information from KG data. Although these methods improve EA accuracy and transparency, they still exhibit several limitations. (1) The aforementioned approaches generally adopt the in-context learning paradigm. To fully unleash the reasoning potential of pretrained LLMs, they require highly engineered and diverse prompts, along with multiple rounds of online interaction, including generating entity descriptions [14], producing summaries [17], and performing majority voting [16]. This leads to high computational and monetary overhead, particularly since most methods rely on very-large-scale LLMs (e.g., LLaMA2-70B [18]) or closed-source commercial models (e.g., GPT-4 [19]). For example, ChatEA requires an average of 9800 tokens and 131 s for GPT-4 to predict a target entity on the ICEWS-WIKI [20] dataset. (2) KGs contain rich non-textual structural information, such as relational patterns and subgraph structures. Existing methods attempt to inform LLMs of this information by expanding prompts with entity neighborhoods. However, a noticeable semantic gap exists between structural and textual information, and this strategy still cannot effectively convey KG structural features to LLMs [21]. Instead, it often introduces redundant or irrelevant information, further increasing prompt length. (3) Although LLMs demonstrate impressive performance, they remain computationally expensive to use. In contrast, embedding-based small models provide extremely fast inference but struggle with information-sparse entities. Existing methods utilize SMs only for constraining the generative space of LLMs, without achieving genuine complementarity between SMs and LLMs.
To overcome these limitations, we propose HIEA, a hierarchical inference framework for entity alignment based on the collaboration between instruction-tuned LLMs and small models. It substantially outperforms existing LLM-enhanced EA methods in both effectiveness and efficiency. Specifically, (1) to reduce prompting and interaction cost, we first employ an SM to perform coarse ranking over entities in the target KG, which constrains the LLM to identify the best match from top-k candidates and mitigates the hallucination problem. Unlike previous work [12,14], we fine-tune a 7–8B open-source generative LLM with a unified and concise prompt template, obtaining the alignment result in one go without multi-step interactions. During instruction data construction, the trained SM also serves as an annotator and helps generate enough fine-tuning data. (2) To fully incorporate KG structural information, we introduce a knowledge adapter module that injects SM-derived entity embeddings into the LLM during the fine-tuning process. These embeddings effectively capture local structural features, thus improving the graph reasoning ability of the LLM on KGs. (3) To combine the strengths of both SMs and LLMs, we further propose a collaborative inference strategy. We train a lightweight classifier based on similarity features of SM embeddings to categorize source entities into certain and uncertain cases. For certain entities, we directly adopt the SM’s predictions. For uncertain ones, we feed the source entity and its candidate set to the LLM for refined inference. This strategy significantly boosts overall inference speed without compromising accuracy.
Overall, the primary contributions of this work are as follows:
  • We propose HIEA, a novel and effective LLM-enhanced framework for EA. By fine-tuning a generative LLM with a unified and concise prompt, HIEA produces alignment results with a single query. It incorporates a knowledge adapter to inject KG embeddings into the LLM, thereby enhancing the LLM’s understanding of KG structure. We also perform data augmentation during instruction construction to obtain more high-quality tuning data.
  • We introduce a collaborative inference strategy for EA. By analyzing similarity features of entity embeddings, we train a lightweight classifier to distinguish certain and uncertain source entities. SM’s predictions are retained for certain entities, while uncertain entities are delegated to the LLM for further inference, resulting in a clear reduction in usage cost without sacrificing performance.
  • We conduct extensive experiments on both standard and highly heterogeneous temporal EA datasets, demonstrating that HIEA outperforms both embedding-based and LLM-enhanced EA methods. Comprehensive ablation studies further verify the effectiveness of each component.
The rest of this paper is structured as follows: Section 2 surveys existing works on entity alignment. In Section 3, formal definitions of the problems studied in this paper are provided. Section 4 elaborates on the proposed method, followed by a comprehensive experimental evaluation in Section 5. Section 6 summarizes this work, while Section 7 discusses the limitations of our approach and outlines potential directions for future research.

2. Related Work

2.1. Small Model-Based EA Methods

The majority of existing entity alignment methods leverage knowledge representation learning to obtain low-dimensional embeddings of knowledge graph elements, including entities and relations. Alignment is then determined by measuring the similarity between entity embeddings across different KGs. During training, embeddings of aligned entities are pulled closer, while those of non-aligned entities are pushed apart. In this paper, we collectively refer to these approaches as small model (SM)-based methods. According to the underlying embedding techniques, they can be broadly categorized into translation-based methods and GNN-based methods.
Translation-based methods typically adopt translational embedding models, such as TransE [22], to represent relational triples by interpreting relations as translation vectors from head entities to tail entities. MTransE [5] is one of the earliest attempts in this direction. It embeds each KG into an independent vector space using TransE and then learns a transformation function to align embeddings across KGs. BootEA [23] enhances this framework by introducing parameter sharing and a bootstrapping strategy to expand the initial alignment seeds iteratively. TransEdge [24] further improves translation-based modeling by adopting an edge-centric perspective, where relations are encoded in a contextualized manner conditioned on specific head–tail entity pairs.
GNN-based methods leverage graph neural networks to model KG structures by aggregating information from neighboring nodes. GCN-Align [7] applies graph convolutional networks [25] to capture structural features of entities and performs alignment according to embedding distances in a shared space. However, it largely overlooks the effects of diverse relation types as well as relation directions on entity alignment. RDGCN [26] addresses this limitation by explicitly incorporating relation-aware representations through iterative attention-based interactions between entity graphs and dual relation graphs. Dual-AMN [27] consists of a relation attention module for modeling intra-KG neighborhoods and a proxy matching attention module for capturing cross-KG interactions, together with a normalized hard-sample mining loss to improve negative sampling efficiency. HOLI-GNN [28] aims to capture high-order structural dependencies while mitigating over-smoothing via a local inflation mechanism that adaptively amplifies entity features.
Despite their impressive performance, the above approaches rely heavily on KG structural information and often struggle with long-tail entities that have sparse neighborhoods. To alleviate this issue, several studies incorporate additional side information about entities. PMF [10] is a multi-modal EA framework that achieves excellent alignment results by effectively integrating structural, visual, attribute, and relational features in a progressive manner. Simple-HHEA [20] is a straightforward yet effective method designed specifically for aligning highly heterogeneous KGs. It further introduces name and temporal embeddings alongside structural features, leading to excellent performance in more challenging alignment scenarios.
Nevertheless, these embedding-based methods exclusively exploit information already present within the KGs and fail to access the rich and dynamically evolving open-world knowledge. As a result, they are limited in capturing the full semantics of entities and relations, especially in complex and heterogeneous alignment settings.

2.2. LLM-Enhanced EA Methods

Large language models, pretrained on massive corpora, have enabled a new paradigm for entity alignment, commonly referred to as LLM-enhanced EA.
LLMEA [12] represents one of the earliest efforts in this direction, proposing to combine structural representations learned by small models with external knowledge provided by LLMs. Specifically, it first narrows the alignment search space using graph-based embedding similarity and LLM-generated virtual entities, and then formulates the final alignment decision as an iterative multiple-choice reasoning process performed by the LLM. Several subsequent studies follow and extend this general framework. Seg-Align [13] integrates embedding-based candidate selection with LLM reasoning through carefully designed zero-shot, multiple-choice prompts to handle hard-to-align entities. ChatEA [14] converts KG structures into forms that LLMs can understand and uses them to generate enriched entity descriptions. It adopts a two-stage, dialogue-based reasoning process, aligning entities by leveraging external knowledge and multi-step inference beyond pure embedding similarity. HLMEA [16] introduces an unsupervised entity alignment framework, in which LLMs act as annotators, performing repeated inference and majority voting to robustly select matched target entities from a reduced candidate set, enabling effective alignment without human supervision or labeled data. Similarly, EasyEA [17] is a fully LLM-driven approach aimed at eliminating the need for model training. It summarizes entity names, attributes, and relations into compact semantic representations, fuses multiple semantic views, and performs semantics-aware candidate selection. In addition, LLM4EA [15] employs LLMs as annotators to generate pseudo-labels that guide the training of traditional EA models. By actively selecting informative entities and refining noisy annotations through probabilistic reasoning, it achieves efficient alignment under limited annotation budgets.
Despite leveraging the strong reasoning capabilities of LLMs, existing approaches typically require multiple rounds of interaction with LLMs via diverse prompts to perform complex operations such as description generation, summarization, or majority voting, leading to substantial computational overhead. Moreover, they do not fully exploit the complementary strengths of small models and LLMs. To address these limitations, we propose HIEA, a hierarchical inference framework for entity alignment. By fine-tuning a generative LLM using a unified and concise prompt together with a knowledge adapter, HIEA requires only a single LLM invocation to obtain alignment results. Meanwhile, small models are leveraged not only for candidate generation but also for data augmentation and uncertain entity classification, which leads to more comprehensive collaboration between small models and LLMs. Consequently, HIEA outperforms existing LLM-enhanced methods while incurring significantly lower inference cost. We also conduct a concise comparison of small model-based and LLM-enhanced entity alignment methods, highlighting differences in learning paradigms, LLM involvement, structural signal injection, and inference strategies, which is shown in Table 1.

2.3. Adapting LLMs to Structured KG Tasks

Recent studies have increasingly investigated adapting LLMs to structured KG tasks by introducing explicit graph signals beyond pure text prompting. A representative direction is to inject structural embeddings into LLMs via lightweight adaptation modules, so that the model can exploit topology-aware evidence in the language representation space. For example, KoPA [21] maps KG structural embeddings into virtual prefix tokens for structure-aware knowledge graph completion, while KG-Adapter [29] inserts dedicated KG-aware adapter layers to integrate entity/relation information into decoder-only LLMs through parameter-efficient fine-tuning. More recently, structure-aware alignment and tuning frameworks have further emphasized bridging the representation gap between graph embeddings and natural language through lightweight adapters and unified structural instructions [30]. Another line of work enhances fixed LLMs through KG-based prompting or reranking pipelines, which can be effective but may rely on retrieving and verbalizing subgraphs, potentially leading to longer contexts [31,32].
In contrast to general KG reasoning, entity alignment is inherently entity-centric and comparison-based: the model must compare a source entity against a candidate set and select the best match. This makes entity-level injection particularly suitable, because compact, topology-aware entity representations can be consistently attached to each (source, candidate) pair as direct alignment evidence, rather than being indirectly conveyed through lengthy textual descriptions. Our knowledge adapter follows this principle by projecting EA-oriented structural embeddings into the LLM input space, facilitating efficient compare-and-select decisions for alignment.

3. Problem Definition

Definition 1 (Knowledge Graph).
A knowledge graph is formally defined as $KG = (E, R, T)$ in this paper, with $E$ and $R$ representing the sets of entities and relations. For a conventional KG, $T = \{(h, r, t)\} \subseteq E \times R \times E$ denotes the collection of relation triples, where $h$ and $t$ are the head and tail entities, respectively, and $r$ represents a directed relation between them. In particular, for a temporal KG, $T$ can be defined as $\{(h, r, t, \tau_b, \tau_e)\} \subseteq E \times R \times E \times Q \times Q$, where $Q$ is a set of timestamps, and $\tau_b$ and $\tau_e$ represent the begin time and end time of the fact.
We study entity alignment between a pair of knowledge graphs, referred to as the source and target KGs, and formally formulate the EA problem as follows:
Definition 2 (Entity Alignment).
Consider two heterogeneous KGs to be aligned, denoted as $KG_1 = (E_1, R_1, T_1)$ and $KG_2 = (E_2, R_2, T_2)$. A collection of pre-aligned entity pairs is given as training data, defined as $S = \{(e_i^1, e_i^2) \mid e_i^1 \in E_1, e_i^2 \in E_2, e_i^1 \equiv e_i^2\}_{i=1}^{|S|}$, where the symbol $\equiv$ signifies that $e_i^1$ and $e_i^2$ refer to the same real-world object. The objective of EA is to identify additional aligned entity pairs between $E_1$ and $E_2$ beyond those included in the training set $S$.
Furthermore, our goal is to leverage the powerful reasoning capabilities and background knowledge of LLMs to help solve the EA problem without human intervention. LLM-enhanced entity alignment is defined as follows:
Definition 3 (LLM-Enhanced Entity Alignment).
Given a source entity $e_s$ and a set of candidate entities inferred by a small model $M_{small}$, LLM-enhanced entity alignment aims to choose the best-matched target entity $e_t$ from the candidate set by querying an LLM $M_{large}$ with a prompt. Typically, the prompt is composed of instructions that specify the EA task, along with descriptive information about entity $e_s$.

4. Our Method

4.1. Framework Overview

Figure 2 depicts the proposed HIEA framework, which takes as input two KGs to be aligned and a set of pre-aligned entity pairs. The LLM $M_{large}$ leverages its rich open-world knowledge and strong reasoning capabilities to refine candidate entities coarsely filtered by the small model $M_{small}$ and output the most likely target entity. Conversely, the small model assists the LLM in effective instruction tuning and efficient inference by injecting pretrained entity embeddings and automatically annotating data. As a result, the hierarchical inference enabled by the collaboration between $M_{large}$ and $M_{small}$ not only improves EA accuracy but also significantly reduces the cost of querying the LLM. The framework consists of three main parts, which will be detailed in the following sections: Prompt Construction (Section 4.2), Instruction Data Classification and Augmentation (Section 4.3), and Knowledge Adaptation for Instruction Tuning (Section 4.4).

4.2. Prompt Construction

To align a source entity $e_s$, we construct a prompt $P(e_s)$ that contains three types of information:
$$P(e_s) = [D; N; C],$$
where $D$ denotes the EA task description, $N$ represents neighbor facts, and $C = [e_1, e_2, \ldots, e_k]$ is the candidate entity list. The symbol $[\,;\,]$ indicates text concatenation. An example prompt for the source entity “Chen Ning Yang” is shown in Figure 2.
The task description informs the LLM of the alignment objective. Instead of designing complex and diverse prompts to guide off-the-shelf LLMs through multiple related prerequisite tasks [14,17], we simply instruct $M_{large}$ to act as an “assistant for the task of entity alignment” and directly “select the most similar target entity to the source entity from the candidate entity list”. The alignment result can be obtained by querying the LLM only once. The model $M_{large}$ will be further fine-tuned to adapt to this prompt format.
Neighbor facts contain the relations and entities connected to the source entity, providing the LLM with contextual information from the KG to accurately identify the corresponding real-world object. For a conventional KG, a fact takes the form of a triple, e.g., (Chen-Ning Yang, birthPlace, Anhui). For temporal KGs, additional timestamps are included, e.g., (Chen-Ning Yang, birthPlace, Anhui, 1922-10, 1922-10). Due to input length constraints, including all neighbor facts in the prompt is impractical and inefficient. Considering the characteristics of the EA task, we adopt a seed neighbor priority sampling strategy. Specifically, tuples containing pre-aligned neighbor entities are selected first, as they provide stronger alignment cues. For example, in both queries with “Chen-Ning Yang” and “Chen Ning Yang” as source entities, the fact (almaMater, University of Chicago) is included, offering valuable guidance for LLM’s decision-making. Seed neighbors are prioritized, followed by other relevant facts until a predefined threshold ω is reached.
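The seed neighbor priority sampling described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sample_neighbor_facts`, the tuple layout of a fact, and the choice to keep non-seed facts in their original order are all assumptions, since the text does not specify how the remaining facts are ranked.

```python
from typing import List, Set, Tuple

# A fact is (head, relation, tail) or (head, relation, tail, begin, end).
Fact = Tuple[str, ...]


def sample_neighbor_facts(
    source: str,
    facts: List[Fact],
    seed_entities: Set[str],
    omega: int,
) -> List[Fact]:
    """Return at most `omega` facts about `source`, seed-neighbor facts first."""

    def neighbor_of(fact: Fact) -> str:
        head, _, tail = fact[:3]
        return tail if head == source else head

    # Keep only facts in which the source entity is the head or the tail.
    touching = [f for f in facts if source in (f[0], f[2])]
    # Facts whose neighbor is a pre-aligned (seed) entity come first.
    seed_first = [f for f in touching if neighbor_of(f) in seed_entities]
    # Assumption: remaining facts keep their original order.
    others = [f for f in touching if neighbor_of(f) not in seed_entities]
    return (seed_first + others)[:omega]


facts = [
    ("Chen-Ning Yang", "birthPlace", "Anhui"),
    ("Chen-Ning Yang", "almaMater", "University of Chicago"),
    ("Chen-Ning Yang", "relatedTo", "Entity X"),  # placeholder fact for illustration
]
picked = sample_neighbor_facts("Chen-Ning Yang", facts, {"University of Chicago"}, 2)
```

With `omega = 2` and "University of Chicago" as the only seed neighbor, the almaMater fact is selected first and the remaining slot goes to the next fact in the original order.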
The candidate entity list contains the names of top-k entities filtered by the traditional small model $M_{small}$. The order of candidates is preserved because it reflects similarity between entity embeddings. $M_{large}$ is instructed to select the most suitable target entity from this list, thereby constraining the space of possible generations. However, in practice, the LLM may still produce redundant or erroneous information, such as explanations for output. To address this, we introduce a constrained generation module: if the output of $M_{large}$ does not exactly match any candidate, we select the candidate with the smallest edit distance as the final result.
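To make the prompt layout and the constrained generation step concrete, here is a small sketch. The helper names (`build_prompt`, `edit_distance`, `constrain_output`) and the exact template wording are hypothetical; only the two quoted instruction phrases come from the text, and the actual template is the one shown in Figure 2. Levenshtein distance is assumed as the edit distance.

```python
from typing import Sequence, Tuple


def build_prompt(source: str, facts: Sequence[Tuple[str, ...]], candidates: Sequence[str]) -> str:
    """Concatenate task description D, neighbor facts N, and candidate list C."""
    description = (
        "You are an assistant for the task of entity alignment. "
        "Select the most similar target entity to the source entity "
        "from the candidate entity list."
    )
    neighbor_lines = ["(" + ", ".join(fact) + ")" for fact in facts]
    # Candidate order is preserved because it reflects embedding similarity.
    candidate_lines = [f"{i}. {name}" for i, name in enumerate(candidates, start=1)]
    return "\n".join(
        [description, f"Source entity: {source}", "Neighbor facts:", *neighbor_lines,
         "Candidate entities:", *candidate_lines]
    )


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def constrain_output(llm_output: str, candidates: Sequence[str]) -> str:
    """Return the output if it is a candidate, else the closest candidate."""
    text = llm_output.strip()
    if text in candidates:
        return text
    # min() returns the first minimum, so ties favor the higher-ranked candidate.
    return min(candidates, key=lambda c: edit_distance(text, c))
```

Because `min` keeps the first candidate on ties, an ambiguous output falls back to the candidate the small model ranked higher, which seems a reasonable default given that the list order encodes embedding similarity.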

4.3. Instruction Data Classification and Augmentation

4.3.1. Certainty-Aware Data Classification

The hubness phenomenon in vector spaces [33] causes the embeddings of many source entities to lie close to those of multiple candidate entities. This ambiguity often leads the small model to make incorrect predictions. In contrast, some source entities have a top-1 candidate that is easily distinguishable from the others, making further LLM refining unnecessary. These observations motivate us to design a classifier that categorizes samples into certain and uncertain. Source entities classified as certain retain the prediction of the small model, whereas uncertain ones are delivered to the LLM for further inference. This mechanism enhances the collaboration between the large and small models in entity alignment and improves inference efficiency.
However, the current training data suffers from imbalance: the number of positive samples (correctly aligned by $M_{small}$) is much larger than that of negative samples (incorrectly aligned). To obtain high-quality and balanced data for training the classifier, we design a simple yet effective sampling strategy, termed certainty confidence sampling. Specifically, given a source entity $e_i$ and its similarity features with the top-k candidate entities $s_i = (s_{i1}, s_{i2}, \ldots, s_{ik})$ (cosine similarity used in our implementation), we define its certainty confidence $c_i$ as
$$c_i = \frac{1}{2}\left(s_{i1} - s_{i2}\right) + \frac{1}{2}\max\left(0, \frac{\log k + \sum_{j=1}^{k} p_{ij}\log p_{ij}}{\log k}\right),$$
$$p_{ij} = \frac{\exp(s_{ij}/\tau)}{\sum_{z=1}^{k}\exp(s_{iz}/\tau)},$$
where $\tau$ is a temperature hyperparameter. The confidence score $c_i$ integrates two complementary aspects of the source entity’s certainty. The first term (gap between top-1 and top-2) measures the local decision margin and reflects the model’s discriminative strength in its top-ranked predictions. The second term is a normalized entropy-based certainty, capturing the global sharpness of the prediction distribution over the top-k candidates. By jointly exploiting both the local margin and the global distribution information, $c_i$ provides a more stable and reliable estimate of an entity’s certainty. We emphasize that the proposed certainty confidence is not intended as a Bayesian uncertainty estimator. Instead, it serves as a task-specific and deterministic confidence heuristic for identifying relatively certain versus uncertain entities in the entity alignment setting. While simple, it is closely related to established uncertainty measures, as it integrates a margin-based criterion and a normalized entropy-based metric into a unified score.
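The certainty confidence can be computed directly from the top-k similarity scores. The sketch below follows the two-term definition above (margin plus normalized entropy-based sharpness); the function name and the default temperature of 0.1 are assumptions for illustration, not values reported in the text.

```python
import math
from typing import Sequence


def certainty_confidence(sims: Sequence[float], tau: float = 0.1) -> float:
    """Certainty confidence of one source entity.

    `sims` holds similarities to the top-k candidates, sorted in descending
    order. Requires k >= 2, since the entropy term divides by log k.
    """
    k = len(sims)
    margin = sims[0] - sims[1]  # local decision margin (top-1 minus top-2)

    # Temperature softmax; subtracting the max only improves numerical stability.
    peak = max(sims)
    exps = [math.exp((s - peak) / tau) for s in sims]
    total = sum(exps)
    probs = [e / total for e in exps]

    # sum_j p_ij * log p_ij is the negative entropy of the distribution.
    neg_entropy = sum(p * math.log(p) for p in probs if p > 0.0)
    sharpness = max(0.0, (math.log(k) + neg_entropy) / math.log(k))
    return 0.5 * margin + 0.5 * sharpness


certain = certainty_confidence([0.9, 0.1, 0.1])    # clear top-1 winner
uncertain = certainty_confidence([0.5, 0.5, 0.5])  # indistinguishable candidates
```

A uniform similarity vector yields a score of (numerically) zero because both the margin and the sharpness term vanish, while a clearly separated top-1 candidate pushes both terms up.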
We treat all source entities incorrectly aligned by $M_{small}$ as negative samples. The correctly aligned ones are sorted by their certainty in descending order and truncated to maintain a positive-to-negative ratio of $\lambda$, ensuring both high recall and precision. Here, $\lambda$ is a hyperparameter. After training the classifier, we apply it to annotate the EA test data. Formally,
$$\mathrm{label}_j = f_\theta(s_j),$$
where $s_j$ is the similarity feature of entity $e_j$, and $\mathrm{label}_j \in \{\mathrm{certain}, \mathrm{uncertain}\}$ is the assigned label. The function $f_\theta$ denotes the trained binary classifier. We adopt a Random Forest model [34] due to its strong robustness derived from an ensemble of diversified decision trees. Naturally, other classification methods can also be used as alternatives.
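The sampling and routing logic of this subsection can be sketched as below. Everything here is illustrative: the function names, the integer truncation of the positive set, and the reading that correctly aligned samples are labeled certain are assumptions. The classifier is passed in as a plain callable (`is_certain`), which in practice could wrap a trained Random Forest; `ask_llm` stands in for a single query to the fine-tuned LLM.

```python
from typing import Callable, List, Sequence, Tuple

Features = Sequence[float]
Sample = Tuple[Features, int]  # label 1 = certain, 0 = uncertain


def build_classifier_data(
    features: List[Features],
    sm_correct: List[bool],
    confidences: List[float],
    lam: float,
) -> List[Sample]:
    """Certainty confidence sampling: keep every negative, truncate positives."""
    negatives = [(f, 0) for f, ok in zip(features, sm_correct) if not ok]
    positives = sorted(
        ((c, f) for f, ok, c in zip(features, sm_correct, confidences) if ok),
        key=lambda pair: pair[0],
        reverse=True,  # most certain first
    )
    # Assumption: the positive-to-negative ratio is enforced by truncation.
    keep = int(lam * len(negatives))
    return [(f, 1) for _, f in positives[:keep]] + negatives


def hierarchical_align(
    source: str,
    candidates: List[str],
    sims: Features,
    is_certain: Callable[[Features], bool],
    ask_llm: Callable[[str, List[str]], str],
) -> str:
    """Keep the small model's top-1 for certain entities, else query the LLM."""
    if is_certain(sims):
        return candidates[0]
    return ask_llm(source, candidates)
```

Passing the classifier and the LLM as callables keeps the routing step independent of any particular library, so a different classifier can be swapped in without touching the inference loop.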

4.3.2. Data Augmentation

Fine-tuning an LLM requires sufficient instruction data, while manually annotating aligned entity pairs is costly. For example, the ICEWS-WIKI [20] dataset contains only 1518 pre-aligned entity pairs, which may be inadequate to fully unleash the LLM’s potential for EA. To alleviate this limitation, we draw inspiration from the iterative learning paradigm [10,35] in traditional EA methods and leverage it for data augmentation. Specifically, we adopt a bidirectional iterative strategy [36] to identify entity pairs that are mutually closest in the embedding space learned by the small model. Only entity pairs that satisfy the mutual nearest-neighbor constraint are considered as candidate new seed alignments. To further control potential noise introduced by data augmentation, we rank the newly generated entity pairs in descending order according to their cosine similarity and retain only the top-$\beta$ most similar pairs. These filtered entity pairs are then treated as high-confidence weak supervision and used to construct additional instruction data following the prompt template described in Section 4.2.
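The mutual nearest-neighbor filter followed by top-$\beta$ truncation might look like the following. This is a brute-force sketch under stated assumptions: the function names are hypothetical, embeddings are plain Python lists keyed by entity name, both dictionaries are non-empty and contain only entities not yet in the seed set, and a single pass is shown rather than the full iterative procedure.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mutual_nearest_pairs(
    src: Dict[str, Sequence[float]],
    tgt: Dict[str, Sequence[float]],
    beta: int,
) -> List[Tuple[str, str, float]]:
    """Return up to `beta` mutually nearest (source, target, similarity) triples."""
    sim = {(s, t): cosine(u, v) for s, u in src.items() for t, v in tgt.items()}
    # Nearest target for each source entity, and nearest source for each target.
    best_tgt = {s: max(tgt, key=lambda t: sim[(s, t)]) for s in src}
    best_src = {t: max(src, key=lambda s: sim[(s, t)]) for t in tgt}
    # Keep a pair only if each entity is the other's nearest neighbor.
    pairs = [(s, t, sim[(s, t)]) for s, t in best_tgt.items() if best_src[t] == s]
    # Rank by cosine similarity and retain the top-beta pairs.
    pairs.sort(key=lambda p: p[2], reverse=True)
    return pairs[:beta]
```

The retained pairs would then be turned into instruction samples with the same prompt template used for the original seed alignments.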

4.4. Knowledge Adaptation for Instruction Tuning

To further enhance the LLM’s ability to capture structured alignment cues, we inject the structural embeddings learned by the small model $M_{small}$ into the LLM $M_{large}$. Specifically, we design a knowledge adapter that aligns entity embeddings from $M_{small}$ with the token embeddings of $M_{large}$. The adapter consists of a two-layer feed-forward transformation followed by a residual projection to preserve the information contained in the original KG embeddings. Given an entity embedding $h_{e_i}$, this process is formulated as
$$h'_{e_i} = W_r h_{e_i} + W_2\,\sigma(W_1 h_{e_i} + b_1) + b_2,$$
where $h'_{e_i}$ is the adapted representation produced by the adapter network, and $\sigma$ is the SwiGLU [37] activation function. $W_r \in \mathbb{R}^{d_2 \times d_0}$, $W_1 \in \mathbb{R}^{d_1 \times d_0}$, $W_2 \in \mathbb{R}^{d_2 \times d_1}$, $b_1 \in \mathbb{R}^{d_1}$, and $b_2 \in \mathbb{R}^{d_2}$ are trainable weights. $d_0$ is the dimension of the original KG embeddings, $d_2$ is the hidden size of $M_{large}$, and $d_1$ is an intermediate dimension.
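The adapter's forward pass can be traced with a dependency-free sketch. Two loud assumptions: the helper names are hypothetical, and the activation used here is plain SiLU applied element-wise as a stand-in, because a full SwiGLU involves an additional gating projection that the single-$W_1$ formula above does not spell out. A real implementation would use a tensor library and learned weights.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # row-major: one inner list per output dimension


def matvec(w: Matrix, x: Vector) -> Vector:
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def silu(x: float) -> float:
    # Stand-in activation (assumption): SiLU(x) = x * sigmoid(x).
    return x / (1.0 + math.exp(-x))


def adapt(h: Vector, w_r: Matrix, w_1: Matrix, b_1: Vector, w_2: Matrix, b_2: Vector) -> Vector:
    """Compute W_r h + W_2 * act(W_1 h + b_1) + b_2 for one entity embedding."""
    hidden = [silu(a + b) for a, b in zip(matvec(w_1, h), b_1)]       # size d_1
    feed_forward = [a + b for a, b in zip(matvec(w_2, hidden), b_2)]  # size d_2
    residual = matvec(w_r, h)                                         # size d_2
    return [r + f for r, f in zip(residual, feed_forward)]
```

The residual term `w_r @ h` is what lets the original KG embedding pass through to the LLM's hidden size even when the feed-forward branch contributes little.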
The adapted entity representations are then injected into $M_{large}$. As illustrated in Figure 2, we introduce two special placeholders, [SOURCE_ENTITY] and [TARGET_ENTITY], and place them immediately after corresponding entity names in the prompt. At the input layer of $M_{large}$, these placeholders are replaced with the adapted representations produced by the knowledge adapter.
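Placeholder replacement at the input layer amounts to swapping rows of the input embedding sequence. The sketch below uses plain lists to show the idea; the function name is hypothetical, and it assumes the adapted vectors are supplied in the order in which the placeholders appear in the prompt (source first, then one per candidate).

```python
from typing import List, Sequence

PLACEHOLDERS = {"[SOURCE_ENTITY]", "[TARGET_ENTITY]"}


def inject_entity_embeddings(
    tokens: Sequence[str],
    token_embeddings: Sequence[List[float]],
    entity_vectors: Sequence[List[float]],
) -> List[List[float]]:
    """Replace the embedding at each placeholder position with an adapted vector."""
    vectors = iter(entity_vectors)
    injected = []
    for token, embedding in zip(tokens, token_embeddings):
        if token in PLACEHOLDERS:
            # Raises StopIteration if there are fewer vectors than placeholders.
            injected.append(next(vectors))
        else:
            injected.append(embedding)
    return injected
```

All other token embeddings pass through unchanged, so the textual part of the prompt is processed exactly as it would be without the adapter.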
To fine-tune the generative model to produce the name of the most similar target entity $e_t$, we optimize the standard autoregressive language modeling objective. Given a prompt $P(e_s)$, the model predicts each token of the target name conditioned on the previously generated tokens. The training loss can be formulated as
$$\mathcal{L} = -\sum_{i=1}^{T} \log p_\theta\left(t_i \mid t_{1:i-1}, P(e_s)\right),$$
where $T$ is the length of the target entity name, $t_i$ denotes the $i$-th token, and $p_\theta(t_i \mid t_{1:i-1}, P(e_s))$ is the model’s conditional probability of generating token $t_i$ given the prompt and prior context.
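As a small worked illustration of this objective, the loss for one training example is the negative sum of the log-probabilities the model assigns to each token of the target entity name. The function name and the probability values below are made up for illustration; in practice the probabilities come from the LLM's output distribution, and prompt tokens are excluded from the sum.

```python
import math
from typing import Sequence


def target_name_nll(token_probs: Sequence[float]) -> float:
    """Negative log-likelihood of the target name.

    `token_probs[i]` is the model's probability of the i-th target token
    given the prompt and the previous target tokens.
    """
    return -sum(math.log(p) for p in token_probs)


# Hypothetical per-token probabilities for a three-token entity name.
loss = target_name_nll([0.9, 0.8, 0.95])
```

Because the log-probabilities add, the loss equals the negative log of the product of the per-token probabilities, so it reaches zero only when every target token is predicted with probability one.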

4.5. Discussion on Structural Embedding Injection

In our framework, structural entity embeddings are injected through dedicated placeholder tokens that are explicitly separated from natural language tokens. These placeholders are introduced as special tokens with their own embeddings, and they do not overlap with or replace any existing tokens in the LLM vocabulary. Importantly, the injected entity embeddings are used as auxiliary structural signals rather than direct substitutes for textual representations. During instruction tuning, the LLM learns to interpret these placeholder tokens as indicators of structured knowledge, allowing it to effectively distinguish between textual semantics and injected structural information. This design prevents interference with the original token distribution learned during pre-training while enabling controlled integration of external entity representations.

5. Experiments

5.1. Experimental Settings

  • Datasets. We evaluate our approach on two widely used real-world entity alignment datasets. DBP15K [38] includes three conventional subsets from the multilingual version of DBpedia: DBP$_{\text{ZH-EN}}$ (Chinese–English), DBP$_{\text{JA-EN}}$ (Japanese–English), and DBP$_{\text{FR-EN}}$ (French–English). HHKG [20] consists of two subsets, ICEWS-WIKI and ICEWS-YAGO, which are sampled from the Integrated Crisis Early Warning System (ICEWS), Wikidata, and YAGO. These two subsets are highly heterogeneous in terms of scale, structure, and entity-overlapping ratios. All five subsets contain a number of pre-aligned entity pairs that serve as gold standards. Following prior work [10,14,20], we use 30% of these pairs for training and reserve the remainder for testing. Table 2 presents detailed dataset statistics, where Ent., Rel., and Tri. denote entities, relations, and triples, respectively. In addition, we report the structural similarity defined in [20], which measures the average similarity between aligned neighbors of aligned pairs and thus reflects the neighborhood-level similarity between KGs. As shown in Table 2, HHKG exhibits markedly lower structural similarity than DBP15K, indicating that it is more challenging and closer to realistic alignment scenarios.
  • Baselines. To comprehensively evaluate the performance of the proposed HIEA, we compare it against a broad range of existing EA methods, covering both well-established techniques and recent advances. The selected baselines fall into two groups: (1) small model-based methods, including translation-based approaches (MTransE [5], BootEA [23], and TransEdge [24]), GNN-based approaches (GCN-Align [7], RDGCN [26], Dual-AMN [27], HOLI-GNN [28], PMF [10], and Simple-HHEA [20]), and several time-aware models designed for temporal KG alignment (TEA-GNN [39], TREA [40], and STEA [41]); and (2) LLM-enhanced methods, including LLMEA [12], Seg-Align [13], ChatEA [14], and HLMEA [16].
  • Evaluation metrics. We evaluate EA performance using two mainstream metrics: Hits@k and mean reciprocal rank (MRR). Larger values of Hits@k and MRR correspond to superior performance. In our framework, the fine-tuned LLM selects the most similar entity from the candidate list as its final answer. To ensure comparability with prior work, we move the selected entity to the top of the ranking while keeping the order of the remaining candidates unchanged, and then compute Hits@k and MRR on the re-ranked list. For each aligned pair, we treat either entity as the source entity in turn, and report the average performance over both alignment directions.
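
The re-ranking protocol used for evaluation can be sketched as follows. This is a minimal illustration rather than the authors' evaluation code: the function names are hypothetical, and it assumes that a gold entity missing from the candidate list contributes zero to both metrics.

```python
from typing import List, Sequence


def rerank(candidates: Sequence[str], selected: str) -> List[str]:
    """Move the LLM-selected entity to the top; keep the rest in order."""
    if selected not in candidates:
        return list(candidates)
    return [selected] + [c for c in candidates if c != selected]


def hits_at_k(rankings: Sequence[Sequence[str]], gold: Sequence[str], k: int) -> float:
    """Fraction of source entities whose gold target is in the top k."""
    hits = sum(1 for ranking, g in zip(rankings, gold) if g in ranking[:k])
    return hits / len(gold)


def mean_reciprocal_rank(rankings: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Average of 1 / rank of the gold target (0 if it is not ranked)."""
    total = 0.0
    for ranking, g in zip(rankings, gold):
        if g in ranking:
            total += 1.0 / (list(ranking).index(g) + 1)
    return total / len(gold)
```

Under the bidirectional protocol described above, these functions would be evaluated once per alignment direction and the two scores averaged.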

5.2. Implementation Details and Deployment Guidance

  • Choice of base small models. We adopt two state-of-the-art embedding-based models, PMF [10] and Simple-HHEA [20], as the small models pre-trained on DBP15K and HHKG, respectively. In our experiment, we use a variant of PMF without visual features, as LLMs cannot process image information. In addition, following the settings of PMF and Simple-HHEA, we employ machine translation to transform non-English entity and relation names into English.
  • Configuration for large language models. We employ LLaMA3-8B-Instruct for instruction tuning, as it is open-source and widely used. Additional comparisons with other representative LLMs (e.g., LLaMA2-7B-Chat) are provided in Section 5.6. We adopt low-rank adaptation (LoRA) [42] for parameter-efficient fine-tuning, where the LoRA modules are configured with rank $r = 128$, scaling factor $\alpha = 32$, and a dropout rate of 0.1. These modules are inserted into the query and value projection layers of the Transformer's self-attention blocks. To further speed up training, we follow the QLoRA strategy [43], which compresses model weights to 4-bit precision using double quantization with the 4-bit NormalFloat format. During inference, we adopt a greedy decoding strategy, with a maximum of 64 tokens generated. The LLM is instructed to output only the selected target entity name from the candidate list. If the generated string does not exactly match any candidate name, we select the candidate with the minimum Levenshtein edit distance after basic normalization.
  • Instruction-tuning data construction. We build instruction instances from (i) the gold seed alignments in the training split and (ii) additional weakly supervised alignments mined by the small model through embedding similarity. For each aligned entity pair $(e_s, e_t)$, we create two instructions by swapping the alignment direction (i.e., $e_s \rightarrow e_t$ and $e_t \rightarrow e_s$), consistent with the bidirectional evaluation protocol. Each instruction follows the unified prompt template in Section 4.2, where the candidate list $C$ is formed by retrieving the top-$k$ target entities ranked by the small model, and neighbor facts are sampled using the seed-neighbor priority strategy until reaching $\omega$ facts.
  • Data augmentation and filtering. Starting from the pre-trained small model, we perform one round of bidirectional mutual-nearest mining in the learned embedding space. Candidate new seed pairs are required to satisfy the mutual nearest constraint. To control noise, we rank the mined pairs by cosine similarity and keep only the top-$\beta$ pairs as high-confidence weak supervision. $\beta$ is dataset-dependent: we use $\beta = 500$ for each DBP15K subset, $\beta = 506$ for ICEWS-WIKI, and $\beta = 1882$ for ICEWS-YAGO. These new pairs are then converted into additional instructions using the same prompt template. As a result, the final instruction set contains 10,000 instances for each DBP15K subset, 4048 instances for ICEWS-WIKI, and 15,060 instances for ICEWS-YAGO.
  • Hyperparameter settings. We determine several important hyperparameters via grid search. Specifically, the search space includes the number of candidate entities $k \in \{10, 20, 30, 40, 50\}$, the positive-to-negative sample ratio $\lambda \in \{3, 4, 5, 6, 7, 8, 9\}$ for training the classifier, and the number of neighbor facts $\omega \in \{5, 10, 20\}$. Parameter sensitivity analyses are presented in Section 5.5. To balance efficiency and performance, the final hyperparameter settings are $k = 30$, $\lambda = 5$ for HHKG and $\lambda = 9$ for DBP15K, and $\omega = 10$.
  • Experimental environment. All experiments are conducted on a server running Ubuntu 22.04, equipped with an Intel Xeon Platinum 8470Q CPU and an NVIDIA A40 GPU (48 GB). To ensure result stability, the code is independently executed five times, and the average of the outcomes is reported.
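
The mutual-nearest mining and top-$\beta$ filtering from the data augmentation step can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, embeddings are plain lists of floats, and entities already covered by seed alignments are assumed to have been removed from both inputs beforehand.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mine_mutual_nearest(
    source: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
    beta: int,
) -> List[Tuple[int, int, float]]:
    """Return up to `beta` (source_idx, target_idx, similarity) pairs that
    are mutual nearest neighbours, most similar first."""
    sims = [[cosine(s, t) for t in target] for s in source]
    # Nearest target for each source entity.
    best_target = [row.index(max(row)) for row in sims]
    # Nearest source for each target entity.
    best_source = [
        max(range(len(source)), key=lambda i, j=j: sims[i][j])
        for j in range(len(target))
    ]
    pairs = [
        (i, j, sims[i][j])
        for i, j in enumerate(best_target)
        if best_source[j] == i
    ]
    pairs.sort(key=lambda pair: pair[2], reverse=True)
    return pairs[:beta]
```

The reported instruction counts are consistent with this procedure. DBP15K provides 15,000 aligned pairs per subset, so 30% gives 4500 training seeds; adding $\beta = 500$ mined pairs yields 5000 pairs, and two directions per pair give 10,000 instances.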

5.3. Main Results

The experimental results on the DBP15K and HHKG datasets are shown in Table 3 and Table 4, respectively. The results in each table are grouped into three parts, corresponding to SM-based methods, LLM-enhanced methods, and our proposed approach. Most results are taken from the original publications, while the remaining ones are reported by Jiang et al. [14]. We report mean ± standard deviation for HIEA over five runs. In addition, we reproduce a variant of PMF without the visual modality using its official implementation.
Among SM-based methods, PMF and Simple-HHEA achieve the strongest performance, as they effectively integrate multiple types of entity features. Nevertheless, HIEA consistently outperforms these methods by a clear margin. This improvement is expected, since HIEA is built upon these strong small models and leverages the powerful reasoning capability of LLMs to further re-rank their coarse-grained predictions. As a result, HIEA improves Hits@1 over PMF by 2.9% on DBP$_{\text{ZH-EN}}$, and over Simple-HHEA by a substantial 21.6% on ICEWS-WIKI.
Compared with other LLM-enhanced methods, HIEA also demonstrates clear advantages. For example, on DBP$_{\text{FR-EN}}$ and ICEWS-WIKI, HIEA outperforms ChatEA in terms of Hits@1 by 0.8% and 5.6%, respectively. This performance gain can be attributed to our instruction-tuning strategy, which strengthens the LLM's understanding of KG structural information and better unleashes its potential for the EA task, beyond what can be achieved through purely in-context learning. Moreover, further analyses in Section 5.6 demonstrate that our method achieves these improvements with much lower inference cost.
It is also worth noting that HIEA yields more pronounced improvements on HHKG than on DBP15K. DBP15K is relatively easier to align, with higher schema consistency and structural similarity. By encoding KG topology, traditional EA methods already achieve strong performance, leaving limited room for further improvement. In contrast, HHKG is a highly heterogeneous benchmark that better reflects real-world application scenarios. The performance of SM-based methods deteriorates sharply on this dataset due to its high heterogeneity. At the same time, LLMs can alleviate this problem by leveraging rich and relevant external knowledge, thereby significantly improving alignment accuracy.

5.4. Ablation Study

An ablation study is carried out to examine the contribution of each component of the proposed framework using the following variants. w/o neighbors removes neighbor facts from the prompt. w/o augmentation omits the data augmentation step and constructs instruction data solely from the initial seed alignments. w/o adaption removes the knowledge adapter during LLM fine-tuning and simultaneously eliminates the special placeholders in the prompt. We conduct experiments on two representative subsets, DBP$_{\text{ZH-EN}}$ and ICEWS-WIKI. The results are reported in Table 5, from which several observations can be drawn.
Removing neighbor facts leads to a performance degradation, with a more pronounced impact on DBP$_{\text{ZH-EN}}$ than on ICEWS-WIKI. For instance, compared with the full HIEA framework, w/o neighbors reduces Hits@1 by 1.3% on DBP$_{\text{ZH-EN}}$, whereas the drop is only 0.4% on ICEWS-WIKI. This is because DBP$_{\text{ZH-EN}}$ relies heavily on its relatively consistent neighborhood structure for effective alignment, and excluding neighbor facts weakens such structural alignment cues. In contrast, data augmentation has a larger impact on ICEWS-WIKI than on DBP$_{\text{ZH-EN}}$. Specifically, removing this component yields a 1.5% decrease in Hits@1 on ICEWS-WIKI, compared to a 0.8% drop on DBP$_{\text{ZH-EN}}$. This difference can be attributed to the limited number of seed alignments in ICEWS-WIKI, which contains only 1518 pre-aligned entity pairs, whereas DBP$_{\text{ZH-EN}}$ provides a sufficiently large seed set. The proposed augmentation strategy, therefore, supplies additional instruction data that is crucial for effective LLM fine-tuning on ICEWS-WIKI. Finally, discarding the knowledge adaptation module results in consistent performance drops across both datasets, with Hits@1 decreasing by approximately 0.7–1.0%. This demonstrates the robustness of the proposed knowledge adapter and its effectiveness in different alignment scenarios.

5.5. Parameter Sensitivity Study

  • Effect of the candidate set size. In our framework, the LLM selects the most suitable target entity from a candidate set offered by the small model. To evaluate the reliability of the coarse candidate generation stage, we analyze the recall of the correct target entity within the top-$k$ candidate sets produced by the small model. Table 6 reports the recall results on DBP$_{\text{ZH-EN}}$ and ICEWS-WIKI with $k$ ranging from 10 to 50. The results show that the candidate recall remains consistently high across both datasets. On DBP$_{\text{ZH-EN}}$, the recall exceeds 99% even when $k = 10$, while on the more heterogeneous ICEWS-WIKI dataset, the recall steadily increases from 88.5% to 93.7% as $k$ grows. These results indicate that the small model provides high-quality candidate sets in practice, ensuring that the correct target entity is included with high probability and that the LLM-based refinement is not overly constrained by the initial coarse ranking.
In addition, we investigate the impact of the number of candidate entities (i.e., $k$) on both alignment performance and inference efficiency on DBP$_{\text{ZH-EN}}$ and ICEWS-WIKI. The results are shown in Figure 3. On the one hand, the inference time on both datasets increases approximately linearly with $k$, which is expected since a larger candidate set leads to a longer prompt. On the other hand, $k$ exhibits different effects on Hits@1 across the two datasets. On DBP$_{\text{ZH-EN}}$, Hits@1 first improves and then declines as $k$ increases, reaching its peak at $k = 30$. In contrast, on ICEWS-WIKI, Hits@1 increases rapidly with larger $k$ and then gradually converges. This discrepancy stems from the fact that the small model performs better on DBP$_{\text{ZH-EN}}$ than on ICEWS-WIKI. For DBP$_{\text{ZH-EN}}$, a relatively small candidate set is often sufficient to include the correct target entity, and further enlarging the set may introduce irrelevant candidates that distract the LLM. Conversely, ICEWS-WIKI typically requires a larger candidate set to ensure that the correct target entity is covered in the prompt. To balance alignment performance and inference efficiency, we set $k = 30$ for all datasets in our experiments.
  • Effect of the positive–negative sample ratio. When training the source entity classifier, the data suffers from a class imbalance issue. We therefore rank positive samples according to the certainty confidence and truncate them to enforce a positive-to-negative ratio of $\lambda$. In this experiment, we analyze the impact of $\lambda$ on classification performance. As shown in Figure 4, we report the precision and recall on the test set for classifiers trained with different values of $\lambda$. As $\lambda$ increases, classification precision gradually decreases, while recall increases on both datasets. In addition, ICEWS-WIKI is more sensitive to changes in $\lambda$ than DBP$_{\text{ZH-EN}}$, exhibiting a wider variation in precision and recall. Precision is critical for alignment accuracy, as high precision ensures that uncertain source entities are not misclassified as certain. In contrast, recall mainly affects inference efficiency: high recall reduces the number of source entities requiring further refinement by the LLM. Based on the results in Figure 4, we set $\lambda = 9$ for DBP15K and $\lambda = 5$ for HHKG, achieving both high recall and precision.
We note that the performance variations across datasets primarily stem from differences in heterogeneity and ambiguity, rather than from excessive sensitivity to hyperparameter choices. In practice, stable performance can be achieved within a broad range of $k$ and $\lambda$ on different datasets.
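
The truncation used to enforce the positive-to-negative ratio when training the certainty classifier can be illustrated with a short sketch. It assumes that the positives with the highest certainty confidence are the ones retained, which the text implies but does not state explicitly; the function name and the (entity, confidence) tuple layout are hypothetical.

```python
from typing import List, Tuple


def truncate_positives(
    positives: List[Tuple[str, float]],
    num_negatives: int,
    ratio: int,
) -> List[Tuple[str, float]]:
    """Keep at most ratio * num_negatives positives, most confident first.

    Each positive is an (entity, certainty_confidence) pair.
    """
    limit = ratio * num_negatives
    ranked = sorted(positives, key=lambda item: item[1], reverse=True)
    return ranked[:limit]
```

With a ratio of 9 and 100 negative samples, for example, at most 900 positive samples would be kept for classifier training.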

5.6. Further Analysis

  • Comparison across different LLM backbones. In the main experiments, we instruction-tune LLaMA3-8B-Instruct for entity alignment. To further investigate the impact of different LLMs and learning paradigms, we adopt LLaMA2-7B-Chat and LLaMA3-8B-Instruct as backbones and evaluate them under two settings: in-context learning (ICL) and instruction fine-tuning (FT). The results are reported in Table 7. Across both backbones, consistent performance gains are observed for instruction-tuned models over their non-fine-tuned counterparts. This improvement can be attributed to instruction tuning with the proposed knowledge adapter, which enables LLMs to better incorporate KG embeddings and understand the essence of the EA task. In most cases, LLaMA3-8B-Instruct achieves superior performance compared to LLaMA2-7B-Chat, benefiting from its stronger reasoning capability. Interestingly, HIEA with LLaMA3 under the ICL setting performs slightly worse than its LLaMA2 counterpart on ICEWS-WIKI. A possible explanation is that LLaMA3-8B-Instruct, as a dialogue-oriented model, tends to generate more verbose outputs (e.g., explanations or labels), which may interfere with precise entity selection in the absence of fine-tuning. Overall, these results demonstrate the robustness of HIEA, which maintains excellent alignment performance across different LLM backbones and learning paradigms.
  • Analysis of noise in data augmentation. The proposed data augmentation strategy can automatically generate additional entity pairs for instruction tuning, which may raise concerns about potential noise. To quantitatively assess the quality of the augmented data, we evaluate the correctness of the generated entity pairs against the gold-standard alignments, where accuracy is defined as the proportion of correctly matched pairs. Table 8 reports the accuracy of generated entity pairs on five datasets. The results show consistently high accuracy across all datasets, with an average accuracy of 91% on DBP15K and 100% accuracy on both ICEWS-WIKI and ICEWS-YAGO. These findings indicate that the proposed augmentation strategy introduces only limited noise. In particular, the perfect accuracy on the two highly heterogeneous temporal datasets highlights the effectiveness of the bidirectional nearest constraint in challenging alignment scenarios. Despite minor noise on some DBP15K subsets, we observe consistent improvements in downstream EA performance, suggesting that the instruction-tuned LLM is robust to moderate label noise.
  • Effect of data classification. To verify the effectiveness of the data classification strategy, we compare the full HIEA method with a variant that removes this component in terms of alignment performance, the proportion of uncertain entities, and inference time. The results are reported in Table 9. By annotating test instances with the classifier trained in Section 4.3, the number of uncertain source entities that require further refinement by the LLM is substantially reduced, leading to significant gains in inference efficiency. For example, on DBP$_{\text{ZH-EN}}$, the proportion of uncertain source entities is decreased to 45.92%, resulting in a reduction of inference time by more than half. Importantly, the data classification strategy does not degrade alignment performance, thanks to the high precision of the classifier (see Section 5.5). Notably, on DBP$_{\text{ZH-EN}}$, HIEA even slightly outperforms the variant without classification. Further analysis reveals that some entities are correctly aligned by the small model but incorrectly predicted by the LLM. By labeling such cases as certain, HIEA bypasses unnecessary LLM inference and directly adopts the small model predictions, thereby avoiding potential errors introduced by the LLM.
  • Effect of classifier choice. The certainty classifier plays an auxiliary role in distinguishing relatively certain and uncertain entities for collaborative inference. To evaluate the impact of classifier choice, we compare Random Forest with two alternative models commonly used for continuous features, namely Logistic Regression and a two-layer Multi-Layer Perceptron (MLP). All classifiers are trained on the same data, and precision, recall, and F1-score are reported for the positive (certain) class. The results on DBP$_{\text{ZH-EN}}$ and ICEWS-WIKI are summarized in Table 10. Logistic Regression consistently exhibits very high precision but substantially lower recall on both datasets, indicating an overly conservative decision boundary. The MLP achieves the best F1-score on DBP$_{\text{ZH-EN}}$, while Random Forest performs best on ICEWS-WIKI, demonstrating stronger recall and overall balance in a more heterogeneous setting. These results suggest that the performance differences among classifiers are moderate, and our framework is not highly sensitive to the specific classifier choice. We adopt Random Forest in our framework due to its robust and stable performance across datasets.
  • Efficiency analysis. Beyond strong alignment performance, HIEA features remarkably low inference overhead. To demonstrate this, we compare the efficiency of different EA methods and LLM backbones. Specifically, we calculate the average number of tokens and the average inference time required to align a single source entity. The results are summarized in Table 11. HIEA incurs substantially lower inference costs than ChatEA. For instance, on ICEWS-WIKI, the average token consumption is reduced from 9803 to 541, while the inference time decreases from 63.4 s to 0.25 s per entity. Notably, even when the fine-tuning overhead is included, HIEA remains significantly more efficient than ChatEA, reducing the average inference time by over two orders of magnitude. This mainly stems from two design choices in HIEA. First, we adopt a unified prompt that instructs the LLM to generate alignment results with a single query, eliminating the need for multi-round interactions. Second, the data classification strategy filters out source entities classified as certain, which do not require LLM refinement. Moreover, HIEA exhibits consistently high efficiency with both LLaMA2-7B and LLaMA3-8B as backbones, indicating that its efficient inference property is not tied to a specific LLM.
  • Cross-dataset generalization analysis. To further evaluate the cross-dataset generalization capability of HIEA, we perform a transfer experiment in which the LoRA parameters fine-tuned on one dataset are directly reused for inference on another related dataset, without additional fine-tuning. Specifically, the LoRA parameters learned on DBP$_{\text{ZH-EN}}$ are applied to DBP$_{\text{JA-EN}}$ and DBP$_{\text{FR-EN}}$, while those trained on ICEWS-WIKI are directly transferred to ICEWS-YAGO. This setting allows us to assess whether HIEA learns dataset-agnostic alignment patterns or relies heavily on dataset-specific supervision. The results in Table 12 show that on DBP$_{\text{JA-EN}}$ and DBP$_{\text{FR-EN}}$, the transferred model achieves performance nearly identical to the Original setting. On ICEWS-YAGO, the transferred variant shows a modest 1.4% drop in Hits@1 but still outperforms all baselines. These results not only highlight the robustness of the proposed framework under cross-dataset transfer but also shed light on the feasibility of developing foundation models for entity alignment.
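
The hierarchical inference examined in the data classification and efficiency analyses can be summarized as a routing step: source entities classified as certain take the small model's top-ranked candidate directly, and only uncertain ones are passed to the LLM. The sketch below is a simplified illustration under that reading. The function name and the two callables (`is_certain` for the certainty classifier, `llm_select` for the fine-tuned LLM) are hypothetical stand-ins rather than the authors' interfaces.

```python
from typing import Callable, Dict, List, Sequence


def hierarchical_align(
    source_entities: Sequence[str],
    candidates: Dict[str, List[str]],
    is_certain: Callable[[str], bool],
    llm_select: Callable[[str, List[str]], str],
) -> Dict[str, str]:
    """Align each source entity, calling the LLM only for uncertain ones.

    `candidates[e]` is the small model's ranked top-k list for entity `e`.
    """
    results: Dict[str, str] = {}
    for entity in source_entities:
        ranked = candidates[entity]
        if is_certain(entity):
            # Certain: trust the small model's top-1 prediction.
            results[entity] = ranked[0]
        else:
            # Uncertain: a single LLM query picks from the candidate list.
            results[entity] = llm_select(entity, ranked)
    return results
```

Since each uncertain entity costs exactly one LLM query, the number of LLM calls scales with the proportion of uncertain entities, which is in line with the inference time being more than halved when that proportion falls to 45.92%.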

6. Conclusions

In this paper, we propose HIEA, a novel hierarchical inference framework for entity alignment that effectively integrates embedding-based small models with large language models. Unlike current LLM-enhanced EA approaches that rely on in-context learning with multi-round interactions and complex prompt engineering, HIEA adopts an instruction-tuning paradigm with a unified and concise prompt, allowing the LLM to produce alignment results with a single query. To narrow the semantic gap between structured KG representations and textual prompts, we introduce a knowledge adapter that injects KG embeddings into the LLM, thereby enhancing its structural understanding ability. Moreover, small models are leveraged not only for candidate generation but also for data augmentation and certainty-aware entity classification, leading to complementary collaboration with LLMs. Extensive experiments on both standard and highly heterogeneous datasets show that HIEA consistently surpasses existing embedding-based and LLM-enhanced baselines, yielding up to 1.9% and 5.6% absolute Hits@1 gains on DBP15K and HHKG, respectively. Compared to other LLM-enhanced methods, HIEA also proves to be more efficient, aligning a pair of entities within 0.3 s. Further analyses verify the effectiveness of individual components, as well as robustness across different LLM backbones and learning paradigms.

7. Limitations and Future Work

Despite the strong empirical results, the proposed HIEA framework has several limitations that warrant further investigation. We discuss these limitations and outline promising directions for future work below.
  • (1) Limitations of latent knowledge in LLMs. HIEA leverages the latent knowledge and reasoning capability of large language models to refine entity alignments. However, such latent knowledge may be incomplete or outdated in certain scenarios, particularly in temporal domains with rapidly evolving facts, newly emerged entities that are absent from the pre-training corpus, or knowledge graphs containing noisy or conflicting information. In these cases, the LLM may produce less reliable reasoning outcomes. A potential direction for future work is to incorporate external or up-to-date knowledge sources, such as temporal knowledge graph snapshots or retrieval-augmented generation, to mitigate the limitations of static pre-trained knowledge.
  • (2) Dependency on coarse candidate generation. The proposed framework relies on a small model to generate an initial coarse ranking of candidate entities, and the LLM is constrained to select the final alignment from this candidate set. While our empirical analysis shows that the candidate recall remains high in practice, the framework may fail to recover correct alignments if the true target entity is absent from the candidate list. Future work could explore more robust candidate generation strategies, such as hybrid retrieval methods, adaptive candidate expansion, or uncertainty-aware re-ranking, to further improve recall without sacrificing efficiency.
  • (3) Constraints in multi-modal alignment scenarios. The current implementation of HIEA operates under a text-only LLM setting and does not explicitly exploit visual information, which can be crucial for disambiguation in some multi-modal entity alignment scenarios. As a result, HIEA may lag behind fully multi-modal approaches when visual cues provide decisive signals. Nevertheless, the framework is inherently modular and can be naturally extended by integrating multi-modal small models for candidate generation or adopting multi-modal LLMs (e.g., GPT-4V or LLaVA [44]) for joint reasoning. A systematic exploration of such multi-modal extensions is left for future work.

Author Contributions

Conceptualization, X.S. and B.L.; methodology, X.S.; software, X.S.; validation, X.S., B.L. and Z.H.; formal analysis, X.S. and B.L.; investigation, X.S. and Z.H.; resources, X.S. and B.L.; data curation, Z.H.; writing—original draft preparation, X.S.; writing—review and editing, B.L.; visualization, X.S.; supervision, B.L.; project administration, B.L.; funding acquisition, B.L. and X.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China under Grant No. 61972335 and the Postgraduate Research and Practice Innovation Program of Jiangsu Province under Grant No. KYCX24_3738.

Data Availability Statement

The datasets used in this study are publicly available. DBP15K can be obtained from https://github.com/nju-websoft/JAPE (accessed on 25 December 2025), and HHKG is available at https://github.com/DataArcTech/Simple-HHEA (accessed on 27 December 2025). Further details on the data and experimental setup are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
EA: Entity Alignment
KG: Knowledge Graph
HIEA: Hierarchical Inference for Entity Alignment
LLM: Large Language Model
SM: Small Model
GNN: Graph Neural Network
ICL: In-context Learning
FT: Fine-tuning
LoRA: Low-Rank Adaptation
ICEWS: Integrated Crisis Early Warning System

References

  1. Xiong, G.; Bao, J.; Zhao, W. Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Bangkok, Thailand, 11–16 August 2024; Ku, L.W., Martins, A., Srikumar, V., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2024; pp. 10561–10582.
  2. Hu, Z.; Li, Z.; Jiao, Z.; Nakagawa, S.; Deng, J.; Cai, S.; Zhou, T.; Ren, F. Bridging the user-side knowledge gap in knowledge-aware recommendations with large language models. Proc. AAAI Conf. Artif. Intell. 2025, 39, 11799–11807.
  3. Xu, T.; Li, B.; Chen, L.; Yang, C.; Gu, Y.; Gu, X. EHR coding with hybrid attention and features propagation on disease knowledge graph. Artif. Intell. Med. 2024, 154, 102916.
  4. Cao, J.; Fang, J.; Meng, Z.; Liang, S. Knowledge Graph Embedding: A Survey from the Perspective of Representation Spaces. ACM Comput. Surv. 2024, 56, 1–42.
  5. Chen, M.; Tian, Y.; Yang, M.; Zaniolo, C. Multilingual Knowledge Graph Embeddings for Cross-lingual Knowledge Alignment. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, 19–25 August 2017; Sierra, C., Ed.; IJCAI: Palo Alto, CA, USA, 2017; pp. 1511–1517.
  6. Jiang, T.; Bu, C.; Zhu, Y.; Wu, X. Combining embedding-based and symbol-based methods for entity alignment. Pattern Recognit. 2022, 124, 108433.
  7. Wang, Z.; Lv, Q.; Lan, X.; Zhang, Y. Cross-lingual Knowledge Graph Alignment via Graph Convolutional Networks. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 349–357.
  8. Zhang, Z.; Yang, Y.; Chen, B. Relation-aware heterogeneous graph neural network for entity alignment. Neurocomputing 2024, 592, 127797.
  9. Shi, X.; Li, B.; Chen, L.; Yang, C. Bi-Neighborhood Graph Neural Network for cross-lingual entity alignment. Knowl.-Based Syst. 2023, 277, 110841.
  10. Huang, Y.; Zhang, X.; Zhang, R.; Chen, J.; Kim, J. Progressively Modality Freezing for Multi-Modal Entity Alignment. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Bangkok, Thailand, 11–16 August 2024; Ku, L.W., Martins, A., Srikumar, V., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2024; pp. 3477–3489.
  11. Zhu, L.; Li, N.; Bai, L. Embedding-based entity alignment between multi-source temporal knowledge graphs. Eng. Appl. Artif. Intell. 2024, 133, 108451.
  12. Yang, L.; Chen, H.; Wang, X.; Yang, J.; Wang, F.Y.; Liu, H. Two Heads Are Better Than One: Integrating Knowledge from Knowledge Graphs and Large Language Models for Entity Alignment. arXiv 2024, arXiv:2401.16960.
  13. Yang, L.; Cheng, J.; Zhang, F. Advancing Cross-Lingual Entity Alignment with Large Language Models: Tailored Sample Segmentation and Zero-Shot Prompts. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, Miami, FL, USA, 12–16 November 2024; Al-Onaizan, Y., Bansal, M., Chen, Y.N., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2024; pp. 8122–8138.
  14. Jiang, X.; Shen, Y.; Shi, Z.; Xu, C.; Li, W.; Li, Z.; Guo, J.; Shen, H.; Wang, Y. Unlocking the Power of Large Language Models for Entity Alignment. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Bangkok, Thailand, 11–16 August 2024; Ku, L.W., Martins, A., Srikumar, V., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2024; pp. 7566–7583.
  15. Chen, S.; Zhang, Q.; Dong, J.; Hua, W.; Li, Q.; Huang, X. Entity Alignment with Noisy Annotations from Large Language Models. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 10–15 December 2024; Globerson, A., Mackey, L., Belgrave, D., Fan, A., Paquet, U., Tomczak, J., Zhang, C., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2024; Volume 37, pp. 15097–15120.
  16. Jin, X.; Wang, Z.; Chen, J.; Yang, L.; Oh, B.; Hwang, S.w.; Li, J. HLMEA: Unsupervised Entity Alignment Based on Hybrid Language Models. Proc. AAAI Conf. Artif. Intell. 2025, 39, 11888–11896.
  17. Cheng, J.; Lu, C.; Yang, L.; Chen, G.; Zhang, F. EasyEA: Large Language Model is All You Need in Entity Alignment Between Knowledge Graphs. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2025, Vienna, Austria, 27 July–1 August 2025; Che, W., Nabende, J., Shutova, E., Pilehvar, M.T., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2025; pp. 20981–20995.
  18. Touvron, H.; Martin, L.; Stone, K.; Albert, P.; Almahairi, A.; Babaei, Y.; Bashlykov, N.; Batra, S.; Bhargava, P.; Bhosale, S.; et al. Llama 2: Open Foundation and Fine-Tuned Chat Models. arXiv 2023, arXiv:2307.09288.
  19. OpenAI; Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; et al. GPT-4 Technical Report. arXiv 2024, arXiv:2303.08774.
  20. Jiang, X.; Xu, C.; Shen, Y.; Wang, Y.; Su, F.; Shi, Z.; Sun, F.; Li, Z.; Guo, J.; Shen, H. Toward Practical Entity Alignment Method Design: Insights from New Highly Heterogeneous Knowledge Graph Datasets. In Proceedings of the WWW ’24: ACM Web Conference 2024, Singapore, 13–17 May 2024; Association for Computing Machinery: New York, NY, USA, 2024; pp. 2325–2336.
  21. Zhang, Y.; Chen, Z.; Guo, L.; Xu, Y.; Zhang, W.; Chen, H. Making Large Language Models Perform Better in Knowledge Graph Completion. In Proceedings of the MM ’24: 32nd ACM International Conference on Multimedia, Melbourne, VIC, Australia, 28 October–1 November 2024; Association for Computing Machinery: New York, NY, USA, 2024; pp. 233–242.
  22. Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating Embeddings for Modeling Multi-relational Data. In Proceedings of the Advances in Neural Information Processing Systems 26, Lake Tahoe, NV, USA, 5–8 December 2013; Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K.Q., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2013; pp. 2787–2795.
  23. Sun, Z.; Hu, W.; Zhang, Q.; Qu, Y. Bootstrapping Entity Alignment with Knowledge Graph Embedding. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, Stockholm, Sweden, 13–19 July 2018; Lang, J., Ed.; IJCAI: Palo Alto, CA, USA, 2018; pp. 4396–4402. [Google Scholar] [CrossRef]
  24. Sun, Z.; Huang, J.; Hu, W.; Chen, M.; Guo, L.; Qu, Y. TransEdge: Translating Relation-Contextualized Embeddings for Knowledge Graphs. In Proceedings of The Semantic Web – ISWC 2019, Auckland, New Zealand, 26–30 October 2019; Ghidini, C., Hartig, O., Maleshkova, M., Svátek, V., Cruz, I., Hogan, A., Song, J., Lefrançois, M., Gandon, F., Eds.; Springer: Cham, Switzerland, 2019; pp. 612–629. [Google Scholar]
  25. Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. In Proceedings of the 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, 24–26 April 2017. [Google Scholar] [CrossRef]
  26. Wu, Y.; Liu, X.; Feng, Y.; Wang, Z.; Yan, R.; Zhao, D. Relation-Aware Entity Alignment for Heterogeneous Knowledge Graphs. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, 10–16 August 2019; Kraus, S., Ed.; IJCAI: Palo Alto, CA, USA, 2019; pp. 5278–5284. [Google Scholar] [CrossRef]
  27. Mao, X.; Wang, W.; Wu, Y.; Lan, M. Boosting the Speed of Entity Alignment 10×: Dual Attention Matching Network with Normalized Hard Sample Mining. In Proceedings of the WWW ’21: Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 821–832. [Google Scholar] [CrossRef]
  28. Chen, J.; Yang, L.; Wang, Z.; Gong, M. Higher-order GNN with Local Inflation for entity alignment. Knowl.-Based Syst. 2024, 293, 111634. [Google Scholar] [CrossRef]
  29. Tian, S.; Luo, Y.; Xu, T.; Yuan, C.; Jiang, H.; Wei, C.; Wang, X. KG-Adapter: Enabling Knowledge Graph Integration in Large Language Models through Parameter-Efficient Fine-Tuning. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2024, Bangkok, Thailand, 11–16 August 2024; Ku, L.W., Martins, A., Srikumar, V., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2024; pp. 3813–3828. [Google Scholar] [CrossRef]
  30. Liu, Y.; Cao, Y.; Lin, X.; Shang, Y.; Wang, S.; Pan, S. Enhancing Large Language Model for Knowledge Graph Completion via Structure-Aware Alignment-Tuning. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Suzhou, China, 4–9 November 2025; Christodoulopoulos, C., Chakraborty, T., Rose, C., Peng, V., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2025; pp. 20970–20984. [Google Scholar] [CrossRef]
  31. Zhang, Q.; Dong, J.; Chen, H.; Zha, D.; Yu, Z.; Huang, X. KnowGPT: Knowledge Graph based Prompting for Large Language Models. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 10–15 December 2024; Globerson, A., Mackey, L., Belgrave, D., Fan, A., Paquet, U., Tomczak, J., Zhang, C., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2024; Volume 37, pp. 6052–6080. [Google Scholar] [CrossRef]
  32. Chen, Z.; Bai, L.; Li, Z.; Huang, Z.; Jin, X.; Dou, Y. A New Pipeline for Knowledge Graph Reasoning Enhanced by Large Language Models Without Fine-Tuning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, USA, 12–16 November 2024; Al-Onaizan, Y., Bansal, M., Chen, Y.N., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2024; pp. 1366–1381. [Google Scholar] [CrossRef]
  33. Sun, Z.; Zhang, Q.; Hu, W.; Wang, C.; Chen, M.; Akrami, F.; Li, C. A Benchmarking Study of Embedding-Based Entity Alignment for Knowledge Graphs. Proc. VLDB Endow. 2020, 13, 2326–2340. [Google Scholar] [CrossRef]
  34. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  35. Yan, Z.; Peng, R.; Wu, H. Similarity propagation based semi-supervised entity alignment. Eng. Appl. Artif. Intell. 2024, 130, 107787. [Google Scholar] [CrossRef]
  36. Mao, X.; Wang, W.; Xu, H.; Lan, M.; Wu, Y. MRAEA: An Efficient and Robust Entity Alignment Approach for Cross-Lingual Knowledge Graph. In Proceedings of the WSDM ’20: Proceedings of the 13th International Conference on Web Search and Data Mining, Houston, TX, USA, 3–7 February 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 420–428. [Google Scholar] [CrossRef]
  37. Shazeer, N. GLU Variants Improve Transformer. arXiv 2020, arXiv:2002.05202. [Google Scholar] [CrossRef]
  38. Sun, Z.; Hu, W.; Li, C. Cross-Lingual Entity Alignment via Joint Attribute-Preserving Embedding. In Proceedings of The Semantic Web – ISWC 2017: 16th International Semantic Web Conference, Part I, Vienna, Austria, 21–25 October 2017; d’Amato, C., Fernández, M., Tamma, V.A.M., Lécué, F., Cudré-Mauroux, P., Sequeda, J.F., Lange, C., Heflin, J., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2017; Volume 10587, pp. 628–644. [Google Scholar] [CrossRef]
  39. Xu, C.; Su, F.; Lehmann, J. Time-aware Graph Neural Network for Entity Alignment between Temporal Knowledge Graphs. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, Dominican Republic, 7–11 November 2021; Moens, M.F., Huang, X., Specia, L., Yih, S.W.t., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 8999–9010. [Google Scholar] [CrossRef]
  40. Xu, C.; Su, F.; Xiong, B.; Lehmann, J. Time-aware Entity Alignment using Temporal Relational Attention. In Proceedings of the WWW ’22: Proceedings of the ACM Web Conference 2022, Virtual Event, Lyon, France, 25–29 April 2022; Association for Computing Machinery: New York, NY, USA, 2022; pp. 788–797. [Google Scholar] [CrossRef]
  41. Cai, L.; Mao, X.; Ma, M.; Yuan, H.; Zhu, J.; Lan, M. A Simple Temporal Information Matching Mechanism for Entity Alignment between Temporal Knowledge Graphs. In Proceedings of the 29th International Conference on Computational Linguistics, Gyeongju, Republic of Korea, 12–17 October 2022; Calzolari, N., Huang, C.R., Kim, H., Pustejovsky, J., Wanner, L., Choi, K.S., Ryu, P.M., Chen, H.H., Donatelli, L., Ji, H., et al., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 2075–2086. [Google Scholar]
  42. Hu, E.J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. LoRA: Low-Rank Adaptation of Large Language Models. arXiv 2021, arXiv:2106.09685. [Google Scholar] [CrossRef]
  43. Dettmers, T.; Pagnoni, A.; Holtzman, A.; Zettlemoyer, L. QLoRA: Efficient Finetuning of Quantized LLMs. In Proceedings of the NIPS ’23: Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023; Curran Associates Inc.: Red Hook, NY, USA, 2023. [Google Scholar]
  44. Liu, H.; Li, C.; Wu, Q.; Lee, Y.J. Visual Instruction Tuning. In Proceedings of the Advances in Neural Information Processing Systems; Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2023; Volume 36, pp. 34892–34916. [Google Scholar]
Figure 1. A comparison of SM-based and LLM-enhanced entity alignment methods.
Figure 2. The framework of our method, taking the alignment of entities “Chen-Ning Yang” and “Chen Ning Yang” as an example. The double-arrow curves connect the pre-aligned entity pairs.
Figure 3. Hits@1 and inference time variation with different numbers of candidate entities on datasets DBP_ZH-EN and ICEWS-WIKI.
Figure 4. The precision and recall of classification with different positive-to-negative sample ratios on datasets DBP_ZH-EN and ICEWS-WIKI.
Table 1. Comparison of representative entity alignment methods from different perspectives.

| Method | Learning Paradigm | LLM Involvement | Structural Signal Injection | Inference Strategy |
|---|---|---|---|---|
| Small Model-Based Methods | Embedding-based representation learning | None | Explicit graph embeddings | Embedding similarity ranking |
| LLMEA | In-context learning | Virtual entity generation + iterative multiple-choice reasoning | None | Multi-round multiple-choice selection |
| ChatEA | In-context learning | Description generation + dialogue reasoning | KG-Code translation | Iterative dialogue reasoning |
| HLMEA | In-context learning | Repeated annotation + majority voting | Textual representation of entities | Iterative filtering and voting |
| EasyEA | In-context learning | Information summarization + candidate selection | Implicit via summaries | LLM-driven candidate selection based on semantic features |
| HIEA (Ours) | Instruction tuning with knowledge adaptation | Single-step answer generation | Explicit structural embedding injection + neighbor facts | Hierarchical inference via collaboration between instruction-tuned LLMs and small models |

Table 2. Detailed information of EA datasets.

| Dataset | KG | Entities | Relations | Triples | Pairs | Structural Similarity |
|---|---|---|---|---|---|---|
| DBP_ZH-EN | ZH | 19,388 | 1701 | 70,414 | 15,000 | 0.644 |
| | EN | 19,572 | 1323 | 95,142 | | |
| DBP_JA-EN | JA | 19,814 | 1299 | 77,214 | 15,000 | 0.660 |
| | EN | 19,780 | 1153 | 93,484 | | |
| DBP_FR-EN | FR | 19,661 | 903 | 105,998 | 15,000 | 0.652 |
| | EN | 19,993 | 1208 | 115,722 | | |
| ICEWS-WIKI | ICEWS | 11,047 | 272 | 3,527,881 | 5058 | 0.154 |
| | WIKI | 15,896 | 226 | 198,257 | | |
| ICEWS-YAGO | ICEWS | 26,863 | 272 | 4,192,555 | 18,824 | 0.140 |
| | YAGO | 22,734 | 41 | 107,118 | | |

Table 3. Comparative results on DBP15K. Boldface and underlining are used to indicate the best and second-best results, respectively.

| Methods | DBP_ZH-EN Hits@1 | DBP_ZH-EN Hits@10 | DBP_ZH-EN MRR | DBP_JA-EN Hits@1 | DBP_JA-EN Hits@10 | DBP_JA-EN MRR | DBP_FR-EN Hits@1 | DBP_FR-EN Hits@10 | DBP_FR-EN MRR |
|---|---|---|---|---|---|---|---|---|---|
| MTransE | 0.308 | 0.614 | 0.364 | 0.279 | 0.575 | 0.349 | 0.244 | 0.556 | 0.335 |
| BootEA | 0.629 | 0.847 | 0.703 | 0.622 | 0.854 | 0.701 | 0.653 | 0.874 | 0.731 |
| TransEdge | 0.735 | 0.919 | 0.801 | 0.719 | 0.932 | 0.795 | 0.710 | 0.941 | 0.796 |
| GCN-Align | 0.413 | 0.744 | 0.549 | 0.399 | 0.745 | 0.546 | 0.373 | 0.745 | 0.532 |
| RDGCN | 0.708 | 0.846 | 0.746 | 0.767 | 0.895 | 0.812 | 0.873 | 0.950 | 0.901 |
| Dual-AMN | 0.861 | 0.964 | 0.901 | 0.892 | 0.978 | 0.925 | 0.954 | 0.994 | 0.970 |
| HOLI-GNN | 0.901 | 0.966 | 0.926 | 0.924 | 0.977 | 0.943 | 0.971 | 0.993 | 0.980 |
| PMF | 0.940 | 0.991 | 0.960 | 0.971 | 0.997 | 0.981 | 0.988 | 0.999 | 0.992 |
| LLMEA | 0.898 | 0.923 | - | 0.911 | 0.946 | - | 0.957 | 0.977 | - |
| Seg-Align | 0.953 | - | - | 0.907 | - | - | 0.987 | - | - |
| ChatEA | - | - | - | - | - | - | 0.990 | 1.000 | 0.995 |
| HLMEA | 0.930 | - | 0.934 | 0.938 | - | 0.950 | 0.986 | - | 0.989 |
| HIEA | **0.969 ± 0.003** | **0.993 ± 0.001** | **0.978 ± 0.002** | **0.990 ± 0.001** | **0.998 ± 0.001** | **0.994 ± 0.001** | **0.998 ± 0.001** | **1.000 ± 0.001** | **0.999 ± 0.001** |

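
The result tables report Hits@1, Hits@10, and MRR. As a reading aid, the following is a minimal Python sketch of the standard definitions of these ranking metrics, not the authors' evaluation code. It assumes each test source entity comes with the 1-based rank of its correct target entity; the function names and the toy `ranks` list are illustrative only.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of source entities whose correct target is ranked in the top k.

    `ranks` holds 1-based ranks of the correct target entity (assumed input format).
    """
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all source entities."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy example (made-up ranks, not data from the paper).
ranks = [1, 1, 2, 5, 12]
print("Hits@1 :", hits_at_k(ranks, 1))
print("Hits@10:", hits_at_k(ranks, 10))
print("MRR    :", round(mean_reciprocal_rank(ranks), 3))
```

By construction, Hits@10 can never be lower than Hits@1 for the same ranking, and MRR lies between Hits@1 and 1.
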
Table 4. Comparative results on HHKG.

| Methods | ICEWS-WIKI Hits@1 | ICEWS-WIKI Hits@10 | ICEWS-WIKI MRR | ICEWS-YAGO Hits@1 | ICEWS-YAGO Hits@10 | ICEWS-YAGO MRR |
|---|---|---|---|---|---|---|
| MTransE | 0.021 | 0.158 | 0.068 | 0.012 | 0.084 | 0.040 |
| BootEA | 0.072 | 0.275 | 0.139 | 0.020 | 0.120 | 0.056 |
| GCN-Align | 0.046 | 0.184 | 0.093 | 0.017 | 0.085 | 0.038 |
| RDGCN | 0.064 | 0.202 | 0.096 | 0.029 | 0.097 | 0.042 |
| Dual-AMN | 0.083 | 0.281 | 0.145 | 0.031 | 0.144 | 0.068 |
| TEA-GNN | 0.063 | 0.253 | 0.126 | 0.025 | 0.135 | 0.064 |
| TREA | 0.081 | 0.302 | 0.155 | 0.033 | 0.150 | 0.072 |
| STEA | 0.079 | 0.292 | 0.152 | 0.033 | 0.147 | 0.073 |
| Simple-HHEA | 0.720 | 0.872 | 0.754 | 0.847 | 0.915 | 0.870 |
| ChatEA | 0.880 | 0.945 | 0.912 | 0.935 | 0.955 | 0.944 |
| HIEA | **0.936 ± 0.003** | 0.940 ± 0.002 | **0.938 ± 0.002** | **0.963 ± 0.002** | **0.965 ± 0.001** | **0.964 ± 0.002** |

Table 5. Performance comparison of different HIEA variants.

| Variants | DBP_ZH-EN Hits@1 | DBP_ZH-EN Hits@10 | DBP_ZH-EN MRR | ICEWS-WIKI Hits@1 | ICEWS-WIKI Hits@10 | ICEWS-WIKI MRR |
|---|---|---|---|---|---|---|
| HIEA | 0.969 | 0.993 | 0.978 | 0.936 | 0.940 | 0.938 |
| w/o neighbors | 0.956 | 0.989 | 0.968 | 0.932 | 0.937 | 0.934 |
| w/o augmentation | 0.961 | 0.990 | 0.972 | 0.921 | 0.929 | 0.924 |
| w/o adaption | 0.959 | 0.899 | 0.970 | 0.929 | 0.935 | 0.932 |

Table 6. Recall of the correct target entity within the top-k candidate sets generated by the small model on different datasets.

| Dataset | k = 10 | k = 20 | k = 30 | k = 40 | k = 50 |
|---|---|---|---|---|---|
| DBP_ZH-EN | 0.992 | 0.995 | 0.997 | 0.998 | 0.998 |
| ICEWS-WIKI | 0.885 | 0.909 | 0.921 | 0.931 | 0.937 |

Table 7. Performance comparison with different LLMs and learning paradigms.

| Models | DBP_ZH-EN Hits@1 | DBP_ZH-EN Hits@10 | DBP_ZH-EN MRR | ICEWS-WIKI Hits@1 | ICEWS-WIKI Hits@10 | ICEWS-WIKI MRR |
|---|---|---|---|---|---|---|
| HIEA w/ LLaMA2-7B (ICL) | 0.943 | 0.991 | 0.962 | 0.891 | 0.912 | 0.900 |
| HIEA w/ LLaMA2-7B (FT) | 0.958 | 0.992 | 0.970 | 0.924 | 0.929 | 0.927 |
| HIEA w/ LLaMA3-8B (ICL) | 0.949 | 0.992 | 0.966 | 0.881 | 0.909 | 0.891 |
| HIEA w/ LLaMA3-8B (FT) | 0.969 | 0.993 | 0.978 | 0.936 | 0.940 | 0.938 |

Table 8. Number and accuracy of augmented entity pairs generated by the proposed data augmentation strategy.

| Dataset | DBP_ZH-EN | DBP_JA-EN | DBP_FR-EN | ICEWS-WIKI | ICEWS-YAGO |
|---|---|---|---|---|---|
| Number | 500 | 500 | 500 | 506 | 1882 |
| Accuracy | 0.947 | 0.910 | 0.874 | 1.000 | 1.000 |

Table 9. Comparison with the variant w/o classification. Uncer. Per. indicates the percentage of uncertain entities.

| Models | DBP_ZH-EN Hits@1 | DBP_ZH-EN MRR | DBP_ZH-EN Uncer. Per. | DBP_ZH-EN Time (s) | ICEWS-WIKI Hits@1 | ICEWS-WIKI MRR | ICEWS-WIKI Uncer. Per. | ICEWS-WIKI Time (s) |
|---|---|---|---|---|---|---|---|---|
| HIEA | 0.969 | 0.978 | 45.92% | 4045 | 0.936 | 0.938 | 50.08% | 1784 |
| w/o classification | 0.967 | 0.976 | 100% | 8586 | 0.941 | 0.942 | 100% | 3474 |

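
The numbers in Table 9 allow a quick arithmetic check of how inference time relates to the share of uncertain entities. The Python sketch below only recomputes ratios from the values printed in the table. It assumes that Time(s) is the total inference time over the test set, which the table itself does not state, and the dictionary keys are illustrative names.

```python
# Values copied from Table 9; key names are illustrative.
# Assumption: "Time(s)" is total inference time over the test set.
table9 = {
    "DBP_ZH-EN": {"uncertain_share": 0.4592, "time_hiea": 4045, "time_no_cls": 8586},
    "ICEWS-WIKI": {"uncertain_share": 0.5008, "time_hiea": 1784, "time_no_cls": 3474},
}


def time_ratio(row: dict) -> float:
    """Inference time of HIEA relative to the variant without classification."""
    return row["time_hiea"] / row["time_no_cls"]


for name, row in table9.items():
    print(
        f"{name}: time ratio = {time_ratio(row):.3f}, "
        f"uncertain share = {row['uncertain_share']:.3f}"
    )
```

On these numbers the time ratio is about 0.47 on DBP_ZH-EN and about 0.51 on ICEWS-WIKI, close to the uncertain-entity shares of 45.92% and 50.08%. This is consistent with inference time being dominated by the entities that are routed to the LLM.
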
Table 10. Comparison of different classifiers for source entity classification on DBP_ZH-EN and ICEWS-WIKI. Precision, recall, and F1-score are reported for the positive (certain) class.

| Methods | DBP_ZH-EN Precision | DBP_ZH-EN Recall | DBP_ZH-EN F1 | ICEWS-WIKI Precision | ICEWS-WIKI Recall | ICEWS-WIKI F1 |
|---|---|---|---|---|---|---|
| Logistic Regression | 0.999 | 0.559 | 0.717 | 0.995 | 0.582 | 0.734 |
| MLP (2-layer) | 0.995 | 0.623 | 0.766 | 0.996 | 0.533 | 0.694 |
| Random Forest | 0.999 | 0.575 | 0.730 | 0.994 | 0.622 | 0.765 |

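
The F1 column of Table 10 can be reproduced from the reported precision and recall. The Python sketch below uses the standard harmonic-mean definition of F1 and the values printed in the table. The variable names are illustrative, and small differences are expected because the inputs are already rounded to three decimals.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 10.
table10 = {
    ("Logistic Regression", "DBP_ZH-EN"): (0.999, 0.559, 0.717),
    ("Logistic Regression", "ICEWS-WIKI"): (0.995, 0.582, 0.734),
    ("MLP (2-layer)", "DBP_ZH-EN"): (0.995, 0.623, 0.766),
    ("MLP (2-layer)", "ICEWS-WIKI"): (0.996, 0.533, 0.694),
    ("Random Forest", "DBP_ZH-EN"): (0.999, 0.575, 0.730),
    ("Random Forest", "ICEWS-WIKI"): (0.994, 0.622, 0.765),
}

for (method, dataset), (p, r, reported) in table10.items():
    print(f"{method:20s} {dataset:11s} computed F1 = {f1_score(p, r):.3f} (reported {reported:.3f})")
```

All three classifiers reach a precision of at least 0.994 on the certain class while recall stays between 0.533 and 0.623, so the F1 values are driven almost entirely by recall.
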
Table 11. Efficiency comparison of different methods. Avg. Tokens and Avg. Time denote the average number of tokens and the average time required to align a source entity, respectively.

| Models | ICEWS-WIKI Avg. Tokens | ICEWS-WIKI Avg. Time (s) | ICEWS-YAGO Avg. Tokens | ICEWS-YAGO Avg. Time (s) |
|---|---|---|---|---|
| ChatEA w/ LLaMA2-70B | 11,380 | 63.4 | 8950 | 46.5 |
| ChatEA w/ LLaMA2-13B | 47,007 | 150.1 | 44,907 | 135.8 |
| ChatEA w/ GPT-4 | 9803 | 131.8 | 6593 | 90.8 |
| HIEA w/ LLaMA2-7B (inference only) | 604 | 0.252 | 554 | 0.228 |
| HIEA w/ LLaMA2-7B (fine-tuning + inference) | 947 | 1.307 | 872 | 0.932 |
| HIEA w/ LLaMA3-8B (inference only) | 541 | 0.261 | 473 | 0.221 |
| HIEA w/ LLaMA3-8B (fine-tuning + inference) | 848 | 1.123 | 745 | 0.833 |

Table 12. Cross-dataset generalization performance of HIEA. Transfer and Original indicate using LoRA parameters obtained via cross-dataset transfer and target-dataset fine-tuning, respectively.

| Methods | DBP_JA-EN Hits@1 | DBP_JA-EN Hits@10 | DBP_JA-EN MRR | DBP_FR-EN Hits@1 | DBP_FR-EN Hits@10 | DBP_FR-EN MRR | ICEWS-YAGO Hits@1 | ICEWS-YAGO Hits@10 | ICEWS-YAGO MRR |
|---|---|---|---|---|---|---|---|---|---|
| Transfer | 0.985 | 0.996 | 0.991 | 0.994 | 0.998 | 0.997 | 0.949 | 0.952 | 0.950 |
| Original | 0.990 | 0.998 | 0.994 | 0.998 | 1.000 | 0.999 | 0.963 | 0.965 | 0.964 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Shi, X.; Han, Z.; Li, B. HIEA: Hierarchical Inference for Entity Alignment with Collaboration of Instruction-Tuned Large Language Models and Small Models. Electronics 2026, 15, 421. https://doi.org/10.3390/electronics15020421