Article

Detect-Then-Resolve: Enhancing Knowledge Graph Conflict Resolution with Large Language Model

1 Laboratory for Big Data and Decision, National University of Defense Technology, Changsha 410073, China
2 National Key Laboratory of Information Systems Engineering, National University of Defense Technology, Changsha 410073, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(15), 2318; https://doi.org/10.3390/math12152318
Submission received: 6 July 2024 / Revised: 20 July 2024 / Accepted: 23 July 2024 / Published: 24 July 2024
(This article belongs to the Section E1: Mathematics and Computer Science)

Abstract

Conflict resolution for knowledge graphs (KGs) is a critical technique in knowledge fusion, ensuring the resolution of conflicts between existing KGs and external knowledge while maintaining post-fusion accuracy. However, current approaches often encounter difficulties with external triples involving unseen entities due to limited knowledge. Moreover, current methodologies typically overlook conflict detection prior to resolution, a crucial step for accurate truth inference. This paper introduces CRDL, an innovative approach that leverages conflict detection and large language models (LLMs) to identify truths. By employing conflict detection, we implement precise filtering strategies tailored to various types of relations and attributes. By designing prompts and injecting relevant information into an LLM, we identify triples with unseen entities. Experimental results demonstrate the superiority of CRDL over baseline methods. Specifically, our method surpasses the state-of-the-art by achieving a 56.4% improvement in recall and a 68.2% increase in F1-score. These results clearly illustrate the enhanced performance and effectiveness of our approach. Additionally, ablation studies and further analyses underscore the importance of the components within CRDL.

1. Introduction

In the era of big data and ubiquitous information, knowledge graphs (KGs) have emerged as a crucial tool for organizing, integrating, and leveraging extensive repositories of knowledge [1,2]. KGs facilitate the representation of concepts, entities, and relationships within a graph-like structure, providing a versatile framework to capture complex and interconnected information [3]. This graph-based representation enhances visualization, exploration, and reasoning, thus enabling a more profound understanding of the underlying knowledge [4].
Despite their advantages, current real-world KGs often fail to comprehensively capture the full scope of knowledge within a given domain [5,6]. This incompleteness restricts their effectiveness in providing accurate and reliable insights for downstream tasks [7,8]. To address this limitation, knowledge fusion emerges as a crucial technique by integrating information from external sources or other KGs. However, this process can introduce incorrect information and create conflicts between existing and new knowledge, potentially compromising the accuracy of KGs and negatively impacting downstream tasks. Consequently, conflict resolution, which aims to detect and correct discrepancies or inconsistencies across multiple knowledge sources, is crucial for maintaining the accuracy of KGs after fusion [9].
Conventional conflict resolution methods mainly focus on the correctness of various triples and the estimation of sources. They employ rule-based reasoning, probabilistic inference, and machine learning algorithms [10,11,12] to resolve conflicts among data records. However, they often overlook the semantic information of KGs and cannot handle unseen entities. In recent years, several open knowledge graph completion (KGC) frameworks have been introduced that aim to recognize external triples. OWE [13] identifies unseen entities with their names and descriptions; TKGC [14] utilizes entity encoding and holistic fact scoring to extract information from KGs. Although these frameworks could be adapted to tackle conflict resolution with unseen entities, they still yield unsatisfactory results on external triples due to limited knowledge.
On the other hand, conflicts between external triples and existing KGs can be identified based on corresponding relations or attributes. For example, consider a triple (Himalayas, located_in, Asia) within a KG and an external triple (Himalayas, located_in, Africa), both of which share the same head entity and relation. Since the relation “located_in” is one-to-one, a conflict must exist between these two triples. In contrast, there may be no conflict between the triples (Bohr, student_of, Rutherford) in the given KG and (Chadwick, student_of, Rutherford) because the relation “student_of” is not one-to-one. Therefore, conflict detection is crucial for accurately resolving such discrepancies. Nevertheless, as far as we know, no existing study considers conflict detection; instead, they either assume a single truth, selecting only the most likely candidate as the truth and potentially overlooking latent truths, or they assume multiple truths, accepting all possible candidates and thereby risking the inclusion of false triples. In contrast, detecting conflicts based on relations and attributes can enable more accurate inference, thereby discovering more truths and reducing false recognition.
In short, the limitations of current works can be summarized as follows: (1) Limitations in recognizing external triples. Current methods [10,11,12,14,15] struggle to handle external triples, especially those with unseen entities, due to their reliance on the knowledge contained within predefined KGs. While some studies have attempted to utilize PLMs to identify these unseen entities, the limited scale of PLMs constrains their effectiveness. (2) Neglect of conflict detection. Current methods, to the best of our knowledge, fail to account for scenarios where external triples do not conflict with existing KGs. This oversight results in unnecessary filtering and the omission of potentially valuable information.
To address the aforementioned limitations, we propose CRDL, an LLM-based conflict resolution framework incorporating conflict detection. Our approach begins by training embeddings on a given KG and classifying all triples according to their relations or attributes. During inference, we initially utilize these embeddings to identify and filter conflicts. For the remaining triples that involve non-1-to-1 relations and attributes, we employ an LLM-based filter for additional screening. Additionally, we design a prompt comprising three components to elicit the LLM's recognition capabilities. Experimental results on current general benchmarks demonstrate that CRDL significantly improves precision and recall compared to state-of-the-art methods. Here, precision measures the accuracy of our method in predicting true triples, while recall assesses its ability to identify all true triples.
Our contributions can be listed as follows:
  • We propose CRDL, a conflict resolution framework that incorporates conflict detection and an LLM-based filter. This framework significantly improves the precision and recall in identifying the truths among external triples.
  • We leverage the extensive knowledge embedded in LLMs to identify external triples. Specifically, we construct prompts using well-designed templates that contain insightful instructions to fully exploit the capabilities of LLMs.
  • We employ conflict detection based on relation types to handle triples more precisely. By applying different strategies to 1-to-1 and non-1-to-1 relations, we achieve more accurate recognition.

2. Related Works

2.1. Conflict Resolution

In traditional data fusion, conflict resolution mainly focuses on judging which source provides true information, while rarely considering the features of data structure. Following [16], existing methods can be divided into three categories: (1) Iterative methods [11,17,18], which calculate the confidence of observed values from the reliability of their sources and evaluate sources according to the correctness of the values they provide. (2) Optimization-based methods [12,19,20], which define a distance function between truths and observed values and optimize a goal function with supervised data, pulling truths closer to values observed in high-reliability sources while pushing away values observed in low-reliability sources. (3) Probabilistic graphical-model-based methods [21,22,23], which assume that observed values are generated by distributions conditioned on the truth and source quality. However, these methods are limited in their ability to identify unseen entities, as they seldom take into account the intrinsic features of external triples.

2.2. Open Knowledge Graph Completion

Conventional KGC methods mainly utilize embedding techniques to discover missing relational facts, including geometric models [24,25,26], tensor decomposition models [27,28], and neural network models [29,30,31]. However, these methods can hardly handle triples from open sources. To tackle this issue, a few studies have attempted to recognize external facts. Dong et al. [32] propose a probabilistic method, KnowledgeVault, which embeds triples into vectors based on a knowledge extractor and evaluates triples according to a prior KG. Shah et al. [13] propose OWE, which extends conventional KGC methods by learning a transformation from the text embedding space to the KG embedding space, identifying unseen entities with their names and descriptions. Shi et al. [33] utilize a convolutional neural network, ConMask, to encode entities with their names and descriptions, and design a scoring function to measure triples. Niu et al. [34] further introduce a multiple-interaction attention mechanism to exploit information in entity descriptions. Huang et al. [14] propose TKGC, a semi-supervised KG completion method, which utilizes KG representations and probabilistic inference to discover latent truths. However, due to limited knowledge, these methods are unable to uncover the majority of latent truths present among external triples. In contrast, our method leverages LLMs, which contain abundant commonsense knowledge, to identify external triples with unseen entities.

2.3. Large Language Model

Recent years have witnessed the rapid growth of LLMs, such as GPTs [35], Claude, and LLaMA [36]. Owing to their massive training corpora and vast numbers of parameters, LLMs master abundant commonsense knowledge and show powerful abilities in a wide range of tasks, such as intelligent dialogue, information retrieval, reasoning, and even solving mathematical problems [37]. In addition, LLM-augmented KGs have gained increasing attention recently [38]. Specifically, several studies attempt to complete KGs with LLMs, which could also be adopted to partially handle conflict resolution. KG-BERT [39] first introduces pre-trained language models (PLMs) to KG completion tasks, feeding triples in textual form into BERT and training it to classify triples directly. AutoKG [40] designs a framework of interaction between users and LLMs to identify truths after iterations of discussion. KIGCPT [41] focuses on the utilization of LLMs, constructing prompts with certain kinds of information that can assist LLMs in reasoning accurately. In this paper, we utilize well-designed prompts, enriched with pertinent information, to enhance the reasoning capabilities of LLMs in the context of conflict resolution.

3. Methodology

3.1. Problem Formulation

  • Notations: In this paper, we consider triples consisting of relations and attributes. Formally, a knowledge graph is formulated as KG = {E, R, A, V, T_r, T_a}, where E, R, A, and V denote the sets of entities, relations, attributes, and values, respectively. T_r ⊆ E × R × E is the set of relation triples, and T_a ⊆ E × A × V is the set of attribute triples. A fact is a triple from the knowledge graph, denoted as f = (h, r, t) ∈ T_r ∪ T_a. Claims, namely triples extracted from open sources, take the same form as facts but may contain unseen entities. Without loss of generality, we consider the situation in which the tails of claims are unseen. The set of claims is denoted as C = {(h, r, t) | h ∈ E, r ∈ R ∪ A}, where the head entity and the relation (or attribute) are contained in the KG, but the tail entity (or value) may not be. Correct claims are called truths.
  • Objective: Given a knowledge graph KG = {E, R, A, V, T_r, T_a} and a set of claims C extracted from various sources, the target of conflict resolution is to identify the truths C* ⊆ C.
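To make the notation concrete, the sets above can be sketched as plain Python containers. This is an illustrative assumption on our part (the class, field names, and example triples are not from the paper's released code):

```python
from dataclasses import dataclass, field

# Hypothetical container mirroring KG = {E, R, A, V, T_r, T_a}.
@dataclass
class KnowledgeGraph:
    relation_triples: set = field(default_factory=set)   # T_r, subset of E x R x E
    attribute_triples: set = field(default_factory=set)  # T_a, subset of E x A x V

    @property
    def entities(self):
        # E: heads of all triples plus tails of relation triples
        heads = {h for h, _, _ in self.relation_triples | self.attribute_triples}
        tails = {t for _, _, t in self.relation_triples}
        return heads | tails

kg = KnowledgeGraph(
    relation_triples={("Himalayas", "located_in", "Asia")},
    attribute_triples={("Himalayas", "height_m", 8849)},
)
# Claims share heads and relations with the KG, but tails may be unseen.
claims = {("Himalayas", "located_in", "Africa"),
          ("Himalayas", "located_in", "Asia")}
```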

3.2. Framework Overview

Figure 1 illustrates the structure of our proposed method. Initially, we employ knowledge graph embedding (KGE) techniques to generate representations for the triples within the KG and to determine the appropriate scoring functions. Furthermore, we categorize all relations and attributes into 1-to-1 and non-1-to-1 groups for conflict-detection purposes. During the inference phase, we evaluate the perplexity of triples using scoring functions in conjunction with their representations. In this paper, we define perplexity as a quantitative measure of the inaccuracy of claims. For a given triple, a lower perplexity value signifies a higher probability of the triple being true. This allows us to filter out claims characterized by high perplexity, thereby identifying and excluding those that are likely to be incorrect. Depending on the category of the relation or attribute, we either select the claim with the lowest perplexity as the truth or utilize an LLM to identify the truth among the filtered claims. Specifically, we input claims with carefully designed prompts into the LLM, which then outputs the truths among the claims.
The overall methodology of our approach can be summarized as follows:
(1) Represent all triples using KGE techniques.
(2) Categorize all relations and attributes to facilitate subsequent conflict detection.
(3) Detect conflicts among claims based on their relations and attributes.
(4) For claims involving 1-to-1 relations or attributes, select the claim with the lowest perplexity as the truth.
(5) For claims involving non-1-to-1 relations or attributes, apply an initial filter and then use LLMs to determine the truth among the remaining claims.
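The five steps above can be sketched as a single driver function. This is a minimal sketch under our own naming: `score`, `one_to_one`, and `llm_filter` stand in for the components described in the following subsections, and lower score means more likely true:

```python
# Hedged sketch of the five-step pipeline; all names are placeholders.
def resolve_conflicts(claims, score, one_to_one, llm_filter, alpha=0.5):
    groups = {}                                   # mirrors C_{h,r}
    for h, r, t in claims:
        groups.setdefault((h, r), []).append((h, r, t))
    truths = set()
    for (h, r), group in groups.items():
        if r in one_to_one:
            truths.add(min(group, key=score))     # single-truth: keep argmin
        else:
            truths |= {c for c in group if score(c) <= alpha}            # confident
            truths |= llm_filter([c for c in group if score(c) > alpha]) # defer to LLM
    return truths
```

For 1-to-1 relations only the lowest-perplexity claim survives; for the rest, low-perplexity claims are accepted directly and the remainder are deferred to the LLM-based filter.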

3.3. KG Embedding

In order to learn and leverage the knowledge in the given KG, we first utilize a KGE technique to embed the triples. Specifically, we employ TransE [25] to score the triples and obtain representations of the KG. Given a relation triple (h, r, t), the scoring function is defined as:
f_r(h, r, t) = ‖h + r − t‖₁
where ‖·‖₁ is the L1-norm, and h, r, and t are the embeddings of the head entity h, relation r, and tail entity t, respectively.
In addition, we also encode the attribute triples in the KG. Following [14,42], we process the attribute triples, transforming and unifying their attribute values into pure numerical values. Given an attribute triple (e, a, v), where v represents a numerical value, the scoring function is defined as:
f_a(e, a, v) = |eᵀa − v|
where |·| is the absolute value, e and a are the embeddings of the head entity e and the attribute a, respectively, and eᵀ denotes the transpose of e. Both f_r and f_a quantify the perplexity of triples, which measures their incorrectness. In other words, for a given triple, a lower value of f_r or f_a indicates a higher likelihood of the triple being true. Note that the choice of these functions is not central to our method; they can be substituted with any functions capable of evaluating triples.
We denote f_r and f_a by a unified f for convenience, and the loss function is defined as:
L = Σ_{(h,r,t) ∈ T_r ∪ T_a} Σ_{(h,r,t′) ∈ T′_r ∪ T′_a} [γ + f(h, r, t) − f(h, r, t′)]₊
where γ is the margin hyper-parameter, T′_r and T′_a are sets of negative samples produced by randomly replacing the tail entities (or values) of triples, and [·]₊ = max(0, ·). The training process optimizes all embeddings to minimize the loss L, ensuring that the scores of true triples are reduced while the scores of false triples are increased.
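The two scoring functions and the margin loss can be written out directly. The following is a pure-Python illustration of the formulas above, not the paper's (vectorized, tensor-based) implementation:

```python
# TransE-style relation score: ||h + r - t||_1 (perplexity of a relation triple).
def f_r(h, r, t):
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

# Attribute score: |e^T a - v| for a numeric attribute value v.
def f_a(e, a, v):
    return abs(sum(ei * ai for ei, ai in zip(e, a)) - v)

# One term of the margin loss: [gamma + f(positive) - f(negative)]_+
def margin_loss(pos_score, neg_score, gamma=5.0):
    return max(0.0, gamma + pos_score - neg_score)
```

A triple whose embeddings satisfy h + r ≈ t scores near zero, so minimizing the loss pushes true triples below false ones by at least the margin γ.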

3.4. Conflict Detection

Given that a single head entity can have multiple tail entities in a KG, the claims might not conflict with existing knowledge. Therefore, it is essential to classify whether a relationship is 1-to-1 in order to facilitate subsequent inference. Specifically, we define the 1-to-1 relations (or attributes) as:
O = {r | ∀h ∈ E, #{t | (h, r, t) ∈ T_r ∪ T_a} = 1}
where #A denotes the cardinality of set A. We collect all the relations and attributes that link each of their head entities to a unique tail entity (or attribute value). Therefore, for a given relation or attribute r ∈ O and an entity e, there exists only one correct tail t (Figure 2). Consequently, by disregarding judgments associated with other potential tails during inference, precision is enhanced and the procedure is expedited. In addition, we denote the remaining relations and attributes as O′, where O ∩ O′ = ∅ and O ∪ O′ = R ∪ A.
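The definition of O translates directly into code: a relation (or attribute) is 1-to-1 iff every head entity that uses it has exactly one tail. A small sketch under assumed triple tuples:

```python
from collections import defaultdict

# Classify relations/attributes as 1-to-1 (the set O in the equation above).
def one_to_one_relations(triples):
    tails = defaultdict(set)          # (h, r) -> set of observed tails
    relations = set()
    for h, r, t in triples:
        tails[(h, r)].add(t)
        relations.add(r)
    return {r for r in relations
            if all(len(ts) == 1
                   for (h, r2), ts in tails.items() if r2 == r)}
```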
During inference, we first utilize the scoring function defined in Section 3.3 to evaluate each claim c ∈ C. Technically, we group all the claims by their head entities and relations. Formally, we consider the claim set with respect to a specific h and r:
C_{h,r} = {c = (h, r, t) | c ∈ C}
For each c in C_{h,r}, we calculate its perplexity with the scoring function f(h, r, t). If r ∈ O, the claim with the lowest perplexity is chosen as the truth; otherwise, the LLM-based filter strategy is applied. Formally, the set of all truths that contain 1-to-1 relations or attributes can be expressed as:
C*_O = {(h, r, t) | r ∈ O, (h, r, t) = argmin_{c ∈ C_{h,r}} f(h, r, t)}
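The selection rule above is a per-group argmin. A sketch with assumed names (`groups` maps (h, r) pairs to claim lists, `score` is any f(h, r, t)):

```python
# For each (h, r) group whose relation is 1-to-1, keep only the
# lowest-perplexity claim, i.e., the argmin in the equation above.
def truths_one_to_one(groups, one_to_one, score):
    return {min(group, key=score)
            for (h, r), group in groups.items() if r in one_to_one}
```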

3.5. Conflict Resolution with LLMs

For claims that contain non-1-to-1 relations or attributes, we first retain the claims with perplexity below a fixed threshold α, and then filter the remaining claims with LLMs. Similarly, we group all the remaining claims by their head entities and relations, and consider the set of claims C̃_{h,r} = {(h, r, t) ∈ C_{h,r} | r ∈ O′, f(h, r, t) > α}.
Prompting is the primary method of communicating with LLMs, which is conducted through text. Therefore, designing effective prompts, a process known as prompt engineering, is crucial for harnessing the capabilities of LLMs to accomplish specific tasks. To enable LLMs, which are inherently designed to process natural language, to address conflict resolution tasks, we have developed a prompt template. The template of prompts is designed manually, while the complete prompts are automatically generated by populating the template with specific data. We construct a prompt for each C ˜ h , r , which contains three parts (Figure 3): (1) task declaration, (2) demonstrations, and (3) input claims.
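A prompt builder following this three-part template might look as follows. The wording of the task declaration is our paraphrase, not the paper's exact template:

```python
# Part (1): fixed task declaration, prepended to every prompt.
TASK_DECLARATION = (
    "You will see known facts about an entity, followed by candidate claims.\n"
    "Answer with a list of True/False, one judgment per claim.\n")

# Parts (2) and (3): entity description (demonstrations) and input claims.
def build_prompt(description, claims):
    lines = [TASK_DECLARATION,
             "Known information: " + description,
             "Claims:"]
    lines += [f"({h}, {r}, {t})" for h, r, t in claims]
    return "\n".join(lines)
```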

3.5.1. Task Declaration

As LLMs were originally designed to complete dialogues, instructions about the specific task, e.g., conflict resolution, should be given in the prompt. We begin the prompt with a declaration of the task, which includes the purpose of the task, a description of the following content, and the format of the inputs. This introductory section is vital for the LLM to comprehend the specific task, adjust its response mode, and activate its relevant capabilities. Importantly, this section is fixed and is consistently included at the beginning of all constructed prompts.

3.5.2. Demonstrations

Following studies on prompt engineering [43,44,45], we provide several demonstrations to help the LLM better understand the entities in the query triples and elicit its capabilities. Specifically, as mentioned before, the input claims share the same head entity, denoted as e, and we sample several relevant triples from the KG:
D_e = {(e, r, t) | r ∈ R ∪ A, t ∈ E ∪ V, (e, r, t) ∈ T_r ∪ T_a}
During inference, we randomly sample k triples from D e as the demonstrations of the entity e. Each triple in D e includes the entity e and its relationship with another entity or an attribute value. Therefore, the LLM could acquire relevant knowledge about the entity from the given KG.
However, due to the inherent difference between triples and natural language, there is a significant gap in LLMs' ability to understand triples directly. Thus, we employ "triple translation" to convert the triples in demonstrations into natural language texts that summarize their information. Specifically, we task the LLM with generating a text description of the entity e based on k triples sampled from D_e. This description, formatted as a paragraph, encapsulates information about the entity e, as illustrated in Figure 4. Subsequently, the generated description is incorporated into the prompt to serve as a demonstration of the entity in the input claims.
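Demonstration sampling and triple translation can be sketched as follows. The LLM call is stubbed out here: in CRDL the LLM itself writes the paragraph about e, whereas this fallback merely joins the facts into sentences, so treat the helper names and behavior as assumptions:

```python
import random

# Sample k triples mentioning entity e from the KG (the set D_e above).
def sample_demonstrations(kg_triples, e, k=5, seed=0):
    pool = [tr for tr in kg_triples if tr[0] == e]
    return random.Random(seed).sample(pool, min(k, len(pool)))

# "Triple translation": turn sampled triples into a natural-language blurb.
def translate_triples(triples, llm=None):
    if llm is not None:
        return llm(triples)          # real setting: LLM writes the paragraph
    return " ".join(f"{h} {r.replace('_', ' ')} {t}." for h, r, t in triples)
```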

3.5.3. Input Claims

Finally, we give the input C̃_{h,r}, which contains a group of claims that share the same relation and head entity. In response, we ask the LLM to return a list consisting of "True" and "False", representing the corresponding judgment for each claim in C̃_{h,r}. The LLM's answers are denoted as LLM(h, r, t), where LLM(h, r, t) = 1 if the claim (h, r, t) is identified as true by the LLM; otherwise, LLM(h, r, t) = 0.
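Mapping the LLM's textual answer back onto claims yields the indicator LLM(h, r, t). The response format assumed here (a bracketed, comma-separated True/False list) is an illustration; the paper does not specify the exact parsing:

```python
# Parse a response like "[True, False]" into {claim: 0/1}, ordered as the claims.
def parse_llm_answers(claims, response):
    verdicts = [tok.strip().lower() for tok in response.strip("[] \n").split(",")]
    return {claim: 1 if v == "true" else 0
            for claim, v in zip(claims, verdicts)}
```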
Formally, the set of all truths that contain non-1-to-1 relations or attributes can be expressed as:
C*_{O′} = {(h, r, t) ∈ C_{h,r} | r ∈ O′, f(h, r, t) ≤ α} ∪ {(h, r, t) ∈ C̃_{h,r} | LLM(h, r, t) = 1}
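The union in this equation can be sketched directly; `score` and `llm_verdict` are placeholders for f and the LLM indicator:

```python
# Truths for a non-1-to-1 group: confident claims (score <= alpha) plus
# deferred claims the LLM accepts (verdict == 1).
def truths_non_one_to_one(group, score, llm_verdict, alpha=0.5):
    confident = {c for c in group if score(c) <= alpha}
    deferred = {c for c in group if score(c) > alpha and llm_verdict(c) == 1}
    return confident | deferred
```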

3.6. Overall Algorithm

The pseudo-code of the whole method is shown in Algorithm 1. The input to our model consists of a KG, a collection of claims, and a set of hyper-parameters. Following initialization, the training loss is computed using scoring functions, which subsequently update the model parameters and embeddings (Lines 5–8). Next, all relations and attributes are segmented for the subsequent conflict detection phase (Line 9). During inference, claims are grouped according to their head entities and relations (attributes), and different strategies are employed based on the type of relations (attributes). For groups with 1-to-1 relations, the claim with the lowest perplexity is selected as the truth (Lines 11–13). For other groups, claims with perplexity below a certain threshold are considered truthful, while the remaining claims are evaluated by LLMs using well-designed prompts (Lines 15–21). Ultimately, the results are integrated, and the set of truths is output (Line 23).
Algorithm 1: Algorithmic description of CRDL

4. Experiments

4.1. Experiment Settings

4.1.1. Dataset

We use a dataset created by OKELE [15], as it contains triples extracted from diverse web sources. The dataset comprises ten popular classes of entities derived from Freebase [46], with each class containing 1200 entities. In total, the dataset encompasses 191,759 triples. Detailed statistics of the dataset are presented in Table 1. Following [14], we collect at least one fact that directly pertains to each entity and one fact that indirectly connects to the entity. These collected facts constitute the given KG, while the remaining triples are considered as claims.

4.1.2. Evaluation Metrics

Following [14], we employ precision (P), recall (R), and F1-score as our evaluation metrics, with higher values indicating superior performance. In the context of evaluating the accuracy of the claim identification, the following notations are used: T P (True Positive) represents the number of claims that are true and correctly identified as true, F P (False Positive) denotes the number of claims that are false but incorrectly identified as true, F N (False Negative) indicates the number of claims that are true but incorrectly identified as false, and T N (True Negative) signifies the number of claims that are false and correctly identified as false. The performance metrics are calculated as follows:
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 / (1/Precision + 1/Recall)
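These three metrics, computed from predicted and gold truth sets, can be written as a small helper (a generic sketch, not the paper's evaluation script):

```python
# Precision, recall, and F1 over sets of predicted and gold truths.
def prf1(predicted, gold):
    tp = len(predicted & gold)                      # true positives
    fp = len(predicted - gold)                      # false positives
    fn = len(gold - predicted)                      # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)           # harmonic mean
    return precision, recall, f1
```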
Our results are derived by averaging the precision and recall values obtained from five independent runs.

4.1.3. Comparative Models

We select several representative models in conflict resolution and knowledge graph completion for comparison. We divide these competitors into two groups. The first group consists of conventional conflict-resolution methods that recognize facts from different sources but ignore the features of triples and conflict detection:
  • TruthFinder [18], which estimates source reliability and finds truth based on Bayesian analysis for data fusion.
  • Latent credibility analysis (LCA) [17], which constructs a strongly principled probabilistic model to capture the credibility of sources.
  • Latent truth model (LTM) [47], which proposes a probabilistic graphical model that can automatically infer true records and source quality without supervision.
  • Multi-truth Bayesian model (MBM) [48], which proposes an integrated Bayesian approach to the multi-truth-finding problem.
  • OKELE [15], which constructs a probabilistic graphical model to infer true facts of long-tail entities from open sources.
The second group consists of open KGC methods that only leverage knowledge of given KGs to identify facts:
  • Open world extension (OWE) [13], which presents an extension for embedding-based knowledge graph completion models, with the ability to perform open world link prediction.
  • ConMask [33], which learns embeddings of the entity’s name and parts of its text description to connect unseen entities to the KG.
  • KnowledgeVault [32], which employs supervised machine learning methods for fusing distinct information sources with prior knowledge derived from existing knowledge repositories.
  • TKGC [14], which presents a trustworthy method that exploits facts of existing KG and infers truths from open sources.
In addition, we design a simple baseline that utilizes ChatGPT to identify truths without prompt engineering. For each claim, we directly ask ChatGPT to answer whether it is true or false.

4.1.4. Implementation Details

Our experiments are conducted on a server with an Intel Core i7-12700F CPU, 32 GB memory, and an NVIDIA GeForce RTX 3090 graphics card. We use grid search to find the optimal hyper-parameters for CRDL, setting the margin γ = 5 and the threshold α = 0.5. The number of triples utilized to generate demonstrations is k = 5. All initial entity representations are generated by a pre-trained BERT; thus, the embedding dimension is d_e = 768. We use the Adam algorithm to optimize the parameters of the KGE model, with learning rate lr = 0.01 and 50 training epochs. For inference, we use the GPT-3.5-turbo API to implement the LLM-based filter. For the comparative models of the first group, we employ the implementation provided by [15] and test them in the same environment, or directly take the results of [14], whose datasets and experimental settings are the same as ours.

4.2. Overall Performance

Table 2 presents the comparative results, from which we can draw several key conclusions: (1) Our model achieves the highest F1-score among all competitors due to its integration of conflict detection and utilization of knowledge from LLMs. The incorporation of conflict detection enhances the precision of inference, whereas LLMs significantly improve the ability to recognize external triples. In contrast, baseline models fail to obtain satisfactory results due to their limited knowledge and inaccurate inference. Notably, it surpasses the recent open KGC approach, TKGC, by over 0.3 in terms of F1-score. This underscores the efficacy of incorporating conflict detection and an LLM-based filter. (2) Among the methods in the first group, OKELE effectively models long-tail distributions and assesses the quality of sources, but it neglects the semantic information within the knowledge graph. LTM achieves the poorest results, largely because it relies on the assumption of a prior distribution. Additionally, these methods struggle to handle external triples. (3) For the methods in the second group, although they can process external triples, their results are unsatisfactory due to their limited knowledge base. In addition, OWE, KnowledgeVault, and ConMask identify external facts based on the existing knowledge graph but fail to distinguish between relational facts and attribute facts, leading to inferior performances compared to TKGC. (4) The baseline results, which involve the direct use of ChatGPT, demonstrate that LLMs generally fail to comprehend the task and triples without prompt engineering. In contrast, our model comprehends the task effectively and delivers outstanding performance with appropriate demonstrations.

4.3. Ablation Study

To demonstrate the effectiveness of components in CRDL, we conducted an ablation study on the LLM-based filter and the conflict detection process.

4.3.1. LLM-Based Filter

To identify external triples, CRDL leverages an LLM to discern latent truths, drawing on the LLM’s extensive commonsense knowledge and reasoning capabilities. We developed a variant that exclusively employs conflict detection and a scoring function to address the task. Specifically, for non-1-to-1 relations and attributes, we select only those triples that surpass the threshold to be considered as truths. As shown in Table 2, the metrics for this variant, referred to as CRDL w/o LLM, exhibit a significant decline, particularly in recall. These findings demonstrate that, while a scoring function-based approach can effectively identify high-confidence triples, it fails to recognize external triples. The LLM-based filter, therefore, plays a crucial role in the identification of these external triples. The effectiveness of our approach stems from the integration of the scoring function’s precision with the LLMs’ capacity for extension.

4.3.2. Conflict Detection

Unlike other conflict-resolution methods, CRDL employs conflict detection to ascertain whether there is a single truth or multiple truths. This determination guides the subsequent recognition strategies. To assess the effectiveness of conflict detection, we also evaluated CRDL without employing conflict detection, treating all relations and attributes as non-1-to-1. As shown in Table 2, this variant, denoted as CRDL w/o CD, performs worse on several metrics compared to the original CRDL. Specifically, the precision decreases by 0.039 and the recall decreases by 0.019. These results indicate that conflict detection enhances CRDL’s ability to accurately recognize truths.

4.4. K-Shot Demonstrations

The demonstrations provide relevant information about the entities within the input triples, thereby making the number of triples utilized, denoted as k, a crucial element affecting the demonstrations’ efficacy. To investigate this impact, we assess our method across multiple k-shot scenarios, where k signifies the number of triples sampled for the demonstrations. Additionally, we perform evaluations in a zero-shot context, where no demonstrations are presented to the LLM.
As indicated by Table 3 and Figure 5, the precision remains relatively stable as k varies from 1 to 7, whereas the recall increases up to 5-shot and then declines. These results suggest that, while the number of demonstrations has a minimal effect on precision, it substantially enhances recall. In particular, a comparison between zero-shot and one-shot scenarios reveals that the introduction of demonstrations increases recall from 0.715 to 0.754, thereby confirming the effectiveness of demonstrations. However, recall starts to decrease when k increases from 5 to 7, potentially due to the introduction of excessive demonstrations that might introduce irrelevant information and disrupt the LLM’s performance.

5. Conclusions

Conflict resolution for KGs involves identifying conflicts between existing KGs and external triples extracted from open sources, ensuring the correctness of KGs after their integration with these external triples. In this paper, we introduce CRDL, an innovative approach that integrates conflict detection and LLMs to resolve conflicts in the process of knowledge fusion. Specifically, we utilize conflict detection to ensure accurate inference and employ LLMs to enhance the ability to recognize truths involving previously unseen entities. We conduct experiments to demonstrate the superiority of CRDL compared to baselines, and perform ablation studies and additional analyses to show the impact of the components within CRDL. The results show that our method effectively ensures the correctness and reliability of KGs after fusion. Consequently, our approach has the potential to benefit a wide range of KG-dependent applications, including question answering and reasoning, by improving the foundational accuracy of the KGs.
Experimental results indicate that, while our approach effectively handles external triples, it may be limited in identifying rare entities that fall outside the scope of the LLM’s knowledge. Furthermore, the utilization of LLMs requires significant computational resources and reduces inference speed. For future work, one potential direction is to improve the identification of long-tail entities, which lack sufficient relevant information and with which LLMs may be unfamiliar. Expanding CRDL to handle temporal and dynamic KGs is also a valuable direction. Additionally, we plan to construct datasets and address conflict resolution in conjunction with other knowledge fusion processes, such as entity alignment, which could mutually benefit and enhance the effectiveness of conflict resolution.

Author Contributions

Conceptualization, H.P. and W.Z.; methodology, H.P.; validation, P.Z. and W.Z.; formal analysis, P.Z.; investigation, P.Z.; resources, H.X.; data curation, H.P.; writing—original draft preparation, H.P.; writing—review and editing, P.Z. and W.Z.; supervision, W.Z. and J.T.; project administration, H.X. and J.T. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by National Key R&D Program of China No. 2022YFB3103600, NSFC under grants Nos. 62302513 and 62272469.

Data Availability Statement

The datasets could be found at https://github.com/nju-websoft/OKELE, accessed on 5 June 2024.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, Z.J. Graph Databases for Knowledge Management. IT Prof. 2017, 19, 26–32. [Google Scholar] [CrossRef]
  2. Zeng, W.; Zhao, X.; Tang, J.; Lin, X.; Groth, P. Reinforcement Learning-based Collective Entity Alignment with Adaptive Features. ACM Trans. Inf. Syst. 2021, 39, 1–31. [Google Scholar] [CrossRef]
  3. Ehrlinger, L.; Wöß, W. Towards a Definition of Knowledge Graphs. SEMANTiCS 2016, 48, 1–4. [Google Scholar]
  4. Chen, X.; Jia, S.; Xiang, Y. A review: Knowledge reasoning over knowledge graph. Expert Syst. Appl. 2020, 141, 112948. [Google Scholar] [CrossRef]
  5. Pujara, J.; Miao, H.; Getoor, L.; Cohen, W.W. Knowledge Graph Identification. In Semantic Web-ISWC 2013, Proceedings of the 12th International Semantic Web Conference, Sydney, Australia, 21–25 October 2013; Alani, H., Kagal, L., Fokoue, A., Groth, P., Biemann, C., Parreira, J.X., Aroyo, L., Noy, N.F., Welty, C., Janowicz, K., Eds.; Proceedings, Part I; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2013; Volume 8218, pp. 542–557. [Google Scholar] [CrossRef]
  6. Zeng, W.; Zhao, X.; Li, X.; Tang, J.; Wang, W. On entity alignment at scale. VLDB J. 2022, 31, 1009–1033. [Google Scholar] [CrossRef]
  7. Nguyen, H.L.; Vu, D.; Jung, J.J. Knowledge graph fusion for smart systems: A Survey. Inf. Fusion 2020, 61, 56–70. [Google Scholar] [CrossRef]
  8. Zhao, X.; Zeng, W.; Tang, J. Entity Alignment—Concepts, Recent Advances and Novel Approaches; Springer Nature: Singapore, 2023. [Google Scholar] [CrossRef]
  9. Zhao, X.; Jia, Y.; Li, A.; Jiang, R.; Song, Y. Multi-source knowledge fusion: A survey. World Wide Web 2020, 23, 2567–2592. [Google Scholar] [CrossRef]
  10. Hunter, A.; Summerton, R. Fusion Rules for Context-Dependent Aggregation of Structured News Reports. J. Appl.-Non-Class. Logics 2004, 14, 329–366. [Google Scholar] [CrossRef]
  11. Dong, X.L.; Berti-Équille, L.; Srivastava, D. Integrating Conflicting Data: The Role of Source Dependence. Proc. VLDB Endow. 2009, 2, 550–561. [Google Scholar] [CrossRef]
  12. Rekatsinas, T.; Joglekar, M.; Garcia-Molina, H.; Parameswaran, A.G.; Ré, C. SLiMFast: Guaranteed Results for Data Fusion and Source Reliability. In SIGMOD Conference 2017, Proceedings of the 2017 ACM International Conference on Management of Data, Chicago, IL, USA, 14–19 May 2017; Salihoglu, S., Zhou, W., Chirkova, R., Yang, J., Suciu, D., Eds.; ACM: New York, NY, USA, 2017; pp. 1399–1414. [Google Scholar] [CrossRef]
  13. Shah, H.; Villmow, J.; Ulges, A.; Schwanecke, U.; Shafait, F. An Open-World Extension to Knowledge Graph Completion Models. Proc. AAAI Conf. Artif. Intell. 2019, 33, 3044–3051. [Google Scholar] [CrossRef]
  14. Huang, J.; Zhao, Y.; Hu, W.; Ning, Z.; Chen, Q.; Qiu, X.; Huo, C.; Ren, W. Trustworthy Knowledge Graph Completion Based on Multi-sourced Noisy Data. In WWW’22, Proceedings of the ACM Web Conference 2022, Virtual Event, Lyon, France, 25–29 April 2022; Laforest, F., Troncy, R., Simperl, E., Agarwal, D., Gionis, A., Herman, I., Médini, L., Eds.; ACM: New York, NY, USA, 2022; pp. 956–965. [Google Scholar] [CrossRef]
  15. Cao, E.; Wang, D.; Huang, J.; Hu, W. Open Knowledge Enrichment for Long-tail Entities. In WWW’20, Proceedings of the Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; Huang, Y., King, I., Liu, T., van Steen, M., Eds.; ACM: New York, NY, USA; IW3C2: Geneva, Switzerland, 2020; pp. 384–394. [Google Scholar] [CrossRef]
  16. Li, Y.; Gao, J.; Meng, C.; Li, Q.; Su, L.; Zhao, B.; Fan, W.; Han, J. A Survey on Truth Discovery. ACM Sigkdd Explor. Newsl. 2015, 17, 1–16. [Google Scholar] [CrossRef]
  17. Pasternack, J.; Roth, D. Knowing what to Believe (when you already know something). In COLING 2010, Proceedings of the 23rd International Conference on Computational Linguistics, Beijing, China, 23–27 August 2010; Huang, C., Jurafsky, D., Eds.; Tsinghua University Press: Beijing, China, 2010; pp. 877–885. [Google Scholar]
  18. Yin, X.; Han, J.; Yu, P.S. Truth Discovery with Multiple Conflicting Information Providers on the Web. IEEE Trans. Knowl. Data Eng. 2008, 20, 796–808. [Google Scholar] [CrossRef]
  19. Li, Q.; Li, Y.; Gao, J.; Zhao, B.; Fan, W.; Han, J. Resolving conflicts in heterogeneous data by truth discovery and source reliability estimation. In SIGMOD 2014, Proceedings of the International Conference on Management of Data, Snowbird, UT, USA, 22–27 June 2014; Dyreson, C.E., Li, F., Özsu, M.T., Eds.; ACM: New York, NY, USA, 2014; pp. 1187–1198. [Google Scholar] [CrossRef]
  20. Li, Y.; Li, Q.; Gao, J.; Su, L.; Zhao, B.; Fan, W.; Han, J. On the Discovery of Evolving Truth. In ACM SIGKDD, Proceedings of the 21th International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10–13 August 2015; Cao, L., Zhang, C., Joachims, T., Webb, G.I., Margineantu, D.D., Williams, G., Eds.; ACM: New York, NY, USA, 2015; pp. 675–684. [Google Scholar] [CrossRef]
  21. Pochampally, R.; Sarma, A.D.; Dong, X.L.; Meliou, A.; Srivastava, D. Fusing data with correlations. In SIGMOD 2014, Proceedings of the International Conference on Management of Data, Snowbird, UT, USA, 22–27 June 2014; Dyreson, C.E., Li, F., Özsu, M.T., Eds.; ACM: New York, NY, USA, 2014; pp. 433–444. [Google Scholar] [CrossRef]
  22. Qi, G.; Aggarwal, C.C.; Han, J.; Huang, T.S. Mining collective intelligence in diverse groups. In WWW’13, Proceedings of the 22nd International World Wide Web Conference, Rio de Janeiro, Brazil, 13–17 May 2013; Schwabe, D., Almeida, V.A.F., Glaser, H., Baeza-Yates, R., Moon, S.B., Eds.; International World Wide Web Conferences Steering Committee: Geneva, Switzerland; ACM: New York, NY, USA, 2013; pp. 1041–1052. [Google Scholar] [CrossRef]
  23. Sarma, A.D.; Dong, X.L.; Halevy, A.Y. Data integration with dependent sources. In EDBT 2011, Proceedings of the 14th International Conference on Extending Database Technology, Uppsala, Sweden, 21–24 March 2011; Ailamaki, A., Amer-Yahia, S., Patel, J.M., Risch, T., Senellart, P., Stoyanovich, J., Eds.; ACM: New York, NY, USA, 2011; pp. 401–412. [Google Scholar] [CrossRef]
  24. Abboud, R.; Ceylan, I.; Lukasiewicz, T.; Salvatori, T. Boxe: A box embedding model for knowledge base completion. Adv. Neural Inf. Process. Syst. 2020, 33, 9649–9661. [Google Scholar]
  25. Bordes, A.; Usunier, N.; García-Durán, A.; Weston, J.; Yakhnenko, O. Translating Embeddings for Modeling Multi-relational Data. In Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, Lake Tahoe, NV, USA, 5–8 December 2013; pp. 2787–2795. [Google Scholar]
  26. Cao, Z.; Xu, Q.; Yang, Z.; Cao, X.; Huang, Q. Dual Quaternion Knowledge Graph Embeddings. Proc. AAAI Conf. Artif. Intell. 2021, 35, 6894–6902. [Google Scholar] [CrossRef]
  27. Balazevic, I.; Allen, C.; Hospedales, T.M. TuckER: Tensor Factorization for Knowledge Graph Completion. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, 3–7 November 2019; Inui, K., Jiang, J., Ng, V., Wan, X., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 5184–5193. [Google Scholar] [CrossRef]
  28. Yang, B.; Yih, W.; He, X.; Gao, J.; Deng, L. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015; pp. 1–12. [Google Scholar]
  29. Nguyen, D.Q.; Vu, T.; Nguyen, T.D.; Nguyen, D.Q.; Phung, D.Q. A Capsule Network-based Embedding Model for Knowledge Graph Completion and Search Personalization. In NAACL-HLT 2019, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; Burstein, J., Doran, C., Solorio, T., Eds.; Volume 1 (Long and Short Papers); Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 2180–2189. [Google Scholar] [CrossRef]
  30. Wang, S.; Wei, X.; dos Santos, C.N.; Wang, Z.; Nallapati, R.; Arnold, A.O.; Xiang, B.; Yu, P.S.; Cruz, I.F. Mixed-Curvature Multi-Relational Graph Neural Network for Knowledge Graph Completion. In WWW’21, Proceedings of the Web Conference 2021, Virtual Event, Ljubljana, Slovenia, 19–23 April 2021; Leskovec, J., Grobelnik, M., Najork, M., Tang, J., Zia, L., Eds.; ACM: New York, NY, USA; IW3C2: Geneva, Switzerland, 2021; pp. 1761–1771. [Google Scholar] [CrossRef]
  31. Lin, Q.; Mao, R.; Liu, J.; Xu, F.; Cambria, E. Fusing topology contexts and logical rules in language models for knowledge graph completion. Inf. Fusion 2023, 90, 253–264. [Google Scholar] [CrossRef]
  32. Dong, X.; Gabrilovich, E.; Heitz, G.; Horn, W.; Lao, N.; Murphy, K.; Strohmann, T.; Sun, S.; Zhang, W. Knowledge vault: A web-scale approach to probabilistic knowledge fusion. In ACM SIGKDD, Proceedings of the 20th International Conference on Knowledge Discovery and Data Mining, KDD’14, New York, NY, USA, 24–27 August 2014; Macskassy, S.A., Perlich, C., Leskovec, J., Wang, W., Ghani, R., Eds.; ACM: New York, NY, USA, 2014; pp. 601–610. [Google Scholar] [CrossRef]
  33. Shi, B.; Weninger, T. Open-World Knowledge Graph Completion. Proc. AAAI Conf. Artif. Intell. 2018, 32, 1957–1964. [Google Scholar] [CrossRef]
  34. Niu, L.; Fu, C.; Yang, Q.; Li, Z.; Chen, Z.; Liu, Q.; Zheng, K. Open-world knowledge graph completion with multiple interaction attention. World Wide Web 2021, 24, 419–439. [Google Scholar] [CrossRef]
  35. OpenAI. GPT-4 Technical Report. arXiv 2023, arXiv:2303.08774. [Google Scholar]
  36. Touvron, H.; Lavril, T.; Izacard, G.; Martinet, X.; Lachaux, M.; Lacroix, T.; Rozière, B.; Goyal, N.; Hambro, E.; Azhar, F.; et al. LLaMA: Open and Efficient Foundation Language Models. arXiv 2023, arXiv:2302.13971. [Google Scholar]
  37. Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; et al. A Survey of Large Language Models. arXiv 2023, arXiv:2303.18223. [Google Scholar]
  38. Pan, S.; Luo, L.; Wang, Y.; Chen, C.; Wang, J.; Wu, X. Unifying Large Language Models and Knowledge Graphs: A Roadmap. IEEE Trans. Knowl. Data Eng. 2024, 36, 3580–3599. [Google Scholar] [CrossRef]
  39. Yao, L.; Mao, C.; Luo, Y. KG-BERT: BERT for Knowledge Graph Completion. arXiv 2019, arXiv:1909.03193. [Google Scholar]
  40. Zhu, Y.; Wang, X.; Chen, J.; Qiao, S.; Ou, Y.; Yao, Y.; Deng, S.; Chen, H.; Zhang, N. LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities. arXiv 2023, arXiv:2305.13168. [Google Scholar]
  41. Wei, Y.; Huang, Q.; Zhang, Y.; Kwok, J.T. KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion. In EMNLP 2023, Proceedings of the Findings of the Association for Computational Linguistics, Singapore, 6–10 December 2023; Bouamor, H., Pino, J., Bali, K., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2023; pp. 8667–8683. [Google Scholar] [CrossRef]
  42. Wu, Y.; Wang, Z. Knowledge Graph Embedding with Numeric Attributes of Entities. In Rep4NLP@ACL 2018, Proceedings of the Third Workshop on Representation Learning for NLP, Melbourne, Australia, 20 July 2018; Augenstein, I., Cao, K., He, H., Hill, F., Gella, S., Kiros, J., Mei, H., Misra, D., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2018; pp. 132–136. [Google Scholar] [CrossRef]
  43. Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models are Few-Shot Learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar]
  44. Zhao, Z.; Wallace, E.; Feng, S.; Klein, D.; Singh, S. Calibrate Before Use: Improving Few-shot Performance of Language Models. In Proceedings of the 38th International Conference on Machine Learning, ICML 2021, Virtual Event, 18–24 July 2021; Volume 139, pp. 12697–12706. [Google Scholar]
  45. Ye, J.; Wu, Z.; Feng, J.; Yu, T.; Kong, L. Compositional Exemplars for In-context Learning. Proc. Mach. Learn. Res. 2023, 202, 39818–39833. [Google Scholar]
  46. Bollacker, K.D.; Evans, C.; Paritosh, P.K.; Sturge, T.; Taylor, J. Freebase: A collaboratively created graph database for structuring human knowledge. In ACM SIGMOD, Proceedings of the International Conference on Management of Data, Vancouver, BC, Canada, 10–12 June 2008; Wang, J.T., Ed.; ACM: New York, NY, USA, 2008; pp. 1247–1250. [Google Scholar] [CrossRef]
  47. Zhao, B.; Rubinstein, B.I.P.; Gemmell, J.; Han, J. A Bayesian Approach to Discovering Truth from Conflicting Sources for Data Integration. Proc. VLDB Endow. 2012, 5, 550–561. [Google Scholar] [CrossRef]
  48. Wang, X.; Sheng, Q.Z.; Fang, X.S.; Yao, L.; Xu, X.; Li, X. An Integrated Bayesian Approach for Effective Multi-Truth Discovery. In CIKM 2015, Proceedings of the 24th International Conference on Information and Knowledge Management, Melbourne, VIC, Australia, 19–23 October 2015; Bailey, J., Moffat, A., Aggarwal, C.C., de Rijke, M., Kumar, R., Murdock, V., Sellis, T.K., Yu, J.X., Eds.; ACM: New York, NY, USA, 2015; pp. 493–502. [Google Scholar] [CrossRef]
Figure 1. The outline of our proposal.
Figure 2. Categorization of 1-to-1 and non-1-to-1 relations.
Figure 3. The components of prompts.
Figure 4. Triple translation.
Figure 5. Precision and recall under different k-shot.
Table 1. Dataset statistics.

| Classes  | Relation Triples | Attribute Triples |
|----------|------------------|-------------------|
| actor    | 64,983           | 330               |
| album    | 5897             | 155               |
| book     | 10,766           | 499               |
| building | 2823             | 361               |
| drug     | 26,432           | 1002              |
| film     | 45,233           | 576               |
| food     | 23,041           | 842               |
| mountain | 2720             | 623               |
| ship     | 1805             | 852               |
| software | 2322             | 487               |
Table 2. Comparison of overall performance; the best results are in bold.

| Models              | Precision | Recall    | F1-Score  |
|---------------------|-----------|-----------|-----------|
| TruthFinder [18]    | 0.279     | 0.374     | 0.320     |
| LCA [17]            | 0.364     | 0.404     | 0.383     |
| LTM [47]            | 0.262     | 0.394     | 0.315     |
| MBM [48]            | 0.340     | 0.539     | 0.417     |
| OKELE [15]          | 0.459     | 0.485     | 0.472     |
| OWE [13]            | 0.351     | 0.421     | 0.383     |
| KnowledgeVault [32] | 0.385     | 0.455     | 0.417     |
| ConMask [33]        | 0.376     | 0.443     | 0.407     |
| TKGC [14]           | 0.524     | 0.491     | 0.507     |
| ChatGPT             | 0.417     | 0.477     | 0.445     |
| CRDL                | **0.959** | **0.768** | **0.853** |
| CRDL w/o LLM        | 0.911     | 0.271     | 0.418     |
| CRDL w/o CD         | 0.920     | 0.749     | 0.826     |
Table 3. Performances under different k-shot; the best results are in bold.

| k-Shot | Precision | Recall    | F1-Score  |
|--------|-----------|-----------|-----------|
| 0-shot | 0.953     | 0.715     | 0.817     |
| 1-shot | **0.960** | 0.754     | 0.844     |
| 3-shot | 0.958     | 0.755     | 0.845     |
| 5-shot | 0.959     | **0.768** | **0.853** |
| 7-shot | 0.953     | 0.751     | 0.840     |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Peng, H.; Zhang, P.; Tang, J.; Xu, H.; Zeng, W. Detect-Then-Resolve: Enhancing Knowledge Graph Conflict Resolution with Large Language Model. Mathematics 2024, 12, 2318. https://doi.org/10.3390/math12152318

