Fusing Essential Text for Question Answering over Incomplete Knowledge Base

Li, Huiying; Feng, Yuxi; Liu, Liheng

doi:10.3390/electronics14010161

by Huiying Li ¹,²,*, Yuxi Feng ² and Liheng Liu ²

¹ Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Southeast University, Nanjing 211189, China

² School of Computer Science and Engineering, Southeast University, Nanjing 211189, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(1), 161; https://doi.org/10.3390/electronics14010161

Submission received: 21 November 2024 / Revised: 24 December 2024 / Accepted: 31 December 2024 / Published: 2 January 2025

Abstract:

Knowledge base question answering (KBQA) aims to answer a question using a knowledge base (KB). However, a knowledge base is naturally incomplete, and it cannot cover all the knowledge needed to answer the question. Therefore, obtaining accurate and comprehensive answers to complex questions is difficult when KBs are missing relations and entities. To mitigate this challenge, we propose an incomplete KBQA approach based on Relation-Aware Interactive Network and Text Fusion (RAIN-TF). Specifically, we provide essential textual knowledge by finely filtering the question-related text to compensate for the missing relations and entities in the KB. We propose a question-related subgraph construction method that fuses the knowledge from the text and KB and enhances the interactions among questions, entities, and relations. On this basis, we propose a relation-aware interactive network, which is a relation-aware multi-head attention graph neural network (GNN) model, to promote the deep semantic integration of unstructured texts and structured KBs, thus effectively compensating for the lack of knowledge. Comprehensive experiments on three mainstream incomplete KBQA datasets verify the effectiveness of the proposed approach.

Keywords:

incomplete knowledge base question answering; text fusion; graph neural network; text filtering

1. Introduction

As an advanced form of the traditional search engine, the question-answering (QA) system has become a promising research field. QA systems allow users to express their queries directly with natural language questions. The system analyzes the question, automatically searches and analyzes relevant knowledge, and returns a concise and accurate answer.

An important direction in the field of QA research is knowledge base question answering (KBQA), which relies on a knowledge base (KB) to realize intelligent QA for specific domains or topics. Distinct from general unstructured texts, a KB (e.g., Freebase [1], DBPedia [2], Wikidata [3], and YAGO [4]) contains a collection of facts in the form of triples (subject, predicate, object). Through these triples, various things in the real world can be effectively organized and connected. Many studies [5,6] show that, as highly reliable, readable, and queryable knowledge representations, large-scale KBs can serve as high-quality knowledge sources for QA systems.

However, building a complete KB involves many techniques, such as knowledge extraction, knowledge representation, and knowledge fusion, which require substantial time and resources. Therefore, a knowledge base is usually incomplete [7] and cannot cover all of the knowledge needed for QA. In addition, owing to the timeliness of knowledge, the limited update rate of a KB further exacerbates its lack of knowledge. Although previous research has proposed compensating for the lack of knowledge in the QA process by supplementing the incomplete KB or fusing textual information, answering questions from an incomplete KB remains a huge challenge with missing relations or even missing entities. One area of existing research [8,9,10] adopts KB embeddings to alleviate the sparsity of the KB by performing missing link prediction. These methods typically assume that the answer nodes exist in the KB, and an incomplete KB only lacks connection edges between entities; thus, the research focus is on completing the missing edges so that the answer nodes are reachable. In fact, in an incomplete KB, in addition to missing edges, entity nodes may also be absent. If the missing nodes happen to be the answer nodes, the aforementioned methods cannot handle such situations. These methods can only compensate for missing relations between entities by using KB embedding and cannot deal with missing entities in an incomplete KB. Another type of method [11,12,13] fuses extra textual information into the incomplete KB. However, these methods may introduce noise when integrating textual information. The fusion of different knowledge sources is another problem requiring a solution. Therefore, we consider the following current problems for incomplete KBQA: (i) In an incomplete KB, not only will missing relations affect the QA performance, but missing entities will also affect the recall of answers. Determining how to incorporate essential text into the KB to assist in answering questions is crucial. 
(ii) Whether supplementing the incomplete KB with sentences as nodes [11] or augmenting entity representation with textual information [12,13], current methods for fusing different knowledge sources are inadequate. Textual information requires question-related, fine-grained fusion to enhance QA effectiveness instead of introducing noise.

In view of the above problems, we propose a three-step incomplete KBQA approach based on text fusion. The first step is to retrieve the KB subgraph and document set related to the question and, during the process of retrieving question-related documents, to innovatively use a document classifier to precisely obtain question-related documents, avoiding the problem of introducing noise during the document fusion process. The second step is to fuse the textual information with the KB subgraph, merging the entities and relations described in the documents into the KB subgraph in the form of nodes and edges to construct a question-related subgraph. This process results in fine-grained text fusion, not only compensating for the missing relations in the incomplete KB subgraph but also, more importantly, filling in the missing entities. The third step is to update the subgraph node embedding by using the proposed relation-aware multi-head attention graph neural network (GNN) and select the answers. Specifically, in the preprocessing stage, we use a pre-training model to filter documents to ensure the quality of fused text knowledge. A new method is proposed for constructing the question-related subgraph in the QA phase; this method integrates question-related text and KB information to build the subgraph and subsequently ensures the consistency of the semantic encoding space through graph representation learning. On this basis, a relation-aware interactive network model is proposed that fully integrates text knowledge and structured knowledge by using the attention mechanism of relation awareness to ultimately select the answer.

In this article, our contributions are as follows:

1. To reduce the noise from documents unrelated to the question during text fusion, we apply a pre-training language model to filter documents, which provides a high-quality document collection for the next process and improves the incomplete KBQA results.

2. To realize the fusion of different knowledge sources, we propose a question-related subgraph construction method that fuses text into the KB subgraph and a relation-aware multi-head attention GNN that updates all nodes in the subgraph synchronously.

3. We conducted comprehensive experiments on three commonly used incomplete KBQA benchmarks. Our approach outperforms all the existing incomplete KBQA methods based on text fusion in terms of Hits@1 values.

2. Related Work

Two mainstream methods used in the KBQA field are introduced in this section, namely, semantic parsing and information retrieval. We also introduce related QA methods for incomplete KBs.

2.1. KBQA Methods Based on Semantic Parsing

The KBQA method based on semantic parsing converts the question into a formal query that the computer can understand. The computer then obtains the answer by querying the KB and performing a reasoning process.

Yih et al. [14] transformed the question into a query graph and then queried the KB to obtain the answer. They proposed a staged query graph generation (STAGG) strategy, which utilizes a CNN to enhance relation detection during the process. Zheng et al. [15] put forward a decomposition–execution–connection strategy, which decomposes a complex question into several simple sub-questions; then, it transforms each sub-question into a logical form, generates an intermediate answer, and jointly obtains the final answer to the question. This strategy effectively reduces the search space. Maheshwari et al. [16] proposed a ranking model that exploits the characteristics of query graphs and uses self-attention and skip connections to explicitly compare each predicate in a query graph with the question. Sun et al. [17] proposed a frame grammar to represent the high-level structure of complex questions to reduce the error propagation from syntactic analysis to downstream semantic analysis. Zhang et al. [18] applied the structural information of the question to reinforcement-based path reasoning, and the efficiency of path reasoning was improved by distinguishing different knowledge paths through an attention mechanism.

The KBQA method based on semantic parsing aims to generate logical expressions corresponding to questions, thus making the reasoning process more interpretable. However, when the question has complex semantics and syntax, understanding the question becomes extremely difficult, and the search space for generating logical expressions increases dramatically. In addition, such methods usually need a large number of manually labeled question–logical expression pairs, which limits the realization of such methods in practical applications.

2.2. KBQA Methods Based on Information Retrieval

The KBQA method based on information retrieval retrieves the question-specific subgraph, ranks the entities based on their semantic similarity to the question, and selects the entity with the highest score as the answer.

Dong et al. [19] proposed a method to realize automatic QA by using a multi-column convolutional neural network. This method does not need manual features and rules but uses question interpretation to train the network and word vectors through multitask learning. Xu et al. [20] proposed an update strategy and stop strategy to improve the traditional key–value memory network model, thus strengthening the reasoning ability and memory utilization efficiency when solving complex questions. Qiu et al. [21] proposed a model that learns a step perception representation by transforming the initial question representation with a single-layer perceptron so that the reasoning model perceives the reasoning step. On this basis, He et al. [22] proposed a method that uses a dynamic attention mechanism to focus on the different parts of the question. This method generates an attention distribution on the question words and updates the instruction vector based on the step-aware question representation and previous reasoning instructions. Zhang et al. [23] proposed a trainable subgraph retriever that is decoupled from the subsequent reasoning process; accordingly, a plug-and-play framework can enhance any subgraph-oriented KBQA model.

The KBQA method based on information retrieval aims to obtain information from questions, retrieve relevant candidate entities in the KB, and then obtain the final answer by sorting the candidate entities. This type of method does not depend on the logical form or semantic analysis model and is suitable for end-to-end training. However, these methods require further investigation to answer complex questions.

2.3. Incomplete KBQA

Because building a KB is expensive and difficult, existing KBs are usually incomplete [7]. Researchers use auxiliary information or link prediction to supplement or complete the KB, respectively, when conducting incomplete KBQA.

Das et al. [24] applied a universal schema to jointly embed KB facts and text into a uniform structured representation, allowing the interleaved propagation of information. Sun et al. [11] determined answers by conducting reasoning on a question-specific graph. They complemented the graph with extra question-relevant sentences as nodes and reasoned with the augmented heterogeneous graph. On this basis, PullNet [25] was proposed as an improved strategy, which uses an iterative process to construct a question-specific subgraph. In each iteration, a graph convolution neural network is used to identify subgraph nodes that should be expanded in the corpus or KB by using a “pull” operation; after obtaining the complete subgraph, another graph convolution neural network is used to extract the answer from the subgraph. Instead of directly adding sentences to the question-specific graph as nodes, Xiong et al. [12] and Han et al. [13] fused extra textual information encoded as vectors into the entity representation. Saxena et al. [8] utilized pre-trained KB embeddings to address the incomplete KB issue. Ding et al. [26] proposed a global normalized graph attention network to solve the problem of noise information, which is irrelevant to the question, introduced by local normalization when aggregating neighbor entity information. A coarse-grained to fine-grained text reader was proposed to capture the potential relation information and entity reference representation in the text. In the answer prediction stage, to generate the entity representation of the perceived question, a dual attention mechanism is added to strengthen the interaction between the question and the entity representation. Sun et al. [27] proposed a question embedding method that is logically faithful to the traditional logical reasoning system and can better handle complex questions with an incomplete KB. Ren et al. 
[28] proposed a model including entity linking, relation pruning, and branch pruning, which can be used to answer questions based on first-order logic. Saxena et al. [29] proposed a generative KBQA method that regards the link prediction of the KB as a sequence-to-sequence task and exchanges the triple scoring method used in the previous knowledge graph embedding method with autoregressive decoding; this process consequently reduces the model scale. Guo et al. [9] proposed a reasoning model that fuses neighbor interaction and a relation recognition module for incomplete KBQA. Ye et al. [10] proposed a model for multi-hop KGQA under a new few-shot setting that considers insufficient training samples, as well as incomplete knowledge graph edges but complete knowledge graph nodes.

For incomplete KBQA methods that only use link prediction to complete the KB, improving the QA effect is difficult. For methods that supplement knowledge sources with auxiliary information, the following problems are also anticipated: noise introduction and the fact that the KB and text are not in the same semantic space, which will affect the QA result.

3. Proposed Approach

This section first introduces the preliminaries. Then, the general framework of RAIN-TF is explained. Next, we explain the three stages of the proposed framework, namely, question-related KB and text retrieval, question-related subgraph construction, and answer selection based on a relation-aware multi-head attention GNN.

3.1. Preliminaries

Given a question $q = \{w_{1}, w_{2}, \dots, w_{L}\}$, an incomplete KB $K = (E, R, T)$, and a corpus $C = \{c_{1}, c_{2}, \dots, c_{n}\}$, where $E$ and $R$ represent the entity and relation sets in the KB, respectively, and $T$ represents the set of triples $(s, r, o)$ with relation $r \in R$ between subject $s \in E$ and object $o \in E$, the incomplete KBQA task retrieves the answer entities $A_{q} = \{a_{1}, \dots, a_{m}\}$ for the question $q$ from a candidate set including all KB and text entities.

3.2. Framework

Figure 1 illustrates the overall architecture of the RAIN-TF framework. The answer set is obtained through the following three steps: question-related KB and text retrieval, question-related subgraph construction, and answer selection. In the process of question-related KB and text retrieval, we retrieve the question-related KB subgraph $K_{q}$ and document set $D$ from the KB and corpus and further select the set $D_{fine} \subseteq D$, which contains documents that are more relevant to the question according to a document classifier. To construct the question-related subgraph $G_{q}$, we add edges between the topic entity (the entity mentioned by the question) and other entities in $K_{q}$, add the tokens in the question $q$ to $K_{q}$ as nodes, incorporate the entities mentioned in $D_{fine}$ into the subgraph $K_{q}$ to compensate for the missing entities and relations in the incomplete KB, and transform the relation edges in $K_{q}$ into nodes. After the graph $G_{q}$ is constructed, the entity nodes in $G_{q}$ can be regarded as candidate answers to the question $q$. In the process of answer selection, we apply the proposed relation-aware multi-head attention GNN model to learn the vector representation of nodes in $G_{q}$ and finally predict whether the entity nodes in the graph $G_{q}$ are the answers.

The main innovations of the RAIN-TF framework include accurate question-related text filtering, a subgraph construction strategy for text fusion, and the relation-aware multi-head attention GNN model, which we describe in Section 3.3, Section 3.4, and Section 3.5, respectively.

3.3. Question-Related KB and Text Retrieval

To improve the effect of incomplete KBQA, accurately retrieving question-related information from a large-scale KB and corpus is necessary. For KB retrieval, we utilize the existing Personalized PageRank-based retrieval method to retrieve the question-related subgraph. For text retrieval, we propose a new method that includes text filtering, which improves the accuracy of question-related text retrieval.

3.3.1. Question-Related KB Retrieval

To retrieve the question-related entities from the KB, we first represent the set of topic entities in the question as $E_{t} = \{e \mid e \in q\}$. Then, Personalized PageRank (PPR) [30] is run on the KB subgraph within a fixed number of hops around the topic entities to retrieve candidate entities that may be answers to the question. The edge weight around $E_{t}$ is distributed evenly among all edges of the same type, and the edge types are weighted so that edges related to the question receive higher weights than unrelated edges. Specifically, we calculate the relation vector $v_{r}$ as the average of the word vectors in the surface form of the relation $r$ and the question vector $v_{q}$ as the average of the word vectors in the question $q$, and we take the cosine similarity between these vectors as the edge weight. Finally, we keep the top $m$ entities $e_{1}, \dots, e_{m}$ with the highest PPR scores and all the edges between them to construct the question-related KB subgraph $K_{q}$. This process narrows down the answer search space from the entire KB to a manageable subgraph.
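The retrieval step above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the damping factor of 0.8, the treatment of edges as undirected, the splitting of a relation's surface form on underscores, and the per-node normalization of edge weights are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean_vector(words, word_vecs):
    """Average the word vectors of the words that have one."""
    vecs = [word_vecs[w] for w in words if w in word_vecs]
    if not vecs:
        return None
    return [sum(v[k] for v in vecs) / len(vecs) for k in range(len(vecs[0]))]

def relation_weights(question_words, relations, word_vecs):
    """Edge weight = cosine(v_r, v_q); non-positive values are clipped (assumption)."""
    v_q = mean_vector(question_words, word_vecs)
    weights = {}
    for r in relations:
        v_r = mean_vector(r.split("_"), word_vecs)  # surface form split: assumption
        sim = cosine(v_r, v_q) if v_r and v_q else 0.0
        weights[r] = max(sim, 1e-6)
    return weights

def personalized_pagerank(triples, topic_entities, rel_weight, damping=0.8, iters=50):
    """Power iteration with restarts concentrated on the topic entities."""
    topics = set(topic_entities)
    out = defaultdict(list)
    nodes = set(topics)
    for s, r, o in triples:
        w = rel_weight.get(r, 1e-6)
        out[s].append((o, w))
        out[o].append((s, w))  # edges treated as undirected: assumption
        nodes.update((s, o))
    restart = {n: (1.0 / len(topics) if n in topics else 0.0) for n in nodes}
    score = dict(restart)
    for _ in range(iters):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            total = sum(w for _, w in out[n])
            if total == 0.0:  # isolated node: send its mass back to the topics
                for t in topics:
                    nxt[t] += damping * score[n] / len(topics)
                continue
            for nbr, w in out[n]:
                nxt[nbr] += damping * score[n] * w / total
        score = nxt
    return score

def retrieve_subgraph(triples, topic_entities, rel_weight, m):
    """Keep the top-m entities by PPR score and every triple between them."""
    scores = personalized_pagerank(triples, topic_entities, rel_weight)
    top = set(sorted(scores, key=scores.get, reverse=True)[:m])
    kept = [(s, r, o) for s, r, o in triples if s in top and o in top]
    return top, kept
```

In this sketch, a relation whose surface form is close to the question in embedding space receives a larger share of the probability mass leaving an entity, so entities reached through question-relevant relations rank higher.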

3.3.2. Question-Related Text Retrieval and Filtering

Following [11], we use Wikipedia as the corpus $C$ and retrieve texts at the sentence level; that is, the retrieved documents are defined along sentence boundaries. The retrieval of question-related documents is divided into two steps. First, following classical QA systems, we narrow the search space to documents that are likely to be relevant by using the weighted bag-of-words model of [31] to retrieve the five most relevant articles from Wikipedia. Then, a Lucene index is populated with the sentences in these articles, and the top-ranked sentences are retrieved on the basis of the words in the question to obtain the document set $D = \{d_{1}, \dots, d_{|D|}\}$. The final number of question-related documents is far smaller than the size of the original corpus, that is, $|D| \ll |C|$.

To screen candidate documents that are closely related to the semantics of the question and obtain a higher-quality document set $D_{fine} \subseteq D$, we propose a document filtering model. In this model, the pre-training language model BERT is used to jointly encode the question and the document, and the scores of all documents in $D$ are calculated through linear layers. Then, the set $D_{fine}$ of documents with scores higher than the threshold is obtained. We then have the question $q = \{w_{1}, w_{2}, \dots, w_{L}\}$ and the candidate document $d_{i} \in D$, where $d_{i} = \{w_{d_{i}}^{1}, w_{d_{i}}^{2}, \dots, w_{d_{i}}^{|L_{d}|}\}$.

First, the question $q$ and document $d_{i}$ are merged into a sequence and input into the BERT model for encoding. This joint encoding method simultaneously considers the interaction between the question and the document, thereby obtaining a more accurate vector representation. Specifically, given the question $q$, document $d_{i}$ is concatenated before $q$ and separated by the separator $\text{[SEP]}$. At the same time, $\text{[CLS]}$ and $\text{[SEP]}$ flags are added to the head and tail of the sequence, giving $x_{input} = \{\text{[CLS]}, w_{d_{i}}^{1}, w_{d_{i}}^{2}, \dots, w_{d_{i}}^{|L_{d}|}, \text{[SEP]}, w_{1}, w_{2}, \dots, w_{L}, \text{[SEP]}\}$. Next, placeholders are used to replace entity mentions in the document and question to avoid introducing noise. After word replacement, the new sequence $\tilde{x}_{input}$ is used as the input of the pre-training language model BERT to obtain the contextual representation of each word, and the encoded output of the sequence header flag $\text{[CLS]}$ is used as the contextual representation $h_{concat} \in \mathbb{R}^{d}$ of the entire input sequence. Then, for each $d_{i} \in D$, $h_{concat}$ is mapped by using the linear layer and normalized to obtain the probability score $p_{i}$. According to the comparison with the set threshold $\epsilon_{score}$, documents with a score greater than $\epsilon_{score}$ are output to $D_{fine}$. This method can make better use of the interactive information between documents and questions and can also better handle a document and question with similar semantics but different manifestations so as to obtain more accurate document filtering results.
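The filtering pipeline described above can be sketched as follows. This is only an illustration under stated assumptions: the placeholder token `[ENT]`, the single-token entity mentions, the sigmoid used as the normalization, and the `encode` callable standing in for BERT's $\text{[CLS]}$ output are all hypothetical choices, not details given in the text.

```python
import math

def build_input(doc_tokens, question_tokens, entity_mentions, placeholder="[ENT]"):
    """Build [CLS] doc [SEP] question [SEP] with entity mentions masked.

    Entity mentions are assumed to be single tokens here for simplicity.
    """
    def mask(tokens):
        return [placeholder if t in entity_mentions else t for t in tokens]
    return ["[CLS]"] + mask(doc_tokens) + ["[SEP]"] + mask(question_tokens) + ["[SEP]"]

def probability(h_concat, weights, bias):
    """Linear layer followed by a sigmoid (the normalization is an assumption)."""
    z = sum(w * h for w, h in zip(weights, h_concat)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def filter_documents(question_tokens, docs, entity_mentions, encode,
                     weights, bias, threshold):
    """Return D_fine: the documents whose score exceeds the threshold.

    `encode` is any callable mapping a token sequence to the vector h_concat;
    in the described model this role is played by BERT's [CLS] output.
    """
    kept = []
    for doc in docs:
        sequence = build_input(doc, question_tokens, entity_mentions)
        p = probability(encode(sequence), weights, bias)
        if p > threshold:
            kept.append(doc)
    return kept
```

Passing the encoder as a parameter keeps the sketch independent of any particular BERT library; a real system would plug in a fine-tuned encoder and learned `weights` and `bias`.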

3.4. Question-Related Subgraph Construction

To better utilize the GNN model for message passing on the question-related subgraph, we construct the question-related subgraph $G_{q}$ based on the question-related KB subgraph $K_{q}$ through four steps. Figure 2 illustrates the process.

First, the topic entity node is connected with other non-adjacent entity nodes in $K_{q}$ to shorten the distance between them and accelerate message passing from the topic entity to other entities. Second, the tokens in the question $q$ are added as nodes to the subgraph, enabling the question to play a more fine-grained role in the process of message passing. Third, the information in the document is fused into the question-related subgraph to compensate for the missing entities and relations in the incomplete KB. Finally, the relation edges in $K_{q}$ are transformed into relation nodes, which highlights the importance of relations in message passing and can guide the information to spread in a more question-relevant direction.

Connect the topic entity node with other entity nodes: In the subgraph $K_{q}$ retrieved from the incomplete KB, some entities and relations related to the question will be missing. Adding a specific $\mathit{entity}$–$\mathit{entity}$ edge between the topic entity node and all other non-adjacent entity nodes in $K_{q}$ increases the connectivity of the subgraph, which is helpful in directly transferring the topic entity message to other entities. The subgraph in Figure 2b is obtained by performing this step on the question-related KB subgraph shown in Figure 2a, where the orange edge in the figure is the added $\mathit{entity}$–$\mathit{entity}$ edge. Through this operation, there is a direct edge between the topic entity “Cristiano Ronaldo” and the KB entity “Portugal national team”, accelerating the process of reaching the answer entity from the topic entity.

Add question token nodes: For the question to play a more fine-grained role in the message-passing process, the question is divided into tokens, and each token is added to the subgraph as a node. A $\mathit{token}$–$\mathit{token}$ edge is added between any two tokens to ensure that the message passing between them is not limited by the distance in the question. In addition, when a question token is matched to an entity or relation name in $K_{q}$, $\mathit{token}$–$\mathit{entity}$–$M$ and $\mathit{token}$–$\mathit{relation}$–$M$ edges are added between the node of the matched token and the node of the entity or relation, where $M$ can be $\mathit{ExactMatch}$ or $\mathit{PartialMatch}$. After using the NLTK tool (https://www.nltk.org (accessed on 12 May 2024)) to tokenize the question, we follow the rule of exact matching over partial matching to determine matching entities or relations for the subsequences of the question token sequence. When a subsequence has exactly the same name as an entity or relation, an exact match is considered to occur between the tokens in the subsequence and the entity or relation. When the number of matching tokens exceeds two-thirds, the tokens in the subsequence are considered partial matches with the entity or relation. Figure 2c shows the subgraph obtained by performing this step, where the green vertices represent the question tokens, the green edges represent the $\mathit{token}$–$\mathit{token}$ edges, and the purple edges represent the $\mathit{token}$–$\mathit{entity}$–$M$ edges.
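The matching rule can be sketched as below. The text does not state what the two-thirds fraction is measured against, so this sketch assumes it is the share of the entity or relation name's tokens that appear in the question; it also links only the overlapping question tokens for a partial match and lowercases before comparing. Function names and the edge-label strings are hypothetical.

```python
from itertools import combinations

def token_token_edges(num_tokens):
    """A token-token edge between every pair of question token positions."""
    return [(i, "token-token", j) for i, j in combinations(range(num_tokens), 2)]

def token_match_edges(question_tokens, names, kind):
    """Link question tokens to entity/relation names, preferring exact matches.

    names: dict mapping a node id to the token list of its name.
    kind:  "entity" or "relation" (used only to label the edge type).
    """
    q = [t.lower() for t in question_tokens]
    edges = set()
    for node_id, name_tokens in names.items():
        name = [t.lower() for t in name_tokens]
        k = len(name)
        if k == 0:
            continue
        # Exact match: a contiguous question span identical to the name.
        start = next((i for i in range(len(q) - k + 1) if q[i:i + k] == name), None)
        if start is not None:
            for i in range(start, start + k):
                edges.add((i, f"token-{kind}-ExactMatch", node_id))
            continue
        # Partial match: more than two-thirds of the name's tokens are covered
        # (the denominator is an assumption, see the lead-in).
        hits = [i for i, t in enumerate(q) if t in name]
        if len({q[i] for i in hits}) / k > 2 / 3:
            for i in hits:
                edges.add((i, f"token-{kind}-PartialMatch", node_id))
    return edges
```

For example, calling `token_match_edges` with the tokens of "who coached the portugal national team" and a name tokenized as `["Portugal", "national", "football", "team"]` covers three of the four name tokens, which this sketch treats as a partial match.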

Perform text fusion: In the text fusion stage, the co-occurrence relation between entity mentions in the document $d \in D_{fine}$ is used to compensate for the missing entities and relations in the incomplete KB subgraph $K_{q}$. Specifically, we use a fixed-size window to extract the co-occurring entity pairs in the document and use the document context information and an attention mechanism to learn a representation vector of the specific co-occurrence relation between each pair of entities. Suppose that $(e_{1}, r_{co}, e_{2})$ contains an entity pair and their specific co-occurrence relation obtained from a document $d$. If $e_{1} \notin E_{K_{q}}$ and $e_{2} \notin E_{K_{q}}$, where $E_{K_{q}}$ is the node set of the subgraph $K_{q}$, then $e_{1}$ and $e_{2}$ and their corresponding co-occurrence relation are added to $K_{q}$. If only one of $e_{1}$ and $e_{2}$ belongs to $E_{K_{q}}$, then the entity that does not belong to $E_{K_{q}}$ and the co-occurrence relation are added to $K_{q}$. If $e_{1} \in E_{K_{q}}$ and $e_{2} \in E_{K_{q}}$, but no edge exists between $e_{1}$ and $e_{2}$, a co-occurrence relation is added between $e_{1}$ and $e_{2}$. Figure 2d shows the subgraph obtained by text fusion, where the yellow vertex represents the entity supplemented by document $d_{1}$. In this example, the entity “Real Madrid” is an answer to the question $q$, but it does not exist in the previously constructed question-related KB subgraph; rather, it exists in the supplementary document $d_{1}$. Since this entity co-occurs with the topic entity within a window in the document, it is added to the subgraph, compensating for the missing entities in the incomplete KB.
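The three fusion cases can be condensed into a short sketch. It covers only the graph bookkeeping: the learned representation of each co-occurrence relation is replaced by a plain label, the window is measured in token positions, and all names are hypothetical.

```python
def cooccurring_pairs(mentions, window):
    """Entity pairs whose mentions lie within `window` token positions.

    mentions: list of (token_position, entity) tuples sorted by position.
    """
    pairs = []
    for a in range(len(mentions)):
        pos_a, ent_a = mentions[a]
        for b in range(a + 1, len(mentions)):
            pos_b, ent_b = mentions[b]
            if pos_b - pos_a >= window:
                break  # later mentions are even farther away
            if ent_a != ent_b:
                pairs.append((ent_a, ent_b))
    return pairs

def fuse_text(nodes, edges, pairs, relation="co_occurrence"):
    """Merge co-occurring entity pairs into the KB subgraph.

    Case 1: neither entity is in the subgraph -> add both and the relation.
    Case 2: exactly one is in the subgraph    -> add the other and the relation.
    Case 3: both are in it but unconnected    -> add only the relation.
    Pairs that are already connected are left untouched.
    """
    nodes, edges = set(nodes), set(edges)
    connected = {frozenset((s, o)) for s, _, o in edges}
    for e1, e2 in pairs:
        pair = frozenset((e1, e2))
        if e1 in nodes and e2 in nodes and pair in connected:
            continue
        nodes.update((e1, e2))
        edges.add((e1, relation, e2))
        connected.add(pair)
    return nodes, edges
```

Because adding an entity that is already present is a no-op on a set, the three cases in the text reduce to one rule here: always add the pair and its co-occurrence edge unless both entities are present and already connected.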

Transform edges into relation nodes: To highlight the importance of relations in message passing, for the relation edges $(s, r, o)$ in the subgraph, where $s$ and $o$ represent the nodes of the head entity and the tail entity, respectively, and $r$ represents either a KB relation or a co-occurrence relation, we further transform the relation edge into a relation node and establish a connection edge between the entity and relation nodes. According to the direction from the head entity to the tail entity, an $\mathit{entity}$–$\mathit{relation}$ edge is added between the head entity node and the relation node, and a $\mathit{relation}$–$\mathit{entity}$ edge is added between the relation node and the tail entity node. The subgraph obtained by performing this step is shown in Figure 2e, where the blue vertices represent the KB relations, the gray vertices represent the co-occurrence relation, and the gray edges represent the $\mathit{entity}$–$\mathit{relation}$ or $\mathit{relation}$–$\mathit{entity}$ edges.
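A minimal sketch of this transformation is given below. It creates one relation node per triple; whether relation nodes are shared between triples that carry the same relation is not stated in the text, so that choice, like the tuple-based node identifiers, is an assumption.

```python
def edges_to_relation_nodes(triples):
    """Turn each (s, r, o) edge into a relation node with two typed edges.

    Returns (nodes, edges), where every edge is (source, edge_type, target).
    One relation node is created per triple (assumption).
    """
    nodes = set()
    edges = []
    for index, (s, r, o) in enumerate(triples):
        head, tail = ("entity", s), ("entity", o)
        rel_node = ("relation", index, r)
        nodes.update((head, tail, rel_node))
        # Direction follows head entity -> relation -> tail entity.
        edges.append((head, "entity-relation", rel_node))
        edges.append((rel_node, "relation-entity", tail))
    return nodes, edges
```

After this step every relation has its own vector in the graph, so it can take part in message passing in the same way as entities and question tokens.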

Construct the question-related subgraph: The question-related subgraph constructed through the above four steps is defined as $G_{q} = (V_{q}, E_{q})$, where $V_{q}$ contains entity nodes, question token nodes, and relation nodes, that is, $V_{q} = E_{q} \cup Q \cup R_{q}$, where $E_{q}$ contains the entity set $E_{K_{q}}$ in the KB subgraph and the entity set $E_{D_{fine}}$ supplemented by documents. The types of edges in $E_{q}$ include $\mathit{entity}$–$\mathit{entity}$, $\mathit{token}$–$\mathit{token}$, $\mathit{token}$–$\mathit{entity}$–$M$, $\mathit{token}$–$\mathit{relation}$–$M$, $\mathit{entity}$–$\mathit{relation}$, and $\mathit{relation}$–$\mathit{entity}$.

3.5. Relation-Aware Multi-Head Attention GNN Model

After the question-related subgraph is constructed, the entity nodes in the subgraph can be regarded as candidate answers to the question. To predict whether the entity node in the subgraph is the answer, it is necessary to conduct representation learning and classify the nodes in the graph. Since the edges in the question-related subgraph have six predefined types, when aggregating node information, we need to consider not only the information of neighboring nodes but also the relation label information between the nodes. Traditional graph neural networks like graph convolutional network (GCN) and graph attention network (GAT) models do not account for the relation labels between nodes. Therefore, we propose a relation-aware interactive network, which can be considered a type of heterogeneous GNN model. Specifically, it is a relation-aware multi-head attention GNN model for learning the vector representations of nodes.

The proposed relation-aware multi-head attention GNN model includes the following five modules: unified encoding module, graph attention module, information aggregation module, regularization module, and answer selection module.

Unified encoding module: This module encodes the nodes in the question-related subgraph, including entity nodes, relation nodes, and question token nodes. Then, it maps the different types of nodes into the same latent factor space. For an entity node, $\text{[CLS]}$ and $\text{[SEP]}$ are attached to the head and tail of its word sequence, respectively, and the sequence is input into the pre-training language model BERT to obtain the context embedding. The pooled output of $\text{[CLS]}$ is used as the initial representation of the entity node. The initial representation of relation nodes is obtained in the same way. For a question token node, the question is input into BERT, and the hidden representation corresponding to each token is output as the initial representation of the token node.

Graph attention module: For a node in the question-related subgraph, the graph attention module calculates the attention value of each of its neighbor nodes. Specifically, for the question-related subgraph $G_{q} = (V_{q}, E_{q})$, the input set is $X_{i} = \{x_{i}\}_{i=1}^{|V_{q}|}$, where $x_{i} \in \mathbb{R}^{d}$ is the vector representation of node $v_{i}$. The attention formulas are as follows:

$$\tilde{\alpha}_{ij}^{(h)} = \frac{x_{i} W_{Q}^{(h)} \left(x_{j} W_{K}^{(h)} + r_{ij}^{K}\right)^{T}}{\sqrt{d_{z} / H}} \tag{1}$$

$$\alpha_{ij}^{(h)} = \mathrm{softmax}_{j}\left(\tilde{\alpha}_{ij}^{(h)}\right) \tag{2}$$

where $r_{ij}^{K} \in \mathbb{R}^{d / H}$ is the predefined vector representation of the edge from node $v_{i}$ to $v_{j}$, and $W_{Q}^{(h)}, W_{K}^{(h)} \in \mathbb{R}^{d \times (d / H)}$ are the parametric weight matrices. When calculating the attention values, we consider not only neighbor nodes but also the types of connecting edges. First, each node in the question-related subgraph is mapped to two new vectors (query and key) by the weight matrices $W_{Q}^{(h)}$ and $W_{K}^{(h)}$. Then, each node in the subgraph is regarded as a query, and the attention weight is calculated with all the node keys in the subgraph. Specifically, the key vector $x_{j} W_{K}^{(h)}$ of the neighbor node $v_{j}$ is first combined with the edge vector $r_{ij}^{K}$ by addition, as in Equation (1). Then, the correlation score $\tilde{\alpha}_{ij}^{(h)}$ with the query vector $x_{i} W_{Q}^{(h)}$ of the current node $v_{i}$ is calculated. Finally, the normalized weights $\alpha_{ij}^{(h)}$ of all neighboring nodes of node $v_{i}$ are obtained using the softmax function. In addition, $1 \leq h \leq H$ indexes the heads in multi-head attention. Each head captures different features and provides the node with further information on its neighbor nodes.
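As a minimal illustration of Equations (1) and (2), the following Python sketch computes the attention weights of one node over its neighbors for a single head. The helper names (`matvec`, `softmax`, `attention_weights`), the toy vectors, and the `scale` argument (which stands in for the denominator of Equation (1)) are assumptions made for illustration and are not taken from the authors' implementation.

```python
import math

def matvec(x, W):
    """Multiply a row vector x (length d) by a matrix W (d rows, k columns)."""
    return [sum(xa * row[b] for xa, row in zip(x, W)) for b in range(len(W[0]))]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(x_i, neighbors, edge_keys, W_Q, W_K, scale):
    """Equations (1)-(2) for one head: weights of node i over its neighbors."""
    query = matvec(x_i, W_Q)
    raw = []
    for x_j, r_ij in zip(neighbors, edge_keys):
        # The edge vector is added to the neighbor's key before the dot product.
        key = [k + r for k, r in zip(matvec(x_j, W_K), r_ij)]
        raw.append(sum(q * k for q, k in zip(query, key)) / scale)
    return softmax(raw)

# Toy example (hypothetical values): identity projections, two neighbors.
identity = [[1.0, 0.0], [0.0, 1.0]]
x_i = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
no_edges = [[0.0, 0.0], [0.0, 0.0]]
with_edge = [[0.0, 0.0], [1.0, 0.0]]  # edge vector raises the second neighbor's score

print(attention_weights(x_i, neighbors, no_edges, identity, identity, 1.0))
print(attention_weights(x_i, neighbors, with_edge, identity, identity, 1.0))
```

In the toy example, the edge vector alone changes how much attention the second neighbor receives, which is the effect that the relation-aware term in Equation (1) is meant to provide.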

Information aggregation module: The information aggregation module updates the vector representation of the current node by aggregating its neighbor node information. The information aggregation formulas are as follows:

$$e_{i}^{(h)} = \sum_{j=1}^{n} \alpha_{ij}^{(h)} \left(x_{j} W_{V}^{(h)} + r_{ij}^{V}\right) \tag{3}$$

$$e_{i} = \mathrm{Concat}\left(e_{i}^{(1)}, \ldots, e_{i}^{(H)}\right) \tag{4}$$

where $x_{j} \in \mathbb{R}^{d}$ is the vector representation of node $v_{j}$, node $v_{j}$ is one of the neighbor nodes of $v_{i}$, the total number of neighbor nodes of $v_{i}$ (including $v_{i}$ itself) is $n$, $W_{V}^{(h)} \in \mathbb{R}^{d \times (d / H)}$ is the parametric weight matrix, and $r_{ij}^{V} \in \mathbb{R}^{d / H}$ is the predefined vector representation of the edge from node $v_{i}$ to $v_{j}$. The edge vector representation from $v_{i}$ to $v_{j}$ is $r_{ij}^{K} = r_{ij}^{V} = \mathrm{Concat}\left(\rho_{ij}^{(1)}, \ldots, \rho_{ij}^{(R)}\right)$, where $R$ is the number of edge types in the subgraph. If the edge from $v_{i}$ to $v_{j}$ is of type $R^{(s)}$, then $\rho_{ij}^{(s)}$ is the vector representation of the edge type $R^{(s)}$; otherwise, $\rho_{ij}^{(s)}$ is a zero vector of the appropriate size. In this paper, each node in the subgraph is mapped to a value vector by the weight matrix $W_{V}^{(h)}$, and the value vector of the neighbor node $v_{j}$ is combined by addition with the edge vector $r_{ij}^{V}$ from node $v_{i}$ to node $v_{j}$, as in Equation (3). The weighted sum over all the neighbor nodes is calculated for each head, and the outputs of the $H$ heads are concatenated to form the output $e_{i} \in \mathbb{R}^{d}$.
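Equations (3) and (4) and the block-wise construction of the edge vector can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names (`edge_vector`, `aggregate_head`, `concat_heads`), the toy dimensions, and the attention weights passed in are hypothetical, and the same head output is reused twice only to show the concatenation step.

```python
def matvec(x, W):
    """Multiply a row vector x (length d) by a matrix W (d rows, k columns)."""
    return [sum(xa * row[b] for xa, row in zip(x, W)) for b in range(len(W[0]))]

def edge_vector(edge_type, type_embeddings):
    """Concat(rho^(1), ..., rho^(R)): only the block of the active edge type
    carries its embedding; every other block is a zero vector."""
    blocks = []
    for s, embedding in enumerate(type_embeddings):
        blocks.extend(embedding if s == edge_type else [0.0] * len(embedding))
    return blocks

def aggregate_head(alphas, neighbors, edge_values, W_V):
    """Equation (3): attention-weighted sum of (x_j W_V + r_ij^V) for one head."""
    dim = len(W_V[0])
    e_i = [0.0] * dim
    for alpha, x_j, r_ij in zip(alphas, neighbors, edge_values):
        value = matvec(x_j, W_V)
        for b in range(dim):
            e_i[b] += alpha * (value[b] + r_ij[b])
    return e_i

def concat_heads(head_outputs):
    """Equation (4): concatenate the per-head outputs into one vector."""
    return [v for head in head_outputs for v in head]

# Toy example (hypothetical values): R = 2 edge types, one dimension per type.
type_embeddings = [[1.0], [2.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
edge_values = [edge_vector(0, type_embeddings), edge_vector(1, type_embeddings)]

head = aggregate_head([0.5, 0.5], neighbors, edge_values, identity)
e_i = concat_heads([head, head])
print(e_i)
```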

Regularization module: The regularization layer standardizes the feature vectors so that their mean is 0 and their variance is 1, which eliminates the scale difference between feature vectors while maintaining their relative distance. This process helps accelerate the convergence of the model and improves its training stability and generalization ability. In this paper, the following regularization formulas are used to standardize the feature vectors:

$$\tilde{h}_{i} = \mathrm{FC}\left(\mathrm{ReLU}\left(\mathrm{FC}(e_{i})\right)\right) \tag{5}$$

$$h_{i} = \mathrm{LayerNorm}\left(\tilde{h}_{i}\right) \tag{6}$$

where $\mathrm{ReLU}$ is the activation function, $\mathrm{FC}$ is a fully connected layer, and $\mathrm{LayerNorm}$ is layer normalization, which serves as the regularization layer. By applying two fully connected layers with a ReLU activation between them to $e_{i}$, the intermediate vector $\tilde{h}_{i}$ is obtained, which contains a higher-level representation of node $v_{i}$. Next, the intermediate vector $\tilde{h}_{i}$ is normalized by using $\mathrm{LayerNorm}$, and the final output $h_{i}$ is obtained.
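The two steps of Equations (5) and (6) can be illustrated with a small Python sketch. It is a simplified illustration rather than the authors' code: the function names and toy weights are assumptions, and the learnable gain and bias that layer normalization usually carries are omitted.

```python
import math

def linear(x, W, b):
    """Fully connected layer x W + b, with W given as d rows of k columns."""
    return [sum(xa * row[k] for xa, row in zip(x, W)) + b[k] for k in range(len(b))]

def relu(x):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in x]

def layer_norm(x, eps=1e-5):
    """Standardize x to zero mean and unit variance (gain and bias omitted)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def regularization_module(e_i, W1, b1, W2, b2):
    """Equations (5)-(6): FC -> ReLU -> FC, followed by LayerNorm."""
    h_tilde = linear(relu(linear(e_i, W1, b1)), W2, b2)
    return layer_norm(h_tilde)

# Toy example (hypothetical values): identity weights and zero biases.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zeros = [0.0, 0.0, 0.0]
h_i = regularization_module([1.0, 2.0, 3.0], identity, zeros, identity, zeros)
print(h_i)
```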

Answer selection module: To predict whether an entity node in the subgraph is the answer, we connect a linear layer after the regularization module to calculate the probability score of each entity node. The following formula is used to calculate the probability score:

$$s_{i} = \sigma\left(w^{T} h_{i} + b\right) \tag{7}$$

where $w$ and $b$ are the weight and bias of the linear layer, respectively, and $\sigma$ represents the sigmoid activation function. Whether an entity is an answer to the question is determined by comparing the score of its node with a set threshold.
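A minimal Python sketch of Equation (7) and the thresholding step is given below. The threshold value of 0.5, the entity identifiers, and the function names are assumptions made for illustration; the paper does not state the threshold that was used.

```python
import math

def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))

def score_entities(node_vectors, w, b):
    """Equation (7): s_i = sigmoid(w^T h_i + b) for every entity node."""
    return {
        entity: sigmoid(sum(wk * hk for wk, hk in zip(w, h)) + b)
        for entity, h in node_vectors.items()
    }

def select_answers(scores, threshold=0.5):
    """Keep entities whose score exceeds the (assumed) threshold."""
    return [entity for entity, s in scores.items() if s > threshold]

# Toy example (hypothetical node representations h_i).
node_vectors = {"entity_a": [2.0, 0.0], "entity_b": [-2.0, 0.0]}
scores = score_entities(node_vectors, w=[1.0, 0.0], b=0.0)
print(select_answers(scores))
```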

Model training: In the model training process, we adopt the binary cross-entropy loss function, which is often used in classification tasks, to measure the gap between the model output and the real value. For each entity node, the cross-entropy between its prediction probability and the real label is taken as the value of the loss function. The loss function values of all entity nodes are averaged to obtain the final loss function value of the model:

$$\mathrm{Loss} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_{i} \cdot \log(p_{i}) + (1 - y_{i}) \cdot \log(1 - p_{i}) \right] \tag{8}$$

where $N$ is the number of entities, and $y_{i}$ and $p_{i}$ represent the target probability and prediction probability of the $i$-th entity being the answer, respectively.
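For concreteness, Equation (8) can be written as a short Python function. This is an illustrative sketch: the function name and the small clipping constant `eps`, which only guards against taking the logarithm of zero, are assumptions and are not part of the paper.

```python
import math

def bce_loss(labels, probs, eps=1e-12):
    """Equation (8): mean binary cross-entropy over N entity nodes."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return -total / len(labels)

# Toy example: an uninformative prediction versus a confident correct one.
print(bce_loss([1, 0], [0.5, 0.5]))
print(bce_loss([1, 0], [0.9, 0.1]))
```

The first call gives the loss of a model that assigns probability 0.5 to every entity, which equals ln 2; the second is lower because both predictions lean toward the correct labels.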

4. Experiments

This section evaluates the proposed RAIN-TF approach, details the incomplete KBQA datasets, and compares the results of RAIN-TF with existing approaches.

4.1. Experimental Settings

We introduce the three KBQA datasets used in the experiments, the hyper-parameter settings, and the compared baselines.

4.1.1. Datasets

We evaluated the performance of our proposed RAIN-TF approach on three benchmark KBQA datasets: WebQuestionsSP [32], Complex WebQuestions [33], and WikiMovies-10K [11]. Table 1 summarizes the statistics of the datasets, and their details are provided below.

WebQuestionsSP (WQSP): WQSP is an improved version of the WebQuestions KBQA dataset, as it removes some ambiguous or unclear questions and provides the SPARQL query for each question. The KB for this dataset is a subset of Freebase, which contains all of the facts within two hops of any entity mentioned in WQSP, including 164.6 million facts and 24.9 million entities. Wikipedia is used as a text corpus, and entity mentions have been linked to entities in the KB.

Complex WebQuestions 1.1 (CWQ): CWQ is a large-scale KBQA dataset released in 2018. It was generated from WQSP by extending the question entity or adding constraints to the answer, with the aim of constructing more complex multi-hop questions. The CWQ dataset is more challenging than the WQSP dataset.

WikiMovies-10K: WikiMovies-10K is a subset randomly extracted from the WikiMovies dataset, in which the training set, validation set, and test set each contain 10,000 questions. We used the KB and the Wikipedia-based text corpus published in [34].

We selected 10%, 30%, and 50% of the facts from the KB related to the dataset and used them as incomplete KBs in the experiments.
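One simple way to construct such incomplete KBs is to sample a fixed fraction of the triples. The sketch below assumes uniform random sampling with a fixed seed; the paper states only that a percentage of the facts was selected, so the sampling scheme, the function name, and the toy triples are assumptions.

```python
import random

def sample_incomplete_kb(facts, fraction, seed=0):
    """Return a random subset containing the given fraction of the facts."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    keep = round(len(facts) * fraction)
    return rng.sample(facts, keep)

# Toy KB of (subject, relation, object) triples with hypothetical identifiers.
facts = [("s%d" % i, "r", "o%d" % i) for i in range(10)]
for fraction in (0.1, 0.3, 0.5):
    print(fraction, len(sample_incomplete_kb(facts, fraction)))
```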

4.1.2. Parameter Setting

In the experiment, the text embedding is initialized by BERT-BASE-UNCASED, which is a 12-layer and 12-head attention encoder based on the Transformer architecture.

The hyper-parameters of the document filtering model are set as follows: the encoder output dimension is 768; the multi-layer perceptron has three layers, an input dimension of 1536 for joint encoding, a hidden layer dimension of 512, and an output layer dimension of 128; the fully connected layer has an input dimension of 128 and an output dimension of 2; the gradient clipping threshold is set to 0.6; the dropout probability between sub-layers is set to 0.1; the maximum length of the document is set to 50; the Adam optimization algorithm is adopted to adjust the model parameters, and the learning rate is $5 \times 10^{-5}$; and the number of training iterations is set to five.

The hyper-parameters of the relation-aware multi-head attention GNN model are set as follows: the encoder output dimension is 768; the size of all hidden vectors in the attention sublayer is 512, and the number of attention heads is set to eight; the gradient clipping threshold is set to 1; and the Adam optimization algorithm is used to adjust the model parameters, with a learning rate of $5 \times 10^{-5}$. The batch size is 16, and the number of iteration rounds is 200.
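The GNN settings listed above can be collected in a small configuration object, shown here as a hedged Python sketch. The class name and field names are hypothetical; only the values come from the text. The derived per-head size assumes that the hidden size of 512 is split evenly across the eight heads.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GNNConfig:
    """Hyper-parameters of the relation-aware GNN as reported in the text."""
    encoder_dim: int = 768        # BERT encoder output dimension
    hidden_size: int = 512        # hidden vectors in the attention sublayer
    num_heads: int = 8            # attention heads
    grad_clip: float = 1.0        # gradient clipping threshold
    learning_rate: float = 5e-5   # Adam learning rate
    batch_size: int = 16
    num_rounds: int = 200         # number of iteration rounds
    optimizer: str = "adam"

    @property
    def head_dim(self) -> int:
        """Per-head dimension, assuming an even split of the hidden size."""
        return self.hidden_size // self.num_heads

config = GNNConfig()
print(config.head_dim)
```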

4.1.3. Baselines

We compared our approach with the Key-Value Memory Network [34], GraftNet [11], and PullNet [25] for the CWQ and WikiMovies-10K datasets. For the WQSP dataset, we also included SG-KA [12], HGCN [13], and GTFIN [26] in the comparison. Additionally, to verify the effectiveness of our approach when only using an incomplete KB for QA, we omitted the text fusion step and compared the approach (named RAIN-KB) with the following incomplete KBQA approaches without external resources: EmbedKGQA [8], LEGO [28], and KGT5 [29].

Key-Value Memory Network (KVMem): KVMem is an end-to-end incomplete KBQA approach that augments the KB with text. It maintains a memory table that stores KB facts and text encoded into key–value pairs and uses this for retrieval.

GraftNet: GraftNet regards the document as a special type of node, integrates it with the KB at an early stage, and uses a variant of the GNN to perform reasoning. GN-KB is a version with a KB-only setting. GN-LF (late fusion) regards the KB and text as two independent graphs and then integrates the answer scores, whereas GN-EF (early fusion) regards the KB and text as a single heterogeneous graph. GN-EF+LF is an ensemble of the GN-EF and GN-LF models.

PullNet: PullNet learns to pull facts and sentences from the data to create a more relevant question-specific subgraph. It uses a graph CNN approach to perform reasoning.

SG-KA: SG-KA puts forward two components to infer the entity representation in the KB and dynamically fuses the entity representation learned in the KB into text through a conditional gating mechanism.

HGCN: HGCN considers the text as a super-edge connecting the entities in the text and uses a hypergraph convolution network for reasoning.

GTFIN: GTFIN designs a global normalized graph attention network (GGAT) to reduce the influence of noise information unrelated to the question on the final answer prediction and proposes a coarse-to-fine text reader (CFReader), which not only uses the relation information but also obtains a representation of entity mentions in the text to enhance the incomplete KB.

EmbedKGQA: EmbedKGQA uses KB embeddings for the QA task. It trains the KB entity embeddings, uses them to learn question embeddings, and scores entity–question pairs to select the answer.

LEGO: LEGO consists of a latent space executor and a query synthesizer, iteratively synthesizing and executing the query in the embedding space to identify the answer entities.

KGT5: KGT5 is an encoder–decoder Transformer model. It is trained on the link prediction task. KGT5 can also be fine-tuned for the QA task to deal with incomplete KBQA.

For a consistent comparison with related work, we report Hits@1, which is the accuracy of the top-predicted answer from the model, and the F1 score.
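The two metrics can be computed per question as in the following Python sketch. It assumes that the model returns a ranked list of candidate entities for Hits@1 and a predicted answer set for F1, and that per-question values are averaged over the test set; the function names and toy answers are hypothetical.

```python
def hits_at_1(ranked_predictions, gold_answers):
    """1.0 if the top-ranked prediction is a gold answer, else 0.0."""
    if not ranked_predictions:
        return 0.0
    return 1.0 if ranked_predictions[0] in gold_answers else 0.0

def f1_score(predicted, gold):
    """Set-based F1 between the predicted and gold answer sets."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy question with two gold answers and two predicted answers.
gold = {"a", "c"}
print(hits_at_1(["a", "b"], gold))
print(f1_score({"a", "b"}, gold))
```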

4.2. Experimental Results and Analysis

4.2.1. Overall Performance

Table 2 presents the results on the WQSP dataset. The experimental results of the compared methods are from the corresponding original literature. We selected 10%, 30%, and 50% of the facts from the KB related to the dataset and used them as incomplete KBs when conducting the experiments.

We evaluated the performance of our proposed approach under the conditions of using only a KB (RAIN-KB) and using both a KB and text (RAIN-TF). In the upper part of Table 2, we compare RAIN-KB to state-of-the-art models that are specifically designed and tuned for incomplete KBQA using only a KB. In the lower part of Table 2, we compare RAIN-TF to state-of-the-art models under the setting of KB plus text. The experimental results show that RAIN-KB and RAIN-TF generally obtain the best or comparable results on the WQSP dataset. When using only a KB, the Hits@1 result of RAIN-KB on 30% KB (35.8) is lower than LEGO's (38.0), because LEGO can iteratively process the missing information in the KB and dynamically build a query tree, while RAIN-KB statically builds a question-related subgraph. However, the Hits@1 result of RAIN-KB on 50% KB (51.4) exceeds those of LEGO (48.5) and the other methods, showing that the proposed subgraph construction method and GNN model can achieve the best results in this setting without text fusion. When using both the KB and text, the proposed method achieves the best Hits@1 results on all incomplete KBs. This finding shows that the proposed text fusion method can improve the effectiveness of incomplete KBQA. The F1 values of RAIN-KB and RAIN-TF are lower than those of GTFIN on 50% KB (37.1 versus 37.4 and 38.5 versus 39.8, respectively). The reason may be that the global normalized graph in GTFIN plays a role in reducing noise information unrelated to the question.

For the CWQ dataset, we only provide the experimental results for the 50% KB setting to maintain consistency with the compared methods. The left side of Table 3 shows the Hits@1 results of the models using only the KB, while the right side shows the performance of the models when both the KB and text are used. The results marked with ∗ are reproduced from [25], and other experimental results are from the corresponding original literature. The experimental results show that the proposed RAIN-KB and RAIN-TF achieved the best results on 50% KB, which verifies the effectiveness of the proposed approach on a complex dataset.

Table 4 shows the experimental results on the WikiMovies-10K dataset with 10% KB, 30% KB, and 50% KB. RAIN-KB and RAIN-TF achieved the best results under all three incomplete KB settings, which verifies their effectiveness on simple datasets.

4.2.2. Ablation Experiments

To explore the influence of each module of the RAIN-TF framework proposed in this paper on performance, five ablation experiments were designed, namely, removing the document filtering model, removing connections between the topic entity node and other entity nodes, removing question token nodes, removing text fusion, and removing relation-edge transformation in the construction of question-related subgraphs.

Removing the document filtering model: To explore the influence of the document filtering model on performance, we removed the document classifier and directly integrated all the documents obtained from the question-related text retrieval.

Removing connections between the topic entity node and other entity nodes: To explore the effect of adding edges between the topic entity node and indirectly connected entity nodes, this experiment removed the edges between the topic entity and other entities.

Removing question token nodes: To explore the improvement in the model attributable to the question token nodes compared with a single whole-question node, the question as a whole, instead of its tokens, is regarded as a node in the process of constructing the question-related subgraph.

Removing text fusion: In this experiment, the step of obtaining the entity co-occurrence relation from the document and adding it to the question-related subgraph is removed, and only the document is added to the subgraph as a node.

Removing relation-edge transformation: To explore the improvement in the proposed method attributable to the transformation of relations into nodes, this experiment replaced relation nodes with relation edges represented by fixed vectors.

Table 5 shows the Hits@1 results on the three datasets for the 50% incomplete KBQA task in the case of text fusion. The results of the ablation experiments show that the performance on all datasets decreased significantly after removing the document filtering classifier, which shows that the document classifier can filter out higher-quality question-related documents, thus improving QA performance. After removing the connections between the topic entity and other entities, the performance of incomplete KBQA showed significant degradation on the WQSP and CWQ datasets, decreasing by 2.9% and 3.0%, respectively. This finding verifies that under the condition of an incomplete KB, shortening the information dissemination distance between the topic entity and other entities is extremely important for improving performance. The other ablation results also verify that the performance degraded to varying degrees after removing question token nodes, text fusion, and relation-edge transformation. These results show that the document filtering model and the question-related subgraph construction method proposed in this paper each play a role in improving the QA performance with an incomplete KB, regardless of whether the QA dataset is simple or complex.

4.2.3. Performance on Different Question Complexity

To further analyze the performance of the proposed approach on complex questions, we define the complexity of a question as the number of hops from the topic entity to the answer entity: the greater the number of hops, the greater the complexity of the question. We performed experiments on the WQSP and CWQ datasets. For the WQSP dataset, we used the number of relation chains as the hop number of the question. Given that the CWQ dataset does not provide the relation chain, we referred to the work in [35] and used the triple number as the hop number.

Figure 3a shows the Hits@1 results of our RAIN-TF model for questions of varying difficulty on the WQSP dataset under the setting of 50% KB. The experimental results indicate that the RAIN-TF model that fuses text outperforms the RAIN-KB model for both simple and complex questions, namely, one-hop and two-hop questions. Additionally, the results in Figure 3b on the CWQ dataset indicate that the performance of both the RAIN-TF and RAIN-KB models declines as the number of triples increases (that is, with increasing question difficulty), and complex questions are indeed more difficult to answer. However, the performance of the RAIN-TF model decreases less sharply with increasing question difficulty than that of the RAIN-KB model, indicating that the text fusion method performs better on complex questions.

4.2.4. Impact of Hyper-Parameter Settings

For the proposed attention GNN model, the number of model layers is a very important hyper-parameter that directly affects the performance of the model. To determine the influence of the number of model layers, we conducted experiments on the WQSP and CWQ datasets under the conditions of 50% KB and text fusion. Considering the complexity of the datasets, we used the relation-aware interactive network with one, two, three, and four layers for training and testing on each of the two datasets.

Figure 4a shows the trend of Hits@1 results with different numbers of model layers on the WQSP dataset. The experimental results show that the model with the best performance is the three-layer relation-aware interactive network, whose Hits@1 reaches 53.8%, which is 1.7% and 0.9% better than the results of the one-layer and four-layer models, respectively. In addition, the performance of RAIN-TF is better than that of the ablation models at every number of layers. This finding confirms that each component of the model, especially the topic entity connections and the document classifier, contributes significantly to the performance of the model. Figure 4b shows the trend of Hits@1 results with different numbers of model layers on the CWQ dataset. The experimental results show that the model that uses the three-layer relation-aware interactive network achieves the best performance, reaching 33.9%, which is 1.4% and 1.2% better than the results of the two-layer and one-layer models, respectively.

In general, choosing an appropriate number of layers for the relation-aware interactive network matters on both datasets and can noticeably improve the performance of the incomplete KBQA model.

During the text fusion process, the window size for extracting co-occurring entities from the documents is also an important hyper-parameter. We tested the impact of different window sizes on the performance of the QA model under the setting of 50% KB and text fusion, setting the window size to each value in {10, 20, 30, 40, 50} and evaluating the performance of the RAIN-TF model on the WQSP and CWQ datasets. The experimental results in Figure 5 indicate that setting a larger window size (30–50) improves QA performance. The optimal window size is 40 for the WQSP dataset and 50 for the CWQ dataset; because CWQ is an extension of WQSP and has more complex questions, larger window sizes are more effective on it.
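The role of the window size can be illustrated with a small Python sketch that extracts entity pairs whose mentions lie within a given token distance. This is one plausible reading of the window and not the authors' procedure: the distance criterion, the function name, and the toy mention positions are assumptions.

```python
def cooccurring_pairs(mentions, window):
    """Return entity pairs whose mentions are at most `window` tokens apart.

    mentions: list of (token_position, entity_id), sorted by position.
    """
    pairs = set()
    for i, (pos_a, ent_a) in enumerate(mentions):
        for pos_b, ent_b in mentions[i + 1:]:
            if pos_b - pos_a > window:
                break  # later mentions are even farther away
            if ent_a != ent_b:
                pairs.add(tuple(sorted((ent_a, ent_b))))
    return pairs

# Toy document with three entity mentions at hypothetical token positions.
mentions = [(0, "A"), (5, "B"), (40, "C")]
for window in (10, 40):
    print(window, sorted(cooccurring_pairs(mentions, window)))
```

With the small window only the nearby pair is linked, whereas the larger window also connects the distant entity, which is consistent with the observation that multi-hop questions benefit from larger windows.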

5. Conclusions

In this paper, we propose an incomplete KBQA approach, RAIN-TF, based on text fusion. RAIN-TF uses a document classifier based on joint encoding and a linear layer to finely filter the question-related documents to reduce noise interference. RAIN-TF fuses the potential knowledge in the text into the incomplete KB subgraph to compensate for the missing entities and relations and adds question tokens and relations as nodes to the subgraph, thus enhancing the interactions among questions, entities, and relations. Finally, a relation-aware interactive network is proposed, and the vector representation of entity nodes is learned in the question-related subgraph of the fused text. Then, the answer is selected. Experiments were carried out on three mainstream incomplete KBQA datasets, and the results show that the proposed RAIN-TF approach outperforms all of the existing incomplete KBQA methods based on text fusion.

6. Limitations

The RAIN-TF approach adopts a question-related retrieval method for KBs and texts, which is straightforward and efficient, but may result in missed retrieval of answer entities. Therefore, adopting an iterative method to retrieve KB entities and fuse texts is one of our future research directions.

Author Contributions

Conceptualization, H.L. and Y.F.; methodology, Y.F.; data curation, Y.F.; writing—original draft preparation, Y.F.; writing—review and editing, H.L. and L.L.; funding acquisition, H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under grant numbers 52178034 and 61502095.

Data Availability Statement

Data will be made available on reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Bollacker, K.; Evans, C.; Paritosh, P.; Sturge, T.; Taylor, J. Freebase: A collaboratively created graph database for structuring human knowledge. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Amsterdam, The Netherlands, 30 June–5 July 2008; pp. 1247–1249.
[2] Lehmann, J.; Isele, R.; Jakob, M.; Jentzsch, A.; Kontokostas, D.; Mendes, P.N.; Hellmann, S.; Morsey, M.; Van Kleef, P.; Auer, S.; et al. DBpedia-a large-scale, multilingual knowledge base extracted from wikipedia. Semant. Web 2015, 6, 167–195.
[3] Tanon, T.P.; Vrandečić, D.; Schaffert, S.; Steiner, T.; Pintscher, L. From freebase to wikidata: The great migration. In Proceedings of the 25th International World Wide Web Conference, Montreal, QC, Canada, 11–15 April 2016; pp. 1419–1428.
[4] Suchanek, F.M.; Kasneci, G.; Weikum, G. Yago: A core of semantic knowledge. In Proceedings of the 16th International World Wide Web Conference, Banff, AB, Canada, 8–12 May 2007; pp. 697–706.
[5] Diefenbach, D.; Lopez, V.; Singh, K.; Maret, P. Core techniques of question answering systems over knowledge bases: A survey. Knowl. Inf. Syst. 2018, 55, 529–569.
[6] Lan, Y.; He, G.; Jiang, J.; Zhao, W.X.; Wen, J.R. Complex knowledge base question answering: A survey. arXiv 2022, arXiv:2108.06688v5.
[7] Min, B.; Grishman, R.; Wan, L.; Wang, C.; Gondek, D. Distant supervision for relation extraction with an incomplete knowledge base. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Atlanta, Georgia, 9–14 June 2013; pp. 777–782.
[8] Saxena, A.; Tripathi, A.; Talukdar, P.P. Improving multi-hop question answering over knowledge graphs using knowledge base embeddings. In Proceedings of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 4498–4507.
[9] Guo, Q.; Wang, X.; Zhu, Z.; Liu, P.; Xu, L. A knowledge inference model for question answering on an incomplete knowledge graph. Appl. Intell. 2023, 53, 7634–7646.
[10] Ye, X.; Xiao, L.; Zhang, C.; Yamasaki, T. E-ReaRev: Adaptive Reasoning for Question Answering over Incomplete Knowledge Graphs by Edge and Meaning Extensions. In Proceedings of the International Conference on Applications of Natural Language to Information Systems, Turin, Italy, 25–27 June 2024; pp. 85–95.
[11] Sun, H.; Dhingra, B.; Zaheer, M.; Mazaitis, K.; Salakhutdinov, R.; Cohen, W.W. Open domain question answering using early fusion of knowledge bases and text. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 4231–4242.
[12] Xiong, W.; Yu, M.; Chang, S.; Guo, X.; Wang, W.Y. Improving question answering over incomplete KBs with knowledge-aware reader. arXiv 2019, arXiv:1905.07098.
[13] Han, J.; Cheng, B.; Wang, X. Open domain question answering based on text enhanced knowledge graph with hyperedge infusion. In Proceedings of the Findings of the Association for Computational Linguistics Findings of ACL, Toronto, ON, Canada, 9–14 July 2020; pp. 1475–1481.
[14] Yih, S.W.; Chang, M.W.; He, X.; Gao, J. Semantic parsing via staged query graph generation: Question answering with knowledge base. In Proceedings of the Joint Conference of the 53rd Annual Meeting of the ACL and the 7th International Joint Conference on Natural Language Processing of the AFNLP, Beijing, China, 26–31 July 2015; pp. 1321–1331.
[15] Zheng, W.; Yu, J.X.; Zou, L.; Cheng, H. Question answering over knowledge graphs: Question understanding via template decomposition. Proc. VLDB Endow. 2018, 11, 1373–1386.
[16] Maheshwari, G.; Trivedi, P.; Lukovnikov, D.; Chakraborty, N.; Fischer, A.; Lehmann, J. Learning to rank query graphs for complex question answering over knowledge graphs. In Proceedings of the 18th International Semantic Web Conference, Auckland, New Zealand, 26–30 October 2019; pp. 487–504.
[17] Sun, Y.; Zhang, L.; Cheng, G.; Qu, Y. SPARQA: Skeleton-based semantic parsing for complex questions over knowledge bases. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 8952–8959.
[18] Zhang, J.; Zhang, L.; Hui, B.; Tian, L. Improving complex knowledge base question answering via structural information learning. Knowl.-Based Syst. 2022, 242, 108252.
[19] Dong, L.; Wei, F.; Zhou, M.; Xu, K. Question answering over freebase with multi-column convolutional neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 26–31 July 2015; pp. 260–269.
[20] Xu, K.; Lai, Y.; Feng, Y.; Wang, Z. Enhancing key-value memory neural networks for knowledge based question answering. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Mexico City, Mexico, 16–21 June 2019; pp. 2937–2947.
[21] Qiu, Y.; Wang, Y.; Jin, X.; Zhang, K. Stepwise reasoning for multi-relation question answering over knowledge graph with weak supervision. In Proceedings of the 13th International Conference on Web Search and Data Mining, Houston, Texas, USA, 3–7 February 2020; pp. 474–482.
[22] He, G.; Lan, Y.; Jiang, J.; Zhao, W.X.; Wen, J.R. Improving multi-hop knowledge base question answering by learning intermediate supervision signals. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining, Online, 8–12 March 2021; pp. 553–561.
[23] Zhang, J.; Zhang, X.; Yu, J.; Tang, J.; Tang, J.; Li, C.; Chen, H. Subgraph retrieval enhanced model for multi-hop knowledge base question answering. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; pp. 5773–5784.
[24] Das, R.; Zaheer, M.; Reddy, S.; McCallum, A. Question answering on knowledge bases and text using universal schema and memory networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, BC, Canada, 30 July–4 August 2017; pp. 358–365.
[25] Sun, H.; Bedrax-Weiss, T.; Cohen, W.W. PullNet: Open domain question answering with iterative retrieval on knowledge bases and text. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Hong Kong, China, 3–7 November 2019; pp. 2380–2390.
[26] Ding, Y.; Rao, Y.; Yang, F.P. Graph-based KB and text fusion interaction network for open domain question answering. In Proceedings of the 2021 International Joint Conference on Neural Networks, Virtual, 18–22 July 2021; pp. 1–8.
[27] Sun, H.; Arnold, A.; Bedrax Weiss, T.; Pereira, F.; Cohen, W. Faithful embeddings for knowledge base queries. Adv. Neural Inf. Process. Syst. 2020, 33, 22505–22516.
[28] Ren, H.; Dai, H.; Dai, B.; Chen, X.; Yasunaga, M.; Sun, H.; Schuurmans, D.; Leskovec, J.; Zhou, D. Lego: Latent execution-guided reasoning for multi-hop question answering on knowledge graphs. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 8959–8970.
[29] Saxena, A.; Kochsiek, A.; Gemulla, R. Sequence-to-sequence knowledge graph completion and question answering. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Bangkok, Thailand, 11–16 August 2022; pp. 2814–2828.
[30] Haveliwala, T.H. Topic-sensitive pagerank. In Proceedings of the 11th international conference on World Wide Web, Honolulu, Hawaii, 7 May 2002; pp. 517–526.
[31] Chen, D.; Fisch, A.; Weston, J.; Bordes, A. Reading wikipedia to answer open-domain questions. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, BC, Canada, 30 July–4 August 2017; pp. 1870–1879.
[32] Yih, W.; Richardson, M.; Meek, C.; Chang, M.W.; Suh, J. The value of semantic parse labeling for knowledge base question answering. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, 7–12 August 2016; pp. 201–206.
[33] Talmor, A.; Berant, J. The web as a knowledge-base for answering complex questions. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Mexico City, Mexico, 16–21 June 2018; pp. 641–651.
[34] Miller, A.; Fisch, A.; Dodge, J.; Karimi, A.H.; Bordes, A.; Weston, J. Key-value memory networks for directly reading documents. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–5 November 2016; pp. 1400–1409.
[35] Chen, Y.; Li, H.; Qi, G.; Wu, T.; Wang, T. Outlining and filling: Hierarchical query graph generation for answering complex questions over knowledge graphs. IEEE Trans. Knowl. Data Eng. 2022, 35, 8343–8357.

Figure 1. Architecture of the RAIN-TF framework.

Figure 2. Construction process of the question-related subgraph.

Figure 3. Hits@1 results on questions of different difficulty. (a) Hits@1 results on WQSP dataset. (b) Hits@1 results on CWQ dataset.

Figure 4. Hits@1 results with different numbers of model layers. (a) Hits@1 results on WQSP dataset. (b) Hits@1 results on CWQ dataset.

Figure 5. Hits@1 results with different document window sizes.

Table 1. Statistics of all datasets.

Dataset	Train	Dev	Test	Doc
WQSP	2848	250	1639	235,567
CWQ	27,639	3519	3531	802,573
WikiMovies-10K	10,000	10,000	10,000	79,728

Table 2. Hits@1 and F1 results on WQSP dataset. Bold indicates the best result.

Model	10% KB Hits@1	10% KB F1	30% KB Hits@1	30% KB F1	50% KB Hits@1	50% KB F1
KB only
KVMem	12.5	4.3	25.8	13.8	33.3	21.3
GN-KB	15.5	6.5	34.9	20.4	47.7	34.3
PullNet	-	-	-	-	50.3	-
SG-KA	17.1	7.0	35.9	20.2	49.2	33.5
HGCN	18.3	7.9	35.2	21.0	49.3	34.3
GTFIN	19.1	8.3	36.4	21.3	51.1	37.4
EmbedKGQA	-	-	31.4	-	42.5	-
LEGO	-	-	38.0	-	48.5	-
KGT5	-	-	-	-	50.5	-
RAIN-KB (ours)	19.3	8.4	35.8	21.4	51.4	37.1
KB + text
KVMem	24.6	14.4	27.0	17.7	32.5	23.6
GN-LF	29.8	17.0	39.1	25.9	46.2	35.6
GN-EF	31.5	17.7	40.7	25.2	49.9	34.7
GN-EF+LF	33.3	19.3	42.5	26.7	52.3	37.4
PullNet	-	-	-	-	51.9	-
SG-KA	33.6	18.9	42.6	27.1	52.7	36.1
HGCN	33.7	19.9	42.8	27.5	52.8	37.1
GTFIN	35.5	21.9	44.2	28.2	53.6	39.8
RAIN-TF (ours)	36.7	23.1	45.1	28.9	53.8	38.5
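
As a reading aid for the Hits@1 and F1 columns, the following is a minimal sketch of how these two metrics are commonly computed per question in incomplete-KBQA evaluations. It assumes the usual convention (Hits@1 checks whether the top-ranked candidate is a gold answer; F1 compares the predicted answer set with the gold answer set); the function names `hits_at_1` and `f1_score` are illustrative and are not taken from the authors' code.

```python
from typing import Sequence, Set


def hits_at_1(ranked: Sequence[str], gold: Set[str]) -> float:
    """Return 1.0 if the top-ranked candidate is a gold answer, else 0.0."""
    return 1.0 if ranked and ranked[0] in gold else 0.0


def f1_score(predicted: Set[str], gold: Set[str]) -> float:
    """Set-level F1 between predicted and gold answers (assumed convention)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical question with one gold answer and two predicted answers.
example_hits = hits_at_1(["Paris", "Lyon"], {"Paris"})
example_f1 = f1_score({"Paris", "Lyon"}, {"Paris"})
```

Dataset-level scores such as those in the tables would then be the mean of the per-question values, expressed as percentages.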

Table 3. Hits@1 results on CWQ dataset. Bold indicates the best result.

Model (KB only)	Hits@1, 50% KB	Model (KB + text)	Hits@1, 50% KB
KVMem *	14.8	KVMem *	15.2
GN-KB *	26.1	GN-EF *	26.9
PullNet	31.5	PullNet	33.7
LEGO	29.4	-	-
RAIN-KB (ours)	31.9	RAIN-TF (ours)	33.9

Table 4. Hits@1 and F1 results on WikiMovies-10K dataset. Bold indicates the best result.

Model	10% KB Hits@1	10% KB F1	30% KB Hits@1	30% KB F1	50% KB Hits@1	50% KB F1
KB only
KVMem	15.8	9.8	44.7	30.4	63.8	46.4
GN-KB	19.7	17.3	48.4	37.1	67.7	53.4
RAIN-KB (ours)	41.2	32.1	57.5	48.3	73.4	62.5
KB + text
KVMem	53.6	44.0	60.6	48.1	75.3	59.1
GN-LF	74.5	65.4	78.7	68.5	83.3	74.2
GN-EF	75.4	66.3	82.6	71.3	87.6	76.2
GN-EF+LF	79.0	66.7	84.6	74.2	88.4	78.6
RAIN-TF (ours)	81.2	68.0	86.3	75.7	90.4	79.5

Table 5. Hits@1 results on 50% KB. Bold indicates the best result.

Model	WQSP	CWQ	WikiMovies-10K
RAIN-TF	53.8	33.9	90.4
-document filtering	50.6 (−3.2)	31.7 (−2.2)	87.5 (−2.9)
-connections between the topic entity and other entities	50.9 (−2.9)	30.9 (−3.0)	88.7 (−1.7)
-question token nodes	52.5 (−1.3)	32.5 (−1.4)	89.2 (−1.2)
-text fusion	53.6 (−0.2)	33.8 (−0.1)	90.3 (−0.1)
-relation edge transformation	51.9 (−1.9)	31.6 (−2.3)	88.2 (−2.2)
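
The parenthesized values in Table 5 are the differences between each ablated variant and the full RAIN-TF model. The short sketch below recomputes them from the Hits@1 numbers in the table as a consistency check; the variable and function names (`FULL`, `ABLATIONS`, `drops`) are illustrative only.

```python
# Hits@1 at 50% KB, copied from Table 5 in the order (WQSP, CWQ, WikiMovies-10K).
FULL = (53.8, 33.9, 90.4)
ABLATIONS = {
    "document filtering": (50.6, 31.7, 87.5),
    "connections between the topic entity and other entities": (50.9, 30.9, 88.7),
    "question token nodes": (52.5, 32.5, 89.2),
    "text fusion": (53.6, 33.8, 90.3),
    "relation edge transformation": (51.9, 31.6, 88.2),
}


def drops(full, ablated):
    """Per-dataset change in Hits@1 (ablated minus full), rounded to one decimal."""
    return tuple(round(a - f, 1) for f, a in zip(full, ablated))


DELTAS = {name: drops(FULL, scores) for name, scores in ABLATIONS.items()}

# Component whose removal causes the largest drop on each dataset.
DATASETS = ("WQSP", "CWQ", "WikiMovies-10K")
LARGEST_DROP = {
    dataset: min(DELTAS, key=lambda name: DELTAS[name][i])
    for i, dataset in enumerate(DATASETS)
}
```

Under these numbers, removing document filtering gives the largest drop on WQSP and WikiMovies-10K, while removing the connections between the topic entity and other entities gives the largest drop on CWQ.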

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
