Auction-Based Learning for Question Answering over Knowledge Graphs

Abstract: Knowledge graphs are graph-based data models that can represent real-time data, growing constantly as new information is added. Question-answering systems over knowledge graphs (KGQA) retrieve answers to natural language questions from the knowledge graph. Most existing KGQA systems use static knowledge bases for offline training. After deployment, they fail to learn from unseen new entities added to the graph. There is a need for dynamic algorithms that can adapt to evolving graphs and give interpretable results. In this research work, we propose using new auction algorithms for question answering over knowledge graphs. These algorithms can adapt to changing environments in real time, making them suitable for both offline and online training. An auction algorithm computes paths connecting an origin node to one or more destination nodes in a directed graph and uses node prices to guide the search for the path. The prices are initially assigned arbitrarily and updated dynamically based on defined rules. The algorithm navigates the graph from high-price to low-price nodes. When nodes and edges are dynamically added or removed in an evolving knowledge graph, the algorithm can adapt by reusing the prices of existing nodes and assigning arbitrary prices to the new nodes. For subsequent related searches, the "learned" prices provide a means to "transfer knowledge" and act as a "guide" that steers the search toward lower-priced nodes. Our approach reduces the search computational effort by 60% in our experiments, thus making the algorithm computationally efficient. The resulting path given by the algorithm can be mapped to the attributes of entities and relations in the knowledge graph to provide an explainable answer to the query. We discuss some applications for which our method can be used.


Introduction
Over the past few decades, attempts have been made to provide human-like commonsense reasoning to systems by acquiring vast knowledge about any domain. Humans are intuitively good at dealing with uncertainties and making meaningful inferences from incomplete, missing, and unstructured information. On the other hand, machines need precise information typically stored in a structured database and queried using a parsing strategy or well-defined rules. It is hard to rely on traditional relational data stores, as the massive influx of unstructured data and real-world information continuously evolve.
In recent years, knowledge graphs (KGs) have emerged as a flexible representation using the graph-based data model. They capture and integrate factual knowledge from diverse sources of data at a large scale in a way that lets machines conduct reasoning and inference over the graphs for downstream tasks such as question answering, making useful recommendations, etc. [1].
The knowledge graph allows for postponing the schema definition, thus letting the data evolve more flexibly [2]. A typical data unit in a knowledge graph is stored as an entity-relationship-entity triplet, representing entities as nodes and relationships as edges in a graph. The nodes are modeled to contain the knowledge or facts about the entities in the graph. Our contributions are as follows:
• We propose a new method using auction algorithms for question answering in knowledge graphs.
• We show that knowledge graphs can take advantage of the dynamic nature of auction algorithms.
• Our results show that the computational efficiency of the search can be improved by reusing the prices that have been learned from previous queries.
• Lastly, we leverage the entity metadata and path relations to construct an interpretable answer from the resulting path given by the auction algorithms.
The paper is organized as follows. The next section describes related work. Section 3 describes and explains the auction algorithms. The adaptation of auction algorithms in knowledge graphs for question answering is given in Section 4. Section 5 provides our experimental results and analysis.

Related Work
In this section we review the path-finding approaches currently used in knowledge graphs, and we contrast them with our proposed approach that is based on auction algorithms.

Embedding-Based Methods
The popular methods used in knowledge graphs for tasks like question answering, knowledge base completion, link prediction, relation extraction, etc., are embedding-based models [12]. Low-dimensional vectors called graph embeddings represent the entities and relations for these tasks. The embedding models learn a scoring function in latent embedding space based on the co-occurrence of words in the text to manifest the semantics of the original graph. A higher score indicates a more plausible fact [6]. The top-ranked entities are chosen as the predicted answers. These models learn subtle patterns in data, but they lack generic forms of reasoning. The predicted answer given by the model is a single entity and does not provide any explanation to the user. For instance, suppose an embedding-based model is trained on individual facts such as (Irving Cummings, wasBornIn, New York City) and (New York City, isLocatedin, United States). The model can leverage the logical patterns and compute a score to possibly deduce that (Irving Cummings, isCitizenOf, United States). However, it fails to explain or offer any inference path evidence supporting the answer. This black-box style makes the reasoning process less interpretable and hampers the users from interacting with the system [9].
Palmonari et al. [13] suggested performing complex logical queries over embedding models to deal with explainability. Bhowmik et al. [10] used an inductive learning framework to learn representations and find reasoning paths between entities.
Secondly, the embedding-based methods are well-suited to answering factoid questions but struggle with multihop queries. Most works extract a subgraph to answer the question in parts and later augment the answers [14]. Ren et al. [15] added a query synthesizer tree to parse the multihop query in parts. A multihop KBQA was proposed by Saxena et al. [4] to fill the gaps. They embedded all the entities instead of only using neighborhood entities, and the answers were chosen by finding the similarity score between the question and relations.

Path-Based Methods
Another approach used for question-answering tasks in knowledge graphs is path based. This approach relies on conventional graph traversal methods such as A*, Dijkstra, and minimum spanning tree, which use statistical characteristics such as the adjacency matrix, degree, closeness, eigenvector centrality, PageRank, and connectivity [7–9,16,17]. The inference process is highly interpretable, but these methods suffer from scalability issues and high computation costs, as they rely on enumerating all possible paths [9].
All these methods consider the static snapshot of the knowledge graph and only learn the representation of the known entities they have seen before during the offline training. The models are unaware of the emerging entities and fail to keep up with the changes in the graph to answer new questions [5,9]. They thus may not be well-suited for an evolving knowledge graph [10].

Auction Algorithms
Auction algorithms are fundamentally based on mathematical ideas of duality and convex optimization and are inspired by economic equilibrium processes. They have a long history, starting with the original proposal [18], which aimed to solve the classical problem of assignment of objects to persons through a competitive bidding mechanism that resembles a real-life auction. Over time, the original auction algorithm [18] was extended to solve a wide variety of network optimization problems with linear and convex costs (see the tutorial survey [19] and the book [20]). Among others, auction algorithms have been used widely in optimal transport applications [21][22][23] and have been applied to the training of machine learning models [24][25][26].
One paper [27] considered the classical shortest path problem and proposed an adaptation of the original auction algorithm, which bears similarity with the path construction algorithms recently proposed in [11] and used in the present paper. The more recent algorithms differ in one respect that is critical for application in knowledge graphs: they allow arbitrary initial prices, making them faster and more suitable for dynamic environments involving a changing knowledge graph topology and queries. In particular, this property allows the reuse of prices from one query to another similar query through an ongoing "price learning" process, which can significantly improve computational efficiency; our computational experiments have confirmed this. Thus, contrary to the Dijkstra-like path-based algorithms, the algorithms with "learned prices" can adapt well to an evolving knowledge graph environment and continuously learn as new entities emerge.

Auction Algorithms Description
In this section, we will discuss the use of auction algorithms for constructing paths from a given origin to one or more destinations in knowledge graphs. In this work, we used two variations of auction algorithms. The first variation is a path construction algorithm recently proposed in [11] called auction path construction (APC), and the second is its extension called the auction weighted path construction algorithm (AWPC).
Auction algorithms use a price mechanism to guide the search for a solution. The algorithms efficiently emulate the search process, guided by some rules for price updates that we will describe shortly. Intuitively, the algorithms can be visualized in terms of a mouse moving in a graph-like maze to reach the destination. The mouse advances from high-price to low-price nodes, going from a node to a downstream neighbor node only if that neighbor has a lower price (or equal price under some conditions). It backtracks when it reaches a node whose downstream neighbors have higher prices. In such a case, it suitably increases the price of that node, thus marking the node as less desirable for future exploration and finding alternative paths to the destination.

Auction Path Construction (APC) Algorithm
We first provide some terminology before describing the algorithms mathematically.

Background and Terminology
A knowledge graph $KG = \{E, R, I\}$ consists of a set of entities $E$, a set of relations $R$, and attributes $I$. The entities are represented as the nodes, and the relations as the edges connecting the nodes. The attributes are the metadata associated with each edge and contain the facts about the corresponding entities. For any edge $(e_i, e_j)$, the node $e_j$ is called the downstream neighbor of $e_i$. A node $e_i$ is called a deadend if it has no downstream neighbors.
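As a concrete illustration, a triple dataset can be loaded into this kind of adjacency structure. The sketch below is our own; the triples reuse the example facts mentioned earlier in the paper, and the variable names are illustrative:

```python
from collections import defaultdict

# Illustrative triples: (head entity, relation, tail entity)
triples = [
    ("Irving Cummings", "wasBornIn", "New York City"),
    ("New York City", "isLocatedIn", "United States"),
]

# Adjacency structure: node -> list of downstream neighbors
graph = defaultdict(list)
# Attribute store I: metadata for each edge (here, the relation name)
attributes = {}

for head, relation, tail in triples:
    graph[head].append(tail)          # tail is a downstream neighbor of head
    attributes[(head, tail)] = relation

print(graph["Irving Cummings"])       # ['New York City']
# "United States" is a deadend: it has no downstream neighbors
print("United States" in graph)       # False
```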
At the typical iteration, the algorithm maintains a path $P = (e_s, e_1, e_2, \ldots, e_k)$ that starts at the origin node $e_s$, ends at some node $e_k$, and is such that $(e_s, e_1)$ is an edge and $(e_m, e_{m+1})$ is an edge for all $m = 1, \ldots, k-1$.
Once the destination becomes the terminal node of the path, the algorithm terminates. Otherwise, the path is either extended by adding a new node or contracted by deleting its terminal node. The decision to extend or contract is based on a set of variables, one per node, called prices. Each node $e$ is assigned a scalar price $p_e$. The prices are initially chosen arbitrarily, and the algorithm maintains and updates them according to some rules. To describe these rules, we introduce some terminology. Under the current set of prices, each edge $(e_i, e_j)$ is classified as:
(a) Downhill: if $p_{e_i} > p_{e_j}$;
(b) Level: if $p_{e_i} = p_{e_j}$;
(c) Uphill: if $p_{e_i} < p_{e_j}$.
The prices can be seen as a measure of the desirability of revisiting and advancing from a node in the future, with low-price nodes viewed as more desirable.
When the algorithm starts with a path of the nondegenerate form $P = (e_s, e_1, \ldots, e_k)$, a contraction removes the terminal node $e_k$ from $P$ to obtain the new path $\bar{P} = (e_s, e_1, \ldots, e_{k-1})$, while an extension adds a node $e_{k+1}$ to obtain the new path $\bar{P} = (e_s, e_1, \ldots, e_k, e_{k+1})$. The predecessor of the node $e_k$ in path $P$ is denoted by $\mathrm{pred}(e_k) = e_{k-1}$. If $P = (e_s, e_1)$, then $\mathrm{pred}(e_1) = e_s$. If the terminal node $e_k$ of $P$ is not a deadend, then a downstream neighbor of $e_k$ with minimal price is denoted by $\mathrm{succ}(e_k)$ and is called the successor of $e_k$:
$$\mathrm{succ}(e_k) \in \arg\min_{\{e_j \mid (e_k, e_j) \in A\}} p_{e_j}.$$
If multiple downstream neighbors of $e_k$ have minimal price, the algorithm arbitrarily designates one of these neighbors as $\mathrm{succ}(e_k)$. The algorithm also uses a positive scalar $\epsilon$ to regulate the size of price changes. For auction path construction (APC), we use $\epsilon = 1$, but the value of $\epsilon$ can play an essential role in the weighted version of the algorithm (AWPC).

Formal Definition
Given the current path $P = (e_s, e_1, e_2, \ldots, e_k)$, the algorithm chooses a min-price successor node $\mathrm{succ}(e_k)$ and updates the price $p_{e_k}$ by keeping track of the prices of the predecessor, $p_{\mathrm{pred}(e_k)}$, and the successor, $p_{\mathrm{succ}(e_k)}$, such that the following downhill path property is always maintained.
Downhill Path Property: All the edges of the path $P = (e_s, e_1, e_2, \ldots, e_k)$ maintained by the APC algorithm are level or downhill. Moreover, the last edge $(e_{k-1}, e_k)$ of $P$ is downhill following an extension to $e_k$.
In particular, given the current path $P$ and a set of node prices, the algorithm changes $P$ and the price of its terminal node according to the following cases (see Figure 1):
(a) $P = (e_s)$: Set the price $p_{e_s}$ to $\max\{p_{e_s}, p_{\mathrm{succ}(e_s)} + \epsilon\}$, and extend $P$ to $\mathrm{succ}(e_s)$.
(b) $P = (e_s, e_1, \ldots, e_k)$ and node $e_k$ is a deadend: Set the price $p_{e_k}$ to $\infty$ (or a very high number for practical purposes), and contract $P$ to $e_{k-1}$.
(c) $P = (e_s, e_1, \ldots, e_k)$ and node $e_k$ is not a deadend. Then, the following two cases are considered.
(i) If $p_{\mathrm{pred}(e_k)} > p_{\mathrm{succ}(e_k)}$, then extend $P$ to $\mathrm{succ}(e_k)$ and set $p_{e_k}$ to any price level that makes the arc $(\mathrm{pred}(e_k), e_k)$ level or downhill and the arc $(e_k, \mathrm{succ}(e_k))$ downhill. For example, set
$$p_{e_k} = p_{\mathrm{pred}(e_k)}.$$
This raises the price $p_{e_k}$ to the maximum possible level, making the arc $(\mathrm{pred}(e_k), e_k)$ level.
(ii) If $p_{\mathrm{pred}(e_k)} \le p_{\mathrm{succ}(e_k)}$, then contract $P$ to $\mathrm{pred}(e_k)$ and raise the price of $e_k$ to $p_{\mathrm{succ}(e_k)} + \epsilon$. This makes the arc $(\mathrm{pred}(e_k), e_k)$ uphill and the arc $(e_k, \mathrm{succ}(e_k))$ downhill.
In summary, in the case of an extension, the algorithm sets $p_{e_k} = p_{\mathrm{pred}(e_k)}$, the maximum possible level, making the predecessor arc $(\mathrm{pred}(e_k), e_k)$ level. In the case of a contraction, it raises the price of $e_k$ to the maximum possible level, making the arc $(\mathrm{pred}(e_k), e_k)$ uphill.
The algorithm terminates once the destination becomes the terminal node of $P$. The significance of the downhill path property is that when an extension occurs, a cycle cannot be created, in the sense that the terminal node $e_k$ is different from all the predecessor nodes $e_s, e_1, e_2, \ldots, e_{k-1}$ on the path $P$. The reason is that the downhill path property implies that following an extension, we have
$$p_{e_s} \ge p_{e_1} \ge \cdots \ge p_{e_{k-1}} > p_{e_k},$$
showing that the terminal node $e_k$ following an extension cannot be equal to any of the preceding nodes of $P$.
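The APC rules above can be sketched in code as follows. This is a minimal illustration of cases (a)-(c) under our own naming; it is not the reference implementation of [11]:

```python
import math

def apc(graph, prices, origin, dest, eps=1.0, max_iters=10_000):
    """Sketch of the auction path construction (APC) algorithm.

    graph:  dict mapping each node to its list of downstream neighbors
    prices: dict of node -> scalar price (mutated, so prices can be reused)
    Returns (path, number of iterations used).
    """
    path = [origin]
    for it in range(max_iters):
        term = path[-1]
        if term == dest:                      # destination is terminal: done
            return path, it
        nbrs = graph.get(term, [])
        if not nbrs and len(path) == 1:
            raise ValueError("origin is a deadend; destination unreachable")
        if not nbrs:
            # Case (b): deadend -> price to infinity, contract
            prices[term] = math.inf
            path.pop()
            continue
        succ = min(nbrs, key=lambda n: prices[n])   # min-price successor
        if len(path) == 1:
            # Case (a): degenerate path -> raise origin price, extend
            prices[term] = max(prices[term], prices[succ] + eps)
            path.append(succ)
        elif prices[path[-2]] > prices[succ]:
            # Case (c)(i): extend; predecessor arc level, successor arc downhill
            prices[term] = prices[path[-2]]
            path.append(succ)
        else:
            # Case (c)(ii): contract; make the arc into term uphill
            prices[term] = prices[succ] + eps
            path.pop()
    raise RuntimeError("iteration limit reached")

# Toy graph: node 3 is a deadend, so the search must backtrack once
graph = {1: [3, 2], 2: [4], 3: [], 4: []}
prices = {n: 0.0 for n in graph}
path, iters = apc(graph, prices, origin=1, dest=4)
print(path)   # [1, 2, 4]
```

In this toy run, the search first extends to the deadend node 3, marks it with an infinite price, contracts, and then finds the path through node 2.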

Multiple Destinations or Multiple Origins
The algorithm can be generalized to handle a single origin and multiple destinations. A list containing the destinations that have yet to be reached is maintained. Once a destination is reached by the path $P$, it is removed from the destination list. The algorithm continues to run until it has reached all the destinations. Similarly, for the multiple-origin problem, the algorithm can be used to construct a tree of paths from all the origins by concatenating the respective paths.
To illustrate APC with an example, consider the five-node graph shown in Figure 2a, with initial prices (0, 3, 0, 0, 0). For this example, the origin is node 1, with three destinations: nodes 2, 4, and 5. Figure 2 shows the successive iterations of the algorithm: it chooses to extend or contract and updates the prices as per the defined rules, the path for each destination is stored, and the algorithm terminates after it has reached all the destinations; the prices after the update in each iteration are shown against each node in the figure. This example shows a single application of the algorithm. If the algorithm is applied to multiple searches, then at the end of each search, any infinite price should be reset to a finite value, such as the minimum price of the upstream neighbors of the respective node.

Auction Weighted Path Construction (AWPC) Algorithm
The generalization of the APC algorithm is the auction weighted path construction (AWPC) algorithm. It uses a weight $a_{ij}$ for every edge $(e_i, e_j)$, which measures the desirability of including that edge in the path. In particular, the length of a path is defined as the sum of the weights of its edges. The edge weights act as a bias toward producing paths of smaller length. The paths generated by the AWPC algorithm have relatively short lengths, but they are not guaranteed to be shortest.
Extending the terminology of APC, under the current set of prices and weights, the edge $(e_i, e_j)$ is called:
(a) Downhill: if $p_{e_i} > a_{ij} + p_{e_j}$;
(b) Level: if $p_{e_i} = a_{ij} + p_{e_j}$;
(c) Uphill: if $p_{e_i} < a_{ij} + p_{e_j}$.
The AWPC algorithm maintains a directed path $P = (e_s, e_1, \ldots, e_k)$ that starts at the origin and consists of distinct nodes. The path is either the degenerate path $P = (e_s)$, or it ends at some node $e_k \ne e_s$, which is called the terminal node of $P$. Similarly to APC, each node $e_i$ is assigned an initial, arbitrarily chosen scalar price $p_{e_i}$. In addition to prices, AWPC maintains an edge weight $a_{ij}$ for every edge $(e_i, e_j)$; the edge weights are used to introduce a bias. The algorithm tends to converge faster for smaller values of the edge weights. In this work, we simulated experiments with different edge weights such as (0, 1, 2, 3, 6), and as shown later in the Results Section 5, our experimental findings supported this. APC is the special case of AWPC in which $a_{ij} = 0$ for all edges $(e_i, e_j)$.
The AWPC algorithm starts with the degenerate path $P = (e_s)$ and some initial prices. Each iteration starts with a path $P$ and a scalar price $p_{e_i}$ for each node $e_i$. At the end of the iteration, a new path $\bar{P}$ is obtained from $P$ through a contraction or an extension, as explained earlier for APC. Here, the amount of the price rise is also determined by a scalar parameter $\epsilon > 0$. The path and prices are updated at every iteration according to the following cases:
(a) $P = (e_s)$: Set the price $p_{e_s}$ to $\max\{p_{e_s}, a_{e_s \mathrm{succ}(e_s)} + p_{\mathrm{succ}(e_s)} + \epsilon\}$, and extend $P$ to $\mathrm{succ}(e_s)$.
(b) $P = (e_s, e_1, \ldots, e_k)$ and node $e_k$ is a deadend: Set the price $p_{e_k}$ to $\infty$ (or a very high number for practical purposes), and contract $P$ to $e_{k-1}$.
(c) $P = (e_s, e_1, \ldots, e_k)$ and node $e_k$ is not a deadend. Then, the following two cases are considered.
(i) If $p_{\mathrm{pred}(e_k)} > a_{\mathrm{pred}(e_k) e_k} + a_{e_k \mathrm{succ}(e_k)} + p_{\mathrm{succ}(e_k)}$, then extend $P$ to $\mathrm{succ}(e_k)$ and set $p_{e_k}$ to any price level that makes the arc $(\mathrm{pred}(e_k), e_k)$ level and the arc $(e_k, \mathrm{succ}(e_k))$ downhill. For example, set
$$p_{e_k} = p_{\mathrm{pred}(e_k)} - a_{\mathrm{pred}(e_k) e_k}.$$
This raises the price $p_{e_k}$ to the maximum possible level, making the arc $(\mathrm{pred}(e_k), e_k)$ level.
(ii) If $p_{\mathrm{pred}(e_k)} \le a_{\mathrm{pred}(e_k) e_k} + a_{e_k \mathrm{succ}(e_k)} + p_{\mathrm{succ}(e_k)}$, then contract $P$ to $\mathrm{pred}(e_k)$ and raise the price of $e_k$ to $a_{e_k \mathrm{succ}(e_k)} + p_{\mathrm{succ}(e_k)} + \epsilon$. This makes the arc $(\mathrm{pred}(e_k), e_k)$ uphill and the arc $(e_k, \mathrm{succ}(e_k))$ downhill.
The algorithm terminates when the destination becomes the terminal node of P. A more detailed discussion of the AWPC rules, properties, and performance guarantees is given in the paper [11].
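The AWPC rules can be sketched analogously to APC. The implementation below is our own illustration, with the successor chosen to minimize $a_{e_k e_j} + p_{e_j}$ over the downstream neighbors; the toy graph and weights are hypothetical:

```python
import math

def awpc(graph, weights, prices, origin, dest, eps=1.0, max_iters=10_000):
    """Sketch of the auction weighted path construction (AWPC) algorithm.

    graph:   dict node -> list of downstream neighbors
    weights: dict (u, v) -> nonnegative edge weight a_uv
    prices:  dict node -> scalar price (mutated in place)
    """
    path = [origin]
    for it in range(max_iters):
        term = path[-1]
        if term == dest:
            return path, it
        nbrs = graph.get(term, [])
        if not nbrs and len(path) == 1:
            raise ValueError("destination unreachable")
        if not nbrs:
            prices[term] = math.inf        # deadend: contract
            path.pop()
            continue
        # Successor: downstream neighbor minimizing a_{term,j} + p_j
        succ = min(nbrs, key=lambda n: weights[(term, n)] + prices[n])
        bid = weights[(term, succ)] + prices[succ]
        if len(path) == 1:
            prices[term] = max(prices[term], bid + eps)
            path.append(succ)
        elif prices[path[-2]] > weights[(path[-2], term)] + bid:
            # Extend: make the predecessor arc level, the successor arc downhill
            prices[term] = prices[path[-2]] - weights[(path[-2], term)]
            path.append(succ)
        else:
            # Contract: make the arc into term uphill
            prices[term] = bid + eps
            path.pop()
    raise RuntimeError("iteration limit reached")

# The weight-3 arc (s, a) biases the search toward the path through b
graph = {"s": ["a", "b"], "a": ["t"], "b": ["t"], "t": []}
weights = {("s", "a"): 3, ("s", "b"): 0, ("a", "t"): 0, ("b", "t"): 0}
prices = {n: 0.0 for n in graph}
path, _ = awpc(graph, weights, prices, origin="s", dest="t")
print(path)   # ['s', 'b', 't']
```

With all weights set to zero, this sketch reduces to the APC behavior, matching the statement that APC is the special case of AWPC.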

Role of the Parameter $\epsilon$
A positive scalar $\epsilon$ is used to regulate the size of the price rise. The choice of $\epsilon$ does not affect the path produced by the APC algorithm (for ease of calculation, its value was chosen as $\epsilon = 1$). In AWPC, $\epsilon$ provides a trade-off between the ability of the algorithm to produce a minimal-length path and its convergence rate (see the paper [11]). As the value of $\epsilon$ becomes small, the path quality improves, but at the same time, a small value of $\epsilon$ tends to slow down the algorithm.
For an illustration, consider the four-node graph shown in Figure 3a. In this example, the origin node is $s$, and the destination node is $t$. We assume that AWPC starts with $P = (s)$ and with all initial prices equal to 0. Consider first the case where $\epsilon > 3$. Then, using the rules of the algorithm, the trajectory of the path will be an extension from $s$ to 1, followed by an extension from 1 to the destination $t$, thus producing the nonshortest path $(s, 1, t)$.
On the other hand, if $\epsilon < 3$, the algorithm will extend the path from $s$ to 1 and then contract back to $s$. It will then extend the path to 2 and perform a final extension to $t$, resulting in the shortest path $(s, 2, t)$; see Figure 3b. Thus, if $\epsilon > 3$, the algorithm produces the nonshortest path $(s, 1, t)$ faster, and if $\epsilon < 3$, it produces the shortest path $(s, 2, t)$ more slowly. This behavior is characteristic of the role of $\epsilon$ in providing a trade-off between the ability of the algorithm to construct paths with near-minimum lengths and its rate of convergence. For further explanation and a theoretical analysis, see [11]. In this paper, we have kept the value of $\epsilon$ at 1 for all of our experiments.

Question Answering in Knowledge Graphs
In this section, we present our approach for answering queries over a knowledge graph using auction algorithms. We demonstrate the use of the auction path construction (APC) algorithm and its weighted variant, the auction weighted path construction (AWPC) algorithm.

Problem Statement
A knowledge graph, $KG = \{E, R, I\}$, as described in Section 3.1.1, is a set of entities, relations, and attributes. Knowledge graph applications include data mining tasks, such as discovering new facts or answering questions. All such tasks require the prediction of a link, $(e_s, ?, e_t)$, between the head (or origin) entity $e_s$ and the tail (or destination) entity $e_t$. The link is either a single edge, denoted as relation $r$ in a single hop, $(e_s, r, e_t)$, or a multihop path, such as $(e_s, r, e_t) \rightarrow (e_s, r_1, e_1) \wedge (e_1, r_2, e_2) \wedge \ldots \wedge (e_{n-1}, r_n, e_t)$.
In either case, we can model this problem as constructing the path between the origin and the destination entities using the auction path construction algorithm, as described in the previous section.

Method
We follow a step-wise method to answer the queries by traversing directly through the knowledge graph. We first parse the query to extract the entities. We then map the entities to find the closest matching entity in the knowledge graph and obtain the attributes of each entity from the entity metadata in the knowledge graph. In the subsequent step, we use the auction path construction algorithm to acquire the path from the source to the destination entity. In the final step, we construct an answer by mapping the attributes of all the entities in the path. Figure 4 illustrates the process for single and multiple destination queries. Our method does not require structured query processing languages or training a model on a static knowledge base to answer the questions. The step-wise process is explained in detail below.

1. Parsing the Query: In the first step, the user query is semantically parsed to extract the entities. The entities can be extracted using any state-of-the-art parsing or entity extraction technique [28], such as natural language processing (NLP)-based named-entity recognition (NER) methods [29]. Different techniques such as rule-based [30], ontology-based [31], and learning-based methods [32] can be used for extracting the entities from a query. The choice of the named-entity extraction method for knowledge graphs largely depends on the domain and whether the given dataset has a well-defined schema.
In this work, we used a dictionary-based look-up method to match the entities [33]. First, the entities are extracted using spaCy-based named-entity recognition (NER) [34]. Then, using named-entity linking, we look for the closest matching entity in the knowledge base (KB). Alternatively, a custom lexical matcher can be used [35]. After extraction of the entities, the information from the metadata of the knowledge graph is used to map the entities to the corresponding nodes. For example, as shown in Figure 4a, in a knowledge graph based on a triple dataset, KG20C [36], for the query "Which paper was published by Amanda at SIGIR conference?", the entities extracted using spaCy are ("paper", "Amanda", "SIGIR", "conference").
The closest matching entities in KB are ("Amanda" and "SIGIR"). The metadata in KB has each entity's attributes, entity ID, name, and type. We fetch the attributes for the candidate entities, "Amanda" and "SIGIR". Here, the entity ID is the corresponding node number in the graph, ("Amanda":3877) and ("SIGIR":4097).
We must also determine the origin and destination entities to construct the path. Most knowledge graph question-answering (KGQA) systems use semantic parsing to detect the occurrence and order of entities in the question [37]. In some KGQA systems, the entity extraction models are trained on a question-answer dataset to understand the query pattern [9]. More sophisticated methods use a predefined query template [38]. In this work, for simplicity, we have used the order of occurrence of entities to determine the origin and destination entities. In the case of multiple entities, we consider the first as the origin and the rest as multiple destinations, as shown in the query in Figure 4b. In this paper, we do not elaborate further on each method used in this step, as our work focuses on demonstrating the use of the APC in knowledge graphs.
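The dictionary-based look-up and the order-of-occurrence rule can be sketched as follows. This is a simplified stand-in for the spaCy NER + entity-linking pipeline described above; the KB dictionary is illustrative, with only the IDs for "Amanda" (3877) and "SIGIR" (4097) taken from the example in the text:

```python
# Illustrative KB metadata keyed by lowercased entity name
kb_entities = {
    "amanda": {"id": 3877, "type": "Author", "name": "Amanda"},
    "sigir": {"id": 4097, "type": "Venue", "name": "SIGIR"},
}

def link_entities(query, kb):
    """Dictionary-based look-up: match query tokens against KB entity names,
    preserving their order of occurrence in the query."""
    matches = []
    for token in query.lower().replace("?", "").split():
        if token in kb:
            matches.append(kb[token])
    return matches

query = "Which paper was published by Amanda at SIGIR conference?"
linked = link_entities(query, kb_entities)

# Order-of-occurrence rule: first match is the origin, the rest are destinations
origin, destinations = linked[0], linked[1:]
print(origin["id"], [e["id"] for e in destinations])   # 3877 [4097]
```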

2. Path construction using APC (or AWPC): In the second step, the triple dataset of a knowledge graph is modeled as a directed graph with nodes and edges. The nodes identified in step 1 are designated as the origin and destination nodes in the graph, and the auction path construction algorithm (APC) is used to construct the path between them. The initial prices for all the nodes in $E$ can be chosen arbitrarily or reused from previous searches. If the weighted version of the algorithm (AWPC) is used, then the edges in $R$ should also be given initial edge weights to introduce a bias.
The algorithm starts at the origin node. It follows the rules defined in Section 3.1.2 to navigate the graph structure, updating the path and node prices at each iteration. The algorithm terminates and returns the final path once it reaches the single (or all) destination(s). The resulting path is essentially the sequence of lower-priced nodes chosen by the algorithm during the auction-guided search from the origin to the destination node. For the above query example, as shown in Figure 4a, the path given by APC is [3877, 6572, 4097]. This set of nodes in the final path can now be used to provide an explainable answer to the query.

3. Mapping attributes to build an interpretable answer: In the final step, we use the attributes $I$ for each entity in $E$, given as the metadata in the knowledge graph. These attributes can add semantic meaning to the path generated by APC. The edges are mapped to the corresponding relations, and the type and entity names are added from the attributes to construct the final answer. Referring to our example in Figure 4a, for the path [3877, 6572, 4097], the attributes for each node and the relations for each edge in the path are mapped to construct the final answer, as shown in Figure 5. The prices learned by the algorithm in this search can be used as the starting prices for the next query.

Prices as a "Learning Experience"
The node prices play a significant role in the search process. The node prices act as a "guide" for the algorithm and "steer" it toward nodes with lower prices, which are more likely to be part of the solution. APC provides the flexibility to use arbitrary prices. Initially, zeros or random values can be used as the starting prices for the nodes in the graph. In successive runs, prices from the previous run can be reused. Reusing the prices is similar to learning from previous experiences. For subsequent similar queries, the "experience" in terms of "learned prices" helps produce the path faster, making the algorithm computationally efficient. Eventually, the algorithm can learn favorable starting prices to generate paths faster.
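Price reuse can be demonstrated on a toy graph: a second search with the learned prices needs fewer iterations than the cold start. The compact APC below mirrors the rules of Section 3 under our own naming; it is a sketch, not the reference implementation:

```python
import math

def apc(graph, prices, origin, dest, eps=1.0):
    """Compact sketch of APC; mutates `prices` so they can be reused."""
    path = [origin]
    iters = 0
    while path[-1] != dest:
        term, nbrs = path[-1], graph.get(path[-1], [])
        iters += 1
        if not nbrs:
            prices[term] = math.inf       # deadend: contract
            path.pop()
            continue
        succ = min(nbrs, key=lambda n: prices[n])
        if len(path) == 1:
            prices[term] = max(prices[term], prices[succ] + eps)
            path.append(succ)
        elif prices[path[-2]] > prices[succ]:
            prices[term] = prices[path[-2]]
            path.append(succ)
        else:
            prices[term] = prices[succ] + eps
            path.pop()
    return path, iters

graph = {1: [3, 2], 2: [4], 3: [], 4: []}
prices = {n: 0.0 for n in graph}      # "cold start": zero prices

_, first = apc(graph, prices, 1, 4)   # first search learns the prices
_, second = apc(graph, prices, 1, 4)  # second search reuses them
print(first, second)                  # the second search needs fewer iterations
```

Here the first search wastes iterations exploring the deadend node 3; the reused prices (with node 3 marked by an infinite price) steer the second search directly to the answer.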

Ability of APC to Adapt to an "Evolving KG"
Given the dynamic and incomplete nature of knowledge graphs, newly emerging relations and entities are added, and redundant ones are removed, resulting in constant changes to the graph structure. The auction algorithms are well-suited to this kind of dynamic environment. They are flexible because the node prices are initially assigned arbitrary values and updated dynamically as per the rules defined in Section 3. The algorithms navigate the graph from higher-price to lower-price nodes. When nodes or edges are added to or removed from the graph, the prices of the existing nodes can be reused from previous searches, and the newly added nodes are given arbitrary prices. We simulated experiments by dynamically adding new nodes to the knowledge graph, and our results show that the algorithms adapt well to the environment changes. When assigning arbitrary prices to newly added nodes, a good strategy is to allocate prices in the range of the prices of the existing nodes.
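One simple way to implement this strategy is to initialize a new node's price within the range of the existing finite prices. The midpoint used below is our own illustrative choice, not a rule prescribed by the method:

```python
import math

def price_for_new_node(prices):
    """Assign a starting price to a newly added node: pick a value in the
    range of the existing finite node prices (here, their midpoint)."""
    finite = [p for p in prices.values() if math.isfinite(p)]
    if not finite:
        return 0.0
    return (min(finite) + max(finite)) / 2

prices = {1: 2.0, 2: 6.0, 3: math.inf}   # node 3 was a deadend
prices["new_node"] = price_for_new_node(prices)
print(prices["new_node"])   # 4.0
```

Infinite deadend prices are excluded so that a single blocked node does not distort the starting price of new nodes.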

Edge Weights as per "Query Semantics"
Another significant parameter in the search process is the arc length (or edge weight) used in auction weighted path construction (AWPC), the weighted version of APC described in Section 3.2. The edge weights act as a bias and aim to produce a path with near-minimal total length. Arcs with smaller lengths tend to give shorter paths, and thus the algorithm should converge faster. Once the relations in a query are semantically extracted, the edge weights can be assigned according to the semantic meaning. The relations (or edges) that strongly match the query are given weight "0", and all other edges are given higher weights. By adding this bias, the algorithm looks for lower-length paths similar to the relation in the query. Thus, we are highly likely to obtain more accurate and faster results with shorter paths, making the search computationally more efficient.
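This weighting scheme can be sketched as follows. The relation names follow the KG20C schema; the node IDs and the penalty value are illustrative assumptions:

```python
def semantic_edge_weights(edge_relations, query_relation, penalty=3):
    """Assign weight 0 to edges whose relation matches the query semantics,
    and a higher weight to all others, biasing AWPC toward matching edges.
    `penalty` is an illustrative value, not prescribed by the method."""
    return {
        edge: 0 if rel == query_relation else penalty
        for edge, rel in edge_relations.items()
    }

edge_relations = {
    (3877, 6572): "author_write_paper",
    (6572, 4097): "paper_in_venue",
    (3877, 5001): "author_in_affiliation",
}
# A query like "Which paper was written by ..." matches author_write_paper
weights = semantic_edge_weights(edge_relations, "author_write_paper")
print(weights[(3877, 6572)], weights[(3877, 5001)])   # 0 3
```

The resulting dictionary can be passed directly as the edge-weight input of AWPC, so that edges matching the query relation are preferred during the search.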

Dataset
We used two benchmark datasets, KG20C [36] and YAGO3-10 [39], to showcase the application of APC in knowledge graphs for question answering. The first dataset, KG20C, was constructed from the Microsoft Academic Graph dataset [40] by extracting the information from high-quality computer science papers published in the top 20 conferences. KG20C is a standard benchmark dataset used for several tasks in knowledge graphs such as graph embedding, link prediction, recommendation systems, and question answering about high-quality papers [41,42]. It is a triple dataset with five intrinsic entity types: Paper, Author, Affiliation, Venue, and Domain. The five intrinsic relation types defined between these entities are author_in_affiliation, author_write_paper, paper_in_domain, paper_cite_paper, and paper_in_venue. The entity attributes contain the metadata: entity ID, name, and type. The statistical details of KG20C are in Table 1.
Another dataset, YAGO3-10, is a benchmark dataset for knowledge base completion. It is a subset of YAGO3 (an extension of YAGO (Yet Another Great Ontology) [43]). YAGO3 contains entities related to people, universities, cities, organizations, and artworks, along with their canonical relations. YAGO3-10 has 123,182 entities, 37 relations, and 1,179,040 triples. Most triples describe attributes of places and persons, such as citizenship, gender, and profession; for instance, the city a person was born in, which city is located in which country, which player played for a team, which singer sang which song, etc.

Experiments and Results
In this section, we present our experiments and results. First, we discuss the protocol for evaluating and measuring the effectiveness of the proposed method. Then, we provide the experiment setup and the result analysis and discussion.

Evaluation Protocol
The research question is whether the APC (and AWPC) algorithms are suitable for application in knowledge graphs. To answer this question, we ran multiple simulations to test the algorithm's behavior in different settings. The primary objectives of this work are to evaluate the accuracy of the method, the adaptability of the algorithm to a dynamic knowledge-graph environment, and the interpretability of the answers. The other important factors are computational efficiency and support for multihop and multidestination queries. Each objective is described below:
• Prediction Accuracy: Did the algorithm predict answers the same as, or close to, the ground truth?

Experimental Setup
We conducted simulations with eight different parameter settings. For each simulation, we used around 50 queries, a mix of single-hop and multihop questions with single or multiple destinations. Table 2 shows the experimental settings for the different prices and edge weights, as well as the corresponding observations for each experiment on the KG20C dataset.
The first four simulations were run using APC with different initial price settings. The algorithm was relatively stable, and the resulting paths matched the ground truth with the initial price set to "0". We then reused the "learned" prices and observed that the number of iterations reduced by 60% while the resulting paths remained the same. We also stress-tested the algorithm by assigning random prices between 100 and 1000: the iteration count went very high, and while most of the paths remained the same, the algorithm chose longer paths for some queries. Reusing these prices gave relatively stable results, and the iterations reduced significantly. For APC, the second experiment achieved the best results in terms of computational efficiency.
Further experiments were conducted using the weighted version (AWPC). We kept the variations in price settings the same as those in the first four runs and repeated them for different edge weights. We experimented with different combinations of edge weights, such as (0, 1, 2, 3, 6) and random weights between −10 and 10. The edge weights were used to add the bias; the algorithm converges faster with lower edge weights, and our experimental results supported this. Table 2 reports the significant observations for each experiment:
1. Initial price "0": The algorithm was relatively stable, and the resulting paths matched the ground truth.
2. Reusing prices from (1): The number of iterations reduced by 60%; the resulting paths were the same as those in (1).
3. Random prices and stress testing: Random prices given between 100 and 1000. Higher iterations; most paths were the same, but longer paths were chosen for some queries.
4. Reusing prices from (3): Relatively stable; iterations in (4) much fewer than those in (3) but higher than or the same as those in (2).
5. Edge weights as "0": Results should match the first four settings using APC; a validation step.
6. Edge weights as "1": Same paths; a greater or equal number of iterations as in (2).
7. Random weights: Unstable with random weights between −10 and 10. Iterations increased by almost two times, with not much change in the paths.
8(a). Weights per query semantics (S:0, W:1): Edges with a strong match in the query given "0" weights and the remaining (weak) edges given "1"; no significant change in iterations.
8(b). Weights per query semantics (S:0, M:3, W:6), prices reused from an initial price of "0": Paths were the same or shorter, and iterations were fewer than or the same as in (2).
For AWPC, the edge weights were first set to "0" to validate that the results matched those of the first four price settings using APC. In the next run, we set the edge weights to "1": the paths remained the same, and the iterations were higher than or the same as those in (2). The algorithm became unstable with random weights between −10 and 10; iterations increased almost twofold, but there was not much change in the paths.
When the query is parsed, suppose the relation can be extracted and matched with a predefined pattern. If we can state which relation in the query matches which relation in the knowledge graph metadata, then we can preassign the edge weights according to the match. We experimented with assigning edge weights as per the query semantics. In the first setting, the edges with a strong match were given "0" weights, and the remaining (weak) edges were set to "1", denoted as (S:0, W:1). In the second setting, the relations were classified as strong, medium, and weak matches. For example, in the query "Which author may cite this paper?", the relation paper_cite_paper is a strong match, author_write_paper is medium, and all other relations are weak. The strong-match edges were given "0" weights, medium "3", and weak "6", denoted as (S:0, M:3, W:6). When the prices were reused from the initial price of "0", the resulting paths were the same or shorter, and the iterations were fewer than or the same as in (2). In this case, the algorithm yielded the best results compared to all the other settings. Intuitively, we can therefore say that the algorithm can perform a semantics-based search and tends to progress in a meaningful direction if the parameters are chosen wisely.
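The price-guided search and the effect of reusing prices can be illustrated with a minimal sketch. This is our own simplified rendering of the extend/contract-with-price-rise idea, not the exact APC/AWPC procedure of Section 3 (for instance, it skips nodes already on the path and prices dead ends at infinity):

```python
import math

def awpc_search(graph, weights, origin, dest, prices, max_iters=100_000):
    """Price-guided path search (simplified sketch).

    graph:   dict node -> list of successor nodes
    weights: dict (u, v) -> nonnegative edge weight (default 1)
    prices:  dict node -> price, mutated in place so that later,
             related searches can reuse the "learned" prices
    Returns (path, number_of_iterations)."""
    path = [origin]
    for it in range(max_iters):
        i = path[-1]
        if i == dest:
            return path, it
        # Candidate successors not already on the path (cycle guard).
        succ = [j for j in graph.get(i, []) if j not in path]
        if not succ:
            if len(path) == 1:
                raise ValueError("no path exists")
            prices[i] = math.inf   # dead end: never extend here again
            path.pop()
            continue
        j = min(succ, key=lambda n: weights.get((i, n), 1) + prices.get(n, 0))
        best = weights.get((i, j), 1) + prices.get(j, 0)
        if prices.get(i, 0) >= best:
            path.append(j)         # extend toward the lower-priced node
        else:
            prices[i] = best       # price rise, then contract the path
            if len(path) > 1:
                path.pop()
    raise RuntimeError("iteration limit reached")

# Toy graph: a -> b -> c (destination), plus a dead end a -> d.
graph = {"a": ["b", "d"], "b": ["c"]}
weights = {("a", "b"): 1, ("a", "d"): 1, ("b", "c"): 1}
prices = {}
path1, it1 = awpc_search(graph, weights, "a", "c", prices)
path2, it2 = awpc_search(graph, weights, "a", "c", prices)  # reuse prices
# path1 == path2 == ["a", "b", "c"]; it2 is much smaller than it1
```

On this toy graph, the first search raises prices along the dead end, and a repeated search with the learned prices reaches the destination in far fewer iterations, qualitatively mirroring the reduction observed when prices are reused.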

Result Analysis
We now discuss the results of the above experiments and analyze whether the method satisfies the objectives of the evaluation protocol.
Prediction Accuracy: The resulting paths from each query matched the ground truth. The different node-price and edge-weight settings given in Table 2 only affected the number of iterations the algorithm took to provide the final answer, as discussed shortly.
Adaptability to Evolving Knowledge Graph: To test the algorithm's adaptability to changing knowledge graphs, we conducted the following experiments.

1. We started with a small subgraph of 12 entities from the KG20C dataset and ran around ten queries. The node prices were initially set to "0". The prices were then reused for the subsequent queries. Similar experiments were conducted with random initial prices and the edge-weight settings described in Table 2.
2. In the second simulation, 40 new entities were added to the knowledge graph. The prices of existing nodes were reused, and the newly added nodes were given arbitrary prices. In this step, 20 queries were run on a total of 52 nodes.
3. In the third simulation, the remaining entities and relations were added, and 50 queries for each setting (as given in Table 2) were run on the large graph with 16K entities. The prices from previous searches were reused for existing nodes. The number of iterations increased slightly for a few queries but eventually decreased after the prices were reused in successive runs.
The results from each simulation show that the algorithm dynamically adapted to the newly added nodes. The final answers from all the resulting paths matched the ground truth. A critical observation on the larger graph was that if the origin of the following query is too far from that of the previous one, the number of iterations may increase; it stabilizes once the search starts reaching nodes with reasonable learned prices. Thus, the algorithm continuously learns and adapts to evolving knowledge graphs.
Explainable Reasoning Paths: Explainability refers to the ability of a system to explain the results and the reasoning process to the users, such that they can know the whys and hows of the decision-making process [44].
APC returns a complete path between the entities in question. The paths are traceable and can be interpreted by mapping the entity attributes, as shown in Figure 6. The need for explainable reasoning paths can be demonstrated using the second query example from the KG20C dataset, "Has Mehryar worked in the Matrix Decomposition domain?" From the explanation given in the answer, we can see that there is no direct yes-or-no answer to this question. The author has not worked directly in the domain, but technically, he has referred to some work that cited another work in the said domain. In such cases, the user would want to see the complete answer and infer the degree of association of the author with that domain. In cases where no path exists, the algorithm will explore all the nodes and raise an exception saying that no path exists.
Computational Efficiency: Table 3 shows the number of iterations for a few sample queries from the KG20C dataset with different price and edge weight (EW) settings, as discussed in experiments 6, 7, 8(a), and 8(b) in Table 2. The results show that the number of iterations reduces significantly when the prices are reused for similar queries. They also show the effect of the bias introduced by different edge-weight values in the cases where the edge weights were assigned as per the relations extracted from the query semantics. The edges with strong matching relations were given "0" initial weights, the medium relations "3", and the weak relations "6". The first query with edge weights set per query semantics took only 76 iterations, which is much fewer than with random weights. On reusing the prices, the number of iterations dropped to 2 and 3 in the second and third queries, respectively, which is a significant computational improvement.
[Table 3, excerpt: the sample queries were "In which paper did Hang-Li cite paper on 'Equations for PoS'?", "In which paper author Fredric cited the paper on 'Equations for PoS'?", and "Which paper in 'Clustering query' cites work on 'Equations for PoS'?", run under the settings of experiments 6, 7, 8(a), and 8(b); the recoverable rows read, for initial price = 0: 353, 630, 347, 76 iterations, and for reused prices: 2, 32, 2, 2 iterations.]

Support for Multihop Queries:
Most of the queries used in our experiments require multihop reasoning. For example, the query "In which paper author Fredric cited the paper on Equations for PoS?" needs a multihop inference of the form (Fredric, author_write_paper, ?), (?, paper_cite_paper, Equations for PoS). APC provides a full traceable path and supports queries with multihop reasoning.
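This two-hop inference can be written down directly as triple patterns; the following toy matcher (a hypothetical helper for illustration, not part of our pipeline) finds the bindings for the intermediate "?" placeholder over a triple set:

```python
def match_two_hop(triples, first_pat, second_pat):
    """Find bindings x such that (h1, r1, x) and (x, r2, t2) both hold
    in the triple set, for patterns (h1, r1, "?") and ("?", r2, t2).
    A naive toy matcher for illustration."""
    h1, r1, _ = first_pat
    _, r2, t2 = second_pat
    triple_set = set(triples)
    candidates = [t for h, r, t in triples if h == h1 and r == r1]
    return [x for x in candidates if (x, r2, t2) in triple_set]

triples = [
    ("Fredric", "author_write_paper", "P1"),
    ("Fredric", "author_write_paper", "P2"),
    ("P1", "paper_cite_paper", "Equations for PoS"),
]
answers = match_two_hop(
    triples,
    ("Fredric", "author_write_paper", "?"),
    ("?", "paper_cite_paper", "Equations for PoS"),
)
# answers == ["P1"]
```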
Queries with Multiple Destinations: Figure 3b shows an example of a query with multiple destinations. Both the APC and AWPC algorithms support queries with multiple destinations. They work the same way as for a single destination, except that a list of all the destinations is maintained and the algorithm continues to run until it has reached every destination, as described in Section 3.1.3.
To validate our approach, we ran the queries on the second dataset, YAGO3-10, using the same settings. Figures 7 and 8 show the explainable paths generated by our method for sample queries on the YAGO3-10 dataset. The complete implementation code for both the APC and AWPC algorithms for single- and multiple-destination queries, along with a summary of all the experimental results, is available on GitHub [45].
We further compare our approach with state-of-the-art methods. Table 4 compares our method (using APC) with the multirelational embedding (MEI) methods used on the KG20C dataset to answer semantic queries [36]. The semantic queries answered by that work are single-hop factoid questions such as "Who may write this paper?" and "What paper does this author have?", not multihop questions such as "Which paper does this author have in that conference?". Moreover, these methods only give a single candidate score as the answer, not explainable paths. Our method supports single-hop and multihop questions as well as queries with multiple destinations. The predictions match the ground truth, and the search using APC returns an explainable reasoning path for every query.
We also compared our method with the Dijkstra algorithm. Both auction and Dijkstra can be used for finding paths from a source to a destination, and both have polynomial time complexity. However, auction has two advantages. First, it tends to be faster, mainly because it aims to find any path (which is not guaranteed to be the shortest) rather than the shortest path, as is the case with Dijkstra. Second, the auction algorithm generates prices for future queries, whereas Dijkstra provides no such information. Table 5 shows the results of some sample queries from KG20C for Dijkstra and auction.
The path was the same for most of the queries, but for some queries shown in Table 5, Dijkstra returned a different, shorter path than auction. Dijkstra starts fresh for each query, enumerating all possible paths; it has no mechanism for learning from past searches, which makes it computationally expensive as the knowledge graph grows. The above experiments and results show that APC (and AWPC) meet the evaluation criteria. Thus, the potential of auction algorithms can be explored for various tasks in knowledge graphs. A summary of the experimental results against each evaluation objective is shown in Table 6.
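For reference, the Dijkstra baseline we compared against can be sketched as follows; note that, unlike the auction search, it keeps no state between queries, so every new question pays the full search cost again:

```python
import heapq

def dijkstra_path(graph, weights, origin, dest):
    """Standard Dijkstra shortest-path search (comparison baseline).
    graph: dict node -> list of successors; weights: dict (u, v) -> cost.
    Returns (shortest_path, total_length); no "prices" survive the call."""
    dist = {origin: 0}
    prev = {}
    frontier = [(0, origin)]
    done = set()
    while frontier:
        d, u = heapq.heappop(frontier)
        if u in done:
            continue
        done.add(u)
        if u == dest:
            # Walk the predecessor chain back to the origin.
            path = [u]
            while path[-1] != origin:
                path.append(prev[path[-1]])
            return path[::-1], d
        for v in graph.get(u, []):
            nd = d + weights.get((u, v), 1)
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(frontier, (nd, v))
    raise ValueError("no path exists")

# Dijkstra always returns a shortest path: here a -> b -> c (length 2)
# rather than the direct a -> c edge of length 5.
graph = {"a": ["b", "c"], "b": ["c"]}
weights = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 5}
path, length = dijkstra_path(graph, weights, "a", "c")
# path == ["a", "b", "c"], length == 2
```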

Discussion
We discussed the need for dynamic algorithms in evolving knowledge graphs that can continuously learn as new information is added and provide explainable answers. To this end, we proposed using a recently introduced class of auction algorithms that maintain and update node prices while constructing low-price paths. The prices from previous searches serve as a learning experience for related searches, making the method computationally efficient. We used two benchmark knowledge-graph datasets, KG20C and YAGO3-10, to show that the resulting paths given by auction are highly interpretable and explainable. The algorithm can answer single-hop and multihop questions with one or more destinations. We also compared our method with the Dijkstra shortest-path algorithm, showing that both can give explainable paths; however, auction can learn from previous searches. The most significant advantage of the auction algorithm is its capability to assign arbitrary initial prices and to update prices dynamically, independent of new nodes added to the graph. This property makes auction suitable for continuous learning in a dynamic knowledge-graph environment.
The auction algorithms can also be used in application contexts to provide recommendations and design flow graphs. One application is customer contact centers, where customers call agents for help with their bills, payment methods, products, Wi-Fi service, call plans, etc. These agents can be human or virtual assistants. Once the call intent or the named entities are extracted from the customer query and the endpoints are given, the algorithm can provide the full path of troubleshooting steps, allowing agents to resolve customer tickets quickly.
There are certain limitations that we plan to improve upon in future work. For example, for open-ended questions such as "Who may write this paper?", the algorithm will search for all the "author" nodes in the graph. This may make the search inefficient at run time for a huge graph. Furthermore, it will return multiple answers to the user, which may not be desirable. To address these scenarios, link-prediction embedding methods can be used first to predict the destination candidate, and the auction algorithms can then provide the explainable paths.

Conclusions
This work proposes the use of auction algorithms for question answering over knowledge graphs. We show that these algorithms can adapt to changing conditions by dynamically updating node prices and can learn proper price values from past experience. In particular, the learned prices can be reused efficiently for similar future searches. Our method leverages the entity attributes and path relations to construct interpretable answers from the resulting paths generated by the auction algorithms for multihop queries.
In this work, we used two variations of auction algorithms, auction path construction (APC) and auction weighted path construction (AWPC), which have the key property of allowing arbitrary starting prices. Other variants of auction algorithms, such as the reverse-path, distributed, and multiple-node price-rise variants, can be further explored (such variations are given in the paper [11] and the book [20]). Moreover, a neural network model could be trained to learn near-optimal starting prices and to reoptimize them when significant changes occur in the network's structure.