ULYSSES: Automated FreqUentLY ASked QueStions for KnowlEdge GraphS

Vassiliou, Giannis; Trouli, Georgia Eirini; Troullinou, Georgia; Spyridakis, Nikolaos; Bitzarakis, George; Droumalia, Fotini; Karagiannakis, Antonis; Skouteli, Georgia; Oikonomou, Nikolaos; Deka, Dimitra; Makaronas, Emmanouil; Pronoitis, Georgios; Alexandris, Konstantinos; Kostopoulos, Stamatios; Kazantzakis, Yiannis; Vlassis, Nikolaos; Sfinarolaki, Eleftheria; Daskalakis, Vardis; Giannakos, Iakovos; Stamatoukou, Argyro; Papadakis, Nikolaos; Kondylakis, Haridimos

doi:10.3390/app14177640

Open AccessArticle

ULYSSES: Automated FreqUentLY ASked QueStions for KnowlEdge GraphS

by

Giannis Vassiliou

¹,

Georgia Eirini Trouli

^1,2

,

Georgia Troullinou

²,

Nikolaos Spyridakis

¹

,

George Bitzarakis

¹,

Fotini Droumalia

¹,

Antonis Karagiannakis

¹,

Georgia Skouteli

¹,

Nikolaos Oikonomou

¹,

Dimitra Deka

¹,

Emmanouil Makaronas

¹,

Georgios Pronoitis

¹,

Konstantinos Alexandris

¹,

Stamatios Kostopoulos

¹

,

Yiannis Kazantzakis

¹,

Nikolaos Vlassis

¹,

Eleftheria Sfinarolaki

¹,

Vardis Daskalakis

¹,

Iakovos Giannakos

¹,

Argyro Stamatoukou

¹,

Nikolaos Papadakis

¹ and

Haridimos Kondylakis

^2,3,*

Show full author list Hide full author list

¹

Department of Electrical and Computer Engineering, Hellenic Mediterranean University (HMU), 71309 Heraklion, Greece

²

Institute of Computer Science, Foundation for Research and Technology-Hellas (FORTH), 70013 Heraklion, Greece

³

Computer Science Department, University of Crete, 70013 Heraklion, Greece

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(17), 7640; https://doi.org/10.3390/app14177640

Submission received: 17 July 2024 / Revised: 21 August 2024 / Accepted: 23 August 2024 / Published: 29 August 2024

(This article belongs to the Special Issue Applications of Data Science and Artificial Intelligence)

Download

Browse Figures

Versions Notes

Abstract

:

The exponential growth of Knowledge Graphs necessitates effective and efficient methods for their exploration and understanding. Frequently Asked Questions (FAQ) is a service that typically presents a list of questions and answers related to a specific topic, and which is intended to help people understand that topic. Although FAQ has already shown its value on large websites and is widely used, to the best of our knowledge it has not yet been exploited for Knowledge Graphs. In this paper, we present ULYSSES, the first system for automatically constructing FAQ lists for large Knowledge Graphs. Our method consists of three key steps. First, we select the most frequent queries by exploiting the available query logs. Next, we answer the selected queries, using the original graph. Finally, we construct textual descriptions of both the queries and the corresponding answers, exploring state-of-the-art transformer models, i.e., ChatGPT 3.5 and Gemini 1.5 Pro. We evaluate the results of each model, using a human-constructed FAQ list, contributing a unique dataset to the domain and showing the benefits of our approach.

Keywords:

RDF knowledge graphs; large language models; frequently asked questions

1. Introduction

Every day, a huge volume of new information becomes available online. RDF Knowledge Graphs (KGs) are expanding at a rapid pace, now encompassing millions, or even billions, of the triples available on the Web. For instance, the Linked Open Data Cloud currently houses over 62 billion triples, structured into vast and intricate RDF data graphs. The sheer complexity and massive scale of these data sources pose significant challenges in their utilization, highlighting the urgent need for effective and efficient methods to explore and comprehend their content [1].

The promise. Frequently Asked Questions (FAQ), on the other hand, is an essential service for user self-assistance, offered by several websites. FAQ typically presents a list of questions, each associated with an answer, enabling users to rapidly identify key questions/answers in the context of the information provided by the website. However, although FAQ has proven its usefulness on the Web [2], it remains completely unexplored in the context of the large Knowledge Graphs now available online.

The problems. There are several reasons why FAQ is unavailable for KGs. First, since KGs contain a large amount of information, it is unclear which information is the most interesting to present to users. Second, the most interesting Knowledge Graphs contain millions of statements, and, as such, processing them is time-consuming and resource-intensive. Although several approaches have already focused on extracting the most interesting part of a KG in the form of a summary graph [1,3,4], users are accustomed to seeing textual descriptions of both the questions and answers and not structured queries or subgraphs of the original graph.

The solution. In this paper, we present ULYSSES, the first system enabling the construction of high-quality FAQ lists over large Knowledge Graphs.

More specifically, our contributions are the following:

To address the problem of selecting the relevant information from large Knowledge Graphs, ULYSSES does not use the original Knowledge Graph but instead exploits query logs. These query logs are available to the curators of the KGs through the SPARQL endpoints of the corresponding KGs.
ULYSSES identifies the most frequent SPARQL queries in the logs and uses the corresponding SPARQL endpoints to retrieve their answers.
Then, to transform both the queries and the answers into text, it exploits transformer models, i.e., the ChatGPT and the Gemini LLMs, using appropriate prompts.
We evaluate our approach on the DBpedia KG and we show the interesting results achieved by ULYSSES.
As a side effect, we also generate and offer to the community the first golden standard dataset generated by a user study.

To the best of our knowledge, this is the first approach for automated construction of FAQ lists for large Knowledge Graphs, exploiting existing user logs and transformer models.

The rest of this paper is organized as follows: Section 2 presents related work. Section 3 presents our solution, detailing the various steps for generating a high-quality FAQ list for a given knowledge base. In Section 4, we evaluate our approach on DBpedia and present the experimental evaluation of our work. Finally, Section 5 concludes this paper and presents directions for future work.

2. Related Work

2.1. Semantic Summaries

Several works in recent years have focused on summarizing large Knowledge Graphs for indexing, estimating the size of the query results, making queries more specific, source selection, graph visualization, understanding, and schema discovery. For an overview of the entire field, the interested author is referred to our comprehensive summary [1].

Non-quotient semantic summaries is a specific category of semantic summaries that focuses on extracting the most important part of the graph and presenting it as a summary. Notable works in this category are the RDFDigest+ [5], which uses various centrality measures to identify the most important schema nodes and then link them, KCE [6], which extracts only the most interesting schema nodes, and [7], which, again, extracts the most interesting parts over the schema of the Knowledge Graph. Note that in order to handle the large amount of triples in a Knowledge Graph, most of the works in this category focus only on summarizing the schema graph, which is relatively small, and also usually incomplete.

Lately, works have started to appear that try to exploit query logs (which are usually small) instead of Knowledge Graphs (which are usually large) for this task [4,8,9]. These works identify the most interesting nodes as the ones mostly queried, and queries are also exploited, in order to find the most frequent paths, so as to link the most interesting nodes.

ULYSSES exploits the very same ideas, using query logs to extract the most frequent questions; however, it moves beyond summaries, to construct fully fledged question/answer sets in natural language. More specifically, it identifies the most frequent questions as the most important ones; then, it proceeds to issue those queries to the corresponding SPARQL endpoints, in order to retrieve and textually summarize the presented answers.

2.2. Question Answering over KGs

In addition, numerous works in recent years have focused on answering the textual questions of users on KGs or using embeddings to provide answers to given queries [10].

Early systems focused on answering natural language questions by transforming them into logical forms that could be directly executed over a knowledge base [11]. Then, template-based approaches exploited predefined templates, to convert natural language questions into structured queries, such as SPARQL [12]. While efficient, challenges like handling the vast variability in natural language expressions and the explosion of the required templates for different question types made these approaches hard to use.

Recent advancements have leveraged machine learning, especially neural networks, to improve question answering over KGs. These models, including convolutional neural networks and transformer-based models, such as BERT, enhance entity recognition and relation detection, providing more accurate answers. Multicolumn convolutional neural networks, for example, have been used to handle the complexity of freebase data, improving the extraction of relevant information to answer questions [13].

The process that ULYSSES is using to construct the FAQs could be complemented by these approaches. ULYSSES identifies the most important queries and then directly uses the corresponding SPARQL endpoints to answer them. Then, LLMs transform the answers/questions into text. Another direction that could be explored would be to directly answer the selected queries over the Knowledge Graph, using the aforementioned techniques, which could also directly provide the resulting textual description of the answer.

2.3. FAQ Generation

The existing works in this domain work on a given textual corpus and mostly focus on auto-generating questions and answers [14]. In this regard, most approaches first extract answers from an input text and then generate the corresponding questions [15,16]. Other approaches include additional modules to validate the question–answer pairs [17] or introducing diversity into the selected pairs, using variational auto-encoders [18].

Another approach uses retrieval methods to directly construct FAQ lists. For example, Ref. [19] introduced a customizable and partially automated method to identify and create FAQ pairs within open source projects, and [20] focused on student forums, using a hierarchical agglomeration clustering method. FAQ lists for online discussion forums are also generated by selecting past question–answer pairs, using a random forest classifier [21] to present a system that automatically generates education-specific FAQ pairs from online discussion forums. Furthermore, the latest generative methods, such as [14], combine question extraction and question generation, to create and improve FAQ lists.

Finally, other approaches focus on answering the questions in FAQ lists by building a Knowledge Graph to train the pre-trained model [22,23,24,25].

To the best of our knowledge, however, we have not seen any work trying to generate FAQ lists over Knowledge Graphs, and, as such, ULYSSES moves the domain one step forward.

3. Methodology

3.1. Preliminaries

In this paper, we focus on RDF Knowledge Graphs, as RDF is among the most widely used standards for publishing and representing data on the Web, promoted by the W3C for semantic web applications. An RDF KG

G

is a set of triples of the form

(s, p, o)

. A triple states that a subject s has the property p, and the value of that property is the object o. We consider only well-formed triples, according to the RDF specification [26]. These belong to

(U \cup B) \times U \times (U \cup B \cup L)

, where

U

is a set of Uniform Resource Identifiers (URIs),

L

is a set of typed or untyped literals (constants), and

B

is a set of blank nodes (unknown URIs or literals);

U, B, L

are pairwise disjoint. Additionally, we assume an infinite set

X

of variables that is disjoint from the previous sets. Blank nodes are essential features of RDF, allowing us to support unknown URI/literal tokens. The RDF standard includes the rdf:type property, which allows for specifying the type(s) of a resource. Each resource can have zero, one or several types.

For querying, we use SPARQL [27], the W3C standard for querying RDF datasets. The basic building units of the SPARQL queries are triple pattern and Basic Graph Pattern (BGP). A triple pattern is a triple from

(U \cup B \cup X) \times (U \cup X)) \times (U \cup B \cup L \cup X)

. A set of triple patterns constitutes a Basic Graph Pattern (BGP).

3.2. The Problem

Next, we informally define the problem we address. The problem can be described as follows: Given a Knowledge Graph G, efficiently construct a set of question–answer pairs that best captures the content of the information contained in G.

An idea based on similar literature for resolving the aforementioned problem would be first to try and identify the most interesting parts of the graph and use them to generate the corresponding text to be used as an answer. Then, construct the appropriate questions for those answers.

For addressing the first subproblem, past approaches in the summarization domain have used adapted centrality measures to successfully identify the most important areas of a KG [1]. However, calculating centrality measures on top of large graphs, counting millions of nodes, is computationally expensive and in many cases unfeasible.

We assume now that for the KG G we have available a query log

Q = {q_{1}, \dots, q_{n}}

. This assumption is reasonable, as all large KGs offer a SPARQL endpoint that logs user queries for various purposes, as multiple studies have already confirmed (e.g., [28]).

Having such a query log available, our first idea is that we can use it instead of the KG to mine the most important areas of a KG, as indeed the most interesting parts of the graph are mostly queried. Such a query log (a) directly provides us with the most frequent questions, (b) is computationally cheap, as query logs need to be scanned only once for finding the most frequent queries, and they usually do not scale beyond thousands of queries, and (c) offers an optimal way for retrieving the answers from a KG—directly issue the few selected queries to the KG using the corresponding SPARQL endpoints, in order to be answered.

3.3. The FAQGen Algorithm

Next, based on the aforementioned ideas, we present the FAQGen algorithm for generating an FAQ list. The algorithm, shown in Algorithm 1, is given as input a query log Q, a SPARQL endpoint, and a number n denoting the number of question–answer pairs to be included in the constructed FAQ.

The algorithm starts by finding the n most frequent queries in the query log (line 1). Then, it initializes the FAQ list (line 2). Then, it iterates over the selected n most frequent queries in the query log (lines 3–8). For each query, it retrieves an answer from the corresponding SPARQL endpoint (line 4), it transforms both the query and the answer into a textual form, giving the proper prompts to the LLM models (lines 5–6), and, eventually, it adds the textual question–answer pair to the FAQ list. Finally, it returns the constructed FAQ list (line 9) for further visualization.

The algorithm needs to iterate one time over all the queries in the log requiring

O (| Q |)

, in order to find the most frequent ones. Then, for the selected n queries it needs to answer them using the SPARQL endpoint. Although answering generic SPARQL queries is generally NP-complete, triple stores exploit advanced indexing and caching techniques and return an answer—even if it is incomplete—usually within 30 s. The same applies to LLMs, which typically return an answer within 1 to 10 s. Considering that the whole process needs to be repeated only for a limited number of queries, computing FAQ lists over large Knowledge Graphs exploiting our techniques is feasible and, as we shall show, gives high-quality results.

Algorithm 1 FAQGen(Q,

e n d p o i n t

, n)

Input:

Q – the query log;

e n d p o i n t

– the SPARQL endpoint of the corresponding KG;

n – the number of question–answers to include in the resulting FAQ

Output:

F A Q

– a set of question answers

1:: $m f Q \leftarrow F i n d M o s t F r e q u e n t Q u e s t i o n s (Q, n)$
2:: $F A Q \leftarrow \emptyset$ $q \in m f Q$
3:: for all q ∈ mfQ do
4:: $a n s w e r \leftarrow s e n d S P A R Q L (e n d p o i n t, q)$
5:: $t e x t u a l A n s w e r \leftarrow t r a n f o r m T o T e x t (a n s w e r)$
6:: $t e x t u a l Q u e s t i o n \leftarrow t r a n f o r m T o T e x t (q)$
7:: $F A Q \leftarrow F A Q \cup {t e x t u a l Q u e s t i o n, t e x t u a l A n s w e r}$
8:: end for
9:: Return $F A Q$

4. Implementation

To generate an FAQ list for a KG, all the aforementioned algorithms have been implemented with the ULYSSES system. ULYSSES was implemented in Python, using the corresponding APIs for accessing ChatGPT and Gemini. The code of the ULYSSES system, along with the dataset and workload, is available in our GitHub repository (https://github.com/giannisvassiliou/KGFaq/, accessed on 1 August 2024), while the system is accessible online (https://clinquant-belekoy-781b9f.netlify.app/, accessed on 1 August 2024). The system uses a three-layer architecture, shown in Figure 1. The system can be configured to use either the Gemini or the ChatGPT LLM, in order to transform queries and answers into text. A screenshot of the generated system is shown in Figure 2:

Next, we describe each one of these layers in detail.

4.1. The Data Layer

At the bottom, the data layer includes the sources upon which the FAQ list is built: the query log and the SPARQL endpoint of the selected KG. As already mentioned, such query logs are usually available to the curators of the KGs, among others, by recording the queries issued to the corresponding SPARQL endpoints. Well-known such query logs are already available from WikiData [29], DBpedia, Semantic Web Dog Food, LinkedGeoData, BioPortal, OpenBioMed, and the British Museum [28].

An example query from the DBpedia query log is shown below, asking information about the movie “Eyes Wide Shut”:

4.2. The Service Layer

The service layer constructs the FAQ list, using three discrete and sequential steps. First, the existing query logs are preprocessed and cleaned. Then, the most frequent queries are identified and the corresponding KG is used to answer them. Eventually, both the queries and answers are transformed into text, using transformer models.

4.2.1. Query Preprocessing and Cleaning

Initially, ULYSSES preprocesses the SPARQL log file, eliminating queries that are malformed or erroneous. Then, it excludes DESCRIBE, ASK, and CONSTRUCT queries, as they do not contribute to an FAQ list. It also removes queries that are too generic and exploratory and would not contribute to a useful FAQ list (e.g., SELECT * WHERE ?a <property> ?b.). Finally, it excludes queries that do not return any results.

4.2.2. Query Selection and Answering

After the preprocessing of the queries, ULYSSES selects next the most frequent ones to use for the construction of the FAQ list. The selected queries are then answered, using the corresponding SPARQL endpoint. The results are collected, to be provided for the transformation step.

Continuing our example, as Q1 is one of the most frequent queries, it is sent to the DBpedia SPARQL endpoint to be answered. A partial list with the results from the endpoint is shown in Table 1:

4.2.3. Transformation to Text

Next, in order to generate textual FAQ lists, we translate both the queries and the answers to text, using large transformer models. The aim is to unlock the potential for a seamless and intuitive translation of intricate SPARQL queries and their corresponding answers, paving the way for a more user-friendly and inclusive knowledge-sharing experience. ULYSSES uses the following two modules for the translation:

ChatGPT 3.5: ChatGPT is a state-of-the-art language model developed by OpenAI, capable of understanding and generating human-like text across a wide range of topics and styles. It uses a deep neural network architecture with 175 billion parameters to achieve its language-processing abilities.
Gemini 1.5 Pro: Google’s Gemini is a more recent entrant, enabling even broader capabilities as a multimodal AI model, integrating language processing with image processing and, potentially, other modalities [30].

First, the models translate the most frequent SPARQL queries into plain text. Then, based on the KG responses and the textual question, ULYSSES requests a plain-language translation for the responses from the two LLMs. In the response conversation within those models, the translated query is included, in order to ensure coherence and context continuity.

For translating SPARQL queries, we provide the following prompts to the two models:

Similarly, for the responses the following prompts are sent to the two models:

4.3. The GUI Layer

The GUI presents the constructed list with the frequently asked questions. In addition, the GUI allows switching between the two language models, to see their results, and it also organizes the extracted information into distinct categories, which are also generated using the transformer models.

For our example SPARQL query, Q1, the results presented in Figure 3 are presented to the user. Note that both the query and the result have been transformed into text using the ChatGPT model; however, the user can select the Gemini model from the interface, if necessary.

5. Experimental Evaluation

The evaluation was performed using Windows 10 with an Intel^® Core^TM i3 10100 CPU @ 3.60GHz (4 cores) and 16 GB RAM. The source code and guidelines on how to download the datasets and the workloads are available online (https://github.com/giannisvassiliou/KGFaq, accessed on 1 August 2024).

5.1. Datasets

In this study, we utilized the DBpedia (version 3.8), a real-world KG, along with its corresponding query workload. The dataset contains 3.77 million entities with 400 million facts and occupies 103 GB of storage. We accessed locally the KG, which was stored in a Virtuoso triple store. The available query log consisted of 58,604 user queries, while the statistics on the preprocessing of the queries are shown in Table 2:

5.2. Baselines

As baselines, we used the state-of-the-art T5 and BART transformer models:

T5 (Text-To-Text Transfer Transformer) [31] is a versatile language model developed by Google Research, capable of performing a wide range of natural language-processing tasks by framing them all as text-to-text problems. It achieves this by converting input and output pairs into a unified text format, enabling it to seamlessly handle tasks such as translation, summarization, and answering questions.

BART (Bidirectional and Auto-Regressive Transformers) [32] is a large-scale pretraining model developed by Facebook AI. It excels in various natural language-processing tasks by combining auto-regressive and denoising objectives during training, enabling it to generate coherent and contextually rich text.

5.3. Evaluation Task

Next, we report the experimental evaluation of ULYSSES on DBpedia KG. As we state at the end of the related work section, there is not any direct competitor to be compared with. Furthermore, by definition, the queries that ULYSSES selects are the most frequent ones. As such, we evaluated the quality of the textual representation of the answers on four models generated by external LLM services.

5.4. Golden Standard Construction

In order to generate a reference FAQ list with which to compare, we conducted a user study. For that study, we extracted the 50 most frequent questions, issued the queries to the SPARQL endpoint, and retrieved the corresponding answers. Then, we asked computer science graduate students (of the “Semantic Web” course) to construct a three-line textual description for the results of each query. For each query, three results were produced by three distinct students. The generated dataset is also available online on the GitHub of the project.

5.5. Metrics

For the evaluation of the constructed textual answers, we used the Rouge scores [33], a set of metrics proposed to evaluate the quality of automatic summarization and machine translation outputs by comparing them to reference summaries. We used Rouge-1 (overlap of words), Rouge-2 (overlap of bigrams), and Rouge-L (the longest common subsequence (LCS)) to evaluate the results of the various models and baselines, in order to calculate the sentence-level similarity, as commonly done in the domain. Typically, higher Rouge scores indicate better agreement between the generated and reference summaries.

5.6. Results

The results are shown in Figure 4, Figure 5 and Figure 6. Specifically, looking at the Rouge-1 (refer to Figure 4) chart, which assesses the ability to capture relevant words, BART achieved the highest precision score (0.42), followed by Gemini (0.38). Regarding recall, which measures the ability to identify positive cases, ChatGPT achieved the highest score (0.48), followed by Gemini (0.37). However, in terms of f-measure, which provides an overall assessment of performance, ChatGPT attained the highest score (0.37), followed by Gemini (0.31), whereas the rest followed.

Moving on to the Rouge-2 chart (refer to Figure 5), which identifies relevant pairs of words, BART once again secured the highest score, in terms of precision (0.25), followed by T5 (0.20). In terms of recall, ChatGPT achieved the highest score (0.24), followed by Gemini (0.15). Subsequently, the f-measure’s top scorer was ChatGPT (0.18), followed by Gemini (0.13).

For Rouge-L (refer to Figure 6), which captures the structural coherence of the generated summaries, BART obtained the highest precision score (0.38), followed by ChatGPT (0.29). In terms of recall, ChatGPT again performed the best among the others (0.40), followed by Gemini (0.30). The highest f-measure score was achieved by ChatGPT (0.30), followed by Gemini (0.25).

Overall, in the evaluation, which examined various NLP models in the task of query answering, ChatGPT demonstrated the best performance in recall and f-measure across metrics, followed by Gemini. Gemini showed a competitive but slightly inferior performance compared to ChatGPT. Subsequently, BART excelled in precision, thus ensuring the quality of the positive predictions. T5 generally performed the poorest among the four models, with lower precision, recall, and f-measure scores across all metrics. In general, BART seemed to be more effective for precise answers, while ChatGPT and Gemini emphasized capturing a wide range of relevant information and maintaining the general structure of the result.

Although ChatGPT performed best, the result scores were considered moderate in the bibliography [34]. However, it achieved the best balance between relevance and capturing key points, as indicated by the f-measure scores compared to the other evaluated models. This shows that ChatGPT is the most suitable tool in constructing FAQ lists in our dataset. Furthermore, the quality of the responses from BART and T5 were inferior to those from ChatGPT and Gemini, which managed to build better human-readable sentences with high coherency – giving the impression of not being computer generated. T5, especially, produced sentences that seemed not to be coherent.

5.7. Interesting Observations

While using the Gemini and ChatGPT models, certain restrictions were observed that are useful to report.

For example, despite explicitly requesting responses in plain sentences without bullets or lists, the models exhibited a tendency to deviate from maintaining a structured response format.

Furthermore, it is crucial to highlight that the use of the free version of the models came with token restrictions, imposing constraints on the length of the responses, potentially leading to truncation or incomplete translations for longer queries or complex sentences. In addition to the observed limitations in response to structure and token usage, we should also note that there are restrictions on the number of requests when using the ChatGPT 3.5 Turbo model.

Despite these limitations, both ChatGPT and Gemini constitute powerful tools for language understanding and generation.

6. Conclusions

This study highlights the revolutionary possibilities of incorporating AI technologies, such as ChatGPT and Gemini, in the exploration and understanding of large Knowledge Graphs. ULYSSES is the first system that enables automatic construction of FAQ lists by exploiting existing query logs and the corresponding SPARQL endpoints to answer the most frequent queries. In addition, it leverages the strengths of ChatGPT and Gemini for translating SPARQL queries and query results into plain text. ULYSSES is available online, and both the code and datasets are also available, contributing a novel ground truth dataset in the field.

In essence, to the best of our knowledge, there is no other work currently available that generates FAQ lists for Knowledge Graphs. As such, this study demonstrates the unique potential of AI-driven techniques in managing KGs and opens a new direction of research, laying the groundwork for further study and advancement. Furthermore, the modular architecture of our solution enables the uninterrupted replacement of the current LLM models by new, more powerful ones that might arise or new versions of the existing ones.

Limitations. We have to acknowledge the potential limitations of the various models that were used in our approach. First, it is true that those models were trained on natural language data. While they excel in understanding and generating text, their proficiency in handling structured data, such as SPARQL queries, or understanding the intricacies of RDF (Resource Description Framework) schemas can be limited. They might struggle to accurately interpret the nuances of SPARQL queries, especially when complex joins, filters, or aggregations are involved.

Furthermore, the results of the queries might be too many, and a standard approach of the endpoints is to limit the results without a prioritization of the importance on the returned part. This might generate challenges in the processing of those large results, as well as incomplete answers. Another problem with the large answers is that the free version of the LLMs can only process a limited number of tokens, also potentially leading to incomplete answers.

Finally, we only explored four transformer models; newer versions of those models or alternative selections might further boost the quality of the generated FAQs.

Future Work. As future work, we intend to explore personalized FAQ lists for users of FAQ lists for specific domains/categories. As KGs contain a vast amount of information, they cannot be summarized in essence with a set of the k most frequent questions and answers. Different users might want a different FAQ overall or might even like to select a specific different domain or topic. As such, FAQ personalization is the direction we will next explore. Furthermore, in cases of large query results, optimal selection of a subset to use for the text would enhance the quality of the returned results. We also leave such an exploration for future work.

Author Contributions

Conceptualization, N.S., G.B., F.D., A.K., G.S., N.O., D.D., E.M., G.P., K.A., S.K., Y.K., N.V., E.S., V.D., I.G. and A.S.; methodology, N.S., G.B., F.D., A.K., G.S., N.O., D.D., E.M., G.P., K.A., S.K., Y.K., N.V., E.S., V.D., I.G. and A.S.; software, N.S., G.B., F.D., A.K., G.S., N.O., D.D., E.M., G.P., K.A., S.K., Y.K., N.V., E.S., V.D., I.G. and A.S.; formal analysis, G.V., G.E.T., G.T.; resources, N.S., G.B., F.D., A.K., G.S., N.O., D.D., E.M., G.P., K.A., S.K., Y.K., N.V., E.S., V.D., I.G. and A.S.; writing—original draft preparation, G.V., G.E.T., G.T.; writing—review and editing, H.K. and N.P.; visualization, G.V., G.E.T., G.T.; supervision, H.K. and N.P.; project administration, G.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The work presented in this paper is the result of the Semantic Web course taught to the Hellenic Mediterranean University by Nikolaos Papadakis and Haridimos Kondylakis during winter 2023.

Conflicts of Interest

The authors declare no conflict of interest.

References

Cebiric, S.; Goasdoué, F.; Kondylakis, H.; Kotzinos, D.; Manolescu, I.; Troullinou, G.; Zneika, M. Summarizing semantic graphs: A survey. VLDB J. 2019, 28, 295–327. [Google Scholar] [CrossRef]
de Oliveira, E.C.C.; da Silva, A.S.; de Moura, E.S.; Cavalcanti, J.M.B. Extracting and Searching Useful Information Available on Web FAQs. In Proceedings of the XXI Simpósio Brasileiro de Banco de Dados, Florianópolis, SC, Brasil, 16–20 October 2006; Anais/Proceedings. Nascimento, M.A., Ed.; UFSC: Florianópolis, Brasil, 2006; pp. 102–116. [Google Scholar]
Trouli, G.E.; Papadakis, N.; Kondylakis, H. Constructing Semantic Summaries Using Embeddings. Information 2024, 15, 238. [Google Scholar] [CrossRef]
Vassiliou, G.; Papadakis, N.; Kondylakis, H. iSummary: Demonstrating Workload-based, Personalized Summaries for Knowledge Graphs. In Proceedings of the ISWC 2023 Posters and Demos: 22nd International Semantic Web Conference, Athens, Greece, 6–10 November 2023; Available online: https://ceur-ws.org/Vol-3632/ISWC2023_paper_435.pdf (accessed on 22 August 2024).
Troullinou, G.; Kondylakis, H.; Stefanidis, K.; Plexousakis, D. Exploring RDFS KBs Using Summaries. In Proceedings of the Semantic Web—ISWC 2018—17th International Semantic Web Conference, Monterey, CA, USA, 8–12 October 2018; Proceedings, Part I. Vrandecic, D., Bontcheva, K., Suárez-Figueroa, M.C., Presutti, V., Celino, I., Sabou, M., Kaffee, L., Simperl, E., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2018; Volume 11136, pp. 268–284. [Google Scholar] [CrossRef]
Motta, E.; Mulholland, P.; Peroni, S.; d’Aquin, M.; Gómez-Pérez, J.M.; Mendez, V.; Zablith, F. A Novel Approach to Visualizing and Navigating Ontologies. In Proceedings of the Semantic Web—ISWC 2011—10th International Semantic Web Conference, Bonn, Germany, 23–27 October 2011; Proceedings, Part I. Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N.F., Blomqvist, E., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2011; Volume 7031, pp. 470–486. [Google Scholar] [CrossRef]
Zhang, X.; Cheng, G.; Ge, W.; Qu, Y. Summarizing Vocabularies in the Global Semantic Web. J. Comput. Sci. Technol. 2009, 24, 165–174. [Google Scholar] [CrossRef]
Vassiliou, G.; Alevizakis, F.; Papadakis, N.; Kondylakis, H. iSummary: Workload-Based, Personalized Summaries for Knowledge Graphs. In Proceedings of the Semantic Web—20th International Conference, ESWC 2023, Hersonissos, Greece, 28 May–1 June 2023; Proceedings. Pesquita, C., Jiménez-Ruiz, E., McCusker, J.P., Faria, D., Dragoni, M., Dimou, A., Troncy, R., Hertling, S., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2023; Volume 13870, pp. 192–208. [Google Scholar] [CrossRef]
Vassiliou, G.; Troullinou, G.; Papadakis, N.; Kondylakis, H. WBSum: Workload-based Summaries for RDF/S KBs. In Proceedings of the SSDBM 2021: 33rd International Conference on Scientific and Statistical Database Management, Tampa, FL, USA, 6–7 July 2021; Zhu, Q., Zhu, X., Tu, Y., Xu, Z., Kumar, A., Eds.; ACM: New York, NY, USA, 2021; pp. 248–252. [Google Scholar] [CrossRef]
Khan, A. Knowledge Graphs Querying. SIGMOD Rec. 2023, 52, 18–29. [Google Scholar] [CrossRef]
Diefenbach, D.; López, V.; Singh, K.D.; Maret, P. Core techniques of question answering systems over knowledge bases: A survey. Knowl. Inf. Syst. 2018, 55, 529–569. [Google Scholar] [CrossRef]
Formica, A.; Mele, I.; Taglino, F. A template-based approach for question answering over knowledge bases. Knowl. Inf. Syst. 2024, 66, 453–479. [Google Scholar] [CrossRef]
Lukovnikov, D.; Fischer, A.; Lehmann, J.; Auer, S. Neural Network-based Question Answering over Knowledge Graphs on Word and Character Level. In Proceedings of the 26th International Conference on World Wide Web, WWW 2017, Perth, Australia, 3–7 April 2017; Barrett, R., Cummings, R., Agichtein, E., Gabrilovich, E., Eds.; ACM: New York, NY, USA, 2017; pp. 1211–1220. [Google Scholar] [CrossRef]
Raazaghi, F. Auto-FAQ-Gen: Automatic Frequently Asked Questions Generation. In Proceedings of the Advances in Artificial Intelligence—28th Canadian Conference on Artificial Intelligence, Canadian AI 2015, Halifax, NS, Canada, 2–5 June 2015; Proceedings. Barbosa, D., Milios, E.E., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2015; Volume 9091, pp. 334–337. [Google Scholar] [CrossRef]
Du, X.; Cardie, C. Harvesting Paragraph-level Question-Answer Pairs from Wikipedia. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, 15–20 July 2018; Volume 1: Long Papers. Gurevych, I., Miyao, Y., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2018; pp. 1907–1917. [Google Scholar] [CrossRef]
Willis, A.; Davis, G.M.; Ruan, S.; Manoharan, L.; Landay, J.A.; Brunskill, E. Key Phrase Extraction for Generating Educational Question-Answer Pairs. In Proceedings of the Sixth ACM Conference on Learning @ Scale, L@S 2019, Chicago, IL, USA, 24–25 June 2019; ACM: New York, NY, USA, 2019; pp. 20:1–20:10. [Google Scholar] [CrossRef]
Kumar, A.; Kharadi, A.; Singh, D.; Kumari, M. Automatic question-answer pair generation using Deep Learning. In Proceedings of the 2021 Third International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 2–4 September 2021; pp. 794–799. [Google Scholar] [CrossRef]
Shinoda, K.; Sugawara, S.; Aizawa, A. Improving the Robustness of QA Models to Challenge Sets with Variational Question-Answer Pair Generation. In Proceedings of the ACL-IJCNLP 2021 Student Research Workshop, ACL 2021, Online, 5–10 July 2021; Kabbara, J., Lin, H., Paullada, A., Vamvas, J., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2021; pp. 197–214. [Google Scholar] [CrossRef]
Hu, W.; Yu, D.; Jiau, H.C. A FAQ Finding Process in Open Source Project Forums. In Proceedings of the Fifth International Conference on Software Engineering Advances, ICSEA 2010, Nice, France, 22–27 August 2010; Hall, J.G., Kaindl, H., Lavazza, L., Buchgeher, G., Takaki, O., Eds.; IEEE Computer Society: Washington, DC, USA, 2010; pp. 259–264. [Google Scholar] [CrossRef]
Sindhgatta, R.; Marvaniya, S.; Dhamecha, T.I.; Sengupta, B. Inferring Frequently Asked Questions from Student Question Answering Forums. In Proceedings of the 10th International Conference on Educational Data Mining, EDM 2017, Wuhan, China, 25–28 June 2017; Hu, X., Barnes, T., Hershkovitz, A., Paquette, L., Eds.; International Educational Data Mining Society (IEDMS): Palermo, Italy, 2017. [Google Scholar]
Bihani, A.; Ullman, J.D.; Paepcke, A. FAQtor: Automatic FAQ Generation Using Online Forums; Technical Report; Stanford InfoLab: Stanford, CA, USA, 2018. [Google Scholar]
Zhao, H.; Liu, Y.; Hou, A.; Gu, J. Knowledge Graph based Question Pair Matching for Domain-Oriented FAQ System. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, SMC 2022, Prague, Czech Republic, 9–12 October 2022; IEEE: New York, NY, USA, 2022; pp. 2103–2108. [Google Scholar] [CrossRef]
Xie, R.; Lu, Y.; Lin, F.; Lin, L. FAQ-Based Question Answering via Knowledge Anchors. In Proceedings of the Natural Language Processing and Chinese Computing—9th CCF International Conference, NLPCC 2020, Zhengzhou, China, 14–18 October 2020; Proceedings, Part I. Zhu, X., Zhang, M., Hong, Y., He, R., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2020; Volume 12430, pp. 3–15. [Google Scholar] [CrossRef]
Liu, A.; Huang, Z.; Lu, H.; Wang, X.; Yuan, C. BB-KBQA: BERT-Based Knowledge Base Question Answering. In Proceedings of the Chinese Computational Linguistics—18th China National Conference, CCL 2019, Kunming, China, 18–20 October 2019; Proceedings. Sun, M., Huang, X., Ji, H., Liu, Z., Liu, Y., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2019; Volume 11856, pp. 81–92. [Google Scholar] [CrossRef]
Tseng, W.; Wu, C.; Hsu, Y.; Chen, B. FAQ Retrieval using Question-Aware Graph Convolutional Network and Contextualized Language Model. In Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2021, Tokyo, Japan, 14–17 December 2021; IEEE: New York, NY, USA, 2021; pp. 2006–2012. [Google Scholar]
W3C. Resource Description Framework. Available online: http://www.w3.org/RDF/ (accessed on 1 August 2024).
W3C. Recommendation, SPARQL Query Language for RDF. Available online: https://www.w3.org/TR/rdf-sparql-query/ (accessed on 1 August 2024).
Bonifati, A.; Martens, W.; Timm, T. An analytical study of large SPARQL query logs. VLDB J. 2020, 29, 655–679. [Google Scholar] [CrossRef]
Malyshev, S.; Krötzsch, M.; González, L.; Gonsior, J.; Bielefeldt, A. Getting the Most Out of Wikidata: Semantic Technology Usage in Wikipedia’s Knowledge Graph. In Proceedings of the Semantic Web—ISWC 2018—17th International Semantic Web Conference, Monterey, CA, USA, 8–12 October 2018; Proceedings, Part II. Vrandecic, D., Bontcheva, K., Suárez-Figueroa, M.C., Presutti, V., Celino, I., Sabou, M., Kaffee, L., Simperl, E., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2018; Volume 11137, pp. 376–394. [Google Scholar] [CrossRef]
Anil, R.; Borgeaud, S.; Wu, Y.; Alayrac, J.; Yu, J.; Soricut, R.; Schalkwyk, J.; Dai, A.M.; Hauth, A.; Millican, K.; et al. Gemini: A Family of Highly Capable Multimodal Models. arXiv 2023, arXiv:2312.11805. [Google Scholar] [CrossRef]
Etemad, A.G.; Abidi, A.I.; Chhabra, M. Fine-Tuned T5 for Abstractive Summarization. Int. J. Performability Eng. 2021, 17, 900–906. [Google Scholar]
Venkataramana, A.; Srividya, K.; Cristin, R. Abstractive Text Summarization Using BART. In Proceedings of the 2022 IEEE 2nd Mysore Sub Section International Conference (MysuruCon), Mysuru, India, 16–17 October 2022; pp. 1–6. [Google Scholar] [CrossRef]
Lin, C.Y. Rouge: A package for automatic evaluation of summaries. In Text Summarization Branches Out; Association for Computational Linguistics: Barcelona, Spain, 2004; pp. 74–81. [Google Scholar]
What is the ROUGE Score (Recall-Oriented Understudy for Gisting Evaluation)? Available online: https://klu.ai/glossary/rouge-score (accessed on 29 February 2024).

Figure 1. High-level architecture of the ULYSSES system.

Figure 2. An example screenshot of the ULYSSES system.

Figure 3. Example textual description for both Q1 and its answers, using ChatGPT.

Figure 4. Rouge-1 score.

Figure 5. Rouge-2 score.

Figure 6. Rouge-L score.

Table 1. Q1 results from the DBpedia SPARQL endpoint.

RelationLabel	ObjectLabel
Link from a Wikipage to another Wikipage	“Cahiers du Cinéma”@en
Link from a Wikipage to another Wikipage	“Cape Cod”@en
Link from a Wikipage to another Wikipage	“Cate Blanchett”@en
Link from a Wikipage to another Wikipage	“Erotic mystery films”@en
Link from a Wikipage to another Wikipage	“Baby Did a Bad, Bad Thing”@en

Table 2. Analysis of the queries in the log.

Query Statistics	Number
Initial queries in the dataset file	58,604
Unique Queries	42,870
DESCRIBE, CONSTRUCT, ASK queries	1371
Queries containing generic patterns	34
Total queries excluded	1405
Unique queries after the exclusion process	41,465

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Vassiliou, G.; Trouli, G.E.; Troullinou, G.; Spyridakis, N.; Bitzarakis, G.; Droumalia, F.; Karagiannakis, A.; Skouteli, G.; Oikonomou, N.; Deka, D.; et al. ULYSSES: Automated FreqUentLY ASked QueStions for KnowlEdge GraphS. Appl. Sci. 2024, 14, 7640. https://doi.org/10.3390/app14177640

AMA Style

Vassiliou G, Trouli GE, Troullinou G, Spyridakis N, Bitzarakis G, Droumalia F, Karagiannakis A, Skouteli G, Oikonomou N, Deka D, et al. ULYSSES: Automated FreqUentLY ASked QueStions for KnowlEdge GraphS. Applied Sciences. 2024; 14(17):7640. https://doi.org/10.3390/app14177640

Chicago/Turabian Style

Vassiliou, Giannis, Georgia Eirini Trouli, Georgia Troullinou, Nikolaos Spyridakis, George Bitzarakis, Fotini Droumalia, Antonis Karagiannakis, Georgia Skouteli, Nikolaos Oikonomou, Dimitra Deka, and et al. 2024. "ULYSSES: Automated FreqUentLY ASked QueStions for KnowlEdge GraphS" Applied Sciences 14, no. 17: 7640. https://doi.org/10.3390/app14177640

APA Style

Vassiliou, G., Trouli, G. E., Troullinou, G., Spyridakis, N., Bitzarakis, G., Droumalia, F., Karagiannakis, A., Skouteli, G., Oikonomou, N., Deka, D., Makaronas, E., Pronoitis, G., Alexandris, K., Kostopoulos, S., Kazantzakis, Y., Vlassis, N., Sfinarolaki, E., Daskalakis, V., Giannakos, I., ... Kondylakis, H. (2024). ULYSSES: Automated FreqUentLY ASked QueStions for KnowlEdge GraphS. Applied Sciences, 14(17), 7640. https://doi.org/10.3390/app14177640

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

ULYSSES: Automated FreqUentLY ASked QueStions for KnowlEdge GraphS

Abstract

1. Introduction

2. Related Work

2.1. Semantic Summaries

2.2. Question Answering over KGs

2.3. FAQ Generation

3. Methodology

3.1. Preliminaries

3.2. The Problem

3.3. The FAQGen Algorithm

4. Implementation

4.1. The Data Layer

4.2. The Service Layer

4.2.1. Query Preprocessing and Cleaning

4.2.2. Query Selection and Answering

4.2.3. Transformation to Text

4.3. The GUI Layer

5. Experimental Evaluation

5.1. Datasets

5.2. Baselines

5.3. Evaluation Task

5.4. Golden Standard Construction

5.5. Metrics

5.6. Results

5.7. Interesting Observations

6. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI