Knowledge Graph-Based Framework for Decision Making Process with Limited Interaction

Abstract: In this work, we present an algorithmic framework that supports a decision process in which an end user is assisted by a domain expert to solve a problem. The communication between the end user and the domain expert is characterized by a limited number of questions and answers. The framework we have developed helps the domain expert pinpoint a small number of questions for the end user, increasing the likelihood that the expert's insights are correct. The proposed framework is based on the domain expert's knowledge and includes interaction with both the domain expert and the end user. The domain expert's knowledge is represented by a knowledge graph, and the end user's information related to the problem is entered into the graph as evidence. This triggers the inference algorithm in the graph, which suggests to the domain expert the next question for the end user. The paper presents the proposed framework in detail in the medical diagnostic domain; however, it can be adapted to additional domains with a similar setup. The software framework we have developed makes the decision-making process accessible in an interactive and explainable manner and, through its use of semantic technology, is innovative.


Introduction
In recent years, the world of "big data" has gained significant momentum and continues to generate opportunities and challenges [1,2]. The various uses of big data have penetrated almost every field of the technological world. We are interested in the challenge of integrating big data in the technological realm dealing with decision-making processes in order to leverage these processes.
These processes can be found in a wide variety of content worlds (medicine, commerce, education, etc.) and require an understanding of situation awareness, data modeling, and algorithms for delivering intelligent insights. However, these processes provide different answers to different needs; thus, there are several types of decision-making processes, each with a suitable setup [3,4].
In this work, we focus on decision-making processes with the following setup: (a) the process involves two entities: an end user and a domain expert, (b) the end user initiates the process with a request, and (c) the interaction between the two entities is limited in the number of questions and answers. Consider the following two examples of domains whose processes are naturally suitable for such a setup: medical diagnosis [5] and appliance repairs [6] (Table 1): Table 1. Examples for domains with limited interaction.

|               | Appliance Repairs | Medical Diagnosis |
| Domain expert | Service center representative | Clinician |
| End user      | Customer | Patient |
| Interaction   | Limited, as the representative has a small number of questions for the end user. Using the end user's answers, the representative must identify the type of fault, and, on this basis, the treatment will be determined. | Limited, as the clinician has about 10 minutes per patient, during which they (a) ask the patient a small number of questions (symptoms) and (b) decide on a limited number of tests. |

As noted, the two mentioned domains contain a two-sided limited interaction. The limitation can be expressed in terms of time, the number of questions, etc. Note that both domains, the medical and the appliance repairs, are broad domains that can be specialized into specific subdomains. For example, the domain of appliance repairs can be specialized into construction service, internet service, household faults service, etc. The same goes for the medical domain. It may also contain subdomains, such as medical counseling in various fields (e.g., psychology), treatment of urgent medical calls, etc.
The suggested framework includes two main components: (a) a formal representation of the relevant domain expert's knowledge using semantic technology, specifically a knowledge graph, and (b) an interactive set of algorithms that begins with a set of initial domain values (i.e., prior knowledge of the end user), then, based on this prior knowledge and the knowledge graph representation, it will suggest specific questions to the end user. Answers to these questions will advance the domain expert in the decision-making process and become input for the next iteration. The iterations will continue until the domain expert is satisfied and a decision is made.
We were motivated to represent the expert's knowledge via a knowledge graph as graphs have emerged as a natural way of representing connected data [7]. Efforts during the last decade have organized large amounts of data as collections of nodes and edges, especially in recommendation systems, search engine optimization, and decision-making processes [8][9][10]. The resulting flexible structure, called a knowledge graph, allows quick adaptation of complex data and connections through relationships. Their inherent interconnectivity enables the use of graph algorithms to reveal hidden patterns and infer new knowledge [11][12][13][14]. Furthermore, knowledge graphs are computationally efficient and scale to very large sizes, as exemplified by social graph analysis [15,16].
Our framework was inspired by the perception presented by Musen and his colleagues [17], who are well-known researchers in the field of biomedical informatics, regarding information technology that assists with clinical decision support (CDS). Musen et al. [17] present the guiding principles for systems that provide CDS: their discourse is about communication rather than retrieving information, recommendations rather than producing reports, and assisting domain experts to develop more informed judgments. Respectively, the concept that led us in developing our framework is to provide the domain expert with recommendations inferred from the analysis of relevant data represented by a graph and enable him to make an informed decision. Nevertheless, an additional leading concept was to carry it out with a limited number of iterations. Our framework can be extended to additional domains.
In the presented work, we introduce a new approach for an interactive framework designed to support decision-making processes characterized by a limited number of interactions. The framework is innovative in being generic and in using a graph data model, graph algorithms, and semantic technology. We ran our algorithms on a real data set and demonstrated the framework's feasibility in a possible realistic scenario. Hence, we provide a proof of concept for our framework.
To illustrate the proposed framework, we begin by reviewing knowledge graphs and decision-making processes (Section 2). We then define the framework's terminology and its algorithms (Section 3). Following this, we demonstrate the framework in the medical diagnostic domain, using a data set consisting of diseases and patient symptoms (Section 4). Finally, we summarize and consider potential future directions (Section 5).

Background
In this subsection, we review semantic technologies, and, in particular, knowledge graphs (KG). Then, we describe the algorithms we used on top of the KG within our framework.

Knowledge Graphs
A knowledge graph encodes data in the form of graph structures by capturing relationships between entities in a flexible manner. Knowledge graphs, or representations of information as semantic graphs, have attracted widespread attention in both the industrial and the academic worlds. Their property of providing semantically structured information has realized important solutions for many tasks, including question answering [18], recommendation systems [8], and information retrieval [19]. Knowledge graphs are also considered to offer great promise for building more intelligent machines.

Community Detection
With respect to graphs, a community can be defined as a subset of nodes densely connected to each other and loosely connected to nodes in the other communities in the same graph. Detecting communities in graphs is an important algorithmic challenge in the process of data understanding. Many methods have been devised over the last few years within different scientific disciplines, such as physics, biology, computer science, and social science. Recent studies show that by combining graph topology and node properties, we can better understand community structures in complex graphs [20]. Common algorithms for community detection in large graphs are the Louvain method and modularity optimization, as described below.

Louvain Method
The Louvain method is an algorithm for the detection of communities in large graphs [21]. For each community, the algorithm maximizes a modularity score. The modularity quantifies the quality of an assignment of nodes to communities. Namely, it provides an evaluation of how much more densely connected the nodes within a community are, compared to what could be expected in a random graph. The Louvain algorithm is a hierarchical clustering algorithm that recursively merges communities into single nodes and executes the modularity clustering on the condensed graphs, stopping when a local maximum of modularity is reached. The method is a greedy optimization method, easy to implement and efficient, which appears to run in time O(n log n), where n is the number of nodes in the graph.
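The local-move phase of the Louvain method can be sketched in pure Python. This is a simplified, single-level version (no graph condensation) that recomputes the modularity from scratch for every candidate move, so it is far from the O(n log n) behavior of real implementations; the function names and the two-triangle example graph are illustrative, not part of the original method.

```python
def modularity(adj, label):
    """Q = (1/2m) * sum over node pairs u,v in the same community of
    (A_uv - k_u*k_v / 2m)."""
    m2 = sum(len(n) for n in adj.values())          # 2m: each edge counted twice
    deg = {u: len(n) for u, n in adj.items()}
    q = 0.0
    for u in adj:
        for v in adj:
            if label[u] == label[v]:
                a = 1.0 if v in adj[u] else 0.0
                q += a - deg[u] * deg[v] / m2
    return q / m2

def louvain_local_moves(adj):
    """Phase 1 of Louvain: greedily move nodes into neighboring
    communities while the modularity strictly increases."""
    label = {u: u for u in adj}                     # every node starts alone
    improved = True
    while improved:
        improved = False
        for u in adj:
            old = label[u]
            best_label, best_q = old, modularity(adj, label)
            for v in adj[u]:                        # try the neighbors' communities
                label[u] = label[v]
                q = modularity(adj, label)
                if q > best_q + 1e-12:
                    best_label, best_q = label[v], q
            label[u] = best_label
            if best_label != old:
                improved = True
    return label

# Two triangles joined by a single edge: the triangles should emerge
# as the two communities.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
labels = louvain_local_moves(adj)
```

On this graph, the greedy local moves converge to the two triangles as communities, which is also the partition with maximum modularity.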

Modularity Optimization
The modularity optimization is an algorithm for the detection of communities in the graph based on their modularity [22]. Modularity is a measure of the graph structure. It measures the density of connections within a module or community. Namely, graphs that have a high modularity score are those that have many connections within a community, but only a few connections to other communities. The algorithm then tries to optimize the modularity score for each of the nodes; namely, it determines whether the modularity score might increase if a node changed its community to that of one of its connected nodes.
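The per-node check described above can be sketched as a modularity score plus a move-gain function; the two-triangle example graph, the partition, and all names below are illustrative assumptions.

```python
def modularity(adj, label):
    # Q = (1/2m) * sum over node pairs u,v in the same community of
    #     (A_uv - k_u*k_v / 2m)
    m2 = sum(len(n) for n in adj.values())      # 2m
    deg = {u: len(n) for u, n in adj.items()}
    return sum(
        (1.0 if v in adj[u] else 0.0) - deg[u] * deg[v] / m2
        for u in adj for v in adj if label[u] == label[v]
    ) / m2

def gain_of_move(adj, label, node, new_comm):
    """Modularity gain if `node` switched to community `new_comm`."""
    before = modularity(adj, label)
    moved = dict(label)
    moved[node] = new_comm
    return modularity(adj, moved) - before

# Two triangles joined by the edge 2-3, already split into their
# natural communities "a" and "b"; moving the bridge node 2 across
# the bridge should only lower the modularity.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
label = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
```

For this partition, Q = 5/14, and `gain_of_move(adj, label, 2, "b")` is negative, so the optimization would keep node 2 in its triangle.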

Prior Work
In this subsection, we review prior work in the context of decision support frameworks and then we focus on frameworks based on KG.

Clinical Decision Support Frameworks
According to Osheroff and his colleagues, clinical decision support (CDS) is the process that "provides clinicians, staff, patients, or other individuals with knowledge and person-specific information, intelligently filtered or presented at appropriate times, to enhance health and health care" [23]. Moreover, they claim that "a clinical decision support system (CDSS) is intended to improve healthcare delivery by enhancing medical decisions with targeted clinical knowledge, patient information, and other health information" [24]. CDSSs are used to assist and empower clinicians in their complex decision-making processes [25]. Musen and his colleagues [17] pinpoint the definition of CDSS and clarify that these systems assist not only by retrieving relevant data but by also considering the specific clinical context and thereby suggesting recommendations for the particular situation. Musen et al. also emphasize that CDSSs do not themselves make clinical decisions, but assist the decision makers (e.g., clinicians, patients, and healthcare organizations) in producing more informed judgments by providing relevant knowledge and analyses.
The range of functions provided by CDSS is wide and includes alarm systems, diagnostics, disease management, and prescription and drug control, among others [26]. They can be implemented in several ways, such as computerized alerts and reminders, or clinical workflow tools and computerized clinical guidelines, where patient data are taken into consideration. This last example involves developing a guideline-based point-of-care decision support system. To develop such systems, it is necessary to first create computer interpretable representations of the clinical knowledge contained in clinical guidelines [5].
Constructing CDS systems requires the most effort in creating the reasoning engine and in specifying the knowledge on which the reasoning engine operates. There are many strategies for accomplishing this, each addressing different requirements, including infobuttons [27], probabilistic systems [28], rule-based approaches [29], ontology-driven CDS systems [30], etc.

Knowledge Graph-Based Applications Including Decision Support Frameworks
Quoting Sprague from 1980 [31], the definition of a decision support system is: "interactive computer-based systems, which help decision makers utilize data and models to solve unstructured problems". One of the main challenges in designing efficient decision support frameworks is knowledge acquisition, especially in complicated and uncertain decision contexts [32]. Knowledge graphs have emerged as a dynamic, scalable, and domain-independent form of knowledge representation; as Abu-Salih [33] claims, "Knowledge Graphs have made a qualitative leap and effected a real revolution in knowledge representation". Abu-Salih continues to argue that the underlying structure of the KG enables better comprehension, reasoning, and interpretation of knowledge for both humans and machines. In recent years, these features have attracted more and more researchers to use KGs as the main means to deal with real-life problems in various fields, such as threat detection [34], interactive recommendations [35], healthcare and medical consultations [36][37][38], service system development [39], designing decision support systems [32], and more. We elaborate on KG usages similar to our framework in the following paragraphs.
In recent years, KGs have penetrated the domain of interactive recommender systems (IRS), which elicit the dynamic preferences of users and take actions based on their current needs through real-time multi-turn interactions. Zhou et al. [35] investigated the potential of leveraging KGs to provide rich side information for recommendation decision-making. Yet, this system does not focus on restricting the interactions with the end user, as our framework does.
Huang et al. [37] introduced a framework for an AI-based medical consultation system with knowledge graph embedding. Their framework implementation leverages knowledge organized as a graph to produce diagnoses according to evidence collected from patients recurrently and dynamically. This system, similar to our framework, assists the domain expert. However, while Huang et al.'s system only serves the domain expert, our suggested framework addresses a situation in which the domain expert conducts a real-time question and answer interaction with the end user, whose answers are used as input for our algorithms.
Elnagar and Weistroffer [32] were the first to introduce KG to DSS Design. In their study, they explored how KGs can enhance the decision-making process in DSSs, by presenting a framework to integrate a KG into the DSS design. They claimed that using KG may assist in addressing the limitations of varied, unstructured, and dynamic sources of data that exist among most organizations. They stated that knowledge graphs can support the decision-making of enterprises by enhancing the efficacy of all data integration steps and allowing real-time analysis. However, they use a KG for designing the DSS, i.e., without changing the main structure of the DSS design, while our work uses the KG as the main platform and infrastructure for knowledge acquisition, management, and inference.
To summarize: to the best of our knowledge, the framework we suggest for a decision-making process is unique in that it (a) is grounded on a knowledge graph, (b) enables dynamic and optimal management and inference of big data, (c) supports an interactive scenario consisting of questions and answers in real-time between two entities: a domain expert and an end user, and (d) aims to produce a limited interaction, i.e., to reduce the number of questions in the scenario.

Framework and Algorithms
In this section, we introduce the proposed framework, which includes a collection of algorithms and the flow between them.
We aim for interaction-based decision-making processes. The interaction is between a domain expert and an end user, and results in a limited number of iterations, consisting of questions that the framework suggests the domain expert ask the end user. The decision-making process will progress according to the end-user's answers.
When we analyzed these types of processes, we concluded that they can be generically modeled as a collection of symptoms and diseases. Eventually, the process goal is to assist the domain expert to decide on a diagnosis (i.e., provide an explanation for a given set of symptoms based on analyzing available data). Musen described the diagnostic process as being about deciding which questions to ask, which tests to order, or which procedures to perform [7,17]. Questions that may arise during the diagnosis process are of the type: Does the end-user have a particular symptom?
The above terms (i.e., symptoms, diseases, questions, and diagnoses) produce a jargon that can naturally be used in the medical diagnostic domain, yet it is also suitable for other domains, such as appliance repairs: the symptom represents a problem, the disease represents a malfunction, the diagnosis is a fault identification, and a typical question can be: Does the end-user have a particular problem with his appliance?
When using this jargon in the context of the proposed framework, we replace the term diagnosis with the term hypothesis, as the framework does not provide the domain expert with diagnoses, but rather with possible hypotheses. Each hypothesis is in fact a potential disease, and it is accompanied by a question, which is a symptom that indicates the disease (hypothesis). Therefore, the jargon we used throughout the paper to describe the framework and its various algorithms include the terms: symptoms, diseases, questions, and hypotheses. In particular, the framework infers hypotheses along with their related questions and submits them to the domain expert, who decides whether to use (or not) the questions to confirm (or not) the hypotheses (diseases).
In the rest of this section, we describe the framework along with its algorithms, first in general, then in detail.
In general, we start with building a knowledge graph from raw data, which will assist in exploring the relationships between diseases and symptoms. Following this, we use Louvain hierarchical clustering [21] on the KG (Algorithm 1) to find communities (i.e., clusters of diseases that have similar symptoms). Then, given the symptoms reported by the end user (called evidence symptoms), we find the possible diseases that are compatible with the evidence symptoms using inference on the KG (Algorithm 2). At this point, we infer the community most likely to include the end user's disease and suggest to the domain expert a question (symptom) that indicates this community (Algorithm 3). Lastly, we find the best hypotheses to suggest to the domain expert (Algorithm 4), i.e., we suggest to the domain expert diseases and symptoms that the end user might have, to advance the diagnostic process.
The whole framework is divided into two main parts: the first part, the pre-processing part, is carried out once the framework is launched; while the second part, the processing part, is carried out each time a new request arrives in the framework. The preprocessing part consists of two steps and one algorithm (Algorithm 1), while the processing part consists of three steps and three algorithms (Algorithms 2-4), as we describe below.
The data structures we use include the structure for representing the KG (the default is an adjacency list) and additional structures required for running the algorithms. In the following paragraphs describing the algorithms, we detail these structures and their use.

Pre-processing part:
Input: A list of diseases and their symptoms.
Step 1: Construct a knowledge graph (KG) of diseases and symptoms (see Subsection 3.1).
Step 2: Cluster the diseases into groups (called communities) according to their symptoms, i.e., diseases with similar symptoms will be in the same community (Algorithm 1).
Output: (1) each disease is associated with a community within the KG; (2) an additional data structure, called the symptoms community matrix (SCM), which represents the associations between groups of diseases and the various symptoms.

Processing part:
Input: k evidence symptoms.
Step 1: Find the most probable diseases, i.e., the possible diseases that are compatible with the evidence symptoms (Algorithm 2).
Step 2: Infer and suggest to the domain expert (repeatedly as required) a question (symptom) that indicates the most probable community to include the end user's disease (Algorithm 3).
Step 3: Infer and suggest to the domain expert a list of hypotheses (diseases the end user might have) and their related questions (symptoms), sorted by relevance (Algorithm 4).
See Figure 1 for a high-level view of the whole suggested framework. In the following subsections, we elaborate on each of the above four algorithms in detail.

Building the Knowledge Graph
In this subsection, we describe the construction of the graph. In addition, we define framework-specific terminology used to describe the algorithms. Let G = (V, E) be a directed graph, which is defined as follows. Let V = D ∪ S be the set of nodes, where D is the set of diseases and S is the set of symptoms. The edges of the graph are defined as follows: E = {(s, d) ∈ S × D | symptom s ∈ S indicates disease d ∈ D}; that is, there is an edge from a symptom s to a disease d if s might indicate d.
We demonstrate the graph construction and the four algorithms on a simple KG (named the toy problem), which is presented in Figure 2. The toy problem includes five diseases (represented by the nodes d1, ..., d5) and ten symptoms (represented by the nodes s1, ..., s10), so symptom s1 indicates disease d1, symptoms s2 and s3 indicate diseases d1 and d2, etc.
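The toy KG can be sketched as a symptom-to-disease adjacency list. The edges for s1 to s5 follow the text and the later worked examples; the edges for s6 to s10 (and diseases d3 to d5) are not fully specified in the text and are assumed here purely for illustration.

```python
# Symptom -> diseases adjacency for the toy KG of Figure 2.
# Edges for s1-s5 follow the text; the remaining edges (s6-s10,
# diseases d3-d5) are assumptions used only for illustration.
EDGES = {
    "s1": {"d1"},
    "s2": {"d1", "d2"},
    "s3": {"d1", "d2"},
    "s4": {"d2"},
    "s5": {"d2", "d3"},
    "s6": {"d3"}, "s7": {"d4"}, "s8": {"d4"}, "s9": {"d5"}, "s10": {"d5"},
}

def indicated_diseases(symptom):
    """Follow the out-edges of a symptom node."""
    return EDGES.get(symptom, set())

def indicating_symptoms(disease):
    """Reverse lookup: all symptoms with an edge into the disease."""
    return {s for s, ds in EDGES.items() if disease in ds}
```

For example, `indicated_diseases("s2")` yields d1 and d2, matching the edges described above.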

Framework-Specific Terminology
The following (Table 2) is the terminology that we use to describe the algorithms.

R(s, c): Symptom's rank. Defined by the number of edges from symptom s to the diseases of community c.
R_rel(s, c): Symptom's relative rank. Defined by the number of edges from symptom s to community c, less the number of edges from s to the other communities c'. The outcome indicates how well this symptom characterizes c.
cs: Community symptom. Defines a symptom indicating a high number of diseases in the community c and indicating a low number of diseases out of c. Hence, given a community c, it is the symptom s with the highest R_rel(s, c).
SR(d): The disease's symptoms rank. Defined by the number of symptoms the patient has that indicate disease d.
LinD(c): Links into diseases. Defined by the number of edges from the evidence symptoms to the diseases of community c.

The Framework Algorithms
In this subsection, we describe the algorithms that we developed as part of our framework.

Algorithm 1: Cluster the Diseases
To create the communities, we used the Louvain method [21] (see more details in Section 2.1). You can see below the pseudo-code of Algorithm 1.

Algorithm 1: Disease Community Detection
Input: Knowledge graph G = (D ∪ S, E).
Output: (1) For every d ∈ D, add a property named community, which determines the community d belongs to. (2) The symptoms community matrix (SCM), which is exhibited in Table 3.
Algorithm:
1. (Preprocessing) For every two diseases d1, d2 ∈ D such that (s, d1) ∈ E and (s, d2) ∈ E, add e = (d1, d2) to E. At the end of this process, the number of edges between d1 and d2 is the number of symptoms they share.
2. Apply the Louvain method for community detection on the graph resulting from Step 1.
3. Construct the SCM: an |S| × |C| matrix, where C is the set of detected communities, such that SCM[s, c] = R(s, c).
Given the toy problem KG represented in Figure 2, we present the communities that were found on that KG in Figure 3. The respective SCM is presented in Table 3; for instance, SCM[s1][c1] = 1, since there is one edge pointing from s1 to a disease of c1.
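The non-library steps of Algorithm 1, the disease-disease projection by shared symptoms and the SCM construction, can be sketched as follows. The toy edges for s6 to s10 and the community assignment of d3 to d5 are assumptions made for illustration; only the values involving s1 to s5 and community c1 are fixed by the text.

```python
from collections import defaultdict
from itertools import combinations

# Toy KG (symptom -> diseases); the s6-s10 edges and the community of
# d3-d5 are illustrative assumptions.
EDGES = {
    "s1": {"d1"}, "s2": {"d1", "d2"}, "s3": {"d1", "d2"}, "s4": {"d2"},
    "s5": {"d2", "d3"}, "s6": {"d3"}, "s7": {"d4"}, "s8": {"d4"},
    "s9": {"d5"}, "s10": {"d5"},
}
COMMUNITY = {"d1": "c1", "d2": "c1", "d3": "c2", "d4": "c2", "d5": "c2"}

def disease_projection(edges):
    """Step 1 of Algorithm 1: one edge between two diseases for every
    symptom they share, kept here as a weight equal to the number of
    shared symptoms."""
    weight = defaultdict(int)
    for diseases in edges.values():
        for d1, d2 in combinations(sorted(diseases), 2):
            weight[(d1, d2)] += 1
    return dict(weight)

def build_scm(edges, community):
    """SCM[s][c] = R(s, c): the number of edges from symptom s into
    the diseases of community c."""
    scm = {s: defaultdict(int) for s in edges}
    for s, diseases in edges.items():
        for d in diseases:
            scm[s][community[d]] += 1
    return scm

proj = disease_projection(EDGES)
scm = build_scm(EDGES, COMMUNITY)
```

For instance, d1 and d2 share the two symptoms s2 and s3, so the projection edge (d1, d2) has weight 2, and `scm["s1"]["c1"]` is 1, matching the SCM example above.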

Algorithm 2: Find the Most Probable Diseases
Algorithm 2 receives the evidence symptoms and uses the KG to infer which diseases explain these evidence symptoms and outputs them. You can see below the pseudo-code of Algorithm 2.

Algorithm 2: Find the Most Probable Diseases
Input: Knowledge graph G = (D ∪ S, E), evidence symptoms ES.
Output: The possible diseases PD.
Algorithm:
1. Let PD = {d ∈ D | (s, d) ∈ E for some s ∈ ES}.
2. Return PD.
Based on the given toy problem graph (presented in Figure 2), and on a set of given evidence symptoms (recall that in our example they are s2 and s5), the output of Algorithm 2 is PD = {d1, d2, d3}; thus, the PD's communities are c1 and c2.
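Algorithm 2 amounts to collecting every disease reached by an edge from at least one evidence symptom (the case study in Section 4 behaves the same way: chickenpox is kept even though only itching indicates it). A sketch under the toy problem's edges, where the s6 to s10 edges are assumptions:

```python
# Toy KG (symptom -> diseases); s6-s10 edges are illustrative assumptions.
EDGES = {
    "s1": {"d1"}, "s2": {"d1", "d2"}, "s3": {"d1", "d2"}, "s4": {"d2"},
    "s5": {"d2", "d3"}, "s6": {"d3"}, "s7": {"d4"}, "s8": {"d4"},
    "s9": {"d5"}, "s10": {"d5"},
}

def most_probable_diseases(edges, evidence_symptoms):
    """Algorithm 2: every disease indicated by at least one evidence
    symptom is compatible with the evidence."""
    pd = set()
    for s in evidence_symptoms:
        pd |= edges.get(s, set())
    return pd

pd = most_probable_diseases(EDGES, {"s2", "s5"})
```

With the evidence symptoms s2 and s5, this reproduces PD = {d1, d2, d3} from the example above.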

Algorithm 3: Find the Most Probable Community
Algorithm 3 receives the most probable diseases found by Algorithm 2 and uses the SCM to infer which community (i.e., group of diseases) is more likely to include the end-user's disease. To determine whether the inferred community is relevant, the algorithm outputs a symptom (which is a question for the end user) named the community symptom (cs). The answer to this question will help to determine whether the patient's disease is one of the community's diseases or not.
If the end user indicates having the symptom, the framework will proceed to Algorithm 4. Otherwise, Algorithm 3 will iteratively continue to search for the next community with the highest potential to contain the user's disease and accordingly offer the next relevant cs.
You can see below the pseudo-code of Algorithm 3.

Algorithm 3:
Find the most probable community
Input: possible diseases PD, symptoms community matrix (SCM).
Output: cs and the community it indicates (presented as a question to the domain expert), or null if it does not exist.
Algorithm:
1. Let C be the list of the PD's communities, sorted by their LinD property in decreasing order.
2. Let c ∈ C be the current community in the order.
3. Return c together with its community symptom cs, i.e., the symptom s ∉ ES with the highest R_rel(s, c); if no community remains, return null.
Based on the given PD = {d1, d2, d3} (output by Algorithm 2), which resulted in c1 and c2 as the PD's communities, the respective LinD(c1) is 3 and LinD(c2) is 1 (i.e., LinD(c1) is the sum SCM[s2, c1] + SCM[s5, c1], as presented in Table 3). As c1 has the highest LinD, we calculate R_rel for each symptom with respect to c1 and compared to c2. s3 has the highest R_rel, as R(s3, c1) − R(s3, c2) yields the maximum value (=2). Thus, the algorithm outputs c1 and s3 as its respective cs and presents them to the domain expert.
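The LinD and cs computations of this example can be sketched directly from the SCM; the rows for s6 to s10 rest on assumed toy edges, while the rows for s1 to s5 match the worked example.

```python
# SCM for the toy problem (rows s6-s10 rest on assumed edges).
SCM = {
    "s1": {"c1": 1, "c2": 0}, "s2": {"c1": 2, "c2": 0},
    "s3": {"c1": 2, "c2": 0}, "s4": {"c1": 1, "c2": 0},
    "s5": {"c1": 1, "c2": 1}, "s6": {"c1": 0, "c2": 1},
    "s7": {"c1": 0, "c2": 1}, "s8": {"c1": 0, "c2": 1},
    "s9": {"c1": 0, "c2": 1}, "s10": {"c1": 0, "c2": 1},
}

def lind(scm, community, evidence_symptoms):
    """LinD(c): edges from the evidence symptoms into community c."""
    return sum(scm[s][community] for s in evidence_symptoms)

def community_symptom(scm, community, evidence_symptoms):
    """cs: the not-yet-reported symptom with the highest relative rank
    R_rel(s, c) = R(s, c) minus the edges from s into other communities."""
    def r_rel(s):
        return scm[s][community] - sum(v for c, v in scm[s].items() if c != community)
    candidates = [s for s in scm if s not in evidence_symptoms]
    return max(candidates, key=r_rel)

es = {"s2", "s5"}
cs = community_symptom(SCM, "c1", es)   # the example's cs is s3
```

With the evidence symptoms s2 and s5, `lind` gives 3 for c1 and 1 for c2, and the community symptom for c1 is s3, as in the example.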

Algorithm 4: Find Disease Symptoms
Algorithm 4 receives the evidence symptoms and a community c and uses the SCM to infer which diseases in c are more likely to explain the patient's symptoms. The output of the algorithm is a list of ordered pairs R. Each pair consists of a hypothesis (disease) and its related question (symptom), the answers to which might help the diagnosis process. You can see below the pseudo-code of Algorithm 4.
We define an order between hypotheses in the community c as follows: (i) let h1 and h2 be two hypotheses with the same number of evidence symptoms indicating them (that is, SR(h1) = SR(h2)), and let s1 and s2 be two symptoms that strengthen them, respectively. Then, hypothesis h1 comes before h2 in the order if R(s1, c) ≤ R(s2, c).

Algorithm 4: Find Disease Symptom
Input: Community c, evidence symptoms ES, symptoms community matrix (SCM).
Output: A list R consisting of ordered pairs. Each pair consists of a hypothesis (disease) and its related question (symptom). The pairs are sorted by the relevance defined above.
Algorithm:
1. Let R be an empty list.
2. Let D_c be the list of diseases in c, sorted in decreasing order by their SR.
3. Let S = SCM(_, c) \ ES be the list of symptoms in community c, without the evidence symptoms, sorted in increasing order by their R.
4. For each d ∈ D_c:
4.1. For each s' ∈ S such that (s', d) ∈ E, add (d, s') to R.
5. Return R, ordered by relevance.
Based on the previous output, let us consider that s3 is a symptom that the end user indicates they have. Thus, we assume that c1 is more likely to include the end user's disease. At that point, the algorithm calculates, for each disease in c1, its SR: SR(d1) = 2 and SR(d2) = 3. Thus, the sorted disease list is (d2, d1). Then, the algorithm sorts the symptoms of c1 (excluding the evidence symptoms, in our case s2, s3, s5) by their R: R(s1, c1) = 1 and R(s4, c1) = 1. Thus, the sorted symptom list is (s1, s4). Finally, the algorithm returns the sorted list R, which includes the following pairs: [(d2, s4), (d1, s1)].
At that point, R is presented to the domain expert for further consideration.
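Algorithm 4 on the toy example can be sketched as follows, using only the edges the text specifies; the function name and the tie-breaking by insertion order for symptoms of equal rank are illustrative choices.

```python
# Toy inputs after Algorithm 3 confirmed community c1 = {d1, d2}:
# symptom -> diseases edges restricted to what the example states.
EDGES = {
    "s1": {"d1"}, "s2": {"d1", "d2"}, "s3": {"d1", "d2"},
    "s4": {"d2"}, "s5": {"d2", "d3"},
}
C1 = {"d1", "d2"}

def find_disease_symptoms(edges, community, evidence_symptoms):
    """Algorithm 4: pair each community disease (sorted by decreasing
    SR, the number of evidence symptoms indicating it) with its
    remaining symptoms (sorted by increasing rank R in the community)."""
    def sr(d):
        return sum(1 for s in evidence_symptoms if d in edges.get(s, set()))
    def r(s):
        return len(edges[s] & community)
    diseases = sorted(community, key=sr, reverse=True)
    symptoms = sorted(
        (s for s in edges if s not in evidence_symptoms and edges[s] & community),
        key=r,
    )
    return [(d, s) for d in diseases for s in symptoms if d in edges[s]]

hypotheses = find_disease_symptoms(EDGES, C1, {"s2", "s3", "s5"})
```

With the evidence symptoms s2, s3, and s5, this reproduces the worked example's output [(d2, s4), (d1, s1)].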

Case Study Scenario
To examine the proposed framework, particularly the use of the algorithms listed in Section 3, we used a data set composed of patients' records that were taken from Kaggle (https://www.kaggle.com/ (accessed on 25 October 2022)) (described in Section 4.1). We ran a sample scenario on the given data set, which is presented in Section 4.2, followed by the results of the algorithms run using the sample scenario (in Section 4.3).

Data Set Description
The data set contained a total of 410 patient records. Each record referred to one patient and included the name of the disease and the symptoms the patient was experiencing. The data set included a total of 41 different diseases and most of the known symptoms that can characterize the specific disease.
The number of disease symptoms ranges from 4 to 17. The data set included a total of 130 different symptoms. Some of the symptoms were unique and characterized one specific disease, while others were quite common and characterized various diseases.

Knowledge Graph Construction and Community Detection
In this subsection, we demonstrate the pre-processing part, that is, the knowledge graph construction and the community detection.
The knowledge graph was implemented using Neo4j and constructed as follows: we created a node for each of the 41 diseases and 130 symptoms. We created an edge between a symptom node and a disease node if that symptom characterized the disease. Some of the symptom nodes characterize multiple diseases, and thus have multiple connections.
After building the graph, we ran Algorithm 1 for community detection (recall that we used the Louvain method). This part was implemented using the Neo4j Graph Data Science library (https://neo4j.com/docs/graph-data-science/current/algorithms/ (accessed on 17 August 2022)). Four communities were identified. Figure 4 exhibits the knowledge graph along with the detected communities. For clarity, each community is represented by a distinct color.

Scenario Description
Let us consider the following scenario: A patient arrives with the following two symptoms: yellowish skin and itching. These are our evidence symptoms. Figure 5 depicts a sub-graph derived from the KG, including the evidence symptoms (in green) and the relations of the symptoms (i.e., the diseases that these symptoms characterize). For display clarity, we present only some of the relations. In addition, Figure 5 presents two communities that were found by the community detection algorithm (Algorithm 1). The first community is colored in yellow and includes drug reaction and chickenpox, while the second community is colored in gray and includes hepatitis A-E and jaundice.
Running Algorithm 2 outputs the most probable diseases. In our case, they are six gray nodes that belong to the gray community and two yellow nodes that belong to the yellow community.
Algorithm 3 first finds the most probable community, which, as explained in the previous section, is the community with the highest LinD. As mentioned, in our case we have two communities: the gray community and the yellow community. The LinD of the yellow community is three, since three edges are pointing from the evidence symptoms (the green nodes) to the diseases of the yellow community (yellow nodes): itching pointing to two yellow nodes (chickenpox and drug reaction) and yellowish skin to one yellow node (drug reaction). Similarly, the LinD of the gray community is six: there are six edges connecting the evidence symptoms with the diseases of the gray community. Thus, the gray community has the highest LinD.
At this point, Algorithm 3 examines the community with the highest LinD (in our case, the gray community) to suggest cs: a symptom that best indicates this community. In fact, cs is the symptom with the highest R_rel(s, c), given that c is the gray community, compared to the other PD's communities (in our case, the yellow community). Thus, to find the respective cs, the algorithm calculates R_rel for each of the symptoms concerning the gray community, as can be seen in Figure 6 (Table A). We can see that the symptom with the highest R_rel relative to the gray community is abdominal pain. Accordingly, Algorithm 3 outputs the gray community and its respective cs.
In the presented scenario, the patient has this symptom, and, therefore, the hypothesis that the gray community contains one of the patient's diseases is strengthened. We can now continue to the last step and run Algorithm 4. Otherwise, as described above, Algorithm 3 infers (repeatedly, as required) the next question (symptom) that indicates the next most probable community.

Figure 6. Table A: the symptoms' scores for each PD community, including those computed for the gray community; Table B: the diseases in the gray community with their scores; Table C: the symptoms indicating the gray diseases with their scores.
Algorithm 4 returns R, a list of sorted pairs (disease, symptom), such that the symptom indicates the disease. The gray diseases are sorted in decreasing order of their scores and listed in Figure 6, Table B. In addition, the symptoms indicating these diseases are sorted in increasing order of their scores and listed in Figure 6, Table C. In our case study, the algorithm returns the sorted list of pairs shown in Table 4, which ends with (Hepatitis D, Fatigue), followed by (Jaundice, High fever), (Jaundice, Fatigue), and (Hepatitis C, Fatigue).
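Algorithm 4's pairing-and-sorting step can be sketched as a single sort with a composite key: diseases in decreasing order of their score and, within a disease, symptoms in increasing order of theirs. The numeric scores and the `indicates` mapping below are made up for illustration; only the resulting order mirrors the closing fragment of Table 4.

```python
# Hypothetical scores (the paper's values appear in Figure 6, Tables B and C)
disease_score = {"hepatitis D": 0.9, "jaundice": 0.7, "hepatitis C": 0.5}
symptom_score = {"high fever": 0.2, "fatigue": 0.4}

# Symptom -> diseases it indicates (hypothetical fragment)
indicates = {
    "fatigue": ["hepatitis D", "jaundice", "hepatitis C"],
    "high fever": ["jaundice"],
}

# R: (disease, symptom) pairs, diseases by decreasing score,
# symptoms within a disease by increasing score
pairs = sorted(
    ((d, s) for s, diseases in indicates.items() for d in diseases),
    key=lambda p: (-disease_score[p[0]], symptom_score[p[1]]),
)
```

Under these assumed scores, the sort reproduces the order of the Table 4 fragment: (hepatitis D, fatigue), (jaundice, high fever), (jaundice, fatigue), (hepatitis C, fatigue).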

Summary
Decision-making processes are found in almost every area of our lives, and thus the realm of decision making is constantly evolving and receiving substantial research attention. When we come to analyze, model, and implement systems that support these processes, we are required to focus on a specific sub-domain, since it is almost impossible to provide a single solution for all the different requirements of decision-making processes, which arise from different content worlds.
In the current work, we focus on a sub-domain of decision-making processes defined by the following characteristics: (a) the trigger for the procedure is an end user's request, (b) a domain expert is present, and (c) these two entities interact, in a real-time scenario, through questions (asked by the domain expert) and answers (given by the end user) that are limited in nature, i.e., the number of questions the domain expert addresses to the end user and the answers they receive must be limited.
This sub-domain (which we named "decision-making process with limited interaction") includes specific processes, such as an encounter between a physician and a patient, a contact between a service provider and a customer, or an urgent call from a person in need of help to an assisting party. All the examples above illustrate why the interaction needs to be limited. As such, one of our goals is to distill the necessary questions asked by the domain expert, to assist them in reaching the right decision efficiently.

Contribution
In the literature review we performed, we found few references to the described process configuration. Therefore, we believe that our work will provide a contribution to addressing such a configuration of decision-making processes.
As noted, the algorithmic framework we developed aims to help the domain experts to pinpoint their questions to the end user. The proposed framework is based on the knowledge of the domain expert and their interaction with the end users.
The algorithmic framework consists of two parts. In the first part, a knowledge graph is constructed that characterizes the domain expert's knowledge. In the second part, as part of the interaction with the end user, the answers they provide are entered in the graph as evidence properties and generate a trigger for the inference algorithm in the graph.
As stated, this study aims to provide a generic framework that helps to refine the work processes with the characteristics mentioned earlier. At the same time, we want to present a possible use of the framework, and to that end, we chose the medical world as a case study. Specifically, we focused on the classic problem of medical diagnostics, which is part of a wide range of clinical decisions [17,7]. Medical diagnostics is a challenge that in recent decades has led to the development of methodologies and systems to support clinical decisions [40]. In this chosen case study, the end user is a patient, the domain expert is a physician, and the interaction is the encounter between them that aims to diagnose the patient's disease.
In this work, we propose a new approach for an interactive framework designed to support decision-making processes characterized by a limited number of interactions. The innovation of the work stems from the use of semantic technologies, including a graphical data model, combined with unique algorithms. In addition, we aimed to provide a generic framework that can be adapted to domains beyond the medical one.

Limitations
Many researchers are passionate about exploring the potential of artificial intelligence to support decision making, particularly within the clinical domain [41]. Nevertheless, there are still complexities researchers are trying to address. For instance, one of the challenges is to evaluate the improvement, if any, that such systems provide. Vasey and colleagues argue that "little is known about the outcomes of these systems when used as adjuncts to human decision making (human vs. human with)". Via a systematic review, they explored the association between the interactive use of machine learning (ML)-based diagnostic CDSSs and clinician performance and reported that there is minimal evidence to suggest that using ML-based CDSSs is associated with improved physician diagnostic performance, since most studies had a small number of participants [42].
Alongside the innovation and uniqueness of our work, there are also several limitations, which we discuss below:


- In the case study we presented, we demonstrated the feasibility of the framework but did not compare its performance to another system or to a real doctor-patient situation. In other words, we did not carry out a comprehensive evaluation.
- We encountered difficulty in estimating the complexity of Algorithms 2-4, as their behavior depends on the data existing in the KG and on the number of iterations performed in the interaction between the domain expert and the end user.
- As mentioned, the KG was constructed based on the data of diseases and symptoms taken from the Kaggle website and was used to examine the case study we presented, as a proof of concept. Yet, the existing KG cannot be considered big data. Knowledge graphs are designed to handle large volumes of data, and in our future work, we might test the scalability of the framework on a larger scale.
- The data we used did not contain information on the extent to which a symptom is related to a disease; therefore, the KG did not have weights on the edges. This issue, as well as statistical aspects, will be addressed in our future work.

Future Work
The framework we have developed makes the decision-making process accessible in an interactive and explainable manner, which includes the use of semantic technology and is, therefore, innovative.
Following our current work, we will aim to produce a comparative analysis of the suggested framework. The following are potential future directions:


- Using ontologies to enrich semantic reasoning.
- Using a weighted knowledge graph to represent the cost of each question.
In addition, we plan to combine the knowledge graph with medical ontologies having semantic and verbal data that supplement and/or expand the medical information. Furthermore, integration with specific medical information about patients (test results, medical background, etc.) can also increase the accuracy of the medical diagnosis.

Institutional Review Board Statement:
This study is based on an anonymized, publicly available database. The study was conducted according to the guidelines of the Ruppin Academic Center Research.

Conflicts of Interest:
The authors declare no conflict of interest.