An Extended Semantic Interoperability Model for Distributed Electronic Health Record Based on Fuzzy Ontology Semantics

Abstract: Semantic interoperability of distributed electronic health record (EHR) systems is a crucial problem for querying EHRs and for machine learning projects. The main contribution of this paper is to propose and implement a fuzzy ontology-based semantic interoperability framework for distributed EHR systems. First, a separate standard ontology is created for each input source. Second, a unified ontology is created that merges the previously created ontologies. However, this crisp ontology is not able to answer vague or uncertain queries. Third, to handle this limitation, we extend the integrated crisp ontology into a fuzzy ontology by using a standard methodology and fuzzy logic. The dataset used includes identified data of 100 patients. The resulting fuzzy ontology includes 27 classes, 58 properties, 43 fuzzy data types, 451 instances, 8376 axioms, 5232 logical axioms, 1216 declarative axioms, 113 annotation axioms, and 3204 data property assertions. The resulting ontology is tested using real data from the MIMIC-III intensive care unit dataset and real archetypes from openEHR. This fuzzy ontology-based system helps physicians accurately query any required data about patients from distributed locations using near-natural language queries. Domain specialists validated the accuracy and correctness of the obtained results.


Introduction
With the increasing expansion of medical science, traditional handwritten health records are no longer adequate for recording the massive quantity of information. Information technology (IT) has to play a prime role in redesigning the healthcare system to substantially improve quality [1]. In the 1960s and 1970s, new computer technology was developed, and early efforts to develop the electronic health record (EHR) began [2]. An EHR is an electronic repository for all of an individual's lifetime health information [3]. EHRs feed the different healthcare systems with up-to-date, accurate, digitalized, and complete information about patients, as shown in Figure 1. EHRs support efficient, high-quality, and persistent integrated healthcare, improving the medical domain [4]. Healthcare data by nature grow quickly and are distributed. An EHR brings all patient data together from many heterogeneous systems [5]. Healthcare systems all over the world are specialized, distributed, and interrelated. The different components of an eHealth system must cooperate and communicate with each other [6]. That is enabled by using an interoperable mechanism for these health information systems [7]. Achieving interoperability of different EHR systems increases workflow performance, enhances the quality of healthcare, and reduces duplication and costs [8]. EHR interoperability ensures a common understanding of manipulated medical information, therefore reducing medical errors. It coordinates different healthcare providers (hospitals, government, clinics, general practitioners, etc.) faster and more effectively [8,9].
Many organizational standards have been designed to achieve interoperability in the EHR environment [10]. These standards include DICOM [11], CEN/ISO EN13606 [12], GEHR [7], openEHR [13], EuroRec [14], and epSOS [15]. Unfortunately, those standards face many challenges [16,17]. One of them is that there are too many standards, and adopting one requires a huge effort [18]. In addition, those standards are dynamic, and it is difficult for them to deal with complex concept descriptions [7]. As a result, EHRs cannot reach their full potential in supporting coordinated care until clinicians and patients are confident that all of an individual's records can be retrieved no matter where they are stored. To achieve this objective, EHRs have to connect the records of every patient across the many offices, hospitals, and other sites where that individual requests care [19]. That can be achieved by implementing syntactic and semantic interoperability for EHRs. Distributed EHRs necessitate semantic interoperability to ensure that patients' EHRs can be shared and reused by different professionals [20]. The main challenge for any system's success is its conceptualization, that is, the "description of the system with its practices and their interrelations" [21]. The information in the healthcare domain is distributed, dynamic, and heterogeneous. Besides, a large amount of data is updated daily. Therefore, a hierarchical structure with an extensible design is an essential requirement of such a system [22]. Ontologies play a significant role in building distributed and interoperable EHR environments, as they provide uniform semantics and explicit formal models [14,15].
Semantic Web (SW) technologies enable data to be reused and shared across enterprise, application, and community boundaries. SW technologies have many capabilities that help achieve semantic interoperability (SI) in any EHR [23]. Ontology, as the main component of the SW, is defined as a data model that represents a set of concepts in a specific domain and the relationships amongst those concepts [24]. It defines the terms used in a formal semantic way using a set of axioms. This enables both machines and humans to understand the meaning of the exchanged data. In recent years, ontologies have been used as formalisms for representing knowledge in various domains. They are used in many fields such as software engineering, artificial intelligence, building expert systems' knowledge bases [25], natural language processing [26], communication between agents through knowledge sharing [27], and biomedical informatics [24]. Besides, ontologies can be used to help human beings communicate, to achieve interoperability between software systems, and to improve the quality and design of software systems [28].
The most important advantage of ontologies is that many different parties can agree on them. They are closer to becoming a standard than any other data model [29]. Their major disadvantage is their inability to represent and reason about imprecision and uncertainty. However, much of the medical domain knowledge is vague or fuzzy [30,31]. For example, suppose we need to report that a patient has high systolic blood pressure. This is difficult because concepts like high, low, and medium are not clearly defined. Hence, crisp ontology is not suitable for such knowledge [32]. Fuzzy logic and fuzzy ontology (FO) can handle that vagueness [33]. FO adds more semantics to the original data using linguistic variables. With this formulation, we can report that a patient has high systolic blood pressure with a degree of 0.9.
In previous work [34], we proposed an ontological architecture that could unify different EHR data formats. We focused on the first two stages of the process. We evaluated different health formats, represented by openEHR's ADL (Archetype Definition Language) archetypes, relational databases, spreadsheets, XML documents, and CSV files. The first stage of the proposed architecture converts each input source to an OWL ontology. The second stage integrates all those ontologies into a merged crisp one. The work in this paper expands the crisp ontology into a fuzzy ontology. FO enhances the crisp ontology's capabilities and improves accuracy in the medical domain [33]. The main contributions of this paper are summarized as follows:
1. Propose a framework that integrates and collects all patient data from distributed and heterogeneous data sources at a centralized point, based on semantic web ontologies.
2. Achieve syntactic interoperability in distributed EHRs by aggregating data with heterogeneous structures. This aggregation and integration are done using semantic web ontologies.
3. Achieve SI in distributed EHRs by fuzzifying the unified crisp ontology. The crisp ontology unifies the different heterogeneous formats, and FO then addresses the semantic meaning of any inconsistent feature by using linguistic terms. This is analogous to using translation between people from different countries who do not speak the same language.
The remainder of the paper is organized as follows. Previous studies and related work are discussed in Section 2. Section 3 presents the proposed FO framework and describes the dataset used. Section 4 presents the experimental results on the MIMIC-III and openEHR distributed datasets. We discuss our evaluation of the proposed solution in Section 5. Finally, the conclusion and future work are outlined in Section 6.

Related Work
Over the past few years, the literature has moved towards using ontologies to solve many problems in the healthcare domain. Many of those studies rely on ontologies in an interoperability environment. Others extend the ontology with a fuzzy one to handle the imprecise nature of the medical domain.

Ontology-Based Interoperable Frameworks
Ontologies are widely used in many different fields. For example, Pan et al. [35] presented a comprehensive survey of the use of ontologies in software engineering. Ontology has been commonly used in biomedical informatics [32-34]. Yang and Li [36] proposed building an interoperable healthcare system between different medical treatment organizations. An ontology-based approach was introduced as an information integration solution. A virtual database was established to realize hospital information data integration. Semantic and structural heterogeneity are the two main problems caused by distributed healthcare systems. The semantic heterogeneity problem can be solved using ontologies. The structural heterogeneity problem can be overcome by using XML-based data integration with a data warehouse. Villaseñor et al. [18] proposed the U2MIO (Ubiquitous User Modeling Interoperability Ontology) mediation model for interoperability in heterogeneous EHRs. It reuses the SKOS (Simple Knowledge Organization System) ontology in designing a central concept scheme for ubiquitous user models and one concept scheme for each profile consumer or supplier. It collects source documents in RDF, XML, or JSON formats. When a new source enters the system, a corresponding skos:ConceptScheme(X) is designed and added to U2MIO. That task was performed by combining three types of similarity: string, semantic, and structural similarity.
Kiourtis et al. [37] suggested a multi-step semantic architecture that was applied to many heterogeneous medical EHR data standards. The proposed framework integrated a system to remove space explicit data from ordered EHR datasets and transform them via ontologies into a CHL (Common Health Language). El Hajjamy et al. [38] developed a semi-automatic integration approach to convert data from traditional data sources to ontologies. They converted each data source into an OWL 2 ontology. Based on syntactic, semantic, and structural similarity, all output local ontologies are merged into a unified ontological model. The authors avoided redundancy in the merged ontology. Cristiano et al. [39] presented a method to achieve interoperability between heterogeneous health systems using ontologies and rules. That method used OWL features to map equivalences between mixed openEHR and HL7 records. The dataset used was composed of heart rate and blood pressure readings of three patients extracted from the MIMIC-III database. They chose to represent the first five rows of each table in openEHR, while the rest were represented in HL7. Their approach translated the HIS structure to OWL and then created bindings between similar structures of each system. SWRL rules were used to increase the expressivity of the openEHR and HL7 ontology by classifying individuals based on their properties. These bindings are used by the reasoner to infer additional knowledge not explicit in the ontology. However, the approach still needs to be evaluated on larger datasets.
Roehrs et al. [40] proposed an interoperability model called "OmniPHR". It presented a structural-semantic, unified, and up-to-date view of the personal health record (PHR) for patients and healthcare providers. That model aimed to achieve integration and semantic interoperability between different health standards. It used a real dataset with many adult patients' medical records. The data was represented by three reference models: HL7 FHIR, openEHR, and MIMIC-III. The final evaluation reached an F1-score of 87.9%. Duncan et al. [41] proposed OHD (Oral Health and Disease Ontology) as a comprehensive framework to represent dental health information. The OHD framework could be used to integrate homogeneous data from different database systems. Using the OHD, information, terms, and relations from multiple dental EHRs could be translated to OWL 2 statements. Those statements could be stored in a semantic database (a triple store) and queried using SPARQL to extract information. OHD also supports future system representations. Although ontology provides a comprehensive understanding of domain terms, crisp ontology is not suitable for handling uncertain, incomplete, and vague information, and it must be extended with a fuzzy one [35,42,43]. In the following section, we review some studies that rely on fuzzy ontologies.

Fuzzy-Based Ontology Systems
FO has been used in many different applications such as image interpretation [44], modeling human behavior [45], information retrieval, and many other Semantic Web applications. For example, Calegari and Sanchez [42] used FO to improve an Information Retrieval (IR) system. The information retrieval domain involves organizing, storing, retrieving, and displaying information. The authors argued that two-valued methods are not sufficient for handling uncertain, ill-structured, and imprecise knowledge. They extended the query vector and introduced the Fuzzy Concept Network (FCN) to represent the dynamic behavior of the ontology. David Parry [43] proposed, more specifically, medical IR based on fuzzy ontology. Regarding personalized multimedia IR, Mylonas et al. [46] presented personalized, unified access to heterogeneous multimedia content in distributed repositories. It focused on semantic analysis of multimedia metadata, documents, user profiles, and user queries. Rodríguez et al. proposed an FO model for human behavior recognition [45]. The proposed model was realistic, flexible, and more accurate, and it allowed incomplete real-life queries. El-Sappagh and Elmogy [25] developed a novel FO (CBRDiabOnto) for a fuzzy knowledge-intensive case-based reasoning (CBR) system for diabetes mellitus diagnosis. They enhanced case representation and case retrieval, the critical steps of a CBR system. The authors implemented a Java user interface for collecting the patient query case description. In addition, they implemented a framework to fuzzify and code the case description.
Ali et al. [47] proposed a type-2 fuzzy ontology for Internet of Things (IoT)-based healthcare to monitor the patient's body efficiently. The proposed architecture extracted the values of patient risk factors, determined the patient's health condition through wearable sensors, and then recommended diabetes-specific prescriptions for a smart medicine box and food for a smart refrigerator. The combination of fuzzy ontology and type-2 Fuzzy Logic (T2FL) increased the prediction accuracy of a patient's condition. The proposed system could improve healthcare performance for chronic diseases. It could assist elderly patients in long-term care without continuous physician visits. Noia et al. [48] proposed a fuzzy ontology to structure the knowledge related to non-functional requirements (NFRs) for supporting architectural design. FO was able to model their joint interactions and relations. The authors presented a decision support system built upon a fuzzy OWL 2 ontology that could encode 28 pattern families, 109 design patterns, and 37 NFRs and their mutual relations. Ali et al. [49] combined SWRL (Semantic Web Rule Language) rules with fuzzy ontology-based sentiment analysis for monitoring transportation activities (vehicles, accidents, traffic volume, street conditions, etc.). Their main aim was to build a city-feature polarity map for travelers.
After studying the literature, we found that various techniques have been proposed to solve the SI problem in distributed EHRs [37,40,50]. However, most of these studies suffer from some limitations. Most of the literature is concerned with using one or more EHR standards to handle the problem, but using a single standard is insufficient for unification, especially when the system is distributed. Other studies neglected the problem of uncertainty and vagueness in the medical domain [51]. To overcome those problems, this paper recommends fuzzy ontology as a solution, given its success in many other medical and non-medical fields as mentioned previously. Our proposed framework uses ontology to unify all the distributed data models and then fuzzifies the unified crisp ontology to handle the medical uncertainty problem.

Fuzzy-Ontology Formulation
Ontology is defined as an explicit, formal specification of a shared conceptualization [52]. It includes the following components: concepts (classes), individuals, relations, attributes (data properties), and axioms (conditions). Concepts (C) are collections of domain objects. Individuals (I) represent the set of instances or elements of a given class. A relation (R) represents the interactions between domain individuals; R ⊆ C_1 × C_2 × ... × C_n. Ontology provides a vocabulary for a particular domain by describing the meanings of its concepts and their relationships. Ontology enables sharing and reusing of health-related data. In addition, ontologies make information both machine-readable and human-readable. Ontologies play a significant role in solving the SI problem among many heterogeneous systems in distributed organizations by giving a shared annotation and understanding of the common concepts [40,41], thereby providing a uniform way for different practices in a distributed system to communicate with and understand each other.
Much of the knowledge in the medical domain is vague, fuzzy, and uncertain. Uncertainty is defined as: "The incompetence in determining the meanings of related events due to complexity, vagueness, deficiency of information about patient's illness and its consequences" [53]. Crisp ontology is not suitable to deal with such knowledge [32]. It can model only relations amongst concepts that are either true or false. FO expands the classical ontology's capabilities by improving the applicability and accuracy in the medical domain [33]. Formally [54], FO can be defined as a quadruple (C, R, I, µF), where C represents a set of concepts, R ⊆ C × C is a set of fuzzy relations, each assigned a fuzzy label and value, I is a mapping from edges to a set of strings (labels), and F is the membership function. Vague concepts have blurred and fuzzy boundaries, whereas classical formalism has no fuzzy boundaries. Thereby, it cannot represent vague concepts. The linguistic variables in fuzzy logic (FL) are natural human language terms, such as fast, hot, low, and high. The values of these variables are usually words rather than numbers [55]. In FO, the value of the membership function (MF) µ(x) can be any real number between 0 and 1. FO permits defining fuzzy concepts explicitly with trapezoidal, triangular, left-shoulder, and right-shoulder membership functions [56]. Figure 2 clarifies the fuzzy membership functions.
FO is defined theoretically as "an ontology which utilizes fuzzy logic for providing a natural representation of vague and imprecise knowledge, and also reasoning it". FO is built upon the Description Logic (DL) language. DL is one of the important tools that support the Semantic Web. Logic, as "the science of reasoning", is defined as "the study of how to make formal correct deductions and inferences" [57].
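The four membership function shapes named above can be sketched in a few lines of Python. This is a minimal illustration using the usual piecewise-linear definitions, not the exact equations of [56]; the function names, the parameter names (a, b, c, d), and the blood pressure thresholds are assumptions chosen for illustration and carry no clinical meaning.

```python
def trapezoidal(x, a, b, c, d):
    """Rises on [a, b], equals 1 on [b, c], falls on [c, d]; assumes a < b <= c < d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def triangular(x, a, b, c):
    """Rises on [a, b], peaks at b, falls on [b, c]; assumes a < b < c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def left_shoulder(x, a, b):
    """Equals 1 up to a, then falls linearly to 0 at b; assumes a < b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def right_shoulder(x, a, b):
    """Equals 0 up to a, then rises linearly to 1 at b; assumes a < b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


# Hypothetical thresholds: "high" systolic pressure starts at 120 and is fully high at 160.
degree_high = right_shoulder(156, 120, 160)  # 0.9
```

With these assumed thresholds, a reading of 156 belongs to the fuzzy concept "high systolic blood pressure" with degree 0.9, which is the kind of graded statement a crisp ontology cannot express.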
DL [58,59] is defined as "knowledge representation languages that could be utilized in representing the application knowledge in a formal and structured well-understood way". DL is a subset of FOL (First-Order Logic) [60]. Over the last few years, DL has been used in a broad range of applications such as speech recognition [61], natural language processing [62], and reinforcement learning [63].
DL has Boolean constructors: conjunction (⊓), which is interpreted as set intersection, disjunction (⊔), interpreted as set union, and negation (¬), interpreted as set complement. In addition, it includes the existential quantifier (∃R.C) and the value quantifier (∀R.C). The knowledge base in DL includes two main components: TBox and ABox. The TBox (terminology) introduces the vocabulary of an application domain. The ABox (assertions) describes named individuals in terms of that vocabulary. Description logics provide their users with reasoning services, which can automatically infer implicit knowledge from explicit knowledge [64]. DL reasoning software includes Pellet [65], HermiT [66], KAON2 [67], FaCT++ [68], and RacerPro [69].
Bobillo and Straccia [70] used OWL 2 to deal with knowledge imprecision and developed Fuzzy OWL 2 to represent FO. OWL 2 is built on the SROIQ(D) DL [71]. Fuzzy OWL 2 allows encoding fuzzy data types, fuzzy modifiers, fuzzy concepts, fuzzy roles, fuzzy axioms (an axiom is the smallest unit of an ontology; it defines a formal relation between the entities of the ontology), and a concrete domain. A concrete domain is a pair (Δ_D, Φ_D), where Δ_D is a concrete interpretation domain and Φ_D is a set of concrete predicates [72]. To fuzzify a crisp ontology, it has to be translated to a language supported by an FO reasoner. Fuzzy OWL 2 parsers convert Fuzzy OWL 2 ontologies into the syntax of the fuzzyDL [73] and DeLorean [74] reasoners. The main idea in fuzzyDL is that roles and concepts are interpreted as subsets of an interpretation's domain. To define formal semantics, an interpretation I must be considered. DL associates semantics with roles, concepts, and individuals by utilizing an interpretation I = (Δ^I, ·^I), where Δ^I is the domain (of individuals) of I, and ·^I is an interpretation function that maps:
• Individual names a to domain elements a^I ∈ Δ^I,
• Class names C to sets of domain elements C^I ⊆ Δ^I,
• Role names R to sets of pairs of domain elements R^I ⊆ Δ^I × Δ^I.
The interpretation function can be extended to concept descriptions by induced definitions for each of the constructors introduced above. Axioms in fuzzyDL hold with a certain degree of certainty or completeness. The satisfaction of a fuzzy axiom E by a fuzzy interpretation I, denoted I ⊨ E, is defined in [73].
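In conventional DL notation, the induced definitions and the degree-based satisfaction conditions take the following form. This is a sketch in standard textbook notation and is not copied from [73], so the exact formulation there may differ. In the fuzzy case it is assumed that C^I and R^I are membership functions into [0, 1], that α ∈ (0, 1] is a degree, and that ⇒ denotes the fuzzy implication of the chosen fuzzy logic.

```latex
% Crisp semantics of the constructors (standard DL definitions)
\begin{align*}
(C \sqcap D)^{\mathcal{I}} &= C^{\mathcal{I}} \cap D^{\mathcal{I}} \\
(C \sqcup D)^{\mathcal{I}} &= C^{\mathcal{I}} \cup D^{\mathcal{I}} \\
(\neg C)^{\mathcal{I}} &= \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}} \\
(\exists R.C)^{\mathcal{I}} &= \{\, x \in \Delta^{\mathcal{I}} \mid \exists y\colon (x,y) \in R^{\mathcal{I}} \wedge y \in C^{\mathcal{I}} \,\} \\
(\forall R.C)^{\mathcal{I}} &= \{\, x \in \Delta^{\mathcal{I}} \mid \forall y\colon (x,y) \in R^{\mathcal{I}} \rightarrow y \in C^{\mathcal{I}} \,\}
\end{align*}

% Satisfaction of fuzzy axioms by a fuzzy interpretation (assumed conventional form)
\begin{align*}
\mathcal{I} \models \langle a : C,\ \alpha \rangle
  &\iff C^{\mathcal{I}}(a^{\mathcal{I}}) \geq \alpha \\
\mathcal{I} \models \langle (a,b) : R,\ \alpha \rangle
  &\iff R^{\mathcal{I}}(a^{\mathcal{I}}, b^{\mathcal{I}}) \geq \alpha \\
\mathcal{I} \models \langle C \sqsubseteq D,\ \alpha \rangle
  &\iff \inf_{x \in \Delta^{\mathcal{I}}} \bigl( C^{\mathcal{I}}(x) \Rightarrow D^{\mathcal{I}}(x) \bigr) \geq \alpha
\end{align*}
```

For instance, under this reading the assertion that a patient has high systolic blood pressure with degree 0.9 is satisfied by any interpretation in which the membership of that patient in the concept is at least 0.9.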

Dataset Sources
To implement the proposed framework on a large-scale distributed benchmark dataset, we used the MIMIC-III (Medical Information Mart for Intensive Care) clinical database. The MIMIC-III dataset itself does not come in multiple data formats: all tables are provided as a collection of Comma-Separated Value (CSV) files. For this reason, we divided the MIMIC-III dataset into two groups. The first group consists of MIMIC-III CSV files as is, without adaptation. The second group consists of CSV files loaded into a MySQL database. In addition, we also use the openEHR ADL archetype data model. The dataset used in this study includes identified data of 100 patients.

Data Source #1: MIMIC-III CSV Format
We chose three of the most important MIMIC-III tables to prepare the dataset used. Those tables are PATIENTS, LABEVENTS, and D_LABITEMS. We generated a new dataset of averaged values for some laboratory tests and called it "PatientTests". That

Data Source #3: Semantic openEHR Archetypes
openEHR is a not-for-profit foundation. It provides open specifications, clinical models, and software that can be used as interoperability solutions for healthcare [13]. The openEHR archetypes are stored online in the Clinical Knowledge Manager (CKM) repository [13]. CKM contains a set of archetypes and templates that can be reused in various healthcare applications. We chose the archetype with ID openEHR-EHR-CLUSTER.macroscopy_colorectal_carcinoma. That archetype is used for recording detailed findings of colorectal cancer found on macroscopic histopathological examination. Figure 3 shows the mind map view of this archetype in the LinkEHR manager tool [75,76], a tool designed by the archetype community to support editing archetypes. It works with archetypes in any language. The archetype is composed of a set of nodes and instances. For example, the node "Tumor dimensions" refers to the maximum tumor dimensions. "Distance of tumor from dentate line" gives the distance of the tumor from the dentate line. "For rectal tumors" refers to findings related solely to rectal tumors. "Tumor perforation" refers to perforation of the tumor.

The Proposed Architecture Model
In this paper, we propose fuzzy ontology as a solution for the SI problem in distributed EHRs. Our prime aim was to develop a model that could unify all the data stored in different file systems, structures, and formats. Healthcare by nature includes many different providers ("players"). Each provider makes treatment decisions independently, using private judgment. Thereby, there must be a way to share data easily among those different players. Sharing information across different EHRs in a secure and meaningful way improves patient safety and the healthcare industry as a whole. We propose a semantic ontological architecture that can unify different EHR data formats, as drawn in Figure 4.
The proposed architecture includes three main layers. The first layer converts each input source into an OWL ontology. As mentioned before, an EHR is a combination of a patient's linked data. Many medical data models describe the data structures of information in EHRs [77]. The different parties involved in EHRs have to cooperate during patient care. Thereby, all those medical data models must be harmonized to avoid duplicate data entry in healthcare. At the same time, healthcare providers and physicians often need access to those data in an integrated way. This layer aims to integrate and unify all the existing data models. Our main point of departure is to convert each input source to an OWL ontology model. We selected ontology in particular because it can unify many different parties. It also increases the intelligence and efficiency of clinical decision support systems.
The second stage, "Global Unified Ontology", combines two or more ontologies and builds a new combined one. Its main purpose is to give a single, unified user interface to all the different and heterogeneous data sources, so that all the required information can be retrieved easily from the local ontologies through the combined one. This provides high-level users with a clear meaning for the concepts used. In ontology merging, all information about the components of the input ontologies is preserved. Various techniques and methodologies are used for merging ontologies, such as the PROMPT Protégé plugin [78], the web-based VOAR (Visual and Integrated Ontology Alignment Environment) [79], and ontology integration systems (OISs) [80]. The merged ontology must avoid redundancy and conflict between identical components. Merging maps an entity in the first input ontology to an entity in the second input ontology. The first two stages of the framework were implemented in more detail previously [81]. This work expands the crisp ontology into a fuzzy-based one. The integrated ontology is fuzzified using the FuzzyOWL2 Protégé plug-in to represent uncertain and vague relationships, attributes, and concepts in real medical knowledge, thereby improving decision-making efficiency. Our point of departure for using FO is that the medical domain by nature is error-prone, and FO is more accurate, reusable, and shareable than the classical ontology in such a vague domain.
To fuzzify the crisp ontology, we use the FuzzyOWL2 Protégé plugin [82]. All the elements of the original classical ontology (concepts, attributes, modifiers, instances, data properties, object properties, etc.) are fuzzified by assigning a membership value. That plug-in allows specification of the type of fuzzy logic used, fuzzy modified concepts, and the definition of fuzzy data types, weighted concepts, fuzzy nominals, weighted sum concepts, fuzzy modifiers, fuzzy modified roles, and fuzzy axioms. The main idea of FuzzyOWL2 is to extend the elements of the classical ontology with OWL 2 annotation properties. The main focus of this paper is to aggregate all the different distributed EHR formats into a centralized semantic query point. Physicians can interact directly with the unified fuzzy ontology. They can use natural language queries regardless of all the different low-level formats. The proposed architecture is described in Algorithm 1.

Algorithm 1: Fuzzy ontology preparation.
Data: Structured heterogeneous data sources i  /* the heterogeneous data sources could be in any format */
Result: Unified fuzzy ontology
begin
    for each input system i do
        convert i into an ontology format;
    end
    Integrate all the constructed ontologies into a global crisp one.  /* through ontology alignment and similarity computation */
    Check the consistency of the integrated ontology with the Pellet reasoner.
    /* Fuzzy ontology creation */
    Convert the crisp ontology to a global fuzzy one using the FuzzyOWL2 plugin.
    Check the correctness of the integrated fuzzy ontology with the fuzzyDL reasoner.
end

Results
The proposed framework has three main layers: local ontology construction, the unified global ontology, and the fuzzy integrated ontology. In the following sections, we present the results of those layers. Besides, we evaluate the integrated crisp ontology and the integrated fuzzy one by executing some semantic queries.

Local Ontologies Construction
This layer constructs a local ontological model from each input data source. We chose ontology as it provides a formalized and shareable way of constructing health data. In addition, it expresses the semantics of domain terms in a language that machines can comprehend. Concerning data source #1, the main table was transformed into an OWL class "PatientTests", all 8 columns were mapped into 8 datatype properties, and 100 records were successfully mapped to 100 individuals of the constructed ontology. A total of 953 axioms, 11 declaration axioms, and 930 logical axioms were generated during the mapping. Figure 5 shows the data properties, individuals, and OntoGraf view of the constructed ontology. Some of the generated axioms are introduced as follows:
Regarding data source #2, the "MIMIC-III MySQL adapted database", the tables used are ICUSTAYS, CAREGIVERS, and CHARTEVENTS. Each database table is represented as an OWL sub-class of the class Thing (which acts as the set of all individuals in the Protégé project). Each tuple/record of a table is mapped to an instance of that table's corresponding class. Each attribute is converted into a datatype property with the same data type as in the database, and the value of the attribute is mapped as a value of that data type. Foreign keys are transformed into object properties. In the experiment, three tables were mapped into ontology classes with the same names as in the RDB. All 300 records were mapped into individuals. All 24 columns were mapped into datatype properties. Figure 6 clarifies a use case scenario of constructing an ontology from a relational database, showing how a table is converted to an OWL class and a field into a data property. The constructed ontology contains 2868 axioms, 33 declaration axioms, and 2799 logical axioms.
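The table-to-ontology mapping rules described above (table to class, column to datatype property, record to individual) can be sketched as follows. This is a simplified illustration that emits plain (subject, predicate, object) tuples rather than a real OWL file, and it is not the tool used in this work. The function name, the `table.column` naming of properties, the `table_key` naming of individuals, and the sample columns and values are all assumptions; foreign keys to object properties and data type handling are left out.

```python
import csv
import io


def table_to_triples(table_name, csv_text, key_column):
    """Map one table to triples: table -> class, column -> datatype property,
    row -> individual with one property assertion per column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    triples = [(table_name, "rdf:type", "owl:Class")]
    # One datatype property per column, with the table's class as its domain.
    for column in reader.fieldnames:
        prop = f"{table_name}.{column}"
        triples.append((prop, "rdf:type", "owl:DatatypeProperty"))
        triples.append((prop, "rdfs:domain", table_name))
    # One individual per record; each cell becomes a data property assertion.
    for row in reader:
        individual = f"{table_name}_{row[key_column]}"
        triples.append((individual, "rdf:type", table_name))
        for column, value in row.items():
            triples.append((individual, f"{table_name}.{column}", value))
    return triples


# Hypothetical two-row extract; the real PatientTests table has 8 columns and 100 rows.
sample = "subject_id,gender,glucose\n1,M,5.4\n2,F,6.1\n"
triples = table_to_triples("PatientTests", sample, "subject_id")
individuals = [s for s, p, o in triples if p == "rdf:type" and o == "PatientTests"]
properties = [s for s, p, o in triples if o == "owl:DatatypeProperty"]
```

On the sample extract this yields one class, three datatype properties, and two individuals, mirroring on a small scale the one class, 8 properties, and 100 individuals reported for data source #1.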
Some of the generated axioms are described as follows: The caregivers class and its data properties are defined as:
For data source #3, "Semantic openEHR Archetypes", the constructed ontology contains 3612 axioms, of which 340 are declaration axioms and 2510 are logical axioms. Some of the generated axioms are described as follows:

Integrated Crisp Ontology
In this step, we merge all the local ontologies into a single global crisp ontology. This step may introduce conflicts: some entities may be redundant, and different ontologies may contain semantically equivalent (synonymous) concepts. In the experiment, subject_id is the only repeated entity. In Protégé, each entity has a unique IRI, so the IRIs of equivalent entities in the two ontologies must be renamed to be identical. Accordingly, the "patienttests.Subject_id" entity was given the same IRI as the "chartevents.subject_id" entity. The integrated crisp ontology includes 7439 axioms, 4769 logical axioms, 884 declaration axioms, 451 individuals, and 42 annotation properties. The class hierarchy of the resulting unified crisp ontology is depicted in Figure 7.
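The IRI alignment described above can be illustrated with a minimal sketch. It assumes that each local ontology is represented as a collection of (subject, predicate, object) tuples; the shared IRI `ex:subject_id`, the helper names `align` and `merge`, and the sample triples are hypothetical and do not reproduce the Protégé workflow.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def align(triples: Iterable[Triple], same_as: Dict[str, str]) -> Set[Triple]:
    """Rewrite every term that has an entry in same_as to its shared IRI."""
    return {
        (same_as.get(s, s), same_as.get(p, p), same_as.get(o, o))
        for s, p, o in triples
    }


def merge(ontologies: List[Iterable[Triple]], same_as: Dict[str, str]) -> Set[Triple]:
    """Union of all aligned ontologies; duplicate axioms collapse in the set."""
    merged: Set[Triple] = set()
    for onto in ontologies:
        merged |= align(onto, same_as)
    return merged


# Hypothetical fragments of two local ontologies.
csv_onto = [
    ("patienttests.Subject_id", "rdf:type", "owl:DatatypeProperty"),
    ("PatientTests_1", "patienttests.Subject_id", "10006"),
]
rdb_onto = [
    ("chartevents.subject_id", "rdf:type", "owl:DatatypeProperty"),
    ("chartevents_1", "chartevents.subject_id", "10006"),
]
# Both property names are mapped to one (assumed) shared IRI.
same_as = {
    "patienttests.Subject_id": "ex:subject_id",
    "chartevents.subject_id": "ex:subject_id",
}
merged = merge([csv_onto, rdb_onto], same_as)
for t in sorted(merged):
    print(t)
```

After alignment, the two property declarations collapse into a single axiom, and individuals from both sources can be joined on the shared property.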

Integrated Crisp Ontology Evaluation
Ensuring the quality and correctness of the output crisp ontology is an important task. SPARQL [83] is a query language for data represented in the RDF data model. Before querying, the output ontology has to be classified by a reasoner. We used the Pellet reasoner [65] to validate the consistency of the integrated ontology. Table 2 shows some competency questions, their corresponding SPARQL queries, and the results.
In this section, a set of queries was run on the resulting ontology. Queries can be executed in Protégé using SPARQL. The main objective of this step is to build a single data model from many models. For example, Q1 retrieves the results of the main tests for a specific patient; some of these results are stored in the PatientTests CSV file and others in the chartevents RDB table. Similarly, Q5 is a simple query that retrieves all data (from the chartevents table) of patients whose age is very old (from the PatientTests table). This step of the proposed framework achieves syntactic interoperability by aggregating data with heterogeneous formats and structures.
Evaluating an ontology requires querying it to measure its quality. Although we used SPARQL to query the ontology, SPARQL does not fully understand the semantics of OWL [84]. Moreover, the queries depend on crisp thresholds. For example, Q2 asks for patients with high blood pressure, and we used a crisp value of systolic blood pressure (SBP) above 140 or diastolic blood pressure (DBP) above 90. Similarly, Q3 asks for patients with low blood hemoglobin, and we used a crisp hemoglobin value below 10. To solve these problems, we extended the crisp integrated ontology into an FO. In the following, we aim to achieve semantic interoperability by using fuzzy logic in the ontology, because FO was proposed as a solution for handling semantic meaning in an inconsistent and uncertain world [85].
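The limitation of crisp thresholds can be made concrete with a small sketch. The cut-offs 140 and 90 are those used for Q2; the `Reading` class, the function name, and the three readings are hypothetical examples and are not taken from the dataset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reading:
    patient_id: str
    sbp: float  # systolic blood pressure
    dbp: float  # diastolic blood pressure


def crisp_high_bp(readings: List[Reading],
                  sbp_cut: float = 140.0,
                  dbp_cut: float = 90.0) -> List[str]:
    """Return patients whose SBP or DBP strictly exceeds the crisp cut-offs."""
    return [r.patient_id for r in readings
            if r.sbp > sbp_cut or r.dbp > dbp_cut]


# Hypothetical readings for illustration.
readings = [
    Reading("p1", 141.0, 80.0),   # just above the SBP cut-off
    Reading("p2", 140.0, 90.0),   # exactly on both cut-offs
    Reading("p3", 118.0, 76.0),   # clearly below both
]
selected = crisp_high_bp(readings)
print(selected)
```

Patient p2 sits exactly on both boundaries and is excluded, although clinically it differs little from p1. A fuzzy definition of "high" would instead assign both patients similar membership degrees.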

Fuzzification of the Integrated Ontology
The first step in fuzzifying any system is to establish the need for fuzziness [86]. Healthcare is by nature complex and fractal, and its data types may have many parameters. In addition, data may be missing. It is impossible to give an exact description or definition of medical concepts, concept instances, attributes, data types, and relationships, because their boundaries are not clear. As a result, medical data may introduce considerable uncertainty into decision-making [87].
That process is repeated for all linguistic variables, and all the features were fuzzified with the help of medical experts. We have 13 features extracted from the LABEVENTS and CHARTEVENTS tables of MIMIC-III, all of which are numerical. They were selected on the basis of previous studies and the advice of medical experts. We used MATLAB to define all the fuzzy membership functions (MFs). Table 3 shows more fuzzy features, their linguistic terms, MF ranges, and MF parameters. The ranges, membership function plots, and fuzzy sets for the WBC and hemoglobin tests are shown in Figure 9. Figure 10 shows an example of a fuzzy datatype, and Figure 11 shows an example of datatype fuzzification in the resulting integrated ontology.
After fuzzification of all the features, fuzzy modifiers are assigned. A fuzzy modifier makes it possible to use real-world linguistic words (such as very, little, low, mild, slightly, and recently) that alter the degree of membership of fuzzy data types. Modifiers improve the expressiveness of the ontology and of its semantic queries, and they can express vagueness in measurements. We defined the fuzzy modifier "very" as linear(0.85) and "verylow" as linear(0.1). Fuzzy modifiers can be used to generate new fuzzy properties; for example, very(Veryhigh.SBP) means that the patient has a very, very high SBP. Figure 12 depicts the definition of a fuzzy modifier.
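As a rough illustration of how a fuzzy datatype and a modifier act on a numerical feature, the sketch below implements a trapezoidal membership function and a generic piecewise-linear modifier. All numerical parameters are hypothetical and are not the values in Table 3. The way a constant such as 0.85 in linear(0.85) translates into the breakpoint (a, b) depends on the reasoner and is not assumed here, so the breakpoint is passed explicitly.

```python
def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear between.

    Assumes a < b <= c < d.
    """
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def linear_modifier(mu: float, a: float, b: float) -> float:
    """Piecewise-linear hedge through (0, 0), (a, b), and (1, 1).

    Assumes 0 < a < 1 and 0 <= b <= 1; mu is a membership degree in [0, 1].
    """
    if mu <= a:
        return b * mu / a
    return b + (1.0 - b) * (mu - a) / (1.0 - a)


# Hypothetical fuzzy set "normal hemoglobin" with illustrative parameters.
normal_hb = trapezoid(11.0, 10.0, 12.0, 16.0, 18.0)

# Hypothetical hedge with breakpoint (0.5, 0.25): it lowers intermediate
# degrees while keeping 0 and 1 fixed, as an intensifying modifier would.
modified = linear_modifier(normal_hb, a=0.5, b=0.25)

print(normal_hb, modified)
```

A hedge whose breakpoint lies below the diagonal (b < a) concentrates the fuzzy set, while one above the diagonal dilates it; either choice keeps full and zero membership unchanged.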

Integrated Fuzzy Ontology Validation
After building the crisp ontology and fuzzifying it, an evaluation is needed to ensure that the integrated FO represents vagueness correctly and appropriately. The consistency of the ontology was checked using the Pellet and HermiT 1.3.7 reasoners, and its structure was verified. In addition, we used the fuzzyDL reasoner [73] for querying and reasoning over the resulting FO. Table 5 shows some queries that test the competence and performance of the output fuzzy integrated ontology.
We fuzzified the integrated crisp ontology to address the vagueness problem in the medical domain. Note that Q5 in Table 2 returns 50 instances, whereas the equivalent query (Q4) in Table 5 returns 55 instances. Q5 in Table 2 retrieves only the patients older than 70 years (a numerical threshold). Q4 in Table 5 retrieves the old patients together with a degree of membership (e.g., 70 is old to a degree of 0.85, 65 to a degree of 0.7, and 79 to a degree of 0.9). In addition, suppose that a physician issues the query "find the old patients with Hypertensive-Stage 2 disease" (which occurs when SBP > 140 or DBP > 90), or wants to retrieve the very young patients with a very low hemoglobin percentage. With a crisp ontology, a physician who does not know the exact numerical thresholds for the required features will retrieve incorrect and imprecise answers. These two examples show the need to model vagueness and imprecision in medical features, and the FO was proposed to handle these limitations. The integrated fuzzy ontology uses linguistic variables (natural-language queries instead of numerical values) to extract the required information.
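The difference between the crisp and fuzzy versions of the "old patients" query can be sketched as follows. The right-shoulder function, its parameters (50 and 80), the minimum degree of 0.6, and the five ages are hypothetical; they do not reproduce the membership degrees reported above or the fuzzyDL query syntax.

```python
from typing import Dict, List, Tuple


def right_shoulder(x: float, a: float, b: float) -> float:
    """Membership rising linearly from 0 at a to 1 at b (assumes a < b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def crisp_old(ages: Dict[str, int], cut: int = 70) -> List[str]:
    """Crisp query: patients strictly older than the cut-off."""
    return sorted(pid for pid, age in ages.items() if age > cut)


def fuzzy_old(ages: Dict[str, int], min_degree: float = 0.6,
              a: float = 50.0, b: float = 80.0) -> List[Tuple[str, float]]:
    """Fuzzy query: patients that are 'old' to at least min_degree, ranked."""
    scored = [(pid, right_shoulder(age, a, b)) for pid, age in ages.items()]
    hits = [(pid, deg) for pid, deg in scored if deg >= min_degree]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical ages for illustration.
ages = {"p1": 60, "p2": 69, "p3": 70, "p4": 75, "p5": 82}

crisp_hits = crisp_old(ages)
fuzzy_hits = fuzzy_old(ages)
print(crisp_hits)
print([(pid, round(deg, 2)) for pid, deg in fuzzy_hits])
```

In this toy example the fuzzy query also returns the patients aged 69 and 70, who fall just at or below the crisp cut-off, and it ranks every answer by its degree of membership.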

Discussion
The main objective of this work, like that of previous works [34,89], is to develop a framework or methodology for extracting patients' EHR data that are spread across many different locations in many different formats. An EHR is an information system that collects an individual's health information from birth to death. It can be certified, registered, and shared between different healthcare providers [90]. EHRs improve care quality by making healthcare data available and accessible when and where needed, reducing medical errors, sharing information between healthcare providers, and providing an effective means of communication. In recent years, the growth of EHRs has been recognized as an efficient and viable model for genetic research [91].
Adopting and developing EHRs is not easy and faces many barriers [92]. Among these barriers are the lack of EHR standards for exchanging information [16], the lack of integration among different health systems, and the complexity and dynamic nature of healthcare. Another barrier is heterogeneity, meaning that the same health data can be represented by different models and in different ways. Many different data models are used to store EHR data, including relational databases under different software (e.g., Sybase, Oracle, MySQL, and PostgreSQL), Excel sheets, XML documents, CSV files, and many EHR standards (e.g., openEHR, HL7, ASTM, ISO 13606). Moreover, it is not practical for all healthcare organizations to agree on one standard system [16].
In 2009, the SemanticHEALTH project [93] reported that health professionals require access to complete and detailed patient health records. In addition, achieving interoperability in distributed EHR systems improves the quality of patient care and supports public health. Furthermore, the early detection of diseases requires a health-delivery system that can monitor health status. SI in EHR is needed to improve healthcare quality, patient care safety, effectiveness, and productivity. According to the Medicare Services Centers [94], "EHRs do not accomplish their advantages only by exchanging data from the paper shape into the advanced frame. EHRs can just convey their advantages when the data and the EHR are organized, and its meaning is clear and understandable". SI enables decision support systems by integrating patient information from multiple sources.
Our main interest is to establish a single, unified, and homogeneous data model from all heterogeneous EHR models. Most previous works [95] were concerned with using one or more EHR standards to handle this problem. However, a single standard is insufficient for unification, especially when the system is distributed. Successful EHRs require multiple parties and systems to cooperate. Ontology was proposed to achieve that goal because it can be used as metadata that defines a vocabulary of the used variables with explicit, clear semantics. Ontology thereby makes data readable by both machines and humans. Moreover, ontologies play a significant role in the success of the SW [96], since they enable knowledge representation as well as the reuse and sharing of that knowledge. A crisp ontology achieves the desired goal to some degree. However, it has been noted in recent years that classical ontologies and their languages are not suited to imprecise and vague knowledge, which is fundamental to several real-world applications [97]. Thus, by extending a crisp ontology to an FO, the medical vagueness problem could be solved.
We fuzzified our integrated ontology to accommodate linguistic variables. The framework provides a more consistent, reliable, and comprehensive interoperable EHR environment. It could help physicians and specialists retrieve the required patient data with more natural language queries. In addition, it improves healthcare performance by supporting correct decisions based on correct, aggregated information that is available at the right time. To the best of our knowledge, this is the first implemented framework that uses a fuzzy ontology to handle the problem of interoperability in distributed EHRs. It can integrate many different EHR formats, and it also handles the medical vagueness problem.
Table 6 compares the proposed model with studies from the literature in terms of the data formats used, interoperability level, and methodology. The "Standard" criterion indicates whether EHR standards were used.
Although the proposed framework moves toward achieving full interoperability in distributed EHRs, it has several limitations. First, it should handle unstructured EHR data to broaden the scope of the implementation. Unfortunately, most EHR clinical data are unstructured and still not computable [102]. EHRs contain three main types of data: structured (coded data, such as RDB, lab tests, and diagnosis codes), semi-structured (e.g., XML), and unstructured. Most EHR data are unstructured by nature (free-text clinical notes, radiology reports, and medical images such as magnetic resonance imaging (MRI)). Structured and semi-structured formats are simple to retrieve, whereas unstructured data require additional tools, such as natural language processing (NLP). Handling unstructured data is an urgent issue for achieving successful and complete EHRs. Second, modern technologies beyond FO, such as deep learning, may be used to achieve our main goals. Solares et al. [103] showed the strength of recurrent neural networks (RNNs) as a deep learning architecture for dealing with the temporal nature of EHRs. Third, a graphical user interface is needed to make the implemented framework easy to use. Fourth, we intend to measure the sensitivity of the proposed model.

Conclusions
SI is considered a complex problem in health informatics. This paper proposes a fuzzy ontological intelligent system for integration and SI in distributed EHRs. We evaluated three EHR data formats: MIMIC-III CSV files, the MIMIC-III adapted MySQL database, and the openEHR ADL archetype model. Our main idea is to convert each data format to an ontology representation (RDF or OWL). This phase can be extended to any other EHR structure and format. Then, we integrated all the output ontologies into a unified crisp ontology. In the final stage, we extended the crisp ontology to a fuzzy ontology. We used the SPARQL and fuzzyDL query languages to evaluate some queries. FuzzyDL was more comprehensive in its results, as it relies on linguistic variables rather than numbers. Further development of the framework will concentrate on the limitations of this work, as discussed previously.