A Semi-Automatic Semantic Consistency-Checking Method for Learning Ontology from Relational Database

Abstract: To tackle semantic collisions and inconsistencies between ontologies and the original data model while learning ontology from a relational database (RDB), a semi-automatic semantic consistency checking method based on graph intermediate representation and model checking is presented. Initially, the W-Graph, an intermediate model between databases and ontologies, is utilized to formalize the semantic correspondences between databases and ontologies, which are then transformed into a Kripke structure and eventually encoded as an SMV program. Meanwhile, description logics (DLs) are employed to formalize the semantic specifications of the learned ontologies, since OWL DL shows good semantic compatibility and DLs offer excellent expressivity. Thereafter, the specifications are converted into computation tree logic (CTL) formulas to improve machine readability. The task of checking semantic consistency can thus be converted into a global model checking problem that is solved automatically by a symbolic model checker. Moreover, an example is given to demonstrate the specific process of formalizing and checking the semantic consistency between learned ontologies and the RDB, and a verification experiment was conducted to verify the feasibility of the presented method. The results show that the presented method can correctly check and identify different kinds of inconsistencies between learned ontologies and their original data model.


Introduction
Knowledge-based integration is regarded as one of the most efficient integration methods due to the excellent semantic interoperability of knowledge bases (KBs). Typically, the mainstream methods for constructing knowledge bases are based on manual transformation and mapping; hence, it is a costly and tedious task to construct and maintain knowledge bases in this traditional manual way [1]. Ontology, a representative formalized knowledge base, provides a rich semantic reference for schema mapping and data integration due to its semantic interoperability and rigorous mathematical foundation [2]. Similarly, the traditional methods for constructing an ontology from an RDB are mainly based on predefined rules, which require considerable effort and domain expertise.
Ontology learning (OL) is a kind of knowledge representation learning method, aiming to (semi-)automatically construct ontologies from various data, in which the entities and relationships are usually identified and extracted based on semantic computation and knowledge inference. The prevailing techniques of learning ontology can be classified into the following categories: association rule mining (ARM), formal concept analysis (FCA), inductive logic programming (ILP), neural networks (NN), and machine learning [3]. To some extent, ontology learning not only improves the efficiency of ontology construction but also eliminates the biases and limitations of human knowledge [4].
There is no doubt that ontology learning could free humans from the tedious mapping and transformation and minimize the negative influence of human knowledge biases while manually constructing ontologies. However, it is a common phenomenon that some inconsistencies and semantic conflicts will inevitably occur during (semi-)automatic ontology learning [5]. Due to the various naming conventions of entities and attributes, and the different semantic contexts that exist in different databases, it is unavoidable that semantic collisions and inconsistencies will occur between learned ontologies and their original databases. Consequently, these inconsistencies will weaken the capabilities of the semantic interoperability of learned ontologies. Therefore, the issues of inconsistencies and redundancies are becoming the bottleneck in the scenario of (semi-)automatic ontology learning from a relational database.
In general, ontology consistency could be manually checked in the ontology evaluation, in which several criteria are defined to evaluate the quality of ontology, e.g., consistency, completeness, conciseness, etc. [6]. Semantic consistency is a fundamental criterion for evaluating the quality of ontology during ontology construction and merging. The semantic consistency checking not only checks for consistency at the syntactic level, but also at the semantic level. However, the ontology evaluation is a time-consuming and laborious work [7]. There are even more inconsistencies that need to be checked and evaluated when it comes to learning ontology from RDB, since the current ontology learning algorithm is immature. In particular, the issues of inconsistencies and redundancies will get much worse when learning ontology from multiple data sources. Hence, how to efficiently identify the inconsistencies between learned ontologies and their original databases is one of the critical tasks in the (semi-)automatic ontology learning from RDB.
To address the above issues, this article presents a semi-automatic semantic consistency checking method based on graph intermediate representation and model checking. We formalize the semantic correspondences between databases and ontologies based on labeled transition systems and the Kripke structure by introducing the W-Graph, a graph-based intermediate representation model. We initially encode the specifications of the learned ontologies with description logics (DLs) to leverage their excellent expressivity [8], and then translate these semantic specifications from DLs into CTL formulas to improve machine readability. The remainder of this article is organized as follows. The related work on semantic consistency checking of ontologies is summarized in Section 2. The problem of inconsistencies during learning ontology from a relational database is described and the preliminary definitions are given in Section 3. The specific processes of the presented semantic consistency checking method are given, and the corresponding verification experiment is conducted to verify the feasibility and effectiveness of the presented method, in Section 4. The conclusion and future work are given in Section 5.

Related Work
Ontology consistency checking is a critical task in ontology construction, ontology alignment, and ontology evolution, by which the inconsistencies could be identified and eliminated. The prevailing methods for checking ontology consistency could be classified into two categories: logical reasoning and graph intermediate representation.

Consistency Checking Based on Logical Reasoning
Logical reasoning is a classic method to check the consistency of knowledge bases, in which hypotheses, arguments, and consequences can be inferred and justified. To ensure the consistency of ontologies, Baclawski et al. [9] designed a tool (ConsVISor) for checking ontology consistency. ConsVISor checks whether a given ontology is consistent by verifying its axioms based on the inference of a logic programming engine. However, it only identifies inconsistencies at the syntactic level, which may lead to errors when it meets synonyms or abbreviations. Because the OWL-DL reasoning mechanism provides automatic detection of inconsistencies through formal description logic, Neumann et al. [10] designed a reference architecture and prototype of the ontological XML database system (OXDBS). In this system, a consistency validation module was designed, in which consistency is checked based on the formal validation of the OWL-DL reasoner. Apart from First-Order Logic (FOL) and DLs, the Semantic Web Rule Language (SWRL) has also been utilized to formalize semantics. To resolve semantic conflicts in anti-fraud rule-based expert systems, del Mar et al. [11] proposed a semantic conflict resolution method to check the consistency of rules in rule-based expert systems. More specifically, different kinds of rules were formulated in SWRL, and then the inconsistent, overlapping, and duplicate rules were detected in the rule-based expert system by introducing the ontology reasoning mechanism.
In addition to checking ontology consistency based on logical reasoning, some works focus on inconsistency prediction and elimination. To eliminate inconsistencies earlier, Bayoudhi [12] predicted the logical inconsistencies arising while updating OWL2-DL ontologies based on predefined rules, so that potential inconsistencies could be detected and resolved at an earlier time. To tackle the issue of inconsistent ontologies, Rosati et al. [13] presented a reasoner-based Quonto Inconsistent Data handler (QuID). In this method, the semantic inconsistencies of ontologies in the ABox can be repaired automatically based on data manipulation and query rewriting.
It is noteworthy that some unexpected results, e.g., unsatisfiable classes, erroneous correspondences, etc., may occur while checking ontology consistency based on reasoning. In particular, it is ineffective for the reasoner to check inconsistencies if there exists no standard logical inconsistency [12]. In this case, domain experts are required to manually determine whether these unexpected results are acceptable [14]. To minimize human intervention in checking consistency during an automatic ontology mapping and merging system, Fahad et al. [15] proposed a method to identify semantic inconsistencies from initial mappings by leveraging subsumption analysis over the elements of ontologies, i.e., concepts and properties. Aiming to semi-automatically construct ontologies from the Web corpus, Bai et al. [16] proposed a domain ontology learning approach, in which a two-stage clustering approach and an SOM neural network are utilized to extract and build a domain ontology from a Chinese-document corpus. Accordingly, a consistency checking method based on the Racer reasoner is presented to check the consistency of the learned ontologies. Additionally, to address the inconsistency between the ABox and TBox during ontology reasoning, Paulheim et al. [17] proposed an ABox consistency checking method based on machine learning. In this method, an approximate reasoner is introduced to check ABox consistency, which is ultimately treated as a binary classification problem, whereby the ABox is translated into feature vectors for training decision trees.

Consistency Checking Based on Graph Intermediate Representation
Considering that both an ontology and an RDB can be formalized as directed graphs, the correspondences between them can be specified and formalized based on graph mapping. Consequently, consistency checking based on graph intermediate representation has attracted scholars' attention. To address the incapability of logical reasoning to provide an explanation that helps the user resolve the identified inconsistencies, Lam et al. [18] presented a graph-based ontology inconsistency checking method. More specifically, the ontology is formalized as a directed graph, in which the consistency between concepts is checked by analyzing the paths of the graph. In order to identify the inconsistencies in building an ontology, Yang et al. [19] proposed a semantic consistency checking method based on a graph-based intermediate model. In this method, the task of semantic consistency checking is cast as a semantically equivalent query over the isomorphism graph, which is eventually solved by the model checker. However, the intermediate graph is constructed based on the RDB schema and instance, which ignores the subtle differences between the intermediate graph model and the ontology. Strictly speaking, we cannot conclude that the constructed ontologies are consistent with the original database when the query result of the sub-graph is consistent with the intermediate graph model.
In addition to checking the ontology consistency during the construction of ontologies, some works address the consistency checking based on graph-intermediate representation during the ontology evolution and mapping. To tackle the inconsistency issue in updating the ontologies, Mahfoudh et al. [20] proposed an a priori approach to resolve the ontology inconsistencies based on the Simple PushOut (SPO) graph transformation. In this approach, the ontology changes and inconsistencies were formalized as a typed attributed graph, which provided a mechanism to avoid the inconsistent ontologies by controlling graph transformations and rewriting rules with SPO graph transformations. Similarly, to avoid the inconsistency in transforming RDB to Resource Description Framework (RDF) graph, Jun et al. [21] focused on the semantics-preserving mapping method based on rules. In this work, several rules were defined based on predicate logic to avoid semantic loss during mapping of multi-column keys constraints into RDF constraints. However, the definition of rules requires domain expertise, which is a tedious task as well.

Brief Summary
To recap, the existing works on the topic of ontology consistency checking can be classified into two categories: logical reasoning [10,11,13,15-17] and graph-based intermediate representation [18-21]. Although consistency can be checked directly by using the ontology reasoning mechanism, this approach is incapable of explaining to the user how to resolve the identified inconsistencies [18]. In particular, the performance of consistency checking heavily depends on the consistency and integrity of the ontologies; therefore, reasoning over inconsistent ontologies may lead to erroneous axioms and conclusions [9]. It is worth mentioning that ontology consistency checking based on reasoning can only check the consistency between the ABox and TBox of an ontology [17]; it is incapable of checking the consistency between ontologies and their original knowledge in the process of building an ontology.
Currently, most knowledge still resides in relational databases; thus, it is a crucial task to check the semantic consistency during learning ontologies from RDB. Nevertheless, the existing methods of consistency checking mainly focus on checking the consistency of learning ontology from the Web and document corpora [16], while less work focuses on checking the consistency of learning ontology from RDB. Therefore, how to efficiently check the semantic consistency between an ontology and its original RDB in the early phase of ontology learning from the database remains an open question.
To address this question, we present a semi-automatic semantic consistency checking method based on graph intermediate representation and model checking. The current work is similar to the method proposed in [19], as both employ the W-Graph as an intermediate model to formalize semantic correspondences and check ontology consistency. As we previously mentioned, the equivalent sub-graph query over the intermediate graph may lead to incorrect results when the RDB schema differs significantly from the ontologies. To overcome this limitation, we directly cast the task of consistency checking as a global model checking problem rather than an equivalent sub-graph query over an intermediate graph model. More specifically, we encode the specifications of the learned ontologies with DLs, and then we translate these specifications into CTL formulas. Thereafter, the ontology consistency can be checked by the model checker by verifying whether the formalized specifications are consistent with the graph-based intermediate model, i.e., the Kripke structure.

Problem Statement and Preliminaries
In this section, we describe the problem of semantic consistency between ontologies and databases when learning ontology from RDB, and we briefly introduce the formal definitions of two intermediate graph models and logic languages.

Problem Statement
The semantics of an ontology O are formally defined by using the axioms of DLs. Since DLs are decidable fragments of FOL [22], the semantics of an ontology can be directly defined by FOL formulas, denoted as Σ_O. For each triple t ∈ O of the ontology, the semantics over the predicate triple are defined as φ_t. Accordingly, the semantics of the ontology O encoded as FOL formulas are denoted as {φ_t | t ∈ O}.
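For illustration (the triples and predicate names below are hypothetical examples, not drawn from the paper's database), a relational triple can be encoded as a binary predicate, a type assertion as a unary predicate, and a subsumption axiom as a universally quantified implication:

```latex
% Hypothetical examples of encoding ontology knowledge as FOL formulas
t_1 = (\mathit{Marco},\ \mathit{teaches},\ \mathit{Database})
  \;\Rightarrow\; \varphi_{t_1} = \mathit{teaches}(\mathit{Marco}, \mathit{Database}) \\
t_2 = (\mathit{Marco},\ \mathit{rdf{:}type},\ \mathit{Professor})
  \;\Rightarrow\; \varphi_{t_2} = \mathit{Professor}(\mathit{Marco}) \\
\mathit{Student} \sqsubseteq \mathit{Person}
  \;\Rightarrow\; \forall x\, \big(\mathit{Student}(x) \rightarrow \mathit{Person}(x)\big)
```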
Traditionally, the basic elements of a relational database, i.e., the database schema R, the constraints Σ over R, and an instance I of R, can be mapped into an RDF graph and OWL by using a direct mapping M. As mentioned earlier, M is defined based on a set of datalog predicates and rules, which requires considerable experience and effort from domain experts. A direct mapping M is said to be semantics preserving [23], namely, there is no semantic inconsistency between the RDB and OWL, if for every database schema R, set of constraints Σ (PKs and FKs), and instance I of R, the mapped ontology satisfies the ontology semantics Σ_O.

It is worth mentioning that semantic collisions and inconsistencies occur with high probability while (semi-)automatically constructing an ontology from multiple databases. The predominant semantic inconsistencies and redundancies are relationship redundancy, concept or property redundancy, and fact inconsistency. The reason behind these inconsistencies and redundancies is the variety of naming conventions among different databases. For instance, a machine can recognize that Prof is an abbreviation of Professor and that Title is synonymous with Position in academic ranks, while it is quite difficult to identify that Assistant Professor is synonymous with Lecturer. Therefore, it is a crucial task to detect and resolve these inconsistencies at an early stage while (semi-)automatically learning ontology from a relational database.

Preliminaries
This subsection introduces the formal definitions of two kinds of directed graph models, and the fundamental syntax and formula of logic language.

W-Graph
W-Graph is a graph-based formal language that provides an intermediate and semantically equivalent graph model between RDB and ontologies [24]. Essentially, it is a directed labeled graph [25], and can be formalized as a triple W_G = ⟨N, E, ℓ⟩, where:
• N = {N_a, N_c} is a finite set of nodes, where N_a is a finite set of atomic nodes (depicted as ellipses) and N_c is a finite set of composite nodes (depicted as rectangles).
• E is a set of labeled edges in the form of triples.
• ℓ is a labeling function ℓ : N ∪ E → C × (L ∪ {⊥}), where C = {solid, dashed} and L is a set of labels.
Similar to RDB and RDF, there are two kinds of W-Graph models: the W-Schema and the W-Instance. Conventionally, the W-Schema represents the patterns of the knowledge, while the W-Instance represents the concrete contents of the knowledge. Formally, W_I denotes an instance of W_G; each edge e of the instance W_I satisfies ℓ_C(e) = solid, and each dummy node n of W_I satisfies ℓ_L(n) = ⊥, where ⊥ represents the dummy nodes.
Based on the definition of W-Graph, the bi-simulation semantics could be defined. Accordingly, the semantics-preserving mapping between RDB and ontologies could be formalized by W-Graph.

Kripke Structure
Kripke structures are finite directed graphs that represent transitions between states [26]. In the state-transition graph, vertices are labeled with sets of atomic propositions (AP). Formally, a Kripke structure over a set of APs can be represented as a tuple K = ⟨S, I, R, L⟩, where:
• S is a finite set of states, I ⊆ S is a set of initial states, R ⊆ S × S is a set of transitions, and L : S → 2^AP is a labeling function that associates each state with a set of atomic propositions.
• A sequence of states and transitions π = s_0, s_1, s_2, ..., s_n is a path in the Kripke structure.
• Given a path π, L(π) = L(s_0), L(s_1), L(s_2), ..., L(s_n) is the corresponding sequence of sets of atomic propositions.
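As a minimal sketch (the state and proposition names are illustrative toy values, not taken from the paper's Mini University model), such a structure can be represented directly in Python:

```python
# Minimal Kripke structure: states, initial states, transitions,
# and a labeling function mapping each state to a set of atomic propositions.
from dataclasses import dataclass

@dataclass
class Kripke:
    states: set
    initial: set
    transitions: set   # set of (s, s') pairs, i.e. R ⊆ S × S
    labels: dict       # state -> frozenset of atomic propositions

    def successors(self, s):
        """States reachable from s in one transition."""
        return {t for (u, t) in self.transitions if u == s}

# A toy three-state model (hypothetical example).
K = Kripke(
    states={"s0", "s1", "s2"},
    initial={"s0"},
    transitions={("s0", "s1"), ("s1", "s2"), ("s2", "s2")},
    labels={"s0": frozenset({"p"}),
            "s1": frozenset({"q"}),
            "s2": frozenset({"p", "q"})},
)

print(K.successors("s1"))  # {'s2'}
```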

Computation Tree Logic
Computation tree logic (CTL) is a logic language for describing the properties of the computation tree, in which many executions can be reasoned about at once by a single formula. Formally, given a transition system T = ⟨S, →, s_0⟩, the computation tree of T is its acyclic unfolding. Let AP be a set of atomic propositions; the set of CTL formulas over AP is defined as follows:
• Every atomic proposition a ∈ AP is a CTL formula.
• If φ_1 and φ_2 are CTL formulas, then so are ¬φ_1, φ_1 ∧ φ_2, φ_1 ∨ φ_2, EX φ_1, AX φ_1, EF φ_1, AF φ_1, EG φ_1, AG φ_1, E[φ_1 U φ_2], and A[φ_1 U φ_2].
Given a Kripke structure K = ⟨S, I, R, L⟩, the semantics of a CTL formula φ is denoted as a set of states ⟦φ⟧_K. There are two kinds of CTL formulas, state formulas and path formulas [27], which are employed to verify whether given states and paths satisfy the predefined specifications, respectively. Given an initial state s_0 of the Kripke structure K, if s_0 ∈ ⟦φ⟧_K, then the Kripke structure K satisfies the CTL formula φ, denoted as K |= φ.
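To make these semantics concrete, an explicit-state checker for a fragment of CTL (EX, E[· U ·], EG) can be written as a standard fixpoint computation over a finite Kripke structure; this is an illustrative sketch, not the symbolic BDD/SAT algorithm used by NuSMV-style checkers, and the toy model below is ours:

```python
# Explicit-state CTL model checking over a finite Kripke structure.
# Returns the set of states satisfying a formula; K |= phi iff every
# initial state is in that set. Formulas are nested tuples, e.g.
# ("EU", ("AP", "p"), ("AP", "q")).

def check(states, trans, labels, phi):
    succ = {s: {t for (u, t) in trans if u == s} for s in states}
    op = phi[0]
    if op == "AP":
        return {s for s in states if phi[1] in labels[s]}
    if op == "NOT":
        return states - check(states, trans, labels, phi[1])
    if op == "AND":
        return check(states, trans, labels, phi[1]) & check(states, trans, labels, phi[2])
    if op == "EX":  # some successor satisfies phi[1]
        target = check(states, trans, labels, phi[1])
        return {s for s in states if succ[s] & target}
    if op == "EU":  # E[phi1 U phi2]: least fixpoint
        s1 = check(states, trans, labels, phi[1])
        result = set(check(states, trans, labels, phi[2]))
        changed = True
        while changed:
            new = {s for s in s1 if succ[s] & result} - result
            result |= new
            changed = bool(new)
        return result
    if op == "EG":  # greatest fixpoint: drop states with no successor in the set
        result = set(check(states, trans, labels, phi[1]))
        changed = True
        while changed:
            drop = {s for s in result if not (succ[s] & result)}
            result -= drop
            changed = bool(drop)
        return result
    raise ValueError(f"unknown operator: {op}")

# Toy model: s0 -> s1 -> s2, with a self-loop on s2.
states = {"s0", "s1", "s2"}
trans = {("s0", "s1"), ("s1", "s2"), ("s2", "s2")}
labels = {"s0": {"p"}, "s1": {"q"}, "s2": {"p"}}

print(sorted(check(states, trans, labels, ("EX", ("AP", "q")))))  # ['s0']
print(sorted(check(states, trans, labels, ("EG", ("AP", "p")))))  # ['s2']
```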

Description Logics
Description logics (DLs) are a family of formal logic languages mainly used to specify a knowledge base (KB); that is, DLs provide a mechanism to formalize and reason over knowledge from various data [22]. In DLs, knowledge is usually represented as a knowledge base KB = ⟨A, T⟩, where A is a set of assertions about individuals (ABox) and T is a set of terminological axioms (TBox). The basic elements of DLs can be categorized into three types: individuals, concepts, and roles. Given an atomic concept A, complex concepts C, D, and a role R, the following syntax, assertions, and axioms are defined:

• Syntax definitions: C, D → A (atomic concept) | ⊤ (top concept) | ⊥ (bottom concept) | ¬C (negation) | C ⊔ D (union of two concepts) | C ⊓ D (intersection of two concepts) | ∃R.C (existential restriction) | ∀R.C (universal restriction).
• Assertions and axioms: C(a) and R(a, b) denote concept assertions and role assertions for individuals a, b, concept C, and role R in the ABox; C ⊑ D denotes a subsumption axiom between concepts C and D in the TBox.
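As a small illustration of the model-theoretic view behind these constructors (the finite interpretation below is a hypothetical toy domain, not the Mini University data), concepts denote subsets of a domain and roles denote binary relations, so ¬, ⊓, ∃R.C, and ∀R.C can be evaluated directly:

```python
# Evaluating ALC concept constructors over a finite interpretation:
# concepts denote subsets of the domain, roles denote binary relations.

domain = {"marco", "dave", "db_course"}
concepts = {
    "Professor": {"marco"},
    "Student": {"dave"},
    "Course": {"db_course"},
}
roles = {
    "Teaches": {("marco", "db_course")},
    "Takes": {("dave", "db_course")},
}

def neg(c):        # ¬C: complement w.r.t. the domain
    return domain - c

def conj(c, d):    # C ⊓ D: intersection
    return c & d

def exists(r, c):  # ∃R.C: elements with at least one R-successor in C
    return {x for x in domain if any((x, y) in r and y in c for y in domain)}

def forall(r, c):  # ∀R.C: elements all of whose R-successors lie in C
    return {x for x in domain if all(y in c for y in domain if (x, y) in r)}

# "Someone who teaches a course": ∃Teaches.Course
print(sorted(exists(roles["Teaches"], concepts["Course"])))  # ['marco']
```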
The DL reasoner provides a concept consistency checking mechanism to verify whether the ABox is consistent with respect to the TBox [28]. Given a terminology T and a concept C, DLs can verify whether there exists a non-empty interpretation C^I of C that satisfies each inclusion dependency in T, which can be denoted as T |= C^I [29].

Consistency Checking Based on Model Checking
This section introduces the specific process of a semantic consistency checking method based on graph intermediate representation and model checking. To begin with, a general framework of this method is presented, and the modeling process of the semantics between ontology and RDB is described. In addition, an example is given to verify the feasibility and effectiveness of the presented method.

General Description of Proposed Method
The reasoning of ontologies provides a mechanism for checking consistency, by which only the consistency between the ABox and TBox can be checked. In this work, we address the consistency checking between ontologies and their original databases during learning ontology from RDB, rather than the consistency checking between the ABox and TBox of the ontologies.
Consistency between ontologies and databases is a state that satisfies given consistency criteria. Usually, these consistency criteria are defined based on rules; hence, consistency is checked by verifying whether the constructed ontologies satisfy a given criterion. The basic consistency criteria of ontologies are completeness and non-redundancy [6]; however, these two criteria are in tension with each other. Therefore, a trade-off between ontology completeness and non-redundancy should be made.
Semantic consistency checking is a process of identifying semantic collisions, i.e., detect inconsistencies and duplicates, which is an essential step in constructing and maintaining knowledge bases. Model checking is a computer-assisted method for analyzing and verifying the dynamical systems by modeling them as state-transition systems [30]. Specifically, model checking could be utilized to verify whether a given model M satisfies predefined specification ϕ, formally, it could be denoted as M |= ϕ.
To ensure the completeness and non-redundancy of ontologies when learning ontology from a relational database, a semantic consistency checking method based on model checking is presented. This method can be employed to check the ontology consistency at an early stage of ontology learning from RDB. To make it easier to understand, the main workflow of the presented semantic consistency checking method is depicted in Figure 1. As we can see in Figure 1, an intermediate model, the W-Graph, is utilized to formalize the semantic correspondences between ontologies and their original relational database, which are then transformed into the Kripke structure. Considering the semantic compatibility and the available inverse roles of the ALCI DL, it is employed to formalize the semantic specifications, which are eventually encoded as CTL formulas. Thereafter, the Kripke structure is encoded as an SMV program, which, along with the CTL formulas, is interpreted by the symbolic model checker. Thereby, the semantic consistency between the learned ontologies and their original database can be automatically checked by the model checker.

Introduction to Mini University Data Model
Considering that current algorithms for learning ontology from RDB are still at an exploratory stage, the Mini University ontology [31] and its corresponding database are selected as an example to describe the specific steps of semantic consistency checking based on model checking. Although Mini University is not a complex example, it is quite representative of a relational database because it contains almost all of the common database elements. This is one of the reasons why the Mini University data model and its corresponding ontology are selected to demonstrate the specific steps of this method. Figure 2 shows the database schema of Mini University; there are five regular entities and one associative entity (REGISTRATION). Meanwhile, there are different kinds of associations, i.e., one-to-one, one-to-many, and many-to-many. In particular, it also contains database constraints, i.e., primary keys and foreign keys, which represent the unique identifiers and associations, respectively.

Formalization of Relational Data Model
As mentioned in Section 4.1, model checking can be utilized to check whether a model satisfies given specifications. To leverage model checking for the semantic consistency of learning ontology from RDB, an intermediate model, the W-Graph, is utilized to model the semantics of the Mini University data model. Furthermore, DLs are employed to formalize the specification of the semantic correspondence between the initial database and the target ontologies. Inspired by model checking based on DLs [29], we convert the problem of semantic consistency checking into a global model checking problem, which can be automatically solved by the model checker.

Constructing Kripke Structure from Database
The formal definitions of the Kripke structure and W-Graph are given in Section 3.2. Based on these definitions, we can construct the Kripke structure from the relational data model by introducing the graph-based intermediate model. Accordingly, the semantics of the data model are modeled by the W-Graph before the Kripke structure is constructed from it. The construction of the Kripke structure from a relational database can be split into two phases.

Modeling Semantics of Database by W-Graph. A relational database (RDB) is a table-based data model, while W-Graph is a graph-based model, so there are several differences between these models, e.g., storage format, structure, and constraints. Since this work aims to check the semantic consistency of learning ontology from a relational database, we only transform the semantics of the database into the W-Graph model. We map the entities and relationships in the RDB into the nodes and edges of the W-Graph, respectively; thereby, the W-Schema and W-Instance are built based on the definitions of W-Graph. Accordingly, the schema and semantics of the Mini University database are transformed into the W-Instance shown in Figure 3.

Constructing Kripke Structure from W-Graph. As mentioned in Section 3.2, a Kripke structure is a finite directed graph whose vertices are labeled with sets of APs. Hence, given a W-Graph W_G, the corresponding Kripke structure over atomic propositions AP_G is defined as K_G = ⟨S_G, I_G, R_G, L_G⟩ based on the following transformation rules:
• The atomic propositions AP_G are composed of the labels of the atomic nodes N_a in W_G, formally, AP_G = {ℓ(n) | n ∈ N_a}.
• The finite set of composite nodes N_c in W_G is regarded as the set of initial states I_G in the Kripke structure.
• Each edge in E between two nodes n_i, n_j ∈ N is mapped into a transition in R_G in the Kripke structure.
• The edge labels, i.e., the plain label (l) and the inverse label (l⁻¹), are transformed into the labeling function L_G in the Kripke structure.
Based on the above transformation rules, the corresponding Kripke structure can be constructed, as depicted in Figure 4. As shown in Figure 4, each node in the W-Instance is transformed into a state, while the label functions between two adjacent nodes in the W-Instance are mapped into transitions. In particular, inverse transitions, e.g., Age⁻¹ and Name⁻¹, are introduced between states to model the possible state transitions. Formally, the atomic propositions, initial states, transitions, and labeling functions can then be defined accordingly.
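The transformation rules above can be sketched as a direct graph rewriting. The node and edge names below are illustrative placeholders rather than the full Mini University W-Instance, and for illustration we attach labels to transitions (closer to the LTS view, which Section 4 notes is expressively equivalent):

```python
# Sketch of the W-Graph -> Kripke transformation rules: atomic nodes become
# states/propositions, composite nodes seed the initial states, labeled edges
# become transitions, and each edge additionally yields an inverse transition
# (the "_inv" suffix stands in for the paper's l^-1 notation).

def wgraph_to_kripke(atomic_nodes, composite_nodes, edges):
    """edges: set of (source, label, target) triples of the W-Instance."""
    states = set(atomic_nodes) | set(composite_nodes)
    initial = set(composite_nodes)
    transitions, label_fn = set(), {}
    for (src, lbl, dst) in edges:
        transitions.add((src, dst))
        transitions.add((dst, src))          # inverse transition
        label_fn[(src, dst)] = lbl
        label_fn[(dst, src)] = lbl + "_inv"  # e.g. Name -> Name_inv
    return states, initial, transitions, label_fn

# Toy fragment: a STUDENT composite node with Name/Age atomic value nodes.
states, initial, trans, lbl = wgraph_to_kripke(
    atomic_nodes={"Dave", "21"},
    composite_nodes={"STUDENT"},
    edges={("STUDENT", "Name", "Dave"), ("STUDENT", "Age", "21")},
)
print(lbl[("Dave", "STUDENT")])  # Name_inv
```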

Formalization of Specification
To check the consistency by using model checking, the semantic specifications of the learned ontologies should be formalized by a logic formula. We suppose that the outputs of ontology learning from the Mini University database are depicted in Figure 5. As shown in Figure 5, there are 12 concepts or properties, i.e., Person, Student, Teacher, Professor, AssociateProfessor, AssistantProfessor, Program, Course, MandatoryCourse, OptionalCourse, etc., and their relationships or roles, including taxonomies, associative relationships, and inverse relationships. The main taxonomies in this ontology are subsumptions, e.g., Student and Teacher are subsumed by Person. The representative associative relationships are Enrolls, Takes, Teaches, and Includes. Moreover, there exist many inverse relationships, e.g., Teaches and Takes are the inverses of isTaughtBy and isTakenBy, respectively. In addition, there are some constraint axioms, e.g., a MandatoryCourse can only be taught by a Professor, and cardinality restrictions, e.g., a Student is a person who enrolls in at least one Program. Accordingly, the semantic specifications of the learned ontologies can be formalized by using ALCI DL TBox syntax, and the corresponding facts are formally defined by using ABox syntax. Thereby, the fact that the Professor named Marco teaches Database, which is included in the CSMSc Program, enrolled by the Student named Dave, can be formalized by the DL formula: Professor(Marco) ⊓ ∃Teaches.Course(Database) ⊓ ∃IncludedBy.Program(CSMSc) ⊓ ∃EnrolledBy.Student(Dave).

Consistency Checking Algorithm
As previously mentioned, we cast the task of checking the consistency of ontologies as a global model checking problem. Accordingly, a symbolic model checker is employed, considering that model checking returns the result of whether a finite-state model (Kripke model) satisfies a given specification [32].
In CTL model checking, the specifications encoded as CTL formulas are checked by NuSMV to verify whether they are satisfied by the given finite-state model [33]; the model checker returns a counterexample when they are not. When it comes to semantic consistency checking, the specifications of the ontology encoded as CTL formulas can be checked in the same way: the model checker returns a result indicating whether the specifications of the learned ontology are satisfied by the relational data model represented and formalized by the W-Graph and Kripke structure. Considering that nuXmv [34] is an extended version of NuSMV that provides strong verification based on advanced SAT-based algorithms, nuXmv is employed to verify the presented method. The specific process of consistency checking based on graph intermediate representation and model checking is shown in Algorithm 1. As we can see in Algorithm 1, the inputs of consistency checking based on model checking are the Kripke model and the CTL logic formula, while the output is the result of whether the input Kripke model satisfies the given semantic specification. In the following subsection, we verify the feasibility and effectiveness of the presented method in checking the semantic consistency of learning ontology from RDB.

Verification
In the preceding subsections, we transformed the W-Graph into a Kripke structure and formalized the semantic specifications with logic formulas. Therefore, in this subsection, we verify the feasibility and effectiveness of the presented consistency checking method by encoding the Kripke structure with SMV and running the nuXmv model checker. Figure 6 summarizes the main steps of the verification.

Encoding the Kripke Structure and LTSs with SMV Program
The nuXmv model checker is a symbolic model verifier, which is usually utilized to check the consistency of state transition systems. Considering that both the Kripke structure and the labeled transition system (LTS) are semantic models with equivalent expressiveness [35], we can view the Kripke structure as a labeled transition system. Since a state transition system is readable by the nuXmv model checker only when it is encoded in the SMV language, we encode the Kripke structure with an SMV program.
In our case, each label in the Kripke structure of the Mini University data model is regarded as a state in a finite-state transition system, which is encoded by the SMV program. During this encoding, the inverse labels (l⁻¹) are encoded as state variables with the prefix Inv_, e.g., Age⁻¹ is encoded as Inv_Age, Attends⁻¹ is encoded as Inv_Attends, etc. In order to decrease the complexity of the model, we ignore the self-transitions of the leaf states in the Kripke model. Namely, we only define and assign the states, their values, and the transitions between these states and labels.
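As a small illustration of this encoding convention, the helper below (hypothetical, not the paper's tooling) maps Kripke labels to SMV state names, applying the Inv_ prefix to inverse labels, and emits the corresponding VAR declaration:

```python
# Hypothetical helper for the label-to-SMV encoding convention:
# inverse labels (written here as 'X^-1') get the Inv_ prefix.

def smv_state_name(label: str) -> str:
    """Map a Kripke label to an SMV state name; 'Age^-1' -> 'Inv_Age'."""
    if label.endswith("^-1"):
        return "Inv_" + label[: -len("^-1")]
    return label

def emit_state_var(labels):
    """Emit the VAR declaration enumerating all encoded states."""
    names = ", ".join(smv_state_name(l) for l in labels)
    return f"VAR state : {{{names}}};"

print(emit_state_var(["University", "Age^-1", "Attends^-1"]))
# VAR state : {University, Inv_Age, Inv_Attends};
```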

Translating Semantic Specifications from DLs Formula to CTL Formula
Considering that each pre-state can be succeeded by more than one post-state in CTL, we translate the formal specifications of the ontologies that are encoded in the ALCI DL syntax into CTL formulas. Accordingly, the specification of a Professor named Marco who teaches Database, which is included in the CSMSc Program and enrolled in by a Student named Dave, can be translated into a CTL formula. Based on this translation, semantic consistency checking can be converted into a global model checking problem that can be solved by a model checker.
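The translated specification takes a branching-time shape in which the existential role restrictions of the DL formula become existential path quantifiers; an illustrative reconstruction (the concrete formula depends on the SMV encoding of the states and labels) is:

```latex
\mathbf{EF}\Big(\mathit{Professor} \wedge
  \mathbf{EX}\big(\mathit{Teaches} \wedge
  \mathbf{EX}\big(\mathit{Course} \wedge
  \mathbf{EX}\,(\mathit{IncludedBy} \wedge \mathbf{EX}\,\mathit{Program})\big)\big)\Big)
```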

Checking Ontology Consistency Based on Model Checker
To verify the feasibility of this method, we ran the nuXmv model checker on a Windows 10 machine with an Intel(R) Core(TM) i5 64-bit processor and 8 GB RAM. Before verifying whether the given specification satisfies the Kripke structure, it is necessary to check whether there exist deadlock states by using the check_fsm command [36]. To verify the effectiveness of the presented method in checking different kinds of inconsistencies, i.e., property inconsistency, relationship inconsistency, etc., we checked not only specifications that are consistent with the original database but also specifications that are inconsistent with it. The results of the consistency checking are shown in Figure 7, where we can observe that there is no deadlock state in the current Kripke model. The results given by nuXmv indicate whether the given specifications satisfy the specific Kripke model. To begin with, we verified the relationship consistency. In the first case, the semantic correspondence of whether there exists a Teacher who Teaches a Course is checked, while in the second case, the semantic correspondence of whether there exists a Teacher who Teaches a Program is checked. The nuXmv model checker returns true in the first case, while false and a counterexample are given in the second case. These results indicate that the first specification is consistent with the original RDB, while the second one is inconsistent with it. Namely, the relationship that a Teacher Teaches a Course is consistent with the original database, while the relationship that a Teacher Teaches a Program is inconsistent. Thereafter, we verified the specification formalized from the ontology in ALCI DL; the model checker returns true, which indicates that there are no inconsistencies in the current case.

Conclusions
In this work, we presented a semi-automatic semantic consistency checking method based on graph intermediate representation and model checking to check the semantic consistency while learning ontology from RDB. We formalized the semantic correspondence between the learned ontologies and their original data model, and converted the problem of semantic consistency checking into a global model checking problem that was solved automatically by the nuXmv model checker. In addition, we gave an example to demonstrate the specific process of the formalization and conducted a verification experiment to verify the feasibility of the presented method. The results showed that this method could correctly check and report whether a given semantic specification of the learned ontology satisfies the original RDB. Currently, the method has only been verified in the scenario of learning ontology from a single database; how to check the semantic consistency of learning ontology from multiple databases could thus be investigated in the future. Meanwhile, considering that the complexity of model checking increases with the volume and complexity of the data model, how to evaluate and optimize the complexity of the transformations and the model checking is also worth further investigation. Beyond checking semantic consistency, the more crucial task is to eliminate the identified inconsistencies and redundancies; hence, how to eliminate these inconsistencies is meaningful future work as well.