# Acquiring Ontology Axioms through Mappings to Data Sources

## Abstract

## 1. Introduction

- the data layer is constituted by the existing data sources that are relevant for the organization,
- the ontology is a declarative and explicit representation of the domain of interest for the organization, formulated in a description logic (DL) [3] so as to take advantage of various reasoning capabilities in accessing data,
- the mapping is a set of declarative assertions specifying how the sources in the data layer relate to the ontology.

`T-CarTypes`), as well as various cars of such types (table `T-Cars`). If we look carefully at the semantics of the data in $\mathcal{D}$, we realize that this database not only stores information about the instances of the concepts in the domain of interest (e.g., the first row of table `T-Cars` collects data about an instance of the concept `Car`), but also contains pieces of data denoting new concepts of the domain. In particular, table `T-CarTypes` contains data denoting concepts such as `1973 FALCON XB GT`, `1967 MUSTANG SHELBY`, `1973 MUSTANG MACH 1`, and so on. Considering the context where these concepts appear, it is not difficult to conclude that they are all mutually disjoint subconcepts of `Car`. Table `T-Cars`, on the other hand, provides information about the instances of the various concepts, as well as other properties about them (i.e., `Color`, `ProdCountry`). We observe that, in order to acquire the knowledge about the concepts mentioned in the data sources, we need a flexible mechanism that is able to map data at the sources to concepts in the ontology. Without such flexibility, the designer would be forced to manually inspect the data sources and enrich the ontology off-line.

`Car-Type`, with `Coupe`, `Sedan`, etc. as its specializations. Note that the instances of such specialized concepts include the subconcepts of cars listed in the rows stored in table `T-CarTypes`. With this mechanism, the designer can specify the means for dynamically acquiring TBox axioms through simple queries asking for the instances of the meta-concept `Car-Type`. Indeed, without the possibility of using meta-concepts in the ontology, it would be impossible to deal with the first issue mentioned above, which is based precisely on the idea of using mappings to transfer the knowledge fragments residing in the data sources to the TBox of the ontology. Note that for the technical development related to meta-modeling, we base our work on the approach and the results reported in [18].

- We formalize the notion of mapping-based knowledge base (MKB), that captures the idea of acquiring the axioms of both the extensional and the intensional level of the ontology through mapping assertions linking the source data to the ontology. This mechanism allows the designer to achieve a level of flexibility that is not possible in current OBDA systems. Our formalization relies on the notion of higher-order DL, as introduced in [18]. Indeed, De Giacomo et al. [18] describe a methodology that, starting from a traditional DL $\mathcal{L}$, allows one to define its higher-order version $Hi(\mathcal{L})$. Here, we apply this idea and make use of the higher-order DL $\mathit{Hi}({\mathit{DL-Lite}}_{\mathcal{R}})$.
- We propose to query mapping-based knowledge bases by means of an extension of unions of conjunctive queries (UCQs), taking advantage of the higher-order features of $\mathit{Hi}({\mathit{DL-Lite}}_{\mathcal{R}})$. In particular we define a suitable class of such queries, the so-called instance higher-order UCQs (IHUCQs), enjoying nice computational properties. The basic characteristic of IHUCQs is to allow higher-order features (i.e., meta-concepts and meta-properties) in the query expression but to disregard subclass and subproperty assertions in the body of the query.
- We study the problem of answering IHUCQs posed to MKBs expressed in $\mathit{Hi}({\mathit{DL-Lite}}_{\mathcal{R}})$. We show that this problem is efficiently solvable by exhibiting an algorithm based on FOL rewriting. The algorithm works in $\mathrm{AC}^{0}$ with respect to the extensional level of the data sources, i.e., the portion of the data sources that is not involved in the intensional level of the ontology. More precisely, our algorithm, given an IHUCQ q over an MKB, reformulates q into a FOL query that is evaluated taking into account only the portion of the MKB involving the extensional level of the ontology. As a consequence, query answering can be delegated to a database management system, exactly as in the traditional OBDA approach.

## 2. Higher-Order ${\mathit{DL-Lite}}_{\mathcal{R}}$

- $\mathcal{T}$, called TBox, is the terminological component of $\mathcal{K}$, which contains statements representing intensional knowledge, and
- $\mathcal{A}$, called ABox, is the assertional component of $\mathcal{K}$, which contains assertions representing extensional knowledge.

- Concept expressions:$$\begin{array}{c}\hfill \begin{array}{ccccc}B& ::=& A& |& \exists \phantom{\rule{3.33333pt}{0ex}}Q\\ C& ::=& B& |& \neg B\end{array}\end{array}$$
- Role expressions:$$\begin{array}{c}\hfill \begin{array}{ccccc}Q& ::=& P& |& {P}^{-}\\ R& ::=& Q& |& \neg Q\end{array}\end{array}$$

- $\mathit{OP}({\mathit{DL-Lite}}_{\mathcal{R}})=\{\mathit{Inv}/1,\phantom{\rule{0.166667em}{0ex}}\mathit{Exists}/1\}$,
- $\mathit{MP}({\mathit{DL-Lite}}_{\mathcal{R}})=\{{\mathit{Inst}}_{C}/2,{\mathit{Inst}}_{R}/3,{\mathit{Isa}}_{C}/2,{\mathit{Isa}}_{R}/2,{\mathit{Disj}}_{C}/2,{\mathit{Disj}}_{R}/2\}$,

**Example**

**1.**

- if $e\in \mathcal{S}\cup \mathcal{V}$ then $e\in {\tau}_{{\mathit{DL-Lite}}_{\mathcal{R}}}(\mathcal{S},\mathcal{V})$;
- if $e\in {\tau}_{{\mathit{DL-Lite}}_{\mathcal{R}}}(\mathcal{S},\mathcal{V})$, and e is not of the form $\mathit{Inv}({e}^{\prime})$ (where ${e}^{\prime}$ is any term), then $\mathit{Inv}(e)\in {\tau}_{{\mathit{DL-Lite}}_{\mathcal{R}}}(\mathcal{S},\mathcal{V})$ (Differently from [18], we avoid the construction of terms of the form $\mathit{Inv}(\mathit{Inv}(e))$ which, as roles, are equivalent to e. Under this assumption, we do not have safe-range issues when dealing with queries; thus, differently from [18], we consider here non-Boolean queries.);
- if $e\in {\tau}_{{\mathit{DL-Lite}}_{\mathcal{R}}}(\mathcal{S},\mathcal{V})$ then $\mathit{Exists}(e)\in {\tau}_{{\mathit{DL-Lite}}_{\mathcal{R}}}(\mathcal{S},\mathcal{V})$.
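The three formation rules above can be mirrored by a small term datatype. The following is only a minimal sketch: the class names `Name`, `Var`, `Inv`, `Exists` and the helper `show` are hypothetical and chosen for illustration; the one constraint taken from the text is that `Inv` may not be applied to a term that is already of the form `Inv(e')`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Name:
    """An element of the alphabet S."""
    value: str


@dataclass(frozen=True)
class Var:
    """An element of the set of variables V."""
    value: str


@dataclass(frozen=True)
class Inv:
    arg: "Term"

    def __post_init__(self):
        # Second rule: Inv(e) is a term only if e is not itself Inv(e').
        if isinstance(self.arg, Inv):
            raise ValueError("Inv(Inv(e)) is not a well-formed term")


@dataclass(frozen=True)
class Exists:
    arg: "Term"


Term = Union[Name, Var, Inv, Exists]


def show(term: Term) -> str:
    """Render a term in the functional syntax used in the text."""
    if isinstance(term, (Name, Var)):
        return term.value
    return f"{type(term).__name__}({show(term.arg)})"


print(show(Exists(Inv(Name("produced_in")))))
```

Rejecting `Inv(Inv(e))` at construction time keeps every term in a canonical form, which is the property the text relies on to avoid safe-range issues.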

**Example**

**2.**

**Example**

**3.**

- $\Delta $ is a non-empty, possibly countably infinite, set;
- ${\mathcal{I}}_{c}$ is a function that maps each $d\in \Delta $ into a subset of $\Delta $; and
- ${\mathcal{I}}_{r}$ is a function that maps each $d\in \Delta $ into a subset of $\Delta \times \Delta $.

- an individual;
- a unary relation, i.e., a concept, through ${\mathcal{I}}_{c}$; and
- a binary relation, i.e., a role, through ${\mathcal{I}}_{r}$.

- $\Sigma =\langle \Delta ,{\mathcal{I}}_{c},{\mathcal{I}}_{r}\rangle $ is an interpretation structure, and
- ${\mathcal{I}}_{o}$ is a function that maps:
  - each element of $\mathcal{S}$ to a single object in $\Delta $; and
  - each element $op\in \mathit{OP}({\mathit{DL-Lite}}_{\mathcal{R}})$ to a function $o{p}^{{\mathcal{I}}_{o}}:\Delta \to \Delta $ that satisfies the conditions characterizing the operator $op$. In particular, the conditions for the operators in $\mathit{OP}({\mathit{DL-Lite}}_{\mathcal{R}})$ are as follows:
    - for each ${d}_{1},{d}_{2}\in \Delta $ such that ${d}_{2}={\mathit{Inv}}^{{\mathcal{I}}_{o}}({d}_{1})$, we have that ${d}_{2}^{{\mathcal{I}}_{r}}={({d}_{1}^{{\mathcal{I}}_{r}})}^{-1}$, where ${({d}_{1}^{{\mathcal{I}}_{r}})}^{-1}$ is the inverse of the relation ${d}_{1}^{{\mathcal{I}}_{r}}$, and
    - for each ${d}_{1},{d}_{2}\in \Delta $ such that ${d}_{2}={\mathit{Exists}}^{{\mathcal{I}}_{o}}({d}_{1})$, we have that ${d}_{2}^{{\mathcal{I}}_{c}}=\{o\mid \text{there exists } {o}^{\prime} \text{ such that } \langle o,{o}^{\prime}\rangle \in {d}_{1}^{{\mathcal{I}}_{r}}\}$.

- if $e\in \mathcal{S}$ then ${e}^{{\mathcal{I}}_{o},\mu}={e}^{{\mathcal{I}}_{o}}$;
- if $e\in \mathcal{V}$ then ${e}^{{\mathcal{I}}_{o},\mu}=\mu (e)$;
- $op{(e)}^{{\mathcal{I}}_{o},\mu}=o{p}^{{\mathcal{I}}_{o}}({e}^{{\mathcal{I}}_{o},\mu})$.

- $\mathcal{I},\mu \vDash {\mathit{Inst}}_{C}({e}_{1},{e}_{2})$ if ${e}_{1}^{{\mathcal{I}}_{o},\mu}\in {({e}_{2}^{{\mathcal{I}}_{o},\mu})}^{{\mathcal{I}}_{c}}$;
- $\mathcal{I},\mu \vDash {\mathit{Inst}}_{R}({e}_{1},{e}_{2},{e}_{3})$ if $\langle {e}_{1}^{{\mathcal{I}}_{o},\mu},{e}_{2}^{{\mathcal{I}}_{o},\mu}\rangle \in {({e}_{3}^{{\mathcal{I}}_{o},\mu})}^{{\mathcal{I}}_{r}}$;
- $\mathcal{I},\mu \vDash {\mathit{Isa}}_{C}({e}_{1},{e}_{2})$ if ${({e}_{1}^{{\mathcal{I}}_{o},\mu})}^{{\mathcal{I}}_{c}}\subseteq {({e}_{2}^{{\mathcal{I}}_{o},\mu})}^{{\mathcal{I}}_{c}}$;
- $\mathcal{I},\mu \vDash {\mathit{Isa}}_{R}({e}_{1},{e}_{2})$ if ${({e}_{1}^{{\mathcal{I}}_{o},\mu})}^{{\mathcal{I}}_{r}}\subseteq {({e}_{2}^{{\mathcal{I}}_{o},\mu})}^{{\mathcal{I}}_{r}}$;
- $\mathcal{I},\mu \vDash {\mathit{Disj}}_{C}({e}_{1},{e}_{2})$ if ${({e}_{1}^{{\mathcal{I}}_{o},\mu})}^{{\mathcal{I}}_{c}}\cap {({e}_{2}^{{\mathcal{I}}_{o},\mu})}^{{\mathcal{I}}_{c}}=\emptyset $;
- $\mathcal{I},\mu \vDash {\mathit{Disj}}_{R}({e}_{1},{e}_{2})$ if ${({e}_{1}^{{\mathcal{I}}_{o},\mu})}^{{\mathcal{I}}_{r}}\cap {({e}_{2}^{{\mathcal{I}}_{o},\mu})}^{{\mathcal{I}}_{r}}=\emptyset $.
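To make the six satisfaction conditions concrete, here is a minimal sketch that checks ground atoms against a small finite interpretation. Several simplifications are assumptions of the sketch and not part of the formal definition: domain elements are plain strings, every name is interpreted as itself, there are no variables (so no assignment $\mu$), and the extensions in `I_c` and `I_r` (including the `USA` facts) are made-up illustrative data.

```python
# Hypothetical finite interpretation: each domain element may have a concept
# extension (I_c) and a role extension (I_r); missing entries are empty.
I_c = {
    "Car": {"ELEANOR", "KITT"},
    "1973 MUSTANG MACH 1": {"ELEANOR"},
    "1982 PONTIAC FIREBIRD": {"KITT"},
    # Meta-modeling: a concept is itself an instance of another concept.
    "Coupe": {"1973 MUSTANG MACH 1"},
}
I_r = {"produced_in": {("ELEANOR", "USA"), ("KITT", "USA")}}


def ext_c(d):
    return I_c.get(d, set())


def ext_r(d):
    return I_r.get(d, set())


def satisfies(atom):
    """Check a ground atom (predicate, arg1, arg2[, arg3]) against I_c and I_r."""
    pred, *args = atom
    if pred == "InstC":
        return args[0] in ext_c(args[1])
    if pred == "InstR":
        return (args[0], args[1]) in ext_r(args[2])
    if pred in ("IsaC", "IsaR", "DisjC", "DisjR"):
        ext = ext_c if pred.endswith("C") else ext_r
        left, right = ext(args[0]), ext(args[1])
        if pred.startswith("Isa"):
            return left <= right          # set inclusion
        return not (left & right)         # empty intersection
    raise ValueError(f"unknown meta-predicate: {pred}")


print(satisfies(("InstC", "1973 MUSTANG MACH 1", "Coupe")))
print(satisfies(("IsaC", "1973 MUSTANG MACH 1", "Car")))
```

Note how the same element, `1973 MUSTANG MACH 1`, is used both as an individual (an instance of `Coupe`) and as a concept (with `ELEANOR` in its extension), which is exactly what the pair of functions ${\mathcal{I}}_{c}$ and ${\mathcal{I}}_{r}$ over a single domain permits.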

## 3. Mapping-Based Knowledge Bases

**Definition**

**1.**

- $\mathit{DB}$ is a relational database;
- $\mathcal{M}$ is a mapping, i.e., a set of mapping assertions, each one of the form $\Phi (\overrightarrow{x})\rightsquigarrow \psi $, where Φ is an arbitrary FOL query over $\mathit{DB}$ of arity $n\ge 0$ with free variables $\overrightarrow{x}=\langle {x}_{1},\dots ,{x}_{n}\rangle $, and ψ is an X-atom in ${\mathit{DL-Lite}}_{\mathcal{R}}$, with $X=\{{x}_{1},\dots ,{x}_{n}\}$.

**Definition**

**2.**

**Example**

**4.**

- M1: $\{m,b \mid \mathrm{T\text{-}CarTypes}(a,m,b,c)\}\rightsquigarrow {\mathit{Isa}}_{C}(m,b)$
- M2: $\{b \mid \mathrm{T\text{-}CarTypes}(a,m,b,c)\}\rightsquigarrow {\mathit{Isa}}_{C}(b,\mathrm{Car})$
- M3: $\{t \mid \mathrm{T\text{-}CarTypes}(a,m,b,t)\}\rightsquigarrow {\mathit{Isa}}_{C}(t,\mathrm{CarType})$
- M4: $\{t1,t2 \mid \mathrm{T\text{-}CarTypes}(a,b,c,t1)\wedge \mathrm{T\text{-}CarTypes}(d,e,f,t2)\wedge t1\ne t2\}\rightsquigarrow {\mathit{Disj}}_{C}(t1,t2)$
- M5: $\{m1,m2 \mid \mathrm{T\text{-}CarTypes}(c1,m1,a,b)\wedge \mathrm{T\text{-}CarTypes}(c2,m2,d,e)\wedge m1\ne m2\}\rightsquigarrow {\mathit{Disj}}_{C}(m1,m2)$
- M6: $\mathit{true}\rightsquigarrow {\mathit{Isa}}_{C}(\mathrm{Car},\mathit{Exists}(\mathrm{produced\_in}))$
- M7: $\mathit{true}\rightsquigarrow {\mathit{Isa}}_{C}(\mathrm{ProdCountry},\mathit{Exists}(\mathit{Inv}(\mathrm{produced\_in})))$
- M8: $\mathit{true}\rightsquigarrow {\mathit{Isa}}_{C}(\mathit{Exists}(\mathrm{produced\_in}),\mathrm{Car})$
- M9: $\mathit{true}\rightsquigarrow {\mathit{Isa}}_{C}(\mathit{Exists}(\mathit{Inv}(\mathrm{produced\_in})),\mathrm{ProdCountry})$
- M10: $\{x,p \mid \mathrm{T\text{-}Cars}(x,a,b,p)\}\rightsquigarrow {\mathit{Inst}}_{R}(x,p,\mathrm{produced\_in})$
- M11: $\{p \mid \mathrm{T\text{-}Cars}(a,b,c,p)\}\rightsquigarrow {\mathit{Inst}}_{C}(p,\mathrm{ProdCountry})$
- M12: $\{m,t \mid \mathrm{T\text{-}CarTypes}(a,m,b,t)\}\rightsquigarrow {\mathit{Inst}}_{C}(m,t)$
- M13: $\{x,y \mid \mathrm{T\text{-}Cars}(x,c1,a,b)\wedge \mathrm{T\text{-}CarTypes}(c1,y,d,e)\}\rightsquigarrow {\mathit{Inst}}_{C}(x,y)$

table, the value appearing in the second column of t denotes a subconcept of the concept denoted by the value in the third column of the same tuple. Thus, for example, considering the fifth tuple ${t}_{5}$ of `T-CarTypes`, M1 states that 1973 MUSTANG MACH 1 is a subconcept of the concept Ford. M2 asserts that every value appearing in the third column of `T-CarTypes` is a subconcept of `Car`. For example, by referring to ${t}_{5}$ again, this tuple states that Ford is a subconcept of `Car`. Analogously, M3 asserts that every value appearing in the fourth column of `T-CarTypes` is a subconcept of `CarType`. For example, ${t}_{5}$ states that Coupe is a subconcept of `CarType`. M4 (resp., M5) asserts that the values appearing in the fourth column (resp., second column) of `T-CarTypes` denote concepts that are pairwise disjoint. M6–M9 assert properties about the relation produced_in, between the concepts Car and ProdCountry. M10 populates the relation produced_in, and M11 does the same for the concept ProdCountry. Mapping M12 exploits the meta-modeling capabilities of $\mathit{Hi}({\mathit{DL-Lite}}_{\mathcal{R}})$ and relates the different car models to their specific type. For example, looking at tuple ${t}_{5}$ again, we can infer by M12 that 1973 FALCON XB GT COUPE is an instance of the concept Coupe. Note that M1 asserted that 1973 FALCON XB GT COUPE is a concept, and therefore, we are taking advantage of the possibility provided by $\mathit{Hi}({\mathit{DL-Lite}}_{\mathcal{R}})$ of defining a concept to be an instance of another concept (a metaconcept). Finally, M13 allows us to correctly assign the instances stored in the `T-Cars` table to the concepts corresponding to the different car models. For example, through this mapping we can infer that the “Mad Max” police car INTERCEPTOR is an instance of 1973 FALCON XB GT COUPE (see the second tuple of `T-Cars`), the famous car ELEANOR of the movie “Gone in 60 Seconds” is an instance of the concept 1973 MUSTANG MACH 1 (third tuple), and the “Supercar” KITT is an instance of 1982 PONTIAC FIREBIRD (fourth tuple).

`T-CarTypes` table. In our approach, the new information is automatically detected at run-time by the mappings in $\mathcal{M}$ and correctly introduced in the ontology. So, instead of manually changing the ontology and “re-compiling it” at design time, the new concept is dynamically captured at run-time.

## 4. Queries

**Definition**

**3.**

**Definition**

**4.**

**Example**

**5.**

- (i) Compute the instances of `Ford` that were produced in Australia and are of type `Coupe`; note that an instance x of `Ford` of a certain type T is an instance of a car model y such that y is an instance of T (this in fact holds for all instances of `Car`, not only for instances of `Ford`). It follows that the correct query expression is as follows:$$q(x)\leftarrow {\mathit{Inst}}_{C}(x,\mathrm{Ford}),{\mathit{Inst}}_{C}(x,y),{\mathit{Inst}}_{C}(y,\mathrm{Coupe}),{\mathit{Inst}}_{R}(x,\mathrm{AUSTRALIA},\mathrm{produced\_in}).$$
- (ii) Compute the pairs of cars, one of type Coupe and one of type Sedan, that were produced in the same country:$$\begin{array}{c}q(x,y)\leftarrow {\mathit{Inst}}_{R}(x,z,\mathrm{produced\_in}),{\mathit{Inst}}_{R}(y,z,\mathrm{produced\_in}),\hfill \\ \phantom{q(x,y)\leftarrow}{\mathit{Inst}}_{C}(x,v),{\mathit{Inst}}_{C}(y,w),{\mathit{Inst}}_{C}(v,\mathrm{Coupe}),{\mathit{Inst}}_{C}(w,\mathrm{Sedan}).\hfill \end{array}$$
- (iii) Compute all the concepts in the ontology to which a given object (e.g., Eleanor) belongs:$$q(x)\leftarrow {\mathit{Inst}}_{C}(\mathrm{ELEANOR},x).$$
- (iv) Compute all the concepts in the ontology whose instances are the concepts to which Eleanor or Kitt belongs:$$q(y)\leftarrow {\mathit{Inst}}_{C}(\mathrm{ELEANOR},x),{\mathit{Inst}}_{C}(x,y)\cup {\mathit{Inst}}_{C}(\mathrm{KITT},x),{\mathit{Inst}}_{C}(x,y).$$
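As an illustration of how the variable y in query (i) ranges over concepts, the sketch below evaluates that query by naive matching over an explicit set of ground atoms. It is only a sketch under stated assumptions: it performs no TBox reasoning, so the atoms saying that INTERCEPTOR and ELEANOR are instances of Ford are listed by hand; the `USA` fact is hypothetical; and the convention that strings starting with `?` are variables is introduced here purely for illustration.

```python
def match(atom, fact, binding):
    """Extend `binding` so that `atom` equals `fact`, or return None."""
    if len(atom) != len(fact) or atom[0] != fact[0]:
        return None
    extended = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):                      # variable
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:                           # constant mismatch
            return None
    return extended


def answers(head, body, facts):
    """Naive conjunctive-query evaluation over a set of ground atoms."""
    bindings = [{}]
    for atom in body:
        bindings = [b2 for b in bindings for f in facts
                    if (b2 := match(atom, f, b)) is not None]
    return {tuple(b[v] for v in head) for b in bindings}


# Hypothetical ground atoms (no reasoning is applied to them).
FACTS = {
    ("InstC", "INTERCEPTOR", "1973 FALCON XB GT COUPE"),
    ("InstC", "INTERCEPTOR", "Ford"),
    ("InstC", "1973 FALCON XB GT COUPE", "Coupe"),
    ("InstR", "INTERCEPTOR", "AUSTRALIA", "produced_in"),
    ("InstC", "ELEANOR", "1973 MUSTANG MACH 1"),
    ("InstC", "ELEANOR", "Ford"),
    ("InstC", "1973 MUSTANG MACH 1", "Coupe"),
    ("InstR", "ELEANOR", "USA", "produced_in"),
}

# Query (i): Ford cars of type Coupe produced in Australia.
QUERY_I = [
    ("InstC", "?x", "Ford"),
    ("InstC", "?x", "?y"),
    ("InstC", "?y", "Coupe"),
    ("InstR", "?x", "AUSTRALIA", "produced_in"),
]

print(answers(("?x",), QUERY_I, FACTS))
```

The variable `?y` is bound to a car model that appears once in concept position (second argument of the second atom) and once in individual position (first argument of the third atom), which is the higher-order feature the queries above exploit.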

## 5. Query Answering

- We denote by ${\mathcal{M}}_{\mathit{A}}$ the set of assertions contained in $\mathcal{M}$ having either ${\mathit{Inst}}_{C}$ or ${\mathit{Inst}}_{R}$ as predicate in their right-hand side.
- We denote by ${\mathcal{M}}_{\mathit{T}}$ the set $\mathcal{M}\backslash {\mathcal{M}}_{\mathit{A}}$, that is, the set of assertions contained in $\mathcal{M}$ having any of ${\mathit{Isa}}_{C}$, ${\mathit{Isa}}_{R}$, ${\mathit{Disj}}_{C}$, ${\mathit{Disj}}_{R}$ as predicate in their right-hand side.
- $\mathcal{M}$ is called an instance-mapping if ${\mathit{Inst}}_{C}$ and ${\mathit{Inst}}_{R}$ are the only predicates that appear in the right-hand side of the mapping assertions in $\mathcal{M}$.
- We say that e occurs as a concept argument in the atoms ${\mathit{Inst}}_{C}$($e,{e}^{\prime}$), ${\mathit{Isa}}_{C}$($e,{e}^{\prime}$), ${\mathit{Isa}}_{C}$(${e}^{\prime},e$), ${\mathit{Disj}}_{C}$($e,{e}^{\prime}$), and ${\mathit{Disj}}_{C}$(${e}^{\prime},e$).
- We say that e occurs as a role argument in the atoms ${\mathit{Inst}}_{R}$(${e}^{\prime},{e}^{\prime \prime},e$), ${\mathit{Isa}}_{R}$($e,{e}^{\prime}$), ${\mathit{Isa}}_{R}$(${e}^{\prime},e$), ${\mathit{Disj}}_{R}$($e,{e}^{\prime}$), and ${\mathit{Disj}}_{R}$(${e}^{\prime},e$).
- A $DL$ atom is an atom of the form $N(e)$ or $N({e}_{1},{e}_{2})$, where N is a name and $e,{e}_{1},{e}_{2}$ are either variables or names.
- An extended CQ (ECQ) is an expression of the form $q({x}_{1},\dots ,{x}_{n})\leftarrow {a}_{1},\dots ,{a}_{m}$ such that ${x}_{1},\dots ,{x}_{n}$ belong to $\mathcal{V}$, ${a}_{1},\dots ,{a}_{m}$ is a conjunction of atoms, each atom ${a}_{j}$ (with $1\le j\le m$) is either a $DL$ atom or an instance-query atom (i.e., an atom whose meta-predicate is ${\mathit{Inst}}_{C}$ or ${\mathit{Inst}}_{R}$), and each ${x}_{i}$ (with $1\le i\le n$) occurs in at least one ${a}_{j}$. An extended UCQ (EUCQ) is a union of ECQs.
- Given a TBox $\mathcal{T}$ (specified in higher-order style syntax, cf. Section 2), we define $\mathit{Concepts}(\mathcal{T})=\{e,\mathit{Exists}({e}^{\prime}),\mathit{Exists}(\mathit{Inv}({e}^{\prime}))\mid e$ occurs as a concept argument in $\mathcal{T}$ and ${e}^{\prime}$ occurs as a role argument in $\mathcal{T}\}$, and $\mathit{Roles}(\mathcal{T})=\{e,\mathit{Inv}(e)\mid e$ occurs as a role argument in $\mathcal{T}\}$.
- Given a mapping $\mathcal{M}$ and a database $\mathit{DB}$, $\mathit{Retrieve}(\mathcal{M},\mathit{DB})$ denotes the $\mathit{Hi}({\mathit{DL-Lite}}_{\mathcal{R}})$ KB $\mathcal{H}$ defined as follows:$$\mathcal{H}=\{\psi (\overrightarrow{t})\mid \Phi (\overrightarrow{x})\rightsquigarrow \psi \in \mathcal{M}\text{ and }\mathit{DB}\vDash \Phi (\overrightarrow{t})\}$$
- Given an instance-mapping $\mathcal{M}$ and an ABox $\mathcal{A}$, we say that $\mathcal{A}$ is retrievable through $\mathcal{M}$ if there exists a database $\mathit{DB}$ such that $\mathcal{A}=\mathit{Retrieve}(\mathcal{M},\mathit{DB})$.
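The split of $\mathcal{M}$ into ${\mathcal{M}}_{\mathit{A}}$ and ${\mathcal{M}}_{\mathit{T}}$ and the operator $\mathit{Retrieve}$ can be sketched as follows. This is an illustration under several assumptions: each query Φ is modeled as a Python function returning variable bindings rather than as a FOL query; the two database rows, the code `c5`, the color and the country are hypothetical; the column layouts are inferred from the shape of M1, M12 and M13; and atom arguments that are not bound variables are treated as constants.

```python
# Hypothetical rows; column layouts are assumptions inferred from M1, M12, M13.
DB = {
    "T-CarTypes": [("c5", "1973 MUSTANG MACH 1", "Ford", "Coupe")],
    "T-Cars": [("ELEANOR", "c5", "grey", "USA")],
}

# A mapping assertion is a pair (phi, psi): phi(db) yields variable bindings,
# psi is an atom template whose arguments are variable names or constants.
M = [
    (lambda db: [{"m": m, "b": b} for (_, m, b, _) in db["T-CarTypes"]],
     ("IsaC", "m", "b")),                                        # like M1
    (lambda db: [{}],
     ("IsaC", "Car", "Exists(produced_in)")),                    # like M6 (true)
    (lambda db: [{"m": m, "t": t} for (_, m, _, t) in db["T-CarTypes"]],
     ("InstC", "m", "t")),                                       # like M12
    (lambda db: [{"x": x, "y": y}
                 for (x, c1, _, _) in db["T-Cars"]
                 for (c2, y, _, _) in db["T-CarTypes"] if c1 == c2],
     ("InstC", "x", "y")),                                       # like M13
]

INSTANCE_PREDICATES = {"InstC", "InstR"}


def split(mapping):
    """Partition a mapping into (M_A, M_T) by the predicate of psi."""
    m_a = [(phi, psi) for phi, psi in mapping if psi[0] in INSTANCE_PREDICATES]
    m_t = [(phi, psi) for phi, psi in mapping if psi[0] not in INSTANCE_PREDICATES]
    return m_a, m_t


def retrieve(mapping, db):
    """Instantiate psi with every answer of phi over db."""
    return {(psi[0],) + tuple(b.get(arg, arg) for arg in psi[1:])
            for phi, psi in mapping for b in phi(db)}


M_A, M_T = split(M)
print(sorted(retrieve(M_T, DB)))   # intensional assertions
print(sorted(retrieve(M_A, DB)))   # extensional assertions
```

In this toy setting, `retrieve(M_T, DB)` plays the role of the TBox gathered from the sources, while `retrieve(M_A, DB)` corresponds to an ABox that is retrievable through the instance-mapping `M_A`.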

- In the first step, all intensional assertions are gathered by accessing the sources and using the mapping, in particular the ${\mathcal{M}}_{\mathit{T}}$ portion. This way, a ${\mathit{DL-Lite}}_{\mathcal{R}}$ TBox $\mathcal{T}$ is available for the subsequent steps.
- In the second step, the input query is rewritten on the basis of $\mathcal{T}$, using the algorithm $\mathit{PerfectRef}$ presented in [23]. In fact, this is not just a trivial call to $\mathit{PerfectRef}$, since we need to transform the input IHUCQ, which cannot be given directly as input to $\mathit{PerfectRef}$, into an EUCQ. Similarly, we also need to translate the rewriting produced by $\mathit{PerfectRef}$ into a form that is compatible with the syntax used in the mapping (which is required by the following steps of our algorithm).
- In the third step, the query obtained by the second step is unfolded using the mapping, in particular the ${\mathcal{M}}_{\mathit{A}}$ portion, so as to obtain a query expressed over the alphabet of the source schema.
- In the fourth step, the query obtained by the third step is evaluated over the source data, so as to obtain the final result.

#### 5.1. Query Rewriting

- if x occurs in a concept position in q, then $\sigma (x)\in \mathit{Concepts}(\mathcal{T})$;
- if x occurs in a role position in q, then $\sigma (x)\in \mathit{Roles}(\mathcal{T})$.
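The two constraints on σ can be illustrated by enumerating substitutions over $\mathit{Concepts}(\mathcal{T})$ and $\mathit{Roles}(\mathcal{T})$. The sketch below makes simplifying assumptions that are not taken from the text: terms are plain strings, variables start with `?`, a concept position is the second argument of `InstC` and a role position is the third argument of `InstR`, every such variable is substituted, role arguments in the TBox are assumed to be plain names, and the two-axiom TBox is hypothetical.

```python
from itertools import product


def concepts_and_roles(tbox):
    """Compute Concepts(T) and Roles(T) from atoms (pred, e1, e2)."""
    concept_args, role_args = set(), set()
    for pred, e1, e2 in tbox:
        target = concept_args if pred in ("IsaC", "DisjC") else role_args
        target.update((e1, e2))
    concepts = (concept_args
                | {f"Exists({r})" for r in role_args}
                | {f"Exists(Inv({r}))" for r in role_args})
    roles = role_args | {f"Inv({r})" for r in role_args}
    return concepts, roles


def metagroundings(body, concepts, roles):
    """Yield one query body per substitution sigma satisfying the constraints."""
    cvars = sorted({a[2] for a in body
                    if a[0] == "InstC" and a[2].startswith("?")})
    rvars = sorted({a[3] for a in body
                    if a[0] == "InstR" and a[3].startswith("?")})
    domains = [sorted(concepts)] * len(cvars) + [sorted(roles)] * len(rvars)
    for values in product(*domains):
        sigma = dict(zip(cvars + rvars, values))
        # sigma is applied to every occurrence of a substituted variable.
        yield [tuple(sigma.get(t, t) for t in atom) for atom in body]


TBOX = [("IsaC", "FALCON", "Ford"), ("IsaC", "Ford", "Car")]  # hypothetical
CONCEPTS, ROLES = concepts_and_roles(TBOX)

BODY = [("InstC", "?x", "Ford"), ("InstC", "?x", "?y"), ("InstC", "?y", "Coupe")]
for grounded in metagroundings(BODY, CONCEPTS, ROLES):
    print(grounded)
```

Because the candidate values come only from the finite sets $\mathit{Concepts}(\mathcal{T})$ and $\mathit{Roles}(\mathcal{T})$, the number of resulting queries is finite and depends only on the query and the TBox.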

**Example**

**6.**

`1973 FALCON XB GT COUPE` into `FALCON` (to obtain all other queries in $\mathit{PMG}(q,\mathcal{T})$, substitute `FALCON` with all elements of $\mathit{Concepts}(\mathcal{T})$).

**Lemma**

**1.**

**Proof.**

- ${d}^{{\mathcal{I}}_{o}^{\downarrow}}={d}^{{\mathcal{I}}_{o}}$;
- ${d}^{{\mathcal{I}}_{c}^{\downarrow}}={d}^{{\mathcal{I}}_{c}}$ if there exists $e\in \mathit{Concepts}(\mathcal{T})$ such that ${e}^{{\mathcal{I}}_{o}}=d$, otherwise ${d}^{{\mathcal{I}}_{c}^{\downarrow}}=\emptyset $;
- ${d}^{{\mathcal{I}}_{r}^{\downarrow}}={d}^{{\mathcal{I}}_{r}}$ if there exists $e\in \mathit{Roles}(\mathcal{T})$ such that ${e}^{{\mathcal{I}}_{o}}=d$, otherwise ${d}^{{\mathcal{I}}_{r}^{\downarrow}}=\emptyset $.

- if $\alpha ={\mathit{Inst}}_{C}({e}_{1},{e}_{2})$ and ${e}_{2}$ has the form $\mathit{Exists}({e}^{\prime})$ where ${e}^{\prime}$ is an expression which is not of the form $\mathit{Inv}({e}^{\prime \prime})$, then $\mathit{Normalize}(\alpha )={\mathit{Inst}}_{R}({e}_{1},\_,{e}^{\prime})$, where _ denotes an existentially quantified variable;
- if $\alpha ={\mathit{Inst}}_{C}({e}_{1},{e}_{2})$ and ${e}_{2}$ has the form $\mathit{Exists}(\mathit{Inv}({e}^{\prime}))$ where ${e}^{\prime}$ is any expression, then $\mathit{Normalize}(\alpha )={\mathit{Inst}}_{R}(\_,{e}_{1},{e}^{\prime})$.
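The two cases of $\mathit{Normalize}$ amount to a small pattern match on the second argument of an ${\mathit{Inst}}_{C}$ atom. In the sketch below, complex terms are encoded as nested tuples such as `("Exists", ("Inv", "produced_in"))`, and the string `"_"` stands for the anonymous existential variable; both encodings are assumptions made for illustration. Returning every other atom unchanged is likewise an assumption, since only the two cases above are stated.

```python
def normalize(atom):
    """Rewrite InstC(e1, Exists(...)) into the corresponding InstR atom."""
    if atom[0] == "InstC":
        _, e1, e2 = atom
        if isinstance(e2, tuple) and e2[0] == "Exists":
            inner = e2[1]
            if isinstance(inner, tuple) and inner[0] == "Inv":
                # Exists(Inv(e')): e1 is the second component of the role.
                return ("InstR", "_", e1, inner[1])
            # Exists(e') with e' not of the form Inv(e''): e1 comes first.
            return ("InstR", e1, "_", inner)
    return atom  # assumption: all other atoms are left as they are


print(normalize(("InstC", "?x", ("Exists", "produced_in"))))
print(normalize(("InstC", "?c", ("Exists", ("Inv", "produced_in")))))
```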

- $q\in Q$;
- if ${q}^{\prime}\in Q$ and ${q}^{\prime}$ contains an atom $\alpha $ of the form ${\mathit{Inst}}_{R}({e}_{1},\_,{e}_{2})$, and either $\mathit{Exists}({e}_{2})$ occurs in $\mathcal{M}$ or $\mathit{Exists}(x)$ (where x is a variable) occurs in $\mathcal{M}$, then the query obtained from ${q}^{\prime}$ by replacing $\alpha $ with the atom ${\mathit{Inst}}_{C}({e}_{1},\mathit{Exists}({e}_{2}))$ belongs to Q;
- if ${q}^{\prime}\in Q$ and ${q}^{\prime}$ contains an atom $\alpha $ of the form ${\mathit{Inst}}_{R}(\_,{e}_{1},{e}_{2})$, and either $\mathit{Exists}(\mathit{Inv}({e}_{2}))$ occurs in $\mathcal{M}$ or $\mathit{Exists}(\mathit{Inv}(x))$ (where x is a variable) occurs in $\mathcal{M}$, then the query obtained from ${q}^{\prime}$ by replacing $\alpha $ with the atom ${\mathit{Inst}}_{C}({e}_{1},\mathit{Exists}(\mathit{Inv}({e}_{2})))$ belongs to Q;
- if ${q}^{\prime}\in Q$ and ${q}^{\prime}$ contains an atom $\alpha $ of the form ${\mathit{Inst}}_{R}({e}_{1},{e}_{2},{e}_{3})$ and either $\mathit{Inv}({e}_{3})$ occurs in $\mathcal{M}$ or $\mathit{Inv}(x)$ (where x is a variable) occurs in $\mathcal{M}$, then the query obtained from ${q}^{\prime}$ by replacing $\alpha $ with the atom ${\mathit{Inst}}_{R}({e}_{2},{e}_{1},\mathit{Inv}({e}_{3}))$ belongs to Q.

- each atom of q of the form ${\mathit{Inst}}_{C}({e}_{1},{e}_{2})$, such that ${e}_{2}\in \mathit{Concepts}(\mathcal{T})$, is replaced with the atom ${e}_{2}({e}_{1})$;
- each atom of q of the form ${\mathit{Inst}}_{R}({e}_{1},{e}_{2},{e}_{3})$, such that ${e}_{3}\in \mathit{Roles}(\mathcal{T})$, is replaced with the atom ${e}_{3}({e}_{1},{e}_{2})$.

**Example**

**7.**

- each atom of q of the form ${e}_{2}({e}_{1})$ is replaced with the atom ${\mathit{Inst}}_{C}({e}_{1},{e}_{2})$;
- each atom of q of the form ${e}_{3}({e}_{1},{e}_{2})$ is replaced with the atom ${\mathit{Inst}}_{R}({e}_{1},{e}_{2},{e}_{3})$.
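The translation from instance-query atoms to $DL$ atoms and the translation back are inverse to each other on the atoms they touch. A minimal sketch, assuming atoms are tuples, that no concept or role is literally named `InstC` or `InstR`, and that the sets passed as `concepts` and `roles` stand for $\mathit{Concepts}(\mathcal{T})$ and $\mathit{Roles}(\mathcal{T})$ (the function names `to_dl` and `from_dl` are hypothetical):

```python
def to_dl(atom, concepts, roles):
    """InstC(e1, e2) -> e2(e1) and InstR(e1, e2, e3) -> e3(e1, e2)."""
    if atom[0] == "InstC" and atom[2] in concepts:
        return (atom[2], atom[1])
    if atom[0] == "InstR" and atom[3] in roles:
        return (atom[3], atom[1], atom[2])
    return atom  # atoms over other terms stay instance-query atoms


def from_dl(atom):
    """e2(e1) -> InstC(e1, e2) and e3(e1, e2) -> InstR(e1, e2, e3)."""
    if atom[0] in ("InstC", "InstR"):
        return atom
    if len(atom) == 2:
        return ("InstC", atom[1], atom[0])
    if len(atom) == 3:
        return ("InstR", atom[1], atom[2], atom[0])
    raise ValueError(f"not a DL atom: {atom!r}")


concepts, roles = {"Ford", "FALCON"}, {"produced_in"}
body = [("InstC", "?x", "Ford"), ("InstR", "?x", "AUSTRALIA", "produced_in")]

dl_body = [to_dl(a, concepts, roles) for a in body]
print(dl_body)
print([from_dl(a) for a in dl_body] == body)
```

Keeping the two directions as separate functions reflects their use at different points: one prepares a query for a rewriting procedure that expects $DL$ atoms, the other restores the syntax used in the mapping.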

**begin**

**return**${Q}^{\prime}$;

**end**

**Example**

**8.**

`FALCON` ⊑ Ford, it rewrites the atom $Ford(x)$ in query (2) into the atom `FALCON`(x) (intuitively, $\mathit{PerfectRef}$ encodes in the rewriting the knowledge expressed by the ontology saying that to obtain instances of $Ford$ one has to look also for instances of `FALCON`). In this way $\mathit{PerfectRef}$ produces the following query and adds it to the set ${Q}_{3}$ (notice that the atom `FALCON`(x) was already present in the query, thus the effect of the rewriting in this case is simply dropping the first atom of the query):

**Theorem**

**1.**

**Proof.**

#### 5.2. Query Answering

**begin**

**return**$\mathit{IntEval}({Q}^{\u2033},{\mathit{DB}}_{{\mathcal{M}}_{\mathit{A}}})$

**end**

**Example**

**9.**

`INTERCEPTOR`, and this is the only answer to the query (i) given in Example 5 evaluated over the MKB $\mathcal{K}$.

**Lemma**

**2.**

**Lemma**

**3.**

**Proof.**

**Theorem**

**2.**

**Proof.**

**Theorem**

**3.**

**Proof.**

## 6. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References

1. Lenzerini, M. Managing Data through the Lens of an Ontology. AI Mag. **2018**, 39, 65–74.
2. Xiao, G.; Calvanese, D.; Kontchakov, R.; Lembo, D.; Poggi, A.; Rosati, R.; Zakharyaschev, M. Ontology-Based Data Access: A Survey. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, Sweden, 13–19 July 2018; pp. 5511–5519.
3. Baader, F.; Calvanese, D.; McGuinness, D.; Nardi, D.; Patel-Schneider, P.F. (Eds.) The Description Logic Handbook: Theory, Implementation and Applications, 2nd ed.; Cambridge University Press: Cambridge, UK, 2007.
4. Savo, D.F.; Lembo, D.; Lenzerini, M.; Poggi, A.; Rodríguez-Muro, M.; Romagnoli, V.; Ruzzi, M.; Stella, G. Mastro at Work: Experiences on Ontology-Based Data Access. In Proceedings of the 23rd International Workshop on Description Logic (DL), Waterloo, ON, Canada, 4–7 May 2010; Volume 573, pp. 20–31.
5. Antonioli, N.; Castanò, F.; Coletta, S.; Grossi, S.; Lembo, D.; Lenzerini, M.; Poggi, A.; Virardi, E.; Castracane, P. Ontology-based Data Management for the Italian Public Debt. In Proceedings of the 8th International Conference on Formal Ontology in Information Systems (FOIS), Rio de Janeiro, Brazil, 22–26 September 2014; pp. 372–385.
6. Giese, M.; Soylu, A.; Vega-Gorgojo, G.; Waaler, A.; Haase, P.; Jiménez-Ruiz, E.; Lanti, D.; Rezk, M.; Xiao, G.; Özçep, Ö.L.; et al. Optique: Zooming in on Big Data. IEEE Comput. **2015**, 48, 60–67.
7. López, V.; Stephenson, M.; Kotoulas, S.; Tommasi, P. Data Access Linking and Integration with DALI: Building a Safety Net for an Ocean of City Data. In Proceedings of the 14th International Semantic Web Conference (ISWC), Bethlehem, PA, USA, 11–15 October 2015; Lecture Notes in Computer Science. Springer: Basel, Switzerland, 2015; Volume 9367, pp. 186–202.
8. Calvanese, D.; De Giacomo, G.; Lembo, D.; Lenzerini, M.; Poggi, A.; Rodriguez-Muro, M.; Rosati, R.; Ruzzi, M.; Savo, D.F. The Mastro System for Ontology-based Data Access. Semant. Web J. **2011**, 2, 43–53.
9. De Giacomo, G.; Lembo, D.; Lenzerini, M.; Poggi, A.; Rosati, R.; Ruzzi, M.; Savo, D.F. MASTRO: A Reasoner for Effective Ontology-Based Data Access. In Proceedings of the 1st International Workshop on OWL Reasoner Evaluation (ORE), Manchester, UK, 1 July 2012; Volume 858.
10. Rodriguez-Muro, M.; Kontchakov, R.; Zakharyaschev, M. Ontology-Based Data Access: Ontop of Databases. In Proceedings of the 12th International Semantic Web Conference (ISWC), Sydney, Australia, 21–25 October 2013; Lecture Notes in Computer Science. Springer: Basel, Switzerland, 2013; Volume 8218, pp. 558–573.
11. Ullman, J.D. Information Integration using Logical Views. Theor. Comput. Sci. **2000**, 239, 189–210.
12. Halevy, A.Y. Answering Queries Using Views: A Survey. Very Large Database J. **2001**, 10, 270–294.
13. Lenzerini, M. Data Integration: A Theoretical Perspective. In Proceedings of the 21st ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS), Madison, WI, USA, 3–5 June 2002; pp. 233–246.
14. Kolaitis, P.G. Schema Mappings, Data Exchange, and Metadata Management. In Proceedings of the 24th ACM SIGACT SIGMOD SIGART Symp. on Principles of Database Systems (PODS), Baltimore, MD, USA, 13–15 June 2005; pp. 61–75.
15. Di Pinto, F.; De Giacomo, G.; Lenzerini, M.; Rosati, R. Ontology-Based Data Access with Dynamic TBoxes in DL-Lite. In Proceedings of the 26th AAAI Conference on Artificial Intelligence (AAAI), Toronto, ON, Canada, 22–26 July 2012; pp. 719–725. Available online: https://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/view/5021 (accessed on 10 December 2019).
16. Chen, W.; Kifer, M.; Warren, D.S. HILOG: A Foundation for Higher-Order Logic Programming. J. Log. Program. **1993**, 15, 187–230.
17. Pan, J.Z.; Horrocks, I. OWL FA: A Metamodeling Extension of OWL DL. In Proceedings of the 15th International World Wide Web Conference (WWW), Edinburgh, UK, 23–26 May 2006; pp. 1065–1066.
18. De Giacomo, G.; Lenzerini, M.; Rosati, R. Higher-Order Description Logics for Domain Metamodeling. In Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI), San Francisco, CA, USA, 7–11 August 2011.
19. de Carvalho, V.A.; Almeida, J.P.A.; Fonseca, C.M.; Guizzardi, G. Multi-level ontology-based conceptual modeling. Data Knowl. Eng. **2017**, 109, 3–24.
20. Poggi, A.; Lembo, D.; Calvanese, D.; De Giacomo, G.; Lenzerini, M.; Rosati, R. Linking Data to Ontologies. J. Data Semant. **2008**, X, 133–173.
21. Papotti, P.; Torlone, R. Schema Exchange: Generic Mappings for Transforming Data and Metadata. Data Knowl. Eng. **2009**, 68, 665–682.
22. Calvanese, D.; De Giacomo, G.; Lenzerini, M.; Rosati, R.; Vetere, G. DL-Lite: Practical Reasoning for Rich DLs. In Proceedings of the 17th International Workshop on Description Logic (DL), Vienna, Austria, 17–20 July 2004; Volume 104.
23. Calvanese, D.; De Giacomo, G.; Lembo, D.; Lenzerini, M.; Rosati, R. Tractable Reasoning and Efficient Query Answering in Description Logics: The DL-Lite Family. J. Autom. Reason. **2007**, 39, 385–429.
24. Motik, B.; Fokoue, A.; Horrocks, I.; Wu, Z.; Lutz, C.; Cuenca Grau, B. OWL Web Ontology Language Profiles. W3C Recommendation, World Wide Web Consortium. 2009. Available online: http://www.w3.org/TR/owl-profiles/ (accessed on 10 December 2019).
25. Lembo, D.; Pantaleone, D.; Santarelli, V.; Savo, D.F. Easy OWL Drawing with the Graphol Visual Ontology Language. In Proceedings of the 15th International Conference on the Principles of Knowledge Representation and Reasoning (KR), Cape Town, South Africa, 25–29 April 2016; pp. 573–576.
26. Lembo, D.; Pantaleone, D.; Santarelli, V.; Savo, D.F. Drawing OWL 2 ontologies with Eddy the editor. AI Commun.—Eur. J. Artif. Intell. **2018**, 31, 97–113.
27. Parsons, J.; Wand, Y. Emancipating Instances from the Tyranny of Classes in Information Modeling. ACM Trans. Database Syst. **2000**, 25, 228–268.
28. Lembo, D.; Lenzerini, M.; Rosati, R.; Ruzzi, M.; Savo, D.F. Inconsistency-tolerant query answering in ontology-based data access. J. Web Semant. **2015**, 33, 3–29.
29. Calvanese, D.; Damaggio, E.; De Giacomo, G.; Lenzerini, M.; Rosati, R. Semantic Data Integration in P2P Systems. In Proceedings of the International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2003), Berlin, Germany, 7–8 September 2003.
30. De Giacomo, G.; Lenzerini, M.; Poggi, A.; Rosati, R. On the Update of Description Logic Ontologies at the Instance Level. In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI), Boston, MA, USA, 16–20 July 2006; pp. 1271–1276.
31. Ehrlinger, L.; Wöß, W. Towards a Definition of Knowledge Graphs. In Proceedings of the Posters and Demos Track of the 12th International Conference on Semantic Systems (SEMANTiCS), Leipzig, Germany, 12–15 September 2016.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Di Pinto, F.; De Giacomo, G.; Lembo, D.; Lenzerini, M.; Rosati, R.
Acquiring Ontology Axioms through Mappings to Data Sources. *Future Internet* **2019**, *11*, 260.
https://doi.org/10.3390/fi11120260
