Semantics-Preserving RDB2RDF Data Transformation Using Hierarchical Direct Mapping

Abstract: Direct mapping is an automatic transformation method used to generate resource description framework (RDF) data from relational data. In the field of direct mapping, semantics preservation is critical to ensure that the mapping method outputs RDF data without information loss or incorrect semantic data generation. However, existing direct-mapping methods have problems that prevent semantics preservation in specific cases. For this reason, a mapping method is developed to perform a semantics-preserving transformation of relational databases (RDB) into RDF data without semantic information loss and to reduce the volume of incorrect RDF data. This research reviews cases that do not generate semantics-preserving results and arranges the corresponding problems into categories. This paper defines lemmas that represent the features of RDF data transformation to resolve those problems. Based on the lemmas, this work develops a hierarchical direct-mapping method that strictly abides by the definition of semantics preservation and prevents semantic information loss, reducing the volume of incorrect RDF data generated. Experiments demonstrate the capability of the proposed method to perform semantics-preserving RDB2RDF data transformation, generating semantically accurate results. This work motivates future studies, which should involve the development of synchronization methods to achieve RDF data consistency when the original RDB data are modified.


Introduction
The transformation of relational databases (RDB) into resource description framework (RDF) data is a key information extraction method used to publish Semantic Web data [1][2][3]. In 1998, Tim Berners-Lee proposed the concept of mapping RDBs to the Semantic Web [4]. Since then, several approaches have been proposed to improve the mapping of RDBs to semantic data. Additionally, the World Wide Web Consortium (W3C) has organized working groups to help standardize the technologies used for transforming RDBs into RDF data.
Direct mapping is a representative mapping method recommended by the W3C to support the automatic mapping of relational data to semantic data [5]. Figure 1 illustrates an example of direct mapping, which defines mapping rules for transforming both relational schema and instance data into RDF data. In the field of direct mapping, researchers have long studied effective automated processes that focus on semantics preservation. The semantics preservation of the direct-mapping process reflects the relational integrity constraints within the mapping result [6,7]. Because integrity constraints define the semantics of a database, mapping with integrity constraints generates more semantically accurate results.
Although the transformation of relational data using integrity constraints has long been studied [8][9][10][11][12][13][14][15][16], several problems still occur during the direct-mapping process under specific conditions (see Figure 2). If an attribute has two or more integrity constraints (e.g., NotNull and Unique in Figure 2a), the mapping result is a single RDF graph structure that combines all of the integrity constraints (the subgraph rooted at name in Figure 2b). From the viewpoint of the previous methods, the mapping result preserves the semantics because the relational data is transformed into RDF data. However, to date, no explicit transformation meta-information or well-designed hierarchical structure has been provided to trace the mapping result back to the input relational data. The RDF triples for NotNull and Unique should carry meta-information that reveals their NotNull and Unique constraints, respectively, on the attribute name in table Lecture. Without this information, a new subgraph can be extracted from the merged RDF graph and then misinterpreted, generating unintended constraints not found in the original data (see the primary key in Figure 2c). Thus, mapping methods based on weak definitions of semantics preservation can cause information loss and incorrect data generation. Therefore, to ensure the accuracy of the mapping process in all cases, a stringent definition is needed to quantify semantics preservation.
This paper proposes a hierarchical direct-mapping algorithm that prevents the problem illustrated in Figure 2 and preserves the semantics based on strict logical rules. Mapping problems occur when a mapping method simply focuses on data-type transformations. To prevent this, the proposed method uses a hierarchical semantics vocabulary and advanced mapping rules to map without semantic information loss. This paper also defines an evaluation metric using the inverse-mapping phase. That is, a mapping method is said to preserve the semantics if the result of inverse mapping is semantically identical to the original input data (Figure 3). Evaluation results confirm the effectiveness and accuracy of semantics preservation of the proposed mapping method.
This paper makes the following contributions:
• Lemmas are defined to ensure semantics preservation and to demonstrate the soundness and completeness of direct mapping.
• Hierarchical mapping rules are defined based on the lemmas to perform semantics-preserving RDB-to-RDF (RDB2RDF) transformations and to prevent the loss of semantics and the generation of incorrect semantic data.
• The scope of semantics preservation is extended, such that the inverse transformation of the output semantic data should be identical to the original input data.
The remainder of this paper is structured as follows. In the following section, the RDB2RDF mapping methods are briefly discussed. Section 3 presents preliminaries and problem descriptions of direct mapping. Section 4 describes the proposed mapping rules using logical definitions and the implementation of those rules in detail. Section 5 summarizes the results of our experiments. Finally, the conclusions and discussion of prospective future research are provided in Section 6.

Related Work
RDB2RDF is a mapping method that transforms relational data into semantic data represented by the RDF. The RDF data model [17] is a language that describes semantic information on the Semantic Web. The basic unit of RDF data is based on a graph structure (i.e., the triple: subject, property, and object) [18][19][20]. Compared with the relational data model, the RDF is a more flexible and interoperable model for publishing information to the web. However, because ~70% of websites are backed by RDBs [1], existing relational data must be used with the RDB2RDF methodology for the improvement of the Semantic Web.
RDB2RDF mapping approaches include mapping creation, mapping representation and accessibility, mapping implementation, query implementation, application domain, and data integration [21]. Mapping creation has been widely studied to improve the generation of mappings between relational data and semantic data, and it can be performed either automatically or manually [22] (see Table 1 for a list of previous mapping works). Domain semantics-driven mapping is a manual mapping method [23,24]. The W3C RDB2RDF Working Group has recommended the R2RML mapping language [25,26] for customizing mappings. Mapping tools such as D2RQ [27,28], Virtuoso [29,30], and Ultrawrap [31,32] have also been provided to support manual mapping. On the other hand, direct mapping is an automatic mapping method that was published by the W3C RDB2RDF Working Group in 2012 [5]. It uses RDB instances and schemas as inputs and automatically generates RDF semantic data. In the field of RDF data creation, some methods have been devised to transform various types of data (e.g., heterogeneous data [33], object-oriented data [34], and the Web of Data [35]) to RDF data. The current paper, however, mainly focuses on direct mapping to manage large-scale data on the web.
Table 1. Classification of previously reported mapping methods.

Further research has been conducted to obtain semantic data from relational data without information loss [6][8][9][10][11][12][13][14][15][16]. RDF schema (RDFS) and Web Ontology Language (OWL) are used to obtain a more accurate mapping for RDB2RDF transformation. The concept of RDF data can be modeled by RDFS or OWL in a manner similar to that used for defining the relational schema using SQL data definition language (DDL). Moreover, because OWL contains a more expressive semantic vocabulary, the mapping methods can better express the semantics of relational integrity constraints.
Sequeda et al. [7] proposed an augmented direct-mapping method that generates semantic data from the integrity constraints of the SQL DDL schema. Because integrity constraints define the semantics of the RDBs, the quality of augmented direct mapping depends on the transformation of the integrity constraints of the RDBs. DB2OWL [10] and RDBToOnto [11] also provided augmented direct-mapping tools. However, they were restricted to supporting only referential integrity constraints. Lim et al. [13] used OWL to process more rules, and Jun et al. [14] proposed a semantics-preserving optimization for mapping multi-column key constraints. However, these methods still lacked support for transforming all integrity constraints of the relational SQL syntax. Moreover, this paper observes that the problem of incorrect semantic data generation can occur in specific cases, as described in the next section.

Direct Mapping
Developed in 2010 and recommended in 2012 by the W3C RDB2RDF Working Group, direct mapping is an automatic map-creation method that transforms relational input data, including schema data, into RDF graph data (i.e., a direct graph). Direct mapping can be viewed as a function that transforms relational data with integrity constraints into semantic data. Figure 4 provides an example of relational input data for the direct-mapping process. The table "Product" contains an attribute pId as its primary key, an attribute name, and an attribute production as a foreign key that references a table called "Production." Production contains an attribute pCd as its primary key and an attribute name. Figure 5 presents the result of the direct-mapping process for the input data shown in Figure 4. The output graph comprises a set of RDF triples. Suppose that the base URI of the output data is <http://idb.snu.ac.kr/example/>. The primary key attributes with the base URI are used to generate the subject resources. Two Product resources and two Production resources are generated. Predicates are generated from the attribute names of relational tables, and objects are generated from the attribute values.
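The triple-generation step described above can be sketched in a few lines of Python. This is a minimal illustration only: the row values are hypothetical (the contents of Figure 4 are not reproduced here), the function name direct_map is an assumption, and the IRI layout (Table/key=value for subjects, Table#attribute for predicates) only loosely follows the W3C direct-mapping convention.

```python
BASE_URI = "http://idb.snu.ac.kr/example/"  # base URI stated in the text


def direct_map(table, primary_key, rows, foreign_keys=None):
    """Return (subject, predicate, object) triples for the rows of one table.

    foreign_keys maps an attribute name to (referenced_table, referenced_key).
    """
    foreign_keys = foreign_keys or {}
    triples = []
    for row in rows:
        # The subject is built from the base URI and the primary-key value.
        subject = f"<{BASE_URI}{table}/{primary_key}={row[primary_key]}>"
        for attribute, value in row.items():
            # The predicate is built from the attribute name.
            predicate = f"<{BASE_URI}{table}#{attribute}>"
            if attribute in foreign_keys:
                # A foreign-key value becomes a reference to another resource.
                ref_table, ref_key = foreign_keys[attribute]
                obj = f"<{BASE_URI}{ref_table}/{ref_key}={value}>"
            else:
                # Any other value becomes a plain literal.
                obj = f'"{value}"'
            triples.append((subject, predicate, obj))
    return triples


# Hypothetical instance data: two Product rows and two Production rows.
products = [
    {"pId": 1, "name": "Laptop", "production": "P01"},
    {"pId": 2, "name": "Phone", "production": "P02"},
]
productions = [
    {"pCd": "P01", "name": "Seoul plant"},
    {"pCd": "P02", "name": "Busan plant"},
]

triples = direct_map("Product", "pId", products,
                     {"production": ("Production", "pCd")})
triples += direct_map("Production", "pCd", productions)

for triple in triples:
    print(" ".join(triple), ".")
```

With these hypothetical rows, the sketch yields two Product subjects and two Production subjects, matching the counts mentioned in the text. Integrity constraints are deliberately ignored here, which is the gap the later sections address.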
Appl. Sci. 2020, 10, x FOR PEER REVIEW

Semantics Preservation of Direct Mapping
Further research has been conducted on the improvement of direct mapping to reduce information loss and ensure semantics preservation. Semantics preservation is an important feature of direct mapping, as the quality of direct mapping depends heavily on it. Sequeda et al. [7] provided a theoretical definition of semantics preservation. In addition to that definition, this research provides a stricter definition to quantify semantics preservation and evaluate the accuracy of mapping methods. This paper defines the semantics preservation of mapping methods as follows:
Semantics preservation: Suppose X is a set of relational data, F is an RDB-to-RDF mapping function, and G is the RDF-to-RDB inverse-mapping function of F. If |X| = |G(F(X))| and |G(F(X)) − X| = 0, then F is an ideal function that satisfies semantics-preserving mapping (Figure 6).
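The round-trip condition in this definition can be checked mechanically. The following Python sketch is illustrative only and rests on several assumptions: relational data is modeled as a set of (constraint, attribute) pairs, and the names forward, exact_inverse, and overeager_inverse are hypothetical stand-ins for F and two candidate G functions. The over-eager inverse imitates the misreading from the introduction, where NotNull and Unique on the same attribute are taken to be a primary key.

```python
def preserves_semantics(x, forward, inverse):
    """Check |X| = |G(F(X))| and |G(F(X)) - X| = 0 for the given sets."""
    round_trip = inverse(forward(x))
    return len(x) == len(round_trip) and len(round_trip - x) == 0


def forward(facts):
    """Toy F: turn (constraint, attribute) facts into triples."""
    return {(attribute, "hasConstraint", kind) for kind, attribute in facts}


def exact_inverse(graph):
    """Toy G that reads back exactly what was written."""
    return {(kind, attribute) for attribute, _, kind in graph}


def overeager_inverse(graph):
    """Toy G that also infers a primary key from NotNull plus Unique."""
    facts = exact_inverse(graph)
    for attribute in {a for _, a in facts}:
        kinds = {k for k, a in facts if a == attribute}
        if {"NotNull", "Unique"} <= kinds:
            facts.add(("PrimaryKey", attribute))
    return facts


# The attribute name of table Lecture, as in the introduction's example.
x = {("NotNull", "Lecture.name"), ("Unique", "Lecture.name")}

print(preserves_semantics(x, forward, exact_inverse))      # True
print(preserves_semantics(x, forward, overeager_inverse))  # False
```

The second call fails both conditions because the round trip contains a constraint that is absent from the original input.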

Limitation of Direct Mapping for Semantics Preservation
This section defines the three challenging problems encountered during the direct-mapping process for semantics preservation. This paper observes loss of semantics (Problems 1 and 2) and incorrect semantic data generation (Problem 3), which may occur in specific cases. The problems and the specific conditions in which they occur were found during studies of existing direct-mapping methods [8][9][10][11][12][13][14][15][16]. To overcome these drawbacks, the problems are organized into three categories.
Problem 1 illustrates the loss of information when relational tables are transformed into semantic data using an OWL class. Because the OWL class is associated with objects generated in a single semantic model structure, every semantic resource can be inferred from the OWL class. Therefore, additional methods are needed to indicate that the output data are particularly generated from the transformed relational table.
• Problem 1: Suppose y_a = Class(x_a) is an RDB2RDF mapping rule for a relational table, where x_a ∈ R, R is a set of relational tables, y_a ∈ C, and C is a set of OWL classes. Suppose also that x_b = Class_Inverse(y_b) is the RDF2RDB inverse-mapping rule of an OWL class, where x_b ∈ X, X is the set of results generated by Class_Inverse(y_b), and y_b ∈ C. However, x_b is not the same as x_a, because R ⊂ X.
Problem 2 illustrates another type of information loss that occurs when semantic resources reference other resources. Each referencing object, including relational attributes and binary relations, can be transformed into the same type (i.e., OWL object property). Thus, a method of distinguishing each referencing object is discussed and provided in Section 4.
• Problem 2: Suppose y_a = ObjProp(x_a) is an RDB2RDF mapping rule for a relational table, where x_a = {x_a1, x_a2}, x_a1 ∈ B, B is a set of binary relations, x_a2 ∈ F, F is a set of foreign keys, y_a ∈ O, and O is a set of OWL object properties. Suppose also that x_b = ObjProp_Inverse(y_b) is the RDF2RDB inverse-mapping rule of an OWL object property, where x_b ∈ X, X is the set of results generated by ObjProp_Inverse(y_b), and y_b ∈ O. However, ObjProp_Inverse( ) does not work as intended, because it has not been given any information to determine whether y_b is generated from x_a1 or x_a2.
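The ambiguity in Problem 2 can be made concrete with a small Python sketch. Everything here is an assumption made for illustration: the function names obj_prop and obj_prop_inverse_candidates, the string labels for source kinds, and the example property name are hypothetical and are not taken from the paper's rule set.

```python
OWL_OBJECT_PROPERTY = "owl:ObjectProperty"


def obj_prop(source_kind, name):
    """Naive forward rule: the source kind is not recorded in the output."""
    return (name, "rdf:type", OWL_OBJECT_PROPERTY)


def obj_prop_inverse_candidates(triple):
    """Return every relational construct the triple could have come from."""
    name, _, rdf_type = triple
    if rdf_type != OWL_OBJECT_PROPERTY:
        return set()
    # Nothing in the triple says which construct produced it.
    return {("foreign_key", name), ("binary_relation", name)}


from_foreign_key = obj_prop("foreign_key", "ex:production")
from_binary_relation = obj_prop("binary_relation", "ex:production")

# Two different inputs yield the same output, so the rule is not invertible.
print(from_foreign_key == from_binary_relation)
print(sorted(obj_prop_inverse_candidates(from_foreign_key)))
```

Because both inputs collapse to the same triple, any inverse rule has to guess, which is why the text calls for a way to distinguish each referencing object.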
Problem 3 illustrates incorrect semantic data generation when integrity constraints are transformed without considering that every subgraph having a specific identical root node can be merged into a single graph.
• Problem 3: Assume that the mapping rules for integrity constraints of relational data are described in Figure 7. Here, the predicates on the right-hand side are used to verify the integrity constraints. DefaultCondition(p, v) is a function that assigns v as the default value of predicate p. CheckCondition(p, c) is a function that assigns c as the check condition of predicate p. The other predicates on the left-hand side are defined in Appendix A.

Figure 7. Simple mapping rules of integrity constraints.
The subset relationships can be inferred, as shown in Figure 8. However, semantic data generated by the above rules can be misinterpreted. For example, assume that a relational attribute, x, has the integrity constraints "primary key" and "check," F is a mapping function that contains the rules of Figure 7, and G is the inverse-mapping function of F. Then, the integrity constraints of G(F(x)) are "primary key," "check," "foreign key," and "unique," because FK(p) ⊆ Check(p) and Unique(p) ⊆ PK(p) ∪ Check(p). Therefore, developing a method to avoid incorrect semantic data generation is a challenge associated with the semantics-preserving RDB2RDF transformation (a detailed example is provided in Appendix B).
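The misinterpretation described here follows directly from the subset relationships, as the Python sketch below illustrates. The pattern labels t1 to t6 are hypothetical placeholders, not the actual right-hand sides of Figure 7; they are chosen only so that FK(p) ⊆ Check(p) and Unique(p) ⊆ PK(p) ∪ Check(p) hold, as stated in the text. The function names are likewise assumptions.

```python
# Hypothetical triple patterns emitted for each integrity constraint.
PATTERNS = {
    "primary key": {"t1", "t2"},
    "check": {"t3", "t4"},
    "foreign key": {"t3"},       # subset of the check patterns
    "unique": {"t2", "t4"},      # subset of primary key plus check patterns
    "not null": {"t5"},
    "default": {"t6"},
}


def forward(constraints):
    """Toy F: merge the patterns of all constraints into a single subgraph."""
    merged = set()
    for constraint in constraints:
        merged |= PATTERNS[constraint]
    return merged


def naive_inverse(merged):
    """Toy G: report every constraint whose patterns occur in the subgraph."""
    return {c for c, pattern in PATTERNS.items() if pattern <= merged}


original = {"primary key", "check"}
recovered = naive_inverse(forward(original))

print(sorted(recovered))
# ['check', 'foreign key', 'primary key', 'unique']
print(sorted(recovered - original))
# ['foreign key', 'unique']
```

Since the merged subgraph carries no record of which constraint produced which pattern, the inverse step reports two constraints that the original attribute never had.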

Hierarchical Mapping Rules
This section provides the hierarchical rules for transforming general relational schemas and integrity constraints. Each rule is based on lemmas that are valid within the semantics domain. This section then explains how the problems described in Section 3.3 can be prevented by using the proposed rules via the lemmas. This work uses predicate logic to define the rules and adds graphical examples for better understanding. Hierarchically structured semantic vocabularies are then provided in order to generate sound and precise semantic data. The relationships among the lemmas, rules, and problems are described in Figure 9 to clarify the concept of the rules.

Rules for General Relational Schemas
This section defines the rules using Lemmas 1 and 2 to generate accurate RDF data from relational data without information loss (proofs are provided in the Appendix C and D). Lemma 1 The subset relationships can be inferred, as shown in Figure 8. However, semantic data generated by the above rules can be misinterpreted. For example, assume that a relational attribute, x, has integrity constraints, "primary key" and "check," F is a mapping function that contains the rules of Figure 7, and G is an inverse mapping function of F. Then, the integrity constraints of G(F(x)) are "primary key," "check," "foreign key," and "unique," because FK(p) ⊆ Check(p), and Unique (p) ⊆ PK(p) ∪ Check(p). Therefore, developing a method to avoid incorrect semantic data generation is a challenge associated with the semantics-preserving RDB2RDF transformation (the detailed example is provided in the Appendix B). The subset relationships can be inferred, as shown in Figure 8. However, semantic data generated by the above rules can be misinterpreted. For example, assume that a relational attribute, x, has integrity constraints, "primary key" and "check," F is a mapping function that contains the rules of Figure 7, and G is an inverse mapping function of F. Then, the integrity constraints of G(F(x)) are "primary key," "check," "foreign key," and "unique," because FK(p) ⊆ Check(p), and Unique (p) ⊆ PK(p) ∪ Check(p). Therefore, developing a method to avoid incorrect semantic data generation is a challenge associated with the semantics-preserving RDB2RDF transformation (the detailed example is provided in the Appendix B).

Hierarchical Mapping Rules
This section provides the hierarchical rules for learning general relational schemas and integrity constraints. Each rule is based on lemmas that are valid within the semantics domain. Then, this section then explains how the problems described in Section 3.3 can be prevented by using proposed rules via lemmas. This work uses predicate logic to define rules and add graphical examples for better understanding. Then, the hierarchically structured semantic vocabularies are provided in order to generate sound and precise semantic data. The relationships among the lemmas, rules, and problems are described in Figure 9 to clarify the concept of the rules.

Rules for General Relational Schemas
This section defines the rules using Lemmas 1 and 2 to generate accurate RDF data from relational data without information loss (proofs are provided in the Appendix C and D). Lemma 1

Hierarchical Mapping Rules
This section provides the hierarchical rules for learning general relational schemas and integrity constraints. Each rule is based on lemmas that are valid within the semantics domain. Then, this section then explains how the problems described in Section 3.3 can be prevented by using proposed rules via lemmas. This work uses predicate logic to define rules and add graphical examples for better understanding. Then, the hierarchically structured semantic vocabularies are provided in order to generate sound and precise semantic data. The relationships among the lemmas, rules, and problems are described in Figure 9 to clarify the concept of the rules. The subset relationships can be inferred, as shown in Figure 8. However, semantic data generated by the above rules can be misinterpreted. For example, assume that a relational attribute, x, has integrity constraints, "primary key" and "check," F is a mapping function that contains the rules of Figure 7, and G is an inverse mapping function of F. Then, the integrity constraints of G(F(x)) are "primary key," "check," "foreign key," and "unique," because FK(p) ⊆ Check(p), and Unique (p) ⊆ PK(p) ∪ Check(p). Therefore, developing a method to avoid incorrect semantic data generation is a challenge associated with the semantics-preserving RDB2RDF transformation (the detailed example is provided in the Appendix B).

Rules for General Relational Schemas
This section defines the rules using Lemmas 1 and 2 to generate accurate RDF data from relational data without information loss (proofs are provided in Appendices C and D). Lemma 1 describes the feature of the OWL class during the mapping process.

•
Lemma 1: Suppose R is a relational table set, A is an attribute set, K is an integrity constraint set, I is a relational instance set, X is a set where X ⊂ (R ∪ A ∪ K ∪ I), and F is a direct mapping function. Then, every y ∈ F(X) can be retrieved by inference from owl:Class.
Thus, to avoid Problem 1 described in Section 3.3, Rule 1 for mapping relational tables based on Lemma 1 is defined as follows.

•
Rule 1: Rel(r) ∧ ¬BinRel(r, a₁, …, aₘ, s, b₁, …, bₙ, t) → Relation(r), where the predicates used on the left-hand side are defined in Appendix A, and Relation(r) is a predicate that verifies that r is a relational table and not a binary relation.
By Rule 1, relational tables are transformed into semantic resources using Relation (typed as an OWL class), a semantic vocabulary term that denotes relational tables. For example, if a relational table, "Student," is transformed by a naive rule, "Rel(Student) → Class(Student)," then the transformed output Student loses the explicit information indicating that it is a relational table. This loss happens because all semantic resources are typed only by the OWL class (Figure 10a). In contrast, Student does not lose this information if Rule 1 is applied: Rule 1 defines Relation as the type of Student, which states explicitly that the RDF data were transformed from a relational table (Figure 10b).
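The contrast between the naive rule and Rule 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: triples are plain tuples, the `hdm:` and `ex:` prefixes and all function names are hypothetical, and the binary-relation test is passed in as a flag because BinRel is defined in Appendix A.

```python
# Minimal sketch of Rule 1; triples are (subject, predicate, object) tuples.
# The "hdm:" / "ex:" prefixes and the function names are hypothetical.

RDF_TYPE = "rdf:type"

# Vocabulary triple: Relation is itself typed as an OWL class.
VOCABULARY = {("hdm:Relation", RDF_TYPE, "owl:Class")}


def map_table_naive(table_name):
    """Naive rule Rel(r) -> Class(r): the table becomes a bare OWL class."""
    return {(f"ex:{table_name}", RDF_TYPE, "owl:Class")}


def map_table_rule1(table_name, is_binary_relation=False):
    """Rule 1: a table that is not a binary relation is typed as Relation."""
    if is_binary_relation:
        return set()  # binary relations are left to a different rule
    return VOCABULARY | {(f"ex:{table_name}", RDF_TYPE, "hdm:Relation")}


def came_from_table(triples, resource):
    """Inverse check: do the triples say that `resource` was a relational table?"""
    return (resource, RDF_TYPE, "hdm:Relation") in triples


naive = map_table_naive("Student")
hierarchical = map_table_rule1("Student")
print(came_from_table(naive, "ex:Student"))         # False
print(came_from_table(hierarchical, "ex:Student"))  # True
```

In the naive output, nothing separates Student from any other OWL class, whereas the Rule 1 output keeps the table origin recoverable while still reaching owl:Class through the vocabulary triple.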
Lemma 2 illustrates the feature of the OWL object property used to express the semantics of relationships between semantic resources.

•
Lemma 2: For any X in relational data, if x ∈ X references another y ∈ X, then x can be transformed into a semantic resource, which has a type of owl:ObjectProperty.
Based on Lemma 2, if a direct-mapping method does not handle the feature of an object property accurately, then Problem 2 described in Section 3.3 can occur during the mapping process. Thus, Rules 2-5 are defined based on Lemma 2 for mapping the semantics of relationships. Rule 2 is composed of five sub-rules that specify the attributes within the hierarchical structure: where the predicates on the left-hand side are defined in Appendix A, and the predicates on the right-hand side represent the transforms of relational attribute a. With these predicates, Rule 2 has a distinct advantage over previous approaches that simply used the OWL object property and the datatype property to map relational attributes. Because the OWL properties are provided to describe any resource with referencing semantics (not just relational attributes), using only the OWL properties does not guarantee that the output data were originally attribute data. As a result, previous approaches cannot avoid semantic information loss when mapping attributes. However, Rule 2 adopts hierarchically structured semantic vocabularies for attributes (Figure 11). The vocabularies describe various types of attributes, so each input attribute can be transformed into RDF data with detailed information.
For Rule 3, the predicates on the left-hand side are defined in Appendix A, and BinaryRelation(r, s, t) is a predicate that verifies whether a binary relation, r, can be transformed into the semantic resource BinaryRelation (typed as an OWL object property), a semantic vocabulary term that denotes binary relations. Although both Rules 2 and 3 use owl:ObjectProperty during the mapping process, the mapping results of the two rules can be readily distinguished. The semantics of type owl:ObjectProperty are encapsulated by the mapping resource FKeyAttr in Rule 2 (Figure 11) and by BinaryRelation in Rule 3 (Figure 12). Therefore, a mapping result having FKeyAttr implies that it was originally an attribute of relational data. Likewise, from a result having BinaryRelation, we can infer that it was a binary relation before the mapping process.
For the rules that map identifying and nonidentifying relationships, the predicates on the left-hand side are defined in Appendix A, IdentifyingRelationship(r, s, t) is a predicate that verifies identifying relationships, and NonIdentifyingRelationship(r, s, t) is a predicate that verifies nonidentifying relationships. Figure 13 shows an example of mapping an identifying relationship. Because the primary key of Professor contains a foreign key referencing Person, the relation Professor is dependent on the relation Person. In such a case, relationships between Professor and Person can be mapped using IdentifyingRelationship( ), as defined by Rule 4.
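The distinctions drawn above can be illustrated with a short Python sketch. It rests on assumptions that are not taken from the paper: the `Table` and `ForeignKey` classes are hypothetical, the binary-relation test (exactly two foreign keys that together cover every column) is a common convention standing in for the BinRel predicate of Appendix A, and the shape of the output tuples is illustrative only. The identifying-relationship test follows the Professor and Person example: a relationship is identifying when the primary key contains the foreign key.

```python
from dataclasses import dataclass, field


@dataclass
class ForeignKey:
    columns: tuple  # referencing columns in this table
    target: str     # name of the referenced table


@dataclass
class Table:
    name: str
    columns: tuple
    primary_key: tuple = ()
    foreign_keys: list = field(default_factory=list)


def is_binary_relation(t):
    """Assumed test: exactly two foreign keys that together cover all columns."""
    fk_cols = {c for fk in t.foreign_keys for c in fk.columns}
    return len(t.foreign_keys) == 2 and fk_cols == set(t.columns)


def classify_relationship(t, fk):
    """Identifying if the primary key contains every column of the foreign key."""
    if set(fk.columns) <= set(t.primary_key):
        return "IdentifyingRelationship"
    return "NonIdentifyingRelationship"


def describe(t):
    """Return illustrative (vocabulary term, table, referenced table...) tuples."""
    if is_binary_relation(t):
        s, o = (fk.target for fk in t.foreign_keys)
        return [("BinaryRelation", t.name, s, o)]
    return [(classify_relationship(t, fk), t.name, fk.target)
            for fk in t.foreign_keys]


professor = Table("Professor", ("person_id", "dept"), ("person_id",),
                  [ForeignKey(("person_id",), "Person")])
student = Table("Student", ("id", "name", "advisor_id"), ("id",),
                [ForeignKey(("advisor_id",), "Professor")])
enrolls = Table("Enrolls", ("student_id", "course_id"),
                ("student_id", "course_id"),
                [ForeignKey(("student_id",), "Student"),
                 ForeignKey(("course_id",), "Course")])

for table in (professor, student, enrolls):
    print(describe(table))
```

Checking for a binary relation first matters in this sketch, because the foreign keys of a pure join table are also part of its primary key and would otherwise be reported as identifying relationships.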

Rules for Relational Integrity Constraints
This section provides additional rules for transforming relational integrity constraints to prevent the generation of incorrect RDF data. Lemma 3 illustrates the feature of RDF data that it forms a linked graph structure. This feature is a major factor in preventing the generation of incorrect RDF data. Thus, this section defines Rules 6-11 for mapping integrity constraints based on Lemma 3 (the proof is provided in Appendix E).

•
Lemma 3: Suppose G is an RDF graph, G₁ and G₂ are components of G, there is no edge between G₁ and G₂, G₁ is rooted at x ∈ R, and G₂ is rooted at y ∈ R, where R is a set of semantic resources. Then,

1.
If x and y have the same uniform resource identifier (URI) [36], then x is identical to y. Thus, G₁ and G₂ can be merged into one graph.

2.
If x and y have different URIs, x has a property, p₁, and y has a property, p₂, that has the same URI as p₁, then G₁ and G₂ cannot be merged into one graph, and p₁ can be distinguished from p₂ using x and y.
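The two cases of Lemma 3 can be made concrete with a small Python sketch. It is a simplification under stated assumptions: each component is a flat set of triples whose subject is its root, the URIs (`ex:`, `hdm:`) are hypothetical, and `merge_components` is an illustrative helper rather than part of the proposed method.

```python
from collections import defaultdict


def merge_components(*components):
    """Group triples by root URI: equal roots merge, different roots stay apart."""
    merged = defaultdict(set)
    for triples in components:
        for subject, predicate, obj in triples:
            merged[subject].add((predicate, obj))
    return dict(merged)


# Case 1: g1 and g2 are rooted at the same URI, so they describe one resource.
g1 = {("ex:Student", "hdm:card", "1")}
g2 = {("ex:Student", "rdf:type", "hdm:Relation")}
# Case 2: g3 uses the same property URI as g1 but has a different root.
g3 = {("ex:Course", "hdm:card", "1")}

merged = merge_components(g1, g2, g3)
for root, edges in sorted(merged.items()):
    print(root, sorted(edges))
```

After merging, `ex:Student` carries the edges of both g1 and g2, while the `hdm:card` edge of `ex:Course` remains separate and is told apart from the one on `ex:Student` only by its root.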
Rule 6 describes the NotNull constraint. It also defines a predicate, Card(a, _b, 1), that restricts the cardinality of an attribute to be exactly one (Figure 15a). Rule 7 specifies a unique constraint and defines a predicate with a unique existential quantifier, (∃!v) a(r, v), such that there is only one attribute value, v, contained in the domain of a(r, v) (Figure 15b). Rule 8 specifies the primary key, defined by Card(a, _b, 1) and (∃!v) a(r, v), to designate an attribute, a, as a primary key (Figure 16a). To define a foreign-key constraint, Rule 9 specifies a lower bound of the cardinality using MinCard(a, _b, 1), because relational tables can reference more than one other table. Rule 9 also uses FKeyAttr(a, r, s) to express that the type of attribute a is an OWL object property with domain r and range s (Figure 16b).
Rule 10 specifies the default constraint and uses a function, DefVal(a, _b, v), that returns a default value, v, if a value of the attribute, a, is omitted (Figure 17a). In Rule 11, a function, CheckCond(a, _b, v), is used to restrict the value range for the check constraint. For example, CheckCond(quantity, _b, 'quantity > 0') means that a value of the attribute quantity must be greater than zero (Figure 17b).
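As a rough illustration of how Rules 6-11 pair each constraint with its predicates, the following Python sketch emits predicate tuples for one attribute. Several details are assumptions made for the example: the input dictionary format, the function name `map_constraints`, the tuple layout, and the label `ExistsUnique` standing for (∃!v) a(r, v). Keying the output by constraint name is a stand-in for the hierarchical vocabulary, so that a primary key stays distinguishable from a separate NotNull plus unique pair; the exact right-hand sides of the rules are given in the figures and are not reproduced here.

```python
def map_constraints(attr, table, constraints):
    """Return {constraint name: [predicate tuples]} for one attribute.

    `constraints` is a hypothetical dict, e.g.
    {"not_null": True, "default": 1, "check": "quantity > 0"}.
    """
    b = "_b"  # placeholder for the blank node written as _b in the rules
    rules = {}
    if constraints.get("not_null"):                       # Rule 6
        rules["not_null"] = [("Card", attr, b, 1)]
    if constraints.get("unique"):                         # Rule 7
        rules["unique"] = [("ExistsUnique", attr, table)]
    if constraints.get("primary_key"):                    # Rule 8
        rules["primary_key"] = [("Card", attr, b, 1),
                                ("ExistsUnique", attr, table)]
    if "foreign_key" in constraints:                      # Rule 9
        target = constraints["foreign_key"]
        rules["foreign_key"] = [("MinCard", attr, b, 1),
                                ("FKeyAttr", attr, table, target)]
    if "default" in constraints:                          # Rule 10
        rules["default"] = [("DefVal", attr, b, constraints["default"])]
    if "check" in constraints:                            # Rule 11
        rules["check"] = [("CheckCond", attr, b, constraints["check"])]
    return rules


quantity = map_constraints(
    "quantity", "OrderLine",
    {"not_null": True, "default": 1, "check": "quantity > 0"},
)
for name, predicates in quantity.items():
    print(name, predicates)
```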

Soundness and Completeness of the Rules
The lemmas describe the features of semantic resources during the mapping process. Lemma 1 states that every semantic resource can be inferred from the OWL class. Lemma 2 states that every semantic resource referencing other resources can be typed by the OWL object property. Lemma 3 states that subgraphs that have the same semantic resource as a root node can be merged into a single graph. The problems described in Section 3.3, in turn, are specific cases in which semantics preservation is violated. Problem 1 illustrates the loss of information when relational tables are transformed without considering Lemma 1. Problem 2 illustrates another case of information loss, when attributes, binary relations, or other referencing objects are transformed without considering Lemma 2. Problem 3 illustrates incorrect RDF data generation when integrity constraints are transformed without considering Lemma 3. Therefore, the mapping rules are defined based on the lemmas to perform semantics-preserving transformation of RDBs to RDF data and to avoid the loss of semantics or incorrect RDF data generation. Lemma 4 states the soundness and completeness of the provided RDB2RDF data transformation method (see Appendix F for the proof).

•
Lemma 4: Suppose that X is a set of relational data, F is an RDB2RDF mapping function, and G is an RDF2RDB inverse-mapping function of F. If the mapping rules are defined based on Lemmas 1, 2, and 3, then,

1.
Soundness: the mapping rules are sound if the rules generate only semantics in RDB data (X ⊇ G(F(X))).

2.
Completeness: the mapping rules are complete if the rules generate all semantics in RDB data (X ⊆ G(F(X))).
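The two conditions of Lemma 4 amount to set comparisons on a round trip, which the following Python sketch spells out. Everything beyond the two inclusions is an assumption made for illustration: the `FEATURES` table is a toy encoding in which a primary key is expressed by the same primitives as NotNull plus unique, `f_flat`/`g_flat` model a mapping that keeps only those primitives, and `f_tagged`/`g_tagged` model a mapping that records which constraint each primitive belongs to. Neither pair is the paper's F or G.

```python
def is_sound(x, roundtrip):
    """Soundness: X ⊇ G(F(X)), i.e. nothing is generated that was not in X."""
    return x >= roundtrip


def is_complete(x, roundtrip):
    """Completeness: X ⊆ G(F(X)), i.e. nothing in X is lost."""
    return x <= roundtrip


# Toy encoding (assumption): primitives needed to express each constraint.
FEATURES = {
    "not_null": {"card=1"},
    "unique": {"exists-unique"},
    "primary_key": {"card=1", "exists-unique"},
}


def f_flat(x):
    """Flat mapping: keep only the primitives, forget which constraint they came from."""
    return {feat for c in x for feat in FEATURES[c]}


def g_flat(features):
    """Inverse of the flat mapping: every constraint whose primitives are all present."""
    return {c for c, needed in FEATURES.items() if needed <= features}


def f_tagged(x):
    """Tagged mapping: each primitive stays attached to its constraint."""
    return {(c, feat) for c in x for feat in FEATURES[c]}


def g_tagged(pairs):
    """Inverse of the tagged mapping: read the constraint names back."""
    return {c for c, _ in pairs}


x = {"not_null", "unique"}
flat = g_flat(f_flat(x))
tagged = g_tagged(f_tagged(x))
print(sorted(flat), is_sound(x, flat), is_complete(x, flat))
print(sorted(tagged), is_sound(x, tagged), is_complete(x, tagged))
```

With the flat encoding, an attribute declared "unique" and "not null" comes back with an extra "primary key", so the round trip is complete but not sound; with the tagged encoding, the round trip returns exactly X.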

Environments
Experiments were conducted using five real datasets and one synthetic dataset on a cluster of 12 nodes using a 3.1-GHz quad-core processor, 4-GB memory, and a 2-TB hard disk. Each real dataset contains relational schema information with integrity constraints: Ensembl-compara (www.ensembl.org/info/docs/api/compara), Ensembl (www.ensembl.org), PHPmyadmin (https://www.phpmyadmin.net), and MusicBrainz (https://musicbrainz.org). The DBT2 (http://osdldbt.sourceforge.net) benchmark was used for the synthetic dataset. This work generated warehouse data using DBT2 and restructured the schema by adding integrity constraints to evaluate the semantics preservation of the mapping methods.

Analysis
Figure 18a presents the number of triples transformed from the relational data. To perform a comparative analysis of the cost efficiency of the mapping rules, this work employed OWL ontology-based augmented direct mapping [13], which provides implementation details of the mapping algorithm and demonstrates an improvement over other previous methods. The horizontal axis represents the size of the relational data given as input to the mapping methods, and the vertical axis represents the number of semantic triples produced as output. As shown in Figure 18a, our approach generates fewer triples than the previous method.
Figure 18b shows the average number of triples that result from each transformation method. The horizontal axis represents each relational dataset, and the vertical axis represents the average number of triples generated from the transformation of a single relational element. Assuming that two output results are identical in terms of semantics, the method that generates the smaller result is better with regard to both space and computation. When the input data are transformed into RDF data, the proposed mapping rules use the hierarchical RDF data model. In contrast, previous methods generated output data without a predefined RDF data model, and repetitive RDF data were generated using primitive semantic model languages when the input data had several referential relationships or constraints. Thus, the results show that the proposed approach generates more compact RDF data that express the same information with fewer resources.
Figure 19 shows the failure rate of the mapping methods for each database. The horizontal axis represents each relational dataset, and the vertical axis represents the failure rate during the transformation of relational data into RDF data. The mapping failures of the previous approach result in the incorrect RDF data generation problem discussed in Section 3. These failures occurred because the previous methods lacked support for handling the integrity constraints. Our approach followed Lemma 4 and guaranteed that the hierarchical mapping rules generated fewer false mapping results. With our method, false results can occur when the input data are defined using practical SQL statements that are not included in standard SQL.
… [8], Buccella et al. [9], Li et al. [12], Lim et al. [13], Shen et al. [15], and Tirmizi et al. [16]. The results show that the proposed method adheres to the definition of the semantics-preserving direct-mapping rule. The mapping rules generate only semantics in RDB data and generate most of the semantics of the RDF data.

Conclusions
This paper focuses on the problem that existing direct-mapping methods do not fully support mapping integrity constraints. The problem is observed in specific cases in which semantic information loss or incorrect RDF data generation occurs. An improved definition of semantics preservation is provided to solve these problems and augment the RDB2RDF mapping methods.
Three lemmas are defined to describe the features of semantic resources during the mapping process. Lemma 1 states that each semantic resource can be inferred from an OWL class (i.e., semantic resource). Lemma 2 states that each semantic resource referencing other resources can be typed by an OWL object property (i.e., referential relationship). Lemma 3 states that subgraphs having the same semantic resource as a root node can be merged into a single graph (i.e., union of semantic resources). A hierarchically structured semantic vocabulary is also defined for use in the direct-mapping rules.
Rule sets are defined based on the lemmas to transform relational tables and attributes. The mapping rules comprise general- and constraint-mapping rules. The general-mapping rules are used for mapping relations, attributes, and other general relational objects and are defined to avoid semantic information loss during the transformation of general relational objects. The constraint-mapping rules are used for mapping the integrity constraints and are defined to reduce the volume of incorrect RDF data generated.
Finally, the semantics-preserving direct-mapping method was implemented, and a comparative experimental study was performed with both synthetic and real datasets. The experiments demonstrated that the proposed mapping method performs semantics-preserving RDB2RDF transformation and generates semantically accurate results. In the future, we will study synchronization methods to achieve RDF data consistency when the original relational data are modified [38]. We will also build a cost-benefit model that reduces the number of repetitive processes.

Figure A1 shows an example of Problem 3. Assume that a relational attribute x has the integrity constraints "unique" and "not null", F is a mapping function that contains the rules in Figure 7, and G is an inverse-mapping function of F. Then the integrity constraints of G(F(x)) are "unique", "not null", and "primary key" because FK(p) ⊆ Unique(p), NotNull(p) ⊆ Unique(p), and PK(p) ⊆ NotNull(p) ∪ Unique(p).

Appendix C. Proof of Lemma 1
If x ∈ R, then x can be transformed directly into owl:Class. If x ∈ A, then x can be transformed into either owl:ObjectProperty or owl:DatatypeProperty, which are types of rdfs:Class. If x ∈ A ∪ K, then x can be transformed into owl:FunctionalProperty or owl:InverseFunctionalProperty, which are types of rdfs:Class. If x ∈ K, then x can be transformed into owl:onProperty, owl:minCardinality, owl:maxCardinality, or owl:cardinality, all of which are types of rdf:Property, and the type of rdf:Property is rdfs:Class. As rdfs:Class is a subclass of owl:Class, the semantic resources used in the transformation are directly or indirectly assigned the owl:Class type. Therefore, the transformed RDF data F(x) can be retrieved by inference using owl:Class.
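The chain of inferences in this proof can be traced mechanically. The Python sketch below encodes the premises exactly as the proof states them, including the step from rdfs:Class to owl:Class; it does not check those premises against the OWL specification. The edge table and the `reaches` helper are illustrative assumptions, with each edge read as "is typed by or is a subclass of".

```python
def reaches(start, goal, edges):
    """Follow type/subclass edges from `start`; True if `goal` is reached."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, ()))
    return False


# Premises as stated in the proof of Lemma 1.
EDGES = {
    "owl:ObjectProperty": ["rdfs:Class"],
    "owl:DatatypeProperty": ["rdfs:Class"],
    "owl:FunctionalProperty": ["rdfs:Class"],
    "owl:InverseFunctionalProperty": ["rdfs:Class"],
    "owl:onProperty": ["rdf:Property"],
    "owl:minCardinality": ["rdf:Property"],
    "owl:maxCardinality": ["rdf:Property"],
    "owl:cardinality": ["rdf:Property"],
    "rdf:Property": ["rdfs:Class"],
    "rdfs:Class": ["owl:Class"],  # premise taken from the proof as written
}

all_reach = all(reaches(term, "owl:Class", EDGES) for term in EDGES)
print(all_reach)
```

Every term used by the transformation reaches owl:Class in at most three steps under these premises, which is the content of Lemma 1.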