Semantic Data Management for a Virtual Factory Collaborative Environment

Featured Application: Materials presented in this article can be used for knowledge base creation and reasoning algorithms in industrial applications. Abstract: Recent developments in the area of cyber-physical systems (CPSs) and Internet of Things (IoT) are among the drivers for the emergence of the Industry 4.0 concept, setting new requirements for the architecture, technology, and design approaches of modern industrial systems. Industry 4.0 assumes a higher level of intelligence, and thus autonomy of the systems and subsystems, and a larger focus on the analysis of gathered data for further utilization. The Virtual Factory Open Operating System (vf-OS) project is intended to respond to some of these key challenges, in particular for the smart factory application domain. Complementarily, data and knowledge storage and processing are also in the scope of vf-OS. This article introduces the semantic management component of vf-OS, which aims to analyze the interrelations among stored entities, as well as to deﬁne the closeness among them to generate meaningful suggestions, which can be later used by other subsystems or operators in a user-friendly way. The semantic managing system makes use of non relational approaches, namely a graph database, which enables data to be represented as graphs for further semantic querying. The developed prototype and an illustrative application case are also presented.


Introduction
Cyber-physical systems (CPSs) have found their application in many areas of human activity, from home automation to large industrial complexes containing hundreds of systems and subsystems. The growing complexity of the modern CPS forces to cope with challenges such as interoperability, scalability, and increasing volumes of generated data, which leads to the need of new system architectures and frameworks. CPSs [1] have become one of the key elements of the Industry 4.0 concept, which is gaining more and more attention from both academia and industry. For instance, in [2] the authors offer insight on how the concept of Industry 4.0 influences the overall state of manufacturing in Europe, with respect to socio-economical aspects such as improved product manufacturability, reduced production time, and reduced negative impact on the environment. In line with this trend, several key projects have been launched in the framework of the European Horizon 2020 program (e.g., Virtual Factory Open Operating System (vf-OS), DISRUPT, SAFIRE, ENACT, Boost 4.0, and many others) to support the implementation of ideas brought up in Industry 4.0. Smart manufacturing is one of the topics of reasoning, and so on. Furthermore, the issue of storage can be added to this list of challenges [15], especially in the context of data-rich environments. Most of the efforts presented in this article focus on semantic discovery, storage, and ontology matching and merging. In the context of this research, semantic discovery is performed through linking the concepts from different ontological models, which, by analogy with semantic routing, have higher preferences or reputation indexes [16].
As such, the main research questions addressed in this work are: • Which tools and approaches can contribute to the improvement of semantic interoperability of heterogeneous data? • How can the import and combination of various data models and ontologies help to enrich the mapping process? • How does a collaborative approach contribute to forming the knowledge base and improving the mapping process?
To underline the core ideas highlighted in this work, the following topics are addressed: limitations of current semantic discovery architectures, definitions clarifying the difference between semantic similarity and contextual representation, description of the graph-based approach and its advantages over the vector-based approach, semantic distance as an identifier of concept closeness, application of pathfinding algorithms to map data models of concepts, utilization of a weighted approach allowing tracking of evolving semantic relations, and identification of the need to import and integrate ontologies from different sources in order to enrich the knowledge base.
The remainder of this article is organized as follows. Section 2 is devoted to a brief discussion of relevant literature, and then the connection of this work to vf-OS is drawn in Section 3. Section 4 is devoted to the semantic management component of vf-OS and some relevant services that are delivered by this component. The next section covers a test-case scenario of how the semantic management component can be used, and related details are discussed. At the end of the article conclusions are made.

Related Literature
Complex systems require support environments that can provide assistance in various aspects, such as data and knowledge storage, process modeling and simulation, and so on. For instance, three main support environments are identified in [17]: (i) e-infrastructure, covering computing, storage, and network capabilities; (ii) research infrastructure, allowing handling of assets from various domains; and (iii) virtual research environments, enabling user-centric support for data selection and discovery. In this work we target some functional aspects from all three categories, namely storage, export of ontologies from different areas, and data selection and discovery mechanisms.
Several limitations have been identified in current semantic discovery systems [18]. The involved architectures are limited or weighted ontologies are not considered, and there is poor performance with aggregated data from various knowledge bases. Adaptive techniques or collaborative knowledge base formation could be key in overcoming the difficulty of possible lack of knowledge about a specific concept. For instance, in one ontology concept, "BMW" is a "car", whereas in another it is a "vehicle". If we aggregate these ontologies, the concept "vehicle" can be associated with the concept "car" through the concept "BMW". This might significantly affect the search results, depending on the types of routing algorithms. Two main routing algorithms are identified: single-keyword and multi-keyword [19]. The same authors proposed an approach for the IoT domain, where each IoT device has its core functionality described using one keyword, while other related aspects are described as "attributes". Single-keyword algorithms might not be enough for some applications, not only in terms of expressing the functionality, but also attributes of the object or concept. Sometimes, instead of a requested concept, object, or service, another one, which is close enough, can be suggested. In order to define "relatedness" between two concepts, the semantic in-between distance needs to be calculated. According to [20], the semantic distance is used to capture the closeness between two pieces of text for the case of "concepts". There are two main types of semantic distance: lexical-based measures and distributional measures of word distance. The first one relies on the structure of the knowledge source, whereas the second is based on the word occurrence rate. The distributional measure of the word-distance approach does not require a lexical database, which makes it more flexible in the case of higher data heterogeneity [21].
The problem acquiring semantic information to find interdependencies among sensors and actuators is raised in [22], where the approach to detect sources of errors through path tracking is proposed. However, these authors mostly focused on the narrow area of physical object dependencies without considering interdependencies among concepts and virtual entities.
Another aspect is that, because of the increasing autonomy and intelligence of CPS components, the need for supporting tools to facilitate the establishment of a collaborative environment has emerged, where the entities need additional semantic information to choose the "best" options based on factors such as, for instance, closest position, similar functionality, previous history, and so on. The issues of self-organization for CPS components are raised in [23], where the authors also present an approach for interoperability provision among resources and services in smart spaces. The basic idea is that each resource contributes to semantic interoperability when it connects to the smart space by uploading its ontology. Semantic interoperability is a key requirement for the development of agile design methodologies for collaborative CPSs, such as the one proposed in [7] based on the design science method. Another work [24] presents a study of the semantic enrichment of manufactured products, with the main objective to enhance the generation and circulation of product-related knowledge among all involved parties. This work departs from these ideas and goes deeper into the subject of data harmonization, including clustering, when basic terms and concepts are combined based on the similarities they possess and are organised based on their semantic and contextual relationships. The use of ontologies to represent machine behavior through the interrelation of input signals and output controls is also discussed in [25].
It is important to distinguish between semantic similarity and contextual representation. According to [26], semantic similarity is a broad term, which considers not only synonyms, but also meronymy (PC and keyboard, keyboard is a part of the PC) and hyponyms (animal and dog, dog is a part of the animal class). Whereas, contextual representation is described [27] as knowledge about how the word is used (i.e., an association between concepts and some attributes common for a specific context). The solution proposed in this work is intended to provide a broader basis for further reasoning, when two concepts might be interrelated or associated not only based on belonging to the same class or one concept being part of another, but also how these two concepts are used together. As an example, we can consider a smart home, where a smart fridge is located in the "kitchen", and a mobile vacuum cleaner can move among different rooms, including the "kitchen"; thus, the smart fridge might be associated with the vacuum cleaner based on a common location, even if temporarily. This aspect raises another modeling challenge, as some of the concepts or entities might have strong or weak associations indicating how close they are related to each other.
Regarding the issue of data representation, two main approaches are discussed in the literature [28]: vector-based and graph-based approaches. The graph-based approach has been chosen for this work based on the following reasons: • it can be used to represent order and structural features of the considered topic; • it allows decomposition of complex topics and representation of interdependencies among subtopics; • it allows hierarchical representation of topics, features, and so on; and • it offers pattern detection capabilities.
Additionally, in Table 1 we include other literature sources devoted to other aspects that are covered in this work, namely (i) ontology or data model integration, (ii) delivering concept "relatedness", (iii) generation of queries for further discovery and visualization, (iv) collaborative knowledge base formation, (v) mapping of data models and ontologies, and (vi) consideration of weighted edges. Most of the works mentioned in Table 1 [29,30] are focused on mapping different data structures, models, and ontologies in order to represent them as graphs in a graph database, but the core logic is maintained [31]. The often-used format for ontology representation is the resource description framework [32,33], sometimes extended by the web ontology language [27,29]. Articles devoted to identifying "relatedness" of concepts to each other using the path between them do not cover the weighted approach [29,30,34,35]. However, in [36] the authors raise the issue of using the weighted approach for text categorization, whereas each document is considered as a graph to apply a categorization algorithm. This allows identifying the importance of the term based on co-occurrence. In our work, the weighted approach is used to provide a basis for reasoning systems by enabling detection of the importance of the concepts inside the data models to each other. Most current work in the literature is rather focused on contextual representation, considering "relatedness" identification rather than semantic similarity [33], which allows interrelating the concepts without a need for predefined knowledge about synonyms, hyponyms, or meronymy. For instance, in [35] the authors solely target the issue of finding the hyponyms in a specific ontology, which limits the approach to a specific ontology containing specific relation types. The issue of importing ontologies is raised in several works (e.g., [29,31,37]), whereas in [38], the authors propose to use a graph database as a "hub" to aggregate the data for further analysis.
Some specific application areas (e.g., utilization of graph database as advisor for academic collaboration [34], mapping documents containing the smart city information into the graph nodes [37], or how ontologies mapping can contribute to the cross standard interoperability in automation [39]) are further considered in the literature. Though, our aim is to utilize the framework presented in this work for multiple domains. Furthermore, the aim is also to provide different functionalities, offer a good level of modularity, and allow new features to be added or existing ones to be extended on demand. Finally, the establishment of mechanisms for collaborative knowledge base formation is also in the scope of this work.
Moreover, a challenging issue itself is the conversion of the so called nonsemantic data. For instance, in [40] the authors propose a platform that can transform the data from a relational database (DB) into a semantic resource description framework (RDF) format. The usage of a graph DB can help to avoid additional difficulties in the way of conversion, allowing data to be stored in an RDF compatible way.

Relation to vf-OS
The component presented in this article is part of the vf-OS project [41], which aims to provide an open platform for virtual factories (VFs) and a collaborative environment to support the manufacturing processes at various stages. The project is closely related to the concepts of Industry 4.0, including CPS and Internet of Things (IoT), whereas other relevant subjects are covered such as data processing, data visualization, and cloud computing. The key vf-OS components are: system kernel, virtual factory i/o, data and connect, open applications development kit, service provision framework, and platform environment. The solution proposed in this work belongs to the data management component (DMC) (Figure 1), which aims to manage data flows on a large scale. Another goal is to provide analytical operators for data analysis. DMC is composed of four modules: data infrastructure middleware, data harmonization, data storage, and data analytics [42].
Appl. Sci. 2019, 9, x FOR PEER REVIEW 6 of 24 to avoid additional difficulties in the way of conversion, allowing data to be stored in an RDF compatible way.

Relation to vf-OS
The component presented in this article is part of the vf-OS project [41], which aims to provide an open platform for virtual factories (VFs) and a collaborative environment to support the manufacturing processes at various stages. The project is closely related to the concepts of Industry 4.0, including CPS and Internet of Things (IoT), whereas other relevant subjects are covered such as data processing, data visualization, and cloud computing. The key vf-OS components are: system kernel, virtual factory i/o, data and connect, open applications development kit, service provision framework, and platform environment. The solution proposed in this work belongs to the data management component (DMC) (Figure 1), which aims to manage data flows on a large scale. Another goal is to provide analytical operators for data analysis. DMC is composed of four modules: data infrastructure middleware, data harmonization, data storage, and data analytics [42]. More specifically, this work is focused on data harmonization and partially on the data storage modules. The data harmonization module is responsible for implementing data mapping mechanisms extended with ontology import interfaces and data visualization capabilities. It is divided into design time and run time submodules. The design time submodule maps the data models, whereas run time is responsible for data transformation. The data storage issue is addressed only partially, as vf-OS requires different types of storage for different types of data, thus using the advantages of various storage approaches in the best way [42]. For the storage part of this component, a Not only SQL (NoSQL) type of database, namely a graph DB, has been chosen. The chosen NoSQL DB, which is Neo4j, is able to cope with large amounts of data and provides agile querying mechanisms, allowing its use in smart storage, which not only "keeps" or stores data but also provides tools to perform reasoning operations on the stored data.

Semantic Management Component
The semantic management component contributes mainly to the design time part of the harmonization module, and it consists of two main parts ( Figure 2): (1) storage and (2) harmonization components. It enables two-way communication processes, output formatting, and a set of services related to the graph DB. It also transforms the data into a format that can be consumed by other components of the vf-OS, resorting to the mechanisms of the graph DB to perform reasoning actions More specifically, this work is focused on data harmonization and partially on the data storage modules. The data harmonization module is responsible for implementing data mapping mechanisms extended with ontology import interfaces and data visualization capabilities. It is divided into design time and run time submodules. The design time submodule maps the data models, whereas run time is responsible for data transformation. The data storage issue is addressed only partially, as vf-OS requires different types of storage for different types of data, thus using the advantages of various storage approaches in the best way [42]. For the storage part of this component, a Not only SQL (NoSQL) type of database, namely a graph DB, has been chosen. The chosen NoSQL DB, which is Neo4j, is able to cope with large amounts of data and provides agile querying mechanisms, allowing its use in smart storage, which not only "keeps" or stores data but also provides tools to perform reasoning operations on the stored data.

Semantic Management Component
The semantic management component contributes mainly to the design time part of the harmonization module, and it consists of two main parts ( Figure 2): (1) storage and (2) harmonization components. It enables two-way communication processes, output formatting, and a set of services related to the graph DB. It also transforms the data into a format that can be consumed by other right after the data are inserted. Furthermore, semantic management supports ontologies and data model import and integration. Some formats that are supported for import are RDF/XML (Extensible Markup Language), web ontology language (OWL) syntax extension for RDF, Turtle, and commaseparated value (CSV) data models.
The core of the storage component is a Neo4j graph DB. Neo4j provides a number of tools not only for storing the data in a graph format, but also for applying a wide variety of algorithms such as path finding, similarity, link prediction, and so on. Neo4j uses the Cypher querying language, which enables core operations of processing, reasoning, and retrieving the data. To be able to use these algorithms, the APOC plugin is used, and for importing RDF ontologies, a semantics plugin is adopted. The system also accepts data to be inserted in various formats such as JavaScript Object Notation (JSON) or Comma-separated Values (CSV), which contributes to overall interoperability without much effort to transform the data. The harmonization module of the semantic management component includes messaging, data cleaning, and composition modules.

•
The messaging module is responsible for maintaining communication between the storage and harmonization components. Moreover, it provides a Representational State Transfer (REST)based interface for outer applications. Between the harmonization and storage components, communication is accomplished through bolt connectivity, which is the network protocol running over a Transmission Control Protocol (TCP) connection. In other words, the messaging module allows requests to be received from outer applications, following the REST architectural approach, and retranslates them to storage in a bolt-compatible format.

•
The data cleaning module is needed to cope with complex output objects returned after the storage component processes a request. Neo4j-returned objects are parsed, and the required data are extracted.

•
The composition module is used to align the data that was prepared by the data cleaning module with the output format and to generate a Cypher query which can be visualized in the Neo4j viewer environment to represent the output in a human-friendly format. However, this module, as well as the data cleaning one, is only needed for some of the services, as not all of them are assumed to return a JSON output.
The semantic management component provides a set of services that are used by the data analyzer responsible for further data mapping and visualization. Some of the services, such as The core of the storage component is a Neo4j graph DB. Neo4j provides a number of tools not only for storing the data in a graph format, but also for applying a wide variety of algorithms such as path finding, similarity, link prediction, and so on. Neo4j uses the Cypher querying language, which enables core operations of processing, reasoning, and retrieving the data. To be able to use these algorithms, the APOC plugin is used, and for importing RDF ontologies, a semantics plugin is adopted. The system also accepts data to be inserted in various formats such as JavaScript Object Notation (JSON) or Comma-separated Values (CSV), which contributes to overall interoperability without much effort to transform the data.
The harmonization module of the semantic management component includes messaging, data cleaning, and composition modules.

•
The messaging module is responsible for maintaining communication between the storage and harmonization components. Moreover, it provides a Representational State Transfer (REST)-based interface for outer applications. Between the harmonization and storage components, communication is accomplished through bolt connectivity, which is the network protocol running over a Transmission Control Protocol (TCP) connection. In other words, the messaging module allows requests to be received from outer applications, following the REST architectural approach, and retranslates them to storage in a bolt-compatible format.

•
The data cleaning module is needed to cope with complex output objects returned after the storage component processes a request. Neo4j-returned objects are parsed, and the required data are extracted.

•
The composition module is used to align the data that was prepared by the data cleaning module with the output format and to generate a Cypher query which can be visualized in the Neo4j viewer environment to represent the output in a human-friendly format. However, this module, as well as the data cleaning one, is only needed for some of the services, as not all of them are assumed to return a JSON output.
The semantic management component provides a set of services that are used by the data analyzer responsible for further data mapping and visualization. Some of the services, such as importOntology or importDataModel, are intended to provide an interface for uploading and integration of RDF/XML and Turtle ontologies and data models in CSV format.
In the following sections these modules are described in more detail, and a test case is used with some chosen services (e.g., importOntology, importDataModel, and getMappingSuggestion). Detailed descriptions of these services are given below.

Get MappingSuggestion Service
The "getMappingSuggestion" provides mechanisms to map two data models. The goal of the service is to identify how close the concepts in one model are interrelated to concepts in another model. The result of mapping, as mentioned above, can be reused for further reasoning or discovery. In Figure 3, the models submitted for mapping are identified as input, which is a JSON document containing both models. From now on, we use the terms "origin" and "target" to identify the two parts of the input data model, as also shown in Figure 3. It is important to mention that the "getMappingSuggestion" service is developed for use within the vf-OS framework; thus, it has a structure based on JSON format. This is necessary to represent, if needed, the data models as a complex, structured tree of concepts with appropriate relations. In order to better understand the involved models, we introduce the metastructure of the input models ( Figure 3). These models comprise several fields as follows: • "label"-which introduces the name of the capillary/single concept; • "tag"-this field can be used to identify the attributes of the concept. The inspiration for introducing the "tag" field came from the topic of folksonomy, described in [43], where a collaborative tagging system is presented that is composed of three core entities: users, resources, and tags. Tags, compared to, for instance, resources, are in no way limited by predefined vocabulary [44]. In this work, tags are used to describe the concepts that, to some extent, can be compared with resources in folksonomy; • "xpath"-is used to show the parent concepts of the capillary/single concept. This field allows expressing the hierarchy of concepts from a specific concept to the root; • "type"-used to express the type of the concept, for instance: string.
Appl. Sci. 2019, 9, x FOR PEER REVIEW 8 of 24 importOntology or importDataModel, are intended to provide an interface for uploading and integration of RDF/XML and Turtle ontologies and data models in CSV format. In the following sections these modules are described in more detail, and a test case is used with some chosen services (e.g., importOntology, importDataModel, and getMappingSuggestion). Detailed descriptions of these services are given below.

Get MappingSuggestion Service
The "getMappingSuggestion" provides mechanisms to map two data models. The goal of the service is to identify how close the concepts in one model are interrelated to concepts in another model. The result of mapping, as mentioned above, can be reused for further reasoning or discovery. In Figure 3, the models submitted for mapping are identified as input, which is a JSON document containing both models. From now on, we use the terms "origin" and "target" to identify the two parts of the input data model, as also shown in Figure 3. It is important to mention that the "getMappingSuggestion" service is developed for use within the vf-OS framework; thus, it has a structure based on JSON format. This is necessary to represent, if needed, the data models as a complex, structured tree of concepts with appropriate relations. In order to better understand the involved models, we introduce the metastructure of the input models ( Figure 3). These models comprise several fields as follows: • "label"-which introduces the name of the capillary/single concept; • "tag"-this field can be used to identify the attributes of the concept. The inspiration for introducing the "tag" field came from the topic of folksonomy, described in [43], where a collaborative tagging system is presented that is composed of three core entities: users, resources, and tags. Tags, compared to, for instance, resources, are in no way limited by predefined vocabulary [44]. In this work, tags are used to describe the concepts that, to some extent, can be compared with resources in folksonomy; • "xpath"-is used to show the parent concepts of the capillary/single concept. This field allows expressing the hierarchy of concepts from a specific concept to the root; • "type"-used to express the type of the concept, for instance: string.  The input file containing the origin and target models is delivered to the harmonization component, which is responsible for correct communication with the graph DB, proper parsing, and integration of the input data model.
In order to clarify the process of delivering and mapping the data, we use definitions derived from the set theory and graph theory. The payload, or, in other words, data within the origin and target models, is represented as two sets: where M S stands for the origin or source data model, and M T stands for the target. Subsets A and B contain the capillary concepts derived from the "label" field. The related concepts from the "path" and "tag" fields are expressed through the T S and T T terms and Xp S and Xp T subsets. And finally, the "Type" subset represents the variety of types of concepts from A and B subsets.
As the graph DB deals with nodes and relationships, all data being part of the input are converted. Thus, the concepts become nodes and relationships that are generated according to the data model schema. The data model schema assumes that all concepts from the fields "tag" and "xpath" are directly related to the concept presented in the "label" field ( Figure 3). After the concepts provided in the origin and target models are inserted and converted into nodes, and appropriate relationships are created, the phase of data mapping starts. As mentioned in Section 2, the pathfinding approach for mapping follows the Dijkstra algorithm implemented as the APOC.algo extension of Neo4j [45]. It supports calculation of the "distance" considering the weights of the path segment. This is important in order to cope with requirements for the weighted approach stated in Section 2. For the operations within the mapping process, a graph theory apparatus is used, which allows handling nodes (called vertices) and relationships (called edges). For further convenience, the term "concept" is used as equivalent to "node" (a graph DB term) and "vertices" (a graph theory term), and "relationship" is used as equivalent to "edges" (a graph theory term). Before applying Dijkstra's algorithm, we need to formulate concept pairs. In other words, ordered pairs of concepts are built from subsets A and B in order to set source (subset A) and target (subset B) concepts for path finding. To perform mapping between the origin data model and the target data model, a Cartesian product between the origin data model, defined as set A containing elements {a 1 , a 2 , ..., a m }, and the target data model, defined as set B with elements {b 1 , b 2 , ..., b m }, is built. The Cartesian product for origin and target data models gives: i.e., However, because each element of the origin data model needs to be represented separately (in order to provide the best output of Dijkstra's algorithm for each concept in A) rather than for the whole origin data model, the concepts from A are treated one by one. Thus, the goal is not to compare the path from a 1 to b 1 with the path from a 2 to b n , but rather a 1 to b 1 with a 1 to b 2 and a 1 to b n . This also allows sparing the computational resources for excessive matching operations. This results in the following: The origin model set is represented through a number of concepts: 1 , a 2 , . . . , a m ).
Let the resulting nondirected weighted graph be G = (V, E, w), where V is a set of concepts and E is a set of relationships with w : E → R as a weight assigned to edges. After applying Dijkstra's algorithm, we obtain a set of suggested paths P for each concept in the origin model. For instance, path p ∈ P from a 1 to b 1 in G is a list of edges ((V 0 , V 1 ), (V 1 , V 2 ),..., (V k-1 , V k )), where V 0 = (a 1 ) and V k = (b 1 ), and k is the total number of concepts that are part of the path (see illustration in Figure 4). The resulting weight of the path is the sum of the weights of all relationships and is calculated as follows [46]: The number of resulting paths can vary, but for the current implementation, we have chosen four for each concept in the origin submodel. Each path built from origin to target concept is called a suggestion. As only four suggestions are required, the resulting graph includes a maximum of four subgraphs, based on the distance parameter. In the theory of semantics, this parameter can be described as the semantic distance and defined as the shortest path connecting two concepts. For instance, for a1 and b1 [47]: Thus, after the mapping process is accomplished, the result can be represented through a graph that consists of a set of subgraphs representing the suggestions for each of the origin concepts where m is the number of origin concepts, and , , … , are the graphs containing suggested paths for the corresponding origin concepts. The suggestions or suggested paths contain the set of concepts, relationships, and weights (Note: if no path exists between the source and target concepts, the result is nil.) If the service getMappingSuggestions is called again, and the input is the same as in the previous time or contains some common connected vertices, the weight of the relationship between the inputs decreases. This enables a dynamic approach to evaluate the semantic distance, compared to [11], for instance, where the authors also consider a semantic distance, but the value of each relationship is static and equal to 1. The static weight of a relationship means that the semantic distance cannot vary regarding the number of cases in which the concept was used with the related counterpart. Instead, we consider that the importance or "closeness" between interrelated concepts grows with each run of the getMappingSuggestion service with repeated relationships inside the input. An example can be coffee and cake. For instance, a person asking for a coffee also takes a cake, this means that these two concepts are, from now, interrelated, and their "closeness" grows each time someone orders coffee and cake. As Dijkstra's algorithm finds the weighted shortest path between two concepts, after each match, the weight of the corresponding relationship decreases according to: The resulting weight of the path is the sum of the weights of all relationships and is calculated as follows [46]: The number of resulting paths can vary, but for the current implementation, we have chosen four for each concept in the origin submodel. Each path built from origin to target concept is called a suggestion. As only four suggestions are required, the resulting graph includes a maximum of four subgraphs, based on the distance parameter. In the theory of semantics, this parameter can be described as the semantic distance and defined as the shortest path connecting two concepts. For instance, for a 1 and b 1 [47]: Thus, after the mapping process is accomplished, the result can be represented through a graph that consists of a set of subgraphs representing the suggestions for each of the origin concepts G = (G a 1 , G a 2 , . . . , G a m ), (9) where m is the number of origin concepts, and G a 1 , G a 2 , . . . , G a m are the graphs containing suggested paths for the corresponding origin concepts. The suggestions or suggested paths contain the set of concepts, relationships, and weights (Note: if no path exists between the source and target concepts, the result is nil.) If the service getMappingSuggestions is called again, and the input is the same as in the previous time or contains some common connected vertices, the weight of the relationship between the inputs decreases. This enables a dynamic approach to evaluate the semantic distance, compared to [11], for instance, where the authors also consider a semantic distance, but the value of each relationship is static and equal to 1. The static weight of a relationship means that the semantic distance cannot vary regarding the number of cases in which the concept was used with the related counterpart. Instead, we consider that the importance or "closeness" between interrelated concepts grows with each run of the getMappingSuggestion service with repeated relationships inside the input. An example can be coffee and cake. For instance, a person asking for a coffee also takes a cake, this means that these two concepts are, from now, interrelated, and their "closeness" grows each time someone orders coffee and cake. As Dijkstra's algorithm finds the weighted shortest path between two concepts, after each match, the weight of the corresponding relationship decreases according to: where w j is the current value of the weight between two concepts, and w j−1 is the value of weight before this iteration. In this way, the value of the weight will constantly decrease with each iteration; on the contrary, the "closeness" grows. This can be applied for two standalone concepts, or, if it is a model with a set of interconnected concepts, the weight of all segmental relationships decreases, and the "closeness" of the connected concepts grows correspondingly. Based on the suggestions built for the concepts in origin model, an output document containing the suggestions is generated, as shown in Figure 5: • "origin"-which contains the subfields "label" for the name of the concept and "XPath" containing the parent nodes; • "mapping"-represents mapping the origin and target data models, containing all suggestions for each specific concept. The subfields are: • "suggestion", with "label" naming the target concept, "score" reflecting the importance or weight of the path between "origin" and "target" concepts, "XPath" for the parent concepts, and "rank", which serves to identify which suggestion is the closest one; • "graph"-representing the full path from the "origin" to the "target" concept in a segmental way, being composed of triples of the form "start" -> "weight" -> "end"; and • "CYPHER query"-a field that contains the generated Cypher query, which can be used for visualizing and checking the suggested path. The output provides a set of suggestions for each of the origin concepts, and the Cypher query is used for visualizing and checking the suggestions. Once receiving the JSON input, the semantic manager identifies the origin and target concepts. If these two models, after insertion into the graph DB, are connected through some intermediary concepts (i.e., crossing points), a set of suggestions or the shortest paths from each origin to a set of target concepts are returned. The number of returned suggestions depends on the size, complexity, and depth of the graph; however, for this particular test case, it cannot exceed four suggestions. Nevertheless, the number of returned suggestions can be changed on demand.
The getMappingSuggestion service finds the interrelation between concepts placed in the origin and the target data models. Data models are inserted into the graph DB and, thus, are converted to a graph, enabling the use of the shortest path algorithms. The concept of interrelation is based on the "closeness", which is calculated as the shortest weighted path between each concept in the origin and a set of concepts in the target data models. The difference between the approach used in this work and previous solutions is that the weighted approach is used, which distinguishes the weight, and thus "importance", of relationships between concepts. Moreover, the weight is not static; it changes with each repetition of the relationship that is based on the co-occurrence principle. This principle assumes that if two related concepts are repeated, the "closeness" of them to each other grows, so the result of the mapping can be different in different time stamps. Another novel point is that getMappingSuggestion compares the "closeness" of a set of concepts to one chosen concept and orders them accordingly. This has good potential for further reasoning, allowing patterns in the data to be found and related concepts to be discovered.

Import Data Model Service
To enrich the output results of the getMappingSuggestion service, models built based on data generated by machines and sensors can be imported for further processing and analysis. The format for importing models is comma-separated values (CSV), which presumes data storage in a tabular way. It offers a simple way for data exchange in a tabular form. From the name itself, the data are represented in rows and columns, where the first row usually represents a set of headers that determines the meaning of the data in each column. This format can be adopted to exchange the data The output provides a set of suggestions for each of the origin concepts, and the Cypher query is used for visualizing and checking the suggestions. Once receiving the JSON input, the semantic manager identifies the origin and target concepts. If these two models, after insertion into the graph DB, are connected through some intermediary concepts (i.e., crossing points), a set of suggestions or the shortest paths from each origin to a set of target concepts are returned. The number of returned suggestions depends on the size, complexity, and depth of the graph; however, for this particular test case, it cannot exceed four suggestions. Nevertheless, the number of returned suggestions can be changed on demand.
The getMappingSuggestion service finds the interrelation between concepts placed in the origin and the target data models. Data models are inserted into the graph DB and, thus, are converted to a graph, enabling the use of the shortest path algorithms. The concept of interrelation is based on the "closeness", which is calculated as the shortest weighted path between each concept in the origin and a set of concepts in the target data models. The difference between the approach used in this work and previous solutions is that the weighted approach is used, which distinguishes the weight, and thus "importance", of relationships between concepts. Moreover, the weight is not static; it changes with each repetition of the relationship that is based on the co-occurrence principle. This principle assumes that if two related concepts are repeated, the "closeness" of them to each other grows, so the result of the mapping can be different in different time stamps. Another novel point is that getMappingSuggestion compares the "closeness" of a set of concepts to one chosen concept and orders them accordingly. This has good potential for further reasoning, allowing patterns in the data to be found and related concepts to be discovered.

Import Data Model Service
To enrich the output results of the getMappingSuggestion service, models built based on data generated by machines and sensors can be imported for further processing and analysis. The format for importing models is comma-separated values (CSV), which presumes data storage in a tabular way. It offers a simple way for data exchange in a tabular form. From the name itself, the data are represented in rows and columns, where the first row usually represents a set of headers that determines the meaning of the data in each column. This format can be adopted to exchange the data among the applications using the proprietary format. One of the main advantages of this format is its Appl. Sci. 2019, 9,4936 13 of 23 simplicity. The standard which specifies CSV is RFC 4180 [48], although separation can be made using, for instance, semicolons instead of commas.
An example of a CSV file can be a document containing data on the functioning of a machine or its parts. For instance, data about the functioning of cylinders of a tabber stringer machine may contain headers such as the current time stamp, number of cylinders, movement types, and maximal and minimal speeds. CSV files are imported into Neo4j by loading the headers first followed by all other records regarding the corresponding header. If the value in the record is complex, in other words, it consists of several parts, it can be separated and parts can be inserted independently. This service is just an auxiliary or supporting service that can enable data in CSV format to be loaded, thus enriching the knowledge base with the new data. In principle, this service treats the imported data as graph entities to apply the getMappingSuggestion service or other services that can be added on demand.

Import Ontology Service
Another core service allows us to import RDF ontologies to the semantic management component. One of the main purposes of the semantic management component is to provide an extensible environment for ontology development, improvement, and evolution. The basic principles of ontology development can be summarized as follows [49]: • reusability, when the concepts introduced in existing reliable ontologies can be reused; • semantic alignment, referring mainly to ontology interoperability, to integrate concepts from imported ontologies, as well as newly created concepts into an existing structure; • ontology design pattern usage, to ensure that the concept generation procedure can be applied not only to a single concept but to a group of concepts; and • community extensibility, assuming a collaborative perspective when one ontology, covering few use-cases, can be extended by other users of a community and, thus, be applied to more use-cases.
The process of importing ontologies should correlate with these ontology development principles. Thus, in the present work, the concepts imported within an ontology are reusable, integrated in the existing structure, and extended using other services delivered within the semantic management component.
To introduce the process of ontology import, some definitions from the graph theory are adopted. For the ontology or data model, which is already inside the graph DB, the term "initial" is used, and for ontology loaded from outside, the term "imported" is used. Both initial and imported ontologies are represented as a graph with E edges and V vertices, G = (V, E). If the database is not empty, the process of importing an ontology can be described through the logical disjunction or union. Let us assume that the database is not empty; thus, the initial data model structure that is already inside the database is G init = (V init , E init ), and the imported ontology is Using this definition, edges are imported is as follows: where n is the number of relationships connecting concepts in the initial ontology, and m is the number of relationships connecting concepts in the imported ontology. The importing of concepts can be represented through: where i is the number of concepts in the initial ontology, and j is the number of concepts in the imported ontology.
The resulting graph is represented by G res , which is the union of the initial ontology graph (already inside the DB) and the imported ontology, and therefore contains the concepts V init and V imp and relationships E init and E imp : Based on these definitions, the algorithm for importing an ontology to a nonempty graph DB is developed (Algorithm 2). If empty, the imported model is inserted as is. However, the most interesting is the first case, when the DB already contains some data models or ontologies with some common concepts and/or relationships. For this case the algorithm is as follows: Algorithm 2 Suggestion building for each concept in the origin model

end if end for each OUTPUT: G res
As mentioned before, if the database is not empty and the imported ontology contains some concepts with the same names as in the database, in other words they equally labelled, they are ignored by importOntology, but the relationships are created or updated. The importOntology can be used in conjunction with the getMappingSuggestion service, which provides the necessary design patterns to apply to groups of concepts. Finally, all users of vf-OS having the permit to create and update the ontologies and can contribute to the evolution of the ontology development environment. The procedure of requesting the importOntology service is presented in Figure 6. Considering the treatment of replicas, as in the example given in [50], concepts with equal names are removed. However, while importing two ontologies with similar concepts and applying unification procedures, the resulting concept class possesses all properties from the initial ontologies. In this work, all concepts with equal names are also removed, although we follow the approach that all properties of the concepts are inserted as separate concepts. For instance, the concept of "cake" has the property "sweet", which, after insertion, appears as another concept having a relation to "cake". This allows concepts with similar properties to be interrelated and, thus, enriches the knowledge base and avoids the creation of redundant entities. Moreover, a matching approach is made possible, which is commonly and widely used. This approach is illustrated in [51], where two ontologies are merged combining equal root concepts while differentiating between similar, but not equivalent, concepts. The process of matching ontologies is also considered in [52]. However, the authors were mostly focused on ontology matching or alignment from the same domain, and they did not consider importing cross-domain ontologies.
The importOntology service enables RDF/OWL ontologies to be imported. One of the tasks of the importOntology service, as well as of the importDataModel, is to enrich the knowledge base with data structures aggregated from various sources, such as, for instance, professional communities. The key feature of the proposed service is that it treats the properties of a concept as separate concepts, thus enabling different concepts to be interrelated based on their similar properties, which is crucial when using the graph-based approach. This might be used, for instance, as the basis for building service discovery mechanisms for offering the appropriate service. It is important to mention that, to explore the functionality of importOntology efficiently, it is used in conjunction with getMappingSuggestion. Moreover, importOntology has no domain restrictions as long as an RDF specification is used.
Considering contribution of this service to enhance collaborative mechanisms, importOntology allows aggregation of ontologies developed within various professional communities and supports identification of closely related concept patterns, which is important for facilitating a common understanding of the used terms and definitions. Another relevant feature is that importOntology, in conjunction with other developed services, allows the ontology to make and change necessary additions. For instance, the concept "cake" is related to the concept "sweets", but the definition can be improved, whereas "cake" is also "food" and "consumable thing". Thus, all members of a community can contribute to the enrichment of the knowledge base. The importOntology service algorithm has a similar starting point compared to all implemented services; when a request has been sent, an authentication procedure is passed, and the session is launched. Then, the semantic plugin is required to parse and insert the ontology, with all concepts and relations, into Neo4j. After this is accomplished, duplicate concepts are removed, and, if relationships are identical between the imported ontology and the database, the weight of the relationship is updated. In the next step, each concept is labelled within Neo4j, and the name property is extracted.
Considering the treatment of replicas, as in the example given in [50], concepts with equal names are removed. However, while importing two ontologies with similar concepts and applying unification procedures, the resulting concept class possesses all properties from the initial ontologies. In this work, all concepts with equal names are also removed, although we follow the approach that all properties of the concepts are inserted as separate concepts. For instance, the concept of "cake" has the property "sweet", which, after insertion, appears as another concept having a relation to "cake". This allows concepts with similar properties to be interrelated and, thus, enriches the knowledge base and avoids the creation of redundant entities. Moreover, a matching approach is made possible, which is commonly and widely used. This approach is illustrated in [51], where two ontologies are merged combining equal root concepts while differentiating between similar, but not equivalent, concepts. The process of matching ontologies is also considered in [52]. However, the authors were mostly focused on ontology matching or alignment from the same domain, and they did not consider importing cross-domain ontologies.
The importOntology service enables RDF/OWL ontologies to be imported. One of the tasks of the importOntology service, as well as of the importDataModel, is to enrich the knowledge base with data structures aggregated from various sources, such as, for instance, professional communities. The key feature of the proposed service is that it treats the properties of a concept as separate concepts, thus enabling different concepts to be interrelated based on their similar properties, which is crucial when using the graph-based approach. This might be used, for instance, as the basis for building service discovery mechanisms for offering the appropriate service. It is important to mention that, to explore the functionality of importOntology efficiently, it is used in conjunction with getMappingSuggestion. Moreover, importOntology has no domain restrictions as long as an RDF specification is used.
Considering contribution of this service to enhance collaborative mechanisms, importOntology allows aggregation of ontologies developed within various professional communities and supports identification of closely related concept patterns, which is important for facilitating a common understanding of the used terms and definitions. Another relevant feature is that importOntology, in conjunction with other developed services, allows the ontology to make and change necessary additions. For instance, the concept "cake" is related to the concept "sweets", but the definition can be improved, whereas "cake" is also "food" and "consumable thing". Thus, all members of a community can contribute to the enrichment of the knowledge base.

Test Case Scenario and Discussion
The goal of this work is to enable a knowledge base that can accumulate various ontologies and data models, whereas a solution to the main challenges of semantic discovery architectures can be provided [18]. This can be accomplished by considering weighted ontologies and adaptive techniques for collaborative knowledge base formation. Therefore, the aim is not to improve the timing of existing methods and solutions, but rather to focus on functionality. Another goal is to show how the export of ontologies in conjunction with collaborative knowledge base formation can contribute to improving the discovery mechanisms.
As a proof-of-concept test case, we decided to import an ontology developed in the framework of the Federated Interoperable Semantic IoT Testbeds and Applications (FIESTA)-IoT project, which aims at covering the core notions of the IoT [53]. The ontology uses the resource description framework (RDF) graph vocabulary and schema with some core constructs such as class, property, type, subClassOf, subPropertyOf, domain, and range [54]. Moreover, web ontology language (OWL) syntax is integrated into the RDF schema to extend its functionality and thus enrich it with new meanings of triples [55]. To implement the mapping, the semantic plugin for Neo4j was used [56], which ensures proper export of the ontology into the database.
Besides the RDF graph extended with OWL syntax as the input for the semantic management component, as described in previous sections, the input model in JSON format was used. For testing purposes, importOntology and getMappingSuggestion services were chosen to demonstrate how the importOntology service can contribute to knowledge base enrichment, improvement of the mapping output, and introducing collaborative mechanisms for improving the knowledge base. Moreover, a second purpose is to demonstrate the capabilities of the combined use of getMappingSuggestion with importOntology. First, the importOntology service was executed, which loads the ontology to the graph DB. In the next stage, getMappingSuggestionService was activated to produce a complex mapping result with a set of suggestions for each concept in the origin submodel of the input document ( Figure 7).

Test Case Scenario and Discussion
The goal of this work is to enable a knowledge base that can accumulate various ontologies and data models, whereas a solution to the main challenges of semantic discovery architectures can be provided [18]. This can be accomplished by considering weighted ontologies and adaptive techniques for collaborative knowledge base formation. Therefore, the aim is not to improve the timing of existing methods and solutions, but rather to focus on functionality. Another goal is to show how the export of ontologies in conjunction with collaborative knowledge base formation can contribute to improving the discovery mechanisms.
As a proof-of-concept test case, we decided to import an ontology developed in the framework of the Federated Interoperable Semantic IoT Testbeds and Applications (FIESTA)-IoT project, which aims at covering the core notions of the IoT [53]. The ontology uses the resource description framework (RDF) graph vocabulary and schema with some core constructs such as class, property, type, subClassOf, subPropertyOf, domain, and range [54]. Moreover, web ontology language (OWL) syntax is integrated into the RDF schema to extend its functionality and thus enrich it with new meanings of triples [55]. To implement the mapping, the semantic plugin for Neo4j was used [56], which ensures proper export of the ontology into the database.
Besides the RDF graph extended with OWL syntax as the input for the semantic management component, as described in previous sections, the input model in JSON format was used. For testing purposes, importOntology and getMappingSuggestion services were chosen to demonstrate how the importOntology service can contribute to knowledge base enrichment, improvement of the mapping output, and introducing collaborative mechanisms for improving the knowledge base. Moreover, a second purpose is to demonstrate the capabilities of the combined use of getMappingSuggestion with importOntology. First, the importOntology service was executed, which loads the ontology to the graph DB. In the next stage, getMappingSuggestionService was activated to produce a complex mapping result with a set of suggestions for each concept in the origin submodel of the input document ( Figure 7). The input model, as described above, contains two main parts, the origin submodel and the target submodel, which are mapped. Following the definitions described in Section IV, the input model that is used for the test case is illustrated in Figure 8 Each origin concept is mapped to every target concept and produces a suggestion that is ranked according to the weight of the path between the origin and target concepts: The input model, as described above, contains two main parts, the origin submodel and the target submodel, which are mapped. Following the definitions described in Section 4, the input model that is used for the test case is illustrated in Figure 8 In the origin and target models, there are several IoT resources and services, but also location names and company names. In an extended ontology (not in FIESTA ontology), companies can be considered as users or service providers. Using the output from the semantic management component, further knowledge extraction can be made, and "optimal" mapping patterns can be found. To better describe and explain the process of ontology metamorphosis and generation of complex outputs, a conceptual architecture for semantic lifting [57] was adopted ( Figure 9). This architecture is intended to provide the necessary basis for ensuring interoperability among model elements and ontology concepts. The conceptual architecture contains three scales: the first reflects the syntax and semantic annotations used. The input model for the getMappingSuggestion service, containing origin and target data models, follows JSON formatting, with an internal semantic developed for the semantic management component to be consumed by other vf-OS framework components, including the data integrator. Ontologies that can be imported by the semantic management component are of two main Each origin concept is mapped to every target concept and produces a suggestion that is ranked according to the weight of the path between the origin and target concepts: In the origin and target models, there are several IoT resources and services, but also location names and company names. In an extended ontology (not in FIESTA ontology), companies can be considered as users or service providers. Using the output from the semantic management component, further knowledge extraction can be made, and "optimal" mapping patterns can be found. To better describe and explain the process of ontology metamorphosis and generation of complex outputs, a conceptual architecture for semantic lifting [57] was adopted ( Figure 9). This architecture is intended to provide the necessary basis for ensuring interoperability among model elements and ontology concepts. In the origin and target models, there are several IoT resources and services, but also location names and company names. In an extended ontology (not in FIESTA ontology), companies can be considered as users or service providers. Using the output from the semantic management component, further knowledge extraction can be made, and "optimal" mapping patterns can be found. To better describe and explain the process of ontology metamorphosis and generation of complex outputs, a conceptual architecture for semantic lifting [57] was adopted ( Figure 9). This architecture is intended to provide the necessary basis for ensuring interoperability among model elements and ontology concepts. The conceptual architecture contains three scales: the first reflects the syntax and semantic annotations used. The input model for the getMappingSuggestion service, containing origin and target data models, follows JSON formatting, with an internal semantic developed for the semantic management component to be consumed by other vf-OS framework components, including the data The conceptual architecture contains three scales: the first reflects the syntax and semantic annotations used. The input model for the getMappingSuggestion service, containing origin and target data models, follows JSON formatting, with an internal semantic developed for the semantic management component to be consumed by other vf-OS framework components, including the data integrator. Ontologies that can be imported by the semantic management component are of two main types: RDF/XML extended with OWL syntax and Turtle. The metamodel layer represents the mapping of the imported ontology with origin and target data models. In the modeling layer, the initial ontology is modified in consideration with the origin and target models. Moreover, the output model is generated while combining both input models and modified ontology.
Resulting suggestions contain the first node acquired from origin model, the last node from the target model, and the sequence of the in-between nodes. These nodes can be acquired both from the input models and modified ontology, where all new concepts, if any, are already incorporated. The whole process is represented in Figure 9.
The first column in Figure 10 identifies the origin and target concepts (i.e., labels between the established mappings). For this specific case, one of the origin concepts has been chosen, namely the "Emergency Stop" actuating device, with appropriate suggestions that are presented in the target model of Figure 8. The second column is used to represent the Cypher queries generated to query the result. The concepts in the graph are marked as follows: (i) blue concepts were extracted from origin data model; (ii) green concepts were from the target data model; and (iii) grey concepts were extracted from imported ontology. The records in Figure 10 represent suggestions after the mapping process. Figure 10 also represents how the suggestions for the same origin and target concepts are changed if the relations among intermediary concepts evolve. As it can be seen, the number of intermediary nodes increased, though the distance or semantic distance between the nodes "Emergency Stop" and "Assembly" decreased, making the new suggestion preferable.

Conclusions
The aims of Industry 4.0 set a wide range of challenges both technical and organizational in nature. Many initiatives have been launched all around the world to enable the technological support for the fourth industrial revolution. CPSs are one of the main constituents, covering both physical and virtual components, that can generate and process rich ascending and descending data flows. Therefore, data and knowledge assets are turning into a resource that needs to be exploited to enable higher intelligence and autonomy of the systems and subsystems. In this work, we presented a part of the vf-OS initiative: the semantic management component. Some of the most important services of this component, such as importOntology, importDataModel, and getMappingSuggestion, were As an example, the "Emergency Stop" concept from the input model is chosen. The resulting graph containing all 5 concepts from the origin data model is G res = (Manu f acturing, Conveyor, Emergency Stop, Company_X, Lisbon), (16) and the subgraph containing the suggestion for "Emergency Stop" is "Emergency Stop" = V pw , E pw , w pw , where V pw is a set containing all the vertices forming the resulting graph based on four suggestions ( Figure 10 and w pw stands for a set of weights of the path segments in the suggestions. Thus, for the "Emergency Stop" concept, the set of segment weights for suggestions is as follows: where four suggested paths are generated for each origin concept. For instance, the first element of the set is (0.2,0.2,0.2) which is the weights for segments "Emergency Stop" -> "Conveyor", "Conveyor" -> "Manufacturing" and "Manufacturing" -> "Assembly" of the first suggestion. The distance from origin concept "Emergency Stop" to target concept "Assembly" is d(Emergency Stop, Assembly) = min w(p) : p 1 = (0.6) .
It is clear that some concepts overlap, as they are members of both the imported ontology and origin data or target data model. However, without the imported ontology, no relation among the concepts can be established, as the ontology extends the origin and target with missing relations to other concepts, which makes the mapping process possible. Without the imported ontology, the output result of getMappingSuggestion would be an empty object. This case is described in the last record in Figure 10, with the note that no ontology is being imported.

Conclusions
The aims of Industry 4.0 set a wide range of challenges both technical and organizational in nature. Many initiatives have been launched all around the world to enable the technological support for the fourth industrial revolution. CPSs are one of the main constituents, covering both physical and virtual components, that can generate and process rich ascending and descending data flows. Therefore, data and knowledge assets are turning into a resource that needs to be exploited to enable higher intelligence and autonomy of the systems and subsystems. In this work, we presented a part of the vf-OS initiative: the semantic management component. Some of the most important services of this component, such as importOntology, importDataModel, and getMappingSuggestion, were described and discussed. The focus has been made on data mapping and collaborative enriching of the data-rich environment. Further efforts were made to test the semantic management component through a combination of services, showing how imported ontologies or data models can influence the suggestions produced by the data mapping service. This is important since (i) relations among concepts are not static and can evolve over time, and (ii) import of new ontologies and data models leads to introducing new concepts and relationships. These factors affect the outcome of the suggestions produced for a given concept.
As a part of further future work, some other services that add additional functionalities are to be developed. For instance, a service that can enable creation of ontologies within the graph DB as logical partitions is needed. Another service should enable reasoning over the knowledge base, using services developed within this work. An interface for integrating external data services for improving the mapping output is also in the scope of further research, as well as a service for integrating and reasoning over data models containing concepts that are synonyms or are very contextually interrelated. Furthermore, the developed services will also be tested with real-world data from smart home components and applications. Smart homes are a promising area, as it is crucial to find interdependencies among "things" represented as concepts and produce mappings between data models for various purposes such as optimization, hierarchy detection, and so on. Another promising task that is planned for the next stages is applying the semantic management component to support establishing smart things coalitions for the CPS.