Article

Deep Model-Based Security-Aware Entity Alignment Method for Edge-Specific Knowledge Graphs

1 Department of Industrial Engineering, Sungkyunkwan University, Suwon 16419, Korea
2 2nd R&D Institute, Agency for Defense Development, Seoul 05771, Korea
* Author to whom correspondence should be addressed.
Sustainability 2022, 14(14), 8877; https://doi.org/10.3390/su14148877
Submission received: 9 June 2022 / Revised: 11 July 2022 / Accepted: 18 July 2022 / Published: 20 July 2022

Abstract

This paper proposes a deep model-based entity alignment method for edge-specific knowledge graphs (KGs) that resolves the semantic heterogeneity between the data of edge systems. We first analyze the edge-specific KGs to identify their unique characteristics and then develop the alignment method around those characteristics. Because the edge-specific KGs are composed mainly of instance entities rather than concept entities, the proposed method performs the entity alignment using a data-centric rather than a topological view of the graph. Two deep models are applied to model learning: BERT (bidirectional encoder representations from transformers) for the concept entities and GAN (generative adversarial networks) for the instance entities. Because these deep models are neural network models that humans cannot interpret, they make it possible to secure the data on the edge systems. The two separately trained models are then integrated using a graph-based deep learning model, the GCN (graph convolution network), and the integrated deep model is used to align the entities in the edge-specific KGs. To evaluate the proposed method, we compare it with state-of-the-art entity alignment methods on two experimental datasets drawn from DBpedia, YAGO, and Wikidata. In terms of Hits@k, mean rank (MR), and mean reciprocal rank (MRR), the proposed method shows the best predictive and generalization performance for the KG entity alignment.

1. Introduction

As edge computing becomes more prevalent, cooperation and collaboration between the multiple edge systems (ESs) required to perform a specific task (e.g., service delivery, data privacy, and security) become important [1,2,3]. For the ESs to collaborate and cooperate seamlessly, the heterogeneity of the data they must share, particularly the semantic heterogeneity, must be addressed [4,5]. Until recently, the ontology, which is domain-specific and has a rigid structure, was the primary means of resolving the semantic heterogeneity of the data in the ESs [6,7]. However, because the components of the ESs change easily and frequently, their operating environments can be very dynamic [8]. In addition, some tasks require collaboration between heterogeneous ESs across multiple domains [9]. It is very difficult to accommodate these changes in an ontology while respecting the security and privacy policies of the ESs. Thus, developing an ontology that encompasses such diverse domains for dynamic and collaborative edge computing is a non-trivial task.
To overcome the limitations of the ontology, attention has turned to resolving the semantic heterogeneity of the ES data with a knowledge graph (KG), which can structurally represent human knowledge using entities, relations, and semantic descriptions [10]. By simply adding the entities and relationships related to a changed environment, the KG can rapidly and easily reflect dynamic changes of the ESs. Furthermore, it can support diverse knowledge models that are not limited to a specific domain [11]. At present, as an early stage of applying the KG, researchers are using edge-specific KGs to resolve the semantic heterogeneity of the internal data necessary to ensure the security and privacy of the ESs [12]. However, as cooperation and collaboration between the ESs become common, a new method of using the edge-specific KGs will be required, one that resolves the semantic heterogeneity of the data needed for cooperation and collaboration while maintaining the security and data privacy of each ES. To develop such a method, we proceeded as follows.
First, we explore the unique characteristics of the edge-specific KGs. To resolve the semantic heterogeneity of the data used in multiple heterogeneous ESs, the relations between them, especially the equivalence relation, must be clearly identified. Fortunately, entity alignment, which identifies the equivalence relation between entities in different KGs, has already been proposed [13]. Unfortunately, the KGs used in entity alignment are general-purpose KGs, such as DBpedia and YAGO, whose properties differ from those of the edge-specific KGs [14]. So, before proposing an entity alignment method suitable for the edge-specific KGs, we conducted research to find the unique characteristics of the edge-specific KGs.
Second, based on the unique characteristics of the edge-specific KGs, we propose a novel entity alignment method for the edge-specific KGs, named the deep model-based security-aware entity alignment method. The properties of the proposed method are as follows. Deep models are used to learn the semantic relations between the concept and instance entities that are the target of the entity alignment. Because the properties of the concept and instance entities are very different, two deep models are trained separately: BERT (bidirectional encoder representations from transformers) [15] for the concept entities and GAN (generative adversarial networks) [16] for the instance entities. In particular, for the security and privacy of the ES data, the data must not be exposed outside the ESs during the entity alignment process. To this end, we utilize deep models, that is, neural network models that humans cannot interpret. These neural network models can perform the entity alignment while considering the security and privacy of the ES data because they hide the model’s input data and hyperparameters.
Third, this paper proposes a graph-based deep learning model that can integrate the learning models for the BERT for the concept entities and the GAN for the instance entities. Although the deep model learning for the entity alignment was separately performed considering the characteristics of the concept and the instance entities, information sharing between them is required to perform the entity alignment robustly because the instance entities as well as the concept entities may be used to describe the concept entities. Conversely, the concept entities, as well as the instance entities, may be used to describe the instance entities. A flexible graph-based deep learning model named GCN (graph convolution network) is utilized to integrate two deep models trained on two types of entities with different characteristics [17]. So, this paper attempts to merge the learned models for concept and instance entities into one deep model by borrowing graph-based deep learning models and transfer learning approaches.
This paper is organized as follows. In Section 2, we describe the characteristics of the general-purpose KG and the edge-specific KG. Section 3 presents the overall architecture of the deep model-based security-aware entity alignment framework and the detailed process of each module. In Section 4, we conduct experiments that compare the proposed method with existing methods for KG entity alignment. The related studies on the KGs in edge computing and KG entity alignment are summarized in Section 5. Finally, Section 6 presents the conclusions and suggests further research on a security-aware KG entity alignment framework for edge computing.

2. General-Purpose KG vs. Edge-Specific KG

As mentioned earlier, the entity alignment method has already been proposed to increase the utilization of the KGs [18]. Conventional entity alignment relies on the following characteristics of the KGs. In many applications, the KGs are used as graph-structured knowledge bases (KBs) that can structurally represent human knowledge using entities, relations, and semantic descriptions [10]. Many KGs, including YAGO, DBpedia, and the Google Knowledge Graph, have already been developed and used [19]. These are general-purpose KGs that include comprehensive and multidisciplinary knowledge, and they therefore have the following properties. First, because they represent knowledge across multiple domains, the general-purpose KGs are created mainly for concept entities, which have a higher level of abstraction, rather than for instance entities, which have a lower level of abstraction. Regardless of the domain, a general-purpose KG describes in detail the concept entities themselves, their relations to other concept entities (e.g., taxonomic relations), and their literal and textual descriptions. Second, the general-purpose KGs contain multidisciplinary information and are large-scale [20,21]. Therefore, it is highly likely that the entities to be aligned and their information, such as their taxonomic relations and their literal and textual descriptions, are included in several general-purpose KGs; that is, multiple general-purpose KGs may contain the same concept and instance entities. Of course, depending on the development intention of each general-purpose KG, the concept and instance entities may or may not be represented in a similar structure. Third, the general-purpose KGs emerged to increase the utilization of information by sharing and disclosing all information, such as the concepts and instances, to users through the connection of multidisciplinary information. In other words, their degree of information openness is very high.
As the goal of this paper is to propose an entity alignment method for the edge-specific KGs, we analyze the edge-specific KGs along the same three properties used above for the general-purpose KGs: the level of abstraction, the degree of data overlap, and the information openness. The first is the level of abstraction. Unlike the general-purpose KGs, which are developed to perform domain-independent operations, the edge-specific KGs are created to support only the functions of the corresponding ESs [22]. Furthermore, the primary function of the edge-specific KGs is to resolve the heterogeneity of the data collected by the sensing devices of the edges, so they are composed mainly of instance entities rather than concept entities. In contrast to the concept entities, most instance entities are leaf nodes of the KGs and therefore lack taxonomic relations, literals, and textual descriptions. The second is the degree of data overlap. As an edge-specific KG contains only information related to specific ESs, its scale is small, and its information content is very specific. Therefore, the possibility that the edge-specific KGs contain the overlapped information necessary for the entity alignment is very low. The third is the degree of information openness. The edge-specific KGs disclose some concepts inherited from the general-purpose KGs but suppress external exposure of the users’ private information and the sensory data of the ESs by modeling these data as instances to be protected. Therefore, their degree of information openness is very low. In this light, the conventional entity alignment methods, which rely on the characteristics of the general-purpose KGs, cannot be applied to the alignment of the edge-specific KGs. So, this paper proposes a novel entity alignment method specialized for the edge-specific KGs. The comparison results are summarized in Table 1.
Based on the above analysis, we discovered the characteristics that the entity alignment method of the edge-specific KGs should have; these are as follows.
First, the fact that the edge-specific KGs are composed mainly of instance entities rather than concept entities should be reflected. For this, an entity alignment method that uses the characteristics of the data, rather than the graph topological similarity of the entities, should be devised. To do so, the proposed method performs two-way entity alignment for the concept and instance entities using different data-centered deep models.
Second, the concept entities have rich information, such as taxonomic relations, literals, and textual descriptions, but with a very small amount of overlapped information needing to be aligned. Therefore, it is tough to align them only with the information of the concept entities contained in the edge-specific KGs. To reduce the difficulty, this paper attempts to expand the range of information available into external information, such as general-purpose KGs and web documents.
Third, for the security and privacy of the instance entities in the edge-specific KGs, the sensing data should not be exposed to the outside during the alignment process. In this light, this paper uses the GAN, which is a neural network model that humans cannot interpret because it is a black-box model. As a result, it is possible to perform alignment considering the security and privacy of the instance entities by hiding the input data and the hyperparameters of the model. In Section 3, we delve into the entity alignment method of the edge-specific KGs in detail.

3. Overall Framework

Based on the characteristics that the entity alignment method of the edge-specific KGs should have, this paper proposes a novel alignment method. The proposed method is conducted in three phases. The first phase classifies the entities of each edge-specific KG into concept and instance entities. To do so, we perform graph clustering utilizing the graph properties of the edge-specific KGs. The second phase, which is the core of the proposed method, performs the learning needed to align the concept and instance entities. Considering the properties of the two entity types, a language model is used for the concept entities, and a GAN model is used for the instance entities. In the third phase, the two learned models are merged to generate an alignment model for the edge-specific KGs. The overall framework of the proposed method is depicted in Figure 1.

3.1. Graph Clustering-Based Concept and Instance Entities Identification

The concept and instance entities are mixed in the general-purpose KG as it is a large dataset that contains sufficient information for entity alignment. There is no problem in performing the entity alignment without a distinction between the concept and the instance entities in the general-purpose KG. On the other hand, for the edge-specific KGs constructed for the specific purposes, the volume of the dataset is significantly smaller than that of the general-purpose KGs. It makes it difficult to learn relationships between entities for the predictive model, which significantly reduces the performance of the entity alignment. To improve the performance, we first distinguish the concept and instance entities from the edge-specific KGs.
To identify the concept and instance entities from the edge-specific KGs, we first explored the structural characteristics of the concept and instance entities in the edge-specific KGs. From the point of view of the graph’s structure, the concept entities have many links and have greater outdegree than indegree because they are used to explain other concepts and instances in the edge-specific KGs. In contrast, instance entities have fewer links and have greater indegree than outdegree because they are the physical objects that reify the concept entities [23]. Using these structural characteristics, we will identify the concept and instance entities in the edge-specific KGs. Centrality is a representative metric that uses the graph’s structure, especially the number of connections with other nodes, to find important nodes included in the graph. The more connections a node has, the more important it is. In other words, a node with a large centrality is an important node. Let us project these centrality properties onto the identification of conceptual and instance entities. So, an entity with a large centrality value, that is, an entity with an outdegree greater than the indegree, becomes a concept entity. In contrast, an entity with a small value, that is, an entity with an outdegree smaller than the indegree, becomes an instance entity.
There are three types of centrality metrics used in the structural analysis of graphs: degree, closeness, and betweenness [24,25]. Degree centrality can determine which entities have the most links. It is calculated based on the number of relations between two arbitrary entities in the edge-specific KGs [26]. However, as it only considers one-hop neighbors, it is difficult to capture the structural difference between the concepts and the instances. To reduce the difficulty, we simultaneously use betweenness centrality and closeness centrality, which can reflect the different aspects of multi-hop neighbor structures. The betweenness centrality is based on the paths of two arbitrary entities, which reflects the density of the path, while the closeness centrality is based on their distances, which reflects the length of the paths. So, we use all three centrality indices simultaneously to capture the structural information of the KGs from multiple perspectives.
The centrality indices we use are represented as follows, respectively.
$$dc(e_x) = \lambda \cdot \log\left(1 + \frac{deg^{+}(e_x)}{deg^{-}(e_x)}\right)$$
$$bc(e_x) = \sum_{x \neq y \neq z \in G} \frac{\sigma_{yz}(e_x)}{\sigma_{yz}}$$
$$cc(e_x) = \frac{1}{\sum_{x \neq y \in G} d(e_x, e_y)}$$
where $dc(e_x)$, $bc(e_x)$, and $cc(e_x)$ are the degree centrality, betweenness centrality, and closeness centrality of entity $e_x$, respectively; $deg^{+}(e_x)$ and $deg^{-}(e_x)$ are the outdegree and indegree of $e_x$, respectively; and $\lambda$ is the scale coefficient. In addition, $\sigma_{yz}$ is the total number of shortest paths from $e_y$ to $e_z$, $\sigma_{yz}(e_x)$ is the number of those paths that pass through $e_x$, and $d(e_x, e_y)$ is the distance between two arbitrary entities $e_x$ and $e_y$ ($e_x, e_y, e_z \in KG$, $x \neq y \neq z$).
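The three indices can be illustrated with a small, standard-library Python sketch. It is not the authors' implementation: the function names `bfs` and `centralities` are hypothetical, path-based measures are computed on the undirected view of the graph with hop counts as distances, ordered pairs $(y, z)$ are counted in the betweenness sum, and an indegree of zero is replaced by 1 to avoid division by zero. All of these are assumptions made only for illustration.

```python
import math
from collections import deque

def bfs(adj, source):
    """Hop distances and shortest-path counts from source (unweighted graph)."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def centralities(edges, lam=1.0):
    """Return {entity: (dc, bc, cc)} for directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_deg = {n: 0 for n in nodes}
    in_deg = {n: 0 for n in nodes}
    adj = {n: set() for n in nodes}  # assumption: undirected view for paths
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
        adj[src].add(dst)
        adj[dst].add(src)
    paths = {n: bfs(adj, n) for n in nodes}
    result = {}
    for x in nodes:
        # Degree centrality; assumption: indegree 0 is treated as 1.
        dc = lam * math.log(1 + out_deg[x] / max(in_deg[x], 1))
        dist_x, sigma_x = paths[x]
        bc = 0.0
        for y in nodes:
            dist_y, sigma_y = paths[y]
            if y == x or x not in dist_y:
                continue
            for z in nodes:
                if z in (x, y) or z not in dist_y or z not in dist_x:
                    continue
                # e_x lies on a shortest y-z path iff the distances add up.
                if dist_y[x] + dist_x[z] == dist_y[z]:
                    bc += sigma_y[x] * sigma_x[z] / sigma_y[z]
        total = sum(d for n, d in dist_x.items() if n != x)
        cc = 1.0 / total if total else 0.0
        result[x] = (dc, bc, cc)
    return result
```

In a toy graph where one hub entity points to several leaves, the hub receives the largest value on all three indices, which matches the intuition that concept entities are the well-connected nodes.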
Using the calculated centrality values, we perform spectral clustering with the number of clusters $k = 2$; spectral clustering shows the best performance for data in which dimensionality reduction is essential, such as the edge-specific KGs. Based on the centrality values, the Euclidean distance between two arbitrary entities $e_x$ and $e_y$, $d(e_x, e_y)$, is calculated as follows.
$$d(e_x, e_y) = \sqrt{\left(dc(e_x) - dc(e_y)\right)^2 + \left(bc(e_x) - bc(e_y)\right)^2 + \left(cc(e_x) - cc(e_y)\right)^2}$$
Using the distances between the entities, the affinity matrix $W$ of the edge-specific KG is calculated. The affinity value $w_{ij}$ for $e_i$ and $e_j$ in $W$ is calculated as follows.
$$w_{ij} = \exp\left(-\frac{d(e_i, e_j)^2}{\sigma^2}\right)$$
where $d(e_i, e_j)$ is the distance between $e_i$ and $e_j$, and $\sigma$ is the Gaussian kernel parameter.
For the two clusters $C_1$ and $C_2$ ($C_1 \cap C_2 = \emptyset$) that result from the spectral clustering, the concept and instance entities are determined. Because the two clusters are identified by a centrality-based distance measure, one cluster contains the entities with a high outdegree relative to the indegree (i.e., the concept entities), and the other contains the entities with a low outdegree relative to the indegree (i.e., the instance entities). The entities in the cluster with the larger sum of centrality values are determined to be the concept entities, and those in the cluster with the smaller sum are determined to be the instance entities. Finally, we obtain the concept entities ($ce_x$) and the instance entities ($ie_x$).
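The clustering step can be sketched in plain Python as follows. This is an illustrative two-way spectral partition under stated assumptions, not the authors' code: the names `affinity_matrix`, `spectral_bipartition`, and `label_clusters` are hypothetical, the Gaussian kernel scaling is assumed, the split uses the sign of the second eigenvector of the normalized affinity matrix (found by power iteration with deflation), and the "sum of centrality" of a cluster is taken as the sum of all three centrality values of its members.

```python
import math

def affinity_matrix(features, sigma=1.0):
    """Gaussian affinity from Euclidean distances between centrality vectors."""
    n = len(features)
    return [[math.exp(-math.dist(features[i], features[j]) ** 2 / sigma ** 2)
             for j in range(n)] for i in range(n)]

def spectral_bipartition(W, iters=500):
    """Split entities into two clusters by the sign of the second eigenvector."""
    n = len(W)
    deg = [sum(row) for row in W]
    # Normalized affinity P = D^{-1/2} W D^{-1/2}; its top eigenvector is D^{1/2} 1.
    P = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    norm = math.sqrt(sum(deg))
    v1 = [math.sqrt(d) / norm for d in deg]
    v = [float(i + 1) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        # Deflation: remove the component along the top eigenvector.
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]
        # Power step on (P + I) / 2, whose eigenvalues lie in [0, 1].
        v = [0.5 * (sum(P[i][j] * v[j] for j in range(n)) + v[i]) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    return [0 if a < 0 else 1 for a in v]

def label_clusters(features, labels):
    """Name the cluster with the larger total centrality 'concept'."""
    totals = {0: 0.0, 1: 0.0}
    for vector, cluster in zip(features, labels):
        totals[cluster] += sum(vector)
    concept = max(totals, key=totals.get)
    return ["concept" if c == concept else "instance" for c in labels]
```

Each feature vector here is a $(dc, bc, cc)$ triple. In practice, a library eigen-solver would replace the hand-written power iteration; it is written out only to keep the sketch self-contained.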

3.2. Semantic Relation Learning Module

3.2.1. Language Model-Based Learning for the Concept Entities

As mentioned earlier, the concept entities have rich information, such as taxonomic relations, literals, and textual descriptions, but with a very small amount of overlapped information needing to be aligned. To address the lack of overlapped information, this paper proposes a method to learn the equivalence relations between the concept entities using not only KGs but also pre-trained language models and external sources. At this time, the pre-trained language models and external sources are used to resolve the lack of overlapped information by learning the generalized lexical relationships between two concept entities, predicting equivalence relations, and generating additional overlapped information. The language model-based learning is composed of three steps, which are depicted in Figure 2.
  • Step 1 Data Collection
To train the pre-trained language model, enough input data must be secured. However, the only data currently available are the concept and instance entities identified from the edge-specific KGs. As there are very few concept entities in the edge-specific KGs, it is impossible to train a language model with them alone. To increase the number of concept entities, we extract the partial graphs related to the concept entities from the general-purpose KGs, from which the edge-specific KGs inherit some of their concept and instance entities. However, the concept entities in the general-purpose KGs still do not provide sufficient data to train a language model, as they contain only graph-structured information about the concept entities. To compensate for the lack of training data, we randomly sample documents whose vocabulary is similar to the concept entities from external sources such as web documents. The general-purpose KGs, the concept-related partial graphs, and the web documents are defined as follows.
Definition 1.
General-purpose knowledge graph ($KG_{gp}$). $KG_{gp}$ is an overarching KG that contains a wide variety of domains and their information, which consists of entities, literals, and relations. It can be represented as a set of subject–property–object triples, as follows.
$$KG_{gp} = \{(s_i, p_i, o_i) \mid s_i \in E_{gp},\ p_i \in R_{gp},\ o_i \in E_{gp} \cup L_{gp}\}$$
where $s_i$ represents the subject, $p_i$ represents the property, $o_i$ represents the object, $E_{gp}$ represents the set of entities, $R_{gp}$ represents the set of relations, and $L_{gp}$ represents the set of literals contained in $KG_{gp}$.
Definition 2.
Concept-related partial KGs ($pKG_x$). $pKG_x$ is a partial graph of $KG_{gp}$ that is directly connected to a concept entity ($ce_x$).
$$pKG_x = \{(s_x, p_x, o_x) \mid \{s_x, o_x\} \cap \{ce_x \mid \forall x\} \neq \emptyset,\ (s_x, p_x, o_x) \in KG_{gp}\}$$
where $ce_x$ is an entity identified as a concept entity.
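Definitions 1 and 2 can be read as a simple filter over a set of triples. The sketch below assumes that the general-purpose KG is held in memory as Python tuples and that "directly connected" means the concept entity appears as the subject or the object of a triple; the function name `partial_kg` and the toy identifiers are hypothetical.

```python
def partial_kg(kg_gp, concept_entity):
    """Triples of kg_gp whose subject or object is the given concept entity."""
    return {(s, p, o) for (s, p, o) in kg_gp if concept_entity in (s, o)}

# Toy general-purpose KG as subject-property-object triples (made-up names).
toy_kg = {
    ("Thermometer", "subClassOf", "Sensor"),
    ("Sensor", "subClassOf", "Device"),
    ("Device", "label", "device"),
}
sensor_graph = partial_kg(toy_kg, "Sensor")
```

Here `sensor_graph` keeps the two triples that mention `Sensor` and drops the triple that only concerns `Device`.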
Definition 3.
Sample document ($D_d$). $D_d$ is the $d$-th sample document that includes any concept entity as a word of the document.
$$D_d = \{\ldots, w_{dt}, \ldots \mid \{ce_x \mid \forall x\} \cap \{w_{dt} \mid \forall t\} \neq \emptyset\}$$
where $w_{dt}$ is the $t$-th word of the sample document $D_d$.
  • Step 2 LDA-based topic modeling
In this step, only the sample documents ($D_d$) and partial KGs ($pKG_x$) that are semantically similar to a concept entity are associated with the specific $ce_x$. To do so, the latent Dirichlet allocation (LDA) topic model is utilized to discover the abstract topics, and their vectors, occurring in a collection of documents $D = \{D_d \mid \forall d\}$. Because the LDA model takes a set of words as input, the partial KGs and sample documents are transformed into sets of words. As it is simple to transform the sample documents into sets of words, only the process of transforming a partial KG into a set of words is described.
To select and transform only the elements related to the concept entity from $pKG_x$ into documents, we utilize ‘owl:sameAs’ and ‘skos:exactMatch’, which indicate that two URIs are equivalent. Specifically, we randomly select from $pKG_x$ the pairs of concept entities ($PCE$) that are linked by the relation ‘owl:sameAs’ or ‘skos:exactMatch’. The $PCE$ is represented as follows.
$$PCE = \{(ce_x, ce_y) \mid rel(ce_x, ce_y) = \text{owl:sameAs or skos:exactMatch}\},\quad ce_x \neq ce_y,\ (ce_x, ce_y) \equiv (ce_y, ce_x)$$
For all pairs in the $PCE$, we compute the number of hops between the $PCE$ pairs and the elements of $pKG_x$. We then regenerate $pKG_x$ using only the elements connected within $n$ hops. These elements of $pKG_x$ are used to create a set of words, that is, a document to be input into the LDA topic model. The word set of $pKG_x$ ($pD_x$) is constructed by extracting only the names of the entities included in $pKG_x$. It is represented as follows.
$$pD_x = \{\ldots, n(E_{gp})_{xt}, \ldots \mid E_{gp} \in pKG_x,\ \forall x\}$$
where $n(E_{gp})_{xt}$ represents the $t$-th name of the entity $E_{gp}$ that is included in $pKG_x$.
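The hop-based filtering and the construction of the word set can be sketched as a breadth-first search from the entities of the selected pairs. This is only an illustration: the names `entities_within_hops` and `word_set` are hypothetical, edges are treated as undirected when counting hops, and an entity "name" is assumed to be the part of its identifier after the last `/` or `#`.

```python
from collections import deque

def entities_within_hops(triples, seeds, n):
    """Entities reachable from any seed entity in at most n hops."""
    adj = {}
    for s, _, o in triples:
        adj.setdefault(s, set()).add(o)
        adj.setdefault(o, set()).add(s)
    dist = {seed: 0 for seed in seeds if seed in adj}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        if dist[u] == n:
            continue  # do not expand beyond the hop limit
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

def word_set(triples, seeds, n):
    """Sorted entity names within n hops, usable as one LDA input document."""
    names = set()
    for entity in entities_within_hops(triples, seeds, n):
        # Assumption: the name is the local part of the identifier.
        names.add(entity.rsplit("/", 1)[-1].rsplit("#", 1)[-1])
    return sorted(names)
```

For a chain `a - b - c - d` with seed `a` and `n = 2`, the resulting word set contains `a`, `b`, and `c` but not `d`.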
Third, using all the sample documents $D_d$ and the transformed documents $pD_x$ ($\forall x$), the LDA-based topic modeling is performed. The documents are short texts, such as paragraphs of research publications or Wikipedia articles, which are randomly collected from well-structured and refined sources and carry enough semantic information.
We derive the topic distribution vectors for each document from $D_d$ and $pD_x$ using the LDA-based topic model. The results of the topic modeling are as follows.
$$d\theta_{m} = \{d\theta_{m,1}, d\theta_{m,2}, \ldots, d\theta_{m,k}, \ldots, d\theta_{m,K}\} \quad \text{for } m = 1, 2, \ldots, M$$
$$e\theta_{p,n} = \{e\theta_{p,n,1}, e\theta_{p,n,2}, \ldots, e\theta_{p,n,k}, \ldots, e\theta_{p,n,K}\} \quad \text{for } n = 1, 2, \ldots, N \text{ and } p = 1, 2, \ldots, P$$
where $K$ is the number of topics, $d\theta_{m}$ represents the topic distribution vector of the document $d_m$ in $pD = \{pD_x \mid \forall x\}$ and $D_d$, $d\theta_{m,k}$ is the probability of topic $k$ for $d_m$ ($\sum_{k=1}^{K} d\theta_{m,k} = 1$ for all $m = 1, 2, \ldots, M$), and $e\theta_{p,n,k}$ is the probability of topic $k$ for $n(E_{gp})_{xt}$ ($\sum_{k=1}^{K} e\theta_{p,n,k} = 1$ for all $n = 1, 2, \ldots, N$ and $p = 1, 2, \ldots, P$).
However, although both the partial KGs and the sample documents are generated based on the concept entities, it is impossible to directly compare their relations because their sources are different. To solve the problem, using the topic vectors generated from the LDA topic model, the concept entities with the most similar topics to the sample documents are found, and the corresponding documents are attached. The process for finding similar topics is as follows.
To discover the documents related to the concept entity $ce_x$ (for all $x$), we calculate the topic similarity between the documents $D_d$ and $pD_x$ using the Hellinger distance, which measures how close two probability distributions are. The topic similarity between $e\theta_{p,n}$ and $d\theta_{m}$ is calculated as follows.
$$sim(e\theta_{p,n}, d\theta_{m}) = \frac{1}{\sqrt{2}} \sqrt{\sum_{k=1}^{K} \left(\sqrt{e\theta_{p,n,k}} - \sqrt{d\theta_{m,k}}\right)^2}$$
Based on this measure, the related document in $D_d$ is the one whose topic distribution is closest to $e\theta_{p,n}$ among the document distributions.
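As a small worked illustration, the Hellinger distance between two topic distributions can be computed as below. The function names `hellinger` and `closest_document` are hypothetical. Because the Hellinger distance is 0 for identical distributions and 1 for distributions with disjoint support, this sketch treats the document with the smallest distance as the most similar one; that reading is an interpretation made for the example.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)

def closest_document(entity_topics, doc_topics):
    """Index of the document whose topic distribution is nearest to the entity's."""
    return min(range(len(doc_topics)),
               key=lambda m: hellinger(entity_topics, doc_topics[m]))

# Toy topic distributions over K = 3 topics (made-up numbers).
entity = [0.7, 0.2, 0.1]
documents = [[0.1, 0.1, 0.8], [0.6, 0.3, 0.1], [0.3, 0.3, 0.4]]
best = closest_document(entity, documents)
```

In this toy example `best` is the index of the second document, whose topic mix is nearly the same as that of the entity.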
  • Step 3 Training the Prediction Model for Lexical relationships
In the third step, the lexical relations between two arbitrary concept entities are learned using the attached documents as a training dataset, and a language model capable of predicting equivalence relations is trained. To create a dataset from which the equivalence relation can be learned, we utilize the sub-structure of the general-purpose KGs. In the sub-structure, we assign the label ‘1’ to pairs of documents attached to concept entities with an equivalence relation and the label ‘0’ otherwise. Finally, the language model is trained using the label-annotated datasets. Transfer learning is performed using a pre-trained language model such as BERT to improve the learning speed and generalization performance.
To train the prediction model of the lexical relation between entities, we utilize the related documents from the previous step. First, we select two arbitrary entities ($e_i$ and $e_j$) from the $PCE$ and randomly select one related document for each entity ($ed_{i,a}$ and $ed_{j,b}$). The label is 1 if the two entities have an equivalence relationship ($(e_i, e_j) \in PCE$) and 0 otherwise. Second, the entity name and the related document are concatenated and encoded as vectors using the BERT model [27]. Third, the prediction model is trained using the encoded vectors of the two entities with their related documents. As the prediction model, we adopt the attention network model that is widely used for classification tasks. The prediction model uses the ReLU function as its activation function, which has the advantages of sparse activation and a low computational burden.
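The construction of the label-annotated pairs described above might look like the following sketch. It only prepares the text pairs and labels; the BERT encoding and the attention-based classifier are omitted. The function name `build_training_pairs`, the dictionary layout of `related_docs`, and the use of a plain space to concatenate the entity name with its document are assumptions for illustration.

```python
import random
from itertools import combinations

def build_training_pairs(pce, related_docs, seed=0):
    """Return (text_i, text_j, label) examples for all entity pairs in the PCE.

    pce          -- list of (entity, entity) pairs known to be equivalent
    related_docs -- dict mapping each entity to its list of related documents
    """
    rng = random.Random(seed)  # fixed seed so the sampling is reproducible
    equivalent = {frozenset(pair) for pair in pce}
    entities = sorted({e for pair in pce for e in pair})
    examples = []
    for e_i, e_j in combinations(entities, 2):
        # Concatenate the entity name with one randomly chosen related document.
        text_i = f"{e_i} {rng.choice(related_docs[e_i])}"
        text_j = f"{e_j} {rng.choice(related_docs[e_j])}"
        label = 1 if frozenset((e_i, e_j)) in equivalent else 0
        examples.append((text_i, text_j, label))
    return examples
```

With two equivalent pairs over four entities, the sketch produces six examples, two of which carry the label 1. Each text pair would then be passed to the encoder and the classifier.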

3.2.2. GAN-Based Semantic Alignment Learning for Instances

In the conventional entity alignment method, the primary information sources about the instance entities are literal or data properties (called attributes) [28]. Unlike the general-purpose KGs, the edge-specific KGs are likely to contain much larger-scale numerical data, such as sensor data, in the instance entities because the ESs operate at the low end and directly exchange and store sensor data. However, even if there is a lot of data about the instance entities, it is difficult to access because the data of the ESs, such as information about a specific space or personal privacy, require a very high level of security. In addition, because the ESs store data separately in multiple repositories, it is also challenging to gather the data required for the entity alignment of the instance entities.
To overcome the difficulties, this paper utilizes the generative adversarial networks (GAN)-based unsupervised instance alignment method for the instance entities. The GAN, as a deep model, has a significant advantage in security because it shields data from being directly interpretable. In addition, it shows an excellent performance in predicting and generating insufficient data as a generative model. Based on the properties of the GAN, this paper obtains the GAN model that can predict the entire data pattern after learning the GAN with subsets of data samples. Finally, this paper performs embedding for the instance entity alignment using the predicted data that the GAN model creates. The learning process of the GAN is as follows.
To train the GAN model, we use instance-related data subsets collected from sensors or devices, such as literal, attribute, or signal data. The data subset is defined as follows.
Definition 4.
Data Subset ($X_x$). $X_x$ is a subset of data samples related to the $x$-th instance entity $ie_x$.
$$X_x = \{\ldots, x_{xt}, \ldots\},\quad 1 \leq t \leq n$$
where $x_{xt}$ is the $t$-th data value of $X_x$.
To align two arbitrary instance entities $ie_x$ and $ie_{x'}$, it is necessary to know whether the populations of $ie_x$ and $ie_{x'}$ are equal. However, the volumes of $X_x$ and $X_{x'}$, the data subsets of $ie_x$ and $ie_{x'}$, are tiny because the data in the edge systems are stored in a distributed manner. In addition, blocking the external disclosure of the edge systems’ data for privacy and security reasons further shrinks the volumes of $X_x$ and $X_{x'}$. Therefore, it is challenging to infer the similarity of the entire populations of the two entities from the similarity of the distributions of the small-scale $X_x$ and $X_{x'}$ alone, which makes accurate alignment between $ie_x$ and $ie_{x'}$ difficult. To overcome this difficulty, we use the GAN model, which can learn the distribution of the whole population. The GAN model takes the data subset $X_x$ (for all $x$) as its input. Finally, the entity alignment between $ie_x$ and $ie_{x'}$ is conducted using the reconstructed data subsets $\hat{X}_x$ and $\hat{X}_{x'}$ derived from the inferred populations. The GAN consists of two parts: the generator $G$ and the discriminator $D$ [29]. The objective function of the GAN is defined as follows.
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_{data}(X_x)}[\log D(x)] + \mathbb{E}_{z \sim P_z(z)}[\log(1 - D(G(z)))]$$
where $P_{data}(X_x)$ is the real data distribution of $X_x$, $P_z(z)$ is a prior distribution on the noise vector $z$, $D(x)$ denotes the probability that $x$ comes from the real data, $\mathbb{E}_{x \sim P_{data}(X_x)}$ is the expectation over $x$ drawn from $P_{data}(X_x)$, and $\mathbb{E}_{z \sim P_z(z)}$ is the expectation over $z$ drawn from the noise prior.
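As a minimal numerical sketch of this objective (not the authors' implementation), the value function $V(D, G)$ can be estimated from mini-batches of discriminator outputs. The function name `gan_value` and the toy probabilities below are hypothetical and serve only to illustrate how the two expectation terms behave.

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-12):
    """Monte Carlo estimate of V(D, G) from discriminator outputs.

    d_real: D(x) for real samples x drawn from the data subset.
    d_fake: D(G(z)) for generated samples, with z drawn from the noise prior.
    """
    # Clip probabilities away from 0 and 1 so the logarithms stay finite
    d_real = np.clip(np.asarray(d_real, dtype=float), eps, 1.0 - eps)
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0 - eps)
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

# A discriminator that cannot separate real from generated data outputs 0.5
v_equilibrium = gan_value([0.5, 0.5], [0.5, 0.5])   # equals -2 * log(2)

# A confident and correct discriminator pushes V toward its upper bound of 0
v_confident = gan_value([0.99, 0.98], [0.01, 0.02])
```

The discriminator is trained to increase this value, while the generator is trained to decrease it, which is the minimax game expressed by the objective.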
The GAN trained for each instance is used to obtain a reconstructed data sample $\hat{X}_x$. Using $\hat{X}_x$, the equivalence relationships between the instances are identified. The GAN-based instance alignment process is depicted in Figure 3, where an embedding layer is added to obtain fixed-dimensional vectors for $\hat{X}_x$. The process can be summarized as follows. First, all data subsets of the target instances ($ie_x$, $\forall x$) are input to the GAN model. Next, the trained GAN is used to reconstruct the data for all instances. Lastly, the distribution of each instance is estimated using the reconstructed data and the real data, and the entity alignment is performed by inputting the estimated distribution ($\hat{X}_x$) to the embedding layer.

3.3. Graph-Based Merged Deep Model Learning Module

To perform the edge-specific entity alignment, this module creates an entity alignment model that is robust to the various structures and the lack of information of the edge-specific KGs by sharing and merging the learned deep models. The learning results of the concept and instance entities cannot be merged directly because they were obtained with two deep models, the language model and the GAN, which have completely different properties. To merge the two heterogeneous deep models, an embedding vector is generated using the learning information of each deep model and then projected onto the graph structure. As a next step, this paper uses a GCN (graph convolution network)-based merged deep model that can perform the entity alignment of the edge-specific KGs by learning only the structural relations of the graph.
As GNNs show high performance in exploiting graph structure information for many entity alignment tasks, we adopt the GCN model to capture the structural information of the concepts and the instances in the KGs. A GCN model consists of multiple GCN layers. The input of the $l$th layer is the entity feature matrix $H^{(l)} \in \mathbb{R}^{n \times d^{(l)}}$, where $n$ is the number of nodes and $d^{(l)}$ is the number of features in the $l$th layer. The output of the $l$th layer is a new feature matrix $H^{(l+1)}$, which is computed by the following convolution.
$$H^{(l+1)} = \sigma\left(A H^{(l)} W^{(l)}\right)$$
where $A$ is the normalized adjacency matrix of the input graph with self-connections added, $H^{(l)}$ is the hidden state matrix, $W^{(l)}$ is the weight matrix of the $l$th layer, and $\sigma(\cdot)$ is a nonlinear activation function.
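The propagation rule can be illustrated with a small NumPy sketch. It assumes the commonly used symmetric normalization $D^{-1/2}(A + I)D^{-1/2}$ and a ReLU activation, neither of which is specified in the text above; the helper names `normalize_adjacency` and `gcn_layer` and the toy graph are hypothetical.

```python
import numpy as np

def normalize_adjacency(adj):
    """Add self-connections, then apply D^-1/2 (A + I) D^-1/2 (assumed normalization)."""
    a_self = adj + np.eye(adj.shape[0])
    deg = a_self.sum(axis=1)               # every degree is >= 1 because of self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    return d_inv_sqrt @ a_self @ d_inv_sqrt

def gcn_layer(a_norm, h, w):
    """One propagation step H^(l+1) = sigma(A H^(l) W^(l)), with sigma = ReLU (assumed)."""
    return np.maximum(a_norm @ h @ w, 0.0)

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)   # toy undirected graph with n = 3 nodes
h0 = rng.normal(size=(3, 4))               # input features, d^(0) = 4
w0 = rng.normal(size=(4, 2))               # layer weights mapping to d^(1) = 2

a_norm = normalize_adjacency(adj)
h1 = gcn_layer(a_norm, h0, w0)             # new feature matrix of shape (3, 2)
```

Stacking several such layers lets each node aggregate information from progressively larger neighborhoods of the graph.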
Two GCN models, one for the concept entities and one for the instance entities, are trained independently because the spaces that explain the concepts and the instances are different. Moreover, training them independently makes it easier to find the optimal hyperplane and lowers the complexity. The inputs of the two GCN models are the embeddings of the concept entities produced by the language model-based trained model and the embeddings of the instance entities produced by the GAN-based trained model, respectively. After training, the two GCN models output the embeddings of the structures of the concept and the instance entities. Finally, the two types of embeddings are concatenated to discover their shared spaces, which cannot be identified from the concept or the instance entities alone. The concatenated embeddings are fed to the merged model to obtain the merged embeddings for the alignment task. The details of the merged model are depicted in Figure 4.
The space found by deep learning is almost impossible to interpret without information on the training dataset and on the dimensions and hyperparameters of the models. Therefore, our proposed framework can contribute to security and privacy because it only predicts the alignment using the merged model, without directly exposing the structures of the KGs.

4. Experiments and Performance Evaluation

In this section, we first explain the experimental setup, including a summary of the experimental datasets and the comparative methods. We then report the experimental results and analyze them in comparison with the other entity alignment methods. The experiments were run on Google Colab Pro with an Intel(R) Xeon(R) CPU @ 2.30 GHz, a Tesla P100 GPU, and 12 GB of memory.

4.1. Experimental Datasets

To evaluate our proposed framework, we used two datasets, DBP-YG 15K and DBP-WD 15K, sampled from DBpedia-YAGO and DBpedia-Wikidata, respectively [30]. The statistics of the two datasets are listed in Table 2. Both DBP-YG and DBP-WD include attribute triples, which represent the attribute information of the entities, and relation triples, which represent the relations between the entities. The predicates and objects of the attribute triples contain many words describing the actual entities. In contrast, the relation triples express only the entities and their equivalence relationships, without any vocabulary that explains their meaning.

Comparison Methods

We conducted an experiment to evaluate the superiority of our proposed framework compared to the existing entity alignment methods. The details of the comparison methods are presented as follows.
(1) MTransE: MTransE is an entity alignment (EA) method that learns a mapping between the two separate embedding spaces of different KGs [31].
(2) TransD: TransD is an embedding method that extends TransE to model complex relations by projecting the entities into a relation-related space [32].
(3) RotatE: RotatE is an embedding method that represents entities as complex vectors and relations as rotations in a complex vector space [33].
(4) ConvE: ConvE is an embedding method that is a representative multi-layer CNN-based architecture for link prediction [34].
(5) AlignE: AlignE is a self-training entity alignment method that embeds two KGs in a unified space and iteratively labels newly identified entity alignments as supervision [30].
(6) AttrE: AttrE generates attribute character embeddings to shift the entity embeddings from two different knowledge graphs into the same space [14].
(7) GCN-a: GCN-Align is an entity alignment method that employs GCNs to model entities and exploit their neighborhood information [52].
We conducted the experiments by varying the sampling ratio from 10% to 55% in increments of 5%. Following convention, we chose Hits@k (k = 1, 5, 10), mean rank (MR), and mean reciprocal rank (MRR) as the evaluation metrics.
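For reference, the three metrics can be computed from the rank that each test entity's correct counterpart receives in the candidate list. The sketch below uses the standard definitions; the function name `ranking_metrics` and the example ranks are hypothetical and are not taken from the experiments.

```python
import numpy as np

def ranking_metrics(ranks, ks=(1, 5, 10)):
    """Compute Hits@k, MR, and MRR from 1-based ranks of the correct counterparts."""
    ranks = np.asarray(ranks, dtype=float)
    # Hits@k: fraction of test entities whose counterpart is ranked within the top k
    metrics = {f"Hits@{k}": float(np.mean(ranks <= k)) for k in ks}
    metrics["MR"] = float(np.mean(ranks))         # mean rank: lower is better
    metrics["MRR"] = float(np.mean(1.0 / ranks))  # mean reciprocal rank: higher is better
    return metrics

# Hypothetical ranks for four test entities
result = ranking_metrics([1, 3, 12, 2])
# Hits@1 = 0.25, Hits@5 = 0.75, Hits@10 = 0.75, MR = 4.5, MRR = 23/48
```

Because MR is an unweighted average, a single badly ranked entity can dominate it, whereas MRR dampens the influence of such outliers.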

4.2. Experimental Results and Evaluation

The results of the experiment on the full datasets are presented in Table 3. As shown in Table 3, our proposed framework outperforms the comparison models. Because our proposed method exploits semantic information with a language model, it captures the differences between entities better than the comparison models. Moreover, the alignment models that use attributes, including our proposed model, perform better than the models that use only relations. This is because the entities in a KG have multiple aspects of features, which help in identifying equivalence relations.
Figure 5 and Figure 6 show the experimental results obtained by changing the sampling rate for the two experimental datasets, DBP-YG 15K and DBP-WD 15K. DBP-YG 15K contains a large amount of data on people-centered relationships. For such relationships, the complexity of the label is quite high because the types are quite diverse, so the complexity of the label increases greatly with the data sampling rate. The value of Hits@1 increases steadily as the sampling rate increases, whereas the value of Hits@10 increases steeply up to a sampling rate of 30% and more slowly after that. This is because the entropy of the label increases as the complexity of the label increases. The Hits@k values for DBP-WD 15K increased much more slowly. The proposed method showed excellent performance at almost all sampling rates, with GCN-a and AlignE being the closest competitors. For Hits@5 and Hits@10, the performance of AlignE is lower than that of the proposed method, which suggests that the generalization performance of AlignE is low.
In contrast, the value of MRR increases rapidly as the sampling rate increases. Unlike Hits@k, the MR and MRR can evaluate the generalization performance and stability because they average scores over all results. A lower MR and a higher MRR indicate better performance. As depicted in Figure 5, the proposed method performs better than the other methods at most sampling rates. In particular, the Hits@10 values on the DBP-WD 15K dataset indirectly suggest that the generalization performance of the AlignE method is somewhat low, and the MRR values also suggest that the performance of AlignE is low. The comparison of Hits@k confirms that the proposed method achieved higher accuracy than the other methods. Moreover, it showed high generalization performance in terms of the MR and MRR. In summary, the entity alignment performance of the proposed method is excellent. The experimental results are summarized in detail in Table 4 and Table 5.

5. Related Works

5.1. KGs for Edge Computing

Many recent studies have been performed to develop and utilize KGs to support ESs. The functions of these KGs can be categorized into three types: (1) service data/information management, (2) device/sensor maintenance, and (3) security/privacy of the ES. Based on these categories, the related works on KGs for edge computing are summarized in Table 6.
Xu [22] proposed a KG inference-enabled data management system for edge computing, named SuccinctEdge, which is a compact, decompression-free, self-index, in-memory RDF store that can answer SPARQL queries, including those requiring reasoning services associated with some ontology. Wu [35] devised a group recommendation system for network document resource exploration using the KG and LSTM in edge computing. Yao [36] developed a news recommendation algorithm in an edge computing environment using KG and GNN. Shi [37] proposed a KG-empowered reasoning model, named TKGERM, to reason traffic information with multi-source data for edge computing-enabled IoV (Internet of Vehicles). Liu [12] devised a KG-based data representation method for IIoT (Industrial Internet of Things)-enabled cognitive manufacturing. Zhang [38] proposed an edge analytics method using KG construction and application to manage infrastructure resources such as CPU usage, memory capacity, network bandwidth, and the operating system of relevant devices. Dold [39] proposed an energy-based model of neuro-symbolic reasoning on KGs to characterize industrial automation systems, integrating knowledge from different domains such as industrial automation, communications, and cybersecurity. Marx [40] presented Knowledge Box (KBox), an approach for transparently shifting query execution on KGs to the edge, to make the consumption of KGs more reliable and faster in ESs. Garrido [41] applied machine learning on ES-specific KGs to detect intrusion and anomalous activity in industrial systems. Even though adopting KGs in the edge computing environment is gaining popularity, these KGs are developed for specific purposes, so it is hard to use them directly to support cooperation and collaboration between ESs. To effectively resolve the semantic heterogeneity of the data of each ES, the entities in different KGs should be aligned.

5.2. Entity Alignment for KG

5.2.1. Entity Alignment Methods

Entity alignment, which links the entities from different KGs that indicate the same real-world object, is gaining popularity in fusing knowledge from heterogeneous KGs. Entity alignment methods can be categorized into (1) translation-based models and (2) GNN-based models [19]. Translation-based models [14,30,31,42,43,44] perform entity alignment by utilizing translation-based embedding models, mostly TransE [45], which learn vector representations of the entities and relations. MTransE [31] generates the embedding of the entities and relations using TransE and provides transitions for each embedding vector to its counterparts in other spaces. BootEA [30] adopted the bootstrapping approach, which iteratively labels likely entity alignment as training data to learn embeddings and employs an alignment editing method to reduce error. AttrE [14] generates attribute character embeddings that shift the entity embeddings of two different KGs into the same space by calculating the similarity of the entities based on their attributes. ITransE/IPTransE [42] iteratively align entities via joint knowledge embeddings by encoding the entities and relations of different KGs into a single semantic space. Specifically, ITransE utilizes TransE to learn embeddings, while IPTransE utilizes PTransE [46]. JAPE [43] jointly embeds the structures of two KGs into a unified vector space and refines them by leveraging attribute correlations. KDCoE [44] iteratively trains two component embedding models on multilingual KG structures and entity descriptions, respectively.
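To make the translation idea concrete, TransE [45] treats a relation as a translation in the embedding space, so that a valid triple (head, relation, tail) should satisfy head + relation ≈ tail. The following sketch scores triples by this distance; the function name `transe_score` and the two-dimensional toy vectors are hypothetical and chosen only for illustration.

```python
import numpy as np

def transe_score(h, r, t, norm=1):
    """TransE distance ||h + r - t||; a lower score means a more plausible triple."""
    return float(np.linalg.norm(h + r - t, ord=norm))

h = np.array([0.2, 0.1])          # head entity embedding
r = np.array([0.3, -0.1])         # relation embedding, used as a translation
t_true = np.array([0.5, 0.0])     # tail that matches h + r
t_false = np.array([-0.4, 0.9])   # unrelated tail

score_true = transe_score(h, r, t_true)    # close to 0
score_false = transe_score(h, r, t_false)  # clearly larger
```

Translation-based alignment methods build on embeddings learned this way and then relate the embedding spaces of the two KGs.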
GNN-based models utilize GNNs to generate embeddings and predict alignment and can be divided into GCN-based models [47,48,49,50,51,52] and GAT-based models [53,54,55]. GCN-Align [52] trains GCNs to perform the embedding of the entities of different KGs into the same vector space. MuGNN [51] utilizes different channels for each KG to be robust to structural differences. GMNN [50] introduces the topic entity graph, which is a local sub-graph of the entity, to represent entities with their contextual information in the KG. RDGCN [49] incorporates relation information by attentive interaction between the knowledge graph and its counterpart. HGCN [48] jointly learns entity and relation representations and does not require pre-aligned relations. EMGCN [47] utilizes multi-order graph convolutional networks to perform end-to-end, unsupervised entity alignment. KECG [55] utilizes GAT to embed entities into a single vector space by utilizing inner-graph structure and inter-graph alignment information. MRAEA [54] directly models cross-lingual entity embeddings by attending to the node’s incoming and outgoing neighbors and its connected relations. RAGA [53] adopts a self-attention mechanism to spread entity information to the relations and then aggregates relation information back to the entities.
Even though these alignment methods perform well on general-purpose KGs, such as DBpedia and YAGO, they cannot be directly applied to ES-specific KGs because the heterogeneity of ES-specific KGs is much more severe than that of general-purpose KGs. To align the entities of ES-specific KGs, a novel method is needed that considers the unique characteristics of the ES-specific KGs.

5.2.2. Entity Alignment Applications

Entity alignment is utilized for many purposes related to integrating knowledge from heterogeneous sources. Zhang [56] utilized the entity alignment method to integrate knowledge related to maritime dangerous goods. Chong [27] tried to support the interoperability of depressive disorder-related knowledge. Hu [57] performed entity alignment to obtain semantic information from a biomedical knowledge base. Zhou [58] proposed an alignment method of point of interest (POI)-related entities to discover identical POIs in location-based services. Chen [59] tried to integrate multi-source heterogeneous electricity power data by aligning entities in multiple power knowledge graphs. Yang [60] proposed an entity alignment method of power grid-dispatching knowledge graphs. Zhu [61] tried to identify similar IoT devices in different networks by using an alignment method. These studies perform the alignment of entities related to their own purposes. However, in many cases, knowledge integration among different domains, such as healthcare services and transportation, is needed. To meet this requirement, a novel alignment method is needed to perform alignment between KGs developed for different domains.

6. Conclusions and Further Research

This paper proposes the deep model-based dynamic entity alignment framework for edge-specific KGs in the edge computing environment. The contributions of this paper can be summarized as follows. First, we applied various deep learning models to align the domain-specific KGs. Second, as the domain-specific KGs of the edge computing environment do not have sufficient shared information, we proposed a method to perform alignment of the concepts from the different KGs by utilizing external information from general-purpose KG and documents. Finally, to ensure security and privacy of the domain-specific KGs of the edge computing environment, we proposed the GAN-based unsupervised alignment methods of the instances. The neural network models such as the GAN are black-box models that humans cannot interpret. So, they can ensure security by hiding the input data and hyperparameters of the model.
As a result, this paper proposes a novel entity alignment method suitable for edge computing to overcome the limitation that existing KG and entity alignment techniques are not suitable for real-world edge computing environments. We analyzed the characteristics of edge-specific KGs that differ from those of general-purpose KGs, considering the edge computing and collaboration environment, and devised the entity alignment technique with the language model and the GAN. In addition, various experiments were performed to evaluate the proposed method, and the results showed that it performs better than the conventional methods.
However, our framework has several limitations. As the vector representation does not consider the structure of the KG and the terms it contains, the positions of the entities projected by the language model may be biased. Moreover, to respond quickly to new types of cyberattacks, noisy and fragmented text resources should be used in addition to resources composed of formal sentences. To overcome these limitations, we will additionally use graph embedding methods that consider graph structures and a language model that is more robust to short and noisy texts.

Author Contributions

Conceptualization, J.K. and M.S.; methodology, J.K., K.K. and M.S.; software, J.K.; validation, M.S. and G.P.; data curation, J.K.; writing—original draft preparation, J.K., K.K. and M.S.; writing—review and editing, M.S. and G.P.; visualization, J.K.; supervision, M.S.; project administration, M.S.; funding acquisition, G.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Agency for Defense Development in Korea, grant number UE201114ED.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

This research is supported by the C2 integrating and interfacing technologies laboratory of the Agency for Defense Development (UE201114ED).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shi, W.; Cao, J.; Zhang, Q.; Li, Y.; Xu, L. Edge Computing: Vision and Challenges. IEEE Internet Things J. 2016, 3, 637–646.
  2. Varghese, B.; Wang, N.; Barbhuiya, S.; Kilpatrick, P.; Nikolopoulos, D.S. Challenges and Opportunities in Edge Computing. In Proceedings of the 2016 IEEE International Conference on Smart Cloud (SmartCloud), New York, NY, USA, 18–20 November 2016.
  3. Ranaweera, P.; Jurcut, A.D.; Liyanage, M. Survey on Multi-Access Edge Computing Security and Privacy. IEEE Commun. Surv. Tutor. 2021, 23, 1078–1124.
  4. Ning, H.; Li, Y.; Shi, F.; Yang, L.T. Heterogeneous edge computing open platforms and tools for internet of things. Future Gener. Comput. Syst. 2020, 106, 67–76.
  5. Khan, W.Z.; Ahmed, E.; Hakak, S.; Yaqoob, I.; Ahmed, A. Edge computing: A survey. Future Gener. Comput. Syst. 2019, 97, 219–235.
  6. Ryabinin, K.; Chuprina, S. Ontology-Driven Edge Computing. In International Conference on Computational Science; Springer: Cham, Switzerland, 2020.
  7. Lan, L.; Shi, R.; Wang, B.; Zhang, L. An IoT Unified Access Platform for Heterogeneity Sensing Devices Based on Edge Computing. IEEE Access 2019, 7, 44199–44211.
  8. Ouyang, T.; Zhou, Z.; Chen, X. Follow Me at the Edge: Mobility-Aware Dynamic Service Placement for Mobile Edge Computing. IEEE J. Sel. Areas Commun. 2018, 36, 2333–2345.
  9. Tran, T.X.; Hajisami, A.; Pandey, P.; Pompili, D. Collaborative Mobile Edge Computing in 5G Networks: New Paradigms, Scenarios, and Challenges. IEEE Commun. Mag. 2017, 55, 54–61.
  10. Chen, X.; Jia, S.; Xiang, Y. A review: Knowledge reasoning over knowledge graph. Expert Syst. Appl. 2020, 141, 112948.
  11. Akroyd, J.; Mosbach, S.; Bhave, A.; Kraft, M. Universal digital twin—A dynamic knowledge graph. Data Cent. Eng. 2021, 2, e14.
  12. Liu, M.; Li, X.; Li, J.; Liu, Y.; Zhou, B.; Bao, J. A knowledge graph-based data representation approach for IIoT-enabled cognitive manufacturing. Adv. Eng. Inform. 2022, 51, 101515.
  13. Mohamed, S.K.; Muñoz, E.; Nováček, V.; Vandenbussche, P.-Y. Identifying Equivalent Relation Paths in Knowledge Graphs. In International Conference on Language, Data and Knowledge; Springer: Cham, Switzerland, 2017; pp. 299–314.
  14. Trisedya, B.D.; Qi, J.; Zhang, R. Entity Alignment between Knowledge Graphs Using Attribute Embeddings. Proc. Conf. AAAI Artif. Intell. 2019, 33, 297–304.
  15. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
  16. Creswell, A.; White, T.; Dumoulin, V.; Arulkumaran, K.; Sengupta, B.; Bharath, A.A. Generative Adversarial Networks: An Overview. IEEE Signal Process. Mag. 2018, 35, 53–65.
  17. Zhang, S.; Tong, H.; Xu, J.; Maciejewski, R. Graph convolutional networks: A comprehensive review. Comput. Soc. Netw. 2019, 6, 11.
  18. Zeng, K.; Li, C.; Hou, L.; Li, J.; Feng, L. A comprehensive survey of entity alignment for knowledge graphs. AI Open 2021, 2, 1–13.
  19. Zhang, R.; Trisedya, B.D.; Li, M.; Jiang, Y.; Qi, J. A benchmark and comprehensive survey on knowledge graph entity alignment via representation learning. VLDB J. 2022, 1–26.
  20. Dai, Y.; Wang, S.; Xiong, N.N.; Guo, W. A Survey on Knowledge Graph Embedding: Approaches, Applications and Benchmarks. Electronics 2020, 9, 750.
  21. Paulheim, H. Knowledge graph refinement: A survey of approaches and evaluation methods. Semant. Web 2016, 8, 489–508.
  22. Xu, W.; Curé, O.; Calvez, P. Knowledge graph management on the edge. arXiv 2020, arXiv:2012.07108.
  23. Kim, J.; Kong, J.; Sohn, M.; Park, G. Layered ontology-based multi-sourced information integration for situation awareness. J. Supercomput. 2021, 77, 9780–9809.
  24. Zhang, J.; Luo, Y. Degree Centrality, betweenness Centrality, and Closeness Centrality in Social Network. In Proceedings of the 2017 2nd International Conference on Modelling, Simulation and Applied Mathematics (MSAM2017), Bangkok, Thailand, 26–27 March 2017.
  25. Ni, C.; Sugimoto, C.; Jiang, J. Degree, Closeness, and Betweenness: Application of Group Centrality Measurements to Explore Macro-Disciplinary Evolution Diachronically. In Proceedings of the ISSI, Durban, South Africa, 4–7 July 2011.
  26. Ergün, E.; Usluel, Y.K. An analysis of density and degree-centrality according to the social networking structure formed in an online learning environment. J. Educ. Technol. Soc. 2016, 19, 34–46.
  27. Chong, I.; Lee, S. Deep Learning based Semantic Ontology Alignment Process and Predictive Analysis of Depressive Disorder. In Proceedings of the 2022 International Conference on Information Networking (ICOIN), Jeju, Korea, 12–15 January 2022; pp. 164–167.
  28. Sun, Z.; Zhang, Q.; Hu, W.; Wang, C.; Chen, M.; Akrami, F.; Li, C. A benchmarking study of embedding-based entity alignment for knowledge graphs. Proc. VLDB Endow. 2020, 13, 2326–2340.
  29. Zenati, H.; Foo, C.S.; Lecouat, B.; Manek, G.; Chandrasekhar, V.R. Efficient gan-based anomaly detection. arXiv 2018, arXiv:1802.06222.
  30. Sun, Z.; Hu, W.; Zhang, Q.; Qu, Y. Bootstrapping Entity Alignment with Knowledge Graph Embedding. IJCAI 2018, 18, 4396–4402.
  31. Chen, M.; Tian, Y.; Yang, M.; Zaniolo, C. Multilingual Knowledge Graph Embeddings for Cross-lingual Knowledge Alignment. arXiv 2017, arXiv:1611.03954.
  32. Ji, G.; He, S.; Xu, L.; Liu, K.; Zhao, J. Knowledge Graph Embedding via Dynamic Mapping Matrix. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 26–31 July 2015.
  33. Sun, Z.; Deng, Z.H.; Nie, J.Y.; Tang, J. Rotate: Knowledge graph embedding by relational rotation in complex space. arXiv 2019, arXiv:1902.10197.
  34. Dettmers, T.; Minervini, P.; Stenetorp, P.; Riedel, S. Convolutional 2d Knowledge Graph Embeddings. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
  35. Wu, Y.; Liu, Q.; Chen, R.; Li, C.; Peng, Z. A Group Recommendation System of Network Document Resource Based on Knowledge Graph and LSTM in Edge Computing. Secur. Commun. Netw. 2020, 2020, 8843803.
  36. Yao, C.; Zhao, C. Knowledge Graph and GNN-Based News Recommendation Algorithm with Edge Computing Support. Int. J. Distrib. Syst. Technol. 2022, 13, 1–11.
  37. Shi, H.; Zhang, Y.; Xu, Z.; Xu, X.; Qi, L. Multi-source temporal knowledge graph embedding for edge computing enabled internet of vehicles. Neurocomputing 2022, 491, 597–606.
  38. Zhang, H.; Niu, Y.; Ding, K.; Kou, S.; Liu, L. Building and Applying Knowledge Graph in Edge Analytics Environment. J. Phys. Conf. Ser. 2022, 1, 2171.
  39. Dold, D.; Garrido, J.S. An Energy-Based Model for Neuro-Symbolic Reasoning on Knowledge Graphs. In Proceedings of the 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA), Pasadena, CA, USA, 13–16 December 2021.
  40. Marx, E.; Baron, C.; Soru, T.; Auer, S. KBox—Transparently Shifting Query Execution on Knowledge Graphs to the Edge. In Proceedings of the IEEE International Conference on Semantic Computing, Amsterdam, The Netherlands, 11–14 September 2017.
  41. Garrido, S.J.; Dold, D.; Frank, J. Machine Learning on Knowledge Graphs for Context-Aware Security Monitoring. In Proceedings of the 2021 IEEE International Conference on Cyber Security and Resilience (CSR), Rhodes, Greece, 26–28 July 2021.
  42. Zhu, H.; Xie, R.; Liu, Z.; Sun, M. Iterative Entity Alignment via Knowledge Embeddings. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Melbourne, Australia, 19–25 August 2017.
  43. Sun, Z.; Hu, W.; Li, C. Cross-Lingual Entity Alignment via Joint Attribute-Preserving Embedding. In Proceedings of the International Semantic Web Conference, Vienna, Austria, 21–25 October 2017.
  44. Chen, M.; Tian, Y.; Chang, K.-W.; Skiena, S.; Zaniolo, C. Co-training Embeddings of Knowledge Graphs and Entity Descriptions for Cross-lingual Entity Alignment. arXiv 2018, arXiv:1806.06478.
  45. Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. Adv. Neural Inf. Process. Syst. 2013, 26. Available online: https://proceedings.neurips.cc/paper/2013/hash/1cecc7a77928ca8133fa24680a88d2f9-Abstract.html (accessed on 8 June 2022).
  46. Lin, Y.; Liu, Z.; Luan, H.; Sun, M.; Rao, S.; Liu, S. Modeling Relation Paths for Representation Learning of Knowledge Bases. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 1 June 2015.
  47. Nguyen, T.T.; Huynh, T.T.; Yin, H.; Van Tong, V.; Sakong, D.; Zheng, B.; Nguyen, Q.V.H. Entity Alignment for Knowledge Graphs with Multi-Order Convolutional Networks. IEEE Trans. Knowl. Data Eng. 2020.
  48. Wu, Y.; Liu, X.; Feng, Y.; Wang, Z.; Zhao, D. Jointly Learning Entity and Relation Representations for Entity Alignment. arXiv 2019, arXiv:1909.09317.
  49. Wu, Y.; Liu, X.; Feng, Y.; Wang, Z.; Yan, R.; Zhao, D. Relation-Aware Entity Alignment for Heterogeneous Knowledge Graphs. arXiv 2019, arXiv:1908.08210.
  50. Xu, K.; Wang, L.; Yu, M.; Feng, Y.; Song, Y.; Wang, Z.; Yu, D. Cross-lingual Knowledge Graph Alignment via Graph Matching Neural Network. arXiv 2019, arXiv:1905.11605.
  51. Cao, Y.; Liu, Z.; Li, C.; Li, J.; Chua, T.-S. Multi-Channel Graph Neural Network for Entity Alignment. arXiv 2019, arXiv:1908.09898.
  52. Wang, Z.; Lv, Q.; Lan, X.; Zhang, Y. Cross-Lingual Knowledge Graph Alignment via Graph Convolutional Networks. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018.
  53. Zhu, R.; Ma, M.; Wang, P. RAGA: Relation-Aware Graph Attention Networks for Global Entity Alignment. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Delhi, India, 11–14 May 2021.
  54. Mao, X.; Wang, W.; Xu, H.; Lan, M.; Wu, Y. MRAEA: An Efficient and Robust Entity Alignment Approach for Cross-Lingual Knowledge Graph. In Proceedings of the 13th International Conference on Web Search and Data Mining, Houston, TX, USA, 3–7 February 2020.
  55. Li, C.; Cao, Y.; Hou, L.; Shi, J.; Li, J.; Chua, T.S. Semi-Supervised Entity Alignment via Joint Knowledge Embedding Model and Cross-Graph model. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 9–23 October 2019.
  56. Zhang, Q.; Wen, Y.; Zhou, C.; Long, H.; Han, D.; Zhang, F.; Xiao, C. Construction of Knowledge Graphs for Maritime Dangerous Goods. Sustainability 2019, 11, 2849.
  57. Hu, Y.; Nie, T.; Shen, D.; Kou, Y.; Yu, G. An integrated pipeline model for biomedical entity alignment. Front. Comput. Sci. 2021, 15, 153321.
  58. Zhou, C.; Zhao, J.; Zhang, X.; Ren, C. Entity Alignment Method of Points of Interest for Internet Location-Based Services. J. Adv. Comput. Intell. Intell. Inform. 2020, 24, 837–845.
  59. Chen, Y.; Xiong, F.; Wu, F.; Xiang, X.; Gao, J.; Gao, J. Entity Alignment across Power Knowledge Graphs. In Proceedings of the 2020 IEEE 2nd International Conference on Power Data Science (ICPDS), Kunming, China, 12–13 December 2020; 2020. [Google Scholar]
  60. Yang, L.; Lv, C.; Wang, X.; Qiao, J.; Ding, W.; Zhang, J.; Wang, F.Y. Collective Entity Alignment for Knowledge Fusion of Power Grid Dispatching Knowledge Graphs. IEEE/CAA J. Autom. Sin. 2022, 9, 1–15. [Google Scholar]
  61. Zhu, D.; Sun, Y.; Du, H.; Cao, N.; Baker, T.; Srivastava, G. HUNA: A Method of Hierarchical Unsupervised Network Alignment for IoT. IEEE Internet Things J. 2020, 8, 3201–3210. [Google Scholar] [CrossRef]
Figure 1. Framework of deep model-based security-aware entity alignment method.
Figure 2. Training process for lexical relationship prediction of concept entities.
Figure 3. Generative Adversarial Network (GAN)-based Instances Alignment Procedure.
Figure 4. Procedure of graph-based merged deep model learning.
Figure 5. Comparison of the Hits@k with changing sampling rate (10~55%).
Figure 6. Performance Comparison of the MR and MRR with changing sampling rate (10~55%).
Table 1. Comparison of general-purpose KG and edge-specific KGs.
| Characteristic | General-Purpose KG | Edge-Specific KG |
|---|---|---|
| Level of abstraction | High | Low |
| Degree of data overlap | High | Low |
| Degree of information openness | High | Low |
Table 2. Statistical data of DBP-YG 15k and DBP-WD 15k.
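| Datasets | KG | #Entities | #Attrs | #Attr. Triples | #Rels | #Rel. Triples |
|---|---|---|---|---|---|---|
| DBP-YG 15k | DBpedia | 15,000 | 39,520 | 52,093 | 17,368 | 30,291 |
| DBP-YG 15k | YAGO | 15,000 | 117,622 | 117,114 | 15,859 | 26,638 |
| DBP-WD 15k | DBpedia | 15,000 | 42,294 | 52,134 | 19,132 | 38,265 |
| DBP-WD 15k | Wikidata | 15,000 | 133,090 | 138,246 | 19,324 | 42,746 |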
Table 3. Experimental results on DBP-YG 15k and DBP-WD 15k.
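| Methods | DBP-YG Hits@1 | DBP-YG Hits@5 | DBP-YG Hits@10 | DBP-YG MR | DBP-YG MRR | DBP-WD Hits@1 | DBP-WD Hits@5 | DBP-WD Hits@10 | DBP-WD MR | DBP-WD MRR |
|---|---|---|---|---|---|---|---|---|---|---|
| MTransE | 47.343 | 68.771 | 74.305 | 249.55 | 0.5687 | 25.048 | 45.257 | 53.343 | 338.17 | 0.3468 |
| TransD | 31.124 | 45.495 | 49.219 | 1320.94 | 0.3773 | 20.99 | 34.562 | 40.114 | 1198.89 | 0.2947 |
| RotatE | 45.743 | 66.495 | 71.952 | 553.978 | 0.5491 | 26.533 | 46.981 | 55.81 | 511.24 | 0.3615 |
| ConvE | 5.886 | 10.152 | 11.429 | 4213.87 | 0.0788 | 14 | 25.648 | 30.086 | 1788.70 | 0.1958 |
| AlignE | 26.933 | 41.362 | 45.762 | 452.331 | 0.3364 | 16.219 | 28.095 | 34.457 | 390.17 | 0.2242 |
| AttrE | 28.79 | 67.41 | 71.886 | 901.239 | 0.5693 | 11.41 | 13.21 | 30.752 | 478.26 | 0.1124 |
| GCN-a | 46.638 | 62.895 | 66.314 | 1110.54 | 0.5383 | 26.59 | 48.248 | 62.552 | 712.40 | 0.4022 |
| proposed | 51.324 | 68.571 | 71.468 | 950.118 | 0.5883 | 33.257 | 57.19 | 64.752 | 581.04 | 0.4404 |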
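The three metrics reported in Table 3 can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes each test entity has already been assigned the 1-based rank of its true counterpart, that Hits@k is reported as a percentage and MRR as a fraction (as the table values suggest), and the function names and the sample `ranks` list are hypothetical.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Percentage of test entities whose true counterpart is ranked within the top k."""
    return 100.0 * sum(1 for r in ranks if r <= k) / len(ranks)


def mean_rank(ranks: Sequence[int]) -> float:
    """Average rank of the true counterpart (lower is better)."""
    return sum(ranks) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank of the true counterpart (higher is better)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical 1-based ranks for five test entity pairs (illustrative only).
ranks = [1, 3, 2, 12, 1]

print(f"Hits@1  = {hits_at_k(ranks, 1):.1f}")
print(f"Hits@5  = {hits_at_k(ranks, 5):.1f}")
print(f"Hits@10 = {hits_at_k(ranks, 10):.1f}")
print(f"MR      = {mean_rank(ranks):.3f}")
print(f"MRR     = {mean_reciprocal_rank(ranks):.4f}")
```

Under these definitions, higher Hits@k and MRR and lower MR indicate better alignment, which is how the rows of Table 3 should be compared.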
Table 4. Experimental results of DBP-YG 15k.
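| Methods | 10% Hits@1 | 10% Hits@5 | 10% Hits@10 | 10% MR | 10% MRR | 15% Hits@1 | 15% Hits@5 | 15% Hits@10 | 15% MR | 15% MRR |
|---|---|---|---|---|---|---|---|---|---|---|
| MTransE | 5.219 | 11.743 | 15.133 | 989.475 | 0.087115 | 6.238 | 12.752 | 15.943 | 971.901 | 0.097091 |
| TransD | 1.229 | 2.362 | 2.81 | 4621.611 | 0.018651 | 1.133 | 2.286 | 2.876 | 4564.548 | 0.017878 |
| RotatE | 1.743 | 4.771 | 6.238 | 3606.116 | 0.033357 | 2.733 | 7.105 | 9.41 | 3078.689 | 0.050155 |
| ConvE | 0.41 | 0.838 | 1.124 | 4840.592 | 0.007365 | 0.438 | 0.981 | 1.305 | 4883.341 | 0.007749 |
| AlignE | 0.276 | 0.762 | 1.076 | 4062.907 | 0.006292 | 1.2 | 2.476 | 3.295 | 2647.197 | 0.020338 |
| AttrE | 3.571 | 10.048 | 12.343 | 3454.863 | 0.06459 | 4.943 | 12.552 | 15.305 | 2831.649 | 0.090642 |
| GCN-a | 3.819 | 8.552 | 11.314 | 2494.948 | 0.062887 | 7.838 | 14.61 | 17.571 | 2262.543 | 0.112049 |
| proposed | 3.489 | 8.162 | 10.581 | 2477.388 | 0.059166 | 5.981 | 11.829 | 14.419 | 2400.914 | 0.089217 |

| Methods | 20% Hits@1 | 20% Hits@5 | 20% Hits@10 | 20% MR | 20% MRR | 25% Hits@1 | 25% Hits@5 | 25% Hits@10 | 25% MR | 25% MRR |
|---|---|---|---|---|---|---|---|---|---|---|
| MTransE | 9.657 | 19.657 | 24.457 | 742.268 | 0.148495 | 12.667 | 25.21 | 31.019 | 638.397 | 0.189461 |
| TransD | 2.714 | 4.79 | 5.571 | 4292.445 | 0.037679 | 4.581 | 8.857 | 10 | 3783.594 | 0.065699 |
| RotatE | 3.61 | 9.333 | 12.2 | 2797.711 | 0.065277 | 5.867 | 14.019 | 18.086 | 2207.615 | 0.099234 |
| ConvE | 0.657 | 1.381 | 1.762 | 4825.704 | 0.011041 | 1.124 | 2.39 | 2.829 | 4626.725 | 0.017701 |
| AlignE | 1.714 | 3.733 | 4.838 | 2587.264 | 0.029026 | 1.752 | 3.838 | 5.248 | 2643.471 | 0.030305 |
| AttrE | 6.829 | 15.962 | 18.99 | 2828.178 | 0.109796 | 10.676 | 23.476 | 28.086 | 2168.71 | 0.164928 |
| GCN-a | 11.19 | 20.267 | 23.819 | 2014.289 | 0.155226 | 15.867 | 27.029 | 30.4 | 1873.262 | 0.209795 |
| proposed | 10.495 | 19.19 | 22.895 | 2057.617 | 0.146866 | 15.267 | 25.81 | 29.495 | 1897.483 | 0.202285 |

| Methods | 30% Hits@1 | 30% Hits@5 | 30% Hits@10 | 30% MR | 30% MRR | 35% Hits@1 | 35% Hits@5 | 35% Hits@10 | 35% MR | 35% MRR |
|---|---|---|---|---|---|---|---|---|---|---|
| MTransE | 14.571 | 27.21 | 32.752 | 659.049 | 0.207981 | 18.114 | 34.143 | 40.724 | 535.993 | 0.258659 |
| TransD | 5.629 | 9.857 | 11.486 | 3608.386 | 0.076879 | 7.057 | 13 | 14.695 | 3375.369 | 0.099554 |
| RotatE | 7.248 | 17.381 | 22.286 | 2101.093 | 0.122485 | 11.705 | 24.581 | 30.086 | 1728.132 | 0.178291 |
| ConvE | 1.533 | 3.19 | 3.905 | 4590.226 | 0.023763 | 1.476 | 2.905 | 3.657 | 4610.68 | 0.02259 |
| AlignE | 5.629 | 10.981 | 13.457 | 1473.699 | 0.085146 | 5.486 | 11.095 | 13.562 | 1437.927 | 0.08444 |
| AttrE | 13.352 | 26.952 | 31.505 | 1945.328 | 0.19539 | 22.114 | 40.632 | 47.743 | 1232.822 | 0.305168 |
| GCN-a | 18.048 | 29.095 | 32.714 | 1896.629 | 0.231485 | 22.276 | 34.971 | 38.829 | 1698.088 | 0.281175 |
| proposed | 18.4 | 29.6 | 33.295 | 1895.828 | 0.236035 | 20.714 | 32.895 | 36.4 | 1776.478 | 0.262811 |

| Methods | 40% Hits@1 | 40% Hits@5 | 40% Hits@10 | 40% MR | 40% MRR | 45% Hits@1 | 45% Hits@5 | 45% Hits@10 | 45% MR | 45% MRR |
|---|---|---|---|---|---|---|---|---|---|---|
| MTransE | 20.971 | 39.524 | 46.495 | 454.745 | 0.296396 | 21.905 | 39.267 | 45.819 | 495.842 | 0.301097 |
| TransD | 9.114 | 15.143 | 17.029 | 3173.499 | 0.119754 | 9.019 | 15.99 | 17.981 | 3119.059 | 0.122631 |
| RotatE | 12.276 | 26.057 | 31.924 | 1601.607 | 0.187436 | 13.533 | 27.943 | 34.095 | 1484.941 | 0.202817 |
| ConvE | 2.057 | 3.838 | 4.619 | 4524.897 | 0.030075 | 2.295 | 4.476 | 5.324 | 4486.752 | 0.034252 |
| AlignE | 6.39 | 11.695 | 14.581 | 1446.427 | 0.093288 | 10.01 | 18.619 | 22.19 | 973.676 | 0.143399 |
| AttrE | 24.962 | 43.724 | 50.724 | 1134.102 | 0.335602 | 25.724 | 44.686 | 51.067 | 1151.88 | 0.342782 |
| GCN-a | 24.762 | 38.314 | 42.381 | 1624.372 | 0.310143 | 27.371 | 41.314 | 44.781 | 1589.461 | 0.336441 |
| proposed | 25.676 | 39.21 | 43.352 | 1598.863 | 0.318712 | 27.352 | 41.448 | 45.143 | 1603.54 | 0.337185 |

| Methods | 50% Hits@1 | 50% Hits@5 | 50% Hits@10 | 50% MR | 50% MRR | 55% Hits@1 | 55% Hits@5 | 55% Hits@10 | 55% MR | 55% MRR |
|---|---|---|---|---|---|---|---|---|---|---|
| MTransE | 24.286 | 42.562 | 49.629 | 409.367 | 0.329354 | 27.362 | 46.305 | 53.657 | 419.483 | 0.362618 |
| TransD | 10.343 | 17.705 | 20.029 | 2965.64 | 0.137918 | 14.467 | 24.524 | 27.133 | 2388.665 | 0.191164 |
| RotatE | 19.152 | 36.924 | 43.39 | 1279.827 | 0.272637 | 20.552 | 37.829 | 44.457 | 1218.209 | 0.285249 |
| ConvE | 1.971 | 4.019 | 4.99 | 4473.383 | 0.030615 | 2.59 | 4.8 | 5.59 | 4428.979 | 0.037308 |
| AlignE | 9.352 | 17.257 | 20.676 | 1083.472 | 0.134112 | 12.486 | 21.467 | 24.943 | 937.215 | 0.169144 |
| AttrE | 31.59 | 52.457 | 59.362 | 842.895 | 0.408985 | 29.667 | 48.524 | 54.762 | 1052.755 | 0.38114 |
| GCN-a | 27.714 | 41.733 | 45.771 | 1575.704 | 0.341195 | 31.857 | 46.324 | 49.895 | 1516.203 | 0.383031 |
| proposed | 27.324 | 41.848 | 45.752 | 1583.95 | 0.338288 | 31.933 | 46.038 | 49.8 | 1471.284 | 0.382792 |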
Table 5. Experimental results of DBP-WD 15k.
Methods10%15%
Hits@1Hits@5Hits@10MRMRRHits@1Hits@5Hits@10MRMRR
MTransE2.8674.0575.2676.9247.3529.9059.83811.413.08613.457
TransD0.5431.0571.3241.9521.6482.3813.5526.0765.3528.724
RotatE1.1711.4382.2573.3524.015.7716.3718.5818.84811.79
ConvE0.6481.011.6861.791.9712.593.0765.0294.6386.105
AlignE2.2952.996.8389.0113.10516.77115.52416.71422.73324.457
AttrE0.7241.0951.8574.2572.3815.2576.0485.0868.0768.476
GCN-a3.4196.8389.1910.86711.16214.47617.13318.14319.14320.695
proposed3.5056.3149.08611.77112.63815.56218.219.53320.78122.352
Methods20%25%
Hits@1Hits@5Hits@10MRMRRHits@1Hits@5Hits@10MRMRR
MTransE7.410.1912.83816.27617.11421.46721.51424.25727.227.457
TransD1.1242.42.6293.7243.3624.9436.98111.810.32416.714
RotatE3.3334.8296.5629.67610.5915.42915.6121.05721.57126.705
ConvE1.412.0953.7334.0764.4135.3626.66710.019.47612.105
AlignE6.0678.25716.74320.61927.43831.81931.62932.86740.98141.219
AttrE1.8482.7054.14310.1336.01911.42912.58110.56216.16217.305
GCN-a9.50515.16219.7926.70528.45730.86736.38137.70538.43840.381
proposed9.48615.47619.9926.95227.17131.8136.48638.58140.78142.686
Methods30%35%
Hits@1Hits@5Hits@10MRMRRHits@1Hits@5Hits@10MRMRR
MTransE10.27614.47617.46721.55222.6127.76227.89530.634.12434.924
TransD1.42.993.44.7434.2766.218.42914.57112.84819.876
RotatE4.797.199.4113.6114.76220.4120.70527.45728.73331.133
ConvE1.9332.8194.7625.5055.416.9818.68612.73311.9915.133
AlignE8.07611.18121.8126.69534.7939.52438.74339.68643.22947.21
AttrE2.7143.9055.6113.598.0115.46717.16214.0121.15221.629
GCN-a12.220.224.56230.96232.78138.50540.89544.51446.67648.448
proposed12.68619.70526.15233.38133.42931.8142.82945.70547.88650
Methods40%45%
Hits@1Hits@5Hits@10MRMRRHits@1Hits@5Hits@10MRMRR
MTransE1221.42515927.487727.49686.476698.003626.808604.047559.855513.457
TransD4736.17984.1284270.813945.8963975.9513674.8793288.5952549.6982713.8222246.182
RotatE3102.2164266.8452176.2631829.481637.2521429.0061380.8081135.5711133.641913.709
ConvE4224.7122617.6773638.4733334.0573298.4782974.9172857.1472679.8932576.7572390.898
AlignE3492.3164037.0552078.831734.8551360.5111191.8891293.7251231.1471200.991174.354
AttrE2294.5033171.2411747.874981.0931502.616890.287747.9241018.332658.057635.495
GCN-a2010.8152016.9041554.3131340.2711272.2281209.5851158.0421110.0161077.5081001.897
proposed2040.7671753.2691510.2751314.3691225.5111161.8671024.251939.965930.52880.781
Methods50%55%
Hits@1Hits@5Hits@10MRMRRHits@1Hits@5Hits@10MRMRR
MTransE0.0558480.0770110.0946720.119940.1270120.1595180.1599510.1800990.2031670.206554
TransD0.0089510.0179510.0211270.0295970.0261320.0379580.0528980.0900540.079570.125669
RotatE0.0244050.0339390.0474530.067740.0761390.1072410.1120640.1487580.1527110.191802
ConvE0.0114050.0166330.028050.0313040.0325140.0417880.0509630.0767910.0720310.092388
AlignE0.0417290.0569710.1168590.1473530.2021630.2410580.2314960.242830.240940.232826
AttrE0.0154880.0219230.033580.0760790.0447890.0880230.0993090.0823970.1249090.131767
GCN-a0.0638380.1004170.1405280.184370.1906570.2275210.2715950.2845960.2847160.303919
proposed0.0667850.1079280.1458620.188170.1954170.2314340.2643710.2823980.2991760.316887
Table 6. KGs for edge computing.
| Purpose | Service | Device | Security | Ref. |
|---|---|---|---|---|
| Provide data integration and reasoning services to support data management in ES | O | O | X | [22] |
| Provide a group recommendation service of network document resources in ES | O | X | X | [35] |
| Perform a news recommendation service in edge computing environment | O | X | X | [36] |
| Provide interactive services of vehicles, such as traffic flow prediction and route arrangement | O | O | X | [37] |
| Data representation of the edge computing devices in manufacturing systems | O | O | X | [12] |
| Perform edge analytics to manage limited resources in ES | X | O | X | [38] |
| Integrate knowledge for industrial automation systems in edge computing environment | O | O | O | [39] |
| Make the consumption of KGs more reliable and faster in ESs | O | O | X | [40] |
| Detect anomalous activity in industrial systems | X | X | O | [41] |
| Provide security-aware data model and semantics in the dynamic collaboration environment | O | O | O | proposed |

Service, Device, and Security are the Functions columns. O: satisfied, X: not satisfied.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Kim, J.; Kim, K.; Sohn, M.; Park, G. Deep Model-Based Security-Aware Entity Alignment Method for Edge-Specific Knowledge Graphs. Sustainability 2022, 14, 8877. https://doi.org/10.3390/su14148877
