Article

Dual Quaternion Embeddings for Link Prediction

1 School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
2 College of Information Science and Technology, Jinan University, Guangzhou 510000, China
3 Data Quality Team, WeChat, Tencent Inc., Guangzhou 510000, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Appl. Sci. 2021, 11(12), 5572; https://doi.org/10.3390/app11125572
Submission received: 13 May 2021 / Revised: 8 June 2021 / Accepted: 10 June 2021 / Published: 16 June 2021
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

The applications of knowledge graphs have received much attention in the field of artificial intelligence. The quality of knowledge graphs is, however, often degraded by missing facts. To predict the missing facts, various transformation-based models have been proposed that map knowledge graphs into low-dimensional spaces. However, most of the existing transformation-based approaches ignore that there may be multiple relations between two entities, which is common in the real world. To address this challenge, we propose a novel approach called DualQuatE that maps entities and relations into a dual quaternion space. Specifically, entities are represented by pure quaternions and relations are modeled as combinations of rotation and translation from head to tail entities. We then utilize the interactions of different translations and rotations to distinguish the various relations between head and tail entities. Experimental results show that the performance of DualQuatE is competitive with existing state-of-the-art models.

1. Introduction

Knowledge graphs, which represent knowledge from real-world applications, contain abundant facts. In a knowledge graph, each fact is represented by a triple (h, r, t), which indicates that the relation r holds between the head entity h and the tail entity t. Knowledge graphs have been applied to various tasks such as explainable recommendation systems [1], question answering [2] and the prediction of future research collaborations [3].
Predicting missing facts (i.e., link prediction) is a fundamental task in knowledge graph research. Various models aiming at embedding entities and relations into low-dimensional spaces have been proposed. For example, TransE [4] learned the embeddings of entities and relations by translating the head entity to the tail entity according to the relation; RotatE [5] and QuatE [6] learned the embeddings of entities and relations by treating relations as rotations from head entities to tail entities. However, existing transformation-based models fail to capture multiple relations between head and tail entities. For example, as shown in Figure 1, David Lynch is the director, the creator and an actor in the film Mulholland Drive, i.e., there are three relations, directed, created and actedIn, between David Lynch and Mulholland Drive. These relations between the head entity David Lynch and the tail entity Mulholland Drive have no semantic connections with each other and should therefore be represented by spatially dispersed embeddings. Most existing transformation-based models, however, assume that there is only one relation between each pair of head and tail entities. For instance, in TransE the embeddings of each triple (h, r, t) are assumed to satisfy h + r ≈ t, which implies that, for (h, r1, t), (h, r2, t), and (h, r3, t), the embeddings of r1, r2, r3 are similar, as shown in Figure 2c (i.e., r1 ≈ r2 ≈ r3). To overcome this challenge, we propose a novel approach that considers multiple relations between head and tail entities in a knowledge graph.
In this paper, we propose a model called DualQuatE, which utilizes various combinations of distinct rotations and translations to represent multiple relations between head and tail entities. A natural idea is to combine RotatE with TransE in complex and real spaces; however, it is hard to find a uniform mathematical expression that conveys their combination. Therefore, we propose DualQuatE, which embeds entities and relations into a dual quaternion space to combine rotation and translation. A dual quaternion consists of a real part and a dual part. More concretely, we embed entities as pure-quaternion vectors representing points in three-dimensional space. To distinguish the various relations between a head entity h and a tail entity t, we design a score function that utilizes the dual quaternion Hamilton product to model relations as interactions of rotation and translation. We utilize distinct interactions of rotations and translations to represent the various relations between head and tail entities. Compared with RotatE and TransE in two-dimensional space, the dual quaternion space is eight-dimensional with six real degrees of freedom, three for translation and three for rotation; we can thus explore the interaction of rotation and translation with more degrees of freedom in higher dimensions. As summarized in Table 1, our model has rich abilities to express relations (i.e., relation patterns and multiple relations).
To conclude, the contributions of our proposed model are listed as follows:
  • We introduce dual quaternions to knowledge graph embeddings.
  • We propose a novel transformation based model DualQuatE to overcome the challenge of multiple relations between two entities.
  • Our experiments show that DualQuatE is effective compared to existing state-of-the-art models.
The rest of this paper is organized as follows. In Section 2, we introduce the related work. Section 3 presents prerequisite knowledge about dual quaternions. In Section 4, we describe our model. We present the experimental results together with analysis and discussion in Section 5. In Section 6, we conclude the paper and outline future work.

2. Related Work

To obtain high-quality knowledge graphs, approaches that utilize knowledge graph embedding to predict missing facts have been proposed recently. These methods fall into two broad categories [10]: transformation-based models and semantic matching models. Specifically, transformation-based models transform the head entity to the tail entity via relations, while semantic matching models match the latent semantics of entities and relations. Compared to transformation-based models, semantic matching models suffer from poor interpretability.
Transformation-based models usually embed entities and relations into a vector space and model each relation as a transformation from head entity embeddings to tail entity embeddings. One of the most representative models is TransE, which maps entities and relations to the same space R^k. For each triple (h, r, t), the entity embeddings h, t and relation embedding r are expected to satisfy h + r ≈ t. A series of extensions of TransE were then presented to improve accuracy and interpretability. For instance, TransR [11] introduced relation-specific spaces; it modeled relations and entities in different spaces, following the observation that TransE can only express 1-to-1 relations. RotatE mapped embeddings into complex space, focusing on expressing relation patterns. HAKE [7] utilized the polar coordinate system to capture semantic hierarchies in knowledge graphs.
Semantic matching models, which match the latent semantics of entities and relations, can be divided into two categories: bilinear models and neural network based models. Bilinear models include DistMult [8], HolE [12], SimplE [9], ComplEx [13], QuatE and DihEdral [14]. DistMult represented each entity as a vector and each relation as a diagonal matrix. HolE matched the latent semantics of entities by a circular correlation operation, and the compositional vector then interacts with relations. ComplEx, mapping knowledge graph embeddings into complex space, leveraged the Hermitian product to capture the latent semantics of entities and relations, which can express the antisymmetry relation pattern. QuatE, extending knowledge graph embeddings from complex space to quaternion space, modeled each relation as a rotation in four-dimensional space with more degrees of freedom. Compared with ComplEx, QuatE can express the main relation patterns except composition. SimplE proposed two embeddings for each entity, each of which learns latent semantics dependently. DihEdral mapped relations into the dihedral group to capture composition relations. Neural network based models, including ConvE [15], R-GCNs [16] and InteractE [17], have been proposed recently. ConvE and R-GCNs introduced convolutional networks and graph convolutional networks, respectively, to knowledge graph embedding. Compared with ConvE, InteractE introduced feature permutation, “checkered” feature reshaping and circular convolution to increase interaction.
Recently, some models have introduced hyperbolic space to knowledge graph embeddings. MuRP [18] represented knowledge graphs in the Poincaré ball of hyperbolic space. Chami et al. [19] attempted to capture hierarchical and logical patterns in hyperbolic space. Compared to hyperbolic space based models, which focus on semantic hierarchies in knowledge graphs, DualQuatE tries to overcome the challenge of multiple relations between two entities while also expressing relation patterns.
Both DualQuatE and QuatE use quaternions to embed knowledge graphs. However, they are two quite different models. The main differences between DualQuatE and QuatE are as follows:
  • DualQuatE, a transformation based model, measures the score of a triple by the distance between the two entities. QuatE, a semantic matching model, measures the latent matching semantics of entities and relations.
  • The purposes of the models are different. DualQuatE aims to address the challenge of having multiple relations between two entities, whereas QuatE aims to utilize the quaternion Hamilton product to encourage a more compact interaction between entities and relations.
  • The geometric meanings are different. QuatE embeds entities and relations with quaternions to model relations as rotations. Our model is the first to represent entities with pure quaternions and to model relations as interactions of translation and rotation.

3. Preliminaries

In this part, we introduce several concepts used in this paper.
  • Quaternion: A quaternion [20] is a number system that extends complex numbers to four dimensions. Generally, a quaternion is a number of the form q = a + bi + cj + dk, where a, b, c, d are real numbers and i, j, k satisfy i² = j² = k² = ijk = −1.
  • Quaternion conjugate: The conjugate of a quaternion q is q* = a − bi − cj − dk.
  • Quaternion Multiplication: The multiplication of two quaternions p = p0 + p1 i + p2 j + p3 k and q = q0 + q1 i + q2 j + q3 k is defined by:
    p q = (p0 q0 − p1 q1 − p2 q2 − p3 q3) + (p0 q1 + p1 q0 + p2 q3 − p3 q2) i + (p0 q2 − p1 q3 + p2 q0 + p3 q1) j + (p0 q3 + p1 q2 − p2 q1 + p3 q0) k
  • Rotation with quaternions in three-dimensional space: The point v is rotated to the point v′ about the unit vector u (i.e., the rotation axis) by an angle θ, which can be represented with quaternion multiplication. We define v and v′ as pure quaternions, i.e., quaternions whose real part is zero, and m = cos(θ/2) + u sin(θ/2) is a unit quaternion; then
    v′ = m v m*
  • Dual quaternion: A dual quaternion [21] is an element of an eight-dimensional real algebra built from quaternions. Formally, a dual quaternion δ can be represented by δ = p + ϵq, where ϵ is the dual unit with ϵ² = 0, and both the real part p and the dual part q are quaternions. Therefore, a dual quaternion δ is of the form δ = p0 + p1 i + p2 j + p3 k + ϵ(q0 + q1 i + q2 j + q3 k).
  • Dual quaternion conjugate: The conjugate of the dual quaternion δ = p + ϵq is defined as δ̄ = p* − ϵq*. A dual quaternion can also be represented by an 8-tuple: δ = (p0, p1, p2, p3, q0, q1, q2, q3).
  • Dual Quaternion Multiplication: Dual Quaternion Hamilton product between δ 1 = p 1 + ϵ q 1 and δ 2 = p 2 + ϵ q 2 is defined as follows:
    δ 1 δ 2 = p 1 p 2 + ϵ ( p 1 q 2 + q 1 p 2 )
  • Unit Dual Quaternion: A dual quaternion δ = p + ϵq is a unit dual quaternion if δδ* = 1, namely, if δ satisfies the following conditions:
    p0² + p1² + p2² + p3² = 1,  p0 q0 + p1 q1 + p2 q2 + p3 q3 = 0
    where δ* = p* + ϵq*. To simplify the calculation process, we use another effective form of the unit dual quaternion, defined as follows:
    δ = m + (ϵ/2) n m
    where m = cos(θ/2) + u sin(θ/2) and n is a pure quaternion. We prove that δ is a unit dual quaternion:
    δδ* = (m + (ϵ/2) n m)(m* + (ϵ/2)(n m)*) = m m* + (ϵ/2)(m m* n* + n m m*) = 1 + (ϵ/2)(n* + n) = 1 + (ϵ/2)(−n + n) = 1
    We can easily verify m m* = 1, as shown below:
    m m* = (cos(θ/2) + u sin(θ/2))(cos(θ/2) − u sin(θ/2)) = cos²(θ/2) + (u1² + u2² + u3²) sin²(θ/2) = 1
    since u = u1 i + u2 j + u3 k is a unit vector.
  • Combination of Rotation and Translation: We define a point in three-dimensional space as a pure quaternion v and let n be the translation. The point v, under the rotation θ followed by the translation n, becomes the point v′. It is straightforward to represent the transformation from v to v′ with unit dual quaternion multiplication, as shown below:
    δ(1 + ϵv)δ̄ = (m + (ϵ/2) n m)(1 + ϵv)(m* − (ϵ/2) m* n*) = m m* + ϵ[(1/2)(n m m* − m m* n*) + m v m*] = 1 + ϵ(m v m* + n) = 1 + ϵv′
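To make the operations above concrete, here is a minimal numerical sketch (illustrative code of ours, not the authors' implementation; all function names are assumptions) that applies a unit dual quaternion δ = m + (ϵ/2) n m to a point via the sandwich product and checks that the result equals m v m* + n:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions given as (w, x, y, z) arrays."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return np.array([
        p0*q0 - p1*q1 - p2*q2 - p3*q3,
        p0*q1 + p1*q0 + p2*q3 - p3*q2,
        p0*q2 - p1*q3 + p2*q0 + p3*q1,
        p0*q3 + p1*q2 - p2*q1 + p3*q0,
    ])

def qconj(q):
    """Quaternion conjugate q* = w - xi - yj - zk."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def dqmul(d1, d2):
    """Dual quaternion product (p1 + eps q1)(p2 + eps q2)."""
    p1, q1 = d1
    p2, q2 = d2
    return (qmul(p1, p2), qmul(p1, q2) + qmul(q1, p2))

def dqconj(d):
    """Combined conjugate (p + eps q)bar = p* - eps q*."""
    p, q = d
    return (qconj(p), -qconj(q))

# unit dual quaternion delta = m + (eps/2) n m: rotation by theta about
# the z axis followed by the translation n
theta = np.pi / 2
m = np.array([np.cos(theta / 2), 0.0, 0.0, np.sin(theta / 2)])
n = np.array([0.0, 1.0, 2.0, 3.0])           # pure quaternion translation (1, 2, 3)
delta = (m, 0.5 * qmul(n, m))

# sandwich product delta (1 + eps v) delta_bar on the point (1, 0, 0)
v = np.array([0.0, 1.0, 0.0, 0.0])
one = np.array([1.0, 0.0, 0.0, 0.0])
_, out = dqmul(dqmul(delta, (one, v)), dqconj(delta))

expected = qmul(qmul(m, v), qconj(m)) + n    # m v m* + n
print(np.round(out, 6))                      # ≈ (0, 1, 3, 3), i.e., the point (1, 3, 3)
```

Rotating (1, 0, 0) by 90° about z gives (0, 1, 0); translating by (1, 2, 3) then gives (1, 3, 3), matching the dual part of the sandwich product.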

4. Our DualQuatE Model

In this section, we introduce our model DualQuatE which maps entities and relations to dual quaternion space, and two variations of DualQuatE, namely DualQuatE-1 and DualQuatE-2.
We denote a knowledge graph by G, a set of entities by E and a set of relations by R. A knowledge graph G is composed of a set of facts, each of which can be represented by (h, r, t), where h ∈ E is a head entity, t ∈ E is a tail entity, and r ∈ R is a relation between h and t. We denote the set of facts that are true by Ω+ and the set of facts that are false by Ω−. Given a knowledge graph G, we aim to predict missing facts (i.e., link prediction) in G.

4.1. Multiple Relations between the Entities

To address the challenge of having multiple relations between head and tail entities, we embed the knowledge graph into dual quaternion space. h, r, t denote the vectors of entity embeddings and relation embeddings; each element hi or ti of the entity embeddings is a pure quaternion, and every dimension ri of the relation embeddings is a unit dual quaternion. We model the relation embedding r as an interaction of rotation and translation from the head entity embedding h to the tail entity embedding t, as shown in Figure 2a. Specifically, each true triple (h, r, t) satisfies:
r (1 + ϵh) r̄ = (1 + ϵt)
where each dimension of r is a unit dual quaternion satisfying Formula (4). We define a quaternion m = cos(θ/2) + u sin(θ/2) to represent a rotation about the pure unit quaternion u through the angle θ, and a pure quaternion n = n1 i + n2 j + n3 k. Furthermore, we define a unit dual quaternion by:
r = m + (ϵ/2) n m
With Formula (10), we can deduce the transformation of DualQuatE in Formula (9):
r (1 + ϵh) r̄ = 1 + ϵ(m h m* + n) = (1 + ϵt)
where the geometric meaning of m h m* is given in Formula (2). As shown above, DualQuatE transforms the head entity h to the tail entity t by the relation r, which combines a rotation (i.e., m) and a translation (i.e., n). Unlike previous models, which learn similar representations of the relations r1, r2, r3 as shown in Figure 2c,d, our model learns combinations of different translations and rotations to represent the various relations between head and tail entities.
We define the score function by:
f_r(h, t) = || r (1 + ϵh) r̄ − (1 + ϵt) ||
where || · || denotes the L2 norm of a vector. With this score function, we want the head entity to be as close to the tail entity as possible after the transformation by the relation.
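As a sketch of how one dual-quaternion dimension of this score can be computed (our illustrative code under the definitions above, not the released implementation; names are assumptions):

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions (w, x, y, z)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return np.array([
        p0*q0 - p1*q1 - p2*q2 - p3*q3,
        p0*q1 + p1*q0 + p2*q3 - p3*q2,
        p0*q2 - p1*q3 + p2*q0 + p3*q1,
        p0*q3 + p1*q2 - p2*q1 + p3*q0,
    ])

def qconj(q):
    return np.array([q[0], -q[1], -q[2], -q[3]])

def score(h, t, theta, axis, n):
    """f_r(h, t): distance between the transformed head m h m* + n and
    the tail t (one dual-quaternion dimension; lower = more plausible)."""
    m = np.concatenate([[np.cos(theta / 2)], np.sin(theta / 2) * axis])
    transformed = qmul(qmul(m, h), qconj(m)) + n
    return np.linalg.norm(transformed - t)
```

For a true triple, the rotated and translated head coincides with the tail and the score approaches zero; corrupted triples receive larger distances.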

4.2. Loss Function

We employ the self-adversarial negative sampling method [5] to generate corrupted samples. We define the probability distribution of negative samples by:
p(h′j, r, t′j | {(hi, ri, ti)}) = exp(−α f_r(h′j, t′j)) / Σi exp(−α f_r(h′i, t′i))
where α is the sampling temperature. Combined with self-adversarial negative sampling, we define the loss function by:
L = −log σ(γ − f_r(h, t)) − Σ_{i=1}^{n} p(h′i, r, t′i) log σ(f_r(h′i, t′i) − γ)
where γ is a fixed margin. Our training procedure is shown in Algorithm 1.
Algorithm 1 DualQuatE.
Input: entity embeddings E and relation embeddings R; hyperparameters including the margin γ, the embedding dimension k and the negative sample size n.
  • 1: h, t ← uniform(−(γ + 2.0)/k, (γ + 2.0)/k) for each h, t ∈ E
  •    r ← uniform(−(γ + 2.0)/k, (γ + 2.0)/k) for each r ∈ R
  • 2: repeat
  • 3:  T_pos ← uniformly sample a batch of positive triples (h, r, t)
  • 4:  (h′, r, t′) ← generate n negative samples for each (h, r, t)
  • 5:  T ← T_pos ∪ {(h′, r, t′)}
  • 6:  compute the weight p(h′j, r, t′j | {(h′i, r, t′i)}) of each negative sample (h′, r, t′)
  • 7:  update the relation embeddings r and entity embeddings h, t: h, r, t ← h, r, t − η ∇_{h,r,t} [−log σ(γ − f_r(h, t)) − Σ_{i=1}^{n} p(h′i, r, t′i) log σ(f_r(h′i, t′i) − γ)]
  • 8: until convergence
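The weighting and loss in steps 6–7 can be sketched compactly (our illustrative NumPy code, assuming as above that scores are distances with smaller meaning more plausible; the actual implementation is in PyTorch with automatic differentiation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_adversarial_loss(pos_score, neg_scores, gamma=9.0, alpha=1.0):
    """Self-adversarial negative-sampling loss for one positive triple.

    pos_score:  f_r(h, t) of the positive triple (scalar distance)
    neg_scores: f_r(h', t') of the n negative samples (array of distances)
    """
    # harder negatives (smaller distance) receive larger weight
    logits = -alpha * neg_scores
    weights = np.exp(logits - logits.max())
    weights = weights / weights.sum()        # softmax over negatives
    pos_term = -np.log(sigmoid(gamma - pos_score))
    neg_term = -np.sum(weights * np.log(sigmoid(neg_scores - gamma)))
    return pos_term + neg_term

loss = self_adversarial_loss(pos_score=1.0,
                             neg_scores=np.array([8.0, 12.0, 15.0]))
```

Note that, as in RotatE, the softmax weights are treated as constants (no gradient flows through them) during training.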

4.3. Properties of DualQuatE

In this part, we describe relation patterns and introduce how DualQuatE expresses them. Learning relation patterns, including symmetry/antisymmetry, inversion and composition, has recently been recognized as key to the link prediction task. Our model DualQuatE can easily explain the relation patterns of the learned relation embeddings; proofs of the relation patterns can be found in Appendix A.
Inversion: If a relation r′ ∈ R is the inverse of a relation r ∈ R, then we can infer (h, r, t) ∈ Ω+ ⇔ (t, r′, h) ∈ Ω+. For example, the relation has_part is the inverse of the relation part_of. For r and r′, we infer that (m′m) h (m′m)* + m′ n m′* + n′ = h, which denotes that the composition of the components m and m′ yields no rotation (i.e., (m′m) h (m′m)* = h) and the translation n rotated by m′ is the opposite of the translation n′ (i.e., m′ n m′* + n′ = 0).
Symmetry: A relation r ∈ R is symmetric if (h, r, t) ∈ Ω+ ⇔ (t, r, h) ∈ Ω+ holds. For instance, the relations similar_to and verb_group from the dataset WN18 are symmetric. If a relation is symmetric, we reason that (mm) h (mm)* + m n m* + n = h, which means that the self-composition of the component m yields no rotation (i.e., (mm) h (mm)* = h) and there is no net translation (i.e., m n m* + n = 0).
Antisymmetry: A relation r ∈ R is antisymmetric if (h, r, t) ∈ Ω+ ⇒ (t, r, h) ∈ Ω−, which requires (mm) h (mm)* + m n m* + n ≠ h. An example is the relation part_of.
Composition: A relation r3 is composed of the relations r1 and r2, denoted by r3 = r1 ∘ r2, if (h, r1, s) ∈ Ω+ ∧ (s, r2, t) ∈ Ω+ ⇒ (h, r3, t) ∈ Ω+. For example, the relation uncle_of can be composed of brother_of and father_of: if (Alva, brother_of, Aaron) and (Aaron, father_of, Abel) are true triples, we can reason that (Alva, uncle_of, Abel) is a true fact in the real world. If relation r3 is composed of relations r1 and r2, they can be represented by (m2 m1) h (m2 m1)* + m2 n1 m2* + n2 = m3 h m3* + n3, which implies that n3 equals the sum of the translation n2 and the translation n1 rotated by the rotation m2 (i.e., m2 n1 m2* + n2 = n3).

4.4. Variations

We introduce extensions of DualQuatE. DualQuatE is a transformation based model that combines rotation and translation. To measure the effect of the interaction of rotation and translation, we compare DualQuatE with DualQuatE-1, which models relations as pure rotations in three-dimensional space. Furthermore, we propose DualQuatE-2 to explore the role of scaling.
DualQuatE-1: We devise DualQuatE-1, which embeds entities and relations into quaternion space. Specifically, we represent the entity embeddings h, t with pure quaternions and the relation embeddings r with quaternions. We design the score function f_r(h, t) = || r h r* − t || to model the relation as a rotation in three-dimensional space. Namely, each fact satisfies r h r* = t.
DualQuatE-2: To explore the effect of scaling in knowledge graph embeddings, we present DualQuatE-2, which introduces scaling. DualQuatE-2 maps knowledge graph embeddings to four-dimensional space. Specifically, we represent entities and relations with quaternions, where the relation embeddings are not unit quaternions. We define the score function f_r(h, t) = || h r − t ||, meaning that the relation transforms the head entity to the tail entity by combining rotation and scaling.

4.5. Connection to TransE and RotatE

Compared with RotatE: RotatE embeds the entity embeddings h, t and the relation embedding r into complex space. RotatE utilizes the score function || h ∘ r − t || to measure the plausibility of each triple, where each ri is a unit complex number cos θ + i sin θ. DualQuatE can be reduced to RotatE by fixing the rotation plane and removing the translation variables. For instance, we can construct relation embeddings by Formula (10) in the xoy plane, where u = k and n = 0 (i.e., r = cos(θ/2) + sin(θ/2) k), and embed entities in the corresponding form h or t = a i + b j.
Compared with TransE: TransE models relations as translations, embedding the entity embeddings h, t and the relation embedding r in a real vector space. To express TransE, we can set θ = 0 (i.e., m = 1) in the relation embeddings to remove the rotation. In other words, the relation embeddings in DualQuatE can then be expressed as r = 1 + (ϵ/2) n.
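Both reductions can be checked numerically with a small sketch (our illustrative code under the definitions in Section 4.1; helper names are assumptions):

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions (w, x, y, z)."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return np.array([
        p0*q0 - p1*q1 - p2*q2 - p3*q3,
        p0*q1 + p1*q0 + p2*q3 - p3*q2,
        p0*q2 - p1*q3 + p2*q0 + p3*q1,
        p0*q3 + p1*q2 - p2*q1 + p3*q0,
    ])

def qconj(q):
    return np.array([q[0], -q[1], -q[2], -q[3]])

def transform(h, theta, axis, n):
    """DualQuatE transform m h m* + n with m = cos(theta/2) + axis sin(theta/2)."""
    m = np.concatenate([[np.cos(theta / 2)], np.sin(theta / 2) * axis])
    return qmul(qmul(m, h), qconj(m)) + n

h = np.array([0.0, 0.5, -1.0, 2.0])    # pure quaternion entity
n = np.array([0.0, 1.0, 1.0, 1.0])     # pure quaternion translation
axis = np.array([0.0, 0.0, 1.0])

# theta = 0 gives m = 1, so the transform degenerates to h + n (TransE)
assert np.allclose(transform(h, 0.0, axis, n), h + n)
# n = 0 gives the pure rotation m h m* (a RotatE-style relation);
# rotations preserve the norm of the entity embedding
assert np.isclose(np.linalg.norm(transform(h, np.pi / 3, axis, np.zeros(4))),
                  np.linalg.norm(h))
```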

5. Experiments

5.1. Experiment Settings

5.1.1. Datasets

We evaluated our approach on widely used datasets: WN18, FB15k, WN18RR, FB15k-237 and YAGO3-10, details of which are shown in Table 2. WN18 [4] is sampled from WordNet (https://wordnet.princeton.edu/ accessed on 11 June 2021), which is a knowledge graph about lexical relations of words. WN18RR [15] is a subset of WN18 with inverse relations removed. FB15k [4] is a large database with structured general human knowledge. FB15k-237 [22] is a subset of FB15k with reverse relations removed. YAGO3-10 [23] is a subset of YAGO3 which extends YAGO (https://io.datascience-paris-saclay.fr/dataset/YAGO accessed on 11 June 2021) in different languages. Tuples in YAGO3-10 mainly come from Wikipedia describing individuals, e.g., who lives in which city.

5.1.2. Evaluation Metric

Similar to [6], we used three metrics to evaluate our approach, i.e., Mean Rank (MR), Mean Reciprocal Rank (MRR), and Hit@n. To calculate these metrics, we first replace h by each entity h′ ∈ E for every test triple (h, r, t) ∈ T (where T is the set of test triples) and compute the score f_r(h′, t) of each triple (h′, r, t). We then sort the candidates by their scores in ascending order and obtain the rank of the original entity h, denoted by K(h). Note that K(h) is the rank of h rather than the score of h; e.g., if the score of h is the smallest, K(h) is 1. We calculate MR as shown below:
MR = ( Σ_{(h,r,t)∈T} K(h) ) / |T|
which means MR is an average of ranks of all the original entities in the testing triplets. Likewise, MRR can be calculated as follows:
MRR = ( Σ_{(h,r,t)∈T} 1/K(h) ) / |T|
which indicates that MRR is the average of the inverse ranks of all the original entities in the test triples. Hit@n gives the proportion of original entities ranked in the top n, which can be calculated by:
Hit@n = ( Σ_{(h,r,t)∈T} one(K(h) ≤ n) ) / |T|
where one(K(h) ≤ n) is 1 if K(h) ≤ n, and 0 if K(h) > n. We tested n = 1, 3, 10 in the evaluation, similar to the setting used in [6].
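The three metrics can be sketched in a few lines (our illustrative code; function names are assumptions):

```python
def rank_of(scores, true_idx):
    """Rank K(h) of the true entity when candidates are sorted by
    ascending score (rank 1 = smallest score)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order.index(true_idx) + 1

def evaluate(ranks, n=10):
    """Compute MR, MRR and Hit@n from a list of ranks K(h)."""
    mr = sum(ranks) / len(ranks)
    mrr = sum(1.0 / k for k in ranks) / len(ranks)
    hit_n = sum(1 for k in ranks if k <= n) / len(ranks)
    return mr, mrr, hit_n

mr, mrr, hit10 = evaluate([1, 3, 12], n=10)
# mr = 16/3 ≈ 5.33, mrr = (1 + 1/3 + 1/12)/3 ≈ 0.47, hit@10 = 2/3
```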

5.1.3. Baselines

We compared our model with several state-of-the-art baselines. For transformation based models, we compared with TransE [4], TorusE [24], RotatE [5] and HAKE [7]; for bilinear models, we compared with ComplEx [13], HolE [12], SimplE [9], DihEdral [14] and QuatE [6] (to make the comparison fair, we used the version of QuatE without type constraints on the common link prediction datasets, since the requirement of type constraints is too strong).

5.1.4. Implementation Details

We utilized PyTorch (https://pytorch.org accessed on 11 June 2021) to implement our model (https://github.com/gaoliming123/DualQuatE accessed on 11 June 2021) and its variations DualQuatE-1 and DualQuatE-2. We tuned the hyperparameters as follows: the embedding dimension k ∈ {100, 200, 300, 500}, the learning rate ∈ {0.0001, 0.0003, 0.0005, 0.0008}, the negative sample size n ∈ {32, 64, 128}, the margin γ ∈ {3, 6, 9, 12, 15, 18, 24} and the self-adversarial sampling temperature α ∈ {0.5, 1.0}. We adopted k = 100 for WN18RR and WN18, k = 200 for FB15k-237 and YAGO3-10 and k = 500 for FB15k.

5.2. Results

Table 3, Table 4 and Table 5 show the experimental results on the five datasets. The performance of DualQuatE and its variations is comparable to that of state-of-the-art models. For YAGO3-10, the link prediction results are shown in Table 3, from which we can see that DualQuatE is competitive with most previous knowledge graph embedding models, especially on the metric Hit@10. The results on YAGO3-10 show that DualQuatE performs better than DualQuatE-1, which indicates that modeling relations as the interaction of rotation and translation with more degrees of freedom (as done by DualQuatE) is indeed better than simply modeling relations as rotations (as done by DualQuatE-1). Furthermore, the strong results of DualQuatE-2 and DualQuatE inspire us to further explore the mixed effects of vector operations. Table 4 and Table 5 show the results of our models on four common datasets: WN18RR, FB15k-237, WN18 and FB15k. Our models perform better on WN18RR and FB15k-237; on WN18 and FB15k, most metrics are close to those of previous models and several surpass them.

5.3. Relation Embeddings

In this part, we analyze the properties that DualQuatE learns for relations. DualQuatE can distinguish multiple relations between head and tail entities, as shown in Figure 3. We compared our model with RotatE; Figure 3a,b display the difference between the representations of the relations actedIn and directed. Figure 3a shows that in RotatE the relations actedIn and directed are more similar, with the gap between the two relations clustered around zero. For DualQuatE, the difference shown in Figure 3b is more dispersed, although the learned embeddings of our model are still slightly concentrated around zero. We speculate that this is due to the small number of relations in YAGO3-10, which makes the relations between entities less diverse. Figure 3d shows the histograms of the translation components of the relations actedIn and directed. Compared with the relation directed, the distribution of the relation actedIn is more decentralized: the values of the translation component of directed are concentrated around zero, while the values for actedIn lie more around ±0.05. Namely, a head entity rotated by the rotation component of directed will already be closer to the tail entity. Figure 3c shows that the distributions of the embeddings of the relations actedIn and directed are very similar.
Due to space limitations, we visualize only some relation patterns in this paper. Figure 4a shows that the self-composition of the rotation m is close to 0 or 2π. Figure 4b,c show the rotation embeddings of the antisymmetric relation has_part from the dataset WN18. For the inverse relations r and r′, Figure 4d shows the embeddings of the rotation elements m and m′, whose composition is close to 0 or 2π.

5.4. Space and Time Complexity

In this part, we list the space and time complexity of different transformation based models and bilinear models, as shown in Table 6. m and n denote the numbers of entities and relations, respectively, and d is the dimension of the entity or relation embeddings.

6. Conclusions

In this paper, we propose a novel model, DualQuatE, for knowledge graph embedding, which maps entities and relations to dual quaternion space. We present a new score function that models each relation as an interaction of rotation and translation, which addresses multiple relations between two entities. We demonstrate that our model is able to express the main relation patterns and outperforms state-of-the-art baselines. However, DualQuatE does not consider temporal information or semantic hierarchies in knowledge graphs. In the future, we will investigate how to exploit temporal information and semantic hierarchies based on our model. It is also interesting to investigate the possibility of applying our DualQuatE model to learning representations of propositions for helping learn action models [25,26,27,28] and recognize plans [29,30,31] in the planning community.

Author Contributions

Conceptualization, L.G., H.Z. and H.H.Z.; methodology, L.G. and H.Z.; software, L.G.; validation, L.G., H.Z. and H.H.Z.; formal analysis, L.G. and H.H.Z.; investigation, L.G. and H.Z.; resources, L.G., H.Z. and H.H.Z.; data curation, L.G.; writing—original draft preparation, L.G.; writing—review and editing, H.Z. and H.H.Z.; visualization, L.G.; supervision, H.Z., H.H.Z. and J.X.; project administration, L.G.; funding acquisition, H.Z. and H.H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 62076263), the National Natural Science Foundation of China for Young Scientists of China (Grant No. 11701592), the Joint Funds of the National Natural Science Foundation of China (Grant No. U1811263), Guangdong Natural Science Funds for Distinguished Young Scholar (Grant No. 2017A030306028), Guangdong special branch plans young talent with scientific and technological innovation (Grant No. 2017TQ04X866), Pearl River Science and Technology New Star of Guangzhou and Guangdong Province Key Laboratory of Big Data Analysis and Processing.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found here: https://github.com/gaoliming123/DualQuatE, accessed on 11 June 2021.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Relation Patterns

Symmetry: If a relation r ∈ R is symmetric, then both (h, r, t) ∈ G and (t, r, h) ∈ G, and the following equations hold:
r (1 + ϵh) r̄ = (1 + ϵt),  r (1 + ϵt) r̄ = (1 + ϵh)
Then we deduce that:
m h m* + n = t
m t m* + n = h
Substituting Formula (5) into Formula (6), we get:
m (m h m* + n) m* + n = h  ⟹  (mm) h (mm)* + m n m* + n = h
Inversion: A relation r′ ∈ R is the inverse of a relation r ∈ R iff both (h, r, t) ∈ G and (t, r′, h) ∈ G. Then the following equations hold:
r (1 + ϵh) r̄ = (1 + ϵt),  r′ (1 + ϵt) r̄′ = (1 + ϵh)
Then we deduce that:
m h m* + n = t
m′ t m′* + n′ = h
Substituting Formula (10) into Formula (11), we get:
m′ (m h m* + n) m′* + n′ = h  ⟹  (m′m) h (m′m)* + m′ n m′* + n′ = h
Composition: A relation r3 is the composition of the relations r1 and r2, denoted by r3 = r1 ∘ r2, if (h, r1, s) ∈ G and (s, r2, t) ∈ G imply (h, r3, t) ∈ G. Then:
m1 h m1* + n1 = s
m2 s m2* + n2 = t
m3 h m3* + n3 = t
Then we can get:
m2 (m1 h m1* + n1) m2* + n2 = m3 h m3* + n3  ⟹  (m2 m1) h (m2 m1)* + m2 n1 m2* + n2 = m3 h m3* + n3

References

1. Chen, Z.; Wang, X.; Xie, X.; Wu, T.; Bu, G.; Wang, Y.; Chen, E. Co-Attentive Multi-Task Learning for Explainable Recommendation; Kraus, S., Ed.; IJCAI: Macao, China, 2019; pp. 2137–2143.
2. Kumar, V.; Hua, Y.; Ramakrishnan, G.; Qi, G.; Gao, L.; Li, Y. Difficulty-Controllable Multi-hop Question Generation from Knowledge Graphs. In Proceedings of the Semantic Web—ISWC 2019—18th International Semantic Web Conference, Auckland, New Zealand, 26–30 October 2019; Ghidini, C., Hartig, O., Maleshkova, M., Svátek, V., Cruz, I.F., Hogan, A., Song, J., Lefrançois, M., Gandon, F., Eds.; Part I; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2019; Volume 11778, pp. 382–398.
3. Kanakaris, N.; Giarelis, N.; Siachos, I.; Karacapilidis, N. Shall I Work with Them? A Knowledge Graph-Based Approach for Predicting Future Research Collaborations. Entropy 2021, 23, 664.
4. Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. Adv. Neural Inf. Process. Syst. 2013, 26, 2787–2795.
5. Sun, Z.; Deng, Z.; Nie, J.; Tang, J. RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space. arXiv 2019, arXiv:1902.10197.
6. Zhang, S.; Tay, Y.; Yao, L.; Liu, Q. Quaternion Knowledge Graph Embeddings. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, Vancouver, BC, Canada, 8–14 December 2019; Wallach, H.M., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E.B., Garnett, R., Eds.; NeurIPS: Vancouver, BC, Canada, 2019; pp. 2731–2741.
7. Zhang, Z.; Cai, J.; Zhang, Y.; Wang, J. Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction; AAAI Press: Palo Alto, CA, USA, 2020; pp. 3065–3072.
8. Yang, B.; Yih, W.; He, X.; Gao, J.; Deng, L. Embedding Entities and Relations for Learning and Inference in Knowledge Bases; Bengio, Y., LeCun, Y., Eds.; ICLR: San Diego, CA, USA, 2015.
9. Kazemi, S.M.; Poole, D. SimplE Embedding for Link Prediction in Knowledge Graphs; Bengio, S., Wallach, H.M., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R., Eds.; NeurIPS: Montréal, QC, Canada, 2018; pp. 4289–4300.
10. Wang, Q.; Mao, Z.; Wang, B.; Guo, L. Knowledge Graph Embedding: A Survey of Approaches and Applications. IEEE Trans. Knowl. Data Eng. 2017, 29, 2724–2743.
11. Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; Zhu, X. Learning Entity and Relation Embeddings for Knowledge Graph Completion; Bonet, B., Koenig, S., Eds.; AAAI Press: Palo Alto, CA, USA, 2015; pp. 2181–2187.
12. Nickel, M.; Rosasco, L.; Poggio, T.A. Holographic Embeddings of Knowledge Graphs; Schuurmans, D., Wellman, M.P., Eds.; AAAI Press: Palo Alto, CA, USA, 2016; pp. 1955–1961.
13. Trouillon, T.; Welbl, J.; Riedel, S.; Gaussier, É.; Bouchard, G. Complex Embeddings for Simple Link Prediction; ICML: New York, NY, USA, 2016; pp. 2071–2080.
14. Xu, C.; Li, R. Relation Embedding with Dihedral Group in Knowledge Graph; ACL: Florence, Italy, 2019; pp. 263–272.
15. Dettmers, T.; Minervini, P.; Stenetorp, P.; Riedel, S. Convolutional 2D Knowledge Graph Embeddings. In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI Press: Palo Alto, CA, USA, 2018; pp. 1811–1818.
16. Schlichtkrull, M.S.; Kipf, T.N.; Bloem, P.; van den Berg, R.; Titov, I.; Welling, M. Modeling Relational Data with Graph Convolutional Networks. In Proceedings of the Semantic Web—15th International Conference, ESWC 2018, Heraklion, Crete, Greece, 3–7 June 2018; Gangemi, A., Navigli, R., Vidal, M., Hitzler, P., Troncy, R., Hollink, L., Tordai, A., Alam, M., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2018; Volume 10843, pp. 593–607.
17. Vashishth, S.; Sanyal, S.; Nitin, V.; Agrawal, N.; Talukdar, P.P. InteractE: Improving Convolution-Based Knowledge Graph Embeddings by Increasing Feature Interactions; AAAI Press: Palo Alto, CA, USA, 2020; pp. 3009–3016.
18. Balazevic, I.; Allen, C.; Hospedales, T. Multi-relational Poincaré graph embeddings. Adv. Neural Inf. Process. Syst. 2019, 32, 4463–4473.
19. Chami, I.; Wolf, A.; Juan, D.; Sala, F.; Ravi, S.; Ré, C. Low-Dimensional Hyperbolic Knowledge Graph Embeddings. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, 5–10 July 2020; Jurafsky, D., Chai, J., Schluter, N., Tetreault, J.R., Eds.; Association for Computational Linguistics: Beijing, China, 2020; pp. 6901–6914.
20. Hamilton, W.R. LXXVIII. On quaternions; or on a new system of imaginaries in Algebra: To the editors of the Philosophical Magazine and Journal. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1844, 25, 489–495.
21. Kotelnikov, A.P. Screw calculus and some applications to geometry and mechanics. Annu. Imp. Univ. Kazan 1895, 24.
22. Toutanova, K.; Chen, D. Observed versus latent features for knowledge base and text inference. In Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality; Association for Computational Linguistics: Beijing, China, 2015; pp. 57–66.
23. Mahdisoltani, F.; Biega, J.; Suchanek, F.M. YAGO3: A Knowledge Base from Multilingual Wikipedias. In Proceedings of the Seventh Biennial Conference on Innovative Data Systems Research (CIDR 2015), Asilomar, CA, USA, 4–7 January 2015.
24. Ebisu, T.; Ichise, R. TorusE: Knowledge Graph Embedding on a Lie Group. In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI Press: Palo Alto, CA, USA, 2018; pp. 1819–1826.
25. Zhuo, H.H.; Muñoz-Avila, H.; Yang, Q. Learning hierarchical task network domains from partially observed plan traces. Artif. Intell. 2014, 212, 134–157.
26. Zhuo, H.H.; Yang, Q. Action-model acquisition for planning via transfer learning. Artif. Intell. 2014, 212, 80–103.
27. Zhuo, H.H.; Zha, Y.; Kambhampati, S. Discovering Underlying Plans Based on Shallow Models. ACM Trans. Intell. Syst. Technol. 2020, 11, 18:1–18:30.
28. Zhuo, H.H.; Kambhampati, S. Model-lite planning: Case-based vs. model-based approaches. Artif. Intell. 2014, 246, 1–21.
29. Zhuo, H.H. Recognizing Multi-Agent Plans When Action Models and Team Plans Are Both Incomplete. ACM Trans. Intell. Syst. Technol. 2019, 10, 30:1–30:24.
30. Feng, W.; Zhuo, H.H.; Kambhampati, S. Extracting Action Sequences from Texts Based on Deep Reinforcement Learning. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, Sweden, 13–19 July 2018; pp. 4064–4070.
31. Zhuo, H.H. Human-Aware Plan Recognition. In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI Press: Palo Alto, CA, USA, 2017; pp. 3686–3693.
Figure 1. Visualization of a partial knowledge graph from YAGO3-10. The blue relations indicate that there can be multiple relations between two entities.
Figure 2. Geometrical significance of DualQuatE and multiple relations between the entities. (a) denotes the interaction of rotation and translation from the head entity h to the tail entity t. m θ represents the rotation of the relation, n denotes the translation of the relation. (b) shows how to express multiple relations between the head entity h and the tail entity t for DualQuatE. (c,d) demonstrate that TransE and RotatE fail to model multiple relations.
Figure 3. Visualization of the multiple relations between the entities. (a) shows the histograms of the difference between the actedIn embeddings and the directed embeddings of RotatE. (b) shows the corresponding histograms for DualQuatE. (c) shows the histograms of the actedIn and directed embeddings of TransE. (d) shows the histograms of the translation embeddings of actedIn and directed for DualQuatE.
Figure 4. Visualization of relation patterns represented by DualQuatE in rotation, where (a) denotes the symmetry relation “similar_to” in rotation of m m , (b,c) are rotations of “has_part” and “part_of”, and (d) exhibits the inversion effects, where “has_part” • “part_of” represents m m .
Table 1. The ability to express relation patterns and multiple relations between head and tail entities.

Model | Symmetry | Antisymmetry | Inversion | Composition | Multiple Relations
TransE | 🗸 | 🗸 | 🗸 | 🗸 | ×
RotatE | 🗸 | 🗸 | 🗸 | 🗸 | ×
HAKE [7] | 🗸 | 🗸 | 🗸 | 🗸 | ×
DistMult [8] | 🗸 | × | × | × | 🗸
ComplEx [9] | 🗸 | 🗸 | 🗸 | × | 🗸
QuatE | 🗸 | 🗸 | 🗸 | × | 🗸
DualQuatE | 🗸 | 🗸 | 🗸 | 🗸 | 🗸
Table 2. Statistics of the experimental datasets. #E and #R denote the numbers of entities and relations in each dataset; #TR, #V and #TE denote the sizes of the training, validation and test sets.

Dataset | #E | #R | #TR | #V | #TE
FB15k | 14,951 | 1345 | 483,142 | 50,000 | 59,071
FB15k-237 | 14,541 | 237 | 272,115 | 17,535 | 20,466
WN18 | 40,943 | 18 | 141,442 | 5000 | 5000
WN18RR | 40,943 | 11 | 86,835 | 3034 | 3134
YAGO3-10 | 123,182 | 37 | 1,079,040 | 5000 | 5000
Table 3. Link prediction results on the YAGO3-10 dataset. ♠ marks results taken from (Toutanova and Chen, 2015); the others come from the original papers. The best results are in bold and the second-best are underlined.

Model | MR | MRR | Hit@1 | Hit@3 | Hit@10
DistMult ♠ | 5926 | 0.34 | 0.24 | 0.38 | 0.54
ComplEx ♠ | 6351 | 0.36 | 0.26 | 0.40 | 0.55
ConvE ♠ | 1671 | 0.44 | 0.35 | 0.49 | 0.62
RotatE | 1767 | 0.495 | 0.402 | 0.550 | 0.670
InteractE | 2375 | 0.541 | 0.462 | - | 0.687
HAKE | - | 0.545 | 0.462 | 0.596 | 0.694
DualQuatE-1 | 1636 | 0.477 | 0.377 | 0.534 | 0.672
DualQuatE-2 | 1889 | 0.503 | 0.411 | 0.557 | 0.676
DualQuatE | 1210 | 0.534 | 0.445 | 0.591 | 0.695
Table 4. Link prediction results on the FB15k-237 and WN18RR datasets. ♣ marks results taken from (Sun et al., 2019); the others come from the original papers. ¶ denotes the results of QuatE without type constraints from the paper. The best results are in bold and the second-best are underlined.

Model | FB15k-237: MR | MRR | Hit@1 | Hit@3 | Hit@10 | WN18RR: MR | MRR | Hit@1 | Hit@3 | Hit@10
TransE ♣ | 357 | 0.294 | - | - | 0.465 | 3384 | 0.226 | - | - | 0.501
ComplEx ♣ | 339 | 0.247 | 0.158 | 0.275 | 0.428 | 5261 | 0.44 | 0.41 | 0.46 | 0.51
RotatE ♣ | 177 | 0.338 | 0.241 | 0.375 | 0.533 | 3340 | 0.476 | 0.428 | 0.492 | 0.571
DihEdral | - | 0.32 | 0.23 | 0.353 | 0.502 | - | 0.48 | 0.452 | 0.491 | 0.536
QuatE ¶ | 176 | 0.311 | 0.221 | 0.342 | 0.495 | 3472 | 0.481 | 0.436 | 0.500 | 0.564
InteractE | 172 | 0.354 | 0.263 | - | 0.535 | 5202 | 0.463 | 0.430 | - | 0.528
HAKE | - | 0.346 | 0.250 | 0.381 | 0.542 | - | 0.497 | 0.452 | 0.516 | 0.582
DualQuatE-1 | 173 | 0.329 | 0.230 | 0.368 | 0.530 | 2989 | 0.463 | 0.408 | 0.484 | 0.571
DualQuatE-2 | 174 | 0.345 | 0.246 | 0.384 | 0.545 | 3324 | 0.484 | 0.437 | 0.503 | 0.576
DualQuatE | 171 | 0.342 | 0.245 | 0.381 | 0.535 | 2755 | 0.470 | 0.415 | 0.493 | 0.582
Table 5. Link prediction results on the FB15k and WN18 datasets. ♣ marks results taken from (Sun et al., 2019); the others come from the original papers. ¶ denotes the results of QuatE without type constraints from the paper. The best results are in bold and the second-best are underlined.

Model | FB15k: MR | MRR | Hit@1 | Hit@3 | Hit@10 | WN18: MR | MRR | Hit@1 | Hit@3 | Hit@10
TransE ♣ | - | 0.463 | 0.297 | 0.578 | 0.749 | - | 0.495 | 0.113 | 0.888 | 0.943
ComplEx | - | 0.692 | 0.599 | 0.759 | 0.840 | - | 0.941 | 0.936 | 0.945 | 0.947
HolE | - | 0.524 | 0.402 | 0.613 | 0.739 | - | 0.938 | 0.930 | 0.945 | 0.949
TorusE | - | 0.733 | 0.674 | 0.771 | 0.832 | - | 0.619 | 0.943 | 0.950 | 0.954
SimplE | - | 0.727 | 0.660 | 0.773 | 0.838 | - | 0.942 | 0.939 | 0.944 | 0.947
RotatE ♣ | 40 | 0.797 | 0.746 | 0.830 | 0.884 | 309 | 0.949 | 0.944 | 0.952 | 0.959
DihEdral | - | 0.733 | 0.641 | 0.803 | 0.877 | - | 0.946 | 0.942 | 0.948 | 0.952
QuatE ¶ | 41 | 0.770 | 0.700 | 0.821 | 0.878 | 388 | 0.949 | 0.941 | 0.954 | 0.960
DualQuatE-1 | 31 | 0.751 | 0.659 | 0.825 | 0.884 | 241 | 0.947 | 0.939 | 0.952 | 0.959
DualQuatE-2 | 50 | 0.766 | 0.696 | 0.818 | 0.877 | 220 | 0.948 | 0.942 | 0.953 | 0.961
DualQuatE | 35 | 0.754 | 0.664 | 0.827 | 0.884 | 183 | 0.949 | 0.943 | 0.952 | 0.960
Table 6. Space and time complexity.

Method | Space Complexity | Time Complexity
TransE | O(nd + md) | O(d)
RotatE | O(nd + md) | O(d)
HAKE | O(nd + md) | O(d)
RESCAL | O(nd + md²) | O(d²)
HolE | O(nd + md) | O(d log d)
ComplEx | O(nd + md) | O(d)
QuatE | O(nd + md) | O(d)
DualQuatE | O(nd + md) | O(d)
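To see where the O(d) time entry for DualQuatE comes from, consider the sketch below. It is hypothetical, not the authors' implementation: it assumes a distance-style score ‖m h m* + n − t‖ suggested by the appendix identity m h m* + n = t, and all names (`qmul`, `qconj`, `score`) are illustrative. Every operation is elementwise over the d embedding components, so scoring one triple costs O(d), and storing n entity and m relation embeddings of dimension d costs O(nd + md).

```python
# Illustrative O(d) scoring sketch (assumed distance-style score, not the
# authors' exact implementation). Each embedding is d quaternions stored as
# four length-d component arrays; all quaternion arithmetic is elementwise.
import numpy as np

def qmul(p, q):
    """Elementwise Hamilton product over d quaternion components."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)

def qconj(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def score(h, m, n, t):
    """Negative distance between m h m* + n and t; O(d) per triple."""
    pred = tuple(a + b for a, b in zip(qmul(qmul(m, h), qconj(m)), n))
    return -sum(np.sum((a - b) ** 2) for a, b in zip(pred, t)) ** 0.5

d = 8
rng = np.random.default_rng(0)
h = (np.zeros(d),) + tuple(rng.normal(size=d) for _ in range(3))  # pure quaternions
m = tuple(rng.normal(size=d) for _ in range(4))
norm = np.sqrt(sum(c ** 2 for c in m))
m = tuple(c / norm for c in m)                                    # unit rotations
n = (np.zeros(d),) + tuple(rng.normal(size=d) for _ in range(3))
t = tuple(a + b for a, b in zip(qmul(qmul(m, h), qconj(m)), n))   # perfect tail

assert abs(score(h, m, n, t)) < 1e-9  # a consistent triple scores (near) zero
```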
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Gao, L.; Zhu, H.; Zhuo, H.H.; Xu, J. Dual Quaternion Embeddings for Link Prediction. Appl. Sci. 2021, 11, 5572. https://doi.org/10.3390/app11125572
