Dual Quaternion Embeddings for Link Prediction

Abstract: The applications of knowledge graphs have received much attention in the field of artificial intelligence. The quality of knowledge graphs is, however, often impaired by missing facts. To predict the missing facts, various transformation based models have been proposed that map knowledge graphs into low-dimensional spaces. However, most existing transformation based approaches ignore the fact that there may be multiple relations between two entities, which is common in the real world. To address this challenge, we propose a novel approach called DualQuatE that maps entities and relations into a dual quaternion space. Specifically, entities are represented by pure quaternions, and relations are modeled as combinations of rotation and translation from head to tail entities. We then utilize the interactions of different translations and rotations to distinguish the various relations between head and tail entities. Experimental results show that the performance of DualQuatE is competitive with existing state-of-the-art models.


Introduction
Knowledge graphs, which represent knowledge from real world applications, contain abundant facts. In knowledge graphs, each fact is represented by a triple (h, r, t), which indicates that the relation r holds between the head entity h and the tail entity t. Knowledge graphs have been applied to various tasks such as explainable recommendation systems [1], question answering [2], and the prediction of future research collaborations [3].
Predicting missing facts (i.e., link prediction) is a fundamental task in knowledge graph research. Various models that embed entities and relations into low-dimensional spaces have been proposed. For example, TransE [4] learned embeddings of entities and relations by translating the head entity to the tail entity according to the relation; RotatE [5] and QuatE [6] learned embeddings by modeling relations as rotations from head entities to tail entities. However, existing transformation based models fail to capture multiple relations between head and tail entities. For example, as shown in Figure 1, David Lynch is the director, the creator, and an actor of the film Mulholland Drive; i.e., there are three relations, directed, created, and actedIn, between David Lynch and Mulholland Drive. These relations between the head entity David Lynch and the tail entity Mulholland Drive have no semantic connection with each other, and should therefore be represented by spatially dispersed embeddings. Most existing transformation based models, however, assume that there is only one relation between each pair of head and tail entities. For instance, in TransE the embeddings of each triple (h, r, t) are assumed to satisfy h + r ≈ t, which implies, for (h, r1, t), (h, r2, t), and (h, r3, t), that the embeddings of r1, r2, r3 are similar, as shown in Figure 2c (i.e., r1 ≈ r2 ≈ r3). To overcome this challenge, we propose a novel approach that considers multiple relations between head and tail entities in a knowledge graph. In this paper, we propose a model called DualQuatE, which utilizes various combinations of distinct rotations and translations to represent multiple relations between head and tail entities. A natural idea is to combine RotatE with TransE in the complex and real spaces; however, it is hard to find a uniform mathematical expression for this combination.
Therefore, we propose DualQuatE, which embeds entities and relations into a dual quaternion space to combine rotation and translation. A dual quaternion consists of a real part and a dual part. More concretely, we embed entities as pure quaternion vectors, i.e., points in three-dimensional space. To distinguish the various relations between a head entity h and a tail entity t, we design a score function that uses the dual quaternion Hamilton product to model each relation as an interaction of rotation and translation, and we use distinct interactions of rotations and translations to represent the various relations between head and tail entities. Compared with RotatE and TransE in two-dimensional space, the dual quaternion space is eight-dimensional with six real degrees of freedom, three for translation and three for rotation, so we can explore the interaction of rotation and translation with more degrees of freedom in higher dimensions. As summarized in Table 1, our model has rich abilities to express relations (i.e., relation patterns and multiple relations).
To conclude, the contributions of our proposed model are as follows:
• We introduce dual quaternions to knowledge graph embeddings.
• We propose a novel transformation based model, DualQuatE, to overcome the challenge of multiple relations between two entities.
• Our experiments show that DualQuatE is effective compared to existing state-of-the-art models.

The rest of this paper is organized as follows. Section 2 introduces related work. Section 3 presents prerequisite knowledge about dual quaternions. Section 4 describes our model. Section 5 presents the results of our experiments with analysis and discussion. Section 6 concludes this paper and discusses future work.

Related Work
To obtain high-quality knowledge graphs, approaches that use knowledge graph embeddings to predict missing facts have been proposed recently. These methods fall into two broad categories [10]: transformation based models and semantic matching models. Specifically, transformation based models transform the head entity to the tail entity through relations, while semantic matching models match the latent semantics of entities and relations. Compared to transformation based models, semantic matching models suffer from poor interpretability.
Transformation based models usually embed entities and relations into a vector space and model each relation as a transformation from head entity embeddings to tail entity embeddings. One of the most representative models is TransE, which mapped entities and relations to the same space R^k: for each triple (h, r, t), the entity embeddings h, t and relation embedding r satisfy h + r ≈ t. A series of extensions of TransE have since been presented to improve accuracy and interpretability. For instance, TransR [11] introduced relation-specific spaces, modeling relations and entities in different spaces, motivated by the observation that TransE can only express 1-to-1 relations. RotatE mapped embeddings into the complex space, focusing on expressing relation patterns. HAKE [7] utilized the polar coordinate system to capture semantic hierarchies in knowledge graphs.
Semantic matching models, which match the latent semantics of entities and relations, can be divided into two categories: bilinear models and neural network based models. Bilinear models include DistMult [8], HolE [12], SimplE [9], ComplEx [13], QuatE, and DihEdral [14]. DistMult represented each entity as a vector and each relation as a diagonal matrix. HolE matched the latent semantics of entities by a circular correlation operation, and the compositional vector then interacted with the relations. ComplEx mapped knowledge graph embeddings into the complex space and leveraged the Hermitian product to capture the latent semantics of entities and relations, which can express the antisymmetry relation pattern. QuatE extended knowledge graph embeddings from the complex space to the quaternion space, modeling each relation as a rotation in four-dimensional space with more degrees of freedom; compared with ComplEx, QuatE can express the main relation patterns except composition. SimplE proposed two embeddings for each entity, which learn latent semantics dependently. DihEdral mapped relations into the dihedral group to capture composition relations. Neural network based models, including ConvE [15], R-GCNs [16], and InteractE [17], have been proposed recently. ConvE and R-GCNs introduced convolutional networks and graph convolutional networks to knowledge graph embedding, respectively. Compared with ConvE, InteractE introduced feature permutation, "checkered" feature reshaping, and circular convolution to increase interactions.
Recently, some models have introduced hyperbolic space to knowledge graph embeddings. MuRP [18] represented knowledge graphs in the Poincaré ball of hyperbolic space. Chami et al. [19] attempted to capture hierarchical and logical patterns in hyperbolic space. Compared to hyperbolic space based models, which focus on semantic hierarchies in knowledge graphs, DualQuatE tries to overcome the challenge of multiple relations between two entities while also expressing relation patterns.
Both DualQuatE and QuatE use quaternions to embed knowledge graphs. However, they are two different models. The main differences between DualQuatE and QuatE are as follows:
• DualQuatE, a transformation based model, measures the score of a triple by the distance between the two entities; QuatE, a semantic matching model, measures the latent matching semantics of entities and relations.
• The purposes of the models are different. DualQuatE aims to address the challenge of having multiple relations between two entities; QuatE aims to utilize the quaternion Hamilton product to encourage a more compact interaction between entities and relations.
• The geometric meanings are different. QuatE embeds entities and relations with quaternions to model relations as rotations; our model is the first attempt to represent entities with pure quaternions and model relations as interactions of translation and rotation.

Preliminaries
In this part, we introduce several concepts used in this paper.
• Quaternion: A quaternion [20] is an element of a number system that extends complex numbers to four dimensions. Generally, a quaternion is a number of the form q = a + bi + cj + dk, where a, b, c, d are real numbers and the imaginary units i, j, k satisfy i^2 = j^2 = k^2 = ijk = −1.
• Rotation with quaternions in three-dimensional space: A point v rotated about a unit axis u through an angle θ becomes the point v'. This rotation can be represented with quaternion multiplication: we regard v and v' as pure quaternions (i.e., quaternions whose real part is zero) and let m = cos(θ/2) + u sin(θ/2) be a unit quaternion; then v' = m v m*, where m* is the quaternion conjugate of m.
• Dual quaternion: A dual quaternion [21] is an element of an eight-dimensional real algebra that combines dual numbers with quaternions. Formally, a dual quaternion δ can be represented by δ = p + εq, where ε is the dual unit with ε^2 = 0, and both the real part p and the dual part q are quaternions. Therefore, a dual quaternion δ is of the form δ = p0 + p1 i + p2 j + p3 k + ε(q0 + q1 i + q2 j + q3 k).
• Dual quaternion conjugates: The quaternion conjugate of the dual quaternion δ = p + εq is δ* = p* + εq*, and its combined conjugate is δ̄* = p* − εq*.
• Dual quaternion multiplication: The dual quaternion Hamilton product of δ1 = p1 + εq1 and δ2 = p2 + εq2 is δ1 ⊗ δ2 = p1 p2 + ε(p1 q2 + q1 p2), since ε^2 = 0.
• Unit dual quaternion: A dual quaternion δ = p + εq is a unit dual quaternion if δ ⊗ δ* = 1, namely, if p p* = 1 and p q* + q p* = 0. To simplify the calculation process, we use another effective form of the unit dual quaternion: δ = m + (ε/2) n m, where m = cos(θ/2) + u sin(θ/2) is a unit quaternion and n is a pure quaternion. We can verify that δ is a unit dual quaternion: m m* = 1, and the dual part of δ ⊗ δ* is m (n m / 2)* + (n m / 2) m* = (n* + n)/2 = 0, since n* = −n for a pure quaternion.
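To make the algebra above concrete, the following is a minimal numerical sketch (not from the paper; the function names and the [w, x, y, z] storage convention are our own) of the Hamilton product, the dual quaternion product, and a check that δ = m + (ε/2)nm is indeed a unit dual quaternion.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as [w, x, y, z]."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qconj(q):
    """Quaternion conjugate: negate the imaginary part."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def dq_mul(d1, d2):
    """(p1 + eps q1)(p2 + eps q2) = p1 p2 + eps (p1 q2 + q1 p2), since eps^2 = 0."""
    p1, q1 = d1
    p2, q2 = d2
    return (qmul(p1, p2), qmul(p1, q2) + qmul(q1, p2))

def dq_qconj(d):
    """Quaternion conjugate of a dual quaternion: delta* = p* + eps q*."""
    return (qconj(d[0]), qconj(d[1]))

def unit_dq(theta, axis, n):
    """delta = m + (eps/2) n m, with m = cos(theta/2) + u sin(theta/2), n pure."""
    m = np.concatenate([[np.cos(theta / 2)], np.sin(theta / 2) * np.asarray(axis, float)])
    return (m, 0.5 * qmul(n, m))
```

Multiplying such a δ by its quaternion conjugate gives (1, 0), confirming the unit condition derived above.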

Combination of Rotation and Translation:
We regard a point in three-dimensional space as a pure quaternion v and let n be a translation. The point v under the rotation θ (about a unit axis u) followed by the translation n becomes the point v'. It is straightforward to represent the transformation from v to v' with unit dual quaternion multiplication: writing v̂ = 1 + εv and v̂' = 1 + εv', we have

v̂' = δ ⊗ v̂ ⊗ δ̄*,

where δ = m + (ε/2) n m and δ̄* = m* − (ε/2)(n m)* is the combined conjugate of δ. Expanding the product gives v' = m v m* + n.
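The identity v' = m v m* + n can be checked numerically. The sketch below (our own helper names; self-contained, assuming the [w, x, y, z] quaternion layout) applies the dual quaternion sandwich product and compares it with rotating and then translating directly.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as [w, x, y, z]."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qconj(q):
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def dq_transform(theta, axis, n3, v3):
    """Return v' from v_hat' = delta (1 + eps v) delta_bar*.

    delta = m + (eps/2) n m; delta_bar* = m* - (eps/2)(n m)* (combined conjugate).
    n3 and v3 are 3-D vectors (imaginary parts of pure quaternions).
    """
    m = np.concatenate([[np.cos(theta/2)], np.sin(theta/2) * np.asarray(axis, float)])
    n = np.concatenate([[0.0], n3])
    v = np.concatenate([[0.0], v3])
    p, q = m, 0.5 * qmul(n, m)          # delta = p + eps q
    pc, qc = qconj(p), -qconj(q)        # combined conjugate
    # dual part of (p + eps q)(1 + eps v)(pc + eps qc)
    dual = qmul(qmul(p, v), pc) + qmul(p, qc) + qmul(q, pc)
    return dual[1:]                     # the real part is 1; v' is pure
```

The sandwich product agrees with computing m v m* and adding n afterwards, which is exactly the expansion stated above.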

Our DualQuatE Model
In this section, we introduce our model DualQuatE which maps entities and relations to dual quaternion space, and two variations of DualQuatE, namely DualQuatE-1 and DualQuatE-2.
We denote a knowledge graph by G, a set of entities by E and a set of relations by R. A knowledge graph G is composed of a set of facts, each of which can be represented by (h, r, t), where h ∈ E is a head entity, t ∈ E is a tail entity, and r ∈ R is a relation between h and t. We denote a set of facts that are true by Ω + , and a set of facts that are false by Ω − . Given a knowledge graph G, we aim to predict missing facts (i.e., link prediction) in G.

Multiple Relations between the Entities
To address the challenge of having multiple relations between head and tail entities, we embed the knowledge graph into dual quaternion space. h, r, t denote the vectors of entity embeddings and relation embeddings; each element hi or ti of an entity embedding is a pure quaternion, and each dimension ri of a relation embedding is a unit dual quaternion. We model the relation embeddings r as an interaction of rotation and translation from the head entity embeddings h to the tail entity embeddings t, as shown in Figure 2a. Specifically, writing ĥ = 1 + εh and t̂ = 1 + εt, each true triple (h, r, t) satisfies

r ⊗ ĥ ⊗ r̄* ≈ t̂,

where each dimension of r is a unit dual quaternion satisfying Formula (4) and r̄* is the combined conjugate of r. We define a quaternion m = cos(θ/2) + u sin(θ/2) to represent a rotation about a pure unit quaternion u through the angle θ, and a pure quaternion n = n1 i + n2 j + n3 k. Furthermore, we define the unit dual quaternion r = m + (ε/2) n m. With Formula (10), we can expand the transformation of DualQuatE in Formula (9) as

m h m* + n ≈ t,

where the geometric meaning of m h m* is given in Formula (2). As shown above, DualQuatE transforms the head entity h to the tail entity t by a relation r that combines a rotation (i.e., m) and a translation (i.e., n). Unlike previous models, which learn similar representations for the relations r1, r2, r3 as shown in Figure 2c,d, our model learns combinations of different translations and rotations to represent the various relations between head and tail entities. We define the score function by

f_r(h, t) = −|| m h m* + n − t ||,

where || · || represents the L2 norm of a vector. With this score function, we want the head entity to be as close as possible to the tail entity after the transformation by the relation.
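For one embedding dimension, the score function can be sketched as follows (a minimal illustration with our own parameterization of the relation as an axis–angle pair plus a translation vector, not the paper's training code):

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as [w, x, y, z]."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qconj(q):
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def dualquate_score(h, t, theta, axis, n):
    """f_r(h, t) = -|| m h m* + n - t || for a single embedding dimension.

    h, t, n are 3-vectors (imaginary parts of pure quaternions);
    axis is a unit 3-vector and theta the rotation angle of the relation.
    """
    m = np.concatenate([[np.cos(theta/2)], np.sin(theta/2) * np.asarray(axis, float)])
    hq = np.concatenate([[0.0], h])
    moved = qmul(qmul(m, hq), qconj(m))[1:] + n   # m h m* + n
    return -np.linalg.norm(moved - t)
```

A perfectly fitting triple scores 0 (the maximum); corrupted triples score more negative the further the transformed head lands from the tail.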

Loss Function
We employ the self-adversarial negative sampling method [5] to generate corrupted samples. Following [5], we define the probability distribution of negative samples by

p(h'_j, r, t'_j | (h, r, t)) = exp(α f_r(h'_j, t'_j)) / Σ_i exp(α f_r(h'_i, t'_i)),

where α is the sampling temperature. Combining this with self-adversarial negative sampling, we define the loss function by

L = − log σ(γ + f_r(h, t)) − Σ_i p(h'_i, r, t'_i) log σ(− f_r(h'_i, t'_i) − γ),

where γ is a fixed margin and σ is the sigmoid function. Our training procedure is shown in Algorithm 1.
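A sketch of this loss for one positive triple and its n negatives is shown below. The exact form follows the self-adversarial loss of RotatE [5], adapted to a negative-distance score f; during training the weights p_i are treated as constants (no gradient), while here we only compute the loss value.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_adversarial_loss(pos_score, neg_scores, gamma=12.0, alpha=1.0):
    """L = -log sigma(gamma + f_pos) - sum_i p_i log sigma(-f_i - gamma),
    with self-adversarial weights p_i = softmax(alpha * f_i) over the negatives.
    Scores f are negative distances, so 0 is the best possible score."""
    neg_scores = np.asarray(neg_scores, float)
    w = np.exp(alpha * neg_scores)
    w = w / w.sum()                                  # p_i: harder negatives weigh more
    pos_term = -np.log(sigmoid(gamma + pos_score))
    neg_term = -np.sum(w * np.log(sigmoid(-neg_scores - gamma)))
    return pos_term + neg_term
```

The loss is near zero when the positive triple scores well and all negatives score badly, and grows as that ordering degrades.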

Algorithm 1 DualQuatE.
Input: a set of true triples Ω+; hyperparameters including margin γ, embedding dimension k, and negative sample size n.
Output: entity embeddings E and relation embeddings R.
1: initialize entity embeddings E and relation embeddings R
2: repeat
3:   T_pos ← uniformly sample a minibatch of positive triples (h, r, t)
4:   for each (h, r, t) ∈ T_pos: (h', r, t') ← generate n negative samples for (h, r, t)
5:   update the relation embeddings r and entity embeddings h, t by gradient descent on the loss function
6: until convergence

Properties of DualQuatE
In this part, we describe relation patterns and show how DualQuatE expresses them. Recently, the ability to learn relation patterns, including symmetry/antisymmetry, inversion, and composition, has been recognized as key to the link prediction task. Our model DualQuatE can easily explain the relation patterns of the learned relation embeddings; proofs of the relation patterns can be found in Appendix A.
Inversion: A relation r' ∈ R is the inverse of a relation r ∈ R if (h, r, t) ∈ Ω+ ⇔ (t, r', h) ∈ Ω+. For example, the relation has_part is the inverse of the relation part_of. For r and r', we can infer that (m'm)h(m'm)* + m'nm'* + n' = h, which means the composition of the components m and m' yields no rotation (i.e., (m'm)h(m'm)* = h) and the translation n rotated by m' is the opposite of the translation n' (i.e., m'nm'* + n' = 0). Symmetry: A relation r ∈ R is symmetric if (h, r, t) ∈ Ω+ ⇔ (t, r, h) ∈ Ω+ holds. For instance, the relations similar_to and verb_group from the dataset WN18 are symmetric. If a relation is symmetric, we can infer that (mm)h(mm)* + mnm* + n = h, which means the self-composition of the component m yields no rotation (i.e., (mm)h(mm)* = h) and the component n yields no translation (i.e., mnm* + n = 0). Antisymmetry: A relation r ∈ R is antisymmetric if (h, r, t) ∈ Ω+ ⇒ (t, r, h) ∈ Ω−, which requires (mm)h(mm)* + mnm* + n ≠ h. An example is the relation part_of.
Composition: A relation r3 is composed of the relations r1 and r2, denoted by r3 = r1 ⊕ r2, if (h, r1, s) ∈ Ω+ ∧ (s, r2, t) ∈ Ω+ ⇒ (h, r3, t) ∈ Ω+. For example, the relation uncle_of can be composed of brother_of and father_of: if (Alva, brother_of, Aaron) and (Aaron, father_of, Abel) are true triples, we can infer that (Alva, uncle_of, Abel) is a true fact in the real world. If relation r3 is composed of relations r1 and r2, the embeddings satisfy (m2 m1)h(m2 m1)* + m2 n1 m2* + n2 = m3 h m3* + n3, from which we deduce that m3 = m2 m1 and n3 equals the sum of the translation n2 and the translation n1 rotated by m2 (i.e., m2 n1 m2* + n2 = n3).
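The composition identity can be sanity-checked numerically: applying r1 and then r2 to a point gives the same result as applying the single relation with m3 = m2 m1 and n3 = m2 n1 m2* + n2. The sketch below uses our own helper names (not the paper's code):

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as [w, x, y, z]."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qconj(q):
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def axis_angle(theta, axis):
    """Unit quaternion m = cos(theta/2) + u sin(theta/2)."""
    return np.concatenate([[np.cos(theta/2)], np.sin(theta/2) * np.asarray(axis, float)])

def rot_trans(m, n3, v3):
    """Apply v -> m v m* + n to a 3-D point v."""
    v = np.concatenate([[0.0], v3])
    return qmul(qmul(m, v), qconj(m))[1:] + n3

def compose(m1, n1, m2, n2):
    """r3 = r1 followed by r2: m3 = m2 m1, n3 = m2 n1 m2* + n2."""
    n1q = np.concatenate([[0.0], n1])
    return qmul(m2, m1), qmul(qmul(m2, n1q), qconj(m2))[1:] + n2
```

This mirrors the derivation above: the rotations compose by quaternion multiplication and the translations compose after being carried through the second rotation.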

Variations
We now introduce extensions of DualQuatE. DualQuatE is a transformation based model that combines rotation and translation. To assess the effect of the interaction of rotation and translation, we compare DualQuatE with DualQuatE-1, which models relations as rotations alone in three-dimensional space. Furthermore, we propose DualQuatE-2 to explore the role of scaling in addition to rotation.
DualQuatE-1: We devise DualQuatE-1, which embeds entities and relations into quaternion space. Specifically, we represent the entity embeddings h, t with pure quaternions and the relation embeddings r with unit quaternions. We design the score function f_r(h, t) = −||r h r* − t|| to model each relation as a rotation in three-dimensional space; namely, each true fact satisfies r h r* = t.
DualQuatE-2: To explore the effect of scaling in knowledge graph embeddings, we present DualQuatE-2, which introduces scaling. DualQuatE-2 maps knowledge graph embeddings into four-dimensional space. In particular, we represent entities and relations with quaternions, where the relation embeddings are not restricted to unit quaternions. We define the score function f_r(h, t) = −||h ⊗ r − t||, meaning that the relation transforms the head entity to the tail entity by a combination of rotation and scaling.
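The two variant score functions can be sketched as follows (our own minimal illustration for a single embedding dimension; not the paper's implementation):

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as [w, x, y, z]."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qconj(q):
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def score_dq1(h, t, r):
    """DualQuatE-1: f_r(h, t) = -|| r h r* - t ||.
    r is a unit quaternion; h, t are 3-vectors (pure quaternions)."""
    hq = np.concatenate([[0.0], h])
    return -np.linalg.norm(qmul(qmul(r, hq), qconj(r))[1:] - t)

def score_dq2(h, t, r):
    """DualQuatE-2: f_r(h, t) = -|| h x r - t ||.
    h, t, r are full quaternions; r need not be unit, so the
    Hamilton product both rotates and scales."""
    return -np.linalg.norm(qmul(h, r) - t)
```

DualQuatE-1 keeps only the rotational part of DualQuatE, while DualQuatE-2 trades the translation for a scaling degree of freedom.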

Connection to TransE and RotatE
Compared with RotatE: RotatE embeds the entity embeddings h, t and relation embeddings r in the complex space and uses the score function −||h ◦ r − t|| to measure the plausibility of each triple, where each r_i is a unit complex number cos θ + i sin θ. DualQuatE can be reduced to RotatE by fixing the rotation plane and removing the translation variables. For instance, we can construct relation embeddings by Formula (10) in the xoy plane, with u = k and n = 0 (i.e., r = cos(θ/2) + sin(θ/2) k), and embed entities in the corresponding form h, t = ai + bj.
Compared with TransE: TransE models a relation as a translation, embedding the entity embeddings h, t and relation embeddings r in a real vector space. To express TransE, we can set θ = 0 (i.e., m = 1) in the relation embeddings to remove the rotation; in other words, the relation embeddings of DualQuatE reduce to r = 1 + (ε/2) n.

Evaluation Metrics
Similar to [6], we used three metrics to evaluate our approach: Mean Rank (MR), Mean Reciprocal Rank (MRR), and Hit@n. To calculate these metrics, for each testing triple (h, r, t) ∈ T (where T is the set of testing triples) we first replace h by every entity h' ∈ E and compute the score f_r(h', t) of each corrupted triple (h', r, t). We then sort the candidates h' by score in descending order (i.e., ascending order of distance) and obtain the rank of the original entity h, denoted by K(h). Note that K(h) is the rank of h rather than its score; e.g., if h has the highest score, K(h) is 1. MR can be calculated as shown below:

MR = (1/|T|) Σ_{(h,r,t)∈T} K(h),
which means MR is the average of the ranks of all the original entities in the testing triples. Likewise, MRR can be calculated as MRR = (1/|T|) Σ_{(h,r,t)∈T} 1/K(h), which indicates that MRR is the average of the inverse ranks of all the original entities in the testing triples. Hit@n gives the proportion of original entities ranked within the top n, which can be calculated by Hit@n = (1/|T|) Σ_{(h,r,t)∈T} 1[K(h) ≤ n], where 1[·] is the indicator function. We tested n = 1, 3, 10 in the evaluation, similar to the setting used in [6].
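Given the ranks K(h) of all test triples, the three metrics reduce to a few lines (a small sketch; the function name is ours):

```python
import numpy as np

def link_prediction_metrics(ranks, ns=(1, 3, 10)):
    """Compute MR, MRR, and Hit@n from the ranks K(h) of the original entities."""
    ranks = np.asarray(ranks, dtype=float)
    mr = ranks.mean()                                 # Mean Rank
    mrr = (1.0 / ranks).mean()                        # Mean Reciprocal Rank
    hits = {n: float((ranks <= n).mean()) for n in ns}  # Hit@n
    return mr, mrr, hits
```

Lower is better for MR; higher is better for MRR and Hit@n.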

Baselines
We compared our model with several state-of-the-art baselines. For transformation based models, we compared our model to TransE [4], TorusE [24], RotatE [5], and HAKE [7]; for bilinear models, we compared our model to ComplEx [13], HolE [12], SimplE [9], DihEdral [14], and QuatE [6]. To make the comparison fair, we used the version of QuatE without type constraints on the common link prediction datasets, since the requirement of type constraints is too strong.

Results
Tables 3-5 show the experimental results. The performance of DualQuatE and its variants is competitive with state-of-the-art models. For YAGO3-10, the link prediction results are shown in Table 3, from which we can see that DualQuatE is competitive with most previous knowledge graph embedding models, especially on the metric Hit@10. The results on YAGO3-10 also show that DualQuatE performs better than DualQuatE-1, which indicates that modeling relations as an interaction of rotation and translation with more degrees of freedom (as in DualQuatE) is indeed better than modeling relations as rotation alone (as in DualQuatE-1). Furthermore, the strong results of DualQuatE-2 and DualQuatE inspire us to explore the mixed effects of vector operations. Tables 4 and 5 show the results of our models on four common datasets: WN18RR, FB15k-237, WN18, and FB15k. Our models perform better on WN18RR and FB15k-237; on WN18 and FB15k, most metrics are close to those of previous models, and several metrics surpass them. Table 3. Link prediction on the dataset YAGO3-10. ♠ marks results taken from (Toutanova and Chen, 2015); the others come from the original papers. Results in bold are the best and underlined results are the second best.

Relation Embeddings
In this part, we analyze the properties DualQuatE learns for relations. DualQuatE can distinguish multiple relations between head and tail entities, as shown in Figure 3. We compared our model with RotatE; Figure 3a,b display the difference between the representations of the relations actedIn and directed. Figure 3a shows that in RotatE the relations actedIn and directed are more similar, as the gap between the two relations is clustered around zero. For DualQuatE, the difference, shown in Figure 3b, is more dispersed, although the learned embeddings of our model are still slightly concentrated around zero. We speculate that the reason for this is the small number of relations in YAGO3-10, which makes the diversity of relations between the entities sparser. Limited by the length of the article, we visualize only some relation patterns in this paper. Figure 4a shows that the rotation angle of the self-composition of m is close to 0 or 2π.

Space and Time Complexity
In this part, we list the space and time complexity of the different transformation based models and bilinear models, as shown in Table 6, where m and n denote the numbers of entities and relations, and d is the dimension of the entity and relation embeddings.

Table 6. Space and time complexity of each method (columns: Method | Space Complexity | Time Complexity).

Conclusions
In this paper, we propose a novel model, DualQuatE, for knowledge graph embedding, which maps entities and relations to dual quaternion space. We present a new score function that models each relation as an interaction of rotation and translation, which addresses the challenge of multiple relations between two entities. We demonstrate that our model is able to express the main relation patterns and outperforms state-of-the-art baselines. However, DualQuatE does not consider temporal information or semantic hierarchies in knowledge graphs. In the future, we will investigate how to exploit temporal information and semantic hierarchies based on our model. It is also interesting to investigate the possibility of applying our DualQuatE model to learning representations of propositions for helping learn action models [25-28] and recognize plans [29-31] in the planning community.
Data Availability Statement: Publicly available datasets were analyzed in this study. These data can be found here: https://github.com/gaoliming123/DualQuatE, accessed on 11 June 2021.

Conflicts of Interest:
The authors declare no conflict of interest.