# A Survey on Knowledge Graph Embeddings for Link Prediction


## Abstract


## 1. Introduction

- This paper provides a theoretical analysis and comparison of existing KGE methods for generating KG embeddings for link prediction in KGs.
- Several representative models in each category are also analyzed and compared along five main lines.
- We conducted experiments on two benchmark datasets to report comprehensive findings and provide new insights into the strengths and weaknesses of existing models, as well as into which existing techniques will benefit future research.

## 2. Preliminaries and Problem Definition

#### 2.1. Preliminaries

**Definition 1** (Knowledge Graph (KG)). A knowledge graph is a directed graph $G=\left(E,R,T\right)$, where E is the set of entities (nodes), R is the set of relations (edge labels) and $T\subseteq E\times R\times E$ is the set of triples; each triple $\left(h,r,t\right)$ denotes a relation r between a head entity h and a tail entity t.

**Definition 2** (Knowledge Graph Embedding (KGE)). A knowledge graph embedding maps the entities and relations of a KG into a continuous low-dimensional vector space while preserving the structural and semantic information of the graph, so that the plausibility of a triple can be measured by a scoring function over the embeddings.

#### 2.2. Link Prediction

#### 2.3. Research Questions

- It is difficult to model the heterogeneity of graphs.
- There are few studies on dynamic graphs, which are better able to reflect the real world.
- How to incorporate prior knowledge to obtain deep semantics remains an open question.
- How to capture multi-hop neighbor information remains an open question.
- It has been argued that many models struggle to perform on hierarchical graphs such as WN18RR [8].

## 3. Embedding Models for Link Prediction

#### 3.1. Translation-Distance-Based Models

**TransE**[13] is a well-known, early and simple model that regards a relation as a translation from a head entity to a tail entity. It uses the distance scoring function $f\left(h,t\right)={\|h+r-t\|}_{1/2}$, where the subscript denotes the $\ell_1$ or $\ell_2$ norm. As the earliest translation-based embedding model, TransE has difficulty dealing with multirelational graphs; it is limited by its simple translation operation as well as its lack of a policy for discriminating among different kinds of relations. In recent years, many variants of TransE have been proposed, such as **TransH**[15] and **TransR**[17]. TransH introduces a hyperplane, and TransR uses a relation-specific space to handle different relations, extracting more semantic information. The **TransHR**[43] model focuses on the embedding of hyperrelational data. These models are similar in nature; the main difference lies in how the head entities are translated to the tail entities.
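To make the translation idea concrete, the following is a minimal numpy sketch of the TransE and TransH scoring functions; the embeddings and the hyperplane normal `w_r` are random placeholders rather than trained parameters.

```python
import numpy as np

def transe_score(h, r, t, norm=1):
    # TransE: score a triple by the distance ||h + r - t|| (lower is better).
    return np.linalg.norm(h + r - t, ord=norm)

def transh_score(h, r, t, w_r, norm=1):
    # TransH: project entities onto the relation-specific hyperplane with
    # unit normal w_r before translating, so one entity can play different
    # roles under different relations.
    h_perp = h - np.dot(w_r, h) * w_r
    t_perp = t - np.dot(w_r, t) * w_r
    return np.linalg.norm(h_perp + r - t_perp, ord=norm)

rng = np.random.default_rng(0)
h, r, t = rng.normal(size=(3, 50))
w_r = rng.normal(size=50)
w_r /= np.linalg.norm(w_r)  # unit normal vector of the hyperplane
print(transe_score(h, r, t), transh_score(h, r, t, w_r))
```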

**TransMS**[52] regards the head entity, relation and tail entity as the subject, predicate and object of a sentence, respectively. In this way, it considers the semantics between the head entity and the relation as well as the semantics between the relation and the tail entity, which previous models do not. It uses a nonlinear function $p\left(e,r\right)$ (chosen as the tanh function) instead of a linear function to transfer the semantics, that is, to translate both h and r to t, obtaining the final tail-entity embedding vector ${t}_{\perp}=P\left(p\left(h,r\right),t\right)$ and, from the converse semantic transfer, the head-entity embedding vector ${h}_{\perp}=P\left(p\left(-t,r\right),h\right)$. In addition, it defines the bias vector $\alpha \cdot g\left(h,t\right)$ to transmit the semantic information of both h and t to r, where $\alpha$ is an added dimension for relations and $g\left(h,t\right)$ is a function of h and t, and then obtains the final relation embedding vector ${r}_{\perp}=G\left(r,\alpha \cdot g\left(h,t\right)\right)$. With tanh chosen as the nonlinear function $p\left(e,r\right)$, the entity and relation embedding vectors are given by Equations (1)–(3):

Gaussian embedding models include **KG2E**[55] and **TransG**[56]. KG2E regards entities and relations as random vectors drawn from multivariate Gaussian distributions and scores a triple by the distance between the two random vectors. TransG also models entities with Gaussian distributions, using a mixture of Gaussian distributions to capture multiple relation semantics. Gaussian embedding models take the uncertainties of the entities and relations into account, but this results in a complex model.

Other methods, such as **TorusE**[58], **QuatE**[59], **RotatE**[35] and **MobiusE**[62], which build on Lie groups, quaternions, rotations and Möbius rings, respectively, are similar to TransE [13]. They are not distance-based models in the strict sense, but their underlying idea is the same as that of TransE: given a triple $\left(h,r,t\right)$, they all map the head entity h to the tail entity t through the relation r. Only the specific mapping operation associated with r differs, so this paper places them in this category.

**RotatE**[35] can model and infer various relation patterns, including symmetry (antisymmetry), inversion and composition. Specifically, it defines each relation as a rotation from the head entity to the tail entity in a complex vector space, and it provides a novel self-adversarial negative-sampling technique for training RotatE efficiently and effectively. Inspired by Euler’s identity, RotatE maps the head and tail entities h and t to complex embeddings $\mathbf{h},\mathbf{t}\in {\mathbb{C}}^{k}$; it then defines the mapping induced by each relation r as an elementwise rotation from the head entity to the tail entity. For the triple $(h,r,t)$, RotatE expects $\mathbf{t}=\mathbf{h}\circ \mathbf{r}$, where $\left|{r}_{i}\right|=1$ and ∘ is the Hadamard (elementwise) product; for each element of the embedding, ${t}_{i}={h}_{i}{r}_{i}$. By constraining the modulus of each element ${r}_{i}\in \mathbb{C}$ of $\mathbf{r}\in {\mathbb{C}}^{k}$ to $\left|{r}_{i}\right|=1$, ${r}_{i}$ takes the form ${e}^{i{\theta}_{r,i}}$, which corresponds to a counterclockwise rotation by ${\theta}_{r,i}$ radians about the origin of the complex plane and affects only the phases of the entity embeddings in the complex vector space. This is the origin of the “rotation”. The distance function of RotatE is ${d}_{r}(\mathbf{h},\mathbf{t})=\|\mathbf{h}\circ \mathbf{r}-\mathbf{t}\|$. By defining each relation as a rotation in the complex vector space, RotatE can model and infer all three types of relation patterns introduced above. Table 2 summarizes the pattern modeling and inference abilities of several models.
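The rotation constraint is easy to verify numerically. The sketch below (with randomly generated embeddings) implements the RotatE distance ${d}_{r}(\mathbf{h},\mathbf{t})=\|\mathbf{h}\circ \mathbf{r}-\mathbf{t}\|$ and constructs a tail that matches the rotated head exactly:

```python
import numpy as np

def rotate_distance(h, t, theta_r):
    # RotatE: relations are elementwise rotations e^{i*theta} in C^k, so each
    # relation coordinate has unit modulus and only shifts entity phases.
    r = np.exp(1j * theta_r)             # |r_i| = 1 by construction
    return np.linalg.norm(h * r - t)     # d_r(h, t) = ||h o r - t||

k = 100
rng = np.random.default_rng(1)
h = rng.normal(size=k) + 1j * rng.normal(size=k)
theta = rng.uniform(0, 2 * np.pi, size=k)
t = h * np.exp(1j * theta)               # construct an exact match
print(rotate_distance(h, t, theta))      # ~0: the triple is scored as valid
```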

**HAKE**[60] considers the semantic hierarchy of entities by using polar coordinates, in which concentric circles naturally reflect the hierarchy. The model consists of two parts, the modulus and the phase, which together distinguish entities at different levels of the hierarchy from entities at the same level. The modulus part focuses on entities at different levels of the hierarchy; modulus information models the entities as ${h}_{m}\circ {r}_{m}={t}_{m}$, with ${h}_{m},{t}_{m}\in {\mathbb{R}}^{k},{r}_{m}\in {\mathbb{R}}_{+}^{k}$, where ${h}_{m},{r}_{m},{t}_{m}$ are the head-entity, relation and tail-entity embeddings and ∘ is the Hadamard (elementwise) product. The corresponding distance function is ${d}_{r,m}\left({h}_{m},{t}_{m}\right)={\|{h}_{m}\circ {r}_{m}-{t}_{m}\|}_{2}$. The phase part focuses on entities at the same level of the hierarchy; phase information distinguishes the entities as $\left({h}_{p}+{r}_{p}\right)\phantom{\rule{0.277778em}{0ex}}\mathrm{mod}\phantom{\rule{0.277778em}{0ex}}2\pi ={t}_{p}$, with ${h}_{p},{r}_{p},{t}_{p}\in {[0,2\pi )}^{k}$. The corresponding distance function is ${d}_{r,p}\left({h}_{p},{t}_{p}\right)={\|sin\left(\left({h}_{p}+{r}_{p}-{t}_{p}\right)/2\right)\|}_{1}$. These two parts correspond to the radial and angular coordinates of a polar coordinate system, respectively. The method maps each entity e to $\left[{e}_{m};{e}_{p}\right]$, where ${e}_{m},{e}_{p}$ are generated by the two parts and $\left({\left[{h}_{m}\right]}_{i},{\left[{h}_{p}\right]}_{i}\right)$ is a 2D point in polar coordinates. The final score function is given by Equation (5):
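A minimal sketch of the two HAKE distance functions follows; the final score of Equation (5) combines them (with a weighting hyperparameter in the original model), and the embeddings here are random placeholders:

```python
import numpy as np

def hake_distances(h_m, r_m, t_m, h_p, r_p, t_p):
    # Modulus part (radial coordinate): separates entities at different
    # hierarchy levels, d_{r,m} = ||h_m o r_m - t_m||_2.
    d_m = np.linalg.norm(h_m * r_m - t_m, ord=2)
    # Phase part (angular coordinate): separates entities at the same level,
    # d_{r,p} = ||sin((h_p + r_p - t_p) / 2)||_1.
    d_p = np.linalg.norm(np.sin((h_p + r_p - t_p) / 2.0), ord=1)
    return d_m, d_p

k = 64
rng = np.random.default_rng(2)
h_m, t_m = rng.normal(size=(2, k))
r_m = np.abs(rng.normal(size=k))                  # r_m is constrained positive
h_p, r_p, t_p = rng.uniform(0, 2 * np.pi, size=(3, k))
print(hake_distances(h_m, r_m, t_m, h_p, r_p, t_p))
```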

#### 3.2. Semantic Information-Based Models

Semantic-matching models measure the plausibility of a triple via similarity-based scoring functions, as **DistMult**[20] and **ComplEx**[21] do. These models suffer from higher computational complexity. More recently proposed models fuse various kinds of additional information to mine deeper semantic information underlying the graph and thus obtain better performance. The additional information includes path information, order information, concepts, entity attributes, entity types and so on.
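For reference, the scoring functions of these two semantic-matching models are one-liners; the sketch below uses random embeddings and illustrates that the ComplEx score is asymmetric in h and t while the DistMult score is not:

```python
import numpy as np

def distmult_score(h, r, t):
    # DistMult: a bilinear product with a diagonal relation matrix,
    # <h, r, t> = sum_i h_i * r_i * t_i. Symmetric in h and t, so it cannot
    # distinguish (h, r, t) from (t, r, h).
    return np.sum(h * r * t)

def complex_score(h, r, t):
    # ComplEx: the same trilinear form in complex space; taking the real part
    # of <h, r, conj(t)> breaks the symmetry and allows asymmetric relations.
    return np.real(np.sum(h * r * np.conj(t)))

rng = np.random.default_rng(3)
h, r, t = rng.normal(size=(3, 50)) + 1j * rng.normal(size=(3, 50))
print(complex_score(h, r, t), complex_score(t, r, h))  # differ in general
print(distmult_score(h.real, r.real, t.real))
```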

**TransW**[101] uses word embeddings to compose knowledge graph embeddings and learns a mapping function from the word embedding space to the knowledge embedding space. In this model, entities and relations are represented as linear combinations of word embeddings, which enables the detection of unknown facts, as in Equations (7)–(9):

The **TransC**[149] model combines structural information with entity concepts (i.e., the categories of entities) to improve KGE models and introduces a novel type of semantic similarity that measures the distinctness of entity semantics using concept information. Each relation in this model is associated with two concept sets: the head concept set ${C}_{r}^{head}$ and the tail concept set ${C}_{r}^{tail}$. The semantic similarities between a relation and its head entity and between the relation and its tail entity, defined as in Equations (11) and (12), respectively, measure the distinctness of entity semantics with concept information:

**PTransE**[83] is a translation-based model that introduces contextual information by extending TransE with multiple-step relation paths, which it treats as transformations between entities in representation learning. TransE regards a relation as a translation vector between the head- and tail-entity vectors, with the scoring function $E\left(h,r,t\right)=\|h+r-t\|$. PTransE adds a score term for the multiple-step relation paths to the triple score: $G\left(h,r,t\right)=E\left(h,r,t\right)+E\left(h,P,t\right)$. $E\left(h,r,t\right)$ is defined as in TransE, directly modeling the relation in the triple $\left(h,r,t\right)$; $E\left(h,P,t\right)$ models the multiple-step relation paths between the entities, defined as $E\left(h,P,t\right)=\frac{1}{Z}\sum _{p\in P\left(h,t\right)}R\left(p\mid h,t\right)E\left(h,p,t\right)$, where $Z=\sum _{p\in P\left(h,t\right)}R\left(p\mid h,t\right)$ is a normalization factor and $R\left(p\mid h,t\right)$ measures the reliability of the relation path p, for which PTransE proposes a path-constraint resource allocation (PCRA) algorithm. $E\left(h,p,t\right)$ scores the triple $\left(h,r,t\right)$ under the path p; differing from TransE, PTransE leverages a semantic composition model (a recurrent neural network (RNN)) to compose the relation path p, and defines $E\left(h,p,t\right)=\|p-\left(t-h\right)\|=\|p-r\|=E\left(p,r\right)$. Moreover, PTransE considers relation paths in only one direction. To address this problem, bidirectional relations, including both the forward and the inverse direction, are added to the KGs, for example, ${e}_{1}\stackrel{BornInCity}{\longrightarrow}{e}_{2}\stackrel{CityOfCountry^{-1}}{\longrightarrow}{e}_{3}$ (BornInCity and CityOfCountry are both relations; CityOfCountry$^{-1}$ denotes the inverse of the relation CityOfCountry). To improve computational efficiency, the path length is limited to at most three steps, and only relation paths with reliability scores greater than 0.01 are selected.
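A minimal sketch of the combined PTransE score follows. We assume additive composition of the relations along a path and represent each path by its composed embedding together with its PCRA reliability score; these inputs are placeholders rather than outputs of the actual PCRA algorithm:

```python
import numpy as np

def transe_energy(h, r, t):
    return np.linalg.norm(h + r - t, ord=1)

def ptranse_score(h, r, t, paths):
    # G(h, r, t) = E(h, r, t) + E(h, P, t): the direct triple energy plus a
    # reliability-weighted average over multi-step relation paths. `paths` is
    # a list of (path_embedding, reliability) pairs; each path embedding here
    # is assumed to be the additive composition of its relations.
    direct = transe_energy(h, r, t)
    z = sum(rel for _, rel in paths) or 1.0          # normalization factor Z
    path_term = sum(rel * np.linalg.norm(p - r, ord=1)   # E(h, p, t) = ||p - r||
                    for p, rel in paths) / z
    return direct + path_term

rng = np.random.default_rng(4)
h, r, t = rng.normal(size=(3, 50))
r1, r2 = rng.normal(size=(2, 50))
paths = [(r1 + r2, 0.6)]                              # one 2-step path, R = 0.6
print(ptranse_score(h, r, t, paths))
```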

**Bilinear+TR**[78] improves RESCAL [64] by introducing a regularization factor into the loss function, which is used to take entity types into account. In RESCAL, entities are expressed as vectors $x\in {\mathbb{R}}^{d}$, and relations are expressed as matrices $W\in {\mathbb{R}}^{d\times d}$. A triple $(s,r,t)$ is then scored as ${s}_{c}(s,r,t)={x}_{s}^{T}{W}_{r}{x}_{t}$. This is similar to tensor factorization; the vectors and matrices are learned with a loss function that compares a correct triple with an incorrect triple. RESCAL uses a max-margin loss function as in Equations (15) and (16):
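The RESCAL bilinear score that Bilinear+TR builds on is straightforward to write down; the sketch below uses random parameters (the type regularizer itself enters only the loss function, which is not shown):

```python
import numpy as np

def rescal_score(x_s, W_r, x_t):
    # RESCAL: entities are vectors, relations are full d x d matrices, and a
    # triple is scored by the bilinear form x_s^T W_r x_t.
    return x_s @ W_r @ x_t

d = 20
rng = np.random.default_rng(5)
x_s, x_t = rng.normal(size=(2, d))
W_r = rng.normal(size=(d, d))
print(rescal_score(x_s, W_r, x_t))
```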

**RW-LMLM**[91] is a novel approach for link prediction that consists of a random-walk algorithm for KGs (RW) and a language-model-based link-prediction model (LMLM). The paths output by RW are regarded as pseudosentences for LMLM training. RW can capture the semantic and syntactic information in KGs by considering the entities, relations and order information of the paths, in contrast to DeepWalk (which considers only the entities). In the generated paths, the entities appear in head-to-tail order with the relations between them, for example, ${e}_{0}\stackrel{{r}_{0}}{\longrightarrow}{e}_{1}\stackrel{{r}_{1}}{\longrightarrow}\cdots \stackrel{{r}_{l-1}}{\longrightarrow}{e}_{l}$. LMLM uses a multilayer transformer-decoder language model instead of word2vec models (continuous bag-of-words or skip-gram). A standard language model defines a probability distribution over a sequence of tokens, $P\left({w}_{1},{w}_{2},\dots ,{w}_{n}\right)={\prod}_{i}P\left({w}_{i}\mid {w}_{1},\dots ,{w}_{i-1}\right)$, and the goal of language modeling is to maximize this probability; the conditional probabilities $P\left({w}_{i}\mid {w}_{1},\dots ,{w}_{i-1}\right)$ can be learned by neural networks. The objective of RW-LMLM is to maximize the probability in Equation (19):
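A minimal sketch of the random-walk step that produces such pseudosentences follows; the toy graph and relation names are illustrative, not from the paper:

```python
import random

def random_walk(graph, start, length, rng):
    # Generate one pseudosentence e0 -r0-> e1 -r1-> ... for language-model
    # training: entities and relations alternate, preserving order information.
    # `graph` maps an entity to its outgoing (relation, entity) edges.
    walk = [start]
    node = start
    for _ in range(length):
        edges = graph.get(node)
        if not edges:
            break
        rel, node = rng.choice(edges)
        walk += [rel, node]
    return walk

graph = {"Tokyo": [("cityOf", "Japan")], "Japan": [("locatedIn", "Asia")]}
rng = random.Random(0)
print(random_walk(graph, "Tokyo", 2, rng))
# ['Tokyo', 'cityOf', 'Japan', 'locatedIn', 'Asia']
```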

**LiteralE**[115] introduces literal information, giving priority to numerical literals. A typical model with additional information merges text indirectly by adding a text-dependent term to the output of the scoring function; LiteralE instead incorporates literal information directly through a learnable parameterized function. There are two forms of triples in LiteralE: relations between entities, and relations between entities and literals. The method obtains global information via a latent feature model, learning low-dimensional latent representations, that is, embeddings. To evaluate the results, the base datasets were extended by adding literal information, and experiments were carried out on the base models DistMult, ComplEx and ConvE. Except for transforming the entity embeddings through a core function $g:{\mathbb{R}}^{H}\times {\mathbb{R}}^{{N}_{d}}\to {\mathbb{R}}^{H}$, the scoring function is the same as in the base models; that is, each entity is transformed into ${e}_{i}^{lit}=g\left({e}_{i},{l}_{i}\right)$. The definition of the function g is critical for LiteralE. In addition to being learnable and flexible, it should be able to determine on its own whether the additional information is useful, so that it can decide to merge or ignore it. To define g, LiteralE takes inspiration from the RNN gating strategy and uses a gated recurrent unit (GRU)-style gate to transform entities, ensuring that the output vector and the entity embedding have the same dimension. However, the model introduces a certain parameter overhead, which is proportional to the number of relations in the KG.
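The following sketch illustrates a gating function g in the spirit of LiteralE’s GRU-style gate: a learned gate z mixes a literal-enriched candidate vector with the original entity embedding, keeping the output in ${\mathbb{R}}^{H}$. The parameter names and the exact gate form are our assumptions, not the paper’s exact equations:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def literal_gate(e, l, W_ze, W_zl, b_z, W_h):
    # GRU-style gate g(e, l): the gate z decides, per dimension, how much of
    # the literal-enriched candidate vector h to mix into the entity
    # embedding. Shapes: e in R^H, l in R^{N_d}; output stays in R^H so the
    # base model's scoring function is unchanged.
    z = sigmoid(W_ze @ e + W_zl @ l + b_z)
    h = np.tanh(W_h @ np.concatenate([e, l]))
    return z * h + (1.0 - z) * e

H, Nd = 32, 5
rng = np.random.default_rng(6)
e, l = rng.normal(size=H), rng.normal(size=Nd)
out = literal_gate(e, l, rng.normal(size=(H, H)), rng.normal(size=(H, Nd)),
                   rng.normal(size=H), rng.normal(size=(H, H + Nd)))
print(out.shape)  # (32,): same dimension as the entity embedding
```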

#### 3.3. Neural Network-Based Models

**ConvE**[22] is the first model to use a convolutional neural network (CNN) for KG completion. Compared with fully connected neural networks, CNNs capture complex relationships with very few parameters by learning nonlinear features. ConvE uses 2D convolution over embeddings to predict missing links in KGs and has the simplest multilayer convolutional structure among these link-prediction models. Notably, 2D convolution outperforms 1D convolution at extracting feature interactions between two embeddings. ConvE captures local relationships across different dimensions between entities through a convolution layer and a fully connected layer, but it ignores the global relationships among triple embeddings in the same dimension. ConvE first reshapes the head-entity and relation embeddings and concatenates them into an input matrix for the 2D convolution layer, which returns a feature-map tensor. The tensor is then vectorized and projected into k-dimensional space through a linear transformation parameterized by the matrix W and is finally matched with the tail-entity embedding through an inner product. Its scoring function is given by Equation (20):
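A minimal PyTorch sketch of the ConvE scoring pipeline (reshape, stack, convolve, project, inner product) follows; all dimensions, filter counts and the use of ReLU here are illustrative choices, not the paper’s exact hyperparameters:

```python
import torch
import torch.nn.functional as F

def conve_score(e_h, r, e_t, conv_w, W, b):
    # ConvE: reshape the head and relation embeddings into 2D "images",
    # stack them, convolve, flatten, project back to embedding size, then
    # match the tail by an inner product.
    x = torch.cat([e_h.view(1, 1, 10, 10), r.view(1, 1, 10, 10)], dim=2)
    x = F.relu(F.conv2d(x, conv_w))            # feature maps
    x = F.relu(x.flatten() @ W + b)            # project to R^100
    return torch.dot(x, e_t)                   # inner product with the tail

torch.manual_seed(0)
e_h, r, e_t = torch.randn(3, 100)
conv_w = torch.randn(8, 1, 3, 3)               # 8 filters of size 3x3
flat = 8 * (20 - 2) * (10 - 2)                 # flattened conv output size
W, b = torch.randn(flat, 100), torch.randn(100)
print(conve_score(e_h, r, e_t, conv_w, W, b))
```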

**ConvKB**[23] uses 1D convolution to maintain the translation characteristics of TransE, which is sufficient for capturing the global relationships and transitional characteristics between entities. It represents the k-dimensional embedding of each triple $\left({v}_{h},{v}_{r},{v}_{t}\right)$ as a three-column matrix $A=\left[{v}_{h},{v}_{r},{v}_{t}\right]\in {\mathbb{R}}^{k\times 3}$ and then feeds it into the convolutional layer, where there are multiple filters of the same 1 × 3 shape with the ability to extract the global relationships among the same-dimensional entries of an embedding triple. These filters operate on each row of the input matrix to obtain different feature maps: $\mathbf{v}=\left[{v}_{1},{v}_{2},\dots ,{v}_{k}\right]\in {\mathbb{R}}^{k}$, where ${v}_{i}=g\left(\omega \cdot {A}_{i,:}+b\right)$; ${A}_{i,:}\in {\mathbb{R}}^{1\times 3}$ is the i-th row of A; $\omega \in {\mathbb{R}}^{1\times 3}$ is the filter used to examine the global relationships among the same-dimensional entries of embedding triples and to retrieve the transitional characteristics of translation-based models; $b\in \mathbb{R}$ is the bias; and g is the activation function. These feature maps are then concatenated into a triple feature vector, whose dot product with a weight vector $\mathbf{w}$ yields the score of the triple, which is used to judge the validity of the triple. Its scoring function is given by Equation (21):
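The ConvKB score is compact enough to sketch directly in numpy; the filter values, the ReLU choice for g and the bias below are placeholders:

```python
import numpy as np

def convkb_score(v_h, v_r, v_t, filters, b, w):
    # ConvKB: stack the triple as a k x 3 matrix and slide 1x3 filters over
    # its rows, so each filter sees the same dimension of h, r and t at once.
    A = np.stack([v_h, v_r, v_t], axis=1)                 # k x 3
    maps = [np.maximum(A @ f + b, 0.0) for f in filters]  # ReLU feature maps
    v = np.concatenate(maps)                              # triple feature vector
    return float(v @ w)                                   # dot with weight vector

k, n_filters = 50, 3
rng = np.random.default_rng(7)
v_h, v_r, v_t = rng.normal(size=(3, k))
filters = rng.normal(size=(n_filters, 3))                 # each filter is 1 x 3
print(convkb_score(v_h, v_r, v_t, filters, 0.1,
                   rng.normal(size=n_filters * k)))
```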

**HypER**[24] introduces hypernetworks based on ConvE to generate convolutional filter weights for each relation. The hypernetworks can be used to achieve weight sharing across layers and dynamically synthesize weights given inputs. The differences between HypER and ConvE are as follows: (1) ConvE uses 2D filters to construct convolution operators for the entity and relation embeddings after reshaping and concatenating, while HypER uses 1D relation-specific filters to handle entity embeddings, which simplifies the interaction between entities and relational embeddings. (2) The interaction between entities and relations in ConvE is affected by how they are reshaped and concatenated before being fed into the convolutional layers, while HypER uses a convolution operator for head-entity embeddings with a set of relation-specific filters ${F}_{r}$, which is created by hypernetwork H from the relation embeddings. The hypernetwork is a fully connected layer. (3) The feature maps obtained from ConvE and HypER by the convolution operator are all projected into a k-dimensional space and vectorized; however, the difference is that ConvE uses a linear transformation parameterized by $\mathbf{w}$, while HypER uses a weight matrix W to which the ReLU activation function is applied. (4) Finally, both methods calculate the tail-entity embeddings via the inner product to obtain a scoring vector for matching. The scoring function is as in Equation (22):
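Because the hypernetwork H in HypER is a single fully connected layer, generating the relation-specific filters reduces to one matrix product and a reshape, as in this sketch with illustrative dimensions:

```python
import numpy as np

def hyper_filters(r, H, n_f, f_len):
    # HypER: the hypernetwork is one fully connected layer that maps a
    # relation embedding to a set of 1D relation-specific convolution filters.
    return (H @ r).reshape(n_f, f_len)

rng = np.random.default_rng(8)
d_r, n_f, f_len = 30, 4, 3
r = rng.normal(size=d_r)
H = rng.normal(size=(n_f * f_len, d_r))   # hypernetwork weights
F_r = hyper_filters(r, H, n_f, f_len)
print(F_r.shape)  # (4, 3): four 1D filters generated from this relation
```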

**R-GCN**[158] utilizes a relational graph convolutional network (GCN) to explicitly model highly multi-relational data, aggregating the information of the 1-hop neighborhood facts around each entity. For link-prediction tasks, the relational graph convolutional network (R-GCN) can be regarded as an autoencoder consisting of an encoder and a decoder. The encoder generates a latent feature representation of each entity, while the decoder scores triples according to the representations generated by the encoder. The encoder is an R-GCN based on the idea of a graph convolutional network; it fully encodes the structural information of the entities by considering all types of relations among the entity connections, including both incoming and outgoing relations. The forward update of an entity is given by Equation (23):
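The standard R-GCN entity update (the form of Equation (23)) aggregates neighbors with relation-specific weight matrices plus a self-loop term; a minimal sketch with per-relation normalization and random parameters follows:

```python
import numpy as np

def rgcn_update(h, neighbors, W_rel, W_self):
    # R-GCN forward update for one entity: aggregate neighbors with
    # relation-specific weight matrices (normalized per relation), add a
    # self-loop term, and apply a nonlinearity (ReLU here).
    msg = np.zeros_like(h)
    for rel, neigh in neighbors.items():           # {relation: [h_j, ...]}
        c = len(neigh)                             # normalization constant
        for h_j in neigh:
            msg += (W_rel[rel] @ h_j) / c
    return np.maximum(msg + W_self @ h, 0.0)

d = 16
rng = np.random.default_rng(9)
h = rng.normal(size=d)
neighbors = {"r1": [rng.normal(size=d)], "r2": [rng.normal(size=d)] * 2}
W_rel = {rel: rng.normal(size=(d, d)) for rel in neighbors}
print(rgcn_update(h, neighbors, W_rel, rng.normal(size=(d, d))).shape)
```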

**ProjE**[176] completes the missing information in a KG by learning joint embeddings of entities and edges and by modifying the loss function, thus improving the KGE model through trivial changes to the network architecture and without complex feature engineering. Through a learned combination operator, the embedding vectors of the head entity and relation are combined into a target vector, which is then projected onto the candidate entities to obtain a ranked list, with the top-ranked candidates taken as the correct entities. Compared with TransE, ProjE saves many transformation-matrix calculations thanks to the combination operation, defined as $e\oplus r={D}_{e}e+{D}_{r}r+{b}_{c}$, where ${D}_{e},{D}_{r}$ are diagonal matrices that serve as the global entity and relation weights, respectively, and ${b}_{c}\in {\mathbb{R}}^{k}$ is the combination bias. The embedding projection function is then defined as $h\left(e,r\right)=g\left({W}^{c}f\left(e\oplus r\right)+{b}_{p}\right)$, where f and g are activation functions, ${W}^{c}\in {\mathbb{R}}^{s\times k}$ is the candidate-entity matrix (s is the number of candidate entities), ${b}_{p}$ is the projection bias, and $h\left(e,r\right)$ is the ranking score vector, whose elements measure the similarity between the candidate entities in ${W}^{c}$ and the combined input embedding $e\oplus r$. TransE defines the combination operation as addition and the similarity operation as distance; analogously, ProjE defines the combination operation as a global linear layer and the similarity operation as a dot product. Moreover, it uses a collective ranking loss over the list of candidate entities (or relations) in two proposed variants: ProjE-pointwise and ProjE-listwise. The former uses sigmoid and tanh as the activation functions for g and f, respectively, so the ranking score is $h{\left(e,r\right)}_{i}=sigmoid\left({W}_{\left[i,:\right]}^{c}tanh\left(e\oplus r\right)+{b}_{p}\right)$, where ${W}_{\left[i,:\right]}^{c}$ is the ${i}^{th}$ candidate in the candidate-entity matrix. The latter uses softmax and tanh as the activation functions, and its ranking score is defined as in Equation (24):
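A minimal sketch of the ProjE-pointwise ranking score follows; because ${D}_{e}$ and ${D}_{r}$ are diagonal, they are stored as vectors and applied elementwise, and all parameters are random placeholders:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def proje_pointwise(e, r, D_e, D_r, b_c, W_c, b_p):
    # ProjE-pointwise: combine entity and relation with diagonal global
    # weights, then project onto every candidate entity at once:
    # h(e, r)_i = sigmoid(W_c[i, :] . tanh(e (+) r) + b_p).
    combined = np.tanh(D_e * e + D_r * r + b_c)   # diagonal = elementwise
    return sigmoid(W_c @ combined + b_p)          # one score per candidate

k, s = 40, 1000                                    # embedding dim, candidates
rng = np.random.default_rng(10)
e, r, D_e, D_r, b_c = rng.normal(size=(5, k))
scores = proje_pointwise(e, r, D_e, D_r, b_c, rng.normal(size=(s, k)), 0.0)
print(scores.argmax(), scores.max())               # top-ranked candidate
```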

**ProjR**[178] takes into account the diversity of structures and represents entities in terms of their different relational contexts and different entity locations; it combines TransR and ProjE to obtain these two representations by defining a unique combination operator for each relation. It can model N-1, N-N, 1-N-1 and one-relation circle structures with the advantage of a combination operator design. Before the similarity operator, TransR projects the representation of the head and tail entities into the relation-specific space through a relational projection matrix. ProjE projects the candidate-entity vectors into the representation space of the input entity relation pairs. Similarly, ProjR defines the scoring function in two parts: a combination operator and a similarity operator. To obtain representations of entities in different relational contexts, a combination operation ${C}_{r}\left(h\right)$ is applied to each relation r: ${C}_{r}\left(h\right)={c}_{hr}=tanh\left({D}_{r}^{e}h+{D}_{r}^{r}r+{b}_{c}\right)$, where ${D}_{r}^{e}\in {R}^{d\times d}$ is a diagonal matrix defined for the linear transformation of the head entity under relation r, ${D}_{r}^{r}\in {R}^{d\times d}$ is a diagonal matrix defined for the linear transformation of relation r, ${b}_{c}\in {R}^{d}$ is a global bias vector, and $tanh\left(z\right)=\frac{{e}^{z}-{e}^{-z}}{{e}^{z}+{e}^{-z}}$ is a nonlinear activation function in which the output is restricted to (−1, 1). To obtain the representations of entities at different entity locations, the tail-entity vector $\mathbf{t}\in {R}^{d}$, rather than projection, is used directly for the similarity operation as $S\left(h,r,t\right)=\sigma \left(\mathbf{t}\xb7{c}_{hr}\right)$, where $\sigma \left(z\right)=\frac{1}{1+{e}^{-z}}$ is used to restrict the final output to (0, 1) as the confidence score.

**ProjFE**[177] improves the combination operator by adding a fuzzy-membership degree, which measures the degree of confidence that an entity belongs to a certain concept, to improve the performance of the model on positive and negative samples of different degrees. Because a large number of translation-matrix calculations are omitted, the model has very few parameters. Unlike previous models, ProjFE uses binary vectors to represent fuzzy embeddings for the projection work. ProjFE has the same combination operations as ProjE, except that it adds the fuzzy-membership degrees ${\mu}_{e},{\mu}_{r}$ for fuzzy entities and relations, where the fuzzy-membership degree is defined as $\mu ={e}^{-{\left(\frac{x-a}{\sigma}\right)}^{2}},x\in \left(0,\infty \right),a=\left({a}_{1},{a}_{2},...,{a}_{n}\right),{a}_{i}\in \left[0.5,1\right]$. The combination operator is defined as $\mathbf{e}\oplus \mathbf{r}={\mu}_{e}{D}_{e}\mathbf{e}+{\mu}_{r}{D}_{r}\mathbf{r}+{b}_{c}={e}^{-{\left(\frac{{x}_{e}-a}{\sigma}\right)}^{2}}{D}_{e}\mathbf{e}+{e}^{-{\left(\frac{{x}_{r}-a}{\sigma}\right)}^{2}}{D}_{r}\mathbf{r}+{b}_{c}$. The scoring function of ProjFE is defined as in Equation (26):

**GCN**. The convolutional structure of the GCN is motivated by a localized first-order approximation of spectral graph convolutions. It scales linearly in the number of graph edges and learns hidden-layer representations that encode both the local graph structure and the features of nodes. Its layer-wise propagation rule is ${H}^{\left(l+1\right)}=\sigma \left({\overline{D}}^{-\frac{1}{2}}\overline{A}{\overline{D}}^{-\frac{1}{2}}{H}^{\left(l\right)}{W}^{\left(l\right)}\right)$, where $\overline{A}=A+{I}_{N}$ is the adjacency matrix of the graph with added self-connections, ${I}_{N}$ is the identity matrix, ${\overline{D}}_{ii}={\sum}_{j}{\overline{A}}_{ij}$ is the corresponding degree matrix, and ${W}^{\left(l\right)}$ is a layer-specific trainable weight matrix. The GCN is a spectral method: the convolution theorem on graphs is used to define the graph convolution in the spectral domain. Many spatial methods have also been proposed, whose main idea is to define node similarity by an aggregation function in the node domain.
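The propagation rule takes only a few lines of numpy; the sketch below applies one GCN layer, with ReLU as $\sigma$, to a toy three-node graph:

```python
import numpy as np

def gcn_layer(A, H, W):
    # One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W), where the
    # added identity gives self-connections and D is the resulting degree.
    A_bar = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_bar.sum(axis=1))
    A_norm = A_bar * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

rng = np.random.default_rng(11)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # a 3-node path
H = rng.normal(size=(3, 8))                                    # node features
print(gcn_layer(A, H, rng.normal(size=(8, 4))).shape)          # (3, 4)
```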

**SACN**[26] is an end-to-end structure-aware convolutional network that simultaneously takes into account node connectivity, node attributes and relation types. The model likewise defines an encoder and a decoder. In the encoder, SACN improves the GCN by introducing weights for different types of relations, yielding a weighted graph convolutional network (WGCN). It weighs different types of relations differently when aggregating, so the multi-relational graph can be regarded as multiple single-relational graphs, where each subgraph contains a specific type of relation. The decoder, called Conv-TransE, removes the reshaping operation on the input entity and relation embeddings and lets the convolutional filters operate directly on the input entities and relations in the same dimension. Thus, the translation property of TransE is retained while the prediction performance of ConvE is maintained. Furthermore, SACN treats entity attributes as another type of node, called attribute nodes, which are represented and processed similarly to entity nodes. Its propagation process for node ${v}_{i}$ is defined as in Equation (27):

**CompGCN**[25] is a novel GCN that jointly embeds both entities and relations in a relational graph using composition operators from KGE methods. For a given entity, it simultaneously considers the original, inverse and self relations via a composition operator $\varphi$, defined by ${e}_{o}=\varphi \left({e}_{s},{e}_{r}\right)$, where $\varphi$ is restricted to nonparameterized operations. The updating process in CompGCN is ${h}_{v}=f\left(\sum _{\left(u,r\right)\in \mathrm{N}\left(v\right)}{W}_{\lambda \left(r\right)}\varphi \left({e}_{u},{e}_{r}\right)\right)$, where ${e}_{u}$ and ${e}_{r}$ are the initial features of entity u and relation r, $\mathrm{N}\left(v\right)$ is the set of immediate neighbors of v for the outgoing edges, and ${W}_{\lambda \left(r\right)}\in {\mathbb{R}}^{{d}_{1}\times {d}_{2}}$ is a relation-type-specific parameter, with ${W}_{O}$, ${W}_{I}$ and ${W}_{S}$ for the original, inverse and self relations, respectively. Simultaneously, the relation embedding is transformed as ${h}_{r}={W}_{rel}{z}_{r}$, where ${W}_{rel}\in {\mathbb{R}}^{{d}_{1}\times {d}_{2}}$ is a transformation matrix that projects relations into the same space as entities; ${z}_{r}={\sum}_{b=1}^{\mathrm{B}}{\alpha}_{br}{v}_{b}$ is defined over a set of learnable basis vectors, where ${\alpha}_{br}\in \mathbb{R}$ is a relation- and basis-specific learnable scalar weight. Inspired by TransE, DistMult and HolE, the $\varphi$ operator is calculated as follows to obtain the score:
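These three nonparameterized choices for $\varphi$ are easy to state directly; in the sketch below, the circular-correlation variant follows the standard FFT formulation used by HolE:

```python
import numpy as np

def compose(e_u, e_r, op="sub"):
    # Nonparameterized composition operators phi(e_u, e_r) used in CompGCN,
    # inspired by TransE (subtraction), DistMult (elementwise multiplication)
    # and HolE (circular correlation).
    if op == "sub":
        return e_u - e_r
    if op == "mult":
        return e_u * e_r
    if op == "corr":  # circular correlation via FFT
        return np.fft.ifft(np.conj(np.fft.fft(e_u)) * np.fft.fft(e_r)).real
    raise ValueError(op)

rng = np.random.default_rng(12)
e_u, e_r = rng.normal(size=(2, 32))
for op in ("sub", "mult", "corr"):
    print(op, compose(e_u, e_r, op)[:3])
```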

#### 3.4. Connections between Typical Models

## 4. Experiments

#### 4.1. Experimental Settings

#### 4.2. Dataset

#### 4.3. The Implemented Models

**SACN**[26] exploits a graph convolutional neural network and integrates relation type information as well as node attribute information, which is not limited to the traditional method of embedding based only on triple information.

**HypER**[24] improves the ConvE model by providing a simple calculation method for sparsity and leveraging a parameter binding mechanism, which uses a hypernetwork to perform weight sharing.

**Bilinear+TR**[78] introduces a type regularizer into the loss function, which fully considers the type information of entities.

**RW-LMLM**[91] considers paths with three aspects of information: entities, relations and order information. It draws on the random walk algorithm and semantic-based models.

**LiteralE**[115] introduces textual information as the attribute information of entities.

**SimplE**[152] encodes background knowledge into an embedding by parameter sharing. It embeds relations and their inverse relations separately.

**HAKE**[60] refers to the idea of polar coordinates, and it considers the hierarchical information of semantics.

**RotatE**[35] replaces the traditional translation operation with a rotation operation, which can be used to distinguish various relations, such as symmetry, antisymmetry and composition.

**ConvE**[22] is the first model to utilize the CNN framework for KG completion. It uses embedded 2D convolution to predict missing links in KGs.

**DistMult**[20] and **ComplEx**[21] are traditional semantic-matching models based on tensor decomposition; ComplEx additionally models asymmetric relations. The information on these models is summarized in Table 4.

#### 4.4. Performance Analysis

**RotatE**[35] and **HypER**[24] outperform the other models, which indicates that the rotation operation used in translational-distance models and the hypernetwork used in CNN-based models play important roles in improving performance. RotatE uses the complex space and mines different types of relations (symmetry/antisymmetry, inversion and composition); thus, different aspects of semantic information are modeled well by this integration strategy. HypER and **ConvE**[22] are both based on a CNN; the former improves on the latter with a hypernetwork, which performs weight sharing across layers and dynamically synthesizes weights given inputs.

**RW-LMLM**[91] takes into account both order information and a random-walk algorithm, giving it the capability to capture underlying semantic information.

**SACN**[26] also performs well among these models; it incorporates relation types and entity attributes into a GCN structure. On the whole, the top-performing models are all based on neural networks (GCNs or CNNs), from which we can conclude that advanced neural-network structures, with their ability to generate rich and expressive feature embeddings, are helpful for the KGE task. The performance of conventional models such as translation models (TransE) and semantic models (DistMult and ComplEx) is comparatively poor.

**SimplE**[152] has low performance on FB15k-237 and high performance on FB15k because FB15k contains inverse relations, which SimplE can model appropriately. For **HAKE**[60], we believe that the polar coordinates bring great benefits because of their particular structure, which enables the model to mine considerable hidden semantic structure information. In terms of **Bilinear+TR**[78] and **LiteralE**[115], we can see that adding entity types, text and other additional information helps to improve performance.

However, **LiteralE**[115] does not obtain uniformly good results. On FB15k, only ComplEx improves upon adding literal information, while on FB15k-237, only DistMult improves slightly. For ConvE, a neural-network model, adding literal data yields worse rather than better results. LiteralE combines literal vectors (only numerical information is involved in this model) with the entity embedding as the input for training. DistMult uses a simple bilinear formulation with matrix multiplication to learn embeddings; its scoring function can capture only pairwise interactions between entities in the same dimension, so this simple embedding can deal only with symmetric relations. We suspect that this is why the literal information does not help. However, on FB15k-237, the result of DistMult improves slightly because the inverse relations have been removed. Because ComplEx introduces a complex vector space and can deal with asymmetric relations, it responds well to literal information on FB15k but not on FB15k-237. For ConvE, we believe that the neural-network model already aggregates neighborhood information well, so it is not sensitive to the addition of literal vectors; it performs even worse, which we attribute to the large number of parameters from LiteralE and ConvE combined. In addition, we added textual information on top of the numeric literals for LiteralE (LiteralE+text+DistMult), and the experimental results were similarly worse. We speculate that simply concatenating textual information with the entity embedding at the input does not play an important role; instead, effective neighborhood information should be aggregated continuously during training while keeping the number of parameters small. Moreover, the literal information should not be restricted to numerical values but should also include entity types, entity attributes, paths and other additional information.

#### 4.5. Training Time Analysis

- For CNN-based models, the initial model ConvE, which introduces numerous parameters because of its embedded 2D convolution, is very time-consuming to train. Similarly, for LiteralE, the introduction of additional information and its complex model structure lead to extra parameter overhead. HypER, by contrast, utilizes 1D relation-specific filters and a nonlinear (quadratic) combination of entity and relation embeddings via hypernetworks to perform weight sharing; it has far fewer parameters than ConvE and therefore saves much training time.
- Semantic-matching models such as DistMult and ComplEx both suffer from longer training times.
- Translational-distance models such as HAKE and RotatE have shorter training times because translational-distance models have relatively simple structures and scoring functions without many parameters.
- The Bilinear+TR model has the shortest training time, with a type regularizer incorporated into the loss function that fully accounts for the type information of entities. Linear models train quickly, but their performance is poor.
- LiteralE introduces some overhead in the number of parameters compared to the base method, leading to a long training time. This is due to the choice of the core function g, which takes an entity embedding and a literal vector as inputs and maps them to a vector of the same dimension as the entity embedding. Thus, much effort could be devoted to choosing a better function for this step.

#### 4.6. Suggestions for Improvement

## 5. Conclusions

- Neural-network models with well-designed structures and small numbers of parameters perform well. In particular, graph convolutional networks have a strong ability to mine the underlying semantics of knowledge graphs. In addition, if the node information of a multi-hop neighborhood can be aggregated, the accuracy of a model on specific tasks can be greatly improved.
- Models with additional information, such as node attributes, node types, relationship types, prior knowledge and so on, have better performance.

- This survey focused only on the link-prediction task of KGE; in the future, we will study more knowledge graph completion tasks, such as entity prediction, entity classification and triple classification.
- This survey used only two datasets (FB15k and FB15k-237) for the experiments; we will use more knowledge graph datasets, such as WN18, WN18RR and FB13.
- This survey focused only on static graphs; we will explore new model architectures, such as dynamic graphs and heterogeneous graphs.
- The categories we proposed for KGE models may not be the perfect ones; we will attempt to mine new categorization strategies for KGE models.

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

- Bollacker, K.D.; Evans, C.; Paritosh, P.; Sturge, T.; Taylor, J. Freebase: A Collaboratively Created Graph Database for Structuring Human Knowledge; SIGMOD: Vancouver, BC, Canada, 2008; pp. 1247–1250. [Google Scholar]
- Lehmann, J.; Isele, R.; Jakob, M.; Jentzsch, A.; Kontokostas, D.; Mendes, P.N.; Hellmann, S.; Morsey, M.; Kleef, P.V.; Auer, S.; et al. DBpedia—A Large-Scale, Multilingual Knowledge base Extracted from Wikipedia; Springer: Berlin/Heidelberg, Germany, 2015; Volume 6, pp. 167–195. [Google Scholar]
- Mahdisoltani, F.; Biega, J.A.; Suchanek, F.M. YAGO3: A Knowledge Base from Multilingual Wikipedias. In Proceedings of the CIDR, Asilomar, CA, USA, 4–7 January 2015. [Google Scholar]
- Wang, R.; Wang, M.; Liu, J.; Chen, W.; Cochez, M.; Decker, S. Leveraging Knowledge Graph Embeddings for Natural Language Question Answering. In Proceedings of the DASFAA 2019, Chiang Mai, Thailand, 22–25 April 2019; pp. 659–675. [Google Scholar]
- Musto, C.; Basile, P.; Semeraro, G. Embedding Knowledge Graphs for Semantics-aware Recommendations based on DBpedia. In Proceedings of the UMAP 2019, Larnaca, Cyprus, 9–12 June 2019; pp. 27–31. [Google Scholar]
- Wang, Q.; Mao, Z.; Wang, B.; Guo, L. Knowledge Graph Embedding: A Survey of Approaches and Applications. IEEE Trans. Knowl. Data Eng.
**2017**, 29, 2724–2743. [Google Scholar] [CrossRef] - Cai, H.; Zheng, V.W.; Chang, K.C. A Comprehensive Survey of Graph Embedding: Problems, Techniques, and Applications. IEEE Trans. Knowl. Data Eng.
**2017**, 30, 1616–1637. [Google Scholar] [CrossRef] [Green Version] - Siddhant, A. A Survey on Graph Neural Networks for Knowledge Graph Completion. arXiv
**2020**, arXiv:2007.12374. [Google Scholar] - Ma, J.; Qiao, Y.; Hu, G.; Wang, Y.; Zhang, C.; Huang, Y.; Sangaiah, A.K.; Wu, H.; Zhang, H.; Ren, K. ELPKG: A High-Accuracy Link Prediction Approach for Knowledge Graph Completion. Symmetry
**2019**, 11, 1096. [Google Scholar] [CrossRef] [Green Version] - Chang, K.; Yih, W.; Yang, B.; Meek, C. Typed Tensor Decomposition of Knowledge Bases for Relation Extraction. In Proceedings of the EMNLP, Doha, Qatar, 25–29 October 2014; pp. 1568–1579. [Google Scholar]
- Lao, N.; Mitchell, T.; Cohen, W.W. Random Walk Inference and Learning in A Large Scale Knowledge Base. In Proceedings of the EMNLP, Edinburgh, UK, 27–31 July 2011; pp. 529–539. [Google Scholar]
- Lu, F.; Cong, P.; Huang, X. Utilizing Textual Information in Knowledge Graph Embedding: A Survey of Methods and Applications. IEEE Access
**2020**, 8, 92072–92088. [Google Scholar] [CrossRef] - Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating Embeddings for Modeling Multi-Relational Data. In Proceedings of the NIPS, Lake Tahoe, NV, USA, 5–8 December 2013. [Google Scholar]
- Minervini, P.; d’ Amato, C.; Fanizzi, N.; Esposito, F. Efficient Learning of Entity and Predicate Embeddings for Link Prediction in Knowledge Graphs. In Proceedings of the URSW@ISWC, Bethlehem, PA, USA, 11–15 October 2015; pp. 26–37. [Google Scholar]
- Wang, Z.; Zhang, J.; Feng, J.; Chen, Z. Knowledge Graph Embedding by Translating on Hyperplanes; AAAI Press: Palo Alto, CA, USA, 2014; pp. 1112–1119. [Google Scholar]
- Fan, M.; Zhou, Q.; Chang, E.; Zheng, T.F. Transition-based Knowledge Graph Embedding with Relational Mapping Properties. In Proceedings of the PACLIC, Phuket, Thailand, 12–14 December 2014; pp. 328–337. [Google Scholar]
- Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; Zhu, X. Learning Entity and Relation Embeddings for Knowledge Graph Completion; AAAI Press: Palo Alto, CA, USA, 2015; pp. 2181–2187. [Google Scholar]
- Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.S.; Dean, J. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of the NIPS, Lake Tahoe, NV, USA, 5–8 December 2013; pp. 3111–3119. [Google Scholar]
- Liu, Z.; Sun, M.; Lin, Y.; Xie, R. Knowledge Representation Learning: A Review. J. Comp. Res. Develop.
**2016**, 247–261. [Google Scholar] - Yang, B.; Yih, W.; He, X.; Gao, J.; Deng, L. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. In Proceedings of the ICLR (Poster), San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
- Trouillon, T.; Welbl, J.; Riedel, S.; Gaussier, É.; Bouchard, G. Complex Embeddings for Simple Link Prediction; ICML: New York City, NY, USA, 2016; pp. 2071–2080. [Google Scholar]
- Dettmers, T.; Minervini, P.; Stenetorp, P.; Riedel, S. Convolutional 2D Knowledge Graph Embeddings; AAAI Press: Palo Alto, CA, USA, 2017; pp. 1811–1818. [Google Scholar]
- Nguyen, D.Q.; Nguyen, T.D.; Nguyen, D.Q.; Phung, D.Q. A Novel Embedding Model for Knowledge Base Completion Based on Convolutional Neural Network. In Proceedings of the NAACL-HLT, New Orleans, LA, USA, 1–6 June 2018; pp. 327–333. [Google Scholar]
- Balazevic, I.; Allen, C.; Hospedales, T.M. Hypernetwork Knowledge Graph Embeddings. In Proceedings of the ICANN (Workshop), Munich, Germany, 17–19 September 2019; pp. 553–565. [Google Scholar]
- Vashishth, S.; Sanyal, S.; Nitin, V.; Talukdar, P.P. Composition-based Multi-Relational Graph Convolutional Networks. In Proceedings of the ICLR, Addis Ababa, Ethiopia, 26–30 April 2020. [Google Scholar]
- Shang, C.; Tang, Y.; Huang, J.; Bi, J.; He, X.; Zhou, B. End-to-End Structure-Aware Convolutional Networks for Knowledge Base Completion; AAAI Press: Palo Alto, CA, USA, 2019; pp. 3060–3067. [Google Scholar]
- Jagvaral, B.; Lee, W.; Roh, J.S.; Kim, M.S.; Park, Y.T. Path-based reasoning approach for knowledge graph completion using CNN-BiLSTM with attention mechanism. Expert Syst. Appl.
**2020**, 142, 112960. [Google Scholar] [CrossRef] - Rossi, A.; Barbosa, D.; Firmani, D.; Matinata, A.; Merialdo, P. Knowledge graph embedding for link prediction: A comparative analysis. ACM Trans. Knowl. Discov. Data TKDD
**2021**, 15, 1–49. [Google Scholar] - Dai, Y.; Wang, S.; Xiong, N.N.; Guo, W. A survey on knowledge graph embedding: Approaches, applications and benchmarks. Electronics
**2020**, 9, 750. [Google Scholar] [CrossRef] - Chen, X.; Jia, S.; Xiang, Y. A review: Knowledge reasoning over knowledge graph. Expert Syst. Appl.
**2020**, 141, 112948.1–112948.21. [Google Scholar] [CrossRef] - Ji, S.; Pan, S.; Cambria, E.; Marttinen, P.; Yu, P.S. A Survey on Knowledge Graphs: Representation, Acquisition and Applications. arXiv
**2020**, arXiv:2002.00388. [Google Scholar] - Lin, Y.; Han, X.; Xie, R.; Liu, Z.; Sun, M. Knowledge Representation Learning: A Quantitative Review. arXiv
**2018**, arXiv:1812.10901. [Google Scholar] - Nguyen, D.Q. An overview of embedding models of entities and relationships for knowledge base completion. arXiv
**2017**, arXiv:1703.08098. [Google Scholar] - Kazemi, S.M.; Goel, R.; Jain, K.; Kobyzev, I.; Sethi, A.; Forsyth, P.; Poupart, P. Representation Learning for Dynamic Graphs: A Survey. J. Mach. Learn. Res.
**2020**, 21, 1–73. [Google Scholar] - Sun, Z.; Deng, Z.H.; Nie, J.Y.; Tang, J. RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space. In Proceedings of the ICLR(Poster), New Orleans, LA, USA, 6–9 May 2019. [Google Scholar]
- Ji, G.; He, S.; Xu, L.; Liu, K.; Zhao, J. Knowledge Graph Embedding via Dynamic Mapping Matrix; ACL: Beijing, China, 2015; pp. 687–696. [Google Scholar]
- Jia, Y.; Wang, Y.; Lin, H.; Jin, X.; Cheng, X. Locally Adaptive Translation for Knowledge Graph Embedding; AAAI: Phoenix, AZ, USA, 2016; pp. 992–998. [Google Scholar]
- Ji, G.; Liu, K.; He, S.; Zhao, J. Knowledge Graph Completion with Adaptive Sparse Transfer Matrix; AAAI Press: Palo Alto, CA, USA, 2016; pp. 985–991. [Google Scholar]
- Xiao, H.; Huang, M.; Zhu, X. From One Point to a Manifold: Knowledge Graph Embedding for Precise Link Prediction. In Proceedings of the IJCAI, New York, NY, USA, 9–15 July 2016; pp. 1315–1321. [Google Scholar]
- Nguyen, D.Q.; Sirts, K.; Qu, L.; Johnson, M. STransE: A novel embedding model of entities and relationships in knowledge bases. In Proceedings of the HLT-NAACL, San Diego, CA, USA, 21 May 2016; pp. 460–466. [Google Scholar]
- Feng, J.; Huang, M.; Wang, M.; Zhou, M.; Hao, Y.; Zhu, X. Knowledge Graph Embedding by Flexible Translation. In Proceedings of the KR, Cape Town, South Africa, 25–29 April 2016; pp. 557–560. [Google Scholar]
- Chang, L.; Zhu, M.; Gu, T.; Bin, C.; Qian, J.; Zhang, J. Knowledge graph embedding by dynamic translation. IEEE Access
**2017**, 5, 20898–20907. [Google Scholar] [CrossRef] - Zhang, C.; Zhou, M.; Han, X.; Hu, Z.; Ji, Y. Knowledge Graph Embedding for Hyper-Relational Data. J. Tsinghua Univ. Nat. Sci. Ed.
**2017**, 22, 185–197. [Google Scholar] [CrossRef] - Du, Z.; Hao, Z.; Meng, X.; Wang, Q. CirE: Circular Embeddings of Knowledge Graphs. In Proceedings of the DASFAA, Suzhou, China, 27–30 May 2017; pp. 148–162. [Google Scholar]
- Tan, Z.; Zhao, X.; Fang, Y.; Xiao, W. GTrans: Generic knowledge graph embedding via multi-state entities and dynamic relation spaces. IEEE Access
**2018**, 6, 8232–8244. [Google Scholar] [CrossRef] - Zhu, J.; Jia, Y.; Xu, J.; Qiao, J.; Cheng, X. Modeling the Correlations of Relations for Knowledge Graph Embedding. Comput. Sci. Technol.
**2018**, 33, 323–334. [Google Scholar] [CrossRef] - Do, K.; Tran, T.; Venkatesh, S. Knowledge Graph Embedding with Multiple Relation Projections. In Proceedings of the ICPR, Beijing, China, 20–24 August 2018; pp. 332–337. [Google Scholar]
- Zhu, Q.; Zhou, X.; Tan, J.; Liu, P.; Guo, L. Learning Knowledge Graph Embeddings via Generalized Hyperplanes. In Proceedings of the ICCS, Wuxi, China, 11–13 June 2018; pp. 624–638. [Google Scholar]
- Geng, Z.; Li, Z.; Han, Y. A Novel Asymmetric Embedding Model for Knowledge Graph Completion. In Proceedings of the ICPR, Beijing, China, 20–24 August 2018; pp. 290–295. [Google Scholar]
- Zhang, Y.; Du, Z.; Meng, X. EMT: A Tail-Oriented Method for Specific Domain Knowledge Graph Completion. In Proceedings of the PAKDD, Macau, China, 14–17 April 2019; pp. 514–527. [Google Scholar]
- Yao, J.; Zhao, Y. Knowledge Graph Embedding Bi-vector Models for Symmetric Relation. In Chinese Intelligent Systems Conference; Springer: Singapore, 2019. [Google Scholar]
- Yang, S.; Tian, J.; Zhang, H.; Yan, J.; He, H.; Jin, Y. TransMS: Knowledge Graph Embedding for Complex Relations by Multidirectional Semantics. In Proceedings of the IJCAI, Macao, China, 10–16 August 2019; pp. 1935–1942. [Google Scholar]
- Ebisu, T.; Ichise, R. Generalized Translation-Based Embedding of Knowledge Graph. IEEE Trans. Knowl. Data Eng.
**2020**, 32, 941–951. [Google Scholar] [CrossRef] - Cui, Z.; Liu, S.; Pan, L.; He, Q. Translating Embedding with Local Connection for Knowledge Graph Completion. In Proceedings of the AAMAS, Auckland, New Zealand, 9–13 May 2020; pp. 1825–1827. [Google Scholar]
- He, S.; Liu, K.; Ji, G.; Zhao, J. Learning to Represent Knowledge Graphs with Gaussian Embedding. In Proceedings of the CIKM, Melbourne, VIC, Australia, 19–23 October 2015; pp. 623–632. [Google Scholar]
- Xiao, H.; Huang, M.; Hao, Y.; Zhu, X. TransG: A Generative Mixture Model for Knowledge Graph Embedding. ACL
**2015**, 1, 2316–2325. [Google Scholar] - Song, H.J.; Park, S.B. Enriching translation-based knowledge graph embeddings through continual learning. IEEE Access
**2018**, 6, 60489–60497. [Google Scholar] [CrossRef] - Ebisu, T.; Ichise, R. TorusE: Knowledge Graph Embedding on a Lie Group; AAAI Press: Palo Alto, CA, USA, 2018; pp. 1819–1826. [Google Scholar]
- Zhang, S.; Tay, Y.; Yao, L.; Liu, Q. Quaternion Knowledge Graph Embeddings. arXiv
**2019**, arXiv:1904.10281. [Google Scholar] - Zhang, Z.; Cai, J.; Zhang, Y.; Wang, J. Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction. In Proceedings of the AAAI 2020, New York, NY, USA, 7–12 February 2020; pp. 3065–3072. [Google Scholar]
- Kong, X.; Chen, X.; Hovy, E.H. Decompressing Knowledge Graph Representations for Link Prediction. arXiv
**2019**, arXiv:1911.04053. [Google Scholar] - Chen, Y.; Liu, J.; Zhang, Z.; Wen, S.; Xiong, W. MobiusE: Knowledge Graph Embedding on Mobius Ring. arXiv
**2021**, arXiv:2101.02352, arXiv. [Google Scholar] - Chen, H.; Wang, W.; Li, G.; Shi, Y. A quaternion-embedded capsule network model for knowledge graph completion. IEEE Access
**2020**, 8, 100890–100904. [Google Scholar] [CrossRef] - Nickel, M.; Tresp, V.; Kriegel, H.P. A Three-Way Model for Collective Learning on Multi-Relational Data. In Proceedings of the ICML, Washington, DC, USA, 28 June–2 July 2011; pp. 809–816. [Google Scholar]
- Nickel, M.; Rosasco, L.; Poggio, T.A. Holographic Embeddings of Knowledge Graphs; AAAI: Phoenix, AZ, USA, 2016; pp. 1955–1961. [Google Scholar]
- Liu, H.; Wu, Y.; Yang, Y. Analogical Inference for Multi-Relational Embeddings; ICML: Sydney, NSW, Australia, 2017; pp. 2168–2178. [Google Scholar]
- Lacroix, T.; Usunier, N.; Obozinski, G. Canonical Tensor Decomposition for Knowledge Base Completion. In Proceedings of the ICML, Vienna, Austria, 23–31 July 2018; pp. 2869–2878. [Google Scholar]
- Balazevic, I.; Allen, C.; Hospedales, M.T. TuckER: Tensor Factorization for Knowledge Graph Completion; EMNLP/IJCNLP: Hong Kong, China, 2019; pp. 5184–5193. [Google Scholar]
- Mohamed, S.K.; Novácek, V. Link Prediction Using Multi Part Embeddings. In Proceedings of the ESWC, Portoroz, Slovenia, 2–6 June 2019; pp. 240–254. [Google Scholar]
- Zhang, W.; Paudel, B.; Zhang, W.; Bernstein, A.; Chen, H. Interaction Embeddings for Prediction and Explanation in Knowledge Graphs; WSDM: Melbourne, VIC, Australia, 2019; pp. 96–104. [Google Scholar]
- Xue, Y.; Yuan, Y.; Xu, Z.; Sabharwal, A. Expanding Holographic Embeddings for Knowledge Completion. In Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montreal, QC, Canada, 3–8 December 2018. [Google Scholar]
- Tran, H.N.; Takasu, A. Multi-Partition Embedding Interaction with Block Term Format for Knowledge Graph Completion. In Proceedings of the ECAI, Copenhagen, Denmark, 19–24 July 2020; pp. 833–840. [Google Scholar]
- Xie, R.; Liu, Z.; Sun, M. Representation Learning of Knowledge Graphs with Hierarchical Types. In Proceedings of the IJCAI, New York, NY, USA, 9–15 July 2016; pp. 2965–2971. [Google Scholar]
- Guo, S.; Wang, Q.; Wang, B.; Wang, L.; Guo, L. SSE: Semantically Smooth Embedding for Knowledge Graphs. IEEE Trans. Knowl. Data Eng.
**2017**, 29, 884–897. [Google Scholar] [CrossRef] - Jiang, X.; Wang, Q.; Qi, B.; Qiu, Y.; Li, P.; Wang, B. Attentive Path Combination for Knowledge Graph Completion. In Proceedings of the ACML, Seoul, Korea, 15–17 November 2017; pp. 590–605. [Google Scholar]
- Moon, C.; Jones, P.; Samatova, N.F. Learning Entity Type Embedding for Knowledge Graph Completion. In Proceedings of the CIKM, Singapore, 6–10 November 2017; pp. 2215–2218. [Google Scholar]
- Ma, S.; Ding, J.; Jia, W.; Wang, K.; Guo, M. TransT: Type-Based Multiple Embedding Representations for Knowledge Graph Completion. In Proceedings of the ECML/PKDD, Skopje, Macedonia, 18–22 September 2017; pp. 717–733. [Google Scholar]
- Kotnis, B.; Nastase, V. Learning Knowledge Graph Embeddings with Type Regularizer; K-CAP: Austin, TX, USA, 2017; pp. 1–4. [Google Scholar]
- Rahman, M.M.; Takasu, A. Knowledge Graph Embedding via Entities’ Type Mapping Matrix. In Proceedings of the ICONIP, Siem Reap, Cambodia, 13–16 December 2018. [Google Scholar]
- Zhou, B.; Chen, Y.; Liu, K.; Zhao, J. Relation and Fact Type Supervised Knowledge Graph Embedding via Weighted Scores. In Proceedings of the CCL, Kunming, China, 18–20 October 2019; pp. 258–267. [Google Scholar]
- Ma, J.; Zhong, M.; Wen, J.; Chen, W.; Zhou, X.; Li, X. RecKGC: Integrating Recommendation with Knowledge Graph Completion. In Proceedings of the ADMA, Dalian, China, 21–23 November 2019; pp. 250–265. [Google Scholar]
- Lin, X.; Liang, Y.; Giunchiglia, F.; Feng, X.; Guan, R. Relation path embedding in knowledge graphs. Neur. Comput. Appl.
**2019**, 31, 5629–5639. [Google Scholar] [CrossRef] [Green Version] - Lin, Y.; Liu, Z.; Luan, H.B.; Sun, M.; Rao, S.; Liu, S. Modeling Relation Paths for Representation Learning of Knowledge Bases. arXiv
**2015**, arXiv:1506.00379. [Google Scholar] - Zeng, P.; Tan, Q.; Meng, X.; Zhang, H.; Xu, J. Modeling Complex Relationship Paths for Knowledge Graph Completion. IEICE Transact.
**2018**, 101, 1393–1400. [Google Scholar] [CrossRef] [Green Version] - Jia, Y.; Wang, Y.; Jin, X.; Cheng, X. Path-specific knowledge graph embedding. Knowl. Based Syst.
**2018**, 151, 37–44. [Google Scholar] [CrossRef] - Xiong, S.; Huang, W.; Duan, P. Knowledge Graph Embedding via Relation Paths and Dynamic Mapping Matrix. In Proceedings of the ER Workshops, Xi’an, China, 22–25 October 2028; pp. 106–118. [Google Scholar]
- Zhang, M.; Wang, Q.; Xu, W.; Li, W.; Sun, S. Discriminative Path-Based Knowledge Graph Embedding for Precise Link Prediction. In Proceedings of the ECIR, Grenoble, France, 26–29 March 2018. [Google Scholar]
- Nastase, V.; Kotnis, B. Abstract Graphs and Abstract Paths for Knowledge Graph Completion. In Proceedings of the *SEM@NAACL-HLT 2019, Minneapolis, MN, USA, 6–7 June 2019. [Google Scholar]
- Sun, J.; Xu, G.; Cheng, Y.; Zhuang, T. Knowledge Map Completion Method Based on Metric Space and Relational Path. In Proceedings of the 2019 14th International Conference on Computer Science & Education (ICCSE), Toronto, ON, Canada, 19–21 August 2019; pp. 108–113. [Google Scholar]
- Wang, Q.; Huang, P.; Wang, H.; Dai, S.; Jiang, W.; Liu, J.; Lyu, Y.; Zhu, Y.; Wu, H. CoKE: Contextualized Knowledge Graph Embedding. arXiv
**2019**, arXiv:1911.02168. [Google Scholar] - Wang, C.; Yan, M.; Yi, C.; Sha, Y. Capturing Semantic and Syntactic Information for Link Prediction in Knowledge Graphs. In Proceedings of the ISWC, Auckland, New Zealand, 26–30 October 2019; pp. 664–679. [Google Scholar]
- Nathani, D.; Chauhan, J.; Sharma, C.; Kaul, M. Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs. In Proceedings of the ACL 2019, Florence, Italy, 28 July–2 August 2019. [Google Scholar]
- Wang, R.; Li, B.; Hu, S.; Du, W.; Zhang, M. Knowledge Graph Embedding via Graph Attenuated Attention Networks. IEEE Access
**2020**, 8, 5212–5224. [Google Scholar] [CrossRef] - Xie, R.; Liu, Z.; Jia, J.; Luan, H.; Sun, M. Representation Learning of Knowledge Graphs with Entity Descriptions; AAAI Press: Palo Alto, CA, USA, 2016; pp. 2659–2665. [Google Scholar]
- Xiao, H.; Huang, M.; Meng, L.; Zhu, X. SSP: Semantic Space Projection for Knowledge Graph Embedding with Text Descriptions; AAAI Press: Palo Alto, CA, USA, 2017; pp. 3104–3110. [Google Scholar]
- Chen, M.; Tian, Y.; Chang, K.-W.; Skiena, S.; Zaniolo, C. Co-training Embeddings of Knowledge Graphs and Entity Descriptions for Cross-Lingual Entity Alignment. In Proceedings of the IJCAI, Stockholm, Sweden, 13–19 July 2018; pp. 3998–4004. [Google Scholar]
- Zhao, M.; Zhao, Y.; Xu, B. Knowledge Graph Completion via Complete Attention between Knowledge Graph and Entity Descriptions. In Proceedings of the CSAE, Sanya, China, 22–24 October 2019. [Google Scholar]
- Veira, N.; Keng, B.; Padmanabhan, K.; Veneris, A.G. Unsupervised Embedding Enhancements of Knowledge Graphs using Textual Associations. In Proceedings of the IJCAI, Macao, China, 10–16 August 2019; pp. 5218–5225. [Google Scholar]
- Shah, H.; Villmow, J.; Ulges, A.; Schwanecke, U.; Shafait, F. An Open-World Extension to Knowledge Graph Completion Models; AAAI Press: Palo Alto, CA, USA, 2019; pp. 3044–3051. [Google Scholar]
- Wang, S.; Jiang, C. Knowledge graph embedding with interactive guidance from entity descriptions. IEEE Access
**2019**, 7, 156686–156693. [Google Scholar] - Ma, L.; Sun, P.; Lin, Z.; Wang, H. Composing Knowledge Graph Embeddings via Word Embeddings. arXiv
**2019**, arXiv:1909.03794. [Google Scholar] - Guo, S.; Wang, Q.; Wang, L.; Wang, B.; Guo, L. Jointly embedding knowledge graphs and logical rules. In Proceedings of the EMNLP, Austin, TX, USA, 1–4 November 2016; pp. 192–202. [Google Scholar]
- Yoon, H.-G.; Song, H.-J.; Park, S.-B.; Park, S.-Y. A Translation-Based Knowledge Graph Embedding Preserving Logical Property of Relations. In Proceedings of the HLT-NAACL, San Diego, CA, USA, 12–17 June 2016; pp. 907–916. [Google Scholar]
- Du, J.; Qi, K.; Wan, H.; Peng, B.; Lu, S.; Shen, Y. Enhancing Knowledge Graph Embedding from a Logical Perspective. In Proceedings of the JIST, Gold Coast, Australia, 10–12 November 2017; pp. 232–247. [Google Scholar]
- Han, X.; Zhang, C.; Sun, T.; Ji, Y.; Hu, Z. A triple-branch neural network for knowledge graph embedding. IEEE Access
**2018**, 6, 76606–76615. [Google Scholar] [CrossRef] - Yuan, J.; Gao, N.; Xiang, J. TransGate: Knowledge Graph Embedding with Shared Gate Structure; AAAI Press: Palo Alto, CA, USA, 2019; pp. 3100–3107. [Google Scholar]
- Wang, M.; Rong, E.; Zhuo, H.; Zhu, H. Embedding Knowledge Graphs Based on Transitivity and Asymmetry of Rules. In Proceedings of the PAKDD, Melbourne, VIC, Australia, 3–6 June 2018; pp. 141–153. [Google Scholar]
- Wang, P.; Dou, D.; Wu, F.; Silva, N.; Jin, L. Logic Rules Powered Knowledge Graph Embedding. arXiv
**2019**, arXiv:1903.03772. [Google Scholar] - Zhang, J.; Li, J. Enhanced Knowledge Graph Embedding by Jointly Learning Soft Rules and Facts. Algorithms
**2019**, 12, 265. [Google Scholar] [CrossRef] [Green Version] - Gu, Y.; Guan, Y.; Missier, P. Towards Learning Instantiated Logical Rules from Knowledge Graphs. arXiv
**2020**, arXiv:2003.06071. [Google Scholar] - Das, R.; Godbole, A.; Dhuliawala, S.; Zaheer, M.; McCallum, A. A Simple Approach to Case-Based Reasoning in Knowledge Bases; AKBC: San Francisco, CA, USA, 2020. [Google Scholar]
- Das, R.; Godbole, A.; Monath, N.; Zaheer, M.; McCallum, A. Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion. arXiv
**2020**, arXiv:2010.03548. [Google Scholar] - García-Durán, A.; Niepert, M. KBLRN: End-to-End Learning of Knowledge Base Representations with Latent, Relational, and Numerical Features. In Proceedings of the UAI, Monterey, CA, USA, 6–10 August 2018; pp. 372–381. [Google Scholar]
- Wu, Y.; Wang, Z. Knowledge Graph Embedding with Numeric Attributes of Entities. In Proceedings of the Third Workshop on Representation Learning for NLP, Melbourne, Australia, 20 July 2018; pp. 132–136. [Google Scholar]
- Kristiadi, A.; Khan, M.A.; Lukovnikov, D.; Lehmann, J.; Fischer, A. Incorporating Literals into Knowledge Graph Embeddings. In Proceedings of the ISWC, Auckland, New Zealand, 26–30 October 2019; pp. 347–363. [Google Scholar]
- Feng, M.-H.; Hsu, C.-C.; Li, C.-T.; Yeh, M.-Y.; Lin, S.-D. MARINE: Multi-relational Network Embeddings with Relational Proximity and Node Attributes. In The World Wide Web Conference; ACM: New York, NY, USA, 2019; pp. 470–479. [Google Scholar]
- Zhang, Z.; Cao, L.; Chen, X.; Tang, W.; Xu, Z.; Meng, Y. Representation Learning of Knowledge Graphs With Entity Attributes. IEEE Access
**2020**, 8, 7435–7441. [Google Scholar] [CrossRef] - Jiang, T.; Liu, T.; Ge, T.; Sha, L.; Li, S.; Chang, B.; Sui, Z. Encoding Temporal Information for Time-Aware Link Prediction. In Proceedings of the EMNLP, Austin, TX, USA, 1–4 November 2016. [Google Scholar]
- Esteban, C.; Tresp, V.; Yang, Y.; Baier, S.; Krompass, D. Predicting the co-evolution of event and Knowledge Graphs. In Proceedings of the 2016 19th International Conference on Information Fusion (FUSION), Heidelberg, Germany, 5–8 July 2016. [Google Scholar]
- Trivedi, R.; Dai, H.; Wang, Y.; Song, L. Know-evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs. In Proceedings of the ICML, Sydney, NSW, Australia, 6–11 August 2017; Volume 70, pp. 3462–3471. [Google Scholar]
- Jia, Y.; Wang, Y.; Jin, X.; Lin, H.; Cheng, X. Knowledge Graph Embedding: A Locally and Temporally Adaptive Translation-Based Approach. ACM Trans. Web
**2018**, 12, 8:1–8:33. [Google Scholar] [CrossRef] - Dasgupta, S.S.; Ray, S.N.; Talukdar, P.P. HyTE: Hyperplane-based Temporally aware Knowledge Graph Embedding. In Proceedings of the EMNLP, Brussels, Belgium, 31 October–4 November 2018; pp. 2001–2011. [Google Scholar]
- Xu, C.; Nayyeri, M.; Alkhoury, F.; Lehmann, J.; Yazdi, H.S. Temporal Knowledge Graph Completion Based on Time Series Gaussian Embedding. In Proceedings of the ISWC, Athens, Greece, 2–6 November 2020; pp. 654–671. [Google Scholar]
- Chen, S.; Qiao, L.; Liu, B.; Bo, J.; Cui, Y.; Li, J. Knowledge Graph Embedding Based on Hyperplane and Quantitative Credibility. In Proceedings of the MLICOM, Nanjing, China, 24–25 August 2019; pp. 583–594. [Google Scholar]
- Tang, X.; Yuan, R.; Li, Q.; Wang, T.; Yang, H.; Cai, Y.; Song, H. Timespan-Aware Dynamic Knowledge Graph Embedding by Incorporating Temporal Evolution. IEEE Access
**2020**, 8, 6849–6860. [Google Scholar] [CrossRef] - Jung, J.; Jung, J.; Kang, U. T-GAP: Learning to Walk across Time for Temporal Knowledge Graph Completion. arXiv
**2020**, arXiv:2012.10595. [Google Scholar] - Wu, J.; Cao, M.; Cheung, J.C.K.; Hamilton, W.L. TeMP: Temporal Message Passing for Temporal Knowledge Graph Completion. arXiv
**2020**, arXiv:2010.03526. [Google Scholar] - Feng, J.; Huang, M.; Yang, Y.; Zhu, X. GAKE: Graph Aware Knowledge Embedding. In Proceedings of the COLING, Osaka, Japan, 11–16 December 2016. [Google Scholar]
- Zhou, C.; Liu, Y.; Liu, X.; Liu, Z.; Gao, J. Scalable Graph Embedding for Asymmetric Proximity; AAAI Press: Palo Alto, CA, USA, 2017; pp. 2942–2948. [Google Scholar]
- Zhang, W. Knowledge Graph Embedding with Diversity of Structures. In Proceedings of the WWW (Companion Volume), Perth, Australia, 3–7 April 2017. [Google Scholar]
- Pal, S.; Urbani, J. Enhancing Knowledge Graph Completion By Embedding Correlation. In Proceedings of the CIKM, Singapore, 6–10 November 2017; pp. 2247–2250. [Google Scholar]
- Shi, J.; Gao, H.; Qi, G.; Zhou, Z. Knowledge Graph Embedding with Triple Context. In Proceedings of the CIKM, Singapore, 6–10 November 2017; pp. 2299–2302. [Google Scholar]
- Gao, H.; Shi, J.; Qi, G.; Wang, M. Triple context-based knowledge graph embedding. IEEE Access
**2018**, 6, 58978–58989. [Google Scholar] [CrossRef] - Li, W.; Zhang, X.; Wang, Y.; Yan, Z.; Peng, R. Graph2Seq: Fusion Embedding Learning for Knowledge Graph Completion. IEEE Access
**2019**, 7, 157960–157971. [Google Scholar] [CrossRef] - Zhang, Z.; Zhuang, F.; Qu, M.; Lin, F.; He, Q. Knowledge Graph Embedding with Hierarchical Relation Structure. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 3198–3207. [Google Scholar]
- Han, X.; Zhang, C.; Guo, C.; Sun, T.; Ji, Y. Knowledge Graph Embedding Based on Subgraph-Aware Proximity; AAAI Press: Palo Alto, CA, USA, 2018; pp. 306–318. [Google Scholar]
- Tan, Y.; Li, R.; Zhou, J.; Zhu, S. Knowledge Graph Embedding by Translation Model on Subgraph. In Proceedings of the HCC, Mérida, Mexico, 5–7 December 2018; pp. 269–280. [Google Scholar]
- Zhang, Y.; Yao, Q.; Chen, L. Neural Recurrent Structure Search for Knowledge Graph Embedding. arXiv
**2019**, arXiv:1911.07132. [Google Scholar] - Wan, G.; Du, B.; Pan, S.; Wu, J. Adaptive knowledge subgraph ensemble for robust and trustworthy knowledge graph completion. World Wide Web
**2020**, 23, 471–490. [Google Scholar] [CrossRef] - Qiao, Z.; Ning, Z.; Du, Y.; Zhou, Y. Context-Enhanced Entity and Relation Embedding for Knowledge Graph Completion. arXiv
**2020**, arXiv:2012.07011. [Google Scholar] - Ding, B.; Wang, Q.; Wang, B.; Guo, L. Improving Knowledge Graph Embedding Using Simple Constraints. In Proceedings of the ACL, Melbourne, Australia, 15–20 July 2018; pp. 110–121. [Google Scholar]
- Huang, Y.; Xu, K.; Wang, X.; Sun, H.; Lu, S.; Wang, T.; Zhang, X. CoRelatE: Modeling the Correlation in Multi-fold Relations for Knowledge Graph Embedding. In Proceedings of the ICLR, New Orleans, LA, USA, 6–9 May 2019. [Google Scholar]
- Kanojia, V.; Maeda, H.; Togashi, R.; Fujita, S. Enhancing Knowledge Graph Embedding with Probabilistic Negative Sampling. In Proceedings of the 26th International Conference on World Wide Web Companion; ACM: New York, NY, USA, 2017; pp. 801–802. [Google Scholar]
- Niu, J.; Sun, Z.; Zhang, W. Enhancing Knowledge Graph Completion with Positive Unlabeled Learning. In Proceedings of the ICPR, Beijing, China, 20–24 August 2018; pp. 296–301. [Google Scholar]
- Qin, S.; Rao, G.; Bin, C.; Chang, L.; Gu, T.; Xuan, W. Knowledge Graph Embedding Based on Adaptive Negative Sampling. In Proceedings of the ICPCSEE, Guilin, China, 20–23 September 2019; pp. 551–563. [Google Scholar]
- Yan, Z.; Peng, R.; Wang, Y.; Li, W. Enhance knowledge graph embedding via fake triples. In Proceedings of the IJCNN, Budapest, Hungary, 14–19 July 2019; pp. 1–7. [Google Scholar]
- Guo, C.; Zhang, C.; Han, X.; Ji, Y. AWML: Adaptive weighted margin learning for knowledge graph embedding. J. Intell. Inf. Syst.
**2019**, 53, 167–197. [Google Scholar] [CrossRef] - Yuan, J.; Gao, N.; Xiang, J.; Tu, C.; Ge, J. Knowledge Graph Embedding with Order Information of Triplets. In Proceedings of the PAKDD, Macau, China, 14–17 April 2019; pp. 476–488. [Google Scholar]
- Wang, Y.; Liu, Y.; Zhang, H.; Xie, H. Leveraging Lexical Semantic Information for Learning Concept-Based Multiple Embedding Representations for Knowledge Graph Completion. In Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data; Springer: Berlin/Heidelberg, Germany, 2019; pp. 382–397. [Google Scholar]
- Guan, N.; Song, D.; Liao, L. Knowledge graph embedding with concepts. Knowl. Based Syst.
**2019**, 164, 38–44. [Google Scholar] [CrossRef] - Yu, Y.; Xu, Z.; Lv, Y.; Li, J. TransFG: A Fine-Grained Model for Knowledge Graph Embedding. In Proceedings of the WISA, Qingdao, China, 20–22 September 2019. [Google Scholar]
- Kazemi, S.M.; Poole, D. SimplE Embedding for Link Prediction in Knowledge Graphs. NeurIPS
**2018**. [Google Scholar] - Fatemi, B.; Ravanbakhsh, S.; Poole, D. Improved Knowledge Graph Embedding Using Background Taxonomic Information; AAAI Press: Palo Alto, CA, USA, 2019; pp. 3526–3533. [Google Scholar]
- Bordes, A.; Glorot, X.; Weston, J.; Bengio, Y. A semantic matching energy function for learning with multi-relational data. Mach. Learn.
**2014**, 94, 233–259. [Google Scholar] [CrossRef] [Green Version] - Socher, R.; Chen, D.; Manning, C.D.; Ng, A.Y. Reasoning With Neural Tensor Networks for Knowledge Base Completion. In Proceedings of the NIPS, Lake Tahoe, NV, USA, 5–8 December 2013; pp. 926–934. [Google Scholar]
- Dong, X.; Gabrilovich, E.; Heitz, G.; Horn, W.; Lao, N.; Murphy, K.; Strohmann, T.; Sun, S.; Zhang, W. Knowledge vault: A web-scale approach to probabilistic knowledge fusion. In Proceedings of the KDD, New York, NY, USA, 24–27 August 2014; pp. 601–610. [Google Scholar]
- Liu, Q.; Jiang, H.; Ling, Z.H.; Wei, S.; Hu, Y. Probabilistic Reasoning via Deep Learning: Neural Association Models. arXiv
**2016**, arXiv:1603.07704. [Google Scholar] - Schlichtkrull, M.S.; Kipf, T.N.; Bloem, P.; Berg, R.v.d.; Titov, I.; Welling, M. Modeling Relational Data with Graph Convolutional Networks. In Proceedings of the ESWC, Crete, Greece, 3–7 June 2018; pp. 593–607. [Google Scholar]
- Guo, L.; Zhang, Q.; Ge, W.; Hu, W.; Qu, Y. DSKG: A Deep Sequential Model for Knowledge Graph Completion. In Proceedings of the CCKS, Tianjin, China, 14–17 August 2018; pp. 65–77. [Google Scholar]
- Guan, S.; Jin, X.; Wang, Y.; Cheng, X. Shared Embedding Based Neural Networks for Knowledge Graph Completion. In Proceedings of the CIKM, Turin, Italy, 22–26 October 2018; pp. 247–256. [Google Scholar]
- Zhu, Q.; Zhou, X.; Zhang, P.; Shi, Y. A neural translating general hyperplane for knowledge graph embedding. J. Comput. Sci.
**2019**, 30, 108–117. [Google Scholar] [CrossRef] - Huang, Z.; Li, B.; Yin, J. Knowledge Graph Embedding by Learning to Connect Entity with Relation. In Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data; Springer: Berlin/Heidelberg, Germany, 2018; pp. 400–414. [Google Scholar]
- Wang, L.; Lu, X.; Jiang, Z.; Zhang, Z.; Li, R.; Zhao, M.; Chen, D. FRS: A simple knowledge graph embedding model for entity prediction. Math. Biosci. Eng.
**2019**, 16, 7789–7807. [Google Scholar] [CrossRef] - Nguyen, D.Q.; Nguyen, T.D.; Phung, D.Q. A Relational Memory-based Embedding Model for Triple Classification and Search Personalization. arXiv
**2019**, arXiv:1907.06080. [Google Scholar] - Cai, L.; Yan, B.; Mai, G.; Janowicz, K.; Zhu, R. TransGCN: Coupling Transformation Assumptions with Graph Convolutional Networks for Link Prediction. In Proceedings of the K-CAP, Marina Del Rey, CA, USA, 19–21 November 2019; pp. 131–138. [Google Scholar]
- Ye, R.; Li, X.; Fang, Y.; Zang, H.; Wang, M. A Vectorized Relational Graph Convolutional Network for Multi-Relational Network Alignment. In Proceedings of the IJCAI, Macao, China, 10–16 August 2019; pp. 4135–4141. [Google Scholar]
- Vashishth, S.; Sanyal, S.; Nitin, V.; Agrawal, N.; Talukdar, P.P. InteractE: Improving Convolution-Based Knowledge Graph Embeddings by Increasing Feature Interactions; AAAI Press: Palo Alto, CA, USA, 2020; pp. 3009–3016. [Google Scholar]
- Hu, K.; Liu, H.; Zhan, C.; Tang, Y.; Hao, T. A Bi-Directional Relation Aware Network for Link Prediction in Knowledge Graph. In Proceedings of the International Conference on Neural Computing for Advanced Applications, Shenzhen, China, 3–5 July 2020; pp. 259–271. [Google Scholar]
- Hu, K.; Liu, H.; Zhan, C.; Tang, Y.; Hao, T. Learning Knowledge Graph Embedding with a Bi-Directional Relation Encoding Network and a Convolutional Autoencoder Decoding Network; Neural Computing and Applications; Springer: Berlin/Heidelberg, Germany, 2021; pp. 1–17. [Google Scholar]
- Zhang, N.; Deng, S.; Sun, Z.; Chen, J.; Zhang, W.; Chen, H. Relation Adversarial Network for Low Resource Knowledge Graph Completion. In Proceedings of the WWW, Taipei, Taiwan, 20–24 April 2020. [Google Scholar]
- Tian, A.; Zhang, C.; Rang, M.; Yang, X.; Zhan, Z. RA-GCN: Relational Aggregation Graph Convolutional Network for Knowledge Graph Completion. In Proceedings of the ICMLC, Shenzhen, China, 15–17 February 2020; pp. 580–586. [Google Scholar]
- Jiang, W.; Guo, M.; Chen, Y.; Li, Y.; Xu, J.; Lyu, Y.; Zhu, Y. Multi-view Classification Model for Knowledge Graph Completion. In Proceedings of the AACL/IJCNLP, Suzhou, China, 4–7 December 2020. [Google Scholar]
- Zeb, A.; Haq, A.U.; Zhang, D.; Chen, J.; Gong, Z. KGEL: A novel end-to-end embedding learning framework for knowledge graph completion. Expert Syst. Appl.
**2021**, 167, 114164. [Google Scholar] [CrossRef] - Han, Y.; Fang, Q.; Hu, J.; Qian, S.; Xu, C. GAEAT: Graph Auto-Encoder Attention Networks for Knowledge Graph Completion. In Proceedings of the CIKM, Virtual Event, Ireland, 19–23 October 2020; pp. 2053–2056. [Google Scholar]
- Wang, Q.; Ji, Y.; Hao, Y.; Cao, J. GRL: Knowledge graph completion with GAN-based reinforcement learning. Knowl. Based Syst.
**2020**, 209, 106421. [Google Scholar] [CrossRef] - Shi, B.; Weninger, T. ProjE: Embedding Projection for Knowledge Graph Completion; AAAI Press: Palo Alto, CA, USA, 2017; pp. 1236–1242. [Google Scholar]
- Liu, H.; Bai, L.; Ma, X.; Yu, W.; Xu, C. ProjFE: Prediction of fuzzy entity and relation for knowledge graph completion. Appl. Soft Comput.
**2019**, 81, 105525. [Google Scholar] [CrossRef] - Zhang, W.; Li, J.; Chen, H. ProjR: Embedding Structure Diversity for Knowledge Graph Completion. In Proceedings of the NLPCC, Hohhot, China, 26–30 August 2018; pp. 145–157. [Google Scholar]
- Shi, B.; Weninger, T. Open-World Knowledge Graph Completion; AAAI Press: Palo Alto, CA, USA, 2018; pp. 1957–1964. [Google Scholar]
- Fu, C.; Li, Z.; Yang, Q.; Chen, Z.; Fang, J.; Zhao, P.; Xu, J. Multiple Interaction Attention Model for Open-World Knowledge Graph Completion. In International Conference on Web Information Systems Engineering; Springer: Berlin/Heidelberg, Germany, 2019; pp. 630–644. [Google Scholar]
- Nie, B.; Sun, S. Knowledge graph embedding via reasoning over entities, relations, and text. Future Gener. Comput. Syst.
**2019**, 91, 426–433. [Google Scholar] [CrossRef] - Zhu, J.; Zheng, Z.; Yang, M.; Fung, G.P.C.; Tang, Y. A semi-supervised model for knowledge graph embedding. Data Min. Knowl. Discov.
**2020**, 34, 1–20. [Google Scholar] [CrossRef] - Dai, Y.; Wang, S.; Chen, X.; Xu, C.; Guo, W. Generative adversarial networks based on Wasserstein distance for knowledge graph embeddings. Knowl. Based Syst.
**2020**, 190, 105165. [Google Scholar] [CrossRef] - Wang, P.; Han, J.; Li, C.; Pan, R. Logic Attention Based Neighborhood Aggregation for Inductive Knowledge Graph Embedding; AAAI Press: Palo Alto, CA, USA, 2019; pp. 7152–7159. [Google Scholar]
- Qian, W.; Fu, C.; Zhu, Y.; Cai, D.; He, X. Translation Embeddings for Knowledge Graph Completion with Relation Attention Mechanism. In Proceedings of the IJCAI, Stockholm, Sweden, 13–19 July 2018; pp. 4286–4292. [Google Scholar]
- Liu, W.; Cai, H.; Cheng, X.; Xie, S.; Yu, Y.; Zhang, H. Learning High-order Structural and Attribute information by Knowledge Graph Attention Networks for Enhancing Knowledge Graph Embedding. arXiv
**2019**, arXiv:1910.03891. [Google Scholar] - Liu, Y.; Hua, W.; Xin, K.; Zhou, X. Context-Aware Temporal Knowledge Graph Embedding. In Proceedings of the WISE, Hong Kong, China, 26–30 November 2019. [Google Scholar]
- Oh, B.; Seo, S.; Lee, K.-H. Knowledge Graph Completion by Context-Aware Convolutional Learning with Multi-Hop Neighborhoods. In Proceedings of the CIKM, Turin, Italy, 22–26 October 2018; pp. 257–266. [Google Scholar]
- Wu, T.; Khan, A.; Gao, H.; Li, C. Efficiently Embedding Dynamic Knowledge Graphs. arXiv
**2019**, arXiv:1910.06708. [Google Scholar] - Han, X.; Zhang, C.; Ji, Y.; Hu, Z. A Dilated Recurrent Neural Network-Based Model for Graph Embedding. IEEE Access
**2019**, 7, 32085–32092. [Google Scholar] [CrossRef] - Tay, Y.; Luu, A.T.; Phan, M.C.; Hui, S.C. Multi-task Neural Network for Non-discrete Attribute Prediction in Knowledge Graphs. In Proceedings of the CIKM 2017, Singapore, 6–10 November 2017. [Google Scholar]
- Nayyeri, M.; Xu, C.; Lehmann, J.; Yazdi, H.S. LogicENN: A Neural Based Knowledge Graphs Embedding Model with Logical Rules. arXiv
**2019**, arXiv:1908.07141. [Google Scholar] - Zhao, F.; Xu, T.; Jin, L.; Jin, H. Convolutional Network Embedding of Text-enhanced Representation for Knowledge Graph Completion. IEEE Internet Things J.
**2020**. [Google Scholar] [CrossRef] - Wang, H.; Ren, H.; Leskovec, J. Entity Context and Relational Paths for Knowledge Graph Completion. arXiv
**2020**, arXiv:2002.06757. [Google Scholar] - Wang, Y.; Zhang, H. HARP: A Novel Hierarchical Attention Model for Relation Prediction. ACM Trans. Knowl. Discov. Data TKDD
**2021**, 15, 1–22. [Google Scholar] [CrossRef]

Categories | Subcategories | Models
---|---|---
Based on translation distance | TransE and its extensions | TransE [13], TransH [15], TransM [16], TransR [17], TransD [36], TransA [37], TranSparse [38], ManifoldE [39], STransE [40], TransX-FT [41], TransX-DT [42], TransHR [43], CirE [44], GTrans [45], TransCore [46], TransF [47], TransGH [48], AEM [49], EMT [50], TransX-SYM [51], TransMS [52], KGLG [53], TransL [54].
 | Gaussian embedding | KG2E [55], TransG [56].
 | Others | KGE Continual Learning [57], TorusE [58], QuatE [59], RotatE [35], HAKE [60], DeCom [61], MobiusE [62], QuaR [63].
Based on semantic information | No additional information | RESCAL [64], DistMult [20], HolE [65], ComplEx [21], ANALOGY [66], ComplEx-N3 [67], TuckER [68], TriModel [69], CrossE [70], HolEx [71], MEI [72].
 | Fusing additional information | Entity/relation types: TKRL [73], SSE [74], Att-Model+Types [75], ETE [76], TransT [77], Bilinear+TR [78], TransET [79], KGE via weighted score [80], RecKGC [81], RPE [82]. Relation paths: PTransE [83], Att-Model+Types [75], TransP [84], PaSKoGE [85], PTransD [86], DPTransE [87], abstract paths for KGC [88], ELPKG [9], PTranSparse [89], CoKE [90], RPE [82], RW-LMLM [91], KBAT [92], GAATs [93], path-based reasoning [27]. Textual descriptions: DKRL [94], SSP [95], KDCoE [96], CATT [97], textual association [98], open-world extension for KGC [99], EDGE [100], TransW [101]. Logic rules: KALE [102], lppTransX [103], BiTransX+ [104], ELPKG [9], X-lc [105], RUGE [106], TARE [107], logic rule powered KGE [108], SoLE [109], GPFL [110], CBR [111], probabilistic case-based reasoning [112]. Entity attributes: KBLRN [113], TransEA [114], LiteralE [115], MARINE [116], AKRL [117]. Temporal: Time-aware link prediction [118], co-evolution of event and KGs [119], Know-Evolve [120], iTransA [121], HyTE [122], QCHyTE [124], ATiSE [123], TDG2E [125], T-GAP [126], TeMP [127]. Structure: GAKE [128], APP [129], ORC [130], KGC by embedding correlations [131], TCE(2017) [132], TCE(2018) [133], Graph2Seq [134], HRS [135], SA-KGE [136], TransS [137], S2E [138], AKSE [139], AggrE [140]. Constraints: ComplEx+NNE+AER [141], RPE [82], CoRelatE [142]. Negative sampling: TransR-PNC [143], TSLRF [144], TransE-ANS [145]. Fake triples: KGE via fake triples [146], AWML [147]. Order: RKGE [148], RW-LMLM [91]. Concepts: TransC [149], KEC [150], TransFG [151]. Background: SimplE [152], SimplE+ [153].
Based on neural network | No additional information | SME [154], NTN [155], MLP [156], NAM [157], R-GCNs [158], DSKG [159], SENN [160], TBNN [105], NTransGH [161], ConvE [22], ConvKB [23], ConnectER [162], FRS [163], HypER [24], CompGCN [25], TransGate [106], R-MeN [164], TransGCN [165], VR-GCN [166], InteractE [167], KBAT [92], BDRAN [168], BDR+CA [169], wRAN [170], RA-GCN [171], path-based reasoning [72], MultiView [172], KGEL [173], GAEAT [174], GRL [175].
 | Fusing additional information | ProjE [176], ProjFE [177], ProjR [178], SACN [26], CNN-BiLSTM [27], ConMask [179], MIA Model [180], TKGE [181], a semi-supervised model for KGE [182], GAN based on Wasserstein [183], LAN [184], TransAt [185], KANE [186], context-aware temporal KGE [187], CACL [188], DKGE [189], G-DRNN [190], MTKGNN [191], LogicENN [192], TECRL [193], PATHCON [194], HARP [195].

Model | Symmetry | Antisymmetry | Inversion | Composition
---|---|---|---|---
SE | ✕ | ✕ | ✕ | ✕
TransE | ✕ | ✓ | ✓ | ✓
TransX | ✓ | ✓ | ✕ | ✕
DistMult | ✓ | ✕ | ✕ | ✕
ComplEx | ✓ | ✓ | ✓ | ✕
RotatE | ✓ | ✓ | ✓ | ✓
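To make two of these entries concrete, the following sketch (illustrative NumPy code of our own, not taken from any surveyed implementation) checks numerically that the DistMult score is symmetric in the head and tail by construction, and that RotatE composes relations by adding rotation phases in the complex plane:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy embedding dimension

# DistMult: <h, r, t> = sum_i h_i * r_i * t_i is invariant under
# swapping h and t, so every relation it learns is symmetric
# (the "DistMult" row above: symmetry only).
h, r, t = rng.normal(size=(3, d))
assert np.isclose(np.sum(h * r * t), np.sum(t * r * h))

# RotatE: t ~ h o r with |r_i| = 1, a coordinate-wise rotation in
# the complex plane. Two rotations compose into a single rotation
# whose phases add, which is why RotatE supports composition.
p1, p2 = rng.uniform(0.0, 2.0 * np.pi, size=(2, d))
h_c = rng.normal(size=d) + 1j * rng.normal(size=d)
two_hops = (h_c * np.exp(1j * p1)) * np.exp(1j * p2)
one_hop = h_c * np.exp(1j * (p1 + p2))
assert np.allclose(two_hops, one_hop)
```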

Dataset | Entity | Relation | Triple | Train | Valid | Test
---|---|---|---|---|---|---
FB15k | 14,951 | 1345 | 592,213 | 483,142 | 50,000 | 59,071
FB15k-237 | 14,541 | 237 | 310,116 | 272,115 | 17,535 | 20,466
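Both benchmarks are distributed as plain tab-separated triple files. As a minimal sketch, assuming the standard `train.txt`/`valid.txt`/`test.txt` format these datasets ship with, the preprocessing shared by the models compared below reduces to loading the triples and assigning contiguous integer ids to entities and relations:

```python
def load_triples(path):
    # Each line is "head<TAB>relation<TAB>tail".
    triples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            h, r, t = line.rstrip("\n").split("\t")
            triples.append((h, r, t))
    return triples

def build_indices(triples):
    # Map entity and relation strings to contiguous integer ids,
    # as embedding lookup tables expect.
    ent2id, rel2id = {}, {}
    for h, r, t in triples:
        ent2id.setdefault(h, len(ent2id))
        ent2id.setdefault(t, len(ent2id))
        rel2id.setdefault(r, len(rel2id))
    return ent2id, rel2id
```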

Categories | Model | Embedding Space | Additional Information | Scoring Function
---|---|---|---|---
Based on translation distance | RotatE | Complex space | None | $-\|h\circ r-t\|$
 | HAKE | Vector space | None | $\|h_m\circ r_m-t_m\|_2+\lambda\|\sin((h_p+r_p-t_p)/2)\|_1$
Based on semantic information | RW-LMLM | Matrix space | Order information | $\mathrm{softmax}(h_n^i W_h W_E^T)$
 | SimplE | Vector space | Background knowledge | $\frac{1}{2}(\langle h_{e_i},v_r,t_{e_j}\rangle+\langle h_{e_j},v_{r^{-1}},t_{e_i}\rangle)$
 | LiteralE | Vector space | Literal information | $f_X(g(e_i,l_i),g(e_j,l_j),r_k)$
 | Bilinear+TR | Tensor space | Entity types | $x_S^T W_r x_t$
 | ComplEx | Complex space | None | $\mathrm{Re}(\langle h,r,t\rangle)$
 | DistMult | Vector space | None | $\langle h,r,t\rangle$
Based on neural network | SACN | Vector space | Relation types | $f(\mathrm{vec}(\mathbf{M}(e_s,e_r))\mathbf{W})e_o$
 | ConvE | Vector space | None | $f(\mathrm{vec}(f([\overline{e_s};\overline{r_r}]*w))W)e_o$
 | HypER | Vector space | None | $f(\mathrm{vec}(e_1*\mathrm{vec}^{-1}(w_r H))W)e_2$
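For readers who prefer code to notation, here is a small sketch (our own, with toy dimensions; not the reference implementations) of three of the scoring functions above. One caveat: for ComplEx the trilinear product is conventionally taken with the complex conjugate of the tail embedding, which the compact $\langle h,r,t\rangle$ notation in the table leaves implicit.

```python
import numpy as np

def distmult_score(h, r, t):
    # DistMult: <h, r, t> = sum_i h_i * r_i * t_i (real vectors).
    return float(np.sum(h * r * t))

def complex_score(h, r, t):
    # ComplEx: Re(<h, r, conj(t)>) over complex vectors.
    return float(np.real(np.sum(h * r * np.conj(t))))

def rotate_score(h, phase_r, t):
    # RotatE: -||h o r - t||, with r = exp(i * phase_r) of unit modulus.
    return float(-np.linalg.norm(h * np.exp(1j * phase_r) - t))

rng = np.random.default_rng(0)
d = 8
h = rng.normal(size=d) + 1j * rng.normal(size=d)
t = rng.normal(size=d) + 1j * rng.normal(size=d)
phase_r = rng.uniform(0.0, 2.0 * np.pi, size=d)
print(complex_score(h, np.exp(1j * phase_r), t))
print(rotate_score(h, phase_r, t))
```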

Model | FB15k-237 Hits@1 | Hits@3 | Hits@10 | MRR | FB15k Hits@1 | Hits@3 | Hits@10 | MRR
---|---|---|---|---|---|---|---|---
RotatE | 0.2471 | 0.3802 | 0.5370 | 0.3432 | 0.7387 | 0.8240 | 0.8797 | 0.7905
HAKE | 0.2561 | 0.3871 | 0.5488 | 0.3523 | 0.5745 | 0.7614 | 0.8482 | 0.6809
Bilinear+TR-TransE | 0.1288 | 0.1524 | 0.1912 | 0.1503 | 0.1522 | 0.1584 | 0.1695 | 0.1604
Bilinear+TR-bilinear | 0.2370 | 0.3549 | 0.4855 | 0.3246 | 0.4119 | 0.5683 | 0.6586 | 0.5064
DistMult | 0.2129 | 0.3215 | 0.4635 | 0.2953 | 0.3055 | 0.4230 | 0.5149 | 0.3785
ComplEx | 0.2014 | 0.3165 | 0.4526 | 0.2851 | 0.4407 | 0.5366 | 0.6168 | 0.5028
LiteralE+text+DistMult | 0.2192 ↓ | 0.3285 ↓ | 0.4651 ↓ | 0.3013 ↓ | 0.2730 ↓ | 0.3537 ↓ | 0.4431 ↓ | 0.3323 ↓
LiteralE+DistMult | 0.2209 | 0.3279 | 0.4628 | 0.3022 | 0.2917 ↓ | 0.3820 ↓ | 0.4758 ↓ | 0.3555 ↓
LiteralE+ComplEx | 0.1936 ↓ | 0.3029 ↓ | 0.4397 ↓ | 0.2753 ↓ | 0.5138 | 0.6102 | 0.6963 | 0.5777
LiteralE+ConvE | 0.2092 ↓ | 0.3277 ↓ | 0.4675 ↓ | 0.2963 ↓ | 0.6055 ↓ | 0.7014 ↓ | 0.7768 ↓ | 0.6667 ↓
SimplE | 0.0895 | 0.1701 | 0.3170 | 0.1623 | 0.6593 | 0.7758 | 0.8434 | 0.7283
RW-LMLM | 0.2249 | 0.3396 | 0.4859 | 0.3109 | 0.6767 | 0.7947 | 0.8648 | 0.7466
ConvE | 0.2212 | 0.3371 | 0.4787 | 0.3070 | 0.6250 | 0.7254 | 0.7956 | 0.6874
SACN | 0.2522 | 0.3699 | 0.5154 | 0.3400 | - | - | - | -
HypER | 0.2482 | 0.3629 | 0.5080 | 0.3335 | 0.7005 | 0.8167 | 0.8799 | 0.7668
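The figures in this table are ranking metrics. As a reference point, the following minimal sketch (our own; it computes raw ranks and omits the usual filtering of other true triples) shows how MRR and Hits@k are derived from the rank of the gold entity among all candidates:

```python
import numpy as np

def rank_of(scores, gold_index):
    # Rank of the gold entity among all candidates (1 = best);
    # ties are counted pessimistically.
    return int(np.sum(scores >= scores[gold_index]))

def mrr_and_hits(ranks, ks=(1, 3, 10)):
    ranks = np.asarray(ranks, dtype=float)
    mrr = float(np.mean(1.0 / ranks))
    hits = {f"Hits@{k}": float(np.mean(ranks <= k)) for k in ks}
    return mrr, hits

# Toy usage: ranks of the gold tail collected over five test triples.
mrr, hits = mrr_and_hits([1, 3, 12, 2, 7])
print(f"MRR={mrr:.4f}", hits)
```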

Categories | Models | Training Time
---|---|---
Based on translation distance | HAKE | 4.8908 h
 | RotatE | 6.9708 h
Based on semantic information | Bilinear+TR_bilinear | 1.8083 h
 | Bilinear+TR_transE | 1.8935 h
 | SimplE | 4.4111 h
 | LiteralE-ComplEx | 11.1810 h
 | DistMult | 31.5515 h
 | RW-LMLM | 33.6141 h
 | ComplEx | 38.1560 h
 | LiteralE-DistMult | 42.3798 h
 | LiteralE-text-DistMult | 44.9654 h
Based on neural network | HypER | 8.9695 h
 | ConvE | 50.8799 h
 | LiteralE-ConvE | 60.1813 h

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
