Link Prediction for Temporal Heterogeneous Networks Based on the Information Lifecycle

Cao, Jiaping; Li, Jichao; Jiang, Jiang

doi:10.3390/math11163541


by Jiaping Cao, Jichao Li * and Jiang Jiang

College of Systems Engineering, National University of Defense Technology, Changsha 410073, China

* Author to whom correspondence should be addressed.

Mathematics 2023, 11(16), 3541; https://doi.org/10.3390/math11163541

Submission received: 29 July 2023 / Revised: 13 August 2023 / Accepted: 15 August 2023 / Published: 16 August 2023

(This article belongs to the Section E1: Mathematics and Computer Science)


Abstract:

Link prediction for temporal heterogeneous networks is an important task in the field of network science, and it has a wide range of real-world applications. Traditional link prediction methods are mainly based on static homogeneous networks, which do not distinguish between different types of nodes in the real world and do not account for network structure evolution over time. To address these issues, in this paper, we study the link prediction problem in temporal heterogeneous networks and propose a link prediction method for temporal heterogeneous networks (LP-THN) based on the information lifecycle, which is an end-to-end encoder–decoder structure. The information lifecycle accounts for the active, decay and stable states of edges. Specifically, we first introduce the meta-path augmented residual information matrix to preserve the structure evolution mechanism and semantics of heterogeneous information networks (HINs), and use it as input to the encoder to obtain low-dimensional embedding representations of the nodes. The link prediction problem is then treated as a binary classification problem, and the decoder performs the link prediction. Our prediction process accounts for both network structure and semantic changes using meta-path augmented residual information matrix perturbations. Our experiments demonstrate that LP-THN outperforms other baselines in both prediction effectiveness and prediction efficiency.

Keywords: temporal heterogeneous networks; link prediction; information lifecycle; meta-path

MSC: 68T07

1. Introduction

Network science is an interdisciplinary field in which network structure and network behaviour are studied. Studying the topology, information propagation and dynamic evolution of networks in depth not only reveals the intrinsic laws and characteristics of networks but also helps us to understand and explain the complex interactions and phenomena of real-world networks. Currently, digital intelligence technology is rapidly developing, and network science is being studied in further depth. From social media platforms to the internet and from neural networks in organisms to traffic networks, the interactions contained in networks are becoming increasingly complex and variable, and capturing all the interactions existing in complex networks is difficult with the existing technology. Incomplete information is therefore a major challenge in current network science research.

Link prediction provides more comprehensive knowledge of the network structure by predicting potential or future edges in the network. For example, predicting potential relationships in drug–target networks can accelerate the discovery and development of new drugs [1]. Predicting relationships in protein interaction networks can help researchers deeply understand disease development mechanisms, facilitating effective diagnosis and treatment [2]. In addition, the structure of networks in the real world is constantly changing over time, as reflected in the increase or decrease in the number of nodes and edges in the networks. Friend recommendations in social networks [3], co-authorship recommendations in academic collaboration networks [4] and item recommendations in e-commerce networks [5] are all examples of link prediction in the real world.

However, most of the existing link prediction studies have focused on static homogeneous networks. Homogeneous networks treat all nodes as the same type and all edges as the same type, but in the real world, different types of nodes play different roles in the network. For example, the academic cooperation network includes many types of entities, such as authors, papers, conferences, etc. If these entities are modelled as the same type of nodes, and the relationships between different types of entities are modelled as the same type of relationship, the network will lose its practical research value; furthermore, if only the co-authorship relationship between authors is modelled, the research results will ignore the impact of collaborative papers and participation in conferences on the collaboration between authors, and the results will not be generalizable. Static networks do not account for the appearance and disappearance of nodes and edges in the network over time, so they cannot portray the evolution of network structure over time. For example, over time, new user accounts will be registered, old user accounts will be cancelled in social platforms, and new links will be created between users. If the time factor is ignored, it will not be possible to differentiate the evolution of the increase or decrease in users or the increase or decrease in relationships in the platforms mentioned above over time.

To address these issues, this paper proposes a link prediction method for temporal heterogeneous networks (LP-THN) based on an encoder–decoder structure. LP-THN absorbs an encoder–decoder structure that can learn the topological features of the network while dealing with the non-linear features and sparsity of the network. This method uses an encoder to learn the rich semantic information contained in temporal heterogeneous networks and network structure changes over time, generates low-dimensional embedding representations of the nodes based on the global structure, and uses a decoder to perform link prediction. The main contributions of this paper are as follows:

We propose a link prediction framework based on an encoder–decoder for temporal heterogeneous networks, which can achieve more accurate results in practical applications than other frameworks.
We propose an augmented residual information matrix considering meta-paths that incorporates the law of decreasing information lifecycle into meta-paths. This matrix is used as an input to the encoder to enhance the effective extraction of semantic and structural information.
Through numerous experiments, we verify that our method is superior to many existing methods in terms of effectiveness.

The remainder of the article is organized as follows. Section 2 summarizes the existing studies and identifies their weaknesses. Section 3 introduces the notation used in the paper and provides relevant definitions. Section 4 presents the LP-THN model in detail. Section 5 presents the experiments and analyses the experimental results. Section 6 concludes the paper.

2. Related Work

Practical solutions to link prediction problems have become a popular research field for many scholars in network science. Linyuan Lv summarizes the existing heuristic link prediction methods and classifies them as local information-based heuristics, such as AA [6] and RA [7]; path-based heuristics, such as Katz [8] and LHN-II [9]; and random-walk-based heuristics, such as Cos+ [10] and SimRank [11]. The above heuristics are mostly from the perspective of network structure and have high interpretability, but they have low prediction accuracy and are not applicable to large-scale networks. With the rapid development of artificial intelligence technology, link prediction methods represented by network embedding techniques have received wide attention. DeepWalk [12] is a classical shallow neural network embedding method that obtains node sequences by random walks and inputs those node sequences into Word2vec to learn the low-dimensional embedding representations of the nodes. Node2Vec [13] is based on DeepWalk and defines a biased random walk that combines both breadth-first and depth-first approaches to obtain node sequences. M-NMF [14] preserves the topology and the community structure of the network and combines the two through a nonnegative matrix decomposition to obtain node low-dimensional embedding representations. Matrix decomposition and shallow neural network embedding methods have high computational complexity when dealing with large-scale networks. LINE [15], in contrast, maximizes the conditional probability and co-occurrence probability of node neighbours to learn the low-dimensional embedding representations of nodes, allowing it to efficiently handle large-scale networks. The above methods mainly focus on the local structural information in the network, ignoring the rich node attributes as well as the global structural information of the network. 
In recent years, many scholars have proposed methods based on deep neural networks that can combine the rich attribute information of nodes to learn their low-dimensional embedding representations, e.g., graph neural networks (GNNs) [16], GraphSAGE [17], graph convolutional neural networks (GCNs) [18] and graph attention networks (GATs) [19]. However, the above methods are performed on homogeneous networks that do not account for the rich semantic information contained in different nodes and edge types in heterogeneous networks.

To address the above problems, the link prediction problem in heterogeneous networks has received extensive attention [20]. Many scholars extract important semantic information in the network based on meta-paths and embed this information into neural networks to obtain low-dimensional embedding representations of nodes containing rich semantic information. Node representation methods on heterogeneous networks can be classified into three categories: shallow neural networks, such as metapath2vec [21] and HIN2Vec [22], matrix decomposition methods, such as HNE [23], and deep neural networks, such as HetGNN [24] and HERec [25]. However, these methods are based on link prediction for static networks and do not account for the evolution of the network structure over time. Numerous link prediction methods for temporal networks have been proposed. TLPSS [26] accounts for the existence of higher-order structural information in the network through a simplicial complex but does not consider the heterogeneity of nodes and edges. Lasso regression and random forest [27] find that the connection of a node pair depends on the connection of many node pairs in the past over a period, explaining the structure of the model. DynamicTriad [28] learns low-dimensional embedding representations of nodes through triadic closure. In this method, the dynamics refer to adding edges or changing the weight of an edge that represents closeness between two nodes, and the number of nodes does not change over time. The dynamics in CTDN [29] refer to continuous time variation. This method considers the temporal attributes of the edges to obtain a sequence of nodes by random walk. DyLink2Vec [30] directly obtains the embedding of the edges to perform link prediction. TempNodeEmbed [31] splices the same nodes in different time slices by computing orthogonal bases between node matrices in different time slices. 
However, the above link prediction methods for temporal networks are based on homogeneous networks and do not account for the rich semantic information contained in heterogeneous networks. DHNE [32] performs a random walk by constructing a historical–current graph and combining it with meta-paths, but constructing the historical–current graph requires overly high computational complexity. DyHNE [33] captures network structures and semantics by retaining first-order and second-order neighbours based on meta-paths, but this neighbourhood structure cannot capture global structural information.

3. Notations and Definitions

There are different types of nodes or edges that appear or disappear over time in a temporal heterogeneous network. In this paper, temporal heterogeneous networks are defined as follows:

Definition 1.

Temporal Heterogeneous Network. A temporal heterogeneous network can be represented by a sequence of time slices $\{G^{1}, G^{2}, \dots, G^{T}\}$. $G^{t} = \{V^{t}, E^{t}, ϕ, φ\}$ denotes a snapshot of the heterogeneous network under the time slice t, where $V^{t}$ denotes the set of nodes of the heterogeneous network under the time slice t, and $E^{t}$ denotes the set of connected edges of the heterogeneous network under the time slice t. In addition, ϕ and φ are two mapping functions, ϕ: $V^{t} \to A$, φ: $E^{t} \to R$, where A and R denote the sets of types of nodes and edges, respectively. For a heterogeneous network, $|A| + |R| > 2$.

Meta-paths are generated by sequences of nodes or edges of different types, which can contain rich semantic and local structural information. Meta-paths on temporal heterogeneous networks are generally defined as follows:

Definition 2.

Meta-path. Denoted by M, a meta-path is a sequence of node and edge types, $a_{1} \overset{r_{1}}{\to} a_{2} \overset{r_{2}}{\to} a_{3} \overset{r_{3}}{\to} \dots \overset{r_{l - 1}}{\to} a_{l}$, where $r_{i} \in R$, $a_{i} \in A$, and A and R denote the sets of types of nodes and connecting edges, respectively. Clearly, meta-paths can describe the complex relationships between node types $a_{1}$ and $a_{l}$, and a sequence of nodes whose types match meta-path M is an instance of meta-path M. An academic collaboration network consisting of 8 nodes is shown in Figure 1A. This network contains three nodes of author type, three nodes of paper type and two nodes of conference type. The path $a_{3} \to p_{2} \to c_{2} \to p_{1} \to a_{1}$ is an instance of the meta-path $APCPA$, whose semantic information can be expressed as two authors writing articles published in the same conference.
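The notion of a meta-path instance can be illustrated with a minimal sketch. The toy graph below is hypothetical (it is not the network of Figure 1A), node types are assumed to be single letters so that a meta-path can be written as a string such as "APA", and instances are allowed to revisit intermediate nodes.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy network (NOT Figure 1A): node -> type, plus undirected edges.
NODE_TYPE: Dict[str, str] = {
    "a1": "A", "a2": "A", "a3": "A",
    "p1": "P", "p2": "P",
    "c1": "C",
}
EDGES: List[Tuple[str, str]] = [
    ("a1", "p1"), ("a2", "p1"), ("a3", "p2"), ("p1", "c1"), ("p2", "c1"),
]


def build_adjacency(edges: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Turn an undirected edge list into a neighbour lookup."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def metapath_instances(adj: Dict[str, Set[str]],
                       node_type: Dict[str, str],
                       metapath: str) -> List[List[str]]:
    """Enumerate node sequences whose types spell out `metapath`."""
    paths = [[v] for v in sorted(node_type) if node_type[v] == metapath[0]]
    for wanted in metapath[1:]:
        paths = [
            path + [nxt]
            for path in paths
            for nxt in sorted(adj.get(path[-1], ()))
            if node_type[nxt] == wanted
        ]
    # Assumption: an instance must connect two different end nodes.
    return [p for p in paths if p[0] != p[-1]]


adj = build_adjacency(EDGES)
apa = metapath_instances(adj, NODE_TYPE, "APA")
apcpa = metapath_instances(adj, NODE_TYPE, "APCPA")
print(apa)
print(apcpa)
```

In this toy network, a1 and a2 are linked by an APA instance through p1, while a1 and a3 are linked only by an APCPA instance through the shared conference c1.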

Meta-path-based first-order similarity denotes the instance of meta-path M between connected node pairs $(v_{i}, v_{j})$, which can measure the local structural similarity between nodes in a heterogeneous network. Second-order similarity based on meta-paths denotes the instance of meta-path M between a neighbour node $N(v_{i})^{M}$ of node $v_{i}$ and a neighbour node $N(v_{j})^{M}$ of node $v_{j}$, where a neighbour node $N(v_{i})^{M}$ of node $v_{i}$ contains nodes connected to node $v_{i}$ based on meta-path M.

The adjacency matrix describes only the first-order adjacencies present in the network, and the meta-paths alone cannot characterize the network structure over time, so the residual information on the edges in the network is fused with the meta-paths. The meta-path augmented residual information matrix is proposed to portray the temporal structural information in the network:

Definition 3.

Meta-path M augmented residual information matrix. The meta-path M augmented residual information matrix $W^{M} = [w_{ij}^{M}]$, where $w_{ij}^{M} = \mathrm{nASF}^{M}$ denotes the residual information contained in the specified meta-path instance, is calculated as shown in Equation (1).

$$\mathrm{nASF}^{M} = \mathrm{nASF}_{(n_{i}, n_{j})} \cdot \mathrm{nASF}_{(n_{j}, n_{k})} \cdot \dots \cdot \mathrm{nASF}_{(n_{p}, n_{q})} \tag{1}$$

where $M = n_{i} n_{j} n_{k} \dots n_{p} n_{q}$, and $\mathrm{nASF}_{(n_{i}, n_{j})}$ denotes the amount of residual information contained in the edges of the node pair $(n_{i}, n_{j})$.
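Equation (1) is a product over the consecutive edges of one meta-path instance, which can be sketched as follows. The edge values and the helper name `metapath_residual` are illustrative assumptions; how several instances between the same node pair are combined is not specified here and is not modelled.

```python
from math import prod
from typing import Dict, FrozenSet, List

# Hypothetical per-edge residual information values (illustrative numbers only).
EDGE_NASF: Dict[FrozenSet[str], float] = {
    frozenset(("a1", "p1")): 0.9,
    frozenset(("p1", "a2")): 0.5,
}


def metapath_residual(instance: List[str],
                      edge_nasf: Dict[FrozenSet[str], float]) -> float:
    """Multiply the residual information of consecutive edges, as in Equation (1)."""
    return prod(
        edge_nasf[frozenset((u, v))] for u, v in zip(instance, instance[1:])
    )


w_a1_a2 = metapath_residual(["a1", "p1", "a2"], EDGE_NASF)
print(w_a1_a2)
```

Because every factor lies between 0 and the edge's occurrence count, a single weak (old) edge on the instance pulls the whole entry of the matrix down.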

In temporal network link prediction, the network structure continuously evolves over time, and information about the historical structural changes in the network is used to predict the edges in the future network. Temporal network link prediction is typically defined as follows:

Definition 4.

Temporal network link prediction. Temporal network link prediction means predicting the set of edges in the network at time $T + 1$, i.e., $E^{T + 1} = \{e^{t}(u, v) \mid u, v \in V^{t}, t = T + 1\}$, based on the network at the given time slices $[1, T]$. Given the times t and $s \in T$ with $t < s$, if the nodes and edges in $G^{t}$ do not disappear during the period $[t, s]$, then the nodes and edges in $G^{t}$ are all represented in $G^{s}$, and the information on $G^{t}$ is said to be the history of $G^{s}$, as shown in Figure 2. Since temporal network link prediction focuses more on the characteristics of network structure evolution over time and ignores the heterogeneity of node and edge types in the network, this paper only gives a more general definition. For the notion of temporal heterogeneous network link prediction, it is sufficient to ensure that the number of node or edge types in this definition is not 1.

In this paper, link prediction is defined as an end-to-end encoder–decoder process, where a stochastic gradient is generated in each round of training, the prediction error is computed through a loss function, and the parameters of the model are updated in each round. The encoder primarily obtains low-dimensional embedding representations of the nodes, which are defined as follows:

Definition 5.

Embedding Representation. An embedding representation is the mapping of each node $v \in V$ in a graph into a common low-dimensional vector space through the function $f(\cdot)$, i.e., $f(v) = z_{v}$, where $z_{v} \in R^{d}$, and d denotes the dimension of the vector $z_{v}$. In LP-THN, this mapping is performed by the encoder.

4. The LP-THN Model

In this section, a series of time slices is aggregated to form a network containing the age and number of occurrences of each edge. Then, the semantic and structural time-varying information in this network is extracted based on meta-paths and an adjusted sigmoid function [26], which are used as inputs to the encoder to obtain low-dimensional embedding representations of nodes containing semantic and temporal information. Finally, the learned node embedding representations are used as input for the decoder.

4.1. Basic Idea

The problem investigated in this paper is whether the edges in the snapshot of time slice $t_{n + 1}$ can be predicted from the information in the series of network snapshots over the time slices $[t_{1}, t_{n}]$. An academic collaboration network is taken as an example in Figure 3. First, the network snapshots over $[t_{1}, t_{n}]$ are aggregated into a single-layer network $G(V, E, ϕ, φ)$, where V denotes the set of nodes in G, and E denotes the set of edges in the aggregated network. The aggregation process is described as follows: If a node $v_{i}$ or a connecting edge $e_{ij}$ appears in the network snapshot of any historical time slice $t_{z}$ before the time slice $t_{n + 1}$, the node or the edge will always exist in the network snapshots of the subsequent time slices, where $v_{i} \in V^{t}$, $e_{ij} \in E^{t}$, $t = t_{z}, t_{z + 1}, \dots, t_{n}$. If a node $v_{i}$ or an edge $e_{ij}$ that has existed since the time slice $t_{z - m}$ disappears at the moment $t_{z}$, the disappeared node $v_{i}$ or edge $e_{ij}$ will not be present in the network snapshots of this time slice and the subsequent time slices, where $v_{i} \in V^{t_{1}}$, $v_{i} \notin V^{t_{2}}$, $e_{ij} \in E^{t_{1}}$, $e_{ij} \notin E^{t_{2}}$, $t_{1} = t_{z - m}, \dots, t_{z - 1}$, $t_{2} = t_{z}, t_{z + 1}, \dots, t_{n}$. The weight of edge $e_{ij}$ in the aggregated network is the residual information $\mathrm{nASF}_{e_{ij}}$, which portrays the lifecycle of the information: the residual information decreases from the generation of edge $e_{ij}$ to the current moment. This residual information does not decrease linearly; it first decreases gently, then decreases dramatically, and finally levels off at a stable value. The formula is shown in Equation (2):

$$\mathrm{nASF}_{e_{ij}}(a_{ij}, t_{ij}) = t_{ij} \cdot \frac{\frac{1}{1 + \exp\{a_{ij}/p - a\}} + q}{q + 1} \tag{2}$$

where $a_{ij}$ denotes the time span from the time slice $t_{z}$ in which the edge was generated to $t_{n}$, and $a_{ij} = n - z + 1$. In particular, if an edge $e_{ij}$ generated at time $t_{s}$ reappears at time $t_{z}$ $(s < z)$, then $a_{ij} = n - z + 1$, so the age is measured from the most recent occurrence. $t_{ij}$ is the number of times edge $e_{ij}$ occurs. This parameter is set to distinguish between the two cases shown in Figure 1B. When author A and author B cooperate on five consecutive time slices $[t_{1}, t_{5}]$ and author A and author C first cooperate at time slice $t_{5}$, the value of the parameter $a_{ij}$ between author A and author C as well as between author A and author B is 1. Therefore, if only the parameter $a_{ij}$ is considered, the two cases shown in Figure 1B cannot be differentiated, while the parameter $t_{ij}$ can differentiate between them.
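The age and occurrence count of each edge, and the resulting residual information of Equation (2), can be sketched as below. This is only an illustration under stated assumptions: each snapshot is taken to be the set of edges occurring in that time slice, the disappearance of nodes or edges is not modelled, the argument `shift` stands for the unindexed parameter a in the exponent, and the values chosen for p, `shift` and q are arbitrary.

```python
import math
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = FrozenSet[str]


def nasf(age: int, count: int, p: float, shift: float, q: float) -> float:
    """Residual information of one edge, following Equation (2).

    age   -> a_ij, count -> t_ij, shift -> the unindexed parameter a.
    """
    sigmoid = 1.0 / (1.0 + math.exp(age / p - shift))
    return count * (sigmoid + q) / (q + 1.0)


def edge_age_and_count(snapshots: List[Set[Edge]]) -> Dict[Edge, Tuple[int, int]]:
    """Return {edge: (a_ij, t_ij)} with a_ij = n - z + 1 for the latest slice z."""
    n = len(snapshots)
    stats: Dict[Edge, Tuple[int, int]] = {}
    for z, edges in enumerate(snapshots, start=1):
        for edge in edges:
            _, count = stats.get(edge, (0, 0))
            stats[edge] = (n - z + 1, count + 1)  # later slices overwrite the age
    return stats


# The two cases of Figure 1B: A-B cooperate in t1..t5, A-C only in t5.
ab, ac = frozenset(("A", "B")), frozenset(("A", "C"))
snapshots = [{ab}, {ab}, {ab}, {ab}, {ab, ac}]
stats = edge_age_and_count(snapshots)

P, SHIFT, Q = 1.0, 3.0, 0.1  # illustrative values, not taken from the paper
weights = {e: nasf(age, cnt, P, SHIFT, Q) for e, (age, cnt) in stats.items()}
print(stats[ab], stats[ac])
print(weights[ab], weights[ac])
```

Both edges have age 1, so they differ only through the occurrence count, and the weight of A-B is five times that of A-C. For large ages the sigmoid term vanishes and the value per occurrence approaches q / (q + 1), which corresponds to the stable state of the lifecycle.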

After obtaining the aggregated network G, the meta-path instances are extracted according to the determined meta-paths. Taking the academic collaboration network in Figure 1A as an example, the instances of meta-path $APA$ and meta-path $APCPA$ are extracted. The meta-path $APA$ indicates that the two authors coauthored a paper, such as $a_{1} p_{1} a_{2}$ in Figure 1A, and the meta-path $APCPA$ indicates that the papers authored by the two authors were published in the same conference, such as $a_{1} p_{1} c_{2} p_{2} a_{3}$ in Figure 1A. The meta-path $APA$ and meta-path $APCPA$ augmented residual information matrices $W^{APA}$ and $W^{APCPA}$ are calculated by Equations (1) and (2). Inputting $W^{APA}$ and $W^{APCPA}$ into the modified self-attention model, which can be used to calculate the correlation between any two nodes in the graph, yields a relationship-enhanced meta-path M augmented residual information matrix $\hat{W^{M}}$. $\hat{W^{APA}}$ and $\hat{W^{APCPA}}$ are input into the modified GCN model. Guided by the modified self-attention model, the modified GCN can change the degree of aggregation between different nodes to obtain a meta-path augmented residual information matrix $H^{M}$ after aggregating the local information based on the meta-path.

Next, the information in $H^{APA}$ and $H^{APCPA}$ is aggregated through the attention mechanism. The adaptive weights $w_{M_{1}}$ and $w_{M_{2}}$ are set, with randomly generated initial values, and the augmented residual information matrix H is calculated according to Equation (3) to obtain each node embedding representation. This completes the encoder of LP-THN. The decoder inputs the augmented residual information matrix H into the modified GAT model, which can account for the variability of the influence of different neighbourhood nodes on the core node, to obtain the augmented residual information matrix fusing semantic and temporal information. Finally, a multilayer perceptron is used for link prediction.

$$H = w_{M_{1}} H^{M_{1}} + w_{M_{2}} H^{M_{2}} + \dots + w_{M_{n}} H^{M_{n}} \tag{3}$$
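The fusion in Equation (3) is a weighted elementwise sum of the per-meta-path matrices. The sketch below uses fixed illustrative weights; in the model the weights are adaptive and learned, which is not reproduced here, and the function name `fuse` is hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def fuse(matrices: List[Matrix], weights: List[float]) -> Matrix:
    """Weighted elementwise sum of equally shaped matrices, as in Equation (3)."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(cols)]
        for i in range(rows)
    ]


# Toy stand-ins for H^{APA} and H^{APCPA} (illustrative numbers only).
h_apa = [[1.0, 0.0], [0.0, 1.0]]
h_apcpa = [[0.0, 2.0], [2.0, 0.0]]
h = fuse([h_apa, h_apcpa], weights=[0.5, 0.25])
print(h)
```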

4.2. Encoder

4.2.1. Modified Self-Attention Mechanism

Self-attention mechanisms are widely used in natural language processing because of their ability to capture the internal correlation of data. Applied to graphs, they convert the influence of other nodes on the core node into the degree of attention that the core node pays to those nodes [34]. The self-attention mechanism based on meta-paths and temporal structural features is implemented in three main steps. In the first step, the query, key and value of each node are calculated based on the meta-path M, as shown in Equations (4)–(6); in the second step, the relevance of each node to the current node is calculated and normalized, and the result is used as the weight of each node with respect to the core node, as shown in Equation (7); and in the third step, the values of all nodes are summed with these weights to obtain a representation of each node that reflects the importance of every other node to the core node, also as shown in Equation (7).

$$K^{M} = W_{K}^{M} \cdot W^{M} \tag{4}$$

$$Q^{M} = W_{Q}^{M} \cdot W^{M} \tag{5}$$

$$V^{M} = W_{V}^{M} \cdot W^{M} \tag{6}$$

where the meta-path M augmented information matrix $W^{M} \in R^{n \times n}$ and $n = |V_{R_{i}}|$; $V_{R_{i}}$ denotes the set of nodes in the aggregated network G with node type $R_{i}$. $W_{K}^{M} \in R^{d \times n}$, $W_{Q}^{M} \in R^{d \times n}$ and $W_{V}^{M} \in R^{d \times n}$ are generated by random initialization, and d is the node embedding representation dimension. To improve the computational speed, matrix operations are used. $K^{M} \in R^{d \times n}$, $Q^{M} \in R^{d \times n}$ and $V^{M} \in R^{d \times n}$ denote the key, query and value matrices of graph G based on meta-path M.

$$\hat{W^{M}} = \mathrm{softmax}\left(\frac{Q^{M} \cdot {K^{M}}^{T}}{\sqrt{d}}\right) V^{M} \tag{7}$$

where $Q^{M} \cdot {K^{M}}^{T} = [\mathrm{score}_{ij}]$, $\mathrm{score}_{ij}$ denotes node $n_{i}$’s attention to node $n_{j}$, d is the dimension of the node’s embedding representation, and the division by $\sqrt{d}$ stabilizes the gradient. To calculate the degree of influence of each node on the core node, the $\mathrm{softmax}(\cdot)$ function is used for normalization, and the normalized result is used as the weight applied to each node’s value when computing the core node’s representation.
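The three steps of the self-attention mechanism can be sketched in plain Python. One assumption should be stated loudly: the sketch uses the row convention, in which the projection matrices have shape n × d and are applied on the right of $W^{M}$, so that the score matrix is n × n and each row holds one node's attention to every node. This is a transposed layout relative to the shapes written for Equations (4)–(6), and the helper names and random toy inputs are hypothetical.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def softmax_rows(m: Matrix) -> Matrix:
    out = []
    for row in m:
        top = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(x - top) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def self_attention(w_m: Matrix, w_q: Matrix, w_k: Matrix,
                   w_v: Matrix) -> Tuple[Matrix, Matrix]:
    """Scaled dot-product attention over the rows of w_m (row convention)."""
    d = len(w_q[0])
    q, k, v = matmul(w_m, w_q), matmul(w_m, w_k), matmul(w_m, w_v)  # step 1
    scores = matmul(q, transpose(k))                                # n x n
    attn = softmax_rows([[s / math.sqrt(d) for s in row] for row in scores])  # step 2
    return attn, matmul(attn, v)                                    # step 3


random.seed(0)
n, d = 3, 2
w_m = [[0.0, 0.9, 0.2], [0.9, 0.0, 0.5], [0.2, 0.5, 0.0]]  # toy W^M
w_q, w_k, w_v = (
    [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)] for _ in range(3)
)
attn, enhanced = self_attention(w_m, w_q, w_k, w_v)
print(attn)
print(enhanced)
```

Each row of `attn` sums to 1, so every row of `enhanced` is a convex combination of the value vectors of all nodes.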

The meta-path M augmented information matrix in this paper is based on the meta-path set to reconstruct the network structure, which accounts for the rich semantic information in the heterogeneous network. Simultaneously, the elements in the matrix account for the rich temporal information contained in each edge. Therefore, the matrix is used as an input to the self-attention mechanism that considers the amount of information. In contrast to the original self-attention mechanism, the modified self-attention model not only enables the learned results to consider the magnitude of the relevance of different nodes to that node but also accounts for the temporal attributes of the dynamic heterogeneous graph. Since the relationships between nodes are generated based on defined meta-paths, this self-attention mechanism can enhance the main structural features in the heterogeneous graph and weaken the unimportant structural features, ensuring that the learned node embedding representations contain specific semantic information.

4.2.2. Modified GCN

As opposed to a GNN, a GCN is essentially a message passing process that can aggregate information about a node’s neighbours based on their importance to the core node and update the node features [18]. The graph convolutional neural network based on meta-path and temporal attributes is implemented in two steps. In the first step, node neighbourhood information is aggregated, and in the second step, the node feature is updated as shown in Equation (8).

$$X^{(l)} = σ\left(D^{-1} \hat{A^{M}} X^{(l - 1)} W^{(l - 1)}\right) \tag{8}$$

where $X^{(0)} = \hat{W^{M}}$, and $\hat{W^{M}}$ denotes the relationship-enhanced meta-path M augmented information matrix; $\hat{A^{M}} = A^{M} + I^{M}$, where $\hat{A^{M}} = [\hat{a}_{ij}]$, $A^{M} = [a_{ij}]$, $\hat{A^{M}} \in R^{n \times n}$, $A^{M} \in R^{n \times n}$, $n = |V_{R_{i}}|$, and $\hat{A^{M}}$ is derived from the adjacency matrix $A^{M}$ of the aggregated graph G based on the meta-path M. If node $n_{i}$ and node $n_{j}$ are linked based on meta-path M, then $a_{ij} = 1$; otherwise, $a_{ij} = 0$. Because a node does not form an instance of the meta-path M with itself, $a_{ii} = 0$, and the node’s own information cannot be aggregated when the node feature is updated. To avoid this situation, the identity matrix $I^{M} \in R^{n \times n}$ is added so that $\hat{a}_{ii} = 1$. Therefore, the node feature update can consider the node’s own features. $D = [d_{ii}]$, where $d_{ii}$ is the degree of node $n_{i}$; multiplying by $D^{-1}$ takes a weighted average of the feature vectors of the neighbours of node $n_{i}$, because low-degree nodes are assumed to have a greater impact on their neighbours; and $σ(\cdot)$ represents an arbitrary activation function. In this paper, $\mathrm{ReLU}$ is selected as the activation function, and l denotes the layer of the modified GCN. If only one layer is used, each node feature update considers only the features of its neighbouring nodes. If too many layers are used, the receptive field of each node feature update becomes too large, and irrelevant nodes affect the node, degrading performance. The output of the last layer $l_{N}$ is $X^{(l_{N})} = H^{M}$, where $H^{M}$ is the meta-path augmented information matrix after aggregation of the local information based on the meta-path, and $W^{(l - 1)}$ is a randomly initialized weight matrix.
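One layer of the update in Equation (8) can be sketched as follows. The degree matrix is computed here from the adjacency matrix with self-connections, which is an assumption made so that no degree is zero; the toy adjacency, features and weight matrix are illustrative, and the function name `gcn_layer` is hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj: Matrix, x: Matrix, w: Matrix) -> Matrix:
    """Compute ReLU(D^-1 (A + I) X W) for one layer, following Equation (8)."""
    n = len(adj)
    # Add self-connections so that each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Assumption: degrees are the row sums of A + I.
    deg = [sum(row) for row in a_hat]
    averaged = [[a_hat[i][j] / deg[i] for j in range(n)] for i in range(n)]
    z = matmul(matmul(averaged, x), w)
    return [[max(0.0, value) for value in row] for row in z]


# Toy meta-path adjacency: node 1 is linked to nodes 0 and 2.
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
x0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # stands in for the enhanced matrix
w0 = [[1.0, 0.0], [0.0, 1.0]]               # identity weights for readability
x1 = gcn_layer(adj, x0, w0)
print(x1)
```

With identity weights, each output row is simply the average of the node's own features and those of its meta-path neighbours, which makes the effect of the normalization by $D^{-1}$ easy to read off.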

The structural information contained in the relationship-enhanced meta-path M augmented information matrix $\hat{W^{M}}$ accounts for the neighbourhood information generated based on the meta-path M, the relevance of each node to the core node, and the decreasing amount of information carried by edges over time. Because of this, it is used as an input to the graph convolutional neural network based on meta-paths and temporal attributes, which guides the nodes to rely on the meta-path-based neighbourhood information when they aggregate the neighbourhood information and accounts for the influence of each node on the core node. Meanwhile, the GCN based on meta-paths and temporal attributes accounts for the change in information carried on each edge over time when updating the node features, so that the node features include temporal attributes.

4.2.3. Modified GAT

GATs can achieve better results in temporal network tasks than GCNs since the training phase is performed only on subgraphs, while the testing phase must handle unknown vertices [19]. GATs based on meta-paths and temporal attributes are implemented in two main steps. The first step is learning to compute the attention coefficients to obtain the correlation between vertices, as shown in Equation (9); the second step is performing the node feature update as shown in Equation (11).

$$s_{e_{ij}} = a\left([W h_{i} \,\|\, W h_{j}]\right), \quad j \in N(v_{i}) \tag{9}$$

where $h_{i}$ represents the ith row of the augmented information matrix H after enhancing the local information, which is the feature of node $n_{i}$; $W \in R^{d \times n}$ represents the shared parameter, which can enhance the node features and is part of the feature enhancement method; $N(v_{i})$ represents the set of neighbour nodes of node $v_{i}$; $[\cdot \,\|\, \cdot]$ concatenates the enhanced node features; and $a(\cdot)$ maps the concatenated high-dimensional features to a real number. $\mathrm{softmax}(\cdot)$ is used to normalize the similarity coefficients of the neighbouring nodes of the central node to obtain the attention coefficient of each neighbouring node when the node is aggregated. $σ(\cdot)$ is the activation function; in this paper, the $\mathrm{LeakyReLU}$ activation function is used, as shown in Equation (10).

$$α_{ij} = \frac{\exp(σ(s_{e_{ij}}))}{\sum_{k \in N(v_{i})} \exp(σ(s_{e_{ik}}))} \tag{10}$$

The weighted summation of the features of each neighbouring node is performed using the attention coefficients $α_{ij}$, as shown in Equation (11).

$$\hat{h_{i}} = σ\left(\sum_{j \in N(v_{i})} α_{ij} h_{j}\right) \tag{11}$$

Since the neighbourhood information in the local-information-enhanced augmented information matrix H is generated based on multiple specific meta-paths, it is used as an input to the graph attention network based on meta-paths and temporal attributes, and the updated node features can contain specific semantic information. In addition, the augmented local information in the augmented information matrix H contains information that decreases over time, thus reflecting the dynamically changing characteristics of different meta-paths.
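Equations (9)–(11) can be sketched for a small graph as follows. Several choices are assumptions made for illustration: the scoring function a(·) is taken to be a dot product with a weight vector `a_vec`, as in the standard GAT formulation; the LeakyReLU slope of 0.2 is a common default rather than a value from this paper; every node is assumed to have at least one neighbour; and the aggregation uses the features $h_{j}$ directly, as written in Equation (11).

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def gat_update(h: Matrix, neighbours: List[List[int]], w: Matrix,
               a_vec: List[float]) -> Tuple[List[List[float]], Matrix]:
    """Return attention coefficients and updated features (Equations (9)-(11))."""
    # W h_i for every node; each row of w produces one output feature.
    wh = [[sum(wr[k] * hi[k] for k in range(len(hi))) for wr in w] for hi in h]
    alphas, new_h = [], []
    for i, nbrs in enumerate(neighbours):
        # Equation (9): score the concatenation [W h_i || W h_j].
        scores = [sum(a * x for a, x in zip(a_vec, wh[i] + wh[j])) for j in nbrs]
        # Equation (10): softmax over the neighbours of node i.
        exps = [math.exp(leaky_relu(s)) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        # Equation (11): attention-weighted sum of neighbour features.
        agg = [sum(al * h[j][k] for al, j in zip(alpha, nbrs)) for k in range(len(h[i]))]
        alphas.append(alpha)
        new_h.append([leaky_relu(v) for v in agg])
    return alphas, new_h


h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # toy rows of H
neighbours = [[1, 2], [0], [0, 1]]           # toy neighbour lists N(v_i)
w = [[1.0, 0.0], [0.0, 1.0]]                 # identity shared parameter
a_vec = [0.3, -0.1, 0.5, 0.2]                # illustrative scoring vector
alphas, new_h = gat_update(h, neighbours, w, a_vec)
print(alphas)
print(new_h)
```

Setting `a_vec` to all zeros makes every score equal, so the attention coefficients become uniform and the update reduces to a plain average over neighbours, which is a convenient sanity check.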

4.3. Decoder

In this paper, the link prediction problem is considered a binary classification problem. The information in the first n snapshots is used to predict whether there is an edge between two nodes in the $(n + 1)$th snapshot. Two possible labels are given to the edges. If there is a link between two nodes $(n_{i}, n_{j})$, it is considered to be a positive sample, and its label is 1; otherwise, the label is 0. The existing edges in the $(n + 1)$th time slice are taken as positive samples. Random sampling is used to obtain the same number of negative samples as positive samples, where the negative samples are the edges that do not exist in the $(n + 1)$th snapshot. The multilayer perceptron $\mathrm{MLP}(\cdot)$ is used to predict the existence of connected edges between two nodes in the future network, as shown in Equation (12).

$$\hat{y} = \mathrm{MLP}(\hat{H}) \tag{12}$$

The information loss during model iteration is calculated using the binary cross-entropy loss function, as shown in Equation (13).

$$loss = -\frac{1}{n} \sum [y \ln \hat{y} + (1 - y) \ln (1 - \hat{y})] \tag{13}$$

where y is the true value and $\hat{y}$ is the predicted value.
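The balanced sampling and the loss in Equation (13) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the helper names `sample_negative_edges` and `binary_cross_entropy` are hypothetical, edges are treated as undirected pairs without self-loops, and predictions are clipped by a small `eps` to avoid taking the logarithm of zero.

```python
import numpy as np

def sample_negative_edges(num_nodes, positive_edges, rng):
    """Draw as many non-existing node pairs as there are positive edges."""
    positives = {(min(i, j), max(i, j)) for i, j in positive_edges}
    max_pairs = num_nodes * (num_nodes - 1) // 2
    if 2 * len(positives) > max_pairs:
        raise ValueError("not enough non-edges to balance the positives")
    negatives = set()
    while len(negatives) < len(positives):
        i, j = (int(x) for x in rng.integers(0, num_nodes, size=2))
        pair = (min(i, j), max(i, j))
        if i != j and pair not in positives:
            negatives.add(pair)
    return sorted(negatives)

def binary_cross_entropy(y, y_hat, eps=1e-12):
    """Mean binary cross-entropy, as in Equation (13)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.clip(np.asarray(y_hat, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat)))

# Hypothetical toy data: 5 nodes and 3 positive edges in the target snapshot.
rng = np.random.default_rng(0)
positive_edges = [(0, 1), (1, 2), (2, 3)]
negative_edges = sample_negative_edges(5, positive_edges, rng)
# An uninformative classifier that always predicts 0.5 has loss ln 2.
loss = binary_cross_entropy([1, 0], [0.5, 0.5])
```

Rejection sampling is adequate here because real networks are sparse, so a randomly drawn pair is rarely an existing edge.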

5. Experiments

In this section, we describe comprehensive experiments that illustrate the superiority of the LP-THN model over other models in terms of prediction results. First, LP-THN is compared with other low-dimensional embedding representation methods. Second, the optimal value of each parameter of LP-THN is determined through parameter sensitivity analysis.

5.1. Datasets and Settings

5.1.1. Datasets

We evaluated the performance of LP-THN on three temporal network datasets: one for academic collaboration networks, one for music recommendation networks, and one for film recommendation networks. The specific statistics of these three datasets are shown in Table 1.

AMiner [35] is a scientific research network that contains three types of nodes. We used articles from this network published in five research areas in 9 time slices from 1990 to 1997. We considered two meta-paths: $APA$, which denotes author collaborations, and $APCPA$, which denotes author participation in the same conference.
Last.FM [36] is an online music platform. The network contains three types of nodes, and the dataset we used contains partial information generated on the platform during the years 1956, 1947, 1979 and 2005–2010, divided into 5 time slices. We considered two meta-paths: $UAU$, which indicates that two users listened to the same artist, and $UATAU$, which indicates that two users listened to artists with the same musical style.
MovieLens [37] is a noncommercial film recommendation platform. The network contains three types of nodes, and the dataset we used contains a portion of the information generated on the site between 1996 and 2018, divided into 8 time slices. We considered two meta-paths: $UMU$, which indicates that two users have rated the same film, and $UMGMU$, which indicates that two users have rated films of the same genre.

5.1.2. Baselines and Evaluation Metrics

To validate the performance of LP-THN, it is compared with existing low-dimensional network embedding representation methods. These methods include five node embedding methods based on shallow neural networks (i.e., DeepWalk, Node2Vec, LINE, Struc2Vec and Metapath2vec), one based on matrix decomposition (i.e., M-NMF) and one based on a deep neural network (i.e., Deeplink). All of these methods except Metapath2vec are designed for static homogeneous networks, whereas Metapath2vec is designed for static heterogeneous networks. XGBoost is used as the classifier for link prediction with the baselines.

DeepWalk [12]: node sequences are obtained by a random walk, and the node sequences are then input into Word2vec to obtain low-dimensional embedded representations of the nodes.
Node2Vec [13]: node sequences are obtained by biased random walk, and the node sequences are used as input to Word2vec to obtain low-dimensional embedded representations of the nodes.
LINE [15]: node embeddings are learned by maximizing the similarity between a node and its first-order and second-order neighbours.
M-NMF [14]: based on nonnegative matrix factorization, which captures the community structure in the graph as well as the similarity between nodes.
Deeplink [38]: a deep learning method for node embedding representation that uses a deep convolutional neural network to learn node embedding representation.
Struc2Vec [39]: the context sequence of a node is constructed by traversing the depth-first search path of each node, and the sequence is then fed into Word2vec to obtain the node’s low-dimensional embedding representation.
Metapath2vec [21]: node sequences are obtained by a random meta-path-based walk, and the sequences are fed into Word2vec to obtain a representation of node embeddings in heterogeneous networks.

In this paper, three common evaluation metrics, AUC, F1 and ACC, are chosen to evaluate the performance of these methods. The AUC can be interpreted as the probability that the similarity value of a randomly selected existing edge is larger than the similarity value of a randomly selected non-existing edge. F1 is the harmonic mean of the model's precision and recall. The ACC is the ratio of the number of correctly predicted samples to the total number of predicted samples. The larger the values of the AUC, F1 and ACC are, the better the model performs.
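The three metrics can be computed directly from their definitions, as in the short sketch below. The function names, the toy scores and the decision threshold of 0.5 are assumptions chosen for illustration; ties in the AUC comparison are counted as one half, which is a common convention.

```python
def auc(pos_scores, neg_scores):
    """Probability that a random existing edge outscores a random non-edge."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))

def f1_and_acc(y_true, y_pred):
    """F1 (harmonic mean of precision and recall) and accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    acc = (tp + tn) / len(pairs)
    return f1, acc

# Hypothetical scores for three existing and three non-existing edges.
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.2]
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1 if s >= 0.5 else 0 for s in pos_scores + neg_scores]

auc_value = auc(pos_scores, neg_scores)   # 8 of 9 pairs are ranked correctly
f1_value, acc_value = f1_and_acc(y_true, y_pred)
```

Unlike F1 and ACC, the AUC does not depend on the threshold, which is why it is often reported alongside them for balanced positive and negative samples.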

5.1.3. Experimental Settings

To ensure the comparison results are fair, the node dimension of the output of all methods is set to 32, and $a_{ij} = 5$, $p = 2$, and $q = 1$. The initial values of the adaptive weights $w_{M_{1}}$ and $w_{M_{2}}$ are set to 0.45 and 0.55, respectively. For the comparison methods that require random walks, the number of walks per node is set to 80, the walk length is set to 10, and the window size is set to 5. The types of nodes and edges in the network are ignored when performing the homogeneous network embedding method.
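For the walk-based baselines, the settings above translate into a corpus of node sequences that is then passed to Word2vec. The following sketch generates uniform (DeepWalk-style) walks with 80 walks per node and a walk length of 10. The function name, the adjacency-list input and the convention that walk length counts nodes rather than steps are assumptions; the window size of 5 applies to the subsequent Word2vec step, which is not shown.

```python
import random

def uniform_random_walks(adjacency, num_walks=80, walk_length=10, seed=0):
    """Generate uniform random walks from every node of a graph.

    adjacency : dict mapping each node to a list of its neighbours
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a fresh order each pass
        for start in nodes:
            walk = [start]
            # Stop early only if the current node has no neighbours.
            while len(walk) < walk_length and adjacency[walk[-1]]:
                walk.append(rng.choice(adjacency[walk[-1]]))
            walks.append(walk)
    return walks

# Hypothetical toy graph with 4 nodes; node and edge types are ignored.
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walks = uniform_random_walks(adjacency)
```

Node2Vec differs only in how the next node is chosen (a biased rather than uniform choice), and Metapath2vec additionally restricts each step to the node type prescribed by the meta-path.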

5.2. Analysis of the Experimental Results

5.2.1. Comparison Experiments

In this paper, the information on the time slices $[t_{1}, t_{n - 1}]$ of each dataset is used to predict the links in the time slice $t_{n}$. For the AMiner dataset, the links in time slice $t_{9}$ are predicted using the information on time slices $[t_{1}, t_{8}]$; for the Last.FM dataset, the links in time slice $t_{5}$ are predicted using the information on time slices $[t_{1}, t_{4}]$; and for the MovieLens dataset, the links in time slice $t_{8}$ are predicted using the information on time slices $[t_{1}, t_{7}]$. The node pairs with links present on time slice $t_{n}$ are taken as positive samples, and the same number of node pairs without links on time slice $t_{n}$ are randomly selected as negative samples. LP-THN-1st and LP-THN-2nd denote that the LP-THN model considers only the first-order similarity meta-paths and only the second-order similarity meta-paths, respectively. The results in Table 2 support several observations: (a) Our proposed LP-THN model achieves the highest AUC and F1 on all three datasets, and the best ACC in every case is obtained by LP-THN or one of its variants. This is because our model considers the dynamics of the network structure over time and retains the semantic information contained in the network. (b) The Metapath2vec method generally matches or outperforms the other embedding representations based on homogeneous networks, most clearly on MovieLens, because Metapath2vec accounts for the semantic information contained in the network. However, it does not perform as well as the LP-THN model because it does not account for the temporal information contained in the network. (c) The results of LP-THN-1st are generally better than the results of LP-THN-2nd. The node relationships extracted from first-order similarity meta-paths are closer than those extracted from second-order similarity meta-paths and are therefore more informative for link prediction.
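The evaluation protocol described above, in which all but the last time slice form the history and the last slice supplies the positive samples, can be sketched as follows. The function name `temporal_split`, the edge-list representation of snapshots and the per-edge occurrence count are assumptions for illustration; the actual aggregation used by LP-THN is the meta-path augmented residual information matrix defined earlier in the paper.

```python
from collections import Counter

def temporal_split(snapshots):
    """Split time slices [t_1, ..., t_n] into history and prediction target.

    snapshots : list of edge lists, one per time slice, oldest first
    Returns the number of historical slices in which each edge occurs and
    the set of edges in t_n, which serve as positive samples.
    """
    if len(snapshots) < 2:
        raise ValueError("need at least two time slices")
    history, target = snapshots[:-1], snapshots[-1]
    times = Counter(edge for snap in history for edge in set(snap))
    return times, set(target)

# Hypothetical toy network with three time slices.
snapshots = [
    [("a", "b")],
    [("a", "b"), ("b", "c")],
    [("b", "c"), ("a", "c")],
]
times, positives = temporal_split(snapshots)
```

For AMiner this corresponds to passing nine edge lists and predicting the ninth from the first eight; negative samples are then drawn from node pairs absent from the last slice.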

5.2.2. Parameter Sensitivity Analysis

First, the effect of using different values of parameter p and parameter q in Equation (2) is investigated. The literature [26] demonstrates that a larger parameter q in $nASF^{M}$ indicates a greater amount of remaining information on the meta-path M, and a larger parameter p indicates a longer active period of information on the meta-path M. Figure 4 shows the effect of different parameter p values on the performance of LP-THN on the AMiner, Last.FM and MovieLens datasets. The optimal values of parameter p on the three datasets are 8, 1 and 4, respectively. Interestingly, as parameter p changes, the performance of LP-THN on the MovieLens dataset changes significantly, while its performance on the Last.FM dataset remains almost unchanged. The performance of LP-THN on the AMiner dataset fluctuates greatly when $p = 6$ and varies less for other values of p. A precision–recall curve is used to show the effect of the parameters on the model performance; the closer the precision–recall curve is to the upper right, the better the model performs.
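A precision–recall curve of the kind used in this analysis is obtained by sweeping the decision threshold over the predicted scores. The sketch below is a minimal illustration: the function name and the toy scores are hypothetical, and the scores are assumed to be distinct so that each threshold admits exactly one additional sample.

```python
def precision_recall_points(y_true, scores):
    """Return (precision, recall) pairs as the threshold is lowered.

    Assumes distinct scores; tied scores would need to be grouped.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("at least one positive sample is required")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    points = []
    for i in order:
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / total_pos))
    return points

# Hypothetical labels and scores for six candidate edges.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
points = precision_recall_points(y_true, scores)
```

A model whose points stay close to precision 1 as recall grows towards 1 produces a curve near the upper right corner, which is the criterion used to compare parameter values.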

Figure 5 shows the effect of varying the parameter q on the LP-THN model in the AMiner, Last.FM and MovieLens datasets. When the value range of parameter q is restricted to $[1, 10]$, its optimal values on the three datasets are 5, 5 and 3, respectively. As with parameter p, the effect of varying parameter q is more obvious on the MovieLens dataset than on the other datasets. Parameter q has less effect on performance on the Last.FM dataset, except for a larger fluctuation when $q = 1$. We attribute this phenomenon to the time span covered by each temporal dataset. The effects of parameters p and q on the model depend on the amount of information remaining on the meta-paths, and the Last.FM dataset aggregates information from only 4 time slices, limiting the effect of these two parameters.

Figure 6 shows the effect of different aggregation methods considering meta-paths with first-order similarity and second-order similarity on the performance of LP-THN in the three temporal datasets. The different aggregation methods impact LP-THN less in the AMiner and Last.FM datasets but more in the MovieLens dataset. Different types of meta-paths contain different semantics, and the best results are obtained by aggregating different meta-paths with adaptive weights.

Figure 7 shows the effect that varying the output dimension of the encoder has on the performance of LP-THN on the three temporal datasets. We set the dimension to 10, 32, 64, 128, 256 and 512. On the AMiner and MovieLens datasets, the performance of LP-THN fluctuates considerably at a dimension of 128, and when the dimension is greater than 256, the performance continuously decreases. However, changing the dimension has little effect on the performance of LP-THN on the Last.FM dataset. Our model can capture rich structural, semantic and temporal information using low-dimensional representations.

5.2.3. Ablation Experiments

Traditional link prediction methods for temporal heterogeneous networks ignore the fact that a newly added edge is active for a period of time and then calms down. In this paper, the information lifecycle is used to describe this behaviour. To verify the influence of the information lifecycle on the prediction results, ablation experiments were conducted, as shown in Figure 8. LP-THN with the meta-path augmented residual information matrix performs better than LP-THN with the meta-path augmented matrix. The meta-path augmented residual information matrix considers both semantic and temporal information through the information lifecycle, while the meta-path augmented matrix considers only semantic information. The meta-path augmented matrix is a binary matrix: if there is a meta-path instance between two nodes, the corresponding entry is 1; otherwise, it is 0.
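The binary meta-path augmented matrix used as the ablation baseline can be illustrated for a two-hop meta-path such as APA. In the sketch below, the function name, the use of an author-paper incidence matrix and the exclusion of self-pairs on the diagonal are assumptions made for the example; they are one straightforward way to realize the 0/1 definition given above.

```python
import numpy as np

def binary_metapath_matrix(incidence):
    """Binary matrix for a symmetric two-hop meta-path (e.g. APA).

    incidence : (num_authors, num_papers) 0/1 matrix, 1 if the author
                wrote the paper
    Entry (i, j) is 1 if at least one meta-path instance links i and j.
    """
    counts = incidence @ incidence.T   # number of shared papers per pair
    binary = (counts > 0).astype(int)
    np.fill_diagonal(binary, 0)        # assumption: ignore self-pairs
    return binary

# Hypothetical toy data: 4 authors, 3 papers.
author_paper = np.array([[1, 0, 0],
                         [1, 1, 0],
                         [0, 1, 0],
                         [0, 0, 1]])
apa = binary_metapath_matrix(author_paper)
```

Here authors 0 and 1 share a paper, as do authors 1 and 2, while author 3 has no co-author. The matrix records only whether an instance exists, so it carries no information about when or how often the collaboration occurred, which is the temporal signal the residual information matrix adds.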

5.2.4. Complexity Analysis

Since DeepWalk, Node2Vec, LINE, M-NMF, Deeplink and Struc2Vec are all node embedding representations on homogeneous networks, their time complexity is much higher than that of LP-THN. LP-THN takes the meta-path augmented residual information matrix as the model input, so it needs to update only the embedding representations of the nodes at both ends of the meta-paths instead of all nodes in the network. In this subsection, the time complexity of LP-THN is analysed. The complexity of this model is mainly due to the encoder, which, as described above, is mainly composed of three parts: the modified self-attention module, the modified GCN module and the modified GAT module. The time complexity of the modified self-attention module is $O(|V^{M}|^{2} d)$, where $|V^{M}|$ denotes the number of nodes belonging to the node types at both ends of the meta-path, and d denotes the dimension of the node embedding representation. The time complexity of the modified GCN module is $O(|E^{M}|)$, where $|E^{M}|$ denotes the number of instances of the selected meta-paths. The time complexity of the modified GAT module is $O(|V^{M}|^{2} d) + O(|E^{M}| d)$. The summarized time complexity of LP-THN is $O(|V^{M}|^{2} d) + O(|E^{M}|) + O(|V^{M}|^{2} d) + O(|E^{M}| d)$, which simplifies to $O(|V^{M}|^{2} d + |E^{M}| d)$.

6. Conclusions

In this paper, we study the link prediction problem on temporal heterogeneous networks and propose an encoder–decoder framework, LP-THN. The semantic information contained in meta-paths in temporal networks is captured by a proposed meta-path augmented residual information matrix that portrays the dynamic process of network structure changes over time. Considering the lifecycle of the information carried by meta-paths enables LP-THN to account for both the rich semantic information in heterogeneous networks and the evolution of network structure over time. Experiments demonstrate that the LP-THN model proposed in this paper outperforms existing baseline frameworks. In future work, we plan to extend our model to large-scale networks, aggregate node attributes into the embedding process, and process data in temporal networks in real time.

Author Contributions

Conceptualization, J.C.; methodology, J.C.; software, J.C.; validation, J.C.; formal analysis, J.C.; investigation, J.C.; resources, J.C.; data curation, J.C.; writing—original draft preparation, J.C.; writing—review and editing, J.C.; visualization, J.C.; supervision, J.L.; project administration, J.J.; funding acquisition, J.L. and J.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (NNSFC) under Grant 72001209, 72231011 and 72071206, and the Science Foundation for Outstanding Youth Scholars of Hunan Province under Grant 2022JJ20047.

Institutional Review Board Statement

This article does not involve ethical research and does not require ethical approval.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The common dataset used in this study can be obtained from links https://www.aminer.cn/, http://www.lastfm.com and https://movielens.org/ (accessed on 14 June 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Stanfield, Z.; Coşkun, M.; Koyutürk, M. Drug response prediction as a link prediction problem. Sci. Rep. 2017, 7, 40321.
2. Nasiri, E.; Berahm, K.; Rostami, M.; Dabiri, M. A novel link prediction algorithm for protein-protein interaction networks by attributed graph embedding. Comput. Biol. Med. 2021, 137, 104772.
3. Liben-Nowell, D.; Kleinberg, J. The link prediction problem for social networks. In Proceedings of the Twelfth International Conference on Information and Knowledge Management, New Orleans, LA, USA, 9–11 November 2003; pp. 556–559.
4. Cho, H.; Yu, Y. Link prediction for interdisciplinary collaboration via co-authorship network. Soc. Netw. Anal. Min. 2018, 8, 1–12.
5. Liu, G. An ecommerce recommendation algorithm based on link prediction. Alex. Eng. J. 2022, 61, 905–910.
6. Adamic, L.A.; Adar, E. Friends and neighbors on the web. Soc. Net. 2003, 25, 211–230.
7. Zhou, T.; Lü, L.; Zhang, Y.C. Predicting missing links via local information. Eur. Phys. J. B 2009, 71, 623–630.
8. Katz, L. A new status index derived from sociometric analysis. Psychometrika 1953, 18, 39–49.
9. Leicht, E.A.; Holme, P.; Newman, M.E. Vertex similarity in networks. Phys. Rev. E 2006, 73, 026120.
10. Fouss, F.; Pirotte, A.; Renders, J.M.; Saerens, M. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Trans. Knowl. Data Eng. 2007, 19, 355–369.
11. Jeh, G.; Widom, J. Simrank: A measure of structural-context similarity. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, AB, Canada, 23–26 July 2002; pp. 538–543.
12. Perozzi, B.; Al-Rfou, R.; Skiena, S. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 701–710.
13. Grover, A.; Leskovec, J. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 855–864.
14. Wang, X.; Cui, P.; Wang, J.; Pei, J.; Zhu, W.; Yang, S. Community preserving network embedding. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
15. Tang, J.; Qu, M.; Wang, M.; Zhang, M.; Yan, J.; Mei, Q. Line: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 1067–1077.
16. Scarselli, F.; Gori, M.; Tsoi, A.C. The graph neural network model. IEEE Trans. Neural Net. 2008, 20, 61–80.
17. Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. NIPS 2017, 30, 1024–1034.
18. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2017, arXiv:1609.02907.
19. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
20. Shi, C.; Wang, R.; Wang, X. Survey on Heterogeneous Information Networks Analysis and Applications. J. Softw. 2022, 33, 598–621.
21. Dong, Y.; Chawla, N.V.; Swami, A. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 135–144.
22. Fu, T.Y.; Lee, W.C.; Lei, Z. Hin2vec: Explore meta-paths in heterogeneous information networks for representation learning. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 1797–1806.
23. Chang, S.; Han, W.; Tang, J.; Qi, G.J.; Aggarwal, C.C.; Huang, T.S. Heterogeneous network embedding via deep architectures. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10–13 August 2015; pp. 119–128.
24. Zhang, C.; Song, D.; Huang, C.; Swami, A.; Chawla, N.V. Heterogeneous graph neural network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 793–803.
25. Shi, C.; Hu, B.; Zhao, W.X.; Philip, S.Y. Heterogeneous information network embedding for recommendation. IEEE Trans. Knowl. Data Eng. 2018, 31, 357–370.
26. Zhang, R.; Wang, Q.; Yang, Q.; Wei, W. Temporal link prediction via adjusted sigmoid function and 2-simplex structure. Sci. Rep. 2022, 12, 16585.
27. Zou, L.; Zhan, X.X.; Sun, J.; Hanjalic, A.; Wang, H. Temporal network prediction and interpretation. IEEE Trans. Netw. Sci. Eng. 2021, 9, 1215–1224.
28. Zhou, L.; Yang, Y.; Ren, X.; Wu, F.; Zhuang, Y. Dynamic network embedding by modeling triadic closure process. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
29. Nguyen, G.H.; Lee, J.B.; Rossi, R.A.; Ahmed, N.K.; Koh, E.; Kim, S. Continuous-time dynamic network embeddings. In Proceedings of the Companion Proceedings of the Web Conference 2018, Lyon, France, 23–27 April 2018; pp. 969–976.
30. Rahman, M.; Saha, T.K.; Hasan, M.A.; Xu, K.S.; Reddy, C.K. Dylink2vec: Effective feature representation for link prediction in dynamic networks. arXiv 2018, arXiv:1804.05755.
31. Abbas, K.; Abbasi, A.; Dong, S.; Niu, L.; Chen, L.; Chen, B. A Novel Temporal Network-Embedding Algorithm for Link Prediction in Dynamic Networks. Entropy 2023, 25, 257.
32. Yin, Y.; Ji, L.X.; Zhang, J.P.; Pei, Y.L. DHNE: Network representation learning method for dynamic heterogeneous networks. IEEE Access 2019, 7, 134782–134792.
33. Wang, X.; Lu, Y.; Shi, C.; Wang, R.; Cui, P.; Mou, S. Dynamic heterogeneous information network embedding with meta-path based proximity. IEEE Trans. Knowl. Data Eng. 2020, 34, 1117–1132.
34. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. NIPS 2017, 30, 5998–6008.
35. Tang, J.; Zhang, J.; Yao, L.; Li, J.; Zhang, L.; Su, Z. Arnetminer: Extraction and mining of academic social networks. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, USA, 24–27 August 2008; pp. 990–998.
36. Cantador, I.; Brusilovsky, P.; Kuflik, T. Second workshop on information heterogeneity and fusion in recommender systems (HetRec2011). In Proceedings of the Fifth ACM Conference on Recommender Systems, Chicago, IL, USA, 23–27 October 2011; pp. 387–388.
37. Harper, F.M.; Konstan, J.A. The movielens datasets: History and context. ACM Trans. Interact. Intell. Syst. 2015, 5, 1–19.
38. Zhou, F.; Liu, L.; Zhang, K.; Trajcevski, G.; Wu, J.; Zhong, T. Deeplink: A deep learning approach for user identity linkage. In Proceedings of the IEEE INFOCOM 2018-IEEE Conference on Computer Communications, Honolulu, HI, USA, 15–19 April 2018; pp. 1313–1321.
39. Ribeiro, L.F.; Saverese, P.H.; Figueiredo, D.R. struc2vec: Learning node representations from structural identity. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 385–394.

Figure 1. Example of (A) an academic co-authorship network and (B) co-authorship instances.

Figure 2. Schematic diagram of temporal link prediction.

Figure 3. Diagram of the proposed LP-THN model. This model contains network aggregation, meta-path selection, meta-path augmented information matrix, encoder and decoder steps. In the first step, the temporal heterogeneous network is aggregated into a single-layer aggregated network, which sets the age and times as the weights of edges. In the second step, meta-paths are selected according to the semantics within the temporal dataset. In the third step, the meta-path augmented information matrix $W^{M}$ is calculated, preserving structure evolution and semantics. In the fourth step, an encoder is used to obtain the low-dimensional embedding representation of nodes. Finally, a decoder is applied to perform link prediction.

Figure 4. Effect of the parameter p on performance in different temporal heterogeneous network datasets. Precision–recall curves are used to show the effect of the variation in p on the performance of LP-THN.

Figure 5. Effect of the parameter q on performance in different temporal heterogeneous networks. Precision–recall curves are used to show the effect of varying q on the performance of LP-THN.

Figure 6. Effect of meta-path aggregation method on LP-THN in different temporal datasets.

Figure 7. Effect of dimension d on LP-THN in different temporal datasets.

Figure 8. Effect of information lifecycle on LP-THN in different temporal datasets.

Table 1. Statistics of Datasets.

Datasets	Node Type	#Nodes	Meta-Paths	Time Steps
AMiner	Author (A)	7043	APA, APCPA	9
	Paper (P)	5371		
	Conference (C)	17		
Last.FM	User (U)	1234	UAU, UATAU	5
	Artist (A)	9438		
	Tag (T)	5472		
MovieLens	User (U)	819	UMU, UMGMU	8
	Movie (M)	12,677		
	Genre (G)	39		

Table 2. Performance Evaluation of LP-THN.

Datasets	Metric	DeepWalk	Node2Vec	LINE	M-NMF	Deeplink	Struc2Vec	Metapath2vec	LP-THN-1st	LP-THN-2nd	LP-THN
AMiner	AUC	0.9380	0.9625	0.9482	0.9415	0.9398	0.9332	0.9623	0.9912	0.9903	0.9934
	F1	0.9403	0.9632	0.9493	0.9418	0.942	0.9358	0.9631	0.9627	0.9355	0.9686
	ACC	0.9380	0.9625	0.9482	0.9415	0.9398	0.9332	0.9624	0.9728	0.9386	0.969
Last.FM	AUC	0.8539	0.9675	0.969	0.9223	0.9559	0.9594	0.9667	0.9844	0.9778	0.9848
	F1	0.8819	0.9707	0.9717	0.9318	0.9608	0.9638	0.9701	0.9768	0.9721	0.9768
	ACC	0.8604	0.9687	0.9699	0.9249	0.9576	0.9610	0.9680	0.9752	0.9726	0.9761
MovieLens	AUC	0.8832	0.8576	0.9744	0.8967	0.9068	0.9345	0.9872	0.9913	0.9935	0.9945
	F1	0.8861	0.8642	0.9737	0.9000	0.9014	0.9333	0.9867	0.9927	0.9921	0.9981
	ACC	0.8816	0.8553	0.9737	0.8947	0.9079	0.9342	0.9868	0.9933	0.9872	0.9860

The bold represents the best AUC/F1/ACC score within a network.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Cao, J.; Li, J.; Jiang, J. Link Prediction for Temporal Heterogeneous Networks Based on the Information Lifecycle. Mathematics 2023, 11, 3541. https://doi.org/10.3390/math11163541
