Temporal Knowledge Graph Reasoning Based on Entity Relationship Similarity Perception

Abstract: Temporal knowledge graphs (TKGs) dynamically model facts along the temporal dimension and are widely used in various fields. However, existing reasoning models often fail to consider the similarity features between entity relationships and static attributes, making it difficult for them to handle these temporal attributes effectively. These models are therefore limited in dealing with previously unseen entities that appear over time and with the implicit associations of static attributes between entities. To address this issue, we propose a temporal knowledge graph reasoning model based on Entity Relationship Similarity Perception, known as ERSP. The model employs similarity measurement to capture the similarity features of entity relationships and static attributes, and then fuses these features to generate structural representations. Finally, a decoder combines the entity relationship, static attribute, and structural representations to form a quadruple. Experiments on five common benchmark datasets show that ERSP surpasses the majority of TKG reasoning methods.


Introduction
Knowledge graphs (KGs) are widely used in multiple fields, including information retrieval, intelligent recommendation, question-answering systems, and natural language processing [1]. However, the challenges of maintaining timeliness and data consistency have made research into knowledge graphs increasingly arduous. Current research primarily focuses on reasoning over static knowledge graphs. However, in real life, information changes dynamically and data have timeliness, which poses a significant challenge to the dynamic reasoning of knowledge graphs.
A knowledge graph is a knowledge base that stores factual information in a graph structure, where each fact is represented as a triplet (e_s, r, e_o): e_s (the subject entity) and e_o (the object entity) are nodes, and r is the type of edge (relation) connecting them. Knowledge graphs are generally presented in a static form. However, in real life, the relations between entities change over time. To describe this dynamism more accurately, some researchers have introduced the concept of temporal knowledge graphs (TKGs). A temporal knowledge graph is a type of knowledge graph used to describe dynamic facts. It extends each triplet (e_s, r, e_o) with a timestamp to form a quadruple (e_s, r, e_o, t), where t denotes the specific period during which the fact occurred. For example, as shown in Figure 1, (Donald Trump, President, United States, 2017/1~2021/1) represents that Donald Trump served as President of the United States from January 2017 to January 2021. The main research focus of this article is the extrapolation problem in temporal knowledge graph reasoning, which includes predictions about entities and relations. Solving the extrapolation problem is of great significance for many practical applications, such as predicting social relations. To predict future events accurately, we need insight into the patterns of development between historical events. At each timestamp, the relations between entities form a complex structural network through the interaction of concurrent facts. This network evolves continuously, providing opportunities to explore the correlation of historical events through their mutual influence, thereby improving the accuracy of predicting future events.
For example, consider querying (?, President Elected, United States, 2017-1~2021-1). We can retrieve a series of related historical adjacent facts and similar facts. As shown in Figure 1, among these facts, the possible people involved are Joe Biden, Donald Trump, and so on. Considering both neighboring and similar facts narrows down the predicted outcome; in this example, to Barack Obama, Donald Trump, and Joe Biden. In addition, as shown in Figure 2, we can reveal important clues by observing adjacent and similar facts in history. For example, in January 2017, Trump signed an executive order issuing a "ban" involving seven countries; in May 2018, he announced withdrawal from the Iran Nuclear Agreement signed by the six parties. In 2019, Kim Jong Un and Donald Trump held the first-ever meeting between sitting leaders of their two countries. In 2020, Trump announced that the United States had officially withdrawn from the Paris Agreement. These historical facts not only reflect Trump's behavior, but also foreshadow his future movements to some extent. By comprehensively considering this historical information, we can reveal the behavioral trends and preferences of entities and relations, providing important clues for predicting future events.
In recent years, some studies have attempted to obtain relevant historical information for different queries through heuristic methods. However, these methods do not consider the similar relations between entity relationships or between static attributes. For example, CyGNet [6] utilizes global historical information to efficiently model repetitive patterns present in temporal knowledge graphs in order to predict entities and relations at future timestamps. TANGO [7] extends the idea of neural ordinary differential equations to multi-relational graph convolutional networks, encoding temporal and structural information as continuous-time dynamic embeddings. TITer [8], a temporal path-based reinforcement learning method, utilizes a relative time-encoding function to obtain temporal information and uses time-shaped rewards to guide model learning. CEN [9] uses a KG sequence encoder based on relational graph neural networks and a length-aware convolutional-neural-network-based evolutionary representation decoder to learn evolutionary patterns from historical KG sequences of different lengths. Although these methods integrate and predict entity relationship information, they overlook the similar relations between entity relationships and static attributes, which often have a significant impact on TKG reasoning.
In this work, we propose a representation learning model based on Entity Relationship Similarity Perception (ERSP) for modeling and reasoning over temporal knowledge graph sequences. Specifically, the model captures the evolutionary representations of entities and relations in TKGs at different timestamps through multiple components: (1) the entity-aware component, which models entity information at each timestamp in the TKG; (2) the relationship similarity-aware component, which models the relations and their similarity feature information in the TKG at each timestamp; and (3) the static attribute similarity-aware component, which constrains the static attributes of entities and mines the similarity features of static attributes.
In summary, the main contributions of this paper are as follows:
• We propose ERSP, a novel representation learning model for TKGs. The model integrates the entity-aware component, the relationship similarity-aware component, and the static attribute similarity-aware component, thus fully utilizing the similarity features between entity relationships and static attributes in historical facts. By modeling these features, our model has significant advantages in handling unseen entities and can resolve hidden associations between the static attributes of facts, thereby further enhancing its predictive ability for entity relationships.
• To our knowledge, this is the first time that similarity features of historical facts and static attribute information have been integrated into TKG reasoning.
• We conducted extensive experiments on five commonly used TKG datasets, and the results show the excellent performance of ERSP on both entity prediction and relation prediction tasks.

Related Work
In recent years, researchers have divided the integration of temporal information into knowledge graph reasoning into two main settings: interpolation and extrapolation [2]. In the interpolation setting, the model is given known events and facts within a timestamp range [t_0, t_T] and predicts facts that may exist within this range but have not yet been observed. This means that the model needs to fill in the gaps within this period based on existing historical information. In the extrapolation setting, the model knows the historical events and facts within the timestamp range [t_0, t_T] and predicts facts that may occur within the future timestamp range [t_{T+1}, ∞). This requires the model to understand the evolution patterns of events in historical information in order to infer facts that may occur in the future [10].
TKG Reasoning under Interpolation Settings: TTransE [11] extends TransE [12] with temporal consistency information as a constraint. TA-DistMult [13] embeds entities and relations into the complex domain using matrix factorization. TA-TransE learns a time-aware representation of relations through long short-term memory recurrent neural networks and distance minimization. TNTComplEx [14] generates embedded representations of timestamps through the complex factorization of fourth-order tensors. However, these models cannot capture representations of unseen timestamps and are usually not applicable in extrapolation settings.
TKG Reasoning under Extrapolation Settings: RE-NET [15] models historical facts as subgraph sequences using a neighborhood aggregator and a recurrent event encoder. xERTE [16] builds interpretable reasoning graphs using subgraph sampling techniques to integrate entity, relation, and temporal information in the reasoning graph. TANGO-Tucker [7] introduces Tucker decomposition to model complex interactions of entities, relations, and time. RE-GCN [17] comprehensively models historical information by capturing the structural dependencies of historical facts and the sequential patterns of neighboring facts. GHT [18] designs two Transformer modules based on the Hawkes process to capture structural and temporal information, respectively. ReGAT [19] uses an RNN and a GNN to jointly encode temporal and structural event information from historical and concurrent events. However, none of the above methods consider the similarity features of entity relationships or of static attributes, making it difficult for them to handle these temporal attributes effectively.
Similarity Modeling of Entity Relationships: This involves establishing a similarity model for entity relationships in the knowledge graph to mine its potential associations. SiGMa [20] uses a greedy matching method based on similarity propagation to model structural information and similarity measures of entity attributes in the relational graph. The HistSim and DisNGram methods [21] achieve similarity matching of entities through character-level similarity and threshold calculations. Paris [22] measures the degree of matching between entities and relations based on probability estimation. CEAFF [23] uses reinforcement learning to align entities collectively and utilizes representative features to capture the similarity between entities in heterogeneous knowledge graphs. ClusterEA [24] is a GNN-based entity alignment model that aligns entities between large-scale knowledge graphs through stochastic training and normalized similarity. However, these methods have not been applied in the field of temporal knowledge graph extrapolation and do not take into account the similarity features of static attributes.
Therefore, the above models cannot handle previously unseen entities that appear over time or the implicit associations of static attributes between facts. In contrast, ERSP models the entire sequence of temporal knowledge graphs as a whole, fully considering the similarity features of entity relationships while also synthesizing the similarity features of entities' static attributes. This comprehensive modeling can greatly improve the accuracy and effectiveness of entity and relation prediction.

Preliminaries
In this section, we elaborate on TKGs and TKG reasoning problems, and summarize the relevant symbols used in this work, as shown in Table 1.

Table 1. Summary of notations.

H	The entity set in the TKG.
R	The relation set in the TKG.
M	The fact set.
E_s, E_o	Subject entity, object entity.
W^η_r	The weight matrix related to the relation r.
W^η_o	The weight matrix related to the object entity o.
E^s	The static embedding matrix.
E^e	The evolutionary embedding matrix.
H^s	The similarity matrix.
|H|	The total number of entities.
|R|	The total number of relations.
P^E_score	The score probability of the entity.
P^R_score	The scoring probability of the relation.

Definition 1. Temporal knowledge graph:
A temporal KG is a multi-relational graph that changes over time, with edges marked with timestamps. A TKG can be represented as G = (H, R, M, T), where H is the entity set, R is the relation set, M is the fact set, and T is the time set. Each fact in M is represented as a quadruple (E_s, r, E_o, t).

Definition 2. Temporal knowledge graph reasoning: Temporal knowledge graph reasoning is the process of prediction using the temporal patterns and dependencies present in the temporal knowledge graph. This process can be divided into two main tasks: entity prediction and relation prediction.

Entity prediction: Entity prediction aims to predict missing entities at a future point in time based on a sequence of historical knowledge graphs. For example, given a query of the form (?, r, E_o, t + 1) or (E_s, r, ?, t + 1), we infer the missing subject entity E_s or object entity E_o.

Relation prediction: Relation prediction is the inference of missing relations between entities at a certain point in the future based on the same historical sequence. For example, given a query of the form (E_s, ?, E_o, t + 1), we infer the missing relation r.
Definition 3. Similarity measurement: Similarity measurement is a key tool for evaluating the similarity between two objects and is widely used in machine learning and data mining. Common similarity measurement methods include cosine similarity and Euclidean distance, among others.
Cosine similarity: Cosine similarity measures the directional similarity between two vectors, and its formula is

cos(m, n) = \frac{\sum_{i=1}^{k} m_i n_i}{\|m\| \, \|n\|}

where m and n are two vectors; m_i and n_i represent the i-th elements of vectors m and n, respectively; ‖m‖ and ‖n‖ represent the moduli of vectors m and n, respectively; and k represents the dimension of a vector, which is the number of elements it has.
Euclidean distance: Euclidean distance measures the straight-line distance between two vectors in multidimensional space, and its formula is

d(m, n) = \sqrt{\sum_{i=1}^{k} (m_i - n_i)^2}

where m and n are two vectors; m_i and n_i are the i-th elements of vectors m and n; and k represents the dimension of a vector, which is the number of elements it has.
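The two measures above follow directly from their formulas. As a minimal illustration (not part of the ERSP implementation), the following Python helpers compute them for plain lists of numbers:

```python
import math

def cosine_similarity(m, n):
    """Directional similarity between two k-dimensional vectors m and n."""
    dot = sum(mi * ni for mi, ni in zip(m, n))
    norm_m = math.sqrt(sum(mi * mi for mi in m))
    norm_n = math.sqrt(sum(ni * ni for ni in n))
    return dot / (norm_m * norm_n)

def euclidean_distance(m, n):
    """Straight-line distance between m and n in k-dimensional space."""
    return math.sqrt(sum((mi - ni) ** 2 for mi, ni in zip(m, n)))
```

Parallel vectors have cosine similarity 1 regardless of magnitude, while Euclidean distance grows with the coordinate-wise gap; the two measures therefore capture complementary notions of similarity.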

Model Overview
The overall framework of ERSP consists of three key components: the entity-aware component, the relationship similarity-aware component, and the static attribute similarity-aware component, as illustrated in Figure 3. The entity-aware component includes a graph convolutional network (GCN) structure and an adaptive time gate network structure. The goal of the GCN structure is to capture, through aggregation, the feature dependency relations of entities in the temporal knowledge graph at each timestamp. The adaptive time gate network structure is used to obtain the evolutionary representation of entities at each timestamp. The relationship similarity-aware component is composed of mean pooling and relationship-aware gating units. The relationship-aware gating unit is based on the relationship similarity vector and a gated recurrent neural network unit, which can not only obtain similarity features of relations in historical facts but also further capture the evolutionary representation of relations. The static attribute similarity-aware component fuses the static and evolutionary embeddings of entities, integrating static attributes into the evolutionary embedding and further obtaining the similar relations of their static attributes.

Entity-Aware Component

Graph Convolutional Network (GCN) Structure
To capture the feature dependency relations between concurrent facts, we adopt a graph convolutional network (GCN) structure [25]. The GCN is used to capture the associations between entities and relations in a multi-relational graph. Specifically, at timestamp t of the temporal knowledge graph, the embedding of object entity o at layer η + 1 is obtained from its subject-entity neighbors at layer η through a message-passing framework:

\vec{h}_o^{\,\eta+1} = \Phi\!\left( \frac{1}{k} \sum_{(s,r)} W_r^{\eta} \vec{h}_s^{\,\eta} + W_o^{\eta} \vec{h}_o^{\,\eta} \right)

where Φ(·) represents the ReLU activation function [26], k is a normalization constant, W_r^η is the weight matrix related to the relation r, and W_o^η is the weight matrix related to the object entity o.
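The message-passing step above can be sketched as follows. This toy implementation aggregates the relation-transformed embeddings of an entity's neighbors, adds the self-loop term, and applies ReLU; the single shared relation matrix and plain-list linear algebra are simplifications for clarity, not the paper's exact implementation:

```python
def relu(v):
    return [x if x > 0 else 0.0 for x in v]

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gcn_layer(h_o, neighbor_embs, W_r, W_o, k):
    """One message-passing update for object entity o:
    (1/k) * sum of W_r-transformed subject embeddings, plus the
    self-loop term W_o * h_o, followed by ReLU."""
    agg = [0.0] * len(h_o)
    for h_s in neighbor_embs:          # subject entities linked to o
        msg = matvec(W_r, h_s)
        agg = [a + m for a, m in zip(agg, msg)]
    agg = [a / k for a in agg]         # normalization by the constant k
    self_loop = matvec(W_o, h_o)
    return relu([a + s for a, s in zip(agg, self_loop)])
```

Stacking several such layers lets an entity's representation absorb information from multi-hop neighbors within the same timestamp.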

Adaptive Time Gate Network Structure
For an entity o, its information may change between time steps, which may lead to information loss when processing entity information. To address this issue and better capture the temporal correlation of entity information between adjacent time steps, we introduce an adaptive time gate network structure (ATGN) that dynamically adjusts the transmission and updating of entity information across time steps. The adaptive gating mechanism controls how the entity information after graph-convolution aggregation is fused with the entity information of the previous time step to generate the updated entity representation of the current time step. Namely,

H_t = \mathrm{Adapt\_Gate}\!\left( H_{t-1}^{GCN}, H_{t-1} \right)

where H_{t−1} represents the entity embedding matrix at timestamp t − 1 and H_{t−1}^{GCN} represents the entity embedding matrix after graph-convolution aggregation. The Adapt_Gate structure is composed of an update gate C_t and a final state update O_t, which together control the dynamic adjustment of entity information. Specifically,

C_t = \mathrm{sigm}\!\left( W_c X_{t-1} + b_r \right)

where C_t represents the output of the update gate; sigm(·) is the sigmoid function; W_c is the weight coefficient matrix of the update gate, which dynamically adjusts entity information through its weights; b_r is a bias matrix used to adjust the opening degree of the update gate; and X_{t−1} represents the hidden state of the previous time step, i.e., the entity embedding matrix.
O_t = C_t \odot X_t + (1 - C_t) \odot X_{t-1}

where O_t represents the output of the final state update, i.e., the updated entity embedding matrix; ⊙ represents the element-wise product; and X_t represents the hidden state of the current time step, i.e., the entity embedding matrix after graph-convolution aggregation. Through the adaptive time gate network structure, the model can better handle changes in entity information across time steps, ensuring that it can flexibly capture feature dependency relations between entities at different timestamps and further improving its modeling of entity temporal correlation.
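A minimal sketch of this gate follows, assuming per-dimension diagonal weights for readability (the paper uses full weight matrices); the gate value per dimension interpolates between the previous state and the freshly aggregated one:

```python
import math

def sigm(x):
    return 1.0 / (1.0 + math.exp(-x))

def adapt_gate(x_prev, x_curr, w_c, b_r):
    """Adaptive time gate: per dimension, C_t = sigm(w_c * X_{t-1} + b_r)
    decides how much of the GCN-aggregated state x_curr replaces the
    previous state x_prev: O_t = C_t * X_t + (1 - C_t) * X_{t-1}."""
    gated = []
    for xp, xc, w, b in zip(x_prev, x_curr, w_c, b_r):
        c = sigm(w * xp + b)              # update gate C_t
        gated.append(c * xc + (1 - c) * xp)
    return gated
```

When the gate saturates near 1 the new aggregated information dominates; near 0 the previous timestamp's representation is carried forward, which is what prevents information loss between adjacent time steps.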

Relationship Similarity-Aware Component
To capture the temporal correlation of relation information between adjacent timestamps, the model combines mean pooling with relationship-aware gating units to gradually update the evolutionary representation of relations. The mean pooling operation is

\vec{r}'_t = \frac{1}{|M_{r,t}|} \sum_{e \in M_{r,t}} H_{t-1}(e)

where H_{t−1} represents the entity embedding matrix at timestamp t − 1, M_{r,t} represents the set of entities connected to the relation r at timestamp t, and \vec{r}'_t is the mean pooling of the rows of H_{t−1} corresponding to the entities in M_{r,t}.
Subsequently, the relation embedding matrix R_{t−1} at timestamp t − 1 and the R'_t obtained from the mean pooling operation are used to update R_t through the relation-aware gating unit (Relation-GateCell). The relation-aware gating unit applies the similarity measurement method to extract the similarity features of relations, in order to better capture the evolution of relations in the temporal knowledge graph. Specifically, the update process of the relation is

R_t = \mathrm{Relation\text{-}GateCell}\!\left( R_{t-1}, R'_t \right)

Previous models often overlook the similarity features of relations, so we adopt the Relation-GateCell structure, a relation-aware gating unit. This structure is based on a relation similarity matrix and a gated recurrent neural network unit (PReLU-GateCell), which not only captures the relation similarity in historical facts but also further captures the evolutionary representation of relations. The relation similarity is computed as

h = \Phi\!\left( \mathrm{FC1}(R'_t) \right), \qquad H^r = \mathrm{FC2}(h)

where FC1 and FC2 represent fully connected layers; Φ represents the ReLU activation function; h represents the hidden-layer representation; and H^r maps the hidden-layer representation to the final output, representing the relation similarity matrix. Based on the relation similarity matrix, the relation features are updated as

R'_t \leftarrow H^r \odot R'_t

The traditional GRU structure may suffer from vanishing gradients due to the stacking of historical KG sequences. To address this issue, a gated recurrent neural network unit with learnable parameters (PReLU-GateCell), obtained by improving the traditional RNN structure, is adopted. The PReLU-GateCell structure consists of four parts: the input gate, the forget gate, the temporary hidden state, and the hidden state update. Specifically,

i_t = \mathrm{sigm}\!\left( W_i \left[ x_t, h_{t-1} \right] \right)

where i_t represents the output of the input gate; sigm(·) is the sigmoid activation function; W_i represents a weight matrix; and x_t and h_{t−1} represent the input at the current timestamp t and the hidden state at the previous timestamp, respectively. The input gate i_t controls the storage of new information in the hidden state.
f_t = \mathrm{sigm}\!\left( W_f \left[ x_t, h_{t-1} \right] \right)

where f_t and W_f represent the output of the forget gate and its weight matrix, respectively. The forget gate controls how much information should be forgotten from the previous hidden state in the current time step.
\tilde{h}_t = \mathrm{PReLU}\!\left( W_h \left[ x_t, h_{t-1} \right] \right)

where \tilde{h}_t is the output of the temporary hidden state; PReLU(·) denotes the activation function; and W_h denotes a weight matrix. Unlike the standard ReLU, PReLU allows input values less than zero to pass through instead of zeroing them, which effectively alleviates the vanishing-gradient problem. Specifically,

\mathrm{PReLU}(x) = \begin{cases} x, & x > 0 \\ \lambda x, & x \le 0 \end{cases}

where x represents the input value and λ is a learnable parameter controlling the slope of the negative part. The final hidden state z_t is obtained through a weighted combination of the input gate, forget gate, and temporary hidden state:

z_t = f_t \odot h_{t-1} + i_t \odot \tilde{h}_t

where z_t represents the output of the hidden state update and ⊙ represents the element-wise product. The final hidden state balances the retention of old information against the addition of new information.
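The four parts of the PReLU-GateCell can be sketched for a scalar state, where scalar weights w_i, w_f, w_h stand in for the weight matrices (purely illustrative, not the trained cell):

```python
import math

def sigm(x):
    return 1.0 / (1.0 + math.exp(-x))

def prelu(x, lam):
    """PReLU: pass positive values through, scale negatives by lam."""
    return x if x > 0 else lam * x

def prelu_gatecell(x_t, h_prev, w_i, w_f, w_h, lam=0.25):
    """One step: input gate i_t, forget gate f_t, temporary hidden
    state h_tmp, and the update z_t = f_t * h_prev + i_t * h_tmp."""
    concat = x_t + h_prev             # scalar stand-in for [x_t, h_{t-1}]
    i_t = sigm(w_i * concat)          # input gate
    f_t = sigm(w_f * concat)          # forget gate
    h_tmp = prelu(w_h * concat, lam)  # temporary hidden state
    return f_t * h_prev + i_t * h_tmp
```

Because PReLU keeps a nonzero slope λ on negative inputs, gradients flowing back through h_tmp do not vanish at zero the way they do with a standard ReLU, which is the motivation given above for replacing the GRU's activation.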

Static Attribute Similarity-Aware Component
In temporal knowledge graphs, static attributes describe the fixed characteristics of entities, such as the political system of a country or the type of an organization, which remain unchanged over time. Although these attributes remain constant, they may have a profound impact on the behavior and relations of entities. Therefore, studying the similarity characteristics of static attributes is of great significance for gaining a deeper understanding of entity characteristics and relations, as well as for predicting future events. This type of research helps to reveal patterns and regularities of entity behavior, supporting the precise analysis of temporal knowledge graphs. By delving deeper into the temporal correlation of static attributes, we can more accurately capture the essential characteristics of entities and events in the temporal knowledge graph, providing strong support for decision-making, predictive analysis, and other fields.
The static graph similarity constraint mechanism (SGC-Sim) is adopted to capture the similarity features of static attributes. This mechanism is based on similarity measurement methods, which can more comprehensively capture the static attribute similarity between entities and improve the model's ability to represent entity relationships. Firstly, the static graph encoding is defined as

\vec{h}'^{\,s}_i = \Phi\!\left( \frac{1}{k_i} \sum_{r^s \in R^s} W_{r^s} \, \vec{h}'^{\,s}_i(j) \right)

where Φ(·) represents the ReLU activation function; r^s represents a relation in the static graph; R^s represents the set of static relations; k_i is a constant; W_{r^s} represents the relation matrix; \vec{h}'^{\,s}_i(j) represents the randomly initialized input embedding matrix; and \vec{h}'^{\,s}_i represents the output embedding matrix. Subsequently, the similarity between evolutionary embeddings and static embeddings is measured using the similarity matrix H^s. This similarity calculation ensures consistency between the dynamic evolution and the static characteristics of entities, providing a more accurate entity representation for the model.
H^s = \frac{E^s (E^e)^{T}}{\|E^s\|_2 \, \|E^e\|_2}

where H^s represents the similarity matrix, and E^s and E^e represent the static embedding matrix and the evolutionary embedding matrix, respectively. T denotes the transpose, and ‖E^s‖_2 and ‖E^e‖_2 represent the L_2 norms of E^s and E^e, respectively. To select the static attributes with high similarity values, the top-k method is used:

V = \mathrm{topk}\!\left( H^s, k+1, d \right)

where V represents the selected similarity values of static attributes; H^s is the similarity matrix; k + 1 indicates selecting the top k + 1 highest-ranked similarity values; and d represents the dimension along which selection is performed, with the default being one-dimensional. The top-k method selects the first k maximum (or minimum) values from an array or tensor.
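The similarity matrix and top-k selection can be illustrated as follows. This sketch computes a row-wise cosine similarity between static and evolutionary embeddings (the matrix-level normalization in the formula above is approximated per vector pair here), then picks the indices of the largest values in a row:

```python
import math

def cosine_matrix(E_s, E_e):
    """H^s[i][j]: cosine similarity between static embedding i and
    evolutionary embedding j (per-vector L2 normalization)."""
    def norm(v):
        return math.sqrt(sum(x * x for x in v))
    return [[sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))
             for v in E_e] for u in E_s]

def topk_indices(row, k):
    """Indices of the k largest similarity values in one row of H^s."""
    return sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
```

Only the entities behind the top-ranked similarity values feed into the static constraint, which keeps the mechanism focused on the strongest static-attribute associations.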
Finally, the loss of the static attribute similarity-aware component at timestamp t is defined as

L^{st}_t = \frac{1}{|H|} \sum_{x=1}^{|H|} \max\!\left( \cos \phi_x - \cos\!\left( \vec{h}^{\,s}_x, \vec{h}_{t,x} \right),\, 0 \right)

where |H| represents the number of entities in the static graph, and ϕ_x represents the adjustable angle between the evolutionary embedding and the static embedding of the same entity. Constraining this angle ensures that the dynamic evolution of an entity does not deviate excessively from its static characteristics.
Then, the total loss of the static attribute similarity-aware component is

L_{st} = \sum_{t=1}^{n} L^{st}_t

where n represents the number of historical event steps.

Scoring Function
Research has shown that graph convolutional networks (GCNs) with convolutional scoring functions have significant performance advantages in temporal knowledge graph reasoning tasks [27]. To capture the evolutionary characteristics of entities and relations implied in historical facts, the ConvTransE decoder is used in this study [17]. By modeling entities and relations through the decoder, the probability vectors of entities and relations are obtained as

P^{E}_{score} = \mathrm{sigm}\!\left( H_t \, \mathrm{ConvTransE}\!\left( E_s, r \right) \right), \qquad P^{R}_{score} = \mathrm{sigm}\!\left( R_t \, \mathrm{ConvTransE}\!\left( E_s, E_o \right) \right)

where sigm(·) is the sigmoid function, H_t and R_t are the entity and relation embedding matrices at timestamp t, and ConvTransE(·) denotes the convolutional decoder.

Model Learning
The goal of the model is to predict changes in entities and relations over future time periods. This is a multi-label learning task, where each label represents a possible entity or relation. Based on the given historical facts, the model assigns probability scores to each entity and relation, and predicts entities and relations by maximizing the score of the events that actually occur. Specifically, the loss function for the entity prediction task is

L_e = -\sum_{t} \sum_{e=1}^{|H|} f(e) \log P^{E}_{score}

where L_e represents the entity loss, |H| represents the total number of entities, f(e) represents the entity label vector, and P^E_score represents the score probability of the entity. Similarly, the loss function for the relation prediction task is

L_r = -\sum_{t} \sum_{r=1}^{|R|} f(r) \log P^{R}_{score}

where L_r represents the relation loss, |R| represents the total number of relations, f(r) denotes the relation label vector, and P^R_score represents the score probability of the relation. These two temporal reasoning tasks are conducted within a multi-task learning framework, so the final loss is defined as

L = \alpha L_e + \beta L_r + L_{st}

where α is the parameter that controls the entity loss and β is the parameter that controls the relation loss. The detailed reasoning process is shown in Algorithm 1.
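Putting the pieces together, the multi-task objective above can be sketched as follows; the cross-entropy is reduced to the single true label for brevity, and the default weights α, β are illustrative values, not the paper's tuned settings:

```python
import math

def cross_entropy(scores, true_idx):
    """-log of the predicted probability of the true label."""
    return -math.log(scores[true_idx])

def total_loss(entity_scores, true_e, relation_scores, true_r,
               static_loss, alpha=0.5, beta=0.5):
    """Final multi-task objective: L = alpha * L_e + beta * L_r + L_st."""
    L_e = cross_entropy(entity_scores, true_e)     # entity prediction loss
    L_r = cross_entropy(relation_scores, true_r)   # relation prediction loss
    return alpha * L_e + beta * L_r + static_loss
```

Training both prediction tasks under one objective lets the entity and relation representations regularize each other, while the static term keeps evolutionary embeddings anchored to the entities' fixed attributes.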
Datasets

ICEWS14 is a dataset from the Integrated Crisis Early Warning System (ICEWS) that includes political events that occurred in 2014. ICEWS18, also from ICEWS, includes political events that occurred between 1 January 2018 and 31 October 2018. ICEWS05-15 is the long-term ICEWS dataset, which includes events that occurred between 2005 and 2015, covering a wider time range. The YAGO dataset is extracted from YAGO3 and contains temporal information. The GDELT dataset is a global event database. These datasets contain event data from different fields and periods, allowing the performance of the ERSP model to be evaluated in different contexts. More details on the datasets can be found in [30].

Evaluation Metrics
To evaluate the performance of the ERSP model on TKG reasoning tasks, a standard set of temporal knowledge graph metrics is adopted, including mean reciprocal rank (MRR) [31] and Hits@N [31]. The mean reciprocal rank (MRR) averages the reciprocals of the ranks the model assigns to the correct answers, and is the most typical metric for TKG reasoning tasks. Hits@N measures the percentage of correct entities included in the top N rankings of the model. Typically, Hits@1, Hits@3, and Hits@10 are used to report the results. Notably, higher MRR and Hits@N values indicate more accurate results. More evaluation metrics can be found in [32].
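For concreteness, both metrics can be computed directly from the rank the model assigns to each ground-truth answer; a minimal sketch:

```python
def mrr_and_hits(ranks, ns=(1, 3, 10)):
    """Compute MRR and Hits@N from 1-based ranks of the correct answers.

    ranks : list of ranks the model assigned to the correct
            entity/relation for each test query.
    """
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits = {n: sum(r <= n for r in ranks) / len(ranks) for n in ns}
    return mrr, hits
```

For instance, ranks [1, 2, 4, 11] give Hits@1 = 0.25, Hits@3 = 0.5, and Hits@10 = 0.75.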

Baselines
We conducted a comparative study of the performance of ERSP against several classical models designed in recent years, covering three classes of models: static KG reasoning models, interpolated TKG reasoning models, and extrapolated TKG reasoning models.

Static KG Reasoning Models

• ComplEx: ComplEx [34] introduces a complex domain space to deal with asymmetry in complex relations in KGs.
• RotatE: RotatE [35] defines the rotation vector from the head entity to the tail entity as a relational representation.
• ConvE: ConvE [36] adopts convolutional operations in a CNN to handle head entity embeddings and relation embeddings.
• ConvTransE: ConvTransE [37] extends the convolutional neural network (CNN) idea to the TransE model.
• R-GCN: R-GCN [25] is based on a message-passing GCN framework, processing the structural data of multiple relations in a KG.

Interpolated TKG Reasoning Models
• HyTE: HyTE [38] learns time-aware knowledge graph embeddings based on hyperplanes, embedding the temporal information into the entity-relation space.
• TTransE: TTransE [11] integrates temporal information into the embedding vectors of entities and relations.
• TA-DistMult: TA-DistMult [13] adopts a recurrent neural network to learn time-aware representations of relations.

Extrapolated TKG Reasoning Models
• CyGNet: CyGNet [6] analyzes historical repetitive facts and predicts future facts through a time-aware replication generation mechanism.
• RE-NET: RE-NET [15] uses a recurrent event encoder to capture global and local features.
• TANGO-DistMult/TANGO-Tucker: TANGO-DistMult and TANGO-Tucker [7] apply the idea of neural ordinary differential equations to multi-relational graphs, and calculate the final results with the score functions of DistMult and Tucker, respectively.
• RE-GCN: RE-GCN [17] captures the structural-dependent features and the sequential patterns of facts in the KG using a relation-aware GCN and gated recurrent components, respectively.
• xERTE: xERTE [16] utilizes a temporal relation attention mechanism to extract the causal features of temporal multi-relational data.

• GHT: GHT [18] captures temporal evolutionary information and transient structural information in KGs through a Transformer.
• rGalT: rGalT [41] utilizes an autoencoder structure to capture the interaction between historical facts and predicted facts.
• ReGAT: ReGAT [19] encodes and models historical facts and concurrent facts based on the attention mechanism.
• PPT: PPT [42] converts the task of temporal knowledge graph completion into a pre-trained language model task to capture semantic information.

Implementation Details
For all datasets, the entity and relation embedding dimensions are set to 200. For the ICEWS14, ICEWS05-15, ICEWS18, YAGO, and GDELT datasets, the optimal local history length m is set to 6, 2, 6, 2, and 12, respectively. For all datasets, we set the dropout rate of each layer to 0.2 and the number of GCN layers to 2. For the ConvTransE decoder, the number of kernels is set to 50 and the kernel size to 2 × 3 for all datasets. For parameter learning, we use the Adam optimizer [43] with a learning rate of 0.001.
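The reported hyperparameters can be collected in a single configuration sketch (the dictionary layout and key names are illustrative, not from the authors' code):

```python
# Hyperparameters as reported in the text; the local history length m
# varies per dataset.
CONFIG = {
    "embedding_dim": 200,
    "history_length": {"ICEWS14": 6, "ICEWS05-15": 2, "ICEWS18": 6,
                       "YAGO": 2, "GDELT": 12},
    "dropout": 0.2,
    "gcn_layers": 2,
    "decoder": {"name": "ConvTransE", "num_kernels": 50,
                "kernel_size": (2, 3)},
    "optimizer": {"name": "Adam", "lr": 0.001},
}
```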

Results of Entity Prediction
The experimental results of the entity prediction task are shown in Tables 2 and 3. The performance of ERSP on the five benchmark datasets consistently outperforms all baseline models. Specifically, ERSP outperforms the latest extrapolated baseline PPT because it not only comprehensively considers the similarity characteristics of entity relationships, but also integrates the similarity characteristics of the static attributes of entities. Compared with other models that only consider the characteristics of entity relationships, such as RE-NET, TANGO, GHT, and PPT, ERSP shows stronger performance. RE-NET uses recurrent neural networks to capture temporally adjacent entity relationship characteristics. The ERSP model outperforms RE-NET because it adopts an adaptive time gate network structure, which enables the model to transmit and update entity information at different time steps and flexibly captures the feature dependencies between entities. GHT, a neural temporal point process model based on the Transformer, uses its attention mechanism and relational continuous time-encoding function to learn entity and relation representations. ERSP outperforms GHT because it adopts a static graph similarity constraint mechanism to more comprehensively capture the static attribute similarity between entities, thereby enhancing entity prediction performance. The classical static baseline RotatE represents entity relationships through rotation operations; it is designed primarily for static knowledge graphs, so its capability to model temporal knowledge graphs is relatively limited. Moreover, as shown in Figure 4, the performance of RotatE declines when the dataset contains a large number of missing entities or relations.
In contrast, ERSP demonstrates excellent performance on temporal knowledge graphs containing a large number of missing entities and relations. It not only captures the similarity characteristics of entity relationships, but also models the similarity characteristics of static attributes, showing superior performance on each benchmark dataset. ERSP performs particularly well on datasets with static graphs, such as the ICEWS datasets. Therefore, ERSP shows greater applicability in dealing with temporal knowledge graphs.

Results of Relation Prediction
For relation prediction, considering that some models are not designed specifically for this task, we select temporal models suited to relation prediction. As shown in Table 4, ERSP performs well in the relation prediction task and outperforms all baseline models. Compared with the classical baseline model RE-GCN, previous models often overlook the similarity characteristics of relations. Based on the relational similarity vector and the gated recurrent neural network unit, ERSP can not only capture the similarity characteristics of relations in historical facts, but also further capture the evolutionary representation of relations. In addition, the gated recurrent unit effectively alleviates the vanishing-gradient problem. Therefore, ERSP is more advantageous in relation prediction tasks. 1 The results are taken from [17].
When faced with datasets containing more relations, such as ICEWS14 and ICEWS05-15, the performance of ERSP improves significantly, further validating the results observed in the entity prediction task. This indicates that ERSP has greater advantages in dealing with data with more complex relational structures.
According to the results in Tables 2 and 3, the performance improvement of ERSP is particularly significant on the ICEWS14, ICEWS18, and ICEWS05-15 datasets, which have a large number of timestamps. This further confirms the effectiveness of the static graph similarity constraint mechanism in modeling the similarity features of static attributes. In addition, for ICEWS05-15, GDELT, and other datasets containing a large number of facts, ERSP significantly improves performance, further demonstrating the effectiveness of considering the similarity features of entity relationships.


Comparison of Different Embedding Dimensions
In this work, to examine the influence of embedding dimensions, we conduct experiments on the ERSP model using the ICEWS14 dataset with different dimension settings n ∈ {100, 200, 300, 400, 500}, while keeping other hyperparameters unchanged. As shown in Figure 5, ERSP maintains excellent performance in both low and high dimensions. As the embedding dimension increases, performance initially improves, while overall performance begins to decline after a critical point. This shows that although higher embedding dimensions can improve model performance, excessive dimensions also bring additional computational cost.

Comparison of Different History Lengths
In this work, we investigate the impact of history length on TKG inference methods and plot performance results over history lengths ranging from 1 to 10. As shown in Figure 6, as the history length gradually increases, the overall performance of the ERSP model improves, clearly indicating the usefulness of historical information for inference tasks. However, when the history length is too long, it may introduce redundant information from different timestamps, resulting in unnecessary computational cost during the learning process.


Ablation Study
To gain a deeper understanding of the impact of different model components on the effectiveness of model reasoning, we conducted an ablation study based on the YAGO, GDELT, ICEWS14, ICEWS05-15, and ICEWS18 datasets.
Table 5 shows the ablation results for the different modules in our model. We find that the static graph similarity constraint mechanism (SGSC) has the most significant impact on performance. Removing SGSC significantly decreases performance on the four datasets, emphasizing the importance of extracting similarity features of entity static attributes for prediction. In addition, we notice that removing the relation-aware gating unit (RGU) or the adaptive time gate network structure (ATGN) causes only slight decreases across all datasets compared with removing SGSC. In fact, the relation-aware gating unit combines the gated recurrent neural network unit with relationship similarity, making it easier to capture the evolution of entity relationships in the temporal knowledge graph. The adaptive time gate network structure helps the ERSP model better understand the dynamic changes of entities during temporal evolution, and flexibly captures the temporal correlation and feature dependence of entity information between adjacent time steps. Therefore, these results further show that capturing more comprehensive similarity features and dynamic changes in entity relationships, as well as similarity features of static attributes, is helpful for prediction.

Future Directions
We have demonstrated the superiority of ERSP in reasoning and prediction tasks for TKGs. In future work, this idea can be applied to other TKG-based tasks, such as knowledge graph-based question-answering systems, healthcare systems, and intelligent recommendation systems. It can also be extended by combining large graph models (LGMs) and transferable graph learning [45]. Large graph models learn general knowledge from large amounts of graph data and combine deep learning methods to handle more complex graph structures and features. Transferable graph learning can be combined with multimodal data to achieve more comprehensive knowledge representation and processing. Both directions have very broad application scenarios.
For this model, we aim to address the following two challenges concerning data and methods: (1) Sparsity of relational data. For sparse temporal knowledge graph datasets, we will introduce more auxiliary information or external data sources to overcome the problem of insufficient data. (2) High model complexity. We will focus on methods for optimizing computational efficiency, adopting more efficient graph convolutional network structures or parameter optimization techniques to reduce computational complexity.

Conclusions
In this paper, we introduce the ERSP model, a reasoning approach for temporal knowledge graphs that focuses on discerning the similarity in relationships between entities. The evolution of entities and relations is learned by comprehensively capturing the similarity features of entity relationships. ERSP also incorporates the similarity features of the captured entity static attributes (such as entity types) into the evolutionary representation, using the evolutionary representation at the final timestamp for temporal reasoning in combination with a scoring function. Experiments on five benchmark datasets show that ERSP is effective and significantly superior in entity prediction and relation prediction.

Figure 1 .
Figure 1. An example of a temporal knowledge graph.


Figure 2.
Figure 2. Example of different historical facts associated with a query from GDELT. Different arrows indicate similarity information.

In this work, we propose a representation learning model based on Entity Relationship Similarity Perception (ERSP) for modeling and reasoning over temporal knowledge graph sequences. Specifically, the model captures the evolutionary representations of entities and relations in TKGs at different timestamps through multiple components: (1) the entity-aware component, which models the entity information at each timestamp in the TKG; (2) the relationship similarity-aware component, which models the relations and their similarity feature information at each timestamp; and (3) the static attribute similarity-aware component, which constrains the static attributes of entities and mines the similarity features of static attributes.

Figure 3.
Figure 3. An illustrative diagram of the proposed ERSP model. The RS component represents the relationship similarity-aware component. The SAS component represents the static attribute similarity-aware component. ATGN represents the adaptive time gate network. SGC-Sim represents the static graph similarity constraint.

Entity-Aware Component

Graph Convolutional Network (GCN) Structure


W^η_r and W^η_o represent the weight matrices related to the relation r and the object entity o at layer η, respectively; →s^η and →o^η represent the embeddings of entities s and o at layer η; and →r represents the embedding of relation r at layer η.

Figure 4 .
Figure 4. Performance (in percentage) of the entity prediction task with ICEWS14 and ICEWS05-15.

Figure 5 .
Figure 5. Performance (in percentage) of various embedding dimensions with ICEWS14.


Figure 6 .
Figure 6. Performance (in percentage) of different history length settings with ICEWS14.


Table 1 .
Symbols and descriptions.
sigm(•) denotes the sigmoid function; H_t and R_t denote the entity embedding matrix and the relation embedding matrix at timestamp t, respectively; and →s_t, →r_t, and →o_t denote the embeddings of the subject entity s, relation r, and object entity o in H_t and R_t, respectively.

Algorithm 1: Reasoning algorithm of ERSP
Input: Historical graph sequence G = (H, R, M, T), max_epoch
Output: The loss of the temporal reasoning task, L = αL_e + βL_r + L_s
1: H, R, T = Init()
2: for i = 1 to max_epoch
3:   for s in H

Table 2 .
Performance (in percentage) of the entity prediction task with ICEWS14, ICEWS05-15, and ICEWS18. The best results are highlighted in bold. The second-best results are highlighted by underlining. (Higher values indicate better performance.)

Table 3 .
Performance (in percentage) of the entity prediction task with YAGO and GDELT. The best results are highlighted in bold. The second-best results are highlighted by underlining.

Table 4 .
Performance (in percentage) of the relation-prediction task with ICEWS18, ICEWS14, ICEWS05-15, YAGO, and GDELT. The best results are highlighted in bold. The second-best results are highlighted by underlining.


Table 5 .
Ablation studies on the ERSP model. The best results are in bold. RGU is the relation-aware gating unit, SGSC is the static graph similarity constraint mechanism, and ATGN is the adaptive time gate network structure.
