TBRm: A Time Representation Method for Industrial Knowledge Graph

Cao, Keyan; Zheng, Chuang

doi:10.3390/app122211316


by Keyan Cao * and Chuang Zheng

School of Computer Science and Engineering, Shenyang Jianzhu University, Shenyang 110168, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(22), 11316; https://doi.org/10.3390/app122211316

Submission received: 19 September 2022 / Revised: 29 October 2022 / Accepted: 30 October 2022 / Published: 8 November 2022

(This article belongs to the Special Issue Real-Time Systems and Industrial Internet of Things)


Abstract:

With the development of the artificial intelligence industry, the Knowledge Graph (KG), as a concise and intuitive form of data presentation, has received extensive attention from both academia and industry in recent years. At the same time, developments in the Internet of Things (IoT) have empowered modern industries to implement large-scale IoT ecosystems, such as the Industrial Internet of Things (IIoT). Using KGs to process IIoT data is a research field worthy of attention, but most existing knowledge graph technologies concentrate on static knowledge graphs, which are composed of triples. In fact, many graphs also contain dynamic information, such as changes over time at nodes and at edges; such knowledge graphs are called Temporal Knowledge Graphs (TKGs). We consider the temporal knowledge graph in terms of projection and transformation between spaces. To incorporate temporal information, we propose a new representation of the temporal knowledge graph, namely TBRm, which adds a temporal dimension to the translational distance model and uses relational predicates to add a representation in the time dimension. We evaluate the proposed method on knowledge graph completion tasks using four benchmark datasets. Experiments demonstrate the effectiveness of the TBRm representation in the temporal dimension. The method is also applied to an IIoT network security dataset. The practical results show that TBRm achieves good performance in assessing the degree of harm to IIoT network security.

Keywords:

Industrial Internet of Things; knowledge graph; relational mapping; temporal incremental

1. Introduction

The Industrial Internet of Things (IIoT) and the Knowledge Graph (KG) have both developed rapidly in recent years. IIoT applications in the industrial field are highly sensitive and critical; an example is the Industrial Control System (ICS), which integrates hardware and software to monitor and control the operation of the industrial environment and its related components [1]. A KG stores facts about the real world as a collection of triples. Each triple in the KG is represented as (s, p, o), where s and o represent the subject and object, and p represents the predicate connecting them. The link prediction problem is to find the most suitable entity for a query (s, p, ?) or (?, p, o) in order to complete the knowledge graph [2]. Our focus is on the temporal knowledge graph, which adds time information to the triples. As shown in Figure 1, the temporal knowledge graph essentially extends the traditional static knowledge graph in the time dimension. Link prediction then becomes finding the most likely completion under the given time information. That is to say, facts in a temporal KG can take the form (subject, predicate, object, timestamp) or (subject, predicate, object, time predicate, timestamp), which extends the general triple (s, p, o). For example, facts such as (Donald Trump, born, US, 1946) or (Donald Trump, President, US, occurs Since 2017-01) express time information about facts related to Donald Trump [3]. The former expresses a predicate that holds at a specific point in time, whereas the latter uses the time predicate “occurs Since” to express a time period (time range).

Recently, much research has been devoted to the representation learning of TKGs. Link prediction methods generally embed the subject and predicate of the triple and then rank candidates with a scoring function. The temporal representation of knowledge graphs remains challenging due to the sparsity and irregularity of dynamic temporal information. The CyGNet model [4] exploits the historical information of knowledge through a dedicated copy module, while a generation module predicts knowledge that appears for the first time. TA-DistMult [5] creates a temporal relationship by treating the characters in r and t as a sequence, and proposes a digit-level LSTM to learn fact representations that contain temporal information; these representations can be used directly with the existing scoring functions for KG completion. Building on these models, we improve the TransR model and propose a new TKG representation learning method. The proposed TBRm (Time boundary relationship mapping) represents time information as a single dimension. We map the relationship formed by the TransR model to the time dimension and embed the mapped time predicate into the standard scoring function of knowledge graph completion; recurrent neural networks are used to learn the time-aware representation of relationship types. We conduct extensive experiments on four benchmark TKG datasets, and the results show that TBRm can effectively model TKG data through relationships with temporal attributes.

In the direction of IIoT, taking cyberattacks on industrial IoT devices as an example, according to statistics from Kaspersky researchers, the number of cyberattacks on IoT devices jumped from 639 million in 2020 to 1.5 billion in 2021 [6]. Besides the attack itself, other factors affect the damage caused by network attacks. In this paper, based on the Edge-IIoTset dataset, we use the attacker (represented by IP address) as the subject, the sensor and brake (represented by IP address) as the object, and the attack type as the predicate to build a static knowledge graph. Furthermore, using the TBRm method, we take the duration of a network attack as an additional indicator of the degree of harm and form a temporal knowledge graph based on the degree of harm of network attacks. This adds a variable for judging the degree of harm and makes the basis for that judgment more convincing.

2. Related Work

With the development of artificial intelligence technology, interest in KG embedding tasks has grown, and the most successful methods are embedding-based techniques such as that proposed by Nayyeri et al. [7,8,9]. These techniques map entities and relationships into a continuous space and define a scoring function to infer missing information. For static knowledge graphs, i.e., knowledge graphs without temporal dynamic facts, one class of classic models is the translational distance models, such as TransE [10] and its extensions [11,12], which represent the two entities as vectors and model the relation as a translation vector. The TransH [13] model improves on TransE by representing the relationship with two vectors; the H in TransH stands for hyperplane. That paper was also the first to propose the unif and bern negative sampling methods. The TransR [14] model takes the projection onto a hyperplane in TransH one step further and projects into a space, which in essence converts the projection vector into a projection matrix. The entity is still represented by a vector, and the relationship is represented by a vector and a matrix. The improvement in effect is not large, but the amount of calculation increases significantly. The R in TransR stands for relation space. Another class of classic models is the semantic matching models, which represent relations as matrices, combine head and tail entities using multiplication, and use a triangular norm to determine how plausible a fact is [15,16]. As the technology has continued to evolve, other models based on neural networks with feedforward or convolutional layers [17,18] have received extensive attention for their better performance. However, these models do not incorporate the temporal dynamics of the facts.

For temporal knowledge graphs, Jiang et al. improved the TransE model and proposed the TTransE model [19], which embeds temporal information into the score function, captures the temporal order between relation types, and uses common sense to constrain it in order to generate more accurate link predictions. The HyTE [20] model maps each time point to its own hyperplane and then models the relationships on each hyperplane, which amounts to modeling triples. Although the temporal knowledge graph is a very large graph, a subgraph can be taken at each time point. In Temp [21], the temporal knowledge graph is divided into multiple subgraphs, each subgraph is convolved by a GNN, the temporal sequence across the graphs is modeled with an RNN, and the GNN results are concatenated. Such methods take into account representation learning in both the spatial and the temporal dimension.

Compared with the traditional static knowledge graph, the temporal knowledge graph carries additional time information: the knowledge in the graph is not static but changes over time. We divide the algorithm models into three categories according to how they process time information: temporal knowledge graph representation models with time constraints, temporal knowledge graph representation models with time series coding, and temporal knowledge graph representation models with path reasoning, as shown in Table 1.

3. Research Methods

Because TransE and TransR do not take the time of a fact into account, we propose TBRm, which maps the relational space to the temporal dimension through time-specific matrices.

3.1. Problem Statement

Compared with traditional knowledge graphs, Temporal Knowledge Graphs carry additional information. Each fact in the original knowledge graph is a triple, whereas each fact in the temporal knowledge graph is a quadruple, which includes not only the subject, object and relational predicate, but also the time point or time range in which the relational predicate holds. Figure 2 shows an example of including time in a knowledge graph. In the example, the fact that Donald Trump is the President of the United States is only true in the time period 2017–2021, not in all time periods. When we ask a question, we need to know the specific time to find exactly the tail entity we want. The facts in the temporal knowledge graph are therefore time-dependent, and it is very important to consider the time of facts.

In terms of network security, Figure 3 shows the requirements knowledge graph based on the IIoT network security system. Determining the degree of harm caused by network attacks, and deploying defenses in advance with a focus on potentially harmful attacks, helps to ensure the network security of the IIoT. Clearly, different types of attacks cause different degrees of harm, but it is worth considering which variables, besides the attack type, can be added to make the assessment of the degree of harm more accurate.

In a TKG, each fact has a relation (or predicate) $p \in R$ with subject entity $s \in E$ and object entity $o \in E$ within time $t \in T$, where $E$ and $R$ represent the vocabulary sets corresponding to entities and relation predicates, respectively, and $T$ is the set of time periods or timestamps (if they exist). The bold symbols s, p, o, t represent the embedding vectors of subject entity s, predicate p, object entity o, and time t in the factual events with temporal information. Let $G_{t}$ denote the snapshot of the TKG in time period $t = (t_{b}, t_{e})$, where $t_{b}$ denotes the time when a relationship starts and $t_{e}$ denotes the time when it ends. $g = (s, p, o, t)$ represents a quadruple fact in $G_{t}$. Both TransE and TransH embed entities and relations in the same space $\mathbb{R}^{k}$. In TransR, for each triple (s, p, o), the entity embeddings are set to s, o $\in \mathbb{R}^{k}$ and the relation embedding is set to r $\in \mathbb{R}^{d}$. Here, the dimensions of the entity embedding and the relation embedding may differ, that is, k ≠ d. However, none of these three classical models contains temporal information. To solve this problem, we propose a new method, named TBRm, that projects triples in the relational space onto an extended temporal space and translates them in the temporal space.
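As a minimal sketch of the quadruple notation, the following Python fragment stores a fact g = (s, p, o, t) with t = (t_b, t_e) and derives the time value used in Section 3.2 (the duration t_e − t_b, or t_b itself for a timestamp fact). The class name `Quadruple`, the method `time_value`, and the use of `None` for a missing end time are illustrative assumptions, not part of the published code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quadruple:
    """A temporal fact g = (s, p, o, t) with t = (t_b, t_e)."""
    s: str                        # subject entity
    p: str                        # relation predicate
    o: str                        # object entity
    t_b: float                    # time when the relationship starts
    t_e: Optional[float] = None   # end time; None marks a timestamp-only fact (assumption)

    def time_value(self) -> float:
        """Return t_e - t_b, or t_b when the fact is a timestamp (t_e == t_b)."""
        if self.t_e is None or self.t_e == self.t_b:
            return self.t_b
        return self.t_e - self.t_b


# The two example facts from the Introduction.
president = Quadruple("Donald Trump", "President", "US", 2017, 2021)
born = Quadruple("Donald Trump", "born", "US", 1946)
```

Here `president.time_value()` gives the four-year span, while `born.time_value()` falls back to the timestamp 1946.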

3.2. Model Structures

For each relation predicate p in a triple, TransR sets a projection matrix $M_{p} \in \mathbb{R}^{k \times d}$, which projects entities from the entity space to the relation space. Based on this idea, we extend the relational space with a time dimension to form a time space that is perpendicular to the relational space. We define the vertical projection matrix $V_{t} \in \mathbb{R}^{d \times d}$, which maps the entity set and relation set to the time space and reconstructs the triples under the restriction of the time dimension, so that the relation predicate can be constrained by its duration. The overall structure of TBRm is shown in Figure 4.

The scoring function $f$ is used by embedding methods for KG completion. The function operates on the embeddings of the subject $e_{s}$, the object $e_{o}$, and the predicate $e_{p}$ of the triple in the time dimension. To represent the duration (time span) of a relational predicate, let $t = t_{e} - t_{b}$. Note that when the time of the relational predicate is given as a timestamp, only $t_{b}$ is taken as the current time vector; that is, when $t_{e} = t_{b}$, $t = t_{b}$. The value of a scoring function indicates how plausible a triple is; for the distance-based functions below, a smaller value corresponds to a more plausible triple. Classic examples of scoring functions are:

1. TransE [10]:

$$f_{t} (s, p, o) = {\| e_{s} + e_{p} - e_{o} \|}_{2} \qquad (1)$$

where $e_{s}, e_{o} \in \mathbb{R}^{d}$ are the embeddings of the subject and object entities, $e_{p} \in \mathbb{R}^{d}$ is the embedding of the relational predicate, and ${\| \cdot \|}_{2}$ denotes the two-norm.
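The TransE score of Equation (1) can be sketched in a few lines of plain Python. The function name `transe_score` and the toy two-dimensional embeddings are assumptions chosen only for illustration; real embeddings are learned and much larger.

```python
import math


def transe_score(e_s, e_p, e_o):
    """Equation (1): two-norm of e_s + e_p - e_o (smaller means more plausible)."""
    return math.sqrt(sum((a + b - c) ** 2 for a, b, c in zip(e_s, e_p, e_o)))


# Toy embeddings: good_o lies where e_s + e_p points, bad_o does not.
e_s = [0.1, 0.2]
e_p = [0.3, 0.1]
good_o = [0.4, 0.3]
bad_o = [-0.5, 0.9]

good = transe_score(e_s, e_p, good_o)
bad = transe_score(e_s, e_p, bad_o)
```

Because the model treats the predicate as a translation from subject to object, the matching object receives a score close to zero and the mismatched one a clearly larger score.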

2. TransR [14]:

$$f_{p} (s, o) = {\| s_{p} + p - o_{p} \|}_{2}^{2} \qquad (2)$$

where $s_{p} = s M_{p}$ and $o_{p} = o M_{p}$. At the same time, the embeddings and mapping matrices must be constrained, that is, ${\| s \|}_{2} \leq 1$, ${\| o \|}_{2} \leq 1$, ${\| p \|}_{2} \leq 1$, ${\| s_{p} \|}_{2} \leq 1$, ${\| o_{p} \|}_{2} \leq 1$. These scoring functions do not consider temporal information.
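To make the projection step of Equation (2) concrete, the sketch below projects k-dimensional entity vectors into a d-dimensional relation space with a k × d matrix and then applies the translation. The helper `vec_mat`, the function `transr_score`, and the toy values (k = 3, d = 2) are illustrative assumptions; the norm constraints are not enforced here.

```python
def vec_mat(v, M):
    """Row vector v (length k) times matrix M (k x d) gives a vector of length d."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def transr_score(s, p, o, M_p):
    """Equation (2): squared two-norm of s_p + p - o_p with s_p = s M_p, o_p = o M_p."""
    s_p = vec_mat(s, M_p)
    o_p = vec_mat(o, M_p)
    return sum((a + b - c) ** 2 for a, b, c in zip(s_p, p, o_p))


# Toy example: M_p keeps the first two entity coordinates and drops the third.
M_p = [[1.0, 0.0],
       [0.0, 1.0],
       [0.0, 0.0]]
s = [0.1, 0.2, 0.5]
o = [0.3, 0.3, -0.4]
p = [0.2, 0.1]

score = transr_score(s, p, o, M_p)
```

The third coordinate of each entity differs widely, yet the score is essentially zero because the relation-specific projection discards it; this is the sense in which entities and relations live in different spaces.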

To introduce the temporal information of relational predicates, TBRm uses the projection matrix $V_{t}$ to define the projections of relational predicates as:

$$p_{t}^{s} = p V_{t} t, \quad p_{t}^{o} = p V_{t} t \qquad (3)$$

where $p_{t}^{s}$ represents the projection of the entity (subject) set at the beginning of the relation predicate on the time dimension, and $p_{t}^{o}$ represents the projection of the entity (object) set at the end of the relation predicate on the time dimension.

The scoring function is correspondingly defined as:

$$f_{t} (s, p, o) = \| s_{p}^{t} + p_{t}^{o} - p_{t}^{s} - o_{p}^{t} \|_{2}^{2} \qquad (4)$$

where $s_{p}^{t} = s_{p} V_{t}$ and $o_{p}^{t} = o_{p} V_{t}$. The projections must also satisfy the constraints $\| s_{p}^{t} \|_{2} \leq 1$ and $\| o_{p}^{t} \|_{2} \leq 1$.
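The following sketch evaluates Equations (3) and (4) in plain Python; it is not the authors' implementation. Two assumptions are made and should be read as such: the time value is treated as a scalar that scales the projected predicate, and, following the description of $p_{t}^{s}$ as the projection at the beginning of the predicate and $p_{t}^{o}$ as the projection at its end, the usage example feeds a start time into the first and an end time into the second. All function names and toy values are hypothetical.

```python
def vec_mat(v, M):
    """Row vector v times matrix M."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def project_predicate(p, V_t, t):
    """Equation (3): p V_t scaled by a scalar time value t (scalar t is an assumption)."""
    return [t * x for x in vec_mat(p, V_t)]


def tbrm_score(s_p, o_p, p_ts, p_to, V_t):
    """Equation (4): squared two-norm of s_p^t + p_t^o - p_t^s - o_p^t."""
    s_pt = vec_mat(s_p, V_t)   # s_p^t = s_p V_t
    o_pt = vec_mat(o_p, V_t)   # o_p^t = o_p V_t
    return sum((a + b - c - d) ** 2
               for a, b, c, d in zip(s_pt, p_to, p_ts, o_pt))


# Toy values in the relation space (d = 2); V_t is the identity for readability.
V_t = [[1.0, 0.0], [0.0, 1.0]]
s_p = [0.1, 0.2]
o_p = [0.3, 0.3]
p = [0.2, 0.1]

# Assumption for illustration: normalized start time 0.0 and end time 1.0.
p_ts = project_predicate(p, V_t, 0.0)
p_to = project_predicate(p, V_t, 1.0)

score = tbrm_score(s_p, o_p, p_ts, p_to, V_t)
```

With these toy values the difference between the two predicate projections equals the translation from subject to object, so the score is close to zero. If both projections are given the same time value they cancel, and the score reduces to the squared distance between the projected subject and object.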

3.3. Training Target

The task is to predict the (object) entity given a query (s, p, ?, t). The learning goal is to minimize the loss L over all TKG snapshots that exist during training. Following the training method of [29], we use the following margin-based loss as the training objective:

$$L = \sum_{(s, p, o) \in S} \sum_{(s', p', o') \in S'} \max (0, f_{t} (s, p, o) + α - f_{t} (s', p', o')) \qquad (5)$$

where max(x, y) returns the larger of x and y, α is the margin, S is the set of correct triples, and S′ is the set of incorrect triples.
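A literal reading of Equation (5) can be sketched as follows. The function name `margin_loss` and the example scores are assumptions for illustration; practical implementations usually pair each correct triple with a few sampled negatives rather than summing over every pair.

```python
def margin_loss(pos_scores, neg_scores, alpha):
    """Equation (5): sum over all (correct, incorrect) pairs of
    max(0, f(correct) + alpha - f(incorrect))."""
    return sum(max(0.0, f_pos + alpha - f_neg)
               for f_pos in pos_scores
               for f_neg in neg_scores)


# One correct triple with score 0.2 and two corrupted triples.
# The first negative (1.5) already exceeds the margin and contributes nothing;
# the second (0.4) is too close and contributes 0.2 + 0.5 - 0.4.
loss = margin_loss([0.2], [1.5, 0.4], alpha=0.5)
```

The loss only penalizes negatives whose score is not at least α larger than the score of the correct triple, which pushes correct triples toward small distances and incorrect ones toward large distances.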

Existing knowledge graphs only contain correct triples. It is common practice to corrupt correct triples $(s, p, o) \in S$ by replacing entities in order to construct incorrect triples $(s', p', o') \in S'$. When corrupting triples, we follow [30,31,32,33,34,35,36,37,38,39,40,41] and assign different probabilities to head and tail entity replacement.
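One common way to assign different probabilities to head and tail replacement is the bern strategy introduced with TransH, where the head is replaced with probability tph / (tph + hpt) (tph: average number of tails per head, hpt: average number of heads per tail, both per predicate). The sketch below assumes that strategy; the source does not state which scheme was actually used, and the function names, entity identifiers, and toy triples are hypothetical.

```python
import random
from collections import defaultdict


def head_replace_prob(triples):
    """Per-predicate probability of replacing the subject: tph / (tph + hpt)."""
    tails_of = defaultdict(set)   # (p, s) -> objects seen with that subject
    heads_of = defaultdict(set)   # (p, o) -> subjects seen with that object
    for s, p, o in triples:
        tails_of[(p, s)].add(o)
        heads_of[(p, o)].add(s)
    probs = {}
    for p in {p for _, p, _ in triples}:
        tph_counts = [len(v) for (q, _), v in tails_of.items() if q == p]
        hpt_counts = [len(v) for (q, _), v in heads_of.items() if q == p]
        tph = sum(tph_counts) / len(tph_counts)
        hpt = sum(hpt_counts) / len(hpt_counts)
        probs[p] = tph / (tph + hpt)
    return probs


def corrupt(triple, entities, known, head_prob, rng):
    """Replace the subject (with probability head_prob) or the object,
    resampling until the result is not a known correct triple."""
    s, p, o = triple
    while True:
        e = rng.choice(entities)
        candidate = (e, p, o) if rng.random() < head_prob else (s, p, e)
        if candidate not in known:
            return candidate


triples = [("a1", "DDoS", "s1"), ("a1", "DDoS", "s2"),
           ("a1", "DDoS", "s3"), ("a2", "MITM", "s1")]
probs = head_replace_prob(triples)
negative = corrupt(triples[0], ["a1", "a2", "s1", "s2", "s3"],
                   set(triples), probs["DDoS"], random.Random(0))
```

In the toy data one attacker hits three sensors with the same attack type, so replacing the subject is favored for that predicate: replacing the object would more often produce a triple that is actually true.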

4. Experiment and Discussion

In this section, we demonstrate the effectiveness of TBRm using four public IIoT datasets. We first explain the experimental settings in detail, including the baselines and datasets. We then analyze and discuss the experimental results and conduct a comparative study to evaluate the advantages of TBRm over the baseline methods. Finally, we apply the proposed representation method to the Edge-IIoTset dataset to demonstrate the feasibility of TBRm on the Industrial Internet of Things. The code can be found at https://github.com/Dash69dash/temporalKG (accessed on 27 October 2022).

4.1. Dataset

We conducted link prediction experiments with TBRm on four benchmark datasets related to the industrial Internet. In addition, to see how the proposed method performs on the Industrial Internet of Things, we used another IIoT dataset to test the representation effect of TBRm. Table 2 summarizes the statistics of the datasets.

The Bosch production line internal fault dataset (BPLP) [42] describes the measurements of parts as they move through the Bosch production line. Each part has a unique ID and contains a large number of anonymized features. Features are named according to a convention that identifies the production line, the workstation on the production line, and the feature number. We separate the files according to the types of features they contain (numerical and categorical), and take parts and workstations on the production line as subject and object entities and fault types as relational predicates in the triple. Finally, there is a file with date features, which provide a timestamp for each measurement.

The MOOC Platform User Behavior Data Set (MOOC-Ub) [43] includes the learning activities of all users on the school’s online platform from August 2015 to August 2017. Each JSON file contains user tracking logs for a specific period of time. User information covers gender, year of birth and education level. Course information includes the course start date, course end date, course category, and course type. We extract and filter the user ID as the subject entity, the course category and type as the object entity, and the user information and course information as the relational predicate. Combined with the start and end times of the course, these form the temporal knowledge graph used in the experiments.

The NFT Ethereum transaction dataset (NFT) [44] contains the transaction activities of Ethereum non-fungible tokens (NFT) from 1 April 2021 to 25 September 2021. It is constructed purely from on-chain data and represents the activities of 9292 NFT smart contracts on the Ethereum blockchain during the period. The Information Exposure from Consumer IoT Devices (IE-IoTD) [45] dataset processed and analyzed the information leakage of 81 devices located in laboratories in the United States and the United Kingdom. We extracted a total of 23,475 triples with timestamps.

The Edge-IIoTset [46] dataset identifies and analyzes 14 types of network attack. These attacks fall into five threat categories, namely DoS/DDoS attacks, information gathering, man-in-the-middle attacks, injection attacks, and malware attacks. The dataset also records the start time and end time of an attack (or the time point of the attack). Edge-IIoTset’s IoT data is generated by various IoT devices (more than 10 types), such as low-cost digital sensors for sensing temperature and humidity, ultrasonic sensors, water level detection sensors, pH sensor meters, soil moisture sensors, heart rate sensors, and flame sensors. The Edge-IIoTset dataset also records the IP addresses of the attacker and of the sensor being attacked.

4.2. Experimental Setup

Baseline Methods. We compare our proposed method with static KGE and TKGE models from previous studies. The static KGE methods used in this article are TransE [10] and DistMult [16]. The temporal methods are TTransE [20], HyTE [21], and TA-DistMult [5].

Evaluation Protocol. Following previous work [29], we extracted the triple data that meet our experimental requirements from the four Industrial Internet of Things datasets and divided them into training, validation, and test sets in proportions of 80%/10%/10%, respectively, according to the characteristics of the data [47,48,49,50]. We report the mean reciprocal rank (MRR) and Hits@1/3/10 (the proportion of test cases whose correct answer is ranked in the top 1/3/10) to measure the performance of our model and the comparison models [51,52,53,54,55]. MRR is calculated as follows:

$$\mathrm{MRR} = \frac{1}{|S|} \sum_{i = 1}^{|S|} \frac{1}{\mathrm{rank}_{i}} = \frac{1}{|S|} \left( \frac{1}{\mathrm{rank}_{1}} + \frac{1}{\mathrm{rank}_{2}} + \dots + \frac{1}{\mathrm{rank}_{|S|}} \right) \qquad (6)$$

where S is the set of triples, |S| is the number of triples, and $\mathrm{rank}_{i}$ is the link prediction rank of the i-th triple.

Hits@n is the proportion of triples whose rank in link prediction is at most n [56,57]. It is calculated as follows:

$$\mathrm{Hits}@n = \frac{1}{|S|} \sum_{i = 1}^{|S|} I (\mathrm{rank}_{i} \leq n) \qquad (7)$$

where the symbols are the same as in the MRR formula and $I(\cdot)$ is the indicator function (1 if the condition is true, 0 otherwise). n usually takes the values 1, 3, and 10.
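Both metrics are simple functions of the list of ranks, as the short sketch below shows. The function names and the five example ranks are illustrative assumptions, not results from the paper.

```python
def mrr(ranks):
    """Equation (6): mean of the reciprocal ranks."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at(ranks, n):
    """Equation (7): fraction of test triples whose rank is at most n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)


# Ranks of the correct entity for five hypothetical test queries.
ranks = [1, 2, 4, 10, 20]

mrr_value = mrr(ranks)        # (1 + 0.5 + 0.25 + 0.1 + 0.05) / 5
hits1 = hits_at(ranks, 1)     # 1 of 5 queries
hits3 = hits_at(ranks, 3)     # 2 of 5 queries
hits10 = hits_at(ranks, 10)   # 4 of 5 queries
```

MRR rewards answers near the top much more strongly than Hits@10 does: moving the last query from rank 20 to rank 11 changes neither Hits@10 nor, by much, the MRR, whereas moving it to rank 1 raises MRR substantially.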

Model configuration. We exclude empty accounts that have not been used for a long time from the MOOC-Ub dataset and use the filtered triples and times for validation. The coefficient α is tuned from 0.1 to 0.9 with a step size of 0.1. The batch size is set to 1024. The number of training epochs is limited to 50, which is sufficient for convergence in most cases. The embedding dimension is set to 200 to be consistent with the setting of Jin et al. [30]. The baseline results are also taken from Jin et al. [30].

Selection of data in Edge-IIoTset dataset. Besides the data described in Section 4.1, the Edge-IIoTset dataset also contains 61 flow features and two new attributes, which are stored together in a CSV file. We use Python programs to extract the information we need from it, including the IP addresses of the network attacker and the attacked sensor, the attack methods used by different attackers on the sensor, and the times at which the network attack started and ended. We regard the attackers’ IP addresses as the subject entity set, the attacked sensors (IP addresses) as the object entity set, and the attack methods as the predicate set. These sets are used to form the triples of the knowledge graph. Because the times of network attacks are recorded in different units (milliseconds, seconds, minutes, and hours), the time data are made dimensionless using Z-score normalization:

$$t_{b^{*}} = \frac{t_{b} - \mu_{b}}{\sigma_{b}}, \quad t_{e^{*}} = \frac{t_{e} - \mu_{e}}{\sigma_{e}} \qquad (8)$$

where $\mu_{b}$ and $\mu_{e}$ are the mean values of $t_{b}$ and $t_{e}$ within the same time unit category, respectively, and $\sigma_{b}$ and $\sigma_{e}$ are the standard deviations of $t_{b}$ and $t_{e}$ within the same time unit category, respectively.
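The per-category normalization of Equation (8) can be sketched with the Python standard library. The record layout `(unit, t_b, t_e)`, the function name, and the toy values are assumptions; the sketch also assumes the population standard deviation and maps a zero-variance group to 0.0, neither of which is specified in the text.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_by_unit(records):
    """Apply Equation (8) separately within each time unit category.

    records: list of (unit, t_b, t_e); returns (t_b*, t_e*) pairs in the same order.
    """
    groups = defaultdict(list)
    for unit, t_b, t_e in records:
        groups[unit].append((t_b, t_e))

    stats = {}
    for unit, rows in groups.items():
        starts = [b for b, _ in rows]
        ends = [e for _, e in rows]
        # Population standard deviation is an assumption of this sketch.
        stats[unit] = (mean(starts), pstdev(starts), mean(ends), pstdev(ends))

    normalized = []
    for unit, t_b, t_e in records:
        mu_b, sd_b, mu_e, sd_e = stats[unit]
        normalized.append(((t_b - mu_b) / sd_b if sd_b else 0.0,
                           (t_e - mu_e) / sd_e if sd_e else 0.0))
    return normalized


# Hypothetical attack records: three measured in seconds, two in minutes.
records = [("s", 0, 10), ("s", 2, 14), ("s", 4, 18),
           ("min", 1, 3), ("min", 3, 7)]
normalized = zscore_by_unit(records)
```

Normalizing within each unit category means a 3-minute attack and a 3-second attack are each compared with attacks recorded in the same unit, so the resulting values are unit-free and comparable in the embedding.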

4.3. Experimental Results

Table 3 and Table 4 report the link prediction results of TBRm and the baseline methods on the four IIoT datasets. On the datasets in Table 3, the static KGE methods lag far behind TA-DistMult and TTransE because they cannot capture temporal dynamics. It can also be observed that the static KGE methods usually perform better than HyTE. We believe this is because HyTE slices the time-evolving knowledge graph into multiple static knowledge graphs that are represented independently, and therefore lacks coherence across time updates. Table 3 also shows that TBRm is significantly better than the other baseline methods on BPLP and MOOC-Ub.

We also observed in Table 4 that TBRm’s performance on NFT and IE-IoTD is not always the best, especially on NFT. This is due to the excessive concentration of time information in the NFT dataset: the time intervals carried by the facts become smaller, which makes triples harder to distinguish. The HyTE method, which slices the temporal knowledge graph by time, performs well on this dataset. Although TBRm performs better on other datasets with a more balanced distribution of time information, addressing this shortcoming of TBRm is a meaningful direction for further research.

Table 5 shows the four performance indicators of TBRm on the Edge-IIoTset dataset, expressed as percentages. Except for Hits@1, TBRm achieves relatively high scores on the other three indicators. Our analysis found that in the Edge-IIoTset dataset the number of attacker subject entities is very large, whereas only 14 kinds of attack act as predicates, so the gap between the size of the subject entity set and that of the predicate set is pronounced. Therefore, during link prediction, the probability of ranking the correct answer first becomes very small, and the Hits@1 percentage decreases accordingly.

5. Conclusions

Describing and inferring knowledge graphs with time constraints is a challenging problem. In this paper, we exploit a mapping mechanism to address this problem, assuming that future facts can be predicted from historical facts. The proposed TBRm selects future facts based on known facts that appeared in the past. The results show that TBRm performs well in predicting future facts in TKGs. We propose a time-constrained relational mapping method to learn temporal knowledge graph fact representations that can be used in conjunction with current link prediction scoring functions. Experiments on five temporal knowledge graph datasets demonstrate the effectiveness of the method and its feasibility on the Industrial Internet of Things.

Author Contributions

Conceptualization, K.C. and C.Z.; methodology, K.C. and C.Z.; software, C.Z.; validation, C.Z.; formal analysis, C.Z.; investigation, C.Z.; resources, K.C.; data curation, K.C.; writing—original draft preparation, C.Z.; writing—review and editing, K.C. and C.Z.; visualization, K.C.; supervision, K.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Intelligent IOT and Integrated security of industrial information physics systems, Key projects of the National Natural Science Foundation of China, 2022-01-01 to 2026-12-31, grant number 62133014. The APC was funded by the National Natural Science Foundation of China.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Huong, T.T.; Bac, T.P.; Long, D.M.; Luong, T.D.; Dan, N.M.; Quang, L.A.; Cong, L.T.; Thang, B.D.; Tran, K.P. Detecting cyberattacks using anomaly detection in industrial control systems: A federated learning approach. Comput. Ind. 2021, 132, 103509. [Google Scholar] [CrossRef]
Nickel, M.; Murphy, K.; Tresp, V.; Gabrilovich, E. A Review of Relational Machine Learning for Knowledge Graphs. Proc. IEEE 2016, 104, 33. [Google Scholar] [CrossRef]
Ding, Z.; Zhao, R.; Zhang, J.; Gao, T.; Xiong, R.; Yu, Z.; Huang, T. Spatio-Temporal Recurrent Networks for Event-Based Optical Flow Estimation. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 22 February–1 March 2022; Volume 36, pp. 525–533. [Google Scholar]
Zhu, C.; Chen, M.; Fan, C.; Cheng, G.; Zhang, Y. Learning from History: Modeling Temporal Knowledge Graphs with Sequential Copy-Generation Networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; Volume 35, pp. 4732–4740. [Google Scholar]
García-Durán, A.; Dumančić, S.; Niepert, M. Learning Sequence Encoders for Temporal Knowledge Graph Completion. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 4816–4821. [Google Scholar]
Manxin, T.; Lidan, S.; Ke, C.; Dawei, J.; Gang, C. A Knowledge Representation Method Based on Entity Time Sensitivity. Softw. Eng. 2020, 23, 1–6. [Google Scholar]
Nayyeri, M.; Vahdati, S.; Aykul, C.; Lehmann, J. 5* Knowledge Graph Embeddings with Projective Transformations. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; Volume 35, pp. 9064–9072. [Google Scholar]
Zhou, X.; Niu, L.; Zhu, Q.; Zhu, X.; Liu, P.; Tan, J.; Guo, L. Knowledge Graph Embedding by Double Limit Scoring Loss. IEEE Trans. Knowl. Data Eng. 2022, 34, 5825–5839. [Google Scholar] [CrossRef]
Rossi, A.; Barbosa, D.; Firmani, D.; Matinata, A.; Merialdo, P. Knowledge Graph Embedding for Link Prediction: A Comparative Analysis. ACM Trans. Knowl. Discov. Data 2021, 15, 1–49. [Google Scholar] [CrossRef]
Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. Adv. Neural Inf. Process. Syst. 2013, 26, 2787–2795. [Google Scholar]
Wang, Z.; Zhang, J.; Feng, J.; Chen, Z. Knowledge Graph Embedding by Translating on Hyperplanes. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, Québec City, QC, Canada, 27–31 July 2014; pp. 1112–1119. [Google Scholar]
Xiong, S.; Huang, W.; Duan, P. Knowledge Graph Embedding via Relation Paths and Dynamic Mapping Matrix. In Advances in Conceptual Modeling. ER 2018. Lecture Notes in Computer Science; Woo, C., Lu, J., Li, Z., Ling, T., Li, G., Lee, M., Eds.; Springer: Cham, Switzerland, 2018; Volume 11158. [Google Scholar]
Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; Zhu, X. Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI’15), Austin, TX, USA, 25–30 January 2015; pp. 2181–2187. [Google Scholar]
Trouillon, T.; Welbl, J.; Riedel, S.; Gaussier, É.; Bouchard, G. Complex Embeddings for Simple Link Prediction. In Proceedings of The 33rd International Conference on Machine Learning; PMLR: London, UK, 2016; Volume 48, pp. 2071–2080. [Google Scholar]
Yang, B.; Yih, W.-T.; He, X.; Gao, J.; Deng, L. Embedding entities and relations for learning and inference in knowledge bases. arXiv 2015, arXiv:1412.6575. [Google Scholar]
Dettmers, T.; Minervini, P.; Stenetorp, P.; Riedel, S. Convolutional 2D Knowledge Graph Embeddings. In Proceedings of the AAAI Conference on Artificial Intelligence, Orleans, LA, USA, 2–7 February 2018; pp. 1811–1818. [Google Scholar]
Schlichtkrull, M.; Kipf, T.N.; Bloem, P.; van den Berg, R.; Titov, I.; Welling, M. Modeling Relational Data with Graph Convolutional Networks. In The Semantic Web. ESWC 2018. Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2018; Volume 10843.
Vashishth, S.; Sanyal, S.; Nitin, V.; Talukdar, P. Composition-based Multi-Relational Graph Convolutional Networks. In Proceedings of the International Conference on Learning Representations, Towson, MD, USA, 26–30 April 2020.
Jiang, T.; Liu, T.; Ge, T.; Sha, L.; Chang, B.; Li, S.; Sui, Z. Towards Time-Aware Knowledge Graph Completion. In Proceedings of the COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, 11–16 December 2016; pp. 1715–1724.
Dasgupta, S.S.; Ray, S.N.; Talukdar, P. HyTE: Hyperplane-based Temporally aware Knowledge Graph Embedding. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 2001–2011.
Li, Z.; Feng, S.; Shi, J.; Zhou, Y.; Liao, Y.; Yang, Y.; Li, Y.; Yu, N.; Shao, X. Future event prediction based on temporal knowledge graph embedding. Comput. Syst. Sci. Eng. 2023, 44, 2411–2423.
Available online: https://iottechnews.com/news/2021/sep/07/kaspersky-attacks-on-iot-devices-double-in-a-year/ (accessed on 19 October 2022).
Xu, C.; Nayyeri, M.; Alkhoury, F.; Yazdi, H.S.; Lehmann, J. Temporal knowledge graph embedding model based on additive time series decomposition. arXiv 2019, arXiv:1911.07893.
Leblay, J.; Chekol, M.W. Deriving Validity Time in Knowledge Graph. In Proceedings of the Web Conference 2018 (WWW ’18), International World Wide Web Conferences Steering Committee, Lyon, France, 23–27 April 2018; pp. 1771–1776.
Trivedi, R.; Farajtabar, M.; Biswal, P.; Zha, H. Dyrep: Learning representations over dynamic graphs. arXiv 2018, arXiv:1803.04051.
Bian, R.; Koh, Y.S.; Dobbie, G.; Divoli, A. Network embedding and change modeling in dynamic heterogeneous networks. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 21–25 July 2019; pp. 861–864.
Han, Z.; Chen, P.; Ma, Y.; Tresp, V. xERTE: Explainable reasoning on temporal knowledge graphs for forecasting future links. arXiv 2020, arXiv:2012.15537.
Zuo, Y.; Fang, Q.; Qian, S.; Zhang, X.; Xu, C. Representation Learning of Knowledge Graphs with Entity Attributes and Multimedia Descriptions. In Proceedings of the 2018 IEEE Fourth International Conference on Multimedia Big Data (BigMM), Xi’an, China, 13–16 September 2018; pp. 1–5.
Trivedi, R.; Dai, H.; Wang, Y.; Song, L. Know-evolve: Deep temporal reasoning for dynamic knowledge graphs. In Proceedings of the 34th International Conference on Machine Learning—Volume 70 (ICML’17), JMLR.org, Sydney, Australia, 6–11 August 2017; pp. 3462–3471.
Jin, W.; Jiang, H.; Qu, M.; Chen, T.; Zhang, C.; Szekely, P.; Ren, X. Recurrent Event Network: Global Structure Inference Over Temporal Knowledge Graph. 2019. Available online: https://openreview.net/forum?id=SyeyF0VtDr (accessed on 18 September 2022).
Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. J. Mach. Learn. Res.-Proc. Track 2010, 9, 249–256.
Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
Sevick, M.A.; Zickmund, S.; Korytkowski, M.; Piraino, B.; Sereika, S.; Mihalko, S.; Snetselaar, L.; Stumbo, P.; Hausmann, L.; Ren, D.; et al. Design, feasibility, and acceptability of an intervention using personal digital assistant-based self-monitoring in managing type 2 diabetes. Contemp. Clin. Trials 2008, 29, 396–409.
Shao, P.; Zhang, D.; Yang, G.; Tao, J.; Che, F.; Liu, T. Tucker decomposition-based temporal knowledge graph completion. Knowl.-Based Syst. 2022, 238, 107841.
Cai, B.; Xiang, Y.; Gao, L.; Zhang, H.; Li, Y.; Li, J. Temporal Knowledge Graph Completion: A Survey. arXiv 2022, arXiv:2201.08236.
Chaabouni, N.; Mosbah, M.; Zemmari, A.; Sauvignac, C.; Faruki, P. Network Intrusion Detection for IoT Security Based on Learning Techniques. IEEE Commun. Surv. Tutor. 2019, 21, 2671–2701.
Lei, Z.; Haq, A.U.; Zeb, A.; Suzauddola, M.; Zhang, D. Is the suggested food your desired? Multi-modal recipe recommendation with demand-based knowledge graph. Expert Syst. Appl. 2021, 186, 115708.
Tareq, I.; Elbagoury, B.M.; El-Regaily, S.; El-Horbaty, E.-S.M. Analysis of ToN-IoT, UNW-NB15, and Edge-IIoT Datasets Using DL in Cybersecurity for IoT. Appl. Sci. 2022, 12, 9572.
Yang, Y.; Wu, Z.; Yang, Y.; Lian, S.; Guo, F.; Wang, Z. A Survey of Information Extraction Based on Deep Learning. Appl. Sci. 2022, 12, 9691.
Liu, S.; Xu, M.; Qin, Y.; Lukač, N. Knowledge Graph Alignment Network with Node-Level Strong Fusion. Appl. Sci. 2022, 12, 9434.
Alissa, K.A.; Elkamchouchi, D.H.; Tarmissi, K.; Yafoz, A.; Alsini, R.; Alghushairy, O.; Mohamed, A.; Al Duhayyim, M. Dwarf Mongoose Optimization with Machine-Learning-Driven Ransomware Detection in Internet of Things Environment. Appl. Sci. 2022, 12, 9513.
Droby, A.; Kurar Barakat, B.; Saabni, R.; Alaasam, R.; Madi, B.; El-Sana, J. Understanding Unsupervised Deep Learning for Text Line Segmentation. Appl. Sci. 2022, 12, 9528.
Gyrard, A.; Boudaoud, K. Interdisciplinary IoT and Emotion Knowledge Graph-Based Recommendation System to Boost Mental Health. Appl. Sci. 2022, 12, 9712.
Trappey, A.J.C.; Liang, C.-P.; Lin, H.-J. Using Machine Learning Language Models to Generate Innovation Knowledge Graphs for Patent Mining. Appl. Sci. 2022, 12, 9818.
Semmler, N. Data-Driven Transfer Optimizations for Big Data in the Industrial Internet of Things; Technical University of Berlin: Berlin, Germany, 2022.
Ferrag, M.A.; Friha, O.; Hamouda, D.; Maglaras, L.; Janicke, H. Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications for Centralized and Federated Learning. TechRxiv 2022. preprint.
Umar, V.S.; Dixit, S.; Aggour, K.S.; Williams, J.W.; Cuddihy, P. On-demand Knowledge Graphs for Standards-Based Power Grid Data Provisioning. In Advances and Trends in Artificial Intelligence. From Theory to Practice. IEA/AIE 2021. Lecture Notes in Computer Science; Fujita, H., Selamat, A., Lin, J.C.W., Ali, M., Eds.; Springer: Cham, Switzerland, 2021; Volume 12799.
Channam, A.; Swarup, B.R.; Rao, S.G. Extraction of Recipes from Food Images by Using CNN Algorithm. In Proceedings of the 2021 Fifth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), Palladam, India, 11–13 November 2021; pp. 1308–1315.
Chen, K.; Ma, J.; Zhang, Q.; Bai, Y. Multi-modal Navigation Interaction Recommendation with a Driver Demand-Based Knowledge Graph. In Proceedings of the 10th International Joint Conference on Knowledge Graphs (IJCKG’21), Virtual Event Thailand, 6–8 December 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 195–199.
Wang, M.; Wang, H.; Li, B.; Zhao, X.; Wang, X. Summary of Key Technologies of the New Generation of Knowledge Graph. Computer Research and Development: 1–18. Available online: http://kns.cnki.net/kcms/detail/11.1777.TP.20220301.1217.002.html (accessed on 20 October 2022).
Fang, Z.; Long, Q.; Song, G.; Xie, K. Spatial-Temporal Graph ODE Networks for Traffic Flow Forecasting. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery Data Mining (KDD ′21), Virtual Event, Singapore, 14–18 August 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 364–373.
Feng, L.; Shu, S.; Cao, Y.; Tao, L.; Wei, H.; Xiang, T.; An, B.; Niu, G. Multiple-Instance Learning from Similar and Dissimilar Bags. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD ′21), Virtual Event, Singapore, 14–18 August 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 374–382.
Available online: https://blog.csdn.net/wzg199538/article/details/113847603?spm=1001.2014.3001.5506 (accessed on 13 October 2022).
Mangal, A.; Kumar, N. Using big data to enhance the bosch production line performance: A Kaggle challenge. In Proceedings of the IEEE International Conference on Big Data (Big Data), Washington, DC, USA, 5–8 December 2016; pp. 2029–2035.
Arp, D.; Spreitzenbarth, M.; Hubner, M.; Gascon, H.; Rieck, K.; Siemens, C.E.R.T. DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket. In Proceedings of the Symposium on Network and Distributed System Security (NDSS), San Diego, CA, USA, 23–26 February 2014.
Michael, S.; Florian, E.; Thomas, S.; Felix, C.F.; Hoffmann, J. Mobile-Sandbox: Looking Deeper into Android Applications. In Proceedings of the 28th International ACM Symposium on Applied Computing (SAC), Coimbra, Portugal, 18–22 March 2013.
Feng, W.; Jie, T.; Tracy, X.L. Understanding Dropouts in MOOCs. In Proceedings of the AAAI Conference on Artificial Intelligence, Atlanta, Georgia, 8–12 October 2019; pp. 517–524.

Figure 1. An example of a time-series knowledge graph.

Figure 2. A clip of ICEWS showing records of the incumbents of the U.S. President over different time periods. Not all U.S. presidents in office are listed in the figure, and unlisted presidents and future presidents are represented by dotted lines.

Figure 3. Knowledge Graphs of requirements based on IIoT network security.

Figure 4. The overall structure of the TBRm model.

Table 1. Classification of known temporal knowledge graph construction methods.

Temporal Knowledge Graph Representation Category	Model Abbreviation	Features
Temporal knowledge graph representation model with time constraints	ETA-TransE [22]	Builds on the TransE model by constructing a time transfer matrix based on the difference in the time granularity of the application scenario, which distinguishes the impact of the same time on different types of entities.
	ATiSE [23]	Mines the influence of time on the evolution of entities.
	TTransE [24]	Builds on the TransE model by embedding time points in the triple through relation-time merging.
Time series coding temporal knowledge graph representation model	Know-Evolve [25]	Constructs an RNN to update the embedded representation of an entity after it is affected by changes over time.
	RE-NET [26]	Converts the time information into a sequence of time-stamped events (triples), then uses an RGCN to aggregate the information of entities at the same time.
Path reasoning temporal knowledge graph representation model	Change2vec [27]	Splits the temporal knowledge graph into multiple static knowledge graphs according to time nodes, recalculates the representations of the changed node entities, and updates their embedded representations.
	xERTE [28]	Makes the reasoning interpretable by visualizing the reasoning path.
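
To make the TransE-based rows of Table 1 more concrete, the following minimal sketch scores a triple with a translational distance and optionally adds a time vector to the relation. The function name `translational_score`, the toy embeddings, and the additive form ‖h + r + τ − t‖ are assumptions chosen for illustration; they show one common reading of relation-time merging and are not taken from the TBRm model or from any cited paper's code.

```python
import math


def translational_score(head, relation, tail, time=None):
    """Return the L2 distance ||h + r (+ tau) - t||; lower means more plausible.

    With time=None this is the plain TransE distance. Passing a time vector
    adds it to the relation (an illustrative assumption, not the paper's
    exact formulation).
    """
    if time is None:
        time = [0.0] * len(head)
    return math.sqrt(
        sum((h + r + tau - t) ** 2
            for h, r, tau, t in zip(head, relation, time, tail))
    )


# Hypothetical 3-dimensional embeddings, chosen only for illustration.
h = [0.1, 0.2, 0.0]
r = [0.3, 0.0, 0.1]
tau = [0.0, 0.1, 0.1]
t = [0.4, 0.3, 0.2]

static_score = translational_score(h, r, t)         # ignores time
temporal_score = translational_score(h, r, t, tau)  # time-aware variant
print(static_score, temporal_score)
```

In this toy setting the time vector closes the gap between `h + r` and `t`, so the time-aware distance is smaller than the static one; with trained embeddings the effect depends on the data.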

Table 2. Statistics of the datasets.

Dataset	BPLP	MOOC-Ub	NFT	IE-IoTD	Edge-IIoTset
#Entities	3024	24,100	5422	12,564	108,576
#Relations	145	274	186	245	14
#Training	7213	19,151	7274	18,780	24,301
#Validation	5327	7263	4263	4072	19,281
#Test	3348	2854	1000	2349	4820

Table 3. Results (percentages) on the BPLP and MOOC-Ub datasets.

Method	BPLP				MOOC-Ub
Method	MRR	Hits@1	Hits@3	Hits@10	MRR	Hits@1	Hits@3	Hits@10
TransE	16.47	4.62	25.84	34.78	17.54	5.62	33.04	45.69
DistMult	23.54	5.66	9.64	39.16	18.95	14.23	25.43	37.42
TTransE	10.63	13.14	24.93	25.64	9.63	5.46	9.65	17.56
HyTE	19.35	21.04	28.43	40.62	21.34	16.38	25.63	36.66
TA-DistMult	9.61	7.25	8.96	15.34	12.57	8.94	15.82	23.13
TBRm	8.72	22.63	29.97	45.49	8.35	20.32	40.98	46.49

The best results are shown in bold.

Table 4. Results (percentages) on the NFT and IE-IoTD datasets.

Method	NFT				IE-IoTD
Method	MRR	Hits@1	Hits@3	Hits@10	MRR	Hits@1	Hits@3	Hits@10
TransE	33.67	17.26	48.87	66.31	28.42	20.52	59.67	45.87
DistMult	32.53	17.64	49.63	64.41	25.68	19.43	61.36	42.94
TTransE	49.57	32.98	35.65	70.69	30.26	25.20	48.47	50.64
HyTE	45.39	26.35	49.63	75.64	22.63	24.12	42.13	52.12
TA-DistMult	50.26	40.28	45.61	37.38	28.92	21.10	60.74	48.67
TBRm	30.04	35.99	46.97	74.32	25.39	24.59	64.79	55.19

The best results are shown in bold.

Table 5. The performance of TBRm tested on the Industrial Internet of Things.

Method	Edge-IIoTset
Method	MRR	Hits@1	Hits@3	Hits@10
TBRm	34.68	19.62	50.73	60.34
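
The MRR and Hits@k columns in Tables 3–5 are standard ranking metrics for link prediction. As a hedged illustration, the sketch below computes them from a list of ranks, where each rank is the position of the correct entity among all candidates for one test query. The helper names and the example ranks are hypothetical, and the sketch does not reproduce the paper's evaluation protocol (for example, whether raw or filtered ranks were used).

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries (ranks start at 1)."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test queries whose correct entity is ranked in the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Hypothetical ranks of the correct entity for five test queries.
example_ranks = [1, 3, 2, 10, 50]

# The tables report percentages, so scale the fractions by 100.
mrr_pct = 100 * mean_reciprocal_rank(example_ranks)
hits1_pct = 100 * hits_at_k(example_ranks, 1)
hits3_pct = 100 * hits_at_k(example_ranks, 3)
hits10_pct = 100 * hits_at_k(example_ranks, 10)
print(f"MRR={mrr_pct:.2f} Hits@1={hits1_pct:.2f} "
      f"Hits@3={hits3_pct:.2f} Hits@10={hits10_pct:.2f}")
```

Because every query counted in Hits@1 is also counted in Hits@3 and Hits@10, these values are non-decreasing in k for any single set of ranks.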

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

