New Algorithms for Counting Temporal Graph Pattern

Temporal networks can describe many types of real-world complex systems that carry temporal information. As an effective method for analyzing such networks, temporal graph pattern (TGP) counting has received extensive attention and has been applied in diverse domains. In this paper, we study the problem of counting TGPs in a temporal network. We first propose an exact algorithm based on the time first search (TFS) algorithm. This algorithm reduces the intermediate results generated during graph isomorphism matching and is computationally efficient. To further improve performance, we design an estimation algorithm by applying an edge sampling strategy to the exact algorithm. Finally, we evaluate the two algorithms by counting both symmetric and asymmetric TGPs. Extensive experiments on real datasets demonstrate that the exact algorithm is faster than the existing algorithm and that the estimation algorithm greatly reduces the running time while preserving the estimation accuracy.


Introduction
A graph is a tool for describing complex systems, such as biological [1], chemical [2], and electronic interactive systems [3]. In many cases, these systems are accompanied by temporal information. For example, in social networks the time at which people send messages is always recorded [4], and in cyber networks the chronological order of packet transmission matters [5]. Temporal information is clearly a key component of such systems. For analysis, complex systems are usually represented by networks (graphs). However, static networks cannot describe complex systems that contain temporal information. Therefore, the temporal network, in which each edge has a timestamp, was proposed [6].
Graph pattern counting is a fundamental problem in network analysis, with applications including anomaly detection [7], community detection [8], internet traffic classification [9] and so on. Usually, a graph pattern refers to a small induced subgraph, which is also called a motif or graphlet. In particular, for a temporal network, if the graph pattern carries temporal information, then it becomes a temporal graph pattern (TGP). The TGP has been widely studied and is more commonly known as the temporal motif in many studies. Depending on the application scenario, there are two kinds of definitions of the temporal motif. The first is the network motif for graph snapshots at different time points, in which the edges have continuous timestamps [10]. The second is the temporal motif of a whole temporal graph. In this paper, we consider the second type of temporal motif.
The problem of counting small graph patterns in static networks was proposed long ago and has attracted widespread attention, especially triangle counting in social network analysis [11][12][13]. Until now, most of the triangle counting algorithms are estimation algorithms, and the graph stream [...] while guaranteeing the accuracy. Moreover, both algorithms are suitable for the TGP that has multiple edges with the same time.
The main contributions of this paper are as follows:
(1) The paper studies the problem of counting the temporal graph pattern defined in [27] and proposes the corresponding problem model. Since all the counting algorithms in the existing literature are designed for the TGP defined in [30], the problem here has not been discussed before and is of great significance.
(2) The paper provides a strategy for the TFS algorithm to handle a temporal graph pattern that has multiple edges with the same time. When such edges exist, the TFS algorithm gives an edge order determined by the topology. This edge order is used to match the edges and makes the proposed algorithms suitable for any TGP.
(3) The paper proposes an exact algorithm and an estimation algorithm to count the TGP in a temporal graph. Both algorithms are computationally efficient because they match the topology and the temporal information simultaneously.
The rest of this manuscript is organized as follows. Section 2 gives some definitions and describes the temporal graph counting problem for the temporal network. In Section 3, we propose an exact algorithm and an estimation algorithm for the TGP counting. Section 4 presents a variety of experiments to demonstrate the effectiveness of the algorithms. Section 5 gives some discussions. Finally, conclusions are provided in Section 6.

Definitions and Problem
In this section, we give the fundamental definitions of the temporal graph, the temporal graph pattern, the temporal graph isomorphism and so on. Then, according to these definitions, we briefly describe the counting problem of the temporal graph pattern.

Definition 1. Temporal Graph.
A temporal graph $G = (V, E)$ consists of a set of vertices $V$ and a set of temporal edges $E = \{(u, v, t) \mid u, v \in V\}$, where $t$ is the timestamp of the edge.

Definition 2. ∆T-Temporally Related Edges.
Given two edges $e_i = (u_i, v_i, t_i)$ and $e_j = (u_j, v_j, t_j)$, the edge $e_i$ is ∆T-temporally related to edge $e_j$ if they are temporally adjacent, i.e., they share at least one vertex, $\{u_i, v_i\} \cap \{u_j, v_j\} \neq \emptyset$, and $|t_i - t_j| \leq \Delta T$.

Definition 3. ∆T-Temporally Connected Graph.
A temporal graph $G = (V, E)$ is a ∆T-temporally connected graph if and only if the graph is weakly connected and all the adjacent edges are ∆T-temporally related edges.

Definition 4. Temporal Graph Pattern.
A temporal graph pattern $H = (g, \Delta T)$, also known as the temporal motif, is a ∆T-temporally connected graph $g = (V_h, E_h)$.
Definition 4 is just one of the TGP definitions, and we aim to count the occurrences of this pattern in a large temporal graph. Since the TGP contains temporal information, the graph isomorphism needs to satisfy both the isomorphism conditions of static graphs and the temporal conditions. In the following, we give the definition of temporal graph isomorphism and the conditions that need to be satisfied.

Definition 5. Temporal Graph Isomorphism.
If a temporal subgraph $G = (V, E)$ is temporally isomorphic to the temporal graph pattern $H = (g, \Delta T)$, where $g = (V_h, E_h)$, then there exists an injective function $\phi : V_h \to V$ which satisfies the following conditions: [...]

Condition (3) describes the temporal relationship among the edges. In addition to the above conditions, all the adjacent edges in the matched subgraph $G$ must be ∆T-temporally related edges. Therefore, given a temporal graph $G$ and the TGP $H$, the problem here is to count the number of subgraphs of $G$ that are temporally isomorphic to $H$.
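Definitions 2 and 3 can be checked mechanically. The following is a minimal Python sketch, assuming edges are stored as `(u, v, t)` tuples and that the vertex set consists of the endpoints of the given edges; the function names `temporally_related` and `is_temporally_connected` are hypothetical and chosen here only for illustration.

```python
from collections import defaultdict


def temporally_related(e1, e2, delta_t):
    """Definition 2: the edges share a vertex and their times differ by at most delta_t."""
    (u1, v1, t1), (u2, v2, t2) = e1, e2
    shares_vertex = bool({u1, v1} & {u2, v2})
    return shares_vertex and abs(t1 - t2) <= delta_t


def is_temporally_connected(edges, delta_t):
    """Definition 3: weakly connected, and every adjacent edge pair is delta_t-related."""
    if not edges:
        return False  # assumption: an empty edge list is not a valid pattern
    # Every pair of adjacent edges must be temporally related.
    for i, a in enumerate(edges):
        for b in edges[i + 1:]:
            adjacent = bool({a[0], a[1]} & {b[0], b[1]})
            if adjacent and not temporally_related(a, b, delta_t):
                return False
    # Weak connectivity: depth-first search that ignores edge direction.
    neighbours = defaultdict(set)
    for u, v, _ in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    start = edges[0][0]
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in neighbours[x] - seen:
            seen.add(y)
            stack.append(y)
    return len(seen) == len(neighbours)
```

For a triangle with timestamps 10, 20 and 30, the first and last edges are adjacent and 20 time units apart, so the check passes for a threshold of 20 and fails for a threshold of 15.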

The Counting Algorithms for TGP
To solve the TGP counting problem, an edge-centric exact algorithm based on the TFS method is first proposed in this section. Then, we combine the exact algorithm with the edge sampling method to reduce the iteration number of the edges, and present an estimation algorithm.

The Overall Idea of the Algorithm
As mentioned in the previous section, the temporal motif is a ∆T-temporally connected graph, and the time differences among the adjacent edges of the temporal motif should not exceed the time threshold ∆T. Therefore, we can partition the temporal graph into multiple subgraphs according to ∆T, and count the number of occurrences of the graph pattern in each subgraph. After accumulating the numbers from the different subgraphs, the total number of TGP can be obtained. The counting problem can thus be handled in three steps: (1) partitioning the temporal graph; (2) counting the number of TGP in each subgraph; and (3) accumulating the numbers. In Step (1), we need to traverse all the edges in the temporal graph and ensure that adjacent edges whose time differences do not exceed the threshold ∆T are placed in the same subgraph. After the partition, we get a set of temporal subgraphs $S = \{G_{t_i}, i = 0, 1, \ldots, M\}$, and each edge of the temporal graph $G$ exists in exactly one subgraph. The partitioning process can be easily achieved by the algorithm in [33], thus here we do not describe the partitioning algorithm in detail.
The key step for solving the counting problem is Step (2), i.e., counting the number of TGP in the obtained temporal subgraphs. In this step, the NP-complete subgraph isomorphism problem is inevitable. Compared to the isomorphism problem in static graphs, the temporal isomorphism problem requires extra consideration of the temporal relationship, which means that counting TGPs in a temporal graph is more difficult than counting non-temporal graph patterns in a static graph. In the following, we focus on Step (2) and design a counting algorithm that matches the temporal relationship and the topology simultaneously.
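Step (1) can be illustrated with a short Python sketch. It is not the partitioning algorithm of [33], which is not reproduced in this paper; it is one possible realization under the assumption that edges are `(u, v, t)` tuples, using a union-find structure over edge indices. The function name `partition` is hypothetical. For each vertex, the incident edges are sorted by time and consecutive edges whose gap is at most ∆T are merged, which places every pair of adjacent, ∆T-related edges in the same group.

```python
from collections import defaultdict


def partition(edges, delta_t):
    """Group edges so that adjacent edges within delta_t end up in the same subgraph."""
    parent = list(range(len(edges)))

    def find(x):
        # Find the representative of x with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # vertex -> list of (timestamp, edge index) for all incident edges
    incident = defaultdict(list)
    for idx, (u, v, t) in enumerate(edges):
        incident[u].append((t, idx))
        if v != u:
            incident[v].append((t, idx))

    # Around each vertex, merge time-consecutive edges whose gap is small enough.
    for items in incident.values():
        items.sort()
        for (t_a, a), (t_b, b) in zip(items, items[1:]):
            if t_b - t_a <= delta_t:
                union(a, b)

    groups = defaultdict(list)
    for idx, e in enumerate(edges):
        groups[find(idx)].append(e)
    return list(groups.values())
```

Because merging is transitive, a group may contain adjacent edges that are more than ∆T apart; the temporal condition is therefore still checked during matching in Step (2).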

Counting the Number of TGP in Each Subgraph
To count the number of TGP, we need to search for the isomorphic graphs in the subgraphs $G_{t_i}$, $i = 0, 1, \ldots, M$. A common strategy used in graph isomorphism is to match the edges exactly according to the chronological order of the edges of the TGP. The advantage of this strategy is that it reduces the intermediate results that satisfy the topology but do not satisfy the chronological order. However, since a TGP may have several edges with the same time, there may be intermediate results that satisfy the chronological order but do not satisfy the topology. To eliminate such intermediate results, here we apply the TFS algorithm proposed in our previous work [33] to match the edges.
The TFS algorithm is an edge traversal algorithm, and its output is the traversal order of the edges of the input temporal graph. This algorithm first sorts the edges in the temporal graph by time to select the edge with the smallest time. If there are multiple edges with the smallest time, one of them will be selected randomly. Then, the algorithm traverses the adjacent edges of the selected edge, and selects the new edge with the smallest time. Finally, it repeats the process of selecting the new edge from the adjacent edges of the already selected edges until all the edges of the input temporal graph have been traversed. As the algorithm combines the edge time with the topology, the traversal order satisfies the chronological order of the edges as much as possible while guaranteeing the topological connection.
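The traversal described above can be sketched in a few lines of Python. This is an illustrative reading of the description, not the implementation from [33]: edges are assumed to be `(u, v, t)` tuples, the function name `tfs_order` is hypothetical, and ties on the smallest time are broken by input order rather than randomly so that the output is reproducible.

```python
def tfs_order(edges):
    """Return the edges of a weakly connected temporal graph in time-first-search order."""
    if not edges:
        return []
    remaining = list(edges)
    # Start from the edge with the smallest timestamp
    # (assumption: ties are broken by input order, not randomly).
    first = min(remaining, key=lambda e: e[2])
    order = [first]
    remaining.remove(first)
    visited = {first[0], first[1]}
    while remaining:
        # Edges adjacent to the already selected edges share a visited vertex.
        frontier = [e for e in remaining if e[0] in visited or e[1] in visited]
        if not frontier:
            raise ValueError("the input graph is not weakly connected")
        nxt = min(frontier, key=lambda e: e[2])
        order.append(nxt)
        remaining.remove(nxt)
        visited.update(nxt[:2])
    return order
```

For the edges `("a", "b", 1)`, `("c", "d", 2)` and `("b", "c", 3)`, the edge at time 2 is not adjacent to the first selected edge, so the traversal visits the edge at time 3 before it. This shows how the order follows time only as far as connectivity allows.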
By applying the above TFS algorithm to the TGP, the traversal order of the TGP can be obtained. Then, following this order, we obtain the matched pairs one by one, each consisting of an edge of the temporal graph and a pattern edge, and count the number of TGP. Algorithm 1 provides the details of the matching process and summarizes the counting algorithm for the subgraphs $G_{t_i}$, $i = 0, 1, \ldots, M$. Given the subgraph $G_{t_i}$ obtained from graph partitioning and the TGP $H$, this algorithm outputs the number of subgraphs isomorphic to $H$ in $G_{t_i}$. First, we sort the edges of $H$ according to the TFS algorithm (Line 1), and get the first edge $e_{h_0}$ of the traversal order $L$ (Line 2). Then, we traverse all the edges of $G_{t_i}$ to get the isomorphic subgraphs (Line 5). If the edge $e$ of the temporal graph $G_{t_i}$ is isomorphic to the edge $e_{h_0}$ (i.e., $e$ satisfies the matching conditions) (Line 6), we add the matched pair $P$ consisting of $e$ and $e_{h_0}$ to the matched pair set $M$ (Lines 7-8), and call the algorithm Match() to match the remaining edges of $H$ (Lines 9-11).
The algorithm Match() is a recursive algorithm shown in Algorithm 2. In this algorithm, we first determine whether the set $M$ contains all the edges of $L$ (Line 1). If so, the count is incremented by 1 and returned (Lines 2-3); otherwise, we continue to search for the matched subgraph. Then, we get the next unmatched edge $e_{h_j}$ of $L$ and all the candidate edges $E_c$ from the adjacent edges of the matched edges (Lines 5-6). For each edge $e$ in $E_c$, if $e$ satisfies the matching conditions (Lines 7-8), we store the matched pair consisting of $e$ and $e_{h_j}$ in $M$ and call the match algorithm again (Lines 9-12). The algorithm Match() thus returns the number of isomorphic subgraphs whose first matched edge is $e$. Therefore, after traversing all the edges of $G_{t_i}$, the number of all the isomorphic graphs can be obtained.

Algorithm 1
The counting algorithm for subgraph $G_{t_i}$.

Algorithm 2 (Lines 3-8)
3: return count
4: end if
5: $e_{h_j}$ ← Get the next unmatched edge of $L$;
6: $E_c$ ← Get all the candidate edges from the adjacent edges of matched edges;
7: for edge $e \in E_c$ do
8: if $e$ satisfies the matching conditions then

On Line 6 of Algorithm 1 and Line 8 of Algorithm 2, we search for the matched edges via the matching conditions. The matched edges need to satisfy two types of conditions: the topological condition and the temporal condition. The temporal condition is reflected in two aspects. On the one hand, the time difference between adjacent edges does not exceed the threshold ∆T. On the other hand, the chronological order of the matched edges should be consistent with that of the edges in the TGP $H$. The topological condition restricts the degree of the vertex, which means that the degree of the matched vertex in the temporal graph is equal to or larger than the degree of the corresponding vertex in $H$.
In addition, the candidate edges on Line 6 of Algorithm 2 are obtained according to the topological condition. When we search for the isomorphic edge of the edge $e_{h_j} = (u_{h_j}, v_{h_j}, t_{h_j})$, $j > 0$, the source vertex $u_{h_j}$ or the target vertex $v_{h_j}$ should have been matched. Thus, there are three cases in the selection of candidate edges.
(1) If only the source vertex $u_{h_j}$ is matched, we traverse the outgoing edges of the vertex matched to $u_{h_j}$ and choose the edges that satisfy the temporal condition as the candidate edges.
(2) If only the target vertex $v_{h_j}$ is matched, we traverse the incoming edges of the vertex matched to $v_{h_j}$ and choose the candidate edges in the same way.
(3) If both $u_{h_j}$ and $v_{h_j}$ are matched, we choose the candidate edges from the intersection of the corresponding outgoing edge set and incoming edge set.
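The recursive matching with the three candidate cases can be sketched as follows. This is a simplified illustration and not the authors' Algorithms 1 and 2. The sketch rests on several assumptions: the function name `count_matches` and its helpers are hypothetical; the pattern edges are passed already in TFS order; the pattern has no self-loops; the degree-based pruning is omitted; and "consistent chronological order" is read as: two matched edges must compare in time (earlier, equal, later) exactly as their pattern edges do. The function counts edge-to-edge matchings.

```python
from collections import defaultdict


def count_matches(graph_edges, pattern_order, delta_t):
    """Count matchings of a pattern (edges given in TFS order) in one temporal subgraph."""
    out_edges, in_edges = defaultdict(list), defaultdict(list)
    for idx, (u, v, _) in enumerate(graph_edges):
        out_edges[u].append(idx)
        in_edges[v].append(idx)

    def sign(x):
        return (x > 0) - (x < 0)

    def temporal_ok(j, idx, matched):
        """Temporal condition for pairing pattern edge j with graph edge idx."""
        hu, hv, ht = pattern_order[j]
        gt = graph_edges[idx][2]
        for k, other in enumerate(matched):
            pu, pv, pt = pattern_order[k]
            ot = graph_edges[other][2]
            if other == idx:
                return False  # a graph edge is used at most once
            if sign(ht - pt) != sign(gt - ot):
                return False  # relative order must agree with the pattern
            if {hu, hv} & {pu, pv} and abs(gt - ot) > delta_t:
                return False  # adjacent matched edges must be within delta_t
        return True

    def candidates(hu, hv, phi):
        if hu in phi and hv in phi:  # case (3): outgoing and incoming sets intersect
            return [i for i in out_edges[phi[hu]] if graph_edges[i][1] == phi[hv]]
        if hu in phi:                # case (1): only the source is matched
            return out_edges[phi[hu]]
        if hv in phi:                # case (2): only the target is matched
            return in_edges[phi[hv]]
        return range(len(graph_edges))  # first pattern edge: try every edge

    def match(j, matched, phi, used):
        if j == len(pattern_order):
            return 1
        hu, hv, _ = pattern_order[j]
        total = 0
        for idx in candidates(hu, hv, phi):
            gu, gv, _ = graph_edges[idx]
            new = [(h, g) for h, g in ((hu, gu), (hv, gv)) if h not in phi]
            if gu == gv or any(g in used for _, g in new):
                continue  # keep the vertex mapping injective
            if not temporal_ok(j, idx, matched):
                continue
            for h, g in new:
                phi[h] = g
                used.add(g)
            matched.append(idx)
            total += match(j + 1, matched, phi, used)
            matched.pop()
            for h, g in new:
                del phi[h]
                used.discard(g)
        return total

    return match(0, [], {}, set())
```

Checking the temporal condition at every step, rather than after a full topological match, is what keeps the number of intermediate results small.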

Algorithm Summary and Computational Complexity
According to Algorithm 1, we get the number $C_i$ of subgraphs isomorphic to the TGP $H$ in $G_{t_i}$. Then, by applying this algorithm $M + 1$ times and accumulating all the numbers $C_i$, $i = 0, \ldots, M$, the total number $C$ of isomorphic subgraphs can be obtained. Algorithm 3 summarizes the whole counting algorithm (which we call the exact algorithm). The computational complexity of the algorithm is analyzed as follows. First, the computational complexity of the partitioning is $O(|E|)$, where $|E|$ is the number of edges in $G$. Then, the worst-case complexity of counting $H$ in subgraph $G_{t_i}$ is $O((m_{t_i})^{m_h})$, where $m_{t_i}$ and $m_h$ are the numbers of edges of $G_{t_i}$ and $H$, respectively. This case occurs when each edge in $G_{t_i}$ could match each edge in $H$. However, if the edges have a temporal order, the temporal graph isomorphism has polynomial asymptotic complexity $O(m_{t_i}^2)$. Since the whole algorithm contains the partitioning and needs to count the occurrences of $H$ in the subgraphs $G_{t_i}$, $i = 0, \ldots, M$, the total computational complexity of the exact algorithm is $O(\sum_{i=0}^{M} (m_{t_i})^{m_h} + |E|)$. It is worth noting that, in real scenarios, most of the subgraphs $G_{t_i}$, $i = 0, \ldots, M$, may have only one or two edges, thus the real performance would be better than the analytical bound $O(\sum_{i=0}^{M} (m_{t_i})^{m_h} + |E|)$.

Algorithm 3
The exact algorithm for the temporal graph pattern.
Input: $G = (V, E)$: the temporal graph, $H = (g, \Delta T)$: the temporal graph pattern.
Output: $C$: the number of subgraphs isomorphic to $H$ in $G$.
1: $S$ ← partition the graph $G$ into subgraphs $G_{t_i}$, $i = 0, 1, \ldots, M$;
2: for subgraph $G_{t_i}$ in $S$ do
3: $C_i$ ← count the number of subgraphs isomorphic to $H$ in $G_{t_i}$;
4: end for
5: [...]

The Estimation Algorithm
In the previous subsection, we presented the exact algorithm, which achieves relatively high efficiency by reducing the intermediate results in the process of isomorphism. To further improve the computational efficiency, in this subsection we design an algorithm based on the exact algorithm and edge sampling, a technique used by Doulion [34] to count the triangles of a large graph. Instead of traversing all the edges in the graph, the algorithm samples the edges by throwing biased coins. For each edge in the temporal graph, a biased coin with success probability $p$ is thrown to determine whether this edge is kept, which means that the selection probability of each edge is $p$. If the edge $e$ is selected, $e$ is placed in the edge set $E_p$; otherwise, $e$ is ignored. Then, we build a new temporal graph $G_p$ from the selected edges $E_p$, and apply the exact algorithm to count the number $C_p$ of isomorphic subgraphs in $G_p$. The approximate value of the exact number $C$ is obtained as $C_p / p^{m_h}$, where $p^{m_h}$ is the probability that all $m_h$ edges of an occurrence of the TGP are selected. Since the algorithm outputs the estimated number of isomorphic subgraphs, we call it the estimation algorithm. The specific steps of the algorithm are provided in Algorithm 4.

Algorithm 4
The estimation algorithm for the temporal graph pattern.
[...]
6: end if
7: end for
8: $G_p = (V_p, E_p)$ ← build a temporal graph according to $E_p$;
9: $C_p$ ← count the number of $H$ in $G_p$;
10: $\hat{C} = C_p / p^{m_h}$, where $m_h$ is the number of edges in $H$;
11: return $\hat{C}$.
Although $\hat{C}$ is an approximate value of $C$, $\hat{C}$ is an unbiased estimator of $C$ and has high accuracy (which can be seen in the numerical experiments). In the following, we give two lemmas about the expectation and variance of $\hat{C}$ to characterize its unbiasedness and accuracy.
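The sampling-and-rescaling steps of Algorithm 4 can be sketched in Python as follows. This is a minimal illustration, not the authors' Java implementation. The names `sample_edges`, `estimate_count` and `count_two_paths` are hypothetical, and `count_two_paths` is only a stand-in exact counter for a two-edge pattern ($m_h = 2$) so that the block runs on its own; any exact counter can be passed instead.

```python
import random


def sample_edges(edges, p, rng):
    """Keep each edge independently with probability p (one biased coin per edge)."""
    return [e for e in edges if rng.random() < p]


def estimate_count(edges, count_fn, m_h, p, rng=None):
    """Estimate the pattern count as C_p / p**m_h, where C_p is the count after sampling."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    rng = rng or random.Random()
    sampled = sample_edges(edges, p, rng)
    c_p = count_fn(sampled)
    return c_p / p ** m_h


def count_two_paths(edges, delta_t=10):
    """Stand-in exact counter: u->v at t1 followed by v->w at t2, t1 < t2 <= t1 + delta_t."""
    return sum(
        1
        for (_, v, t1) in edges
        for (x, _, t2) in edges
        if v == x and t1 < t2 <= t1 + delta_t
    )
```

With `p = 1.0` every edge is kept and the estimate equals the exact count; smaller values of `p` reduce the work of the counter at the cost of a larger variance.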

Lemma 1.
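Let $C$ and $\hat{C}$ be the exact number of temporal subgraphs isomorphic to $H$ in $G$ and the estimated number obtained by Algorithm 4, respectively. Then, $E[\hat{C}] = C$.
Proof of Lemma 1. For each temporal subgraph isomorphic to $H$ in $G$, we introduce an indicator variable $\delta_i$, $i = 1, \ldots, C$, to indicate whether the isomorphic subgraph is sampled or not, i.e.,
$$\delta_i = \begin{cases} 1, & \text{if the } i\text{th isomorphic subgraph is sampled,} \\ 0, & \text{otherwise.} \end{cases} \tag{1}$$
The number $C_p$ is obtained from the temporal graph after edge sampling, thus $C_p$ is also the number of isomorphic subgraphs being sampled, i.e., $C_p = \sum_{i=1}^{C} \delta_i$. Then, we can get the expectation of $\hat{C}$ as follows:
$$E[\hat{C}] = E\left[\frac{C_p}{p^{m_h}}\right] = \frac{1}{p^{m_h}} \sum_{i=1}^{C} E[\delta_i], \tag{2}$$
where $m_h$ is the number of edges in $H$. Since each edge is sampled with probability $p$, the probability that the $i$th subgraph is selected (i.e., $\delta_i = 1$) is $p^{m_h}$. Then, the probability distribution of $\delta_i$, $i = 1, \ldots, C$, can be summarized as in Table 1.

Table 1. Probability distribution of $\delta_i$.
$\delta_i = 0$: probability $1 - p^{m_h}$
$\delta_i = 1$: probability $p^{m_h}$

According to Table 1, the expectation of $\delta_i$ can be calculated as
$$E[\delta_i] = 0 \cdot (1 - p^{m_h}) + 1 \cdot p^{m_h} = p^{m_h}. \tag{3}$$
By substituting Equation (3) into Equation (2), we get $E[\hat{C}] = \frac{1}{p^{m_h}} \cdot C \cdot p^{m_h} = C$. The proof is complete.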
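
Lemma 2.
The variance of the estimated number $\hat{C}$ obtained by Algorithm 4 is
$$\mathrm{Var}[\hat{C}] = \frac{1}{p^{2m_h}} \sum_{k=0}^{m_h} c_k \left(p^{2m_h - k} - p^{2m_h}\right),$$
where $c_k$ is the number of cases that two subgraphs share $k$ edges.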

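Proof of Lemma 2. Similar to Equation (2), the variance $\mathrm{Var}[\hat{C}]$ is given by
$$\mathrm{Var}[\hat{C}] = \mathrm{Var}\left[\frac{1}{p^{m_h}} \sum_{i=1}^{C} \delta_i\right] = \frac{1}{p^{2m_h}} \sum_{i=1}^{C} \sum_{j=1}^{C} \mathrm{Cov}[\delta_i, \delta_j], \tag{4}$$
where $\mathrm{Cov}[\delta_i, \delta_j]$ is the covariance of $\delta_i$ and $\delta_j$, which can be simplified as
$$\mathrm{Cov}[\delta_i, \delta_j] = E[\delta_i \delta_j] - E[\delta_i] E[\delta_j] = E[\delta_i \delta_j] - p^{2m_h}. \tag{5}$$
As can be seen from Equation (4), there are $C^2$ terms in the summation, where $C$ terms belong to the case of $i = j$ and $C^2 - C$ terms belong to the case of $i \neq j$. To calculate the covariance $\mathrm{Cov}[\delta_i, \delta_j]$, we divide the calculation into three cases.
Case 1: $i = j$. Since $\delta_i$ obeys the 0-1 distribution shown in Table 1, we have $\mathrm{Cov}[\delta_i, \delta_i] = \mathrm{Var}[\delta_i] = p^{m_h} - p^{2m_h}$.
Case 2: $i \neq j$ and the subgraphs corresponding to $\delta_i$, $\delta_j$ do not share any edges. In this case, whether the $i$th subgraph is sampled is unrelated to the $j$th subgraph, which means that $\delta_i$ and $\delta_j$ are independent. Thus, we have $\mathrm{Cov}[\delta_i, \delta_j] = 0$.
Case 3: $i \neq j$ and the subgraphs corresponding to $\delta_i$, $\delta_j$ share edges. Assume that the $i$th and $j$th subgraphs share $k$ ($k < m_h$) edges. Thus, the probability that the shared edges are sampled is $p^k$, and the probability that the remaining $2m_h - 2k$ edges of the two subgraphs are sampled is $p^{2m_h - 2k}$. Based on these two probabilities, the probability of both subgraphs being sampled can be denoted as $P(\delta_i = 1, \delta_j = 1) = p^k \cdot p^{2m_h - 2k} = p^{2m_h - k}$, so that $\mathrm{Cov}[\delta_i, \delta_j] = p^{2m_h - k} - p^{2m_h}$.
According to Equation (4) and the three cases, the variance $\mathrm{Var}[\hat{C}]$ can be rewritten as
$$\mathrm{Var}[\hat{C}] = \frac{1}{p^{2m_h}} \sum_{k=0}^{m_h} c_k \left(p^{2m_h - k} - p^{2m_h}\right), \tag{6}$$
where $c_k$, $k = 0, \ldots, m_h$, are the numbers of terms $\mathrm{Cov}[\delta_i, \delta_j]$, $i, j = 1, \ldots, C$, in which the subgraphs represented by $\delta_i$ and $\delta_j$ share $k$ edges, $c_{m_h} = C$ and $\sum_{k=0}^{m_h} c_k = C^2$. The proof is complete.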
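The two lemmas can be checked numerically on a toy instance by enumerating every possible sampling outcome. The Python sketch below is only such a sanity check under stated assumptions: each occurrence of the pattern is represented by the set of its $m_h$ edge identifiers, the helper names `exact_moments` and `lemma2_variance` are hypothetical, and the variance is evaluated in the equivalent form $\sum_{i}\sum_{j} (p^{-k_{ij}} - 1)$, where $k_{ij}$ is the number of edges shared by occurrences $i$ and $j$. This is what the three covariance cases give after dividing by $p^{2m_h}$.

```python
from itertools import combinations


def exact_moments(num_edges, instances, p):
    """Mean and variance of C_hat = C_p / p**m_h over all 2**num_edges sampling outcomes."""
    m_h = len(instances[0])
    mean = second = 0.0
    for r in range(num_edges + 1):
        for kept in combinations(range(num_edges), r):
            kept = set(kept)
            prob = p ** r * (1 - p) ** (num_edges - r)
            # An occurrence survives only if all of its edges were kept.
            c_p = sum(1 for inst in instances if inst <= kept)
            c_hat = c_p / p ** m_h
            mean += prob * c_hat
            second += prob * c_hat ** 2
    return mean, second - mean ** 2


def lemma2_variance(instances, p):
    """Variance from the covariance cases: sum over ordered pairs of p**(-k) - 1."""
    total = 0.0
    for a in instances:
        for b in instances:  # ordered pairs, including a == b (k = m_h)
            k = len(a & b)
            total += p ** (-k) - 1
    return total


# Toy instance: 5 edges, three occurrences of a 3-edge pattern with overlapping edges.
occurrences = [frozenset({0, 1, 2}), frozenset({1, 2, 3}), frozenset({0, 3, 4})]
mean, var = exact_moments(5, occurrences, p=0.5)
```

On this toy instance the enumerated mean equals the true count of 3, and the enumerated variance agrees with the closed form, which evaluates to 31 for $p = 0.5$.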
It is worth noting that the estimation accuracy is closely related to the variance $\mathrm{Var}[\hat{C}]$. Actually, because $\hat{C}$ is unbiased, the expectation of the squared error equals the variance, i.e., $E[(\hat{C} - C)^2] = \mathrm{Var}[\hat{C}]$.
Moreover, compared to the exact algorithm, the estimation algorithm has higher computational efficiency. In the estimation algorithm, the edges in the temporal graph $G$ are sampled, so the number of edges to be processed and the number of isomorphic subgraphs are reduced. Similar to the complexity of the exact algorithm (i.e., $O(\sum_{i=0}^{M} (m_{t_i})^{m_h} + |E|)$), the computational complexity of the estimation algorithm can be expressed as $O(\sum_{i=0}^{M} (m_{t_i})^{m_h} + p|E|)$, where $m_{t_i}$ is now the number of edges of the subgraph $G_{t_i}$ generated by partitioning the sampled graph $G_p$, $M$ is the number of subgraphs after edge sampling and graph partitioning, and $\sum_{i=0}^{M} m_{t_i} = p|E|$ in expectation. In the next section, we present several numerical experiments to verify that the algorithm has high estimation accuracy and efficiency.

Datasets and Setup
In this section, we evaluate the performance of the proposed algorithms on the following datasets.
CollegeMsg data [35]: The data record the private messages between users of the online social network at the University of California, Irvine. In these data, the temporal edge (u, v, t) of the temporal network means that the user u sends a private message to the user v at time t.
Email data [30]: These data were collected from a European research institution and contain the e-mails between institution members from October 2003 to May 2005 (18 months). In these data, the directed edge (u, v, t) denotes that an e-mail is sent from member u to member v at time t.
MathOverflow data [30]: The temporal network was generated from the interactions on the Stack Exchange website Math Overflow. The temporal edge (u, v, t) represents that user u answers user v's question or comments on user v's question/answer at time t.
Table 2 gives some statistics of these datasets. All experiments were conducted on a machine with an Intel Core i7 3.40 GHz processor and 8 GB of memory. The software environment was Java 1.8.0.
To verify the performance of the proposed algorithms, we first chose the baseline algorithm. At present, there are three works on the problem of counting TGP, i.e., the works in [30,32] and the BT algorithm [31]. However, the work in [30] only handles motifs with at most three edges, and the sampling framework in [32] cannot be applied to our temporal motif definition. Therefore, the first two works cannot be used for the problem in this paper, and we could only choose the BT algorithm [31] as the baseline. In the experiments, we re-implemented the BT algorithm and made some modifications to make it suitable for our temporal motif definition. The temporal graph patterns (TGPs) used in the experiments are the typical motifs shown in Figure 1, where the topology of the Triangle motif is rotationally symmetric and the topology of the Bi-fan motif is axially symmetric.

Experimental Results
Next, we evaluated the performance of the exact algorithm and the estimation algorithm. First, we compared the performances of the exact algorithm and BT algorithm on the different datasets. Then, we defined two parameters to show the estimation accuracy and computational efficiency of the estimation algorithm.

Results of the Exact Algorithm
For the exact algorithm, the time threshold ∆T is an important parameter, thus we compared the performance under different ∆T. In this experiment, we applied the exact algorithm and the BT algorithm to count the above three TGPs in three datasets (CollegeMsg, Email and MathOverflow), and considered four different thresholds: 0.5 h (1800 s), 1 h (3600 s), 1.5 h (5400 s) and 2 h (7200 s). The results of these two algorithms are shown in Figures 2-4 and Table 3. In Table 3, the speedup ratio Sr is the ratio of the running time of the BT algorithm to that of the exact algorithm. From these figures and the table, we can draw three findings.
First, the running time of both algorithms increases with the time threshold ∆T. To better explain the performance of the exact algorithm in Figures 2-4, Figure 5 gives the number of subgraphs in the set $S$ under different time thresholds. It can be seen that the number of subgraphs in $S$ decreases as ∆T increases. Therefore, the number of edges of a subgraph $G_{t_i}$ ($G_{t_i} \in S$) increases with ∆T. In Figure 5, we can also see that most of the subgraphs in $S$ have fewer than three edges, and the number of larger subgraphs decreases slightly as ∆T increases. Most of the subgraphs removed from $S$ are graphs with few edges that require very little calculation, which means that the decrease in the number of subgraphs in $S$ has little impact on the performance of the algorithm. Thus, when the time threshold ∆T increases, the running time of the exact algorithm is mainly affected by the number of edges of the subgraphs in $S$ and inevitably increases. Similarly, as ∆T increases, the number of intermediate results processed by the BT algorithm increases, which increases its running time. Besides, these figures also show that the change in running time caused by ∆T is more pronounced for the BT algorithm than for the exact algorithm.
Moreover, the performance of the exact algorithm is more stable than that of the BT algorithm when counting different TGPs. From the results of different TGPs, we can see that the running time of the exact algorithm changes very little, which indicates that the algorithm performance is scarcely affected by the shape of TGP. In contrast, the BT algorithm has obvious increase in running time when dealing with the Bi-fan pattern, which is more complex than the other two patterns. Comparing the results in different datasets, it is easy to find that both algorithms spend less time on the CollegeMsg dataset, because this dataset has fewer temporal edges than the other two datasets.
Finally, the overall performance of the exact algorithm is better than that of the BT algorithm. From the theoretical complexity analysis above and the analysis in [31], the computational complexity of the exact algorithm is $O(\sum_{i=0}^{M} (m_{t_i})^{m_h} + |E|)$, and that of the BT algorithm is $O(|E|^{m_h})$, where $m_{t_i}$, $m_h$ and $|E|$ are the numbers of edges of $G_{t_i}$, $H$ and $G$, respectively. Because each edge of $G$ belongs to exactly one subgraph, $\sum_{i=0}^{M} m_{t_i} = |E|$ and hence $\sum_{i=0}^{M} (m_{t_i})^{m_h} \leq |E|^{m_h}$, so the computational complexity of the exact algorithm is no higher than that of the BT algorithm. Although our algorithm is slightly slower than the BT algorithm in some cases, such as Figures 2b and 4a, the overall running time of our algorithm is better. Especially when ∆T is very large, the exact algorithm can even achieve a 3× speedup over the BT algorithm.

Results of the Estimation Algorithm
To evaluate the performance of the estimation algorithm, we defined the relative error and the speedup ratio Sr, where the relative error was used to measure the estimation accuracy and Sr was related to the computational efficiency. The definition of the relative error is as follows: [...] In the above equation, we add 1 to both $C$ and $\hat{C}$ to avoid the case $C = 0$. The Sr is defined as
$$Sr = \frac{t}{t_e},$$
where $t$ is the running time of the exact algorithm and $t_e$ is the running time of the estimation algorithm.
In this experiment, we tested the performance of the estimation algorithm under different probability values (0.6, 0.7, 0.8 and 0.9). Here, we only considered the case that the threshold ∆T is 7200 s, because the algorithm has similar characteristics under the other time thresholds. Then, for each TGP and each probability, we ran the proposed algorithms 10 times and computed the average relative error and Sr. Table 4 shows the relative error and the speedup ratio Sr of the estimation algorithm under different probabilities $p$. As the probability $p$ increases, the relative error decreases from about 5% to 1%, which means that the estimation accuracy of the algorithm improves. Actually, the fluctuation of the relative error inevitably reduces when increasing $p$, because the larger $p$ is, the smaller the variance $\mathrm{Var}[\hat{C}]$ is (as can be seen from Equation (6)). In this table, we can also see that the algorithm always has good estimation performance when dealing with different TGPs and datasets.
In addition, from the Sr in Table 4, we can see that the speedup ratio decreases with the increase of p, which means that the estimation algorithm takes more time when p is large. This is consistent with the fact that high probability p leads to high computational complexity of the estimation algorithm. Since the estimation algorithm still has high estimation accuracy when p is small, we can choose a relatively small p (e.g., 0.6-0.8) to achieve high accuracy and high computational efficiency at the same time.

Discussion
In this section, we discuss the application scope and limitations of the algorithm.
(1) The type of network: In Section 3, we propose two algorithms for the counting problem in the static temporal network. Similar to all algorithms based on such networks, the proposed algorithms cannot be applied to dynamic networks or to static networks whose edges do not carry temporal information. It is worth noting that, to our knowledge, no single algorithm can be applied to these different types of networks at the same time.
(2) The definition of TGP: In this paper, we consider the TGP defined in [27], which is different from the other definitions discussed in the Introduction. This definition is relatively suitable for communication networks [36], Wikipedia networks [28], and mobile cohesive groups [37].
(3) The algorithms: Since our algorithms partition the large graph into multiple subgraphs, the algorithms can be implemented in parallel, i.e., each subgraph is processed independently and the total result is obtained by adding the parallel results.
(4) The edge sampling: In the estimation algorithm, we use the edge sampling strategy to estimate the number of TGP. Since the increase in the edge number m h of TGP reduces the probability of the isomorphic subgraphs being sampled, the number of sampled isomorphic subgraphs will be small when m h is large. Therefore, the algorithm has high relative error when counting the TGP with many edges. This indicates that it is only suitable for the TGP with a small number of edges. In the future, the problem of estimating the number of large TGP in the temporal graph can be further studied.

Conclusions
In this work, we propose an exact algorithm and an estimation algorithm for counting temporal graph patterns in large temporal graphs. The exact algorithm is designed based on TFS. Since the algorithm matches the topology and the temporal relationship simultaneously, it achieves high computational efficiency. To further reduce the computational cost, we then design the estimation algorithm based on edge sampling. Extensive experiments on three real datasets showed that the exact algorithm is faster than the BT algorithm and that the estimation algorithm greatly reduces the running time while preserving the estimation accuracy.