Temporal Label Walk for Community Detection and Tracking in Temporal Network

Abstract: The problem of temporal community detection is discussed in this paper. The main existing methods are either structure-based or based on incremental analysis. The difficulty of the former is selecting a suitable time window. The latter requires the initial structure of the network to be known and the network to change stably. For most real data sets, these conditions rarely hold. A streaming method called Temporal Label Walk (TLW) is proposed in this paper, which removes the aforementioned restrictions. The modularity of the snapshots is used to evaluate our method. Experiments demonstrate the effectiveness of TLW for temporal community detection. Compared with static methods on real data sets, our method maintains a higher modularity as the window size increases.


Introduction
The concept of community defined by Newman and Girvan is widely accepted and used [1]. Community detection is a significant task and of great value in practical applications. For sales websites, discovering consumers of different products can help website managers purchase selectively; for social networks, it will be easier to recommend friends to users if we can detect communities sharing common interests.
Research on community detection falls into two categories: static methods and temporal methods. Classic methods such as the Kernighan-Lin algorithm [2], which is based on graph cuts, and the Label Propagation Algorithm (LPA) [3] are static methods. The well-known Girvan and Newman (GN) algorithm [1] is also a static method. Static methods divide a network into communities based on a static graph, in which the relationships between nodes and edges do not change. However, a growing number of real data sets cannot be represented by a static graph because their edges and nodes change constantly. Such data sets are called temporal data, and the networks built from temporal data are called temporal networks. Temporal approaches have been proposed to deal with temporal networks.
Structure-based methods and incremental analysis are the two most common approaches to community detection in a temporal network. Structure-based methods aggregate the temporal data in a time window $\Delta t$ into a graph snapshot $G_i$, so that the temporal network can be viewed as a sequence $G_1, G_2, \dots, G_s$, where each $G_i$ corresponds to the configuration of the graph in the $i$th time window [4]. Each $G_i$ is a static network, so traditional static algorithms can be used for community detection. The validation measures used for static networks are applicable to temporal networks as well. If the ground-truth communities are known, Normalized Mutual Information (NMI) [5] is often used; when the ground truth is unknown, modularity [1] is often used instead. Determining the size of the window is the main problem here. If $\Delta t$ is too small, few nodes and edges are included in a snapshot and it is hard to find any community structure. Conversely, if $\Delta t$ is too large, the evolution of communities cannot be tracked.
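The trade-off in choosing $\Delta t$ can be made concrete with a small sketch. The helper name `build_snapshots` is hypothetical and the toy event list is illustrative only; the code simply bins timestamped edges into consecutive windows, which is one straightforward way to form the snapshots $G_i$ and not necessarily how any of the cited methods does it.

```python
from collections import defaultdict


def build_snapshots(events, delta_t):
    """Group timestamped edges (u, v, t) into consecutive windows of width delta_t.

    Returns a list of edge sets, one per window, ordered by time.
    Empty windows are kept so that index i always maps to the i-th window.
    """
    if not events:
        return []
    t0 = min(t for _, _, t in events)
    buckets = defaultdict(set)
    for u, v, t in events:
        buckets[int((t - t0) // delta_t)].add((u, v))
    n_windows = max(buckets) + 1
    return [buckets.get(i, set()) for i in range(n_windows)]


# Toy stream: (sender, receiver, timestamp in seconds); values are illustrative.
events = [(1, 2, 0), (2, 3, 5), (1, 2, 12), (3, 1, 31)]
small = build_snapshots(events, delta_t=10)   # several sparse snapshots
large = build_snapshots(events, delta_t=100)  # a single snapshot, no evolution visible
```

With the small window, some snapshots hold a single edge or none at all; with the large window, every edge lands in one snapshot and the order of events is lost.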
To avoid having to select an appropriate $\Delta t$, incremental analysis adds vertices or edges to the network one by one as the temporal data stream arrives. Much research has used incremental analysis for community detection in temporal networks. Li et al. divide communities according to a permanence index [6]. Agarwal et al. maximize permanence to detect communities [7]. Guo et al. focus on local interactions for community detection [8]. Ditursi et al. also use local information to find seed nodes [9]. Several incremental methods for community detection are based on label propagation [10,11]. Incremental analysis requires that the community structure be known at the beginning and that the structure of the network change stably. However, the community structure is unknown in most real data sets, and the stability of the evolution cannot be guaranteed either.
We illustrate the form of the temporal data used by the two kinds of methods, together with a real graph sequence, in Figure 1. The real data set Students [12] consists of communications in a student organization. The community structure around certain nodes can be clearly observed by compressing one day's interactions together, as in Figure 2.

Figure 1. (a) is the data used in structure-based methods: the temporal data in a time window of size $\Delta t$ is mixed into a snapshot $G_0$. Incremental analysis in (b) knows the structure of the network at the beginning, and nodes and edges are added (or deleted) to change the local structure so that new communities grow. (c) is a real data set called Students, in which it is hard to find community structure at any $t_i$ and the information flow is sparse when edges appear.

This paper proposes a streaming method to detect and track communities without any fixed $\Delta t$. Our main contribution is two-fold:
• We propose a new method called Temporal Label Walk (TLW) to detect community structure in temporal networks without a time window. Our approach transforms community detection in a network into a clustering analysis in a vector space, where no prior knowledge of the community structure is required. TLW can also maintain modularity at a reasonable level by incorporating time information.
• We reveal that simply mixing temporal data into snapshots and then measuring community structure becomes less effective over time.

Our Method: Temporal Label Walk (TLW)
In this section, we describe our method in three parts:
• Introduce the basic idea of our method, including its premise, feasibility, and process.
• Give the mathematical definition of TLW.
• Measure the validity of community detection.

Basic Idea
In temporal networks, new nodes and edges are added over time. To use a static method for community detection, we must aggregate the data stream in a time window into a snapshot so that each snapshot can be regarded as a static graph. This raises two problems. First, how should the time window be selected? Snapshots built from different time windows contain different information. Second, how should the data stream be aggregated? Simply stacking all edges and stacking weighted edges can lead to different results. Moreover, the intensity of an interaction between two nodes depends on time in temporal networks: a recent interaction has a stronger effect than a past one. The importance of nodes also differs because the intensity of their interactions differs. A natural idea is to divide communities according to the importance of nodes. The temporal walk defined by Rozenshtein and Gionis in 2016 is a good tool for measuring the importance of nodes in temporal networks; they use temporal walks to extend static PageRank [13].
Béres et al. also use temporal walks to extend static Katz centrality to temporal networks [14]. These works verify the effectiveness of temporal walks in measuring the importance of nodes in temporal networks. In real networks, interactions within communities are dense but interactions between communities are sparse [1], so temporal walks can plausibly be used to divide the community structure of a temporal network. However, some special cases make it difficult to divide communities only by measuring the importance of nodes. In addition, time is an important factor in temporal data: the information a node received recently may be more crucial than information it received in the past. We give an example of each situation in Figure 3a,b, respectively.
The importance of nodes from different communities may be the same. Our solution is to give each node a label value and then compute it by temporal walks. In this way, the label values in two communities differ, and the communities can be separated. To divide communities according to label values, we use clustering methods, because each node is represented by a vector. Using clustering analysis for community detection has already been considered by many researchers and has proved to be an effective approach. Pons et al. propose a similarity measure based on random walks to find the dense subgraphs that form communities [15]. Cai et al. evaluate the performance of repeated random walks for community detection in social networks [16]. De Meo et al. add a pre-processing step in which edges are weighted according to their centrality with respect to the network topology, which raises the accuracy of existing algorithms on real-life data sets [17]. In addition, they combine the advantages of global approaches and local methods to propose a new clustering method [18]. Rémy et al. use clustering analysis to re-identify multiple addresses belonging to the same user in bitcoin user activity [19]. We apply this idea to temporal networks; more detail is given in Section 2.2.

Figure 3. Two different situations in temporal networks. (a) shows two communities A and B that share the same information exchange within the community; the corresponding nodes are of the same importance but belong to different communities. (b) shows that nodes $u$ and $v$ are in different communities at the beginning and $v$ sends a message to $u$ at time $t_i$. If $v$ sends a message to $u$ again at $t_{i+1}$, as shown in (c), $u$ will have a higher probability of belonging to B at $t_{i+2}$ than if nodes inside A send a message to $u$ at $t_{i+1}$, as illustrated in (d).
To summarize, we propose a streaming method for community detection in temporal networks. Label value is defined for each node and computed by temporal walk. The division of nodes is obtained by the clustering analysis of label values. The result of clustering division is the division of communities in networks.

Mathematical Definition
According to the description in Section 2.1, our method is based on two assumptions:
• Data sets are in the form of data streams.
• Interactions within communities are more frequent than those between communities.
Consider a directed graph $G_t = (V, E_t)$, where $V$ is the set of all nodes and $E_t$ is the set of edges at time $t$. We denote by $T = \{t_0, t_1, \dots, t_i, \dots\}$ the sequence of times at which new edges appear in the data stream. Let $X_u(t) \in \mathbb{R}^{|V|}$ denote the label value of node $u$ at time $t$. If edge $v \to u$ appears at time $t_{i+1} \in T$, we now describe how to compute $X_u(t_{i+1})$.
The variation of the label value of $u$ has two parts:
• For every temporal walk that ends in $v$ at $t_i$, a new temporal walk that starts from $v$ and ends in $u$ appears at $t_{i+1}$. The variation due to this new temporal walk is weighted by $\phi(\tau) = \beta \cdot e^{-c\tau}$, an exponential decay function in which $\beta$ is a constant and $\tau = t_{i+1} - t_i$. We use $\phi$ to represent that interactions at $t_{i+1}$ are more important than those at $t_i$.
• To maintain the timeliness of information, we define an enhancement function as another variation, given in terms of $\alpha \in \mathbb{R}$ and $e_v$, the unit vector whose $v$th element is 1 and whose other elements are 0. Note that here $v$ is the index of node $v$ in the set $V_t$.

The label value variation $\Delta$ of $u$ combines these two parts; $\Delta$ is the variation when $u$ interacts only with $v$. Denoting by $S_u$ the set of all nodes that send messages to $u$ at $t_{i+1}$, the variation of $u$ is taken over all $v \in S_u$. At $t_{i+1}$, the label value of $u$ is also affected by $\phi(\tau)$, and $X_u(t_{i+1})$ is computed from $X_u(t_i)$ and these variations. The initial label value is set to $X_u(t_0) = e_u$. The normalized vector is then taken for the next iteration if $S_u \neq \emptyset$, to avoid rapid growth of the label value. The complete procedure for updating the label value of $u$ at $t_{i+1}$ is referred to below as algorithm (6). Computing the label values can be regarded as a mapping $f : G_i \to \mathbb{R}^{|V| \times |V|}$, where $G_i$ is the snapshot at $t_i$ and $f(G_i) = (X_1(t_i), X_2(t_i), \dots, X_{|V|}(t_i)) \in \mathbb{R}^{|V| \times |V|}$. The community detection problem in the temporal network $G_t$ can therefore be transformed into a clustering analysis in the high-dimensional vector space $\mathbb{R}^{|V| \times |V|}$.
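The update described above can be sketched in a few lines of Python. The sketch assumes one particular reading of the update: when $v \to u$ appears, $u$ receives $\phi(\tau) X_v(t_i) + \alpha e_v$ from each sender $v \in S_u$, its own previous label is decayed by $\phi(\tau)$, and the result is normalized to unit Euclidean norm. The function name `tlw_labels`, the choice of norm, taking $\tau$ as the gap between consecutive timestamps in the stream, and the toy events are all assumptions made for illustration; the repository linked in Section 3 remains the reference for the actual implementation.

```python
import math
from itertools import groupby


def tlw_labels(events, nodes, alpha=1.0, beta=1.0, c=1e-6, normalize=True):
    """Stream edges (v, u, t), meaning v -> u at time t, and return a dict
    mapping each node to its label vector (a list of len(nodes) floats)."""
    index = {node: k for k, node in enumerate(nodes)}
    n = len(nodes)
    # X_u(t_0) = e_u: every node starts from its own unit vector.
    X = {node: [1.0 if k == index[node] else 0.0 for k in range(n)] for node in nodes}
    events = sorted(events, key=lambda e: e[2])
    t_prev = events[0][2] if events else 0.0
    for t, batch in groupby(events, key=lambda e: e[2]):
        phi = beta * math.exp(-c * (t - t_prev))  # decay over tau = t - t_prev
        old = dict(X)  # labels at the previous time; vectors are replaced, never mutated
        senders = {}
        for v, u, _ in batch:
            senders.setdefault(u, []).append(v)
        for u, S_u in senders.items():
            new = [phi * x for x in old[u]]      # decayed previous label of u
            for v in S_u:
                for k in range(n):
                    new[k] += phi * old[v][k]    # contribution of the new walk v -> u
                new[index[v]] += alpha           # enhancement term alpha * e_v
            if normalize:
                norm = math.sqrt(sum(x * x for x in new))
                if norm > 0:
                    new = [x / norm for x in new]
            X[u] = new
        t_prev = t
    return X


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Toy stream with two groups that never exchange messages (illustrative only).
nodes = ["a", "b", "c", "x", "y", "z"]
events = [("a", "b", 1), ("b", "c", 2), ("c", "a", 3),
          ("x", "y", 4), ("y", "z", 5), ("z", "x", 6),
          ("a", "b", 7), ("y", "x", 8)]
X = tlw_labels(events, nodes)
same = dot(X["a"], X["b"])   # nodes of the same group share label mass
cross = dot(X["a"], X["x"])  # nodes of different groups do not
```

In this toy stream the label vectors of the two groups have disjoint support, which is the property a clustering method would then exploit.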
We choose several widely used clustering methods for TLW. Affinity Propagation (AP) [20], Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [21], and Mini Batch K-means are compared in the following tests. AP and DBSCAN are used when the number of communities is unknown. If the number of clusters is known, Mini Batch K-means is adopted instead since it is much simpler [22].

Evaluation
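Most real data sets are unlabeled. To evaluate the division into communities, modularity is the most commonly used measure. Modularity was proposed by Newman [1] and is defined on an undirected graph as
$$Q = \frac{1}{2m} \sum_{v,u} \left[ A_{vu} - \frac{k_v k_u}{2m} \right] \delta(v, u), \tag{7}$$
where $A$ is the adjacency matrix of the undirected graph $G = (V, E)$, $m = |E|$ is the number of edges, $k_v$ ($k_u$) is the degree of node $v$ ($u$), and $\delta$ records whether two nodes belong to the same community: $\delta(v, u) = 1$ if $v$ and $u$ are in the same community, and $\delta(v, u) = 0$ otherwise.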
Here $G_t$ is a directed graph, so (7) becomes
$$Q = \frac{1}{m} \sum_{v,u} \left[ A_{vu} - \frac{k_v^{out} k_u^{in}}{m} \right] \delta(v, u), \tag{8}$$
where $k_v^{out}$ is the out-degree of $v$ and $k_u^{in}$ is the in-degree of $u$. We now summarize our method. TLW first updates the label value $X_u(t)$ by algorithm (6), then chooses a suitable clustering method for the vectors according to whether the number of communities is known. Nodes are assigned to communities according to the clusters they belong to. Finally, we use modularity to evaluate the division.
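A minimal sketch of the evaluation step, assuming the standard directed form of modularity, $Q = \frac{1}{m}\sum_{v,u}\left[A_{vu} - k_v^{out} k_u^{in}/m\right]\delta(v,u)$, with $m$ the number of edges. The function name `directed_modularity` and the toy graph are hypothetical. The null-model term is accumulated per community rather than per node pair, which gives the same value with less work.

```python
def directed_modularity(edges, community):
    """Directed modularity of a partition.

    edges: list of (v, u) pairs meaning v -> u; repeated pairs count with
           multiplicity, so A_vu is then an edge count.
    community: dict mapping every node to its community id.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    k_out, k_in = {}, {}
    for v, u in edges:
        k_out[v] = k_out.get(v, 0) + 1
        k_in[u] = k_in.get(u, 0) + 1
    # First term: fraction of edges whose endpoints share a community.
    inside = sum(1 for v, u in edges if community[v] == community[u]) / m
    # Second term: expected fraction under the degree-preserving null model.
    out_c, in_c = {}, {}
    for v, k in k_out.items():
        out_c[community[v]] = out_c.get(community[v], 0) + k
    for u, k in k_in.items():
        in_c[community[u]] = in_c.get(community[u], 0) + k
    labels = set(out_c) | set(in_c)
    expected = sum(out_c.get(c, 0) * in_c.get(c, 0) for c in labels) / (m * m)
    return inside - expected


# Two directed triangles joined by one edge (illustrative only).
edges = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4), (3, 4)]
two_groups = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
one_group = {node: 0 for node in two_groups}
q_two = directed_modularity(edges, two_groups)
q_one = directed_modularity(edges, one_group)
```

Putting every node in a single community yields zero modularity by construction, while the two-triangle partition scores clearly above zero.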

Experiment and Analysis
We carry out experiments to show that our method is effective in temporal community detection. Two real data sets are used: Students [12] and Facebook [23]. The four-month data set Students is a student community at the University of California, Irvine. Nodes represent students and edges denote message passing. The Facebook data set is a three-month subset of Facebook activity in a New Orleans regional community. Code and data sets are available online (https://github.com/Zhe-liangLiu/Temporal-Label-Walk). The experiments consist of three parts:
• Explain the importance of normalization.
• Use TLW to detect and track communities in the real data sets Students and Facebook; the modularity of several static methods, including K-clique, label propagation, and asynchronous label propagation, is also given.
• Discuss the effects of the parameters in the experiments. We put this part in Section 4.

Normalization
First, we explain the necessity of normalization with a simple example. Assuming that three nodes send messages regularly, we generate a simple data stream with three nodes to illustrate the normalization mentioned in Section 2.2, as shown in Figure 4. Figure 5 shows the label value of node 3 from $t_0$ to $t_{29}$ in the cases of normalization and non-normalization, respectively. It is more realistic for the label values to remain stable over a certain period as the community evolves. For real data sets, the interactions between nodes are more complex. Figure 5b illustrates that normalization keeps the label values of the nodes stable.
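The effect of normalization can be reproduced qualitatively with a small simulation. The sketch below is an assumption-laden stand-in for the stream of Figure 4: three nodes arranged in a ring, each sending one message per step for 30 steps, with an update of the form $\phi(\tau)\,(X_u + X_v) + \alpha e_v$. The ring pattern, the parameter values, and the helper names `norm` and `run` are hypothetical.

```python
import math


def norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def run(steps, normalize, alpha=1.0, beta=1.0, c=1e-6, tau=1.0):
    """Return the Euclidean norm of the third node's label after each step."""
    ring = [(0, 1), (1, 2), (2, 0)]  # (sender, receiver); index 2 stands for node 3
    phi = beta * math.exp(-c * tau)
    X = {u: [1.0 if k == u else 0.0 for k in range(3)] for u in range(3)}
    history = []
    for _ in range(steps):
        new_X = {}
        for v, u in ring:
            # Decayed own label plus decayed sender label, using the previous step's values.
            vec = [phi * (a + b) for a, b in zip(X[u], X[v])]
            vec[v] += alpha  # enhancement toward the sender
            if normalize:
                length = norm(vec)
                vec = [x / length for x in vec]
            new_X[u] = vec
        X = new_X
        history.append(norm(X[2]))
    return history


raw = run(30, normalize=False)     # grows roughly geometrically
stable = run(30, normalize=True)   # stays on the unit sphere
```

Without normalization the label magnitude roughly doubles at every step in this setting, so late interactions dwarf early ones; with normalization only the direction of the vector carries information.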

Community Detection and Tracking
In this section, we evaluate our method on real data sets. First, different clustering methods are tested on Students, and then the results of TLW and of static community detection algorithms are given. Finally, we visualize our division results for Students and Facebook.
To evaluate the results of different clustering methods based on (6), we use the Students data set. Figure 2 is the snapshot of the associated temporal network obtained by mixing all edges together. Some obvious clusters can be observed around nodes 3 and 1713. Let $c = 1 \times 10^{-6}$, $\beta = 1$, and $\alpha = 1$. We compute the label values of the nodes at 23:58:51 and use clustering methods to reveal nodes in the same community. Since the number of communities is unknown, we group the nodes by AP and DBSCAN, respectively. Based on the results of AP and DBSCAN, we set the number of clusters to 21 and also cluster with Mini Batch K-means. Focusing on the two key nodes 3 and 1713, Figure 6 shows the results of community detection. The community structure obtained from the division of $X_u(t)$ is in accordance with the actual situation, and the dependence on the clustering algorithm is not strong. The modularity and running time of the three methods are listed in Table 1. Modularity values greater than about 0.3 appear to indicate significant community structure in practice [24], and values typically fall in the range from 0.3 to 0.7 [1]. According to Table 1, all three methods obtain a good community structure with reasonable modularity, but AP is clearly more computationally expensive than Mini Batch K-means and DBSCAN, which may make AP impractical for processing large-scale networks. Compared with the other two methods, Mini Batch K-means obtains a better community structure and runs faster as the amount of data increases. If the community structure is stable over a certain period, DBSCAN can be used first and Mini Batch K-means can then be used to approximate it. Mini Batch K-means is used in the subsequent experiments. We choose node 525 (see Figure 6) to track the community evolution based on label values; the result is shown in Figure 7. Node 525 joins the community halfway and does not interact directly with node 1713. The number of communities in this experiment is set to 21 for a network with 87 nodes. This may be too many for a real social network. One reason is that some nodes communicate little with other nodes, which makes them noise points in the clustering analysis. In large-scale data sets, noise points are often categorized separately because of insufficient temporal information.
To evaluate our method for community detection, we compare its modularity with that of other community detection algorithms. We choose several community detection methods designed for static graphs: K-clique [25], asynchronous label propagation [26], and label propagation [3]. The dynamic community detection algorithm of [6], called DABP, is also included. The data stream of 27 June is mixed into an undirected snapshot with 87 nodes and 85 edges. The results are shown in Table 2. All methods perform well, with high modularity, on the first day. Please note that our method reveals the community structure at 23:58:51. Figure 8 shows how the modularity of all methods changes over time. The modularity of all methods decreases as time passes. The modularity of the static methods decreases rapidly, while that of TLW decreases much more slowly, which means that TLW has a distinct advantage over the other methods in maintaining a significant community structure. Although nodes in many networks fall naturally into communities, the time factor is very important because the community structure of the network is unstable. If the network is divided using only snapshots obtained by mixing data streams, community structures that have disappeared or changed are retained in the snapshots, which makes the community structure of the snapshots vague. In that case, it is difficult for static algorithms to obtain a high modularity. We also measure the running time of TLW and several related static methods. The results are listed in Table 3. TLW obtains the community division almost as fast as classical static algorithms. Since DABP is an incremental algorithm, we compare the real-time running time of TLW and DABP separately. We randomly choose 20 subgraphs from $G_0, G_1, \dots, G_n$ and measure the time needed to process each subgraph $G_i$; the result is shown in Figure 9.
The real-time running time of TLW grows slowly, whereas the running time of DABP depends on the scale of each subgraph. Our method obtains a higher modularity than DABP without consuming too much time. We also visualize the community structure of Students and Facebook over three periods in Figure 10. Several communities with a large number of nodes are shown. More and more nodes become members of a community as time goes on, which accords with what is observed in most social networks.

Figure 10. Community structure of Students and Facebook in one day, one week, and one month, respectively. Nodes in black do not yet clearly belong to any community.

The Effects of Parameters
Three parameters, $c$, $\beta$, and $\alpha$, must be set manually in TLW. The parameter $\alpha$ is used to control information intensity: a larger $\alpha$ means that information received recently is more important. Using the artificial data set generated in Section 3.1, the label values corresponding to different $\alpha$ are shown in Figure 11.

Figure 11. The effect of $\alpha$ on the label value of node 3. A larger $\alpha$ makes node 2 have more influence on node 3 than the other nodes.

$\beta$ and $c$ are time decay parameters that control the speed of the exponential decay of information. The choice of $c$ depends on the time interval $\tau$ of the data stream. We generate a new data stream to illustrate the effect of different $\beta$ and $c$. It is similar to the artificial data stream generated in Section 3.1, except that now only one message is passed, in turn, at each $t_i$; this data stream is shown in Figure 12.
Let $\alpha = 0.1$. Fixing $\beta$ and $c$ in turn, the first component of $X_3(t)$ for different $c$ and $\beta$ is given in Figure 13a,b. The variation of the label value from peak to trough is not violent with $\beta = 1, c = 0.25$ or $\beta = 1, c = 0.1$; here $\tau = t_{i+1} - t_i$ is 1 s. The parameter $c$ is more important than $\beta$ because $\tau$ ranges from 1 s to 1000 s or more in real data sets, so it is appropriate to consider $c$ and $\tau$ together so that $e^{-c\tau}$ is a number between 0 and 1. To further analyze the influence of the parameters on community detection, we take the first day of the Students data set and fix $\beta = 1$. Using Mini Batch K-means with the number of clusters set to $n = 14$, we show the effect of different $c$ and $\alpha$ on modularity in Table 4. Please note that modularity does not change linearly with $c$ and $\alpha$, but depends on the data set.
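The interplay between $c$ and $\tau$ is easy to inspect numerically. The short sketch below evaluates $\phi(\tau) = \beta e^{-c\tau}$ for the values of $c$ mentioned in the text and a few illustrative intervals; the function name `phi` and the chosen grid of $\tau$ are assumptions made for the example.

```python
import math


def phi(tau, beta=1.0, c=1e-6):
    """Exponential decay factor beta * exp(-c * tau)."""
    return beta * math.exp(-c * tau)


# Decay factor for several c over intervals from 1 s to 1000 s.
for c in (1e-6, 1e-3, 0.1, 0.25):
    row = [phi(tau, c=c) for tau in (1, 10, 100, 1000)]
    print(f"c={c:g}: " + ", ".join(f"{value:.3g}" for value in row))
```

For $\tau$ around 1 s, $c = 0.1$ or $c = 0.25$ gives a moderate decay, but for intervals of hundreds of seconds the same values drive the factor to essentially zero, whereas $c = 10^{-6}$ keeps it close to 1.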

Conclusions
In this paper, we propose TLW, a community detection method for temporal networks based on the label value $X_u(t)$ of each node $u$. TLW transforms the community detection problem into a clustering analysis in a high-dimensional vector space. We validate the effectiveness of our algorithm: TLW keeps the modularity of the community division reasonably high as time passes. We also show that it is hard for static community detection algorithms to divide the community structure well in real data sets if temporal data streams are only mixed into snapshots. Although clustering is an important part of our approach to community detection, only a few simple clustering algorithms are discussed in this paper. Meanwhile, many dynamic community analysis algorithms have been integrated into a package [27], which makes it convenient for users to compare their methods with classical dynamic community detection approaches. TLW is quite different from these methods, and the division into communities is influenced by the evaluation criteria and data sets used. A comprehensive comparison will be given in future work.