Multi-Layer Feature Fusion-Based Community Evolution Prediction

Abstract: Analyzing and predicting community evolution has many important applications in criminology, sociology, and other fields. In community evolution prediction, most existing research simply calculates the features of the community and then predicts the evolution event with a classifier. However, these methods do not consider the complex characteristics of community evolution, and they predict the community’s evolution from a single level only. To solve these problems, this paper proposes an algorithm called multi-layer feature fusion-based community evolution prediction, which obtains features from the community layer and the node layer. The final community feature is the fusion of the features from the two layers. At the node layer, this paper proposes a global and local-based role-extraction algorithm. This algorithm can effectively discover different roles in the community, which makes it possible to distinguish the influence of nodes with different characteristics on community evolution. At the community layer, this paper proposes to use the community hypergraph to obtain the inter-community interaction relationship. After all the features are obtained, this paper trains a classifier on these features and uses it for community evolution prediction. The experimental results show that the proposed algorithm achieves better prediction results than the other algorithms.


Introduction
Many complex networks exhibit community structure. A community is a set of nodes, where connections within the community are relatively dense, and connections between communities are relatively sparse [1]. The community is the basic structure of the network, and has many important applications, such as influence maximization [2,3] and rumor tracing. Because communities are so widely applicable, many scholars have devoted themselves to community-related research ever since the concept was proposed. These studies include community discovery [4,5], community evolution analysis [6], and community evolution prediction.
Among the many research directions related to communities, community evolution prediction is an important one. In dynamic networks, communities change over time, which is called community evolution. Community evolution prediction can be defined as predicting the future evolution of a community when its historical evolution is known.
Community evolution prediction has important applications in many fields. For example, in criminology [7], a criminal group can be regarded as a community, and tracking and predicting its changes can help to control it over time. In an infectious disease transmission network, by tracking the changes in infected communities, the structural characteristics of highly infectious communities can be found to predict the future spread of infectious diseases.
The main contributions of this paper are as follows:
• We propose to obtain features from the node layer and the community layer, respectively. In this way, the features of nodes and the influence of inter-community relationships on community evolution can be better captured.
• We propose the global and local-based community role-extraction algorithm. The algorithm first obtains the features of the node from the global scope and the local scope. Then, global and local feature matrices are constructed based on the obtained node features. Finally, we obtain the role attribution matrix of the node through NMF. Through the role attribution matrix, we can assign a community role to each node.
• We propose obtaining the community mesoscopic features through a community hypergraph. To obtain the community mesoscopic features, we first construct the hypergraph according to the similarity between the communities. Then we perform NMF on the adjacency matrix corresponding to the hypergraph to obtain the mutual influence between the communities.
The rest of the paper is organized as follows: Section 2 introduces the overall process of community evolution prediction and related work. Section 3 introduces the MCEP algorithm proposed in this paper in detail. Section 4 shows the results of the comparative experiment and other related experiments. Section 5 summarizes the full paper.

Community Evolution
In a dynamic network, a community evolves as the network changes. To study how a community changes across snapshots, scholars have proposed community evolution models, which define community evolution events. The specific definitions of these events differ between models; the most commonly used model is group evolution discovery (GED). Since the algorithm proposed in this paper analyzes community evolution through GED, this section briefly introduces the community evolution events in GED.
In GED, community evolution events are divided into seven categories:
• Birth: a community exists in the current snapshot, but no community in the previous snapshot matches it.
• Survive: two communities in adjacent snapshots are identical or differ in only a small number of nodes.
• Dissolve: a community existed in the previous snapshot, and no community in the current snapshot matches it.
• Grow: in the current snapshot, some new nodes join the community and make it larger than in the previous snapshot.
• Shrink: the opposite of growing; a community is reduced in size compared to the previous snapshot.
• Merge: a community in the current snapshot is formed by merging several communities from the previous snapshot.
• Split: a community divides into several small communities in the current snapshot.
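The event definitions above can be illustrated with a minimal sketch. It is a simplification and not the GED algorithm itself: it assumes that communities are plain Python sets of node ids and that two communities are matched using only the fraction of shared members, whereas the published GED method uses a richer inclusion measure. The function names `label_pair` and `label_events` and the thresholds `alpha` and `beta` are illustrative assumptions.

```python
def label_pair(prev, cur, alpha=0.5, beta=0.5):
    """Label the relation between a community in snapshot t-1 and one in snapshot t.

    Simplified, overlap-only matching (an assumption, not the GED inclusion measure).
    """
    shared = len(prev & cur)
    inc_prev = shared / len(prev)  # share of prev's members that appear in cur
    inc_cur = shared / len(cur)    # share of cur's members that come from prev
    if inc_prev >= alpha and inc_cur >= beta:
        if len(cur) > len(prev):
            return "grow"
        if len(cur) < len(prev):
            return "shrink"
        return "survive"
    if inc_prev >= alpha:
        return "merge"  # prev is largely absorbed into a bigger community
    if inc_cur >= beta:
        return "split"  # cur is largely a fragment of prev
    return None         # the two communities do not match


def label_events(prev_comms, cur_comms, alpha=0.5, beta=0.5):
    """Return (prev_index, cur_index, event) triples for two adjacent snapshots."""
    events, matched_prev, matched_cur = [], set(), set()
    for i, prev in enumerate(prev_comms):
        for j, cur in enumerate(cur_comms):
            event = label_pair(prev, cur, alpha, beta)
            if event is not None:
                events.append((i, j, event))
                matched_prev.add(i)
                matched_cur.add(j)
    # Unmatched communities dissolve (previous snapshot) or are born (current snapshot).
    events += [(i, None, "dissolve") for i in range(len(prev_comms)) if i not in matched_prev]
    events += [(None, j, "birth") for j in range(len(cur_comms)) if j not in matched_cur]
    return events
```

With both thresholds at 0.5, a community that loses exactly half of its members is still labeled as shrinking rather than splitting, so the threshold choice directly shapes the event distribution.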

Community Evolution Prediction
Most of the existing research on community evolution prediction makes predictions in four steps: network data splitting, community discovery, evolution analysis, and feature extraction followed by prediction. Most papers focus on the last step, in particular on obtaining the features.
The existing algorithms can be divided into two types: (a) methods that do not contain historical information; and (b) methods that contain historical information. The difference between the two types of method is whether to use historical information.
Methods that do not contain historical information use only the community of the current snapshot for prediction. The advantage of these methods is that they are relatively simple and that the obtained feature dimension is lower, so predictions can be made more quickly. Takaffoli et al. [12] proposed the concept of the meta-community. Through the meta-community, the evolutionary relationship between nonadjacent snapshots can be tracked. In addition, they used properties of influential nodes, properties of the community, and temporal changes in these properties to predict community evolution. Ilhan et al. [15] pre-trained a multi-classifier on many synthetic networks. The multi-classifier can automatically select features useful for prediction from a feature set before prediction.
Community evolution is a sequential process. The historical evolution of the community affects its current evolution. Methods that contain historical information take the historical evolution into account, so they perform better than methods that do not. Diakidis et al. [13] hypothesized that communities in social networks are the result of user interaction. Therefore, they selected structure, content, contextual features, and the previous state of the community as community features. However, their method does not consider the split and merge events of the community. İlhan et al. [14] used ARIMA (autoregressive integrated moving average) to predict the future changes in community features. Then, they spliced the predicted features with the community features of the historical snapshots to obtain the final feature representation. Dakiche et al. [10] considered historical community features and the influence of leader nodes on community evolution prediction, and used the change rate of community features to predict community evolution. The above methods are mainly based on GED (group evolution discovery) [17] for community evolution analysis. Gliwa et al. [8] proposed a new community evolution model called stable group change identification (SGCI) and used it to predict community evolution. SGCI used leadership, community density, cohesion, and group size to predict the future evolution of the community.
Although the above methods have been able to achieve good results, these algorithms do not consider the complex characteristics of the community evolution. Specifically, the existing algorithms have the following drawbacks: first, although existing algorithms consider the influence of leader nodes on the prediction results, it is obvious that there are other types of nodes in the community. The influence of these nodes on the evolution of the community has not been considered. Second, the existing algorithms do not consider the influence between communities. The relationship between communities is an important factor affecting the evolution of the community.
To solve the above problems, this paper proposes multi-layer feature fusion-based community evolution prediction (MCEP).

Multi-Layer Feature Fusion-Based Community Evolution Prediction
This section will introduce the MCEP algorithm in detail. As mentioned in the related work section, community evolution prediction is mainly divided into four steps. The main work of this paper is the last step, which obtains the corresponding features.
The structure of this section is as follows. Section 3.1 briefly introduces the features used in MCEP. Section 3.2 introduces the global and local-based community role-extraction algorithm proposed in this paper. Section 3.3 introduces the algorithm for discovering the community mesoscopic features. Section 3.4 briefly introduces the prediction process of the MCEP algorithm.
Before introducing MCEP, we first briefly introduce the related symbols used in this paper.
In this paper, the dynamic network is defined as $G_t = (V_t, E_t)$, where $V_t$ is the set of nodes in the network at time t, $E_t$ is the set of edges in the network at time t, $t = 1, 2, \ldots, T$, and T is the number of snapshots in the network. $N_t = |V_t|$ represents the number of nodes in the network at time t. $C_t$ represents the community structure at time t, $C_{t,i}$ represents the i-th community at time t, and $Role_C$ represents the division of roles in community C. The main symbols used in this paper are given in Table 1.

Table 1. Main symbols used in this paper.

| Symbol | Meaning |
|---|---|
| $G_t$ | The network of snapshot t |
| $C_t$ | Community set at snapshot t |
| $C_{t,i}$ | Community i at snapshot t |
| $\xi_C$ | The global feature matrix of community C |
| (a matrix in $\mathbb{R}^{|C_i| \times M}$) | The local feature matrix of community C |
| $F_C \in \mathbb{R}^{|C| \times L}$ | The mesoscopic feature of community C |
| $W_C \in \mathbb{R}^{K \times M}$ | The global feature contribution matrix |
| $H_C$ | The local feature contribution matrix |
| $R_C$ | The role attribution matrix |

In the above table, $|C|$ is the number of communities in the network, $|C_i|$ is the number of nodes in the community C, K is the number of roles, and L is the number of community mesoscopic features.

Community Features
Similar to other community evolution prediction algorithms, MCEP also needs to construct a training set by calculating related features. This section will introduce the features used in MCEP.
The features used in MCEP can be divided into three types: overall community features, node features, and community mesoscopic features. Compared with the first two types, the calculation of community mesoscopic features is more complicated, and we introduce it in detail in Section 3.3. This section mainly introduces the overall community features and node features used in MCEP. It is worth pointing out that because different networks have different properties, it is difficult to find a feature set that is effective for all networks. The experimental part also illustrates this point. The features listed below are just some general features. In practical applications, the user should choose features based on specific conditions.
The overall community features measure the community from an overall perspective and mainly reflect the features of the community at the macro level. The overall community features used by MCEP are community size, density, and cohesion. The community size is the number of nodes in the community. Both the density and the cohesion measure the closeness of the connections within the community. The density focuses on the connections within the community, whereas the cohesion considers the number of internal connections together with the connections to the outside of the community.
The overall community features are macroscopic reflections of the community. In addition to the macro features, the community also has micro features, which refer to the features of the nodes in the community. To obtain the micro features, MCEP calculates each feature for every node in the community and then takes the average value over the nodes. These average values are the micro features of the community. The node features used in MCEP are degree, clustering coefficient [18], betweenness centrality [19], closeness centrality, and eigenvector centrality [20]. Among them, clustering coefficient and closeness centrality mainly measure the tightness of connections around nodes. Betweenness centrality and eigenvector centrality mainly measure the importance of nodes in the network.
The relevant calculation formulas and symbol meanings of the above features are given in Table 2.

Table 2. Features and their calculation formulas.

In the above table, $Nei_v$ is the set of neighbor nodes of node v, $O_c$ is the number of edges between nodes in community C and external nodes, $PL_v$ is the maximum number of possible connections among the neighbors of node v, $L_v$ is the actual number of connections among the neighbors of node v, $\sigma_{st}$ is the number of shortest paths from node s to node t, $\sigma_{st}(v)$ is the number of those shortest paths that pass through v, $Dis(v, u)$ is the distance from node v to node u, and $x_v$ is the component of the eigenvector corresponding to node v.
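As a rough illustration of how such features might be computed, the following sketch derives the community size, density, and cohesion, together with the clustering coefficient $L_v / PL_v$, from an adjacency dictionary. The exact formulas used by MCEP are not reproduced here, so the definitions in the code are assumptions: density is taken as the fraction of possible internal edges that exist, and cohesion as the ratio of that internal density to the density of edges leaving the community (based on $O_c$). All function names are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency dict {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def community_features(adj, community):
    """Overall community features under the assumed definitions described above."""
    members = set(community)
    n, total = len(members), len(adj)
    # Each internal edge is seen from both endpoints, hence the division by two.
    internal = sum(1 for v in members for u in adj[v] if u in members) // 2
    outgoing = sum(1 for v in members for u in adj[v] if u not in members)  # O_c
    max_internal = n * (n - 1) / 2
    max_outgoing = n * (total - n)
    density = internal / max_internal if max_internal else 0.0
    external = outgoing / max_outgoing if max_outgoing else 0.0
    # A community without outgoing edges is treated as infinitely cohesive.
    cohesion = density / external if external else float("inf")
    return {"size": n, "density": density, "cohesion": cohesion}


def clustering_coefficient(adj, v):
    """L_v / PL_v: realized over possible connections among the neighbors of v."""
    neighbors = adj[v]
    possible = len(neighbors) * (len(neighbors) - 1) / 2  # PL_v
    if possible == 0:
        return 0.0
    actual = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])  # L_v
    return actual / possible


def mean_node_feature(adj, community, feature):
    """Micro feature of a community: the average of a node feature over its members."""
    return sum(feature(adj, v) for v in community) / len(community)
```

The same `mean_node_feature` helper can be reused for any node-level measure that takes the adjacency dictionary and a node, which mirrors the averaging step described for the micro features.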

Global- and Local-Based Community Role Extraction
This section introduces the global and local-based community role-extraction algorithm in detail.
As mentioned above, in addition to leader nodes, there are many other types of nodes in the community, which also affect the evolution of the community. We refer to the different types of nodes in the community as community roles, which are defined as follows:
Definition 1 (Community role). The community role classifies nodes according to their structural features. Community roles describe the different features of nodes in the community. Nodes belonging to different roles have different effects on the evolution of the community.
In contrast to traditional role extraction, the discovery of community roles needs to consider not only the network structure features of a node but also its community structure features. Therefore, it is difficult to accurately discover community roles using traditional role-extraction algorithms. To accurately discover the community roles, we propose the global- and local-based community role-extraction algorithm (GLRE). In this algorithm, global refers to the whole network, and local refers to the community where the node is located.
GLRE obtains the community roles by computing the community role attribution matrix, which is defined as follows:
Definition 2 (Community role attribution matrix). The community role attribution matrix R reflects the probability that a node belongs to each community role. Roles can be assigned to nodes in the community through this matrix.
To obtain the community role attribution matrix, GLRE first calculates the structural features of each node from the global scope and the local scope. Then, GLRE constructs the global and local feature matrices. The features used in GLRE are the node features mentioned in Section 3.1. After obtaining the global and local feature matrices, GLRE decomposes the two matrices simultaneously through NMF to obtain the community role attribution matrix. The rest of this section introduces the process of the GLRE algorithm in detail.
First, GLRE calculates the features of nodes in the global and local scopes and constructs the global feature matrix $\xi_C$ and the local feature matrix of community C.
It is worth pointing out that when calculating the features of a node from a local scope, GLRE will ignore nodes that do not belong to the community. Figure 1 is an example. Among them, nodes of the same color belong to the same community. For node 1 in the graph, its degree is 4. Among the neighbors of node 1, two nodes belong to the same community as node 1, and two nodes belong to different communities from node 1. When computing node features from the global scope, the degree of node 1 is 4, as shown on the left. When computing node features from a local scope, the degree of node 1 is 2, as shown on the right.
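The distinction between the two scopes can be sketched in a few lines. The example assumes an adjacency-dictionary representation and a hypothetical edge list chosen only to reproduce the degrees described for node 1 (the actual edges of Figure 1 are not available here); `local_view` is an illustrative helper name.

```python
def build_adjacency(edges):
    """Build an undirected adjacency dict {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_view(adj, community):
    """Restrict the graph to one community, ignoring nodes outside of it."""
    members = set(community)
    return {v: adj[v] & members for v in members}


def degree(adj, v):
    """Degree of v in whichever view (global or local) is passed in."""
    return len(adj[v])


# Hypothetical edges: node 1 has two neighbors inside its community {1, 2, 3}
# and two neighbors in another community {4, 5}.
edges = [(1, 2), (1, 3), (2, 3), (1, 4), (1, 5), (4, 5)]
adj = build_adjacency(edges)
local = local_view(adj, {1, 2, 3})
print(degree(adj, 1), degree(local, 1))  # prints: 4 2
```

Because the local view is just another adjacency dictionary, any node feature written against that structure yields its global value on `adj` and its local value on `local` without further changes.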
After obtaining the global and local feature matrices of the community, GLRE uses NMF to decompose the two matrices and obtain the community role attribution matrix.
To consider both the global and local features of the node in the NMF, we design a loss function, Formula (1), so that GLRE can decompose the global feature matrix $\xi_C$ and the local feature matrix at the same time. In this loss function, $\xi_C$ is the global feature matrix of community C, $W_C \in \mathbb{R}^{K \times M}$ is the global feature contribution matrix of nodes in C, $H_C \in \mathbb{R}^{K \times M}$ is the local feature contribution matrix of nodes in C, and $R_C \in \mathbb{R}^{|C_i| \times K}$ is the role attribution matrix of nodes in community C.
After obtaining the loss function, the GLRE algorithm uses the multiplicative update rule [21] to minimize it and obtain $W_C$, $H_C$, and $R_C$; the final update rules are Formulas (2)-(4). After obtaining the community role attribution matrix $R_C$, we can obtain the community role of each node. Each row in $R_C$ represents the probabilities that the corresponding node belongs to each community role, and the nodes in the community are assigned roles through the elements of $R_C$ according to Formula (5). The process of the GLRE algorithm is shown in Algorithm 1.

Algorithm 1 GLRE
Require: network structure G, community C
1: Calculate the node features of each node from the global scope and the local scope, respectively
2: Construct the global feature matrix $\xi_C$ and the local feature matrix according to the obtained node features
3: Get $H_C$, $W_C$, and $R_C$ according to Formulas (2)-(4)
4: Obtain the role of each node according to Formula (5)
5: Output the role division result

After obtaining the roles of nodes in the community through GLRE, for nodes with the same role, MCEP calculates the node features and uses the average value as the features of this type of role.
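The joint factorization can be sketched with NumPy under explicit assumptions, since the loss function and update formulas are not reproduced here. The sketch assumes a squared Frobenius loss in which both feature matrices share one role attribution matrix, that is, the global matrix is approximated by `R @ W` and the local matrix by `R @ H`, and it uses the standard multiplicative updates for that objective. It further assumes that a node is assigned the role with the largest entry in its row. The names `glre_sketch`, `joint_loss`, `assign_roles`, and `role_features` are hypothetical.

```python
import numpy as np


def glre_sketch(X_global, X_local, n_roles, n_iter=200, eps=1e-9, seed=0):
    """Jointly factorize two nonnegative (nodes x features) matrices.

    Assumed objective: ||X_global - R W||_F^2 + ||X_local - R H||_F^2
    with R, W, H >= 0 and R shared between both terms.
    """
    rng = np.random.default_rng(seed)
    n_nodes, n_feats = X_global.shape
    R = rng.random((n_nodes, n_roles)) + eps  # role attribution matrix
    W = rng.random((n_roles, n_feats)) + eps  # global feature contribution matrix
    H = rng.random((n_roles, n_feats)) + eps  # local feature contribution matrix
    for _ in range(n_iter):
        # Multiplicative updates keep every factor nonnegative.
        W *= (R.T @ X_global) / (R.T @ R @ W + eps)
        H *= (R.T @ X_local) / (R.T @ R @ H + eps)
        R *= (X_global @ W.T + X_local @ H.T) / (R @ (W @ W.T + H @ H.T) + eps)
    return R, W, H


def joint_loss(X_global, X_local, R, W, H):
    """Value of the assumed joint objective."""
    return np.linalg.norm(X_global - R @ W) ** 2 + np.linalg.norm(X_local - R @ H) ** 2


def assign_roles(R):
    """Assumed assignment rule: each node takes the role with the largest entry in its row."""
    return R.argmax(axis=1)


def role_features(X, roles, n_roles):
    """Average the node features of all nodes that share a role (zeros for an empty role)."""
    out = np.zeros((n_roles, X.shape[1]))
    for k in range(n_roles):
        members = X[roles == k]
        if len(members):
            out[k] = members.mean(axis=0)
    return out
```

Under this assumed objective, the joint problem is equivalent to a single NMF of the two feature matrices placed side by side, which is why one shared `R` can serve both scopes.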

Community Mesoscopic Feature Mining
This section will first introduce the definition of community mesoscopic features proposed in this paper. Then, this section will propose the corresponding mesoscopic feature-mining algorithm based on the community hypergraph.
The community is a mesoscopic structure in the network. In the real world, there is much interaction between communities, and these interactions also affect the future evolution of a community. To consider the influence relationship between communities, we define the community mesoscopic feature as follows:
Definition 3 (Community mesoscopic feature). The community mesoscopic feature describes the mutual influence relationship between communities. It is a concrete manifestation of the influence of other communities on this community. In this paper, the community mesoscopic feature of community C is represented by $F_C$. The mesoscopic features of all communities are represented by matrix F.
Since the interaction between communities is a complex and abstract effect, the mesoscopic features of the community cannot be obtained through simple calculation. To accurately discover the community mesoscopic features, we propose an algorithm for mining mesoscopic features. This algorithm first treats the community as a hypernode in the network, then constructs a community hypergraph based on the similarity between hypernodes. Finally, the algorithm obtains the community mesoscopic features through NMF.
We first give the definition of the community hypergraph, as follows:
Definition 4 (Community hypergraph). A hypergraph is a concept opposite to the simple graph. A node in a hypergraph is a set of nodes, and the edges in a hypergraph describe the relationships between these sets of nodes. A community hypergraph is a hypergraph constructed by treating each community as a node. We refer to the communities in the hypergraph as community hypernodes.
Figure 2 shows an example of a community hypergraph. In Figure 2, the network on the left is the original network, in which nodes of the same color belong to the same community. The network on the right is the community hypergraph constructed from the original network, and the relationship between each hypernode and its community is given by the solid arrow. As can be seen from Figure 2, by constructing the community hypergraph, the algorithm can ignore the complexity of the internal connections of the community and focus on mining the influence relationship between the communities. In addition, in the original network, there is no direct connection between the blue community and the green community. In the hypergraph, however, the blue node and the green node are connected, because in a real network the influence between communities can be generated not only through the interconnection of nodes between communities, but also through other means, such as mutual attention between communities.
In this paper, the community hypergraph is constructed by calculating the similarity of community hypernodes. The weight of an edge in the hypergraph is the corresponding hypernode similarity, which is defined as follows:
Definition 5 (Hypernode similarity). Hypernode similarity describes the degree of similarity between community hypernodes.
The hypernode similarity of the community hypernodes $C_i$ and $C_j$ is calculated by Formula (6), where $S_{C_i,C_j}$ is the hypernode similarity of $C_i$ and $C_j$, and the features of $C_i$ and $C_j$ used in the calculation are those obtained by the GLRE algorithm.
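A minimal sketch of the similarity computation is given below. The similarity formula itself is not reproduced here, so the code assumes cosine similarity between the community feature vectors as a stand-in; the function name `hypernode_similarity` is hypothetical, and the feature vectors are assumed to be nonnegative so that the resulting matrix is suitable for NMF.

```python
import numpy as np


def hypernode_similarity(features, eps=1e-12):
    """Pairwise cosine similarity between community feature vectors (one row per community).

    Cosine similarity is an assumption; any nonnegative similarity could be used instead.
    """
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, eps)  # guard against all-zero feature vectors
    return unit @ unit.T
```

Entry `(i, j)` of the returned matrix then serves as the weight of the hypergraph edge between hypernodes i and j.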
The reason for using similarity to construct a community hypergraph is as follows: a community is not isolated in the network. Its future evolution will be affected by other communities. This impact can be direct or indirect. If the two communities are more similar, they are more likely to influence each other. For example, in a social network, if two communities have the same interests, the two communities may follow each other. Therefore, even if the two communities are not connected, the evolution of one community is likely to lead to corresponding changes in the other community.
Finally, MCEP decomposes the similarity matrix S of the community hypergraph through NMF. After NMF, MCEP obtains the community mesoscopic features from the mesoscopic feature matrix. The optimization goal of the NMF is given by Formula (7), where S is the hypergraph similarity matrix, F is the community mesoscopic feature matrix, each row in F is the mesoscopic feature of the corresponding community, and M is the mapping matrix, which reflects the mapping relationship from S to F.
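The decomposition step can be sketched with the standard multiplicative-update NMF. The optimization goal is not reproduced here, so the sketch assumes the common form in which S is approximated by the product `F @ M` under a squared Frobenius loss with nonnegative factors; `nmf` and `n_features` (playing the role of L) are illustrative names.

```python
import numpy as np


def nmf(S, n_features, n_iter=300, eps=1e-9, seed=0):
    """Factorize a nonnegative matrix S as F @ M using multiplicative updates.

    Assumed objective: minimize ||S - F M||_F^2 subject to F >= 0 and M >= 0.
    Each row of F is read as the mesoscopic feature vector of one community.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = S.shape
    F = rng.random((n_rows, n_features)) + eps
    M = rng.random((n_features, n_cols)) + eps
    for _ in range(n_iter):
        M *= (F.T @ S) / (F.T @ F @ M + eps)
        F *= (S @ M.T) / (F @ M @ M.T + eps)
    return F, M
```

Because the factorization depends on the random initialization, fixing `seed` keeps the mesoscopic features reproducible between runs.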

Community Evolution Prediction
This section will explain the feature set construction method used for prediction, and the overall process of MCEP.
Suppose that, for community i at time t, the overall community features obtained by MCEP are represented by $TF_{t,i}$, the community role features are represented by $RF_{t,i,1}$, $RF_{t,i,2}$, $RF_{t,i,3}$ (assuming that the number of roles is 3), and the community mesoscopic feature is represented by $MF_{t,i}$. The final community feature is obtained by splicing the above features (Formula (8)): $F_{t,i} = [TF_{t,i}, RF_{t,i,1}, RF_{t,i,2}, RF_{t,i,3}, MF_{t,i}]$.
Because the future evolution of the community is affected by its historical evolution, MCEP takes time information into account by tracking the evolution process of the community to obtain its evolution sequence. Then MCEP splices the corresponding community features and their evolution events to obtain the features for training and prediction. The feature structure of community i at time t is given by Formula (9), where $T_{t,i}$ represents the total feature of community i at time t, $F_{t,i}$ is the feature of community i at time t, and $E_{t,i}$ is the evolution event of community i at time t.
Because MCEP uses binary classification in evolution prediction, i.e., it trains a separate classifier for each evolution event, $E_{t,i}$ has only two values: $E_{t,i}$ is 1 when the corresponding evolution event occurs, and $E_{t,i}$ is 0 when it does not occur.
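The splicing of the per-community features and the construction of per-event binary labels can be sketched as follows. The sketch covers only the concatenation of $TF_{t,i}$, the role features, and $MF_{t,i}$, plus the 0/1 encoding of $E_{t,i}$; how the history is arranged inside $T_{t,i}$ is not assumed. The function names and the example values are hypothetical.

```python
import numpy as np

EVENTS = ("birth", "survive", "dissolve", "grow", "shrink", "merge", "split")


def community_feature(overall, role_features, mesoscopic):
    """Concatenate TF, the role features RF_1..RF_K (row by row), and MF into one vector."""
    return np.concatenate([
        np.ravel(overall),
        np.ravel(role_features),  # a (K x d) array is flattened role by role
        np.ravel(mesoscopic),
    ])


def binary_labels(observed_events, target):
    """E = 1 where the observed event equals the target event, else 0."""
    return np.array([1 if event == target else 0 for event in observed_events])


# Hypothetical example: 3 overall features, 3 roles with 5 features each, 4 mesoscopic features.
overall = [10, 0.4, 2.0]
roles = np.arange(15).reshape(3, 5)
meso = np.ones(4)
F = community_feature(overall, roles, meso)
labels = {event: binary_labels(["grow", "split", "grow"], event) for event in EVENTS}
```

One label vector per event type corresponds to the one-classifier-per-event setup described above.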
After the feature set is constructed, MCEP uses the classifier for training and prediction. For each community evolution event, MCEP constructs a corresponding training set. Then MCEP uses logistic regression for training and prediction. Although MCEP uses logistic regression as the classifier, its main focus is on obtaining the community features and constructing the training set, so MCEP can also use other classifiers.
The process of the MCEP algorithm is shown in Algorithm 2.

Algorithm 2 MCEP
Require: Network structure G
1: Split the network to obtain $G_1, G_2, \ldots, G_T$
2: Perform community discovery on the different snapshots to get the community structure of each snapshot $C_1, C_2, \ldots, C_T$
3: Use the GED algorithm to obtain the evolutionary relationship between communities
4: For the community $C_{t,i}$, calculate the overall community features $TF_{t,i}$
5: Use GLRE to obtain the role features of the community $RF_{t,i,1}, RF_{t,i,2}, \ldots, RF_{t,i,r}$
6: Obtain the community mesoscopic feature $MF_{t,i}$ according to the algorithm described in Section 3.3
7: Combine the obtained features to obtain the community features $F_{t,i}$
8: Construct the feature set according to Formula (9)
9: Train the classifier based on the constructed feature set

Experiment
This section shows the effectiveness of MCEP through experiments. In this paper, five representative algorithms are chosen as baselines: CFCR [10], SGCI [8], PATOC [9], node2vec [22], and LINE [23]. For the first two algorithms, since the authors did not give abbreviations, we have named them ourselves for the convenience of subsequent presentation. In addition, since the representation of the community can also be obtained through graph representation learning, we use two well-known graph representation learning algorithms, node2vec and LINE, as comparison algorithms. To prevent confusion with the original algorithms, we use Cnode2vec and CLINE to denote the corresponding community-level variants. To use these two algorithms for community evolution prediction, after obtaining the node representations, we calculate the average of the node representations of each community. This average is the feature of the community, and we use these features to make predictions.
We conducted experiments on three real datasets: DBLP [24], ia-reality-call (https://networkrepository.com/ia-reality-call.php, accessed on 12 March 2022), and sx-askubuntu-c2q (http://snap.stanford.edu/data/sx-askubuntu-c2q.txt.gz, accessed on 12 March 2022). Table 3 shows the datasets used in the experiment. As can be seen from Table 3, the three networks are small, medium, and large, respectively. In addition, the types of these three networks are also different: DBLP is a citation network, ia-reality-call is a mobile phone network, and sx-askubuntu-c2q is a question-answering network. Since the networks vary in size and type, we believe that using these three datasets is sufficient to demonstrate the effectiveness of our proposed algorithm.
All algorithms use logistic regression as the classifier. To increase the number of community events, adjacent snapshots of ia-reality-call overlap by one snapshot, and those of sx-askubuntu-c2q overlap by two snapshots. To ensure the stability of the experimental results, the experiment used ten-fold cross-validation and was repeated 20 times; each time, the order of the data was randomly shuffled. The final experimental result is the average over the 20 repetitions. The experiment uses accuracy as the evaluation measure.
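The evaluation protocol described above (ten folds, 20 shuffled repetitions) can be sketched as an index generator using only the standard library. This is an illustration of the protocol rather than the code used in the experiments; `repeated_kfold` and its parameters are hypothetical names.

```python
import random


def repeated_kfold(n_samples, n_splits=10, n_repeats=20, seed=0):
    """Yield (train_indices, test_indices) for repeated, shuffled k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random order for every repetition
        for k in range(n_splits):
            test_idx = order[k::n_splits]  # every n_splits-th sample forms one fold
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield train_idx, test_idx
```

Averaging the accuracy over all yielded splits gives one number per algorithm and event, matching the reporting described above.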
Since the main difference between the compared algorithms lies in feature selection, to ensure a fair comparison, all algorithms use the same procedures in the first three steps of community evolution prediction. The algorithm used in the community discovery part is LFM [25]. The algorithm used in the evolution analysis part is GED [17]. The two parameters in the GED algorithm are both 0.5. Since the final training set is imbalanced between positive and negative samples, we use SMOTE [26] to balance the positive and negative samples before training the classifier.
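The idea behind SMOTE can be illustrated with a simplified sketch: each synthetic sample is placed on the line segment between a minority sample and one of its nearest minority neighbors. This is a stand-in written for illustration, not the reference implementation and not necessarily the variant used in the experiments; `smote_sketch` and its parameters are hypothetical names.

```python
import numpy as np


def smote_sketch(X_minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating toward nearest neighbors."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)
    n = len(X)
    if n < 2:
        raise ValueError("at least two minority samples are required")
    k = min(k, n - 1)
    # Pairwise Euclidean distances among minority samples; a sample is not its own neighbor.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)                    # sample to start from
    picked = neighbors[base, rng.integers(0, k, size=n_new)]  # one of its k neighbors
    gap = rng.random((n_new, 1))                             # interpolation factor in [0, 1)
    return X[base] + gap * (X[picked] - X[base])
```

Since every synthetic point is a convex combination of two real minority samples, the generated data stay inside the range spanned by the original minority class.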

Parameter Selection
MCEP has two parameters: the number of community roles K and the number of community mesoscopic features L. Choosing an appropriate number of roles and community mesoscopic features can effectively improve the effect of the algorithm. When the number of roles is too small, the algorithm cannot completely separate the community roles in the community, which reduces the effect of the algorithm. When the number of roles is too large, the algorithm may easily assign nodes to the wrong roles. The same holds for the number of mesoscopic features. Therefore, an appropriate number of roles and mesoscopic features is very important for the algorithm.
This section studies the influence of the number of roles and the number of community mesoscopic features on MCEP. The datasets used are DBLP and ia-reality-call. The results of the experiment are shown in Tables 4 and 5.
As shown in Tables 4 and 5, with the increase of K and L, the effect of the MCEP algorithm shows a gradual upward trend on average. From the perspective of the number of roles, when K is 5, the MCEP algorithm achieves the best effect. When K is greater than 5, the effect of MCEP increases slowly. Therefore, the value of K in the subsequent experiments in this paper is 5. From the perspective of the number of community mesoscopic features, when the number of roles is fixed at 5, the best number of community mesoscopic features differs slightly between the two datasets. On the DBLP dataset, the prediction effect is the best when L = 20. On the ia-reality-call dataset, the prediction effect is the best when L = 15. Regardless of the dataset, the effect of MCEP is better when the number of mesoscopic features is large. This may be because the community mesoscopic features involve complex relationships among communities, so MCEP needs a larger value to capture them. In the subsequent experiments, the number of community mesoscopic features is selected from 15 and 20.
From the perspective of the prediction accuracy of a single community event, the number of roles and the number of community mesoscopic features that make the evolutionary event achieve the best results are different in different events and different datasets, and there is no obvious rule. This shows that different community evolution events have their unique characteristics, so it is difficult to find a fixed pattern that can achieve the best results in all events.

Comparison Results between MCEP and Baselines
This section shows the effects of MCEP and the other algorithms on the three datasets. The experimental results are shown in Figures 3-8. The last column in each figure is the average value of each algorithm on the corresponding dataset. On the sx-askubuntu-c2q dataset, because the number of negative samples of survival events is too small, the prediction accuracy for these events cannot be compared. The experimental results show that MCEP achieves the best results in most of the evolutionary events of the three datasets, which shows the effectiveness of MCEP. For some evolutionary events, such as the survival event of ia-reality-call, the effect of the MCEP algorithm is slightly worse than that of other algorithms. This may be because different community evolution events often present different features, and these features also differ between datasets. Therefore, it is difficult to find a common feature set that enables MCEP to have the best effect on all evolutionary events.
As can be seen from Figures 3-8, the effect of the Cnode2vec and CLINE algorithms is slightly lower than the effect of the MCEP algorithm on the three datasets, but better than that of the other algorithms. What MCEP, Cnode2vec, and CLINE have in common is that they use a relatively complex method to construct feature sets. MCEP is based on community roles and mesoscopic features, and Cnode2vec and CLINE are based on graph representation learning. In addition, the dimensionality of the features obtained by these three methods is higher than that of the other algorithms, so they are better able to capture the characteristics of community evolution. This shows that, due to the complex characteristics of community evolution, it is difficult to achieve a satisfactory effect for community evolution prediction only by selecting a few suitable features.

Ablation Experiments
Compared with previous algorithms, MCEP makes two main improvements. First, MCEP proposes the GLRE algorithm for discovering community roles. Second, MCEP uses NMF to discover community mesoscopic features. This section verifies the effectiveness of these two improvements through ablation experiments.
First, we construct three variants by removing the related improvements. Together with the full algorithm, this gives four algorithms, which we name as follows:
1. GL-B-MCEP: The GLRE algorithm is used, and the community mesoscopic features are considered.
2. G-B-MCEP: The traditional role-extraction algorithm is used, and the community mesoscopic features are considered.
3. GL-MCEP: The GLRE algorithm is used, and the community mesoscopic features are not considered.
4. G-MCEP: The traditional role-extraction algorithm is used, and the community mesoscopic features are not considered.
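The four configurations above form a 2x2 grid over two switches: which role-extraction algorithm is used, and whether mesoscopic features are included. The following minimal sketch enumerates that grid. The class `AblationVariant`, its field names, and the reading of the "B" tag as marking the mesoscopic features are illustrative assumptions inferred from the variant names, not part of the MCEP implementation.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class AblationVariant:
    """One cell of the ablation grid (hypothetical helper, not from the paper)."""
    use_glre: bool        # True: GLRE role extraction; False: traditional role extraction
    use_mesoscopic: bool  # True: community mesoscopic features are included

    @property
    def name(self) -> str:
        # Assumption: "GL"/"G" encodes the role extractor and "B" marks
        # the presence of mesoscopic features, as the names in the text suggest.
        parts = ["GL" if self.use_glre else "G"]
        if self.use_mesoscopic:
            parts.append("B")
        parts.append("MCEP")
        return "-".join(parts)


def ablation_variants() -> list:
    """Enumerate all four combinations of the two switches."""
    return [AblationVariant(g, m) for g, m in product([True, False], repeat=2)]


names = [v.name for v in ablation_variants()]
print(names)  # ['GL-B-MCEP', 'GL-MCEP', 'G-B-MCEP', 'G-MCEP']
```

Enumerating the grid this way makes it easy to run every variant with identical settings, so that differences in accuracy can be attributed to the two switches alone.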
Then, we conduct comparative experiments on the three datasets. To reduce the influence of other factors on the experimental results, for each algorithm, we use the same number of roles and mesoscopic features. On the DBLP dataset, the number of roles is 5 and the number of mesoscopic features is 20. On the ia-reality-call dataset and sx-askubuntu-c2q dataset, the number of roles is 5 and the number of mesoscopic features is 15. The experimental results are shown in Tables 6-8.

Table 7. Results of ablation experiments: ia-reality-call (columns: GL-B-MCEP, GL-MCEP, G-B-MCEP, G-MCEP).
The experimental results show that GL-B-MCEP achieves the best average results on the three datasets. Because of the complexity of community evolution events, GL-B-MCEP does not achieve the best result on every event, which is consistent with the comparative experiment. On average, however, GL-B-MCEP performs best.
In addition, on the DBLP dataset, introducing community mesoscopic features greatly improves the prediction effect, while the improvement from the GLRE algorithm is not obvious. This may be because the DBLP dataset has only 2423 nodes; on such a small network, the local features of the community are not pronounced. On the other two datasets, as the network scale increases, the improvement from GLRE becomes more evident, which suggests that GLRE is more effective on large-scale networks.

Different Community Discovery Algorithms
In the above experiments, we use LFM as the community discovery algorithm. Many other efficient community discovery algorithms exist. This section studies the performance of MCEP with several commonly used ones: the LFM algorithm [25], the Cluster Percolation method (CPM) [27], the greedy algorithm [28], and the Louvain algorithm [29]. The first two are overlapping community discovery algorithms, and the latter two are non-overlapping community discovery algorithms. We conducted the comparative experiment on the DBLP dataset. For the greedy algorithm and the Louvain algorithm, the number of positive samples of shrink events is too small to evaluate prediction accuracy, so no result is reported for that event. The experimental results are shown in Table 9. From Table 9, we can see that the choice of community discovery algorithm has a significant impact on the results of MCEP. For example, when using the CPM algorithm, the prediction accuracy of MCEP for survival events is significantly lower than with the other three algorithms, while its prediction accuracy for split events is significantly higher. This shows that the communities discovered by different algorithms have their own characteristics, which in turn affect the prediction effect. Therefore, when making community evolution predictions, the community discovery algorithm should be chosen according to the specific conditions.
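The rule that an event is left without a result when one class has too few samples can be sketched as a simple guard before scoring. This is only an illustration: the threshold `MIN_PER_CLASS`, the function names, and the choice to average over the evaluable events only are assumptions, since the paper does not state the exact cutoff or averaging rule.

```python
from statistics import mean
from typing import Dict, Optional, Sequence

MIN_PER_CLASS = 5  # hypothetical cutoff; the paper does not give a value


def is_evaluable(labels: Sequence[int], min_per_class: int = MIN_PER_CLASS) -> bool:
    """Return True if both classes (1 = event occurs, 0 = it does not) are well represented."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    return positives >= min_per_class and negatives >= min_per_class


def event_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> Optional[float]:
    """Accuracy for one evolution event, or None when the event cannot be evaluated."""
    if not is_evaluable(y_true):
        return None
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def average_accuracy(per_event: Dict[str, Optional[float]]) -> float:
    """Average over the events that produced a score (assumed averaging rule)."""
    scored = [acc for acc in per_event.values() if acc is not None]
    return mean(scored)


# Example with made-up numbers: the shrink event has no score and is skipped.
example = {"survive": 0.8, "shrink": None, "split": 0.6}
print(average_accuracy(example))
```

Returning `None` rather than 0 keeps an unevaluable event from dragging down the reported average.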

Dimensionality Reduction
In the above experiments, we do not restrict the final dimension of the training set obtained by each algorithm, i.e., the feature dimensions of the training sets differ. Different algorithms use different methods to obtain the training set, so it is normal for the dimensions to differ.
This section compares the performance of different algorithms when the dimensions of the training set are the same. Specifically, after obtaining the final features, we use PCA to reduce their dimensionality, and the reduced features are used for prediction. We compare the algorithms at 5 different dimensions on the DBLP dataset and the ia-reality-call dataset. The experimental results are shown in Figures 9 and 10, which show only the average value of the final prediction results. In addition, since the training set obtained by the SGCI algorithm has only 15 dimensions, when the target dimension exceeds 15, we set the effect of SGCI to be the same as its effect at 15 dimensions.
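The dimensionality-reduction step can be sketched with PCA computed from a singular value decomposition. The function `pca_reduce`, the random stand-in feature matrix, and the target dimensions below are hypothetical; the paper does not list the five dimensions used. Capping the target at the native feature dimension is one way to mirror the handling of SGCI, whose 15-dimensional features cannot be projected to more than 15 components.

```python
import numpy as np


def pca_reduce(features: np.ndarray, n_components: int) -> np.ndarray:
    """Project the rows of `features` onto the top principal components.

    The number of components is capped at what the data supports, so a
    15-dimensional feature set stays at 15 dimensions for larger targets.
    """
    k = min(n_components, min(features.shape))
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


# Stand-in data: 50 communities with 15 features each (made-up values).
rng = np.random.default_rng(0)
sgci_like_features = rng.normal(size=(50, 15))

for target_dim in (5, 10, 15, 20, 25):  # hypothetical target dimensions
    reduced = pca_reduce(sgci_like_features, target_dim)
    print(target_dim, reduced.shape)
```

The reduced matrix would then be passed to the same classifier for every algorithm, so that differences in accuracy reflect the quality of the features rather than their number.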

Conclusions and Future Work
In this paper, we studied a novel community evolution prediction algorithm. Existing community evolution prediction algorithms obtain community features from only a single level and then use them for prediction. These approaches are neither comprehensive nor detailed enough, so they cannot obtain good results. To solve these problems, the MCEP algorithm obtains features from two layers: the node layer and the community layer. First, at the node layer, we propose the global and local-based role-extraction algorithm (GLRE), which MCEP uses to accurately distinguish nodes with different characteristics in the community. Second, at the community layer, MCEP discovers community mesoscopic features by constructing a community hypergraph and applying NMF. The experimental results verify the effectiveness of the MCEP algorithm.
Although MCEP improves the effectiveness of community evolution prediction, it still has some shortcomings. First, in the stage of calculating community features, relevant features must be selected manually, which is inefficient and makes it difficult to obtain the best results. Second, MCEP has two parameters, the number of roles and the number of mesoscopic features, which must be set in advance; in many cases, choosing optimal values is quite difficult. Therefore, automatically selecting effective features and algorithm parameters is a direction for future research.

Conflicts of Interest:
The authors declare no conflict of interest.