Article

A Unified Spectral Clustering Approach for Detecting Community Structure in Multilayer Networks

1 Electrical Engineering Department, Jordan University of Science and Technology, Irbid 22110, Jordan
2 Electrical and Computer Engineering Department, Michigan State University, East Lansing, MI 48824, USA
* Author to whom correspondence should be addressed.
Symmetry 2023, 15(7), 1368; https://doi.org/10.3390/sym15071368
Submission received: 11 May 2023 / Revised: 21 June 2023 / Accepted: 3 July 2023 / Published: 5 July 2023

Abstract

Networks offer a compact representation of complex systems such as social, communication, and biological systems. Traditional network models are often inadequate to capture the diverse nature of contemporary networks, which may exhibit temporal variation and multiple types of interactions between entities. Multilayer networks (MLNs) provide a more comprehensive representation by allowing interactions between nodes to be represented by different types of links, each reflecting a distinct type of interaction. Community detection reveals meaningful structure and provides a better understanding of the overall functioning of networks. Current approaches to multilayer community detection are either limited to community detection over the aggregated network or are extensions of single-layer community detection methods with simplifying assumptions such as a common community structure across layers. Moreover, most of the existing methods are limited to multiplex networks with no inter-layer edges. In this paper, we introduce a spectral-clustering-based community detection method for two-layer MLNs. The problem of detecting the community structure is formulated as an optimization problem where the normalized cut for each layer is minimized simultaneously with the normalized cut for the bipartite network, along with regularization terms that ensure the consistency of the intra- and inter-layer community structures. The proposed method is evaluated on both synthetic and real networks and compared to state-of-the-art methods.

1. Introduction

Networks provide a compact representation of the internal structure of complex systems consisting of agents that interact with each other. Some example application areas include social sciences, engineering systems, and biological systems [1,2]. A core task in network analysis is community detection, which identifies the partition of the node set such that within-community connections are denser than between-community connections.
While different methods have been proposed to detect the community structure of simple (single-layer) graphs, in many contemporary applications a pair of nodes may interact through multiple types of links, yielding multilayer networks. In MLNs, each type of link represents a unique type of interaction. These links can be separated into different layers, enabling the same group of nodes to be connected in multiple ways [3]. The layers in a multilayer network can represent various attributes or features of a complex system. For instance, they can be temporal snapshots of the same network at different time intervals, or they can correspond to different types of connections in social networks, e.g., friendship, collaboration, or family relationships; different types of units in military tactical networks, e.g., infantry, vehicles, or airborne units [4]; or different modes of transportation in transportation networks, where nodes representing different locations can be linked through roads, railways, and air routes. Multilayer networks can further be categorized, based on the homogeneity of the nodes and the complexity of the topological structure, as (i) multiplex networks, which are homogeneous in terms of the entities they comprise, with each layer consisting of the same set of entities of the same type and the inter-layer edges being implicit and not shown; and (ii) heterogeneous multilayer networks, which may have different sets and types of entities in each layer, with the relationships between entities across layers explicitly represented by inter-layer edges.
Current approaches to multilayer community detection are either limited to community detection over the aggregated network or are extensions of single-layer community detection methods with simplifying assumptions such as a common community structure across layers. Moreover, they are mostly limited to multiplex networks with no inter-layer edges.
In this paper, we extend the notion of spectral clustering from single-layer networks to two-layer networks with inter-layer edges. In particular, we model the two-layer network as the union of two single-layer networks and a bipartite network that is represented through its symmetric row and column adjacency matrices. Next, we express the cost functions corresponding to minimizing the normalized cut for each of the layers, as well as for the inter-layer adjacency matrix, in their relaxed forms, similar to spectral clustering. In order to ensure the consistency of the communities across layers, we regularize the resulting cost function by a projection distance metric that quantifies the consistency of the low-rank embeddings of the networks across layers. The resulting optimization problem is solved through an alternating maximization scheme.

2. Related Work

Community detection methods for multilayer networks can be broadly categorized into three classes: flattening methods, aggregation methods, and direct methods. Flattening methods convert the multilayer network into a single-layer network by collapsing the layers and then apply traditional community detection algorithms. This approach ignores the information present in the multiple layers, which can lead to loss of important features and inaccuracies in the community structure [5]. Aggregation methods detect the community structure for each layer separately and then merge the results into a single structure. This method requires a merging strategy to combine the community structures from each layer, which can be challenging and subjective. Additionally, it may fail to capture the inter-layer dependencies and correlations between the layers [6]. Direct methods work directly on the multilayer network and optimize community-quality assessment criteria such as modularity or normalized cut to identify the community structure. This approach accounts for the interactions between the layers and can reveal the inter-layer dependencies and correlations [7,8,9].
Some examples of the direct method include multilayer label propagation, random-walk-based methods, non-negative matrix factorization, modularity, and spectral-clustering-based methods. Label propagation algorithms (LPAs) propagate node attributes based on their neighbors’ behavior and exhibit linear complexity. Inspired by the traditional LPA, the authors of [10] presented a redefinition of the neighborhood in multilayer networks and proposed a multilayer LPA. Although this approach is efficient and can handle weighted and directed networks, the resulting partition is highly dependent on the threshold parameter and the density of the network dataset. Moreover, this method is only suitable for multiplex networks. Kuncheva et al. [9] proposed locally adaptive random transitions (LARTs), which are designed to detect communities that are shared by some or all layers in multiplex networks. More recently, matrix and tensor factorization methods have been proposed for multilayer community detection. Among these are non-negative matrix factorization (NMF)-based methods, which extract low-dimensional feature representations for each layer and then fuse them into a common representation through collective factorization [11]. In [12], a semi-supervised joint non-negative matrix factorization (S2-jNMF) algorithm is proposed for community detection in multiplex networks, aiming to detect a common structure across layers. However, all of these methods are restricted to multiplex networks, where inter-layer edges are only allowed between each node and its corresponding replicas across different layers.
More recently, community detection methods that consider fully connected MLNs, i.e., MLNs with inter-layer relations, have been proposed [13]. The authors of [14,15] propose to extend the modularity function and its solution to account for MLNs with inter-layer relations. Similarly, the authors of [16] propose a normalized cut extension to MLNs by creating a block Laplacian matrix, where each block corresponds to a specific layer. The community structure is then obtained through standard spectral clustering on this block Laplacian matrix. However, the selection of the parameter β is crucial to ensuring the consistency of the community structure across the layers in this method. Another commonly employed technique that incorporates the concept of network dynamics, specifically diffusion, is Infomap [17]. This method optimizes the map equation, which leverages the information-theoretic relationship between reducing network dimensionality and detecting network communities. However, in the case of noisy networks, the efficiency of the diffusion process, i.e., information propagation, may be compromised, leading to suboptimal clustering performance [18].

3. Background

3.1. Graph Theory

Single-layer network: A single-layer network, or a simple graph, models the interactions between entities in network science. A single-layer network can be defined as $G_S = (V_S, E_S, A)$, where $V_S$ is the set of nodes, $E_S \subseteq V_S \times V_S$ is the set of edges, and $A \in \mathbb{R}^{n \times n}$ is the symmetric adjacency matrix, with $n = |V_S|$ being the number of nodes.
Bipartite network: A bipartite network, or bipartite graph, is a graph whose vertices can be partitioned into two disjoint sets such that all edges connect a vertex in one set to a vertex in the other. Formally, a bipartite graph is defined as $G_B = (V_B, E_B, B)$, where $V_B = V_1 \cup V_2$ with $V_1$ and $V_2$ disjoint, and $E_B \subseteq V_1 \times V_2$ is the set of edges connecting vertices in $V_1$ to vertices in $V_2$. $B \in \mathbb{R}^{(n_1+n_2) \times (n_1+n_2)}$ denotes the symmetric adjacency matrix of the bipartite network, where $n_1 = |V_1|$ and $n_2 = |V_2|$ are the sizes of the two disjoint sets. The adjacency matrix of the bipartite graph can be defined as follows:
$$B = \begin{pmatrix} 0 & A_{12} \\ A_{21} & 0 \end{pmatrix}, \tag{1}$$
where $A_{12} \in \mathbb{R}^{n_1 \times n_2}$ describes the relationships between $V_1$ and $V_2$, and $A_{21} = A_{12}^\top$.
Two-layer network: A two-layer network is a type of multilayer network that consists of two layers or graphs, where each layer represents a different type of relationship or interaction between nodes. A two-layer network, $G_{2M}$, can be formally defined as the set of two single-layer graphs, $G_1$ and $G_2$, and a bipartite graph, $G_{12}$, such that $G_{2M} = \{G_1, G_2, G_{12}\}$ with $G_1 = (V_1, E_1, A_1)$, $G_2 = (V_2, E_2, A_2)$, and $G_{12} = (V_1, V_2, E_{12}, A_{12})$. $G_1$ and $G_2$ are known as within- or intra-layer graphs, whereas $G_{12}$ is referred to as the across- or inter-layer graph.
Supra-adjacency matrix: A supra-adjacency matrix is a symmetric matrix that represents both the intra- and inter-layer connections in a multilayer network. The supra-adjacency matrix of a two-layer MLN, $A_{2M} \in \mathbb{R}^{N \times N}$ with $N = n_1 + n_2$, can be constructed from the intra- and inter-layer adjacency matrices as follows:
$$A_{2M} = \begin{pmatrix} A_1 & A_{12} \\ A_{21} & A_2 \end{pmatrix}, \tag{2}$$
where $A_1 \in \mathbb{R}^{n_1 \times n_1}$, $A_2 \in \mathbb{R}^{n_2 \times n_2}$, and $A_{12} \in \mathbb{R}^{n_1 \times n_2}$.
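As a concrete illustration, the supra-adjacency matrix of a small two-layer network can be assembled from its intra- and inter-layer blocks. The following sketch uses a hypothetical toy network (the block sizes and edge patterns are our own) and NumPy's block-matrix constructor:

```python
import numpy as np

# Hypothetical toy two-layer network: layer sizes n1 = 3, n2 = 2.
A1 = np.array([[0, 1, 1],
               [1, 0, 1],
               [1, 1, 0]], dtype=float)   # intra-layer adjacency, Layer 1
A2 = np.array([[0, 1],
               [1, 0]], dtype=float)      # intra-layer adjacency, Layer 2
A12 = np.array([[1, 0],
                [0, 1],
                [0, 0]], dtype=float)     # inter-layer (bipartite) edges

# Supra-adjacency A_2M: intra-layer blocks on the diagonal,
# the inter-layer block and its transpose off the diagonal.
A2M = np.block([[A1, A12],
                [A12.T, A2]])

assert A2M.shape == (5, 5)
assert np.allclose(A2M, A2M.T)   # symmetric, as required
```

Because $A_{2M}$ is built from symmetric diagonal blocks and a transposed pair of off-diagonal blocks, it is symmetric by construction.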

3.2. Graph Cut Problem and Spectral Clustering

Graph minimum cut (mincut) is a problem in graph theory that involves partitioning a graph into disjoint sets of nodes such that the number of edges between these sets is minimized. Balanced variants of this problem, such as the normalized cut, are NP-hard. However, there are efficient algorithms to approximate such cuts, notably spectral clustering. Spectral clustering relies on the spectral properties (eigenvalues and eigenvectors) of the symmetric graph Laplacian matrix or normalized adjacency matrix. In particular, spectral clustering uses these eigenvalues and eigenvectors to embed the nodes of the graph into a lower-dimensional space. The nodes can then be clustered using standard k-means in this lower-dimensional space.
Given a single-layer graph with a symmetric adjacency matrix, A R n × n , spectral clustering solves the following trace maximization problem [19,20]:
$$\max_{U \in \mathbb{R}^{n \times k}} \operatorname{tr}\left(U^\top A_N U\right), \quad \text{s.t. } U^\top U = I_k, \tag{3}$$
where “$\operatorname{tr}$” and “$\top$” refer to the trace and transpose operators, respectively. $A_N = D^{-1/2} A D^{-1/2}$ is the normalized version of the adjacency matrix, where $D$ is the degree matrix with $D_{ii} = \sum_j A_{ij}$. Spectral clustering uses the spectrum (eigenvalues) of the normalized adjacency matrix to partition the nodes into clusters. In particular, the eigenvectors corresponding to the largest $k$ eigenvalues are used to embed the nodes in a low-dimensional space, where the matrix $U$ is constructed by arranging these eigenvectors as its columns. The final structure is then determined by applying classical k-means to the rows of the matrix $U$ [19]. The number of eigenvectors, $k$, corresponds to the number of communities in the network.
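The embedding step described above can be sketched as follows; this is a minimal illustration (the function name and toy graph are our own), not an implementation from the paper:

```python
import numpy as np

def spectral_clustering_embedding(A, k):
    """Embed nodes via the top-k eigenvectors of the normalized
    adjacency A_N = D^{-1/2} A D^{-1/2}; the rows of the result are
    then clustered with k-means to obtain communities."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))  # guard isolated nodes
    A_N = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(A_N)   # eigenvalues in ascending order
    return vecs[:, -k:]                # eigenvectors of the k largest eigenvalues

# Toy graph: two disconnected edges -> two obvious communities.
A = np.zeros((4, 4))
A[0, 1] = A[1, 0] = 1.0
A[2, 3] = A[3, 2] = 1.0
U = spectral_clustering_embedding(A, 2)
assert U.shape == (4, 2)
```

In this toy example, nodes in the same component receive identical embedding rows, so k-means on the rows of $U$ recovers the two communities trivially.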

3.3. Spectral Co-Clustering in Bipartite Networks

Bipartite spectral co-clustering is a technique for simultaneously clustering both rows and columns of a bipartite network [21,22]. The problem can be formulated as a trace maximization problem as follows:
$$\max_{Z \in \mathbb{R}^{(n_1+n_2) \times k}} \operatorname{tr}\left(Z^\top B_N Z\right) = \max_{U_L, U_R} \operatorname{tr}\left(\begin{bmatrix} U_L \\ U_R \end{bmatrix}^\top B_N \begin{bmatrix} U_L \\ U_R \end{bmatrix}\right), \tag{4}$$
where $B_N = \begin{pmatrix} 0 & A_{N12} \\ A_{N21} & 0 \end{pmatrix}$, $Z = \begin{bmatrix} U_L \\ U_R \end{bmatrix}$, $A_{N21} = A_{N12}^\top$, and $A_{N12} = D_L^{-1/2} A_{12} D_R^{-1/2}$ with $(D_L)_{ii} = \sum_j A^{12}_{ij}$ and $(D_R)_{jj} = \sum_i A^{12}_{ij}$. According to the Ky Fan theorem, the global optimum solution of Equation (4) is the matrix $Z$ containing the $k$ eigenvectors that correspond to the largest eigenvalues of $B_N$.
A more computationally efficient solution [21] to Equation (4) is to compute $U_L$ and $U_R$ as the matrices containing the left and right singular vectors that correspond to the largest $k$ singular values of the matrix $A_{N12}$, respectively.
Another approach to solving Equation (4) is to first compute the symmetric row and column adjacency matrices, $A_{N12} A_{N12}^\top$ and $A_{N21} A_{N21}^\top$, and then solve two trace maximization problems simultaneously, as follows [22]:
$$\max_{U_L \in \mathbb{R}^{n_1 \times k}} \operatorname{tr}\left(U_L^\top A_{N12} A_{N12}^\top U_L\right) \ \text{s.t. } U_L^\top U_L = I_k \quad \text{and} \quad \max_{U_R \in \mathbb{R}^{n_2 \times k}} \operatorname{tr}\left(U_R^\top A_{N21} A_{N21}^\top U_R\right) \ \text{s.t. } U_R^\top U_R = I_k, \tag{5}$$
where $U_L$ and $U_R$ can be computed separately as the matrices of eigenvectors associated with the largest $k$ eigenvalues of $A_{N12} A_{N12}^\top$ and $A_{N21} A_{N21}^\top$, respectively.
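The SVD-based route of [21] can be sketched as follows; this is a minimal illustration with an assumed toy bipartite matrix, not the authors' code:

```python
import numpy as np

def bipartite_coclustering(A12, k, eps=1e-12):
    """Spectral co-clustering of a bipartite graph: U_L and U_R are the
    left/right singular vectors for the k largest singular values of the
    degree-normalized matrix A_N12 = D_L^{-1/2} A12 D_R^{-1/2}."""
    dL = A12.sum(axis=1)   # row (left-vertex) degrees
    dR = A12.sum(axis=0)   # column (right-vertex) degrees
    A_N12 = (A12 / np.sqrt(np.maximum(dL, eps))[:, None]
                 / np.sqrt(np.maximum(dR, eps))[None, :])
    U, s, VT = np.linalg.svd(A_N12)    # singular values in descending order
    return U[:, :k], VT[:k].T

# Toy block structure: rows {0,1} link to columns {0,1};
# rows {2,3} link to column {2}.
A12 = np.array([[1, 1, 0],
                [1, 1, 0],
                [0, 0, 1],
                [0, 0, 1]], dtype=float)
UL, UR = bipartite_coclustering(A12, 2)
assert UL.shape == (4, 2) and UR.shape == (3, 2)
```

Rows of `UL` (and of `UR`) belonging to the same block receive identical embeddings here, so k-means on the stacked embeddings co-clusters the two vertex sets.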

3.4. Projection Distance between Subspaces

The projection distance between subspaces measures the distance between two subspaces of a vector space. In particular, it quantifies the distance between the orthogonal projections of a vector onto each of the two subspaces. Let $\operatorname{span}(H_1)$ and $\operatorname{span}(H_2)$ be two subspaces whose corresponding orthonormal basis sets are $H_1 \in \mathbb{R}^{n \times k}$ and $H_2 \in \mathbb{R}^{n \times k}$, respectively. The projection distance can be determined by the principal angles between the two subspaces. Let $\theta_i$ denote the $i$th principal angle between the two subspaces; then, the projection distance is defined as [23]:
$$d_p^2\left(\operatorname{span}(H_1), \operatorname{span}(H_2)\right) = \sum_{i=1}^{k} \sin^2(\theta_i) = k - \sum_{i=1}^{k} \cos^2(\theta_i) = k - \operatorname{tr}\left(H_1 H_1^\top H_2 H_2^\top\right). \tag{6}$$
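This identity admits a direct transcription; the sketch below (with hypothetical basis matrices) uses the fact that $\operatorname{tr}(H_1 H_1^\top H_2 H_2^\top) = \|H_1^\top H_2\|_F^2$:

```python
import numpy as np

def projection_distance_sq(H1, H2):
    """d_p^2(span(H1), span(H2)) = k - tr(H1 H1^T H2 H2^T), for
    orthonormal bases H1, H2 with k columns each."""
    k = H1.shape[1]
    M = H1.T @ H2                 # k x k matrix of pairwise inner products
    return k - np.trace(M @ M.T)  # tr(H1 H1^T H2 H2^T) = ||H1^T H2||_F^2

# Identical subspaces are at distance 0; orthogonal ones at distance k.
H1 = np.eye(4)[:, :2]             # span{e1, e2}
H2 = np.eye(4)[:, 2:]             # span{e3, e4}
assert np.isclose(projection_distance_sq(H1, H1), 0.0)
assert np.isclose(projection_distance_sq(H1, H2), 2.0)
```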

4. Community Detection in Multilayer Networks: A Unified Spectral Clustering Approach (ML-USCL)

4.1. Problem Formulation

Given a two-layer network, in order to determine a community, $C_k$, two subsets of vertices are defined: (i) the within-layer community subset, $C_S = (V_m^{C_k}, E_m^{C_k})$, with $V_m^{C_k} \subseteq V_m$ and $E_m^{C_k} = E_m \cap (V_m^{C_k} \times V_m^{C_k})$, where $m \in \{1, 2\}$, and (ii) the across-layer community subset, $C_B = (V_1^{C_k}, V_2^{C_k}, E_{12}^{C_k})$, with $V_1^{C_k} \subseteq V_1$, $V_2^{C_k} \subseteq V_2$, and $E_{12}^{C_k} = E_{12} \cap (V_1^{C_k} \times V_2^{C_k})$. A community can then be defined as $C_k = \{C_S, C_B\}$, where it may include vertices from one or more layers in the network. In particular, $C_k$ defines either a within-layer community when $C_B = \emptyset$ or an across-layer community when $C_B \neq \emptyset$.
In this paper, the objective is to partition a two-layer network into $K$ disjoint communities. The goal is to find the low-rank embeddings of each layer that maximize the separability between communities while ensuring that these low-rank embeddings are consistent with the partitioning of the inter-layer graph. This objective is achieved by exploiting previous work on spectral clustering of single-layer networks and spectral co-clustering of bipartite networks. More precisely, the intra- and inter-layer graphs are modeled as single-layer and bipartite graphs, respectively. The intra-layer graph encodes the interactions between nodes within the same layer, whereas the inter-layer graph encodes the interactions between nodes from both layers.
The proposed objective function can be expressed mathematically as:
$$\begin{aligned}
\max_{\substack{U_1 \in \mathbb{R}^{n_1 \times k_1},\, U_2 \in \mathbb{R}^{n_2 \times k_2},\\ U_L \in \mathbb{R}^{n_1 \times k},\, U_R \in \mathbb{R}^{n_2 \times k}}} \ & \underbrace{\operatorname{tr}\left(U_1^\top A_{N1} U_1\right) + \operatorname{tr}\left(U_2^\top A_{N2} U_2\right)}_{\text{within-layer normalized cut}} + \underbrace{\operatorname{tr}\left(U_L^\top A_{N12} A_{N12}^\top U_L\right) + \operatorname{tr}\left(U_R^\top A_{N21} A_{N21}^\top U_R\right)}_{\text{across-layer normalized cut}} \\
& + \underbrace{\lambda_1 \operatorname{tr}\left(U_1 U_1^\top U_L U_L^\top\right) + \lambda_2 \operatorname{tr}\left(U_2 U_2^\top U_R U_R^\top\right)}_{\text{regularization}} \\
\text{s.t. } & U_1^\top U_1 = I_{k_1}, \quad U_2^\top U_2 = I_{k_2}, \quad U_L^\top U_L = I_k, \quad U_R^\top U_R = I_k. 
\end{aligned} \tag{7}$$
The proposed objective function is formulated such that the first two terms correspond to the spectral clustering problems for Layers 1 and 2, respectively, where $A_{N1} \in \mathbb{R}^{n_1 \times n_1}$ and $A_{N2} \in \mathbb{R}^{n_2 \times n_2}$ are the symmetric normalized intra-layer adjacency matrices. The third and fourth terms correspond to the bipartite spectral clustering problem. The last two terms define the spectral embedding similarity between the left and right subspaces of $A_{N12}$ and the low-rank subspaces of $A_{N1}$ and $A_{N2}$, respectively [24,25,26]. In particular, maximizing these two terms minimizes the projection distance between the subspaces $(U_1, U_L)$ and $(U_2, U_R)$, ensuring the consistency between the intra- and inter-layer partitions. $k_1$ and $k_2$ refer to the numbers of within-layer communities in Layers 1 and 2, respectively, whereas $k$ refers to the number of across-layer communities. $\lambda_1$ and $\lambda_2$ are the regularization parameters.

4.2. Problem Solution

4.2.1. Initializing the Intra- and Inter-Layer Basis Matrices

The intra-layer basis matrices, $U_1 \in \mathbb{R}^{n_1 \times k_1}$ and $U_2 \in \mathbb{R}^{n_2 \times k_2}$, are initialized using Equation (3). The numbers of communities, $k_1$ and $k_2$, in the two layers are determined initially by the asymptotical surprise (AS) metric [27]. In particular, the AS metric is calculated for a range of possible community numbers, and the initial number of intra-layer communities is set to the number that achieves the maximum value of the AS metric. On the other hand, the inter-layer basis matrices, $U_L \in \mathbb{R}^{n_1 \times k}$ and $U_R \in \mathbb{R}^{n_2 \times k}$, are initialized using Equation (5), where $k = \min(k_1, k_2)$. The steps for initializing the basis matrices are outlined in Algorithm 1.
Algorithm 1 Initializing the intra- and inter-layer basis matrices
Input: $A_1$, $A_2$, $A_{12}$, maximum number of communities ($C_m$).
Output: Initial ($U_1$, $U_2$, $U_L$, and $U_R$), $k_1$, $k_2$, $k$
1: Compute $A_{N1}$, $A_{N2}$, and $A_{N12}$
2: Compute the eigendecompositions $A_{N1} = U_1 \Lambda_1 U_1^\top$ and $A_{N2} = U_2 \Lambda_2 U_2^\top$
3: for $\mathrm{NOC} = 2 : C_m$ do
4:   Apply k-means to $U_1(:, 1{:}\mathrm{NOC})$ to determine the first-layer node clustering labels ($CL_1^w$).
5:   Apply k-means to $U_2(:, 1{:}\mathrm{NOC})$ to determine the second-layer node clustering labels ($CL_2^w$).
6:   Compute the asymptotical surprise (AS) for Layer 1: $\mathrm{AS1}(\mathrm{NOC}) = \mathrm{AS}(A_1, CL_1^w)$
7:   Compute the asymptotical surprise (AS) for Layer 2: $\mathrm{AS2}(\mathrm{NOC}) = \mathrm{AS}(A_2, CL_2^w)$
8: end for
9: Find $k_1$ as $k_1 = \arg\max_j \mathrm{AS1}(j)$
10: Find $k_2$ as $k_2 = \arg\max_j \mathrm{AS2}(j)$
11: Set $k = \min(k_1, k_2)$
12: Compute $U_L(:, 1{:}k)$ and $U_R(:, 1{:}k)$ using Equation (5)
13: return $U_1(:, 1{:}k_1)$, $U_2(:, 1{:}k_2)$, $U_L(:, 1{:}k)$, $U_R(:, 1{:}k)$, $k_1$, $k_2$, and $k$

4.2.2. Finding the Basis Matrices

As solving for the different variables in the proposed objective function jointly is not feasible, alternating maximization can be adopted to compute the variables iteratively. Alternating maximization is a commonly used approach for solving optimization problems that involve multiple variables or constraints. In particular, the technique fixes one set of variables and computes the other, alternating between the two until convergence [28,29].
The solution to the proposed problem in Equation (7) can be found using an alternating maximization scheme as follows:
  • Update $U_1$: By considering only the terms that contain $U_1$, we obtain
    $$\max_{U_1 \in \mathbb{R}^{n_1 \times k_1}} \operatorname{tr}\left(U_1^\top A_{N1} U_1\right) + \lambda_1 \operatorname{tr}\left(U_1 U_1^\top U_L U_L^\top\right) \ \text{s.t. } U_1^\top U_1 = I_{k_1} \;=\; \max_{U_1 \in \mathbb{R}^{n_1 \times k_1}} \operatorname{tr}\left(U_1^\top A_{mod1} U_1\right) \ \text{s.t. } U_1^\top U_1 = I_{k_1}, \tag{8}$$
    where $A_{mod1} = A_{N1} + \lambda_1 U_L U_L^\top$ is the modified normalized adjacency matrix representing Layer 1. The solution to this problem is similar to the classic spectral clustering formulation. In particular, the matrix $U_1$ is computed through eigendecomposition (ED) of $A_{mod1}$: it consists of the $k_1$ eigenvectors associated with the $k_1$ largest eigenvalues of $A_{mod1}$.
  • Update $U_2$: By considering only the terms that contain $U_2$, we obtain
    $$\max_{U_2 \in \mathbb{R}^{n_2 \times k_2}} \operatorname{tr}\left(U_2^\top A_{N2} U_2\right) + \lambda_2 \operatorname{tr}\left(U_2 U_2^\top U_R U_R^\top\right) \ \text{s.t. } U_2^\top U_2 = I_{k_2} \;=\; \max_{U_2 \in \mathbb{R}^{n_2 \times k_2}} \operatorname{tr}\left(U_2^\top A_{mod2} U_2\right) \ \text{s.t. } U_2^\top U_2 = I_{k_2}, \tag{9}$$
    where $A_{mod2} = A_{N2} + \lambda_2 U_R U_R^\top$ is the modified normalized adjacency matrix representing Layer 2. Similar to $U_1$, $U_2$ contains the $k_2$ eigenvectors associated with the $k_2$ largest eigenvalues of $A_{mod2}$.
  • Update $U_L$: By considering only the terms that contain $U_L$, we obtain
    $$\max_{U_L \in \mathbb{R}^{n_1 \times k}} \operatorname{tr}\left(U_L^\top A_{N12} A_{N12}^\top U_L\right) + \lambda_1 \operatorname{tr}\left(U_1 U_1^\top U_L U_L^\top\right) \ \text{s.t. } U_L^\top U_L = I_k \;=\; \max_{U_L \in \mathbb{R}^{n_1 \times k}} \operatorname{tr}\left(U_L^\top \bar{A}_L U_L\right) \ \text{s.t. } U_L^\top U_L = I_k, \tag{10}$$
    where $\bar{A}_L = A_{N12} A_{N12}^\top + \lambda_1 U_1 U_1^\top$ is referred to as the modified normalized adjacency matrix representing the rows of the inter-layer graph. $U_L$ can also be computed through eigendecomposition: it comprises the eigenvectors corresponding to the $k$ largest eigenvalues of $\bar{A}_L$.
  • Update $U_R$: By keeping only the terms that include $U_R$, we obtain
    $$\max_{U_R \in \mathbb{R}^{n_2 \times k}} \operatorname{tr}\left(U_R^\top A_{N21} A_{N21}^\top U_R\right) + \lambda_2 \operatorname{tr}\left(U_2 U_2^\top U_R U_R^\top\right) \ \text{s.t. } U_R^\top U_R = I_k \;=\; \max_{U_R \in \mathbb{R}^{n_2 \times k}} \operatorname{tr}\left(U_R^\top \bar{A}_R U_R\right) \ \text{s.t. } U_R^\top U_R = I_k, \tag{11}$$
    where $\bar{A}_R = A_{N21} A_{N21}^\top + \lambda_2 U_2 U_2^\top$ denotes the modified normalized adjacency matrix representing the columns of the inter-layer graph, and $U_R$ is the matrix of eigenvectors corresponding to the $k$ largest eigenvalues of $\bar{A}_R$.
As can be seen from the update steps outlined above, each of the basis matrices is found jointly using both intra- and inter-layer adjacency information. In particular, $U_1$ spans the subspace determined by both its corresponding adjacency matrix and the Gram matrix defined by $U_L$, i.e., $U_L U_L^\top$. Similar arguments can be made for the other basis matrices, showing that they are learned to optimize the span for both within- and between-layer connectivity. After the basis matrices are computed, a set of intra- and inter-layer communities is determined by applying k-means to $U_1$, $U_2$, and $Z = [U_L; U_R]$. The final community structure of the network is then determined by following the steps explained in Section 4.3.
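One pass of the four updates can be sketched as follows; this is an illustrative outline under our own naming and random test data, omitting the convergence check of the full algorithm:

```python
import numpy as np

def top_k_eigvecs(M, k):
    """Eigenvectors of symmetric M for its k largest eigenvalues."""
    vals, vecs = np.linalg.eigh(M)   # ascending order
    return vecs[:, -k:]

def update_step(A_N1, A_N2, A_N12, U1, U2, UL, UR, k1, k2, k, lam1, lam2):
    """One alternating-maximization pass: each basis matrix is the
    top-k eigenspace of its modified normalized adjacency matrix."""
    U1 = top_k_eigvecs(A_N1 + lam1 * (UL @ UL.T), k1)
    U2 = top_k_eigvecs(A_N2 + lam2 * (UR @ UR.T), k2)
    UL = top_k_eigvecs(A_N12 @ A_N12.T + lam1 * (U1 @ U1.T), k)
    UR = top_k_eigvecs(A_N12.T @ A_N12 + lam2 * (U2 @ U2.T), k)
    return U1, U2, UL, UR

# Tiny random example (assumed sizes: n1 = 6, n2 = 5, k1 = k2 = k = 2).
rng = np.random.default_rng(0)
A_N1 = rng.random((6, 6)); A_N1 = (A_N1 + A_N1.T) / 2
A_N2 = rng.random((5, 5)); A_N2 = (A_N2 + A_N2.T) / 2
A_N12 = rng.random((6, 5))
U1 = np.linalg.qr(rng.standard_normal((6, 2)))[0]
U2 = np.linalg.qr(rng.standard_normal((5, 2)))[0]
UL = np.linalg.qr(rng.standard_normal((6, 2)))[0]
UR = np.linalg.qr(rng.standard_normal((5, 2)))[0]
U1, U2, UL, UR = update_step(A_N1, A_N2, A_N12, U1, U2, UL, UR, 2, 2, 2, 0.5, 0.5)
```

Each update returns orthonormal columns by construction, so the constraints of Equation (7) hold after every pass.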

4.3. Determining the Final Community Structure of the Network

After estimating a set of within- and across-layer communities, we evaluate their quality as follows:
  • The quality or strength of each within- and across-layer community is measured using the communitude metric [30,31], defined in terms of the supra-adjacency matrix as:
    $$\operatorname{Communitude}(C_k) = \frac{\dfrac{E_{in}^{C_k}}{2E} - \left(\dfrac{E_{in}^{C_k} + E_{ex}^{C_k}}{2E}\right)^2}{\sqrt{\left(\dfrac{E_{in}^{C_k} + E_{ex}^{C_k}}{2E}\right)^2 \left(1 - \left(\dfrac{E_{in}^{C_k} + E_{ex}^{C_k}}{2E}\right)^2\right)}}, \tag{12}$$
    where $E_{in}^{C_k}$ is the sum of internal edges in a community, i.e., edges between the nodes in the same community or “within-community” edges; $E_{ex}^{C_k}$ is the sum of external edges in a community, i.e., edges that connect nodes belonging to different communities or “between-community” edges; and $2E$ refers to the sum of all edges in the supra-adjacency matrix. This quality function can be seen as an adapted form of the Z-score, normalized by the standard deviation of the fraction of edges within the subgraph. The upper bound of the communitude metric is 1.
  • To identify the most significant communities in a network, the communities are ranked in descending order of their communitude values. In the proposed ML-USCL, every node is allowed to join exactly one community, either a within- or an across-layer community. In particular, the communitude values of the candidate within- and across-layer communities are compared, and the community with the higher value is selected.
  • The final community structure of the two-layer MLN is then considered as the set of $K$ communities, $C_{2M} = \{C_1, C_2, \ldots, C_K\}$, that score the largest communitude values.
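Equation (12) reduces to a simple scalar computation once the internal and external edge-weight sums of a community are known; a minimal sketch (the function signature is our own) is:

```python
import numpy as np

def communitude(E_in, E_ex, two_E):
    """Communitude of community C_k: E_in / E_ex are its internal /
    external edge sums, and two_E is the total edge sum (2E) of the
    supra-adjacency matrix. A Z-score-like quality, upper-bounded by 1."""
    p = (E_in + E_ex) / two_E              # fraction of edges touching C_k
    return (E_in / two_E - p ** 2) / np.sqrt(p ** 2 * (1 - p ** 2))

# A community that keeps more of its edges internal scores higher.
assert communitude(10, 0, 20) > communitude(8, 2, 20)
```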
The steps of the developed algorithm are summarized in Algorithm 2.
Algorithm 2 Detecting Community Structure in Multilayer Networks: Unified Spectral Clustering (ML-USCL)
Input: $A_1$, $A_2$, $A_{12}$, $\lambda_1$, $\lambda_2$
Output: Within- and across-layer communities.
1: Initialize $U_1$, $U_2$, $U_L$, and $U_R$ as explained in Algorithm 1.
2: while not converged do
3:   Calculate $U_1$ using Equation (8)
4:   Calculate $U_2$ using Equation (9)
5:   Calculate $U_L$ using Equation (10)
6:   Calculate $U_R$ using Equation (11)
7: end while
8: Apply k-means to $U_1$ and $U_2$ to determine the within-layer communities.
9: Apply k-means to $Z = [U_L; U_R]$ to determine the across-layer communities.
10: Compute the communitude of the detected communities using Equation (12).
11: return Best-quality communities.

4.4. Computational Complexity of ML-USCL

The order of complexity of the proposed ML-USCL depends on the specific implementation of the algorithm and the size of the input network. The main steps involved in the proposed ML-USCL include initializing the basis matrices, updating the basis matrices by performing eigenvalue decomposition, and applying k-means to get the within- and across-layer communities.
Let $n_m$ be the number of nodes in the $m$th layer; then, initializing each of the basis matrices requires an eigenvalue decomposition of the intra- and inter-layer adjacency matrices, which has a complexity of $O(n_m^3)$. During the initialization step, the number of communities in each of the intra-layer graphs is determined by calculating the AS metric over a range of community numbers, $\{2, \ldots, C_m\}$. The AS calculation requires $O(C_m |E_m|)$, where $|E_m|$ refers to the number of edges in the $m$th layer. For each iteration, the update of $U_m$ has a complexity of $O(n_m^3)$. Applying k-means to $U_m$ to obtain the clustering labels requires $O(n_m l_m k_m)$, where $l_m$ refers to the number of iterations taken by k-means.
Overall, the complexity of the proposed ML-USCL is dominated by the eigenvalue decomposition step. In particular, when a full eigenvalue decomposition is computed, the order of complexity is $l_m O(n_m^3)$, where $l_m$ here refers to the total number of iterations. However, this computational complexity can be reduced to $l_m O(k_m n_m^2)$ by computing only the leading $k_m$ eigenvectors in each iteration, which is the complexity of the proposed ML-USCL.
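The savings from the truncated decomposition can be illustrated with SciPy's iterative eigensolver; this is an assumption about tooling (the paper's MATLAB experiments would presumably use an equivalent such as eigs), not the authors' implementation:

```python
import numpy as np
from scipy.sparse.linalg import eigsh

# Computing only the leading k eigenpairs with an iterative solver
# (eigsh, algebraically-largest eigenvalues) avoids the O(n^3) cost
# of a full dense eigendecomposition per iteration.
rng = np.random.default_rng(0)
M = rng.standard_normal((200, 200))
M = (M + M.T) / 2                        # symmetric test matrix
k = 5
vals_k, vecs_k = eigsh(M, k=k, which='LA')

# Sanity check against the full decomposition (ascending order).
vals_full, vecs_full = np.linalg.eigh(M)
assert np.allclose(np.sort(vals_k), vals_full[-k:])
```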

5. Results and Discussion

In this section, multiple experiments are conducted to evaluate the performance of the proposed approach. All experiments are performed on a standard Windows 10 server with an Intel(R) Core(TM) i7-9700 CPU @ 3.00 GHz and 16 GB RAM, running MATLAB R2022b. The performance of the proposed ML-USCL is compared to other existing approaches, including block spectral clustering with inter-layer relations (BLSC) with β = 1, as suggested in [16]; generalized Louvain (GenLov) [32]; and collective NMF approaches [11], including CSNMTF and CPNMF. The inputs to BLSC, GenLov, and the collective NMF methods are the two-layer network directly, the supra-adjacency matrix, and the multiplex version of the network, respectively. The number of communities is determined using the asymptotical surprise for BLSC and the collective NMF methods, while it is self-optimized in GenLov. The value of the maximum number of communities, $C_m$, is determined based on the size of the network. For example, $C_m$ can be set to 20 for small networks and to 100 for large networks. The quality of the network’s final partition is evaluated using normalized mutual information (NMI) [33], the adjusted Rand index (ARI) [34], and purity [35].

5.1. Simulated Networks

5.1.1. LFR Binary Simulated Networks

  • LFR benchmark description: The Lancichinetti–Fortunato–Radicchi (LFR) benchmark [36,37] is a commonly used benchmark for evaluating the performance of community detection algorithms. In this experiment, the LFR benchmark is adopted to generate two-layer simulated networks, each with $n$ nodes. The LFR benchmark uses a truncated power-law distribution to determine the community sizes. The parameters that control the community structure in the generated networks are (i) the minimum degree, $d_{min}$, (ii) the maximum degree, $d_{max}$, and (iii) the mixing parameter, $\mu$. The minimum and maximum values of the degree distribution ($d_{min}$ and $d_{max}$, respectively) are chosen such that the average degree of the network is equal to $d$. Within-layer graphs are generated with $k_1$ and $k_2$ communities, whereas the across-layer graph is generated such that the ratio of the across-layer communities to the total number of communities is greater than or equal to $\alpha$, where $\alpha \in [0, 1]$. The selected communities are randomly combined with each other to create an across-layer community. The connection density within communities is set to $(1-\mu)d$ and between communities to $\mu d$. The parameter $\mu$ controls the degree of inter-community connectivity: a low $\mu$ value results in a strong community structure with few inter-community links, while a high $\mu$ value results in a weaker community structure with more inter-community links. On the other hand, $\alpha$ controls the percentage of across-layer communities. As $\alpha$ increases, the networks tend to have more across-layer communities.
  • Experiment: In this experiment, two-layer unweighted networks with $n_1 = n_2 = 200$ are created. The parameters of the generated networks are $d_{min} = 8$, $d_{max} = 20$, $\mu = \{0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7\}$, and $\alpha = \{0.0, 0.5, 1.0\}$. To assess the efficacy of the various algorithms in detecting community structure, a comparative analysis is carried out as the mixing parameter, i.e., the noise, increases. Figure 1 shows a comparison between the different approaches in recovering the structure of two-layer LFR networks. Figure 1a–c reflect the performance of the different approaches when the two-layer network consists of within-layer communities only, i.e., $\alpha = 0$. As can be seen from the figures, the proposed ML-USCL significantly outperforms the other methods over the different values of the mixing parameter, $\mu$, with respect to all of the evaluation metrics. In Figure 1d–f, where $\alpha = 0.5$ and the two-layer networks comprise both intra- and inter-layer communities, the proposed ML-USCL exceeds the other algorithms in terms of the purity metric, and its performance is comparable to GenLov and BLSC with respect to the NMI and ARI metrics. Moreover, the proposed ML-USCL performs better than both methods as the value of $\mu$ increases, which reflects its robustness. In Figure 1g–i, the networks consist of across-layer communities only. As illustrated in the figures, the proposed ML-USCL achieves higher purity scores over the range of $\mu$, whereas GenLov and BLSC achieve better performance in terms of NMI and ARI. This improvement in the performance of GenLov and BLSC compared to the networks with $\alpha = 0$ and $\alpha = 0.5$ is due to the fact that the networks consist of larger communities. However, the performance of both methods is inferior to ML-USCL for $\mu > 0.4$, which indicates that BLSC and GenLov are not robust to noise.

5.1.2. Weighted Simulated Networks

  • Weighted network description: The two-layer simulated weighted MLNs are generated from a truncated Gaussian distribution over the range [0, 1]. The networks are generated based on the parameters (μ_w, σ_w) and (μ_b, σ_b), which refer to the mean and standard deviation of the edge weights within and between communities, respectively. Several two-layer MLNs are generated by varying the ground-truth structure, including the number of communities (NOCs). Furthermore, a percentage of sparse noise (SPN%) is randomly introduced into the networks to assess the algorithms' ability to handle noise.
  • Experiment: In this experiment, multilayer weighted networks (MLWNs) with 2 layers and 100 nodes per layer are generated. The parameters of the constructed MLWNs are reported in Table 1. Three different MLWNs are generated with different ground-truth communities and varying sparse noise levels, SPN% = {0%, 5%, 10%, 15%, 20%, 25%, 30%}. The proposed ML-USCL is compared to the other algorithms, and the results are shown in Figure 2. Figure 2 clearly demonstrates the superiority of ML-USCL over the other algorithms on weighted networks in terms of all metrics. Moreover, the proposed approach is robust to the addition of sparse noise, accurately detecting the community structure even as the percentage of added sparse noise, SPN%, increases.
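A minimal sketch of generating one layer of such a weighted benchmark: edge weights are drawn from a Gaussian truncated to [0, 1], with a larger mean inside communities than between them, and a fraction SPN% of entries is perturbed. The exact noise model used in the paper is not specified here; overwriting a random SPN% of entries with uniform values is one plausible choice, labeled as an assumption.

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(0)

def tnorm(mu, sigma, size):
    # Gaussian truncated to [0, 1]
    a, b = (0 - mu) / sigma, (1 - mu) / sigma
    return truncnorm.rvs(a, b, loc=mu, scale=sigma, size=size, random_state=rng)

def weighted_layer(labels, mu_w=0.5, s_w=0.1, mu_b=0.3, s_b=0.2, spn=0.1):
    n = len(labels)
    same = np.equal.outer(labels, labels)          # True for within-community pairs
    A = np.where(same, tnorm(mu_w, s_w, (n, n)), tnorm(mu_b, s_b, (n, n)))
    # sparse noise (assumed model): overwrite SPN% of entries with uniform values
    mask = rng.random((n, n)) < spn
    A[mask] = rng.random(mask.sum())
    A = np.triu(A, 1)
    A = A + A.T                                    # symmetric, zero diagonal
    return A

labels = np.repeat([0, 1], 50)                     # two communities of 50 nodes
A = weighted_layer(labels)
print(A.shape, np.allclose(A, A.T))
```

The inter-layer (bipartite) graph can be generated the same way with its own (μ_w, σ_w, μ_b, σ_b) parameters.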

5.2. Scalability Comparison

To evaluate the scalability of the proposed ML-USCL, we constructed a set of weighted multilayer networks of varying sizes. The network sizes ranged from 32 to 8192 nodes on a logarithmic scale. For each network, the within- and between-community edge weights were randomly drawn from a truncated Gaussian distribution over the range [0, 1] with the following parameters: μ_w^1 = 0.5, σ_w^1 = 0.1, μ_b^1 = 0.3, σ_b^1 = 0.2, μ_w^2 = 0.7, σ_w^2 = 0.1, μ_b^2 = 0.2, σ_b^2 = 0.2, μ_w^12 = 0.5, σ_w^12 = 0.1, μ_b^12 = 0.3, and σ_b^12 = 0.2, where the superscripts refer to the within- and across-layer graphs. Each network consisted of two equal-sized communities per layer: C_1^1 and C_2^1 in Layer 1, and C_1^2 and C_2^2 in Layer 2. The community structure of the multilayer network comprised C_1^12 = {C_1^1, C_1^2}, C_2^1, and C_2^2. The number of communities was specified as an input for all the algorithms. The run time of the different algorithms was measured as the network size varied, and the results are displayed in Figure 3, which illustrates that the run time of the different methods exhibits a log-linear relationship as the number of nodes increases. Moreover, the proposed ML-USCL scales better than CSNMTF as the number of nodes grows and is comparable to BLSC and CPNMF, although GenLov exhibits a faster run time than ML-USCL. The order of complexity of the different algorithms is given in Table 2. Nonetheless, as shown by the different experiments, the proposed algorithm maintains better performance than all other methods in terms of detecting an accurate community structure.
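The scalability experiment amounts to timing each method on networks of increasing size and plotting run time against node count on log axes. A minimal sketch, using plain normalized spectral clustering as a stand-in workload for any of the compared methods:

```python
import time
import numpy as np

def spectral_cluster(A, k):
    # normalized spectral clustering embedding as a placeholder workload
    d = A.sum(axis=1)
    d[d == 0] = 1
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt   # normalized Laplacian
    _, vecs = np.linalg.eigh(L)
    return vecs[:, :k]   # spectral embedding; k-means would follow in practice

rng = np.random.default_rng(1)
times = {}
for n in [32, 64, 128, 256]:          # the paper's sweep continues up to 8192
    A = (rng.random((n, n)) < 0.1).astype(float)
    A = np.triu(A, 1)
    A = A + A.T                        # random symmetric test graph
    t0 = time.perf_counter()
    spectral_cluster(A, 2)
    times[n] = time.perf_counter() - t0
print(times)
```

Plotting `times` on a log-log scale reproduces the kind of comparison shown in Figure 3; the dense eigendecomposition above is the dominant cost, consistent with the O(k_m n_m^2)-type complexities in Table 2.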

5.3. Regularization Parameters Selection

The proposed ML-USCL incorporates two regularization parameters, namely λ_1 and λ_2. Both parameters weight the similarity between the orthonormal subspaces within and across layers. The impact of these regularization parameters on the algorithm's performance is investigated through experimental validation. We observed that the selection of λ_1 and λ_2 depends on the characteristics of the multilayer network under examination. If the network primarily consists of within-layer communities, smaller values of the regularization parameters are advised. Conversely, if the network comprises predominantly across-layer communities, or both within- and across-layer communities, larger values of the parameters are recommended. In the proposed ML-USCL, we search for the best λ_1 and λ_2 jointly over a grid spanning [0.1, 10].
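The joint grid search over λ_1 and λ_2 can be sketched as below. Here `run_ml_uscl` and `quality` are hypothetical stand-ins for the actual ML-USCL solver and a partition-quality score (e.g., multilayer modularity); the paper specifies only the grid range [0.1, 10], not the selection criterion, so the toy score is purely to make the loop runnable.

```python
import itertools
import numpy as np

grid = np.logspace(-1, 1, 5)   # 0.1 ... 10 on a log scale

def run_ml_uscl(lam1, lam2):
    # placeholder: the real solver would return a community assignment
    return None

def quality(partition, lam1, lam2):
    # toy score so the loop runs; a real criterion would score the partition
    return -(np.log10(lam1) ** 2 + np.log10(lam2) ** 2)

# evaluate every (λ1, λ2) pair jointly and keep the best-scoring one
best = max(itertools.product(grid, grid),
           key=lambda p: quality(run_ml_uscl(*p), *p))
print(best)
```

With the toy score, the search returns the grid midpoint (1.0, 1.0); in practice the chosen pair would follow the guidance above, smaller for within-layer-dominated networks and larger when across-layer communities dominate.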

5.4. Real-World Networks

To evaluate the effectiveness of the proposed method and compare it with other algorithms in identifying the community structure in real-world networks, six real-world networks are considered, as shown in Table 3. A brief description of these networks is given in the following section.

5.4.1. Networks Description

  • Lazega law-firm network (https://manliodedomenico.com/data.php) (accessed on 20 June 2023) [38] is a multilayer network that represents the interactions between 71 partners and associates in a corporate law partnership, where each layer corresponds to a specific type of interaction among the individuals. The intra-layer graphs capture co-work and advice relationships, while the inter-layer graph reflects friendship relationships. The dataset also includes seven attributes that can be used to evaluate the quality of the detected communities. In this study, the ground-truth community structure is determined based on the office location attribute, which could be either Boston, Hartford, or Providence.
  • MIT Reality Mining (http://reality.media.mit.edu/download.php) (accessed on 20 June 2023) [39] network depicts various modes of mobile phone communication among 87 users, with edges indicating physical location, Bluetooth scans, and phone calls. The network (https://github.com/VGligorijevic/NF-CCE/tree/master/data/nets) (accessed on 20 June 2023) is constructed as a two-layer network, where the intra-layer graphs represent the physical location and Bluetooth scans, and the inter-layer graph represents phone call interactions. Further details on the construction of the network can be found in [40]. The ground-truth community structure in this network corresponds to the affiliations of the users.
  • The C. Elegans network (https://manliodedomenico.com/data.php) (accessed on 20 June 2023) [8,41] is a multilayer network that depicts the synaptic junctions, both electric and chemical (monadic and polyadic), between neurons in the C. Elegans nervous system. The network comprises 279 neurons, each grouped into categories such as bodywall, mechanosensory, and head motor neurons; these categories can be considered the ground-truth structure. In the constructed two-layer network, intra-layer edges denote the monadic and polyadic synaptic junctions among neurons, whereas inter-layer edges represent the electric junctions.
  • Cora (https://people.cs.umass.edu/~mccallum/data.html) (accessed on 20 June 2023) data set is a subset of the Cora bibliographic data set. The Cora MLN consists of 292 nodes that refer to research papers. The intra-layer edges reflect the title and abstract similarities between the different research papers and the inter-layer edges model the citation relationships between them. The clusters in the network correspond to the research fields, namely data mining, natural language processing, and robotics.
  • COIL20 (https://www.cs.columbia.edu/CAVE/software/softlib/) (accessed on 20 June 2023) is a data set comprising 1440 images obtained from the Columbia object image library, where the intra- and inter-layer graphs represent different image features. The intra-layer graphs represent the local binary patterns (LBPs) and Gabor features, whereas the inter-layer graph represents the intensity feature. The data set consists of 20 communities, each referring to a group of related images.
  • UCI (https://archive.ics.uci.edu/ml/datasets/Multiple+Features) (accessed on 20 June 2023) [42] consists of features extracted from handwritten digits (0–9) obtained from a collection of Dutch utility maps. The data set contains a total of 2000 digit patterns, with 200 patterns per digit. These patterns are represented using different sets of features, including Fourier coefficients of the character shapes, profile correlations, and Karhunen–Loève coefficients.
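Despite their different origins, each data set above reduces to the same three matrices: two intra-layer adjacency matrices A1 and A2 and an inter-layer (bipartite) matrix A12. A generic supra-adjacency view stacks them into one block matrix; this is a common representation sketch, not necessarily the paper's internal data structure.

```python
import numpy as np

def supra_adjacency(A1, A2, A12):
    """Block matrix [[A1, A12], [A12^T, A2]] for a two-layer network."""
    n1, n2 = A1.shape[0], A2.shape[0]
    assert A12.shape == (n1, n2)
    top = np.hstack([A1, A12])
    bottom = np.hstack([A12.T, A2])
    return np.vstack([top, bottom])   # (n1 + n2) x (n1 + n2)

A1 = np.array([[0, 1], [1, 0]], dtype=float)   # toy 2-node layer
A2 = np.array([[0, 1], [1, 0]], dtype=float)
A12 = np.eye(2)                                # each node linked to its replica
S = supra_adjacency(A1, A2, A12)
print(S.shape)  # (4, 4)
```

The block structure makes the symmetry of the two-layer representation explicit: if A1 and A2 are symmetric, so is the supra-adjacency matrix.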

5.4.2. Experiments

The accuracy of the proposed ML-USCL in uncovering the community structure of the six real-world networks is evaluated and compared with the other existing methods, and the results are reported in Table 4. Based on the table, the proposed ML-USCL for community detection in two-layer MLNs shows significant improvement over the other algorithms. The community structure is evaluated individually for each layer, and the proposed approach exhibits better performance than the other algorithms, achieving superior results in at least one quality metric in both layers. This indicates that the proposed approach is highly effective in identifying communities in MLNs. Moreover, the run time required by each method is presented in Table 5: the proposed ML-USCL is faster than CSNMTF and CPNMF, comparable to BLSC, and slower than GenLov. Nevertheless, ML-USCL outperforms the other algorithms. These results suggest that the proposed approach can be a valuable tool in applications that require the identification of communities in MLNs, such as social, biological, phone, and citation network analysis. Overall, the proposed approach has the potential to advance the field of community detection in MLNs and enable more accurate and efficient analysis of complex systems.

6. Conclusions

Community detection in multilayer networks is an active research area with various challenges and opportunities. The structure of multilayer networks provides additional information that can be used to enhance the accuracy and interpretability of community detection methods. In this article, a unified spectral-clustering-based community detection method for two-layer MLNs is introduced. The task of identifying the community structure in two-layer MLNs is expressed as an optimization problem, in which the normalized cut is minimized for each layer while also considering the normalized cut for the bipartite network. This optimization is performed concurrently with regularization terms that ensure the coherence of the community structures both within and across layers.
Multiple experiments have been conducted to evaluate the effectiveness of the proposed approach for community detection in two-layer unweighted and weighted simulated and real-world MLNs. These experiments demonstrate the efficiency and accuracy of the proposed ML-USCL in detecting the community structure in two-layer MLNs. In addition, ML-USCL is robust to noise compared to existing approaches. Finally, the ability to use the same objective function for both weighted and unweighted networks, while being robust to noise and outliers, makes the proposed method applicable to a wide range of MLNs.
For future work, we will focus on generalizing the proposed ML-USCL and addressing its limitations. In particular, the proposed approach will be extended to handle MLNs with more than two layers. The extension will be developed by first constructing a multidimensional array or tensor that represents the multilayer network and then applying tensor decomposition to reveal the underlying communities in the network. Finally, we would like to point out that the proposed approach in its current formulation can be adopted to detect the community structure in heterogeneous MLNs, i.e., nodes in the different layers refer to different objects. Future work will perform more experiments to validate the extension of the proposed ML-USCL in heterogeneous MLNs.
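The planned extension to M > 2 layers represents the network as a third-order tensor and applies a tensor decomposition. As a hedged illustration of the first step only, the sketch below stacks layer adjacency matrices into an n x n x M array and takes a low-rank factorization of its mode-1 unfolding, a simple stand-in for a full Tucker or CP decomposition.

```python
import numpy as np

rng = np.random.default_rng(2)
n, M, k = 30, 3, 2
# M random binary layer adjacency matrices as placeholder data
layers = [(rng.random((n, n)) < 0.2).astype(float) for _ in range(M)]
T = np.stack(layers, axis=2)       # n x n x M adjacency tensor

unfolded = T.reshape(n, n * M)     # mode-1 unfolding: rows indexed by nodes
U, s, _ = np.linalg.svd(unfolded, full_matrices=False)
embedding = U[:, :k]               # rank-k node embedding shared across layers
print(embedding.shape)
```

Clustering the rows of `embedding` (e.g., with k-means) would then yield communities consistent across all M layers, which is the spirit of the proposed tensor-based extension.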

Author Contributions

Conceptualization, E.A.-s. and S.A.; methodology, E.A.-s. and S.A.; software, E.A.-s. and S.A.; validation, E.A.-s. and S.A.; formal analysis, E.A.-s. and S.A.; investigation, E.A.-s. and S.A.; resources, E.A.-s. and S.A.; data curation, E.A.-s. and S.A.; writing—original draft preparation, E.A.-s. and S.A.; writing—review and editing, E.A.-s. and S.A.; visualization, E.A.-s. and S.A.; supervision, E.A.-s. and S.A.; project administration, E.A.-s. and S.A.; funding acquisition, E.A.-s. and S.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Jordan University of Science and Technology under Research Grant Number 20220277 and the National Science Foundation under CCF-2006800.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to thank our colleague Abdullah Karaaslanli for providing us with the multilayer LFR network code.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Boccaletti, S.; Latora, V.; Moreno, Y.; Chavez, M.; Hwang, D.U. Complex networks: Structure and dynamics. Phys. Rep. 2006, 424, 175–308. [Google Scholar] [CrossRef]
  2. Al-Sharoa, E.; Al-Khassaweneh, M.; Aviyente, S. Tensor based temporal and multilayer community detection for studying brain dynamics during resting state fMRI. IEEE Trans. Biomed. Eng. 2018, 66, 695–709. [Google Scholar] [CrossRef] [PubMed]
  3. Kivelä, M.; Arenas, A.; Barthelemy, M.; Gleeson, J.P.; Moreno, Y.; Porter, M.A. Multilayer networks. J. Complex Netw. 2014, 2, 203–271. [Google Scholar] [CrossRef] [Green Version]
  4. Papakostas, D.; Basaras, P.; Katsaros, D.; Tassiulas, L. Backbone formation in military multi-layer ad hoc networks using complex network concepts. In Proceedings of the MILCOM 2016—2016 IEEE Military Communications Conference, Baltimore, MD, USA, 1–3 November 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 842–848. [Google Scholar]
  5. Berlingerio, M.; Coscia, M.; Giannotti, F. Finding and characterizing communities in multidimensional networks. In Proceedings of the 2011 international conference on advances in social networks analysis and mining, Kaohsiung, Taiwan, 25–27 July 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 490–494. [Google Scholar]
  6. Burgess, M.; Adar, E.; Cafarella, M. Link-prediction enhanced consensus clustering for complex networks. PLoS ONE 2016, 11, e0153384. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  7. Mucha, P.J.; Richardson, T.; Macon, K.; Porter, M.A.; Onnela, J.P. Community structure in time-dependent, multiscale, and multiplex networks. Science 2010, 328, 876–878. [Google Scholar] [CrossRef] [PubMed]
  8. De Domenico, M.; Porter, M.A.; Arenas, A. MuxViz: A tool for multilayer analysis and visualization of networks. J. Complex Netw. 2015, 3, 159–176. [Google Scholar] [CrossRef]
  9. Kuncheva, Z.; Montana, G. Community detection in multiplex networks using locally adaptive random walks. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, Paris, France, 25–28 August 2015; pp. 1308–1315. [Google Scholar]
  10. Alimadadi, F.; Khadangi, E.; Bagheri, A. Community detection in facebook activity networks and presenting a new multilayer label propagation algorithm for community detection. Int. J. Mod. Phys. B 2019, 33, 1950089. [Google Scholar] [CrossRef]
  11. Gligorijević, V.; Panagakis, Y.; Zafeiriou, S. Non-negative matrix factorizations for multiplex network analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 928–940. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  12. Ma, X.; Dong, D.; Wang, Q. Community detection in multi-layer networks using joint non-negative matrix factorization. IEEE Trans. Knowl. Data Eng. 2018, 31, 273–286. [Google Scholar] [CrossRef]
  13. Al-Sharoa, E.M.; Aviyente, S. Community Detection in Fully-Connected Multi-layer Networks Through Joint Nonnegative Matrix Factorization. IEEE Access 2022, 10, 43022–43043. [Google Scholar] [CrossRef]
  14. Zhang, H.; Wang, C.D.; Lai, J.H.; Yu, P.S. Modularity in complex multilayer networks with multiple aspects: A static perspective. Appl. Inform. 2017, 4, 7. [Google Scholar] [CrossRef] [Green Version]
  15. Pramanik, S.; Tackx, R.; Navelkar, A.; Guillaume, J.L.; Mitra, B. Discovering community structure in multilayer networks. In Proceedings of the 2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Tokyo, Japan, 19–21 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 611–620. [Google Scholar]
  16. Chen, C.; Ng, M.; Zhang, S. Block spectral clustering for multiple graphs with inter-relation. Netw. Model. Anal. Health Inform. Bioinform. 2017, 6, 8. [Google Scholar] [CrossRef]
  17. De Domenico, M.; Lancichinetti, A.; Arenas, A.; Rosvall, M. Identifying modular flows on multilayer networks reveals highly overlapping organization in interconnected systems. Phys. Rev. X 2015, 5, 011027. [Google Scholar] [CrossRef] [Green Version]
  18. Yang, Z.; Algesheimer, R.; Tessone, C.J. A comparative analysis of community detection algorithms on artificial networks. Sci. Rep. 2016, 6, 30750. [Google Scholar] [CrossRef] [Green Version]
  19. Von Luxburg, U. A tutorial on spectral clustering. Stat. Comput. 2007, 17, 395–416. [Google Scholar] [CrossRef]
  20. Ng, A.Y.; Jordan, M.I.; Weiss, Y. On spectral clustering: Analysis and an algorithm. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 9–14 December 2002; pp. 849–856. [Google Scholar]
  21. Dhillon, I.S. Co-clustering documents and words using bipartite spectral graph partitioning. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 26–29 August 2001; ACM: New York, NY, USA, 2001; pp. 269–274. [Google Scholar]
  22. Mirzal, A.; Furukawa, M. Eigenvectors for clustering: Unipartite, bipartite, and directed graph cases. In Proceedings of the 2010 International Conference on Electronics and Information Engineering, Kyoto, Japan, 1–3 August 2010; IEEE: Piscataway, NJ, USA, 2010; Volume 1, pp. V1-303–V1-309. [Google Scholar]
  23. Hamm, J.; Lee, D.D. Grassmann discriminant analysis: A unifying view on subspace-based learning. In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 5–9 July 2008; pp. 376–383. [Google Scholar]
  24. Dong, X.; Frossard, P.; Vandergheynst, P.; Nefedov, N. Clustering on multi-layer graphs via subspace analysis on Grassmann manifolds. IEEE Trans. Signal Process. 2013, 62, 905–918. [Google Scholar] [CrossRef] [Green Version]
  25. Kumar, A.; Rai, P.; Daume, H. Co-regularized multi-view spectral clustering. Adv. Neural Inf. Process. Syst. 2011, 24, 1413–1421. [Google Scholar]
  26. Chi, Y.; Song, X.; Zhou, D.; Hino, K.; Tseng, B.L. Evolutionary spectral clustering by incorporating temporal smoothness. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, CA, USA, 12–15 August 2007; pp. 153–162. [Google Scholar]
  27. Traag, V.; Aldecoa, R.; Delvenne, J. Detecting communities using asymptotical surprise. Phys. Rev. E 2015, 92, 022816. [Google Scholar] [CrossRef] [Green Version]
  28. Bezdek, J.C.; Hathaway, R.J. Some notes on alternating optimization. In Proceedings of the Advances in Soft Computing—AFSS 2002: 2002 AFSS International Conference on Fuzzy Systems, Calcutta, India, 3–6 February 2002; Springer: Cham, Switzerland, 2002; pp. 288–300. [Google Scholar]
  29. Bezdek, J.C.; Hathaway, R.J. Convergence of alternating optimization. Neural Parallel Sci. Comput. 2003, 11, 351–368. [Google Scholar]
  30. Miyauchi, A.; Kawase, Y. What is a network community? A novel quality function and detection algorithms. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, Melbourne, Australia, 18–23 October 2015; pp. 1471–1480. [Google Scholar]
  31. Chakraborty, T.; Dalmia, A.; Mukherjee, A.; Ganguly, N. Metrics for community analysis: A survey. ACM Comput. Surv. 2017, 50, 54. [Google Scholar] [CrossRef]
  32. Blondel, V.D.; Guillaume, J.L.; Lambiotte, R.; Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, 2008, P10008. [Google Scholar] [CrossRef] [Green Version]
  33. Danon, L.; Diaz-Guilera, A.; Duch, J.; Arenas, A. Comparing community structure identification. J. Stat. Mech. Theory Exp. 2005, 2005, P09008. [Google Scholar] [CrossRef] [Green Version]
  34. Hubert, L.; Arabie, P. Comparing partitions. J. Classif. 1985, 2, 193–218. [Google Scholar] [CrossRef]
  35. Schütze, H.; Manning, C.D.; Raghavan, P. Introduction to Information Retrieval; Cambridge University Press: Cambridge, UK, 2008; Volume 39. [Google Scholar]
  36. Lancichinetti, A.; Fortunato, S.; Radicchi, F. Benchmark graphs for testing community detection algorithms. Phys. Rev. E 2008, 78, 046110. [Google Scholar] [CrossRef] [Green Version]
  37. Bródka, P. A method for group extraction and analysis in multilayer social networks. arXiv 2016, arXiv:1612.02377. [Google Scholar]
  38. Lazega, E. The Collegial Phenomenon: The Social Mechanisms of Cooperation among Peers in a Corporate Law Partnership; Oxford University Press on Demand: Oxford, UK, 2001. [Google Scholar]
  39. Eagle, N.; Pentland, A.S.; Lazer, D. Inferring friendship network structure by using mobile phone data. Proc. Natl. Acad. Sci. USA 2009, 106, 15274–15278. [Google Scholar] [CrossRef] [PubMed]
  40. Dong, X.; Frossard, P.; Vandergheynst, P.; Nefedov, N. Clustering with multi-layer graphs: A spectral perspective. IEEE Trans. Signal Process. 2012, 60, 5820–5831. [Google Scholar] [CrossRef] [Green Version]
  41. Chen, B.L.; Hall, D.H.; Chklovskii, D.B. Wiring optimization can relate neuronal structure and function. Proc. Natl. Acad. Sci. USA 2006, 103, 4723–4728. [Google Scholar] [CrossRef] [PubMed]
  42. Dua, D.; Graff, C. UCI Machine Learning Repository. 2017. Available online: http://archive.ics.uci.edu/ml (accessed on 20 June 2023).
Figure 1. Comparison among the various methods in detecting the community structure of LFR benchmark binary networks in terms of NMI, ARI, and purity with α = {0, 0.5, 1} and variable mixing parameter μ: (a–c) α = 0; (d–f) α = 0.5; (g–i) α = 1.
Figure 2. Comparison conducted among the various methods to evaluate their effectiveness in recovering the community structure of the multilayer weighted networks (MLWNs) from Table 1 in terms of NMI, ARI, and purity with different levels of added sparse noise percent (SPN%): (ac) MLWN1; (df) MLWN2; (gi) MLWN3.
Figure 3. Scalability comparison between the different methods.
Table 1. Parameters of the simulated weighted two-layer MLNs.
Columns: Network | (n_1, n_2) | NOC | Ground truth: cluster (nodes in the cluster) | (μ_w, σ_w, μ_b, σ_b)

MLWN 1 | (100, 100) | NOC = 7
  Graph 1: C_1^1 (1–20), C_2^1 (21–40), C_3^1 (41–60), C_4^1 (61–100); G^1: (0.5, 0.3, 0.3, 0.2)
  Graph 2: C_1^2 (1–20), C_2^2 (21–35), C_3^2 (36–50), C_4^2 (51–75), C_5^2 (76–100); G^2: (0.5, 0.2, 0.2, 0.2)
  Ground truth: C_1^12 = {C_1^1, C_1^2}, C_2^12 = {C_2^1, C_2^2}, C_3^1, C_4^1, C_3^2, C_4^2, C_5^2; G^12: (0.8, 0.2, 0.3, 0.1)

MLWN 2 | (100, 100) | NOC = 4
  Graph 1: C_1^1 (1–40), C_2^1 (41–100); G^1: (0.5, 0.4, 0.3, 0.2)
  Graph 2: C_1^2 (1–30), C_2^2 (31–60), C_3^2 (61–100); G^2: (0.5, 0.1, 0.2, 0.2)
  Ground truth: C_1^12 = {C_1^1, C_1^2}, C_2^1, C_2^2, C_3^2; G^12: (0.5, 0.2, 0.3, 0.1)

MLWN 3 | (100, 100) | NOC = 8
  Graph 1: C_1^1 (1–20), C_2^1 (21–40), C_3^1 (41–60), C_4^1 (61–100); G^1: (0.5, 0.4, 0.3, 0.2)
  Graph 2: C_1^2 (1–15), C_2^2 (16–35), C_3^2 (36–50), C_4^2 (51–75), C_5^2 (76–100); G^2: (0.5, 0.1, 0.2, 0.2)
  Ground truth: C_1^12 = {C_1^1, C_1^2}, C_2^1, C_3^1, C_4^1, C_2^2, C_3^2, C_4^2, C_5^2; G^12: (0.5, 0.2, 0.3, 0.1)
Table 2. Computational complexity of the different methods: M is the number of layers, n is the total number of nodes in the MLN, n m is the number of nodes in the mth layer, l m S is the maximum number of iterations required by CSNMTF, l m P is the maximum number of iterations required by CPNMF, l m is the maximum number of iterations required by ML-USCL, K is the number of communities in the MLN, and k m is the number of communities in the mth layer.
Method | Computational Complexity
GenLov | O(n log n)
BLSC | O(K (Σ_{m=1}^{M} n_m)^2)
CSNMTF | M l_m^S O(k_m n_m^2)
CPNMF | M l_m^P O(k_m n_m^2)
ML-USCL | M l_m O(k_m n_m^2)
Table 3. Description of the two-layer real networks.
Network | Nodes per Layer | Number of Edges | Average Degree | NOC
Lazega | 71 | 4624 | 65.13 | 3
MIT | 87 | 2790 | 32.06 | 6
C. Elegans | 279 | 7266 | 27.03 | 12
Cora | 292 | 19,044 | 10.80 | 3
Coil20 | 1440 | 8,294,400 | 3611.4 | 20
UCI | 2000 | 1,011,534 | 478.99 | 12
Table 4. Community detection performance comparison for real-world multilayer networks.
Network (Layer) | Metric | BLSC | GenLov | CSNMTF | CPNMF | ML-USCL
Lazega (Layer 1) | NMI | 0.5129 | 0.5816 | 0.5732 | 0.6332 | 0.8943 *
Lazega (Layer 1) | ARI | 0.3524 | 0.4724 | 0.4445 | 0.4731 | 0.9515 *
Lazega (Layer 1) | Purity | 0.9155 | 0.9296 | 0.9296 | 0.9437 | 0.9577 *
Lazega (Layer 2) | NMI | 0.4858 | 0.5816 | 0.5732 | 0.6332 | 0.7507 *
Lazega (Layer 2) | ARI | 0.3596 | 0.4724 | 0.4445 | 0.4731 | 0.8482 *
Lazega (Layer 2) | Purity | 0.9014 | 0.9296 | 0.9296 | 0.9437 | 0.9437 *
MIT (Layer 1) | NMI | 0.1319 | 0.1808 | 0.3197 | 0.3358 | 0.4647 *
MIT (Layer 1) | ARI | 0.0326 | 0.0030 | 0.2193 | 0.2803 | 0.3498 *
MIT (Layer 1) | Purity | 0.4828 | 0.4943 | 0.6092 | 0.6322 | 0.6782 *
MIT (Layer 2) | NMI | 0.4199 | 0.5251 * | 0.3197 | 0.3358 | 0.4760
MIT (Layer 2) | ARI | 0.3092 | 0.3889 * | 0.2193 | 0.2803 | 0.3855
MIT (Layer 2) | Purity | 0.6437 | 0.6897 | 0.6092 | 0.6322 | 0.7011 *
C. Elegans (Layer 1) | NMI | 0.4614 | 0.4211 | 0.4352 | 0.4710 | 0.4735 *
C. Elegans (Layer 1) | ARI | 0.2429 | 0.2652 | 0.1771 | 0.2138 | 0.3714 *
C. Elegans (Layer 1) | Purity | 0.5167 | 0.4796 | 0.5613 * | 0.5502 | 0.5279
C. Elegans (Layer 2) | NMI | 0.4404 | 0.4198 | 0.4352 | 0.4710 * | 0.4669
C. Elegans (Layer 2) | ARI | 0.1988 | 0.2703 | 0.1771 | 0.2138 | 0.3620 *
C. Elegans (Layer 2) | Purity | 0.4870 | 0.4833 | 0.5613 * | 0.5502 | 0.5130
Cora (Layer 1) | NMI | 0.1919 | 0.5634 | 0.5136 | 0.2315 | 0.5700 *
Cora (Layer 1) | ARI | 0.0276 | 0.5615 * | 0.5306 | 0.1037 | 0.5059
Cora (Layer 1) | Purity | 0.4692 | 0.8938 * | 0.8699 | 0.5205 | 0.8390
Cora (Layer 2) | NMI | 0.124 | 0.6420 | 0.5136 | 0.2315 | 0.8351 *
Cora (Layer 2) | ARI | 0.0163 | 0.6502 | 0.5306 | 0.1037 | 0.8815 *
Cora (Layer 2) | Purity | 0.4075 | 0.9247 | 0.8699 | 0.5205 | 0.9692 *
Coil20 (Layer 1) | NMI | 0.1921 | 0.3962 | 0.7458 | 0.0429 | 0.7719 *
Coil20 (Layer 1) | ARI | 0.0132 | 0.1307 | 0.5146 | 0 | 0.6174 *
Coil20 (Layer 1) | Purity | 0.1438 | 0.1500 | 0.6743 | 0.0993 | 0.7653 *
Coil20 (Layer 2) | NMI | 0 | 0.3917 | 0.7458 | 0.0429 | 0.7646 *
Coil20 (Layer 2) | ARI | 0 | 0.1301 | 0.5146 | 0 | 0.6105 *
Coil20 (Layer 2) | Purity | 0.0500 | 0.1500 | 0.6743 | 0.0993 | 0.7188 *
UCI (Layer 1) | NMI | 0.7613 | 0.7757 | 0.8006 | 0.7566 | 0.8064 *
UCI (Layer 1) | ARI | 0.6761 | 0.6383 | 0.7427 | 0.6023 | 0.7722 *
UCI (Layer 1) | Purity | 0.8170 | 0.6615 | 0.8775 | 0.7600 | 0.8990 *
UCI (Layer 2) | NMI | 0.7756 | 0.7878 | 0.8006 | 0.7566 | 0.8050 *
UCI (Layer 2) | ARI | 0.6863 | 0.6524 | 0.7427 | 0.6023 | 0.7441 *
UCI (Layer 2) | Purity | 0.8210 | 0.6680 | 0.8775 | 0.7600 | 0.8880 *
(* marks the best result in each row.)
Table 5. Run time taken by the different methods for the real-world MLNs.
Network | GenLov | BLSC | CSNMTF | CPNMF | ML-USCL
Lazega | 0.0250 | 0.2549 | 2.4958 | 0.5251 | 0.2155
MIT | 0.0108 | 0.1873 | 4.9985 | 0.7384 | 0.2594
C. Elegans | 0.0184 | 1.7216 | 81.6031 | 11.3486 | 0.9102
Cora | 0.0310 | 0.6031 | 14.2655 | 0.7248 | 0.5368
Coil20 | 0.9188 | 2.8096 | 155.743 | 3.0271 | 5.3361
UCI | 0.8801 | 6.4001 | 394.735 | 11.255 | 29.5577
