Article

Quantifying the Complexity of Nodes in Higher-Order Networks Using the Infomap Algorithm

1 College of Science, National University of Defense Technology, Changsha 410073, China
2 College of Foreign Languages, Zhejiang Normal University, Jinhua 321004, China
3 College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
4 National Research Center of Parallel Computer Engineering and Technology, Beijing 100190, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Systems 2024, 12(9), 347; https://doi.org/10.3390/systems12090347
Submission received: 2 July 2024 / Revised: 20 August 2024 / Accepted: 27 August 2024 / Published: 3 September 2024
(This article belongs to the Section Systems Theory and Methodology)

Abstract

Accurately quantifying the complexity of nodes in a network is crucial for revealing their roles and network complexity, as well as predicting network emergent phenomena. In this paper, we propose three novel complexity metrics for nodes to reflect the extent to which they participate in organized, structured interactions in higher-order networks. Our higher-order network is built using the BuildHON+ model, where communities are detected using the Infomap algorithm. Since a physical node may contain one or more higher-order nodes in higher-order networks, it may simultaneously exist in one or more communities. The complexity of a physical node is defined by the number and size of the communities to which it belongs, as well as the number of higher-order nodes it contains within the same community. Empirical flow datasets are used to evaluate the effectiveness of the proposed metrics, and the results demonstrate their efficacy in characterizing node complexity in higher-order networks.

1. Introduction

The complexity of a network reflects the extent to which its nodes participate in organized, structured interactions [1]. Highly complex networks achieve a balance between order and disorder, exhibiting both regularity and randomness, and have a high likelihood of emergent phenomena [2]. Accurately quantifying network complexity is critical for revealing network characteristics and predicting network emergent phenomena [3]. Common network complexity metrics based on topology [4,5] attempt to calculate the number of nodes or interactions in a network, highlighting the significance of node complexity as a key component of network complexity. Therefore, accurate quantification of node complexity can reveal the role of nodes within a network well and help to predict network emergent phenomena [6]. In addition, the complexity of a node can be defined as the extent to which it participates in organized, structured interactions [7].
Recently, many higher-order network models [8,9,10,11] have been proposed as more accurate representations of underlying data, focusing on features that first-order network models fail to capture. By incorporating higher-order attributes during the network-construction process, higher-order networks can describe complex interactions among multiple nodes that go beyond pairwise connections, thereby enhancing the effectiveness of network-analysis methods [12,13]. This improvement arises because previous studies, when constructing networks from sequential data, typically overlooked relationships beyond pairwise node connections, implicitly assuming a first-order Markov process. This assumption led to the omission of significant higher-order information present in the original data [14]. To address this issue, Rosvall et al. [15] developed a second-order Markov network model to capture dynamic behaviors, which not only revealed actual travel patterns in air traffic but also uncovered propagation patterns within citation networks. Subsequently, Xu et al. [16] proposed the BuildHON algorithm, which extracts patterns or rules with frequencies high enough to be considered significant and constructs higher-order networks through two steps: rule extraction and network rewiring. Building on this, Saebi et al. [17] introduced the BuildHON+ algorithm, which controls the growth of higher-order rules through dynamic thresholds, thus reducing the time and space complexity compared to the BuildHON algorithm. Additionally, some studies have proposed preliminary quantitative descriptions of node complexity in higher-order networks. For example, Rosvall et al. [15] used the Infomap algorithm within second-order Markov networks to identify hub cities in air traffic and to uncover multidisciplinary journals in scientific communication. Xu et al. [16] applied the Infomap algorithm in higher-order networks to effectively locate the Freeport of Malta—one of Europe’s busiest ports. 
Fu et al. [18] utilized the PageRank algorithm within the multi-scale higher-order dependency networks to identify key nodes within the global liner shipping network, revealing the dual circular economy models in the relevant countries.
Motivated by these works, we propose three quantitative complexity metrics for nodes in higher-order networks [17] using the Infomap algorithm [19,20]. First, we build a higher-order network from flow data and detect communities using the Infomap algorithm. Next, we define three complexity metrics for nodes based on the number and size of the communities to which they belong, as well as the number of higher-order nodes contained within the same community. Finally, we demonstrate the effectiveness of our proposed complexity metrics using empirical flow datasets.

2. Node Complexity Metrics Definition

2.1. Node Complexity Metrics Definition in First-Order Networks

A first-order network (FON) is defined as a graph $G = (V, E)$, where $V = \{a, b, c, \dots\}$ denotes the physical node set and $E$ represents the physical edge set. A community in a first-order network is defined as a node set $O_i$, $i = 1, 2, 3, \dots, n$, such that $V = O_1 \cup O_2 \cup O_3 \cup \dots \cup O_n$. When $i \neq j$, we have $O_i \cap O_j = \emptyset$.
Based on community detection in first-order networks, three node complexity metrics are defined below. First, let $\delta_i(x)$ be defined as:
$$\delta_i(x) = \begin{cases} 1, & x \in O_i \\ 0, & x \notin O_i \end{cases} \tag{1}$$
Then, the node complexity metrics in first-order networks are defined as follows:
$$C_1(x) = \sum_{i=1}^{n} \delta_i(x), \tag{2}$$
$$C_2(x) = \sum_{i=1}^{n} \delta_i(x) \cdot \lg |O_i|, \tag{3}$$
$$C_3(x) = \sum_{i=1}^{n} \delta_i(x) \cdot \lg \frac{|O_i|}{|x \cap O_i|}, \tag{4}$$
where $x \in V$, and $|O_i|$ denotes the size of the community $O_i$.
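The three first-order metrics defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the input format (a list of disjoint node sets) and the function name are hypothetical, `lg` is assumed to be the base-10 logarithm, and the logarithm in $C_3$ is assumed to apply to the whole ratio $|O_i| / |x \cap O_i|$.

```python
import math

def first_order_complexity(communities):
    """Return {node: (C1, C2, C3)} for a non-overlapping partition.

    `communities` is a list of disjoint sets of physical nodes
    (a hypothetical input format chosen for this sketch).
    """
    metrics = {}
    for community in communities:
        size = len(community)  # |O_i|
        for node in community:
            c1, c2, c3 = metrics.get(node, (0, 0.0, 0.0))
            # In a first-order network a node occurs once in its community,
            # so |x ∩ O_i| = 1 and C3 coincides with C2.
            overlap = 1
            metrics[node] = (
                c1 + 1,
                c2 + math.log10(size),            # assumes lg = log10
                c3 + math.log10(size / overlap),  # assumes log of the ratio
            )
    return metrics

# Toy partition (made up for illustration): one community of three nodes
# and one singleton community.
partition = [{"a", "b", "c"}, {"d"}]
result = first_order_complexity(partition)
for node in sorted(result):
    print(node, result[node])
```

Because the partition is non-overlapping, every node gets $C_1 = 1$, and a singleton community yields $C_2 = C_3 = 0$.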

2.2. Rationality Analysis of Node Complexity Metrics in FON

  • The more communities a node belongs to, the higher its complexity. $C_1(x)$ represents the number of communities to which node $x$ belongs. In FON, since $i \neq j$ implies $O_i \cap O_j = \emptyset$, $C_1(x) = 1$ is constant when using the Infomap algorithm.
  • The larger the size of the communities to which a node belongs, the higher its complexity. Based on $C_1(x)$, $C_2(x)$ is positively correlated with the size of the communities to which node $x$ belongs. To make the data distribution more uniform and reduce the impact of outliers, we introduce the logarithmic function $\lg(\cdot)$ to measure node complexity. Specifically, when $|O_i| = 1$, meaning the community contains only one node, $C_2(x) = 0$.
  • The greater the proportion of a node within its community, the lower its complexity. Building on $C_1(x)$ and $C_2(x)$, $C_3(x)$ also considers the proportion of node $x$ within its community, which is negatively correlated with complexity. When $x \in O_i$, meaning $|x \cap O_i| = 1$, then $C_3(x) = C_2(x)$. When $x \cap O_i = \emptyset$, meaning $x \notin O_i$, then $|x \cap O_i| = 0$; in this case, the denominator is set to 1. This ensures that Equation (4) remains valid without affecting the experimental results.

2.3. Higher-Order Network Model

To represent flow data accurately, the BuildHON+ model [17] converts multivariate sequential interactions among multiple nodes into higher-order nodes and their edges. The model consists of two steps: higher-order node construction and edge rewiring.
The construction of higher-order nodes depends on the frequency and transition probability distribution entropy of multivariate sequences. First, we introduce the relevant concepts [16,18]: the order of a network is equal to the order of the highest-order node in the network, the order of a node corresponds to the length of its associated multivariate sequence, and the count of a node indicates the number of occurrences of its corresponding multivariate sequence. For example, if the multivariate sequence $b \to c$ appears 4 times in the network, its corresponding higher-order node is $s = c|b$; the order of node $s$ is 2, and its count is 4. Second, a multivariate sequence is considered to exist only when its count exceeds a certain threshold. For example, if the multivariate sequence count is $n_{c \to b \to a} = 4$ and the threshold is $n_0 = 2$, then the multivariate sequence $c \to b \to a$ is considered to exist. Third, the Kullback–Leibler divergence $D_{KL}$ is introduced as a measure of information gain:
$$D_{KL}(Q \| P) = \sum_{i=1}^{n} q_i \times \log_2 \frac{q_i}{p_i},$$
where $P = [p_1, p_2, \dots, p_n]$ represents the transition probability distribution for the current higher-order node $s_1 = a|b.c.\cdots.x$, and $Q = [q_1, q_2, \dots, q_n]$ represents that for the higher-order subsequence $s_2 = a|b.c.\cdots.x.\cdots.y$. Here, $o(s_1)$ and $o(s_2)$ denote the orders of the higher-order node $s_1$ and the higher-order subsequence $s_2$, respectively, with $o(s_1) < o(s_2)$.
Meanwhile, the dynamic threshold is denoted as
$$\delta = \frac{o(s_2)}{\log_2(1 + n(s_2))},$$
where $o(s_2)$ denotes the order of subsequence $s_2$, and $n(s_2)$ is the count of subsequence $s_2$. When $D_{KL}(Q \| P) > \delta$, subsequence $s_2$ becomes a new candidate higher-order node and replaces the old candidate higher-order node $s_1$. Otherwise, $s_1$ remains the candidate higher-order node.
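The acceptance test for a candidate higher-order node can be sketched as follows. This is only an illustration of the divergence and threshold formulas above, not the BuildHON+ code itself: the function names and the dictionary-based distribution format are hypothetical, and the sketch assumes that every outcome with non-zero probability under $Q$ also has non-zero probability under $P$.

```python
import math

def kl_divergence(q, p):
    """D_KL(Q || P) in bits; q and p map next-node -> probability.

    Assumes p[k] > 0 whenever q[k] > 0.
    """
    return sum(qk * math.log2(qk / p[k]) for k, qk in q.items() if qk > 0)

def dynamic_threshold(order_s2, count_s2):
    """delta = o(s2) / log2(1 + n(s2))."""
    return order_s2 / math.log2(1 + count_s2)

def should_extend(q, p, order_s2, count_s2):
    """True when s2 replaces s1 as the candidate higher-order node."""
    return kl_divergence(q, p) > dynamic_threshold(order_s2, count_s2)

# Toy distributions (made up for illustration).
p = {"x": 0.5, "y": 0.5}  # transitions out of the lower-order node s1
q = {"x": 1.0}            # transitions out of the longer subsequence s2

print(kl_divergence(q, p))        # 1 bit of information gain
print(should_extend(q, p, 2, 7))  # frequent subsequence: threshold 2/3
print(should_extend(q, p, 2, 1))  # rare subsequence: threshold 2
```

The same information gain is accepted for a frequently observed subsequence and rejected for a rare one, which is how the threshold limits the growth of higher-order rules.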
Edge rewiring aims to establish the connections between higher-order nodes. First, a first-order network is constructed from all first-order nodes and their edges. Second, for each higher-order node, a path is constructed from the corresponding first-order node to it. For example, for higher-order node $a|b.c$, the path $c \to b|c \to a|b.c$ is constructed. Third, the edges between different paths are constructed. For example, for path $c \to b|c \to a|b.c$, suppose there is an edge $a|b.c \to b$. There must exist a higher-order node $\tilde{x}$ that satisfies $\tilde{x} = \max\{b|a.b,\ b|a,\ b\}$, that is, the highest-order node among these candidates that exists in the network; the edge then becomes $a|b.c \to \tilde{x}$.
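The choice of $\tilde{x}$ in the third step can be illustrated with a short sketch. It is a simplified reading of the example above rather than the BuildHON+ rewiring routine: the tuple encoding of higher-order nodes, the helper names, and the restriction of candidates to the three listed in the example are all assumptions of this sketch.

```python
def label(node):
    """Render ('a', 'b', 'c') as 'a|b.c' and ('b',) as 'b'."""
    current, *history = node
    return current if not history else f"{current}|{'.'.join(history)}"

def rewire_target(source, target, existing_nodes):
    """Pick the highest-order existing node for `target` when leaving `source`.

    Nodes are tuples (current, previous, earlier, ...), a hypothetical encoding.
    For source a|b.c and target b the candidates are b|a.b, b|a and b,
    tried from the highest order down.
    """
    history = source[:-1]  # for a|b.c this is ('a', 'b')
    for length in range(len(history), -1, -1):
        candidate = (target,) + history[:length]
        if candidate in existing_nodes:
            return candidate
    raise KeyError(f"no node found for target {target!r}")

source = ("a", "b", "c")                 # the higher-order node a|b.c
existing = {("b",), ("b", "a")}          # b and b|a exist, b|a.b does not
chosen = rewire_target(source, "b", existing)
print(label(source), "->", label(chosen))
```

With these toy inputs the edge is rewired to `b|a`, since `b|a.b` is absent from the node set.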
In order to visualize a higher-order network model, Figure 1 shows the comparison of first- and higher-order network models. Based on flow data in time 1, node a passes through node c and goes back to node a, as does node b. Based on flow data in time 2, node a passes through node c and goes back to node b; meanwhile, node b passes through node c and goes back to node a. A higher-order network model clearly reflects the above information, but a first-order network model does not, which demonstrates the accuracy of higher-order network models.

2.4. Node Complexity Metrics Definition in Higher-Order Networks

A higher-order network (HON) can be defined as a graph $\tilde{G} = (\tilde{V}, \tilde{E})$, where $\tilde{V} = \{a, b, c, \dots\}$ denotes the higher-order node set; for any $\tilde{x} \in \tilde{V}$, there must exist $x \in V$ such that $\tilde{x} \in x = \{x|,\ x|a,\ x|b,\ \dots\}$, and $\tilde{E}$ represents the higher-order edge set. A community in a higher-order network is defined as a node set $\tilde{O}_i$, $i = 1, 2, 3, \dots, n$, such that $\tilde{V} = \tilde{O}_1 \cup \tilde{O}_2 \cup \tilde{O}_3 \cup \dots \cup \tilde{O}_n$. When $i \neq j$, we have $\tilde{O}_i \cap \tilde{O}_j = \emptyset$.
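Based on community detection in higher-order networks, three higher-order node complexity metrics are defined below. First, let $\tilde{\delta}_i(\tilde{x})$ be defined as:
$$\tilde{\delta}_i(\tilde{x}) = \begin{cases} 1, & \tilde{x} \in \tilde{O}_i \\ 0, & \tilde{x} \notin \tilde{O}_i \end{cases}$$
Then, the node complexity metrics in higher-order networks are defined as follows:
$$\tilde{C}_1(x) = \sum_{\tilde{x} \in x} \sum_{i=1}^{n} \tilde{\delta}_i(\tilde{x}),$$
$$\tilde{C}_2(x) = \sum_{\tilde{x} \in x} \sum_{i=1}^{n} \tilde{\delta}_i(\tilde{x}) \cdot \lg |\tilde{O}_i|,$$
$$\tilde{C}_3(x) = \sum_{\tilde{x} \in x} \sum_{i=1}^{n} \tilde{\delta}_i(\tilde{x}) \cdot \lg \frac{|\tilde{O}_i|}{|x \cap \tilde{O}_i|}.$$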
For first-order networks, $\tilde{x} = x$ and $\tilde{O}_i = O_i$, so that
$$\tilde{\delta}_i(\tilde{x}) = \begin{cases} 1, & \tilde{x} \in \tilde{O}_i \\ 0, & \tilde{x} \notin \tilde{O}_i \end{cases} = \begin{cases} 1, & x \in O_i \\ 0, & x \notin O_i \end{cases}.$$
Combined with Equation (1), it follows that $\tilde{\delta}_i(\tilde{x}) = \delta_i(x)$. Therefore, the three higher-order node complexity metrics are suitable for first-order networks as well.
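A possible way to compute the higher-order metrics from a community assignment is sketched below. Several choices here are assumptions rather than the authors' code: higher-order nodes are encoded as strings such as `"b|a"`, the physical node is taken to be the part before `|`, `lg` is read as the base-10 logarithm, the logarithm in $\tilde{C}_3$ is applied to the ratio, and each community is counted once per physical node, following the verbal definition that $\tilde{C}_1$ is the number of communities to which a physical node belongs.

```python
import math
from collections import Counter, defaultdict

def physical_node(state):
    """'b|a' -> 'b'; a first-order node such as 'b' maps to itself."""
    return state.split("|", 1)[0]

def higher_order_complexity(membership):
    """Return {physical node: (C1, C2, C3)}.

    `membership` maps each higher-order node to its community id
    (a hypothetical input format chosen for this sketch).
    """
    community_size = Counter(membership.values())  # |O~_i| in higher-order nodes
    overlap = defaultdict(Counter)                 # overlap[x][i] = |x ∩ O~_i|
    for state, community in membership.items():
        overlap[physical_node(state)][community] += 1

    metrics = {}
    for node, per_community in overlap.items():
        c1 = len(per_community)  # distinct communities containing the node
        c2 = sum(math.log10(community_size[i]) for i in per_community)
        c3 = sum(math.log10(community_size[i] / k) for i, k in per_community.items())
        metrics[node] = (c1, c2, c3)
    return metrics

# Toy assignment (made up for illustration): two communities of three
# higher-order nodes each; physical node b appears in both.
membership = {
    "a": 0, "b|a": 0, "c|a": 0,
    "b": 1, "b|c": 1, "c": 1,
}
hon_metrics = higher_order_complexity(membership)
for node in sorted(hon_metrics):
    print(node, hon_metrics[node])
```

In this toy case node `b` belongs to two communities, and its two higher-order nodes inside community 1 lower its contribution from that community in the third metric relative to the second.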

2.5. Rationality Analysis of Node Complexity Metrics in HON

Similar to the analysis of node complexity metrics in FON, since a physical node in HON may correspond to multiple higher-order nodes, it can therefore appear in multiple communities. Based on this, we conducted the following analysis:
  • The more communities a physical node belongs to, the higher its complexity.
  • The larger the communities a physical node belongs to, the higher its complexity. $\tilde{C}_2(x)$ is positively correlated with both the number and size of the communities to which node $x$ belongs.
  • The greater the proportion of a physical node within a community, the lower its complexity. $\tilde{C}_3(x)$ also considers positive correlations with the number and size of the communities to which node $x$ belongs, as well as a negative correlation with the proportion of node $x$ within its communities.

2.6. A Sample of Node Complexity Quantification Process

To illustrate the quantification process of node complexity, we present a sample in Figure 2. Using the BuildHON+ model, we first build both a first-order and a higher-order network from the flow data. We then detect communities in both networks using the Infomap algorithm. Finally, we calculate node complexity using the above three node complexity metrics, which reveal node characteristics better in the higher-order network than in the first-order network. For example, node $b$ has the most higher-order interactions based on the number of communities it belongs to, so its complexity $\tilde{C}_1(b)$ should be the highest. That is true in the higher-order network, but not in the first-order network.

3. Results and Analysis

3.1. Data Descriptions

To verify the effectiveness of the above three complexity metrics, we use three empirical flow datasets (Enron, railway, and citation) as the experimental datasets. The detailed descriptions of these flow datasets are as follows:
  • Enron [15,21]: This flow dataset refers to 116,525 emails between 146 users in Enron Corporation. It was originally made public and posted to the web by the Federal Energy Regulatory Commission during its investigation.
  • Railway: This flow dataset covers 4458 high-speed train numbers going through 672 stations of 34 provinces in China in 2016, collected from http://www.12306.cn/ (accessed on 20 January 2024).
  • Citation [22]: This flow dataset contains 8,850,334 citations in 668,383 papers in 19 journals of the American Physical Society (APS) by the end of 2020. Among them, the maximum length of the citation flow does not exceed 2.

3.2. Results in Enron Flow Dataset

Firstly, we divide Enron Corporation staff in the Enron flow dataset into three categories: ordinary employees, middle-level managers and senior leaders. The middle-level managers include the managers and directors who are in the middle leadership; meanwhile the senior leaders are the top leaders of the corporation, such as the CEO, presidents, vice presidents, managing directors, and chief operating officers.
Based on the Enron flow dataset, first- and higher-order Enron networks are built by the BuildHON+ model, and the top 50 staff by complexity are selected. Because the senior leaders participate in the most communities, they are expected to have the highest complexity, whereas the ordinary employees are expected to have the lowest. In short, the more senior leaders and middle-level managers appear in the top 50, the better the complexity metric. Figure 3 shows the number of ordinary employees, middle-level managers, and senior leaders among the top 50 nodes ranked by the three complexity metrics in Enron networks. The red area denotes the number of senior leaders, and the green area represents the number of middle-level managers. The sum of these two areas for the rankings by $\tilde{C}_1$, $\tilde{C}_2$, and $\tilde{C}_3$ is far greater in higher-order networks than in first-order networks. That is to say, the above three complexity metrics perform better in higher-order networks than in first-order networks.
The top 20 staff ranked by $\tilde{C}_1$, $\tilde{C}_2$, and $\tilde{C}_3$ in Enron networks are shown in Table 1, Table 2 and Table 3, respectively. Four senior leaders and one middle-level manager belong to the top 20 staff ranked by $\tilde{C}_1$ in first-order networks; meanwhile, nine senior leaders and three middle-level managers are in the top 20 staff in higher-order networks. Similarly, four senior leaders and three middle-level managers belong to the top 20 staff ranked by $\tilde{C}_2$ in first-order networks; meanwhile, nine senior leaders and three middle-level managers are in the top 20 staff in higher-order networks. Moreover, four senior leaders and three middle-level managers belong to the top 20 staff ranked by $\tilde{C}_3$ in first-order networks; meanwhile, eight senior leaders and two middle-level managers are in the top 20 staff in higher-order networks. In other words, the above three complexity metrics perform better in higher-order networks than in first-order networks.
To verify the effectiveness of the three complexity metrics, the complexity of all staff in Enron Corporation is quantified into three grades, as shown in Table 4.
In addition, we set the complexity grade of senior leaders to 3, the complexity grade of middle-level managers to 2, and the complexity grade of ordinary employees to 1. Figure 4 shows the sum of complexity grades in the top 50 nodes ranked by the three complexity metrics in first- and higher-order Enron networks. The dashed line corresponds to the results of first-order networks, while the solid line represents the results of higher-order networks. The sum of complexity grades in the top nodes ranked by $\tilde{C}_1$, $\tilde{C}_2$, and $\tilde{C}_3$ is greater in higher-order networks than in first-order networks. In short, the above three complexity metrics perform better in higher-order networks than in first-order networks.
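The grade-sum evaluation described above can be sketched as follows. The grade values (3, 2, 1) come from the text; the function name, the role labels, the toy data, and the alphabetical tie-breaking rule are assumptions made only for this illustration.

```python
# Grades taken from the text: senior leaders 3, middle-level managers 2,
# ordinary employees 1. The role labels themselves are hypothetical.
GRADE = {"senior": 3, "middle": 2, "ordinary": 1}

def grade_sum_at_k(complexity, role, k):
    """Sum of role grades over the k nodes with the highest complexity.

    Ties are broken alphabetically by node name, an assumption of this sketch.
    """
    ranked = sorted(complexity, key=lambda n: (-complexity[n], n))
    return sum(GRADE[role[n]] for n in ranked[:k])

# Toy data (made up for illustration, not taken from the Enron dataset).
complexity = {"u1": 5, "u2": 3, "u3": 3, "u4": 1}
role = {"u1": "senior", "u2": "ordinary", "u3": "middle", "u4": "ordinary"}

print(grade_sum_at_k(complexity, role, 2))  # u1 and u2 -> 3 + 1 = 4
print(grade_sum_at_k(complexity, role, 3))  # adds u3 -> 4 + 2 = 6
```

A metric that ranks more senior leaders and middle-level managers near the top yields a larger grade sum at every cut-off `k`.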

3.3. Results in Railway Flow Dataset

According to the railway flow dataset in 2016, there are a total of 672 stations across 34 provinces (autonomous regions, municipalities, or special administrative regions) in China. Based on location, community, and complexity of these stations or provinces, their roles can be reflected in Chinese railway traffic.
Based on the railway flow dataset, first- and higher-order railway networks are built by the BuildHON+ model. The province communities detected using the Infomap algorithm in the railway network are shown in Figure 5. There are five communities without community overlap in first-order networks; however, there are six or eight communities with community overlap in higher-order networks, where the size of nodes is directly proportional to their complexity.
Based on the location and complexity of provinces in first-order networks, as shown in Table 5, the top 7 hub provinces are Hubei, Sichuan, Hunan, Tianjin, Beijing, Henan, and Shaanxi. Based on the community and complexity of provinces in higher-order networks, the top 7 hub provinces are Anhui, Hubei, Henan, Jiangxi, Zhejiang, Jiangsu, and Liaoning. There is more railway traffic in Southeast China, so more hub provinces should be located there [23,24]. In higher-order networks, Anhui, Jiangxi, Zhejiang, and Jiangsu, all located in Southeast China, are among the hub provinces. In first-order networks, by contrast, no southeastern province is among the hub provinces. This shows that the above three complexity metrics perform better in higher-order networks than in first-order networks. Furthermore, observation of the higher-order railway networks suggests that the integration of Chinese railway traffic would be deepened by adding high-speed train numbers between Gansu and Shaanxi provinces [25,26].
The station communities detected using the Infomap algorithm in railway networks are shown in Figure 6, where the larger the node, the more complex it is. It is clear that the stations in Southeast China have a larger size overall in higher-order networks than in first-order networks. That is to say, the above three complexity metrics perform better in higher-order networks than they do in first-order networks. Additionally, the higher-order network clearly reveals the Beijing–Guangzhou line, the most important north–south railway artery in China, but the first-order network does not.

3.4. Results in Citation Flow Dataset

The citation flow dataset includes a total of 20 journals, where PRL, PRX, RMP, PRB, PRM, and PRR all belong to multidisciplinary journals [27,28], as shown in Table 6. Notably, PRL, PRX, and RMP, as the oldest multidisciplinary journals, cover the full range of applied, fundamental, and interdisciplinary physics research topics [29].
Based on the citation flow dataset, first- and higher-order citation networks are built by the BuildHON+ model. First, as shown in Figure 7, node complexity is on the whole more distinguishable in higher-order networks than in first-order networks. PRL always has the highest complexity in citation networks, but the results still differ between the two models. In higher-order networks, PRL alone attains the highest $\tilde{C}_1$, whereas 20 journals share the highest value in first-order networks; PRL is one of four journals with the highest $\tilde{C}_2$, compared with eight journals in first-order networks; and PRL is one of three journals with the highest $\tilde{C}_3$, compared with eight journals in first-order networks. In addition, the complexity of RMP is higher in higher-order networks than in first-order networks. By $\tilde{C}_1$, RMP is in the top 4 in higher-order networks but only in the top 20 in first-order networks; by $\tilde{C}_2$ and $\tilde{C}_3$, RMP is in the top 4 in higher-order networks but only in the top 8 in first-order networks. Moreover, PRX also has a high level of complexity.

4. Discussions

In addition to the complexity metrics discussed above, the second-order self-citation rate is also a useful metric for evaluating the complexity of specialized journals. When a random walker takes two steps from a node back to the original node in a complex network, the node has one second-order self-citation. Denote specialized journals as the source nodes, and multidisciplinary journals as the bridge nodes. Given the bridge node $y$, we define the second-order self-citation rate $p_{x \to y \to x}$ in first-order networks as:
$$p_{x \to y \to x} = p_{y \to x} = \frac{w_{y \to x}}{\sum_{z} w_{y \to z}},$$
where $w_{y \to z}$ denotes the weight of edge $y \to z$, and $x$ and $z$ represent the source node and the target node, respectively.
To eliminate the effects of paper counts per journal, we propose the modified second-order self-citation rate $p^{+}_{x \to y \to x}$ in first-order networks:
$$p^{+}_{x \to y \to x} = \frac{w_{y \to x} / w_x}{\sum_{z} w_{y \to z} / w_z},$$
where $w_x$ denotes the weight of node $x$.
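The two first-order rates can be sketched as follows. The nested-dictionary edge format, the function names, and the toy weights are assumptions of this sketch; in particular, the node weight is read here as a per-journal paper count, which the text suggests but does not state explicitly.

```python
def self_citation_rate(w, x, y):
    """p_{x->y->x} = w[y][x] / sum_z w[y][z] in a first-order network.

    `w[y][z]` is the weight of edge y -> z (hypothetical input format).
    """
    return w[y][x] / sum(w[y].values())

def modified_self_citation_rate(w, node_weight, x, y):
    """p^+_{x->y->x}: each edge weight out of y is divided by its target's weight."""
    total = sum(w[y][z] / node_weight[z] for z in w[y])
    return (w[y][x] / node_weight[x]) / total

# Toy weights (made up for illustration): bridge node Y cites X, Z and itself.
w = {"Y": {"X": 30, "Z": 60, "Y": 10}}
node_weight = {"X": 100, "Z": 400, "Y": 200}

print(self_citation_rate(w, "X", "Y"))                        # 30 / 100
print(modified_self_citation_rate(w, node_weight, "X", "Y"))  # 0.3 / 0.5
```

In this toy case the unmodified rate for `X` is 0.3, while normalizing by node weight raises it to about 0.6 because the heavily cited target `Z` is also the largest node.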
Similarly, the modified second-order second-order self-citation rate $\tilde{p}^{+}_{x \to y \to x}$ in second-order networks is defined as:
$$\tilde{p}^{+}_{x \to y \to x} = \frac{w_{y|x \to x} / w_x}{\sum_{z} w_{y|z \to x} / w_z}.$$
The results have demonstrated the high complexity of PRL, PRX, and RMP, making them suitable as the multidisciplinary journals. Meanwhile, the paper counts of PRA, PRB, PRC, PRD, and PRE are large, making them appropriate as the specialized journals.
As shown in Figure 8, second-order networks are more effective for revealing second-order self-citation than first-order networks, as demonstrated by a comparison of the two second-order self-citation rates. Moreover, the modified second-order second-order self-citation rate of PRB is consistently the highest in second-order networks, which is consistent with the finding in Table 6 that PRB has the most categories. However, that of PRB is the second highest in first-order networks when RMP acts as a bridge node. This suggests that the modified second-order second-order self-citation rate performs better in higher-order networks.

5. Conclusions and Prospects

This paper proposes three quantitative complexity metrics for nodes in higher-order networks using the Infomap algorithm. First, a higher-order network is built by the BuildHON+ model based on flow data. Then, communities are detected using the Infomap algorithm. The complexity metrics are defined as follows: $\tilde{C}_1$ of a physical node is the number of communities to which it belongs; $\tilde{C}_2$ of a physical node is defined by the number and size of communities to which it belongs; $\tilde{C}_3$ of a physical node is defined by the number and size of communities to which it belongs, and the number of higher-order nodes it contains within the same community. Experiments in the Enron flow dataset demonstrate that the proposed complexity metrics identify complex staff more accurately in higher-order networks than in first-order networks. Based on the railway flow dataset, the metrics more accurately identify complex provinces or stations in higher-order networks than in first-order networks. Furthermore, community detection in higher-order networks using the Infomap algorithm reveals the Beijing–Guangzhou line, the most important north–south railway artery in China. In the citation flow dataset, the metrics more accurately identify multidisciplinary journals in higher-order networks than in first-order networks. Additionally, a modified second-order second-order self-citation rate is proposed for second-order networks, which successfully identifies the most complex specialized journals.
This paper defines node complexity from the perspective of community detection using the Infomap algorithm. The next step could involve an in-depth study of how different community-detection algorithms impact node complexity and the resulting outcomes [30]. Additionally, beyond defining node complexity from the community-detection perspective, one could also consider the local or global characteristics of nodes [31]. For instance, degree is the simplest local characteristic [32], while eigenvector is a classical global characteristic [33]. In recent years, the study of flow data within multilayer networks has gained significant attention [9,34], particularly in analyzing human or biological movement trajectories [35,36], where substantial progress has been made. Building on the findings of this paper, the next step could involve integrating the complex structures of multilayer networks with the dynamic characteristics of flow data to propose new node complexity metrics. This research prospect will provide new theoretical tools for multilayer network analysis and contribute to further expanding its applications in biological networks, social networks, and transportation networks, offering new perspectives and methods for the development of related fields.

Author Contributions

Y.F. and X.L. (Xiongyi Lu): Conceptualization; Y.F. and Q.H.: Methodology; X.L. (Xiang Li) and C.Y.: Software; J.L.: Formal analysis; Y.F. and X.L. (Xiang Li): Investigation; J.L. and Q.H.: Resources; Y.F. and C.Y.: Data curation; Y.F. and X.L. (Xiongyi Lu): Writing—original draft preparation; J.L. and Q.H.: Writing—review and editing; X.L. (Xiongyi Lu): Visualization; X.L. (Xiang Li) and Q.H.: Supervision; Y.F.: Project administration; Q.H. and J.L.: Funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Science Foundation for Outstanding Youth Scholars of Hunan Province, Grant/Award Number: 2022JJ20047, and the National Natural Science Foundation of China, Grant/Award Number: 62103422, 6210023156, 72371244 and 72001209.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Roblek, V.; Dimovski, V. Essentials of ‘the Great Reset’ through Complexity Matching. Systems 2024, 12, 182.
  2. Bila, J. Emergent Phenomena in Complex Systems. In Proceedings of the Recent Advances in Soft Computing, Brno, Czech Republic, 20–22 June 2017; Springer: Berlin/Heidelberg, Germany, 2019; pp. 262–270.
  3. Vargas, D.L. Quantum complexity: Quantum mutual information, complex networks, and emergent phenomena in quantum cellular automata. In Theory of Computing Systems Mathematical Systems Theory; Colorado School of Mines: Golden, CO, USA, 2016; p. 7.
  4. McShea, D.W. Metazoan complexity and evolution: Is there a trend? Evolution 1995, 50, 477–492.
  5. Adami, C.; Cerf, N.J. Physical complexity of symbolic sequences. Physica D 2000, 137, 62–69.
  6. Fu, Y.; Zhu, J.; Li, X.; Han, X.; Tan, W.; Huangpeng, Q.; Duan, X. Research on Group Behavior Modeling and Individual Interaction Modes with Informed Leaders. Mathematics 2024, 12, 1160.
  7. Ahn, Y.Y.; Bagrow, J.P.; Lehmann, S. Link communities reveal multiscale complexity in networks. Nature 2009, 466, 761–764.
  8. Benson, A.R.; Gleich, D.F.; Leskovec, J. Higher-order organization of complex networks. Science 2016, 353, 163–166.
  9. Scholtes, I. When is a network a network? multi-order graphical model selection in pathways and temporal networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 1037–1046.
  10. Battiston, F.; Cencetti, G.; Lacopini, L.; Latora, V.; Lucas, M.; Patania, A.; Young, J.; Petri, G. Networks beyond pairwise interactions: Structure and dynamics. Phys. Rep. 2020, 874, 1–92.
  11. Shi, D.; Chen, G. Simplicial networks: A powerful tool for characterizing higher-order interactions. Natl. Sci. Rev. 2022, 9, nwac038.
  12. Li, J.; Lu, X. Measuring the Significance of Higher-Order Dependency in Networks. New J. Phys. 2024, 26, 033032.
  13. Gong, C.; Li, J.; Qian, L.; Li, S.; Yang, Z.; Yang, K. HMSL: Source localization based on higher-order Markov propagation. Chaos Solitons Fractals 2024, 182, 114765.
  14. Qian, L.; Dou, Y.; Gong, C.; Xu, X.; Tan, Y. Research on User Behavior Based on Higher-Order Dependency Network. Entropy 2023, 25, 1120.
  15. Rosvall, M.; Esquivel, A.V.; Lancichinetti, A.; West, J.D.; Lambiotte, R. Memory in network flows and its effects on spreading dynamics and community detection. Nat. Commun. 2014, 5, 4630.
  16. Xu, J.; Wickramarathne, T.L.; Chawla, N.V. Representing higher order dependencies in networks. Sci. Adv. 2016, 2, e1600028.
  17. Saebi, M.; Xu, J.; Kaplan, L.M.; Ribeiro, B.; Chawla, N.V. Efficient modeling of higher-order dependencies in networks: From algorithm to application for anomaly detection. EPJ Data Sci. 2020, 9, 15.
  18. Fu, Y.; Li, X.; Li, J.; Yu, M.; Lu, X.; Huangpeng, Q.; Duan, X. Multi-Scale Higher-Order Dependencies (MSHOD): Higher-Order Interactions Mining and Key Nodes Identification for Global Liner Shipping Network. J. Mar. Sci. Eng. 2024, 12, 1305.
  19. Santos, G.G.; Lakhotia, K.; Rose, C.A.F.D. Towards a Scalable Parallel Infomap Algorithm for Community Detection. In Proceedings of the 2024 32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), Dublin, Ireland, 20–22 March 2024; pp. 116–123.
  20. Velden, T.; Yan, S.; Lagoze, C. Mapping the cognitive structure of astrophysics by infomap clustering of the citation network and topic affinity analysis. Scientometrics 2017, 111, 1033–1051.
  21. Li, X.; Zhang, X.; Huangpeng, Q.; Zhao, C.; Duan, X. Event detection in temporal social networks using a higher-order network model. Chaos 2021, 31, 113144.
  22. Li, X.; Zhao, C.; Hu, Z.; Yu, C.; Duan, X. Revealing the character of journals in higher-order citation networks. Scientometrics 2022, 127, 6315–6338.
  23. Yang, J.; Guo, A.; Li, X.; Huang, T. Study of the Impact of a High-Speed Railway Opening on China’s Accessibility Pattern and Spatial Equality. Sustainability 2018, 10, 2943.
  24. Zhu, T.; Xu, Y.; Zhang, J.; Zhao, B. A Bilevel Programming Model for Designing a Collaborative Network for Regional Railway Transportation and Logistics: The Case of the Beijing-Tianjin-Hebei Region in China. J. Adv. Transp. 2024, 2024, 8905446.
  25. Ding, S.; Zhang, T.; Sheng, K.; Chen, Y.; Yuan, Z. Key technologies and applications of intelligent dispatching command for high-speed railway in China. Railw. Sci. 2023, 2, 336–346.
  26. Hu, W.; Dong, J.; Ren, R.; Chen, Z. Underground logistics systems: Development overview and new prospects in China. Front. Eng. Manag. 2023, 10, 354–359.
  27. Wang, X.; Feng, X. Research on the relationships between discourse leading indicators and citations: Perspectives from altmetrics indicators of international multidisciplinary academic journals. Libr. Tech 2022, 42, 1165–1190.
  28. Solomon, G.E.A.; Carley, S.F.; Porter, A.L. How Multidisciplinary Are the Multidisciplinary Journals Science and Nature? PLoS ONE 2016, 11, e0152637.
  29. Njegovanović, A. Complex Systems in Interdisciplinary Interaction. Financ. Mark. Institutions Risks 2024, 8, 94–107.
  30. Huang, X.; Chen, D.; Ren, T.; Wang, D. A survey of community detection methods in multilayer networks. Data Min. Knowl. Discov. 2020, 35, 1–45.
  31. Lü, L.; Chen, D.; Ren, X.; Zhang, Q.; Zhang, Y.; Zhou, T. Vital nodes identification in complex networks. Phys. Rep. 2016, 650, 1–63. [Google Scholar] [CrossRef]
  32. Liang, J.; Ding, R.; Ma, X.; Peng, L.; Wang, K.; Xiao, W. The Carbon Emission Reduction Effect and Spatio-Temporal Heterogeneity of the Science and Technology Finance Network: The Combined Perspective of Complex Network Analysis and Econometric Models. Systems 2024, 12, 110. [Google Scholar] [CrossRef]
  33. Antoine, J.P.; Trapani, C. Operators in Rigged Hilbert Spaces, Gel’fand Bases and Generalized Eigenvalues. Mathematics 2022, 11, 195. [Google Scholar] [CrossRef]
  34. Lambiotte, R.; Rosvall, M.; Scholtes, I. From networks to optimal higher-order models of complex systems. Nat. Phys. 2019, 15, 313–320. [Google Scholar] [CrossRef]
  35. Fu, Y.; Duan, X.; Li, X.; Han, X.; Deng, J.; Huangpeng, Q. Flocking Modeling and Robustness Evaluation Based on Heterogeneous Network. In Proceedings of the 2024 36th Chinese Control and Decision Conference (CCDC), Xi’an, China, 25–27 May 2024; pp. 1870–1875. [Google Scholar]
  36. Gu, K.; Yan, L.; Li, X.; Duan, X.; Liang, J. Change point detection in multi-agent systems based on higher-order features. Chaos 2022, 32, 111102. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The comparison of first- and higher-order network models.
Figure 2. The node complexity-quantification process.
Figure 3. The number of ordinary employees, middle-level managers, and senior leaders among the top 50 nodes ranked by the three complexity metrics in the Enron networks. (a–c) Results ranked by $\tilde{C}_1$, $\tilde{C}_2$, and $\tilde{C}_3$, respectively, in the first-order network. (d–f) The corresponding results in the higher-order network.
Figure 4. The sum of complexity grades of the top 50 nodes ranked by the three complexity metrics in the first- and higher-order Enron networks. (a–c) Results ranked by $\tilde{C}_1$, $\tilde{C}_2$, and $\tilde{C}_3$, respectively.
Figure 5. The province communities detected using the Infomap algorithm in the railway networks. Node colors indicate community membership. (a) Results in the first-order network, where node size is proportional to $\tilde{C}_2$ and $\tilde{C}_3$. (b–d) Results in the higher-order network, where node size is proportional to $\tilde{C}_1$, $\tilde{C}_2$, and $\tilde{C}_3$, respectively. E denotes east longitude and N denotes north latitude. The same conventions apply to Figure 6. Both figures were created with Origin 2023.
Figure 6. The station communities detected using the Infomap algorithm in railway networks.
Figure 7. The complexity calculated by $\tilde{C}_1$, $\tilde{C}_2$, and $\tilde{C}_3$ in the citation networks. (a) The number of papers per journal. (b–d) Comparison of the complexity in the first- and higher-order networks for $\tilde{C}_1$, $\tilde{C}_2$, and $\tilde{C}_3$, respectively.
Figure 8. The citation flow among five specialized journals via three multidisciplinary journals.
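Table 1. The top 20 staff with high complexity of $\tilde{C}_1$ in Enron networks.

| FON ID | FON Name | FON Position | FON $\tilde{C}_1$ | HON ID | HON Name | HON Position | HON $\tilde{C}_1$ |
|---|---|---|---|---|---|---|---|
| 1 | kuykendall-t | Trader | 1 | 47 | lavorato-j | CEO | 9 |
| 2 | kitchen-l | President | 1 | 2 | kitchen-l | President | 6 |
| 3 | geaccone-t | - | 1 | 82 | grigsby-m | Manager | 6 |
| 4 | cash-m | - | 1 | 21 | sager-e | - | 5 |
| 5 | merriss-s | - | 1 | 33 | shively-h | Vice President | 5 |
| 6 | linder-e | - | 1 | 48 | perlingiere-d | - | 5 |
| 7 | griffith-j | - | 1 | 96 | kaminski-v | Manager | 5 |
| 8 | mccarty-d | Vice President | 1 | 136 | neal-s | Vice President | 5 |
| 9 | rapp-b | - | 1 | 4 | cash-m | - | 4 |
| 10 | lay-k | CEO | 1 | 13 | mann-k | - | 4 |
| 11 | keavey-p | - | 1 | 14 | tycholiz-b | Vice President | 4 |
| 12 | love-p | - | 1 | 19 | scott-s | - | 4 |
| 13 | mann-k | - | 1 | 28 | buy-r | Manager | 4 |
| 14 | tycholiz-b | Vice President | 1 | 58 | kuykendall-t | - | 4 |
| 15 | pereira-s | - | 1 | 65 | haedicke-m | Managing Director | 4 |
| 16 | meyers-a | - | 1 | 67 | kean-s | Vice President | 4 |
| 17 | hendrickson-s | - | 1 | 86 | hyvl-d | - | 4 |
| 18 | solberg-g | - | 1 | 128 | keiser-k | - | 4 |
| 19 | scott-s | - | 1 | 23 | hodge-j | Managing Director | 3 |
| 20 | schoolcraft-d | - | 1 | 41 | martin-t | Vice President | 3 |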
Table 2. The top 20 staff with high complexity of $\tilde{C}_2$ in Enron networks.

| FON ID | FON Name | FON Position | FON $\tilde{C}_2$ | HON ID | HON Name | HON Position | HON $\tilde{C}_2$ |
|---|---|---|---|---|---|---|---|
| 3 | geaccone-t | - | 1.204 | 47 | lavorato-j | CEO | 11.222 |
| 8 | mccarty-d | Vice President | 1.204 | 82 | grigsby-m | Manager | 7.824 |
| 9 | rapp-b | - | 1.204 | 2 | kitchen-l | President | 7.665 |
| 20 | schoolcraft-d | - | 1.204 | 21 | sager-e | - | 6.474 |
| 42 | corman-s | Vice President | 1.204 | 33 | shively-h | Vice President | 6.410 |
| 53 | blair-l | - | 1.204 | 96 | kaminski-v | Manager | 6.402 |
| 72 | watson-k | - | 1.204 | 136 | neal-s | Vice President | 6.061 |
| 77 | harris-s | - | 1.204 | 14 | tycholiz-b | Vice President | 5.678 |
| 83 | mcconnell-m | - | 1.204 | 13 | mann-k | - | 5.474 |
| 90 | lokay-m | Administrative Assistant | 1.204 | 58 | taylor-m | - | 5.432 |
| 95 | donoho-l | - | 1.204 | 67 | kean-s | Vice President | 5.424 |
| 111 | hyatt-k | Director | 1.204 | 48 | perlingiere-d | - | 5.403 |
| 119 | horton-s | President | 1.204 | 4 | cash-m | - | 5.131 |
| 120 | lokey-t | Manager | 1.204 | 128 | keiser-k | - | 5.073 |
| 124 | ybarbo-p | - | 1.204 | 65 | haedicke-m | Managing Director | 5.044 |
| 134 | hayslett-r | Vice President | 1.204 | 86 | hyvl-d | - | 4.900 |
| 5 | merriss-s | - | 1.079 | 28 | buy-r | Manager | 4.868 |
| 6 | linder-e | - | 1.079 | 41 | martin-t | Vice President | 4.591 |
| 16 | meyers-a | - | 1.079 | 117 | mclaughlin-e | - | 4.591 |
| 18 | solberg-g | - | 1.079 | 129 | steffes-j | Vice President | 4.470 |
Table 3. The top 20 staff with high complexity of $\tilde{C}_3$ in Enron networks.

| FON ID | FON Name | FON Position | FON $\tilde{C}_3$ | HON ID | HON Name | HON Position | HON $\tilde{C}_3$ |
|---|---|---|---|---|---|---|---|
| 3 | geaccone-t | - | 1.204 | 47 | lavorato-j | CEO | 27.073 |
| 8 | mccarty-d | Vice President | 1.204 | 2 | kitchen-l | President | 23.218 |
| 9 | rapp-b | - | 1.204 | 82 | grigsby-m | Manager | 18.596 |
| 20 | schoolcraft-d | - | 1.204 | 58 | taylor-m | - | 16.654 |
| 42 | corman-s | Vice President | 1.204 | 72 | watson-k | - | 15.950 |
| 53 | blair-l | - | 1.204 | 46 | dasovich-j | Executive | 15.608 |
| 72 | watson-k | - | 1.204 | 77 | harris-s | - | 15.163 |
| 77 | harris-s | - | 1.204 | 14 | tycholiz-b | Vice President | 14.205 |
| 83 | mcconnell-m | - | 1.204 | 86 | hyvl-d | - | 14.113 |
| 90 | lokay-m | Administrative Assistant | 1.204 | 67 | kean-s | Vice President | 14.080 |
| 95 | donoho-l | - | 1.204 | 43 | nemec-g | - | 13.825 |
| 111 | hyatt-k | Director | 1.204 | 21 | sager-e | - | 13.812 |
| 119 | horton-s | President | 1.204 | 35 | jones-t | - | 13.679 |
| 120 | lokey-t | Manager | 1.204 | 129 | steffes-j | Vice President | 13.523 |
| 124 | ybarbo-p | - | 1.204 | 39 | fossum-d | Vice President | 13.269 |
| 134 | hayslett-r | Vice President | 1.204 | 128 | keiser-k | - | 12.567 |
| 5 | merriss-s | - | 1.079 | 107 | shackleton-s | - | 12.425 |
| 6 | linder-e | - | 1.079 | 42 | corman-s | Vice President | 12.234 |
| 16 | meyers-a | - | 1.079 | 48 | perlingiere-d | - | 12.221 |
| 18 | solberg-g | - | 1.079 | 136 | neal-s | Vice President | 12.130 |
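Table 4. The complexity grade of the staff in Enron Corporation. C denotes the complexity grade.

| ID | Name | C | ID | Name | C | ID | Name | C | ID | Name | C |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | kuykendall-t | 2 | 38 | gay-r | 1 | 75 | quenet-j | 2 | 112 | causholli-m | 2 |
| 2 | kitchen-l | 3 | 39 | fossum-d | 3 | 76 | fischer-m | 1 | 113 | heard-m | 1 |
| 3 | geaccone-t | 1 | 40 | gilbertsmith-d | 1 | 77 | harris-s | 1 | 114 | forney-j | 2 |
| 4 | cash-m | 1 | 41 | martin-t | 3 | 78 | ruscitti-k | 2 | 115 | dorland-c | 1 |
| 5 | merriss-s | 1 | 42 | corman-s | 3 | 79 | mims-thurston-p | 1 | 116 | symes-k | 1 |
| 6 | linder-e | 1 | 43 | nemec-g | 1 | 80 | skilling-j | 3 | 117 | mclaughlin-e | 1 |
| 7 | griffith-j | 1 | 44 | sanchez-m | 1 | 81 | ring-a | 1 | 118 | arnold-j | 3 |
| 8 | mccarty-d | 3 | 45 | guzman-m | 2 | 82 | grigsby-m | 2 | 119 | horton-s | 3 |
| 9 | rapp-b | 1 | 46 | dasovich-j | 2 | 83 | mcconnell-m | 1 | 120 | lokey-t | 2 |
| 10 | lay-k | 3 | 47 | lavorato-j | 3 | 84 | scholtes-d | 2 | 121 | ermis-f | 2 |
| 11 | keavey-p | 1 | 48 | perlingiere-d | 1 | 85 | schwieger-j | 2 | 122 | zipper-a | 3 |
| 12 | love-p | 1 | 49 | giron-d | 1 | 86 | hyvl-d | 1 | 123 | salisbury-h | 2 |
| 13 | mann-k | 1 | 50 | may-l | 2 | 87 | donohoe-t | 1 | 124 | ybarbo-p | 1 |
| 14 | tycholiz-b | 3 | 51 | thomas-p | 1 | 88 | stepenovitch-j | 3 | 125 | bailey-s | 1 |
| 15 | pereira-s | 1 | 52 | maggi-m | 2 | 89 | holst-k | 2 | 126 | derrick-j | 2 |
| 16 | meyers-a | 1 | 53 | blair-l | 1 | 90 | lokay-m | 2 | 127 | germany-c | 1 |
| 17 | hendrickson-s | 1 | 54 | whalley-l | 1 | 91 | allen-p | 1 | 128 | keiser-k | 1 |
| 18 | solberg-g | 1 | 55 | weldon-c | 1 | 92 | ring-r | 1 | 129 | steffes-j | 3 |
| 19 | scott-s | 1 | 56 | rogers-b | 1 | 93 | arora-h | 3 | 130 | richey-c | 2 |
| 20 | schoolcraft-d | 1 | 57 | badeer-r | 2 | 94 | shapiro-r | 3 | 131 | whalley-g | 3 |
| 21 | sager-e | 1 | 58 | taylor-m | 1 | 95 | donoho-l | 1 | 132 | saibi-e | 1 |
| 22 | dean-c | 2 | 59 | wolfe-j | 1 | 96 | kaminski-v | 2 | 133 | stclair-c | 1 |
| 23 | hodge-j | 3 | 60 | shankman-j | 3 | 97 | motley-m | 2 | 134 | hayslett-r | 3 |
| 24 | pimenov-v | 1 | 61 | dickson-s | 1 | 98 | carson-m | 1 | 135 | lewis-a | 2 |
| 25 | baughman-d | 2 | 62 | davis-d | 1 | 99 | hain-m | 2 | 136 | neal-s | 3 |
| 26 | quigley-d | 1 | 63 | south-s | 1 | 100 | parks-j | 1 | 137 | swerzbin-m | 2 |
| 27 | brawner-s | 2 | 64 | benson-r | 2 | 101 | presto-k | 3 | 138 | hernandez-j | 1 |
| 28 | buy-r | 2 | 65 | haedicke-m | 3 | 102 | williams-j | 1 | 139 | panus-s | 1 |
| 29 | king-j | 2 | 66 | storey-g | 2 | 103 | beck-s | 3 | 140 | reitmeyer-j | 1 |
| 30 | white-s | 1 | 67 | kean-s | 3 | 104 | farmer-d | 2 | 141 | gang-l | 1 |
| 31 | lucci-p | 1 | 68 | sturm-f | 3 | 105 | sanders-r | 3 | 142 | platter-p | 2 |
| 32 | mckay-b | 1 | 69 | tholt-j | 3 | 106 | smith-m | 1 | 143 | mckay-j | 2 |
| 33 | shively-h | 3 | 70 | lenhart-m | 1 | 107 | shackleton-s | 1 | 144 | townsend-j | 1 |
| 34 | cuilla-m | 2 | 71 | whitt-m | 1 | 108 | williams-w3 | 1 | 145 | semperger-c | 2 |
| 35 | jones-t | 1 | 72 | watson-k | 1 | 109 | slinger-r | 2 | 146 | delainey-d | 3 |
| 36 | bass-e | 2 | 73 | campbell-l | 1 | 110 | zufferli-j | 1 | | | |
| 37 | staab-t | 1 | 74 | ward-k | 1 | 111 | hyatt-k | 2 | | | |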
Table 5. The top 10 provinces with high complexity in railway networks. $n_c$ denotes the number of communities to which a physical node belongs.
Railway—FONRailway—HON
Province C ˜ 2 , C ˜ 3 Edge ProvinceProvince C ˜ 1 C ˜ 2 C ˜ 3 n c
Hubei0.845Anhui43.34218.14
Sichuan0.845Hubei32.64312.8483
Hunan0.845Henan22.04117.4152
Tianjin0.845Jiangxi21.69913.2552
Beijing0.845Zhejiang21.69911.7912
Henan0.845Jiangshu21.7421.9742
Shaanxi0.845Liaoning21.5195.9012
Guangdong0.845×Hebei11.04116.2221
Guangxi0.845×Shandong11.04114.4141
Chongqing0.845×Tianjin11.04112.0751
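
As an illustration of the quantity $n_c$ reported in Table 5, the following minimal Python sketch counts, for each physical node, the distinct communities that contain at least one of its higher-order nodes. The input format (a dictionary from higher-order node labels to community IDs, with labels written as `current|history`), the separator, and the function names are assumptions made for this example; they are not taken from the article or from the Infomap software, and the toy data are invented.

```python
from collections import defaultdict


def communities_per_physical_node(hon_assignment, sep="|"):
    """Group community IDs by physical node.

    hon_assignment: dict mapping a higher-order node label to a community ID.
    The label format "current|history" is an assumption of this sketch; the
    physical node is taken to be the part before the first separator.
    """
    memberships = defaultdict(set)
    for hon_node, community in hon_assignment.items():
        physical = hon_node.split(sep, 1)[0]
        memberships[physical].add(community)
    return dict(memberships)


def count_communities(hon_assignment, sep="|"):
    """Return n_c: the number of distinct communities per physical node."""
    grouped = communities_per_physical_node(hon_assignment, sep)
    return {node: len(comms) for node, comms in grouped.items()}


# Invented toy assignment: physical node A has three higher-order nodes
# spread over two communities; B and C each sit in a single community.
example = {
    "A|": 0,
    "A|B": 1,
    "A|C": 1,
    "B|": 1,
    "C|": 2,
}
counts = count_communities(example)
```

In this toy input, node A is counted in two communities even though it has three higher-order nodes, because two of them share a community.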
Table 6. The categories of journals reported in JCR in 2020. n denotes the number of papers per journal.

| Journal | Category 1 | Category 2 | Category 3 |
|---|---|---|---|
| PRL | Physics, Multidisciplinary | - | - |
| PRX | Physics, Multidisciplinary | - | - |
| RMP | Physics, Multidisciplinary | - | - |
| PRA | Optics | Physics, Atomic, Molecular, and Chemical | - |
| PRB | Materials Science, Multidisciplinary | Physics, Condensed Matter | Physics, Applied |
| PRC | Physics, Nuclear | - | - |
| PRD | Physics, Particles, and Fields | Astronomy, and Astrophysics | - |
| PRE | Physics, Fluids, and Plasmas | Physics, Mathematical | - |
| PRAB | Physics, Nuclear | Physics, Particles, and Fields | - |
| PRAP | Physics, Applied | - | - |
| PRF | Physics, Fluids, and Plasmas | - | - |
| PRM | Materials Science, Multidisciplinary | - | - |
| PRPER | Education, and Educational Research | Education, Scientific Disciplines | - |
| PRXQ | Quantum Science, and Technology | Physics, Multidisciplinary | Physics, Applied |
| PRR | Physics, Multidisciplinary | - | - |

Share and Cite

MDPI and ACS Style

Fu, Y.; Lu, X.; Yu, C.; Li, J.; Li, X.; Huangpeng, Q. Quantifying the Complexity of Nodes in Higher-Order Networks Using the Infomap Algorithm. Systems 2024, 12, 347. https://doi.org/10.3390/systems12090347