HLEGF: An Effective Hypernetwork Community Detection Algorithm Based on Local Expansion and Global Fusion

Abstract: Community structure is crucial for understanding network characteristics, and the local expansion method has performed well in detecting community structures. However, this method has two problems: it can only add nodes or edges to existing clusters, and it can produce a large number of small communities. In this paper, we extend the local expansion method from ordinary graphs to hypergraphs and propose an effective hypernetwork community detection algorithm based on local expansion (LE) and global fusion (GF), referred to as HLEGF. The LE process obtains multiple small sub-hypergraphs by deleting and adding hyperedges, while the GF process optimizes the sub-hypergraphs generated by the LE process. To solve the first problem, the HLEGF algorithm introduces the concepts of community neighborhood and community boundary to delete some nodes and hyperedges in hypergraphs. To solve the second problem, the HLEGF algorithm establishes correlations between adjacent sub-hypergraphs through global fusion. We evaluated the performance of the HLEGF algorithm on a real hypernetwork and six synthetic random hypernetworks with different probabilities. The concepts of community boundary and neighborhood, together with a series of similarity measures, give the algorithm an advantage over the compared methods. In the real hypernetwork, the result of the HLEGF algorithm is consistent with that of the classical Spectral algorithm, while in the random hypernetworks, when the probability is not less than 0.95, the NMI value of the HLEGF algorithm is always greater than 0.92 and the RI value is always greater than 0.97. When the probability is 0.95, the HLEGF algorithm achieves a 2.3% improvement in the NMI value compared to the Spectral algorithm. Finally, we applied the HLEGF algorithm to a drug–target hypernetwork to partition drugs with similar functions into communities.


Introduction
In real life, many complex relationships can often be represented as complex networks, such as citation networks, social networks, and mobile information control networks [1]. In these different networks, nodes represent individuals and edges represent their interactions. Three important characteristics of complex networks, namely the small-world property [2], scale-free property [3], and community structure [4], have garnered significant attention from scholars.
Ordinary graphs are limited to pairwise relationship information, and this binary measure is often insufficient in many applications. Consequently, hypernetworks have become a popular research topic in network science. The concept of a hypernetwork is divided into two categories, namely supernetworks based on networks and hypernetworks based on hypergraphs.

First, small sub-hypergraphs are obtained in the local expansion process. Next, we compare the distances between the seed nodes of the sub-hypergraphs. If the distance is less than a specified threshold, we merge the two sub-hypergraphs into one and obtain the final community partitioning result.
The main contributions of this paper are as follows:
(1) This paper extends the local expansion method to hypergraphs and proposes a hypernetwork community detection algorithm based on local expansion and global fusion, which provides a solution for identifying communities with hypergraph structures;
(2) Based on the local expansion, we consider deleting nodes and hyperedges;
(3) The algorithm establishes connections between sub-hypergraphs through global fusion to strengthen the correlation between detected communities.
This paper is organized as follows. Some definitions associated with the algorithm are provided in Section 2, the HLEGF algorithm and its two processes are introduced in Section 3, Section 4 verifies the feasibility and superiority of the algorithm through analytical experiments, and Section 5 concludes this paper and discusses future research.

Basic Definitions
Definition 1 (Hypergraph [28]). A hypergraph $H$ is a pair $H = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is a finite set of nodes (also called vertices) and $E = \{e_1, e_2, \ldots, e_m\}$ is a family of nonempty subsets of elements of $V$. These subsets are called hyperedges or hyperlinks, and they represent an interaction taking place between elements of $V$ (see Figure 1). If a node $v_i \in e_j$, we say that $v_i$ is incident to $e_j$, and the corresponding entry in the incidence matrix $A$ is $A_{ji} = 1$; otherwise, the entry is 0. Two hyperedges $e_i$ and $e_j$ are said to be incident if $e_i \cap e_j \neq \emptyset$, i.e., if they have at least one node in common. We use the matrix $B$ to represent this relationship: if $e_i \cap e_j \neq \emptyset$, then $B_{ij} = 1$; otherwise, the entry is 0. The degree $d(v_i)$ of the node $v_i$ in a hypergraph is defined as the number of nodes directly adjacent to it. The hyperdegree $d_H(v_i)$ of a node is defined as the number of hyperedges containing that node. The distance $d(v_i, v_j)$ between two nodes $v_i$ and $v_j$ in a hypergraph is defined as the minimum length of a path connecting the two nodes. If there is no path between the two nodes, the distance is $d(v_i, v_j) = \infty$.
Definition 2 (Sub-hypergraph). Given two hypergraphs $H = (V, E)$ and $H' = (V', E')$, if $V' \subseteq V$ and for every $e' \in E'$ there is only one $e \in E$ such that $e' \subset e$, then $H'$ is called a sub-hypergraph of $H$.
Definition 3 (Node Centrality). In this paper, the centrality of a node $v_i$ in a hypergraph is defined by Formula (1), which consists of two terms. The first term is the ratio of the hyperdegree of the node to the total number of hyperedges, which reflects the importance of hyperedges. The second term is the ratio of the degree of the node to the total number of nodes, which reflects the importance of nodes. $\alpha$ is a tunable parameter. Node centrality is therefore measured by both node degree and hyperdegree, and it also determines the order in which seed nodes are selected in this paper.
In Figure 1, when $\alpha = 0.5$, the node with the greatest centrality is $v_2$. Therefore, the seed node is $v_2$.
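As an illustration of Definition 3, the sketch below computes the degree, the hyperdegree, and a centrality score on a small toy hypergraph. It assumes that $\alpha$ weights the hyperdegree term and $1 - \alpha$ weights the degree term; this weighting, the function names, and the toy hypergraph (which is not Figure 1) are assumptions made for illustration only.

```python
def hyperdegree(hyperedges, v):
    """Number of hyperedges that contain node v."""
    return sum(1 for e in hyperedges.values() if v in e)


def node_degree(hyperedges, v):
    """Number of distinct nodes that share at least one hyperedge with v."""
    neighbours = set()
    for e in hyperedges.values():
        if v in e:
            neighbours |= e
    neighbours.discard(v)
    return len(neighbours)


def node_centrality(hyperedges, nodes, v, alpha=0.5):
    """Centrality as a weighted sum of the two ratios described in Definition 3.

    ASSUMPTION: alpha multiplies the hyperdegree ratio and (1 - alpha)
    multiplies the degree ratio.
    """
    m = len(hyperedges)  # total number of hyperedges
    n = len(nodes)       # total number of nodes
    return (alpha * hyperdegree(hyperedges, v) / m
            + (1 - alpha) * node_degree(hyperedges, v) / n)


# Hypothetical toy hypergraph (not the one shown in Figure 1).
toy = {"e1": {1, 2, 3}, "e2": {3, 4}, "e3": {3, 5, 6}, "e4": {6, 7}}
nodes = set().union(*toy.values())
scores = {v: node_centrality(toy, nodes, v, alpha=0.5) for v in nodes}
seed = max(sorted(nodes), key=scores.get)  # node with the greatest centrality
print(seed, round(scores[seed], 4))
```

In this toy example, node 3 lies in three of the four hyperedges and is adjacent to five of the seven nodes, so it is selected as the seed.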
Definition 4 (Node Neighborhood). The neighborhood $\Gamma(v_i)$ of a node $v_i$ is defined as the set of all hyperedges containing that node.
In this paper, the nodes contained in the neighborhood $\Gamma(v_i)$ of the seed node $v_i$ are taken as the initial community of this node, denoted by $C_{v_i}$. Therefore, in Figure 1, the initial community of the seed node $v_2$ is $C_{v_2}$.
Definition 5 (Community Boundary and Neighborhood). Given a community $C$, the boundary $B(C)$ consists of the hyperedges inside the community that have at least one incident hyperedge located outside the community.
The neighborhood $\Gamma(C)$ of community $C$ consists of the hyperedges outside the community that have at least one incident hyperedge located within the community.
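Definition 5 can be sketched in a few lines of code. The sketch assumes that a hyperedge counts as "inside" a community when all of its nodes belong to the community and as "outside" otherwise; this reading, the function names, and the toy hypergraph are assumptions rather than details confirmed by the text.

```python
def inside_edges(hyperedges, community):
    """Names of hyperedges whose nodes all lie in the community (assumed reading)."""
    return {name for name, e in hyperedges.items() if e <= community}


def community_boundary(hyperedges, community):
    """Inside hyperedges that share a node with at least one outside hyperedge."""
    inside = inside_edges(hyperedges, community)
    outside = set(hyperedges) - inside
    return {i for i in inside
            if any(hyperedges[i] & hyperedges[o] for o in outside)}


def community_neighborhood(hyperedges, community):
    """Outside hyperedges that share a node with at least one inside hyperedge."""
    inside = inside_edges(hyperedges, community)
    outside = set(hyperedges) - inside
    return {o for o in outside
            if any(hyperedges[o] & hyperedges[i] for i in inside)}


# Hypothetical toy hypergraph: a chain of four hyperedges.
toy = {"e1": {1, 2, 3}, "e2": {3, 4}, "e3": {4, 5, 6}, "e4": {6, 7}}
community = toy["e1"] | toy["e2"]          # nodes {1, 2, 3, 4}
boundary = community_boundary(toy, community)
neighborhood = community_neighborhood(toy, community)
print(boundary, neighborhood)
```

Here only e2 touches a hyperedge outside the community (e3), so e2 forms the boundary and e3 forms the neighborhood; e4 is outside but not incident to any inside hyperedge.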
In the hypergraph shown in Figure 1, the boundary of the community $C_{v_2}$ is $B(C_{v_2}) = \{e_1, e_3\}$, and its neighborhood is $\Gamma(C_{v_2}) = \{e_4, e_5\}$.
Definition 6 (Similarity between a Hyperedge and a Sub-hypergraph). For a given community $C$ and a sub-hypergraph $H' = (V', E')$, the similarity $hss(e_i, H')$ is defined as follows: [equation not recovered]
Definition 7 (Similarity between Node and Community). The similarity $ncs(v_i, C)$ between a node $v_i$ and a community $C$ is defined as the ratio of the number of hyperedges in the node neighborhood that satisfy a membership condition to the total number of hyperedges in the node neighborhood:
$$ncs(v_i, C) = \frac{\left|\{e_j \in \Gamma(v_i) : |e_j \cap C| \geq |e_j|/2\}\right|}{|\Gamma(v_i)|}$$
The hyperedges counted in the numerator are those for which at least half of the contained nodes are located in the community $C$. The result reflects how strongly the node is connected with the community $C$.
In the above equation, if $ncs(v_i, C) < 0.5$, the node $v_i$ is removed from the community $C$.
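The node-community similarity of Definition 7 and the 0.5 removal rule can be sketched as follows. The function names and the toy data are hypothetical, and returning 0.0 for a node that lies in no hyperedge is an added assumption to avoid division by zero.

```python
def node_neighborhood(hyperedges, v):
    """Names of all hyperedges that contain node v (Definition 4)."""
    return {name for name, e in hyperedges.items() if v in e}


def ncs(hyperedges, v, community):
    """Fraction of v's hyperedges with at least half of their nodes in the community."""
    gamma = node_neighborhood(hyperedges, v)
    if not gamma:
        return 0.0  # ASSUMPTION: an isolated node has zero similarity
    mostly_inside = [name for name in gamma
                     if 2 * len(hyperedges[name] & community) >= len(hyperedges[name])]
    return len(mostly_inside) / len(gamma)


# Hypothetical toy hypergraph and candidate community.
toy = {"e1": {1, 2, 3}, "e2": {3, 4}, "e3": {4, 5, 6}, "e4": {6, 7}}
community = {1, 2, 3, 5}

# Keep only the nodes whose similarity to the community is at least 0.5.
pruned = {v for v in community if ncs(toy, v, community) >= 0.5}
print(pruned)
```

Node 5 belongs only to e3, of which just one of three nodes lies in the community, so its similarity is 0 and it is removed.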

Proposed Method
This paper proposes a new algorithm, called HLEGF, for detecting community structures in hypernetworks. The algorithm consists of two processes: the LE process and the GF process. The LE process includes seed selection, deletion, and expansion sub-processes. The seed selection sub-process selects the node with the highest centrality in the hypernetwork as the seed node and uses the nodes contained in the neighborhood of the seed node as the initial community. In the deletion sub-process, hyperedges within the community boundary that have low similarity to the sub-hypergraph formed by the current community are removed first, and then nodes with low similarity $ncs(v, C)$ to the community are removed. The expansion sub-process decides whether to expand the community based on the similarity $hss(e, H')$ between a hyperedge within the community neighborhood and a sub-hypergraph, and adds hyperedges with higher similarity to the community. This process yields multiple smaller sub-hypergraphs. The GF process globally merges the sub-hypergraphs generated by the LE process according to the distance between the seed nodes of different sub-hypergraphs. When the distance is smaller than the specified threshold, the smaller sub-hypergraph is merged into the larger sub-hypergraph, and the final community detection result is obtained.

Local Expansion Process
The LE process is displayed in Algorithm 1. Lines 1-6 describe the initialization process, where $C$ is the sub-hypergraph set, $S$ is the seed node set, $D$ is the set of deleted hyperedges, and $U$ is the set of unassigned nodes. The centrality value of each node is calculated according to Definition 3. Lines 7-34 describe the LE process, in which the algorithm executes the seed selection, deletion, and expansion sub-processes sequentially. The seed selection sub-process (lines 9-12) selects the node $v_s$ with the maximum centrality from the set of unassigned nodes as the seed and determines the community $C_{v_s}$ based on Definition 4. The deletion sub-process (lines 13-23) first obtains the hyperedges within the community boundary based on Definition 5, then calculates the similarity $hss(e, H_1)$ between each such hyperedge and the sub-hypergraph formed by the community, as well as the similarity $hss(e, H_2)$ between the hyperedge and the remaining sub-hypergraph outside the community. If $hss(e, H_1) < hss(e, H_2)$, the hyperedge is deleted, and nodes that belong to this hyperedge but to no other hyperedge inside the community are also deleted. Then, the set $D$ is updated. Next, the similarity $ncs(v, C_{v_s})$ between each node of the community and the community is calculated. If $ncs(v, C_{v_s}) < 0.5$, the node is deleted. The expansion sub-process (lines 24-33) first obtains the hyperedges within the community neighborhood based on Definition 5, then calculates the similarity $hss(e, H_1)$ between each such hyperedge and the sub-hypergraph formed by the current community, as well as the similarity $hss(e, H_2)$ between the hyperedge and the remaining sub-hypergraph outside the community. If $hss(e, H_1) > hss(e, H_2)$, the hyperedge and its associated nodes are added to the current community. At this point, we obtain a sub-hypergraph. Then, the sub-hypergraph set $C$ and the unassigned node set $U$ are updated. The above process is repeated until $U = \emptyset$.
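The outer loop of the LE process can be sketched as follows, restricted to seed selection and initial community construction. The deletion and expansion sub-processes are deliberately left out because they depend on the similarity $hss$ of Definition 6. The function name, the precomputed `centrality` dictionary, and the rule that nodes already assigned to an earlier community are not claimed again are assumptions of this sketch, not details taken from Algorithm 1.

```python
def seed_initial_communities(hyperedges, centrality):
    """Repeatedly pick the most central unassigned node and seed a community.

    hyperedges: dict mapping a hyperedge name to its set of nodes.
    centrality: dict mapping each node to a precomputed centrality score.
    Returns a dict mapping each seed node to its initial community.
    """
    unassigned = set().union(*hyperedges.values())  # the set U
    communities = {}
    while unassigned:
        # Seed selection: unassigned node with the maximum centrality.
        seed = max(sorted(unassigned), key=lambda v: centrality[v])
        # Initial community: all nodes in the hyperedges that contain the seed.
        members = {seed}
        for e in hyperedges.values():
            if seed in e:
                members |= e
        # ASSUMPTION: nodes already assigned elsewhere are not claimed again.
        members &= unassigned
        communities[seed] = members
        unassigned -= members
    return communities


# Hypothetical toy hypergraph; hyperdegree is used as a stand-in centrality.
toy = {"e1": {1, 2, 3}, "e2": {3, 4}, "e3": {3, 5, 6}, "e4": {6, 7}}
centrality = {1: 1, 2: 1, 3: 3, 4: 1, 5: 1, 6: 2, 7: 1}
initial = seed_initial_communities(toy, centrality)
print(initial)
```

A fuller sketch would apply the deletion and expansion steps to each community before updating the set of unassigned nodes.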
To facilitate the next process, we generate a new hypergraph $H_2 = (V, E')$, where $E'$ is obtained by deleting the hyperedges in the set $D$ from the original hyperedge set $E$; there will be many isolated nodes in $H_2$.
The time complexity of Algorithm 1 is $O(2m + n)$, where $m$ is the number of hyperedges in the hypernetwork and $n$ is the number of nodes. Figure 2 illustrates the process.
Algorithm 1: LE algorithm (only fragments of the listing survive)
13: Deletion sub-process:
14: Get community boundary $B(C_{v_s})$ based on Definition 5
...
$H_2 = (V, E')$, where for every $e_j \in E'$, $e_j \notin D$
34: return $C$
End LE
Figure 2a shows the original hypergraph. The node $v_2$ with the maximum centrality is selected as the seed node, and the neighborhood $\Gamma(v_2)$ of the node is $\{e_1, e_2, e_3\}$. In Figure 2b, the nodes contained in the neighborhood $\Gamma(v_2)$ are used as the initial community $C_{v_2}$. The boundary of the community is $\{e_1, e_3\}$. Based on the similarity between hyperedge and sub-hypergraph, the hyperedge $e_1$ is more similar to the sub-hypergraph outside the community, so it is deleted. The hyperedges $e_2$, $e_3$ and their associated nodes are retained in the community $C_{v_2}$, and the remaining nodes $\{v_1, v_2, v_8, v_9, v_{10}, v_{11}\}$ in the community are shown in Figure 2c. Based on the similarity between node and community, these nodes are more similar to the current community, so they are retained. In the expansion sub-process, the neighborhood of the current community is $\{e_1, e_4\}$. Since the similarity between the hyperedge $e_4$ and the sub-hypergraph formed by the current community is greater than that between $e_4$ and the sub-hypergraph outside the community, the hyperedge $e_4$ and its contained nodes are added to the community, and the final sub-hypergraph is obtained. This sub-process is shown in Figure 2d.
Figure 2. An example of the LE process. (a) Select node $v_2$ as the seed node and obtain its neighborhood; the seed node is marked in red. (b) Obtain the initial community and its boundary. (c) Calculate the similarity between hyperedges within the community boundary and the sub-hypergraph, as well as the similarity between nodes within the community and the current community itself, delete the hyperedges and nodes with low similarity, and then obtain the community's neighborhood. (d) Calculate the similarity between hyperedges within the community's neighborhood and the sub-hypergraph, and add the hyperedge with high similarity and its associated nodes to the current community.

Global Fusion Process
The GF process is displayed in Algorithm 2. The GF process first sorts the sub-hypergraphs obtained from the LE process in ascending order of the number of nodes they contain and obtains the seed node of each sub-hypergraph. Then, it sequentially selects a sub-hypergraph $C_{cur}$, whose seed node is $sn_{cur}$. Based on the distances between seed nodes on the hypergraph $H_2$, we find the sub-hypergraph $C_{prob}$ whose seed node is closest to $sn_{cur}$. If this shortest distance is less than the threshold $\tau$, $C_{cur}$ is merged into $C_{prob}$, and the sub-hypergraph $C_{cur}$ and its seed node $sn_{cur}$ are deleted. This process continues until no more sub-hypergraphs can be merged, which yields the final community detection result. We let $\tau = 2\%$ in this paper, which is consistent with the conclusion of Rodriguez and Laio [29]. Specifically, we calculate the distances between the seed nodes of all sub-hypergraphs and select the top 2% of distances; if the distance between the seed nodes of two sub-hypergraphs falls outside this top 2%, the two sub-hypergraphs are not merged.

Algorithm 2: GF algorithm
Input: Hypergraph $H_2$, the sub-hypergraphs $C$ obtained by the LE process, and the threshold $\tau$.
Output: The final communities $C_{fin}$.
Start GF
  Sort $C$ in increasing order of sub-hypergraph size
  Get the corresponding seed nodes as $SN$
  for $C_{cur}$ in $C$:
    Get the seed node $sn_{cur}$ that corresponds to the current sub-hypergraph $C_{cur}$
    Get $C_{prob}$ based on the minimum distance$(sn_{cur}, sn_{prob})$ on the hypergraph $H_2$
    if distance$(sn_{cur}, sn_{prob}) < \tau$:
      Merge $C_{cur}$ into $C_{prob}$
      $C = C - C_{cur}$
      $SN = SN - sn_{cur}$
    end if
  end for
  $C_{fin} = C$
  return $C_{fin}$
End GF
Assuming that Algorithm 1 generates $c$ sub-hypergraphs, the time complexity of Algorithm 2 is $O(c^2)$. The specific process is shown in Figure 3.
After the local expansion process, the hypergraph is divided into two sub-hypergraphs, namely $C_{v_2}$ and $C_{v_4}$, corresponding to the seed nodes $v_2$ and $v_4$. Since the hyperedge $e_1$ (denoted by a dash-dotted line) was deleted in the hypergraph $H_2$, there is no path between $v_2$ and $v_4$, and the distance is infinite, as shown in Figure 3a. Therefore, the two sub-hypergraphs cannot be merged, and the final community partition is $C_{fin} = \{C_{v_2}, C_{v_4}\}$, where $C_{v_2} = \{v_1, v_2, v_8, v_9, v_{10}, v_{11}, v_{12}\}$ and $C_{v_4} = \{v_3, v_4, v_5, v_6, v_7, v_{13}\}$, as shown in Figure 3b.
Figure 3. An example of the GF process. (a) The two sub-hypergraphs obtained by the LE process are $C_{v_2}$ and $C_{v_4}$, with corresponding seed nodes $v_2$ and $v_4$. These seed nodes are marked in red. Since the hyperedge $e_1$ was deleted in the hypergraph $H_2$, the distance between $v_2$ and $v_4$ is infinite. (b) Since the distance is infinite, the final communities are $C_{v_2}$ and $C_{v_4}$.
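The merging step of the GF process can be sketched as below. The sketch measures distance as the number of hyperedges on a shortest path and treats `tau` as a plain numeric cutoff; the percentile-based choice of the threshold described above is not reproduced. The function names and the toy data are hypothetical.

```python
from collections import deque
import math


def hypergraph_distance(hyperedges, source, target):
    """Fewest hyperedges on a path from source to target; math.inf if none."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for e in hyperedges.values():
            if node in e:
                for nxt in e - seen:  # e - seen is a fresh set, safe to iterate
                    if nxt == target:
                        return dist + 1
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return math.inf


def global_fusion(hyperedges, communities, tau):
    """Merge each community into its nearest one when the seed distance is below tau.

    communities: dict mapping a seed node to its set of member nodes.
    """
    merged = {seed: set(members) for seed, members in communities.items()}
    # Visit sub-hypergraphs from smallest to largest (sizes taken before merging).
    for seed in sorted(merged, key=lambda s: len(merged[s])):
        others = [s for s in merged if s != seed]
        if not others:
            break
        nearest = min(others, key=lambda s: hypergraph_distance(hyperedges, seed, s))
        if hypergraph_distance(hyperedges, seed, nearest) < tau:
            merged[nearest] |= merged.pop(seed)
    return merged


# Hypothetical reduced hypergraph H_2 and LE output.
h2 = {"e2": {1, 2}, "e3": {2, 3}, "e5": {4, 5}}
le_output = {2: {1, 2}, 3: {3}, 4: {4, 5}}
final = global_fusion(h2, le_output, tau=2)
print(final)
```

As in the Figure 3 example, two seeds with no connecting path are at infinite distance and their communities stay separate, while the single-node community seeded at node 3 is absorbed by its neighbor.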

Dataset
We use the Southern Women hypernetwork and random hypernetworks to verify the feasibility of the algorithm. In addition, a drug-target hypernetwork is used to partition drugs with similar functions into communities, which enables us to mine drug modules. The details of these datasets are displayed in Table 1.

Evaluation Metrics
The Rand Index (RI) and Normalized mutual information (NMI), as two classical metrics, can consider both similarity within and between communities, thus allowing a more comprehensive evaluation of the quality of community divisions. Moreover, because both the real-world hypernetwork and random hypernetworks used in this paper have known community structure, NMI and RI are more appropriate than indicators such as modularity. We therefore used these two indicators to represent the effectiveness of the algorithm in this paper.
The RI is defined as follows:
$$RI = \frac{TP + TN}{TP + TN + FP + FN}$$
The RI consists of four terms: TP, TN, FP, and FN. TP represents the number of node pairs that belong to the same community in both the experimental results and the true data. TN represents the number of node pairs that belong to different communities in both the experimental results and the true data. FP represents the number of node pairs that belong to different communities in the true data but are assigned to the same community in the experimental results. FN represents the number of node pairs that belong to the same community in the true data but are assigned to different communities in the experimental results. The RI ranges from 0 to 1: a value of 1 indicates complete agreement with the actual partition, a value of 0 indicates complete disagreement, and values closer to 1 indicate better agreement.
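Both evaluation metrics used in this section can be computed directly from two label lists. The sketch below counts agreeing node pairs for the RI and, for the NMI, assumes the common normalization $2I(X, Y) / (H(X) + H(Y))$; that normalization, the function names, and the toy labels are assumptions, since the exact NMI formula is not visible here.

```python
from collections import Counter
from itertools import combinations
import math


def rand_index(true_labels, pred_labels):
    """(TP + TN) / (number of node pairs), counted over all pairs of nodes."""
    agree = 0
    pairs = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        agree += same_true == same_pred  # TP or TN
        pairs += 1
    return agree / pairs


def entropy(labels):
    """Shannon entropy of a partition given as a list of labels."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())


def nmi(true_labels, pred_labels):
    """ASSUMED normalization: 2 * I(X, Y) / (H(X) + H(Y))."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, pred_labels))
    ct, cp = Counter(true_labels), Counter(pred_labels)
    mi = sum(c / n * math.log(c * n / (ct[t] * cp[p]))
             for (t, p), c in joint.items())
    denom = entropy(true_labels) + entropy(pred_labels)
    return 1.0 if denom == 0 else 2 * mi / denom


truth = [0, 0, 0, 1, 1, 1]
relabelled = [1, 1, 1, 0, 0, 0]   # same partition, different label names
shifted = [0, 0, 1, 1, 1, 1]      # one node moved to the other community
print(rand_index(truth, relabelled), nmi(truth, relabelled))
print(rand_index(truth, shifted))
```

Both metrics depend only on the grouping, not on the label names, which is why the relabelled partition scores as a perfect match.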
The NMI is defined as follows: [equation not recovered] where $H(X|Y)$ is the conditional entropy of partition $X$ given partition $Y$. The NMI equals 1 if and only if the partitions are identical, whereas it has an expected value of 0 if they are independent.

Southern Women Hypernetwork
We considered a real-world hypernetwork with community structure, namely the Southern Women hypernetwork, which is a social network. Table 1 provides details about this dataset. We compared the HLEGF algorithm with four different algorithms to verify the feasibility of the algorithm. Figure 4 illustrates the Southern Women hypernetwork, which includes 18 women and 14 social events. The original data were collected by Davis [30]. For our analysis, we treat the 18 women as nodes and the 14 social events as hyperedges to construct the hypernetwork. The hypernetwork can be represented as a bipartite graph, as shown in Figure 4, where the women are listed on the left and the social events are listed on the right. An edge is established between a woman and a social event in the bipartite graph if she participated in that event.
To facilitate description, we numbered the 18 women from 1 to 18. As the nodes within the same hyperedge are fully connected, we can convert the hypernetwork into an ordinary network. We then compared our HLEGF algorithm with the IRMM algorithm [31] and the Spectral algorithm [32] in the hypernetwork, and with the LPA algorithm [33] and the GN algorithm [34] in the ordinary network.
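The conversion from a hypernetwork to an ordinary network mentioned above corresponds to connecting every pair of nodes that share a hyperedge. A minimal sketch, with a hypothetical function name and toy input:

```python
from itertools import combinations


def clique_expansion(hyperedges):
    """Return the set of pairwise edges obtained by fully connecting each hyperedge."""
    edges = set()
    for e in hyperedges:
        # Sorting gives each undirected edge a single canonical (u, v) form.
        for u, v in combinations(sorted(e), 2):
            edges.add((u, v))
    return edges


ordinary_edges = clique_expansion([{1, 2, 3}, {3, 4}])
print(sorted(ordinary_edges))
```

The resulting edge set can then be passed to algorithms designed for ordinary networks, such as LPA or GN.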
We used the Rand Index to represent the effectiveness of the algorithms. The number of nodes in the Southern Women hypernetwork is very small, so we set the parameter $\alpha = 0.5$. Table 2 presents the community detection results of these algorithms in the Southern Women hypernetwork, as well as the actual result. For ease of comparison, the HLEGF algorithm proposed in this paper and its performance are shown in bold. The results show that the community detection algorithms for hypernetworks are generally better than those for ordinary networks, and the community detection results obtained by the HLEGF algorithm are completely consistent with the ground truth. Therefore, this algorithm correctly partitions the 18 women.

Random Hypernetwork
We constructed six synthetic random hypernetworks with probabilities of 1, 0.99, 0.98, 0.97, 0.96, and 0.95, respectively. These random hypernetworks have known community structure. As before, we compared the algorithm in this paper with the four other algorithms on the six random hypernetworks; the results showed that our algorithm has some advantages.
We first generated a random hypernetwork with a known community structure. The hypernetwork consisted of $n$ nodes and $K$ communities, where each community includes $n_k$ nodes. At each iteration, we randomly selected $n_v$ nodes ($n_v < n_{max}$). If the selected nodes belonged to the same community, they were connected by a hyperedge with probability $p_{in}$; otherwise, they were connected with probability $p_{out}$, where $p_{out} = 1 - p_{in}$. We repeated the process until we generated a hypernetwork with $m$ hyperedges.
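The generation procedure described above can be sketched as follows. The sketch assumes that communities are equally sized blocks of consecutive node indices, that $n_v$ is drawn uniformly from 2 to $n_{max} - 1$, and that nodes are sampled uniformly without replacement; these choices, the function name, and the parameter values in the example are assumptions, not details given in the text.

```python
import random


def random_hypernetwork(n_k, K, m, p_in, n_max, seed=0):
    """Generate m hyperedges over K communities of n_k nodes each.

    Returns (label, hyperedges), where label maps node -> community index.
    """
    rng = random.Random(seed)
    n = n_k * K
    label = {v: v // n_k for v in range(n)}   # ASSUMPTION: consecutive blocks
    hyperedges = []
    while len(hyperedges) < m:
        n_v = rng.randint(2, n_max - 1)       # ASSUMPTION: 2 <= n_v < n_max
        chosen = rng.sample(range(n), n_v)
        same_community = len({label[v] for v in chosen}) == 1
        p = p_in if same_community else 1 - p_in   # p_out = 1 - p_in
        if rng.random() < p:
            hyperedges.append(frozenset(chosen))
    return label, hyperedges


# Hypothetical parameter values for a quick check.
label, edges = random_hypernetwork(n_k=10, K=3, m=20, p_in=1.0, n_max=5, seed=1)
print(len(edges))
```

With $p_{in} = 1$, every accepted hyperedge lies within a single community, which matches the case in which the community structure is most pronounced.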
As shown in Figure 5, under different probabilities, most nodes are included in 25 to 30 hyperedges.
We used the Rand Index and NMI indicators to evaluate the experimental results. To investigate the effect of the parameter $\alpha$ on NMI, we varied the value of $\alpha$ (between 0.1 and 0.9) and the value of the probability $p_{in}$ (between 0.95 and 0.99). Figure 6 shows that, when $p_{in} \geq 0.97$, the community structure in the hypernetwork is obvious, and the community detection results are consistent with the actual situation regardless of the value of $\alpha$. However, when $p_{in} < 0.97$, the highest NMI values are achieved at $\alpha \approx 0.7$, indicating that the best community detection results are obtained at $\alpha \approx 0.7$. Therefore, we set the parameter $\alpha = 0.7$ for subsequent experiments.
Since all nodes within the same hyperedge are fully connected in the hypernetwork, we can obtain ordinary networks under different probabilities. In Figure 7, we present the performance of five algorithms on these hypernetworks.
The IRMM algorithm, the Spectral algorithm, and our HLEGF algorithm directly partition the hypernetwork into communities, while the LPA algorithm and the GN algorithm partition the ordinary network corresponding to the hypernetwork into communities. We used the Rand Index and NMI as indicators to evaluate the experimental results. Figure 7a,b provide an intuitive representation of the changes in NMI values of the five algorithms under different probabilities, while Figure 7c,d depict the changes in Rand Index values of the algorithms under different probabilities $p_{in}$. The results show that, when $p_{in} = 1$, only nodes within the same community are connected by hyperedges in the hypernetwork, and the community structure is obvious. Therefore, all five algorithms can accurately partition all nodes. As $p_{in}$ decreases and $p_{out}$ increases, nodes in different communities are connected with probability $p_{out}$, and the community structure gradually becomes less distinct. When $0.98 \leq p_{in} < 1$, the three algorithms designed for hypernetworks perform well, while the LPA and GN algorithms designed for ordinary networks show significant disadvantages. When $p_{in} = 0.97$, the Rand Index and NMI values of the IRMM algorithm decrease significantly, and the partition results of the Spectral algorithm also contain some errors. However, our HLEGF algorithm can still accurately identify the communities in the hypernetwork. When $p_{in}$ is reduced from 0.97 to 0.95, our algorithm slightly outperforms the Spectral algorithm, indicating that our algorithm has some advantages.

Drug-Targets Hypernetwork
After verifying the feasibility of the algorithm, we applied the HLEGF algorithm to a relatively large drug-target hypernetwork to detect communities and identify multiple drug modules.
We obtained drug and target information from the DrugBank database, which included 825 FDA-approved drugs and 4871 targets. We constructed a drug-target hypernetwork with 825 nodes and 4871 hyperedges, where each hyperedge includes the drugs that act on the same target. Because this hypernetwork is large, we present only a portion of it in Figure 8a. Figure 8a displays a partial diagram of the drug-target hypernetwork, which contains 46 nodes and 31 hyperedges. The nodes are numbered from 0 to 45, with each node representing a drug. Table 3 provides the corresponding drug and target information for the nodes and hyperedges shown in Figure 8a.
After constructing the drug-target hypernetwork, we applied our algorithm to partition the 825 drug nodes into communities. The results indicated that the 825 drugs were divided into 76 communities, with an average of approximately 11 drugs per community. For instance, the 46 drugs listed in Table 3 were divided into three communities in the partial diagram shown in Figure 8a, as demonstrated in Figure 8b.
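The construction rule stated above (one hyperedge per target, containing the drugs that act on it) can be sketched as a simple grouping step. The function name and the placeholder identifiers are hypothetical and do not correspond to DrugBank records.

```python
from collections import defaultdict


def build_drug_target_hypernetwork(pairs):
    """Group drugs by target: each target becomes one hyperedge of drug nodes."""
    by_target = defaultdict(set)
    for drug, target in pairs:
        by_target[target].add(drug)
    return dict(by_target)


# Placeholder (drug, target) pairs for illustration only.
pairs = [("drug_a", "target_x"), ("drug_b", "target_x"), ("drug_b", "target_y")]
hypernetwork = build_drug_target_hypernetwork(pairs)
print(hypernetwork)
```

The resulting dictionary has the same shape as the toy hypergraphs used for the definitions, so the same centrality and similarity computations apply to it.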
The first category of nodes is represented in red and corresponds to drugs such as Chromous sulfate, Human C1-esterase inhibitor, Iron, Ferrous gluconate, and Ocriplasmin, which are mainly used to treat blood-related diseases. For example, Chromous sulfate and Human C1-esterase inhibitor can improve lipid metabolism, Iron is used for coagulation, Ferrous gluconate is used for iron-deficiency anemia, and Ocriplasmin is used as a human plasma protein. The second category of nodes is represented in yellow and corresponds to drugs such as Lorazepam, Etomidate, Carisoprodol, Zolpidem, and Oxazepam, which are mainly used to treat neurological excitability. For instance, Lorazepam and Oxazepam are used to treat anxiety and depression, Etomidate is used as a short-acting anesthetic or sedative, Carisoprodol has sedative and anti-anxiety effects, and Zolpidem is used as a hypnotic for short-term treatment of insomnia. The third category of nodes is represented in blue and corresponds to drugs such as Medrysone, Levomenthol, Nifedipine, and Quinidine barbiturate, which all have inhibitory effects and whose target receptors can be detected in the brain, retina, heart, and vascular system. For example, Medrysone is a locally applied corticosteroid that can be used to inhibit edema, Levomenthol is a stimulant with sliding motion inhibitory effects, Nifedipine inhibits calcium ion influx and can treat angina pectoris, and Quinidine barbiturate acts directly on the myocardial cell membrane as a membrane-inhibiting anti-arrhythmic drug.
The experimental results demonstrated that the HLEGF algorithm was able to partition drugs with similar functions into a community, which enabled us to mine drug modules. This outcome showcased the practical application value of our algorithm and established a foundation for future drug development and target identification.

Conclusions and Discussion
In this paper, we aim to design a community detection algorithm applicable to hypernetworks. To overcome two limitations of the local expansion method, we introduce the definitions of community boundary and neighborhood and propose the HLEGF algorithm, which is based on local expansion and global fusion. We validated our algorithm on a real hypernetwork and six synthetic random hypernetworks with different probabilities. The results showed that our algorithm is close to the classical Spectral algorithm, and in some cases it slightly outperforms the Spectral algorithm. Further analysis shows that the Spectral algorithm represents a network as a Laplacian matrix and obtains the community structure in the network by performing eigenvalue decomposition and eigenvector analysis on the Laplacian matrix. The value of the eigenvector can be interpreted as the importance of the node in the community, and the similarity between eigenvectors reflects the similarity between nodes. Our algorithm also judges the importance of nodes in the network and conducts community detection based on the definition of similarity. Both algorithms detect communities based on the local structure of the network, so their results are similar. However, for nodes that are highly clustered but do not belong entirely to a community, the HLEGF algorithm can assign them reasonably to a community, while Spectral analysis may classify them as isolated nodes or noise. Therefore, the HLEGF algorithm is slightly better than the Spectral method. After verifying the effectiveness of the algorithm, we applied the HLEGF algorithm to the drug-target hypernetwork and realized the mining of drug modules.
Despite the efficacy of our HLEGF algorithm in detecting community structures of hypernetworks, it still has some flaws. Through inspection and analysis, it was found that the algorithm is not ideal for overlapping community detection. The next step is to use and enhance the HLEGF algorithm so that it can be applied to overlapping and large-scale hypernetworks. In addition, we will consider the effect of medicine doses on community detection based on the existing studies. We hope that our algorithm will have practical significance in other fields as well.