Overlapping Community Detection of Bipartite Networks Based on a Novel Community Density

Abstract: Community detection plays an essential role in understanding network topology and mining underlying information. A bipartite network is a complex network with more authenticity and applicability in the real world than a one-mode network. Many real-world communities in such networks have a naturally overlapping structure. However, most research focuses on detecting non-overlapping community structures in the bipartite network, and the resolution of the existing evaluation functions for community structure quality is limited. We therefore propose a novel function for community detection and evaluation of the bipartite network, called community density D. Based on community density, we propose a bipartite network community detection algorithm, DSNE (Density Sub-community Node-pair Extraction), which detects overlapping communities effectively from a micro point of view. Experiments on artificially generated networks and real-world networks show that the DSNE algorithm is superior to some existing excellent algorithms and that community density (D) is a better evaluation criterion than the bipartite network's modularity.


Introduction
A complex network is a network with a high degree of complexity, which is mainly reflected in the large scale of the system, its diverse structures, and its complex node types. According to the number of node types, complex networks can be classified into one-mode networks and multi-mode networks. A network that contains only one type of node is called a one-mode network, such as financial networks [1], social networks [2], and the Internet [3]. A network containing two or more types of nodes is called a multi-mode network. The bipartite network is a multi-mode network that contains exactly two types of nodes: nodes of the same type are never connected to each other, and edges exist only between nodes of different types. In the real world, there are various types of bipartite networks, such as protein networks [4] and scientific collaboration networks [5].
The community of a complex network is a set of vertices in the network. The vertices in the same community are closely connected, while the vertices in different communities are sparsely connected. For example, members of an association generally tend to communicate with people in the same association rather than with people outside it. Figure 1 shows a community example of a South African Companies bipartite network [6]. The areas framed by blue and yellow dotted lines represent community 1 and community 2, respectively, and the green nodes represent the overlapping nodes of the communities.
Because of its wide applicability and universality, the bipartite network is a hotspot of current research. There are various perspectives and methods in complex network research. Community structure is an important feature: through community structure detection, we can obtain deeper topology, hidden meaning, and network information. In the real world, bipartite network community detection has been used to solve many practical problems.
For example, in the study of bipartite networks, modularity cannot judge the merits and demerits of community detection results in some particular cases because of the modularity resolution limitation [18].
In this article, our main contributions are:
• A new evaluation index (Community Density D) is proposed, which alleviates the problem of the modularity resolution limitation and evaluates the merits and demerits of bipartite network community structure from a new point of view.
• A new community detection algorithm (Density Sub-community Node-pair Extraction (DSNE)) is proposed. It has the advantages of a micro-network model, adopts a new merging strategy for community merging, and can detect the bipartite network's overlapping community structure.
• The experimental results show that the proposed algorithm is superior to other algorithms in accuracy and effectiveness.

Related Work
Compared with the one-mode network, the bipartite network's topology is more complex, and it has stronger universality and practicability. Therefore, research on community detection in the bipartite network is more valuable. There are two main problems in bipartite network community detection: how to find the community structure more effectively, and how to evaluate the merits and demerits of the community detection results. In summary, there are mainly two kinds of community detection algorithms for the bipartite network, and the evaluation indexes for the bipartite network mainly aim to improve the modularity of the bipartite network.

Based on the Projecting of the Bipartite Network Community Detection Algorithm
The methods of projecting bipartite networks to one-mode networks are mainly divided into unweighted one-mode projecting and weighted one-mode projecting. Unweighted one-mode projecting simply projects a bipartite network onto one-mode networks. Newman et al. [19] converted a bipartite network (a scientific collaboration network) into two one-mode networks by using a single-mode right projection. Horvat et al. [20] proposed a statistical method that extends the bipartite network projection algorithm for a single relationship and handles the projection of multiple types of relationships in large datasets. Valejo et al. [21] applied the traditional layered method to the projection process of the bipartite network and proposed a multi-layered method based on projection. Comar et al. [22] proposed a framework that could simultaneously detect the network structures of different communities and their communication, with advantages such as strong adaptability and fast convergence. However, in the process of unweighted one-mode projection, network structure information, such as edge frequency, may be lost, leading to a significant decrease in the accuracy of community detection. Due to these apparent defects of unweighted one-mode projecting, researchers turned their attention to weighted one-mode projecting.
Weighted one-mode projecting converts the frequency of edges in a bipartite network to the weight of edges in a one-mode network during the projection process. Liu et al. [23] presented a novel fast nonnegative matrix tri-factorization (F-NMTF) method for clustering the two kinds of nodes in a bipartite network. Zhang et al. [12] used a structure that can describe the relationship between the two kinds of nodes, the Weighted Symmetric Bipartite Matrix Factorization (WSBMF), to detect the overlapping communities in the bipartite network. Wang et al. [13] presented a new way to find and extend the core communities of the bipartite network, defining two parameters to represent the relationship between nodes of the same type and between nodes of different types, respectively. Zhou et al. [14] proposed the Bi-Louvain algorithm, which adapts the Louvain algorithm to the bipartite network, and applied a two-stage algorithm with partial modularization and quantification of partition strength to bipartite network community detection. Pesantez-Cabrera et al. [24] proposed an efficient algorithm, called biLouvain, which implements a set of heuristics for fast and accurate community detection in bipartite networks. Compared with unweighted one-mode projecting, weighted one-mode projecting can reduce structural information loss to a large extent. Still, its disadvantage is that the accuracy of the community detection results depends on the calculation of the weights, which carries substantial uncertainty and limits the use of such methods.

Processed Directly on the Bipartite Network
The advantage of community detection algorithms that process the bipartite network directly is that they effectively avoid the loss of structural information that occurs in unweighted projection, keeping the original network structural information to the maximum extent, and they also avoid the added noise introduced by weighted projection. Liu et al. [25] proposed a new algorithm, LP&BRIM, based on the label propagation (LP) algorithm and the BRIM algorithm, which is suitable for large-scale bipartite network detection. Beckett et al. [26] proposed two new algorithms, LPAwb+ and DIRTLPAwb+, which find communities by maximizing the bipartite network's weighted modularity and searching for the partition with the largest modularity. Sun et al. [27] proposed a new bipartite community attractor algorithm, BiAttractor, which extends the distance dynamic model Attractor to the bipartite network. Wang et al. [28] proposed a bipartite network overlapping community detection algorithm (MACD-BNS), which realizes overlapping community detection by optimizing four existing methods for calculating the similarity of nodes. Che et al. [29] proposed a new memetic algorithm, MATMCD-BN, which approximates the optimal global solution by local search through a new population initialization method and a new crossover operator. Chang et al. [30] proposed a new community detection algorithm, CBG&BEN, which is based on the complete bipartite graph (CBG) and a micro-network model (Bi-EgoNet). Gmati et al. [31] proposed a new algorithm called Bi-Comdet, whose main thrust is to stress the importance of grouping the two types of nodes into communities that have a full connection between their nodes. Taguchi et al. [32] proposed a new multi-label propagation algorithm, BIMLPA, which overcomes the limitations of previous propagation algorithms and has good speed and stability. Yen et al. [33] extended the stochastic block model (SBM) to bipartite networks and proposed biSBM; the biSBM improves community detection results over the general SBM when data are noisy and improves the model resolution limit by a factor of two.

Community Detection Evaluation Criteria
Without a scientific evaluation index for community detection results, there is no basis for judging a community detection algorithm, so how to evaluate the merits of community detection results is another important topic in community detection research. Especially for overlapping communities, the criteria for evaluating community structure are very complex, and it is difficult to give a unified and recognized evaluation index. Barber [16] extended the modularity of the traditional one-mode network to the bipartite network. Murata et al. [17] proposed another kind of modularity for bipartite networks, which can be represented by a pair of two numerical groups in two directions, allowing one-to-many correspondence of communities of different vertex types. Suzuki et al. [34] proposed a new modularity measure based on M. E. J. Newman's modularity, which applies to bipartite networks and non-uniform networks. Chang et al. [30] proposed a new modularity evaluation method, which extends modularity to overlapping community detection based on M. E. J. Newman's method. This evaluation criterion can evaluate overlapping communities found in bipartite networks. All evaluation criteria described above are limited by the modularity resolution [18]. The problem of the resolution limit was first proposed by Fortunato et al. [35] in 2007. Fortunato et al. [36] found that multi-resolution modularity may not be able to correctly identify the community structures of the many real networks with scale-free characteristics. Specifically, multi-resolution modularity may merge some small communities into large communities and may split large communities into small communities. Li et al. [37] proposed a quality function for community partitions, a partition density, to measure the quality of community structure in a bipartite network. Still, this quality function is only applicable to the non-overlapping community structure in the bipartite network.

Community Density
This section is divided into two parts: the definition of community density D, and the reasoning for why D alleviates the modularity resolution limitation to a certain extent. We define community density D to measure the quality of community division mainly because of the long-standing problems of modularity. Take the latest methods as examples. Chang et al. [30] proposed the modularity $EQ_b$, which is suitable for the overlapping community structure of the bipartite network, but it is still limited by the modularity resolution. Li [37] proposed Bipartite Partition Density, which can effectively alleviate the modularity resolution limitation but considers the community as a whole and cannot be used for overlapping communities. Although these methods can detect community structure, they do not consider community structure from a micro perspective, such as the relationship between node pairs within the community. Based on this, from the micro point of view, this paper considers the relationship between the node pairs within the community and puts forward an evaluation standard suitable for the overlapping community structure of the bipartite network, which can effectively alleviate the modularity resolution limitation.

Key Definitions
Given a bipartite network $G = (U, V, E)$ ($U \neq \emptyset$, $V \neq \emptyset$, $U \cap V = \emptyset$) that is unweighted and undirected, edges exist only between nodes of different types. $U = \{u \mid u \in U\}$ is the set of nodes of one type, and $V = \{v \mid v \in V\}$ is the set of nodes of the other type. $a$ represents the number of nodes of type U, and $b$ represents the number of nodes of type V. $E$ represents the set of edges connecting the two types of nodes, and $(u, v)$ represents a pair of nodes connected by edge $e_{(u,v)}$. An example diagram of a bipartite network is shown in Figure 2. There are five U nodes and six V nodes, with a total of thirteen edges.
In Figure 2, select a node pair (U1, V1); Figure 4 shows the set of neighbor node pairs of node pair (U1, V1).
In a bipartite network $G = (U, V, E)$, assume that $G$ is divided into $X$ communities, where the $t$-th community is expressed as $G_t = (U_t, V_t, E_t)$, and $\phi(u, v)$ indicates whether the node pair $(u, v)$ is a node pair within the community. $D^t_{(m,n)}$ represents the node pair density of node pair $(m, n)$ in community $G_t$ (Equation (8)). The summation runs over all V nodes connected to $m$, where $k$ is the serial number of the node $v_k$ connected to $m$; when $k = 1$, it means that $m$ is connected to V node $v_1$. For a node pair $(m, v_k)$, if the edge $e_{(m,v_k)}$ is within the $t$-th community, then $\phi(m, v_k) = 1$.
Suppose we use the bipartite network shown in Figure 2 for community detection, and the results are shown in Figure 1. There are two communities in Figure 2. In community 1, select node pair (U1, V1) to calculate $D^1_{(U1,V1)}$.
In the $t$-th community $G_t = (U_t, V_t, E_t)$, by adding and averaging the node pair densities $D^t_{(u_i,v_j)}$ of the node pairs $(u_i, v_j)$, we obtain the density $D^t$ of the $t$-th community, as shown in Formula (10). Here, $P_t$ represents the number of U nodes in the $t$-th community, $Q_t$ represents the number of V nodes in the $t$-th community, the numerator is the sum of the node pair densities of all node pairs in the $t$-th community, and the denominator is the number of node pairs in the $t$-th community.
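As a reading aid, the verbal definitions above can be written out as follows. This is a hedged sketch, not necessarily the authors' exact formulas: it assumes that the node pair density is the share of edges incident to $m$ or to $n$ that lie inside $G_t$, and it introduces $d_m$ and $d_n$ (the degrees of $m$ and $n$) and $|E_t|$ (the number of node pairs inside the community) as illustrative symbols.

```latex
% Assumed notation: d_m, d_n are node degrees; |E_t| counts node pairs in G_t.
\phi(u, v) =
\begin{cases}
1, & e_{(u,v)} \in E_t \\
0, & \text{otherwise}
\end{cases}
\qquad
D^{t}_{(m,n)} =
\frac{\sum_{k=1}^{d_m} \phi(m, v_k) + \sum_{l=1}^{d_n} \phi(u_l, n)}{d_m + d_n}
\qquad
D^{t} =
\frac{\sum_{(u_i, v_j) \in E_t} D^{t}_{(u_i, v_j)}}{|E_t|}
```

Under this reading, a node pair whose incident edges all stay inside the community has density 1, and a node pair with half of its incident edges leaving the community has density 1/2.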
In community 1, $P_t = 4$ and $Q_t = 4$; the calculation of $D^1$ is shown in Formula (11).
Likewise, we can obtain $D^2 \approx 0.78$. After obtaining the density of each community $G_t$, the density $D$ of the whole bipartite network is the average density of all communities, $D = \frac{1}{X} \sum_{t=1}^{X} D^t$, where $X$ represents the number of communities in the bipartite network $G = (U, V, E)$.
For the community detection shown in Figure 1 85. When we use community density D as an evaluation criterion of community detection, the higher the community density is, the closer the node pairs within the community are, and the sparser the node pairs between the communities are. This trend is very consistent with the community structure required by community detection. Therefore, community density can be used to determine whether community detection results are good or bad.

Reasoning Community Density Alleviates the Modularity Resolution Limitation
The root cause of the problem of modularity resolution limitation [18] is that modularity cannot effectively identify the results of community division. This problem has existed for quite a long time, but there has been no suitable method to completely solve this problem. The community density proposed in this paper can effectively solve the modularity resolution limitation in some cases. Now, let us look at an example.
Consider a bipartite network $G(U, V, E)$ that is a complete bipartite graph with $L(u)$ nodes of type $u$ and $L(v)$ nodes of type $v$, where $L(u) \geq 2$, $L(v) \geq 2$, and $(L(u) - L(v)) \bmod 2 = 0$. This bipartite network then has $NV = L(u) + L(v)$ vertices and $NE = L(u) \cdot L(v)$ edges. We call this type of graph an SCBG (Special Complete Bipartite Graph).
The community detection of this graph yielded the following two results:
R1: The whole network is assigned to a single community $G$.
R2: The network is evenly divided into two communities, $G_1$ and $G_2$. For community $G_1$, $L(u)_1 = \frac{1}{2} L(u)$ and $L(v)_1 = \frac{1}{2} L(v)$, and the total number of edges in community $G_1$ is $\frac{1}{2} NE$. Among them, the edges inside the community, $E_1^{in}$, account for half of the total number of edges in community $G_1$, and the edges leaving the community, $E_1^{out}$, account for the other half. Likewise, for community $G_2$, $L(u)_2 = \frac{1}{2} L(u)$ and $L(v)_2 = \frac{1}{2} L(v)$: $G_1$ owns half of the U nodes and V nodes, $G_2$ owns the other half, and the total number of edges in community $G_2$ is $\frac{1}{2} NE$. Among them, the edges inside the community, $E_2^{in}$, account for half of the total number of edges in community $G_2$, and the edges leaving the community, $E_2^{out}$, account for the other half.
An example of the above situation is given in Figure 5, where $L(u) = 4$ and $L(v) = 4$. Figure 5a is divided according to R1, and Figure 5b is divided according to R2.
Barber [16] proposed the modularity $Q_b$ suitable for bipartite networks. Assuming that a bipartite network can be expressed as $G(U, V, E)$, the modularity $Q_b$ of the bipartite network is given by Formula (15):
$$Q_b = \frac{1}{M} \sum_{c \in C} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( A_{ij} - \frac{d_i d_j}{M} \right) \delta_{(i,c)} \delta_{(j,c)} \tag{15}$$
where $M$ represents the number of edges in the bipartite network, $C$ represents the set of communities in the bipartite network, $c$ denotes the $c$-th community $G_c(U_c, V_c, E_c)$, and $m$ and $n$ represent the numbers of the two types of nodes in the bipartite network, respectively. $\delta_{(i,c)}$ represents the membership of node $i$ in community $c$: if node $i$ belongs to community $c$, then $\delta_{(i,c)}$ is 1; otherwise, it is 0. $d_i$ represents the degree of node $i$. $A_{ij}$ represents the adjacency matrix of the bipartite network: if there is an edge between $i$ and $j$, $A_{ij} = 1$; otherwise, $A_{ij} = 0$.
The modularity is calculated for each of the two community partition results, R1 and R2. For the community structure in Figure 5a, the calculation of the modularity $Q_{R1}$ is shown in Formula (16). For the community structure in Figure 5b, the calculation of the modularity $Q_{R2}$ is shown in Formula (17).
As Formulas (16) and (17) show, Barber's modularity is 0 for both community detection results. R1 and R2 are different partitions, yet $Q_{R1} = Q_{R2}$. R1 places the whole network in a single community. Logically, the best partition of a complete bipartite graph is the one where all the nodes are put in a single cluster, so that there are no cuts and no edges between clusters. The modularity proposed by Barber therefore cannot effectively judge the merits and demerits of these results.
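The equality of the two modularity values can be checked numerically. The sketch below assumes the standard form of Barber's bipartite modularity for a hard partition, $Q_b = \frac{1}{M}\sum_{c}\sum_{i \in U_c}\sum_{j \in V_c}(A_{ij} - d_i d_j / M)$; the function name `barber_modularity` and the node labels are illustrative, not taken from the paper.

```python
from itertools import product


def barber_modularity(edges, communities):
    """Barber's bipartite modularity for a hard partition (assumed standard form).

    edges: iterable of (u, v) pairs; communities: list of (U-node set, V-node set).
    """
    edge_set = set(edges)
    n_edges = len(edge_set)
    degree = {}
    for u, v in edge_set:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    total = 0.0
    for comm_u, comm_v in communities:
        # Sum A_ij - d_i * d_j / M over every (U, V) node combination in the community.
        for i, j in product(comm_u, comm_v):
            a_ij = 1 if (i, j) in edge_set else 0
            total += a_ij - degree.get(i, 0) * degree.get(j, 0) / n_edges
    return total / n_edges


# Complete bipartite graph with L(u) = L(v) = 4, as in the Figure 5 example.
U = [f"u{i}" for i in range(4)]
V = [f"v{j}" for j in range(4)]
E = list(product(U, V))

r1 = [(set(U), set(V))]                                         # one community
r2 = [(set(U[:2]), set(V[:2])), (set(U[2:]), set(V[2:]))]       # even split

q_r1 = barber_modularity(E, r1)
q_r2 = barber_modularity(E, r2)
print(q_r1, q_r2)  # both values are 0.0
```

In a complete bipartite graph every term $A_{ij} - d_i d_j / M$ is $1 - 1 = 0$, which is why any partition of it scores zero under this measure.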
Let us use the community density proposed in this paper to calculate the two partition results. D R 1 represents the community density divided according to R1, and D R 2 represents the community density divided according to R2.
For R1, for any node pair $(u, v)$ in community $G$, all of its neighbor node pairs are within the community, so the density $D_{(u,v)}$ of node pair $(u, v)$ equals 1, as shown in Formula (18).
For R2, in community $G_1$, for any node pair $(u, v)$, half of the neighbor node pairs of $(u, v)$ are inside the community and the other half are outside, so the density of each node pair is $\frac{1}{2}$. The situation in community $G_2$ is exactly the same as in $G_1$.
After calculation, $D_{R1} \neq D_{R2}$, with $D_{R1} - D_{R2} = \frac{1}{2}$. Therefore, community density can distinguish the merits and demerits of such results, which effectively alleviates the resolution limit of modularity.
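The same example can be reproduced in a few lines of Python. This is a minimal sketch under a stated assumption: the node pair density of $(m, n)$ is taken to be the share of edges incident to $m$ or to $n$ that lie inside the community, and the community density is the mean over the node pairs inside the community. The function names are hypothetical, and the exact formulas in the paper may differ in detail.

```python
from itertools import product


def node_pair_density(pair, edges, community):
    """Assumed form: share of edges incident to m or n that lie inside the community."""
    m, n = pair
    comm_u, comm_v = community
    # Edges at m plus edges at n; the pair (m, n) itself appears in both lists,
    # which mirrors a denominator of d_m + d_n.
    incident = [e for e in edges if e[0] == m] + [e for e in edges if e[1] == n]
    inside = sum(1 for (u, v) in incident if u in comm_u and v in comm_v)
    return inside / len(incident)


def community_density(edges, community):
    """Mean node pair density over the node pairs (edges) inside one community."""
    comm_u, comm_v = community
    internal = [(u, v) for (u, v) in edges if u in comm_u and v in comm_v]
    if not internal:
        return 0.0
    return sum(node_pair_density(p, edges, community) for p in internal) / len(internal)


def network_density(edges, communities):
    """Average of the community densities over all communities."""
    return sum(community_density(edges, c) for c in communities) / len(communities)


# Complete bipartite graph with L(u) = L(v) = 4.
U = [f"u{i}" for i in range(4)]
V = [f"v{j}" for j in range(4)]
E = list(product(U, V))

r1 = [(set(U), set(V))]
r2 = [(set(U[:2]), set(V[:2])), (set(U[2:]), set(V[2:]))]

d_r1 = network_density(E, r1)
d_r2 = network_density(E, r2)
print(d_r1, d_r2, d_r1 - d_r2)  # 1.0 0.5 0.5
```

Because communities are passed as independent node sets, the same functions also accept overlapping communities, where a node appears in more than one set.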

Definition of Community Similarity and Bi-EgoNet
Given two communities $G_1(U_1, V_1, E_1)$ and $G_2(U_2, V_2, E_2)$, $CD_{G_1,G_2}$ represents the similarity of the two communities, as shown in Formula (24).
In the DSNE (Density Sub-community Node-pair Extraction) algorithm, we use the Bi-EgoNet [11] network model proposed by Chang et al., which can detect the overlapping communities of the bipartite network from a microscopic point of view. Figure 6 shows an example of this model. As shown in Figure 6, in a bipartite network $G(U, V, E)$, the microscopic bipartite network model Bi-EgoNet of node pair $(u_2, v_1)$ is composed of a central node pair $(u_2, v_1)$, its neighbor node pair set $NeiPN(u_2, v_1)$, and the edges between these node pairs. $(u_2, v_1)$ is the Bi-Ego, that is, the central node pair; each neighbor node pair in $NeiPN(u_2, v_1)$ is a Bi-Alter. The algorithm is divided into two stages: the first stage constructs a sub-community centered on each node pair; the second stage combines all node pair sub-communities to obtain the global community structure.

Construct Sub-Communities for Each Node Pair
In this stage, it is important to calculate the node pair density (Equation (8)). The flow of the first stage is shown in Algorithm 1.
4: Each node pair $(u_i, v_j)$ is numbered: the first node pair is numbered 1, the second node pair is numbered 2, and so on, with the $t$-th node pair numbered $t$, until all node pairs have been numbered;
5: With Bi-Ego $(u_i, v_j)$ numbered as $t$ as the center, the initial value of $t$ is 1, the sub-

A New Strategy to Merge The Sub-Communities
For two communities to be merged, $G_1(U_1, V_1, E_1)$ and $G_2(U_2, V_2, E_2)$, the usual practice is to merge the node pairs and edges directly. However, this increases the noise, for example by adding node pairs that are not very closely connected to the newly merged community. In this regard, we provide a new merging idea: starting from the perspective of node pairs, we extract the central node pairs $(u_1, v_1)$ and $(u_2, v_2)$ from the two communities $G_1(U_1, V_1, E_1)$ and $G_2(U_2, V_2, E_2)$ to be merged, and form a new community $NG_1(U_1, V_1, E_1)$ from these two node pairs. For each sub-community $Sub_{(u_i,v_j)}(U_i, V_j, E_{u_i,v_j})$, we define an identifier, mark, which indicates whether the central node pair of the sub-community has participated in a merge.
The flow of the second stage is shown in Algorithm 2.

Algorithm 2 Use New Strategy To Merge Sub-communities.
Input: Sub-communities set Sub-DC;
Output: The merged community set NC;
1: Sort the sub-communities in the Sub-DC set in descending order of their number of node pairs, and add a mark to each sub-community with an initial value of 0;
2: Select the sub-community $G_i(U_i, V_i, E_i)$ with mark = 0 and the most node pairs to merge; the merged community is $NG_j(U_j, V_j, E_j)$. The initial value of $j$ is 1, the central node pair of $G_i(U_i, V_i, E_i)$ is selected as the central node pair of the merged community, and the mark of the sub-community $G_i(U_i, V_i, E_i)$ is set to 1;
3: Calculate the similarity $CD_{G_i,G_k}$ between sub-community $G_i(U_i, V_i, E_i)$ and sub-community $G_k(U_k, V_k, E_k)$, and the mark of the sub-community $G_k(U_k, V_k, E_k)$ is set to 1;
4: Repeat step 3 until all sub-communities have been traversed, then increment $j$ by 1;
5: Repeat steps 2, 3, and 4 until all sub-communities have marks of 1.
In the DSNE algorithm, parameter α is used to judge whether the neighbor node pair $(m, n)$ can join the sub-community $G_t(U_t, V_t, E_t)$. We believe that, if half of all the neighbor node pairs of a node pair are in the community, then the node pair should be in the community, so parameter α is set to 0.5 ± 0.1. The β parameter is defined as the ratio of all identical node pairs in both communities $G_i(U_i, V_i, E_i)$ and $G_k(U_k, V_k, E_k)$ to the total node pairs; parameter β is set to 0.5 ± 0.1.
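The control flow of the merging stage can be sketched as follows. Several points are assumptions made for illustration rather than details confirmed by the text: the similarity is taken to be the number of shared node pairs divided by the number of node pairs in either sub-community, a sub-community is absorbed when that similarity reaches β, and the merged community is represented by the set of central node pairs it has collected. The names `similarity` and `merge_sub_communities` are hypothetical.

```python
def similarity(pairs_a, pairs_b):
    """Assumed similarity: shared node pairs over all node pairs of both sub-communities."""
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 0.0


def merge_sub_communities(subs, beta=0.5):
    """subs maps each central node pair to the set of node pairs in its sub-community."""
    # Step 1: sort by number of node pairs, descending; every mark starts at 0 (False).
    order = sorted(subs, key=lambda center: len(subs[center]), reverse=True)
    marked = {center: False for center in order}
    merged = []
    # Steps 2 to 5: take the largest unmarked sub-community as a seed, then scan the rest.
    for seed in order:
        if marked[seed]:
            continue
        marked[seed] = True
        new_community = {seed}  # built from central node pairs only
        for other in order:
            # Assumption: absorb an unmarked sub-community when similarity >= beta.
            if not marked[other] and similarity(subs[seed], subs[other]) >= beta:
                new_community.add(other)
                marked[other] = True
        merged.append(new_community)
    return merged


subs = {
    ("u1", "v1"): {("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2")},
    ("u2", "v2"): {("u1", "v2"), ("u2", "v1"), ("u2", "v2")},
    ("u3", "v3"): {("u3", "v3"), ("u3", "v4")},
}
merged = merge_sub_communities(subs, beta=0.5)
print(merged)
```

In this toy input the first two sub-communities share three of four node pairs (similarity 0.75), so their central node pairs end up in one merged community, while the third stays on its own.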

Complexity Analysis
The DSNE algorithm is completed in two stages, so we analyze the complexity of each stage in turn. Assume that there are $m$ nodes of type U and $n$ nodes of type V in the bipartite network. In constructing sub-communities for each node pair, the time complexity required for constructing sub-communities for each central node pair is $O(m^2 n^2)$, and, in merging sub-communities, the time complexity required is $O(m^2 n^2)$. Therefore, $O(m^2 n^2)$ is the time complexity of the algorithm. Since the algorithm does not use additional storage space during execution, the space complexity is $O(n)$.

Experimental Evaluation
In this section, we choose two methods, LOCD [9] and CBG&BEN [30], as the comparison algorithms; community density (D), NMI [38], and $EQ_b$ [11] as the evaluation indexes; and artificially generated networks [37] and real-world networks as the datasets to evaluate the DSNE algorithm. The LOCD algorithm and the CBG&BEN algorithm are written in the Java programming language, while the DSNE algorithm is written in the C++ programming language and runs on a 2.3 GHz processor with 8 GB of memory under the macOS Big Sur operating system. Finally, we verify the superiority of the proposed evaluation index, community density (D), over modularity in evaluating community detection results.

Contrast Algorithm
The DSNE algorithm proposed in this paper uses community density to detect the overlapping community structure in the bipartite network starting from node pairs. The LOCD algorithm uses the similarity of nodes to detect the overlapping community structure in the bipartite network from the graph's topology. The CBG&BEN algorithm uses Bi-EgoNet, a micro-network model, to detect the bipartite network's overlapping community structure based on the complete bipartite graph. These three methods have certain similarities, and the LOCD and CBG&BEN algorithms are relatively recent and produce excellent community detection results. Therefore, comparing the DSNE algorithm with these two algorithms can better illustrate the DSNE algorithm's superiority. We briefly describe the two algorithms below.
Wang et al. [9] proposed LOCD algorithm for selecting the right partitions in bipartite networks in 2017. This method contains two parameters; one parameter is used to represent the similarity between nodes of the same types, and the other parameter is used to represent the similarity between nodes of different types. The two types of nodes are processed separately. Firstly, extend the sub-community with the first parameter, which is the similarity between nodes of the same type. The similarity between nodes of different types is used to merge different types of nodes with the sub-community, and many sub-communities are obtained. Then, the sub-communities are merged according to some specific merge rules to obtain the global community structure.
In 2019, Chang et al. [30] proposed a community detection algorithm, CBG&BEN, which is based on the micro-network model (Bi-EgoNet) and the complete bipartite graph (CBG). First, the algorithm constructs the micro bipartite network model Bi-EgoNet for each node pair. Second, it extracts the basic complete bipartite graph from each Bi-EgoNet, and the set of local communities is obtained by merging rules. Third, according to specific merging rules, the global bipartite network community structure is obtained. Finally, the network's divergent nodes are collected and allocated to the corresponding global community, which yields the final global community structure.

Extended Modularity of Bipartite Networks EQ b
Chang et al. [30] extended the modularity of the bipartite network proposed by Barber [16] and proposed $EQ_b$, which is suitable for evaluating the detection results of overlapping communities in the bipartite network. The formula of $EQ_b$ is shown in (26).
We introduced Barber's modularity in Formula (15). Except for $\psi_{i,c}$, the meanings of the other symbols in $EQ_b$ are consistent with the modularity proposed by Barber. The membership coefficient $\psi_{i,c}$ represents the proportion of node $i$ belonging to community $c$. The membership coefficient $\psi_{i,c}$ should be normalized, that is, $0 \leq \psi_{i,c} \leq 1$ and $\sum_{c \in C} \psi_{i,c} = 1$. For example, if there are a total of $m + n$ edges connected to node $i$, among which $m$ edges belong to community $c$ and $n$ edges do not belong to community $c$, then $\psi_{i,c} = \frac{m}{m+n}$.

NMI
Andrea et al. [39] proposed a normalized mutual information index, NMI (Normalized Mutual Information), for community detection in complex networks. NMI is generally used as the evaluation index when we want to measure the difference between the community structure obtained by the algorithm and the known real community structure. NMI applies to many types of complex networks and is also applicable to evaluating the detection results of overlapping communities in the bipartite network. The value of NMI is between 0 and 1. The larger the NMI value is, the closer the algorithm's community structure is to the real community structure. The calculation formula of NMI is shown in (31).
$I(X : Y)$ represents the mutual information between $X$ and $Y$, $H(X)$ represents the entropy of $X$, and $H(X \mid Y)$ represents the conditional entropy of $X$ given $Y$. When $X = Y$, $H(X \mid Y) = 0$, and NMI reaches its maximum value of 1; when $X$ and $Y$ are completely different, $H(X \mid Y) = 1$, and NMI takes its minimum value of 0.
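To make the roles of mutual information and entropy concrete, the sketch below computes the common non-overlapping form $NMI = 2 I(X : Y) / (H(X) + H(Y))$ from two label lists. This is a simplification chosen for illustration: it is not the overlapping-community variant used in the experiments, and the function names are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a list of community labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """Mutual information between two labelings of the same nodes."""
    n = len(x)
    joint = Counter(zip(x, y))
    count_x, count_y = Counter(x), Counter(y)
    return sum(
        (c / n) * log((c / n) / ((count_x[a] / n) * (count_y[b] / n)))
        for (a, b), c in joint.items()
    )


def nmi(x, y):
    """Normalized mutual information for hard (non-overlapping) partitions."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 1.0  # both partitions put every node in one community
    return 2 * mutual_information(x, y) / (h_x + h_y)


truth = [0, 0, 1, 1]
same_up_to_relabeling = [1, 1, 0, 0]
unrelated = [0, 1, 0, 1]
print(nmi(truth, same_up_to_relabeling), nmi(truth, unrelated))
```

The first comparison gives a value of 1 (up to floating-point rounding) because the partitions agree up to a renaming of the labels; the second gives 0 because knowing one labeling says nothing about the other.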

Experiments on Artificially Generated Networks
We refer to the artificially generated network model proposed by Li [37]. The artificially generated network model has three parameters: the number of network nodes $N$, the number of network edges $|E|$, and the noise parameter λ. In Li's artificially generated network, each network has four communities with the same number of nodes. There are two different types of nodes, U and V, in each community, and each community contains the same number of U and V nodes. The number of edges is $|E| = 2N$, that is, each U node is associated with 4 V nodes. λ is the noise parameter of the network, which determines the number of edges between different communities. The value of λ ranges from 0 to 1. When λ is 0, the network is full of noise. When λ is 1, there is no noise in the network.
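A generator in the spirit of this model can be sketched as follows. The sampling rule is an assumption made for illustration, not Li's published procedure: each U node receives 4 distinct V neighbors, and each neighbor is drawn from the node's own community with probability λ and from another community otherwise. The function name, node labels, and the `seed` parameter are hypothetical.

```python
import random


def generate_network(n_nodes, lam, n_communities=4, degree=4, seed=0):
    """Sketch of a planted-community bipartite network with noise parameter lam.

    Assumes n_nodes is divisible by 2 * n_communities and that each community
    holds at least `degree` V nodes.
    """
    rng = random.Random(seed)
    per_type = n_nodes // 2              # equal numbers of U and V nodes
    size = per_type // n_communities     # U (and V) nodes per community
    membership = {f"u{i}": i // size for i in range(per_type)}
    membership.update({f"v{j}": j // size for j in range(per_type)})
    v_by_comm = [
        [f"v{j}" for j in range(c * size, (c + 1) * size)]
        for c in range(n_communities)
    ]
    edges = set()
    for i in range(per_type):
        u = f"u{i}"
        home = membership[u]
        chosen = set()
        while len(chosen) < degree:
            if rng.random() < lam:
                comm = home  # intra-community edge (no noise)
            else:
                comm = rng.choice([c for c in range(n_communities) if c != home])
            chosen.add(rng.choice(v_by_comm[comm]))
        edges.update((u, v) for v in chosen)
    return edges, membership


clean_edges, clean_membership = generate_network(256, lam=1.0)
noisy_edges, noisy_membership = generate_network(256, lam=0.0)
print(len(clean_edges), len(noisy_edges))  # 512 512, i.e. |E| = 2N
```

With λ = 1 every edge stays inside its community, and with λ = 0 every edge crosses communities, matching the two extremes described above.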
In this experiment, an artificially generated network with 256 nodes is selected. There are 128 U nodes and 128 V nodes in the network. There are four communities in the network. Each community contains 32 U nodes and 32 V nodes. There are 128 edges in each community. The initial value of λ is 0.1, and it is increased by 0.1 nine times until it reaches the maximum value of 1. The DSNE algorithm, LOCD algorithm, and CBG&BEN algorithm are applied to the designed network to obtain the community detection results. Table 1 shows the results of the NMI evaluation. As Table 1 shows, the NMI index of the DSNE algorithm is better than that of the other two algorithms in most cases, which indicates that the community structure found by the DSNE algorithm is closer to the real community structure than those found by the other two algorithms. The average NMI of the DSNE algorithm is 2.6% higher than that of the CBG&BEN algorithm and 12.5% higher than that of the LOCD algorithm. As λ decreases from 0.6 to 0.4, the value of the NMI index shows a cliff-like downward trend. As Table 1 shows, when λ fluctuates around 0.5, the impact on the algorithm's accuracy is huge. When λ ∈ [0.1, 0.4], the value of NMI does not exceed 0.1, which indicates that less than 10% of the nodes in the communities found by the algorithm correspond to the real network. Therefore, when λ ∈ [0.1, 0.4], the community structures found by the three algorithms are invalid.
The decreasing speed of NMI represents the anti-noise ability of an algorithm. We set λ ∈ [0.5, 1] and compare the decreasing speed of the three algorithms under different λ. The results are shown in Table 2. The average decreasing speed of the DSNE algorithm is 2.7% less than that of CBG&BEN and 7.3% less than that of LOCD. Therefore, the anti-noise ability of the DSNE algorithm is superior to that of the other two algorithms.
In artificially-generated networks, the number of nodes directly affects the network structure information. Therefore, we set λ ∈ [0.5, 1] and run the DSNE algorithm on artificially-generated networks with 64, 128, and 256 nodes. The results are shown in Figure 7. Under different N, as noise increases, that is, as λ decreases, NMI decreases at different rates. Table 3 shows the decrease rate of NMI under different N. According to Table 3 and Figure 7, when N = 256, NMI decreases slowly while λ > 0.7 and decreases rapidly when λ hovers around 0.5. When N = 128, NMI shows a strong downward trend when λ hovers around 0.6. When N = 64, NMI shows a strong downward trend when λ hovers around 0.9. Therefore, as the number of network nodes N increases, the value of λ at which NMI drops sharply decreases; that is to say, more noise is needed to make the algorithm imprecise. When N is 64, 128, and 256, the corresponding average decline rates of NMI are 29.6%, 22.6%, and 17.%, which indicates that as N increases, NMI declines more slowly, that is, the stability of the algorithm improves. Besides, as N increases, there is more structural information in the network, and the average value of NMI also increases.
In other words, the matching degree between the community detected by the algorithm and the real community is higher.
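The exact formula for the decline rate of NMI is not given in this section, so the following is only one plausible reading: the relative drop in NMI between consecutive values of λ, averaged over all steps as λ decreases. The function name `average_decline_rate` and the sample values are hypothetical and are not taken from Tables 2 or 3.

```python
def average_decline_rate(nmi_by_lambda):
    """Average relative drop in NMI per step as lambda decreases.

    nmi_by_lambda maps each lambda value to the NMI measured at that value.
    Assumption: the decline rate of one step is (previous - current) / previous.
    """
    lambdas = sorted(nmi_by_lambda, reverse=True)   # from low noise to high noise
    rates = []
    for hi, lo in zip(lambdas, lambdas[1:]):
        prev, curr = nmi_by_lambda[hi], nmi_by_lambda[lo]
        if prev > 0:                                # skip undefined ratios
            rates.append((prev - curr) / prev)
    return sum(rates) / len(rates) if rates else 0.0


# Hypothetical NMI values, for illustration only.
sample = {1.0: 1.0, 0.9: 0.8, 0.8: 0.4}
print(average_decline_rate(sample))  # roughly 0.35
```

Under this reading, a smaller average decline rate means that NMI degrades more gradually as noise is added, which is how the text interprets anti-noise ability.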

Experiments on Real-World Bipartite Network
In this section, we verify the performance of the DSNE algorithm on real-world networks. The CBG&BEN and LOCD algorithms are used for comparison, and eleven real bipartite network datasets from different domains are used to evaluate the DSNE algorithm. The detailed information of these networks is shown in Table 4, where m and n represent the numbers of the two types of nodes in the bipartite network, |E| represents the number of edges between nodes, and $\langle k \rangle$ represents the average degree of the network. Taking the SAC dataset in Figure 2 as an example, there are six U nodes and five V nodes, and the network has a total of thirteen edges, so m = 6, n = 5, |E| = 13, and $\langle k \rangle = \frac{13+13}{5+6} \approx 2.36$.
To better explain the datasets in Table 4, the South African Companies bipartite network (SAC) is taken as an example. Figure 2 shows the SAC dataset. The square nodes (U1 to U6) represent different leaders, the round nodes (V1 to V5) represent different companies, and the edges between nodes represent the leadership relationships between leaders and companies. The community detection results obtained by running the DSNE algorithm are given in Figure 1. The merged community set NC contains two communities, $NG_1(U_1, V_1, E_1)$ and $NG_2(U_2, V_2, E_2)$. In $NG_1$, $E_1 = \{e_{(U1,V1)}, e_{(U1,V2)}, e_{(U1,V3)}, e_{(U2,V1)}, e_{(U2,V3)}, e_{(U3,V3)}, e_{(U3,V4)}, e_{(U4,V1)}, e_{(U4,V3)}\}$. In $NG_2$, $U_2 = \{U5, U6\}$, $V_2 = \{V1, V2, V5\}$, and $E_2 = \{e_{(U5,V2)}, e_{(U5,V5)}, e_{(U6,V1)}, e_{(U6,V2)}\}$.
Table 5 shows the number of communities found in the 11 datasets by the three methods. The DSNE and CBG&BEN algorithms can effectively obtain the number of communities, while the LOCD algorithm cannot effectively obtain the community structure in some datasets.
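The average degree quoted for the SAC example can be checked with a few lines of code. Every edge contributes to the degree of one U node and one V node, so the sum of degrees is 2|E| and the average is 2|E| / (m + n). The helper name `average_degree` is introduced here for illustration only.

```python
def average_degree(m, n, num_edges):
    """Average degree of a bipartite network with m + n nodes and num_edges edges."""
    return 2 * num_edges / (m + n)


# SAC example from the text: m = 6, n = 5, |E| = 13.
print(round(average_degree(6, 5, 13), 2))  # 2.36
```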
The LOCD algorithm cannot effectively obtain the number of communities in SAC; the main reason is that the SAC dataset contains too little network information for the LOCD algorithm, so the number of communities it obtains in the SAC dataset is 1. In the COL dataset, the LOCD algorithm is considered ineffective because it still fails to obtain a sufficient number of communities after a long run (6 h). In most datasets, more communities is not always better: the more communities there are, the more noise interferes in the merging process, the fewer communities take part in merging, the lower the merge rate, and the final number of communities is too large. The DSNE algorithm is superior to the CBG&BEN and LOCD algorithms in the number of communities in most datasets.
Figure 8 shows the experimental results of the three algorithms evaluated by EQ b on the 11 datasets. From Figure 8, we can conclude that the EQ b of the DSNE algorithm is better than that of the other two algorithms on most datasets. Because the LOCD algorithm cannot effectively detect the community structure in the SAC and COL datasets, the LOCD modularity on these two datasets is 0. Except for the CL, PCD, and CRIME datasets, the DSNE algorithm performs better than the other two algorithms. In the CL dataset, the EQ b of DSNE is 0.02 less than that of the CBG&BEN algorithm. In the PCD dataset, it is 0.11 less than that of the CBG&BEN algorithm, and in the CRIME dataset, it is 0.01 less than that of the LOCD algorithm. In conclusion, the average EQ b of the DSNE algorithm is 5.5% higher than that of the CBG&BEN algorithm and 33.3% higher than that of the LOCD algorithm.
Figure 9 shows the experimental results of the three algorithms evaluated by D on the 11 datasets. Except for the Maria, AR, and DUS datasets, the D of the DSNE algorithm is better than that of the LOCD algorithm.
In the Maria dataset, the D of the DSNE algorithm is 14% less than that of the LOCD algorithm; in the AR dataset, it is 5% less; and in the DUS dataset, it is 6% less. In the SAC and GP datasets, because the LOCD algorithm could not effectively divide the communities, the D of the LOCD algorithm on these two datasets is 0. Compared with the CBG&BEN algorithm, the D of DSNE is better than that of the CBG&BEN algorithm in all datasets. In the SW, GP, and CRIME datasets, the D of the DSNE algorithm is 17%, 17%, and 12% higher than that of the CBG&BEN algorithm, respectively. In general, the DSNE algorithm's performance on D is better than that of the LOCD and CBG&BEN algorithms. Using D as the evaluation criterion, the average D of the DSNE algorithm is 8.6% higher than that of the CBG&BEN algorithm and 30.6% higher than that of the LOCD algorithm.
Table 6 shows the results of the comparison between D and EQ b on different datasets using the DSNE and CBG&BEN algorithms for community detection. In the SCBG dataset, EQ b cannot evaluate the detected community structure properly: both algorithms score 0 under EQ b, whereas the D value of CBG&BEN is 0.5 more than that of DSNE, because DSNE adopts a stricter strategy when merging sub-communities. In the SW, CL, and DUS datasets, the EQ b values of the CBG&BEN and DSNE algorithms are relatively close; within the range of effective data accuracy, they are the same. We can see from Table 6 that, when D is used as the evaluation index, the two algorithms perform differently on these three datasets. So, when EQ b gives the same value for a dataset, we can use D to measure the merits and demerits of community detection.
In the DUS and Maria datasets, with EQ b as the evaluation standard and the CBG&BEN and DSNE algorithms used for community detection, the EQ b value of the Maria dataset is better than that of the DUS dataset. Still, when D is used as the evaluation standard, the Maria dataset's D value is lower than that of the DUS dataset, which indicates that there are more edges between communities and that the community division is relatively fragmented and incomplete.

Conclusions
In this paper, we propose a new evaluation index, community density D, which alleviates the resolution limit of modularity to a certain extent. In addition, we propose a novel algorithm, DSNE, which uses the proposed community density D to obtain communities. In the second stage of the DSNE algorithm, community merging, we put forward a new merging idea that reduces the increase of noise during merging.
The DSNE algorithm is evaluated on artificial and real-world networks and compared with the CBG&BEN and LOCD algorithms. Experimental results show that the DSNE algorithm can detect meaningful community structures in artificial and real-world networks, and that its accuracy and effectiveness are better than those of the LOCD and CBG&BEN algorithms. The average EQ b of the DSNE algorithm is 5.5% higher than that of the CBG&BEN algorithm and 33.3% higher than that of the LOCD algorithm. The average D of the DSNE algorithm is 8.6% higher than that of the CBG&BEN algorithm and 30.6% higher than that of the LOCD algorithm. In the future, we will carry out further research on the following aspects: finding the global optimal solution based on the local optimal solution in the first stage of the algorithm, and finding a more effective bipartite network community detection algorithm based on community density D.

Data Availability Statement: Not applicable; the study does not report any data.