DMFO-CD: A Discrete Moth-Flame Optimization Algorithm for Community Detection

In this paper, a discrete moth-flame optimization algorithm for community detection (DMFO-CD) is proposed. The representation of solution vectors, the initialization, and the movement strategy of the continuous moth-flame optimization are adapted in DMFO-CD so that it can solve the discrete community detection problem. In this adaptation, the locus-based adjacency representation is used to represent the positions of moths and flames, and the initialization process considers the community structure and the relations between nodes without requiring any knowledge of the number of communities. Solution vectors are updated by the adapted movement strategy, which uses a single-point crossover to imitate the distance, a two-point crossover to compute the movement, and a single-point neighbor-based mutation that enhances exploration and balances exploration and exploitation. The fitness function is defined based on modularity. The performance of DMFO-CD was evaluated on eleven real-world networks, and the results were compared with five well-known community detection algorithms, GA-Net, DPSO-PDM, GACD, EGACD, and DECS, in terms of modularity, NMI, and the number of detected communities. Additionally, the results were statistically analyzed by the Wilcoxon signed-rank and Friedman tests. Compared with the other algorithms, the proposed DMFO-CD is competitive in detecting the correct number of communities with high modularity.


Introduction
The analysis of complex networks in real-world applications such as social, biological, metabolic, and paper citation networks is receiving increasing attention from researchers and experts [1,2]. The structure and function of a real-world network can be studied through graph features such as the small-world effect, the power law, or network transitivity [1,2]. An important issue in most real-world networks is finding their hidden structures. Community detection (CD) identifies these structures in a complex network, where the density of edges inside a structure is higher than outside it. Because the members of a community are more similar to each other, community detection can be used as a tool for analyzing the structure of complex networks [3]. CD plays a significant role in social network analysis, including the identification of friendship groups, relationship analysis, the identification of influential people, the detection of terrorist attacks, link prediction, and the identification of classes in COVID-19 datasets [1,4,5].
Each network is mathematically represented by a graph consisting of nodes and edges, in which the nodes are connected to each other by edges. To detect a community in a complex network, there are different criteria such as betweenness [2], modularity [6], node

Related Work
Owing to their acceptable performance in solving complicated real-world problems, metaheuristics have been broadly used to find communities in complex networks. As shown in Figure 1, metaheuristic algorithms can be classified by their inspiration into three categories [26]: evolutionary, swarm intelligence, and physics-based algorithms. In the related literature, almost all algorithms in the evolutionary category have been used to solve the community detection problem. Despite their simplicity, swarm intelligence algorithms have been applied to this problem less often. In the following, some representative metaheuristic algorithms used to find communities in complex networks are described.
Evolutionary algorithms are inspired by the biological evolution of species in nature [24]. A population of individuals is iteratively improved by applying mutation, crossover, and selection operators. The genetic algorithm (GA), inspired by Darwin's theory of biological evolution, is one of the best-known algorithms in this category. In [79], GA was employed to find communities by optimizing the Newman modularity. That algorithm introduces a one-way crossover operation in which two chromosomes are selected as parents and a node is randomly selected from one of them. The community label of that node is then determined, all nodes with the same label are found, and their labels are copied to the other parent. In [58], Pizzuti proposed GA-Net, one of the state-of-the-art algorithms in community detection. GA-Net detects communities using GA with the community score (CS) as the objective function. CS measures the density of edges in each community, and a better partitioning yields a better CS.
Shi et al. [14] proposed GACD, which introduces a genetic representation for community detection called the locus-based adjacency representation (LAR). In addition, the authors used simple crossover and mutation operators based on this representation. In another work, Moradi et al. [60] proposed an enhanced genetic algorithm for community detection named EGACD, which adds a local search strategy to improve the accuracy and increase the convergence speed of the GACD algorithm. In EGACD, LAR is used to represent the individuals, and the modularity index is used to calculate the fitness. In [62], the GAOMA-net algorithm was proposed with a special representation in which a memory and a specific depth are dedicated to the network nodes. The memory values are then moved in depth by object migrating automata, and the evolution of the genes is performed by GA. GAOMA-net can overcome the premature convergence of GA and accelerate convergence. Liu et al. [76] proposed DECS to detect communities in evolving networks by adapting a genetic algorithm to community detection.
The second category is swarm intelligence algorithms, which imitate animal behaviors such as the movement of bird flocks, the echolocation behavior of bats, or the navigation mechanism of moths at night. One of the most popular algorithms in this category is particle swarm optimization (PSO) [18], which imitates the behavior of bird flocks. In this algorithm, each bird is considered a particle that is moved according to its current, local-best, and global-best positions. Rahimi et al. [61] proposed a multi-objective particle swarm optimization algorithm for community detection in complex networks named MOPSO-Net, in which the PSO algorithm is adapted as a discrete algorithm by a two-point crossover. First, a crossover is performed between the current position and the local-best position; then, a two-point crossover is performed between the resulting position and the global-best position. Li et al. [13] developed an algorithm called DPSO-PDM, an improved PSO that controls the motion of each particle relative to its difference from the global best. With this strategy, when the particle diversity decreases, the algorithm tries to increase it, and vice versa. Li et al. [80] proposed DESSO/CD, a hybridized version of an improved DE and the social spider optimization (SSO) [81] algorithm. In this algorithm, the population is initialized and moved by the SSO algorithm, the similarity of nodes is used as the local fitness function, and the population is further improved by the improved DE.
Liu et al. [82] proposed the DMFO algorithm, a new algorithm for clustering, whose aim is equivalent to that of community detection. They adapted MFO by redesigning its movement strategies for a discrete algorithm and used kernel k-means and ratio cut as the multi-objective functions. Zhao et al. [83] proposed ICSC, an improved CS algorithm for detecting communities in protein-protein interaction networks. Zhang et al. [84] proposed an algorithm named WOCDA for community detection by changing the motion equation of WOA. The movement strategy of WOA is adapted by updating the node label with the label held by most of its neighbors, a one-way crossover, and updating the node label with a random neighbor's label.
The third category comprises physics-based algorithms, which are inspired by physical rules in nature. Guendouz et al. [85] applied the black hole optimization algorithm to the CD problem. In this algorithm, the initialization, two new strategies, and the evolution process enhance the performance. Liu et al. [86] proposed the EMACD algorithm, an evolutionary algorithm based on a membrane system for solving the community detection problem. Kumar et al. [87] used graph embedding to obtain a low-dimensional vector representation that preserves the topological features of the network; the communities are then detected by a gravitational search algorithm and k-means.

The MFO Algorithm
The moth-flame optimization (MFO) algorithm was proposed by Mirjalili in 2015 [63] and is inspired by the navigation mechanism of moths in nature. By maintaining a fixed angle with respect to the moon, moths can fly long distances in a straight line at night, a mechanism called transverse orientation. This mechanism is effective only when the light source is far away; flying toward a nearby light causes moths to move in a spiral path, as shown in Figure 2. In the MFO algorithm, moths update their positions to reach the optimum solution by moving toward the flames in a spiral path. MFO is a population-based algorithm in which the positions of moths and flames are stored in the N × D matrices M and F, as shown in Equations (1) and (2).
where N is the population number and D is the dimension of the problem. In the first iteration, F is the sorted moths' population based on their calculated fitness. For other iterations, the M and F are merged and sorted based on their fitness such that their first N solutions are considered as new F. When the flames are identified, each moth is assigned to a flame and its position is updated with a logarithmic spiral equation as shown in Equation (3).
where S is the spiral function, $D_i$ is the distance between the i-th moth and the j-th flame, as described in Equation (4), t is a random number in the range [−1, 1], and b is a constant that defines the shape of the spiral.

$$D_i = |F_j - M_i| \quad (4)$$
To increase the exploitation ability of MFO, the number of flames is decreased in the course of iterations as calculated by Equation (5).
where T is the maximum number of iterations and l is the current iteration number.
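The continuous update described in this section can be sketched as follows. This is a minimal illustration that assumes the commonly cited form of the MFO equations: the spiral $S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_j$ for Equation (3) and $\text{flame\_no} = \text{round}(N - l \cdot (N - 1)/T)$ for Equation (5). The function names `flame_count` and `spiral_update` are hypothetical and chosen only for this sketch.

```python
import math
import random


def flame_count(n_moths, iteration, max_iter):
    """Number of flames kept at a given iteration (assumed form of Equation (5))."""
    return round(n_moths - iteration * (n_moths - 1) / max_iter)


def spiral_update(moth, flame, b=1.0, rng=random):
    """Move one moth around one flame along a logarithmic spiral, per dimension."""
    new_position = []
    for m, f in zip(moth, flame):
        distance = abs(f - m)          # D_i = |F_j - M_i|, Equation (4)
        t = rng.uniform(-1.0, 1.0)     # random number in [-1, 1]
        step = distance * math.exp(b * t) * math.cos(2 * math.pi * t)
        new_position.append(step + f)  # assumed form of Equation (3)
    return new_position
```

With this form, the number of flames shrinks linearly from N toward 1, so late iterations concentrate the moths around the best solutions found so far.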

DMFO-CD: Discrete Moth-Flame Optimization Algorithm for Community Detection
This section explains the proposed discrete moth-flame optimization algorithm for community detection (DMFO-CD). The methodology for implementing DMFO-CD, shown in the flowchart in Figure 3, has two main steps: initialization and movement. In the initialization step, N moths are distributed over the restricted search space, and the fitness of each moth is calculated using the objective function. Thereafter, the moth population is sorted by fitness and taken as the flame population. In the second step, moths are moved around flames by an adapted movement strategy to update their positions. This process iterates until the termination criterion is satisfied. The procedure of the proposed DMFO-CD algorithm is explained in detail below.

Initialization
In the initialization step, the locus-based adjacency representation (LAR) [14] is used to show the community structure of the network.

Representation
In the proposed DMFO-CD, the positions of moths and flames are represented using LAR as solution vectors, so that moth $M_i$ and flame $F_i$ are D-dimensional vectors $M_i = \{m_{i1}, \ldots, m_{iD}\}$ and $F_i = \{f_{i1}, \ldots, f_{iD}\}$, where D is the number of nodes in the network. If there exists an edge between nodes r and s in the network, then the value of $m_{ir}$ in the solution vector $M_i$ can be set to s, which means that nodes r and s belong to the same community. Once the network is represented by LAR, the nodes that are connected to each other in a solution vector $M_i$ form a connected component. To detect the communities hidden in $M_i$, a decoding procedure identifies candidate communities by tracking connected components. To illustrate the use of LAR in DMFO-CD, consider graph G, a network of ten nodes depicted in Figure 4a, and the solution vector $M_i$ shown in Figure 4b. The communities decoded from the connected components are shown in Figure 4c,d.
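The decoding step can be illustrated with a short sketch. It assumes 0-based node ids and uses a union-find structure to track connected components; the paper does not state which data structure is used, so both choices, as well as the name `decode_lar`, are assumptions made for illustration.

```python
def decode_lar(solution):
    """Decode a locus-based adjacency vector into communities.

    solution[r] = s means that nodes r and s are linked (0-based ids, an
    assumption of this sketch). Communities are the connected components.
    """
    parent = list(range(len(solution)))

    def find(x):
        # Follow parent links to the component root, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for r, s in enumerate(solution):
        root_r, root_s = find(r), find(s)
        if root_r != root_s:
            parent[root_r] = root_s  # merge the two components

    groups = {}
    for node in range(len(solution)):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())
```

For example, the vector `[1, 2, 0, 4, 3]` links nodes 0, 1, and 2 into one component and nodes 3 and 4 into another, so the number of communities follows from the vector itself and does not need to be fixed in advance.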

Initialization
To initialize the moth population using LAR, N solution vectors are generated and distributed in the search space. The initialization process assigns to dimension i either a randomly chosen neighbor of node i or its best neighbor [88]. The best neighbor of node i is the neighbor that shares the most common neighbors with it. An example of finding the best neighbor is shown in Figure 5. To find the best neighbor of node "3" in graph G, each of its neighbors is checked, and the node with the most common neighbors is selected. If more than one node has the most common neighbors, one of them is selected randomly. To initialize a moth's position, a number rand is generated randomly between 0 and 1; when rand is less than the fixed parameter Prob (rand < Prob), a random neighbor is selected; otherwise (rand ≥ Prob), the best neighbor of the node is assigned. This assignment guarantees that every entry in the initial population corresponds to an existing connection in the network. Then, the fitness of each moth's solution vector is calculated using the objective function, and the sorted moth population is taken as the flame population.
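A minimal sketch of this initialization is given below. It assumes the network is stored as a list of neighbor sets `adj` with 0-based node ids and that every node has at least one neighbor; the helper names `best_neighbor` and `init_moth` are hypothetical.

```python
import random


def best_neighbor(adj, node, rng=random):
    """Neighbor sharing the most common neighbors with `node`; ties broken at random."""
    shared = {v: len(adj[node] & adj[v]) for v in adj[node]}
    top = max(shared.values())
    return rng.choice(sorted(v for v, count in shared.items() if count == top))


def init_moth(adj, prob, rng=random):
    """Build one LAR solution vector: position i holds a neighbor of node i."""
    moth = []
    for node in range(len(adj)):
        if rng.random() < prob:
            moth.append(rng.choice(sorted(adj[node])))  # random neighbor
        else:
            moth.append(best_neighbor(adj, node, rng))  # best neighbor
    return moth
```

Because both branches only ever pick a neighbor of node i, each generated vector is a valid LAR encoding of the network.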


Movement Strategy
In the continuous MFO, the moth flies around the flame in a spiral path as shown in Equation (3). Since community detection is a discrete problem, the canonical MFO must be adapted to detect the communities of a network. Thus, in this paper, the canonical MFO is adapted by altering the distance calculation and the spiral flight movement. The proposed adaptation introduces: (1) a single-point crossover to calculate the distance, (2) a two-point crossover between the moth's solution vector and its corresponding flame as the movement strategy, and (3) a single-point neighbor-based mutation to increase the exploration ability.

Distance Imitating Using Single-Point Crossover
In canonical MFO, the distance ($D_i$) between the moths' solution vectors and their corresponding flames is calculated using Equation (4). DMFO-CD imitates the distance calculation and generates $D_i$ by adapting a single-point crossover (⨞) operator [60]. The crossover operator combines two parent solutions $M_i$ and $F_i$ and generates two new solutions, Child1 and Child2, by Equation (6).
Then, the fitness of each generated child is calculated, and the one with the better fitness is selected as $D_i$ by Equation (7).
An example of generating new solutions using the single-point crossover is shown in Figure 6, where Figure 6a,b shows the solution vectors $M_i$ and $F_i$ considered as parents. In Figure 6c, a crossover point is randomly selected. To produce Child1, the values of $F_i$ from the beginning to the crossover point are copied, followed by the remaining values of $M_i$ from the crossover point to its end. Child2 is produced in the reverse order: the first values are copied from $M_i$, followed by the values of $F_i$ from the crossover point to its end. Then, as shown in Figure 6d, Child2 is selected as $D_i$ due to its better fitness.
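The distance-imitating step can be sketched as below. The sketch assumes solution vectors are Python lists of equal length (at least two entries) and that `fitness` is any callable returning a value to maximize; the function name `single_point_crossover` is hypothetical.

```python
import random


def single_point_crossover(moth, flame, fitness, rng=random):
    """Return the fitter of the two children built from `moth` and `flame`."""
    point = rng.randrange(1, len(moth))        # cut strictly inside the vector
    child1 = flame[:point] + moth[point:]      # head of F_i, tail of M_i
    child2 = moth[:point] + flame[point:]      # head of M_i, tail of F_i
    return max((child1, child2), key=fitness)  # plays the role of D_i
```

Since both parents are valid LAR vectors, every position of either child still holds a neighbor of the corresponding node, so no repair step is needed after the crossover.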


The Movement Strategy Using Two-Point Crossover
To adapt the MFO movement strategy, a two-point crossover (⨝) [61] is applied between $D_i$ and the corresponding flame. The corresponding flame of $M_i$ is either $F_i$ or $F_{flame\_no}$, in which flame_no is calculated by Equation (5). As in the canonical MFO, the moths update their positions with respect to their corresponding flames by Equation (8).
In this two-point crossover, two points are randomly selected as crossover points, and the values of $D_i$ and the corresponding flame are combined around these points to generate two new solutions, MChild1 and MChild2. Then, the fitness of each child is calculated, and the one with the better fitness is selected as $M_{i\text{-tmp}}$ using Equation (9).
For illustration, an example of the two-point crossover is provided in Figure 7, in which Figure 7a,b shows the solution vectors $F_i$ and $D_i$, and Figure 7c shows a two-point crossover that generates MChild1 and MChild2. In Figure 7d, MChild2 is selected as $M_{i\text{-tmp}}$ due to its better fitness.
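A minimal sketch of this movement step follows. It assumes a standard two-point crossover that swaps the segment between the two cut points, and it assumes the canonical MFO pairing rule in which moth i uses flame i while i is below flame_no and the last kept flame otherwise. Both helper names are hypothetical, and indices are 0-based.

```python
import random


def corresponding_flame(flames, i, flame_no):
    """Flame paired with moth i: F_i while i < flame_no, else the last kept flame."""
    return flames[i] if i < flame_no else flames[flame_no - 1]


def two_point_crossover(distance, flame, fitness, rng=random):
    """Swap the segment between two random cut points and keep the fitter child."""
    lo, hi = sorted(rng.sample(range(len(flame) + 1), 2))
    child1 = distance[:lo] + flame[lo:hi] + distance[hi:]
    child2 = flame[:lo] + distance[lo:hi] + flame[hi:]
    return max((child1, child2), key=fitness)  # plays the role of M_i-tmp
```

The two children are complementary: whatever segment one child takes from the flame, the other takes from $D_i$, so evaluating both covers the two ways of mixing the parents for a given pair of cut points.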



Single-Point Neighbor-Based Mutation
To increase the exploration ability of the DMFO-CD algorithm, a single-point neighbor-based mutation [58] is performed on all moths' solution vectors. In this mutation, a node of each $M_{i\text{-tmp}}$ is randomly selected, and its value is replaced by the number of a node selected randomly from its neighbor set. This restriction guarantees that the resulting solution vector $M_i$ does not leave the solution space. Figure 8 further illustrates how this mutation on $M_{i\text{-tmp}}$ results in $M_i$.
After updating all moths' solution vectors, their fitness is calculated using the fitness function explained in the next section. Then, to determine the flame population for the next iteration, the current moth and flame populations are merged and sorted by fitness value. Thereafter, the N best solutions are selected as the new flames. The pseudo-code of the proposed DMFO-CD is shown in Algorithm 1.
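The mutation and the flame update can be sketched together. The sketch assumes the same list-of-neighbor-sets structure `adj` with 0-based node ids and a `fitness` callable to maximize; the names `neighbor_mutation` and `update_flames` are hypothetical.

```python
import random


def neighbor_mutation(moth_tmp, adj, rng=random):
    """Replace the value at one random position by a random neighbor of that node."""
    moth = list(moth_tmp)                       # leave the input untouched
    node = rng.randrange(len(moth))
    moth[node] = rng.choice(sorted(adj[node]))
    return moth


def update_flames(moths, flames, fitness, n):
    """Merge moths and flames, sort by fitness (best first), and keep the n best."""
    return sorted(moths + flames, key=fitness, reverse=True)[:n]
```

Merging the two populations before sorting means that a good flame is never lost unless at least N better solutions exist.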

Fitness Function
The fitness function measures the quality of a partitioning of the network during the optimization process and guides the algorithm toward the optimum communities; it therefore plays a key role. In this study, the modularity Q, proposed by Newman et al. [11], is used to evaluate the fitness of moths. A greater modularity value indicates a better partitioning. If a network is partitioned into k communities, its modularity can be calculated by Equation (10) [60]:
$$Q = \sum_{s=1}^{k} \left[ \frac{l_s}{m} - \left( \frac{d_s}{2m} \right)^2 \right] \quad (10)$$
where m is the number of edges in the network, k is the number of detected communities, $l_s$ is the number of edges joining nodes of community s, and $d_s$ is the sum of the degrees of the nodes belonging to community s. In Equation (10), $l_s/m$ is the fraction of edges inside a community, and its value represents the strength of that community; $(d_s/2m)^2$ is the expected fraction of such edges in a random network without any community structure, and it represents the weakness of a community. The partitioning quality is therefore obtained by subtracting the second term from the first.
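As a worked illustration of the modularity terms defined above, the following sketch computes Q for an undirected, unweighted network given as an edge list. The function name `modularity` and the input format (a list of node pairs plus a list of communities, each a list of nodes) are assumptions of this sketch.

```python
def modularity(edges, communities):
    """Q = sum over communities s of l_s/m - (d_s/(2m))^2 for an undirected graph."""
    m = len(edges)
    label = {node: idx for idx, members in enumerate(communities) for node in members}
    inner = [0] * len(communities)   # l_s: edges with both ends in community s
    degree = [0] * len(communities)  # d_s: sum of degrees of nodes in community s
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            inner[label[u]] += 1
    return sum(l / m - (d / (2 * m)) ** 2 for l, d in zip(inner, degree))
```

For two triangles joined by a single bridge edge (m = 7), splitting the network into the two triangles gives $l_s = 3$ and $d_s = 7$ for each community, so $Q = 2 \cdot (3/7 - (7/14)^2) = 5/14 \approx 0.357$, whereas placing all nodes in one community gives Q = 0.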

Experimental Evaluation
The performance of the proposed DMFO-CD algorithm was evaluated on eleven real-world datasets, and the results were compared with five state-of-the-art community detection algorithms: DPSO-PDM [13], GA-Net [58], GACD [14], EGACD [60], and DECS [76]. The proposed and comparative algorithms were implemented in MATLAB 2020b, except for GA-Net and DECS, for which the available executable MATLAB code [89,90] was used. All experiments were run on an Intel Core (TM) i7-3770 3.4 GHz CPU with 12.0 GB of memory. The performance comparison was based on several metrics: modularity (Q), normalized mutual information (NMI) [91], and the number of detected communities. The overall performance of the algorithms was statistically analyzed by two non-parametric tests, the Friedman test [77] and the Wilcoxon signed-rank test [78].
Zachary's karate club (karate) network is a friendship network that was divided into two factions by a conflict between the administrators of the club. In this study, an unweighted version of the network was used to detect the factions based only on the friendship relations. This network contains 34 nodes and 78 edges and has two real communities, as shown in Figure 9a.
The bottlenose dolphins (dolphins) network describes 62 dolphins that lived in Doubtful Sound, New Zealand; the data were collected by Lusseau et al. [93]. The members of this bottlenose dolphin community were studied over a 7-year period, during which there was no permanent emigration or immigration. The observed community structure among these dolphins was temporally stable, which had not been seen in other bottlenose dolphin populations. This dataset consists of 62 nodes and 159 undirected edges and has two real communities, as shown in Figure 9b.
The American college football (football) network consists of 115 American football teams of Division IA colleges that played each other during the 2000 season [2]. Nodes represent teams, and edges represent the regular-season games between two teams. The network includes 115 nodes and 1226 edges and is grouped into 12 communities, as shown in Figure 9c.
The political books (polbooks) network contains 105 American political books sold on Amazon's online store. Books purchased by the same person are connected to each other by an edge. All books were divided by Newman [94], based on the descriptions and reviews posted on Amazon, into three classes: "liberal", "neutral", and "conservative". This network includes 105 nodes, 441 edges, and three real communities, as shown in Figure 9d.
The WebKB network [95] consists of four datasets, webkb-cornell, webkb-texas, webkb-washington, and webkb-wisconsin, which contain the webpages of the computer science departments of four universities: Cornell University, University of Texas at Austin, University of Washington, and University of Wisconsin. In these datasets, each node represents a webpage, and each edge indicates a hyperlink between two webpages. The webpages are classified into faculty, staff, student, project, and course. Each university is treated as a separate dataset: Cornell University has 195 nodes and 304 edges, University of Texas has 187 nodes and 328 edges, University of Washington has 265 nodes and 446 edges, and University of Wisconsin has 265 nodes and 530 edges.
The adjective-noun (adjnoun) network dataset prepared by Newman [12] consists of the most common adjectives and nouns that appear adjacent to each other in the novel David Copperfield by Charles Dickens. In this dataset, each node represents an adjective or a noun, and each edge connects a pair of words that occur adjacent to each other. In addition, each word is labeled as either an adjective or a noun.
The Email-Eu-core network consists of the emails of 1005 members of a European research institution [96]. In this dataset, nodes represent the institution's members, and edges indicate which members sent at least one email to each other. Each member belongs to one of the 42 departments of the institution. This network consists of 1005 nodes and 25,571 edges and has 42 real communities.
The DBLP network dataset is a co-authorship network of researchers in computer science [97]. This dataset includes 10,824 authors, who are connected to each other when they have published at least one paper together. Authors who published in the same journal or conference form a community. This network dataset has 10,824 nodes, 38,732 edges, and 100 communities.

Evaluation Metrics
The performance of DMFO-CD is evaluated using the following metrics: statistical values of modularity, namely the average (Qavg), standard deviation (Qstd), and optimal (Qmax) modularity; normalized mutual information (NMI); and the number of detected communities. Modularity measures the quality of the detected communities, and NMI measures the accuracy of the resulting partitioning. If R is the real partitioning of a network, NMI measures the similarity between R and the partitioning M obtained by the algorithm. The NMI of M and R is calculated by using Equation (11),
where C = (Cij) is the CM × CR confusion matrix, and CM and CR are the numbers of communities in partitionings M and R. Cij is the number of nodes shared by community i in partitioning M and community j in partitioning R. Ci and Cj are the sums of the elements in row i and column j of matrix C, respectively.
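The NMI computation can be illustrated with a short Python sketch. Equation (11) is not reproduced in this text, so the sketch assumes the widely used confusion-matrix form of NMI, in which N denotes the total number of nodes; the function names `confusion_matrix` and `nmi_from_confusion` are hypothetical and chosen only for illustration.

```python
import math


def confusion_matrix(found, real):
    """Build C where C[i][j] counts nodes in found-community i and real-community j."""
    rows = sorted(set(found))
    cols = sorted(set(real))
    r_index = {lab: i for i, lab in enumerate(rows)}
    c_index = {lab: j for j, lab in enumerate(cols)}
    C = [[0] * len(cols) for _ in rows]
    for f, r in zip(found, real):
        C[r_index[f]][c_index[r]] += 1
    return C


def nmi_from_confusion(C):
    """Assumed form: -2 * sum_ij Cij*log(Cij*N/(Ci*Cj)) /
    (sum_i Ci*log(Ci/N) + sum_j Cj*log(Cj/N))."""
    n = sum(sum(row) for row in C)
    row_sums = [sum(row) for row in C]        # Ci: sum of row i
    col_sums = [sum(col) for col in zip(*C)]  # Cj: sum of column j
    num = 0.0
    for i, row in enumerate(C):
        for j, c in enumerate(row):
            if c > 0:  # empty cells contribute nothing
                num += c * math.log(c * n / (row_sums[i] * col_sums[j]))
    den = sum(r * math.log(r / n) for r in row_sums if r > 0)
    den += sum(c * math.log(c / n) for c in col_sums if c > 0)
    if den == 0.0:
        return 1.0  # convention: both partitionings are one single community
    return -2.0 * num / den


# Identical partitionings (up to label names) are expected to give NMI = 1.
found = [0, 0, 0, 1, 1, 1]
real = ["a", "a", "a", "b", "b", "b"]
print(nmi_from_confusion(confusion_matrix(found, real)))
```

Under this assumed form, NMI is 1 when M and R coincide and 0 when they are statistically independent.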

Performance Evaluation
In this subsection, the performance of the proposed DMFO-CD algorithm is experimentally evaluated, and the results are compared with those of the comparative algorithms. As shown in Table 1, the parameters of all comparative algorithms were set to the values suggested in their original works. All algorithms were evaluated on each network dataset over 30 independent runs, except for the email-Eu and dblp networks, which were evaluated over 10 runs; in each run, the maximum number of iterations (MaxIter) was set to 100. Similar to previous works [61], the population size (N) for the karate, email-Eu, dblp, dolphins, webkb (all four), adjnoun, football, and polbooks datasets was set to 100, 100, 100, 200, 200, 200, 400, and 400, respectively. The reported measures are the average modularity (Qavg), the standard deviation of modularity (Qstd), the optimal modularity (Qmax), the average NMI (NMIavg), and the number of detected communities (Cnumber). The results are shown in Tables 2 and 3, in which the best values are marked in boldface.
Table 2 and Figure 10 show the modularity obtained by DMFO-CD and the comparative algorithms on the eleven datasets. On the karate network, DMFO-CD, GACD, and EGACD have the highest modularity and the lowest spread of modularity. On the dolphins network, Figure 10 shows that EGACD has the largest spread of modularity and GACD has the lowest average modularity, so these two algorithms perform worst on this network. The average modularity of GA-Net is better than that of EGACD and GACD; however, its spread is close to that of GACD. DMFO-CD is clearly superior to the other comparative algorithms in detecting the communities of the dolphins network.
As shown in Figure 10, GACD and EGACD have the lowest performance on the football dataset, while DMFO-CD and DPSO-PDM obtain better results than the other algorithms. On the polbooks network, GA-Net has the largest spread and weak modularity, while DMFO-CD has the best average modularity. Table 2 shows that DMFO-CD mostly achieves the best modularity among the comparative algorithms on the webkb datasets; GACD and EGACD obtain results close to those of DMFO-CD, and DPSO-PDM does not obtain good modularity. On the adjnoun dataset, as shown in Figure 10, DMFO-CD has the best average modularity, followed by GACD and EGACD, which outperform DPSO-PDM, GA-Net, and DECS. On the email-Eu dataset (the email-Eu row of Table 2), DMFO-CD performs better than GA-Net and DECS and is close to EGACD. As shown in Figure 10, the modularity of DMFO-CD on the dblp dataset is better than that of all other algorithms except DPSO-PDM.
The reported results show that the proposed DMFO-CD has the best performance in terms of modularity on the dolphins and polbooks datasets. In addition, DMFO-CD is competitive with GACD and EGACD on the karate dataset, and with DPSO-PDM and DECS on the football dataset. On the email-Eu dataset, DMFO-CD has weak performance in terms of modularity; however, it is better than GA-Net and DECS. On the dblp dataset, although DMFO-CD is not better than DPSO-PDM, it is better than the other algorithms. Thus, DMFO-CD provides superior or competitive modularity in comparison with the comparative algorithms.
Table 3 shows the average NMI obtained by GA-Net, GACD, EGACD, DPSO-PDM, DECS, and DMFO-CD on the karate, dolphins, football, polbooks, webkb, and adjnoun datasets over 30 runs and on the email-Eu and dblp datasets over 10 independent runs. Figure 11 shows bar plots of the NMI obtained by all algorithms on each dataset. On the karate dataset, although DMFO-CD obtains better NMI than GA-Net and DPSO-PDM, all comparative algorithms detect four communities, except for DECS, which has better NMI than the others and detects a more accurate number of communities. On the dolphins dataset, DPSO-PDM obtains the highest average NMI. On the football dataset, DPSO-PDM and GA-Net perform better than DMFO-CD, while DMFO-CD comes after GA-Net in detecting the correct number of communities. As shown in the fourth row of Table 3, on the polbooks dataset, DMFO-CD and DPSO-PDM detect a more accurate number of communities than the rest of the comparative algorithms. On the webkb datasets, GA-Net obtains the best average NMI, while DMFO-CD does not obtain good results. On the adjnoun dataset, GA-Net has the best result, and DMFO-CD performs worse than the other algorithms. On the email-Eu dataset, DMFO-CD has the highest NMI. The last row of Table 3 shows the results on the dblp dataset, where DMFO-CD has the best average NMI after GACD. Although DMFO-CD does not achieve the expected performance in terms of NMI in this experiment, it detects the correct number of communities in practice.
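The modularity statistics discussed in this subsection can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the standard Newman modularity for undirected, unweighted graphs, it assumes Qstd is the population standard deviation (the text does not say which estimator was used), and the names `modularity` and `summarize_runs` are hypothetical.

```python
import statistics


def modularity(edges, labels):
    """Q = sum over communities c of [ l_c / m - (d_c / (2m))^2 ],
    where m is the number of edges, l_c the number of edges inside c,
    and d_c the total degree of the nodes in c."""
    m = len(edges)
    inside = {}
    degree = {}
    for u, v in edges:
        cu, cv = labels[u], labels[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())


def summarize_runs(q_values):
    """Summary statistics of the modularity values collected over independent runs."""
    return {
        "Qavg": statistics.mean(q_values),
        "Qstd": statistics.pstdev(q_values),  # assumption: population std
        "Qmax": max(q_values),
    }


# Toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(modularity(edges, labels))
print(summarize_runs([0.40, 0.42, 0.41]))
```

In an actual experiment, `q_values` would hold the final modularity of each of the 30 (or 10) runs of one algorithm on one dataset.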

Convergence Evaluation
In this subsection, the convergence behavior and speed of DMFO-CD are assessed and compared with those of the comparative algorithms, except for GA-Net, which is excluded because it uses the community score to calculate fitness. Figure 12 shows the convergence curves of the algorithms on the eleven datasets. These curves show the best modularity at every iteration for each algorithm over 30 runs on the karate, dolphins, football, polbooks, four webkb, and adjnoun networks and over 10 runs on the email-Eu and dblp networks. The convergence of DMFO-CD on the karate, dolphins, football, and polbooks datasets is better than that of GACD and EGACD and is competitive with that of DPSO-PDM and DECS. On the webkb datasets, the convergence curve of DMFO-CD closely follows those of GACD and EGACD, and on the adjnoun dataset, DMFO-CD has the best convergence curve among the comparative algorithms. On the email-Eu dataset, DMFO-CD does not converge well, but on dblp it has the best performance after DPSO-PDM. The better convergence of DMFO-CD originates from the use of the best neighbor in the initialization step: convergence is faster when the best neighbor is used in initialization than when it is not. Figure 13 shows this difference on four selected datasets.

Figure 13. Impact of using the best neighbor on convergence speed of DMFO-CD.

Statistical Analysis
In this subsection, the performance of the proposed DMFO-CD algorithm and the comparative algorithms is statistically analyzed using two tests: the Friedman test [77] and the Wilcoxon signed-rank test [78].

Friedman Test
The non-parametric Friedman test was conducted to statistically assess the superiority of the proposed DMFO-CD algorithm. The Friedman test (Ff) [77] is a non-parametric test used for multiple comparisons of different algorithms over all test problems. It is used to rank DMFO-CD and the comparative algorithms based on the achieved fitness by using Equation (12), where k, n, and Rj are the number of algorithms, the number of test cases, and the mean rank of the jth algorithm, respectively. For each problem, the test ranks the algorithms from 1 (best result) to k (worst result) and then averages the ranks obtained over all problems to find each algorithm's final rank. The Friedman test on modularity and NMI was calculated for all algorithms over 30 runs on the karate, dolphins, football, polbooks, four webkb, and adjnoun networks and over 10 runs on the email-Eu and dblp networks. The results on modularity and NMI are tabulated in Tables 4 and 5. Based on the overall ranking, the DMFO-CD algorithm has a better overall rank on modularity and a competitive rank compared with the other state-of-the-art algorithms.
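The ranking procedure described above can be sketched in Python. Equation (12) is not reproduced in this text, so the sketch assumes the common chi-square form of the Friedman statistic, 12n/(k(k+1)) · [Σj Rj² − k(k+1)²/4], and assumes that tied scores share their average rank; the names `average_ranks` and `friedman` are hypothetical.

```python
def average_ranks(scores, higher_is_better=True):
    """Rank k scores from 1 (best) to k (worst); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=higher_is_better)
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for t in range(pos, end + 1):
            ranks[order[t]] = avg
        pos = end + 1
    return ranks


def friedman(table):
    """table[i][j] is the score (e.g., modularity) of algorithm j on problem i.
    Returns the mean rank Rj of each algorithm and the assumed Friedman statistic."""
    n = len(table)     # number of test cases
    k = len(table[0])  # number of algorithms
    rank_rows = [average_ranks(row) for row in table]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    stat = 12 * n / (k * (k + 1)) * (sum(R * R for R in mean_ranks) - k * (k + 1) ** 2 / 4)
    return mean_ranks, stat


# Made-up scores for three algorithms on four problems (illustration only).
table = [
    [0.42, 0.40, 0.38],
    [0.52, 0.51, 0.50],
    [0.60, 0.58, 0.55],
    [0.30, 0.29, 0.28],
]
print(friedman(table))
```

The scores in `table` are invented for the example and are not results from Tables 4 and 5.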

Wilcoxon Signed-Rank Test
In order to verify whether the difference between DMFO-CD and each comparative algorithm is significant, the Wilcoxon signed-rank test was employed [78]. This is a non-parametric statistical test that requires two sets of random observations and two hypotheses. In this paper, the two sets of observations are the modularity values obtained by the two algorithms, i.e., DMFO-CD and the compared one, over 30 runs on the karate, dolphins, football, polbooks, four webkb, and adjnoun datasets and over 10 runs on the email-Eu and dblp datasets. The null hypothesis (H0) is that there is no significant difference between the mean values of the two observation sets (modularity values), while the alternative hypothesis (H1) is that there is a significant difference between the average values of the two sets. Table 6 presents the results of this test at significance level α = 0.05. The p value indicates the significance of the difference between each pair of algorithms (DMFO-CD and one other algorithm); the difference is considered significant only if p < α. The results show that the null hypothesis is rejected on most of the networks.
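A minimal Python sketch of the pairwise test is given below. It is an illustration under stated assumptions rather than the procedure used to produce Table 6: zero differences are discarded, tied absolute differences share their average rank, and the two-sided p value comes from the normal approximation without continuity or tie correction. The function name `wilcoxon_signed_rank` is hypothetical; in practice a statistics library routine such as `scipy.stats.wilcoxon` would normally be preferred.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, p) for paired samples x and y."""
    d = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(d)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and abs(d[order[end + 1]]) == abs(d[order[pos]]):
            end += 1
        avg = (pos + end) / 2 + 1
        for t in range(pos, end + 1):
            ranks[order[t]] = avg
        pos = end + 1
    w_plus = sum(r for r, diff in zip(ranks, d) if diff > 0)
    w_minus = sum(r for r, diff in zip(ranks, d) if diff < 0)
    # Normal approximation of the null distribution of min(W+, W-).
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p value
    return w_plus, w_minus, p


alpha = 0.05
# Made-up paired values (illustration only).
w_plus, w_minus, p = wilcoxon_signed_rank([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
print(w_plus, w_minus, p, p < alpha)
```

With the 30 paired modularity values of DMFO-CD and one comparative algorithm as `x` and `y`, the null hypothesis would be rejected when the returned p value is below α.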

Conclusions and Future Works
In this study, a discrete moth-flame optimization algorithm for community detection (DMFO-CD) was proposed for complex networks. The solution vector representation, the distance calculation, and the spiral flight movement of the MFO algorithm were adapted for community detection. This adaptation was performed by introducing a single-point crossover to imitate the distance calculation, a two-point crossover to alter the movement strategy, and a single-point neighbor-based mutation to increase the exploration ability. The performance of DMFO-CD was experimentally evaluated on eleven real-world networks and compared with five well-known community detection algorithms in terms of modularity, NMI, and the number of detected communities. The experimental results show that the proposed DMFO-CD can detect the correct number of communities with better modularity than the other state-of-the-art algorithms. The overall effectiveness of the proposed algorithm was also statistically analyzed using the Friedman and Wilcoxon signed-rank tests. Specifically, the ranks and p values obtained from these tests show that DMFO-CD has a better overall rank on modularity and that the null hypothesis is rejected on most of the networks. The NMI obtained by DMFO-CD suggests that further studies can focus on combining it with a local search strategy to detect more accurate communities. Furthermore, the single-objective DMFO-CD algorithm can be extended to solve multi-objective community detection.