Differential-Evolution-Based Coevolution Ant Colony Optimization Algorithm for Bayesian Network Structure Learning

Learning the Bayesian networks (BNs) structure from data has received increasing attention. Many heuristic algorithms have been introduced to search for the optimal network that best matches the given training data set. To further improve the performance of ant colony optimization (ACO) in learning the BNs structure, this paper proposes a new improved coevolution ACO (coACO) algorithm, which uses the pheromone information as the cooperative factor and the differential evolution (DE) as the cooperative strategy. Different from the basic ACO, the coACO divides the entire ant colony into various sub-colonies (groups), among which DE operators are adopted to implement the cooperative evolutionary process. Experimental results demonstrate that the proposed coACO outperforms the basic ACO in learning the BN structure in terms of convergence and accuracy.


Introduction
The Bayesian network (BN) [1], which is also called the probabilistic belief network or the causal network [2], is a kind of graphical model and knowledge representation tool.BNs can efficiently give the probabilistic description of the dependence or independence relationships between a set of random variables.A BN is composed of a directed acyclic graphical structure and a set of probability parameters.The directed acyclic graphical structure represents the dependence relationships between various variables, and the corresponding probability parameters specify their degree of dependence.Recently, learning the BN structure from a dataset has received increasing attention [3], and researchers have introduced various learning algorithms to obtain the structure for BNs.According to the modeling type [3][4][5], these structure learning algorithms can be classified into methods based on detecting conditional independencies [6,7], also known as constraint-based methods, the "score+search" method [1,2,[8][9][10][11][12][13], and the algorithms that combine the above two methods [14][15][16][17].As to the score+search method, the BN structure learning is modelled as an optimization problem.A scoring metric is employed to evaluate how well the candidate network structure [3] matches the dataset.The better the candidate network structure matches the dataset, the higher the score.Thus, it can use the optimization technique to search for the network structure with the best score.However, searching for the optimum network for BNs is an NP-hard problem [5,18], which means those exact methods become unfeasible.Therefore, the approximate algorithms are very useful for quickly obtaining a sufficiently highly qualified network structure.The greedy search methods, including the K2 algorithm and many meta-heuristic methods, such as genetic algorithm (GA) [12], evolutionary programming (EP) [12] and particle swarm optimization (PSO) [14], are frequently used in solving the BN learning problem.
As a population-based meta-heuristic technique, ant colony optimization (ACO) also has been introduced successfully into the problem of BN structure learning from a dataset [1,2,8,14,15].ACO, which was originally presented under the inspiration from the collective behavior of a real ant system [19,20], has already been applied to a wide range of optimization problems.Campos et al. [2] first introduced ACO to the problem of BN structure learning.They described all of the elements necessary to tackle the BN learning problem using ACO, and contrast experiments indicated that ACO shows better performance than the estimation of distribution algorithms (EDAs) as well as the greedy hill climbing (HC) algorithm.Based on the constraint-based local discovery algorithm, Pinto et al. [1] proposed a local discovery ACO by using the conditional dependence test max-min parents and children method.Ji et al. [15] proposed a hybrid method that combines ACO, the conditional independence test, and the simulated annealing (SA) strategy, and their hybrid ACO method outperforms the basic ACO in terms of computational time and searching capability.However, there are two drawbacks in the above ACO-based hybrid algorithms.Firstly, the conditional independence test can make the computation more complex and unreliable [3].Secondly, when the number of variables is very large, the ACO-based algorithm will easily fall into the local optimal solution and result in a premature stagnation.To avoid the drawback caused by the conditional independence test, it is necessary to improve the inherent searching mechanism of the ACO algorithm.
In this paper, we propose a kind of improved ACO to enhance the convergence and accuracy of the basic ACO in solving the problem of BN structure learning.The improved ACO divides the entire ant colony into several small sub-colonies, denoted as groups.In each ant group, ants perform respective actions, which leads to the entire ant group evolving forward according to the information within the group itself.Moreover, at the same time, among different groups, the information is also shared in the form of a cooperation variable, and all of the ant groups will evolve forward together using cooperative operations.The cooperative operations in the cooperative evolution process are based on the differential evolution (DE) algorithm [21,22], which leads the cooperative evolution process for all ant groups.The proposed improved ACO is denoted as coevolution ACO (coACO).Two features make the coACO algorithm interesting: (1) the grouping operator divides the entire ant colony into different ant groups, and, thus, the algorithm can carry out not only the social cooperation between ant individuals but also the cooperative interaction and information shared between ant groups; (2) the DE algorithm is employed to adjust the cooperation information and lead all ant groups to evolve toward the optimum in a cooperative manner.
In the rest of this paper, a problem description of the BN structure learning as well as the introduction of the ACO algorithm are given in Section 2. The proposed coACO algorithm and the detailed BN structure learning method are described in Section 3. Section 4 presents the test results.Section 5 concludes this paper.

Problem of BN Structure Learning
A BN is a graphical tool to represent the n-dimensional probability distribution.It can be described by a directed acyclic graph (DAG) G ≤ X, A, Θ>.In G, each node X i ∈X represents a random variable of interest, while each arc a ij ∈A represents a direct dependence relationship between the variables X i and X j .In addition, the parameter θ i = P(X i |π i ), where Θ = {θ i }, denotes the conditional probability distribution of X i given its parent set π i .From the conditional distributions, the joint probability can be uniquely determined by Given a training database D = {x 1 , x 2 , . . ., x m } composed of m cases and each case contains n variables, where x i is an instance of domain variable X, the problem of the BN structure learning is to find the BN topology structure that best matches the dataset D.
As noted previously, algorithms for learning the BN structure from data mainly include the constraint-based methods and the score+search methods.Based on a dependency analysis, the existing approaches are close to the semantics of BNs and relatively simpler to implement [8].However, it is hard to ensure the precision of the obtained structure.Furthermore, the computation for high-order tests is complex and unreliable.For this reason, most of the developed structure learning algorithms fall into the latter, namely the score+search methods [3,5], which treat the problem of BN structure learning as a combinatorial optimization problem.
The BN structure learning based on score+search methods firstly uses a scoring metric to evaluate how well the candidate BN structure matches the given dataset, and then finds the network structure with the maximum score.Popular scoring metrics include Akaike's information criterion (AIC), the Bayesian information criterion (BIC), the minimum description length (MDL) score, and the Bayesian Dirichlet equivalence (BDe) metric (usually called the K2 metric) [3].Here, the BIC scoring metric, which comes from the penalized maximum likelihood, is used as the structure to identify the dataset matching degree as follows: where, B S is the candidate BN structure; r i is the number of possible values for the variable X i ; q i is the number of possible configurations instantiations for its parents π i ; N ijk is the number of cases in D in which variable X i has its k-th value, π i is instantiated to its j-th value, and is the dimension (the number of parameters needed to specify the model) of the BN; and f (m) is the non-negative penalization function that depends on the size of the dataset and can be computed as f (m) = 0.5•log m.Using f (B S , D) instead of P(B S |D), the BIC scoring metric [3] is defined as: where One desirable and important feature of scoring metrics is their decomposability in the presence of full data, and (3) shows that the BIC metric used here is decomposable.With the decomposable metric, a local search procedure that changes one arc at each move can efficiently evaluate the improvement obtained by this change [2,3], because it can reuse most of the computations made in previous stages.Moreover, the score of a BN can be computed as the combination of scores obtained for smaller factors.

ACO
As a representative bio-inspired meta-heuristic algorithm, ACO was firstly put forward by Dorigo in the 1990s [19] to solve the travel salesman problem (TSP).Till now, ACO has been proven to be a more common framework for various optimization problems in a wide range of fields [23][24][25], such as job-shop scheduling, data mining, routing problems, and other complex optimization problems.When observing the foraging behavior of real ant colonies, researchers discovered that real ants can deposit a chemical substance, called a pheromone, while walking.The pheromone can be accumulative and evaporative, through which the ant colony can carry on indirect communication and finally achieve the cooperative goal.Ants can smell the pheromone and choose their way, in a probabilistic way, based on the amount of pheromone.The larger the amount of pheromone deposited on a route, the greater the probability that ants select the route.Meanwhile, on the shorter route that ants travel, the pheromone accumulates faster than on the longer routes.Thus, the faster the amount of pheromone increases on the shorter route, then the greater the probability that the ants travel this route.In the initial stage when the pheromone is absent, ants choose their routes fully randomly, but after a transitory period the shortest routes will be more and more frequently visited and pheromone will accumulate faster and faster on them, which in turn will attract more and more ants to choose these routes.
The mathematical model of ACO is described as follows.Let M ant be the number of ants, and the matrix τ(t) = {τ ij (t)} be the pheromone, of which the element τ ij (t) is the level of pheromone deposited on the arc from node i to node j, at time t.The initial level of pheromone on each directed arc is a constant value, i.e., τ ij (0) = τ 0 .Each ant builds a possible solution to the problem by moving through a finite sequence of neighbor nodes, and these moves are directed by the ant's internal state, problem-specific local information, and the shared information about the pheromone [19].For the k-th ant located at the i-th node, it will move to the next j-th node with the transition probability: where η ij represents the heuristic information about the problem, allowed k denotes the feasible domain of the k-th ant at the i-th node, and α and β are parameters that determine the relative importance of the pheromone with respect to the heuristic information.
In addition, in order to achieve a trade-off between exploitation and exploration [2], a different transition rule is introduced and the next node j is selected as: where q is a random number uniformly distributed in [0, 1]; q 0 ∈[0,1] is the parameter that determines the relative importance of exploitation versus exploration; and J is a node randomly selected according to the transition probability in (1) with α = 1.
As the ants move and build the possible solutions, the pheromone matrix is updated according to both the global updating and local updating processes [20].As to the local updating process when building the solution, if an ant moves from node i to node j, then the pheromone level on the corresponding arc ij is updated as follows: where τ 0 is the initial pheromone level on all arcs, and ψ∈(0,1] denotes the parameter that can control the pheromone evaporation.After all ants have constructed a solution, only the ant that obtains the best solution can reinforce the pheromone level on the arcs, which constitute the best solution, S + , obtained by the ant colony so far.The global updating rule can be expressed by where ρ∈(0,1] is the parameter that can control the pheromone evaporation, and f (S + ) is the cost associated with the best solution S + .The following Algorithm 1 shows the complete algorithm of ACO applied to optimization problems [19].

ACO Applied to BNs
Using the basic ACO algorithm, the best network can be found in the space of possible networks based on the score+search framework, [1,2].Beginning with a blank network, the ant colony progressively searches for good single-step changes to build a complete BN.Each ant connects randomly two variables and determines whether the arc should be included in the BN structure.As the construction process that is illustrated in Figure 1 [2,14,15], the ant uses the incremental construction of the solution starting from a blank network G 0 through connecting an arc a ij = {X i →X j } and then adding it to the current network, i.e., G h+1 = G h ∪a ij .When no arc can be added to achieve a higher score of the BN structure, the construction process of the ant will be stopped with obtaining the final solution G g .The pheromone placed on all candidate arcs together with the heuristic information are used to guide the network construction process.The random rule that the ant k selects the arc a ij from the current optional arcs is where A ij are the arcs randomly selected according to the following probabilities: where allowed k is the set composed of all candidate arcs that do not create a directed cycle and have the positive heuristic information.q 0 is the threshold value that is set by the user.The maximum objective function of ACO is the BIC scoring metric in (3).Thus, the heuristic information ηij of the arc aij at time t is defined as The pheromone level τij on the arc aij changes according to the local and global updating rules as described in ( 7) and ( 8), while the increment is competed by where G + is the best BN structure found by all ants so far.The basic ACO algorithm applied to learning the BN structure is presented in Algorithm 2 [2].The maximum objective function of ACO is the BIC scoring metric in (3).Thus, the heuristic information η ij of the arc a ij at time t is defined as The pheromone level τ ij on the arc a ij changes according to the local and global updating rules as described in ( 7) and ( 8), while the increment is competed by where G + is the best BN structure found by all ants so far.The basic ACO algorithm applied to learning the BN structure is presented in Algorithm 2 [2].Calculate the heuristic information: for i, j = 1 to n (I = j) do Select an arc a ij from the feasible domain allowed according to ( 9) and (10); Update the pheromone: Update the pheromone: Update the pheromone matrix according to ( 8) and ( 12) using f BIC (G + , D) 29 end while 30 Return the best BN structure G + .

Method
This paper introduces a new improved algorithm, coACO, to solve the BN structure-learning problem.The coACO incorporates several coevolution operators to improve the performance of the basic ACO.Firstly, the entire ant colony is divided into S independent ant groups, and the number of groups is set as n s , s = 1, . . ., S, as shown in Figure 2. DE is one of the best evolutionary algorithms for solving the real-valued test function suite of the First International Contest on Evolutionary Optimization (1st ICEO) [22].Therefore, the DE strategies [26,27] are used to guide the coevolution process for all ant groups.As is mentioned in the above section, the pheromone plays an important role in the exploration and exploitation of the ant colony for constructing the solution.A reasonable distribution of the pheromone can directly affect ants to explore their solutions.Thereby, the pheromone level is chosen as the cooperative object shared by all ant groups in the coACO.

Method
This paper introduces a new improved algorithm, coACO, to solve the BN structure-learning problem.The coACO incorporates several coevolution operators to improve the performance of the basic ACO.Firstly, the entire ant colony is divided into S independent ant groups, and the number of groups is set as ns, s = 1, …, S, as shown in Figure 2. DE is one of the best evolutionary algorithms for solving the real-valued test function suite of the First International Contest on Evolutionary Optimization (1st ICEO) [22].Therefore, the DE strategies [26,27] are used to guide the coevolution process for all ant groups.As is mentioned in the above section, the pheromone plays an important role in the exploration and exploitation of the ant colony for constructing the solution.A reasonable distribution of the pheromone can directly affect ants to explore their solutions.Thereby, the pheromone level is chosen as the cooperative object shared by all ant groups in the coACO.For each ant group, the pheromone level is denoted as the matrix τ s (t) = {τij s (t)}, s = 1, …, S. Each ant group communicates with other groups and coordinates its evolution process using the pheromone as the cooperative variable.The coACO introduces coordination operations, which are based on DE, to affect the pheromone, which leads to the achievement of a more reasonable distribution of the pheromone for each ant group.
The DE-based coordination operations contain mutation, crossover, and selection operators [21].The first one is the mutation operator, which is implemented as follows: where u s (t) = {uij s (t)}, s = 1, …, S, is the donor pheromone matrix. 1 r τ , 2 r τ , and 3 r τ are three different pheromone matrixes that are randomly selected from the ant groups, namely r1 ≠ r2 ≠ r3 ≠ s.The real number F is a positive parameter between [0,2], called the mutation factor, which controls the amplification of the differential variation The second one is the crossover operator, which is performed as follows: where rij is a random value generated for each arc aij in accordance with a uniform distribution over [0,1], cr is the given crossover rate within (0,1), and the obtained v s (t) = {vij s (t)} is the trial pheromone matrix.Figure 3 describes the crossover process of the pheromone matrixes.For each ant group, the pheromone level is denoted as the matrix τ s (t) = {τ ij s (t)}, s = 1, . . ., S.
Each ant group communicates with other groups and coordinates its evolution process using the pheromone as the cooperative variable.The coACO introduces coordination operations, which are based on DE, to affect the pheromone, which leads to the achievement of a more reasonable distribution of the pheromone for each ant group.
The DE-based coordination operations contain mutation, crossover, and selection operators [21].The first one is the mutation operator, which is implemented as follows: where u s (t) = {u ij s (t)}, s = 1, . . ., S, is the donor pheromone matrix.τ r 1 , τ r 2 , and τ r 3 are three different pheromone matrixes that are randomly selected from the ant groups, namely The real number F is a positive parameter between [0,2], called the mutation factor, which controls the amplification of the differential variation (τ r 2 − τ r 3 ).
The second one is the crossover operator, which is performed as follows: where r ij is a random value generated for each arc a ij in accordance with a uniform distribution over [0,1], cr is the given crossover rate within (0,1), and the obtained v s (t) = {v ij s (t)} is the trial pheromone matrix.Figure 3 describes the crossover process of the pheromone matrixes.
where τ s (t + 1) is the pheromone matrix of the s-th ant group in the (t + 1)-th iteration.Ants in each group construct their networks according to the codes from Line 8 to Line 12 in Algorithm 2 using the pheromone matrix τ s (t) or v s (t).The fitness of the pheromone matrix f(τ s (t)) or f(v s (t)) is defined as the maximum BIC score of the BNs obtained by the ant group based on the corresponding pheromone matrix τ s (t) or v s (t).After the selection operation, all ant groups perform the pheromone global updating procedure using the best BN G + obtained up to the current iteration.The detailed process of our proposed coACO algorithm in solving the BN structure-learning problem can be described as follows: Step 1. Initialization of parameters: the maximum number of iteration as Tmax, initial iteration counter t = 0, the number of ants Mant, the number of ant groups S, and other parameters α, β, ρ, F, cr, q0.
Step 2. Initialization of ants: divide the whole ant colony into different ant groups; record the number of ants in each groups as n1, n2, … and nS; set the initial pheromone matrix τij s (0) = τ0, s = 1,2, …, nS for all arcs aij; set G + to be an empty graph.Step 3. t = t + 1; s = 1.
Step 4. Perform the mutation and crossover operations to the initial pheromone matrix τ s (t) of each ant group using Equations ( 13) and ( 14), and generate the trial pheromone matrix v s (t).Step 14. Select the BN with maximum score from all ant groups, which is recorded as Gt.
Step 15.If Gt has the larger BIC score than G + , then G + = Gt.
Step 16.Update the pheromone matrix τ s , s = 1, 2, …, S, for each ant group according to (8) and (12) based on G + .Step 17.Return to Step 3 until t > Tmax.The third operator is the greedy selection, which is performed as follows: where τ s (t + 1) is the pheromone matrix of the s-th ant group in the (t + 1)-th iteration.Ants in each group construct their networks according to the codes from Line 8 to Line 12 in Algorithm 2 using the pheromone matrix τ s (t) or v s (t).The fitness of the pheromone matrix f (τ s (t)) or f (v s (t)) is defined as the maximum BIC score of the BNs obtained by the ant group based on the corresponding pheromone matrix τ s (t) or v s (t).After the selection operation, all ant groups perform the pheromone global updating procedure using the best BN G + obtained up to the current iteration.
The detailed process of our proposed coACO algorithm in solving the BN structure-learning problem can be described as follows: Step 1. Initialization of parameters: the maximum number of iteration as T max , the initial iteration counter t = 0, the number of ants M ant , the number of ant groups S, and other parameters α, β, ρ, F, cr, q 0 .Step 2. Initialization of ants: divide the whole ant colony into different ant groups; record the number of ants in each groups as n 1 , n 2 , . . .and n S ; set the initial pheromone matrix τ ij s (0) = τ 0 , s = 1,2, . . ., n S for all arcs a ij ; set G + to be an empty graph.
Step 4. Perform the mutation and crossover operations to the initial pheromone matrix τ s (t) of each ant group using Equations ( 13) and ( 14), and generate the trial pheromone matrix v s (t).Step 5. s = s + 1; if s ≤ S, return to Step 4. Step 6. s = 1, k = 1.
Step 7. The k-th ant constructs the BN G k (τ s ) using the pheromone matrix τ s according to the codes from Line 8 to Line 12 in Algorithm 2.
Step 8.The k-th ant constructs the BN G k (v s ) using the pheromone matrix v s according to the codes from Line 8 to Line 12 in Algorithm 2. Step 9. k = k + 1; if k ≤ n s , return to Step 7.
Step 10.Compute the BIC score for G k (τ s ), k = 1, 2, . . ., n s , choose the best BN with the maximum score and set the maximum score as the fitness f (τ s (t)) for τ s .Step 11.Compute the BIC score for G k (v s ), k = 1, 2, . . ., n s , choose the best BN with the maximum score and set the maximum score as the fitness f (v s (t)) for v s .Step 12. Compare f (τ s (t)) and f (v s (t)), select the better pheromone matrix according to (15), and select the corresponding BN for the ant group.Step 13. s = s + 1; if s ≤ S, k = k + 1 and return to Step 7.
Step 14. Select the BN with maximum score from all ant groups, which is recorded as G t .
Step 15.If G t has the larger BIC score than G + , then G + = G t .
Step 18. Terminate and output the best BN structure, namely G + .
The above steps of the coACO algorithm applied to learning BN structure can also be presented as the pseudocode in Algorithm 3.
Update each pheromone matrix τ s according to ( 8) and ( 12) using f BIC (G + , D) 25 end while 26 Return the best BN structure G + .

Results and Discussion
In order to evaluate the performance of the coACO algorithm in solving the BN structure-learning problem, a series of test experiments is performed and a comparison of the proposed coACO with the basic ACO is carried out.All the tested algorithms are implemented using the Matlab-2009a, and the Bayes Net Toolbox (BNT) developed by Murphy [28] is used to evaluate the BIC score.The experimental platform is a personal computer with an Intel(R) Core(TM) i3, 3.07GHz CPU, 4 GB memory, and Windows 7. The parameters of the two ACO algorithms are set as α = 1, β = 2, q 0 = 0.8, ρ = ψ = 0.4, M ant = 10, n s = 5, cr = 0.9, F is set as a random value that is uniformly distributed in [0.2,0.9], and the maximum number of iterations is set as T max = 100.The initial pheromone level placed on each arc is τ 0 = 1 n| f BIC (G K2 ,D)| , where n is the number of variables and G K2 is the network obtained by the K2 algorithm using the BNT.Moreover, two traditional algorithms, based on the score+search framework, K2 and B [2], are also tested for comparison.Different from the ACO-based methods, both the K2 and B algorithms are not population-based.
The learning datasets are generated from the well-known benchmarks of BNs, including the ASIA and the ALARM networks.The ASIA network consists of 8 nodes and 8 arcs, while the ALARM network consists of 37 nodes and 46 arcs.Using the BNT, we generated a dataset of 10,000 cases for the ASIA network and 5000 cases for the ALARM network.For the ASIA network, the subsets consisting of the first 1000, 3000, 5000, 8000, and 10,000 cases are considered.For the ALARM network, the subsets consisting of the first 1000 and 5000 cases are considered.Table 1 lists the summary of the datasets used herein.We run the stochastic algorithms, including ACO, coACO, and K2, 20 times independently for each dataset.The statistical results of the multiple independent experiments are listed in Table 2 for comparison, where six indicators are used to estimate the performances of the four algorithms, including the maximum/best, mean, median values, and the standard deviation of the BIC scores of the best networks obtained in each run.The last column, success rate (SR), is defined to represent the percentage of all 20 runs to obtain the BIC score that is not less than the corresponding original score listed in Table 1.It can be seen from Table 2 that coACO can always find the better BIC score for each dataset.It means that coACO can achieve a more efficient and robust performance than the basic ACO algorithm in learning the BN structure.The average evolution curves of the 20 independent tests on each dataset are illustrated in Figures 4-10, which show the validity and rapid convergence of the proposed coACO algorithm.framework, K2 and B [2], are also tested for comparison.Different from the ACO-based methods, both the K2 and B algorithms are not population-based.The learning datasets are generated from the well-known benchmarks of BNs, including the ASIA and the ALARM networks.The ASIA network consists of 8 nodes and 8 arcs, while the ALARM network consists of 37 nodes and 46 arcs.Using the BNT, we generated a dataset of 10,000 cases for the ASIA network and 5000 cases for the ALARM network.For the ASIA network, the subsets consisting of the first 1000, 3000, 5000, 8000, and 10,000 cases are considered.For the ALARM network, the subsets consisting of the first 1000 and 5000 cases are considered.Table 1 lists the summary of the datasets used herein.We run the stochastic algorithms, including ACO, coACO, and K2, 20 times independently for each dataset.The statistical results of the multiple independent experiments are listed in Table 2 for comparison, where six indicators are used to estimate the performances of the four algorithms, including the maximum/best, mean, median values, and the standard deviation of the BIC scores of the best networks obtained in each run.The last column, success rate (SR), is defined to represent the percentage of all 20 runs to obtain the BIC score that is not less than the corresponding original score listed in Table 1.It can be seen from Table 2 that coACO can always find the better BIC score for each dataset.It means that coACO can achieve a more efficient and robust performance than the basic ACO algorithm in learning the BN structure.The average evolution curves of the 20 independent tests on each dataset are illustrated in Figures 4-10, which show the validity and rapid convergence of the proposed coACO algorithm.Table 3 also shows the summary of the statistical results for various methods tested in the experiment, where µ ± σ denotes the mean and the standard deviation over the independent runs and the value inside (best) is the corresponding best value.The indicator It. is the smallest number of iterations when the algorithm finds the optimal network structure.The indicators A., D., and I. are used to denote the structure differences between the learned and the original network, namely, the number of arcs accidentally added (A.), deleted (D.), and inverted (I.), compared with the original network.
From the experimental results, it can be seen that, using the cooperative evolution strategies based on DE, the proposed coACO can greatly improve the performance of ACO in solving the BN structure-learning problem.As to the accuracy of the algorithms, coACO is superior to ACO for all test cases.This improvement is valid for not only BIC scores (see Table 2) but also the iteration number and the structural differences (see Table 3).As to the efficiency, the evolution curves shown in Figures 4-10 show that coACO requires fewer computing iterations than the basic ACO.As to the robustness, coACO is also superior to the basic ACO, which can be deduced from the standard deviations of the independent runs shown in Tables 2 and 3.The pheromone plays an important role in the solution-constructing process.The DE operations improve the capability of ants to accumulate the level of pheromone on arcs in the best structure.Due to the cooperative evolution characteristics, coACO can adjust the distribution of the pheromone trail to the best state in just a few iterations, and thus, yield the optimal network structure much faster.Though coACO employs some extra operators, it converged much faster.Therefore, coACO can obtain the best solution in a given small number of iterations.

Figure 1 .
Figure 1.The construction process of a BN.

Figure 1 .
Figure 1.The construction process of a BN.

Figure 3 .
Figure 3.The crossover process of the pheromone matrix.
Step 5. s = s + 1; if s ≤ S, return to Step 4. Step 6. s = 1, k = 1.Step 7. The k-th ant constructs the BN Gk(τ s ) using the pheromone matrix τ s according to the codes from Line 8 to Line 12 in Algorithm 2. Step 8.The k-th ant constructs the BN Gk(v s ) using the pheromone matrix v s according to the codes from Line 8 to Line 12 in Algorithm 2. Step 9. k = k + 1; if k ≤ ns, return to Step 7. Step 10.Compute the BIC score for Gk(τ s ), k = 1, 2, …, ns, choose the best BN with the maximum score and set the maximum score as the fitness f(τ s (t)) for τ s .Step 11.Compute the BIC score for Gk(v s ), k = 1, 2, …, ns, choose the best BN with the maximum score and set the maximum score as the fitness f(v s (t)) for v s .Step 12. Compare f(τ s (t)) and f(v s (t)), select the better pheromone matrix according to (15), and select the corresponding BN for the ant group.Step 13. s = s + 1; if s ≤ S, k = k + 1 and return to Step 7.

τFigure 3 .
Figure 3.The crossover process of the pheromone matrix.

Figure 4 .
Figure 4. Evolution curves of the average Bayesian information criterion (BIC) score achieved by ACO and coACO on the ASIA dataset with n = 1000.

Figure 4 .
Figure 4. Evolution curves of the average Bayesian information criterion (BIC) score achieved by ACO and coACO on the ASIA dataset with n = 1000.

Figure 5 .Figure 5 .Figure 5 .
Figure 5. Evolution curves of the average BIC score achieved by ACO and coACO on the ASIA dataset with n = 3000.

Figure 6 .
Figure 6.Evolution curves of the average BIC score achieved by ACO and coACO on the ASIA dataset with n = 5000.

Figure 7 .Figure 6 .
Figure 7. Evolution curves of the average BIC score achieved by ACO and coACO on the ASIA dataset with n = 8000.

Figure 5 .
Figure 5. Evolution curves of the average BIC score achieved by ACO and coACO on the ASIA dataset with n = 3000.

Figure 6 .
Figure 6.Evolution curves of the average BIC score achieved by ACO and coACO on the ASIA dataset with n = 5000.

Figure 7 .Figure 7 . 16 Figure 8 .
Figure 7. Evolution curves of the average BIC score achieved by ACO and coACO on the ASIA dataset with n = 8000.

Figure 9 .Figure 8 .
Figure 9. Evolution curves of the average BIC score achieved by ACO and coACO on the ALARM dataset with n = 1000

Figure 8 .
Evolution curves of the average BIC score achieved by ACO and coACO on the ASIA dataset with n = 10,000.

Figure 9 .Figure 10 .Figure 9 .Figure 8 .
Figure 9. Evolution curves of the average BIC score achieved by ACO and coACO on the ALARM dataset with n = 1000
* Evaluation the fitness of pheromone matrixes */ 14 for each ant k in the s-th group do 15 Construct a BN structure G k (v s ) using the pheromone matrix v s according to the codes from Line 8 to Line 12 in Algorithm 1; 16 Construct a BN structure G k (τ s ) using the pheromone matrix τ s according to the codes from Line 8 to Line 12 in Algorithm 1; 17 end for 18 Calculate the BIC score for each BN structure G k (v s ), choose the best structure and assign the maximum score to f (v s ); 19 Calculate the BIC score for each BN structure G k (τ s ), choose the best structure and assign the maximum score to f (τ s ); /* Selection */ 20Compare f (τ s ) and f (v s ), select the better one to be the new pheromone matrix according to (14);

Table 1 .
A summary of the datasets used in the experiments.

Table 1 .
A summary of the datasets used in the experiments.

Table 2 .
Statistical results of the BIC score for various methods.