A Population-Based Search Approach to Solve Continuous Distributed Constraint Optimization Problems

Abstract: Distributed Constraint Optimization Problems (DCOPs) are an efficient framework widely used in multi-agent collaborative modeling. The traditional DCOP framework assumes that variables are discrete and that constraint utilities are represented in tabular form. In many practical applications, however, the variables are continuous and the constraint utilities are given as functions. To overcome this limitation, researchers have proposed Continuous DCOPs (C-DCOPs), which can model DCOPs with continuous variables. Most existing C-DCOP algorithms rely on gradient information for optimization, so they cannot handle non-differentiable utility functions. Although the Particle Swarm-Based C-DCOP (PCD) and Particle Swarm with Local Decision-Based C-DCOP (PCD-LD) algorithms can handle non-differentiable utility functions, they need Breadth First Search (BFS) pseudo-trees for message passing. Unfortunately, employing the BFS pseudo-tree results in expensive computational overhead and agent privacy leakage, as messages are aggregated at the root node of the BFS pseudo-tree. Therefore, this paper proposes a fully distributed C-DCOP algorithm that places no restriction on the form of the utility function and avoids the disadvantages caused by the BFS pseudo-tree. Inspired by population-based algorithms, we propose a fully decentralized local search algorithm, named Population-based Local Search Algorithm (PLSA), for solving C-DCOPs with three-fold advantages: (i) PLSA adopts a heuristic method to guide the local search so that high-quality solutions are found quickly; (ii) in contrast to conventional C-DCOP algorithms, PLSA can handle utility functions of any form; and (iii) compared to PCD and PCD-LD, PLSA avoids complex message passing, which yields efficient computation and protects agent privacy.
In addition, we implement an extended version of PLSA, named Population-based Global Search Algorithm (PGSA), and empirically show that our algorithms outperform the state-of-the-art C-DCOP algorithms on three types of benchmark problems.

Firstly, CMS and HCMS extend discrete MS: CMS approximates utility functions by piecewise linear functions, and HCMS introduces a continuous nonlinear optimization method. However, both are limited by the forms of utility functions they accept. Secondly, B-DPOP combines Bayesian optimization with DPOP, but it does not guarantee the quality of intermediate solutions before convergence. Further, EC-DPOP provides the globally optimal solution, while AC-DPOP and CAC-DPOP provide approximate solutions and reduce communication overhead by limiting the size of messages. Nevertheless, these DPOP-based algorithms incur exponential memory and computational overhead. Then, PCD introduces Particle Swarm Optimization (PSO) into C-DCOPs to reduce the computational and memory overhead, and PCD-LD extends PCD to balance solution quality against memory overhead. Finally, C-CoCoA and C-DSA provide two low-cost methods for solving C-DCOPs, but neither algorithm has the anytime property.
Based on the above analysis, we propose two population-based approaches that solve C-DCOPs using only basic mathematical operators such as addition and multiplication: the Population-based Local Search Algorithm (PLSA) and its extension, the Population-based Global Search Algorithm (PGSA). The proposed PLSA and PGSA can solve problems with any form of utility function, and population-based methods of this kind outperform gradient-based methods in terms of computational and memory overhead [35].
In previous work [39], the idea of population-based local search was applied to centralized optimization problems and exhibited excellent performance. Specifically, that algorithm (i) uses elite solutions and other useful information to guide the local search, and (ii) systematically updates the population. In this paper, we customize a population-based local search approach for C-DCOPs so that it suits distributed optimization. Our motivations and contributions for PLSA can be summarized as follows:
• Motivation: The pseudo-tree is a communication structure commonly used in C-DCOP algorithms. Through the utilities passed between all agents, the root agent becomes aware of the aggregate utility. This raises two concerns: (i) the root agent requires more computational and memory overhead than the other agents; and (ii) utility passing violates privacy, even if the violation is minor [40]. Contribution: Therefore, PLSA communicates over the constraint graph itself. Each agent passes only its own value assignments (instead of utilities received from other neighbors) to its neighbors and optimizes the sum of its local utilities given the neighbors' value assignments. An important benefit is that overhead is shared evenly among the agents and every agent's privacy is protected equally.

PGSA is an extension of PLSA that introduces the Breadth First Search (BFS) pseudo-tree to guarantee the anytime property while extending the local search to a global search. Extensive comparative experiments on three randomly generated benchmark problems show that PGSA can improve the solution quality of PLSA, and that our algorithms are superior to the state-of-the-art C-DCOP algorithms in terms of solution quality.
The remainder of this paper is organized as follows. Section 2 introduces the background of the problem, the distributed population, and the BFS pseudo-tree. Section 3 presents the details of our algorithms. Section 4 presents the theoretical analysis. Section 5 reports the comparative experiments between our algorithms and the state-of-the-art C-DCOP algorithms. Finally, Section 6 gives the conclusion and future avenues of research.

Background
In this section, we first introduce the DCOP and C-DCOP frameworks, which are the necessary background to this paper. Then, we describe the distributed population method, which is the basis of our algorithms. Finally, we present the relevant definitions of the BFS pseudo-tree.

Distributed Constraint Optimization Problems
A Distributed Constraint Optimization Problem (DCOP) can be defined as a tuple ⟨A, X, D, F, α⟩, where A is the set of agents, X is the set of variables, D is the set of domains, and:
• F is a set of utility functions; each $f_i \in F$, with $f_i : D_{i_1} \times \cdots \times D_{i_k} \to \mathbb{R}$, specifies the utility assigned to each combination of values of $x_{i_1}, x_{i_2}, \ldots, x_{i_k}$, where k is the arity of $f_i$. We consider that all utility functions are binary (thus k = 2).
• $\alpha : X \to A$ is a variable-to-agent mapping function that associates each variable $x_j \in X$ with one agent $a_i \in A$. In this paper, we assume that one agent controls exactly one variable (n = m, where n = |A| and m = |X|; thus the terms "agent" and "variable" can be used interchangeably).
The solution of a DCOP is an assignment $X^{*}$ to all variables that minimizes the aggregate utility (we consider minimization throughout this paper), as shown in Equation (1).
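Equation (1) itself is not reproduced in this text. Under the standard DCOP formulation, and assuming the notation of the tuple above, the objective can be written as follows, where $x^{i}$ (a symbol introduced here only for illustration) denotes the variables in the scope of $f_i$:

```latex
X^{*} = \operatorname*{arg\,min}_{X} \sum_{f_i \in F} f_i\left(x^{i}\right) \tag{1}
```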

Continuous Distributed Constraint Optimization Problems
Similarly, a Continuous DCOP (C-DCOP) is a tuple ⟨A, X, D, F, α⟩, where A, F, and α are the same as in a DCOP. The set of variables X and the set of domains D are defined as follows:
• $X = \{x_1, x_2, \ldots, x_m\}$ is the set of continuous variables; each variable $x_j$ is controlled by one of the agents $a_i \in A$.
• $D = \{D_1, D_2, \ldots, D_m\}$ is the set of continuous domains; each continuous variable $x_i \in X$ takes any value from the domain $D_i = [LB_i, UB_i]$, where $LB_i$ and $UB_i$ represent the lower and upper bounds of the domain, respectively.
The goal of a C-DCOP is the same as in Equation (1). Figure 1 shows a simple example of a C-DCOP, where Figure 1a shows a constraint graph with four variables, and each edge represents a utility function defined in Figure 1b. The domain $D_i$ of each variable $x_i$ is [−10, 10].
Figure 1. An example of a C-DCOP.

Distributed Population
The distributed population is an extension of the traditional population, and in recent years it has become a basic technique in population-based algorithms for DCOPs and C-DCOPs [30,31,35,38]. The distributed population is held collaboratively by all agents. Table 1 shows a distributed population of n agents and K solutions: each agent $x_i$ holds one dimension of each solution, and the solutions together make up the population, denoted as S. Each solution $S_k = \{S_k.x_1, S_k.x_2, \ldots, S_k.x_n\}$ represents an assignment to all agents. For solution k, we use $S_k.x_i$ to denote the value of the variable held by agent $x_i$, and we define $S_k(U_i)$ as the sum of local utilities of agent $x_i$, calculated by Equation (2).
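Equation (2) is not reproduced in this text. Going by the description above, $S_k(U_i)$ sums, over the neighbors of agent $x_i$, the binary utility evaluated at the two values that solution k assigns to the edge's endpoints. A minimal Python sketch under that assumption follows; the function name, the dictionary layout, and the two example utilities are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, List

# Hypothetical type for illustration: a binary utility f_ij(x_i, x_j).
Utility = Callable[[float, float], float]


def local_utility_sums(
    own_values: List[float],
    neighbor_values: Dict[str, List[float]],
    utilities: Dict[str, Utility],
) -> List[float]:
    """Return S_k(U_i) for k = 0..K-1, assumed to be the sum over
    neighbors j of f_ij(S_k.x_i, S_k.x_j)."""
    sums = []
    for k, x_i in enumerate(own_values):
        total = 0.0
        for j, values_j in neighbor_values.items():
            total += utilities[j](x_i, values_j[k])
        sums.append(total)
    return sums


# Agent x1 with neighbors x2 and x3, holding K = 2 solutions.
own = [1.0, -2.0]
received = {"x2": [0.5, 3.0], "x3": [2.0, 0.0]}
funcs = {
    "x2": lambda x, y: x * x + y,   # made-up utility on edge (x1, x2)
    "x3": lambda x, y: abs(x - y),  # made-up, non-differentiable utility on edge (x1, x3)
}
sums = local_utility_sums(own, received, funcs)
# Index of the solution whose value minimizes the sum of local utilities.
best1 = min(range(len(sums)), key=sums.__getitem__)
```

Because only function evaluations are needed, a non-differentiable utility such as the absolute difference poses no difficulty here.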

Breadth First Search Pseudo-tree
The Breadth First Search (BFS) pseudo-tree is a commonly used communication structure for DCOPs and C-DCOPs [3,26,27,35,36,38]. Its characteristics are multi-branch parallel computing and short communication paths and times. We briefly present the relevant definitions of the BFS pseudo-tree; the details can be found in reference [41]. Figure 2 shows an example of a BFS pseudo-tree built from the constraint graph in Figure 1a. Some definitions of the BFS pseudo-tree are given as follows:
• $E(x_i, x_j)$: the tree edge, which connects $x_i$ and $x_j$ (e.g., $E(x_1, x_2)$ is a tree edge in Figure 2).
• $CE(x_i, x_j)$: the cross-edge, which directly connects nodes $x_i$ and $x_j$ in two different branches (e.g., $CE(x_2, x_3)$ is a cross-edge in Figure 2).
• $P_i$: the parent of node $x_i$, the single higher node directly connected to $x_i$ through a tree edge (e.g., $P_2 = x_1$ in Figure 2).
• $C_i$: the set of children of node $x_i$, the lower nodes directly connected to $x_i$ through tree edges (e.g., $C_1 = \{x_2, x_3, x_4\}$ in Figure 2).
• $N_i$: the set of neighbors of node $x_i$, the neighboring nodes directly connected to $x_i$ (e.g., $N_3 = \{x_1, x_2\}$ in Figure 2).
• $PN_i$: the set of pseudo-neighbors of node $x_i$, the neighboring nodes directly connected to $x_i$ through cross-edges (e.g., $PN_3 = \{x_2\}$ in Figure 2).
The BFS pseudo-tree is used for utility aggregation in distributed algorithms: each node $x_i$ sends the sum of local utilities toward the root node via its parent node $P_i$. Finally, the root node is aware of the aggregate utility.
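The definitions above can be sketched with an ordinary breadth-first traversal. The code below is a centralized illustration only, not the distributed construction of reference [41]; the function name is hypothetical, and the four-node graph is inferred from the examples quoted for Figure 2 rather than copied from the figure.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def build_bfs_pseudo_tree(
    neighbors: Dict[str, List[str]], root: str
) -> Tuple[Dict[str, str], Dict[str, List[str]], Dict[str, Set[str]]]:
    """Return (parent, children, pseudo_neighbors) for a BFS tree rooted at `root`."""
    parent: Dict[str, str] = {}
    children: Dict[str, List[str]] = {v: [] for v in neighbors}
    visited = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nb in neighbors[node]:
            if nb not in visited:
                visited.add(nb)
                parent[nb] = node          # tree edge E(node, nb)
                children[node].append(nb)
                queue.append(nb)
    # In a BFS tree, every constraint edge that is not a tree edge is a cross-edge.
    pseudo: Dict[str, Set[str]] = {v: set() for v in neighbors}
    for v, nbs in neighbors.items():
        for nb in nbs:
            if parent.get(v) != nb and parent.get(nb) != v:
                pseudo[v].add(nb)
    return parent, children, pseudo


# Constraint graph inferred from the examples given for Figure 2 (an assumption).
graph = {
    "x1": ["x2", "x3", "x4"],
    "x2": ["x1", "x3"],
    "x3": ["x1", "x2"],
    "x4": ["x1"],
}
parent, children, pseudo = build_bfs_pseudo_tree(graph, "x1")
```

On this graph the sketch reproduces the quoted examples: $P_2 = x_1$, $C_1 = \{x_2, x_3, x_4\}$, and $PN_3 = \{x_2\}$.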

Our Algorithms
In this section, we describe the details of the Population-based Local Search Algorithm (PLSA), which is a fully decentralized C-DCOP algorithm. In addition, we introduce a Population-based Global Search Algorithm (PGSA) based on PLSA.

Population-Based Local Search Algorithm
The PLSA consists of three phases: Initialization, Decision, and Update phases; the details of the algorithm can be found in Algorithm 1.
The Initialization phase: Firstly, PLSA initializes its parameters and constructs a distributed population. Specifically, K represents the number of solutions, and T is the termination parameter, which is the threshold for changing the agent state. λ and $P_m$ represent the learning rate and the mutation probability, respectively. Secondly, each agent sets δ to 0 and STATE to "ACTIVE". δ counts the consecutive iterations in which the best value has not changed, and STATE determines whether the value assignments are updated. In addition, each agent initializes its dimension of each solution to a random value from its domain. Finally, agent $x_i$ sends the values $S.x_i$ to its neighbors $N_i$ (Algorithm 1: Lines 3-10).
The Decision phase: On receiving the value assignments $S.x_j$ sent by each neighbor $x_j \in N_i$, agent $x_i$ calculates the sum of local utilities $U_i$ of each solution according to Equation (2) (Algorithm 1: Lines 12-14). In addition, agent $x_i$ calculates the index best1 of the best value and outputs $S_{best1}.x_i$, which is the value that minimizes the sum of local utilities (Algorithm 1: Lines 15-16).
The Update phase: This phase comprises state updates and value updates. On the one hand, agent $x_i$ compares the best value of the current iteration t with that of the previous iteration t − 1. δ is increased by one if the two values are equal; otherwise, δ is reset to zero. Further, if δ > T, agent $x_i$ changes STATE to "Hold" (Algorithm 1: Lines 21-26). On the other hand, agent $x_i$ updates each value $S_k.x_i$ and sends it to its neighbors $N_i$. Specifically, if the current STATE is "Hold", each value is set to $S_{best1}.x_i$. Otherwise, agent $x_i$ generates a random number $r_m \in [0, 1]$; if $r_m < P_m$, $S_k.x_i$ is assigned a random value from its domain, and if not, $S_k.x_i$ is updated by a linear combination of the two best values and the worst value, as shown in Equation (3) (Algorithm 1: Lines 28-35).
In general, the best solution has good exploitation ability, but it may fall into a local optimum and reduce the exploration ability. Conversely, the worst solution has better exploration ability. Therefore, we use the weighted combination in Equation (3), such that each solution learns exploration and exploitation from other solutions, where $S_{best2}.x_i$ is the value with the second-smallest sum of local utilities, and $S_{worst}.x_i$ is the value that maximizes the sum of local utilities.
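The Update phase described above can be sketched for a single agent as follows. Equation (3) is not reproduced in this text, so the combination used in the last branch is an assumed stand-in chosen only to make the sketch runnable; it should not be read as the authors' formula. The function name, argument names, and the clamping to the domain bounds are likewise hypothetical.

```python
import random
from typing import List, Tuple


def update_agent(
    values: List[float],        # S_k.x_i for k = 0..K-1 (K >= 2)
    utility_sums: List[float],  # S_k(U_i) for k = 0..K-1
    prev_best: float,           # best value of the previous iteration
    delta: int,
    state: str,
    lb: float,
    ub: float,
    T: int,
    lam: float,
    p_m: float,
    rng: random.Random,
) -> Tuple[List[float], float, int, str]:
    """One Update phase for one agent; returns (new values, best1 value, delta, state)."""
    order = sorted(range(len(values)), key=utility_sums.__getitem__)
    best1, best2, worst = values[order[0]], values[order[1]], values[order[-1]]

    # State update: count consecutive iterations with an unchanged best value.
    delta = delta + 1 if best1 == prev_best else 0
    if delta > T:
        state = "Hold"

    new_values = []
    for v in values:
        if state == "Hold":
            new_values.append(best1)
        elif rng.random() < p_m:
            new_values.append(rng.uniform(lb, ub))  # mutation: random value from the domain
        else:
            # ASSUMPTION: placeholder for Equation (3). Moves toward the two best
            # values and away from the worst one, scaled by the learning rate.
            step = lam * ((best1 - v) + (best2 - v) - (worst - v))
            new_values.append(min(ub, max(lb, v + step)))
    return new_values, best1, delta, state


rng = random.Random(0)
held, best, delta, state = update_agent(
    [3.0, 1.0, 2.0], [5.0, 1.0, 3.0], 1.0, 2, "ACTIVE", -10.0, 10.0, 2, 0.5, 0.01, rng
)
```

In this call the best value is unchanged for the third consecutive iteration with T = 2, so the agent switches to "Hold" and copies the best value into every solution.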

Population-Based Global Search Algorithm
Similar to PLSA, PGSA, represented by Algorithm 2, includes three phases: Initialization, Decision, and Update phases.
The Initialization phase: PGSA runs initialization() after constructing a BFS pseudo-tree and setting Γ to +∞, where Γ records the best aggregate utility found so far. In addition, each agent $x_i$ initializes $Gbest_i$, which stores its value in the complete assignment with the best aggregate utility (Algorithm 2: Lines 1-4).
The Decision phase: As in PLSA, each agent calculates the sum of local utilities with its neighbors (Algorithm 2: Lines 6-8). However, agent $x_i$ in PGSA also sends the sum of local utilities to its parent agent $P_i$. Every agent except the root agent does the same: agent $x_i$ waits to receive the sums of local utilities of its children agents $C_i$ and adds them to its own sum of local utilities $S(U_i)$ (Algorithm 2: Lines 9-15). Since the agents of each layer pass the sum of local utilities up the pseudo-tree, and the utility function represented by each edge is calculated twice (once by each of the two agents it connects), the root agent calculates the aggregate utility of each solution by Equation (4). The root agent compares the minimum aggregate utility with Γ. If the minimum aggregate utility is less than Γ, the root agent sets the flag update to "Assign" and sends it to its neighbors (Algorithm 2: Lines 27-37).
The Update phase: Agent $x_i$ updates $Gbest_i$ according to the value of update (Algorithm 2: Lines 20-21). In addition, each agent updates its state and value assignments. As an extension of PLSA, PGSA uses the same weighted combination method to update the solutions, but it extends the local search to a global search, as shown in Equation (5), where $S_0$ and $S_{K-1}$ are the solutions with the minimum and maximum aggregate utility, respectively, and $S_1$ is the solution with the second-smallest aggregate utility.
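The utility aggregation of the Decision phase can be illustrated with a small centralized sketch. Equation (4) is not reproduced in this text; the halving below is an assumption that follows from the statement that every binary utility is counted by both of its endpoint agents. All names and numbers are hypothetical.

```python
from typing import Dict, List


def aggregate_at_root(
    root: str,
    children: Dict[str, List[str]],
    local_sums: Dict[str, List[float]],
) -> List[float]:
    """Sum the per-solution local utility sums up the tree and halve them at the root."""

    def subtree_sum(node: str) -> List[float]:
        total = list(local_sums[node])
        for child in children.get(node, []):
            for k, u in enumerate(subtree_sum(child)):
                total[k] += u
        return total

    # Each binary utility was counted once by each of its two endpoint agents.
    return [u / 2.0 for u in subtree_sum(root)]


children = {"x1": ["x2", "x3"], "x2": [], "x3": []}
# K = 2 solutions on a triangle with edges (x1, x2), (x1, x3), (x2, x3).
# Made-up edge utilities: 1, 2, 3 for solution 0 and 4, 0, 1 for solution 1.
local_sums = {"x1": [3.0, 4.0], "x2": [4.0, 5.0], "x3": [5.0, 1.0]}
aggregate = aggregate_at_root("x1", children, local_sums)
gamma = min(aggregate)  # candidate for the best aggregate utility
```

For solution 0 the three local sums add up to 12, and halving gives 6, which matches the direct sum 1 + 2 + 3 of the edge utilities.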

Theoretical Analysis
In this section, we prove the anytime property of PGSA and provide some theoretical properties of PLSA and PGSA in terms of communication, computation, and memory. We assume that the graph structure is a binary constraint graph G = (N, E), the number of nodes is |A|, and the number of edges of the constraint graph is |F|. Since the agent calculates the sum of local utilities (Algorithm 1: Line 14 and Algorithm 2: Line 8) or adds the sum of utilities of children agents (Algorithm 2: Line 14) in the decision phase, and the agent updates its value assignments in the update phase (Algorithm 1: Line 18 and Algorithm 2: Line 23), we define one iteration as the entire process of the decision and update phases.
Theorem 1. PGSA is an anytime algorithm.
Proof. In PGSA, the root agent tracks the aggregate utility through the pseudo-tree. Let $\Gamma_m$ and $U_m$ denote Γ and the aggregate utility at iteration m, respectively. According to Algorithm 2, Lines 32-33, $\Gamma_m \le U_m$. Only if the root agent finds an aggregate utility $U_{m+\Delta}$ at iteration $m + \Delta$ ($\Delta > 0$) with $U_{m+\Delta} < \Gamma_m \le U_m$ is $\Gamma_{m+\Delta}$ assigned the value of $U_{m+\Delta}$, in which case $\Gamma_{m+\Delta} < \Gamma_m$. Otherwise, Γ keeps the best aggregate utility found so far, i.e., $\Gamma_{m+\Delta} = \Gamma_m$. In other words, $\Gamma_{m+\Delta} \le \Gamma_m$, so the best aggregate utility found is monotonically non-increasing as the number of iterations increases. Therefore, PGSA is an anytime algorithm.
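The argument of the proof amounts to keeping a running minimum. A short sketch, with a hypothetical function name and made-up utility values, shows that the recorded Γ never increases even when the per-iteration aggregate utility fluctuates:

```python
import math
from typing import Iterable, List


def track_best(aggregate_per_iteration: Iterable[float]) -> List[float]:
    """Return the value of Gamma after each iteration (running minimum)."""
    gamma = math.inf
    history = []
    for u in aggregate_per_iteration:
        if u < gamma:  # only a strictly better aggregate utility replaces Gamma
            gamma = u
        history.append(gamma)
    return history


history = track_best([9.0, 7.5, 8.2, 6.1, 6.4])
```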
Theorem 2. The total number of messages of PLSA and PGSA with t iterations are 2t|F| and 4t|F|, respectively.
Proof. In one iteration of PLSA, for every constraint $f_{ij}(x_i, x_j)$, agent $x_i$ sends a message to $x_j$ to share its value assignments, and vice versa. So there are two messages per edge, and thus the number of messages is 2|F|, where |F| is the number of constraints and also the number of edges. Therefore, the total number of messages of PLSA with t iterations is 2t|F|. Similarly, agent $x_i$ in PGSA sends a message to $x_j$ and receives a message from $x_j$. In addition, agent $x_i$ sends a message (the sum of utilities) to its parent agent (Algorithm 2: Line 15) and Messages to its children agents (Algorithm 2: Line 24). Hence, the total number of messages of PGSA with t iterations is 4t|F|.
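The PLSA count in the proof can be checked by brute force on a small graph. The sketch below covers only the PLSA part of the theorem; the function name is hypothetical and the edge list is the four-variable example graph inferred earlier from the Figure 2 examples.

```python
from typing import Dict, List, Set, Tuple


def plsa_message_count(edges: List[Tuple[str, str]], iterations: int) -> int:
    """Count value messages when every agent sends to every neighbor once per iteration."""
    neighbors: Dict[str, Set[str]] = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    per_iteration = sum(len(nbs) for nbs in neighbors.values())
    return per_iteration * iterations


edges = [("x1", "x2"), ("x1", "x3"), ("x1", "x4"), ("x2", "x3")]
count = plsa_message_count(edges, 500)
```

With |F| = 4 and t = 500, the count equals 2t|F| = 4000.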
Theorem 3. In one iteration, the message sizes of PLSA and PGSA are, respectively, K and K + 1, where K is the number of solutions.
Proof. In PLSA, the agent sends a message containing K values to its neighbors in one iteration. Therefore, the message size of PLSA in one iteration is K, where K is the number of solutions. In PGSA, the agent sends three types of messages: the value assignments, the sum of local utilities, and Messages, whose sizes are K, K, and K + 1, respectively. Hence, the message size of PGSA in one iteration is max{K, K, K + 1} = K + 1.
Theorem 4. In one iteration, the computational overhead of one agent in PLSA and PGSA is, respectively, K(|N| + 1) and K(|N| + 1 + |C|), where K is the number of solutions, |N| is the number of neighbors, and |C| is the number of children agents in the BFS pseudo-tree. Further, the overall computational overheads of PLSA and PGSA in one iteration are K(2|F| + |A|) and K(2|F| + 2|A| − 1), respectively.
Proof. Let |N| denote the number of neighbors of one agent. In PLSA, each agent calculates the sum of local utilities with its neighbors and updates the value assignments of K solutions, so the computational overhead of one agent in one iteration is K(|N| + 1). Since $\sum_{x_i \in X} |N_i| = 2|F|$, the overall computational overhead of PLSA is $\sum_{x_i \in X} K(|N_i| + 1) = K(2|F| + |A|)$. Because the BFS pseudo-tree is used in PGSA, let |C| denote the number of children of one agent. In PGSA, one agent not only calculates the sum of local utilities and updates value assignments but also adds the sums of utilities of its children agents. Hence, the computational overhead of one agent in one iteration is K(|N| + 1 + |C|). Further, each agent has only one parent agent, while the root agent has no parent agent, so $\sum_{x_i \in X} |C_i| = |A| - 1$, and the overall computational overhead of PGSA is $\sum_{x_i \in X} K(|N_i| + 1 + |C_i|) = K(2|F| + |A| + |A| - 1) = K(2|F| + 2|A| - 1)$.
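The two closed forms of Theorem 4 can be checked numerically by summing the per-agent expressions over a small graph and its BFS pseudo-tree. The graph below is the four-variable example inferred from the Figure 2 examples, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple


def overall_overhead(
    neighbors: Dict[str, List[str]], children: Dict[str, List[str]], K: int
) -> Tuple[int, int]:
    """Sum the per-agent overheads K(|N|+1) and K(|N|+1+|C|) over all agents."""
    plsa = sum(K * (len(neighbors[a]) + 1) for a in neighbors)
    pgsa = sum(K * (len(neighbors[a]) + 1 + len(children[a])) for a in neighbors)
    return plsa, pgsa


neighbors = {
    "x1": ["x2", "x3", "x4"],
    "x2": ["x1", "x3"],
    "x3": ["x1", "x2"],
    "x4": ["x1"],
}
children = {"x1": ["x2", "x3", "x4"], "x2": [], "x3": [], "x4": []}
K = 10
num_agents = len(neighbors)                               # |A| = 4
num_edges = sum(len(n) for n in neighbors.values()) // 2  # |F| = 4
plsa, pgsa = overall_overhead(neighbors, children, K)
```

Here K(2|F| + |A|) = 10 · 12 = 120 and K(2|F| + 2|A| − 1) = 10 · 15 = 150, matching the summed values.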

Experiment Results
In this section, we first provide the basic experimental configuration. Then, we show the evolution curves of our algorithms (PLSA and PGSA) and competing algorithms (HCMS, PCD, C-CoCoA, and PCD-LD) on three benchmark problems. Finally, we analyze the solution quality of the six algorithms.

Experimental Configuration
We evaluate our algorithms and competing algorithms on three benchmark problems as follows: (i) Random Graphs: we use the Erdős-Rényi model [42] to generate two kinds of random graphs, sparse random graphs (edge probability 0.1) and dense random graphs (edge probability 0.6); (ii) Scale-free Networks: we use the Barabási-Albert (BA) model [43] to generate scale-free networks, with parameters ($m_1 = 15$, $m_2 = 7$); and (iii) Small-world Networks: we use the Watts-Strogatz topology model [44] to generate small-world networks, where the number of nearest nodes is 6 and the rewiring probability is 0.5.
We varied the number of agents from 50 to 100 in steps of 10 for the three benchmark problems. Although PLSA and PGSA can use any form of utility function (refer to Figure 1), we followed references [36,38] and used utility functions of the form $ax^2 + bx + cxy + dy + ey^2 + f$ to present a fair comparison, where a, b, c, d, e, and f are random numbers in the range [−5, 5]. In addition, we set the domains of all variables to [−50, 50] and the number of iterations to 500. For each experimental configuration, we independently ran each algorithm 30 times and took the average as the experimental result. The experiments were carried out on a computer equipped with an Intel(R) Core(TM) i5-10500 CPU at 3.10 GHz and 8 GB of RAM.
We considered three important parameters of PLSA, namely λ, T, and $P_m$, and followed reference [38] in setting $P_m = 0.01$. For the parameters (λ, T), we discretized λ over the range 0.01-0.9 and T over the range 50-100 to find the combination with the best solution quality on dense random graphs (|A| = 100). Figure 3 shows the relative quality of solutions for different λ and T, where we normalize the solution quality by max-min normalization to present the difference. Specifically, the relative quality is $(Q - Q_{min}) / (Q_{max} - Q_{min})$, where Q is the solution quality, and $Q_{max}$ and $Q_{min}$ are the maximum and minimum solution quality, respectively. It is clear from Figure 3 that the best combination (λ, T) is (0.9, 100). In addition, we set K to 1000. The parameter configurations of the competing algorithms follow HCMS [36], PCD [35], C-CoCoA [37], and PCD-LD [38]. It is worth mentioning that, for each algorithm, we record the best solution found during the empirical evaluation.
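A benchmark instance of the kind described above can be sketched with the Python standard library alone. This is an illustrative generator under stated assumptions, not the authors' experimental code: the function names are hypothetical, coefficients are drawn uniformly (the text says only "random numbers"), and the plain Erdős-Rényi sampling below does not enforce connectivity.

```python
import random
from typing import Callable, Dict, List, Tuple


def random_quadratic_utility(rng: random.Random) -> Callable[[float, float], float]:
    """Binary utility a*x^2 + b*x + c*x*y + d*y + e*y^2 + f with coefficients in [-5, 5]."""
    a, b, c, d, e, f = (rng.uniform(-5.0, 5.0) for _ in range(6))
    return lambda x, y: a * x * x + b * x + c * x * y + d * y + e * y * y + f


def erdos_renyi_edges(n: int, p: float, rng: random.Random) -> List[Tuple[int, int]]:
    """Include each of the n*(n-1)/2 possible edges independently with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]


rng = random.Random(42)
edges = erdos_renyi_edges(50, 0.1, rng)  # sparse setting with 50 agents
utilities: Dict[Tuple[int, int], Callable[[float, float], float]] = {
    edge: random_quadratic_utility(rng) for edge in edges
}
domain = (-50.0, 50.0)  # every variable shares the same domain
```

Seeding the generator makes an instance reproducible, which helps when the same problem has to be handed to several algorithms for comparison.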
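The max-min normalization used for Figure 3 can be sketched in a few lines, assuming the usual form in which the minimum maps to 0 and the maximum to 1. The function name and sample values are hypothetical, and the handling of the degenerate case where all qualities are equal is a choice made here.

```python
from typing import List


def relative_quality(qualities: List[float]) -> List[float]:
    """Max-min normalization: (Q - Q_min) / (Q_max - Q_min)."""
    q_min, q_max = min(qualities), max(qualities)
    if q_max == q_min:
        # Degenerate case (all equal): return zeros rather than divide by zero.
        return [0.0 for _ in qualities]
    return [(q - q_min) / (q_max - q_min) for q in qualities]


rel = relative_quality([10.0, 20.0, 15.0])
```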

Comparison of Solution Quality
We present the evolution of the solution quality of our algorithms and competing algorithms with the increasing number of iterations on three benchmark problems with 100 agents. Since C-CoCoA is a non-iterative algorithm, we plot the solution quality of C-CoCoA as a straight line and focus only on its solution quality.
Figures 4 and 5 show the evolution of solution quality of the six algorithms on sparse and dense random graphs, respectively. It can be seen from Figure 4 that PLSA obtains a high-quality solution within a few iterations, and the solution quality of PLSA is significantly superior to that of the competing algorithms. Compared to the local search, the global search explores the solution space better. Although PLSA exploits solutions very well in early iterations, PGSA can find a higher-quality solution through global search. In Figure 5, PLSA and PGSA show the same behavior as on sparse random graphs, with one significant difference: the final solution quality of PLSA and PGSA is similar. The reason is that, as the number of neighboring agents increases on dense random graphs, the agents perform local searches over larger search spaces and obtain better solutions. It is worth noting that the solution quality of C-CoCoA is better than that of PCD and similar to that of HCMS on sparse random graphs, but in dense configurations its solution quality is worse than that of PCD and HCMS. The reason is that C-CoCoA uses a semi-greedy local search method, which finds high-quality solutions more easily on simple graph structures than on complex ones. In summary, PLSA and PGSA outperform competing algorithms on both sparse and dense random graphs.
Figures 6 and 7 present the solution quality of PLSA and PGSA against competing algorithms on scale-free and small-world networks, respectively. We can clearly observe that the solution quality of PCD is better than that of HCMS before about 240 iterations. However, as the number of iterations increases, the solution quality of HCMS becomes superior to that of PCD, and the solution quality of PCD-LD is better than that of C-CoCoA. Since the global search in PGSA focuses on exploring the solution space, and PGSA does not exploit solutions as well as PCD-LD does, PCD-LD performs slightly better than PGSA in the early iterations. As the number of iterations increases, the solution quality of PGSA becomes superior to that of the competing algorithms. In the end, our algorithms significantly outperform competing algorithms on these two benchmark problems.

[Plot axes: Iterations; Aggregate utilities.]
Further, we evaluate the solution quality of our algorithms and competing algorithms on each benchmark problem with different numbers of agents, and the results are presented in Figure 8. We can see that our algorithms significantly outperform HCMS, PCD, and C-CoCoA in solution quality on the three benchmark problems for every number of agents. Compared to PCD-LD, the advantage of PLSA and PGSA in solution quality grows noticeably as the number of agents increases.

Statistical Analysis
To clarify the superiority of our algorithms from a statistical perspective, we use the Wilcoxon signed rank test for statistical analysis on each benchmark problem with 100 agents, and the results are shown in Table 2. It can be clearly seen that our algorithms significantly outperform competing algorithms on the three benchmark problems.
Table 3 lists the average improvement rates of PLSA and PGSA compared to HCMS, PCD, C-CoCoA, and PCD-LD on the three benchmark problems. In addition, we calculate the standard deviation of the improvement rates under the same problem and show it next to the average improvement rate, denoted as σ. We can observe that the average improvement rates of our algorithms compared to HCMS and PCD are stable and exceed 20%. Compared to C-CoCoA, the average improvement rates of PLSA and PGSA are not stable across benchmark problems, since C-CoCoA performs better on simple graph structures. The average improvement rates of PLSA and PGSA compared to PCD-LD are stable, at about 4% and 7% on average, respectively.
Table 2. Wilcoxon signed rank test results of solution quality over 30 independent runs of our algorithms and competing algorithms with the significance level $\alpha_{sl} = 0.05$, where $R^{+}$ and $w^{+}$ represent, respectively, the number of times and the rank sum for which the solution quality of our algorithms is better than that of competing algorithms; $R^{-}$ and $w^{-}$ mean the opposite, respectively. In addition, p represents the p-value.

Conclusions and Future Work
The C-DCOP framework extends the DCOP framework to solve applications with continuous variables. Addressing the limitations on the utility function and the computational overheads of existing C-DCOP algorithms, we designed a fully decentralized algorithm (PLSA) and a global search anytime algorithm (PGSA). In PLSA, each agent only exchanges value assignments with neighbors to avoid privacy leakage. In PGSA, agents pass utilities through a pseudo-tree to achieve global search and guarantee the anytime property. In summary, PLSA has less computational and memory overhead, and it can obtain high-quality solutions faster. PGSA guarantees the anytime property and achieves higher-quality solutions, but it requires more computational and memory overhead for utility passing. Extensive evaluations on three benchmark problems demonstrate that our algorithms outperform the state-of-the-art C-DCOP algorithms in terms of runtime and solution quality. However, there are two limitations: (i) PLSA adopts a stochastic strategy to escape from local optima, which may lead to instability in the solution; and (ii) PGSA converges slowly in the early period since it performs the global search. Therefore, the future research avenues can be summarized as follows: (i) we plan to combine other heuristic methods to improve the quality and stability of PLSA's solutions; and (ii) we will try to modify the search strategy of PGSA to improve its convergence speed.

Discussions
In recent years, C-DCOPs have been widely studied as an extension of the DCOP framework. Many other extensions of DCOPs have also become research hot-spots, such as Asymmetric DCOPs (ADCOPs) [45], Multi-objective DCOPs (MO-DCOPs) [46], Dynamic DCOPs (D-DCOPs) [47], and Probabilistic DCOPs (P-DCOPs) [48]. Accordingly, many algorithms have been proposed for solving these DCOP-based frameworks.
In contrast to DCOPs, the utility that one agent in an ADCOP obtains from a utility function may differ from the utility that another agent obtains from the same utility function. Therefore, the authors of reference [45] extend DSA and propose Asymmetric Coordinated Local Search (ACLS) to solve ADCOPs, in which an agent changes its value with a certain probability to escape from local minima. The authors of reference [49] solve ADCOPs with the PT-ISABB algorithm, which adopts customized inference algorithms to provide strict upper and lower bounds and a complete tree-based search to guarantee optimality. In [50], AsymDPOP is proposed to solve the privacy exposure problem in ADCOPs.
For MO-DCOPs, the authors of reference [51] propose Multi-objective Synchronous Branch and Bound (MO-SBB); the agents in MO-SBB extend the Current Partial Assignment (CPA) with their own assignments and the current utility vector. Once a non-dominated solution is found, it is broadcast to all agents and added to the global bounding list. In addition, the authors of reference [52] adopt the concept of non-dominated bounded vectors and retain only non-dominated vectors to solve MO-DCOPs.
D-DCOPs are proposed to cope with environments that evolve over time. The authors of reference [53] introduce an inference-based approach and employ the Hybrid Algorithm for Reconstructing Pseudotrees (HARP) to solve D-DCOPs, which reuses previous DCOP information to accelerate the search for solutions. In addition, the authors of [54] employ a reinforcement learning approach to solve D-DCOPs.
Finally, P-DCOPs are proposed to cope with environments with stochastic behavior, i.e., environments in which unexpected events affect the agents' behavior. The authors of reference [55] propose a sampling and inference-based algorithm called E-DPOP to solve P-DCOPs, which employs a collaborative sampling strategy to influence the random variables controlled by the agent, after which a leading agent merges the set of variables. On the other hand, the authors of reference [56] propose the Distributed Neighbor Exchange Algorithm (DNEA) to solve P-DCOPs. Specifically, each agent in DNEA computes the utility vector with the neighboring agents and sends it to the neighbors. After that, the neighboring agent computes the best value for each of its variables and assigns those values to it probabilistically. The neighboring agent finally sends the assigned values to all its neighbors.
In summary, DCOPs are a widely used basic framework for multi-agent systems. With the development of practical applications, different versions of DCOP-based frameworks have been proposed, and they have made great contributions to solving multi-agent system problems. In contrast to C-DCOPs, the variables in the above DCOP-based frameworks are discrete, so incorporating continuous variables into these extended frameworks is an interesting area of study. Note that Dynamic Continuous DCOPs (DC-DCOPs), which combine D-DCOPs and C-DCOPs, are introduced in reference [47]. This suggests that exploring continuous-variable versions of other DCOP extensions would be a worthwhile development.

Figure 2. A BFS pseudo-tree representation (c) of the C-DCOP depicted in Figure 1a.

Figure 3. Relative quality of solutions with different λ and T.

Figure 4. The evolution of solution quality of PLSA and PGSA against the competing algorithms on sparse random graphs.

Figure 6. The evolution of solution quality of PLSA and PGSA against the competing algorithms on scale-free networks.
• Motivation: In distributed optimization, no agent is aware of the aggregate utility, so the quality of the aggregate utility fluctuates as value assignments change. However, the sum of local utilities affects the aggregate utility, so we consider that local stability promotes global stability to a certain extent. Contribution: Hence, we propose a STATE mechanism for each agent to control the changes in value assignments. Specifically, each agent changes its state (updates or holds values) based on historical values, which allows the agent to terminate updates early and maintain local stability.
• Motivation: Falling into a local optimum is a persistent problem in approximate optimization algorithms, and it limits the exploration ability of the algorithm. As an approximate optimization algorithm, PLSA faces the same problem. Contribution: We introduce a classical mutation operator to escape from local optima: the agent resets each value, with a given probability, to a random value from its domain.

• $A = \{a_1, a_2, \ldots, a_n\}$ is a set of agents; an agent can control one or more variables.
• $X = \{x_1, x_2, \ldots, x_m\}$ is a set of discrete variables; each variable $x_j$ is controlled by one of the agents $a_i \in A$.
• $D = \{D_1, D_2, \ldots, D_m\}$ is a set of discrete domains; each variable $x_i \in X$ takes its value from the domain $D_i$.

Table 1. The distributed population.

Algorithm 1: Population-based Local Search Algorithm
S ← set of K solutions; // The set of solutions as shown in Table 1.
$S_k.x_i$ ← a random value from $D_i$; // Assign a random value from the value domain to $S_k.x_i$.
Send $S.x_i$ to $N_i$; // Agent $x_i$ sends the value set, i.e., K values ($S_k.x_i$), to neighboring agents.

Table 3. The average improvement rate (rounded to two decimals) of PLSA and PGSA compared to the competing algorithms on three benchmark problems with different numbers of agents, where σ represents the standard deviation of the improvement rates.