A Novel Multi-Robot Task Allocation Model in Marine Plastics Cleaning Based on Replicator Dynamics

: As marine plastic pollution threatens the marine ecosystem seriously, the government needs to ﬁnd an effective way to clean marine plastics. Due to the advantages of easy operation and high efﬁciency, autonomous underwater vehicles (AUVs) have been applied to clean marine plastics. As for the large-scale marine environment, the marine plastic cleaning task needs to be accomplished through the collaborative work of multiple AUVs. Assigning the cleaning task to each AUV reasonably and effectively has an essential impact on improving cleaning efﬁciency. The coordination of AUVs is subjected to harsh communication conditions. Therefore, to release the dependence on the underwater communications among AUVs, proposing a reliable multi-robot task allocation (MRTA) model is necessary. Inspired by the evolutionary game theory, this paper proposes a novel multi-robot task allocation (MRTA) model based on replicator dynamics for marine plastic cleaning. This novel model not only satisﬁes the minimization of the cost function, but also reaches a relatively stable state of the task allocation. A novel optimization algorithm, equilibrium optimizer (EO), is adopted as the optimizer. The simulation results validate the correctness of the results achieved by EO and the applicability of the proposed model. At last, several valuable conclusions are obtained from the simulations on the three different assumed AUVs.


Introduction
In the plan of "United Nations Decade of Ocean Science for Sustainable Development (2021-2030)", maintaining a healthy and clean marine environment received huge attention [1]. Moreover, there is a rising concern about the accumulation of floating plastic debris in the marine environment [2]. In 2018, China's survey of marine garbage showed that plastic debris accounted for 88.7% of floating waste on the sea, 77.5% of beach garbage, and 88.2% of submarine garbage [3]. As marine debris is difficult to degrade, its accumulation may lead to severe pollution and damage to the entire marine environment. It is particularly worth noting that almost half of the world's plastic waste comes from packaging [1]. Therefore, most plastics cannot be recycled or incinerated, causing massive destruction of resources and environmental pollution. It is also the primary source of marine plastic waste. Therefore, resolving and disposing of marine plastics is the key to maintaining a clean, sustainable, and productive ocean. The magnitude and the fate of cleaning marine plastics are still open questions [2]. The reduction of marine plastic waste is the shared responsibility of all countries globally, which is also the responsibility of every person. It is necessary to urge stakeholders from different governance levels, including government departments, public and private enterprises, civil organizations, and every individual in society, to reduce ocean plastic garbage and make bold innovations and actions. In recent the EGT can provide a relatively stable strategy after several rounds of dynamic evolution. Third, it has been proven that, under certain conditions, if the Karush-Kuhn-Tucker (KKT) first-order conditions of constrained optimization are satisfied, the equilibrium of EGT must exist [19]. At last, EGT has been successfully applied to the project about the several thorny problems and challenges; [20] modeled the demand-side management of a smart grid into a certain control networked evolutionary game, which minimize the total cost of the smart grid; [21] proposed a novel algorithm based on the evolutionary game theory, which addressed the challenges faced by dynamic placement of VMs successfully. All in all, the idea of applying several concepts of the EGT to the MRTA model is feasible and meaningful.
The contributions lie in three-folds. First, inspired the replicator dynamics of the EGT, a novel MRTA model is constructed on the basis of the optimization-based approach. It belongs to a continuous optimization problem, different from the discrete and combinatorial optimization formulation in the common MRTA model. Second, by introducing the replicator dynamics to the proposed MRTA model, the final task-allocation values would be in a relevantly stable state. Third, the EO algorithm is first applied to solve the MRTA problem. Through the simulation results, the EO algorithm can get the correct values when solving the proposed MRTA problem. Besides, several conclusions regarding the applicability of the proposed model and the impacts of the complete cleaning tasks are successfully obtained.
The paper is structured as follows. In Section 2, we introduce the related work involved in this paper, including the current situation of marine plastic pollution and the various solutions for the MRTA problems. Then, in Section 3, a novel MRTA model is constructed based on the optimization-based approach and replicator dynamics. Different from the common expression of the MRTA problem, the expression in this section belongs to the continuous optimization problem. Besides, after introducing the replicator dynamics to the proposed model, the expression includes utility function and stable function. To solve the proposed model, the EO algorithm is selected and illustrated in Section 4. Simulation results and discussions about the applicability of the proposed model, the impacts of the total tasks, and the effectiveness of the EO algorithm are shown in detail in Section 5. At last, conclusions are drawn in Section 6.

Current Situation of Marine Plastic Pollution
The United Nations Environment Assembly listed marine plastic pollution as one of the major global environmental issues. Marine plastic pollution was also provided extremely high policy priorities, and governments were called for an urgent response [22]. This section presents the current deteriorating situation of marine plastic pollution from severity, pollution sources, and influence under the COVID-19.
Plastics in the marine environment receive more and more attention due to their persistence and impact on the ocean, wildlife, and even humans [23]. In April 2015, the Joint Group of Experts on the Scientific Aspects of Marine Environmental Protection (GESAMP) finished a report that concluded that marine plastic debris is as harmful to aquatic life as large marine plastic garbage. According to the relevant survey, marine biological species having marine plastic intake records has exceeded 800. This phenomenon has seriously affected the health and sustainable development of the marine ecosystem [24].
In different sea areas, the sources of marine plastic pollution are accordingly various. The garbage along the beach has long been subjected to sunlight, wave-washing, and biological effects. They are easily broken and decomposed into microplastics. Therefore, this type of process is considered one of the sources of marine plastics entering the sea. The surrounding near the estuary is often densely populated. A large amount of plastic waste from rivers would enter the waters. Therefore, this location is a hot spot for marine plastic pollution. The coastal areas are severely affected by human activities. Fisheries, aquaculture, shipping, sewage treatment plants, and so on are the sources of coastal plastics entering the seas. Ocean circulation causes the global migration of marine debris. It has formed a so-called "garbage island" with a staggering area in the middle of the ocean. Although there are fewer human activities in the polar regions, the pollution of plastics in the polar seas is almost as severe as that in the coastal seas. This phenomenon shows that marine plastics have migrated to the polar regions and accumulated in the polar regions under ocean currents.
On the other hand, microplastics can be transported to the poles by the atmosphere, making the bars a gathering place for microplastics. Sampling in the deep sea and the abyssal zone is complex. Only a few countries have deep-sea sampling technologies. Chinese researchers discovered plastic and microplastics in the Mariana Trench, proving that plastic pollution has affected the deepest part of the seabed [25]. As COVID-19 is raging, a non-government organization (NGO) [26] stated that more than 1.5 billion masks had entered the ocean in 2020 [27]. Furthermore, the organization added that these masks would cost at least 450 years to degrade. At the same time, they are gradually being transformed into microplastics, which would have an extremely negative impact on marine life and the marine ecosystem.
From the above overviews, marine plastic pollution is a global problem that is inadequately managed. Therefore, addressing marine plastic pollution efficiently and effectively is the key for the countries to realize environmental protection and sustainable maritime development.

Algorithms for MRTA
As previously mentioned, there are two main approaches for MRTA problems, which are market-based approaches and optimization-based approaches. Market-based approaches are mainly inspired by the economic theory. They can provide an effective way to coordinate robots to perform relevant activities. The basis of market-based approaches is auctioning. They would consider both the robots' bidding and auction criteria (objective functions) while allocating tasks to robots [28]. However, they have many disadvantages: (1) The utilization of a central communication unit [29]; (2) scalability is ensured only for small and medium-sized problems [30]; (3) and the application of negotiation protocols. Compared with the market-based approaches, optimization-based techniques are more well-suited for distributed robots and generally produce near-optimal solutions more efficiently [31]. Besides, scalability is guaranteed even for a large number of robots. Research on using the optimization approaches to solve MRTA problems would be introduced as follows.
Two different solutions based on linear integer programming were proposed [32,33]. The same goal of them is to monitor a specific area and assign tasks to the robots. Mosteo and Montano used the traveling salesman problem (TSP) to formulate the MRTA problem and solved it through simulated annealing algorithm (SAA) [34]. Besides, SAA is combined with some heuristic algorithms to assign jobs to processors [33,35]. Similarly, a genetic algorithm (GA) is used to design a surveillance system that can track multiple targets and manage fires [36]. SAA and ant colony optimization (ACO) are combined to solve the path planning and MRTA problems [37]. Then, several MRTA solutions were proposed and tested extensively in some cases [38]. Robots are highly heterogeneous, and tasks are dynamic. Therefore, MRTA problems need more advanced approaches to be solved accurately. A distributed method is proposed to assign tasks to multiple robots based on particle swarm optimization (PSO) [39]. Authors of works [40,41] used swarm intelligence for task allocation in large-scale multiple robot systems. [42] dealt with the MRTA problem under spatial, temporal, and energetic constraints. The work in [43] proposed a solution to the MRTA problem using spatial queuing.
The meta-heuristic algorithm is proposed relative to the optimization algorithm. The optimization algorithm could obtain the optimal solution to a problem. However, the meta-heuristic algorithm is an algorithm based on intuition or experience. It can give a feasible solution to the problem at an acceptable cost, referring to the calculation time and space. The degree of deviation between the viable solution and the optimal solution may not be predictable in advance. The work in [31,42] proposed two different MRTA problem solutions using different meta-heuristics, such as firefly algorithm, genetic quantum algorithm (GQA), artificial bee colony algorithm, and ant colony optimization (ACO). Equilibrium optimizer (EO) is a new meta-heuristic optimization algorithm proposed in 2020. It is inspired by the physical phenomenon of controlling volume mass balance. EO has the characteristics of solid optimization ability and fast convergence speed. The performance of EO has been verified by 58 mathematical functions, including unimodal, multimodal, mixed, and combined functions and three engineering benchmark problems. Due to the excellent performance of EO, the idea of utilizing EO to resolve the MRTA problem is proposed in this paper.

A Novel MRTA Model
Aiming to construct a MRTA model to get the optimal and stable allocated tasks, a distributed robotic system including n AUVs and a controlling center is assumed. Inspired by the evolutionary model and population dynamics, here, a novel MRTA model is constructed on the basis of the optimization-based approach, where the total weight of fouling material needed to be allocated is w total and the set of [w 1 , w 2 , . . . , w n ] is the weight of cleaning tasks allocated to the set of AUV-agents [AUV 1, AUV 2, . . . , AUV n] by the controlling center. Therefore, the proposed MRTA model is different from the standard MRTA model with the combinational expression.
During the allocation, the sum of the agents in the set of [w 1 , w 2 , . . . , w n ] is required to be equal to w total . The set of [V 1 (w 1 ), V 2 (w 2 ), . . . , V n (w n )] represent the goal function of the corresponding AUVs in the MRS after completing allocated tasks. In other words, they are the feedbacks that the related AUV agents give back to the controlling center after conducting tasks. Therefore, the total feedback of cleaning all the allocated plastic tasks in this distributed MRS is the sum of the set [V 1 (w 1 ), V 2 (w 2 ), . . . , V n (w n )]. The relationship and interactions among n AUVs and the controlling center are shown in Figure 1. The components of the proposed distributed system are shown vividly in Figure 1. This distributed robotic system is composed of multiple distributed AUVs. Every distributed AUV has its goal function in cleaning marine plastics. There exists no master-slave relationship among these distributed AUVs, that is to say, they are equal during the task allocation of marine plastic cleaning. Moreover, they conduct the allocated cleaning tasks autonomously and individually. Preliminarily, the global constraint for this distributed task-allocation system is the total weight of the allocated tasks [w 1 , w 2 , . . . , w n ] must be w total . The local constraint for this distributed-MRS is that the allocated task for the single AUV needs to satisfy the range 0 ≤ w i ≤ w maxi . Under the above initial global constraint and local constraints, this distributed task-allocation system can coordinate the AUVs to conduct the allocated tasks efficiently and harmoniously.

Fomulation of the MRTA Model
After sorting out the relationship between the multi-AUVs and the controlling center, we can first translate this kind of MRTA problem into a mathematical model based on the optimization solutions. Then, the MRTA model is translated and formed as follows: This model aims to minimize the goal function V(w). To satisfy the initial requirements mentioned previously, this model is constructed under the global constraint that the sum of w i (kg) must be w total (kg). Each w i (kg) should be limited in the range from the minimum value to the maximum value of themselves, that is, w mini ≤ w i ≤ w maxi .

Cost Function
The formulation of the cost function could be the most challenging part of an optimizationbased MRTA problem. The general cost function of the MRTA model for underwater cleaning can be illustrated as follows: where C[$] denotes the cleaning costs of a single cleaning-AUV, and w[kg] denotes the MPC tasks allocated to this cleaning-AUV. As we know, after the AUV finishes its allocated cleaning tasks, the owner of it would get profits from the government or the relevant companies. Therefore, during the task-allocation, it should consider the costs of AUVs and the profits of AUVs. The, then cleaning costs a[$/kg] per unit task and the cleaning profits c[$/kg] per unit tasks are chosen to be the other two coefficients in the general cost function. To be more specific, a is the cost by the AUV when it finishes cleaning 1 kg plastics, including electricity cost, staffing consumption, etc., c is the relevant department of the government affords c to AUV's owners. That is to say, when the AUV is allocated 1 kg, the government or the relevant companies will pay c to the AUV. In this general cost function, the relationship between C and w is linear. Besides, a and c are always consistent. However, these two findings show that this linear expression is unpractical. There exist two main reasons. First, all relationships in real life are non-linear, and linear expression is just the simplification or the approximation of the fundamental engineering problems. On the other hand, there are many factors causing a to change with w changing. For example, due to the cleaning experience accumulated during the task conduction, when w approaches its maximum cleaning weight w max , the changing rate of costs would decrease in real life. When w approaches zero, the changing rate of cost should increase. The curve of the above two behaviors is shown in Figure 2.
Thus, we can see that changing rate of C(w) is changeable and depends mainly on the value of w. Then, here, we utilize the logistic-type function to model the above phenomenon. The linear expression is successfully upgraded to the non-linear expression in Equation (3).
where we limit the value w(kg) to the range of [w min , w max ]. C(w) still represents the total cost of the AUV, and w max and w min denote the maximum value and the minimum value of w. According to their importance in the marine plastics-cleaning, they are ranked in the order of m, w max , d, and s. Then, r can be calculated by the following Equation (4): In Equation (4), the set of [I m , I w max , I d , I s ] denotes the importance level of the corresponding index.

Combination with Replicator Dynamics
As we have mentioned previously, the proposed MRTA model is inspired by the evolutionary game and population dynamics. The evolutionary game concepts are suitable for this MRTA model. Replicator dynamics is the core concept in evolutionary game theory, the definition of which is shown in definition 1 [18]. The replicator dynamics are introduced to the MRTA model to realize the stable allocated tasks among the AUVs. According to the definition of replicator dynamics and the corresponding Equation (10), a specific stable function for the proposed MRTA model is constructed and shown in Equation (6): where ∂w i ∂t reflects the changing rate of the ith AUV, which is assigned to w i cleaning tasks. Correspondingly, represents the adaptability of the ith AUV and ∂C(w i ) ∂w i n denotes the average adaptability among the n AUVs. Therefore, if the value of ∂w i ∂t is zero, it means that the ith AUV has stopped changing its allocated tasks. That is to say, in this situation, the ith AUV has reached a relatively stable equilibrium state. In terms of the multiple AUVs, to reach a stable task-allocation, the modified replicator dynamics of all AUVs are needed to be zero or very close to zero. To solve the replicator dynamics by the optimization algorithm, Equation (6) is modified to Equation (7).
Based on the above work, a stable function for the proposed MRTA model is constructed as follows.
According to the mathematical expression of the stable function (8) based on the replicator dynamics, its goal is to minimize F(w) to approach zero. As the goal of the cost function (5) and stable function (8) are both the minimize-optimization and positivedefinite, the goal function of the proposed MRTA model is obtained by adding the two functions up, which is shown in Equation (9). After this kind of combination, the multiobjective optimization has also become single-objective optimization.
In the common optimization-based MRTA model, the objective function is just the utility function or the cost function. Therefore, compared to the common MRTA model, this novel model (9) not only satisfies the minimization of the cost, but also reaches a relatively stable state of the task allocation.

Definition 1. Replicator Dynamic.
The replicator dynamic is a dynamic differential equation, which describes the frequency of an adopted strategy in a specific population. It can be expressed by the following formula.
In the above formula, x i is the proportion or probability of choosing pure strategy s i in a population, u(x i , x) represents the fitness when using pure strategy s i , and u(x, x) denotes the average fitness of the population.

EO Algorithm
The equilibrium optimization (EO) algorithm is first proposed by Faramarzi et al. EO originated from the control volume mass balance model, which is presented to estimate the dynamic and equilibrium states. For more information on the inspiration of EO, one may refer to [44]. Due to the advantages of simple principle, fast convergence, and easy implementation, EO has been widely applied to various optimization problems such as structural design optimization [45], economic dispatch [46], and image segmentation [47].
In EO, each individual with its concentration → C o is defined as a search agent, and each individual in the population is similar to a solution in the particle swarm optimization algorithm [48], → C o represents the position vector of an individual. Equilibrium optimization algorithm consists of three parts, including initialization, equilibrium candidates and pool, concentration updating.

Initialization
As a meta-heuristic algorithm, the initial population is adopted in EO to start the search process. The concentrations of individuals are initialized by the following equation: where → C ini oi refers the initial concentration vector of the ith individual, → C omax and → C omin represent the lower limit vector and upper limit vector of the concentrations of the population, respectively, and r i indicates a random number in [0,1]. Then the individuals are evaluated by the fitness function.

Initialization Equilibrium Candidates and Pool
The equilibrium state represents the final convergence state of obtained solutions. In the initial optimization process, there is no knowledge about the final convergence state. Therefore, the equilibrium candidate → C oe is utilized to guide individuals in the population. In EO, equilibrium candidates include four best individuals based on their fitness values and an individual whose concentration is the mean of the four best individuals. The equilibrium pool is composed of the above five individuals.
In each iteration, each individual updates its concentration by selecting one equilibrium candidate randomly from the equilibrium pool.

Concentration Updating
The individual's concentration updating is controlled by the exponential term where Cit and Maxit represent the current iteration and the maximal iteration, respectively. t refers to the function of iterations which declines along with the increasing of iteration times. a 2 indicates a constant value that influences the exploitation ability of the equilibrium optimizer. λ = [λ 1 , λ 2 , . . . , λ n ] T represents the random vector in the interval of [0,1], t 0 is calculated as follows: where a 1 is a constant value which controls the exploration ability of EO, sign(r 0 − 0.5) is employed to control the direction of exploitation and exploration, The values of a 1 and a 2 are set to 2 and 1 respectively in this work. The final expression of the exponential term → F can be calculated as follows: Generation rate → G plays an important role in EO. It is applied to enhance the exploitation ability of EO.
where → G 0 indicates the initial value.
→ GCP is the generation rate control probability. GP refers to the generation probability, which is set to 0.5 in this work. r 1 and r 2 represent two random numbers in the interval of [0,1]. → κ refers to the decay vector. This study assumes → κ = → λ according to the original EO algorithm. Therefore, the generation rate can be calculated as follows: The final concentration updating formulation of EO is described as follows:

Pseudo Code of EO
Algorithm 1 is the pseudo code of the EO algorithm. The values of and are set to 2 and 1 respectively in this work. The final expression of the exponential term ⃗ can be calculated as follows: Generation rate ⃗ plays an important role in EO. It is applied to enhance the exploitation ability of EO.
where ⃗ indicates the initial value. ⃗ is the generation rate control probability. refers to the generation probability, which is set to 0.5 in this work. and represent two random numbers in the interval of [0,1]. ⃗ refers to the decay vector. This study assumes ⃗ = ⃗ according to the original EO algorithm. Therefore, the generation rate can be calculated as follows: The final concentration updating formulation of EO is described as follows:

Pseudo Code of EO
Algorithm 1 is the pseudo code of the EO algorithm. Algorithm 1: Equilibrium optimization algorithm

Initialization Constraint Handling
Since the MPC-MRTA model is a constrained optimization problem, the penalty function method [49] is applied to handle the constraint in the MPC-MRTA model, and the constrained optimization model can be described as follows: (22) where P e represents the penalty factor, which is set to 1000 in this work. If there is no constraint violation, |h 1 (w)| will be zero and positive otherwise. Under this conversion, the objective function now is V c (w) which serves as a fitness function in EO.

Parameters Setting of AUVs
To make the proposed MRTA model applied to the real multi AUV systems, the parameters of the assumed AUVs refer to the two types of Hebao DF-H4 water-surface cleaning-AUVs. The four main technical parameters related to the cleaning of the marine plastic are selected, which are the recharge mileage m [km], the maximum weight of the cleaning-plastics w max [kg], the dive depth d [m], and the maximum speed s max [km/h]. Three types of AUVs for marine plastics cleaning are assumed. According to their sizes, the abbreviation of the three assumed AUVs are S-AUV, M-AUV, and L-AUV. The four main technical parameters of the three AUVs are shown in Table 1.  Table 1 shows the four main technical parameters of the S-AUV, and M-AUV and L-AUV present the important factor I of the four main parameters acting in cleaning marine plastics. Therefore, we assume the critical factor of the four parameters, which are I m , I w max , I d , I s , are equal to 0.4, 0.3, 0.2, 0.1, respectively. Then, the cleaning ability factor r of the three AUVs can be calculated by Equation (4). Moreover, the exact values of r are also shown in Table 1. The rank of AUVs about the cleaning ability r is L-AUV, M-AUV, and S-AUV.

Applicability of the System Model
To explore the applicability of the proposed MRTA model, three scales of the multirobots system are selected to simulate: the small-scale system, middle-scale system, and the large-scale system. The composition of the three systems is shown in Table 2, which shows the detailed number of the three types of AUVs and the total number of AUVs. In Table 2, i denotes the serial number of the AUV, and n represents the total number of AUVs. AUV is known to be expensive, bulky, and complicated. Therefore, three AUVs including 1 S-AUV, 1 M-AUV, and 1 L-AUV are enough for a small-scale system. The serial number of the three AUVs is 1, 2, 3, respectively. Based on the small-scale system, in the middle-scale system, the total number of AUVs is increased to 6. Correspondingly, the numbers of the three types of AUVs are all increased to 2. The serial numbers of the composed S-AUV, M-AUV, and L-AUV are i = 1, 4, i = 2, 5, and i = 3, 6, respectively. The large-scale system is composed of 12 AUVs, which includes 4 S-AUVs, 4 M-AUVs, and 4 L-AUVs. The corresponding serial number of them are i = 1, 4, 7, 10, i = 2, 5, 8, 11, i = 3, 6, 9, 12. Table 2. Composition of the three scales of the multi-AUVs system.

System Scale S-AUV M-AUV L-AUV n
Small-Scale 1(i = 1 ) 1(i = 1 ) 1(i = 1 ) 3 Middle-Scale 2(i = 1, 4 ) 2(i = 2, 5 ) 2(i = 3, 6 ) 6 Large-Scale 4(i = 1, 4, 7, 10) 4(i = 2, 5, 8, 11) 4(i = 3, 6, 9, 12) 12 Initially, considering the common loads of the AUV and the estimated cleaning need of the marine plastics, the total marine plastics cleaning tasks w total is set to 18 kg and w min for all AUVs is set to 1 kg. To compare the simulation performance under the widespread algorithms, simulations would be conducted not only by the EO algorithm but also by particle swarm optimization (PSO) algorithm and the genetic algorithm (GA). During the simulations under the PSO, the inertia weight is set to be 1 and the two acceleration constants are set to be 1.5 and 2.0, respectively. As for the simulations under the GA, the mutation probability is set to be 0.5 and the crossover probability is set to be 0.8. By repeating the simulation program 100 times and taking the average value, simulation results in the three constructed systems are shown in Tables 3-5. By analyzing the simulation results, the allocated tasks are all subjected to the constraints in the system model (10), and the goal function is well achieved. This phenomenon indicates that the system model could be applied to different systems with different scales. Analyzing the results simulated by the three algorithms, it can be concluded that although the EO algorithm performs slightly better, there is no significant difference in the values they obtained. Due to the popularity of the PSO and GA, the results can also indicate that the EO algorithm can get the correct and expected values.

Impacts of the Total Tasks
The previous simulations are all conducted when w total is set to 18 kg. To figure out the impacts of the w total on the values of allocated tasks and the goal function, four more simulations are conducted in the constructed small-scale system and middle-scale system under w total = 9 kg and w total = 36 kg. To satisfy the constraint that w i needs to be above the w min , two more simulations in the large-scale system are conducted under w total = 36 kg and w total = 72 kg. The other simulation settings are unchanged. The simulation results are shown in Tables 6-11. Like the former simulations, the difference between the results achieved by the three algorithms is almost non-existent. Therefore, the values under the EO algorithm shown in these tables are selected for further comparison. Therefore, in Figures 3-5, the selected values achieved by the EO algorithm are well presented by the line charts. When the simulations are conducted in the small-scale system, w 1 , w 2 , w 3 are the total allocated tasks for the S-AUV, M-AUV, and L-AUV, respectively. When the middle-scale system and the large-scale system are chosen to be simulated, the total allocated tasks for the three types of AUVs are the sum of the allocated tasks for the corresponding AUVs, for which serial numbers have been set previously in Section 5.2.   To analyze the charts comprehensively, we can find that with w total increases, the changing rate of the allocated tasks for the L-AUVs would be the biggest among the tasks allocated for the three types of AUVs. Similarly, the increase rate of the allocated tasks for the M-AUVs is larger than that of the S-AUVs. However, from Table 8, we can find that L-AUV is not allocated the most tasks all the time, which reflects that the relationship between the cleaning-ability and the allocated tasks is non-linear.
Generally, the changing rate of L-AUVs is the largest, followed by the M-AUVs. The cause of this phenomenon lies in the fact that the rank of changing-ability factor r is: L-AUV, M-AUV, and S-AUV. However, comparing Table 8 with the other tables, it needs to be noted that the AUV with larger cleaning ability would not be allocated more tasks under any circumstance, which reflects the non-linear relationship between the AUV's cleaning ability and the allocated tasks. Besides, the goal function V in the small-scale system decreases when w total increases from 18 kg to 36 kg, which indicates that V is not proportional to the w total , either.
To sum it up, through the three algorithms' simulation results, compared to the two other popular algorithms, the EO algorithm is validated to achieve a similar task-allocation value with a slightly better performance. This phenomenon proves that the EO algorithm could achieve the correct and expected values for the proposed MRTA model. By assuming and listing three systems with different numbers of AUVs, all the final task-allocation values satisfy the constraints in the proposed model, which illustrates that the applicability of the proposed MRTA model is far-ranging. Besides, when the w total is increased, changing rate of the allocated tasks for the AUVs with the better changing-ability factor r would be larger.

Conclusions
Owing to the harsh communication conditions for the AUVs, inspired by the evolutionary game theory and population dynamics, a novel and specific MRTA model for marine plastics cleaning is established. To get the optimized and relatively stable taskallocation values, the goal function of this novel MRTA model consists of a cost function and replicator dynamics of AUVs. Then, to solve this MRTA problem, the EO algorithm is chosen for its better performance in the other fields. It is the first time that the EO algorithm has been applied to the MRTA problem. Through the simulations, the following conclusions could be obtained. First, the EO algorithm is verified for being able to calculate the correct and expected values in the proposed MRTA model. Second, the proposed MRTA model is applicable for the different scales of the multi-robot system. Finally, the AUV with a larger cleaning ability factor r, determined by its four main technical parameters, would be allocated more tasks and increase faster with the total tasks increase.
It must be admitted that to apply the replicator dynamics to the MRTA model, the constraints and the goal function in the model are set to be relatively simple. Compared with the existing literatures, the difficulties of this research are as follows. First, there are no powerful underwater robots designed for cleaning marine plastics in the market. Second, to make this research practical, there will be a long way for us to study. Future work could include constructing a more complex MRS for underwater operations with more constraints and combining the proposed MRTA model with multi-robots path planning problem.